DNA METHYLATION CHANGES IN THE
DEVELOPMENT OF LUNG ADENOCARCINOMA
by
Suhaida Adura Selamat
________________________________________________
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(GENETIC, MOLECULAR AND CELLULAR BIOLOGY)
May 2012
Copyright 2012 Suhaida Adura Selamat
DEDICATION
I would first like to thank my husband Géza Szigethy for his unending
support, his love, his brilliance, and his understanding over the years. I’m so
grateful that I didn’t do my homework that first day in orgo lab, and that you let
me look into your lab book. That was the first time in my entire life that I cheated,
and it paid off big time. I am unbelievably lucky to have you as a partner in life,
and am so excited for our future together. I love you.
I thank my parents Selamat Bajuri and Rafiah Salim for allowing me to do
something completely different, for letting me choose my own path, and for loving
me all the same. The years we spend apart are difficult, but you are never far
from my thoughts and my heart, and I love you both so much. I would also like to
thank my siblings Rahimi, Rozani, Suhana Dewi, Suhazi Reza and Suhani Idayu,
as well as everyone else in my ever-expanding family for tolerating my absence,
and for still remembering me. I miss you all every day, and I wait for the day
someone invents a Transporter so that we could see each other more often. Until
then, rest assured I will always come home.
Ite, I cannot thank you enough for all that you have taught me, both
professionally and personally. You pushed me to do better, and to do things I
didn’t even know I could, but were simultaneously incredibly understanding and
empathetic. Your enthusiasm is infectious, and your work ethic inspiring. I will
always be grateful to have had you as a mentor.
ACKNOWLEDGEMENTS
This work represents a collaboration between many scientists and
supporters. I am especially grateful to our collaborator and my committee
member Dr. Kimberly Siegmund who is always encouraging and always
responsive to my unending questions. Thank you to Dr. Michael Koss, Dr. Keith
Kerr, Dr. Jeffrey Hagen, as well as the scientists and support staff of the Canary
Foundation and the Early Detection Research Network for making these projects
possible. I’m also grateful to members of the USC Epigenome Center, namely
Dr. Daniel Weisenberger, Dr. Toshinori Hinoue, Dr. Houtan Noushmehr, Tim
Triche Jr., Dr. Simeen Malik and Hui Shen for their assistance. I thank Dr. Peter
Laird and his lab members including Dr. KwangHo Lee, Dr. Shirley Oghamian
and Dr. Mihaela Campan for many years of advice, guidance and friendship. Of
course, I also thank Dr. Ite Laird-Offringa, my committee members Dr. Kimberly
Siegmund and Dr. Zea Borok as well as the past and present members of the
ILO lab, especially Dr. Janice Galler, Dr. Meleeneh Kazarian Derhartunian, Dr.
Crystal Marconett and Brian Chung, who have been friends to me for years, and
hopefully will be for years to come.
TABLE OF CONTENTS
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
    Lung cancer
    Lung adenocarcinoma
    DNA methylation and cancer
    DNA methylation in lung adenocarcinoma
    DNA methylation: Lung cancer biomarker
    DNA methylation: Functional changes in lung cancer
    DNA methylation and tumor heterogeneity
Chapter 2: DNA methylation changes in atypical adenomatous hyperplasia, adenocarcinoma in situ, and lung adenocarcinoma
    Introduction
    Materials and Methods
        Ethics statement
        Study subjects
        DNA extraction and bisulfite treatment
        DNA methylation analysis
        Statistical analyses
    Results
        DNA methylation levels across the AdjNTL-AAH-AIS-adenocarcinoma spectrum
        Analysis of DNA methylation in preneoplastic lesions
    Discussion
Chapter 3: Genome-scale analysis of DNA methylation in lung adenocarcinoma: Biomarkers and functional changes
    Introduction
    Materials and Methods
        Study samples
        DNA methylation data production
        DNA methylation data analysis
        Functional classification/Gene network analyses
        Integration of gene expression analysis
    Results
        Identification of differentially methylated regions in lung adenocarcinoma
        Identification of potentially functionally relevant DNA methylation changes in lung adenocarcinoma
        Pan-non-small cell lung cancer marker
    Discussion
Chapter 4: Genome-scale DNA methylation profiling of lung adenocarcinoma: Tumor heterogeneity
    Introduction
    Materials and Methods
        Study samples
        DNA methylation data production
        DNA methylation data analysis
    Results
        Differential DNA methylation analysis between smokers and never-smokers
        Differential DNA methylation analysis between Asians and Caucasians
        Class discovery: Identification of DNA methylation sub-groups in lung adenocarcinoma
    Discussion
Chapter 5: Genome-scale DNA methylation profiles from formalin-fixed paraffin-embedded tissues
    Introduction
    Materials and Methods
        Samples
        DNA extraction
        DNA purification and quantization
        DNA methylation analysis
        Data analysis
    Results
        Probe Failures
        Beta-value distribution and variance
        Comparison of data from FFPE and matched frozen samples with known DNA methylation of select genes
    Discussion
Chapter 6: Summary and conclusions
References
Appendix A: DNA methylation analysis of mixed bronchioloalveolar (BAC) lesions
    Introduction
    Materials and Methods
    Results and Discussion
Appendix B
Appendix C
Appendix D
LIST OF TABLES
Table 1.1 Genome-scale DNA methylation studies in lung cancer
Table 2.1 Genes and gene functions
Table 2.2 Information on study subjects
Table 2.3 Distribution of AdjNTL, AAH, AIS and adenocarcinoma samples among 63 subjects
Table 2.4 Distribution of multiple lesions among subjects
Table 2.5 Primers and probe sequences
Table 2.6 Median PMRs and pair-wise comparison p-values between each tissue type
Table 2.7 Comparison between high-grade and low-grade AAH lesions
Table 3.1 Characteristics of subjects and tumors
Table 3.2 Characteristics of validation subjects
Table 3.3 MethyLight primer and probe sequences
Table 3.4 Top hypermethylated and downregulated genes in lung adenocarcinoma
Table 3.5 Top hypomethylated and upregulated genes in lung adenocarcinoma
Table 3.6 Top 10 hypermethylated loci in adenocarcinoma and squamous cell carcinoma
Table 4.1 Genes showing statistically significant differential DNA methylation between current and never-smoker tumors
Table 4.2 Analysis of genes previously identified as significantly differentially methylated between tumors of current and never-smokers
Table 4.3 Top genes showing statistically significant DNA methylation differences between KRAS mutant and KRAS wildtype tumors
Table 4.4 Genes showing statistically significant differences in gene expression between Cluster 1 and Cluster 2 tumors
Table 5.1 Subject and sample characteristics
Table B.1 Top 100 statistically significantly hypermethylated genes in lung adenocarcinoma
Table B.2 Top 100 statistically significantly hypomethylated genes in lung adenocarcinoma
Table C.1 Genes with positive correlations between DNA methylation and gene expression (hypermethylated and upregulated; hypomethylated and downregulated)
Table D.1 Top 100 statistically significantly different genes between DNA methylation-based clusters (Cluster 1-Cluster 2)
LIST OF FIGURES
Figure 1.1 Lung and lesion histology
Figure 2.1 Heatmap of DNA methylation levels of 15 loci and repeats in all tissue types
Figure 2.2 “Early” DNA methylation changes: scatterplots of loci significantly hypermethylated in AAH lesions compared to AdjNTL
Figure 2.3 “Intermediate” DNA methylation changes: scatterplots of loci significantly hypermethylated in AIS lesions compared to AAH
Figure 2.4 “Late” DNA methylation changes: scatterplots of loci significantly hypermethylated in adenocarcinoma compared to AIS
Figure 2.5 Global DNA methylation levels in AAH, AIS, and lung adenocarcinoma
Figure 2.6 Summary of DNA methylation changes in AAH, AIS and lung adenocarcinoma
Figure 3.1 Quality control and batch effects analysis of the DNA methylation data show no confounding batch effects
Figure 3.2 Diagram of analysis strategy
Figure 3.3 Identification of DNA methylation differences between lung adenocarcinoma and NTL
Figure 3.4 CpG island-associated Illumina Infinium HumanMethylation27 probes for CDH13
Figure 3.5 Intronic Illumina Infinium HumanMethylation27 probes for CDH13
Figure 3.6 Verification of DNA methylation differences between lung adenocarcinoma and NTL in independent samples
Figure 3.7 Verification of selected DNA methylation differences between lung adenocarcinoma and NTL with an alternate method
Figure 3.8 Identification of genes showing coordinately changed DNA methylation and gene expression
Figure 3.9 Characterization of genes showing coordinately changed DNA methylation and gene expression
Figure 3.10 Nextbio illustration of biosets found to be most significantly correlated with genes identified in this study to have functional changes in DNA methylation
Figure 3.11 Genes showing most significant changes in DNA methylation and gene expression
Figure 3.12 Correlations between DNA methylation and gene expression
Figure 3.13 DNA methylation levels of HOXB4, NID2 and TRIM58 on Illumina Infinium HumanMethylation27 and MethyLight
Figure 4.1 Identification of DNA methylation differences between lung adenocarcinoma tumors from smokers and never-smokers
Figure 4.2 SULT1C2 DNA methylation differences between Caucasians and Asians
Figure 4.3 DNA methylation and gene expression levels for SULT1C2 in NTL tissues
Figure 4.4 Hierarchical clustering of tumors identifies two distinct DNA methylation based clusters
Figure 4.5 Hierarchical clustering of tumors using 766 genes also identifies two distinct DNA methylation based clusters
Figure 4.6 No survival differences between the two DNA methylation based clusters
Figure 4.7 DNA methylation differences between clusters
Figure 4.8 DNA methylation differences between KRAS mutant and wildtype tumors
Figure 4.9 Gene expression differences between the two clusters
Figure 5.1 Probe failures in frozen and FFPE tissue samples
Figure 5.2 Infinium HumanMethylation27 beta-values and correlations between frozen and FFPE NTL tissues
Figure 5.3 Infinium HumanMethylation27 beta-values and correlations between frozen and FFPE tumors
Figure 5.4 Relationship between DNA amount as measured by ALU-C4 C(T) with probe failure and different measures of variance
Figure 5.5 Comparison of beta-values between frozen and FFPE tumor and NTL samples for six loci known to be hypermethylated in lung cancer
Figure 5.6 Hierarchical clustering of samples show close relationship between frozen and FFPE samples of the same subject
Figure A.1 Heatmap of DNA methylation levels of 15 hypermethylation loci and two hypomethylation repeat loci in all tissue types
Figure A.2 DNA methylation scatterplots of AIS, Mixed BAC and lung adenocarcinoma lesions
Figure A.3 Hierarchical clustering of Mixed BAC samples show at least two subgroups
ABSTRACT
Lung cancer accounted for 13% of total cancer cases and 18% of cancer
deaths globally in 2008. The combination of increasing smoking prevalence in
many developing countries and a long latency period predicts that lung cancer
will remain a major world health problem for decades to come. This work focuses
on DNA methylation in lung adenocarcinoma, a subtype of lung cancer that is the
most prevalent in the United States, as well as the most common amongst
women and non-smokers. We built on previously identified DNA methylation
early detection markers by delineating the timing of these changes in
preneoplastic lesions. We then used genome-scale DNA methylation profiling to
identify novel potential blood-based non-small cell lung cancer biomarkers, and
integrated DNA methylation information with gene expression data to identify
DNA methylation changes that may lead to functional consequences in the
development of cancer. Additionally, we identified a DNA methylation based
subgroup of lung adenocarcinoma that is associated with KRAS mutations and
smoking status, as well as explored the use of paraffin-embedded tissues to
facilitate larger genome-scale DNA methylation studies. Our findings provide
insights into the roles that DNA methylation may play in the development of lung
adenocarcinoma, as well as potential DNA methylation markers for the early
detection or risk assessment of the disease.
CHAPTER 1
INTRODUCTION
Lung cancer
Lung cancer is the leading cause of cancer-related death for both men
and women worldwide (Jemal et al. 2011). It is the most frequently diagnosed
cancer in men, accounting for 17% of total new cases, as well as the most lethal
cancer, accounting for 23% of total cancer deaths. For women, it is the fourth
most frequently diagnosed cancer worldwide and the second leading cause of
cancer death, with a mortality burden of 11%, equal to that of cervical cancer.
Despite the recent stabilization of lung cancer deaths in the United States due to
successful smoking cessation campaigns in recent decades, the increased use
of tobacco products in developing countries such as China (Jemal et al. 2010),
compounded by a long latency period, ensures that lung cancer will remain a
major health problem worldwide for many decades to come. Moreover, given that
approximately 15% of lung cancers in men and 50% of lung cancers in women
occur in patients who do not have a cigarette smoking history, there are clearly
other persistent factors that contribute to the development of lung cancer. These
may include mutations, such as EGFR mutations (Pao et al. 2004; Pao and
Girard 2011), hormonal factors such as the use of hormone replacement therapy
(Schabath et al. 2004), and environmental factors such as environmental tobacco
smoke (Vineis et al. 2004) and cooking oil vapors (Metayer et al. 2002), as
reviewed in (Sun et al. 2007).
Clinically, there are two broad categories of lung cancer: small cell
lung cancer (SCLC) and non-small cell lung cancer (NSCLC). SCLC is an
aggressive cancer located in the bronchi which accounts for 10-15% of all lung
cancers, and is almost always associated with cigarette smoking (Jackman and
Johnson 2005). NSCLC is also strongly associated with cigarette smoking, and
accounts for the majority of lung cancer cases (85-90%). NSCLC consists of
several different histological subtypes, including adenocarcinoma, squamous cell
carcinoma, and large cell carcinoma. Lung adenocarcinoma occurs in the
peripheral lung, squamous cell carcinoma usually starts in the bronchi, while
large cell carcinoma is predominantly identified by the larger size of tumor cells
and can be found throughout the lung. This dissertation will focus on lung
adenocarcinoma which has surpassed squamous cell carcinoma as the most
common histological subtype of lung cancer in the United States, and is also the
most common histological subtype amongst women, Asians and never-smokers
(Toh et al. 2006).
Lung adenocarcinoma
Unlike squamous cell carcinoma, the natural history of lung
adenocarcinoma is still poorly understood. Recent studies suggest that at least
some lung adenocarcinomas arise from preneoplastic lesions called atypical
adenomatous hyperplasia (AAH), which progress to adenocarcinoma in situ
(AIS), formerly known as bronchioloalveolar carcinoma (BAC), and eventually
develop into invasive cancer (Chapman and Kerr 2000; Westra 2000; Kerr 2001),
as illustrated in Figure 1.1.
In 1999, the WHO acknowledged AAH as a putative preneoplastic lesion
of lung adenocarcinoma. AAH is now defined as “localized proliferation of mild to
moderately atypical cells lining involved alveoli and sometimes respiratory
bronchioles, resulting in focal lesions in peripheral alveolated lung, usually less
than 5 mm in diameter” (Travis WD 1999). AIS has recently been defined as “a
localized small (<3 cm) adenocarcinoma with growth restricted to neoplastic cells
along preexisting alveolar structures (lepidic growth), lacking stromal, vascular,
or pleural invasion. Papillary or micropapillary patterns and intraalveolar tumor
cells are absent.” (Travis et al. 2011). AIS is also associated with 100% five-year
post-resection patient survival, and is similar in morphology to high-grade AAH
lesions.
Both AAH and AIS can be found as incidental findings in the lungs of
patients undergoing resection for a primary lung tumor, usually adenocarcinoma.
However, with the advent of more sensitive radiological imaging, they are now
being individually detected using fine section high resolution computed
tomography (Ikeda et al. 2007; Funama et al. 2009). A number of molecular
studies support the existence of an AAH-BAC-adenocarcinoma continuum (as
reviewed in (Kerr et al. 2007)). Loss of heterozygosity (LOH) events at 9q and
16p, key features of lung cancer, have been reported to occur at similar
frequencies in AAH and adenocarcinoma (Takamochi et al. 2001; Morandi et al.
2007), and the mutually exclusive nature of KRAS and EGFR mutations
reported in lung adenocarcinoma is maintained in AAH lesions (Sakamoto et al.
2007). Support for a developmental sequence from AAH to adenocarcinoma also
comes from conditional oncogenic mouse models for lung adenocarcinoma, in
which KRAS or EGFR genes are activated. In both types of mice, AAH-like
lesions are found before the emergence of adenocarcinomas (Jackson et al.
2001; Politi et al. 2006).
Figure 1.1. Lung and lesion histology. Haematoxylin and eosin-stained
sections of (A) Adjacent non-tumor lung (B) Atypical adenomatous hyperplasia
(C) Adenocarcinoma in situ (D) Lung adenocarcinoma, all at 100x magnification.
Investigating the molecular changes underlying preneoplastic lesions of
lung adenocarcinoma may be critical for understanding the etiology of the
disease, as well as for the establishment of potential molecular diagnostic
modules to supplement visual imaging techniques. One well established
molecular change that occurs in lung cancer is aberrant DNA methylation.
DNA methylation and cancer
DNA methylation in mammals is an epigenetic change in which a methyl
group is covalently added to the 5-position of a cytosine in the context of a CpG
dinucleotide. DNA methylation is a critical regulator of normal biological
processes such as differentiation (Meissner et al. 2008), imprinting (Bird 2002),
development (Okano et al. 1999), and X chromosome inactivation (Sharp et al.
2011). Aberrant DNA methylation, however, has been implicated in a number of
diseases, including cancer (Jones and Baylin 2002), and consists of both DNA
hypomethylation, or loss of DNA methylation (Ehrlich 2002), and DNA
hypermethylation, or gain of DNA methylation (Bird 1986).
DNA hypomethylation typically refers to a decrease in global levels of
DNA methylation, which has been associated with tumor progression (Kim et al.
1994) and may contribute to tumorigenesis by causing chromosomal instability
and increased mutation rates (Chen et al. 1998; Eden et al. 2003), potentially by
the activation of latent transposons or mobile DNAs such as long interspersed
nuclear element (LINE) sequences (Jurgens et al. 1996). There is controversy
in the field as to when DNA hypomethylation occurs. Theoretically, the earlier a
change occurs in the development of cancer, the more likely it is that this change
is a significant causal event, as opposed to a consequence of cellular
transformation. Some evidence indicates that DNA hypomethylation is an early
event (Goelz et al. 1985), while other studies, including our work covered in
Chapter 2, indicate it may be a later event in tumorigenesis (Yegnasubramanian
et al. 2008; Selamat et al. 2011).
The classical view of aberrant DNA hypermethylation is that it occurs in a
targeted fashion, typically at CpG islands located at the promoter regions of
genes (Gardiner-Garden and Frommer 1987; Takai and Jones 2002). DNA
hypermethylation has long been associated with gene silencing and cancer
(Herman and Baylin 2003). Its study began with candidate gene analyses of one or a
few genes at a time (Belinsky et al. 1998; Nuovo et al. 1999). This strategy has
evolved to microarray and bead-based genome-scale analyses involving
thousands of genes (Noushmehr et al. 2010; Hinoue et al. 2011) and next
generation sequencing based “methylome” studies (Irizarry et al. 2009; Kim et al.
2011).
Such studies have challenged the classical DNA methylation paradigm by
enabling the unexpected identification of long range DNA methylation, or
spreading of DNA methylation (Clark 2007), as well as roles for DNA methylation
on the border or just outside CpG islands, termed “shores” and “shelves” (Irizarry
et al. 2009), DNA methylation-regulated alternate transcripts (Maunakea et al.
2010), the role of DNA methylation in chromatin arrangement and organization of
the genome (Berman et al. 2011), miRNA regulation (Lopez-Serra and Esteller
2011), and even gene silencing by non-CpG island DNA methylation (Han et al.
2011). Importantly, these studies have also shown that DNA methylation
changes occur at hundreds of genes in cancer, leading to a new and important
question: Which of these changes are “driver” changes, or changes that
contribute to pathogenesis, and which are “passenger” changes, or changes that
are merely a consequence of the disease state? In Chapter 3, we explore one
strategy for determining functional DNA methylation changes by integrating
genome-scale DNA methylation information with corresponding genome-scale
gene expression data.
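The integration idea described above can be illustrated with a minimal sketch. This is not the analysis pipeline used in this dissertation; it only assumes that a gene whose methylation level is inversely correlated with its expression across samples is a candidate for a functional ("driver") change. The gene names, the values and the -0.5 cutoff are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def inversely_correlated_genes(methylation, expression, threshold=-0.5):
    """Return {gene: r} for genes whose methylation and expression are
    inversely correlated across samples.

    methylation, expression: dicts mapping gene -> per-sample values,
    with samples listed in the same order in both dicts.
    threshold: hypothetical cutoff; genes with r at or below it are kept.
    """
    hits = {}
    for gene, betas in methylation.items():
        if gene not in expression:
            continue  # no matching expression measurement for this gene
        r = pearson(betas, expression[gene])
        if r <= threshold:
            hits[gene] = r
    return hits


# Made-up values for two placeholder genes across five samples.
methylation = {
    "GENE_A": [0.1, 0.2, 0.5, 0.8, 0.9],
    "GENE_B": [0.3, 0.7, 0.2, 0.6, 0.4],
}
expression = {
    "GENE_A": [9.0, 8.5, 6.0, 3.0, 2.5],
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 5.0],
}
hits = inversely_correlated_genes(methylation, expression)
```

In this toy data only GENE_A is retained, because its expression falls as its methylation rises. A real analysis would also need multiple-testing correction and a comparison against non-tumor tissue, which this sketch omits.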
DNA methylation in lung adenocarcinoma
DNA methylation in lung cancer has been studied mostly at candidate-
gene levels, and numerous hypermethylated genes in lung cancer have been
identified (as reviewed in (Anglim et al. 2008a)). While initial studies typically
included all NSCLCs for convenience, it was soon established that lung cancers of
different subtypes have distinct DNA methylation profiles (Tsou et al. 2005,
among others), and that DNA methylation studies need to take histological
subtypes into account.
Abnormal DNA methylation has not yet been thoroughly examined in AAH
and AIS, the putative precursor lesions to lung adenocarcinoma. At the time of
writing, only a few studies have attempted to analyze DNA methylation in these
lesions, including our analysis detailed in Chapter 2 (Licchesi et al. 2008a;
Licchesi et al. 2008b; Chung et al. 2011; Selamat et al. 2011). However, DNA
methylation in lung adenocarcinoma itself has been well studied. The best
established methylated loci in lung adenocarcinoma are CDKN2A,
RASSF1A and CDH13. CDKN2A, or cyclin-dependent kinase inhibitor 2A, also
known as p16, is a cell cycle regulator and a known tumor suppressor gene
frequently mutated in a number of cancers (Kamb et al. 1994). Promoter DNA
methylation of the gene was first shown to correlate with gene silencing in cell
lines (Merlo et al. 1995), and then shown to occur early in development of an
animal model of lung carcinogenesis (Belinsky et al. 1998). CDKN2A showed
great initial promise as a lung cancer biomarker, as it was methylated in sputum
DNA of subjects up to 3 years before clinical diagnosis (Palmisano et al. 2000).
In Chapter 2, we show that exon 2 of CDKN2A was
methylated in AAH lesions, supporting its identification as an early DNA
methylation change in the development of lung adenocarcinoma (Selamat et al.
2011).
Another well studied hypermethylated locus in lung adenocarcinoma is
RASSF1A, or Ras association (RalGDS/AF-6) domain family member 1, variant
A, a gene involved in both cell cycle regulation and apoptosis (Shivakumar et al.
2002; Oh et al. 2006). RASSF1A is located on chromosome arm 3p21.3, a
chromosome region frequently deleted in lung cancer, and was identified as
frequently epigenetically inactivated (Dammann et al. 2000). DNA methylation of
RASSF1A was associated with the lung adenocarcinoma histology (Liu et al.
2007), and was also found to be concordantly methylated in DNA found in patient
plasma (Hsu et al. 2007), making it a potential blood-based biomarker for lung
adenocarcinoma. However, in Chapter 2 we show that RASSF1A methylation does
not appear in AAH or AIS lesions, the putative precursors in lung adenocarcinoma
progression, and therefore may not occur early enough for diagnostic marker purposes.
CDH13 is a member of the cadherin family, and is involved in cell-cell
adhesion. Its location on chromosome arm 16q24.1, a frequently deleted region
in lung cancer, first brought it to the attention of researchers, and it was first
reported to be hypermethylated based on lung cancer cell line studies (Sato et al.
1998). DNA methylation of the CDH13 promoter region was then shown to be
correlated with loss of expression in both breast and lung tumors (Toyooka et al.
2001a). In 2008, Brock et al. found that DNA methylation of CDH13 and CDKN2A
was associated with increased risk of early recurrence in early stage lung
cancer (Brock et al. 2008).
These studies indicate that DNA methylation of CDKN2A, RASSF1A and
CDH13 appears to be both molecularly and clinically relevant for the development
and progression of lung cancer. These genes were initially discovered in
candidate-gene studies, often initiated by known genetic deletion and/or mutation
studies. Recent technological advances, however, have enabled the less biased
determination of DNA methylation levels at tens of thousands of genes
simultaneously (Laird 2010), also known as DNA methylation profiling, or
“methylome” analyses. These studies can be grouped into three categories:
biomarker studies, functional studies, and subtype discovery studies.
DNA methylation: Lung cancer biomarker
The overall five-year survival of patients with lung cancer is very poor. In
the United States it is a dismal 18% despite extensive efforts to improve
diagnosis and treatment (Horner MJ 2009), partly because most patients are
diagnosed only when the cancer has progressed and clinical symptoms are
present. Since lung cancer survival is correlated with the stage at which the cancer
is diagnosed, early detection and successful surgical intervention would help
reduce lung cancer mortality (Mountain 1997). However, unlike breast or
colorectal cancers, there is as yet no widely implemented, easily accessible
screening tool for persons at high risk for lung cancer. Sensitive new visual
diagnostic modalities such as low dose spiral computed tomography (LDSCT) do
show potential to detect much smaller lung lesions than conventional chest X-
rays, and recent reports from the National Lung Screening Trial demonstrate a
20% reduction in mortality (Aberle et al. 2011). However, the results of this study
clearly show that the increased sensitivity to detect stage I cancers and
preneoplastic lesions does not overcome the already significant problem of low
specificity faced by traditional chest x-rays; the majority of these lesions may be
indolent or non-lethal in nature, leading to potentially unnecessary and invasive
follow-up procedures (Black 2000).
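The specificity problem described above can be made concrete with a short calculation of positive predictive value, the fraction of positive screening results that are true cancers. The sketch below is illustrative only: the sensitivity, specificity and prevalence figures are hypothetical and are not taken from the National Lung Screening Trial or any other study cited here.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive screening results that are true positives.

    All three arguments are proportions between 0 and 1.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical screening population: 1% of those screened have cancer
# and the test detects 90% of them.
ppv_low_specificity = positive_predictive_value(0.90, 0.75, 0.01)
ppv_high_specificity = positive_predictive_value(0.90, 0.95, 0.01)

print(f"PPV at 75% specificity: {ppv_low_specificity:.3f}")
print(f"PPV at 95% specificity: {ppv_high_specificity:.3f}")
```

Under these assumed numbers, most positive results are false positives at either specificity, but raising specificity increases the positive predictive value several-fold. This is the reasoning behind pairing imaging with a more specific second test.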
Improvements on visual imaging such as Fluorine-18 fluorodeoxyglucose
labeled positron emission tomography (¹⁸FDG-PETCT) show potential to improve
specificity (Subedi et al. 2009). ¹⁸F-FDG is a glucose analog which is more
readily absorbed and retained by tumor cells. Further improvements including
PETCT-based labeling of lung tumors containing specific mutations, such as the
L858R EGFR PETCT, may provide another layer of specificity and even direction
for patient treatment (Yeh et al. 2011).
Another approach to provide specificity to diagnostic or screening
procedures could come from molecular biomarkers. Such markers might
complement LDSCT screening, allowing earlier detection of cancerous or
precancerous lesions, and might also serve as prognostic markers. The analysis of
DNA methylation might provide such a test.
DNA methylation differences between tumor and non-tumor tissues have
been extensively studied, and can be used to distinguish diseased from
non-diseased tissues (as reviewed in (Anglim et al. 2008a)). Therefore, DNA
methylation may be a possible molecular marker to complement image-based
lung cancer screening. There are several advantages to DNA methylation as a
biomarker. While mutations of a particular gene can occur in many positions
throughout the gene promoter, introns and gene body, screening for DNA
hypermethylation typically involves looking only at the CpG islands on or near the
promoters of target loci. In contrast to proteins, DNA methylation markers are
PCR-amplifiable, allowing for improved sensitivity and detection. DNA
methylation marks are also extremely stable, facilitating clinical implementation
without the degradation issues that affect RNA and protein.
DNA methylation profiles have not only been shown to be markedly
different between tumors and non-tumor tissues, but importantly for the
development of biomarkers, have been detected in a variety of less invasively
obtainable biological samples, including sputum, urine, stool, exhaled breath
condensate and blood, either as free DNA shed by tumors, or as circulating
tumor cells (Esteller et al. 1999; Usadel et al. 2002; Bremnes et al. 2005). A
handful of DNA methylation biomarkers have made it to commercial production,
even though they are not yet widely adopted for clinical use. Amongst these are
SEPT9 DNA hypermethylation in stool as a test for colorectal cancer (Lofton-Day
et al. 2008), SHOX2 DNA hypermethylation in bronchial lavage as a test for lung
cancer (Schmidt et al. 2010), and GSTP1 DNA hypermethylation in urine as a
test for prostate cancer (Rosenbaum et al. 2005).
Strikingly, none of these validated markers are blood-based markers.
Despite the knowledge that cancer patients appear to have higher levels of
circulating DNA in blood than healthy subjects (Shapiro et al. 1983) and the
identification of many cancer-associated hypermethylated loci, many challenges
remain in the detection of DNA methylation in blood. One such challenge is the
issue of sensitivity. If a marker is to be used for early detection, it necessarily
must be shed from a small tumor, and it is unknown whether there will be enough
tumor-derived DNA (and methylated DNA) in the blood to detect. However,
improvements in detection technologies such as Digital MethyLight, with the
ability to detect single methylated molecules, may eventually resolve this issue
(Weisenberger et al. 2008).
Another challenge to implementing blood-based biomarkers is the
uncertainty regarding the origin of any cell-free or circulating tumor DNA
(Schwarzenbach et al. 2011). Many loci are methylated in multiple solid cancers,
so that an observed cancer-specific methylation event might suggest cancer but
may not accurately pinpoint the organ affected. This problem, however, can be
addressed if the biomarker is used in conjunction with a visual imaging modality
such as CT-scans to determine the location of the tumor. Contaminating white
blood cell DNA could also be a problem, as these cells have their own DNA
methylation profiles. Potential markers should therefore be filtered against white
blood cell DNA methylation profiles. More recently, differential DNA methylation
profiles have even been shown in the white blood cells of patients with disease
(Terry et al. 2011). If these results hold true, they would overcome the problem of
sensitivity associated with the need to examine minute amounts of circulating
tumor cells or tumor-derived DNA in patient plasma or serum (Laird 2003) and of
contaminating leukocyte DNA. However, the basis for cancer-specific DNA
methylation patterns in leukocytes remains to be established.
Importantly, a good biomarker must not only be robust enough to be detected in
remote bodily fluids and have good sensitivity and specificity; it must also
occur early enough during the development of cancer that successful surgical
intervention is still possible. Therefore, in our search for potentially useful
early detection markers, we and others have begun looking into the timing at
which DNA hypermethylation occurs, and by extension, the possible role DNA
methylation plays in the development of lung adenocarcinoma (Licchesi et al.
2008a; Licchesi et al. 2008b; Selamat et al. 2011).
DNA methylation: Functional changes in lung cancer
The early DNA methylation studies in cancer were candidate gene
studies, often asking whether DNA methylation causes silencing of known tumor
suppressor genes. The functional consequences of DNA methylation-associated
gene silencing in these single or moderate-sized gene studies were already
known. With the advent of genome-scale and genome-wide DNA methylation
assay technologies, it became apparent that for many different types of cancer,
there are hundreds of genes with accompanying DNA methylation changes,
paralleling the many hundreds of mutations observed with next-generation
sequencing of individual tumors (Table 1.1).
It is highly unlikely that every gene targeted by DNA methylation will have
tumor suppressor function, or contribute to tumorigenesis. One of the major
questions that has emerged in DNA methylation profiling studies is the
differentiation between “driver” DNA methylation changes, i.e. those DNA
methylation changes that have a functional consequence and contribute to
tumorigenesis, and “passenger” DNA methylation changes, i.e. those changes
that are an effect of the molecular changes brought about by tumorigenesis, but
do not themselves contribute to tumor development. The mechanism by which
either type of DNA methylation change occurs is still unknown, and may be a
combination of permissive DNA sequence and/or chromatin organization
(McCabe et al. 2009; Gebhard et al. 2010). A substantial portion of genes that
become DNA hypermethylated in several solid cancers are homeobox genes
(Rauch et al. 2007) which are key master regulators of developmental processes
also deregulated in cancer through other mechanisms (Abate-Shen 2002).
Homeobox genes are part of a wider category of genes known as polycomb
group targets, which are commonly methylated genes in cancer (Widschwendter
et al. 2007). Indeed, Tommasi et al. found that more than 50% of the genes they
identified as hypermethylated in breast cancer were known polycomb group
targets (Tommasi et al. 2009).
Table 1.1. Genome-scale DNA methylation studies in lung cancer
Paper | Method | Assay | Subjects | Major findings
(Dai et al. 2001) | RLGS [a] | ~1184 CpG islands | 16 AD [b]/NTL [c] | 11 genes and 6 ESTs hypermethylated in AD
(Bibikova et al. 2006) | Illumina GoldenGate | 1536 CpGs/371 genes | 23 AD/NTL | 55 markers for AD
(Rauch et al. 2007) | MIRA [d] + NimbleGen tiling array/MCAM [e] | ~27,800 CpG islands (MCAM) | A549 cell line | HOX gene clusters are preferential targets for DNA methylation AD
(Goto et al. 2009) | MCAM | 15,134 probes/6,157 genes | 20 Meso [f], 20 AD | Meso and AD have distinct DNA methylation profiles
(Christensen et al. 2009) | Illumina GoldenGate | 1413 CpGs/773 genes | 158 Meso, 57 AD, 18 pleura, 48 NTL | Meso and AD have distinct DNA methylation profiles. 1266 loci with different methylation between meso and AD
(Helman et al. 2011) | MIRA + NimbleGen tiling array | | 5 AD/NTL | DNA methylation selectively silences developmental genes required for the maintenance of a differentiated state
(Kwon et al. 2011) | MIRA/Illumina GenomeAnalyzer & CodeLink Expression Array | | DNA methylation: 3 pooled SCC [g]/NTL; Gene expression: 21 SCC/NTL | 30 hypermethylated and downregulated genes, 22 hypomethylated and upregulated genes in SCC
(Son et al. 2011) | Illumina GoldenGate | 1505 CpG/807 genes | 11 AD/NTL | 6 hypermethylated and 9 hypomethylated loci in AD
[a] Restriction landmark genomic scanning
[b] Lung adenocarcinoma
[c] Non-tumor lung tissue
[d] Methylated CpG island recovery assay
[e] Methylated CpG island amplification microarray
[f] Mesothelioma
[g] Squamous cell carcinoma
Compounding the complexity of hypermethylated gene promoters, higher
resolution arrays and bisulfite next-generation sequencing technologies have
also found that there is frequent tumor-specific DNA methylation outside of
promoter regions, some potentially regulating alternative transcripts and miRNA
expression (Maunakea et al. 2010; Lopez-Serra and Esteller 2011). More
recently, even non-CpG island DNA methylation silencing has been observed
(Han et al. 2011).
Even without considering the newly discovered complexities in DNA
methylation, the consequences of DNA methylation of hundreds of genes in
cancer are still unknown. One possibility is that at least some of these genes are
already silent in adult normal lung tissue, for example embryonic developmentally
important genes; DNA methylation would therefore have no functional
consequences. Another possibility is that the developmental genes often seen
aberrantly methylated may also play a role in cellular or tissue differentiation, and
therefore the deregulation of this process may contribute to the observed de-
differentiation often seen in cancer progression. Lastly, there is of course also the
possibility that DNA methylation does effect a change in expression of a
particular gene, which could potentially contribute to tumorigenesis. In order to
come to that conclusion, in-depth functional assays need to be performed,
ranging from in vitro forced expression and knockdown assays, to animal model
knockout and transgenic studies. These experiments, however, are time-
consuming, labor-intensive and expensive, and cannot easily be performed for
hundreds of candidate genes.
There are several methods in use to prioritize candidate genes for follow-
up experiments, each with its own advantages and disadvantages. One way,
which has been in use since candidate-gene studies, is to focus on methylation
frequency with the assumption that genes which are functionally important will
also be the most commonly inactivated in cancer. This approach, however, may
miss less penetrant epigenetic changes, and miss connections between different
genes targeting the same pathways. Another classic approach is to focus on
specific pathways known to be deregulated in cancer, for example the Wnt
pathway (Dai et al. 2011; Hill et al. 2011). This, however, precludes the discovery
of new targets.
A newer approach is to integrate data from different genome-scale
platforms assaying different mechanisms of gene silencing. The rationale behind
this is that genes which are commonly targeted by different mechanisms, for
example genes which are commonly methylated and also found in regions of
copy number changes or loss of heterozygosity, or are targeted by miRNA
degradation, or overlap with regions of histone modifications (Andrews et al.
2010; Nishiyama et al. 2011) are likely to be genes of functional importance to
tumor development. A more direct method is to integrate DNA methylation
profiles with gene expression profiles (Noushmehr et al. 2010; Hinoue et al.
2011) to identify those DNA methylation changes which correspond to gene
expression changes. This is the method we chose to employ in our own
identification of potential functional DNA methylation changes in lung
adenocarcinoma, detailed in Chapter 3 (Selamat, S.A., manuscript in review).
Integration of genome-scale data has a further advantage as gene lists can be
analyzed for gene ontologies, gene networks and pathway involvements.
However, there are challenges associated with this approach, namely data
curation, sample processing, data pre-processing, batch effects, as well as
difficulties in obtaining enough high-quality material needed to test the same
samples on multiple high-throughput platforms.
DNA methylation and tumor heterogeneity
Lung adenocarcinoma is increasingly recognized as a clinically and
molecularly heterogeneous disease, which has important prognostic and
therapeutic implications. This is exemplified by recent re-classifications of lung
adenocarcinoma sub-types based on pathology and patient survival (Travis et al.
2011), the increasing number of clinical trials demonstrating targeted treatments
that specifically benefit patients defined by molecular subtypes such as EGFR,
KRAS, BRAF, and ERBB2 mutations and EML4-ALK fusions (Pao et al. 2004;
Pao et al. 2005a; Pao et al. 2005b; Pao and Girard 2011), as well as observed
prognostic gene expression signature profiles (Bhattacharjee et al. 2001; Beer et
al. 2002; Larsen et al. 2007). More recently, the rapidly expanding field of
epigenetic profiling has confirmed the existence of DNA methylation-based
subtypes in several cancers (Issa 2004; Li et al. 2010; Noushmehr et al. 2010;
Hinoue et al. 2011).
The best-established DNA methylation-based subgroup is that of CpG
island methylator phenotype (CIMP), first identified in colorectal cancer (Toyota
et al. 1999). CIMP tumors possess high frequency and levels of cancer-specific
DNA methylation at loci which show little or no methylation in non-CIMP tumors.
CIMP is sometimes associated with differences in patient survival and is closely
associated with BRAF activating mutations (Weisenberger et al. 2006). However,
the molecular mechanism underlying CIMP has not yet been elucidated
(Teodoridis et al. 2008). The existence of CIMP has been suggested in NSCLC
(Marsit et al. 2006; Suzuki et al. 2006) using a very limited number of genes;
however, another study did not support the existence of CIMP in lung cancer
(Vaissiere et al. 2009). In Chapter 4, we use the Illumina Infinium
HumanMethylation27 platform with the ability to interrogate 27,578 probes
simultaneously to enable a more thorough examination of DNA methylation in
lung adenocarcinoma.
An additional epigenotype, termed CIMP-low (CIMP-L, Intermediate-
methylation epigenotype, or CIMP2) has been reported and confirmed in several
independent populations of colorectal tumors using different methodologies
(Ogino et al. 2006; Shen et al. 2007; Yagi et al. 2010; Hinoue et al. 2011). CIMP-
low exhibits moderately high levels of DNA hypermethylation at a subset of
CIMP-associated loci, and in each study was found to be associated with KRAS
mutation. The association between DNA methylation subtypes and clinical
variables is especially important as it provides some insight into the molecular
mechanisms underlying epigenetic differences, as well as having potential
prognostic and therapeutic implications. CIMP in glioblastoma is closely tied to
IDH1 mutations and tends to occur in younger patients, and patients with G-CIMP
tumors have significantly better survival (Noushmehr et al. 2010). In contrast, in
colorectal cancer CIMP is strongly correlated with BRAF mutations and MLH1 DNA
hypermethylation (Weisenberger et al. 2006; Hinoue et al. 2011). In lung cancer,
the most commonly tested mutations are KRAS and EGFR mutations, which are
mutually exclusive and are strongly associated with patient smoking history. A
natural question for DNA methylation profiling studies of lung adenocarcinoma
therefore is whether patients with different clinical profiles such as KRAS or
EGFR mutations, smoking status, gender or race have differing DNA methylation
profiles.
This is an especially important question since worldwide, about 15% of
men with lung cancer are non-smokers, and up to 50% of women with lung
cancer are non-smokers (Jemal et al. 2010). Lung adenocarcinoma in smokers
and that in never-smokers have distinct genetic, copy number, and gene
expression profiles (Sun et al. 2007; Landi et al. 2008; Huang et al. 2011).
Candidate-gene based studies have hinted at differences in DNA methylation
between smoker and never-smoker lung adenocarcinomas (Belinsky et al. 2002;
Toyooka et al. 2003), but Chapter 4 details the first genome-scale DNA
methylation profiling in lung adenocarcinoma to attempt to address this question.
Although in the United States more than 90% of lung cancer in men and
75% of lung cancer in women is attributable to cigarette smoking, this proportion
is much lower in Asian women with lung cancer (Subramanian and Govindan
2007), leading to a hypothesis of higher lung cancer risk amongst Asian never-
smokers and a search for alternative risk factors. Some proposals include
genetic factors such as single nucleotide polymorphisms (SNPs) and
environmental factors such as cooking fumes (reviewed in Sun et al. 2007;
Jemal et al. 2010). A recent study found significant differences in leukocyte
DNA methylation profiles between genders and ethnicities (Zhang et al. 2011).
However, much larger sample sizes are needed to fully investigate this issue.
Current DNA methylation array and sequencing technologies require high quality
DNA, resulting in a bottleneck in the procurement of sufficient quantities of well-
annotated fresh frozen samples, a problem addressed in Chapter 5 by exploring
the use of formalin-fixed paraffin-embedded (FFPE) tissue blocks which are
routinely collected in hospitals during patient treatment.
The following chapters will address our contributions to lung
adenocarcinoma DNA methylation profiling studies in the search for biomarkers,
functional changes, and DNA methylation-based differences and subtype
discovery.
CHAPTER 2
DNA METHYLATION CHANGES IN ATYPICAL
ADENOMATOUS HYPERPLASIA, ADENOCARCINOMA
IN SITU, AND LUNG ADENOCARCINOMA
Chapter 2 Abstract
Aberrant DNA methylation is a common event in lung adenocarcinoma,
but its timing in the phases of tumor development is largely unknown. Delineating
when abnormal DNA methylation arises may provide insight into the natural
history of lung adenocarcinoma and into the role that DNA methylation alterations
play in tumor formation. We used MethyLight, a sensitive real-time PCR-based
quantitative method, to analyze DNA methylation levels at 15 CpG islands that
are frequently methylated in lung adenocarcinoma and that we had flagged as
potential markers for non-invasive detection. We also used two repeat probes
(ALU-M2 and SAT2-M1) as indicators of global DNA hypomethylation. We
examined DNA methylation in tissue samples spanning the putative spectrum of
peripheral lung adenocarcinoma development: histologically normal adjacent
non-tumor lung, atypical adenomatous hyperplasia (AAH), adenocarcinoma in
situ (AIS), and invasive lung adenocarcinoma. Comparison of DNA methylation
levels between the putative sequential lesion types suggests that DNA
hypermethylation of distinct loci occurs at different time points during the
development of lung adenocarcinoma. DNA methylation at CDKN2A ex2 and
PTPRN2 is already significantly elevated in AAH, while CpG islands at 2C35,
EYA4, HOXA1, HOXA11, NEUROD1, NEUROD2 and TMEFF2 are significantly
hypermethylated in AIS. In contrast, hypermethylation at CDH13, CDX2,
OPCML, RASSF1, SFRP1 and TWIST1 and global DNA hypomethylation appear
to be present predominantly in invasive cancer. The gradual increase in DNA
methylation seen for numerous loci in progressively more transformed lesions
supports the model in which AAH and AIS are sequential stages in the
development of lung adenocarcinoma. The demarcation of DNA methylation
changes characteristic for AAH, AIS and adenocarcinoma begins to lay out a
possible roadmap for aberrant DNA methylation events in tumor development. In
addition, it identifies which DNA methylation changes might be used as molecular
markers for the detection of preinvasive lesions.
Introduction
Lung cancer is the leading cause of cancer-related death in the world, and
is estimated to have caused over 1.3 million deaths in 2008 (Garcia M 2007;
Ferlay et al. 2010). Despite massive efforts to improve diagnosis and treatment
for lung cancer patients over many decades, the overall five-year survival of
patients with lung cancer remains a low 18% (Horner MJ 2009). Recent studies
demonstrate that sensitive new imaging modalities such as low-dose spiral
computed tomography (LDSCT) have the potential to detect much smaller lung
lesions than conventional chest X-ray. The National Lung
Screening Trial was halted ahead of schedule since investigators reached their
goal of 20% reduction of mortality (Aberle et al. 2011). However, even though
mortality is reduced, this imaging modality shows limited specificity; while LDSCT
can detect stage I cancers, a number of these may not actually progress to late
stage cancer (Black 2000). Thus, in order to avoid unnecessary interventions, we
must gain better insight into the molecular changes underlying the natural history
of lung cancer. Such knowledge could be used to develop additional molecular
tests that might complement LDSCT screening, allowing the more specific
detection of those lesions that would progress to tumors with metastatic potential.
The analysis of DNA methylation might provide such a test.
Abnormal DNA methylation is an epigenetic change that has been widely
observed in all types of cancer including lung cancer (Belinsky 2004; Kerr et al.
2007; Anglim et al. 2008a). It consists of the addition of a methyl group to the
5-position of cytosine in the context of a two-base pair palindrome, or CpG
dinucleotide. Sensitive molecular assays allow detection of DNA methylation in
tumors as well as in patient bodily fluids (Eads et al. 2000; Laird 2003; Anglim et
al. 2008a; Weisenberger et al. 2008), and it therefore holds much promise as a
possible molecular marker to complement image-based lung cancer screening.
This study focuses on lung adenocarcinoma, a histological subtype of lung
cancer that is increasing in many countries (Koyi et al. 2002; Au et al. 2004;
Chen et al. 2007; Tse et al. 2009), and which currently accounts for at least 37%
of all lung cancer in the United States (Horner MJ 2009). While smoking remains
the predominant cause of lung adenocarcinoma, this histological subtype is also
the most common form of lung cancer amongst never smokers, Asians and
women (Horner MJ 2009). There is still much to learn about the natural history of
lung adenocarcinoma. The WHO has recognized a putative preneoplastic lesion
of lung carcinoma, termed atypical adenomatous hyperplasia (AAH) (Travis WD
1999). An accumulating number of studies have suggested that at least some
lung adenocarcinomas arise from these preneoplastic lesions, which progress
to adenocarcinoma in situ (AIS) and can eventually develop into invasive
cancer (reviewed in Chapman and Kerr 2000; Kerr 2001; Travis et al. 2011).
Abnormal DNA methylation has not yet been thoroughly examined in AAH
and AIS. Extensive investigation of DNA methylation in AAH has been impeded
by the minute size of these lesions and the necessity to use bisulfite conversion.
This chemical treatment specifically deaminates unmethylated cytosine to uracil,
but not 5-methylated cytosine, thereby embedding DNA methylation information
into the DNA sequence. Unfortunately, bisulfite treatment can result in
considerable degradation of already scarce genetic material (Tanaka and
Okamoto 2007). Previously, DNA methylation analysis of AAH has required the
use of multiplexed nested methylation-specific polymerase chain reaction (MS-
PCR), disallowing quantitative assessment of methylation and limiting the
number of genes that can be tested (Licchesi et al. 2008a; Licchesi et al. 2008b).
In this study, we overcame these limitations by using the sensitive technology
MethyLight, which uses real-time PCR on bisulfite-converted DNA, with primers
and probes designed to specifically hybridize to methylated regions that retained
cytosines. We used MethyLight to quantitatively assess DNA
methylation levels at 15 CpG islands prone to hypermethylation in lung
adenocarcinoma, and also assessed global hypomethylation by examining
methylation of repeated sequences. We examined DNA methylation in tissue
samples spanning the putative spectrum of peripheral lung adenocarcinoma
development: histologically normal adjacent non-tumor lung from non-lung
cancer patients (MetNTL) as well as lung cancer patients (AdjNTL), atypical
adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), and invasive
lung adenocarcinoma (Figure 1.1).
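The effect of bisulfite conversion on sequence, as described above, can be illustrated with a minimal in-silico sketch. It assumes complete conversion and models only one strand as it would read after PCR (uracil amplified as thymine). The function name, the example sequence and the methylated positions are hypothetical and are not taken from this study.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the top-strand sequence as read after bisulfite treatment and PCR.

    Unmethylated cytosines are deaminated to uracil and amplified as thymine;
    cytosines at the given 0-based methylated positions are retained.
    Illustrative sketch only: assumes 100% conversion efficiency.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # unmethylated C -> U -> read as T
        else:
            converted.append(base)  # methylated C and all other bases unchanged
    return "".join(converted)


# Hypothetical 8-base fragment with CpG cytosines at positions 1 and 5.
fragment = "ACGTCCGA"
unmethylated_read = bisulfite_convert(fragment)        # all C converted
methylated_read = bisulfite_convert(fragment, {1, 5})  # CpG cytosines retained
print(unmethylated_read, methylated_read)
```

Primers and probes that match only the retained-cytosine version of a region will therefore hybridize preferentially to originally methylated templates, which is the principle the MethyLight design relies on.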
Since AAH lesions provide very little DNA, we carefully weighed our choice
of the 15 loci to interrogate. We chose these loci as they constitute strong
candidates for non-invasive DNA methylation markers for lung adenocarcinoma
detection. From a prescreening of 114 loci, we had previously chosen 28 that
were most differentially methylated between tumor and adjacent non-tumor lung
for further analysis and had identified 12 of these loci as significantly
hypermethylated in lung adenocarcinoma (Tsou et al. 2007). We have
subsequently evaluated hundreds more loci (using individual probes, a CpG
island microarray, and an Illumina GoldenGate analysis), yielding additional loci
highly and frequently methylated in lung adenocarcinoma (unpublished data).
From our cumulative data sets, we chose the 15 loci that showed the most
promise for development into lung adenocarcinoma molecular markers (Table
2.1). These loci are also of interest for the potential biological implications of their
DNA methylation.
Fourteen of these represented loci that were most frequently and highly
methylated: 2C35, CDH13, CDKN2A ex2, CDX2, EYA4, HOXA1, HOXA11,
NEUROD1, NEUROD2, OPCML, PTPRN2, SFRP1, TMEFF2 and TWIST1. We
also added RASSF1 because we had observed that, although its methylation
frequency is not as high in adenocarcinoma as some other loci (Dammann et al.
2000; Tsou et al. 2007), it can be methylated in those adenocarcinomas showing
little methylation of the other commonly methylated loci; in other words, its
methylation profile can be complementary.
Table 2.1. Genes and gene functions
HUGO [a] | Gene Name; Alternate or previous names | Gene Function [b]
2C35 | Restriction Landmark Genome Scanning fragment, no known gene associated | Unknown
CDH13 | Cadherin 13; H (heart)-cadherin; CDHH | Cell adhesion
CDKN2A | cyclin-dependent kinase inhibitor 2A; p16; INK4A | Tumor suppressor, cell cycle, cell proliferation, apoptosis
CDX2 | Caudal type homeo box 2; CDX3 | Transcription regulation
EYA4 | Eyes absent homolog 4 (Drosophila); DFNA10; CMD1J | Development, DNA repair
HOXA1 | Homeobox protein A1; HOX1F; HOX1 | Transcription factor, development
HOXA11 | Homeobox protein A11; HOX1I; HOX1 | Transcription factor, development
NEUROD1 | Neurogenic differentiation 1; BETA2; BHF-1; NeuroD; bHLHa3; MODY6 | Transcription factor, differentiation
NEUROD2 | Neurogenic differentiation 2; NDRF; bHLHa1 | Transcription factor, differentiation
OPCML [3] | Opioid binding protein/cell adhesion molecule-like; OPCM; OBCAM; IGLON1 | Opioid receptor, cell contact
PTPRN2 | Protein tyrosine phosphatase, receptor type, N polypeptide 2; KIAA0387; phogrin; ICAAR; IA-2beta | Cell growth, differentiation, cell cycle, oncogenic transformation
RASSF1 | Ras association (RalGDS/AF-6) domain family 1; NORE2A; REH3P21; RDA32; 123F2 | Apoptosis, cell cycle regulation, potential tumor suppressor
SFRP1 | Secreted frizzled-related protein 1; FRP; FRP-1; SARP2 | Modulates Wnt signaling, cell growth, differentiation, angiogenesis
TMEFF2 | transmembrane protein with EGF-like and two follistatin-like domains 2; TENB2; HPP1; TR; TPEF; CT120.2 | Cell proliferation
TWIST1 | Twist homolog (acrocephalosyndactyly 3; Saethre-Chotzen syndrome) (Drosophila) | Transcription factor
[a] HUGO gene name, www.genenames.org
[b] Gene function derived from GeneCards, www.genecards.org
We have validated these 15 CpG islands as being significantly hypermethylated
in lung adenocarcinoma compared to adjacent non-tumor lung in two additional
independent sample collections (Tsou et al. 2007 and unpublished results).
Besides local hypermethylation at CpG islands, global DNA
hypomethylation is also a hallmark of cancer, and is associated with
retrotransposon activation and genomic instability (Ehrlich 2002). In order to
examine hypomethylation in our analysis, we included two repeat-based DNA
methylation probes (SAT2-M1 and ALU-M2) in the study. The mean methylation
of these two probes has been shown to correlate well with global DNA
methylation levels (Weisenberger et al. 2005). Thus, our selection of probes was
tailored to provide key insights into the occurrence of DNA methylation
alterations in putative precursor lesions to lung adenocarcinoma.
Materials and Methods
Ethics statement
All human tissue samples were paraffin-embedded archival remnants of tissue
resected for clinical purposes, and were obtained from Aberdeen University
Medical School. The research was approved as exempt from the need to obtain
informed consent by the USC IRB (# HS-CG-07-00017) and by the Grampian
Research Ethics Committee (study 05/S0801/141). No consents were required
as these were archival tissue remnants that were largely from deceased patients
and because samples were de-identified prior to being sent to the laboratory, so
that identities of the subjects were unknown to USC investigators or lab
personnel.
Study subjects
Information on the subjects from whom the samples were procured is provided in
Table 2.2. Due to the archival nature of the samples, and the fact that many
patients had long since been deceased, very limited smoking information was
available. Of the 19 subjects for whom smoking information was obtained, only
one was a non-smoker. Those subjects who stated they were ex-smokers all had
smoked for many years (range 10-100 pack-years), but the interval between
smoking cessation and lung cancer diagnosis was not known for most. A detailed
investigation of the effects of smoking on the presence of DNA methylation in the
various lesions was therefore not feasible in this study.
Table 2.2. Information on study subjects.
 | AdjNTL, AAH, AIS, Adenocarcinoma subjects | MetNTL subjects
Number of subjects | 63 | 30
Median age [1] | 65 | 60.5
Age range | 42-80 | 35-74
Gender [2] | 32 F, 31 M | 11 F, 19 M
Subjects for whom smoking status is known | 19 | 0
Confirmed smokers | 18 | unknown
Pack-years | 20-100 [3] | unknown
Nonsmokers | 1 | unknown
[1] No statistically significant difference in age between subjects providing AdjNTL, AAH, AIS and adenocarcinoma samples (ANOVA); statistically significant difference in age between AdjNTL and MetNTL subjects (p=0.0281, two-tailed t-test).
[2] No statistically significant difference in gender between any groups (ANOVA).
[3] Sixteen subjects had smoked for >20 pack-years. One subject had smoked for less than 10 years, and pack-years were unknown for one smoker.
The distribution of adjacent histologically normal non-tumor lung (AdjNTL),
AAH, AIS and adenocarcinoma tissue samples derived from 63 subjects is
described in Tables 2.3 and 2.4. AAH lesions are generally quite difficult to find.
In order to identify such lesions with more frequency, the Department of
Pathology, Aberdeen Royal Infirmary, prospectively and specifically examines all
surgical lung resection specimens received for AAH and AIS lesions as well as
the index lesion that requires surgical resection. All specimens are inflated per-
bronchially with 10% neutral buffered formalin and cut into 1 cm thick parasagittal
sections after a 24-hr fixation. All visible lesions of > 1 mm diameter are sampled.
In addition, up to 6 random parenchymal tissue blocks are taken from the lung
surrounding but separate from the main lesion, which is most often a primary
carcinoma. A minority of lesions is visible to the trained naked eye on such gross
examination of the lung slices; most AAH lesions are only detected at
microscopy. This approach has provided a high yield of both AAH and AIS
lesions over many years (Chapman and Kerr 2000).
As additional controls we also analyzed 30 independent samples of
histologically verified cancer-free lung (MetNTL) from non-lung cancer patients
who had been operated on for a single pulmonary metastasis from a different organ
site (usually colorectal cancer). In total, 249 formalin-fixed paraffin-embedded
tissues from 93 subjects were included in the statistical analysis.
Table 2.3. Distribution of AdjNTL, AAH, AIS and adenocarcinoma samples among 63 subjects.
Lesion Type | AdjNTL | AAH | AIS | Adenocarcinoma
N=10 [1] | + | + | - | -
N=3 [1] | + | + | + | -
N=19 | + | - | - | +
N=18 | + | + | - | +
N=3 | + | - | + | +
N=10 | + | + | + | +
Number of subjects with samples of each type [2] | 63 | 41 | 16 | 50
[1] 13 subjects lacked an AD sample either because it was no longer available (4) or because the patient had a different type of lung cancer (1 mixed adeno/squamous, 5 large cell carcinomas, 2 squamous cell cancers and 1 carcinoid).
[2] Total number of subjects from which AdjNTL, AAH, AIS or adenocarcinoma was studied: 63. MetNTL was obtained from an additional 30 subjects.
Table 2.4. Distribution of multiple lesions among subjects.
Number of each type of lesion obtained from a single subject | Subjects with AAH | Subjects with AIS | Subjects with AD
1 | 23 | 11 | 48
2 | 8 | 2 | 2
3 | 7 | 0 | 0
4 | 2 | 1 | 0
5 | 1 | 1 | 0
6 | 0 | 0 | 0
7 | 0 | 1 | 0
Total subjects | 41 | 16 | 50
Total number of lesions [1] | 73 | 31 | 52
[1] In addition, a single AdjNTL was obtained from each of 63 subjects and a single MetNTL sample was obtained from each of 30 subjects.
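As a quick arithmetic check, the lesion totals in Table 2.4 and the 249 tissues reported in the text can be re-derived from the per-subject counts. The sketch below only re-tallies the numbers given in the table and its footnote; the variable names are illustrative.

```python
# Number of lesions of one type obtained from a single subject (table rows).
lesions_per_subject = [1, 2, 3, 4, 5, 6, 7]

# Number of subjects contributing that many lesions, per lesion type (table columns).
subjects_by_type = {
    "AAH": [23, 8, 7, 2, 1, 0, 0],
    "AIS": [11, 2, 0, 1, 1, 0, 1],
    "AD": [48, 2, 0, 0, 0, 0, 0],
}

# Total lesions per type = sum over rows of (lesions per subject x subjects).
lesion_totals = {
    lesion_type: sum(n * s for n, s in zip(lesions_per_subject, counts))
    for lesion_type, counts in subjects_by_type.items()
}

# One AdjNTL sample from each of 63 subjects and one MetNTL from each of 30.
total_samples = sum(lesion_totals.values()) + 63 + 30

print(lesion_totals, total_samples)
```

The tallies agree with the table totals (73 AAH, 31 AIS, 52 adenocarcinoma) and with the 249 tissues included in the statistical analysis.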
DNA extraction and bisulfite treatment
Each section was hematoxylin stained and evaluated by an experienced
pathologist (Keith M. Kerr), who carefully marked the lesions to be retrieved.
Slides were manually microdissected under the microscope and DNA was
extracted by proteinase K digestion. Microdissected cells were incubated
overnight at 50°C in a buffer containing 100 mM Tris-HCl (pH 8.0), 10 mM EDTA
(pH 8.0), 1 mg/mL proteinase K, and 0.05 mg/mL tRNA. Extracted DNA was
bisulfite converted using the Zymo EZ DNA Methylation kit (Zymo Research, Orange,
CA) with a modification to the protocol in which samples were cycled at 90°C for
30 seconds and then 50°C for one hour, for up to 16 hours total. Bisulfite-treated
DNA was subjected to quality control tests for DNA amount and bisulfite
conversion (Campan et al. 2009). DNA levels were determined by a bisulfite
conversion-independent ALU reaction, consisting of a primer/probe set lacking
CpGs (Campan et al. 2009). This ALU probe (ALU-C4) is distinct from the ALU-
M2 probe used to interrogate DNA hypomethylation. A conservative cutoff was
set at Ct (threshold cycle) ≤ 22 after extensive analyses comparing data with a
cutoff of ALU Ct [...] with that of Ct [...] showed no statistically significant
difference in percentage methylated reference (PMR) values (see below)
between the two (data not shown). In addition, a previous study demonstrated
that samples with Ct values [...] still yielded reliable results (Poynter et al. 2008).
Four independent AAH samples with ALU Ct values >22 were thus excluded.
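The input-DNA quality filter described above can be sketched as a simple threshold on the ALU-C4 Ct value. The sketch assumes, as the text states, that samples with ALU Ct values >22 are excluded. The sample names and Ct values are hypothetical and serve only to illustrate the rule.

```python
ALU_CT_CUTOFF = 22.0  # samples with an ALU-C4 Ct above this value are excluded


def passes_input_qc(alu_c4_ct, cutoff=ALU_CT_CUTOFF):
    """Return True if the sample has enough amplifiable DNA (Ct at or below cutoff)."""
    return alu_c4_ct <= cutoff


# Hypothetical ALU-C4 Ct values; a higher Ct means less input DNA.
alu_ct_by_sample = {"AAH_01": 19.4, "AAH_02": 23.1, "AIS_01": 22.0}

retained = sorted(
    name for name, ct in alu_ct_by_sample.items() if passes_input_qc(ct)
)
print(retained)
```

Because Ct rises as template quantity falls, an upper bound on Ct acts as a lower bound on the amount of bisulfite-converted DNA available for the MethyLight reactions.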
DNA methylation analysis
Bisulfite-treated DNA was analyzed by MethyLight as previously described
(Weisenberger et al. 2006). Primer and probe sequences are listed in Table 2.5.
Locus 2C35 was identified by restriction landmark genomic scanning as highly
methylated in non-small cell lung cancer (Dai et al. 2001) as well as other types
of cancers (Costello et al. 2000). While it overlaps with an expressed sequence
tag, there is no confirmed gene associated with this CpG island. The CDKN2A
ex2 primer/probe set detects highly significant hypermethylation in a CpG island
in exon 2 of CDKN2A in lung adenocarcinoma vs. adjacent non-tumor lung,
showing more prominent differences than more upstream probes near the
promoter CpG island (Tsou et al. 2007). The OPCML primer/probe set also
targets the CpG island of its adjacent gene and closely related family member
HNT (Tsou et al. 2007).
The mean of the two values of the SAT2 and ALU probes (SAT2-M1 and
ALU-M2) was used as an indicator for global DNA methylation levels
(Weisenberger et al. 2005). A distinct ALU-C4 probe that hybridizes to a
methylation-independent (CpG-less) region of ALU repeats was used for input
DNA normalization (Campan et al. 2009). Genomic DNA which was exhaustively
enzymatically methylated by three consecutive M.SssI treatments was used as a
reference sample to generate standard curves.
Table 2.5. Primers and probe sequences (1).

Reaction ID | Forward Primer Sequence | Reverse Primer Sequence | Probe Oligo Sequence
2C35-M1 | TCGTTATTTAGGCGGTCGTTGT | ATCAACCCCATTCTTACGCTTC | CAAAACCCGCGACGCAACGAAA
CDH13-M1 | AATTTCGTTCGTTTTGTGCGT | CTACCCGTACCGAACGATCC | AACGCAAAACGCGCCCGACA
CDKN2A-M3 (exon 2) | GCGTTCGAGTGGCGGA | CTCCCGAACAACGTCGTACAC | CAATTAAACTCCGCGCCGTAAAACAACAA
CDX2-M1 | GGTAATCGTCGTAGTTCGGGTATT | ACTCCGTACGCCACTCTAACG | CAACCTAACGCCGCAAAACTTCGTCA
EYA4-M3 | TGGATAGGATGGAAGTTTTGCG | AACTACCGACAACGCGACG | CGCTCCGACCGTTCCCGACTT
HOXA1-M1 | GTTGTTGCGGCGATTGTAAA | CGCGCAAAACGCAACTT | TACTCTTCTTCGCTCCAACACTCCAAATCG
HOXA11-M1 | TTTTGTTTTCGATTTTAGTCGGAAT | TAATCAAATCACCGTACAAATCGAAC | ACCACCAAACAAACACATCCACGACTTCA
NEUROD1-M1 | GTTTTTTGCGTGGGCGAAT | CCGCGCTTAACATCACTAACTAAA | CGCGCGACCACGACACGAAA
NEUROD2-M1 | GGTTTGGTATAGAGGTTGGTATTTCGT | ACGAACGCCGACGTCTTC | CGCCATACGAACCGCGAAACGAATATAA
OPCML-M1 (2) | CGTTTCGAGGCGGTATCG | CGAACCGCCGAAATTATCAT | AACAACTCCATCCCTAACCGCCACTTTCT
PTPRN2-M1 | CGTTTTAATAGTTTCGGGTTTAGTTATAAGT | AACTACGCTTTCTCAACGCCTC | TAAAACGACCGCGTACTCGCCAAAAAA
RASSF1A-M1 | ATTGAGTTGCGGGAGTTGGT | ACACGCTCCAACCGAATACG | CCCTTCCCAACGCGCCCA
SFRP1-M1 | GAATTCGTTCGCGAGGGA | AAACGAACCGCACTCGTTACC | CCGTCACCGACGCGAAAACCAAT
TMEFF2-M1 | CGACGAGGAGGTGTAAGGATG | CAACGCCTAACGAACGAACC | TATAACTTCCGCGACCGCCTCCTCCT
TWIST1-M1 | GTAGCGCGGCGAACGT | AAACGCAACGAATCATAACCAAC | CCAACGCACCCAATCGCTAAACGA
ALU-C4 | GGTTAGGTATAGTGGTTTATATTTGTAATTTTAGTA | ATTAACTAAACTAATCTTAAACTCCTAACCTCA | CCTACCTTAACCTCCC
ALU-M2 | GCGCGGTGGTTTACGTTT | AACCGAACTAATCTCGAACTCCTAAC | AAATAATCCGCCCGCCTCGACCT
SAT2-M1 | TCGAATGGAATTAATATTTAACGGAAAA | CCATTCGAATCCATTCGATAATTCT | CGATTCCATTCGATAATTCCGTTT

(1) All DNA sequences are written in the 5’ to 3’ direction.
(2) The OPCML-M1 MethyLight amplicon also targets the adjacent HNT CpG island.
MethyLight data are represented as the percentage methylated reference
(PMR), which is defined as the GENE:ALU-C4 ratio of a sample divided by the
GENE:ALU-C4 ratio of M.SssI-treated reference DNA, multiplied by 100 (Campan et al. 2009).
Although rare, PMR values of more than 100 can be observed,
indicating that the reference DNA may not be fully methylated at the assayed
site. The same batch of reference DNA was used throughout the study to avoid
bias.
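
The PMR definition above can be illustrated with a minimal sketch. This is not the code used in the study; the function name and the input quantities (assumed to be already read off the M.SssI standard curves) are hypothetical.

```python
def pmr(gene_sample, alu_sample, gene_reference, alu_reference):
    """Percentage methylated reference (PMR) for one locus in one sample.

    All four inputs are assumed to be quantities derived from standard
    curves: the methylation-specific reaction (GENE) and the ALU-C4
    input-control reaction, each measured in the sample and in the
    M.SssI-treated reference DNA.
    """
    sample_ratio = gene_sample / alu_sample
    reference_ratio = gene_reference / alu_reference
    return 100.0 * sample_ratio / reference_ratio


# Hypothetical quantities: the sample carries one tenth of the
# methylated signal per unit of input DNA seen in the reference.
example = pmr(gene_sample=5.0, alu_sample=100.0,
              gene_reference=50.0, alu_reference=100.0)
print(round(example, 2))
```

Because the reference ratio sits in the denominator, a reference that is not fully methylated at the assayed site yields PMR values above 100, as noted above.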
Statistical analyses
We included a total of 249 tissue samples from 93 subjects in the analysis. Our
primary goal was to compare the DNA methylation values between groups of
different lesion types. Since we had multiple lesions of the same type from the
same individual, we used generalized estimating equations, or GEE (Zeger and
Liang 1992) for this statistical analysis. GEE is a regression approach that
allows us to use all multiple lesions from individuals, while properly accounting
for the possible within-individual correlation in DNA methylation values. We first
verified that the data satisfied the assumption that the average DNA methylation
value was the same in lesions from patients with one lesion compared to patients
with multiple lesions of the same type (data not shown). For each marker, two
groups were then compared by regressing the rank of the PMR values on an
indicator variable for group membership. The rank transformation was used to
address skewness in the PMR value when testing for differences in group
means, ranking all 249 samples before proceeding with the pair-wise group
comparisons. Hypothesis testing used robust variance estimates under an
independence working correlation structure. All testing was performed at the 5%
significance level.
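
The comparison described above can be sketched in plain Python. This is an illustration under stated assumptions, not the analysis code used in the study: for a Gaussian model with identity link and an independence working correlation, the GEE estimate for a single 0/1 group indicator equals the difference in mean rank, and the robust variance is the subject-clustered sandwich estimate, computed here without any small-sample correction. The function names and the toy data are hypothetical, and the toy values are ranked among themselves rather than among all 249 samples.

```python
import math
from collections import defaultdict


def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def compare_groups(ranks, group, subject):
    """Regress ranks on a 0/1 group indicator, clustering on subject.

    Returns (difference in mean rank, robust variance, two-sided p-value).
    """
    n1 = sum(group)
    n0 = len(group) - n1
    mean0 = sum(r for r, g in zip(ranks, group) if g == 0) / n0
    mean1 = sum(r for r, g in zip(ranks, group) if g == 1) / n1
    # Sum the residuals within each subject, separately for each group.
    sums = defaultdict(lambda: [0.0, 0.0])
    for r, g, s in zip(ranks, group, subject):
        sums[s][g] += r - (mean1 if g == 1 else mean0)
    # Sandwich variance of (mean1 - mean0) with subjects as clusters.
    variance = sum((s1 / n1 - s0 / n0) ** 2 for s0, s1 in sums.values())
    z = (mean1 - mean0) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return mean1 - mean0, variance, p_value


# Toy data: six lesions from four subjects, two lesion types (0 and 1).
pmr_values = [0.1, 0.2, 0.2, 5.0, 7.0, 9.0]
lesion_type = [0, 0, 0, 1, 1, 1]
subject_id = ["s1", "s1", "s2", "s3", "s3", "s4"]
print(compare_groups(midranks(pmr_values), lesion_type, subject_id))
```

Summing residuals within a subject before squaring is what lets several lesions from one individual contribute without being treated as independent observations.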
In order to identify when the markers first show a difference in average
DNA methylation value, i.e. in which lesion type, AAH, AIS or adenocarcinoma,
we performed a series of univariate tests, comparing DNA methylation values
between pairs of histologies: AdjNTL vs. AAH, AAH vs. AIS, and AIS vs.
adenocarcinoma. To account for conducting three tests for each marker
(multiple testing), we applied a Bonferroni correction to determine statistical
significance, requiring a cutoff of p<0.017 (=0.05/3 tests) for statistical
significance. Markers were then classified into the categories “early”,
“intermediate”, or “late”, depending on the pairwise comparison that yielded the
first increase in average DNA methylation value that both achieved statistical
significance, and showed a group median of >1 on the raw PMR scale. The
PMR scale runs from 0 to 100 (100% methylated compared to enzymatically fully
methylated DNA); a >1 PMR cut-off was chosen to minimize undue emphasis on
very low levels of methylation that are not likely to be biologically significant.
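
The classification rule can be written out as a short sketch. It is a simplified reading of the criteria above, not the procedure as implemented in the study: the direction of change is judged from the group medians, values reported as "<0.01" are entered as 0.0, and the AIS medians are taken from the column headed BAC in Table 2.6 on the assumption that both labels refer to the same lesions. The function and variable names are hypothetical.

```python
BONFERRONI_CUTOFF = 0.05 / 3   # three sequential comparisons per marker
MIN_MEDIAN_PMR = 1.0           # group median must exceed 1 on the raw PMR scale

# (category, earlier tissue, later tissue), in order of progression.
STAGES = [
    ("early", "AdjNTL", "AAH"),
    ("intermediate", "AAH", "AIS"),
    ("late", "AIS", "AD"),
]


def classify(medians, p_values):
    """Return the category of the first comparison meeting all criteria.

    medians maps tissue type to median PMR; p_values lists the three
    GEE p-values in the order of STAGES. Returns None if no comparison
    qualifies.
    """
    for (category, earlier, later), p in zip(STAGES, p_values):
        increased = medians[later] > medians[earlier]
        if p < BONFERRONI_CUTOFF and increased and medians[later] > MIN_MEDIAN_PMR:
            return category
    return None


# Rows taken from Table 2.6 (medians, then the three sequential p-values).
EXAMPLES = {
    "CDKN2A ex2": ({"AdjNTL": 5, "AAH": 11, "AIS": 19, "AD": 21}, (2.6e-11, 0.045, 0.9)),
    "NEUROD1": ({"AdjNTL": 0.17, "AAH": 0.6, "AIS": 3.9, "AD": 13}, (0.014, 0.011, 3.1e-3)),
    "SFRP1": ({"AdjNTL": 0.44, "AAH": 0.15, "AIS": 0.29, "AD": 8}, (0.071, 0.52, 6.9e-12)),
    "RASSF1": ({"AdjNTL": 0.12, "AAH": 0.0, "AIS": 0.23, "AD": 8.9}, (0.009, 9.0e-3, 0.026)),
}

for locus, (medians, p_values) in EXAMPLES.items():
    print(locus, classify(medians, p_values))
```

Under this rule NEUROD1 is not "early" even though its AdjNTL vs. AAH p-value is below the cutoff, because the AAH median stays below 1, and RASSF1 receives no category from the criteria alone.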
Following this analysis, we investigated the potential for a “field defect” in
the lung by comparing methylation values in AdjNTL with MetNTL. As none of the
15 hypermethylation markers or the hypomethylation measure had been
compared previously between these two tissue types, we controlled for multiple
testing by requiring a p-value cutoff of p<0.0031 (=0.05/16 tests) to declare
statistical significance.
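
The Bonferroni cutoffs quoted in this section follow from dividing the 5% significance level by the number of tests. A minimal sketch, with hypothetical function names and p-values taken from the MetNTL vs. AdjNTL column of Table 2.6:

```python
def bonferroni_cutoff(alpha, n_tests):
    """Per-test significance cutoff under a Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True if p_value falls below the Bonferroni-corrected cutoff."""
    return p_value < bonferroni_cutoff(alpha, n_tests)


# Cutoffs used in this chapter (unrounded).
print(bonferroni_cutoff(0.05, 3))    # three sequential stage comparisons
print(bonferroni_cutoff(0.05, 16))   # 15 loci plus the repeat-based measure
print(bonferroni_cutoff(0.05, 15))   # 15 loci in the AAH cluster comparison

# MetNTL vs. AdjNTL examples from Table 2.6.
print(is_significant(7.7e-05, 0.05, 16))  # PTPRN2
print(is_significant(0.0056, 0.05, 16))   # NEUROD2
```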
We also performed a cluster analysis to see if we could identify any
subgroups within AAH lesions. Using the 15 hypermethylation loci, we applied
partitioning around medoids (PAM) (Kaufman 1990) using silhouette width to
select the number of clusters. For all markers, DNA methylation values were
compared between the two identified clusters using GEE and a Bonferroni cutoff
of p<0.0033 (15 tests). The same methods were applied to compare AAH lesions
based on histologic grade: high grade (HG) and low grade (LG).
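
For readers unfamiliar with PAM, the following sketch shows the general idea of partitioning around medoids with the number of clusters chosen by average silhouette width. It is illustrative only and makes several assumptions not stated in the text: Manhattan distance, a greedy BUILD step followed by a first-improvement SWAP step (the classical algorithm takes the best swap at each iteration), and toy two-dimensional points in place of the 15-locus PMR profiles. All names are hypothetical.

```python
def dist(a, b):
    """Manhattan distance between two profiles (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k):
    """Return (medoid indices, cluster label per point)."""
    medoids = []
    # BUILD: greedily add the point that lowers the total cost the most.
    while len(medoids) < k:
        candidates = [i for i in range(len(points)) if i not in medoids]
        medoids.append(min(candidates, key=lambda i: total_cost(points, medoids + [i])))
    # SWAP: accept any medoid/non-medoid swap that lowers the total cost.
    improved = True
    while improved:
        improved = False
        current = total_cost(points, medoids)
        for position in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                trial = medoids[:position] + [candidate] + medoids[position + 1:]
                cost = total_cost(points, trial)
                if cost < current:
                    medoids, current, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist(p, points[medoids[j]])) for p in points]
    return medoids, labels


def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    widths = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            widths.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(sum(dist(points[i], points[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != label)
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)


def choose_k(points, candidates=(2, 3)):
    """Pick the number of clusters with the largest average silhouette."""
    scores = {k: mean_silhouette(points, pam(points, k)[1]) for k in candidates}
    return max(scores, key=scores.get), scores


toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(choose_k(toy))
```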
To examine the potential effects of clinical variables on the analysis, we
used ANOVA (for age and pack-years) and the Chi-square test (for gender and
known smoking status) to examine whether these variables differed significantly
between sample types. Statistical analyses were performed using Stata version
10 and R 2.10.0.
Results
DNA methylation levels across the AdjNTL-AAH-AIS-adenocarcinoma spectrum
We used a comprehensive collection of tissues encompassing adjacent non-
tumor lung (AdjNTL), the putative adenocarcinoma precursor lesions AAH and
AIS, as well as synchronous adenocarcinoma (Tables 2.3 and 2.4). Careful
microdissection and the sensitive MethyLight technology allowed us to
successfully quantitate DNA methylation levels in even very small AAH lesions
without pre-amplification.
Since AdjNTL from lung cancer patients might show DNA methylation “field
defects” and general molecular changes arising from environmental exposures
such as tobacco smoke (Guo et al. 2004), we wanted to include lung tissue from
non-lung cancer patients. Normal healthy peripheral lung tissue is very difficult to
obtain and most resected non-cancer lung tissue is derived from patients afflicted
by other lung diseases which may have their distinct DNA methylation
signatures. We reasoned that a good source of non-tumor lung tissue would be
adjacent lung tissue from resections of subjects with single pulmonary
metastases from non-lung primary cancers. Thus, we included adjacent non-
tumor lung (MetNTL) from 30 such subjects in the study. Our sample collection
also included cases in which multiple AAH and AIS lesions were obtained from a
single subject (Table 2.4), which allowed evaluation of the spectrum of DNA
methylation changes within individuals. Each of the AAH and AIS specimens was
pathologically confirmed to be an isolated lesion separate from any other lesions
in the same lung.
We had previously found all 15 CpG islands to be highly significantly
methylated in lung adenocarcinoma compared to AdjNTL (tissues derived from
lung cancer patients from the Los Angeles area, the East Coast of the United
States, and Ontario, Canada; Tsou et al. 2007 and unpublished data). Here, we
confirmed these findings, observing highly significant DNA hypermethylation in
adenocarcinoma vs. AdjNTL for all 15 hypermethylation loci (all p<1x10^-5, Table
2.6) in samples originating from the United Kingdom. This indicates that lung
adenocarcinoma samples from a variety of geographic areas can exhibit similar
hypermethylation profiles.
Having thus confirmed hypermethylation of the 15 loci in the invasive
adenocarcinoma samples of the present sample collection, we next determined
whether these DNA methylation changes are present in the presumptive
precursor stages of the disease, AAH and AIS (Figure 2.1 and Table 2.6).
Markers were classified into the categories “early”, “intermediate”, or “late”,
depending on the pairwise comparison that yielded the first increase in average
DNA methylation value that both achieved statistical significance, and showed a
group median of >1 on the raw PMR scale (Table 2.6). The PMR scale runs from 0
to 100 (100% methylated compared to enzymatically fully methylated DNA); a >1
PMR cut-off was chosen to minimize undue emphasis on very low levels of
methylation that are not likely to be biologically significant.
Table 2.6. Median PMRs and pair-wise comparison p-values between each tissue type.

Columns 2 to 6 give median PMRs; columns 7 to 11 give GEE p-values for pair-wise comparisons of tissue types.

Locus | MetNTL (n=30) | AdjNTL (n=63) | AAH (n=73) | BAC (n=31) | AD (n=52) | MetNTL vs. AdjNTL | AdjNTL vs. AAH | AAH vs. BAC | BAC vs. AD | AdjNTL vs. AD | Designation
BH p-value threshold | | | | | | 0.0031 | 0.017 | 0.017 | 0.017 | 0.05 |
CDKN2A EX2 | 4.5 | 5 | 11 | 19 | 21 | 0.42 | 2.60E-11 | 0.045 | 0.9 | <2.1E-14 | Early
PTPRN2 | 3.7 | 1.1 | 2.8 | 8.8 | 19 | 7.70E-05 | 2.10E-03 | 0.023 | 5.50E-03 | <2.1E-14 | Early
2C35 | 0.53 | 0.6 | 1.1 | 11 | 24 | 0.54 | 0.18 | 7.20E-04 | 0.032 | <2.1E-14 | Intermediate
EYA4 | 2 | 1.6 | 0.41 | 3.1 | 16 | 0.94 | 0.092 | 7.00E-04 | 1.50E-04 | 1.30E-11 | Intermediate
HOXA1 | <0.01 | 0.015 | 0.12 | 4.6 | 21 | 0.15 | 0.16 | 8.70E-05 | 0.037 | <2.1E-14 | Intermediate
HOXA11 | 1.5 | 0.92 | 1.3 | 7.8 | 19 | 0.014 | 0.23 | 6.70E-08 | 1.40E-04 | <2.1E-14 | Intermediate
NEUROD1 | 0.29 | 0.17 | 0.6 | 3.9 | 13 | 0.18 | 0.014 | 0.011 | 3.10E-03 | <2.1E-14 | Intermediate
NEUROD2 | 0.78 | 1.3 | 1.2 | 4 | 12 | 0.0056 | 0.71 | 0.016 | 7.10E-03 | <2.1E-14 | Intermediate
TMEFF2 | 6.1 | 4.9 | 6.1 | 19 | 18 | 0.089 | 0.19 | 1.90E-09 | 0.23 | 8.90E-07 | Intermediate
CDH13 | <0.01 | <0.01 | 0 | <0.01 | 1.3 | 0.89 | 0.039 | 4.80E-03 | 5.50E-10 | 2.10E-14 | Late
CDX2 | 1.3 | 0.46 | 0.53 | 1.6 | 10 | 0.17 | 0.94 | 0.061 | 6.10E-03 | <2.1E-14 | Late
OPCML/HNT | 0.32 | 0.028 | 0 | 0.29 | 5.3 | 0.097 | 0.067 | 1.50E-03 | 3.00E-04 | <2.1E-14 | Late
RASSF1 | 0.44 | 0.12 | <0.01 | 0.23 | 8.9 | 8.00E-04 | 0.009 | 9.00E-03 | 0.026 | 8.50E-06 | Late
SFRP1 | 0.29 | 0.44 | 0.15 | 0.29 | 8 | 0.27 | 0.071 | 0.52 | 6.90E-12 | 2.90E-14 | Late
TWIST1 | 0.01 | <0.01 | 0 | 0.12 | 16 | 0.4 | 4.40E-03 | 4.20E-03 | 9.90E-03 | <2.1E-14 | Late
Mean Repeats | 80 | 71 | 72 | 75 | 46 | 0.076 | 0.12 | 0.99 | 1.00E-09 | 2.40E-09 | Late
Figure 2.1. Heatmap of DNA methylation levels of 15 loci and repeats in all tissue types. Loci are arranged in
alphabetical order. Dark blue indicates very low levels of DNA methylation, yellow indicates high levels of DNA
methylation, and missing values are indicated in white. The type of lesion is indicated at the top.
According to these criteria, CDKN2A ex2 and PTPRN2 were designated as
"early" loci, with statistically significantly higher DNA methylation in AAH than in
AdjNTL, as seen in Figure 2.2. PTPRN2 showed a further significant increase in
DNA methylation in adenocarcinoma vs. AIS (the increase from AAH to AIS did
not meet our multiple comparisons threshold).
Seven loci (2C35, EYA4, HOXA1, HOXA11, NEUROD1, NEUROD2 and
TMEFF2) were designated as "intermediate", or characteristic for AIS (Figure
2.3). Significant DNA hypermethylation of these loci was observed in AIS
compared to AAH, and for four of these loci, DNA methylation levels further
increased significantly in adenocarcinoma compared to AIS.
Five remaining loci, CDH13, CDX2, OPCML, SFRP1 and TWIST1, were
designated as "late" loci, since significantly elevated DNA hypermethylation was
only detected in adenocarcinoma, as compared with AIS (Figure 2.4). RASSF1
hypermethylation approached significance but did not meet our multiple
comparisons cut-off in the AIS to adenocarcinoma comparison. However,
RASSF1 was highly significantly hypermethylated in adenocarcinoma vs.
AdjNTL, and the scatterplot supports the notion that RASSF1 hypermethylation is
a late event (Figure 2.4). Significant DNA methylation of these six loci would
therefore appear to be associated with invasive lung adenocarcinoma.
Figure 2.2. “Early” DNA methylation changes: scatterplots of loci
significantly hypermethylated in AAH lesions compared to AdjNTL.
p-values were calculated by GEE, with a Bonferroni cutoff of p<0.017 (see
Materials and Methods). Statistically significant differences are marked with an
asterisk. Interquartile ranges are marked with red bars.
Figure 2.3. “Intermediate” DNA methylation changes: scatterplots of loci
significantly hypermethylated in AIS lesions compared to AAH.
p-values were calculated by GEE, with a Bonferroni cutoff of p<0.017 (see
Materials and Methods). Statistically significant differences are marked with an
asterisk. Interquartile ranges are marked with red bars.
Figure 2.4. “Late” DNA methylation changes: scatterplots of loci
significantly hypermethylated in adenocarcinoma compared to AIS.
p-values were calculated by GEE, with a Bonferroni cutoff of p<0.017 (see
Materials and Methods). Statistically significant differences are marked with an
asterisk. Interquartile ranges are marked with red bars. RASSF1 was included in
the figure because hypermethylation is clearly present and increased in
adenocarcinoma, although the AIS vs. adenocarcinoma comparison did not
reach statistical significance (p=0.026, see Table 2.6).
Figure 2.5. Global DNA methylation levels in AAH, AIS, and lung
adenocarcinoma. The average of the ALU-M2 and SAT2-M1 probes was used as
an indicator of global DNA methylation. p-values were calculated by GEE, with a
Bonferroni cutoff of p<0.017 (see Materials and Methods). Statistically significant
differences are marked with an asterisk. Interquartile ranges are marked with red
bars.
Examination of the mean of the two repeat probes as an indicator of global
DNA hypomethylation showed highly significant hypomethylation only in the AIS
to adenocarcinoma comparison (Figure 2.5) suggesting that global DNA
hypomethylation may be a late event in lung adenocarcinoma development.
Baseline DNA methylation levels in AdjNTL were in general low; however,
modest methylation was observed for several of the 15 DNA hypermethylation
markers (Table 2.6). To determine whether these potentially elevated DNA
methylation levels could be an indication of a “field defect”, we compared DNA
methylation levels of the 15 hypermethylation probes and the hypomethylation
measure in AdjNTL vs. MetNTL. Only one hypermethylation locus met the
criterion for a statistically significant difference in methylation between the two
tissue types: PTPRN2 (Table 2.6). DNA methylation levels for PTPRN2 were
lower in AdjNTL compared to MetNTL (median PMR of 1 vs. 4), not higher. This
difference is difficult to discern from Figure 2.2 due to low variation in PMR
values (interquartile ranges of 0.4-3.3 for AdjNTL and 2.0-5.5 for MetNTL) and
the scale on the vertical axis. PTPRN2 showed increased DNA
methylation at every other stage comparison, from AdjNTL to AAH, from AAH to
AIS and from AIS to AD, although the AAH to AIS increase (p=0.023) did not meet
our multiple comparisons threshold. Thus, we did not find elevated DNA methylation in
AdjNTL compared to MetNTL, nor did we observe any significant difference in
global hypomethylation (Table 2.6, bottom row).
With the limited smoking information we had, we examined whether
smoking status (current or past) or pack-years of smoking were associated with
DNA methylation levels seen in AdjNTL, and might explain the variability seen in
baseline DNA methylation levels. We observed no significant differences (data
not shown).
Analysis of DNA methylation in preneoplastic lesions
It has been proposed that not all AAH lesions progress to cancer. If true, some
AAH could show molecular changes indicative of their propensity to progress. In
order to assess the existence of any sub-groups of preneoplastic lesions differing
in DNA methylation profiles, we examined the relationship of the samples and the
15 DNA hypermethylation probes using partitioning around medoids (PAM). We
observed two distinct clusters of 68 and 5 samples. In the latter group, the five
AAH lesions from four individuals had statistically significantly higher DNA
methylation levels for 2C35, CDKN2A ex2, CDX2, HOXA1, NEUROD1, TMEFF2
and TWIST1 than the remaining 68 samples (all p<0.003).
AAH lesions are sometimes divided into high grade (HG) and low grade
(LG) based on histology. However, this distinction can be rather subjective. The
grade determination did not correlate with our delineation of the two AAH
clusters. We compared PMR values from AAH lesions histologically denoted as
high-grade (HG, n=11) to low-grade (LG, n=48) lesions and found no statistically
significant differential methylation between the two histologies after multiple
comparison correction (Table 2.7).
Table 2.7. Comparison between high-grade and low-grade AAH lesions.

Locus | Median PMR (1), LG (n=48) | Median PMR (1), HG (n=11) | Median PMR (1), Undesignated (n=14) | p-value (2)
2C35 | 0.9 | 2.8 | 2.6 | 0.04
CDH13 | 0 | 0 | 0 | 0.48
CDKN2A ex2 | 10.2 | 13.9 | 9.2 | 0.56
CDX2 | 0.3 | 1.8 | 2.2 | 0.17
EYA4 | 0.3 | 0.2 | 2.1 | 0.91
HOXA1 | 0 | 0.5 | 0.2 | 0.16
HOXA11 | 1.4 | 3 | 0.9 | 0.21
NEUROD1 | 0.4 | 0.7 | 1.1 | 0.47
NEUROD2 | 1.6 | 0.6 | 1.7 | 0.01
OPCML/HNT | 0 | 0 | 0 | 0.8
PTPRN2 | 3.2 | 2.7 | 1.9 | 0.58
RASSF1 | 0 | 0 | 0 | 0.22
SFRP1 | 0.1 | 0.1 | 0.2 | 0.81
TMEFF2 | 4.3 | 7 | 7.4 | 0.31
TWIST1 | 0 | 0 | 0 | 0.12

(1) PMR = Percentage Methylated Reference
(2) p-values are from GEE analysis of low grade versus high grade on ranked PMR values. No
statistically significant p-values were seen (cutoff p<0.0033 after Bonferroni correction of 15).
Discussion
Our observations that distinct loci show DNA hypermethylation at different
stages of the putative adenocarcinoma development sequence, and that the
number of methylated loci and DNA methylation levels are generally higher in
each progressive stage, support a model in which AAH and AIS are precursor
stages of at least a subset of lung adenocarcinomas. The data indicate that
distinct epigenetic events occur with the transition to hyperplasia, carcinoma in
situ and finally invasive cancer (summarized in Figure 2.6) and imply a model
similar to that for the development of colorectal and breast cancers (Dong et al.
2005; Kim et al. 2006; Muggerud et al. 2010; Park et al. 2011b). Our quantitative
observations build on previous reports of increased DNA methylation frequency
in AAH compared to adjacent non-tumor tissue (Licchesi et al. 2008a; Licchesi et
al. 2008b) and suggest that this trend continues in the AIS to adenocarcinoma
continuum.
A longitudinal study, in which lesions are studied over time in the same
individual, would be the best way to study the natural history of cancer, but this is
very difficult to do for peripheral lung cancer given the small size and
inaccessibility of preinvasive lesions. Since our study was cross-sectional,
comparing individual lesions from a collection of patients, any temporal
interpretations should be treated with caution; the results could be affected by
confounding factors such as age, gender and smoking history.
Figure 2.6. Summary of DNA methylation changes in AAH, AIS and lung
adenocarcinoma. The putative sequence of DNA hypermethylation events is
indicated by the color shading and position of locus names. Dark shading
indicates hypermethylation. Global DNA hypomethylation is only significantly
altered in the AIS to adenocarcinoma comparison, though it appears to occur
sporadically even in histologically normal tissue.
Examination of gender and age showed no significant differences between
AdjNTL, AAH, AIS and adenocarcinoma groups, nor did we find a relationship
between smoking status (current or former) or pack-years and DNA methylation
levels. However, the number of subjects for which any smoking information was
available was small (n=19).
To further support our findings, we therefore also examined two subsets of
samples from our collection: the samples from the 10 subjects from whom
AdjNTL, AAH, AIS, and adenocarcinoma were all available (top row, Table 2.3),
and the collection of samples obtained from 16 confirmed current or previous
smokers with at least 20 pack-years of smoking. While the two subsets
had a much smaller sample size and therefore had less power than the full
collection, we observed that the DNA hypomethylation measure and the majority
of hypermethylation loci (10/15) showed similar changes in median PMR and
were classified into the same category (early, intermediate or late) in both subsets, either
through statistical significance or trending to statistical significance (data not
shown). This suggests that our observations are robust and not the result of
confounding factors.
Of the 15 loci we studied, the CpG islands of CDKN2A ex2 and PTPRN2
are the only two that we found to be significantly hypermethylated in AAH lesions
compared to adjacent non-tumor lung. Frequent deletions and mutations of
CDKN2A (a negative regulator of cell cycle, also known as p16) in lung cancer
were first observed in 1994 (Hayashi et al. 1994), and hypermethylation and
silencing was subsequently observed to occur in substantial numbers of cancers
carrying an intact gene (Merlo et al. 1995). Inactivation of CDKN2A by DNA
hypermethylation is now thought to be one of the earliest events during lung
cancer development (Belinsky 2005 and references therein) and is observed in
hyperplasia and carcinoma in situ (Belinsky et al. 1998; Nuovo et al. 1999;
Licchesi et al. 2008a). We focused on the exon 2 CpG island of the gene
because our previous study of CDKN2A DNA methylation showed that it was
more highly significantly associated with cancer compared to adjacent non-tumor
lung than the promoter CpG island. However, it should be noted that some
cancer cell line data suggest that CDKN2A exon 2 DNA methylation is not
necessarily associated with gene silencing (Nguyen et al. 2001). The
methyl-binding protein MeCP2 has been shown to associate with methylated CDKN2A
exon 2, but the biological significance of this modification for cancer progression
remains to be clarified (Nguyen et al. 2001). The functional consequences of
CDKN2A exon 2 DNA methylation in tumor samples merit further investigation.
Interestingly, CDKN2A hypermethylation at the promoter CpG island has been
associated with progression of stage I lung cancer (Brock et al. 2008),
suggesting the importance of continued inactivation of this gene during
progression.
Little is known about the function of PTPRN2, a receptor type protein
tyrosine phosphatase (PTP) that is a major autoantigen in insulin-dependent
diabetes mellitus (Kawasaki et al. 1996) and that is also expressed in the
cerebellum and other parts of the nervous system (Takeyama et al. 2009). Since
PTPs dephosphorylate proteins, many of these enzymes are implicated in the
negative regulation of cell growth, differentiation and oncogenic transformation
(Navis et al. 2010). A variety of PTPs have been shown to be mutated in
colorectal cancer (Wang et al. 2004) and PTPRD was identified as mutated and
inactivated in lung adenocarcinoma (Ding et al. 2008). We have found PTPRN2
to be frequently methylated in adenocarcinoma and squamous cell cancer of the
lung in at least two independent sample sets for both histological subtypes
(Anglim et al. 2008b). However, functional studies on the potential role of this
protein in any type of cancer remain to be done.
We observed significant hypermethylation in AIS compared to AAH for
seven loci: 2C35, EYA4, HOXA1, HOXA11, NEUROD1, NEUROD2 and
TMEFF2. 2C35 was identified through restriction landmark genomic scanning to
be hypermethylated in lung cancer (Dai et al. 2001) as well as in primitive
neuroectodermal tumors, gliomas and colon cancer, and these observations
were the basis for the design of the MethyLight probe/primer set and our examination
of this locus in lung cancer. The CpG island does not overlap with a known gene,
although it overlaps with an uncharacterized expressed sequence tag DA773580
(Benson et al. 2004). The nearest identified gene, located 20 kb downstream, is
PTF1A, pancreas specific transcription factor 1a, a helix-loop-helix transcription
factor promoting acinar differentiation in the pancreas and showing loss of
function in pancreatic cancer (Sellick et al. 2004). To date no role of PTF1A in
lung cancer has been reported. Thus, the biological relevance of methylation at
2C35 remains to be investigated. One possibility is that this locus carries an
enhancer that might normally drive the expression of one or more distant genes;
in human H1 embryonic stem cells the region containing 2C35 shows histone 3
lysine 4 mono-methylation, a mark that is associated with enhancers and regions
downstream of transcription start sites (data from the Bernstein lab at the Broad
Institute; Birney et al. 2007).
EYA4, the human homologue for eyes-absent 4 from Drosophila, is a
tyrosine phosphatase that targets histone H2AX and plays a role in recruiting the
DNA repair machinery to DNA. The gene is inactivated by DNA methylation in
Barrett’s esophagus and esophageal adenocarcinoma (Zou et al. 2005).
We found significant DNA hypermethylation of both HOXA1 and HOXA11,
which lie about 90 kilobases apart at opposite ends of the HOXA cluster, in AIS.
HOX genes have been reported to be coordinately hypermethylated and
inactivated in lung cancer, particularly adenocarcinoma (Shiraishi et al. 2002;
Rauch et al. 2007). In breast cancer, HOXA1 was identified as frequently
methylated and, in an analysis similar to ours, was found to be significantly
hypermethylated in atypical ductal hyperplasia (ADH) relative to normal breast,
and ductal carcinoma in situ (DCIS) relative to ADH (Park et al. 2011b).
However, no multiple comparisons correction was applied in the latter study;
using such a correction HOXA1 is only significantly hypermethylated in DCIS vs.
ADH, which is very similar to our finding of significant hypermethylation in AIS
compared to AAH.
TMEFF2, a transmembrane protein with EGF-like and two follistatin-like
domains (also known as hyperplastic polyposis protein (HPP1) and tomoregulin),
was found to be similarly hypermethylated in DCIS in the above breast cancer
study. TMEFF2 had previously been reported to be methylated in lung
adenocarcinoma (Hanabata et al. 2004) and inactivation of DNMT1 in a breast
cancer cell line reactivates methylated TMEFF2 (Suzuki et al. 2004), suggesting
its methylation leads to silencing. We thus identified two loci, HOXA1 and
TMEFF2, whose hypermethylation occurs at an “intermediate” timepoint during cancer development in
the lung as well as the breast. NEUROD1 and NEUROD2 were identified by us
as highly methylated in lung adenocarcinoma compared to AdjNTL. NEUROD1
DNA methylation has been observed in diffuse large B-cell lymphoma (Pike et al.
2008) and breast cancer where it was associated with a ten-fold more likely
response to neoadjuvant therapy in estrogen receptor-negative cancers (Fiegl et
al. 2008). It is intriguing that just like PTPRN2, NEUROD proteins appear to be
involved both in diabetes mellitus and cerebellar development (Naya et al. 1997).
CDH13, CDX2, OPCML, SFRP1 and TWIST1 do not show significant
hypermethylation in AAH or AIS, and instead are only significantly
hypermethylated in invasive adenocarcinoma. Inactivation or hypermethylation of
many of the latter genes has been linked to poor prognosis or metastasis,
agreeing with a potential role in the development of invasive cancer. CDH13 or
heart cadherin, encoding an adhesion molecule, was identified as
hypermethylated in lung cancer in 1998 (Sato et al. 1998), a finding that was
substantiated by many studies (Toyooka et al. 2001a; Ulivi et al. 2006; Tsou et
al. 2007). CDH13 DNA methylation has been found to be associated with stage
IV disease (Kim et al. 2005), poor prognosis (Suzuki et al. 2006), and
tumorigenicity of xenografts in nude mice (Zhong et al. 2001). In a silica-induced
lung cancer animal model, CDH13 methylation was seen in invasive but not
preinvasive lung cancer (Blanco et al. 2007) and in an analysis of stage 1 lung
cancer patients, CDH13 methylation was observed to be associated with
recurrent cancer (Brock et al. 2008). Thus, loss of CDH13 may be linked to the
altered adhesive properties that allow cells to become invasive.
Likewise, OPCML, an opioid receptor and putative tumor suppressor
thought to play a role in adhesion (Maneckjee and Minna 1990), appears to
become methylated late, showing hypermethylation mainly in adenocarcinoma.
Silencing of OPCML has been implicated in metastasis of gastric cancer (Wang
et al. 2006).
SFRP1, encoding secreted frizzled-related protein, a WNT signaling
pathway antagonist, is another “late” locus. SFRP1 was previously examined in
AAH lesions, and was found to be methylated in 11-14% of AAH lesions
(Licchesi et al. 2008b). In our hands, methylation of SFRP1 in AAH was even
less frequent, and we see little methylation in AIS. Like Licchesi et al., we
observe dramatic hypermethylation of SFRP1 in adenocarcinoma (Figure 2.4),
suggesting that the DNA methylation of this gene may be a key change
associated with invasion. Transcriptional silencing of SFRP1 by DNA methylation
and loss of heterozygosity in lung cancer have been documented, supporting a
role for this gene as a tumor suppressor (Fukui et al. 2005; Zhang et al. 2010),
and SFRP1 hypermethylation was found to be associated with lymph node
metastasis and progression (Zhang et al. 2010). The silencing of SFRP1 is
especially of interest since the WNT pathway was recently implicated in lung
adenocarcinoma metastasis (Nguyen et al. 2009).
TWIST1, encoding a helix-loop-helix transcription factor, was identified as
DNA methylated in lung cancer based on a genome-wide screen for genes
reactivated in lung cancer cell lines by 5-aza-2’-deoxycytidine, a DNA
methyltransferase inhibitor (Shames et al. 2006). The locus has also been found
to be highly methylated in metastatic breast cancer (Mehrotra et al. 2004).
Intriguingly, overexpression of TWIST1 has been linked to invasion and
metastasis in hepatocellular carcinoma and oesophageal cancer (Niu et al. 2007;
Yuen et al. 2007). These observations suggest that further studies of TWIST1 to
clarify its role in invasion and metastasis are warranted.
Of the genes we characterized as becoming DNA methylated as AIS
becomes invasive, CDX2 has been least well studied. In colorectal cancer, its
methylation appears to cause silencing and seems to be associated with
advanced stage disease and poor prognosis (Kawai et al. 2005; Baba et al.
2009).
In the study of stage 1 lung cancer patients mentioned above, DNA
methylation of RASSF1, a ras-associated putative tumor suppressor, was also
found to be associated with recurrence (Brock et al. 2008). Numerous groups
have reported RASSF1 methylation in lung cancer (Dammann et al. 2000;
Toyooka et al. 2001b; Pfeifer et al. 2002; Tomizawa et al. 2002), and methylation
of this gene has been associated with poor prognosis (Tomizawa et al. 2002; Kim
et al. 2003) and later stage cancer (Niklinska et al. 2009). The latter observations
would appear to be in agreement with our characterization of RASSF1 DNA
methylation as associated with the transition from in situ cancer to invasive
cancer. While we observed occasional hypermethylation of RASSF1 in both AAH
and AIS, the frequency and DNA methylation levels in these preinvasive lesions
were low. The methylation frequency we observed in the tumors was comparable
to that found by us and others (Dammann et al. 2000; Toyooka et al. 2001b;
Pfeifer et al. 2002; Tomizawa et al. 2002). It is interesting therefore that RASSF1
methylation has been found in the sputum of smokers prior to the detection of
overt lung cancer (Hobbs and Mattick 1993). In another study of RASSF1 DNA
methylation in AAH lesions (Licchesi et al. 2008a), methylation of the locus was
reported in almost 30% of AAH, a frequency that approaches that reported for
tumors. The lower frequency we observed in AAH in our study might be
attributable to our use of a quantitative technique to measure DNA methylation,
and to the fact that our probe/primer set detects methylation of six CpGs in the
amplicon, thus providing a more strict measurement of hypermethylation.
To obtain an indicator for the timing of DNA hypomethylation with respect
to lung adenocarcinoma development, we used the mean of two repeat-based
probes used as measures for global DNA hypomethylation (Weisenberger et al.
2005). To our knowledge, global DNA methylation has not been previously
investigated in the putative preinvasive stages of lung adenocarcinoma. We
observe highly significant hypomethylation only in adenocarcinoma, suggesting
that pervasive global hypomethylation is a later event than hypermethylation. A
recent study of global hypomethylation in stage 1 lung cancer found it to be
significantly associated with stage IB vs IA, larger tumors and less differentiated
morphology (Anisowicz et al. 2008), indicating that it may indeed be a later rather
than earlier event.
As a comparison for the studied (pre)malignant lesions, we examined two
types of histologically normal lung tissue, AdjNTL and MetNTL. We observed no
increased DNA methylation of the 15 loci in AdjNTL compared to MetNTL. This is
especially telling, since the median age of the MetNTL subjects was slightly
younger (Table 2.2), and increased DNA methylation with age has been reported
(Ahuja et al. 1998); if observed, a slightly higher DNA methylation in AdjNTL
could have been attributed to the small age difference. The lack of higher DNA
methylation in AdjNTL strongly suggests that there is no field defect for these
loci, at least when compared to histologically normal lung from patients with a
metastasis to the lung. We did observe significantly higher DNA methylation of
PTPRN2 in MetNTL. One possible explanation is that something is different
about PTPRN2 in the cases from which MetNTL was obtained. We have no basis
for assuming that PTPRN2 values for AdjNTL are not representative and for
some reason were abnormally low.
While we did not find increased DNA methylation in any of our 15 DNA
hypermethylation loci between high-grade (HG) or low-grade (LG) AAH, in an
unsupervised analysis we identified a small group of five AAH lesions from four
patients that showed significantly higher levels of DNA methylation in seven loci:
2C35, CDKN2A ex2, CDX2, HOXA1, NEUROD1, TMEFF2 and TWIST1.
Whether this elevated DNA methylation is somehow related to the propensity to
progress will require further studies, but it is notable that four of these loci are the
ones that were designated “intermediate” for increased methylation in AIS. The
four patients carrying the AAH that were more highly methylated did not
consistently show unusually high levels of DNA methylation in their other lesions,
confirming that lesions found in patients are independent.
The small number of lesions that clusters separately from the group would
be in keeping with a model in which the majority of AAH lesions may never
progress. The five AAH lesions in the small cluster were a mixture of HG and LG
lesions, again indicating no link between hypermethylation and grade designation
in AAH. One could wonder whether the 5 separately clustering AAH samples
were the ones driving the designation of CDKN2A ex2 as an “early”
hypermethylation change, since they exhibited higher levels of methylation of this
locus than other AAH samples. However, when we reanalyzed the data set with
the omission of these five samples, the difference in methylation of CDKN2A ex2
between AdjNTL and AAH was still highly significant (p<0.000001), supporting its
designation as an “early” methylation event occurring as hyperplasia develops in
the peripheral lung.
Of interest was the observation that the patient for whom two AAH lesions
partitioned to the small cluster had 7 AIS lesions. Comparison of DNA
hypermethylation levels between single AAH or AIS lesions and those from
subjects in whom two or more lesions were found showed no statistical
differences in PMR levels for any of the CpG islands, and the distribution of PMR
values was comparable to that of subjects with single AAH or AIS lesions (not
shown). Thus, it would not appear that persons with many AAH or AIS lesions
show generally increased DNA methylation levels in these lesions.
An additional level of complexity, however, is not addressed in this chapter,
as AIS, formerly known as pure bronchioloalveolar carcinoma (BAC), accounts
for only 4% of lung cancers. Up to 20% of lung cancers comprise a
heterogeneous group of lung tumors which contain some component of BAC
histology (Read et al. 2004). These tumors were therefore called tumors with a
“mixed BAC” component, and DNA methylation of this component is explored in
Appendix A.
For those loci for which it is unknown whether their DNA methylation might
contribute to cancer (such as 2C35), further experiments will be required to
determine whether DNA hypermethylation has functional consequences.
Examining the biological consequences of sequential gene silencing, for example
in AAH- or AIS-derived cell lines (Shimada et al. 2005), will help confirm the role
of the genes under study in lung adenocarcinoma development and progression.
Further delineating the nature and timing of epigenetic hits, which are in principle
reversible, is potentially highly relevant for epigenetic therapy of early lung
cancer, and perhaps for cancer prevention. Lastly, irrespective of the biological
effects of hypermethylation at each locus, the presence of DNA methylation
characteristic of each type of lesion can be used to inform the generation of
biomarkers specific for the different developmental stages of lung
adenocarcinoma.
CHAPTER 3
GENOME-SCALE ANALYSIS OF DNA METHYLATION IN
LUNG ADENOCARCINOMA:
BIOMARKERS AND FUNCTIONAL CHANGES
Chapter 3 Abstract
Lung cancer is the leading cause of cancer death worldwide and
adenocarcinoma is its most common histological subtype. Here we performed
genome-scale DNA methylation profiling using the Illumina Infinium
HumanMethylation27 platform on 59 matched lung adenocarcinoma/non-tumor
lung samples, with genome-scale verification on an independent set of tissues.
We identified 766 genes showing altered DNA methylation between tumors and
non-tumor lung. By integrating DNA methylation and mRNA expression data, we
identified 164 hypermethylated genes showing concurrent downregulation, and
57 hypomethylated genes showing increased expression. Integrated pathways
analysis indicates that these genes are involved in cell differentiation, epithelial to
mesenchymal transition, RAS and WNT signaling pathways and cell cycle
regulation, among others. Our analysis lays the groundwork for further molecular
studies of lung adenocarcinoma by providing new candidate DNA methylation
biomarkers for early detection and identifying novel molecular alterations
potentially involved in lung adenocarcinoma development and progression.
Introduction
Lung cancer is the leading cause of cancer-related death worldwide (Jemal et
al. 2011). In many countries, adenocarcinoma has surpassed squamous
carcinoma as the most common histological subtype of lung cancer, and it is also
the most common histological subtype in women, Asians, and never-smokers
(Toh et al. 2006). Promoter DNA methylation, which is associated with gene
silencing, can regulate gene expression in a myriad of biological and pathological
processes, including lung cancer (Jones 2002; Belinsky 2004; Kerr et al. 2007;
Brock et al. 2008; Risch and Plass 2008). Unlike genetic mutations, DNA
methylation is an inherently reversible change, and therefore is of great interest
as an active target of drug development (Esteller 2003; Rodriguez-Paredes and
Esteller 2011).
Here we analyzed 59 lung adenocarcinoma tumors and matched adjacent
non-tumor lung (NTL) tissues. Using the Illumina Infinium HumanMethylation27
platform we interrogated the DNA methylation status of 27,578 CpG
dinucleotides spanning 14,475 genes. Comparisons of cancer and non-
cancerous tissues are often done to discover genes which contribute to
tumorigenesis in order to better understand the etiology of the disease or to
develop new therapeutics, as well as to identify those genes which regardless of
function may serve as molecular biomarkers, either for diagnostic or prognostic
purposes. In this study we used our data, as well as publicly available datasets, to
identify potential blood-based DNA methylation biomarkers. Then, focusing on
genes differentially methylated in tumor vs. non-tumor lung, we integrated mRNA
expression data to identify DNA methylation events with potential functional
significance. We verified our findings using an independent set of 28 lung
adenocarcinomas and matched adjacent NTL, as well as validated select DNA
methylation loci using an alternative assay, MethyLight.
Materials and Methods
Study samples
The Early Detection Research Network (EDRN)/Canary Foundation tissue
collection consisted of 60 lung adenocarcinoma tumors and matched adjacent
histologically confirmed non-tumor lung (NTL), collected after surgery. 45
adenocarcinoma/NTL pairs were obtained from the Vancouver General Hospital
(Vancouver, Canada) and 15 adenocarcinoma/NTL pairs from the British
Columbia Cancer Agency Tumor Tissue Repository (Vancouver, Canada, BCCA
Research Ethics Board #: H09-00008). 30 subjects were never smokers (defined
as <100 lifetime cigarettes), and 30 were current smokers (average 53 pack-
years, range 11-120 pack-years). One tumor sample was excluded after
pathology review later revealed it to be a large cell carcinoma. Subject
characteristics for the remaining 59 subjects are detailed in Table 3.1. For
verification of DNA methylation profiling, an independent sample set of 28 lung
adenocarcinomas and 27 NTL was used. Subject characteristics for the
validation population are detailed in Table 3.2. Of these, 21 tumor and 20 NTL
de-identified samples were purchased from the Ontario Tumor Bank (OTB, Ontario,
Canada) while 7 tumors and 7 NTL were collected at the University Hospital at
the University of Southern California (USC IRB protocol #HS-06-00447). For
MethyLight verification of selected probes, OTB samples were used (26 tumors
and 26 NTL), of which 25 pairs were matched.
Table 3.1. Characteristics of subjects and tumors.

Characteristic   Classes               Never Smokers   Current Smokersᵃ   Fisher p
Sex              Male                  7               7                  1.00
                 Female                23              22
Race             Asian                 21              1                  6.15x10⁻⁸
                 White                 9               28
Ageᵇ             <68                   10              15                 0.19
                 ≥68                   20              14
Stage            Early (Stages I-II)   23              23                 1.00
                 Late (Stages III-IV)  7               6
KRAS mutation    WT                    27              10                 1.10x10⁻⁵
                 Mutant                3               19
EGFR mutation    WT                    13              29                 6.19x10⁻⁷
                 Mutant                17              0
LKB1 mutation    WT                    27              22                 0.052
                 Mutant                1               7

ᵃ range (11-120 pack-years, mean=53 pack-years)
ᵇ range (39-86 years, mean=68 years)
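
The Fisher p-values in Table 3.1 can be recomputed from the listed counts. The following is a minimal sketch, assuming a two-sided Fisher's exact test that sums the probabilities of all tables no more likely than the observed one; the function name and the tolerance factor are illustrative and are not taken from the original analysis.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed table
    # (small relative tolerance guards against floating-point ties).
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Counts from Table 3.1, ordered as (never smokers, current smokers) per row.
p_sex = fisher_exact_two_sided(7, 7, 23, 22)    # Male / Female
p_egfr = fisher_exact_two_sided(13, 29, 17, 0)  # EGFR WT / Mutant
print(f"Sex: p = {p_sex:.2f}; EGFR: p = {p_egfr:.2e}")
```

Rows with a missing class label or incomplete counts (for example LKB1, where the never-smoker counts sum to 28) are best checked against the original table before being recomputed this way.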
Table 3.2. Characteristics of validation subjects

Characteristics   Classes        NTL   Tumor
Tissue Sourceᵃ    OTB            20    21
                  USC            7     7
Sexᵇ              Male           11    11
                  Female         13    14
Raceᶜ             Asian          2     2
                  Caucasian      3     3
Stageᵈ            Stage I-II           14
                  Stage III-IV         6

ᵃ 17 OTB samples were paired, and all 7 USC samples were paired
ᵇ Information not available for 3 subjects
ᶜ Information only available for 5 subjects
ᵈ Information not available from 8 subjects
10 of the tumors and 13 of the NTL samples examined by MethyLight were the
same tissues used for the genome-wide verification. EDRN/Canary samples
were assessed by an experienced pathologist (Adi F. Gazdar), while the
verification samples were assessed by a separate expert lung pathologist
(Michael N. Koss). All sample collections were performed conforming to protocols
approved by the appropriate local Institutional Review Boards and were acquired
with informed consent. The identities of the subjects were not made available to
the laboratory investigators.
DNA methylation data production
DNA was extracted by proteinase K digestion following manual microdissection
from slides prepared from fresh frozen tissue blocks. The DNA was then bisulfite
converted using the EZ DNA Methylation kit (Zymo Research, Irvine, CA, USA)
with a modification to the manufacturer’s protocol in which samples were cycled
16 times for 30 seconds at 90°C and one hour at 50°C. The Illumina Infinium
HumanMethylation27 BeadChip assays were performed by the USC Epigenome
Center according to manufacturer’s protocols (Illumina, San Diego, CA, USA).
This assay generates DNA methylation data for 27,578 CpG dinucleotides
covering 14,475 unique genes. DNA methylation levels are reported as β-values,
calculated from mean methylated (M) and unmethylated (U) signal intensities for
each locus for each sample using the formula β = M/(M+U). Probes with
detection p-values of >0.05 were deemed not significantly different from
background noise and were labeled “NA”. Data for all samples are publicly
available at the EDRN Public Portal (http://www.cancer.gov/edrn). Publicly
available Illumina Infinium HumanMethylation27 data for 23 squamous cell
carcinomas (SCC) was downloaded from The Cancer Genome Atlas data portal
(http://tcga-data.nci.nih.gov/tcga/). At the time of analysis, no matched adjacent
non-tumor lung (AdjNTL) for these SCC tissues were available. Therefore top
differential DNA methylation loci were generated by comparing these 23 SCC
with the AdjNTL obtained from 20 of the verification set tumors (OTB lung
adenocarcinomas). Illumina Infinium HumanMethylation27 data for 10 white blood
cell (WBC) samples from healthy subjects were also generated at the USC Epigenome
Center as detailed above.
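
The β-value calculation and detection p-value masking described above can be sketched as follows. This is an illustrative example only: the function name, the probe identifiers, the intensity values and the guard against zero total intensity are assumptions, not part of the original pipeline.

```python
import math

def beta_value(m, u, detection_p, p_cutoff=0.05):
    """Return beta = M / (M + U), or NaN (the "NA" label) when the probe fails detection."""
    # Probes with detection p > cutoff are indistinguishable from background.
    # The zero-intensity guard is an added assumption to avoid division by zero.
    if detection_p > p_cutoff or (m + u) == 0:
        return math.nan
    return m / (m + u)

# Hypothetical mean signal intensities for three probes in one sample.
probes = [
    {"id": "probe_1", "M": 9000.0, "U": 1000.0, "p": 0.001},  # mostly methylated
    {"id": "probe_2", "M": 500.0, "U": 4500.0, "p": 0.010},   # mostly unmethylated
    {"id": "probe_3", "M": 60.0, "U": 40.0, "p": 0.300},      # fails detection
]
betas = {pr["id"]: beta_value(pr["M"], pr["U"], pr["p"]) for pr in probes}
print(betas)
```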
MethyLight experiments were performed as previously described
(Campan et al. 2009). MethyLight measurements are represented as percentage
methylated reference (PMR), defined by the GENE:ALU ratio of a sample,
wherein ALU refers to a reference primer/probe combination that lacks CpGs and
is designed to bind to a subset of ALU repeat sequences (Weisenberger et al.
2006), divided by the GENE:ALU ratio of M.SssI-treated reference DNA. This
results in a PMR range of 0-100, where a PMR of 0 indicates no detectable DNA
methylation and a PMR of 100 represents fully methylated molecules.
Occasionally, a PMR of >100 can be observed, which may result when the
reference DNA was not fully methylated at a particular site. To minimize this, the
same batch of reference DNA that has been exhaustively enzymatically
methylated is used throughout the experiments (Selamat et al. 2011). Previously
published MethyLight primer/probe sets were used for the SFN and MAL locus
and the ALU reference (Weisenberger et al. 2006; Noushmehr et al. 2010). Other
MethyLight primer/probe sequences are detailed in Table 3.3.
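
The PMR definition given above reduces to a ratio of ratios. The sketch below assumes that the four inputs are relative template quantities already derived from the real-time PCR standard curves; the function and argument names are hypothetical.

```python
def pmr(gene_sample, alu_sample, gene_reference, alu_reference):
    """Percentage of methylated reference.

    PMR = 100 * (GENE:ALU ratio of the sample)
              / (GENE:ALU ratio of the M.SssI-treated reference DNA)
    """
    sample_ratio = gene_sample / alu_sample
    reference_ratio = gene_reference / alu_reference
    return 100.0 * sample_ratio / reference_ratio

# Hypothetical quantities: the sample carries half the methylated signal of the reference.
example_pmr = pmr(gene_sample=1.0, alu_sample=4.0, gene_reference=2.0, alu_reference=4.0)
print(example_pmr)

# A value above 100 arises when the reference is not fully methylated at the site.
over_100 = pmr(gene_sample=3.0, alu_sample=4.0, gene_reference=2.0, alu_reference=4.0)
```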
DNA methylation data analysis
Data analyses were performed using R (R Development Core Team, 2011) and
Bioconductor (Gentleman et al. 2004). The analyses of 120 tissue samples
necessitated conducting the experiment with the samples randomized and
spread over two bisulfite treatment plates and 16 Infinium BeadChips. Batch
effect investigations were performed as recommended (Leek et al. 2010) and are
illustrated in Figure 3.1. Three tissue samples were excluded from analyses: one
tumor/NTL tissue pair (07L36_T/N) found to be a large cell carcinoma instead of
a lung adenocarcinoma and one NTL sample (3023_N) for which correlation
analyses suggested this was neither a lung adenocarcinoma nor NTL tissue.
Probes targeting the X and Y chromosomes were excluded, as were probes
containing a known single-nucleotide polymorphism, probes that contain repeat
sequences of [...] bp, and probes that were found to be non-unique in the
genome (Noushmehr et al. 2010). Hierarchical clustering was performed using
Ward linkage with Euclidean distance for samples and Pearson correlation
coefficients for probes.
Table 3.3. MethyLight primer and probe sequences

HUGOᵃ    Forward Primer Sequence   Reverse Primer Sequence        Probe Oligo Sequence
ABCA3    TCGATAGTTGTCGCGGGTC       CCGAAACCACGAACGCAC             CCGACCGCTAACGACGCTCGAA
ACVRL1   TAGAGGTGAGTCGAGGTTCGC     CTAAACCTCCCGCGACCA             AACGAAAATCAAACCGCCCGCC
DOCK2    ATAGGGTTAAGATTTGCGTTTGA   CTCCCACGCGACGAAC               CTCCTAAAAACGCCGAATTTAACGCGAA
EFEMP1   TTAGGATCGGAATTAGGGATCG    AAAACCGCCGCGTTAACTT            CCGCCCGCCCGAAACTAATACCT
FAM83A   TTCGGTTGGTTACGGACGTT      AAACCTAAATATACTAAACCTCCACCGA   CGCCTTCCTAACTAAACACCCGCCAAT
HOXA5    CGACGCGGGCGTTATC          ACGCGAATAAAACATTACAACCC        CTCCCGACTCCGAATTCCCGCT
JAM2     CGAGGTCGTCGCGGG           CTTACTCCCAACGTCCGCTC           CTAAATTTCCCGCTCGCCTAAACGCA
MAL      GTTCGGTGTAGGATTTTAGCGTC   ATCTACAATAAAAAATAAAACCGACCG    CGACCGCCGACCCCTTCCG
NDRG2    AGGTGTGTTGGCGTCGAAAG      AAACAAACGTCGAAACGCAAC          AAAATCGAAACTCCTACGCCCTCGACG
SOCS2    TCGGGCGGAGTTGCG           AAAACTTCGCAACCTACTCTCGAC       CCGCCCTCGCGACAATATCAATCC
SPON1    TCGTTAAGTCGAGGCGGG        CCTACCCGAAATAATCCCGAC          AAACTCGAAACTAACGTCCGCGACTCG

ᵃ HUGO gene name (www.genenames.org)
Figure 3.1. Quality control and batch effects analysis of the DNA
methylation data show no confounding batch effects. A) The top two
principal components of all probes after subtracting for probes on the X and Y
chromosomes, probes that contain SNPs and repeat sequences of [...] bp. Tumor
and NTL samples are colored by bisulfite-treatment plate: Plate 31 (purple); Plate
32 (green). There is clear separation of tumor from NTL samples, but no strong
delineation between plates, supporting the absence of a strong batch effect. B
and C) H2228, a lung adenocarcinoma cell line, was run on both bisulfite
treatment plates; the density plots show good overlap and the two runs
correlate well (R² = 0.97).
For each comparison analysis, the top 5000 most variable probes across
all samples included in the comparison, as measured by SD/SDmax, were retained
(Cancer Genome Atlas Research Network 2011). Locus-by-locus
analyses were conducted using the nonparametric Wilcoxon rank sum-test and
multiple comparisons correction was performed using Q-values from the qvalue
package in R (Storey and Tibshirani 2003). Probes were considered statistically
significantly different between the tested groups with a Q<0.05. We also included
an additional filter requiring the median β-value difference between groups to be
≥0.2, or a minimum 20% difference, for all group comparisons.
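
The two-part call described above (Q<0.05 plus a median β-value difference of at least 0.2) can be sketched per probe as follows. This is a simplified illustration: the Q-value is assumed to have been computed beforehand (in the text, from Wilcoxon rank sum p-values with the qvalue package in R), and the function name and example β-values are hypothetical.

```python
from statistics import median

def classify_probe(tumor_betas, ntl_betas, q_value, q_cutoff=0.05, min_delta=0.2):
    """Label one probe using the Q-value cutoff and the median beta-value difference."""
    delta = median(tumor_betas) - median(ntl_betas)  # median Tumor - median NTL
    if q_value < q_cutoff and delta >= min_delta:
        return "hypermethylated"
    if q_value < q_cutoff and delta <= -min_delta:
        return "hypomethylated"
    return "not significant"

# Hypothetical beta-values for four probes.
hyper = classify_probe([0.7, 0.8, 0.9], [0.1, 0.2, 0.3], q_value=0.001)
hypo = classify_probe([0.10, 0.15, 0.20], [0.6, 0.7, 0.8], q_value=0.001)
small_shift = classify_probe([0.50, 0.55, 0.60], [0.45, 0.50, 0.55], q_value=0.001)
weak_q = classify_probe([0.7, 0.8, 0.9], [0.1, 0.2, 0.3], q_value=0.20)
print(hyper, hypo, small_shift, weak_q)
```

Requiring both conditions keeps probes that are statistically different but shift by only a few percent of methylation out of the final lists.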
Functional classification/Gene network analyses
Differentially methylated genes were analyzed for gene ontology enrichments in
comparison to all genes available on the Illumina Infinium HumanMethylation27K
platform using the DAVID Functional Annotation Tool (Huang da et al. 2009).
Genes for which expression levels change in concordance with DNA methylation
changes were analyzed for gene network and biological processes enrichment
using IPA (Ingenuity Systems, www.ingenuity.com). Meta-analyses to identify
correlated biosets and overlapping genes in publicly available datasets were
performed using NextBio online database (NextBio, Cupertino, CA,
www.nextbio.com).
Integration of gene expression analysis
Gene expression profiles were generated using RNA obtained by Trizol
extraction (Invitrogen Corp., Carlsbad, CA, USA) from microdissected alternate
sections of the same 59 EDRN lung AD/NTL frozen tissue pairs used for the
DNA methylation analysis. Expression data was obtained using the Illumina
Human WG-6 v3.0 Expression BeadChips (Illumina, San Diego, CA, USA) at the
Genomics Core at UT Southwestern. Bead-summarized data was obtained using
the Illumina BeadStudio software and expression values were log2-transformed
and Robust Spline Normalization (RSN) was performed with the lumi package in
R (Du et al. 2008). The ReMOAT annotation of gene expression data was used
to include only “perfect” and “good” annotated probes (Barbosa-Morais et al.
2010). Exploratory quality control analyses revealed no strong batch effects (data
not shown), although one tumor sample was excluded (3035_T) due to quality
concerns. Out of 766 differentially methylated genes identified (Q<0.05, median
β-value difference ≥0.2), we were able to examine gene expression levels for 709
genes after the probe quality filtering detailed above. For genes with multiple
probes, we selected the probe with highest variance, and analyzed differential
expression using t-tests and a Benjamini-Hochberg multiple comparisons
correction. Statistical significance was called at BH-adjusted p<0.05. An
additional stringent filter of mean 2-fold change was used to identify top changing
genes. Correlation between gene expression and DNA methylation for each gene
was measured using Spearman correlation coefficient. For genes with multiple
probes measuring DNA methylation, we selected the probe with the highest
SD/SDmax value for DNA methylation.
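
The Benjamini-Hochberg adjustment and the additional fold-change filter described above can be sketched as follows. The BH step-up procedure is standard; interpreting the "mean 2-fold change" as an absolute difference of at least 1 between mean log2 expression values is an assumption, and the function names and example values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def passes_expression_filters(adj_p, mean_log2_tumor, mean_log2_ntl, alpha=0.05):
    """BH-adjusted p < alpha and at least a 2-fold mean change (assumed: |log2 difference| >= 1)."""
    return adj_p < alpha and abs(mean_log2_tumor - mean_log2_ntl) >= 1.0

# Hypothetical t-test p-values for four genes.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.20])
top_gene = passes_expression_filters(adjusted[0], mean_log2_tumor=7.0, mean_log2_ntl=8.5)
print(adjusted, top_gene)
```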
Results
59 lung adenocarcinomas and matched adjacent non-tumor lung tissue
(Table 3.1) were interrogated for DNA methylation using the Illumina Infinium
HumanMethylation27 platform. 30 tumors were from never-smokers (defined
here as less than 100 cigarettes in a lifetime), while 29 were from current
smokers. Before any statistical tests were conducted, we first inspected the data
for the presence of substantial confounding batch effects due to the separate
plates or chips (Leek et al. 2010). We did not observe any such effects (see
Methods and Figure 3.1). One NTL sample was eliminated for quality control
reasons (see Methods); 117 samples were thus further analyzed (as outlined in
Figure 3.2).
Identification of differentially methylated regions in lung adenocarcinoma
We first performed an exploratory two-dimensional hierarchical clustering of the
top 5000 probes that varied most across the 117 samples (Figure 3.3A). The
DNA methylation profiles of tumors and NTL resulted in separate clusters, with
the exception of one NTL sample (3022_N), indicating a substantial difference in
DNA methylation profiles between the tumor and non-tumor samples. We next
performed a locus-by-locus differential DNA methylation analysis of tumors vs.
NTL to identify differentially methylated probes.
Figure 3.2. Diagram of analysis strategy. Overview of analyses and samples
used in this study.
Figure 3.3. Identification of DNA methylation differences between lung
adenocarcinoma and NTL. (A) Two dimensional-hierarchical clustering was
performed using the 5000 most variable Infinium DNA methylation probes across
all samples (n=117). Probes are in rows; samples are in columns. (B) Volcano
plot of the differential DNA methylation analysis. X-axis: median β-value
difference (median Tumor-median NTL); Y-axis: Q-values for each probe
(-1*log10 scale). The vertical dotted lines mark 20% change in β-values; the
horizontal dotted line marks the significance cut-off. (C) Proportions of probes
from genes with associated CpG islands (CGI) for each gene list. D) Overlap of
significant unique gene lists using an independent sample set (also see Figure
3.6).
Using our criteria of Q<0.05 and a minimum median β-value difference of
20%, we identified 681 probes (520 genes) that were significantly
hypermethylated in tumors, and 275 probes (247 genes) that were significantly
hypomethylated in tumors (Figure 3.3B and Appendix B). Some of our most
hypermethylated loci include HOX genes, specifically HOXB4, HOXA9 and
HOXA7. 17 different HOX genes passed our strict cutoffs, many passing with
multiple probes, supporting previous observations of widespread DNA
methylation of the polycomb complex targeted HOX genes (Shiraishi et al. 2002;
Rauch et al. 2007). Some of our most hypomethylated loci include CASP8 and
TNFRSF10A, both involved in TNF-receptor mediated apoptosis (Boldin et al.
1996; Wang and El-Deiry 2003). To investigate the categories of genes exhibiting
altered DNA methylation, we performed a DAVID functional annotation analysis
(Huang da et al. 2009). The differentially hypermethylated set of genes was
significantly enriched in GO biological processes including regulation of
transcription, embryonic morphogenesis, cell-cell signaling, and cell surface
receptor-linked signal transduction, while the differentially hypomethylated set
was significantly enriched in processes including epidermal cell differentiation,
epithelial cell differentiation and defense response (Benjamini-Hochberg (BH)
adjusted p<0.05).
For the differentially methylated probes, we also investigated whether or
not they corresponded to genes with associated CpG islands (Figure 3.3C). 64%
of probes hypermethylated in tumors corresponded to genes with associated
CpG islands, versus only 7% of probes hypomethylated in tumors (p<2.2x10⁻¹⁶,
Fisher’s Exact test). Our findings support previous observations showing
significant differences between the characteristics of genes that gain DNA
methylation during tumorigenesis, versus those that lose DNA methylation (Ohm
et al. 2007; Schlesinger et al. 2007). One gene, CDH13 - represented by nine
probes on the Infinium platform - was present in both the hypermethylated and
hypomethylated lists (Figures 3.4 and 3.5). Three CDH13 probes were included
in our set of the 5000 most variable probes. Of these, two were statistically
significantly hypermethylated in tumors, while one was hypomethylated. The
former were both located in the promoter region of the gene and 4 additional
probes in this region also showed hypermethylation in tumors, though they did
not meet our criteria for significance in our discovery analysis. The
hypomethylated probe was not located in/near a CpG island and resided in intron
1 of the gene, over 10 kilobases from the transcription start site. The two
additional probes in this region showed trends of hypomethylation, although they
do not reach statistical significance. Thus, the CDH13 gene appears to be
characterized by widespread differential and bidirectional changes in DNA
methylation in lung adenocarcinoma vs. the adjacent NTL. This exemplifies the
recent observation that methylation gain in the promoter can be coupled with
methylation loss in the gene body (Berman et al. 2011).
Figure 3.4. CpG island-associated Illumina Infinium HumanMethylation27 probes
for CDH13. (A) Six of the nine CDH13 probes are located in or near a CpG island. Two
probes were included in our analysis of the top 5000 most variable probes and were
statistically significantly differentially methylated between tumors and NTL (Q<0.05,
median beta-value difference ≥0.2, highlighted in red). (B) X-axes: DNA methylation
beta-values; Y-axes: log2-transformed gene expression values. Spearman correlation
coefficients are shown at the bottom of each scatterplot. Tumors (red); NTL (blue). (C)
Y-axes: DNA methylation beta-values.
Figure 3.5. Intronic Illumina Infinium HumanMethylation27 probes for CDH13. (A)
Three of the nine CDH13 probes are located in the intronic region of the gene. One
probe was included in our analysis of the top 5000 most variable probes and was
statistically significantly differentially methylated between tumors and NTL (Q<0.05,
median beta-value difference ≥0.2, highlighted in red). (B) X-axes: DNA methylation
beta-values; Y-axes: log2-transformed gene expression values. Spearman correlation
coefficients are shown at the bottom of each scatterplot. Tumors (red); NTL (blue). (C)
Y-axes: DNA methylation beta-values.
To ensure that our findings were not dependent on the specific population
analyzed, we performed an additional differential DNA methylation analysis on an
independent sample set of 28 lung adenocarcinomas and 27 NTL (Table 3.2),
using the same Infinium platform. By applying identical statistical criteria, we
identified 313 significantly hypermethylated genes and 85 significantly
hypomethylated genes in this verification set (Figure 3.6). The smaller number of
genes which passed our criteria is expected due to the smaller sample size of the
verification set, which reduces our statistical power. Importantly, we found that
95% of the hypermethylated genes and 80% of hypomethylated genes in the
verification set had been identified in our discovery analysis (Figure 3.3D),
supporting our initial findings. To provide technical validation of our observations,
we assessed the DNA methylation levels of 12 genes using the real-time PCR-
based MethyLight technique on DNA from 26 tumor/NTL pairs, of which 10
tumors and 13 NTL samples were also part of the Infinium verification dataset
(Figure 3.7). These 12 genes were chosen based on significance, functional
relevance and assay design compatibility. While differences in the observed
absolute levels of DNA methylation are expected due to the more stringent
nature of the MethyLight assay, which requires multiple fully methylated CpG
dinucleotides in the region covered by the primers and probe, all but one
(SOCS2) of the 10 tested genes hypermethylated in the Infinium study were also
found to be significantly more highly methylated in the verification set tumors by
MethyLight.
Figure 3.6. Verification of DNA methylation differences between lung
adenocarcinoma and NTL in independent samples. (A) Two dimensional-
hierarchical clustering was performed using the 5000 most variable Infinium DNA
methylation probes across all samples (n=57). Probes are in rows; samples are
in columns. (B) Volcano plot of the differential DNA methylation analysis. X-axis:
median β-value difference (median Tumor-median NTL); Y-axis: Q-values for
each probe (-1*log10 scale). The vertical dotted lines mark 20% change in
β-values; the horizontal dotted line marks the significance cut-off.
Figure 3.7. Verification of selected DNA methylation differences between lung adenocarcinoma and
NTL with an alternate method. MethyLight verification of select loci using 26 independent tumor/NTL pairs. PMR =
Percent Methylated Reference, a measure of DNA methylation in comparison to a fully methylated control, with range
0-100, where 0 represents no detectable methylation and 100 represents high DNA methylation. p-values were
calculated using Wilcoxon signed-rank tests.
Both hypomethylated genes tested (FAM83A and SFN, right two panels,
Figure 3.7) were also statistically significantly hypomethylated in tumors in the
MethyLight verification.
Identification of potentially functionally relevant DNA methylation changes in lung
adenocarcinoma
Recent integrated expression and methylome studies have indicated that most
DNA methylation changes associated with cancer do not correlate with altered
gene expression (Noushmehr et al. 2010; Hinoue et al. 2011). To identify those
DNA methylation changes with concomitant changes in gene expression, we
integrated the gene expression profiles and DNA methylation profiles of the
EDRN tumor and NTL tissue samples. We were able to examine gene
expression levels for 709 out of 766 of the differentially methylated genes. An
exploratory hierarchical clustering of the expression levels of just these 709
genes completely separated out tumors vs. NTL (Figure 3.8A). Using a BH-
adjusted p-value of 0.05, 349 genes were differentially expressed. 164 of these
were statistically significantly hypermethylated and downregulated (23%), while
57 genes were significantly hypomethylated and upregulated (Figure 3.8B).
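
The grouping of genes by the direction of their methylation and expression changes can be sketched as a simple four-way classification. This is an illustrative example only: the function name, the argument names and the example values are hypothetical, and a single cutoff of 0.05 is assumed for both the methylation Q-value and the BH-adjusted expression p-value.

```python
def integration_category(meth_delta, meth_q, expr_log2_diff, expr_adj_p, alpha=0.05):
    """Assign a gene to one of the four methylation/expression groups.

    meth_delta:     median beta-value difference (tumor - NTL)
    expr_log2_diff: mean log2 expression difference (tumor - NTL)
    """
    if meth_q >= alpha or expr_adj_p >= alpha:
        return "not significant for both"
    methylation = "hypermethylated" if meth_delta > 0 else "hypomethylated"
    expression = "upregulated" if expr_log2_diff > 0 else "downregulated"
    return f"{methylation} and {expression}"

# Hypothetical genes illustrating the two inversely correlated groups.
silenced = integration_category(0.35, 1e-6, -1.8, 1e-4)
activated = integration_category(-0.30, 1e-5, 1.2, 1e-3)
unchanged = integration_category(0.35, 1e-6, -0.1, 0.40)
print(silenced, activated, unchanged, sep="\n")
```

The first two groups correspond to the inverse relationship expected when promoter methylation affects transcription; the concordant groups are retained in the classification so that they remain visible rather than being silently dropped.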
Figure 3.8. Identification of genes showing coordinately changed DNA
methylation and gene expression. (A) Two-dimensional hierarchical clustering
with 1061 probes corresponding to 709 genes across all tumors (n=58) and NTL
(n=58). Rows represent probes; columns are samples. (B) Starburst plot
integrating differential DNA methylation and gene expression analyses. X-axis:
DNA methylation Q-values (-1*log10 scale); Y-axis: BH adjusted p-values
(-1*log10 scale). Indicated are genes which are hypermethylated and
downregulated in tumors (red); genes which are hypomethylated and
upregulated in tumors (green); genes which are hypermethylated and
upregulated in tumors (blue); genes which are hypomethylated and
downregulated in tumors (orange).
We used Ingenuity Pathways Analysis to investigate which gene networks
might be affected by the aberrant DNA methylation of these 221 genes. The top
two gene networks identified involved cell differentiation on the one hand, and
MAPK signaling/cell cycle control on the other (Figure 3.9A). Prominent in the
first network were PI3K, AP1 transcription factors, TGFβ and WNT signaling
pathway members. Epigenetic interactions with histones H3 and H4 were also
seen in the first network. In the second network, ERK1/2 and MAPK1/2, FGF
and CCNA (cyclin A) genes were central. Analysis of the top functional
categories of deregulated genes pointed to cellular movement and development,
tissue development, and cellular growth and proliferation (Figure 3.9B).
We next used the NextBio database (www.nextbio.com) to identify biosets
(uploaded data sets) that were significantly associated with our list of 221 genes
for which DNA methylation changes were significantly inversely correlated with
changes in expression. We found highly statistically significant overlaps (all
p < 1.0 × 10^-20) with a number of previously published gene expression studies
comparing lung adenocarcinoma to non-tumor lung (Wrage et al. 2009; Hou et al.
2010; Lu et al. 2010) (Figure 3.10). 107 out of the 221 genes were found in all of
the top three most correlated biosets (48.4%), while 184 out of the 221 genes
were found in at least one of the top three most correlated biosets (83%).
Additionally, our 221 genes were highly correlated with two colorectal cancer
DNA methylation studies (Hinoue et al. 2011; Kim YH and Kim YS,
unpublished).
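The overlap percentages quoted above follow directly from the gene counts. The short sketch below reproduces that arithmetic; the helper name is hypothetical and the statistics NextBio itself uses to score overlaps are not modeled here.

```python
def percent_overlap(shared, total):
    """Percentage of `total` genes that are shared, rounded to one decimal."""
    return round(100.0 * shared / total, 1)


in_all_three = percent_overlap(107, 221)     # genes found in all three top biosets
in_at_least_one = percent_overlap(184, 221)  # genes found in at least one top bioset
```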
Figure 3.9. Characterization of genes showing coordinately changed DNA
methylation and gene expression. (A) Top gene networks identified through
integrated pathways analysis of significant DNA methylation changes associated
with significant inverse gene expression changes. Indicated are genes which are
hypomethylated and up-regulated in tumors (green); genes which are
hypermethylated and downregulated in tumors (red). Solid lines indicate direct
interaction; dashed lines, indirect interaction. (B) The most significantly enriched
biological process categories within genes showing significant DNA methylation
changes associated with significant inverse gene expression changes.
Figure 3.10. NextBio illustration of biosets found to be most significantly
correlated with genes identified in this study to have functional changes in
DNA methylation. (A-E) Bioset 1 is the list of 221 genes for which DNA
methylation changes were significantly inversely correlated with gene expression.
Venn diagrams and statistics for each panel illustrate significantly overlapping
genes between Bioset 1 and each publicly available bioset. (A) Bioset 2 contains
data from GSE19804 (Lu et al. 2010) comparing gene expression profiles of non-
smoking female stage 1 lung cancer patients in Taiwan vs. adjacent non-tumor
lung. (B) Bioset 2 contains data from GSE10799 (Wrage et al. 2009) comparing
gene expression profiles of primary lung adenocarcinomas without bone
metastasis to non-tumor lung. (C) Bioset 2 contains data from GSE19188 (Hou et
al. 2010) comparing gene expression profiles of lung adenocarcinoma tumors
and non-tumor lung. (D) Bioset 2 contains data from GSE25062 (Hinoue et al.
2011) comparing DNA methylation profiles of colorectal adenocarcinoma and
adjacent normal tissue. (E) Bioset 2 contains data from GSE17648 (Kim YH and
Kim YS, unpublished), comparing DNA methylation profiles from colorectal
adenocarcinoma and normal adjacent mucosa. (F) Venn diagram of NextBio
analysis showing the overlap of our bioset (genes showing significant DNA
methylation changes in conjunction with significant inverse gene expression
changes), with the three most highly correlated NextBio biosets.
To identify the top changing genes, we applied a 2-fold cut-off in the
average change in gene expression (Figure 3.11), finding 45 genes that were
coordinately hypermethylated and downregulated in tumors, and 16 genes that
were coordinately hypomethylated and upregulated (Table 3.4 and Table 3.5).
Thus, 6.3% of the genes identified as hypermethylated in lung adenocarcinoma
are also downregulated more than two-fold in the same tissues, a percentage
similar to that found in colorectal cancer (Hinoue et al. 2011). In a more global
integration analysis utilizing all genes in common between the DNA methylation
and gene expression platforms, we identified these same genes as our top
candidates for showing DNA methylation-based deregulation (data not shown).
For many of the genes we identified, little to nothing is known about DNA
methylation-based deregulation in cancer.
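As a minimal sketch of the 2-fold filter, the function below flags a gene when its mean tumor expression is at least twice, or at most half, its mean NTL expression. It assumes positive expression values on a linear scale, which the text does not specify, and the function name is hypothetical. The final line checks the quoted percentage: 45 genes out of the 709 differentially methylated genes with expression data gives 6.3%, although whether 709 is the denominator used in the text is an assumption.

```python
def passes_fold_cutoff(mean_tumor, mean_ntl, fold=2.0):
    """True if expression changes at least `fold`-fold in either direction.

    Assumes positive, linear-scale expression values.
    """
    ratio = mean_tumor / mean_ntl
    return ratio >= fold or ratio <= 1.0 / fold


# 45 hypermethylated and downregulated genes relative to 709 genes
# (assumed denominator) with both methylation and expression data.
share_percent = round(100.0 * 45 / 709, 1)
```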
In addition to those genes showing inverse relationships between DNA
methylation and gene expression changes, 5 genes were found to be
hypermethylated but upregulated in tumors, while 10 genes were found to be
hypomethylated and downregulated (Appendix C). We attempted to characterize
the different groups of genes by examining whether or not they were associated
with CpG islands. We found a statistically significant association between group
membership and CpG island status (p<0.001, Fisher’s Exact Test, Figure 3.11B);
genes for which DNA methylation increased were significantly associated with
CpG islands, regardless of the direction of the gene expression difference.
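To illustrate the kind of test used here, the sketch below implements a two-sided Fisher's exact test for a 2x2 table using only the standard library. It is a simplification: the analysis above compared several gene groups, whereas this example collapses them into hypermethylated vs. hypomethylated genes, and the counts are hypothetical values chosen for illustration, not study data.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    tol = 1 + 1e-9  # tolerance for floating-point ties
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * tol)
    return min(1.0, p)


# Hypothetical counts: rows = hypermethylated / hypomethylated genes,
# columns = CpG island present / absent.
p_value = fisher_exact_2x2(40, 5, 3, 13)
```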
Figure 3.11. Genes showing most significant changes in DNA methylation
and gene expression. (A) Three dimensional starburst plot of 709 genes,
integrating significant changes in DNA methylation (X-axis) and gene expression
(Z-axis), with a mean >2-fold change in gene expression (Y-axis). (B) Presence
of CpG islands in genes exhibiting hyper- or hypomethylation and up- or
downregulation.
Table 3.4. Top hypermethylated and downregulated genes in lung adenocarcinoma
HUGO^a HUGO gene name^b: Function^c References^d
ABCA3^e ATP-binding cassette, sub-family A, member 3: ATP-binding cassette transporter (Schimanski et al. 2010)
ACVRL1^e activin A receptor type II-like 1: TGF-beta receptor, serine/threonine kinase (Hu-Lowe et al. 2011)
ADCY4 adenylate cyclase 4: Membrane-bound adenylyl cyclase (Rui et al. 2008)
ALDH1A2 aldehyde dehydrogenase 1 family, member A2: Retinoic acid synthesis (Kim et al. 2005a)
C1orf87 chromosome 1 open reading frame 87: Undetermined None available
C7 complement component 7: Component of complement system (Oka et al. 2001)
CDH13 cadherin 13: Cell adhesion (Selamat et al. 2011)
CDO1 cysteine dioxygenase, type I: Regulator of cellular cysteine concentrations (Dietrich et al. 2010)
CLDN5 claudin 5: Integral membrane protein, tight junction component (Sato et al. 2003)
CLEC14A C-type lectin domain family 14, member A: Undetermined (Mura et al. 2011)
CLEC1A C-type lectin domain family 1, member A: Cell adhesion, cell-cell signalling None available
CSF3R colony stimulating factor 3 receptor: Cell proliferation, differentiation, survival (Wang et al. 2010)
CTNNAL1 catenin (cadherin-associated protein), alpha-like 1: Modulates Rho signalling (Noordhuis et al. 2011)
CYYR1 cysteine/tyrosine-rich 1: Undetermined (Vitale et al. 2007)
DOCK2^e dedicator of cytokinesis 2: Cytoskeletal rearrangements, activate RAC1, RAC2 (Nishihara et al. 2002)
EFCAB1 EF-hand calcium binding domain 1: Undetermined None available
EFEMP1^e EGF containing fibulin-like extracellular matrix protein 1: Binds EGF, EGFR; cell adhesion, migration (Yue et al. 2007)
EPB41L3 erythrocyte membrane protein band 4.1-like 3: Meningiomas pathogenesis (Kikuchi et al. 2005)
GATA2 GATA binding protein 2: Transcriptional activator (Acosta et al. 2011)
HBA1 hemoglobin, alpha 1: Oxygen transport from lung to peripheral tissues None available
HDC histidine decarboxylase: Converts L-histidine to histamine (Suzuki-Ishigaki et al. 2000)
HOXA5^e homeobox A5: Transcription factor; development, upregulates p53 (Shiraishi et al. 2002)
HYAL2 hyaluronoglucosaminidase 2: Cell proliferation, migration, differentiation (Rai et al. 2001)
ICAM2 intercellular adhesion molecule 2: Cell adhesion interaction (Hiraoka et al. 2011)
JAM2^e junctional adhesion molecule 2: Tight junctions (Oster et al. 2011)
LTC4S leukotriene C4 synthase: Production of leukotriene C4 (Sakhinia et al. 2006)
MAL^e mal, T-cell differentiation protein: Integral membrane protein, trafficking (Lind et al. 2007)
MAMDC2 MAM domain containing 2: Undetermined None available
NDRG2^e NDRG family member 2: Dendritic/neuron cell differentiation, anti-tumor activity (Piepoli et al. 2009)
PRX periaxin: Peripheral nerve myelin maintenance (Lehtonen et al. 2004)
RHOJ ras homolog gene family, member J: GTP-binding protein; cell morphology (Kaur et al. 2011)
SCARA5 scavenger receptor class A, member 5: Ferritin receptor (Huang et al. 2010)
SCN4B sodium channel, voltage-gated, type IV, beta: Modulate channel gating kinetics (Chioni et al. 2009)
SLC15A3 solute carrier family 15, member 3: Proton oligopeptide cotransporter (Ibragimova et al. 2010)
SNRPN small nuclear ribonucleoprotein polypeptide N: Tissue-specific alternative RNA processing (Kohda et al. 2001)
SOCS2^e suppressor of cytokine signaling 2: Cytokine transduction, negative regulator in GH/IGF1 signalling (Wikman et al. 2002)
SOSTDC1 sclerostin domain containing 1: BMP antagonist; Wnt and TGF-beta signalling (Clausen et al. 2010)
SOX17 SRY (sex determining region Y)-box 17: Transcription regulator; Wnt signalling inhibitor (Zhang et al. 2008)
SPARCL1 SPARC-like 1 (hevin): Undetermined (Isler et al. 2004)
SPON1^e spondin 1, extracellular matrix protein: Cell adhesion (Pyle-Chenault et al. 2005)
TEK TEK tyrosine kinase, endothelial: Endothelial cell proliferation, differentiation (Mazzieri et al. 2011)
TM6SF1 transmembrane 6 superfamily member 1: Undetermined (Tao et al. 2011)
TMEM204 transmembrane protein 204: Cell adhesion and cellular permeability (Shimizu et al. 2011)
TOX2 TOX high mobility group box family member 2: Transcriptional activator None available
TUBB6 tubulin, beta 6: Major constituent of microtubules (Leandro-Garcia et al. 2010)
ZEB2 zinc finger E-box binding homeobox 2: Transcriptional inhibitor, interacts with activated SMADs (Rodenhiser et al. 2008)
^a,b Human Genome Organization nomenclature.
^c Gene function from GeneCards website http://www.genecards.org/
^d Previously reported studies in order of relevance: DNA methylation in lung cancer, DNA methylation in non-lung cancer, known in lung cancer, known in non-lung cancer.
^e Tested in independent population using MethyLight (see Figure 3.7).
Bolded: First report of DNA methylation in lung cancer.
Table 3.5. Top hypomethylated and upregulated genes in lung adenocarcinoma
HUGO^a HUGO gene name^b: Function^c References^d
AGR2 anterior gradient homolog 2: Proto-oncogene; cell migration, differentiation and growth (Pizzi et al. 2011)
AIM2 absent in melanoma 2: Tumor suppressor; represses NF-kappa-B transcription (Pedersen et al. 2011)
CFB complement factor B: Component of complement system None available
FAM83A^e family with sequence similarity 83, member A: Undetermined (Jensen et al. 2008)
GRB7 growth factor receptor-bound protein 7: Integrin signalling pathway, cell migration (Tanaka et al. 2006)
HABP2 hyaluronan binding protein 2: Activates coagulation factor VII (Wang et al. 2002)
KRT8 keratin 8: Cellular structure; signal transduction (Sartor et al. 2011)
LAMB3 laminin, beta 3: Interaction with ECM, cell migration, attachment (Sathyanarayana et al. 2003)
MX2 myxovirus (influenza virus) resistance 2: GTPase (Kobayashi et al. 2004)
PROM2 prominin 2: Organization of plasma membrane (Rohan et al. 2006)
RAB25 RAB25, member RAS oncogene family: Cell survival, migration and proliferation (Roland et al. 2011)
SFN^e stratifin: Epithelial cell growth; stimulates Akt/mTOR pathway; p53 regulated inhibitor of G2/M progression (Osada et al. 2002)
SPDEF SAM pointed domain containing ets transcription factor: Androgen-independent transactivator of PSA, SERPINB5 (Ghadersohi et al. 2004)
TCN1 transcobalamin I: Vitamin B12-binding protein (Remmelink et al. 2005)
TM4SF4 transmembrane 4 L six family member 4: Cell proliferation, growth, motility None available
ZNF750 zinc finger protein 750: Undetermined None available
^a,b Human Genome Organization nomenclature.
^c Gene function from GeneCards website http://www.genecards.org/
^d Previously reported studies in order of relevance: DNA methylation in lung cancer, DNA methylation in non-lung cancer, known in lung cancer, known in non-lung cancer.
^e Tested in independent population using MethyLight (see Figure 3.7).
Bolded: First report of DNA methylation in lung cancer.
Scatterplots illustrate the negative correlation between DNA methylation
and gene expression for select genes as well as the distinct distribution of tumor
and NTL sample values (Figure 3.12).
Pan-non-small cell lung cancer marker
The top 10 most hypermethylated loci in the lung adenocarcinoma vs. AdjNTL
comparison as ranked by median beta-value difference and the top 10 most
hypermethylated loci in the squamous cell carcinoma vs. AdjNTL comparison
had three overlapping genes: HOXB4, NID2 and TRIM58 (Table 3.6). These loci
therefore may serve as biomarkers for both subtypes of non-small cell lung
cancer. However, a clinically useful biomarker ideally would be detectable in less
invasive bodily fluids, such as urine, stool, or blood. For blood-based tests, a
tumor-specific marker must not already be methylated in white blood cells
(WBC), a potential confounder. Therefore, the DNA methylation levels of 10
WBC samples from healthy subjects were examined. For all three loci, there were no
WBC samples with a beta-value of more than 0.1 (Table 3.6 and Figure 3.13A),
making them promising potential blood-based DNA methylation biomarkers for
lung cancer.
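The selection logic in this paragraph can be sketched in a few lines. The two gene lists are copied from the probe rows of Table 3.6, so genes with several probes appear more than once. The WBC β-values shown are hypothetical placeholders, since the individual WBC measurements are not reproduced here, and the helper name is an assumption.

```python
# Gene symbols for the top 10 hypermethylated probes in each comparison
# (Table 3.6); HOXB4 is represented by multiple probes.
adeno_top10 = ["TLX3", "TRIM58", "HOXB4", "NID2", "HOXB4",
               "ZNF577", "AJAP1", "PRAC", "NEFM", "HOXB4"]
squamous_top10 = ["HOXD9", "HOXB4", "NID2", "TRIM58", "SLC6A2",
                  "HOXA9", "HOXB4", "TRH", "ZNF154", "DGKI"]

# Genes present in both top-10 lists.
shared_genes = sorted(set(adeno_top10) & set(squamous_top10))


def unmethylated_in_wbc(wbc_betas, max_beta=0.1):
    """True if no WBC sample exceeds the beta-value threshold used in the text."""
    return all(b <= max_beta for b in wbc_betas)


# Hypothetical WBC beta-values for one locus across 10 healthy subjects.
example_wbc = [0.02, 0.03, 0.01, 0.04, 0.02, 0.05, 0.03, 0.02, 0.06, 0.04]
candidate_ok = unmethylated_in_wbc(example_wbc)
```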
MethyLight reactions were then designed for HOXB4, NID2 and TRIM58,
and tested on the verification samples (Figure 3.13B), illustrating that the
MethyLight reactions also show high levels of DNA methylation in the tumor, and
low levels of DNA methylation in the AdjNTL. This will enable the continued
investigation of these three loci as potential blood-based DNA methylation
biomarkers using the highly sensitive Digital MethyLight platform (Weisenberger
et al. 2008).
Figure 3.12. Correlations between DNA methylation and gene expression. Correlation plots of DNA methylation
vs. gene expression in tumors and normal tissues for select genes.
Table 3.6. Top 10 hypermethylated loci in adenocarcinoma and squamous cell carcinoma. Pink >0.1, yellow >0.05.
Adenocarcinoma vs. AdjNTL White Blood Cell DNA methylation
IlmnID Gene TvN BH p-value Median β-value difference WBC1 WBC2 WBC3 WBC4 WBC5 WBC6 WBC7 WBC8 WBC9 WBC10
cg25720804 TLX3 4.15E-09 0.570
cg07533148 TRIM58 1.99E-05 0.566
cg08089301 HOXB4 3.54E-08 0.549
cg22881914 NID2 4.21E-07 0.545
cg06760035 HOXB4 4.15E-09 0.519
cg16731240 ZNF577 1.59E-06 0.514
cg17525406 AJAP1 1.88E-06 0.501
cg12374721 PRAC 3.54E-08 0.489
cg23290344 NEFM 3.54E-08 0.480
cg14458834 HOXB4 5.07E-07 0.473
Squamous cell carcinoma vs AdjNTL White Blood Cell DNA methylation
cg14991487 HOXD9 3.23E-10 0.581
cg08089301 HOXB4 0.000475184 0.570
cg22881914 NID2 6.46E-06 0.566
cg07533148 TRIM58 1.07E-11 0.550
cg04490714 SLC6A2 4.01E-10 0.534
cg01381846 HOXA9 8.68E-14 0.533
cg06760035 HOXB4 0.003976838 0.525
cg01009664 TRH 4.03E-11 0.523
cg21790626 ZNF154 1.76E-12 0.522
cg06277657 DGKI 2.01E-13 0.521
Figure 3.13. DNA methylation levels of HOXB4, NID2 and TRIM58 on
Illumina Infinium HumanMethylation27 and MethyLight. (A) Y-axes: DNA
methylation beta-values. AD=adenocarcinoma, SQ=squamous cell carcinoma,
AdjNTL= non-tumor lung adjacent to adenocarcinoma, WBC=white blood cell
from healthy subjects. (B) PMR = Percent Methylated Reference, a measure of
DNA methylation in comparison to a fully methylated control, with range 0-100,
where 0 represents no detectable methylation and 100 represents high DNA
methylation. p-values were calculated using Wilcoxon signed-rank tests.
Discussion
We used the Illumina Infinium HumanMethylation27 assays to interrogate
27,578 CpG dinucleotides spanning 14,495 genes in 59 lung adenocarcinoma
tumors and matched adjacent non-tumor lung tissue. We first identified cancer-
specific DNA methylation changes; 520 genes were significantly hypermethylated
and 247 genes were significantly hypomethylated in cancer. These genes may
be useful to develop biomarkers for diagnostic or prognostic purposes. For
example, the comparison of lung adenocarcinoma DNA methylation profiles with
those of white blood cells and other types of cancers (e.g. through The Cancer
Genome Atlas; http://cancergenome.nih.gov/), could lead to the development of
improved DNA methylation-based blood biomarkers specific for lung
adenocarcinoma (Esteller et al. 1999; Usadel et al. 2002).
In this study, we identified three loci, HOXB4, TRIM58 and NID2, that are
hypermethylated both in lung adenocarcinoma and squamous cell carcinoma, the
two most common forms of lung cancer. Importantly, all three loci also had low
levels of DNA methylation in AdjNTL and white blood cells of healthy subjects.
The MethyLight reactions for these three loci will be investigated as potential
blood-based DNA methylation biomarkers for lung cancer. HOXB4 is a member
of the homeobox group of developmental proteins and is a polycomb group
target, a group of genes often methylated in lung cancer (Rauch et al. 2007).
Little is known about TRIM58, but it has previously been identified as methylated
in hepatocytes isolated from liver infected with hepatitis B virus (Tao et al. 2011).
NID2 is a cell adhesion protein with a role in cell-extracellular matrix adhesion,
and has previously been identified as aberrantly methylated in gastrointestinal
cancer (Ulazzi et al. 2007), bladder cancer (Renard et al. 2010), as well as oral
squamous cell carcinoma (Guerrero-Preston et al. 2011).
While for biomarker purposes, the functional consequences of DNA
methylation are not of great importance, in order to understand the mechanisms by
which DNA methylation contributes to tumorigenesis, it is important to delineate
which of the many DNA methylation changes observed in cancer are functionally
relevant. To help differentiate between DNA methylation events that are of
potential functional significance (“driver events”) and those that do not
biologically contribute to tumorigenesis (“passenger events”), we integrated the
DNA methylation data with gene expression profiles of the same tumors, using
stringent statistical and cutoff criteria. There are several caveats to our
approach. First, our stringent criteria may filter out genes that show weak
correlations with gene expression. Secondly, our approach is modeled on the
median behavior of a set of tumors, and some genes may be deregulated by a
variety of mechanisms in addition to DNA methylation, such as histone
modifications, gene mutations or copy number changes. Thus, our approach is
geared towards identifying those genes that are frequently and substantially
deregulated by DNA methylation. With these limitations in mind, we identified 164
genes that were concordantly hypermethylated and downregulated, and 57
genes that were concordantly hypomethylated and upregulated. Integrated
pathways analysis identified two top networks showing frequent gene
deregulation. One of them included genes and pathways involved in epithelial to
mesenchymal transition, such as TGFB and WNT signaling pathways, while the
other centered on growth factor signal transduction and cell cycle control,
including mitogen-activated protein kinases, growth factors, and cell cycle control
hubs. Both networks contained links to known RAS effector pathways (PI3K,
MAPK family). More stringent filtering of genes by requiring a minimal 2-fold
change in expression yielded 45 hypermethylated and downregulated genes,
including novel methylated genes such as ABCA3, an ATP-binding cassette
transporter protein with a critical role in lung development (Cheong et al. 2007;
Shulenin et al. 2004), CLEC1A, encoding C-type lectin domain family 1 member
A, potentially involved in cell adhesion, SPARCL1, extracellular matrix associated
Sparc-like 1, a potential tumor suppressor reported to be down-regulated in lung
cancer (Bendik et al. 1998; Isler et al. 2004), SOX17, a canonical WNT
antagonist previously shown to be functionally hypermethylated in breast and
colorectal cancers (Zhang et al. 2008; Fu et al. 2010) and TMEM204, a
transmembrane protein that plays a role in cell adhesion and is hypermethylated
and downregulated in pancreatic cancer (Shimizu et al. 2011). To our knowledge
this is the first report of the epigenetic dysregulation of these and numerous other
genes in lung cancer (see Table 3.4).
The top hypomethylated and upregulated gene is FAM83A. Little is known
about the function of FAM83A; however, it has been demonstrated previously to
be specifically upregulated in lung cancer, especially in lung adenocarcinomas
(Li et al. 2005). Expression of FAM83A has been used to detect circulating
cancer cells in the peripheral blood of lung cancer patients (Liu et al. 2008).
Additionally, FAM83A was shown to be epigenetically regulated in an in vitro model
of arsenic-mediated malignant transformation (Jensen et al. 2008), supporting an
epigenetic role for FAM83A in tumorigenesis. Another hypomethylated and
upregulated gene is AGR2, a known proto-oncogene which has recently been
confirmed to be overexpressed in lung adenocarcinoma in an independent study
(Pizzi et al. 2011). AGR2 overexpression has been shown to promote cell
proliferation and migration in a number of different cancers using a variety of
functional assays (Ramachandran et al. 2008; Vanderlaag et al. 2010; Park et al.
2011a). KRT8, encoding keratin 8, was also hypomethylated and upregulated,
and was previously reported to be upregulated in lung adenocarcinoma (Wikman
et al. 2002). To our knowledge, this is the first report of an association of DNA
hypomethylation with overexpression of these and other genes in lung
adenocarcinoma (Table 3.5). Investigating the link between loss of methylation
and increased expression of these genes will be important, given the increasing
use of epigenetic therapies in cancer treatment.
In addition to genes showing an inverse correlation between DNA
methylation and expression, we also identified 5 genes that were coordinately
hypermethylated and upregulated, and 10 genes that were hypomethylated and
downregulated. While these two groups of genes do not fit into the classical
paradigm of DNA methylation regulation, increasing evidence from recent deep
sequencing studies shows that DNA methylation regulation may be more complex
(Irizarry et al. 2009; Maunakea et al. 2010; Brenet et al. 2011; van Vlodrop et al.
2011). The location of DNA methylation (inter vs. intragenic DNA methylation,
CpG islands vs. shores, etc.) and the regulation of alternate transcripts must not
be discounted. Newly developed whole genome bisulfite-sequencing and
RNA-seq technologies will be able to shed more light onto these possibilities
(Berman et al. 2011). The effects of promoter vs. intra- and inter-genic DNA
methylation are just beginning to be investigated, and the location of hybridization
probes on DNA methylation measurements should be considered in any analyses. The
CDH13 gene is an example of how differential DNA methylation readouts can be
obtained from a gene that has widely been shown to be repressed in lung cancer
(Sato et al. 1998; Toyooka et al. 2003; Selamat et al. 2011).
In summary, our DNA methylation profiling of 59 lung adenocarcinomas
and matched adjacent non-tumor lung tissue accomplished two goals: 1) the
identification of DNA methylation changes which can be pursued as potential
lung adenocarcinoma biomarkers in the blood, and 2) the identification of
potentially functional DNA methylation changes that may constitute driver
alterations in lung adenocarcinoma. In the next chapter, we will use the same
data to investigate DNA methylation based tumor heterogeneity in lung
adenocarcinoma.
CHAPTER 4
GENOME-SCALE DNA METHYLATION PROFILING OF
LUNG ADENOCARCINOMA: TUMOR HETEROGENEITY
Chapter 4 Abstract
Lung cancer is the leading cause of cancer death worldwide and
adenocarcinoma is its most common histological subtype. Clinical and molecular
evidence indicates that lung adenocarcinoma is a heterogeneous disease, which
has important implications for treatment. Here we performed genome-scale DNA
methylation profiling using the Illumina Infinium HumanMethylation27 platform on
59 matched lung adenocarcinoma/non-tumor lung samples. Comparison of DNA
methylation profiles between lung adenocarcinomas of current and never-
smokers showed modest differences, identifying only LGALS4 as significantly
hypermethylated and downregulated in smokers. LGALS4, encoding a
galactoside-binding protein involved in cell-cell and cell-matrix interactions, was
recently shown to be a tumor-suppressor in colorectal cancer. We also identified
a locus, SULT1C2, encoding an enzyme involved in the metabolism of many drugs and
xenobiotic compounds, as being significantly hypermethylated and
downregulated in the non-tumor lung of Asians in comparison to Caucasians.
Unsupervised analysis of the DNA methylation data identified two tumor
subgroups, one of which showed increased DNA methylation and was
significantly associated with KRAS mutation and, to a lesser extent, with smoking.
Introduction
Lung cancer is the leading cause of cancer-related death worldwide
(Jemal et al. 2011). In many countries, adenocarcinoma has surpassed
squamous carcinoma as the most common histological subtype of lung cancer,
and it is also the most common histological subtype in women, Asians, and
never-smokers (Toh et al. 2006). Lung adenocarcinoma is increasingly
recognized as a clinically and molecularly heterogeneous disease, which has
important prognostic and therapeutic implications. This is exemplified by recent
re-classifications based on pathology and patient survival (Travis et al. 2011), the
increasing number of clinical trials demonstrating targeted treatments that
specifically benefit patients defined by molecular subtypes such as EGFR,
KRAS, BRAF, and ERBB2 mutations and EML4-ALK fusions (Pao et al. 2004;
Pao et al. 2005a; Pao et al. 2005b; Pao and Girard 2011), as well as observed
prognostic gene expression signature profiles (Bhattacharjee et al. 2001; Beer et
al. 2002; Larsen et al. 2007). In addition to genetic and gene expression studies
performed with the goal of discovering clinically relevant subtypes of cancer, the
rapidly expanding field of epigenetic profiling has confirmed the existence of DNA
methylation-based subtypes in several cancers (Issa 2004; Li et al. 2010;
Noushmehr et al. 2010; Hinoue et al. 2011). Promoter DNA methylation, which is
associated with gene silencing, can regulate gene expression in a myriad of
biological and pathological processes, including lung cancer (Jones 2002;
Belinsky 2004; Kerr et al. 2007; Brock et al. 2008; Risch and Plass 2008). Unlike
genetic mutations, DNA methylation is an inherently reversible change, and
therefore is of great interest as an active target of drug development (Esteller
2003; Rodriguez-Paredes and Esteller 2011).
While previous studies have profiled DNA methylation in lung
adenocarcinoma (Shiraishi et al. 2002; Divine et al. 2005; Tsou et al. 2005;
Toyooka et al. 2006; Tsou et al. 2007; Tessema and Belinsky 2008; Christensen
et al. 2009; Goto et al. 2009; Sasaki et al. 2009), they have either been limited in
the number of samples or genes assayed, or included a mix of lung cancer
histologies thereby limiting the ability to identify subtypes. To address these
issues, here we analyzed 59 lung adenocarcinoma tumors and matched adjacent
non-tumor lung (NTL) tissues. Since adenocarcinoma is the most common lung
cancer subtype found in never-smokers, it was important to ensure that cancer
from smokers and never-smokers would both be included. Thus, we chose the
cases so that approximately half of the tumors were from patients who were
never-smokers. Lastly, we used both supervised and unsupervised analyses of
the DNA methylation data to identify sub-groups within the tumors.
Materials and Methods
Study samples
The Early Detection Research Network (EDRN)/Canary Foundation tissue
collection consisted of 60 lung adenocarcinoma tumors and matched adjacent
histologically confirmed non-tumor lung (NTL), collected after surgery. 45
adenocarcinoma/NTL pairs were obtained from the Vancouver General Hospital
(Vancouver, Canada) and 15 adenocarcinoma/NTL pairs from the British
Columbia Cancer Agency Tumor Tissue Repository (Vancouver, Canada, BCCA
Research Ethics Board #: H09-00008). 30 subjects were never smokers (defined
as <100 lifetime cigarettes), and 30 were current smokers (average 53 pack-
years, range 11-120 pack-years). One tumor sample was excluded after
pathology review later revealed it to be a large cell carcinoma. Subject
characteristics for the remaining 59 subjects are detailed in Table 3.1.
EDRN/Canary samples were assessed by an experienced pathologist (AFG). All
sample collections were performed conforming to protocols approved by the
appropriate local Institutional Review Boards and were acquired with informed
consent. The identities of the subjects were not made available to the laboratory
investigators.
DNA methylation data production
DNA was extracted by proteinase K digestion following manual microdissection
from slides prepared from fresh frozen tissue blocks. The DNA was then bisulfite
converted using the EZ DNA Methylation kit (Zymo Research, Irvine, CA, USA)
with a modification to the manufacturer’s protocol in which samples were cycled
16 times for 30 seconds at 90°C and one hour at 50°C. The Illumina Infinium
HumanMethylation27 BeadChip assays were performed by the USC Epigenome
Center according to manufacturer’s protocols (Illumina, San Diego, CA, USA).
This assay generates DNA methylation data for 27,578 CpG dinucleotides
covering 14,495 unique genes. DNA methylation levels are reported as β-values,
calculated from mean methylated (M) and unmethylated (U) signal intensities for
each locus for each sample using the formula β = M/(M+U). Probes with
detection p-values of >0.05 were deemed not significantly different from
background noise and were labeled “NA”. Data for all samples are publicly
available at the EDRN Public Portal (http://www.cancer.gov/edrn).
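The β-value calculation and detection p-value masking described above can be sketched as follows. This is a minimal illustration rather than the platform software: it assumes β is the methylated signal divided by the sum of methylated and unmethylated signals, uses Python's None in place of "NA", and the function and parameter names are hypothetical.

```python
def beta_value(m, u, detection_p, p_cut=0.05):
    """Return M / (M + U), or None ("NA") for unreliable measurements.

    m, u: mean methylated and unmethylated signal intensities.
    detection_p: detection p-value for the probe in this sample.
    """
    if detection_p > p_cut:
        return None  # not distinguishable from background noise
    total = m + u
    if total == 0:
        return None  # no signal at all; the ratio is undefined
    return m / total


# Invented intensities for illustration.
mostly_methylated = beta_value(3000.0, 1000.0, detection_p=0.01)
masked = beta_value(3000.0, 1000.0, detection_p=0.20)
```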
DNA methylation data analysis
Data analyses were performed using R (R Development Core Team, 2011) and
Bioconductor (Gentleman et al. 2004). The analyses of 120 tissue samples
necessitated conducting the experiment with the samples randomized and
spread over two bisulfite treatment plates and 16 Infinium BeadChips. Batch
effect investigations were performed as recommended (Leek et al. 2010) and are
illustrated in Figure 3.7. Three tissue samples were excluded from analyses: one
tumor/NTL tissue pair (07L36_T/N) found to be a large cell carcinoma instead of
a lung adenocarcinoma, and one NTL sample (3023_N) for which correlation
analyses suggested that it was neither a lung adenocarcinoma nor NTL tissue.
Probes targeting the X and Y chromosomes were excluded, as were probes
containing a known single-nucleotide polymorphism, probes that contain repeat
sequences of […] bp, and probes that were found to be non-unique in the
genome (Noushmehr et al. 2010). Hierarchical clustering was performed using
Ward linkage with Euclidean distance for samples and Pearson correlation
coefficients for probes. For each comparison analysis, the top 5000 most variable
probes across all samples included in the comparison, as measured by SD/SD_MAX,
were retained (Cancer Genome Atlas Research Network 2011).
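The probe-selection step can be illustrated with a short sketch. The text does not define SD_MAX, so the code below assumes it is the largest standard deviation possible for values bounded in [0, 1] with the observed mean, i.e. sqrt(mean × (1 − mean)); this definition, the function names and the example data are all assumptions made for illustration. Probes with "NA" values would need to be handled before this step.

```python
import math
import statistics

def scaled_sd(values):
    """SD divided by the largest SD possible for values in [0, 1] with this mean.

    ASSUMPTION: SD_MAX = sqrt(mean * (1 - mean)); the source does not define it.
    """
    mean = statistics.fmean(values)
    sd_max = math.sqrt(mean * (1.0 - mean))
    if sd_max == 0.0:
        return 0.0
    return statistics.pstdev(values) / sd_max

def top_variable_probes(beta_by_probe, n):
    """Return the n probe IDs with the highest SD/SD_MAX across samples."""
    ranked = sorted(beta_by_probe,
                    key=lambda pid: scaled_sd(beta_by_probe[pid]),
                    reverse=True)
    return ranked[:n]

# Hypothetical beta-values: probe ID -> values across four samples.
example = {
    "probe_a": [0.05, 0.06, 0.05, 0.04],  # stable, low methylation
    "probe_b": [0.10, 0.90, 0.15, 0.85],  # highly variable
    "probe_c": [0.50, 0.55, 0.45, 0.50],  # stable, intermediate methylation
}
selected = top_variable_probes(example, n=1)
print(selected)
```

In the analysis described above, n would be 5000 and the input would contain every probe that survived filtering.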
Results
59 lung adenocarcinomas and matched adjacent non-tumor lung tissue (Table
3.1) were interrogated for DNA methylation using the Illumina Infinium
HumanMethylation27 platform. 30 tumors were from never-smokers (defined
here as less than 100 cigarettes in a lifetime), while 29 were from current
smokers.
Differential DNA methylation analysis between smokers and never-smokers
The EDRN tumor collection consisted of 30 tumors from never-smokers and 29
from current smokers (Table 3.1). A correlation analysis to examine the overall
difference in DNA methylation between tumor tissues from smokers and never-
smokers showed that DNA methylation profiles of both groups are very similar
(Figure 4.1A). We then performed a locus-by-locus differential DNA methylation
analysis of smoker vs. never-smoker tissue (tumors as well as NTL) to identify
differentially methylated probes. No genes were found to be statistically
significantly differentially methylated between the subsets of NTL (data not
shown), but six genes were statistically significantly different between smoker
and never-smoker tumor tissues (Figure 4.1B). IRF8, IHH, LGALS4, IL18BP and
VTN were hypermethylated in current smoker tumors, while KLF11 was
hypomethylated (Table 4.1). Only one gene, LGALS4, showed a statistically
significant corresponding down-regulation in gene expression in current smoker
tumors (BH multiple comparisons correction; t-test p<0.0069).
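The two criteria applied in the locus-by-locus comparison (a multiple-comparisons-corrected significance cut-off and a median β-value difference of 20%) can be sketched as below. This is an illustrative implementation of the Benjamini-Hochberg adjustment under assumed inputs, not the code used for the analysis; the p-values and β-value differences are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q

def call_differential(p_values, delta_beta, q_cutoff=0.05, delta_cutoff=0.2):
    """Flag probes with q < q_cutoff and |median beta difference| > delta_cutoff."""
    qs = benjamini_hochberg(p_values)
    return [q < q_cutoff and abs(d) > delta_cutoff
            for q, d in zip(qs, delta_beta)]

# Hypothetical per-probe p-values and median beta-value differences.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
d = [0.35, -0.27, 0.05, 0.22, 0.01]
q = benjamini_hochberg(p)
calls = call_differential(p, d)
print(calls)
```

A probe must pass both thresholds to be called, which corresponds to the upper-left and upper-right regions of a volcano plot.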
Figure 4.1. Identification of DNA methylation differences between lung adenocarcinoma tumors from
smokers and never-smokers. (A) Correlation matrix of median β-values of tumors from current smokers vs.
never-smokers, with the Spearman rho correlation given in the top left corner. (B) Volcano plot of the differential
DNA methylation analysis between smokers and never-smokers. The vertical dotted lines mark 20% change in
β-values; the horizontal dotted line marks the significance cut-off. (C) Correlation plot of DNA methylation vs.
expression for LGALS4. Spearman rho correlation coefficient is provided on top.
Table 4.1. Genes showing statistically significant differential DNA
methylation between current and never-smoker tumors.
           DNA methylation                  Gene expression
HUGO^a     Q-value   β-value difference     p-value   Fold change
IRF8       0.018      0.347                 0.064     0.766
IHH        0.046      0.283                 0.555     0.938
LGALS4     0.02       0.218                 0.009     0.357
IL18BP     0.046      0.214                 0.111     0.785
VTN        0.032      0.204                 0.888     1.006
KLF11      0.046     -0.27                  0.37      1.117
^a Human Genome Organization nomenclature. Genes are sorted by decreasing β-value difference
(median β-value in current smoker tumors – median β-value in never-smoker tumors).
A scatter plot of DNA methylation vs. gene expression for LGALS4, which
encodes a galactoside-binding protein involved in cell-cell and cell-matrix
interactions, demonstrates the dramatic hypermethylation and downregulation of
the gene in current smoker tumors (Figure 4.1C).
Examination of the clinical characteristics of the tumors (Table 3.1) yielded
the expected statistically significant correlations of smoking with KRAS mutations
and never-smoking with EGFR mutations (Sun et al. 2007). However, we also
noted a statistically significant correlation of smoking status with race, reflecting a
bias in this tumor collection. To address this issue, we performed a multiple linear
regression analysis, and found that even with adjustment for race, KRAS and
EGFR status, the correlation between LGALS4 and smoking remained
statistically significant. In addition, a comparison of Asian and Caucasian tumors
did not identify LGALS4 as significantly differentially methylated between the two
populations (Figure 4.2). Lastly, in an analysis limited to only Caucasian tumors
(9 never-smokers vs. 28 current-smokers), LGALS4 was still found to be
differentially methylated (Wilcoxon p […]; median β-value difference = 0.22).
The limited number of significant differences between current and never-
smoker lung adenocarcinomas and their highly similar global DNA methylation
profiles (Figure 4.1A) prompted further investigation. We therefore examined 30
genes that had previously been reported to show differences between smokers
and never-smokers.
Of these, 14 genes showed statistically significant differences in DNA
methylation, but only one gene, CDKN2A, showed a median β-value difference of
more than 20% between current and never-smokers (Table 4.2). Our results
therefore suggest that smoking status did not greatly influence the DNA
methylation profiles of the tumors in our collection.
Differential DNA methylation analysis between Asians and Caucasians
Due to a bias in this tumor collection, there is a statistically significant correlation
of smoking status with race (Table 3.1, p<6.15x10^-8). Since we found few DNA
methylation differences between smoker and never-smoker tumor and non-tumor
tissues, we decided to investigate a potential confounder by comparing the DNA
methylation profiles between the Asians and Caucasians in our cohort. We did
not observe any statistically significant differences between the tumor tissues of
the different races (data not shown). Our analyses of non-tumor lung tissues
(NTL) also found very similar profiles between the two ethnicities, but we did find
three probes that were significantly more hypermethylated in Caucasian NTL
than Asian NTL (Figure 4.2). Two of these probes represented the gene
PM20D1, a peptidase with unknown biological function, while one probe
represented SULT1C2, a glutathione sulfur transferase (GST) involved in the
metabolism of environmental toxins. Encouragingly, when the analysis was
limited to NTL tissues from never-smoker subjects only, SULT1C2 was still
Table 4.2. Analysis of genes previously identified as significantly
differentially methylated between tumors of current and never-smokers
Probe name    HUGO^a     Wilcoxon rank p-value   β-value difference
cg09099744    CDKN2A     0.015                    0.264
cg25156443    SFRP5      0.004                    0.140
cg13398291    SFRP1      0.042                    0.129
cg13759328    CDH13      0.013                    0.125
cg12128839    HOXA5      0.003                    0.123
cg10210238    CDKN2B     0.041                    0.122
cg07694025    SFRP2      0.034                    0.111
cg27196745    PTPRO      0.021                    0.094
cg17482740    DNMT3B     0.028                    0.078
cg08331313    SPARC      0.017                    0.057
cg17129141    MSH2       0.038                   -0.006
cg15043975    RASSF1     0.004                   -0.009
cg22215728    FHIT       0.014                   -0.034
cg08797471    DAPK1      0.036                   -0.047
^a Human Genome Organization nomenclature. Genes are sorted by decreasing β-value difference
(median β-value in current smoker tumors – median β-value in never-smoker tumors).
Figure 4.2. SULT1C2 DNA methylation differences between Caucasians and
Asians. (A) Volcano plots showing comparison of DNA methylation profiles
between Asians and Caucasians show three statistically significantly different
probes (Q<0.05, beta-value difference>0.2) highlighted in green. Left: All normal
lungs, Right: Never-smoker lungs only. (B) DNA methylation beta-values
comparing SULT1C2 between Asians and Caucasians.
statistically significantly hypermethylated in Caucasians, with a median beta-
value difference of 0.27 (Figure 4.2).
Importantly, an examination of gene expression levels for the SULT1C2
gene also shows a statistically significant negative correlation between DNA
methylation and gene expression (both expression probes p<0.001, Figure 4.3).
Therefore, while we found no statistically significant differences between tumor
tissues of the two ethnicities, we did observe higher levels of DNA methylation of
the SULT1C2 gene in the non-tumor lung of Caucasians that was correlated to
levels of gene expression.
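A rank-based correlation between DNA methylation and expression, of the kind reported as Spearman rho in Figure 4.1, can be sketched as follows. Whether the SULT1C2 comparison used Spearman or another statistic is not stated here, so this is only an illustration under that assumption; the β-values and log2 expression values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical beta-values and log2 expression values for five samples.
beta = [0.10, 0.25, 0.40, 0.60, 0.80]
log2_expr = [9.1, 8.7, 8.8, 7.2, 6.5]
rho = spearman_rho(beta, log2_expr)
print(round(rho, 3))
```

A negative rho indicates that samples with higher methylation tend to have lower expression, which is the pattern described for SULT1C2.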
Class discovery: Identification of DNA methylation sub-groups in lung
adenocarcinoma
We next performed an unsupervised analysis of the entire 59-tumor set to
identify any intrinsic sub-classes based on DNA methylation which could then be
investigated in relation to known clinical features. We carried out the analysis
using either the top 5000 most variable probes within the tumors, or the 766
differentially methylated genes (Figure 4.4 and Figure 4.5). Both hierarchical
clustering analyses identified two distinct clusters, with only two tumors changing
memberships between the two clustering approaches. While there were no
statistically significant differences in clinical stage, gender, race, LKB1 or EGFR
mutation status or survival between the two clusters (all p>0.05, Figure 4.4,
Figure 4.3. DNA methylation and gene expression levels for SULT1C2 in
NTL tissues. Scatterplot of DNA methylation (beta-value) and log2 expression
values for two probes matched to the SULT1C2 gene show negative correlations.
Figure 4.4. Hierarchical clustering of tumors identifies two distinct DNA-methylation based
clusters. (A) Two-dimensional hierarchical clustering of the top 5000 most variant probes amongst
59 tumors. Rows are probes; columns are samples. Fisher p-values for different sample
parameters are shown on the left, parameters are indicated at right (the listed characteristic is
marked as a black tick mark, except as indicated in the key beside the heatmap). The two main
clusters are marked in color at the top of the heatmap. (B) Associations of cluster membership
with KRAS mutation status (left) and smoking status (right). (C) KRAS mutation types in each
cluster.
Figure 4.5. Hierarchical clustering of tumors using 766 genes also identifies two distinct
DNA-methylation based clusters. (A) Two-dimensional hierarchical clustering using the 766
genes previously identified to be significantly differently methylated between tumors and NTL.
Rows are probes; columns are samples. Fisher p-values for different sample parameters are
shown on the left, parameters are indicated at right (the listed characteristic is marked as a black
tick mark, except as indicated in the key beside the heatmap). The two main clusters are marked
in color at the top of the heatmap. (B) Silhouette plots of hierarchical clustering analyses using
top 5000 most variable probes as well as the 766 differentially methylated genes.
Figure 4.6. No survival differences between the two DNA methylation
based clusters. Kaplan-Meier survival plots for the two clusters do not
show a statistically significant difference.
Figure 4.5 and Figure 4.6), we did observe statistically significant associations
between cluster membership and KRAS mutation and smoking status (Fisher
p=0.0052 and p=0.02, respectively, Figure 4.4B). These results hold true with
both hierarchical clustering approaches (Figure 4.5).
We further examined the relationship between KRAS mutation, smoking
status and cluster membership, and determined that within KRAS mutants, there
is no association between DNA methylation cluster membership and smoking
(Fisher p=0.21), whereas within current smokers, there is a significant
association between KRAS mutation and Cluster 1 (Fisher p=0.02), indicating
that KRAS rather than smoking is associated with the more heavily methylated
cluster. Lastly, we examined whether KRAS mutation sub-types might segregate
between the clusters, but found no significant association between subtypes and
cluster membership (Fisher’s p<0.49, Figure 4.4C), although this analysis was
limited by the modest number of samples in this study.
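The associations between cluster membership and categorical features such as KRAS mutation status were assessed with Fisher's exact test. A minimal two-sided implementation for a 2x2 table is sketched below; the cell counts are hypothetical and do not reproduce the contingency tables of this study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows = Cluster 1 / Cluster 2,
# columns = KRAS mutant / KRAS wildtype.
p_value = fisher_exact_two_sided(12, 10, 5, 32)
print(p_value)
```

The same function can be applied within a stratum, for example to current smokers only, by building the table from that subset of tumors.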
To gain insight into the origins of the two sample clusters, we compared DNA
methylation of the 5000 most variable probes amongst the tumor tissues, finding
that 962 probes (753 genes) were significantly more highly methylated in Cluster
1 vs. Cluster 2, and that only one gene was significantly more methylated in
Cluster 2 in comparison to Cluster 1 (RUNX1) (Figure 4.7 and Appendix D Table
1). The differentially hypermethylated genes were significantly enriched in GO
biological processes including embryonic development, regulation of
transcription, cell-cell signaling, cell morphogenesis and extracellular matrix (BH-
adjusted p<0.05). However, none of these differentially methylated genes
showed a corresponding significant change in gene expression between the two
clusters. The functional consequences of the differential DNA methylation are
therefore still undetermined. We also examined whether the six genes that were
differentially methylated between smokers and never smokers showed any
significant differences in DNA methylation and/or expression between the two
clusters and found that IHH, IRF8, VTN, IL18BP, and LGALS4 all were
significantly hypermethylated in Cluster 1 (BH-corrected p-value cutoff of
p<0.0083) and of these, IHH, IRF8 and VTN met our median β-value difference
cut-off of 20%. However, the genes were not significantly differentially expressed.
The association of KRAS mutations with Cluster 1 membership led us to
investigate whether KRAS mutant tumors have differential DNA methylation
profiles in comparison to KRAS wildtype tumors (Figure 4.8, Table 4.3 and
Appendix B Table 2). 93 genes were statistically significantly more methylated in
KRAS mutant tumors, while 3 genes were statistically significantly less
methylated. 91 out of the 93 hypermethylated genes were also hypermethylated
in Cluster 1. However, none of these genes showed statistically significant
corresponding changes in gene expression.
To examine whether the differential DNA methylation between the two
clusters could be due to differential expression of DNA methyltransferase
proteins or other proteins known to influence DNA methylation levels, we
compared expression levels of DNMT1, DNMT3A, DNMT3B, DNMT3L, TET1,
TET2, and TET3 between the two clusters (data not shown).
We found differences in DNMT3A and DNMT3L expression levels
between Cluster 1 and Cluster 2 (p<0.05 and p<0.03 respectively), but they do
not survive a multiple comparisons correction, and the median fold change was
modest (<1.2-fold; data not shown). However, when we examined for any gene
expression differences between the two clusters, we identified 36 genes that
were statistically significantly differentially expressed, with seven genes meeting
a two-fold cutoff (Figure 4.9 and Table 4.4). Interestingly, three out of these
seven genes were cytokines (CXCL9, CXCL10 and CXCL14), while another
gene, PHLDA1, encodes a regulator of IGF1-mediated apoptosis (Toyoshima et
al. 2004).
Figure 4.7. DNA methylation differences between clusters. Volcano plot
showing statistically significant DNA methylation alterations between the two clusters.
The vertical dotted lines mark 20% change in β-values; the horizontal dotted line marks
the significance cut-off.
Figure 4.8. DNA methylation differences between KRAS mutant and
wildtype tumors. Left: Correlation matrix of median β-values of tumors from
KRAS mutant vs. wildtype tumors, with the Spearman rho correlation given in the
top left corner. Right: Volcano plot of differential DNA methylation analysis
between KRAS wildtype and KRAS mutant tumors. The vertical dotted lines mark
20% change in β-values; the horizontal dotted line marks the significance cut-off.
Table 4.3. Top genes showing statistically significant DNA methylation
differences between KRAS mutant and KRAS wildtype tumors
           DNA methylation
HUGO^a     Q-value      β-value difference^b
GRASP      5.428E-05     0.610
CNRIP1     0.002         0.449
RUNX3      0.045         0.422
FOXE3      0.020         0.390
BCAN       1.239E-04     0.389
HOXA13     0.002         0.376
SOX7       0.006         0.362
DLX5       0.001         0.356
RIMS4      0.010         0.350
FOXF1      2.881E-05     0.340
KLF11      0.002        -0.259
TULP2      0.033        -0.209
FGF19      0.032        -0.206
^a Human Genome Organization nomenclature
^b Genes are listed by decreasing β-value difference; the top 10 genes hypermethylated and the
top 3 hypomethylated genes in KRAS mutated tumors are listed.
Figure 4.9. Gene expression differences between the two clusters. Volcano
plot showing statistically significant gene expression differences between the two
clusters. The vertical dotted lines mark a 2-fold change in expression; the horizontal
dotted line marks the significance cut-off.
Table 4.4. Genes showing statistically significant differences in gene expression
between Cluster 1 and Cluster 2 tumors.
HUGO^a    Q-value   Fold change (log2)   Function                                                         Cancer references
EFHD1     0.040     -1.249               Undetermined                                                     NA
PHLDA1    0.050      1.016               Involved in apoptosis                                            (Neef et al., 2002; Sakthianandeswaren et al., 2011)
DCBLD2    0.068      1.123               Undetermined                                                     (Kim et al., 2008; Koshikawa et al., 2002)
GBP1      0.008      1.147               Binds GMP, GDP and GTP, immune response                          (Duan et al., 2006; Guenzi et al., 2001)
CXCL9     0.050      1.267               Cytokine affecting growth, movement of immune response cells     (Amatschek et al., 2011; Andersson et al., 2011)
CXCL10    0.041      1.369               Chemotactic for immune cells, and involved in adhesion and migration   (Andersson et al., 2011; Shimizu et al., 2011)
CXCL14    0.062      2.069               Cytokine involved in immune and inflammatory processes           (Augsten et al., 2009; Tessema et al., 2010)
^a Human Genome Organization nomenclature
Discussion
Distinct mutational and gene expression differences between lung
adenocarcinomas of smokers and never-smokers have been frequently noted
(e.g. Belinsky et al. 2002; Toyooka et al. 2003; Pao et al. 2004; Pao et al. 2005b;
Sun et al. 2007; Landi et al. 2008). Previous candidate-gene studies have
identified several genes as differentially methylated between the two groups
(e.g. Belinsky et al. 2002; Pulling et al. 2003), findings which were recapitulated in our
study albeit with smaller differences. It should be taken into consideration that the
second most common form of lung cancer, squamous cell carcinoma, occurs
predominantly in smokers, while adenocarcinoma is the most common lung
cancer histology in never smokers.
Unlike this study, which focuses exclusively on lung adenocarcinoma,
previous studies may not have always corrected for histology, potentially leading
to larger observed differences. The modest differences we detect may therefore
be due to the different histological composition as well as to differences in
methodology. Two recent studies suggest that the link between tobacco smoke
carcinogens and DNA methylation may be more complex. In a study comparing
peripheral blood DNA methylation profiles of smokers and non-smokers only one
differentially methylated locus was found (Breitling et al. 2011). An in vitro study
of human lung cells chronically exposed to a tobacco carcinogen also showed
little to no effect on DNA methylation profiles in treated vs. untreated cells
(Tommasi et al. 2010).
In our genome-wide supervised approach, we noted only six significantly
differentially methylated genes between smokers and never-smokers and of
these only LGALS4 showed a corresponding downregulation in gene expression
in current smoker tumors. LGALS4 has been implicated in several cancers,
including gastric, colon, and sinusoidal adenocarcinomas (Sakakura et al. 2005;
Tripodi et al. 2009; Watanabe et al. 2011). Recently, a mechanism for the
involvement of LGALS4 as a tumor suppressor in colorectal cancer was
proposed involving the WNT signaling pathway (Satelli et al. 2011). This is
especially interesting given the well-established role that WNT signaling plays in
the development of lung cancer (Mazieres et al. 2005; Nguyen et al. 2009).
Importantly, previous studies have suggested that the WNT pathway is involved
in cigarette smoke-induced tumorigenesis (Lemjabbar-Alaoui et al. 2006;
Hussain et al. 2009). To our knowledge, this is the first report of differential
expression of LGALS4 between smoker and never-smoker lung tumors and of
DNA methylation as a potential regulator of LGALS4 expression. Functional
validation of LGALS4 regulation in lung adenocarcinoma is a highly interesting
avenue of future investigation.
Our analysis also identified a locus, SULT1C2, that was differentially
methylated in the non-tumor lung (NTL) tissues of Asian and Caucasian
subjects. Although in the United States more than 90% of lung cancer in men
and 75% of lung cancer in women is attributable to cigarette smoking, this
proportion is much lower within Asian women with lung cancer (Subramanian
and Govindan 2007), leading to a hypothesis of higher lung cancer risk amongst
Asian never-smokers and a search for alternative risk factors. Several studies
have identified cooking fumes as a possible risk factor for lung cancer among
nonsmoking Asian women (Ko et al. 1997; Ko et al. 2000; Yu et al. 2006). Deep-
frying and stir-frying of certain foods releases numerous compounds, amongst
them heterocyclic aromatic amines, which upon bioactivation in the body can
form DNA damage-inducing reactive intermediates (Eisenbrand and Tang 1993;
Schut and Snyderwine 1999; Sugimura 2000). Individual rates of metabolizing
such toxins would therefore influence susceptibility of an individual to the toxic
effects of each pollutant.
SULT1C2 is a glutathione sulfur transferase (GST), a class of enzymes
which catalyzes the transfer of sulfur-based moieties to a variety of substrates.
Sulfurtransferases play important roles in many biological processes including
the metabolism of environmental toxins such as heterocyclic aromatic amines as
well as chemotherapeutic drugs. While SULT1C2 has not been extensively
studied, a polymorphism in the gene has been associated with increased relapse
risk in acute myeloid leukemia patients (Monzo et al. 2006). Functional
polymorphisms in other GST enzymes have also been associated with increased
risk of lung cancer (Hosgood et al. 2007) and breast cancer (Zheng et al.
2001), underscoring the important roles that these enzymes may play in the
development of cancer. In addition to SNPs which potentially affect enzyme
efficacy, Glatt et al. found that the expression levels of SULT1A1 and SULT1C2
proteins themselves can also affect the bioactivation rates of heterocyclic
aromatic amines (Glatt et al. 2004). Given the inherently reversible nature of
epigenetic changes, a finding of a potential disease or risk-associated DNA
methylation change would be especially exciting as it readily presents a
potential avenue for diagnosis and therapy. Further research into SULT1C2 and
DNA methylation is therefore being actively pursued.
Tumor heterogeneity is increasingly recognized as crucial for patient
classification, prognostication and treatment (Pao and Girard 2011; Travis et al.
2011). Genome-wide DNA methylation profiling has led to increased knowledge
of epigenetic subtypes of colorectal cancer, glioblastoma and multiple myeloma,
among others (Noushmehr et al. 2010; Hinoue et al. 2011; Walker et al. 2011).
The best-established DNA methylation-based subgroup is that of CpG island
methylator phenotype (CIMP), first identified in colorectal cancer (Toyota et al.
1999). CIMP tumors possess high frequency and levels of cancer-specific DNA
methylation at loci which show little or no methylation in non-CIMP tumors. CIMP
sometimes shows differences in patient survival and is closely associated with
BRAF activating mutations, but the molecular mechanism underlying CIMP has
not yet been elucidated (Teodoridis et al. 2008). The existence of CIMP has been
suggested in NSCLC (Marsit et al. 2006; Suzuki et al. 2006) using a very limited
number of genes. However, another study did not support this conclusion
(Vaissiere et al. 2009). Our use of 27,578 probes enabled a more thorough
examination of DNA methylation, and we find no evidence for classic CIMP in
lung cancer.
An additional epigenotype, termed CIMP-low (CIMP-L, Intermediate-
methylation epigenotype, or CIMP2) has been reported and confirmed in several
independent populations of colorectal tumors using different methodologies
(Ogino et al. 2006; Shen et al. 2007; Yagi et al. 2010; Hinoue et al. 2011). CIMP-
low exhibits moderately high levels of DNA hypermethylation at a subset of
CIMP- associated loci, and in each study was found to be associated with KRAS
mutation. Although the current lung adenocarcinoma study is limited by a modest
sample size, we too observe an epigenetic subtype of lung adenocarcinoma with
higher DNA methylation levels that is associated with KRAS mutation. In 2006,
Toyooka et al. observed a higher methylation index in KRAS mutant tumors in
comparison to KRAS wildtype tumors (Toyooka et al. 2006). An examination of
KRAS wildtype vs. mutant tumors showed DNA methylation changes in 93
genes; however, no corresponding changes in gene expression were observed.
Just like CIMP-low in colorectal cancer, KRAS mutations might not drive this
epigenetic subgroup, and a more complex molecular mechanism may cause the
observed epigenetic heterogeneity (Hinoue et al. 2011). The hypermethylated
cluster was also associated (albeit more weakly) with smoking status, which is
not surprising given the fact that KRAS mutations are most common in smokers
(Ahrendt et al. 2001). Our observation that DNA methylation differences between
KRAS mutant and wildtype tumors are more pronounced than between smokers
and never-smokers (Fig. 4) leads us to speculate that previously reported
associations between smoking and DNA methylation might be driven by KRAS
mutations, rather than smoking status.
To begin investigating the association between KRAS mutation and
increased DNA methylation, we examined gene expression differences between
the high methylation and low methylation clusters, in the hope of identifying
genes that may play a role in tumor heterogeneity. We did not observe large
expression differences in known DNMT or TET genes, but did identify seven
genes showing expression levels that were statistically significantly different and
at least 2-fold changed. Three out of the seven genes were cytokines recently
shown to play a role in tumorigenesis (Andersson et al. 2011), while PHLDA1 is a
regulator of IGF1-mediated apoptosis (Toyoshima et al. 2004) and has recently
been suggested to be a putative epithelial stem cell marker in the human
intestine (Sakthianandeswaren et al. 2011). The role of cytokines in
tumorigenesis is of increasing interest (Dranoff 2004; Li et al. 2011), and the
connection between DNA methylation, KRAS mutation, cytokines and cancer is
an intriguing avenue of investigation (Yoshikawa et al. 2001; Galm et al. 2003;
Niwa et al. 2005; Sunaga et al. 2011).
As with CIMP-H and CIMP-L in colorectal cancer, our integrated gene
expression and DNA methylation analyses of the two tumor clusters showed few
genes for which DNA methylation and gene expression demonstrated strong
inverse correlations (Hinoue et al. 2011), suggesting that the majority of these
changes are “passenger” DNA methylation events, or that there is substantial
heterogeneity in the tumor population that makes such correlations more difficult
to discern. Finally, while we do not find a survival difference for our epigenetic
subtype, this may be due to sample size limitations and merits further
investigation.
In summary, our DNA methylation profiling of 59 lung adenocarcinomas
and matched adjacent non-tumor lung tissue accomplished three goals: 1) the
identification of numerous new cancer-specific DNA methylation changes which
can be pursued as potential lung adenocarcinoma biomarkers, 2) the
identification of potentially functional DNA methylation changes that may
constitute driver alterations in lung adenocarcinoma, and 3) the identification of
an epigenetic sub-group of lung adenocarcinoma with higher levels of DNA
methylation that is correlated to KRAS mutation and is reminiscent of CIMP-L in
colorectal cancer. Our observations lay the groundwork for further diagnostic and
mechanistic studies of lung adenocarcinoma that could lead to improvements in
detection, patient classification and therapy.
CHAPTER 5
GENOME-SCALE DNA METHYLATION PROFILES FROM
FORMALIN-FIXED PARAFFIN-EMBEDDED TISSUES
Chapter 5 Abstract
DNA methylation is an important epigenetic modification affecting gene
regulation in both physiological and pathological processes. Aberrant DNA
methylation and resultant changes in gene expression have been observed in a
wide variety of diseases. The emergence of genome-scale technologies such as
the Illumina Infinium HumanMethylation27 BeadChip, a robust platform which
enables the interrogation of 27,578 CpG sites covering 14,473 genes, has
prompted a shift in DNA methylation studies from a candidate locus-based
approach to very high throughput DNA methylation profiling. However, the
Infinium HumanMethylation27 BeadChip was designed for high quality input
DNA. The bottleneck for many studies is therefore the procurement and
processing of high-quality frozen tissue. If clinically collected archival formalin-
fixed paraffin-embedded (FFPE) tissues could be used for the Infinium
HumanMethylation27 platform, this would greatly increase the scope of studies
that could be performed using this platform. Here we report the successful DNA
methylation profiling of archival FFPE samples from six lung cancer subjects
using the HumanMethylation27 BeadChip.
Introduction
DNA methylation in mammals consists of the covalent addition of a methyl
group to the 5-carbon position of a cytosine within a CpG dinucleotide. This
epigenetic modification plays an important role in gene expression regulation in
many biological and pathological processes (Bird 2002; Robertson 2005).
Aberrant DNA methylation consisting both of local hypermethylation of certain
CpG islands and global DNA hypomethylation is a hallmark of many cancers
(Ehrlich 2002).
DNA methylation detection procedures use variations on three central
techniques: 1) methylation-sensitive restriction enzymes, 2) bisulfite conversion
(a chemical treatment that converts unmethylated Cs to Us but leaves
methylated Cs intact) usually followed by various forms of PCR-based detection,
or 3) immunoprecipitation/purification of methylated DNA. Recent technological
developments allowed DNA methylation studies to expand from candidate loci-
based approaches such as methylation-specific PCR (MSP), MethyLight and
traditional bisulfite-sequencing to more genome-scale profiling, including
restriction enzyme, bisulfite and affinity-based microarrays and sequencing
methods (Laird 2010). One such approach is the DNA methylation platform
developed by Illumina, the Infinium HumanMethylation27 BeadChip (Bibikova et
al. 2009). This platform allows the interrogation of 27,578 probes covering 14,473
genes. This method utilizes bisulfite conversion, whole-genome amplification
(WGA) and hybridization to probes specific to methylated and unmethylated
molecules. Due to the inclusion of the WGA step, the DNA used in Infinium
experiments should be of good quality with the stated requirement of a minimum
length of 1kb. This would preclude the usage of formalin-fixed, paraffin-
embedded (FFPE) DNA, which is typically highly degraded.
Formalin-fixation followed by paraffin-embedding is a routine method of
tissue storage that has been used for decades in clinics and hospitals. If such
material could be used for the Infinium HumanMethylation27 BeadChip, it would
allow investigators to capitalize on the large number of archival samples obtained
during routine clinical practice. It would also facilitate the study of uncommon or
rare diseases, since it would make the largest possible number of samples
accessible for study. A previous study reported the usage of FFPE DNA on the
Infinium HumanMethylation27 platform, with the addition of a random ligation
step in order to increase DNA fragment length (Thirlwell et al. 2010). While
encouraging, the addition of a ligation step adds considerable labor, variation and
cost. Here we demonstrate that un-ligated DNA from FFPE blocks can be
successfully interrogated on the Infinium HumanMethylation27 platform.
Materials and Methods
Samples
All human tissue samples were obtained with appropriate informed consent and
in accordance with the policies of the University of Southern California
Institutional Review Board (protocol HS-06-00447). FFPE blocks were derived
from archival clinical tissue remnants from patients undergoing surgery for
suspected lung cancer at the University of Southern California University
Hospital. Frozen samples were derived from remnant tissue that had been
collected for research purposes. Five subjects had both frozen and formalin-fixed
paraffin-embedded (FFPE) lung tumor and matched adjacent non-tumor lung
(NTL) available, and one subject had only frozen and FFPE NTL tissues (Table
5.1). Three of the samples were stored in FFPE in 2007, two in 2008 and one in
2009. The identities of the subjects were not made available to the laboratory
investigators.
Table 5.1. Subject and sample characteristics (a).
Year    Samples      Tumor (Frozen)    Tumor (FFPE (b))    NTL (Frozen)    NTL (FFPE)
2007    Subject 1    Y                 Y                   Y               Y
2007    Subject 2    Y                 Y                   Y               Y
2007    Subject 3    Y                 Y                   Y               Y
2008    Subject 4    Y                 Y                   Y               Y
2008    Subject 5    N                 N                   Y               Y
2009    Subject 6    Y                 Y                   Y               Y
(a) Y = sample available; N = sample unavailable.
(b) Formalin-fixed paraffin-embedded.
DNA extraction
10 x 10 µm frozen sections were cut and stained with histogene. For each
sample, a reference H&E stained section was evaluated by MNK, an
experienced lung pathologist who marked the areas to be microdissected. Slides
were manually microdissected under the microscope using a 23GTW sterile
needle (Becton Dickinson and Company, Franklin Lakes, New Jersey) and DNA
was isolated using TRIzol® (Life Technologies Corporation, Carlsbad,
California). For FFPE samples, 10 x 10 µm unstained sections were cut,
deparaffinized by baking at 90°C, and stained by H&E. The staining was
performed as follows: 3 minutes in CitriSolv (Fisher Scientific, Pittsburgh,
Pennsylvania) (repeated 2 times), 10 dips in dehydration alcohol 100 (EMD
Chemicals Inc., Gibbstown, New Jersey), 10 dips in dehydration alcohol 95 (EMD
Chemicals Inc., Gibbstown, New Jersey), 2 washes in water, 2.5 minutes in
Hematoxylin 7211 (Richard-Allan Scientific, Kalamazoo, Michigan), 2 washes in
water, 2 minutes in blueing reagent (EMD Chemicals Inc., Gibbstown, New Jersey),
2 washes in water, 3 dips in Eosin Y (Fisher Scientific, Pittsburgh,
Pennsylvania), 10 dips in dehydration alcohol 95, 10 dips in dehydration alcohol
100 (repeated 2 times), 1 minute in CitriSolv. Slides were manually
microdissected under the microscope using a 23GTW sterile needle and DNA
was extracted by proteinase K digestion. Microdissected tissue was incubated
overnight at 50°C in a buffer consisting of 15 µL of a solution containing 100 mM
TrisHCl (pH 8.0) and 10 mM EDTA (pH 8.0), 2 µL of 1 mg/ml proteinase K, and 1
µL of 1 mg/ml glycogen (Fermentas Life Sciences, Glen Burnie, Maryland) per
18 µL of the buffer.
DNA purification and quantification
A volume of 3M sodium acetate equivalent to 10% of the sample volume was
added. Tris-saturated phenol (pH 8.0) was added for extraction in a volume equal
to the total sample volume. Following mixing, the aqueous top layer was removed
and a volume of 24:1 chloroform-isoamyl alcohol equal to the sample volume was
added to it. The aqueous top layer was again removed after mixing, 1
mL of 100% alcohol was added to it, and the sample was incubated at -80°C for a
minimum of 2 hours. Samples were spun at 4°C for 10 minutes and the supernatant was
subsequently removed. Pellets were washed with 500 µL of 70% ethanol. The
supernatant was removed after mixing and the pellets were resuspended in 50 µL
of TE. DNA was quantified using NanoDrop (Wilmington, DE).
DNA methylation analysis
DNA methylation analysis was performed using the Illumina Infinium
HumanMethylation27 BeadChip (Bibikova et al. 2009). A minimum of 1.25 µg of
DNA was provided to the USC Epigenome Center for bisulfite conversion and quality
assessment, and the analysis was performed according to the manufacturer’s
protocol. Bisulfite conversion was performed using the EZ-96 DNA Methylation
Kit (Zymo Research, Orange, CA) with additional cycling at 90°C for 30 seconds
and then 50°C for one hour, for up to 16 hours total. Bisulfite-treated DNA was
subjected to quality control tests for DNA amount and bisulfite conversion
(Campan et al. 2009).
Data analysis
DNA methylation values from the BeadChip hybridizations were provided as
beta-values, which are the ratio of the methylated fluorescent signal (M) over the
total (methylated and unmethylated, M+U) signal, ranging from 0 to 1. Zero
represents no detectable methylation, and 1 represents complete methylation for
that particular probe. Data points with detection p values of >0.05 were masked
as “NA” since they were not significantly different from background. Wilcoxon
tests were performed with GraphPad Prism version 5.00 for Windows (GraphPad
software, San Diego California USA, www.graphpad.com), while correlation and
cluster analyses were completed using R 2.11.1. Cluster analysis was performed
on the most variant probes using a standard deviation cutoff of 0.075 (6202
probes), with Euclidean distance and Ward two-dimensional hierarchical
clustering.
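The beta-value definition, the masking of unreliable data points, and the selection of the most variant probes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in this study (the analyses here were run in GraphPad Prism and R). The function names and probe identifiers are hypothetical, the formula follows the M/(M+U) ratio as given in the text (vendor software may add a small constant offset to the denominator, which is not modeled), and the use of the sample standard deviation is an assumption.

```python
from statistics import stdev


def beta_value(m, u):
    """Return M / (M + U), or None when there is no signal at all."""
    total = m + u
    return m / total if total > 0 else None


def masked_beta(m, u, detection_p, p_cutoff=0.05):
    """Return the beta-value, or None ("NA") if detection p exceeds the cutoff."""
    if detection_p > p_cutoff:
        return None  # not distinguishable from background
    return beta_value(m, u)


def most_variant_probes(betas_by_probe, sd_cutoff=0.075):
    """Keep probes whose SD across samples exceeds sd_cutoff (NA values ignored).

    Assumption: sample standard deviation; the text does not state which
    estimator was used.
    """
    selected = []
    for probe, values in betas_by_probe.items():
        observed = [v for v in values if v is not None]
        if len(observed) >= 2 and stdev(observed) > sd_cutoff:
            selected.append(probe)
    return selected


if __name__ == "__main__":
    # Hypothetical intensities: methylated signal 300, unmethylated signal 100.
    print(beta_value(300, 100))  # 0.75
```

The selected probes would then be passed to a hierarchical clustering routine; that step is not shown here because it depends on the clustering library used.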
Results
In order to evaluate the performance of FFPE DNA on the Infinium
HumanMethylation27 platform, we compared a total of eleven matched pairs of frozen
and FFPE tissues. Five of these pairs were lung adenocarcinomas, and six were
adjacent non-tumor lung tissue (NTL). We assessed the resulting data based on
the following criteria: number of probe failures, beta value distribution and
variance in the data, correlation coefficient between data from frozen and
matched FFPE samples, and the examination of DNA methylation values at loci
known to be hypermethylated in lung adenocarcinoma.
Probe Failures
The number of probe failures (probes with detection p-values >0.05) is an
indication of DNA quality. We therefore wanted to compare probe failure rates
between frozen and FFPE tissues. In general, FFPE samples performed more
variably with an average of 1.3% probe failures (range=0-7.2%) in comparison to
0.39% probe failures (range=0.01-3.2%) in the frozen samples (Figure 5.1). The
difference in probe failure rates, however, was not statistically
significant (p>0.167).
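As an illustration of the probe failure metric, the percentage of failed probes per sample and the rank-based comparison between two groups of samples could be computed as in the following sketch. The function names are hypothetical, and only the Mann-Whitney U statistic is computed; deriving a p-value from it would require a statistics package or tables, which is outside this sketch.

```python
def probe_failure_rate(detection_pvalues, p_cutoff=0.05):
    """Percentage of probes whose detection p-value exceeds p_cutoff."""
    if not detection_pvalues:
        raise ValueError("no probes supplied")
    failed = sum(1 for p in detection_pvalues if p > p_cutoff)
    return 100.0 * failed / len(detection_pvalues)


def mann_whitney_u(x, y):
    """U statistic for group x versus group y.

    Counts the pairs (a, b) with a from x and b from y where a > b;
    tied pairs contribute one half.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

In use, `probe_failure_rate` would be applied to each frozen and each FFPE sample, and the two resulting lists of percentages passed to `mann_whitney_u`.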
Figure 5.1. Probe failures in frozen and FFPE tissue samples. Left:
Percentage of probe failures, Mann-Whitney p>0.167. Right: Variance as
measured by the 90th percentile minus the 10th percentile, Mann-Whitney p>0.254. Red:
tumor; blue: NTL tissue.
Beta-value distribution and variance
DNA methylation data from the Infinium HumanMethylation27 typically shows a
bimodal distribution, with a large number of probes with beta-values closer to
zero, representing no or lower levels of methylation, and a subset of methylated
probes closer to 1, representing more methylated probes. Poor quality DNA
tends to show a loss of variance, with beta values merging towards the middle.
Retention of variance is thus an important aspect to consider in using FFPE
samples on the Infinium HumanMethylation27 platform. We did not observe a
significant difference in variance between the FFPE and frozen tissue samples,
both in the tumor and NTL tissues (Figure 5.1). Thirlwell et al. showed a large
loss of variance in an unligated FFPE sample, corresponding to a tempering of
extreme beta-values and loss of key DNA methylation information (Thirlwell et al.
2010). However, we did not see such drastic re-distribution of beta-values in any
of our FFPE samples, even in our oldest FFPE sample, embedded in 2007
(Subject 1, Figure 5.2 and Figure 5.3). We did observe some diminished signal in
the FFPE distribution, but the difference between
the two groups was not statistically significant (Figure 5.1, p>0.254).
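The variance measure used in Figure 5.1 (the 90th percentile minus the 10th percentile of a sample's beta-values) can be sketched as follows. The function names are hypothetical, and the linear-interpolation percentile is an assumption, since the text does not specify which percentile definition was used.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of an ascending list."""
    if not sorted_values:
        raise ValueError("no values supplied")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)  # floor, since pos is never negative
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def beta_spread(betas):
    """90th minus 10th percentile of the beta-values, ignoring masked (None) points."""
    observed = sorted(b for b in betas if b is not None)
    return percentile(observed, 90) - percentile(observed, 10)
```

A sample whose beta-values have merged towards the middle gives a small spread, whereas a sample that retains the bimodal distribution gives a spread closer to 1.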
We also examined the relationship between the amount of input DNA and
both probe failures and different measures of variance (Figure 5.4), and did not see a
separation between tissue treatment, DNA amount and probe failure or variance.
Figure 5.2. Infinium HumanMethylation27 beta-values and correlations between frozen and FFPE NTL tissues.
Figure 5.3. Infinium HumanMethylation27 beta-values and correlations between frozen and FFPE tumors.
Figure 5.4. Relationship between DNA amount as measured by ALU-C4 C(T)
with probe failure and different measures of variance.
Comparison of data from FFPE and matched frozen samples with known DNA
methylation of select genes
Each FFPE sample was compared with its matched frozen partner. It should be
noted that while the samples were derived from the same patient, they do not
represent the exact same piece of tissue, since part of the tumor was formalin
fixed and embedded for clinical analysis, while the remnant was frozen for the
study. Thus, some variation between the samples is to be expected due to intra-
tumor heterogeneity. The correlations varied for each frozen/FFPE pair with an
average correlation of R² = 0.93 (range 0.88-0.97, Figure 5.2 and Figure 5.3).
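A per-pair correlation of this kind can be computed from the two beta-value vectors of a frozen/FFPE pair, as in the sketch below. It assumes a Pearson correlation (the text reports R² without naming the coefficient) and drops any probe that is masked in either sample; the function name is hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between paired beta-values.

    Probes with a masked (None) value in either sample are skipped.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)
```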
In addition, we looked at several genes known to be hypermethylated in
lung adenocarcinoma (Tsou et al. 2007) and saw similar methylation profiles
between both FFPE and frozen samples (Figure 5.5), with no statistically
significant differences observed between the two tissue preparations. Finally, an
exploratory hierarchical clustering shows that the tumors were clustered by
subject (frozen and FFPE pairs), more so than by treatment (Figure 5.6). Since
DNA methylation profiles of NTL tissues are in general much less divergent
between patients than the heterogeneous tumors, the NTL samples were not as
tightly clustered by subject, and some samples appear to cluster more closely
together based on the tissue treatment.
Figure 5.5. Comparison of beta values between frozen and FFPE tumor and NTL
samples for six loci known to be hypermethylated in lung cancer. Also
shown are lung tumors and matched NTL from an independent sample set (far
right two columns).
Figure 5.6. Hierarchical clustering of samples shows close relationship
between frozen and FFPE samples of the same subject. Dark blue: Frozen
NTL, light blue: FFPE NTL, dark red: Frozen tumors, light red: FFPE tumors.
Discussion
The procurement of human tissues in sufficient numbers for robust
statistical power is the limiting factor for many genome-wide studies. This
problem is even more pronounced for rare diseases, or specific, less abundant
sub-populations. A requirement for high-quality DNA from frozen material
compounds this difficulty, and complicates sample acquisition and processing.
The ability to perform high-dimensional DNA methylation profiling of archival
FFPE samples would capitalize on a great resource already available in large
numbers with established clinical records and follow-up data.
In this pilot study we demonstrate that the usage of Illumina Infinium
HumanMethylation27 platform on archival FFPE tissues is feasible. A previous
study demonstrated the use of FFPE on this platform; however, the investigators
required a pre-ligation step to generate acceptable data (Thirlwell et al. 2010). In our
study, we demonstrate that even without pre-ligation, good quality data can be
obtained. We used six human subjects with tumor and NTL tissues, in order to
determine if we can extract similar differential information with FFPE tissues. It is
important to note that this was a pilot study with limited number of samples, and
therefore we were unable to do discovery-based statistical tests due to lack of
power. Additionally, due to the limited number of samples tested, we were unable
to study the relationship between the age of FFPE sample and performance.
However, our oldest FFPE sample was from January of 2007 (Subject 1) and
performed exceptionally well. Data from FFPE specimens correlated well with
that of their matched frozen counterparts. Most importantly, we were able to
recapitulate the known differences between tumor and normal levels of DNA
methylation for several genes in the FFPE samples, indicative of the accuracy of
the information acquired from the analysis of these FFPE tissues.
We repeated our analysis with FFPE samples obtained from a different
source to ensure that this was not unique to this particular study, and found
variable results, both in terms of probe failure and variance retention (data not
shown). Therefore, we conclude that while frozen samples are ideal, FFPE
samples may perform well in terms of probe success and variance, but may differ
in performance based on the fixation treatment protocol used. A larger study
would need to be conducted to fully establish the reliability of FFPE tissue
performance on the Infinium HumanMethylation27. However, Illumina has
recently released a new DNA methylation platform, the Infinium
HumanMethylation450 that is purported to be more amenable to degraded DNA
obtained from FFPE tissues. In combination with the recently developed Illumina
FFPE DNA Restoration Solution, FFPE samples were reported to have a
reproducibility of R² with a 95% probe detection (Infinium
HumanMethylation450 BeadChip Datasheet: Epigenetics). This new platform
therefore may be more promising for FFPE studies in the future.
CHAPTER 6
SUMMARY AND CONCLUSIONS
Lung cancer has been a major health problem in the United States and
worldwide for many decades, and will remain a serious worldwide health concern
for years to come (Jemal et al. 2011). The persistently high incidence and high
mortality of lung cancer demonstrate that there is still much to learn about this
disease. Lung adenocarcinoma is the subtype of lung cancer that is the most
common amongst women and non-smokers, and the molecular changes
underlying the development of this disease are of great interest.
DNA methylation is an epigenetic modification with roles in both
physiological and pathological processes. Aberrant DNA methylation changes
are commonly found in many different types of cancer, including lung
adenocarcinoma (Kerr et al. 2007). This thesis focuses on DNA methylation
profiling of lung adenocarcinoma, ranging from delineating the timing at which
previously identified changes occur during the development of lung
adenocarcinoma in Chapter 2, the identification of new loci methylated in lung
adenocarcinoma as well as those associated with gene expression changes in
Chapter 3, the identification of DNA methylation changes between smokers and
never-smokers, Asians and Caucasians, and the discovery of intrinsic sub-
groups based on DNA methylation profiles in Chapter 4. Finally in Chapter 5 we
explored the use of formalin-fixed, paraffin-embedded (FFPE) tissues for
genome-scale DNA methylation profiling, in order to facilitate larger studies with
better power, or studies of rare diseases.
Previous work from our laboratory had identified a panel of 15 frequent
DNA methylation changes that may serve as early detection biomarkers of lung
adenocarcinoma (Tsou et al. 2007). Chapter 2 builds on this previous work by
delineating when these DNA methylation changes arise, by analyzing putative
precursor lesions to lung adenocarcinoma, atypical adenomatous hyperplasia
(AAH) and adenocarcinoma in situ (AIS).
The proposed lung adenocarcinoma progression sequence stipulates that
some AAH lesions may progress to become AIS, and that some of these AIS
lesions then progress to full blown invasive adenocarcinomas (Chapman and
Kerr 2000; Kerr 2001). We used these putative precursor lesions to delineate the
15 previously identified DNA hypermethylation markers as “early”, meaning a
statistically significant increase in DNA methylation first observed between AAH
and adjacent non-tumor lung (AdjNTL), “intermediate”, meaning a statistically
significant increase beginning in AIS lesions in comparison to AAH, and “late”
changes, or those loci which only become statistically significantly
hypermethylated in invasive adenocarcinoma. Additionally, we included a mark
for global DNA hypomethylation in the study.
We found that CDKN2A exon 2 and PTPRN2 were hypermethylated
“early” in lung adenocarcinoma progression, while seven loci (2C35, EYA4,
HOXA1, HOXA11, NEUROD1, NEUROD2 and TMEFF2) were “intermediate”, or
first became methylated in AIS tumors. The remaining five loci, CDH13, CDX2,
OPCML, SFRP1 and TWIST1, were designated as “late” loci, as they only
showed hypermethylation in invasive adenocarcinoma. Intriguingly, our marker for
global levels of DNA methylation, the mean of two repeat sequences ALU and
SATELLITE2, only became demethylated in invasive adenocarcinoma.
An important caveat to this study was that it was a cross-sectional
analysis, where individual lesions were derived from a population of subjects, as
opposed to a more ideal longitudinal study, where the same preneoplastic
lesions are followed over time. A longitudinal study on preneoplastic lesions,
however, would be prohibitively difficult to do, since these lesions are typically
too small for repeated sampling, and are also inaccessible. However, recent
efforts to immortalize precursor and preneoplastic lesions of lung
adenocarcinoma may provide an avenue for more detailed investigation into the
role of DNA methylation changes in these preneoplastic lesions (Shimada et al.
2005).
An additional limitation to the study in Chapter 2 was that the 15 loci
studied were identified in candidate-gene based screens, and therefore may
have excluded other lung adenocarcinoma markers. Furthermore, these 15 loci
were chosen solely based on early detection marker performance, and not based
on any potential functional role they may have in lung adenocarcinoma
development. Therefore, in Chapter 3 our goal was to use new genome-scale
technologies to identify DNA methylation changes in lung adenocarcinoma, both
for biomarker development purposes as well as to identify potentially functionally
important DNA methylation alterations.
We used the Illumina Infinium HumanMethylation27 platform, which uses
bead-based technology to simultaneously query 27,578 probes representing
14,475 genes, to analyze 59 lung adenocarcinomas and matched adjacent non-tumor
lung tissues. We successfully identified 520 genes to be statistically significantly
hypermethylated in lung adenocarcinoma, while 247 genes were significantly
hypomethylated. For biomarker development purposes, we then compared the
top 10 most hypermethylated loci to the top 10 most hypermethylated loci
in squamous cell carcinomas from publicly available data from The Cancer Genome
Atlas (TCGA) (http://cancergenome.nih.gov/). We identified three loci, HOXB4,
TRIM58 and NID2, that were in both top 10 lists, and importantly, also had low
levels of DNA methylation in adjacent non-tumor lung as well as the white blood
cells of healthy subjects.
This is an important preliminary filter for the development of blood-based
biomarkers. Although a multitude of DNA methylation biomarkers have been
identified, few have ever made it past preliminary tests, due to issues with
specificity or sensitivity. While sensitivity can be improved by using multiple
biomarkers in a panel, avoiding loci that are already methylated in white blood
cells of healthy subjects reduces the chance that these loci will have high levels
of background DNA methylation in healthy subjects, and will hopefully help
improve biomarker specificity. These three loci are therefore actively being
pursued in our laboratory as potential blood-based biomarkers for the early
detection of non-small cell lung cancers.
In an effort to identify loci for which DNA methylation changes may
be functionally relevant, we integrated gene expression information with DNA
methylation profiles. We thus identified 164 genes that were concordantly
hypermethylated and downregulated, and 57 genes that were concordantly
hypomethylated and upregulated. Restricting the genes further to only those
which yielded a minimum of 2-fold change in expression, we were left with 45
hypermethylated and downregulated genes, and 16 hypomethylated and
upregulated genes to pursue in downstream analyses. These 61 genes will be
the focus of future experiments, including verification in independent populations
and other ethnic groups, as well as functional studies using in vitro or in vivo
methods.
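The filtering logic described in this paragraph (concordant direction of DNA methylation and expression change, followed by a minimum 2-fold expression change) can be illustrated with a short sketch. The function name, the input layout and the gene labels used in any example are hypothetical; the sketch also assumes that the input contains only genes already found to be significantly differentially methylated and expressed, and that expression change is given as a tumor-to-normal ratio.

```python
def concordant_genes(genes, min_fold=2.0):
    """Split genes into concordant groups with at least min_fold expression change.

    genes maps a gene label to (delta_beta, expression_ratio), where
    delta_beta is tumor minus non-tumor methylation and expression_ratio
    is tumor divided by non-tumor expression (assumed layout).
    """
    hyper_down, hypo_up = [], []
    for name, (delta_beta, ratio) in genes.items():
        if delta_beta > 0 and ratio <= 1.0 / min_fold:
            hyper_down.append(name)  # hypermethylated and downregulated
        elif delta_beta < 0 and ratio >= min_fold:
            hypo_up.append(name)  # hypomethylated and upregulated
    return hyper_down, hypo_up
```

Genes that change in the non-concordant direction, or whose expression changes by less than the fold threshold, fall into neither list.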
While detailed examination is outside the scope of this work, we also
identified 5 genes that were coordinately hypermethylated and upregulated, and
10 genes that were hypomethylated and downregulated. Although the
consequences of DNA methylation at these genes are as yet unknown, new
studies using high throughput bisulfite DNA sequencing have provided evidence
that DNA methylation regulation and patterns are more complex than previously
assumed. The location of DNA methylation, including intra- and intergenic DNA
methylation, as well as the regulation of alternate transcripts and even DNA
methylation of non-CpG islands appear to play previously unrecognized,
important roles in gene regulation (Irizarry et al. 2009; Maunakea et al. 2010;
Han et al. 2011). Additionally, this study does not address the interplay between
DNA methylation and other epigenetic modifications, including histone
modifications and nucleosome remodeling.
The complexity of DNA methylation profiles is therefore just beginning to
be unraveled. Another key recent discovery is the existence of DNA methylation-
based subgroups, including CpG island Methylator Phenotype (CIMP) in several
cancers (Weisenberger et al. 2006; Noushmehr et al. 2010). In Chapter 4, we do
not find evidence for the existence of CIMP in lung
adenocarcinoma, although we do find a DNA methylation-based subgroup
resembling CIMP-Low, which was first identified in colorectal cancer (Ogino et
al. 2006). This subgroup is associated with smoking status and KRAS mutations.
We also investigated DNA methylation differences between smokers and
never-smokers, and to our surprise found few differences in both tumor and non-
tumor tissues of the two groups. However, we did identify LGALS4 as a gene that
was both aberrantly hypermethylated and downregulated in smoker tumors.
LGALS4 was recently shown to be involved in the WNT signaling pathway, and
was proposed as a potential tumor suppressor in colorectal cancer (Satelli et al.
2011). The parallels we have observed between colorectal cancer and lung
adenocarcinoma, as well as the fact that the WNT pathway is well-known to be
involved in lung cancer, make LGALS4 an important target for follow-up studies
(Mazieres et al. 2005).
Additionally, we identified SULT1C2, an enzyme involved in the
metabolism of environmental toxins such as heterocyclic aromatic amines, as
being differentially methylated and expressed between the non-tumor lung
tissues of Asians and Caucasians. This preliminary finding is especially exciting
since there is much interest in the molecular mechanism behind lung
adenocarcinoma development in non-smoking Asian women. The release of
certain compounds into the air during some types of Asian cooking, specifically
deep-frying and stir-frying of certain foods, has been proposed as a risk factor
for lung adenocarcinoma development in Asian women, as the rates of
metabolizing such compounds would influence the risk of disease in each
individual.
Our conclusions in Chapter 4 are preliminary in nature, since the cohort
size used for analysis was limited. Our study consisted of only 59 subjects, and
even fewer for the stratified questions. Obtaining a sufficiently sized cohort is
often the limiting factor in most experimental designs, especially in studies
involving genome-scale analyses. These platforms impose an additional
restriction, as they often require high quality, fresh frozen tissues. In order to
address this issue and facilitate larger studies, we explored the possibility of
using FFPE tissue blocks, which are routinely collected in hospitals and therefore
more readily available, for DNA methylation profiling using the Illumina Infinium
HumanMethylation27 in Chapter 5.
We performed a pilot study using six human subjects with both tumor and
non-tumor lung tissues that had portions of each preserved in FFPE as well as
fresh frozen. We evaluated the feasibility of using FFPE DNA on this platform
using several criteria, including probe failures, retention of variance, and
reproducibility of previously identified DNA methylation differences between
tumors and NTL. While we were able to extract good data in this initial study,
using different FFPE tissues from another source gave us poor quality data,
leading us to conclude that while some good information can be obtained using
FFPE DNA, it is variable and dependent on treatment conditions at each
collection facility. Additionally, our pilot study was too small to consider issues
such as the age of the FFPE block, or different fixation protocols. However, the
release of new platforms that are more suitable to degraded or lower-quality
DNA, including the Infinium HumanMethylation450, as well as the new Illumina
FFPE DNA Restoration kit, indicates that FFPE DNA methylation studies may be
a feasible option in the future.
This thesis described DNA methylation profiling in lung adenocarcinoma,
including biomarker discovery, delineation of the timing of DNA methylation
changes, the identification of potentially functionally relevant DNA methylation
alterations, the identification of DNA methylation differences between smoker
and never-smoker subjects as well as between Asians and Caucasians, as well
as the identification of a potential DNA methylation sub-group in lung
adenocarcinoma that is associated with smoking and KRAS mutations. These
analyses have given rise to several downstream studies in our laboratory,
including blood-based biomarker development, functional characterization of
selected DNA methylation changes in preneoplastic lesion and lung
adenocarcinoma cell line models, as well as verification of certain DNA
methylation changes in different ethnic populations and independent populations.
This work has provided key leads in our continued efforts both to improve early
detection and prognostic methods to alleviate lung cancer mortality, as well as
improving our understanding of the interplay between DNA methylation and the
genes, pathways and molecular mechanisms that may be involved in the
development of lung adenocarcinoma.
REFERENCES
Abate-Shen C. 2002. Deregulated homeobox gene expression in cancer: cause
or consequence? Nat Rev Cancer 2(10): 777-785.
Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, Fagerstrom RM, Gareen
IF, Gatsonis C, Marcus PM, Sicks JD. 2011. Reduced lung-cancer
mortality with low-dose computed tomographic screening. N Engl J Med
365(5): 395-409.
Ahrendt SA, Decker PA, Alawi EA, Zhu YR, Sanchez-Cespedes M, Yang SC,
Haasler GB, Kajdacsy-Balla A, Demeure MJ, Sidransky D. 2001. Cigarette
smoking is strongly associated with mutation of the K-ras gene in patients
with primary adenocarcinoma of the lung. Cancer 92(6): 1525-1530.
Ahuja N, Li Q, Mohan AL, Baylin SB, Issa JP. 1998. Aging and DNA methylation
in colorectal mucosa and cancer. Cancer Res 58(23): 5489-5494.
Andersson A, Srivastava MK, Harris-White M, Huang M, Zhu L, Elashoff D,
Strieter RM, Dubinett SM, Sharma S. 2011. Role of CXCR3 ligands in
IL-7/IL-7Rα-Fc-mediated antitumor activity in lung cancer. Clin Cancer
Res 17(11): 3660-3672.
Andrews J, Kennette W, Pilon J, Hodgson A, Tuck AB, Chambers AF,
Rodenhiser DI. 2010. Multi-platform whole-genome microarray analyses
refine the epigenetic signature of breast cancer metastasis with gene
expression and copy number. PLoS One 5(1): e8665.
Anglim PP, Alonzo TA, Laird-Offringa IA. 2008a. DNA methylation-based
biomarkers for early detection of non-small cell lung cancer: an update.
Mol Cancer 7: 81.
Anglim PP, Galler JS, Koss MN, Hagen JA, Turla S, Campan M, Weisenberger
DJ, Laird PW, Siegmund KD, Laird-Offringa IA. 2008b. Identification of a
panel of sensitive and specific DNA methylation markers for squamous
cell lung cancer. Mol Cancer 7: 62.
Anisowicz A, Huang H, Braunschweiger KI, Liu Z, Giese H, Wang H, Mamaev S,
Olejnik J, Massion PP, Del Mastro RG. 2008. A high-throughput and
sensitive method to measure global DNA methylation: application in lung
cancer. BMC Cancer 8: 222.
Au JS, Mang OW, Foo W, Law SC. 2004. Time trends of lung cancer incidence
by histologic types and smoking prevalence in Hong Kong 1983-2000.
Lung Cancer 45(2): 143-152.
Aviel-Ronen S, Coe BP, Lau SK, da Cunha Santos G, Zhu CQ, Strumpf D,
Jurisica I, Lam WL, Tsao MS. 2008. Genomic markers for malignant
progression in pulmonary adenocarcinoma with bronchioloalveolar
features. Proc Natl Acad Sci U S A 105(29): 10155-10160.
Baba Y, Nosho K, Shima K, Freed E, Irahara N, Philips J, Meyerhardt JA,
Hornick JL, Shivdasani RA, Fuchs CS et al. 2009. Relationship of CDX2
loss with molecular features and prognosis in colorectal cancer. Clin
Cancer Res 15(14): 4665-4673.
Barbosa-Morais NL, Dunning MJ, Samarajiwa SA, Darot JF, Ritchie ME, Lynch
AG, Tavare S. 2010. A re-annotation pipeline for Illumina BeadArrays:
improving the interpretation of gene expression data. Nucleic Acids Res
38(3): e17.
Beer DG, Kardia SL, Huang CC, Giordano TJ, Levin AM, Misek DE, Lin L, Chen
G, Gharib TG, Thomas DG et al. 2002. Gene-expression profiles predict
survival of patients with lung adenocarcinoma. Nat Med 8(8): 816-824.
Belinsky SA. 2004. Gene-promoter hypermethylation as a biomarker in lung
cancer. Nat Rev Cancer 4(9): 707-717.
Belinsky SA. 2005. Silencing of genes by promoter hypermethylation: key event
in rodent and human lung cancer. Carcinogenesis 26(9): 1481-1487.
Belinsky SA, Nikula KJ, Palmisano WA, Michels R, Saccomanno G, Gabrielson
E, Baylin SB, Herman JG. 1998. Aberrant methylation of p16(INK4a) is an
early event in lung cancer and a potential biomarker for early diagnosis.
Proc Natl Acad Sci U S A 95(20): 11891-11896.
Belinsky SA, Palmisano WA, Gilliland FD, Crooks LA, Divine KK, Winters SA,
Grimes MJ, Harms HJ, Tellez CS, Smith TM et al. 2002. Aberrant
promoter methylation in bronchial epithelium and sputum from current and
former smokers. Cancer Res 62(8): 2370-2377.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. 2004.
GenBank: update. Nucleic Acids Res 32(Database issue): D23-26.
Berman BP, Weisenberger DJ, Aman JF, Hinoue T, Ramjan Z, Liu Y,
Noushmehr H, Lange CP, van Dijk CM, Tollenaar RA et al. 2011. Regions
of focal DNA hypermethylation and long-range hypomethylation in
colorectal cancer coincide with nuclear lamina-associated domains. Nat
Genet.
Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, Ladd C,
Beheshti J, Bueno R, Gillette M et al. 2001. Classification of human lung
carcinomas by mRNA expression profiling reveals distinct
adenocarcinoma subclasses. Proc Natl Acad Sci U S A 98(24): 13790-
13795.
Bibikova M, Le J, Barnes B, Saedinia-Melnyk S, Zhou L, Shen R, Gunderson KL.
2009. Genome-wide DNA methylation profiling using Infinium assay.
Epigenomics 1(1): 177-200.
Bibikova M, Lin Z, Zhou L, Chudin E, Garcia EW, Wu B, Doucet D, Thomas NJ,
Wang Y, Vollmer E et al. 2006. High-throughput DNA methylation profiling
using universal bead arrays. Genome Res 16(3): 383-393.
Bird A. 2002. DNA methylation patterns and epigenetic memory. Genes Dev
16(1): 6-21.
Bird AP. 1986. CpG-rich islands and the function of DNA methylation. Nature
321(6067): 209-213.
Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH,
Weng Z Snyder M Dermitzakis ET Thurman RE et al. 2007. Identification
and analysis of functional elements in 1% of the human genome by the
ENCODE pilot project. Nature 447(7146): 799-816.
Black WC. 2000. Overdiagnosis: An underrecognized cause of confusion and
harm in cancer screening. J Natl Cancer Inst 92(16): 1280-1282.
Blanco D, Vicent S, Fraga MF, Fernandez-Garcia I, Freire J, Lujambio A, Esteller
M, Ortiz-de-Solorzano C, Pio R, Lecanda F et al. 2007. Molecular analysis
of a multistep lung cancer model induced by chronic inflammation reveals
epigenetic regulation of p16 and activation of the DNA damage response
pathway. Neoplasia 9(10): 840-852.
Boldin MP, Goncharov TM, Goltsev YV, Wallach D. 1996. Involvement of MACH,
a novel MORT1/FADD-interacting protease, in Fas/APO-1- and TNF
receptor-induced cell death. Cell 85(6): 803-815.
Breitling LP, Yang R, Korn B, Burwinkel B, Brenner H. 2011. Tobacco-smoking-
related differential DNA methylation: 27K discovery and replication. Am J
Hum Genet 88(4): 450-457.
Bremnes RM, Sirera R, Camps C. 2005. Circulating tumour-derived DNA and
RNA markers in blood: a tool for early detection, diagnostics, and follow-
up? Lung Cancer 49(1): 1-12.
Brenet F, Moh M, Funk P, Feierstein E, Viale AJ, Socci ND, Scandura JM. 2011.
DNA methylation of the first exon is tightly linked to transcriptional
silencing. PLoS One 6(1): e14524.
Brock MV, Hooker CM, Ota-Machida E, Han Y, Guo M, Ames S, Glockner S,
Piantadosi S, Gabrielson E, Pridham G et al. 2008. DNA methylation
markers and early recurrence in stage I lung cancer. N Engl J Med
358(11): 1118-1128.
Campan M, Weisenberger DJ, Trinh B, Laird PW. 2009. MethyLight. Methods
Mol Biol 507: 325-337.
Chapman AD, Kerr KM. 2000. The association between atypical adenomatous
hyperplasia and primary lung cancer. Br J Cancer 83(5): 632-636.
Chen F, Cole P, Bina WF. 2007. Time trend and geographic patterns of lung
adenocarcinoma in the United States, 1973-2002. Cancer Epidemiol
Biomarkers Prev 16(12): 2724-2729.
Chen RZ, Pettersson U, Beard C, Jackson-Grusby L, Jaenisch R. 1998. DNA
hypomethylation leads to elevated mutation rates. Nature 395(6697): 89-
93.
Cheong N, Zhang H, Madesh M, Zhao M, Yu K, Dodia C, Fisher AB, Savani RC,
Shuman H. 2007. ABCA3 is critical for lamellar body biogenesis in vivo. J
Biol Chem 282(33): 23811-23817.
Christensen BC, Marsit CJ, Houseman EA, Godleski JJ, Longacker JL, Zheng S,
Yeh RF, Wrensch MR, Wiemels JL, Karagas MR et al. 2009.
Differentiation of lung adenocarcinoma, pleural mesothelioma, and
nonmalignant pulmonary tissues using DNA methylation profiles. Cancer
Res 69(15): 6315-6321.
Chung JH, Lee HJ, Kim BH, Cho NY, Kang GH. 2011. DNA methylation profile
during multistage progression of pulmonary adenocarcinomas. Virchows
Arch 459(2): 201-211.
Clark SJ. 2007. Action at a distance: epigenetic silencing of large chromosomal
regions in carcinogenesis. Hum Mol Genet 16 Spec No 1: R88-95.
Costello JF, Fruhwald MC, Smiraglia DJ, Rush LJ, Robertson GP, Gao X, Wright
FA, Feramisco JD, Peltomaki P, Lang JC et al. 2000. Aberrant CpG-island
methylation has non-random and tumour-type-specific patterns. Nat Genet
24(2): 132-138.
Dai W, Teodoridis JM, Zeller C, Graham J, Hersey J, Flanagan JM, Stronach E,
Millan DW, Siddiqui N, Paul J et al. 2011. Systematic CpG islands
methylation profiling of genes in the wnt pathway in epithelial ovarian
cancer identifies biomarkers of progression-free survival. Clin Cancer Res
17(12): 4052-4062.
Dai Z, Lakshmanan RR, Zhu WG, Smiraglia DJ, Rush LJ, Fruhwald MC, Brena
RM, Li B, Wright FA, Ross P et al. 2001. Global methylation profiling of
lung cancer identifies novel methylated genes. Neoplasia 3(4): 314-323.
Dammann R, Li C, Yoon JH, Chin PL, Bates S, Pfeifer GP. 2000. Epigenetic
inactivation of a RAS association domain family protein from the lung
tumour suppressor locus 3p21.3. Nat Genet 25(3): 315-319.
Ding L, Getz G, Wheeler DA, Mardis ER, McLellan MD, Cibulskis K, Sougnez C,
Greulich H, Muzny DM, Morgan MB et al. 2008. Somatic mutations affect
key pathways in lung adenocarcinoma. Nature 455(7216): 1069-1075.
Divine KK, Pulling LC, Marron-Terada PG, Liechty KC, Kang T, Schwartz AG,
Bocklage TJ, Coons TA, Gilliland FD, Belinsky SA. 2005. Multiplicity of
abnormal promoter methylation in lung adenocarcinomas from smokers
and never smokers. Int J Cancer 114(3): 400-405.
Dong SM, Lee EJ, Jeon ES, Park CK, Kim KM. 2005. Progressive methylation
during the serrated neoplasia pathway of the colorectum. Mod Pathol
18(2): 170-178.
Dranoff G. 2004. Cytokines in cancer pathogenesis and cancer therapy. Nat Rev
Cancer 4(1): 11-22.
Du P, Kibbe WA, Lin SM. 2008. lumi: a pipeline for processing Illumina
microarray. Bioinformatics 24(13): 1547-1548.
Eads CA, Danenberg KD, Kawakami K, Saltz LB, Blake C, Shibata D,
Danenberg PV, Laird PW. 2000. MethyLight: a high-throughput assay to
measure DNA methylation. Nucleic Acids Res 28(8): E32.
Eden A, Gaudet F, Waghmare A, Jaenisch R. 2003. Chromosomal instability and
tumors promoted by DNA hypomethylation. Science 300(5618): 455.
Ehrlich M. 2002. DNA methylation in cancer: too much, but also too little.
Oncogene 21(35): 5400-5413.
Eisenbrand G, Tang W. 1993. Food-borne heterocyclic amines. Chemistry,
formation, occurrence and biological activities. A literature review.
Toxicology 84(1-3): 1-82.
Esteller M. 2003. Relevance of DNA methylation in the management of cancer.
Lancet Oncol 4(6): 351-358.
Esteller M, Sanchez-Cespedes M, Rosell R, Sidransky D, Baylin SB, Herman JG.
1999. Detection of aberrant promoter hypermethylation of tumor
suppressor genes in serum DNA from non-small cell lung cancer patients.
Cancer Res 59(1): 67-70.
Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. 2010. Estimates of
worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer
127(12): 2893-2917.
Fiegl H, Jones A, Hauser-Kronberger C, Hutarew G, Reitsamer R, Jones RL,
Dowsett M, Mueller-Holzner E, Windbichler G, Daxenbichler G et al. 2008.
Methylated NEUROD1 promoter is a marker for chemosensitivity in breast
cancer. Clin Cancer Res 14(11): 3494-3502.
Fu DY, Wang ZM, Li C, Wang BL, Shen ZZ, Huang W, Shao ZM. 2010. Sox17,
the canonical Wnt antagonist, is epigenetically inactivated by promoter
methylation in human breast cancer. Breast Cancer Res Treat 119(3):
601-612.
Fukui T, Kondo M, Ito G, Maeda O, Sato N, Yoshioka H, Yokoi K, Ueda Y,
Shimokata K, Sekido Y. 2005. Transcriptional silencing of secreted frizzled
related protein 1 (SFRP 1) by promoter hypermethylation in non-small-cell
lung cancer. Oncogene 24(41): 6323-6327.
Funama Y, Awai K, Liu D, Oda S, Yanaga Y, Nakaura T, Kawanaka K,
Shimamura M, Yamashita Y. 2009. Detection of nodules showing ground-
glass opacity in the lungs at low-dose multidetector computed
tomography: phantom and clinical study. J Comput Assist Tomogr 33(1):
49-53.
Galm O, Yoshikawa H, Esteller M, Osieka R, Herman JG. 2003. SOCS-1, a
negative regulator of cytokine signaling, is frequently silenced by
methylation in multiple myeloma. Blood 101(7): 2784-2788.
Garcia M, Jemal A, Ward EM, Center MM, Hao Y, Siegel RL, Thun MJ. 2007. Global
Cancer Facts & Figures 2007. American Cancer Society.
Gardiner-Garden M, Frommer M. 1987. CpG islands in vertebrate genomes. J
Mol Biol 196(2): 261-282.
Gebhard C, Benner C, Ehrich M, Schwarzfischer L, Schilling E, Klug M,
Dietmaier W, Thiede C, Holler E, Andreesen R et al. 2010. General
transcription factor binding at CpG islands in normal cells correlates with
resistance to de novo DNA methylation in cancer cells. Cancer Res 70(4):
1398-1407.
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B,
Gautier L, Ge Y, Gentry J et al. 2004. Bioconductor: open software
development for computational biology and bioinformatics. Genome Biol
5(10): R80.
Ghadersohi A, Odunsi K, Lele S, Collins Y, Greco WR, Winston J, Liang P, Sood
AK. 2004. Prostate derived Ets transcription factor shows better tumor-
association than other cancer-associated molecules. Oncol Rep 11(2):
453-458.
Glatt H, Pabel U, Meinl W, Frederiksen H, Frandsen H, Muckel E. 2004.
Bioactivation of the heterocyclic aromatic amine 2-amino-3-methyl-9H-
pyrido [2,3-b]indole (MeAalphaC) in recombinant test systems expressing
human xenobiotic-metabolizing enzymes. Carcinogenesis 25(5): 801-807.
Goelz SE, Vogelstein B, Hamilton SR, Feinberg AP. 1985. Hypomethylation of
DNA from benign and malignant human colon neoplasms. Science
228(4696): 187-190.
Goto Y, Shinjo K, Kondo Y, Shen L, Toyota M, Suzuki H, Gao W, An B, Fujii M,
Murakami H et al. 2009. Epigenetic profiles distinguish malignant pleural
mesothelioma from lung adenocarcinoma. Cancer Res 69(23): 9073-9082.
Guerrero-Preston R, Soudry E, Acero J, Orera M, Moreno-Lopez L, Macia-Colon
G, Jaffe A, Berdasco M, Ili-Gangas C, Brebi-Mieville P et al. 2011. NID2
and HOXA9 promoter hypermethylation as biomarkers for prevention and
early detection in oral cavity squamous cell carcinoma tissues and saliva.
Cancer Prev Res (Phila) 4(7): 1061-1072.
Guo M, House MG, Hooker C, Han Y, Heath E, Gabrielson E, Yang SC, Baylin
SB, Herman JG, Brock MV. 2004. Promoter hypermethylation of resected
bronchial margins: a field defect of changes? Clin Cancer Res 10(15):
5131-5136.
Han H, Cortez CC, Yang X, Nichols PW, Jones PA, Liang G. 2011. DNA
methylation directly silences genes with non-CpG island promoters and
establishes a nucleosome occupied promoter. Hum Mol Genet.
Hanabata T, Tsukuda K, Toyooka S, Yano M, Aoe M, Nagahiro I, Sano Y, Date
H, Shimizu N. 2004. DNA methylation of multiple genes and
clinicopathological relationship of non-small cell lung cancers. Oncol Rep
12(1): 177-180.
Hayashi N, Sugimoto Y, Tsuchiya E, Ogawa M, Nakamura Y. 1994. Somatic
mutations of the MTS (multiple tumor suppressor) 1/CDK4I (cyclin-
dependent kinase-4 inhibitor) gene in human primary non-small cell lung
carcinomas. Biochem Biophys Res Commun 202(3): 1426-1430.
Helman E, Naxerova K, Kohane IS. 2011. DNA hypermethylation in lung cancer
is targeted at differentiation-associated genes. Oncogene.
Herman JG, Baylin SB. 2003. Gene silencing in cancer in association with
promoter hypermethylation. N Engl J Med 349(21): 2042-2054.
Hill VK, Ricketts C, Bieche I, Vacher S, Gentle D, Lewis C, Maher ER, Latif F.
2011. Genome-wide DNA methylation profiling of CpG islands in breast
cancer identifies novel genes associated with tumorigenicity. Cancer Res
71(8): 2988-2999.
Hinoue T, Weisenberger DJ, Lange CP, Shen H, Byun HM, Van Den Berg D,
Malik S, Pan F, Noushmehr H, van Dijk CM et al. 2011. Genome-scale
analysis of aberrant DNA methylation in colorectal cancer. Genome Res.
Hobbs M, Mattick JS. 1993. Common components in the assembly of type 4
fimbriae, DNA transfer systems, filamentous phage and protein-secretion
apparatus: a general system for the formation of surface-associated
protein complexes. Mol Microbiol 10(2): 233-243.
Horner MJ, Ries LAG, Krapcho M, Neyman N, Aminou R et al. 2009. SEER Cancer Statistics
Review, 1975-2006.
Hosgood HD, 3rd, Berndt SI, Lan Q. 2007. GST genotypes and lung cancer
susceptibility in Asian populations with indoor air pollution exposures: a
meta-analysis. Mutat Res 636(1-3): 134-143.
Hou J, Aerts J, den Hamer B, van Ijcken W, den Bakker M, Riegman P, van der
Leest C, van der Spek P, Foekens JA, Hoogsteden HC et al. 2010. Gene
expression-based classification of non-small cell lung carcinomas and
survival prediction. PLoS One 5(4): e10312.
Hsu HS, Chen TP, Hung CH, Wen CK, Lin RK, Lee HC, Wang YC. 2007.
Characterization of a multiple epigenetic marker panel for lung cancer
detection and risk assessment in plasma. Cancer 110(9): 2019-2026.
Huang da W, Sherman BT, Lempicki RA. 2009. Systematic and integrative
analysis of large gene lists using DAVID bioinformatics resources. Nat
Protoc 4(1): 44-57.
Huang YT, Lin X, Liu Y, Chirieac LR, McGovern R, Wain J, Heist R, Skaug V,
Zienolddiny S, Haugen A et al. 2011. Cigarette smoking increases copy
number alterations in nonsmall-cell lung cancer. Proc Natl Acad Sci U S A
108(39): 16345-16350.
Hussain M, Rao M, Humphries AE, Hong JA, Liu F, Yang M, Caragacianu D,
Schrump DS. 2009. Tobacco smoke induces polycomb-mediated
repression of Dickkopf-1 in lung cancer cells. Cancer Res 69(8): 3570-
3578.
Ikeda K, Awai K, Mori T, Kawanaka K, Yamashita Y, Nomori H. 2007. Differential
diagnosis of ground-glass opacity nodules: CT number analysis by three-
dimensional computerized quantification. Chest 132(3): 984-990.
Irizarry RA, Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo
K, Rongione M, Webster M et al. 2009. The human colon cancer
methylome shows similar hypo- and hypermethylation at conserved
tissue-specific CpG island shores. Nat Genet 41(2): 178-186.
Issa JP. 2004. CpG island methylator phenotype in cancer. Nat Rev Cancer
4(12): 988-993.
Jackman DM, Johnson BE. 2005. Small-cell lung cancer. Lancet 366(9494):
1385-1396.
Jackson EL, Willis N, Mercer K, Bronson RT, Crowley D, Montoya R, Jacks T,
Tuveson DA. 2001. Analysis of lung tumor initiation and progression using
conditional expression of oncogenic K-ras. Genes Dev 15(24): 3243-3248.
Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. 2011. Global cancer
statistics. CA Cancer J Clin 61(2): 69-90.
Jemal A, Center MM, DeSantis C, Ward EM. 2010. Global patterns of cancer
incidence and mortality rates and trends. Cancer Epidemiol Biomarkers
Prev 19(8): 1893-1907.
Jensen TJ, Novak P, Eblin KE, Gandolfi AJ, Futscher BW. 2008. Epigenetic
remodeling during arsenical-induced malignant transformation.
Carcinogenesis 29(8): 1500-1508.
Jones PA. 2002. DNA methylation and cancer. Oncogene 21(35): 5358-5360.
Jones PA, Baylin SB. 2002. The fundamental role of epigenetic events in cancer.
Nat Rev Genet 3(6): 415-428.
Jurgens B, Schmitz-Drager BJ, Schulz WA. 1996. Hypomethylation of L1 LINE
sequences prevailing in human urothelial carcinoma. Cancer Res 56(24):
5698-5703.
Kamb A, Gruis NA, Weaver-Feldhaus J, Liu Q, Harshman K, Tavtigian SV,
Stockert E, Day RS, 3rd, Johnson BE, Skolnick MH. 1994. A cell cycle
regulator potentially involved in genesis of many tumor types. Science
264(5157): 436-440.
Kaufman L, Rousseeuw PJ. 1990. Finding Groups in Data: An Introduction to
Cluster Analysis. Wiley Interscience, New York.
Kawai H, Tomii K, Toyooka S, Yano M, Murakami M, Tsukuda K, Shimizu N.
2005. Promoter methylation downregulates CDX2 expression in colorectal
carcinomas. Oncol Rep 13(3): 547-551.
Kawasaki E, Eisenbarth GS, Wasmeier C, Hutton JC. 1996. Autoantibodies to
protein tyrosine phosphatase-like proteins in type I diabetes. Overlapping
specificities to phogrin and ICA512/IA-2. Diabetes 45(10): 1344-1349.
Kerr KM. 2001. Pulmonary preinvasive neoplasia. J Clin Pathol 54(4): 257-271.
Kerr KM. 2009. Pulmonary adenocarcinomas: classification and reporting.
Histopathology 54(1): 12-27.
Kerr KM, Galler JS, Hagen JA, Laird PW, Laird-Offringa IA. 2007. The role of
DNA methylation in the development and progression of lung
adenocarcinoma. Dis Markers 23(1-2): 5-30.
Kerr KM, MacKenzie SJ, Ramasami S, Murray GI, Fyfe N, Chapman AD,
Nicolson MC, King G. 2004. Expression of Fhit, cell adhesion molecules
and matrix metalloproteinases in atypical adenomatous hyperplasia and
pulmonary adenocarcinoma. J Pathol 203(2): 638-644.
Kim DH, Kim JS, Ji YI, Shim YM, Kim H, Han J, Park J. 2003. Hypermethylation
of RASSF1A promoter is associated with the age at starting smoking and
a poor prognosis in primary non-small cell lung cancer. Cancer Res
63(13): 3743-3746.
Kim JH, Dhanasekaran SM, Prensner JR, Cao X, Robinson D, Kalyana-
Sundaram S, Huang C, Shankar S, Jing X, Iyer M et al. 2011. Deep
sequencing reveals distinct patterns of DNA methylation in prostate
cancer. Genome Res 21(7): 1028-1041.
Kim JS, Han J, Shim YM, Park J, Kim DH. 2005. Aberrant methylation of H-
cadherin (CDH13) promoter is associated with tumor progression in
primary nonsmall cell lung carcinoma. Cancer 104(9): 1825-1833.
Kim YH, Petko Z, Dzieciatkowski S, Lin L, Ghiassi M, Stain S, Chapman WC,
Washington MK, Willis J, Markowitz SD et al. 2006. CpG island
methylation of genes accumulates during the adenoma progression step
of the multistep pathogenesis of colorectal cancer. Genes Chromosomes
Cancer 45(8): 781-789.
Kim YI, Giuliano A, Hatch KD, Schneider A, Nour MA, Dallal GE, Selhub J,
Mason JB. 1994. Global DNA hypomethylation increases progressively in
cervical dysplasia and carcinoma. Cancer 74(3): 893-899.
Ko YC, Cheng LS, Lee CH, Huang JJ, Huang MS, Kao EL, Wang HZ, Lin HJ.
2000. Chinese food cooking and lung cancer in women nonsmokers. Am J
Epidemiol 151(2): 140-147.
Ko YC, Lee CH, Chen MJ, Huang CC, Chang WY, Lin HJ, Wang HZ, Chang PY.
1997. Risk factors for primary lung cancer among non-smoking women in
Taiwan. Int J Epidemiol 26(1): 24-31.
Kobayashi K, Nishioka M, Kohno T, Nakamoto M, Maeshima A, Aoyagi K, Sasaki
H, Takenoshita S, Sugimura H, Yokota J. 2004. Identification of genes
whose expression is upregulated in lung adenocarcinoma cells in
comparison with type II alveolar cells and bronchiolar epithelial cells in
vivo. Oncogene 23(17): 3089-3096.
Koyi H, Hillerdal G, Branden E. 2002. A prospective study of a total material of
lung cancer from a county in Sweden 1997-1999: gender, symptoms,
type, stage, and smoking habits. Lung Cancer 36(1): 9-14.
Kwon YJ, Lee SJ, Koh JS, Kim SH, Lee HW, Kang MC, Bae JB, Kim YJ, Park
JH. 2011. Genome-wide analysis of DNA methylation and the gene
expression change in lung cancer. J Thorac Oncol.
Laird PW. 2003. The power and the promise of DNA methylation markers. Nat
Rev Cancer 3(4): 253-266.
Laird PW. 2010. Principles and challenges of genomewide DNA methylation
analysis. Nat Rev Genet 11(3): 191-203.
Landi MT, Dracheva T, Rotunno M, Figueroa JD, Liu H, Dasgupta A, Mann FE,
Fukuoka J, Hames M, Bergen AW et al. 2008. Gene expression signature
of cigarette smoking and its role in lung adenocarcinoma development and
survival. PLoS One 3(2): e1651.
Larsen JE, Pavey SJ, Passmore LH, Bowman RV, Hayward NK, Fong KM. 2007.
Gene expression signature predicts recurrence in lung adenocarcinoma.
Clin Cancer Res 13(10): 2946-2954.
Leek JT, Scharpf RB, Bravo HC, Simcha D, Langmead B, Johnson WE, Geman
D, Baggerly K, Irizarry RA. 2010. Tackling the widespread and critical
impact of batch effects in high-throughput data. Nat Rev Genet 11(10):
733-739.
Lemjabbar-Alaoui H, Dasari V, Sidhu SS, Mengistab A, Finkbeiner W, Gallup M,
Basbaum C. 2006. Wnt and Hedgehog are critical mediators of cigarette
smoke-induced lung cancer. PLoS One 1: e93.
Li L, Lee KM, Han W, Choi JY, Lee JY, Kang GH, Park SK, Noh DY, Yoo KY,
Kang D. 2010. Estrogen and progesterone receptor status affect genome-
wide DNA methylation profile in breast cancer. Hum Mol Genet 19(21):
4273-4277.
Li N, Grivennikov SI, Karin M. 2011. The unholy trinity: inflammation, cytokines,
and STAT3 shape the cancer microenvironment. Cancer Cell 19(4): 429-
431.
Li Y, Dong X, Yin Y, Su Y, Xu Q, Zhang Y, Pang X, Chen W. 2005. BJ-TSA-9, a
novel human tumor-specific gene, has potential as a biomarker of lung
cancer. Neoplasia 7(12): 1073-1080.
Licchesi JD, Westra WH, Hooker CM, Herman JG. 2008a. Promoter
hypermethylation of hallmark cancer genes in atypical adenomatous
hyperplasia of the lung. Clin Cancer Res 14(9): 2570-2578.
Licchesi JD, Westra WH, Hooker CM, Machida EO, Baylin SB, Herman JG.
2008b. Epigenetic alteration of Wnt pathway antagonists in progressive
glandular neoplasia of the lung. Carcinogenesis 29(5): 895-904.
Liu L, Liao GQ, He P, Zhu H, Liu PH, Qu YM, Song XM, Xu QW, Gao Q, Zhang
Y et al. 2008. Detection of circulating cancer cells in lung cancer patients
with a panel of marker genes. Biochem Biophys Res Commun 372(4):
756-760.
Liu Y, Gao W, Siegfried JM, Weissfeld JL, Luketich JD, Keohavong P. 2007.
Promoter methylation of RASSF1A and DAPK and mutations of K-ras,
p53, and EGFR in lung tumors from smokers and never-smokers. BMC
Cancer 7: 74.
Lofton-Day C, Model F, Devos T, Tetzner R, Distler J, Schuster M, Song X,
Lesche R, Liebenberg V, Ebert M et al. 2008. DNA methylation
biomarkers for blood-based colorectal cancer screening. Clin Chem 54(2):
414-423.
Lopez-Serra P, Esteller M. 2011. DNA methylation-associated silencing of tumor-
suppressor microRNAs in cancer. Oncogene.
Lu TP, Tsai MH, Lee JM, Hsu CP, Chen PC, Lin CW, Shih JY, Yang PC, Hsiao
CK, Lai LC et al. 2010. Identification of a novel biomarker, SEMA5A, for
non-small cell lung carcinoma in nonsmoking women. Cancer Epidemiol
Biomarkers Prev 19(10): 2590-2597.
Maneckjee R, Minna JD. 1990. Opioid and nicotine receptors affect growth
regulation of human lung cancer cell lines. Proc Natl Acad Sci U S A
87(9): 3294-3298.
Marsit CJ, Houseman EA, Christensen BC, Eddy K, Bueno R, Sugarbaker DJ,
Nelson HH, Karagas MR, Kelsey KT. 2006. Examination of a CpG island
methylator phenotype and implications of methylation profiles in solid
tumors. Cancer Res 66(21): 10621-10629.
Maunakea AK, Nagarajan RP, Bilenky M, Ballinger TJ, D'Souza C, Fouse SD,
Johnson BE, Hong C, Nielsen C, Zhao Y et al. 2010. Conserved role of
intragenic DNA methylation in regulating alternative promoters. Nature
466(7303): 253-257.
Mazieres J, He B, You L, Xu Z, Jablons DM. 2005. Wnt signaling in lung cancer.
Cancer Lett 222(1): 1-10.
McCabe MT, Lee EK, Vertino PM. 2009. A multifactorial signature of DNA
sequence and polycomb binding predicts aberrant CpG island
methylation. Cancer Res 69(1): 282-291.
Mehrotra J, Vali M, McVeigh M, Kominsky SL, Fackler MJ, Lahti-Domenici J,
Polyak K, Sacchi N, Garrett-Mayer E, Argani P et al. 2004. Very high
frequency of hypermethylated genes in breast cancer metastasis to the
bone, brain, and lung. Clin Cancer Res 10(9): 3104-3109.
Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, Sivachenko A, Zhang X,
Bernstein BE, Nusbaum C, Jaffe DB et al. 2008. Genome-scale DNA
methylation maps of pluripotent and differentiated cells. Nature 454(7205):
766-770.
Merlo A, Herman JG, Mao L, Lee DJ, Gabrielson E, Burger PC, Baylin SB,
Sidransky D. 1995. 5' CpG island methylation is associated with
transcriptional silencing of the tumour suppressor p16/CDKN2/MTS1 in
human cancers. Nat Med 1(7): 686-692.
Metayer C, Wang Z, Kleinerman RA, Wang L, Brenner AV, Cui H, Cao J, Lubin
JH. 2002. Cooking oil fumes and risk of lung cancer in women in rural
Gansu, China. Lung Cancer 35(2): 111-117.
Monzo M, Brunet S, Urbano-Ispizua A, Navarro A, Perea G, Esteve J, Artells R,
Granell M, Berlanga J, Ribera JM et al. 2006. Genomic polymorphisms
provide prognostic information in intermediate-risk acute myeloblastic
leukemia. Blood 107(12): 4871-4879.
Morandi L, Asioli S, Cavazza A, Pession A, Damiani S. 2007. Genetic
relationship among atypical adenomatous hyperplasia, bronchioloalveolar
carcinoma and adenocarcinoma of the lung. Lung Cancer 56(1): 35-42.
Mountain CF. 1997. Revisions in the International System for Staging Lung
Cancer. Chest 111(6): 1710-1717.
Muggerud AA, Ronneberg JA, Warnberg F, Botling J, Busato F, Jovanovic J,
Solvang H, Bukholm I, Borresen-Dale AL, Kristensen VN et al. 2010.
Frequent aberrant DNA methylation of ABCB1, FOXC1, PPP2R2B and
PTEN in ductal carcinoma in situ and early invasive breast cancer. Breast
Cancer Res 12(1): R3.
Navis AC, van den Eijnden M, Schepens JT, Hooft van Huijsduijnen R,
Wesseling P, Hendriks WJ. 2010. Protein tyrosine phosphatases in glioma
biology. Acta Neuropathol 119(2): 157-175.
Naya FJ, Huang HP, Qiu Y, Mutoh H, DeMayo FJ, Leiter AB, Tsai MJ. 1997.
Diabetes, defective pancreatic morphogenesis, and abnormal
enteroendocrine differentiation in BETA2/neuroD-deficient mice. Genes
Dev 11(18): 2323-2334.
Network CGRA. 2011. Integrated genomic analyses of ovarian carcinoma.
Nature 474(7353): 609-615.
Nguyen CT, Gonzales FA, Jones PA. 2001. Altered chromatin structure
associated with methylation-induced gene silencing in cancer cells:
correlation of accessibility, methylation, MeCP2 binding and acetylation.
Nucleic Acids Res 29(22): 4598-4606.
Nguyen DX, Chiang AC, Zhang XH, Kim JY, Kris MG, Ladanyi M, Gerald WL,
Massague J. 2009. WNT/TCF signaling through LEF1 and HOXB9
mediates lung adenocarcinoma metastasis. Cell 138(1): 51-62.
Niklinska W, Naumnik W, Sulewska A, Kozlowski M, Pankiewicz W, Milewski R.
2009. Prognostic significance of DAPK and RASSF1A promoter
hypermethylation in non-small cell lung cancer (NSCLC). Folia Histochem
Cytobiol 47(2): 275-280.
Nishiyama N, Arai E, Nagashio R, Fujimoto H, Hosoda F, Shibata T, Tsukamoto
T, Yokoi S, Imoto I, Inazawa J et al. 2011. Copy number alterations in
urothelial carcinomas: their clinicopathological significance and correlation
with DNA methylation alterations. Carcinogenesis 32(4): 462-469.
Niu RF, Zhang L, Xi GM, Wei XY, Yang Y, Shi YR, Hao XS. 2007. Up-regulation
of Twist induces angiogenesis and correlates with metastasis in
hepatocellular carcinoma. J Exp Clin Cancer Res 26(3): 385-394.
Niwa Y, Kanda H, Shikauchi Y, Saiura A, Matsubara K, Kitagawa T, Yamamoto
J, Kubo T, Yoshikawa H. 2005. Methylation silencing of SOCS-3 promotes
cell growth and migration by enhancing JAK/STAT and FAK signalings in
human hepatocellular carcinoma. Oncogene 24(42): 6406-6417.
Noguchi M, Morikawa A, Kawasaki M, Matsuno Y, Yamada T, Hirohashi S,
Kondo H, Shimosato Y. 1995. Small adenocarcinoma of the lung.
Histologic characteristics and prognosis. Cancer 75(12): 2844-2852.
Noushmehr H, Weisenberger DJ, Diefes K, Phillips HS, Pujara K, Berman BP,
Pan F, Pelloski CE, Sulman EP, Bhat KP et al. 2010. Identification of a
CpG island methylator phenotype that defines a distinct subgroup of
glioma. Cancer Cell 17(5): 510-522.
Nuovo GJ, Plaia TW, Belinsky SA, Baylin SB, Herman JG. 1999. In situ detection
of the hypermethylation-induced inactivation of the p16 gene as an early
event in oncogenesis. Proc Natl Acad Sci U S A 96(22): 12754-12759.
Ogino S, Kawasaki T, Kirkner GJ, Loda M, Fuchs CS. 2006. CpG island
methylator phenotype-low (CIMP-low) in colorectal cancer: possible
associations with male sex and KRAS mutations. J Mol Diagn 8(5): 582-
588.
Oh HJ, Lee KK, Song SJ, Jin MS, Song MS, Lee JH, Im CR, Lee JO, Yonehara
S, Lim DS. 2006. Role of the tumor suppressor RASSF1A in Mst1-
mediated apoptosis. Cancer Res 66(5): 2562-2569.
Ohm JE, McGarvey KM, Yu X, Cheng L, Schuebel KE, Cope L, Mohammad HP,
Chen W, Daniel VC, Yu W et al. 2007. A stem cell-like chromatin pattern
may predispose tumor suppressor genes to DNA hypermethylation and
heritable silencing. Nat Genet 39(2): 237-242.
Okano M, Bell DW, Haber DA, Li E. 1999. DNA methyltransferases Dnmt3a and
Dnmt3b are essential for de novo methylation and mammalian
development. Cell 99(3): 247-257.
Osada H, Tatematsu Y, Yatabe Y, Nakagawa T, Konishi H, Harano T, Tezel E,
Takada M, Takahashi T. 2002. Frequent and histological type-specific
inactivation of 14-3-3sigma in human lung cancers. Oncogene 21(15):
2418-2424.
Palmisano WA, Divine KK, Saccomanno G, Gilliland FD, Baylin SB, Herman JG,
Belinsky SA. 2000. Predicting lung cancer by detecting aberrant promoter
methylation in sputum. Cancer Res 60(21): 5954-5958.
Pao W, Girard N. 2011. New driver mutations in non-small-cell lung cancer.
Lancet Oncol 12(2): 175-180.
Pao W, Miller V, Zakowski M, Doherty J, Politi K, Sarkaria I, Singh B, Heelan R,
Rusch V, Fulton L et al. 2004. EGF receptor gene mutations are common
in lung cancers from "never smokers" and are associated with sensitivity
of tumors to gefitinib and erlotinib. Proc Natl Acad Sci U S A 101(36):
13306-13311.
Pao W, Miller VA, Politi KA, Riely GJ, Somwar R, Zakowski MF, Kris MG,
Varmus H. 2005a. Acquired resistance of lung adenocarcinomas to
gefitinib or erlotinib is associated with a second mutation in the EGFR
kinase domain. PLoS Med 2(3): e73.
Pao W, Wang TY, Riely GJ, Miller VA, Pan Q, Ladanyi M, Zakowski MF, Heelan
RT, Kris MG, Varmus HE. 2005b. KRAS mutations and primary resistance
of lung adenocarcinomas to gefitinib or erlotinib. PLoS Med 2(1): e17.
Park K, Chung YJ, So H, Kim K, Park J, Oh M, Jo M, Choi K, Lee EJ, Choi YL et
al. 2011a. AGR2, a mucinous ovarian cancer marker, promotes cell
proliferation and migration. Exp Mol Med 43(2): 91-100.
Park SY, Kwon HJ, Lee HE, Ryu HS, Kim SW, Kim JH, Kim IA, Jung N, Cho NY,
Kang GH. 2011b. Promoter CpG island hypermethylation during breast
cancer progression. Virchows Arch 458(1): 73-84.
Pedersen KS, Bamlet WR, Oberg AL, de Andrade M, Matsumoto ME, Tang H,
Thibodeau SN, Petersen GM, Wang L. 2011. Leukocyte DNA methylation
signature differentiates pancreatic cancer patients from healthy controls.
PLoS One 6(3): e18223.
Pfeifer GP, Yoon JH, Liu L, Tommasi S, Wilczynski SP, Dammann R. 2002.
Methylation of the RASSF1A gene in human cancers. Biol Chem 383(6):
907-914.
Pike BL, Greiner TC, Wang X, Weisenburger DD, Hsu YH, Renaud G, Wolfsberg
TG, Kim M, Weisenberger DJ, Siegmund KD et al. 2008. DNA methylation
profiles in diffuse large B-cell lymphoma and their relationship to gene
expression status. Leukemia 22(5): 1035-1043.
Pizzi M, Fassan M, Balistreri M, Galligioni A, Rea F, Rugge M. 2011. Anterior
gradient 2 overexpression in lung adenocarcinoma. Appl
Immunohistochem Mol Morphol.
Politi K, Zakowski MF, Fan PD, Schonfeld EA, Pao W, Varmus HE. 2006. Lung
adenocarcinomas induced in mice by mutant EGF receptors found in
human lung cancers respond to a tyrosine kinase inhibitor or to down-
regulation of the receptors. Genes Dev 20(11): 1496-1510.
Poynter JN, Siegmund KD, Weisenberger DJ, Long TI, Thibodeau SN, Lindor N,
Young J, Jenkins MA, Hopper JL, Baron JA et al. 2008. Molecular
characterization of MSI-H colorectal cancer by MLHI promoter
methylation, immunohistochemistry, and mismatch repair germline
mutation screening. Cancer Epidemiol Biomarkers Prev 17(11): 3208-
3215.
Pulling LC, Divine KK, Klinge DM, Gilliland FD, Kang T, Schwartz AG, Bocklage
TJ, Belinsky SA. 2003. Promoter hypermethylation of the O6-
methylguanine-DNA methyltransferase gene: more common in lung
adenocarcinomas from never-smokers than smokers and associated with
tumor progression. Cancer Res 63(16): 4842-4848.
Ramachandran V, Arumugam T, Wang H, Logsdon CD. 2008. Anterior gradient 2
is expressed and secreted during the development of pancreatic cancer
and promotes cancer cell survival. Cancer Res 68(19): 7811-7818.
Rauch T, Wang Z, Zhang X, Zhong X, Wu X, Lau SK, Kernstine KH, Riggs AD,
Pfeifer GP. 2007. Homeobox gene methylation in lung cancer studied by
genome-wide analysis with a microarray-based methylated CpG island
recovery assay. Proc Natl Acad Sci U S A 104(13): 5527-5532.
Read WL, Page NC, Tierney RM, Piccirillo JF, Govindan R. 2004. The
epidemiology of bronchioloalveolar carcinoma over the past two decades:
analysis of the SEER database. Lung Cancer 45(2): 137-142.
Remmelink M, Mijatovic T, Gustin A, Mathieu A, Rombaut K, Kiss R, Salmon I,
Decaestecker C. 2005. Identification by means of cDNA microarray
analyses of gene expression modifications in squamous non-small cell
lung cancers as compared to normal bronchial epithelial tissue. Int J Oncol
26(1): 247-258.
Renard I, Joniau S, van Cleynenbreugel B, Collette C, Naome C, Vlassenbroeck
I, Nicolas H, de Leval J, Straub J, Van Criekinge W et al. 2010.
Identification and validation of the methylated TWIST1 and NID2 genes
through real-time methylation-specific polymerase chain reaction assays
for the noninvasive detection of primary bladder cancer in urine samples.
Eur Urol 58(1): 96-104.
Risch A, Plass C. 2008. Lung cancer epigenetics and genetics. Int J Cancer
123(1): 1-7.
Robertson KD. 2005. DNA methylation and human disease. Nat Rev Genet 6(8):
597-610.
Rodriguez-Paredes M, Esteller M. 2011. Cancer epigenetics reaches mainstream
oncology. Nat Med 17(3): 330-339.
Rohan S, Tu JJ, Kao J, Mukherjee P, Campagne F, Zhou XK, Hyjek E, Alonso
MA, Chen YT. 2006. Gene expression profiling separates chromophobe
renal cell carcinoma from oncocytoma and identifies vesicular transport
and cell junction proteins as differentially expressed genes. Clin Cancer
Res 12(23): 6937-6945.
Roland JT, Bryant DM, Datta A, Itzen A, Mostov KE, Goldenring JR. 2011. Rab
GTPase-Myo5B complexes control membrane recycling and epithelial
polarization. Proc Natl Acad Sci U S A 108(7): 2789-2794.
Rosenbaum E, Hoque MO, Cohen Y, Zahurak M, Eisenberger MA, Epstein JI,
Partin AW, Sidransky D. 2005. Promoter hypermethylation as an
independent prognostic factor for relapse in patients with prostate cancer
following radical prostatectomy. Clin Cancer Res 11(23): 8321-8325.
Sakakura C, Hasegawa K, Miyagawa K, Nakashima S, Yoshikawa T, Kin S,
Nakase Y, Yazumi S, Yamagishi H, Okanoue T et al. 2005. Possible
involvement of RUNX3 silencing in the peritoneal metastases of gastric
cancers. Clin Cancer Res 11(18): 6479-6488.
Sakamoto H, Shimizu J, Horio Y, Ueda R, Takahashi T, Mitsudomi T, Yatabe Y.
2007. Disproportionate representation of KRAS gene mutation in atypical
adenomatous hyperplasia, but even distribution of EGFR gene mutation
from preinvasive to invasive adenocarcinomas. J Pathol 212(3): 287-294.
Sakthianandeswaren A, Christie M, D'Andreti C, Tsui C, Jorissen RN, Li S,
Fleming NI, Gibbs P, Lipton L, Malaterre J et al. 2011. PHLDA1
expression marks the putative epithelial stem cells and contributes to
intestinal tumorigenesis. Cancer Res 71(10): 3709-3719.
Sartor MA, Dolinoy DC, Jones TR, Colacino JA, Prince ME, Carey TE, Rozek LS.
2011. Genome-wide methylation and expression differences in HPV(+)
and HPV(-) squamous cell carcinoma cell lines are consistent with
divergent mechanisms of carcinogenesis. Epigenetics 6(6): 777-787.
Sasaki Y, Aoki S, Aoki K, Achiwa K, Yama T, Kubota M, Ishikawa D, Mizutani T,
Kunii S, Watanabe K et al. 2009. [Acute pancreatitis associated with the
administration of ceftriaxone in an adult patient]. Nihon Shokakibyo Gakkai
Zasshi 106(4): 569-575.
Satelli A, Rao PS, Thirumala S, Rao US. 2011. Galectin-4 functions as a tumor
suppressor of human colorectal cancer. Int J Cancer 129(4): 799-809.
Sathyanarayana UG, Toyooka S, Padar A, Takahashi T, Brambilla E, Minna JD,
Gazdar AF. 2003. Epigenetic inactivation of laminin-5-encoding genes in
lung cancers. Clin Cancer Res 9(7): 2665-2672.
Sato M, Mori Y, Sakurada A, Fujimura S, Horii A. 1998. The H-cadherin (CDH13)
gene is inactivated in human lung cancer. Hum Genet 103(1): 96-101.
Schabath MB, Wu X, Vassilopoulou-Sellin R, Vaporciyan AA, Spitz MR. 2004.
Hormone replacement therapy and lung cancer risk: a case-control
analysis. Clin Cancer Res 10(1 Pt 1): 113-123.
Schlesinger Y, Straussman R, Keshet I, Farkash S, Hecht M, Zimmerman J,
Eden E, Yakhini Z, Ben-Shushan E, Reubinoff BE et al. 2007. Polycomb-
mediated methylation on Lys27 of histone H3 pre-marks genes for de
novo methylation in cancer. Nat Genet 39(2): 232-236.
Schmidt B, Liebenberg V, Dietrich D, Schlegel T, Kneip C, Seegebarth A,
Flemming N, Seemann S, Distler J, Lewin J et al. 2010. SHOX2 DNA
methylation is a biomarker for the diagnosis of lung cancer based on
bronchial aspirates. BMC Cancer 10: 600.
Schut HA, Snyderwine EG. 1999. DNA adducts of heterocyclic amine food
mutagens: implications for mutagenesis and carcinogenesis.
Carcinogenesis 20(3): 353-368.
Schwarzenbach H, Hoon DS, Pantel K. 2011. Cell-free nucleic acids as
biomarkers in cancer patients. Nat Rev Cancer 11(6): 426-437.
Selamat SA, Galler JS, Joshi AD, Fyfe MN, Campan M, Siegmund KD, Kerr KM,
Laird-Offringa IA. 2011. DNA methylation changes in atypical
adenomatous hyperplasia, adenocarcinoma in situ, and lung
adenocarcinoma. PLoS One 6(6): e21443.
Sellick GS, Barker KT, Stolte-Dijkstra I, Fleischmann C, Coleman RJ, Garrett C,
Gloyn AL, Edghill EL, Hattersley AT, Wellauer PK et al. 2004. Mutations in
PTF1A cause pancreatic and cerebellar agenesis. Nat Genet 36(12):
1301-1305.
Shames DS, Girard L, Gao B, Sato M, Lewis CM, Shivapurkar N, Jiang A, Perou
CM, Kim YH, Pollack JR et al. 2006. A genome-wide screen for promoter
methylation in lung cancer identifies novel methylation markers for multiple
malignancies. PLoS Med 3(12): e486.
Shapiro B, Chakrabarty M, Cohn EM, Leon SA. 1983. Determination of
circulating DNA levels in patients with benign or malignant gastrointestinal
disease. Cancer 51(11): 2116-2120.
Sharp AJ, Stathaki E, Migliavacca E, Brahmachary M, Montgomery SB, Dupre Y,
Antonarakis SE. 2011. DNA methylation profiles of human active and
inactive X chromosomes. Genome Res 21(10): 1592-1600.
Shen L, Toyota M, Kondo Y, Lin E, Zhang L, Guo Y, Hernandez NS, Chen X,
Ahmed S, Konishi K et al. 2007. Integrated genetic and epigenetic
analysis identifies three different subclasses of colon cancer. Proc Natl
Acad Sci U S A 104(47): 18654-18659.
Shimada A, Kano J, Ishiyama T, Okubo C, Iijima T, Morishita Y, Minami Y,
Inadome Y, Shu Y, Sugita S et al. 2005. Establishment of an immortalized
cell line from a precancerous lesion of lung adenocarcinoma, and genes
highly expressed in the early stages of lung adenocarcinoma
development. Cancer Sci 96(10): 668-675.
Shiraishi M, Sekiguchi A, Oates AJ, Terry MJ, Miyamoto Y. 2002. HOX gene
clusters are hotspots of de novo methylation in CpG islands of human lung
adenocarcinomas. Oncogene 21(22): 3659-3662.
Shivakumar L, Minna J, Sakamaki T, Pestell R, White MA. 2002. The RASSF1A
tumor suppressor blocks cell cycle progression and inhibits cyclin D1
accumulation. Mol Cell Biol 22(12): 4309-4318.
Shulenin S, Nogee LM, Annilo T, Wert SE, Whitsett JA, Dean M. 2004. ABCA3
gene mutations in newborns with fatal surfactant deficiency. N Engl J Med
350(13): 1296-1303.
Son JW, Jeong KJ, Jean WS, Park SY, Jheon S, Cho HM, Park CG, Lee HY,
Kang J. 2011. Genome-wide combination profiling of DNA copy number
and methylation for deciphering biomarkers in non-small cell lung cancer
patients. Cancer Lett 311(1): 29-37.
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide studies.
Proc Natl Acad Sci U S A 100(16): 9440-9445.
Subedi N, Scarsbrook A, Darby M, Korde K, Mc Shane P, Muers MF. 2009. The
clinical impact of integrated FDG PET-CT on management decisions in
patients with lung cancer. Lung Cancer 64(3): 301-307.
Subramanian J, Govindan R. 2007. Lung cancer in never smokers: a review. J
Clin Oncol 25(5): 561-570.
Sugimura T. 2000. Nutrition and dietary carcinogens. Carcinogenesis 21(3): 387-
395.
Sun S, Schiller JH, Gazdar AF. 2007. Lung cancer in never smokers--a different
disease. Nat Rev Cancer 7(10): 778-790.
Sunaga N, Imai H, Shimizu K, Shames DS, Kakegawa S, Girard L, Sato M, Kaira
K, Ishizuka T, Gazdar AF et al. 2011. Oncogenic KRAS-induced
interleukin-8 overexpression promotes cell growth and migration and
contributes to aggressive phenotypes of non-small cell lung cancer. Int J
Cancer.
Suzuki M, Shigematsu H, Iizasa T, Hiroshima K, Nakatani Y, Minna JD, Gazdar
AF, Fujisawa T. 2006. Exclusive mutation in epidermal growth factor
receptor gene, HER-2, and KRAS, and synchronous methylation of
nonsmall cell lung cancer. Cancer 106(10): 2200-2207.
Suzuki M, Sunaga N, Shames DS, Toyooka S, Gazdar AF, Minna JD. 2004. RNA
interference-mediated knockdown of DNA methyltransferase 1 leads to
promoter demethylation and gene re-expression in human lung and breast
cancer cells. Cancer Res 64(9): 3137-3143.
Takai D, Jones PA. 2002. Comprehensive analysis of CpG islands in human
chromosomes 21 and 22. Proc Natl Acad Sci U S A 99(6): 3740-3745.
Takamochi K, Ogura T, Suzuki K, Kawasaki H, Kurashima Y, Yokose T, Ochiai
A, Nagai K, Nishiwaki Y, Esumi H. 2001. Loss of heterozygosity on
chromosomes 9q and 16p in atypical adenomatous hyperplasia
concomitant with adenocarcinoma of the lung. Am J Pathol 159(5): 1941-
1948.
Takeyama N, Ano Y, Wu G, Kubota N, Saeki K, Sakudo A, Momotani E, Sugiura
K, Yukawa M, Onodera T. 2009. Localization of insulinoma associated
protein 2, IA-2 in mouse neuroendocrine tissues using two novel
monoclonal antibodies. Life Sci 84(19-20): 678-687.
Tanaka K, Okamoto A. 2007. Degradation of DNA by bisulfite treatment. Bioorg
Med Chem Lett 17(7): 1912-1915.
Tanaka S, Pero SC, Taguchi K, Shimada M, Mori M, Krag DN, Arii S. 2006.
Specific peptide ligand for Grb7 signal transduction protein and pancreatic
cancer metastasis. J Natl Cancer Inst 98(7): 491-498.
Tao R, Li J, Xin J, Wu J, Guo J, Zhang L, Jiang L, Zhang W, Yang Z, Li L. 2011.
Methylation profile of single hepatocytes derived from hepatitis B virus-
related hepatocellular carcinoma. PLoS One 6(5): e19862.
Teodoridis JM, Hardie C, Brown R. 2008. CpG island methylator phenotype
(CIMP) in cancer: causes and implications. Cancer Lett 268(2): 177-186.
Terry MB, Delgado-Cruzata L, Vin-Raviv N, Wu HC, Santella RM. 2011. DNA
methylation in white blood cells: association with risk factors in
epidemiologic studies. Epigenetics 6(7): 828-837.
Tessema M, Belinsky SA. 2008. Mining the epigenome for methylated genes in
lung cancer. Proc Am Thorac Soc 5(8): 806-810.
Thirlwell C, Eymard M, Feber A, Teschendorff A, Pearce K, Lechner M,
Widschwendter M, Beck S. 2010. Genome-wide DNA methylation analysis
of archival formalin-fixed paraffin-embedded tissue using the Illumina
Infinium HumanMethylation27 BeadChip. Methods.
Toh CK, Gao F, Lim WT, Leong SS, Fong KW, Yap SP, Hsu AA, Eng P, Koong
HN, Thirugnanam A et al. 2006. Never-smokers with lung cancer:
epidemiologic evidence of a distinct disease entity. J Clin Oncol 24(15):
2245-2251.
Tomizawa Y, Kohno T, Kondo H, Otsuka A, Nishioka M, Niki T, Yamada T,
Maeshima A, Yoshimura K, Saito R et al. 2002. Clinicopathological
significance of epigenetic inactivation of RASSF1A at 3p21.3 in stage I
lung adenocarcinoma. Clin Cancer Res 8(7): 2362-2368.
Tommasi S, Karm DL, Wu X, Yen Y, Pfeifer GP. 2009. Methylation of homeobox
genes is a frequent and early epigenetic event in breast cancer. Breast
Cancer Res 11(1): R14.
Tommasi S, Kim SI, Zhong X, Wu X, Pfeifer GP, Besaratinia A. 2010.
Investigating the epigenetic effects of a prototype smoke-derived
carcinogen in human cells. PLoS One 5(5): e10594.
Toyooka KO, Toyooka S, Virmani AK, Sathyanarayana UG, Euhus DM,
Gilcrease M, Minna JD, Gazdar AF. 2001a. Loss of expression and
aberrant methylation of the CDH13 (H-cadherin) gene in breast and lung
carcinomas. Cancer Res 61(11): 4556-4560.
Toyooka S, Maruyama R, Toyooka KO, McLerran D, Feng Z, Fukuyama Y,
Virmani AK, Zochbauer-Muller S, Tsukuda K, Sugio K et al. 2003. Smoke
exposure, histologic type and geography-related differences in the
methylation profiles of non-small cell lung cancer. Int J Cancer 103(2):
153-160.
Toyooka S, Tokumo M, Shigematsu H, Matsuo K, Asano H, Tomii K, Ichihara S,
Suzuki M, Aoe M, Date H et al. 2006. Mutational and epigenetic evidence
for independent pathways for lung adenocarcinomas arising in smokers
and never smokers. Cancer Res 66(3): 1371-1375.
Toyooka S, Toyooka KO, Maruyama R, Virmani AK, Girard L, Miyajima K,
Harada K, Ariyoshi Y, Takahashi T, Sugio K et al. 2001b. DNA methylation
profiles of lung tumors. Mol Cancer Ther 1(1): 61-67.
Toyoshima Y, Karas M, Yakar S, Dupont J, Lee H, LeRoith D. 2004. TDAG51
mediates the effects of insulin-like growth factor I (IGF-I) on cell survival. J
Biol Chem 279(24): 25898-25904.
Toyota M, Ahuja N, Ohe-Toyota M, Herman JG, Baylin SB, Issa JP. 1999. CpG
island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S
A 96(15): 8681-8686.
Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger KR, Yatabe Y,
Beer DG, Powell CA, Riely GJ, Van Schil PE et al. 2011. International
Association for the Study of Lung Cancer/American Thoracic
Society/European Respiratory Society international multidisciplinary
classification of lung adenocarcinoma. J Thorac Oncol 6(2): 244-285.
Travis WD, Colby TV, Corrin B, editors. 1999. Histological Typing of Lung
Cancer and Pleural Tumors. WHO International Histological Classification
of Tumors, 3rd edition. Springer, Berlin.
Tripodi D, Quemener S, Renaudin K, Ferron C, Malard O, Guisle-Marsollier I,
Sebille-Rivain V, Verger C, Geraut C, Gratas-Rabbia-Re C. 2009. Gene
expression profiling in sinonasal adenocarcinoma. BMC Med Genomics 2:
65.
Tse LA, Mang OW, Yu IT, Wu F, Au JS, Law SC. 2009. Cigarette smoking and
changing trends of lung cancer incidence by histological subtype among
Chinese male population. Lung Cancer 66(1): 22-27.
Tsou JA, Galler JS, Siegmund KD, Laird PW, Turla S, Cozen W, Hagen JA, Koss
MN, Laird-Offringa IA. 2007. Identification of a panel of sensitive and
specific DNA methylation markers for lung adenocarcinoma. Mol Cancer
6: 70.
Tsou JA, Shen LY, Siegmund KD, Long TI, Laird PW, Seneviratne CK, Koss MN,
Pass HI, Hagen JA, Laird-Offringa IA. 2005. Distinct DNA methylation
profiles in malignant mesothelioma, lung adenocarcinoma, and non-tumor
lung. Lung Cancer 47(2): 193-204.
Ulazzi L, Sabbioni S, Miotto E, Veronese A, Angusti A, Gafa R, Manfredini S,
Farinati F, Sasaki T, Lanza G et al. 2007. Nidogen 1 and 2 gene
promoters are aberrantly methylated in human gastrointestinal cancer. Mol
Cancer 6: 17.
Ulivi P, Zoli W, Calistri D, Fabbri F, Tesei A, Rosetti M, Mengozzi M, Amadori D.
2006. p16INK4A and CDH13 hypermethylation in tumor and serum of
non-small cell lung cancer patients. J Cell Physiol 206(3): 611-615.
Usadel H, Brabender J, Danenberg KD, Jeronimo C, Harden S, Engles J,
Danenberg PV, Yang S, Sidransky D. 2002. Quantitative adenomatous
polyposis coli promoter methylation analysis in tumor tissue, serum, and
plasma DNA of patients with lung cancer. Cancer Res 62(2): 371-375.
Vaissiere T, Hung RJ, Zaridze D, Moukeria A, Cuenin C, Fasolo V, Ferro G,
Paliwal A, Hainaut P, Brennan P et al. 2009. Quantitative analysis of DNA
methylation profiles in lung cancer identifies aberrant DNA methylation of
specific genes and its association with gender and cancer risk factors.
Cancer Res 69(1): 243-252.
van Vlodrop IJ, Niessen HE, Derks S, Baldewijns M, Van Criekinge W, Herman
JG, van Engeland M. 2011. Analysis of promoter CpG island
hypermethylation in cancer: location, location, location! Clin Cancer Res.
Vanderlaag KE, Hudak S, Bald L, Fayadat-Dilman L, Sathe M, Grein J,
Janatpour MJ. 2010. Anterior gradient-2 plays a critical role in breast
cancer cell growth and survival by modulating cyclin D1, estrogen
receptor-alpha and survivin. Breast Cancer Res 12(3): R32.
Vineis P, Alavanja M, Buffler P, Fontham E, Franceschi S, Gao YT, Gupta PC,
Hackshaw A, Matos E, Samet J et al. 2004. Tobacco and cancer: recent
epidemiological evidence. J Natl Cancer Inst 96(2): 99-106.
Walker BA, Wardell CP, Chiecchio L, Smith EM, Boyd KD, Neri A, Davies FE,
Ross FM, Morgan GJ. 2011. Aberrant global methylation patterns affect
the molecular pathogenesis and prognosis of multiple myeloma. Blood
117(2): 553-562.
Wang KK, Liu N, Radulovich N, Wigle DA, Johnston MR, Shepherd FA, Minden
MD, Tsao MS. 2002. Novel candidate tumor marker genes for lung
adenocarcinoma. Oncogene 21(49): 7598-7604.
Wang L, Zhu JS, Song MQ, Chen GQ, Chen JL. 2006. Comparison of gene
expression profiles between primary tumor and metastatic lesions in
gastric cancer patients using laser microdissection and cDNA microarray.
World J Gastroenterol 12(43): 6949-6954.
Wang S, El-Deiry WS. 2003. TRAIL and apoptosis induction by TNF-family death
receptors. Oncogene 22(53): 8628-8633.
Wang Z, Shen D, Parsons DW, Bardelli A, Sager J, Szabo S, Ptak J, Silliman N,
Peters BA, van der Heijden MS et al. 2004. Mutational analysis of the
tyrosine phosphatome in colorectal cancers. Science 304(5674): 1164-
1166.
Watanabe M, Takemasa I, Kaneko N, Yokoyama Y, Matsuo E, Iwasa S, Mori M,
Matsuura N, Monden M, Nishimura O. 2011. Clinical significance of
circulating galectins as colorectal cancer markers. Oncol Rep 25(5): 1217-
1226.
Weisenberger DJ, Campan M, Long TI, Kim M, Woods C, Fiala E, Ehrlich M,
Laird PW. 2005. Analysis of repetitive element DNA methylation by
MethyLight. Nucleic Acids Res 33(21): 6823-6836.
Weisenberger DJ, Siegmund KD, Campan M, Young J, Long TI, Faasse MA,
Kang GH, Widschwendter M, Weener D, Buchanan D et al. 2006. CpG
island methylator phenotype underlies sporadic microsatellite instability
and is tightly associated with BRAF mutation in colorectal cancer. Nat
Genet 38(7): 787-793.
Weisenberger DJ, Trinh BN, Campan M, Sharma S, Long TI, Ananthnarayan S,
Liang G, Esteva FJ, Hortobagyi GN, McCormick F et al. 2008. DNA
methylation analysis by digital bisulfite genomic sequencing and digital
MethyLight. Nucleic Acids Res 36(14): 4689-4698.
Westra WH. 2000. Early glandular neoplasia of the lung. Respir Res 1(3): 163-
169.
Widschwendter M, Fiegl H, Egle D, Mueller-Holzner E, Spizzo G, Marth C,
Weisenberger DJ, Campan M, Young J, Jacobs I et al. 2007. Epigenetic
stem cell signature in cancer. Nat Genet 39(2): 157-158.
Wrage M, Ruosaari S, Eijk PP, Kaifi JT, Hollmen J, Yekebas EF, Izbicki JR,
Brakenhoff RH, Streichert T, Riethdorf S et al. 2009. Genomic profiles
associated with early micrometastasis in lung cancer: relevance of 4q
deletion. Clin Cancer Res 15(5): 1566-1574.
Yagi K, Akagi K, Hayashi H, Nagae G, Tsuji S, Isagawa T, Midorikawa Y,
Nishimura Y, Sakamoto H, Seto Y et al. 2010. Three DNA methylation
epigenotypes in human colorectal cancer. Clin Cancer Res 16(1): 21-33.
Yegnasubramanian S, Haffner MC, Zhang Y, Gurel B, Cornish TC, Wu Z, Irizarry
RA, Morgan J, Hicks J, DeWeese TL et al. 2008. DNA hypomethylation
arises later in prostate cancer progression than CpG island
hypermethylation and contributes to metastatic tumor heterogeneity.
Cancer Res 68(21): 8954-8967.
Yeh HH, Ogawa K, Balatoni J, Mukhapadhyay U, Pal A, Gonzalez-Lepera C,
Shavrin A, Soghomonyan S, Flores L, 2nd, Young D et al. 2011. Molecular
imaging of active mutant L858R EGF receptor (EGFR) kinase-expressing
nonsmall cell lung carcinomas using PET/CT. Proc Natl Acad Sci U S A
108(4): 1603-1608.
Yoshikawa H, Matsubara K, Qian GS, Jackson P, Groopman JD, Manning JE,
Harris CC, Herman JG. 2001. SOCS-1, a negative regulator of the
JAK/STAT pathway, is silenced by methylation in human hepatocellular
carcinoma and shows growth-suppression activity. Nat Genet 28(1): 29-
35.
Yu IT, Chiu YL, Au JS, Wong TW, Tang JL. 2006. Dose-response relationship
between cooking fumes exposures and lung cancer among Chinese
nonsmoking women. Cancer Res 66(9): 4961-4967.
Yuen HF, Chan YP, Wong ML, Kwok WK, Chan KK, Lee PY, Srivastava G, Law
SY, Wong YC, Wang X et al. 2007. Upregulation of Twist in oesophageal
squamous cell carcinoma is associated with neoplastic transformation and
distant metastasis. J Clin Pathol 60(5): 510-514.
Zeger SL, Liang KY. 1992. An overview of methods for the analysis of
longitudinal data. Stat Med 11(14-15): 1825-1839.
Zhang FF, Cardarelli R, Carroll J, Fulda KG, Kaur M, Gonzalez K, Vishwanatha
JK, Santella RM, Morabia A. 2011. Significant differences in global
genomic DNA methylation by gender and race/ethnicity in peripheral
blood. Epigenetics 6(5): 623-629.
Zhang W, Glockner SC, Guo M, Machida EO, Wang DH, Easwaran H, Van
Neste L, Herman JG, Schuebel KE, Watkins DN et al. 2008. Epigenetic
inactivation of the canonical Wnt antagonist SRY-box containing gene 17
in colorectal cancer. Cancer Res 68(8): 2764-2772.
Zhang YW, Miao YF, Yi J, Geng J, Wang R, Chen LB. 2010. Transcriptional
inactivation of secreted frizzled-related protein 1 by promoter
hypermethylation as a potential biomarker for non-small cell lung cancer.
Neoplasma 57(3): 228-233.
Zheng W, Xie D, Cerhan JR, Sellers TA, Wen W, Folsom AR. 2001.
Sulfotransferase 1A1 polymorphism, endogenous estrogen exposure,
well-done meat intake, and breast cancer risk. Cancer Epidemiol
Biomarkers Prev 10(2): 89-94.
Zhong Y, Delgado Y, Gomez J, Lee SW, Perez-Soler R. 2001. Loss of H-
cadherin protein expression in human non-small cell lung cancer is
associated with tumorigenicity. Clin Cancer Res 7(6): 1683-1687.
Zou H, Osborn NK, Harrington JJ, Klatt KK, Molina JR, Burgart LJ, Ahlquist DA.
2005. Frequent methylation of eyes absent 4 gene in Barrett's esophagus
and esophageal adenocarcinoma. Cancer Epidemiol Biomarkers Prev
14(4): 830-834.
APPENDIX A
DNA METHYLATION ANALYSIS OF MIXED
BRONCHIOLOALVEOLAR (BAC) LESIONS
Introduction
While adenocarcinoma in situ (AIS, formerly known as pure bronchioloalveolar
carcinoma or BAC) has a 100% 5-year survival rate (Noguchi et al. 1995), it
accounts for only 4% of lung cancers. This “pure BAC” or AIS lesion must not
have any invasive features, and can be resected completely. As soon as any
invasive component is present, the tumor is termed an adenocarcinoma.
Nevertheless, 20% of lung cancers contain some features with BAC-like histology.
These tumors were collectively called “Mixed BAC” adenocarcinomas, and were
diverse in nature, ranging from tumors with predominantly BAC histology and a
small area of invasion to invasive adenocarcinomas with BAC features only at the
periphery of the tumor (Read et al. 2004). This broad definition is the subject of
active debate, and the uncertainty surrounding it is reflected in the continual
re-classification of lung tumors (Kerr 2009; Travis et al. 2011). The “Mixed BAC”
tumor has most recently been re-classified into more specific and narrow
categories, including minimally invasive adenocarcinoma (MIA) and lepidic
predominant invasive adenocarcinoma, amongst others (Travis et al. 2011).
However, some questions remain: Does the BAC component of these
adenocarcinomas represent a remnant of a precursor AIS lesion? Or is it
actually a segment of adenocarcinoma that has progressed and taken on certain
BAC features? Some evidence indicates that Mixed BACs are remnants of the
precursor lesion AIS. FHIT protein expression in Mixed BAC is more similar to
expression levels in AIS lesions than to those in invasive adenocarcinoma
(Kerr et al. 2004). A comparative genomic hybridization (CGH) study of AIS and
adenocarcinomas with BAC features (AWBF, or Mixed BAC adenocarcinomas)
compared the genomic profiles of AIS lesions with those of both the BAC
component and the invasive component of AWBF lesions (Aviel-Ronen et al. 2008).
The authors concluded that the genomic profiles of AIS and the BAC component of
AWBF were indistinguishable, and additionally found significant differences
between the BAC components and the invasive components of AWBF. Their results
therefore support the hypothesis that the BAC features in mixed adenocarcinomas
may indeed be remnant AIS lesions. However, some pathologists still propose the
possibility that at least some of these BAC components may represent some type
of spread or progression of the invasive adenocarcinoma, based on the varying
histologies of Mixed BACs (Kerr 2009).
Materials and Methods
In this preliminary analysis, we examined the DNA methylation levels of 15 DNA
hypermethylation loci and two global DNA hypomethylation loci in Mixed BAC
tissues in comparison with the AIS and lung adenocarcinomas discussed in
Chapter 2. An important caveat to this pilot study is that these Mixed BAC tissues
still need to be re-evaluated according to the 2011 IASLC guidelines.
Additionally, for subjects with multiple lesions, one lesion was chosen at random
to be included in the statistical analysis in order to avoid issues with
intra-subject correlations. We therefore compared 27 Mixed BAC tissues from 28
subjects with 16 AIS lesions and 50 lung adenocarcinomas using Mann-Whitney tests
and a Bonferroni correction for the two comparisons we were making (Mixed BAC
vs. AIS, Mixed BAC vs. adenocarcinoma). Statistical significance was therefore
set at p<0.025 (0.05 divided by two). DNA methylation data were generated as
discussed in Chapter 2.
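The comparison scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in this study: the function names (mann_whitney_u, compare_groups) are hypothetical, and the p-value is computed with the normal approximation with a tie correction and no continuity correction, which may differ slightly from the exact test reported by statistical packages.

```python
import math

ALPHA = 0.05
N_COMPARISONS = 2                      # Mixed BAC vs. AIS, Mixed BAC vs. adenocarcinoma
THRESHOLD = ALPHA / N_COMPARISONS      # Bonferroni-corrected cutoff, 0.025


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0     # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1                      # size of this group of tied values
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                         # all values identical: no evidence of a shift
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u_x, p


def compare_groups(mixed_bac, ais, adenocarcinoma):
    """Run both comparisons for one locus and flag significance at the corrected cutoff."""
    results = {}
    for label, other in (("Mixed BAC vs. AIS", ais),
                         ("Mixed BAC vs. adenocarcinoma", adenocarcinoma)):
        u, p = mann_whitney_u(mixed_bac, other)
        results[label] = (u, p, p < THRESHOLD)
    return results


if __name__ == "__main__":
    # Made-up methylation values for a single locus, for illustration only.
    demo = compare_groups([40.0, 55.0, 62.0, 71.0],
                          [2.0, 5.0, 9.0, 12.0],
                          [35.0, 50.0, 66.0, 80.0])
    for label, (u, p, significant) in demo.items():
        print(label, u, round(p, 4), significant)
```

Applying the same cutoff to every locus keeps the correction tied to the two questions asked per locus, as described above, rather than to the number of loci tested.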
Results and Discussion
A heatmap showing AIS, Mixed BAC and adenocarcinoma lesions demonstrates
visually that there appears to be more DNA methylation at the 15
hypermethylation loci in Mixed BAC tissues than in AIS (Figure A.1).
There appears to be a range of DNA methylation profiles in the Mixed BAC
tissues, with some lesions having generally higher levels of DNA methylation
than others, parallel to the heterogeneity observed in adenocarcinomas. This
may reflect the breadth of the “Mixed BAC” adenocarcinoma category, which has
since been refined (Travis et al. 2011). Additionally, SAT2, which is still
methylated in AIS, is clearly demethylated in the Mixed BAC tissues,
although ALU-M2 appears to remain highly methylated.
The statistical analysis and corresponding scatterplots of each locus
illustrate that 9/15 of the hypermethylation loci (2C35, CDH13, HOXA1, HOXA11,
NEUROD1, NEUROD2, OPCML, PTPRN2, RASSF1 and TWIST1) appear to be
statistically significantly different between AIS and Mixed BAC, whereas only
2/15 loci (HOXA1 and TMEFF2) are different between Mixed BAC and
adenocarcinoma (Figure A.2). Additionally, the global DNA hypomethylation
indicator, the mean of SAT2 and ALU-M2 loci, is statistically significantly different
between AIS and Mixed BAC, but not between Mixed BAC and adenocarcinoma.
Figure A.1. Heatmap of DNA methylation levels of 15 hypermethylation loci
and two hypomethylation repeat loci in all tissue types. Loci are arranged in
alphabetical order. Dark blue indicates very low levels of DNA methylation,
yellow indicates high levels of DNA methylation, and missing values are indicated
in white. The type of lesion is indicated at the top.
These results indicate that, at least in these DNA methylation data, the Mixed
BAC lesions are more similar to invasive lung adenocarcinoma than to AIS lesions,
contrary to published data on genomic alterations (Aviel-Ronen et al. 2008).
Additionally, although most of the loci do not reach statistical significance, the
two loci that do differ between adenocarcinoma and Mixed BAC, HOXA1 and TMEFF2,
actually have higher median levels of DNA methylation in the Mixed
BAC than in the lung adenocarcinoma, supporting the idea that these BAC
features may represent a more invasive front of the tumor, rather than a remnant
of AIS.
The histological variation in Mixed BAC lesions, as well as the varied DNA
methylation observed in Figure A.1, prompted us to perform an exploratory
clustering analysis of the Mixed BAC lesions, to determine whether there are any
sub-groups within this classification. A hierarchical clustering with bootstrap
resampling was performed (Figure A.3A), and two major sub-groups were found,
with an average silhouette width of 0.53 (Figure A.3B). The key conclusion from
this preliminary clustering analysis is that the Mixed BAC tissues in this study
are heterogeneous. The analysis underscores the importance of re-assessing the
histology of the Mixed BAC tissues included in this study according to the new
IASLC guidelines.
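For readers unfamiliar with the silhouette width quoted above, the following Python sketch shows how the quantity is defined. It is an illustration under stated assumptions and not the code used for Figure A.3B: the function names are hypothetical, Euclidean distance is assumed to match the clustering, and singleton clusters are assigned a width of 0 by convention. For each sample, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette width is (b - a) / max(a, b).

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def silhouette_widths(points, labels):
    """Per-sample silhouette widths s(i) = (b - a) / max(a, b)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)          # convention for a cluster of size one
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


def average_silhouette_width(points, labels):
    """Mean of the per-sample silhouette widths."""
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)


if __name__ == "__main__":
    # Made-up two-locus profiles forming two tight groups, for illustration only.
    profiles = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(average_silhouette_width(profiles, [0, 0, 1, 1]))
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest samples assigned to the wrong cluster.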
Tissues formerly collectively known as Mixed BAC are now divided into
minimally invasive adenocarcinoma (MIA) and lepidic predominant invasive
adenocarcinoma, among others, and the new histological divisions may well be
the driving force for the wide range of DNA methylation profiles of Mixed BACs
observed in this preliminary study, and for the two clusters.
Figure A.2. DNA methylation scatterplots of AIS, Mixed BAC and lung
adenocarcinoma lesions. Asterisks mark statistically significant differences
(Mann-Whitney p<0.025). The interquartile range is marked in red.
This preliminary analysis suggests that at least some of the Mixed BACs
included in this study are more similar to invasive adenocarcinoma than to AIS.
In addition, DNA methylation levels are higher at some loci in
Mixed BAC than in invasive adenocarcinoma, supporting the
hypothesis that these Mixed BAC lesions may be invasive fronts rather than
remnant AIS. However, our analysis also indicates that the Mixed BAC collection
used in this study is heterogeneous, and a more thorough analysis is contingent
on the re-evaluation of the Mixed BAC tissues based on the new, more specific
histological classifications of the 2011 IASLC guidelines.
Figure A.3. Hierarchical clustering of Mixed BAC samples shows at least two
subgroups. (A) Multiscale bootstrap resampling hierarchical clustering
performed with Euclidean distance and Ward clustering. “au” is the
“Approximately Unbiased” p-value computed by multiscale bootstrap resampling,
while “bp” is the “Bootstrap Probability” p-value, computed by normal bootstrap
resampling. Values shown on the edges of the clustering are 1-p-values (%).
Clustering was performed using the package pvclust in R. (B) Silhouette plot of
the clusters found in the Mixed BAC tissues.
APPENDIX B
Table B.1. Top 100 statistically significantly hypermethylated genes in lung adenocarcinoma
Infinium Probe Name   Gene Name   Wilcoxon Rank p-value   Q-value   β-value difference
cg25720804 TLX3 7.55E-23 1.39E-22 0.608
cg08089301 HOXB4 6.93E-20 6.52E-20 0.596
cg14458834 HOXB4 3.44E-25 1.19E-24 0.593
cg12374721 PRAC 6.13E-32 4.12E-30 0.584
cg22881914 NID2 4.25E-20 4.09E-20 0.575
cg26521404 HOXA9 1.21E-30 2.86E-29 0.557
cg00949442 ABCA3 1.32E-18 9.90E-19 0.528
cg16731240 ZNF577 9.96E-17 5.41E-17 0.525
cg06760035 HOXB4 2.45E-17 1.46E-17 0.518
cg07307078 TUBB6 2.01E-20 2.09E-20 0.512
cg22660578 LHX1 4.67E-21 5.65E-21 0.511
cg23290344 NEFM 7.55E-23 1.39E-22 0.508
cg17525406 AJAP1 3.94E-19 3.27E-19 0.502
cg21546671 HOXB4 1.76E-26 1.03E-25 0.494
cg18952647 BNC1 8.95E-25 2.67E-24 0.493
cg23432345 HOXA7 1.37E-22 2.39E-22 0.489
cg25875213 ZNF781 4.27E-21 5.24E-21 0.485
cg04534765 GALR1 5.58E-23 1.07E-22 0.478
cg10303487 DPYS 8.83E-20 8.13E-20 0.473
cg19456540 SIX6 4.41E-24 1.10E-23 0.473
cg01683883 CMTM2 3.16E-24 8.43E-24 0.471
cg22471346 GAS7 1.11E-20 1.24E-20 0.466
cg24199834 POU4F2 4.67E-21 5.65E-21 0.463
cg01354473 HOXA9 3.54E-29 4.92E-28 0.458
cg01295203 PRDM14 6.23E-19 5.02E-19 0.456
cg17241310 BARHL2 1.12E-25 4.62E-25 0.452
cg04330449 NEUROG1 2.80E-17 1.65E-17 0.451
cg26609631 GSX1 6.78E-16 3.20E-16 0.451
cg15520279 HOXD8 8.06E-18 5.19E-18 0.447
cg26128092 WDR8 1.11E-21 1.54E-21 0.446
cg27188703 FAIM2 2.02E-22 3.35E-22 0.444
cg25764191 INA 2.64E-10 6.22E-11 0.443
cg00117172 RUNX3 2.06E-13 7.03E-14 0.442
cg14859460 GRM6 1.54E-26 9.39E-26 0.442
cg03958979 NR2E1 2.01E-20 2.09E-20 0.437
cg09649610 GNG4 1.14E-17 7.18E-18 0.437
cg06263495 ASCL2 5.88E-17 3.29E-17 0.435
cg14991487 HOXD9 7.52E-20 7.02E-20 0.433
cg01335367 C12orf34 8.86E-30 1.49E-28 0.431
cg13878010 ADCY5 4.09E-11 1.06E-11 0.429
cg05436658 PRKCB 5.26E-22 7.78E-22 0.429
cg25942450 TLX3 3.53E-24 9.18E-24 0.429
cg21226224 SOX17 2.69E-25 9.45E-25 0.428
cg26069745 HOXA2 1.37E-22 2.39E-22 0.427
cg12111714 ATP8A2 1.01E-26 6.47E-26 0.426
cg01009664 TRH 5.04E-23 9.73E-23 0.425
cg09516965 PTGDR 1.62E-17 9.95E-18 0.424
cg10660256 BHMT 4.36E-22 6.66E-22 0.421
cg27626299 EVX1 3.97E-26 1.98E-25 0.419
cg05521696 SLC2A14 3.12E-33 4.20E-31 0.418
cg15540820 EOMES 3.28E-34 6.63E-32 0.417
cg03975694 ZNF540 1.79E-23 3.87E-23 0.415
cg10883303 HOXA13 9.18E-13 2.86E-13 0.415
cg08572611 ACTL6B 2.04E-15 8.96E-16 0.411
cg23244913 HCG9 1.34E-15 6.02E-16 0.410
cg06722633 GRIK3 3.28E-21 4.09E-21 0.409
cg00767581 HOXD4 9.46E-31 2.73E-29 0.408
cg19352038 PAX3 1.12E-31 6.47E-30 0.408
cg02245378 CCDC140 1.93E-30 4.32E-29 0.408
cg17965019 HIST1H3J 2.24E-06 3.42E-07 0.408
cg13912117 ADCY8 1.23E-17 7.68E-18 0.407
cg26316946 GRIK2 4.94E-25 1.60E-24 0.407
cg12127282 HOXD4 2.02E-27 1.66E-26 0.406
cg18236477 ATP8A2 3.21E-17 1.87E-17 0.406
cg02164046 SST 1.79E-24 5.06E-24 0.405
cg08118311 SALL3 1.11E-20 1.24E-20 0.404
cg04048259 EDN3 4.25E-18 2.89E-18 0.403
cg03734874 TMEM179 7.81E-19 6.17E-19 0.403
cg24715245 UCHL1 5.28E-15 2.20E-15 0.401
cg12680609 ZFP41 1.43E-20 1.56E-20 0.401
cg20959866 AJAP1 2.38E-25 8.58E-25 0.401
cg14056644 PITX2 1.53E-18 1.14E-18 0.400
cg23207990 SFRP2 1.60E-21 2.15E-21 0.397
cg22341310 ZNF541 1.27E-28 1.55E-27 0.396
cg07536847 PAX7 1.21E-21 1.68E-21 0.393
cg24989962 PTGDR 8.65E-18 5.54E-18 0.393
cg06958829 CHAD 3.91E-20 3.79E-20 0.392
cg01805282 EYA4 6.86E-09 1.37E-09 0.391
cg20723355 FBXO39 1.19E-11 3.29E-12 0.390
cg09643544 ZNF177 2.11E-19 1.82E-19 0.389
cg22815110 FOXD3 2.81E-16 1.41E-16 0.389
cg19988449 BNC1 4.25E-18 2.89E-18 0.389
cg07703401 HBQ1 4.81E-17 2.74E-17 0.388
cg19831575 FGF4 7.07E-14 2.51E-14 0.387
cg02909790 HIST1H3G 6.78E-13 2.15E-13 0.387
cg13398291 SFRP1 1.43E-19 1.26E-19 0.386
cg26416466 MEGF11 1.09E-29 1.76E-28 0.385
cg25574024 IGF2AS 2.47E-16 1.25E-16 0.385
cg09873258 DLK1 9.77E-19 7.54E-19 0.384
cg20291049 POU3F3 4.69E-15 1.97E-15 0.384
cg21296230 GREM1 3.12E-19 2.65E-19 0.384
cg13870866 TBX20 2.69E-25 9.45E-25 0.383
cg23130254 HOXD12 7.95E-25 2.43E-24 0.383
cg20616414 WNK2 3.93E-15 1.67E-15 0.380
cg12880658 CDO1 3.43E-31 1.26E-29 0.380
cg17063929 NOX4 2.52E-28 2.83E-27 0.380
cg18722841 PHOX2A 7.95E-25 2.43E-24 0.377
cg02919422 SOX17 3.68E-27 2.65E-26 0.375
cg02599464 HIST1H4I 4.41E-24 1.10E-23 0.374
cg25047280 HOXA9 2.92E-29 4.37E-28 0.373
Table B.2. Top 100 statistically significantly hypomethylated genes in lung adenocarcinoma
Infinium Probe Name | Gene Name | Wilcoxon Rank p-value | Q-value | β-value difference
cg07014174 KRTAP11-1 3.17E-27 2.37E-26 -0.394
cg23067535 FAM83A 1.27E-25 5.09E-25 -0.361
cg26799474 CASP8 3.97E-26 1.98E-25 -0.345
cg26530341 TNFRSF10A 2.66E-26 1.45E-25 -0.344
cg20837735 SERPINB5 1.02E-20 1.15E-20 -0.342
cg26738880 DPP6 1.50E-31 7.58E-30 -0.341
cg05440289 IVL 1.54E-19 1.35E-19 -0.337
cg16812893 KRTAP15-1 2.02E-27 1.66E-26 -0.334
cg23213217 DEGS1 3.03E-30 6.12E-29 -0.333
cg11206634 SFT2D3 3.68E-18 2.56E-18 -0.330
cg23147597 CEACAM19 6.17E-23 1.16E-22 -0.329
cg14153740 TRYX3 3.91E-20 3.79E-20 -0.328
cg08970694 HBE1 2.02E-22 3.35E-22 -0.322
cg27342801 REG3A 3.94E-24 1.01E-23 -0.322
cg26767897 XDH 2.52E-24 6.87E-24 -0.319
cg24423088 KRTAP8-1 2.98E-28 3.25E-27 -0.315
cg10368842 C10orf81 6.28E-25 1.99E-24 -0.314
cg11204562 C10orf81 7.07E-25 2.19E-24 -0.312
cg25388528 KRTAP20-1 9.32E-28 8.54E-27 -0.308
cg25391023 BTNL2 6.13E-32 4.12E-30 -0.306
cg04574507 CD1B 5.04E-23 9.73E-23 -0.305
cg04947157 TMC6 6.13E-32 4.12E-30 -0.304
cg24512973 MUC1 6.62E-21 7.76E-21 -0.301
cg15606663 KRTAP15-1 2.46E-22 3.95E-22 -0.301
cg00718513 IGKV7-3 3.97E-26 1.98E-25 -0.300
cg11003133 AIM2 4.32E-14 1.58E-14 -0.299
cg02906939 HNMT 7.68E-16 3.62E-16 -0.297
cg02442161 PI3 1.67E-22 2.84E-22 -0.296
cg23732024 LY96 1.31E-11 3.59E-12 -0.294
cg20676475 LCE3D 3.00E-21 3.78E-21 -0.293
cg20916523 VHL 1.91E-21 2.52E-21 -0.293
cg02833180 PLCL1 2.17E-15 9.46E-16 -0.293
cg14826683 SPRR2D 1.17E-23 2.69E-23 -0.293
cg24240626 REG3A 8.34E-23 1.52E-22 -0.291
cg15700197 OR10J1 2.75E-15 1.19E-15 -0.290
cg08256691 RIT1 8.17E-16 3.82E-16 -0.290
cg25072962 MS4A15 1.27E-27 1.12E-26 -0.290
cg16431978 KRTAP13-3 1.33E-21 1.83E-21 -0.289
cg22268164 TRHR 2.96E-18 2.09E-18 -0.288
cg21614638 DAPP1 5.15E-17 2.91E-17 -0.287
cg07525077 RNASE3 6.17E-23 1.16E-22 -0.287
cg24949488 DNTT 4.38E-25 1.47E-24 -0.286
cg18484189 NLRP10 3.47E-26 1.84E-25 -0.286
cg10054857 C18orf20 1.14E-17 7.18E-18 -0.285
cg20312687 DEFB118 3.17E-27 2.37E-26 -0.285
cg12493906 MMP26 1.76E-26 1.03E-25 -0.284
cg02764897 KRTAP13-1 1.11E-24 3.23E-24 -0.284
cg04915566 RUNX1 6.80E-12 1.93E-12 -0.284
cg18462653 DEFB119 6.71E-26 3.01E-25 -0.282
cg24338843 C1orf158 3.18E-18 2.23E-18 -0.278
cg05767404 C1orf150 1.81E-15 7.97E-16 -0.278
cg09868882 GRM8 6.67E-15 2.73E-15 -0.278
cg20018806 TCN1 7.64E-26 3.35E-25 -0.277
cg03872376 ZP4 2.01E-20 2.09E-20 -0.276
cg07456201 UMOD 4.53E-26 2.13E-25 -0.275
cg21201572 AGR2 1.21E-30 2.86E-29 -0.275
cg14992108 SNTB1 6.13E-24 1.46E-23 -0.274
cg20311730 NLRP10 1.91E-21 2.52E-21 -0.272
cg09195271 RNF186 2.06E-18 1.50E-18 -0.272
cg12108912 TMEM177 9.88E-11 2.45E-11 -0.271
cg23413307 LCE1F 4.98E-15 2.08E-15 -0.270
cg03109316 ZNF80 2.09E-21 2.72E-21 -0.269
cg05615150 ARPP-21 1.60E-15 7.14E-16 -0.269
cg21130374 MX2 1.11E-21 1.54E-21 -0.269
cg08659707 PUM2 1.42E-15 6.38E-16 -0.269
cg00187686 TCN1 9.46E-24 2.19E-23 -0.269
cg04057858 T-SP1 4.79E-22 7.19E-22 -0.268
cg10318258 RIPK3 3.97E-26 1.98E-25 -0.268
cg05828624 REG1A 4.53E-26 2.13E-25 -0.266
cg25433648 S100A14 8.95E-25 2.67E-24 -0.264
cg15485859 C1orf116 4.53E-26 2.13E-25 -0.264
cg26390526 FLG 8.91E-15 3.58E-15 -0.264
cg17628717 HECW1 5.82E-13 1.86E-13 -0.264
cg12019109 AZGP1 9.08E-19 7.09E-19 -0.263
cg27622610 OR1G1 1.42E-18 1.06E-18 -0.263
cg23111544 REG1A 8.69E-16 4.05E-16 -0.262
cg22879289 NID1 4.94E-25 1.60E-24 -0.262
cg01119135 C1orf116 3.54E-29 4.92E-28 -0.262
cg02131853 TMEM156 1.06E-28 1.34E-27 -0.261
cg10207745 C12orf36 5.01E-20 4.79E-20 -0.261
cg14333454 SFN 3.17E-27 2.37E-26 -0.261
cg27285599 ZNF750 2.17E-13 7.39E-14 -0.260
cg02168291 CDH13 9.88E-26 4.11E-25 -0.260
cg08319404 THRB 3.53E-24 9.18E-24 -0.259
cg15979932 CUEDC1 1.01E-21 1.42E-21 -0.258
cg15913671 TMEM105 7.95E-28 7.46E-27 -0.258
cg03112433 CDK14 5.82E-13 1.86E-13 -0.258
cg26656452 HABP2 2.13E-28 2.53E-27 -0.257
cg06038133 CORO6 4.67E-21 5.65E-21 -0.256
cg08411049 SERPINB5 1.21E-20 1.34E-20 -0.256
cg14458615 IGSF11 6.05E-10 1.36E-10 -0.256
cg00744433 CXADR 2.52E-28 2.83E-27 -0.255
cg26164184 FCN2 6.13E-24 1.46E-23 -0.255
cg21747271 AIP 2.69E-25 9.45E-25 -0.254
cg14062083 KRTAP13-4 2.45E-23 5.05E-23 -0.254
cg26789453 TMEM116 1.91E-16 9.88E-17 -0.253
cg12878228 PRSS1 6.70E-14 2.39E-14 -0.253
cg06793062 CNTNAP4 1.70E-20 1.80E-20 -0.252
cg06850526 SLC38A10 8.69E-26 3.73E-25 -0.251
cg24244000 GABRG3 4.11E-23 8.13E-23 -0.251
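The β-value difference and Wilcoxon p-value columns of Tables B.1 and B.2 can be illustrated with the short sketch below. It rests on several assumptions that are not stated in these tables: that the β-value difference is the mean tumor β-value minus the mean non-tumor β-value, that the test is an unpaired rank-sum test, and that a normal approximation without tie or continuity correction is acceptable. The β-values are invented toy numbers and the function names are hypothetical. Q-values are not computed here.

```python
import math
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_p(x, y):
    """Two-sided rank-sum p-value using a normal approximation (assumption)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2    # Mann-Whitney U for sample x
    mu = n1 * n2 / 2                            # expected U under the null
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))     # equals 2 * (1 - Phi(|z|))

def beta_difference(tumor, non_tumor):
    """Mean tumor beta-value minus mean non-tumor beta-value (assumption)."""
    return mean(tumor) - mean(non_tumor)

# Hypothetical beta-values for one probe (not data from this study).
toy_tumor = [0.71, 0.65, 0.80, 0.77, 0.69, 0.74]
toy_non_tumor = [0.12, 0.20, 0.15, 0.18, 0.10, 0.22]

diff = beta_difference(toy_tumor, toy_non_tumor)
p_value = rank_sum_p(toy_tumor, toy_non_tumor)
print(round(diff, 3), p_value)
```

Under these assumptions, a positive difference corresponds to a probe that is hypermethylated in tumor (as in Table B.1) and a negative difference to a hypomethylated probe (as in Table B.2).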
APPENDIX C
Table C.1. Genes with positive correlations between DNA methylation and gene
expression (hypermethylated and upregulated; hypomethylated and
downregulated)
Top 10 Hypermethylated and Upregulated genes in lung adenocarcinoma
Gene Name | DNA methylation Q-value | DNA methylation β-value difference | Gene expression Q-value | Gene expression fold change
COMP 6.46E-11 0.331 8.81E-17 2.448
COL5A2 4.15E-20 0.293 8.14E-14 1.429
ACOT11 1.40E-19 0.254 4.57E-15 1.052
EEF1A2 4.09E-14 0.232 2.01E-14 2.901
CELSR3 2.40E-14 0.205 4.20E-20 1.560
SEZ6L2 1.98E-17 0.203 1.20E-22 2.029
Top 10 Hypomethylated and Downregulated genes in lung adenocarcinoma
Gene Name | DNA methylation Q-value | DNA methylation β-value difference | Gene expression Q-value | Gene expression fold change
MS4A15 7.48E-26 -0.290 9.15E-16 -1.448
ZBED2 4.44E-21 -0.243 2.00E-19 -1.391
CLIC3 2.11E-27 -0.235 4.66E-16 -1.754
C1QB 4.15E-20 -0.233 3.69E-17 -1.662
CCL18 2.94E-19 -0.214 2.33E-09 -1.007
MS4A2 2.04E-21 -0.210 1.80E-32 -1.505
FGFBP2 4.11E-15 -0.209 4.05E-22 -1.265
EVI2B 3.22E-05 -0.206 1.69E-24 -1.741
HBB 1.27E-16 -0.203 3.30E-39 -4.149
LY86 5.19E-18 -0.200 8.42E-19 -1.534
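The grouping used in Table C.1 can be expressed as a simple sign comparison between the β-value difference and the fold change. The sketch below is illustrative only: it uses four rows copied from Table C.1 plus one clearly hypothetical discordant entry, and it assumes that the sign of the reported fold change indicates the direction of the expression change. The function name and category labels are not from the original analysis.

```python
def classify(beta_diff, fold_change):
    """Label a gene by the direction of its methylation and expression changes."""
    if beta_diff > 0 and fold_change > 0:
        return "hypermethylated and upregulated"
    if beta_diff < 0 and fold_change < 0:
        return "hypomethylated and downregulated"
    return "not positively correlated"

# (gene, beta-value difference, fold change)
rows = [
    ("COMP", 0.331, 2.448),     # from Table C.1
    ("SEZ6L2", 0.203, 2.029),   # from Table C.1
    ("MS4A15", -0.290, -1.448), # from Table C.1
    ("HBB", -0.203, -4.149),    # from Table C.1
    ("GENE_X", 0.300, -1.500),  # hypothetical discordant example, not real data
]

groups = {gene: classify(beta, fc) for gene, beta, fc in rows}
for gene, label in groups.items():
    print(gene, "->", label)
```

Genes falling into the last category have methylation and expression changes in opposite directions and would not appear in Table C.1.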
APPENDIX D
Table D.1. Top 100 statistically significantly different genes between DNA methylation-based
clusters (Cluster 1-Cluster 2)
Infinium Probe Name | Gene Name | Wilcoxon Rank p-value | Q-value | β-value difference
cg04034767 GRASP 1.36E-08 4.39E-07 0.663
cg07017374 FLT3 3.49E-08 6.80E-07 0.645
cg17872757 FLI1 1.25E-06 6.97E-06 0.597
cg11126134 C13orf33 6.49E-06 2.31E-05 0.595
cg22029275 FAM123A 3.49E-08 6.80E-07 0.585
cg24446548 TWIST1 6.30E-08 1.01E-06 0.571
cg08056146 SOX7 2.50E-07 2.23E-06 0.565
cg06377278 RUNX3 2.47E-06 1.16E-05 0.561
cg18815943 FOXE3 6.14E-10 9.70E-08 0.557
cg18630040 PLA2G7 0.00020055 0.00032389 0.555
cg19332710 RIMS4 4.96E-09 2.80E-07 0.552
cg07080358 CNRIP1 2.20E-08 5.42E-07 0.552
cg04532952 CA4 3.69E-07 3.00E-06 0.547
cg25583174 FGF2 5.27E-06 2.04E-05 0.534
cg08186362 HRH3 1.88E-08 5.11E-07 0.529
cg09516965 PTGDR 6.30E-05 0.000132403 0.528
cg08108311 WNK4 4.75E-07 3.57E-06 0.525
cg17108819 CD8A 2.50E-07 2.23E-06 0.520
cg20357628 PHACTR3 8.40E-08 1.21E-06 0.518
cg02126753 AEBP1 1.04E-10 5.20E-08 0.514
cg13562911 ELOVL2 1.11E-06 6.45E-06 0.512
cg12998491 FAM78A 6.08E-07 4.25E-06 0.505
cg00290506 CNIH3 4.75E-07 3.57E-06 0.500
cg20530314 AGTR1 4.74E-06 1.87E-05 0.497
cg20209009 TBX21 8.76E-07 5.60E-06 0.493
cg03168582 DMRT1 9.75E-06 3.16E-05 0.491
cg12251804 C10orf125 1.08E-05 3.41E-05 0.491
cg10918202 BAALC 0.000937199 0.00109651 0.491
cg12422450 CHGA 3.08E-06 1.36E-05 0.490
cg12686016 HOXA1 8.88E-05 0.000171387 0.489
cg24826867 IRF8 3.69E-07 3.00E-06 0.487
cg23422659 WNT9B 3.44E-06 1.46E-05 0.484
cg20415809 ITGA4 5.78E-05 0.000124002 0.483
cg26195812 DPYSL5 3.08E-06 1.36E-05 0.482
cg13921352 FAM19A4 1.57E-06 8.40E-06 0.480
cg09009111 EMILIN2 0.000185312 0.000306169 0.475
cg25971347 FOXF1 8.29E-09 3.74E-07 0.474
cg26646370 SHD 3.08E-06 1.36E-05 0.473
cg11657808 RYR2 5.27E-06 2.04E-05 0.472
cg10883303 HOXA13 2.99E-08 6.48E-07 0.469
cg09053680 UTF1 1.46E-07 1.65E-06 0.469
cg24662718 VAV3 0.000370278 0.000523587 0.467
cg22036988 SPSB4 0.001073561 0.001217278 0.466
cg26656135 EYA4 1.40E-06 7.69E-06 0.464
cg25094569 WT1 1.36E-08 4.39E-07 0.463
cg22571530 NFASC 7.28E-08 1.11E-06 0.463
cg00243313 IRX4 7.19E-06 2.49E-05 0.459
cg03238797 ADAMTS18 0.000134358 0.000235796 0.459
cg23559331 KCNH4 1.76E-05 4.94E-05 0.458
cg20339230 ST8SIA2 5.45E-08 9.25E-07 0.455
cg22594309 SYT2 0.001311767 0.001433848 0.454
cg07109287 LHX2 8.88E-05 0.000171387 0.451
cg21351994 EMX1 1.46E-07 1.65E-06 0.450
cg17740399 PDX1 0.003371616 0.00308898 0.448
cg10362591 SLC6A2 2.81E-05 7.08E-05 0.448
cg11248413 NEUROG1 8.76E-07 5.60E-06 0.448
cg13297865 ELOVL4 0.000318638 0.000466866 0.446
cg14409941 ADAMTS2 1.57E-06 8.40E-06 0.446
cg26365854 ALX4 4.99E-10 9.70E-08 0.445
cg08185661 SYT9 1.31E-05 3.96E-05 0.444
cg12832649 SPOCK1 1.16E-08 4.06E-07 0.444
cg21359747 ALDH1A3 8.88E-05 0.000171387 0.442
cg13694867 SIM2 1.25E-06 6.97E-06 0.442
cg04988423 ALX4 2.20E-08 5.42E-07 0.442
cg19018097 SNX32 0.000710321 0.000873783 0.442
cg13562542 GPR27 0.000318638 0.000466866 0.441
cg18794577 GRIN3A 0.000134358 0.000235796 0.441
cg08211091 NAT8L 2.20E-08 5.42E-07 0.440
cg07271264 MYOD1 2.99E-08 6.48E-07 0.439
cg09229912 CUX2 2.50E-07 2.23E-06 0.439
cg12373771 CECR6 1.68E-07 1.83E-06 0.437
cg12300353 KCTD8 6.30E-05 0.000132403 0.437
cg22737001 RUNX3 1.19E-05 3.64E-05 0.437
cg14174099 SLC8A3 4.26E-06 1.74E-05 0.435
cg22341104 GFI1 3.44E-06 1.46E-05 0.435
cg12748258 HR 1.25E-06 6.97E-06 0.434
cg26599006 GSC2 3.24E-07 2.75E-06 0.433
cg06954481 GBX2 0.002645386 0.002568123 0.433
cg06905514 CAMK2B 1.76E-06 9.17E-06 0.433
cg21530890 SOX8 1.88E-08 5.11E-07 0.430
cg00090147 GATA4 9.67E-08 1.27E-06 0.428
cg07903918 GABBR2 6.08E-07 4.25E-06 0.427
cg24924779 KCNG1 5.85E-06 2.17E-05 0.427
cg09191327 PRDM12 2.20E-08 5.42E-07 0.425
cg06291867 HTR7 3.79E-11 2.99E-08 0.422
cg20895877 PRIMA1 1.11E-07 1.42E-06 0.422
cg20256494 CABP7 1.60E-08 4.68E-07 0.422
cg09551147 SORCS3 3.44E-06 1.46E-05 0.422
cg03777459 GATA5 1.92E-07 1.97E-06 0.421
cg16652063 SLC13A5 5.38E-07 3.89E-06 0.420
cg04528819 KLF14 0.000343576 0.000495138 0.420
cg08768421 GDA 0.000875061 0.001036088 0.417
cg18338311 TMEM132E 3.70E-05 8.75E-05 0.416
cg26850754 CD8B 6.30E-08 1.01E-06 0.416
cg14312526 FOXL2 2.76E-06 1.25E-05 0.416
cg11428724 PAX7 0.002988929 0.002830309 0.415
cg05028467 SNCB 1.11E-06 6.45E-06 0.413
cg24891133 C13orf33 0.000710321 0.000873783 0.413
cg21309147 STAC2 7.97E-06 2.71E-05 0.412
cg19358442 ALX4 1.36E-08 4.39E-07 0.410
Abstract
Lung cancer accounted for 13% of total cancer cases and 18% of cancer deaths globally in 2008. The combination of increasing smoking prevalence in many developing countries and a long latency period predicts that lung cancer will remain a major world health problem for decades to come. This work focuses on DNA methylation in lung adenocarcinoma, a subtype of lung cancer that is the most prevalent in the United States, as well as the most common amongst women and non-smokers. We built on previously identified DNA methylation early detection markers by delineating the timing of these changes in preneoplastic lesions. We then used genome-scale DNA methylation profiling to identify novel potential blood-based non-small cell lung cancer biomarkers, and integrated DNA methylation information with gene expression data to identify DNA methylation changes that may lead to functional consequences in the development of cancer. Additionally, we identified a DNA methylation based subgroup of lung adenocarcinoma that is associated with KRAS mutations and smoking status, as well as explored the use of paraffin-embedded tissues to facilitate larger genome-scale DNA methylation studies. Our findings provide insights into the roles that DNA methylation may play in the development of lung adenocarcinoma, as well as potential DNA methylation markers for the early detection or risk assessment of the disease.
Asset Metadata
Creator: Selamat, Suhaida Adura (author)
Core Title: DNA methylation changes in the development of lung adenocarcinoma
School: Keck School of Medicine
Degree: Doctor of Philosophy
Degree Program: Genetic, Molecular and Cellular Biology
Publication Date: 04/05/2012
Defense Date: 04/05/2012
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: DNA methylation, epigenetics, lung adenocarcinoma, lung cancer, OAI-PMH Harvest
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Laird-Offringa, Ite A. (committee chair); Borok, Zea (committee member); Siegmund, Kimberly (committee member)
Creator Email: selamat@usc.edu,suhaidaadura@yahoo.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-2645
Unique identifier: UC11288283
Identifier: usctheses-c3-2645 (legacy record id)
Legacy Identifier: etd-SelamatSuh-569.pdf
Dmrecord: 2645
Document Type: Dissertation
Rights: Selamat, Suhaida Adura
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA