The Influence of DNA Repair Genes and Prenatal Tobacco Exposure
on Childhood Acute Lymphoblastic Leukemia Risk:
A Gene-Environment Interaction Study
by
XINRAN WANG
A Thesis Presented to the
FACULTY OF THE USC KECK SCHOOL OF MEDICINE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(BIOSTATISTICS)
May 2024
Copyright 2024 XINRAN WANG
ACKNOWLEDGEMENTS
This research was funded in part by a grant from the National Institute of Environmental Health
Sciences (T32 ES013678, C. Zhong and W.J. Gauderman), supporting C. Zhong’s contributions,
and a grant from the Tobacco-Related Disease Research Program (grant no. 27IR-0032, J.L.
Wiemels), which underwrote the tobacco biomarker laboratory efforts. The CCRLP GWAS study
benefited from funding by R01CA155461 (J. Wiemels and X. Ma), essential for acquiring genetic
data. The CCLS received support from R01ES009137 and R24ES028524 (C. Metayer) for patient
recruitment, data collection, and genotyping. Our team extends appreciation to the Center for
Advanced Research Computing at the University of Southern California for their invaluable
computing resources (https://carc.usc.edu), significantly contributing to this research. We
utilized biospecimens and data from the California Biobank Program (SIS request #1380) and are
grateful to Robin Cooley and Steven Graham for their assistance in biospecimen retrieval. The
application of AHRR in determining smoking status is protected under US Patent 8,637,652, US
Patent 9,273,358, and other pending applications. This study also relied on cancer incidence data
supported by the California Department of Public Health, part of the statewide cancer-reporting
program as mandated by California Health and Safety Code Section 103885.
We also gratefully acknowledge the Genotype-Tissue Expression (GTEx) Project, facilitated by the
Common Fund of the Office of the Director of the National Institutes of Health, with contributions
from NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The valuable data employed in this manuscript
were sourced from the GTEx Portal on [22/01/2024] and/or dbGaP accession number
phs000424.vN. pN on [22/01/2024]. The support and resources from the GTEx Project, providing
gene expression level data crucial for our analyses, are deeply appreciated.
I would like to express my sincere gratitude to my thesis advisor, Dr. Joseph Wiemels, for his
continuous support of my research and for his patience, motivation, and immense knowledge.
His guidance helped me throughout the research and writing of this thesis. I could not have
imagined having a better advisor and mentor for my research.
I would like to thank the rest of my thesis committee, Dr. William Gauderman and Dr. Nicholas
Mancuso, not only for their insightful comments and encouragement but also for the challenging
questions which prompted me to widen my research from various perspectives.
My sincere thanks also go to Dr. Charlie Zhong, Dr. Xiaomei Ma and Dr. Catherine Metayer who
provided me with the necessary resources and support to complete this research.
Last but not least, I would like to thank my family: my parents Sihai Hu and Hao Wang, for
giving birth to me in the first place and supporting me spiritually throughout the writing of this
thesis and my life.
Xinran Wang
March 2024
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
Chapter 1: Introduction
  1.1 Current Status of Childhood ALL Epidemiology
  1.2 Introduction of Studies/Datasets
  1.3 Research Objectives
Chapter 2: Study Methods
  2.1 Study Design and Population
  2.2 Exposure Assessment
  2.3 Statistical Approach-MinP Method
Chapter 3: Permutation Results
  3.1 Key Gene in Gene-Environment Interaction Analysis
  3.2 Pathway-Level Difference Between Two Ancestries
Chapter 4: Discussion
  4.1 Summary of Findings
  4.2 Previous Research
  4.3 Contributions
  4.4 Limitations
  4.5 Future Research Directions
References
Appendix
  Bulk Tissue Gene Expression Figures For Top Significant Genes
LIST OF TABLES
Table 1: Descriptive Statistics of Children with ALL and Controls from the CCRLP
Table 2: Top Significant Genes in Gene-Environment Association Permutation Test on ALL among the Whole CCRLP Dataset
Table 3: Top Significant Genes in Gene-Environment Association Permutation Test among Latino Group
Table 4: Top Significant Genes in Gene-Environment Association Permutation Test among non-Latino White Group
Table 5: Top Significant Genes' Expression Levels in Tissues Measured by Transcripts Per Million (TPM) 11,12
Supplementary Table 1: 10K-Permutation Test P-Values and the Corresponding Pathways for Human DNA Repair Genes (Latino Group - SNP Level Results)
Supplementary Table 2: 10K-Permutation Test P-Values and the Corresponding Pathways for Human DNA Repair Genes (non-Latino White Group - SNP Level Results)
Supplementary Table 3: 10K-Permutation Test P-Values and the Corresponding Pathways for Human DNA Repair Genes (Whole CCRLP Dataset - SNP Level Results)
LIST OF FIGURES
Figure 1: 10k Permutation-Adjusted P-values of Genes by Pathways among Latinos
Figure 2: 10k Permutation-Adjusted P-values of Genes by Pathways among Non-Latino Whites
Figure 3: 10k Permutation-Adjusted P-values of Genes by Pathways among the Whole CCRLP Sample Set
ABSTRACT:
The relationship between maternal smoking during gestation and childhood acute lymphoblastic
leukemia (ALL) remains incompletely understood. This study seeks to explore how genetic
susceptibility in DNA repair mechanisms interacts with environmental exposure to tobacco
smoke to increase risk for ALL in children. Our investigation used data from the California
Childhood Cancer Record Linkage Project (CCRLP) and employed logistic regression and MinP
tests. Our analysis revealed significant interactions between maternal tobacco smoking and
DNA repair genes such as DCLRE1A (P = 0.0087), ERCC1 (P = 0.0150), and GTF2H5 (P = 0.0155) that
affected childhood ALL risk. Furthermore, Latino populations demonstrated notable interactions
in Homologous Recombination pathways, while non-Latino White populations showed notable
interactions in Base Excision Repair and Nucleotide Excision Repair pathways. This study
highlights the significance of DNA repair genes when considering environmental exposure to
tobacco smoke, suggesting that genetic variation within these pathways could impact risks of ALL
in children exposed to environmental tobacco smoke during gestation.
Keywords: Maternal Smoking, Childhood Acute Lymphoblastic Leukemia (ALL), DNA Repair
Genes, Genetic Susceptibility, Environmental and Genetic Interaction, Childhood Cancer
Chapter 1: Introduction
1.1 Current Status of Childhood ALL Epidemiology
Acute lymphoblastic leukemia (ALL) is the most frequently diagnosed cancer among
children, 1 a disease resulting from an intricate interplay of genetic susceptibilities and
environmental exposures that shape disease risk and progression. 2 While the exact
etiology remains elusive, emerging evidence points toward maternal exposure to tobacco
smoke during gestation as one potential environmental contributor. 2 Although maternal
tobacco use is not considered a major risk factor for ALL overall, some exposed individuals
may be susceptible to this modifiable exposure because of genetic variation within DNA
repair pathways. 3
DNA repair mechanisms are key in correcting damage caused by environmental
factors, such as tobacco smoke.4 Cigarette smoke contains numerous carcinogens that can
induce DNA lesions, which affect individual susceptibility to carcinogenesis. Some single
nucleotide polymorphisms (SNPs) in DNA repair genes are known to increase susceptibility.4
Therefore, studying gene-environment interactions, particularly between maternal tobacco
exposure and DNA repair SNPs, is of great relevance.
1.2 Introduction of Studies/Datasets
The California Childhood Cancer Record Linkage Project (CCRLP) is a linkage-based
research effort investigating childhood cancer cases within California, USA. Cases of ALL
diagnosed in children aged 0-14 between 1988 and 2011 were identified from the California
Cancer Registry and linked with birth records from 1982 to 2009 held by the Office of Vital
Records of the California Department of Public Health (CDPH); the linked records were then
analyzed. Controls were chosen from birth records and matched with cases according to year
and month of birth, gender, mode of delivery, and self-reported race/ancestry/ethnicity group
(Latino, non-Latino White, or Other). Birth records were also reviewed to gather
additional details such as gestational age, birthweight, and mode of delivery. To facilitate
newborn genetic testing in California-born children, the CDPH Genetic Diseases Screening
Branch collects dried blood spots (DBSs). Since 1982, these DBSs have been collected,
archived, and made available for research, and they were used here for genetic analysis
and tobacco exposure assessments.
The California Childhood Leukemia Study (CCLS) was an active-recruitment case-control investigation of childhood leukemia across California from 1995 through 2014.5
Patients were identified via pediatric cancer hospitals across California (excluding CCRLP
cases) using birth records from CDPH; one or two control participants for each case identified
were then selected through the birth registry based on birthdate, sex, Latino ethnicity, and
maternal ancestry criteria. Institutional Review Boards at California Health and Human
Services Agency, University of California Berkeley, USC, and Yale approved these two studies.
The California Childhood Cancer Record Linkage Project (CCRLP) offers a genome-wide
association study (GWAS) dataset which facilitates exploration of gene-environment (GxE)
interactions across diverse populations. We used CCRLP leukemia GWAS data in this analysis
and restricted our GxE interaction analysis to DNA repair SNPs to understand whether tobacco
exposure alters leukemia risk according to DNA repair SNP status. This focus provides insight
into the mechanistic pathways behind ALL susceptibility and reduces the overall number of
comparisons, giving better statistical power than a genome-wide analysis.
1.3 Research Objectives
We evaluated single nucleotide polymorphisms (SNPs) individually and also at the
gene level using the MinP approach. MinP is a statistical method which uses the smallest
observed p-value across the SNPs in a gene to evaluate the gene-level significance of GxE
interactions. It is combined with permutation testing to generate robust null distributions
for these MinP statistics and establish more precise significance thresholds.6
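The MinP idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's actual pipeline: the function name `minp_gene_pvalue`, the input layout (one list of per-SNP p-values per permuted dataset), and the "+1" small-sample correction are all assumptions made for the example.

```python
def minp_gene_pvalue(observed_snp_pvalues, permuted_snp_pvalues):
    """Permutation-adjusted gene-level p-value based on the minimum SNP p-value.

    observed_snp_pvalues: per-SNP interaction p-values for one gene (observed data).
    permuted_snp_pvalues: one list of per-SNP p-values per permuted dataset
                          (hypothetical layout chosen for this sketch).
    """
    if not observed_snp_pvalues or not permuted_snp_pvalues:
        raise ValueError("need at least one SNP p-value and one permutation")

    # MinP statistic for the observed data: the smallest SNP-level p-value.
    observed_min = min(observed_snp_pvalues)

    # Null distribution: the same statistic recomputed on every permuted dataset.
    null_mins = [min(row) for row in permuted_snp_pvalues]

    # Count permutations whose minimum is at least as extreme as the observed one.
    n_as_extreme = sum(1 for null_min in null_mins if null_min <= observed_min)

    # The "+1" correction (an assumption of this sketch) keeps the estimate above zero.
    return (n_as_extreme + 1) / (len(null_mins) + 1)


# Toy example with three SNPs and four permutations (made-up numbers).
observed = [0.04, 0.20, 0.01]
permuted = [
    [0.30, 0.005, 0.60],
    [0.50, 0.70, 0.90],
    [0.02, 0.40, 0.10],
    [0.01, 0.80, 0.25],
]
print(minp_gene_pvalue(observed, permuted))  # (2 + 1) / (4 + 1) = 0.6
```

Because the null distribution is built from the minimum over all SNPs in the gene, a gene with many SNPs is not favored simply for having more chances to produce a small p-value.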
This study seeks to bridge the knowledge gap regarding the complex relationships
among tobacco exposure, DNA repair genetic variations and leukemia risk. Through carefully
conducted statistical analyses and biological interpretation of interaction effects, our goal is
to advance personalized medicine by contributing knowledge that could inform risk
assessments, prevention strategies and therapeutic interventions within pediatric oncology.
Chapter 2: Study Methods
2.1 Study Design and Population
Our study utilized the California Childhood Cancer Record Linkage Project (CCRLP), for
its large sample size and reduced selection bias arising from enrollment without consent
requirements. Analysis focused exclusively on non-Latino Whites and Latinos to minimize
potential confounding due to population stratification (Table 1).
Table 1: Descriptive Statistics of Children with ALL and Controls from the CCRLP
Characteristics                        Cases (N, %)     Controls (N, %)
Sex
  Male                                 1,680 (55.6)     1,809 (55.9)
  Female                               1,340 (44.4)     1,429 (44.1)
Ancestry
  Non-Latino White                     1,019 (33.8)     1,066 (32.9)
  Latino                               1,622 (53.7)     1,771 (54.7)
  Other                                379 (12.5)       401 (12.4)
Mode of delivery
  Vaginal                              2,277 (75.4)     2,498 (77.1)
  C-section                            743 (24.6)       740 (22.9)
Birthweight (grams, mean (SD))         3,426 (526)      3,389 (534)
Gestational age (weeks, mean (SD))     38.9 (2.0)       38.9 (2.1)
2.2 Exposure Assessment
Prenatal tobacco exposure was assessed using data from the California Childhood
Leukemia Study (CCLS). Levels of exposure were measured with a validated biomarker of
maternal tobacco exposure during pregnancy: DNA methylation at the cg05575921 locus of
the AHRR gene in newborn dried blood spots, detected via Droplet Digital PCR (ddPCR) 7,8
or, specifically for the CCLS study, the Illumina 450K or EPIC 850K arrays 9. The extent
of hypomethylation of this CpG is linearly associated with the level of tobacco intake as
assessed by cotinine levels in cord blood and self-reported tobacco use during pregnancy. A
linear regression model developed from this dataset estimated the linear coefficient relating
tobacco exposure to fractional abundance (the ratio of AHRR cg05575921 cytosine
to cg05575921 methyl-cytosine) for analysis purposes.
2.3 Statistical Approach-MinP Method
This study aims to evaluate the interactive influences between prenatal tobacco
exposure and DNA repair genes on childhood ALL risk. For each ancestry group in the CCRLP
dataset, logistic regression models with and without a GxE interaction term were fitted,
adjusting for covariates such as gender, birthweight, gestational age, mode of delivery, and
the first 3 GWAS principal components. Then 1-degree-of-freedom likelihood ratio tests were
performed to obtain p-values for the GxE interaction term of each DNA repair SNP within the
Human DNA Repair Genes list 10. MinP 10K-permutation tests were used to assess each gene's
interaction with prenatal tobacco exposure by computing the minimum p-value across the SNPs
within the gene and comparing this minimum with its null distribution derived by permutation
testing. Permutation-adjusted p-values represent the strength of evidence for gene-environment
interaction on ALL risk while accounting for the multiple comparisons performed across SNPs.
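The 1-degree-of-freedom likelihood ratio test described above compares the maximized log-likelihoods of two nested logistic models, one with and one without the SNP-by-exposure product term. The sketch below assumes those two log-likelihoods have already been obtained from some model-fitting routine; the function name and the example values are hypothetical. It relies on the identity that the upper tail of a chi-square distribution with 1 degree of freedom at x equals erfc(sqrt(x / 2)).

```python
import math


def lrt_interaction_pvalue(loglik_without_gxe, loglik_with_gxe):
    """P-value of a 1-df likelihood ratio test for a single interaction term.

    loglik_without_gxe: maximized log-likelihood of the model without the GxE term.
    loglik_with_gxe:    maximized log-likelihood of the model with the GxE term.
    Both inputs are assumed to come from models fitted to the same subjects
    with the same covariates.
    """
    # Likelihood ratio statistic: twice the gain in log-likelihood.
    lr_statistic = 2.0 * (loglik_with_gxe - loglik_without_gxe)

    # Guard against tiny negative values caused by numerical optimization error.
    lr_statistic = max(lr_statistic, 0.0)

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(lr_statistic / 2.0))


# Hypothetical log-likelihoods for one SNP (illustrative values only).
p_value = lrt_interaction_pvalue(-2050.0, -2046.5)
print(f"interaction p-value: {p_value:.4f}")
```

The per-SNP p-values produced this way are the inputs that a gene-level MinP permutation test would then summarize.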
Chapter 3: Permutation Results
3.1 Key Gene in Gene-Environment Interaction Analysis
Through an analysis of California Childhood Cancer Record Linkage Project (CCRLP)
leukemia GWAS data, our study identified notable gene-environment interactions
related to risk for childhood acute lymphoblastic leukemia (ALL). Using systematic
permutation testing, we identified genes in which variation appears to interact significantly
with maternal tobacco smoke exposure in the whole sample, potentially altering risk for
ALL (Table 2 and Supplementary Tables 1-3).
Table 2: Top Significant Genes in Gene-Environment Association Permutation Test on ALL among the Whole
CCRLP Dataset
Gene       P-Value   Pathway                            Leading SNP   Estimate
DCLRE1A    0.0087    Editing and processing nucleases   rs7084835     -0.286140421
ERCC1      0.0150    Nucleotide excision repair (NER)   rs1005165     -0.400018226
GTF2H5     0.0155    Nucleotide excision repair (NER)   rs644637       0.355380289
XPC        0.0254    Nucleotide excision repair (NER)   rs3731127      1.019787760
ALKBH3     0.0281    Direct reversal of damage          rs1554316      0.336433804
PARK7      0.0480    Modulation of nucleotide pools     rs161799       0.301574728
Table 3: Top Significant Genes in Gene-Environment Association Permutation Test among Latino Group
Gene        P-Value   Pathway                                                               Leading SNP   Estimate
RECQL       0.0001    Other identified genes with known or suspected DNA repair function   rs2284392     -0.592573192
PDS5B       0.0031    Homologous recombination                                              rs1324414     -0.456592615
BARD1       0.0035    Homologous recombination                                              rs10176759     0.489299719
POLQ        0.0074    DNA polymerases                                                       rs1381057      0.449814384
RAD51-AS1   0.0272    Homologous recombination                                              rs8031427     -0.290150370
Table 4: Top Significant Genes in Gene-Environment Association Permutation Test among non-Latino White
Group
Gene      P-Value   Pathway                                Leading SNP   Estimate
TDG       0.0001    Base excision repair (BER)             rs4135060     -1.016086859
SETMAR    0.0038    Chromatin Structure and Modification   rs13099918    -0.552570081
MLH3      0.0091    Mismatch excision repair (MMR)         rs2098251     -0.421344079
POLD3     0.0143    DNA polymerases                        rs6592590     -0.503638751
GTF2E2    0.0242    Nucleotide excision repair (NER)       rs16877238    -0.44336597
SPIDR     0.0250    Homologous recombination               rs12541086     0.429481566
FANCI     0.0261    Fanconi anemia                         rs649656      -0.330563927
PRKDC     0.0274    Non-homologous end-joining             rs10109984     0.420063508
EXO5      0.0291    Editing and processing nucleases       rs12068587     0.426101143
RAD50     0.0337    Homologous recombination               rs3798134     -0.672021468
CHAF1A    0.0391    Chromatin Structure and Modification   rs243375       0.442594566
RAD23B    0.0417    Nucleotide excision repair (NER)       rs1323809      0.522966275
Table 5: Top Significant Genes' Expression Levels in Tissues Measured by Transcripts Per Million (TPM) 11,12
Expression in Tissues (median TPM)
Gene       Lung        EBV-Transformed Lymphocytes   Whole Blood
DCLRE1A    4.5880      17.8600                       0.5719
ERCC1      33.5800     24.8800                       11.9100
GTF2H5     5.5400      6.4320                        0.9816
XPC        36.8300     54.7000                       13.2300
ALKBH3     17.3000     26.7900                       4.1270
PARK7      164.3000    303.8000                      61.4800
The results of the combined meta-analyses using the Latino and non-Latino White data
sets show P-values and beta estimates of gene-environment interaction in both populations.
They are calculated via the minP method 6 and adjusted by 10,000 permutation tests. Table 2
lists the top 6 genes, the pathways they belong to, and the leading DNA repair SNPs for the
whole population. Tissue-specific gene expression levels for the top 6 genes were obtained
from the Genotype-Tissue Expression (GTEx) project for explanatory purposes. These expression
levels indicate that the top genes are relevant to tobacco-exposed tissue (lung) and to the
target tissue (lymphoblasts) in particular (Table 5). For ancestry-specific top genes, Table 3
and Table 4 give the 10,000-permutation test results for Latinos and non-Latino Whites,
respectively.
10K-permutation test results and the original observed p-values for the interaction of each
DNA repair SNP with tobacco exposure are provided for the Latino group, the non-Latino White
group and the whole CCRLP dataset (Supplementary Tables 1, 2 and 3, respectively). The
supplementary tables give the 10,000-permutation test P-values, at both the gene and SNP
levels, for interaction with prenatal tobacco exposure on ALL risk.
DCLRE1A, a gene in the editing and processing nucleases pathway, showed the strongest
interaction in our permutation test (P = 8.7 × 10^-3, Table 2). Its expression differed
markedly between EBV-transformed lymphocytes (the GTEx cell type most closely related to
pre-B cells; median TPM = 17.8600, Table 5) and whole blood (median TPM = 0.5719, Table 5)
samples, suggesting its activity may be relevant to carcinogenic processes stimulated by
tobacco exposure. Such responsiveness may play an integral part in repairing complex DNA
damage, which forms part of the defense mechanisms against cancer formation caused by
tobacco-related toxins.13
ERCC1 (P = 1.50 × 10^-2, Table 2), an essential gene in the Nucleotide Excision Repair
(NER) pathway, showed substantial expression in lung tissue (median TPM = 33.5800, Table 5),
suggesting a role in repair as well as in modulating immune responses against tobacco smoke.
This gene plays an integral part in repairing DNA damage caused by smoking and in protecting
cells against its mutagenic effects.14 GTF2H5, an essential element of the NER pathway
(P = 1.55 × 10^-2, Table 2), was more abundantly expressed in lung tissue (median TPM =
5.5400, Table 5) and EBV-transformed lymphocytes (median TPM = 6.4320, Table 5) than in whole
blood (median TPM = 0.9816, Table 5), suggesting its involvement in early damage recognition
and repair processes. GTF2H5 may play an early role in NER by helping remove DNA adducts
produced by tobacco carcinogens, thus helping prevent the accumulation of genetic damage.15
XPC, another gene in the NER pathway (P = 2.54 × 10^-2, Table 2), was highly expressed in
EBV-transformed lymphocytes (median TPM = 54.7000, Table 5) and lung tissue (median TPM =
36.8300, Table 5), suggesting its significance for identifying tobacco-caused DNA damage.
Damage verification is an integral component of the NER process in cells exposed to tobacco
smoke, which significantly increases the DNA damage burden.16
ALKBH3 (P = 2.81 × 10^-2, Table 2) acts in the direct reversal of damage pathway, a
distinct mechanism that repairs alkylated DNA. ALKBH3 was substantially expressed in
EBV-transformed lymphocytes (median TPM = 26.7900, Table 5), suggesting that its expression
may provide adaptive responses against environmental tobacco smoke-induced oxidative
stress. By directly reversing alkylation damage, ALKBH3 helps maintain genomic stability
and the integrity of genetic information.17
PARK7 (P = 4.8 × 10^-2, Table 2), a gene within the modulation of nucleotide pools
pathway, stands out for its expression levels in EBV-transformed lymphocytes (median TPM =
303.8000, Table 5) and whole blood (median TPM = 61.4800, Table 5). It plays an essential
role in cell homeostasis and in the repair of oxidative lesions induced by tobacco exposure,
and its modulation of nucleotide pools is essential to the repair of damaged guanines in DNA.18
These findings shed light on the intricate relationships between genetic factors related
to DNA repair pathways and the environmental challenge presented by tobacco smoke, and
they highlight the potential modulatory effects of these genetic factors on ALL risk.
The Nucleotide Excision Repair (NER) pathway was the pathway most often represented
among the top genes; ERCC1, GTF2H5, and XPC were notable for their expression levels
in lung tissue and EBV-transformed lymphocytes. Although PARK7 does not belong to this
pathway, its expression levels, the highest among the top genes in EBV-transformed
lymphocytes (median TPM = 303.8000, Table 5) and whole blood (median TPM = 61.4800,
Table 5), could potentially act as biomarkers of tobacco exposure's impact on immunity.
DCLRE1A, an editing and processing nuclease, also had relatively high expression in
EBV-transformed lymphocytes, suggesting a role in DNA repair following tobacco exposure.
These findings expand our knowledge of the molecular interactions between genetic
factors and environmental exposure to tobacco, with particular focus on Nucleotide Excision
Repair pathway as having potential significance with respect to tobacco exposure. Therefore,
further investigation must be undertaken to better understand its implications for ALL risk
and progression.
3.2 Pathway-Level Difference Between Two Ancestries
Figure 1: 10k Permutation-Adjusted P-values of Genes by Pathways among Latinos
Figure 2: 10k Permutation-Adjusted P-values of Genes by Pathways among Non-Latino Whites
Figure 3: 10k Permutation-Adjusted P-values of Genes by Pathways among the whole CCRLP Sample Set
The pathway figures illustrate the relationship between DNA repair genes and
maternal tobacco exposure in shaping risk for ALL. Notably, in Latino populations
(Figure 1), the most significant interactions involved RECQL (p-value = 1.0 × 10^-4,
Supplementary Table 1), which Table 3 classifies among other genes with known or suspected
DNA repair function, and the Homologous Recombination pathway genes PDS5B (p-value =
3.1 × 10^-3, Supplementary Table 1) and BARD1 (p-value = 3.5 × 10^-3, Supplementary Table 1),
which are essential in the repair of double-strand breaks caused by tobacco carcinogens.
Such heightened significance may suggest increased ALL risk due to both environmental
exposure and genetic predisposition within this pathway.
In contrast, the non-Latino White dataset revealed marked interactions in the Base
Excision Repair (BER) pathway (Figure 2), as evidenced by TDG (p-value = 1.0 × 10^-4,
Supplementary Table 2), which corrects small base lesions in DNA. The Nucleotide Excision
Repair (NER) pathway, which repairs damage such as that initiated by environmental exposures
like tobacco smoke, was also significantly involved, with genes like GTF2E2 (p-value =
2.42 × 10^-2, Supplementary Table 2) showing responses against tobacco-induced DNA damage.
In addition, the SETMAR gene in the Chromatin Structure and Modification pathway had a
p-value of 0.0038 (Figure 2 and Supplementary Table 2), suggesting a possible role in
modulating the access of repair enzymes to damaged DNA regions. This indicates that different
genetic mechanisms may be at work, where maternal tobacco exposure could influence ALL
risk through multiple repair systems such as the BER, NER and chromatin modification
pathways, underscoring their importance within the susceptibility profiles of non-Latino
White populations.
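The pathway-level contrast between the two ancestry groups can be made concrete by tallying the top genes of Tables 3 and 4 by pathway. The sketch below is only an illustration of that bookkeeping: the gene and pathway labels are copied from those two tables, while the variable and function names are hypothetical and no statistical test is performed.

```python
from collections import Counter

# (gene, pathway) pairs copied from Table 3 (Latino group).
LATINO_TOP_GENES = [
    ("RECQL", "Other identified genes with known or suspected DNA repair function"),
    ("PDS5B", "Homologous recombination"),
    ("BARD1", "Homologous recombination"),
    ("POLQ", "DNA polymerases"),
    ("RAD51-AS1", "Homologous recombination"),
]

# (gene, pathway) pairs copied from Table 4 (non-Latino White group).
NON_LATINO_WHITE_TOP_GENES = [
    ("TDG", "Base excision repair (BER)"),
    ("SETMAR", "Chromatin Structure and Modification"),
    ("MLH3", "Mismatch excision repair (MMR)"),
    ("POLD3", "DNA polymerases"),
    ("GTF2E2", "Nucleotide excision repair (NER)"),
    ("SPIDR", "Homologous recombination"),
    ("FANCI", "Fanconi anemia"),
    ("PRKDC", "Non-homologous end-joining"),
    ("EXO5", "Editing and processing nucleases"),
    ("RAD50", "Homologous recombination"),
    ("CHAF1A", "Chromatin Structure and Modification"),
    ("RAD23B", "Nucleotide excision repair (NER)"),
]


def genes_per_pathway(gene_pathway_pairs):
    """Count how many top genes fall in each DNA repair pathway."""
    return Counter(pathway for _gene, pathway in gene_pathway_pairs)


latino_counts = genes_per_pathway(LATINO_TOP_GENES)
non_latino_white_counts = genes_per_pathway(NON_LATINO_WHITE_TOP_GENES)

for label, counts in (("Latino", latino_counts),
                      ("Non-Latino White", non_latino_white_counts)):
    print(label)
    for pathway, n_genes in counts.most_common():
        print(f"  {n_genes}  {pathway}")
```

A tally like this only summarizes which pathways the listed genes belong to; it does not account for the differing sample sizes of the two groups.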
These differences may be attributable to genetic diversity, including variations in allele
frequencies and linkage disequilibrium patterns, which can affect the detection of
gene-environment interactions. 19,20 Moreover, environmental exposures, population history, and
diet could also contribute to these observed differences.21,22 The variation in sample size
between these groups further underscores the importance of considering statistical power in
such analyses. Furthermore, the confirmation that there are ethnic and racial differences in
the incidence of childhood leukemia suggests that genetic and environmental or cultural
factors play a significant role in the etiology of childhood leukemia, adding a layer of
complexity to the observed gene-environment interactions within our study. 23
The significant interactions identified in our study suggest that genetic predisposition
regarding DNA repair capacity could influence the risk of ALL following prenatal tobacco
exposure.24,25 These findings support the hypothesis that tobacco exposure causes DNA
damage, and the efficiency of DNA repair mechanisms could influence the likelihood of
malignant transformation in hematopoietic cells. 24,25 Our data are consistent with this
hypothesis. Permutation-adjusted
p-values provide evidence for the statistical significance of our findings, considering multiple
testing inherent to GWASs. The biological plausibility of these interactions is supported by the
roles played by identified genes in maintaining genomic integrity and responding to DNA
damage. When combined with statistical evidence from identified genes, this approach
strengthens the notion that maternal tobacco exposure and genetic variations interact to
increase ALL risk in children.
Chapter 4: Discussion
4.1 Summary of Findings
This study investigated gene-environment (GxE) interactions between tobacco exposure
during gestation and DNA repair gene variants as risk factors for acute lymphoblastic
leukemia (ALL) among children. The novel interactions observed between DCLRE1A, ERCC1,
GTF2H5 and prenatal tobacco exposure are a notable finding and fit the gene-environment
interaction paradigm put forth in recent epidemiological research, which emphasizes genetic
elements interacting with environmental influences to shape ALL risk. Previous studies have
investigated the relationship between prenatal tobacco exposure and the PTPRK gene,
suggesting novel approaches to understanding ALL susceptibility. 25 Our study provides
further evidence for interactions between several genes and prenatal tobacco exposure,
adding to current narratives on ALL risk.
4.2 Previous Research
Building upon the foundations laid by Zhong et al.,25 this study delves deeper into
the relationship between DNA repair SNPs and prenatal tobacco exposure. Our
gene-environment interaction analysis focuses more closely on the specific contributions of
DNA repair pathways to ALL risk. By employing a minP approach, we shed light on the
interactions of DNA repair SNPs within pathways and gave a more nuanced picture of the genetic
variations that interact with environmental tobacco smoke exposure. Focusing on DNA repair
SNPs allows for an in-depth investigation of the mechanistic pathways behind ALL
susceptibility, specifically gene-environment interactions.
Studies of genetic polymorphisms associated with xenobiotic metabolism and DNA repair
are increasingly being conducted, and they reaffirm that genetic variation influences both
disease susceptibility and treatment response. Though not directly addressing prenatal
tobacco exposure, such studies lend credence to our findings by underscoring the role of
genetic variation in disease outcomes. A study of MDR1 gene variants in relation to
childhood ALL has shed light on the interactions between genetic predispositions and
environmental exposures such as indoor insecticide use.26 Research on the influence of
specific genetic polymorphisms on ALL risk27 forms a basis for our own work, which expands
these inquiries by exploring more DNA repair gene SNPs as well as their interactions with
environmental factors like tobacco smoke. A pathway-wide gene-environment (GxE) analysis
can give a clearer picture of DNA repair genes as a whole than single-gene studies can.
Instead of only testing for individual interactions, this approach seeks to delineate how
specific variations across DNA repair genes may help mitigate the environmental influence
on ALL risk.
4.3 Contributions
Our research seeks to close gaps in understanding the etiology of ALL by going beyond
existing associations and investigating whether broader DNA repair mechanisms also
play a part in the effect of prenatal tobacco exposure. In doing so, we show that leukemia
risk may result from interactions between genetic factors and environmental exposures that
extend beyond previously considered factors. These findings add to our understanding of
the etiology of childhood ALL and underscore the need for further exploration of
gene-environment interactions that contribute to disease risk. In the longer term, such
findings could inform tailored interventions that reduce the burden of childhood leukemia.
Methodological rigor is essential in genetic epidemiology to substantiate GxE
interactions. Our study employs the MinP approach with 10,000 permutation tests, an
analytical framework designed to limit Type I errors, an issue prevalent among genome-wide
association studies (GWASs). This methodology is especially appropriate given the multiple
testing challenges inherent to GWASs, and it reinforces the validity of our findings.
Another methodological strength of our study lies in the comprehensive
permutation testing that underpins its statistical analysis. By simulating numerous
permutations, we obtain a more accurate estimate of the p-value distribution under the null
hypothesis, providing a rigorous benchmark against which to test the significance of the GxE
interactions we observed. This approach tightens control over false positives
and offers a framework that can withstand multiple testing, which often proves
challenging when working with large datasets.
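The gene-level minP idea described above can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in this study: the function name `minp_adjusted_pvalue` and the toy p-values are hypothetical, and the sketch assumes that per-SNP interaction p-values have already been computed on the observed data and on each permuted dataset (how the data are permuted is not specified here).

```python
from typing import Sequence


def minp_adjusted_pvalue(observed: Sequence[float],
                         permuted: Sequence[Sequence[float]]) -> float:
    """Gene-level minP p-value from per-SNP p-values (illustrative sketch).

    observed: per-SNP interaction p-values for one gene on the real data.
    permuted: one vector of per-SNP p-values for each permuted dataset.
    """
    if not observed or not permuted:
        raise ValueError("need at least one SNP and one permutation")
    min_obs = min(observed)
    # Count permutations whose smallest p-value is at least as extreme
    # as the smallest p-value seen in the real data.
    hits = sum(1 for pvals in permuted if min(pvals) <= min_obs)
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (len(permuted) + 1)


# Hypothetical toy values: a gene with 3 SNPs and only 4 permutations.
toy_observed = [0.004, 0.20, 0.65]
toy_permuted = [
    [0.30, 0.51, 0.12],
    [0.002, 0.40, 0.90],
    [0.08, 0.77, 0.25],
    [0.45, 0.03, 0.61],
]
toy_result = minp_adjusted_pvalue(toy_observed, toy_permuted)
```

Because the smallest p-value is compared against the permutation distribution of the smallest p-value, the number of SNPs tested within the gene is accounted for without assuming the SNPs are independent. With 10,000 permutations and this add-one estimator, the smallest attainable adjusted p-value is 1/10,001.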
Furthermore, our research relies on a large and varied dataset, featuring a
cohort with substantial ethnic heterogeneity, to examine GxE interactions
across populations. Such diversity allows gene-environment interactions to be compared
among subgroups while accounting for the genetic variability and exposures specific to each
subgroup. Such scrutiny is indispensable in moving towards personalized risk evaluation and
targeted public health interventions.
4.4 Limitations
Though our study's methodology is sound, its limitations must be taken into account.
One limitation is the varying sample size among subgroups, which complicates
subgroup analyses. In particular, compared with the Latino cohort of 2706 individuals, the
smaller subgroup of 1678 individuals is underpowered, which may make GxE interactions
harder to detect in that population. To address this challenge, we
used meta-analytical techniques designed to increase the statistical power of our analyses.
By synthesizing data across cohorts, this approach strengthens our capacity to detect
true associations amid the noise in complex genetic datasets.
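As an illustration of how combining cohorts can increase power, the sketch below pools per-cohort interaction estimates with a fixed-effect, inverse-variance-weighted meta-analysis. This is an assumption made for illustration: the specific meta-analytic technique is not stated above, and the function name `fixed_effect_meta` and the example estimates are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float],
                      ses: Sequence[float]) -> Tuple[float, float, float]:
    """Pool per-cohort log-odds estimates by inverse-variance weighting.

    Returns (pooled beta, pooled standard error, two-sided p-value).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical interaction estimates (log odds ratios) from two cohorts.
example_beta, example_se, example_p = fixed_effect_meta([0.4, 0.2], [0.2, 0.2])
```

The pooled standard error is always smaller than the smallest per-cohort standard error, which is the sense in which pooling increases power. A fixed-effect model assumes a common true effect across cohorts; if effects differ between populations, a random-effects model or a heterogeneity test may be more appropriate.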
4.5 Future Research Directions
While our research highlights the interplay of genetic and environmental factors
contributing to childhood ALL, longitudinal studies are warranted to examine how this
interplay evolves over time. Furthermore, GxE analyses of leukemogenesis should extend
beyond prenatal tobacco exposure, and novel genes or their variants provide opportunities
for further exploration. Future research should
aim to clarify these relationships and translate the findings into actionable prevention,
diagnosis, and treatment strategies in pediatric oncology.
References
1. Ekpa QL, Akahara PC, Anderson AM, et al. A Review of Acute Lymphocytic
Leukemia (ALL) in the Pediatric Population: Evaluating Current Trends and Changes in
Guidelines in the Past Decade. Cureus. 2023;15(12): e49930. doi:10.7759/cureus.49930.
2. Nematollahi P, Arabi S, Mansourian M, et al. Environmental Risk Factors for
Pediatric Acute Leukemia: Methodology and Early Findings. Int J Prev Med. 2023; 14:103.
doi: 10.4103/ijpvm.ijpvm_348_22.
3. Zehtab S, Sattarzadeh Bardsiri M, Mirzaee Khalilabadi R, et al. Association
of DNA repair genes polymorphisms with childhood acute lymphoblastic leukemia: a
high-resolution melting analysis. BMC Res Notes. 2022;15:46.
doi:10.1186/s13104-022-05918-3.
4. Verde Z, Reinoso-Barbero L, Chicharro L, et al. The Effect of Polymorphisms
in DNA Repair Genes and Carcinogen Metabolizers on Leukocyte Telomere Length: A
Cohort of Healthy Spanish Smokers. Nicotine Tob Res. 2016;18(4):447-452.
doi:10.1093/ntr/ntv172.
5. Ma X, Buffler PA, Layefsky M, Does MB, Reynolds P. Control selection
strategies in case-control studies of childhood diseases. Am J Epidemiol.
2004;159(10):915-921. doi: 10.1093/aje/kwh136.
6. Schwartzbaum JA, Xiao Y, Liu Y, et al. Inherited variation in immune genes
and pathways and glioblastoma risk. Carcinogenesis. 2010;31(10):1770-1777.
doi:10.1093/carcin/bgq152.
7. Arroyo K, Nargizyan A, Andrade FG, et al. Development of a Droplet Digital™
PCR DNA methylation detection and quantification assay of prenatal tobacco exposure.
BioTechniques. 2022;72(4):121-133.
8. Li S, Mancuso N, Metayer C, Xa X, de Smith AJ, Wiemels JL. Incorporation of
DNA methylation quantitative trait loci (mQTLs) in epigenome-wide association analysis:
application to birthweight effects in neonatal whole blood. Clinical Epigenetics. 2022(In
Press).
9. Shenker NS, Ueland PM, Polidoro S, et al. DNA methylation as a long-term
biomarker of exposure to tobacco smoke. Epidemiology. 2013;24(5):712-716.
10. Wood RD, Mitchell M, Lindahl T. Human DNA repair genes. Mutation
Research/Fundamental and Molecular Mechanisms of Mutagenesis. 2005;577(1–2):275-
283. doi: 10.1016/j.mrfmmm.2005.03.007.
11. GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat
Genet. 2013 Jun;45(6):580-5. doi: 10.1038/ng.2653. PMID: 23715323.
12. GTEx Consortium. Human genomics. The Genotype-Tissue Expression
(GTEx) pilot analysis: multitissue gene regulation in humans. Science.
2015;348(6235):648-660. doi:10.1126/science.1262110. PMID: 25954001.
13. Li P, Zhong R, Yu J, et al. DCLRE1A Contributes to DNA Damage Repair and
Apoptosis in Age-Related Cataracts by Regulating the lncRNA/miRNA/mRNA Axis. Curr
Eye Res. 2023;48(11):992-1005. doi: 10.1080/02713683.2023.2241159.
14. Zhu J, Hua RX, Jiang J, Zhao LQ, Sun X, Luan J, Lang Y, Sun Y, Shang K, Peng S,
Ma J. Association studies of ERCC1 polymorphisms with lung cancer susceptibility: a
systematic review and meta-analysis. PLoS One. 2014;9(5): e97616. doi:
10.1371/journal.pone.0097616. PMID: 24841208; PMCID: PMC4026486.
15. Vaughn CM, Sancar A. Mechanisms of DNA Repair and Diseases. In:
Dizdaroglu M, Lloyd RS, editors. DNA Damage, DNA Repair and Disease: Volume 2. The
Royal Society of Chemistry; 2020; p. 1-23. doi:10.1039/9781839162541.
16. Zhou H, Saliba J, Sandusky GE, Sears CR. XPC protects against smoking- and
carcinogen-induced lung adenocarcinoma. Carcinogenesis. 2019; 40(3):403-411. doi:
10.1093/carcin/bgz003. PMID: 30624620; PMCID: PMC6514449.
17. Li P, Zhong R, Yu J, et al. DCLRE1A Contributes to DNA Damage Repair and
Apoptosis in Age-Related Cataracts by Regulating the lncRNA/miRNA/mRNA Axis. Curr
Eye Res. 2023;48(11):992-1005. doi: 10.1080/02713683.2023.2241159. Epub 2023 Jul
28. PMID: 37503815.
18. Tang MS, Lee HW, Weng MW, et al. DNA Damage, DNA Repair and
Carcinogenicity: Tobacco Smoke versus Electronic Cigarette Aerosol. Mutation Res. Rev.
Mutation Res. 2022 Jan-Jun; 789:108409. doi: 10.1016/j.mrrev.2021.108409. PMID:
35690412; PMCID: PMC9208310.
19. Khrunin A, Mihailov E, Nikopensius T, Krjutskov K, Limborska S, Metspalu A.
Analysis of allele and haplotype diversity across 25 genomic regions in three Eastern
European populations. Hum Hered. 2009;68(1):35-44. doi:10.1159/000210447. PMID:
19339784.
20. Chande AT, Rishishwar L, Conley AB, et al. Ancestry effects on type 2
diabetes genetic risk inference in Hispanic/Latino populations. BMC Med Genet.
2020;21(Suppl 2):132. doi:10.1186/s12881-020-01068-0.
21. McArdle CE, Bokhari H, Rodell CC, et al. Findings from the Hispanic
Community Health Study/Study of Latinos on the Importance of Sociocultural
Environmental Interactors: Polygenic Risk Score-by-Immigration and Dietary
Interactions. Front Genet. 2021; 12:720750. doi:10.3389/fgene.2021.720750.
22. Norris ET, Wang L, Conley AB, et al. Genetic ancestry, admixture, and health
determinants in Latin America. BMC Genomics. 2018;19(Suppl 8):861.
doi:10.1186/s12864-018-5195-7.
23. Oksuzyan S, Crespi CM, Cockburn M, Mezei G, Vergara X, Kheifets L.
Race/ethnicity, and the risk of childhood leukaemia: a case-control study in California. J
Epidemiol Community Health. 2015;69(8):795-802. doi:10.1136/jech-2014-204975.
PMID: 25792752; PMCID: PMC4550439.
24. Breton CV, Byun HM, Wenten M, Pan F, Yang A, Gilliland FD. Prenatal
Tobacco Smoke Exposure Affects Global and Gene-specific DNA Methylation. Am J Respir
Crit Care Med. 2009;180(5):462-467. doi:10.1164/rccm.200901-0135OC. PMID:
19498054; PMCID: PMC2742762.
25. Zhong C, Li S, Arroyo K, et al. Gene-Environment Analyses Reveal Novel
Genetic Candidates with Prenatal Tobacco Exposure in Relation to Risk for Childhood
Acute Lymphoblastic Leukemia. Cancer Epidemiol Biomarkers Prev. 2023;32(12):1707-
1715. doi: 10.1158/1055-9965.EPI-23-0258. PMID: 37773025.
26. Urayama KY, Wiencke JK, Buffler PA, Chokkalingam AP, Metayer C, Wiemels
JL. MDR1 gene variants, indoor insecticide exposure, and the risk of childhood acute
lymphoblastic leukemia. Cancer Epidemiol Biomarkers Prev. 2007;16(6):1172-1177. doi:
10.1158/1055-9965.EPI-07-0007. PMID: 17548681.
27. Krajinovic M, Labuda D, Mathonnet G, Labuda M, Moghrabi A, Champagne
J, Sinnett D. Polymorphisms in genes encoding drugs and xenobiotic metabolizing
enzymes, DNA repair enzymes, and response to treatment of childhood acute
lymphoblastic leukemia. Clin Cancer Res. 2002 Mar;8(3):802-810. PMID: 11895912.
Appendix
Bulk Tissue Gene Expression Figures for Top Significant Genes11,12:
Abstract
The relationship between maternal smoking during gestation and childhood acute lymphoblastic leukemia (ALL) remains incompletely understood. This study explores how genetic susceptibility in DNA repair mechanisms interacts with environmental exposure to tobacco smoke to increase risk for ALL in children. Our investigation used data from the California Childhood Cancer Record Linkage Project (CCRLP) and applied logistic regression and MinP tests. Our analysis revealed significant interactions between maternal tobacco smoking and DNA repair genes, including DCLRE1A (P = 0.0087), ERCC1 (P = 0.0150), and GTF2H5 (P = 0.0155), that affected childhood ALL risk. Furthermore, Latino populations demonstrated notable interactions in Homologous Recombination pathways, while non-Latino White populations showed notable interactions in Base Excision Repair and Nucleotide Excision Repair pathways. This study highlights the significance of DNA repair genes when considering environmental exposure to tobacco smoke, suggesting that genetic variation within these pathways could affect the risk of ALL in children exposed to tobacco smoke during gestation.
Asset Metadata
Creator
Wang, Xinran
(author)
Core Title
The influence of DNA repair genes and prenatal tobacco exposure on childhood acute lymphoblastic leukemia risk: a gene-environment interaction study
School
Keck School of Medicine
Degree
Master of Science
Degree Program
Biostatistics
Degree Conferral Date
2024-05
Publication Date
03/27/2024
Defense Date
03/25/2024
Publisher
Los Angeles, California
(original),
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
childhood acute lymphoblastic leukemia (ALL),childhood cancer,DNA repair genes,environmental and genetic interaction,genetic susceptibility,maternal smoking,OAI-PMH Harvest
Format
theses
(aat)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Wiemels, Joseph Leo (committee chair), Gauderman, William (committee member), Mancuso, Nicholas (committee member)
Creator Email
1155107696@link.cuhk.edu.hk,xwang210@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC113858385
Unique identifier
UC113858385
Identifier
etd-WangXinran-12715.pdf (filename)
Legacy Identifier
etd-WangXinran-12715
Document Type
Thesis
Rights
Wang, Xinran
Internet Media Type
application/pdf
Type
texts
Source
20240327-usctheses-batch-1131
(batch),
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu