Identifying Genetic, Environmental, and Lifestyle Determinants of Ethnic Variation
in Risk of Pancreatic Cancer
by
David Daniel Alesse Bogumil
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA,
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(EPIDEMIOLOGY)
August 2021
Copyright 2021 David Daniel Alesse Bogumil
Acknowledgements
I would like to acknowledge the guidance, expertise, and patience of my committee members in
the conducting of this research. Specifically, V. Wendy Setiawan, PhD, for her guidance on all analyses,
mentorship, constant support throughout the program, and giving me the resources and connections to
succeed beyond what I could have imagined; Roberta McKean-Cowdin, PhD, for her instruction on
epidemiology methods and theory, mentorship, and wisdom; David Conti, PhD, for instruction and
guidance on analysis and problem solving in genetic epidemiology; Anna Wu, PhD, for instruction on
epidemiology methods and theory, guidance on air pollution analyses, and guidance on honing my
attention to the details of analysis in epidemiology; Stephen Pandol, MD, for his clinical expertise of
pancreatic cancer, feedback on manuscript development, and excitement for this domain of research.
I additionally want to thank lab members and department investigators who have been critical
in my development and success in the program. A thanks to Dr. Victoria Cortessis who constantly
challenges and improves my problem solving in, and excitement for, epidemiology, and has involved me
in numerous experience-gaining and exciting projects; Daniel Stram, PhD, who has been incredibly
generous with his time in offering his biostatistics expertise through instruction, manuscript
development, and discussion; Xin (Grace) Sheng, who is responsible for large portions of my experience
and understanding of genetic epidemiology through discussion, sharing of example materials, and
conducting data processing/quality control; Burcu Darst, PhD, for her time in offering guidance,
discussion, and expertise in genetic epidemiology.
The data used in this dissertation was collected, processed, stored, cleaned, and organized by
multiple investigators mentioned above, in addition to Christopher Haiman, PhD, Loïc Le Marchand,
PhD, Lynne Wilkens, PhD, Iona Cheng, PhD, Brian E. Henderson, MD, Lawrence Kolonel, MD, PhD, Jackie
Porcel, MS, and Songren Wang, MS. This research would not have been possible without these, and all,
Multiethnic Cohort Investigators.
Additionally, thank you to the National Cancer Institute (NCI), which has provided continuous
support to the Multiethnic Cohort Study and its investigators, and to the public who has helped fund this
research through taxes paid. Most importantly, thank you to Multiethnic Cohort participants who have
donated their time, data, and samples to help us understand and reduce the burden of disease and
health disparities in our communities.
Last, I want to thank my family and friends for their support, expertise, and feedback during the
program. Specifically, my parents David Bogumil, PhD, Elizabeth Alesse, MA; my siblings, Elizabeth
Bogumil, Michael Bogumil, Matthew Bogumil; my aunt, Mary Molino, PhD; and my uncle, Michael
Molino, PhD. Thank you to Charlie Zhong, PhD, for the fun and engaging discussions of epidemiology.
Special thanks to Ann George, PharmD, who has given her time to discuss my research, provided help in
any way that would move me toward success, and for the time spent outside of research having fun.
Table of Contents
Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction, Background and Significance
  Introduction
  Background
    Pancreas Physiology and Anatomy
    Pancreas Endocrine Function
    Pancreas Exocrine Function
    Molecular Pathology of Pancreatic Cancer
    Mutation Types
    Tumor Characteristics
    Identifying Pancreatic Cancer
    Pancreatic Cancer Epidemiology
    Known Risk Factors of Pancreatic Cancer: Modifiable
    Known Risk Factors of Pancreatic Cancer: Semi-Modifiable
    Known Risk Factors of Pancreatic Cancer: Non-Modifiable
  Tables
Chapter 2: Replication and Genetic Risk Score Analysis for Pancreatic Cancer in a Diverse Multiethnic Population
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  Tables
  Figures
Chapter 3: The Association Between Ambient Air Pollutants and Pancreatic Cancer in the Multiethnic Cohort Study
  Abstract
  Introduction
  Materials and Methods
    Study Participants
    Address history
    Exposure Assessment
    Statistical Analysis
  Results
  Discussion
  Tables
  Figures
Chapter 4: Excess Risk due to Smoking and Effects of Quitting on Pancreatic Cancer Incidence in the Multiethnic Cohort Study
  Abstract
  Introduction
  Materials and Methods
    Study Population
    Statistical Analysis
  Discussion
  Tables
  Figures
References
Appendices
  Appendix A: Detailed MEC and SCCS Sample Description
  Appendix B: Flow diagram of the data cleaning, exclusion, and imputation for the sample.
  Appendix C: Identical by descent plot showing relatedness of samples within the Multiethnic Cohort (MEC) and Southern Community Cohort Study (SCCS) prior to filtering based on relatedness.
  Appendix D: Principal component (PC) analysis plots with point color corresponding to self-reported race/ethnicity from baseline questionnaires
  Appendix E: Multiethnic and ethnic-specific replication results.
  Appendix F: Comparison of risk allele frequencies (RAFs) within the Multiethnic Cohort (MEC) and Southern Community Cohort Study (SCCS) with the reported RAF from the most recent GWAS results reported in the literature.
  Appendix G: Multiethnic and ethnic-specific polygenic risk score odds ratios (ORs) and 95% confidence intervals (CIs)
  Appendix H: Comparison of polygenic risk score calculation methods in the multiethnic analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid unstable scores
  Appendix I: Comparison of polygenic risk score calculation methods in the white analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid unstable scores.
  Appendix J: Comparison of polygenic risk score calculation methods in the African American analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid unstable scores.
  Appendix K: Comparison of polygenic risk score calculation methods in the Japanese analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid unstable scores.
  Appendix L: Comparison of polygenic risk score calculation methods in the Latino analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid unstable scores.
  Appendix M: Comparison of polygenic risk score calculation methods in the Native Hawaiian analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid unstable scores.
  Appendix N: Participant Exclusions Prior to Air Pollution Analysis
  Appendix O: Global Proportional Hazard Violation Test Results
  Appendix P: Monthly Air Pollutant Measures Over Study Duration by Ethnicity
  Appendix Q: Full Excess Relative Risk Model Terms.
  Appendix R: Sensitivity analysis; BMI’s effect on smoking estimates, when included as a baseline term.
List of Tables
Table 1: Pancreatic Loci Discovered in Prior Studies
Table 2: Characteristics of pancreatic cancer cases and controls
Table 3: Air Pollution Analysis Baseline Characteristics of Study Participants by Race/Ethnicity.
Table 4: Sample Baseline Characteristics, Stratified by Race/Ethnicity.
Table 5: Incidence Rates of Pancreatic Cancer by Race/Ethnicity.
Table 6: Fit Model Statistics from Multiethnic, Combined, Excess Relative Risk Model
List of Figures
Figure 1: Multiethnic and ethnic-specific replication analysis results for 31 SNPs identified in prior GWAS of pancreatic cancer in European, Chinese, and Japanese ancestry.
Figure 2: Comparison between 31 replicating SNPs from multiethnic replication analysis and most recent GWAS results on the log OR scale.
Figure 3: Multiethnic and ethnic-specific polygenic risk score odds ratios (ORs) and 95% confidence intervals (CIs)
Figure 4: Association between Particulate Matter and Pancreatic Cancer.
Figure 5: Association between Nitrogen Dioxide, Nitrogen Oxides, and Pancreatic Cancer.
Figure 6: Excess Relative Risk of Smoking Over a Variety of Smoking Histories.
Figure 7: Hypothetical Risk Trajectories, Given Differing Smoking Histories, Over a Lifespan
Abstract
Pancreatic cancer is projected to be the second leading cause of cancer death by 2040. Currently
it is the 10th most common malignancy in humans and is one of the deadliest cancers because there is
no form of regular screening. The lack of screening results in late-stage diagnosis and poor survival.
Numerous risk factors have been identified for pancreatic cancer, such as smoking, diabetes, body mass
index, and common genetic variation. One of the most notorious risk factors, smoking, has been shown
to increase risk of pancreatic cancer by 70 to over 100 percent among current smokers, relative to
never smokers. In the domain of genetics, 31 risk loci have been identified as associated with
pancreatic cancer in samples of European, Japanese, and Chinese ancestry. Finally, studies of air
pollution, an exposure that has received growing attention for many cancers, including pancreatic
cancer, have shown mixed results.
Importantly, the burden of pancreatic cancer differs by race and ethnicity, with
African Americans having twenty percent higher incidence rates relative to most other ethnic/racial
groups. Although there is variation in the incidence rates by race/ethnicity, most of the previously
mentioned associations have only been measured in samples of predominantly European and European-
American ancestry. These prior studies are not representative of the United States population and do
not include participants from ethnic/racial groups who are at elevated risk for this disease. This is an
important limitation of the current state of pancreatic cancer research since we cannot assume that
associations identified in a single population, such as European-Americans, are generalizable to other
populations of differing race/ethnicity. Reasons for lack of generalizability include differing exposure
distributions between populations and differing distributions of component causes, resulting in
non-transportability of association measures.
In this dissertation, we test the associations between common genetic variation, common air
pollutants, smoking history, and pancreatic cancer using the Multiethnic Cohort Study (MEC), an
ethnically diverse cohort, representative of Southern California and Hawaii, composed of African-
American, Japanese-American, Native Hawaiian, Latino, and white participants. In the first project, we
evaluate the associations between pancreatic cancer and the 31 risk loci previously identified in European,
Japanese, and Chinese ancestry studies, first in a multiethnic sample, then within each of the 5 major
ethnic/racial groups of the MEC and the Southern Community Cohort Study (SCCS). Using these results, we
then build a weighted polygenic risk score (PRS) to examine whether the information from the 31 risk loci,
summed into a single score, can predict pancreatic cancer risk in a multiethnic sample and within ethnic groups.
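To illustrate how a weighted PRS of this general kind is commonly computed, the following minimal Python sketch sums the risk-allele dosage at each locus, weighted by the log odds ratio for that locus. It is an illustration under stated assumptions, not the scoring procedure used in this work: the function name, the odds ratios, and the dosages are hypothetical placeholders, and only three loci are shown rather than 31.

```python
import math


def weighted_prs(dosages, odds_ratios):
    """Weighted PRS: sum over loci of log(OR) times risk-allele dosage.

    dosages: risk-allele counts per locus (0, 1, or 2, or imputed values in [0, 2]).
    odds_ratios: per-allele odds ratio for the risk allele at each locus.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("one dosage is needed per locus")
    return sum(math.log(odds_ratio) * dosage
               for dosage, odds_ratio in zip(dosages, odds_ratios))


# Hypothetical per-allele odds ratios for three loci (illustrative, not study results).
example_ors = [1.10, 1.25, 1.05]
# Hypothetical risk-allele counts for one participant at the same three loci.
example_dosages = [2, 0, 1]

score = weighted_prs(example_dosages, example_ors)
print(f"Weighted PRS: {score:.4f}")
```

A participant carrying no risk alleles scores zero under this definition, and each additional risk allele raises the score by the log odds ratio of its locus.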
In the second project, we further investigate prior air pollution-pancreatic cancer findings by
measuring the association between common outdoor air pollutants (PM2.5, PM10, NOx, NO2) and
pancreatic cancer in an ethnically diverse sample. Prior studies examining this association have found
mixed results. These studies also have multiple limitations, such as not accounting for
participant address changes over the study period and relying on non-representative samples that
include few non-white participants. Using kriging interpolation, we estimate air pollutant concentrations
at each participant’s address in the MEC, then measure the association between air pollutant
concentrations and pancreatic cancer using Cox regression while adjusting for multiple confounders.
Finally, in the last project, we estimate the association between smoking history and pancreatic
cancer. Smoking is a common and notorious risk factor for pancreatic cancer. Prior studies characterize
this association by considering only a single characteristic of smoking at a time. Modeling the data in this
restricted manner commonly results in relationships that contrast with what we know to be true of the
exposure, for example, a decreased risk of pancreatic cancer among ex-smokers relative to never
smokers. In this project, we measure the association between smoking and pancreatic cancer, while
considering smoking cessation and smoking intensity in a single excess relative risk model. This
technique allows us to estimate increased risk of pancreatic cancer associated with pack-years smoked,
while estimating, in the same model, the modifying effects of years-quit and cigarettes smoked per day
on the relationship between pack-years and pancreatic cancer. Using this model, we then test for
heterogeneity by race/ethnicity in the pack-years association and in the effects of the modifying
variables, years-quit and cigarettes per day. Finally, using our model, we estimate excess relative risk
and absolute risk across a variety of smoking histories to understand an individual’s risk trajectory based
on their smoking behavior. In addition to the project summaries provided above, there are detailed
abstracts for each project at the beginning of each project chapter.
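As a rough illustration of the structure of an excess relative risk (ERR) model, the following Python sketch computes a relative risk of the form RR = 1 + beta x pack-years x exp(modifiers), where years-quit and cigarettes per day act as modifiers of the pack-years slope. This functional form is one common parameterization and is an assumption here; the fitted model terms for this work are given in Chapter 4 and Appendix Q. All coefficient values and function names below are hypothetical and chosen only so the example runs.

```python
import math


def excess_relative_risk(pack_years, years_quit, cigs_per_day,
                         beta=0.02, gamma_quit=-0.05, gamma_cpd=-0.01):
    """ERR = beta * pack_years * exp(gamma_quit * years_quit + gamma_cpd * cigs_per_day).

    beta: excess relative risk per pack-year (hypothetical value).
    gamma_quit, gamma_cpd: log-scale modifiers of the pack-years slope
    (hypothetical values, not estimates from any study).
    """
    modifier = math.exp(gamma_quit * years_quit + gamma_cpd * cigs_per_day)
    return beta * pack_years * modifier


def relative_risk(pack_years, years_quit, cigs_per_day, **coefficients):
    """Relative risk versus a never smoker: RR = 1 + ERR."""
    return 1.0 + excess_relative_risk(pack_years, years_quit, cigs_per_day,
                                      **coefficients)


# Two hypothetical smoking histories with the same 30 pack-years at 20 cigarettes per day.
rr_current = relative_risk(pack_years=30, years_quit=0, cigs_per_day=20)
rr_quit_20y = relative_risk(pack_years=30, years_quit=20, cigs_per_day=20)
print(f"Current smoker RR: {rr_current:.2f}")
print(f"Quit 20 years ago RR: {rr_quit_20y:.2f}")
```

Because the modifiers multiply the pack-years slope rather than entering as separate main effects, a never smoker has a relative risk of exactly 1, and with a negative years-quit coefficient the excess risk of a former smoker decays toward zero but cannot fall below that of a never smoker.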
Chapter 1: Introduction, Background and Significance
Introduction
Pancreatic cancer is the 10th most common malignancy in humans, resulting in over 57,000 cases
and 47,000 deaths in the United States, annually [1]. Diagnosis at advanced stages is common, since
there are no regular forms of screening [2]. Late detection is the primary reason for the low survival
rate, with only 10.5% of cases surviving past five years [3]. Incidence of pancreatic cancer is known to
differ by race and ethnicity. In comparison to most other racial groups, African Americans experience a
20% higher incidence rate of pancreatic cancer [4]. Most research on pancreatic cancer originates from
large European databases, European consortiums, or national data composed mostly of whites [5, 6].
Due to this limitation, research on the sources of ethnic variation in rates of pancreatic cancer is sparse.
The MEC, a prospective cohort from California and Hawaii, is a rich data source that can be used to
examine how and why ethnic variation in pancreatic cancer exists as well as the magnitude of
racial/ethnic variation [7].
This dissertation focuses on identifying and measuring sources of ethnic variation in rates of
incident pancreatic cancer. Specifically, data from the MEC is used to measure and determine
racial/ethnic variation from genetic markers, air pollutants, and smoking. These projects and their
impacts will be discussed briefly, followed by a background chapter. In the background chapter, pancreas
physiology, pancreatic cancer oncogenesis, and the epidemiology of pancreatic cancer will be discussed.
Background
Pancreas Physiology and Anatomy
Located in the upper abdomen, the pancreas is involved in both maintaining blood glucose
levels (endocrine function) and digestion of macronutrients (exocrine function). The structure of the
pancreas is divided into 3 parts: the head, the body, and the tail [8-10]. The head and body receive blood
from the celiac and superior mesenteric arteries, whereas the body and tail receive blood from the
splenic artery. The pancreas is structured by lobules, composed primarily of acinar cells. These lobules
are bound by connective tissue and supporting tissues, such as blood vessels [9]. To better explain
pancreas function, the organ can be broken into the domains of exocrine and endocrine function.
Pancreas Endocrine Function
The primary endocrine function of the pancreas is to maintain glucose homeostasis by
regulating the metabolism of glucose, fatty acids, and amino acids. This goal is achieved through
secreting the peptide hormones insulin and glucagon [10]. Endocrine cells cluster into
groups called Islets of Langerhans, which make up only 1% to 2% of the pancreas. Islets of
Langerhans are composed of alpha, beta, delta, pancreatic polypeptide, and epsilon cells, which are
separated from the pancreatic exocrine cells by connective tissue [8, 9]. The majority of islet cells are
beta cells that secrete insulin in the presence of elevated exogenous glucose, such as what occurs
following a meal. The insulin then triggers glucose uptake in muscle and adipose tissue.
The second most common islet cells, alpha cells, are responsible for the production of glucagon.
Glucagon levels rise when blood glucose levels drop. Glucagon promotes glycogenolysis, the breakdown
of glycogen to glucose, and gluconeogenesis, the synthesis of glucose from non-carbohydrate precursors. Insulin and
glucagon are passed into the blood through capillaries surrounding Islets of Langerhans [10, 11]. In
addition to blood glucose levels, the endocrine function is also altered by hormones released by the gut
during digestion, fat tissue hormones, and hormones produced in the brain [10].
Pancreas Exocrine Function
Most of the pancreas's weight is from acinar cells, which are responsible for the exocrine
function of the pancreas. This process is responsible for producing over a liter of pancreatic juice per
day. Acinar cells are grouped together to form pancreatic acini, which have a ductal opening surrounded
by centroacinar cells. Enzyme production is specific to the acinar cells. While some enzymes, such as
those that digest starch, glycogen, and fat, are produced in their active form by acinar cells, enzymes
that digest protein and amino acids are secreted in an inactive form. Inactive enzymes may later become
activated in the environment of the duodenum [9, 11].
In addition to enzymes, pancreatic juice contains ions of sodium, chloride, potassium, and
bicarbonate. While sodium and potassium levels remain constant (similar to concentrations found in the
blood), increased flow rate of pancreatic juice is associated with higher bicarbonate levels and lower
chloride levels. The pancreatic juice collects in the main pancreatic duct. The end of this duct joins with
the common bile duct, emptying into the duodenum of the upper intestine. Although there are two
distinct cell types of the pancreas, their proximity means shared exposures and environment. The
exocrine tissue of the pancreas receives blood from the circulatory system, but a portion of the
exocrine tissue is exposed to blood that has passed through islets. Therefore, some exocrine tissue is
exposed to high levels of insulin and glucagon because of its location [9, 11].
Molecular Pathology of Pancreatic Cancer
Due to the pancreas being composed of two major cell categories, pancreatic cancer can be
classified into two major forms, endocrine and exocrine tumors. Endocrine tumors are much rarer
than exocrine tumors. They have better survival, which may result from their easier detection.
Endocrine tumors, which alter hormone levels, produce symptoms leading to earlier diagnosis.
Pancreatic cancer of the exocrine tissue is most commonly pancreatic ductal adenocarcinoma (PDAC).
This form of pancreatic cancer is a malignancy of the epithelial cells lining the duct and constitutes over
90% of pancreatic cancers. Unlike pancreatic endocrine tumors, PDAC tends not to produce symptoms until
later stages, when it has already spread to surrounding tissue, such as the liver and lymphatic tissue.
Constituting most of the remaining exocrine tumors are mucinous tumors, which also originate from
epithelial cells in the duct but secrete mucin. Although acinar cells are responsible for most of the
pancreas's size, acinar cell cancer is rare [12].
PDAC can be further typed by originating pre-cancerous lesions. These lesions are known as
pancreatic intraepithelial neoplasias (PanIN), intraductal papillary mucinous neoplasms (IPMN), and
mucinous cystic neoplasms (MCNs) [12, 13]. IPMN has the characteristics of finger-like protrusions in the
pancreatic duct, which secrete mucin. These pre-cancerous lesions are likely to account for less than
10% of pancreatic cancers [12, 14]. PanINs are smaller but are a more common precancerous lesion.
PanIN is further staged by the type of cellular alterations resulting in cancer [12, 15-17].
Mutation Types
The most common oncogene of PDAC is KRAS, a gene responsible for basic cell functions such as
proliferation and differentiation. Mutation of KRAS, in conjunction with inactivation of tumor suppressor
genes (TSGs), can lead to oncogenesis. Although PDAC can originate from inactivation of many different
TSGs, CDKN2A, TP53, and SMAD4/DPC4 are known to be most common [12, 15].
Tumor Characteristics
Tumors are most common in the head of the pancreas, at the base of the organ where the duct
exits to the duodenum [18]. Average tumor size at diagnosis tends to be around 30 mm; however, size
closely depends on stage [18, 19]. Pancreatic cancer staging is defined by the following factors: tumor
size (1A, under 2 cm; 1B, over 2 cm in longest dimension), whether the tumor extends beyond the
pancreas (2A), whether the tumor has spread to lymph nodes in the region (2B), whether the tumor
extends to major arteries or veins (3, unresectable), and whether there is metastasis to other parts of
the body (4) [13, 20, 21].
Identifying Pancreatic Cancer
Common symptoms of pancreatic cancer at diagnosis are pain, jaundice, fatigue, and weight loss.
Jaundice may result from biliary obstruction, as PDACs are most prevalent in the pancreas head, which is
closest to the common bile duct [19-23]. Incident diabetes mellitus is also a symptom of pancreatic
cancer [24, 25]. This late onset diabetes, termed Type 3, is likely the result of pancreatic cancer
progression causing disruption of insulin response.
The diagnosis of pancreatic cancer is most common at a late stage when these symptoms have
already developed enough to require medical attention. Late-stage diagnosis occurs because there is no
regular form of screening for pancreatic cancer. There are some proposed molecular markers for use in
diagnosis of pancreatic cancer, with assessment of Carbohydrate Antigen 19-9 via blood test being most
popular [13].
Another common initial diagnostic step includes assessment of serum levels to measure reduced
bile flow, such as bilirubin levels. Although biomarkers may be used in initial diagnostic steps, abdominal
imaging is needed for certain diagnosis, estimation of tumor size, and location. Imaging techniques
include ultrasound, computerized tomography (CT), magnetic resonance imaging, magnetic resonance
cholangiopancreatography, positron emission tomography imaging, and endoscopic imaging
procedures. CT scans are the preferred scanning method for diagnostic and staging images [23].
Pancreatic Cancer Epidemiology
Pancreatic Cancer Survival and Treatment
In 2020 there were an estimated 57,000 new pancreatic cancer cases and 47,000 deaths in the US
[1]. Pancreatic cancer has a 5-year survival of only 10.5%, making it the 4th most common cause of
cancer-related death [3, 22]. Survival of early stage (I-III) pancreatic cancer is significantly worse among
Hispanics and Blacks, relative to whites. This difference persists after adjusting for confounding variables
including marital status, socioeconomics, region, comorbidities, and tumor characteristics [26]. The high
mortality rates among pancreatic cancer patients result from several difficulties in treatment. First,
there is no regular form of screening for pancreatic cancer. This means a malignancy will likely develop
until symptoms progress, giving the cancer time to develop in stage [27, 28]. Over 70% of pancreatic
cancer cases are diagnosed with lymph node involvement, meaning a stage of 2B or higher [29]. In
addition to local spread, more distant spread is common for PDAC due to microscopic venous invasion,
where the malignancy spreads via the circulatory system [28, 30].
Surgery and adjuvant chemotherapy are required for treatment of PDAC; however, eligibility for
surgery depends heavily on the stage of the cancer [18-20, 28]. The most common surgery options are
pancreaticoduodenectomy, for tumors in the head of the pancreas, and pancreatectomy and
splenectomy, for the body and tail of the pancreas [20]. These surgeries are only performed on around
15% of pancreatic cancer cases [31]. Survival time is heavily dependent on eligibility for surgery, with
those receiving surgery having 10 times the odds of surviving to 5 years [31, 32].
Descriptive Statistics
Pancreatic cancer is the 10th most common form of cancer in US men and 11th in US women,
with incidence rates of 14.5 and 11.2 cases per 100,000 person-years, respectively [4]. Incidence of
pancreatic cancer is heterogeneous by race and ethnicity, with blacks experiencing the highest rates
(17.0) followed by Non-Hispanics (14.7), whites (14.4), Asian Pacific Islander (12.6), and Hispanics (12.3)
[4]. There are many known risk factors of PDAC. These factors can be organized onto the continuum of
modifiable to non-modifiable risk factors.
Known Risk Factors of Pancreatic Cancer: Modifiable
Smoking
Smoking is known to be one of the strongest risk factors for PDAC. Many studies report a 70%-
100% increase in the risk among current smokers, relative to never smokers [5, 33-37]. Two of the
largest studies are from the Pancreatic Cancer Cohort Consortium and the Pancreatic Cancer Case-
Control Consortium. These studies present a more stable measure of association due to pooling of data
across many studies [33]. Cohort consortium data showed a 77% increase in risk (odds ratio [OR] = 1.77
[95% confidence interval (CI): 1.38, 2.26]) of pancreatic cancer among current smokers, relative to never
smokers. PanC4 found a 120% increase in risk (OR = 2.20 [95% CI: 1.71, 2.83]) in pooled case-control
studies for current smokers, relative to never smokers. Intensity, duration, and time since quitting are
also associated with risk. Smoking intensity has a strong monotonic trend with pancreatic cancer. Duration
has inconsistent results between studies, but results suggest that risk may decline to that of never
smokers after 10 to 20 years of cessation [5, 33].
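The percent increases quoted above follow directly from the ratio estimates, and the reported intervals are roughly symmetric around the estimate on the log scale. A minimal Python sketch of that arithmetic, using only the two estimates cited in this section (the function names are illustrative and are not taken from either consortium's analysis):

```python
import math


def percent_change(ratio):
    """Convert a ratio measure (OR, RR, HR) to a percent change in risk."""
    return (ratio - 1.0) * 100.0


def log_scale_midpoint(lower, upper):
    """Geometric mean of the CI limits; close to the point estimate
    when the interval was built on the log scale."""
    return math.sqrt(lower * upper)


def log_standard_error(lower, upper, z=1.96):
    """Approximate standard error of log(ratio) recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Estimates quoted in the text for current vs. never smokers.
estimates = {
    "cohort consortium": (1.77, 1.38, 2.26),
    "PanC4": (2.20, 1.71, 2.83),
}

for label, (ratio, lower, upper) in estimates.items():
    print(
        label,
        f"{percent_change(ratio):.0f}% increase,",
        f"log-scale midpoint {log_scale_midpoint(lower, upper):.2f},",
        f"SE(log OR) {log_standard_error(lower, upper):.3f}",
    )
```

Reading an odds ratio as a percent change in risk relies on the odds ratio approximating the relative risk, which is reasonable for a rare outcome such as pancreatic cancer.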
There is limited information on how the effects of smoking on pancreatic cancer differ by
race/ethnicity. Within the MEC, the effect of smoking does not statistically differ between ethnic groups.
However, there is variation in the strength of association [38]. Relative to never smokers, the risk associated with
current smoking was strongest among Japanese Americans (RR: 1.92, 95% CI [1.48, 2.49]). African
Americans experienced the second strongest association followed by Native Hawaiians, European
Americans, and Latino Americans. Alternative modeling techniques may help elucidate differences in
smoking risk by ethnicity.
A meta-analysis and a pooled analysis found mixed findings in the association between
smokeless tobacco and pancreatic cancer (RR: 1.0-2.0) [39, 40]. Cigar-only smokers, relative to never
smokers, had 1.6 (95% CI: 1.2, 2.3) times the odds of pancreatic cancer, and 1.1 (95% CI: 0.7, 1.6) for
pipe-only smokers [40]. This may mean that chemicals (or constituents) specific to cigarettes, in addition to
the tobacco smoke itself, are responsible for the increased risk of cancer.
There is limited literature on the mechanism by which smoking affects risk of PDAC. The exposure
routes to the pancreas are through the blood stream and possibly the bile duct. In this case, a
mechanism of action would be indirect exposure to tobacco carcinogens through blood to the tissue
itself [36]. The bile duct is discussed as another possible route of exposure. This duct is in close proximity
to the main pancreas duct and would lead to greater exposure at the pancreas head [36], which is found
to be the most common site for PDAC [18]. This second mechanism is also supported by smokers having
a lower gastric volume, which leads to increased bile salt reflux following a meal [41]. This bile salt may
come into contact with the main pancreas duct itself.
Diet
Food frequency questionnaires (FFQs) are often used to estimate a diet consumption profile for
participants in nutritional epidemiology studies. These questionnaires ask about consumption of specific
foods and portion sizes. From these data, calories, nutrients, vitamins, and dietary patterns can be
estimated. There are three main types of diet analyses: data-driven pattern analysis, a priori dietary
pattern analysis, and specific food item analysis. Data-driven pattern analysis uses data and outcomes
to determine consumption patterns that are associated with decreased or increased risk of disease.
These patterns are commonly defined using factor analysis to determine foods that are correlated with
each other in explaining variation of disease. In contrast to data-driven patterns, dietary pattern analysis
uses pre-defined dietary patterns with a scoring system that is used to determine level of adherence to
a given pattern. There are several studies reporting associations with pancreatic cancer using all three of
these diet measurement methods.
Data-driven diet findings are mixed. Two large studies assessed a prudent and a Western diet
pattern. The prudent pattern was characteristic of healthful eating (vegetables, legumes, fruit, whole
grains, etc.). The Western diet pattern contained food items such as red meat, processed meat, refined
grains, sugar drinks, etc. In a pooled analysis of the Health Professionals Follow-Up Study and the
Nurses' Health Study, these diet patterns were not associated with risk of pancreatic cancer [42]. In
contrast, a smaller case-control study of 532 cases and 1,701 controls, which used factor analysis to
determine diet characteristics associated with pancreatic cancer, found the prudent diet to be associated
with a near 50% reduction in pancreatic cancer among men and women for those who most closely
adhered to the diet (Q5), relative to those who least closely adhered (Q1) (Men OR: 0.51 95% CI [0.31,
0.84], Women OR: 0.51 95% CI [0.29, 0.90]). This same study also found a Western diet to be associated
with an increase in risk of pancreatic cancer but only among men (OR: 2.4 95% CI [1.3, 4.2]) [43].
There are limited results on the association between specific dietary quality index measures and
pancreatic cancer. An analysis of the NIH-AARP cohort showed an inverse association between HEI-2005
adherence and pancreatic cancer. In this study, the closest adhering quintile showed a 15% (Hazard
Ratio [HR]: 0.85 95% CI [0.74, 0.97]) reduction in the rate of PDAC, relative to those who adhered least closely to the
HEI-2005 pattern. This diet pattern is defined as having moderate consumption of fruits, fruit juice,
vegetables, grains, milk, meat, beans, and oils. Closer adherence also favors low consumption levels of
saturated fat, sodium, and calories from solid fats, alcohol, and sugar. The stratified analysis of HEI-2005
components showed strongest associations for dark green and orange vegetables and legumes, total
grains, and milk [44]. These dietary components are the largest contributing factor for decreased risk.
The NIH-AARP cohort also evaluated a modified version of the alternative Mediterranean Diet.
This dietary index has characteristics like the prudent diet. The index scales from 0 to 8, with a score of 8
representing closest adherence. Those who received an index score of 5 or greater showed an inverse,
non-significant association with pancreatic cancer (RR: 0.92, 95% CI [0.81, 1.05]) [45]. The only other
prospective cohort, examining the association between Mediterranean diet and pancreatic cancer,
found an 18% reduction in rate of pancreatic cancer among men (HR: 0.82 95% CI [0.68, 0.99]) and a
17% reduction in rate among women (HR: 0.83 95% CI [0.69, 1.00]), per one-step change in diet score
[46].
In addition to dietary patterns, several studies have examined the association between specific
food items and risk of pancreatic cancer. A 2012 meta-analysis of seven prospective cohort studies
reported a significant association between PDAC and processed meats. Each 50g increase in consumed
processed meat per day was associated with a 19% (RR: 1.19 95% CI [1.04, 1.36]) increase in risk of
pancreatic cancer. In contrast, the same meta-analysis found increased risk, but not significant, for red
meat consumption and pancreatic cancer (RR: 1.13 95% CI [0.93, 1.39] per 120g/day) [47]. Within the
MEC, an earlier analysis identified an association between red meat, processed meat and PDAC [48]. A
later analysis in the MEC, with the accumulation of significantly more cases, did not find an association
using a trend test of consumption but still found a significant association among highest consumption
quartile, relative to the lowest [38].
Red and processed meats cooked at high temperature contain heterocyclic aromatic amines
(HAAs). These compounds, although not carcinogenic themselves, are metabolized to form compounds that
can cause DNA adducts. N-nitroso compounds (NOCs) and polycyclic aromatic hydrocarbons (PAHs) are
also found in red and processed meats. These compounds are likely involved in the carcinogenic process
leading to PDAC [49, 50].
In addition to lower consumption of red and processed meats, healthful diets may
decrease risk of pancreatic cancer through reduced consumption of fats. The NIH-AARP cohort found
positive, statistically significant, associations between total fat, saturated fat, monounsaturated fat, and
pancreatic cancer risk. These associations were present in fat sources of red meat, dairy products, and in
other animal food sources that include other meats, fish, and eggs [51]. A later study using the Nurses'
Health Study found no association between pancreatic cancer and total fat, saturated fat, polyunsaturated fat,
monounsaturated fat, trans-fat, or cholesterol. This discrepancy in results may be due to lack of power, as the
Nurses’ Health Study only contains 178 accrued and confirmed cases, relative to the 1,337 in the NIH-AARP
study [52]. Thiebaut et al. discuss a number of possible mechanisms that could explain the
association between dietary fat and pancreatic cancer observed in the AARP cohort. The most direct
mechanism postulated is pancreas hypertrophy and hyperplasia that results from elevated enzyme
production needed for fat digestion [51].
Known Risk Factors of Pancreatic Cancer: Semi-Modifiable
BMI
A 2012 meta-analysis found a number of anthropometric measures to be associated with
pancreatic cancer [53]. In this analysis, a 5-unit change in BMI was associated with a 10% increase in risk
of pancreatic cancer (RR: 1.10 95% CI [1.07, 1.14]). This association was consistent for both sexes. A
sensitivity analysis identified a likely non-linear association between BMI and pancreatic cancer; larger
BMI values had greater than additive effects. The same study also found incident pancreatic cancer to
be associated with waist circumference (RR: 1.11; 95% CI [1.05, 1.18], per 10 cm) and waist-to-hip ratio
(RR: 1.19; 95% CI [1.09, 1.31], per 0.1 unit) [53]. These findings were consistent with a large pooled
analysis using data from the National Cancer Institute Pancreatic Cancer Cohort Consortium, which
found adult BMI from self-reported height and weight to be associated with incident pancreatic cancer
[6]. The most discussed pathway for incident pancreatic cancer stemming from body fat is through
disruption of insulin production and insulin resistance, most commonly seen in diabetics.
Diabetes
Type 2 diabetes is one of the strongest risk factors for pancreatic cancer. A 2014 meta-analysis
that summarized results across 88 studies found 1.97 (95% CI: 1.78, 2.18) times the risk of pancreatic
cancer among type 2 diabetics, relative to non-diabetics [54]. As with body fat, the meta-analysis
findings are consistent with pooled analysis results from the Pancreatic Cancer Case-Control
Consortium. Among participants recruited via population-based sampling, history of diabetes was
associated with 1.50 (95% CI: 1.31, 1.71) times the odds of pancreatic cancer. Hospital-based studies
showed a significantly stronger association [55]. The risk of pancreatic cancer among participants with a
diabetes duration less than 1 year was significantly higher than those with a diabetes duration of two
years or more. This finding is likely the result of pancreatic cancer cases developing diabetes during the
latency period of the pancreatic cancer. This has also been observed in the MEC, where the association
between diabetes and pancreatic cancer was 2.3 times stronger among diabetics with a duration less than 3 years,
relative to those with a longer duration [56].
Infections
Some infections have been associated with increased risk of pancreatic cancer, namely Helicobacter pylori (H.
pylori), hepatitis B virus (HBV), and hepatitis C virus (HCV). There is limited evidence showing an association
between HBV, HCV, and pancreatic cancer [57-59]. As discussed by Hassan et al., the major reason for
investigating hepatitis as an exposure is the close proximity of the liver to the pancreas. Both HBV and
HCV are major risk factors for liver cancer. The proximity and dependence between the liver and
pancreas may mean similar risks of cancer resulting from a hepatitis infection. In further support for the
hepatitis-pancreatic cancer relationship, the HBV surface antigen has been measured in pure pancreatic
juice and bile, with later studies showing HBV replication and inflammation response of exocrine
epithelial cells. The evidence for the hepatitis-pancreatic cancer relationship is limited because of
the rarity of pancreatic cancer in conjunction with the rarity of hepatitis as an exposure. This
combination results in too few exposed cases for a statistical analysis of adequate power. This rarity of
hepatitis as a cause of pancreatic cancer is further supported by SEER data showing a population
attributable fraction of only 0.13% for HCV [59].
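A population attributable fraction combines how common an exposure is with how strongly it is associated with disease. One standard form is Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the exposure prevalence and RR is the relative risk. The Python sketch below assumes this formula; the cited SEER analysis may have used a different estimator, and the input values are hypothetical, chosen only to show that a rare exposure yields a small PAF even when the association is real.

```python
def population_attributable_fraction(prevalence, relative_risk):
    """Levin's formula for the population attributable fraction.

    prevalence: proportion of the population exposed (0 to 1).
    relative_risk: risk in the exposed divided by risk in the unexposed.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs (not taken from the cited study): an exposure carried
# by 1% of the population with a relative risk of 1.2.
paf = population_attributable_fraction(prevalence=0.01, relative_risk=1.2)
print(f"PAF = {paf:.2%}")
```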
The more common bacterial infection, H. pylori, has also been identified as a risk factor for
pancreatic cancer. A 2013 meta-analysis including 9 case-control studies found a positive association
between H. pylori and pancreatic cancer. A sensitivity analysis, restricted to four high-quality
studies, found H. pylori infection to be associated with 1.28 (95% CI: 1.01, 1.63) times the odds of
developing pancreatic cancer later in life [60].
Known Risk Factors of Pancreatic Cancer: Non-Modifiable
Air Pollution
Air pollutants are classified either as particulate matter (PM) or gaseous compounds. Particulate
matter is grouped by size of particles and compound. Gaseous compounds are grouped by chemical
composition. Common gas pollutants (ozone (O3), sulfur oxides (SOx), and nitrogen oxides (NOx)) are
created from fossil fuel combustion but also arise from natural emissions. Common particle pollutants,
such as lead, inorganic ions, and metal oxides, result from fuel combustion and biomass combustion
[61].
There are only six studies that report associations between air pollutants (PM10, NOx, H2S, SOx,
O3) and pancreatic cancer [62-67]. Of these studies, three were conducted in the United States [64-66],
one in Denmark [63], one in Italy [62], and one in China [67]. Only one of these studies, Ancona et al.,
found a statistically significant association of SOx and PM10 with incident pancreatic cancer [62].
Ancona et al. conducted a study that was based on a cohort of individuals living within a 7km
radius of a major landfill, incinerator, and refinery adjacent to Rome, Italy. Because of this unique
setting, a landfill-specific gas emission model was used to measure air pollutants (H2S, SOx, and PM10).
Among women, SOx was associated with 1.75 (95% CI: 1.02, 3.01) times the rate of pancreatic cancer for
each additional 2.882 μg/m3. In both men (52 cases) and women (64 cases), each additional 0.027 μg/m3 of PM10 was
associated with pancreatic cancer (HR Men: 1.40 95% CI [1.03, 1.90]; HR Women: 1.47 95% CI [1.12,
1.93]). This study is unique because it is based on a region with large variation in pollutant exposure
levels and uses an exposure assessment model specific to the exposure type. Although these
characteristics may be useful in determining an association between air pollution and pancreatic cancer,
the maximum exposure levels and variation observed in the study are uncommon to most of the world.
The remaining studies, which found no association between measured pollutants and pancreatic
cancer, were based on large geographical regions in Denmark and the United States [63-66]. Use of large
regions allows for better generalization of results but requires exposure estimation to perform well on a
large scale. Both studies considered an extensive list of confounders in their models, reducing the
likelihood of residual confounding distorting measures of association.
The earlier study by Raaschou et al. accrued 112 cases of incident pancreatic cancer over the
study period. Measures of association were only estimated for nitrogen oxides (NOx). This was done
because NOx is highly correlated with other pollutants and likely captures their effects. The authors
found no association (HR: 0.62 95% CI [0.21, 1.71] per μg/m3 of NOx).
The most recent project reporting on the air pollution-pancreatic cancer relationship is the
Cancer Prevention Study [64]. The association between PM2.5, NO2, O3, and pancreatic cancer mortality
was measured using data on 623,048 participants and 3,812 pancreatic cancer deaths across the United
States. None of PM2.5, NO2, or O3 was found to be associated with pancreatic cancer mortality in this
study. The exposure levels in this study were estimated at a national level, meaning they must generalize
accurately back to the participants’ local area. Additionally, this study used mortality as an outcome, not
incident pancreatic cancer. Although pancreatic cancer has poor survival, disease severity and treatment
may confound the observed relationship. Ideally, a region-specific model should be used to estimate
exposure status, and incident pancreatic cancer should be used as an outcome in place of death unless
diagnostic information, such as staging and treatment, is not associated with pollutant levels [68].
Genetics
Family history of pancreatic cancer is significantly associated with incident pancreatic cancer.
Individuals with a first-degree relative who has had pancreatic cancer are 9 times more likely to develop
the disease (standardized incidence ratio: 9.0 95% CI [4.5, 16.1]). This relationship becomes significantly
stronger with each additional first-degree relative with pancreatic cancer [69]. The observed relationship
between family history and pancreatic cancer highlights the strong association between genetics and
this disease. There are several familial mutations that are strong risk factors for pancreatic cancer [70].
Peutz-Jeghers syndrome, which is characterized by intestinal polyps, is estimated to carry over 76 times
the risk of pancreatic cancer relative to those without the disorder. Other familial disorders such as
Lynch syndrome and hereditary pancreatitis also show a strong increase in risk of pancreatic cancer [70].
There are several genetic studies identifying and replicating specific loci associated with
pancreatic cancer. These studies have primarily focused on those of European ancestry [71-76]; however,
some recent studies have also examined the influence of genetics in Chinese and Japanese populations
[77-79]. Most results for Europeans have been identified by the Pancreatic Cancer Cohort Consortium
(PanScan) and The Pancreatic Cancer Case-Control Consortium (PanC4). PanC4 is a series of hospital-
based case-control studies that is now included as part of PanScan analysis. An additional consortium,
the PANcreatic Disease ReseArch (PANDoRA) consortium, has also reported independent results and has
been used in the replication of PanScan study results [80-82]. There are currently 24 loci discovered in
European studies, 4 from Japanese, and 5 in Chinese (Table 1).
The majority of pancreatic cancer loci have been identified through a series of genome-wide
association studies (GWAS) (PanScan 1 to PanScan 3), which have later included data from the PanC4 and
PANDoRA projects [69]. The first iteration of PanScan contained 1,896 cases and 1,939 controls used in
the discovery phase. In this project, authors identified the SNP (rs505922), on the ABO blood type gene,
to be associated with pancreatic cancer [72]. These results were soon replicated using self-reported
blood group from the Nurses’ Health Study and Health Professionals Follow-up Study. The replication
found that, relative to those with type O blood, risk of pancreatic cancer was increased for
those with type A (32% increase), AB (51% increase), and B (72% increase) blood. Blood type likely
affects risk of pancreatic cancer through levels of inflammation and altered immune response following
cell mutation.
Following this study, additional loci were discovered among Chinese and Japanese in two
separate studies. The first Asian study, conducted by Low et al., contained 991 Japanese cases and 5,209
Japanese controls used in the discovery phase. Authors identified three SNPs on chromosomal loci
6p25.3, 12p11.21, and 7q36.2 to be associated with pancreatic cancer [79]. The second study among Asians,
conducted by Wu et al., in a Chinese sample, contained 981 cases and 1,991 controls. The authors of this
study identified five new loci (21q21.3, 5p13.1, 21q22.3, 22q13.32 and 10q26.11) to be associated with
pancreatic cancer.
These two projects were followed by later iterations of PanScan studies. The most recent
iteration using PanScan data was published in 2018. This study contained 9,040 cases and 17,248
controls used in the discovery phase and identified 5 new loci [71]. Currently, there are 24 loci identified
to be associated with pancreatic cancer in Europeans, 4 identified among Japanese, and 5 identified in
Chinese populations.
Replication of Genetic Findings Across Ethnicities
Multiple projects have attempted to replicate findings from other ethnic groups. In 2017, Wang
et al. attempted to replicate 20 loci in a Chinese sample of 254 cases and 1,200 controls [83]. Cases were
sampled from a large Chinese referral hospital in Shanghai. Controls were from a national sample
collected for use in a prior study. Of the tested SNPs, 3 were identified in Japanese, 6 in Chinese, and 11
in Europeans. Only three of the 20 SNPs met the study-specific Bonferroni-corrected threshold of p <
2.5E-03. Two of these SNPs were identified in Europeans and one was identified in Japanese.
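The study-specific threshold above is consistent with a Bonferroni correction of a 0.05 significance level for 20 tests (0.05 / 20 = 0.0025). A short Python sketch of that arithmetic follows; the SNP labels and p-values are hypothetical placeholders, not results from Wang et al.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_correction(p_values, n_tests, alpha=0.05):
    """Return the labels whose p-value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [label for label, p in p_values.items() if p < threshold]


# Hypothetical p-values for three of the 20 tested SNPs.
hypothetical_p_values = {"snp_a": 0.0004, "snp_b": 0.03, "snp_c": 0.0026}

print(bonferroni_threshold(0.05, 20))
print(significant_after_correction(hypothetical_p_values, n_tests=20))
```

Note that a p-value such as 0.03, nominally significant at 0.05, does not survive the correction, which is one reason replication counts differ between studies that do and do not correct for multiple testing.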
Among Japanese, Nakatochi et al. tested 61 SNPs discovered in previous studies [84]. This study
used 664 cases and 664 controls that were recruited from multiple hospitals across Japan. Of the SNPs
tested, 13 remained statistically significant at p < 0.05. Most of the loci that remained significant were
discovered in Europeans, with only three discovered in a Chinese sample. Finally, the most recent
European genetic study tested the association between pancreatic cancer and 8 risk loci discovered in
Chinese and Japanese. Among these, only one SNP, discovered in Japanese, was statistically significant at p < 0.05 in
the European sample.
There are two major reasons for the lack of replication in some ethnic samples. First, Asian
studies have 1/5 to 1/10 the sample size of the European studies, leading to lower power for detecting
true signals. The strongest and earliest signals from European studies, such as the ABO locus, replicate
best due to their effect size. Second, differing LD structures between ethnic groups may affect results.
Non-causal tag SNPs may not be present or associated with the causal SNPs in the replicating ethnic
group.
Published Polygenic Risk Scores
There are only two published polygenic risk scores for pancreatic cancer. In the most recent
European results, Klein et al. estimated the odds of pancreatic cancer using 22 independent genome-
wide significant risk SNPs in a weighted-PRS model [71]. Relative to the reference group in the 40-60%
risk category, those in the greater than 90% category had 2.20 (95% CI: 1.83, 2.65) times the odds of
pancreatic cancer. Those in the less than 10% category had 0.40 (95% CI: 0.38, 0.55) times the odds of
pancreatic cancer, relative to the reference [71]. This PRS model shows strong performance among
Europeans.
A weighted PRS has also been built in Japanese using 61 SNPs identified in previous studies [84].
In development of this PRS, step-wise regression was used to identify 5 SNPs at separate loci
associated with pancreatic cancer. Relative to those in the mid-quantile (reference group), risk of
pancreatic cancer was 0.62 (95% CI: 0.42, 0.91) times as high for those in the lowest quantile and 1.98 (95%
CI: 1.42, 2.76) times as high for those in the highest quantile. In this study, Nakatochi et al. used their data to
determine the set of SNPs included in the PRS, then used the same sample to evaluate the model. This
process may lead to an overestimation of the PRS performance.
Currently there is no published PRS in a multiethnic sample. Both prior mentioned studies that
estimated PRSs showed good performance in their ethnic-specific sample. However, these same studies
showed a limited number of SNPs that replicate from other ethnic-specific samples. This means that
a PRS for a multiethnic sample may need to be based on the subset of SNPs that replicate
across multiple ethnicities to achieve a moderate level of performance.
Development of a multiethnic PRS is important, as many developed countries, such as the United States,
are composed of an ethnically diverse and admixed population. A flexible multiethnic PRS could act as a
tool to guide screening regimens among high risk individuals. Additionally, a PRS would help determine
how much variation in risk of pancreatic cancer between ethnic groups could be attributed to genetics.
Tables
Table 1: Pancreatic Loci Discovered in Prior Studies
| First Author | Year | Type | Sample Ethnicity | Studies | Discovery Cases (n) | Discovery Controls (n) | Novel Loci Discovered (n) (GWAS sig) |
|---|---|---|---|---|---|---|---|
| PanScan Studies | | | | | | | |
| Amundadottir "PanScan 1" | 2009 | GWAS | European | PanScan 1¹ | 1,896 | 1,939 | 1 |
| Petersen "PanScan 2" | 2010 | GWAS | European | PanScan 1-2² | 3,851² | 3,934² | 3 |
| Wolpin "PanScan 3" | 2014 | GWAS | European | PanScan 1-3, PANDoRA, CALGB | 5,107³ | 8,845³ | 6 |
| Childs | 2015 | GWAS | European | PanScan 1-2, PANDoRA, PanC4⁴ | 7,638 | 7,364 | 4 |
| Zhang | 2016 | GWAS | European | PanScan 1-3, PANDoRA, PanC4 | 5,107 | 8,845 | 3 |
| Klein | 2018 | GWAS | European | PanScan 1-3, PanC4, PANDoRA | 9,040 | 17,248 | 5 |
| Asian Studies | | | | | | | |
| Low | 2010 | GWAS | Japanese | / | 991 | 5,209 | 3 |
| Wu | 2011 | GWAS | Chinese | / | 981 | 1,991 | 5 |
| Nakatochi | 2018 | PRS/Replication | Japanese | / | 664 | 664 | 0 |
| Lin | 2018 | GWAS | Japanese | JaPAN, NCC, BJJ | 2,039 | 32,592 | 1 |
| Other/Mixed | | | | | | | |
| Chen | 2018 | Meta-Analysis/Replication | European, Japanese, Chinese | / | / | / | 0 |
1. Cohorts used for discovery, case-control studies (which later became PanScan 2) for "Fast-Track
Replication" of hits.
2. Genotyped and imputed PanScan 1's "Fast-Track" case control studies for GWAS, which are now called
PanScan 2. No stages for this analysis.
3. Discovery Phase is PanScan 1 and PanScan 2.
4. PanC4 refers to Pancreatic Cancer Case-Control studies (as those used in PanScan 2).
Chapter 2: Replication and Genetic Risk Score Analysis for Pancreatic Cancer in a
Diverse Multiethnic Population
Abstract
Background: Genome-wide association studies have identified several single nucleotide
polymorphisms (SNPs) associated with pancreatic cancer risk. No studies yet have attempted to
replicate these SNPs in US minority populations. We aimed to replicate the associations of 31 GWAS-
identified SNPs with pancreatic cancer and build and test a PRS for pancreatic cancer in an ethnically
diverse population.
Methods: We evaluated 31 risk variants in the MEC and the Southern Community Cohort Study.
We included 691 pancreatic ductal adenocarcinoma cases and 13,778 controls from African-American,
Japanese-American, Latino, Native Hawaiian, and white participants. We tested the association between
each SNP and PDAC, established a PRS using the 31 SNPs and tested the association between the score
and PDAC risk.
Results: Eleven of the 31 SNPs were replicated in the multiethnic sample. The PRS was
associated with PDAC risk [OR top vs. middle quintile = 2.25 (95% CI: 1.73, 2.92)]. Notably, the PRS was
associated with PDAC risk in all ethnic groups except Native Hawaiian (OR per risk allele ranged from
1.33 in Native Hawaiians to 1.91 in African Americans; P heterogeneity=0.12).
Conclusions: This is the first study to replicate 11 of the 31 GWAS-identified risk variants for
pancreatic cancer in multiethnic populations, including African Americans, Japanese Americans and
Latinos. Our results also suggest a potential utility of PRS with GWAS-identified risk variants for the
identification of individuals at increased risk for PDAC across multiple ethnic groups.
Introduction
Pancreatic cancer is the fourth leading cause of cancer deaths in the United States with over
56,000 new cases and 45,000 deaths in 2019 [85]. By 2030, pancreatic cancer is projected to be the
second leading cause of cancer-related death [86]. Diagnosis at a late stage is common because early-stage
disease is often asymptomatic and no routine form of screening exists [31]. These characteristics result in a
5-year survival of only 9% [85], emphasizing the importance of primary prevention strategies for this
disease.
Pancreatic cancer incidence differs by ethnicity. African Americans experience 1.36 times the
rate of pancreatic cancer (10.4 per 100,000) relative to non-Hispanic whites (7.7) [87]. Differences in
incidence rates are observed across other ethnic groups [Hispanic (7.1), Japanese (8.1), Asian/Pacific
Islander (6.2 per 100,000)]. In the MEC, the incidence rates of pancreatic cancer are notably higher
among Native Hawaiians (1.8 times that of whites), followed by African Americans and Japanese
Americans (1.3-1.4 times that of whites) [88]. Epidemiologic studies have associated body mass index
(BMI) [6, 53, 88], type 2 diabetes [54, 55, 88], diet patterns [43, 44] and smoking [33, 88] with pancreatic
cancer. In the MEC, ~20% of pancreatic cancer can be attributed to these factors [89].
Common genetic variants have been associated with pancreatic cancer risk in genome-wide
association studies [71-77, 79, 90]. So far, these GWAS have identified 31 risk variants for pancreatic
cancer. Twenty-two were identified by the PanScan and the PanC4 studies, composed of populations of
primarily European-ancestry [71-76]. Of the remaining variants, four were discovered in Japanese and
five in Chinese [77, 79, 90].
The associations between GWAS variants and pancreatic cancer have yet to be examined in
other ethnic groups, especially in high-risk African Americans and other minority populations. Few single
nucleotide polymorphisms identified in European-ancestry populations have replicated in Asian samples. In a Chinese sample, Wang
et al. replicated 4 SNPs identified in GWAS and pathway analysis in Europeans, Chinese, and Japanese
[83]. Among Japanese, Nakatochi et al. replicated 13 GWAS-significant and suggestive loci
discovered in Europeans, Japanese, and Chinese [84]; Ueno et al. replicated one European-ancestry
locus in Japanese [91]. Similarly, cross-ethnic replication among Europeans has been limited [71, 92], with
only one Japanese-identified SNP replicated in Europeans [71]. Additionally, three other Asian GWAS
have also reported on replication of GWAS identified SNPs in their samples [77, 79, 90]. Lack of
replication of pancreatic cancer-associated SNPs across ethnic groups may be due to low minor allele
frequencies, monomorphic loci, and differences of linkage disequilibrium of tagging SNPs between
ethnic groups. Identifying the association of these SNPs with pancreatic cancer in a multiethnic
population, and in ethnic-specific analyses, will help us identify the value of these SNPs for disease
prediction in an admixed sample.
In this study, we assessed the transportability of prior GWAS findings in an ethnically diverse
population and examined how these variants contribute to pancreatic cancer risk across populations.
We first attempted to replicate the 31 GWAS-significant risk variants in the MEC and the Southern
Community Cohort Study (SCCS). Using the 31 SNPs, we then built a multiethnic PRS and assessed its
association with pancreatic cancer risk.
Materials and Methods
Study Population: This study included case-control samples within the MEC and SCCS.
Information on recruitment, characteristics, and case ascertainment in the MEC and SCCS has been
described [93, 94] (Appendix A). Briefly, the MEC is a population-based prospective cohort study
initiated between 1993 and 1996 to investigate cancer etiology. The MEC consists of over 215,000 men
and women from Los Angeles County and Hawaii who were 45 to 75 years old at enrollment and from
these racial/ethnic groups: African Americans, Japanese Americans, Latinos, Native Hawaiians and
whites. The SCCS was initiated in 2002 to investigate sources of racial disparities in cancer and chronic
disease. The SCCS participants were mainly African Americans and whites between the ages of 40 and 79
who resided in one of 12 US southern states. At baseline, the MEC and SCCS gathered detailed
information on demographics, lifestyle, diet, anthropometry, reproductive history, and medical history.
In both cohorts, cancer cases were identified through annual linkage to state cancer registries.
Pancreatic cancer cases were defined as primary invasive pancreatic cancer with pancreatic ductal
adenocarcinoma histology (ICD-O-3 code C25). Controls were selected by matching to incident cases
based on age, sex, and ethnicity. For the MEC, we also added eligible controls (without PDAC) with
genotype data from prior GWAS. We conducted all analyses with the original cases and matched
controls then with added controls. We present the results using the added controls since the effect
estimates were similar between analyses and we had improved statistical power.
Genotyping, Quality Control and Genotype Imputation: Samples were genotyped using the
Multi-Ethnic Genotyping Array (MEGA) chip (Illumina, San Diego, CA), which was developed to ensure
genome-wide coverage of variants down to 1% frequency in non-European ancestry populations.
Samples underwent an intensive quality control process including SNP call-rate filtering, sample call rate
filtering, concordance checks of inter- and intra-plate controls, removal of redundant or discordant
variants based on location and call rates, and removal of SNPs with race-specific allele frequency differences
over 25% in comparison to 1000G phase 3 race-specific estimates (Appendix B). Following QC, 932,530
SNPs, 691 cases and 13,778 controls were used for imputation. The sample was stratified based on self-
reported ethnicity, then phased with ShapeIT v2 and imputed with Minimac3 using the cosmopolitan 1000 Genomes
Project reference panel (Phase 3 v5).
Statistical Analysis: Participants with missing covariate values or with implausible values for age,
sex, diabetes, and body mass index (BMI in kg/m²) were removed from analysis. Related samples (first-
and second-degree relatives) were identified using KING software for robust relationship inference and then
removed based on a kinship coefficient of 0.0884 or greater (Appendix C) [95].
SNP and PDAC associations were examined using logistic regression, adjusting for age at sample
collection, sex, study, BMI, diabetes, and population stratification using principal components (PC 1-6).
PCs were estimated using PLINK and a set of >50,000 independent SNPs [96]. Most global ancestry
variation among the five ethnic groups was captured in the first six PCs (Appendix D). Measures of
association were reported on the ratio scale along with corresponding likelihood ratio test (LRT) p-
values. As a sensitivity analysis, we estimated multiethnic associations by meta-analyzing ethnic-specific
results using both fixed effect and random effects models. We present multiethnic pooled results since
there was no effect heterogeneity between the pooled multiethnic analysis and the meta-analyses. All
SNPs were modeled as log odds of PDAC per risk allele (0, 1, 2). A log-odds weighted PRS was estimated
for each participant by multiplying the multiethnic log-odds for each of the 31 SNPs by the number of
risk alleles at the given locus, then summing all values. This PRS took the following form:
$\mathrm{PRS} = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \dots + \beta_n x_n$. In this algorithm, $\beta_k$ is the log-odds ratio for risk of PDAC associated
with a per-allele increase in risk for a given SNP in our replication analysis, and $x_k$ is the number of risk alleles
an individual has for the corresponding SNP (0, 1, or 2). We additionally conducted sensitivity analyses
using the following alternative weighting methods: external weights, external weights only using 22
SNPs from European studies, unweighted, ethnic-specific internal weights, and multiethnic weights from
a meta-analysis of ethnic-specific associations from the replication, using both random-effects and
fixed-effect models.
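The log-odds weighted score described above can be illustrated with a minimal Python sketch. This is an illustration only, not the code used in the study (the analysis was conducted in R). The function name and the example weights and genotypes are hypothetical, and the check that allele counts are 0, 1, or 2 simply follows the description in the text.

```python
from typing import Sequence


def polygenic_risk_score(log_odds: Sequence[float], risk_alleles: Sequence[int]) -> float:
    """Log-odds weighted PRS: the sum over SNPs of (per-allele log-odds) x (risk allele count)."""
    if len(log_odds) != len(risk_alleles):
        raise ValueError("one weight is required per SNP")
    for count in risk_alleles:
        # The text describes risk allele counts of 0, 1, or 2 per SNP.
        if count not in (0, 1, 2):
            raise ValueError("risk allele counts must be 0, 1, or 2")
    return sum(beta * count for beta, count in zip(log_odds, risk_alleles))


# Hypothetical weights and genotype for three SNPs (not values from this study).
example_weights = [0.20, 0.10, 0.05]
example_genotype = [2, 1, 0]
example_score = polygenic_risk_score(example_weights, example_genotype)
```

The alternative weighting schemes listed above would change only the weights passed in; an unweighted score, for instance, corresponds to setting every weight to 1.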
Logistic regression was used to estimate the log odds of PDAC based on binned percentiles ([1%-
20%], (20%-40%], (40%-60%], (60%-80%], (80%-100%]) generated using the PRS distribution among
controls within each ethnic group, except for the multiethnic analysis which used the control
distribution from all groups combined. Log-odds of PDAC in each percentile group were compared to the middle
quintile category (40%-60%]. The PRS was also modeled continuously, after standardizing the score to
the ethnic-specific interquartile range (IQR) among controls, except for the multiethnic analysis which
used all controls. The replication and PRS analyses were stratified by ethnicity. P <0.05 was used to
determine statistical significance. Analysis was conducted using R 3.5.0 [97].
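The binning and standardization steps can be sketched as follows. This is a hedged Python illustration rather than the study's R code. It assumes that cut points are the 20th, 40th, 60th, and 80th percentiles of the control scores, that bins are closed on the right, and that standardizing to the IQR means dividing the score by the control IQR so that a one-unit change equals one IQR. The function names and the choice of the "inclusive" percentile method are assumptions.

```python
from bisect import bisect_left
from statistics import quantiles


def quintile_cutpoints(control_scores):
    """20th, 40th, 60th, and 80th percentiles of the PRS among controls."""
    return quantiles(control_scores, n=5, method="inclusive")


def assign_quintile(score, cutpoints):
    """Return the bin number 1-5; bins are right-closed, e.g. (20%, 40%]."""
    return bisect_left(cutpoints, score) + 1


def iqr_standardize(score, control_scores):
    """Express a score in units of the control interquartile range."""
    q1, _, q3 = quantiles(control_scores, n=4, method="inclusive")
    return score / (q3 - q1)
```

In an ethnic-specific analysis the control scores would come from one ethnic group only; in the multiethnic analysis they would come from all controls combined. Bin 3 would serve as the reference category.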
Results
Sample Characteristics: The final analytical sample included 691 PDAC cases and 13,778 controls
(518 cases and 13,426 controls from the MEC; 173 cases and 352 controls from the SCCS). The largest group of cases
was African American (230 cases/5,235 controls), followed by Japanese American (181 cases/3,285
controls), white (132 cases/570 controls), Latino (105 cases/ 2,935 controls), and Native Hawaiian (43
cases/1,753 controls) (Table 2). SCCS samples were younger, had a higher prevalence of diabetes, and a
higher mean BMI than MEC participants. Diabetes was common among SCCS African Americans (32.4%
of cases) and MEC Native Hawaiians (25.6% of cases). A large portion of the sample was overweight or
obese. SCCS African-American and white cases had a mean BMI of 31.4 kg/m². MEC Japanese Americans
had the lowest mean BMI (24.8 kg/m² among cases).
SNP Frequencies: All SNPs except rs78193826 and rs35226131 had a minor allele frequency
(MAF) >0.05 in the multiethnic sample. Multiple SNPs were rare in ethnic-specific groups (Appendix E),
and all had risk allele frequencies similar to those reported in prior studies (Appendix F). Among
cases or controls combined, there were 2 SNPs in the multiethnic sample with a MAF <0.05, 3 in whites,
3 in African Americans, 6 in Japanese Americans, 2 in Latinos, and 3 in Native Hawaiians. When
considering MAF <0.01, there were 0 SNPs in the multiethnic sample, 2 in whites, 2 in African
Americans, 2 in Japanese Americans, 1 in Latinos, and 1 in Native Hawaiians.
Replication Analysis: 11 of the 31 SNPs were replicated at P <0.05, with consistent direction of
association with that observed in the literature (Figure 1; Figure 2; Appendix E). Of the replicating SNPs,
10 were discovered in Europeans (rs505922, rs6971499, rs4795218, rs10094872, rs401681, rs7190458,
rs7214041, rs1517037, rs13303010, rs9543325) and one (rs1547374) in Chinese. Replicating SNPs had a
similar mean effect size to what is reported in the literature (mean log-odds of replicating SNPs from
literature: 0.20; mean log-odds of replicating SNPs in multiethnic sample: 0.17). Of these 11 replicated
SNPs, 8 were statistically significant in at least one ethnic group after filtering out SNPs with an MAF
<0.05 in cases or controls. Four replicated in African Americans, three in whites, one in Japanese
Americans, one in Latinos, and one in Native Hawaiians, at P <0.05. Within the set of 11 replicating SNPs
in the multiethnic sample, with MAF >0.05 among cases or controls, we assessed directional consistency
of SNP associations with the literature. The direction of effect was consistent for 9/11 SNPs among whites,
11/11 among African Americans, 8/9 among Japanese Americans, 10/11 among Latinos, and 7/11 among
Native Hawaiians.
Of the 20 SNPs not replicating in the multiethnic sample with an MAF >0.05, one was replicated
in Native Hawaiians (rs372883; discovered in Chinese), with consistent direction of association to that
observed in the literature. Within this set of non-replicating SNPs in the multiethnic sample, with ethnic-
specific MAF >0.05, we assessed directional consistency of associations in each race/ethnic group with
the literature. The direction of effect was consistent for 15/17 SNPs among whites, 10/17 among African
Americans, 8/16 among Japanese Americans, 12/17 among Latinos, and 7/16 among Native Hawaiians.
Polygenic Risk Score: We estimated a genetic risk score using the multiethnic effect estimates as
the weight for each SNP. When comparing PRS distributions, we observed a significant difference in risk
scores by case status, where cases had 0.13 higher mean PRS than controls (P<0.001). In the multiethnic
sample, those in the (80%-100%] risk score group had OR=2.25 (95% CI: 1.73, 2.92) for PDAC relative to
the reference group (Figure 3; Appendix G). The IQR-standardized PRS was significantly associated with
PDAC (OR per IQR increase=1.94; 95% CI: 1.70, 2.22; P LRT=4.092 × 10⁻¹⁷). The PRS was significantly
associated with PDAC risk in African Americans (OR per IQR increase=1.91; 95% CI: 1.55, 2.35; P LRT:
3.121 × 10⁻⁸), Japanese Americans (OR=1.46; 95% CI: 1.19, 1.79; P LRT: 0.003), Latinos (OR=1.65; 95% CI:
1.26, 2.17; P LRT: 0.013), and whites (OR=1.85; 95% CI: 1.38, 2.46; P LRT: 2.937 × 10⁻⁴). There was no
significant ethnic heterogeneity of the association between PRS and PDAC in the continuous model (P
heterogeneity=0.12). We observed similar results when using alternative weighting schemes for the
multiethnic analysis (Appendices G-M).
Discussion
We investigated the association of GWAS-identified SNPs with pancreatic cancer risk in an
ethnically diverse population. Of the 31 SNPs tested, we replicated 11 in the multiethnic sample at an
alpha of 0.05. In comparison to prior replication attempts across racial groups [71, 83, 84], we found a
number of SNPs identified in Europeans and Asians to be associated with PDAC in a US multiethnic
sample. Furthermore, we showed the potential utility of PRS with GWAS-identified risk variants for the
identification of individuals at increased risk for PDAC across multiple ethnic groups.
Of the SNPs tested, the 3 least common SNPs in our multiethnic sample (MAF <0.1 among
controls) were not replicated, but 60% of the most common SNPs in our sample (MAF >0.4) were
replicated. This highlights a possible pattern between replication and allele frequencies in our
multiethnic sample. Although most SNPs have been identified in GWAS of European ancestry [71-76],
only 3 of these SNPs were replicated in our white population [72, 75], the group which most similarly
reflects European ancestry. This limited replication likely results from the small number of whites
relative to the other ethnic groups in our study.
The most significant replicating SNP (rs505922 in ABO) was the first SNP identified to be
associated with PDAC [72]. This association was significant in whites and African Americans. In our study,
this SNP was not replicated in Japanese Americans; however, it was directionally consistent, with a similar
effect size (OR=1.19; 95% CI: 0.96, 1.48) to prior Japanese studies (ORs range 1.11 - 1.36) [79, 84]. The
next three most significant replicating SNPs in our multiethnic sample (rs6971499, rs4795218,
rs10094872; discovered in European ancestry) have not been replicated across any race/ethnicity to our
knowledge. Following these, rs401681 (CLPTM1L) has been replicated in Chinese [83, 90]. The study that
reports replication found an effect similar to ours (Wang et al. OR=1.39; 95% CI: 1.11, 1.74; MEC/SCCS
OR=1.15; 95% CI: 1.03, 1.29); however, we observed significant heterogeneity of association by ethnic
group (P heterogeneity= 0.01). Another frequent cross-ethnic replication from prior studies, rs9543325
(KLF5), was replicated in our multiethnic sample. This SNP was associated with PDAC in Japanese and
Chinese [77, 79, 84, 90] with a larger effect estimate than in our multiethnic analysis and an estimate
similar to our Japanese-American sample. In contrast, the single Chinese-discovered SNP (rs1547374;
TFF1) [90], which was replicated in our multiethnic sample and in the Latino subset, was not replicated
in prior studies of European [71], Chinese [83], and Japanese samples [84]. Lastly, rs3790844 and
rs3790843 in NR5A2 have been associated with pancreatic cancer in Europeans [71, 98] and Japanese
[79, 91]; however, we did not replicate this finding. The effect estimates in MEC whites and Japanese
Americans were most similar to those reported in European [71] and in Japanese ancestry [79, 91].
The lack of replication between studies is likely due to limited sample size. The size of our
case group is less than half the number of cases included in the first European ancestry GWAS and is
only around 7% the size of the most recent European ancestry GWAS [71, 72]. It is likely that the limited
number of cases in our study, relative to what is seen in pancreatic cancer GWAS, resulted in a lower
statistical power than required for replication of additional SNPs. A second likely factor limiting
replication in our study is racial heterogeneity. Across ethnic groups, differing linkage-disequilibrium
structures can lead to the tagging SNPs not being in association with the true causal SNP, resulting in
lack of replication [99].
We built and tested a multiethnic genetic risk score using the 31 previously identified GWAS
SNPs. The associations between PRS and PDAC risk from the multiethnic and ethnic-specific continuous
models were statistically significant, except in Native Hawaiians. In the ethnic analysis, a monotonic
pattern between categorical PRS and PDAC risk was clearest in African Americans. Two studies have
reported a PRS analysis for PDAC [71, 84]. In the most recent study, Klein et al. used the 22 SNPs
identified in European ancestry to estimate a weighted PRS using their results and found a strong
association with PDAC [71]. In an earlier study, Nakatochi et al. first attempted to replicate 61 GWAS-
identified SNPs (both significant and suggestive) then used the 8 replicating SNPs in stepwise regression
to select five independent SNPs for use in a PRS [84]. They observed significant associations between the
extreme PRS categories and PDAC risk.
Consistent with findings from Klein et al. and the Japanese ancestry PRS in Nakatochi et al. [71,
84], we observed an association between PRS and PDAC risk in whites and Japanese Americans. In our
sensitivity analysis using weights from Klein et al., we found similar performance of the PRS, with three
of the four risk quintile groups differing from the reference. Both previous studies used ethnic-specific
weights in their analysis which might provide a better fit in a large study. In our main analysis, we used
multiethnic weights to uniformly weight SNPs across ethnicities. This was done so that differences in PRS-
PDAC association reflect case-control differences and not discrepancies in ethnic-specific weights, which
can be highly variable due to small sample sizes within some ethnic groups.
There are several strengths and limitations to this study. This is the first replication study of
pancreatic cancer risk variants in a multiethnic population. Our ethnic-specific analysis is the first to
produce replication estimates and show transportability of GWAS findings for multiple ethnic groups,
including African Americans who have notably high pancreatic cancer incidence, yet have not been
studied in the context of genetics. We leveraged existing MEC GWAS data to boost sample size which
improved power needed to replicate multiple SNPs. Finally, we showed that multiethnic estimates for
SNPs known to be associated with pancreatic cancer perform better than expected in both a multiethnic
and ethnic-stratified PRS analysis. Limitations include our relatively small number of cases in comparison
to what was included in previous GWAS, which may be responsible for some SNPs not replicating in our
sample. We stratified the replication and PRS analysis by self-reported ethnicity. As observed in the
principal component figures, there can be considerable variation of global ancestry within these groups.
In conclusion, we successfully replicated 11 of the 31 GWAS-identified loci in a multiethnic
population. These replications provide evidence for the importance of these SNPs in understanding
genetic pancreatic cancer risk in an admixed population and in understudied ethnic groups. We showed
a potential value of PRS with GWAS-identified variants for the identification of individuals at increased
risk for PDAC across multiple ethnic groups. Currently there is no routine screening recommended for
PDAC, and thus PRS may be useful in identifying a subgroup of high-risk individuals who may benefit the
most from screening with endoscopic ultrasound or MRI. Furthermore, with known modifiable risk
factors (i.e. smoking, excess weight, diabetes) for PDAC, PRS may be useful for prioritizing individuals for
targeted health and lifestyle-related interventions.
Tables
Table 2: Characteristics of pancreatic cancer cases and controls
Cohort / Group        Status    n      Age    Female, n (%)   Diabetes, n (%)   Mean BMI
MEC
  African American    Case      94     71.3   56 (59.6)       8 (8.5)           28
                      Control   4961   69.1   3091 (62.3)     757 (15.3)        28.8
  Latino              Case      105    69.5   45 (42.9)       21 (20.0)         28.4
                      Control   2935   67.0   1600 (54.5)     211 (7.2)         27.8
  Japanese American   Case      181    70.9   103 (56.9)      16 (8.8)          24.8
                      Control   3285   69.0   1541 (46.9)     761 (23.2)        25.4
  White               Case      95     69.2   42 (44.2)       0 (0)             26.4
                      Control   492    60.2   234 (47.6)      2 (0.4)           25.1
  Native Hawaiian     Case      43     67.0   21 (48.8)       11 (25.6)         29.6
                      Control   1753   65.4   968 (55.2)      237 (13.5)        28.6
SCCS
  African American    Case      136    57.6   70 (51.5)       44 (32.4)         29.4
                      Control   274    57.5   141 (51.5)      69 (25.2)         30.0
  White               Case      37     56.8   19 (51.4)       10 (27.0)         29.4
                      Control   78     57.3   39 (50.0)       13 (16.7)         28.7
MEC = Multiethnic Cohort; SCCS = Southern Community Cohort Study; BMI = Body Mass Index
Figures
Figure 1: Multiethnic and ethnic-specific replication analysis results for 31 SNPs identified in prior GWAS
of pancreatic cancer in European, Chinese, and Japanese ancestry.
Results are shown on the odds ratio (OR) scale with corresponding 95% confidence intervals (CIs) and
ordered from lowest to highest p-values from multiethnic replication analysis. Only SNPs with minor
allele frequency > 0.05 are shown. RSID = Reference SNP cluster ID; ALT = Alternative (risk) allele; REF =
Reference allele.
Figure 2: Comparison between 31 replicating SNPs from multiethnic replication analysis and most recent
GWAS results on the log OR scale.
Point size corresponds to minor allele frequency (MAF) among controls in the replication analysis. The
red line represents the MAF-weighted least squares fit. One point was removed from the figure due to an extreme
replication result (rs2816938, OR = 0.69). MEC = Multiethnic Cohort; SCCS = Southern Community
Cohort Study.
Figure 3: Multiethnic and ethnic-specific polygenic risk score odds ratios (ORs) and 95% confidence
intervals (CIs).
Weights were taken from the multiethnic replication analysis. The multiethnic analysis used binned risk score
percentile groups from the complete multiethnic sample of controls. Ethnic-specific analyses used
binned risk score percentile groups from the ethnic-specific risk score distribution among
controls. ref = Reference category used in binned regression analysis; P Cts = P-value from continuous
polygenic risk score model.
Chapter 3: The Association Between Ambient Air Pollutants and Pancreatic Cancer
in the Multiethnic Cohort Study
Abstract:
Background: Prior studies examining the association between ambient air pollutants and
pancreatic cancer have been conducted in racially/ethnically homogeneous samples and have produced
mixed results, with some studies supporting evidence of an association with fine particulate matter.
Methods: To further investigate these findings, we estimated exposure levels of particulate
matter (PM 2.5, PM 10) and oxides of nitrogen (NO X and NO 2) using kriging interpolation for 100,527 men
and women from the MEC, residing largely in Los Angeles County from 1993 through 2013. We
measured the association between these air pollutants and incident pancreatic cancer using Cox
proportional hazards models with time-varying pollutant measures, with adjustment for confounding
factors.
Results: A total of 821 incident pancreatic cancer cases and 1,660,488 person-years accrued over
the study period, with an average follow-up time of over 16 years. PM 2.5 (per 10 µg/m³) was associated
with incident pancreatic cancer (hazard ratio [HR] = 1.61; 95% CI, 1.09, 2.37). This PM 2.5 association was
strongest among Latinos (HR = 3.59; 95% CI, 1.60, 8.06) and ever smokers (HR = 1.76; 95% CI, 1.05,
2.94). There was no association for PM 10 (HR = 1.12; 95% CI, 0.94, 1.32, per 10 µg/m³), NO X (HR = 1.14;
95% CI, 0.88, 1.48, per 50 ppb), or NO 2 (HR = 1.14; 95% CI, 0.85, 1.54, per 20 ppb).
Conclusions: Our findings support prior research identifying an association between fine
particulate matter, PM 2.5, and pancreatic cancer. This association was most notable among Latinos and
smokers. Future studies are needed to replicate these results in an urban setting and in a
racially/ethnically diverse population.
Introduction:
Pancreatic cancer is now the fourth leading cause of cancer-related death in the United States,
accounting for over 56,000 new cases and 45,000 deaths in 2019 [85].
Pancreatic cancer is projected to be the second leading cause of cancer death by 2030 [86]. A dismal
five-year survival of nine percent stems from the lack of effective screening for this disease [31, 85] and
the high proportion (~80%) of pancreatic cancers diagnosed at a late stage. These characteristics
highlight the importance of identifying modifiable personal and environmental risk factors that can be
used in primary prevention strategies.
The burden of pancreatic cancer varies across racial/ethnic groups; the incidence is highest in
African Americans and lowest in whites and Latinos [87]; in the MEC, incidence rates are also elevated among Native
Hawaiians and Japanese Americans [88]. Numerous factors, most notably smoking [33, 88],
type 2 diabetes [54, 88], diet quality [44], BMI [6, 53, 88], and common genetic variants [100], have been
associated with pancreatic cancer risk.
In 2013, the International Agency for Research on Cancer classified outdoor air pollution, which
includes PM 2.5, as a carcinogen for humans based largely on evidence for lung cancer [101]. However,
the evidence for pancreatic cancer is still sparse with inconsistent results [62, 64, 66, 67] from four
mortality cohort studies, conducted in the United States [64, 66], Italy [62], and China [67]. Three
studies, including one prospective cohort [64], and two retrospective cohorts [66, 67], investigated the
role of particulate matter with an aerodynamic diameter less than 2.5 μm (PM 2.5) as the exposure, and
one study examined the role of PM 10 [62]. Exposure to PM 2.5 was not associated with risk of pancreatic
cancer in either US study but was positively associated with risk in a Chinese study (hazard ratio = 1.16,
95% CI: 1.13, 1.20, per 10 µg/m³) [64, 66, 67]; PM 10 was also associated with risk of pancreatic cancer in
an Italian study [62]. These mixed results may be related, in part, to study population differences
(including the sample size), differences in air pollutants investigated, use of region-level exposure
measures as individual-level exposure estimates [66, 67], lack of time-varying exposure measures
[66], limited confounder control [62, 67], or use of pancreatic cancer deaths as the outcome [62, 64, 66, 67].
In this study, we examined the association between ambient air pollutants (PM 2.5, PM 10, NO 2,
NO X) and pancreatic cancer risk in a racially/ethnically diverse population, while accounting for the
limitations seen in prior studies.
Materials and Methods
Study Participants
The MEC is a population-based, prospective cohort of over 215,000 men and women in
California and Hawaii. Details of the cohort and enrollment have previously been described [102]. Study
participants were identified using the Department of Motor Vehicles, voter registration lists, and Health
Care Financing Administration files. Enrollment occurred between 1993 and 1996. Participants were
between 45 and 75 years old at the time of enrollment and were from one of five major racial/ethnic
groups (African American, Japanese American, Latino, Native Hawaiian and white). Covariate
information was obtained via a mailed baseline questionnaire to collect data on demographics, diet,
smoking and other lifestyle factors, anthropometric measures, and reproductive history (among
women). This analysis was restricted to MEC participants who resided in Southern California, largely Los
Angeles County, from study enrollment through follow-up.
Incident pancreatic cancers were identified through annual linkage to the California Cancer
Registry, part of the National Cancer Institute's Surveillance, Epidemiology, and End Results Program.
Pancreatic cancer was identified using ICD-O-3 site codes C25.0–C25.9. Vital status and cause of death
were obtained through linkage to the National Death Index and state death certificate files. A total of
100,527 Southern California MEC participants formed our population at risk: they had completed the
baseline questionnaire, reported a valid address that could be geocoded at the parcel or street segment
level across the study period, had valid estimates of air pollutant levels [103], and did not have
pancreatic cancer prior to cohort entry (Appendix N). There were 821 incident pancreatic cancer cases
diagnosed over the study period.
Address history
The details of address history, geocoding, and neighborhood SES (nSES) data for MEC
participants have been described in a previous publication [103]. In brief, addresses at baseline and
during follow-up were recorded for MEC participants. In addition, information was updated using periodic
mailings of newsletters, follow-up questionnaires, administrative data linkages, and registry linkages.
Using this information, participant addresses were geocoded to land parcels or street segments over the
study period (1993-2013). Address records were excluded as invalid if the end time of a residence was prior
to the study start date or the start time of a residence was after the study end date. Geocoded addresses
were then linked to U.S. Census block groups based on the year. Based on the block group and
residential address history at baseline and time of censoring, each participant was assigned a
composite measure of nSES which was then categorized into quintiles based on the nSES distribution of
Los Angeles County.
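The address exclusion rule above amounts to checking whether a residence interval overlaps the study window. A minimal Python sketch follows. The exact study start date is an assumption (the text gives only the years 1993-2013, with follow-up ending Dec 31, 2013), and the function and constant names are hypothetical.

```python
from datetime import date

# Assumed window: the source states 1993-2013; January 1 as the start day is an assumption.
STUDY_START = date(1993, 1, 1)
STUDY_END = date(2013, 12, 31)


def overlaps_study_period(res_start, res_end, study_start=STUDY_START, study_end=STUDY_END):
    """A residence record is kept unless it ends before the study starts
    or starts after the study ends."""
    return not (res_end < study_start or res_start > study_end)
```

A residence spanning 1990 to 1995 would be kept under this rule, while one ending in 1992 or starting in 2014 would be excluded.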
Exposure Assessment
Kriging interpolation was used to estimate each participant’s exposure levels for PM 2.5, PM 10,
NO 2, and NO X [103, 104]. Kriging uses spatial regression to estimate exposure levels at a participant’s residence
from air pollutant levels measured over time at monitoring stations. Measured
concentrations were obtained from the U.S. Environmental Protection Agency routine air monitoring
data. NO 2, NO X, and PM 10 were available for the years 1993-2013, and PM 2.5 from 2000 to 2013. PM 2.5
concentrations for the years prior to 2000 were estimated using a spatiotemporal model that uses PM 10
measurements, meteorological factors, and spatiotemporal characteristics to extrapolate PM 2.5 values in
California [105]. In cases of incomplete address records or incomplete air pollution data, exposure
levels were imputed. Participants with more than 50% imputed data were removed from analysis.
Statistical Analysis
We used Cox regression to examine the association between each air pollutant and incident
pancreatic cancer. Due to variation in air pollutant levels over time, time-dependent exposure variables
were used to estimate cumulative monthly average pollutant levels for each participant-month. We
used participants’ age in months as the timescale for this analysis and defined a series of risk sets based on
the month of diagnosis of each pancreatic cancer event (index case). Using age as the timescale for analysis of
cohort study designs has been shown to produce the least biased measures of association [106, 107] and
adjusts for the effects of age in model fitting. Each risk set consisted of all MEC participants who
remained alive and uncensored at the time of the pancreatic cancer diagnosis. For each member of the
risk set (including the index case), we computed the average exposure from the time of cohort entry
(month/year) up to the time of pancreatic cancer diagnosis based on each participant’s residential
history. Participants were censored at time of pancreatic cancer diagnosis, death, or end of follow-up
(Dec 31, 2013). In the case of tied event times, the Efron approximation was used.
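The cumulative average exposure described above can be illustrated with a minimal sketch. It assumes a hypothetical list of monthly pollutant estimates for one participant, starting at cohort entry; the function name and the example values are illustrative and are not taken from the study data or software.

```python
from itertools import accumulate


def cumulative_monthly_average(monthly_levels):
    """Return the running mean of monthly pollutant levels since cohort entry.

    monthly_levels[k] is the (hypothetical) estimated concentration at the
    participant's residence in the k-th month after cohort entry. Element k
    of the result is the average over months 0..k, i.e. the time-dependent
    exposure value that would apply to a risk set formed in month k.
    """
    running_totals = accumulate(monthly_levels)
    return [total / (k + 1) for k, total in enumerate(running_totals)]


# Hypothetical PM 2.5 series (ug/m^3) for one participant's first four months.
pm25 = [20.0, 22.0, 18.0, 24.0]
averages = cumulative_monthly_average(pm25)
```

For each risk set, the value at the index case's month of diagnosis would be read off for every participant still at risk, so that all members of the risk set are compared on exposure accumulated up to the same point in time.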
Variables considered for inclusion in analysis were age at cohort entry (<50, 50-54, 55-59, 60-64,
65-69, >70), sex (male, female), race/ethnicity (African American, Japanese American, Latino, Native
Hawaiian and white), diabetes status at baseline (yes, no), BMI at baseline (<25, 25-29, >30 kg/m²,
missing), smoking status at baseline (never, former smokers ≥ 20 pack-years, former smokers < 20 pack-
years, current smokers <20 pack-years, current smokers ≥ 20 pack-years, missing, current smokers-
unknown pack-year, former smokers -unknown pack-year), birth year (1918-1922, 1923-1927, 1928-
1932, 1933-1937, 1938-1942, 1943-1948), nSES at baseline and at censorship (quintiles: Q1 (lowest), Q2,
Q3, Q4, Q5 (highest), missing).
The proportional hazards assumption for all covariates was assessed using a test of correlation
between Schoenfeld residuals and time. All models were adjusted for race/ethnicity, BMI, nSES at
enrollment, nSES at censorship, and age at cohort entry. We stratified models by smoking and diabetes
status due to the violation of the proportional hazards assumption for these variables (Appendix O). In a
series of sensitivity analyses, we tested models that included the Alternative Healthy Eating Index (AHEI)
2010 diet score, alcohol consumption, and occupational history. We also considered stratification on age
at cohort entry to allow for differing baseline hazards across entry age [107], and last, we tested models
that did not include the first five years of follow-up to account for possible measurement error of
cumulative average measures at the beginning of the study. We found little difference in associations of
our main effect in these sensitivity analyses and only present our initial models.
A unit change was selected for each pollutant based on what is most commonly reported in
epidemiologic studies, which is similar to the interquartile range of each pollutant measure (PM 2.5: 10
µg/m³, PM 10: 10 µg/m³, NO X: 50 ppb, NO 2: 20 ppb) [103]. Hazard ratios and corresponding 95%
confidence intervals were reported for each pollutant for all subjects combined. Additionally, we
present stratified results and tests of heterogeneity by sex, race/ethnicity, smoking status, moving
status (whether participants changed address over the study period), BMI, and nSES at baseline.
Heterogeneity p-values were generated using product terms between pollutant levels and each
stratification variable, while allowing for separate baseline hazards for each stratum level.
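Because hazard ratios are reported per fixed unit change, it can help to see how an estimate is moved from one unit to another. The sketch below assumes a log-linear exposure term, as in a Cox model with a continuous pollutant variable; the function name and the example numbers are hypothetical and are not results from this study.

```python
import math


def rescale_hazard_ratio(hr, from_units, to_units):
    """Convert a hazard ratio reported per `from_units` to one per `to_units`.

    Assumes the log-hazard is linear in the exposure, so the coefficient
    per single unit is log(hr) / from_units.
    """
    beta_per_unit = math.log(hr) / from_units
    return math.exp(beta_per_unit * to_units)


# Hypothetical example: an HR of 1.10 per 5 units, re-expressed per 10 units.
hr_per_10 = rescale_hazard_ratio(1.10, from_units=5, to_units=10)
```

Under this assumption, doubling the unit squares the hazard ratio, which is why estimates from studies using different unit changes should be rescaled before they are compared.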
Results
This analysis was based on 100,527 CA MEC participants totaling 1,660,488 person-years, and an
average follow-up time of over 16 years. Most of the participants were Latinos (42.4%), followed by
African Americans (31.9%), whites (13.6%), and Japanese Americans (12.1%) (Table 3). African-American
and Japanese American participants tended to be older. Latino and African-American participants had
the highest prevalence of diabetes and obesity. We observed temporal patterns in air pollutant levels,
with the highest levels of PM 10, NO 2, and NO X occurring in the mid-1990s and PM 2.5 in the mid-2000s
(Appendix P). Across groups, Latinos had the highest cumulative average exposure levels for PM 2.5, PM 10,
and NO 2.
PM 2.5 exposure was associated with risk of pancreatic cancer (HR = 1.61; 95% CI, 1.09, 2.37, per 10 µg/m³; Figure 4). We observed some differences in effect sizes across stratification variables but none
were significantly heterogeneous. For example, in analyses by race/ethnicity, the association was
strongest among Latinos (HR = 3.59; 95% CI, 1.60, 8.06). Risk estimates were similar in men and women
although the association was statistically significant in women only (HR = 1.67; 95% CI, 1.01, 2.74). The
association with PM 2.5 was somewhat stronger in ever smokers (HR = 1.76; 95% CI, 1.05, 2.94),
participants who moved during the study period (HR = 1.80; 95% CI, 1.12, 2.89), and in the middle BMI
category (BMI 25-29 kg/m²: HR = 2.15; 95% CI, 1.15, 4.04).
The results were similar in a fully adjusted model which also considered work history, AHEI
2010, alcohol consumption, and in the models with the first 5 years of follow-up time removed from
analysis (results not shown). There was a slight attenuation of the association with PM 2.5 when treating age at
cohort entry as a strata variable in place of conventional adjustment as a covariate (HR = 1.44; 95% CI,
0.95, 2.17).
We did not find an association between PM 10 and pancreatic cancer risk (HR = 1.12; 95% CI,
0.94, 1.32, per 10 µg/m³; Figure 4) in men and women combined. However, there was an elevated risk
among Latinos (HR = 1.45; 95% CI, 1.00, 2.09, race/ethnicity p-heterogeneity = 0.48). Risk of pancreatic
cancer was not significantly associated with exposure to NO X and NO 2 (NO X HR = 1.14; 95% CI, 0.88, 1.48,
per 50 ppb; NO 2 HR = 1.14; 95% CI, 0.85, 1.54, per 20 ppb; Figure 5), but there was a suggested
association between risk and NO X exposure among Latinos (HR = 1.72; 95% CI, 1.00, 2.94).
Discussion
This is the first prospective study to examine the association between ambient air pollutants and
pancreatic cancer risk in a multiethnic population using time varying exposures. Our most notable
finding was the significant association between PM 2.5 and pancreatic cancer risk. There was no
significant association between pancreatic cancer and PM 10 or gaseous pollutants. Although there was
no significant heterogeneity in subgroup analyses, a consistent pattern emerged, showing stronger risk
associations among Latinos, the largest ethnic subgroup in this study.
Our finding for PM 2.5 adds to the accumulating body of evidence supporting PM as a risk factor
for pancreatic cancer. The association between PM 2.5 and pancreatic cancer mortality was investigated
in three prior studies, with effect estimates ranging from 0.95, per 10 µg/m³, in the Cancer Prevention
Study II (CPSII) [64], to 1.16 using national mortality data from China [67], and 1.09 in the US using National
Health Interview Survey data [66]. “Near-source” associations from the CPSII, in which pollutant levels
were estimated using only land-use regression, were more similar to those seen in our current study and
in prior studies (HR = 1.44; 95% CI: 1.00, 2.15, per 10 µg/m³) [64].
Although kriging is a commonly used method to estimate air pollution exposure levels, prior
studies have used different methods of pollutant estimation making it more difficult to compare results.
Turner et al. used a modified land use regression (LUR) model that incorporates Bayesian interpolation
to estimate national PM 2.5 measures [64]. This LUR model, which uses roadways and greenspace
surrounding air monitoring stations, can perform well if the model is built using monitors in regions with
a variety of exposure levels and land usages. Because of this requirement a national-level model may
underperform in certain conditions [108, 109]. The Bayesian interpolation component of the model adds
a level of robustness to PM 2.5 estimation; however, a high density of monitoring stations is needed to
produce the best measures. Wang et al. estimated district-level pollutant concentrations using satellite-based
estimates, ground measurements, and chemical transport simulations [67]. This is a more recently
developed method of air pollutant estimation and is limited in distinguishing between pollutant types
[108]. Satellite-based estimates are more commonly used for large or remote regions and were likely used in this
study because of the large scale and remote regions of China. In contrast to both LUR and satellite-based
estimates, kriging relies only on measured values to estimate concentrations at participants’ addresses
and does not incorporate factors such as land use or geography. The exclusion of these factors may
harm pollutant concentration estimation in cases where monitoring stations are sparse. However,
the U.S. EPA Air Quality System has a high density of monitoring stations in Southern
California, where most MEC participants in this study resided.
Differences in air pollutant levels across these studies in addition to air pollutant estimation
methods may explain some variation in results. The highest PM 2.5 levels reported in some regions in
China by Wang et al. were over 10 times the average concentrations we observed [67]. Similarly, in
both the Cancer Prevention Study II and the National Health Interview Survey [64, 66], estimated PM 2.5
concentrations were, on average, slightly lower than what we observed in the MEC, largely in the Los
Angeles area. Less variation or lower concentrations of PM 2.5 may prevent some studies from identifying
an association. In addition to variation in pollutant levels, the chemical composition of PM 2.5 is known to
vary globally and within the United States [110]. Since both prior US-based studies had follow-up
dates similar to ours, the differing associations may be due to regional differences in PM 2.5 composition.
In this study, we observed variation in pollutant associations by race/ethnicity. Most notably, the
associations with PM 10 and PM 2.5 were strongest among Latinos. It is possible that the higher
concentrations of these exposures among Latinos (Appendix P) result from a greater proportion of
Latinos living proximate to major roads [104]. Although additional adjustment for occupation did not
attenuate results, residual confounding by occupation may positively bias the association within this
racial/ethnic group.
It is unclear how PM 2.5 affects risk of pancreatic cancer, as most cancer research on air pollutants
has focused on lung cancer. Since PM 2.5 is defined by particle size, its composition can vary.
The major components of PM 2.5 that are likely most relevant to cancer etiology are organic compounds,
metals, and polycyclic aromatic hydrocarbons [110, 111]. These compounds may affect cancer risk
through increasing oxidative stress and inflammation, as observed in the airways [112], and formation of
DNA adducts [113]. In addition to these commonly discussed hypotheses, there are two mechanisms
more specific to the pancreas. Firstly, air pollution may increase risk of pancreatic cancer through heavy
metal accumulation in the pancreas. Metals from tobacco smoke are known to accumulate in the body
through inhalation and absorption through the lungs or from swallowing during mucociliary clearance.
Smokers and non-smokers with pancreatitis and pancreatic cancer have been shown to have elevated
concentrations of heavy metals in the pancreas [114, 115], likely due to smoking and other
environmental or occupational exposures. These findings support the stronger association we observe
among smokers in the PM 2.5 analysis which may be due to the additive effect of smoking on metal
accumulation and exposure to PM 2.5-bound PAHs. Although the bioaccumulation of metals from air
pollutants may increase the risk of pancreatic cancer, this has only been studied in the context of
smoking, occupational exposures, and animal studies [115, 116].
Second, air pollution may increase the risk of diabetes mellitus, which may in turn increase risk
of pancreatic cancer; however, it is still unclear to what degree air pollutants affect diabetes [117]. One
meta-analysis estimates a 10% increase in risk (HR = 1.10, 95% CI: 1.02, 1.18) of incident type 2 diabetes
per 10 µg/m³ increase in concentration of PM 2.5 [118].
There are several strengths to this study. First, since the MEC is composed of a racially/ethnically
diverse population, we were able to measure the association between air pollution and pancreatic
cancer in the multiethnic population and stratified by racial/ethnic group. Second, multiple prior studies
did not have participant address information for estimation of air pollutants and, as a result, used
region-level pollutants for each participant [66, 67]. In this study we used participants’ geocoded
addresses across their residential histories to better capture exposure variation and allow for a more
precise estimate of each participant’s exposure levels. No prior study included updated residential
addresses [62, 64, 66]. Finally, we were able to include a comprehensive list of potential confounders in
our analyses.
There are also some limitations to this study. First, we lacked exposure information for
participants prior to enrollment, and at each participant’s workplace. In a large National Human Activity
Pattern Survey study, it was estimated that Californians spent less than 70% of their time at home [119]
and that around 6% of their day was spent in an enclosed vehicle [119], which may confer pollutant
exposure levels that vary by vehicle type [120]. Residual confounding is possible as method of
commuting may be associated with air pollution levels at residence and pancreatic cancer; however, it is
not possible to identify the degree and direction of bias without information on commuting and work-
related pollution exposures [121]. Second, although pollutant estimates were updated monthly,
covariate information was based on baseline measures, and changes in covariate levels over time were not
accounted for in analysis.
In conclusion, using kriging estimates of time-varying air pollutant exposure, we identified an
association between pancreatic cancer and PM 2.5 levels in a multiethnic cohort of Southern California
residents. This association was strongest among Latinos, which may be due to increased exposure
levels. Compared to previous studies showing a significant association with PM 2.5, we observed a larger
risk estimate. Due to the variation in published results, additional studies using prospective cohorts with
time-varying measures of pollutant levels and comprehensive confounder adjustment are needed to
confirm the association of particulate matter with pancreatic cancer risk.
Tables
Table 3: Air Pollution Analysis Baseline Characteristics of Study Participants by Race/Ethnicity.
Combined (n = 100,527); Latinos (n = 42,647); African American (n = 32,038); Whites (n = 13,663); Japanese American (n = 12,179)
n % n % n % n % n %
Cases 821 0.8 287 0.7 315 1 118 0.9 101 0.8
Age at Entry
<50 12,058 12 4,861 11.4 4,329 13.5 1,407 10.3 1,461 12
50-54 13,042 13 5,851 13.7 4,232 13.2 1,467 10.7 1,492 12.2
55-59 18,158 18.1 9,478 22.2 4,581 14.3 2,424 17.8 1,675 13.8
60-64 19,277 19.2 9,981 23.4 4,430 13.8 2,757 20.2 2,109 17.3
65-69 19,044 18.9 7,076 16.6 6,852 21.4 2,763 20.2 2,353 19.3
>70 18,948 18.8 5,400 12.7 7,614 23.8 2,845 20.8 3,089 25.4
Sex
Male 42,828 42.6 20,495 48.1 11,519 36 4,872 35.7 5,942 48.8
Female 57,699 57.4 22,152 51.9 20,519 64 8,791 64.3 6,237 51.2
Diabetes Mellitus
No 86,223 85.8 35,830 84 26,892 83.9 12,530 91.7 10,971 90.1
Yes 14,304 14.2 6,817 16 5,146 16.1 1,133 8.3 1,208 9.9
BMI (kg/m²)
<25 33,849 33.7 11,836 27.8 8,337 26 5,793 42.4 7,883 64.7
25-29 41,113 40.9 19,765 46.3 12,599 39.3 5,079 37.2 3,670 30.1
>30 24,068 23.9 10,672 25 10,031 31.3 2,751 20.1 614 5.1
Missing 1,497 1.5 374 0.9 1,071 3.4 40 0.3 12 0.1
Smoking Status
Never 43,443 43.2 20,195 47.4 11,878 37.1 5,570 40.8 5,800 47.6
Quit ≥ 20 pack-years 7,439 7.4 1,958 4.6 2,353 7.3 1,825 13.4 1,303 10.7
Quit < 20 pack-years 28,075 27.9 11,913 27.9 9,151 28.6 3,581 26.2 3,430 28.2
Current <20 pack-years 10,208 10.1 4,041 9.5 4,570 14.3 887 6.5 710 5.8
Current ≥ 20 pack-years 5,987 6 1,556 3.6 2,388 7.5 1,383 10.1 660 5.4
Missing 2,276 2.3 1,493 3.5 497 1.5 174 1.3 112 0.9
Current Unknown Pack-year 419 0.4 152 0.4 223 0.7 32 0.2 12 0.1
Quit Unknown Pack-year 2,680 2.7 1,339 3.1 978 3 211 1.5 152 1.3
Birth Year
1918-1922 14,232 14.2 4,035 9.5 5,775 18 2,306 16.9 2,116 17.4
1923-1927 19,500 19.4 6,619 15.5 7,570 23.6 2,774 20.3 2,537 20.8
1928-1932 19,105 19 9,160 21.5 5,122 16 2,605 19.1 2,218 18.2
1933-1937 17,214 17.1 8,562 20.1 4,605 14.4 2,258 16.5 1,789 14.7
1938-1942 16,599 16.5 8,679 20.3 4,320 13.5 2,113 15.5 1,487 12.2
1943-1948 13,877 13.8 5,592 13.1 4,646 14.5 1,607 11.7 2,032 16.7
Number of Moves During Follow-up
1 59,636 59.3 23,475 55 19,567 61.1 8,056 59 8,538 70.1
2 20,706 20.6 9,181 21.5 6,106 19 3,078 22.5 2,341 19.2
3 10,985 10.9 5,235 12.3 3,388 10.6 1,503 11 859 7.1
4 5,193 5.2 2,640 6.2 1,656 5.2 607 4.4 290 2.4
5 or more 4,007 4 2,116 5 1,321 4.1 419 3.1 151 1.2
Figures
Figure 4: Association between Particulate Matter and Pancreatic Cancer.
Abbreviations: Afr Amr: African American; BMI: Body mass index; Japanese Amr: Japanese American;
HR: Hazard ratio and corresponding 95% confidence interval; nSES: Neighborhood socioeconomic status;
p: p-value for test for hazard ratio=1; p-het: p-value for test of heterogeneity of stratification variables.
Note: Cases in stratified analyses may not sum to total due to removal of missing category in
heterogeneity tests.
Figure 5: Association between Nitrogen Dioxide, Nitrogen Oxides, and Pancreatic Cancer.
Abbreviations: Afr Amr: African American; BMI: Body mass index; Japanese Amr: Japanese American;
HR: Hazard ratio and corresponding 95% confidence interval; nSES: Neighborhood socioeconomic status;
p: p-value for test for hazard ratio=1; p-het: p-value for test of heterogeneity of stratification variables.
Note: Cases in stratified analyses may not sum to total due to removal of missing category in
heterogeneity tests.
Chapter 4: Excess Risk due to Smoking and Effects of Quitting on Pancreatic
Cancer Incidence in the Multiethnic Cohort Study
Abstract
Background: Smoking is an established risk factor for pancreatic cancer; however, there are limited
results on how the cumulative effect of pack-years and years-quit differs by race/ethnicity. Additionally,
there are no published data displaying how the risk trajectory of pancreatic cancer over a lifespan is
modified by exposure of cumulative pack-years and years-quit.
Methods: To better understand the association of smoking with pancreatic cancer by
race/ethnicity and risk trajectory, we estimated the effect of smoking in an excess relative risk model using
183,661 men and women in the MEC Study, residing in Southern California and Hawaii and followed
between 1993 and 2014. We estimated incidence of pancreatic cancer as a function of race/ethnicity, and
time-varying age, pack-years smoked, and years-quit. Modeled this way, pack-years contributes excess risk beyond
the effect of age, and years-quit is a modifier of the pack-years effect on risk of pancreatic cancer. We
tested for heterogeneous effect of pack-years and years-quit on excess risk by race/ethnicity and
predicted incidence of pancreatic cancer over a lifespan using different hypothetical smoking histories.
Results: Over an average of 17.5 years of follow-up there were 1,588 incident cases of pancreatic
cancer and 3,226,219 person-years accumulated. We found no difference in the effects of pack-years
(p=0.38) or years-quit (p=0.83) by race/ethnicity. In a simplified model with a single effect of pack-years
and years-quit for all ethnic/racial groups, 50 pack-years smoked was associated with 89% increased risk
(HR=1.89 (95% CI: 1.53, 2.24)) relative to a never smoker. Among smokers with 50 pack-years, every year
quit corresponded to 8% decreased risk in the association between smoking and pancreatic cancer (0.92
times the excess risk of smoking (95% CI: 0.89, 0.98)). The predicted risk trajectories over a lifespan show
that smoking cessation has the greatest effect when quitting occurs before 60 years of age, before the
incidence of pancreatic cancer increases most rapidly as a function of age.
Conclusions: The associations of pack-years and years-quit with risk of pancreatic cancer do not
differ by race/ethnicity; however, in this multiethnic population, pack-years is significantly associated with
an increase in risk. This association with smoking is reduced by 8% for each year quit.
Introduction
In 2020, there were an estimated 57,000 pancreatic cancer cases and 47,000 deaths in the
United States [1]. Although pancreatic cancer is the 10th most common malignancy, it is the fourth
leading cause of cancer death [1] and is projected to be the second by 2040 [122]. Because there is no routine
form of screening, diagnosis at a late stage is common, making successful treatment difficult and 5-year
survival low, at only 10.5% [2, 3].
The burden of this disease differs across ethnic and racial groups, with the highest incidence
among African Americans, 1.4 times that of non-Hispanic whites, followed by Japanese Americans, non-
Hispanic whites, Hispanics, and Native-Hawaiians [87]. Established risk factors for pancreatic cancer
include BMI [6, 38], type 2 diabetes [38, 54], family history [38], pancreatitis [123], and common genetic
variants [100]; however, smoking is one of the strongest and most prevalent risk factors, conferring a 70-80%
increase in risk relative to never smokers [33, 36, 38, 124]. Smoking is also an important risk factor as it
is modifiable, and the associated risk can be largely attenuated by quitting [33, 38]. The prevalence of smoking
varies by race/ethnicity. According to the National Health Interview Survey, the prevalence of current
smokers is highest among American Indian males (22.8%), followed by American Indian females (18.7%),
Black/African American males (19.0%), white males (16.4%) and Asian males (11.5%) [38, 125]. Although
the prevalence of smoking is known to differ by race/ethnicity, there is little research showing whether the
associations of smoking and smoking cessation with risk of pancreatic cancer differ by race/ethnicity.
Variation in these associations by race/ethnicity would be expected given known differences in
metabolism of tobacco, type of cigarettes smoked (menthol/nonmenthol), blood metal concentrations
among smokers, and smoking duration prior to cessation, among ethnic groups [126-128].
In this study, we employ an excess relative risk (ERR) model to estimate the excess risk of
smoking, modified by years-quit, on risk of pancreatic cancer by race/ethnicity in the Multiethnic Cohort
Study.
Materials and Methods
Study Population
To measure the effects of pack-years and years-quit we utilized data from the Multiethnic
Cohort Study. The MEC is a population-based prospective cohort study, composed of over 215,000 men
and women living in California (mainly Los Angeles County) or Hawaii at enrollment. Participants were
identified using records from the Department of Motor Vehicles, Health Care Financing Administration
files, and voter registration lists. Detailed information on enrollment, data collection, and follow-up have
been previously described [7]. Briefly, participants were enrolled from 1993 to 1996 and were between
45 and 75 years of age, and self-identified into one of five major racial/ethnic groups (African American,
Japanese American, Latino, Native Hawaiian and white). Information on baseline characteristics, such as
demographics, diet, anthropometric measures, lifestyle factors, and informed consent were collected
via a self-administered questionnaire sent by mail.
This analysis was restricted to 183,661 participants, who self-reported into one of the five major
ethnic/racial groups, had valid/non-missing baseline responses for age and sex, had no history of
pancreatic cancer, had complete responses for smoking history (ever, never, current), and among
current or past smokers, had completed responses on number of years smoked, average number of
cigarettes smoked per day, and years since quit.
Smoking assessment: Smoking was assessed by self-report on both a baseline and follow-up
questionnaire sent to MEC participants. Smoking status (ever/never) was assessed using the question,
“Have you ever smoked a total of 20 or more packs of cigarettes in your lifetime?”. Pack-years was
calculated using self-reported cigarettes smoked per day and years smoked, which were asked using the
following questions: “What is the total number of years smoked?” and “What is the average number of
cigarettes that you smoked per day?”. Years-quit was asked using the question, “If you quit smoking,
how long ago did you quit?”. A follow-up survey, distributed 10 years after baseline, with a response
rate of 50% [129], asked the same questions and provided updated information on quitting and pack-years.
The later survey responses were also used to predict pack-years among current smokers at
last contact.
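The pack-years calculation from the two questionnaire items can be written as a short sketch. It uses the conventional definition of one pack as 20 cigarettes, which matches the CPD/20 × duration expression used later in this chapter; the function name and example values are illustrative only.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Pack-years = packs smoked per day (20 cigarettes per pack) x years smoked."""
    return cigarettes_per_day / 20.0 * years_smoked


# Hypothetical respondents.
one_pack_for_50_years = pack_years(cigarettes_per_day=20, years_smoked=50)
half_pack_for_30_years = pack_years(cigarettes_per_day=10, years_smoked=30)
```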
Case Ascertainment: Incident pancreatic cancer cases were identified through annual linkage to
the California and Hawaii State Cancer Registry system, which is part of the National Cancer Institute’s
Surveillance, Epidemiology, and End Results Program. The National Death Index and state death
certificate files were used to obtain additional information on vital status and cause of death. Cases
were defined using ICD-O-3 site codes C25.0-C25.9. There were 1,588 pancreatic cancer cases identified
between study enrollment and time of censorship, December 31, 2014. This study was approved by the
University of Hawaii and the University of Southern California institutional review boards.
Statistical Analysis
Excess Relative Risk (ERR) Model: The cumulative effect of smoking and smoking cessation on
risk of pancreatic cancer was estimated using an ERR model. This model has previously been
implemented in the MEC to measure the excess risk of smoking variables in association with incident
lung cancer; the methods we present here are explained in greater detail in this initial publication [130].
The ERR model estimates the incidence rate of pancreatic cancer, as a function of age, race/ethnicity,
pack-years smoked, and years-quit smoking, as follows: We model the hazard of pancreatic cancer at
age t as
\[ h(t) = \mathrm{baseline}(t;\ \text{race/ethnicity}) \, \bigl( 1 + \mathrm{ERR}(\mathrm{pkyrs}(t),\ \mathrm{years\_quit}(t)) \bigr) \]
In this model, the baseline is a model for the hazard (i.e., the age specific incidence rate) among
never smokers from racial/ethnic group, j. The baseline model is assumed to take the loglinear form
$\mathrm{baseline}(t) = \exp(a_j + b \log(t))$, as a function of age, t, and ethnicity, j. Note that this model is a
simple polynomial $\exp(a_j)\, t^{b}$, as in the classic Armitage-Doll model. Here we usually express age in units
of 70 years so that $\exp(a_j)$ is the estimate of risk for a 70-year-old non-smoker. The excess relative risk
(ERR) term in the model takes the form
\[ \mathrm{ERR}(t) = c \, \mathrm{pkyrs}(t) \, \exp\bigl( d \, \mathrm{years\_quit}(t) \bigr) \]
Here, years-quit is allowed to modify the linear effect of pack-years on the excess relative risk
term. The log-linear part of the ERR term is termed a dose modifier in that it has no meaning if pack-
years is zero. We can enrich this modification term if necessary, by adding other variables. For example,
to test if pack-years is the right metric we can add smoking intensity (in cigarettes/day, CPD) to the
modifying term
\[ \mathrm{ERR}(t) = c \, \mathrm{pkyrs}(t) \, \exp\bigl( e \log(\mathrm{CPD}) + d \, \mathrm{years\_quit}(t) \bigr) \]
so that the effect of pack-years, which is equal to CPD/20 × smoking duration, is replaced by
$\mathrm{CPD}^{1+e}/20 \times \mathrm{duration}(t)$. If the estimate of the newly added parameter, e, is less than zero, this
indicates that the effect of CPD is sublinear; if positive, superlinear. Similarly, the log of duration(t)
can be added to the dose modifying term to check if the cumulative effect of packyears is indeed linear
in duration.
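The structure of the ERR model can be illustrated with a minimal sketch. The parameter values below are illustrative only: c and d are back-calculated from the combined-model summary in the abstract (an excess relative risk of 0.89 at 50 pack-years, and a multiplier of 0.92 per year quit), while the baseline parameters a_j and b are purely hypothetical placeholders. The function names are likewise assumptions, and the code is not the Epicure implementation used for the analysis.

```python
import math


def excess_relative_risk(pack_years, years_quit, c, d):
    """ERR(t) = c * pkyrs(t) * exp(d * years_quit(t))."""
    return c * pack_years * math.exp(d * years_quit)


def hazard(age, pack_years, years_quit, a_j, b, c, d, reference_age=70.0):
    """h(t) = baseline(t) * (1 + ERR), with age expressed in units of 70 years."""
    baseline = math.exp(a_j + b * math.log(age / reference_age))
    return baseline * (1.0 + excess_relative_risk(pack_years, years_quit, c, d))


# Illustrative parameters (see the assumptions stated above).
c = 0.89 / 50.0           # excess relative risk per pack-year
d = math.log(0.92)        # log of the per-year-quit multiplier on the ERR
a_j = math.log(30e-5)     # hypothetical baseline log-rate at age 70
b = 5.0                   # hypothetical power of age in the baseline

never = hazard(70, 0, 0, a_j, b, c, d)
rr_current = hazard(70, 50, 0, a_j, b, c, d) / never    # current smoker, 50 pack-years
rr_quit_10 = hazard(70, 50, 10, a_j, b, c, d) / never   # same history, quit 10 years ago
```

Because the baseline term cancels in the ratio, the relative risk compared with a never smoker of the same age and ethnicity is simply 1 + ERR, which shrinks toward 1 as years-quit increases when d is negative.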
To estimate the parameters (a-d) in the model we develop a finely stratified person-years table,
stratified on ethnicity and small increments of age, as well as the age dependent variables pack-years
and years-quit. Then, exploiting the link (Holford; Laird and Olivier) between the likelihood for a piecewise
exponential model and Poisson regression for survival analysis [131], we created the tables and fit the
models above by maximum likelihood to the table counts using the software, Epicure [132].
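The link between a person-years table and Poisson regression can be sketched as follows. Each cell of the table contributes a Poisson log-likelihood term with mean equal to person-years times the modeled rate. The cell layout, function name, and toy counts are hypothetical; the actual fitting was done in Epicure, not with this code.

```python
import math


def poisson_loglik(cells, rate_fn):
    """Poisson log-likelihood for a stratified person-years table.

    cells   : iterable of (cases, person_years, covariates) tuples
    rate_fn : function mapping covariates to a modeled incidence rate (> 0)
    The constant log(cases!) term is omitted because it does not depend on
    the model parameters.
    """
    total = 0.0
    for cases, person_years, covariates in cells:
        expected = person_years * rate_fn(covariates)
        total += cases * math.log(expected) - expected
    return total


# Toy table with two cells and a constant-rate model (no covariates).
toy_cells = [(3, 1000.0, None), (7, 3000.0, None)]
mle_rate = sum(c for c, _, _ in toy_cells) / sum(py for _, py, _ in toy_cells)

loglik_at_mle = poisson_loglik(toy_cells, lambda _: mle_rate)
loglik_lower = poisson_loglik(toy_cells, lambda _: 0.8 * mle_rate)
loglik_higher = poisson_loglik(toy_cells, lambda _: 1.2 * mle_rate)
```

For the constant-rate model the maximum is at total cases divided by total person-years; in the ERR setting, rate_fn would instead evaluate the baseline and ERR terms from each cell's age, ethnicity, pack-years, and years-quit, and the parameters would be found by numerical maximization.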
Estimating Pack-Years: For ex- (and never-) smokers at the baseline questionnaire, pack-years is
treated as a fixed (not time-dependent) variable in the fitting. For current smokers, however, pack-years
continue to accumulate. Assuming that participants who smoked at baseline would continue to smoke until
censorship is likely invalid, so to estimate pack-years and years-quit among current smokers, data
from 11,630 baseline smokers who later returned valid follow-up questionnaire data were used to build a
model of quitting behavior. This model was then used to estimate smoking exposure beyond that last
assessed in the follow-up questionnaire. The details of this model were presented in Supplementary
Material to a previous MEC publication of excess risk of lung cancer among smokers [130]. Briefly,
pack-years accumulated among current smokers beyond last questionnaire was estimated by fitting a
survival model of quitting behavior, with the “hazard” of quitting being a linear function of
race/ethnicity, sex, age at entry, time on study, and cigarettes smoked per day among ex-smokers. Using
the hazard function from this model, in conjunction with participant characteristics used in prediction of
quitting behavior, we estimated pack-years and years-quit for participants who self-reported as smokers
at last questionnaire.
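The idea of projecting pack-years beyond the last questionnaire can be illustrated with a deliberately simplified sketch. It assumes a constant annual quitting hazard for one participant, which is a simplification and not the covariate-dependent quitting model described above; the function name and inputs are hypothetical.

```python
import math


def expected_additional_pack_years(cigarettes_per_day, quit_hazard_per_year, years_ahead):
    """Expected pack-years accrued over `years_ahead` under a constant quit hazard.

    With a constant hazard, the probability of still smoking after s years is
    exp(-hazard * s), so the expected number of smoking years over the window
    is the integral of that survival curve from 0 to years_ahead.
    """
    if quit_hazard_per_year == 0:
        expected_smoking_years = float(years_ahead)
    else:
        expected_smoking_years = (
            1.0 - math.exp(-quit_hazard_per_year * years_ahead)
        ) / quit_hazard_per_year
    return cigarettes_per_day / 20.0 * expected_smoking_years


# Hypothetical current smoker: one pack per day, 10 years since last questionnaire.
no_quitting = expected_additional_pack_years(20, 0.0, 10)
with_quitting = expected_additional_pack_years(20, 0.1, 10)
```

In this simplified setting, any positive quitting hazard yields fewer projected pack-years than the assumption that every baseline smoker continues to smoke until censoring.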
ERR Model Fitting: When fitting the excess risk model to predict risk of pancreatic cancer, the
parameters (a-d) in the baseline term and ERR terms were estimated simultaneously using all
participants. Parameter estimates of the fit model are presented in Appendix Q. Due to significant
differences in baseline risk between ethnic groups, we allowed for differing intercepts and age effects
for each group, producing a more complex baseline model that better represents our data and disease
characteristics. Beyond the main analysis, we also conducted a series of sensitivity analyses. We first
tested if a modifying effect of smoking intensity (measured in cigarettes per day) on the effect of
smoking duration (pack-years) differs by race/ethnicity. This was done by including a log(CPD) × ethnicity
product term in the log-linear sub-term and testing for difference in effects using a likelihood ratio test.
Second, we assessed the importance of accounting for sex and BMI, when estimating the effects of
smoking. This was done by including these terms in the baseline term of the ERR model. In this analysis,
we used BMI as a proxy for type 2 diabetes, a variable which will later be included in analysis. Two-sided
p-values were calculated using a likelihood ratio test (LRT). P-values less than 0.05 were considered statistically significant.
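A likelihood ratio test compares the maximized log-likelihoods of two nested models. The sketch below is a generic, standard-library illustration; the function names are assumptions, and the degrees of freedom must be supplied as the number of extra parameters in the fuller model (for example, ethnicity-specific effects versus a single common effect).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(half))   # df = 1
    else:
        a = 1.0
        q = math.exp(-half)              # df = 2
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (statistic, p-value) for nested models differing by df parameters."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods for a reduced and a fuller model (4 extra parameters).
statistic, p_value = likelihood_ratio_test(-1050.0, -1047.5, df=4)
```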
Results
The final analysis included 183,661 participants, with an average follow-up time of 17.5 years,
representing 3,226,319 person-years. Over the study duration there were 1,588 cases of incident
pancreatic cancer, with most cases among Japanese Americans (n=558), followed by whites (n=315),
African-Americans (n=295), Latinos (n=285), and Native Hawaiians (n=135) (Table 4). The greatest
proportion of never smokers was among Japanese Americans (50.9%) and Latinos (50.7%). Whites and
Native Hawaiians had the largest proportions of smokers in the highest cigarettes-per-day category (21.9% and 16.6%, respectively).
Whites had the greatest proportion of former smokers (43.8%), while all other ethnic groups had a similar
proportion of former smokers (35%-38.6%). Among these former smokers, whites had the largest proportion
with more than 15 years quit (22.4%). All other ethnic groups had a similar distribution, with 15% to 18.4% having
quit for more than 15 years. The incidence rates, standardized to the 2000 U.S. Standard Population, showed
highest incidence among Native Hawaiians (54.9 per 100,000), followed by African Americans (41.3),
Japanese Americans (38.3), Latinos (29.5), and whites (29.1) (Table 5).
In our primary analysis in which we tested for differing effects of smoking on risk of pancreatic
cancer by race/ethnicity, there was no statistically significant difference in the association of pack-years
with risk of pancreatic cancer among ethnic/racial groups (p LRT = 0.35) (Appendix Q). Additionally, we
found no significant difference between ethnic/racial groups in the modifying effect that quitting has on the
primary effect of pack-years (p LRT = 0.83) or in the sensitivity analysis of cigarettes smoked per day (p
LRT = 0.96). Although there was no statistically significant difference in estimates, the effect of smoking was significant only
among Japanese-Americans, who had the largest effect size (ERR = 1.42; 95% CI: 0.68, 2.17,
for current smokers with 50 pack-years smoked, relative to never smokers), followed by African
Americans (ERR = 1.19), Native Hawaiians (0.63), whites (0.54), and Latinos (0.63) (Appendix Q). The
effect of quitting was also only significant among Japanese-Americans, with a modifying effect of 0.94
(95% CI: 0.89, 0.99) on the smoking ERR, per year quit. Japanese-Americans had the smallest effect size
of quitting, followed by whites (0.92), Latinos (0.88), African Americans (0.88), and Native Hawaiians
(0.86). Due to the homogeneous effects of pack-years and years-quit across ethnic/racial groups, we
simplified the excess risk model and fit a single model in a combined multiethnic sample (Table 6).
In this simplified model, 50 pack-years smoked is associated with an 89% increase in the risk of
pancreatic cancer relative to a never smoker (ERR = 0.89; 95% CI: 0.53, 1.24). Given the model
parameters, we can also see that, for a smoker who has accumulated 50 pack-years, every
year quit multiplies the primary effect of smoking on risk of pancreatic
cancer by 0.92 (0.87, 0.98), i.e., a reduction in the excess risk due to smoking of 8 percent per year. The relative risk of a
smoker, compared to a never smoker, across a variety of smoking histories is shown in Figure 6. We can
see the effect of years quit on an ex-smoker’s relative risk is greatest among those with the highest
cumulative exposure. As an alternative representation of the model results, we can visualize the
absolute risk of pancreatic cancer across a lifespan and a variety of possible smoking histories (Figure 7).
We see in Figure 7 that risk among never smokers in the combined multiethnic sample is a very steeply
increasing function of age, with risks proportional to age raised to the 5th power (Table 6). Never
smokers are estimated to reach risks of over 41.7 cases per 100,000 person-years by age 70; current
smokers of 1 pack/day are at 89 percent higher risk (Table 6). Again, quitting by age 50 greatly reduces
the excess risk.
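The risk trajectories described above can be approximated directly from the combined-model coefficients. The sketch below assumes the fitted model has the form rate = exp(constant + ethnicity term + b × log(age/70)) × [1 + c × (pack-years/50) × exp(d × years-quit)]; this form is inferred from the layout of the parameter table (log-linear baseline term, linear pack-years term, log-linear years-quit modifier) and is an assumption here, as are the function and variable names.

```python
import math

# Estimates from the combined multiethnic ERR model (African Americans as reference).
CONSTANT = -7.684
ETHNICITY = {"AA": 0.0, "NH": 0.2851, "LA": -0.2869, "JA": 0.01222, "W": -0.2893}
LOGAGE70 = 5.011      # coefficient on log(age / 70)
PY50 = 0.8859         # excess relative risk per 50 pack-years
YEARSQUIT = -0.0798   # log-scale modifier of the ERR per year since quitting


def excess_relative_risk(pack_years, years_quit=0.0):
    """ERR of smoking relative to a never smoker of the same age and group."""
    return PY50 * (pack_years / 50.0) * math.exp(YEARSQUIT * years_quit)


def rate_per_100k(age, group="AA", pack_years=0.0, years_quit=0.0):
    """Assumed model: baseline rate times (1 + ERR), per 100,000 person-years."""
    baseline = math.exp(CONSTANT + ETHNICITY[group] + LOGAGE70 * math.log(age / 70.0))
    return 1e5 * baseline * (1.0 + excess_relative_risk(pack_years, years_quit))


for group in ETHNICITY:
    never = rate_per_100k(70, group)
    current = rate_per_100k(70, group, pack_years=50)
    print(f"{group}: never smoker {never:.1f}, current smoker (50 pack-years) {current:.1f}")
```

Because age enters as log(age/70) with a coefficient near 5, halving age in this sketch reduces the baseline rate by a factor of roughly 2^5, which is the steep age dependence noted in the text.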
In the sensitivity analyses, sex was not a predictor of pancreatic cancer risk in the baseline term
(p=0.10). When BMI was included in the baseline term, it was significantly associated with risk of pancreatic
cancer (p<0.01); however, adjusting for this variable resulted in less than a 15% change in all smoking-related
estimates, except for the effect of quitting among whites (19.4% change), which was not a
significant term in the model (Appendix R). To assess the direction of bias in our estimates that may be
introduced by not including diabetes as a covariate, we measured the association between diabetes and
smoking, as reported in the baseline questionnaire. We found “ever smoking” to be associated with a 17%
decrease in the odds of diabetes, suggesting that our smoking effect estimates will be stronger in a
diabetes-adjusted model than what we currently observe.
Discussion
In this analysis, we found a significant association of smoking with risk of pancreatic cancer and
a significant effect of quitting on the excess risk of past smoking. In comparison to Cox models, which
are commonly used to measure the effect of quitting on risk of pancreatic cancer, the ERR model
estimates the excess risk of smoking beyond baseline risk and the effect of age [130, 131]. This fit restriction
ensures that the estimated risk among current and past smokers is not lower than that of never smokers; a lower risk
among smokers is a characteristic of this exposure we know to be false. An estimated lower risk of pancreatic
cancer among ex-smokers, relative to never smokers, is fairly common, occurring in 9 of the 40
prospective cohort studies included in a recent meta-analysis of smoking history and pancreatic cancer
[124].
Additionally, in contrast to Cox regression, which is semi-parametric and does not allow for
estimates of absolute risk, our excess risk model results have allowed us to present risk of pancreatic
cancer over the lifespan, given a variety of hypothetical exposure configurations, to better understand the benefit
of smoking cessation at different ages. As observed in our results, the risk of pancreatic cancer
among those who have accumulated 30 pack-years but have quit at 50 years old is not dramatically
higher than that of never smokers. This is likely because the incidence rate of pancreatic cancer drops
considerably even after just 10 years of smoking cessation, as we see in our results and in prior studies
[36]. This reduction in the effect of smoking, occurring before the steep age-related increase in risk, attenuates risk more
drastically than quitting at older ages.
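To make this attenuation concrete, the following worked calculation uses the combined-model estimates (0.8859 per 50 pack-years and -0.0798 per year quit). It assumes the excess relative risk takes the multiplicative form shown on the first line; the symbols p (pack-years), q (years quit), gamma, and delta are notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{ERR}(p, q) &= \gamma \,\frac{p}{50}\, e^{\delta q},
  \qquad \gamma = 0.8859,\quad \delta = -0.0798 \\
\mathrm{ERR}(30, 0)  &= 0.8859 \times 0.6 \approx 0.53 \\
\mathrm{ERR}(30, 10) &\approx 0.53 \times e^{-0.798} \approx 0.53 \times 0.45 \approx 0.24 \\
\mathrm{ERR}(30, 20) &\approx 0.53 \times e^{-1.596} \approx 0.53 \times 0.20 \approx 0.11 \\
q_{1/2} &= \frac{\ln 2}{0.0798} \approx 8.7 \text{ years}
\end{align*}
```

Under these assumptions, the excess risk halves roughly every 8.7 years after quitting, so a person with 30 pack-years who quit at age 50 would carry an excess of about 11% over a never smoker by age 70.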
Prior studies show a similar effect of pack-years to our multiethnic model. Analysis of the
Pancreatic Cancer Case-Control Consortium showed that over 40 years of smoking is associated with
a more than two-fold increased risk of pancreatic cancer (OR = 2.10 [1.58, 2.78]), and those with 15-20
years-quit had a hazard ratio of 1.12 (95% CI: 0.86, 1.44), which was further attenuated after 20-30
years-quit (HR = 0.98) [5]. Similar results were also observed in the EPIC cohort and Pancreatic Cancer
Cohort Consortium for years-smoked [33, 133]; however, these two studies showed a quicker
attenuation of the effect of smoking after cessation (5-15 years) until risk returned to that of a never smoker.
Only two studies have attempted to measure the cumulative effect of pack-years and modifying
effects of smoking on pancreatic cancer risk in a joint model [33, 134]. Schulte et al. used logistic
regression and mean-centered smoking variables with an interaction variable for ever/never smoking
status [134]. This process allows for the estimation of smoking effects relative to average smokers [135].
They found each year-quit to be associated with a 4% reduction in odds of pancreatic cancer, relative to
an average smoker [134]. This interpretation differs from ours, in that our effect of quitting represents a
modifying effect (coefficient) on the effect of pack-years. In our study, this is an 8% reduction in the
effect of smoking per year-quit, after 50 pack-years accumulated. More similar to our analysis, Lynch et
al. implemented an excess odds ratio model in Epicure using the Pancreatic Cancer Cohort Consortium
[33]. In this model, the modifiers of the pack-years effect included only cigarettes per day; the effect of
years-quit was not estimated.
Our finding of homogeneous effects of smoking on risk of pancreatic cancer across
multiple ethnic groups is also supported by a prior analysis in the MEC using Cox regression [38]. In prior
analyses, homogeneous smoking effects by race/ethnicity were also observed for breast cancer but not
for lung cancer [130, 136]. Due to the limited research on the tobacco-pancreatic cancer mechanism, it
is difficult to tell if heterogeneous effects should be expected. A likely exposure route of carcinogenic
compounds is, indirectly, through the blood stream or bile duct [18, 36]. A second possible route is
an indirect effect of smoking; smokers, given their reduced gastric volume, are more prone to bile salt
reflux, which may damage the main pancreatic duct and lead to cancer [41]. However, these mechanisms
are still likely affected by differences in cigarette types smoked, metabolism, and inhalation behavior
that are known to differ by race/ethnicity [126, 128, 130, 137].
An alternative etiologic hypothesis, pancreas metal accumulation from smoking [115, 138], may
be less affected by metabolic differences between ethnic and racial groups. However, if a true difference
in smoking effect by race/ethnicity exists, limited power in our analysis may have resulted in a type II
error, preventing us from detecting it. This explanation may be supported by the fact
that we observe over a 50% difference in multiple beta coefficients when comparing all ethnic groups
against African Americans, yet do not observe a statistically significant difference.
The main limitation of this study is the reliance on self-reported smoking and quitting behavior
among participants. Self-reported smoking measures, which were then used to calculate pack-years,
may not necessarily correspond to actual cigarette carcinogen exposure levels, as self-reported
cigarettes smoked is known to be associated with differing nicotine metabolites [139]. Lastly, our
models have not yet included diabetes, a known risk factor for pancreatic cancer. However, our
sensitivity analysis, using BMI as a proxy, shows little change in effect estimates and no loss of
significance among currently significant terms. Additionally, the 17% reduction in odds of diabetes
among smokers allows us to infer that our current estimates of the effect of smoking on risk of pancreatic cancer are
biased towards the null, and that future results from the fully adjusted model will be stronger.
In conclusion, we characterized the effect that smoking cessation has on risk of pancreatic cancer.
We determined that the effects of quitting and pack-years accumulated do not differ by race/ethnicity.
In a multiethnic sample, the effect of quitting appears greatest prior to entering high incidence age
groups, even after accumulation of over 20 pack-years. Given the strength of smoking as a risk factor
and ubiquity of smoking cessation, more studies are needed to estimate the effect of smoking cessation
on pancreatic cancer risk. Additionally, we need more research to better understand the mechanism of
the smoking-pancreatic cancer relationship to learn if heterogeneous effects of smoking by
race/ethnicity should be expected based on metabolism of tobacco smoke compounds.
Tables
Table 6: Sample Baseline Characteristics, Stratified by Race/Ethnicity.

| Characteristic, n (%) | Combined (n=183,661) | African-American (n=31,047) | Hawaiian (n=13,342) | Japanese-American (n=53,130) | Latino (n=40,438) | White (n=45,704) |
|---|---|---|---|---|---|---|
| Status | | | | | | |
| Case | 1588 (0.9) | 295 (1.0) | 135 (1.0) | 558 (1.1) | 285 (0.7) | 315 (0.7) |
| Non-Case | 182073 (99.1) | 30752 (99.0) | 13207 (99.0) | 52572 (98.9) | 40153 (99.3) | 45389 (99.3) |
| Age at Cohort Entry | | | | | | |
| <50 | 31019 (16.9) | 4439 (14.3) | 3783 (28.4) | 8417 (15.8) | 4969 (12.3) | 9411 (20.6) |
| 50-54 | 27554 (15.0) | 4252 (13.7) | 2672 (20.0) | 7009 (13.2) | 5736 (14.2) | 7885 (17.3) |
| 55-59 | 29168 (15.9) | 4503 (14.5) | 2090 (15.7) | 6748 (12.7) | 8929 (22.1) | 6898 (15.1) |
| 60-64 | 31438 (17.1) | 4220 (13.6) | 1949 (14.6) | 8815 (16.6) | 9405 (23.2) | 7049 (15.4) |
| 65-69 | 32190 (17.5) | 6507 (21.0) | 1594 (11.9) | 10518 (19.8) | 6563 (16.2) | 7008 (15.3) |
| >70 | 32292 (17.6) | 7126 (22.9) | 1254 (9.4) | 11623 (21.9) | 4836 (12.0) | 7453 (16.3) |
| Sex | | | | | | |
| Female | 101251 (55.1) | 19868 (64.0) | 7541 (56.5) | 28162 (53.0) | 20972 (51.9) | 24708 (54.1) |
| Male | 82410 (44.9) | 11179 (36.0) | 5801 (43.5) | 24968 (47.0) | 19466 (48.1) | 20996 (45.9) |
| Diabetes | | | | | | |
| Yes | 21499 (11.7) | 4922 (15.9) | 1991 (14.9) | 5584 (10.5) | 6332 (15.7) | 2670 (5.8) |
| No | 162161 (88.3) | 26125 (84.1) | 11351 (85.1) | 47546 (89.5) | 34105 (84.3) | 43034 (94.2) |
| Missing | 1 (0.0) | - (0.0) | - (0.0) | - (0.0) | 1 (0.0) | - (0.0) |
| BMI (kg/m²) | | | | | | |
| <25 | 77898 (42.4) | 9105 (29.3) | 3630 (27.2) | 32230 (60.7) | 11775 (29.1) | 21158 (46.3) |
| 25-29 | 69564 (37.9) | 12271 (39.5) | 4967 (37.2) | 17177 (32.3) | 18620 (46.0) | 16529 (36.2) |
| >30 | 36194 (19.7) | 9667 (31.1) | 4745 (35.6) | 3723 (7.0) | 10042 (24.8) | 8017 (17.5) |
Table 7: Continued

| Characteristic, n (%) | Combined (n=183,661) | African-American (n=31,047) | Hawaiian (n=13,342) | Japanese-American (n=53,130) | Latino (n=40,438) | White (n=45,704) |
|---|---|---|---|---|---|---|
| Smoking History | | | | | | |
| Never | 83272 (45.3) | 12234 (39.4) | 5320 (39.9) | 27061 (50.9) | 20489 (50.7) | 18168 (39.8) |
| Quit < 20 pack-years | 52843 (28.8) | 9435 (30.4) | 3624 (27.2) | 14115 (26.6) | 12134 (30.0) | 13535 (29.6) |
| Quit ≥ 20 pack-years | 17992 (9.8) | 2371 (7.6) | 1380 (10.3) | 5709 (10.8) | 2035 (5.0) | 6497 (14.2) |
| Current < 20 pack-years | 16022 (8.7) | 4615 (14.9) | 1559 (11.7) | 2994 (5.6) | 4111 (10.2) | 2743 (6.0) |
| Current ≥ 20 pack-years | 13532 (7.4) | 2392 (7.7) | 1459 (10.9) | 3251 (6.1) | 1669 (4.1) | 4761 (10.4) |
| Cigarettes per Day* | | | | | | |
| <=10 CPD | 41457 (22.6) | 9876 (31.8) | 2747 (20.6) | 8375 (15.8) | 12821 (31.7) | 7638 (16.7) |
| 11-20 CPD | 35038 (19.1) | 6507 (21.0) | 3058 (22.9) | 10678 (20.1) | 4923 (12.2) | 9872 (21.6) |
| >= 30 CPD | 23894 (13.0) | 2430 (7.8) | 2217 (16.6) | 7016 (13.2) | 2205 (5.4) | 10026 (21.9) |
| Time Since Quit* | | | | | | |
| <=5 years | 13854 (7.6) | 2956 (9.5) | 1099 (8.2) | 3155 (5.9) | 2989 (7.4) | 3655 (8.0) |
| 6-15 years | 23195 (12.6) | 4206 (13.5) | 1733 (13.0) | 6372 (12.0) | 4768 (11.8) | 6116 (13.4) |
| > 15 years | 33786 (18.4) | 4644 (15.0) | 2172 (16.3) | 10297 (19.4) | 6412 (15.8) | 10261 (22.4) |

*Among Former Smokers
Table 8: Incidence Rates of Pancreatic Cancer by Race/Ethnicity.

| Race/Ethnicity | Cases | Incidence rate per 100,000* | Rate Ratio |
|---|---|---|---|
| African American | 295 | 41.25 (35.42, 48.03) | 1 (ref) |
| Native Hawaiian | 135 | 54.90 (45.38, 66.42) | 1.33 (1.30, 1.36) |
| Japanese American | 558 | 38.30 (34.76, 42.21) | 0.93 (0.91, 0.95) |
| Latino | 285 | 29.47 (25.42, 34.16) | 0.72 (0.69, 0.74) |
| White | 315 | 29.06 (25.77, 32.76) | 0.71 (0.69, 0.73) |

* Estimated using observed incidence rates, standardized to the 2000 U.S. Standard Population, left truncated at 45 years old.
Table 9: Fit Model Statistics from Multiethnic, Combined, Excess Relative Risk Model

| Parameter | Estimate | Std. Error | Test Statistic | P value* |
|---|---|---|---|---|
| Log-linear term 0 | | | | |
| Constant (baseline rate in AA)§ | -7.684 | 0.06352 | -121 | < 0.001 |
| NH | 0.2851 | 0.1046 | 2.727 | 0.00639 |
| LA | -0.2869 | 0.08342 | -3.44 | < 0.001 |
| JA | 0.01222 | 0.07218 | 0.1693 | > 0.5 |
| W | -0.2893 | 0.08131 | -3.558 | < 0.001 |
| logage70† | 5.011 | 0.2291 | 21.87 | < 0.001 |
| Linear term 1 | | | | |
| py50‡ | 0.8859 | 0.1811 | 4.892 | < 0.001 |
| Log-linear term 1 | | | | |
| yearsquit | -0.0798 | 0.03232 | -2.469 | 0.0136 |

* P values are two-sided from the Wald test.
† logage70 = log(age in years / 70), so that exp(Constant) gives the baseline rate at age 70 for never smokers.
‡ py50 = pack-years/50, so that the estimate corresponds to the estimated excess relative risk at 50 pack-years.
§ AA, NH, LA, JA, and W refer to African Americans, Native Hawaiians, Latinos, Japanese Americans, and Whites, respectively.
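The confidence intervals quoted in the Results for the combined model can be recovered from the estimates and standard errors in this table. The sketch below assumes Wald-type 95% intervals (estimate ± 1.96 standard errors), with the years-quit interval exponentiated because that term enters on the log scale; the helper name is illustrative and not part of the original analysis.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_ci(estimate, std_error, exponentiate=False):
    """Return (point, lower, upper) for a Wald 95% confidence interval."""
    lower = estimate - Z_95 * std_error
    upper = estimate + Z_95 * std_error
    values = (estimate, lower, upper)
    if exponentiate:
        values = tuple(math.exp(v) for v in values)
    return values


# py50: excess relative risk at 50 pack-years (linear term).
err, err_lo, err_hi = wald_ci(0.8859, 0.1811)
# yearsquit: multiplicative change in the ERR per year quit (log-linear term).
mod, mod_lo, mod_hi = wald_ci(-0.0798, 0.03232, exponentiate=True)

print(f"ERR at 50 pack-years: {err:.2f} ({err_lo:.2f}, {err_hi:.2f})")
print(f"Modifier per year quit: {mod:.2f} ({mod_lo:.2f}, {mod_hi:.2f})")
```

Rounded to two decimals, this gives 0.89 (0.53, 1.24) for the pack-years term and 0.92 (0.87, 0.98) for the per-year quitting modifier, matching the values reported in the text.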
Figures
Figure 6: Excess Relative Risk of Smoking Over a Variety of Smoking Histories.
Excess relative risks (y-axis) plotted over a variety of smoking durations (dark purple = fewer pack-years,
light purple = more pack-years) over differing levels of years quit (x-axis). The excess relative risk
corresponds to added risk due to smoking, beyond the effects of age.
Figure 7: Hypothetical Risk Trajectories, Given Differing Smoking Histories, Over a Lifespan.
Risk, in cases per 100,000 (y-axis), plotted as a function of age (x-axis), pack-years, and years-quit (purple
color). Darker purple lines correspond to less cumulative smoking (pack-years) and greater time-quit.
To estimate the risk in the combined sample, we used a simplified model, only including age, pack-years,
and years-quit as terms.
References
1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA: A Cancer Journal for Clinicians
2020;70(1):7-30.
2. Lindquist CM, Miller FH, Hammond NA, et al. Pancreatic cancer screening. Abdominal Radiology
(New York) 2018;43(2):264-272.
3. SEER*Explorer: An interactive website for SEER cancer statistics [Internet].
https://seer.cancer.gov/explorer/.
4. Cronin KA, Lake AJ, Scott S, et al. Annual Report to the Nation on the Status of Cancer, part I:
National cancer statistics. Cancer 2018;124(13):2785-2800.
5. Bosetti C, Lucenteforte E, Silverman DT, et al. Cigarette smoking and pancreatic cancer: an
analysis from the International Pancreatic Cancer Case-Control Consortium (Panc4). Annals of Oncology
2012;23(7):1880-1888.
6. Arslan AA, Helzlsouer KJ, Kooperberg C, et al. Anthropometric measures, body mass index, and
pancreatic cancer: a pooled analysis from the Pancreatic Cancer Cohort Consortium (PanScan). Archives
of Internal Medicine 2010;170(9):791-802.
7. Kolonel LN, Henderson BE, Hankin JH, et al. A Multiethnic Cohort in Hawaii and Los Angeles:
Baseline Characteristics. American journal of epidemiology 2000;151(4):346-357.
8. Leung PS. Physiology of the Pancreas. In: Leung PS, (ed). The Renin-Angiotensin System: Current
Research Progress in The Pancreas: The RAS in the Pancreas. Dordrecht: Springer Netherlands; 2010, 13-
27.
9. Reinus JF, Simon D. Gastrointestinal Anatomy and Physiology: The Essentials. Hoboken, United
Kingdom: John Wiley & Sons, Incorporated; 2014.
10. Röder PV, Wu B, Liu Y, et al. Pancreatic regulation of glucose homeostasis. Experimental &
Molecular Medicine 2016;48(3):e219-e219.
11. Costanzo LS. Physiology. 6th ed. Philadelphia, PA: Elsevier; 2018.
12. Grant TJ, Hua K, Singh A. Molecular Pathogenesis of Pancreatic Cancer. Progress in Molecular
Biology and Translational Science 2016;144:241-275.
13. Corbo V, Tortora G, Scarpa A. Molecular pathology of pancreatic cancer: from bench-to-bedside
translation. Current Drug Targets 2012;13(6):744-752.
14. Machado NO, al Qadhi H, al Wahibi K. Intraductal Papillary Mucinous Neoplasm of Pancreas.
North American Journal of Medical Sciences 2015;7(5):160-175.
15. Hackeng WM, Hruban RH, Offerhaus GJA, et al. Surgical and molecular pathology of pancreatic
neoplasms. Diagnostic Pathology 2016;11(1):47.
16. Hruban RH, Maitra A, Goggins M. Update on Pancreatic Intraepithelial Neoplasia. International
Journal of Clinical and Experimental Pathology 2008;1(4):306-316.
17. Saiki Y, Horii A. Molecular pathology of pancreatic cancer. Pathology International
2014;64(1):10-19.
18. Kalser MH, Barkin J, MacIntyre JM. Pancreatic cancer. Assessment of prognosis by clinical
presentation. Cancer 1985;56(2):397-402.
19. Agarwal B, Correa AM, Ho L. Survival in pancreatic carcinoma based on tumor size. Pancreas
2008;36(1):e15-20.
20. Stark A, Eibl G. Pancreatic Ductal Adenocarcinoma. Pancreapedia: The Exocrine Pancreas Knowledge
Base 2015; 10.3998/panc.2015.14.
21. Vincent A, Herman J, Schulick R, et al. Pancreatic cancer. The Lancet 2011;378(9791):607-620.
22. Kelsen DP, Portenoy R, Thaler H, et al. Pain as a predictor of outcome in patients with operable
pancreatic carcinoma. Surgery 1997;122(1):53-59.
23. Tummala P, Junaidi O, Agarwal B. Imaging of pancreatic cancer: An overview. Journal of
Gastrointestinal Oncology 2011;2(3):168-174.
24. Pannala R, Leirness JB, Bamlet WR, et al. Prevalence and Clinical Profile of Pancreatic Cancer-
associated Diabetes mellitus. Gastroenterology 2008;134(4):981-987.
25. Chari ST, Leibson CL, Rabe KG, et al. Pancreatic cancer-associated diabetes mellitus: prevalence
and temporal association with diagnosis of cancer. Gastroenterology 2008;134(1):95-101.
26. Nipp R, Tramontano AC, Kong CY, et al. Disparities in cancer outcomes across age, sex, and
race/ethnicity among patients with pancreatic cancer. Cancer Medicine 2018;7(2):525-535.
27. Ilic M, Ilic I. Epidemiology of pancreatic cancer. World Journal of Gastroenterology
2016;22(44):9694-9705.
28. Oberstein PE, Olive KP. Pancreatic cancer: why is it so hard to treat? Therapeutic Advances in
Gastroenterology 2013;6(4):321-337.
29. Sohn TA, Yeo CJ, Cameron JL, et al. Resected adenocarcinoma of the pancreas-616 patients:
results, outcomes, and prognostic indicators. Journal of Gastrointestinal Surgery: Official Journal of the
Society for Surgery of the Alimentary Tract 2000;4(6):567-579.
30. Yamada M, Sugiura T, Okamura Y, et al. Microscopic Venous Invasion in Pancreatic Cancer.
Annals of Surgical Oncology 2018;25(4):1043-1051.
31. Kardosh A, Lichtensztajn DY, Gubens MA, et al. Long-Term Survivors of Pancreatic Cancer: A
California Population-Based Study. Pancreas 2018;47(8):958-966.
32. Lambe M, Eloranta S, Wigertz A, et al. Pancreatic cancer; reporting and long-term survival in
Sweden. Acta Oncologica 2011;50(8):1220-1227.
33. Lynch SM, Vrieling A, Lubin JH, et al. Cigarette smoking and pancreatic cancer: a pooled analysis
from the pancreatic cancer cohort consortium. American Journal of Epidemiology 2009;170(4):403-413.
34. Larsson SC, Permert J, Håkansson N, et al. Overall obesity, abdominal adiposity, diabetes and
cigarette smoking in relation to the risk of pancreatic cancer in two Swedish population-based cohorts.
British Journal of Cancer 2005;93(11):1310-1315.
35. Heinen MM, Verhage BAJ, Goldbohm RA, et al. Active and passive smoking and the risk of
pancreatic cancer in the Netherlands Cohort Study. Cancer Epidemiology, Biomarkers & Prevention: A
Publication of the American Association for Cancer Research, Cosponsored by the American Society of
Preventive Oncology 2010;19(6):1612-1622.
36. Iodice S, Gandini S, Maisonneuve P, et al. Tobacco and the risk of pancreatic cancer: a review
and meta-analysis. Langenbeck's Archives of Surgery 2008;393(4):535-545.
37. Lin Y, Tamakoshi A, Kawamura T, et al. A prospective cohort study of cigarette smoking and
pancreatic cancer in Japan. Cancer causes & control: CCC 2002;13(3):249-254.
38. Huang BZ, Stram DO, Marchand LL, et al. Interethnic differences in pancreatic cancer incidence
and risk factors: The Multiethnic Cohort. Cancer Medicine 2019;8(7):3592-3603.
39. Burkey MD, Feirman S, Wang H, et al. The association between smokeless tobacco use and
pancreatic adenocarcinoma: a systematic review. Cancer Epidemiology 2014;38(6):647-653.
40. Bertuccio P, La Vecchia C, Silverman DT, et al. Cigar and pipe smoking, smokeless tobacco use
and pancreatic cancer: an analysis from the International Pancreatic Cancer Case-Control Consortium
(PanC4). Annals of Oncology: Official Journal of the European Society for Medical Oncology
2011;22(6):1420-1426.
41. Müller-Lissner SA. Bile reflux is increased in cigarette smokers. Gastroenterology 1986;90(5 Pt
1):1205-1209.
42. Michaud DS, Skinner HG, Wu K, et al. Dietary patterns and pancreatic cancer risk in men and
women. Journal of the National Cancer Institute 2005;97(7):518-524.
43. Chan JM, Gong Z, Holly EA, et al. Dietary patterns and risk of pancreatic cancer in a large
population-based case-control study in the San Francisco Bay Area. Nutrition and Cancer
2013;65(1):157-164.
44. Arem H, Reedy J, Sampson J, et al. The Healthy Eating Index 2005 and risk for pancreatic cancer
in the NIH-AARP study. Journal of the National Cancer Institute 2013;105(17):1298-1305.
45. Jiao L, Mitrou PN, Reedy J, et al. A combined healthy lifestyle score and risk of pancreatic cancer
in a large cohort study. Archives of Internal Medicine 2009;169(8):764-770.
46. Tognon G, Nilsson LM, Lissner L, et al. The Mediterranean diet score and mortality are inversely
associated in adults living in the subarctic region. The Journal of Nutrition 2012;142(8):1547-1553.
47. Larsson SC, Wolk A. Red and processed meat consumption and risk of pancreatic cancer: meta-
analysis of prospective studies. British Journal of Cancer 2012;106(3):603-607.
48. Nöthlings U, Wilkens LR, Murphy SP, et al. Meat and fat intake as risk factors for pancreatic
cancer: the multiethnic cohort study. Journal of the National Cancer Institute 2005;97(19):1458-1465.
49. Turesky RJ, Le Marchand L. Metabolism and biomarkers of heterocyclic aromatic amines in
molecular epidemiology studies: lessons learned from aromatic amines. Chemical Research in Toxicology
2011;24(8):1169-1214.
50. IARC Working Group on the Evaluation of Carcinogenic Risks to Humans. Red Meat and Processed Meat. Lyon (FR): International Agency for
Research on Cancer; 2018.
51. Thiébaut ACM, Jiao L, Silverman DT, et al. Dietary fatty acids and pancreatic cancer in the NIH-
AARP diet and health study. Journal of the National Cancer Institute 2009;101(14):1001-1011.
52. Michaud DS, Giovannucci E, Willett WC, et al. Dietary meat, dairy products, fat, and cholesterol
and pancreatic cancer risk in a prospective study. American Journal of Epidemiology 2003;157(12):1115-
1125.
53. Aune D, Greenwood DC, Chan DSM, et al. Body mass index, abdominal fatness and pancreatic
cancer risk: a systematic review and non-linear dose-response meta-analysis of prospective studies.
Annals of Oncology: Official Journal of the European Society for Medical Oncology 2012;23(4):843-852.
54. Batabyal P, Vander Hoorn S, Christophi C, et al. Association of diabetes mellitus and pancreatic
adenocarcinoma: a meta-analysis of 88 studies. Annals of Surgical Oncology 2014;21(7):2453-2462.
55. Bosetti C, Rosato V, Li D, et al. Diabetes, antidiabetic medications, and pancreatic cancer risk: an
analysis from the International Pancreatic Cancer Case-Control Consortium. Annals of Oncology: Official
Journal of the European Society for Medical Oncology 2014;25(10):2065-2072.
56. Setiawan VW, Stram DO, Porcel J, et al. Pancreatic Cancer Following Incident Diabetes in African
Americans and Latinos: The Multiethnic Cohort. Journal of the National Cancer Institute 2019;111(1):27-
33.
57. Hassan MM, Li D, El-Deeb AS, et al. Association between hepatitis B virus and pancreatic cancer.
Journal of Clinical Oncology: Official Journal of the American Society of Clinical Oncology
2008;26(28):4557-4562.
58. Huang J, Magnusson M, Törner A, et al. Risk of pancreatic cancer among individuals with
hepatitis C or hepatitis B virus infection: a nationwide study in Sweden. British Journal of Cancer
2013;109(11):2917-2923.
59. Mahale P, Torres HA, Kramer JR, et al. Hepatitis C virus infection and the risk of cancer among
elderly US adults: A registry-based case-control study. Cancer 2017;123(7):1202-1211.
60. Xiao M, Wang Y, Gao Y. Association between Helicobacter pylori infection and pancreatic cancer
development: a meta-analysis. PloS One 2013;8(9):e75559.
61. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans: Outdoor Air Pollution.
International Agency for Research on Cancer 2016;109.
62. Ancona C, Badaloni C, Mataloni F, et al. Mortality and morbidity in a population exposed to
multiple sources of air pollution: A retrospective cohort study using air dispersion models.
Environmental Research 2015;137:467-474.
63. Raaschou-Nielsen O, Andersen ZJ, Hvidberg M, et al. Air pollution from traffic and cancer
incidence: a Danish cohort study. Environmental Health 2011;10(1).
64. Turner MC, Krewski D, Diver WR, et al. Ambient Air Pollution and Cancer Mortality in the Cancer
Prevention Study II. Environmental Health Perspectives 2017;125(8).
65. Coleman NC, Burnett RT, Ezzati M, et al. Fine Particulate Matter Exposure and Cancer Incidence:
Analysis of SEER Cancer Registry Data from 1992-2016. Environmental Health Perspectives
2020;128(10):107004.
66. Coleman NC, Burnett RT, Higbee JD, et al. Cancer mortality risk, fine particulate air pollution,
and smoking in a large, representative cohort of US adults. Cancer causes & control: CCC
2020;31(8):767-776.
67. Wang Y, Li M, Wan X, et al. Spatiotemporal analysis of PM 2.5 and pancreatic cancer mortality in
China. Environmental Research 2018;164:132-139.
68. Touma JS, Isakov V, Ching J, et al. Air quality modeling of hazardous pollutants: current status
and future directions. Journal of the Air & Waste Management Association (1995) 2006;56(5):547-558.
69. Klein AP, Brune KA, Petersen GM, et al. Prospective risk of pancreatic cancer in familial
pancreatic cancer kindreds. Cancer Research 2004;64(7):2634-2638.
70. Amundadottir LT. Pancreatic Cancer Genetics. International Journal of Biological Sciences
2016;12(3):314-325.
71. Klein AP, Wolpin BM, Risch HA, et al. Genome-wide meta-analysis identifies five new
susceptibility loci for pancreatic cancer. Nature Communications 2018;9(1).
72. Amundadottir L, Kraft P, Stolzenberg-Solomon RZ, et al. Genome-wide association study
identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nature Genetics
2009;41(9):986-990.
73. Childs EJ, Mocci E, Campa D, et al. Common variation at 2p13.3, 3q29, 7p13 and 17q25.1
associated with susceptibility to pancreatic cancer. Nature Genetics 2015;47(8):911-916.
74. Petersen GM, Amundadottir L, Fuchs CS, et al. A genome-wide association study identifies
pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nature Genetics
2010;42(3):224-228.
75. Wolpin BM, Rizzato C, Kraft P, et al. Genome-wide association study identifies multiple
susceptibility loci for pancreatic cancer. Nature Genetics 2014;46(9):994-1000.
76. Zhang M, Wang Z, Obazee O, et al. Three new pancreatic cancer susceptibility signals identified
on chromosomes 1q32.1, 5p15.33 and 8q24.21. Oncotarget 2016;7(41):66328-66343.
77. Lin Y, Nakatochi M, Ito H, et al. Genome-wide association meta-analysis identifies novel GP2
gene risk variants for pancreatic cancer in the Japanese population. bioRxiv 2018;
10.1101/498659:498659.
78. Wu C, Kraft P, Stolzenberg-Solomon R, et al. Genome-Wide Association Study of Survival in
Patients with Pancreatic Adenocarcinoma. Gut 2014;63(1).
79. Low S-K, Kuchiba A, Zembutsu H, et al. Genome-Wide Association Study of Pancreatic Cancer in
Japanese Population. PLOS ONE 2010;5(7):e11824.
80. Campa D, Rizzato C, Capurso G, et al. Genetic susceptibility to pancreatic cancer and its
functional characterisation: the PANcreatic Disease ReseArch (PANDoRA) consortium. Digestive and
Liver Disease: Official Journal of the Italian Society of Gastroenterology and the Italian Association for
the Study of the Liver 2013;45(2):95-99.
81. Campa D, Pastore M, Capurso G, et al. Do pancreatic cancer and chronic pancreatitis share the
same genetic risk factors? A PANcreatic Disease ReseArch (PANDoRA) consortium investigation.
International Journal of Cancer 2018;142(2):290-296.
82. Canzian F, Obazee O, Hackert T, et al. The PANcreatic DISEASE ReseArch (PANDoRA) consortium:
an update. Pancreatology 2018;18(4, Supplement):S35.
83. Wang X, Lin X, Na R, et al. An evaluation study of reported pancreatic adenocarcinoma risk-
associated SNPs from genome-wide association studies in Chinese population. Pancreatology: official
journal of the International Association of Pancreatology (IAP) ... [et al.] 2017;17(6):931-935.
84. Nakatochi M, Lin Y, Ito H, et al. Prediction model for pancreatic cancer risk in the general
Japanese population. PloS One 2018;13(9):e0203386.
85. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2019. CA: A Cancer Journal for Clinicians
2019;69(1):7-34.
86. Rahib L, Smith BD, Aizenberg R, et al. Projecting cancer incidence and deaths to 2030: the
unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Research
2014;74(11):2913-2921.
87. Liu L, Zhang J, Deapen D, et al. Differences in Pancreatic Cancer Incidence Rates and Temporal
Trends Across Asian Subpopulations in California (1988-2015). Pancreas 2019;48(7):931-933.
88. Huang BZ, Stram DO, Marchand LL, et al. Interethnic differences in pancreatic cancer incidence
and risk factors: The Multiethnic Cohort. Cancer Medicine 2019;0(0).
89. Huang BZ, Stram DO, Le Marchand L, et al. Interethnic differences in pancreatic cancer incidence
and risk factors: The Multiethnic Cohort. Cancer Med 2019;8(7):3592-3603.
90. Wu C, Miao X, Huang L, et al. Genome-wide association study identifies five loci associated with
susceptibility to pancreatic cancer in Chinese populations. Nature Genetics 2011;44(1):62-66.
91. Ueno M, Ohkawa S, Morimoto M, et al. Genome-wide association study-identified SNPs
(rs3790844, rs3790843) in the NR5A2 gene and risk of pancreatic cancer in Japanese. Scientific Reports
2015;5:17018.
92. Campa D, Rizzato C, Bauer AS, et al. Lack of replication of seven pancreatic cancer susceptibility
loci identified in two Asian populations. Cancer Epidemiology, Biomarkers & Prevention
2013;22(2):320-323.
93. Kolonel LN, Henderson BE, Hankin JH, et al. A Multiethnic Cohort in Hawaii and Los Angeles:
Baseline Characteristics. American Journal of Epidemiology 2000;151(4):346-357.
94. Signorello LB, Hargreaves MK, Steinwandel MD, et al. Southern community cohort study:
establishing a cohort to investigate health disparities. Journal of the National Medical Association
2005;97(7):972-979.
95. Manichaikul A, Mychaleckyj JC, Rich SS, et al. Robust relationship inference in genome-wide
association studies. Bioinformatics 2010;26(22):2867-2873.
96. Purcell S, Chang C. PLINK 2. In.
97. R: A Language and Environment for Statistical Computing. In. R Core Team. 3.5.1 ed. Vienna,
Austria: R Foundation for Statistical Computing; 2018.
98. Chen Q, Yuan H, Shi G-D, et al. Association between NR5A2 and the risk of pancreatic cancer,
especially among Caucasians: a meta-analysis of case-control studies. OncoTargets and Therapy
2018;11:2709-2723.
99. Marigorta UM, Navarro A. High Trans-ethnic Replicability of GWAS Results Implies Common
Causal Variants. PLOS Genetics 2013;9(6):e1003566.
100. Bogumil D, Conti DV, Sheng X, et al. Replication and Genetic Risk Score Analysis for Pancreatic
Cancer in a Diverse Multiethnic Population. Cancer Epidemiology, Biomarkers & Prevention
2020; 10.1158/1055-9965.EPI-20-0963.
101. International Agency for Research on Cancer (IARC) monographs on the evaluation of
carcinogenic risks to humans: outdoor air pollution; 2016.
102. Kolonel LN, Henderson BE, Hankin JH, et al. A Multiethnic Cohort in Hawaii and Los Angeles:
Baseline Characteristics. American Journal of Epidemiology 2000;151(4):346-357.
103. Cheng I, Tseng C, Wu J, et al. Association between ambient air pollution and breast cancer risk:
The multiethnic cohort study. International Journal of Cancer 2019; 10.1002/ijc.32308.
104. Wu AH, Wu J, Tseng C, et al. Association Between Outdoor Air Pollution and Risk of Malignant
and Benign Brain Tumors: The Multiethnic Cohort Study. JNCI Cancer Spectrum 2020;4(2).
105. Li L, Wu AH, Cheng I, et al. Spatiotemporal estimation of historical PM2.5 concentrations using
PM10, meteorological variables, and spatial effect. Atmospheric Environment 2017;166:182-191.
106. Cologne J, Hsu W-L, Abbott RD, et al. Proportional Hazards Regression in Epidemiologic Follow-
up Studies: An Intuitive Consideration of Primary Time Scale. Epidemiology 2012;23(4):565-573.
107. Korn EL, Graubard BI, Midthune D. Time-to-event analysis of longitudinal follow-up of a survey:
choice of the time-scale. American Journal of Epidemiology 1997;145(1):72-80.
108. Jerrett M, Arain A, Kanaroglou P, et al. A review and evaluation of intraurban air pollution
exposure models. Journal of Exposure Science & Environmental Epidemiology 2005;15(2):185-204.
109. Xie X, Semanjski I, Gautama S, et al. A Review of Urban Air Pollution Monitoring and Exposure
Assessment Methods. ISPRS International Journal of Geo-Information 2017;6(12):389.
110. Harrison RM, Yin J. Particulate matter in the atmosphere: which particle properties are
important for its effects on health? Science of The Total Environment 2000;249(1):85-101.
111. Philip S, Martin RV, van Donkelaar A, et al. Global chemical composition of ambient fine
particulate matter for exposure assessment. Environmental Science & Technology 2014;48(22):13060-
13068.
112. Zhang X, Staimer N, Gillen DL, et al. Associations of oxidative stress and inflammatory
biomarkers with chemically-characterized air pollutant exposures in an elderly cohort. Environmental
Research 2016;150:306-319.
113. Demetriou CA, Vineis P. Carcinogenicity of ambient air pollution: use of biomarkers, lessons
learnt and future directions. Journal of Thoracic Disease 2015;7(1):67-95.
114. Carrigan PE, Hentz JG, Gordon G, et al. Distinctive Heavy Metal Composition of Pancreatic Juice
in Patients with Pancreatic Carcinoma. Cancer Epidemiology and Prevention Biomarkers
2007;16(12):2656-2663.
115. Amaral AFS, Porta M, Silverman DT, et al. Pancreatic cancer risk and levels of trace elements.
Gut 2012;61(11):1583-1588.
116. Barone E, Corrado A, Gemignani F, et al. Environmental risk factors for pancreatic cancer: an
update. Archives of Toxicology 2016;90(11):2617-2642.
117. Li Y, Xu L, Shan Z, et al. Association between air pollution and type 2 diabetes: an updated
review of the literature. Therapeutic Advances in Endocrinology and Metabolism 2019;10.
118. Eze IC, Hemkens LG, Bucher HC, et al. Association between ambient air pollution and diabetes
mellitus in Europe and North America: systematic review and meta-analysis. Environmental Health
Perspectives 2015;123(5):381-389.
119. Klepeis NE, Nelson WC, Ott WR, et al. The National Human Activity Pattern Survey (NHAPS): a
resource for assessing exposure to environmental pollutants. Journal of Exposure Analysis and
Environmental Epidemiology 2001;11(3):231-252.
120. Cepeda M, Schoufour J, Freak-Poli R, et al. Levels of ambient air pollution according to mode of
transport: a systematic review. The Lancet Public Health 2017;2(1):e23-e34.
121. Jurek AM, Greenland S, Maldonado G. How far from non-differential does exposure or disease
misclassification have to be to bias measures of association away from the null? International Journal of
Epidemiology 2008;37(2):382-385.
122. Rahib L, Wehner MR, Matrisian LM, et al. Estimated Projection of US Cancer Incidence and
Death to 2040. JAMA Network Open 2021;4(4):e214708.
123. Kirkegård J, Mortensen FV, Cronin-Fenton D. Chronic Pancreatitis and Pancreatic Cancer Risk: A
Systematic Review and Meta-analysis. The American Journal of Gastroenterology 2017;112(9):1366-
1372.
124. Lugo A, Peveri G, Bosetti C, et al. Strong excess risk of pancreatic cancer for low frequency and
duration of cigarette smoking: A comprehensive review and meta-analysis. European Journal of Cancer
2018;104:117-126.
125. Health, United States, 2019. In. Hyattsville, MD: National Center for Health Statistics.
126. Muscat JE, Djordjevic MV, Colosimo S, et al. Racial differences in exposure and glucuronidation
of the tobacco-specific carcinogen 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK). Cancer
2005;103(7):1420-1426.
127. Pérez-Stable EJ, Herrera B, Jacob P, et al. Nicotine metabolism and intake in black and white
smokers. JAMA 1998;280(2):152-156.
128. Jones MR, Apelberg BJ, Tellez-Plaza M, et al. Menthol cigarettes, race/ethnicity, and biomarkers
of tobacco use in U.S. adults: the 1999-2010 National Health and Nutrition Examination Survey
(NHANES). Cancer Epidemiology, Biomarkers & Prevention 2013;22(2):224-232.
129. Maskarinec G, Erber E, Grandinetti A, et al. Diabetes incidence based on linkages with health
plans: the multiethnic cohort. Diabetes 2009;58(8):1732-1738.
130. Stram DO, Park SL, Haiman CA, et al. Racial/Ethnic Differences in Lung Cancer Incidence in the
Multiethnic Cohort Study: An Update. Journal of the National Cancer Institute 2019;
10.1093/jnci/djy206.
131. Laird N, Olivier D. Covariance Analysis of Censored Survival Data Using Log-Linear Analysis
Techniques. Journal of the American Statistical Association 1981;76(374):231-240.
132. Preston D, Lubin J, Pierce D. Epicure User Guide. In. Seattle, WA: Hirosoft International
Corporation; 1993.
133. Vrieling A, Bueno-de-Mesquita HB, Boshuizen HC, et al. Cigarette smoking, environmental
tobacco smoke exposure and pancreatic cancer risk in the European Prospective Investigation into
Cancer and Nutrition. International Journal of Cancer 2010;126(10):2394-2403.
134. Schulte A, Pandeya N, Tran B, et al. Cigarette smoking and pancreatic cancer risk: More to the
story than just pack-years. European Journal of Cancer 2014;50(5):997-1003.
135. Leffondré K, Abrahamowicz M, Siemiatycki J, et al. Modeling smoking history: a comparison of
different approaches. American Journal of Epidemiology 2002;156(9):813-823.
136. Gram IT, Park S-Y, Maskarinec G, et al. Smoking and breast cancer risk by race/ethnicity and
oestrogen and progesterone receptor status: the Multiethnic Cohort (MEC) study. International Journal
of Epidemiology 2019;48(2):501-511.
137. Murphy SE. Nicotine Metabolism and Smoking: Ethnic Differences in the Role of P450 2A6.
Chemical Research in Toxicology 2017;30(1):410-419.
138. Carrigan PE, Hentz JG, Gordon G, et al. Distinctive Heavy Metal Composition of Pancreatic Juice
in Patients with Pancreatic Carcinoma. Cancer Epidemiology Biomarkers & Prevention
2007;16(12):2656-2663.
139. Benowitz NL, Dains KM, Dempsey D, et al. Racial Differences in the Relationship Between
Number of Cigarettes Smoked and Nicotine and Carcinogen Exposure. Nicotine & Tobacco Research
2011;13(9):772-783.
Appendices
Appendix A: Detailed MEC and SCCS Sample Description
The MEC is a population-based prospective cohort that was initiated in 1993-1996 in California
and Hawaii to investigate cancer etiology in a multiethnic population. The Department of Motor
Vehicles, voter registration lists, and Health Care Financing Administration data files were used to
identify participants. Over 215,000 men and women, between the ages of 45 and 75 and from five major
ethnic/racial groups (African Americans, Japanese Americans, Latinos, Native Hawaiians, and whites),
were enrolled. Enrollment and informed consent were obtained through a mailed baseline questionnaire,
which collected data on demographics, anthropometric measures, diet, and lifestyle factors.
Cancer cases are identified in the MEC through annual linkage to the California and Hawaii state
cancer registries, which are part of the National Cancer Institute's Surveillance, Epidemiology, and End
Results Program. Vital status and cause of death among participants are collected through linkage of
participant data to the National Death Index and death certificate files for California.
The nested case-control sample of the MEC included participants with incident PC and available
DNA. Incident PC cases were individually matched to an equal number of controls on age, sex, and
race/ethnicity. This sample, along with SCCS cases and controls, was genotyped using the 2M
Multiethnic Genotyping Array (the ‘MEGA’ chip) from Illumina (San Diego, CA).
In addition to these newly genotyped controls, additional controls were selected from ~18,000
MEC participants genotyped using the MEGA array as part of other genetic studies. These earlier
samples were genotyped for prior MEC studies at the Center for Inherited Disease Research (CIDR) and
at the USC Norris Molecular Genomic Core facility.
The SCCS participants were selected from a prospective cohort study of >85,000 black and
white men and women, initiated in 2002. Average age at cohort entry was 52 years. Each participant
completed an extensive baseline questionnaire that ascertained information about demographic,
anthropometric, medical, lifestyle, and other factors. The educational and income levels of the SCCS
members are low compared with other established cohorts, and the prevalence of obesity (45%) and
current smoking (44%) are high in this cohort. Cancer incidence is ascertained by annual
linkage of the cohort with the 12 state cancer registries covering the SCCS catchment area.
Approximately 90% of participants provided biological samples at baseline (blood or buccal).
Appendix B: Flow diagram of the data cleaning, exclusion, and imputation for the sample.
Multiethnic Cohort (MEC) and Southern Community Cohort Study (SCCS) samples were combined
and then underwent quality control of genetic data. Following this process, samples were removed
based on missing data. Related pairs were then identified, and the most intra-cohort related individual
in each pair was removed, producing the independent observations used in analysis.
Appendix C: Identical by descent plot showing relatedness of samples within the Multiethnic
Cohort (MEC) and Southern Community Cohort Study (SCCS) prior to filtering based on
relatedness.
Z0 is the probability that zero alleles are identical by descent (IBD), and Z1 is the probability that
one allele is IBD. Point colors correspond to PI_HAT values produced by relationship
inference in PLINK. PI_HAT is the estimated proportion of alleles that are identical
by descent, calculated as P(IBD=2) + 0.5*P(IBD=1).
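The PI_HAT formula in this caption can be illustrated with a short Python sketch. It assumes that the three IBD probabilities sum to 1, so that P(IBD=2) can be recovered from Z0 and Z1; the function name and the example relationships are illustrative and are not taken from the thesis or from PLINK itself.

```python
def pi_hat(z0: float, z1: float) -> float:
    """Proportion of alleles shared identical by descent.

    z0 = P(IBD=0), z1 = P(IBD=1); P(IBD=2) is taken as the remainder
    (assumption: the three probabilities sum to 1).
    """
    z2 = 1.0 - z0 - z1
    return z2 + 0.5 * z1


# Theoretical (Z0, Z1) values for a few relationships, for illustration only.
examples = [
    ("unrelated", 1.0, 0.0),
    ("parent-child", 0.0, 1.0),
    ("full siblings", 0.25, 0.5),
    ("duplicate or identical twin", 0.0, 0.0),
]

for label, z0, z1 in examples:
    print(f"{label}: PI_HAT = {pi_hat(z0, z1):.2f}")
```

Pairs with estimated values near these theoretical points are the ones that a relatedness filter would flag.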
Appendix D: Principal component (PC) analysis plots with point color corresponding to self-
reported race/ethnicity from baseline questionnaires
Appendix E: Multiethnic and ethnic-specific replication results.
SNP Information | Statistic | Multiethnic | White | African-American | Japanese-American | Hispanic | Native-Hawaiian

Chr 9q34 | RAF Case;Ctl | 0.41; 0.35 | 0.43; 0.33 | 0.41; 0.34 | 0.50; 0.45 | 0.27; 0.25 | 0.37; 0.36
RSID (RISK/REF) rs505922 (C/T) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 136149229 | OR (CI) | 1.30 (1.16, 1.47) | 1.54 (1.13, 2.10) | 1.34 (1.08, 1.65) | 1.19 (0.96, 1.48) | 1.13 (0.82, 1.55) | 1.07 (0.68, 1.68)
Gene ABO | p | 9.74E-06 | 0.006 | 0.007 | 0.114 | 0.472 | 0.763
Disc. Race European | P-Het | 0.176

Chr 7q32.3 | RAF Case;Ctl | 0.91; 0.89 | 0.86; 0.82 | 0.89; 0.84 | 0.97; 0.97 | 0.90; 0.88 | 0.94; 0.95
RSID (RISK/REF) rs6971499 (T/C) | Info Score (R2) | | 0.88 (IP) | 0.95 (IP) | 0.76 (IP) | 0.88 (IP) | 0.85 (IP)
POS 130680521 | OR (CI) | 1.39 (1.12, 1.71) | 1.60 (1.02, 2.52) | 1.50 (1.10, 2.06) | 0.96 (0.45, 2.03) | 1.21 (0.75, 1.95) | 0.91 (0.34, 2.46)
Gene LINC-PINT | p | 0.002 | 0.035 | 0.008 | 0.911 | 0.432 | 0.855
Disc. Race European | P-Het | 0.48

Chr 17q12 | RAF Case;Ctl | 0.83; 0.81 | 0.83; 0.78 | 0.95; 0.93 | 0.72; 0.72 | 0.85; 0.80 | 0.66; 0.64
RSID (RISK/REF) rs4795218 (G/A) | Info Score (R2) | | 0.96 (IP) | 0.88 (IP) | 0.99 (IP) | 0.96 (IP) | 0.97 (IP)
POS 36078510 | OR (CI) | 1.25 (1.06, 1.47) | 1.42 (0.94, 2.13) | 1.56 (0.93, 2.60) | 1.05 (0.83, 1.34) | 1.39 (0.93, 2.07) | 1.12 (0.70, 1.77)
Gene HNF1B | p | 0.006 | 0.086 | 0.077 | 0.674 | 0.1 | 0.64
Disc. Race European | P-Het | 0.341

Chr 8q24.21 | RAF Case;Ctl | 0.28; 0.24 | 0.35; 0.34 | 0.31; 0.27 | 0.20; 0.16 | 0.27; 0.26 | 0.28; 0.25
RSID (RISK/REF) rs10094872 (T/A) | Info Score (R2) | | 0.84 (IP) | 0.77 (IP) | 0.76 (IP) | 0.76 (IP) | 0.80 (IP)
POS 128719884 | OR (CI) | 1.20 (1.04, 1.39) | 1.14 (0.81, 1.62) | 1.30 (1.01, 1.68) | 1.35 (0.99, 1.84) | 1.09 (0.76, 1.57) | 1.15 (0.68, 1.96)
Gene MYC | p | 0.013 | 0.459 | 0.042 | 0.061 | 0.627 | 0.603
Disc. Race European | P-Het | 0.734

Chr 5p15.33 | RAF Case;Ctl | 0.49; 0.46 | 0.51; 0.46 | 0.66; 0.58 | 0.29; 0.33 | 0.44; 0.41 | 0.50; 0.39
RSID (RISK/REF) rs401681 (T/C) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 1322087 | OR (CI) | 1.15 (1.03, 1.29) | 1.12 (0.84, 1.50) | 1.33 (1.07, 1.64) | 0.81 (0.64, 1.03) | 1.19 (0.90, 1.56) | 1.57 (1.02, 2.42)
Gene CLPTM1L | p | 0.017 | 0.431 | 0.008 | 0.087 | 0.228 | 0.041
Disc. Race European | P-Het | 0.007

Chr 16q23.1 | RAF Case;Ctl | 0.13; 0.12 | 0.08; 0.05 | 0.25; 0.23 | 0.04; 0.03 | 0.07; 0.05 | 0.06; 0.05
RSID (RISK/REF) rs7190458 (A/G) | Info Score (R2) | | 0.71 (IP) | 0.87 (IP) | 0.39 (IP) | 0.80 (IP) | 0.44 (IP)
POS 75263661 | OR (CI) | 1.27 (1.04, 1.55) | 2.14 (1.08, 4.26) | 1.12 (0.87, 1.44) | 1.85 (0.85, 4.05) | 1.49 (0.82, 2.72) | 1.46 (0.39, 5.55)
Gene BCAR1 | p | 0.024 | 0.034 | 0.375 | 0.146 | 0.213 | 0.59
Disc. Race European | P-Het | 0.193

Chr 21q22.3 | RAF Case;Ctl | 0.66; 0.63 | 0.67; 0.69 | 0.71; 0.68 | 0.55; 0.51 | 0.73; 0.65 | 0.66; 0.68
RSID (RISK/REF) rs1547374 (A/G) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 43778895 | OR (CI) | 1.14 (1.01, 1.28) | 0.80 (0.58, 1.10) | 1.19 (0.95, 1.49) | 1.18 (0.95, 1.46) | 1.47 (1.07, 2.01) | 0.93 (0.59, 1.48)
Gene TFF1 | p | 0.031 | 0.168 | 0.128 | 0.128 | 0.013 | 0.768
Disc. Race Chinese | P-Het | 0.102

Chr 18q21.32 | RAF Case;Ctl | 0.81; 0.78 | 0.83; 0.83 | 0.75; 0.70 | 0.89; 0.86 | 0.78; 0.74 | 0.91; 0.89
RSID (RISK/REF) rs1517037 (C/T) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 56878274 | OR (CI) | 1.17 (1.01, 1.35) | 0.97 (0.65, 1.45) | 1.24 (0.97, 1.57) | 1.27 (0.91, 1.77) | 1.23 (0.89, 1.72) | 1.19 (0.57, 2.52)
Gene GRP | p | 0.037 | 0.878 | 0.078 | 0.156 | 0.207 | 0.636
Disc. Race European | P-Het | 0.415

Chr 17q25.1 | RAF Case;Ctl | 0.25; 0.24 | 0.14; 0.11 | 0.23; 0.22 | 0.28; 0.26 | 0.38; 0.31 | 0.22; 0.24
RSID (RISK/REF) rs7214041 (T/C) | Info Score (R2) | | 0.79 (IP) | 0.82 (IP) | 0.84 (IP) | 0.87 (IP) | 0.86 (IP)
POS 70401476 | OR (CI) | 1.17 (1.01, 1.35) | 1.05 (0.64, 1.71) | 1.02 (0.78, 1.34) | 1.19 (0.92, 1.53) | 1.45 (1.05, 1.99) | 0.89 (0.50, 1.56)
Gene LINC00673 | p | 0.037 | 0.858 | 0.877 | 0.186 | 0.024 | 0.678
Disc. Race European | P-Het | 0.386

Chr 1p36.33 | RAF Case;Ctl | 0.40; 0.41 | 0.13; 0.09 | 0.74; 0.69 | 0.34; 0.28 | 0.20; 0.23 | 0.20; 0.19
RSID (RISK/REF) rs13303010 (G/A) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 894573 | OR (CI) | 1.15 (1.01, 1.31) | 1.37 (0.87, 2.15) | 1.16 (0.91, 1.48) | 1.32 (1.06, 1.66) | 0.85 (0.60, 1.21) | 1.04 (0.59, 1.83)
Gene NOC2L | p | 0.04 | 0.185 | 0.225 | 0.017 | 0.366 | 0.9
Disc. Race European | P-Het | 0.103

Chr 13q22.1 | RAF Case;Ctl | 0.60; 0.59 | 0.41; 0.39 | 0.88; 0.83 | 0.56; 0.51 | 0.40; 0.39 | 0.43; 0.45
RSID (RISK/REF) rs9543325 (C/T) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 73916628 | OR (CI) | 1.14 (1.00, 1.29) | 1.09 (0.80, 1.48) | 1.21 (0.88, 1.67) | 1.21 (0.98, 1.50) | 1.01 (0.76, 1.35) | 0.90 (0.57, 1.40)
Gene KLF5, KLF12 | p | 0.042 | 0.604 | 0.24 | 0.077 | 0.938 | 0.636
Disc. Race European | P-Het | 0.653

Chr 12p11.21 | RAF Case;Ctl | 0.61; 0.61 | 0.45; 0.41 | 0.63; 0.62 | 0.70; 0.66 | 0.54; 0.54 | 0.73; 0.65
RSID (RISK/REF) rs708224 (A/G) | Info Score (R2) | | 1.00 (IP) | 0.99 (IP) | 1.00 (IP) | 1.00 (IP) | 1.00 (IP)
POS 32436409 | OR (CI) | 1.12 (1.00, 1.26) | 1.16 (0.86, 1.57) | 1.03 (0.84, 1.28) | 1.17 (0.93, 1.48) | 1.01 (0.76, 1.33) | 1.44 (0.86, 2.39)
Gene BICD1 | p | 0.058 | 0.333 | 0.769 | 0.172 | 0.964 | 0.155
Disc. Race Japanese | P-Het | 0.462

Chr 3q29 | RAF Case;Ctl | 0.74; 0.72 | 0.65; 0.64 | 0.75; 0.73 | 0.91; 0.91 | 0.52; 0.47 | 0.76; 0.76
RSID (RISK/REF) rs9854771 (G/A) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 0.98 (GT) | 1.00 (GT) | 1.00 (GT)
POS 189508471 | OR (CI) | 1.11 (0.97, 1.27) | 1.02 (0.75, 1.39) | 1.19 (0.94, 1.51) | 1.05 (0.72, 1.54) | 1.19 (0.90, 1.57) | 0.98 (0.59, 1.64)
Gene TP63 | p | 0.139 | 0.9 | 0.134 | 0.78 | 0.213 | 0.937
Disc. Race European | P-Het | 0.737

Chr 5p15.33 | RAF Case;Ctl | 0.98; 0.99 | 0.96; 0.98 | 0.97; 0.98 | 1.00; 1.00 | 0.99; 0.99 | 1.00; 1.00
RSID (RISK/REF) rs35226131 (C/T) | Info Score (R2) | | 0.93 (IP) | 0.92 (IP) | 0.27 (IP) | 0.90 (IP) | 0.89 (IP)
POS 1295373 | OR (CI) | 0.69 (0.43, 1.11) | 0.62 (0.25, 1.54) | 0.59 (0.30, 1.17) | 6.83 (0.00, 8.88e+05) | 1.43 (0.32, 6.39) | 106.22 (0.00, 2.49e+12)
Gene TERT, CLPTM1L | p | 0.139 | 0.313 | 0.148 | 0.714 | 0.62 | 0.347
Disc. Race European | P-Het | 0.613

Chr 1q32.1 | RAF Case;Ctl | 0.35; 0.35 | 0.23; 0.22 | 0.74; 0.70 | 0.06; 0.07 | 0.23; 0.20 | 0.17; 0.15
RSID (RISK/REF) rs2816938 (A/T) | Info Score (R2) | | 0.98 (IP) | 0.97 (IP) | 0.94 (IP) | 0.98 (IP) | 0.94 (IP)
POS 199985368 | OR (CI) | 1.12 (0.96, 1.30) | 0.99 (0.69, 1.43) | 1.11 (0.87, 1.42) | 0.96 (0.61, 1.50) | 1.32 (0.94, 1.85) | 1.16 (0.63, 2.12)
Gene NR5A2 | p | 0.145 | 0.952 | 0.411 | 0.85 | 0.12 | 0.642
Disc. Race European | P-Het | 0.872

Chr 13q12.2 | RAF Case;Ctl | 0.27; 0.26 | 0.46; 0.40 | 0.11; 0.15 | 0.33; 0.37 | 0.26; 0.28 | 0.28; 0.27
RSID (RISK/REF) rs9581943 (A/G) | Info Score (R2) | | 0.95 (IP) | 0.91 (IP) | 0.95 (IP) | 0.95 (IP) | 0.94 (IP)
POS 28493997 | OR (CI) | 0.91 (0.79, 1.04) | 1.14 (0.83, 1.55) | 0.78 (0.55, 1.09) | 0.83 (0.66, 1.04) | 0.98 (0.71, 1.36) | 1.00 (0.60, 1.67)
Gene PDX1-AS1-PDX1 | p | 0.155 | 0.428 | 0.135 | 0.11 | 0.914 | 0.999
Disc. Race European | P-Het | 0.553

Chr 22q12.1 | RAF Case;Ctl | 0.23; 0.21 | 0.19; 0.17 | 0.06; 0.06 | 0.51; 0.49 | 0.19; 0.17 | 0.22; 0.24
RSID (RISK/REF) rs16986825 (T/C) | Info Score (R2) | | 0.98 (IP) | 0.88 (IP) | 0.93 (IP) | 0.96 (IP) | 0.97 (IP)
POS 29300306 | OR (CI) | 1.11 (0.95, 1.29) | 1.12 (0.75, 1.65) | 1.11 (0.69, 1.79) | 1.10 (0.88, 1.37) | 1.15 (0.80, 1.64) | 0.90 (0.52, 1.55)
Gene ZNRF3 | p | 0.195 | 0.584 | 0.675 | 0.414 | 0.465 | 0.696
Disc. Race European | P-Het | 0.881

Chr 1q32.1 | RAF Case;Ctl | 0.65; 0.63 | 0.80; 0.76 | 0.86; 0.86 | 0.35; 0.31 | 0.60; 0.60 | 0.48; 0.51
RSID (RISK/REF) rs3790844 (A/G) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 200007432 | OR (CI) | 1.09 (0.95, 1.24) | 1.16 (0.81, 1.65) | 0.95 (0.70, 1.27) | 1.15 (0.92, 1.44) | 0.99 (0.74, 1.33) | 0.85 (0.54, 1.34)
Gene NR5A2 | p | 0.222 | 0.42 | 0.711 | 0.209 | 0.967 | 0.491
Disc. Race European | P-Het | 0.324

Chr 6p25.3 | RAF Case;Ctl | 0.49; 0.48 | 0.46; 0.45 | 0.60; 0.57 | 0.35; 0.37 | 0.64; 0.59 | 0.21; 0.28
RSID (RISK/REF) rs9502893 (C/T) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 1340189 | OR (CI) | 1.07 (0.95, 1.21) | 1.20 (0.89, 1.64) | 1.21 (0.98, 1.51) | 0.93 (0.75, 1.17) | 1.25 (0.93, 1.67) | 0.68 (0.40, 1.17)
Gene FOXQ1 | p | 0.24 | 0.233 | 0.075 | 0.543 | 0.142 | 0.153
Disc. Race Japanese | P-Het | 0.162

Chr 8q24.21 | RAF Case;Ctl | 0.73; 0.70 | 0.77; 0.71 | 0.45; 0.44 | 0.97; 0.98 | 0.85; 0.80 | 0.79; 0.82
RSID (RISK/REF) rs1561927 (T/C) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 129568078 | OR (CI) | 1.09 (0.94, 1.27) | 1.26 (0.90, 1.76) | 1.01 (0.82, 1.26) | 0.72 (0.37, 1.37) | 1.43 (0.97, 2.10) | 0.84 (0.49, 1.44)
Gene MIR1208 | p | 0.243 | 0.178 | 0.896 | 0.335 | 0.064 | 0.541
Disc. Race European | P-Het | 0.221

Chr 21q21.3 | RAF Case;Ctl | 0.48; 0.46 | 0.52; 0.51 | 0.39; 0.39 | 0.54; 0.55 | 0.44; 0.45 | 0.63; 0.47
RSID (RISK/REF) rs372883 (T/C) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 30717737 | OR (CI) | 1.07 (0.95, 1.20) | 1.02 (0.75, 1.37) | 1.08 (0.87, 1.34) | 0.97 (0.78, 1.20) | 0.98 (0.74, 1.29) | 2.02 (1.28, 3.20)
Gene BACH1 | p | 0.256 | 0.921 | 0.473 | 0.782 | 0.886 | 0.002
Disc. Race Chinese | P-Het | 0.071

Chr 5p13.1 | RAF Case;Ctl | 0.90; 0.87 | 1.00; 1.00 | 1.00; 1.00 | 0.73; 0.73 | 0.89; 0.89 | 0.76; 0.70
RSID (RISK/REF) rs2255280 (A/C) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 39394989 | OR (CI) | 1.10 (0.91, 1.33) | 7.65e+05 (0.00, Inf) | 1.54 (0.33, 7.30) | 1.05 (0.83, 1.34) | 1.06 (0.68, 1.66) | 1.25 (0.75, 2.10)
Gene DAB2 | p | 0.33 | 0.242 | 0.566 | 0.681 | 0.799 | 0.386
Disc. Race Chinese | P-Het | 0.648

Chr 2p13.3 | RAF Case;Ctl | 0.29; 0.27 | 0.30; 0.29 | 0.18; 0.18 | 0.43; 0.39 | 0.28; 0.31 | 0.26; 0.24
RSID (RISK/REF) rs1486134 (G/T) | Info Score (R2) | | 0.99 (IP) | 0.99 (IP) | 1.00 (IP) | 0.99 (IP) | 1.00 (IP)
POS 67639769 | OR (CI) | 1.06 (0.93, 1.20) | 1.20 (0.86, 1.67) | 1.06 (0.81, 1.37) | 1.19 (0.96, 1.48) | 0.84 (0.61, 1.14) | 1.16 (0.69, 1.95)
Gene ETAA1 | p | 0.377 | 0.291 | 0.69 | 0.107 | 0.252 | 0.584
Disc. Race European | P-Het | 0.391

Chr 22q13.32 | RAF Case;Ctl | 0.33; 0.32 | 0.36; 0.39 | 0.29; 0.32 | 0.41; 0.40 | 0.28; 0.27 | 0.23; 0.23
RSID (RISK/REF) rs5768709 (G/A) | Info Score (R2) | | 0.94 (IP) | 0.94 (IP) | 0.94 (IP) | 0.93 (IP) | 0.93 (IP)
POS 48929569 | OR (CI) | 0.96 (0.84, 1.08) | 0.86 (0.62, 1.18) | 0.90 (0.71, 1.13) | 1.03 (0.82, 1.29) | 1.04 (0.75, 1.44) | 1.00 (0.58, 1.72)
Gene FAM19A5 | p | 0.487 | 0.343 | 0.346 | 0.786 | 0.807 | 0.998
Disc. Race Chinese | P-Het | 0.707

Chr 7p13 | RAF Case;Ctl | 0.88; 0.89 | 0.77; 0.72 | 0.93; 0.92 | 0.97; 0.97 | 0.78; 0.77 | 0.88; 0.88
RSID (RISK/REF) rs17688601 (C/A) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 40866663 | OR (CI) | 1.07 (0.89, 1.29) | 1.34 (0.95, 1.91) | 0.97 (0.65, 1.46) | 0.94 (0.51, 1.73) | 1.03 (0.74, 1.44) | 0.94 (0.47, 1.86)
Gene SUGCT | p | 0.487 | 0.095 | 0.888 | 0.846 | 0.852 | 0.85
Disc. Race European | P-Het | 0.631

Chr 5p15.33 | RAF Case;Ctl | 0.79; 0.78 | 0.79; 0.73 | 0.89; 0.88 | 0.74; 0.77 | 0.80; 0.79 | 0.50; 0.50
RSID (RISK/REF) rs2736098 (C/T) | Info Score (R2) | | 0.96 (IP) | 0.95 (IP) | 0.96 (IP) | 0.96 (IP) | 0.97 (IP)
POS 1294086 | OR (CI) | 1.04 (0.90, 1.21) | 1.29 (0.90, 1.85) | 1.00 (0.72, 1.41) | 0.83 (0.65, 1.06) | 1.12 (0.79, 1.60) | 1.02 (0.65, 1.62)
Gene TERT | p | 0.558 | 0.152 | 0.978 | 0.151 | 0.517 | 0.92
Disc. Race European | P-Het | 0.062

Chr 10q26.11 | RAF Case;Ctl | 0.65; 0.67 | 0.56; 0.56 | 0.85; 0.83 | 0.49; 0.52 | 0.58; 0.55 | 0.77; 0.73
RSID (RISK/REF) rs12413624 (T/A) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT)
POS 120278944 | OR (CI) | 1.03 (0.91, 1.17) | 1.05 (0.78, 1.40) | 1.01 (0.76, 1.34) | 0.90 (0.72, 1.11) | 1.15 (0.87, 1.53) | 1.23 (0.73, 2.08)
Gene PRLHR | p | 0.593 | 0.756 | 0.95 | 0.31 | 0.331 | 0.431
Disc. Race Chinese | P-Het | 0.52

Chr 7p12 | RAF Case;Ctl | 0.88; 0.88 | 0.88; 0.88 | 0.75; 0.76 | 0.99; 0.99 | 0.92; 0.91 | 0.99; 0.94
RSID (RISK/REF) rs73328514 (A/T) | Info Score (R2) | | 0.96 (IP) | 0.89 (IP) | 0.19 (IP) | 0.94 (IP) | 0.88 (IP)
POS 47488569 | OR (CI) | 1.04 (0.85, 1.27) | 1.09 (0.67, 1.78) | 0.99 (0.77, 1.29) | 0.33 (0.02, 5.54) | 1.05 (0.61, 1.79) | 6.05 (0.71, 51.32)
Gene TNS3 | p | 0.683 | 0.719 | 0.954 | 0.48 | 0.863 | 0.028
Disc. Race European | P-Het | 0.299

Chr 8q21.11 | RAF Case;Ctl | 0.62; 0.61 | 0.58; 0.56 | 0.86; 0.85 | 0.52; 0.49 | 0.40; 0.43 | 0.43; 0.44
RSID (RISK/REF) rs2941471 (A/G) | Info Score (R2) | | 1.00 (IP) | 0.98 (IP) | 1.00 (IP) | 0.99 (IP) | 0.99 (IP)
POS 76470404 | OR (CI) | 1.00 (0.88, 1.14) | 1.16 (0.85, 1.58) | 0.95 (0.70, 1.30) | 1.12 (0.91, 1.39) | 0.94 (0.70, 1.27) | 0.99 (0.63, 1.54)
Gene HNF4G | p | 0.949 | 0.334 | 0.762 | 0.295 | 0.685 | 0.949
Disc. Race European | P-Het | 0.817

Chr 7q36.2 | RAF Case;Ctl | 0.08; 0.08 | 0.05; 0.05 | 0.11; 0.09 | 0.12; 0.12 | 0.04; 0.05 | 0.01; 0.03
RSID (RISK/REF) rs6464375 (T/C) | Info Score (R2) | | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 1.00 (GT) | 0.99 (GT)
POS 153625843 | OR (CI) | 1.01 (0.82, 1.23) | 1.19 (0.62, 2.31) | 1.17 (0.83, 1.65) | 0.93 (0.67, 1.30) | 0.74 (0.37, 1.52) | 0.36 (0.05, 2.71)
Gene DPP6 | p | 0.96 | 0.606 | 0.37 | 0.682 | 0.396 | 0.243
Disc. Race Japanese | P-Het | 0.505

Chr 16p12.3 | RAF Case;Ctl | 0.02; 0.02 | 0.00; 0.00 | 0.00; 0.00 | 0.06; 0.07 | 0.00; 0.00 | 0.08; 0.05
RSID (RISK/REF) rs78193826 (T/C) | Info Score (R2) | | 0.07 (IP) | 0.94 (GT) | 1.00 (GT) | 0.89 (GT) | 1.00 (GT)
POS 20328666 | OR (CI) | 1.01 (0.70, 1.45) | 1.70e+22 (0.00, 2.13e+50) | 0.64 (0.06, 6.43) | 0.87 (0.57, 1.34) | 0.76 (0.04, 13.48) | 1.93 (0.85, 4.36)
Gene GP2 | p | 0.978 | 0.005 | 0.692 | 0.515 | 0.844 | 0.143
Disc. Race Japanese | P-Het | 0.08
POS = Position using Genome Reference Consortium Human Build 37 (GRCh37); Disc. Race =
Ethnic/racial group in which the SNP was discovered to be associated with pancreatic ductal
adenocarcinoma; Chr = Chromosome; RSID = Reference SNP cluster ID; ALT = Alternative (risk) allele;
REF = Reference Allele; RAF = Risk allele frequency; OR = Odds ratio; CI = Confidence interval; p = p-value
from likelihood ratio test for the tested single nucleotide polymorphism; P-Het = p-value for
heterogeneity test of odds ratios between ethnic/racial groups, using likelihood ratio test. IP = Imputed;
GT = Genotype.
Appendix F: Comparison of risk allele frequencies (RAFs) within the Multiethnic Cohort (MEC)
and Southern Community Cohort Study (SCCS) with the reported RAF from the most recent
GWAS results reported in the literature.
Appendix G: Multiethnic and ethnic-specific polygenic risk score odds ratios (ORs) and 95%
confidence intervals (CIs)
Weights were taken from the multiethnic replication analysis. The multiethnic analysis used risk score
percentile bins defined among controls in the complete multiethnic sample. The ethnic-specific analyses
used risk score percentile bins defined from the ethnic-specific risk score distribution among
controls. Bolded race labels indicate likelihood ratio tests with p < 0.05.
Appendix H: Comparison of polygenic risk score calculation methods in the multiethnic analysis.
In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid
unstable scores
Appendix I: Comparison of polygenic risk score calculation methods in the white analysis. In
ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid
unstable scores.
Appendix J: Comparison of polygenic risk score calculation methods in the African American
analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used
to avoid unstable scores.
Appendix K: Comparison of polygenic risk score calculation methods in the Japanese analysis. In
ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid
unstable scores.
Appendix L: Comparison of polygenic risk score calculation methods in the Latino analysis. In
ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used to avoid
unstable scores.
Appendix M: Comparison of polygenic risk score calculation methods in the Native Hawaiian
analysis. In ethnic-specific internal scores, if SNP MAF < 0.05, the multiethnic weight was used
to avoid unstable scores.
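The weight fallback rule stated in the captions of Appendices H through M can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis code: it assumes that SNP weights are log odds ratios, that the score is a weighted sum of risk allele counts, and that MAF is derived from the risk allele frequency. All function names are hypothetical. The example odds ratios and frequencies are the Japanese-American and multiethnic values for rs505922 and rs35226131 in Appendix E.

```python
import math

MAF_THRESHOLD = 0.05  # threshold stated in the appendix captions


def minor_allele_frequency(risk_allele_freq: float) -> float:
    """MAF derived from a risk allele frequency (assumes a biallelic SNP)."""
    return min(risk_allele_freq, 1.0 - risk_allele_freq)


def choose_weight(ethnic_or: float, multiethnic_or: float, risk_allele_freq: float) -> float:
    """Return a log(OR) weight, using the multiethnic estimate for rare SNPs."""
    if minor_allele_frequency(risk_allele_freq) < MAF_THRESHOLD:
        return math.log(multiethnic_or)
    return math.log(ethnic_or)


def risk_score(allele_counts, weights) -> float:
    """Weighted sum of risk allele counts (0, 1, or 2 per SNP, or dosages)."""
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# rs505922: common in Japanese Americans, so the ethnic-specific OR is kept.
w_abo = choose_weight(ethnic_or=1.19, multiethnic_or=1.30, risk_allele_freq=0.45)
# rs35226131: risk allele frequency of 1.00, so the multiethnic OR is used.
w_tert = choose_weight(ethnic_or=6.83, multiethnic_or=0.69, risk_allele_freq=1.00)

print(risk_score([1, 2], [w_abo, w_tert]))
```

The second SNP shows why the fallback matters: its ethnic-specific odds ratio has an extremely wide confidence interval, so using it as a weight would let a single poorly estimated coefficient dominate the score.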
Appendix N: Participant Exclusions Prior to Air Pollution Analysis
Appendix O: Global Proportional Hazard Violation Test Results
Pollutant (p-value of time interaction)
Variable | PM2.5 | PM10 | NOX | NO2
Pollutant Level | 0.716 | 0.050* | 0.333 | 0.166
Age at cohort entry | 0.609 | 0.575 | 0.504 | 0.576
BMI | 0.905 | 0.909 | 0.832 | 0.908
Birth Year | 0.996 | 0.995 | 0.977 | 0.995
Diabetes | 0.063 | 0.050 | 0.066 | 0.050
Ethnicity | 0.205 | 0.213 | 0.263 | 0.213
Sex | 0.104 | 0.095 | 0.118 | 0.095
Smoking Status | 0.025 | 0.024 | 0.031 | 0.024
Calendar Time (months) | 0.273 | 0.262 | 0.248 | 0.263
Alcohol | 0.159 | 0.160 | 0.106 | 0.160
Work Hist | 0.665 | 0.684 | 0.109 | 0.684
AHEI 2010 | 0.159 | 0.109 | 0.075 | 0.108
* Test for variable not violated in full model.
Appendix P: Monthly Air Pollutant Measures Over Study Duration by Ethnicity
Appendix Q: Full Excess Relative Risk Model Terms.
Term | Estimate | Std. Error | Test Statistic | P value* | P LRT
Log-linear term 0
AA | -7.685 | 0.08101 | -94.86 | <0.001
NH | -7.342 | 0.1038 | -70.74 | <0.001
LA | -7.965 | 0.07455 | -106.8 | <0.001
JA | -7.728 | 0.06094 | -126.8 | <0.001
W | -7.925 | 0.07529 | -105.3 | <0.001
AA * logage70 | 4.948 | 0.5419 | 9.13 | <0.001
NH * logage70 | 4.784 | 0.6913 | 6.92 | <0.001
LA * logage70 | 5.626 | 0.574 | 9.801 | <0.001
JA * logage70 | 4.931 | 0.3978 | 12.4 | <0.001
W * logage70 | 4.914 | 0.4863 | 10.11 | <0.001
Linear term 1
AA * py50 | 1.188 | 0.4736 | 2.509 | 0.0121 | 0.38
NH * py50 | 0.6309 | 0.5481 | 1.151 | 0.25
LA * py50 | 0.4552 | 0.5277 | 0.8625 | 0.388
JA * py50 | 1.423 | 0.379 | 3.756 | <0.001
W * py50 | 0.5359 | 0.2959 | 1.811 | 0.0701
Log-linear term 1
AA * yearsquit | -0.1294 | 0.1071 | -1.208 | 0.227 | 0.83
NH * yearsquit | -0.1488 | 0.2775 | -0.5362 | >0.5
LA * yearsquit | -0.1228 | 0.2932 | -0.4186 | >0.5
JA * yearsquit | -0.06131 | 0.02764 | -2.218 | 0.0265
W * yearsquit | -0.08183 | 0.09886 | -0.8277 | 0.408
* P values are two-sided from the Wald test.
†logage70 = log(age in years / 70), so that exp(eth group) gives the absolute risk at age 70 for a never smoker.
‡py50 = pack-years/50, so that the estimate corresponds to the estimated excess relative risk at 50 pack-years for each
interaction group (i.e. ethnic group * logage70).
§The modification of excess relative risks due to smoking is obtained using the linear combination ("lincomb") statement in
EPICURE.
||AA, NH, LA, JA, and W refer to African Americans, Native Hawaiians, Latinos, Japanese Americans, and Whites,
respectively.
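To show how the terms in this table combine, the following Python sketch evaluates the model for one ethnic group. It assumes the usual product form for this kind of excess relative risk model, rate = exp(log-linear term 0) × (1 + linear term 1 × exp(log-linear term 1)). The appendix does not write the model equation out, so this form and all function names are assumptions. The coefficients are the Japanese American (JA) estimates from the table.

```python
import math

# JA estimates copied from the table; the dictionary keys are illustrative.
JA = {"intercept": -7.728, "logage70": 4.931, "py50": 1.423, "yearsquit": -0.06131}


def baseline_rate(params, age_years):
    """Never-smoker rate: exp(intercept + slope * log(age / 70))."""
    return math.exp(params["intercept"] + params["logage70"] * math.log(age_years / 70.0))


def excess_relative_risk(params, pack_years, years_quit=0.0):
    """ERR = beta * (pack-years / 50) * exp(gamma * years since quitting)."""
    return params["py50"] * (pack_years / 50.0) * math.exp(params["yearsquit"] * years_quit)


def smoker_rate(params, age_years, pack_years, years_quit=0.0):
    """Assumed product form: baseline rate times (1 + ERR)."""
    err = excess_relative_risk(params, pack_years, years_quit)
    return baseline_rate(params, age_years) * (1.0 + err)


# ERR at 50 pack-years for a current smoker, then for 10 years since quitting.
print(excess_relative_risk(JA, pack_years=50))
print(excess_relative_risk(JA, pack_years=50, years_quit=10))
```

Under this parameterization the py50 estimate is read directly as the excess relative risk at 50 pack-years for a current smoker, and a negative yearsquit coefficient shrinks that excess as time since quitting increases.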
Appendix R: Sensitivity analysis; BMI’s effect on smoking estimates, when included as a baseline term.
Term | No BMI in Model: beta | se | t | p | BMI in Model: beta | se | t | p | Estimate: Percent Change | STD Mean Diff
Linear term 1
AA * py50 | 1.188 | 0.4736 | 2.509 | 0.0121 | 1.342 | 0.5066 | 2.649 | 0.00808 | -13.0 | -0.12
NH * py50 | 0.6309 | 0.5481 | 1.151 | 0.25 | 0.7037 | 0.5723 | 1.23 | 0.219 | -11.5 | -0.11
LA * py50 | 0.4552 | 0.5277 | 0.8625 | 0.388 | 0.5224 | 0.5627 | 0.928 | 0.353 | -14.8 | -0.14
JA * py50 | 1.423 | 0.379 | 3.756 | <0.001 | 1.413 | 0.3824 | 3.694 | <0.001 | 0.7 | 0.01
W * py50 | 0.5359 | 0.2959 | 1.811 | 0.0701 | 0.5782 | 0.3124 | 1.851 | 0.0642 | -7.9 | -0.08
Log-linear term 1
AA * yearsquit | -0.1294 | 0.1071 | -1.208 | 0.227 | -0.1369 | 0.1038 | -1.319 | 0.187 | -5.8 | -0.06
NH * yearsquit | -0.1488 | 0.2775 | -0.5362 | >0.5 | -0.154 | 0.2627 | -0.5871 | >0.5 | -3.5 | -0.03
LA * yearsquit | -0.1228 | 0.2932 | -0.4186 | >0.5 | -0.1389 | 0.3004 | -0.4623 | >0.5 | -13.1 | -0.12
JA * yearsquit | -0.06131 | 0.02764 | -2.218 | 0.0265 | -0.0674 | 0.03044 | -2.214 | 0.0268 | -9.9 | -0.09
W * yearsquit | -0.08183 | 0.09886 | -0.8277 | 0.408 | -0.09769 | 0.1134 | -0.861 | 0.389 | -19.4 | -0.18
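The appendix does not define the two comparison columns. The short Python sketch below gives formulas that appear consistent with the tabulated values; they are inferred from the numbers and are not taken from the thesis text. Percent change is computed relative to the estimate without BMI, and the standardized difference divides by the mean of the two estimates. The function names are illustrative.

```python
def percent_change(beta_no_bmi: float, beta_bmi: float) -> float:
    """Change in the estimate when BMI is added, as a percent of the no-BMI estimate."""
    return 100.0 * (beta_no_bmi - beta_bmi) / beta_no_bmi


def standardized_mean_difference(beta_no_bmi: float, beta_bmi: float) -> float:
    """Difference between the two estimates divided by their mean."""
    return (beta_no_bmi - beta_bmi) / ((beta_no_bmi + beta_bmi) / 2.0)


# AA * py50 row: estimates without and with BMI in the model.
print(round(percent_change(1.188, 1.342), 1))
print(round(standardized_mean_difference(1.188, 1.342), 2))
```

With this sign convention, a negative percent change means the estimate moved further from zero once BMI was included.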
Asset Metadata
Creator: Bogumil, David Daniel Alesse (author)
Core Title: Identifying genetic, environmental, and lifestyle determinants of ethnic variation in risk of pancreatic cancer
School: Keck School of Medicine
Degree: Doctor of Philosophy
Degree Program: Epidemiology
Degree Conferral Date: 2021-08
Publication Date: 07/21/2021
Defense Date: 05/11/2021
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: Air pollution, excess risk, genetics, OAI-PMH Harvest, pancreatic cancer, pancreatic ductal adenocarcinoma, PDAC, PM2.5, smoking
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Setiawan, Veronica Wendy (committee chair), Conti, David (committee member), McKean-Cowdin, Roberta (committee member), Pandol, Stephen (committee member), Wu, Anna (committee member)
Creator Email: david.bogumil@gmail.com, dbogumil@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC15613662
Unique identifier: UC15613662
Legacy Identifier: etd-BogumilDav-9808
Document Type: Dissertation
Rights: Bogumil, David Daniel Alesse
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu