Utility of Polygenic Risk Score with Biomarkers and Lifestyle Factors
in the Multiethnic Cohort Study
by
Alisha Chou
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(EPIDEMIOLOGY)
December 2022
Acknowledgements
I would like to first acknowledge the scientific expertise, contributions and
mentorship of my committee members. I would like to express my utmost gratitude to
my mentor and committee chair, Christopher A. Haiman, ScD, for giving me the
opportunity to explore the exciting field of epidemiology and statistical genetics,
believing in me on my transition into this inspirational field of study, and giving me
guidance on academic subjects and personal growth; David V. Conti, PhD, for his
guidance and instruction on many exciting and cutting-edge statistical methods in
genetic epidemiology; V. Wendy Setiawan, PhD, for her mentorship, constant support,
and expert guidance in NAFLD studies; Adam de Smith, PhD, for his insightful feedback
and continuing support; and Michael F. Press, MD, PhD, for his clinical expertise in
breast cancer, valuable feedback and continuing encouragement. I am very grateful for
the scientific guidance and kindness that I have received from my committee members
for navigating through this process.
I would also like to extend special thanks to MEC members: Fei Chen, PhD, for
her guidance, countless hours of discussion, and feedback on the genetic-lifestyle
interaction study; Burcu Darst, PhD, for her time, guidance, and introducing me to the
subject of PRS; Grace Sheng for preparing and providing helpful insights on managing
genetic data; Victor Hom for genetic data pipeline expertise; and Peggy Wan for data
preparation and guiding me on my first ever statistical analysis.
Finally, I would like to thank my friends, my family, and, most importantly, my partner,
Gary Chan, MD, and my parents, Tzu-Ming Kuo and Jin-Wu Chou, for their
unconditional love and support through this rewarding journey. I am empowered to
achieve this milestone today because of you.
Table of Contents
Acknowledgements
Abstract
Chapter 1: Introduction
  Polygenic Risk Score (PRS)
    Overview of PRS
    PRS in the Multiethnic Cohort Study (MEC)
    PRS with prostate-specific antigen (PSA) screening
    PRS with lifestyle factors in gene-environment interaction
    Characterization and evaluation of PRSs
  Prostate cancer epidemiology
    Overview of prostate cancer
    Non-genetic risk factors of prostate cancer
    Genetic susceptibility of prostate cancer
    Genome-wide association studies (GWAS) and polygenic risk score (PRS) of prostate cancer
    Aggressive prostate cancer
    Prostate-specific antigen (PSA) screening
  Breast cancer epidemiology
    Overview of breast cancer
    Breast cancer subtypes
    Lifestyle and environmental risk factors of breast cancer
    Genetic susceptibility of breast cancer
    Genome-wide association studies (GWAS) and polygenic risk score (PRS) of breast cancer
  Nonalcoholic fatty liver disease epidemiology
    Background: Hepatic Steatosis
    Overview of nonalcoholic fatty liver disease
    Subtypes of nonalcoholic fatty liver disease
    Mortality with non-alcoholic fatty liver disease
    Lifestyle and environmental risk factors of nonalcoholic fatty liver disease
    Genetic susceptibility of nonalcoholic fatty liver disease
    Polygenic risk score (PRS) of nonalcoholic fatty liver disease
  References
Chapter 2: Association of Prostate-Specific Antigen Levels with Prostate Cancer Risk in a Multiethnic Population: Stability over Time and Comparison with Polygenic Risk Score
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
  Tables
  Figures
Chapter 3: Interaction of Polygenic Risk Score and Lifestyle Factors on the Risk of Breast Cancer in a Multiethnic Population
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
  Tables
  Figures
Chapter 4: Characterization and Evaluation of Polygenic Risk Scores for Nonalcoholic Fatty Liver Disease in a Multiethnic Population
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  References
  Tables
Chapter 5: Conclusions and Future Directions
  Association of Prostate-Specific Antigen Levels with Prostate Cancer Risk in a Multiethnic Population: Stability over Time and Comparison with Polygenic Risk Score
  Interaction of Polygenic Risk Score and Lifestyle Factors on the Risk of Breast Cancer in a Multiethnic Population
  Characterization and Evaluation of Polygenic Risk Scores for Nonalcoholic Fatty Liver Disease in a Multiethnic Population
  Conclusion
  References
Abstract
Genome-wide association studies (GWAS) have identified thousands of common
genetic variants, mostly of modest effects, that are associated with hundreds of complex
human diseases and traits. Even though individually the associated variants mostly
modify the disease risks only marginally, for many diseases the cumulative impact of
risk across the genome is substantial. Polygenic risk scores (PRS) measuring the
cumulative genetic burden have been shown to have great potential in disease risk
stratification. One of the key public health goals is to identify individuals at high risk of a
given disease to allow enhanced screening or preventive therapies. PRS, alone or
combined with environmental risk factors or biomarkers, has the potential to offer
substantial stratification of a population into distinct risk categories for common complex
diseases and may in turn aid in targeted screening. Despite promising results, current
PRS are predominantly developed and assessed in populations of European ancestry
and have not been well studied in non-European populations, leading to lack of
generalizability. Evaluating PRS in non-European ancestry populations in research is
essential to broad clinical application of PRS to the general population. In this
dissertation, I aim to address this question by investigating the utility of PRS with
biomarkers and lifestyle factors in a racially and ethnically diverse population in the
Multiethnic Cohort Study (MEC).
In Chapter 2, in addition to assessing prostate-specific antigen (PSA), measured
many years before prostate cancer diagnosis, as a marker of prostate cancer risk, we
also evaluated PSA as an indicator of prostate cancer risk in comparison to PRS. In
Chapter 3, we investigated the association and interaction between breast cancer PRS
and lifestyle factors on risk of breast cancer. In Chapter 4, we characterized and
evaluated previously published nonalcoholic fatty liver disease (NAFLD) PRSs with the
goal of constructing a new PRS incorporating previously published independent SNPs
to assess whether the performance of NAFLD PRS could be improved across different
populations. Collectively, these investigations highlight the importance of developing
ethnic-specific polygenic risk scores that are informative to improve prevention,
screening, and treatment of diseases across diverse populations.
Chapter 1: Introduction
Polygenic Risk Score (PRS)
Overview of PRS
Genome-wide association studies (GWAS) have identified more than 10,000
genetic variants, mostly of modest effects, that are associated with hundreds of complex
human diseases and traits (1). Results from GWASs suggested that almost all complex
human traits and diseases are polygenic in nature, affected by multiple variants that
each explain a small proportion of phenotypic variance (2). A polygenic risk score (PRS) for a given
trait is the aggregate of a set of germline single-nucleotide polymorphisms (SNPs),
weighted by the estimated strength of association between each SNP and the trait. PRS is
most commonly calculated by summing the dosage of each risk allele carried by an
individual (ranging from 0 to 2), weighting each variant by the natural logarithm of the
relative risk extracted from the GWAS. For each individual $i$, this results in a single
value on a continuous scale, where $\hat{\beta}_j$ is the weight for variant $j$ obtained from GWAS
summary statistics and $M$ is the total number of variants included (3):

$$\mathrm{PRS}_i = \sum_{j=1}^{M} \hat{\beta}_j \times \mathrm{dosage}_{ij}$$
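
As a minimal sketch of the weighted sum defined above, the following Python snippet computes a PRS for one individual. The function name, the three odds ratios, and the dosages are hypothetical values chosen for illustration; they are not taken from any GWAS or from the analyses in this dissertation.

```python
import math
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the sum over variants of weight_j * dosage_ij for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    for d in dosages:
        # Risk-allele dosage ranges from 0 to 2 (imputed dosages may be fractional).
        if not 0.0 <= d <= 2.0:
            raise ValueError("each dosage must lie between 0 and 2")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical per-allele odds ratios for three variants (illustrative only).
odds_ratios = [1.10, 1.25, 1.05]
# Weights are the natural logarithms of the per-allele effect estimates.
weights = [math.log(value) for value in odds_ratios]
# Hypothetical risk-allele counts carried by one individual.
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, weights)
print(round(score, 4))
```

Because the weights are on the log scale, the score is additive across variants, and a variant with dosage 0 contributes nothing to the total.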
One of the key public health goals is to identify individuals at high risk of a given
disease to allow enhanced screening or preventive therapies. PRS has the potential to
offer substantial stratification of a population into distinct risk categories for common
complex diseases and may in turn aid in targeted screening. A study using UK Biobank
data and investigating 16 cancers demonstrated that stratifying on levels of PRS (high
risk: ≥80%, average: >20–<80%, low risk: ≤20%) identified significantly divergent 5-year
risk trajectories after accounting for family history (i.e., prostate cancer (P ≤ $4.5 \times 10^{-25}$)
and breast cancer (P ≤ $4.6 \times 10^{-32}$)) and modifiable risk factors (i.e., colorectal cancer
(P ≤ $1.8 \times 10^{-42}$) and melanoma (P ≤ $3.5 \times 10^{-139}$)). That is, individuals with a high PRS
were predicted to have higher overall risk than individuals with average or low PRS,
after accounting for family history and modifiable risk factors (4).
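
The percentile cut-points quoted above (high risk: ≥80%, average: >20–<80%, low risk: ≤20%) can be illustrated with a short Python sketch. The percentile definition used here (the share of reference scores less than or equal to the individual's score) and the reference distribution are assumptions made for illustration; the cited study may define percentiles differently.

```python
from bisect import bisect_right
from typing import Sequence


def prs_percentile(score: float, reference_scores: Sequence[float]) -> float:
    """Percentage of reference scores less than or equal to `score` (assumed definition)."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


def risk_category(percentile: float) -> str:
    """Map a PRS percentile to the three strata quoted in the text."""
    if percentile >= 80:
        return "high"
    if percentile <= 20:
        return "low"
    return "average"


# Hypothetical reference distribution: 100 evenly spaced scores.
reference = list(range(1, 101))

categories = {s: risk_category(prs_percentile(s, reference)) for s in (20, 21, 50, 80)}
print(categories)
```

In practice the reference distribution would come from a population that matches the individual being scored, since a percentile is only meaningful relative to a comparable group.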
Despite promising results, current PRS are predominantly developed and
assessed in populations of European ancestry (5-9). It was reported that across 733
studies on polygenic risk score from 2008 to 2017, 67% were exclusively conducted in
European ancestry participants, 19% were conducted exclusively in Asian populations,
particularly in East Asian countries, and only 3.8% of studies were conducted in African,
Latino/Hispanic, or Indigenous people combined (10). This may lead to reduced
predictive power of PRS in non-European ancestry populations because of differences
in variant frequencies and linkage disequilibrium patterns between populations (11). In a
systematic review, PRS performance across different disease outcomes was
worst in African ancestry samples (11). The median effect size of
PRS in African ancestry samples was on average 42% that of the matched European
ancestry samples (p = $3.7 \times 10^{-6}$). Relative to matched European ancestry samples,
performance was also lower in East Asian samples (95%), but not significantly so (11).
Evaluating PRS in non-European ancestry populations is essential before utilizing PRS
in public health and clinical settings.
PRS in the Multiethnic Cohort Study (MEC)
In this report, we will assess the utility of PRS with biomarkers and lifestyle
factors in an ethnically diverse population in the MEC. The MEC is an on-going
prospective study of over 215,000 residents of Hawaii and Los Angeles recruited at age
45 to 75 from 1993 to 1996, primarily from five self-reported racial/ethnic groups:
Japanese Americans, Native Hawaiians, African Americans, Latinos, and Whites. The
cohort participants were identified through driver’s license files from the department of
motor vehicles, voter registration lists, and Health Care Financing Administration
(Medicare) data files of California and Hawaii. Between 2001 and 2006, approximately
70,000 participants contributed blood and urine specimens to the biorepository.
Approximately 35,000 MEC participants in the biorepository have GWAS data and the
remaining 35,000 are currently being genotyped. Incident cases of cancer are identified
by linking the cohort to the SEER tumor registries in Hawaii and California. Mortality and
its cause were determined by routine linkages to state death files and to the National
Death Index for deaths that occurred outside of Hawaii and California. This large cohort
provides prospective data on genetic factors, dietary factors, and other cancer and
chronic disease risk factors in a multiethnic population that allows us to investigate our
research aims.
PRS with prostate-specific antigen (PSA) screening
For project 1, in addition to assessing prostate-specific antigen (PSA), measured
many years before prostate cancer diagnosis, as a marker of prostate cancer risk, we
also evaluated PSA as an indicator of prostate cancer risk in comparison to PRS.
Studies have shown that while population screening for prostate cancer using PSA can
prevent prostate cancer death, this benefit comes at a cost of overdiagnosis and
subsequent overtreatment (12, 13). This leads to the question of whether targeted
screening for prostate cancer can improve the benefit-to-harm ratio of screening. A
study using a prostate cancer PRS constructed from 66 variants found that the
proportion of over-diagnosed cases in prostate cancer screening decreased with an
increase in polygenic risk score (14). In PRS quartiles 1 to 4, an estimated 43%, 30%, 25%,
and 19% of cases, respectively, were likely to be over-diagnosed cancers, assuming a
prostate-specific antigen test sensitivity of 80% (14). This suggests that targeting screening
to men with higher polygenic risk scores could reduce the number of men likely to be
over-diagnosed by PSA screening.
PRS with lifestyle factors in gene-environment interaction
For project 2, we investigated the association and interaction between breast
cancer PRS and lifestyle factors on risk of breast cancer. The etiology of cancer is
multifactorial, likely arising from a complex interaction between genetic and environmental
risk factors. Public health strategies for targeted prevention may be more effective with
knowledge of gene-environment (GxE) interaction to identify genetic subgroups at
higher exposure-specific disease risk. In addition to being leveraged to discover new
genetic risk factors, GxE interaction studies may improve disease risk prediction and
our understanding of disease etiology. However, because individual genetic variants have
weak effects, potential GxE interactions have small to moderate effect sizes, and
studies of GxE interaction based on single genetic variants have been inconclusive (15). It has
been shown that using a polygenic risk score (PRS), which aggregates common
genetic susceptibility loci of small effect, can help to mitigate the issue of multiple
testing in genome-wide GxE studies and improve detection of GxE interaction. A study
using 20 studies in the Breast Cancer Association Consortium (BCAC) found significant
interactions of PRS with alcohol consumption (P-interaction = 0.009) and the use of
menopausal hormone therapy (P-interaction = 0.038) on the risk of ER-positive breast
cancer (16).
Characterization and evaluation of PRSs
For project 3, we characterized and evaluated previously published nonalcoholic
fatty liver disease (NAFLD) PRSs with the goal of constructing a new PRS incorporating
previously published independent SNPs to assess whether the performance of NAFLD
PRS could be improved. Genome-wide association studies (GWAS) have identified
several variants associated with NAFLD development and disease severity (17-30).
Polygenic risk scores (PRS) comprised of these variants have been demonstrated to
predict NAFLD development, progression, and severity (31, 32). However, there is an
underrepresentation of non-European ancestry populations in NAFLD GWASs for
discovery, which creates a major gap in the use of genetic information for disease
prediction and prevention across populations (33-36). Few studies have addressed this
gap; a recent study in the Multiethnic Cohort Study (MEC) constructed a weighted 11-
SNP PRS after taking into account all previously identified GWAS-significant variants,
and reported a significant association between the PRS and NAFLD risk in a multiethnic
population (odds ratio [OR] per SD increase = 1.41; 95% confidence interval [CI] = 1.32-1.50)
(37). Recently, a large multi-ancestry GWAS in the Million Veteran Program
(MVP) identified 77 genetic loci associated with NAFLD, and replicated 17 SNPs in
external cohorts with histology-defined or radiologic imaging-defined NAFLD cases (38). These
two studies created the opportunity to develop and improve NAFLD PRS to estimate
overall genetic risk for NAFLD across populations (32, 38). The most commonly used
SNP set to construct a PRS is based on GWAS-identified risk variants for a disease.
However, because causal variants with smaller effect sizes or
lower allele frequencies are unlikely to be captured by GWAS without a sample
large enough to provide adequate power, it has been argued that less stringent inclusion
criteria should be applied. Methods
using GWAS summary statistics are popular strategies (39, 40). Penalized methods
such as LASSO and Ridge regression with coefficient shrinkage can perform SNP
selection across the genome simultaneously without setting arbitrary thresholds.
Bayesian models, which can incorporate external information such as functional
annotation and LD structure, have also been shown to perform well (40). In
addition to disease variant selection, careful consideration should be given to the
assignment of optimal weights, which should be extracted from a large multi-ancestry
population or, ideally, from a population of the same ancestry as the study population (41).
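
To make the idea of coefficient shrinkage concrete, the sketch below contrasts LASSO and ridge penalties in the simplest textbook setting, where predictors are assumed to be orthonormal so that each penalized coefficient has a closed form. This is an illustrative simplification rather than the procedure used in this dissertation: SNPs across the genome are correlated through LD, so real analyses rely on iterative solvers. The effect estimates and the penalty value are hypothetical.

```python
import math
from typing import List


def lasso_shrink(beta: float, lam: float) -> float:
    """Soft-thresholding: minimizer of 0.5*(beta - b)**2 + lam*abs(b)."""
    magnitude = max(abs(beta) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # small effects are set exactly to zero (variant dropped)
    return math.copysign(magnitude, beta)


def ridge_shrink(beta: float, lam: float) -> float:
    """Proportional shrinkage: minimizer of 0.5*(beta - b)**2 + 0.5*lam*b**2."""
    return beta / (1.0 + lam)


# Hypothetical unpenalized effect estimates for four variants.
unpenalized: List[float] = [0.30, -0.05, 0.02, -0.20]
penalty = 0.10  # hypothetical tuning parameter, normally chosen by cross-validation

lasso = [lasso_shrink(b, penalty) for b in unpenalized]
ridge = [ridge_shrink(b, penalty) for b in unpenalized]

print("lasso:", lasso)
print("ridge:", ridge)
```

In this setting the LASSO penalty removes the two variants whose estimates fall below the penalty, which is the sense in which it selects variants without a separate significance threshold, whereas the ridge penalty shrinks every coefficient toward zero but retains all of them.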
Prostate cancer epidemiology
Overview of prostate cancer
Prostate cancer is the most common non-cutaneous malignancy and the second
leading cause of cancer death among U.S. men, with an estimated 248,530 new cases
and 34,130 deaths in 2021 (42). Among men globally, prostate cancer is the second
most commonly diagnosed cancer and fifth leading cause of cancer death, with
an estimated 1.4 million new cases and 375,000 associated deaths in 2020 (43).
Ethnic disparities exist in prostate cancer incidence and mortality. The age-
adjusted incidence rate for prostate cancer in the U.S. is highest in African American
men (175.8 per 100,000 person-years), followed by non-Hispanic White (104.1),
Hispanic (90.9), and Asian/Pacific Islander (57.7) based on Surveillance, Epidemiology,
and End Results (SEER) Program data from 2014 to 2018 (42). Based on SEER
Program data from 2015 to 2019, the age-adjusted mortality rate for prostate cancer in
the U.S. is highest in African American men (36.9 per 100,000 person-years), followed
by non-Hispanic White (17.7), Hispanic (15.6), and Asian/Pacific Islander (8.6) (42).
African American men have the highest overall incidence, distant-stage incidence, and
mortality rate of prostate cancer, approximately 1.6-fold, 2.6-fold, and 2-fold higher,
respectively, compared with other populations (44, 45). Moreover, African American men
are more likely to have a family history, and to be diagnosed with aggressive prostate
cancer at a younger age (46, 47). The causes for this disparity remain unclear, and are
potentially multifactorial, consisting of social, environmental, and genetic influences
(48). Understanding the causes of this disparity is important in informing efforts in
reducing prostate cancer incidence and mortality in the diverse American population
and populations worldwide.
Non-genetic risk factors of prostate cancer
The most established non-genetic risk factors for prostate cancer are age,
race/ethnicity and family history. The incidence of prostate cancer increases greatly with
older age, from 1 in 298 for men aged 49 years or younger to 1 in 9 for men aged 70
years and older (42). While most prostate cancer occurs without a family history, studies
have shown association between family history and prostate cancer risk. Two large
meta-analyses have reported that having a first-degree relative with prostate cancer
was associated with approximately 2.5-fold increase in prostate cancer risk (49, 50).
Moreover, men with a younger affected first-degree relative, and men with more than
one affected first-degree relative have greater risk of prostate cancer. This association
between family history and risk of prostate cancer has been reported to be consistent
across different ethnic groups (51). While studies of type 2 diabetes, hormones, and other
environmental and lifestyle risk factors have been inconclusive, there has been
increasing evidence on the association between excess body weight and risk of
advanced prostate cancer (52).
Genetic susceptibility of prostate cancer
Twin studies have estimated that genetic factors account for 42% to 57% of the variation in
prostate cancer risk, the highest heritability among all cancer types (53, 54).
However, only a handful of rare germline mutations in DNA damage repair genes have
been reported to confer increased risk of prostate cancer, including BRCA1, BRCA2,
ATM, ATR, NBS1, mismatch repair-related genes (MSH2, MSH6, and PMS2), CHEK2,
RAD51D, and PALB2 (55). Mutations in BRCA2 and HOXB13 have been shown to
confer the highest risk for prostate cancer, leading to sevenfold and threefold increased
relative risk, respectively (56, 57). For prostate cancer, linkage studies, which tend to
identify rare and highly penetrant mutations for the disease, have reported inconsistent
results and lack of reproducibility. This suggests that prostate cancer is likely to be
complex and polygenic in nature, where each genetic susceptibility locus may confer a
small to moderate risk.
Genome-wide association studies (GWAS) and polygenic risk score (PRS) of prostate
cancer
Success in prostate cancer genome-wide association studies (GWAS) supports
the hypothesis of prostate cancer being a polygenic disease. A recent multiancestry
GWAS meta-analysis, along with prior GWAS and fine-mapping studies of prostate
cancer have together discovered 269 germline risk variants for prostate cancer (58).
GWAS in prostate cancer has also identified variants in the genomic region 8q24 where
the MYC oncogene is located (59). Polygenic risk scores (PRS) constructed from these
variants have been demonstrated to identify men who can be determined at birth to be
at substantially higher risk of developing prostate cancer (60). It was reported that the
top decile of the 269-variant prostate cancer PRS, compared to the 40th to 60th percentile,
was associated with odds ratios that ranged from 5.06 (95% confidence interval [CI],
4.18-5.29) for men of European ancestry to 3.74 (95% CI, 3.36-4.17) for men of African
ancestry.
Aggressive prostate cancer
Most men will develop histologic prostate cancer in their lifetime, but only a
subset will be diagnosed with clinically relevant disease. Identifying men who are at
increased risk of aggressive prostate cancer is therefore one of the most important
challenges in prostate cancer management. Studies have characterized aggressive
prostate cancer based on lethal prostate cancer, prostate cancer survival, Gleason
score, clinical stage, and PSA level. Several studies have identified genome-wide
significant risk loci for aggressive prostate cancer, such as variants rs4054823,
rs35148638, rs78943174, and rs11672691 (61-63). However, in the
ELLIPSE/PRACTICAL GWAS meta-analysis, including more than 140,000 men, no
genome-wide significant aggressive prostate cancer risk loci were identified (64, 65).
Prostate-specific antigen (PSA) screening
Prostate-specific antigen (PSA) is the most commonly used biomarker for
prostate cancer screening. While PSA screening has been shown to reduce prostate
cancer metastases and mortality (13, 66-68), it is associated with overdiagnosis and
overtreatment of indolent disease that is unlikely to progress (69, 70). Many men with
PSA-detected prostate cancer were subsequently observed to survive for 15 to 20
years, and ultimately died from other competing diseases (71). In addition, many men
underwent unnecessary biopsies and invasive treatments that led to lower quality of life.
In a large systematic review and meta-analysis, it was found that PSA screening led to
only one fewer death from prostate cancer per 1,000 men screened over 10 years (72).
As a result, the United States Preventive Services Task Force updated their evaluation
of PSA screening for prostate cancer to a “C” grade, recommending that physicians
selectively provide PSA testing to individual patients based on professional judgment
and patient preferences (73).
Screening tools and methods with improved accuracy of diagnosing aggressive
and lethal prostate cancer could greatly enhance clinical decision making and reduce
the number of unnecessary biopsies performed and treatments received. One proposed
strategy is to use a baseline PSA measured during midlife to estimate risk and
determine the frequency of subsequent screening. This approach is based on the
observation that prostate carcinogenesis likely initiates in men in their 30s to 40s (74).
Thus, midlife PSA levels may reflect early stages of the prostate carcinogenesis while
being less prone to elevation due to benign prostatic hyperplasia (BPH) compared with
PSA levels measured later in life (75). Multiple studies in the U.S. and Europe have
shown that PSA levels in midlife, many years before a prostate cancer diagnosis, are
associated with increased risk of prostate cancer later in life (76-84). Moreover, several studies
have also reported associations between midlife PSA levels and risk of metastatic
prostate cancer and prostate cancer mortality (85-89). A nested case-control study in
the Malmö Preventive cohort found that 44% of prostate cancer deaths occurred in men
within the top 10th percentile of PSA levels measured in men 45 to 55 years of age 25 to
30 years earlier (90). Baseline PSA level collected in midlife, 9.0 and 8.6 years (median)
before a prostate cancer diagnosis, was also associated with overall and lethal prostate
cancer risk, respectively, in a primarily White population aged 40 to 59 years in the
Physicians' Health Study (91). A more recent study in the Prostate, Lung, Colorectal,
and Ovarian (PLCO) Cancer Screening Trial also found baseline PSA levels measured
5.9 years (median) before prostate cancer diagnosis in middle-aged men 55 to 60 years
to be significantly associated with risk of overall and clinically significant prostate cancer
(92). These studies in men of European ancestry indicate PSA is a marker of very early
prostate cancer development that may help to risk-stratify men earlier in life (74).
Other approaches involve more complex PSA testing that incorporates additional
components. For example, the four-kallikrein (4K) panel, which consists of blood
measures of total, free, and intact PSA and human kallikrein-related peptidase 2 (hK2),
has been demonstrated to reduce unnecessary biopsies and to improve detection of
aggressive prostate cancer in men with moderate PSA levels (93-101).
In the Multiethnic Cohort Study (MEC), the 4K panel has been shown to confer
enhanced discrimination over PSA for overall and aggressive prostate cancer across
different race/ethnicity groups (102).
Breast cancer epidemiology
Overview of breast cancer
Breast cancer is the most commonly diagnosed cancer among women worldwide
(103). Approximately one in eight women will be diagnosed with breast cancer in their
lifetime (103). Among U.S. women, breast cancer is also the most common non-
cutaneous malignancy and the second leading cause of cancer death, with an
estimated 281,550 new cases and 43,600 deaths in 2021 (42). The age-adjusted
incidence rate for female breast cancer in the U.S. is 129.1 per 100,000 person-years
based on Surveillance, Epidemiology, and End Results (SEER) Program data from
2014 to 2018 (42). Based on SEER Program data from 2015 to 2019, the age-adjusted
mortality rate for female breast cancer in the U.S. is 19.9 per 100,000 person-years
(42). Age-adjusted incidence rates for female breast cancer have been rising on
average 0.3% per year over 2009 through 2018, while age-adjusted mortality rates have
been falling on average 1.3% per year over 2010 through 2019 (42). However, the
pace of the mortality rate decline has slowed, from an annual decrease
of 1.9% during 1998 through 2011 to 1.3% during 2011 through 2017 (103). This is
largely driven by the trend in White women; early screening and treatment
advances have contributed to an average annual decrease in breast cancer mortality of
1.5% among White women and 1% among African American women from 2013 to
2017 (104). While breast cancer incidence rates are higher in White women (130.8 per
100,000) than in African American women (126.7 per 100,000), African American women
have a 40% higher breast cancer mortality rate (28.4 per 100,000) than
White women (20.3 per 100,000) (103). Although the underlying causes of these
disparities are not well understood, socioeconomic, lifestyle, and genetic factors have
been purported to be determinants of the disparities in breast cancer. To inform efforts
in reducing breast cancer incidence and mortality in the diverse American population
and populations worldwide, it is important to understand the disease burden attributable
to genetics and modifiable risk factors contributing to disparities in breast cancer risk.
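The "40% higher" mortality figure quoted above follows directly from the two rates given in the text. The short Python sketch below reproduces that arithmetic; the function names are illustrative and are not taken from the cited reports.

```python
def rate_ratio(rate_a, rate_b):
    """Ratio of two rates expressed per the same denominator (here, per 100,000)."""
    return rate_a / rate_b


def percent_excess(rate_a, rate_b):
    """Percent by which rate_a exceeds rate_b."""
    return (rate_ratio(rate_a, rate_b) - 1.0) * 100.0


# Rates per 100,000 as quoted in the text above (103).
mortality_african_american = 28.4
mortality_white = 20.3
incidence_white = 130.8
incidence_african_american = 126.7

mortality_excess = percent_excess(mortality_african_american, mortality_white)
incidence_ratio = rate_ratio(incidence_white, incidence_african_american)

print(f"Excess mortality in African American women: {mortality_excess:.1f}%")
print(f"Incidence rate ratio (White vs. African American): {incidence_ratio:.2f}")
```

The comparison shows the pattern described in the text: incidence is only slightly higher in White women, while mortality is substantially higher in African American women.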
Breast cancer subtypes
Breast cancer is a heterogeneous disease, with different prognostic indices
and clinical management for each molecular subtype. Breast cancer subtypes are
separated by the tumor expression of two hormone receptors, the estrogen receptor
(ER) and progesterone receptor (PR), the human epidermal growth factor receptor 2
(HER2), and grade of the tumor. According to these characterizations, the four major
subtypes are Luminal A (ER+ and/or PR+, HER2-, low grade), Luminal B (ER+ and/or
PR+, HER2-/+, higher grade), HER2-enriched (ER-, PR-, HER2+, any grade), and triple
negative breast cancer (TNBC; ER-, PR-, HER2-, any grade) (105). Out of all female
breast cancers diagnosed in the U.S., 68% are Luminal A subtype, 10% are Luminal B
subtype, 4% are HER2-enriched subtype, 10% are TNBC subtype, and 8% are other
special histologic subtypes (42). Five-year survival rates by breast cancer subtype, from
most favorable to poorest prognosis, are 94% for Luminal A, 91% for Luminal B,
84% for HER2-enriched, and 77% for TNBC (42).
Lifestyle and environmental risk factors of breast cancer
Established non-modifiable risk factors positively associated with breast cancer
include older age, family history of breast cancer, history of benign breast disease, and
higher breast density (42). Reproductive factors associated with increased risk of breast
cancer include older age at first full-term pregnancy, nulliparity, decreased duration of
breastfeeding, early age of menarche, and older age at menopause (42). Modifiable risk
factors associated with increased risk of breast cancer include obesity, weight gain, oral
contraceptive use, physical inactivity, and alcohol consumption (42). In the Multiethnic
Cohort Study (MEC), women who were overweight (BMI 25 to <30), and obese (BMI
≥30) were 1.19 (hazard ratio [HR]; 95% confidence interval [CI], 1.12-1.25), and 1.32
(HR; 95% CI, 1.23-1.41) times as likely to develop breast cancer, respectively,
compared to women of normal BMI (BMI 20-24.9) (106). In the California Teachers
Study, women in the highest quintile of a plant-based dietary pattern had a
lower risk of breast cancer than women in the lowest quintile (risk ratio
[RR], 0.85; 95% CI, 0.76-0.95) (107). It was estimated that in the U.S., 11.3% of breast
cancer may be attributable to excess body weight, while 3.9% of breast cancer may be
attributable to physical inactivity (108).
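As a rough illustration of what an attributable fraction means, the sketch below applies Levin's formula, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the exposure prevalence and RR the relative risk. This is a textbook formula and is not necessarily the method used in the cited estimate (108). The prevalence value is a hypothetical placeholder, and the MEC hazard ratio for obesity is used as a stand-in for the relative risk, so the output is not expected to reproduce the 11.3% figure.

```python
def levin_paf(prevalence, relative_risk):
    """Population attributable fraction by Levin's formula.

    prevalence: proportion of the population exposed (0 to 1)
    relative_risk: risk in the exposed divided by risk in the unexposed
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# ASSUMPTION: 0.30 is a hypothetical exposure prevalence chosen for illustration only.
hypothetical_prevalence = 0.30
# HR for obesity from the MEC result quoted above (106), used here as an approximate RR.
approximate_rr = 1.32

paf = levin_paf(hypothetical_prevalence, approximate_rr)
print(f"Illustrative PAF: {paf:.1%}")
```

Under these assumed inputs, the formula shows how a modest relative risk can still account for a meaningful share of cases when the exposure is common.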
Genetic susceptibility of breast cancer
It has been estimated that about 30% of breast cancers are attributable to
genetic factors (109). Mutations in two breast cancer susceptibility genes, BRCA1 and
BRCA2, have been identified to confer high breast cancer risk (110, 111). BRCA1 and
BRCA2 are tumor suppressor genes that produce DNA repair proteins. Mutations in
these genes are thought to account for around 5% of all breast cancer cases, with a
prevalence of about 4-10% among White women with a breast cancer diagnosis under
age 40 years, and 17% among African American women with a diagnosis at age 35
years or younger (112, 113). Several meta-analyses reported that a BRCA1 mutation
confers a 57–65% probability, and BRCA2 mutation confers a 45–49% probability of
developing breast cancer over a woman’s lifetime (110, 114).
In addition to BRCA1 and BRCA2 genes, other breast cancer susceptibility
genes of high and moderate penetrance have been discovered (111). Studies have
confirmed CDH1, PTEN, STK11, TP53, ATM, CHEK2, PALB2, NBN, and BRIP1 genes
to be associated with an overall 2 to 6-fold increased risk of developing breast cancer
compared with population controls (111, 115-129).
Genome-wide association studies (GWAS) and polygenic risk score (PRS) of breast
cancer
While rare mutations in genes such as BRCA1 and BRCA2 convey high risks of
developing breast cancer, these account for only a small proportion of breast cancer
cases in the general population. Genome-wide association studies (GWAS) have
identified hundreds of common germline single nucleotide polymorphisms (SNPs) that
individually confer a small increase in breast cancer risk (130, 131). The breast cancer-associated
variants’ combined effects can be summarized into a polygenic risk score (PRS), which
can be used to stratify women according to their risk of developing breast cancer. A
recent study using 79 studies from the Breast Cancer Association Consortium (BCAC)
constructed a PRS consisting of 313 variants associated with breast cancer risk (132).
For the PRS313, the odds ratio for overall disease per 1 standard deviation was 1.61 (95%
confidence interval [CI]: 1.57–1.65) (132). Moreover, the lifetime risk of overall breast
cancer in the top centile of the PRS313 was 32.6% (132). However, current breast cancer PRS
were developed in populations of European ancestry, so their predictive accuracy
may be attenuated in, and may not generalize to, non-European populations (5-9).
While the PRS is a powerful tool for breast cancer risk discrimination, better risk
discrimination could be achieved by combining the PRS with family history and
environmental risk factors (16).
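To make the idea of summarizing variant effects into a single score concrete, the sketch below computes a PRS as a weighted sum of risk-allele dosages and shows how an odds ratio per standard deviation scales under a log-linear model. The three weights and the genotype are hypothetical values for illustration only; they are not PRS313 weights. The only figure taken from the text is the odds ratio of 1.61 per standard deviation.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def odds_ratio_at(z, or_per_sd):
    """Odds ratio relative to the mean PRS for a person z SDs above the mean,
    assuming the log odds increase linearly with the standardized PRS."""
    return math.exp(z * math.log(or_per_sd))


# ASSUMPTION: hypothetical per-allele log odds ratios for three variants.
weights = [0.10, 0.05, 0.20]
# ASSUMPTION: hypothetical genotype (risk-allele counts) for one woman.
dosages = [2, 0, 1]

score = polygenic_risk_score(dosages, weights)

# OR per 1 SD of 1.61 is the PRS313 estimate quoted in the text (132).
or_two_sd = odds_ratio_at(2.0, 1.61)

print(f"Raw PRS: {score:.2f}")
print(f"OR at +2 SD vs. mean: {or_two_sd:.2f}")
```

In practice the raw score is standardized against a reference population before per-SD odds ratios are applied; that step is omitted here because it requires population data.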
Nonalcoholic fatty liver disease epidemiology
Background: Hepatic Steatosis
Chronic liver disease is a major cause of morbidity and mortality worldwide. In
the United States, chronic liver disease and cirrhosis were the 11th-leading cause of
death in 2017, accounting for 41,743 deaths, or 12.8 per 100,000 population (133). The
progression of chronic liver disease is characterized by chronic inflammation and
scarring of the liver, resulting in end-stage liver disease and its complications. Hepatic
steatosis is also recognized as an important pathological feature of chronic liver
disease. Hepatic steatosis, also known as fatty liver disease (FLD), is defined as
excessive fat accumulation within the hepatocytes. FLD is a complex disease influenced
by genetic variation, lifestyle, and environment. There are two major conditions
associated with hepatic steatosis: nonalcoholic fatty liver disease (NAFLD) and
alcoholic fatty liver disease (AFLD). The prognosis of hepatic steatosis varies according
to the etiology and conditions such as inflammation and fibrosis. Thus, it is important to
study hepatic steatosis based on its different forms.
Overview of nonalcoholic fatty liver disease
Characterized by excessive fat deposition in the liver that is not attributable to
consumption of alcohol, nonalcoholic fatty liver disease (NAFLD) is the most common
cause of chronic liver disease, with an estimated global prevalence of 20-30% (134-
136). In the United States, NAFLD prevalence is projected to increase 21% from 83
million cases in 2015 to 101 million cases in 2030 (137). NAFLD risk burden also differs
across different race/ethnicity groups. Investigation in the Multiethnic Cohort Study
(MEC) showed that U.S. Latinos experience the highest prevalence of NAFLD
compared to other ethnic groups, and Whites experience the lowest prevalence of
NAFLD (138-140). NAFLD is not only linked to metabolic syndrome and hepatic
outcomes of type 2 diabetes and kidney disease, but also to extrahepatic outcomes
such as cardiovascular disease (CVD), cancer, and all-cause mortality (141). In the US,
NAFLD has risen to become the fastest growing cause of hepatocellular carcinoma
(HCC) and the leading indication for liver transplantation (142-144). In addition to becoming
the leading indication for liver transplantation and a major risk factor for hepatocellular
carcinoma that can develop even without cirrhosis, NAFLD is becoming the most
common cause of liver-related mortality worldwide (143, 145, 146). Patients with NAFLD are at
increased risk of mortality from liver disease, and more commonly from cardiovascular
disease and malignancy (147).
Subtypes of nonalcoholic fatty liver disease
NAFLD is a spectrum of liver disease ranging from nonalcoholic fatty liver (NAFL)
to nonalcoholic steatohepatitis (NASH). NAFL is the nonprogressive subtype, which is
simple steatosis. NASH is the progressive subtype, which is distinguished from NAFL
by the presence of hepatocellular injury with varying degrees of fibrosis, and can lead to
cirrhosis and hepatocellular carcinoma (HCC) (148-150).
Mortality with non-alcoholic fatty liver disease
The presence of NAFLD can increase the rate of liver fibrosis progression,
leading to cirrhosis, HCC, and/or death. NAFLD is becoming the most common cause of
liver-related mortality worldwide (143, 145, 146). For patients with NAFLD, all-cause mortality
and liver-specific mortality per 1,000 person-years were reported to be 15.44 and 0.77,
respectively. For patients with NASH, the progressive subtype of NAFLD, all-cause
mortality and liver-specific mortality per 1,000 person-years were reported to be 25.56
and 11.77, respectively (135, 151). An early study in the Danish National Registry of
Patients reported that overall mortality was increased 2.6-fold (95% confidence interval
[CI], 2.4-2.9) in patients with non-alcoholic or unspecified fatty liver (152). In addition, a
retrospective cohort study in Cleveland demonstrated that NASH was an independent
predictor of liver-related mortality (153). While patients with NAFLD are at an increased
risk of mortality from liver disease, the most common cause of death among patients
with NAFLD is cardiovascular disease (147). It has been estimated that 5 to 10% of
patients with NAFLD die from cardiovascular disease (154). In a meta-analysis of six
studies with 25,837 subjects, the risk of cardiovascular mortality was also increased in
the NAFLD group (RR 1.46, 95% CI 1.31–1.64) (155).
Lifestyle and environmental risk factors of nonalcoholic fatty liver disease
Established risk factors associated with NAFLD are obesity, hypertension,
dyslipidemia, type 2 diabetes and metabolic syndrome. A large meta-analysis including
86 studies with a sample size of 8,515,431 from 22 countries observed these metabolic
comorbidities with high prevalence among NAFLD patients: obesity (51.34%; 95% CI:
41.38-61.20), type 2 diabetes (22.51%; 95% CI: 17.92-27.89), hyperlipidemia (69.16%;
95% CI: 49.91-83.46%), hypertension (39.34%; 95% CI: 33.15-45.88), and metabolic
syndrome (42.54%; 95% CI: 30.06-56.05) (135). A recent study in the Multiethnic
Cohort Study (MEC) found that higher intakes of red meat (P trend=0.010), processed
red meat (P trend= 0.004), poultry (P trend= 0.005), and cholesterol (P trend= 0.005)
are risk factors for NAFLD, while dietary fiber is a protective factor (156).
Genetic susceptibility of nonalcoholic fatty liver disease
For high-prevalence diseases, such as NAFLD, the heritable component typically
accounts for up to 30 to 50% of relative risk (157). Familial aggregation, twin studies
and interethnic differences in susceptibility have suggested that there is a significant
heritable component to NAFLD (158-166). Genome-wide association studies (GWAS)
have identified several variants associated with NAFLD development and disease
severity (17, 21, 23, 24, 167). Among the variants identified, the rs738409 C>G variant
in the PNPLA3 gene has been identified as an independent genetic risk factor for NAFLD
by multiple GWAS in populations of predominantly European or Asian ancestry (17, 23, 24, 27,
28, 167, 168). The polymorphism in PNPLA3 (rs738409 c.444 C > G, p.I148M) is a
nonsynonymous cytosine to guanine nucleotide transversion mutation that results in an
isoleucine to methionine amino acid change at codon 148. The G allele of rs738409 has
been shown to reduce lipase activity, resulting in hepatic fat accumulation. The PNPLA3
polymorphism has been linked with NAFLD, nonalcoholic steatohepatitis,
decompensated cirrhosis, hepatocellular carcinoma, and all-cause, cardiovascular and
liver-related mortality (27, 169). An exome-wide association study, in addition to
validating the association between PNPLA3 and NAFLD, identified a
nonsynonymous genetic variant within the TM6SF2 gene (rs58542926 c.449 C > T,
p.Glu167Lys) at the 19p13.11 locus that was associated with radiologically measured
steatosis (HTGC) (170). This polymorphism of TM6SF2 has additionally been reported
in at least two GWAS of European ancestry (17, 25). Several variants in HSD17B13,
GCK, GATAD2A, NCAN, PPP1R3B, TRIB2, CPN1, ERLIN1 and SAMM50 have also
been identified as GWAS-significant variants for NAFLD, liver fat content or liver
enzymes (17, 20, 21, 28, 29, 168). While not identified in the context of genome-wide
assessment, several variants in MBOAT7, SERPINA1, HFE, and MARC1 have been
shown to be associated with NAFLD histopathological features (171, 172).
More specifically, several studies have investigated the association of the
PNPLA3 I148M (rs738409) polymorphism with mortality. Multiple studies conducted in
the Third National Health and Nutrition Examination Survey (NHANES III) reported
associations of the PNPLA3 polymorphism with higher risks of liver-related mortality
(HR, 6.23; 95% CI, 1.83-21.27) and overall mortality (HR 1.22, 95% CI, 1.09–1.36)
(173, 174).
Polygenic risk score (PRS) of nonalcoholic fatty liver disease
While multiple studies have demonstrated the utility of PRS in predicting NAFLD
outcomes, most studies generated the PRS based on a few selected genetic variants
(2-6 SNPs) without taking into account all previously identified GWAS-significant
variants (31). A 4-variant PRS was found to be significantly associated with NAFLD risk
in Whites (3-fold increased risk among individuals at the top tertile of the PRS) (175). A
large recent study in the UK Biobank among individuals of European ancestry found a
15-SNP PRS associated with higher alanine aminotransferase (ALT) levels (Odds Ratio
[OR] = 1.14) (176). A limited number of studies have evaluated
NAFLD PRS in non-White populations. In a study conducted among severely obese
(BMI ≥ 40 kg/m²) Mexicans, a 4-variant PRS was significantly associated with hepatic
fat content and higher ALT levels (OR = 1.63; 95% CI = 1.28-2.07) (177). In the largest
study of liver biopsy-confirmed NAFLD among Japanese (n = 902 cases), a 3-variant
PRS was significantly associated with increased NAFLD risk (AUC = 0.65; 95% CI =
0.63-0.67) (21). A recent study in the Multiethnic Cohort Study assessed an 11-SNP
weighted PRS and reported a significant association between the PRS and NAFLD risk in a
multiethnic population (odds ratio [OR] per SD increase = 1.41; 95% confidence interval
[CI] = 1.32-1.50) (37). Recently, a large multi-ancestry GWAS in the Million Veteran
Program (MVP) identified 77 genetic loci associated with NAFLD, and replicated 17
SNPs in external cohorts with histology-defined or radiologic imaging NAFLD cases
(38).
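Several of the studies above summarize PRS performance with the area under the ROC curve (AUC). As a minimal sketch, the AUC can be read as the probability that a randomly chosen case has a higher PRS than a randomly chosen control, with ties counted as one half. The scores below are made-up values for illustration and do not come from any cited study.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs in which the case scores higher.

    Ties contribute 0.5. This pairwise form is equivalent to the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# ASSUMPTION: hypothetical PRS values for three cases and three controls.
cases = [0.9, 0.4, 0.7]
controls = [0.2, 0.5, 0.4]

example_auc = auc(cases, controls)
print(f"AUC: {example_auc:.3f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which gives context for the reported value of 0.65 for the 3-variant PRS.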
References
1. Welter D, MacArthur J, Morales J, et al. The NHGRI GWAS Catalog, a curated
resource of SNP-trait associations. Nucleic Acids Res. 2014;42(Database
issue):D1001-6. doi:10.1093/nar/gkt1229
2. Visscher PM, Wray NR, Zhang Q, et al. 10 Years of GWAS Discovery: Biology,
Function, and Translation. Am J Hum Genet. 2017;101(1):5-22.
doi:10.1016/j.ajhg.2017.06.005
3. Chatterjee N, Shi J, García-Closas M. Developing and evaluating polygenic risk
prediction models for stratified disease prevention. Nat Rev Genet. 2016;17(7):392-406.
doi:10.1038/nrg.2016.27
4. Kachuri L, Graff RE, Smith-Byrne K, et al. Pan-cancer analysis demonstrates that
integrating polygenic risk scores with modifiable risk factors improves risk prediction.
Nat Commun. 2020;11(1):6084. doi:10.1038/s41467-020-19600-4
5. Allman R, Dite GS, Hopper JL, et al. SNPs and breast cancer risk prediction for
African American and Hispanic women. Breast Cancer Res Treat. 2015;154(3):583-9.
doi:10.1007/s10549-015-3641-7
6. Hsieh YC, Tu SH, Su CT, et al. A polygenic risk score for breast cancer risk in a
Taiwanese population. Breast Cancer Res Treat. 2017;163(1):131-8.
doi:10.1007/s10549-017-4144-5
7. Shieh Y, Fejerman L, Lott PC, et al. A Polygenic Risk Score for Breast Cancer in
US Latinas and Latin American Women. J Natl Cancer Inst. 2020;112(6):590-8.
doi:10.1093/jnci/djz174
8. Wang S, Qian F, Zheng Y, et al. Genetic variants demonstrating flip-flop
phenomenon and breast cancer risk prediction among women of African ancestry.
Breast Cancer Res Treat. 2018;168(3):703-12. doi:10.1007/s10549-017-4638-1
9. Willoughby A, Andreassen PR, Toland AE. Genetic Testing to Guide Risk-
Stratified Screens for Breast Cancer. J Pers Med. 2019;9(1). doi:10.3390/jpm9010015
10. Duncan L, Shen H, Gelaye B, et al. Analysis of polygenic risk score usage and
performance in diverse human populations. Nat Commun. 2019;10(1):3328.
doi:10.1038/s41467-019-11112-0
11. Scutari M, Mackay I, Balding D. Using Genetic Distance to Infer the Accuracy of
Genomic Prediction. PLoS Genet. 2016;12(9):e1006288.
doi:10.1371/journal.pgen.1006288
12. Andriole GL, Crawford ED, Grubb RL, 3rd, et al. Prostate cancer screening in the
randomized Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial: mortality
results after 13 years of follow-up. J Natl Cancer Inst. 2012;104(2):125-32.
doi:10.1093/jnci/djr500
13. Schröder FH, Hugosson J, Roobol MJ, et al. Screening and prostate cancer
mortality: results of the European Randomised Study of Screening for Prostate Cancer
(ERSPC) at 13 years of follow-up. Lancet. 2014;384(9959):2027-35.
doi:10.1016/s0140-6736(14)60525-0
14. Pashayan N, Duffy SW, Neal DE, et al. Implications of polygenic risk-stratified
screening for prostate cancer on overdiagnosis. Genet Med. 2015;17(10):789-95.
doi:10.1038/gim.2014.192
15. Rudolph A, Chang-Claude J, Schmidt MK. Gene-environment interaction and risk
of breast cancer. Br J Cancer. 2016;114(2):125-33. doi:10.1038/bjc.2015.439
16. Rudolph A, Song M, Brook MN, et al. Joint associations of a polygenic risk score
and environmental risk factors for breast cancer in the Breast Cancer Association
Consortium. Int J Epidemiol. 2018;47(2):526-36. doi:10.1093/ije/dyx242
17. Anthonisen NR, Skeans MA, Wise RA, Manfreda J, Kanner RE, Connett JE. The
effects of a smoking cessation intervention on 14.5-year mortality: a randomized clinical
trial. Ann Intern Med. 2005;142(4):233-9.
doi:10.7326/0003-4819-142-4-200502150-00005
18. Grooteman MP, van den Dorpel MA, Bots ML, et al. Effect of online
hemodiafiltration on all-cause mortality and cardiovascular outcomes. J Am Soc
Nephrol. 2012;23(6):1087-96. doi:10.1681/asn.2011121140
19. Hedley AJ, Wong CM, Thach TQ, Ma S, Lam TH, Anderson HR.
Cardiorespiratory and all-cause mortality after restrictions on sulphur content of fuel in
Hong Kong: an intervention study. Lancet. 2002;360(9346):1646-52.
doi:10.1016/s0140-6736(02)11612-6
20. Mohiuddin SM, Mooss AN, Hunter CB, Grollmes TL, Cloutier DA, Hilleman DE.
Intensive smoking cessation intervention reduces mortality in high-risk smokers with
cardiovascular disease. Chest. 2007;131(2):446-52. doi:10.1378/chest.06-1587
21. Melzer D, Pilling LC, Ferrucci L. The genetics of human ageing. Nat Rev Genet.
2020;21(2):88-101. doi:10.1038/s41576-019-0183-6
22. Timmers PR, Mounier N, Lall K, et al. Genomics of 1 million parent lifespans
implicates novel pathways and common diseases and distinguishes survival chances.
Elife. 2019;8. doi:10.7554/eLife.39856
23. Wright KM, Rand KA, Kermany A, et al. A Prospective Analysis of Genetic
Variants Associated with Human Lifespan. G3 (Bethesda). 2019;9(9):2863-78.
doi:10.1534/g3.119.400448
24. Meisner A, Kundu P, Zhang YD, et al. Combined Utility of 25 Disease and Risk
Factor Polygenic Risk Scores for Stratifying Risk of All-Cause Mortality. Am J Hum
Genet. 2020;107(3):418-31. doi:10.1016/j.ajhg.2020.07.002
25. Siegel RL, Miller KD, Jemal A. Cancer Statistics, 2017. CA Cancer J Clin.
2017;67(1):7-30. doi:10.3322/caac.21387
26. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN
Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA
Cancer J Clin. 2021;71(3):209-49. doi:10.3322/caac.21660
27. Haiman CA, Chen GK, Blot WJ, et al. Characterizing genetic risk at known
prostate cancer susceptibility loci in African Americans. PLoS Genet.
2011;7(5):e1001387. doi:10.1371/journal.pgen.1001387
28. Li J, Siegel DA, King JB. Stage-specific incidence rates and trends of prostate
cancer by age, race, and ethnicity, United States, 2004-2014. Ann Epidemiol.
2018;28(5):328-30. doi:10.1016/j.annepidem.2018.03.001
29. Cotter MP, Gern RW, Ho GY, Chang RY, Burk RD. Role of family history and
ethnicity on the mode and age of prostate cancer presentation. Prostate.
2002;50(4):216-21. doi:10.1002/pros.10051
30. Powell IJ. Epidemiology and pathophysiology of prostate cancer in African-
American men. J Urol. 2007;177(2):444-9. doi:10.1016/j.juro.2006.09.024
31. Freedland SJ, Isaacs WB. Explaining racial differences in prostate cancer in the
United States: sociology or biology? Prostate. 2005;62(3):243-52.
doi:10.1002/pros.20052
32. Johns LE, Houlston RS. A systematic review and meta-analysis of familial
prostate cancer risk. BJU Int. 2003;91(9):789-94. doi:10.1046/j.1464-410x.2003.04232.x
33. Zeegers MP, Jellema A, Ostrer H. Empiric risk of prostate carcinoma for relatives
of patients with prostate carcinoma: a meta-analysis. Cancer. 2003;97(8):1894-903.
doi:10.1002/cncr.11262
34. Whittemore AS, Wu AH, Kolonel LN, et al. Family history and prostate cancer
risk in black, white, and Asian men in the United States and Canada. Am J Epidemiol.
1995;141(8):732-40. doi:10.1093/oxfordjournals.aje.a117495
35. Solans M, Chan DSM, Mitrou P, Norat T, Romaguera D. A systematic review and
meta-analysis of the 2007 WCRF/AICR score in relation to cancer-related health
outcomes. Ann Oncol. 2020;31(3):352-68. doi:10.1016/j.annonc.2020.01.001
36. Lichtenstein P, Holm NV, Verkasalo PK, et al. Environmental and Heritable
Factors in the Causation of Cancer — Analyses of Cohorts of Twins from Sweden,
Denmark, and Finland. N Engl J Med. 2000;343(2):78-85.
doi:10.1056/nejm200007133430201
37. Page WF, Braun MM, Partin AW, Caporaso N, Walsh P. Heredity and prostate
cancer: a study of World War II veteran twins. Prostate. 1997;33(4):240-5.
doi:10.1002/(sici)1097-0045(19971201)33:4<240::aid-pros3>3.0.co;2-l
38. Verze P, Cai T, Lorenzetti S. The role of the prostate in male fertility, health and
disease. Nat Rev Urol. 2016;13(7):379-86. doi:10.1038/nrurol.2016.89
39. Karlsson R, Aly M, Clements M, et al. A population-based assessment of
germline HOXB13 G84E mutation and prostate cancer risk. Eur Urol. 2014;65(1):169-76.
doi:10.1016/j.eururo.2012.07.027
40. Lynch HT, Kosoko-Lasaki O, Leslie SW, et al. Screening for familial and
hereditary prostate cancer. Int J Cancer. 2016;138(11):2579-91. doi:10.1002/ijc.29949
41. Conti DV, Darst BF, Moss LC, et al. Trans-ancestry genome-wide association
meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic
risk prediction. Nat Genet. 2021;53(1):65-75. doi:10.1038/s41588-020-00748-0
42. Ahmadiyeh N, Pomerantz MM, Grisanzio C, et al. 8q24 prostate, breast, and
colon cancer risk loci show tissue-specific long-range interaction with MYC. Proc Natl
Acad Sci U S A. 2010;107(21):9742-6. doi:10.1073/pnas.0910668107
43. Schumacher FR, Al Olama AA, Berndt SI, et al. Association analyses of more
than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet.
2018;50(7):928-36. doi:10.1038/s41588-018-0142-8
44. Amin Al Olama A, Kote-Jarai Z, Schumacher FR, et al. A meta-analysis of
genome-wide association studies to identify prostate cancer susceptibility loci
associated with aggressive and non-aggressive disease. Hum Mol Genet.
2013;22(2):408-15. doi:10.1093/hmg/dds425
45. Berndt SI, Wang Z, Yeager M, et al. Two susceptibility loci identified for prostate
cancer aggressiveness. Nat Commun. 2015;6:6889. doi:10.1038/ncomms7889
46. Xu J, Zheng SL, Isaacs SD, et al. Inherited genetic variant predisposes to
aggressive but not indolent prostate cancer. Proc Natl Acad Sci U S A.
2010;107(5):2136-40. doi:10.1073/pnas.0914061107
47. Al Olama AA, Kote-Jarai Z, Berndt SI, et al. A meta-analysis of 87,040 individuals
identifies 23 new susceptibility loci for prostate cancer. Nat Genet. 2014;46(10):1103-9.
doi:10.1038/ng.3094
48. Schumacher FR, Al Olama AA, Berndt SI, et al. Association analyses of more
than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat Genet.
2018;50(7):928-36. doi:10.1038/s41588-018-0142-8
49. Hugosson J, Carlsson S, Aus G, et al. Mortality results from the Göteborg
randomised population-based prostate-cancer screening trial. Lancet Oncol.
2010;11(8):725-32. doi:10.1016/s1470-2045(10)70146-7
50. Schröder FH, Hugosson J, Roobol MJ, et al. Prostate-cancer mortality at 11
years of follow-up. N Engl J Med. 2012;366(11):981-90. doi:10.1056/NEJMoa1113135
51. Welch HG, Gorski DH, Albertsen PC. Trends in Metastatic Breast and Prostate
Cancer--Lessons in Cancer Dynamics. N Engl J Med. 2015;373(18):1685-7.
doi:10.1056/NEJMp1510443
52. Loeb S, Bjurlin MA, Nicholson J, et al. Overdiagnosis and overtreatment of
prostate cancer. Eur Urol. 2014;65(6):1046-55. doi:10.1016/j.eururo.2013.12.062
53. Vickers AJ, Sjoberg DD, Ulmert D, et al. Empirical estimates of prostate cancer
overdiagnosis by age and prostate-specific antigen. BMC Med. 2014;12:26.
doi:10.1186/1741-7015-12-26
54. Welch HG, Albertsen PC. Reconsidering Prostate Cancer Mortality - The Future
of PSA Screening. N Engl J Med. 2020;382(16):1557-63. doi:10.1056/NEJMms1914228
55. Ilic D, Djulbegovic M, Jung JH, et al. Prostate cancer screening with prostate-
specific antigen (PSA) test: a systematic review and meta-analysis. BMJ.
2018;362:k3519. doi:10.1136/bmj.k3519
56. Grossman DC, Curry SJ, Owens DK, et al. Screening for Prostate Cancer: US
Preventive Services Task Force Recommendation Statement. JAMA.
2018;319(18):1901-13. doi:10.1001/jama.2018.3710
57. Ross KS, Carter HB, Pearson JD, Guess HA. Comparative efficiency of prostate-
specific antigen screening strategies for prostate cancer detection. JAMA.
2000;284(11):1399-405. doi:10.1001/jama.284.11.1399
58. Nadler RB, Humphrey PA, Smith DS, Catalona WJ, Ratliff TL. Effect of
inflammation and benign prostatic hyperplasia on elevated serum prostate specific
antigen levels. J Urol. 1995;154(2 Pt 1):407-13.
doi:10.1097/00005392-199508000-00023
59. Fang J, Metter EJ, Landis P, Chan DW, Morrell CH, Carter HB. Low levels of
prostate-specific antigen predict long-term risk of prostate cancer: results from the
Baltimore Longitudinal Study of Aging. Urology. 2001;58(3):411-6.
doi:10.1016/s0090-4295(01)01304-8
60. Lilja H, Cronin AM, Dahlin A, et al. Prediction of significant prostate cancer
diagnosed 20 to 30 years later with a single measure of prostate-specific antigen at or
before age 50. Cancer. 2011;117(6):1210-9. doi:10.1002/cncr.25568
61. Loeb S, Roehl KA, Antenor JA, Catalona WJ, Suarez BK, Nadler RB. Baseline
prostate-specific antigen compared with median prostate-specific antigen for age group
as predictor of prostate cancer risk in men younger than 60 years old. Urology.
2006;67(2):316-20. doi:10.1016/j.urology.2005.08.040
62. Parkes C, Wald NJ, Murphy P, et al. Prospective observational study to assess
value of prostate specific antigen as screening test for prostate cancer. BMJ.
1995;311(7016):1340-3. doi:10.1136/bmj.311.7016.1340
63. Whittemore AS, Cirillo PM, Feldman D, Cohn BA. Prostate specific antigen levels
in young adulthood predict prostate cancer risk: results from a cohort of Black and
White Americans. J Urol. 2005;174(3):872-6; discussion 6.
doi:10.1097/01.ju.0000169262.18000.8a
64. Whittemore AS, Lele C, Friedman GD, Stamey T, Vogelman JH, Orentreich N.
Prostate-Specific Antigen as Predictor of Prostate Cancer in Black Men and White Men.
J Natl Cancer Inst. 1995;87(5):354-9.
doi:10.1093/jnci/87.5.354
65. Antenor JA, Han M, Roehl KA, Nadler RB, Catalona WJ. Relationship between
initial prostate specific antigen level and subsequent prostate cancer detection in a
longitudinal screening study. J Urol. 2004;172(1):90-3.
doi:10.1097/01.ju.0000132133.10470.bb
66. Holmström B, Johansson M, Bergh A, Stenman UH, Hallmans G, Stattin P.
Prostate specific antigen for early detection of prostate cancer: longitudinal study. BMJ.
2009;339:b3537. doi:10.1136/bmj.b3537
67. Tang P, Sun L, Uhlman MA, et al. Initial prostate specific antigen 1.5 ng/ml or
greater in men 50 years old or younger predicts higher prostate cancer risk. J Urol.
2010;183(3):946-50. doi:10.1016/j.juro.2009.11.021
68. Kuller LH, Thomas A, Grandits G, Neaton JD. Elevated prostate-specific antigen
levels up to 25 years prior to death from prostate cancer. Cancer Epidemiol Biomarkers
Prev. 2004;13(3):373-7.
69. Stattin P, Vickers AJ, Sjoberg DD, et al. Improving the Specificity of Screening for
Lethal Prostate Cancer Using Prostate-specific Antigen and a Panel of Kallikrein
Markers: A Nested Case-Control Study. Eur Urol. 2015;68(2):207-13.
doi:10.1016/j.eururo.2015.01.009
70. Vickers AJ, Cronin AM, Bjork T, et al. Prostate specific antigen concentration at
age 60 and death or metastasis from prostate cancer: case-control study. BMJ.
2010;341:c4521. doi:10.1136/bmj.c4521
71. Connolly D, Black A, Gavin A, Keane PF, Murray LJ. Baseline prostate-specific
antigen level and risk of prostate cancer and prostate-specific mortality: diagnosis is
dependent on the intensity of investigation. Cancer Epidemiol Biomarkers Prev.
2008;17(2):271-8. doi:10.1158/1055-9965.Epi-07-0515
72. Orsted DD, Nordestgaard BG, Jensen GB, Schnohr P, Bojesen SE. Prostate-
specific antigen and long-term prediction of prostate cancer incidence and mortality in
the general population. Eur Urol. 2012;61(5):865-74. doi:10.1016/j.eururo.2011.11.007
73. Vickers AJ, Ulmert D, Sjoberg DD, et al. Strategy for detection of prostate cancer
based on relation between prostate specific antigen at age 40-55 and long term risk of
metastasis: case-control study. BMJ. 2013;346:f2023. doi:10.1136/bmj.f2023
74. Preston MA, Batista JL, Wilson KM, et al. Baseline Prostate-Specific Antigen
Levels in Midlife Predict Lethal Prostate Cancer. J Clin Oncol. 2016;34(23):2705-11.
doi:10.1200/jco.2016.66.7527
75. Kovac E, Carlsson SV, Lilja H, et al. Association of Baseline Prostate-Specific
Antigen Level With Long-term Diagnosis of Clinically Significant Prostate Cancer
Among Patients Aged 55 to 60 Years: A Secondary Analysis of a Cohort in the Prostate,
Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. JAMA Netw Open.
2020;3(1):e1919284. doi:10.1001/jamanetworkopen.2019.19284
76. Assel M, Sjöblom L, Murtola TJ, et al. A Four-kallikrein Panel and β-
Microseminoprotein in Predicting High-grade Prostate Cancer on Biopsy: An
Independent Replication from the Finnish Section of the European Randomized Study
of Screening for Prostate Cancer. Eur Urol Focus. 2019;5(4):561-7.
doi:10.1016/j.euf.2017.11.002
77. Bryant RJ, Sjoberg DD, Vickers AJ, et al. Predicting high-grade cancer at ten-
core prostate biopsy using four kallikrein markers measured in blood in the ProtecT
study. J Natl Cancer Inst. 2015;107(7). doi:10.1093/jnci/djv095
78. Gupta A, Roobol MJ, Savage CJ, et al. A four-kallikrein panel for the prediction of
repeat prostate biopsy: data from the European Randomized Study of Prostate Cancer
screening in Rotterdam, Netherlands. Br J Cancer. 2010;103(5):708-14.
doi:10.1038/sj.bjc.6605815
79. Kim EH, Andriole GL, Crawford ED, et al. Detection of High Grade Prostate
Cancer among PLCO Participants Using a Prespecified 4-Kallikrein Marker Panel. J
Urol. 2017;197(4):1041-7. doi:10.1016/j.juro.2016.10.089
80. Nordström T, Vickers A, Assel M, Lilja H, Grönberg H, Eklund M. Comparison
Between the Four-kallikrein Panel and Prostate Health Index for Predicting Prostate
Cancer. Eur Urol. 2015;68(1):139-46. doi:10.1016/j.eururo.2014.08.010
81. Punnen S, Freedland SJ, Polascik TJ, et al. A Multi-Institutional Prospective Trial
Confirms Noninvasive Blood Test Maintains Predictive Value in African American Men.
J Urol. 2018;199(6):1459-63. doi:10.1016/j.juro.2017.11.113
82. Vickers A, Cronin A, Roobol M, et al. Reducing unnecessary biopsy during
prostate cancer screening using a four-kallikrein panel: an independent replication. J
Clin Oncol. 2010;28(15):2493-8. doi:10.1200/jco.2009.24.1968
83. Vickers AJ, Cronin AM, Aus G, et al. A panel of kallikrein markers can reduce
unnecessary biopsy for prostate cancer: data from the European Randomized Study of
Prostate Cancer Screening in Göteborg, Sweden. BMC Med. 2008;6:19.
doi:10.1186/1741-7015-6-19
84. Vickers AJ, Cronin AM, Roobol MJ, et al. A four-kallikrein panel predicts prostate
cancer in men with recent screening: data from the European Randomized Study of
Screening for Prostate Cancer, Rotterdam. Clin Cancer Res. 2010;16(12):3232-9.
doi:10.1158/1078-0432.Ccr-10-0122
85. Darst BF, Chou A, Wan P, et al. The Four-Kallikrein Panel Is Effective in
Identifying Aggressive Prostate Cancer in a Multiethnic Population. Cancer Epidemiol
Biomarkers Prev. 2020;29(7):1381-8. doi:10.1158/1055-9965.EPI-19-1560
86. DeSantis CE, Ma J, Gaudet MM, et al. Breast cancer statistics, 2019. CA Cancer
J Clin. 2019;69(6):438-51. doi:10.3322/caac.21583
87. Cronin KA, Lake AJ, Scott S, et al. Annual Report to the Nation on the Status of
Cancer, part I: National cancer statistics. Cancer. 2018;124(13):2785-800.
doi:10.1002/cncr.31551
88. Provenzano E, Ulaner GA, Chin SF. Molecular Classification of Breast Cancer.
PET Clin. 2018;13(3):325-38. doi:10.1016/j.cpet.2018.02.004
89. Dela Cruz R, Park SY, Shvetsov YB, et al. Diet Quality and Breast Cancer
Incidence in the Multiethnic Cohort. Eur J Clin Nutr. 2020;74(12):1743-7.
doi:10.1038/s41430-020-0627-2
90. Link LB, Canchola AJ, Bernstein L, et al. Dietary patterns and breast cancer risk
in the California Teachers Study cohort. Am J Clin Nutr. 2013;98(6):1524-32.
doi:10.3945/ajcn.113.061184
91. Islami F, Goding Sauer A, Miller KD, et al. Proportion and number of cancer
cases and deaths attributable to potentially modifiable risk factors in the United States.
CA Cancer J Clin. 2018;68(1):31-54. doi:10.3322/caac.21440
92. Lichtenstein P, Holm NV, Verkasalo PK, et al. Environmental and heritable
factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark,
and Finland. N Engl J Med. 2000;343(2):78-85. doi:10.1056/nejm200007133430201
93. Chen S, Parmigiani G. Meta-analysis of BRCA1 and BRCA2 penetrance. J Clin
Oncol. 2007;25(11):1329-33. doi:10.1200/jco.2006.09.1066
94. Shuen AY, Foulkes WD. Inherited mutations in breast cancer genes--risk and
response. J Mammary Gland Biol Neoplasia. 2011;16(1):3-15. doi:10.1007/s10911-011-
9213-5
95. John EM, Miron A, Gong G, et al. Prevalence of pathogenic BRCA1 mutation
carriers in 5 US racial/ethnic groups. JAMA. 2007;298(24):2869-76.
doi:10.1001/jama.298.24.2869
96. Kurian AW. BRCA1 and BRCA2 mutations across race and ethnicity: distribution
and clinical implications. Curr Opin Obstet Gynecol. 2010;22(1):72-8.
doi:10.1097/GCO.0b013e328332dca3
97. Antoniou A, Pharoah PD, Narod S, et al. Average risks of breast and ovarian
cancer associated with BRCA1 or BRCA2 mutations detected in case series unselected
for family history: a combined analysis of 22 studies. Am J Hum Genet.
2003;72(5):1117-30. doi:10.1086/375033
98. CHEK2*1100delC and susceptibility to breast cancer: a collaborative analysis
involving 10,860 breast cancer cases and 9,065 controls from 10 studies. Am J Hum
Genet. 2004;74(6):1175-82. doi:10.1086/421251
99. Bogdanova N, Enssen-Dubrowinskaja N, Feshchenko S, et al. Association of two
mutations in the CHEK2 gene with breast cancer. Int J Cancer. 2005;116(2):263-6.
doi:10.1002/ijc.21022
100. Bogdanova N, Feshchenko S, Cybulski C, Dörk T. CHEK2 mutation and
hereditary breast cancer. J Clin Oncol. 2007;25(19):e26. doi:10.1200/jco.2007.11.4223
101. Bogdanova N, Feshchenko S, Schürmann P, et al. Nijmegen Breakage
Syndrome mutations and risk of breast cancer. Int J Cancer. 2008;122(4):802-6.
doi:10.1002/ijc.23168
102. Cybulski C, Wokołorczyk D, Huzarski T, et al. A deletion in CHEK2 of 5,395 bp
predisposes to breast cancer in Poland. Breast Cancer Res Treat. 2007;102(1):119-22.
doi:10.1007/s10549-006-9320-y
103. Erkko H, Dowty JG, Nikkilä J, et al. Penetrance analysis of the PALB2
c.1592delT founder mutation. Clin Cancer Res. 2008;14(14):4667-71.
doi:10.1158/1078-0432.Ccr-08-0210
104. Heikkinen K, Rapakko K, Karppinen SM, et al. RAD50 and NBS1 are breast
cancer susceptibility genes associated with genomic instability. Carcinogenesis.
2006;27(8):1593-9. doi:10.1093/carcin/bgi360
105. Kilpivaara O, Vahteristo P, Falck J, et al. CHEK2 variant I157T may be
associated with increased breast cancer risk. Int J Cancer. 2004;111(4):543-7.
doi:10.1002/ijc.20299
106. Rahman N, Seal S, Thompson D, et al. PALB2, which encodes a BRCA2-
interacting protein, is a breast cancer susceptibility gene. Nat Genet. 2007;39(2):165-7.
doi:10.1038/ng1959
107. Renwick A, Thompson D, Seal S, et al. ATM mutations that cause ataxia-
telangiectasia are breast cancer susceptibility alleles. Nat Genet. 2006;38(8):873-5.
doi:10.1038/ng1837
108. Seal S, Thompson D, Renwick A, et al. Truncating mutations in the Fanconi
anemia J gene BRIP1 are low-penetrance breast cancer susceptibility alleles. Nat
Genet. 2006;38(11):1239-41. doi:10.1038/ng1902
109. Shaag A, Walsh T, Renbaum P, et al. Functional and genomic approaches
reveal an ancient CHEK2 allele associated with breast cancer in the Ashkenazi Jewish
population. Hum Mol Genet. 2005;14(4):555-63. doi:10.1093/hmg/ddi052
110. Southey MC, Teo ZL, Dowty JG, et al. A PALB2 mutation associated with high
risk of breast cancer. Breast Cancer Res. 2010;12(6):R109. doi:10.1186/bcr2796
111. Steffen J, Nowakowska D, Niwińska A, et al. Germline mutations 657del5 of the
NBS1 gene contribute significantly to the incidence of breast cancer in Central Poland.
Int J Cancer. 2006;119(2):472-5. doi:10.1002/ijc.21853
112. Tung N, Lin NU, Kidd J, et al. Frequency of Germline Mutations in 25 Cancer
Susceptibility Genes in a Sequential Series of Patients With Breast Cancer. J Clin
Oncol. 2016;34(13):1460-8. doi:10.1200/jco.2015.65.0747
113. Michailidou K, Lindström S, Dennis J, et al. Association analysis identifies 65
new breast cancer risk loci. Nature. 2017;551(7678):92-4. doi:10.1038/nature24284
114. Milne RL, Kuchenbaecker KB, Michailidou K, et al. Identification of ten variants
associated with risk of estrogen-receptor-negative breast cancer. Nat Genet.
2017;49(12):1767-78. doi:10.1038/ng.3785
115. Mavaddat N, Michailidou K, Dennis J, et al. Polygenic Risk Scores for Prediction
of Breast Cancer and Breast Cancer Subtypes. Am J Hum Genet. 2019;104(1):21-34.
doi:10.1016/j.ajhg.2018.11.002
116. Kochanek KD, Murphy SL, Xu J, Arias E. Deaths: Final Data for 2017. Natl Vital
Stat Rep. 2019;68(9):1-77.
117. Wree A, Broderick L, Canbay A, Hoffman HM, Feldstein AE. From NAFLD to
NASH to cirrhosis-new insights into disease mechanisms. Nat Rev Gastroenterol
Hepatol. 2013;10(11):627-36. doi:10.1038/nrgastro.2013.149
118. Younossi ZM, Koenig AB, Abdelatif D, Fazel Y, Henry L, Wymer M. Global
epidemiology of nonalcoholic fatty liver disease-Meta-analytic assessment of
prevalence, incidence, and outcomes. Hepatology. 2016;64(1):73-84.
doi:10.1002/hep.28431
119. Chalasani N, Younossi Z, Lavine JE, et al. The diagnosis and management of
non-alcoholic fatty liver disease: practice guideline by the American Gastroenterological
Association, American Association for the Study of Liver Diseases, and American
College of Gastroenterology. Gastroenterology. 2012;142(7):1592-609.
doi:10.1053/j.gastro.2012.04.001
120. Estes C, Razavi H, Loomba R, Younossi Z, Sanyal AJ. Modeling the epidemic of
nonalcoholic fatty liver disease demonstrates an exponential increase in burden of
disease. Hepatology. 2018;67(1):123-33. doi:10.1002/hep.29466
121. Rich NE, Oji S, Mufti AR, et al. Racial and Ethnic Disparities in Nonalcoholic
Fatty Liver Disease Prevalence, Severity, and Outcomes in the United States: A
Systematic Review and Meta-analysis. Clin Gastroenterol Hepatol.
2018;16(2):198-210.e2. doi:10.1016/j.cgh.2017.09.041
122. Ruhl CE, Everhart JE. Fatty liver indices in the multiethnic United States National
Health and Nutrition Examination Survey. Aliment Pharmacol Ther. 2015;41(1):65-76.
doi:10.1111/apt.13012
123. Setiawan VW, Stram DO, Porcel J, Lu SC, Le Marchand L, Noureddin M.
Prevalence of chronic liver disease and cirrhosis by underlying cause in understudied
ethnic groups: The multiethnic cohort. Hepatology. 2016;64(6):1969-77.
doi:10.1002/hep.28677
124. Stender S, Loomba R. PNPLA3 Genotype and Risk of Liver and All-Cause
Mortality. Hepatology. 2020;71(3):777-9. doi:10.1002/hep.31113
125. Rinella M, Charlton M. The globalization of nonalcoholic fatty liver disease:
Prevalence and impact on world health. Hepatology. 2016;64(1):19-22.
doi:10.1002/hep.28524
126. Pais R, Barritt AS 4th, Calmus Y, et al. NAFLD and liver transplantation: Current
burden and expected challenges. J Hepatol. 2016;65(6):1245-57.
doi:10.1016/j.jhep.2016.07.033
127. Paradis V, Zalinski S, Chelbi E, et al. Hepatocellular carcinomas in patients with
metabolic syndrome often develop without significant liver fibrosis: a pathological
analysis. Hepatology. 2009;49(3):851-9. doi:10.1002/hep.22734
128. Adams LA, Lymp JF, St Sauver J, et al. The natural history of nonalcoholic fatty
liver disease: a population-based cohort study. Gastroenterology. 2005;129(1):113-21.
doi:10.1053/j.gastro.2005.04.014
129. Dulai PS, Singh S, Patel J, et al. Increased risk of mortality by fibrosis stage in
nonalcoholic fatty liver disease: Systematic review and meta-analysis. Hepatology.
2017;65(5):1557-65. doi:10.1002/hep.29085
130. Kleiner DE, Brunt EM, Van Natta M, et al. Design and validation of a histological
scoring system for nonalcoholic fatty liver disease. Hepatology. 2005;41(6):1313-21.
doi:10.1002/hep.20701
131. Ludwig J, Viggiano TR, McGill DB, Oh BJ. Nonalcoholic steatohepatitis: Mayo
Clinic experiences with a hitherto unnamed disease. Mayo Clin Proc. 1980;55(7):434-8.
132. Sayiner M, Koenig A, Henry L, Younossi ZM. Epidemiology of Nonalcoholic Fatty
Liver Disease and Nonalcoholic Steatohepatitis in the United States and the Rest of the
World. Clin Liver Dis. 2016;20(2):205-14. doi:10.1016/j.cld.2015.10.001
133. Jepsen P, Vilstrup H, Mellemkjaer L, et al. Prognosis of patients with a diagnosis
of fatty liver--a registry-based cohort study. Hepatogastroenterology. 2003;50(54):2101-
4.
134. Rafiq N, Bai C, Fang Y, et al. Long-term follow-up of patients with nonalcoholic
fatty liver. Clin Gastroenterol Hepatol. 2009;7(2):234-8. doi:10.1016/j.cgh.2008.11.005
135. Younossi ZM. Non-alcoholic fatty liver disease - A global public health
perspective. J Hepatol. 2019;70(3):531-44. doi:10.1016/j.jhep.2018.10.033
136. Mahfood Haddad T, Hamdeh S, Kanmanthareddy A, Alla VM. Nonalcoholic fatty
liver disease and the risk of clinical cardiovascular events: A systematic review and
meta-analysis. Diabetes Metab Syndr. 2017;11 Suppl 1:S209-S216.
doi:10.1016/j.dsx.2016.12.033
137. Noureddin M, Zelber-Sagi S, Wilkens LR, et al. Diet Associations With
Nonalcoholic Fatty Liver Disease in an Ethnically Diverse Population: The Multiethnic
Cohort. Hepatology. 2020;71(6):1940-52. doi:10.1002/hep.30967
138. Hirschhorn JN, Gajdos ZK. Genome-wide association studies: results from the
first few years and potential implications for clinical medicine. Annu Rev Med.
2011;62:11-24. doi:10.1146/annurev.med.091708.162036
139. Bambha K, Belt P, Abraham M, et al. Ethnicity and nonalcoholic fatty liver
disease. Hepatology. 2012;55(3):769-80. doi:10.1002/hep.24726
140. Browning JD, Kumar KS, Saboorian MH, Thiele DL. Ethnic differences in the
prevalence of cryptogenic cirrhosis. Am J Gastroenterol. 2004;99(2):292-8.
doi:10.1111/j.1572-0241.2004.04059.x
141. Browning JD, Szczepaniak LS, Dobbins R, et al. Prevalence of hepatic steatosis
in an urban population in the United States: impact of ethnicity. Hepatology.
2004;40(6):1387-95. doi:10.1002/hep.20466
142. Guerrero R, Vega GL, Grundy SM, Browning JD. Ethnic differences in hepatic
steatosis: an insulin resistance paradox? Hepatology. 2009;49(3):791-801.
doi:10.1002/hep.22726
143. Makkonen J, Pietiläinen KH, Rissanen A, Kaprio J, Yki-Järvinen H. Genetic
factors contribute to variation in serum alanine aminotransferase activity independent of
obesity and alcohol: a study in monozygotic and dizygotic twins. J Hepatol.
2009;50(5):1035-42. doi:10.1016/j.jhep.2008.12.025
144. Schwimmer JB, Celedon MA, Lavine JE, et al. Heritability of nonalcoholic fatty
liver disease. Gastroenterology. 2009;136(5):1585-92. doi:10.1053/j.gastro.2009.01.050
145. Struben VM, Hespenheide EE, Caldwell SH. Nonalcoholic steatohepatitis and
cryptogenic cirrhosis within kindreds. Am J Med. 2000;108(1):9-13. doi:10.1016/s0002-
9343(99)00315-0
146. Szczepaniak LS, Nurenberg P, Leonard D, et al. Magnetic resonance
spectroscopy to measure hepatic triglyceride content: prevalence of hepatic steatosis in
the general population. Am J Physiol Endocrinol Metab. 2005;288(2):E462-8.
doi:10.1152/ajpendo.00064.2004
147. Willner IR, Waters B, Patil SR, Reuben A, Morelli J, Riely CA. Ninety patients
with nonalcoholic steatohepatitis: insulin resistance, familial tendency, and severity of
disease. Am J Gastroenterol. 2001;96(10):2957-61. doi:10.1111/j.1572-
0241.2001.04667.x
148. Anstee QM, Darlay R, Cockell S, et al. Genome-wide association study of non-
alcoholic fatty liver and steatohepatitis in a histologically characterised cohort. J
Hepatol. 2020;73(3):505-15. doi:10.1016/j.jhep.2020.04.003
149. Kawaguchi T, Sumida Y, Umemura A, et al. Genetic polymorphisms of the
human PNPLA3 gene are strongly associated with severity of non-alcoholic fatty liver
disease in Japanese. PLoS One. 2012;7(6):e38322. doi:10.1371/journal.pone.0038322
150. Kitamoto T, Kitamoto A, Yoneda M, et al. Genome-wide scan revealed that
polymorphisms in the PNPLA3, SAMM50, and PARVB genes are associated with
development and progression of nonalcoholic fatty liver disease in Japan. Hum Genet.
2013;132(7):783-92. doi:10.1007/s00439-013-1294-3
151. Kawaguchi T, Shima T, Mizuno M, et al. Risk estimation model for nonalcoholic
fatty liver disease in the Japanese using multiple genetic markers. PLoS One.
2018;13(1):e0185490. doi:10.1371/journal.pone.0185490
152. Namjou B, Lingren T, Huang Y, et al. GWAS and enrichment analyses of non-
alcoholic fatty liver disease identify new trait-associated genes and pathways across
eMERGE Network. BMC Med. 2019;17(1):135. doi:10.1186/s12916-019-1364-z
153. Chambers JC, Zhang W, Sehmi J, et al. Genome-wide association study
identifies loci influencing concentrations of liver enzymes in plasma. Nat Genet.
2011;43(11):1131-8. doi:10.1038/ng.970
154. Romeo S, Kozlitina J, Xing C, et al. Genetic variation in PNPLA3 confers
susceptibility to nonalcoholic fatty liver disease. Nat Genet. 2008;40(12):1461-5.
doi:10.1038/ng.257
155. Speliotes EK, Yerges-Armstrong LM, Wu J, et al. Genome-wide association
analysis identifies variants associated with nonalcoholic fatty liver disease that have
distinct effects on metabolic traits. PLoS Genet. 2011;7(3):e1001324.
doi:10.1371/journal.pgen.1001324
156. Mandorfer M, Scheiner B, Stättermayer AF, et al. Impact of patatin-like
phospholipase domain containing 3 rs738409 G/G genotype on hepatic
decompensation and mortality in patients with portal hypertension. Aliment Pharmacol
Ther. 2018;48(4):451-9. doi:10.1111/apt.14856
157. Kozlitina J, Smagris E, Stender S, et al. Exome-wide association study identifies
a TM6SF2 variant that confers susceptibility to nonalcoholic fatty liver disease. Nat
Genet. 2014;46(4):352-6. doi:10.1038/ng.2901
158. Parisinos CA, Wilman HR, Thomas EL, et al. Genome-wide and Mendelian
randomisation studies of liver MRI yield insights into the pathogenesis of steatohepatitis.
J Hepatol. 2020;73(2):241-51. doi:10.1016/j.jhep.2020.03.032
159. Feitosa MF, Wojczynski MK, North KE, et al. The ERLIN1-CHUK-CWF19L1 gene
cluster influences liver fat deposition and hepatic inflammation in the NHLBI Family
Heart Study. Atherosclerosis. 2013;228(1):175-80.
doi:10.1016/j.atherosclerosis.2013.01.038
160. Yuan X, Waterworth D, Perry JR, et al. Population-based genome-wide
association studies reveal six loci influencing plasma levels of liver enzymes. Am J Hum
Genet. 2008;83(4):520-8. doi:10.1016/j.ajhg.2008.09.012
161. Krawczyk M, Liebe R, Lammert F. Toward Genetic Prediction of Nonalcoholic
Fatty Liver Disease Trajectories: PNPLA3 and Beyond. Gastroenterology.
2020;158(7):1865-80.e1. doi:10.1053/j.gastro.2020.01.053
162. Emdin CA, Haas ME, Khera AV, et al. A missense variant in Mitochondrial
Amidoxime Reducing Component 1 gene and protection against liver disease. PLoS
Genet. 2020;16(4):e1008629. doi:10.1371/journal.pgen.1008629
163. Wijarnpreecha K, Scribani M, Raymond P, et al. PNPLA3 Gene Polymorphism
and Liver- and Extrahepatic Cancer-Related Mortality in the United States. Clin
Gastroenterol Hepatol. 2020. doi:10.1016/j.cgh.2020.04.058
164. Wijarnpreecha K, Scribani M, Raymond P, et al. PNPLA3 gene polymorphism
and overall and cardiovascular mortality in the United States. J Gastroenterol Hepatol.
2020. doi:10.1111/jgh.15045
165. Vespasiani-Gentilucci U, Gallo P, Dell'Unto C, Volpentesta M, Antonelli-Incalzi R,
Picardi A. Promoting genetics in non-alcoholic fatty liver disease: Combined risk score
through polymorphisms and clinical variables. World J Gastroenterol. 2018;24(43):4835-
45. doi:10.3748/wjg.v24.i43.4835
166. Wang J, Conti DV, Bogumil D, et al. Association of Genetic Risk Score With
NAFLD in An Ethnically Diverse Cohort. Hepatol Commun. 2021.
doi:10.1002/hep4.1751
167. Mucci LA, Hjelmborg JB, Harris JR, et al. Familial Risk and Heritability of Cancer
Among Twins in Nordic Countries. JAMA. 2016;315(1):68-76.
doi:10.1001/jama.2015.17703
168. Preston MA, Gerke T, Carlsson SV, et al. Baseline Prostate-specific Antigen
Level in Midlife and Aggressive Prostate Cancer in Black Men. Eur Urol.
2019;75(3):399-407. doi:10.1016/j.eururo.2018.08.032
169. Kolonel LN, Henderson BE, Hankin JH, et al. A multiethnic cohort in Hawaii and
Los Angeles: baseline characteristics. Am J Epidemiol.
2000;151(4):346-57.
170. Wang H, Haiman CA, Kolonel LN, et al. Self-reported ethnicity, genetic structure
and the impact of population stratification in a multiethnic study. Hum Genet.
2010;128(2):165-77. doi:10.1007/s00439-010-0841-4
171. Mitrunen K, Pettersson K, Piironen T, Bjork T, Lilja H, Lovgren T. Dual-label one-
step immunoassay for simultaneous measurement of free and total prostate-specific
antigen concentrations and ratios in serum. Clin Chem. 1995;41(8 Pt 1):1115-20.
172. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D.
Principal components analysis corrects for stratification in genome-wide association
studies. Nat Genet. 2006;38(8):904-9. doi:10.1038/ng1847
173. Gann PH, Hennekens CH, Stampfer MJ. A prospective evaluation of plasma
prostate-specific antigen for detection of prostatic cancer. JAMA. 1995;273(4):289-94.
174. Harmon BE, Boushey CJ, Shvetsov YB, et al. Associations of key diet-quality
indexes with mortality in the Multiethnic Cohort: the Dietary Patterns Methods Project.
Am J Clin Nutr. 2015;101(3):587-97. doi:10.3945/ajcn.114.090688
175. Park SY, Boushey CJ, Wilkens LR, Haiman CA, Le Marchand L. High-Quality
Diets Associate With Reduced Risk of Colorectal Cancer: Analyses of Diet Quality
Indexes in the Multiethnic Cohort. Gastroenterology. 2017;153(2):386-94.e2.
doi:10.1053/j.gastro.2017.04.004
176. Mavaddat N, Michailidou K, Dennis J, et al. Polygenic Risk Scores for Prediction
of Breast Cancer and Breast Cancer Subtypes. Am J Hum Genet. 2019;104(1):21-34.
doi:10.1016/j.ajhg.2018.11.002
177. Pal Choudhury P, Wilcox AN, Brook MN, et al. Comparative Validation of Breast
Cancer Risk Prediction Models and Projections for Future Risk Stratification. J Natl
Cancer Inst. 2020;112(3):278-85. doi:10.1093/jnci/djz113
178. Conomos MP, Miller MB, Thornton TA. Robust inference of population structure
for ancestry prediction and correction of stratification in the presence of relatedness.
Genet Epidemiol. 2015;39(4):276-93. doi:10.1002/gepi.21896
Chapter 2: Association of Prostate-Specific Antigen Levels with
Prostate Cancer Risk in a Multiethnic Population: Stability over
Time and Comparison with Polygenic Risk Score
Authors: Alisha Chou¹, Burcu F. Darst¹,², Lynne R. Wilkens³, Loïc Le Marchand³, Hans
Lilja⁴, David V. Conti¹, Christopher A. Haiman¹
¹ Center for Genetic Epidemiology, Department of Population and Public Health
Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA,
U.S.A.
² Present address: Public Health Sciences Division, Fred Hutchinson Cancer Research
Center, Seattle, WA, U.S.A.
³ Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
⁴ Departments of Laboratory Medicine, Surgery, and Medicine, Memorial Sloan
Kettering Cancer Center, New York, NY, U.S.A.; and Department of Translational
Medicine, Lund University, Malmö, Sweden
Running title: PSA levels and PCa Risk in a Multiethnic Population
Corresponding Author:
Christopher A. Haiman, Sc.D
1450 Biggy Street, NRT-1504
Los Angeles, CA 90089 USA
1-323-442-7755
haiman@usc.edu
Conflict of interest: Dr. Lilja is named on a patent for intact PSA assays and a patent for
a statistical method to detect prostate cancer that has been licensed to and
commercialized as the 4Kscore test by OPKO Health. Dr. Lilja receives royalties from
sales of this test and owns stock in OPKO Health. Dr. Lilja is a member of the Scientific
Advisory Board of Fujirebio Diagnostics Inc. and owns stock in Diaprost AB and in
Acousort AB.
Abstract
Background: Studies in men of European ancestry suggest prostate-specific antigen
(PSA) as a marker of early prostate cancer (PCa) development that may help to risk-
stratify men earlier in life.
Methods: We examined the association between PCa risk and PSA levels measured up to
10+ years before a PCa diagnosis in 2,245 cases and 2,203 controls among African
American, Latino, Japanese, Native Hawaiian, and White men in the Multiethnic Cohort.
We also compared the discriminative ability of PSA to that of a polygenic risk score
(PRS) for PCa.
Results: Excluding cases diagnosed within 2 and 10 years of blood draw, men with PSA
above the median had a PCa OR (95% CIs) of 9.12 (7.66-10.92) and 3.52 (2.50-5.03),
respectively, compared to men with PSA below the median. A PSA level above the
median identified 90% and 75% of cases diagnosed more than 2 and 10 years after
blood draw, respectively. The associations were significantly greater for Gleason ≤7 vs.
8+ disease. At 10+ years, the association of PCa with PSA was comparable to that with
the PRS (OR per SD increase: 1.88 (1.45-2.46) and 2.12 (1.55-2.93), respectively).
Conclusions: We found PSA to be an informative marker of PCa risk at least a decade
before diagnosis across multiethnic populations. This association was diminished with
increasing time, greater for low grade tumors, and comparable to a PRS when
measured 10+ years before diagnosis.
Impact: Our multiethnic investigation suggests broad clinical implications for the utility of
PSA and PRS for risk stratification in PCa screening practices.
Introduction
Prostate-specific antigen (PSA) is the biomarker most commonly used for prostate
cancer (PCa) screening, and multiple studies in the U.S. and Europe have shown that
PSA levels in midlife are associated with increased risk of PCa later in life (1-9). This
association likely results from increased PSA production in the prostate, reflecting the
development of cancer. Studies have also reported midlife PSA levels to be an indicator
of disease aggressiveness (10-14). A nested case-control study in the Malmö
Preventive cohort found that 44% of PCa deaths occurred in men within the top 10th
percentile of PSA levels measured in men 45 to 55 years of age, 25 to 30 years prior to
death (15). PSA levels assessed in midlife, 9 years (median) before a PCa diagnosis,
were also associated with risk of overall and lethal PCa in a primarily White population
under 60 in the Physicians' Health Study (16). A more recent study in the Prostate,
Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial also found PSA levels
measured 6 years (median) before a PCa diagnosis in middle-aged men (55 to 60 years
old) to be significantly associated with risk of overall and clinically significant PCa (17).
These studies in men of European ancestry suggest PSA may be a marker of very early
PCa development that may help to risk-stratify men earlier in life (18).
There is limited information on PSA levels and risk of PCa in non-White
populations and in men older than 60. In a small study among African American men
40-64 years of age in the Southern Community Cohort Study (SCCS) with a median
follow-up time of 9 years, men with PSA levels in the top decile had a >30-fold increase
in risk of PCa compared to those with PSA levels at or below the median (19). In the current
population-based study, we examined whether PSA levels in older adult men (mean
age at blood draw: 68) measured up to 10+ years before a PCa diagnosis were
associated with PCa among African American (AA), Latino (LA), Japanese (JA), Native
Hawaiian (NH), and Non-Hispanic White (WH) men in the Multiethnic Cohort (MEC). To
compare with previous studies, we additionally examined the association between
midlife PSA and risk of PCa in men aged 60 years or younger. We also compared the
discriminative ability of PSA years before diagnosis to that of the multi-ancestry
polygenic risk score (PRS) for PCa (20).
Materials and Methods
Participants
The MEC is a large prospective multiethnic cohort established between 1993 and
1996 to investigate risk of cancer and other chronic conditions among individuals living
in Hawaii and Los Angeles (21). Participants from the self-reported AA, LA, JA, NH, and
WH racial/ethnic groups completed a detailed baseline questionnaire that collected
information on cancer risk factors and health conditions (22). On the third questionnaire
(between 2003-2008), 72% of men reported having undergone PSA screening. Blood
samples were collected between 1994-2006 from approximately 67,000 participants for
nested case-control studies of cancer. PCa cases and information on Gleason score
and stage were identified through linkage of the MEC to Surveillance, Epidemiology,
and End Results (SEER) cancer registries in Hawaii and California. Cancer mortality
was determined by routine linkages to state death files and to the National Death Index
for deaths that occurred outside of Hawaii and California. Inclusion and exclusion
criteria were as follows: after exclusion of prevalent PCa cases, the current study
included 2,245 incident PCa cases diagnosed after blood collection and 2,203 controls.
The cases in the study are incident cases, while controls are men selected because
they did not have a prostate cancer diagnosis before inclusion in the study. Around
50% of men underwent PSA screening in the MEC. While controls with undiagnosed
disease may have been included, this would only lead to attenuation of the associations
reported in the current study. Controls were men matched to cases on
race/ethnicity, age at blood draw (± 5 years), location (Hawaii or Los Angeles), hours of
fasting (± 2 hours), year of collection (± 0.5 years), and time of blood draw (± 3 hours).
The primary outcomes assessed were overall PCa versus controls, 8+ versus ≤ 7
Gleason score disease, and localized vs. non-localized disease. Subgroup analyses
were also conducted for lethal PCa, defined as metastatic PCa or death from PCa. All
participants provided written informed consent, and study protocols were approved by
the Institutional Review Boards overseeing research on human subjects at the
University of Hawaii and the University of Southern California.
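As a rough illustration of the matching criteria listed above, the following Python sketch checks whether a candidate control falls within the stated tolerances for a given case. The `Participant` record, its field names, and the `is_matched` helper are hypothetical and are not taken from the MEC data dictionary; only the tolerances come from the text, and the example values are invented.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical record holding only the matching variables."""
    ethnicity: str          # e.g. "AA", "LA", "JA", "NH", "WH"
    location: str           # "Hawaii" or "Los Angeles"
    age_at_draw: float      # years
    fasting_hours: float    # hours fasted before blood draw
    collection_year: float  # decimal calendar year of blood collection
    draw_hour: float        # time of blood draw, hour of day (0-24)


def is_matched(case: Participant, control: Participant) -> bool:
    """Return True if the control satisfies every matching tolerance."""
    return (
        case.ethnicity == control.ethnicity
        and case.location == control.location
        and abs(case.age_at_draw - control.age_at_draw) <= 5
        and abs(case.fasting_hours - control.fasting_hours) <= 2
        and abs(case.collection_year - control.collection_year) <= 0.5
        and abs(case.draw_hour - control.draw_hour) <= 3
    )


# Invented example values, for illustration only
case = Participant("JA", "Hawaii", 68.0, 10.0, 2001.5, 9.0)
close_control = Participant("JA", "Hawaii", 71.0, 11.0, 2001.75, 11.0)
older_control = Participant("JA", "Hawaii", 75.0, 11.0, 2001.75, 11.0)

print(is_matched(case, close_control))  # within every tolerance
print(is_matched(case, older_control))  # age differs by 7 years
```

The sketch only tests eligibility of a single case-control pair; how eligible controls were actually sampled for each case is not described in this section.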
Laboratory Methods
Immunoassay measurements of total PSA were performed using AutoDELFIA®
1235 automatic immunoassay systems, which have been previously described (23).
Total PSA values were measured using the dual label DELFIA® ProStatus™ total PSA
Assay calibrated in accordance with WHO standards. All measurements were
performed blinded to case-control status.
Samples were processed in two batches, with 38 duplicate samples
processed in both batches. Measurements between batches for these 38 samples were
highly correlated, and coefficients of variation within batches one and two were
comparable, as detailed in a previous study (24).
Genotyping, Quality Control, and Genotype Imputation
The genotype data used in this project were derived from multiple GWAS
of cancer and other phenotypes in the MEC. For all projects, Illumina Infinium arrays
were used, with imputation conducted using Minimac4 and the 1000 Genomes (1000G)
Project reference panel (Phase 3 v5). Both subject call rates and variant call rates were
≥ 0.95. Ethnic-specific frequencies were calculated and compared to corresponding
ethnic groups in Phase 3 1000G for quality control. Info score filtering was not
implemented in an effort to include all 269 PCa-associated variants (20), as poor
imputation is likely to introduce only non-differential bias. Average r² was 0.88 and only
2 SNPs (0.7%) had r² below 0.30. Principal components were calculated using
EIGENSTRAT (25) with 20,202 independent common variants to adjust for potential
confounding due to genetic ancestry.
Statistical Analyses
Association between PSA and PCa
PSA was log-transformed and geometric means were reported adjusted for age
at blood draw (and race/ethnicity in the overall sample). In association analyses, age-
and ethnicity-adjusted PSA was assessed using residuals from a linear model of log-
transformed PSA with covariates for age at blood draw, race/ethnicity, and an
interaction term between age at blood draw and race/ethnicity. PSA increases with age
and differs between racial/ethnic groups. The interaction between age at blood draw
and race/ethnicity was included to standardize each subject’s PSA by these factors. In the
current study, the age- and ethnicity-adjusted residual PSA is referred to as simply
‘PSA’. PSA percentile cutoffs were determined using all controls or controls within
race/ethnicity group. Unconditional logistic regression was used to estimate odds ratios
(ORs) and 95% confidence intervals (CIs) for the association between PSA levels and
risk of PCa phenotypes, adjusting for BMI at blood draw, laboratory batch, and the
matching factors. Corresponding conditional logistic models formally accounting for the
matching led to qualitatively similar effect estimates but were limited to a smaller sample
size. Analyses were also conducted excluding cases diagnosed within 2, 5, and 10
years of blood draw to assess the temporal stability of the association between PSA
and PCa risk. We also conducted analyses in men aged 60 or younger to examine the
relationship between midlife PSA and risk of PCa.
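
The residual-based PSA adjustment described above can be sketched in a few lines. This is a minimal illustration, not the code used in this study (analyses were run in R); the function name `adjusted_log_psa` and the toy records are hypothetical. It relies on the fact that a linear model with age, race/ethnicity, and their interaction fits a separate intercept and age slope within each group, so its residuals equal those from one simple regression per group.

```python
import math
from collections import defaultdict


def adjusted_log_psa(records):
    """Return residuals of log(PSA) after a per-group linear fit on age.

    `records` is a list of (psa, age, group) tuples. Fitting one line per
    group is equivalent, in terms of residuals, to a single model with
    age, group, and age-by-group interaction terms.
    """
    by_group = defaultdict(list)
    for idx, (psa, age, group) in enumerate(records):
        by_group[group].append((idx, age, math.log(psa)))

    residuals = [0.0] * len(records)
    for rows in by_group.values():
        n = len(rows)
        mean_age = sum(age for _, age, _ in rows) / n
        mean_y = sum(y for _, _, y in rows) / n
        sxx = sum((age - mean_age) ** 2 for _, age, _ in rows)
        sxy = sum((age - mean_age) * (y - mean_y) for _, age, y in rows)
        slope = sxy / sxx if sxx > 0 else 0.0
        intercept = mean_y - slope * mean_age
        for idx, age, y in rows:
            residuals[idx] = y - (intercept + slope * age)
    return residuals


# Toy records: (PSA in ng/mL, age at blood draw, group label). All values are invented.
toy = [
    (1.0, 55, "A"), (1.4, 62, "A"), (2.1, 70, "A"), (2.0, 76, "A"),
    (0.8, 58, "B"), (1.1, 64, "B"), (1.2, 69, "B"), (1.9, 77, "B"),
]
resid = adjusted_log_psa(toy)
```

By construction the residuals average to zero within each group, so a positive value marks a PSA that is higher than expected for a man of that age and group.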
Polygenic Risk Scores (PRS)
We also evaluated the discriminative ability of PSA relative to a multi-ancestry
PRS for PCa that we previously reported to be strongly associated with PCa risk (20).
Of the 4,448 MEC participants, 3,110 had genotype data (24) that could be used to
construct the PRS. A weighted PRS was calculated for each participant as the sum of
the number of risk alleles carried by an individual, weighted by multi-ancestry variant-
specific effects for 269 PCa-associated variants, as previously described (20). While
participants in the current analysis were included in the multi-ancestry PCa GWAS, the
PCa PRS is unlikely to be noticeably impacted by overfitting because the number of
subjects included in the multi-ancestry PCa GWAS meta-analysis (234,253 subjects)
was substantially larger compared to that of the current analysis (3,110 subjects).
Moreover, we have previously shown that after accounting for the within sample bias
using bias-corrected estimates, results were essentially unchanged (20). Since PRS
distributions have been shown to differ by populations (20), residuals from a linear
regression model of PRS adjusted for race/ethnicity were used in combined analyses
including all populations. Principal components were calculated using 20,202
independent common variants to adjust for potential confounding by genetic ancestry
(25). In a series of independent unconditional logistic regression models adjusting for
the matching factors and the first 10 principal components, we evaluated the
association between PCa risk and 1) PRS, 2) PSA and 3) both PRS and PSA, with
cases stratified based on time between blood draw and diagnosis (2+, 5+, and 10+
years). ORs are reported per standard deviation (SD) increase in PRS and PSA to
better compare the relative associations of each factor. Model AUCs (area under the
curve) were calculated to compare PSA, PRS and PSA + PRS models’ ability to
discriminate between PCa outcomes and controls.
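
As a rough illustration of the weighted PRS and its race/ethnicity adjustment, the sketch below sums risk-allele counts multiplied by per-variant weights, then takes residuals from a regression on race/ethnicity alone, which for a single categorical covariate amounts to subtracting the group mean. The three variants, their weights, and the group labels are invented for the example; the study used 269 variants with multi-ancestry weights (20).

```python
from collections import defaultdict


def weighted_prs(risk_allele_counts, weights):
    """Sum of risk-allele counts (or imputed dosages) times per-variant weights."""
    if len(risk_allele_counts) != len(weights):
        raise ValueError("one weight is needed per variant")
    return sum(count * weight for count, weight in zip(risk_allele_counts, weights))


def residualize_by_group(scores, groups):
    """Residuals from regressing scores on a categorical group: score minus group mean."""
    totals = defaultdict(lambda: [0.0, 0])
    for score, group in zip(scores, groups):
        totals[group][0] += score
        totals[group][1] += 1
    return [
        score - totals[group][0] / totals[group][1]
        for score, group in zip(scores, groups)
    ]


# Hypothetical per-variant weights (for example, log odds ratios) and genotypes.
weights = [0.10, 0.25, 0.05]
genotypes = [[2, 1, 0], [0, 1, 2], [1, 1, 1], [2, 2, 0]]
groups = ["A", "A", "B", "B"]

prs = [weighted_prs(g, weights) for g in genotypes]
prs_adjusted = residualize_by_group(prs, groups)
```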
Lorenz curves were used to characterize the proportion of PCa cases captured
by various PSA or PRS cut points (12). For most analyses we focus on reporting effect
estimates and corresponding confidence intervals. We only rely on p-values for
assessment of effect heterogeneity between: Gleason ≤7 and Gleason 8+ PCa;
Gleason ≤6, Gleason 7 and Gleason 8+ PCa; non-localized and localized PCa; and,
PCa outcomes across race/ethnicity groups. In this analysis, we consider each outcome
as an independent hypothesis and consider statistical significance in 2 ways: 1) Two-sided
p-values <0.05, treating every outcome and subgroup analysis as an independent
hypothesis. 2) Two-sided p-value < $\alpha$, where $\alpha = \frac{0.05}{n_{\text{subgroups}}} = \frac{0.05}{8} \approx 0.006$, for treating
each outcome as an independent hypothesis but correcting for multiple testing across 8
subgroups (2 PSA percentile subgroups x 4 time subgroups) within each outcome.
Analyses were performed using R (R Foundation for Statistical Computing, Vienna,
Austria, 2015).
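
The quantity summarized by the Lorenz curves, the share of cases whose marker value exceeds a cut point defined on the control distribution, can be illustrated as follows. This is a simplified sketch with made-up values; the helper names and the percentile rule (the smallest control value with at least the stated percentage of controls at or below it) are assumptions, and the study's exact percentile definition may differ.

```python
import math


def control_cutoff(control_values, percentile):
    """Smallest control value with at least `percentile` percent of controls at or below it."""
    ordered = sorted(control_values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def share_of_cases_above(case_values, control_values, percentile):
    """Fraction of cases whose value exceeds the control-based cut point."""
    cutoff = control_cutoff(control_values, percentile)
    return sum(value > cutoff for value in case_values) / len(case_values)


# Invented marker values for ten controls and five cases.
controls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cases = [3, 6, 7, 8, 12]

# One point of the curve per cut point: (percentile, share of cases captured).
curve = [(p, share_of_cases_above(cases, controls, p)) for p in (50, 75, 90)]
```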
Data availability statement
The Multiethnic Cohort investigators and institutions affirm their intention to share
the research data consistent with all relevant NIH resource/data sharing policies. Data
requests should be submitted through MEC online data request system at
https://www.uhcancercenter.org/for-researchers/mec-data-sharing.
Results
The mean age at blood draw for cases was 68 (range: 47-86) and 69 (47-87) for
controls (Table 1). Among cases, mean age at diagnosis was 73 years, with a mean
timespan between blood draw and a PCa diagnosis of 4.9 years (range: <1 year to 18
years), with 82%, 49%, and 11% having a timespan longer than 2, 5, and 10 years,
respectively. In the multiethnic sample, the median PSA level was 1.21 ng/mL (range:
0.05-90.3 ng/mL; IQR: 0.70-2.30 ng/mL) in controls and 3.38 ng/mL (range: 0.11-250
ng/mL; IQR: 2.00-5.59 ng/mL) in cases (Table 1). Among both PCa controls and cases,
older age at blood draw was significantly associated with higher PSA level (P<0.001
and P=0.004, respectively). Age-adjusted geometric mean PSA was significantly
different across race/ethnicity groups in cases (P<0.001) but not in controls (P=0.056).
Among cases, compared to Whites, African Americans and Latinos had higher PSA
levels, whereas Native Hawaiians and Japanese had lower PSA levels. Among controls,
Latinos had the same PSA level as Whites while African Americans, Japanese and
Native Hawaiians had lower levels compared to Whites (Table 1).
Compared to men with a PSA level at or below the median in the overall
population, ORs (95% CIs) for total PCa in men with a PSA above the 50th and 90th
percentiles were 10.05 (8.50-11.93) and 24.54 (19.75-30.67), respectively (Table 2).
Significant effect heterogeneity (P<0.05) was observed between Gleason ≤7 PCa and
Gleason 8+ PCa for men with a PSA above the 50th and 90th percentiles: ORs for
Gleason ≤7 PCa were higher [11.89 (9.75-14.61), and 27.49 (21.53-35.39),
respectively], while ORs for Gleason 8+ PCa were lower [6.65 (5.07-8.85), and 17.04
(12.40-23.71), respectively] (Table 2). This trend was also observed when comparing
risks for Gleason ≤6, Gleason 7 and Gleason 8+ tumors (Supplementary Table S1).
After considering multiple tests across subgroups, significant effect heterogeneity (P <
0.006) was observed for PSA above the 50th percentile. Odds ratios for localized PCa
were lower [10.26 (8.53-12.42), and 24.20 (19.21-30.68), respectively], compared to
non-localized PCa [11.63 (7.80-18.14), and 29.81 (18.94-48.78), respectively], although
the differences were not statistically significant (Table 2).
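
As a check on scale, crude (unadjusted) odds ratios can be computed directly from the total PCa case and control counts in Table 2. The sketch below is illustrative only: the function name is hypothetical, and because the reported ORs come from logistic models adjusted for BMI, laboratory batch, and the matching factors, the crude values are expected to be close to, but not identical with, the reported estimates.

```python
def crude_odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref):
    """Unadjusted odds ratio from a 2x2 table of counts."""
    return (cases_exposed * controls_ref) / (cases_ref * controls_exposed)


# Total PCa, all cases and controls (Table 2).
# Reference group (PSA at or below the median): 202 cases, 1092 controls.
# PSA above the median: 1982 cases, 1092 controls.
# PSA above the 90th percentile: 922 cases, 219 controls.
or_above_median = crude_odds_ratio(1982, 1092, 202, 1092)
or_above_90th = crude_odds_ratio(922, 219, 202, 1092)
```

With these counts the crude ORs are about 9.8 and 22.8, compared with the adjusted estimates of 10.05 and 24.54.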
The positive association between PSA and PCa risk attenuated with time
between the PSA measurement and PCa diagnosis (Table 2). Excluding cases
diagnosed within 2, 5, and 10 years of blood draw, compared to men with PSA below
the median, men with PSA above the median had ORs for total PCa (95% CIs) of 9.12
(7.66-10.92), 6.65 (5.43-8.20), and 3.52 (2.50-5.03), respectively. Excluding cases
diagnosed within 2, 5, and 10 years of blood draw, the magnitudes of the associations
were consistently and significantly (P-heterogeneity <0.05) higher for Gleason ≤7
compared to Gleason 8+ PCa (Table 2). After considering multiple tests across
subgroups, significant effect heterogeneity (P < 0.006) was observed for PSA above the
50th percentile. The magnitudes of the associations were lower for localized compared to
non-localized PCa, although the differences were not statistically significant (Table 2).
Overall, the magnitudes of the associations were greater for all other race/ethnicity groups
compared to Whites, although statistically significant effect heterogeneity was only
observed for total PCa and PSA above the median versus below the median measured
5+ years before a PCa diagnosis when considering a single test (P < 0.05), but not after
consideration of multiple tests (P < 0.006) (Table 3 and Supplementary Table S2).
As shown in Figure 1 and Supplementary Table S3, 91% of all PCa cases
occurred among those with a PSA level above the median, while 42% of cases occurred
among those with a PSA level in the top 10th percentile. Excluding cases diagnosed
within 2, 5, and 10 years of blood draw, 90%, 86%, and 75% of all PCa cases occurred
among those with a PSA level above the median, respectively, while 36%, 26%, and
19% of cases occurred among those with a PSA level in the top 10th percentile,
respectively. These percentages were higher for Gleason ≤7 PCa (92%, 91%, and 88%)
compared with Gleason 8+ PCa (87%, 85%, 80%) among those with PSA above the
median (Supplementary Table S3, Supplementary Figure S1-S2), whereas percentages
were similar for localized (91%, 90%, 85%) and non-localized PCa (92%, 92%, 89%)
(Supplementary Table S3, Supplementary Figure S3-S4). There were 171 lethal cases,
with 141, 79, and 19 diagnosed more than 2, 5, and 10 years since blood draw,
respectively. At 10+ years since blood draw, 68% of lethal cases occurred among men
with a PSA above the median, whereas 16% occurred in men in the top 10th percentile.
Percentages of total PCa and different disease characteristics captured were similar
across populations (Figures 1, Supplementary Figures S1-S4, Supplementary Table
S3).
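
The percentages quoted for total PCa can be reproduced from the case counts in Table 2, assuming that the cases at or below the median and the cases above the median together make up all cases in each analysis. The counts below are transcribed from the total PCa rows of that table; the variable and function names are illustrative.

```python
# Total PCa case counts from Table 2:
# (PSA at or below the median, PSA above the median, PSA above the 90th percentile)
total_pca_cases = {
    "all cases": (202, 1982, 922),
    "2+ years": (184, 1600, 644),
    "5+ years": (151, 915, 274),
    "10+ years": (62, 182, 46),
}


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


above_median = {}
top_decile = {}
for label, (low, high, top) in total_pca_cases.items():
    n_cases = low + high  # assumption: these two groups cover every case
    above_median[label] = percent(high, n_cases)
    top_decile[label] = percent(top, n_cases)
```

This yields 91%, 90%, 86%, and 75% of cases above the median and 42%, 36%, 26%, and 19% in the top decile, matching the values reported in the text.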
For 436 cases and 342 controls aged 60 or younger at blood draw, mean age at
blood draw was 57 (range: 47-60). Among cases, the average timespan between blood
draw and a PCa diagnosis was 6.3 years (range: <1 year to 18 years). In this younger
group of men, ORs for total PCa were elevated compared to those estimated in the full
sample, although no significant effect heterogeneity was observed (Supplementary
Tables S4-S6). Compared to men with PSA below the median, men with PSA above the
median had an OR (95% CI) of 16.22 (10.43-26.17), and ORs of 14.81 (9.51-23.91),
12.91 (7.97-21.83), and 7.44 (3.69-16.31) when excluding cases diagnosed within 2, 5,
and 10 years of blood draw, respectively (Supplementary Table S4). As shown in
Supplementary Figure S5 and Supplementary Table S7 for men aged 60 or younger,
94% of all PCa cases occurred among those with a PSA level above the median, which
decreased only slightly to 93%, 92%, and 85% when excluding cases diagnosed within 2,
5, and 10 years of blood draw. As observed in all men, the percentages were higher for
Gleason ≤7 vs. Gleason 8+ tumors (Supplementary Table S7).
Last, we compared the performance of PSA measured 2, 5 and 10 years before
diagnosis to a multi-ancestry PCa PRS (20). The correlation (r) between the PRS and
PSA decreased with time from PSA measurement to diagnosis. The correlation was
0.24 in the overall population, 0.10 in cases, and 0.12 in controls. Excluding cases
diagnosed within 2, 5 and 10 years of blood draw, the correlation between the PCa PRS and PSA
was 0.22, 0.19 and 0.14, respectively in the overall population, and 0.08, 0.06 and 0.11,
respectively in cases. As shown in Figure 1, 42% of cases occurred among those in the
top 10th percentile of the PSA distribution, with this number decreasing to 36%, 26%,
and 19% when excluding cases diagnosed within 2, 5 and 10 years of blood draw,
respectively, whereas 26% of cases occurred among those in the top 10th percentile of
the PRS distribution. While PSA is more informative closer to the time of diagnosis, at
10+ years, the magnitude of the association of PSA (OR per SD increase: 1.88, 95% CI:
1.45-2.46) was comparable to that of the PRS (OR per SD increase: 2.12, 95% CI:
1.55-2.93) in a model that mutually adjusted the effects of both PSA and PRS (Figure
2). PRS and PSA measured 10+ years before diagnosis captured a similar percentage
of cases across percentiles of PRS and PSA for all PCa outcomes (e.g., Gleason ≤7 or
8+, localized, non-localized, and lethal disease) (Supplementary Figure S1-S4), and the
magnitudes of the associations of PSA were comparable to that of the PRS
(Supplementary Figure S6). In case-case comparisons of Gleason 8+ vs ≤7 PCa and
localized vs. non-localized disease, no statistically significant difference was observed
between PSA and PRS when excluding cases diagnosed within 2, 5 and 10 years of
blood draw (Supplementary Figure S7). Comparing model AUCs between PSA, PRS
and PSA+PRS similarly showed that while PSA’s discriminative ability was better closer
to the time of diagnosis, at 10+ years, the discriminative ability of PSA was comparable
to that of PRS (Supplementary Table S8). AUCs for PSA+PRS were comparable to those
of PSA across time. These observations were similar for all PCa outcomes.
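
For readers less familiar with the AUC, it equals the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison on invented scores; the AUCs reported here were derived from fitted logistic models that include covariates, so this illustrates the metric rather than the study's procedure.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise case-control comparisons."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(case_scores) * len(control_scores))


# Invented risk scores for three cases and three controls.
toy_case_scores = [0.9, 0.8, 0.4]
toy_control_scores = [0.5, 0.3, 0.2]
toy_auc = auc(toy_case_scores, toy_control_scores)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls.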
Discussion
In this multiethnic study, we found that a PSA measurement taken 5 years on
average before diagnosis was associated with PCa risk. The association was observed
to be consistent across racial/ethnic populations and was significantly stronger for men with
low- versus high-grade disease, but was similar for advanced and lethal versus localized
disease. We also found PSA to be less effective as a marker of risk with increased
length of time since measurement, and at 10+ years before diagnosis, the magnitude of
the association of PSA with PCa risk was observed to be equivalent to that of the PRS.
Our findings in this multiethnic population with opportunistic screening are
consistent with studies in U.S. White and African American men (16,17,19). In a nested
case-control study in the PHS of primarily White men aged 40 to 59 years, with PSA
measured a median of 9 years before PCa diagnosis, ORs (95% CI) for total PCa in
men with a baseline PSA above the median and 75th percentile were 8.7 (5.5-13.9) and
14.1 (8.6-23.3), respectively, compared to men with PSA below the median (16). Among
White men aged 60 or younger in our study, we observed comparable effect sizes for
total PCa of 7.9 (3.8-17.4) and 14.0 (6.2-34.6) with PSA above the median and 75th
percentile, respectively, measured 6 years on average before PCa diagnosis. We also
observed similar effect sizes for total PCa in the other racial/ethnic groups. In PHS, ORs
(95% CI) for lethal PCa in men with a baseline PSA above the median and 90th
percentile were 3.1 (1.6-6.1) and 7.4 (3.3-16.6), respectively (16). In our multiethnic
study, with PSA measured at least 5 years before diagnosis, we observed similar ORs
for lethal PCa in men with a PSA above the median and 90th percentile [3.6 (2.1-6.5)
and 7.5 (3.9-14.7), respectively]. In a small nested case-control study in the SCCS
among African American men, ORs for PCa in men with a baseline PSA above the
median and 90th percentile were 18.8 (9.5-42.3) and 71.5 (31.0-190), respectively (19),
with similar effect sizes reported in a subset of men (n=91) with aggressive PCa. Our
results are consistent with these previous U.S. studies that emphasized PSA being
predictive of future lethal PCa. Our findings are also in line with the consistent
observation from these studies that PSA does not differentiate between indolent
and lethal PCa.
Opportunistic PSA screening is a potential limitation in interpreting results of U.S.-based
studies examining PSA as a predictive marker for PCa and lethal disease.
Screened men with higher PSA levels when measured may be at greater risk of
eventually having a tumor that would have progressed but are more likely to have their
cancer detected and treated. This results in fewer non-localized and lethal cases in this
group of men compared to a non-screened population. As a result, this would lead to an
attenuation of the association of PSA levels years before diagnosis with overall PCa
and advanced stage disease (e.g., non-localized, metastatic, and lethal PCa). This
potential bias may explain the lack of significant differences in the PSA association by
disease stage in our study and in the previous U.S.-based studies conducted following
the introduction of PSA screening. However, we also found the magnitude of PSA
associations to be significantly stronger for less aggressive Gleason ≤7 versus 8+
tumors, which does not support the underlying hypothesis that PSA is a marker for more
aggressive disease. This unexpected result is less likely to be due to opportunistic
screening as grade progression is less common (26). We expect that the results from
this multiethnic study of older men (and from the analysis of men ≤60) undergoing
opportunistic screening would be generalizable to men in the U.S. today.
As reported in previous studies (19,27), we observed that PSA is less effective
as a marker of PCa risk with increased length of time since PSA measurement, with
90% of cases having a PSA above the median among cases diagnosed more than 2 years
after blood draw, which was reduced to 75% for cases diagnosed 10+ years after blood draw;
these percentages were similar for the other PCa endpoints and were 86% and 68% for lethal
PCa, respectively. When limited to cases diagnosed 10+ years after a PSA
measurement, we found the association of PSA (OR per SD increase=1.9, 95% CI: 1.5-
2.5) to be comparable to that of the PRS (OR per SD increase=2.1, 95% CI: 1.6-2.9).
Given the attenuated effect of PSA with time to diagnosis, the PRS assigned risk at birth
is likely to be a comparable indicator of risk earlier in life until PCa starts to develop and
PSA levels rise. Based on our findings, PSA appears to only be more effective than
PRS as an indicator of risk within 10 years of diagnosis, with both PRS and PSA having
limited ability to differentiate risk of advanced versus localized disease.
In conclusion, in this multiethnic study population with opportunistic screening,
PSA was significantly associated with PCa risk, with the effectiveness of PSA for risk
prediction significantly attenuated with time to PCa diagnosis. Our findings among older
men suggest that PSA is informative as a marker of risk within 10 years of diagnosis,
whereas the PRS is comparable for risk stratification earlier in life. While we did not find
PSA to differentiate risk of advanced versus localized disease, only a small fraction of
non-localized (23%) or lethal disease (32%) occurred in men with PSA levels below the
median, diagnosed 10 or more years after blood draw. This suggests, as indicated by
others (17), that a risk-stratified approach to screening is warranted (based on early life
PSA and/or PRS), with men at low risk being screened less frequently than men at high
risk, which would translate into fewer biopsies, associated complications, and
overdiagnoses for men at lower risk of dying from PCa.
Acknowledgements
The Multiethnic Cohort Study (MEC) is supported by NIH/NCI grant U01
CA164973. This work was supported by the National Cancer Institute at the National
Institutes of Health (grant numbers U01 CA164973 to C. Haiman and K99 CA246063 to
B. Darst), the Prostate Cancer Foundation (grants 21YOUN11 to B. Darst and
20CHAS03 to C. Haiman), and the Achievement Rewards for College Scientists
Foundation Los Angeles Founder Chapter to B. Darst. H. Lilja was supported in part by
the National Institutes of Health/National Cancer Institute (NIH/NCI) with a Cancer
Center Support Grant to Memorial Sloan Kettering Cancer Center (P30 CA008748), a
SPORE grant in Prostate Cancer to H. Scher (P50 CA092629), and by a grant award
from the Swedish Cancer Society to H. Lilja (Cancerfonden 20 1354 PjF).
References
1. Fang J, Metter EJ, Landis P, Chan DW, Morrell CH, Carter HB. Low levels of
prostate-specific antigen predict long-term risk of prostate cancer: results from
the Baltimore Longitudinal Study of Aging. Urology 2001;58:411-6
2. Lilja H, Cronin AM, Dahlin A, Manjer J, Nilsson PM, Eastham JA, et al. Prediction
of significant prostate cancer diagnosed 20 to 30 years later with a single
measure of prostate-specific antigen at or before age 50. Cancer 2011;117:1210-9
3. Loeb S, Roehl KA, Antenor JA, Catalona WJ, Suarez BK, Nadler RB. Baseline
prostate-specific antigen compared with median prostate-specific antigen for age
group as predictor of prostate cancer risk in men younger than 60 years old.
Urology 2006;67:316-20
4. Parkes C, Wald NJ, Murphy P, George L, Watt HC, Kirby R, et al. Prospective
observational study to assess value of prostate specific antigen as screening test
for prostate cancer. BMJ 1995;311:1340-3
5. Whittemore AS, Cirillo PM, Feldman D, Cohn BA. Prostate specific antigen levels
in young adulthood predict prostate cancer risk: results from a cohort of Black
and White Americans. J Urol 2005;174:872-6; discussion 6
6. Whittemore AS, Lele C, Friedman GD, Stamey T, Vogelman JH, Orentreich N.
Prostate-Specific Antigen as Predictor of Prostate Cancer in Black Men and
White Men. J Natl Cancer Inst 1995;87:354-9
7. Antenor JA, Han M, Roehl KA, Nadler RB, Catalona WJ. Relationship between
initial prostate specific antigen level and subsequent prostate cancer detection in
a longitudinal screening study. J Urol 2004;172:90-3
8. Holmström B, Johansson M, Bergh A, Stenman UH, Hallmans G, Stattin P.
Prostate specific antigen for early detection of prostate cancer: longitudinal study.
BMJ 2009;339:b3537
9. Tang P, Sun L, Uhlman MA, Robertson CN, Polascik TJ, Albala DM, et al. Initial
prostate specific antigen 1.5 ng/ml or greater in men 50 years old or younger
predicts higher prostate cancer risk. J Urol 2010;183:946-50
10. Kuller LH, Thomas A, Grandits G, Neaton JD. Elevated prostate-specific antigen
levels up to 25 years prior to death from prostate cancer. Cancer Epidemiol
Biomarkers Prev 2004;13:373-7
11. Stattin P, Vickers AJ, Sjoberg DD, Johansson R, Granfors T, Johansson M, et al.
Improving the Specificity of Screening for Lethal Prostate Cancer Using Prostate-specific
Antigen and a Panel of Kallikrein Markers: A Nested Case-Control Study.
Eur Urol 2015;68:207-13
12. Vickers AJ, Cronin AM, Bjork T, Manjer J, Nilsson PM, Dahlin A, et al. Prostate
specific antigen concentration at age 60 and death or metastasis from prostate
cancer: case-control study. BMJ 2010;341:c4521
13. Connolly D, Black A, Gavin A, Keane PF, Murray LJ. Baseline prostate-specific
antigen level and risk of prostate cancer and prostate-specific mortality:
diagnosis is dependent on the intensity of investigation. Cancer Epidemiol
Biomarkers Prev 2008;17:271-8
14. Orsted DD, Nordestgaard BG, Jensen GB, Schnohr P, Bojesen SE. Prostate-
specific antigen and long-term prediction of prostate cancer incidence and
mortality in the general population. Eur Urol 2012;61:865-74
15. Vickers AJ, Ulmert D, Sjoberg DD, Bennette CJ, Bjork T, Gerdtsson A, et al.
Strategy for detection of prostate cancer based on relation between prostate
specific antigen at age 40-55 and long term risk of metastasis: case-control
study. BMJ 2013;346:f2023
16. Preston MA, Batista JL, Wilson KM, Carlsson SV, Gerke T, Sjoberg DD, et al.
Baseline Prostate-Specific Antigen Levels in Midlife Predict Lethal Prostate
Cancer. J Clin Oncol 2016;34:2705-11
17. Kovac E, Carlsson SV, Lilja H, Hugosson J, Kattan MW, Holmberg E, et al.
Association of Baseline Prostate-Specific Antigen Level With Long-term
Diagnosis of Clinically Significant Prostate Cancer Among Patients Aged 55 to 60
Years: A Secondary Analysis of a Cohort in the Prostate, Lung, Colorectal, and
Ovarian (PLCO) Cancer Screening Trial. JAMA Netw Open 2020;3:e1919284
18. Ross KS, Carter HB, Pearson JD, Guess HA. Comparative efficiency of prostate-specific
antigen screening strategies for prostate cancer detection. JAMA
2000;284:1399-405
19. Preston MA, Gerke T, Carlsson SV, Signorello L, Sjoberg DD, Markt SC, et al.
Baseline Prostate-specific Antigen Level in Midlife and Aggressive Prostate
Cancer in Black Men. Eur Urol 2019;75:399-407
20. Conti DV, Darst BF, Moss LC, Saunders EJ, Sheng X, Chou A, et al. Trans-
ancestry genome-wide association meta-analysis of prostate cancer identifies
new susceptibility loci and informs genetic risk prediction. Nat Genet 2021;53:65-75
21. Kolonel LN, Henderson BE, Hankin JH, Nomura AM, Wilkens LR, Pike MC, et al.
A multiethnic cohort in Hawaii and Los Angeles: baseline characteristics.
Am J Epidemiol 2000;151:346-57
22. Wang H, Haiman CA, Kolonel LN, Henderson BE, Wilkens LR, Le Marchand L,
et al. Self-reported ethnicity, genetic structure and the impact of population
stratification in a multiethnic study. Hum Genet 2010;128:165-77
23. Mitrunen K, Pettersson K, Piironen T, Bjork T, Lilja H, Lovgren T. Dual-label one-
step immunoassay for simultaneous measurement of free and total prostate-
specific antigen concentrations and ratios in serum. Clin Chem 1995;41:1115-20
24. Darst BF, Chou A, Wan P, Pooler L, Sheng X, Vertosick EA, et al. The Four-
Kallikrein Panel Is Effective in Identifying Aggressive Prostate Cancer in a
Multiethnic Population. Cancer Epidemiol Biomarkers Prev 2020;29:1381-8
25. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D.
Principal components analysis corrects for stratification in genome-wide
association studies. Nat Genet 2006;38:904-9
26. Penney KL, Stampfer MJ, Jahn JL, Sinnott JA, Flavin R, Rider JR, et al. Gleason
grade progression is uncommon. Cancer Res 2013;73:5163-8
27. Gann PH, Hennekens CH, Stampfer MJ. A prospective evaluation of plasma
prostate-specific antigen for detection of prostatic cancer. JAMA 1995;273:289-94
Tables
Table 1. Descriptive characteristics of prostate cancer cases and controls.

| Characteristic | All, cases | All, controls | White, cases | White, controls | African American, cases | African American, controls | Native Hawaiian, cases | Native Hawaiian, controls | Japanese, cases | Japanese, controls | Latino, cases | Latino, controls |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | 2245 | 2203 | 432 | 422 | 476 | 456 | 138 | 139 | 757 | 748 | 442 | 438 |
| Age at blood draw ≤55, n (%) | 86 (3.8) | 68 (3.1) | 18 (4.2) | 13 (3.1) | 13 (2.7) | 11 (2.4) | 15 (10.9) | 10 (7.2) | 31 (4.1) | 29 (3.9) | 9 (2.0) | 5 (1.1) |
| Age at blood draw 56-65, n (%) | 767 (34.2) | 673 (30.5) | 172 (39.8) | 161 (38.2) | 127 (26.7) | 108 (23.7) | 64 (46.4) | 67 (48.2) | 243 (32.1) | 194 (25.9) | 161 (36.4) | 143 (32.6) |
| Age at blood draw 66-75, n (%) | 977 (43.5) | 970 (44.0) | 164 (38.0) | 151 (35.8) | 243 (51.1) | 240 (52.6) | 50 (36.2) | 55 (39.6) | 309 (40.8) | 308 (41.2) | 211 (47.7) | 216 (49.3) |
| Age at blood draw >75, n (%) | 415 (18.5) | 492 (22.3) | 78 (18.1) | 97 (23.0) | 93 (19.5) | 97 (21.3) | 9 (6.5) | 7 (5.0) | 174 (23.0) | 217 (29.0) | 61 (13.8) | 74 (16.9) |
| Mean age at blood draw (SD) | 68.0 (7.6) | 69.1 (7.6) | 67.1 (8.0) | 68.5 (8.1) | 68.9 (7.0) | 69.5 (6.9) | 64.2 (7.2) | 64.9 (6.6) | 68.7 (8.1) | 70.1 (8.2) | 67.8 (6.7) | 68.7 (6.5) |
| Mean PSAᵃ (SD), ng/mL | 3.5 (0.1) | 1.3 (0.03) | 3.5 (0.1) | 1.4 (0.1) | 3.8 (0.2) | 1.3 (0.1) | 3.3 (0.2) | 1.2 (0.1) | 3.2 (0.1) | 1.2 (0.04) | 3.8 (0.2) | 1.4 (0.1) |
| Case characteristics | | | | | | | | | | | | |
| Mean age at diagnosis (SD) | 72.8 (7.5) | -- | 72.0 (8.0) | -- | 73.5 (7.2) | -- | 70.0 (7.0) | -- | 73.6 (7.7) | -- | 72.5 (6.7) | -- |
| Mean time between blood draw and diagnosis (SD), years | 4.9 (3.4) | -- | 4.8 (3.5) | -- | 4.7 (3.5) | -- | 5.8 (3.8) | -- | 4.9 (3.4) | -- | 4.7 (3.1) | -- |
| Stageᵇ: Localized, n (%) | 1764 (84.9) | -- | 332 (81.4) | -- | 373 (87.6) | -- | 107 (79.3) | -- | 623 (86.5) | -- | 329 (84.6) | -- |
| Stageᵇ: Regional, n (%) | 213 (10.3) | -- | 44 (10.8) | -- | 32 (7.5) | -- | 22 (16.3) | -- | 74 (10.3) | -- | 41 (10.5) | -- |
| Stageᵇ: Metastatic, n (%) | 101 (4.9) | -- | 32 (7.8) | -- | 21 (4.9) | -- | 6 (4.4) | -- | 23 (3.2) | -- | 19 (4.9) | -- |
| Gleason scoreᵇ ≤6, n (%) | 692 (38.0) | -- | 126 (34.1) | -- | 109 (39.9) | -- | 38 (29.7) | -- | 240 (35.1) | -- | 179 (48.5) | -- |
| Gleason scoreᵇ 7, n (%) | 709 (38.9) | -- | 148 (40.1) | -- | 115 (42.1) | -- | 51 (39.8) | -- | 264 (38.7) | -- | 131 (35.5) | -- |
| Gleason scoreᵇ ≥8, n (%) | 421 (23.1) | -- | 95 (25.7) | -- | 49 (17.9) | -- | 39 (30.5) | -- | 179 (26.2) | -- | 59 (16.0) | -- |
| Lethal PCaᶜ, n (%) | 178 (7.9) | -- | 43 (10.0) | -- | 59 (12.4) | -- | 8 (5.8) | -- | 32 (4.2) | -- | 36 (8.1) | -- |

Abbreviations: PSA, prostate-specific antigen; PCa, prostate cancer.
ᵃ Geometric mean PSA (SD), age-adjusted ANCOVA test by race/ethnicity: P<0.001 in cases, P=0.06 in controls.
ᵇ Numbers do not sum to total due to missing information on stage and Gleason.
ᶜ Lethal PCa: Metastatic PCa or death from PCa.
Table 2: Association between PSA and prostate cancer risk.

| Outcome | PSA percentilesᵃ | All cases and controls: OR (95% CI)ᵇ | All cases and controls: n, ca/co | Excluding cases diagnosed within 2 years of blood draw: OR (95% CI)ᵇ | Excluding within 2 years: n, ca/co | Excluding cases diagnosed within 5 years of blood draw: OR (95% CI)ᵇ | Excluding within 5 years: n, ca/co | Excluding cases diagnosed within 10 years of blood draw: OR (95% CI)ᵇ | Excluding within 10 years: n, ca/co |
|---|---|---|---|---|---|---|---|---|---|
| Total PCa | ≤50th | (ref.) | 202/1092 | (ref.) | 184/1092 | (ref.) | 151/1092 | (ref.) | 62/1092 |
| Total PCa | >50th | 10.05 (8.50-11.93) | 1982/1092 | 9.12 (7.66-10.92) | 1600/1092 | 6.65 (5.43-8.20) | 915/1092 | 3.52 (2.50-5.03) | 182/1092 |
| Total PCa | >90th | 24.54 (19.75-30.67) | 922/219 | 20.02 (15.93-25.33) | 644/219 | 10.85 (8.29-14.30) | 274/219 | 4.4 (2.69-7.21) | 46/219 |
| Gleason ≤7 PCa | ≤50th | (ref.) | 127/1092 | (ref.) | 114/1092 | (ref.) | 92/1092 | (ref.) | 42/1092 |
| Gleason ≤7 PCa | >50th | 11.89 (9.75-14.61)ᶜ | 1483/1092 | 10.87 (8.81-13.52)ᶜ | 1183/1092 | 8.25 (6.46-10.64)ᶜ | 672/1092 | 3.79 (2.52-5.83) | 124/1092 |
| Gleason ≤7 PCa | >90th | 27.49 (21.53-35.39)ᶜ | 673/219 | 22.77 (17.52-29.85)ᶜ | 462/219 | 12.89 (9.37-17.92)ᶜ | 189/219 | 4.26 (2.38-7.64) | 30/219 |
| Gleason ≥8 PCa | ≤50th | (ref.) | 65/1092 | (ref.) | 61/1092 | (ref.) | 53/1092 | (ref.) | 18/1092 |
| Gleason ≥8 PCa | >50th | 6.65 (5.07-8.85)ᶜ | 425/1092 | 5.89 (4.44-7.92)ᶜ | 356/1092 | 3.90 (2.84-5.45)ᶜ | 208/1092 | 2.86 (1.62-5.26) | 49/1092 |
| Gleason ≥8 PCa | >90th | 17.04 (12.40-23.71)ᶜ | 210/219 | 13.24 (9.47-18.75)ᶜ | 154/219 | 6.95 (4.64-10.49)ᶜ | 70/219 | 3.33 (1.39-7.72) | 11/219 |
| Localized PCa | ≤50th | (ref.) | 155/1092 | (ref.) | 141/1092 | (ref.) | 114/1092 | (ref.) | 49/1092 |
| Localized PCa | >50th | 10.26 (8.53-12.42) | 1553/1092 | 9.18 (7.56-11.21) | 1235/1092 | 6.52 (5.21-8.24) | 668/1092 | 3.18 (2.16-4.76) | 123/1092 |
| Localized PCa | >90th | 24.20 (19.21-30.68) | 733/219 | 19.66 (15.41-25.27) | 509/219 | 10.11 (7.53-13.67) | 195/219 | 3.50 (1.98-6.17) | 29/219 |
| Non-localized PCa | ≤50th | (ref.) | 25/1092 | (ref.) | 21/1092 | (ref.) | 18/1092 | (ref.) | 9/1092 |
| Non-localized PCa | >50th | 11.63 (7.80-18.14) | 284/1092 | 11.61 (7.53-18.87) | 237/1092 | 8.50 (5.26-14.58) | 145/1092 | 4.09 (1.92-9.65) | 30/1092 |
| Non-localized PCa | >90th | 29.81 (18.94-48.78) | 130/219 | 25.14 (15.29-43.24) | 90/219 | 16.45 (9.12-31.09) | 45/219 | 4.32 (1.21-14.54) | 5/219 |
| Lethal PCaᵈ | ≤50th | (ref.) | 24/1092 | (ref.) | 22/1092 | (ref.) | 17/1092 | (ref.) | 6/1092 |
| Lethal PCaᵈ | >50th | 6.15 (4.03-9.80) | 147/1092 | 5.42 (3.47-8.86) | 119/1092 | 3.61 (2.12-6.46) | 62/1092 | 2.24 (0.85-6.65) | 13/1092 |
| Lethal PCaᵈ | >90th | 16.44 (10.26-27.28) | 82/219 | 13.13 (7.89-22.63) | 59/219 | 7.50 (3.91-14.73) | 25/219 | 3.14 (0.62-13.53) | 3/219 |

Abbreviations: PSA, prostate-specific antigen; OR, odds ratio; CI, confidence interval; ca, cases; co, controls; PCa, prostate cancer.
ᵃ PSA is the age- and ethnicity-adjusted residuals of log(PSA). Percentiles are based on the distribution of controls. See Supplementary Table S3 for PSA percentile values.
ᵇ Odds ratios were estimated using unconditional logistic regression, adjusting for BMI at blood draw, laboratory batch, and matching factors of race/ethnicity, age at blood draw, area, fasting hours, collection time and collection year.
ᶜ Heterogeneity P<0.05 between Gleason ≤7 PCa and Gleason ≥8 PCa, or Localized PCa and Non-localized PCa.
ᵈ Lethal PCa: Metastatic PCa or death from PCa.
Table 3: Association between PSA and risk of total prostate cancer by race/ethnicity groups.
Columns: Race/ethnicity; PSA percentiles [a]; then OR (95%CI) [b] and n, ca/co for each of four analyses, in this order: all cases and controls; excluding cases diagnosed within 2 years of blood draw; excluding cases diagnosed within 5 years of blood draw; excluding cases diagnosed within 10 years of blood draw.
White ≤50th (ref.) 56/211 (ref.) 52/211 (ref.) 44/211 (ref.) 17/211
>50th 6.69 (4.76-9.52) 365/210 5.79 (4.06-8.38) 284/210 3.74 (2.49-5.72) [c] 153/210 1.99 (0.98-4.20) 32/210
>75th 11.99 (8.20-17.81) 306/105 10.11 (6.80-15.33) 227/105 6.14 (3.88-9.91) 113/105 2.34 (1.03-5.36) 19/105
African American ≤50th (ref.) 35/228 (ref.) 32/228 (ref.) 30/228 (ref.) 12/228
>50th 12.29 (8.39-18.48) 420/227 10.56 (7.08-16.24) 323/227 6.44 (4.16-10.29) [c] 185/227 4.36 (1.93-10.74) 37/227
>75th 18.38 (12.17-28.49) 309/114 14.82 (9.61-23.53) 221/114 7.8 (4.84-12.95) 113/114 4.22 (1.71-11.16) 21/114
Native Hawaiian ≤50th (ref.) 13/69 (ref.) 13/69 (ref.) 11/69 (ref.) 8/69
>50th 11.64 (5.94-24.69) 125/69 10.5 (5.30-22.49) 108/69 10.12 (4.61-24.70) [c] 70/69 3.28 (1.16-10.41) 18/69
>75th 19.5 (9.21-45.26) 98/35 17.09 (7.96-40.34) 82/35 15.42 (6.24-43.80) 46/35 3.48 (0.97-13.64) 9/35
Japanese ≤50th (ref.) 62/372 (ref.) 53/372 (ref.) 41/372 (ref.) 17/372
>50th 11.23 (8.38-15.27) 681/371 10.91 (7.98-15.16) 559/371 8.08 (5.59-11.94) [c] 316/371 4.33 (2.40-8.27) 68/371
>75th 18.17 (13.29-25.20) 559/186 17.31 (12.40-24.58) 440/186 12.01 (8.04-18.36) 221/186 6.37 (3.26-13.10) 42/186
Latino ≤50th (ref.) 39/214 (ref.) 36/214 (ref.) 25/214 (ref.) 7/214
>50th 10.17 (7.00-15.09) 388/213 9.33 (6.34-14.08) 324/213 8.89 (5.51-14.90) [c] 191/213 6.25 (2.40-19.26) 28/213
>75th 17.2 (11.48-26.38) 319/107 15.19 (10.01-23.63) 258/107 12.84 (7.77-22.07) 144/107 6.8 (2.44-22.16) 18/107
Abbreviations: PSA, prostate-specific antigen; OR, odds ratio; CI, confidence interval; ca, cases; co, controls.
[a] PSA is the age-adjusted residuals of log(PSA). Percentiles are based on the distribution of controls. See Supplementary Table S3 for PSA percentile values.
[b] Odds ratios were estimated using unconditional logistic regression, adjusting for BMI at blood draw, laboratory batch, and matching factors of age at blood draw, area, fasting hours, collection time and collection year.
[c] Heterogeneity test P<0.05 across race/ethnicity associations by PSA percentile.
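As a rough illustration of how the case/control counts in Table 3 relate to the reported odds ratios, the sketch below computes a crude (unadjusted) odds ratio with a Woolf log-scale confidence interval from the counts for White men in the all-cases column. This is not the model used in the study: the reported ORs come from unconditional logistic regression with covariate adjustment, so the crude value is expected to differ somewhat from the tabulated 6.69. The function name and the choice of the Woolf interval are assumptions made for this example.

```python
import math


def crude_odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    "Exposed" is the higher PSA category and "ref" is the reference
    (<=50th percentile) category. Illustrative helper, not the study's model.
    """
    odds_ratio = (cases_exposed * controls_ref) / (cases_ref * controls_exposed)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / cases_exposed + 1 / controls_exposed + 1 / cases_ref + 1 / controls_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from Table 3, White, all cases and controls:
# >50th percentile = 365 cases / 210 controls; <=50th percentile = 56 cases / 211 controls.
or_, lo, hi = crude_odds_ratio(365, 210, 56, 211)
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The gap between the crude estimate and the tabulated one reflects the adjustment for BMI, laboratory batch, and the matching factors listed in the table footnote.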
Figures
Figure 1. Lorenz curves showing the proportion of total prostate cancer cases captured
by PSA and PRS measurements above indicated risk cutoffs. Proportions for PSA are
shown by time between PSA measurement and diagnosis. Results are shown in the A)
overall, B) White, C) African American, D) Native Hawaiian, E) Japanese, and F) Latino
populations. The x-axis shows proportion of the total population with PSA levels or PRS
values above indicated risk cutoffs, hence percentages run from 100 to 0. The y-axis
shows proportion of cases with PSA levels or PRS values above the risk cutoff indicated
by the x-axis. PSA, prostate-specific antigen; PRS, polygenic risk score; Dx, diagnosis;
PCa, prostate cancer.
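The quantity plotted in Figure 1 can be sketched as follows: for a given fraction of the population above a risk cutoff (x-axis), compute the fraction of cases whose value also lies above that cutoff (y-axis). The sketch assumes that the control distribution stands in for the population when defining cutoffs, as with the percentile definitions used in the tables; the function name and the toy scores are hypothetical and do not come from the study data.

```python
def capture_curve(case_scores, control_scores, top_fractions):
    """For each top fraction of controls, return the fraction of cases captured.

    The cutoff for a fraction f is the lowest score among the top f of
    controls; a case is "captured" if its score is at or above that cutoff.
    """
    ranked_controls = sorted(control_scores, reverse=True)
    points = []
    for frac in top_fractions:
        n_top = round(frac * len(ranked_controls))
        if n_top == 0:
            # No controls above the cutoff: define the captured fraction as zero.
            points.append((frac, 0.0))
            continue
        cutoff = ranked_controls[n_top - 1]
        captured = sum(score >= cutoff for score in case_scores) / len(case_scores)
        points.append((frac, captured))
    return points


# Toy data (hypothetical): cases are shifted toward higher scores than controls.
controls = list(range(1, 11))
cases = [3, 6, 8, 9, 9, 10, 10, 10]
points = capture_curve(cases, controls, [0.5, 0.1])
print(points)
```

A marker that discriminates well yields points far above the diagonal, meaning that a small top fraction of the population contains a large share of the cases.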
Figure 2. Associations of PRS and PSA with total PCa by time to diagnosis, from a model
mutually adjusted for PSA and PRS. The OR indicates the relative increase in
PCa risk per standard deviation increase in PRS and PSA. PRS, polygenic
risk score; PSA, prostate-specific antigen; PCa, prostate cancer; OR, odds ratio; CI,
confidence interval.
Chapter 3: Interaction of Polygenic Risk Score and Lifestyle
Factors on the Risk of Breast Cancer in a Multiethnic Population
Abstract
Purpose: A polygenic risk score (PRS) has demonstrated great potential in stratifying
breast cancer risk in non-African ancestry populations. Several modifiable lifestyle risk
factors have been identified for breast cancer, although little is known regarding their
effects among women with varying genetic risk.
Methods: In the Multiethnic Cohort (MEC), we conducted a nested case-control study of
3,229 breast cancer cases and 3,921 controls from five major racial/ethnic groups
(White, African American, Latino, Japanese American, and Native Hawaiian). We
examined a PRS of 313 variants in association with breast cancer risk and evaluated
the interaction with selected modifiable lifestyle factors on the risk of breast cancer. The
modifiable lifestyle factors examined included body mass index (BMI), physical activity,
smoking, alcohol consumption, and five diet quality indexes: the Healthy Eating
Index (HEI)-2010, the Alternative Healthy Eating Index (aHEI)-2010, the alternate
Mediterranean Diet score (aMED), the Dietary Approaches to Stop Hypertension score
(DASH), and the Dietary Inflammatory Index (DII).
Results: The 313-variant PRS was strongly associated with breast cancer risk across
the five racial/ethnic groups, with per SD OR being 2.07 (95% CI=1.63-2.63) in Native
Hawaiians, 1.72 (95% CI=1.54-1.92) in Whites, 1.56 (95% CI=1.36-1.77) in Latinas,
1.45 (95% CI=1.31-1.60) in Japanese Americans, and 1.32 (95% CI=1.20-1.44) in
African American women. We found that the association of BMI, physical activity, aHEI-
2010, HEI-2010, or DII with BCa risk depended on PRS. In the analysis of ER+ BCa,
among women with high genetic risk (50-100% of PRS), the OR was 0.65 (95%
CI=0.55-0.77) for women with a BMI <25 kg/m² (vs. BMI ≥25 kg/m²), while no
association between BMI and ER+ BCa was observed among women with low genetic
risk (0-50% of PRS, OR=0.98, 95%CI=0.81-1.19, PLRT =0.006). In the analysis of ER-
BCa, high physical activity was associated with a 10% lower risk of ER- BCa (95% CI =
0.82-1.00) among women with low genetic risk and the association was null among
women with high genetic risk (PLRT=0.049). Similarly for aHEI-2010 and HEI-2010, a
healthier diet (per SD increase) was associated with 15-19% lower risk of ER- BCa in
women with low genetic risk while no association was observed in women with high
genetic risk (PLRT < 0.009). Moreover, a more pro-inflammatory diet (higher DII) was
associated with an elevated risk of ER- BCa but this positive association was only
observed in women with low genetic risk (OR=1.23, 95% CI=1.05-1.44, PLRT = 0.028).
Conclusions: In line with previous reports, the 313-variant PRS was effective in
stratifying breast cancer risk, with diminished transferability for women of African
ancestry. Our findings also suggest that maintaining a healthy BMI may offset the
genetic risk of ER+ breast cancer, while adhering to a healthy dietary pattern or being
physically active may further reduce risk of ER- BCa for women with lower genetic risk.
Introduction
Breast cancer (BCa) is the most commonly diagnosed cancer and the second
leading cause of cancer death among U.S. women, with an estimated 288,580 new
cases and 43,250 deaths in 2022 (1). BCa burden and outcomes differ across
racial/ethnic groups. Although breast cancer incidence rates are slightly
higher in White women than in African American women, African American women are
more likely to be diagnosed at a younger age (< 40 years) and to have an aggressive
subtype of breast cancer (e.g. hormone receptor negative) that is associated with higher
mortality (2-5).
Several lifestyle risk factors are known to be associated with increased BCa risk,
including alcohol intake, exogenous hormonal use, decreased physical activity,
reproductive factors such as nulliparity and late age at first birth, and anthropometric
factors such as obesity and weight gain (6-15). Research on diet quality indexes (DQIs)
such as the Healthy Eating Index 2010 (HEI-2010), Alternate Healthy Eating Index 2010
(aHEI-2010), alternate Mediterranean diet score (aMED), the Dietary Approaches to
Stop Hypertension score (DASH), and the Dietary Inflammatory Index (DII) has shown
the benefits of a healthy diet in lowering cancer incidence and mortality (16-20).
However, evidence that these DQIs help prevent breast cancer has been inconsistent.
Although some studies have reported a lower risk of postmenopausal breast cancer in
women with a higher aHEI-2010, aMED, or DASH score, most studies found no
association between any DQI and total BCa risk (21-23), and very few have
examined the association by breast cancer subtype.
Genome-wide association studies (GWAS) have identified hundreds of germline
variants associated with BCa risk (24,25). Recently, a PRS for BCa consisting of 313
variants was developed based on GWAS summary statistics from populations of
European ancestry (26). This PRS has been shown to be effective in stratifying breast cancer
risk among women of European ancestry (26-34) but was less effective in women of
African ancestry (35) and Latinas (36). Moreover, a recent study in the Million Veteran
Program (MVP) demonstrated that the inclusion of PRS313 improved the prediction
performance of clinical models in women of European ancestry but not in women of
African ancestry (26,37). These findings suggest limited transferability of the breast
cancer PRS to non-European populations.
There is emerging evidence suggesting that gene-environment (GxE)
interactions also contribute to the susceptibility of breast cancer. A study in the Breast
Cancer Association Consortium (BCAC) among 58,684 women from 20 studies of
European descent found significant interactions of PRS with alcohol consumption (P-
interaction = 0.009) and the use of menopausal hormone therapy (P-interaction = 0.038)
on the risk of estrogen receptor-positive (ER+) breast cancer (38). In addition, two
studies in the UK Biobank among women of European descent investigated a BCa PRS
in combination with a lifestyle score and found evidence of an additive interaction
between the lifestyle score and PRS (39,40). Research on the interaction between
breast cancer genetics and lifestyle factors, in particular the dietary factors, in non-
European populations is scarce.
To better understand the contribution of GxE interactions to the risk of breast
cancer across racial/ethnic populations, in this case-control study we assessed a BCa
PRS of 313 variants for association with breast cancer risk, and evaluated the
interactions between this BCa PRS and selected lifestyle and dietary factors on breast
cancer risk across the five racial/ethnic groups in the MEC.
Materials and Methods
Study Population
The MEC is a large prospective multiethnic cohort established to investigate
chronic conditions and cancer among 215,000 individuals aged 45-75 who were
residents of the State of Hawaii and Los Angeles from 1993 to 1996 (41). Participants
were primarily self-reported African Americans (AA), Latinos (LA), Japanese Americans
(JA), Native Hawaiians (NH), and Non-Hispanic Whites (WH) who completed a detailed
baseline questionnaire to obtain information on chronic disease risk factors, health
conditions, and lifestyle and dietary factors (42). Incident cases of breast cancer and
additional information on cancer stage, grade, and hormone receptor status were
obtained through linkage of the cohort to the SEER (Surveillance, Epidemiology, and
End Results) tumor registries in Hawaii and California. The current study included 3,229
incident BCa cases and 3,921 controls from existing nested case-control studies of BCa
with available genotyping data. All participants provided written informed consent, and
study protocols were approved by the Institutional Review Boards at the University of
Hawaii and the University of Southern California.
Genotyping and Imputation
The BCa cases and controls included in this study were genotyped on three
Illumina arrays (Human1M-Duo, Human660W-Quad, and Infinium OncoArray-500K),
and were imputed to the 1000 Genomes Project reference panel (Phase 3 v5). All 313
BCa-associated variants included in the construction of the PRS (43) have an INFO
score > 0.30 across the five racial/ethnic groups. Principal components (PCs) of genetic
ancestry were calculated using KING [ref] and PC-AiR (44) with 15,678 independent
common variants (MAF > 5% in all racial/ethnic groups).
Statistical Analyses
Association between lifestyle risk factors and BCa
We performed multivariable logistic regression analyses to assess the association
with BCa risk for lifestyle factors including body mass index (BMI; <25 kg/m², ≥25 kg/m²),
physical activity (MET-hours/day), smoking status (never, former, and current), alcohol
consumption (never, ever), postmenopausal hormone therapy use (never use, past use,
current use of estrogen therapy, and current use of estrogen-progestin therapy), as well
as five DQIs (HEI-2010, aHEI-2010, aMED, DASH, and DII). All models adjusted for age,
1st-degree family history of BCa (negative, positive), race/ethnicity (White, African
American, Native Hawaiian, Japanese American, and Latino), BMI (continuous), parity
(nulliparity, 1, 2-3, and 4+ children), age at menarche (<=12, 13-14, and >14 years), age
at menopause (pre-menopause, <45, 45-49, 50+ years), smoking status, alcohol
consumption (per 10g/day), education (<=12, 13-15, and 15+ years), and hormone use,
where applicable. In the analysis of five DQIs, the DQIs were each modeled as a
continuous variable (per standard deviation [SD]), and the models additionally adjusted
for total calorie intake. Standardization of DQIs was based on their distribution in controls
only. In the subtype analysis, we also evaluated the associations of these lifestyle and
dietary factors with estrogen receptor-positive (ER+) and estrogen receptor-negative (ER-)
BCa. In all analyses, missing values were included as an “unknown” category so that all
subjects could be included in the analyses. All statistical analyses were performed with R
v.3.6 (R Foundation for Statistical Computing, Vienna, Austria, 2015). P values less than
0.05 were considered statistically significant. Tests of statistical significance were two-
sided.
Polygenic Risk Scores (PRS)
PRS was calculated for each participant as the weighted sum of the number of risk alleles
carried, with each variant weighted by the natural logarithm of its relative risk extracted
from BCa GWAS for the 313 BCa-associated variants (26). In this study, we constructed
three BCa PRS using the weights specific to total BCa, ER+ BCa, and ER- BCa (26)
and examined their associations with corresponding BCa phenotypes adjusting for age
and the top ten PCs. These analyses were performed separately in each racial/ethnic
group where PRS was modeled as a continuous variable (per SD). The PRS
standardization was based on the PRS distribution in controls within each racial/ethnic
group.
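The PRS construction described above amounts to PRS_i = sum over variants j of ln(RR_j) x g_ij, where g_ij is the number of risk alleles (or imputed dosage) that participant i carries at variant j, followed by standardization to the control distribution. The sketch below illustrates this with three hypothetical variants; the relative risks, genotypes, function names, and the use of the sample standard deviation are all assumptions for illustration and are not the published 313-variant weights or the study's code.

```python
import math
import statistics


def polygenic_risk_score(dosages, relative_risks):
    """Weighted sum of allele dosages, each weighted by ln(relative risk)."""
    if len(dosages) != len(relative_risks):
        raise ValueError("need one relative risk per variant")
    return sum(g * math.log(rr) for g, rr in zip(dosages, relative_risks))


def standardize_to_controls(scores, is_control):
    """Express scores in SD units of the control distribution.

    Intended to be called separately within each racial/ethnic group.
    """
    control_scores = [s for s, c in zip(scores, is_control) if c]
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)  # sample SD; an assumption here
    return [(s - mean) / sd for s in scores]


# Hypothetical per-allele relative risks for three variants.
relative_risks = [1.10, 1.05, 0.95]
# Hypothetical allele counts for four participants (one case, three controls).
genotypes = [[2, 1, 0], [0, 0, 2], [1, 2, 1], [1, 0, 0]]
is_control = [False, True, True, True]

raw = [polygenic_risk_score(g, relative_risks) for g in genotypes]
z = standardize_to_controls(raw, is_control)
print([round(v, 3) for v in z])
```

Standardizing within group means that a one-unit change in the score corresponds to one control SD in that group, which is what makes the per SD odds ratios comparable in scale across groups.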
Lifestyle Score
To evaluate the aggregate effects of lifestyle and dietary factors on BCa risk, we
developed a lifestyle score including smoking status, alcohol consumption, physical
activity, BMI, hormone use, and aHEI-2010. These factors were included because they
are established risk factors for BCa or potential risk factors which showed significant
associations in our study. The lifestyle score was constructed as the sum of an
individual’s status on each lifestyle and dietary factor, with each factor weighted by the
natural logarithm of its relative risk extracted from previous breast cancer studies in the MEC (21,45-49) or
from the most recent meta-analysis if not available in the MEC (11) (Supplementary Table 1).
We assessed the association of the lifestyle score with BCa risk both as a continuous
variable (per SD) and as a categorical variable (quartiles), adjusting for age,
race/ethnicity, family history of BCa, parity, age at menarche, age at menopause,
education, and total calorie intake.
Interaction between PRS and lifestyle risk factors
We performed stratified analyses to evaluate the association of each lifestyle and
dietary factor (BMI, physical activity, smoking, alcohol consumption, and the five DQIs)
as well as the lifestyle score with BCa risk among individuals with high (≥50%) and low
(< 50%) PRS. PRS categories (high vs. low) were determined based on the distribution
in controls within each racial/ethnic group, and then pooled across the groups in the
interaction analyses. To maximize the statistical power, the lifestyle and dietary factors
were modeled either as a continuous variable (per SD) or as a dichotomized variable.
A likelihood ratio test (LRT) was used to evaluate the interaction between PRS and
each lifestyle risk factor by comparing models with and without an interaction term, adjusting for
all other covariates and the top ten PCs.
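The likelihood ratio test described above can be sketched as follows, assuming the two nested logistic models have already been fitted and their maximized log-likelihoods are available. The sketch covers the case of a single interaction term (one degree of freedom), as when a dichotomized PRS is crossed with a continuous or dichotomized lifestyle factor; the log-likelihood values shown are hypothetical, and the function name is illustrative rather than taken from the study's R code.

```python
import math


def likelihood_ratio_test_1df(loglik_without, loglik_with):
    """LRT for one added parameter (e.g. a single PRS-by-lifestyle term).

    Returns the test statistic and its p-value under a chi-square
    distribution with 1 degree of freedom.
    """
    stat = 2.0 * (loglik_with - loglik_without)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # For 1 df, the chi-square upper tail is P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical maximized log-likelihoods of the two nested models.
stat, p = likelihood_ratio_test_1df(loglik_without=-1520.40, loglik_with=-1516.62)
print(f"LRT statistic = {stat:.2f}, P = {p:.4f}")
```

When the interaction involves a factor with more than two levels, the test has more than one degree of freedom and a general chi-square tail function is needed in place of the closed form used here.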
Results
This study included 3,229 BCa cases and 3,921 controls, of which 2,308 were
ER+ BCa cases and 564 were ER- BCa cases (Table 1). Family history of BCa in 1st-
degree relatives was significantly associated with an elevated risk of total, ER+ or ER-
BCa (Supplementary Table 2). A positive association with total BCa and ER+ BCa was
also observed for BMI, alcohol consumption, and hormone use. The associations of
these factors with ER- BCa were not statistically significant. No association was
observed for physical activity or smoking. Although the associations between DQIs and
BCa risk were mostly null, there was a suggestive inverse association with total BCa for
aHEI-2010 (per SD OR=0.96, 95%CI=0.91-1.01, P = 0.09). When aggregated, the per
SD increase in lifestyle score was associated with a 1.13-fold risk of total BCa (95%
CI=1.07-1.18, P<0.01) and women in the 4th quartile were 28% more likely to be
diagnosed with BCa (OR=1.28, 95% CI=1.11-1.48, P<0.01) than those in the 1st
quartile. The associations of lifestyle score were similar for ER+ BCa and ER- BCa
(Supplementary Table 2).
The 313-variant PRS was strongly associated with breast cancer risk across the
five racial/ethnic groups, with per SD OR being 2.07 (95% CI=1.63-2.63) in Native
Hawaiians, 1.72 (95% CI=1.54-1.92) in Whites, 1.56 (95% CI=1.36-1.77) in Latinas,
1.45 (95% CI=1.31-1.60) in Japanese Americans, and 1.32 (95% CI=1.20-1.44) in
African American women (Figure 1). Significant and similar associations were observed
across race/ethnicity groups for ER+ BCa. For ER- BCa, the association for per SD
increase in PRS was 1.96 (95% CI=1.25-3.09) in Native Hawaiians, 1.58 (95% CI=1.29-
1.95) in Whites, 1.52 (95% CI=1.22-1.88) in Latinas, 1.51 (95% CI=1.28-1.78) in African
Americans, and 1.50 (95% CI=1.23-1.82) in Japanese American women (Figure 1).
In the stratified analyses, we found that the association of BMI, physical activity,
aHEI-2010, HEI-2010, or DII with BCa risk depended on PRS (Table 2). In the analysis of
ER+ BCa, among women with high genetic risk (50-100% of PRS), the OR was 0.65
(95% CI=0.55-0.77) for women with a BMI <25 kg/m² (vs. BMI ≥25 kg/m²), while no
association between BMI and ER+ BCa was observed among women with low genetic
risk (0-50% of PRS, OR=0.98, 95%CI=0.81-1.19, PLRT =0.006). This interaction
between BMI and PRS on the risk of ER+ BCa was directionally consistent across the
five racial/ethnic groups and appeared to be more evident in Latinas (PLRT=0.004;
Supplementary Table 3). In the analysis of ER- BCa, high physical activity was
associated with a 10% lower risk of ER- BCa (95% CI = 0.82-1.00) among women with
low genetic risk and the association was null among women with high genetic risk
(PLRT=0.049). Similarly for aHEI-2010 and HEI-2010, a healthier diet (per SD increase)
was associated with 15-19% lower risk of ER- BCa in women with low genetic risk while
no association was observed in women with high genetic risk (PLRT < 0.009). Although
only suggestive (PLRT < 0.11), a similar trend was also observed for aMED and DASH.
Moreover, a more pro-inflammatory diet (higher DII) was associated with an elevated
risk of ER- BCa but this positive association was only observed in women with low
genetic risk (OR=1.23, 95% CI=1.05-1.44, PLRT = 0.028). No statistically significant
interaction was detected for smoking, alcohol consumption, hormone use or lifestyle
score (Table 2).
The joint effect of PRS and BMI showed that women in the 4th PRS quartile with
unhealthy BMI (>25 kg/m²) were 3.58 (95%CI = 2.80-4.58) times as likely to develop
ER+ BCa compared to women with healthy BMI (≤25 kg/m²) in the 1st PRS quartile
(Supplementary Figure S1). Furthermore, among women in the 50th-75th percentile PRS
category, the ER+ BCa risk of those with a healthy BMI (OR [95%CI] = 1.52
[1.17-1.99]) was comparable to that of women with unhealthy BMI in the lower 25th-50th
percentile PRS category (OR [95%CI] = 1.59 [1.23-2.06]), indicating that maintaining a healthy BMI
can lower the risk of ER+ BCa for women in a higher PRS category. For ER- BCa, women
with an unhealthy diet score for aHEI-2010 (below median), HEI-2010 (below median) or
DII (above median), or with low physical activity (below median), in the highest genetic risk
quartile were 3.25 (95%CI = 2.15-4.91), 3.53 (95%CI = 2.27-5.48), 3.19 (95%CI = 2.13-4.79),
and 2.45 (95%CI = 1.63-3.68) times as likely to develop ER- BCa, respectively,
compared to women with healthy diet in the lowest genetic risk quartile (Supplementary
Figure 1). For women in the 25th-50th percentile PRS category with an unhealthy
aHEI-2010 diet, the risk of ER- BCa (OR [95%CI] = 2.14 [1.39-3.29]) was similar to that of
women in the 50th-75th percentile PRS category with a healthy aHEI-2010 diet (OR [95%CI] =
2.38 [1.55-3.65]). Similar significant associations were observed for HEI-2010, DII, and
physical activity (Supplementary Figure 1).
Lastly, for lifestyle score, even though no significant interaction with PRS was
detected by likelihood ratio test, stratified analysis indicated that a high lifestyle risk
score (above the median), with reference to low lifestyle risk score (below the median),
was significantly associated with 17% increased risk of ER+ BCa in women with high
genetic risk while no association was observed in women with low genetic risk (Table
2). The joint effect of PRS and lifestyle score indicated that women in the highest PRS
quartile with unhealthy lifestyle score were 3.12 (95%CI = 2.48-3.91) and 4.34 (95%CI =
2.80-6.71) times as likely to develop ER+ and ER- BCa, respectively, compared to
women with healthy lifestyle score in the 1st PRS quartile (Supplementary Figure 1).
Discussion
In this multiethnic study, we found that the 313-variant PRS was strongly
associated with breast cancer risk across the five racial/ethnic groups, but the
associations were weaker in African American women, which is consistent with the limited
transferability of the breast cancer PRS to this population (35-37). Our findings
suggest that maintaining a healthy BMI may offset the genetic risk of ER+ breast cancer
among women with high genetic risk while adhering to a healthy dietary pattern or being
physically active may further reduce risk of ER- BCa for women with lower genetic risk.
Furthermore, we examined the association between a lifestyle score and risk of BCa.
While we observed a significant association between lifestyle score and risk of BCa, we
did not observe a significant interaction between PRS and lifestyle score on risk of BCa.
In line with published reports, the 313-variant PRS was comparably effective in
stratifying breast cancer risk in Whites, Japanese Americans and Latinas, with
diminished transferability for African American women (26,35,36). In our study, the
per SD OR for PRS and total BCa was 1.72 in Whites, 1.45 in Japanese Americans,
1.56 in Latinas, and 1.32 in African American women, which were largely consistent
with previously reported OR of 1.49-1.71 in women of European ancestry
(26,36,38,50,51), 1.44-1.67 in Asian women (52-54), 1.17-1.58 in Latinas (36,55), and
1.05-1.27 in women of African ancestry (35,36). Our results of diminished PRS
transferability in women of African ancestry are consistent with the observations in Liu
et al., who also found a smaller effect size of PRS per SD for women of African
ancestry (OR=1.19; 95%CI=1.05-1.35) compared with women of European (OR=1.19;
95%CI=1.05-1.35) and Latina ancestries (OR=1.31; 95%CI=1.09-1.58) (36). In this first
evaluation of the breast cancer PRS in Native Hawaiians, we found that the PRS
derived from women of European ancestry generalized well to
Native Hawaiian women. In fact, associations were the strongest in Native Hawaiian
women across all three breast cancer outcomes examined (OR from 1.96 to 2.07, P <
0.01), though this could be due to the small sample size for Native Hawaiians.
Because risks conferred by the majority of individual SNPs are not sufficiently large
to be useful in risk prediction, it is of particular interest to utilize PRS, which aggregates
genetic susceptibility associated with a large number of variants, for detecting the global
patterns of interaction and mitigating the multiple-testing issue in genome-wide GxE
studies. In the current study, we found that the association of BMI with risk of ER+ BCa
depended on PRS, where a healthy weight was associated with a 35% lower risk of
ER+ BCa in women with higher genetic risk, but showed no association among women
with low genetic risk. In contrast to our findings, a recent study of 27,305 ER+ cases and
46,137 controls of European ancestry from the Breast Cancer Association Consortium
(BCAC) found no interaction between a 313-variant PRS and BMI on risk of ER+
BCa (56). This discrepancy may be partially due to the differences in racial/ethnic
composition of the two studies given that our study consisted of women from five
diverse racial/ethnic groups. Although underpowered, our ethnic-specific results
suggest that the interaction between PRS and BMI was directionally consistent across
the ethnic groups and appeared to be more evident among Latinas.
Our study has strengths and limitations. Our study utilized comprehensive data
on genetic, lifestyle, and dietary factors in a racially/ethnically diverse case-control
study population from the MEC, which is largely representative of its source populations
(57). This allowed us to simultaneously assess the association of genetic, lifestyle, and
dietary factors with risk of BCa across racial/ethnic groups and by BCa subtypes. The
PRS tested in our study was predominantly developed in populations of European
ancestry (58) and has been shown to have lower transferability to some
non-European populations. This could further affect our power to detect its potential
interactions with non-genetic factors. Due to the high missingness on hormone receptor
status, we did not have sufficient information to identify other BCa subtypes such as
triple-negative BCa, which prevented us from examining the GxE interaction for this
aggressive form of BCa that is more likely to occur in African American women. In
addition, our ethnic-specific analysis had very limited power, which could explain the
mostly non-significant interactions in each ethnic group. Lastly, we cannot rule out that
the significant observations were due to multiple testing. After accounting for multiple
testing, Kapoor et al. did not observe a significant interaction between the 313-SNP PRS and
any of the classical lifestyle risk factors. For the same variables examined in both
studies, these results are in line with the null associations that we observed in our
current study. Further investigations are warranted to validate our results.
Our results demonstrated generalizability of breast cancer PRS in non-European
populations. Our results also suggested that breast cancer genetic risk could be
potentially modified by lifestyle factors, which has implications for preventive strategies
aimed at modifying lifestyle risk factors. In combination with lifestyle risk factors, PRS
could improve our ability to distinguish women at different levels of breast cancer risk in
the general population, which could have profound impact on prevention and screening
strategies for breast cancer.
References
1. Siegel RL, Miller KD, Jemal A. Cancer Statistics, 2017. CA Cancer J Clin
2017;67:7-30
2. Althuis MD, Fergenbaum JH, Garcia-Closas M, Brinton LA, Madigan MP,
Sherman ME. Etiology of hormone receptor-defined breast cancer: a systematic
review of the literature. Cancer Epidemiol Biomarkers Prev 2004;13:1558-68
3. DeSantis CE, Ma J, Gaudet MM, Newman LA, Miller KD, Goding Sauer A, et al.
Breast cancer statistics, 2019. CA Cancer J Clin 2019;69:438-51
4. Ma H, Bernstein L, Pike MC, Ursin G. Reproductive factors and breast cancer
risk according to joint estrogen and progesterone receptor status: a meta-
analysis of epidemiological studies. Breast Cancer Res 2006;8:R43
5. Rauscher GH, Campbell RT, Wiley EL, Hoskins K, Stolley MR, Warnecke RB.
Mediation of Racial and Ethnic Disparities in Estrogen/Progesterone Receptor-
Negative Breast Cancer by Socioeconomic Position and Reproductive Factors.
Am J Epidemiol 2016;183:884-93
6. Menarche, menopause, and breast cancer risk: individual participant meta-
analysis, including 118 964 women with breast cancer from 117 epidemiological
studies. Lancet Oncol 2012;13:1141-51
7. Beral V, Reeves G, Bull D, Green J. Breast cancer risk in relation to the interval
between menopause and starting hormone therapy. J Natl Cancer Inst
2011;103:296-305
8. Guo W, Fensom GK, Reeves GK, Key TJ. Physical activity and breast cancer
risk: results from the UK Biobank prospective cohort. Br J Cancer 2020;122:726-
32
9. Hamajima N, Hirose K, Tajima K, Rohan T, Calle EE, Heath CW, Jr., et al.
Alcohol, tobacco and breast cancer--collaborative reanalysis of individual data
from 53 epidemiological studies, including 58,515 women with breast cancer and
95,067 women without the disease. Br J Cancer 2002;87:1234-45
10. Hunter DJ, Colditz GA, Hankinson SE, Malspeis S, Spiegelman D, Chen W, et al.
Oral contraceptive use and breast cancer: a prospective study of young women.
Cancer Epidemiol Biomarkers Prev 2010;19:2496-502
11. Matthews CE, Moore SC, Arem H, Cook MB, Trabert B, Håkansson N, et al.
Amount and Intensity of Leisure-Time Physical Activity and Lower Cancer Risk. J
Clin Oncol 2020;38:686-97
12. Nelson HD, Zakher B, Cantor A, Fu R, Griffin J, O'Meara ES, et al. Risk factors
for breast cancer for women aged 40 to 49 years: a systematic review and meta-
analysis. Ann Intern Med 2012;156:635-48
13. Pizot C, Boniol M, Mullie P, Koechlin A, Boniol M, Boyle P, et al. Physical activity,
hormone replacement therapy and breast cancer risk: A meta-analysis of
prospective studies. Eur J Cancer 2016;52:138-54
14. Reeves GK, Pirie K, Green J, Bull D, Beral V. Comparison of the effects of
genetic and environmental risk factors on in situ and invasive ductal breast
cancer. Int J Cancer 2012;131:930-7
15. Wiseman M. The second World Cancer Research Fund/American Institute for
Cancer Research expert report. Food, nutrition, physical activity, and the
prevention of cancer: a global perspective. Proc Nutr Soc 2008;67:253-6
16. Harmon BE, Boushey CJ, Shvetsov YB, Ettienne R, Reedy J, Wilkens LR, et al.
Associations of key diet-quality indexes with mortality in the Multiethnic Cohort:
the Dietary Patterns Methods Project. Am J Clin Nutr 2015;101:587-97
17. Hashemian M, Farvid MS, Poustchi H, Murphy G, Etemadi A, Hekmatdoost A, et
al. The application of six dietary scores to a Middle Eastern population: a
comparative analysis of mortality in a prospective study. Eur J Epidemiol
2019;34:371-82
18. Park SY, Kang M, Wilkens LR, Shvetsov YB, Harmon BE, Shivappa N, et al. The
Dietary Inflammatory Index and All-Cause, Cardiovascular Disease, and Cancer
Mortality in the Multiethnic Cohort Study. Nutrients 2018;10
19. Schwingshackl L, Bogensberger B, Hoffmann G. Diet Quality as Assessed by the
Healthy Eating Index, Alternate Healthy Eating Index, Dietary Approaches to
Stop Hypertension Score, and Health Outcomes: An Updated Systematic Review
and Meta-Analysis of Cohort Studies. J Acad Nutr Diet 2018;118:74-100.e11
20. Zahedi H, Djalalinia S, Asayesh H, Mansourian M, Esmaeili Abdar Z, Mahdavi
Gorabi A, et al. A Higher Dietary Inflammatory Index Score is Associated with a
Higher Risk of Incidence and Mortality of Cancer: A Comprehensive Systematic
Review and Meta-Analysis. Int J Prev Med 2020;11:15
21. Dela Cruz R, Park SY, Shvetsov YB, Boushey CJ, Monroe KR, Le Marchand L,
et al. Diet Quality and Breast Cancer Incidence in the Multiethnic Cohort. Eur J
Clin Nutr 2020;74:1743-7
22. Fung TT, Hu FB, McCullough ML, Newby PK, Willett WC, Holmes MD. Diet
quality is associated with the risk of estrogen receptor-negative breast cancer in
postmenopausal women. J Nutr 2006;136:466-72
23. Haridass V, Ziogas A, Neuhausen SL, Anton-Culver H, Odegaard AO. Diet
Quality Scores Inversely Associated with Postmenopausal Breast Cancer Risk
Are Not Associated with Premenopausal Breast Cancer Risk in the California
Teachers Study. J Nutr 2018;148:1830-7
24. Michailidou K, Lindström S, Dennis J, Beesley J, Hui S, Kar S, et al. Association
analysis identifies 65 new breast cancer risk loci. Nature 2017;551:92-4
25. Milne RL, Kuchenbaecker KB, Michailidou K, Beesley J, Kar S, Lindström S, et
al. Identification of ten variants associated with risk of estrogen-receptor-negative
breast cancer. Nat Genet 2017;49:1767-78
26. Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, et al. Polygenic
Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. Am J
Hum Genet 2019;104:21-34
27. Hurson AN, Pal Choudhury P, Gao C, Hüsing A, Eriksson M, Shi M, et al.
Prospective evaluation of a breast-cancer risk model integrating classical risk
factors and polygenic risk in 15 cohorts from six countries. Int J Epidemiol
2022;50:1897-911
28. Kapoor PM, Mavaddat N, Choudhury PP, Wilcox AN, Lindström S, Behrens S, et
al. Combined Associations of a Polygenic Risk Score and Classical Risk Factors
With Breast Cancer Risk. J Natl Cancer Inst 2021;113:329-37
29. Lacaze P, Bakshi A, Riaz M, Orchard SG, Tiller J, Neumann JT, et al. Genomic
Risk Prediction for Breast Cancer in Older Women. Cancers (Basel) 2021;13
30. Lakeman IMM, Rodríguez-Girondo M, Lee A, Ruiter R, Stricker BH, Wijnant
SRA, et al. Validation of the BOADICEA model and a 313-variant polygenic risk
score for breast cancer risk prediction in a Dutch prospective cohort. Genet Med
2020;22:1803-11
31. Lakeman IMM, van den Broek AJ, Vos JAM, Barnes DR, Adlard J, Andrulis IL, et
al. The predictive ability of the 313 variant-based polygenic risk score for
contralateral breast cancer risk prediction in women of European ancestry with a
heterozygous BRCA1 or BRCA2 pathogenic variant. Genet Med 2021;23:1726-
37
32. Li SX, Milne RL, Nguyen-Dumont T, Wang X, English DR, Giles GG, et al.
Prospective Evaluation of the Addition of Polygenic Risk Scores to Breast Cancer
Risk Models. JNCI Cancer Spectr 2021;5
33. Pal Choudhury P, Brook MN, Hurson AN, Lee A, Mulder CV, Coulson P, et al.
Comparative validation of the BOADICEA and Tyrer-Cuzick breast cancer risk
models incorporating classical risk factors and polygenic risk in a population-
based prospective cohort of women of European ancestry. Breast Cancer Res
2021;23:22
34. Pal Choudhury P, Wilcox AN, Brook MN, Zhang Y, Ahearn T, Orr N, et al.
Comparative Validation of Breast Cancer Risk Prediction Models and Projections
for Future Risk Stratification. J Natl Cancer Inst 2020;112:278-85
35. Du Z, Gao G, Adedokun B, Ahearn T, Lunetta KL, Zirpoli G, et al. Evaluating
Polygenic Risk Scores for Breast Cancer in Women of African Ancestry. J Natl
Cancer Inst 2021;113:1168-76
36. Liu C, Zeinomar N, Chung WK, Kiryluk K, Gharavi AG, Hripcsak G, et al.
Generalizability of Polygenic Risk Scores for Breast Cancer Among Women With
European, African, and Latinx Ancestry. JAMA Network Open 2021;4:e2119084
37. Minnier J, Rajeevan N, Gao L, Park B, Pyarajan S, Spellman P, et al. Polygenic
Breast Cancer Risk for Women Veterans in the Million Veteran Program. JCO
Precis Oncol 2021;5
38. Rudolph A, Song M, Brook MN, Milne RL, Mavaddat N, Michailidou K, et al. Joint
associations of a polygenic risk score and environmental risk factors for breast
cancer in the Breast Cancer Association Consortium. Int J Epidemiol
2018;47:526-36
39. Arthur RS, Wang T, Xue X, Kamensky V, Rohan TE. Genetic Factors, Adherence
to Healthy Lifestyle Behavior, and Risk of Invasive Breast Cancer Among
Women in the UK Biobank. J Natl Cancer Inst 2020;112:893-901
40. Kachuri L, Graff RE, Smith-Byrne K, Meyers TJ, Rashkin SR, Ziv E, et al. Pan-
cancer analysis demonstrates that integrating polygenic risk scores with
modifiable risk factors improves risk prediction. Nat Commun 2020;11:6084
41. Kolonel LN, Henderson BE, Hankin JH, Nomura AM, Wilkens LR, Pike MC, et al.
A multiethnic cohort in Hawaii and Los Angeles: baseline characteristics.
American journal of epidemiology 2000;151:346-57
42. Wang H, Haiman CA, Kolonel LN, Henderson BE, Wilkens LR, Le Marchand L,
et al. Self-reported ethnicity, genetic structure and the impact of population
stratification in a multiethnic study. Hum Genet 2010;128:165-77
43. Conti DV, Darst BF, Moss LC, Saunders EJ, Sheng X, Chou A, et al. Trans-
ancestry genome-wide association meta-analysis of prostate cancer identifies
new susceptibility loci and informs genetic risk prediction. Nat Genet 2021;53:65-
75
44. Conomos MP, Miller MB, Thornton TA. Robust inference of population structure
for ancestry prediction and correction of stratification in the presence of
relatedness. Genetic epidemiology 2015;39:276-93
45. Gram IT, Park SY, Maskarinec G, Wilkens LR, Haiman CA, Le Marchand L.
Smoking and breast cancer risk by race/ethnicity and oestrogen and
progesterone receptor status: the Multiethnic Cohort (MEC) study. Int J
Epidemiol 2019;48:501-11
46. Maskarinec G, Jacobs S, Park SY, Haiman CA, Setiawan VW, Wilkens LR, et al.
Type II Diabetes, Obesity, and Breast Cancer Risk: The Multiethnic Cohort.
Cancer Epidemiol Biomarkers Prev 2017;26:854-61
47. Park SY, Kolonel LN, Lim U, White KK, Henderson BE, Wilkens LR. Alcohol
consumption and breast cancer risk among women from five ethnic groups with
light to moderate intakes: the Multiethnic Cohort Study. Int J Cancer
2014;134:1504-10
48. Pike MC, Kolonel LN, Henderson BE, Wilkens LR, Hankin JH, Feigelson HS, et
al. Breast cancer in a multiethnic cohort in Hawaii and Los Angeles: risk factor-
adjusted incidence in Japanese equals and in Hawaiians exceeds that in whites.
Cancer Epidemiol Biomarkers Prev 2002;11:795-800
49. Sarink D, White KK, Loo LWM, Wu AH, Wilkens LR, Le Marchand L, et al.
Racial/ethnic differences in postmenopausal breast cancer risk by hormone
receptor status: The multiethnic cohort study. Int J Cancer 2022;150:221-31
50. Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, et al. Genome-
wide polygenic scores for common diseases identify individuals with risk
equivalent to monogenic mutations. Nat Genet 2018;50:1219-24
51. Maas P, Barrdahl M, Joshi AD, Auer PL, Gaudet MM, Milne RL, et al. Breast
Cancer Risk From Modifiable and Nonmodifiable Risk Factors Among White
Women in the United States. JAMA Oncol 2016;2:1295-302
52. Ho WK, Tan MM, Mavaddat N, Tai MC, Mariapun S, Li J, et al. European
polygenic risk score for prediction of breast cancer shows similar performance in
Asian women. Nat Commun 2020;11:3833
53. Shu X, Long J, Cai Q, Kweon SS, Choi JY, Kubo M, et al. Identification of novel
breast cancer susceptibility loci in meta-analyses conducted among Asian and
European descendants. Nat Commun 2020;11:1217
54. Yang Y, Tao R, Shu X, Cai Q, Wen W, Gu K, et al. Incorporating Polygenic Risk
Scores and Nongenetic Risk Factors for Breast Cancer Risk Prediction Among
Asian Women. JAMA Netw Open 2022;5:e2149030
55. Shieh Y, Fejerman L, Lott PC, Marker K, Sawyer SD, Hu D, et al. A Polygenic
Risk Score for Breast Cancer in US Latinas and Latin American Women. J Natl
Cancer Inst 2020;112:590-8
56. Kapoor PM, Mavaddat N, Choudhury PP, Wilcox AN, Lindstrom S, Behrens S, et
al. Combined Associations of a Polygenic Risk Score and Classical Risk Factors
With Breast Cancer Risk. J Natl Cancer Inst 2021;113:329-37
57. Kolonel LN, Altshuler D, Henderson BE. The multiethnic cohort study: exploring
genes, lifestyle and cancer risk. Nat Rev Cancer 2004;4:519-27
58. Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, et al. Polygenic
Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. Am J
Hum Genet 2019;104:21-34
Tables
Table 1. Descriptive characteristics of breast cancer cases and controls.
Total BCa cases ER+ ER- Controls
n 3229 2308 564 3921
Ethnicity
White 774 (24.0) 592 (25.6) 112 (19.9) 810 (20.7)
African American 728 (22.5) 456 (19.8) 174 (30.9) 1616 (41.2)
Native Hawaiian 265 (8.2) 203 (8.8) 35 (6.2) 200 (5.1)
Japanese American 931 (28.8) 728 (31.5) 130 (23.0) 762 (19.4)
Latina 531 (16.4) 329 (14.3) 113 (20.0) 533 (13.6)
Age at blood draw mean (SD) 67.21 (8.97) 67.67 (8.90) 65.34 (9.05) 66.29 (8.84)
Family History
Negative 2400 (74.3) 1708 (74.0) 410 (72.7) 3259 (83.1)
Positive 817 (25.3) 595 (25.8) 148 (26.2) 651 (16.6)
Smoking Status
Never 1738 (53.8) 1254 (54.3) 297 (52.7) 2051 (52.3)
Former 1019 (31.6) 741 (32.1) 179 (31.7) 1258 (32.1)
Current 419 (13.0) 276 (12.0) 79 (14.0) 558 (14.2)
BMI
< 25 1356 (42.0) 983 (42.6) 254 (45.0) 1530 (39.0)
25-30 1093 (33.8) 773 (33.5) 182 (32.3) 1280 (32.6)
>=30 750 (23.2) 527 (22.8) 125 (22.2) 1037 (26.4)
Parity
None 473 (14.6) 362 (15.7) 62 (11.0) 451 (11.5)
1 child 376 (11.6) 270 (11.7) 58 (10.3) 464 (11.8)
2-3 children 1466 (45.4) 1061 (46.0) 257 (45.6) 1771 (45.2)
4 or more children 876 (27.1) 589 (25.5) 177 (31.4) 1199 (30.6)
Age at menarche
<= 12 yrs 1700 (52.6) 1217 (52.7) 286 (50.7) 1922 (49.0)
13-14 yrs 1145 (35.5) 827 (35.8) 204 (36.2) 1490 (38.0)
> 14 yrs 332 (10.3) 223 (9.7) 66 (11.7) 465 (11.9)
Age at menopause
Pre-menopausal 438 (13.6) 297 (12.9) 102 (18.1) 534 (13.6)
< 45 838 (26.0) 592 (25.6) 147 (26.1) 1177 (30.0)
45-49 667 (20.7) 481 (20.8) 111 (19.7) 856 (21.8)
>= 50 1034 (32.0) 760 (32.9) 158 (28.0) 1051 (26.8)
Estrogen and progestin use
Never estrogen use 1393 (43.1) 952 (41.2) 270 (47.9) 1873 (47.8)
Past estrogen use 514 (15.9) 369 (16.0) 81 (14.4) 704 (18.0)
Current estrogen use alone 438 (13.6) 302 (13.1) 93 (16.5) 532 (13.6)
Current estrogen use with progesterone 764 (23.7) 598 (25.9) 100 (17.7) 643 (16.4)
Physical Activity^a
Low 1545 (47.8) 1082 (46.9) 277 (49.1) 1889 (48.2)
High 1502 (46.5) 1103 (47.8) 257 (45.6) 1784 (45.5)
aHEI-2010^a
Low 1527 (47.3) 1064 (46.1) 290 (51.4) 1916 (48.9)
High 1618 (50.1) 1186 (51.4) 266 (47.2) 1916 (48.9)
DASH^a
Low 1603 (49.6) 1123 (48.7) 289 (51.2) 2002 (51.1)
High 1542 (47.8) 1127 (48.8) 267 (47.3) 1830 (46.7)
HEI-2010^a
Low 1561 (48.3) 1092 (47.3) 291 (51.6) 1916 (48.9)
High 1584 (49.1) 1158 (50.2) 265 (47.0) 1916 (48.9)
aMED^a
Low 1761 (54.5) 1254 (54.3) 313 (55.5) 2171 (55.4)
High 1384 (42.9) 996 (43.2) 243 (43.1) 1661 (42.4)
DII^a
Low 1600 (49.6) 1174 (50.9) 261 (46.3) 1916 (48.9)
High 1545 (47.8) 1076 (46.6) 295 (52.3) 1916 (48.9)
^a Categorization determined by distribution in controls.
Table 2. Association of BCa and lifestyle risk factors by genetic risk.
Column groups (left to right): Total BCa, ER+, ER-. Within each group: OR (95% CI) for dichotomized PRS 0% - 50%, OR (95% CI) for dichotomized PRS 50% - 100%, P-LRT^d.
Lifestyle factors^a
BMI (< 25 vs. >= 25) 0.96 (0.81-1.14) 0.75 (0.65-0.87) 8.59E-02 0.98 (0.81-1.19) 0.65 (0.55-0.77) 6.34E-03 1.02 (0.72-1.43) 1.22 (0.94-1.56) 4.11E-01
Family history 1.26 (1.11-1.44) 1.4 (1.21-1.61) 6.68E-01 1.2 (1.04-1.38) 1.38 (1.19-1.61) 1.50E-01 1.31 (1.12-1.54) 1.43 (1.19-1.7) 9.44E-01
Physical activity (High vs. Low) 0.99 (0.95-1.03) 1 (0.97-1.03) 4.49E-01 1 (0.96-1.05) 0.99 (0.95-1.03) 5.18E-01 0.9 (0.82-1) 1.01 (0.96-1.07) 4.88E-02
Smoking status (Current vs. Never + Former) 1.34 (1.07-1.69) 0.99 (0.81-1.21) 2.14E-01 1.21 (0.92-1.58) 0.96 (0.77-1.2) 6.38E-01 1.21 (0.77-1.9) 0.98 (0.69-1.37) 7.62E-01
Alcohol consumption (Drinker vs. Nondrinker) 1.12 (0.95-1.33) 1.18 (1.02-1.35) 1.73E-01 1.06 (0.87-1.28) 1.21 (1.04-1.41) 5.17E-02 1.05 (0.75-1.47) 1.2 (0.94-1.53) 5.47E-01
Hormone use (Current E + Current E+P vs. Never + Past) 1.33 (1.11-1.59) 1.32 (1.13-1.54) 8.76E-01 1.43 (1.17-1.76) 1.36 (1.15-1.61) 9.33E-01 1.23 (0.85-1.78) 1.33 (1.02-1.74) 1.33E-01
Dietary factors^b
aHEI-2010 per SD 0.9 (0.83-0.97) 1 (0.94-1.08) 3.34E-02 0.92 (0.84-1.01) 0.99 (0.92-1.07) 3.13E-01 0.81 (0.69-0.95) 1.05 (0.93-1.18) 6.64E-03
DASH per SD 0.97 (0.89-1.06) 0.99 (0.92-1.06) 4.61E-01 1.01 (0.91-1.11) 0.98 (0.9-1.06) 8.41E-01 0.86 (0.72-1.02) 1.04 (0.92-1.18) 1.08E-01
HEI-2010 per SD 1 (0.92-1.09) 1.05 (0.98-1.13) 4.54E-01 1.05 (0.96-1.15) 1.05 (0.97-1.13) 7.60E-01 0.85 (0.72-1) 1.1 (0.98-1.25) 8.96E-03
aMED per SD 0.93 (0.86-1.01) 1 (0.94-1.07) 2.08E-01 0.96 (0.87-1.05) 0.98 (0.91-1.06) 8.63E-01 0.9 (0.77-1.06) 1.05 (0.93-1.18) 6.93E-02
DII per SD 1.01 (0.93-1.09) 1.01 (0.94-1.08) 7.29E-01 0.96 (0.88-1.05) 0.99 (0.92-1.07) 6.40E-01 1.23 (1.05-1.44) 1.01 (0.9-1.14) 2.84E-02
Lifestyle score^c (High vs. Low) 1.13 (0.97-1.32) 1.12 (0.98-1.28) 7.81E-01 0.98 (0.83-1.17) 1.17 (1.01-1.35) 9.76E-02 1.3 (0.94-1.78) 1.08 (0.85-1.36) 4.48E-01
^a Model adjusted for age, family history, race/ethnicity, BMI, alcohol consumption, parity, age at menarche, age at menopause, smoking status, education, hormone use, where appropriate.
^b Dietary factors additionally adjusted for calories intake. HEI-2010 and DASH additionally adjusted for alcohol consumption.
^c Model adjusted for age, family history, race/ethnicity, parity, age at menarche, age at menopause, education, calories intake.
^d Likelihood ratio test between models with and without the interaction term of lifestyle factor and dichotomized PRS, additionally adjusted for PC 1-10. P-LRT results similar to continuous PRS.
Supplementary Table 1: Lifestyle score components.
Columns (left to right): Category; Weights for Total BCa, ER+ BCa, ER- BCa. Each lifestyle factor is listed with its source study.
aHEI (Dela Cruz et al., 2020)
Q1 (<57.3) (ref.) (ref.) (ref.)
Q2 (57.3-62.8) -0.051 -0.051 -0.051
Q3 (62.8-67.4) 0.000 0.000 0.000
Q4 (67.4-72.9) 0.010 0.010 0.010
Q5 (>=72.9) -0.041 -0.041 -0.041
Smoking (Gram et al., 2019)
Never (ref.) (ref.) (ref.)
Former 0.077 0.086 0.058
Current 0.104 0.058 0.122
Alcohol consumption (g/day) (Park et al., 2014)
0 (ref.) (ref.) (ref.)
0.1-4.9 -0.020 -0.083 0.166
5-9.9 0.207 0.131 0.451
10-14.9 0.191 0.300* 0.215*
15-29.9 0.113 0.300* 0.215*
30+ 0.425 0.476 0.457
* In the source layout, a single ER+ weight and a single ER- weight appear to span the 10-14.9 and 15-29.9 g/day rows.
Physical Activity (MET hrs/week) (Matthews et al., 2019)
<7.5 (ref.) (ref.) (ref.)
7.5-15 -0.062 -0.062 -0.062
15-22.5 -0.105 -0.105 -0.105
22.5-30 -0.128 -0.128 -0.128
30+ -0.151 -0.151 -0.151
BMI (kg/m^2) (Sarink et al., 2021)
<25 (ref.) (ref.) (ref.)
25-29.9 0.215 0.215 0.039
30+ 0.329 0.329 -0.062
Hormone use (Sarink et al., 2021: ER+/ER- BCa; Pike et al., 2002: Total BCa)
Never (ref.) (ref.) (ref.)
Former 0.049 0.113 -0.010
Current estrogen alone 0.307 0.104 0.131
Current estrogen + progestin use 0.621 0.519 0.086
Supplementary Table 2. Association of BCa and lifestyle risk factors.
Columns (left to right): Total BCa OR (95% CI), p-value; ER+ OR (95% CI), p-value; ER- OR (95% CI), p-value.
Lifestyle factors^a
BMI
>=25 (ref.) (ref.) (ref.)
< 25 0.83 (0.74-0.92) 7.03E-04 0.77 (0.68-0.87) 3.19E-05 1.14 (0.93-1.39) 2.12E-01
Family history
Negative (ref.) (ref.) (ref.)
Positive 1.66 (1.47-1.88) 3.75E-16 1.69 (1.48-1.93) 2.24E-14 1.83 (1.48-2.26) 2.07E-08
Moderate or vigorous activity (MET-hours/day)
0% - 25% (ref.) (ref.) (ref.)
25% - 50% 0.87 (0.75-1) 5.69E-02 0.88 (0.75-1.04) 1.32E-01 0.86 (0.65-1.12) 2.66E-01
50% - 75% 0.82 (0.71-0.94) 4.54E-03 0.85 (0.73-0.99) 4.26E-02 0.8 (0.61-1.04) 8.88E-02
75% - 100% 1.01 (0.88-1.16) 9.08E-01 1.02 (0.88-1.19) 7.56E-01 0.99 (0.77-1.28) 9.47E-01
Smoking status
Never+former (ref.) (ref.) (ref.)
Current 1.13 (0.97-1.31) 1.20E-01 1.06 (0.89-1.26) 5.05E-01 1.05 (0.81-1.38) 6.99E-01
Alcohol consumption
Non-drinker (ref.) (ref.) (ref.)
Drinker 1.15 (1.04-1.28) 8.93E-03 1.15 (1.02-1.3) 2.27E-02 1.16 (0.96-1.41) 1.32E-01
Hormone use
Never estrogen use (ref.) (ref.) (ref.)
Past estrogen use 1.02 (0.87-1.18) 8.24E-01 1.05 (0.89-1.24) 5.67E-01 0.95 (0.71-1.28) 7.52E-01
Current estrogen use alone 1.1 (0.93-1.3) 2.53E-01 1.08 (0.89-1.3) 4.46E-01 1.37 (1.02-1.83) 3.48E-02
Current estrogen use with progestin 1.44 (1.24-1.67) 1.18E-06 1.59 (1.35-1.87) 1.98E-08 1.13 (0.85-1.5) 3.92E-01
Dietary factors^b
aHEI-2010 per SD 0.96 (0.91-1.01) 8.67E-02 0.96 (0.9-1.01) 1.32E-01 0.97 (0.88-1.06) 4.77E-01
DASH per SD 0.98 (0.92-1.03) 3.82E-01 0.99 (0.93-1.05) 6.44E-01 0.97 (0.88-1.08) 6.22E-01
HEI-2010 per SD 1.03 (0.98-1.09) 2.43E-01 1.05 (0.99-1.11) 1.12E-01 1.01 (0.92-1.12) 7.96E-01
aMED per SD 0.98 (0.93-1.03) 4.93E-01 0.98 (0.92-1.03) 3.99E-01 1 (0.91-1.1) 9.43E-01
DII per SD 1 (0.95-1.06) 8.53E-01 0.98 (0.92-1.04) 4.52E-01 1.07 (0.97-1.17) 1.83E-01
Lifestyle score^c
per SD 1.13 (1.07-1.18) 4.18E-06 1.11 (1.05-1.17) 2.79E-04 1.18 (1.08-1.29) 2.36E-04
Quartile
0% - 25% (ref.) (ref.) (ref.)
25% - 50% 1.09 (0.95-1.25) 2.05E-01 1.15 (0.99-1.35) 7.47E-02 1.07 (0.83-1.39) 5.91E-01
50% - 75% 1.11 (0.97-1.27) 1.44E-01 1.04 (0.90-1.20) 5.94E-01 1.04 (0.81-1.35) 7.44E-01
75% - 100% 1.28 (1.11-1.48) 5.98E-04 1.36 (1.16-1.59) 1.41E-04 1.36 (1.06-1.76) 1.71E-02
^a Model adjusted for age, family history, race/ethnicity, BMI, alcohol consumption, parity, age at menarche, age at menopause, smoking status, education, hormone use, where appropriate.
^b Dietary factors additionally adjusted for calories intake. HEI-2010 and DASH additionally adjusted for alcohol consumption.
^c Model adjusted for age, family history, race/ethnicity, parity, age at menarche, age at menopause, education, calories intake.
Supplementary Table 3. Association^a of BCa and interaction of BMI (< 25 vs. >= 25) and PRS by race/ethnicity group.
Column groups (left to right): Overall BCa, ER+ BCa, ER- BCa. Within each group: OR (95% CI) for BMI (< 25 vs. >= 25) at dichotomized PRS 0% - 50%, OR (95% CI) at dichotomized PRS 50% - 100%, P-LRT^b.
White 1.02 (0.72-1.44) 0.95 (0.72-1.26) 5.14E-01 1.11 (0.75-1.64) 0.89 (0.66-1.2) 2.00E-01 1.15 (0.56-2.37) 1.66 (0.93-2.98) 5.46E-01
African American 1.07 (0.76-1.5) 1.03 (0.76-1.4) 7.12E-01 1.12 (0.74-1.68) 0.8 (0.55-1.16) 1.46E-01 1.39 (0.73-2.66) 1.18 (0.73-1.9) 7.77E-01
Native Hawaiian 0.76 (0.34-1.73) 0.74 (0.41-1.34) 6.27E-01 0.78 (0.33-1.82) 0.52 (0.27-0.99) 4.93E-01 1.48 (0.12-18.61) 0.37 (0.1-1.37) 7.79E-01
Japanese American 0.83 (0.6-1.15) 0.49 (0.36-0.67) 7.32E-02 0.81 (0.57-1.14) 0.49 (0.36-0.68) 1.72E-01 0.99 (0.47-2.09) 1.06 (0.6-1.87) 9.83E-01
Latina 1.27 (0.8-2) 0.56 (0.38-0.81) 1.05E-02 1.39 (0.81-2.39) 0.43 (0.28-0.67) 4.41E-03 0.62 (0.26-1.5) 1.45 (0.78-2.7) 1.29E-01
^a Model adjusted for age, family history, race/ethnicity (in overall population), alcohol consumption, parity, age at menarche, age at menopause, smoking status, education, and hormone use.
^b Likelihood ratio test between models with and without the interaction term of BMI and dichotomized PRS, additionally adjusted for PC 1-10.
Figures
Figure 1. Association of BCa PRS313 per standard deviation with risk of BCa by
race/ethnicity groups.
[Figure: OR and 95% CI (axis ticks 1, 2, 3) by BCa outcome (Overall BCa, ER+ BCa, ER− BCa); legend: White, African American, Native Hawaiian, Japanese, Latino.]
Supplementary Figure S1. Joint association of BCa PRS313 with A) BMI (< 25 vs. >= 25)
in ER+ BCa, B) aHEI-2010 (above vs. below median) in ER- BCa, C) HEI-2010 (above
vs. below median) in ER- BCa, D) DII (below vs. above median) in ER- BCa, E) Lifestyle
score (below vs. above median) in ER+ BCa, and F) Lifestyle score (below vs. above
median) in ER- BCa. PRS quartiles were determined based on distribution in controls.
The dashed line indicates an odds ratio (OR) of 1.
[Panels A-F: each plots OR and 95% CI (axis ticks 0.6, 1.0, 1.6, 2.5, 4.0, 5.0) against PRS category ([0% − 25%], (25% − 50%], (50% − 75%], (75% − 100%]).]
A) BMI in ER+ BCa. Legend: BMI < 25, BMI >= 25.
B) aHEI-2010 in ER- BCa. Legend: High aHEI, Low aHEI.
C) HEI-2010 in ER- BCa. Legend: High HEI, Low HEI.
D) DII in ER- BCa. Legend: Low DII, High DII.
E) Lifestyle score in ER+ BCa. Legend: Low lifestyle score, High lifestyle score.
F) Lifestyle score in ER- BCa. Legend: Low lifestyle score, High lifestyle score.
Chapter 4: Characterization and Evaluation of Polygenic Risk
Scores for Nonalcoholic Fatty Liver Disease in a Multiethnic
Population
Abstract
Purpose: Polygenic risk scores (PRS) have been demonstrated to predict risk of
NAFLD. However, there is an underrepresentation of non-European ancestry
populations in NAFLD genetic studies. We evaluated existing PRSs developed in multi-
ethnic populations and assessed whether NAFLD PRS performance could be improved
through characterization of the PRSs in multiple ethnic groups.
Methods: We conducted a nested case-control study with 1,448 NAFLD cases and
8,444 controls in the Multiethnic Cohort Study. We evaluated the association of a 17-variant
PRS with risk of NAFLD. We compared its performance to that of an 11-variant PRS
previously developed in the MEC, constructed a new 21-variant PRS by incorporating
independent SNPs between the previously published PRSs, and examined its
association with NAFLD risk across multiple ethnic groups.
Results: Both PRS17 and PRS21 were statistically significantly associated with risk of
NAFLD, with odds ratios per standard deviation ranging between 1.40 and 1.43, and
AUCs of approximately 0.78 in the multi-ethnic population. Individuals in the top 25% of
PRS17 and PRS21 had approximately a 2-fold elevated risk of NAFLD. However, these
estimates were not significantly different from the performance of the PRS11.
Conclusions: In this ethnically diverse population, we showed the utility of PRS for
NAFLD risk stratification in multiple ethnic groups. Our results highlight the need for
well-developed and validated PRS that are optimized for specific populations.
Introduction
Nonalcoholic fatty liver disease (NAFLD) is the most common cause of chronic
liver disease, with an estimated global prevalence of 25% (1-3). NAFLD is characterized
by excess accumulation of fat in the liver (defined as hepatic fat content ≥ 5%) in the
absence of causes due to increased alcohol intake, medications, infections,
autoimmune disease processes, or other known causes for liver disease (4-6). In the
United States (US), NAFLD prevalence is projected to increase to 33.5% among adults
by 2030 (7). NAFLD risk burden also differs across different race/ethnicity groups.
Previous studies in the US have shown that Latinos, Japanese Americans, and Native
Hawaiians experience notably high NAFLD prevalence compared to Whites and African
Americans (8-10). In the US, NAFLD has risen to become the fastest growing cause of
hepatocellular carcinoma (HCC) and a leading indication for liver transplantation (11-13).
Moreover, numerous studies have demonstrated a deleterious role of NAFLD in
metabolic syndrome and hepatic outcomes of type 2 diabetes and kidney disease, as
well as extrahepatic outcomes such as cardiovascular disease, cancer, and overall
mortality (12,14-17).
Inherited genetic variation contributes to the development of NAFLD, with current
estimates of NAFLD heritability ranging from 20% to 50% (18). Genome-wide association
studies (GWAS) have identified several variants associated with NAFLD development
and disease severity (19-32). Polygenic risk scores (PRS) comprised of these variants
have been demonstrated to predict NAFLD development, progression, and severity
(33,34). However, there is an underrepresentation of non-European ancestry
populations in NAFLD GWASs for discovery, which creates a major gap in the use of
genetic information for disease prediction and prevention across populations (35-38).
Broad clinical application for risk prediction of the PRS will also require performance
evaluation across diverse racial and ethnic populations. Few studies have addressed
this gap; a recent study in the Multiethnic Cohort Study (MEC) constructed a weighted
11-SNP PRS after taking into account all previously identified GWAS-significant
variants, and reported a significant association between the PRS and NAFLD risk in a
multiethnic population (odds ratio [OR] per SD increase = 1.41; 95% confidence interval
[CI] = 1.32-1.50) (39). Recently, a large multi-ancestry GWAS in the Million Veteran
Program (MVP) identified 77 genetic loci associated with NAFLD, and replicated 17
SNPs in external cohorts with histology-defined or radiologic imaging NAFLD cases
(40). These two studies facilitated the opportunity to develop and improve NAFLD PRS
to estimate overall genetic risk for NAFLD across populations (34,40).
In the current population-based study, we assessed whether the performance of
PRS11 (34) could be improved by constructing a new PRS incorporating independent
SNPs from the PRS17 (40). We constructed the new PRS by replacing PRS11 index variants
with variants in linkage disequilibrium that may better capture NAFLD risk, and examined
its performance among diverse populations in the MEC.
Materials and Methods
Participants
The MEC is a large prospective multiethnic cohort established in 1993–1996 to
investigate risk of cancer and other chronic diseases among individuals aged 45-75 at
baseline living in Hawaii and Los Angeles (41). Participants included self-reported
African American (AA), Latino (LA), Japanese (JA), Native Hawaiian (NH), and Non-Hispanic
White (WH) race/ethnicity groups who completed a detailed baseline
questionnaire to obtain information on risk factors and health conditions (42). As
described in detail previously (10,34,43), NAFLD cases were identified among eligible MEC
participants using International Classification of Diseases, Ninth (ICD-9) and Tenth (ICD-10)
Revision codes (ICD-9 codes 571.8 and 571.9 and ICD-10 codes K75.81, K76.0,
K76.89, K74.1, and K76.9), requiring one inpatient claim or two or more outpatient/carrier
Medicare fee-for-service claims on different dates between 1999 and 2016, and excluding other
liver disease etiologies. All NAFLD cases with DNA samples were included (N=1,498). Controls were
selected among eligible participants across nested case-control studies within the MEC
who met the following criteria: genotyped using the Illumina Infinium arrays in the five
major ethnic groups, valid information on type 2 diabetes status and BMI, and did not
have chronic liver disease. After applying the exclusion criteria, the current study included 1,448
NAFLD cases and 8,444 controls with available GWAS data.
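The claims-based part of the case definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (claim type, service date, diagnosis code) and the names `NAFLD_CODES`, `normalize`, and `meets_claims_definition` are hypothetical and not taken from the study, and the exclusion of other liver disease etiologies is not modeled.

```python
from datetime import date

# ICD-9 and ICD-10 codes listed in the text, stored without dots for matching.
NAFLD_CODES = {"5718", "5719", "K7581", "K760", "K7689", "K741", "K769"}

def normalize(code):
    """Strip dots and whitespace so 'K76.0' and 'K760' compare equal."""
    return code.replace(".", "").strip().upper()

def meets_claims_definition(claims, first_year=1999, last_year=2016):
    """claims: iterable of (claim_type, service_date, icd_code) tuples (hypothetical layout).

    Returns True when there is at least one inpatient claim, or at least two
    outpatient/carrier claims on different dates, carrying a listed code
    within the study window.
    """
    outpatient_dates = set()
    for claim_type, service_date, code in claims:
        if normalize(code) not in NAFLD_CODES:
            continue
        if not first_year <= service_date.year <= last_year:
            continue
        if claim_type == "inpatient":
            return True
        if claim_type in ("outpatient", "carrier"):
            outpatient_dates.add(service_date)
    return len(outpatient_dates) >= 2
```

Two outpatient claims on the same date count once here, which is how the "different dates" requirement is read in this sketch.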
The Institutional Review Boards for the University of Southern California and the
University of Hawaii approved this study. NAFLD cases and controls consented to
genetic association studies.
Genotyping, Quality Control, and Genotype Imputation
The genotype data used in this project were derived from a NAFLD study and
multiple GWASs of cancer and other phenotypes in the MEC. For these studies,
Illumina Infinium arrays were used, with imputation conducted using Minimac4 and the
1000 Genomes (1000G) Project reference panel (Phase 3 v5). Both subject call rates
and variant call rates were ≥ 0.95. Risk allele frequencies (RAF) were calculated and
compared to the corresponding RAF in Phase 3 1000G for quality control. Post–quality
control data contained 1,189,906 SNPs for 1,448 cases and 8,444 controls.
Polygenic Risk Score (PRS) Construction
We evaluated 2 PRSs (PRS17 and PRS21) and compared their performance to
that of an 11-variant PRS previously developed in the MEC in a multiethnic population
that included 11 previously identified independent NAFLD associated variants (34).
PRS17 is a 17-variant PRS developed by the Million Veteran Program (MVP) in a multi-
ancestry GWAS analysis (40). Briefly, MVP identified 77 variants that exceeded
genome-wide significance, with one additional variant identified in European American-
only and two in African American-only populations, and replicated 17 variants in
external cohorts with histology-defined or radiologic imaging NAFLD cases. Comparing
the 2 sets of variants in PRS11 and PRS17, 2 variants (in ERLIN1 and TM6SF2)
overlapped, 5 variants were in linkage disequilibrium (LD; r² ≥ 0.1) in the populations
examined (African, Admixed American, East Asian, and European) in the 1000
Genomes Project (1KGP), and 2 variants were in LD in all but the African population. From
PRS17, 10 SNPs were not in LD with those in PRS11, and 2 SNPs from
PRS11 were not in LD with those in PRS17 or PRS77. Thus, to investigate whether we
could improve upon the risk prediction performance of PRS11, we additionally assessed a
21-variant PRS, constructed from PRS11 by incorporating the 10 additional SNPs from PRS17
that were not in LD with variants in PRS11.
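The selection step described above (keep the PRS11 variants, then add only those PRS17 variants that are independent of all of them) can be sketched as follows. This is a minimal illustration, not the study's pipeline: the variant labels, the `r2` lookup table, and the function names are hypothetical, and treating a pair as "in LD" when r² reaches the threshold in any listed population is an assumption made for the sketch.

```python
def in_ld(a, b, r2, threshold=0.1):
    """True if a and b are the same variant, or r2 >= threshold in any listed population."""
    if a == b:
        return True
    by_population = r2.get(frozenset((a, b)), {})
    return any(value >= threshold for value in by_population.values())

def extend_prs(base, candidates, r2, threshold=0.1):
    """Return the base variants plus the candidates independent of every base variant."""
    independent = [c for c in candidates
                   if not any(in_ld(c, b, r2, threshold) for b in base)]
    return list(base) + independent

# Toy inputs with made-up variant labels and r2 values.
toy_base = ["v1", "v2", "v3"]
toy_candidates = ["v2", "v4", "v5"]
toy_r2 = {
    frozenset(("v3", "v4")): {"EUR": 0.40, "AFR": 0.02},
    frozenset(("v1", "v5")): {"EUR": 0.01},
}
toy_merged = extend_prs(toy_base, toy_candidates, toy_r2)
```

In the toy data, `v2` is shared, `v4` is in LD with `v3` in one population, and only `v5` is added. Applied to 11 base variants and 10 independent candidates, the same step yields a 21-variant set.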
Statistical Analyses
SNP Replication Analysis
A total of 89 SNP-NAFLD associations (Supplementary Table S1) were assessed
in all ethnic groups combined and by ethnicity, using multivariable logistic regression
models, adjusting for age at blood collection, sex, population stratification using
principal components (PCs) 1-5, and additionally race/ethnicity in the combined
analyses. Principal components were calculated using PC-AiR (44) with >15,000
independent common variants to adjust for genetic ancestry. A nominal P value of 0.05
was used to determine statistical significance.
PRS Analysis
A weighted PRS was calculated for each participant as the sum of the number of
risk alleles carried by an individual, weighted by ethnic-specific and multiethnic inverse
variance weights to combine results from the two studies (34,40). The PRS was
computed using the formula: $PRS = \sum_{n=1}^{C} \left( \frac{\beta_n}{SE(\beta_n)} \times SNP_n \right)$, where $C$ is the number of risk
loci, $\beta_n$ is the per-allele log(OR), $SE(\beta_n)$ is the standard error of $\beta_n$ for the nth SNP, and $SNP_n$
is the dosage for the risk allele (range 0 to 2) of the nth SNP. The PRS was examined
both as a continuous variable (per standard deviation [SD]) and as a categorical variable. For PRS
categorization, PRS percentile cut points were determined using all controls, or the
controls within each race/ethnicity group in race/ethnicity-stratified analyses. Associations were
assessed in all ethnic groups combined and by ethnicity, using multivariable logistic
regression models, adjusting for age at blood collection, sex, PC1-5, and additionally
race/ethnicity in the combined analyses.
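The scoring and categorization steps above can be illustrated with a short sketch. The function names are hypothetical, the weights are passed in directly so that whichever weighting the analysis uses can be supplied, and scaling by the SD among controls is an assumption (the text states only that the PRS was examined per SD).

```python
import statistics

def weighted_prs(dosages, weights):
    """Weighted sum of risk-allele dosages (each between 0 and 2)."""
    if len(dosages) != len(weights):
        raise ValueError("one weight is needed per variant")
    return sum(d * w for d, w in zip(dosages, weights))

def per_sd(scores, control_scores):
    """Rescale scores to SD units, using the mean and SD among controls (assumption)."""
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)
    return [(s - mean) / sd for s in scores]

def quartile_cutpoints(control_scores):
    """Three cut points (25th, 50th, 75th percentiles) taken from controls."""
    return statistics.quantiles(control_scores, n=4)

def categorize(score, cutpoints):
    """Category index 0 to 3; a score equal to a cut point stays in the lower category."""
    return sum(score > c for c in cutpoints)
```

For race/ethnicity-stratified analyses, `quartile_cutpoints` would be called on the controls of each group separately, matching the description in the text.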
We assessed whether the performance of PRS11 could be improved by
constructing a new PRS incorporating independent SNPs from the PRS17. The MVP is a
larger study than the MEC and is likely to yield more accurate effect
estimates for risk variants. Thus, to construct a new PRS considering variants from both
studies (PRS11 from the MEC and PRS17 from the MVP), for SNPs from PRS11 that are
in linkage disequilibrium (LD) with those from PRS17, we replaced the index variant from
PRS11 with the variant in LD from PRS17 that might be more informative than the index
variant from PRS11. Ten variants in PRS17 that are not in LD with those in PRS11 were
incorporated to construct a 21-variant PRS. Receiver operating characteristic (ROC)
analysis was also conducted, and discrimination between cases and controls was
measured using the area under the ROC curve (AUC).
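As a brief illustration of the discrimination measure, the AUC equals the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition; the function name is hypothetical, and the scores could be a PRS or the predicted risk from a covariate-adjusted model (which of these underlies the reported AUCs is not specified here).

```python
def auc(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based computation gives the same value more efficiently.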
Two-sided p-values <0.05 indicated statistical significance. Analyses were
performed using R (R Foundation for Statistical Computing, Vienna, Austria, 2015).
Results
Participant Characteristics
The analysis included 1448 NAFLD cases and 8444 controls (Table 1). The
mean age at blood collection was 66 and 67 in NAFLD cases and controls, respectively.
There were more women in cases (61.5%) than in controls (53.8%). Among NAFLD
cases, Japanese Americans accounted for 57.9% of the cases, followed by Latinos
(18.0%), Whites (12.4%), Native Hawaiians (6.5%), and African Americans (5.2%).
Association Testing of Individual Risk Alleles
The comparison of risk allele frequencies and effect sizes of individual variants
across race/ethnicity groups between the MEC and MVP is summarized in
Supplementary Table S1. Of the 80 previously reported GWAS-significant NAFLD risk
loci (77 multi-ancestry loci, 1 locus in European Americans only, and 2 loci in African
Americans only), 68, 69, 66, 67, 59, and 68 variants were common (MAF > 0.05) in the
Multiethnic, White, African American, Latino, Japanese American, and Native Hawaiian
populations, respectively. Of the 80 SNPs, 45 (56%) had a consistent direction of
effect on NAFLD risk, and 13 (16%) were replicated at nominal significance (P < 0.05)
in the multiethnic population. When assessing the associations between variants and
risk of NAFLD by ethnicity, the highest number of replicated associations was found in
Latinos, followed by Japanese Americans and Whites, with 16 (20%), 11 (14%), and 7
(9%) replicated variants, respectively. Only 6 (8%) and 1 (1%) variant(s) were replicated
in Native Hawaiians and African Americans, respectively.
Of the 77 previously reported GWAS-significant multi-ancestry loci, 66, 67, 63,
65, 57, and 66 variants were common (MAF > 0.05) in the Multiethnic, White, African
American, Latino, Japanese American, and Native Hawaiian populations, respectively
(Supplementary Table S1). Of the 77 SNPs, 43 (56%) showed directional consistency,
and 12 (16%) were replicated at nominal significance (P < 0.05) in the multiethnic
population. When assessing the associations between variants and risk of NAFLD by
ethnicity, the highest number of replicated associations was found in Latinos,
followed by Japanese Americans and Whites, with 16 (21%), 11 (14%), and 7 (9%)
replicated variants, respectively. Only 6 (8%) and 1 (1%) variant(s) were replicated
in Native Hawaiians and African Americans, respectively.
Many of the 17 multi-ancestry loci that have been replicated in previous studies
were common in non-White populations in this study, although 2 to 4 variants
(among rs28929474, rs1801689, rs4841132, and rs17036160) were rare in each of the
non-White populations (Supplementary Table S1). Of the 17 loci, 15, 15, 13, 15, 13,
and 14 variants were common (MAF > 0.05) in the Multiethnic, White, African American,
Latino, Japanese American, and Native Hawaiian populations, respectively. Of the 17
SNPs, 13 (76%) had a consistent direction of effect on NAFLD risk, and 6 (35%) were
replicated at nominal significance (P < 0.05) in the multiethnic population. When
assessing the associations between variants and risk of NAFLD by ethnicity, the
highest number of replicated variants was found in Latinos (7 [41%]), followed by
Japanese Americans (3 [18%]), Whites (2 [12%]), and Native Hawaiians (1 [6%]), with
none in African Americans (Supplementary Table S1).
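The proportions quoted in this subsection follow directly from the counts. The short sketch below recomputes them for the multiethnic population, assuming percentages were rounded to the nearest whole number; the helper name and dictionary layout are illustrative.

```python
def percent(count, total):
    """Percentage rounded half-up to a whole number (assumed rounding rule)."""
    return int(100 * count / total + 0.5)


# (loci assessed, directionally consistent, replicated at P < 0.05),
# taken from the text for the multiethnic population.
panels = {"80 loci": (80, 45, 13), "77 loci": (77, 43, 12), "17 loci": (17, 13, 6)}

summary = {
    name: (percent(consistent, n), percent(replicated, n))
    for name, (n, consistent, replicated) in panels.items()
}
```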
Polygenic Risk Score
Both PRS17 and PRS21 were associated with NAFLD risk. The odds ratio (OR) per 1
SD was 1.40 (95% CI = 1.31-1.51) for PRS17 and 1.43 (95% CI = 1.33-1.53) for PRS21 in
the overall multiethnic population (Table 2). Compared with individuals in the bottom
25% of the PRS, those in the top 25% had approximately double the risk for both
PRS17 (OR = 1.98; 95% CI = 1.58-2.48) and PRS21 (OR = 2.03; 95% CI = 1.62-2.56)
(Supplementary Table S2). For PRS17, the strongest per-SD association with NAFLD
risk was observed in Latinos (OR = 1.67; 95% CI = 1.44-1.93), followed by Native
Hawaiians (OR = 1.39; 95% CI = 1.12-1.73) and Japanese Americans (OR = 1.30;
95% CI = 1.20-1.41). The associations were not significant in Whites (OR = 1.14;
95% CI = 0.94-1.39) or African Americans (OR = 1.08; 95% CI = 0.86-1.36), although the
direction and magnitude of the associations were as expected. Similar associations
were observed for PRS21 across all racial/ethnic groups: the strongest per-SD
association with NAFLD risk was observed in Latinos (OR = 1.68; 95% CI = 1.45-1.94),
followed by Native Hawaiians (OR = 1.41; 95% CI = 1.14-1.75) and Japanese
Americans (OR = 1.31; 95% CI = 1.21-1.43), with non-significant associations in Whites
(OR = 1.15; 95% CI = 0.95-1.40) and African Americans (OR = 1.11; 95% CI = 0.88-1.39).
The discriminatory accuracy for NAFLD risk was the same for PRS17 and PRS21 in the
overall population (AUC = 0.78; 95% CI = 0.77-0.79). By race/ethnicity group, the AUCs
ranged from 0.56 to 0.69 for PRS17 and from 0.57 to 0.69 for PRS21
(Supplementary Table S3).
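For readers who want to relate the reported odds ratios, confidence intervals, and P-values to one another, the sketch below applies the usual Wald relationships for a logistic regression coefficient: OR = exp(beta) and 95% CI = exp(beta ± 1.96 × SE). It assumes the reported intervals are Wald intervals on the log-odds scale, which is not stated in the text, and the function names are illustrative.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z_crit=1.96):
    """Recover the log odds ratio and its standard error from a reported OR and CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se


def or_with_ci(beta, se, z_crit=1.96):
    """Odds ratio with a Wald confidence interval from a coefficient and SE."""
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )


def wald_p_value(beta, se):
    """Two-sided P-value from the standard normal distribution."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2))


# Per-SD estimate for PRS17 in the overall population, as reported in Table 2.
beta, se = log_or_and_se(1.40, 1.31, 1.51)
p_value = wald_p_value(beta, se)
```

Because the published estimates are rounded to two decimal places, a P-value recovered this way only approximates the one reported in Table 2.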
Although the confidence intervals overlapped, stronger NAFLD risk associations
were observed among Latinos for PRS17 (per SD: OR = 1.67; 95% CI = 1.44-1.93) and
PRS21 (per SD: OR = 1.68; 95% CI = 1.45-1.94) than for PRS11 (per SD: OR = 1.51;
95% CI = 1.31-1.74) (Table 2). Compared with Latinos in the bottom 25% of the PRS,
Latinos in the top 25% had a 2.73-fold (95% CI = 1.81-4.13), 3.17-fold (95% CI =
2.11-4.77), and 3.19-fold (95% CI = 2.11-4.82) increased risk for PRS11, PRS17, and
PRS21, respectively (Supplementary Table S2). In contrast, stronger associations were
observed for PRS11 than for PRS17 and PRS21 among African Americans [per SD: PRS11:
1.26 (1.01-1.57); PRS17: 1.08 (0.86-1.36); PRS21: 1.11 (0.88-1.39)], Native Hawaiians
[per SD: PRS11: 1.46 (1.18-1.80); PRS17: 1.39 (1.12-1.73); PRS21: 1.41 (1.14-1.75)],
Japanese Americans [per SD: PRS11: 1.33 (1.23-1.45); PRS17: 1.30 (1.20-1.41); PRS21:
1.31 (1.21-1.43)], and Whites [per SD: PRS11: 1.29 (1.06-1.56); PRS17: 1.14 (0.94-1.39);
PRS21: 1.15 (0.95-1.40)] (Table 2). Consistent with these results, and again with
overlapping confidence intervals, discriminatory accuracy for NAFLD risk among Latinos
was higher for PRS17 (AUC = 0.65; 95% CI = 0.61-0.68) and PRS21 (AUC = 0.65; 95% CI =
0.61-0.69) than for PRS11 (AUC = 0.63; 95% CI = 0.60-0.67) (Supplementary Table S3).
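The AUC values compared above can be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The sketch below computes that quantity directly from two lists of scores. It is a generic illustration with made-up numbers; whether the scores entering the AUCs in this study were the PRS alone or model-predicted risks including covariates is not specified here, so the function accepts any numeric score.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 1/2)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores for three cases and four controls.
example_cases = [0.9, 0.8, 0.4]
example_controls = [0.7, 0.3, 0.4, 0.1]
example_auc = auc(example_cases, example_controls)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which gives a scale for reading group-specific values such as 0.63 and 0.65.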
Discussion
In this multiethnic study, we assessed whether the performance of an 11-variant
PRS previously developed in the MEC could be improved by incorporating independent
SNPs from a 17-variant PRS developed in the Million Veteran Program (MVP) and
constructing a new 21-variant PRS. Both PRS17 and PRS21 were statistically
significantly associated with risk of NAFLD, with odds ratios per SD of 1.40 and 1.43,
respectively, and AUCs of approximately 0.78. Individuals in the top 25% of PRS17 and
PRS21 had an approximately 2-fold elevated risk of NAFLD. However, these estimates
were not significantly different from those for PRS11.
The large and ancestrally diverse MVP study that developed and externally
validated the PRS17 found that it strongly predicted NAFLD (P = 2.53E-10; effect per
SD not available) (45). In our ethnically diverse population, we observed a consistent
result: PRS17 was strongly associated with risk of NAFLD (P = 5.32E-21). A large
study in the UK Biobank among individuals of European ancestry found that PRS15,
a subset of PRS17, was associated with higher alanine aminotransferase (ALT)
levels (odds ratio [OR] = 1.14) (46). This association is consistent in direction
and magnitude with the association between PRS17 and risk of NAFLD observed among
White individuals in our current study (OR = 1.14; 95% confidence interval [CI] =
0.94-1.39), although our association was not significant, possibly because of limited
power in the current study.
In the overall population combined across racial/ethnic groups, PRS21 (OR = 1.43;
95% CI = 1.33-1.53) showed a slightly stronger association with risk of NAFLD than
PRS17 (OR = 1.40; 95% CI = 1.31-1.51) and PRS11 (OR = 1.40; 95% CI = 1.31-1.49),
although the confidence intervals overlapped. Although the association with risk of
NAFLD was stronger for PRS21 than for PRS11 in the combined (PRS21: OR = 1.43, 95% CI
= 1.33-1.53; PRS11: OR = 1.40, 95% CI = 1.31-1.49) and Latino (PRS21: OR = 1.68, 95% CI
= 1.45-1.94; PRS11: OR = 1.51, 95% CI = 1.31-1.74) populations, it was weaker among
African Americans (PRS21: OR = 1.11, 95% CI = 0.88-1.39; PRS11: OR = 1.26, 95% CI =
1.01-1.57), Native Hawaiians (PRS21: OR = 1.41, 95% CI = 1.14-1.75; PRS11: OR = 1.46,
95% CI = 1.18-1.80), Japanese Americans (PRS21: OR = 1.31, 95% CI = 1.21-1.43; PRS11:
OR = 1.33, 95% CI = 1.23-1.45), and Whites (PRS21: OR = 1.15, 95% CI = 0.95-1.40;
PRS11: OR = 1.29, 95% CI = 1.06-1.56). While PRS11 per SD was significantly associated
with risk of NAFLD within each of the five racial/ethnic populations, significant
associations for PRS21 were observed in the combined population and among Native
Hawaiians, Japanese Americans, and Latinos, but not among African Americans or Whites.
The AUC results were consistent in that the AUC among Latinos was higher for PRS21
(AUC = 0.65; 95% CI = 0.61-0.69) than for PRS11 (AUC = 0.63; 95% CI = 0.60-0.67), the
PRS11 value being that reported in Supplementary Table S3.
The results in the current study were consistent with the limited number of
studies that have evaluated NAFLD PRS in non-White populations. In a study
conducted among severely obese (BMI ≥ 40 kg/m²) Mexicans, a 4-variant PRS was
significantly associated with hepatic fat content and higher ALT levels (OR = 1.63; 95%
CI = 1.28-2.07) (47). In the largest study of liver biopsy-confirmed NAFLD among
Japanese (n = 902 cases), a 3-variant PRS was significantly associated with increased
NAFLD risk (AUC = 0.65; 95% CI = 0.63-0.67) (23). Additional exploratory and
association studies among large multiethnic populations are warranted to further
improve and assess the discrimination capacity of the PRS for NAFLD for specific
minority populations.
Strengths of the present study include the racially/ethnically diverse,
population-based design of the MEC, which is largely representative of its source
populations and includes comprehensive genetic data on participants (48).
This allowed us to conduct ethnic-specific analyses to assess the transportability of
NAFLD PRSs across racial/ethnic groups. Our study had several limitations. First,
NAFLD cases were identified using ICD codes from Medicare claims, which may lead to
selection of NAFLD cases with more severe disease. Because we did not have imaging
data, participants with undiagnosed NAFLD might have been inadvertently included in
the control group, which may attenuate associations towards the null.
Second, there was a limited number of NAFLD cases for certain ethnic-specific analyses.
This could explain the lack of significant associations in certain ethnic-specific analyses
despite the consistent direction of associations, and the lack of differences in effect
estimates and confidence intervals between race/ethnicity groups observed in
our study. Thus, the results of our current study should be validated in future studies
with larger sample sizes across racial/ethnic groups.
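The attenuation described in the first limitation can be illustrated numerically. In the sketch below, a fraction of the control group is assumed to consist of undiagnosed cases, so the exposure prevalence among controls becomes a mixture of the case and non-case prevalences. All numbers are hypothetical and chosen only to show the direction of the bias; "exposure" stands for, say, membership in the top PRS quartile.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)


def observed_or(p_exposed_cases, p_exposed_noncases, contamination):
    """Odds ratio when a fraction of 'controls' are actually undiagnosed cases."""
    p_controls = (
        (1 - contamination) * p_exposed_noncases + contamination * p_exposed_cases
    )
    return odds(p_exposed_cases) / odds(p_controls)


# Hypothetical prevalences: 40% of cases and 25% of true non-cases are exposed.
true_or = observed_or(0.40, 0.25, contamination=0.0)
biased_or = observed_or(0.40, 0.25, contamination=0.2)
```

With no contamination the odds ratio in this toy setting is 2.0; once some controls are undiagnosed cases, the observed odds ratio moves closer to 1, which is the attenuation towards the null noted above.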
In this ethnically diverse population, we showed the utility of PRS for NAFLD risk
stratification in multiple ethnic groups and assessed whether we could optimize the
performance of PRS in a multiethnic population or in ethnic-specific populations.
Although the overall performance of PRS21 did not appear to differ significantly from
that of PRS11, there was some evidence suggesting better performance of PRS21 among
individuals of Latino ancestry. Our study supports the notion that it may be fruitful to
optimize PRS for specific populations. Moreover, PRS may be useful in screening large
populations and in providing lifestyle modification advice to genetically predisposed
individuals, which could have a profound impact on prevention and screening strategies
for NAFLD and liver-related diseases.
References
1. Wree A, Broderick L, Canbay A, Hoffman HM, Feldstein AE. From NAFLD to
NASH to cirrhosis-new insights into disease mechanisms. Nature reviews
Gastroenterology & hepatology 2013;10:627-36
2. Younossi ZM, Koenig AB, Abdelatif D, Fazel Y, Henry L, Wymer M. Global
epidemiology of nonalcoholic fatty liver disease-Meta-analytic assessment of
prevalence, incidence, and outcomes. Hepatology 2016;64:73-84
3. Chalasani N, Younossi Z, Lavine JE, Diehl AM, Brunt EM, Cusi K, et al. The
diagnosis and management of non-alcoholic fatty liver disease: practice guideline
by the American Gastroenterological Association, American Association for the
Study of Liver Diseases, and American College of Gastroenterology.
Gastroenterology 2012;142:1592-609
4. Brunt EM, Wong VW, Nobili V, Day CP, Sookoian S, Maher JJ, et al.
Nonalcoholic fatty liver disease. Nat Rev Dis Primers 2015;1:15080
5. Carr RM, Oranu A, Khungar V. Nonalcoholic Fatty Liver Disease:
Pathophysiology and Management. Gastroenterol Clin North Am 2016;45:639-52
6. Chalasani N, Younossi Z, Lavine JE, Charlton M, Cusi K, Rinella M, et al. The
diagnosis and management of nonalcoholic fatty liver disease: Practice guidance
from the American Association for the Study of Liver Diseases. Hepatology
2018;67:328-57
7. Estes C, Razavi H, Loomba R, Younossi Z, Sanyal AJ. Modeling the epidemic of
nonalcoholic fatty liver disease demonstrates an exponential increase in burden
of disease. Hepatology 2018;67:123-33
8. Rich NE, Oji S, Mufti AR, Browning JD, Parikh ND, Odewole M, et al. Racial and
Ethnic Disparities in Nonalcoholic Fatty Liver Disease Prevalence, Severity, and
Outcomes in the United States: A Systematic Review and Meta-analysis. Clinical
gastroenterology and hepatology : the official clinical practice journal of the
American Gastroenterological Association 2018;16:198-210 e2
9. Ruhl CE, Everhart JE. Fatty liver indices in the multiethnic United States National
Health and Nutrition Examination Survey. Aliment Pharmacol Ther 2015;41:65-76
10. Setiawan VW, Stram DO, Porcel J, Lu SC, Le Marchand L, Noureddin M.
Prevalence of chronic liver disease and cirrhosis by underlying cause in
understudied ethnic groups: The multiethnic cohort. Hepatology 2016;64:1969-77
11. Younossi Z, Anstee QM, Marietti M, Hardy T, Henry L, Eslam M, et al. Global
burden of NAFLD and NASH: trends, predictions, risk factors and prevention. Nat
Rev Gastroenterol Hepatol 2018;15:11-20
12. Pais R, Barritt ASt, Calmus Y, Scatton O, Runge T, Lebray P, et al. NAFLD and
liver transplantation: Current burden and expected challenges. J Hepatol
2016;65:1245-57
13. Burra P, Becchetti C, Germani G. NAFLD and liver transplantation: Disease
burden, current management and future challenges. JHEP Rep 2020;2:100192
14. Adams LA, Lymp JF, St Sauver J, Sanderson SO, Lindor KD, Feldstein A, et al.
The natural history of nonalcoholic fatty liver disease: a population-based cohort
study. Gastroenterology 2005;129:113-21
15. Paradis V, Zalinski S, Chelbi E, Guedj N, Degos F, Vilgrain V, et al.
Hepatocellular carcinomas in patients with metabolic syndrome often develop
without significant liver fibrosis: a pathological analysis. Hepatology 2009;49:851-9
16. Rinella M, Charlton M. The globalization of nonalcoholic fatty liver disease:
Prevalence and impact on world health. Hepatology 2016;64:19-22
17. Stender S, Loomba R. PNPLA3 Genotype and Risk of Liver and All-Cause
Mortality. Hepatology 2020;71:777-9
18. Sookoian S, Pirola CJ. Genetic predisposition in nonalcoholic fatty liver disease.
Clin Mol Hepatol 2017;23:1-12
19. Anstee QM, Darlay R, Cockell S, Meroni M, Govaere O, Tiniakos D, et al.
Genome-wide association study of non-alcoholic fatty liver and steatohepatitis in
a histologically characterised cohort. J Hepatol 2020;73:505-15
20. Chambers JC, Zhang W, Sehmi J, Li X, Wass MN, Van der Harst P, et al.
Genome-wide association study identifies loci influencing concentrations of liver
enzymes in plasma. Nat Genet 2011;43:1131-8
21. DiStefano JK, Kingsley C, Craig Wood G, Chu X, Argyropoulos G, Still CD, et al.
Genome-wide analysis of hepatic lipid content in extreme obesity. Acta Diabetol
2015;52:373-82
22. Feitosa MF, Wojczynski MK, North KE, Zhang Q, Province MA, Carr JJ, et al.
The ERLIN1-CHUK-CWF19L1 gene cluster influences liver fat deposition and
hepatic inflammation in the NHLBI Family Heart Study. Atherosclerosis
2013;228:175-80
23. Kawaguchi T, Shima T, Mizuno M, Mitsumoto Y, Umemura A, Kanbara Y, et al.
Risk estimation model for nonalcoholic fatty liver disease in the Japanese using
multiple genetic markers. PLoS One 2018;13:e0185490
24. Kawaguchi T, Sumida Y, Umemura A, Matsuo K, Takahashi M, Takamura T, et
al. Genetic polymorphisms of the human PNPLA3 gene are strongly associated
with severity of non-alcoholic fatty liver disease in Japanese. PLoS One
2012;7:e38322
25. Kitamoto T, Kitamoto A, Yoneda M, Hyogo H, Ochi H, Nakamura T, et al.
Genome-wide scan revealed that polymorphisms in the PNPLA3, SAMM50, and
PARVB genes are associated with development and progression of nonalcoholic
fatty liver disease in Japan. Hum Genet 2013;132:783-92
26. Namjou B, Lingren T, Huang Y, Parameswaran S, Cobb BL, Stanaway IB, et al.
GWAS and enrichment analyses of non-alcoholic fatty liver disease identify new
trait-associated genes and pathways across eMERGE Network. BMC Med
2019;17:135
27. Parisinos CA, Wilman HR, Thomas EL, Kelly M, Nicholls RC, McGonigle J, et al.
Genome-wide and Mendelian randomisation studies of liver MRI yield insights
into the pathogenesis of steatohepatitis. J Hepatol 2020;73:241-51
28. Prins BP, Kuchenbaecker KB, Bao Y, Smart M, Zabaneh D, Fatemifar G, et al.
Genome-wide analysis of health-related biomarkers in the UK Household
Longitudinal Study reveals novel associations. Sci Rep 2017;7:11008
29. Romeo S, Kozlitina J, Xing C, Pertsemlidis A, Cox D, Pennacchio LA, et al.
Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver
disease. Nat Genet 2008;40:1461-5
30. Speliotes EK, Yerges-Armstrong LM, Wu J, Hernaez R, Kim LJ, Palmer CD, et
al. Genome-wide association analysis identifies variants associated with
nonalcoholic fatty liver disease that have distinct effects on metabolic traits. PLoS
Genet 2011;7:e1001324
31. Yuan X, Waterworth D, Perry JR, Lim N, Song K, Chambers JC, et al.
Population-based genome-wide association studies reveal six loci influencing
plasma levels of liver enzymes. Am J Hum Genet 2008;83:520-8
32. Miao Z, Garske KM, Pan DZ, Koka A, Kaminska D, Mannisto V, et al.
Identification of 90 NAFLD GWAS loci and establishment of NAFLD PRS and
causal role of NAFLD in coronary artery disease. HGG Adv 2022;3:100056
33. Vespasiani-Gentilucci U, Gallo P, Dell'Unto C, Volpentesta M, Antonelli-Incalzi R,
Picardi A. Promoting genetics in non-alcoholic fatty liver disease: Combined risk
score through polymorphisms and clinical variables. World J Gastroenterol
2018;24:4835-45
34. Wang J, Conti DV, Bogumil D, Sheng X, Noureddin M, Wilkens LR, et al.
Association of Genetic Risk Score With NAFLD in An Ethnically Diverse Cohort.
Hepatol Commun 2021;5:1689-703
35. Bustamante CD, Burchard EG, De la Vega FM. Genomics for the world. Nature
2011;475:163-5
36. Martin AR, Gignoux CR, Walters RK, Wojcik GL, Neale BM, Gravel S, et al.
Human Demographic History Impacts Genetic Risk Prediction across Diverse
Populations. Am J Hum Genet 2017;100:635-49
37. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature
2016;538:161-4
38. Rosenberg NA, Huang L, Jewett EM, Szpiech ZA, Jankovic I, Boehnke M.
Genome-wide association studies in diverse populations. Nat Rev Genet
2010;11:356-66
39. Wang J, Conti DV, Bogumil D, Sheng X, Noureddin M, Wilkens LR, et al.
Association of Genetic Risk Score With NAFLD in An Ethnically Diverse Cohort.
Hepatol Commun 2021
40. Vujkovic M, Ramdas S, Lorenz KM, Guo X, Darlay R, Cordell HJ, et al. A
multiancestry genome-wide association study of unexplained chronic ALT
elevation as a proxy for nonalcoholic fatty liver disease with histological and
radiological validation. Nat Genet 2022;54:761-71
41. Kolonel LN, Henderson BE, Hankin JH, Nomura AM, Wilkens LR, Pike MC, et al.
A multiethnic cohort in Hawaii and Los Angeles: baseline characteristics.
American journal of epidemiology 2000;151:346-57
42. Wang H, Haiman CA, Kolonel LN, Henderson BE, Wilkens LR, Le Marchand L,
et al. Self-reported ethnicity, genetic structure and the impact of population
stratification in a multiethnic study. Hum Genet 2010;128:165-77
43. Noureddin M, Zelber-Sagi S, Wilkens LR, Porcel J, Boushey CJ, Le Marchand L,
et al. Diet Associations With Nonalcoholic Fatty Liver Disease in an Ethnically
Diverse Population: The Multiethnic Cohort. Hepatology 2019
44. Conomos MP, Miller MB, Thornton TA. Robust inference of population structure
for ancestry prediction and correction of stratification in the presence of
relatedness. Genetic epidemiology 2015;39:276-93
45. Vujkovic M, Ramdas S, Lorenz KM, Guo X, Darlay R, Cordell HJ, et al. A
multiancestry genome-wide association study of unexplained chronic ALT
elevation as a proxy for nonalcoholic fatty liver disease with histological and
radiological validation. Nat Genet 2022;54:761-71
46. Schnurr TM, Katz SF, Justesen JM, O'Sullivan JW, Saliba-Gustafsson P,
Assimes TL, et al. Interactions of physical activity, muscular fitness, adiposity,
and genetic risk for NAFLD. Hepatol Commun 2022;6:1516-26
47. León-Mimila P, Vega-Badillo J, Gutiérrez-Vidal R, Villamil-Ramírez H, Villareal-
Molina T, Larrieta-Carrasco E, et al. A genetic risk score is associated with
hepatic triglyceride content and non-alcoholic steatohepatitis in Mexicans with
morbid obesity. Exp Mol Pathol 2015;98:178-83
48. Kolonel LN, Altshuler D, Henderson BE. The multiethnic cohort study: exploring
genes, lifestyle and cancer risk. Nat Rev Cancer 2004;4:519-27
Tables
Table 1. Characteristics of NAFLD cases and controls in the MEC.
Characteristic                              NAFLD cases     Controls
n                                           1448            8444
Race/ethnicity [N (%)]
  African American                          75 (5.2)        2511 (29.7)
  Native Hawaiian                           94 (6.5)        1911 (22.6)
  Japanese American                         838 (57.9)      2134 (25.3)
  Latino                                    261 (18.0)      1547 (18.3)
  White                                     180 (12.4)      341 (4.0)
Age at blood collection, years [Mean (SD)]  66.35 (7.65)    66.81 (7.97)
Age at blood collection by race/ethnicity
  African American                          68.68 (7.26)    68.62 (8.01)
  Native Hawaiian                           63.54 (7.00)    64.30 (7.38)
  Japanese American                         66.29 (7.81)    67.77 (8.27)
  Latino                                    66.76 (6.99)    66.53 (7.03)
  White                                     66.54 (7.96)    62.75 (7.87)
Female [N (%)]                              890 (61.5)      4544 (53.8)
Female by race/ethnicity
  African American                          49 (65.3)       1549 (61.7)
  Native Hawaiian                           53 (56.4)       1033 (54.1)
  Japanese American                         522 (62.3)      976 (45.7)
  Latino                                    165 (63.2)      816 (52.7)
  White                                     101 (56.1)      170 (49.9)
BMI, kg/m² [Mean (SD)]                      27.12 (4.88)    27.61 (5.19)
BMI by race/ethnicity
  African American                          31.02 (5.54)    28.61 (5.42)
  Native Hawaiian                           30.21 (5.56)    28.96 (5.80)
  Japanese American                         25.62 (3.73)    25.45 (4.11)
  Latino                                    29.25 (5.37)    27.89 (4.42)
  White                                     27.77 (5.28)    25.04 (3.83)
Table 2. Associations between PRSs per standard deviation and risk of NAFLD, overall
and by race/ethnicity groups.
PRS    Population         Controls  Cases  OR (95% CI)^a      P
PRS11  All Combined       8444      1448   1.40 (1.31-1.49)   2.15E-23
PRS11  African American   2511      75     1.26 (1.01-1.57)   4.30E-02
PRS11  Native Hawaiian    1911      94     1.46 (1.18-1.80)   3.81E-04
PRS11  Japanese American  2134      838    1.33 (1.23-1.45)   1.30E-11
PRS11  Latino             1547      261    1.51 (1.31-1.74)   1.13E-08
PRS11  White              341       180    1.29 (1.06-1.56)   1.06E-02
PRS17  All Combined       8444      1448   1.40 (1.31-1.51)   5.32E-21
PRS17  African American   2511      75     1.08 (0.86-1.36)   4.93E-01
PRS17  Native Hawaiian    1911      94     1.39 (1.12-1.73)   2.72E-03
PRS17  Japanese American  2134      838    1.30 (1.20-1.41)   7.00E-10
PRS17  Latino             1547      261    1.67 (1.44-1.93)   4.54E-12
PRS17  White              341       180    1.14 (0.94-1.39)   1.92E-01
PRS21  All Combined       8444      1448   1.43 (1.33-1.53)   1.69E-22
PRS21  African American   2511      75     1.11 (0.88-1.39)   3.75E-01
PRS21  Native Hawaiian    1911      94     1.41 (1.14-1.75)   1.81E-03
PRS21  Japanese American  2134      838    1.31 (1.21-1.43)   1.29E-10
PRS21  Latino             1547      261    1.68 (1.45-1.94)   2.14E-12
PRS21  White              341       180    1.15 (0.95-1.40)   1.57E-01
^a Models adjusted for age, sex and PCs 1-5. The all-combined model was additionally
adjusted for race/ethnicity.
Supplementary Table S1. NAFLD GWAS variants from the Million Veteran Program
(MVP) and the Multiethnic Cohort (MEC) in the multiethnic population.
SNP CHR Position Gene Alleles (Risk/Ref) Risk Allele Frequency OR (95% CI) P-value Study
rs2642438 1 220970028 MTARC1 G/A 0.726 1.08 (1.06-1.09) 6.65E-24 MVP
rs6734238 2 113841030 IL1RN A/G 0.593 1.06 (1.05-1.07) 4.94E-19 MVP
rs13389219 2 165528876 COBLL1; SCN2A C/T 0.568 1.05 (1.04-1.07) 8.20E-16 MVP
rs17036160 3 12329783 PPARG C/T 0.885 1.07 (1.05-1.09) 3.39E-10 MVP
rs10433937 4 88230100 HSD17B13 T/G 0.745 1.08 (1.07-1.1) 5.28E-26 MVP
rs17598226 4 100496891 MTTP C/G 0.739 1.04 (1.03-1.06) 5.61E-09 MVP
rs4841132 8 9183596 PPP1R3B; TNKS; MFHAS1 A/G 0.106 1.13 (1.11-1.16) 6.62E-32 MVP
rs2980888 8 126507308 TRIB1 T/C 0.286 1.14 (1.12-1.15) 4.21E-72 MVP
rs10883451 10 101924418 ERLIN1 T/C 0.563 1.16 (1.15-1.18) 2.65E-112 MVP
rs4918722 10 113947040 GPAM C/T 0.277 1.08 (1.06-1.1) 1.75E-23 MVP
rs28929474 14 94844947 SERPINA1; SERPINA6 T/C 0.017 1.64 (1.55-1.72) 9.01E-73 MVP
rs56094641 16 53806453 FTO; IRX3; IRX5 G/A 0.377 1.04 (1.03-1.06) 1.36E-09 MVP
rs1801689 17 64210580 APOH C/A 0.031 1.19 (1.14-1.23) 1.46E-18 MVP
rs11668950 19 18282940 JUND; IFI30; MPV17L2; PIK3R2 A/G 0.258 1.05 (1.03-1.06) 2.22E-10 MVP
rs58542926 19 19379549 TM6SF2 T/C 0.07 1.23 (1.2-1.26) 6.54E-62 MVP
rs5117 19 45418790 APOE; APOC1 T/C 0.769 1.07 (1.06-1.09) 2.21E-20 MVP
rs738408 22 44324730 PNPLA3 T/C 0.239 1.31 (1.29-1.33) 3.99E-273 MVP
rs36086195 1 16510894 EPHA2 T/C 0.525 1.04 (1.03-1.06) 4.50E-10 MVP
rs79598313 1 27284913 PIGV; GPN2; SYTL1; SLC9A1 T/C 0.021 1.19 (1.13-1.24) 4.91E-13 MVP
rs74816838 1 161643560 FCGR2A; FCGR2B T/C 0.106 1.08 (1.06-1.11) 1.51E-10 MVP
rs1337101 1 219726100 LYPLAL1; SLC30A10 G/T 0.703 1.05 (1.04-1.07) 1.98E-12 MVP
rs848559 2 36694497 CRIM1 A/T 0.859 1.06 (1.04-1.08) 1.58E-08 MVP
rs73024760 2 169885122 ABCB11 T/C 0.045 1.11 (1.08-1.14) 7.59E-11 MVP
rs2943652 2 227108446 IRS1; MIR5702 T/C 0.647 1.06 (1.05-1.08) 1.04E-19 MVP
rs7653249 3 136005792 PCCB G/C 0.771 1.08 (1.06-1.1) 3.51E-22 MVP
rs4683438 3 142652559 PAQR9;U2SURP G/T 0.662 1.05 (1.03-1.06) 4.73E-12 MVP
rs686250 6 32585055 HLA G/A 0.585 1.04 (1.03-1.05) 3.06E-09 MVP
rs4711750 6 43757082 VEGFA A/T 0.465 1.04 (1.02-1.05) 4.79E-08 MVP
rs4734654 8 103669991 KLF10 A/G 0.658 1.04 (1.03-1.06) 1.55E-10 MVP
rs2737217 8 116630311 TRPS1 A/G 0.392 1.04 (1.03-1.06) 3.92E-10 MVP
rs7041363 9 117146043 AKNA; ORM1; ORM2 C/G 0.541 1.14 (1.12-1.15) 1.02E-81 MVP
rs687621 9 136137065 CEL; ADAMTS13; ABO G/A 0.354 1.04 (1.03-1.06) 7.91E-11 MVP
rs11601507 11 5701074 TRIM5 A/C 0.072 1.1 (1.08-1.13) 1.53E-14 MVP
rs146774114 12 49743142 DNAJC22 G/A 0.985 1.17 (1.11-1.24) 2.50E-08 MVP
rs9668670 12 53278512 KRT84;KRT74 A/T 0.654 1.06 (1.04-1.07) 2.89E-15 MVP
rs34123446 12 122511238 MLXIP A/G 0.556 1.04 (1.03-1.06) 2.82E-11 MVP
rs11621792 14 24871926 NFATC4 T/C 0.415 1.04 (1.03-1.06) 5.48E-10 MVP
rs3935942 15 73971361 CD276 A/C 0.384 1.06 (1.04-1.07) 2.72E-15 MVP
rs4782568 16 83980529 OSGIN1; MLYCD C/G 0.58 1.07 (1.05-1.08) 8.82E-21 MVP
rs8082024 17 47945460 KAT7; PDK2 C/T 0.322 1.04 (1.03-1.06) 2.32E-08 MVP
rs2207132 20 39142516 MAFB A/G 0.028 1.19 (1.13-1.26) 1.58E-10 MVP
rs132665 22 36564170 APOL3 A/G 0.855 1.07 (1.05-1.09) 1.18E-11 MVP
rs6541349 1 93787867 CCDC18; FNBP1L C/T 0.268 1.05 (1.03-1.07) 1.76E-09 MVP
rs6543007 2 101663584 IL1R2 T/C 0.431 1.04 (1.02-1.05) 5.00E-08 MVP
rs11683409 2 112770134 MERTK G/C 0.41 1.04 (1.03-1.06) 9.19E-12 MVP
rs10201587 2 202202791 CASP8 A/G 0.523 1.04 (1.03-1.06) 4.18E-12 MVP
rs11683367 2 233510011 EFHD1 T/C 0.598 1.06 (1.04-1.07) 6.38E-18 MVP
rs934295 3 149122431 TM4SF1; CP; WWTR1 A/T 0.36 1.06 (1.05-1.07) 2.93E-16 MVP
rs61791108 3 170732742 SLC2A2 A/G 0.03 1.12 (1.08-1.16) 2.73E-09 MVP
rs574044675 3 172274232 TNFSF10 A/C 0.977 1.26 (1.2-1.33) 1.96E-19 MVP
rs12500824 4 77416627 SHROOM3 A/G 0.391 1.05 (1.03-1.06) 7.92E-13 MVP
rs138033684 6 71895252 OGFRL1 G/T 0.006 1.97 (1.58-2.45) 1.42E-09 MVP
rs799165 7 73052057 MLXIPL; BCL7B A/T 0.125 1.06 (1.04-1.08) 1.32E-09 MVP
rs115038698 7 87024718 ABCB4 T/C 0.012 1.9 (1.66-2.17) 3.51E-21 MVP
rs4484649 8 10571491 RP1L1; SOX7 C/A 0.419 1.05 (1.03-1.06) 1.38E-11 MVP
rs141505249 8 145732114 GPT G/C 0.997 7.54 (5.92-9.59) 7.15E-61 MVP
rs35199395 10 70983936 HKDC1 C/G 0.589 1.04 (1.03-1.05) 2.79E-08 MVP
rs148337160 10 104166504 FBXL15; ELOVL3; PSD C/T 0.063 1.09 (1.06-1.12) 2.89E-11 MVP
rs174535 11 61551356 FADS1; FADS2; FADS3 T/C 0.664 1.07 (1.05-1.08) 1.59E-20 MVP
rs56175344 11 93864393 PANX1 C/G 0.876 1.14 (1.11-1.16) 4.60E-40 MVP
rs1626329 12 121622023 P2RX7; HNF1A T/C 0.388 1.05 (1.04-1.07) 2.61E-14 MVP
rs2296285 13 29009673 FLT1 A/T 0.335 1.04 (1.03-1.05) 1.76E-08 MVP
rs340009 15 60899639 RORA; ANXA2 A/C 0.399 1.04 (1.03-1.06) 3.65E-11 MVP
rs7168849 15 90346227 ANPEP A/G 0.765 1.07 (1.05-1.09) 1.67E-11 MVP
rs12149380 16 72043546 DHODH; HP; HPR G/C 0.228 1.05 (1.03-1.07) 2.25E-09 MVP
rs7599 19 36038390 TMEM147; ATP4A A/G 0.388 1.05 (1.03-1.06) 1.05E-12 MVP
rs6059896 20 33111783 AHCY; ITCH C/T 0.485 1.05 (1.04-1.06) 1.80E-13 MVP
rs1547014 22 29100711 CHEK2 C/T 0.658 1.07 (1.05-1.08) 1.57E-21 MVP
rs1047891 2 211540507 CPS1; ACADL A/C 0.325 1.04 (1.02-1.05) 2.76E-08 MVP
rs3852142 5 55796968 MAP3K1 A/T 0.923 1.07 (1.05-1.1) 1.90E-08 MVP
rs60315134 8 8670599 PRAG1; CLDN23 G/A 0.457 1.04 (1.02-1.05) 2.88E-08 MVP
rs1658943 9 6676953 UHRF2 C/T 0.842 1.05 (1.03-1.07) 3.54E-08 MVP
rs10774625 12 111910219 CUX2; SH2B3; ATXN2 A/G 0.465 1.04 (1.03-1.05) 9.75E-09 MVP
rs2727324 17 61922102 SMARCD2; DDX42 C/G 0.368 1.04 (1.02-1.05) 1.24E-08 MVP
rs3810367 19 4342847 SIRT6; STAP2 G/T 0.374 1.04 (1.02-1.05) 3.77E-08 MVP
rs8108722 19 10347084 HDGFL2; S1PR2 T/C 0.22 1.04 (1.03-1.06) 2.25E-08 MVP
rs4805033 19 33839554 CEBPA; CEBPG A/G 0.561 1.04 (1.02-1.05) 3.06E-08 MVP
rs4940689 18 56089116 NEDD4L A/G 0.205 1.06 (1.03-1.08) 1.21E-08 MVP
rs144127357 9 71829595 TJP2 C/T 0.927 1.19 (1.12-1.27) 2.39E-08 MVP
rs2666559 11 64439227 NRXN2 G/T 0.809 1.13 (1.08-1.17) 1.87E-08 MVP
rs1260326 2 27730940 GCKR T/C 0.370 1.16 (1.06-1.28) 1.00E-03 MEC
rs13118664 4 88239609 HSD17B13 A/T 0.760 1.1 (0.99-1.22) 7.00E-02 MEC
rs2954021 8 126482077 TRIB2 A/G 0.440 1.07 (0.98-1.17) 1.20E-01 MEC
rs4240624 8 9184231 PPP1R3B G/A 0.110 1.13 (0.95-1.33) 1.60E-01 MEC
rs10883437 10 101795361 CPN1 T/A 0.640 1.05 (0.95-1.16) 3.70E-01 MEC
rs429358 19 45411941 APOE T/C 0.850 1.08 (0.94-1.24) 2.70E-01 MEC
rs4808199 19 19545099 GATAD2A A/G 0.240 1.15 (1.05-1.27) 3.00E-03 MEC
rs641738 19 54676763 MBOAT7 T/C 0.340 1.16 (1.04-1.29) 1.00E-02 MEC
rs738409 22 44324727 PNPLA3 G/C 0.330 1.39 (1.28-1.52) 1.05E-13 MEC
Supplementary Table S2. Associations between PRS quartiles and risk of NAFLD,
overall and by race/ethnicity groups.
PRS Population Controls Cases PRS quartile^a OR (95% CI)^b P
11
All Combined
2111 156 Q1 (ref.) (ref.)
2111 247 Q2 1.16 (0.92-1.45) 2.05E-01
2112 358 Q3 1.38 (1.11-1.71) 3.43E-03
2110 687 Q4 2.16 (1.75-2.66) 3.70E-13
African American
628 13 Q1 (ref.) (ref.)
628 18 Q2 1.43 (0.69-2.94) 3.36E-01
628 22 Q3 1.63 (0.81-3.29) 1.74E-01
627 22 Q4 1.79 (0.9-3.56) 9.97E-02
Native Hawaiian
478 19 Q1 (ref.) (ref.)
478 13 Q2 0.52 (0.25-1.09) 8.43E-02
477 26 Q3 1.07 (0.58-1.98) 8.20E-01
478 36 Q4 1.8 (1.03-3.15) 3.84E-02
Japanese American
534 144 Q1 (ref.) (ref.)
533 172 Q2 1.16 (0.9-1.5) 2.49E-01
533 215 Q3 1.32 (1.03-1.7) 2.64E-02
534 307 Q4 2.12 (1.68-2.67) 3.22E-10
Latino
387 36 Q1 (ref.) (ref.)
387 48 Q2 1.05 (0.66-1.67) 8.44E-01
386 67 Q3 2.04 (1.34-3.11) 8.20E-04
387 110 Q4 2.73 (1.81-4.13) 1.78E-06
White
86 32 Q1 (ref.) (ref.)
85 45 Q2 1.7 (0.95-3.04) 7.31E-02
85 39 Q3 1.43 (0.79-2.59) 2.39E-01
85 64 Q4 2.27 (1.29-3.99) 4.37E-03
17
All Combined
2111 151 Q1 (ref.) (ref.)
2111 207 Q2 1.06 (0.84-1.35) 6.12E-01
2111 384 Q3 1.35 (1.07-1.69) 9.92E-03
2111 706 Q4 1.98 (1.58-2.48) 2.61E-09
African American
628 23 Q1 (ref.) (ref.)
628 13 Q2 0.76 (0.39-1.47) 4.08E-01
627 15 Q3 0.72 (0.37-1.41) 3.38E-01
628 24 Q4 1.1 (0.6-2.01) 7.56E-01
Native Hawaiian
478 19 Q1 (ref.) (ref.)
478 19 Q2 1.21 (0.63-2.33) 5.72E-01
477 19 Q3 1.22 (0.63-2.36) 5.53E-01
478 37 Q4 1.98 (1.06-3.71) 3.30E-02
Japanese American
534 131 Q1 (ref.) (ref.)
533 191 Q2 1.38 (1.07-1.79) 1.31E-02
533 224 Q3 1.63 (1.27-2.09) 1.40E-04
534 292 Q4 2.2 (1.73-2.8) 1.26E-10
Latino
387 37 Q1 (ref.) (ref.)
387 44 Q2 1.27 (0.81-1.99) 3.06E-01
386 58 Q3 1.47 (0.95-2.29) 8.70E-02
387 122 Q4 3.17 (2.11-4.77) 3.04E-08
White
86 42 Q1 (ref.) (ref.)
85 24 Q2 0.73 (0.41-1.31) 2.97E-01
85 56 Q3 1.07 (0.62-1.85) 7.97E-01
85 58 Q4 1.46 (0.86-2.48) 1.64E-01
21 All Combined
2111 142 Q1 (ref.) (ref.)
2111 183 Q2 0.98 (0.77-1.25) 8.93E-01
2111 379 Q3 1.4 (1.11-1.76) 4.34E-03
130
2111 744 Q4 2.03 (1.62-2.56) 1.24E-09
African American
628 14 Q1 (ref.) (ref.)
628 18 Q2 0.75 (0.38-1.48) 4.09E-01
627 16 Q3 0.76 (0.39-1.5) 4.30E-01
628 27 Q4 1.26 (0.69-2.3) 4.48E-01
Native Hawaiian
478 23 Q1 (ref.) (ref.)
478 8 Q2 0.98 (0.5-1.91) 9.54E-01
477 22 Q3 1.27 (0.67-2.4) 4.66E-01
478 41 Q4 1.87 (1.01-3.46) 4.60E-02
Japanese American
534 130 Q1 (ref.) (ref.)
533 160 Q2 1.32 (1.02-1.71) 3.27E-02
533 247 Q3 1.64 (1.28-2.11) 9.15E-05
534 301 Q4 2.14 (1.68-2.72) 5.17E-10
Latino
387 36 Q1 (ref.) (ref.)
387 38 Q2 1.13 (0.71-1.81) 6.04E-01
386 70 Q3 1.76 (1.14-2.72) 1.02E-02
387 117 Q4 3.19 (2.11-4.82) 3.27E-08
White
86 40 Q1 (ref.) (ref.)
85 34 Q2 0.78 (0.44-1.4) 4.05E-01
85 36 Q3 1.08 (0.62-1.86) 7.89E-01
85 70 Q4 1.56 (0.92-2.65) 9.90E-02
a
PRS quartiles determined using controls' distributions within each population.
b
Models adjusted for age, sex and PCs 1-5. All combined model additionally adjusted for race/ethnicity.
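As a rough cross-check on the table above, an unadjusted (crude) odds ratio can be computed directly from the case and control counts. The sketch below uses a Woolf (log-based) 95% confidence interval; it is not the adjusted logistic regression model used for the reported estimates, and the function name is hypothetical.

```python
import math


def crude_odds_ratio(cases_exp, controls_exp, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval."""
    odds_ratio = (cases_exp * controls_ref) / (controls_exp * cases_ref)
    # Standard error of log(OR) from the four cell counts of the 2x2 table.
    se_log = math.sqrt(
        1 / cases_exp + 1 / controls_exp + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Japanese American, PRS11: Q4 (307 cases, 534 controls)
# versus Q1 (144 cases, 534 controls), counts taken from the table.
or_q4, lower_q4, upper_q4 = crude_odds_ratio(307, 534, 144, 534)
print(f"{or_q4:.2f} ({lower_q4:.2f}-{upper_q4:.2f})")
```

For this stratum the crude estimate is about 2.13, close to the adjusted 2.12 (1.68-2.67) in the table. In the All Combined rows the crude and adjusted values differ much more (the crude Q4 estimate for PRS11 is above 4), presumably because the adjusted model also accounts for race/ethnicity.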
Supplementary Table S3. Area under the receiver operating characteristic curve (AUC)
of PRS (a) for NAFLD risk, overall and by race/ethnicity group.

Population         No PRS            PRS11             PRS17             PRS21
All Combined       0.77 (0.75-0.78)  0.78 (0.77-0.79)  0.78 (0.77-0.79)  0.78 (0.77-0.79)
African American   0.56 (0.49-0.63)  0.6 (0.53-0.67)   0.56 (0.49-0.63)  0.57 (0.5-0.64)
Native Hawaiian    0.57 (0.51-0.63)  0.62 (0.55-0.68)  0.59 (0.53-0.65)  0.6 (0.54-0.66)
Japanese American  0.61 (0.59-0.64)  0.64 (0.62-0.66)  0.64 (0.61-0.66)  0.64 (0.62-0.66)
Latino             0.57 (0.53-0.61)  0.63 (0.6-0.67)   0.65 (0.61-0.68)  0.65 (0.61-0.69)
White              0.69 (0.64-0.74)  0.7 (0.65-0.74)   0.69 (0.64-0.74)  0.69 (0.64-0.74)

(a) PRS was analyzed per SD increase; regression models were adjusted for age, sex, race/ethnicity (for all
combined), and PCs 1-5.
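The AUC values above can be read as the probability that a randomly chosen case has a higher predicted risk than a randomly chosen control. A minimal sketch of that rank-based definition follows; the function name and the toy scores are hypothetical, and in the study the scores would be predicted risks from the adjusted regression models rather than the values shown here.

```python
def auc(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher.

    Ties count as half a pair, which matches the Mann-Whitney formulation
    of the area under the ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy predicted risks (illustrative, not study data).
toy_cases = [0.9, 0.4, 0.7]
toy_controls = [0.1, 0.5, 0.3, 0.7]
toy_auc = auc(toy_cases, toy_controls)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is why a change from 0.57 to 0.65 in one group is a more meaningful gain than a change from 0.77 to 0.78 in the combined sample.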
Chapter 5: Conclusions and Future Directions
Association of Prostate-Specific Antigen Levels with Prostate Cancer Risk
in a Multiethnic Population: Stability over Time and Comparison with
Polygenic Risk Score
Summary
In Chapter 2, we assessed the association of PSA levels measured up to 10+ years
before a prostate cancer (PCa) diagnosis with PCa risk, and compared the discriminative
ability of PSA with that of a polygenic risk score (PRS) for PCa among men in the
Multiethnic Cohort. We found
that a PSA measurement taken 5 years on average before diagnosis was associated
with PCa risk. The association was observed to be consistent across racial/ethnic
populations, was significantly stronger for men with low- versus high-grade disease but
similar for advanced and lethal versus localized disease. We also found PSA to be less
effective as a marker of risk with increased length of time since measurement, and at
10+ years before diagnosis, the magnitude of the association of PSA with PCa risk was
observed to be equivalent to that of the PRS. While we did not find PSA to differentiate
risk of advanced versus localized disease, only a small fraction of non-localized (23%)
or lethal disease (32%) occurred in men with PSA levels below the median, diagnosed
10 or more years after blood draw. This suggests, as indicated by others (1), that a risk-
stratified approach to screening is warranted (based on early life PSA and/or PRS), with
men at low risk being screened less frequently than men at high risk, which would
translate into fewer biopsies, associated complications, and over-diagnoses for men at
lower risk of dying from PCa.
Future Directions
An immediate next study of interest would be to update the PRS used in the
current study. Our current PRS is based on 269 known PCa variants (2). A recent
investigation in a multi-ancestry GWAS meta-analysis discovered 187 novel risk
variants for PCa, increased the number of risk variants for PCa to 451, and
demonstrated this 451-variant multi-ancestry PRS to improve genetic risk prediction of
PCa across populations (Wang et al., currently in review). It would be of great interest
to assess the discriminative ability of PSA relative to this updated PRS on PCa
outcomes across men of different ancestries.
The ultimate goal of investigations of PSA and PRS is to improve screening
methods for prostate cancer and decrease overdiagnosis. Thus, a big-picture future
direction for this project is a large-scale, multiethnic randomized clinical trial to
evaluate how useful a risk-stratified approach to screening based on PRS and PSA would be.
The trial would compare a general PSA screening arm with a screening arm that
incorporates PRS in addition to PSA, and would evaluate the utility of PSA when
integrated with PRS. This trial would be conducted on a national scale similar to that of
the Women’s Health Initiative (WHI) (3) and is expected to cost tens of millions of dollars.
It would need to be a large-scale, long-term, multi-institutional study led at the national
level by agencies such as the National Institutes of Health (NIH).
Interaction of Polygenic Risk Score and Lifestyle Factors on the Risk of
Breast Cancer in a Multiethnic Population
Summary
In Chapter 3, we conducted a nested case-control study of 3,229 breast cancer
cases and 3,921 controls from five major racial/ethnic groups in the Multiethnic Cohort
(MEC). We examined a PRS of 313 variants in association with breast cancer risk and
evaluated the interaction with selected modifiable lifestyle factors on the risk of breast
cancer. We found that the 313-variant PRS was strongly associated with breast cancer
risk across the five racial/ethnic groups of Native Hawaiian, White, Latina, Japanese
American and African American women (OR=1.32 to 2.07 per SD), but the associations
were weaker in African American women, which is consistent with the lack of
transferability of the breast cancer PRS in this population (4). Our findings also
suggest that maintaining a healthy BMI may offset the genetic risk of ER+ breast
cancer, while adhering to a healthy dietary pattern or being physically active may further
reduce risk of ER- BCa for women with lower genetic risk. Furthermore, we examined
the association between a lifestyle score and risk of BCa. While we observed a significant
association between the lifestyle score and risk of BCa, we did not observe a significant
interaction between the PRS and the lifestyle score on risk of BCa.
Future Directions
An immediate next study of interest would be to assess the associations of
modifiable breast cancer risk factors and the PRS with another breast cancer subtype,
human epidermal growth factor receptor 2-positive (HER2+) disease. The HER2-enriched
subtype is rarer and has a worse prognosis than hormone receptor-positive (HR+) subtypes
(5). Studies of HER2+ breast cancer are still relatively limited because HER2 protein
expression was often underreported in pathology reports before 2005. Surveillance,
Epidemiology, and End Results (SEER) registries began routinely collecting HER2
receptor status for BCa cases in 2010 (6). The subtype information available from
SEER may provide valuable insight into breast cancer progression and prevention for
this important disease subtype.
The current BCa PRS was developed based on GWAS summary statistics from
populations of European ancestry (7). As suggested in our current study and previous
studies (4,7,8), even though the 313-variant PRS was comparably effective in stratifying
breast cancer risk in Native Hawaiians, Whites, Japanese Americans and Latinas, the
transferability was diminished for African American women. These results highlight the
need to improve representation of population groups by increasing the inclusion of
racially and ethnically diverse individuals, particularly women of African ancestry, in
genomic research cohorts, and to develop a multiethnic PRS or an ethnic-specific PRS
that is optimized for women of different ancestries.
For future directions in the field of gene-environment (GxE) interactions, a
comprehensive examination of the interplay between genetic and non-genetic risk
factors is needed. Thus, a big-picture future direction for this project would be to
conduct a large-scale multiethnic study that takes into consideration modifiable lifestyle
factors, in addition to classical risk factors, to assess the comprehensive interplay
between non-genetic and genetic factors in breast cancer subtypes for women across
different ancestries.
Characterization and Evaluation of Polygenic Risk Scores for Nonalcoholic
Fatty Liver Disease in a Multiethnic Population
Summary
In Chapter 4, we conducted a nested case-control study with 1,448 NAFLD cases
and 8,444 controls in the Multiethnic Cohort Study. We assessed whether the
performance of an 11-variant PRS previously developed in the MEC (9) could be
improved by incorporating independent SNPs from a 17-variant PRS developed in the
Million Veteran Program (MVP) (10) and constructing a new 21-variant PRS. Both
PRS17 and PRS21 were statistically significantly associated with risk of NAFLD, with
odds ratios per SD ranging between 1.40 and 1.43, and AUCs approximately 0.78.
Individuals in the top 25% of PRS17 and PRS21 had approximately a 2-fold elevated risk
of NAFLD. However, these estimates were not significantly different from the
performance of the PRS11. Compared with PRS11, the association of PRS21 with risk of
NAFLD was stronger in the overall and Latino populations but weaker in the
African American, Native Hawaiian, Japanese American and White populations. The
current study highlights the need for additional exploratory and association studies
among large multiethnic populations to further improve and assess the discrimination
capacity of the PRS for NAFLD for specific minority populations.
Future Directions
As suggested by our study, it may be fruitful to develop an ethnic-specific PRS
that is optimized for a specific population. Advanced statistical methods, such as machine
learning approaches, could be useful for improving prediction performance by selecting
additional variants that are informative for risk prediction but have smaller effect sizes,
and that would be difficult to identify through GWAS without a sufficiently large sample.
Thus, an immediate next step would be to use machine learning methods, such as
Bayesian approaches, to select variants and weights for constructing a PRS, and to assess
which subset of SNPs and weights works best for each racial/ethnic group.
Another possible method to construct the NAFLD PRS may be to select SNPs to
include in the PRS based on known specific pathways or function. NAFLD is a complex
disease affected by many different pathways such as fat metabolism, glucose
metabolism and inflammatory pathways. Some SNPs may be risk factors for some
pathways while protective for others. This may lead to reduced associations observed
for SNPs with opposing effects in multiple pathways. Thus, it would be of interest to
construct PRSs based on specific pathways which may give clarity on specific pathways
that could influence NAFLD etiology.
The current gold standard for identifying fatty liver is using imaging techniques
such as elastography (FibroScan) and magnetic resonance imaging (MRI) (11-14).
To date, NAFLD GWASs have examined different outcomes across studies, which
makes validating the results difficult. To advance the field of NAFLD genetics, a future
direction should be to establish a uniform and validated method to identify fatty liver. An
established, validated method would allow pooling NAFLD cases from different studies,
which would increase power in genetic studies and could lead to discovery of additional
NAFLD risk-associated variants, and specifically ethnic-specific risk variants that could
shed light on racial/ethnic disparities in NAFLD risk.
Conclusion
In this dissertation, I have characterized and evaluated the utility of PRS for
multiple diseases with biomarkers and lifestyle factors in a racially and ethnically diverse
population. Our investigations suggest broad clinical implications for the utility of
biomarkers and PRS for risk stratification in screening practices, which could translate
into fewer invasive procedures, associated complications, and over-diagnoses for
certain diseases. In combination with environmental risk factors, PRS may be useful in
screening large populations, and providing lifestyle modification advice to genetically
predisposed individuals, which could have profound impact on prevention and screening
strategies for diseases. While PRS could stratify disease risk in non-European
populations, the predictive performance was attenuated. Our results highlight the need
to improve representation of diverse population groups in large-scale genomic studies
to achieve well-developed and validated PRS that are optimized for specific populations
for broad clinical application of the PRS.
References
1. Kovac E, Carlsson SV, Lilja H, Hugosson J, Kattan MW, Holmberg E, et al.
Association of Baseline Prostate-Specific Antigen Level With Long-term Diagnosis of
Clinically Significant Prostate Cancer Among Patients Aged 55 to 60 Years: A
Secondary Analysis of a Cohort in the Prostate, Lung, Colorectal, and Ovarian (PLCO)
Cancer Screening Trial. JAMA Netw Open 2020;3:e1919284
2. Conti DV, Darst BF, Moss LC, Saunders EJ, Sheng X, Chou A, et al. Trans-
ancestry genome-wide association meta-analysis of prostate cancer identifies new
susceptibility loci and informs genetic risk prediction. Nat Genet 2021;53:65-75
3. Prentice RL, Howard BV, Van Horn L, Neuhouser ML, Anderson GL, Tinker LF,
et al. Nutritional epidemiology and the Women's Health Initiative: a review. Am J Clin
Nutr 2021;113:1083-92
4. Liu C, Zeinomar N, Chung WK, Kiryluk K, Gharavi AG, Hripcsak G, et al.
Generalizability of Polygenic Risk Scores for Breast Cancer Among Women With
European, African, and Latinx Ancestry. JAMA Netw Open 2021;4:e2119084
5. Siegel RL, Miller KD, Jemal A. Cancer Statistics, 2017. CA Cancer J Clin
2017;67:7-30
6. DeSantis CE, Ma J, Gaudet MM, Newman LA, Miller KD, Goding Sauer A, et al.
Breast cancer statistics, 2019. CA Cancer J Clin 2019;69:438-51
7. Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, et al. Polygenic
Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. Am J Hum
Genet 2019;104:21-34
8. Du Z, Gao G, Adedokun B, Ahearn T, Lunetta KL, Zirpoli G, et al. Evaluating
Polygenic Risk Scores for Breast Cancer in Women of African Ancestry. J Natl Cancer
Inst 2021;113:1168-76
9. Wang J, Conti DV, Bogumil D, Sheng X, Noureddin M, Wilkens LR, et al.
Association of Genetic Risk Score With NAFLD in An Ethnically Diverse Cohort. Hepatol
Commun 2021;5:1689-703
10. Vujkovic M, Ramdas S, Lorenz KM, Guo X, Darlay R, Cordell HJ, et al. A
multiancestry genome-wide association study of unexplained chronic ALT elevation as a
proxy for nonalcoholic fatty liver disease with histological and radiological validation. Nat
Genet 2022;54:761-71
11. Bohte AE, van Werven JR, Bipat S, Stoker J. The diagnostic accuracy of US, CT,
MRI and 1H-MRS for the evaluation of hepatic steatosis compared with liver biopsy: a
meta-analysis. Eur Radiol 2011;21:87-97
12. Klopfenstein BJ, Kim MS, Krisky CM, Szumowski J, Rooney WD, Purnell JQ.
Comparison of 3 T MRI and CT for the measurement of visceral and subcutaneous
adipose tissue in humans. Br J Radiol 2012;85:e826-30
13. Abdelmalek MF. Nonalcoholic fatty liver disease: another leap forward. Nat Rev
Gastroenterol Hepatol 2021;18:85-6
14. Eddowes PJ, Sasso M, Allison M, Tsochatzis E, Anstee QM, Sheridan D, et al.
Accuracy of FibroScan Controlled Attenuation Parameter and Liver Stiffness
Measurement in Assessing Steatosis and Fibrosis in Patients With Nonalcoholic Fatty
Liver Disease. Gastroenterology 2019;156:1717-30
Abstract
Genome-wide association studies (GWAS) have identified thousands of common genetic variants, mostly of modest effects, that are associated with hundreds of complex human diseases and traits. Even though individually the associated variants mostly modify the disease risks only marginally, for many diseases the cumulative impact of risk across the genome is substantial. Polygenic risk scores (PRS) measuring the cumulative genetic burden have been shown to have great potential in disease risk stratification. One of the key public health goals is to identify individuals at high risk of a given disease to allow enhanced screening or preventive therapies. PRS, alone or combined with environmental risk factors or biomarkers, has the potential to offer substantial stratification of a population into distinct risk categories for common complex diseases and may in turn aid in targeted screening. Despite promising results, current PRS are predominantly developed and assessed in populations of European ancestry and have not been well studied in non-European populations, leading to lack of generalizability. Evaluating PRS in non-European ancestry populations in research is essential to broad clinical application of PRS to the general population. In this dissertation, I aim to address this question by investigating the utility of PRS with biomarkers and lifestyle factors in a racially and ethnically diverse population in the Multiethnic Cohort Study (MEC).
Asset Metadata
Creator: Chou, Alisha (author)
Core Title: Utility of polygenic risk score with biomarkers and lifestyle factors in the multiethnic cohort study
School: Keck School of Medicine
Degree: Doctor of Philosophy
Degree Program: Epidemiology
Degree Conferral Date: 2022-12
Publication Date: 12/07/2022
Defense Date: 10/17/2022
Publisher: University of Southern California (original); University of Southern California. Libraries (digital)
Tag: breast cancer, lifestyle factors, MEC, multiethnic cohort study, NAFLD, nonalcoholic fatty liver disease, OAI-PMH Harvest, polygenic risk scores, prostate cancer, prostate-specific antigen, PRS, PSA
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Haiman, Christopher (committee chair); Conti, David (committee member); de Smith, Adam (committee member); Press, Michael (committee member); Setiawan, Veronica Wendy (committee member)
Creator Email: alisha.n.chou@gmail.com, alishach@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC112617315
Unique identifier: UC112617315
Identifier: etd-ChouAlisha-11345.pdf (filename)
Legacy Identifier: etd-ChouAlisha-11345
Document Type: Dissertation
Rights: Chou, Alisha
Internet Media Type: application/pdf
Type: texts
Source: 20221207-usctheses-batch-994 (batch); University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu