Genetic studies of cancer in populations of African ancestry and Latinos
GENETIC STUDIES OF CANCER IN POPULATIONS OF AFRICAN ANCESTRY AND LATINOS

by Zhaohui Du

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (EPIDEMIOLOGY)

December 2019

Copyright 2019 Zhaohui Du

Acknowledgement

I would like to express my sincere thanks to all the people who helped and accompanied me along this journey.

First and foremost, I would like to thank my mentor Dr. Christopher Haiman, who has guided and supported me throughout the past four years. I am deeply grateful for the inspiration he provided, which has shaped my way of thinking and helped me see a bigger picture. I also thank Chris for all the opportunities he gave me to learn and practice new skills and to present my work. Last but not least, I thank Chris for creating such a lovely and supportive lab environment, in which I always enjoyed working and learning.

I would like to thank Dr. David Conti for his incredible guidance in the statistical analyses of this dissertation. I am thankful that he always explained everything so clearly and helped me understand the concepts more deeply. His patience and sense of humor put me at ease and gave me the confidence to learn more and more.

I want to thank Dr. Wendy Cozen, who supported and guided one project of this dissertation. I am thankful for her understanding and encouragement, and it was such a pleasure working with her.

I would also like to thank Dr. John Carpten, who has provided insightful and valuable feedback on this dissertation, which helped me think deeper and broader. I am also thankful for his continuing support and encouragement.

I would like to acknowledge my lab members Grace Sheng and Peggy Wan for preparing all the data I needed and providing so many helpful suggestions on managing and analyzing data. I also want to thank Niels Weinhold, Chi Song, Kristin A.
Rand and Hannah Hopp for the incredible work they did before. I am also thankful to all the other members of my lab, Victor Hom, Loreall Pooler, Susan Gundell and Antonia Maldonado, for their kind support throughout my time in the lab.

I would like to thank all my friends in our program: Yifan Liu, Huiyu Deng, Sisi Li, Lai Jiang, Chubing Zeng, Alisha Chou, Anqi Wang, Intira Sriprasert, Uggona Ihenacho, Malcolm Barrett, Abby Keener, Ashley Song, Zhi Yang, Kan Wang, Zhongjie Cai and Cauchy, for all the happiness and companionship they brought me in the past four years. Special thanks to my lifelong friends in China: Yongmei Wang, Min Wang and Xiaoyan Wang in WUSTL. Their friendship has nourished my soul and made my life so much more joyful and fulfilled.

Finally, I would like to express my heartfelt thanks to my parents, Chenping Zhao and Qingtao Du, for their unconditional and limitless love. I am so grateful that they have always stood behind me, taught me to be brave and encouraged me to pursue my dreams. I would not be here without their understanding and support.

Table of Contents

ACKNOWLEDGEMENT
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER ONE: INTRODUCTION
    1.1 GENETIC SUSCEPTIBILITY OF CANCER
    1.2 GENOME-WIDE ASSOCIATION STUDY (GWAS)
    1.3 ADMIXTURE MAPPING
    1.4 FINE-MAPPING
    1.5 POST-GWAS FUNCTIONAL ANALYSES
    1.6 POLYGENIC RISK SCORE
    1.7 PROSTATE CANCER EPIDEMIOLOGY
    1.8 MULTIPLE MYELOMA EPIDEMIOLOGY
    1.9 REFERENCES
CHAPTER TWO: GENETIC RISK OF PROSTATE CANCER IN UGANDAN MEN
    2.1 ABSTRACT
    2.2 INTRODUCTION
    2.3 MATERIALS AND METHODS
    2.4 RESULTS
    2.5 DISCUSSION
CHAPTER THREE: GENOME-WIDE ASSOCIATION STUDY OF PROSTATE CANCER IN LATINOS
    3.1 ABSTRACT
    3.2 INTRODUCTION
    3.3 MATERIALS AND METHODS
    3.4 RESULTS
    3.5 DISCUSSION
CHAPTER FOUR: A META-ANALYSIS OF GENOME-WIDE ASSOCIATION STUDIES OF MULTIPLE MYELOMA AMONG AFRICAN AMERICANS
    4.1 ABSTRACT
    4.2 INTRODUCTION
    4.3 METHODS
    4.4 RESULTS
    4.5 DISCUSSION
CHAPTER FIVE: CONCLUSIONS AND FUTURE DIRECTIONS
    5.1 GENETIC SUSCEPTIBILITY OF PROSTATE CANCER IN UGANDAN MEN
    5.2 GENETIC SUSCEPTIBILITY OF PROSTATE CANCER IN LATINOS
    5.3 GENETIC SUSCEPTIBILITY OF MULTIPLE MYELOMA IN AFRICAN AMERICANS
    5.4 CONCLUSION
    5.5 REFERENCE
SUPPLEMENTARY MATERIALS
    CHAPTER 2
    CHAPTER 3
    CHAPTER 4

List of Tables

Table 2.1 Associations between categorized polygenic risk scores (PRS) and prostate cancer risk in Latino men.
Table 2.2 Associations between categorized polygenic risk scores (PRSs) and prostate cancer risk in Latino men by European global ancestry strata.
Table 2.3 A polygenic risk score for prostate cancer in Ugandan men.
Table 3.1 Associations between categorized polygenic risk scores (PRS) and prostate cancer risk in Latino men.
Table 3.2 Associations between categorized polygenic risk scores (PRSs) and prostate cancer risk in Latino men by European global ancestry strata.
Table 4.1 Associations of variants (index variants and better AA markers) in known multiple myeloma risk regions with multiple myeloma risk in AA individuals.
Table 4.2 Associations between categorical polygenic risk scores (PRSs) and multiple myeloma risk in African ancestry population.

List of Figures

Figure 2.1 Regional association plot of the 8q24 risk region (127.8-128.3 Mb) in Ugandan men.
Figure 2.2 Density plots of the polygenic risk scores.
Figure 4.1 Risk allele frequency (RAF) comparisons of the 23 known risk alleles between AA MM controls and the European population in the phase III 1000 Genomes Project.
Figure 4.2 Regional association plot of the 2p23.3 risk region (25.4-25.9 Mb) in African Americans.

Abstract

Genome-wide association studies (GWAS) in the past decade have been successful in identifying thousands of common genetic susceptibility loci for cancers. However, non-European populations have been underrepresented in GWAS samples to date. Realizing the clinical value of genetic information in guiding personalized medicine in populations of non-European ancestry will require additional discovery and risk locus characterization efforts across populations. In this dissertation, I aim to expand the current knowledge of genetic susceptibility for multiple cancers to the underrepresented populations of African and Latino ancestry. In chapter 2, I conducted the first genetic risk characterization and GWAS of prostate cancer (PrCa) in men from Eastern Africa, among cases and controls from Uganda.
In chapter 3, I assembled all existing genetic studies of PrCa in Latino men to search for novel risk alleles as well as to determine whether known PrCa risk alleles are important in capturing PrCa risk in Latino men. I also explored whether genetic background/ancestry modified associations with single variants and a PrCa polygenic risk score in Latino men. In chapter 4, I combined all existing multiple myeloma GWAS data for men and women of African ancestry. In addition to scanning for novel risk regions and assessing the aggregated effect of known risk loci on multiple myeloma risk, I also comprehensively examined each known risk region to find markers that better capture multiple myeloma genetic risk in this high-risk population. These studies show that genetic analyses of these cancers in non-European ancestry populations are imperative for developing ethnic-specific polygenic scores that can inform improved prevention, screening and treatment of cancers in these diverse populations.

Chapter One: Introduction

1.1 Genetic susceptibility of cancer

Cancer is a major public health problem both worldwide and in the U.S. In 2018, an estimated 18.1 million men and women worldwide were expected to be diagnosed with cancer and 9.6 million to die from it [1]; in the U.S., an estimated 1.7 million new cancer cases and 0.6 million cancer deaths were expected. Multiple lines of evidence, including familial aggregation, positive associations between family history and cancer risk, varying incidence and mortality rates across racial and ethnic groups, and concordance rates in twin studies, have revealed that inherited genetic variation contributes to cancer susceptibility.

The approaches most commonly employed to identify genetic variants that predispose to cancer can be classified into two major types: linkage studies and association studies.
Linkage studies, which are family-based, track the co-segregation of genetic markers and disease in multigenerational families that contain both disease-affected and unaffected relatives. Linkage studies typically identify rare and highly penetrant monogenic disease-associated loci. Hence, they are best suited to diseases controlled by mutation of a single genetic locus that is inherited according to Mendel's laws, i.e., a Mendelian inheritance pattern. Some cancers and cancer-related syndromes demonstrate strong familial aggregation, and linkage studies of them have revealed highly penetrant mutations. For example, in 1990, a linkage study implicated the genomic region at 17q21 as harboring germline variation that contributes to familial breast cancer [2]. This discovery led to the subsequent identification of mutations in the BRCA1 gene [3].

However, cancer, as a complex disease, is also thought to arise from the contribution of multiple genetic variants, each of modest effect. Unlike linkage studies, association studies were designed to identify genetic risk variants that are statistically correlated with disease risk at the population level. Initially, due to high genotyping costs and limited understanding of germline variation in the human genome, genetic association studies focused on candidate genes, which were selected based on prior knowledge or hypotheses of disease etiology or on findings of previous linkage studies. Candidate gene studies were criticized for a lack of convincing replication and insufficient inclusion of functional genes and variants. With the rapid development of inexpensive high-throughput genotyping technologies, genome-wide association studies (GWAS), a hypothesis-free approach that scans the entire genome, have emerged as a powerful way to identify common disease-associated germline polymorphisms.
1.2 Genome-wide association study (GWAS)

A GWAS is an association study that scans the entire genome for common DNA polymorphisms whose frequencies differ between patients and comparable controls [4]. GWAS can also be applied to quantitative traits such as height and blood pressure, but these are not the focus of this dissertation. The main unit of a GWAS is the single nucleotide polymorphism (SNP), a single-base-pair variation in DNA sequence that is common (e.g., frequency ≥1%) within a population. Other genetic variations, such as insertions or deletions of bases in the genome (INDELs) and structural variants, may also be included. Among individuals in the Phase III 1000 Genomes Project (1KGP, described later), the human genome contains >14 million SNPs (allele frequency ≥1%). In a GWAS, however, it is not necessary to directly genotype every genomic variant, because of the existence of genetic linkage disequilibrium (LD). LD refers to the non-random coinheritance/association of alleles at adjacent loci; the degree and pattern of LD are mainly determined by recombination events and are also influenced by natural selection, mutation, population admixture, etc. [5]. In fact, the majority of common variation across the genome in European ancestry populations can be adequately captured (at r² ≥ 0.8) by genotyping 260K to 500K SNPs [6].

Given the variability in allele frequency between racial and ethnic populations, an important potential confounder that needs to be carefully assessed and controlled in GWAS is population stratification (or population structure). Briefly, population stratification refers to systematic differences in allele frequencies across subgroups of the population [7,8]. In the presence of population stratification, a detected marker might reflect a spurious association that merely represents differences in population substructure between cases and controls.
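To make the stratification problem concrete, the following minimal sketch uses expected allele counts for two subpopulations in which the allele has no effect on disease. All sample sizes and allele frequencies are hypothetical values chosen for illustration, not data from this dissertation.

```python
def odds_ratio(case_a, case_b, ctrl_a, ctrl_b):
    """Allelic odds ratio from a 2x2 table of allele counts."""
    return (case_a * ctrl_b) / (case_b * ctrl_a)


def expected_table(n_cases, n_controls, allele_freq):
    """Expected allele counts when the allele is unrelated to disease.

    Each person carries two alleles, and the allele frequency is the
    same in cases and controls within the subpopulation.
    """
    case_a = 2 * n_cases * allele_freq
    case_b = 2 * n_cases * (1 - allele_freq)
    ctrl_a = 2 * n_controls * allele_freq
    ctrl_b = 2 * n_controls * (1 - allele_freq)
    return case_a, case_b, ctrl_a, ctrl_b


# Hypothetical subpopulations (assumed values): subpopulation 1 has a
# higher allele frequency AND contributes more cases than subpopulation 2.
pop_1 = expected_table(n_cases=800, n_controls=200, allele_freq=0.40)
pop_2 = expected_table(n_cases=200, n_controls=800, allele_freq=0.10)

# Pooling ignores subpopulation membership, as an unadjusted analysis would.
pooled = tuple(x + y for x, y in zip(pop_1, pop_2))

or_pop_1 = odds_ratio(*pop_1)    # no association within subpopulation 1
or_pop_2 = odds_ratio(*pop_2)    # no association within subpopulation 2
or_pooled = odds_ratio(*pooled)  # spurious association after pooling

print(or_pop_1, or_pop_2, round(or_pooled, 2))
```

Within each subpopulation the odds ratio is 1, yet the pooled odds ratio is about 2.7, purely because allele frequency and case proportion both differ between the subpopulations.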
Several strategies exist to address this potential confounder, including family-based case-control study designs, genomic control and principal component (PC) analysis, which is a common approach to adjusting for population structure in the statistical analysis [9].

1.2.1 GWAS design and consortia

In the early days of GWAS, a multi-stage design was widely implemented to both maintain statistical power and reduce costs. In this design, an initial (first-stage) GWAS scan was carried out to discover suggestive disease-related variants at a pre-defined significance level; these variants were then genotyped and tested in a second (and often larger) independent population to rule out false positives. As genome-wide genotyping costs decreased, investigators were able to scan the whole genome in the entire study population. The initial GWAS of cancer and other common phenotypes discovered low-risk variants with more sizable effects (i.e., the low-hanging fruit); however, it was soon realized that much larger studies would be required to detect variants with smaller effects. To achieve sufficient statistical power to detect such variants, international collaborative consortia were established. For example, the Collaborative Oncological Gene-Environment Study (COGS) consortium combined resources from cancer-specific consortia to design a custom high-density genotyping array (iCOGS) that was genotyped in more than 250,000 men and women of European ancestry [10]. Another example is the National Cancer Institute (NCI)-funded Genetic Associations and Mechanisms in Oncology (GAME-ON) consortium. GAME-ON was launched in 2010, with a long-term goal of understanding the genetic architecture and mechanisms underlying breast, ovarian and prostate cancer, and of providing a rigorous knowledge base that would enable the clinical translation of cancer GWAS findings [11,12].
The network, in collaboration with Illumina, developed a custom SNP genotyping array, the "OncoArray", which included ~533K markers, of which ~260K formed a GWAS backbone. In total, the GAME-ON consortium assembled and genotyped 447,705 multi-ethnic samples worldwide [12] and has substantially increased the number of susceptibility loci discovered for most common cancers.

1.2.2 Imputation reference panels

As stated above, in GWAS the large majority of variants in the genome are not directly assayed by commercial chips. Rather, by exploiting LD structure, the non-genotyped variants are inferred (imputed) by comparison to a dense reference panel [13]. In this way, the testable variants in a GWAS are expanded from hundreds of thousands to tens of millions, which substantially boosts the ability to discover disease-associated variants [14]. Moreover, imputation to the same reference panel facilitates the meta-analysis of GWAS results from different studies originally genotyped on different chips, which is of particular importance in the era of extensive collaboration across multiple consortia.

One of the key factors that affect imputation quality and suitability is the reference panel. In past years, investigators worldwide have made great collaborative efforts to provide more comprehensive reference resources. Chronologically, three major consortia, the International HapMap Consortium, the 1KGP and the Haplotype Reference Consortium (HRC), have led the way in developing the most comprehensive, ancestrally diverse, publicly available reference panels.

The International HapMap Consortium, launched in October 2002, produced the earliest reference panel.
Three phases of data have been released: phase I, released in 2005, genotyped 1.3 million SNPs in 269 individuals from 4 geographically diverse populations [6]; phase II, released in 2007, genotyped another 2.1 million SNPs in the same individuals [15]; and phase III, released in 2009, genotyped ~1.6 million common SNPs among 1,184 individuals from 11 populations and sequenced ten 100-kilobase regions in 692 individuals, with copy number polymorphisms (CNPs) included as well [16].

The 1KGP, launched in 2008 and completed in 2015, provides by far the most comprehensive catalogue of human genetic variation [17,18]. The phase I release included low-coverage whole-genome sequence and exon-targeted sequence data for 1,092 individuals from 14 populations. In total, around 39 million polymorphic sites, including SNPs, INDELs and structural variants, were described [19]. In the phase III release, the sample size was expanded to 2,504 individuals from 26 worldwide populations. By combining low-coverage whole-genome sequencing, targeted exome sequencing and high-density genotyping, approximately 88 million genetic variants and 5,008 haplotypes were identified [18].

The HRC, which is currently the gold standard, includes sequence datasets from 20 different studies, which in total consist of 32,488 individuals of predominantly European ancestry, including the phase III 1KGP populations. This project identified ~39 million SNPs (minor allele count ≥5) and ~65K human haplotypes. Compared to phase III 1KGP, HRC-based imputation has been shown to perform better, especially for SNPs with minor allele frequency (MAF) less than 5% [20]. A second release is planned, which will include more populations of diverse ethnicities and INDELs in addition to SNPs.

1.2.3 GWAS findings and clinical applications

According to the NHGRI GWAS Catalog, approximately 3,600 GWAS have been published [21,22].
Among them, around 300 articles were cancer-related, most of which were conducted among European ancestry populations (~64%), followed by Asian (~20%) and African (~8%) populations [23]. Together, approximately 700 chromosome regions have been identified as harboring at least one variant associated with cancer risk, with ~2,800 SNP-cancer associations (P<5×10⁻⁸) reported.

GWAS have provided insight into the genetic architecture (i.e., the number of risk alleles, their effect sizes, etc.) underlying cancer susceptibility. From GWAS findings, we have learned that: 1) the vast majority of common risk variants confer only a modest effect on cancer predisposition; 2) for almost all cancers, multiple genetic polymorphisms contribute to risk (i.e., risk is polygenic). For example, to date, ~170 independent common variants have been found to affect PrCa risk; 3) some regions harbor multiple independent genetic signals. For example, the 8q24 and TERT regions contain multiple independent variants associated with PrCa risk; 4) some regions harbor variants associated with multiple cancers (i.e., pleiotropy). For example, variant rs1057941 at 1q22 is associated with breast cancer and lung cancer risk [24], and variant rs11715604 at 3q22.2 is associated with chronic lymphocytic leukaemia (CLL) and Hodgkin lymphoma (HL) risk [25]; and 5) the majority of GWAS-identified common risk variants are neither located in protein-coding regions nor in strong LD with coding variants. Instead, most are found in non-coding regulatory regions.

Although the biological mechanisms underlying the vast majority of GWAS findings remain unclear and the clinical applications have yet to be defined, GWAS results have provided valuable insight into potential disease-related genes and pathogenic pathways that had never been hypothesized before, shedding light on novel targets for drug development and perhaps new strategies for personalized treatment [26,27].
For example, a GWAS identified a susceptibility locus at 6p21.33 for hepatitis C virus-induced hepatocellular carcinoma (HCV-induced HCC) and suggested that the MHC class I polypeptide-related sequence A gene (MICA), located 4.7 kb downstream of the risk locus, is the disease-associated candidate gene [28]. Based on this discovery, a MICA sheddase, a disintegrin and metalloproteinase 10 (ADAM10), was suggested as a therapeutic target [29]. A subsequent in vitro study identified the anti-alcoholism agent disulfiram as having inhibitory effects on ADAM10, which may lead to the further development of new anti-HCC regimens [30].

1.2.4 Challenges and limitations of GWAS

Despite the remarkable GWAS discoveries over the past 10 years, challenges and limitations remain. First, the majority of the risk variants discovered in GWAS are likely to be surrogate signals that are in LD with the causal markers rather than the functional variants themselves; fine-mapping and functional studies are needed to characterize the underlying biologically functional variants and the genes and pathways they affect.

Second, non-European populations have been under-represented in GWAS to date, with only 4% and 1% of samples in the discovery phase of GWAS being of African and Latino descent, respectively [31]. Because populations of diverse ancestry have varying allele frequencies and LD structure, GWAS and fine-mapping studies in larger, multi-ethnic populations are warranted both to uncover ethnic-specific cancer risk loci and to characterize the established risk loci. In the upcoming precision medicine era, realizing the public health and clinical value of genetic information in guiding personalized prevention, screening and treatment in populations of non-European descent will require additional discovery and risk locus characterization efforts in minority populations [32].
While GWAS associations have proven to be highly replicable in populations of European and East Asian ancestry [33,34], this is not the case for individuals of African ancestry [35]. Several possible reasons might lead to failure of replication. First, the risk SNPs identified in the discovery population might be false positives (type I error); however, this is unlikely, as genome-wide significance was used in the discovery study. Second, the effect sizes of GWAS signals are generally modest, and the "winner's curse" might bias the initially reported effect sizes away from the null [36], so replication studies may be underpowered to detect these associations. Third, risk allele frequencies and LD patterns between tested markers and the underlying causal variants vary across ethnic/racial groups. Fourth, the biologically functional variants of cancer might have heterogeneous effects across ethnic groups due to unobserved gene-gene and gene-environment interactions, which could modify the magnitude of variant-cancer associations [37,38]. For example, compared to non-to-light drinkers and non-smokers, the effect size of rs671 (within the gene ALDH2) on esophageal squamous cell carcinoma (ESCC) risk was found to be significantly greater among heavy drinkers and smokers [39].

Despite the growing number of cancer-related loci identified, only a small fraction of heritability can currently be explained by known cancer risk variants, a phenomenon known as "missing heritability". For example, to date, the ~170 prostate cancer (PrCa) susceptibility loci are estimated to capture only 28.4% of the PrCa familial relative risk [40]. Possible explanations for the "missing heritability" include: 1) the heritability calculated from family studies might be overestimated; 2) incomplete LD between index SNPs and the underlying causal variants [41]; 3) multiple independent causal variants might exist within known risk regions.
Because traditionally only the SNP with the lowest P-value is selected in each risk region, the heritability captured by other causal variants is neglected; 4) apart from common risk variants, rare variants with larger effects on risk might also contribute to cancer heritability [42]. In addition, other under-explored types of genetic variation, such as copy-number polymorphic (CNP) duplications and epigenetic variation, might also play a role in explaining cancer heritability.

1.3 Admixture mapping

Admixture mapping is an alternative genome-wide association approach conducted among admixed populations. The rationale of admixture mapping is that, if disease incidence varies substantially across two (or more) parental populations, the disease-causing alleles are expected to be enriched on chromosome segments inherited from the ancestral population with higher disease incidence [43]. The analytic unit in admixture mapping is local ancestry, which is defined as the genetic ancestry at a defined chromosomal segment. Two strategies are commonly used in conducting admixture mapping analysis: 1) a case-only study, which compares local ancestry to genome-wide average ancestry within each individual, and 2) a case-control study, which tests differences in local ancestry between cases and controls [44].

In contrast to GWAS, which takes advantage of LD caused by the physical proximity of adjacent genetic loci, admixture mapping relies on long-range LD created during genetic admixture between two populations. Genetic admixture occurs when two (or more) populations begin mixing [45]. Because of distinct histories of natural selection, genetic drift, etc., many allele frequencies differ between the two parental populations (ancestries).
In the admixture process, recombination between a pair of parental chromosomes occurs during meiosis, and after several generations the genome of each descendant consists of mosaic-like blocks inherited from the two parental populations [43]. Consequently, genetic variants within these broad blocks will be in LD that is created purely by the admixture process and that extends over longer distances than the background LD created by physical proximity [46]. As a result, the number of tests that need to be corrected for in an admixture mapping study is much smaller than in a standard GWAS. Thus, when local ancestry values can be precisely estimated, admixture mapping offers greater power, yet lower resolution, to detect genetic susceptibility alleles that are highly differentiated in frequency between populations. Because the extended LD created in the admixture process is progressively shortened as recombination continues over more generations, it is recently admixed populations (e.g., African Americans, Latinos) that benefit from this increased power.

The use of admixture-induced LD in genetic studies was first proposed 60 years ago [47]; however, it was not until 2005, with the development of analytical tools, that admixture mapping became possible. In earlier studies, a sparse ancestry-informative marker (AIM) panel, usually consisting of several thousand SNPs, was used to estimate local ancestry. Nowadays, modern computational programs, such as RFMix, facilitate inferring local ancestry from denser SNP arrays or even sequencing data [48]. This refined ancestry estimation substantially increases the power and resolution of admixture mapping.

Admixture mapping studies have identified numerous risk regions that harbor disease risk variants. Most regions were later confirmed by GWAS, which supports the efficiency of this approach [49].
A notable region in PrCa is the 8q24 risk region, which was found by admixture mapping among African Americans in 2006 49 and confirmed in all subsequent GWAS.

1.4 Fine-mapping

As stated in section 1.3, fine-mapping studies are essential to distinguish causal SNPs from GWAS signals, so that putative functional variants can be identified for downstream experimental validation 50. Generally, fine-mapping studies start with dense genotyping followed by high-quality imputation or targeted sequencing of each risk region, followed by conditional analyses to determine the number of independent signals in a region. A number of fine-mapping approaches have been proposed. The most straightforward yet arbitrary one is the heuristic approach. In this approach, local LD structure around the original GWAS signal is explored, and among SNPs correlated with the index SNP above a specified threshold (r² > 0.2), those with the smallest P-value or with P-values passing a pre-defined significance criterion are selected as most likely to be ‘causal’. Another type of approach is the Bayesian method, which identifies a “credible set” of candidate causal SNPs based on each SNP's posterior probability of being causal, derived from Bayes factors 51. Bayesian methods are able to incorporate external information such as functional annotation. In addition, statistical methods that integrate existing summary statistics, external LD information from comparable reference panels, and functional information are also gaining attention 52. They have been shown to perform well in identifying causal SNPs without individual-level genotype data 53. Because the set of SNPs in close correlation with the underlying causal SNP differs across populations, the resolution of fine-mapping could be further improved by leveraging LD differences across ethnic/racial populations 54. Several empirical studies have revealed that inclusion of African ancestry samples increases fine-mapping resolution markedly 55.
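The credible-set idea used by the Bayesian methods above can be sketched as follows. The sketch assumes exactly one causal variant in the region and uses a Wakefield-style approximate Bayes factor computed from each SNP's effect estimate and standard error. The prior variance of 0.04, the SNP names and the summary statistics are illustrative assumptions, not values from the studies cited here.

```python
import math


def log_approx_bayes_factor(beta, se, prior_var=0.04):
    """Log of a Wakefield-style approximate Bayes factor for association.

    beta, se:  per-allele log odds ratio and its standard error.
    prior_var: assumed prior variance of the true effect size.
    """
    v = se ** 2
    shrink = prior_var / (v + prior_var)
    z2 = (beta / se) ** 2
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z2 * shrink


def credible_set(snps, coverage=0.95, prior_var=0.04):
    """Return (credible_snps, posteriors) assuming a single causal variant.

    snps: dict mapping SNP name -> (beta, se).
    """
    log_bf = {s: log_approx_bayes_factor(b, se, prior_var)
              for s, (b, se) in snps.items()}
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(log_bf.values())
    rel = {s: math.exp(v - top) for s, v in log_bf.items()}
    total = sum(rel.values())
    posteriors = {s: r / total for s, r in rel.items()}
    # Add SNPs in decreasing posterior order until the coverage is reached.
    chosen, cumulative = [], 0.0
    for s in sorted(posteriors, key=posteriors.get, reverse=True):
        chosen.append(s)
        cumulative += posteriors[s]
        if cumulative >= coverage:
            break
    return chosen, posteriors


# Hypothetical summary statistics for three SNPs in one risk region.
region = {"rsA": (0.30, 0.03), "rsB": (0.10, 0.03), "rsC": (0.02, 0.03)}
snps_95, post = credible_set(region)
```

When several SNPs are in strong LD and carry nearly identical statistics, their posterior probabilities are similar and the credible set grows, which is why LD differences across populations can shrink it.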
A simulation study demonstrated that, compared to using a homogeneous population, populations of multiple ancestries increased the ability to distinguish causal variants 56. Thus, multi-ethnic fine-mapping studies, particularly with the inclusion of African descendants, are beneficial for prioritizing the set of putative causal SNPs.

1.5 Post-GWAS functional analyses

Although GWAS have identified thousands of trait-associated risk loci (most of which are located in intergenic or intronic regions), the biological basis underlying these associations remains unclear. Incorporating information on biologically functional features in fine-mapping studies could aid in both prioritizing the candidate causal variants and providing hints on disease etiology (target genes, pathways, etc.). The most widely used functional information includes functional annotation and gene expression.

1.5.1 Functional annotation

Protein coding region

Genetic changes in protein coding regions (exons) generally lead to two consequences: non-synonymous and synonymous changes. Non-synonymous changes, such as missense and nonsense mutations, alter the protein’s amino acid sequence, which may result in aberrant protein function. Synonymous changes, on the contrary, do not alter the protein structure; however, they can still impact protein function and abundance by affecting mRNA alternative splicing, translation rate, protein folding, etc. 57. In addition, a mutation occurring at exonic splicing enhancers (ESEs), located at the intron/exon boundary, may also impact RNA alternative splicing.

Non-protein coding region

Previous studies have shown that only a small proportion of GWAS signals are located in coding regions (~16%) 58, with the majority located in introns or intergenic regions 59. This implies that non-coding SNPs, which are likely to play a role in gene regulation, have an important role in common human traits.
The major non-coding regions involved in gene regulation are cis-regulatory elements (CREs) and untranslated regions (UTRs). The activation of these regions is associated with open chromatin and distinct patterns of chromatin signatures (e.g. histone modifications, TF binding), which can be identified by modern techniques such as DNase I hypersensitive site mapping followed by sequencing (DNase-seq) 60 and Formaldehyde-Assisted Isolation of Regulatory Elements followed by sequencing (FAIRE-seq) 61. CREs contain binding sites for regulatory proteins and RNAs and regulate the spatiotemporal (i.e. cell type-, tissue- and time-specific) expression of genes. The best-characterized CREs are promoters and enhancers. Most promoters are located around the transcription start site (TSS) of their regulated gene. Promoters contain binding sites for RNA polymerase II (Pol II), transcription factors (TFs) and other factors assembling the transcription machinery, and facilitate transcription initiation 62. Enhancers, on the other hand, may be distant from the genes they influence, located upstream, downstream, or in the intron or exon of a gene, or even on a different chromosome 63. Enhancers can facilitate the interaction between promoters and transcription factors and thus enhance gene transcription. Genetic variation in regulatory regions could impact gene expression, in a disease- and tissue-specific manner, through various mechanisms such as altering DNA recognition motifs to modulate TF binding, changing the chromatin loop formation that bridges enhancers and promoters, or altering miRNA recruitment 64. Previous studies have shown that GWAS signals are enriched in CREs 65. For example, a functional annotation study of 77 PrCa risk alleles revealed that most of the known risk variants overlapped with likely enhancer regions 66.

1.5.2 Gene expression - eQTLs

Genomic variants that influence gene expression abundance are defined as expression quantitative trait loci (eQTLs).
Since gene expression is an intermediate step between genotype and phenotype, it is plausible to hypothesize that, by intersecting a set of disease-associated variants with eQTLs, one can distinguish functional SNPs from their LD surrogates, as well as connect disease susceptibility loci with their putative target genes. This assumption has been supported by the observation that complex trait-associated SNPs were significantly more likely to be eQTLs than randomly selected MAF-matched SNPs 67. Nowadays, eQTL information is widely integrated in post-GWAS analyses to aid in prioritizing the candidate causal SNPs and interpreting the possible functional mechanisms underlying GWAS findings 68. For example, in a fine-mapping study of 84 established PrCa risk regions, about half (40) of the regions contained at least one eQTL and >25% of the variants within the credible set were also eQTLs in prostate cancer tissue 68. These overlaps/colocalizations should be prioritized in follow-up functional confirmation.

1.5.3 Publicly available databases

The publicly available genomic annotation and gene expression databases provide important resources for investigators to unravel the biological basis of GWAS discoveries. The Encyclopedia of DNA Elements (ENCODE) project is an international collaborative project that aims to identify all functional elements in the human genome 65; additional genomes of mouse, fly, and worm have also been included 69. ENCODE has systematically mapped human genetic functional elements, including RNA transcribed regions, protein-coding regions, transcription-factor-binding regions, chromatin structure and DNA methylation sites 65. The NIH Roadmap Epigenomics Mapping Consortium also provides publicly available epigenomic reference maps including histone modifications, DNA methylation, DNA accessibility, and RNA expression 70.
The Functional Annotation of the Mammalian Genome (FANTOM) project aims to identify all functional elements in mammalian genomes. The latest iteration, FANTOM5, includes a comprehensive map of transcriptional regulatory elements along with a network across all major human organs, primary cell types, cancer cell lines and mouse developmental time courses 71. The Genotype-Tissue Expression Project (GTEx), launched in 2010, is another important tool that provides a comprehensive resource to study the correlation between genetic variation and gene expression and other molecular phenotypes 72.

1.6 Polygenic risk score

The vast majority of single GWAS variants confer only modest effects on human traits and explain only a small proportion of heritability. However, most complex traits are affected by multiple loci, which in aggregate may have a larger effect and explain a larger proportion of phenotypic variance 33. The polygenic risk score (PRS), which combines multiple genetic variants for a disease or trait, can be used to estimate the aggregate effect of multiple variants for identifying high-risk populations and/or predicting individual disease risk. A PRS is generally constructed as a weighted sum of allele counts for each individual, with the weights being either the per-allele log relative risks estimated in a study or all set to 1 (denoted as “unweighted PRS”). To date, various approaches to selecting SNP sets have been utilized. The most commonly used SNP set is one based on GWAS-identified risk SNPs for a disease. For example, using a PRS constructed from 147 established PrCa risk alleles, it was observed that men in the top 1% of the PRS distribution had a 5.7-fold (95% CI: 5.04-6.48) PrCa risk compared to men in the median (25%-75%) PRS stratum 40.
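The weighted-sum construction described above can be written out as a short sketch. The function name, SNP identifiers and relative risks below are hypothetical and chosen only for illustration; they are not taken from the studies cited in this section.

```python
import math


def polygenic_risk_score(allele_counts, weights=None):
    """Compute a PRS for one individual (illustrative sketch).

    allele_counts: dict mapping SNP name -> number of risk alleles carried
                   (0, 1 or 2; imputed dosages also work).
    weights:       dict mapping SNP name -> per-allele log relative risk.
                   If None, every weight is 1, giving the unweighted PRS
                   (a simple count of risk alleles).
    """
    if weights is None:
        return float(sum(allele_counts.values()))
    missing = set(allele_counts) - set(weights)
    if missing:
        raise KeyError(f"no weight supplied for: {sorted(missing)}")
    return sum(weights[snp] * count for snp, count in allele_counts.items())


# Hypothetical per-allele relative risks for three SNPs.
log_rr = {"snp1": math.log(1.10), "snp2": math.log(1.25), "snp3": math.log(1.05)}
genotype = {"snp1": 2, "snp2": 1, "snp3": 0}

weighted = polygenic_risk_score(genotype, log_rr)
unweighted = polygenic_risk_score(genotype)
```

Because the weights are on the log scale, exponentiating the weighted score gives the combined relative risk under a multiplicative model with no interactions, relative to a person carrying no risk alleles (here 1.10 × 1.10 × 1.25).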
In contrast, others have argued that, since causal SNPs with smaller effects and/or lower allele frequencies are not likely to be captured by GWAS without a large enough sample, less stringent inclusion criteria should be applied. For example, using >37,000 nominally significant and LD-pruned common SNPs, a PRS was shown to be highly correlated with schizophrenia risk (P = 9×10⁻¹⁹) 73. Penalized methods with coefficient shrinkage, such as LASSO and ridge regression, can assess SNPs across the genome simultaneously without setting arbitrary thresholds. Bayesian models, which are able to incorporate external information (e.g. functional annotation, LD structure), have also been shown to perform well 74. In addition, methods using GWAS summary statistics, rather than individual-level data, are also attractive strategies, particularly in the era of consortium meta-analyses, when primary data may not be publicly available 74,75. Apart from selecting SNP sets, assigning allelic weights is another factor that needs to be carefully considered. It has been suggested that the optimal weights are precise effect sizes estimated from a large sample of the same ancestry 76. In the precision medicine era, the PRS is a potential tool to facilitate personalized prevention, screening and treatment. Previous studies have demonstrated the utility of the PRS in guiding intervention or targeted risk factor prevention to improve primary cancer prevention. For example, a PRS constructed using 12 risk variants was found to significantly interact (additively) with smoking in bladder cancer; the 30-year absolute risk associated with smoking status was 0.9%-2.9% for subjects in the lowest PRS quartile and 1.7%-9.9% for subjects in the highest PRS quartile 77. Thus, targeted smoking cessation among subjects in the high PRS risk stratum would prevent more individuals from developing bladder cancer. Previous studies have also demonstrated that the PRS could aid in targeted screening.
For example, one study found that, compared to traditional age-based screening, PRS-based personalized screening (using 37 known colorectal cancer risk variants) would result in 16% and 17% fewer men and women, respectively, being eligible for screening, at the cost of 10% and 8% fewer screening-detected cases. It was estimated that, if all susceptibility variants were known, it would result in 26% fewer men and women being eligible for screening at the cost of 7% and 5% fewer screening-detected cases 78. Another group has proposed a risk-based screening approach, which would integrate clinical risk factors, breast density, a PRS constructed using GWAS-identified risk alleles, and high-risk mutations in nine moderate- or high-penetrance genes, to guide individualized BrCa screening 79. Studies have also assessed the capability of the PRS to identify high-risk populations compared to currently accepted disease predictors. For example, one study, with large independent training and validation cohorts and sophisticated PRS-generating algorithms, showed that ~1.5% of the population at ≥3-fold breast cancer risk could be identified by the PRS, with a testing AUC of 0.69 (0.68-0.69) 80. This study revealed that the PRS acted more effectively than rare mutations in BRCA1 or BRCA2, which are already used in current clinical settings to identify high-risk populations. Before a biomarker is incorporated into clinical use, other, more clinically relevant measures of predictive performance in addition to the AUC, such as net reclassification indices (NRI), should be examined 81. One large prospective cohort study evaluated whether the PRS improved coronary heart disease (CHD) risk prediction over and above traditional risk factors and family history.
They demonstrated that the PRS improved both risk discrimination (C-index change = 0.3%-0.5%) and risk reclassification (NRI = 5%; clinical NRI = 27%); targeted PRS screening of the clinically relevant risk group would reclassify ~12% of individuals in the intermediate- to high-risk category, and statin allocation for these reclassified subjects could prevent ~2.5 times more CHD events over 14 years 82.

Despite these encouraging discoveries, several challenges remain to be addressed before putting the PRS into public health and clinical use. First, as the PRS has predominantly been developed and assessed in populations of European origin, it is critical to assess its generalizability across different ethnic populations. A simulation study suggested that PRS derived from European GWAS were biased when applied to other populations and that prediction accuracy diminished with increasing ancestral divergence from the discovery population 83. Second, as the etiology of cancer involves a complicated interplay between non-genetic and genetic factors, including common variants and highly penetrant rare variants, potential interactions should be carefully assessed when building prediction models. Third, as incorporating genetic tests will inevitably increase expense, cost-effectiveness needs to be assessed before applying this tool in the general population. Recently, a study using a hypothetical large cohort showed that, compared to age-based breast cancer screening, targeted screening among the population at the 70th PRS risk stratum would save $720,900, have 71.4% fewer over-diagnoses, and prevent 9.6% fewer breast cancer deaths 84. Other practical issues, such as the communication of information, should also be assessed.

1.7 Prostate cancer epidemiology

1.7.1 Overview

Prostate cancer (PrCa) is the most common non-skin cancer and the second leading cause of cancer death among men in the US.
In 2018, it was estimated that 164,690 new PrCa cases would be diagnosed and 29,430 men would die from PrCa 85. Globally, PrCa is the second most frequent cancer and the fifth leading cause of cancer death in men, with 1.3 million new cases and 359,000 associated deaths estimated in 2018 1. Ethnic disparities exist in both PrCa incidence and mortality rates in the US. During 2010-2014, the age-standardized incidence rate was highest among non-Hispanic Blacks (NHB, 178.3 per 100,000 individuals), followed by non-Hispanic Whites (NHW, 107.0), American Indians/Alaska Natives (AI/AN, 78.3), and Asians/Pacific Islanders (ASN/PI, 58.4); similarly, the age-standardized mortality rate was also highest among NHB (40.8), followed by AI/AN (19.7) and NHW (18.2), and lowest among ASN/PI (8.7). In addition, African Americans (AA) are more likely to be diagnosed with aggressive PrCa and at a younger age, and are more likely to have a family history 86,87. Globally, the highest PrCa incidence and mortality rates have been reported in Guadeloupe and Barbados, respectively 1. Although incidence rates are difficult to estimate in African countries, they have been shown or are expected to be high in African populations, such as in Uganda, Nigeria and Zimbabwe 1,88,89. In African countries, low screening rates, limited cancer registration, and/or the shorter life span caused by competing morbidity (e.g. the high prevalence of infectious diseases) have made it difficult to estimate and compare incidence rates across countries. A study conducted in Ghana showed that, among >1000 randomly selected unscreened men, 65 (~6%) had confirmed PrCa, of which 60% were clinically high-grade 90, suggesting that PrCa is more common in Africa than implied by cancer registry data. Moreover, PrCa is one of the most common non-HIV-related cancer types in many countries in Africa, especially in Uganda 88.
Considering that men in Africa and AA men, who share a common ancestry, live under highly distinct environmental conditions, lifestyles and medical care resources, yet both groups have a heavy PrCa burden, it is plausible to hypothesize that genetic factors play an important role in PrCa etiology.

1.7.2 Non-genetic risk factors

Previous immigration studies have pointed to a role for non-inheritable factors, such as lifestyle/socio-cultural factors and environmental exposures, in PrCa susceptibility. For example, PrCa incidence and mortality rates in non-US-born Asian-Americans (e.g. 24.0 per 100,000 in Chinese-Americans born in China) were approximately half those of US-born Asian-Americans (44.4 in US-born Chinese-Americans) 91,92. However, the only well-established non-genetic PrCa risk factors to date are age, ethnicity/race, and family history; generally, the evidence supporting environmental and lifestyle risk factors has been inconclusive.

1.7.2.1 Family history

Although most PrCa occurs sporadically (i.e. without a family history), studies provide strong and consistent support for the association between family history and PrCa risk. For example, two large meta-analyses have reported that having a first-degree relative with PrCa was associated with a ~2.5-fold increase in risk 93,94. Both studies stated that, compared to father-son pairs, the association was stronger in brother pairs. The effect size was also greater among men with a younger affected first-degree relative, and among men with more than one affected first-degree relative. The effect of family history on PrCa risk has also been shown to be consistent across ethnic groups 95.

1.7.2.2 Diet, nutrition, physical activity and weight

As stated in 2014 by the Continuous Update Project (CUP) of World Cancer Research Fund International, there is strong evidence that adult attained height is positively associated with PrCa risk and that being overweight or obese increases advanced PrCa risk.
Of note, as adult height is influenced by a complex interplay of multiple developmental factors from the womb to adulthood, including genetic, environmental, hormonal and nutritional factors, height itself is unlikely to directly influence PrCa risk 96. One study reported a stronger association between obesity and PrCa risk among AA compared to NHW 97, suggesting that the high prevalence of obesity among AA might explain a part of the higher PrCa risk in this population 98. However, another study did not find clear evidence of racial differences in the BMI/weight-PrCa association 99. With respect to dietary factors, there is some evidence that higher consumption of dairy products, diets high in calcium, low plasma alpha-tocopherol concentration and low plasma selenium might increase PrCa risk; however, these observations have not been reported in all studies. There is no convincing evidence of an association between PrCa risk and other diet, nutrition or physical activity factors 100,101.

1.7.2.3 Type 2 Diabetes

Some studies have suggested that diabetes is inversely related to PrCa risk 102,103, while others have not 104,105. In the Multiethnic Cohort, a negative association was reported that was modified by ethnicity, although only at a suggestive significance level 106,107. Some studies have also shown that the diabetes-PrCa association differs by length of time since diabetes diagnosis, with PrCa incidence being slightly increased during the first few years and then reduced gradually 108,109. Such a change might be attributed to metabolic and hormonal changes in diabetic patients and to metformin intake. The effect of metformin use on PrCa risk is also controversial 110. One large meta-analysis of >1 million samples found no such association in either Asian or European populations 111. Another study indicated that metformin has an impact on PrCa survival rather than PrCa incidence 112.
1.7.2.4 Hormones and other molecular markers

Testosterone has been believed to contribute to PrCa risk, and this was supported by some studies; however, to date, findings have been inconsistent 113. One systematic review indicated that, among 45 published articles, 17 studies found that total testosterone level was positively associated with PrCa risk, while 18 suggested an opposite direction of effect, and another 10 showed no association 113. Serum insulin level was found to be positively associated with PrCa in a study among Chinese men 114, while other studies do not support the association 115,116. A meta-analysis combining results of 21 case-control studies indicated that the plasma circulating concentration of insulin-like growth factor-I (IGF-I), a mitogen for prostate epithelial cells, had a modest positive association with PrCa risk 117. Some studies, but not all, have found that the abundance of IGF-I’s binding protein IGFBP-3 is inversely associated with PrCa risk 117,118. A small multi-ethnic study also showed that AA have a lower IGFBP-3 level compared to Asians and Whites 119.

1.7.3 Genetic susceptibility

1.7.3.1 Heritability and pre-GWAS genetic findings

Twin studies revealed that genetic factors account for a large proportion of PrCa susceptibility, ranging from 42% to 57%, which is the highest among all cancer types 120,121. Familial segregation studies suggested more than one mode of inheritance for PrCa, including autosomal dominant disease transmission 122,123, and recessive or X-linked modes 124,125,126. Linkage studies identified several potential risk regions for hereditary PrCa, such as 1q24-25 (HPC1) 127,128, 1q42.2-43 (PCaP) 129, 1p36 (CAPB) 130, 16q23.2 (CTRB1) 131, 17p (HPC2/ELAC2) 132, 20q13 (HPC20) 133, and Xq27-28 (HPCX) 134. Downstream positional cloning studies further suggested several candidate genes, such as RNASEL, that could segregate in PrCa families 135.
However, findings from familial and linkage studies generally lacked reproducibility. Nevertheless, these discoveries showed that genetic predisposition to PrCa is unlikely to be explained by any single chromosomal locus; on the contrary, they suggested that PrCa is likely to be complex and polygenic in nature. Candidate gene studies were initially used to search for common PrCa susceptibility loci. They focused on several key pathways, including genes in the metabolic pathway of testosterone and other androgens such as dihydrotestosterone (DHT), for example 5α-reductase type II (SRD5A2) and the androgen receptor (AR) 136,137, and genes involved in the cell cycle or DNA repair pathways, such as CHEK2 138. Findings for several genes, such as AR, BRCA2 139, HOXB13 140 and the prostate-specific antigen-encoding gene KLK3 141, were successfully replicated; however, in general, candidate gene findings failed to be robustly reproduced by other groups. Moreover, the limited prior knowledge of genes biologically relevant to PrCa led researchers to adopt the hypothesis-free GWAS approach.

1.7.3.2 GWAS discoveries

Since 2007, more than 20 PrCa GWAS have identified ~170 common PrCa predisposition loci, which in total explain ~28.4% of familial PrCa heritability.

GWAS in European population

The first wave of PrCa GWAS was predominantly conducted in populations of European ancestry, including Iceland, Sweden, the US, the UK and Australia 142,143,144,145. Generally, they employed multi-stage designs, with <2,000 cases in the initial discovery stage. From 2007 to 2012, ~50 PrCa susceptibility loci were identified. With the dramatic reduction in genotyping expense and the establishment of collaborative international consortia, discovery sample sizes expanded to tens of thousands 146. Leading the way was the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium.
PRACTICAL was established in September 2008, and initially included studies from 13 groups (Europe, North America and Australia) comprising 7,623 PrCa cases and 5,913 controls 147,148,149. Over the past 10 years the PRACTICAL consortium has identified >50% of PrCa GWAS loci. To date, the PRACTICAL consortium consists of 133 groups worldwide, with a total of over 200,000 samples 150. More recently, the Elucidating Loci Involved in Prostate Cancer Susceptibility (ELLIPSE) Consortium, a part of the GAME-ON project, genotyped the OncoArray in an additional 46,939 PrCa cases and 27,910 controls, which, when combined with existing GWAS, revealed an additional 63 novel PrCa susceptibility loci 40.

GWAS in non-European population

GWAS in non-European populations have also identified novel PrCa risk alleles, providing support for broadening GWAS to diverse populations. For men of Asian descent, GWAS conducted in Japanese and Chinese men have identified 12 novel risk loci 151,152,153,154. For men of African descent, the African Ancestry Prostate Cancer (AAPC) Consortium was established in 2007 to better characterize PrCa genetic susceptibility in this high-risk population. In the initial AAPC GWAS, comprising 4,853 PrCa cases and 4,678 controls genotyped on the Illumina Infinium 1M-Duo array, an AA-specific PrCa risk variant at 17q21 was identified 155. A meta-analysis among >20,000 men of African ancestry, including samples from the AAPC consortium and ELLIPSE, identified another two novel AA-specific PrCa risk variants at 13q34 and 22q12 155. Despite these achievements, so far, only one small PrCa GWAS has been performed among men in Africa (the Ghana Prostate Study), which failed to identify any novel risk loci, probably due to lack of power 156. For Latino and Hispanic men, only one GWAS has been published; in this study from the Multiethnic Cohort, no novel risk locus was identified 157.
Multi-ethnic studies have also facilitated the discovery of novel PrCa risk loci. For example, in 2014, by combining existing high-density genotyping data of 87,040 multi-ethnic individuals (European, AA, Japanese, Latino), 23 novel susceptibility loci were identified 158.

GWAS of advanced PrCa

Aside from identifying risk variants for overall PrCa predisposition, several GWAS were conducted to seek variants associated with advanced PrCa. In these studies, various criteria/endpoints were used as proxies for aggressive PrCa, including lethal PrCa, PrCa survival, Gleason score, clinical stage and PSA level. Previous studies identified several genome-wide significant risk loci, including rs4054823 at 17p12 159, rs35148638 at 5q14.3, rs78943174 at 3q26.31 160, and rs11672691 at 19q13 161. However, in the two latest and largest GWAS, no genome-wide significant risk loci for aggressive PrCa (defined by Gleason score, clinical stage and PSA level) were identified 40,158. Also, among ~24,000 men of European ancestry in the PRACTICAL and BPC3 consortia, no genome-wide significant risk variant was identified for PrCa-specific survival 162.

1.7.3.3 Post-GWAS studies

Following the initial PrCa GWAS discoveries, fine-mapping studies, integrated with functional and multi-ethnic LD information, were conducted to prioritize putative candidate causal SNPs. The early fine-mapping studies mainly focused on a single or a few risk regions using several tag SNPs, such as studies of 10q11.2 (MSMB, 13 tag SNPs examined) 163, 19q13.33 (KLK3, 24 tag SNPs) 164 and 11q13 (TPCN2 and MYEOV, 79 tag SNPs) 165. With the development of genotyping technology and the availability of publicly accessible high-resolution imputation panels, more comprehensive sets of SNPs per region could be examined, improving the ability to narrow the association signal to fewer putative causal SNPs as well as to identify secondary signals.
For example, using the high-resolution iCOGS array followed by imputation to Phase I 1KGP, a total of 1094 SNPs in the telomerase reverse transcriptase (TERT) region at 5p15 were tested; four independent signals were identified, of which one was associated with TERT expression 166. More recently, with the dramatic reduction in genotyping expense and the establishment of international collaborative consortia, multiple regions can be assessed simultaneously among a large number of samples. Furthermore, with the development of publicly accessible functional annotation, eQTL and epigenetic databases, functional information can be integrated into fine-mapping studies. For example, using summary statistics estimated from ~52,000 men of European ancestry, one study fine-mapped and functionally annotated 64 known PrCa regions 167. Similarly, using summary statistics estimated in >140,000 men of European ancestry, another study examined 84 susceptibility regions 68. Both studies indicated that >10 regions harbor multiple independent signals, identified better candidate causal SNPs in more than half of the examined regions, and demonstrated significant enrichment for overlap with bio-features. Additionally, fine-mapping studies in non-European and multi-ethnic populations, by leveraging the varying LD structures, have aided in identifying ancestry-specific novel risk alleles and narrowing down candidate causal SNPs. For example, a fine-mapping study among men in the AAPC consortium identified two AA-specific risk alleles at the 8q24 region 168. A multi-ethnic fine-mapping study, comprising men of European (cases/controls: 8600/6946), African (5327/5136), Japanese (2563/4391) and Latino (1034/1046) ancestry, examined 67 risk regions followed by genomic annotation 169. This study identified better lead SNPs in 30 regions and secondary signals in two regions, and also demonstrated enrichment in overlap with various functional markers.
The majority of common PrCa risk variants are located in non-protein-coding regions. As stated above, putative causal SNPs identified from GWAS and fine-mapping studies were found to frequently overlap with functional annotations. To further elucidate the underlying biological mechanism(s), experimental follow-up studies have been initiated. One study experimentally examined the suggestive risk locus for advanced PrCa at 19q13 170, rs11672691, located within an intron of the long noncoding RNA PCAT19. In this study, the authors first confirmed that rs11672691-G was associated with shorter recurrence-free survival, and then performed a cis-eQTL analysis and demonstrated that rs11672691 was reciprocally associated with the abundance of two PCAT19 isoforms, PCAT19-long and PCAT19-short, the ratio of which was significantly associated with increased risk of PrCa relapse. Through functional annotation, they found that the A→G mutation reduced the binding affinity of the tumor suppressor NKX3.1. Coupling evidence from knockdown of NKX3.1, reporter-gene and 3C assays, the authors found that the A→G mutation converted the PCAT19-short promoter into a PCAT19-long enhancer. Through in vitro (PrCa cell lines) and in vivo (mice) experiments, they also demonstrated that knockdown of PCAT19-long reduced cell proliferation, migration and invasion, tumor growth and distant metastases. Further, through RNA-seq, they revealed that the knockdown also led to the upregulation of 113 genes, which were found to be associated with shorter disease-free survival in the TCGA cohort. Finally, they showed that PCAT19-long regulated its target genes by interacting with HNRNPAB, an RNA-binding protein. Similar studies should be conducted to better understand the functional mechanisms underlying GWAS signals and to reveal cancer etiology. In accordance with studies in other diseases, the PRS for PrCa has shown great potential in population risk stratification as well as individual risk prediction.
A PRS constructed using 147 known PrCa risk variants was shown to be significantly associated with PrCa risk; compared to men at average risk (25th-75th percentiles), men in the top 10% PRS risk stratum had a 2.69-fold higher risk, which increased to 5.71-fold among those in the top 1% 40 . Multiple studies have demonstrated that the PRS impacts PrCa risk independently of (or only in weak correlation with) serum prostate-specific antigen (PSA) levels and family history, implying that the PRS could provide additional information on PrCa risk and that incorporating the PRS in risk prediction models would improve discrimination capability 171,172,173 . For example, one study showed that among previously unbiopsied men with low PSA levels (1-3 ng/ml), the PRS could predict which men were more likely to have a positive biopsy 174 . Several studies constructed PRS-based prediction models and showed that the positive predictive value (PPV) 175 and specificity 176 of a PSA test increased while over-diagnosis rates decreased 177 within higher risk strata. These studies suggest that the PRS has great potential in guiding targeted screening.
The most important PrCa risk region identified to date is at 8q24, a gene-poor region in which the closest protein-coding gene, MYC, is >200 kb downstream. It was first described by a genome-wide linkage study among Icelandic families 178 . Soon after, a whole-genome admixture mapping study among AA men independently identified the same broad associated region (125.68-129.48 Mb) 49 . This region has been robustly replicated by GWAS and fine-mapping studies among multiple ethnic/racial populations as harboring PrCa risk polymorphisms 179,168,180 . These studies also indicated that multiple loci, located in five separate LD blocks, conferred independent associations with PrCa risk.
In a recent in-depth fine-mapping study of 8q24 (127.6-129.0 Mb) in >120,000 men of European descent, 12 independent SNPs were identified and together captured 11.54% of familial PrCa risk [ref – Marco etl. in press], further supporting the importance of this region in PrCa etiology. Nevertheless, although multiple studies have been conducted, the underlying biological mechanism(s) remain unknown. Some studies suggested that the associated SNPs lie in regulatory elements, especially enhancers, in this region. For example, one study showed that a risk allele (rs6983267) is located within an in vivo prostate enhancer, which is involved in regulating MYC expression 181 . Another study, using reporter assays, identified several enhancers within this region and indicated that a risk allele (rs11986220) facilitated both stronger FoxA1 binding and AR responsiveness 182 . A study also demonstrated that risk alleles at 8q24 formed long-range chromatin interactions with MYC in a tissue-specific manner (LNCaP cancer cell line) 183 . Another study suggested that variants at 8q24 may constitute a regulatory hub that could regulate multiple genes, especially those enriched in pathways such as Wnt signaling, by long-range physical contact 184 . Moreover, several studies provided evidence that this region could be transcribed into lncRNAs such as prostate cancer non-coding RNA 1 (PRNCR1) 185 . One study examined the association between 8q24 risk alleles and MYC miRNA transcript abundance; however, no evidence of an association was found 186 . Another study observed that the 8q24 allele rs378854-G could reduce binding of the transcription factor YY1 in vitro, and the presence of this risk allele was associated with expression of the oncogene PVT1 in normal prostate tissue 187 .
In summary, to date, previous GWAS and fine-mapping studies have identified ~170 PrCa risk loci, indicating that multiple germline variants underlie PrCa susceptibility.
Several ethnic-specific risk loci have been identified through studying non-European and multi-ethnic populations, supporting the necessity of expanding GWAS and fine-mapping studies to non-European populations. PRS have also been shown to stratify population PrCa risk and have great potential in personalized screening, yet ethnic-specific PRS remain to be constructed and tested.
In this dissertation, I aim to expand the current knowledge of PrCa genetic susceptibility to the underrepresented populations of African and Latino ancestry. For African men (Chapter 2), I conducted the first GWAS and genetic risk characterization study in men from Eastern Africa, among 571 incident PrCa cases and 485 controls from Uganda. I examined risk associations at 111 known PrCa loci and constructed a polygenic risk model to assess the cumulative genetic effects of genetic risk loci in this high-risk African population. For Latinos (Chapter 3), I assembled all existing genetic studies of PrCa in Latino men, comprising 2,714 cases and 5,293 controls. In this study I carried out the largest PrCa GWAS to date in this population to search for novel risk alleles and to determine whether known PrCa risk alleles are important in capturing PrCa risk in Latino men. I also conducted the first genome-wide admixture mapping analysis of PrCa in Latino men to scan for PrCa risk alleles associated with local ancestry. As part of this work, I also developed a PrCa PRS to test the cumulative effect of all known PrCa risk variants in Latinos and explored whether genetic background/ancestry modified associations with single variants and a PrCa PRS in Latino men.
1.8 Multiple myeloma epidemiology
1.8.1 Overview
Multiple myeloma (MM) is a hematological malignancy of plasma cells, which is characterized by clinical symptoms including hypercalcemia, renal insufficiency, anemia, and bone lesions 188 .
The development of MM follows a multistage pattern. First, the proliferation of monoclonal plasma cells in bone marrow and the secretion of monoclonal immunoglobulin protein (also known as M protein) lead to an asymptomatic premalignant disorder termed monoclonal gammopathy of undetermined significance (MGUS), which has a 0.5-1% annual risk of progressing to MM. The next intermediate asymptomatic premalignant disorder following MGUS is smoldering multiple myeloma (SMM), in which patients do not show evidence of myeloma-defining events or amyloidosis. The SMM stage has a 10% annual risk of progressing to MM in the first 5 years after diagnosis. Later on, the increased level of circulating M protein affects multiple organs and results in the appearance of typical clinical symptoms and detectable biomarkers, through which MM can be diagnosed 189 . MM is the second most common blood cancer after non-Hodgkin lymphoma. In 2018, the estimated number of new cases was 30,770 and the estimated number of deaths from MM was 12,770, accounting for ~2% of total cancer deaths 85 . During the last 10 years, the age-adjusted MM incidence rate increased in each ethnic group; in contrast, its mortality rate slightly decreased 190 . Although treatment has greatly improved, MM remains a rarely curable disease, with a 5-year survival rate of ~50.7% and a median survival of 6 years 189,191 .
MM occurs more often among elderly people, males and populations of African ancestry. Compared to all other racial groups, AAs have ~2-fold higher MM incidence and mortality rates. In addition, compared to people of European ancestry, people of African ancestry are diagnosed earlier (median diagnosis age: 70 in EAs versus 66 in AAs) 192,193 , and have a higher MGUS prevalence rate 194 . Regarding differences in survival, one study showed that, since 1999, survival has been greater in people of European ancestry 195 , especially in those under 70 196 .
According to SEER, during 2000-2011, after controlling for potential barriers to access to care and overall health status, patients of African ancestry were significantly less likely to be treated with stem cell transplantation or bortezomib, resulting in a 12% increase in risk of death 197 .
1.8.2 Non-genetic risk factors
Apart from the aforementioned factors, other well-established risk factors for MM are obesity 198 and family history of MM 199 . In particular, a prospective study showed that increased BMI at both ages 18 and 50 was significantly associated with MM risk, suggesting that overweight or obesity in both early and later life are risk factors for MM 200 . A positive association between BMI and MM mortality has also been observed in an AA cohort (~240,000 AA individuals, 11.6 average follow-up years, 496 MM deaths) 201 . Although most MM patients have no affected relatives, previous studies have shown that family history is a robust risk factor for MM. For example, a study in Sweden (13,896 MM cases, 54,365 matched controls) indicated that first-degree relatives of MM cases had an increased risk of developing MM (RR=2.1, 1.6-2.9) 202 . First-degree relatives of those with MGUS have a ~3-fold increase in MM risk 203 . Previous studies also suggested that a family history of any hematologic malignancy was associated with ~2-fold elevated MM risk 204,205 , and the effect size was slightly greater among people of African ancestry 204 . Other suspected risk factors, including alcohol consumption, smoking, diet, history of autoimmune disorders, and occupational/environmental exposures to chemicals such as benzene, pesticides, and ionizing radiation, have been examined; however, the evidence has not been consistent 199 . One example is the association between alcohol consumption and MM risk.
In a prospective cohort study of women (1.3 million, follow-up for an average of 10.3 years), increased alcohol intake was found to be associated with the risk of several haematological malignancies, including a lower risk of myeloma (RR=0.88, 0.80-0.98) 206 . In a pooled analysis of case-control studies (1,567 cases, 7,296 controls), compared to never drinkers, current drinkers had a ~0.5-fold lower risk of developing MM, yet no dose-response relationship was observed for drinking frequency, duration, or lifetime consumption 206 . However, other studies have not found such an association. In a meta-analysis containing 18 studies, no association between alcohol drinking and MM risk was found (OR=0.97, 0.85-1.10) 207 . A study that conducted stratified analyses by race also did not observe significant associations 208 .
1.8.3 Genetic susceptibility
Multiple lines of evidence, including the ethnic/racial disparity in MM incidence and mortality rates, younger diagnosis age among AAs, elevated risk among first-degree relatives of MM patients, and the reported familial clustering 209,210 , strongly suggest an inherited genetic susceptibility for MM.
Due to its rarity and poor survival rate, family-based linkage studies of MM are scarce. Recently, a novel pedigree analysis, applied to 11 extended high-risk MM pedigrees, followed by exome sequencing, identified a genome-wide significant segment at 6q16 containing USP45, an important regulator of DNA repair 211 . Based on pathological hypotheses, several candidate gene studies searched for possible genetic polymorphisms associated with MM risk. They examined pathways/genes including immune-pathway genes such as Interleukin-6 212 , tumor necrosis factor α (TNF-α) 213 and CD4 214 , cell cycle and apoptosis genes such as BAX and CASP9 215 , xenobiotic metabolism genes such as CYP1A1 216 , folate and methionine metabolism genes such as 5,10-methylenetetrahydrofolate reductase (MTHFR) 217 , and DNA repair genes such as XRCC3 218 .
However, due to lack of power, none of these findings were consistently replicated. To date, a total of five GWAS among people of European descent have been published. The first GWAS of MM, conducted among UK and German men and women with a combined sample size of 1,675 cases and 5,903 controls, identified two genome-wide significant regions at 3p22.1 (rs1052501) and 7p15.3 (rs4487645) and a suggestive region at 2p23.3 (rs6746082) 219 . SNP rs1052501 localizes to an exon of the ULK4 gene, which encodes a serine–threonine protein kinase; rs4487645 maps to an intron of the DNAH11 gene, which encodes a dynein heavy chain microtubule-dependent motor ATPase, and also encompasses the 3’ part of the CDCA7L gene; rs6746082 maps to intron 12 of the DTNB gene, which encodes dystrobrevin beta 219 . None of the three regions were found to map to potential transcriptional regulatory regions or to be associated with mRNA expression. These three alleles accounted for only ~4% of familial MM risk but, in terms of population attributable fraction, contributed substantially to the development of MM, underlying ~37% of cases 219 . The GWAS in the German samples was expanded to 2,335 cases and 7,306 controls, which revealed four novel genome-wide significant risk regions at 3q26.2 (rs10936599), 6p21.33 (rs2285803), 17p11.2 (rs4273077), and 22q13.1 (rs877529). The allele rs10936599 maps to the MYNN gene, which encodes a zinc finger protein, and also encompasses the TERC gene; further imputation identified a stronger association, rs2293607, which also maps to the TERC gene; allele rs2285803 maps to an intron of the PSORS1C1 gene, and also encompasses CCHCR1, CDSH, and POU5F1 genes; allele rs4273077 maps to an intron of the TNFRSF13B gene, which is a regulator of B and T-cell function; rs877529 maps to an intron of the CBX7 gene, which encodes a polycomb group protein 220 .
Using chromatin state segmentation, the authors indicated that rs877529 was located within an enhancer and rs2293607 was predicted to lie in an active promoter. In total, the seven GWAS-identified SNPs were estimated to explain ~13% of the familial risk. A third GWAS in people of European descent meta-analyzed GWAS from Swedish, Norwegian and Icelandic populations, with a total discovery sample containing 2,925 cases and 506,554 controls. The authors identified one novel genome-wide significant allele at 5q15 (rs56219066), which maps to the ELL2 gene and regulates mRNA processing in plasma cells 221 . An allele at 22q13 (rs138740, HMGXB4 and TOM1) was also found to be a promising variant; however, it was not replicated in their replication sample.
The fourth GWAS, also conducted in men and women of European descent, comprised 7,319 cases and 234,385 controls from six populations. It identified 8 new MM risk loci, including 6p22.3 (rs34229995, 2.2-kb telomeric to the 5’ of JARID2), 6q21 (rs9372120, intron of ATG5), 7q36.1 (rs7781265, intron of SMARCD3), 8q24.21 (rs1948915, CCAT1), 9p21.3 (rs2811710, intron of CDKN2A), 10p12.1 (rs2790457, intron of WAC), 16q23.1 (rs7193541, non-synonymous SNP I564V of RFWD3) and 20q13.13 (rs6066835, intron of PREX1) 222 . The largest and latest GWAS combined results of a new GWAS with all existing GWAS and replication data (totaling 9,974 cases and 247,556 controls) and identified 6 novel risk loci, including 2q31.1 (rs4325816, SP3), 5q23.2 (rs6595443, CEP120), 7q22.3 (rs17507636, CCDC71L), 7q31.33 (rs58618031, POT1), 16p11.2 (rs13338946, FBRS) and 19p13.11 (rs11086029, KLF2) 223 . In addition, this study performed functional analyses, including in situ promoter capture Hi-C (CHi-C) in an MM cell line, ChIP-seq chromatin profiling from MM and lymphoblastoid cell lines and naive B cells, and eQTL analysis using CD138-purified plasma cells from patients.
These analyses provided insights into the functional mechanisms underlying MM risk loci, namely, the impact of MM loci on regulating genes involved in cell cycle and genomic instability, chromatin remodeling, the central role of IRF4-MYC-mediated apoptosis/autophagy, and B cell and plasma cell differentiation. To date, only one GWAS has been conducted in non-European samples. This study combined GWAS results of EAs and AAs and tested the effects of the 7 MM risk alleles known at the time, all identified among people of European descent. It demonstrated that five risk loci, 2p23.3, 3p22.1, 7p15.3, 17p11.2, and 22q13.1, were replicated at a nominal significance level, among which three variants (rs4487645, rs4273077, rs877529) were significant in AA-only samples, suggesting that genetic susceptibility to MM is shared across racial/ethnic groups 224 . Through fine-mapping in this multiethnic study, and genomic annotation, better lead SNPs were identified at 4 regions (3p22.1, 7p15.3, 17p11.2, 22q13.1).
In summary, GWAS provide strong support for genetic factors in MM susceptibility. To date, a total of 23 risk SNPs have been identified, implying that MM is a polygenic disease, and more susceptibility loci likely remain to be discovered. However, comprehensive discovery and genetic risk characterization efforts are lacking in populations of non-European ancestry, which are critical given the disparities in disease incidence across populations. In this dissertation (Chapter 4), I combined all existing MM GWAS data for men and women of African ancestry, which were imputed using the high-density HRC imputation panel. I conducted a GWAS and a whole-genome admixture mapping analysis to scan for novel MM risk regions. I also tested the 23 reported MM risk variants and comprehensively examined each risk region to find markers that better capture MM genetic risk in AAs as well as potential secondary signals.
Finally, I constructed a PRS to assess the aggregated effect of 23 known risk loci on MM risk in this high-risk population.
1.9 References
1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians. 0(0). doi:10.3322/caac.21492 2. Hall JM, Lee MK, Newman B, et al. Linkage of early-onset familial breast cancer to chromosome 17q21. Science. 1990;250(4988):1684-1689. doi:10.1126/science.2270482 3. Miki Y, Swensen J, Shattuck-Eidens D, et al. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science. 1994;266(5182):66-71. doi:10.1126/science.7545954 4. Bush WS, Moore JH. Chapter 11: Genome-Wide Association Studies. PLOS Computational Biology. 2012;8(12):e1002822. doi:10.1371/journal.pcbi.1002822 5. Reich DE, Cargill M, Bolk S, Ireland J, et al. Linkage disequilibrium in the human genome. Nature. 2001;411(6834):199-204. doi:10.1038/35075590 6. A haplotype map of the human genome. Nature. 2005;437(7063):1299-1320. doi:10.1038/nature04226 7. Population stratification. In: Wikipedia. ; 2018. https://en.wikipedia.org/w/index.php?title=Population_stratification&oldid=818216620. Accessed May 19, 2018. 8. Freedman ML, Reich D, Penney KL, et al. Assessing the impact of population stratification on genetic association studies. Nat Genet. 2004;36(4):388-393. doi:10.1038/ng1333 9. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904-909. doi:10.1038/ng1847 10. Sakoda LC, Jorgenson E, Witte JS. Turning of COGS moves forward findings for hormonally mediated cancers. Nature Genetics. 2013;45(4):345-348. 11.
Genetic Associations and Mechanisms in Oncology (GAME-ON): A Network of Consortia for Post-Genome Wide Association (Post-GWA) Research. https://epi.grants.cancer.gov/gameon/. Accessed October 26, 2018. 12. Amos CI, Dennis J, Wang Z, et al. The OncoArray Consortium: a Network for Understanding the Genetic Architecture of Common Cancers. Cancer Epidemiol Biomarkers Prev. 2017;26(1):126-135. doi:10.1158/1055-9965.EPI-16-0106 13. Marchini J, Howie B. Genotype imputation for genome-wide association studies. Nature Reviews Genetics. 2010;11(7):499-511. doi:10.1038/nrg2796 14. Spencer CCA, Su Z, Donnelly P, Marchini J. Designing Genome-Wide Association Studies: Sample Size, Power, Imputation, and the Choice of Genotyping Chip. PLOS Genetics. 2009;5(5):e1000477. doi:10.1371/journal.pgen.1000477 15. Consortium TIH. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449(7164):851-861. doi:10.1038/nature06258 16. Consortium TIH 3. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467(7311):52-58. doi:10.1038/nature09298 17. Consortium T 1000 GP. A map of human genome variation from population-scale sequencing. Nature. 2010;467(7319):1061-1073. doi:10.1038/nature09534 18. Consortium T 1000 GP. A global reference for human genetic variation. Nature. 2015;526(7571):68-74. doi:10.1038/nature15393 19. Consortium T 1000 GP. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491(7422):56-65. doi:10.1038/nature11632 20. McCarthy S, Das S, Kretzschmar W, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48(10):1279-1283. doi:10.1038/ng.3643 21. Welter D, MacArthur J, Morales J, et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42(Database issue):D1001-D1006. doi:10.1093/nar/gkt1229 22. MacArthur J, Bowler E, Cerezo M, et al.
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 2017;45(D1):D896-D901. doi:10.1093/nar/gkw1133 23. GWAS Catalog. https://www.ebi.ac.uk/gwas/docs/about. Accessed November 9, 2017. 24. Fehringer G, Kraft P, Pharoah PD, et al. Cross-Cancer Genome-Wide Analysis of Lung, Ovary, Breast, Prostate, and Colorectal Cancer Reveals Novel Pleiotropic Associations. Cancer Res. 2016;76(17):5103-5114. doi:10.1158/0008-5472.CAN-15-2980 25. Law PJ, Sud A, Mitchell JS, et al. Genome-wide association analysis of chronic lymphocytic leukaemia, Hodgkin lymphoma and multiple myeloma identifies pleiotropic risk loci. Sci Rep. 2017;7. doi:10.1038/srep41071 26. Plenge RM, Scolnick EM, Altshuler D. Validating therapeutic targets through human genetics. Nature Reviews Drug Discovery. 2013;12(8):581-594. doi:10.1038/nrd4051 27. Visscher PM, Brown MA, McCarthy MI, Yang J. Five Years of GWAS Discovery. The American Journal of Human Genetics. 2012;90(1):7-24. doi:10.1016/j.ajhg.2011.11.029 28. Kumar V, Kato N, Urabe Y, et al. Genome-wide association study identifies a susceptibility locus for HCV-induced hepatocellular carcinoma. Nature Genetics. 2011;43(5):455-458. doi:10.1038/ng.809 29. Goto K, Annan DA, Morita T, et al. Novel chemoimmunotherapeutic strategy for hepatocellular carcinoma based on a genome-wide association study. Sci Rep. 2016;6:38407. doi:10.1038/srep38407 30. Goto K, Arai J, Stephanou A, Kato N. Novel therapeutic features of disulfiram against hepatocellular carcinoma cells with inhibitory effects on a disintegrin and metalloproteinase 10. Oncotarget. 2018;9(27):18821-18831. doi:10.18632/oncotarget.24568 31. Park SL, Cheng I, Haiman CA. Genome-Wide Association Studies of Cancer in Diverse Populations. Cancer Epidemiol Biomarkers Prev. 2018;27(4):405-417. doi:10.1158/1055-9965.EPI-17-0169 32. Bustamante CD, De La Vega FM, Burchard EG. Genomics for the world. Nature. 2011;475:163-165. doi:10.1038/475163a 33.
Visscher PM, Wray NR, Zhang Q, et al. 10 Years of GWAS Discovery: Biology, Function, and Translation. The American Journal of Human Genetics. 2017;101(1):5-22. doi:10.1016/j.ajhg.2017.06.005 34. Marigorta UM, Navarro A. High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants. PLoS Genet. 2013;9(6). doi:10.1371/journal.pgen.1003566 35. Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive review of genetic association studies. Genetics in Medicine. 2002;4(2):45-61. doi:10.1097/00125817-200203000-00002 36. Palmer C, Pe’er I. Statistical correction of the Winner’s Curse explains replication variability in quantitative trait genome-wide association studies. PLoS Genet. 2017;13(7). doi:10.1371/journal.pgen.1006916 37. Phillips PC. Epistasis - the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet. 2008;9(11):855-867. doi:10.1038/nrg2452 38. Risch NJ. Searching for genetic determinants in the new millennium. Nature. doi:10.1038/35015718 39. Cui R, Kamatani Y, Takahashi A, et al. Functional Variants in ADH1B and ALDH2 Coupled With Alcohol and Smoking Synergistically Enhance Esophageal Cancer Risk. Gastroenterology. 2009;137(5):1768-1775. doi:10.1053/j.gastro.2009.07.070 40. Schumacher FR, Olama AAA, Berndt SI, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nature Genetics. 2018;50(7):928-936. doi:10.1038/s41588-018-0142-8 41. Yang J, Benyamin B, McEvoy BP, et al. Common SNPs explain a large proportion of the heritability for human height. Nature Genetics. 2010;42(7):565-569. 42. Gorlov IP, Gorlova OY, Sunyaev SR, Spitz MR, Amos CI. Shifting Paradigm of Association Studies: Value of Rare Single-Nucleotide Polymorphisms. Am J Hum Genet. 2008;82(1):100-112. doi:10.1016/j.ajhg.2007.09.006 43. Zhu X, Tang H, Risch N. Admixture Mapping and the Role of Population Structure for Localizing Disease Genes.
In: Advances in Genetics. Vol 60. Genetic Dissection of Complex Traits. Academic Press; 2008:547-569. doi:10.1016/S0065-2660(07)00419-1 44. Winkler CA, Nelson GW, Smith MW. Admixture Mapping Comes of Age. Annual Review of Genomics and Human Genetics. 2010;11(1):65-89. doi:10.1146/annurev-genom-082509-141523 45. Balding DJ, Bishop M, Cannings C. Handbook of Statistical Genetics. John Wiley & Sons; 2008. 46. Chakraborty R, Weiss KM. Admixture as a tool for finding linked genes and detecting that difference from allelic association between loci. PNAS. 1988;85(23):9119-9123. doi:10.1073/pnas.85.23.9119 47. Rife DC. Populations of hybrid origin as source material for the detection of linkage. Am J Hum Genet. 1954;6(1):26-33. 48. Maples BK, Gravel S, Kenny EE, Bustamante CD. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am J Hum Genet. 2013;93(2):278-288. doi:10.1016/j.ajhg.2013.06.020 49. Freedman ML, Haiman CA, Patterson N, et al. Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men. Proc Natl Acad Sci U S A. 2006;103(38):14068-14073. doi:10.1073/pnas.0605832103 50. Spain SL, Barrett JC. Strategies for fine-mapping complex traits. Hum Mol Genet. 2015;24(R1):R111-R119. doi:10.1093/hmg/ddv260 51. Maller JB, McVean G, Byrnes J, et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat Genet. 2012;44(12):1294-1301. doi:10.1038/ng.2435 52. Yang J, Ferreira T, Morris AP, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nature Genetics. 2012;44(4):369-375, S1-3. 53. Chen W, Larrabee BR, Ovsyannikova IG, et al. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics. Genetics. 2015;200(3):719-736. doi:10.1534/genetics.115.176107 54. van de Bunt M, Cortes A, Brown MA, Morris AP, McCarthy MI.
Evaluating the Performance of Fine-Mapping Strategies at Common Variant GWAS Loci. PLoS Genet. 2015;11(9). doi:10.1371/journal.pgen.1005535 55. Asimit JL, Hatzikotoulas K, McCarthy M, Morris AP, Zeggini E. Trans-ethnic study design approaches for fine-mapping. Eur J Hum Genet. 2016;24(9):1330-1336. doi:10.1038/ejhg.2016.1 56. Zaitlen N, Paşaniuc B, Gur T, Ziv E, Halperin E. Leveraging Genetic Variability across Populations for the Identification of Causal Variants. Am J Hum Genet. 2010;86(1):23-33. doi:10.1016/j.ajhg.2009.11.016 57. Sauna ZE, Kimchi-Sarfaty C. Understanding the contribution of synonymous mutations to human disease. Nature Reviews Genetics. 2011;12(10):683-691. doi:10.1038/nrg3051 58. Schaub MA, Boyle AP, Kundaje A, Batzoglou S, Snyder M. Linking disease associations with regulatory information in the human genome. Genome Res. 2012;22(9):1748-1759. doi:10.1101/gr.136127.111 59. Hindorff LA, Sethupathy P, Junkins HA, et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. PNAS. 2009;106(23):9362-9367. doi:10.1073/pnas.0903103106 60. Chen A, Chen D, Chen Y. Advances of DNase-seq for mapping active gene regulatory elements across the genome in animals. Gene. 2018;667:83-94. doi:10.1016/j.gene.2018.05.033 61. Giresi PG, Kim J, McDaniell RM, Iyer VR, Lieb JD. FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin. Genome Res. 2007;17(6):877-885. doi:10.1101/gr.5533506 62. Smale ST, Kadonaga JT. The RNA Polymerase II Core Promoter. Annual Review of Biochemistry. 2003;72(1):449-479. doi:10.1146/annurev.biochem.72.121801.161520 63. Wittkopp PJ, Kalay G. Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nature Reviews Genetics. 2012;13(1):59-69. doi:10.1038/nrg3095 64. Zhang X, Bailey SD, Lupien M.
Laying a solid foundation for Manhattan – ‘setting the functional basis for the post-GWAS era.’ Trends in Genetics. 2014;30(4):140-149. doi:10.1016/j.tig.2014.02.006 65. Consortium TEP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57-74. doi:10.1038/nature11247 66. Hazelett DJ, Rhie SK, Gaddis M, et al. Comprehensive Functional Annotation of 77 Prostate Cancer Risk Loci. PLOS Genetics. 2014;10(1):e1004102. doi:10.1371/journal.pgen.1004102 67. Nicolae DL, Gamazon E, Zhang W, Duan S, Dolan ME, Cox NJ. Trait-Associated SNPs Are More Likely to Be eQTLs: Annotation to Enhance Discovery from GWAS. PLOS Genetics. 2010;6(4):e1000888. doi:10.1371/journal.pgen.1000888 68. Dadaev T, Saunders EJ, Newcombe PJ, et al. Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants. Nature Communications. 2018;9(1):2256. doi:10.1038/s41467-018-04109-8 69. Diehl AG, Boyle AP. Deciphering ENCODE. Trends in Genetics. 2016;32(4):238-249. doi:10.1016/j.tig.2016.02.002 70. Consortium RE, Kundaje A, Meuleman W, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518(7539):317-330. doi:10.1038/nature14248 71. Kawaji H, Kasukawa T, Forrest A, Carninci P, Hayashizaki Y. The FANTOM5 collection, a data series underpinning mammalian transcriptome atlases in diverse cell types. Scientific Data. doi:10.1038/sdata.2017.113 72. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013;45(6):580-585. doi:10.1038/ng.2653 73. International Schizophrenia Consortium, Purcell SM, Wray NR, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460(7256):748-752. doi:10.1038/nature08185 74. Vilhjálmsson BJ, Yang J, Finucane HK, et al. Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. Am J Hum Genet. 2015;97(4):576-592. doi:10.1016/j.ajhg.2015.09.001 75. So H, Sham PC.
Improving polygenic risk prediction from summary statistics by an empirical Bayes approach. Scientific Reports. 2017;7:41262. doi:10.1038/srep41262 76. Chatterjee N, Wheeler B, Sampson J, Hartge P, Chanock SJ, Park J-H. Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies. Nat Genet. 2013;45(4):400-405, 405e1-3. doi:10.1038/ng.2579 77. Garcia-Closas M, Rothman N, Figueroa JD, et al. Common genetic polymorphisms modify the effect of smoking on absolute risk of bladder cancer. Cancer Res. 2013;73(7):2211-2220. doi:10.1158/0008-5472.CAN-12-2388 78. Frampton MJE, Law P, Litchfield K, et al. Implications of polygenic risk for personalised colorectal cancer screening. Ann Oncol. 2016;27(3):429-434. doi:10.1093/annonc/mdv540 79. Shieh Y, Eklund M, Madlensky L, et al. Breast Cancer Screening in the Precision Medicine Era: Risk-Based Screening in a Population-Based Trial. J Natl Cancer Inst. 2017;109(5). doi:10.1093/jnci/djw290 80. Khera AV, Chaffin M, Aragam KG, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nature Genetics. 2018;50(9):1219. doi:10.1038/s41588-018-0183-z 81. Chatterjee N, Shi J, García-Closas M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nature Reviews Genetics. 2016;17(7):392-406. doi:10.1038/nrg.2016.27 82. Tikkanen E, Havulinna AS, Palotie A, Salomaa V, Ripatti S. Genetic Risk Prediction and a 2-Stage Risk Screening Strategy for Coronary Heart Disease. Arteriosclerosis, Thrombosis, and Vascular Biology. 2013;33(9):2261-2266. doi:10.1161/ATVBAHA.112.301120 83. Martin AR, Gignoux CR, Walters RK, et al. Human Demographic History Impacts Genetic Risk Prediction across Diverse Populations. The American Journal of Human Genetics. 2017;100(4):635-649. doi:10.1016/j.ajhg.2017.03.004 84.
Pashayan N, Morris S, Gilbert FJ, Pharoah PDP. Cost-effectiveness and Benefit-to-Harm Ratio of Risk-Stratified Screening for Breast Cancer: A Life-Table Model. JAMA Oncology. July 2018. doi:10.1001/jamaoncol.2018.1901 85. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA: A Cancer Journal for Clinicians. 2018;68(1):7-30. doi:10.3322/caac.21442 86. Cotter MP, Gern RW, Ho GYF, Chang RY, Burk RD. Role of family history and ethnicity on the mode and age of prostate cancer presentation*. The Prostate. 2002;50(4):216-221. doi:10.1002/pros.10051 87. Powell IJ. Epidemiology and pathophysiology of prostate cancer in African-American men. J Urol. 2007;177(2):444-449. doi:10.1016/j.juro.2006.09.024 88. Okuku F, Orem J, Holoya G, De Boer C, Thompson CL, Cooney MM. Prostate Cancer Burden at the Uganda Cancer Institute. J Glob Oncol. 2016;2(4):181-185. doi:10.1200/JGO.2015.001040 89. Parkin DM, Nambooze S, Wabwire-Mangen F, Wabinga HR. Changing cancer incidence in Kampala, Uganda, 1991-2006. Int J Cancer. 2010;126(5):1187-1195. doi:10.1002/ijc.24838 90. Hsing AW, Yeboah E, Biritwum R, et al. High Prevalence of Screen Detected Prostate Cancer in West Africans: Implications for Racial Disparity of Prostate Cancer. J Urol. 2014;192(3):730-735. doi:10.1016/j.juro.2014.04.017 91. Cook LS, Goldoft M, Schwartz SM, Weiss NS. INCIDENCE OF ADENOCARCINOMA OF THE PROSTATE IN ASIAN IMMIGRANTS TO THE UNITED STATES AND THEIR DESCENDANTS. The Journal of Urology. 1999;161(1):152-155. doi:10.1016/S0022- 5347(01)62086-X 92. Ito K. Prostate cancer in Asian men. Nature Reviews Urology. 2014;11(4):197-212. doi:10.1038/nrurol.2014.42 48 93. Johns LE, Houlston RS. A systematic review and meta-analysis of familial prostate cancer risk. BJU International. 2003;91(9):789-794. doi:10.1046/j.1464-410X.2003.04232.x 94. Zeegers MPA, Jellema A, Ostrer H. Empiric risk of prostate carcinoma for relatives of patients with prostate carcinoma. Cancer. 2003;97(8):1894-1903. doi:10.1002/cncr.11262 95. 
Whittemore AS, Wu AH, Kolonel LN, et al. Family History and Prostate Cancer Risk in Black, White, and Asian Men in the United States and Canada. Am J Epidemiol. 1995;141(8):732-740. doi:10.1093/oxfordjournals.aje.a117495 96. Markozannes G, Tzoulaki I, Karli D, et al. Diet, body size, physical activity and risk of prostate cancer: An umbrella review of the evidence. Eur J Cancer. 2016;69:61-69. doi:10.1016/j.ejca.2016.09.026 97. Barrington WE, Schenk JM, Etzioni R, et al. Difference in Association of Obesity With Prostate Cancer Risk Between US African American and Non-Hispanic White Men in the Selenium and Vitamin E Cancer Prevention Trial (SELECT). JAMA Oncol. 2015;1(3):342- 349. doi:10.1001/jamaoncol.2015.0513 98. Ogden CL, Carroll MD, Kit BK, Flegal KM. Prevalence of Childhood and Adult Obesity in the United States, 2011–2012. JAMA. 2014;311(8):806-814. doi:10.1001/jama.2014.732 99. Mordukhovich I, Reiter PL, Backes DM, et al. A review of African American-white differences in risk factors for cancer: prostate cancer. Cancer Causes Control. 2011;22(3):341-357. doi:10.1007/s10552-010-9712-5 100. World Cancer Research Fund International/American Institute for Cancer Research Continuous Update Project Report: Diet, Nutrition, Physical Activity, and Prostate Cancer. 2014. Available at: www.wcrf.org/sites/default/files/Prostate-Cancer-2014-Report.pdf. 101. World Cancer Research Fund International Systematic Literature Review: The Associations between Food, Nutrition and Physical Activity and the Risk of Prostate Cancer. 2014. Available at:wcrf.org/sites/default/files/Prostate-Cancer-SLR-2014.pdf. 102. Kasper JS, Liu Y, Giovannucci E. Diabetes Mellitus and Risk of Prostate Cancer in the Health Professionals Follow-Up Study. Int J Cancer. 2009;124(6):1398-1403. doi:10.1002/ijc.24044 103. Dankner R, Boffetta P, Keinan-Boker L, et al. 
Diabetes, prostate cancer screening and risk of low- and high-grade prostate cancer: an 11 year historical population follow-up study of more than 1 million men. Diabetologia. 2016;59:1683-1691. doi:10.1007/s00125-016- 3972-x 104. Vecchia CL, Negri E, Franceschi S, D’Avanzo B, Boyle P. A case-control study of diabetes mellitus and cancer risk. British Journal of Cancer. 1994;70(5):950-953. doi:10.1038/bjc.1994.427 49 105. Wallström P, Bjartell A, Gullberg B, Olsson H, Wirfält E. A prospective Swedish study on body size, body composition, diabetes, and prostate cancer risk. British Journal of Cancer. 2009;100(11):1799-1805. doi:10.1038/sj.bjc.6605077 106. Waters KM, Henderson BE, Stram DO, Wan P, Kolonel LN, Haiman CA. Association of Diabetes With Prostate Cancer Risk in the Multiethnic Cohort. Am J Epidemiol. 2009;169(8):937-945. doi:10.1093/aje/kwp003 107. Park S-Y, Haiman CA, Cheng I, et al. Racial/ethnic differences in lifestyle-related factors and prostate cancer risk: the Multiethnic Cohort Study. Cancer Causes Control. 2015;26(10):1507-1515. doi:10.1007/s10552-015-0644-y 108. Giovannucci E, Rimm EB, Stampfer MJ, Colditz GA, Willett WC. Diabetes mellitus and risk of prostate cancer (United States). Cancer Causes Control. 1998;9(1):3-9. doi:10.1023/A:1008822917449 109. Rodriguez C, Patel AV, Mondul AM, Jacobs EJ, Thun MJ, Calle EE. Diabetes and Risk of Prostate Cancer in a Prospective Cohort of US Men. Am J Epidemiol. 2005;161(2):147- 152. doi:10.1093/aje/kwh334 110. Wright JL, Stanford JL. Metformin use and prostate cancer in Caucasian men: results from a population-based case–control study. Cancer Causes Control. 2009;20(9):1617. doi:10.1007/s10552-009-9407-y 111. Chen CB, Eskin M, Eurich DT, Majumdar SR, Johnson JA. Metformin, Asian ethnicity and risk of prostate cancer in type 2 diabetes: a systematic review and meta-analysis. BMC Cancer. 2018;18(1):65. doi:10.1186/s12885-017-3934-9 112. Margel D, Urbach DR, Lipscombe LL, et al. 
Metformin Use and All-Cause and Prostate Cancer–Specific Mortality Among Men With Diabetes. Journal of Clinical Oncology. 2013;31(25):3069-3075. doi:10.1200/JCO.2012.46.7043 113. Klap J, Schmid M, Loughlin KR. The Relationship between Total Testosterone Levels and Prostate Cancer: A Review of the Continuing Controversy. The Journal of Urology. 2015;193(2):403-414. doi:10.1016/j.juro.2014.07.123 114. Hsing AW, Chua S, Gao Y-T, et al. Prostate Cancer Risk and Serum Levels of Insulin and Leptin: a Population-Based Study. J Natl Cancer Inst. 2001;93(10):783-789. doi:10.1093/jnci/93.10.783 115. Hubbard JS, Rohrmann S, Landis PK, et al. Association of prostate cancer risk with insulin, glucose, and anthropometry in the Baltimore longitudinal study of aging. Urology. 2004;63(2):253-258. doi:10.1016/j.urology.2003.09.060 116. Chen C, Lewis SK, Voigt L, Fitzpatrick A, Plymate SR, Weiss NS. Prostate carcinoma incidence in relation to prediagnostic circulating levels of insulin-like growth factor I, insulin-like growth factor binding protein 3, and insulin. Cancer. 2005;103(1):76-84. doi:10.1002/cncr.20727 50 117. Renehan AG, Zwahlen M, Minder C, O’Dwyer ST, Shalet SM, Egger M. Insulin-like growth factor (IGF)-I, IGF binding protein-3, and cancer risk: systematic review and meta- regression analysis. The Lancet. 2004;363(9418):1346-1353. doi:10.1016/S0140- 6736(04)16044-3 118. Yu H, Rohan T. Role of the Insulin-Like Growth Factor Family in Cancer Development and Progression. J Natl Cancer Inst. 2000;92(18):1472-1489. doi:10.1093/jnci/92.18.1472 119. Platz EA, Pollak MN, Rimm EB, et al. Racial Variation in Insulin-Like Growth Factor-1 and Binding Protein-3 Concentrations in Middle-Aged Men. Cancer Epidemiol Biomarkers Prev. 1999;8(12):1107-1110. 120. Page WF, Braun MM, Partin AW, Caporaso N, Walsh P. Heredity and prostate cancer: A study of World War II veteran twins. The Prostate. 1997;33(4):240-245. doi:10.1002/(SICI)1097-0045(19971201)33:4<240::AID-PROS3>3.0.CO;2-L 121. 
Environmental and Heritable Factors in the Causation of Cancer — Analyses of Cohorts of Twins from Sweden, Denmark, and Finland | NEJM. New England Journal of Medicine. http://www.nejm.org/doi/10.1056/NEJM200007133430201?url_ver=Z39.88- 2003&rfr_id=ori%3Arid%3Acrossref.org&rfr_dat=cr_pub%3Dwww-ncbi-nlm-nih- gov.libproxy2.usc.edu. Accessed May 14, 2018. 122. Gr nberg H, Damber L, Damber J-E, Iselius L. Segregation Analysis of Prostate Cancer in Sweden: Support for Dominant Inheritance. American Journal of Epidemiology. 1997;146(7):552-557. doi:10.1093/oxfordjournals.aje.a009313 123. Verhage BAJ, Baffoe-Bonnie AB, Baglietto L, et al. Autosomal dominant inheritance of prostate cancer: a confirmatory study. Urology. 2001;57(1):97-101. doi:10.1016/S0090- 4295(00)00891-8 124. Monroe KR, Yu MC, Kolonel LN, et al. Evidence of an X-linked or recessive genetic component to prostate cancer risk. Nature Medicine. 1995;1(8):827-829. doi:10.1038/nm0895-827 125. Hemminki K, Li X. Familial risks of cancer as a guide to gene identification and mode of inheritance. International Journal of Cancer. 2004;110(2):291-294. doi:10.1002/ijc.20107 126. Pakkanen S, Baffoe-Bonnie AB, Matikainen MP, et al. Segregation analysis of 1,546 prostate cancer families in Finland shows recessive inheritance. Hum Genet. 2007;121(2):257-267. doi:10.1007/s00439-006-0310-2 127. Smith JR, Freije D, Carpten JD, et al. Major Susceptibility Locus for Prostate Cancer on Chromosome 1 Suggested by a Genome-Wide Search. Science. 1996;274(5291):1371-1374. doi:10.1126/science.274.5291.1371 128. Cooney KA, Huang L, Sandler HM, et al. Prostate Cancer Susceptibility Locus on Chromosome 1q: a Confirmatory Study. J Natl Cancer Inst. 1997;89(13):955-959. doi:10.1093/jnci/89.13.955 51 129. Berthon P, Valeri A, Cohen-Akenine A, et al. Predisposing Gene for Early-Onset Prostate Cancer, Localized on Chromosome 1q42.2-43. The American Journal of Human Genetics. 1998;62(6):1416-1424. doi:10.1086/301879 130. 
Gibbs M, Stanford JL, McIndoe RA, et al. Evidence for a Rare Prostate Cancer– Susceptibility Locus at Chromosome 1p36. The American Journal of Human Genetics. 1999;64(3):776-787. doi:10.1086/302287 131. Suarez BK, Lin J, Burmester JK, et al. A Genome Screen of Multiplex Sibships with Prostate Cancer. The American Journal of Human Genetics. 2000;66(3):933-944. doi:10.1086/302818 132. Tavtigian SV, Simard J, Teng DHF, et al. A candidate prostate cancer susceptibility gene at chromosome 17p. Nature Genetics; New York. 2001;27(2):172-180. doi:http://dx.doi.org.libproxy1.usc.edu/10.1038/84808 133. Berry R, Schroeder JJ, French AJ, et al. Evidence for a Prostate Cancer–Susceptibility Locus on Chromosome 20. The American Journal of Human Genetics. 2000;67(1):82-91. doi:10.1086/302994 134. Xu * J, Meyers D, Freije D, et al. Evidence for a prostate cancer susceptibility locus on the X chromosome. Nature Genetics. 1998;20(2):175-179. doi:10.1038/2477 135. Carpten J, Nupponen N, Isaacs S, et al. Germline mutations in the ribonuclease L gene in families showing linkage with HPC1. Nature Genetics. 2002;30(2):181-184. doi:10.1038/ng823 136. Giovannucci E, Stampfer MJ, Krithivas K, et al. The CAG repeat within the androgen receptor gene and its relationship to prostate cancer. PNAS. 1997;94(7):3320-3323. doi:10.1073/pnas.94.7.3320 137. Makridakis NM, Ross RK, Pike MC, et al. Association of mis-sense substitution in SRD5A2 gene with prostate cancer in African-American and Hispanic men in Los Angeles, USA. The Lancet. 1999;354(9183):975-978. doi:10.1016/S0140-6736(98)11282-5 138. Cybulski C, Wokołorczyk D, Huzarski T, et al. A large germline deletion in the Chek2 kinase gene is associated with an increased risk of prostate cancer. Journal of Medical Genetics. 2006;43(11):863-866. doi:10.1136/jmg.2006.044974 139. Breast Cancer Linkage Consortium. Cancer risks in BRCA2 mutation carriers. J Natl Cancer Inst. 1999;91(15):1310-1316. 140. Ewing CM, Ray AM, Lange EM, et al. 
Germline Mutations in HOXB13 and Prostate- Cancer Risk. New England Journal of Medicine. 2012;366(2):141-149. doi:10.1056/NEJMoa1110000 52 141. Lose F, Batra J, O’Mara T, et al. Common variation in Kallikrein genes KLK5, KLK6, KLK12, and KLK13 and risk of prostate cancer and tumor aggressiveness. Urologic Oncology: Seminars and Original Investigations. 2013;31(5):635-643. doi:10.1016/j.urolonc.2011.05.011 142. Yeager M, Orr N, Hayes RB, et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nature Genetics. 2007;39(5):645-649. doi:10.1038/ng2022 143. Gudmundsson J, Sulem P, Manolescu A, et al. Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24. Nature Genetics. 2007;39(5):631- 637. doi:10.1038/ng1999 144. Eeles RA, Kote-Jarai Z, Giles GG, et al. Multiple newly identified loci associated with prostate cancer susceptibility. Nature Genetics. 2008;40(3):316-321. doi:10.1038/ng.90 145. Schumacher FR, Berndt SI, Siddiq A, et al. Genome-wide association study identifies new prostate cancer susceptibility loci. Hum Mol Genet. 2011;20(19):3867-3875. doi:10.1093/hmg/ddr295 146. Eeles RA, Al Olama AA, Benlloch S, et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat Genet. 2013;45(4):385- 391, 391e1-2. doi:10.1038/ng.2560 147. Kote-Jarai Z, Easton DF, Stanford JL, et al. Multiple Novel Prostate Cancer Predisposition Loci Confirmed by an International Study: The PRACTICAL Consortium. Cancer Epidemiol Biomarkers Prev. 2008;17(8):2052-2061. doi:10.1158/1055-9965.EPI-08- 0317 148. Eeles RA, Kote-Jarai Z, Olama AAA, et al. Identification of seven new prostate cancer susceptibility loci through a genome-wide association study. Nat Genet. 2009;41(10):1116- 1121. doi:10.1038/ng.450 149. Kote-Jarai Z, Olama AAA, Giles GG, et al. Seven novel prostate cancer susceptibility loci identified by a multi-stage genome-wide association study. 
Nat Genet. 2011;43(8):785- 791. doi:10.1038/ng.882 150. PRACTICAL. http://practical.icr.ac.uk/. Accessed October 26, 2018. 151. Takata R, Akamatsu S, Kubo M, et al. Genome-wide association study identifies five new susceptibility loci for prostate cancer in the Japanese population. Nature Genetics. 2010;42(9):751-754. doi:10.1038/ng.635 152. Xu J, Mo Z, Ye D, et al. Genome-wide association study in Chinese men identifies two new prostate cancer risk loci at 9q31.2 and 19q13.4. Nat Genet. 2012;44(11):1231-1235. doi:10.1038/ng.2424 53 153. Akamatsu S, Takata R, Haiman CA, et al. Common variants at 11q12, 10q26 and 3p11.2 are associated with prostate cancer susceptibility in Japanese. Nature Genetics. 2012;44(4):426-429. doi:10.1038/ng.1104 154. Wang M, Takahashi A, Liu F, et al. Large-scale association analysis in Asians identifies new susceptibility loci for prostate cancer. Nat Commun. 2015;6. doi:10.1038/ncomms9469 155. Haiman CA, Chen GK, Blot WJ, et al. Genome-wide association study of prostate cancer in men of African ancestry identifies a susceptibility locus at 17q21. Nature Genetics. 2011;43(6):570-573. doi:10.1038/ng.839 156. Cook MB, Wang Z, Yeboah ED, et al. A genome-wide association study of prostate cancer in West African men. Human Genetics. 2014;133(5):509-521. doi:10.1007/s00439- 013-1387-z 157. Cheng I, Chen GK, Nakagawa H, et al. Evaluating Genetic Risk for Prostate Cancer among Japanese and Latinos. Cancer Epidemiol Biomarkers Prev. 2012;21(11):2048-2058. doi:10.1158/1055-9965.EPI-12-0598 158. Al Olama AA, Kote-Jarai Z, Berndt SI, et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nature Genetics. 2014;46(10):1103- 1109. doi:10.1038/ng.3094 159. Xu J, Zheng SL, Isaacs SD, et al. Inherited genetic variant predisposes to aggressive but not indolent prostate cancer. Proc Natl Acad Sci U S A. 2010;107(5):2136-2140. doi:10.1073/pnas.0914061107 160. Berndt SI, Wang Z, Yeager M, et al. 
Two Susceptibility Loci Identified for Prostate Cancer Aggressiveness. Nat Commun. 2015;6:6889. doi:10.1038/ncomms7889 161. Amin Al Olama A, Kote-Jarai Z, Schumacher FR, et al. A meta-analysis of genome-wide association studies to identify prostate cancer susceptibility loci associated with aggressive and non-aggressive disease. Hum Mol Genet. 2013;22(2):408-415. doi:10.1093/hmg/dds425 162. Szulkin R, Karlsson R, Whitington T, et al. Genome-wide association study of prostate cancer-specific survival. Cancer Epidemiol Biomarkers Prev. 2015;24(11):1796-1800. doi:10.1158/1055-9965.EPI-15-0543 163. Lou H, Yeager M, Li H, et al. Fine mapping and functional analysis of a common variant in MSMB on chromosome 10q11.2 associated with prostate cancer susceptibility. Proc Natl Acad Sci U S A. 2009;106(19):7933-7938. doi:10.1073/pnas.0902104106 164. Parikh H, Wang Z, Pettigrew KA, et al. Fine mapping the KLK3 locus on chromosome 19q13.33 associated with prostate cancer susceptibility and PSA levels. Hum Genet. 2011;129(6):675-685. doi:10.1007/s00439-011-0953-5 54 165. Chung CC, Ciampa J, Yeager M, et al. Fine mapping of a region of chromosome 11q13 reveals multiple independent loci associated with risk of prostate cancer. Hum Mol Genet. 2011;20(14):2869-2878. doi:10.1093/hmg/ddr189 166. Kote-Jarai Z, Saunders EJ, Leongamornlert DA, et al. Fine-mapping identifies multiple prostate cancer risk loci at 5p15, one of which associates with TERT expression. Hum Mol Genet. 2013;22(12):2520-2528. doi:10.1093/hmg/ddt086 167. Al Olama AA, Dadaev T, Hazelett DJ, et al. Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans. Human Molecular Genetics. 2015;24(19):5589-5602. doi:10.1093/hmg/ddv203 168. Han Y, Rand KA, Hazelett DJ, et al. Prostate Cancer Susceptibility in Men of African Ancestry at 8q24. Journal of the National Cancer Institute. 2016;108(7):djv431. doi:10.1093/jnci/djv431 169. Han Y, Hazelett DJ, Wiklund F, et al. 
Integration of multiethnic fine-mapping and genomic annotation to prioritize candidate functional SNPs at prostate cancer susceptibility regions. Human Molecular Genetics. 2015;24(19):5603-5618. doi:10.1093/hmg/ddv269 170. Gao P, Xia J-H, Sipeky C, et al. Biology and Clinical Implications of the 19q13 Aggressive Prostate Cancer Susceptibility Locus. Cell. 2018;174(3):576-589.e18. doi:10.1016/j.cell.2018.06.003 171. Zheng SL, Sun J, Wiklund F, et al. Cumulative association of five genetic variants with prostate cancer. N Engl J Med. 2008;358(9):910-919. doi:10.1056/NEJMoa075819 172. Al Olama AA, Benlloch S, Antoniou AC, et al. Risk Analysis of Prostate Cancer in PRACTICAL, a Multinational Consortium, Using 25 Known Prostate Cancer Susceptibility Loci. Cancer Epidemiol Biomarkers Prev. 2015;24(7):1121-1129. doi:10.1158/1055- 9965.EPI-14-0317 173. Nordström T, Aly M, Eklund M, Egevad L, Grönberg H. A Genetic Score Can Identify Men at High Risk for Prostate Cancer Among Men With Prostate-Specific Antigen of 1–3 ng/ml. European Urology. 2014;65(6):1184-1190. doi:10.1016/j.eururo.2013.07.005 174. Nordstrom T, Aly M, Eklund M, Egevad L, Gronberg H. A genetic score can identify men at high risk for prostate cancer among men with prostate-specific antigen of 1-3 ng/ml. Eur Urol. 2014;65(6):1184-1190. doi:10.1016/j.eururo.2013.07.005 175. Nam RK, Zhang WW, Trachtenberg J, et al. Utility of incorporating genetic variants for the early detection of prostate cancer. Clin Cancer Res. 2009;15(5):1787-1793. doi:10.1158/1078-0432.CCR-08-1593 176. Aly M, Wiklund F, Xu J, et al. Polygenic risk score improves prostate cancer risk prediction: results from the Stockholm-1 cohort study. Eur Urol. 2011;60(1):21-28. doi:10.1016/j.eururo.2011.01.017 55 177. Pashayan N, Duffy SW, Neal DE, et al. Implications of polygenic risk-stratified screening for prostate cancer on overdiagnosis. Genet Med. 2015;17(10):789-795. doi:10.1038/gim.2014.192 178. Amundadottir LT, Sulem P, Gudmundsson J, et al. 
A common variant associated with prostate cancer in European and African populations. June 2006. doi:10.1038/ng1808 179. Haiman CA, Patterson N, Freedman ML, et al. Multiple regions within 8q24 independently affect risk for prostate cancer. Nature Genetics. 2007;39(5):638-644. doi:10.1038/ng2015 180. Al Olama AA, Kote-Jarai Z, Giles GG, et al. Multiple loci on 8q24 associated with prostate cancer susceptibility. Nat Genet. 2009;41(10):1058-1060. doi:10.1038/ng.452 181. Wasserman NF, Aneas I, Nobrega MA. An 8q24 gene desert variant associated with prostate cancer risk confers differential in vivo activity to a MYC enhancer. Genome Res. 2010;20(9):1191-1197. doi:10.1101/gr.105361.110 182. Jia L, Landan G, Pomerantz M, et al. Functional Enhancers at the Gene-Poor 8q24 Cancer-Linked Locus. PLOS Genetics. 2009;5(8):e1000597. doi:10.1371/journal.pgen.1000597 183. Ahmadiyeh N, Pomerantz MM, Grisanzio C, et al. 8q24 prostate, breast, and colon cancer risk loci show tissue-specific long-range interaction with MYC. Proceedings of the National Academy of Sciences. 2010;107(21):9742-9746. doi:10.1073/pnas.0910668107 184. Du M, Yuan T, Schilter KF, et al. Prostate cancer risk locus at 8q24 as a regulatory hub by physical interactions with multiple genomic loci across the genome. Hum Mol Genet. 2015;24(1):154-166. doi:10.1093/hmg/ddu426 185. Chung S, Nakagawa H, Uemura M, et al. Association of a novel long non-coding RNA in 8q24 with prostate cancer susceptibility. Cancer Sci. 2011;102(1):245-252. doi:10.1111/j.1349-7006.2010.01737.x 186. Pomerantz MM, Beckwith CA, Regan MM, et al. Evaluation of the 8q24 Prostate Cancer Risk Locus and MYC Expression. Cancer Res. 2009;69(13):5568-5574. doi:10.1158/0008- 5472.CAN-09-0387 187. Meyer KB, Maia A-T, O’Reilly M, et al. A Functional Variant at a Prostate Cancer Predisposition Locus at 8q24 Is Associated with PVT1 Expression. PLOS Genetics. 2011;7(7):e1002165. doi:10.1371/journal.pgen.1002165 188. Kumar SK, Rajkumar V, Kyle RA, et al. 
Multiple myeloma. Nature Reviews Disease Primers. 2017;3:17046. doi:10.1038/nrdp.2017.46 56 189. Dutta AK, Hewett DR, Fink JL, Grady JP, Zannettino ACW. Cutting edge genomics reveal new insights into tumour development, disease progression and therapeutic impacts in multiple myeloma. British Journal of Haematology. 2017;178(2):196-208. doi:10.1111/bjh.14649 190. Jemal A, Ward EM, Johnson CJ, et al. Annual Report to the Nation on the Status of Cancer, 1975–2014, Featuring Survival. J Natl Cancer Inst. 2017;109(9). doi:10.1093/jnci/djx030 191. Sun T, Wang S, Sun H, Wen J, An G, Li J. Improved survival in multiple myeloma, with a diminishing racial gap and a widening socioeconomic status gap over three decades. Leukemia & Lymphoma. 2017;0(0):1-10. doi:10.1080/10428194.2017.1335398 192. Waxman AJ, Mink PJ, Devesa SS, et al. Racial disparities in incidence and outcome in multiple myeloma: a population-based study. Blood. 2010;116(25):5501-5506. doi:10.1182/blood-2010-07-298760 193. Ravindran A, Bartley AC, Holton SJ, et al. Prevalence, incidence and survival of smoldering multiple myeloma in the United States. Blood Cancer Journal. 2016;6(10):e486. doi:10.1038/bcj.2016.100 194. Landgren O, Graubard BI, Katzmann JA, et al. Racial disparities in the prevalence of monoclonal gammopathies: a population-based study of 12,482 persons from the National Health and Nutritional Examination Survey. Leukemia. 2014;28(7):1537-1542. doi:10.1038/leu.2014.34 195. Waxman AJ, Mink PJ, Devesa SS, et al. Racial disparities in incidence and outcome in multiple myeloma: a population-based study. Blood. 2010;116(25):5501-5506. doi:10.1182/blood-2010-07-298760 196. Pulte D, Redaniel MT, Brenner H, Jansen L, Jeffreys M. Recent improvement in survival of patients with multiple myeloma: variation by ethnicity. Leukemia & Lymphoma. 2014;55(5):1083-1089. doi:10.3109/10428194.2013.827188 197. Fiala MA, Wildes TM. Racial disparities in treatment use for multiple myeloma. Cancer. 
2017;123(9):1590-1596. doi:10.1002/cncr.30526 198. Wallin A, Larsson SC. Body mass index and risk of multiple myeloma: A meta-analysis of prospective studies. European Journal of Cancer. 2011;47(11):1606-1615. doi:10.1016/j.ejca.2011.01.020 199. Alexander DD, Mink PJ, Adami H-O, et al. Multiple myeloma: A review of the epidemiologic literature. Int J Cancer. 2007;120(S12):40-61. doi:10.1002/ijc.22718 200. Hofmann JN, Moore SC, Lim U, et al. Body Mass Index and Physical Activity at Different Ages and Risk of Multiple Myeloma in the NIH-AARP Diet and Health Study. Am J Epidemiol. 2013;177(8):776-786. doi:10.1093/aje/kws295 57 201. Sonderman JS, Bethea TN, Kitahara CM, et al. Multiple Myeloma Mortality in Relation to Obesity Among African Americans. J Natl Cancer Inst. 2016;108(10). doi:10.1093/jnci/djw120 202. Kristinsson SY, Björkholm M, Goldin LR, et al. Patterns of hematologic malignancies and solid tumors among 37,838 first-degree relatives of 13,896 patients with multiple myeloma in Sweden. International Journal of Cancer. 2009;125(9):2147-2150. doi:10.1002/ijc.24514 203. Landgren O, Kristinsson SY, Goldin LR, et al. Risk of plasma cell and lymphoproliferative disorders among 14621 first-degree relatives of 4458 patients with monoclonal gammopathy of undetermined significance in Sweden. Blood. 2009;114(4):791- 795. doi:10.1182/blood-2008-12-191676 204. Brown LM, Linet MS, Greenberg RS, et al. Multiple myeloma and family history of cancer among blacks and whites in the U.S. Cancer. 1999;85(11):2385-2390. doi:10.1002/(SICI)1097-0142(19990601)85:11<2385::AID-CNCR13>3.0.CO;2-A 205. Bourguet CC, Grufferman S, Delzell E, Delong ER, Cohen HJ. Multiple myeloma and family history of cancer a case—control study. Cancer. 1985;56(8):2133-2139. doi:10.1002/1097-0142(19851015)56:8<2133::AID-CNCR2820560842>3.0.CO;2-F 206. Andreotti G, Birmann B, De Roos AJ, et al. 
A Pooled Analysis of Alcohol Consumption and Risk of Multiple Myeloma in the International Multiple Myeloma Consortium. Cancer Epidemiol Biomarkers Prev. 2013;22(9):1620-1627. doi:10.1158/1055-9965.EPI-13-0334 207. Rota M, Porta L, Pelucchi C, et al. Alcohol drinking and multiple myeloma risk – a systematic review and meta-analysis of the dose–risk relationship: European Journal of Cancer Prevention. 2014;23(2):113-121. doi:10.1097/CEJ.0000000000000001 208. Brown LM, Pottern LM, Silverman DT, et al. Multiple Myeloma among Blacks and Whites in the United States: Role of Cigarettes and Alcoholic Beverages. Cancer Causes & Control. 1997;8(4):610-614. 209. Maldonado JE, Kyle RA. Familial myeloma: Report of eight families and a study of serum proteins in their relatives. The American Journal of Medicine. 1974;57(6):875-884. doi:10.1016/0002-9343(74)90164-8 210. Lynch HT, Sanger WG, Pirruccello S, Quinn-Laquer B, Weisenburger DD. Familial Multiple Myeloma: a Family Study and Review of the Literature. J Natl Cancer Inst. 2001;93(19):1479-1483. doi:10.1093/jnci/93.19.1479 211. Waller RG, Darlington TM, Wei X, et al. Novel pedigree analysis implicates DNA repair and chromatin remodeling in multiple myeloma risk. PLoS Genet. 2018;14(2). doi:10.1371/journal.pgen.1007111 58 212. Ziakas PD, Karsaliakos P, Prodromou ML, Mylonakis E. Interleukin-6 polymorphisms and hematologic malignancy: a re-appraisal of evidence from genetic association studies. Biomarkers. 2013;18(7):625-631. doi:10.3109/1354750X.2013.840799 213. Zheng C, Huang DR, Bergenbrant S, et al. Interleukin 6, tumour necrosis factor alpha, interleukin 1beta and interleukin 1 receptor antagonist promoter or coding gene polymorphisms in multiple myeloma. Br J Haematol. 2000;109(1):39-45. 214. Lee K-M, Baris D, Zhang Y, et al. Common single nucleotide polymorphisms in immunoregulatory genes and multiple myeloma risk among women in Connecticut. Am J Hematol. 2010;85(8):560-563. doi:10.1002/ajh.21760 215. 
Hosgood HD, Baris D, Zhang Y, et al. Genetic variation in cell cycle and apoptosis related genes and multiple myeloma risk. Leukemia Research. 2009;33(12):1609-1614. doi:10.1016/j.leukres.2009.03.013 216. Lincz LF, Kerridge I, Scorgie FE, Bailey M, Enno A, Spencer A. Xenobiotic gene polymorphisms and susceptibility to multiple myeloma. Haematologica. 2004;89(5):628- 629. 217. Zintzaras E, Giannouli S, Rodopoulou P, Voulgarelis M. The role of MTHFR gene in multiple myeloma. J Hum Genet. 2008;53(6):499-507. doi:10.1007/s10038-008-0277-z 218. Hayden PJ, Tewari P, Morris DW, et al. Variation in DNA repair genes XRCC3, XRCC4, XRCC5 and susceptibility to myeloma. Hum Mol Genet. 2007;16(24):3117-3127. doi:10.1093/hmg/ddm273 219. Broderick P, Chubb D, Johnson DC, et al. Common variation at 3p22.1 and 7p15.3 influences multiple myeloma risk. Nat Genet. 2011;44(1):58-61. doi:10.1038/ng.993 220. Chubb D, Weinhold N, Broderick P, et al. Common variation at 3q26.2, 6p21.33, 17p11.2 and 22q13.1 influences multiple myeloma risk. Nat Genet. 2013;45(10):1221-1225. doi:10.1038/ng.2733 221. Swaminathan B, Thorleifsson G, Jöud M, et al. Variants in ELL2 influencing immunoglobulin levels associate with multiple myeloma. Nat Commun. 2015;6. doi:10.1038/ncomms8213 222. Mitchell JS, Li N, Weinhold N, et al. Genome-wide association study identifies multiple susceptibility loci for multiple myeloma. Nat Commun. 2016;7. doi:10.1038/ncomms12050 223. Went M, Sud A, Försti A, et al. Identification of multiple risk loci and regulatory mechanisms influencing susceptibility to multiple myeloma. Nature Communications. 2018;9(1):3707. doi:10.1038/s41467-018-04989-w 224. Rand KA, Song C, Dean E, et al. A Meta-analysis of Multiple Myeloma Risk Regions in African and European Ancestry Populations Identifies Putatively Functional Loci. Cancer Epidemiol Biomarkers Prev. 2016;25(12):1609-1618. doi:10.1158/1055-9965.EPI-15-1193 59 225. Cancer Statistics Review, 1975-2015 - SEER Statistics. 
Chapter Two: Genetic Risk of Prostate Cancer in Ugandan Men

Published work

Zhaohui Du¹, Alexander Lubmawa², Susan Gundell¹, Peggy Wan¹, Cissy Nalukenge³, Proscovia Muwanga³, Moses Lutalo³, Deborah Nansereko³, Olivia Ndaruhutse³, Molly Katuku³, Rosemary Nassanga³, African Ancestry Prostate Cancer Consortium (AAPC), Frank Asiimwe³, Benon Masaba³, Sam Kaggwa⁴, Dan Namuguzi⁴, Vicky Kiddu², George Mutema⁵, David V. Conti¹, Asiimwe Luke⁶, Kuteesa Job⁷, Dabanja M. Henry⁸, Christopher A. Haiman¹,⁹, Stephen Watya²,⁴

Affiliations
1. Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
2. Uro Care, Kampala, Uganda
3. Mulago Hospital, Kampala, Uganda
4. Makerere University College of Health Sciences, Kampala, Uganda
5. SurgPath, Kampala, Uganda
6. Nyakibale Hospital, Rukungiri, Uganda
7. Kagando Hospital, Kasese, Uganda
8. Mengo Hospital, Kampala, Uganda
9. Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA

Corresponding Authors

Christopher A. Haiman
Harlyne Norris Research Tower, 1450 Biggy Street, Room 1504, Los Angeles, CA 90033
Telephone: (323) 442-7755
Fax: (323) 442-7749
Email: haiman@usc.edu

Stephen Watya
Makerere University, Mulago Hospital, Department of Surgery, Urology Unit, Kampala, Uganda
Email: watya_2000@yahoo.com

2.1 Abstract

Background: Men of African ancestry have higher prostate cancer (PCa) incidence and mortality than men of other racial groups. There is support for a genetic contribution to this disparity, with evidence of genetic heterogeneity in the underlying risk alleles between populations. Studies of PCa among African men may clarify the contribution of genetic risk factors to the elevated disease burden in this population.

Methods: We conducted an association study of >100 previously reported PCa risk alleles among 571 incident cases and 485 controls in Ugandan men. Unconditional logistic regression was used to test genetic associations, and a polygenic risk score (PRS) was derived to assess the cumulative effect of the known risk alleles on PCa risk. In an exploratory analysis, we also tested 17,125,421 genotyped and imputed markers genome-wide for association with PCa risk.

Results: Of the 111 known risk loci with a frequency >1%, 75 (68%) had effects that were directionally consistent with the initial discovery population, 14 (13%) of which were nominally significantly associated with PCa risk at P<0.05. Compared to men at average risk (25th-75th percentile of the PRS distribution), Ugandan men in the top 10% of the PRS constructed from alleles outside 8q24 had a 2.9-fold (95% CI: 1.75, 4.97) risk of developing PCa; the estimate for the top 10% increased to 4.86 (95% CI: 2.70, 8.76) with the inclusion of risk alleles at 8q24.
In genome-wide association testing, the strongest associations were noted with known risk alleles located in the 8q24 region, including rs72725854 (OR=3.37, P=2.14×10⁻¹¹), which is limited to populations of African ancestry (6% frequency).

Conclusions: The ~100 known PCa risk variants were shown to effectively stratify PCa risk in Ugandan men, with 10% of men having a >4-fold increase in risk. The 8q24 risk region was also found to be a major contributor to PCa risk in Ugandan men, with the African ancestry-specific risk variant rs72725854 estimated to account for 12% of PCa in this population.

Keywords: Prostate cancer, GWAS, African men, 8q24

2.2 Introduction

Prostate cancer (PCa) is the second most common cancer globally among men and is one of the leading causes of cancer mortality [1,2]. Long-standing racial/ethnic differences have been noted, with men of African ancestry having elevated incidence and mortality as well as more aggressive tumors compared to other racial/ethnic groups [3]. In Africa, PCa is the most common cancer among men [4], with incidence rates in Uganda being near the highest of all African countries [5].

Prostate cancer is one of the most heritable cancers [6], with genome-wide association studies (GWAS) having identified more than 100 risk variants, which in total explain ~30% of the familial risk [7–13]. Genetic studies in men of African ancestry have revealed risk alleles at a number of regions, including on chromosomes 8q24, 13q34 and 22q12, that are specific to men of African descent and which may contribute, in part, to the greater PCa incidence in this population [10,13,14]. To date, our understanding of genetic risk for PCa in men of African ancestry is based on studies conducted primarily among African American men [10,13,15,16]. 
Genetic studies in African men are needed to quantify the contribution of germline variation to the higher risk observed in this population, and may assist in detecting novel African-specific loci because of minimal to no non-African admixture. In the present study, we characterized risk associations at known PCa loci and constructed a polygenic risk model comprising all known risk loci to assess their cumulative effect in this high-risk population. In an exploratory analysis, we also conducted a GWAS of PCa in Ugandan men to search for novel PCa risk alleles.

2.3 Materials and Methods

Study Participants

The Uganda Prostate Cancer Study (UGPCS) is a case-control study of incident PCa among Ugandan men. Between January 1, 2010 and December 31, 2014, 571 incident PCa cases over the age of 40 were enrolled from the urology units at 7 hospitals/clinics in Kampala (Mulago Hospital, Uro Care, Mengo Hospital, Nakasero Hospital, Nsambya Hospital, Kibuli Hospital, Surgeons Plaza) and 6 hospitals/clinics outside of Kampala (Kagando Hospital, Nyakibale Hospital, Surgical Clinic Mbarara, Bwindi Community Hospital, Mbarara Hospital, Mbale Hospital). All cases were histologically confirmed, and Gleason score was determined for 317 (55.5%) cases. Controls (n=485) were recruited from non-urologic clinics (i.e., surgery) at the same hospitals and were men over 40 with no history of PCa or current urologic conditions. To exclude potential undiagnosed disease, all controls were required to have a prostate-specific antigen (PSA) level <4 ng/mL. Descriptive PCa risk factor information was collected using a standardized questionnaire, and a saliva spit kit (Genotek) was used to collect germline DNA. Written consent (or a thumbprint) was obtained from each participant. 
The study protocol was reviewed and approved by the Uganda National Council of Science and Technology, the Makerere University Research and Ethics Committee, and the Institutional Review Board of the University of Southern California.

Genotype calling and quality control

We genotyped 571 cases and 485 controls in UGPCS with the Illumina OncoArray [17] as part of the ELLIPSE GAME-ON Consortium [14]. For quality control, we combined these samples with those from the ELLIPSE GAME-ON Consortium. Quality control of genotypes included removing SNPs with call rate <0.95, with replicate concordance <99.8% based on QC replicate samples, or with poor clustering on visual inspection. Additional removal criteria included monomorphic SNPs, SNPs whose estimated MAF deviated from, or whose alleles mismatched, those of the AFR individuals in phase III 1000 Genomes Project (1KGP) data, and INDELs not identified within 1KGP; 448,939 SNPs were available for imputation (described below). Samples were removed for invalid sample status (n=3), unknown duplication (n=4) and call rate <0.95 (n=6). A further 3 individuals were removed because they were members of first-degree relative pairs. After QC, the remaining sample size available for analysis was 560 cases and 480 controls.

Imputation

The genotypes were phased with SHAPEIT [18] and imputed with Minimac3 version 1.0.12, using the phase III 1KGP cosmopolitan reference panel. SNPs with MAF<0.01 in the UGPCS population or with imputation quality score <0.3 in the combined dataset were excluded, leaving a total of 17,125,421 variants for statistical analysis.

Statistical analyses

Principal component analysis (PCA) was performed using EIGENSTRAT [19] together with the 1KGP populations. For each SNP, per-allele odds ratios (from genotyped counts or imputed dosage) and P values were estimated using unconditional logistic regression, adjusting for age and the first 10 principal components. 
A quantile-quantile plot was produced to assess the influence of hidden population stratification.

We tested 118 known PCa risk variants for association in UGPCS (20 at 8q24 and 98 in non-8q24 regions). The known risk alleles were selected based on previous GWAS [7,11], together with African-specific risk alleles (rs75823044 at 13q34 and rs78554043 at 22q12.1) discovered in a previous GWAS meta-analysis in the African Ancestry Prostate Cancer Consortium (AAPC) [14], which included subjects in UGPCS, and an expanded variant list at 8q24, as different studies report different variants that likely represent the same signals. Directional consistency of effect was defined as ORs in Ugandan men that were in the same direction (i.e., >1) as those reported previously in the population in which the variant was discovered. A nominal P-value of 0.05 was used to determine statistical significance. Risk allele frequency comparisons were conducted between UGPCS controls and African Americans in AAPC [10], as well as European ancestry samples in phase III 1KGP.

Given the importance of variation at 8q24 in PCa and the observation of multiple independent risk alleles [10,20–22], we performed a forward-selection stepwise logistic regression procedure for the region 127.8-128.8 Mb to assess the number of independent signals in UGPCS men. The correlation (r²) between the independent SNPs identified by the stepwise regression procedure (P<0.001) and previously reported PCa risk alleles in the 8q24 region was calculated in the phase III 1KGP African populations. An 8q24 regional association plot was generated using LocusZoom [23].

We estimated the aggregate effect of known risk alleles using a weighted polygenic risk score for each individual, $\mathrm{PRS}_i = \sum_{m \in C} w_m g_{im}$, where $g_{im}$ is the risk allele dosage for individual i at SNP m, $w_m$ is the weight for SNP m, and C defines a set of 92 non-8q24 risk SNPs with MAF >0.01 in UGPCS (6 non-8q24 known risk alleles were excluded because of MAF≤0.01), together with 5 independent 8q24 risk alleles identified in the forward-selection stepwise procedure in UGPCS. The weights for non-8q24 risk alleles were marginal logORs from logistic regression adjusting for 10 PCs and age in UGPCS. For variants in the 8q24 region, we first obtained conditional logORs from the regression model in the forward-selection stepwise procedure, and then, to correct for potential bias in effect estimation of newly discovered variants, we implemented a fully Bayesian version of a weighted correction [24]. The risk score was then categorized by percentile (<10%, 10-25%, 25-75%, 75-90%, ≥90%), and the risk for each category was estimated relative to the interquartile range (25-75%) using logistic regression with covariates including 10 PCs and age. We assessed the influence of 8q24 on the PRS by comparing risk score values computed before and after the inclusion of the five 8q24 risk alleles. Also, to evaluate the relative improvement in classification performance of the PRS in distinguishing cases from controls after including 8q24 variants, we calculated the area under the receiver-operating-characteristic (ROC) curve (AUC).

We conducted case-case analyses to examine the association of PCa aggressiveness with the independent risk variants in 8q24 identified in UGPCS, the known risk alleles and the PRS, using logistic models adjusted for PCs and age. We defined cases with Gleason score ≥8 as aggressive and <8 as non-aggressive. For variants that were nominally statistically significant in case-case testing (Phet<0.05), stratified analyses were run in which separate logistic models were fitted using the cases in each stratum and all controls (ORagg and ORnon-agg). 
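The scoring steps described above (a weighted sum of risk-allele dosages, percentile categories, and an AUC) can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the nearest-rank percentile rule, and the choice of reference distribution for the cut points are assumptions, and the covariate-adjusted logistic models used in the study are not reproduced here.

```python
from bisect import bisect_right

# Category labels follow the percentile bins named in the text.
LABELS = ("<10%", "10-25%", "25-75%", "75-90%", ">=90%")


def polygenic_risk_score(dosages, weights):
    """Weighted PRS for one individual: sum over SNPs of weight * dosage.

    `weights` are per-SNP log odds ratios; `dosages` are risk-allele
    dosages (0 to 2, possibly fractional for imputed SNPs).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * g for w, g in zip(weights, dosages))


def percentile_cutpoints(reference_scores, percentiles=(10, 25, 75, 90)):
    """Nearest-rank percentile cut points (an assumed convention)."""
    ordered = sorted(reference_scores)
    n = len(ordered)
    cuts = []
    for p in percentiles:
        rank = max(1, -(-p * n // 100))  # integer ceiling of p*n/100
        cuts.append(ordered[rank - 1])
    return cuts


def categorize(score, cuts):
    """Map a score to its percentile bin; a score equal to a cut point
    falls in the upper bin, matching the '>=90%' convention."""
    return LABELS[bisect_right(cuts, score)]


def auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random
    control, counting ties as one half (Mann-Whitney formulation)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

In use, each individual's score would be computed once with `polygenic_risk_score`, cut points derived from a chosen reference distribution, and the resulting category entered as a covariate in the risk model, with the 25-75% bin as the reference level.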
In the exploratory GWAS analysis, the genetic inflation factor (λ=1.026) indicated no evidence of over-dispersion. A P-value of <5×10⁻⁸ was used as the threshold for genome-wide significance.

2.4 Results

The mean ages of the cases and controls included in the analysis were 70.9 (±9.5) and 65.1 (±8.9) years, respectively (Supplementary table 1). Of the 560 cases, Gleason score was available for 309, of which 136 (44%) had a Gleason score ≥8. Supplementary figure 1 illustrates genetic comparisons between Ugandan men in UGPCS and African ancestry samples from phase III 1KGP, and highlights the close genetic relationship between Ugandan men and the Luhya in Webuye, Kenya (LWK) compared to men of African ancestry from Western Africa, the United States and the Caribbean.

Of the 118 known risk loci, two were monomorphic (rs12621278, rs76934034) and another 5 had a MAF ≤0.01 (one at 8q24). Of the 111 variants with a MAF >0.01, directionally consistent associations were noted for 75 (67.6%) variants, of which 14 were significantly associated with PCa risk at P<0.05 (Table 1, Supplementary table 2). Of the 3 African-specific risk alleles for PCa reported outside of 8q24, variant rs7210100 at 17q21 [13] was not significantly associated with PCa risk (OR=1.04, P=0.86; RAF=0.04), while the effect sizes for rs75823044 at 13q34 and rs78554043 at 22q12.1 were similar to those reported previously in men of African ancestry [14], with rs75823044 nominally statistically significant in UGPCS (rs75823044: OR=2.02, P=0.04; RAF=0.01; rs78554043: OR=1.53, P=0.44; RAF=0.01).

In comparing the frequency of the known risk alleles between populations, the risk allele frequency in UGPCS controls was, on average, only 0.001 smaller than that observed among African Americans in AAPC (P=0.82, t test). 
A larger, non-significant difference in risk allele frequency distribution was noted between UGPCS controls and European ancestry populations (1000 Genomes), with frequencies 0.04 larger in UGPCS controls on average (P=0.11, t test), and with 29 (25.4%) variants having opposite minor alleles: 12 alleles (10.5%) had an RAF >0.5 in European ancestry populations and an RAF <0.5 in UGPCS, whereas 17 (14.9%) had an RAF <0.5 in European ancestry populations and an RAF >0.5 in UGPCS.

Five of the known variants were nominally associated with PCa aggressiveness in case-case analysis (Phet<0.05), including rs6763931 [Phet=2.1×10⁻⁴; ORnon-agg=3.39 (95%CI: 1.73, 6.67), ORagg=0.78 (95%CI: 0.49, 1.22)], rs1218582 [Phet=2.9×10⁻⁴; ORnon-agg=0.67 (95%CI: 0.51, 0.87), ORagg=1.31 (95%CI: 0.98, 1.76)], rs8014671 [Phet=0.01; ORnon-agg=1.21 (95%CI: 0.92, 1.59), ORagg=0.72 (95%CI: 0.53, 0.99)], rs1465618 [Phet=0.02; ORnon-agg=1.67 (95%CI: 0.99, 2.80), ORagg=0.80 (95%CI: 0.39, 1.64)], and rs2292884 [Phet=0.04; ORnon-agg=1.08 (95%CI: 0.82, 1.42), ORagg=0.72 (95%CI: 0.54, 0.97)].

At 8q24, five independent signals were defined (P<0.001; Table 2) in the stepwise selection procedure. In addition to rs72725854, the variants included rs28556804, which is highly correlated with the previously reported risk allele rs7463326 (r²=0.95, AFR phase III 1KGP; Supplementary table 3); rs1456315, previously reported in Japanese [25] and Chinese [26] studies and moderately correlated with another previously reported risk allele, rs72725879 (r²=0.41, AFR phase III 1KGP); and variants rs6470538 and rs73707269, located in close proximity (35-155 kb) to MYC with little correlation to known risk alleles (r²<0.01; Supplementary table 3). We did not detect statistically significant differences in effect by aggressiveness for the 8q24 variants.

In estimating a PRS for the 92 non-8q24 risk alleles, the mean weighted score was 6.70 for PCa cases and 6.25 for controls (P<2.2×10⁻¹⁶, t test). 
The average PRS increased to 10.09 in PCa cases and 9.24 in controls (P<2.2×10⁻¹⁶, t test) with the inclusion of the 5 risk alleles at 8q24. Similarly, for non-8q24 risk alleles, men in the top 10% of the PRS had a 2.9-fold (95% CI: 1.75, 4.97) elevated risk compared to those with average risk (PRS in the 25th-75th percentiles), which increased to 4.86-fold (95% CI: 2.70, 8.76) with the inclusion of risk alleles at 8q24. Moreover, the AUC of the PRS also improved with the addition of the five 8q24 variants, increasing from 0.67 (95%CI: 0.65, 0.71) to 0.73 (95%CI: 0.70, 0.76). The PRS was equally associated with aggressive and non-aggressive PCa in case-case analysis (aggressive vs. non-aggressive OR=0.92, P=0.49).

In the genome-wide analysis, only 5 SNPs, all located in the risk region at 8q24.21, were genome-wide significant (Supplementary table 4, Supplementary figure 3). Variant rs72725854 was the most statistically significant in association with PCa risk (OR=3.37, P=2.14×10⁻¹¹), with a risk allele frequency (RAF) of 0.14 in cases and 0.06 in controls. This variant is found only in men of African ancestry and is the most strongly associated risk variant for PCa found to date in men of African ancestry [14]. The other four genome-wide significant variants were all located in the vicinity (within ~10 kb) and were correlated with rs72725854 (r²=0.52-0.61) (Supplementary table 4).

2.5 Discussion

In this study, we characterized known genetic risk factors for PCa in Ugandan men. Of the 118 established risk alleles for PCa, 111 are common in Ugandan men (MAF>1%), and ~70% had ORs in UGPCS that were directionally consistent with previous studies, which suggests that most of the known PCa susceptibility variants are likely to be markers of PCa risk in East African men. 
Just over half (56.7%) of the effect sizes observed in UGPCS were smaller than those observed in European ancestry populations (or had opposite directions), and only 14 risk alleles achieved nominal statistical significance. This observation is likely due to several factors, including variability of effect estimation due to small sample size, true differences in the effects of these susceptibility variants across ethnic groups, and population differences in LD between the index risk SNPs and causal alleles. Larger studies in men of African ancestry will be required to disentangle these possibilities.

The overwhelming importance of genetic variation at 8q24 in contributing to PCa risk was confirmed in UGPCS, with 5 independent risk alleles observed in this region. The polygenic risk score analysis further demonstrated the substantial contribution of 8q24 to PCa susceptibility, with the mean risk score increasing by ~50%, from 6.49 for the PRS of the 92 non-8q24 known risk alleles to 9.70 after inclusion of the 5 independent risk SNPs in the 8q24 region. The change in AUC of 0.06 when adding the five 8q24 alleles suggests that variation in this region is important for risk classification, although overfitting likely exists in this study and external validation is needed.

A predictive PRS for PCa may be effective for identifying high-risk populations who are most likely to benefit from biopsy and subsequent treatment, and for reducing the over-diagnosis of indolent disease [27]. Studies among men of European descent (using 100 risk SNPs) indicate that men in the 90th-99th percentiles of the PRS have a ~3-fold increase in PCa risk compared to the population average (25th-75th percentile of the PRS distribution) [7]. In our study, using a comparable set of 97 risk SNPs, Ugandan men in the 90th-99th percentiles of the PRS had a 4.55-fold (95%CI: 2.48, 8.36) increased PCa risk compared with the average risk. 
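As a rough plausibility check on the 4.55-fold estimate, an unadjusted odds ratio can be computed directly from the case and control counts reported in Table 3 (78 cases and 15 controls in the 90%-99% category; 282 cases and 238 controls in the 25%-75% baseline). The sketch below uses the standard cross-product ratio with a Woolf (log-scale) confidence interval; the function name is hypothetical, and the result is unadjusted for age and principal components, so it is not expected to match the published adjusted estimate exactly.

```python
import math


def crude_odds_ratio(cases_hi, controls_hi, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    Compares a high-PRS category against the reference category using
    only the 2x2 table of counts; no covariate adjustment is applied.
    """
    odds_ratio = (cases_hi * controls_ref) / (controls_hi * cases_ref)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / cases_hi + 1 / controls_hi + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from Table 3 (90%-99% category vs. 25%-75% baseline).
or_top, ci_low, ci_high = crude_odds_ratio(78, 15, 282, 238)
print(f"crude OR = {or_top:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f})")
```

The crude estimate comes out at roughly 4.4, in the same range as the adjusted value, which suggests that adjustment for age and ancestry components shifts the top-category estimate only modestly.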
This elevated risk might be attributed to the African-specific risk alleles included in our PRS, especially rs72725854, which has a large effect size and is limited to men of African ancestry, or perhaps to variability in effect estimation due to the limited sample size of our study. To address this, we also constructed the PRS using per-allele log odds ratios obtained from the large AAPC meta-analysis (>10,000 cases) in men of African ancestry as weights [14], which includes some of the UGPCS samples. Here, we found that men in the 90-99% risk stratum had a relative risk of 4.79 (95%CI: 2.64, 8.71), which is still larger than that reported in previous studies among Whites and is similar to that observed using Uganda-specific weights (i.e., ORs), suggesting that the effects and/or frequencies of PCa risk alleles may be greater in men of African ancestry. However, the risk may be higher in our study because we removed controls with PSA levels ≥4 ng/mL, a criterion that is not applied consistently in other studies. We found no evidence of heterogeneity in the PRS by disease aggressiveness, which is consistent with other studies [28]. Other studies have provided evidence of heterogeneity in PRS effects by PCa family history [28], but family history was not available in our study, so we were not able to assess this hypothesis here.

A major limitation of this study is the small sample size, which limits statistical power both for testing known risk loci and for discovering new risk alleles. For known risk alleles with a MAF of 20%, the power to detect ORs of 1.25 at a nominal significance level (P<0.05) was only 55%. In the exploratory genome-wide analyses, for alleles with a MAF of 20%, there was adequate a priori power (80%) only to detect alleles with large effects (ORs >1.90) at genome-wide significance (P<5×10⁻⁸).

The most significant genome-wide association was noted with rs72725854. 
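The abstract attributes about 12% of PCa in this population to rs72725854. The text does not say how that figure was derived; one common approximation, shown here purely as an illustration, is Levin's population attributable fraction, treating the control risk-allele frequency (0.06) as the exposure prevalence and the per-allele OR (3.37) as an approximate relative risk. Both of those substitutions, and the function name, are assumptions of this sketch rather than the study's stated method.

```python
def levin_paf(exposure_prevalence, relative_risk):
    """Levin's population attributable fraction.

    PAF = p * (RR - 1) / (1 + p * (RR - 1)), where p is the prevalence
    of exposure and RR is the relative risk associated with exposure.
    """
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Assumed inputs: control risk-allele frequency and per-allele OR for
# rs72725854 as reported in the text; the OR stands in for the RR.
paf = levin_paf(0.06, 3.37)
print(f"approximate PAF = {paf:.1%}")
```

Under these assumptions the formula gives a value close to 12%, consistent in magnitude with the figure quoted in the abstract, although a different derivation may have been used.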
The risk variant rs72725854 is tri-allelic, with the risk allele T reported only in African ancestry populations, and is correlated with other published African ancestry-specific 8q24 risk alleles, rs114798100 (r²=0.54) and rs111906932 (r²=0.39). This variant is located in an intergenic region near a long non-coding RNA, PCAT1 (Prostate Cancer Associated Transcript 1), and was previously discovered in the AAPC GWAS, which included the majority of UGPCS subjects [14]. While the RAF of this variant is similar across African ancestry populations (5-11% in AFR populations in 1KGP), the odds ratio was slightly higher in Ugandan men (OR=3.37; 95% CI, 2.36-4.82) than that reported in AAPC, which included mainly African Americans (OR=2.13; 95% CI, 1.97-2.32) [14].

In summary, we found that the known PCa risk variants can effectively stratify PCa risk in Ugandan men, with 10% of men having a >4-fold increase in risk. These results also emphasize the importance of germline variation at the 8q24 risk region in the etiology of PCa in Ugandan men. Larger studies in African populations will be required to identify additional PCa risk alleles that aid in risk stratification and that contribute to the higher PCa incidence in this population.

Acknowledgements

This work was supported by NIH grants U19CA148537 and R01CA165862.

2.6 References

1. Rebbeck TR, Devesa SS, Chang B-L, et al. Global Patterns of Prostate Cancer Incidence, Aggressiveness, and Mortality in Men of African Descent. Prostate Cancer. doi:10.1155/2013/560857.
2. Global Burden of Disease Cancer Collaboration, Fitzmaurice C, Allen C, et al. Global, Regional, and National Cancer Incidence, Mortality, Years of Life Lost, Years Lived With Disability, and Disability-Adjusted Life-years for 32 Cancer Groups, 1990 to 2015: A Systematic Analysis for the Global Burden of Disease Study. JAMA Oncol. 2017;3(4):524-548. doi:10.1001/jamaoncol.2016.5688.
3. CDC - Prostate Cancer Rates by Race and Ethnicity. 
https://www.cdc.gov/cancer/prostate/statistics/race.htm. Accessed December 15, 2016.
4. Parkin DM, Nambooze S, Wabwire-Mangen F, Wabinga HR. Changing cancer incidence in Kampala, Uganda, 1991-2006. Int J Cancer. 2010;126(5):1187-1195. doi:10.1002/ijc.24838.
5. Ferlay J, Soerjomataram I, Ervik M, et al. GLOBOCAN 2012 v1.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase No. 11 [Internet]. Lyon, France: International Agency for Research on Cancer; 2013. Available from: http://globocan.iarc.fr, accessed on day/month/year.
6. Mucci LA, Hjelmborg JB, Harris JR, et al. Familial Risk and Heritability of Cancer Among Twins in Nordic Countries. JAMA. 2016;315(1):68-76. doi:10.1001/jama.2015.17703.
7. Al Olama AA, Kote-Jarai Z, Berndt SI, et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nat Genet. 2014;46(10):1103-1109. doi:10.1038/ng.3094.
8. Hoffmann TJ, Van Den Eeden SK, Sakoda LC, et al. A large multiethnic genome-wide association study of prostate cancer identifies novel risk variants and substantial ethnic differences. Cancer Discov. 2015;5(8):878-891. doi:10.1158/2159-8290.CD-15-0315.
9. Cook MB, Wang Z, Yeboah ED, et al. A genome-wide association study of prostate cancer in West African men. Hum Genet. 2014;133(5):509-521. doi:10.1007/s00439-013-1387-z.
10. Han Y, Rand KA, Hazelett DJ, et al. Prostate Cancer Susceptibility in Men of African Ancestry at 8q24. J Natl Cancer Inst. 2016;108(7):djv431. doi:10.1093/jnci/djv431.
11. Eeles RA, Al Olama AA, Benlloch S, et al. Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array. Nat Genet. 2013;45(4):385-391, 391-2. doi:10.1038/ng.2560.
12. Al Olama AA, Dadaev T, Hazelett DJ, et al. Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans. Hum Mol Genet. 2015;24(19):5589-5602. doi:10.1093/hmg/ddv203.
13. Haiman CA, Chen GK, Blot WJ, et al. 
Genome-wide association study of prostate cancer in men of African ancestry identifies a susceptibility locus at 17q21. Nat Genet. 2011;43(6):570-573. doi:10.1038/ng.839.
14. Conti DV, Wang K, Sheng X, et al. Two Novel Susceptibility Loci for Prostate Cancer in Men of African Ancestry. JNCI J Natl Cancer Inst. 2017;109(8). doi:10.1093/jnci/djx084.
15. Han Y, Hazelett DJ, Wiklund F, et al. Integration of multiethnic fine-mapping and genomic annotation to prioritize candidate functional SNPs at prostate cancer susceptibility regions. Hum Mol Genet. 2015;24(19):5603-5618. doi:10.1093/hmg/ddv269.
16. Haiman CA, Chen GK, Blot WJ, et al. Characterizing genetic risk at known prostate cancer susceptibility loci in African Americans. PLoS Genet. 2011;7(5):e1001387. doi:10.1371/journal.pgen.1001387.
17. Amos CI, Dennis J, Wang Z, et al. The OncoArray Consortium: a Network for Understanding the Genetic Architecture of Common Cancers. Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol. October 2016. doi:10.1158/1055-9965.EPI-16-0106.
18. Delaneau O, Marchini J, 1000 Genomes Project Consortium. Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nat Commun. 2014;5:3934. doi:10.1038/ncomms4934.
19. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904-909. doi:10.1038/ng1847.
20. Cropp CD, Robbins CM, Sheng X, et al. 8q24 risk alleles and prostate cancer in African-Barbadian men: 8q24 Risk Alleles and Prostate Cancer in AB Men. The Prostate. 2014;74(16):1579-1588. doi:10.1002/pros.22871.
21. Haiman CA, Patterson N, Freedman ML, et al. Multiple regions within 8q24 independently affect risk for prostate cancer. Nat Genet. 2007;39(5):638-644. doi:10.1038/ng2015.
22. Al Olama AA, Kote-Jarai Z, Giles GG, et al. 
Multiple loci on 8q24 associated with prostate cancer susceptibility. Nat Genet. 2009;41(10):1058-1060. doi:10.1038/ng.452.
23. Pruim RJ, Welch RP, Sanna S, et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinforma Oxf Engl. 2010;26(18):2336-2337. doi:10.1093/bioinformatics/btq419.
24. Zhong H, Prentice RL. Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies. Biostat Oxf Engl. 2008;9(4):621-634. doi:10.1093/biostatistics/kxn001.
25. Takata R, Akamatsu S, Kubo M, et al. Genome-wide association study identifies five new susceptibility loci for prostate cancer in the Japanese population. Nat Genet. 2010;42(9):751-754. doi:10.1038/ng.635.
26. Xu J, Mo Z, Ye D, et al. Genome-wide association study in Chinese men identifies two new prostate cancer risk loci at 9q31.2 and 19q13.4. Nat Genet. 2012;44(11):1231-1235. doi:10.1038/ng.2424.
27. Pashayan N, Duffy SW, Neal DE, et al. Implications of polygenic risk-stratified screening for prostate cancer on overdiagnosis. Genet Med Off J Am Coll Med Genet. 2015;17(10):789-795. doi:10.1038/gim.2014.192.
28. Al Olama AA, Benlloch S, Antoniou AC, et al. Risk Analysis of Prostate Cancer in PRACTICAL, a Multinational Consortium, Using 25 Known Prostate Cancer Susceptibility Loci. Cancer Epidemiol Biomark Prev Publ Am Assoc Cancer Res Cosponsored Am Soc Prev Oncol. 2015;24(7):1121-1129. doi:10.1158/1055-9965.EPI-14-0317.

Table 1. 
Known prostate cancer risk alleles that were nominally statistically significant (P<0.05) among Ugandan men

SNP ID | Region | Position | Risk allele | RAF (a) | OR (95% CI) (b) | P-value (c)
--- | --- | --- | --- | --- | --- | ---
rs72725854 | 8q24.21 | 128074815 | T | 0.06 | 3.37 (2.36, 4.82) | 2.14×10⁻¹¹
rs114798100 | 8q24.21 | 128085434 | G | 0.04 | 2.92 (2.00, 4.28) | 3.63×10⁻⁸
rs72725879 | 8q24.21 | 128103969 | T | 0.36 | 1.58 (1.30, 1.93) | 4.73×10⁻⁶
rs16901979 | 8q24.21 | 128124916 | A | 0.47 | 1.45 (1.20, 1.76) | 1.01×10⁻⁴
rs6983561 | 8q24.21 | 128106880 | C | 0.50 | 1.39 (1.16, 1.67) | 3.91×10⁻⁴
rs111906932 | 8q24.21 | 128086204 | A | 0.01 | 3.50 (1.60, 7.66) | 0.0017
rs1512268 | 8p21.2 | 23526463 | T | 0.65 | 1.31 (1.07, 1.60) | 0.0087
rs3096702 | 6p21.32 | 32192331 | A | 0.09 | 0.62 (0.43, 0.89) | 0.0090
rs11568818 | 11q22.2 | 102401661 | T | 0.53 | 1.23 (1.03, 1.48) | 0.0237
rs10086908 | 8q24.21 | 128011937 | T | 0.71 | 1.27 (1.03, 1.56) | 0.0258
rs684232 | 17p13.3 | 618965 | C | 0.62 | 1.24 (1.02, 1.50) | 0.0277
rs7463326 | 8q24.21 | 128027954 | G | 0.88 | 1.41 (1.03, 1.94) | 0.0312
rs12549761 | 8q24.21 | 128540776 | C | 0.97 | 1.97 (1.03, 3.79) | 0.0412
rs7153648 | 14q23.1 | 61122526 | C | 0.38 | 1.21 (1.01, 1.45) | 0.0417
rs75823044 | 13q34 | 110360784 | T | 0.01 | 2.02 (1.02, 4.00) | 0.0437
rs1218582 | 1q21.3 | 154834183 | G | 0.69 | 0.83 (0.69, 1.00) | 0.0497

(a) Risk allele frequency in UGPCS controls; (b) Adjusted for age and 10 principal components; (c) Wald test.

Table 2. Prostate cancer risk alleles at 8q24 in Ugandan men.

SNP ID | Chromosome position | Alleles (a) | Effect allele frequency (case/control) | Marginal OR (95% CI) (b) | Marginal P-value | Conditional OR (95% CI) (c) | Conditional P-value | Largest correlation (r²) with known risk variants at 8q24 (d)
--- | --- | --- | --- | --- | --- | --- | --- | ---
rs72725854 | 128074815 | T/A | 0.14/0.06 | 3.37 (2.36, 4.82) | 2.14×10⁻¹¹ | 3.62 (2.50, 5.26) | 1.20×10⁻¹¹ | 0.54 with rs114798100 [9]
rs6470538 | 128594189 | T/C | 0.39/0.32 | 1.44 (1.18, 1.76) | 2.87×10⁻⁴ | 1.47 (1.21, 1.80) | 1.18×10⁻⁴ | 0.01 with rs6983267 [20]
rs1456315 | 128103937 | T/C | 0.62/0.53 | 1.47 (1.22, 1.77) | 6.50×10⁻⁵ | 1.52 (1.24, 1.87) | 6.90×10⁻⁵ | 0.41 with rs72725879 [9]
rs28556804 | 128014315 | A/G | 0.91/0.87 | 1.46 (1.08, 1.97) | 1.28×10⁻² | 1.80 (1.31, 2.48) | 2.94×10⁻⁴ | 0.95 with rs7463326 [13]
rs73707269 | 128712597 | G/C | 0.95/0.92 | 2.00 (1.30, 3.07) | 1.51×10⁻³ | 2.19 (1.40, 3.44) | 6.23×10⁻⁴ | 0.01 with rs10090154 [20]

(a) Effect/reference allele; (b) Adjusted for age and 10 principal components; (c) Adjusted for all the SNPs shown, age and 10 principal components; (d) r² determined in AFR populations in 1000 Genomes.

Table 3. A polygenic risk score for prostate cancer in Ugandan men.

PRS category (a) | UGPCS: number of cases | UGPCS: number of controls | UGPCS: OR (95% CI) (b) | UGPCS: P-value (c) | AAPC: OR (95% CI) (d) | AAPC: P-value (c) | EUR: OR (95% CI) (e)
--- | --- | --- | --- | --- | --- | --- | ---
1%-10% | 19 | 74 | 0.18 (0.10, 0.32) | 2.19×10⁻⁹ | 0.28 (0.17, 0.47) | 2.10×10⁻⁶ | 0.31 (0.28-0.35)
10%-25% | 55 | 101 | 0.42 (0.28, 0.63) | 2.29×10⁻⁵ | 0.57 (0.39, 0.84) | 4.48×10⁻³ | 0.52 (0.48-0.55)
25%-75% (baseline) | 282 | 238 | - | - | - | - | -
75%-90% | 114 | 42 | 2.25 (1.48, 3.42) | 1.37×10⁻⁴ | 1.72 (1.15, 2.55) | 7.61×10⁻³ | 1.78 (1.68-1.88)
90%-99% | 78 | 15 | 4.55 (2.48, 8.36) | 1.06×10⁻⁶ | 4.79 (2.64, 8.71) | 2.75×10⁻⁷ | 2.93 (2.75-3.12)

(a) PRS was calculated using 97 SNPs, including 92 known non-8q24 risk alleles (MAF>0.01) and 5 independent 8q24 risk variants identified in Ugandan men; (b) Adjusted for age and 10 principal components; weights were marginal log10(OR) from Ugandan men, with a Bayesian adjustment of the five 8q24 alleles for the "winner's curse"; (c) Wald test P-value; (d) Adjusted for age and 10 principal components; weights were log10(OR) from a large African American sample [14]; (e) Estimates were from a previous paper using 100 SNPs to construct the PRS [7].

Figure 1. Regional association plot of the 8q24 risk region (127.8-128.3 Mb) in Ugandan men. Single-nucleotide polymorphisms (SNPs) are plotted by position (x-axis) and -log10 P value (y-axis). LD was estimated from AFR individuals in phase III 1000 Genomes Project (1KGP) data using the r² statistic. The most statistically significant SNP (purple diamond) is rs72725854, and the surrounding SNPs are colored to indicate pairwise correlation with the index SNP.

Figure 2. Density plots of the polygenic risk scores. The dotted lines are controls and the solid lines are cases. The green lines are density distributions of the polygenic risk score (PRS) derived from 92 known non-8q24 risk SNPs. The blue lines are distributions of the PRS derived from 92 known non-8q24 risk SNPs plus 5 independent risk alleles in the 8q24 region identified among UGPCS men. 
The labeled values are the mean PRS of the corresponding categories.

Chapter Three: Genome-wide Association Study of Prostate Cancer in Latinos

(Manuscript submitted)

Zhaohui Du¹, Hannah Hopp¹, Sue Ann Ingles¹, Chad Huff², Xin Sheng¹, Brandi Weaver³, Mariana Stern¹, Thomas J. Hoffmann⁴,⁵, Esther M. John⁶, Stephen K. Van Den Eeden⁷,⁸, Sara Strom², Robin J. Leach³, Ian M. Thompson Jr³, John S. Witte⁴,⁵,⁸,⁹, David V. Conti¹, Christopher A. Haiman¹

Affiliations
1. Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, CA
2. The University of Texas MD Anderson Cancer Center, Houston, TX
3. Department of Urology, University of Texas Health Science Center, San Antonio, TX
4. Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94158, USA
5. Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA
6. Department of Medicine and Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA
7. Division of Research, Kaiser Permanente, Northern California, Oakland, CA 94612, USA
8. Department of Urology, University of California San Francisco, San Francisco, CA 94158, USA
9. UCSF Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94158, USA

Corresponding Author
Christopher A. Haiman, Harlyne Norris Research Tower, 1450 Biggy Street, Room 1504, Los Angeles, CA 90033. Telephone: (323) 442-7755. Fax: (323) 442-7749. Email: haiman@usc.edu

3.1 Abstract

Latinos represent less than 1% of samples analyzed to date in genome-wide association studies of cancer. Realizing the clinical value of genetic information in guiding personalized medicine in populations of non-European ancestry will require additional discovery and risk locus characterization efforts across populations. 
In the present study, we performed a GWAS of PrCa in 2,820 Latino PrCa cases and 5,293 controls to search for novel PrCa risk loci and to examine the generalizability of known PrCa risk loci in Latino men. We also conducted a genetic admixture mapping scan to identify PrCa risk alleles associated with local ancestry. Genome-wide significant associations were observed with 84 variants, all located at the known PrCa risk regions at 8q24 (128.484-128.548 Mb) and 10q11.22 (MSMB gene). In admixture mapping, we observed genome-wide significant associations with local African ancestry at 8q24. Of the 162 established PrCa risk variants that are common in Latino men, 135 (83.3%) had effects that were directionally consistent with previous reports, of which 55 (34.0%) were statistically significant at P<0.05. A polygenic risk model of the known PrCa risk variants showed that, compared to men with average risk (25th-75th percentile of the polygenic risk score distribution), men in the top 10% had a 3.19-fold (95% CI: 2.65, 3.84) increased PrCa risk. In conclusion, we found that the known PrCa risk variants can effectively stratify PrCa risk in Latino men. Larger studies in Latino populations will be required to discover and characterize genetic risk variants for PrCa and improve risk stratification for this population.

3.2 Introduction

Prostate cancer (PrCa) is the most common non-skin cancer and the second leading cause of cancer death among men in the U.S., with large differences in incidence rates observed across racial/ethnic groups 1. Age-adjusted incidence rates (per 100,000) are highest in African Americans (AA) (178.3), lower in non-Hispanic whites (NHW) (105.7), and slightly lower still in Hispanics/Latinos (91.8) 1,2. In the only prospective study of PrCa in Latinos, risk was observed to be higher among Latinos compared to NHW after adjustment for potential confounders, including lifestyle factors and PSA screening history 3.
Though classified as a single ethnic group, the Latino population consists of genetically admixed individuals from populations that display considerable diversity in PrCa incidence and mortality rates. For example, analyses of cancer registry data in Florida revealed that Latinos of Mexican origin had a remarkably lower age-adjusted incidence rate compared to those of Cuban or Puerto Rican origin or to NHW 4, whereas Latinos of Dominican and Cuban origin had a significantly higher mortality rate compared to NHW 5. Possible explanations for these differences include variation across subgroups in nativity, socioeconomic status, access to care, lifestyle factors, and genetic ancestry and susceptibility. Latinos are extensively admixed, with genetic contributions from multiple ancestries including Amerindian (AMR), European (EUR) and African (AFR) 6, and with large variation in ancestry proportions observed across subgroups and individuals. For example, the proportion of AFR ancestry is small among Mexicans (<10%) but quite large in Dominicans and Puerto Ricans (20-40%). Throughout the Americas, and even within a single country, AMR ancestry proportions vary widely 6,7.

GWAS in non-European populations have provided insight into ancestry-specific variation and have revealed regions of susceptibility that are of particular importance in certain populations. For example, GWAS in Latinos of phenotypes such as central corneal thickness 8, asthma 9 and diabetes 10 have discovered novel susceptibility loci not reported in other populations. For breast cancer, an admixture-mapping study discovered higher AMR ancestry at chromosome 6q25 and protective variants within this region 11 that are only found in AMR populations. Genetic studies of PrCa in Latino men have been limited but are needed to discover novel AMR germline variants for PrCa risk and to test the generalizability of established PrCa genetic markers in this admixed population.
The extensive diversity of ancestry proportions within Latinos also provides the opportunity to investigate the interaction between genetic background and genetic risk loci on disease risk 12,13.

In the present study, we carried out a GWAS of PrCa in Latinos to discover novel risk alleles and to examine whether the known PrCa risk alleles are important in stratifying PrCa risk in Latino men. We also leveraged genetic admixture to conduct a genome-wide admixture mapping analysis to scan for PrCa risk alleles associated with local ancestry. In addition to generating a polygenic risk score (PRS) to test the cumulative effect of all known PrCa risk variants in Latinos, we also explored whether genetic background/ancestry modified associations with single variants and a PRS for PrCa.

3.3 Materials and Methods

Study Participants and GWAS Genotyping

This study includes Latino PrCa cases and controls from five studies that were genotyped with different GWAS array platforms and are denoted as Sets 1-3.

Set 1 consisted of 1,079 incident Latino PrCa cases and 1,083 controls from the Multiethnic Cohort (MEC) 14. In brief, the MEC is a large population-based cohort study including 215,251 men and women recruited from Hawaii and California between 1993 and 1996. Incident Latino PrCa cases were identified by linkage with the cancer registries in Hawaii and California. Controls were men with no prostate cancer diagnosis who were selected from a control pool of participants who provided specimens for genetic analysis and were frequency-matched to cases on age (± 5 years). Genotyping of Set 1 was performed with the Illumina Human660W array 14.

Set 2 included 1,253 cases and 1,069 controls from four studies: the MEC, the Los Angeles Aggressive Prostate Cancer (LAAPC), MD Anderson (MDA), and San Antonio Biomarkers of Risk (SABOR). These studies were genotyped with the Illumina OncoArray (260K GWAS backbone) 15, as part of the ELLIPSE GAME-ON Consortium 16.
The MEC included 152 incident and prevalent Latino PrCa cases and 162 controls (not included in Set 1). The LAAPC is a population-based case-control study of aggressive prostate cancer in Los Angeles County 17. Eligible cases (n=320) were Latinos of any age diagnosed with primary prostate cancer. Controls (n=331) were Latino men without a PrCa diagnosis who were identified via a neighborhood walk algorithm 18 and frequency-matched to cases on age (± 5 years). MDA cases (n=521) were Latino men enrolled in epidemiological PrCa studies conducted at the University of Texas M.D. Anderson Cancer Center 19,20. Controls (n=316) were men of self-reported Mexican origin recruited by random digit dialing in Texas 20 or enrolled in the Mexican American Cohort Study, an ongoing population-based cohort in Houston, TX 21. MDA controls had no diagnosis of invasive cancer and were frequency-matched to cases on age (± 5 years). SABOR is a cohort study that has been enrolling healthy male volunteers in the San Antonio and South Texas area since 2001 22,23. Participants were examined annually/biannually by digital rectal exam and serum prostate-specific antigen level, and prostate biopsy was recommended for men with positive results. In total, 260 incident, biopsy-confirmed Latino PrCa cases were enrolled. Controls (n=260) were Latino men ≥ 45 years old who had normal digital rectal exams and prostate-specific antigen levels ≤ 2.5 ng/ml on all annual visits.

Set 3 included 488 Latino PrCa cases and 3,141 controls from three cohorts within Kaiser Permanente (KP), an integrated health care delivery system: the Research Program on Genes, Environment and Health (RPGEH) cohort, the ProHealth Study, and the California Men’s Health Study. Incident PrCa cases were identified from the KP Northern California Cancer Registry (KPNCCR), the KP Southern California Cancer Registry (KPSCCR) or through review of clinical electronic health records by the end of 2012.
Controls were all Latino men in the RPGEH Genetic Epidemiology Research on Aging (GERA) study without a PrCa diagnosis. These studies were genotyped using the Affymetrix Axiom v2 reagent as previously described 24.

Genotyping Quality Control, Imputation and GWAS Analysis

In Set 1, samples were excluded based on call rate <95% and first-degree relatedness, leaving a final analysis sample size of 2,080 (1,034 cases, 1,046 controls). SNPs with call rate <0.95 or with MAF<1% were excluded, and 528,023 SNPs were retained for imputation. Imputation was performed against the 1000 Genomes Project (1KGP) cosmopolitan reference panel using Minimac3 Version 1.0.12. A total of 10,441,344 SNPs with MAF ≥ 0.01 and imputation quality score ≥ 0.3 were included in the analysis. Principal components (PCs) were estimated using EIGENSTRAT 25, and per-allele odds ratios (ORs) and P-values were estimated using unconditional logistic regression for each SNP, adjusting for age and the first ten PCs.

In Set 2, genotyping quality control (QC) was conducted together with a larger number of samples from the ELLIPSE consortium as described previously 15,16. Briefly, samples were removed if they were gender/sex mismatches (n=6), first-degree relative pairs (n=19) or had a call rate <0.95 (n=9). We calculated shared identity by descent (IBD) for Set 1 and Set 2 using PLINK to remove related samples across sets. We further excluded 43 cases and 1 control from Set 2, leaving a final sample size of 2,244 (1,192 cases and 1,052 controls). We excluded SNPs with call rate <0.95 or replicate concordance <99.8% based on QC replicate samples, or due to poor clustering after visual inspection. We further removed SNPs whose estimated MAF deviated from, or whose alleles did not match, those of the AMR individuals in phase III 1KGP data; 456,809 SNPs were available for imputation. Imputation was performed with Minimac3 Version 1.0.12 using the phase III 1KGP cosmopolitan reference panel.
A total of 10,595,258 SNPs with MAF ≥ 0.01 and imputation quality score ≥ 0.3 were included in the analysis. PCs were estimated using EIGENSTRAT, and per-allele ORs and P-values were estimated using unconditional logistic regression, adjusting for age, study, and the first ten PCs.

In Set 3, QC exclusions were based on call rate <97%, ancestry outliers, and relatedness as previously described 24. The final analysis sample size was 3,629 (488 cases and 3,141 controls). Problematic SNPs were removed if they had MAF<1%, call rate <95%, or Hardy-Weinberg equilibrium (HWE) P<1×10⁻⁵, leaving 568,496 SNPs for imputation. Imputation was performed to the phase III 1KGP using IMPUTE2 v2.3.1. A total of 10,748,756 SNPs with MAF ≥ 0.01 and imputation quality score ≥ 0.3 were included in the analysis. PCs were estimated using EIGENSTRAT v4.2. ORs and P-values were estimated using unconditional logistic regression for each SNP, adjusting for age, body mass index (BMI) and the first ten PCs.

Statistical Analyses

A fixed-effect meta-analysis with inverse variance weights was used to obtain the combined results of the three sets for the overlapping SNPs (n=10,330,976). The combined sample size for the meta-analysis was 2,714 cases and 5,239 controls. Risk allele frequencies (RAFs) were derived by averaging the case/control RAFs of the three sets, weighted by the corresponding case/control numbers in each study. Regional association plots were generated using LocusZoom 26 for regions with genome-wide significant variants. All tests were two-sided, with a genome-wide significance level of α=5.0×10⁻⁸. Unlike Sets 1 and 2, Set 3 was additionally adjusted for BMI because BMI was found to be associated with PrCa risk in ProHealth. However, BMI was not found to confound the SNP associations or alter the PRS meta-analysis results (data not shown).
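The fixed-effect, inverse-variance-weighted combination and the sample-size-weighted RAF described above can be illustrated with a minimal Python sketch. It is not the code used in this study; the function names (`fixed_effect_meta`, `weighted_raf`) and the input layout (one log odds ratio and standard error per set) are assumptions made for illustration.

```python
import math

def fixed_effect_meta(log_ors, ses):
    """Combine per-set log odds ratios with inverse-variance weights.

    log_ors: per-set log(OR) estimates for one SNP (assumed input layout).
    ses:     the corresponding standard errors.
    """
    weights = [1.0 / se ** 2 for se in ses]          # w_k = 1 / SE_k^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, log_ors)) / total
    se = math.sqrt(1.0 / total)                      # SE of the pooled estimate
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal P-value
    return {
        "or": math.exp(beta),
        "ci_low": math.exp(beta - 1.96 * se),
        "ci_high": math.exp(beta + 1.96 * se),
        "se": se,
        "p": p,
    }

def weighted_raf(rafs, sample_sizes):
    """Average per-set risk allele frequencies, weighted by sample size."""
    return sum(f * n for f, n in zip(rafs, sample_sizes)) / sum(sample_sizes)

# Illustrative values only (not study data): three sets with the same effect.
result = fixed_effect_meta([math.log(1.2)] * 3, [0.1, 0.1, 0.1])
```

With identical per-set estimates, the pooled OR equals the common OR and the pooled standard error shrinks by a factor of the square root of the number of sets, which is the behavior expected of a fixed-effect model.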
To assess the number of independent signals in the genome-wide significant risk regions, we performed forward-selection logistic regression in a pooled dataset of Sets 1 and 2 (primary data were not available for Set 3), adjusting for global ancestry, age, and study. The correlation (r²) between the independent SNPs identified by the stepwise regression procedure and previously reported PrCa risk alleles in these regions was calculated within the phase III 1KGP AFR/AMR/EUR populations.

Admixture analysis. We performed an admixture-based genome-wide scan using the primary genotype data for Sets 1 and 2. We first computed PCs with the reference panels including 1KGP (n=2504; n=347 AMR) and the National Human Genome Research Institute (NHGRI) Population Architecture using Genomics and Epidemiology (PAGE) Consortium reference panel (n=1553; n=630 AMR) to visualize the ancestry distribution of our samples. We used PAGE individuals (European, African and Amerindian) as the reference samples for local ancestry estimation. We conducted random sampling of the AMR population to obtain balanced sample sizes across ethnicities, leaving a total of 393 individuals in the final reference panel (AMR=147, EUR=150, AFR=96). We estimated genome-wide local ancestry using RFMix 27. We calculated individual EUR/AMR/AFR global ancestry (Q_EUR/AMR/AFR) by taking the average of an individual’s local ancestry estimates across the 22 autosomes. The association between global ancestry and PrCa risk was examined with a logistic regression model adjusting for age and study. We also tested the association between global ancestry and PrCa aggressiveness using a case-only analysis adjusting for age and study. Aggressive PrCa was defined as cases with Gleason score ≥8.

To search for regions of the genome where local ancestry (EUR vs. AMR vs.
AFR) may be associated with PrCa risk, we modeled the difference between an individual’s local ancestry and their global ancestry using linear regression and compared this difference between cases and controls, adjusting for age, study, and global ancestry. We also performed case-only analyses using linear regression adjusting for age and study, comparing a case’s local ancestry with his global ancestry. A fixed-effect meta-analysis with inverse variance weights was conducted to combine results of Sets 1 and 2, using P<1×10⁻⁵ as the criterion for genome-wide significance. Continuous regions (adjacent regions with P<1×10⁻⁴) that were significant in both case-only and case-control comparisons were considered suggestive PrCa risk associations. To assess whether the local ancestry signal could be explained by risk alleles within that region, we performed two conditional analyses for local ancestry: one adjusting for the top independent risk alleles identified in our Latino GWAS, the other adjusting for both the independent Latino risk alleles and the known risk variants in the detected region.

Association Testing of Known Risk Regions. We examined the associations of the 181 established risk variants from previous PrCa GWAS and fine-mapping studies (Supplemental Table S2). Alleles were considered directionally consistent if their ORs were in the same direction as previously described (i.e. OR>1). A nominal P-value of 0.05 was used to determine statistical significance. For each risk locus, we tested the interaction between continuous local ancestry (AMR/EUR) estimates and risk allele dosage on PrCa risk. For alleles with a nominally significant interaction term (P_interaction<0.05), we conducted stratified analyses by local ancestry (i.e. number of AMR chromosomes: ≤0.5, 0.5-1.5, >1.5).

Polygenic Risk Score Analyses. The aggregate effect of the known risk alleles was examined using a weighted polygenic risk score (PRS), $\mathrm{PRS}_i = \sum_{m \in C} w_m x_{im}$, for each individual. Here $x_{im}$ is the risk allele dosage for individual i at SNP m; C defines a set of 176 reported risk loci with MAF≥0.001 and imputation r²≥0.3 in Latino men (5 risk variants were excluded based on these criteria); and $w_m$ is the weight for SNP m. For a EUR-weighted PRS, weights were the conditional log ORs derived from men of European ancestry 28; for a Latino-weighted PRS, weights were the conditional log ORs obtained from meta-analyses in Latino men (Set 1 and Set 2). The PRS in each set (Set 1 and Set 2) was categorized by percentile (<10%, 10-25%, 25-75%, 75-90%, ≥90%), and the risk for each category was estimated relative to the interquartile range (25-75%) using logistic regression adjusting for the first 10 PCs, age and study. The estimates were then meta-analyzed using the metafor package in R. We also examined the association between PRS and PrCa risk by strata of EUR and AMR global ancestry.

3.4 Results

Demographic and clinical characteristics of individuals in the study are presented in Supplementary Table S1. The mean age of cases ranged from 61.8 to 73.7 years across studies, with mean ages being comparable in controls. The frequency of cases with Gleason score ≥8 ranged from 13.4% to 33.6% across studies, with LAAPC (Set 2) containing a higher proportion of aggressive cases (by design). Family history was more common among cases than controls in all studies and was significantly associated with PrCa risk (OR=2.8; 95% CI: 2.2, 3.5, P=3.1×10⁻¹⁶) after adjusting for age and study.

The degree of European/AmerIndian admixture in the Set 1 and Set 2 samples is shown in Supplementary Figure S1, with the majority of the current study samples spread along the European and AmerIndian axis. The PAGE reference panel revealed two AMR clusters, with the Set 1 and 2 samples clustering more closely with samples from Venezuela/Colombia/Brazil/Mexico (vs. Peru) (Supplementary Figure S2).
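As defined in the Methods, global ancestry is the average of an individual's local ancestry estimates across the autosomes. A minimal Python sketch of that averaging follows. The input layout (one pair of ancestry labels per genomic window, one label per haplotype) and the equal weighting of windows are assumptions made for illustration; this is neither the RFMix output format nor the code used in this study.

```python
def global_ancestry(local_calls, ancestries=("EUR", "AMR", "AFR")):
    """Average local ancestry calls into global ancestry proportions.

    local_calls: list of (haplotype_1_label, haplotype_2_label) tuples,
                 one tuple per genomic window (hypothetical layout).
    Returns a dict mapping each ancestry to its genome-wide proportion.
    """
    counts = {a: 0 for a in ancestries}
    for hap1, hap2 in local_calls:
        counts[hap1] += 1
        counts[hap2] += 1
    total = 2 * len(local_calls)          # two haplotypes per window
    return {a: counts[a] / total for a in ancestries}

# Illustrative values only: four windows for one individual.
calls = [("EUR", "EUR"), ("EUR", "AMR"), ("AMR", "AFR"), ("EUR", "AMR")]
q = global_ancestry(calls)
```

The three proportions sum to one by construction, so in a regression model only two of them can be entered alongside an intercept.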
European ancestry was the major ancestral component, with average values ranging from 48.5% to 58.4% in controls across studies, followed by AmerIndian ancestry (36.9% to 46.7% in controls), with African ancestry being a minor component (4.7% to 5.7% in controls). AMR global ancestry was negatively associated with PrCa risk after adjusting for age and study, with a 0.1 increase in AMR ancestry proportion associated with a 16% decrease in PrCa risk (OR=0.84; 95% CI: 0.81, 0.88, P=1.01×10⁻¹⁵). This difference between cases and controls was variable across studies (Supplementary Figure S3), and when MDA was excluded, the inverse association between AMR and PrCa risk was attenuated (OR=0.94 per 0.1 increase in AMR; 95% CI: 0.90, 0.99) but still statistically significant (P=0.01). AFR global ancestry was not significantly associated with PrCa risk. In the case-only analysis, neither AMR nor AFR global ancestry was significantly associated with PrCa aggressiveness (P_AMR=0.62, P_AFR=0.28).

The GWAS meta-analysis indicated no evidence of inflation in association test statistics (e.g., due to confounding by population stratification) (λ=1.03). Genome-wide statistically significant associations were detected with 84 variants in known risk regions at 10q11.22 (SNP n=74) and 8q24.21 (SNP n=10) (Supplementary Table S2, Supplementary Figure S4, Supplementary Figure S5, Supplementary Figure S6). The most statistically significant variant was the known risk allele rs10993994 (OR=1.29; 95% CI: 1.19, 1.39, P=1.08×10⁻¹⁰), located upstream of MSMB at 10q11.22. At 8q24.21 the strongest association was with rs7843031 (OR=1.53; 95% CI: 1.34, 1.74, P=5.12×10⁻¹⁰), which is highly correlated with the known risk variant rs7812894 (r²_AFR=0.57, r²_EUR=0.89, r²_AMR=0.83, 1KGP phase III) at 128.52 Mb. All other associated SNPs (P<5×10⁻⁸) at 10q11.22 and 8q24.21 were correlated with either rs7843031 (r²≥0.3) or rs10993994 (r²≥0.6).
At 8q24, a second variant, rs56005245, was found to be independently associated with risk (P<1×10⁻⁵) in the forward selection procedure. Variant rs56005245 is highly correlated with the previously reported risk allele rs72725879 (r²_AFR=0.11, r²_EUR=0.71, r²_AMR=0.29).

Admixture analysis: We found no genome-wide significant associations between local EUR or AMR ancestry and PrCa risk in the case-control or case-case analyses. We did detect genome-wide significant (P<1×10⁻⁵) PrCa risk associations with AFR local ancestry at the 8q24 PrCa susceptibility region (127.0-127.8 Mb), and each AFR-derived chromosome at this region was associated with an average 1.60-fold increased PrCa risk (95% CI: 1.31, 1.95); the continuous suggestive risk associations (P<1×10⁻⁴) extended from 126.9 to 128.1 Mb (Supplementary Figure S7). We performed a conditional analysis for AFR local ancestry in the genome-wide significant risk region, with additional adjustment for the two independent risk alleles rs7843031 and rs56005245 identified above. This resulted in a general increase of less than 2 orders of magnitude in the AFR local ancestry P-values (P=9.5×10⁻⁵ to 5.9×10⁻⁴, Supplementary Figure S8). Conditioning on all 14 8q24 risk variants (2 independent and 12 known risk alleles 16,29), each AFR-derived chromosome at 8q24 was associated with a 1.30-fold increased PrCa risk (95% CI: 1.03, 1.61), and the increase in the P-values was much greater (P=0.02 to 0.06).

Association Testing of Known Risk Regions: Of the 181 previously reported PrCa risk loci, one (rs138213197) was not imputed in Set 1 and Set 2; 162 were polymorphic with MAF ≥ 0.01 and imputation quality score ≥ 0.3 in all three sets of Latino men (Supplementary Table S3). Of these 162 variants, directional consistency was noted for 135 (83.3%) in the meta-analysis, among which 55 (34.0%) were nominally significant (P<0.05).
In comparing the frequency of the known risk alleles between populations, the average risk allele frequency in Latino controls was only 0.005 larger than that observed in the European population (P=0.48, t-test), with 18 (11.3%) having opposite minor alleles. Local ancestry was estimated for 157 autosomal risk alleles, and 11 variants demonstrated nominally statistically significant (P<0.05) interactions between local ancestry (EUR or AMR) and risk allele on PrCa risk, although no variant was statistically significant after accounting for the number of interaction tests. Of note, there was suggestive evidence that variant rs10993994 at 10q11.22 is more strongly associated with risk among Latino men with AMR local ancestry (>1.5 AMR chromosomes: OR=1.40, 95% CI: 1.10, 1.77, P=5.97×10⁻³; 0.5-1.5 AMR chromosomes: OR=1.36, 95% CI: 1.17, 1.58, P=6.28×10⁻⁵) compared to men with little AMR local ancestry in this region (≤0.5 AMR chromosomes: OR=1.19; 95% CI: 1.04, 1.36, P=1.23×10⁻²). The same suggestive trend, with an association being stronger in or limited to men with AMR ancestry in the region, was observed for another four known PrCa risk variants [rs9443189 (6q14.1), rs10875943 (12q13.12), rs12956892 (18q21.32), and rs1978060 (22q11.21)], while six variants had greater effect sizes among men with lower AMR local ancestry proportions [rs2028900 (2p11.2), rs4976790 (5q35.3), rs5875234 (6p22.1), rs630045 (6q22.1), rs17790938 (20q13.13), and rs909666 (22q13.2)] (Supplementary Table S4).

Polygenic risk score: Using a EUR-weighted PRS, Latino men in the top 10% PRS stratum had a 3.19-fold (95% CI: 2.65, 3.84) elevated risk and those in the top 1% had a 4.02-fold (95% CI: 2.46, 6.55) increased risk compared to men with average risk (PRS in the 25th-75th percentiles) (Table 1).
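The weighted PRS and the percentile categories behind these comparisons can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: the SNP identifiers are hypothetical, the percentile cutoffs are taken from whatever reference scores the caller supplies (the text does not state which distribution was used), and a simple nearest-rank percentile is assumed.

```python
from bisect import bisect_right

LABELS = ["<10%", "10-25%", "25-75%", "75-90%", ">=90%"]

def weighted_prs(dosages, weights):
    """Sum of weight * risk allele dosage over the SNPs in the weight set."""
    return sum(weights[snp] * dosages[snp] for snp in weights)

def percentile_cutoffs(reference_scores, percentiles=(10, 25, 75, 90)):
    """Nearest-rank percentiles of a reference score distribution."""
    ordered = sorted(reference_scores)
    n = len(ordered)
    cutoffs = []
    for p in percentiles:
        rank = max(1, (p * n + 99) // 100)   # ceil(p * n / 100) in integers
        cutoffs.append(ordered[rank - 1])
    return cutoffs

def prs_category(score, cutoffs):
    """Map a score to its percentile category label."""
    return LABELS[bisect_right(cutoffs, score)]

# Illustrative values only: hypothetical SNP names and log-OR weights.
weights = {"snp_a": 0.2, "snp_b": -0.1}
dosages = {"snp_a": 2.0, "snp_b": 1.5}
score = weighted_prs(dosages, weights)

cutoffs = percentile_cutoffs(range(1, 101))  # toy reference scores 1..100
```

Category indicators built this way would then enter a logistic regression with the 25-75% group as the reference level, matching the comparison reported in the text.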
Among Latinos with a higher proportion of European global ancestry (in the 4th quartile of EUR global ancestry in controls), we observed a more pronounced increase in PrCa risk (OR=3.68; 95% CI: 2.56, 5.29) for men in the top 10% EUR-weighted PRS risk stratum (Table 2). This association was slightly reduced (OR=2.94) among men in the 4th quartile of Amerindian ancestry (Supplemental Table S5). The P-values for interaction between PRS and EUR and AMR global ancestry were 0.26 and 0.04, respectively. The PRS odds ratios were larger when using weights estimated among Latino men from this study; the top 10% PRS stratum had a 4.18-fold (95% CI: 3.47, 5.04) elevated risk and those in the top 1% had a 6.87-fold (95% CI: 4.27, 11.06) elevated risk (Table 1). Effect modification of the Latino-weighted PRS by EUR and AMR global ancestry was also observed (Table 2 and Supplemental Table S5), with P-values for interaction of 0.08 and 0.01, respectively.

3.5 Discussion

In this study among Latinos, two known risk regions, at 8q24.21 and 10q11.22, achieved genome-wide significance, and admixture mapping highlighted the 8q24 region as harboring PrCa risk variants related to local African ancestry. The majority of established risk alleles were also replicated in Latinos in terms of directional consistency, and among them, ~30% achieved nominal significance. In the PRS analysis, the established risk alleles were found to be strongly associated with PrCa risk, with a larger PRS effect observed for men with more European ancestry.

Previous GWAS of PrCa have identified more than 170 common risk variants, with the majority of discovery populations being of European or Asian ancestry 28,30.
As found in previous studies in men of African ancestry 16, directional consistency was also observed for the majority (>80%) of risk variants in Latinos, among which ~30% were nominally statistically significant, suggesting that most of the known genetic susceptibility loci for PrCa generalize to the Latino population, which may not be surprising given their high degree of European ancestry.

Two regions, 8q24 and 10q11.22, achieved genome-wide significance. The risk region at 8q24 harbors multiple independent risk variants and is consistently recognized as the most significant PrCa risk region across racial/ethnic populations 16,31. However, in the Latino population, the 10q11.22 region surpassed 8q24 as the most significant risk region. At 10q11.22, the risk variant rs10993994 has been consistently associated with PrCa risk across populations 32-34 and is likely to be the causal variant within the region 35. The risk allele rs10993994-T is more common among populations of African ancestry (RAF_AFR=0.65, RAF_AMR=0.40, RAF_EUR=0.39) in Phase III 1KGP. In our Latino men, it was associated with a 1.29-fold (95% CI: 1.19, 1.39) increased risk, which is similar to that reported in the largest European PrCa GWAS (OR=1.23, 95% CI: 1.21, 1.25) 28 and larger than that reported in the AA PrCa GWAS (OR=1.12, 95% CI: 1.07, 1.16) 16. This allele is located close to the transcription start site of the microseminoprotein-beta (MSMB) gene and was reported to be significantly related to gene expression abundance 36. The encoded microseminoprotein (MSP) is one of the three major proteins secreted by the prostate, and we have shown that reduced serum levels are strongly associated with PrCa risk 37. In comparison to whites and blacks, the geometric mean plasma MSP level was observed to be lower in PrCa-free Latinos after adjusting for rs10993994 genotype, age, BMI and PSA level 38.
However, in contrast to the association observed with the risk SNP, the magnitude of the association between blood MSP concentration and PrCa risk is smaller in Latinos than in whites, although the difference was not statistically significant 37. Additional studies will be needed to better understand the strong association between the 10q11.22 risk SNP and PrCa risk in Latinos.

Latinos are a highly heterogeneous population; the ancestry structure varies widely across subgroups. Previous literature reported that, compared to other Latino subgroups, Mexicans had the highest proportion of Native American ancestry 39. Coincidentally, studies have also shown that self-reported Mexican Americans have lower PrCa incidence and mortality rates than whites and other Latino subgroups 4,40,41, suggesting that AmerIndian genetic ancestry might be a protective factor for PrCa risk. A previous study showed that estimated global AMR ancestry was inversely associated with breast cancer risk 42. Similarly, our results support the hypothesis that global AMR ancestry is inversely associated with PrCa risk, even after excluding the outlier study MDA, which had Mexican American controls but a more diverse representation of Latino cases. Of note, global genetic ancestry estimates not only reflect potential genetic differences in disease susceptibility but may also capture cultural, behavioral and lifestyle factors, including socioeconomic status (SES) as well as access and adherence to medical care and cancer screening. For some chronic diseases, associations between genetic ancestry and disease risk have been shown to be greatly attenuated or extinguished after accounting for such factors 43. However, this is not the case in some other studies of lung cancer 44, myocardial infarction and impaired fasting glucose 45. Thus, further investigation is required to disentangle genetic from non-genetic/social/behavioral influences captured by genetic ancestry on PrCa risk.
While none of the interactions between known risk alleles and local ancestry were significant after correcting for multiple tests, there was a suggestion that variant rs10993994 was more strongly associated with risk among men with greater local AMR ancestry. In men with a high or moderate proportion of local AMR ancestry, the OR was 1.4 versus 1.2 in men with lower local AMR ancestry (<25%), which may explain the strong association observed at the 10q11.22 risk region among Latinos. Testing interaction effects by local ancestry in Latinos will require a larger sample size.

Previous PRS analyses in populations of European ancestry have reported a ~3-fold difference in risk comparing people in the top 10% risk stratum to the population average 46–49, with the magnitude of effect being similar in African Americans 16. Similar to the previously reported effect size in studies among men of European descent, we observed a 3.2-fold increased PrCa risk in Latino men. A multi-ethnic study, which contained a part of our samples, demonstrated that when comparing the highest to the lowest risk score decile, the effect size was larger among non-Hispanic whites than in Latinos (OR=6.2 versus 5.8) 24. Consistent with their results, we observed a stronger effect of the PRS on PrCa risk among Latino men with a higher proportion of European global ancestry: among them, the effect size comparing the top 10% to the population average risk stratum increased to 3.7-fold. These observations may be due to ethnic differences in the frequencies of risk alleles and in the LD patterns surrounding causal SNPs, and suggest that global ancestry background might modify the effect of PrCa risk variants, further supporting the need to construct ethnicity-specific PRS. We also found the PRS associations to be larger when using weights from Latino men in this study; however, since the weights came from the same population, the effect sizes are likely to be overestimated.
An independent Latino replication sample is needed to validate this observation. Although our analysis represented the largest study of PrCa genetic susceptibility among Latinos, it remained under-powered for less common risk alleles with small effect size; for the genome-wide analysis (α= 5×10 -8 ), our study only had 80% a priori power to detect common risk alleles (MAF=10%) with moderate effect size (OR≥1.40); for known risk alleles with MAF of 5%, the power to detect ORs of 1.20 at a nominal significant level (P<0.05) was only 70%. However, for common variants with MAF>10%, we had more than adequate power (90%) to detect a moderate effect of 1.20. In summary, we found that the known PrCa risk variants can stratify PCa risk in Latino men. Larger studies in Latino populations, both in the US and in other countries, which will expand 105 AMR ancestral diversity, will be required to characterize genetic risk variants and improve risk stratification for this population. Acknowledgement This work was supported by grants U01 CA164973, U01 CA086402 and R01 CA84979. 106 3.6 References 1. Jemal A, Ward EM, Johnson CJ, Cronin KA, Ma J, Ryerson AB, Mariotto A, Lake AJ, Wilson R, Sherman RL, Anderson RN, Henley SJ, et al. Annual Report to the Nation on the Status of Cancer, 1975–2014, Featuring Survival. J Natl Cancer Inst 2017;109:djx030 2. Latini DM, Elkin EP, Cooperberg MR, Sadetsky N, DuChane J, Carroll PR. Differences in clinical characteristics and disease-free survival for Latino, African American, and non- Latino white men with localized prostate cancer. Cancer 2006;106:789–95. 3. Park S-Y, Haiman CA, Cheng I, Park SL, Wilkens LR, Kolonel LN, Le Marchand L, Henderson BE. Racial/ethnic differences in lifestyle-related factors and prostate cancer risk: the Multiethnic Cohort Study. Cancer Causes Control 2015;26:1507–15. 4. Pinheiro PS, Sherman RL, Trapido EJ, Fleming LE, Huang Y, Gomez-Marin O, Lee D. Cancer Incidence in First Generation U.S. 
Hispanics: Cubans, Mexicans, Puerto Ricans, and New Latinos. Cancer Epidemiol Biomarkers Prev 2009;18:2162–9. 5. Pinheiro PS, Callahan KE, Siegel RL, Jin H, Morris CR, Trapido EJ, Gomez SL. Cancer Mortality in Hispanic Ethnic Groups. Cancer Epidemiol Biomarkers Prev 2017;26:376–82. 6. Bryc K, Durand EY, Macpherson JM, Reich D, Mountain JL. The Genetic Ancestry of African Americans, Latinos, and European Americans across the United States. Am J Hum Genet 2015;96:37–53. 7. Beuten J, Halder I, Fowler SP, Groing HHH, Duggirala R, Arya R, Thompson IM, Leach RJ, Lehman DM. Wide disparity in genetic admixture among Mexican Americans from San Antonio, TX. Ann Hum Genet 2011;75:529–38. 8. Gao X, Nannini DR, Corrao K, Torres M, Chen Y-DI, Fan BJ, Wiggs JL, Taylor KD, Gauderman WJ, Rotter JI, Varma R. Genome-wide association study identifies WNT7B as a novel locus for central corneal thickness in Latinos. Hum Mol Genet 2016;25:5035–45. 9. Galanter JM, Gignoux CR, Torgerson DG, Roth LA, Eng C, Oh SS, Nguyen EA, Drake KA, Huntsman S, Hu D, Sen S, Davis A, et al. GWAS and admixture mapping identify different asthma-associated loci in Latinos: The GALA II Study. J Allergy Clin Immunol 2014;134:295–305. 10. Consortium TST 2 D. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico. Nature 2014;506:97–101. 11. Fejerman L, Ahmadiyeh N, Hu D, Huntsman S, Beckman KB, Caswell JL, Tsung K, John EM, Torres-Mejia G, Carvajal-Carmona L, Echeverry MM, Tuazon AMD, et al. Genome-wide association study of breast cancer in Latinas identifies novel protective variants on 6q25. Nat Commun 2014;5:5260. 12. González Burchard E, Borrell LN, Choudhry S, Naqvi M, Tsai H-J, Rodriguez-Santana JR, Chapela R, Rogers SD, Mei R, Rodriguez-Cintron W, Arena JF, Kittles R, et al. Latino Populations: A Unique Opportunity for the Study of Race, Genetics, and Social Environment in Epidemiological Research. Am J Public Health 2005;95:2161–8. 13.
Choudhry S, Ung N, Avila PC, Ziv E, Nazario S, Casal J, Torres A, Gorman JD, Salari K, Rodriguez-Santana JR, Toscano M, Sylvia JS, et al. Pharmacogenetic Differences in Response to Albuterol between Puerto Ricans and Mexicans with Asthma. Am J Respir Crit Care Med 2005;171:563–70. 14. Cheng I, Chen GK, Nakagawa H, He J, Wan P, Laurie CC, Shen J, Sheng X, Pooler LC, Crenshaw AT, Mirel DB, Takahashi A, et al. Evaluating Genetic Risk for Prostate Cancer among Japanese and Latinos. Cancer Epidemiol Biomarkers Prev 2012;21:2048–58. 15. Amos CI, Dennis J, Wang Z, Byun J, Schumacher FR, Gayther SA, Casey G, Hunter DJ, Sellers TA, Gruber SB, Dunning AM, Michailidou K, et al. The OncoArray Consortium: a Network for Understanding the Genetic Architecture of Common Cancers. Cancer Epidemiol Biomarkers Prev 2017;26:126–35. 16. Conti DV, Wang K, Sheng X, Bensen JT, Hazelett DJ, Cook MB, Ingles SA, Kittles RA, Strom SS, Rybicki BA, Nemesure B, Isaacs WB, et al. Two Novel Susceptibility Loci for Prostate Cancer in Men of African Ancestry. J Natl Cancer Inst 2017;109:djx084. 17. Schwartz GG, John EM, Rowland G, Ingles SA. Prostate cancer in African-American men and polymorphism in the calcium-sensing receptor. Cancer Biology & Therapy 2010;9:994–9. 18. Pike MC, Peters RK, Cozen W, Probst-Hensch NM, Wan PC, Mack TM, Felix JC. Estrogen-Progestin Replacement Therapy and Endometrial Cancer. J Natl Cancer Inst 1997;89:1110–6. 19. Strom SS, Gu Y, Zhang H, Troncoso P, Babaian RJ, Pettaway CA, Shete S, Spitz MR, Logothetis CJ. Androgen receptor polymorphisms and risk of biochemical failure among prostatectomy patients. The Prostate 60:343–51. 20. Strom SS, Yamamura Y, Flores-Sandoval FN, Pettaway CA, Lopez DS. Prostate cancer in Mexican-Americans: identification of risk factors. Prostate 2008;68:563–70. 21. Wilkinson AV, Spitz MR, Strom SS, Prokhorov AV, Barcenas CH, Cao Y, Saunders KC, Bondy ML.
Effects of nativity, age at migration, and acculturation on smoking among adult Houston residents of Mexican descent. Am J Public Health 2005;95:1043–9. 22. Beuten J, Gelfond JAL, Martinez-Fierro ML, Weldon KS, Crandall AC, Rojas-Martinez A, Thompson IM, Leach RJ. Association of chromosome 8q variants with prostate cancer risk in Caucasian and Hispanic men. Carcinogenesis 2009;30:1372–9. 23. Beuten J, Gelfond JAL, Franke JL, Weldon KS, Crandall AC, Johnson-Pais TL, Thompson IM, Leach RJ. Single and Multigenic Analysis of the Association between Variants in 12 Steroid Hormone Metabolism Genes and Risk of Prostate Cancer. Cancer Epidemiology Biomarkers & Prevention 2009;18:1869–80. 24. Hoffmann TJ, Eeden SKVD, Sakoda LC, Jorgenson E, Habel LA, Graff RE, Passarelli MN, Cario CL, Emami NC, Chao CR, Ghai NR, Shan J, et al. A Large Multiethnic Genome-Wide Association Study of Prostate Cancer Identifies Novel Risk Variants and Substantial Ethnic Differences. Cancer Discov 2015;5:878–91. 25. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904–9. 26. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, Boehnke M, Abecasis GR, Willer CJ. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 2010;26:2336–7. 27. Maples BK, Gravel S, Kenny EE, Bustamante CD. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am J Hum Genet 2013;93:278–88. 28. Schumacher FR, Olama AAA, Berndt SI, Benlloch S, Ahmed M, Saunders EJ, Dadaev T, Leongamornlert D, Anokian E, Cieza-Borrella C, Goh C, Brook MN, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nature Genetics 2018;50:928–36. 29.
Matejcic M, Saunders EJ, Dadaev T, Brook MN, Wang K, Sheng X, Olama AAA, Schumacher FR, Ingles SA, Govindasami K, Benlloch S, Berndt SI, et al. Germline variation at 8q24 and prostate cancer risk in men of European ancestry. Nature Communications 2018;9:4616. 30. Park SL, Cheng I, Haiman CA. Genome-Wide Association Studies of Cancer in Diverse Populations. Cancer Epidemiol Biomarkers Prev 2018;27:405–17. 31. Takata R, Akamatsu S, Kubo M, Takahashi A, Hosono N, Kawaguchi T, Tsunoda T, Inazawa J, Kamatani N, Ogawa O, Fujioka T, Nakamura Y, et al. Genome-wide association study identifies five new susceptibility loci for prostate cancer in the Japanese population. Nature Genetics 2010;42:751–4. 32. Guy M, Kote-Jarai Z, Giles GG, Al Olama AA, Jugurnauth SK, Mulholland S, Leongamornlert DA, Edwards SM, Morrison J, Field HI, Southey MC, Severi G, et al. Identification of new genetic risk factors for prostate cancer. Asian J Androl 2009;11:49–55. 33. Mhatre DR, Mahale SD, Khatkhatay MI, Achrekar SK, Desai SS, Jagtap DD, Dhabalia JV, Tongaonkar HB, Dandekar SP, Varadkar AM. The rs10993994 in the proximal MSMB promoter region is a functional polymorphism in Asian Indian subjects. Springerplus 2015;4:380. 34. Chang B-L, Cramer SD, Wiklund F, Isaacs SD, Stevens VL, Sun J, Smith S, Pruett K, Romero LM, Wiley KE, Kim S-T, Zhu Y, et al. Fine mapping association study and functional analysis implicate a SNP in MSMB at 10q11 as a causal variant for prostate cancer risk. Hum Mol Genet 2009;18:1368–75. 35. Dadaev T, Saunders EJ, Newcombe PJ, Anokian E, Leongamornlert DA, Brook MN, Cieza-Borrella C, Mijuskovic M, Wakerell S, Olama AAA, Schumacher FR, Berndt SI, et al. Fine-mapping of prostate cancer susceptibility loci in a large meta-analysis identifies candidate causal variants. Nature Communications 2018;9:2256. 36. Pomerantz MM, Shrestha Y, Flavin RJ, Regan MM, Penney KL, Mucci LA, Stampfer MJ, Hunter DJ, Chanock SJ, Schafer EJ, Chan JA, Tabernero J, et al.
Analysis of the 10q11 Cancer Risk Locus Implicates MSMB and NCOA4 in Human Prostate Tumorigenesis. PLOS Genetics 2010;6:e1001204. 37. Haiman CA, Stram DO, Vickers AJ, Wilkens LR, Braun K, Valtonen-André C, Peltola M, Pettersson K, Waters KM, Marchand LL, Kolonel LN, Henderson BE, et al. Levels of Beta-Microseminoprotein in Blood and Risk of Prostate Cancer in Multiple Populations. J Natl Cancer Inst 2013;105:237–43. 38. Waters KM, Stram DO, Marchand L-L, Klein RJ, Valtonen-André C, Peltola M, Kolonel LN, Henderson BE, Lilja H, Haiman CA. A common prostate cancer risk variant 5’ of MSMB (microseminoprotein-beta) is a strong predictor of circulating MSP (microseminoprotein) in multiple populations. Cancer Epidemiol Biomarkers Prev 2010;cebp.0427.2010. 39. Bryc K, Velez C, Karafet T, Moreno-Estrada A, Reynolds A, Auton A, Hammer M, Bustamante CD, Ostrer H. Genome-wide patterns of population structure and admixture among Hispanic/Latino populations. Proceedings of the National Academy of Sciences 2010;107:8954–61. 40. Sheinfeld Gorin S, Heck JE. Cancer screening among Latino subgroups in the United States. Preventive Medicine 2005;40:515–26. 41. Martinez-Tyson D, Pathak EB, Soler-Vila H, Flores AM. Looking Under the Hispanic Umbrella: Cancer Mortality Among Cubans, Mexicans, Puerto Ricans and Other Hispanics in Florida. J Immigr Minor Health 2009;11:249–57. 42. Fejerman L, John EM, Huntsman S, Beckman K, Choudhry S, Perez-Stable E, Burchard EG, Ziv E. Genetic Ancestry and Risk of Breast Cancer among U.S. Latinas. Cancer Res 2008;68:9723–8. 43. Pereira L, Zamudio R, Soares-Souza G, Herrera P, Cabrera L, Hooper CC, Cok J, Combe JM, Vargas G, Prado WA, Schneider S, Kehdy F, et al. Socioeconomic and Nutritional Factors Account for the Association of Gastric Cancer with Amerindian Ancestry in a Latin American Admixed Population. PLoS One 2012;7:e41200. 44.
Aldrich MC, Selvin S, Wrensch MR, Sison JD, Hansen HM, Quesenberry CP, Seldin MF, Barcellos LF, Buffler PA, Wiencke JK. Socioeconomic Status and Lung Cancer: Unraveling the Contribution of Genetic Admixture. Am J Public Health 2013;103:e73–80. 45. Ruiz-Narváez EA, Bare L, Arellano A, Catanese J, Campos H. West African and Amerindian ancestry and risk of myocardial infarction and metabolic syndrome in the Central Valley population of Costa Rica. Hum Genet 2010;127:629–38. 46. Al Olama AA, Kote-Jarai Z, Berndt SI, Conti DV, Schumacher F, Han Y, Benlloch S, Hazelett DJ, Wang Z, Saunders E, Leongamornlert D, Lindstrom S, et al. A meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer. Nature Genetics 2014;46:1103–9. 47. Fritsche LG, Gruber SB, Wu Z, Schmidt EM, Zawistowski M, Moser SE, Blanc VM, Brummett CM, Kheterpal S, Abecasis GR, Mukherjee B. Association of Polygenic Risk Scores for Multiple Cancers in a Phenome-wide Study: Results from The Michigan Genomics Initiative. The American Journal of Human Genetics 2018;102:1048–61. 48. Al Olama AA, Benlloch S, Antoniou AC, Giles GG, Severi G, Neal DE, Hamdy FC, Donovan JL, Muir K, Schleutker J, Henderson BE, Haiman CA, et al. Risk Analysis of Prostate Cancer in PRACTICAL, a Multinational Consortium, Using 25 Known Prostate Cancer Susceptibility Loci. Cancer Epidemiol Biomarkers Prev 2015;24:1121–9. 49. Machiela MJ, Chen C-Y, Chen C, Chanock SJ, Hunter DJ, Kraft P. Evaluation of polygenic risk scores for predicting breast and prostate cancer risk. Genet Epidemiol 2011;35:506–14.

Table 1. Associations between categorized polygenic risk scores (PRS) and prostate cancer risk in Latino men.

                        European-weighted PRSᵃ                               Latino-weighted PRSᵃ
PRS category            Cases  Controls  OR (95% CI)ᵇ         P-valueᶜ       Cases  Controls  OR (95% CI)ᵇ         P-valueᶜ
0%-1%                       5        22  0.25 (0.09, 0.67)    6.11×10⁻³          2        22  0.14 (0.03, 0.68)    1.44×10⁻²
1%-10%                     68       189  0.38 (0.28, 0.51)    1.93×10⁻¹⁰        53       189  0.32 (0.23, 0.45)    1.17×10⁻¹¹
10%-25%                   169       314  0.60 (0.48, 0.74)    2.02×10⁻⁶        150       314  0.57 (0.45, 0.71)    5.30×10⁻⁷
25%-75% (baseline)        952      1048  -                    -                835      1048  -                    -
75%-90%                   445       314  1.58 (1.33, 1.88)    2.34×10⁻⁷        540       314  2.25 (1.90, 2.67)    1.10×10⁻²⁰
90%-99%                   507       189  3.10 (2.55, 3.76)    2.37×10⁻³⁰       533       189  3.87 (3.18, 4.71)    9.57×10⁻⁴²
99%-100%                   80        22  4.02 (2.46, 6.55)    2.43×10⁻⁸        113        22  6.87 (4.27, 11.06)   2.22×10⁻¹⁵

ᵃ PRS was calculated using 176 known SNPs (MAF>0.001 and imputation score≥0.3 in Set 1 and Set 2); for the European-weighted PRS, the weights were conditional log ORs derived in men of European ancestry; for the Latino-weighted PRS, the weights were conditional log ORs derived in men of Latino ancestry (Set 1 and Set 2).
ᵇ Odds ratios (ORs) were adjusted for age, study and the first 10 principal components.
ᶜ P-values were Wald P-values from fixed-effect meta-analysis.

Table 2. Associations between categorized polygenic risk scores (PRSs) and prostate cancer risk in Latino men by European global ancestry strata.

                        European-weighted PRSᵇ                               Latino-weighted PRSᵇ
PRS category            Cases  Controls  OR (95% CI)ᶜ         P-valueᵈ       Cases  Controls  OR (95% CI)ᶜ         P-valueᵈ
European global ancestry stratumᵃ: ≤25%
0%-10%                     12        53  0.27 (0.14, 0.54)    1.64×10⁻⁴          9        53  0.21 (0.10, 0.45)    5.71×10⁻⁵
10%-25%                    30        79  0.51 (0.32, 0.83)    6.41×10⁻³         15        79  0.27 (0.15, 0.49)    2.46×10⁻⁵
25%-75%                   183       261  -                    -                174       261  -                    -
75%-90%                    77        79  1.47 (1.00, 2.15)    4.70×10⁻²         88        79  1.64 (1.12, 2.40)    1.04×10⁻²
90%-100%                  104        53  3.08 (2.05, 4.63)    5.50×10⁻⁸        120        53  3.81 (2.53, 5.73)    1.46×10⁻¹⁰
European global ancestry stratumᵃ: >75%
0%-10%                     25        53  0.43 (0.25, 0.74)    2.13×10⁻³         17        53  0.32 (0.18, 0.59)    2.25×10⁻⁴
10%-25%                    61        79  0.68 (0.46, 1.01)    5.50×10⁻²         45        79  0.59 (0.38, 0.91)    1.75×10⁻²
25%-75%                   299       262  -                    -                247       262  -                    -
75%-90%                   132        78  1.43 (1.01, 2.03)    4.60×10⁻²        192        78  2.70 (1.93, 3.78)    7.53×10⁻⁹
90%-100%                  216        53  3.68 (2.56, 5.29)    1.84×10⁻¹²       232        53  4.95 (3.43, 7.15)    1.36×10⁻¹⁷

ᵃ Strata were created by categorizing European global ancestry score according to its percentiles (≤25%, >75%) in controls.
ᵇ PRS was calculated using 176 known SNPs (MAF≥0.001 and imputation score≥0.3 in Set 1 and Set 2); for the European-weighted PRS, the weights were the conditional log ORs derived from men of European ancestry; for the Latino-weighted PRS, the weights were the conditional log ORs obtained from our Latino men (Set 1 and Set 2).
ᶜ Odds ratios (ORs) were adjusted for age, the first 10 principal components, and study.
ᵈ P-values were Wald P-values from fixed-effect meta-analyses.

Chapter Four: A Meta-Analysis of Genome-Wide Association Studies of Multiple Myeloma among African Americans
(Manuscript submitted)

Zhaohui Du¹⁺, Niels Weinhold²⁺, Gregory Chi Song³, Kristin A. Rand⁴, David J. Van Den Berg¹, Amie E. Hwang⁵, Xin Sheng¹, Victor Hom⁵, Sikander Ailawadhi⁶, Ajay K. Nooka⁷, Seema Singhal⁸, Karen Pawlish⁹, Edward S. Peters¹⁰, Cathryn Bock¹¹, Ann Mohrbacher¹², Alexander Stram¹³, Sonja I Berndt¹⁴, William J. Blot¹⁵, Graham Casey¹⁶, Victoria L. Stevens¹⁷, Rick Kittles¹⁸, Phyllis J. Goodman¹⁹, W. Ryan Diver¹⁷, Anselm Hennis²⁰, Barbara Nemesure²⁰, Eric A. Klein²¹, Benjamin A.
Rybicki²², Janet L. Stanford²³, John S. Witte²⁴, Lisa Signorello¹⁴, Esther M. John²⁵, Leslie Bernstein¹⁸, Antoinette M. Stroup²⁴, Owen W. Stephens², Maurizio Zangari², Frits Van Rhee², Andrew Olshan²⁵, Wei Zheng¹⁵, Jennifer J. Hu²⁶, Regina Ziegler¹⁴, Sarah J. Nyante²⁵, Sue Ann Ingles⁵, Michael Press²⁷, John David Carpten²⁸, Stephen Chanock¹⁴, Jayesh Mehta⁸, Graham A Colditz²⁹, Jeffrey Wolf²⁴, Thomas G. Martin²⁴, Michael Tomasson³⁰, Mark A. Fiala²⁹, Howard Terebelo³¹, Nalini Janakiraman³², Laurence Kolonel³³, Kenneth C. Anderson³⁴, Loic Le Marchand³³, Daniel Auclair³⁵, Brian C.-H. Chiu³⁶, Elad Ziv²⁴, Daniel Stram⁴, Ravi Vij²⁹, Leon Bernal-Mizrachi³⁷, Gareth J. Morgan³⁸, Jeffrey A. Zonder¹¹, Carol Ann Huff³⁹, Sagar Lonial⁷, Robert Z. Orlowski⁴⁰, David V. Conti¹*, Christopher A. Haiman¹*, and Wendy Cozen¹,²⁷*

⁺Z.D. and N.W. are co-first authors; *D.V.C., C.A.H. and W.C. are co-senior authors

Affiliations
1 Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA;
2 Myeloma Center, University of Arkansas For Medical Sciences, Little Rock, AR;
3 Millennium Pharmaceuticals Inc., Takeda Pharmaceutical Company Limited, Cambridge, MA;
4 Ancestry.com, San Francisco, CA;
5 Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA;
6 Division of Hematology-Oncology, Mayo Clinic, Jacksonville, FL;
7 Winship Cancer Institute/Hematology and Medical Oncology, Emory University, Atlanta, GA;
8 Feinberg School of Medicine, Northwestern University, Chicago, IL;
9 New Jersey Department of Health, Trenton, NJ;
10 Louisiana State University School of Public Health, New Orleans, LA;
11 Karmanos Cancer Center, Wayne State University, Detroit, MI;
12 Department of Medicine, Division of Hematology, University of Southern California, Los Angeles, CA;
13 Genomic Health, Inc., Redwood City, CA;
14 National Cancer Institute, Division of Cancer Genetics and Epidemiology; NIH, DHHS, Bethesda, MD;
15 Vanderbilt University, Nashville, TN;
16 University of Virginia, University of Virginia School of Medicine, Charlottesville, VA;
17 American Cancer Society, Atlanta, GA;
18 City of Hope National Medical Center, Duarte, CA;
19 SWOG Statistical Center, Seattle, WA;
20 Stony Brook University, Stony Brook, NY;
21 Cleveland Clinic Foundation, Cleveland, OH;
22 Henry Ford Hospital, Detroit, MI;
23 Fred Hutchinson Cancer Center, Seattle, WA;
24 University of California at San Francisco, San Francisco, CA;
25 Stanford University School of Medicine, Stanford, CA;
24 Rutgers University, New Brunswick, NJ;
25 University of North Carolina, Chapel Hill, NC;
26 University of Miami Miller School of Medicine, Miami, FL;
27 Department of Pathology, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA;
28 Center for Translational Genomics, Department of Translational Genomics, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA;
29 Division of Oncology, Washington University School of Medicine, Saint Louis, MO;
30 University of Iowa, Iowa City, IA;
31 Providence Hospital, Southfield, MI;
32 Division of Hematology-Oncology, Henry Ford Hospital, Detroit, MI;
33 University of Hawaii Cancer Center, Honolulu, HI;
34 J. Lipper Cancer Center for Multiple Myeloma, Dana Farber Cancer Institute, Harvard University, Boston, MA;
35 Multiple Myeloma Research Foundation, Norwalk, CT;
36 Department of Public Health Sciences, University of Chicago, Chicago, IL;
37 Grady Memorial Hospital, Emory University, Atlanta, GA;
38 Myeloma Centre, New York University, NY;
39 Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD;
40 Lymphoma and Myeloma, University of Texas MD Anderson Cancer Center, Houston, TX

Corresponding author: Wendy Cozen, Genetic Epidemiology Center, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA; Email: wcozen@med.usc.edu

Text word count: 3987; Abstract word count: 249; Number of tables: 2; Number of figures: 2; Number of references: 39

Key points
1. African Americans with a polygenic risk score in the top 10% have a 60-80% increased risk of multiple myeloma compared to those with an average score.
2. Common genetic variation contributes to the risk of multiple myeloma in African Americans.

4.1 Abstract
Persons of African ancestry (AA) have a two-fold higher risk of multiple myeloma (MM) compared to whites. A genetic contribution to MM etiology in individuals of European ancestry (EA) is supported by genome-wide association studies (GWAS). Little is known about genetic risk factors for MM in individuals of AA. We performed a GWAS of MM in 1,813 cases and 8,871 controls and conducted an admixture mapping scan to identify risk alleles. We fine-mapped the 23 known susceptibility loci to find markers that could better capture MM risk in individuals of AA and constructed a polygenic risk score (PRS) to assess the aggregated effect of known MM risk alleles. In GWAS analysis, we identified two suggestive novel loci located at 9p24.3 and 9p13.1 at P<1×10⁻⁶; however, no genome-wide significant association was noted.
In admixture mapping, we observed a genome-wide significant inverse association between local AA at 2p24.1-23.1 and MM risk in AA individuals. Of the 23 known EA risk variants, 20 showed directional consistency and 9 replicated at P<0.05 in AA individuals. In eight regions, we identified markers that better capture MM risk in persons of AA. AA individuals with a PRS in the top 10% had a 1.82-fold (95%CI: 1.56, 2.11) increased MM risk compared to those with average risk (25-75%). The strongest functional association was between the risk allele for variant rs56219066 at 5q15 and lower ELL2 expression (P=5.1×10⁻¹²). Our study shows that common genetic variation contributes to MM risk in individuals of AA.

Keywords: Multiple Myeloma, GWAS, African Americans, Admixture Mapping

4.2 Introduction
Multiple myeloma (MM) originates from a malignant clone of plasma cells, the terminally differentiated B-lymphocytes that produce antibodies upon antigen recognition. It is the second most common hematologic malignancy in the U.S., with approximately 160,000 new cases in 2018¹, and remains largely incurable, with a 50% 5-year survival rate. Older age, male sex, African ancestry (AA), family history, and obesity, especially in young adulthood, are factors consistently associated with MM risk²⁻⁴. In the U.S., the incidence rate of MM is twice as high in men and women of AA compared to those of European ancestry (EA), for unknown reasons². Case reports of familial clustering⁵ and a 2-3-fold increased risk among first-degree relatives⁶,⁷ suggest a genetic contribution to the etiology of MM. We previously showed that 5 of 8 risk loci identified in persons of EA also contribute to risk in persons of AA⁸⁻¹⁰. Fifteen new MM risk loci have been identified in EA populations¹¹⁻¹³, but these have not yet been examined in populations of AA. Moreover, no genome-wide significant associations have been identified that explain MM risk specifically for AA individuals.
Here, we added a second GWAS in a meta-analysis, for a total of 1,813 cases and 8,871 controls, to assess the association between common genetic variation and MM risk in the largest study of cases and controls of AA conducted to date.

4.3 Methods
All studies contributing DNA samples had approval from their Institutional Review Boards according to the Declaration of Helsinki Ethical Principles for Medical Research for Human Subjects (1964). Signed informed consent was obtained from all participants at the time of specimen collection.

Study Participants, Genotyping, and Quality Control
There were two sets of study participants. Set 1 comprised a GWAS case-control study that was previously conducted in 1,179 AA MM patients identified from 11 National Cancer Institute (NCI) comprehensive cancer centers and non-profit hospitals and 4 NCI Surveillance, Epidemiology, and End Results (SEER) cancer registries participating in the African American Multiple Myeloma Study (AAMMS) in 2010-2015. In addition, DNA samples from the Multiethnic Cohort (MEC) (n=43), the University of California at San Francisco (UCSF) study (n=32), and the Multiple Myeloma Research Consortium (MMRC) (n=84) were included (total n=1,338 cases), as described elsewhere⁸. Controls were AA subjects unaffected with MM who had existing GWAS data, including 2,631 female controls from the African Ancestry Breast Cancer Consortium (AABC)¹⁴ and 4,447 male controls from the African Ancestry Prostate Cancer Consortium (AAPC)¹⁵. Cases were genotyped using the Illumina HumanCore GWAS array, while controls were genotyped using the Illumina 1M-Duo. Quality control (QC) measures for cases and controls were conducted separately. Cases with call rate <0.98, unexpected replicates, 1st or 2nd degree relatives, and those who were sex discordant based on X chromosome genotypes were excluded⁸.
Single nucleotide polymorphisms (SNPs) with a call rate <0.98 or replicate concordance <1 based on 100 QC replicate samples were removed. QC procedures among controls are reported in previous publications¹⁴,¹⁵. Only SNPs that passed QC measures and were directly genotyped in both cases and controls were included in the imputation (n=188,376; discussed below).

Set 2 consisted of DNA samples from 421 AA MM patients from the Myeloma Center, University of Arkansas For Medical Sciences (UAMS) and 132 additional samples from continued enrollment in the AAMMS study in 2016 (primarily from the MD Anderson Cancer Center) (total n=553 cases). Controls were 2,398 unaffected AA participants from the MEC with existing GWAS data. Both cases and controls in Set 2 were genotyped using the Illumina MEGA array (total genotyped SNPs ~1.7M). Cases and controls were subjected to the same QC procedures. One case and one control were removed because of cross-set replication. Additional individuals were removed based on call rate <0.95 (n=3), sex mismatch (n=13), ancestry outlier status (estimated African global ancestry <0.1 by STRUCTURE¹⁶, n=4), and membership in a 1st degree relative pair, from which one individual was removed (estimated identity-by-descent (IBD) sharing ≥0.375 by PLINK, n=10), leaving 529 cases and 2,389 controls for imputation. Control data had previously been cleaned together with the NHGRI PAGE Consortium (PAGE) samples¹⁷; thus, we first excluded SNPs that were of low quality in PAGE. Further SNP QC included removing monomorphic SNPs, variants with a call rate <0.98, variants with replication concordance <1 based on 10 replicate pairs in cases, variants with cross-platform replication concordance <100% based on 8 replicate pairs in cases and controls, and SNPs with poor clustering by visual inspection. Additional removal criteria included SNPs with a minor allele frequency (MAF) that deviated from that of the African (AFR) individuals in the phase III 1000 Genomes Project (1KGP) data, and INDELs not identified within 1KGP.
SNPs found in both cases and controls were used for imputation (n=1,046,801).

Shared IBD for Set 1 and Set 2 was calculated using PLINK to remove duplicate samples across sets. We excluded 21 cases and 571 controls from Set 1 that were included in Set 2, leaving a final analysis sample size for Set 1 of 7,766 (1,284 cases, 6,482 controls). Genotyping, QC and imputation procedures are described in Supplementary Figure S1.

Imputation
For this analysis, Set 1 and Set 2 data were imputed using the same protocol, as follows: genotype data were first phased using SHAPEIT¹⁸, and then imputed to a cosmopolitan reference panel from the Haplotype Reference Consortium (HRC) release 1 (n=32,488 in total; n=661 AFR)¹⁹ using the Michigan Imputation Server. We excluded SNPs with imputation quality score r²<0.5 and MAF<0.01 in each dataset, leaving a total of 12,683,648 overlapping SNPs in Set 1 and Set 2 for statistical analyses.

Statistical Analyses
GWAS Analysis: Principal components (PCs) were calculated using EIGENSTRAT²⁰. Risk allele frequencies (RAF) were calculated by taking the average frequencies in Set 1 and Set 2 controls. Per-allele odds ratios (ORs) and standard errors were estimated using unconditional logistic regression, adjusted for age, sex, and the first 10 PCs, separately for Set 1 and Set 2. A fixed-effect meta-analysis with inverse variance weights was used to obtain the combined effect for each SNP. The genome-wide significance level was α=5.0×10⁻⁸.

Admixture analysis: We estimated local ancestry separately in Set 1 and Set 2 with RFMix v1.5.4²¹, using European and African populations in phase I 1KGP as reference. Genotyped SNPs that passed our QC and were present in both sets were included in the analysis. We calculated individual global ancestry by averaging local African ancestry values across the 22 autosomal chromosomes.
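The fixed-effect, inverse-variance-weighted meta-analysis used to combine the Set 1 and Set 2 estimates under GWAS Analysis can be illustrated with a minimal sketch. The function name and the per-set log odds ratios and standard errors below are hypothetical placeholders, not values or code from this study; the sketch assumes each set contributes a log OR and its standard error from the logistic regression.

```python
import math

def fixed_effect_meta(betas, ses):
    """Combine per-set log odds ratios using inverse-variance weights.

    betas: per-set log odds ratios; ses: their standard errors.
    Returns the pooled log OR, its standard error, the Wald z statistic
    and the two-sided normal P-value.
    """
    weights = [1.0 / se ** 2 for se in ses]           # w_k = 1 / SE_k^2
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))            # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p

# Hypothetical estimates for one SNP in two sets (illustration only).
beta, se, z, p = fixed_effect_meta([0.20, 0.26], [0.05, 0.08])
print(f"pooled OR = {math.exp(beta):.3f}, SE = {se:.4f}, P = {p:.2e}")
```

Because the weights are the inverse variances, the larger set (smaller standard error) dominates the pooled estimate, which is the intended behavior when one set contributes most of the cases.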
For local ancestry at each locus, we performed case-only and case-control analyses²² using regression models adjusted for age and sex, separately in the two sets. The case-only analyses compared local ancestry with global ancestry, while in the case-control analyses we tested whether the average deviation of local ancestry from global ancestry was the same between cases and controls. A fixed-effect meta-analysis was then conducted, with P<1×10⁻⁵ defining genome-wide significance. Contiguous regions (adjacent regions with P<1×10⁻⁴) that were significant in both case-only and case-control comparisons were considered to be suggestive risk regions for MM. To identify the set of independent SNPs within the local ancestry signal region, we selected SNPs with marginal P-values<0.001 and conducted forward-selection logistic regression using an inclusion criterion of 0.001, adjusting for age, sex and global AFR ancestry. To examine whether the detected local ancestry signal could be explained by variant dosage, we compared the marginal and conditional P-values and the percent changes in the OR for local ancestry using logistic regression adjusting for age, sex and the first 10 PCs, with and without additional adjustment for allele dosage. We adjusted for both the known risk variants and the independent risk SNPs within the local ancestry signal region.

Association Testing of Known Risk Regions: We were able to directly genotype or impute 23 known MM risk variants⁹⁻¹³, but excluded variant rs34229995 because it is uncommon in the AA population (<1%); thus, the following analyses of known risk regions included 22 of the 23 reported EA risk alleles. Directional consistency of effect was defined as an OR in the AA MM meta-analysis that was in the same direction of effect (i.e., >1) as that reported in the EA population. A nominal P-value of 0.05 was used to determine statistical significance.
We also examined the t(11;14) translocation MM-specific risk allele rs603965 at 11q13.3 in a pooled subset of our AA cases with that translocation (102 cases from Set 1 and 45 cases from Set 2), with the full set of controls, using logistic regression adjusting for global AFR ancestry, age, sex, and set. We used the 22 reported EA risk alleles as index markers to search for other markers within a ±250kb region that could better capture MM risk in the AA participants (defined as a “better AA marker”). In order to control measurement error, we retained only those SNPs with high imputation quality (r²≥0.8). In each region, we examined SNPs with pairwise correlation (r²) ≥0.2 with the index variant in the EA population of 1KGP (EUR 1KGP), as these are likely to capture the functional allele. To reduce false-positive associations, only SNPs with a P-value at least one order of magnitude smaller than that of the index variant were defined as a putative “better” marker of risk in AA individuals. A secondary marker within the known risk region was defined as a marker that had weak correlation (r²<0.2) with the index SNP among EUR and AFR 1KGP and a P-value <1×10⁻⁶ after conditioning on the index SNP.

Polygenic risk score analysis: The aggregate effect of known risk alleles was examined using a weighted polygenic risk score, $\mathrm{PRS}_i = \sum_{m=1}^{C} w_m x_{im}$, for each individual. Here $x_{im}$ is the risk allele dosage for individual $i$ at SNP $m$; $C$ is the number of risk SNPs at the 22 known MM susceptibility loci; and $w_m$ is the weight for SNP $m$. We explored two sets of weights for the index SNPs: the marginal log odds ratios (logORs) published from EA populations, and the logORs in our AA MM meta-analysis. We also substituted better AA markers for 8 index SNPs and used weights of AA MM logORs.
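As an illustration of the weighted score defined above, the following minimal sketch computes a PRS for one individual as the weighted sum of risk allele dosages. The function name, the three weights and the dosages are hypothetical and chosen only for illustration; in the analysis described here the weights would be the logORs for the 22 index SNPs (or the better AA markers).

```python
import math

def polygenic_risk_score(dosages, weights):
    """Return sum_m w_m * x_im for a single individual.

    dosages: risk allele dosages x_im (0 to 2, possibly fractional if imputed).
    weights: per-SNP weights w_m (log odds ratios), in the same SNP order.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same SNPs")
    return sum(w * x for w, x in zip(weights, dosages))

# Hypothetical weights (log ORs) and dosages for three SNPs.
example_weights = [math.log(1.20), math.log(1.10), math.log(1.35)]
example_dosages = [2.0, 0.93, 1.0]
print(round(polygenic_risk_score(example_dosages, example_weights), 4))
```

Using log odds ratios as weights means that a difference of one unit in the score corresponds to a one-unit difference in the log odds implied by the source effect estimates, under the assumption that the SNP effects are independent and additive on the log-odds scale.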
The risk score in each set was categorized according to its percentile (<10%, 10-25%, 25-75%, 75-90%, ≥90%), and the risk associated with each category was estimated relative to the interquartile range (25-75%) using logistic regression adjusting for the first 10 principal components, age and sex. The PRS was constructed separately for each set and the results were combined in a fixed-effect meta-analysis. We also tested the interaction between age at diagnosis and PRS.

Expression Quantitative Trait Loci (eQTL) analysis: We performed an eQTL analysis using Affymetrix Human Genome U133 2.0 Plus Array data for CD138-positive plasma cells isolated from the bone marrow of 292 UAMS patients, as recently described²³. Briefly, the expression data were pre-processed and the probabilistic estimation of expression residuals (PEER) method was applied to estimate non-genetic hidden confounders²⁴. Linear regression was used to test the association between the genotype of risk variants and the expression of genes within 1 Mb, adjusting for PEER factors. Risk variants including suggestive novel signals, known markers identified from previous EA studies and better AA markers were examined in the eQTL analysis. A conservative Bonferroni-corrected P-value threshold of 0.0001 was used to determine statistical significance after correcting for 509 total tests.

4.4 Results
The GWAS meta-analysis indicated no evidence of over-dispersion (λ=1.03). We did not observe any genome-wide significant association (P<5×10⁻⁸) between common alleles and MM risk (Supplementary Figure S2; Supplementary Table S1). Seven regions harbored signals with a P-value<1×10⁻⁶; however, the signal in four of them disappeared when a more stringent criterion (r²≥0.8) was used to filter imputed SNPs (Supplementary Table S1).
The two remaining suggestive novel risk alleles were located at 9p24.3 (rs13296848, OR=1.25, Wald-P=3.44×10⁻⁷, RAF=0.33 in cases and 0.28 in controls) and 9p13.1 (rs7034061, OR=1.32, Wald-P=9.17×10⁻⁷, RAF=0.15 in cases and 0.13 in controls) (Supplementary Figure S3).

In admixture mapping analysis, we found a region on chromosome 2, ranging from 23.1 to 29.8 Mb (2p24.1-23.1) and covering the known risk region at 2p23.3, where lower local African ancestry was statistically significantly associated with MM risk in both case-only and case-control comparisons (P<1×10⁻⁵) (Supplementary Figure S4; Supplementary Table S2). The strongest local ancestry-MM association was observed at 28.8-29.2 Mb (2p23.2, OR per African chromosome=0.79, 95% confidence interval (CI): 0.72-0.88, P=9.4×10⁻⁶). The known risk allele rs6746082 in that region, which is more common in the EA (RAF=0.79) than in the AA population (RAF=0.56), was not significantly associated with MM risk in AA individuals (Table 1). When conditioning on rs6746082, the local ancestry-MM association became only slightly less significant, implying that the known risk allele does not explain the admixture signal. However, when conditioning on our better AA MM marker rs10180663 (identified in fine-mapping), the signal was attenuated by more than 2 orders of magnitude. In forward-selection logistic regression, three independent SNPs, rs10180663, rs10169985 and rs6734496, were identified (conditional P-value <0.001) as associated with MM risk. When adjusting for these three SNPs, the admixture signal was no longer even nominally significant (OR per African chromosome=0.94, 95% CI: 0.83-1.06, P=0.3) (Supplementary Figures S5 and S6).

Next, we focused on the 22 known MM risk variants previously identified in EA populations with MAF ≥0.01 in AA individuals to assess their generalizability in the AA population (Figure 1; Table 1).
Our study had 80% statistical power to detect the reported effect sizes at a significance level of α=0.05 for 18 of the 22 loci (Table 1). Directional consistency was noted for 20 variants, 9 of which were nominally significantly associated with MM risk (P<0.05; Table 1). The average effect size for the AA population (OR_AA=1.09) was significantly smaller (P=1.7×10⁻⁴, t-test) than the reported values for the EA population (OR_EA=1.20). The average RAF was slightly larger in the AA controls, but this difference was not statistically significant (RAF_AA=0.463, RAF_EA=0.451, P=0.87, t-test); however, five alleles had RAFs that differed greatly between the two populations (>0.2) (Table 1; Figure 1). Two alleles, rs6746082 at 2p23.3 and rs2811710 at 9p21.3, were more common among persons of EA, and three alleles, rs1052501 at 3p22.1, rs4487645 at 7p15.3, and rs1948915 at 8q24.21, were more common among persons of AA.

In fine-mapping of the 22 risk regions, 8 were found to harbor a better marker for MM risk in persons of AA than the index variant by the criteria stated in the Methods (Table 1; Figure 2; Supplementary Figure S7). We found no significant evidence of secondary association signals in these regions in persons of AA. The association between rs603965 and t(11;14) MM risk replicated in the subset of our AA cases with translocation information at the nominal significance level (OR=2.04, 95% CI: 1.41, 2.95; P=1.4×10⁻⁴).

In PRS analyses with weights (i.e. log-ORs) from studies in EA populations, AA individuals in the top 10% PRS stratum had a 1.61-fold (95% CI: 1.38, 1.88; P=1.4×10⁻⁹) increased MM risk compared to those with average risk (PRS in the 25th-75th percentiles).
Using weights from our AA MM meta-analysis, this OR was 1.66 (95% CI: 1.43, 1.94; P=7.5×10⁻¹¹), and, when substituting the 8 index EA SNPs with their corresponding better AA markers, this association became slightly stronger, with an OR of 1.82 (95% CI: 1.56, 2.11; P=9.4×10⁻¹⁵) (Table 2). We did not detect any significant interaction between PRS and age at diagnosis on MM risk (data not shown).

In an eQTL analysis (Supplementary Table S3), of the two suggestive novel variants from the GWAS meta-analysis, only rs7034061 at 9p13.1 was found to be marginally associated (P=0.046) with nearby gene expression (EXOSC3); however, it did not remain significant after correcting for multiple comparisons. Of the 22 known risk variants, the strongest association was observed for risk variant rs56219066 at 5q15, with the risk allele being associated with lower ELL2 expression (P=5.1×10⁻¹²). We also identified a significant association between the risk allele of rs2790457 at 10p12.1 and decreased expression of WAC (P=2.29×10⁻¹¹). Both eQTLs were recently demonstrated in malignant plasma cells from EA patients 13,25 . The recently identified association between rs4487645 at 7p15.3 and CDCA7L expression in EA patients 23 was also evident at a borderline statistically significant level in malignant plasma cells from AA patients (P=0.067). Of the 8 better AA MM markers, three showed nominally significant associations with gene expression that were not significant after correction for multiple testing: rs10180663 at 2p23.3 with HADHA, rs9290375 at 3q26.2 with GPR160, and rs879882 at 6p21.33 with VARS2 and HCG27.

We further explored the potential overlap of the suggestive novel signals and their correlated SNPs (r² >0.2 in AFR 1KGP) with genome regulatory domains and eQTL regions using publicly available databases accessed through HaploReg 26 , the UCSC genome browser 27 and the GTEx portal 28 .
Neither of the two suggestive novel risk alleles showed overlap with regulatory elements. However, two correlated variants of rs13296848 (rs7854502, r²=0.58; rs13285101, r²=0.51; Supplementary Figure S8) displayed enrichment of promoter and enhancer histone marks in multiple tissues 29 ; and both were associated with KANK1 expression in EBV-transformed lymphocytes, spleen, and whole blood 28 .

4.5 Discussion

In this largest genetic study of MM in AA individuals to date, we did not identify any novel locus for MM risk through GWAS or admixture mapping. Of the 22 reported EA MM risk alleles that we were able to examine, 20 were directionally consistent and 9 achieved nominal statistical significance in AA individuals, suggesting a shared underlying risk variant in these regions. Though most of the reported risk alleles had a modest association (OR<1.2) with MM risk in AA individuals, in aggregate, those in the top 10% risk stratum had a 1.6- to 1.8-fold increase in MM risk compared to the population average risk (25th-75th percentile of the PRS distribution). Overall, the effect sizes among AAs were smaller than in the corresponding discovery reports among EAs. Possible explanations for the smaller effect sizes include bias caused by “winner’s curse” in the EA discovery set, random error from sampling across different studies, modification by environmental factors, and differences across ethnic groups in the LD structure between the index risk alleles and the functional causal SNPs. Even though our study had sufficient statistical power (>80%) to replicate 18 of the risk alleles at a significance level of 0.05, more than half of them did not achieve nominal significance, implying that the index SNP in EA populations may not be a valid proxy for the causal variant in AA populations, and thus that fine-mapping of those risk regions in persons of AA is warranted. Our previous study identified 5 better AA markers among 7 of the reported EA MM risk variants 8 .
Here, we identified better AA markers in 8 of the 22 regions we examined. Our current study had greater power to characterize AA MM risk variants owing to both improved imputation coverage from the HRC reference panel and an increased AA sample size. Notably, we identified a better marker, rs10180663, in 2p23.3 with a much stronger association in the AA population than in our previous study 8 . In this region, previous GWAS in the EA population have reported two MM risk variants, rs6746082 9 and rs7577599 11 , and a pleiotropic risk variant, rs6546149, for B-cell malignancies including chronic lymphocytic leukemia, Hodgkin lymphoma and MM 30 . Our better AA marker rs10180663 is correlated with all three aforementioned risk variants in the EA population (r² >0.25 in EUR 1KGP) and displayed the strongest association among AA populations. Variant rs10180663 was also previously identified as a suggestive risk allele for the t(11;14)(q13;q32) MM molecular subtype in the EA population (P=2×10⁻⁶) 31 . Moreover, it overlapped with H3K4Me1 enrichment in hematopoietic stem cells and B cells 26 and was predicted to be located in an enhancer region 29 , suggesting that it might be a better surrogate for the causal MM variant in the 2p23.3 region in AA individuals.

Although no genome-wide significant novel variants were found, we did observe a suggestive association with rs13296848, located at 9p24.3, with a risk allele (C) frequency of 0.33 among AA cases and 0.28 among AA controls. This variant is located in an intron of the KN motif and ankyrin repeat domains 1 (KANK1) gene, a candidate tumor suppressor for renal cell carcinoma 32 . KANK1 may have a role in KPβ-associated thrombocythemia 33 , which was found to be occasionally associated with MM 34 . The correlated alleles of rs13296848 displayed enrichment of promoter and enhancer histone marks and were associated with KANK1 expression in EBV-transformed lymphocytes, spleen, and whole blood 28 .
The other suggestive novel MM risk allele, rs7034061, was located 19 kb 5’ of the gene insulin-like growth factor binding protein like 1 (IGFBPL1), whose epigenetic inactivation was reported to be involved in breast cancer pathogenesis 35 . Moreover, serum fasting IGFBP1 concentration was associated with MM risk in a nested case-control study 36 . However, neither overlap with genome regulatory elements nor IGFBPL1 expression enrichment was detected for rs7034061 or its correlated SNPs in publicly available databases. Fine-mapping in larger samples of AA cases and controls is needed to determine whether this region harbors a true risk variant for MM.

Our study is the first to examine the association between local ancestry and MM risk. Compared to GWAS, admixture mapping has enhanced power (fewer tests requiring correction) but a lower resolution for discovering disease risk regions 37 . We found that the level of local African ancestry in a continuous region ranging from 2p24.1 to 2p23.1, which covered the known risk region 2p23.3, was inversely associated with MM risk (P<1×10⁻⁵). The signal could not be explained by the known risk marker rs6746082 but was completely explained by conditioning on the three independent SNPs, including rs10180663. Thus, the previously noted signal in this region was detected in a different population using admixture mapping. Unlike prostate cancer, where variants at a single locus (8q24) were discovered by admixture mapping in African American men 38 , we did not identify a single locus that explains the excess risk of MM in AA individuals, although our study was well-powered (80% power to detect a single locus with an OR of 1.7). It is possible that multiple loci with small effects, rather than a single region with a large effect, contribute to MM risk in AA individuals, which would require a much larger sample size than we have here.
It is also possible that there is etiologic heterogeneity by molecular subtype, and that by combining subtypes, we diluted the signal.

From the gene expression studies in malignant plasma cells from AA MM patients, the only risk variants that significantly impacted gene expression after correcting for multiple tests were rs2790457 at 10p12.1 (WAC) and rs1423269 at 5q15 (ELL2), two eQTLs that were recently shown in EA MM patients 11,25 . Thus, the functional basis of the majority of MM risk variants remains elusive, and further studies, perhaps in circulating B-memory lymphocytes from healthy individuals, are required to understand the underlying biology.

A possible limitation of our study was that the AA MM cases and controls in Set 1 were genotyped on different platforms. To minimize this bias, we conducted stringent pre- and post-imputation quality control, including eliminating SNPs with low cross-platform replicate concordance and imputing to the same reference panel. Moreover, because MM is characterized clinically by molecular subgroups 39 , it is possible that genetic susceptibility varies across these subtypes 12,31 . The prevalence of each molecular subtype is <50% and we did not have complete records of clinical molecular subtypes in all patients; thus, we could not evaluate subtype-specific associations. Though we had adequate power to examine most of the known risk variants at P<0.05, the sample size was still not large enough for discovery of novel AA-specific genetic markers at the genome-wide significance level. Obtaining patient samples for such an uncommon cancer among a minority population is challenging. Nevertheless, this is the largest study in AA individuals to date examining associations between common genetic variation and MM risk, and our results show a role for genetic susceptibility in AA MM.
The PRS comprising GWAS-identified risk variants predicted a 1.6- to 1.8-fold and a ~2-fold increase in MM risk among the AA individuals in the top 10% and top 1% of the PRS, respectively (data not shown). This effect is smaller than that in the EA population, in which the PRS could identify the 1% of EA individuals with a ~3-fold increased MM risk 13 . Thus, further studies with larger sample sizes are necessary to capture the additional genetic risk factors that likely exist in this high-risk population.

Funding: National Cancer Institute (NCI) at the National Institutes of Health (NIH) supported this work (1R01CA134786 to WC and CAH, NIH/NCI, R01CA84979 to SAI, 5U01CA164973 to LLM, and CAH, P30CA01409 and P50CA100707 to KCA, R21 CA1918896 to EZ, R01CA092447 to LS).

Acknowledgement: The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The collection of incident MM patients used in this publication was supported by the California Department of Health Services as part of the statewide cancer reporting program mandated by California Health and Safety Code Section 103885 and by a grant from the Centers for Disease Control and Prevention to support cancer patient registration and ascertainment (1U58DP000807-01). In addition, patient identification was made possible by federal funds from the NCI Surveillance Epidemiology and End Results Population-based Registry Program, NIH, Department of Health and Human Services. Genotyping of cases and some controls was performed at the USC Norris Comprehensive Cancer Center Genomics Core supported by the NCI Comprehensive Cancer Center Core grant P30CA014089. Patient accrual and sample processing at Johns Hopkins Medical Center was supported by the Sidney Kimmel Comprehensive Cancer core CA006973.

Authorship Contribution: W.C., D.V.C. and C.A.H. conceived the study design, supported analyses and critically reviewed and edited the manuscript; Z.D.
performed genetic analyses and wrote the manuscript; N.W. performed eQTL analyses and co-wrote the manuscript; C.S. and K.R. performed genetic analysis and quality control; A.H. and X.S. managed data and performed genetic quality control; V.H. and A.S. imputed genetic data; S.A., A.K.N., S.S., A.M., J.M., G.A.C., J.W., T.G.M., M.T., M.A.F., H.T., N.J., L.K., L.L.M., D.A., E.V., B.C.C., R.V., L.B-M., G.J.M., J.A.Z., C.A.H., S.L. and R.Z.O. provided patient samples and clinical data, along with input on biological and clinical aspects of myeloma and molecular subtypes for the study; K.P., E.S.P., A.S. and C.B. provided patient samples and demographic data and input on epidemiologic aspects of the study; K.C.A. provided input on etiologic and molecular aspects of myeloma; D.S. and E.Z. provided additional input on genetic ancestry analysis and admixture mapping; N.W., O.W.S., M.Z. and F.V.R. conducted the eQTL analysis; S.I.B., W.J.B., G.C., V.L.S., R.K., P.J.G., W.R.D., A.H., B.N., E.A.K., B.A.R., J.L.S., J.S.W., L.S., E.M.J., L.B., A.O., W.Z., J.J.H., R.Z., S.J.N., E.B., S.A.I., M.P., J.D.C. and S.C. contributed GWAS data from controls and provided guidance on genetic analysis methods. All authors provided important intellectual content and contributed to reviewing and editing the manuscript.

Conflict-of-interest disclosure: The following authors have disclosures: C.A. Huff (Consulting for Karyopharm, Sanofi, MiDiagnostics; Member of Safety Monitoring Board for Johnson and Johnson); T. Martin (Consultant for Roche and Juno; Research funding from Amgen, Sanofi, Seattle Genetics); J. Mehta (Speakers' Bureau of Millennium/Takeda, Celgene and owns stock in Celgene, Bristol-Myers Squibb and Bluebird); S. Singhal (Speakers' Bureau of Millennium/Takeda, Celgene, Janssen and owns stock in Celgene, Bristol-Myers Squibb and Bluebird); S. Lonial (Consultant for Janssen, Takeda, Celgene, Novartis, BMS, Merck and GlaxoSmithKline; Research funding from Celgene, Takeda and Janssen); K. C.
Anderson (Celgene, Janssen, Bristol-Myers Squibb and Sanofi; Scientific founder of Oncopep and C4 Therapeutics); S. Ailawadhi (Consultant for Novartis, Amgen, Takeda; Research funding from Pharmacyclics); A.J. Nooka (Consultant for Amgen, Novartis, Spectrum and Adaptive Biotechnologies); R. Vij (Honoraria and Research Funding from Takeda, Amgen; Honoraria from Celgene, Bristol-Myers Squibb, Janssen, Abbvie, Jazz, Konypharma); J. A. Zonder (Consultant for Prothena, Janssen; Consultant and Research Funding from BMS, Celgene, Takeda; Member of Data Safety Monitoring Committee for Pharmacyclics); G. J. Morgan (Consultant, Research Funding, Honoraria from Celgene; Consultant for and Honoraria from Takeda and Bristol-Myers Squibb); R. Z. Orlowski (Member of Advisory Board for Amgen, Inc., Celgene Corporation, Forma Therapeutics, GSK Biologicals, Ionis Pharmaceuticals, Janssen Biotech, Juno Therapeutics, Kite Pharma, Legend Biotech, Sanofi-Aventis, Servier, Takeda Pharmaceuticals; Consultant for Molecular Partners; Research Funding from BioTheryX). Kristin A. Rand was a postdoctoral fellow at the Keck School of Medicine at USC when she performed the work for this study; she is now employed at Ancestry.com. George Chi Song was a doctoral student when he performed the work for this study and he is now employed at Millennium Pharmaceuticals Inc., a subsidiary of Takeda Pharmaceutical Company Limited.

4.6 References

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2018. CA. Cancer J. Clin. 2018;68(1):7–30.
2. Gebregziabher M, Bernstein L, Wang Y, Cozen W. Risk patterns of multiple myeloma in Los Angeles County, 1972-1999 (United States). Cancer Causes Control. 2006;17(7):931–938.
3. Sonderman JS, Bethea TN, Kitahara CM, et al. Multiple Myeloma Mortality in Relation to Obesity Among African Americans. JNCI J. Natl. Cancer Inst. 2016;108(10):.
4. Hofmann JN, Moore SC, Lim U, et al.
Body Mass Index and Physical Activity at Different Ages and Risk of Multiple Myeloma in the NIH-AARP Diet and Health Study. Am. J. Epidemiol. 2013;177(8):776–786.
5. Grufferman S, Cohen HJ, Delzell ES, et al. Familial aggregation of multiple myeloma and central nervous system diseases. J. Am. Geriatr. Soc. 1989;37(4):303–309.
6. VanValkenburg ME, Pruitt GI, Brill IK, et al. Family history of hematologic malignancies and risk of multiple myeloma: differences by race and clinical features. Cancer Causes Control. 2016;27:81–91.
7. Kristinsson SY, Björkholm M, Goldin LR, et al. Patterns of hematologic malignancies and solid tumors among 37,838 first-degree relatives of 13,896 multiple myeloma patients in Sweden. Int. J. Cancer J. Int. Cancer. 2009;125(9):2147–2150.
8. Rand KA, Song C, Dean E, et al. A Meta-analysis of Multiple Myeloma Risk Regions in African and European Ancestry Populations Identifies Putatively Functional Loci. Cancer Epidemiol. Biomark. Prev. Publ. Am. Assoc. Cancer Res. Cosponsored Am. Soc. Prev. Oncol. 2016;25(12):1609–1618.
9. Broderick P, Chubb D, Johnson DC, et al. Common variation at 3p22.1 and 7p15.3 influences multiple myeloma risk. Nat. Genet. 2011;44(1):58–61.
10. Chubb D, Weinhold N, Broderick P, et al. Common variation at 3q26.2, 6p21.33, 17p11.2 and 22q13.1 influences multiple myeloma risk. Nat. Genet. 2013;45(10):1221–1225.
11. Mitchell JS, Li N, Weinhold N, et al. Genome-wide association study identifies multiple susceptibility loci for multiple myeloma. Nat. Commun. 2016;7:.
12. Swaminathan B, Thorleifsson G, Jöud M, et al. Variants in ELL2 influencing immunoglobulin levels associate with multiple myeloma. Nat. Commun. 2015;6:.
13. Went M, Sud A, Försti A, et al. Identification of multiple risk loci and regulatory mechanisms influencing susceptibility to multiple myeloma. Nat. Commun. 2018;9(1):3707.
14. Feng Y, Stram DO, Rhie SK, et al. A comprehensive examination of breast cancer risk loci in African American women. Hum. Mol.
Genet. 2014;23(20):5518–5526.
15. Han Y, Signorello LB, Strom SS, et al. Generalizability of established prostate cancer risk variants in men of African ancestry: Generalizability of prostate cancer SNPs in African ancestry. Int. J. Cancer. 2015;136(5):1210–1217.
16. Porras-Hurtado L, Ruiz Y, Santos C, et al. An overview of STRUCTURE: applications, parameter settings, and supporting software. Front. Genet. 2013;4:98.
17. Matise TC, Ambite JL, Buyske S, et al. The Next PAGE in Understanding Complex Traits: Design for the Analysis of Population Architecture Using Genetics and Epidemiology (PAGE) Study. Am. J. Epidemiol. 2011;174(7):849–859.
18. Loh P-R, Danecek P, Palamara PF, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 2016;48(11):1443–1448.
19. Das S, Forer L, Schönherr S, et al. Next-generation genotype imputation service and methods. Nat. Genet. 2016;48(10):1284–1287.
20. Price AL, Patterson NJ, Plenge RM, et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 2006;38(8):904–909.
21. Maples BK, Gravel S, Kenny EE, Bustamante CD. RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference. Am. J. Hum. Genet. 2013;93(2):278–288.
22. Lilit C. Moss, Xin Sheng, Christopher A. Haiman, David V. Conti, On Behalf of the African Ancestry Prostate Cancer Consortium and the Ellipse Game-On Consortium. Using Bayes model averaging for admixture mapping. Genet. Epidemiol. 2018;42(7):718–718.
23. Weinhold N, Meissner T, Johnson DC, et al. The 7p15.3 (rs4487645) association for multiple myeloma shows strong allele-specific regulation of the MYC-interacting gene CDCA7L in malignant plasma cells. Haematologica. 2015;100(3):e110–e113.
24. Stegle O, Parts L, Piipari M, Winn J, Durbin R. Using probabilistic estimation of expression residuals (PEER) to obtain increased power and interpretability of gene expression analyses. Nat. Protoc.
2012;7(3):500–507.
25. Ali M, Ajore R, Wihlborg A-K, et al. The multiple myeloma risk allele at 5q15 lowers ELL2 expression and increases ribosomal gene expression. Nat. Commun. 2018;9(1):1649.
26. Ward LD, Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40(D1):D930–D934.
27. Kent WJ, Sugnet CW, Furey TS, et al. The Human Genome Browser at UCSC. Genome Res. 2002;12(6):996–1006.
28. Keen JC, Moore HM. The Genotype-Tissue Expression (GTEx) Project: Linking Clinical Data with Molecular Analysis to Advance Personalized Medicine. J. Pers. Med. 2015;5(1):22–29.
29. Consortium TEP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57–74.
30. Law PJ, Sud A, Mitchell JS, et al. Genome-wide association analysis of chronic lymphocytic leukaemia, Hodgkin lymphoma and multiple myeloma identifies pleiotropic risk loci. Sci. Rep. 2017;7:.
31. Weinhold N, Johnson DC, Chubb D, et al. The CCND1 870G>A polymorphism is a risk factor for t(11;14)(q13;q32) multiple myeloma. Nat. Genet. 2013;45(5):522–525.
32. Sarkar S, Roy BC, Hatano N, et al. A Novel Ankyrin Repeat-containing Gene (Kank) Located at 9p24 Is a Growth Suppressor of Renal Cell Carcinoma. J. Biol. Chem. 2002;277(39):36585–36591.
33. Medves S, Duhoux FP, Ferrant A, et al. KANK1, a candidate tumor suppressor gene, is fused to PDGFRB in an imatinib-responsive myeloid neoplasm with severe thrombocythemia. Leukemia. 2010;24(5):1052–1055.
34. Cobo F, Cervantes F, Martinez C, et al. Multiple Myeloma Following Essential Thrombocythemia. Leuk. Lymphoma. 1995;20(1–2):177–179.
35. Smith P, Nicholson LJ, Syed N, et al. Epigenetic inactivation implies independent functions for insulin-like growth factor binding protein (IGFBP)-related protein 1 and the related IGFBPL1 in inhibiting breast cancer phenotypes. Clin. Cancer Res. Off. J. Am. Assoc. Cancer Res.
2007;13(14):4061–4068.
36. Birmann BM, Neuhouser ML, Rosner B, et al. Prediagnosis biomarkers of insulin-like growth factor-1, insulin, and interleukin-6 dysregulation and multiple myeloma risk in the Multiple Myeloma Cohort Consortium. Blood. 2012;120(25):4929–4937.
37. Chakraborty R, Weiss KM. Admixture as a tool for finding linked genes and detecting that difference from allelic association between loci. Proc. Natl. Acad. Sci. 1988;85(23):9119–9123.
38. Freedman ML, Haiman CA, Patterson N, et al. Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men. Proc. Natl. Acad. Sci. U. S. A. 2006;103(38):14068–14073.
39. Kumar SK, Rajkumar V, Kyle RA, et al. Multiple myeloma. Nat. Rev. Dis. Primer. 2017;3:17046.

Table 1. Associations of variants (index variants and better AA markers) in known multiple myeloma risk regions with multiple myeloma risk in AA individuals.
Columns: Region; SNP; Position; Risk Allele; r² with index SNP a (EUR, AFR); European (RAF b, OR c); African American (RAF d, OR(95%CI) e, P-value e, PHet f); Power g
2p23.3 rs6746082 * 25659244 A 0.79 1.29 0.56 1.06(0.98,1.14) 0.16 0.44 1.00
rs10180663 † 25633242 T 0.27 0.00 0.67 0.28 1.24(1.14,1.36) 1.9×10⁻⁶ 0.92
2q31.1 rs4325816 * 174808899 T 0.77 1.12 0.79 1.02(0.93,1.12) 0.65 0.22 0.70
3p22.1 rs1052501 * 41925398 C 0.16 1.32 0.62 1.04(0.96,1.13) 0.32 0.33 1.00
rs73828280 † 41833907 T 0.88 0.44 0.17 0.45 1.09(1.02,1.18) 0.02 0.09
3q26.2 rs10936599 * 169492101 C 0.80 1.26 0.92 1.14(0.98,1.33) 0.08 0.64 0.89
rs9290375 † 169566090 A 0.39 0.03 0.61 0.49 1.12(1.03,1.20) 5.3×10⁻³ 0.58
5q15 rs56219066 * 95242931 T 0.71 1.25 0.61 1.13(1.05,1.22) 1.3×10⁻³ 0.75 1.00
5q23.2 rs6595443 * 122743325 T 0.43 1.11 0.45 1.12(1.04,1.21) 2.8×10⁻³ 0.61 0.81
6p22.3 rs34229995 * 15244018 G 0.03 1.37 0.008
6p21.33 rs2285803 * 31107258 T 0.32 1.19 0.25 1.07(0.98,1.16) 0.12 0.73 0.99
rs879882 † 31139452 T 0.25 0.01 0.36 0.45 1.20(1.12,1.29) 1.0×10⁻⁶ 0.84
6q21 rs9372120 * 106667535 G 0.22 1.18 0.09 1.04(0.91,1.19) 0.58 0.60 0.76
7p15.3 rs4487645 * 21938240 C 0.65 1.38 0.89 1.38(1.20,1.57) 3.6×10⁻⁶ 0.75 1.00
7q22.3 rs17507636 * 106291118 C 0.74 1.12 0.86 1.04(0.93,1.16) 0.52 0.07 0.56
7q31.33 rs58618031 * 124583896 T 0.72 1.12 0.79 1.07(0.97,1.18) 0.16 0.52 0.71
rs61068276 † 124804887 C 0.42 0.16 0.63 1.16(1.07,1.26) 2.0×10⁻⁴ 0.57
7q36.1 rs7781265 * 150950940 A 0.13 1.19 0.26 1.01(0.92,1.11) 0.86 0.21 0.99
rs73169662 † 150922306 C 0.91 0.00 0.09 0.03 1.32(1.08,1.60) 5.3×10⁻³ 0.00
8q24.21 rs1948915 * 128222421 C 0.35 1.13 0.59 1.13(1.03,1.24) 7.2×10⁻³ 0.52 0.90
9p21.3 rs2811710 * 21991923 C 0.66 1.15 0.36 1.15(1.06,1.25) 1.2×10⁻³ 0.45 0.96
10p12.1 rs2790457 * 28856819 G 0.74 1.12 0.56 1.13(1.05,1.21) 1.9×10⁻³ 0.16 0.87
rs1265841 † 28920701 A 0.91 0.47 0.74 0.43 1.17(1.08,1.26) 7.3×10⁻⁵ 0.61
16p11.2 rs13338946 * 30700858 C 0.26 1.15 0.43 1.19(1.10,1.28) 1.4×10⁻⁵ 0.28 0.97
16q23.1 rs7193541 * 74664743 T 0.59 1.13 0.49 1.11(1.03,1.19) 5.6×10⁻³ 0.19 0.92
17p11.2 rs4273077 * 16849139 G 0.12 1.26 0.13 1.11(0.99,1.25) 0.07 0.90 0.99
rs34562254 † 16842991 A 0.82 0.24 0.11 0.13 1.23(1.10,1.38) 2.1×10⁻⁴ 0.72
19p13.11 rs11086029 * 16438661 T 0.24 1.14 0.19 1.08(0.97,1.21) 0.17 0.62 0.82
20q13.13 rs6066835 * 47355009 C 0.08 1.26 0.09 0.96(0.84,1.10) 0.54 0.47 0.97
22q13 rs138740 * 35699582 C 0.36 1.22 0.76 0.94(0.85,1.02) 0.15 0.76 0.99
22q13.1 rs877529 * 39542292 A 0.51 1.23 0.47 1.12(1.04,1.20) 3.4×10⁻³ 0.89 1.00
*: Index SNP in each risk region initially reported from GWAS of MM among EA populations;
†: Better AA marker in each risk region that better captured AA MM risk;
a: r² values were calculated among European/African populations in phase III 1KGP;
b: Risk allele frequencies (RAFs) were from the European population in phase III 1KGP;
c: Odds ratios (ORs) were from original GWAS reports;
d: RAFs were from AA controls;
e: AA MM ORs and Wald P-values were adjusted for age, sex and PC1-10;
f: P-values of heterogeneity tests of fixed-effect meta-analyses;
g: Power was calculated using the OR in the EA population, the RAF in the AA population and α = 0.05.

Table 2. Associations between categorical polygenic risk scores (PRSs) and multiple myeloma risk in African ancestry population.
Polygenic Risk Score Category; European-weighted PRS a [OR(95% CI), P-value, PHet]; AA-weighted PRS1 b [OR(95% CI), P-value, PHet]; AA-weighted PRS2 c [OR(95% CI), P-value, PHet]
0%-10%; 0.70(0.57,0.86), 6.52×10⁻⁴, 0.94; 0.52(0.41,0.65), 1.49×10⁻⁸, 0.90; 0.48(0.37,0.60), 9.68×10⁻¹⁰, 0.56
10%-25%; 0.78(0.66,0.92), 3.10×10⁻³, 0.30; 0.70(0.59,0.84), 7.16×10⁻⁵, 0.01; 0.72(0.60,0.85), 1.34×10⁻⁴, 0.83
25%-75% (baseline); 1, -, -; 1, -, -; 1, -, -
75%-90%; 1.24(1.07,1.43), 3.71×10⁻³, 0.78; 1.36(1.18,1.56), 1.54×10⁻⁵, 0.99; 1.39(1.21,1.60), 2.76×10⁻⁶, 0.80
90%-100%; 1.61(1.38,1.88), 1.41×10⁻⁹, 0.31; 1.66(1.43,1.94), 7.54×10⁻¹¹, 0.37; 1.82(1.56,2.11), 9.43×10⁻¹⁵, 0.65
Odds ratios (ORs) were adjusted for age, sex, and PCs 1-10; P-values were Wald P-values; PHet values were P-values of heterogeneity in fixed-effect meta-analyses.
a: European-weighted PRS was constructed using the known 22 index SNPs reported in European GWAS; weights were from the original European GWAS;
b: AA-weighted PRS1 was constructed using the known 22 index SNPs reported in European GWAS; weights were from AA MM;
c: AA-weighted PRS2 was constructed using 14 index SNPs (weights from AA MM) and 8 better AA markers (weights from AA MM, adjusted for “winner’s curse” using a Bayesian approach).

Figure 1. Risk allele frequency (RAF) comparisons of the 23 known risk alleles between AA MM controls and the European population in phase III 1000 Genomes Project.

Figure 2. Regional association plot of the 2p23.3 risk region (25.4-25.9 Mb) in African Americans. Single-nucleotide polymorphisms (SNPs) are plotted by position (x-axis) and -log10 P-value (y-axis). LD is estimated from the European population in phase III 1000 Genomes Project (1KGP) using the r² statistic. The index SNP (purple diamond) is rs6746082.
The surrounding SNPs are colored to indicate pairwise correlation with the index SNP. The most associated SNP in the AA population in this region is rs10180663.

Chapter Five: Conclusions and Future Directions

5.1 Genetic susceptibility of prostate cancer in Ugandan men

Summary

In chapter 2, we assessed associations at 118 previously reported prostate cancer (PrCa) risk alleles among Ugandan men and evaluated their aggregate effect on PrCa risk by constructing a polygenic risk score (PRS). We were able to replicate the majority of previously reported risk variants in terms of directional consistency in this East African population, suggesting that causal variants at most risk loci are shared across ethnic populations. The PRS could effectively stratify PrCa genetic risk in Ugandan men, with those in the top 10% PRS stratum having a >4-fold increased risk. We also found that the 8q24 region was a major contributor to PrCa genetic risk in Ugandan men: the mean PRS increased by ~50% and the effect size in the top 10% PRS stratum also increased by more than 50% with inclusion of the five 8q24 risk alleles. In the exploratory GWAS analysis, 5 variants in the 8q24 region achieved genome-wide significance, with the top signal existing only in African populations and conferring a >3-fold increased risk in Ugandan men.

Future direction

Our current study contained only 571 cases and 485 controls, and was therefore underpowered for GWAS discovery of novel risk variants. The next step would be to continue expanding the number of case and control samples in this Ugandan population to improve power. Furthermore, Africa contains the most extensive genetic diversity, and previous studies have noted divergent patterns of LD among subpopulations classified by geography, language and subsistence within the African continent 1 .
It will be particularly informative to include more diverse African samples in genetic studies of PrCa, both to test the generalizability of the currently known risk loci and PRS and to identify novel PrCa susceptibility variants in this high-risk population. Thus, we are now collaborating with Men of African Descent and Carcinoma of the Prostate (MADCaP) [2], which has collected PrCa cases and controls in Senegal, Ghana, Nigeria and South Africa, to increase genetic variation as well as overall statistical power.

The current imputation was done with phase III 1000 Genomes Project (1KGP) data, which include ~88 million variants from 2,504 individuals, 661 of whom are of African descent. To improve imputation quality in this African population, future work will include re-imputation to the Trans-Omics for Precision Medicine (TOPMed) reference panel, which consists of ~144K individuals, 45,840 of whom are of African ancestry. Its latest released panel sequenced ~60K samples and detected >239 million SNPs and INDELs, and imputation with this reference panel has been shown to substantially improve accuracy and coverage, especially for low-frequency variants (e.g. MAF of ~0.1% to 1%) [3].

Furthermore, apart from common risk variants, rare variants with larger effects also contribute to PrCa susceptibility. For example, rare mutations in genes such as HOXB13, BRCA2 and CHEK2 have been found to confer a >5-fold increased PrCa risk in European populations. As rare variants tend to be population specific and African populations have the most extensive genetic diversity [4], sequencing of genes and regions in African populations will be informative for identifying novel rare risk variants that play an important role in PrCa susceptibility in individuals of African ancestry.
To gain further insight into PrCa biology and to identify men at higher risk of developing more advanced PrCa, it is imperative that we identify genetic variants associated with aggressive PrCa. However, previous GWAS have focused on identifying common genetic risk variants for overall prostate cancer, and no common risk variant that can distinguish aggressive from non-aggressive disease has been identified to date. Larger-scale genetic studies of aggressive PrCa and case-only comparisons are needed to discover such risk variants. Moreover, previous studies reported that rare germline mutations in certain DNA-repair genes could distinguish risk for lethal and indolent PrCa and were associated with earlier age at death [5]. Further studies in populations of African ancestry are needed to characterize the role of rare germline mutations in the development of aggressive PrCa in this high-risk population.

Our current PRS consists of 114 known common risk variants. Recently, an additional 63 risk loci have been identified in European populations [6]. Incorporating these new risk alleles into the current PRS may improve the performance of risk stratification. Except for variants in 8q24, the alleles included in the PRS were predominantly identified in European populations. To optimize the PRS in populations of African ancestry, fine-mapping studies are needed to prioritize allele sets that better represent risk in this population. In addition, a previous study reported that family history and PRS independently influenced PrCa risk [7]; the next step in refining risk stratification is to incorporate family history, including the number of affected relatives and their ages. PRS in combination with clinical variables and/or other biomarkers may also improve risk management.
For example, a study reported that an individualized diagnostic prediction model including a 254-SNP genetic score was superior to the PSA-alone method (AUC: 0.75 versus 0.58) in predicting prostate cancer with Gleason Score > 7 [8]. Several companies (e.g. Ambry Genetics) are now providing commercial PRS testing in addition to hereditary cancer multigene panels. However, all of these tests are limited to populations of European ancestry; the general clinical and commercial utility of PRS should be further developed and optimized for populations of non-European ancestry.

5.2 Genetic susceptibility of prostate cancer in Latinos

Summary

In Chapter three, we performed a GWAS scan as well as an admixture mapping scan to search for novel PrCa risk loci in 2,820 Latino cases and 5,293 controls. No novel risk region was identified in either analysis. The known risk region 10q11.22 was the most significant region in the GWAS scan, and the leading signal was the previously reported variant rs10993994, which is likely to be the causal variant within this region. We assessed the associations with the 181 known PrCa risk variants and found that the majority (>80%) had a consistent direction of effect among Latino men, of which ~30% achieved nominal significance. We also tested the interactions between known risk variants and local ancestry on PrCa risk but did not detect any significant interaction after correcting for multiple tests. One interesting observation was the stronger effect of variant rs10993994 among men with a higher degree of local AmerIndian ancestry, suggesting that local ancestry background might modify the variant-PrCa association. The PRS could successfully stratify PrCa genetic risk among Latino men, with individuals in the top 10% risk stratum having a 3.20-fold increased risk compared to the population average, which was similar to that reported in the European ancestry population.
We also observed a marginally significant interaction between global AmerIndian ancestry and PRS on PrCa risk, with men in the 4th quartile of AmerIndian ancestry having a less pronounced increase in PrCa risk.

Future direction

As mentioned in Chapter three, Latinos are a highly heterogeneous population with distinct cultures, lifestyles, environmental exposures, socioeconomic status and disease burden. Across the U.S., the proportions of ancestry in subgroups of Latinos vary substantially [9]. However, our current study sample mainly consists of Mexican Americans, who have a lower proportion of African ancestry and a higher proportion of AmerIndian ancestry than other Latino subgroups. To better characterize genetic risks in the broad Latino/Hispanic population, future studies should include a larger sample of Latinos with more diverse ancestry backgrounds, such as Latinos in the southern and eastern U.S., which include more men from the Caribbean and South America.

Our study revealed that global AmerIndian ancestry was a protective factor for PrCa risk, which was consistent with previous reports [10–13]. Ancestry not only represents genetic background, but also captures shared non-genetic factors including cultural, behavioral and lifestyle factors, socioeconomic status and access to medical care and cancer screening. However, none of the previous studies comprehensively addressed these non-genetic factors in a large sample. The second future direction is understanding the degree to which the observed protective effect of AmerIndian ancestry is explained by non-genetic factors. Moreover, our study revealed suggestive evidence that local ancestry background modified variant-PrCa associations for some of the known risk variants. Whether these risk variants have distinct effects across diverse ethnic groups should be further validated in a larger sample.
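As a rough illustration of the PRS-based stratification summarized in Section 5.2, the sketch below computes a weighted PRS as the sum of per-variant log odds ratios multiplied by risk-allele dosages, and then assigns a score to a percentile stratum using cut-points taken from the control distribution. The function names, the toy weights and dosages, the stratum boundaries and the nearest-rank percentile rule are all assumptions made for illustration; they are not taken from the analyses in this dissertation.

```python
import math
from bisect import bisect_left

# Stratum labels (illustrative; percentile bands relative to controls).
STRATA = ("0%-10%", "10%-25%", "25%-75%", "75%-90%", "90%-100%")


def polygenic_risk_score(dosages, log_odds_ratios):
    """Weighted PRS: sum over variants of (risk-allele dosage x log OR)."""
    if len(dosages) != len(log_odds_ratios):
        raise ValueError("one weight is required per variant")
    return sum(d * w for d, w in zip(dosages, log_odds_ratios))


def percentile_cutpoints(control_scores, percentiles=(10, 25, 75, 90)):
    """Cut-points at the given percentiles of the control PRS distribution.

    Uses the nearest-rank rule with integer arithmetic (an assumed,
    illustrative choice; other percentile definitions are possible).
    """
    ordered = sorted(control_scores)
    n = len(ordered)
    cuts = []
    for p in percentiles:
        rank = max(1, (p * n + 99) // 100)  # ceil(p * n / 100), 1-based
        cuts.append(ordered[rank - 1])
    return cuts


def assign_stratum(score, cutpoints):
    """Return the stratum label; a score equal to a cut-point falls in the lower stratum."""
    return STRATA[bisect_left(cutpoints, score)]


# Hypothetical example: two variants with per-allele ORs of 1.2 and 1.5.
example_weights = [math.log(1.2), math.log(1.5)]
example_score = polygenic_risk_score([2, 1], example_weights)
```

Deriving the cut-points from controls only, rather than from cases and controls combined, is one common convention; under that choice the middle stratum serves as the reference group when odds ratios are estimated per stratum.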
As implied in our study, the PRS constructed using risk markers identified in European populations performed less effectively in identifying a high-risk population among men with a high degree of global AmerIndian ancestry. Thus, developing an ethnic-specific PRS that is optimized for Latinos is the next step. Fine-mapping analysis is needed to identify markers that could better capture risk in Latinos. Advanced statistical methods, such as regularized regression and machine learning approaches, might improve prediction performance by including more variants that are informative for risk prediction but have effect sizes too small to be identified through GWAS without a very large sample. Furthermore, as subgroups of Latinos have highly diverse genetic ancestry proportions, jointly modeling genetic ancestry and PRS should be another future direction. In addition, previous studies have revealed that PRS-based risk stratification could reduce overdiagnosis and false negatives in PSA screening in European populations [14–16]; similar studies in Latinos are needed to avoid potential health disparities in the development of such a targeted screening strategy. The cost-effectiveness and benefit-to-harm ratio of such a PRS-informed, risk-stratified PSA screening strategy will also need to be assessed in the Latino population.

5.3 Genetic susceptibility of multiple myeloma in African Americans

Summary

In Chapter four, we conducted a GWAS scan and an admixture mapping scan in 1,813 African American multiple myeloma cases and 8,871 controls, the largest genetic study of multiple myeloma among African Americans, to search for novel risk regions associated with multiple myeloma risk. No novel region was discovered in either analysis. We examined the 23 index GWAS SNPs and found that 20 of them were directionally consistent with effects reported in European populations and 9 were replicated at the nominal significance level.
We then comprehensively investigated these known risk loci to identify genetic markers that could better capture multiple myeloma risk (AA MM better marker) in men and women of African ancestry. We were able to identify better genetic markers in 8 risk loci. The polygenic risk score constructed using 23 risk SNPs could identify the top 10% of the population, who had a 1.6- to 1.8-fold increased multiple myeloma risk.

Future direction

Although our study represented the largest genetic study of multiple myeloma in African Americans, its power to detect less common risk variants with modest effects was still limited. Continuing the recruitment of multiple myeloma cases of African ancestry in our consortia is warranted to improve the ability to discover novel genetic risk loci in this high-risk population. Although no novel variants reached genome-wide significance, we did observe two suggestive risk regions at the 10^-6 level; the next step would be to replicate and characterize these two risk loci/regions in populations of both African and European ancestry.

The current study identified a better marker at 2p23.3, which was not identified in our previous analysis [17]. Both the increased sample size and the larger imputation reference panel (HRC) might have contributed to this additional discovery. However, INDELs were not available in the HRC reference panel. As mentioned in 5.1, compared to HRC, the currently available TOPMed reference panel includes a substantially greater number of samples of African ancestry and more genetic variation, which could help improve imputation accuracy. The next step will include re-imputation to the TOPMed reference panel to increase our ability to identify and test less common variants. Furthermore, some of the known risk regions in our data, especially in Set 1, were not densely genotyped/imputed, which hampered our ability to identify better markers for individuals of African ancestry in these regions.
Re-genotyping using a denser array or sequencing these regions would also be a next step to improve the resolution of fine-mapping analyses.

Our study examined genetic risk for overall multiple myeloma; however, it is a highly heterogeneous disease characterized by distinct molecular subgroups [18]. Previous studies have revealed that the clinical features are distinct between populations of European and African ancestry [19]. It is possible that inherited genetic variation may vary across these subtypes. Future analyses stratified into hyperdiploid and nonhyperdiploid groups might be helpful in identifying subgroup-specific germline risk variants. In addition, all multiple myeloma cases are preceded by an asymptomatic condition, monoclonal gammopathy of undetermined significance (MGUS), which has an annual progression rate to multiple myeloma of approximately 1% and is two- to three-fold more common in populations of African ancestry [20]. The increased multiple myeloma risk in individuals of African ancestry might be partially explained by the high prevalence of MGUS in this population. Therefore, genetic studies identifying MGUS susceptibility variants, as well as variants associated with progression from MGUS to multiple myeloma, in individuals of African ancestry may be informative both in explaining the excess risk of multiple myeloma in this population and in developing risk management strategies.

Furthermore, as implied in our admixture mapping analysis, multiple variants with modest effects, rather than a single variant with a large effect, may contribute to the excess multiple myeloma risk in individuals of African ancestry. To identify such risk variants, apart from increasing the study sample size, it might be helpful to conduct gene-based pooled variant association analysis, under the assumption of multiple independent causal variants [21].
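One simple way to pool variants within a gene, in the spirit of the gene-based analysis suggested above, is a collapsing comparison: each individual is coded as a carrier if they hold at least one risk allele at any variant in the gene, and carrier frequency is then compared between cases and controls. The sketch below is a hypothetical illustration with made-up genotypes and function names. It is a simpler stand-in and not the COMBAT method cited above, which combines several gene-level tests using summary statistics, and it omits the covariate adjustment (age, sex, principal components) that a real analysis would need.

```python
import math


def carrier_status(genotypes):
    """Collapse one person's allele counts (0/1/2 per variant) to a 0/1 carrier flag."""
    return 1 if any(g > 0 for g in genotypes) else 0


def collapsing_odds_ratio(case_genotypes, control_genotypes):
    """Odds ratio and Wald 95% CI for carrier status, cases versus controls.

    Adds 0.5 to every cell of the 2x2 table (Haldane-Anscombe correction)
    so that an empty cell does not cause a division by zero; this is an
    assumed, illustrative choice.
    """
    case_carriers = sum(carrier_status(g) for g in case_genotypes)
    control_carriers = sum(carrier_status(g) for g in control_genotypes)

    a = case_carriers + 0.5                               # carrier cases
    b = len(case_genotypes) - case_carriers + 0.5         # non-carrier cases
    c = control_carriers + 0.5                            # carrier controls
    d = len(control_genotypes) - control_carriers + 0.5   # non-carrier controls

    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - 1.96 * se),
        math.exp(log_or + 1.96 * se),
    )


# Hypothetical genotypes at three variants in one gene (rows are individuals).
toy_cases = [[0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 2]]
toy_controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
toy_or, toy_lower, toy_upper = collapsing_odds_ratio(toy_cases, toy_controls)
```

Collapsing to a carrier flag assumes that the pooled variants act in the same direction; if risk and protective alleles are mixed within a gene, their effects can cancel, which is one reason variance-component or combined tests are often preferred in practice.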
5.4 Conclusion

In this dissertation, I have examined the genetic risk of multiple cancers in non-European populations, including people of African and Latino ancestry. These studies suggest that common functional variants are shared at most risk loci. Further studies in non-European populations are needed to identify genetic markers that better capture risk in each ethnic population. While the known genetic risk variants for these cancers could stratify cancer risk in non-European populations, the predictive performance was attenuated. Additional genetic analyses of these cancers in non-European ancestry populations will be critical for developing ethnic-specific polygenic scores and for informing how germline variation can be used clinically to improve prevention, screening and treatment of these cancers.

5.5 References

1. Campbell MC, Tishkoff SA. African Genetic Diversity: Implications for Human Demographic History, Modern Human Origins, and Complex Disease Mapping. Annu. Rev. Genomics Hum. Genet. 2008;9:403–433.
2. Odiaka E, Lounsbury DW, Jalloh M, et al. Effective Project Management of a Pan-African Cancer Research Network: Men of African Descent and Carcinoma of the Prostate (MADCaP). J. Glob. Oncol. 2018;(4):1–12.
3. Taliun D, Harris DN, Kessler MD, et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. bioRxiv. 2019;563866.
4. Tishkoff SA, Reed FA, Friedlaender FR, et al. The Genetic Structure and History of Africans and African Americans. Science. 2009;324(5930):1035–1044.
5. Na R, Zheng SL, Han M, et al. Germline Mutations in ATM and BRCA1/2 Distinguish Risk for Lethal and Indolent Prostate Cancer and are Associated with Early Age at Death. Eur. Urol. 2017;71(5):740–747.
6. Schumacher FR, Olama AAA, Berndt SI, et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat. Genet. 2018;50(7):928–936.
7. Al Olama AA, Benlloch S, Antoniou AC, et al.
Risk Analysis of Prostate Cancer in PRACTICAL, a Multinational Consortium, Using 25 Known Prostate Cancer Susceptibility Loci. Cancer Epidemiol. Biomark. Prev. 2015;24(7):1121–1129.
8. Ström P, Nordström T, Aly M, et al. The Stockholm-3 Model for Prostate Cancer Detection: Algorithm Update, Biomarker Contribution, and Reflex Test Potential. Eur. Urol. 2018;74(2):204–210.
9. Bryc K, Durand EY, Macpherson JM, Reich D, Mountain JL. The Genetic Ancestry of African Americans, Latinos, and European Americans across the United States. Am. J. Hum. Genet. 2015;96(1):37–53.
10. Genetic ancestry and odds of prostate cancer diagnosis in African American and European American men. J. Clin. Oncol.
11. Stern MC, Fejerman L, Das R, et al. Variability in Cancer Risk and Outcomes Within US Latinos by National Origin and Genetic Ancestry. Curr. Epidemiol. Rep. 2016;3:181–190.
12. Nyame YA, Murphy A, Batai K, et al. Genetic ancestry and odds of prostate cancer diagnosis in African American and European American men. J. Clin. Oncol. 2016;34(2_suppl):86–86.
13. Murphy A, Batai K, Shah E, Kittles RA. Abstract C32: Native American genetic ancestry is protective against prostate cancer in African Americans and European Americans. Cancer Epidemiol. Prev. Biomark. 2016;25(3 Supplement):C32–C32.
14. Nordstrom T, Aly M, Eklund M, Egevad L, Gronberg H. A genetic score can identify men at high risk for prostate cancer among men with prostate-specific antigen of 1-3 ng/ml. Eur. Urol. 2014;65(6):1184–1190.
15. Aly M, Wiklund F, Xu J, et al. Polygenic risk score improves prostate cancer risk prediction: results from the Stockholm-1 cohort study. Eur. Urol. 2011;60(1):21–28.
16. Pashayan N, Pharoah PD, Schleutker J, et al. Reducing overdiagnosis by polygenic risk-stratified screening: findings from the Finnish section of the ERSPC. Br. J. Cancer. 2015;113(7):1086–1093.
17. Rand KA, Song C, Dean E, et al.
A Meta-analysis of Multiple Myeloma Risk Regions in African and European Ancestry Populations Identifies Putatively Functional Loci. Cancer Epidemiol. Biomark. Prev. 2016;25(12):1609–1618.
18. Morgan GJ, Walker BA, Davies FE. The genetic architecture of multiple myeloma. Nat. Rev. Cancer. 2012;12(5):335.
19. Weiss BM, Minter A, Abadie J, et al. Patterns of monoclonal immunoglobulins and serum free light chains are significantly different in black compared to white monoclonal gammopathy of undetermined significance (MGUS) patients. Am. J. Hematol. 2011;86(6):475–478.
20. Greenberg A, Vachon C, Rajkumar S. Disparities in the prevalence, pathogenesis and progression of monoclonal gammopathy of undetermined significance and multiple myeloma between blacks and whites. Leukemia. 2012;26(4):609–614.
21. Wang M, Huang J, Liu Y, et al. COMBAT: A Combined Association Test for Genes Using Summary Statistics. Genetics. 2017;207(3):883–891.

Supplementary Materials

Chapter 2

Supplementary table 1. Descriptive characteristics of the 560 cases and 480 controls in the analysis of prostate cancer in Ugandan men.

Variables | Cases, n (%), N=560 | Controls, n (%), N=480
Age (y): 40-60 | 59 (10.5) | 96 (20.0)
Age (y): 61-70 | 169 (30.2) | 243 (50.6)
Age (y): 70-80 | 218 (38.9) | 109 (22.7)
Age (y): >=80 | 114 (20.4) | 32 (6.7)
Age (y): mean (sd) | 71.0 (9.5) | 65.1 (8.9)
Gleason Score: ≤6 | 87 (28.2) | -
Gleason Score: 7 | 86 (27.8) | -
Gleason Score: 8 | 78 (25.2) | -
Gleason Score: 9 | 41 (13.3) | -
Gleason Score: 10 | 17 (5.5) | -
Gleason Score: unknown | 251 | -
PSA (ng/mL): median (1st Q - 3rd Q) | 100.0 (38.1-300.7) | 1.0 (0.6-1.9)

Supplementary table 2.
Association with known prostate cancer risk alleles in UGPCS SNP ID Chr Position Risk Allele a UGPCS Imputati on r 2 Type 1000 Genome AAPC Included in PRS (Table 3) RAF in Control OR (95%CI) P-Value EUR RAF AFR RAF RAF in control OR(95%CI) P-Value rs636291 1 10556097 A 0.199 1.06(0.84, 1.33) 0.62 1.00 Genotyped 0.68 0.19 0.28 1.03(0.96, 1.10) 0.41 yes rs17599629 1 150658287 G 0.028 1.39(0.82, 2.34) 0.22 1.00 Genotyped 0.20 0.04 0.08 1.06(0.95, 1.18) 0.32 yes rs1218582 1 154834183 G 0.692 0.83(0.69, 1.00) 0.0497 1.00 Genotyped 0.46 0.66 0.65 1.02(0.96, 1.08) 0.61 yes rs4245739 1 204518842 A 0.753 0.95(0.76, 1.18) 0.64 1.00 Genotyped 0.74 0.77 0.76 1.09(1.01, 1.17) 0.017 yes rs1775148 1 205757824 C 0.687 1.09(0.89, 1.32) 0.41 0.98 - 0.36 0.70 0.61 1.06(0.99, 1.12) 0.076 yes rs11902236 2 10117868 T 0.673 1.09(0.90, 1.33) 0.39 1.00 Genotyped 0.29 0.65 0.61 0.99(0.93, 1.05) 0.78 yes rs9287719 2 10710730 C 0.292 1.04(0.85, 1.27) 0.70 1.00 Genotyped 0.48 0.23 0.27 0.99(0.92, 1.06) 0.70 yes rs13385191 2 20888265 G 0.041 0.78(0.47, 1.28) 0.33 1.00 Genotyped 0.25 0.02 0.06 1.00(0.88, 1.14) 0.99 yes rs1465618 2 43553949 T 0.053 1.25(0.85, 1.84) 0.27 1.00 Genotyped 0.19 0.07 0.11 1.04(0.95, 1.14) 0.41 yes rs721048 2 63131731 A 0.015 0.95(0.43, 2.10) 0.90 1.00 - 0.17 0.01 0.04 1.12(0.96, 1.30) 0.15 yes rs10187424 2 85794297 T 0.305 0.98(0.80, 1.19) 0.81 1.00 Genotyped 0.56 0.31 0.36 1.07(1.01, 1.14) 0.03 yes rs12621278 2 173311553 A 1.000 - - 1.00 Genotyped 0.96 1.00 0.99 1.48(1.13, 1.94) 0.0042 no rs2292884 2 238443226 G 0.617 0.96(0.80, 1.17) 0.71 1.00 - 0.21 0.62 0.56 1.10(1.04, 1.17) 0.0016 yes rs3771570 2 242382864 T 0.000 - - 1.00 Genotyped 0.14 0.01 0.03 1.11(0.94, 1.30) 0.23 no rs2660753 3 87110674 T 0.635 1.01(0.84, 1.22) 0.90 1.00 Genotyped 0.11 0.59 0.49 0.95(0.90, 1.01) 0.13 yes rs2055109 3 87467332 T 0.880 1.09(0.82, 1.44) 0.56 1.00 Genotyped 0.77 0.87 0.87 0.94(0.86, 1.03) 0.19 yes rs7611694 3 113275624 A 0.628 1.07(0.89, 1.29) 0.46 1.00 Genotyped 0.58 0.68 0.66 
0.98(0.92, 1.04) 0.42 yes rs10934853 3 128038373 A 0.751 1.15(0.92, 1.43) 0.21 1.00 Genotyped 0.28 0.78 0.70 1.04(0.97, 1.11) 0.29 yes rs6763931 3 141102833 A 0.901 1.24(0.89, 1.72) 0.20 1.00 Genotyped 0.43 0.89 0.80 1.04(0.96, 1.12) 0.33 yes rs10936632 3 170130102 A 0.187 1.10(0.86, 1.40) 0.47 0.97 - 0.52 0.17 0.24 1.06(0.99, 1.14) 0.088 yes rs10009409 4 73855253 T 0.379 1.02(0.85, 1.23) 0.80 0.99 Genotyped 0.33 0.41 0.35 1.02(0.96, 1.08) 0.62 yes rs1894292 4 74349158 G 0.684 1.01(0.83, 1.23) 0.94 1.00 Genotyped 0.54 0.68 0.67 1.03(0.97, 1.09) 0.40 yes rs12500426 4 95514609 A 0.342 0.97(0.80, 1.17) 0.73 1.00 Genotyped 0.47 0.38 0.40 0.98(0.92, 1.04) 0.50 yes rs17021918 4 95562877 C 0.801 0.87(0.69, 1.09) 0.22 1.00 Genotyped 0.66 0.80 0.78 1.07(1.00, 1.15) 0.065 yes rs7679673 4 106061534 C 0.319 0.96(0.79, 1.17) 0.71 1.00 Genotyped 0.61 0.35 0.39 1.14(1.08, 1.22) 1.15×10 -5 yes rs2242652 5 1280028 G 0.829 1.10(0.85, 1.42) 0.47 0.95 - 0.79 0.87 0.86 1.12(1.02, 1.22) 0.018 yes rs12653946 5 1895829 T 0.394 1.04(0.87, 1.26) 0.65 1.00 Genotyped 0.45 0.40 0.41 1.09(1.02, 1.15) 0.0056 yes rs2121875 5 44365545 C 0.796 1.20(0.95, 1.51) 0.14 1.00 Genotyped 0.34 0.77 0.69 1.02(0.96, 1.09) 0.47 yes 161 rs6869841 5 172939426 T 0.394 1.02(0.84, 1.23) 0.87 1.00 Genotyped 0.20 0.40 0.36 1.01(0.95, 1.07) 0.71 yes rs4713266 6 11219030 C 0.817 1.19(0.94, 1.51) 0.14 1.00 Genotyped 0.51 0.84 0.77 1.06(0.99, 1.14) 0.10 yes rs7767188 6 30073776 A 0.146 0.96(0.75, 1.24) 0.78 1.00 Genotyped 0.21 0.15 0.15 1.04(0.96, 1.13) 0.38 yes rs130067 6 31118511 G 0.136 0.96(0.73, 1.26) 0.77 1.00 Genotyped 0.21 0.19 0.19 1.06(0.98, 1.14) 0.13 yes rs3096702 6 32192331 A 0.090 0.62(0.43, 0.89) 0.0090 0.99 - 0.36 0.14 0.15 1.06(0.98, 1.15) 0.13 yes rs3129859 6 32400939 G 0.755 0.97(0.78, 1.21) 0.81 0.99 - 0.70 0.81 0.81 0.93(0.86, 1.00) 0.042 yes rs1983891 6 41536427 T 0.465 1.12(0.93, 1.34) 0.24 1.00 Genotyped 0.30 0.51 0.47 1.12(1.05, 1.19) 1.96×10 -4 yes rs9443189 6 76495882 A 0.459 1.10(0.91, 1.33) 
0.32 0.99 - 0.86 0.41 0.48 1.08(1.02, 1.15) 0.013 yes rs2273669 6 109285189 G 0.389 0.90(0.74, 1.08) 0.26 0.98 - 0.16 0.35 0.31 1.05(0.99, 1.12) 0.11 yes rs339331 6 117210052 T 0.750 1.13(0.91, 1.40) 0.26 1.00 Genotyped 0.69 0.76 0.76 1.19(1.11, 1.27) 1.20×10 -6 yes rs1933488 6 153441079 A 0.518 0.87(0.72, 1.05) 0.15 1.00 Genotyped 0.60 0.52 0.56 0.99(0.93, 1.05) 0.73 yes rs9364554 6 160833664 T 0.012 2.07(0.97, 4.41) 0.059 1.00 Genotyped 0.28 0.02 0.06 1.18(1.05, 1.33) 0.0045 yes rs12155172 7 20994491 A 0.059 1.15(0.78, 1.69) 0.48 1.00 Genotyped 0.20 0.09 0.12 1.07(0.98, 1.17) 0.13 yes rs10486567 7 27976563 G 0.590 1.07(0.89, 1.30) 0.47 1.00 Genotyped 0.77 0.72 0.71 1.11(1.04, 1.19) 0.0011 yes rs56232506 7 47437244 A 0.077 1.31(0.92, 1.86) 0.14 0.94 - 0.48 0.04 0.12 1.00(0.91, 1.10) 0.98 yes rs6465657 7 97816327 C 0.935 1.24(0.82, 1.88) 0.30 1.00 Genotyped 0.49 0.96 0.87 1.00(0.91, 1.10) 0.97 yes rs2928679 8 23438975 A 0.261 1.05(0.86, 1.29) 0.63 1.00 Genotyped 0.42 0.25 0.27 1.00(0.94, 1.06) 0.95 yes rs1512268 8 23526463 T 0.653 1.31(1.07, 1.60) 0.0087 1.00 Genotyped 0.47 0.69 0.63 1.16(1.09, 1.23) 1.23×10 -6 yes rs11135910 8 25892142 T 0.055 1.15(0.77, 1.71) 0.50 1.00 Genotyped 0.13 0.09 0.11 1.02(0.93, 1.12) 0.65 yes rs12543663 8 127924659 C 0.103 0.74(0.53, 1.02) 0.07 1.00 Genotyped 0.29 0.10 0.15 0.86(0.79, 0.94) 8.87×10 -4 no rs10086908 8 128011937 T 0.706 1.27(1.03, 1.56) 0.03 1.00 Genotyped 0.68 0.78 0.75 1.14(1.06, 1.22) 1.74×10 -4 no rs7463326 8 128027954 G 0.883 1.41(1.03, 1.94) 0.03 1.00 Genotyped 0.75 0.87 0.84 1.17(1.08, 1.27) 2.24×10 -4 no rs72725854 8 128074815 T 0.058 3.37(2.36, 4.82) 2.14×10 -11 0.97 - 0.00 0.08 0.06 2.30(2.06, 2.57) 1.11×10 -49 yes rs114798100 8 128085434 G 0.045 2.92(2.00, 4.28) 3.63×10 -8 1.00 Genotyped 0.00 0.05 0.04 2.49(2.17, 2.85) 2.40×10 -38 no rs111906932 8 128086204 A 0.011 3.50(1.60, 7.66) 0.0017 1.00 Genotyped 0.00 0.03 0.02 1.83(1.54, 2.18) 1.69×10 -11 no rs1016343 8 128093297 T 0.185 0.90(0.71, 1.14) 0.38 1.00 
Genotyped 0.22 0.21 0.21 1.05(0.98, 1.12) 0.18 no 162 rs13252298 8 128095156 A 0.943 0.90(0.61, 1.34) 0.61 1.00 Genotyped 0.71 0.97 0.93 1.15(1.02, 1.29) 0.022 no rs72725879 8 128103969 T 0.363 1.58(1.30, 1.93) 4.73×10 -6 0.98 - 0.20 0.37 0.34 1.41(1.32, 1.50) 1.42×10 -27 no rs183373024 8 128104117 G 0.000 - - 0.96 - 0.01 0.00 - - - no rs6983561 8 128106880 C 0.499 1.39(1.16, 1.67) 3.91×10 -4 1.00 Genotyped 0.04 0.53 0.44 1.29(1.21, 1.37) 1.07×10 -16 no rs16901979 8 128124916 A 0.474 1.45(1.20, 1.76) 1.01×10 -4 1.00 - 0.04 0.51 0.42 1.29(1.22, 1.37) 5.74×10 -17 no rs620861 8 128335673 G 0.595 0.97(0.81, 1.17) 0.78 1.00 Genotyped 0.64 0.66 0.65 1.06(1.00, 1.13) 0.057 no rs6983267 8 128413305 G 0.966 0.90(0.54, 1.50) 0.68 1.00 Genotyped 0.50 0.95 0.88 1.24(1.12, 1.36) 1.38×10 -5 no rs1447295 8 128485038 A 0.309 1.01(0.83, 1.23) 0.90 1.00 Genotyped 0.10 0.35 0.31 1.06(1.00, 1.13) 0.070 no rs7812894 8 128520479 A 0.126 0.89(0.67, 1.19) 0.44 1.00 Genotyped 0.10 0.18 0.16 1.16(1.07, 1.25) 1.58×10 -4 no rs11986220 8 128531689 A 0.052 1.34(0.89, 2.03) 0.17 1.00 Genotyped 0.10 0.06 0.06 1.38(1.22, 1.56) 1.68×10 -7 no rs10090154 8 128532137 T 0.154 0.99(0.76, 1.28) 0.92 1.00 Genotyped 0.10 0.19 0.17 1.19(1.10, 1.28) 1.25×10 -5 no rs7837688 8 128539360 T 0.041 1.21(0.76, 1.93) 0.43 1.00 Genotyped 0.11 0.06 0.06 1.32(1.17, 1.49) 5.50×10 -6 no rs12549761 8 128540776 C 0.970 1.97(1.03, 3.79) 0.04 0.99 Genotyped 0.89 0.98 0.95 1.47(1.26, 1.71) 6.20×10 -7 no rs17694493 9 22041998 G 0.198 1.08(0.86, 1.36) 0.49 1.00 Genotyped 0.13 0.11 0.11 1.00(0.91, 1.10) 0.99 yes rs817826 9 110156300 T 0.637 1.09(0.90, 1.32) 0.39 1.00 Genotyped 0.86 0.65 0.72 0.92(0.86, 0.98) 0.012 yes rs1571801 9 124427373 T 0.113 0.97(0.73, 1.29) 0.84 1.00 Genotyped 0.26 0.15 0.14 1.02(0.94, 1.11) 0.58 yes rs76934034 10 46082985 T 1.000 - - 1.00 Genotyped 0.93 1.00 0.98 0.98(0.79, 1.23) 0.88 no rs10993994 10 51549496 T 0.755 1.12(0.91, 1.38) 0.30 1.00 Genotyped 0.39 0.65 0.60 1.10(1.04, 1.17) 0.0011 yes 
rs3850699 10 104414221 A 0.642 1.06(0.88, 1.29) 0.52 1.00 Genotyped 0.70 0.61 0.62 1.05(0.99, 1.12) 0.078 yes rs2252004 10 122844709 A 0.595 1.02(0.84, 1.22) 0.87 1.00 Genotyped 0.09 0.59 0.50 0.97(0.91, 1.03) 0.29 yes rs4962416 10 126696872 C 0.125 1.17(0.90, 1.53) 0.24 1.00 Genotyped 0.28 0.17 0.17 1.07(0.99, 1.15) 0.10 yes rs7127900 11 2233574 A 0.340 1.08(0.89, 1.31) 0.44 1.00 Genotyped 0.18 0.41 0.35 1.10(1.03, 1.17) 0.0024 yes rs12791447 11 7556577 G 0.005 - - 1.00 Genotyped 0.06 0.01 0.02 1.07(0.87, 1.33) 0.52 no rs1938781 11 58915110 G 0.361 0.88(0.72, 1.06) 0.18 1.00 - 0.20 0.39 0.32 1.06(1.00, 1.13) 0.047 yes rs7931342 11 68994497 G 0.824 1.10(0.86, 1.40) 0.45 1.00 Genotyped 0.51 0.84 0.77 1.16(1.08, 1.25) 6.57×10 -5 yes 163 rs11568818 11 102401661 T 0.531 1.23(1.03, 1.48) 0.024 1.00 Genotyped 0.56 0.55 0.55 1.05(0.99, 1.11) 0.12 yes rs11214775 11 113807181 G 0.768 1.10(0.89, 1.35) 0.40 1.00 Genotyped 0.72 0.70 0.70 1.04(0.98, 1.11) 0.20 yes rs80130819 12 48419618 A 0.999 - - 1.00 Genotyped 0.92 0.99 0.98 1.25(1.00, 1.55) 0.048 no rs10875943 12 49676010 C 0.718 0.97(0.79, 1.18) 0.74 1.00 Genotyped 0.29 0.69 0.62 1.01(0.95, 1.08) 0.68 yes rs902774 12 53273904 A 0.080 0.84(0.60, 1.20) 0.34 1.00 Genotyped 0.15 0.07 0.09 0.95(0.85, 1.05) 0.29 yes rs1270884 12 114685571 A 0.129 1.08(0.82, 1.41) 0.60 1.00 Genotyped 0.48 0.17 0.21 1.02(0.95, 1.10) 0.55 yes rs9600079 13 73728139 T 0.478 1.10(0.92, 1.32) 0.30 1.00 Genotyped 0.47 0.52 0.52 0.99(0.93, 1.04) 0.63 yes rs75823044 13 110360784 T 0.013 2.02(1.02, 4.00) 0.044 1.00 Genotyped 0.00 0.04 0.02 1.47(1.22, 1.76) 3.73×10 -5 yes rs8008270 14 53372330 C 0.746 1.07(0.87, 1.31) 0.53 1.00 Genotyped 0.80 0.71 0.72 1.03(0.97, 1.10) 0.32 yes rs7153648 14 61122526 C 0.375 1.21(1.01, 1.45) 0.042 1.00 Genotyped 0.08 0.39 0.34 1.11(1.04, 1.18) 8.42×10 -4 yes rs58262369 14 64693912 T 0.058 0.96(0.65, 1.41) 0.83 0.98 - 0.00 0.06 0.05 1.04(0.91, 1.19) 0.60 yes rs7141529 14 69126744 C 0.491 1.05(0.87, 1.26) 0.61 1.00 Genotyped 
0.50 0.48 0.54 1.04(0.98, 1.10) 0.22 yes rs8014671 14 71092256 G 0.398 0.96(0.79, 1.16) 0.64 1.00 Genotyped 0.57 0.43 0.46 1.01(0.95, 1.07) 0.80 yes rs12051443 16 71691329 A 0.263 0.96(0.78, 1.18) 0.67 0.99 - 0.33 0.25 0.25 1.07(1.00, 1.15) 0.039 yes rs684232 17 618965 C 0.621 1.24(1.02, 1.50) 0.028 1.00 Genotyped 0.36 0.70 0.64 1.08(1.02, 1.15) 0.011 yes rs11649743 17 36074979 G 0.915 1.16(0.81, 1.65) 0.42 0.99 Genotyped 0.80 0.95 0.92 1.08(0.97, 1.20) 0.18 yes rs4430796 17 36098040 A 0.283 1.00(0.80, 1.24) 0.98 0.94 - 0.52 0.31 0.35 1.04(0.98, 1.11) 0.21 yes rs11650494 17 47345186 A 0.193 1.00(0.80, 1.25) 0.98 1.00 Genotyped 0.06 0.23 0.22 1.10(1.03, 1.18) 0.0057 yes rs7210100 17 47436749 A 0.044 1.04(0.67, 1.63) 0.86 1.00 Genotyped 0.00 0.06 0.05 1.40(1.24, 1.58) 4.37×10 -8 yes rs1859962 17 69108753 G 0.293 0.91(0.74, 1.12) 0.38 1.00 Genotyped 0.48 0.28 0.30 1.00(0.94, 1.06) 0.90 yes rs7241993 18 76773973 C 0.430 1.04(0.87, 1.25) 0.65 1.00 Genotyped 0.69 0.44 0.48 1.01(0.96, 1.07) 0.64 yes rs8102476 19 38735613 C 0.724 1.21(0.98, 1.49) 0.081 1.00 - 0.54 0.80 0.74 1.08(1.01, 1.16) 0.021 yes rs11672691 19 41985587 G 0.085 1.11(0.80, 1.54) 0.53 1.00 Genotyped 0.75 0.08 0.22 1.07(1.00, 1.15) 0.056 yes rs2735839 19 51364623 G 0.623 0.86(0.71, 1.03) 0.11 1.00 Genotyped 0.86 0.61 0.69 0.97(0.91, 1.03) 0.33 yes rs103294 19 54797848 C 0.935 1.04(0.72, 1.49) 0.84 1.00 Genotyped 0.82 0.93 0.90 0.96(0.87, 1.05) 0.37 yes rs12480328 20 49527922 T 0.789 1.25(1.00, 1.57) 0.055 1.00 - 0.93 0.84 0.87 1.10(1.01, 1.21) 0.029 yes rs2427345 20 61015611 C 0.450 1.15(0.96, 1.39) 0.14 1.00 Genotyped 0.63 0.47 0.50 1.02(0.96, 1.08) 0.51 yes rs6062509 20 62362563 T 0.924 1.39(0.96, 2.02) 0.080 1.00 Genotyped 0.72 0.95 0.90 1.05(0.95, 1.15) 0.37 yes rs1041449 21 42901421 G 0.387 0.92(0.75, 1.12) 0.41 0.94 - 0.41 0.39 0.38 1.05(0.99, 1.12) 0.099 yes rs2238776 22 19757892 G 0.982 1.92(0.62, 5.90) 0.26 0.92 - 0.80 0.98 0.95 0.95(0.83, 1.10) 0.49 yes 164 rs78554043 22 28374943 C 0.006 - - 1.00 
Genotyped 0.00 0.02 0.02 1.59(1.27, 2.00) 5.02×10^-5 no
rs58133635 22 40471188 T 0.245 1.14(0.92, 1.41) 0.24 0.99 Genotyped 0.20 0.28 0.23 1.02(0.95, 1.09) 0.64 yes
rs5759167 22 43500212 G 0.733 1.06(0.86, 1.31) 0.57 1.00 Genotyped 0.51 0.79 0.75 1.15(1.08, 1.24) 3.55×10^-5 yes
rs2405942 X 9814135 A 0.715 1.01(0.88, 1.17) 0.85 1.00 Genotyped 0.79 0.73 - - - yes
rs5945619 X 51241672 C 0.345 1.11(0.97, 1.26) 0.14 1.00 Genotyped 0.37 0.39 0.37 1.07(1.03, 1.12) 9.06×10^-4 yes
rs2807031 X 52896949 C 0.287 0.95(0.82, 1.10) 0.48 1.00 Genotyped 0.18 0.19 0.21 1.06(1.01, 1.11) 0.020 yes
rs5919432 X 67021550 T 0.290 0.92(0.80, 1.06) 0.26 1.00 Genotyped 0.79 0.29 0.37 1.08(1.04, 1.13) 2.84×10^-4 yes
rs6625711 X 70139850 A 0.869 0.86(0.71, 1.04) 0.13 0.90 Imputed 0.16 0.87 0.79 0.93(0.88, 0.98) 0.0055 yes
rs4844289 X 70407983 G 0.802 0.94(0.80, 1.11) 0.48 1.00 Genotyped 0.43 0.74 0.68 0.99(0.95, 1.04) 0.70 yes

Supplementary table 3. Pairwise correlation (r^2) of SNPs at 8q24 from forward stepwise logistic regression in UGPCS and known 8q24 risk alleles.

SNP | rs72725854 | rs6470538 | rs1456315 | rs28556804 | rs73707269
rs72725854 | 1.00 | 0 | 0 | 0 | 0
rs72725879 | 0.03 | 0 | 0.41 | 0 | 0
rs7463326 | 0 | 0 | 0.01 | 0.95 | 0
rs7812894 | 0 | 0 | 0 | 0 | 0
rs12549761 | 0 | 0 | 0 | 0 | 0
rs12543663 | 0 | 0 | 0 | 0 | 0
rs10086908 | 0 | 0 | 0 | 0.51 | 0
rs114798100 | 0.54 | 0 | 0.01 | 0 | 0
rs111906932 | 0.38 | 0 | 0 | 0 | 0
rs1016343 | 0.02 | 0 | 0.02 | 0 | 0
rs13252298 | 0 | 0 | 0.02 | 0 | 0
rs6983561 | 0.01 | 0.01 | 0.27 | 0 | 0
rs16901979 | 0.01 | 0 | 0.26 | 0 | 0
rs620861 | 0 | 0 | 0 | 0 | 0
rs6983267 | 0 | 0.01 | 0 | 0 | 0
rs1447295 | 0 | 0 | 0.01 | 0 | 0
rs10090154 | 0 | 0 | 0 | 0 | 0.01
rs7837688 | 0 | 0 | 0 | 0 | 0
rs11986220 | 0 | 0.01 | 0 | 0 | 0

Supplementary table 4. SNPs significantly associated with prostate cancer risk in UGPCS (Wald P<10^-5).
SNP ID Chr Position Alleles a Risk Allele Frequency b OR (95%CI) P-Value r 2 Genotyped AAPC Risk Allele Frequency b AAPC OR (95%CI) AAPC P-Value rs1977673 1 80451138 C|T 0.59|0.50 1.57(1.29, 1.92) 7.27×10 -6 0.93 - 0.54|0.53 1.02(0.96, 1.08) 0.55 rs1340676 1 80452038 T|G 0.60|0.50 1.55(1.28, 1.88) 5.92×10 -6 0.99 - 0.55|0.53 1.03(0.97, 1.09) 0.30 rs2050469 1 80452521 T|A 0.61|0.51 1.56(1.29, 1.89) 4.49×10 -6 0.99 - 0.55|0.54 1.03(0.97, 1.09) 0.40 rs1340678 1 80452798 T|G 0.61|0.51 1.56(1.29, 1.89) 4.29×10 -6 0.99 - 0.55|0.54 1.03(0.97, 1.09) 0.41 rs1340680 1 80453134 G|C 0.60|0.50 1.56(1.29, 1.89) 5.14×10 -6 0.99 - 0.55|0.53 1.04(0.98, 1.10) 0.20 rs12095604 1 89897115 C|T 0.16|0.10 2.10(1.53, 2.88) 4.84×10 -6 0.82 - 0.12|0.12 1.03(0.93, 1.13) 0.60 rs61005944 1 224405487 CT|C 0.31|0.22 1.77(1.41, 2.23) 1.23×10 -6 0.84 - 0.27|0.28 0.94(0.88, 1.01) 0.08 rs3767731 1 224415616 T|C 0.38|0.28 1.58(1.29, 1.93) 9.16×10 -6 0.97 - 0.33|0.34 0.96(0.90, 1.02) 0.17 rs73928479 2 41303505 A|T 0.98|0.94 3.45(2.03, 5.86) 5.00×10 -6 0.96 - 0.96|0.96 1.08(0.93, 1.26) 0.31 rs73928501 2 41306652 T|C 0.98|0.94 3.52(2.07, 6.01) 3.80×10 -6 0.95 - 0.96|0.96 1.08(0.93, 1.26) 0.31 rs60024409 2 41308673 T|C 0.98|0.94 3.55(2.08, 6.07) 3.51×10 -6 0.96 - 0.96|0.96 1.08(0.92, 1.25) 0.35 rs148184576 2 41321361 A|G 0.98|0.94 3.59(2.09, 6.16) 3.42×10 -6 0.94 - 0.96|0.96 1.08(0.93, 1.26) 0.29 rs58488929 2 97801368 C|T 0.13|0.07 2.28(1.59, 3.25) 6.40×10 -6 0.89 - 0.11|0.11 1.04(0.94, 1.14) 0.46 rs7588752 2 97849669 T|G 0.12|0.07 2.21(1.56, 3.14) 7.70×10 -6 0.94 - 0.11|0.11 1.03(0.94, 1.13) 0.48 rs6431219 2 127862133 C|T 0.60|0.51 1.55(1.28, 1.88) 8.13×10 -6 0.97 - 0.55|0.56 0.98(0.92, 1.04) 0.48 rs11916600 3 183770006 G|C 0.83|0.77 1.78(1.38, 2.29) 8.99×10 -6 0.87 - 0.83|0.84 0.98(0.90, 1.06) 0.54 rs73052473 3 183773196 G|A 0.92|0.86 2.08(1.51, 2.86) 7.58×10 -6 0.91 - 0.93|0.93 0.96(0.85, 1.08) 0.48 rs73052475 3 183773201 G|A 0.92|0.87 2.12(1.53, 2.94) 6.54×10 -6 0.92 - 0.94|0.94 0.94(0.83, 1.07) 0.34 
rs142364627 3 183773597 A|AT 0.92|0.86 2.08(1.51, 2.86) 7.38×10 -6 0.91 - 0.93|0.93 0.96(0.86, 1.08) 0.51 rs61539668 3 183773599 C|G 0.92|0.86 2.08(1.51, 2.86) 7.70×10 -6 0.91 - 0.93|0.93 0.96(0.85, 1.08) 0.48 rs57213098 3 183773600 T|C 0.92|0.86 2.08(1.51, 2.86) 7.71×10 -6 0.91 - 0.93|0.93 0.96(0.85, 1.08) 0.48 rs5855014 3 183801193 GC|G 0.91|0.85 2.12(1.55, 2.91) 2.84×10 -6 0.83 - 0.92|0.93 0.98(0.87, 1.10) 0.76 rs73042308 3 183802194 G|A 0.93|0.88 2.09(1.51, 2.89) 8.70×10 -6 0.94 - 0.95|0.95 1.00(0.87, 1.15) 0.995 167 rs113283808 3 183808266 G|A 0.93|0.88 2.13(1.53, 2.95) 5.96×10 -6 0.93 - 0.95|0.95 1.00(0.87, 1.15) 0.99 3 183808372 GC|AC 0.78|0.71 1.65(1.33, 2.05) 5.11×10 -6 0.92 - 0.85|0.85 0.99(0.92, 1.08) 0.90 rs73042325 3 183809118 A|C 0.93|0.88 2.12(1.52, 2.95) 7.81×10 -6 0.93 - 0.95|0.95 1.01(0.88, 1.16) 0.89 rs73042396 3 183812765 A|G 0.93|0.88 2.15(1.55, 2.98) 4.90×10 -6 0.93 - 0.95|0.95 1.00(0.87, 1.15) 0.97 rs141828188 3 183817230 C|A 0.93|0.88 2.19(1.57, 3.05) 3.90×10 -6 0.91 - 0.95|0.95 1.00(0.87, 1.16) 0.98 4 181412356 T|TCTCTC 0.97|0.93 3.04(1.88, 4.91) 5.41×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.57 rs201870367 4 181412434 AG|A 0.97|0.93 3.04(1.88, 4.91) 5.43×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.57 rs115294351 4 181412823 T|C 0.97|0.93 3.04(1.88, 4.90) 5.44×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.57 rs1833727 4 181413339 T|C 0.97|0.93 3.03(1.88, 4.89) 5.47×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs1833726 4 181413384 C|T 0.97|0.93 3.03(1.88, 4.89) 5.47×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs376293756 4 181413966 TGA|T 0.97|0.93 3.03(1.88, 4.88) 5.50×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs191942895 4 181414004 T|C 0.97|0.93 3.03(1.88, 4.88) 5.50×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs181860787 4 181414005 C|T 0.97|0.93 3.02(1.88, 4.88) 5.49×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs113946114 4 181414043 C|G 0.97|0.93 3.02(1.88, 4.87) 5.49×10 -6 0.84 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs139675196 4 
181414253 G|C 0.97|0.93 3.02(1.87, 4.86) 5.54×10 -6 0.84 - 0.97|0.97 0.95(0.80, 1.14) 0.58 rs115414625 4 181414310 C|G 0.97|0.93 3.02(1.87, 4.86) 5.56×10 -6 0.84 - 0.97|0.97 0.95(0.80, 1.14) 0.58 rs78923694 4 181414639 C|T 0.97|0.93 3.02(1.87, 4.86) 5.53×10 -6 0.84 - 0.97|0.97 0.95(0.80, 1.14) 0.58 rs116485782 4 181414701 T|C 0.97|0.93 3.02(1.87, 4.86) 5.55×10 -6 0.84 - 0.97|0.97 0.95(0.80, 1.14) 0.58 rs113558905 4 181414837 T|G 0.97|0.93 3.01(1.87, 4.85) 5.55×10 -6 0.84 - 0.97|0.97 0.95(0.80, 1.14) 0.58 rs371311300 4 181416940 C|A 0.97|0.93 2.97(1.84, 4.79) 7.89×10 -6 0.85 - 0.97|0.97 0.95(0.80, 1.14) 0.59 rs143555288 4 181426870 G|T 0.97|0.93 2.88(1.81, 4.58) 8.70×10 -6 0.87 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs142791707 4 181427842 G|T 0.97|0.93 2.87(1.80, 4.56) 8.95×10 -6 0.87 - 0.97|0.97 0.95(0.79, 1.14) 0.58 rs76861935 5 116278053 C|T 0.92|0.86 2.02(1.48, 2.75) 8.61×10 -6 0.97 - 0.92|0.92 1.03(0.92, 1.15) 0.61 rs113257807 5 116278091 C|T 0.92|0.86 2.02(1.48, 2.75) 8.73×10 -6 0.97 - 0.92|0.92 1.04(0.93, 1.16) 0.54 rs12518354 5 161972337 A|T 0.94|0.89 2.29(1.59, 3.31) 9.60×10 -6 0.92 - 0.93|0.93 1.04(0.93, 1.17) 0.50 rs57756387 5 161975597 A|T 0.94|0.89 2.29(1.59, 3.31) 9.06×10 -6 0.92 - 0.93|0.93 1.04(0.93, 1.17) 0.47 rs77413432 5 161980310 C|G 0.94|0.89 2.31(1.60, 3.34) 7.81×10 -6 0.92 - 0.93|0.93 1.05(0.93, 1.18) 0.43 rs55700842 5 161984612 G|C 0.94|0.89 2.31(1.60, 3.34) 7.89×10 -6 0.92 - 0.93|0.93 1.05(0.93, 1.18) 0.42 rs17060512 5 161992398 T|C 0.94|0.89 2.32(1.61, 3.35) 6.81×10 -6 0.93 - 0.93|0.93 1.05(0.94, 1.18) 0.38 168 rs7718671 5 178552715 A|G 0.91|0.86 2.44(1.64, 3.62) 1.00×10 -5 0.83 - 0.84|0.82 1.09(1.01, 1.18) 0.03 rs79774606 6 4905275 G|A 0.87|0.81 1.93(1.44, 2.57) 9.20×10 -6 0.83 - 0.85|0.86 0.97(0.89, 1.06) 0.54 rs6979813 7 29143079 C|G 0.79|0.72 1.70(1.35, 2.14) 6.45×10 -6 0.90 - 0.74|0.74 1.01(0.94, 1.08) 0.82 rs200219623 8 128074135 G|GCAGGA GAA 0.26|0.18 1.86(1.47, 2.35) 2.11×10 -7 0.97 - 0.20|0.14 1.59(1.46, 1.72) 5.56×10 -29 rs72725854 8 
128074815 T|A 0.14|0.06 3.37(2.36, 4.82) 2.14×10 -11 0.97 - 0.12|0.06 2.30(2.06, 2.57) 1.11×10 -49 rs746971193 8 128083282 G|A 0.11|0.04 2.94(2.01, 4.32) 3.12×10 -8 0.96 - 0.08|0.04 2.46(2.15, 2.82) 2.50×10 -38 rs76229939 8 128085394 G|A 0.11|0.04 2.92(2.00, 4.28) 3.63×10 -8 1.00 Genotyped 0.08|0.04 2.49(2.17, 2.85) 2.41×10 -38 rs114798100 8 128085434 G|A 0.11|0.04 2.92(2.00, 4.28) 3.63×10 -8 1.00 Genotyped 0.08|0.04 2.49(2.17, 2.85) 2.40×10 -38 rs76595456 8 128087829 T|C 0.17|0.09 2.48(1.83, 3.35) 3.69×10 -9 1.00 Genotyped 0.14|0.09 1.79(1.62, 1.97) 2.75×10 -32 rs75500912 8 128089536 C|A 0.14|0.07 2.29(1.67, 3.15) 3.18×10 -7 1.00 Genotyped 0.10|0.06 1.73(1.54, 1.93) 5.71×10 -22 rs115297798 8 128092741 T|A 0.09|0.03 2.93(1.89, 4.53) 1.36×10 -6 0.99 - 0.06|0.03 2.54(2.17, 2.96) 5.10×10 -32 rs73705708 8 128092911 G|A 0.18|0.11 2.08(1.57, 2.75) 2.68×10 -7 1.00 Genotyped 0.16|0.11 1.60(1.47, 1.75) 3.05×10 -25 rs73705709 8 128094084 C|G 0.15|0.08 2.13(1.57, 2.90) 1.25×10 -6 0.99 - 0.11|0.07 1.62(1.45, 1.80) 4.76×10 -19 rs59765225 8 128099765 G|A 0.21|0.14 1.89(1.46, 2.44) 1.09×10 -6 1.00 Genotyped 0.18|0.13 1.51(1.39, 1.64) 8.25×10 -22 rs56006726 8 128100439 C|G 0.14|0.08 2.10(1.55, 2.86) 2.02×10 -6 1.00 Genotyped 0.11|0.07 1.62(1.46, 1.80) 2.04×10 -19 8 128101685 C|A 0.09|0.03 2.85(1.86, 4.37) 1.53×10 -6 0.98 - 0.06|0.03 2.51(2.16, 2.93) 1.13×10 -31 rs201592624 8 128102187 G|GA 0.09|0.03 2.85(1.86, 4.38) 1.52×10 -6 0.98 - 0.06|0.03 2.57(2.20, 3.01) 1.90×10 -32 rs72725879 8 128103969 T|C 0.45|0.36 1.58(1.30, 1.93) 4.73×10 -6 0.98 - 0.41|0.34 1.41(1.32, 1.50) 1.42×10 -27 rs115266272 8 128106397 A|G 0.09|0.03 2.77(1.81, 4.25) 2.79×10 -6 1.00 Genotyped 0.06|0.03 2.43(2.08, 2.84) 2.87×10 -29 rs146278672 8 128108324 T|C 0.09|0.03 2.77(1.81, 4.25) 2.79×10 -6 1.00 Genotyped 0.06|0.03 2.38(2.04, 2.77) 1.87×10 -28 rs114099351 8 128110665 G|C 0.09|0.03 2.77(1.81, 4.25) 2.79×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.77) 1.87×10 -28 rs202073575 8 128112009 T|TG 0.09|0.03 2.77(1.81, 4.25) 
2.78×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.77) 1.87×10 -28 rs145698299 8 128113333 T|G 0.09|0.03 2.77(1.81, 4.25) 2.78×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.77) 1.87×10 -28 rs115207770 8 128116947 A|G 0.09|0.03 2.77(1.81, 4.25) 2.77×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.77) 1.70×10 -28 rs190565485 8 128119215 C|T 0.09|0.03 2.77(1.81, 4.25) 2.77×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.78) 1.55×10 -28 8 128121325 T|G 0.09|0.03 2.77(1.81, 4.25) 2.76×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.78) 1.44×10 -28 rs115238595 8 128126452 G|A 0.09|0.03 2.77(1.81, 4.25) 2.75×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.78) 1.47×10 -28 rs114938413 8 128127371 G|A 0.09|0.03 2.77(1.81, 4.25) 2.75×10 -6 1.00 - 0.06|0.03 2.38(2.04, 2.78) 1.48×10 -28 169 rs116041037 8 128131809 A|G 0.09|0.03 2.78(1.81, 4.26) 2.70×10 -6 1.00 Genotyped 0.06|0.03 2.38(2.04, 2.78) 1.47×10 -28 rs116075424 8 128134375 A|G 0.09|0.03 2.78(1.81, 4.26) 2.69×10 -6 1.00 - 0.06|0.03 2.28(1.97, 2.65) 4.66×10 -27 rs138925762 8 128134734 G|A 0.09|0.03 2.78(1.81, 4.26) 2.69×10 -6 1.00 - 0.06|0.03 2.39(2.05, 2.78) 1.22×10 -28 rs114111380 8 128137544 T|C 0.09|0.03 2.78(1.81, 4.26) 2.71×10 -6 1.00 Genotyped 0.06|0.03 2.38(2.04, 2.78) 1.46×10 -28 rs144912976 8 128138134 C|G 0.09|0.03 2.78(1.81, 4.26) 2.78×10 -6 0.99 - 0.06|0.03 2.37(2.03, 2.75) 8.46×10 -29 rs115342523 8 128141674 C|A 0.09|0.03 2.76(1.80, 4.24) 3.26×10 -6 0.99 - 0.06|0.03 2.34(2.02, 2.73) 1.79×10 -28 rs116040182 8 128142007 G|A 0.09|0.03 2.76(1.80, 4.22) 3.36×10 -6 0.99 - 0.07|0.03 2.32(2.00, 2.70) 4.05×10 -28 rs149502620 8 128147022 A|G 0.09|0.03 2.75(1.80, 4.22) 3.11×10 -6 1.00 - 0.06|0.03 2.34(2.02, 2.73) 2.37×10 -28 rs200825818 8 128150864 G|GT 0.09|0.04 2.74(1.79, 4.19) 3.22×10 -6 1.00 - 0.06|0.03 2.35(2.02, 2.74) 2.80×10 -28 rs115894212 8 128158511 A|G 0.09|0.04 2.72(1.78, 4.15) 3.60×10 -6 1.00 - 0.06|0.03 2.35(2.02, 2.74) 2.97×10 -28 8 128159454 C|T 0.08|0.03 3.01(1.87, 4.83) 5.40×10 -6 0.91 - 0.06|0.03 2.56(2.15, 3.05) 2.17×10 -26 rs115705407 8 128161740 A|G 0.09|0.03 
2.75(1.79, 4.20) 3.20×10 -6 0.99 - 0.06|0.03 2.38(2.04, 2.78) 1.46×10 -28 rs146380832 8 128167551 A|G 0.09|0.03 2.79(1.82, 4.27) 2.49×10 -6 0.99 - 0.06|0.03 2.42(2.07, 2.83) 1.19×10 -28 rs76784613 8 128176963 G|A 0.09|0.04 2.86(1.87, 4.37) 1.24×10 -6 0.98 - 0.06|0.03 2.11(1.81, 2.45) 2.87×10 -22 rs139890247 8 128178411 T|C 0.08|0.03 2.94(1.88, 4.60) 2.34×10 -6 0.94 - 0.05|0.02 2.52(2.12, 2.99) 1.06×10 -25 rs60955457 8 128185548 A|G 0.10|0.05 2.38(1.62, 3.49) 9.12×10 -6 0.98 - 0.07|0.04 1.83(1.60, 2.10) 3.02×10 -18 rs146726331 8 128192892 A|G 0.08|0.03 2.98(1.89, 4.70) 2.56×10 -6 0.94 - 0.05|0.02 2.51(2.11, 2.99) 2.05×10 -25 rs79655367 8 128193217 T|A 0.26|0.18 1.71(1.36, 2.15) 4.85×10 -6 0.99 - 0.19|0.17 1.12(1.04, 1.21) 3.49×10 -3 rs115316824 8 128194229 G|T 0.08|0.03 2.88(1.85, 4.47) 2.72×10 -6 0.98 - 0.06|0.03 2.37(2.02, 2.78) 3.37×10 -26 rs1456304 8 128194694 T|A 0.26|0.18 1.72(1.36, 2.16) 4.55×10 -6 0.99 - 0.19|0.17 1.12(1.04, 1.21) 3.54×10 -3 rs115563653 8 128194755 A|T 0.08|0.03 2.88(1.85, 4.47) 2.77×10 -6 0.98 - 0.06|0.03 2.37(2.02, 2.78) 3.00×10 -26 rs368559193 8 128196674 A|AAAAAT 0.26|0.18 1.70(1.35, 2.15) 6.01×10 -6 0.99 - 0.19|0.17 1.12(1.03, 1.20) 4.83×10 -3 rs16902008 8 128197295 G|T 0.26|0.18 1.68(1.34, 2.11) 8.24×10 -6 1.00 Genotyped 0.19|0.17 1.16(1.07, 1.25) 1.31×10 -4 rs116722593 8 128198147 T|C 0.08|0.03 2.86(1.84, 4.45) 3.01×10 -6 0.98 - 0.06|0.03 2.37(2.02, 2.78) 3.41×10 -26 rs148974531 8 128207194 G|A 0.08|0.03 2.85(1.83, 4.43) 3.32×10 -6 1.00 Genotyped 0.06|0.02 2.39(2.04, 2.81) 2.42×10 -26 rs114791469 8 128214610 T|C 0.08|0.04 2.67(1.74, 4.08) 6.69×10 -6 0.99 - 0.06|0.03 2.33(1.99, 2.73) 7.64×10 -26 8 128216106 C|T 0.12|0.06 2.29(1.61, 3.26) 4.10×10 -6 0.96 - 0.07|0.04 2.00(1.74, 2.30) 2.95×10 -22 rs7470818 9 11109779 A|C 0.80|0.71 1.67(1.33, 2.09) 9.46×10 -6 0.98 - 0.66|0.66 0.98(0.92, 1.04) 0.53 rs4434707 9 11125140 T|C 0.69|0.60 1.59(1.29, 1.95) 9.53×10 -6 0.93 - 0.56|0.56 0.95(0.90, 1.01) 0.095 170 rs2151715 9 11130398 G|C 0.79|0.70 
1.74(1.38, 2.18) 2.52×10 -6 0.93 - 0.67|0.67 0.98(0.92, 1.05) 0.58 rs10959640 9 11225906 C|T 0.76|0.68 1.70(1.36, 2.13) 3.32×10 -6 0.93 - 0.67|0.68 0.96(0.90, 1.02) 0.18 rs4741206 9 12079333 G|A 0.85|0.78 1.78(1.38, 2.30) 8.90×10 -6 0.91 - 0.79|0.80 0.96(0.89, 1.03) 0.28 rs73408421 9 12184024 A|G 0.85|0.76 1.81(1.42, 2.30) 1.39×10 -6 0.99 - 0.80|0.80 0.98(0.91, 1.06) 0.61 rs11003686 10 55337855 A|G 0.15|0.10 2.06(1.50, 2.84) 9.09×10 -6 0.84 - 0.12|0.12 0.99(0.91, 1.09) 0.91 rs140971918 13 37435668 G|C 0.71|0.63 1.68(1.34, 2.09) 4.13×10 -6 0.82 - 0.75|0.74 1.03(0.96, 1.10) 0.41 rs7325069 13 113274812 G|A 0.72|0.62 1.63(1.33, 1.99) 2.54×10 -6 0.99 Genotyped 0.60|0.60 0.98(0.92, 1.04) 0.53 rs61708973 13 113275422 C|T 0.48|0.39 1.63(1.31, 2.01) 7.63×10 -6 0.83 - 0.34|0.33 1.04(0.97, 1.11) 0.25 rs9549550 13 113275959 T|A 0.73|0.64 1.66(1.34, 2.06) 3.91×10 -6 0.90 - 0.63|0.62 1.00(0.94, 1.06) 0.99 rs9549551 13 113276304 A|G 0.64|0.55 1.67(1.35, 2.07) 2.95×10 -6 0.81 - 0.54|0.54 1.01(0.95, 1.07) 0.72 rs234439 14 97822865 G|A 0.87|0.79 1.78(1.39, 2.26) 3.21×10 -6 1.00 Genotyped 0.79|0.79 1.00(0.93, 1.07) 0.99 rs140698498 16 7600106 A|AT 0.87|0.80 1.86(1.42, 2.44) 6.07×10 -6 0.88 - 0.85|0.85 1.04(0.96, 1.13) 0.36 rs112896149 16 7602981 G|A 0.86|0.80 1.87(1.43, 2.45) 5.04×10 -6 0.86 - 0.84|0.84 1.03(0.95, 1.12) 0.45 rs66718777 18 47190438 C|T 0.92|0.87 2.06(1.50, 2.84) 9.93×10 -6 0.89 - 0.92|0.92 1.02(0.91, 1.13) 0.76 rs8093567 18 47206346 T|A 0.90|0.84 2.01(1.51, 2.69) 2.32×10 -6 0.92 - 0.91|0.91 1.05(0.95, 1.16) 0.37 rs114181512 18 47207139 C|T 0.90|0.84 2.02(1.50, 2.71) 2.90×10 -6 0.93 - 0.92|0.92 1.03(0.92, 1.14) 0.64 rs113490493 18 47208215 A|C 0.90|0.84 1.96(1.48, 2.61) 3.39×10 -6 0.92 - 0.91|0.91 1.00(0.90, 1.11) 0.99 rs117201167 18 47208466 C|T 0.90|0.84 2.02(1.51, 2.71) 2.55×10 -6 0.92 - 0.91|0.91 1.05(0.94, 1.16) 0.40 rs201497371 18 47210467 GATCTGG ATTGTC|G 0.91|0.84 2.03(1.51, 2.73) 2.82×10 -6 0.92 - 0.92|0.92 1.02(0.92, 1.14) 0.66 rs59922605 18 47223024 T|C 
0.92|0.86 2.04(1.49, 2.79) 8.79×10 -6 0.94 - 0.93|0.93 1.01(0.90, 1.13) 0.91 rs2722752 19 44875987 T|C 0.80|0.72 1.76(1.37, 2.25) 7.03×10 -6 0.88 - 0.72|0.72 0.99(0.93, 1.06) 0.79 rs2571082 19 44875988 G|A 0.80|0.72 1.76(1.38, 2.25) 6.99×10 -6 0.88 - 0.72|0.72 0.99(0.93, 1.06) 0.79 rs12978865 19 57196507 A|T 0.78|0.70 1.63(1.31, 2.03) 9.68×10 -6 0.95 - 0.72|0.71 1.03(0.97, 1.10) 0.34 rs12971881 19 57197032 G|T 0.78|0.70 1.63(1.31, 2.02) 9.56×10 -6 0.96 - 0.72|0.71 1.03(0.97, 1.10) 0.31 rs9653124 19 57197096 C|T 0.78|0.70 1.63(1.31, 2.02) 9.43×10 -6 0.97 - 0.72|0.71 1.03(0.97, 1.11) 0.31 rs7258285 19 57198573 G|A 0.78|0.70 1.66(1.34, 2.06) 3.49×10 -6 1.00 Genotyped 0.72|0.71 1.04(0.97, 1.11) 0.30 rs6034548 20 16590457 A|C 0.46|0.37 1.57(1.29, 1.93) 9.90×10 -6 0.94 - 0.36|0.37 0.97(0.92, 1.04) 0.41

a Effect allele | Reference allele; b Case | Control; c Imputation r-squared

Chapter 3

Supplementary Table S1. Descriptive characteristics of study participants.

Variables                              Set 1 – MEC, n (%)        Set 2 – LAAPC, n (%)      Set 2 – MDA, n (%)        Set 2 – MEC, n (%)        Set 2 – SABOR, n (%)      Set 3 – Kaiser, n (%)
                                       Case         Control      Case         Control      Case         Control      Case         Control      Case         Control      Case         Control
N                                      N=1,034      N=1,046      N=284        N=326        N=517        N=311        N=135        N=160        N=256        N=255        N=488        N=3,141
Age (y)
  ≤60                                  79 (7.6)     90 (8.6)     60 (21.1)    70 (21.5)    199 (38.5)   128 (41.2)   1 (0.7)      13 (8.1)     68 (26.6)    65 (25.5)    111 (22.7)   1387 (44.2)
  60-70                                460 (44.5)   418 (40.0)   140 (49.3)   150 (46.0)   217 (42.0)   121 (38.9)   35 (25.9)    82 (51.2)    100 (39.1)   101 (39.6)   237 (48.6)   842 (26.8)
  70-80                                445 (43.0)   423 (40.4)   67 (23.6)    88 (27.0)    92 (17.8)    52 (16.7)    72 (53.3)    57 (35.6)    69 (27.0)    70 (27.5)    124 (25.4)   665 (21.2)
  ≥80                                  50 (4.8)     115 (11.0)   16 (5.6)     18 (5.5)     9 (1.7)      9 (2.9)      27 (20.0)    8 (5.0)      19 (7.4)     19 (7.5)     16 (3.3)     247 (7.9)
  mean (sd)                            69.0 (6.5)   69.6 (7.4)   65.7 (8.6)   65.8 (8.4)   61.8 (8.4)   62.0 (8.8)   73.7 (6.2)   68.2 (6.6)   66.4 (8.8)   66.5 (8.8)   66.1 (7.6)   61.2 (14.0)
BMI (kg/m²)
  <25                                  277 (26.8)   275 (26.5)   50 (17.8)    68 (21.2)    87 (17.2)    48 (15.4)    37 (27.4)    41 (25.8)    -            -            103 (21.1)   760 (24.2)
  25-30                                574 (55.6)   541 (52.2)   153 (54.4)   176 (54.8)   254 (50.2)   121 (38.9)   70 (51.9)    88 (55.3)    -            -            257 (52.7)   1475 (47.0)
  ≥30                                  182 (17.6)   221 (21.3)   78 (27.8)    77 (24.0)    165 (32.6)   142 (45.7)   28 (20.7)    30 (18.9)    -            -            128 (26.2)   906 (28.8)
  unknown                              1            9            3            5            11           0            0            1            256          255          0            0
  mean (sd)                            27.2 (3.8)   27.5 (3.9)   28.5 (4.7)   27.9 (4.2)   29.0 (4.6)   29.8 (5.4)   27.3 (3.7)   27.3 (3.7)   -            -            27.7 (4.1)   27.7 (4.8)
Gleason score
  <5                                   58 (5.8)     -            159 (58.2)   -            2 (0.4)      -            0 (0.0)      -            5 (2.1)      -            3 (1.1)      -
  5-7                                  674 (67.5)   -            0 (0.0)      -            393 (77.8)   -            108 (85.7)   -            180 (75.9)   -            243 (85.6)   -
  8-10                                 266 (26.7)   -            114 (41.8)   -            110 (21.8)   -            18 (14.3)    -            52 (21.9)    -            38 (13.4)    -
  unknown                              36           -            11           -            12           -            9            -            19           -            204          -
Stage
  localized                            815 (82.3)   -            113 (42.0)   -            58 (46.8)    -            85 (80.2)    -            -            -            335 (88.2)   -
  regional                             148 (14.9)   -            156 (58.0)   -            66 (53.2)    -            12 (11.3)    -            -            -            41 (10.8)    -
  distant metastases/systemic disease  27 (2.7)     -            0 (0.0)      -            0 (0.0)      -            9 (8.5)      -            -            -            4 (1.1)      -
  unknown                              44           -            15           -            393          -            29           -            256          -            108          -
Family history
  yes                                  106 (11.4)   64 (6.8)     56 (19.9)    20 (6.7)     98 (19.0)    9 (2.9)      10 (8.8)     4 (2.8)      -            -            -            -
  unknown                              107          112          2            29           0            0            21           18           256          255          488          3141

Frequencies of categorical variables were calculated by dividing cell counts by the total number of cases or controls with available values for that variable.

Supplementary Table S2. SNPs significantly associated with prostate cancer risk in Latino men (Wald P<5×10⁻⁵).
SNP | Position | Alleles a | RAF 1KGP b (EUR, AMR, AFR) | RAFs in Latino Studies (Set1 Case, Set1 Control, Set2 Case, Set2 Control, Set3) | Meta-analysis (OR (95% CI) c, P-value, Phet) | Imputation r² (Set1, Set2, Set3)
rs10993994 10:51549496 T|C 0.39 0.40 0.65 0.40 0.35 0.43 0.35 0.39 1.29(1.19,1.39) 1.08x10 -10 0.23 1.00 1.00 1.00 rs7843031 8:128533473 T|C 0.11 0.08 0.17 0.11 0.07 0.12 0.08 0.09 1.53(1.34,1.74) 5.12x10 -10 0.12 0.93 1.00 1.00 rs11006462 10:51545813 G|A 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.18,1.37) 8.09x10 -10 0.14 0.98 0.99 1.00 rs7075009 10:51544143 T|G 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.18,1.37) 9.15x10 -10 0.13 0.98 1.00 1.00 rs7098889 10:51544475 C|T 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.18,1.37) 1.04x10 -9 0.14 0.98 1.00 1.00 rs4630243 10:51540867 T|C 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.18,1.37) 1.06x10 -9 0.09 0.99 1.00 1.00 rs4315013 10:51540860 G|A 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.18,1.37) 1.07x10 -9 0.09 0.99 1.00 1.00 rs10763588 10:51539762 G|T 0.44 0.45 0.83 0.44 0.40 0.48 0.39 0.44 1.27(1.18,1.37) 1.13x10 -9 0.09 0.98 1.00 1.00 rs60822390 10:51544680 CAAAT|C 0.44 0.46 0.87 0.43 0.39 0.48 0.39 0.44 1.27(1.18,1.37) 1.18x10 -9 0.17 0.98 1.00 1.00 rs28416634 10:51539271 G|A 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.18x10 -9 0.08 0.99 1.00 1.00 rs78865546 10:51539259 C|A 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.19x10 -9 0.08 0.99 1.00 1.00 rs11006223 10:51538915 G|T 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.24x10 -9 0.08 1.00 1.00 1.00 rs7075697 10:51547371 C|G 0.40 0.36 0.25 0.34 0.31 0.41 0.33 0.38 1.29(1.19,1.40) 1.26x10 -9 0.17 0.83 0.99 0.99 rs11006207 10:51538176 T|C 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.26x10 -9 0.08 1.00 1.00 1.00 rs7477953 10:51544692 G|A 0.44 0.45 0.80 0.43 0.39 0.48 0.39 0.43 1.27(1.17,1.37) 1.28x10 -9 0.11 0.97 1.00 1.00 rs35494443 10:51539745 AC|A 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.32x10
-9 0.08 0.99 1.00 1.00 rs7914347 10:51537431 C|T 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.33x10 -9 0.08 1.00 1.00 1.00 rs4131357 10:51537292 C|A 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.35x10 -9 0.08 1.00 1.00 1.00 rs11006489 10:51546336 A|G 0.40 0.36 0.26 0.34 0.32 0.41 0.33 0.38 1.29(1.19,1.40) 1.36x10 -9 0.17 0.83 0.99 1.00 rs10763546 10:51536399 C|G 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.45x10 -9 0.08 1.00 1.00 1.00 rs10826181 10:51535992 C|T 0.44 0.45 0.83 0.43 0.39 0.48 0.39 0.44 1.27(1.17,1.37) 1.46x10 -9 0.08 0.99 1.00 1.00 rs7911198 10:51537475 G|A 0.43 0.43 0.79 0.42 0.38 0.47 0.37 0.43 1.27(1.17,1.37) 1.98x10 -9 0.07 0.98 0.96 1.00 rs7824776 8:128533442 C|T 0.11 0.08 0.17 0.11 0.07 0.11 0.08 0.09 1.51(1.32,1.72) 1.98x10 -9 0.06 0.92 1.00 1.00 rs10763539 10:51535931 T|C 0.44 0.46 0.86 0.43 0.39 0.48 0.39 0.44 1.26(1.17,1.36) 2.18x10 -9 0.10 1.00 1.00 1.00 174 rs34073986 10:51547002 CA|C 0.40 0.38 0.51 0.36 0.33 0.43 0.34 0.39 1.28(1.18,1.39) 2.37x10 -9 0.18 0.83 0.98 0.99 rs10826227 10:51539256 T|C 0.44 0.40 0.44 0.41 0.37 0.45 0.37 0.42 1.27(1.17,1.37) 2.65x10 -9 0.16 0.98 1.00 1.00 rs10763576 10:51538813 A|T 0.43 0.40 0.44 0.41 0.37 0.45 0.37 0.42 1.26(1.17,1.37) 2.87x10 -9 0.16 0.98 1.00 1.00 rs9787697 10:51533376 C|T 0.44 0.46 0.85 0.43 0.40 0.48 0.39 0.44 1.26(1.17,1.36) 3.17x10 -9 0.11 1.00 1.00 1.00 rs4512771 10:51540906 C|A 0.43 0.39 0.40 0.39 0.36 0.44 0.35 0.40 1.27(1.17,1.37) 3.17x10 -9 0.11 0.96 0.95 0.99 rs10763534 10:51534920 C|T 0.44 0.45 0.79 0.43 0.39 0.48 0.39 0.44 1.26(1.17,1.36) 3.19x10 -9 0.11 1.00 1.00 1.00 rs4463804 10:51537653 T|C 0.43 0.40 0.43 0.41 0.37 0.45 0.37 0.42 1.26(1.17,1.36) 3.19x10 -9 0.15 0.98 1.00 1.00 rs7920517 10:51532621 G|A 0.44 0.46 0.79 0.43 0.39 0.48 0.39 0.44 1.26(1.17,1.36) 3.68x10 -9 0.10 1.00 1.00 1.00 rs11006274 10:51540291 T|C 0.43 0.39 0.40 0.39 0.36 0.44 0.35 0.40 1.27(1.17,1.37) 3.74x10 -9 0.12 0.96 0.95 1.00 rs10826127 10:51530757 G|A 0.44 0.46 0.85 
0.43 0.40 0.48 0.39 0.44 1.26(1.17,1.36) 4.00x10 -9 0.11 0.99 1.00 1.00 rs6481462 10:51545028 G|A 0.40 0.38 0.48 0.36 0.33 0.43 0.34 0.39 1.27(1.18,1.38) 4.03x10 -9 0.14 0.83 1.00 1.00 rs10763567 10:51538169 A|C 0.43 0.37 0.29 0.37 0.34 0.43 0.34 0.40 1.27(1.17,1.37) 4.04x10 -9 0.12 0.90 1.00 1.00 rs4304716 10:51544587 A|G 0.40 0.38 0.48 0.36 0.33 0.43 0.34 0.39 1.27(1.18,1.38) 4.16x10 -9 0.14 0.83 1.00 1.00 rs4630241 10:51532751 G|A 0.44 0.46 0.85 0.43 0.40 0.48 0.39 0.44 1.26(1.16,1.36) 4.52x10 -9 0.10 1.00 1.00 1.00 rs10763538 10:51535857 A|G 0.43 0.39 0.41 0.39 0.36 0.44 0.35 0.41 1.26(1.17,1.36) 5.01x10 -9 0.11 0.97 0.99 1.00 rs4631830 10:51543344 C|T 0.40 0.38 0.48 0.36 0.33 0.43 0.34 0.39 1.27(1.17,1.38) 6.07x10 -9 0.13 0.83 0.99 1.00 rs2843555 10:51524274 C|T 0.44 0.46 0.85 0.44 0.40 0.48 0.39 0.44 1.26(1.16,1.36) 6.24x10 -9 0.11 0.98 1.00 1.00 rs4306255 10:51542444 A|G 0.41 0.41 0.64 0.40 0.36 0.45 0.36 0.41 1.26(1.17,1.37) 6.28x10 -9 0.18 0.91 1.00 1.00 rs10763536 10:51535801 G|A 0.43 0.40 0.46 0.40 0.36 0.44 0.36 0.41 1.26(1.16,1.36) 6.60x10 -9 0.15 0.98 0.99 1.00 rs4581397 10:51532367 A|G 0.43 0.39 0.38 0.39 0.36 0.44 0.35 0.40 1.26(1.16,1.36) 7.26x10 -9 0.09 0.97 0.97 1.00 rs1585378 10:51523213 A|T 0.43 0.46 0.79 0.43 0.40 0.48 0.39 0.44 1.25(1.16,1.35) 7.36x10 -9 0.10 0.98 1.00 1.00 rs7070301 10:51543070 T|G 0.41 0.41 0.64 0.40 0.36 0.45 0.36 0.41 1.26(1.17,1.36) 7.48x10 -9 0.18 0.90 1.00 1.00 rs11593361 10:51539156 A|G 0.43 0.43 0.65 0.42 0.38 0.46 0.38 0.43 1.25(1.16,1.35) 8.34x10 -9 0.10 0.98 1.00 0.99 rs2249986 10:51521684 T|G 0.43 0.46 0.85 0.44 0.40 0.48 0.39 0.44 1.25(1.16,1.35) 8.71x10 -9 0.10 0.96 1.00 1.00 10:51520488 T|TA A 0.46 0.42 0.62 0.42 0.39 0.47 0.38 0.43 1.27(1.17,1.37) 9.26x10 -9 0.04 0.86 0.87 0.96 rs2611486 10:51519516 G|A 0.43 0.46 0.85 0.44 0.40 0.48 0.39 0.44 1.25(1.16,1.35) 1.15x10 -8 0.10 0.95 1.00 1.00 rs2611504 10:51518910 G|A 0.43 0.46 0.85 0.44 0.40 0.48 0.39 0.44 1.25(1.16,1.35) 1.22x10 -8 0.10 0.95 1.00 1.00 rs2843546 
10:51518688 A|C 0.43 0.46 0.85 0.44 0.40 0.48 0.39 0.44 1.25(1.16,1.35) 1.23x10 -8 0.10 0.95 1.00 1.00 rs7896437 10:51529593 A|C 0.43 0.40 0.47 0.40 0.36 0.44 0.36 0.41 1.25(1.16,1.35) 1.30x10 -8 0.15 0.98 1.00 1.00 rs4486572 10:51531805 A|G 0.43 0.40 0.46 0.39 0.36 0.44 0.35 0.41 1.26(1.16,1.36) 1.34x10 -8 0.14 0.97 0.98 1.00 175 rs10090154 8:128532137 T|C 0.10 0.08 0.19 0.12 0.08 0.11 0.08 0.09 1.47(1.29,1.68) 1.48x10 -8 0.06 0.92 1.00 0.99 rs7915602 10:51528656 A|G 0.43 0.40 0.47 0.40 0.36 0.44 0.36 0.41 1.25(1.16,1.35) 1.58x10 -8 0.15 0.98 1.00 1.00 rs66462216 10:51512393 A|AT 0.44 0.44 0.78 0.42 0.39 0.47 0.38 0.43 1.26(1.16,1.36) 1.60x10 -8 0.12 0.84 0.95 0.99 rs2611471 10:51503559 G|C 0.47 0.49 0.85 0.48 0.45 0.52 0.44 0.47 1.26(1.16,1.36) 1.84x10 -8 0.22 0.81 0.94 1.00 rs66462216 10:51512393 AT|A TT 0.50 0.53 0.85 0.51 0.47 0.55 0.47 0.51 1.27(1.17,1.37) 2.01x10 -8 0.20 0.80 0.90 0.92 rs2012677 10:51504797 T|A 0.47 0.49 0.79 0.47 0.44 0.51 0.43 0.47 1.25(1.16,1.35) 2.02x10 -8 0.20 0.83 0.96 1.00 rs7081532 10:51526093 A|G 0.43 0.39 0.37 0.39 0.36 0.44 0.35 0.40 1.25(1.16,1.35) 2.25x10 -8 0.06 0.98 0.99 1.00 rs2125771 10:51506957 C|T 0.47 0.49 0.85 0.47 0.44 0.51 0.43 0.47 1.25(1.16,1.35) 2.25x10 -8 0.20 0.86 0.95 1.00 rs7088225 10:51522120 C|T 0.43 0.37 0.27 0.36 0.34 0.43 0.35 0.40 1.25(1.16,1.36) 2.30x10 -8 0.06 0.89 0.95 1.00 rs4242382 8:128517573 A|G 0.11 0.09 0.36 0.14 0.09 0.13 0.10 0.11 1.40(1.25,1.58) 2.35x10 -8 0.07 1.00 1.00 1.00 rs2611475 10:51503178 G|A 0.47 0.49 0.79 0.47 0.44 0.51 0.43 0.47 1.25(1.16,1.36) 2.40x10 -8 0.21 0.80 0.95 0.99 rs2843556 10:51516299 G|T 0.43 0.46 0.79 0.44 0.41 0.48 0.39 0.44 1.25(1.15,1.35) 2.46x10 -8 0.08 0.92 0.97 0.98 rs7077830 10:51522276 G|C 0.43 0.37 0.27 0.36 0.34 0.43 0.34 0.39 1.25(1.16,1.36) 2.52x10 -8 0.06 0.88 0.94 1.00 rs4314621 8:128518015 G|A 0.10 0.08 0.16 0.12 0.08 0.12 0.09 0.10 1.43(1.26,1.63) 2.71x10 -8 0.08 1.00 1.00 1.00 rs35094615 10:51518083 TA|T 0.43 0.44 0.75 0.43 0.39 0.47 0.38 0.43 
1.24(1.15,1.34) 2.74x10 -8 0.13 0.94 1.00 0.99 rs4515512 8:128532399 A|G 0.10 0.08 0.08 0.11 0.07 0.11 0.07 0.08 1.48(1.29,1.69) 2.74x10 -8 0.04 0.93 1.00 0.99 rs2843554 10:51523861 G|T 0.43 0.39 0.36 0.39 0.36 0.44 0.35 0.40 1.25(1.15,1.35) 2.89x10 -8 0.07 0.97 0.99 1.00 rs4935162 10:51525699 G|C 0.43 0.40 0.47 0.40 0.37 0.44 0.36 0.41 1.25(1.15,1.35) 2.91x10 -8 0.11 1.00 1.00 1.00 rs6481329 10:51529746 G|A 0.43 0.43 0.61 0.41 0.38 0.46 0.38 0.43 1.24(1.15,1.34) 3.05x10 -8 0.10 0.98 1.00 1.00 rs10826125 10:51530505 G|A 0.43 0.43 0.60 0.41 0.38 0.46 0.38 0.43 1.24(1.15,1.34) 3.19x10 -8 0.10 0.98 1.00 1.00 rs4554834 10:51530146 A|C 0.43 0.43 0.60 0.41 0.38 0.46 0.38 0.43 1.24(1.15,1.34) 3.21x10 -8 0.10 0.98 1.00 1.00 rs7896156 10:51529379 A|G 0.43 0.43 0.60 0.41 0.38 0.46 0.38 0.43 1.24(1.15,1.34) 3.33x10 -8 0.10 0.98 1.00 1.00 rs2611489 10:51524889 G|A 0.43 0.40 0.47 0.40 0.37 0.44 0.36 0.41 1.24(1.15,1.34) 3.68x10 -8 0.11 1.00 1.00 1.00 rs4242384 8:128518554 C|A 0.10 0.08 0.16 0.12 0.08 0.12 0.09 0.10 1.43(1.26,1.62) 3.80x10 -8 0.06 1.00 1.00 1.00 rs3123078 10:51524971 C|T 0.43 0.40 0.47 0.40 0.37 0.44 0.36 0.41 1.24(1.15,1.34) 3.81x10 -8 0.11 1.00 1.00 1.00 rs12545648 8:128534755 C|T 0.11 0.08 0.06 0.11 0.07 0.11 0.07 0.09 1.47(1.28,1.68) 3.98x10 -8 0.06 0.96 1.00 1.00 rs9297759 8:128519171 A|C 0.10 0.08 0.16 0.12 0.08 0.12 0.09 0.10 1.42(1.25,1.61) 4.75x10 -8 0.07 1.00 1.00 1.00 rs2843549 10:51521247 C|A 0.43 0.40 0.47 0.40 0.37 0.44 0.36 0.41 1.24(1.15,1.34) 4.87x10 -8 0.12 0.98 1.00 1.00 rs3101227 10:51520203 C|A 0.43 0.39 0.36 0.39 0.36 0.43 0.35 0.40 1.24(1.15,1.35) 4.87x10 -8 0.08 0.95 0.98 0.99 rs8180905 8:128538824 A|G 0.11 0.08 0.06 0.11 0.07 0.11 0.07 0.09 1.46(1.28,1.68) 4.95x10 -8 0.05 0.98 0.99 0.98 176 rs61847060 10:51510203 A|G 0.44 0.38 0.27 0.37 0.35 0.43 0.35 0.40 1.25(1.16,1.36) 4.98x10 -8 0.10 0.80 0.94 1.00 a Effect allele|Reference allele; b Risk allele frequencies (RAFs) were from 1KGP; c Odds ratios (ORs) were adjusted for age, study, and 
the first 10 principal components from the PCA.

Supplementary Table S3. Associations of 179 known PrCa risk alleles among Latino men.

SNP ID | Region | Ref. Paper a | Risk Allele | European b (RAF, OR (95% CI)) | Latino Study (RAF c, OR (95% CI) d, P-value, Phet e) | Imputation r² (Set 1, Set 2, Set 3)
rs11352831 1q21.3 1 C 0.58 1.03(1.01, 1.05) 0.53 0.92 (0.85,0.99) 0.03 0.48 0.91 0.94 0.96 rs1811698 1q21.3 1 C 0.89 1.1(1.08, 1.13) 0.88 1.08 (0.95,1.21) 0.23 0.97 0.97 0.99 1 rs34579442 1q21.3 2 C 0.34 1.07(1.05, 1.09) 0.40 1.03 (0.95,1.12) 0.42 0.61 0.87 0.81 0.93 rs56103503 1q21.3 1 T 0.39 1.07(1.05, 1.08) 0.46 1.07 (0.99,1.15) 0.1 0.82 0.9 1 1 rs55664108 1q32.1 1 CCTT 0.72 1.1(1.08, 1.11) 0.71 1.17 (1.08,1.27) 2.81x10 -4 0.58 0.99 0.99 0.98 rs397982729 1q32.1 1 C 0.48 1.05(1.03, 1.07) 0.52 1.01 (0.92,1.10) 0.86 0.23 0.75 0.88 0.85 rs56391074 1p22.3 2 AT 0.37 1.05(1.03, 1.06) 0.48 1.02 (0.95,1.10) 0.61 0.93 0.98 0.99 0.99 rs572944347 2p25.1 1 C 0.10 1.08(1.05, 1.11) 0.07 1.10 (0.95,1.28) 0.2 0.03 0.99 0.99 1 rs533351651 2p25.1 1 ATTTTTTTTTTTTTT 0.47 1.07(1.06, 1.09) 0.47 0.99 (0.92,1.07) 0.76 0.58 0.96 0.9 0.98 rs11691517 2q13 2 T 0.74 1.07(1.05, 1.08) 0.82 1.03 (0.93,1.15) 0.56 0.74 0.93 1 0.97 rs114161133 2q31.1 1 G 0.94 1.27(1.23, 1.32) 0.93 1.25 (1.07,1.45) 4.71x10 -3 0.88 0.99 1 1 rs34925593 2q31.1 2 C 0.48 1.05(1.03, 1.07) 0.47 1.06 (0.98,1.14) 0.15 0.7 0.97 1 0.98 rs59308963 2q33.1 2 T 0.73 1.05(1.03, 1.07) 0.60 1.03 (0.95,1.11) 0.46 0.52 0.99 1 1 rs7255 2p24.1 1 T 0.46 1.07(1.06, 1.09) 0.49 1.03 (0.96,1.12) 0.38 0.58 1 1 0.98 rs60079197 2q37.3 1 T 0.19 1.07(1.05, 1.09) 0.20 1.00 (0.91,1.10) 0.97 0.84 0.91 0.95 0.95 rs77559646 2q37.3 1 A 0.02 1.28(1.22, 1.35) - - - - - - - rs77482050 2q37.3 1 G 0.99 1.48(1.37, 1.6) - - - - - - - rs62187431 2q37.3 1 G 0.17 1.1(1.08, 1.12) 0.15 1.12 (1.01,1.25) 0.04 0.63 0.88 1 0.92 rs7591218 2p21 1 A 0.32 1.09(1.07, 1.11) 0.50 1.08 (1.00,1.16) 0.06 0.58 1 1 1 rs58235267 2p15 1 G 0.47 1.12(1.11, 1.14) 0.52 1.17 (1.08,1.26) 9.67x10 -5
0.34 0.99 0.99 1
rs74702681  2p14  2  T  0.02  1.17 (1.11, 1.23)  -  -  -  -  -  -  -
rs2028900  2p11.2  1  C  0.56  1.09 (1.07, 1.1)  0.63  1.12 (1.04, 1.22)  4.70×10^-3  0.38  1  1  1
rs62106670  2p25.1  2  T  0.38  1.05 (1.04, 1.07)  0.40  1.04 (0.96, 1.13)  0.34  0.32  0.77  0.84  0.99
rs1283104  3q13.12  2  G  0.38  1.05 (1.03, 1.07)  0.47  1.03 (0.96, 1.11)  0.41  0.72  0.98  0.99  0.99
rs7639565  3q13.2  1  C  0.56  1.09 (1.07, 1.11)  0.61  1.03 (0.95, 1.12)  0.43  0.53  0.99  1  1
rs34834087  3q21.3  1  TC  0.27  1.11 (1.09, 1.13)  0.38  0.99 (0.91, 1.07)  0.74  0.76  0.98  1  0.99
rs7624084  3q23  1  C  0.44  1.04 (1.03, 1.06)  0.43  0.96 (0.89, 1.03)  0.25  0.04  0.92  1  0.98
rs182314334  3q25.1  2  T  0.9  1.09 (1.06, 1.12)  0.93  1.02 (0.87, 1.19)  0.81  0.12  0.85  0.99  0.97
rs142436749  3q26.2  2  G  0.01  1.25 (1.16, 1.34)  -  -  -  -  -  -  -
rs57508070  3q26.2  1  A  0.88  1.07 (1.04, 1.09)  0.86  1.10 (0.98, 1.23)  0.11  0.6  0.97  1  0.98
rs61436251  3q26.2  1  C  0.79  1.19 (1.16, 1.21)  0.78  1.25 (1.13, 1.38)  1.05×10^-5  0.35  0.7  0.98  0.94
rs7642887  3p12.1  1  T  0.50  1.12 (1.1, 1.14)  0.61  1.13 (1.04, 1.22)  2.14×10^-3  0.42  1  1  1
rs559060446  4q24  1  TA  0.59  1.13 (1.11, 1.15)  0.47  1.09 (1.01, 1.17)  0.04  0.35  0.97  0.99  0.99
rs1894292  4q13.3  1  G  0.52  1.06 (1.05, 1.08)  0.63  1.02 (0.94, 1.10)  0.68  0.27  1  1  0.98
rs17804499  4q13.3  1  G  0.95  1.17 (1.12, 1.21)  0.96  0.97 (0.78, 1.20)  0.75  0.97  0.62  1  0.93
rs2452593  4q22.3  1  G  0.22  1.01 (0.99, 1.03)  0.20  0.99 (0.90, 1.09)  0.79  0.06  1  1  0.99
rs6853490  4q22.3  1  G  0.44  1.08 (1.07, 1.1)  0.51  1.06 (0.98, 1.15)  0.14  0.42  0.93  0.99  0.96
rs4449583  5p15.33  1  C  0.66  1.12 (1.1, 1.14)  0.69  1.09 (1.00, 1.19)  0.04  0.76  0.79  0.99  0.97
rs7705526  5p15.33  1  A  0.33  0.93 (0.92, 0.95)  0.30  0.93 (0.85, 1.02)  0.11  0.98  0.79  1  0.93
rs2853677  5p15.33  1  G  0.42  1.02 (1, 1.04)  0.35  1.05 (0.97, 1.15)  0.21  0.11  0.81  0.99  0.99
rs11414507  5p15.33  1  AC  0.85  1.18 (1.15, 1.21)  0.88  1.07 (0.94, 1.21)  0.32  0.95  0.8  0.97  0.93
rs71595003  5p15.33  1  A  0.03  1.21 (1.15, 1.26)  -  -  -  -  -  -  -
rs10793821  5q31.1  2  T  0.58  1.05 (1.04, 1.07)  0.71  1.20 (1.10, 1.31)  5.12×10^-5  0.85  0.99  0.96  0.98
rs76551843  5q35.1  2  A  0.99  1.31 (1.19, 1.44)  -  -  -  -  -  -  -
rs6860868  5q35.2  1  C  0.44  1.05 (1.03, 1.07)  0.41  1.02 (0.95, 1.10)  0.57  0.72  0.99  1  0.99
rs4976790  5q35.3  2  T  0.11  1.08 (1.05, 1.1)  0.13  1.07 (0.96, 1.20)  0.23  0.88  1  1  1
rs199577062  5p15.33  1  TG  0.36  1.1 (1.08, 1.12)  0.33  1.12 (1.03, 1.22)  7.42×10^-3  0.89  0.75  0.98  0.86
rs10055386  5p12  1  G  0.33  1.05 (1.03, 1.07)  0.48  1.02 (0.94, 1.10)  0.68  0.28  1  1  1
rs12209480  6q21  1  A  0.09  1.08 (1.06, 1.11)  0.05  1.15 (0.97, 1.37)  0.11  0.2  0.82  1  1
rs2018336  6p24.2  1  T  0.78  1.07 (1.05, 1.09)  0.82  1.07 (0.96, 1.18)  0.22  0.84  0.98  0.99  0.97
rs630045  6q22.1  1  C  0.69  1.09 (1.07, 1.11)  0.72  1.00 (0.92, 1.08)  0.95  0.82  0.98  1  1
rs570624213  6q25.2  1  ATT  0.59  1.08 (1.06, 1.09)  0.57  1.00 (0.93, 1.08)  0.98  0.5  0.97  0.99  0.99
rs2342478  6q25.3  1  C  0.81  1.06 (1.04, 1.09)  0.81  1.10 (1.00, 1.21)  0.06  0.21  1  1  1
rs4646284  6q25.3  1  TG  0.29  1.22 (1.2, 1.24)  0.24  1.15 (1.05, 1.26)  2.78×10^-3  0.17  0.87  0.97  0.89
rs641990  6q25.3  1  G  0.49  1.08 (1.06, 1.1)  0.56  1.04 (0.96, 1.12)  0.36  0.77  0.99  1  0.99
rs5875234  6p22.1  1  CTGAGTA  0.25  1.06 (1.04, 1.08)  0.28  1.10 (1.01, 1.19)  0.03  0.54  1  1  1
rs12665339  6p21.33  2  G  0.17  1.06 (1.04, 1.09)  0.18  1.02 (0.93, 1.13)  0.66  0.79  1  0.98  0.99
rs3096702  6p21.32  2  A  0.38  1.06 (1.04, 1.07)  0.33  1.12 (1.04, 1.22)  4.69×10^-3  0.58  0.98  0.99  1
rs3129859  6p21.32  2  G  0.67  1.06 (1.04, 1.08)  0.76  1.04 (0.95, 1.14)  0.38  0.87  0.97  0.99  1
rs9296068  6p21.32  2  T  0.65  1.05 (1.03, 1.07)  0.68  0.91 (0.84, 0.99)  0.03  0.45  1  1  1
rs9469899  6p21.31  2  A  0.36  1.05 (1.03, 1.07)  0.33  1.01 (0.94, 1.10)  0.72  0.93  0.98  0.99  1
rs10947980  6p21.1  1  G  0.26  1.09 (1.07, 1.11)  0.32  1.16 (1.07, 1.25)  4.89×10^-4  0.43  0.88  1  0.99
rs4711748  6p21.1  2  T  0.23  1.05 (1.03, 1.07)  0.27  0.99 (0.91, 1.07)  0.77  0.3  1  0.95  1
rs9443189  6q14.1  1  A  0.86  1.07 (1.04, 1.09)  0.86  1.04 (0.94, 1.17)  0.43  0.89  0.99  0.99  0.99
rs527510716  7p22.3  2  C  0.24  1.06 (1.04, 1.08)  0.28  1.07 (0.97, 1.18)  0.2  0.95  0.65  0.71  0.8
rs11452686  7p21.1  2  T  0.56  1.05 (1.03, 1.07)  0.44  1.10 (1.01, 1.20)  0.04  0.63  0.73  0.7  0.85
rs12155172  7p15.3  1  A  0.22  1.1 (1.08, 1.12)  0.15  1.11 (0.99, 1.24)  0.06  0.57  0.87  1  1
rs67152137  7p15.2  1  G  0.76  1.14 (1.12, 1.16)  0.57  1.14 (1.05, 1.23)  9.69×10^-4  0.71  1  1  1
rs17621345  7p14.1  2  A  0.74  1.07 (1.05, 1.09)  0.78  1.02 (0.93, 1.12)  0.61  0.82  0.96  1  0.96
rs3037443  7p12.3  1  TGCAAA  0.35  1.06 (1.04, 1.08)  0.44  1.03 (0.96, 1.11)  0.43  0.4  0.97  1  0.97
rs11763970  7q21.3  1  A  0.46  1.11 (1.09, 1.12)  0.64  1.13 (1.04, 1.23)  3.66×10^-3  0  1  1  1
rs1914295  8q24.21  3  T  0.68  1.08 (1.06, 1.09)  0.73  1.05 (0.96, 1.14)  0.29  0.27  0.98  0.97  0.99
rs1487240  8q24.21  3  A  0.74  1.16 (1.14, 1.18)  0.72  1.15 (1.06, 1.25)  1.23×10^-3  0.14  1  1  1
rs7463326  8q24.21  4  G  0.75  1.16 (1.14, 1.18)  0.72  1.15 (1.06, 1.25)  1.32×10^-3  0.19  0.99  1  0.99
rs72725854  8q24.21  4  T  -  -  -  -  -  -  -  -  -
rs77541621  8q24.21  3  A  0.02  1.81 (1.73, 1.89)  -  -  -  -  -  -  -
rs190257175  8q24.21  3  T  0.99  1.31 (1.17, 1.45)  -  -  -  -  -  -  -
rs72725879  8q24.21  3  T  0.19  1.16 (1.14, 1.18)  0.24  1.25 (1.15, 1.36)  2.63×10^-7  0.83  0.97  0.98  0.99
rs5013678  8q24.21  3  T  0.79  1.19 (1.17, 1.22)  0.81  1.21 (1.09, 1.34)  3.45×10^-4  0.81  0.86  1  0.95
rs183373024  8q24.21  3  G  0.01  2.91 (2.69, 3.14)  -  -  -  -  -  -  -
rs78511380  8q24.21  3  T  0.92  0.97 (0.94, 1.00)  0.96  0.96 (0.78, 1.18)  0.69  0.99  0.9  0.99  0.97
rs17464492  8q24.21  3  A  0.72  1.15 (1.13, 1.17)  0.79  1.16 (1.06, 1.28)  2.18×10^-3  0.47  1  1  0.98
rs6983267  8q24.21  3  G  0.51  1.22 (1.2, 1.24)  0.58  1.24 (1.15, 1.34)  7.50×10^-8  0.02  1  1  1
rs7812894  8q24.21  3  A  0.11  1.42 (1.39, 1.46)  0.10  1.38 (1.22, 1.55)  2.82×10^-7  0.03  0.99  1  1
rs12549761  8q24.21  3  C  0.88  1.27 (1.23, 1.3)  0.86  1.10 (0.98, 1.23)  0.12  0.83  0.86  0.99  0.87
rs11135749  8p21.2  1  T  0.28  1.06 (1.04, 1.07)  0.23  1.06 (0.97, 1.16)  0.21  0.55  1  1  0.99
rs11782388  8p21.2  1  C  0.43  1.14 (1.12, 1.16)  0.45  1.16 (1.08, 1.26)  1.03×10^-4  0.57  0.99  1  1
rs11135910  8p21.2  1  T  0.15  1.08 (1.06, 1.1)  0.16  1.06 (0.96, 1.17)  0.27  0.52  1  1  0.99
rs138284905  9q31.2  1  G  0.28  1.06 (1.04, 1.07)  0.31  1.13 (1.04, 1.22)  4.20×10^-3  0.41  1  1  1
rs1182  9q34.11  2  A  0.22  1.06 (1.04, 1.08)  0.20  1.00 (0.90, 1.10)  0.94  0.27  1  0.98  0.99
rs1048169  9p22.1  2  C  0.38  1.06 (1.05, 1.08)  0.34  1.09 (1.00, 1.18)  0.04  0.65  1  0.99  0.99
rs17694493  9p21.3  1  G  0.14  1.08 (1.05, 1.1)  0.09  1.07 (0.93, 1.22)  0.35  0.01  0.92  1  0.99
rs10122495  9p13.3  2  T  0.30  1.05 (1.03, 1.07)  0.22  0.97 (0.88, 1.07)  0.54  0.3  0.92  0.93  0.96
rs12764219  10q24.32  1  G  0.72  1.07 (1.05, 1.09)  0.78  1.19 (1.08, 1.31)  4.00×10^-4  0.79  0.97  1  0.99
rs7094871  10q25.2  2  G  0.54  1.04 (1.03, 1.06)  0.56  0.97 (0.90, 1.05)  0.46  0.23  0.86  1  0.95
rs1004934  10q26.12  1  C  0.39  1.06 (1.04, 1.08)  0.29  1.03 (0.94, 1.12)  0.52  0.16  1  1  1
rs11245446  10q26.13  1  G  0.32  1.06 (1.04, 1.08)  0.24  1.03 (0.95, 1.13)  0.45  0.52  1  1  0.99
rs76934034  10q11.21  1  T  0.92  1.12 (1.09, 1.16)  0.92  1.12 (0.96, 1.31)  0.14  0.16  0.65  1  0.99
rs10993994  10q11.23  1  T  0.38  1.23 (1.21, 1.25)  0.37  1.29 (1.19, 1.39)  1.08×10^-10  0.23  1  1  1
rs141536087  10p15.3  2  GCGCA  0.15  1.08 (1.06, 1.11)  0.09  1.23 (1.07, 1.40)  2.63×10^-3  0.29  0.94  0.86  0.97
rs1935581  10q23.31  2  C  0.63  1.05 (1.03, 1.07)  0.41  1.05 (0.97, 1.14)  0.2  0.4  1  0.98  1
rs12285347  11q22.2  1  T  0.55  1.08 (1.06, 1.09)  0.63  1.16 (1.07, 1.26)  2.91×10^-4  0.88  0.96  1  1
rs1800057  11q22.3  2  G  0.02  1.16 (1.1, 1.22)  0.02  1.02 (0.73, 1.44)  0.9  0.78  0.75  0.95  0.93
rs5794883  11q23.2  1  C  0.7  1.08 (1.06, 1.09)  0.75  1.05 (0.96, 1.15)  0.31  0.86  0.88  0.94  0.94
rs138466039  11q24.2  2  T  0.01  1.32 (1.22, 1.44)  -  -  -  -  -  -  -
rs878987  11q25  2  G  0.15  1.07 (1.04, 1.09)  0.10  1.10 (0.96, 1.25)  0.17  0.16  0.99  0.96  0.99
rs1881502  11p15.5  2  T  0.19  1.06 (1.04, 1.08)  0.15  0.98 (0.88, 1.09)  0.69  0.58  0.98  0.99  0.99
rs7127900  11p15.5  1  A  0.2  1.19 (1.16, 1.21)  0.29  1.13 (1.04, 1.23)  3.60×10^-3  0.73  1  1  1
rs59111863  11p11.2  2  CGG  0.47  1.05 (1.03, 1.07)  0.45  1.03 (0.95, 1.12)  0.51  0.12  0.82  0.86  0.89
rs533676902  11q12.1  1  GGCAGATACTT  0.01  1.37 (1.25, 1.51)  -  -  -  -  -  -  -
rs2277283  11q12.3  2  C  0.31  1.06 (1.04, 1.08)  0.20  1.12 (1.01, 1.23)  0.03  0.89  0.89  0.96  1
rs12785905  11q13.2  2  C  0.05  1.12 (1.08, 1.17)  0.03  1.32 (1.01, 1.72)  0.04  0.48  0.59  0.69  0.92
rs4620729  11q13.3  1  A  0.5  1.17 (1.15, 1.19)  0.38  1.13 (1.04, 1.22)  3.36×10^-3  0.88  0.99  1  1
rs11228580  11q13.3  1  C  0.16  1.25 (1.22, 1.27)  0.12  1.20 (1.07, 1.35)  1.81×10^-3  0.72  0.98  1  1
rs61890184  11p15.4  2  A  0.12  1.07 (1.05, 1.10)  0.20  1.20 (1.09, 1.31)  1.47×10^-4  0.38  0.97  1  0.97
rs12791447  11p15.4  5  G  0.07  1.05 (1.02, 1.08)  0.17  1.24 (1.13, 1.37)  1.18×10^-5  0.62  1.00  1.00  0.95
rs11290954  11q13.5  2  AC  0.68  1.06 (1.05, 1.08)  0.78  0.97 (0.89, 1.07)  0.57  0.86  0.98  0.97  1
rs10774740  12q24.21  1  G  0.62  1.08 (1.06, 1.09)  0.52  1.04 (0.97, 1.13)  0.28  0.12  0.96  1  0.98
rs2066827  12p13.1  2  T  0.76  1.06 (1.04, 1.08)  0.80  0.99 (0.90, 1.09)  0.83  0.32  0.95  0.97  0.98
rs7295014  12q24.33  2  G  0.34  1.05 (1.04, 1.07)  0.44  1.02 (0.95, 1.11)  0.54  0.34  0.93  0.88  0.99
rs10845938  12p13.1  2  G  0.55  1.06 (1.04, 1.08)  0.61  1.09 (1.00, 1.18)  0.04  0.87  0.95  1  0.98
rs4760607  12q13.11  1  T  0.91  1.08 (1.05, 1.11)  0.94  1.09 (0.93, 1.28)  0.28  0.8  1  0.94  0.96
rs10875943  12q13.12  1  C  0.29  1.07 (1.05, 1.09)  0.33  1.04 (0.96, 1.13)  0.34  0.93  1  1  1
rs147181250  12q13.13  1  T  0.13  1.16 (1.14, 1.19)  0.07  1.08 (0.93, 1.26)  0.31  0.18  0.99  1  0.99
rs7968403  12q14.2  2  T  0.64  1.06 (1.04, 1.08)  0.71  1.01 (0.93, 1.10)  0.87  0.17  0.98  0.99  0.97
rs5799921  12q21.33  2  GA  0.7  1.06 (1.04, 1.08)  0.62  1.09 (1.01, 1.18)  0.04  0.38  0.98  0.97  0.99
rs75823044  13q34  4  T  -  -  -  -  -  -  -  -  -
rs7996468  13q22.1  1  C  0.22  1.07 (1.05, 1.09)  0.45  0.99 (0.91, 1.07)  0.74  0.97  0.96  1  0.98
rs1004030  14q11.2  2  T  0.58  1.05 (1.03, 1.06)  0.72  1.08 (0.99, 1.18)  0.09  0.13  0.96  0.96  0.99
rs11629412  14q13.3  2  C  0.58  1.06 (1.04, 1.08)  0.45  1.10 (1.01, 1.19)  0.02  0.16  0.87  0.99  0.99
rs566645858  14q22.1  1  G  0.87  1.11 (1.08, 1.13)  0.90  1.14 (0.99, 1.30)  0.06  0.76  0.96  0.99  1
rs58262369  14q23.2  5  C  0.998  1.27 (1.03, 1.57)  0.92  0.97 (0.85, 1.10)  0.6  0.64  0.99  0.98  1
rs7141529  14q24.1  1  C  0.50  1.05 (1.04, 1.07)  0.55  1.00 (0.93, 1.08)  0.94  0.93  1  1  1
rs11158871  14q24.2  1  C  0.68  1.05 (1.03, 1.07)  0.67  1.02 (0.94, 1.11)  0.6  0.89  1  1  1
rs4924487  15q15.1  2  C  0.84  1.06 (1.04, 1.09)  0.63  1.04 (0.95, 1.13)  0.38  0.66  0.99  0.99  1
rs33984059  15q21.3  2  A  0.98  1.19 (1.12, 1.27)  -  -  -  -  -  -  -
rs112293876  15q22.31  2  C  0.29  1.06 (1.04, 1.08)  0.22  1.14 (1.04, 1.27)  8.15×10^-3  0.49  0.83  0.83  0.91
rs11863709  16q21  2  C  0.96  1.16 (1.11, 1.21)  0.98  0.95 (0.72, 1.25)  0.72  0.96  0.83  0.68  0.96
rs199737822  16q23.3  2  TAA  0.44  1.05 (1.03, 1.07)  0.42  1.04 (0.96, 1.13)  0.3  0.28  0.87  0.83  0.93
rs142444269  17q11.2  2  C  0.78  1.07 (1.05, 1.09)  0.67  1.01 (0.93, 1.10)  0.83  0.1  0.91  0.96  0.95
rs3110641  17q12  1  A  0.22  1.07 (1.05, 1.09)  0.23  1.08 (0.98, 1.18)  0.1  0.74  0.9  1  1
rs11649743  17q12  1  G  0.81  1.13 (1.11, 1.15)  0.82  1.19 (1.07, 1.31)  1.11×10^-3  0.49  0.98  0.99  0.99
rs10908278  17q12  1  A  0.52  1.22 (1.21, 1.24)  0.58  1.18 (1.09, 1.28)  4.77×10^-5  0.32  0.89  0.91  0.98
rs138213197  17q21.32  1  T  0.002  3.85 (3.29, 4.51)  -  -  -  -  -  -  -
rs530923403  17q21.32  1  TTG  0.08  1.11 (1.08, 1.14)  0.05  1.05 (0.88, 1.25)  0.59  0.2  0.96  0.96  1
rs2680708  17q22  2  G  0.61  1.05 (1.03, 1.06)  0.65  1.03 (0.95, 1.11)  0.48  0.69  0.96  1  0.98
rs2474694  17p13.3  1  A  0.35  1.09 (1.07, 1.1)  0.41  1.12 (1.04, 1.21)  3.44×10^-3  0.21  0.99  1  0.98
rs7222314  17q24.3  1  A  0.47  1.17 (1.16, 1.19)  0.55  1.18 (1.09, 1.27)  3.94×10^-5  0.6  0.94  1  0.98
rs28441558  17p13.1  2  C  0.05  1.16 (1.12, 1.2)  0.02  1.31 (1.01, 1.69)  0.04  0.21  0.64  1  0.96
rs8093601  18q21.2  2  C  0.44  1.05 (1.03, 1.06)  0.46  0.96 (0.89, 1.04)  0.35  0.17  0.97  0.98  0.93
rs28607662  18q21.2  2  C  0.1  1.08 (1.05, 1.11)  0.05  1.00 (0.83, 1.20)  1  0.68  0.94  1  0.99
rs12956892  18q21.32  2  T  0.3  1.05 (1.03, 1.07)  0.43  1.06 (0.98, 1.15)  0.13  0.15  0.96  0.93  0.99
rs11381388  18q21.33  2  CT  0.42  1.05 (1.03, 1.07)  0.41  1.05 (0.95, 1.15)  0.34  0.27  0.59  0.63  0.81
rs10460109  18q22.3  2  T  0.41  1.05 (1.03, 1.06)  0.36  0.95 (0.88, 1.03)  0.24  0.41  0.99  0.99  0.98
rs9959454  18q23  1  A  0.73  1.09 (1.07, 1.11)  0.74  1.12 (1.03, 1.23)  8.69×10^-3  0.85  0.99  0.99  1
rs11666569  19p13.11  2  C  0.71  1.05 (1.03, 1.07)  0.61  1.09 (1.01, 1.18)  0.04  0.83  0.95  0.96  0.99
rs118005503  19q12  2  G  0.91  1.09 (1.06, 1.13)  0.95  1.07 (0.88, 1.30)  0.52  0.79  0.79  0.72  0.93
rs11667256  19q13.2  1  A  0.49  1.1 (1.08, 1.12)  0.44  1.08 (1.00, 1.17)  0.04  0.63  0.97  1  1
rs11672691  19q13.2  1  G  0.74  1.1 (1.08, 1.12)  0.77  1.03 (0.93, 1.13)  0.6  0.43  0.85  1  0.96
rs61088131  19q13.2  2  T  0.82  1.06 (1.04, 1.09)  0.77  1.10 (1.00, 1.21)  0.04  0.18  0.75  0.99  0.97
rs266863  19q13.33  1  C  0.61  1.09 (1.07, 1.11)  0.60  1.15 (1.06, 1.24)  5.49×10^-4  0.31  0.94  1  1
rs62113212  19q13.33  1  C  0.92  1.35 (1.31, 1.4)  0.96  1.37 (1.11, 1.68)  3.18×10^-3  0.57  0.93  1  0.98
rs61752561  19q13.33  1  G  0.96  1.12 (1.08, 1.17)  -  -  -  -  0.67  1  0.94
rs11480453  20q11.21  2  C  0.6  1.05 (1.03, 1.06)  0.45  0.99 (0.91, 1.07)  0.78  0.41  0.85  0.79  0.93
rs17790938  20q13.13  1  G  0.93  1.11 (1.08, 1.15)  0.92  0.97 (0.84, 1.11)  0.67  0.24  0.99  1  0.99
rs6091758  20q13.2  2  G  0.47  1.07 (1.06, 1.09)  0.42  1.05 (0.97, 1.14)  0.24  0.71  0.81  0.95  0.95
rs397745119  20q13.33  1  A  0.57  1.04 (1.03, 1.06)  0.47  1.02 (0.94, 1.10)  0.63  0.6  1  1  1
rs1058319  20q13.33  1  C  0.86  1.13 (1.11, 1.16)  0.81  1.06 (0.96, 1.17)  0.24  0.75  0.91  0.99  0.94
rs145013758  21q22.3  1  A  0.01  1.33 (1.23, 1.44)  -  -  -  -  -  -  -
rs1978060  22q11.21  1  G  0.61  1.06 (1.04, 1.08)  0.52  1.06 (0.98, 1.15)  0.14  0.89  0.83  1  1
rs78554043  22q12.1  4  C  -  -  -  -  -  -  -  -  -
rs9625483  22q12.1  2  A  0.03  1.14 (1.09, 1.2)  0.01  1.34 (0.94, 1.91)  0.11  0.77  0.57  1  0.93
rs6001723  22q13.1  1  G  0.27  1.07 (1.05, 1.09)  0.21  1.07 (0.97, 1.18)  0.19  0.19  0.92  0.93  0.98
rs5759167  22q13.2  1  G  0.50  1.15 (1.13, 1.17)  0.58  1.15 (1.07, 1.24)  3.51×10^-4  0.11  1  1  1
rs909666  22q13.2  1  T  0.85  1.15 (1.13, 1.18)  0.72  0.97 (0.89, 1.06)  0.5  0.22  0.95  0.98  0.97
rs17321482  Xp22.2  2  C  0.87  1.07 (1.05, 1.09)  0.93  1.05 (0.93, 1.18)  0.45  0.01  0.8  0.85  0.99
rs11338635  Xp11.22  1  GA  0.36  1.11 (1.1, 1.12)  0.24  1.03 (0.97, 1.10)  0.37  0.88  0.99  1  0.95
rs5943724  Xp11.22  1  G  0.63  1.06 (1.05, 1.07)  -  -  -  -  -  -  -
rs5919402  Xq12  1  T  0.85  1.06 (1.04, 1.07)  0.87  1.01 (0.94, 1.10)  0.72  1  0.99  1  0.98
rs5937025  Xq13.1  1  C  0.48  1.05 (1.03, 1.06)  0.61  0.96 (0.91, 1.01)  0.12  0.12  1  1  0.97
rs4830660  Xp22.2  1  A  0.71  1.05 (1.04, 1.06)  0.82  1.02 (0.95, 1.10)  0.59  0.37  0.95  1  0.97

a Risk variants were from:
1. Fine-mapping of PCa susceptibility loci in European ancestry populations (Dadaev et al., Nature Commun, 2018);
2. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci (Schumacher et al., Nature Genetics, 2018);
3. Germline variation at 8q24 and prostate cancer risk in men of European ancestry (Matejcic et al., Nature Commun, 2018);
4. Two Novel Susceptibility Loci for Prostate Cancer in Men of African Ancestry (Conti et al., JNCI, 2017);
5. Large-scale association analysis in Asians identifies new susceptibility loci for prostate cancer (Wang et al., Nature Commun, 2015).
b Risk allele frequencies (RAFs) and marginal odds ratios (ORs) were from a meta-analysis in the European population (reference 2 above).
c RAFs were the weighted average of RAFs in Latino controls of Sets 1-3.
d ORs were from the meta-analysis of Sets 1-3 in Latino men.
e Phet values were P-values from heterogeneity tests of the results across the three sets.

Supplementary Table S4. Stratified analyses by AmerIndian (AMR) local ancestry for known PrCa risk alleles that had nominally significant interactions (P<0.05) with either EUR or AMR local ancestry on PrCa risk in Latino men (Set 1 + Set 2).
SNP ID  Region  Risk allele  AMR ≤0.5^a: OR (95% CI)  P-value  AMR >0.5 and ≤1.5^a: OR (95% CI)  P-value  AMR >1.5^a: OR (95% CI)  P-value  Interaction P-value^b (EUR)  Interaction P-value^b (AMR)
rs2028900  2p11.2  C  1.12 (0.98, 1.29)  0.09  1.08 (0.91, 1.29)  0.37  0.81 (0.56, 1.16)  0.24  0.35  0.045
rs4976790  5q35.3  T  1.11 (0.92, 1.34)  0.28  1.08 (0.88, 1.33)  0.45  0.81 (0.57, 1.15)  0.24  0.047  0.01
rs9443189  6q14.1  A  0.94 (0.79, 1.13)  0.54  1.04 (0.84, 1.29)  0.70  1.47 (1.00, 2.16)  0.049  0.02  0.04
rs5875234  6p22.1  CTGAGTA  1.20 (1.01, 1.42)  0.03  1.10 (0.94, 1.29)  0.24  0.85 (0.67, 1.10)  0.22  0.08  0.046
rs630045  6q22.1  C  1.07 (0.92, 1.26)  0.38  1.00 (0.86, 1.17)  0.96  0.87 (0.70, 1.08)  0.20  0.08  0.047
rs10993994  10q11.23  T  1.19 (1.04, 1.36)  0.01  1.36 (1.17, 1.58)  6.28×10^-5  1.40 (1.10, 1.77)  5.97×10^-3  0.04  0.2
rs10875943  12q13.12  C  0.95 (0.82, 1.09)  0.46  1.07 (0.92, 1.24)  0.38  1.32 (1.04, 1.67)  0.02  7.08×10^-3  0.04
rs12956892  18q21.32  T  1.03 (0.89, 1.20)  0.68  1.04 (0.89, 1.22)  0.60  1.45 (1.16, 1.83)  1.37×10^-3  0.03  0.049
rs17790938  20q13.13  G  1.23 (0.96, 1.57)  0.10  0.93 (0.71, 1.23)  0.62  0.82 (0.56, 1.19)  0.29  0.02  0.01
rs1978060  22q11.21  G  0.93 (0.81, 1.07)  0.32  1.01 (0.86, 1.17)  0.93  1.39 (1.10, 1.76)  5.85×10^-3  0.01  7.22×10^-3
rs909666  22q13.2  T  1.12 (0.94, 1.34)  0.20  0.93 (0.80, 1.09)  0.40  0.90 (0.74, 1.10)  0.30  0.04  0.09

a Odds ratios (ORs) were adjusted for age, study and the first 10 principal components; P-values were Wald P-values from meta-analyses of Set 1 and Set 2.
b P-values were for interaction terms between continuous EUR or AMR local ancestry and allele dosage.

Supplementary Table S5. Associations between categorized polygenic risk scores (PRSs) and prostate cancer risk in Latino men by AmerIndian ancestry strata.

AmerIndian global ancestry stratum^a  PRS category  European-weighted PRS^b: No. cases  No. controls  OR (95% CI)^c  P-value^d  Latino-weighted PRS^b: No. cases  No. controls  OR (95% CI)^c  P-value^d
≤25%  0% - 10%  26  54  0.42 (0.25, 0.73)  1.72×10^-3  14  54  0.24 (0.12, 0.45)  1.59×10^-5
≤25%  10% - 25%  56  78  0.65 (0.43, 0.97)  3.40×10^-2  40  78  0.52 (0.33, 0.81)  4.19×10^-3
≤25%  25% - 75%  297  261  -  -  256  261  -  -
≤25%  75% - 90%  149  79  1.53 (1.08, 2.17)  1.58×10^-2  191  79  2.48 (1.77, 3.47)  1.28×10^-7
≤25%  90% - 100%  217  53  3.70 (2.57, 5.32)  1.77×10^-12  244  53  5.07 (3.51, 7.31)  4.45×10^-18
>75%  0% - 10%  11  53  0.28 (0.14, 0.57)  3.94×10^-4  8  53  0.21 (0.10, 0.46)  1.17×10^-4
>75%  10% - 25%  31  79  0.60 (0.37, 0.96)  3.39×10^-2  16  79  0.32 (0.18, 0.58)  1.76×10^-4
>75%  25% - 75%  163  262  -  -  159  262  -  -
>75%  75% - 90%  75  78  1.65 (1.12, 2.42)  1.11×10^-2  73  78  1.61 (1.08, 2.38)  1.81×10^-2
>75%  90% - 100%  86  53  2.94 (1.93, 4.47)  4.66×10^-7  110  53  3.87 (2.57, 5.84)  1.14×10^-10

a Strata were created by categorizing the AmerIndian global ancestry score according to its percentiles (≤25%, >75%) in controls.
b PRS was calculated using 176 known SNPs (MAF≥0.001 and imputation score≥0.3 in Latino men); for the European-weighted PRS, the weights were the conditional log ORs derived from men of European ancestry; for the Latino-weighted PRS, the weights were the conditional log ORs obtained from our Latino men (Set 1 and Set 2).
c Odds ratios (ORs) were adjusted for age, the first 10 principal components, and study.
d P-values were Wald P-values from fixed-effect meta-analyses.

Supplementary Figure S1. Scatter plot of PC1 versus PC2 among Latino men and populations in the phase III 1000 Genomes Project and in the NHGRI PAGE Consortium. The x-axis and y-axis are the 1st and 2nd eigenvectors from the PCA. The subjects included individuals in the phase III 1000 Genomes Project (AFR, AMR, EAS, EUR, SAS), the NHGRI PAGE Consortium (AFRICA, AMERICA, CENTRAL_SOUTH_ASIA, EAST_ASIA, EUROPE, MIDDLE_EAST, OCEANIA), and our Set 1 and Set 2 Latino populations.

Supplementary Figure S2. Scatter plot of PC5 versus PC6 among Latino men and populations in the NHGRI PAGE Consortium. The x-axis and y-axis are the 5th and 6th eigenvectors from the PCA.
The subjects included individuals in the NHGRI PAGE Consortium and our Set 1 and Set 2 Latino populations.

Supplementary Figure S3. Pie charts of the average proportion of global ancestries among cases and controls by study. Panels: All studies; Set 1 – the MEC; Set 2.

Supplementary Figure S4. Manhattan plot of the GWAS meta-analysis for prostate cancer in Latino men. The Manhattan plot shows all genotyped and imputed results, excluding SNPs with MAF<0.01 and imputation score <0.8. The x-axis is chromosome position and the y-axis is -log10(P-value). The red line represents the genome-wide significance threshold of P = 5 × 10^-8.

Supplementary Figure S5. QQ plot of the GWAS meta-analysis for prostate cancer in Latino men. The QQ plot shows all genotyped and imputed results, excluding SNPs with MAF<0.01 and imputation score <0.8. The x-axis is the expected -log10(P-value) from normal distribution and the y-axis is the -log10(P-value) from our fixed-effect meta-analysis. λ = 1.03.

Supplementary Figure S6. Regional association plots of the 8q24 (127-129.5 Mb) and 10q11.22 (51.3-51.8 Mb) risk regions for prostate cancer in Latino men. Single-nucleotide polymorphisms (SNPs) are plotted by position (x-axis) and -log10 P-value (y-axis). r^2 was estimated from AMR individuals in the phase III 1000 Genomes Project (1KGP) data. The most statistically significant SNP (purple diamond) at 8q24 is rs7843031 (chr8:128533473) and at 10q11.22 is rs10993994 (chr10:51549496). The surrounding SNPs are colored to indicate pairwise correlation with the index SNP.

Supplementary Figure S7. Negative log10(P-value) plot for admixture association in a case-control analysis of local African ancestry on chromosome 8 (100-145 Mb) in Latino men. The x-axis is chromosome position and the y-axis is the -log10(P-value) of the association between AFR local ancestry and PrCa risk. Purple lines are known risk alleles at 8q24.

Supplementary Figure S8.
Negative log10(P-value) plot for admixture associations in case-control analyses of African local ancestry at 8q24 in Latino men. The x-axis is chromosome position and the y-axis is -log10(P-value). The grey line is the marginal P-value of AFR local ancestry in model 1, which adjusted only for age, study and the first 10 principal components. The green line is the conditional P-value of AFR local ancestry in model 2, which additionally adjusted for the two independent top 8q24 risk alleles captured through forward-selection logistic regression in LAPC. The red line is the conditional P-value of AFR local ancestry in model 3, which additionally adjusted for both the 12 8q24 risk alleles identified in a previous study and the two independent 8q24 risk variants identified in LAPC.

Chapter 4

Supplementary Table S1. Suggestive (P<1×10^-6) risk variants for multiple myeloma in the AA population.

SNP  Position  Alleles (risk|ref)  RAF case  RAF control  Meta OR (95% CI)^a  Meta P-value  PHet^b  Set1 OR (95% CI)^a  Set1 P-value  Set1 imputation r^2  Set2 OR (95% CI)^a  Set2 P-value  Set2 imputation r^2
rs266375  15:67228085  C|T  0.26  0.23  1.32 (1.19, 1.46)  1.45×10^-7  0.43  1.36 (1.20, 1.54)  2.17×10^-6  0.58  1.24 (1.04, 1.49)  0.02  0.83
rs2047077  12:66887206  G|C  0.06  0.05  1.66 (1.37, 2.01)  2.42×10^-7  0.14  1.52 (1.22, 1.9)  1.97×10^-4  0.68  2.13 (1.45, 3.12)  1.07×10^-4  0.90
rs10457096  6:156265122  G|A  0.03  0.02  2.06 (1.56, 2.73)  3.26×10^-7  0.97  2.07 (1.49, 2.88)  1.39×10^-5  0.50  2.05 (1.21, 3.45)  7.27×10^-3  0.75
rs13296848  9:701529  C|T  0.33  0.29  1.25 (1.15, 1.36)  3.44×10^-7  0.01  1.16 (1.05, 1.28)  4.55×10^-3  0.82  1.5 (1.28, 1.76)  5.51×10^-7  0.93
rs4913513  12:66881488  G|A  0.06  0.05  1.64 (1.36, 1.99)  3.57×10^-7  0.12  1.51 (1.21, 1.88)  2.89×10^-4  0.69  2.13 (1.45, 3.11)  1.00×10^-4  0.91
rs117284313  12:66880343  A|G  0.06  0.05  1.64 (1.35, 1.99)  3.99×10^-7  0.12  1.50 (1.20, 1.87)  3.13×10^-4  0.69  2.12 (1.45, 3.11)  1.02×10^-4  0.91
rs11176202  12:66858679  A|T  0.05  0.04  1.66 (1.36, 2.02)  4.52×10^-7  0.22  1.55 (1.23, 1.94)  1.85×10^-4  0.66  2.05 (1.39, 3.04)  3.13×10^-4  0.90
rs80227918  12:66878336  C|T  0.06  0.05  1.63 (1.35, 1.98)  4.77×10^-7  0.12  1.50 (1.20, 1.87)  3.59×10^-4  0.69  2.12 (1.45, 3.11)  1.04×10^-4  0.91
rs12369857  12:66865653  A|G  0.04  0.03  1.76 (1.41, 2.19)  5.58×10^-7  0.15  1.59 (1.23, 2.06)  3.90×10^-4  0.66  2.30 (1.50, 3.53)  1.36×10^-4  0.89
rs28362345  6:31165836  T|C  0.74  0.70  1.24 (1.14, 1.34)  5.90×10^-7  0.59  1.25 (1.14, 1.38)  6.19×10^-6  1.00  1.19 (1.02, 1.39)  0.03  1.00
rs28362342  6:31165438  T|G  0.74  0.70  1.23 (1.14, 1.34)  6.23×10^-7  0.61  1.25 (1.14, 1.38)  6.76×10^-6  1.00  1.19 (1.02, 1.39)  0.03  1.00
rs114301391  5:77208230  T|C  0.02  0.01  2.37 (1.69, 3.34)  7.21×10^-7  0.85  2.31 (1.51, 3.54)  1.09×10^-4  0.69  2.48 (1.40, 4.41)  1.92×10^-3  0.86
rs12194664  6:156260851  T|C  0.03  0.02  2.01 (1.52, 2.66)  8.49×10^-7  0.96  2.02 (1.46, 2.80)  2.53×10^-5  0.50  1.99 (1.17, 3.37)  0.01  0.76
rs12366841  12:66865701  G|A  0.06  0.05  1.62 (1.34, 1.96)  9.06×10^-7  0.11  1.48 (1.18, 1.84)  6.12×10^-4  0.68  2.13 (1.45, 3.11)  1.07×10^-4  0.91
rs7034061  9:38443792  T|G  0.15  0.13  1.32 (1.18, 1.48)  9.17×10^-7  1.00  1.32 (1.16, 1.51)  3.54×10^-5  0.83  1.33 (1.08, 1.63)  8.18×10^-3  0.95

a Odds ratios (ORs) were adjusted for age, sex and the first 10 principal components from the PCA.
b Phet was the P-value for the heterogeneity test in the meta-analysis of the two sets.

Supplementary Table S2. Local African ancestry that was significantly associated with AA MM risk in both the case-only analysis and the case-control analysis on chromosome 2 (meta P-value<1×10^-5).
SNP Position Case-only Case-control OR(95%CI) a P-value OR(95%CI) b P-value rs2592781 23185972 0.97(0.96, 0.98) 6.62×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs2681019 23187504 0.97(0.96, 0.98) 6.62×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs2592774 23192909 0.97(0.96, 0.98) 6.62×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs12474212 23197435 0.97(0.96, 0.98) 6.62×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs1446877 23216262 0.97(0.96, 0.98) 6.62×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs17044554 23218255 0.97(0.96, 0.98) 6.62×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs7581165 23872765 0.97(0.96, 0.98) 9.43×10 -6 0.97(0.96, 0.98) 5.93×10 -6 rs1822300 24724785 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs11125627 24725464 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs7598617 24784168 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs995647 24810255 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs10208038 25032151 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs4665273 25046355 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs6760328 25077991 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs6726261 25153986 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs1172294 25169200 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs6749526 25202668 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs1982200 25205427 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs893589 25259442 0.97(0.96, 0.98) 3.84×10 -6 0.97(0.96, 0.98) 9.42×10 -6 rs2918630 25314186 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.80×10 -6 rs10495751 25316045 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.80×10 -6 rs13395518 25323747 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.80×10 -6 rs13401241 25518470 0.97(0.96, 0.98) 3.87×10 -6 0.97(0.96, 0.98) 7.49×10 -6 rs749130 25530028 0.97(0.96, 0.98) 3.87×10 -6 0.97(0.96, 0.98) 7.49×10 -6 rs6711622 25531350 0.97(0.96, 0.98) 3.87×10 -6 0.97(0.96, 0.98) 7.49×10 -6 rs7560488 25568821 0.97(0.96, 0.98) 3.87×10 
-6 0.97(0.96, 0.98) 7.49×10 -6 rs6705138 25587851 0.97(0.96, 0.98) 3.87×10 -6 0.97(0.96, 0.98) 7.49×10 -6 rs1010658 25595194 0.97(0.96, 0.98) 3.87×10 -6 0.97(0.96, 0.98) 7.49×10 -6 rs730072 25597681 0.97(0.96, 0.98) 3.87×10 -6 0.97(0.96, 0.98) 7.49×10 -6 rs2384232 25623749 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.96, 0.98) 6.47×10 -6 rs17745923 25643944 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.96, 0.98) 6.47×10 -6 rs6746082 25659244 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.96, 0.98) 6.47×10 -6 rs10210057 25662578 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.96, 0.98) 6.47×10 -6 rs6546183 25671806 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.96, 0.98) 6.47×10 -6 rs6546199 25683447 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs1507705 25714916 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs9309386 25727910 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs517403 25730439 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 196 rs936012 25755848 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs6725591 25789849 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs12613835 25829201 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs11678268 25837156 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs6546314 25845467 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs858648 25867203 0.97(0.96, 0.98) 2.10×10 -6 0.97(0.95, 0.98) 4.43×10 -6 rs6747116 25969729 0.97(0.96, 0.98) 2.01×10 -6 0.97(0.95, 0.98) 4.89×10 -6 rs6546452 25981272 0.97(0.96, 0.98) 2.01×10 -6 0.97(0.95, 0.98) 4.89×10 -6 rs6758088 26036036 0.97(0.96, 0.98) 2.01×10 -6 0.97(0.95, 0.98) 4.89×10 -6 rs4063544 26042515 0.97(0.96, 0.98) 2.01×10 -6 0.97(0.95, 0.98) 4.89×10 -6 rs2138390 26112833 0.97(0.96, 0.98) 2.01×10 -6 0.97(0.95, 0.98) 4.89×10 -6 rs11895615 26113120 0.97(0.96, 0.98) 2.01×10 -6 0.97(0.95, 0.98) 4.89×10 -6 rs6546642 26174103 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs12994424 26183142 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs7603456 
26191296 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs7563440 26199220 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs4665298 26256378 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs10170359 26272219 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs12471809 26307140 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs4665304 26313406 0.97(0.96, 0.98) 1.66×10 -6 0.97(0.96, 0.98) 7.98×10 -6 rs1039823 28623159 0.97(0.96, 0.98) 3.00×10 -6 0.97(0.95, 0.98) 5.96×10 -6 rs1396733 28642747 0.97(0.96, 0.98) 3.00×10 -6 0.97(0.95, 0.98) 5.96×10 -6 rs4666076 28652529 0.97(0.96, 0.98) 3.00×10 -6 0.97(0.95, 0.98) 5.96×10 -6 rs2940797 28670404 0.97(0.96, 0.98) 3.00×10 -6 0.97(0.95, 0.98) 5.96×10 -6 rs2972050 28671048 0.97(0.96, 0.98) 3.00×10 -6 0.97(0.95, 0.98) 5.96×10 -6 rs10186544 28683174 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs1581035 28686332 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs2940790 28687043 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs6715256 28723942 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs11885873 28736582 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs4334451 28747170 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs6547857 28757368 0.97(0.96, 0.98) 3.20×10 -6 0.97(0.95, 0.98) 2.39×10 -6 rs11127162 28763374 0.97(0.96, 0.98) 3.59×10 -6 0.97(0.95, 0.98) 2.50×10 -6 rs1881254 28768251 0.97(0.96, 0.98) 3.59×10 -6 0.97(0.95, 0.98) 2.50×10 -6 rs9309663 28768427 0.97(0.96, 0.98) 3.59×10 -6 0.97(0.95, 0.98) 2.50×10 -6 rs12617735 28780313 0.97(0.96, 0.98) 3.59×10 -6 0.97(0.95, 0.98) 2.50×10 -6 rs7595633 28803240 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs4666103 28807899 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs12468715 28812285 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs12993525 28818931 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs1534477 28820641 0.97(0.96, 0.98) 3.07×10 -6 
0.97(0.95, 0.98) 1.77×10 -6 rs4371315 28821149 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs7557449 28821695 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 197 rs1534476 28833479 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs1534478 28840691 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs4530322 28852926 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs6713845 28854570 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs3752899 28856220 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs6741437 28867201 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs11678098 28876857 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs10192375 28892116 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs7586033 28892960 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs4666115 28902004 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs4372836 28973883 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs3190 29025479 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs12475612 29030006 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs6547881 29032746 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs6547892 29138009 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs9653591 29145725 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs3087649 29169612 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs11680458 29170623 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs10197378 29181107 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs6727948 29184909 0.97(0.96, 0.98) 3.07×10 -6 0.97(0.95, 0.98) 1.77×10 -6 rs4666157 29196785 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs7578007 29215005 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs12466400 29229806 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs11684978 29231870 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs13420380 
29245933 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs6739684 29247894 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs13008323 29248046 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs6742110 29260085 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs2276551 29268375 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs6547906 29272472 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs6731719 29273719 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs882631 29280791 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs10196859 29287227 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs17744052 29289411 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs7569316 29289501 0.97(0.96, 0.98) 3.57×10 -6 0.97(0.96, 0.98) 8.46×10 -6 rs7572957 29487405 0.97(0.95, 0.98) 8.97×10 -8 0.97(0.95, 0.98) 4.63×10 -6 rs876748 29512675 0.96(0.95, 0.98) 3.96×10 -8 0.97(0.95, 0.98) 3.45×10 -6 rs13406263 29519475 0.96(0.95, 0.98) 3.96×10 -8 0.97(0.95, 0.98) 3.45×10 -6 rs6708752 29536039 0.96(0.95, 0.98) 3.96×10 -8 0.97(0.95, 0.98) 3.45×10 -6 rs4666200 29538411 0.96(0.95, 0.98) 3.96×10 -8 0.97(0.95, 0.98) 3.45×10 -6 rs12465220 29559519 0.96(0.95, 0.98) 2.34×10 -8 0.97(0.95, 0.98) 2.72×10 -6 rs17007931 29578068 0.96(0.95, 0.98) 2.34×10 -8 0.97(0.95, 0.98) 2.72×10 -6 rs12714277 29586686 0.96(0.95, 0.98) 2.34×10 -8 0.97(0.95, 0.98) 2.72×10 -6 198 rs4633880 29605589 0.96(0.95, 0.98) 2.34×10 -8 0.97(0.95, 0.98) 2.72×10 -6 rs7591913 29619971 0.96(0.95, 0.98) 2.34×10 -8 0.97(0.95, 0.98) 2.72×10 -6 rs4233734 29637040 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs4555301 29639679 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs3923028 29685596 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs11888731 29702883 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs4528720 29704715 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs13383564 29708619 0.97(0.95, 0.98) 7.96×10 -8 
0.97(0.96, 0.98) 9.85×10 -6 rs12714287 29710663 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs13000666 29715388 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs9808313 29721334 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs10445906 29727204 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs4433956 29730492 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs4665463 29733601 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs4666243 29763753 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs10193521 29785679 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 rs6547948 29790622 0.97(0.95, 0.98) 7.96×10 -8 0.97(0.96, 0.98) 9.85×10 -6 a Odds rations (ORs) were estimated in a case-only study adjusting for age and sex. b Odds rations (ORs) were estimated in a case-control study adjusting for age and sex. 199 Supplementary Table S3. eQTL analyses for SNPs in known MM risk region and suggestive risk regions in AAMM. SNP Chr Position A1 A2 Gene TSS Beta SE P-value rs7577599 2 25613146 T C NCOA1 24807346 0.03 0.03 0.39 PTRHD1 25016251 -0.05 0.05 0.28 ADCY3 25142055 0.04 0.05 0.45 DNMT3A 25564784 0.00 0.03 0.94 ASXL2 26101312 -0.02 0.02 0.38 KIF3C 26205443 -0.09 0.04 0.02 RAB10 26256729 0.00 0.02 0.85 HADHA 26467594 0.02 0.02 0.41 HADHB 26467616 0.02 0.02 0.35 EPT1 26568954 0.04 0.03 0.23 rs6546148 2 25625005 A G NCOA1 24807346 0.01 0.04 0.79 PTRHD1 25016251 -0.06 0.05 0.26 ADCY3 25142055 0.02 0.05 0.74 DNMT3A 25564784 0.00 0.03 0.86 ASXL2 26101312 -0.01 0.02 0.60 KIF3C 26205443 -0.02 0.04 0.67 RAB10 26256729 0.01 0.02 0.57 HADHA 26467594 0.06 0.02 8.48E-03 HADHB 26467616 0.04 0.02 0.12 EPT1 26568954 0.04 0.03 0.27 rs10180663 2 25633242 T C NCOA1 24807346 -0.02 0.03 0.57 PTRHD1 25016251 -0.02 0.05 0.72 ADCY3 25142055 0.00 0.05 0.95 DNMT3A 25564784 -0.01 0.03 0.75 ASXL2 26101312 0.02 0.02 0.49 KIF3C 26205443 -0.03 0.04 0.52 RAB10 26256729 0.02 0.02 0.47 HADHA 26467594 0.06 0.02 5.19E-03 HADHB 26467616 0.04 
0.02 0.06
EPT1 26568954 0.03 0.03 0.29
rs6746082 2 25659244 A C NCOA1 24807346 -0.01 0.03 0.80
PTRHD1 25016251 -0.08 0.05 0.11
ADCY3 25142055 0.02 0.04 0.73
DNMT3A 25564784 0.03 0.02 0.30
ASXL2 26101312 -0.02 0.02 0.46
KIF3C 26205443 -0.04 0.04 0.29
RAB10 26256729 -0.01 0.02 0.59
HADHA 26467594 0.02 0.02 0.41
HADHB 26467616 0.00 0.02 0.92
EPT1 26568954 0.02 0.03 0.43
rs4325816 2 174808899 T C ZAK 173940565 -0.02 0.05 0.67
CDCA7 174219561 0.02 0.11 0.85
SP3 174830430 0.07 0.02 5.47E-04
OLA1 175113365 0.00 0.01 0.93
CIR1 175260443 -0.04 0.03 0.18
SCRN3 175260457 0.03 0.03 0.28
GPR155 175351816 -0.09 0.08 0.26
WIPF1 175547627 -0.02 0.03 0.53
rs73828280 3 41833907 A T CTNNB1 41240942 0.00 0.03 0.97
TRAK1 42132746 0.01 0.02 0.56
SEC22C 42623520 -0.02 0.01 0.23
SS18L2 42632298 0.00 0.02 0.89
NKTR 42642147 0.03 0.04 0.34
rs1052501 3 41925398 C T CTNNB1 41240942 0.02 0.03 0.52
TRAK1 42132746 -0.02 0.02 0.24
SEC22C 42623520 0.03 0.01 0.03
SS18L2 42632298 -0.01 0.02 0.50
NKTR 42642147 -0.04 0.04 0.23
HIGD1A 42845934 -0.03 0.03 0.42
rs12108049 3 41958161 A G CTNNB1 41240942 0.09 0.05 0.11
TRAK1 42132746 0.02 0.03 0.45
SEC22C 42623520 -0.04 0.02 0.07
SS18L2 42632298 -0.02 0.03 0.50
NKTR 42642147 0.02 0.06 0.70
HIGD1A 42845934 -0.07 0.06 0.19
rs6599192 3 41992408 G A CTNNB1 41240942 0.03 0.03 0.37
TRAK1 42132746 -0.01 0.02 0.55
SEC22C 42623520 0.03 0.01 0.04
SS18L2 42632298 -0.02 0.02 0.37
NKTR 42642147 -0.06 0.04 0.10
HIGD1A 42845934 -0.04 0.03 0.22
rs10936599 3 169492101 C T MYNN 169492053 0.05 0.05 0.33
LRRC34 169530574 -0.34 0.14 0.02
SEC62 169684580 0.04 0.05 0.39
GPR160 169755735 -0.10 0.12 0.39
PHC3 169899537 -0.06 0.05 0.18
PRKCI 169940220 -0.01 0.07 0.85
SKIL 170075473 0.11 0.07 0.11
rs10936600 3 169514585 A T MYNN 169492053 -0.03 0.04 0.44
LRRC34 169530574 -0.03 0.12 0.81
SEC62 169684580 -0.04 0.04 0.35
GPR160 169755735 -0.11 0.10 0.27
PHC3 169899537 -0.07 0.04 0.09
PRKCI 169940220 -0.05 0.06 0.46
SKIL 170075473 0.12 0.06 0.04
rs9290375 3 169566090 A G MYNN
169492053 0.00 0.02 0.97
LRRC34 169530574 0.12 0.07 0.11
SEC62 169684580 0.06 0.02 0.02
GPR160 169755735 0.17 0.06 0.005
PHC3 169899537 0.01 0.02 0.75
PRKCI 169940220 0.07 0.04 0.08
rs56219066 5 95242931 T C MCTP1 94620279 0.03 0.08 0.68
TTC37 94890709 0.04 0.03 0.14
ARSK 94890825 -0.08 0.06 0.19
RFESD 94982583 -0.04 0.05 0.44
RHOBTB3 95066850 -0.05 0.07 0.44
GLRX 95158577 -0.04 0.03 0.16
ELL2 95297775 -0.26 0.04 5.09E-12
CAST 96038493 -0.03 0.02 0.18
ERAP1 96149848 0.01 0.03 0.64
ERAP2 96211644 0.13 0.15 0.38
rs1423269 5 95255724 A G MCTP1 94620279 0.06 0.08 0.51
TTC37 94890709 0.03 0.03 0.19
ARSK 94890825 -0.08 0.06 0.19
RFESD 94982583 0.01 0.05 0.89
RHOBTB3 95066850 -0.10 0.07 0.16
GLRX 95158577 -0.06 0.03 0.03
ELL2 95297775 -0.27 0.04 3.77E-13
CAST 96038493 0.00 0.02 0.99
ERAP1 96149848 0.01 0.03 0.84
ERAP2 96211644 0.17 0.15 0.28
rs6595443 5 122743325 A T SNX2 122110750 -0.01 0.02 0.63
SNX24 122181160 0.07 0.05 0.15
PPIC 122372425 -0.13 0.12 0.27
CEP120 122759286 0.02 0.03 0.53
CSNK1G3 122881111 0.01 0.02 0.64
rs34229995 6 15244018 C G JARID2 15246206 -0.19 0.15 0.19
DTNBP1 15663289 0.13 0.09 0.18
MYLIP 16129317 0.26 0.34 0.45
GMPR 16238811 0.08 0.36 0.83
rs2285803 6 31107258 T C TRIM26 30181271 -0.04 0.04 0.32
TRIM39 30294621 0.00 0.04 0.91
C6orf136 30614816 -0.03 0.04 0.37
NRM 30659197 -0.03 0.05 0.60
TUBB 30688157 0.03 0.04 0.52
FLOT1 30710453 -0.01 0.03 0.67
DDR1 30852327 0.02 0.04 0.70
GTF2H4 30875977 -0.05 0.04 0.23
VARS2 30881982 0.01 0.03 0.77
CCHCR1 31125566 0.02 0.02 0.42
HCG27 31165537 0.05 0.04 0.19
HLA-B 31324989 -0.01 0.03 0.64
HCG26 31439006 0.10 0.10 0.31
MICB 31465855 -0.03 0.04 0.44
PRRC2A 31588450 -0.02 0.04 0.67
CSNK2B 31633657 -0.01 0.02 0.47
GPANK1 31634060 -0.02 0.02 0.47
CLIC1 31704341 0.01 0.02 0.59
LSM2 31774761 -0.05 0.04 0.15
HSPA1B 31795512 -0.11 0.13 0.40
NELFE 31926864 0.00 0.02 0.86
FKBPL 32098067 0.00 0.02 0.96
rs3132535 6 31116526 A G TRIM26 30181271 -0.05 0.04 0.22
TRIM39 30294621 -0.01 0.04 0.85
C6orf136 30614816 -0.07
0.04 0.06
NRM 30659197 -0.04 0.05 0.49
TUBB 30688157 0.03 0.04 0.54
FLOT1 30710453 0.00 0.03 0.91
DDR1 30852327 0.03 0.04 0.56
GTF2H4 30875977 -0.06 0.04 0.14
VARS2 30881982 0.03 0.03 0.38
CCHCR1 31125566 0.01 0.02 0.61
HCG27 31165537 0.05 0.04 0.23
HLA-B 31324989 0.01 0.03 0.76
HCG26 31439006 0.09 0.10 0.36
MICB 31465855 -0.03 0.04 0.41
PRRC2A 31588450 -0.02 0.04 0.66
CSNK2B 31633657 -0.01 0.02 0.62
GPANK1 31634060 0.00 0.02 0.84
CLIC1 31704341 0.02 0.02 0.38
LSM2 31774761 -0.07 0.04 0.05
HSPA1B 31795512 -0.11 0.13 0.41
NELFE 31926864 0.01 0.02 0.72
FKBPL 32098067 0.00 0.02 0.94
rs879882 6 31139452 T C TRIM26 30181271 0.06 0.04 0.11
TRIM39 30294621 0.05 0.03 0.16
C6orf136 30614816 -0.01 0.03 0.75
NRM 30659197 0.02 0.05 0.76
TUBB 30688157 0.04 0.04 0.37
FLOT1 30710453 0.02 0.03 0.54
DDR1 30852327 0.00 0.04 0.95
GTF2H4 30875977 0.02 0.04 0.57
VARS2 30881982 0.07 0.03 0.02
CCHCR1 31125566 0.02 0.02 0.32
HCG27 31165537 -0.08 0.04 0.05
HLA-B 31324989 -0.01 0.02 0.69
HCG26 31439006 -0.03 0.09 0.73
MICB 31465855 0.06 0.04 0.11
PRRC2A 31588450 -0.01 0.04 0.74
CSNK2B 31633657 -0.02 0.02 0.39
GPANK1 31634060 0.02 0.02 0.31
CLIC1 31704341 0.02 0.02 0.45
LSM2 31774761 -0.03 0.03 0.41
HSPA1B 31795512 0.18 0.13 0.16
NELFE 31926864 0.00 0.02 0.94
FKBPL 32098067 0.00 0.02 0.99
PPT2 32121229 0.06 0.04 0.09
rs9372120 6 106667535 T G PREP 105850999 -0.04 0.03 0.20
PRDM1 106546737 0.00 0.05 0.97
ATG5 106773695 0.05 0.06 0.39
AIM1 106959730 0.08 0.16 0.63
RTN4IP1 107077373 -0.24 0.10 0.01
QRSL1 107077441 0.03 0.08 0.67
C6orf203 107349407 -0.02 0.06 0.75
BEND3 107435636 0.04 0.07 0.55
rs4487645 7 21938240 C A SP4 21467689 0.06 0.03 0.03
CDCA7L 21985542 0.18 0.10 0.07
RAPGEF5 22396533 -0.01 0.18 0.96
STEAP1B 22539901 0.06 0.15 0.69
IL6 22766766 -0.07 0.14 0.65
TOMM7 22862421 -0.06 0.04 0.18
rs17507636 7 106291118 C T SYPL1 105752791 0.35 0.16 0.03
NAMPT 105925638 0.06 0.11 0.57
CCDC71L 106301634 0.12 0.05 0.03
PIK3CG 106505924 -0.02 0.10 0.80
PRKAR2B 106685178 -0.01 0.18 0.97
HBP1
106809460 0.05 0.04 0.23
DUS4L 107204402 0.05 0.04 0.20
COG5 107204959 0.09 0.04 0.02
BCAP29 107220422 -0.02 0.05 0.66
rs58618031 7 124583896 T C GPR37 124405681 -0.02 0.06 0.67
LOC154872 124430864 0.00 0.06 0.94
POT1 124570037 -0.02 0.04 0.63
rs61068276 7 124804887 C T GPR37 124405681 -0.07 0.05 0.17
LOC154872 124430864 0.02 0.05 0.71
POT1 124570037 0.02 0.03 0.61
rs92903 7 124452670 C T GPR37 124405681 -0.05 0.05 0.30
LOC154872 124430864 0.02 0.05 0.77
POT1 124570037 -0.01 0.04 0.73
rs73169662 7 150922306 T C LRRC61 150020296 0.04 0.07 0.54
ACTR3C 150020758 -0.21 0.17 0.22
ZBED6CL 150026938 -0.01 0.12 0.91
RARRES2 150038763 0.05 0.12 0.71
REPIN1 150065879 0.00 0.05 0.95
ZNF775 150076406 0.03 0.06 0.62
LINC00996 150130742 -0.18 0.35 0.60
GIMAP7 150211945 -0.24 0.22 0.28
GIMAP4 150264458 0.02 0.11 0.83
GIMAP6 150329736 0.07 0.13 0.60
GIMAP2 150382794 -0.10 0.17 0.56
TMEM176B 150497621 -0.34 0.30 0.25
TMEM176A 150497854 -0.07 0.24 0.77
AOC1 150549573 -0.30 0.28 0.29
KCNH2 150675402 0.10 0.16 0.53
CDK5 150755052 -0.13 0.12 0.26
SLC4A2 150759634 -0.17 0.11 0.12
FASTK 150777970 0.03 0.05 0.56
TMUB1 150780413 0.08 0.08 0.31
ABCF2 150924317 -0.02 0.05 0.60
CHPF2 150929585 0.05 0.07 0.48
NUB1 151038847 0.11 0.24 0.66
RHEB 151217010 -0.04 0.04 0.23
PRKAG2-AS1 151574127 0.08 0.18 0.67
PRKAG2 151574316 0.05 0.13 0.71
GALNT11 151722778 -0.15 0.24 0.52
rs7781265 7 150950940 G A LRRC61 150020296 -0.04 0.03 0.24
ACTR3C 150020758 -0.18 0.08 0.03
ZBED6CL 150026938 0.07 0.05 0.21
RARRES2 150038763 -0.03 0.06 0.62
REPIN1 150065879 -0.04 0.02 0.11
ZNF775 150076406 0.05 0.03 0.07
LINC00996 150130742 0.21 0.16 0.20
GIMAP7 150211945 0.01 0.10 0.95
GIMAP4 150264458 -0.03 0.05 0.49
GIMAP6 150329736 0.06 0.06 0.31
GIMAP2 150382794 -0.02 0.08 0.84
TMEM176B 150497621 0.23 0.14 0.10
TMEM176A 150497854 0.15 0.11 0.19
AOC1 150549573 0.13 0.13 0.31
KCNH2 150675402 0.03 0.08 0.71
CDK5 150755052 -0.05 0.06 0.35
SLC4A2 150759634 -0.06 0.05 0.25
FASTK 150777970 -0.02 0.02 0.40
TMUB1
150780413 0.02 0.04 0.64
ABCF2 150924317 0.02 0.02 0.31
CHPF2 150929585 0.01 0.03 0.76
NUB1 151038847 0.04 0.11 0.74
RHEB 151217010 -0.01 0.02 0.41
PRKAG2-AS1 151574127 -0.16 0.08 0.05
PRKAG2 151574316 -0.03 0.06 0.58
GALNT11 151722778 -0.13 0.11 0.26
rs1948915 8 128222421 T C FAM84B 127570711 0.09 0.06 0.12
MYC 128748315 -0.37 0.13 5.92E-03
PVT1 128902874 -0.10 0.10 0.34
rs13296848 9 701529 T C C9orf66 215893 -0.07 0.05 0.20
DOCK8 273048 -0.01 0.04 0.70
KANK1 706806 -0.07 0.04 0.12
DMRT2 1050354 -0.04 0.19 0.82
rs2811710 9 21991923 C T PTPLAD2 21031635 -0.02 0.14 0.91
KLHL9 21335429 0.01 0.03 0.77
MTAP 21802635 0.01 0.03 0.86
CDKN2A 21994490 -0.03 0.02 0.17
ZBTB5 37465407 -0.03 0.05 0.51
POLR1E 37485945 0.06 0.05 0.24
TOMM5 37592636 0.00 0.03 0.95
EXOSC3 37785089 0.06 0.03 0.05
DCAF10 37800790 0.03 0.04 0.45
SLC25A51 37904350 0.03 0.05 0.58
ALDH1B1 38392661 0.11 0.07 0.12
rs7034061 9 38443792 G T ZBTB5 37465407 -0.03 0.05 0.51
POLR1E 37485945 0.06 0.05 0.24
TOMM5 37592636 0.00 0.03 0.95
EXOSC3 37785089 0.06 0.03 0.05
DCAF10 37800790 0.03 0.04 0.45
SLC25A51 37904350 0.03 0.05 0.58
ALDH1B1 38392661 0.11 0.07 0.12
rs2790457 10 28856819 G A MKX 28034778 0.06 0.14 0.66
WAC 28821422 -0.11 0.02 2.29E-11
BAMBI 28966424 0.09 0.12 0.47
PTCHD3P1 29698501 0.03 0.04 0.44
rs2102616 12 39654659 C T ALG10B 38710557 -0.05 0.07 0.49
CPNE8 39299420 -0.02 0.09 0.81
KIF21A 39837192 0.01 0.08 0.87
ABCD2 40013843 0.09 0.10 0.38
SLC2A13 40499661 0.02 0.04 0.74
LRRK2 40618813 -0.15 0.18 0.40
rs13338946 16 30700858 T C C16orf54 29757340 0.08 0.06 0.24
ZG16 29789561 -0.04 0.04 0.34
KIF22 29802034 0.00 0.02 0.96
MAZ 29817855 0.02 0.03 0.53
PAGR1 29827528 -0.03 0.02 0.20
MVP 29831715 0.01 0.04 0.86
CDIPT 29874578 0.03 0.03 0.35
KCTD13 29937545 0.01 0.02 0.81
TMEM219 29973351 -0.01 0.03 0.74
HIRIP3 30007417 0.03 0.02 0.13
INO80E 30007531 0.03 0.04 0.45
FAM57B 30042186 0.00 0.02 0.91
ALDOA 30064411 0.01 0.02 0.46
PPP4C 30087384 -0.03 0.04 0.52
YPEL3 30107521 0.00 0.04 0.92
MAPK3 30134630
0.01 0.03 0.71
CORO1A 30194731 0.00 0.07 1.00
CD2BP2 30366682 0.03 0.02 0.13
TBC1D10B 30381522 -0.04 0.04 0.37
SEPT1 30394171 0.06 0.06 0.29
ZNF48 30406740 -0.03 0.03 0.27
DCTPP1 30441373 -0.01 0.02 0.81
SEPHS2 30457224 0.00 0.02 0.97
ITGAL 30483983 -0.17 0.07 0.02
ZNF768 30537910 -0.05 0.03 0.14
ZNF747 30546194 -0.07 0.06 0.24
ZNF764 30569642 -0.01 0.04 0.74
ZNF785 30597092 0.05 0.05 0.30
ZNF689 30621682 0.08 0.04 0.03
PRR14 30662241 0.03 0.02 0.19
FBRS 30675778 -0.01 0.02 0.66
C16orf93 30773565 -0.01 0.03 0.72
ZNF629 30798523 -0.02 0.02 0.51
BCL7C 30905399 -0.08 0.04 0.05
ORAI3 30960405 0.03 0.03 0.31
STX4 31044903 0.00 0.04 0.90
VKORC1 31106276 0.00 0.03 0.96
BCKDK 31119662 0.00 0.02 0.95
FUS 31191431 0.00 0.03 0.94
PYCARD 31214097 0.02 0.07 0.77
ITGAM 31271288 -0.07 0.09 0.42
ARMC5 31470317 0.01 0.02 0.80
SLC5A2 31494439 0.01 0.01 0.44
C16orf58 31519706 0.06 0.03 0.03
AHSP 31539203 0.12 0.10 0.24
rs7193541 16 74664743 T C PSMD7 74330673 0.03 0.03 0.20
GLG1 74641042 -0.01 0.02 0.64
RFWD3 74700779 0.11 0.04 5.81E-03
MLKL 74734789 0.07 0.07 0.31
ZFP1 75182421 0.02 0.05 0.68
CTRB2 75241072 -0.05 0.11 0.64
BCAR1 75301951 -0.02 0.02 0.48
CFDP1 75467387 0.03 0.04 0.39
TMEM170A 75498584 0.04 0.04 0.31
TMEM231 75590170 0.02 0.05 0.66
GABARAPL2 75600249 0.01 0.02 0.56
ADAT1 75657221 0.00 0.05 0.99
rs34562254 17 16842991 G A TTC19 15902694 0.00 0.06 0.94
ZSWIM7 15903006 -0.02 0.05 0.60
NCOR1 16118874 0.00 0.03 0.90
CENPV 16256812 -0.07 0.09 0.40
UBB 16284367 0.01 0.04 0.70
TRPV2 16318856 0.05 0.05 0.27
LRRC75A-AS1 16342301 0.02 0.03 0.38
ZNF624 16557167 0.04 0.05 0.44
TNFRSF13B 16875402 0.01 0.11 0.90
PLD6 17109646 0.18 0.14 0.18
COPS3 17184617 0.03 0.04 0.54
RASD1 17399709 0.04 0.10 0.69
PEMT 17495017 -0.03 0.06 0.60
RAI1 17584787 -0.02 0.02 0.35
rs4273077 17 16849139 A G TTC19 15902694 0.09 0.05 0.09
ZSWIM7 15903006 -0.01 0.04 0.86
NCOR1 16118874 0.03 0.03 0.30
CENPV 16256812 -0.05 0.08 0.55
UBB 16284367 0.01 0.04 0.70
TRPV2 16318856 -0.01 0.04 0.85
LRRC75A-AS1 16342301 0.03 0.03 0.30
ZNF624 16557167 0.07 0.05 0.16
TNFRSF13B 16875402 0.20 0.11 0.07
PLD6 17109646 0.30 0.13 0.02
COPS3 17184617 0.07 0.04 0.07
RASD1 17399709 0.05 0.09 0.62
PEMT 17495017 -0.10 0.06 0.09
RAI1 17584787 0.01 0.02 0.60
rs11086029 19 16438661 T A AKAP8 15490612 0.03 0.02 0.24
AKAP8L 15529833 0.01 0.04 0.73
RASAL3 15575382 -0.04 0.05 0.52
PGLYRP2 15590315 0.01 0.05 0.83
CYP4F12 15783828 -0.04 0.08 0.63
OR10H1 15918936 -0.02 0.05 0.73
UCA1 15939757 -0.01 0.05 0.83
TPM4 16187135 0.01 0.12 0.96
RAB8A 16222490 -0.11 0.04 0.01
HSH2D 16244838 -0.07 0.08 0.41
CIB3 16284286 -0.01 0.03 0.72
FAM32A 16296235 -0.02 0.03 0.55
AP1M1 16308665 0.05 0.03 0.17
KLF2 16435651 -0.11 0.09 0.23
CHERP 16653263 0.01 0.01 0.35
SLC35E1 16683193 -0.02 0.03 0.53
MED26 16739015 -0.02 0.03 0.54
SMIM7 16770968 -0.02 0.03 0.52
SIN3B 16940209 0.01 0.04 0.83
CPAMD8 17137625 -0.01 0.05 0.88
HAUS8 17186343 0.02 0.03 0.55
MYO9B 17186591 0.01 0.03 0.80
USE1 17326155 0.00 0.05 0.96
OCEL1 17337055 -0.01 0.03 0.65
BABAM1 17378232 0.03 0.04 0.44
MRPL34 17416477 0.03 0.03 0.32
rs6066835 20 47355009 T C SULF2 46414808 -0.49 0.34 0.15
LINC00494 46988654 -0.13 0.11 0.25
PREX1 47444420 0.09 0.09 0.34
ARFGEF2 47538275 0.02 0.04 0.53
CSE1L 47662783 0.03 0.05 0.46
STAU1 47804904 0.02 0.03 0.61
DDX27 47835832 0.00 0.04 0.91
ZFAS1 47894715 -0.03 0.07 0.66
ZNFX1 47894756 -0.05 0.08 0.54
B4GALT5 48330421 -0.02 0.08 0.84
rs138740 22 35699582 C T HMGXB4 35653445 0.02 0.04 0.68
HMOX1 35777060 -0.27 0.16 0.08
MCM5 35796116 -0.07 0.05 0.16
RASD2 35937352 -0.08 0.04 0.04
APOL6 36044424 0.00 0.04 0.99
RBFOX2 36236630 0.02 0.05 0.72
APOL3 36556977 0.17 0.10 0.08
APOL2 36636000 0.01 0.02 0.68
APOL1 36649117 0.02 0.08 0.76
HMGXB4 35653445 0.03 0.03 0.39
HMOX1 35777060 0.03 0.03 0.39
MCM5 35796116 0.03 0.03 0.39
RASD2 35937352 0.03 0.03 0.39
APOL6 36044424 0.03 0.03 0.39
RBFOX2 36236630 0.03 0.03 0.39
APOL3 36556977 0.03 0.03 0.39
APOL2 36636000 0.03 0.03 0.39
APOL1 36649117 0.03 0.03 0.39
rs877529 22
39542292 G A MAFF 38597939 0.04 0.10 0.72
TMEM184B 38669040 -0.05 0.04 0.18
CSNK1E 38714089 -0.02 0.03 0.57
KDELR3 38864083 -0.09 0.10 0.40
DDX17 38902345 -0.05 0.08 0.50
CBY1 39052658 0.00 0.03 0.95
TOMM22 39077954 -0.06 0.03 0.06
JOSD1 39096459 0.02 0.04 0.70
GTPBP1 39101807 0.02 0.02 0.22
SUN2 39151467 -0.03 0.03 0.31
CBX6 39268258 0.01 0.04 0.75
APOBEC3B 39378404 0.04 0.12 0.76
APOBEC3C 39410265 -0.04 0.04 0.25
APOBEC3F 39436673 -0.02 0.04 0.72
APOBEC3G 39473010 0.00 0.04 0.98
CBX7 39548538 -0.05 0.03 0.16
RPL3 39715670 0.02 0.03 0.44
SYNGR1 39745954 0.02 0.04 0.64
TAB1 39795759 0.01 0.04 0.80
MGAT3 39853325 -0.02 0.02 0.24
MIEF1 39898284 -0.02 0.03 0.55
RPS19BP1 39928860 0.00 0.03 0.89
rs139402 22 39546145 T C MAFF 38597939 0.05 0.11 0.62
TMEM184B 38669040 -0.05 0.04 0.21
CSNK1E 38714089 -0.02 0.03 0.57
KDELR3 38864083 -0.08 0.10 0.44
DDX17 38902345 -0.05 0.08 0.52
CBY1 39052658 0.00 0.03 0.89
TOMM22 39077954 -0.06 0.03 0.06
JOSD1 39096459 0.02 0.04 0.62
GTPBP1 39101807 0.02 0.02 0.20
SUN2 39151467 -0.02 0.03 0.41
CBX6 39268258 0.01 0.04 0.72
APOBEC3B 39378404 0.04 0.12 0.74
APOBEC3C 39410265 -0.04 0.04 0.27
APOBEC3F 39436673 -0.01 0.04 0.80
APOBEC3G 39473010 0.00 0.04 0.97
CBX7 39548538 -0.04 0.03 0.19
RPL3 39715670 0.02 0.03 0.41
SYNGR1 39745954 0.02 0.04 0.65
TAB1 39795759 0.01 0.04 0.74
MGAT3 39853325 -0.02 0.02 0.36
MIEF1 39898284 -0.01 0.03 0.66
RPS19BP1 39928860 0.00 0.03 0.87

Supplementary figure S1. Quality control flow chart of the AAMM study.

Supplementary figure S2. Manhattan plots of the multiple myeloma GWAS meta-analysis in the AA population. This figure shows all genotyped and imputed results for the SNPs that overlap across the two sets. The upper plot (a) excluded SNPs with MAF<0.01 and imputation score (r²)<0.5, while the lower plot (b) excluded SNPs with MAF<0.01 and imputation score (r²)<0.8. The orange line represents the genome-wide significance cut-off of P = 5 × 10⁻⁸.
The blue line represents the suggestive cut-off of P = 1 × 10⁻⁶. Green dots represent known risk alleles for MM in European populations. Panels: a), b).

Supplementary figure S3. Regional association plots of the 9p24.3 (0.2-1.2 Mb) and 9p13.1 (37.8-38.8 Mb) suggestive novel risk regions for multiple myeloma in the AA population. Single-nucleotide polymorphisms (SNPs) are plotted by position (x-axis) and -log10 P-value (y-axis). SNPs with MAF<0.01 and imputation score (r²)<0.8 were excluded. r² was estimated in AFR individuals of the phase III 1000 Genomes Project (1KGP). The most statistically significant SNP (purple dot) in 9p24.3 is rs13296848 (chr9:701529) and in 9p13.1 is rs7034061 (chr9:38443792); the surrounding SNPs are colored to indicate pairwise correlation with the index SNP. Panels: a), b).

Supplementary figure S4. Scatter plots of admixture analyses of multiple myeloma on chromosome 2 (0-150 Mb) in the AA population. The x-axis is chromosome position in Mb and the y-axis is the -log10 P-value for admixture association in the case-control analysis (upper plot) and the case-only analysis (lower plot) of African local ancestry on chromosome 2 (0-150 Mb). Purple lines indicate known MM risk loci (rs10180663). Dotted lines mark -log10 P-values of 5 and 6.

Supplementary Figure S5. Negative log10(P-value) plot for admixture associations in case-control analyses of African local ancestry on chromosome 2 (23.1-29.8 Mb) in the AA population. The x-axis is chromosome position and the y-axis is the -log10 P-value of AFR local ancestry. Grey dots are marginal results adjusted for age, sex, and the first ten principal components; orange dots are conditional results additionally adjusted for the known risk allele rs6746082; blue dots are conditional results additionally adjusted for the independent SNPs within 23.1-29.8 Mb identified by forward-selection logistic regression.

Supplementary Figure S6.
The odds ratios of African local ancestry on MM risk with and without adjustment for allele dosage in case-control analyses on chromosome 2 (23.1-29.8 Mb) in the AA population. The x-axis is chromosome position and the y-axis is the odds ratio (OR) of AFR local ancestry. Grey dots are marginal ORs adjusted for age, sex, and the first ten principal components; orange dots are conditional ORs additionally adjusted for the known risk allele rs6746082; blue dots are conditional ORs additionally adjusted for the independent SNPs within 23.1-29.8 Mb identified by forward-selection logistic regression. The grey region represents a ±15% change in the marginal ORs.

Supplementary figure S7. LocusZoom plots for variants located at the 22 known multiple myeloma risk loci among AA individuals (r² computed from the European population in 1KGP). Single-nucleotide polymorphisms (SNPs) are plotted by position (x-axis) and -log10 P-value (y-axis). SNPs with MAF<0.01 and imputation score (r²)<0.8 were excluded. r² was calculated in European individuals of phase III 1KGP. The purple diamonds are index SNPs reported by previous GWAS, and the surrounding SNPs are colored to indicate pairwise correlation with the index SNP.
Panels: 2p23.3, 2q31.1, 3p22.1, 3q26.2, 5q15, 5q23.2, 6p21.33, 6q21, 7p15.3, 7q22.3, 7q22.3, 7q36.1, 8q24.21, 9p21.3, 10p12.1, 16p11.2, 16q23.1, 17p11.2, 19p13.11, 20q13.13, 22q13, 22q13.1.

Supplementary figure S8. Functional annotation in the UCSC Genome Browser for allele rs13296848 and its correlated SNPs. The variant in blue is the index SNP rs13296848. Variants in yellow are SNPs correlated with the index SNP rs13296848 (r² > 0.4 in the African population of 1KGP).
Figure S8 image (not reproduced): UCSC Genome Browser view at 9p24.3 (KANK1) with the query variant rs13296848 and labeled variants rs7041829, rs4742257, rs7854502, and rs13285101. Tracks shown: R² plot (proxy variants binned by R²), chromosome bands, UCSC Genes, NHGRI-EBI GWAS Catalog, ENCODE H3K4Me1, H3K4Me3, and H3K27Ac marks (7 cell lines), DNaseI hypersensitivity clusters (125 cell types), transcription factor ChIP-seq (161 factors), GM12878 genome segmentation (Segway+ChromHMM), and dbSNP 150.
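The Manhattan-plot caption above quotes two cut-offs, P = 5 × 10⁻⁸ (genome-wide significant) and P = 1 × 10⁻⁶ (suggestive), and the plots use a -log10 P-value axis. The following is a minimal sketch of that conversion and classification, not the dissertation's own code. The function names, the category labels, and the use of strict inequalities at the cut-offs are assumptions made for illustration.

```python
from math import log10

# Cut-offs quoted in the Supplementary figure S2 caption.
GENOME_WIDE_P = 5e-8
SUGGESTIVE_P = 1e-6


def neg_log10_p(p_value: float) -> float:
    """Return -log10(P), the y-axis value used in the association plots."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -log10(p_value)


def classify_p(p_value: float) -> str:
    """Label a P-value against the two cut-offs.

    Assumption: strict '<' comparisons at both cut-offs; the labels are
    hypothetical names chosen for this sketch.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    if p_value < GENOME_WIDE_P:
        return "genome-wide significant"
    if p_value < SUGGESTIVE_P:
        return "suggestive"
    return "below suggestive threshold"


if __name__ == "__main__":
    # Hypothetical P-values, not results from the study.
    for p in (1e-9, 5e-7, 0.02):
        print(f"P={p:g}  -log10(P)={neg_log10_p(p):.2f}  {classify_p(p)}")
```

On the -log10 scale the two cut-offs sit at roughly 7.3 and 6, which is where the orange and blue lines would fall on such a plot.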
Abstract
Genome-wide association studies (GWAS) in the past decade have successfully identified thousands of common genetic susceptibility loci for cancers. However, non-European populations have been underrepresented in GWAS samples to date. Realizing the clinical value of genetic information in guiding personalized medicine in populations of non-European ancestry will require additional discovery and risk locus characterization efforts across populations. In this dissertation, I aim to expand the current knowledge of genetic susceptibility for multiple cancers to the underrepresented populations of African and Latino ancestry. In chapter 2, I conducted the first genetic risk characterization and GWAS of prostate cancer (PrCa) in men from Eastern Africa, among cases and controls from Uganda. In chapter 3, I assembled all existing genetic studies of PrCa in Latino men to search for novel risk alleles and to determine whether known PrCa risk alleles are important in capturing PrCa risk in Latino men. I also explored whether genetic background/ancestry modified associations with single variants and a PrCa polygenic risk score in Latino men. In chapter 4, I combined all existing multiple myeloma GWAS data for men and women of African ancestry. In addition to scanning for novel risk regions and assessing the aggregated effect of known risk loci on multiple myeloma risk, I comprehensively examined each known risk region to find markers that better capture multiple myeloma genetic risk in this high-risk population. These studies show that genetic analyses of these cancers in non-European ancestry populations are imperative for developing ethnic-specific polygenic scores that can inform and improve prevention, screening and treatment of cancers in these diverse populations.
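The abstract refers to a polygenic risk score and to the aggregated effect of known risk loci. A common way to build such a score is a weighted sum of risk-allele dosages, with each weight set to the natural log of the per-allele odds ratio. The sketch below illustrates that general form only. It is an assumption that the dissertation's scores follow it, and the variant names, odds ratios, and dosages are hypothetical placeholders rather than study data.

```python
from math import log
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    odds_ratios: Mapping[str, float],
) -> float:
    """Sum over variants of ln(per-allele OR) * risk-allele dosage (0 to 2)."""
    if set(dosages) != set(odds_ratios):
        raise ValueError("dosages and odds_ratios must cover the same variants")
    for snp, dosage in dosages.items():
        if not 0.0 <= dosage <= 2.0:
            raise ValueError(f"dosage for {snp} must be between 0 and 2")
        if odds_ratios[snp] <= 0.0:
            raise ValueError(f"odds ratio for {snp} must be positive")
    return sum(log(odds_ratios[snp]) * dosages[snp] for snp in dosages)


# Hypothetical inputs for illustration only.
example_odds_ratios = {"snp_a": 1.20, "snp_b": 1.10, "snp_c": 0.90}
example_dosages = {"snp_a": 2.0, "snp_b": 1.0, "snp_c": 0.0}
example_score = polygenic_risk_score(example_dosages, example_odds_ratios)

if __name__ == "__main__":
    print(f"example score: {example_score:.4f}")
```

Using log odds ratios as weights makes the score additive on the log-odds scale, so a variant carried in zero copies contributes nothing and fractional dosages from imputation can be used directly.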
Asset Metadata
Creator
Du, Zhaohui (author)
Core Title
Genetic studies of cancer in populations of African ancestry and Latinos
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Epidemiology
Publication Date
12/16/2019
Defense Date
12/17/2019
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
admixture mapping,African Americans,GWAS,Latinos,multiple myeloma,OAI-PMH Harvest,polygenic risk score,prostate cancer
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Haiman, Christopher A. (committee chair), Carpten, John D. (committee member), Conti, David V. (committee member), Cozen, Wendy (committee member)
Creator Email
duzhh1226@hotmail.com,zhaohuid@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-256779
Unique identifier
UC11673105
Identifier
etd-DuZhaohui-7996.pdf (filename),usctheses-c89-256779 (legacy record id)
Legacy Identifier
etd-DuZhaohui-7996.pdf
Dmrecord
256779
Document Type
Dissertation
Rights
Du, Zhaohui
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA