Ancestral/Ethnic variation in the epidemiology and genetic predisposition of early-onset hematologic cancers
(USC Thesis Other)
Ancestral/Ethnic Variation in the Epidemiology and Genetic Predisposition of Early-onset Hematologic Cancers
by Qianxi Feng, MPH

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (EPIDEMIOLOGY)

May 2022
Copyright 2022 Qianxi Feng

Acknowledgements
First and foremost, I am extremely grateful to my supervisors, Dr. Joseph Wiemels and Dr. Adam de Smith, for their invaluable advice, continuous support, and patience during my PhD study. Their immense knowledge and plentiful experience have encouraged me throughout my academic research and daily life. I would also like to thank Dr. William J. Gauderman, Dr. Charleston Chiang, and Dr. Deepa Bhojwani for sitting on my dissertation committee and giving valuable advice during the defense.
I would also like to thank all the collaborators from Keck Medicine at USC (Dr. Roberta McKean-Cowdin, Dr. Thomas Mack, Dr. Nicholas Mancuso, Ms. Charite Ricker, Dr. Russell Brynes, and Dr. Maria Vergara-Lluri), UCSF School of Medicine (Dr. Andrew D. Leavitt, Dr. Scott Kogan, Dr. Jill Hollenbach, Dr. Mi Zhou, and Ms. Helen Hansen), Yale School of Public Health (Dr. Rong Wang and Dr. Xiaomei Ma), UC Berkeley School of Public Health (Dr. Libby Morimoto, Dr. Catherine Metayer, and Ms. Alice Kang), Children’s Hospital Oakland (Dr. Henry Erlich, Ms. Anna Lisa Fear, Dr. Derek Pappas, and Dr. Elizabeth Trachtenberg), and University of Chicago School of Medicine (Dr. Lucy A. Godley, Ms. Kelsey McNeely, and Dr. Simone Feurstein) for their technical support on my studies.
I would like to thank all the members of the Children's Cancer Research Laboratory at USC. It is their kind help and support that have made my study and life in Los Angeles a wonderful time.
Finally, I would like to express my gratitude to my parents, Dr. Dan Xi and Mr. Chun Feng, my parents-in-law, Mrs. Hong Zhang and Mr.
Jian Wang, and my husband, Mr. Ziyuan Wang. Without their tremendous understanding and encouragement over the past few years, it would have been impossible for me to complete my study.

Table of Contents
Acknowledgements
List of Tables
List of Figures
List of Abbreviations
Abstract
Introduction
Chapter 1: Trends in acute lymphoblastic leukemia Incidence in the US
    Introduction
    Methods
    Results
    Discussion
Chapter 2: In utero immune development and susceptibility of childhood ALL
    Introduction
    Methods
    Results
    Discussion
Chapter 3: Burden of familial-associated early-onset cancer risk
    Introduction
    Methods
    Results
    Discussion
Chapter 4: Genetic predispositions of familial-associated early-onset hematologic cancers
    Introduction
    Methods
    Results
    Discussion
Conclusion
Bibliography
Appendices
    Appendix A: Supplementary materials for Chapter 1
    Appendix B: Supplementary materials for Chapter 2
    Appendix C: Supplementary materials for Chapter 3
    Appendix D: Supplementary materials for Chapter 4

List of Tables
Table 1-1, Table 1-2, Table 2-1, Table 2-2, Table 2-3, Table 3-1, Table 3-2, Table 4-1, Table 4-2

List of Figures
Figure 1-1, Figure 1-2, Figure 1-3, Figure 2-1, Figure 3-1, Figure 3-2, Figure 3-3

List of Abbreviations
1KG: 1000 Genomes Project
AAIR: age-adjusted incidence rate
ACMG: American College of Medical Genetics and Genomics
AIAN: American Indian/Alaskan Native
ALL: acute lymphoblastic leukemia
AML: acute myeloid leukemia
APC: annual percent change
API: Asian/Pacific Islander
ARG-II: arginase-II
ARID5B: AT-Rich Interaction Domain 5B
ASIR: age-standardized incidence rate
AYA: adolescent and young adult
CCLS: California Childhood Leukemia Study
CCRLP: California Childhood Cancer Records Linkage Project
CCR: California Cancer Registry
CI: confidence interval
CLL: chronic lymphocytic leukemia
CNS: central nervous system
ETV6: ETS Variant Transcription Factor 6
FPM: first primary malignancy
GATK: Genome Analysis ToolKit
GCT: germ cell tumor
gnomAD: Genome Aggregation Database
GWAS: genome-wide association studies
HC: hematologic cancer
HGDP: Human Genome Diversity Project
HL: Hodgkin lymphoma
HLA: human leukocyte antigen
IGH: immunoglobulin heavy locus
IKZF1: IKAROS Family Zinc Finger 1
IL-10: interleukin 10
IRR: incidence rate ratio
KIR: killer immunoglobulin-like receptors
KMT2A: lysine methyltransferase 2A
MAF: minor allele frequency
NHL: non-Hodgkin lymphoma
NK cell: natural killer cell
NL: non-Latino
NLA: non-Latino Asian
NLB: non-Latino Black
NLW: non-Latino White
NOTCH1: Notch receptor 1 gene
OR: odds ratio
PC: principal component
PeCanPIE: Pediatric Cancer Variant Pathogenicity Information Exchange
Ph-like: Philadelphia chromosome like
RB1: RB Transcriptional Corepressor 1
RUNX1: runt-related transcription factor 1
SD: standard deviation
SEER: Surveillance, Epidemiology, and End Results
SEP: socioeconomic position
SIR: standardized incidence ratio
SNP: single-nucleotide polymorphism
SPM: second primary malignancy
TNF-α: tumor necrosis factor alpha
TOPMed: Trans-Omics for Precision Medicine
TP53: Tumor Protein P53
VCF: variant call format
VSN: variance stabilizing normalization
WES: whole exome sequencing

Abstract
Incidence trends in acute lymphoblastic leukemia (ALL) demonstrate
disparities by race and ethnicity. We used data from the Surveillance, Epidemiology and End Results Registry to evaluate patterns in ALL incidences from 2000-2016, including the association between the percent of people born in a foreign country at the county level and ALL incidence. Among 23,829 individuals of all ages diagnosed with ALL, 8,297 (34.8%) were Latinos, 11,714 (49.2%) were non-Latino (NL) Whites, and 1,639 (6.9%) were NL Blacks. Latinos had the largest increase in the age-adjusted incidence rate (AAIR) in this period compared to other race/ethnicities for both children and adults: AAIR was 1.6 times higher for Latinos (AAIR=2.43; 95%CI: 2.37,2.49) compared to NL Whites (AAIR=1.56; 95%CI:1.53,1.59; P<0.01). The AAIR for all subjects increased approximately 1% per year from 2000-2016 (annual percent change=0.97; 95%CI:0.67,1.27), with the highest increase in Latinos (annual percent change=1.18; 95%CI:0.76,1.60). In multivariable models evaluating the contribution of % of the county residents that were foreign born to ALL risk, a positive association was found for percentage of foreign born for NL Whites (P-trend<0.01) and Blacks (P-trend<0.01), but the inverse association was found for Latinos (P-trend<0.01) consistent with tenets of the “Hispanic paradox” in which better health outcomes exist for foreign-born Latinos. Acute lymphoblastic leukemia (ALL) in children is associated with a distinct neonatal cytokine profile. The basis of this neonatal immune phenotype is unknown, but potentially related to maternal-fetal immune receptor interactions. 
We conducted a case-control study of 226 case child-mother pairs and 404 control child-mother pairs to evaluate the role of interaction between human leukocyte antigen (HLA) genotypes in the offspring and maternal killer immunoglobulin-like receptor (KIR) genotypes in the etiology of childhood ALL, while considering potential mediation by neonatal cytokines and the immune-modulating enzyme arginase-II (ARG-II). We observed different associations between offspring HLA-maternal KIR activating profiles and the risk of ALL in different predicted genetic ancestry groups. For instance, in Latino subjects, who experience the highest risk of childhood leukemia, activating profiles were significantly associated with a lower risk of childhood ALL (odds ratio, OR=0.59; 95% confidence interval, CI: 0.49, 0.71) and a higher level of ARG-II at birth (coefficient=0.13; 95% CI: 0.04, 0.22). HLA-KIR activating profiles were also associated with a lower risk of ALL in non-Latino Asians (OR=0.63; 95% CI: 0.38, 1.01), although with a lower TNF-α level (coefficient=-0.27; 95% CI: -0.49, -0.06). Among non-Latino White subjects, no significant association was observed between offspring HLA-maternal KIR interaction and ALL risk or cytokine levels. The current study reports the association between offspring HLA-maternal KIR interaction and the development of childhood ALL, with variation by predicted genetic ancestry. We also observed some associations between activating profiles and immune factors related to cytokine control; however, cytokines did not demonstrate causal mediation of the effect of the activating profiles on ALL risk.

The role of race/ethnicity in genetic predisposition of early-onset cancers can be estimated by comparing family-based cancer concordance rates among ethnic groups.
We used linked California health registries to evaluate the relative cancer risks for first-degree relatives of patients diagnosed between ages 0-26, and the relative risks of developing distinct second primary malignancies (SPMs). From 1989-2015, we identified 29,631 cancer patients and 62,863 healthy family members. Given probands with cancer, there were increased relative risks of any cancer for siblings and mothers [standardized incidence ratio (SIR)=3.32; 95% confidence interval (CI): 2.85-3.85] and of SPMs (SIR=7.27; 95% CI: 6.56-8.03). A higher relative risk of any cancer in siblings and mothers given a proband with solid cancer (P=0.019) was observed for Latinos (SIR=4.98; 95% CI: 3.82-6.39) compared to non-Latino White subjects (SIR=3.02; 95% CI: 2.12-4.16), supporting a need for increased attention to the genetics of early-onset cancer predisposition and environmental factors in Latinos.

Introduction
The American Cancer Society has characterized hematologic cancer (HC) as the most commonly diagnosed cancer among children and adolescents under 20 years. The three major types of HC, leukemias, myelomas and lymphomas, together account for approximately one-third of all cancers diagnosed in people under 20 years of age [1]. Across all age groups, HCs account for 8-10% of all cancer diagnoses and a similar percentage of cancer deaths worldwide. More than 1.3 million people in the US are living with or in remission from a blood cancer [2]. Although the prognosis of HCs has improved significantly over the past few decades, patients suffer from various long- and short-term sequelae of treatment, such as second and subsequent malignancies [3], amyotrophic lateral sclerosis [4], osteonecrosis, hormonal, cardiovascular, and pulmonary abnormalities, and neuropsychological problems [5,6]. Therefore, understanding the epidemiology and etiology of HCs is important for preventing them and alleviating their burden.
The incidence of early-onset HCs varies by race/ethnic group.
For childhood HCs diagnosed under 20 years of age, the incidence among Latino subjects is approximately 7% higher compared to non-Latino (NL) White subjects, 50% higher compared to NL Black subjects, 20% higher compared to NL Asian/Pacific Islander (API) subjects, and 40% higher compared to NL American Indian/Alaskan Native (AIAN) subjects [7]. The strikingly high incidence of childhood HCs among Latino subjects is mainly driven by the high incidence of leukemias, which account for approximately 65% of total HC diagnoses among childhood HC patients [1]. Environmental, lifestyle and genetic factors may contribute to this variation in incidence rate across different race/ethnic groups.

In this dissertation, I describe four studies that investigated HCs at the population, familial, and individual levels. In Chapters 1-2, we studied the most common cancer in children, acute lymphoblastic leukemia (ALL) [1], at the population and individual levels. We first reported the incidence and trend of ALL at all ages in the US using the largest population-based cancer registry in the US, the Surveillance, Epidemiology, and End Results (SEER) Program [8]. Further, we investigated how in utero immune development is associated with the risk of childhood ALL, and how this association varies by race/ethnic group, using a case-control study that collected biological samples from mother-child pairs in California. In Chapters 3-4, we assessed the familial clustering of early-onset cancers, and explored the genetic predispositions that are associated with the variation in HC risk among different race/ethnic groups. We first conducted a study to quantify the burden of such familial-associated early-onset cancers in California.
Subsequently, to identify the potential causal genetic predisposition of early-onset HCs that explains race/ethnic variation in HC predisposition, we conducted a study to detect genetic mutations and their ancestral origins contributing to familial-associated early-onset HCs in these groups.

Chapter 1: Trends in acute lymphoblastic leukemia Incidence in the US
In this Chapter, we describe an epidemiologic study on the incidence of acute lymphoblastic leukemia from 2000 to 2016 with the largest population-based cancer registry in the US.

Introduction
Acute lymphoblastic leukemia (ALL) affects both children and adults and is characterized by the accumulation of either B- or T-lymphoblasts in the bone marrow. Outcomes for pediatric ALL patients have improved remarkably in recent decades, but survival rates and long-term prognosis for adults remain poor [9]. Despite advances in therapy, knowledge of its causes lags behind, thwarting any possible prevention measures for both children and adults.
The disease in children is dominated by good-prognosis subtypes, such as those with high hyperdiploidy or ETV6-RUNX1 fusion translocations [10]. Over the past few decades, the five-year survival rate of ALL in children has improved drastically with intensive conventional chemotherapy, from 10% in the 1960s to 80% in the 1990s and over 90% in the 2000s in the United States [11]. However, despite the improvements in treatment, knowledge to enable ALL prevention did not concomitantly advance, as demonstrated by the continuous increase in ALL incidence among Latino children from 1990 to 2013. Intriguingly, this increase in ALL incidence was not observed for non-Latino White, Black, or Asian children, who also have lower incidence than Latino subjects [12].
In contrast to children, other age groups (adolescents, young adults, and older adults) tend to harbor poor-prognosis subtypes, such as those related to the Philadelphia (Ph) chromosome and Ph-like subtypes, and translocations in the lysine methyltransferase 2A (KMT2A) and immunoglobulin heavy locus (IGH) genes, or, among T-cell subtypes, Notch receptor 1 gene (NOTCH1) mutations [13-15]. Admittedly, in the post-tyrosine kinase inhibitor therapy era, clinical outcomes of Ph-positive (as compared to Ph-negative) ALL in adults have shown improvement over time, demonstrating survival comparable to that of the childhood illness [16,17]. Nevertheless, the prognosis among adolescents and young adults for Ph-negative ALL remains poor, as only half of adolescent and young adult ALL patients survive more than ten years [18]. Similar to childhood ALL, the incidence among adolescent, young adult and older adult Latino subjects has kept increasing over the past decades [18-20].
Using publicly available Surveillance, Epidemiology and End Results (SEER) registry data, a pattern of increasing incidence of childhood ALL over the past several decades among all racial/ethnic groups has been described, driven largely in recent years by increasing rates among Latino children [12,21-23]. In adults, we also reported a higher incidence of ALL among Latinos compared to non-Latino Whites through 2004 [24]. In the current analysis, we update the trends for ALL from childhood through adult years using data from 2000 to 2016 and evaluate factors available in SEER that may be influencing trends by race/ethnicity. Specifically, we survey ALL incidence patterns over the life course, including characterization of incidence rates in children, adolescents, and adults, and examine trends by age and by SEER-specific factors that may inform these patterns, including socioeconomic position (SEP) and place of birth.
With the statistical models, we provide novel insights into the association of SEP and place of birth with ALL.

Methods
We used data from the National Cancer Institute’s population-based SEER registry to analyze trends in ALL from 2000 to 2016. Age-adjusted incidence rates (AAIRs) per 100,000 population were calculated using data from all SEER 18 registries overall and by race/ethnicity [25]. ALL was defined by the International Classification of Diseases for Oncology, Third Edition (ICD-O-3) codes 9811-9818, 9826, and 9835-9837 [26]. Only the first matching record for each primary malignant case was extracted from the database. Cases with known age at diagnosis and of Latino ethnicity (all races), non-Latino (NL) White, NL Black, NL Asian and Pacific Islander (API), NL American Indian/Alaskan Native (AIAN), or NL unknown race were included in the analyses.
AAIRs and 95% confidence intervals (95% CI) were calculated using SEER*Stat v8.3.6 (19) with the 2000 U.S. Standard Population as the reference [27]. AAIRs were stratified by race/ethnicity and by age group: 0-14 (childhood), 15-39 (adolescent and young adult [AYA]), and 40+ years (adult). The summary AAIRs with 95% CIs are shown graphically, and annual percent changes (APCs) are reported to show patterns by race/ethnicity and age group. APCs were calculated using the Joinpoint Regression Program, version 4.7.0 [28], with the year of diagnosis as the primary predictor variable.
We further used a Poisson or negative binomial regression model with an age-standardized population offset to analyze the association between the community (county)-level percent of people born in a foreign country in 2000 and the AAIR of ALL among NL Whites, NL Blacks and Latinos. The analyses were stratified by race/ethnicity and adjusted for age group (0-14, 15-39, 40+ years), sex, year of diagnosis and SEP. SEP was identified with a time-dependent Yost index variable from census tracts linked to SEER [29].
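As an illustration of the direct age standardization that underlies an AAIR, the following minimal sketch weights stratum-specific rates by a standard population. It is not the SEER*Stat implementation: the function name `age_adjusted_rate`, the three age strata, the case counts, the person-years and the weights are all hypothetical values chosen for readability, not SEER data or the actual 2000 U.S. Standard Population weights.

```python
def age_adjusted_rate(cases, person_years, std_weights, per=100_000):
    """Directly standardized rate: sum over strata of weight * (cases / person-years).

    All three sequences hold one entry per age stratum. Weights are rescaled
    to sum to 1 so that raw standard-population counts can be passed in as well.
    """
    if not (len(cases) == len(person_years) == len(std_weights)):
        raise ValueError("inputs must have one entry per age stratum")
    total_weight = sum(std_weights)
    rate = sum(
        (w / total_weight) * (d / n)
        for d, n, w in zip(cases, person_years, std_weights)
    )
    return rate * per


# Hypothetical three-stratum example (illustrative numbers only, not SEER data).
cases = [50, 10, 20]                       # diagnoses per stratum
person_years = [1_000_000, 1_000_000, 1_000_000]
std_weights = [0.2, 0.4, 0.4]              # assumed standard-population shares

aair = age_adjusted_rate(cases, person_years, std_weights)
print(f"AAIR = {aair:.2f} per 100,000")
```

Because the same weights are applied to every group, differences between two AAIRs reflect differences in age-specific rates rather than differences in age structure.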
The Yost index variable captures 7 aspects of SEP: proportion of the population that is working-class; proportion of persons aged 16 years and over who are unemployed; proportion of persons below 150% of the federal poverty line; median annual household income; education index; median house value; and median cost to rent a residence in the surrounding community [30]. The % foreign-born variable was categorized by quintiles and the Yost index variable was categorized by race-specific quintiles. We performed the analysis among all ALL cases in SEER. For sensitivity analysis, we stratified the analysis by the state of residence at diagnosis, specifically inside/outside California, to reduce the overall potential for confounding by national origin, because over 80% of Latino subjects in California are from one source, Mexico [31].
All statistical tests were two-sided. Any P value less than 0.05 was considered statistically significant. Generalized collinearity diagnostics were used to evaluate collinearity in Poisson regression models. All analyses were performed with R, version 3.6.0 (R Foundation for Statistical Computing, Vienna, Austria).

Results
Overall
From 2000 to 2016, 23,829 people were diagnosed with ALL in the SEER database. Among them, 8,297 (34.8%) were Latino; of the non-Latino cases, 11,714 (49.2%) were White, 1,831 (7.68%) were API, 1,639 (6.88%) were Black, 225 (0.94%) were AIAN and 123 (0.52%) were of another race/ethnic category. The AAIRs per 100,000 population from 2000 to 2016 were the highest for Latinos of all races (AAIR=2.43; 95% CI: 2.37, 2.49), followed by AIANs (AAIR=1.78; 95% CI: 1.55, 2.04), Whites (AAIR=1.56; 95% CI: 1.53, 1.59), APIs (AAIR=1.45; 95% CI: 1.38, 1.51), and Blacks (AAIR=0.95; 95% CI: 0.90, 1.00) (Table 1-1). During this same time window, AAIRs increased significantly among Latinos (APC=1.18; 95% CI: 0.76, 1.60) and non-Latino Blacks (APC=1.04; 95% CI: 0.02, 2.07), but not among NL Whites, APIs or AIANs (Table 1-2).
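To make the APC statistic concrete, the sketch below fits a single log-linear trend, ln(rate) = a + b × year, by ordinary least squares and converts the slope to a percent change, APC = 100 × (exp(b) − 1). This is a simplified illustration under stated assumptions: it fits one unweighted segment and does not reproduce the segmented fitting or confidence intervals of the Joinpoint program; the function name and the synthetic rates (constructed to rise exactly 1% per year) are hypothetical.

```python
import math


def annual_percent_change(years, rates):
    """APC from an ordinary least-squares fit of ln(rate) on calendar year."""
    n = len(years)
    if n != len(rates) or n < 2:
        raise ValueError("need at least two (year, rate) pairs of equal length")
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx                      # change in ln(rate) per year
    return 100.0 * (math.exp(slope) - 1.0)


# Synthetic rates that grow by exactly 1% per year (not SEER data).
years = list(range(2000, 2017))
rates = [2.0 * 1.01 ** (y - 2000) for y in years]

apc = annual_percent_change(years, rates)
print(f"APC = {apc:.2f}% per year")
```

Read this way, the reported APC of 1.18 for Latinos corresponds to a rate that is multiplied by about 1.0118 each year over the study window.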
When we specifically evaluated SEER registry data from California, 11,840 people were diagnosed with ALL from 2000-2016, of whom 6,386 (53.94%) were Latino and 3,828 (32.33%) were NL White. In California, we observed that the AAIRs were significantly higher among Latinos (AAIR=2.56; 95% CI: 2.49, 2.63) compared to NL Whites (AAIR=1.62; 95% CI: 1.56, 1.67). The AAIRs increased significantly among Latinos (APC=1.15; 95% CI: 0.80, 1.50), but not among NL Whites, Blacks or APIs in California from 2000 to 2016 (Supplementary Table 1-1).

Incidence rates by age
The incidence rate peaked among children aged 0-9 years, decreased from 10 to 29 years, and then increased gradually after age 30 (Figure 1-1). The AAIRs for Latinos [0-14 years, AAIR=5.10; 95% CI: 4.95, 5.25; 15-39 years, AAIR=1.63; 95% CI: 1.56, 1.70; 40+ years, AAIR=1.76; 95% CI: 1.67, 1.86] were statistically significantly higher than those of NL Whites [0-14 years, AAIR=4.01; 95% CI: 3.91, 4.12; 15-39 years, AAIR=0.79; 95% CI: 0.76, 0.83; 40+ years, AAIR=0.97; 95% CI: 0.94, 1.00] across all age groups. For NL Blacks, the AAIRs [0-14 years, AAIR=2.02; 95% CI: 1.88, 2.16; 15-39 years, AAIR=0.53; 95% CI: 0.47, 0.59; 40+ years, AAIR=0.77; 95% CI: 0.70, 0.84] were statistically significantly lower than those of NL Whites across all age groups (Table 1-1).
For all race/ethnic groups combined, the AAIRs increased significantly in a linear pattern across all age groups; these patterns were largely driven by Latinos. The AAIR for Latinos increased significantly across all age groups (APC=1.18; 95% CI: 0.76, 1.60), with the greatest increase in the 15-39 years age group (APC=2.02; 95% CI: 1.17, 2.88). For APIs and AIANs, the AAIR increased significantly among people aged 15-39 [APIs, APC=1.95; 95% CI: 0.15, 3.79; AIANs, APC=9.79; 95% CI: 5.65, 14.09], but not among other age groups. For NL Whites and Blacks, the trend of AAIR remained stable over time for each age group (Table 1-2, Figure 1-2, & Supplementary Figure 1-1).
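The size of the age-specific differences reported above can be read off as simple rate ratios. The short sketch below divides the AAIRs quoted in the preceding paragraph by the NL White AAIR in the same age group; the dictionary layout and the helper name `rate_ratio` are illustrative choices, and these crude ratios of point estimates carry no confidence intervals and are not the model-based IRRs reported later in the chapter.

```python
# AAIRs per 100,000 as quoted in the text (Table 1-1), by age group.
aair = {
    "0-14": {"Latino": 5.10, "NL White": 4.01, "NL Black": 2.02},
    "15-39": {"Latino": 1.63, "NL White": 0.79, "NL Black": 0.53},
    "40+": {"Latino": 1.76, "NL White": 0.97, "NL Black": 0.77},
}


def rate_ratio(group, reference, age):
    """Ratio of two age-adjusted rates within the same age group."""
    return aair[age][group] / aair[age][reference]


latino_vs_white = {
    age: round(rate_ratio("Latino", "NL White", age), 2) for age in aair
}
black_vs_white = {
    age: round(rate_ratio("NL Black", "NL White", age), 2) for age in aair
}

for age in aair:
    print(age, latino_vs_white[age], black_vs_white[age])
```

On these figures the relative excess among Latinos is smallest in childhood (about 1.3-fold) and largest in the 15-39 age group (about 2.1-fold), even though the absolute rates are highest in childhood.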

The effects of percent foreign-born and socioeconomic position on ALL incidence vary by ethnicity
Among NL Whites (P-trend<0.01) and Blacks (P-trend<0.01), we found a higher incidence of ALL among people who lived in a community with a high percent of foreign-born persons compared to a low percent foreign-born, but an inverse association, with a lower incidence of ALL, among Latinos (P-trend<0.01) who lived in a community with a high percent of foreign-born persons compared to a low percent foreign-born, after adjustment for age, sex, year of diagnosis and Yost index. For NL Whites, the incidence rate of ALL was 2.21 times as high (incidence rate ratio, IRR=2.21; 95% CI: 2.06, 2.37) among people who lived in a community in the highest quintile of percent foreign-born compared to those of the lowest quintile (Figure 1-3A). But among Latinos, the incidence rate of ALL for people who lived in a community in the high quintile of percent foreign-born was 0.69 times (IRR=0.69; 95% CI: 0.62, 0.77) that of those in the lowest quintile, whereas the ALL risk for people in the highest percent foreign-born quintile did not differ from that of those in the lowest quintile (IRR=1.03; 95% CI: 0.92, 1.15) (Figure 1-3B). For NL Blacks, the incidence rate of ALL was 3.62 times as high (IRR=3.62; 95% CI: 2.99, 4.37) among people of the highest quintile of percent foreign-born compared to those of the lowest quintile (Figure 1-3C). We performed a sensitivity analysis excluding Los Angeles County, where 27.7% of Latino ALL cases (2,295 of 8,297) were diagnosed, and found that the incidence rate of ALL among people living in the highest quintile of percent foreign-born was 0.27 times (IRR=0.27; 95% CI: 0.23, 0.30) that of those in the lowest quintile (Supplementary Table 1-2 & Supplementary Figure 1-2).
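For readers less familiar with how an IRR and its interval arise from a Poisson or negative binomial model, the sketch below shows the usual log-scale relationship: IRR = exp(β) and 95% CI = exp(β ± 1.96 × SE). The coefficient and standard error are not reported in the text, so here they are back-calculated from the published IRR for NL Whites (2.21; 95% CI: 2.06, 2.37) under the assumption of a symmetric Wald interval on the log scale; the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def irr_with_ci(beta, se, z=Z_95):
    """Exponentiate a log-rate-ratio coefficient and its Wald interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Recover the log-scale standard error from a reported ratio CI.

    Assumes the interval was built as exp(beta +/- z * se).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported estimate for NL Whites, highest vs lowest % foreign-born quintile.
beta = math.log(2.21)            # implied model coefficient (assumption)
se = se_from_ci(2.06, 2.37)      # implied standard error (assumption)

irr, lo, hi = irr_with_ci(beta, se)
print(f"IRR = {irr:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

The interval is symmetric around β on the log scale, which is why reported IRR intervals look asymmetric around the point estimate on the ratio scale.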
Pearson’s correlation between the continuous Yost index and percent foreign-born variables among NL Whites and Blacks (R=0.30 and 0.38, respectively; Supplementary Figure 1-3) was small but significantly greater than zero; however, there was no evidence that the model assumption of no collinearity was violated.
The Yost index was inversely associated with the AAIR of ALL for NL Whites (P-trend<0.01) and Blacks (P-trend<0.01), but positively associated among Latinos (P-trend<0.01). For NL Whites, the incidence rate of ALL for people of the highest Yost index quintile was 0.78 times (IRR=0.78; 95% CI: 0.73, 0.82) that of those of the lowest quintile, indicating lower rates with higher SEP (Figure 1-3A). Among Latinos, the reverse relationship was observed: the incidence rate of ALL was 1.29-fold (IRR=1.29; 95% CI: 1.20, 1.38) higher for people of the high Yost index quintile compared to those of the lowest, indicating higher rates with higher SEP (Figure 1-3B). For NL Blacks, the effect was similar to that among NL Whites: the incidence rate of ALL for people of the highest Yost index quintile was 0.63 times (IRR=0.63; 95% CI: 0.53, 0.75) that of those of the lowest quintile (Figure 1-3C).
When stratified by age group and after controlling for sex, year of diagnosis and Yost index, the incidence rate of ALL was lower among people in higher percent foreign-born quintiles only among Latinos 40 years of age and older. For this age group, the incidence rate of ALL among people in the highest percent foreign-born quintile was 0.74 times (IRR=0.74; 95% CI: 0.52, 1.05) that of those in the lowest quintile. Percent foreign-born was not associated with ALL incidence among Latino subjects aged 0-14 years (IRR=1.15; 95% CI: 0.88, 1.51) or 15-39 years (IRR=0.96; 95% CI: 0.71, 1.31). Percent foreign-born was positively associated with ALL incidence across all age groups among NL White and Black subjects (Supplementary Table 1-3).
Among NL Whites in California, the incidence rate of ALL was lower (IRR=0.78; 95%CI: 0.66, 0.93) for people of the highest percent foreign-born quintile compared to those of the lowest quintile. No strong association between ALL incidence and Yost index was present in the California data of NL White subjects (Supplementary Figure 1-4A). Among Latino subjects in California, no consistent pattern between percent foreign-born and ALL incidence was found. For Yost index, the incidence rate of ALL was higher among California Latinos in the medium or high quintiles (medium: IRR=1.27; 95%CI: 1.18, 1.37; high: IRR=1.56; 95%CI: 1.44, 1.70) compared to people of the lowest quintile (Supplementary Figure 1-4B).

In contrast to NL Whites in California, the incidence of ALL in NL Whites outside of California was higher among people in the highest percent foreign-born quintile compared to those in the lowest quintile (IRR=2.99; 95%CI: 2.24, 3.89) (Supplementary Figure 1-5A). No strong association between ALL incidence and Yost index was observed for the non-Californian NL Whites. Among Latino subjects outside of California, the incidence of ALL was higher (IRR=2.30; 95%CI: 1.77, 2.95) for people of the highest percent foreign-born quintile compared to those of the lowest quintile (Supplementary Figure 1-5B). No strong association between ALL incidence and Yost index was observed for the non-Californian Latinos either.

Discussion

We provide novel insights into the association between ALL incidence and SEP and place of birth. Childhood ALL and adult ALL are fundamentally different diseases: childhood ALL is dominated by the early pre-B cell subtype that is characterized by a high frequency of ETV6-RUNX1 fusions and high hyperdiploidy, while adult ALL largely comprises later stage B-cell phenotypes with BCR-ABL1 fusion mutations and other poor prognostic subtypes 32,33.
It is noteworthy then that the ethnic disparities in ALL risk commonly noted for the childhood disease 12,21,22, including a higher incidence in Latinos, are observed equivalently over the life course. We observed the same race/ethnicity-specific order of ALL incidence rates across age groups, with the highest rates found among Latinos, followed by children and adults classified as non-Latino White, Asian, Native American, and non-Latino Black. The pattern of increasing incidence among Latinos was present among children, adolescents and young adults, and older adults during the past 16 years of observation.

The consistent pattern of increasing incidence across all Latino age groups may be due to a number of factors that need further investigation. First, Latinos are a genetically admixed population deriving major ancestral components from Europeans, Native Americans, and Africans 34, and germline genetic variation associated with ancestry may affect lymphopoietic pathways for all subtypes and ages, which is reflected in ALL risk. Our current knowledge gained from genome-wide association studies of childhood ALL is dominated by factors that affect development of pre-B cells, which may have relevance to adult ALL subtypes as well, although these have been understudied to date. Germline variants in GATA3, which are strongly associated with Ph-like ALL in children 35, have been associated with ALL risk in adolescents and young adults as well as in the ≥40 year age group 36,37. The allele frequencies of GATA3 risk variants are much higher in Latino populations compared to European ancestry populations, and lower still in most African populations 35; this may contribute to the increased incidence of ALL in Latinos across all age groups, as well as the increased prevalence of Ph-like ALL in Latino patients 38.
In addition, CEBPE, PIP4K2A and ARID5B are also susceptibility genes associated with childhood B-cell ALL risk in which risk alleles are more common among Latinos compared to European ancestry individuals 39,40. The risk variants in CEBPE, PIP4K2A and ARID5B also demonstrated a stronger association with the high hyperdiploid ALL subtype, which is prevalent among children 39 and appears to be proportionately more common in Latinos 40,41. This array of common polymorphisms may account partially for the increased prevalence of ALL among Latino children.

Second, the rise in ALL rates among Latinos has continued since our last report on this subject 24, emphasizing that environmental factors must impact risk, since gene polymorphism frequencies will not have changed in such a short timeframe. The only other rise in incidence noted here was among young adults in the SEER NL Black category, which, as an increase in a lower-incidence subgroup, could also be related to environmental factors. In the current study, the analysis of ALL AAIR patterns by percent foreign-born and Yost index revealed that among Latinos, regions with a higher level of foreign-born families and lower SEP had lower risks of leukemia. The variable “percent foreign-born” appears to be a crucial risk predictor for the incidence of ALL and is likely an indicator for a variety of true risk factors that may differ by individual and ethnic group. A previous study using SEER data without adjustment for percent foreign-born found the opposite association with SEP from that found in the current analysis; it reported higher ALL incidence with higher SEP index among patients aged 0-19 years 42. Foreign-born nativity does not equate with lower SEP 43, and it is intriguing that Latino rates of ALL were lower in regions with higher levels of foreign-born residents. This may represent a typical feature of disease incidence observed among Latinos, sometimes referred to as the “Hispanic paradox” 44,45.
This is the observation that health outcomes are often better among foreign-born Hispanic/Latinos or among children born to foreign-born mothers. The lower incidence of leukemias in South and Central American countries including Costa Rica compared to Latinos in Northern America, along with a possible lower upward 46,47 is compatible with the basic tenets of the Hispanic paradox. While our data did not examine individual families, our result is consistent with this phenomenon, which has not been reported previously among adults with leukemia and requires further study to address proof of concept and possible mechanisms, as aspects such as diet and lifestyle associated with nativity are more robustly studied in other diseases such as breast cancer and diabetes. One prior report suggested that the Hispanic paradox may affect childhood cancer cases within California 48; however, the effect was not significant for ALL in that report, whereas we observed it here.

In our study, the inverse association between percent foreign-born and ALL incidence among Latino subjects was observed in the low, medium, and high quintiles but not the highest, where the highest quintile exhibited a similar relative risk to the lowest quintile (Figure 1-3B). This inconsistent trend was also observed in the analysis of Latino subjects in California (Supplementary Figure 1-4B), but not for Latino subjects outside of California (Supplementary Figure 1-5B). It is likely that this trend was driven by the excess ALL risk in Los Angeles County 48, where approximately 36% of Latino cases (2,295 of 6,386) in California were diagnosed. Owing to the intrinsic limitation of an ecologic study design, this county as a whole was classified as being in the highest quintile of percent foreign-born persons. However, the population nativity in Los Angeles County is highly diverse, with a large number of immigrants from East and Southeast Asia in addition to Mexico and Central America 49.
In general, it is probable that the large number of cases diagnosed in Los Angeles County included many US-born as well as foreign-born individuals, and future studies of the “Hispanic paradox” will need to focus on individuals to account for potential residual confounding or information bias.

In a study of global patterns of leukemia using 290 population-based cancer registries from 68 countries on five continents, covering all ages and both sexes, age-standardized rates of ALL were highest in South America, Oceania, and Europe, with strikingly increased rates in Central and South America, particularly Ecuador, Costa Rica, and Colombia 46. Age-standardized incidence rates (ASIR) in males per 100,000 were 2.8 in Ecuador, 2.4 in Costa Rica, and 2.3 in Colombia 46, with similarly increased rates seen in these countries in other studies 46,50. In a comparison of standardized average annual incidence rates per million children from cancer registries 51, rates for Guatemala (14) 52 and Costa Rica (43.1) 53 were elevated, though not as high as those observed in Hispanics in California (51.1) and in Florida (49.2) 54, or in Mexicans in Mexico City (49.5), whose standardized average annual incidence rate in the 1-4 year age group was 77.7 51. It should be noted that standardization of leukemia rates and case capture methods vary by country, so comparisons may be problematic, but a general consensus is that ALL is a disease with increasing burden worldwide, particularly among Latino populations.

Our analysis has strengths and weaknesses. In the current analysis of ALL patterns by chronological time and age, we added a detailed evaluation of ALL rates by race/ethnicity using nationwide SEER data over a 16-year window. The data are population-based, drawn from high quality and mature cancer registries.
The acculturation (percent foreign-born) and SEP variables in our analysis are ecologic-level variables measured at the census tract level, and therefore more detailed individual-level data from epidemiologic studies are necessary to correctly interpret the factors driving these associations. In addition, we did not explore the mediation effects of SEP factors with the current dataset. It is worthwhile to conduct future analyses of these mediation effects to better understand the disparities in ALL incidence across ethnic groups.

In sum, our results point to a continued worrying increase in ALL rates among Latinos in all age groups. Of the multiple incidence rate calculations made in this report, no incidence statistic has decreased, indicating that efforts to understand and control ALL risks have not yielded any successful interventions to reduce the societal burden of disease. While treatments have improved, long-term morbidities and health care burdens in survivors continue to make prevention a priority 55-57. The data also point towards higher rates of ALL for non-Latinos within regions with lower SEP and in regions with a higher proportion of foreign-born individuals, whereas in Latinos the opposite appears to be the case. This suggests that ALL is not an inevitable stochastic event driven entirely by genetic susceptibility but rather that environmental exposures play a large role, and likely interact with genetic and non-genetic factors associated with race/ethnicity. The nature of such environmental factors should form a basis for further research efforts.

Figure 1-1. Age-adjusted incidence rates for acute lymphocytic leukemia by ethnicity, 2000-2016, United States.

Figure 1-1 legend: The age-adjusted incidence rates for acute lymphocytic leukemia were derived from the Surveillance, Epidemiology and End Results (SEER) registry version 18, years 2000 to 2016. Each line depicts a different race/ethnicity.
[Figure 1-1 plot. X-axis: Age Group, years (0-9 through ≥80). Y-axis: Age-Adjusted Incidence Rate. One line per race/ethnicity.]

Figure 1-2. Age-adjusted incidence rates for acute lymphocytic leukemia by race/ethnic group, 2000-2016, United States. A. Overall. B. Non-Latino White. C. Latinos. D. Non-Latino Blacks.

[Figure 1-2 plots. X-axis: Year of Diagnosis (2000-2016). Y-axis: Age-Adjusted Incidence Rate. One line per age group (0-14, 15-39, ≥40).]

Figure 1-2 legend: The age-adjusted incidence rates for acute lymphocytic leukemia derived from the Surveillance, Epidemiology and End Results (SEER) registry version 18, years 2000 to 2016, among (A) all races and ethnicities; (B) non-Latino Whites; (C) Latino all races; (D) non-Latino Blacks. Each line depicts a different age group. Statistically significant annual percent changes (APC) in AAIR were found in (A) all races and ethnicities age 0-14, 15-39 and 40+ (APC=0.65 [95%CI: 0.24, 1.05], APC=1.56 [95%CI: 1.03, 2.09], APC=1.21 [95%CI: 0.73, 1.69], respectively); (C) Latino all races age 0-14, 15-39 and 40+ (APC=0.64 [95%CI: 0.08, 1.20], APC=2.02 [95%CI: 1.17, 2.88], APC=1.28 [95%CI: 0.41, 2.17], respectively).

Figure 1-3. Multivariate Poisson regression model of age-adjusted incidence rates by race/ethnic group, 2000-2016, United States. A. Non-Latino Whites.
[Figure 1-3A forest plot. Columns: Variable, IRR (reference categories shown as 1.00 (Referent)), P for Trend.]

B. Latinos.

[Figure 1-3B forest plot, same layout.]

C. Non-Latino Blacks.

Figure 1-3 legend: N: Number of cases. IRR: Incidence rate ratio. Percent foreign born: percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data. Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%. Yost index: the race-specific Yost index by county in 2000 from the Census 2000 ACS data. NL White: lowest, less than 10,405; low, 10,406-11,045; medium, 11,046-11,405; high, 11,406-11,618; highest, more than 11,619. Latino: lowest, less than 10,780; low, 10,781-10,996; medium, 10,997-11,127; high, 11,128-11,537; highest, more than 11,538. Black: lowest, less than 9,747; low, 9,748-10,845; medium, 10,846-11,155; high, 11,156-11,503; highest, more than 11,504. The Poisson model using age-adjusted incidence rates (AAIR) as the dependent variable can be denoted as Count = Age + Gender + Year of diagnosis + Percent foreign born quintile + Yost index quintile + offset(ln(weighted population)), where weighted population = population in each category / (standard population in each category / total standard population). Population in each category was the population in each age-, gender-, year of diagnosis-, and percent foreign born-category. Standard population in each category and total standard population were derived from the Census 2000 ACS data.
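The offset term in the Figure 1-3 legend can be made concrete with a short worked sketch. It applies the stated definition, weighted population = population in each category / (standard population in each category / total standard population), and then takes the natural logarithm. The function names and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def weighted_population(population, std_population, total_std_population):
    """Population of one category divided by its share of the standard population."""
    std_weight = std_population / total_std_population
    return population / std_weight

def poisson_offset(population, std_population, total_std_population):
    """Natural log of the weighted population, used as the Poisson model offset."""
    return math.log(
        weighted_population(population, std_population, total_std_population)
    )

# Hypothetical category: 50,000 residents in a stratum that holds
# 20,000 of 100,000 people in the standard population (weight 0.2).
wp = weighted_population(50_000, 20_000, 100_000)
offset = poisson_offset(50_000, 20_000, 100_000)
print(wp, offset)
```

Dividing by the standard-population weight inflates the denominator of strata that carry little weight in the standard population, so that modeled counts per unit of weighted population behave like age-adjusted rates.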
Chapter 2: In utero immune development and susceptibility of childhood ALL

In this Chapter, we describe a case-control study that further explores the role of in utero immune status in the risk of childhood acute lymphoblastic leukemia, to better understand the mechanism of the disease.

Introduction

Acute lymphoblastic leukemia (ALL) is the most common type of cancer diagnosed among children under 15 years of age 1. The risk of childhood ALL varies across race/ethnic groups, with the highest incidence rate in the US among Latino children 58. Patterns of early-life infections 59, exposure to infectious agents via childhood contact (e.g., daycare attendance 60 or older siblings 61), and immunizations 62 may affect the risk of childhood ALL 63. Notably, exposure to extensive social contacts and microbial stimuli in early life is protective for childhood ALL, possibly by facilitating development of a well-modulated immune system that appropriately reacts to infections 61,64. A child whose immune system is naïve to antigenic stimuli may respond to common infections in an aberrantly strong manner, stimulating B-cell growth and increasing the risk of ALL 65,66. Incorporating observations on responses to infections and the history and timing of ALL mutations, Greaves proposed that ALL evolves in two discrete steps. First, a pre-leukemic clone is generated in utero by fusion gene formation or an aneuploidy event 67. Then, the pre-leukemic clone converts to leukemia due to secondary genetic changes after birth that are driven by vigorous immune activation via common infections 61,68,69. This hypothesis proposes a role for infection in early life but does not address the immune responder status at birth prior to the exposure to infectious agents.
More recently, studies examining neonatal cytokine levels suggest that immune status at birth plays a key role in the development of childhood ALL. Immunologically naïve newborns who had a deficit of IL-10 were at a higher risk for ALL later in childhood 70. IL-10 is a cytokine that limits the magnitude of immune response to pathogens, and during pregnancy helps ensure the coexistence of fetal tissues within the mother despite the expression of “non-self” antigens from the father 71. Two other studies confirmed and extended these observations to additional cytokines, suggesting that a cluster of cytokines is crucial 72,73. Understanding the origin of this cytokine profile associated with ALL risk, along with its role in establishing the “responder status” of the infant, would provide potential avenues for ALL prevention.

The presence of specific activating KIR genes in the child is associated with a decreased risk of ALL 74,75. KIRs expressed on natural killer (NK) cells help the host to identify normal cells by the HLA expressed on the surface of these cells. HLA expression is often lost during neoplastic transformation, and cells without class I HLA expression will be targeted for removal 76. HLA and KIR expression are also fundamental in the context of normal pregnancy. Since the fetus is a foreign tissue in the mother’s uterus, the interaction between HLA and KIR is crucial in modulating the maternal immune response to the developing fetus, which expresses specific HLA alleles that communicate through maternal KIR receptors to help modulate maternal NK cell activity 77. Furthermore, it is hypothesized that during uterine NK cell development, the maternal KIR interacts with her own HLA molecules, so that her NK cells are “licensed” not to target fetal cells with the same type of HLA molecules during placentation 78-81. Twelve KIR genes and 2 pseudogenes are classified into two haplotypes, A and B.
Haplotype A consists of five genes (KIR2DL1, 2DL3, 3DL1, 3DL2, 3DL3) that inhibit, one gene (KIR2DS4) that activates, and one gene (KIR2DL4) that may activate or inhibit NK cell activity. KIR haplotype B has variable gene content and includes one or more of the B-specific genes or alleles (KIR2DS1, 2DS2, 2DS3, 2DS5, 2DL2, 2DL5, 3DS1) 82. During pregnancy, specific maternal KIRs interact with specific offspring HLA-C, -B and -G to regulate immune responses; these interactions are classified as activating or inhibiting and form the basis for education of uterine NK cells 76,83. Our hypothesis is that, on balance, stronger interactions that activate NK cell activity will lead to a greater degree of immunosuppression 84 (higher IL-10 levels, for instance 70) and a relatively quiescent neonatal immune system reflected by a characteristic cytokine profile. This will result in lower leukemia risk, as the neonate will react more moderately upon exposure to infectious agents and antigens after birth 70,85.

In the current study, we leveraged unique resources to evaluate whether the interaction between specific maternal KIR and offspring HLA genotypes is associated with the risk of childhood ALL, and whether this association is mediated by cytokine levels at birth, in various racial/ethnic groups that experience varying risks of childhood leukemia. This study provides novel insights into the role of in utero immune development in the pathogenesis of childhood ALL.

Methods

Study population

Cases were children diagnosed with ALL at the age of 0-14 years and included in the California Childhood Leukemia Study (CCLS) or the California Childhood Cancer Records Linkage Project (CCRLP). The details of CCLS and CCRLP have been described elsewhere 59,86,87.
Briefly, both are case-control studies of childhood leukemia, where the cases were children aged 0-14 years at the time of leukemia diagnosis, ascertained from pediatric hospitals in California (CCLS, 1995-2015) or from a linkage between statewide birth records and cancer diagnosis information from the California Cancer Registry (CCRLP, 1988-2011), a population that does not overlap with CCLS. Control subjects for both studies were selected from California birth records, matched to cases on date of birth, sex, and self-reported race/ethnicity, and derived from the same study region. CCLS cases were actively recruited, while CCRLP cases and controls were registry-based only. Newborn blood samples for the offspring cases and controls in both studies were obtained from the California Biobank Program 88, with DNA isolated from newborn dried bloodspots (Qiagen). In the CCLS, DNA for mothers was derived from buccal cell swabs or saliva samples collected at the time of the in-home personal interview. In the CCRLP, we extracted DNA from maternal blood specimens collected at 15-20 weeks of gestation with the case or control child, which were archived in five Southern California counties (San Diego, Imperial, Orange, Riverside and San Bernardino) in the years 1999-2009. The current analysis was restricted to CCLS and CCRLP cases and controls who had maternal DNA available. Cases with Down syndrome were excluded due to the potential confounding effects of trisomy 21 on immune development. A total of 226 case child-mother pairs and 404 control child-mother pairs were included in this analysis.

The genetic ancestry of each subject was inferred by principal component (PC) analysis with subjects in the 1000 Genomes project 89. A total of 2504 subjects (661 African American, 347 Latino, 504 East Asian, 503 European, and 489 South Asian) from 1000 Genomes were included as the reference for the PC analysis 89.
We first used Plink 2.0 90 to merge the variants of our study participants with the variants of the subjects in the 1000 Genomes project and performed a PC analysis. We then used Plink 2.0 to estimate the genetic ancestry of our study participants by comparing their first 2 PCs to the first 2 PCs of the reference subjects in the 1000 Genomes project whose genetic ancestry is known 91. We also plotted the first 2 PCs of the study participants against those of the 1000 Genomes reference subjects with R 4.1.1 92 (Supplementary Figure 2-1).

Genotyping/imputing HLA alleles and KIR haplotypes

All cases, controls and mothers were genotyped with the Thermo Fisher LAT Axiom SNP array (CCRLP subjects, n=1060) or by a custom Fluidigm HLA-KIR sequencing platform 93,94 at Children’s Hospital Oakland Research Institute (CCLS subjects, n=200). HLA-A, -Bw4 and -C genotypes were imputed with HIBAG 95 from the LAT array data (HIBAG: CCRLP subjects, n=1060) or analyzed from direct sequencing (Fluidigm: CCLS subjects, n=200; Supplementary Figure 2-2). Ancestral group-specific models included in HIBAG were used for HLA imputation for predicted Latino, non-Latino White (NLW), non-Latino Black (NLB), and non-Latino Asian (NLA) subjects. The average prediction accuracy of HIBAG was 89% (European ancestry, 89%; African ancestry, 91%; Latino ancestry, 79%; other ancestry, 86%) as validated in a multiracial population 96. All mothers had the same predicted genetic ancestry as their offspring. KIR genotypes were imputed with KIR*Imp 97 from LAT array data (KIR*Imp: CCRLP subjects, n=1060) or obtained from direct sequencing (Fluidigm: CCLS subjects, n=200; Supplementary Figure 2-2). We reported the average predicted probability of each imputed HLA allele or KIR haplotype generated by HIBAG or KIR*Imp as an indicator of imputation quality (Supplementary Table 2-1).
Quantifying the offspring HLA and maternal KIR interaction

We first used the broad category of KIR A/B haplotype as an indicator for the inhibition/activation of immune responses. More copies of the KIR A haplotype indicate more overall inhibition 75. In addition, we calculated an activation score and an inhibition score for different combinations of specific offspring HLA and maternal KIR genes, referred to as the HLA-KIR interaction (Table 2-1), based on compiled information on specific allelic combinations derived from previous studies 98,99. We assigned an activation/inhibition score of ‘1’ to the combinations that were reported to have a ‘strong’ activation/inhibition, and an activation/inhibition score of ‘0.5’ to the combinations that were reported to have a ‘weak’ activation/inhibition in the reference studies. The activation score is the sum of all activating combinations, and the inhibition score is the sum of all inhibitory combinations within the mother/fetal dyad. Furthermore, to evaluate the HLA-KIR interaction and ALL risk considering NK licensing, we computed a ‘licensed’ activation score and a ‘licensed’ inhibition score based on the combination of maternal HLA, maternal KIR, and offspring HLA genes (Supplementary Table 2-2).

Cytokines

The neonatal levels of 10 cytokines (IL-1b, IL-2, IL-4, IL-6, IL-8, IL-10, IL-12p70, IFN-g, TNF-a, and VEGF) and the immune-modulating enzyme arginase-II (ARG-II) were assessed by Luminex technology on blood spot extracts or by ELISA, as previously described 73,100. The cytokines were normalized with a variance stabilizing normalization (VSN) method, normalizing on case/control status, birth year, protein, batch, and plate spot 101.
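The scoring rule for the HLA-KIR interaction described above (1 for a ‘strong’ combination, 0.5 for a ‘weak’ one, summed separately for activating and inhibitory combinations within a dyad) can be sketched as follows. This is a minimal illustration, not the study's code: the lookup table uses placeholder ligand and receptor names and does not reproduce Table 2-1, and the function name is hypothetical.

```python
# Score contributed by a combination, by reported strength
STRENGTH_SCORE = {"strong": 1.0, "weak": 0.5}

# Hypothetical lookup table: (offspring HLA ligand, maternal KIR gene)
# -> (type of interaction, reported strength). Placeholder names only.
INTERACTIONS = {
    ("ligand_1", "kir_act_1"): ("activation", "strong"),
    ("ligand_1", "kir_inh_1"): ("inhibition", "strong"),
    ("ligand_2", "kir_act_2"): ("activation", "weak"),
    ("ligand_2", "kir_inh_2"): ("inhibition", "weak"),
}

def dyad_scores(offspring_ligands, maternal_kirs):
    """Sum activation and inhibition scores over all combinations present in a dyad."""
    scores = {"activation": 0.0, "inhibition": 0.0}
    for (ligand, kir), (kind, strength) in INTERACTIONS.items():
        if ligand in offspring_ligands and kir in maternal_kirs:
            scores[kind] += STRENGTH_SCORE[strength]
    return scores

# Hypothetical dyad: the child carries both ligands, the mother three KIR genes
example = dyad_scores(
    {"ligand_1", "ligand_2"},
    {"kir_act_1", "kir_act_2", "kir_inh_2"},
)
print(example)  # {'activation': 1.5, 'inhibition': 0.5}
```

Keeping the two scores separate, rather than netting them, matches the way both enter the regression models as distinct independent variables.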
Statistical analysis

A potential association between maternal/fetal HLA-KIR interaction and the risk of childhood ALL was analyzed with logistic regression, without matching between cases and controls, and stratified by predicted genetic ancestry (Latino/NLW/NLB/NLA) to account for potential effect modification by ancestry among the four broad groups; this was the main analysis of interest. The dependent variable of the logistic regression model was ALL case/control status, and the independent variables were the inhibition and activation scores of the HLA-KIR interaction as described above. We also performed analyses adjusting for the top 5 PCs of predicted genetic ancestry to minimize residual confounding within each ancestral group.

We then used conditional logistic regression to perform a matched analysis as a sensitivity analysis to control for potential confounding by offspring sex and genetic ancestral components. Similar to the main analyses, we built models stratified by predicted genetic ancestry (Latino/NLW/NLB/NLA). For each model, the activation and inhibition scores were the main independent variables and predicted genetic ancestry (5 PCs) was a covariate for the matched analyses. Finally, for both the main and sensitivity analyses, we used a random-effects meta-analysis model, weighted by the number of subjects in each predicted genetic ancestry group, to obtain a weighted average odds ratio (Supplementary Figure 2-2).

In addition, we used an unmatched logistic regression model to evaluate the HLA-KIR interaction and ALL risk considering NK licensing. The dependent variable was ALL case/control status, and the independent variables were the ‘licensed’ activation score and the ‘licensed’ inhibition score.
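To illustrate how ancestry-specific odds ratios can be pooled under a random-effects model, the sketch below uses the DerSimonian-Laird estimator with inverse-variance weights. This is a swapped-in, commonly used approach chosen for illustration; the analysis described above weighted by the number of subjects in each group, and its exact estimator is not specified here. The function name and the input odds ratios are hypothetical.

```python
import math

def pool_random_effects(studies, z=1.96):
    """DerSimonian-Laird pooling of odds ratios.

    studies: list of (odds_ratio, ci_lower, ci_upper) tuples, one per stratum.
    Returns (pooled OR, CI lower, CI upper, tau^2).
    """
    # Work on the log scale; recover each standard error from the CI width
    y = [math.log(or_) for or_, _, _ in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * z) for _, lo, hi in studies]
    w = [1 / s ** 2 for s in se]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q and the between-stratum variance tau^2
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-stratum variance
    w_star = [1 / (s ** 2 + tau2) for s in se]
    y_re = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_re = math.sqrt(1 / sum(w_star))
    return (math.exp(y_re), math.exp(y_re - z * se_re),
            math.exp(y_re + z * se_re), tau2)

# Hypothetical stratum-specific odds ratios with 95% confidence intervals
strata = [(0.60, 0.45, 0.80), (0.95, 0.70, 1.29), (1.10, 0.60, 2.02)]
pooled_or, pooled_lo, pooled_hi, tau2 = pool_random_effects(strata)
print(f"pooled OR={pooled_or:.2f}; 95%CI: {pooled_lo:.2f}, {pooled_hi:.2f}")
```

When the strata disagree, tau^2 is positive and the pooled interval widens, which is one way a clear association in a single ancestry group can coexist with a pooled estimate that does not differ from the null.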
We performed an unmatched analysis to assess the relation of normalized neonatal levels of the 10 cytokines and ARG-II (dependent variable) to HLA-KIR interaction (independent variable) with a linear regression model adjusting for age at bloodspot collection, birth weight, and gestation week. The covariates were selected from a set of candidate factors (offspring sex, age at bloodspot collection, birth weight, gestation week, and type of health insurance) with a stepwise method. Variables with a p-value less than 0.05 were retained in the model. We built models including the activation and inhibition scores with and without adjustment for covariates. In the matched analyses, we used a linear mixed model adjusting for the same set of covariates as the linear regression model, and clustering by matched pairs, to analyze the association between the HLA-KIR scores and normalized cytokine or ARG-II levels (Supplementary Figure 2-2).

To evaluate the potential mediation effect of cytokines and ARG-II in the association between HLA-KIR interaction and childhood ALL risk, a causal mediation analysis was performed when 1) the association between HLA-KIR interaction and childhood ALL risk was statistically significant; and 2) the association between the potential mediator (cytokine/ARG-II) and HLA-KIR interaction was statistically significant. We first evaluated the effect of the potential mediator on childhood ALL risk with a logistic regression model where the case/control status of childhood ALL was the dependent variable, and the potential mediator and HLA-KIR activation or inhibition scores were the independent variables. We then performed the causal mediation analysis 102. Briefly, the unstandardized indirect effects for each of 1000 bootstrapped samples were computed, and the 95% confidence intervals (CI) were computed by determining the indirect effects at the 2.5th and 97.5th percentiles.
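The percentile bootstrap step described above can be sketched in a few lines. The example is a simplification under stated assumptions: it uses simulated data, a continuous outcome, and a product-of-coefficients indirect effect from two linear regressions, whereas the study's outcome was binary and its mediation analysis followed reference 102. All names and values are hypothetical.

```python
import random
import statistics

def indirect_effect(rows):
    """Product-of-coefficients indirect effect a*b from (x, m, y) rows.

    a: slope of mediator m on exposure x.
    b: slope of outcome y on m, adjusted for x (two-predictor least squares).
    """
    n = len(rows)
    mean_x = sum(x for x, _, _ in rows) / n
    mean_m = sum(m for _, m, _ in rows) / n
    mean_y = sum(y for _, _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _, _ in rows)
    smm = sum((m - mean_m) ** 2 for _, m, _ in rows)
    sxm = sum((x - mean_x) * (m - mean_m) for x, m, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, _, y in rows)
    smy = sum((m - mean_m) * (y - mean_y) for _, m, y in rows)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(rows, n_boot=1000, seed=0):
    """95% percentile interval of the indirect effect over bootstrap resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(rows, k=len(rows))  # sample rows with replacement
        estimates.append(indirect_effect(resample))
    # n=40 yields cut points every 2.5%; the first and last are 2.5% and 97.5%
    cuts = statistics.quantiles(estimates, n=40)
    return cuts[0], cuts[-1], estimates

# Simulated (hypothetical) data: exposure x, mediator m, outcome y
sim = random.Random(1)
data = []
for _ in range(200):
    x = sim.gauss(0, 1)
    m = 0.5 * x + sim.gauss(0, 1)
    y = 0.4 * m + 0.2 * x + sim.gauss(0, 1)
    data.append((x, m, y))

lower, upper, boot = percentile_bootstrap_ci(data)
print(f"indirect effect 95% CI: {lower:.3f}, {upper:.3f}")
```

An interval that contains zero is read as no evidence of mediation through the candidate mediator.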
Our activation score was also tested for association with recurrent fetal loss, another outcome previously associated with maternal KIR-child HLA interactions 103. We compared activation and inhibition scores between mothers who experienced prior fetal loss (“ever”) and those who did not (“never”). Distribution by case/control status was analyzed with a Chi-squared test for categorical variables and a two-sample t-test for continuous variables. All statistical tests were two-sided, and p-values less than 0.05 were considered statistically significant.

Results

Offspring HLA-maternal KIR interaction and ALL risk

For the primary (unmatched) analysis, 390 (61.9%) Latino, 160 (25.4%) NLW, 64 (10.2%) NLA and 16 (2.5%) NLB child-mother pairs were included in each model (Table 2-2). For the sensitivity (matched) analysis, 139 (62.6%) Latino, 56 (25.2%) NLW, and 27 (12.2%) NLA one-to-one matched mother-child case-control pairs from CCLS and CCRLP were included in each model (Supplementary Table 2-3). The distributions of maternal KIR genes and offspring HLA-C alleles are reported in Supplementary Tables 2-4 & 2-5.

In our primary analysis, the activating HLA-KIR interaction was inversely associated with ALL risk in Latino subjects, but not in other ancestry groups. Genetic ancestry was a significant effect modifier between ALL case/control status and the activating HLA-KIR interaction (log-likelihood p-value=0.003). The weighted averages of unadjusted ORs among Latino, NLW, and NLA subjects were not different from the null for both the activation (odds ratio (OR)=0.69; 95%CI: 0.47, 1.03) and inhibition scores (OR=0.87; 95%CI: 0.73, 1.02). In the multivariable model for Latinos, the odds of childhood ALL decreased by 41% with each one-unit increase in activation score (OR=0.59; 95%CI: 0.49, 0.71; p<0.001) when adjusting for inhibition score and the top 5 PCs for predicted genetic ancestry.
The weighted averages of adjusted ORs were similar to those of the unadjusted ORs (OR for activation score=0.69; 95%CI: 0.45, 1.04; OR for inhibition score=0.85; 95%CI: 0.71, 1.01) (Table 2-3). Similar effect estimates in Latino subjects were observed in the univariate model (OR=0.60; 95%CI: 0.49, 0.72; p<0.001), as well as with the logistic regression models where cases and controls were matched on sex (unadjusted OR=0.68; 95%CI: 0.55, 0.85; p<0.001; adjusted OR=0.66; 95%CI: 0.53, 0.83; p<0.001). Similar weighted average ORs and indicators of heterogeneity were observed in the matched models compared to the unmatched models (Supplementary Table 2-6). In the analysis where maternal NK licensing with her own MHC class I-specific alleles was considered (which would happen prior to and during pregnancy), activation and inhibition scores were attenuated, and no significant association was found between ALL risk and the ‘licensed’ activation or ‘licensed’ inhibition scores (Supplementary Table 2-7).

Recurrent miscarriage was previously linked to weak HLA-KIR interactions (reviewed in 103) and was evaluated here to help validate our data and activation index instrument. In CCLS and CCRLP, the status of previous fetal loss was known for 583 mothers (92.5%). Among them, 122 mothers (20.9%) had a fetal loss before giving birth to the index child. The activation score of HLA-KIR interaction was lower among mothers who had a fetal loss (mean=2.08; standard deviation, SD=1.28) compared to those who had never had a fetal loss at the time of giving birth to the child (mean=2.25; SD=1.32). This difference was not statistically significant (p=0.22); however, the direction of association was consistent overall and among the three predicted genetic ancestry groups studied (Supplementary Table 2-8).

Cytokines and offspring HLA-maternal KIR interaction

Information on cytokines and ARG-II was only available for CCRLP subjects.
In total, 323 (63.3%) Latino, 139 (27.3%) NLW, and 48 (9.4%) NLA mother-child pairs were included in this analysis. NLB subjects were not included because none of the cases were NLB (Supplementary Table 2-9). With a multivariable linear regression model adjusting for age at bloodspot collection, birth weight, and gestation week, a one-unit increase in the activation score was associated with a 0.13-unit (95%CI: 0.04, 0.22) increase in normalized ARG-II level for Latinos. For NLAs, a one-unit increase in activation score was associated with a 0.29-unit (95%CI: -0.56, -0.02) decrease in normalized IL-10 level and a 0.27-unit (95%CI: -0.49, -0.06) decrease in normalized TNF-a level; a one-unit increase in inhibition score was associated with a 0.35-unit (95%CI: -0.65, -0.05) decrease in normalized ARG-II level (Figure 2-1 & Supplementary Table 2-10). When matched on offspring sex and predicted genetic ancestry, 76 (60.8%) Latino, 38 (30.4%) NLW, and 11 (8.8%) NLA mother-child case-control pairs were included in the analysis (Supplementary Table 2-11). With a multivariable linear mixed model adjusting for age at cytokine collection, birth weight, and gestation week, we found that normalized TNF-a was the only cytokine with a statistically significant association with activation score among Latino subjects. The level of normalized TNF-a decreased by 0.15 units (95%CI: -0.31, -0.00) with a one-unit increase in activation score (Supplementary Table 2-12). We observed no associations between other cytokines or ARG-II and the HLA-KIR activation score in Latino subjects, or between any cytokine/ARG-II and the activation score in any other racial/ethnic group. To test whether cytokines were mediating the impact of HLA-KIR interactions, a mediation analysis was performed to assess the potential mediation effect of neonatal ARG-II in the association between activating HLA-KIR interaction and childhood ALL risk for Latino subjects.
However, the average bootstrapped unstandardized indirect effect was 0.00214 (-0.00404, 0.01), indicating no causal mediation (Supplementary Figure 2-3 & Supplementary Table 2-13).
Discussion
Consistent with our original hypothesis, activating HLA-KIR interactions contributed to a lower risk of childhood ALL in Latinos and potentially in subjects of other genetic ancestry groups, although the latter results did not reach statistical significance, possibly due to a lack of statistical power. A weighted average meta-analysis among predicted genetic ancestry groups implied that true heterogeneity exists, which mirrors the heterogeneous findings among predicted genetic ancestry groups previously observed in California regarding immunologic risk factors such as childhood contacts exposure (daycare) 104 and early child infectious disease history 59. Our finding is also compatible with a previous report that the inheritance of a higher number of activating KIR genes reduced the risk of childhood ALL in Latinos only (and not in non-Latino whites) 75. Activating HLA-child/KIR-mother interactions were also protective against other conditions that can result from maladaptive maternal-fetal interactions, including recurrent spontaneous abortion 105, recurrent miscarriage 106, and reproductive failure 80. In our study, activating offspring HLA-maternal KIR combinations that were associated with lower risk of childhood ALL in Latino subjects were also associated with a higher level of ARG-II, an immunosuppressive enzyme previously linked to ALL risk 100. Additionally, in NLA subjects the activating HLA-KIR combinations were associated with lower levels of IL-10 and TNF-a, and inhibiting HLA-KIR combinations were associated with lower levels of ARG-II. These findings suggest that the role of early life immune function or conditioning in the etiology of childhood ALL may vary by predicted genetic ancestry group with respect to the mediation effects of certain cytokines.
There were no significant associations observed among the NLW subjects, who constituted a substantial proportion of our study population, suggesting sufficient power to detect an association should one exist. In our previous work, we identified stronger associations with postnatal immune stimulatory exposures, such as childhood contacts in daycare and birth order (having older siblings), among NLW compared to Latinos 59,104, and it is possible that such postnatal exposures have a stronger impact in NLW than in Latinos, whereas prenatal immune development shaped by HLA-KIR interactions might play a larger role in Latinos. This is also compatible with our prior result on KIR genes only, where significant results were only apparent for Latinos in an ALL case-control study (not including mothers) 75. Specifically, we reported that the frequency of the KIR A/A haplotype was significantly higher in Latino cases compared to controls, but not in NLW subjects. In the current study, which includes a non-overlapping sample set, we made a consistent observation among genotyped mothers: the frequency of maternal KIR A/A was higher in Latino cases compared to controls (p=0.002), but not in NLW (p=0.939), NLA (p=0.139), or NLB subjects (p=0.120). In our previous study, we stated that external environmental factors that interact with KIR, such as patterns of infection 107,108, may help account for the variation in childhood ALL risk by ethnicity. This is a reasonable hypothesis based on the role of KIR in a nascent leukemia clone in a child after birth, where interactions with maternal alleles will not play a role. In the current study, we examined a different hypothesis affecting only the fetal period by evaluating neonatal cytokines as potential mediators of the association between child HLA/maternal KIR interaction and risk of childhood ALL.
Based on the current findings, we postulate that the maternal-fetal genetic interaction between HLA (fetus) and KIR (mother) could be the root cause rather than any external environmental influence. We found that HLA-KIR interactions are associated with ALL risk in some groups (Latinos and NLA) and that ARG-II is associated with the HLA-KIR interaction in Latino subjects; ARG-II, IL-10, and TNF-a are associated with the interaction in NLA subjects. ARG-II is an enzyme that suppresses T cell proliferation through an anti-inflammatory cascade resulting in arginine depletion 109. An increased neonatal level of ARG-II was associated with a higher risk of childhood ALL 100. IL-10, as explained above, is a major immunosuppressive cytokine during pregnancy associated with fetal tolerance in the mother. TNF-a is an inflammatory cytokine, a component of the “cytokine storm,” and its reduced level with strong HLA-KIR interactions may reflect a calmed immune phenotype at birth, which may affect the way an infant responds to external antigenic stimuli and infectious agents after birth. The maternal-fetal health literature has shown that repeated miscarriages can result from weak HLA-KIR interactions 103. Our data were consistent with these prior results: although the difference did not reach statistical significance (possibly due to limited sample size and power), we found weaker HLA-KIR interaction for mothers with a history of fetal loss compared to those without such a history among all three predicted genetic ancestry groups tested. Such interactions may lead to lower immunosuppression via certain cytokine profiles. Recurrent miscarriage is more frequent in mothers of children with ALL compared to controls 110, and while this was previously speculated to be a product of a higher frequency of fetal chromosome abnormalities 111,112, our results suggest that immune development may play a role. There are several strengths and limitations of the current study.
One strength is the diverse population in California, especially the large percentage of Latinos, who harbor the highest risk of ALL 58. We were uniquely positioned to evaluate the association between HLA-KIR interaction and childhood ALL risk in multiple ancestral groups and did observe differences by predicted genetic ancestry. In addition, we were able to measure the level of cytokines at birth, before the subjects developed ALL, by using archived newborn blood spots from the California Biobank. Furthermore, we obtained DNA from mothers, which enabled us to impute KIR genes for mothers. A limitation of the current study is that, although we imputed HLA genotypes separately for each ethnic group, the KIR genes were imputed using a European ancestry population as the reference. This may yield misclassification error in the KIR genes for subjects in minority ethnic groups. Another limitation is that the sample size of non-Latino minority Americans is small, particularly for non-Latino Black subjects, because of the composition of the California population and a lower risk of ALL in Black children. This compromises the generalizability of the results to NLB subjects. An additional limitation is that we could not adequately evaluate how NK education may have influenced the functionality of NK cells in stimulating cells, given the extensive hypothetical assumptions of our statistical modeling leading to an algebraic treatment of the data. According to the “NK education” hypothesis, NK cells are fully responsive to target cells only when the ligands for inhibitory MHC class I–specific receptors (e.g., KIR2DL1/2, KIR3DL1) are expressed in the host 113,114,115. NK cells lacking MHC class I–specific receptors are less responsive to target cells. On the other hand, the education of NK cells via activating KIRs is reported to induce tolerance, which complements the education via inhibitory KIRs 116.
However, with the current data, we could not determine the extent to which NK cells were licensed for activation or inhibition. In summary, the current study supports the role of HLA-KIR interaction in the development of childhood ALL, with variation in effect by genetic ancestry. Further research is needed to confirm and extend these findings and to assess the impact of postnatal infections, which was not evaluated in the current study population.

Table 2-1. Activation/inhibition effect for different combinations of maternal KIR and offspring HLA genes.
Maternal KIR gene 1 | Offspring HLA allele 2 | Inhibition score 3 | Activation score 4
KIR2DL1 | C2 | 1 |
KIR2DL2 | C1 | 1 |
KIR2DL3 | C1 | 1 |
KIR2DL2 | C2 | 0.5 |
KIR2DL3 | C2 | 0.5 |
KIR3DL1 | BW4*80I | 1 |
KIR3DL1 | BW4*80T | 0.5 |
KIR3DL1 | A*23:01 | 0.5 |
KIR3DL1 | A*24:02 | 0.5 |
KIR3DL1 | A*24:03 | 0.5 |
KIR3DL1 | A*25:01 | 0.5 |
KIR3DL1 | A*32:01 | 1 |
KIR2DS1 | C2 | | 1
KIR2DS2 | C1 | | 1
KIR2DS4 | C*02:02 | | 1
KIR2DS4 | C*04:01 | | 1
KIR2DS4 | C*05:01 | | 1
KIR2DS4 | C*01:02 | | 1
KIR2DS4 | C*14:02 | | 1
KIR2DS4 | C*16:01 | | 1
KIR2DS5 | C2 | | 1
KIR3DS1 | BW4*80I | | 1
1 KIR imputed with KIR*Imp or directly genotyped (CCLS samples only).
2 HLA imputed with HIBAG by predicted genetic ancestry group or directly genotyped (CCLS samples only).
3, 4 Inhibition/activation score = 1 if indicated as “strong inhibition/activation” in previous study; score = 0.5 if indicated as “weak inhibition/activation” in previous study.

Table 2-2. Logistic regression models 1 assessing the association between childhood ALL case/control status and HLA-KIR interaction.
Predicted genetic ancestry | Unadjusted model: odds ratio (95% CI) | p-value 2 | Model adjusting for top 5 PCs: odds ratio (95% CI) | p-value 2
Latino all races
Activation score | 0.60* (0.49, 0.72) | <0.001 | 0.59* (0.49, 0.71) | <0.001
Inhibition score | 0.82 (0.66, 1.01) | 0.066 | 0.81 (0.64, 1.01) | 0.057
Non-Latino White
Activation score | 1.02 (0.78, 1.32) | 0.907 | 1.03 (0.78, 1.35) | 0.842
Inhibition score | 0.94 (0.69, 1.29) | 0.705 | 0.91 (0.66, 1.27) | 0.592
Non-Latino Asian
Activation score | 0.65 (0.42, 1.00) | 0.051 | 0.63 (0.38, 1.01) | 0.052
Inhibition score | 0.98 (0.59, 1.62) | 0.945 | 0.91 (0.51, 1.59) | 0.729
p-value (LR test)
Activation score 3 | | 0.003 | | 0.003
Inhibition score 4 | | 0.335 | | 0.380
Weighted average 5
Activation score | 0.69 (0.47, 1.03) | 0.068 | 0.69 (0.45, 1.04) | 0.076
Inhibition score | 0.87 (0.73, 1.02) | 0.089 | 0.85 (0.71, 1.01) | 0.064
1 390 Latino, 160 non-Latino White, and 64 non-Latino Asian mother-child pairs were included.
2 p-value based on a Wald test.
3 LR test comparing the null model (ALL case/control status ~ activation score + inhibition score + genetic ancestry) to the model with an interaction term between ancestry and activation score (ALL case/control status ~ activation score + inhibition score + genetic ancestry + activation score x genetic ancestry).
4 LR test comparing the null model (ALL case/control status ~ activation score + inhibition score + genetic ancestry) to the model with an interaction term between ancestry and inhibition score (ALL case/control status ~ activation score + inhibition score + genetic ancestry + inhibition score x genetic ancestry).
5 The weighted average of odds ratios among Latino all races, non-Latino White, and non-Latino Asian subjects. Calculated with a random-effects meta-analysis model where the weights are the number of subjects in each predicted genetic ancestry group.
ALL, acute lymphoblastic leukemia; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; PC, principal component; CI, confidence interval; LR test, log-likelihood ratio test.
*Odds ratio statistically significantly different from the null with p-value<0.05.

Figure 2-1. Linear regression model assessing the association between VSN normalized cytokines and activation/inhibition scores.
[Figure 2-1, panels A, B, and C]
Abbreviations: VSN, variance stabilizing normalization; IL, interleukin; INF, interferon; TNF, tumor necrosis factor; VEGF, vascular endothelial growth factor; ARG, arginase; SE, standard error; CI, confidence interval.
Unadjusted model: model with cytokines as the outcome and activation and inhibition scores as predictors.
Full model: model with cytokines as the outcome and activation and inhibition scores as predictors, adjusting for age at cytokine collection, birth weight, and gestation week.
Cytokines were normalized on case/control status, birth year, protein, batch, and plate spot.
*Coefficient statistically significantly different from the null with p-value<0.05.
A. The association between VSN normalized cytokines and activation/inhibition scores among Latino subjects.
B. The association between VSN normalized cytokines and activation/inhibition scores among non-Latino White subjects.
C. The association between VSN normalized cytokines and activation/inhibition scores among non-Latino Asian subjects.

Chapter 3: Burden of familial-associated early-onset cancer risk
In this chapter, we focus on families with early-onset cancers (mainly hematologic cancers) in California to compare the burden of familial-associated early-onset cancer risk in Latinos vs. non-Latino Whites.
Introduction
Both genetic and environmental factors play a role in the causes of early-onset cancer.
Several well-defined genetic syndromes contribute to early-onset cancer risk, along with a wider array of common alleles that influence risk marginally and are detected at the population level. As an example of the former, Li-Fraumeni syndrome, caused by mutations in the tumor suppressor gene TP53, is associated with an increased risk of a spectrum of cancers diagnosed at early ages 117. Examples of low-penetrance common genetic variation associated with cancer risk include variants in the IKZF1 and ARID5B genes in pediatric acute lymphoblastic leukemia (ALL) 118. Both of these classes of variants may vary in frequency by race/ethnic group and cluster in families 39,119,120; for example, a familial germline deletion of ETV6 has been reported for ALL 121. Examination of cancer predisposition requires investigation within ethnic strata, particularly where cancer incidence rates are known to differ, as they do for many pediatric cancer types such as leukemia 122 and brain cancer 123. In addition to primary cancers, incidence patterns of second independent malignancies may also provide a perspective on underlying genetic predisposition. Among childhood cancer survivors, more second primary malignancy cases are observed among non-Latino Whites (NLW) than Latino subjects 124. This is also reflected in adult cancers, where Latino breast cancer survivors had a lower risk of second cancers than NLW and NL Black women 125. Germline pathogenic/likely pathogenic variants in cancer predisposition genes are found in approximately 10% of pediatric cancer patients 117,126-128, and may be inherited or arise de novo. Highly penetrant inherited variants will contribute to clustering of cancer cases within the family. Shared environments within the family unit may also be considered alongside genetic risk as potential causes of family-based cancer concordance.
Familial concordance of a wide variety of cancers has been assessed using the Swedish Family-Cancer Database, leading to a deep understanding of familial relative risks 129-131. The Victorian Paediatric Cancer Family Study in Australia also explored the cancer risks for relatives of children with cancer in a small population 132. In the US, the Utah Population Database may be the best-known resource for studying familial risk 133-137. Importantly, these studies largely comprise families of European ancestry, and therefore have not examined potential ethnic-specific familial risks. Here, we utilized linked population registries with over 64,000 individuals to quantify the familial risks (siblings and mothers) and the risks of early-onset second primary malignancies in the highly diverse and large population of California.
Methods
Source of Data
We used linked population-based registries in California to evaluate the relative risks of early-onset cancers (0-26 years age of onset) for siblings and mothers of children, adolescents, and young adults (AYA) aged 0-26 diagnosed with cancer, as well as the relative risks of early-onset second primary malignancies (SPMs) among the proband patients. The dataset was created by linking information from the California Cancer Registry (CCR) and the California Birth Statistical Master File, allowing the capture of siblings and parents of cancer probands, along with their cancer incidence. The linked dataset comprehensively encompassed all cancer cases 0-26 years old, as well as their sibling and mother cancers, diagnosed from 1989 to 2015 in California. Our upper age limit of 26 was set based on the available age range covered by this relatively young cohort. Overall, the dataset included a total of 121,571 individuals.
The information on healthy siblings and mothers was available during the whole study period, whereas the information on fathers was not available until 2004 in the birth files and therefore is not included in the current analysis. For the analysis of cancer familial risks, we included all primary incident cancer cases diagnosed from 1989 to 2015 among patients aged 0-26 years, with patient age-at-diagnosis limited by the study time period for which California maintained a statewide SEER gold-standard cancer registry. In addition, we performed a subgroup analysis on pediatric cancers including patients aged 0-15 years diagnosed within the same study period. For the analysis of secondary cancer risks, we included all SPMs diagnosed over the same years and patient age ranges. Although the CCR only records primary malignancies, some misclassification of relapsed or recurrent disease is possible. A physician reviewed diagnosis codes of all the cases diagnosed after the first primary case to prevent the misclassification of relapsed first primary malignancies (FPMs) as SPMs. For both analyses, we classified the cancers into twelve broad groups and subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). The abbreviations for cancer types are included in Table 3-1. We also grouped the cancers into hematologic or solid categories in the analyses. Hematologic cancers were defined as leukemias and lymphomas. Solid cancers were defined as CNS tumors, neuroblastomas, retinoblastomas, renal tumors, hepatic tumors, bone tumors, sarcomas, germ cell tumors, epithelial neoplasms, and other and unspecified malignant neoplasms.
Statistical Analysis
We quantified the relative risks for siblings and mothers, and the relative risks of SPMs, by calculating the standardized incidence ratios (SIRs) of a given cancer or of SPMs among the healthy siblings, and the SIRs of a given cancer among healthy mothers of probands, using a previously published method 129. We defined a proband as a pediatric or AYA patient with a given cancer. Only one child/AYA in each family can be a proband, so in families with two or more cases, the proband is defined as the patient with the earliest date of diagnosis. Given a proband with cancer, we calculated the SIRs for a sibling or a mother in the same family for all types of cancers. Separately, we calculated the SIRs for a sibling for the same type of cancer as the proband. We also stratified the analyses by self-identified race/ethnicity of the mother in each family. The SIRs in siblings, mothers or of SPMs can be denoted as:
\[ \mathrm{SIR} = \frac{O}{E} = \frac{\sum_{i=1}^{N} \sum_{j=1}^{n_i} D_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{n_i} \sum_{k=1}^{K_{max}} \lambda_k t_{ijk}} \]
where N is the number of families, n_i is the number of non-proband individuals of interest (siblings/SPMs/mothers) in family i, and K_max is the total number of age intervals. The data for each individual include a disease indicator (D_ij) and the number of years “at risk” during the k-th age interval (t_ijk). A given individual is defined to be at risk beginning at their age when the proband in their family is diagnosed and ending either when they become affected themselves or when they are censored due to the end of study follow-up. For siblings and mothers, age was stratified into seven groups as 0, 0-4, 5-9, 10-14, 15-19, 20-24 and 25-29 years. For the calculation of SIRs within a given race group, λ_k is the race-, sex- and age-specific incidence rate of a given cancer. We compared the SIRs across race/ethnic groups with approximate Chi-squared tests.
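The SIR defined above is simply the observed count of affected relatives divided by the count expected from reference incidence rates and accumulated person-years. The following Python sketch illustrates that bookkeeping under stated assumptions; it is not the computational code referred to in the supplement. The `Relative` class, the `sir` function, the age-interval labels, and all rates and person-years in the usage example are hypothetical, and the rates are deliberately inflated for readability.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Relative:
    """One non-proband individual (sibling or mother) in a family."""
    affected: bool                   # disease indicator D_ij
    person_years: Dict[str, float]   # years at risk t_ijk, keyed by age interval


def sir(families: List[List[Relative]], rates: Dict[str, float]) -> Tuple[int, float, float]:
    """Return (observed, expected, SIR) summed over all families and relatives."""
    observed = 0
    expected = 0.0
    for family in families:
        for relative in family:
            observed += int(relative.affected)
            for interval, years in relative.person_years.items():
                expected += rates[interval] * years  # lambda_k * t_ijk
    return observed, expected, observed / expected


if __name__ == "__main__":
    # Hypothetical reference rates per person-year (inflated for illustration).
    rates = {"0-4": 0.01, "5-9": 0.02}
    families = [
        [
            Relative(True, {"0-4": 3.0, "5-9": 2.0}),
            Relative(False, {"5-9": 5.0}),
        ],
        [Relative(False, {"0-4": 5.0, "5-9": 5.0})],
    ]
    print(sir(families, rates))
```

In a real analysis the rate lookup would also be keyed by sex and race/ethnicity, as described for λ_k above; the single-key dictionary here is a simplification.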
The approximate chi-square method compares the probability of occurrence of events in one group to that in another, based on a binomial distribution. This comparison is not related to the 95% confidence intervals for the SIRs. We assumed that all events occurred at the midpoint of each calendar year. We also stratified the analysis by 5-year age groups. The 95% confidence intervals (CIs) were calculated assuming a Poisson distribution for categories with fewer than 100 observed cases. For categories with more than 100 observed cases, we adopted the method of Breslow and Day (1987) to calculate the 95% CIs, as suggested by the Washington State Department of Health 138,139. Statistical analyses were performed using R software (v 3.6.0). Any two-sided p-value less than 0.05 was considered statistically significant. A supplement is included with this manuscript with more information on the statistical tests and computational code used.
Results
Demographics of the study population
From 1989 to 2015, we identified a total of 29 249 pediatric and AYA patients with a primary malignancy, comprising 29 072 probands, 112 affected siblings (from 110 families) and 65 affected mothers. All siblings were diagnosed after the proband’s diagnosis as defined, and 56 (86.15%) of the 65 mothers were diagnosed after the proband’s diagnosis. We also identified 387 SPMs among all pediatric and AYA probands (Table 3-2).
Familial relative risks of early-onset cancers
Overall, we found a 3.32-fold (95%CI:2.85-3.85) increased relative risk of any cancer among siblings and mothers who have a proband with cancer in the same family. Briefly, we found a 2.97-fold (95%CI:2.30-3.78) increased relative risk of any cancer given a proband with hematologic cancers and a 4.54-fold (95%CI:3.82-5.35) increased relative risk of any cancer given a proband with solid cancers.
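For readers who want to see how a Poisson-based interval of the kind reported with each SIR can be obtained, the sketch below computes an exact 95% interval for an observed count by bisection and scales it by the expected count. This is one standard construction and is an assumption here: the text states only that a Poisson distribution was assumed for categories with fewer than 100 observed cases, and the Breslow and Day approximation used for larger counts is not shown. All function names and the counts in the usage example are hypothetical.

```python
import math
from typing import Tuple


def poisson_cdf(k: int, mu: float) -> float:
    """Return P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i    # P(X = i) computed from P(X = i - 1)
        total += term
    return total


def bisect_root(func, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Find a root of func on [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(observed: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided interval for the Poisson mean given an observed count."""
    ceiling = observed + 10.0 * math.sqrt(observed + 1.0) + 10.0
    if observed == 0:
        lower = 0.0
    else:
        # Lower limit: the mean at which P(X >= observed) equals alpha / 2.
        lower = bisect_root(
            lambda mu: 1.0 - poisson_cdf(observed - 1, mu) - alpha / 2.0,
            0.0, float(observed))
    # Upper limit: the mean at which P(X <= observed) equals alpha / 2.
    upper = bisect_root(
        lambda mu: poisson_cdf(observed, mu) - alpha / 2.0,
        float(observed), ceiling)
    return lower, upper


def sir_with_ci(observed: int, expected: float) -> Tuple[float, float, float]:
    """Return (SIR, lower limit, upper limit) by scaling the count interval."""
    lower, upper = exact_poisson_ci(observed)
    return observed / expected, lower / expected, upper / expected


if __name__ == "__main__":
    # Hypothetical counts: 10 observed cases against 4.0 expected.
    print(sir_with_ci(10, 4.0))
```

Because the expected count is treated as fixed, the interval for the SIR is just the interval for the observed count divided by the expected count.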
When stratified by cancer type, higher relative risks among siblings and mothers were observed given probands with leukemias, lymphomas, CNS tumors, retinoblastomas, renal tumors, sarcomas, GCTs, epithelial neoplasms, and other unspecified neoplasms (Figure 3-1A). For the relative risk of specific cancer types, we found a 2.68-fold (95%CI:1.68-4.06) increased risk of hematologic cancers among siblings and mothers of a proband with hematologic cancer, and a 6.78-fold (95%CI:5.58-8.16) increased relative risk of solid cancers among siblings and mothers of a proband with solid cancer (Supplementary Table 3-1). Furthermore, leukemias, lymphomas, CNS tumors, retinoblastoma, sarcomas, GCT, and epithelial neoplasms exhibited statistically significantly increased relative risk for the same type of cancer as the proband (Figure 3-2 & Supplementary Table 3-2). When stratified by more finely defined cancer subtypes, increased relative risks of any cancer for siblings and mothers were observed given a proband with lymphoid leukemia, acute myeloid leukemia, Hodgkin lymphomas, non-Hodgkin lymphomas, astrocytomas, intracranial and intraspinal embryonal tumors, certain gliomas, certain specified intracranial and intraspinal neoplasms, nephroblastoma and other nonepithelial renal tumors, rhabdomyosarcomas, fibrosarcomas, peripheral nerve sheath tumors, and other fibrous neoplasms, certain specified soft tissue sarcomas, malignant gonadal germ cell tumors, and certain unspecified carcinomas (Supplementary Table 3-3). When stratified by race/ethnicity, the relative risk of any cancer for siblings and mothers given a proband with solid cancer was significantly higher among Latino subjects than non-Latino White (NLW) subjects [Latino: SIR=4.98;95%CI:3.82-6.39; NLW: SIR=3.02;95%CI:2.12-4.16; P=0.019] (Figure 3-1B).
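The P-values attached to the Latino versus NLW comparisons come from the approximate chi-square test described in the Statistical Analysis section. One common binomial formulation of such a test is sketched below in Python, as an assumption about the general approach rather than a reproduction of the study's code: conditional on the total number of observed cases in the two groups, the count in the first group is treated as binomial with success probability equal to that group's share of the expected cases. The function name and the counts in the usage example are hypothetical, because the observed and expected counts behind the reported SIRs are not given in this section.

```python
import math
from typing import Tuple


def compare_sirs(obs_a: int, exp_a: float, obs_b: int, exp_b: float) -> Tuple[float, float]:
    """Approximate 1-df chi-square test that two groups share the same SIR.

    Under the null hypothesis, obs_a given the total obs_a + obs_b is
    binomial with probability exp_a / (exp_a + exp_b).
    """
    total = obs_a + obs_b
    prob = exp_a / (exp_a + exp_b)
    mean = total * prob
    variance = total * prob * (1.0 - prob)
    chi_square = (obs_a - mean) ** 2 / variance
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


if __name__ == "__main__":
    # Hypothetical counts: group A has 60 observed vs 12 expected (SIR 5.0),
    # group B has 36 observed vs 12 expected (SIR 3.0).
    print(compare_sirs(60, 12.0, 36, 12.0))
```

As noted in the Statistical Analysis section, a test of this kind is separate from the confidence intervals of the individual SIRs, so two SIRs with overlapping intervals can still differ significantly.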
Non-Latino Asians/Pacific Islanders (API) had higher SIRs than NLW given a proband with hematologic cancer (SIR=7.56;95%CI:3.26-14.90), and non-Latino Blacks had higher SIRs than NLW given a proband with any cancer (SIR=6.96;95%CI:3.71-11.91) or solid cancer (SIR=7.35;95%CI:3.36-13.95) (Supplementary Table 3-4), but small numbers resulted in unstable estimates. For the relative risk of specific cancers given a proband with the same cancer, Latino subjects also showed a higher relative risk of solid cancers than NLW subjects [Latino: SIR=7.94;95%CI:5.89-10.47; NLW: SIR=4.41;95%CI:2.99-6.25; P=0.012] (Supplementary Table 3-1).
Relative risks of second primary malignancies
Overall, we found a 7.27-fold increased risk of SPMs relative to the general population among children/AYAs with an FPM (SIR=7.27;95%CI:6.56-8.03). Most primary cancer types were associated with an elevated relative risk of SPMs (Figure 3-3A). When stratified by race/ethnicity, similar relative risks of all SPMs given a proband with cancer were observed among Latino and NLW subjects [Latino: SIR=6.85;95%CI:5.83-8.00; NLW: SIR=6.65;95%CI:5.55-7.91; P=0.869] (Figure 3-3B). For the relative risks of SPMs of the same cancer types as the FPM, we found elevated risks for both hematologic and solid cancers. When stratified by race/ethnicity, similar relative risks were observed among NLW subjects compared to Latino subjects given a proband with hematologic or solid cancer (Supplementary Table 3-5).
Discussion
To our knowledge, this is the first study to quantify the familial clustering risks and risks of SPMs among early-onset cancer patients with an emphasis on racial/ethnic differences. Using linked population registry data in the California population, we found that the risk for a sibling child/AYA or mother to have early-onset cancer was elevated once a proband was identified with an early-onset cancer.
Likewise, the relative risks for SPMs were elevated among children/AYAs who were diagnosed with a first primary cancer. Due to the rarity of childhood cancers, the absolute risk is very small, but it was still higher among young siblings and mothers in the current study (0.605%) than in the general population (0.023%) of the same age group. The findings were consistent across race/ethnic groups; however, the magnitude differed. Latinos had higher sibling/maternal relative risks compared to NLWs for solid cancers. Consistent with our results, a rich literature with a primary focus on European ancestry populations has reported excess familial risks of hematologic malignancies 129, lymphomas 140-142, brain tumors 143,144, neuroblastomas 145, retinoblastomas 142,145, germ cell tumors 146, sarcomas 147 and melanomas 148. In terms of secondary cancers, studies have reported excess risks of second primary malignancies among survivors of hereditary retinoblastoma 149, chronic myeloid leukemia 150, chronic lymphocytic leukemia 151, Hodgkin's lymphoma 152, non-Hodgkin's lymphoma 153, and neuroblastoma 154. The excess familial risks of certain cancers are highly likely to be associated with genetic predisposition. The archetypal examples are germline loss-of-function mutations in RB1, which are found in ~40% of retinoblastoma cases 145, and adrenal cortical cancer, with germline TP53 mutations accounting for most familial cases 145. Low-penetrance common genetic variations, for instance in the CEBPE, IKZF1, and ARID5B genes in ALL, are associated with cancer risk and may also contribute to familial concordance, as combinations of low-frequency alleles or “polygenic risk scores” have been shown to be as impactful as single strong predisposition mutations in adult cancers 155,156; however, their contribution to cancer clustering among children and their families has not yet been studied.
Our data demonstrate a higher degree of familial-based clustering of solid cancers among Latinos compared to non-Latino Whites. This familial concordance is likely due to both shared genetic and environmental causes and is accompanied by a clearly higher incidence of some cancer types. Latinos are an admixed population, comprising an ancestral mixture from Native American, European, and African sources. California Latinos, particularly the youth population, are largely from Mexico, and harbor a higher risk of certain cancers, particularly pediatric leukemias, the most common cancer in children 12; however, this higher risk is only partially accounted for by a higher frequency of common risk alleles, which do not address strong familial predisposition loci 39. This higher risk identified in relation to the family unit has not been studied, and our results here call for an analysis of the comparative sources of genetic and environmental risk that contribute to the higher risk and familial clustering of cancers in Latinos. Therapy of the first primary cancer is a major factor in the induction of secondary independent malignancies 157-159. Multiple primary cancer diagnoses are considered a key feature of hereditary cancer predisposition syndromes 160. As such secondary cancers are rare, genetics are still likely to play a strong role 161, and our overall SPM results here show a pattern similar to that of cancer clustering in first primary malignancies. Of note, our analysis was not designed to distinguish relative risk contributions from therapy and genetic sources for secondary cancers. For some tumor types the germline predisposition was readily noted in this cohort; for example, ten of the fourteen affected relatives who had a proband with retinoblastoma were diagnosed with the same cancer, an unsurprising finding given that germline RB1 mutations account for a significant proportion of retinoblastoma and are highly penetrant, and that those tumors tend to be diagnosed young.
We also observed increased relative risks for sarcomas given a proband with leukemias, suggesting the presence of families with Li-Fraumeni syndrome, which is characterized by a spectrum of childhood and adult-onset cancers including adrenocortical carcinoma, breast cancer, CNS tumors, sarcomas, and leukemia 160. Population-level selection pressures are thought to influence the relative frequencies of alleles. For instance, genetic adaptations to cold and warm environments 162 and to immune challenges following colonization by Europeans 163 have shaped the Native American genome. Our result suggests that some adaptive selection pressures, or simply genetic drift exacerbated by bottlenecking of genetic diversity during Native American population history, may differentially influence familial cancer clustering by age of cancer onset 164. If replicated in other study settings, this contrast in genetic risk between childhood- and adult-onset cancers by ethnicity should be studied further for a fuller understanding of familial risks. Our analyses capitalized on the highly diverse population in California, allowing us to quantify the relative risks across different ethnic groups. Moreover, the utilization of linked population-based registries in California enabled us to minimize the selection and information biases introduced by a case-control study design or other strategies that only sample portions of the population. There are also some limitations of our study. Despite the large number of total cancer cases, the numbers of affected siblings and second primaries are very small for some cancer types, thus limiting the power to detect significant relative risks. Also, we were unable to track cancer incidence for affected siblings, maternal cancers, and SPMs that may have been diagnosed outside of California.
In addition, the follow-up time of 26 years is not enough for comprehensive detection of SPMs in the probands, nor of cancers arising in proband mothers at older ages. Insufficient follow-up time and loss to follow-up have limited our ability to quantify the relative risks among mothers with cancer onset at older ages (>40 years). Furthermore, it is likely that the low number of mothers with cancer partly reflects a bias against some very strong cancer predispositions, because affected patients may not survive long enough or be healthy enough to reproduce. Lastly, the lack of records on fathers reduces our ability to quantify the relative risks among other first-degree relatives and may lead us to underestimate the contribution of high-risk cancer predisposition syndromes, which can be inherited from either parent.

Accepting those limitations of the current dataset, our study has several important implications that may open windows to future research. First, the genetic predispositions driving the excess childhood solid cancer risks among the Latino population, whether from higher frequencies of known cancer predisposition syndromes, mutations in novel genes, or a higher burden of common or rare genetic risk alleles, warrant further investigation. Second, the comparative attributable fraction of familial risk due to environmental risk factors interacting with genetic predispositions warrants further investigation. Lastly, descriptive studies on familial and secondary cancer risks among race/ethnic groups other than Latinos and non-Latino Whites may provide additional insights into cancer incidence variation by race/ethnicity.

Table 3-1. Abbreviations of the twelve broad groups defined by the International Classification of Childhood Cancer, Third edition.

Abbreviation            Definition †
Leukemias               I. Leukemias, myeloproliferative diseases, and myelodysplastic diseases
Lymphomas               II. Lymphomas and reticuloendothelial neoplasms
CNS tumors              III. CNS and miscellaneous intracranial and intraspinal neoplasms
Neuroblastomas          IV. Neuroblastoma and other peripheral nervous cell tumors
Retinoblastoma          V. Retinoblastoma
Renal tumors            VI. Renal tumors
Hepatic tumors          VII. Hepatic tumors
Bone tumors             VIII. Malignant bone tumors
Sarcomas                IX. Soft tissue and other extraosseous sarcomas
GCT                     X. Germ cell tumors, trophoblastic tumors, and neoplasms of gonads
Epithelial neoplasms    XI. Other malignant epithelial neoplasms and malignant melanomas
Other                   XII. Other and unspecified malignant neoplasms

† Cancers were classified into groups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html).

Figure 3-1. Relative risks of early-onset cancers among siblings and mothers. A. Relative risks among siblings and mothers of any early-onset cancer (diagnosed under 26 years of age) given a proband with cancer, 1989 to 2015, California, USA. B. Relative risks by ethnic group among siblings and mothers of any early-onset cancer (diagnosed under 26 years of age) given a proband with cancer, 1989 to 2015, California, USA. Cancers were classified into groups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). Hematologic cancers include leukemias and lymphomas. Solid cancers include CNS tumors, neuroblastomas, retinoblastomas, renal tumors, hepatic tumors, bone tumors, sarcomas, GCT, epithelial neoplasms, and others. The axis for SIR was natural log-transformed. SIR and 95% CI were not calculable for cancers with zero observed cases. P was calculated assuming a Poisson distribution. Abbreviations: SIR, standardized incidence ratio; CI, confidence interval; GCT, germ cell tumors, trophoblastic tumors, and neoplasms of gonads.

Figure 3-2. Relative risks of siblings and mothers of specific cancers.
Cancers were classified into groups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). Standardized incidence ratios greater than 10 were recoded to 10. Siblings and mothers of a proband were diagnosed with cancer from 1989 to 2015 at 0 to 26 years of age. P was calculated assuming a Poisson distribution. Abbreviations: SIR, standardized incidence ratio; CI, confidence interval; GCT, germ cell tumors, trophoblastic tumors, and neoplasms of gonads.

Figure 3-3. Relative risks of second primary malignancies. A. Relative risks of second primary malignancies of any early-onset cancer (diagnosed under 26 years of age) given a proband with cancer, 1989 to 2015, California, USA. B. Relative risks of second primary malignancies of any early-onset cancer (diagnosed under 26 years of age) given a proband with cancer by ethnic group, 1989 to 2015, California, USA. Cancers were classified into groups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). Hematologic cancers include leukemias and lymphomas. Solid cancers include CNS tumors, neuroblastomas, retinoblastomas, renal tumors, hepatic tumors, bone tumors, sarcomas, GCT, epithelial neoplasms, and others. The axis for SIR was natural log-transformed. SIR and 95% CI were not calculable for cancers with zero observed cases. P was calculated assuming a Poisson distribution. Abbreviations: SIR, standardized incidence ratio; CI, confidence interval; GCT, germ cell tumors, trophoblastic tumors, and neoplasms of gonads; FPM, first primary malignancy; SPM, second primary malignancy.
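The figure legends above report standardized incidence ratios (SIRs) with 95% confidence intervals and P values calculated under a Poisson assumption. The following is a minimal sketch of how such quantities could be computed, not the procedure actually used in this chapter. The function names are hypothetical, Byar's approximation for the interval and the doubled-tail two-sided P value are assumptions (the legends do not state which methods were applied), and the observed and expected counts in the example are made up for illustration.

```python
import math


def poisson_cdf(k, mu):
    """Return P(X <= k) for X ~ Poisson(mu), summing the pmf term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def sir_summary(observed, expected, z=1.96):
    """Hypothetical helper: SIR, approximate 95% CI, and two-sided Poisson P.

    Returns None when no cases were observed, mirroring the legends'
    statement that SIR and CI were not calculable in that situation.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed == 0:
        return None
    sir = observed / expected
    # Byar's approximation to the exact Poisson limits (an assumption here).
    lower = observed * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3
    upper_count = observed + 1
    upper = upper_count * (
        1 - 1 / (9 * upper_count) + z / (3 * math.sqrt(upper_count))
    ) ** 3
    # Two-sided P value: twice the smaller Poisson tail, capped at 1.
    p_ge = 1.0 - poisson_cdf(observed - 1, expected)
    p_le = poisson_cdf(observed, expected)
    p_value = min(1.0, 2.0 * min(p_ge, p_le))
    return {
        "sir": sir,
        "ci_low": lower / expected,
        "ci_high": upper / expected,
        "p": p_value,
    }


# Illustrative counts only: 10 observed cases against 5 expected.
result = sir_summary(10, 5.0)
```

Plotting `math.log(result["sir"])` would correspond to the natural log-transformed axis described in the legends, which is also why a zero observed count cannot be displayed.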
Chapter 4: Genetic predispositions of familial-associated early-onset hematologic cancers

In this chapter, we move to the individual level to explore the genetic predispositions associated with the excess risk of hematologic cancers among Latino subjects, using the same source of data as Chapter 3.

Introduction

Hematologic cancer (leukemia, lymphoma, and myeloma) is the most commonly diagnosed early-onset cancer among people under 26 years of age 1. Inherited and de novo mutations of genes encoding hematopoietic transcription factors are central oncogenic events in the pathogenesis of hematologic cancers 121. Pathogenic or likely pathogenic germline mutations have been identified in approximately 10% of pediatric hematologic cancer patients 117,126-128. According to our report in Chapter 3, the risk of early-onset cancers diagnosed under 26 years of age is 2.97 times higher among siblings and mothers who have a proband with hematologic cancer in the same family. Inherited germline mutations may contribute to this excess cancer risk. This risk varies among race/ethnic groups, suggesting that predisposition mutations vary in identity or frequency among groups.

Leukemia is the most common early-onset hematologic cancer, accounting for approximately 25% of all cancers diagnosed in children 1. As with hematologic cancers in general, the relative early-onset cancer risk is 2.87 times higher among siblings and mothers given a proband with leukemia, suggesting that inherited predisposition may play a role. The disease among older adults is dominated by acute myeloid leukemia (AML), while the disease among children and young adults is dominated by acute lymphoblastic leukemia (ALL). ALL diagnosed in children consists mainly of good-prognosis subtypes, such as those with high hyperdiploidy or ETV6-RUNX1 fusion translocations 10.
Recent studies have identified germline nonsilent mutations in TP53, PAX5, IKZF1, and ETV6 in both familial and sporadic ALL but have not thoroughly assessed the spectrum of mutations among ancestral groups and their contribution to disparities in risk.

Lymphoma (collectively defined as non-Hodgkin lymphoma [NHL], Hodgkin lymphoma [HL], and chronic lymphocytic leukemia [CLL]) is another common early-onset hematologic cancer, accounting for approximately 10% of all cancers diagnosed in children 1,140. According to our report in Chapter 3, the relative early-onset cancer risk in our ancestrally mixed population is 4.66 times higher for siblings and mothers given a proband with lymphoma in the same family. Previous studies have reported similar results: among populations of European origin, first-degree relatives of NHL, HL, and CLL patients have approximately 1.7 times, 3.1 times, and 8.5 times higher risk of developing NHL, HL, and CLL, respectively 165-170. In the past decade, genome-wide association studies (GWAS) have identified common loci (minor allele frequency, MAF>0.5) with small effect sizes 171-182 in association with the risk of lymphoma. The rare variants, insertions/deletions, block substitutions, inversions, translocations, and copy number alterations associated with lymphoma risk have not been studied extensively 183.

The risk of hematologic cancers varies by race/ethnic group. According to our report in Chapter 1, Latino subjects have a higher incidence of ALL than subjects of other race/ethnic groups across all ages. For lymphomas diagnosed under 19 years of age, the incidence among non-Latino White (NLW) subjects is higher than among subjects of other race/ethnic groups 184. However, the variation in genetic predispositions that may drive this difference in risk by race/ethnic group has been understudied.
Familial cancer databases can be used to study rare genetic predispositions with large effect sizes that cannot be captured by GWAS. Therefore, we used the linked population-based cancer registries in California to assess rare risk loci with large effect sizes that may explain this variation in risk of hematologic cancers by race/ethnic group.

Methods

Study population

We used the same linked population-based registries in California as described in Chapter 3. Briefly, the dataset was created by linking information from the California Cancer Registry (CCR) and the California Birth Statistical Master File. The linked dataset comprehensively encompassed all cancer cases diagnosed at 0-26 years of age, as well as cancers in their siblings and mothers, diagnosed from 1989 to 2015 in California. To study the rare genetic predispositions associated with hematologic cancers, we first identified all sibling pairs with cancer in the database. Then, we selected the sibling pairs in which at least one sibling had a hematologic cancer (leukemia or lymphoma) and sent their samples for sequencing. Overall, 70 patients from 35 families were eligible for inclusion.

The race/ethnicity of each subject was determined by mapping the first two genetic principal components (PCs) to the 1000 Genomes Project (1KG) reference populations 89. A total of 2504 subjects (661 African American, 347 Latino, 504 East Asian, 503 European, and 489 South Asian) from 1KG were included as the reference for the PC analysis 89. We first used PLINK 2.0 90 to merge the variants of our study participants with those of the 1KG subjects and performed a PC analysis. We then used PLINK 2.0 to estimate the genetic ancestry of our study participants by comparing their first two PCs to the first two PCs of the 1KG reference subjects, whose genetic ancestry is known 91.
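The comparison of a participant's first two PCs with those of reference subjects can be illustrated with a small sketch. The text above does not state the exact comparison rule, so the nearest-centroid rule used here is an assumption, as are the function names and the PC coordinates, which are invented for illustration and are not 1KG values. The sketch does not call PLINK; it only shows the assignment step once PC coordinates are available.

```python
import math


def centroid(points):
    """Mean (PC1, PC2) position of a list of reference subjects."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def assign_ancestry(sample_pc, reference_pcs):
    """Hypothetical helper: label a sample by its nearest reference centroid.

    sample_pc     -- (PC1, PC2) of one study participant
    reference_pcs -- dict mapping a population label to a list of (PC1, PC2)
    """
    centroids = {label: centroid(pts) for label, pts in reference_pcs.items()}
    return min(centroids, key=lambda label: math.dist(sample_pc, centroids[label]))


# Made-up coordinates for three reference groups (illustration only).
reference_pcs = {
    "EUR": [(0.010, 0.030), (0.012, 0.028)],
    "AFR": [(-0.040, 0.000), (-0.038, 0.002)],
    "EAS": [(0.020, -0.030), (0.022, -0.032)],
}
label = assign_ancestry((0.011, 0.027), reference_pcs)
```

A centroid rule is the simplest choice; a k-nearest-neighbour vote would be a reasonable alternative for admixed participants who fall between reference clusters.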
Nucleic acid preparation and genotyping

DNA used for whole exome sequencing (WES) was isolated from neonatal dried blood samples obtained from the California Biobank Program 88 using Beckman Genfind v3 reagents on an Eppendorf robotic sample handling platform. Uniquely barcoded samples underwent WES on the IDT xGen Exome V1 plus a spike-in of a small panel of clinically relevant probes. Approximately 250 million paired-end reads, each 100 bp in length, were generated for each sample.

Mapping and variant identification

The Genome Analysis Toolkit (GATK) pipeline for germline short variant (SNPs + indels) discovery was used for mapping and variant calling 185-187, based on the GRCh37 assembly. The resulting sequence variants, stored in variant call format (VCF) files, were annotated with ANNOVAR 188. We filtered all variants by minor allele frequency (MAF) against reference cohorts in the Genome Aggregation Database (gnomAD) 189,190 and Trans-Omics for Precision Medicine (TOPMed) 191. Rare variants with allele frequency <=0.001 in both the exome and genome sequencing data in gnomAD and in TOPMed were retained in the filtered VCF file. Variants with a ClinVar 192 annotation of “benign” or “likely benign” were excluded. Variants with an alternative allele read depth <=5, or a ratio of alternative allele read depth to genotype read depth <=0.2, were also excluded.

Annotation of pathogenicity

We extracted the variants shared by the two siblings in a family and used the Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE) Medal Ceremony pipeline to classify the pathogenicity of the variants. PeCanPIE works by first sifting through variants in sequencing data and then annotating the pathogenicity of the variants based on American College of Medical Genetics and Genomics (ACMG) guidelines. The potential pathogenicity of the variants is classified into three tiers (gold, silver, and bronze) 193.
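The rare-variant filters listed under "Mapping and variant identification" can be sketched as a single predicate. This is a minimal illustration rather than the pipeline used in the study: the dictionary keys are hypothetical stand-ins for annotation fields, the treatment of a missing frequency as zero (variant absent from the database) is an assumption, and the thresholds are taken from the text.

```python
BENIGN_LABELS = {"benign", "likely benign"}


def passes_filters(variant, max_af=0.001, min_alt_depth=5, min_alt_fraction=0.2):
    """Hypothetical helper: apply the frequency, ClinVar, and depth filters.

    `variant` is a dict with illustrative keys; a real VCF/ANNOVAR record
    would use different field names.
    """
    # Frequency filter: AF must be <= max_af in every reference dataset.
    # A missing value is treated as 0.0 (assumption: not observed there).
    for key in ("gnomad_exome_af", "gnomad_genome_af", "topmed_af"):
        af = variant.get(key) or 0.0
        if af > max_af:
            return False
    # ClinVar filter: drop benign / likely benign annotations.
    clinvar = (variant.get("clinvar") or "").lower().replace("_", " ")
    if clinvar in BENIGN_LABELS:
        return False
    # Depth filters: exclude alt depth <= 5 or alt/total depth <= 0.2.
    alt_depth = variant["alt_depth"]
    total_depth = variant["total_depth"]
    if alt_depth <= min_alt_depth:
        return False
    if total_depth <= 0 or alt_depth / total_depth <= min_alt_fraction:
        return False
    return True


# Invented records for illustration only.
kept = passes_filters({
    "gnomad_exome_af": 0.0002, "gnomad_genome_af": None, "topmed_af": 0.0001,
    "clinvar": "Pathogenic", "alt_depth": 18, "total_depth": 40,
})
dropped_common = passes_filters({
    "gnomad_exome_af": 0.02, "gnomad_genome_af": 0.018, "topmed_af": 0.02,
    "clinvar": "", "alt_depth": 20, "total_depth": 40,
})
dropped_low_depth = passes_filters({
    "gnomad_exome_af": 0.0, "gnomad_genome_af": 0.0, "topmed_af": 0.0,
    "clinvar": "Uncertain_significance", "alt_depth": 5, "total_depth": 12,
})
```

Variants shared by both siblings could then be found by intersecting the sets of passing variants from each sibling, keyed for example by chromosome, position, reference allele, and alternative allele.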
We examined variants in a list of 1073 genes that are reported to be cancer-, immunodeficiency-, or nonmalignant hematological-related genes 193,194, pediatric cancer predisposition genes 195,196, tumor suppressor genes, tyrosine kinase genes, or cancer genes classified based on their recurrent somatic mutation in cancer 126. Variants that had a PeCanPIE gold or silver medal and a ClinVar annotation of “pathogenic” or “likely pathogenic” were included in the subsequent ancestry analyses.

Ancestry of pathogenic variants

To evaluate whether the pathogenic/likely pathogenic variants originated from a specific genetic ancestry, we further examined the global ancestry of each study subject and the local ancestry of each pathogenic/likely pathogenic variant. The ancestries were classified into 5 superpopulations: European (EUR), African (AFR), Admixed American (AMR), East Asian (EAS), and South Asian (SAS). The global ancestry of each subject was determined by analyzing the sequence variants of all study participants with ADMIXTURE 197. The local ancestry of each variant was determined with RFMix 198. We used sequence variants from the 1KG and the Human Genome Diversity Project (HGDP) to construct the reference panel for the local ancestry analysis. First, ADMIXTURE was used to identify the 1KG and HGDP subjects with a ‘pure’ global ancestry (100% EUR, 100% AFR, 100% AMR, 100% EAS, or 100% SAS). A total of 345 ancestrally unmixed subjects (69 AFR, 69 AMR, 69 EAS, 69 EUR, 69 SAS) were included in the reference panel. We then mapped the local sequence variants of all study participants to those of the reference subjects to determine the local ancestry of each pathogenic/likely pathogenic variant. The local ancestry of the common SNP closest to the variant of interest was taken as the local ancestry of that variant.
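The final rule above, in which a rare variant inherits the local ancestry of the closest common SNP, can be sketched as a nearest-position lookup. This is an illustration under stated assumptions: the function name, positions, and ancestry calls are hypothetical, the inputs are assumed to come from one chromosome and one haplotype, and the tie-breaking choice (the upstream SNP wins) is not specified in the text. Reading actual RFMix output is not shown.

```python
import bisect


def nearest_snp_ancestry(variant_pos, snp_positions, snp_ancestries):
    """Hypothetical helper: ancestry call of the common SNP nearest a variant.

    snp_positions  -- base-pair positions of common SNPs, sorted ascending
    snp_ancestries -- ancestry label for each SNP, in the same order
    """
    if not snp_positions:
        raise ValueError("no common SNPs supplied")
    # Index of the first SNP at or after the variant position.
    i = bisect.bisect_left(snp_positions, variant_pos)
    # Only the SNP just before and the SNP at/after can be the nearest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(snp_positions)]
    # min() keeps the first candidate on ties, i.e. the upstream SNP.
    best = min(candidates, key=lambda j: abs(snp_positions[j] - variant_pos))
    return snp_ancestries[best]


# Invented positions and calls for illustration only.
positions = [100, 200, 300]
calls = ["EUR", "AMR", "AFR"]
call_for_variant = nearest_snp_ancestry(190, positions, calls)
```

Running the lookup separately for each haplotype of each sibling would give per-haplotype calls of the kind compared between siblings in the Results.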
Results

Demographics of study participants

Among the 70 patients eligible for inclusion, 67 samples had enough DNA for genotyping. Sixty-four of the 67 samples formed complete sibling pairs from the same family and were included in the genetic analyses. The 64 subjects were from 32 families, among which 12 (37.5%) were Latino and 10 (31.2%) were non-Latino White (NLW) families (Table 4-1). Including parents and all healthy or diseased children, the average family size was 4.92 people for Latino families and 6.10 people for NLW families.

Pathogenic/likely pathogenic variants

Overall, we identified 280 pathogenic/likely pathogenic variants shared between the two siblings among the 32 families. Among them, 106 (37.86%) were found in Latino families, 66 (23.57%) in NLW families, 36 (12.86%) in non-Latino Black (NLB) families, 67 (23.93%) in non-Latino Asian/Pacific Islander (NLAPI) families, and 5 (1.79%) in non-Latino American Indian/Alaskan Native (NLAIAN) families. Fifteen variants (5.36%) were highly likely to be pathogenic (assigned a ‘gold medal’ by PeCanPIE) (Table 4-2). For Latino families, pathogenic variants were detected in TP53 (NM_000546.5:c.644G>A; NM_000546.5:c.533A>C), SERPINA1 (rs28929474), PTPRD (rs1251077945), ATM (NM_000051.3:c.3158dup), and HFE (rs1799945). For NLW families, pathogenic variants were detected in TP53 (rs397516435), SERPINA1 (rs17580), and HFE (rs1799945). For NLAPI families, pathogenic variants were detected in FLG (NM_002016.1:c.3321del), MLF1 (rs200248107), HFE (rs1799945), and HBB (rs33950507). For one NLB family, a pathogenic variant was detected in GATA2 (NM_001145661.1:c.1009C>T) (Supplementary Table 4-1).

Ancestry of pathogenic variants

In the one NLB family, the pathogenic variant in GATA2 had a pure AFR ancestry on the same haplotype in both siblings (100% on haplotype 2 for both siblings). In one Latino family, the mutation in SERPINA1 had AMR ancestry on the same haplotype in the two siblings (98% and 99% on haplotype 2, respectively).
For NLW families, the mutations in HFE and SERPINA1 had shared EUR and SAS ancestries [(HFE: sibling 1: 83% SAS on haplotype 1 and 99% EUR on haplotype 2; sibling 2: 99% EUR on haplotype 1 and 75% SAS on haplotype 2); (SERPINA1: sibling 1: 91% EUR on haplotype 1 and 99% SAS on haplotype 2; sibling 2: 99% SAS on haplotype 1 and 100% EUR on haplotype 2)]; the mutation in TP53 had a shared EUR ancestry on the same haplotype (100% on haplotype 2 for both siblings) between the members of a family. For NLAPI families, the mutation in FLG had a shared AMR ancestry on the same haplotype (93% and 95% on haplotype 1, respectively); the mutation in MLF1 had a shared EUR ancestry (sibling 1: 48% on haplotype 1; sibling 2: 100% on haplotype 2); and the mutation in HBB had a shared EAS ancestry on the same haplotype (100% on haplotype 1 for both siblings) between the members of a family (Supplementary Table 4-2).

Discussion

With this study, we have demonstrated that the variation in familial risk of early-onset hematologic cancers by ancestral/ethnic group is highly likely to be associated with genetic predisposition. The profiles of pathogenic/likely pathogenic variants across ancestral/ethnic groups share some similarities but differ overall. Consistent with previous findings, we found that familial early-onset hematologic cancers harbor germline TP53, GATA2, and ATM mutations 199,200. Mutations in TP53 have been associated with many hematologic cancers, including lymphoblastic 201 and myeloid leukemias 202 and lymphomas 203-205. Mutations in GATA2 have been associated with myeloid malignancies 206,207. Mutations in ATM have been associated with T-cell prolymphocytic leukemia 208, mantle cell lymphoma 209, and gliomas 210.

The mutation rs1251077945 in PTPRD is specific to Latino families. rs1251077945 is a rare variant with high penetrance only in the Latino population.
As reported in gnomAD, the minor allele frequency (MAF) of rs1251077945 is 0.00003 in Latino/Admixed Americans and 0 in all other ancestral/ethnic groups 189,190. PTPRD is involved in oncogenic transformation 211 and has been associated with acute myeloid leukemias 212,213. The mutation rs200248107 in MLF1 is specific to NLAPI families. As reported in gnomAD, the MAF of rs200248107 is 0.004 in East Asians and 0.0002 in South Asians, and the variant is absent from the Latino, European, and African populations 189,190. MLF1 encodes an oncoprotein and has been associated with myeloid leukemias 214.

There are several strengths and limitations of this study. A major strength is the linkage of population-based cancer registries to sequencing data of newborns, which allowed us to build a comprehensive picture of the genetic predispositions that drive the excess familial risk of early-onset hematologic cancers. Furthermore, benefiting from the highly diverse population in California, we are, to our knowledge, the first group to assess the variation in genetic predisposition associated with familial hematologic cancer risk in multiple ancestral/ethnic groups. However, limited by the length of the data collection period and the rarity of childhood cancers, we detected only a few mutations that are unique to an ancestral/ethnic group. To better understand the origin of a mutation, we have planned a future analysis to identify the local ancestries of the mutations that are specific to an ancestral/ethnic group. Such analyses will help to establish more definitively the role of strong cancer predisposition variants among different race/ethnic groups, and whether such mutations have varied penetrance in the context of different ancestral backgrounds.
In addition, a future direction is to include more study subjects, and more relatives within each family pedigree, to better understand the variation in genetic predisposition to hematologic cancers in different ancestral/ethnic groups.

Conclusion

The risk and familial clustering of early-onset cancers vary by ancestral/ethnic group, potentially driven by differences in immune development before birth and in genetic predisposition. In this dissertation, we comprehensively studied the epidemiology and genetics of early-onset hematologic cancers and, specifically, the most commonly diagnosed early-onset cancer, acute lymphoblastic leukemia. Initially, we reported that Latino subjects have the highest risk of ALL in all age groups and that the risk continued to increase from 2000 to 2016. To investigate the driver of the disparity in ALL risk across ancestral/ethnic groups, we used a case-control design and found that the activating combination of maternal KIR and offspring HLA is associated with a lower risk of childhood ALL in Latino and NLA children, implying that immune development characteristics affect the risk of ALL. To better understand the variation in early-onset cancer risk across ancestral/ethnic groups, we estimated the family-based cancer concordance rates and found that Latino subjects have a higher relative risk of early-onset cancers compared to NLW subjects. We further investigated the genetic predisposition to those cancers and identified different profiles of variants in different ancestral/ethnic groups.

The excess risk of hematologic cancers in the Latino population has been well documented 184, but it is difficult to identify the cause of the higher risk.
By comprehensively evaluating the incidence trends, familial clustering, and potentially associated environmental and genetic factors, we were able to demonstrate the roles of immune status before birth and genetic predisposition in the variation in early-onset hematologic cancer risk across ancestral/ethnic groups. Our studies have provided some evidence to explain the risk differences by ancestry/ethnicity, but the findings cannot fully explain the cause of the variation in cancer risk. Therefore, it is worthwhile for future studies to focus on 1) assessing whether the “Hispanic paradox” is still a crucial characteristic in risk modification, using individual-level information on family origin, area of birth, diet, health care, etc.; 2) understanding the potential mediating role of cytokines in in utero immune development, by examining childhood ALL pathways in different ancestral/ethnic groups with enough statistical power; 3) investigating the genetic predisposition associated with the excess familial risk of early-onset solid tumors in the Latino population; and 4) exploring whether factors other than in utero immune development and genetic predisposition (for instance, chemical environmental exposures such as air pollution, pesticides, and industrial chemicals) would explain the excess early-onset cancer risk in the Latino population.

Table 4-1. Distribution of rare variants shared by two siblings in a family.
                                N (%)
Predicted genetic ancestry
  Latino all races              106 (37.86)
  NLW                           66 (23.57)
  NLB                           36 (12.86)
  NLAPI                         67 (23.93)
  NLAIAN                        5 (1.79)
PeCanPIE Medal
  Gold                          15 (5.36)
  Silver                        265 (94.64)
Category
  Cancer                        194 (69.29)
  Hematological                 24 (8.57)
  Immunological                 35 (12.50)
  Other                         27 (9.64)
Mutation class
  Missense                      263 (93.93)
  Frameshift                    8 (2.86)
  Nonsense                      4 (1.43)
  Protein deletion              2 (0.71)
  Splice                        2 (0.71)
  UTR_5                         1 (0.36)

Abbreviations: NLW, non-Latino White; NLB, non-Latino Black; NLAPI, non-Latino Asian/Pacific Islander; NLAIAN, non-Latino American Indian/Alaskan Native; PeCanPIE, Pediatric Cancer Variant Pathogenicity Information Exchange.

Bibliography

1 Cancer Facts & Figures 2018, <https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and-statistics/annual-cancer-facts-and-figures/2018/cancer-facts-and-figures-2018.pdf> (2018).
2 Skarzynski, J. Blood Cancer Awareness Month: What You Need to Know, <https://www.curetoday.com/view/blood-cancer-awareness-month-what-you-need-to-know> (2019).
3 van Leeuwen, F. E. & Ng, A. K. Late sequelae in Hodgkin lymphoma survivors. Hematol Oncol 35 Suppl 1, 60-66, doi:10.1002/hon.2402 (2017).
4 Munir, I., Mehmood, T. & McFarlane, I. M. Amyotrophic Lateral Sclerosis, a Possible Sequela of Chronic Myeloid Leukemia. Am J Med Case Rep 7, 230-235, doi:10.12691/ajmcr-7-10-3 (2019).
5 Oeffinger, K. C. et al. Chronic health conditions in adult survivors of childhood cancer. N Engl J Med 355, 1572-1582, doi:10.1056/NEJMsa060185 (2006).
6 Hudson, M. M. et al. Clinical ascertainment of health outcomes among adults treated for childhood cancer. JAMA 309, 2371-2381, doi:10.1001/jama.2013.6296 (2013).
7 Miller, K. D. et al. Cancer Statistics for Hispanics/Latinos, 2018. CA Cancer J Clin 68, 425-445, doi:10.3322/caac.21494 (2018).
8 Cancer Surveillance Programs in the United States, <https://www.cancer.org/treatment/treatments-and-side-effects/clinical-trials/cancer-surveillance-programs-and-registries-in-the-united-states.html#:~:text=The%20National%20Cancer%20Institute's%20(NCI,Cancer%20incidence%20(new%20cases)> (2014).
9 Horowitz, N. A., Akasha, D. & Rowe, J. M. Advances in the genetics of acute lymphoblastic leukemia in adults and the potential clinical implications. Expert Rev Hematol 11, 781-791, doi:10.1080/17474086.2018.1509702 (2018).
10 Pui, C. H., Yang, J. J., Bhakta, N. & Rodriguez-Galindo, C. Global efforts toward the cure of childhood acute lymphoblastic leukaemia. Lancet Child Adolesc Health 2, 440-454, doi:10.1016/S2352-4642(18)30066-X (2018).
11 Hunger, S. P. et al. Improved survival for children and adolescents with acute lymphoblastic leukemia between 1990 and 2005: a report from the children's oncology group. J Clin Oncol 30, 1663-1669, doi:10.1200/JCO.2011.37.8018 (2012).
12 Barrington-Trimis, J. L. et al. Trends in childhood leukemia incidence over two decades from 1992 to 2013. Int J Cancer 140, 1000-1008, doi:10.1002/ijc.30487 (2017).
13 Roberts, K. G. et al. High Frequency and Poor Outcome of Philadelphia Chromosome-Like Acute Lymphoblastic Leukemia in Adults. J Clin Oncol 35, 394-401, doi:10.1200/JCO.2016.69.0073 (2017).
14 Russell, L. J. et al. IGH@ translocations are prevalent in teenagers and young adults with acute lymphoblastic leukemia and are associated with a poor outcome. J Clin Oncol 32, 1453-1462, doi:10.1200/JCO.2013.51.3242 (2014).
15 Weng, A. P. et al. Activating mutations of NOTCH1 in human T cell acute lymphoblastic leukemia. Science 306, 269-271, doi:10.1126/science.1102160 (2004).
16 Igwe, I. J. et al. The presence of Philadelphia chromosome does not confer poor prognosis in adult pre-B acute lymphoblastic leukaemia in the tyrosine kinase inhibitor era - a surveillance, epidemiology, and end results database analysis.
Br J Haematol 179, 618-626, doi:10.1111/bjh.14953 (2017).
17 Lennmyr, E. et al. Survival in adult acute lymphoblastic leukaemia (ALL): A report from the Swedish ALL Registry. Eur J Haematol 103, 88-98, doi:10.1111/ejh.13247 (2019).
18 McNeer, J. L. & Bleyer, A. Acute lymphoblastic leukemia and lymphoblastic lymphoma in adolescents and young adults. Pediatr Blood Cancer 65, e26989, doi:10.1002/pbc.26989 (2018).
19 Dores, G. M., Devesa, S. S., Curtis, R. E., Linet, M. S. & Morton, L. M. Acute leukemia incidence and patient survival among children and adults in the United States, 2001-2007. Blood 119, 34-43, doi:10.1182/blood-2011-04-347872 (2012).
20 Quiroz, E. et al. The emerging story of acute lymphoblastic leukemia among the Latin American population - biological and clinical implications. Blood Rev 33, 98-105, doi:10.1016/j.blre.2018.08.002 (2019).
21 Barrington-Trimis, J. L. et al. Rising rates of acute lymphoblastic leukemia in Hispanic children: trends in incidence from 1992 to 2011. Blood 125, 3033-3034, doi:10.1182/blood-2015-03-634006 (2015).
22 Giddings, B. M., Whitehead, T. P., Metayer, C. & Miller, M. D. Childhood leukemia incidence in California: High and rising in the Hispanic population. Cancer 122, 2867-2875, doi:10.1002/cncr.30129 (2016).
23 Gurney, J. G. et al. Trends in cancer incidence among children in the U.S. Cancer 78, 532-541, doi:10.1002/(SICI)1097-0142(19960801)78:3<532::AID-CNCR22>3.0.CO;2-Z (1996).
24 Pullarkat, S. T., Danley, K., Bernstein, L., Brynes, R. K. & Cozen, W. High lifetime incidence of adult acute lymphoblastic leukemia among Hispanics in California. Cancer Epidemiol Biomarkers Prev 18, 611-615, doi:10.1158/1055-9965.EPI-07-2949 (2009).
25 National Institutes of Health. Registry Groupings in SEER Data and Statistics.
26 National Cancer Institute. Site Recode ICD-O-3/WHO 2008 Definition, <https://seer.cancer.gov/siterecode/icdo3_dwhoheme/> (2018).
27 NAACCR. NAACCR Fast Stats, <https://faststats.naaccr.org/standardpops.php> (2018).
28 Joinpoint Trend Analysis Software, <https://surveillance.cancer.gov/joinpoint/> (
29 National Institutes of Health. Time-dependent County Attributes, <https://seer.cancer.gov/seerstat/variables/countyattribs/time-dependent.html> (
30 Yost, K., Perkins, C., Cohen, R., Morris, C. & Wright, W. Socioeconomic status and breast cancer incidence in California for different race/ethnic groups. Cancer Causes Control 12, 703-711 (2001).
31 Pew Research Center. Latinos in California, Texas, New York, Florida and New Jersey, <https://www.pewresearch.org/hispanic/2004/03/19/latinos-in-california-texas-new-york-florida-and-new-jersey/> (2004).
32 Hunger, S. P. & Mullighan, C. G. Redefining ALL classification: toward detecting high-risk ALL and implementing precision medicine. Blood 125, 3977-3987, doi:10.1182/blood-2015-02-580043 (2015).
33 Moorman, A. V. New and emerging prognostic and predictive genetic biomarkers in B-cell precursor acute lymphoblastic leukemia. Haematologica 101, 407-416, doi:10.3324/haematol.2015.141101 (2016).
34 Salari, K. et al. Genetic admixture and asthma-related phenotypes in Mexican American and Puerto Rican asthmatics. Genet Epidemiol 29, 76-86, doi:10.1002/gepi.20079 (2005).
35 Perez-Andreu, V. et al. Inherited GATA3 variants are associated with Ph-like childhood acute lymphoblastic leukemia and risk of relapse. Nat Genet 45, 1494-1498, doi:10.1038/ng.2803 (2013).
36 Perez-Andreu, V. et al. A genome-wide association study of susceptibility to acute lymphoblastic leukemia in adolescents and young adults. Blood 125, 680-686, doi:10.1182/blood-2014-09-595744 (2015).
37 Clay-Gilmour, A. I. et al. Genetic association with B-cell acute lymphoblastic leukemia in allogeneic transplant patients differs by age and sex. Blood Adv 1, 1717-1728, doi:10.1182/bloodadvances.2017006023 (2017).
38 Roberts, K. G. et al. Targetable kinase-activating lesions in Ph-like acute lymphoblastic leukemia. N Engl J Med 371, 1005-1015, doi:10.1056/NEJMoa1403088 (2014).
39 Walsh, K. M.
et al. Associations between genome-wide Native American ancestry, known risk alleles and B-cell ALL risk in Hispanic children. Leukemia 27, 2416-2419, doi:10.1038/leu.2013.130 (2013).
40 Xu, H. et al. ARID5B genetic polymorphisms contribute to racial disparities in the incidence and treatment outcome of childhood acute lymphoblastic leukemia. J Clin Oncol 30, 751-757, doi:10.1200/JCO.2011.38.0345 (2012).
41 Aldrich, M. C. et al. Cytogenetics of Hispanic and White children with acute lymphoblastic leukemia in California. Cancer Epidemiol Biomarkers Prev 15, 578-581, doi:10.1158/1055-9965.EPI-05-0833 (2006).
42 Wang, L., Gomez, S. L. & Yasui, Y. Racial and Ethnic Differences in Socioeconomic Position and Risk of Childhood Acute Lymphoblastic Leukemia. Am J Epidemiol 185, 1263-1271, doi:10.1093/aje/kww164 (2017).
43 Population Reference Bureau. Population Bulletin Update: Latinos in the United States 2010, <https://www.prb.org/latinosupdate2/> (2010).
44 Young, R. P. & Hopkins, R. J. A review of the Hispanic paradox: time to spill the beans? Eur Respir Rev 23, 439-449, doi:10.1183/09059180.00000814 (2014).
45 Franzini, L., Ribble, J. C. & Keddie, A. M. Understanding the Hispanic paradox. Ethn Dis 11, 496-518 (2001).
46 Miranda-Filho, A. et al. Epidemiological patterns of leukaemia in 184 countries: a population-based study. Lancet Haematol 5, e14-e24, doi:10.1016/S2352-3026(17)30232-6 (2018).
47 Erdmann, F. et al. Incidence of childhood cancer in Costa Rica, 2000-2014: An international perspective. Cancer Epidemiol 56, 21-30, doi:10.1016/j.canep.2018.07.004 (2018).
48 Heck, J. E. et al. Risk of Childhood Cancer by Maternal Birthplace: A Test of the Hispanic Paradox. JAMA Pediatr 170, 585-592, doi:10.1001/jamapediatrics.2016.0097 (2016).
49 Uretsky, M. C. & Mathiesen, S. G. The effects of years lived in the United States on the general health status of California's foreign-born populations. J Immigr Minor Health 9, 125-136, doi:10.1007/s10903-006-9017-7 (2007).
50 Katz, A.
J., Chia, V. M., Schoonen, W. M. & Kelsh, M. A. Acute lymphoblastic leukemia: an assessment of international incidence, survival, and disease burden. Cancer Causes Control 26, 1627-1642, doi:10.1007/s10552-015-0657-6 (2015). 78 51 Perez-Saldivar, M. L. et al. Childhood acute leukemias are frequent in Mexico City: descriptive epidemiology. BMC Cancer 11, 355, doi:10.1186/1471-2407-11-355 (2011). 52 Organization, W. H. Estimated number of new cases in 2018, all cancers, both sexes, ages 0-14, <https://gco.iarc.fr/today/online-analysis- table?v=2018&mode=population&mode_population=countries&population=900&popula tions=320&key=asr&sex=0&cancer=39&type=0&statistic=5&prevalence=0&population _group=0&ages_group%5B%5D=0&ages_group%5B%5D=2&group_cancer=1&include _nmsc=1&include_nmsc_other=1#collapse-others> (2018). 53 Monge, P. et al. Childhood leukaemia in Costa Rica, 1981-96. Paediatr Perinat Epidemiol 16, 210-218, doi:10.1046/j.1365-3016.2002.00422.x (2002). 54 Wilkinson, J. D. et al. Cancer incidence among Hispanic children in the United States. Rev Panam Salud Publica 18, 5-13, doi:10.1590/s1020-49892005000600002 (2005). 55 Mody, R. et al. Twenty-five-year follow-up among survivors of childhood acute lymphoblastic leukemia: a report from the Childhood Cancer Survivor Study. Blood 111, 5515-5523, doi:10.1182/blood-2007-10-117150 (2008). 56 Turcotte, L. M. et al. Temporal Trends in Treatment and Subsequent Neoplasm Risk Among 5-Year Survivors of Childhood Cancer, 1970-2015. JAMA 317, 814-824, doi:10.1001/jama.2017.0693 (2017). 57 Mulrooney, D. A. et al. The changing burden of long-term health outcomes in survivors of childhood acute lymphoblastic leukaemia: a retrospective analysis of the St Jude Lifetime Cohort Study. Lancet Haematol 6, e306-e316, doi:10.1016/S2352- 3026(19)30050-X (2019). 58 Feng, Q. et al. Trends in Acute Lymphoblastic Leukemia Incidence in the US from 2000- 2016: an Increased Risk in Latinos Across All Age Groups. 
Am J Epidemiol, doi:10.1093/aje/kwaa215 (2020). 59 Urayama, K. Y. et al. Early life exposure to infections and risk of childhood acute lymphoblastic leukemia. Int J Cancer 128, 1632-1643, doi:10.1002/ijc.25752 (2011). 79 60 Urayama, K. Y., Buffler, P. A., Gallagher, E. R., Ayoob, J. M. & Ma, X. A meta-analysis of the association between day-care attendance and childhood acute lymphoblastic leukaemia. Int J Epidemiol 39, 718-732, doi:10.1093/ije/dyp378 (2010). 61 Greaves, M. Infection, immune responses and the aetiology of childhood leukaemia. Nat Rev Cancer 6, 193-203, doi:10.1038/nrc1816 (2006). 62 Ma, X. et al. Vaccination history and risk of childhood leukaemia. Int J Epidemiol 34, 1100-1109, doi:10.1093/ije/dyi113 (2005). 63 Schuz, J. & Erdmann, F. Environmental Exposure and Risk of Childhood Leukemia: An Overview. Arch Med Res 47, 607-614, doi:10.1016/j.arcmed.2016.11.017 (2016). 64 Hauer, J., Fischer, U. & Borkhardt, A. Towards prevention of childhood ALL by early- life immune training. Blood, doi:10.1182/blood.2020009895 (2021). 65 Crouch, S. et al. Infectious illness in children subsequently diagnosed with acute lymphoblastic leukemia: modeling the trends from birth to diagnosis. Am J Epidemiol 176, 402-408, doi:10.1093/aje/kws180 (2012). 66 Chang, J. S., Tsai, C. R., Tsai, Y. W. & Wiemels, J. L. Medically diagnosed infections and risk of childhood leukaemia: a population-based case-control study. Int J Epidemiol 41, 1050-1059, doi:10.1093/ije/dys113 (2012). 67 Greaves, M. F. & Wiemels, J. Origins of chromosome translocations in childhood leukaemia. Nat Rev Cancer 3, 639-649 (2003). 68 Greaves, M. A causal mechanism for childhood acute lymphoblastic leukaemia. Nat Rev Cancer 18, 471-484, doi:10.1038/s41568-018-0015-6 (2018). 69 Greaves, M. F. Speculations on the cause of childhood acute lymphoblastic leukemia. Leukemia 2, 120-125 (1988). 70 Chang, J. S. et al. Profound deficit of IL10 at birth in children who develop childhood acute lymphoblastic leukemia. 
Cancer Epidemiol Biomarkers Prev 20, 1736-1740, doi:10.1158/1055-9965.EPI-11-0162 (2011). 80 71 Couper, K. N., Blount, D. G. & Riley, E. M. IL-10: the master regulator of immunity to infection. J Immunol 180, 5771-5777, doi:10.4049/jimmunol.180.9.5771 (2008). 72 Sooregard, S. H. et al. Neonatal inflammatory markers in children later diagnosed with B-cell precursor acute lymphoblastic leukaemia. Cancer Res 78, 5458-5463 (2018). 73 Whitehead, T. et al. Cytokine levels at birth in children who developed acute lymphoblastic leukemia. Cancer Epidemiol Biomarkers Prev in press (2021). 74 Almalte, Z. et al. Novel associations between activating killer-cell immunoglobulin-like receptor genes and childhood leukemia. Blood 118, 1323-1328, doi:10.1182/blood-2010- 10-313791 (2011). 75 de Smith, A. J. et al. The role of KIR genes and their cognate HLA class I ligands in childhood acute lymphoblastic leukemia. Blood 123, 2497-2503, doi:10.1182/blood- 2013-11-540625 (2014). 76 Colucci, F. The role of KIR and HLA interactions in pregnancy complications. Immunogenetics 69, 557-565, doi:10.1007/s00251-017-1003-9 (2017). 77 Papuchova, H., Meissner, T. B., Li, Q., Strominger, J. L. & Tilburgs, T. The Dual Role of HLA-C in Tolerance and Immunity at the Maternal-Fetal Interface. Front Immunol 10, 2730, doi:10.3389/fimmu.2019.02730 (2019). 78 Chazara, O., Xiong, S. & Moffett, A. Maternal KIR and fetal HLA-C: a fine balance. J Leukoc Biol 90, 703-716, doi:10.1189/jlb.0511227 (2011). 79 Kim, S. et al. Licensing of natural killer cells by host major histocompatibility complex class I molecules. Nature 436, 709-713, doi:10.1038/nature03847 (2005). 80 Hiby, S. E. et al. Maternal activating KIRs protect against human reproductive failure mediated by fetal HLA-C2. J Clin Invest 120, 4102-4110, doi:10.1172/JCI43998 (2010). 81 Moffett, A. & Colucci, F. Co-evolution of NK receptors and HLA ligands in humans is driven by reproduction. Immunol Rev 267, 283-297, doi:10.1111/imr.12323 (2015). 
81 82 Oevermann, L. et al. KIR B haplotype donors confer a reduced risk for relapse after haploidentical transplantation in children with ALL. Blood 124, 2744-2747, doi:10.1182/blood-2014-03-565069 (2014). 83 V.;, G. J. & B., D. S. J. A Brief Analysis of Tissue-Resident NK Cells in Pregnancy and Endometrial Diseases: The Importance of Pharmacologic Modulation. Immuno 1(3), 174- 193 (2021). 84 Sivori, S. et al. Human NK cells: surface receptors, inhibitory checkpoints, and translational applications. Cell Mol Immunol 16, 430-441, doi:10.1038/s41423-019-0206- 4 (2019). 85 Basha, S., Surendran, N. & Pichichero, M. Immune responses in neonates. Expert Rev Clin Immunol 10, 1171-1184, doi:10.1586/1744666X.2014.942288 (2014). 86 Wallace, A. D. et al. Allergies and Childhood Acute Lymphoblastic Leukemia: A Case- Control Study and Meta-analysis. Cancer Epidemiol Biomarkers Prev 27, 1142-1150, doi:10.1158/1055-9965.EPI-17-0584 (2018). 87 Wang, R. et al. Cesarean Section and Risk of Childhood Acute Lymphoblastic Leukemia in a Population-Based, Record-Linkage Study in California. Am J Epidemiol 185, 96-105, doi:10.1093/aje/kww153 (2017). 88 California Biobank Program, <https://www.cdph.ca.gov/Programs/CFH/DGDS/Pages/cbp/default.aspx> (2018). 89 Genomes Project, C. et al. A global reference for human genetic variation. Nature 526, 68-74, doi:10.1038/nature15393 (2015). 90 Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559-575, doi:10.1086/519795 (2007). 91 Anderson, C. A. et al. Data quality control in genetic case-control association studies. Nat Protoc 5, 1564-1573, doi:10.1038/nprot.2010.116 (2010). 92 Team, R. C. R: A language and environment for statistical computing. (2017). 82 93 Moonsamy, P. V. et al. High throughput HLA genotyping using 454 sequencing and the Fluidigm Access Array System for simplified amplicon library preparation. Tissue Antigens 81, 141-149, doi:10.1111/tan.12071 (2013). 
94 Closa, L., Vidal, F., Herrero, M. J. & Caro, J. L. Design and Validation of a Multiplex KIR and HLA Class I Genotyping Method Using Next Generation Sequencing. Front Immunol 9, 2991, doi:10.3389/fimmu.2018.02991 (2018). 95 Zheng, X. et al. HIBAG--HLA genotype imputation with attribute bagging. Pharmacogenomics J 14, 192-200, doi:10.1038/tpj.2013.18 (2014). 96 Kuniholm, M. H. et al. Human leucocyte antigen class I and II imputation in a multiracial population. Int J Immunogenet 43, 369-375, doi:10.1111/iji.12292 (2016). 97 Vukcevic, D. et al. Imputation of KIR Types from SNP Variation Data. Am J Hum Genet 97, 593-607, doi:10.1016/j.ajhg.2015.09.005 (2015). 98 Boudreau, J. E., Mulrooney, T. J., Le Luduec, J. B., Barker, E. & Hsu, K. C. KIR3DL1 and HLA-B Density and Binding Calibrate NK Education and Response to HIV. J Immunol 196, 3398-3410, doi:10.4049/jimmunol.1502469 (2016). 99 Kulkarni, S., Martin, M. P. & Carrington, M. The Yin and Yang of HLA and KIR in human disease. Semin Immunol 20, 343-352, doi:10.1016/j.smim.2008.06.003 (2008). 100 Nielsen, A. B. et al. Increased neonatal level of arginase 2 in cases of childhood acute lymphoblastic leukemia implicates immunosuppression in the etiology. Haematologica 104, e514-e516, doi:10.3324/haematol.2019.216465 (2019). 101 Lin, S. M., Du, P., Huber, W. & Kibbe, W. A. Model-based variance-stabilizing transformation for Illumina microarray data. Nucleic Acids Res 36, e11, doi:10.1093/nar/gkm1075 (2008). 102 Tingley, D., Yamamoto, T., Hirose, K., Keele, L. & Imai, K. mediation: R Package for Causal Mediation Analysis. Journal of Statistical Software 59, 1-38 (2014). 83 103 Moffett, A., Chazara, O., Colucci, F. & Johnson, M. H. Variation of maternal KIR and fetal HLA-C genes in reproductive failure: too early for clinical intervention. Reprod Biomed Online 33, 763-769, doi:10.1016/j.rbmo.2016.08.019 (2016). 104 Ma, X. et al. 
Ethnic difference in daycare attendance, early infections, and risk of childhood acute lymphoblastic leukemia. Cancer Epidemiol Biomarkers Prev 14, 1928- 1934, doi:14/8/1928 [pii] 10.1158/1055-9965.EPI-05-0115 (2005). 105 Wang, S. et al. Increased activating killer immunoglobulin-like receptor genes and decreased specific HLA-C alleles in couples with recurrent spontaneous abortion. Biochem Biophys Res Commun 360, 696-701, doi:10.1016/j.bbrc.2007.06.125 (2007). 106 Faridi, R. M. et al. Influence of activating and inhibitory killer immunoglobulin-like receptors on predisposition to recurrent miscarriages. Hum Reprod 24, 1758-1764, doi:10.1093/humrep/dep047 (2009). 107 Khakoo, S. I. et al. HLA and NK cell inhibitory receptor genes in resolving hepatitis C virus infection. Science 305, 872-874, doi:10.1126/science.1097670 (2004). 108 Martin, M. P. et al. Epistatic interaction between KIR3DS1 and HLA-B delays the progression to AIDS. Nat Genet 31, 429-434, doi:10.1038/ng934 (2002). 109 McGovern, N. et al. Human fetal dendritic cells promote prenatal T-cell immune suppression through arginase-2. Nature 546, 662-666, doi:10.1038/nature22795 (2017). 110 Karalexi, M. A. et al. Maternal fetal loss history and increased acute leukemia subtype risk in subsequent offspring: a systematic review and meta-analysis. Cancer Causes Control 28, 599-624, doi:10.1007/s10552-017-0890-2 (2017). 111 Ayton, P. et al. Truncation of the Mll gene in exon 5 by gene targeting leads to early preimplantation lethality of homozygous embryos. Genesis 30, 201-212, doi:10.1002/gene.1066 (2001). 84 112 Sinnett, D., Labuda, D. & Krajinovic, M. Challenges identifying genetic determinants of pediatric cancers--the childhood leukemia experience. Fam Cancer 5, 35-47, doi:10.1007/s10689-005-2574-4 (2006). 113 Anfossi, N. et al. Human NK cell education by inhibitory receptors for MHC class I. Immunity 25, 331-342, doi:10.1016/j.immuni.2006.06.013 (2006). 114 Kim, S. et al. 
HLA alleles determine differences in human natural killer cell responsiveness and potency. Proc Natl Acad Sci U S A 105, 3053-3058, doi:10.1073/pnas.0712229105 (2008). 115 Yawata, M. et al. MHC class I-specific inhibitory receptors and their ligands structure diverse human NK-cell repertoires toward a balance of missing self-response. Blood 112, 2369-2380, doi:10.1182/blood-2008-03-143727 (2008). 116 Fauriat, C., Ivarsson, M. A., Ljunggren, H. G., Malmberg, K. J. & Michaelsson, J. Education of human natural killer cells by activating killer cell immunoglobulin-like receptors. Blood 115, 1166-1174, doi:10.1182/blood-2009-09-245746 (2010). 117 Saletta, F., Dalla Pozza, L. & Byrne, J. A. Genetic causes of cancer predisposition in children and adolescents. Transl Pediatr 4, 67-75, doi:10.3978/j.issn.2224- 4336.2015.04.08 (2015). 118 Moriyama, T., Relling, M. V. & Yang, J. J. Inherited genetic variation in childhood acute lymphoblastic leukemia. Blood 125, 3988-3995, doi:10.1182/blood-2014-12-580001 (2015). 119 Caswell-Jin, J. L. et al. Racial/ethnic differences in multiple-gene sequencing results for hereditary cancer risk. Genet Med 20, 234-239, doi:10.1038/gim.2017.96 (2018). 120 Ricker, C. et al. Increased yield of actionable mutations using multi-gene panels to assess hereditary cancer susceptibility in an ethnically diverse clinical cohort. Cancer Genet 209, 130-137, doi:10.1016/j.cancergen.2015.12.013 (2016). 121 Rampersaud, E. et al. Germline deletion of ETV6 in familial acute lymphoblastic leukemia. Blood Adv 3, 1039-1046, doi:10.1182/bloodadvances.2018030635 (2019). 85 122 Feng, Q. et al. Trends in Acute Lymphoblastic Leukemia Incidence in the US from 2000- 2016: an Increased Risk in Latinos Across All Age Groups (2020). 123 Ostrom, Q. T. et al. Risk factors for childhood and adult primary brain tumors. Neuro Oncol 21, 1357-1375, doi:10.1093/neuonc/noz123 (2019). 124 Brown, A. L. et al. 
Survival disparities for second primary malignancies diagnosed among childhood cancer survivors: A population-based assessment. Cancer 125, 3623- 3630, doi:10.1002/cncr.32356 (2019). 125 Calip, G. S., Law, E. H. & Ko, N. Y. Racial and ethnic differences in risk of second primary cancers among breast cancer survivors. Breast Cancer Res Treat 151, 687-696, doi:10.1007/s10549-015-3439-7 (2015). 126 Zhang, J. et al. Germline Mutations in Predisposition Genes in Pediatric Cancer. N Engl J Med 373, 2336-2346, doi:10.1056/NEJMoa1508054 (2015). 127 Plon, S. E. & Lupo, P. J. Genetic Predisposition to Childhood Cancer in the Genomic Era. Annu Rev Genomics Hum Genet 20, 241-263, doi:10.1146/annurev-genom-083118- 015415 (2019). 128 Ripperger, T. et al. Childhood cancer predisposition syndromes-A concise review and recommendations by the Cancer Predisposition Working Group of the Society for Pediatric Oncology and Hematology. Am J Med Genet A 173, 1017-1037, doi:10.1002/ajmg.a.38142 (2017). 129 Sud, A. et al. Analysis of 153 115 patients with hematological malignancies refines the spectrum of familial risk. Blood 134, 960-969, doi:10.1182/blood.2019001362 (2019). 130 Kharazmi, E., Fallah, M., Sundquist, K. & Hemminki, K. Familial risk of early and late onset cancer: nationwide prospective cohort study. BMJ 345, e8076, doi:10.1136/bmj.e8076 (2012). 131 Hemminki, K. & Czene, K. Attributable risks of familial cancer from the Family-Cancer Database. Cancer Epidemiol Biomarkers Prev 11, 1638-1644 (2002). 86 132 Heath, J. A., Smibert, E., Algar, E. M., Dite, G. S. & Hopper, J. L. Cancer risks for relatives of children with cancer. J Cancer Epidemiol 2014, 806076, doi:10.1155/2014/806076 (2014). 133 Curtin, K. et al. Familial risk of childhood cancer and tumors in the Li-Fraumeni spectrum in the Utah Population Database: implications for genetic evaluation in pediatric practice. Int J Cancer 133, 2444-2453, doi:10.1002/ijc.28266 (2013). 134 Kohli, D. R. et al. 
Familial pancreatic cancer risk: a population-based study in Utah. J Gastroenterol 54, 1106-1112, doi:10.1007/s00535-019-01597-3 (2019). 135 Samadder, N. J. et al. Epidemiology and familial risk of synchronous and metachronous colorectal cancer: a population-based study in Utah. Clin Gastroenterol Hepatol 12, 2078-2084 e2071-2072, doi:10.1016/j.cgh.2014.04.017 (2014). 136 Wang, X. et al. Using the Utah Population Database to assess familial risk of primary open angle glaucoma. Vision Res 50, 2391-2395, doi:10.1016/j.visres.2010.09.018 (2010). 137 Goldgar, D. E., Easton, D. F., Cannon-Albright, L. A. & Skolnick, M. H. Systematic population-based assessment of cancer risk in first-degree relatives of cancer probands. J Natl Cancer Inst 86, 1600-1608, doi:10.1093/jnci/86.21.1600 (1994). 138 Breslow, N. E. & Day, N. E. Statistical methods in cancer research. Volume II--The design and analysis of cohort studies. IARC Sci Publ, 1-406 (1987). 139 Guidelines for Using Confidence Intervals for Public Health Assessment, <https://www.doh.wa.gov/Portals/1/Documents/1500/ConfIntGuide.pdf> (2012). 140 Cerhan, J. R. & Slager, S. L. Familial predisposition and genetic risk factors for lymphoma. Blood 126, 2265-2273, doi:10.1182/blood-2015-04-537498 (2015). 141 Fallah, M. et al. Familial risk of non-Hodgkin lymphoma by sex, relationship, age at diagnosis and histology: a joint study from five Nordic countries. Leukemia 30, 373-378, doi:10.1038/leu.2015.272 (2016). 87 142 Madanat-Harjuoja, L. M., Pitkaniemi, J., Hirvonen, E., Malila, N. & Diller, L. R. Linking population-based registries to identify familial cancer risk in childhood cancer. Cancer 126, 3076-3083, doi:10.1002/cncr.32882 (2020). 143 Couldwell, W. T. & Cannon-Albright, L. A. A description of familial clustering of meningiomas in the Utah population. Neuro Oncol 19, 1683-1687, doi:10.1093/neuonc/nox127 (2017). 144 Crump, C., Sundquist, J., Sieh, W., Winkleby, M. A. & Sundquist, K. 
Perinatal and familial risk factors for brain tumors in childhood through young adulthood. Cancer Res 75, 576-583, doi:10.1158/0008-5472.CAN-14-2285 (2015). 145 Kamihara, J. et al. Retinoblastoma and Neuroblastoma Predisposition and Surveillance. Clin Cancer Res 23, e98-e106, doi:10.1158/1078-0432.CCR-17-0652 (2017). 146 Landero-Huerta, D. A. et al. Epigenetic and risk factors of testicular germ cell tumors: a brief review. Front Biosci (Landmark Ed) 22, 1073-1098, doi:10.2741/4534 (2017). 147 Lynch, H. T. et al. Familial sarcoma: challenging pedigrees. Cancer 98, 1947-1957, doi:10.1002/cncr.11743 (2003). 148 Frank, C., Sundquist, J., Hemminki, A. & Hemminki, K. Risk of other Cancers in Families with Melanoma: Novel Familial Links. Sci Rep 7, 42601, doi:10.1038/srep42601 (2017). 149 Marees, T. et al. Risk of second malignancies in survivors of retinoblastoma: more than 40 years of follow-up. J Natl Cancer Inst 100, 1771-1779, doi:10.1093/jnci/djn394 (2008). 150 Sasaki, K. et al. Incidence of second malignancies in patients with chronic myeloid leukemia in the era of tyrosine kinase inhibitors. Int J Hematol 109, 545-552, doi:10.1007/s12185-019-02620-2 (2019). 151 Molica, S. Second neoplasms in chronic lymphocytic leukemia: incidence and pathogenesis with emphasis on the role of different therapies. Leuk Lymphoma 46, 49-54, doi:10.1080/10428190400007524 (2005). 88 152 Baker, H. Second cancer risk for Hodgkin's lymphoma survivors. Lancet Oncol 17, e50, doi:10.1016/S1470-2045(16)00002-4 (2016). 153 Chattopadhyay, S. et al. Second primary cancers in non-Hodgkin lymphoma: Bidirectional analyses suggesting role for immune dysfunction. Int J Cancer 143, 2449- 2457, doi:10.1002/ijc.31801 (2018). 154 Applebaum, M. A. et al. Second malignancies in patients with neuroblastoma: the effects of risk-based therapy. Pediatr Blood Cancer 62, 128-133, doi:10.1002/pbc.25249 (2015). 155 Fantus, R. J. & Helfand, B. T. 
Germline Genetics of Prostate Cancer: Time to Incorporate Genetics into Early Detection Tools. Clin Chem 65, 74-79, doi:10.1373/clinchem.2018.286658 (2019). 156 Yadav, S. & Couch, F. J. Germline Genetic Testing for Breast Cancer Risk: The Past, Present, and Future. Am Soc Clin Oncol Educ Book 39, 61-74, doi:10.1200/EDBK_238987 (2019). 157 McNerney, M. E., Godley, L. A. & Le Beau, M. M. Therapy-related myeloid neoplasms: when genetics and environment collide. Nat Rev Cancer 17, 513-527, doi:10.1038/nrc.2017.60 (2017). 158 Mazonakis, M., Kachris, S. & Damilakis, J. Second Cancer Risk from Radiation Therapy for Common Solid Tumors Diagnosed in Reproductive-Aged Females. Radiat Prot Dosimetry 182, 208-214, doi:10.1093/rpd/ncy050 (2018). 159 Turcotte, L. M. et al. Chemotherapy and Risk of Subsequent Malignant Neoplasms in the Childhood Cancer Survivor Study Cohort. J Clin Oncol 37, 3310-3319, doi:10.1200/JCO.19.00129 (2019). 160 Valdez, J. M., Nichols, K. E. & Kesserwan, C. Li-Fraumeni syndrome: a paradigm for the understanding of hereditary cancer predisposition. Br J Haematol 176, 539-552, doi:10.1111/bjh.14461 (2017). 161 Churpek, J. E. et al. Inherited mutations in cancer susceptibility genes are common among survivors of breast cancer who develop therapy-related leukemia. Cancer 122, 304-311, doi:10.1002/cncr.29615 (2016). 89 162 Reynolds, A. W. et al. Comparing signals of natural selection between three Indigenous North American populations. Proc Natl Acad Sci U S A 116, 9312-9317, doi:10.1073/pnas.1819467116 (2019). 163 Lindo, J. et al. A time transect of exomes from a Native American population before and after European contact. Nat Commun 7, 13175, doi:10.1038/ncomms13175 (2016). 164 O'Fallon, B. D. & Fehren-Schmitz, L. Native Americans experienced a strong population bottleneck coincident with European contact. Proc Natl Acad Sci U S A 108, 20444- 20448, doi:10.1073/pnas.1112563108 (2011). 165 Goldin, L. R. et al. 
Familial aggregation and heterogeneity of non-Hodgkin lymphoma in population-based samples. Cancer Epidemiol Biomarkers Prev 14, 2402-2406, doi:10.1158/1055-9965.EPI-05-0346 (2005). 166 Goldin, L. R., Bjorkholm, M., Kristinsson, S. Y., Turesson, I. & Landgren, O. Elevated risk of chronic lymphocytic leukemia and other indolent non-Hodgkin's lymphomas among relatives of patients with chronic lymphocytic leukemia. Haematologica 94, 647- 653, doi:10.3324/haematol.2008.003632 (2009). 167 Goldin, L. R., Bjorkholm, M., Kristinsson, S. Y., Turesson, I. & Landgren, O. Highly increased familial risks for specific lymphoma subtypes. Br J Haematol 146, 91-94, doi:10.1111/j.1365-2141.2009.07721.x (2009). 168 Kristinsson, S. Y. et al. Risk of lymphoproliferative disorders among first-degree relatives of lymphoplasmacytic lymphoma/Waldenstrom macroglobulinemia patients: a population-based study in Sweden. Blood 112, 3052-3056, doi:10.1182/blood-2008-06- 162768 (2008). 169 Goldin, L. R. et al. Familial aggregation of Hodgkin lymphoma and related tumors. Cancer 100, 1902-1908, doi:10.1002/cncr.20189 (2004). 170 Chang, E. T. et al. Family history of hematopoietic malignancy and risk of lymphoma. J Natl Cancer Inst 97, 1466-1474, doi:10.1093/jnci/dji293 (2005). 90 171 Smedby, K. E. et al. GWAS of follicular lymphoma reveals allelic heterogeneity at 6p21.32 and suggests shared genetic susceptibility with diffuse large B-cell lymphoma. PLoS Genet 7, e1001378, doi:10.1371/journal.pgen.1001378 (2011). 172 Skibola, C. F. et al. Genetic variants at 6p21.33 are associated with susceptibility to follicular lymphoma. Nat Genet 41, 873-875, doi:10.1038/ng.419 (2009). 173 Cerhan, J. R. et al. Genome-wide association study identifies multiple susceptibility loci for diffuse large B cell lymphoma. Nat Genet 46, 1233-1238, doi:10.1038/ng.3105 (2014). 174 Tan, D. E. et al. 
Genome-wide association study of B cell non-Hodgkin lymphoma identifies 3q27 as a susceptibility locus in the Chinese population. Nat Genet 45, 804- 807, doi:10.1038/ng.2666 (2013). 175 Vijai, J. et al. A genome-wide association study of marginal zone lymphoma shows association to the HLA region. Nat Commun 6, 5751, doi:10.1038/ncomms6751 (2015). 176 Enciso-Mora, V. et al. A genome-wide association study of Hodgkin's lymphoma identifies new susceptibility loci at 2p16.1 (REL), 8q24.21 and 10p14 (GATA3). Nat Genet 42, 1126-1130, doi:10.1038/ng.696 (2010). 177 Frampton, M. et al. Variation at 3p24.1 and 6q23.3 influences the risk of Hodgkin's lymphoma. Nat Commun 4, 2549, doi:10.1038/ncomms3549 (2013). 178 Urayama, K. Y. et al. Genome-wide association study of classical Hodgkin lymphoma and Epstein-Barr virus status-defined subgroups. J Natl Cancer Inst 104, 240-253, doi:10.1093/jnci/djr516 (2012). 179 Moutsianas, L. et al. Multiple Hodgkin lymphoma-associated loci within the HLA region at chromosome 6p21.3. Blood 118, 670-674, doi:10.1182/blood-2011-03-339630 (2011). 180 Cozen, W. et al. A genome-wide meta-analysis of nodular sclerosing Hodgkin lymphoma identifies risk loci at 6p21.32. Blood 119, 469-475, doi:10.1182/blood-2011-03-343921 (2012). 91 181 Cozen, W. et al. A meta-analysis of Hodgkin lymphoma reveals 19p13.3 TCF3 as a novel susceptibility locus. Nat Commun 5, 3856, doi:10.1038/ncomms4856 (2014). 182 Vijai, J. et al. Susceptibility loci associated with specific and shared subtypes of lymphoid malignancies. PLoS Genet 9, e1003220, doi:10.1371/journal.pgen.1003220 (2013). 183 Frazer, K. A., Murray, S. S., Schork, N. J. & Topol, E. J. Human genetic variation and its contribution to complex traits. Nat Rev Genet 10, 241-251, doi:10.1038/nrg2554 (2009). 
184 Cancer Facts & Figures for Hispanics/Latinos 2018-2020, <https://www.cancer.org/content/dam/cancer-org/research/cancer-facts-and- statistics/cancer-facts-and-figures-for-hispanics-and-latinos/cancer-facts-and-figures-for- hispanics-and-latinos-2018-2020.pdf> ( 185 McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297-1303, doi:10.1101/gr.107524.110 (2010). 186 DePristo, M. A. et al. A framework for variation discovery and genotyping using next- generation DNA sequencing data. Nat Genet 43, 491-498, doi:10.1038/ng.806 (2011). 187 Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics 43, 11 10 11-11 10 33, doi:10.1002/0471250953.bi1110s43 (2013). 188 Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38, e164, doi:10.1093/nar/gkq603 (2010). 189 Karczewski, K. J. et al. Author Correction: The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 590, E53, doi:10.1038/s41586-020-03174-8 (2021). 190 Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434-443, doi:10.1038/s41586-020-2308-7 (2020). 92 191 Trans-Omics for Precision Medicine (TOPMed) Program, <https://www.nhlbi.nih.gov/science/trans-omics-precision-medicine-topmed-program> (2014). 192 Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res 46, D1062-D1067, doi:10.1093/nar/gkx1153 (2018). 193 Edmonson, M. N. et al. Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE): a cloud-based platform for curating and classifying germline variants. Genome Res 29, 1555-1565, doi:10.1101/gr.250357.119 (2019). 194 Kraft, I. L. & Godley, L. A. 
Identifying potential germline variants from sequencing hematopoietic malignancies. Blood 136, 2498-2506, doi:10.1182/blood.2020006910 (2020). 195 Grobner, S. N. et al. The landscape of genomic alterations across childhood cancers. Nature 555, 321-327, doi:10.1038/nature25480 (2018). 196 Grobner, S. N. et al. Author Correction: The landscape of genomic alterations across childhood cancers. Nature 559, E10, doi:10.1038/s41586-018-0167-2 (2018). 197 Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19, 1655-1664, doi:10.1101/gr.094052.109 (2009). 198 Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am J Hum Genet 93, 278-288, doi:10.1016/j.ajhg.2013.06.020 (2013). 199 Klco, J. M. & Mullighan, C. G. Advances in germline predisposition to acute leukaemias and myeloid neoplasms. Nat Rev Cancer 21, 122-137, doi:10.1038/s41568-020-00315-z (2021). 200 Stieglitz, E. & Loh, M. L. Genetic predispositions to childhood leukemia. Ther Adv Hematol 4, 270-290, doi:10.1177/2040620713498161 (2013). 93 201 Salmoiraghi, S., Rambaldi, A. & Spinelli, O. TP53 in adult acute lymphoblastic leukemia. Leuk Lymphoma 59, 778-789, doi:10.1080/10428194.2017.1344839 (2018). 202 Barbosa, K., Li, S., Adams, P. D. & Deshpande, A. J. The role of TP53 in acute myeloid leukemia: Challenges and opportunities. Genes Chromosomes Cancer 58, 875-888, doi:10.1002/gcc.22796 (2019). 203 Jain, P. & Wang, M. Mantle cell lymphoma: 2019 update on the diagnosis, pathogenesis, prognostication, and management. Am J Hematol 94, 710-725, doi:10.1002/ajh.25487 (2019). 204 Eskelund, C. W. et al. TP53 mutations identify younger mantle cell lymphoma patients who do not benefit from intensive chemoimmunotherapy. Blood 130, 1903-1910, doi:10.1182/blood-2017-04-779736 (2017). 205 Deng, M. et al. 
Aggressive B-cell Lymphoma with MYC/TP53 Dual Alterations Displays Distinct Clinicopathobiological Features and Response to Novel Targeted Agents. Mol Cancer Res 19, 249-260, doi:10.1158/1541-7786.MCR-20-0466 (2021). 206 Pasquet, M. et al. High frequency of GATA2 mutations in patients with mild chronic neutropenia evolving to MonoMac syndrome, myelodysplasia, and acute myeloid leukemia. Blood 121, 822-829, doi:10.1182/blood-2012-08-447367 (2013). 207 Kennedy, A. L. & Shimamura, A. Genetic predisposition to MDS: clinical features and clonal evolution. Blood 133, 1071-1085, doi:10.1182/blood-2018-10-844662 (2019). 208 Kanagal-Shamanna, R. et al. Molecular characterization of Novel ATM fusions in chronic lymphocytic leukemia and T-cell prolymphocytic leukemia. Leuk Lymphoma, 1- 11, doi:10.1080/10428194.2021.2010061 (2021). 209 Bea, S. et al. Landscape of somatic mutations and clonal evolution in mantle cell lymphoma. Proc Natl Acad Sci U S A 110, 18250-18255, doi:10.1073/pnas.1314608110 (2013). 210 Lin, L. et al. Mutant IDH1 Enhances Temozolomide Sensitivity via Regulation of the ATM/CHK2 Pathway in Glioma. Cancer Res Treat 53, 367-377, doi:10.4143/crt.2020.506 (2021). 94 211 Uhl, G. R. & Martinez, M. J. PTPRD: neurobiology, genetics, and initial pharmacology of a pleiotropic contributor to brain phenotypes. Ann N Y Acad Sci 1451, 112-129, doi:10.1111/nyas.14002 (2019). 212 Song, L. et al. Protein tyrosine phosphatases receptor type D is a potential tumour suppressor gene inactivated by deoxyribonucleic acid methylation in paediatric acute myeloid leukaemia. Acta Paediatr 105, e132-141, doi:10.1111/apa.13284 (2016). 213 Depreter, B. et al. Deciphering molecular heterogeneity in pediatric AML using a cancer vs. normal transcriptomic approach. Pediatr Res 89, 1695-1705, doi:10.1038/s41390- 020-01199-3 (2021). 214 Rangrez, A. Y. et al. Myeloid leukemia factor-1 is a novel modulator of neonatal rat cardiomyocyte proliferation. 
Biochim Biophys Acta Mol Cell Res 1864, 634-644, doi:10.1016/j.bbamcr.2017.01.004 (2017).

Appendices

Appendix A: Supplementary materials for Chapter 1

Supplementary table 1-1. Age-adjusted incidence rate (AAIR) and annual percent change (APC) of acute lymphocytic leukemia (ALL) by selected demographic characteristics in California.

Group | Overall N | Overall AAIR 1 (95% CI) | Overall APC 2 (95% CI) | Non-Latino White N | Non-Latino White AAIR 1 (95% CI) | Non-Latino White APC 2 (95% CI) | Latino all races N | Latino all races AAIR 1 (95% CI) | Latino all races APC 2 (95% CI)
Overall | 11,840 | 1.92 (1.88, 1.95) | 1.25* (0.94, 1.57) | 3,828 | 1.62 (1.56, 1.67) | 0.71 (-0.02, 1.46) | 6,386 | 2.56† (2.49, 2.63) | 1.15* (0.80, 1.50)
Male | 6,694 | 2.16 (2.11, 2.22) | 1.51* (1.03, 1.99) | 2,219 | 1.88 (1.80, 1.96) | 0.76 (-0.39, 1.94) | 3,581 | 2.79† (2.69, 2.89) | 1.64* (1.12, 2.15)
Female | 5,146 | 1.67 (1.62, 1.72) | 1.00* (0.39, 1.61) | 1,609 | 1.36 (1.29, 1.43) | 0.69 (-0.51, 1.91) | 2,805 | 2.32† (2.23, 2.41) | 0.71 (-0.04, 1.46)
Age 0-14 years | 6,076 | 4.64 (4.53, 4.76) | 0.69* (0.25, 1.14) | 1,707 | 4.24 (4.04, 4.44) | 0.52 (-0.58, 1.62) | 3,529 | 5.36† (5.19, 5.55) | 0.70* (0.12, 1.28)
Age 15-39 years | 2,713 | 1.19 (1.15, 1.24) | 2.28* (1.48, 3.09) | 667 | 0.82 (0.76, 0.88) | 0.54 (-1.18, 2.30) | 1,703 | 1.73† (1.65, 1.82) | 2.31* (1.48, 3.14)
Age 40+ years | 3,051 | 1.15 (1.11, 1.19) | 1.50* (0.81, 2.20) | 1,454 | 0.97 (0.92, 1.02) | 1.17* (0.07, 2.28) | 1,154 | 1.85† (1.74, 1.96) | 0.88 (-0.14, 1.91)

Group | Non-Latino Black N | Non-Latino Black AAIR 1 (95% CI) | Non-Latino Black APC 2 (95% CI) | Non-Latino API N | Non-Latino API AAIR 1 (95% CI) | Non-Latino API APC 2 (95% CI) | Non-Latino AIAN N | Non-Latino AIAN AAIR 1 (95% CI) | Non-Latino AIAN APC 2 (95% CI)
Overall | 432 | 1.08† (0.98, 1.18) | 1.46 (-0.87, 3.84) | 1,091 | 1.41† (1.33, 1.50) | 0.49 (-1.04, 2.04) | 58 | 1.63 (1.24, 2.12) | NA
Male | 231 | 1.17† (1.02, 1.34) | 1.82 (-1.28, 5.01) | 606 | 1.59 (1.47, 1.73) | 0.30 (-2.40, 3.08) | 33 | 1.89 (1.30, 2.68) | NA
Female | 201 | 0.99 (0.86, 1.14) | 0.70 (-2.12, 3.60) | 485 | 1.24 (1.13, 1.36) | 0.66 (-1.14, 2.50) | 25 | 1.38 (0.89, 2.05) | NA
Age 0-14 years | 212 | 2.41† (2.10, 2.76) | 1.38 (-1.90, 4.78) | 564 | 3.74† (3.44, 4.06) | -0.24 (-1.59, 1.13) | 30 | 4.18 (2.82, 5.97) | NA
Age 15-39 years | 88 | 0.58† (0.46, 0.71) | 2.00 (-1.82, 5.96) | 236 | 0.76 (0.66, 0.86) | 3.28 (0.09, 6.57) | 11 | 0.83 (0.41, 1.50) | NA
Age 40+ years | 132 | 0.82† (0.68, 0.98) | 1.81 (-2.54, 6.35) | 291 | 0.79† (0.70, 0.88) | 0.23 (-2.68, 3.22) | 17 | 1.02 (0.59, 1.67) | NA

1 AAIR per 100,000 persons, diagnosed from 2000-2016, SEER18 data.
2 Annual percent change (APC) for 2000-2016 using SEER18 data.
*Statistically significant change in incidence from 2000 to 2016.
†Statistically significant p-value of the Student's t-test compared to non-Latino White.

Supplementary table 1-2. Percent foreign-born in 2000 and Yost index by race/ethnicity and county.

County | NL White: No. of cases 1 | NL White: percent foreign-born quintile 2 | NL White: Yost index quintile 3 | Latino all races: No. of cases 1 | Latino all races: percent foreign-born quintile 2 | Latino all races: Yost index quintile 3 | NL Black: No. of cases 1 | NL Black: percent foreign-born quintile 2 | NL Black: Yost index quintile 3
CA: Alameda County (06001) | 123 | High | High | 125 | High | Highest | 35 | High | Highest
CA: Amador County (06005) | 7 | Lowest | Medium | NA | NA | NA | NA | NA | NA
CA: Calaveras County (06009) | 15 | Lowest | Medium | NA | NA | NA | NA | NA | NA
CA: Contra Costa County (06013) | 128 | Medium | Highest | 98 | Medium | Highest | 23 | Medium | Highest
CA: Del Norte County (06015) | 8 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
CA: El Dorado County (06017) | 40 | Low | High | NA | NA | NA | NA | NA | NA
CA: Fresno County (06019) | 95 | High | Lowest | 212 | High | Lowest | 7 | High | Low
CA: Humboldt County (06023) | 27 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
CA: Imperial County (06025) | NA | NA | NA | 66 | Highest | Lowest | NA | NA | NA
CA: Kern County (06029) | 77 | Medium | Lowest | 190 | Medium | Lowest | NA | NA | NA
CA: Kings County (06031) | 14 | Medium | Lowest | 50 | Medium | Lowest | NA | NA | NA
CA: Lake County (06033) | 13 | Low | Lowest | NA | NA | NA | NA | NA | NA
CA: Lassen County (06035) | 8 | Lowest | Low | NA | NA | NA | NA | NA | NA
CA: Los Angeles Registry (06037) | 693 | Highest | Low | 2295 | Highest | Medium | 161 | Highest | Medium
CA: Madera County (06039) | 13 | High | Lowest | 38 | High | Lowest | NA | NA | NA
CA: Marin County (06041) | 39 | Medium | Highest | NA | NA | NA | NA | NA | NA
CA: Mendocino County (06045) | 19 | Low | Low | NA | NA | NA | NA | NA | NA
CA: Merced County (06047) | 20 | High | Lowest | 60 | High | Lowest | NA | NA | NA
CA: Monterey County (06053) | 38 | High | Medium | 105 | High | High | NA | NA | NA
CA: Napa County (06055) | 22 | Medium | High | 24 | Medium | Highest | NA | NA | NA
CA: Nevada County (06057) | 19 | Lowest | Medium | NA | NA | NA | NA | NA | NA
CA: Orange County (06059) | 341 | High | High | 512 | High | Highest | NA | NA | NA
CA: Placer County (06061) | 83 | Low | Highest | 17 | Low | Highest | NA | NA | NA
CA: Riverside County (06065) | 208 | Medium | Medium | 387 | Medium | Medium | 24 | Medium | Medium
CA: Sacramento County (06067) | 163 | Medium | Medium | 121 | Medium | High | 31 | Medium | High
CA: San Benito County (06069) | NA | NA | NA | 11 | Medium | Highest | NA | NA | NA
CA: San Bernardino County (06071) | 200 | Medium | Medium | 456 | Medium | Medium | 31 | Medium | Medium
CA: San Diego County (06073) | 370 | High | Medium | 460 | High | High | 34 | High | High
CA: San Francisco County (06075) | 56 | Highest | High | 43 | Highest | Highest | NA | NA | NA
CA: San Joaquin County (06077) | 58 | High | Low | 131 | High | Low | NA | NA | NA
CA: San Luis Obispo County (06079) | 53 | Low | Medium | 12 | Low | High | NA | NA | NA
CA: San Mateo County (06081) | 81 | Highest | Highest | 80 | Highest | Highest | NA | NA | NA
CA: Santa Barbara County (06083) | 44 | High | Medium | NA | NA | NA | NA | NA | NA
CA: Santa Clara County (06085) | 171 | Highest | Highest | 216 | Highest | Highest | 12 | Highest | Highest
CA: Santa Cruz County (06087) | 42 | Medium | High | 43 | Medium | Highest | NA | NA | NA
CA: Shasta County (06089) | 37 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
CA: Solano County (06095) | 44 | Medium | High | 39 | Medium | Highest | 12 | Medium | Highest
CA: Sonoma County (06097) | 83 | Medium | High | 53 | Medium | Highest | NA | NA | NA
CA: Stanislaus County (06099) | 56 | Medium | Low | NA | NA | NA | NA | NA | NA
CA: Sutter County (06101) | 9 | High | Low | 14 | High | Lowest | NA | NA | NA
CA: Tehama County (06103) | NA | NA | NA | 9 | Low | Lowest | NA | NA | NA
CA: Tulare County (06107) | 39 | High | Lowest | 109 | High | Lowest | 3 | High | Low
CA: Tuolumne County (06109) | 10 | Lowest | Low | NA | NA | NA | NA | NA | NA
CA: Ventura County (06111) | 133 | High | Highest | 137 | High | Highest | NA | NA | NA
CA: Yolo County (06113) | 30 | High | Low | NA | NA | NA | NA | NA | NA
CT: Fairfield County (09001) | 165 | Medium | Highest | 65 | Medium | Highest | 17 | Medium | Highest
CT: Hartford County (09003) | 135 | Low | High | 46 | Low | High | NA | NA | NA
CT: Litchfield County (09005) | 39 | Lowest | Highest | NA | NA | NA | NA | NA | NA
CT: Middlesex County (09007) | 31 | Low | Highest | NA | NA | NA | NA | NA | NA
CT: New Haven County (09009) | 152 | Low | High | NA | NA | NA | 13 | Low | High
CT: New London County (09011) | 60 | Lowest | High | NA | NA | NA | NA | NA | NA
CT: Tolland County (09013) | 32 | Lowest | Highest | NA | NA | NA | NA | NA | NA
CT: Windham County (09015) | 21 | Lowest | Medium | NA | NA | NA | NA | NA | NA
GA: Bartow County (13015) | 14 | Lowest | Medium | NA | NA | NA | NA | NA | NA
GA: Bibb County (13021) | 17 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Bulloch County (13031) | 11 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Chatham County (13051) | 36 | Lowest | Low | NA | NA | NA | 19 | Lowest | Low
GA: Cherokee County (13057) | 38 | Lowest | Highest | NA | NA | NA | NA | NA | NA
GA: Clayton County (13063) | 11 | Low | Medium | 14 | Low | High | 21 | Low | Medium
GA: Cobb County (13067) | 104 | Low | Highest | NA | NA | NA | NA | NA | NA
GA: Colquitt County (13071) | 9 | Low | Lowest | NA | NA | NA | NA | NA | NA
GA: Columbia County (13073) | 22 | Lowest | Highest | NA | NA | NA | NA | NA | NA
GA: Dade County (13083) | 2 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Dawson County (13085) | 7 | Lowest | High | NA | NA | NA | NA | NA | NA
GA: Decatur County (13087) | 6 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: DeKalb County (13089) | 42 | Medium | High | 40 | Medium | High | 63 | Medium | High
GA: Douglas County (13097) | 12 | Lowest | High | NA | NA | NA | NA | NA | NA
GA: Evans County (13109) | 2 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Fannin County (13111) | 5 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Fayette County (13113) | 16 | Lowest | Highest | NA | NA | NA | NA | NA | NA
GA: Forsyth County (13117) | 40 | Low | Highest | 5 | Low | Highest | NA | NA | NA
GA: Fulton County (13121) | 81 | Low | Medium | 28 | Low | High | 54 | Low | High
GA: Glynn County (13127) | 17 | Lowest | Low | NA | NA | NA | NA | NA | NA
GA: Gordon County (13129) | 17 | Low | Low | NA | NA | NA | NA | NA | NA
GA: Gwinnett County (13135) | 99 | Medium | Highest | 58 | Medium | Highest | 30 | Medium | Highest
GA: Hall County (13139) | 33 | Medium | Medium | NA | NA | NA | NA | NA | NA
GA: Houston County (13153) | 20 | Lowest | Medium | NA | NA | NA | NA | NA | NA
GA: Jackson County (13157) | 18 | Lowest | Low | NA | NA | NA | 1 | Lowest | Medium
GA: Jones County (13169) | NA | NA | NA | NA | NA | NA | 2 | Lowest | High
GA: Lamar County (13171) | 2 | Lowest | Low | NA | NA | NA | NA | NA | NA
GA: Liberty County (13179) | NA | NA | NA | NA | NA | NA | 5 | Lowest | Low
GA: Morgan County (13211) | 3 | Lowest | Low | NA | NA | NA | NA | NA | NA
GA: Muscogee County (13215) | 16 | Lowest | Lowest | NA | NA | NA | 13 | Lowest | Low
GA: Oglethorpe County (13221) | 2 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Paulding County (13223) | 24 | Lowest | High | NA | NA | NA | NA | NA | NA
GA: Pickens County (13227) | 5 | Lowest | Low | NA | NA | NA | NA | NA | NA
GA: Pierce County (13229) | 7 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Richmond County (13245) | 19 | Lowest | Lowest | NA | NA | NA | 15 | Lowest | Low
GA: Rockdale County (13247) | 11 | Low | High | NA | NA | NA | NA | NA | NA
GA: Spalding County (13255) | 11 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Thomas County (13275) | 11 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Toombs County (13279) | 8 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Troup County (13285) | 7 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Turner County (13287) | NA | NA | NA | NA | NA | NA | 1 | Lowest | Lowest
GA: Upson County (13293) | 5 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Walker County (13295) | 16 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
GA: Whitfield County (13313) | NA | NA | NA | 16 | Medium | Low | NA | NA | NA
HI: Hawaii County (15001) - 2000+ | 13 | Low | Low | NA | NA | NA | NA | NA | NA
HI: Honolulu County (15003) - 2000+ | 38 | Medium | High | NA | NA | NA | NA | NA | NA
IA: Allamakee County (19005) | 4 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Benton County (19011) | 3 | Lowest | Medium | NA | NA | NA | NA | NA | NA
IA: Black Hawk County (19013) | 29 | Lowest | Low | NA | NA | NA | NA | NA | NA
IA: Bremer County (19017) | 5 | Lowest | Low | NA | NA | NA | NA | NA | NA
IA: Buchanan County (19019) | 11 | Lowest | Low | NA | NA | NA | NA | NA | NA
IA: Buena Vista County (19021) | 4 | Low | Lowest | NA | NA | NA | NA | NA | NA
IA: Cerro Gordo County (19033) | 9 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Cherokee County (19035) | 5 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Clayton County (19043) | 7 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Clinton County (19045) | 16 | Lowest | Low | NA | NA | NA | NA | NA | NA
IA: Crawford County (19047) | 4 | Low | Lowest | NA | NA | NA | NA | NA | NA
IA: Dubuque County (19061) | 15 | Lowest | Low | NA | NA | NA | NA | NA | NA
IA: Fayette County (19065) | 8 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Franklin County (19069) | 2 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Greene County (19073) | 3 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Guthrie County (19077) | 4 | Lowest | Low | NA | NA | NA | NA | NA | NA
IA: Ida County (19093) | 1 | Lowest | Lowest | NA | NA | NA | NA | NA | NA
IA: Iowa County (19095) | 8 | Lowest | Medium | NA | NA | NA | NA | NA | NA

1 Number of cases of non-Latino white/ Latino all races/ non-Latino black in each county.
2 Percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data. Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%.
3 The race-specific Yost index by county in 2000 from the Census 2000 ACS data. NL White: lowest, less than 10,405; low, 10,406-11,045; medium, 11,046-11,405; high, 11,406-11,618; highest, more than 11,619. Latino: lowest, less than 10,780; low, 10,781-10,996; medium, 10,997-11,127; high, 11,128-11,537; highest, more than 11,538. Black: lowest, less than 9,747; low, 9,748-10,845; medium, 10,846-11,155; high, 11,156-11,503; highest, more than 11,504.

Supplementary table 1-3. Multivariate Poisson regression model 1 of age-adjusted incidence rates (AAIR) by race/ethnicity and age groups.
Incidence rate ratio (95% CI)

Age group | Percent foreign born 2 | Non-Latino White | Non-Latino Black | Latino
0-14 years | Lowest | Referent | Referent | Referent
0-14 years | Low | 1.07 (0.94, 1.24) | 1.22* (1.00, 1.48) | 0.80 (0.62, 1.03)
0-14 years | Medium | 1.16* (1.00, 1.34) | 2.15* (1.73, 2.67) | 0.75* (0.59, 0.95)
0-14 years | High | 1.18* (1.02, 1.37) | 2.62* (2.06, 3.33) | 0.81 (0.64, 1.03)
0-14 years | Highest | 2.32* (1.93, 2.79) | 3.72* (2.79, 4.92) | 1.15 (0.88, 1.51)
0-14 years | p for trend | <0.001 | <0.001 | 0.155
15-39 years | Lowest | Referent | Referent | Referent
15-39 years | Low | 1.12 (0.97, 1.29) | 1.19 (0.88, 1.59) | 0.70* (0.51, 0.95)
15-39 years | Medium | 1.52* (1.29, 1.79) | 1.52* (1.07, 2.14) | 0.56* (0.42, 0.74)
15-39 years | High | 1.37* (1.16, 1.60) | 2.16* (1.45, 3.16) | 0.62* (0.46, 0.82)
15-39 years | Highest | 2.21* (1.81, 2.69) | 4.31* (2.87, 6.41) | 0.96 (0.71, 1.31)
15-39 years | p for trend | <0.001 | <0.001 | 0.398
40+ years | Lowest | Referent | Referent | Referent
40+ years | Low | 1.01 (0.88, 1.17) | 1.15 (0.89, 1.48) | 0.60* (0.42, 0.87)
40+ years | Medium | 1.12 (0.96, 1.31) | 1.58* (1.15, 2.15) | 0.50* (0.36, 0.70)
40+ years | High | 1.19* (1.02, 1.39) | 2.11* (1.48, 2.97) | 0.52* (0.38, 0.73)
40+ years | Highest | 2.00* (1.66, 2.41) | 3.12* (2.21, 4.38) | 0.74 (0.52, 1.05)
40+ years | p for trend | <0.001 | <0.001 | 0.993

1 The Poisson model using age-adjusted incidence rates (AAIR) as the dependent variable can be denoted as Count = Gender + Year of diagnosis + Percent foreign born quintile + Yost index quintile + offset(ln(weighted population)), where weighted population = population in each category / (standard population in each category / total standard population). Population in each category was the population in each age-, gender-, year of diagnosis-, and percent foreign born- category. Standard population in each category and total standard population were derived from the Census 2000 ACS data.
2 Percent foreign born: percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data. Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%.
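As a rough illustration of footnotes 1 and 2, the sketch below assigns a county to a percent-foreign-born quintile and computes the weighted population whose natural log serves as the model offset. The function names and the example numbers are hypothetical, and the handling of values that fall between the published cut-points (for example, between 5.93% and 5.94%) is an assumption rather than something the footnotes specify.

```python
import math

# Upper bounds (percent) of the first four quintiles, taken from footnote 2.
# Assumption: a value is assigned to the first quintile whose upper bound it
# does not exceed; the footnote does not say how boundary values were handled.
QUINTILE_UPPER_BOUNDS = [
    (5.93, "Lowest"),
    (13.17, "Low"),
    (19.20, "Medium"),
    (29.86, "High"),
]


def foreign_born_quintile(percent_foreign_born: float) -> str:
    """Return the quintile label for a county's percent foreign born."""
    for upper, label in QUINTILE_UPPER_BOUNDS:
        if percent_foreign_born <= upper:
            return label
    return "Highest"


def weighted_population(population, standard_population, total_standard_population):
    """Population divided by the category's share of the standard population."""
    standard_share = standard_population / total_standard_population
    return population / standard_share


def poisson_offset(population, standard_population, total_standard_population):
    """ln(weighted population), the offset term described in footnote 1."""
    return math.log(
        weighted_population(population, standard_population, total_standard_population)
    )


if __name__ == "__main__":
    # Hypothetical category: 1,000 people holding 20% of the standard population.
    print(foreign_born_quintile(15.0))
    print(weighted_population(1000, 200, 1000))
    print(poisson_offset(1000, 200, 1000))
```

In a full analysis the offset would be passed to a Poisson regression routine together with the case counts and covariates; that step is omitted here because the thesis does not name the software used.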
3 Yost index: the race-specific Yost index by county in 2000 from the Census 2000 ACS data. NL White: lowest, less than 10,405; low, 10,406-11,045; medium, 11,046-11,405; high, 11,406-11,618; highest, more than 11,619. Latino: lowest, less than 10,780; low, 10,781-10,996; medium, 10,997-11,127; high, 11,128-11,537; highest, more than 11,538. Black: lowest, less than 9,747; low, 9,748-10,845; medium, 10,846-11,155; high, 11,156-11,503; highest, more than 11,504.
*Statistically significant incidence rate ratio (IRR).

Supplementary table 1-4. Distribution of population by race/ethnicity, percent foreign-born quintiles and Yost index quintiles.

Columns 3-5 are stratified by percent foreign-born quintile 1; columns 6-8 are stratified by Yost index quintile 2.

Race/ethnicity | Quintile | No. of cases 3 | Population in each category 4 | Standard population in each category 4 | No. of cases 3 | Population in each category 4 | Standard population in each category 4
Overall
Non-Latino White | Lowest | 3,695 | 2,523,138 | 91,133,764 | 2,341 | 1,815,276 | 89,847,307
Non-Latino White | Low | 3,145 | 2,079,280 | 91,170,617 | 2,385 | 1,652,696 | 90,770,089
Non-Latino White | Medium | 1,985 | 1,363,236 | 89,983,198 | 2,308 | 1,742,765 | 91,212,633
Non-Latino White | High | 1,817 | 1,193,899 | 90,819,679 | 2,353 | 1,717,989 | 91,321,119
Non-Latino White | Highest | 1,059 | 758,090 | 91,555,220 | 2,314 | 1,584,311 | 91,121,919
Latino all races | Lowest | 389 | 186,215 | 83,299,693 | 1,661 | 607,859 | 88,382,197
Latino all races | Low | 839 | 372,189 | 87,183,966 | 1,770 | 705,105 | 86,451,939
Latino all races | Medium | 1,981 | 729,268 | 89,339,936 | 1,638 | 755,898 | 87,663,067
Latino all races | High | 2,310 | 832,512 | 89,378,391 | 1,581 | 595,125 | 88,284,883
Latino all races | Highest | 2,775 | 955,006 | 88,478,183 | 1,644 | 708,316 | 88,934,028
Non-Latino Black | Lowest | 513 | 537,679 | 87,360,038 | 328 | 503,791 | 89,869,461
Non-Latino Black | Low | 417 | 428,447 | 88,095,143 | 329 | 396,259 | 87,101,266
Non-Latino Black | Medium | 305 | 272,464 | 88,166,999 | 332 | 299,703 | 88,519,764
Non-Latino Black | High | 202 | 189,965 | 85,937,076 | 324 | 353,965 | 86,795,304
Non-Latino Black | Highest | 202 | 178,177 | 89,319,541 | 326 | 304,334 | 87,017,220
California
Non-Latino White | Lowest | 145 | 1,720,649 | 92,961,130 | 788 | 8,933,725 | 91,544,547
Non-Latino White | Low | 270 | 2,816,041 | 91,544,547 | 752 | 10,885,870 | 92,961,130
Non-Latino White | Medium | 1,090 | 12,761,743 | 91,544,547 | 810 | 8,933,725 | 91,544,547
Non-Latino White | High | 1,316 | 15,227,376 | 91,544,547 | 805 | 8,933,725 | 91,544,547
Non-Latino White | Highest | 1,007 | 12,183,655 | 91,544,547 | 673 | 10,747,204 | 91,544,547
Latino all races | Lowest | 17 | 189,933 | 85,620,655 | 1,299 | 7,830,727 | 91,339,527
Latino all races | Low | 81 | 571,419 | 91,194,807 | 1,258 | 11,147,314 | 91,247,268
Latino all races | Medium | 1,580 | 9,636,133 | 91,544,547 | 1,532 | 12,675,259 | 88,215,070
Latino all races | High | 2,008 | 12,029,458 | 89,641,989 | 1,030 | 6,317,657 | 88,724,679
Latino all races | Highest | 2,700 | 15,580,378 | 91,544,547 | 1,267 | 9,422,039 | 91,544,547

1 Percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data. Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%.
2 The race-specific Yost index by county in 2000 from the Census 2000 ACS data (overall) or from the Census 2000 ACS data in California (California). Overall: NL White: lowest, less than 10,405; low, 10,406-11,045; medium, 11,046-11,405; high, 11,406-11,618; highest, more than 11,619. Latino: lowest, less than 10,780; low, 10,781-10,996; medium, 10,997-11,127; high, 11,128-11,537; highest, more than 11,538. Black: lowest, less than 9,747; low, 9,748-10,845; medium, 10,846-11,155; high, 11,156-11,503; highest, more than 11,504. California: NL White: lowest, less than 10,859; low, 10,860-11,048; medium, 11,049-11,395; high, 11,396-11,618; highest, more than 11,619. Latino: lowest, less than 10,815; low, 10,816-10,966; medium, 10,967-11,050; high, 11,051-11,523; highest, more than 11,524.
3 Number of acute lymphoblastic leukemia (ALL) cases.
4 Population and standard population in each category derived from the Census 2000 ACS data.

Supplementary figure 1-1a. Age-adjusted incidence rates (AAIR) for acute lymphocytic leukemia (ALL) among non-Latino Asian and Pacific Islanders (API) from 2000-2016.
Supplementary figure 1-1b.
Age-adjusted incidence rates (AAIR) for acute lymphocytic leukemia (ALL) among non-Latino American Indian, Alaskan Natives (AIAN) from 2000-2016.

Supplementary Figure 1-1 legend: The age-adjusted incidence rates (AAIR) for acute lymphocytic leukemia (ALL) derived from SEER years 2000 to 2016 among (a) non-Latino Asian and Pacific Islanders (API); (b) non-Latino American Indian, Alaskan Natives (AIAN). Each line depicts a different age group. Statistically significant annual percent changes (APC) in AAIR were found in (a) non-Latino API age 15-39 (APC = 1.95 [95% CI: 0.15, 3.79]); (b) non-Latino AIAN age 15-39 (APC = 9.79 [95% CI: 5.65, 14.09]).

[Figure panels 1-1a and 1-1b: x-axis, year of diagnosis (2000-2016); y-axis, age-adjusted incidence rate; one line per age group (0-14, 15-39, 40+).]

Supplementary figure 1-2. Multivariate Poisson regression model of age-adjusted incidence rates (AAIR) among Latinos, excluding Los Angeles County.

Supplementary Figure 1-2 legend: N: Number of cases. IRR: Incidence rate ratio. Percent foreign born: percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data. Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%. Yost index: the race-specific Yost index by county in 2000 from the Census 2000 ACS data. Latino: lowest, less than 10,780; low, 10,781-10,996; medium, 10,997-11,127; high, 11,128-11,537; highest, more than 11,538.
The Poisson model using age-adjusted incidence rates (AAIR) as the dependent variable can be denoted as Count = Age + Gender + Year of diagnosis + Percent foreign born quintile + Yost index quintile + offset(ln(weighted population)), where weighted population = population in each category / (standard population in each category / total standard population). Population in each category was the population in each age-, gender-, year of diagnosis-, and percent foreign born- category. Standard population in each category and total standard population were derived from the Census 2000 ACS data.

Supplementary figure 1-3. The correlation between Yost index and percent foreign born by race/ethnicity groups.

Supplementary Figure 1-3 legend: Each line depicts a race/ethnicity group. The Pearson’s correlations between Yost index and percent foreign born were R = 0.07 for Latinos, R = 0.30 for non-Latino Whites and R = 0.38 for non-Latino Blacks. The regression lines were weighted by the total number of cases diagnosed in each county from 2000-2016.

[Figure 1-3: x-axis, Yost index in 2000; y-axis, foreign born in 2000 (%); race/ethnicity groups: Latino all races, non-Latino White, non-Latino Black; legend for number of cases (500 to 2000); labeled counties: CA: Imperial County, NM: Dona Ana County, NM: Luna County, CA: Los Angeles Registry, CA: San Francisco County, NJ: Hudson County.]

Supplementary figure 1-4a. Multivariate Poisson regression model of age-adjusted incidence rates (AAIR) among non-Latino Whites in California.

Supplementary figure 1-4b. Multivariate Poisson regression model of age-adjusted incidence rates (AAIR) among Latinos in California.

Supplementary Figure 1-4 legend: IRR: Incidence rate ratio. Percent foreign born: percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data.
Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%. Yost index: the race-specific Yost index by county in 2000 from the Census 2000 ACS data in California. NL White: lowest, less than 10,859; low, 10,860-11,048; medium, 11,049-11,395; high, 11,396-11,618; highest, more than 11,619. Latino: lowest, less than 10,815; low, 10,816-10,966; medium, 10,967-11,050; high, 11,051-11,523; highest, more than 11,524. The Poisson model using age-adjusted incidence rates (AAIR) as the dependent variable can be denoted as Count = Age + Gender + Percent foreign born quintile + Yost index quintile + offset(ln(weighted population)), where weighted population = population in each category / (standard population in each category / total standard population). Population in each category was the population in each age-, gender-, year of diagnosis-, and percent foreign born- category. Standard population in each category and total standard population were derived from the Census 2000 ACS data.

Supplementary figure 1-5a. Multivariate Poisson regression model of age-adjusted incidence rates (AAIR) among non-Latino Whites outside California.

Supplementary figure 1-5b. Multivariate Poisson regression model of age-adjusted incidence rates (AAIR) among Latinos outside California.

Supplementary Figure 1-5 legend: IRR: Incidence rate ratio. Percent foreign born: percent of people born in a foreign country by county in 2000 from the Census 2000 ACS data. Lowest, less than 5.93%; low, 5.94%-13.17%; medium, 13.18%-19.20%; high, 19.21%-29.86%; highest, more than 29.87%. Yost index: the race-specific Yost index by county in 2000 from the Census 2000 ACS data in California. NL White: lowest, less than 10,097; low, 10,098-11,012; medium, 11,013-11,416; high, 11,417-11,635; highest, more than 11,636. Latino: lowest, less than 10,418; low, 10,419-11,040; medium, 11,041-11,399; high, 11,400-11,602; highest, more than 11,603.
The Poisson model using age-adjusted incidence rates (AAIR) as the dependent variable can be denoted as Count = Age + Gender + Percent foreign born quintile + Yost index quintile + offset(ln(weighted population)), where weighted population = population in each category / (standard population in each category / total standard population). Population in each category was the population in each age-, gender-, year of diagnosis-, and percent foreign born- category. Standard population in each category and total standard population were derived from the Census 2000 ACS data.

Appendix B: Supplementary materials for Chapter 2

Supplementary Table 2-1. Imputation quality of HIBAG and KIR*Imp.

Average predicted probability (standard deviation) 1

Predicted genetic ancestry | HIBAG | KIR*Imp
Latino all races | 0.85 (0.19) | 0.86 (0.17)
Non-Latino White | 0.95 (0.11) | 0.90 (0.14)
Non-Latino Asian | 0.84 (0.20) | 0.82 (0.18)
Non-Latino Black | 0.62 (0.31) | 0.77 (0.21)

Abbreviations: HIBAG, HLA genotype imputation with attribute bagging; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor.
1 For HIBAG, average predicted probability indicates the average predicted probability for each HLA allele; for KIR*Imp, average predicted probability indicates the average posterior probability for each KIR haplotype.

Supplementary Table 2-2. Activation/inhibition effect for different combinations of maternal KIR and offspring HLA genes.
Maternal KIR gene 1 | Maternal HLA gene 2 | Offspring HLA gene 2 | Licensed inhibition score 3 | Licensed activation score 4
KIR2DL1 | C1/C1 | C1/C2 | 1 |
KIR2DL1 | C1/C2 | C2/C2 | 1 |
KIR2DL1 | C1/C2 | C2/C2 | 1 |
KIR2DL2 | C1/C2 | C1/C1 | 0.5 |
KIR2DL2 | C2/C2 | C1/C2 | 0.5 |
KIR2DL3 | C1/C2 | C1/C1 | 0.5 |
KIR2DL3 | C2/C2 | C1/C2 | 0.5 |
KIR3DL1 | 80I | 80I/80T | 0.5 |
KIR3DL1 | 80I | 80T | 0.5 |
KIR3DL1 | 80I | 80I/80I | 1 |
KIR3DL1 | 80I | 80T/80T | 1 |
KIR3DL1 | NA/NA | 80I/80I | 2 |
KIR3DL1 | NA/NA | 80I/80T | 1.5 |
KIR3DL1 | NA/NA | 80I | 1 |
KIR3DL1 | NA/NA | 80T/80T | 1 |
KIR3DL1 | NA/NA | 80T | 0.5 |
Number of A*32:01 allele
KIR3DL1 | 0 | 1 | 1 |
KIR3DL1 | 1 | 2 | 1 |
KIR3DL1 | 0 | 2 | 2 |
Number of A*23:01 allele
KIR3DL1 | 0 | 1 | 0.5 |
KIR3DL1 | 1 | 2 | 0.5 |
KIR3DL1 | 0 | 2 | 1 |
Number of A*24:02 allele
KIR3DL1 | 0 | 1 | 0.5 |
KIR3DL1 | 1 | 2 | 0.5 |
KIR3DL1 | 0 | 2 | 1 |
Number of A*24:03 allele
KIR3DL1 | 0 | 1 | 0.5 |
KIR3DL1 | 1 | 2 | 0.5 |
KIR3DL1 | 0 | 2 | 1 |
Number of A*25:01 allele
KIR3DL1 | 0 | 1 | 0.5 |
KIR3DL1 | 1 | 2 | 0.5 |
KIR3DL1 | 0 | 2 | 1 |
KIR2DS1 | C1/C1 | C1/C2 | | 1
KIR2DS1 | C1/C2 | C2/C2 | | 1
KIR2DS2 | C1/C2 | C1/C1 | | 1
KIR2DS2 | C2/C2 | C1/C2 | | 1
KIR2DS5 | C1/C1 | C1/C2 | | 1
KIR2DS5 | C1/C2 | C2/C2 | | 1
Number of C*02:02 allele
KIR2DS4 | 0 | 1 | | 1
KIR2DS4 | 1 | 2 | | 1
KIR2DS4 | 0 | 2 | | 2
Number of C*04:01 allele
KIR2DS4 | 0 | 1 | | 1
KIR2DS4 | 1 | 2 | | 1
KIR2DS4 | 0 | 2 | | 2
Number of C*05:01 allele
KIR2DS4 | 0 | 1 | | 1
KIR2DS4 | 1 | 2 | | 1
KIR2DS4 | 0 | 2 | | 2
Number of C*01:02 allele
KIR2DS4 | 0 | 1 | | 1
KIR2DS4 | 1 | 2 | | 1
KIR2DS4 | 0 | 2 | | 2
Number of C*14:02 allele
KIR2DS4 | 0 | 1 | | 1
KIR2DS4 | 1 | 2 | | 1
KIR2DS4 | 0 | 2 | | 2
Number of C*16:01 allele
KIR2DS4 | 0 | 1 | | 1
KIR2DS4 | 1 | 2 | | 1
KIR2DS4 | 0 | 2 | | 2
KIR3DS1 | 80I | 80I/80I | | 1
KIR3DS1 | NA/NA | 80I | | 1
KIR3DS1 | NA/NA | 80I/80T | | 1
KIR3DS1 | NA/NA | 80I/80I | | 2

1 KIR imputed with KIR*Imp or directly genotyped (CCLS samples only).
2 HLA imputed with HIBAG by predicted genetic ancestry group or directly genotyped (CCLS samples only).
3, 4 Licensed inhibition/activation scores computed from inhibition/activation scores in Table 2-1.

Supplementary Table 2-3. Demographics, offspring HLA-C, and maternal KIR carrier frequencies among ALL case and control mother-child pairs in CCRLP and CCLS. Analysis conditioned on paired cases and controls.
Characteristic | Latino all races: Case (N=139) | Latino all races: Control (N=139) | p-value 1 | Non-Latino White: Case (N=56) | Non-Latino White: Control (N=56) | p-value 1 | Non-Latino Asian: Case (N=27) | Non-Latino Asian: Control (N=27) | p-value 1
Study 2 | | | NA | | | NA | | | NA
CCRLP | 76 (54.7%) | 139 (100%) | | 38 (67.9%) | 56 (100%) | | 12 (44.4%) | 27 (100%) |
CCLS | 63 (45.3%) | 0 (0%) | | 18 (32.1%) | 0 (0%) | | 15 (55.6%) | 0 (0%) |
Offspring sex 3 | | | NA | | | NA | | | NA
Female | 53 (38.1%) | 53 (38.1%) | | 27 (48.2%) | 27 (48.2%) | | 8 (29.6%) | 8 (29.6%) |
Male | 86 (61.9%) | 86 (61.9%) | | 29 (51.8%) | 29 (51.8%) | | 19 (70.4%) | 19 (70.4%) |
Offspring HLA-C | | | 0.673 | | | 0.211 | | | 0.094
C1/C1 | 47 (33.8%) | 53 (38.1%) | | 16 (28.6%) | 25 (44.6%) | | 10 (37.0%) | 11 (40.7%) |
C1/C2 | 60 (43.2%) | 59 (42.4%) | | 31 (55.4%) | 24 (42.9%) | | 9 (33.3%) | 14 (51.9%) |
C2/C2 | 32 (23.0%) | 27 (19.4%) | | 9 (16.1%) | 7 (12.5%) | | 8 (29.6%) | 2 (7.4%) |
Maternal KIR | | | 0.039 | | | 0.579 | | | 0.235
A/A | 22 (15.8%) | 10 (7.2%) | | 6 (10.7%) | 9 (16.1%) | | 3 (11.1%) | 0 (0%) |
B/x | 117 (84.2%) | 129 (92.8%) | | 50 (89.3%) | 47 (83.9%) | | 24 (88.9%) | 27 (100%) |
Activation score | | | <0.001 | | | 0.508 | | | 0.015
Mean (SD) | 1.58 (1.34) | 2.32 (1.25) | | 2.05 (1.52) | 1.89 (0.985) | | 1.89 (1.45) | 2.78 (1.12) |
Inhibition score | | | 0.008 | | | 0.768 | | | 0.246
Mean (SD) | 3.41 (0.996) | 3.74 (1.05) | | 3.84 (1.15) | 3.90 (1.08) | | 3.52 (1.03) | 3.87 (1.17) |

Abbreviations: ALL, acute lymphoblastic leukemia; CCRLP, Childhood Cancer Records Linkage Project; CCLS, California Childhood Leukemia Study; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; SD, standard deviation.
1 The difference in distribution by case/control status was analyzed with a Chi-squared test for categorical variables and a two-sample t-test for continuous variables. P-values < 0.05 were considered statistically significant.
2 The p-values were ‘NA’ because there was no control in CCLS.
3 The p-values were ‘NA’ because the mother-child pairs were matched on offspring sex.

Supplementary Table 2-4. Maternal KIR carrier frequencies among ALL case and control mother-child pairs in CCRLP and CCLS.
Gene | Case (N=226) | Control (N=404) | p-value 1 | OR (95% CI) 2
KIR2DL1 | 215 (94.7%) | 393 (97.3%) | 0.121 | 0.67 (0.27, 1.69)
KIR2DL2 | 157 (69.2%) | 351 (86.9%) | <0.001 | 0.45 (0.29, 0.70)
KIR2DL3 | 169 (74.4%) | 250 (61.9%) | 0.002 | 1.58 (1.08, 2.33)
KIR2DL4 | 225 (99.1%) | 404 (100%) | 0.129 | NA
KIR2DL5 | 116 (51.1%) | 251 (62.1%) | 0.009 | 0.76 (0.53, 1.09)
KIR2DP1 | 202 (89.0%) | 395 (97.8%) | <0.001 | 0.28 (0.12, 0.63)
KIR2DS1 | 86 (37.9%) | 152 (37.6%) | 1 | 1.04 (0.72, 1.48)
KIR2DS2 | 151 (66.5%) | 348 (86.1%) | <0.001 | 0.43 (0.28, 0.67)
KIR2DS3 | 47 (20.7%) | 78 (19.3%) | 0.678 | 0.91 (0.58, 1.40)
KIR2DS4 | 214 (94.3%) | 385 (95.3%) | 0.575 | 1.02 (0.47, 2.32)
KIR2DS5 | 58 (25.6%) | 125 (30.9%) | 0.171 | 0.89 (0.60, 1.31)
KIR3DL1 | 219 (96.5%) | 385 (95.3%) | 0.545 | 1.46 (0.61, 3.83)
KIR3DP1 | 222 (97.8%) | 404 (100%) | 0.006 | NA
KIR3DS1 | 98 (43.2%) | 153 (37.9%) | 0.204 | 1.26 (0.88, 1.79)

Abbreviations: ALL, acute lymphoblastic leukemia; CCRLP, Childhood Cancer Records Linkage Project; CCLS, California Childhood Leukemia Study; KIR, killer immunoglobulin-like receptor; OR, odds ratio; CI, confidence interval.
1 The difference in frequencies between the case and control groups was analyzed for statistical significance at the 95% CI using Chi-squared test. p-values < 0.05 were considered statistically significant.
2 Odds ratios were calculated with logistic regression with different gene carriers as the independent factors, with the risk of ALL being the dependent variable, adjusting for the ancestry and offspring sex.

Supplementary Table 2-5. Offspring HLA-C allele and group frequencies among ALL case and control mother-child pairs in CCRLP and CCLS.
Allele | Group | Cases (n=226): group frequency | Cases (n=226): allele frequency | Controls (n=404): group frequency | Controls (n=404): allele frequency | p-value 1, OR (95% CI) 2
C*01:02 | C1 | 0.557 | 0.046 | 0.626 | 0.056 | 1 1.29 (1.02, 1.66)
C*03:02 | C1 | | 0.044 | | 0.009 |
C*03:03 | C1 | | 0.042 | | 0.042 |
C*03:04 | C1 | | 0.070 | | 0.089 |
C*03:39 | C1 | | 0.004 | | 0.000 |
C*07:01 | C1 | | 0.084 | | 0.093 |
C*07:02 | C1 | | 0.134 | | 0.163 |
C*07:04 | C1 | | 0.004 | | 0.016 |
C*07:26 | C1 | | 0.002 | | 0.000 |
C*08:01 | C1 | | 0.009 | | 0.030 |
C*08:02 | C1 | | 0.013 | | 0.031 |
C*08:03 | C1 | | 0.002 | | 0.011 |
C*08:13 | C1 | | 0.009 | | 0.000 |
C*08:25 | C1 | | 0.026 | | 0.000 |
C*12:02 | C1 | | 0.009 | | 0.009 |
C*12:03 | C1 | | 0.031 | | 0.019 |
C*14:02 | C1 | | 0.007 | | 0.014 |
C*16:01 | C1 | | 0.020 | | 0.046 |
C*02:02 | C2 | 0.443 | 0.029 | 0.374 | 0.032 |
C*02:10 | C2 | | 0.002 | | 0.006 |
C*02:87 | C2 | | 0.026 | | 0.000 |
C*04:01 | C2 | | 0.174 | | 0.152 |
C*04:03 | C2 | | 0.002 | | 0.002 |
C*04:10 | C2 | | 0.002 | | 0.000 |
C*05:01 | C2 | | 0.066 | | 0.058 |
C*06:02 | C2 | | 0.066 | | 0.069 |
C*06:11 | C2 | | 0.004 | | 0.000 |
C*12:04 | C2 | | 0.007 | | 0.000 |
C*15:02 | C2 | | 0.046 | | 0.032 |
C*15:05 | C2 | | 0.004 | | 0.001 |
C*16:02 | C2 | | 0.007 | | 0.006 |
C*17:01 | C2 | | 0.007 | | 0.012 |
C*18:01 | C2 | | 0.000 | | 0.001 |

Abbreviations: ALL, acute lymphoblastic leukemia; CCRLP, Childhood Cancer Records Linkage Project; CCLS, California Childhood Leukemia Study; HLA, human leukocyte antigen; OR, odds ratio; CI, confidence interval.
1 The difference in frequencies between the case and control groups was analyzed for statistical significance at the 95% CI using Chi-squared test. p-values < 0.05 were considered statistically significant.
2 Odds ratios were calculated with logistic regression with different alleles as the independent factors, with the risk of ALL being the dependent variable, adjusting for the ancestry and offspring sex.

Supplementary Table 2-6. Conditional logistic regression models 1 assessing the association between childhood ALL case/control status and HLA-KIR interaction. Analysis conditioned on paired cases and controls.
Ancestry | Score | Unadjusted model: odds ratio (95% CI) | Unadjusted model: p-value 2 | Model adjusting for top 5 PCs: odds ratio (95% CI) | Model adjusting for top 5 PCs: p-value 2
Latino | Activation score | 0.68* (0.55, 0.85) | <0.001 | 0.66* (0.53, 0.83) | <0.001
Latino | Inhibition score | 0.82 (0.64, 1.05) | 0.116 | 0.83 (0.64, 1.06) | 0.136
Non-Latino White | Activation score | 1.13 (0.83, 1.54) | 0.420 | 1.38 (0.92, 2.06) | 0.119
Non-Latino White | Inhibition score | 0.91 (0.65, 1.27) | 0.574 | 0.98 (0.64, 1.51) | 0.928
Non-Latino Asian | Activation score | 0.71 (0.47, 1.08) | 0.108 | 0.62 (0.33, 1.18) | 0.144
Non-Latino Asian | Inhibition score | 0.75 (0.39, 1.45) | 0.391 | 0.63 (0.25, 1.59) | 0.326
p-value (LR test) | Activation score 3 | | 0.009 | | 0.009
p-value (LR test) | Inhibition score 4 | | 0.328 | | 0.284
Weighted average 5 | Activation score | 0.78 (0.54, 1.12) | 0.176 | 0.79 (0.45, 1.38) | 0.401
Weighted average 5 | Inhibition score | 0.83 (0.69, 1.01) | 0.064 | 0.84 (0.67, 1.05) | 0.116

Abbreviations: ALL, acute lymphoblastic leukemia; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; PC, principal component; CI, confidence interval.
1 Matched on offspring sex and ancestry. 139 Latino, 56 non-Latino White and 27 non-Latino Asian one-to-one matched mother-child pairs were included.
2 p-value based on a Wald test.
3 LR test comparing the null model (ALL case/control status ~ activation score + inhibition score + genetic ancestry) to the model with an interaction term between ancestry and activation score (ALL case/control status ~ activation score + inhibition score + genetic ancestry + activation score x genetic ancestry).
4 LR test comparing the null model (ALL case/control status ~ activation score + inhibition score + genetic ancestry) to the model with an interaction term between ancestry and inhibition score (ALL case/control status ~ activation score + inhibition score + genetic ancestry + inhibition score x genetic ancestry).
5 The weighted average of odds ratio among Latino all races, non-Latino White, and non-Latino Asian subjects. Calculated with a random-effects meta-analysis model where the weights are the number of subjects in each ancestry group.
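Footnote 5 does not spell out the pooling formula. One simple reading, offered here only as an assumption, is a sample-size-weighted average of the ancestry-specific log odds ratios. The function name below is hypothetical, and the sketch pools point estimates only; it does not estimate between-group variance or reproduce the confidence intervals.

```python
import math


def pooled_odds_ratio(odds_ratios, group_sizes):
    """Average log odds ratios with weights proportional to group size.

    Assumption: weights are the numbers of matched pairs per ancestry group;
    this is an illustrative reading of footnote 5, not the thesis code.
    """
    if len(odds_ratios) != len(group_sizes):
        raise ValueError("one group size is needed per odds ratio")
    total = sum(group_sizes)
    mean_log_or = sum(
        n * math.log(odds_ratio) for odds_ratio, n in zip(odds_ratios, group_sizes)
    ) / total
    return math.exp(mean_log_or)


if __name__ == "__main__":
    # Matched pairs per group from footnote 1: Latino, non-Latino White, non-Latino Asian.
    sizes = [139, 56, 27]
    # Unadjusted ancestry-specific odds ratios from Supplementary Table 2-6.
    activation = [0.68, 1.13, 0.71]
    inhibition = [0.82, 0.91, 0.75]
    print(round(pooled_odds_ratio(activation, sizes), 2))
    print(round(pooled_odds_ratio(inhibition, sizes), 2))
```

With the unadjusted estimates this gives about 0.78 for the activation score and 0.83 for the inhibition score, which agree with the tabulated weighted averages to two decimals, although that agreement does not confirm the method used in the thesis.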
*Odds ratio statistically significantly different from the null with p-value < 0.05.

Supplementary Table 2-7. Logistic regression models 1 assessing the association between childhood ALL case/control status and HLA-KIR interaction considering NK licensing.

Predicted genetic ancestry | Score | Unadjusted model: odds ratio (95% CI) | Unadjusted model: p-value 2 | Model adjusting for top 5 PCs: odds ratio (95% CI) | Model adjusting for top 5 PCs: p-value 2
Latino all races | Licensed activation score | 0.82 (0.61, 1.09) | 0.175 | 0.82 (0.61, 1.09) | 0.176
Latino all races | Licensed inhibition score | 0.95 (0.68, 1.32) | 0.770 | 0.93 (0.66, 1.31) | 0.691
Non-Latino White | Licensed activation score | 1.34 (0.89, 2.04) | 0.158 | 1.29 (0.84, 2.00) | 0.247
Non-Latino White | Licensed inhibition score | 0.87 (0.53, 1.40) | 0.562 | 0.76 (0.45, 1.26) | 0.301
Non-Latino Asian | Licensed activation score | 0.86 (0.45, 1.60) | 0.633 | 0.73 (0.35, 1.46) | 0.387
Non-Latino Asian | Licensed inhibition score | 0.66 (0.27, 1.45) | 0.314 | 0.57 (0.22, 1.35) | 0.219
p-value (LR test) | Licensed activation score 3 | | 0.092 | | 0.077
p-value (LR test) | Licensed inhibition score 4 | | 0.454 | | 0.370
Weighted average 5 | Licensed activation score | 0.94 (0.65, 1.34) | 0.719 | 0.91 (0.64, 1.29) | 0.597
Weighted average 5 | Licensed inhibition score | 0.89 (0.69, 1.16) | 0.396 | 0.84 (0.64, 1.10) | 0.213

1 390 Latino, 160 non-Latino White, and 64 non-Latino Asian mother-child pairs were included.
2 p-value based on a Wald test.
3 LR test comparing the null model (ALL case/control status ~ licensed activation score + licensed inhibition score + genetic ancestry) to the model with an interaction term between ancestry and activation score (ALL case/control status ~ licensed activation score + licensed inhibition score + genetic ancestry + licensed activation score x genetic ancestry).
4 LR test comparing the null model (ALL case/control status ~ licensed activation score + licensed inhibition score + genetic ancestry) to the model with an interaction term between ancestry and inhibition score (ALL case/control status ~ licensed activation score + licensed inhibition score + genetic ancestry + licensed inhibition score x genetic ancestry).
5 The weighted average of odds ratio among Latino all races, non-Latino White, and non-Latino Asian subjects. Calculated with a random-effects meta-analysis model where the weights are the number of subjects in each predicted genetic ancestry group.
Abbreviations: ALL, acute lymphoblastic leukemia; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; PC, principal component; CI, confidence interval; LR test, log-likelihood ratio test.
*Odds ratio statistically significantly different from the null with p-value < 0.05.

Supplementary Table 2-8. Demographics, offspring HLA-C and maternal KIR interaction among mothers who ever or never had a fetal loss in CCRLP and CCLS.
Variable | Latino all races: Ever fetal loss (N=76) | Latino all races: Never (N=285) | Latino all races: p-value 1 | Non-Latino White: Ever fetal loss (N=30) | Non-Latino White: Never (N=121) | Non-Latino White: p-value 1 | Non-Latino Asian: Ever fetal loss (N=14) | Non-Latino Asian: Never (N=43) | Non-Latino Asian: p-value 1
Status | | | 0.223 | | | 0.609 | | | 0.705
Case | 28 (36.8%) | 82 (28.8%) | | 11 (36.7%) | 36 (29.8%) | | 6 (42.9%) | 14 (32.6%) |
Control | 48 (63.2%) | 203 (71.2%) | | 19 (63.3%) | 85 (70.2%) | | 8 (57.1%) | 29 (67.4%) |
Study | | | 0.301 | | | 0.540 | | | 0.174
CCRLP | 66 (86.8%) | 261 (91.6%) | | 27 (90.0%) | 115 (95.0%) | | 10 (71.4%) | 39 (90.7%) |
CCLS | 10 (13.2%) | 24 (8.4%) | | 3 (10.0%) | 6 (5.0%) | | 4 (28.6%) | 4 (9.3%) |
Offspring sex | | | 0.178 | | | 0.797 | | | 0.951
Female | 37 (48.7%) | 112 (39.3%) | | 11 (36.7%) | 50 (41.3%) | | 6 (42.9%) | 16 (37.2%) |
Male | 39 (51.3%) | 173 (60.7%) | | 19 (63.3%) | 71 (58.7%) | | 8 (57.1%) | 27 (62.8%) |
Activation score | | | 0.375 | | | 0.463 | | | 0.514
Mean (SD) | 2.13 (1.25) | 2.28 (1.34) | | 1.97 (1.19) | 2.15 (1.28) | | 2.14 (1.66) | 2.47 (1.28) |
Inhibition score | | | 0.441 | | | 0.213 | | | 0.510
Mean (SD) | 3.57 (1.10) | 3.67 (1.02) | | 4.12 (0.953) | 3.86 (1.10) | | 3.46 (1.13) | 3.70 (1.13) |
Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; CCLS, California Childhood Leukemia Study; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; SD, standard deviation.
1 The difference in distribution by case/control status was analyzed with a Chi-squared test for categorical variables and a two-sample t-test for continuous variables. P-values < 0.05 were considered statistically significant.

Supplementary Table 2-9. Demographics and distribution of VSN normalized cytokines among CCRLP subjects.
Variable | Latino all races: Case (N=76) | Latino all races: Control (N=247) | Latino all races: p-value 1 | Non-Latino White: Case (N=38) | Non-Latino White: Control (N=101) | Non-Latino White: p-value 1 | Non-Latino Asian: Case (N=11) | Non-Latino Asian: Control (N=37) | Non-Latino Asian: p-value 1
Offspring sex | | | 1.000 | | | 0.178 | | | 0.249
Female | 32 (42.1%) | 106 (42.9%) | | 19 (50.0%) | 36 (35.6%) | | 2 (18.2%) | 16 (43.2%) |
Male | 44 (57.9%) | 141 (57.1%) | | 19 (50.0%) | 65 (64.4%) | | 9 (81.8%) | 21 (56.8%) |
Age at collection (days) | | | 0.046 | | | 0.993 | | | <0.001
<25 | 16 (21.1%) | 92 (37.2%) | | 12 (31.6%) | 32 (31.7%) | | 3 (27.3%) | 14 (37.8%) |
25-27 | 15 (19.7%) | 41 (16.6%) | | 5 (13.2%) | 12 (11.9%) | | 7 (63.6%) | 11 (29.7%) |
28-38 | 21 (27.6%) | 63 (25.5%) | | 9 (23.7%) | 26 (25.7%) | | 0 (0%) | 8 (21.6%) |
>=39 | 24 (31.6%) | 51 (20.6%) | | 12 (31.6%) | 31 (30.7%) | | 1 (9.1%) | 4 (10.8%) |
Birthweight (g) | | | 0.507 | | | 0.640 | | | 0.425
<2,500 | 3 (3.9%) | 10 (4.0%) | | 1 (2.6%) | 4 (4.0%) | | 0 (0%) | 1 (2.7%) |
2,500-2,999 | 13 (17.1%) | 42 (17.0%) | | 6 (15.8%) | 14 (13.9%) | | 2 (18.2%) | 5 (13.5%) |
3,000-3,499 | 31 (40.8%) | 92 (37.2%) | | 16 (42.1%) | 34 (33.7%) | | 8 (72.7%) | 17 (45.9%) |
3,500-3,999 | 25 (32.9%) | 72 (29.1%) | | 13 (34.2%) | 35 (34.7%) | | 1 (9.1%) | 13 (35.1%) |
>=4,000 | 4 (5.3%) | 31 (12.6%) | | 2 (5.3%) | 14 (13.9%) | | 0 (0%) | 1 (2.7%) |
Gestational week | | | 0.773 | | | 0.975 | | | 0.617
26-36 | 8 (10.5%) | 24 (9.7%) | | 4 (10.5%) | 10 (9.9%) | | 1 (9.1%) | 4 (10.8%) |
37-41 | 62 (81.6%) | 202 (81.8%) | | 30 (78.9%) | 81 (80.2%) | | 9 (81.8%) | 28 (75.7%) |
42-44 | 5 (6.6%) | 11 (4.5%) | | 3 (7.9%) | 7 (6.9%) | | 0 (0%) | 3 (8.1%) |
Activation score | | | 0.306 | | | 0.206 | | | 0.296
Mean (SD) | 2.28 (1.25) | 2.45 (1.26) | | 2.42 (1.50) | 2.08 (1.11) | | 2.18 (1.17) | 2.62 (1.26) |
Inhibition score | | | 0.364 | | | 0.328 | | | 0.468
Mean (SD) | 3.63 (0.943) | 3.75 (1.03) | | 4.09 (1.16) | 3.88 (1.04) | | 3.41 (1.02) | 3.68 (1.13) |
VSN normalized cytokines
IL-1b | | | 0.409 | | | 0.877 | | | 0.724
Mean (SD) | -0.134 (1.09) | -0.0175 (1.01) | | -0.0990 (1.11) | -0.0673 (0.972) | | 0.0611 (0.898) | 0.176 (1.02) |
IL-2 | | | 0.722 | | | 0.023 | | | 0.759
Mean (SD) | 0.0326 (0.997) | -0.0145 (1.04) | | 0.389 (0.864) | -0.00615 (0.974) | | -0.0900 (1.01) | -0.195 (0.870) |
IL-4 | | | 0.728 | | | 0.641 | | | 0.969
Mean (SD) | 0.0574 (0.955) | 0.0125 (1.07) | | -0.0805 (0.901) | -0.162 (0.949) | | -0.194 (0.849) | -0.205 (0.887) |
IL-6 | | | 0.726 | | | 0.820 | | | 0.735
Mean (SD) | 0.0224 (0.994) | 0.0681 (0.990) | | -0.200 (1.01) | -0.156 (1.03) | | 0.0195 (0.836) | 0.126 (1.10) |
IL-8 | | | 0.167 | | | 0.838 | | | 0.934
Mean (SD) | -0.154 (1.01) | 0.0304 (1.02) | | -0.244 (1.02) | -0.205 (0.926) | | 0.328 (1.11) | 0.297 (0.880) |
IL-10 | | | 0.202 | | | 0.199 | | | 0.963
Mean (SD) | -0.112 (1.02) | 0.0597 (1.01) | | 0.0761 (0.892) | -0.146 (0.917) | | -0.138 (0.912) | -0.123 (1.04) |
IL-12p70 | | | 0.552 | | | 0.173 | | | 0.027
Mean (SD) | 0.0845 (1.09) | 0.000907 (1.01) | | 0.211 (0.864) | -0.0223 (0.964) | | 0.173 (0.633) | -0.406 (0.951) |
INF-g | | | 0.875 | | | 0.891 | | | 0.754
Mean (SD) | -0.0550 (1.09) | -0.0329 (0.994) | | 0.0891 (1.08) | 0.0617 (0.949) | | 0.298 (1.34) | 0.159 (1.01) |
TNF-a | | | 0.144 | | | 0.246 | | | 0.101
Mean (SD) | 0.127 (0.957) | -0.0578 (0.958) | | 0.202 (1.02) | -0.0315 (1.12) | | -0.0556 (0.606) | -0.445 (0.824) |
VEGF | | | 0.611 | | | 0.880 | | | 0.426
Mean (SD) | -0.151 (0.947) | -0.0863 (1.04) | | -0.0120 (0.946) | 0.0163 (1.05) | | 0.377 (0.864) | 0.119 (1.11) |
ARG-II | | | 0.608 | | | 0.765 | | | 0.317
Mean (SD) | 0.0737 (1.03) | 0.00404 (1.04) | | -0.0782 (0.867) | -0.0264 (1.00) | | 0.187 (1.44) | -0.287 (0.859) |
Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; VSN, variance stabilizing normalization; NL, non-Latino; SD, standard deviation; IL, interleukin; INF, interferon; TNF, tumor necrosis factor; VEGF, vascular endothelial growth factor; ARG, arginase.
1 Cytokines were normalized on case/control status, birth year, protein, batch, and plate spot.

Supplementary Table 2-10. Linear regression model assessing the association between VSN normalized cytokines and activation/inhibition scores among CCRLP subjects.
Coefficient (95% CI)
VSN normalized cytokine | Score | Latino all races: Unadjusted model | Latino all races: Full model 1 | Non-Latino White: Unadjusted model | Non-Latino White: Full model 1 | Non-Latino Asian: Unadjusted model | Non-Latino Asian: Full model 1
IL-1b | Activation score | 0.06 (-0.03, 0.15) | 0.05 (-0.04, 0.15) | 0.00 (-0.14, 0.14) | 0.00 (-0.15, 0.15) | -0.30* (-0.52, -0.08) | -0.22 (-0.49, 0.06)
IL-1b | Inhibition score | -0.00 (-0.12, 0.11) | -0.01 (-0.13, 0.11) | 0.02 (-0.14, 0.19) | 0.06 (-0.11, 0.24) | -0.10 (-0.35, 0.15) | -0.10 (-0.40, 0.19)
IL-2 | Activation score | -0.06 (-0.15, 0.03) | -0.05 (-0.14, 0.04) | 0.02 (-0.11, 0.16) | -0.01 (-0.15, 0.13) | -0.02 (-0.23, 0.20) | -0.04 (-0.29, 0.21)
IL-2 | Inhibition score | 0.02 (-0.09, 0.13) | 0.02 (-0.09, 0.14) | 0.08 (-0.08, 0.23) | 0.07 (-0.09, 0.24) | -0.17 (-0.41, 0.07) | -0.15 (-0.42, 0.12)
IL-4 | Activation score | 0.01 (-0.08, 0.10) | 0.00 (-0.09, 0.09) | -0.01 (-0.14, 0.12) | 0.01 (-0.13, 0.15) | -0.16 (-0.37, 0.05) | -0.22 (-0.46, 0.02)
IL-4 | Inhibition score | -0.03 (-0.15, 0.08) | -0.04 (-0.15, 0.08) | -0.11 (-0.26, 0.04) | -0.11 (-0.27, 0.06) | 0.03 (-0.21, 0.26) | 0.08 (-0.18, 0.34)
IL-6 | Activation score | 0.01 (-0.07, 0.10) | -0.00 (-0.09, 0.09) | -0.01 (-0.16, 0.13) | -0.03 (-0.17, 0.12) | -0.16 (-0.41, 0.08) | -0.13 (-0.39, 0.14)
IL-6 | Inhibition score | -0.06 (-0.17, 0.05) | -0.09 (-0.20, 0.03) | 0.03 (-0.13, 0.20) | 0.01 (-0.15, 0.18) | 0.00 (-0.28, 0.28) | 0.08 (-0.20, 0.36)
IL-8 | Activation score | 0.07 (-0.02, 0.16) | 0.07 (-0.02, 0.16) | -0.01 (-0.14, 0.13) | -0.00 (-0.14, 0.13) | -0.26* (-0.47, -0.05) | -0.17 (-0.43, 0.10)
IL-8 | Inhibition score | 0.01 (-0.11, 0.12) | -0.01 (-0.13, 0.10) | -0.05 (-0.20, 0.11) | -0.05 (-0.21, 0.11) | -0.03 (-0.26, 0.21) | -0.03 (-0.31, 0.26)
IL-10 | Activation score | 0.04 (-0.05, 0.12) | 0.03 (-0.06, 0.12) | -0.03 (-0.16, 0.10) | 0.00 (-0.13, 0.14) | -0.23* (-0.46, -0.01) | -0.29* (-0.56, -0.02)
IL-10 | Inhibition score | -0.02 (-0.13, 0.09) | -0.03 (-0.14, 0.09) | 0.01 (-0.14, 0.16) | 0.02 (-0.13, 0.17) | -0.23 (-0.48, 0.03) | -0.19 (-0.49, 0.10)
IL-12p70 | Activation score | 0.03 (-0.06, 0.12) | 0.03 (-0.06, 0.11) | 0.03 (-0.10, 0.16) | 0.05 (-0.09, 0.19) | 0.04 (-0.18, 0.26) | 0.02 (-0.23, 0.27)
IL-12p70 | Inhibition score | 0.03 (-0.08, 0.14) | 0.03 (-0.09, 0.14) | -0.07 (-0.22, 0.08) | -0.10 (-0.25, 0.06) | -0.17 (-0.42, 0.07) | -0.26 (-0.53, 0.02)
INF-g | Activation score | 0.06 (-0.03, 0.15) | 0.04 (-0.05, 0.13) | -0.01 (-0.15, 0.13) | 0.01 (-0.13, 0.15) | 0.17 (-0.09, 0.43) | 0.02 (-0.28, 0.32)
INF-g | Inhibition score | -0.02 (-0.13, 0.09) | -0.00 (-0.11, 0.11) | 0.03 (-0.13, 0.19) | 0.05 (-0.11, 0.21) | 0.09 (-0.20, 0.38) | 0.16 (-0.16, 0.49)
TNF-a | Activation score | -0.05 (-0.13, 0.03) | -0.06 (-0.14, 0.03) | -0.17* (-0.32, -0.02) | -0.14 (-0.30, 0.02) | -0.27* (-0.45, -0.09) | -0.27* (-0.49, -0.06)
TNF-a | Inhibition score | 0.00 (-0.10, 0.11) | 0.00 (-0.10, 0.11) | 0.03 (-0.15, 0.20) | 0.04 (-0.15, 0.22) | 0.00 (-0.20, 0.20) | -0.03 (-0.26, 0.21)
VEGF | Activation score | 0.00 (-0.09, 0.09) | -0.01 (-0.10, 0.08) | -0.10 (-0.24, 0.04) | -0.15 (-0.30, 0.00) | -0.28 (-0.52, -0.04) | -0.18 (-0.48, 0.13)
VEGF | Inhibition score | 0.05 (-0.06, 0.16) | 0.02 (-0.09, 0.13) | -0.01 (-0.17, 0.16) | 0.03 (-0.14, 0.21) | -0.09 (-0.36, 0.18) | -0.09 (-0.42, 0.24)
ARG II | Activation score | 0.12* (0.03, 0.21) | 0.13* (0.04, 0.22) | -0.05 (-0.19, 0.08) | -0.01 (-0.15, 0.13) | -0.09 (-0.32, 0.14) | -0.09 (-0.36, 0.19)
ARG II | Inhibition score | -0.00 (-0.11, 0.11) | 0.02 (-0.10, 0.13) | 0.00 (-0.15, 0.16) | 0.01 (-0.15, 0.17) | -0.32* (-0.58, -0.06) | -0.35* (-0.65, -0.05)
Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; VSN, variance stabilizing normalization; SE, standard error; IL, interleukin; INF, interferon; TNF, tumor necrosis factor; VEGF, vascular endothelial growth factor; ARG, arginase.
1 Model with cytokines as the outcome, adjusting for age at cytokine collection, birth weight, and gestation week. Cytokines were normalized on case/control status, birth year, protein, batch, and plate spot.
*Coefficient statistically significantly different from the null with p-value < 0.05.

Supplementary Table 2-11. Demographics and distribution of VSN normalized cytokines among CCRLP subjects matched on offspring sex.
Variable | Latino all races: Case (N=76) | Latino all races: Control (N=76) | Latino all races: p-value 1 | Non-Latino White: Case (N=38) | Non-Latino White: Control (N=38) | Non-Latino White: p-value 1 | Non-Latino Asian: Case (N=11) | Non-Latino Asian: Control (N=11) | Non-Latino Asian: p-value 1
Offspring sex | | | NA | | | NA | | | NA
Female | 32 (42.1%) | 32 (42.1%) | | 19 (50.0%) | 19 (50.0%) | | 2 (18.2%) | 2 (18.2%) |
Male | 44 (57.9%) | 44 (57.9%) | | 19 (50.0%) | 19 (50.0%) | | 9 (81.8%) | 9 (81.8%) |
Age at collection (days) | | | 0.241 | | | 0.881 | | | 0.324
<25 | 16 (21.1%) | 25 (32.9%) | | 12 (31.6%) | 11 (28.9%) | | 3 (27.3%) | 4 (36.4%) |
25-27 | 15 (19.7%) | 8 (10.5%) | | 5 (13.2%) | 7 (18.4%) | | 7 (63.6%) | 5 (45.5%) |
28-38 | 21 (27.6%) | 21 (27.6%) | | 9 (23.7%) | 7 (18.4%) | | 0 (0%) | 2 (18.2%) |
>=39 | 24 (31.6%) | 22 (28.9%) | | 12 (31.6%) | 13 (34.2%) | | 1 (9.1%) | 0 (0%) |
Birthweight (g) | | | 0.714 | | | 0.340 | | | 0.247
<2,500 | 3 (3.9%) | 4 (5.3%) | | 1 (2.6%) | 0 (0%) | | 0 (0%) | 0 (0%) |
2,500-2,999 | 13 (17.1%) | 15 (19.7%) | | 6 (15.8%) | 4 (10.5%) | | 2 (18.2%) | 2 (18.2%) |
3,000-3,499 | 31 (40.8%) | 28 (36.8%) | | 16 (42.1%) | 13 (34.2%) | | 8 (72.7%) | 4 (36.4%) |
3,500-3,999 | 25 (32.9%) | 21 (27.6%) | | 13 (34.2%) | 14 (36.8%) | | 1 (9.1%) | 4 (36.4%) |
>=4,000 | 4 (5.3%) | 8 (10.5%) | | 2 (5.3%) | 7 (18.4%) | | 0 (0%) | 1 (9.1%) |
Gestational week | | | 0.475 | | | 0.101 | | | 0.589
26-36 | 8 (10.5%) | 10 (13.2%) | | 4 (10.5%) | 0 (0%) | | 1 (9.1%) | 1 (9.1%) |
37-41 | 62 (81.6%) | 61 (80.3%) | | 30 (78.9%) | 35 (92.1%) | | 9 (81.8%) | 8 (72.7%) |
42-44 | 5 (6.6%) | 2 (2.6%) | | 3 (7.9%) | 2 (5.3%) | | 0 (0%) | 1 (9.1%) |
Activation score | | | 0.379 | | | 0.225 | | | 0.466
Mean (SD) | 2.28 (1.25) | 2.45 (1.14) | | 2.42 (1.50) | 2.05 (1.09) | | 2.18 (1.17) | 2.55 (1.13) |
Inhibition score | | | 0.585 | | | 0.872 | | | 0.361
Mean (SD) | 3.63 (0.943) | 3.72 (0.984) | | 4.09 (1.16) | 4.05 (0.964) | | 3.41 (1.02) | 3.86 (1.25) |
VSN normalized cytokines
IL-1b | | | 0.549 | | | 0.309 | | | 0.651
Mean (SD) | -0.134 (1.09) | -0.0302 (1.04) | | -0.0990 (1.11) | 0.144 (0.952) | | 0.0611 (0.898) | -0.112 (0.868) |
IL-2 | | | 0.870 | | | 0.233 | | | 0.391
Mean (SD) | 0.0326 (0.997) | 0.0589 (0.986) | | 0.389 (0.864) | 0.143 (0.919) | | -0.0900 (1.01) | 0.251 (0.795) |
IL-4 | | | 0.726 | | | 0.966 | | | 0.910
Mean (SD) | 0.0574 (0.955) | 0.00221 (0.982) | | -0.0805 (0.901) | -0.0899 (1.01) | | -0.194 (0.849) | -0.157 (0.624) |
IL-6 | | | 0.761 | | | 0.819 | | | 0.193
Mean (SD) | 0.0224 (0.994) | -0.0306 (1.15) | | -0.200 (1.01) | -0.255 (1.06) | | 0.0195 (0.836) | 0.643 (1.28) |
IL-8 | | | 0.460 | | | 0.359 | | | 0.839
Mean (SD) | -0.154 (1.01) | -0.0326 (1.02) | | -0.244 (1.02) | -0.0527 (0.773) | | 0.328 (1.11) | 0.235 (1.01) |
IL-10 | | | 0.387 | | | 0.484 | | | 0.151
Mean (SD) | -0.112 (1.02) | 0.0305 (1.00) | | 0.0761 (0.892) | -0.0798 (1.04) | | -0.138 (0.912) | 0.480 (1.02) |
IL-12p70 | | | 0.395 | | | 0.329 | | | 0.168
Mean (SD) | 0.0845 (1.09) | -0.0582 (0.974) | | 0.211 (0.864) | -0.00400 (1.04) | | 0.173 (0.633) | -0.319 (0.940) |
INF-g | | | 0.603 | | | 0.736 | | | 0.418
Mean (SD) | -0.0550 (1.09) | 0.0360 (1.06) | | 0.0891 (1.08) | 0.166 (0.904) | | 0.298 (1.34) | -0.160 (1.26) |
TNF-a | | | 0.082 | | | 0.794 | | | 0.244
Mean (SD) | 0.127 (0.957) | -0.142 (0.940) | | 0.202 (1.02) | 0.138 (1.09) | | -0.0556 (0.606) | -0.432 (0.841) |
VEGF | | | 0.457 | | | 0.505 | | | 0.032
Mean (SD) | -0.151 (0.947) | -0.0296 (1.06) | | -0.0120 (0.946) | 0.140 (1.03) | | 0.377 (0.864) | -0.488 (0.897) |
ARG-II | | | 0.594 | | | 0.156 | | | 0.125
Mean (SD) | 0.0737 (1.03) | -0.0122 (0.954) | | -0.0782 (0.867) | 0.233 (1.02) | | 0.187 (1.44) | -0.666 (1.02) |
Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; VSN, variance stabilizing normalization; NL, non-Latino; SD, standard deviation; IL, interleukin; INF, interferon; TNF, tumor necrosis factor; VEGF, vascular endothelial growth factor; ARG, arginase.
1 Cytokines were normalized on case/control status, birth year, protein, batch, and plate spot.

Supplementary Table 2-12. Linear mixed model assessing the association between VSN normalized cytokines and activation/inhibition scores among CCRLP subjects.
Coefficient (95% CI)
VSN normalized cytokine | Score | Latino all races: Unadjusted model | Latino all races: Full model 1 | Non-Latino White: Unadjusted model | Non-Latino White: Full model 1 | Non-Latino Asian: Unadjusted model | Non-Latino Asian: Full model 1
IL-1b | Activation score | -0.00 (-0.15, 0.14) | 0.00 (-0.15, 0.15) | 0.07 (-0.12, 0.27) | 0.08 (-0.12, 0.29) | -0.20 (-0.53, 0.14) | -0.36 (-0.80, 0.08)
IL-1b | Inhibition score | -0.10 (-0.28, 0.08) | -0.09 (-0.27, 0.10) | -0.07 (-0.31, 0.17) | -0.07 (-0.31, 0.18) | -0.04 (-0.34, 0.26) | -0.21 (-0.50, 0.08)
IL-2 | Activation score | -0.06 (-0.19, 0.08) | -0.07 (-0.21, 0.06) | -0.04 (-0.21, 0.12) | -0.06 (-0.22, 0.11) | -0.22 (-0.54, 0.10) | 0.16 (-0.19, 0.51)
IL-2 | Inhibition score | 0.09 (-0.08, 0.25) | 0.07 (-0.09, 0.24) | 0.10 (-0.09, 0.30) | 0.15 (-0.04, 0.34) | -0.37* (-0.68, -0.05) | -0.46* (-0.75, -0.17)
IL-4 | Activation score | 0.02 (-0.12, 0.15) | 0.05 (-0.09, 0.19) | -0.00 (-0.18, 0.18) | 0.01 (-0.19, 0.20) | -0.23 (-0.48, 0.03) | -0.20 (-0.51, 0.10)
IL-4 | Inhibition score | -0.04 (-0.20, 0.13) | -0.02 (-0.18, 0.15) | -0.10 (-0.32, 0.11) | -0.08 (-0.32, 0.16) | -0.08 (-0.30, 0.15) | 0.11 (-0.14, 0.37)
IL-6 | Activation score | -0.08 (-0.23, 0.06) | -0.09 (-0.24, 0.06) | 0.02 (-0.17, 0.22) | -0.03 (-0.21, 0.15) | 0.03 (-0.40, 0.46) | 0.33* (0.33, 0.33)
IL-6 | Inhibition score | -0.10 (-0.28, 0.08) | -0.10 (-0.28, 0.08) | -0.02 (-0.25, 0.22) | 0.09 (-0.12, 0.30) | -0.20 (-0.59, 0.19) | 0.06* (0.06, 0.06)
IL-8 | Activation score | 0.01 (-0.13, 0.15) | 0.04 (-0.10, 0.18) | 0.02 (-0.15, 0.18) | 0.01 (-0.17, 0.19) | -0.17 (-0.60, 0.25) | -0.22 (-0.83, 0.39)
IL-8 | Inhibition score | -0.01 (-0.18, 0.16) | 0.01 (-0.16, 0.19) | -0.15 (-0.36, 0.06) | -0.14 (-0.36, 0.08) | 0.01 (-0.38, 0.41) | 0.00 (-0.42, 0.42)
IL-10 | Activation score | 0.06 (-0.08, 0.19) | 0.04 (-0.10, 0.18) | -0.07 (-0.25, 0.11) | -0.02 (-0.21, 0.17) | 0.05 (-0.35, 0.46) | -0.11 (-0.68, 0.46)
IL-10 | Inhibition score | -0.06 (-0.23, 0.11) | -0.08 (-0.25, 0.09) | 0.10 (-0.12, 0.32) | 0.14 (-0.08, 0.37) | -0.20 (-0.61, 0.20) | -0.01 (-0.48, 0.46)
IL-12p70 | Activation score | 0.01 (-0.12, 0.15) | 0.01 (-0.13, 0.15) | -0.03 (-0.21, 0.15) | -0.00 (-0.19, 0.18) | 0.15 (-0.19, 0.48) | -0.30 (-0.30, -0.30)
IL-12p70 | Inhibition score | -0.08 (-0.25, 0.09) | -0.07 (-0.24, 0.10) | -0.00 (-0.22, 0.22) | -0.01 (-0.24, 0.22) | -0.04 (-0.35, 0.27) | 0.10* (0.10, 0.10)
INF-g | Activation score | 0.09 (-0.05, 0.23) | 0.09 (-0.06, 0.23) | 0.05 (-0.14, 0.23) | 0.08 (-0.11, 0.27) | 0.30 (-0.19, 0.78) | 0.40 (-0.10, 0.89)
INF-g | Inhibition score | 0.05 (-0.13, 0.23) | 0.08 (-0.09, 0.26) | 0.07 (-0.15, 0.29) | 0.03 (-0.20, 0.26) | 0.24 (-0.23, 0.71) | 0.62* (0.24, 1.00)
TNF-a | Activation score | -0.07 (-0.19, 0.05) | -0.08 (-0.20, 0.05) | -0.12 (-0.31, 0.08) | -0.07 (-0.28, 0.14) | -0.18 (-0.48, 0.12) | -0.38* (-0.38, -0.38)
TNF-a | Inhibition score | -0.15* (-0.31, -0.00) | -0.14 (-0.30, 0.01) | 0.05 (-0.19, 0.29) | 0.04 (-0.22, 0.31) | -0.00 (-0.29, 0.29) | -0.03* (-0.03, -0.03)
VEGF | Activation score | -0.09 (-0.23, 0.04) | -0.07 (-0.21, 0.07) | -0.04 (-0.22, 0.14) | -0.11 (-0.32, 0.09) | -0.35 (-0.73, 0.02) | -0.58* (-1.13, -0.03)
VEGF | Inhibition score | -0.00 (-0.17, 0.17) | -0.02 (-0.19, 0.14) | -0.10 (-0.32, 0.12) | -0.02 (-0.26, 0.22) | -0.02 (-0.39, 0.35) | -0.15 (-0.58, 0.28)
ARG II | Activation score | 0.15 (0.02, 0.28) | 0.18 (0.04, 0.31) | -0.03 (-0.21, 0.15) | 0.00 (-0.17, 0.18) | 0.06 (-0.42, 0.53) | -0.18 (-0.68, 0.33)
ARG II | Inhibition score | -0.01 (-0.17, 0.16) | 0.04 (-0.12, 0.21) | -0.08 (-0.30, 0.14) | -0.08 (-0.29, 0.14) | -0.52* (-1.00, -0.04) | -0.41* (-0.72, -0.10)
Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; VSN, variance stabilizing normalization; SE, standard error; IL, interleukin; INF, interferon; TNF, tumor necrosis factor; VEGF, vascular endothelial growth factor; ARG, arginase.
1 Model with cytokines as the outcome, adjusting for age at cytokine collection, birth weight, gestation week, and allowing the intercept to vary randomly by matched pairs. Cytokines were normalized on case/control status, birth year, protein, batch, and plate spot.
*Coefficient statistically significantly different from the null with p-value < 0.05.

Supplementary Table 2-13.
A causal mediation analysis to evaluate the potential mediation effect of ARG II in the association between the activating HLA-KIR interaction and childhood acute lymphocytic leukemia risk among Latino subjects.
Effect | Estimate (95% confidence interval) | p-value
ACME (average) | 0.00214 (-0.00404, 0.01) | 0.53
ADE (average) | -0.02373 (-0.07545, 0.02) | 0.26
Proportion Mediated (average) | -0.09905 (-1.23232, 0.88) | 0.64
ACME (ALL control) | 0.0022 (-0.00417, 0.01) | 0.53
ACME (ALL case) | 0.00208 (-0.00398, 0.01) | 0.53
ADE (ALL control) | -0.02367 (-0.0755, 0.02) | 0.26
ADE (ALL case) | -0.02379 (-0.07539, 0.02) | 0.26
Proportion Mediated (ALL control) | -0.10172 (-1.25754, 0.88) | 0.64
Proportion Mediated (ALL case) | -0.09638 (-1.18805, 0.88) | 0.64
Abbreviations: ARG, arginase; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; ALL, acute lymphocytic leukemia; ACME, average causal mediation effect; ADE, average direct effect.

Supplementary Figure 2-1. Prediction of genetic ancestry for CCRLP and CCLS subjects using subjects in the 1000 Genomes Project as reference. Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; CCLS, California Childhood Leukemia Study; PC, principal component; AFR, African; AMR, Admixed American; EAS, East Asian; EUR, European; SAS, South Asian.

Supplementary Figure 2-2. Study flowchart. Abbreviations: CCRLP, Childhood Cancer Records Linkage Project; CCLS, California Childhood Leukemia Study; HIBAG, Human leukocyte antigens Genotype Imputation with Attribute Bagging; KIR, killer immunoglobulin-like receptors; PC, principal component.

Supplementary Figure 2-3. Schematic of the mediation analysis to evaluate the potential mediation effect of ARG II in the association between the activating HLA-KIR interaction and childhood acute lymphocytic leukemia risk among Latino subjects. Abbreviations: ARG, arginase; HLA, human leukocyte antigen; KIR, killer immunoglobulin-like receptor; ALL, acute lymphocytic leukemia.
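The quantities reported for the mediation analysis in Supplementary Table 2-13 can be related to one another with a small worked example. The sketch below assumes the conventional definitions (total effect = ACME + ADE, proportion mediated = ACME / total effect); it is not the code used for the dissertation, and the function name `proportion_mediated` is hypothetical. The inputs are the "average" point estimates copied from the table.

```python
# Illustrative sketch only, under the assumption that
# total effect = ACME + ADE and proportion mediated = ACME / total effect.
# `proportion_mediated` is a hypothetical helper, not part of any package.

def proportion_mediated(acme: float, ade: float) -> float:
    """Return ACME / (ACME + ADE); raise if the total effect is zero."""
    total = acme + ade
    if total == 0:
        raise ValueError("total effect is zero; proportion mediated is undefined")
    return acme / total

# Point estimates from the "average" rows of Supplementary Table 2-13.
acme_avg = 0.00214
ade_avg = -0.02373

pm = proportion_mediated(acme_avg, ade_avg)
print(f"total effect = {acme_avg + ade_avg:.5f}")
print(f"proportion mediated = {pm:.4f}")
```

With these inputs the ratio comes out at roughly -0.099, close to the tabulated proportion mediated of -0.09905; a small difference is expected if the tabulated value was averaged over simulation draws rather than computed from the point estimates. The negative sign arises because the ACME and ADE estimates have opposite signs.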
Appendix C: Supplementary materials for Chapter 3

Supplementary Table 3-1. Relative risks of the same type of early-onset cancer as the proband among siblings and mothers, by ethnic group, 1989 to 2015, California, USA.
Cancer of the proband † | Overall: No. of probands | Overall: No. of affected relatives § | Overall: SIR (95% CI) | Non-Latino White: No. of probands | Non-Latino White: No. of affected relatives § | Non-Latino White: SIR (95% CI) | Latino all races: No. of probands | Latino all races: No. of affected relatives § | Latino all races: SIR (95% CI) | p-value ‡
Hematologic cancers ¶ | 11404 | 22 | 2.68* (1.68, 4.06) | 2702 | 6 | 2.64 (0.97, 5.76) | 5484 | 7 | 1.56 (0.63, 3.21) | 0.505
Solid cancers ¶ | 17849 | 112 | 6.78* (5.58, 8.16) | 5471 | 31 | 4.41* (2.99, 6.25) | 7323 | 50 | 7.94* (5.89, 10.47) | 0.012
Leukemias | 8500 | 11 | 2.48* (1.24, 4.45) | 1646 | ≦5 | 2.19 (0.27, 7.91) | 4374 | ≦5 | 1.45 (0.4, 3.72) | 0.996
Lymphomas | 2928 | ≦5 | 5.9* (1.61, 15.11) | 1060 | ≦5 | 9.04* (1.86, 26.41) | 1121 | 0 | NA |
CNS tumors | 5739 | 11 | 6.19* (3.09, 11.08) | 1728 | ≦5 | 5.07* (1.38, 12.98) | 2242 | ≦5 | 4.66 (0.96, 13.61) | 0.788
Neuroblastoma | 1445 | ≦5 | 10.31 (0.26, 57.44) | 367 | 0 | NA | 490 | 0 | NA |
Retinoblastoma | 718 | 10 | 454.55* (217.97, 835.93) | 126 | ≦5 | 200* (5.06, 1114.32) | 327 | 7 | 636.36* (255.85, 1311.15) | 0.446
Renal tumors | 1069 | ≦5 | 15.87 (0.4, 88.44) | 209 | 0 | NA | 442 | ≦5 | 38.46 (0.97, 214.29) | NA
Hepatic tumors | 414 | 0 | NA | 74 | 0 | NA | 200 | 0 | NA |
Malignant bone tumors | 249 | 0 | NA | 73 | 0 | NA | 121 | 0 | NA |
Sarcomas | 2915 | 10 | 31.45* (15.08, 57.83) | 816 | ≦5 | 18.87* (2.28, 68.16) | 1259 | ≦5 | 35.97* (11.68, 83.94) | 0.687
GCT | 2423 | ≦5 | 7.71* (1.59, 22.54) | 672 | 0 | NA | 1270 | ≦5 | 8.62* (1.04, 31.14) | NA
Epithelial neoplasms | 3050 | 24 | 40.68* (26.06, 60.53) | 1458 | 9 | 21.33* (9.75, 40.49) | 1046 | 9 | 55.56* (25.4, 105.46) | 0.065
Other | 255 | 0 | NA | 79 | 0 | NA | 90 | 0 | NA | NA
† Cancers were classified into subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html).
§ Affected relatives include mother and siblings of the proband diagnosed with early-onset cancer under 26 years of age.
‡ p-value comparing the SIRs between non-Latino White and Latino all races using an approximate Chi-squared test.
¶ Hematologic cancers include group 1, leukemias, myeloproliferative diseases, and myelodysplastic diseases; group 2, lymphomas and reticuloendothelial neoplasms. Solid cancers include group 3, CNS and miscellaneous intracranial and intraspinal neoplasms; group 4, neuroblastoma and other peripheral nervous cell tumors; group 5, retinoblastoma; group 6, renal tumors; group 7, hepatic tumors; group 8, malignant bone tumors; group 9, soft tissue and other extraosseous sarcomas; group 10, germ cell tumors, trophoblastic tumors, and neoplasms of gonads; group 11, other malignant epithelial neoplasms and malignant melanomas; group 12, other and unspecified malignant neoplasms.
SIR, standardized incidence ratio. CI, confidence interval.
* Statistically significant standardized incidence ratio with p<0.05 assuming a Poisson distribution.

Supplemental Table 3-2.
Relative risks of siblings and mothers for a specific type of early- onset cancer (diagnosed 0 to 26 years of age) given a proband with cancer, 1989 to 2015, California, USA. Cancer of the proband † Cancer of the relative § No. of affected relatives § SIR (95% CI) Leukemias Leukemias 11 2.48* (1.24, 4.45) Lymphomas 6 2.46 (0.9, 5.35) CNS tumors 6 2.02 (0.74, 4.4) Hepatic tumors ≦5 0.41 (0.01, 2.28) Sarcomas 8 7.36* (3.18, 14.5) GCT 6 3.51* (1.29, 7.64) Epithelial neoplasms 13 5.37* (2.86, 9.18) Lymphomas Leukemias 7 7.47* (3, 15.39) Lymphomas ≦5 5.9* (1.61, 15.11) CNS tumors ≦5 2.89 (0.35, 10.46) Neuroblastoma ≦5 15.38 (0.39, 85.72) Retinoblastoma ≦5 40* (1.01, 222.86) Renal tumors ≦5 12.99 (0.33, 72.36) Sarcomas ≦5 7.07 (0.86, 25.53) GCT ≦5 1.93 (0.05, 10.78) Epithelial neoplasms ≦5 2.67 (0.32, 9.65) CNS tumors Leukemias ≦5 1.94 (0.63, 4.53) Lymphomas ≦5 1.31 (0.16, 4.72) CNS tumors 11 6.19* (3.09, 11.08) Renal tumors ≦5 4.29 (0.11, 23.91) Sarcomas 6 8.97* (3.29, 19.52) GCT ≦5 1.81 (0.22, 6.55) Epithelial neoplasms 10 6.37* (3.05, 11.71) Other ≦5 0.65 (0.02, 3.64) Neuroblastoma Lymphomas ≦5 2.23 (0.06, 12.41) Neuroblastoma ≦5 10.31 (0.26, 57.44) Sarcomas ≦5 4.78 (0.12, 26.66) GCT ≦5 3.33 (0.08, 18.57) Epithelial neoplasms ≦5 5.09 (0.62, 18.38) Other ≦5 2.23 (0.06, 12.41) Retinoblastoma Lymphomas ≦5 4.5 (0.11, 25.1) Retinoblastoma 10 454.55* (217.97, 835.93) Malignant bone tumors ≦5 4.5 (0.11, 25.1) Sarcomas ≦5 9.52 (0.24, 53.06) Epithelial neoplasms ≦5 5.24 (0.13, 29.17) 134 Renal tumors Lymphomas ≦5 3.02 (0.08, 16.83) CNS tumors ≦5 2.37 (0.06, 13.2) Renal tumors ≦5 15.87 (0.4, 88.44) Sarcomas ≦5 13.33* (1.61, 48.16) GCT ≦5 4.42 (0.11, 24.65) Epithelial neoplasms ≦5 9.17* (1.89, 26.81) Hepatic tumors Leukemias ≦5 4.08 (0.1, 22.74) Epithelial neoplasms ≦5 9.52 (0.24, 53.06) Malignant bone tumors Leukemias ≦5 14.49 (0.37, 80.75) Epithelial neoplasms ≦5 16.39 (0.41, 91.34) Sarcomas Leukemias 8 7.01* (3.03, 13.82) Lymphomas ≦5 1.37 (0.03, 7.61) CNS tumors 7 8.62* 
(3.47, 17.76) Neuroblastoma ≦5 11.11 (0.28, 61.91) Renal tumors ≦5 19.8* (2.4, 71.53) Sarcomas 10 31.45* (15.08, 57.83) GCT ≦5 7.41* (2.02, 18.97) Epithelial neoplasms 6 7.56* (2.77, 16.45) Other ≦5 1.37 (0.03, 7.61) GCT Leukemias ≦5 5.81* (1.58, 14.89) Lymphomas ≦5 4.06 (0.49, 14.65) CNS tumors ≦5 5.93* (1.22, 17.33) Neuroblastoma ≦5 19.23 (0.49, 107.15) Renal tumors ≦5 16.67 (0.42, 92.86) Sarcomas ≦5 9.66* (1.17, 34.9) GCT ≦5 7.71* (1.59, 22.54) Epithelial neoplasms ≦5 6.92* (1.89, 17.72) Epithelial neoplasms Leukemias ≦5 8.39* (2.28, 21.47) Lymphomas ≦5 2.19 (0.06, 12.22) CNS tumors ≦5 7.56* (1.56, 22.08) Neuroblastoma ≦5 41.67* (1.05, 232.15) Sarcomas ≦5 10.99* (1.33, 39.7) GCT ≦5 2.63 (0.07, 14.66) Epithelial neoplasms 24 40.68* (26.06, 60.53) Other Lymphomas ≦5 16.39 (0.41, 91.34) CNS tumors ≦5 14.29 (0.36, 79.59) Neuroblastoma ≦5 125* (3.16, 696.45) Sarcomas ≦5 37.04 (0.94, 206.36) 135 † Cancers were classified into subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). § Affected relatives include mother and siblings of the proband diagnosed with early-onset cancer under 26 years of age. SIR, Standardized incidence ratio. CI, confidence interval. * Statistically significant standardized incidence ratio with p<0.05 assuming a Poisson distribution. 136 Supplemental Table 3-3. Relative risks of any early-onset cancer (diagnosed at 0 to 26 years of age) for siblings and mothers of the same type of cancer with the proband given a proband with cancer by subgroups, 1989 to 2015, California, USA. All cancers The same type of cancer Cancer of the proband † No. of probands No. of affected relatives § SIR (95% CI) No. 
of affected relatives § SIR (95% CI) Lymphoid leukemias 6705 25 2.7* (1.91, 3.7) ≦5 1.86 (0.6, 4.34) Acute myeloid leukemias 1317 9 4.61* (2.46, 7.89) ≦5 27.27* (5.62, 79.7) Hodgkin lymphomas 1399 9 7.12* (3.79, 12.17) ≦5 10.53* (1.27, 38.02) Non-Hodgkin lymphomas (except Burkitt lymphoma) 985 ≦5 4.02* (1.48, 8.75) ≦5 11.11 (0.28, 61.91) Ependymomas and choroid plexus tumor 561 ≦5 2.46 (0.51, 7.19) 0 NA Astrocytomas 1965 11 3.6* (1.97, 6.03) ≦5 7.69 (0.93, 27.79) Intracranial and intraspinal embryonal tumors 1010 6 2.67* (1.07, 5.5) ≦5 11.11 (0.28, 61.91) Other gliomas 815 8 5.34* (2.31, 10.53) ≦5 66.67* (8.07, 240.82) Other specified intracranial and intraspinal neoplasms 1470 ≦5 5.48* (2.5, 10.4) ≦5 22.22* (2.69, 80.27) Neuroblastoma and ganglioneuroblas toma 1409 6 2.06 (0.83, 4.25) ≦5 9.09 (0.23, 50.65) Nephroblastoma and other nonepithelial renal tumors 1032 6 3.67* (1.68, 6.97) ≦5 16.67 (0.42, 92.86) Rhabdomyosarco mas 719 11 8.46* (4.63, 14.2) 0 NA Fibrosarcomas to peripheral nerve sheath tumors to and other fibrous neoplasms 434 7 10.46* (4.21, 21.56) ≦5 200* (24.22, 722.47) Other specified soft tissue sarcomas 1624 11 7.57* (4.56, 11.82) ≦5 55.56* (18.04, 129.65) Malignant gonadal germ cell tumors 1030 12 5.44* (3.05, 8.97) ≦5 6.67 (0.81, 24.08) Malignant melanomas 788 ≦5 3.65 (0.75, 10.68) ≦5 25 (0.63, 139.29) Other and unspecified carcinomas 1845 20 18.11* (12.13, 26.01) 17 113.33* (66.02, 181.46) 137 † Cancers were classified into subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). § Affected relatives include mother and siblings of the proband diagnosed with early-onset cancer under 26 years of age. SIR, Standardized incidence ratio. CI, confidence interval. * Statistically significant standardized incidence ratio with p<0.05 assuming a Poisson distribution. 138 Supplementary Table 3-4. 
Relative risks of any early-onset cancer (diagnosed 0 to 26 years of age) among siblings and mothers by ethnic group, 1989 to 2015, California, USA. Non-Latino White Latino all races Non-Latino API Non-Latino Black Cancer of the proban d † No. of Proban ds No. of affecte d relative s § SIR (95% CI) No. of Proban ds No. of affecte d relative s § SIR (95% CI) p-value ‡ No. of Proban ds No. of affected relative s § SIR (95% CI) p-value ‡ No. of Proban ds No. of affected relative s § SIR (95% CI) p- value ‡ Overall 8119 50 2.6* (1.93, 3.43) 12736 78 3.36* (2.66, 4.19) 0.183 1677 11 4.58* (2.29, 8.2) 0.128 1102 13 6.96* (3.71, 11.91) 0.002 Hemat ologic cancers ¶ 2702 19 2.69* (1.62, 4.2) 5484 27 2.48* (1.64, 3.61) 0.910 679 8 7.56* (3.26, 14.9) 0.023 391 ≦5 6.14* (1.67, 15.73) 0.242 Solid cancers ¶ 5471 37 3.02* (2.12, 4.16) 7323 62 4.98* (3.82, 6.39) 0.019 1021 7 5.07* (2.04, 10.44) 0.306 719 9 7.35* (3.36, 13.95) 0.026 Leuke mias 1646 11 2.11* (1.05, 3.77) 4374 21 2.35* (1.45, 3.59) 0.911 457 6 6.94* (2.55, 15.12) 0.032 205 ≦5 6.58* (1.36, 19.22) 0.176 Lymph omas 1060 8 4.32* (1.87, 8.52) 1121 8 4.09* (1.77, 8.07) 0.887 226 ≦5 20.3* (5.53, 51.98) 0.022 187 ≦5 5.13 (0.13, 28.53) 0.685 CNS tumors 1728 12 2.63* (1.36, 4.59) 2242 12 2.86* (1.48, 4.99) 0.999 311 ≦5 7.14* (1.47, 20.87) 0.250 239 ≦5 7.56* (1.56, 22.08) 0.216 Neurob lastom a 367 ≦5 0.7 (0.02, 3.92) 490 ≦5 1.7 (0.21, 6.13) 0.871 51 ≦5 6.54 (0.17, 36.37) 0.466 48 0 NA NA Retino blasto ma 126 ≦5 3.65 (0.44, 13.18) 327 9 12.31* (5.63, 23.37) 0.178 35 0 NA NA 30 ≦5 11.49 (0.29, 63.95) 0.881 Renal tumors 209 ≦5 2.64 (0.32, 9.54) 442 ≦5 2.91 (0.6, 8.5) 0.728 28 0 NA NA 53 ≦5 8.85 (0.22, 49.24) 0.850 Hepati c tumors 74 0 NA 200 ≦5 2.43 (0.06, 13.52) NA 13 0 NA NA 6 0 NA NA Bone tumors 73 0 NA 121 ≦5 6.06 (0.15, 33.77) NA 13 0 NA NA 17 0 NA NA Sarcom as 816 7 3.77* (1.51, 7.76) 1259 20 9.05* (5.53, 13.98) 0.062 160 ≦5 4.46 (0.11, 24.84) 0.681 132 ≦5 10.68* (2.2, 31.19) 0.267 GCT 672 ≦5 3.42 (0.93, 8.77) 1270 10 6.2* 
(2.97, 11.41) 0.454 184 ≦5 4.85 (0.12, 27.01) 0.755 70 0 NA NA Epithel ial neoplas ms 1458 14 8.6* (4.7, 14.43) 1046 15 15.15* (8.48, 24.99) 0.176 239 ≦5 13.25* (1.6, 47.82) 0.899 122 ≦5 22.47* (2.72, 81.13) 0.450 Other 79 ≦5 10.47* (1.27, 37.83) 90 ≦5 5.85 (0.15, 32.58) 0.924 8 0 NA NA 19 0 NA NA 139 † Cancers were classified into subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). § Affected relatives include mother and siblings of the proband diagnosed with early-onset cancer under 26 years of age. ‡ p-value comparing the SIRs between non-Latino White and Latino all races using an approximate Chi-squared test. ¶ Hematologic cancers include group 1, leukemias, myeloproliferative diseases, and myelodysplastic diseases; group 2, lymphomas and reticuloendothelial neoplasms. Solid cancers include I group 3, CNS and miscellaneous intracranial and intraspinal neoplasms; group 4, neuroblastoma and other peripheral nervous cell tumors; group 5, retinoblastoma; group 6, renal tumors; group 7, hepatic tumors; group 8, malignant bone tumors; group 9, soft tissue and other extraosseous sarcomas; group 10, germ cell tumors, trophoblastic tumors, and neoplasms of gonads; group 11, other malignant epithelial neoplasms and malignant melanomas; group 12, other and unspecified malignant neoplasms. SIR, Standardized incidence ratio. CI, confidence interval. * Statistically significant standardized incidence ratio with p<0.05 assuming a Poisson distribution. 140 Supplemental Table 3-5. Relative risks of second primary malignancies of the same type of early-onset cancer (diagnosed 0 to 26 years of age) with the first primary malignancy by ethnic groups, California, USA. Overall Non-Latino White Latino all races Cancer of the proband † No. of FPMs No. of SP Ms SIR (95% CI) No. of FPMs No. of SPMs SIR (95% CI) No. of FPMs No. 
of SPMs SIR (95% CI) p-value ‡ Hematol ogic cancers ¶ 11,553 50 6.1* (4.53, 8.04) 2,727 15 6.61* (3.7, 10.9) 5,528 19 4.23* (2.55, 6.61) 0.262 Solid cancers ¶ 18,893 30 1.82* (1.22, 2.59) 5,673 13 1.85 (0.98, 3.16) 7,551 13 2.06* (1.1, 3.53) 0.938 Leukemi as 8,543 41 9.26* (6.65, 12.56) 1,649 11 12.05* (6.01, 21.56) 4,385 16 5.82* (3.32, 9.44) 0.094 Lympho mas 2,960 9 13.27* (6.07, 25.2) 1,063 ≦5 12.05* (3.28, 30.85) 1,124 ≦5 NA 0.768 CNS tumors 5,877 29 16.32* (10.93, 23.44) 1,765 12 15.21* (7.86, 26.57) 2,277 11 17.08* (8.53, 30.56) 0.943 Neurobl astoma 1,450 ≦5 20.62* (2.5, 74.48) 369 ≦5 NA 491 0 NA NA Retinobl astoma 728 0 NA 127 0 NA 334 0 NA NA Renal tumors 1,078 0 NA 209 0 NA 444 0 NA NA Hepatic tumors 417 0 NA 74 0 NA 200 0 NA NA Maligna nt bone tumors 250 0 NA 73 0 NA 121 0 NA NA Sarcoma s 3,028 16 50.31* (28.76, 81.71) 850 ≦5 47.17* (15.32, 110.08) 1,297 7 50.36* (20.25, 103.76) 0.858 GCT 2,468 ≦5 5.14 (0.62, 18.57) 679 0 NA 1,287 ≦5 8.62* (1.04, 31.14) NA Epitheli al neoplas ms 3,479 10 16.95* (8.13, 31.17) 1,489 6 14.22* (5.22, 30.95) 1,057 ≦5 24.69* (6.73, 63.22) 0.605 Other 261 0 NA 79 0 NA 90 0 NA NA SIR, Standardized incidence ratio. CI, confidence interval. FPM, first primary malignancy. SPM, second primary malignancy. † Cancers were classified into subgroups as defined by the International Classification of Childhood Cancer, Third edition (ICCC-3, November 2012) (https://seer.cancer.gov/iccc/iccc3.html). ‡ p-value comparing the SIRs between non-Latino White and Latino all races using an approximate Chi-squared test. ¶ Hematologic cancers include group 1, leukemias, myeloproliferative diseases, and myelodysplastic diseases; group 2, lymphomas and reticuloendothelial neoplasms. 
Solid cancers include group 3, CNS and miscellaneous intracranial and intraspinal neoplasms; group 4, neuroblastoma and other peripheral nervous cell tumors; group 5, retinoblastoma; group 6, renal tumors; group 7, hepatic tumors; group 8, malignant bone tumors; group 9, soft tissue and other extraosseous sarcomas; group 10, germ cell tumors, trophoblastic tumors, and neoplasms of gonads; group 11, other malignant epithelial neoplasms and malignant melanomas; group 12, other and unspecified malignant neoplasms.
* Statistically significant standardized incidence ratio with p<0.05 assuming a Poisson distribution.

Appendix D: Supplementary materials for Chapter 4

Supplementary Table 4-1. Pathogenic/likely pathogenic variants shared between two siblings in a family.
Blind Family ID | Blind ID | Chromosome:Position (Hg19) | Reference allele/Mutant allele | Gene name | Class | GBS Class | Predicted genetic ancestry
2 | 412, 534 | 1:152284041 | T/- | FLG | frameshift | Gold | NHAPI
4 | 312, 48 | 14:94847262 | T/A | SERPINA1 | missense | Gold | NHW
4 | 312, 48 | 6:26091179 | C/G | HFE | missense | Gold | NHW
5 | 656, 530 | 3:158320712 | C/T | MLF1 | nonsense | Gold | NHAPI
6 | 396, 277 | 17:7578263 | G/A | TP53 | nonsense | Gold | NHW
10 | 143, 47 | 17:7578205 | C/T | TP53 | missense | Gold | HW
15 | 33, 269 | 3:128202711 | G/A | GATA2 | nonsense | Gold | NHB
17 | 417, 542 | 6:26091179 | C/G | HFE | missense | Gold | NHW
18 | 316, 338 | 6:26091179 | C/G | HFE | missense | Gold | NHAPI
20 | 418, 53 | 14:94844947 | C/T | SERPINA1 | missense | Gold | HW
26 | 17, 384 | 17:7578397 | T/G | TP53 | missense | Gold | HW
26 | 17, 384 | 9:8636844 | G/C | PTPRD | splice | Gold | HW
27 | 369, 435 | 11:108143453 | -/A | ATM | frameshift | Gold | HW
27 | 369, 435 | 6:26091179 | C/G | HFE | missense | Gold | HW
32 | 414, 464 | 11:5248173 | C/T | HBB | missense | Gold | NHAPI
1 | 654, 562 | 11:108138003 | T/C | ATM | missense | Silver | HW
1 | 654, 562 | 14:45658449 | A/G | FANCM | missense | Silver | HW
1 | 654, 562 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
1 | 654, 562 | 21:36164605 | A/G | RUNX1 | missense | Silver | HW
1 | 654, 562 | 21:36164605 | A/C | RUNX1 | missense | Silver | HW
1 | 654, 562 | 6:43568762 | A/G | POLH | missense | Silver | HW
1 | 654, 562 | 16:3304626 | C/G | MEFV | missense | Silver | HW
1 | 654, 562 | 22:19226892 | C/A | CLTCL1 | missense | Silver | HW
1 | 654, 562 | 5:150901476 | G/C | FAT2 | missense | Silver | HW
2 | 412, 534 | 1:155159708 | T/C | MUC1 | missense | Silver | NHAPI
2 | 412, 534 | 15:42035242 | C/A | MGA | missense | Silver | NHAPI
2 | 412, 534 | 16:65016087 | A/T | CDH11 | missense | Silver | NHAPI
2 | 412, 534 | 7:151878670 | T/A | KMT2C | missense | Silver | NHAPI
2 | 412, 534 | 7:151884872 | T/C | KMT2C | missense | Silver | NHAPI
2 | 412, 534 | 1:158623092 | T/A | SPTA1 | missense | Silver | NHAPI
2 | 412, 534 | 14:51224551 | A/T | NIN | missense | Silver | NHAPI
2 | 412, 534 | 16:3304626 | C/G | MEFV | missense | Silver | NHAPI
2 | 412, 534 | 19:51378022 | G/A | KLK2 | missense | Silver | NHAPI
2 | 412, 534 | 2:21252534 | G/A | APOB | missense | Silver | NHAPI
2 | 412, 534 | 20:60892813 | G/A | LAMA5 | missense | Silver | NHAPI
3 | 446, 555 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
3 | 446, 555 | 21:36164605 | A/G | RUNX1 | missense | Silver | HW
3 | 446, 555 | 21:36164605 | A/C | RUNX1 | missense | Silver | HW
4 | 312, 48 | 13:103518708 | G/A | ERCC5 | missense | Silver | NHW
4 | 312, 48 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
4 | 312, 48 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHW
4 | 312, 48 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHW
4 | 312, 48 | 1:114372214 | C/G | PTPN22 | splice | Silver | NHW
4 | 312, 48 | 11:64699314 | C/T | PPP2R5B | missense | Silver | NHW
4 | 312, 48 | 6:135732649 | T/C | AHI1 | missense | Silver | NHW
5 | 656, 530 | 1:156838353 | G/A | NTRK1 | missense | Silver | NHAPI
5 | 656, 530 | 12:49420078 | C/T | KMT2D | missense | Silver | NHAPI
5 | 656, 530 | 16:65016087 | A/T | CDH11 | missense | Silver | NHAPI
5 | 656, 530 | 16:65022127 | T/C | CDH11 | missense | Silver | NHAPI
5 | 656, 530 | 17:37872108 | G/A | ERBB2 | missense | Silver | NHAPI
5 | 656, 530 | 20:57415481 | T/G | GNAS | missense | Silver | NHAPI
5 | 656, 530 | 7:92732142 | C/T | SAMD9 | missense | Silver | NHAPI
5 | 656, 530 | 14:92505923 | T/A | TRIP11 | missense | Silver | NHAPI
5 | 656, 530 | 16:27454345 | C/T | IL21R | missense | Silver | NHAPI
5 | 656, 530 | 17:46805405 | C/A | HOXB13 | missense | Silver | NHAPI
5 | 656, 530 | 8:48848402 | A/T | PRKDC | missense | Silver | NHAPI
6 | 396, 277 | 10:27306492 | A/G | ANKRD26 | missense | Silver | NHW
6 | 396, 277 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
6 | 396, 277 | 14:92465749 | C/T | TRIP11 | missense | Silver | NHW
6 | 396, 277 | 19:15376259 | G/C | BRD4 | missense | Silver | NHW
6 | 396, 277 | 8:100861110 | C/T | VPS13B | missense | Silver | NHW
7 | 353, 354 | 11:22646800 | G/A | FANCF | missense | Silver | HW
7 | 353, 354 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
7 | 353, 354 | 11:118307544 | T/C | KMT2A | missense | Silver | HW
8 | 507, 92 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
8 | 507, 92 | 1:156756485 | C/T | PRCC | missense | Silver | HW
8 | 507, 92 | 1:158592859 | C/T | SPTA1 | missense | Silver | HW
8 | 507, 92 | 16:50744565 | T/G | NOD2 | missense | Silver | HW
8 | 507, 92 | 17:73837042 | T/C | UNC13D | missense | Silver | HW
8 | 507, 92 | 22:42322716 | G/C | TNFRSF13C | missense | Silver | HW
9 | 156, 254 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
9 | 156, 254 | 20:23030011 | G/A | THBD | missense | Silver | NHW
10 | 143, 47 | 10:43610119 | G/A | RET | missense | Silver | HW
10 | 143, 47 | 13:32906766 | C/T | BRCA2 | missense | Silver | HW
10 | 143, 47 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
10 | 143, 47 | 2:209116230 | C/G | IDH1 | missense | Silver | HW
10 | 143, 47 | 2:215595164 | G/A | BARD1 | missense | Silver | HW
10 | 143, 47 | 7:140476739 | A/G | BRAF | missense | Silver | HW
10 | 143, 47 | 1:169580754 | A/G | SELP | missense | Silver | HW
10 | 143, 47 | 16:72821610 | CCGCCGCCA/- | ZFHX3 | proteinDel | Silver | HW
10 | 143, 47 | 19:1632363 | C/T | TCF3 | missense | Silver | HW
10 | 143, 47 | 2:169830263 | G/T | ABCB11 | missense | Silver | HW
10 | 143, 47 | 22:28196284 | A/G | MN1 | missense | Silver | HW
10 | 143, 47 | 4:110687847 | G/A | CFI | missense | Silver | HW
11 | 220, 427 | 9:21974681 | A/G | CDKN2A | missense | Silver | HW
11 | 220, 427 | 1:36937913 | C/T | CSF3R | missense | Silver | HW
11 | 220, 427 | 12:56494932 | T/C | ERBB3 | missense | Silver | HW
11 | 220, 427 | 16:15829354 | C/G | MYH11 | missense | Silver | HW
11 | 220, 427 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
11 | 220, 427 | 8:145737138 | A/G | RECQL4 | missense | Silver | HW
11 | 220, 427 | 2:202082397 | C/T | CASP10 | missense | Silver | HW
11 | 220, 427 | 5:150925641 | A/G | FAT2 | missense | Silver | HW
11 | 220, 427 | 8:71068718 | C/T | NCOA2 | missense | Silver | HW
13 | 645, 381 | 16:14014038 | C/T | ERCC4 | missense | Silver | NHW
13 | 645, 381 | 1:183559352 | C/T | NCF2 | missense | Silver | NHW
13 | 645, 381 | 15:55497853 | G/C | RAB27A | missense | Silver | NHW
13 | 645, 381 | 16:50745926 | C/T | NOD2 | missense | Silver | NHW
13 | 645, 381 | 19:10472598 | C/T | TYK2 | missense | Silver | NHW
13 | 645, 381 | 2:109380221 | T/G | RANBP2 | missense | Silver | NHW
15 | 33, 269 | 10:43610119 | G/A | RET | missense | Silver | NHB
15 | 33, 269 | 13:32912733 | C/T | BRCA2 | missense | Silver | NHB
15 | 33, 269 | 13:32930730 | C/T | BRCA2 | missense | Silver | NHB
15 | 33, 269 | 17:63545677 | G/A | AXIN2 | missense | Silver | NHB
15 | 33, 269 | 7:116397714 | C/T | MET | missense | Silver | NHB
15 | 33, 269 | 7:151878670 | T/A | KMT2C | missense | Silver | NHB
15 | 33, 269 | 1:157557111 | C/A | FCRL4 | missense | Silver | NHB
15 | 33, 269 | 14:56107141 | A/G | KTN1 | missense | Silver | NHB
15 | 33, 269 | 2:100210621 | T/C | AFF3 | missense | Silver | NHB
15 | 33, 269 | 4:187510072 | G/A | FAT1 | missense | Silver | NHB
16 | 307, 455 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
16 | 307, 455 | 4:1980595 | C/T | WHSC1 | missense | Silver | HW
16 | 307, 455 | 9:21970924 | A/G | CDKN2A | missense | Silver | HW
16 | 307, 455 | 19:33498967 | G/A | RHPN2 | missense | Silver | HW
16 | 307, 455 | 19:45284301 | G/A | CBLC | missense | Silver | HW
16 | 307, 455 | 19:45683032 | G/T | BLOC1S3 | missense | Silver | HW
16 | 307, 455 | 2:21256262 | C/T | APOB | missense | Silver | HW
16 | 307, 455 | 9:139264888 | T/A | CARD9 | missense | Silver | HW
17 | 417, 542 | 1:120468082 | C/T | NOTCH2 | missense | Silver | NHW
17 | 417, 542 | 10:43610119 | G/A | RET | missense | Silver | NHW
17 | 417, 542 | 12:48181508 | G/C | HDAC7 | missense | Silver | NHW
17 | 417, 542 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
17 | 417, 542 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHW
17 | 417, 542 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHW
17 | 417, 542 | 2:175436937 | T/C | WIPF1 | missense | Silver | NHW
17 | 417, 542 | 20:60893638 | G/A | LAMA5 | missense | Silver | NHW
18 | 316, 338 | 11:94179043 | C/A | MRE11A | missense | Silver | NHAPI
18 | 316, 338 | 11:94179043 | C/A | MRE11 | missense | Silver | NHAPI
18 | 316, 338 | 12:6788123 | A/G | ZNF384 | missense | Silver | NHAPI
18 | 316, 338 | 22:30000094 | A/G | NF2 | missense | Silver | NHAPI
18 | 316, 338 | 9:139405696 | G/A | NOTCH1 | missense | Silver | NHAPI
18 | 316, 338 | 16:50745926 | C/T | NOD2 | missense | Silver | NHAPI
18 | 316, 338 | 2:160755340 | T/G | LY75-CD302 | missense | Silver | NHAPI
18 | 316, 338 | 2:160755340 | T/G | LY75-CD302 | missense | Silver | NHAPI
18 | 316, 338 | 22:26872985 | T/C | HPS4 | missense | Silver | NHAPI
19 | 584, 653 | 10:43610119 | G/A | RET | missense | Silver | HW
19 | 584, 653 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
19 | 584, 653 | 17:7606119 | G/A | WRAP53 | missense | Silver | HW
19 | 584, 653 | 10:90771767 | G/A | FAS | missense | Silver | HW
19 | 584, 653 | 16:72992667 | C/T | ZFHX3 | missense | Silver | HW
19 | 584, 653 | 3:41952854 | T/C | ULK4 | missense | Silver | HW
19 | 584, 653 | 6:168297647 | A/G | MLLT4 | missense | Silver | HW
19 | 584, 653 | 6:168297647 | A/G | AFDN | missense | Silver | HW
20 | 418, 53 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
20 | 418, 53 | 16:11219940 | C/T | CLEC16A | missense | Silver | HW
20 | 418, 53 | 17:73825036 | C/G | UNC13D | missense | Silver | HW
20 | 418, 53 | 17:73827335 | T/G | UNC13D | missense | Silver | HW
20 | 418, 53 | 2:216211530 | C/G | ATIC | missense | Silver | HW
20 | 418, 53 | 22:36688157 | C/T | MYH9 | missense | Silver | HW
20 | 418, 53 | 3:142212065 | A/G | ATR | missense | Silver | HW
20 | 418, 53 | 8:100128022 | A/G | VPS13B | missense | Silver | HW
20 | 418, 53 | 8:100791158 | G/A | VPS13B | missense | Silver | HW
21 | 302, 297 | 10:115341664 | G/T | HABP2 | nonsense | Silver | NHAPI
21 | 302, 297 | 12:133212494 | C/T | POLE | missense | Silver | NHAPI
21 | 302, 297 | 16:65016087 | A/T | CDH11 | missense | Silver | NHAPI
21 | 302, 297 | 4:57777331 | A/G | REST | missense | Silver | NHAPI
21 | 302, 297 | 1:43396500 | C/T | SLC2A1 | missense | Silver | NHAPI
21 | 302, 297 | 11:95825940 | C/T | MAML2 | missense | Silver | NHAPI
21 | 302, 297 | 17:56356502 | A/G | MPO | missense | Silver | NHAPI
21 | 302, 297 | 3:105400444 | A/- | CBLB | frameshift | Silver | NHAPI
21 | 302, 297 | 3:105438979 | T/A | CBLB | missense | Silver | NHAPI
21 | 302, 297 | 4:187557837 | C/T | FAT1 | missense | Silver | NHAPI
22 | 293, 398 | 16:65016087 | A/T | CDH11 | missense | Silver | NHB
22 | 293, 398 | 4:55602923 | A/G | KIT | missense | Silver | NHB
22 | 293, 398 | X:129155042 | G/A | BCORL1 | missense | Silver | NHB
22 | 293, 398 | 11:10051350 | G/A | SBF2 | missense | Silver | NHB
22 | 293, 398 | 16:3304626 | C/G | MEFV | missense | Silver | NHB
22 | 293, 398 | 16:3304739 | A/G | MEFV | missense | Silver | NHB
22 | 293, 398 | 2:160628393 | T/C | LY75-CD302 | missense | Silver | NHB
22 | 293, 398 | 2:160628393 | T/C | LY75-CD302 | missense | Silver | NHB
23 | 58, 395 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
23 | 58, 395 | 1:172629250 | C/A | FASLG | missense | Silver | NHW
23 | 58, 395 | 13:37394037 | G/C | RFXAP | missense | Silver | NHW
23 | 58, 395 | 18:2926779 | A/G | LPIN2 | missense | Silver | NHW
23 | 58, 395 | 2:21233972 | T/C | APOB | missense | Silver | NHW
23 | 58, 395 | 2:223085955 | G/T | PAX3 | missense | Silver | NHW
24 | 106, 524 | 12:48190030 | T/C | HDAC7 | missense | Silver | NHW
24 | 106, 524 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
24 | 106, 524 | 21:36164679 | C/G | RUNX1 | missense | Silver | NHW
24 | 106, 524 | 4:106157831 | C/A | TET2 | missense | Silver | NHW
24 | 106, 524 | 7:142459791 | G/T | PRSS1 | missense | Silver | NHW
25 | 203, 287 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
25 | 203, 287 | 2:39222341 | G/A | SOS1 | missense | Silver | NHW
25 | 203, 287 | 1:198666000 | C/T | PTPRC | missense | Silver | NHW
25 | 203, 287 | 12:7173893 | G/A | C1S | missense | Silver | NHW
25 | 203, 287 | 16:50745332 | -/G | NOD2 | frameshift | Silver | NHW
25 | 203, 287 | 16:50745331 | G/- | NOD2 | frameshift | Silver | NHW
25 | 203, 287 | 6:47846449 | C/G | PTCHD4 | missense | Silver | NHW
25 | 203, 287 | 8:100155318 | G/A | VPS13B | missense | Silver | NHW
26 | 17, 384 | 15:42028437 | T/A | MGA | missense | Silver | HW
26 | 17, 384 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
26 | 17, 384 | 22:29091178 | C/A | CHEK2 | missense | Silver | HW
26 | 17, 384 | 1:11105539 | T/C | MASP2 | missense | Silver | HW
26 | 17, 384 | 11:63987441 | C/G | FERMT3 | missense | Silver | HW
26 | 17, 384 | 16:10989612 | G/A | CIITA | missense | Silver | HW
26 | 17, 384 | 16:89985913 | T/C | MC1R | missense | Silver | HW
26 | 17, 384 | 8:100844615 | A/G | VPS13B | missense | Silver | HW
26 | 17, 384 | X:153592640 | C/T | FLNA | missense | Silver | HW
27 | 369, 435 | 11:108138003 | T/C | ATM | missense | Silver | HW
27 | 369, 435 | 11:108143454 | T/A | ATM | missense | Silver | HW
27 | 369, 435 | 14:45658449 | A/G | FANCM | missense | Silver | HW
27 | 369, 435 | 16:65016087 | A/T | CDH11 | missense | Silver | HW
27 | 369, 435 | 21:36164605 | A/G | RUNX1 | missense | Silver | HW
27 | 369, 435 | 21:36164605 | A/C | RUNX1 | missense | Silver | HW
27 | 369, 435 | 10:103826291 | C/T | HPS6 | missense | Silver | HW
27 | 369, 435 | 2:21249716 | C/T | APOB | missense | Silver | HW
27 | 369, 435 | 6:133072318 | C/A | VNN2 | missense | Silver | HW
28 | 256, 88 | 1:45797401 | G/A | MUTYH | missense | Silver | NHAPI
28 | 256, 88 | 16:65016087 | A/T | CDH11 | missense | Silver | NHAPI
28 | 256, 88 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHAPI
28 | 256, 88 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHAPI
28 | 256, 88 | 8:90990527 | G/A | NBN | missense | Silver | NHAPI
28 | 256, 88 | 1:235973750 | T/C | LYST | missense | Silver | NHAPI
28 | 256, 88 | 16:72821610 | CCGCCGCCA/- | ZFHX3 | proteinDel | Silver | NHAPI
28 | 256, 88 | 2:21252534 | G/A | APOB | missense | Silver | NHAPI
28 | 256, 88 | 2:176972817 | G/A | HOXD11 | missense | Silver | NHAPI
28 | 256, 88 | 6:135644371 | T/C | AHI1 | missense | Silver | NHAPI
28 | 256, 88 | 8:100887784 | C/G | VPS13B | missense | Silver | NHAPI
29 | 364, 258 | 16:65016087 | A/T | CDH11 | missense | Silver | NHAIAN
29 | 364, 258 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHAIAN
29 | 364, 258 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHAIAN
29 | 364, 258 | 11:9829590 | C/- | SBF2 | frameshift | Silver | NHAIAN
29 | 364, 258 | 14:56084820 | G/T | KTN1 | missense | Silver | NHAIAN
30 | 553, 295 | 16:68849506 | C/T | CDH1 | missense | Silver | HAPI
30 | 553, 295 | 12:56495023 | G/A | ERBB3 | missense | Silver | HAPI
30 | 553, 295 | 16:65016087 | A/T | CDH11 | missense | Silver | HAPI
30 | 553, 295 | 19:17943415 | G/C | JAK3 | missense | Silver | HAPI
30 | 553, 295 | 2:167129256 | C/A | SCN9A | missense | Silver | HAPI
30 | 553, 295 | 4:55598046 | T/C | KIT | missense | Silver | HAPI
30 | 553, 295 | 13:20605511 | T/C | ZMYM2 | missense | Silver | HAPI
30 | 553, 295 | 14:104173544 | G/T | XRCC3 | missense | Silver | HAPI
30 | 553, 295 | 16:3304626 | C/G | MEFV | missense | Silver | HAPI
30 | 553, 295 | 16:72992667 | C/T | ZFHX3 | missense | Silver | HAPI
30 | 553, 295 | 20:60893635 | C/T | LAMA5 | missense | Silver | HAPI
30 | 553, 295 | 3:41759248 | G/C | ULK4 | missense | Silver | HAPI
30 | 553, 295 | 3:142285065 | T/C | ATR | missense | Silver | HAPI
30 | 553, 295 | 4:88036015 | A/T | AFF1 | missense | Silver | HAPI
30 | 553, 295 | X:152807192 | G/T | ATP2B3 | missense | Silver | HAPI
31 | 187, 63 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
31 | 187, 63 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHW
31 | 187, 63 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHW
31 | 187, 63 | 7:91732035 | G/C | AKAP9 | missense | Silver | NHW
31 | 187, 63 | 1:156756602 | C/T | PRCC | missense | Silver | NHW
31 | 187, 63 | 14:92472218 | T/C | TRIP11 | missense | Silver | NHW
31 | 187, 63 | 16:82032745 | G/T | SDR42E1 | missense | Silver | NHW
32 | 414, 464 | 14:45658123 | A/G | FANCM | missense | Silver | NHAPI
32 | 414, 464 | 16:65016087 | A/T | CDH11 | missense | Silver | NHAPI
32 | 414, 464 | 17:8137796 | G/A | CTC1 | missense | Silver | NHAPI
32 | 414, 464 | 2:209108284 | T/C | IDH1 | missense | Silver | NHAPI
32 | 414, 464 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHAPI
32 | 414, 464 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHAPI
32 | 414, 464 | 7:91641854 | T/C | AKAP9 | missense | Silver | NHAPI
32 | 414, 464 | 1:57378100 | C/T | C8A | missense | Silver | NHAPI
32 | 414, 464 | 16:3304626 | C/G | MEFV | missense | Silver | NHAPI
32 | 414, 464 | 22:19184092 | A/G | CLTCL1 | missense | Silver | NHAPI
32 | 414, 464 | 4:187542780 | G/A | FAT1 | missense | Silver | NHAPI
34 | 24, 560 | 12:133263886 | C/G | POLE | missense | Silver | NHW
34 | 24, 560 | 16:65016087 | A/T | CDH11 | missense | Silver | NHW
34 | 24, 560 | 19:45867799 | G/A | ERCC2 | missense | Silver | NHW
34 | 24, 560 | 19:4362683 | C/T | SH3GL1 | missense | Silver | NHW
34 | 24, 560 | 2:46574131 | C/A | EPAS1 | missense | Silver | NHW
34 | 24, 560 | 20:60893635 | C/T | LAMA5 | missense | Silver | NHW
34 | 24, 560 | 3:148880043 | G/A | HPS3 | missense | Silver | NHW
34 | 24, 560 | 6:26091185 | A/T | HFE | missense | Silver | NHW
35 | 610, 28 | 13:32911419 | C/T | BRCA2 | missense | Silver | NHB
35 | 610, 28 | 16:65016087 | A/T | CDH11 | missense | Silver | NHB
35 | 610, 28 | 16:89805306 | A/C | FANCA | missense | Silver | NHB
35 | 610, 28 | 16:89871774 | G/A | FANCA | missense | Silver | NHB
35 | 610, 28 | 21:36164605 | A/G | RUNX1 | missense | Silver | NHB
35 | 610, 28 | 21:36164605 | A/C | RUNX1 | missense | Silver | NHB
35 | 610, 28 | 9:35078170 | C/T | FANCG | missense | Silver | NHB
35 | 610, 28 | 1:235827898 | G/A | LYST | missense | Silver | NHB
35 | 610, 28 | 11:45827265 | T/G | SLC35C1 | UTR_5 | Silver | NHB
35 | 610, 28 | 15:80450459 | A/G | FAH | missense | Silver | NHB
35 | 610, 28 | 16:50745332 | -/G | NOD2 | frameshift | Silver | NHB
35 | 610, 28 | 16:50745331 | G/- | NOD2 | frameshift | Silver | NHB
35 | 610, 28 | 19:6222553 | G/A | MLLT1 | missense | Silver | NHB
35 | 610, 28 | 4:110662159 | C/G | CFI | missense | Silver | NHB
35 | 610, 28 | 5:38493836 | G/T | LIFR | missense | Silver | NHB
35 | 610, 28 | 6:44233488 | T/A | NFKBIE | missense | Silver | NHB
35 | 610, 28 | 9:139840113 | G/T | C8G | missense | Silver | NHB
Abbreviations: NHW, non-Hispanic/Latino White; NHB, non-Hispanic/Latino Black; NHAPI, non-Hispanic/Latino Asian/Pacific Islander; NHAIAN, non-Hispanic/Latino American Indian/Alaskan Native; HW, Hispanic/Latino White; HAPI, Hispanic/Latino Asian/Pacific Islander; HAIAN, Hispanic/Latino American Indian/Alaskan Native.
Abstract
Incidence trends in acute lymphoblastic leukemia (ALL) demonstrate disparities by race and ethnicity. We used data from the Surveillance, Epidemiology, and End Results (SEER) registry to evaluate patterns in ALL incidence from 2000-2016, including the association between the percentage of county residents born in a foreign country and ALL incidence. Among 23,829 individuals of all ages diagnosed with ALL, 8,297 (34.8%) were Latinos, 11,714 (49.2%) were non-Latino (NL) Whites, and 1,639 (6.9%) were NL Blacks. Latinos had the largest increase in the age-adjusted incidence rate (AAIR) in this period compared to other racial/ethnic groups for both children and adults: the AAIR was 1.6 times higher for Latinos (AAIR=2.43; 95%CI: 2.37, 2.49) than for NL Whites (AAIR=1.56; 95%CI: 1.53, 1.59; P<0.01). The AAIR for all subjects increased approximately 1% per year from 2000-2016 (annual percent change=0.97; 95%CI: 0.67, 1.27), with the highest increase in Latinos (annual percent change=1.18; 95%CI: 0.76, 1.60). In multivariable models evaluating the contribution of the percentage of county residents who were foreign-born to ALL risk, a positive association was found for NL Whites (P-trend<0.01) and Blacks (P-trend<0.01), but an inverse association was found for Latinos (P-trend<0.01), consistent with tenets of the “Hispanic paradox,” in which better health outcomes exist for foreign-born Latinos.
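As a rough illustration of the two summary measures quoted in this paragraph, the following Python sketch computes a rate ratio and an annual percent change (APC). It assumes the APC comes from an unweighted least-squares fit of ln(rate) on calendar year, with APC = (exp(slope) - 1) x 100; the abstract does not state how the APC was estimated, so this is one common definition rather than the author's procedure. The function names and the yearly rates in the usage lines are hypothetical and are not SEER data.

```python
import math


def rate_ratio(rate_a, rate_b):
    """Ratio of two age-adjusted incidence rates (rate_a relative to rate_b)."""
    return rate_a / rate_b


def annual_percent_change(years, rates):
    """APC from an unweighted log-linear fit: ln(rate) = a + b * year.

    Assumption: ordinary least squares on the log scale; registry software
    may instead use weighted fits or joinpoint models.
    """
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return (math.exp(slope) - 1.0) * 100.0


if __name__ == "__main__":
    # AAIRs quoted in the abstract: Latinos 2.43 vs NL Whites 1.56.
    print(round(rate_ratio(2.43, 1.56), 2))

    # Hypothetical series growing exactly 1% per year, for illustration only.
    years = list(range(2000, 2017))
    rates = [1.5 * 1.01 ** (y - 2000) for y in years]
    print(round(annual_percent_change(years, rates), 2))
```

Fitting on the log scale makes the slope a constant proportional change per year, which is why the APC is read off as exp(slope) - 1 rather than from the raw difference in rates.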
Acute lymphoblastic leukemia (ALL) in children is associated with a distinct neonatal cytokine profile. The basis of this neonatal immune phenotype is unknown, but it is potentially related to maternal-fetal immune receptor interactions. We conducted a case-control study of 226 case child-mother pairs and 404 control child-mother pairs to evaluate the role of interaction between human leukocyte antigen (HLA) genotypes in the offspring and maternal killer immunoglobulin-like receptor (KIR) genotypes in the etiology of childhood ALL, while considering potential mediation by neonatal cytokines and the immune-modulating enzyme arginase-II (ARG-II). We observed different associations between offspring HLA-maternal KIR activating profiles and the risk of ALL in different predicted genetic ancestry groups. For instance, in Latino subjects, who experience the highest risk of childhood leukemia, activating profiles were significantly associated with a lower risk of childhood ALL (odds ratio, OR=0.59; 95% confidence interval, CI: 0.49, 0.71) and a higher level of ARG-II at birth (coefficient=0.13; 95%CI: 0.04, 0.22). HLA-KIR activating profiles were also associated with a lower risk of ALL in non-Latino Asians (OR=0.63; 95%CI: 0.38, 1.01), but with a lower TNF-α level (coefficient=-0.27; 95%CI: -0.49, -0.06). Among non-Latino White subjects, no significant association was observed between offspring HLA-maternal KIR interaction and ALL risk or cytokine levels. The current study reports the association between offspring HLA-maternal KIR interaction and the development of childhood ALL, with variation by predicted genetic ancestry. We also observed some associations between activating profiles and immune factors related to cytokine control; however, cytokines did not demonstrate causal mediation of the effect of activating profiles on ALL risk.
The role of race/ethnicity in genetic predisposition to early-onset cancers can be estimated by comparing family-based cancer concordance rates among ethnic groups. We used linked California health registries to evaluate the relative cancer risks for first-degree relatives of patients diagnosed at ages 0-26, and the relative risks of developing distinct second primary malignancies (SPMs). From 1989-2015, we identified 29,631 cancer patients and 62,863 healthy family members. Given probands with cancer, there were increased relative risks of any cancer for siblings and mothers [standardized incidence ratio (SIR)=3.32; 95% confidence interval (CI): 2.85-3.85] and of SPMs (SIR=7.27; 95%CI: 6.56-8.03). A higher relative risk of any cancer in siblings and mothers given a proband with solid cancer (P=0.019) was observed for Latinos (SIR=4.98; 95%CI: 3.82-6.39) compared to non-Latino White subjects (SIR=3.02; 95%CI: 2.12-4.16), supporting a need for increased attention to the genetics of early-onset cancer predisposition and environmental factors in Latinos.
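For readers who want to see how an SIR and its confidence interval can be derived from an observed and an expected count, the sketch below uses an exact Poisson interval (often called the Garwood interval), found by bisection on the Poisson distribution function. The supplementary tables only say that significance was assessed "assuming a Poisson distribution", so the choice of the exact interval is an assumption, and the function names are hypothetical. The expected count in the usage lines is back-calculated from a rounded SIR in Supplemental Table 3-5 (50 observed SPMs, SIR 6.1) and is therefore only approximate; with that input the result comes close to the interval printed in the table (4.53, 8.04).

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu); adequate for modest counts."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)


def _bisect_root(g, lo, hi, iterations=200):
    """Root of a monotone function g on [lo, hi] that changes sign."""
    g_lo = g(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_poisson_ci(observed, alpha=0.05):
    """Exact two-sided CI for a Poisson mean given one observed count."""
    upper_bound = observed + 10.0 * math.sqrt(observed + 1.0) + 20.0
    if observed == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= observed | mu) = alpha / 2
        lower = _bisect_root(
            lambda mu: (1.0 - poisson_cdf(observed - 1, mu)) - alpha / 2,
            0.0, upper_bound)
    # Upper limit: P(X <= observed | mu) = alpha / 2
    upper = _bisect_root(
        lambda mu: poisson_cdf(observed, mu) - alpha / 2,
        0.0, upper_bound)
    return lower, upper


def sir_with_ci(observed, expected, alpha=0.05):
    """SIR = observed / expected, with limits scaled by the expected count."""
    lower, upper = exact_poisson_ci(observed, alpha)
    return observed / expected, lower / expected, upper / expected


if __name__ == "__main__":
    # Hypothetical expected count, back-calculated from the rounded SIR of 6.1.
    expected = 50 / 6.1
    sir, lo, hi = sir_with_ci(50, expected)
    print(f"SIR={sir:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

The interval is computed on the observed count and then divided by the expected count, which treats the expected count as fixed; that is the usual convention when the reference rates come from a large population.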
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Pathogenic variants in cancer predisposition genes and risk of non-breast multiple primary cancers in breast cancer patients
Identifying genetic, environmental, and lifestyle determinants of ethnic variation in risk of pancreatic cancer
Genetic epidemiological approaches in the study of risk factors for hematologic malignancies
Prostate cancer disparities among Californian Latinos by country of origin: clinical characteristics, incidence, treatment received and survival
Origins of the gender disparity in bladder cancer risk: a SEER analysis
Genetic and environmental risk factors for childhood cancer
Red and processed meat consumption and colorectal cancer risk: meta-analysis of case-control studies
Diet quality and pancreatic cancer incidence in the multiethnic cohort
Understanding acute lymphoblastic leukemia in different ethnic groups in the United States
Examining the relationship between common genetic variation, type 2 diabetes and prostate cancer risk in the multiethnic cohort
The role of heritability and genetic variation in cancer and cancer survival
Fish consumption and risk of colorectal cancer
Body size and the risk of prostate cancer in the multiethnic cohort
Disparities in gallbladder, intra-hepatic bile duct, and other biliary cancers among multi-ethnic populations: a California Cancer Registry study
Dietary and supplementary folate intake and prostate cancer risk
Genetic variation in the base excision repair pathway, environmental risk factors and colorectal adenoma risk
Early childhood health experience & adult phenotype in twins
The interplay between tobacco exposure and polygenic risk score for growth on birthweight and childhood acute lymphoblastic leukemia
Air pollution, smoking, and multigenerational DNA methylation signatures: a study of two southern California cohorts
sFLT-1 gene polymorphisms and risk of severe-spectrum hypertensive disorders of pregnancy
Asset Metadata
Creator
Feng, Qianxi (author)
Core Title
Ancestral/Ethnic variation in the epidemiology and genetic predisposition of early-onset hematologic cancers
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Epidemiology
Degree Conferral Date
2022-05
Publication Date
04/16/2022
Defense Date
02/24/2022
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
ancestral/ethnic variation,early-onset cancer,Epidemiology,genetic predisposition,Genetics,hematologic cancer,OAI-PMH Harvest
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Wiemels, Joseph (committee chair), Bhojwani, Deepa (committee member), Chiang, Charleston (committee member), de Smith, Adam (committee member), Gauderman, William J. (committee member)
Creator Email
lynnsenkei@gmail.com,qianxife@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC110964850
Unique identifier
UC110964850
Document Type
Dissertation
Rights
Feng, Qianxi
Type
texts
Source
20220417-usctheses-batch-927 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu