Reproducibility and management of big data in brain MRI studies
(USC Thesis Other)
REPRODUCIBILITY AND MANAGEMENT OF BIG DATA IN BRAIN MRI STUDIES
by
Alyssa Huichao Zhu

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (BIOMEDICAL ENGINEERING)

December 2024

DEDICATION

To my grandparents, Jean Kim and Inok Won, and my cousin, Jerome Chung, who always believed in me but passed away before they could see me reach the finish line.

ACKNOWLEDGMENTS

There is no PhD journey without a PhD advisor, and so I have to start by thanking my PI Neda Jahanshad. The wide range of works presented in this dissertation are the result of her opening a myriad of research doors with the edict and encouragement to go forth and explore. I will always appreciate the truly unique opportunities and learning adventures that would not have been possible without her. I would also like to thank Paul Thompson, who despite leading the Imaging Genetics Center and the ENIGMA Consortium, always found time in a busy schedule to discuss projects and provide feedback. Additional thanks go to my committee members Vasilis Marmarelis, Brent Liu, and Wendy Mack for their teaching and advice along the way.

The work presented in this dissertation was also made possible by labmates and collaborators. Most of all, my deepest appreciation goes to Talia Nir, my partner in crime for research and mentorship, who made everything so easy and somehow, despite our shared anxieties, so joyful.

Lastly, I would not be here without my family, especially my parents, who have given me everything possible to get to where I am. I am also eternally grateful to my friends, both within academia and without, for filling my life with something other than code, statistics, and computer screens. A special thanks to WiSE FoN, Julie Yu and Daniel Moyer (we should’ve come up with a group name), Sophia Thomopoulos (along with Meleka and Bia), Tater Thots, #many-pickles, and #PhDGang.
TABLE OF CONTENTS

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
Chapter 2: The Biobank Data Parser
  2.1 Introduction
  2.2 Methods: Developing the Back-End
  2.3 Results: User Interfaces and Provided Functions
  2.4 Discussion
  2.5 Conclusion
  2.6 Acknowledgments
Chapter 3: From Large Scale Single Studies to Large Scale Multisite Studies
  3.1 Cross-sectional and longitudinal ApoE2 and ApoE4 associations with regional QSM and diffusion MRI in the UK Biobank
    3.1.1 Introduction
    3.1.2 Methods
      3.1.2.1 UK Biobank Participants
      3.1.2.2 T1-Weighted MRI
      3.1.2.3 Quantitative Magnetic Susceptibility
      3.1.2.4 Diffusion-Weighted MRI
      3.1.2.5 Statistical Analyses
    3.1.3 Results
      3.1.3.1 ApoE4 Microstructural Associations
      3.1.3.2 ApoE2 Microstructural Associations
      3.1.3.3 ApoE-by-Age Interactions
    3.1.4 Discussion
    3.1.5 Acknowledgments
  3.2 Robust Automatic Corpus Callosum Analysis Toolkit: Mapping Callosal Development Across Heterogeneous Multisite Data
    3.2.1 Introduction
    3.2.2 Methods
      3.2.2.1 PING demographics and imaging
      3.2.2.2 Preprocessing and segmentation
      3.2.2.3 Feature extraction
      3.2.2.4 Statistics
    3.2.3 Results
    3.2.4 Discussion
    3.2.5 Acknowledgments
  3.3 Assessing the Potential Integration of UK Biobank Diffusion MRI Data into ENIGMA Studies
Chapter 4: Tensor-Based Morphometry
  4.1 Longitudinal multi-site modeling of brain atrophy trajectories associated with amyloid and tau
    4.1.1 Introduction
    4.1.2 Methods
      4.1.2.1 Datasets
      4.1.2.2 Image Processing
      4.1.2.3 Statistical Analysis
    4.1.3 Results
      4.1.3.1 CSF and PET Biomarker Associations with Volumetric Change
      4.1.3.2 Rate and Acceleration of Volume Changes
    4.1.4 Discussion and Conclusion
    4.1.5 Acknowledgments
  4.2 Age-Related Heterochronicity Of Brain Morphometry May Bias Voxelwise Findings
    4.2.1 Introduction
    4.2.2 Methods
      4.2.2.1 Creating the Templates
      4.2.2.2 TBM Processing
      4.2.2.3 Statistics
    4.2.3 Results
      4.2.3.1 Ventricular Jacobians
      4.2.3.2 Voxelwise Analysis
    4.2.4 Discussion
    4.2.5 Acknowledgments
Chapter 5: eHarmonize
  5.1 Introduction
  5.2 Methods
    5.2.1 Datasets
    5.2.2 Imaging Processing
    5.2.3 Quality Control
    5.2.4 Harmonization Methods
    5.2.5 Creating the Reference Curves
    5.2.6 Reference Curve Evaluation
    5.2.7 eHarmonize
  5.3 Results
    5.3.1 Harmonization methods
    5.3.2 The Lifespan Reference
    5.3.3 Post-Harmonization Analyses
  5.4 Discussion
  5.5 Acknowledgments
  5.6 Supplementary Materials
Chapter 6: Replication Studies and Future Directions
  6.1 Introduction
  6.2 Family History of Suicide in the ABCD Study
    6.2.1 Methods
    6.2.2 Results
  6.3 Family History of Suicide in the HBN Study
    6.3.1 Methods
    6.3.2 Results
  6.4 Future Directions
  6.5 Acknowledgments
References

LIST OF TABLES

1.1 Effects of Acquisition Parameters on Diffusion MRI
2.1 Summary of Existing Tools to Manage UK Biobank Data
3.1.1 UK Biobank Demographics and ApoE Genotype Data
3.1.2 ApoE4 Associations with QSM
3.1.3 ApoE2 Associations with QSM
3.2.1 Dice Similarity Coefficients to Compare RACCAT, FreeSurfer, and Manual Segmentations
3.2.2 Correlation Coefficients to Compare RACCAT, FreeSurfer, and Manual Segmentations
3.3.1 R-Squared values for Regional Diffusion Measures
4.1.1 Dataset Demographics for Longitudinal TBM
4.1.2 Baseline diagnosis, CSF and PET Biomarkers
4.2.1 Number of Significant Voxels Reported With Age and Sex Associations
5.1 Study Demographics and Image Acquisition Parameters Used in the eHarmonize Project
5.2 Comparison of Regional Ages of Peak Fractional Anisotropy
5.3 Demographics and ApoE4 Information for Lifespan Analyses
S5.1 Effect Sizes of Regional Associations Between Sex and White Matter Microstructure
S5.2 Effect Sizes of Regional Associations Between ApoE4 and White Matter Microstructure
6.1 Frequency Comparisons of Suicidal Thoughts and Behaviors Between Controls and Children with Parental History of Suicide in the ABCD Study
6.2 Demographics of Children in the HBN Study
6.3 Demographics of Children with Parental History of Suicide in the HBN Study

LIST OF FIGURES

2.1 Schematic of UK Biobank Data Parser
2.2 UK Biobank Metadata of ICD-10 Codes and Example Parser Code
2.3 UK Biobank Study Timeline and Example Timeline Matching Code
2.4 Biobank Data Parser Command Line and Python Interface Examples
2.5 Examples of Available Visualization Tools in the Biobank Data Parser
3.1.1 Processing Stream of Diffusion and QSM Data
3.1.2 Hippocampal and Amygdalar QSM Values by ApoE Genotype
3.1.3 Hippocampal and Amygdalar QSM Residuals by Age Plots
3.1.4 Hippocampal and Amygdalar Diffusion MRI Residuals by Age Plots
3.2.1 Example Corpus Callosum Segmentations in the PING Dataset
3.2.2 Comparison of Segmentation Outputs from RACCAT, FreeSurfer, and Manual Segmentations
3.2.3 Scatter Plots of Corpus Callosum Morphology Measures by Age
3.3.1 Scatter Plots Comparing Fornix and Fornix/Stria Terminalis Fractional Anisotropy Between Pipelines
4.1.1 Longitudinal Tensor-Based Morphometry Pipeline
4.1.2 Voxelwise Standardized Beta Maps of Significant Associations Between Volumetric Changes and Aβ and pTau biomarkers
4.1.3 Group Comparisons of Rate and Acceleration Effects
4.2.1 Age Trends of Ventricular Jacobians from 3D and 4D Templates
4.2.2 Voxelwise Maps Showing Regional Differences Between TBM Pipelines Using Different Templates
4.2.3 Voxelwise Age Associations with TBM Compared Pipelines Using Different Templates
5.1 Plots of Global Fractional Anisotropy From Different Datasets Before and After Harmonization
5.2 Lifespan Reference Curves for Global FA and Their Application to Held Out Datasets
5.3 Global FA Shift Parameter Plotted Against Voxel Volume Shows a Negative Correlation
5.4 FA Associations with ApoE4 and Effect Size Comparisons Between Pre- and Post-Harmonized Data
5.5 The Impact of Harmonization Paradigm on the FA-Age Associations in Longitudinal Data
5.6 Presentation of Different Aspects of eHarmonize: Usage Page, the Atlas Used, and an Example Quality Control Output
S5.1 Site-wise Plots of FA-Age Associations to Demonstrate Manufacturer Differences in the ABCD Study
S5.2 Outlier Lifespan Trajectories of Regions with Late Peak Age or Accelerated Decline
6.1 Flow Chart Showing the Subject Selection Process in the ABCD Study of Potential Effects of the Family History of Suicide (FHoS) Impacts on Neurodevelopment
6.2 Effects of Parental and Child Sex on Cortical Surface Area in the ABCD FHoS Study
6.3 Effects of Paternal History of Suicide on Cortical Surface Area are Widespread in Boys in the ABCD Study
6.4 In the HBN Study, Cortical Thickness was Lower in Children with FHoS Than Controls

ABSTRACT

Advances in MRI have led to discoveries of factors that affect brain macro- and microstructure in health and disease. The small size of many neuroimaging studies led to concerns about poor reproducibility of research findings and calls for the comparison and pooling of multi-cohort datasets to establish the consistency of reported effects. Across studies, MRI acquisitions vary in sequence parameters such as spatial resolution as well as hardware used. Currently, two major approaches are used to achieve robust results: the collection of large datasets and the combination of smaller datasets. Here, I hope to address unique challenges presented by each of the methods. While large datasets may provide the statistical power needed to robustly estimate the small effect sizes often seen in neuroimaging studies, they can be difficult to use for researchers without vast computational resources.
In Chapter 2, I present the Biobank Data Parser, which uses metadata to provide researchers with computationally efficient ways to query, process, and visualize data. While singularly large biobanks are useful on their own, they can also be combined with other datasets to improve the generalizability of results. Comparing and pooling MRI measures requires concerted efforts to harmonize image processing pipelines and output measures to compensate for acquisition-related sources of variance. In Chapter 3, I start with a large scale neuroimaging analysis conducted using data from a single biobank and transition to large scale multi-site studies. Multi-site studies often use templates to harmonize image processing pipelines and statistical analyses. Chapter 4 introduces the concept of brain MRI templates in an image processing pipeline in both spatial and temporal dimensions. Then in Chapter 5, I shift to a scalar template, in this case a set of lifespan reference curves for white matter microstructure. However, harmonization of MRI processing alone may not address all sources of multi-site variance, as datasets also vary in recruitment criteria. In Chapter 6, I demonstrate one such case, in which similar analyses of a rare exposure variable did not replicate between two pediatric datasets with very different age ranges and sample sizes. I end by describing future directions meant to address such discrepancies, predominantly the harmonization of non-imaging data, e.g., demographic and questionnaire data.

CHAPTER 1
Introduction

Over the last several years, researchers in the brain imaging field, and in the biological sciences in general, began to realize that they were faced with a reproducibility crisis (Goodman, Fanelli, and Ioannidis 2016; Munafò et al. 2017; Gilmore et al. 2017). Many studies aimed to understand population differences between one group and another, yet attempts at replication were not often considered.
As recently as 10 years ago, most brain imaging studies were limited to a single site, with a single scanner and a single acquisition protocol. In the search for consistent patterns in the underlying neurobiology of disease, studies of different cohorts often did not agree on which brain regions differed significantly between diagnostic groups. Button et al. (2013) drew attention to the low statistical power in many areas of neuroscientific research. In both functional and structural neuroimaging, many studies of disease or genetic effects on the brain had examined imaging data from fewer than 50 individuals, limited by the cost of data collection and the time needed to recruit and scan a large cohort.

Differences in results across studies could also arise due to both methodological differences and biological differences in the cohorts assessed, or due to differences in selection criteria for a given study (Bayer, Thompson, et al. 2022). Brain MRI scans are influenced by scanner hardware and acquisition parameters, such as head coils, imaging gradient non-linearity, software version, image reconstruction algorithm, scanner magnetic field strength, and scanner manufacturer (T. Zhu et al. 2011; McGibney et al. 1993; Jovicich et al. 2014; Teipel et al. 2011; Wolz et al. 2014). These parameters and factors affect all types of MRI modalities, but different modalities can also be affected by additional sources of variability. For example, in diffusion MRI, which is sensitive to white matter (WM) microstructure, the angular resolution of the data and the protocol b-values (diffusion weightings) also affect the images and the downstream measures extracted from processing pipelines. Table 1.1 summarizes a selection of studies that have evaluated parameter changes in diffusion MRI studies.

Table 1.1. Effects of differences in scan parameters on diffusion metrics have been assessed in several publications.
(KEY: FA = fractional anisotropy, MD = mean diffusivity) (A. H. Zhu et al. 2019)

Large biobanks, such as the UK Biobank (UKB), are now being established to overcome the problem of small sample sizes. In UKB, nearly 100,000 individuals over age 45 from the British population are being scanned (Miller et al. 2016). This unique sample provides sufficient statistical power and new opportunities to identify factors that affect brain structure or function, for example single genetic variants, which typically explain around 0.5–1% of the population variance in a trait such as an MRI-derived volume measure (Hibar et al. 2015). To accomplish this goal, the UKB has four dedicated scanners of identical make and model, all running identical scan protocols to minimize any effects from scanner differences.

However, gathering hundreds or thousands of imaging data samples, from a single scanner or from a limited number of scanners with a dedicated protocol, is often not feasible due to cost and time constraints. More common now is the acquisition of large, multi-site datasets, either as population studies or as targeted disease studies. In the United States, the Adolescent Brain Cognitive Development (ABCD) study is an ongoing population study designed to track the development of over 10,000 children (aged 9-10 years at baseline) longitudinally (“The Conception of the ABCD Study: From Substance Use to a Broad NIH Collaboration” 2018). Multi-site initiatives such as the North American Alzheimer’s Disease Neuroimaging Initiative (ADNI) (Jack et al. 2015; Weiner et al. 2015) and the Parkinson’s Progression Markers Initiative (PPMI) (Parkinson Progression Marker Initiative 2011) scan participants around North America or around the world, boosting the power to address specific questions about neurodegeneration. These prospective multi-site studies made an effort to create a consistent protocol for some aspects of image collection in advance.
Retrospective multi-site studies can also take advantage of smaller previously conducted studies, as the working groups of the Enhancing Neuro Imaging Genetics through Meta-Analysis (ENIGMA) consortium do (Paul M. Thompson et al. 2014). Combining existing datasets requires more careful consideration of the study designs and can benefit from retrospective methods of harmonization.

Harmonization, or standardization of scientific data to enable comparisons, is vital to improve the interpretability of research findings, enabling researchers to aggregate and compare datasets and pool evidence from different sources, increasing power. Comparisons of datasets are crucial to enable us to test the reproducibility of research findings, and allow us to be more confident in the generalizability (or otherwise) of findings that have been made in single cohorts. A comparison of data across cohorts can also help identify cohort-specific effects that may not be found in other settings. In short, whether the effects are consistent universally or found only in some settings, the generality and scope of the effects can be better identified by bringing data together from multiple sources, and data harmonization is crucial for doing this. The need to harmonize data in order to test for reproducibility has now become vital and more universally recognized.

Depending on the study design and the data to be harmonized, different approaches to harmonization are available. The methods for combining data across multiple sites fall into two broad categories: meta-analysis, where statistical inferences are pooled across datasets, and mega-analysis, where individual-level data are combined before statistical inferences are made. Recent efforts in consortia, such as ENIGMA (Paul M. Thompson et al. 2014, 2017), have introduced coordinated meta-analysis to the field of neuroimaging.
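To make the distinction between the two categories concrete, the following is a minimal sketch on simulated data, not the procedure used by any particular consortium. It assumes a fixed-effect, inverse-variance weighted meta-analysis and a mega-analysis with one intercept per site; the function names (`ols_slope`, `fixed_effect_meta`, `mega_analysis`) and the simulated site offsets are hypothetical and chosen only for illustration.

```python
import numpy as np

def ols_slope(x, y):
    """Slope and its standard error from the regression y ~ 1 + x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])

def fixed_effect_meta(betas, ses):
    """Meta-analysis: pool per-site estimates with inverse-variance weights."""
    betas, ses = np.asarray(betas), np.asarray(ses)
    w = 1.0 / ses**2
    return np.sum(w * betas) / np.sum(w), np.sqrt(1.0 / np.sum(w))

def mega_analysis(xs, ys):
    """Mega-analysis: pool subject-level data, one intercept per site."""
    x, y = np.concatenate(xs), np.concatenate(ys)
    site = np.concatenate([np.full(len(xi), i) for i, xi in enumerate(xs)])
    dummies = (site[:, None] == np.arange(len(xs))[None, :]).astype(float)
    beta, *_ = np.linalg.lstsq(np.column_stack([dummies, x]), y, rcond=None)
    return beta[-1]  # shared slope across sites

# Simulated example (assumption): three sites with different offsets and sizes.
rng = np.random.default_rng(0)
true_slope = -0.5
xs, ys = [], []
for offset, n in [(0.0, 80), (2.0, 120), (-1.0, 60)]:
    x = rng.normal(size=n)
    xs.append(x)
    ys.append(offset + true_slope * x + rng.normal(size=n))

site_fits = [ols_slope(x, y) for x, y in zip(xs, ys)]
site_ses = [se for _, se in site_fits]
meta_beta, meta_se = fixed_effect_meta([b for b, _ in site_fits], site_ses)
mega_beta = mega_analysis(xs, ys)
```

In this simplified setting both approaches estimate the same shared slope; they differ in whether subject-level data must leave each site, which is often the deciding factor in practice.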
Harmonized MRI processing pipelines are distributed to ensure similar analyses across multiple sites (https://enigma.ini.usc.edu/protocols/), and data at each site are fit with the same statistical model. Mega-analysis refers to the case where the individual subject-level data are available for harmonization. Studies of measures derived from structural MRI have found mega-analysis to be more powerful than meta-analysis (Boedhoe et al. 2018), but for diffusion imaging, Jahanshad and colleagues have found that meta-analysis consistently results in lower standard errors on the effects, and tighter confidence intervals (Jahanshad et al. 2017). While measures from structural MRI, such as regional volumes, represent physical quantities, measures from diffusion imaging are derived from mathematical models of diffusion that depend on the acquisition protocol.

When the individual data measures are available, as opposed to study summary statistics, individual-level measures may also be harmonized before statistical inferences are made. Fortin et al. (2017) evaluated several harmonization techniques for diffusion tensor imaging (DTI)-based measures and found that the ComBat method (Johnson, Li, and Rabinovic 2007) appeared to outperform others in reducing inter-site variability. ComBat was originally developed to correct for batch effects in gene expression assays, extending previous models (C. Li and Wong 2003) into an empirical Bayes framework. The ComBat method has informative parametric priors set via an empirical Bayes iterative updating scheme, leveraging conjugacy for efficient computation. The authors claim that such a fitting procedure makes the model robust to larger numbers of sites with smaller within-site sample sizes. This method calculates harmonization parameters for each site based on all available measures, while allowing for region by region transformations, which has been found to be important (Venkatraman et al. 2015).
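The idea of removing site effects while preserving covariate effects can be illustrated with a simplified location-and-scale adjustment for a single measure. This is a sketch only and is not ComBat: it omits the empirical Bayes step that pools information across regions, and the function names (`fit_site_model`, `location_scale_harmonize`) and the simulated fractional anisotropy values are assumptions made for the example.

```python
import numpy as np

def fit_site_model(y, site, covars):
    """Least-squares fit of y ~ per-site intercepts + covariates."""
    sites = np.unique(site)
    dummies = (site[:, None] == sites[None, :]).astype(float)
    X = np.column_stack([dummies, covars])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return sites, dummies, X, beta

def location_scale_harmonize(y, site, covars):
    """Remove per-site shifts and rescale per-site residual spread.

    Covariate effects (e.g. age) are estimated jointly and added back,
    so they are preserved rather than removed with the site effect.
    """
    sites, dummies, X, beta = fit_site_model(y, site, covars)
    n_sites = len(sites)
    covar_fit = covars @ beta[n_sites:]
    # Sample-size weighted average of the site intercepts.
    grand_mean = (dummies.sum(axis=0) / len(y)) @ beta[:n_sites]
    resid = y - X @ beta
    pooled_sd = resid.std()
    out = np.empty(len(y), dtype=float)
    for s in sites:
        m = site == s
        out[m] = grand_mean + covar_fit[m] + resid[m] * pooled_sd / resid[m].std()
    return out

# Simulated example (assumption): two sites, the second with an additive
# shift and noisier measurements, and a shared negative age effect.
rng = np.random.default_rng(1)
n = 300
age = rng.uniform(20, 80, size=2 * n)
site = np.repeat([0, 1], n)
noise_sd = np.where(site == 0, 0.02, 0.04)
fa = 0.55 - 0.002 * age + 0.05 * site + rng.normal(size=2 * n) * noise_sd
covars = age[:, None]
fa_h = location_scale_harmonize(fa, site, covars)

def site_gap(y):
    """Absolute difference between the two fitted site intercepts."""
    beta = fit_site_model(y, site, covars)[3]
    return abs(beta[1] - beta[0])

raw_gap, harmonized_gap = site_gap(fa), site_gap(fa_h)
age_slope_h = fit_site_model(fa_h, site, covars)[3][-1]
```

Because the site and age terms are fit in the same model, this kind of adjustment relies on the sites overlapping in age; when they do not, the two effects cannot be separated from the data alone.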
When pooling data across studies it is important to account for population differences between merged cohorts or sites. Cohort demographics, such as mean age and sex distribution, have also been shown to affect study results (Yeatman, Wandell, and Mezer 2014; Bava et al. 2011; Hasan et al. 2010; Herting et al. 2012; Yingying Wang et al. 2012). Harmonization methods, such as ComBat, attempt to eliminate the variability from each site’s scanner, while preserving the real sources of variation from the cohort demographics. This highlights the need for caution: when the demographic and site variables overlap, it may be difficult to disentangle the two. One potential solution is the establishment of a standard reference based on a large variety of data, such as lifespan reference curves. In structural MRI, statistical harmonization methods have led to the recent publications of lifespan reference curves for brain morphometry (Bethlehem et al. 2022; Ge et al. 2024; Rutherford et al. 2022; Bayer, Dinga, et al. 2022). When available, these reference curves may be useful in harmonizing data with limited demographic overlap between sites. However, large public datasets and multi-site designs may not be beneficial, or even possible, for all types of studies. In particular, for studies of rare disorders or exposures, there may be relatively few patients in a given geographic region, limiting the achievable sample size. For example, the UKB is a cohort that is somewhat healthier than the general British population (Fry et al. 2017), and it is limited in the age range assessed, the geographic reach, and the number of patients with neuropsychiatric conditions in the sample. Even after filtering large public datasets and combining the samples, the samples may be too small and diverse for a single harmonized analysis. In such cases, limited data may benefit from replication studies. 
This thesis is meant to address each of the mentioned methods for obtaining reproducible and robust results in neuroimaging studies: singularly large biobanks, combining multi-site datasets, and replication studies. In Chapter 2, we begin with the Biobank Data Parser. While necessary for robust findings, large datasets can prove difficult to use, requiring significant resources in computer storage and memory. We set up the Biobank Data Parser, using the UK Biobank as an example, to provide researchers with efficient ways to query, process, and visualize data. In Chapter 3, we show applications in large scale studies, covering a single study analysis of the effects of a genetic risk factor on microstructure in the UK Biobank, the development of site-invariant metrics in a multi-site study of corpus callosum morphometry, and a comparison of protocols that would allow for combining the UK Biobank diffusion data with other datasets processed by the ENIGMA-DTI pipeline. Chapter 4 introduces the concept of brain MRI templates in both spatial and temporal dimensions through tensor-based morphometry (TBM). The first part describes a multi-study analysis of the impact of Alzheimer’s biomarkers on the longitudinal trajectories of regional brain changes. This study establishes study-specific brain templates to account for protocol differences. The second study described in this chapter introduces a temporal component by establishing a 4D template that allows for age-matching and accounts for brain atrophy that occurs with aging. In Chapter 5, we shift to a scalar template, in this case a set of lifespan reference curves for microstructural measures in WM tracts. We use ten public datasets to develop the lifespan reference curves and create a framework for the lifespan reference curves to be incorporated into studies. Finally, in Chapter 6, we investigate the potential relationship between parental history of suicide and neurodevelopment in children. 
These studies, which use two different datasets, demonstrate the difficulties faced in replication studies. We end by describing the future directions meant to address them.

CHAPTER 2
The Biobank Data Parser

This section is adapted from:
Zhu, A. H., Shetty, A., Salminen, L. E., Thompson, P. M., Jahanshad, N. The Biobank Data Parser: A data parser with built in and customizable filters for large biobank studies. Under Preparation.

Large biobanks have recently emerged in concerted efforts to study the myriad of risk factors that contribute to health conditions. However, with large storage requirements for the numerous types of data, these datasets can be difficult to manage. One example is the UK Biobank, which has collected a large array of data (demographic, health, genetic, etc.) from approximately 500,000 subjects. To help users manage the large dataset, we developed the Biobank Data Parser, a Python package with a user-friendly interface to filter the large downloads and streamline usability. The command line interface (CLI) provides an easy way to set up a database server, which allows for a flexible Python API. For CLI users, the more manageable, filtered data can also be read into the researcher’s statistical software of choice. The CLI and Python API code can be shared between different applications or user groups for consistency, which may improve reproducibility of studies. We created this tool using data and metadata from the UK Biobank; however, the tool’s functionality can easily be expanded and generalized to any large dataset. The Biobank Data Parser is an open-source package and can be found at https://github.com/USC-IGC/ukbb_parser.

2.1 INTRODUCTION
The onset and progression of many health conditions are influenced by a combination of lifestyle, environmental, and genetic factors. Many individual epidemiological risk factors have been found to have small effect sizes (Figueiredo et al. 
2014), and obtaining the statistical power to untangle the complex and subtle interactions between risk factors would require datasets that are wide and deep, large in both scope of collected data types and sample size, respectively (Button et al. 2013; Marek et al. 2022; Medland et al. 2014). For example, studies have shown that tens of thousands of subjects would be required to detect genetic and behavior effects in a regionally unbiased full brain imaging study (Medland et al. 2014; Marek et al. 2022; Smith and Nichols 2018). As a result, concerted efforts have been made to systematically collect participant data into large structured biobanks, which contain a vast array of deeply characterized data from thousands to hundreds of thousands of participants. These efforts have been ongoing around the world with examples such as the UK Biobank, the All of Us Research Program in the United States, and the German Biobank Alliance. These large biobanks open the door to a myriad of research possibilities, but also come at the cost of local technical challenges, ranging from data storage and retrieval to data management. Biobanks collect thousands of data measures that can vary in value type (e.g., numerical, categorical, free response) and relate to each other in different ways (e.g., measures collected from the same organ vs. the same method, at the same or different times). These nuances may require researchers to spend a substantial amount of time to understand and effectively use a dataset. Additionally, large data downloads can be unwieldy for most commonly used software programs. For the UK Biobank, the initial spreadsheet of all participants alone without any bulk downloads can be more than 40 gigabytes. Learning the structure of the dataset can aid in both aspects. Programs that use the study metadata can more efficiently parse and load the dataset according to the user’s needs. 
Large biobanks draw the interest of thousands of researchers around the world and have been used in a comparable number of publications to date. These large datasets provide statistical power to detect small effect sizes in a great number of fields, but flexibility in subject classification criteria and data processing pipelines can influence results. Cohort characteristics such as sample size, inclusion and exclusion criteria, and definition of healthy control versus diagnostic groups are often cited as potential reasons for differences between study results. In population studies that do not recruit along such strict guidelines, these definitions are then determined on a user-by-user basis. For example, hypertension can be assessed using medical records, self-report responses, medications, automated readings by a blood pressure device, or manual measurements using a sphygmomanometer. When available, researchers may choose to use one, a subset, or all of these to determine if participants are hypertensive for their analysis. Differences in such decisions could impact study results and replicability and could be mitigated by consistent data management and code sharing. The UK Biobank (UKB) is a deeply phenotyped resource that makes metadata related to the data fields collected and their subject level coding freely available. Initial data collection took place between 2006 and 2010 in over 500,000 participants aged 40-69 years in the United Kingdom, spanning a wide scope of fields, including demographics, biological samples, health records, and cognitive tests (Sudlow et al. 2015). In addition to in-person follow-up visits, subsequent data collection has also included online questionnaires and cognitive tests. Since 2014, an effort to collect MRI scans has been ongoing. To date, approximately 80,000 participants have been invited to undergo scanning in a follow-up visit (Miller et al. 2016; Petersen et al. 2016; Littlejohns et al. 2020). 
Currently (as of 2024), there are over 8,500 data-fields provided from over 200 categories, and the dataset is still growing. Tools that use study metadata and provide a user-friendly interface would help facilitate standardized documentation of methods. Using the UK Biobank as an example, we created the Biobank Data Parser, a Python package equipped with command line and Python interfaces for data filtering, preprocessing, and visualization. We have implemented a SQLite back-end for efficient data querying that accounts for the many data encoding schemes and allows for data visualization and exploratory analyses on the fly. The code used in studies can then be shared with publications to promote replicability. While initially written for the UK Biobank, the framework is written to be flexible and easily adapted to other large datasets. The Biobank Data Parser is freely available as an open source Python package (https://github.com/USC-IGC/ukbb_parser).

2.2 METHODS: DEVELOPING THE BACK-END
The UK Biobank Data Parser is built around three main components: UKB metadata, a SQLite back-end, and the Python framework in which both are embedded. UKB metadata is publicly available through the UKB Data Showcase (https://biobank.ndph.ox.ac.uk/ukb/). The metadata is provided in a flat data structure, but the data-fields are organized into categories in a hierarchical tree-like structure. Using Python, we map the metadata into a nested dictionary with each category having subcategory and data-field items and save it as a JSON file. Metadata for the data-fields, including the JSON file, and select data-field encoding schemes are stored in the back-end and deployed with the ukbb_parser package.

Figure 2.1. Schematic displaying the workflow and framework of the UK Biobank Data Parser. Metadata and data from the UKB are combined in a SQLite framework that serves as the back-end for the Python user interface. 
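The mapping of flat metadata into a nested dictionary described in Section 2.2 can be sketched as follows. This is a minimal illustration and not the ukbb_parser implementation: the record layout, the key names (category_id, parent_id, field_id), and all IDs and titles are hypothetical placeholders rather than real UKB Data Showcase entries.

```python
import json

# Hypothetical flat metadata rows (parent_id 0 marks a top-level category).
# IDs and titles are placeholders, not real UKB Data Showcase entries.
categories = [
    {"category_id": 1, "parent_id": 0, "title": "Imaging"},
    {"category_id": 2, "parent_id": 1, "title": "Brain MRI"},
    {"category_id": 3, "parent_id": 2, "title": "Diffusion MRI"},
]
fields = [
    {"field_id": 101, "category_id": 2, "title": "Example brain volume"},
    {"field_id": 102, "category_id": 3, "title": "Example tract measure"},
]


def build_tree(categories, fields):
    """Nest flat category and data-field records into one dictionary."""
    nodes = {
        c["category_id"]: {"title": c["title"], "subcategories": {}, "data_fields": {}}
        for c in categories
    }
    tree = {}
    for c in categories:
        node = nodes[c["category_id"]]
        parent = nodes.get(c["parent_id"])
        if parent is None:
            # No known parent: treat as a top-level category.
            tree[str(c["category_id"])] = node
        else:
            parent["subcategories"][str(c["category_id"])] = node
    for f in fields:
        nodes[f["category_id"]]["data_fields"][str(f["field_id"])] = f["title"]
    return tree


def fields_under(node):
    """Collect data-field IDs in a category and in every subcategory below it."""
    found = list(node["data_fields"])
    for child in node["subcategories"].values():
        found.extend(fields_under(child))
    return found


tree = build_tree(categories, fields)
tree_json = json.dumps(tree, indent=2)  # text that could be saved as a JSON file
```

With a structure like this, requesting a category can be resolved to all data-fields at every level beneath it by one recursive walk, e.g. fields_under(tree["1"]).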
The UKB data downloads are made available in a flat data structure (e.g., comma-separated values (CSV) file). The hierarchically represented metadata in the JSON file can be used to set up a relational database management system (RDBMS) for efficient data handling. We chose to use SQLite as it does not require a database administrator or Docker program to set up a database server. In Python, we use the ‘polars’ package to transform the flat data file into a SQLite database. Each category is made into its own respective table with the relationship between tables, i.e., categories, maintained (Figure 2.1). Python functions can then use both data and metadata to provide real-time user functionality for data pre-processing and visualization (further detailed in the “Results: User Interfaces” section).

Health conditions
The UK Biobank has a number of hierarchical data coding systems, including those used for diagnostic data. Medical records of the UKB participants were obtained from the UK’s centralized healthcare system, which uses the International Classification of Diseases (ICD). The majority are coded with the tenth ICD revision (ICD-10) codes with older diagnoses coded as ICD-9. At the highest level, the ICD system is organized into chapters, each reflecting a class of diseases, with subsequent levels becoming more specific. As with the data download, the user does not see the hierarchical meta-organization in the spreadsheet, only the final level diagnoses. In addition, one participant diagnosis is listed per column in alphanumeric order. Therefore any researcher wishing to include or exclude a specific diagnosis or set of diagnoses must search across the entire set of ICD columns, which consists of multiple arrayed data-fields. We use SQLite to query for the relevant data-fields and then Python to retrieve the relevant ICD codes from the metadata, one of the data encoding schemes stored in the ukbb_parser back-end. 
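The need to search every arrayed diagnosis column for a code, at any level of the hierarchy, can be illustrated with a small standard-library sketch. It is not the package's code (which uses SQLite and pandas): the data-field ID "90001" is a placeholder, the rows are invented, and writing ICD-10 codes without a decimal point (e.g., G301) is an assumption made for this example.

```python
# Hypothetical participant rows keyed by <data-field>-<instance>.<array> columns.
# "90001" is a placeholder data-field ID, not a real UKB field.
rows = [
    {"eid": "1", "90001-0.0": "G300", "90001-0.1": "I10", "90001-0.2": ""},
    {"eid": "2", "90001-0.0": "I10", "90001-0.1": "", "90001-0.2": ""},
    {"eid": "3", "90001-0.0": "E119", "90001-0.1": "G301", "90001-0.2": ""},
]


def array_columns(row, field_id):
    """Return every arrayed column that belongs to one data-field."""
    return [name for name in row if name.startswith(f"{field_id}-")]


def has_code(row, field_id, prefixes):
    """True if any arrayed column holds a code starting with one of the prefixes."""
    return any(
        row[col].startswith(tuple(prefixes))
        for col in array_columns(row, field_id)
        if row[col]  # skip empty array slots
    )


# Inclusion at a specific level (G30) and exclusion of a whole chapter (G).
with_g30 = [r["eid"] for r in rows if has_code(r, "90001", ["G30"])]
without_g = [r["eid"] for r in rows if not has_code(r, "90001", ["G"])]
```

Prefix matching works here only because the hierarchy is reflected in the code strings themselves; for encodings where it is not (such as the self-report conditions), the parent-child links would have to come from the metadata.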
Python pandas is then used to filter or inventory the participant data for the requested diagnostic codes (Figure 2.2). In addition to medical records, study participants are also asked to report health conditions. Self-report responses are not provided as ICD-10 codes but are also hierarchically coded. Metadata for the self-report medical conditions is also hosted in the back-end of the ukbb_parser, which allows for the same level of functionality. This functionality can be expanded to any hierarchically coded data, including non-medical data, such as job codes (data-field 22617).

Figure 2.2. ICD-10 codes are hierarchically organized. The various levels encapsulating late onset Alzheimer’s disease are displayed. Using metadata provided by the UKB, the ukbb_parser can filter or inventory medical conditions at any level. The flags to be used when running the ukbb_parser CLI to inventory the specified diagnoses are provided.

Timeline filtering and matching
As the UK Biobank has acquired data across a long period of time, the timing of both cross-sectional and longitudinal data needs to be accounted for. Data acquired during in-person visits to UKB centers follows a timeline known as Instancing 2. Of the initial 500,000 subjects, a subset have been invited back for an imaging visit and a smaller subset for a second imaging visit. For researchers interested in the imaged cohort, the ukbb_parser has built-in functionality to filter both by participants with a scan visit and in-person data that was acquired during the imaging visits. Questionnaires and measurements taken at home or online were completed on different timelines indicated by various “Instance IDs”. 
As the time interval between the different types of data collection may vary at the individual level, the data-fields reflecting the data collection dates are programmed into the ukbb_parser, which allows for a function that checks the timeline-associated dates to match data from across timelines given a specified time threshold (Figure 2.3).

Figure 2.3. UKB study timeline. In UKB study parlance, “Instancing” is the term used to indicate a particular timeline of data collection. Within each Instancing timeline, the instances follow zero-based indexing. Example code for the Python API demonstrates how to match data between brain volume (Instancing 2) and fluid intelligence assessed online (Instancing 178).

2.3 RESULTS: USER INTERFACES AND PROVIDED FUNCTIONS
The ukbb_parser Python package is built with functions for data filtering, preprocessing, and visualization, providing a uniquely comprehensive and user-friendly tool (Table 2.1).

Table 2.1. A brief summary of other existing tools for managing UKB data.

| Name | Language | User Interface | Data Filtering | Data Processing | Data Visualization | URL |
| --- | --- | --- | --- | --- | --- | --- |
| Biobank Read | Python | CLI | Only includes functionality for Instancing 2 timeline; needs explicit data-field and code inputs (does not incorporate hierarchical metadata) | Some handling of missing data | Correlation matrix plots (does not detect and stop when categorical data-field requested) | https://github.com/saphir746/BiobankReadBash |
| FUNPACK | Python | CLI | Incorporates metadata | Handles missing data; categorical recoding | - | https://git.fmrib.ox.ac.uk/fsl/funpack/ |
| ukbiobank-loaders | Python, Apache Parquet | CLI, Python API | Focused on medical records | - | - | https://github.com/BenevolentAI/ukbiobank-loaders |
| ukbREST | PostgreSQL, Docker | CLI | Includes handling of genetics data; needs explicit data-field and code inputs (does not incorporate hierarchical metadata) | Categorical recoding | - | https://github.com/hakyimlab/ukbrest |
| UKBBtabular-processing | bash, Python | CLI, Python API | Data extraction; incorporates metadata | - | - | https://github.com/CoBrALab/UKBBtabular-processing |

The Biobank Data Parser comes with two types of user interfaces: the Python-based command line interface (CLI) and the Python application programming interface (API). There is some shared functionality between the two, though each has its own unique capabilities. To set up the SQLite database, which is required to use the Python API, the ukbb_parseDB csv2db command must be run. On an Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz system running CentOS7 with 128 GB of memory, converting a 40 GB flat-structured CSV file into a SQLite database took approximately three hours. Setting up the SQLite database is not necessary to use the CLI data preprocessing functions, though it will decrease processing time in the future. The functions available in each user interface are explained in further detail below.

COMMAND LINE INTERFACE
The CLIs are set up with flags to filter and inventory the data, outputting CSV files for those who do not run statistical analyses in Python (Figure 2.4). For those who do not wish to set up a SQLite database, the ukbb_parser command operates solely on the downloaded text file. The ukbb_parseDB CLI has the same set of functions. The CLIs are designed to minimize the output files to improve readability in other programming languages. Therefore, the functions are written such that only data requested by the user are included in the output. 
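The one-time conversion of the flat download into a SQLite database with one table per category can be sketched with the Python standard library. This is an illustration under stated assumptions, not what ukbb_parseDB csv2db does internally (the package reads the file with polars): the column IDs, category names, and category-to-column mapping below are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical flat download: one row per participant, one column per data-field.
flat_csv = io.StringIO(
    "eid,1001-0.0,2001-2.0,2002-2.0\n"
    "1,0,1200.5,0.45\n"
    "2,1,,\n"
    "3,1,1100.0,0.50\n"
)

# Hypothetical mapping from category to the columns it owns (taken from metadata).
category_columns = {
    "demographics": ["1001-0.0"],
    "imaging": ["2001-2.0", "2002-2.0"],
}


def csv_to_sqlite(handle, category_columns, conn):
    """Write one table per category, each keyed by the participant ID (eid)."""
    for table, cols in category_columns.items():
        col_defs = ", ".join(f'"{c}" TEXT' for c in cols)
        conn.execute(f'CREATE TABLE "{table}" (eid TEXT PRIMARY KEY, {col_defs})')
    # Stream the file row by row so memory use does not grow with file size.
    for row in csv.DictReader(handle):
        for table, cols in category_columns.items():
            placeholders = ", ".join("?" for _ in range(len(cols) + 1))
            conn.execute(
                f'INSERT INTO "{table}" VALUES ({placeholders})',
                [row["eid"]] + [row[c] or None for c in cols],  # empty -> NULL
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
csv_to_sqlite(flat_csv, category_columns, conn)

# Tables stay linked through eid, so one query can join across categories.
joined = conn.execute(
    'SELECT d.eid, d."1001-0.0", i."2001-2.0" '
    "FROM demographics AS d JOIN imaging AS i ON d.eid = i.eid "
    'WHERE i."2001-2.0" IS NOT NULL ORDER BY d.eid'
).fetchall()
```

Column names of the form <data-field>-<instance>.<array> contain characters that SQLite does not accept in bare identifiers, which is why they are double-quoted throughout the sketch.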
Parse
The parse subcommand is equipped with flags that allow for filtering the UKB data in both the subject (row) and data-field (column) dimensions. To filter by columns, inclusion criteria can be specified using either categories (--incat) or data-field codes (--inhdr). If categories are given, the tool queries the study metadata for all data-fields in the specified category and every subsequent level of subcategories. To exclude subcategories or data-fields within the requested categories, the --excat or --exhdr flags can be used, respectively. Any number and combination of the aforementioned flags can be provided at once. Subject filtering can be based on diagnoses and/or visit-available subsets. Inclusion or exclusion of ICD-10 (--incon/--excon) or self-report (non-cancer) conditions (--insr/--exsr) can be used to filter on diagnostic conditions. The ICD-10 inputs are allowed to be given in simplified forms, e.g., --excon G will exclude all subjects that have any diagnosis codes beginning with “G” (Chapter VI Diseases of the nervous system). Additional flags allow for subject subsets to be specified in different ways, such as those who have completed a neuroimaging visit (--img_subs_only). As with the column filters, these flags can be used in any combination. Additional flags are detailed in the subcommand’s help menu. The parse command requires users to provide an output prefix which is applied to the names of the two output files: a CSV file and an HTML header key. The CSV file contains the filtered data and by default keeps the column names of the original download, formatted as <data-field>-<instance>.<array>. As the data-field numbers are not intuitively matched to the meaning of the data they reflect, the HTML header key contains information about the columns, including data-field, encoding, and instance coding hyperlinks where available. Users can also request more intuitive column names based on the data-field descriptions by calling --long_names. 
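Researchers who read the output CSV into other software may want to split the default column names into their parts. A minimal sketch of handling the <data-field>-<instance>.<array> format follows; the helper names and the example header are hypothetical and are not part of the ukbb_parser API.

```python
import re

# Matches names such as "2001-2.1": data-field 2001, instance 2, array index 1.
COLUMN_PATTERN = re.compile(r"^(?P<field>\d+)-(?P<instance>\d+)\.(?P<array>\d+)$")


def split_column(name):
    """Split '<data-field>-<instance>.<array>' into integers, or return None."""
    match = COLUMN_PATTERN.match(name)
    if match is None:
        return None  # e.g. the participant ID column
    return {key: int(value) for key, value in match.groupdict().items()}


def select_columns(header, field_ids, instance=None):
    """Keep columns for the requested data-fields, optionally for one instance."""
    keep = []
    for name in header:
        parts = split_column(name)
        if parts is None or parts["field"] not in field_ids:
            continue
        if instance is None or parts["instance"] == instance:
            keep.append(name)
    return keep


# Hypothetical header; the data-field IDs are placeholders.
header = ["eid", "1001-0.0", "1001-2.0", "2001-2.0", "2001-2.1", "3001-0.0"]
imaging_visit_columns = select_columns(header, {1001, 2001}, instance=2)
```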
Inventory
The inventory subcommand allows the user to simplify multi-array data-fields with hierarchical encoding schemes. Currently, there are three data types available for inventory: ICD-10 conditions, self-report (non-cancer) conditions, and careers. The aim of inventory is to create binary columns that indicate if subjects are associated with user-specified codes. The user must indicate which level of the hierarchical encoding scheme to query (Figure 2.2). To avoid overly complicated command calls, only one data type can be inventoried at a time, but any number of levels and codes within that data type can be specified at the same time. By default, a single summary column will be output for each requested higher level code. However, a column per selectable code within the higher level code can be requested (--all_codes). Processing time will be commensurate to the number of output codes/columns being requested.

PYTHON API
Once the SQLite database has been set up, the data processing functions available through the CLI (parse and inventory) are also available through the Python API (Figure 2.4). The outputs are pandas DataFrame objects rather than files, and as such they inherit pandas’ wide range of functionality. These output DataFrames can also be input to additional ukbb_parser functions, such as time matching and data visualization, or to statistical packages such as statsmodels or scikit-learn. The Python API for the ukbb_parser provides additional functions that take advantage of the UKB study metadata. To address the various data collection timelines, the time_match function uses Instancing information of different data types to find subjects whose data acquisitions occurred within a user-requested time frame (Figure 2.3). The function will find the Instancing associated with the data-fields and attempt to identify the corresponding dates of acquisition. 
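The idea behind matching data across timelines can be shown with a short standard-library sketch. It is not the time_match function or its signature: the function name match_within, the participant IDs, and the dates are all hypothetical, and the real function resolves the acquisition dates from the study metadata rather than receiving them directly.

```python
from datetime import date

# Hypothetical acquisition dates per participant for two data types
# collected on different timelines (e.g. a scan visit and an online test).
scan_dates = {"1": date(2016, 3, 1), "2": date(2017, 6, 15), "3": date(2018, 1, 10)}
online_test_dates = {"1": date(2016, 5, 20), "2": date(2019, 9, 1), "4": date(2018, 2, 1)}


def match_within(dates_a, dates_b, max_days):
    """Return participants with both dates recorded no more than max_days apart."""
    matched = {}
    for eid in sorted(dates_a.keys() & dates_b.keys()):
        gap = abs((dates_a[eid] - dates_b[eid]).days)
        if gap <= max_days:
            matched[eid] = gap  # keep the gap so it can be used as a covariate
    return matched


matched = match_within(scan_dates, online_test_dates, max_days=365)
```

Participants missing either date are dropped, as are those whose two acquisitions fall further apart than the threshold.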
For data that is not associated with an Instancing timeline, users can also provide a data-field with a Date value type.

Figure 2.4. The Biobank Data Parser has command line and Python interfaces for users, depending on their preference. Equivalent calls for the parse and inventory functions are displayed for comparison.

As the Python API allows for quick data handling, on-the-fly visualization and data exploration are also possible. Visualization tools are available through the ukbviz module, which outputs a dynamic plotly object that allows users to zoom and scroll. The ukbb_parser provides different visualization methods depending on whether the requested data are continuous or discrete (Figure 2.5). For categorical and ordinal data, the metadata is queried for the encoding scheme, and if possible the provided numerical values are converted to their meanings for display as axis tick labels (e.g., Sex codes 0 and 1 to Female and Male, respectively). Due to the vast number of encoding schemes, not every schema is stored within the Biobank Data Parser back-end. If an internet connection is available, the tool will query the UKB Data Showcase for the appropriate code-to-meaning mapping.

Figure 2.5. The available visualization tools include bar plots for categorical data and scatter plots for continuous data.

2.4 DISCUSSION
We created the Biobank Data Parser, a versatile Python package that aids in data loading, preprocessing, and visualization of a very large population dataset. The SQLite back-end, paired with both command line and Python interfaces, provides efficient and user-friendly querying. The incorporation of study metadata into multiple functions simplifies the complexities of the dataset’s different data encodings and acquisition timelines for the user. The code and its documentation are publicly available at https://github.com/USC-IGC/ukbb_parser. The Python package is written to handle datasets of any size on any system. 
Reading in the initial, large UKB download is the most computationally and time intensive step. The Python polars package is used to read in the downloaded CSV file and manages the large dataset according to the available system memory. The time required to load the dataset is inversely related to the amount of available memory. However, once the CSV file is converted into a SQLite database, a one-time cost, any subsequent data querying or filtering is significantly faster, allowing for real-time use. Of the other existing software packages available for handling the UKB dataset, only ukbREST sets up a database, using PostgreSQL. The ukbREST tool provides data filtering functions but does not incorporate study metadata and therefore requires explicit inputs detailing individual data-fields and codes. Overall, few packages come with data pre-processing tools, though ukbREST and FUNPACK use metadata to allow for the categorical recoding of numerical values to their meaningful counterparts. With regards to data visualization and exploration capabilities, Biobank Read can produce a correlation matrix between columns, but it does not check the data-field metadata to ensure that such correlations would be meaningful, i.e., that the value types are not binary or categorical in nature. Code written using the Biobank Data Parser is meant to be easily shareable in an effort to increase the reproducibility of studies, such as those using the UKB dataset. Currently, some research papers denote the specific data-fields that are used in their study. While this is a good first step, we wish to take it further with the explicit delineation of inclusion and exclusion criteria and processing of study variables. However, we stress that this tool should not be used to identify subject key mappings between different applications. Any linking of subject identifiers across applications should be done through the UK Biobank. 
The infrastructure for the ukbb_parser package can easily be applied to any other large biobank dataset. It was built around the well-documented and easily accessible UKB study metadata. Application to other datasets can be achieved by either formatting the incoming study metadata similarly to that of the UK Biobank or editing the functions that handle the data-field metadata. The package code is written modularly, such that editing data-field functions would not require re-factoring the overall infrastructure. The Biobank Data Parser is designed to be flexible, so that it can evolve as the datasets evolve. In the time since the package was first written, the UK Biobank has added new data-fields, deprecated existing ones, and changed the behavior of others. The ukbb_parser has been adapted accordingly. However, as biobank data types and collection timelines can be extremely varied, it is possible that we did not account for all behaviors. For any unaccounted for behaviors, users will be able to post issues to our GitHub repository, which we can then address. With increased use, all dataset nuances can be accounted for. As part of future releases, we also plan to include scripts for other programming languages or other types of data, such as genetics. Once the SQLite database has been set up, any number of programming languages would be able to interface with it. We plan to start with another popular programming language for statistics, R. Preliminary scripts for the use of the SQLite database through R can also be found on our GitHub page in the ukbb_parser’s sister repo ukbb_parse_R. To deal with new data types such as genetics, the existing package infrastructure would need to be expanded. Currently, the ukbb_parser is only designed to handle phenotypic data provided in the UKB spreadsheet download. With the provided study metadata, setting up the SQLite back-end and user interfaces does not require domain knowledge of all the phenotypes. 
However, handling genetics data in particular would need careful curation and expertise. We plan to call upon collaborators with domain-specific expertise to ensure that the data and related functions are incorporated properly.

2.5 CONCLUSION
The Biobank Data Parser provides user-friendly interfaces to filter UKB downloads, reduce the size of the dataset, and improve ease of usability. For CLI users, the output spreadsheets can then be used for analyses using the researcher’s statistical software of choice. The CLI also provides an easy way to set up a database server, which allows for a flexible Python API. The CLI and Python API code can be shared between different applications or user groups for consistency, which may improve reproducibility of studies. We created this tool using data and metadata from the UK Biobank; however, the tool’s functionality can easily be expanded and generalized to any large dataset. The Biobank Data Parser is an open-source package and can be found at https://github.com/USC-IGC/ukbb_parser.

2.6 ACKNOWLEDGMENTS
This package is developed using the UK Biobank Resource under Application Number 11559 and funding from NIH grants P41 EB015922 and S10 OD032285.

CHAPTER 3
From Large Scale Single Studies to Large Scale Multisite Studies

3.1 Cross-sectional and longitudinal ApoE2 and ApoE4 associations with regional QSM and diffusion MRI in the UK Biobank

This section is adapted from:
Nir, T. M.*, Zhu, A. H.*, Ba Gari, I., Dixon, D., Islam, T., Villalon-Reina, J. E., ... & Jahanshad, N. (2021). Effects of ApoE4 and ApoE2 genotypes on subcortical magnetic susceptibility and microstructure in 27,535 participants from the UK Biobank. In PACIFIC SYMPOSIUM ON BIOCOMPUTING 2022 (pp. 121-132).
Zhu, A.H.*, Nir, T.M.*, Ba Gari, I., Dixon, D., Islam, T., Villalon-Reina, J.E., Salminen, L.E., Thompson, P.M., Jahanshad, N. “ApoE2 and ApoE4 associations with regional QSM and diffusion MRI in the UK Biobank”. 
(2022) Organization for Human Brain Mapping. Glasgow, Scotland, UK, June 19-23, 2022.
Zhu, A.H.*, Nir, T.M.*, Ba Gari, I., Dixon, D., Islam, T., Villalon-Reina, J.E., Haddad, E., Thompson, P.M., Jahanshad, N. “APOE4 genotype associations with longitudinal change in hippocampal microstructure”. (2022) Alzheimer's Association International Conference. San Diego, CA, USA, June 19-23, 2022.

Disrupted iron homeostasis is associated with several neurodegenerative diseases, including Alzheimer’s disease (AD), and may be partially modulated by genetic risk factors. Here we evaluated whether subcortical iron deposition is associated with ApoE genotype, which substantially affects risk for late-onset AD. We evaluated differences in subcortical quantitative susceptibility mapping (QSM), a type of MRI sensitive to cerebral iron deposition, between either ApoE4 (E3E4+E4E4) or ApoE2 (E2E3+E2E2) carriers and E3 homozygotes (E3E3) in 27,535 participants from the UK Biobank (age: 45-82 years). We found that ApoE4 carriers had higher hippocampal (d=0.036; p=0.012) and amygdalar (d=0.035; p=0.013) magnetic susceptibility, particularly individuals aged 65 years or older, while those carrying ApoE2 (which protects against AD) had higher QSM only in the hippocampus (d=0.05; p=0.006), particularly those under age 65. Secondary diffusion MRI microstructural associations in these regions revealed greater diffusivity and less diffusion restriction in E4 carriers; however, no differences were detected in E2 carriers. Disease risk conferred by ApoE4 may be linked with higher subcortical iron burden in conjunction with inflammation or neuronal loss in aging individuals, while ApoE2 associations may not necessarily reflect unhealthy iron deposits earlier in life. 
3.1.1 INTRODUCTION

Iron is essential for normal brain function; however, disrupted iron homeostasis can lead to brain iron accumulation, which is associated with neurodegenerative diseases including Parkinson’s and Alzheimer’s disease (AD) (Raven et al. 2013). Excess iron induces oxidative stress, and accompanies neuritic plaques and neurofibrillary tangles, the pathological hallmarks of AD (Ward et al. 2014). Plasma levels of iron-regulating proteins, including ferritin and transferrin, are highly heritable, with a large portion of their heritability explained by variants in the HFE and TF genes (Benyamin et al. 2009); variants in these genes have been associated specifically with iron overload (Bell et al. 2021).

Brain iron concentration and other microstructural properties may be detected in vivo using metrics derived from advanced MRI techniques, including susceptibility- and diffusion-weighted MRI sequences (SWI and dMRI, respectively). These metrics are also sensitive to genetic variants associated with iron homeostasis (Elliott et al. 2018; Jahanshad, Kohannim, Hibar, et al. 2012). Quantitative susceptibility mapping (QSM) derived from SWI is sensitive to paramagnetic iron concentrations, a predominant source of magnetic susceptibility variation in brain tissue. QSM has been used to study susceptibility profiles of iron-rich subcortical structures in both normal aging (Yuyao Zhang et al. 2018) and neurodegenerative diseases (Ravanfar et al. 2021). Tissue magnetic susceptibility is also influenced by factors such as calcium and myelin concentrations. While there is less myelin in subcortical gray matter structures than in white matter, myelin is diamagnetic and counteracts iron effects on QSM, thus confounding QSM interpretations. Brain microstructure can also be evaluated with diffusion MRI (dMRI), which may help disentangle the opposing effects of iron and myelin on QSM.
Single-shell diffusion tensor imaging (DTI) is the most widely used method to analyze dMRI datasets, but it cannot differentiate crossing fibers and captures partial volumes from different tissue compartments. This may reduce the sensitivity of DTI measures, in particular for evaluating complex gray matter micro-architecture. Multi-shell neurite orientation dispersion and density imaging (NODDI) is a three-compartment biophysical model that attempts to resolve signal contributions from various tissue compartments, including the restricted intracellular and ‘free water’, or isotropic, extracellular compartments (H. Zhang et al. 2012). Such models may offer insight into subcortical microstructural properties and associations between QSM and genetic risk factors.

The best-recognized and strongest common genetic risk factor for sporadic AD is a haplotype in the APOE gene. ApoE plays a key role in cerebral cholesterol metabolism and β-amyloid clearance. Growing evidence suggests that AD risk conferred by the ApoE4 genotype (carried by around 25% of individuals, depending on ancestry) is partially linked with cerebral iron burden. For example, both higher magnetic susceptibility and its association with higher amyloid plaque load have been reported in ApoE4 carriers with mild cognitive impairment (van Bergen et al. 2016). ApoE2, in contrast, has a proposed neurotrophic or neuroprotective effect (Z. Li et al. 2020). However, as ApoE2 is less prevalent in the population (found in around 10% of people), few studies have been sufficiently well-powered to evaluate whether ApoE2 has an effect on brain structure or function (Grothe et al. 2017); none, to our knowledge, have investigated associations with QSM.
In this study, we evaluated associations between both ApoE4 and ApoE2 genotypes and in vivo QSM and dMRI measures of subcortical iron concentrations and microstructure in a large, population-based sample of older adults (age range: 45-82 years) from the UK Biobank (Miller et al. 2016).

3.1.2 METHODS

3.1.2.1 UK Biobank Participants

The UK Biobank (UKBB) is a publicly available dataset of over 500,000 community-based middle-aged and older adults residing in the United Kingdom (Miller et al. 2016). At the time of this analysis, nearly 43,000 of these individuals had undergone brain MRI on a 3T Siemens Skyra. SWI, dMRI, T2-weighted FLAIR, and T1-weighted (T1w) MRI were downloaded from the UKBB database through application 11559; available MRI acquisition protocols and preprocessing are detailed in Miller et al. (2016). In total, 40,069 participants had SWI and T1w images, 35,600 of whom also had dMRI. MRI scans that failed preprocessing or did not pass quality control (QC), as described for each modality below, were removed from analyses. Participants whose genetic ethnic group was not of European ancestry, who were missing demographic or genetic data, or whose available ApoE genotype was not E2E2, E2E3, E3E3, E3E4, or E4E4, were also excluded - leaving 27,535 participants with QSM and 27,313 participants with dMRI (Figure 3.1.1; Table 3.1.1).

Figure 3.1.1. Processing stream with remaining number of participants after each step.

3.1.2.2 T1-Weighted MRI

T1w 3D volumetric MPRAGE DICOMs were downloaded for processing (1x1x1 mm³ voxels). FreeSurfer version 7.1 was used to obtain segmentations for seven subcortical regions of interest (ROIs): the thalamus, caudate, putamen, pallidum, hippocampus, amygdala, and nucleus accumbens. Participants with a subcortical volume laterality index (LI=(L-R)/(L+R)) greater than five standard deviations from the mean were flagged for visual inspection and possible removal from the analysis.
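The laterality-index screen described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names and the toy volumes are hypothetical, and we assume that "greater than five standard deviations from the mean" refers to the absolute deviation of a participant's LI from the sample mean LI for a given region.

```python
from statistics import mean, stdev

def laterality_index(left, right):
    """LI = (L - R) / (L + R) for one region in one participant."""
    return (left - right) / (left + right)

def flag_li_outliers(volumes, n_sd=5.0):
    """Return sorted IDs whose LI lies more than n_sd SDs from the sample mean.

    volumes: dict mapping participant ID -> (left_volume, right_volume).
    The cutoff of five SDs follows the text; using the sample SD is an assumption.
    """
    li = {pid: laterality_index(l, r) for pid, (l, r) in volumes.items()}
    mu = mean(li.values())
    sd = stdev(li.values())
    return sorted(pid for pid, value in li.items() if abs(value - mu) > n_sd * sd)

# Hypothetical toy data: 100 near-symmetric participants and one asymmetric one.
toy = {f"sub-{i:03d}": (4000 + i % 5, 4000 - i % 5) for i in range(100)}
toy["sub-asym"] = (6000, 2000)
flagged = flag_li_outliers(toy)
print(flagged)
```

Flagged participants would then be inspected visually, as in the text, rather than removed automatically.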
3.1.2.3 Quantitative Magnetic Susceptibility

QSM maps were derived from the SWI DICOMs (0.8x0.8x3 mm³ voxels; TE=9.42, 20 ms) and constructed using the Morphology Enabled Dipole Inversion (MEDI) toolbox (http://pre.weill.cornell.edu/mri/pages/qsm.html). Magnitude images were isolated for each head coil channel (N=32) and then combined by taking the square root of the sum of squares. The resulting magnitude image was skull-stripped using HD-BET (Isensee et al. 2019). Phase images were combined across two echo times by using a complex fitting approach to estimate the frequency offset in each voxel (T. Liu et al. 2013). The combined phase was unwrapped using the Laplacian operation method, and the background field was removed using the Laplacian Boundary Value method (Zhou et al. 2014). Finally, the MEDI solver was implemented with model error reduction through iterative tuning (MERIT) and a spherical mean value (SMV) operator for background field removal (lambda=1000, SMV kernel=3) (T. Liu et al. 2013).

All QSM images were visually inspected for quality. Images were rated for overall image quality (pass/fail) and Gibbs ringing (on a scale of 0-2); images with poor image quality and severe Gibbs ringing (i.e., 2s) were removed. Each participant’s SWI magnitude image was linearly registered to their respective T1w image using FSL’s flirt with 12 degrees-of-freedom (dof) and normalized mutual information; the resulting transform was then applied to the QSM image.

3.1.2.4 Diffusion-Weighted MRI

Multi-shell dMRI scans were preprocessed and available for download as previously described (Alfaro-Almagro et al. 2018). All dMRI were acquired with 2x2x2 mm³ voxels and included 50 b=1000 s/mm² and 50 b=2000 s/mm² diffusion-weighted images (δ=21.4 ms, ∆=45.5 ms), and 5 b0 volumes. Two dMRI models were evaluated: 1) DTI, fit to the subset of b=0 and 1000 s/mm² volumes using FSL, was made available for download by the UKBB; 2) we further fit NODDI (H. Zhang et al.
2012) to all shells using DmiPy (Fick, Wassermann, and Deriche 2019). The resulting DTI fractional anisotropy (FA) and mean diffusivity (MD) maps, and the NODDI intracellular volume fraction (ICVF), isotropic volume fraction (ISOVF), and orientation dispersion index (ODI) maps, were transformed to the respective T1w images through a multi-step process. The UKBB processing includes proprietary gradient distortion correction (GDC) of all modalities; UKBB processed dMRI, therefore, had to be warped to achieve correspondence with the non-GDC T1w and SWI images that we independently processed for this study. For each participant, the GDC mean b0 image was linearly registered to the respective GDC T2-FLAIR, using FSL’s flirt with 12 dof and a boundary-based registration (BBR) using the FreeSurfer-derived white matter mask, which in turn was nonlinearly warped to the original T2-FLAIR using ANTs symmetric normalization. The T2-FLAIR was then linearly registered to the respective T1w image using FSL’s flirt with 6 dof and BBR. The transformations from GDC b0 to T1w were concatenated and applied to all dMRI scalar maps.

Table 3.1.1. ApoE genotype and demographic data for UKBB participants with QSM and dMRI.
QSM

APOE | Total N (%) | Sex, Male N | Age, yrs Mean (SD) | < 65 yrs N | ≥ 65 yrs N | Edu, College N
--- | --- | --- | --- | --- | --- | ---
E3E3 | 16,590 (60.3) | 7,687 | 64.6 (7.6) | 8,095 | 8,495 | 11,164
E3E4 | 6,589 (23.9) | 2,996 | 64.1 (7.6) | 3,440 | 3,149 | 4,508
E4E4 | 646 (2.3) | 285 | 63.7 (7.1) | 352 | 294 | 451
E2E3 | 3,538 (12.8) | 1,586 | 64.9 (7.6) | 1,690 | 1,848 | 2,369
E2E2 | 172 (0.6) | 77 | 64.1 (8.2) | 88 | 84 | 121
Total | 27,535 | 12,631 | 64.5 (7.6) | 13,665 | 13,870 | 18,613

dMRI

APOE | Total N (%) | Sex, Male N | Age, yrs Mean (SD) | < 65 yrs N | ≥ 65 yrs N | Edu, College N
--- | --- | --- | --- | --- | --- | ---
E3E3 | 16,473 (60.3) | 7,878 | 64.3 (7.5) | 8,236 | 8,237 | 11,090
E3E4 | 6,522 (23.9) | 3,006 | 63.8 (7.4) | 3,461 | 3,061 | 4,417
E4E4 | 644 (2.4) | 291 | 63.7 (7.1) | 349 | 295 | 461
E2E3 | 3,510 (12.9) | 1,656 | 64.5 (7.5) | 1,725 | 1,785 | 2,346
E2E2 | 164 (0.6) | 77 | 64.0 (8.0) | 87 | 77 | 109
Total | 27,313 | 12,908 | 64.2 (7.4) | 13,858 | 13,455 | 18,423

3.1.2.5 Statistical Analyses

The median QSM and diffusion values across voxels in each of the left and right subcortical ROIs were extracted. The average of the left and right medians was then calculated, and the bilateral averages used for analysis.

Multiple linear regressions were performed to evaluate differences in each of the seven regional QSM indices between ApoE3 homozygotes (E3E3) and either 1) ApoE4 carriers (E4E3 and E4E4) or 2) ApoE2 carriers (E2E3 and E2E2). Secondary analyses evaluated the additive effect of carrying one or two E4 or E2 alleles. Sixteen covariates were included in all statistical models: age, sex, age-by-sex interaction, age², age²-by-sex interaction, educational attainment (“college” or “no college”), and population structure (measured using the first 10 principal components of the UKBB ancestry analysis). To correct for multiple comparisons across seven ROIs, we used the false discovery rate (FDR) procedure (q=0.05). All reported p-values are uncorrected; only associations for which p-values survived FDR correction were considered significant.
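The notion of a "critical FDR p" used in the Results can be illustrated with a short sketch. We assume here that the FDR procedure is the standard Benjamini-Hochberg step-up procedure, which the text does not state explicitly; the function names are hypothetical, and the inputs are the rounded, uncorrected dominant-model p-values printed in Table 3.1.2, so this is an illustration rather than a reproduction of the analysis.

```python
def bh_critical_p(p_values, q=0.05):
    """Benjamini-Hochberg: largest sorted p_(k) with p_(k) <= (k / m) * q.

    Returns None when no test survives correction.
    """
    m = len(p_values)
    critical = None
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k / m * q:
            critical = p
    return critical

def fdr_significant(p_by_roi, q=0.05):
    """Names of tests whose uncorrected p is at or below the critical p."""
    critical = bh_critical_p(list(p_by_roi.values()), q)
    if critical is None:
        return []
    return [roi for roi, p in p_by_roi.items() if p <= critical]

# Uncorrected dominant-model p-values as printed (rounded) in Table 3.1.2.
dominant_p = {
    "Thalamus": 0.87, "Caudate": 0.15, "Putamen": 0.50, "Pallidum": 0.21,
    "Hippocampus": 0.012, "Amygdala": 0.013, "Accumbens": 0.85,
}
print(bh_critical_p(list(dominant_p.values())))  # 0.013
print(fdr_significant(dominant_p))               # ['Hippocampus', 'Amygdala']
```

Under these assumptions the sketch returns a critical p of 0.013 for the dominant model, in line with the threshold quoted in Section 3.1.3.1. Every test at or below the critical value is declared significant, which is why the hippocampus (p=0.012) passes even though it would not meet the first-rank threshold of 0.05/7 on its own.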
In regions that showed significant associations in primary analyses, we further evaluated 1) QSM effects beyond structural volume by also covarying for regional volume and total intracranial volume; and 2) associations between ApoE genotype and subcortical microstructure indexed with dMRI measures, using the same statistical models.

We also tested for ApoE-by-age interactions in these regions, as ApoE genotypes may have an age-dependent effect on the brain. Interaction analyses were conducted using both linear and generalized additive models (GAM), including ApoE genotype and age as covariates. The R `mgcv` package was used for GAM analyses; age, age-by-sex and ApoE-by-age interaction terms were modeled using spline smoothing functions (cubic regression splines, k = 10) in place of the linear and nonlinear age-related covariates. Finally, we stratified linear associations by age, testing for ApoE associations within the subsets of participants either < 65 or ≥ 65 years of age.

3.1.3 RESULTS

3.1.3.1 ApoE4 Microstructural Associations

Compared to the E3E3 group (N=16,590), E4 carriers (N=7,235 E3E4 and E4E4 combined) had higher magnetic susceptibility values in the hippocampus (d=0.036) and amygdala (d=0.035; p ≤ critical FDR p=0.013; Table 3.1.2; Figure 3.1.2). Using an additive model, E4 carriers had significantly higher susceptibility values only in the amygdala (r=0.018; p ≤ critical FDR p=0.005). Susceptibility effects remained when also covarying for regional volume (hippocampus d=0.030; p=0.036; amygdala d=0.036; p=0.010).

dMRI comparisons between the E3E3 group (N=16,473) and E4 carriers (N=7,166 E3E4 and E4E4 combined) revealed significantly higher hippocampal MD (d=0.069; p=1.0×10⁻⁶) and ISOVF (d=0.062; p=1.2×10⁻⁵) and lower FA (d=-0.044; p=0.002) and ICVF (d=-0.064; p=6.7×10⁻⁶) in E4 carriers; the E4 group also had higher MD (d=0.032; p=0.020) and ISOVF (d=0.037; p=0.009) in the amygdala.
All dMRI associations remained significant when also covarying for volume.

Table 3.1.2. ApoE4 associations with QSM.

ROI | Dominant d | Dominant r | Dominant b | Dominant se | Dominant p | Additive r | Additive b | Additive se | Additive p
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Thalamus | -0.002 | 0.001 | -7.8×10⁻⁶ | 4.6×10⁻⁵ | 0.87 | 0.004 | -2.5×10⁻⁵ | 4.1×10⁻⁵ | 0.54
Caudate | -0.020 | 0.009 | -2.2×10⁻⁴ | 1.5×10⁻⁴ | 0.15 | 0.013 | -2.8×10⁻⁴ | 1.3×10⁻⁴ | 0.040*
Putamen | 0.009 | 0.004 | 1.2×10⁻⁴ | 1.8×10⁻⁴ | 0.50 | 0.003 | 7.4×10⁻⁵ | 1.6×10⁻⁴ | 0.63
Pallidum | -0.018 | 0.008 | -2.9×10⁻⁴ | 2.3×10⁻⁴ | 0.21 | 0.011 | -3.4×10⁻⁴ | 2.0×10⁻⁴ | 0.090
Hippocampus | 0.036 | 0.016 | 2.4×10⁻⁴ | 9.3×10⁻⁵ | 0.012** | 0.015 | 1.9×10⁻⁴ | 8.2×10⁻⁵ | 0.022*
Amygdala | 0.035 | 0.016 | 3.5×10⁻⁴ | 1.4×10⁻⁴ | 0.013** | 0.018 | 3.5×10⁻⁴ | 1.2×10⁻⁴ | 0.005**
Accumbens | -0.003 | 0.001 | -5.5×10⁻⁵ | 2.8×10⁻⁴ | 0.85 | 0.004 | -1.6×10⁻⁴ | 2.5×10⁻⁴ | 0.52

Figure 3.1.2. Bar plots of QSM values in the hippocampus and amygdala by ApoE genotype.

3.1.3.2 ApoE2 Microstructural Associations

Compared to the E3E3 group (N=16,590), E2 carriers (N=3,710 E3E2 and E2E2 combined) had higher magnetic susceptibility values in the hippocampus using both the dominant (d=0.05; p ≤ critical FDR p=0.006) and additive approaches (r=0.02; p ≤ critical FDR p=0.005; Table 3.1.3). Susceptibility effects remained significant when covarying for volume (d=0.046; p=0.012). No significant subcortical dMRI differences were found between the E3E3 group (N=16,473) and E2 carriers (N=3,674 E3E2 and E2E2).

Table 3.1.3. ApoE2 associations with QSM.
ROI | Dominant d | Dominant r | Dominant b | Dominant se | Dominant p | Additive r | Additive b | Additive se | Additive p
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Thalamus | 0.004 | 0.002 | 1.3×10⁻⁵ | 5.9×10⁻⁵ | 0.83 | 2.1×10⁻⁵ | -1.7×10⁻⁷ | 5.6×10⁻⁵ | 1.00
Caudate | 0.024 | 0.009 | 2.6×10⁻⁴ | 2.0×10⁻⁴ | 0.18 | 5.4×10⁻³ | 1.4×10⁻⁴ | 1.8×10⁻⁴ | 0.45
Putamen | 0.014 | 0.005 | 1.7×10⁻⁴ | 2.3×10⁻⁴ | 0.46 | 2.4×10⁻³ | 7.3×10⁻⁵ | 2.1×10⁻⁴ | 0.73
Pallidum | 0.002 | 0.001 | 3.5×10⁻⁵ | 2.9×10⁻⁴ | 0.91 | 2.9×10⁻³ | -1.1×10⁻⁴ | 2.7×10⁻⁴ | 0.68
Hippocampus | 0.05 | 0.019 | 3.3×10⁻⁴ | 1.2×10⁻⁴ | 0.006** | 2.0×10⁻² | 3.1×10⁻⁴ | 1.1×10⁻⁴ | 0.005**
Amygdala | 0.021 | 0.008 | 2.1×10⁻⁴ | 1.8×10⁻⁴ | 0.24 | 7.2×10⁻³ | 1.7×10⁻⁴ | 1.7×10⁻⁴ | 0.30
Accumbens | 0.022 | 0.008 | 4.4×10⁻⁴ | 3.7×10⁻⁴ | 0.23 | 8.2×10⁻³ | 4.0×10⁻⁴ | 3.4×10⁻⁴ | 0.24

**uncorrected p ≤ critical FDR threshold
*uncorrected p ≤ 0.05

3.1.3.3 ApoE-by-Age Interactions

No significant linear QSM ApoE4-by-age or ApoE2-by-age interactions were detected in the hippocampus or amygdala (Figure 3.1.3A, C). However, a significant ApoE4-by-age effect was found in the amygdala using GAM (estimated degrees of freedom = 5.64, F = 2.54, p = 0.016; Figure 3.1.3D).

For dMRI, a significant linear ApoE4-by-age interaction was detected in the hippocampus with ICVF (r=0.030; p=0.0009), ISOVF (r=0.029; p=5.7×10⁻⁵), and MD (r=0.033; p=7.9×10⁻⁵), and in the amygdala with ISOVF (r=0.018; p=0.016) and MD (r=0.016; p=0.010). E4 carriers showed greater increases in ISOVF and MD and decreases in ICVF with increasing age (Figure 3.1.4). Using GAM, significant ApoE4-by-age interactions were also found in the hippocampus with ICVF (p=0.007), ISOVF (p=2.3×10⁻⁵), and MD (p=2.7×10⁻⁵), and in the amygdala with ISOVF (p=0.026) and MD (p=0.016). No ApoE2-by-age interactions were detected with linear or GAM models in relation to diffusion measures.

3.1.3.3.1 ApoE Associations Stratified by Age

When stratified by age, ApoE4 was significantly associated with QSM values in both the hippocampus and amygdala only in the subset of participants age 65 years and older (hippocampus d=0.049; p=0.015; amygdala d=0.062; p=0.002).
ApoE4 was also associated with diffusion measures in the hippocampus (ISOVF: d=0.093; p=5.6×10⁻⁶; ICVF: d=-0.081; p=7.7×10⁻⁵; MD: d=0.097; p=2.3×10⁻⁶; FA: d=-0.056; p=0.007) and amygdala (ISOVF: d=0.050; p=0.016; MD: d=0.051; p=0.013) in the older subset, while no significant associations were detected in the younger group. In contrast, E2 carriers younger than 65 years of age had, on average, higher hippocampal magnetic susceptibility (d=0.073; p=0.005) than those with the E3E3 genotype, but differences were not significant in the older subset. As with the full sample, ApoE2 associations with hippocampal diffusion measures were not significant in either age group.

Figure 3.1.3. QSM residuals in the (A, B) hippocampus and (C, D) amygdala plotted against age using either (A, C) a linear approach (adjusted for sex, age-by-sex, age², education, and population structure), or (B, D) using GAM (linearly adjusted for sex, age-by-sex, age²-by-sex, education, and population structure).

Figure 3.1.4. ICVF and ISOVF dMRI residuals (adjusted for sex, age-by-sex, age², education, and population structure) in the (A) hippocampus and (B) amygdala linearly plotted against age. MD plots closely resembled ISOVF, and GAM plots closely resembled linear plots.

3.1.4 DISCUSSION

With aging, iron dysregulation and accumulation can induce oxidative stress, trigger the aggregation of proteins involved in pathogenesis of neurodegenerative disorders, and cause cell death (Ward et al. 2014). Here, we found higher QSM susceptibility in the hippocampi and amygdala of E4 carriers, temporal lobe structures known to be affected in AD and dementia; this likely reflects higher iron deposition, but may also be driven, in part, by less myelin. Several studies have found higher magnetic susceptibility in the hippocampi and amygdala of individuals with mild cognitive impairment and AD (Ravanfar et al. 2021; Acosta-Cabronero et al. 2013).
To date, only a few studies have linked ApoE4 genotype with greater magnetic susceptibility and iron. Significantly higher cerebrospinal fluid ferritin levels have been found across cognitively normal and AD ApoE4 carriers (N=302) (Ayton et al. 2015). In a study by van Bergen et al. (2016), ApoE4 carriers had both higher cortical QSM iron deposition and cortical PiB-PET amyloid plaque load (N=37). Findings from yet another QSM study suggested that ApoE4 may moderate iron effects on default mode network connectivity in cognitively healthy older adults (N=69) (Kagerer et al. 2020). Still, most published studies are relatively small and largely evaluate older individuals with cognitive impairment. Large-scale studies of the normal aging population are needed to better understand genetic risk for disease.

Higher QSM susceptibility in subcortical structures of E4 carriers was consistently accompanied by higher dMRI MD and ISOVF and lower FA and ICVF. These dMRI differences indicate greater diffusivity and less restriction, and could reflect, for example, lower neuronal density or greater inflammation and edema. Other studies have similarly reported lower FA and higher MD in white matter of older ApoE4 carriers (Harrison et al. 2020). Transgenic mice expressing ApoE4 in neurons have been shown to develop axonal degeneration and gliosis in the brain (Tesseur et al. 2000). Moreover, iron deposits are also found in activated microglia (Connor et al. 1992), the inflammatory cells of the nervous system, which, post mortem, have been found to colocalize with tau and amyloid in the hippocampi of individuals with AD (Zeineh et al. 2015).

Growing evidence suggests that ApoE4 effects on brain structure and function may become more pronounced after age 60; two relatively large studies suggest that ApoE4 does not affect cognitive function before age 65 (N=6,560) (Jorm et al.
2007), and that in older individuals, ApoE4 carriers have a faster rate of cognitive decline than non-carriers (N=501) (Schiepers et al. 2012). While smaller hippocampal volumes in ApoE4 carriers have been reported in young healthy carriers (N=44) (O’Dwyer et al. 2012), other studies have failed to find this association (Lupton et al. 2016). Smaller hippocampal and amygdalar volumes have more consistently been found in older ApoE4 carriers with cognitive impairment (Lupton et al. 2016; Hostage et al. 2013). Accordingly, in addition to significant ApoE4-by-age interactions indicating greater age-related increases in iron and diffusivity in E4 carriers, we found that QSM and dMRI E4 effects only survived in individuals over age 65.

ApoE2 has been associated with lower risk for AD (Z. Li et al. 2020). While greater hippocampal volumes have been detected in E2 carriers in ADNI (Hostage et al. 2013), to date, large scale population neuroimaging studies have found limited evidence to suggest it confers protection through improved brain integrity. Here, we found higher magnetic susceptibility in the hippocampi of E2 carriers in, to our knowledge, the first study to evaluate QSM differences. While this could reflect higher iron concentration, this does not necessarily reflect unhealthy deposits; it could, for example, be driven by hemoglobin in the surrounding vasculature and reflect greater blood circulation around the hippocampus. In contrast to ApoE4 carriers, who showed both higher QSM and compromised dMRI microstructural integrity, we did not detect any associations between the ApoE2 genotype and dMRI measures; this may suggest that subcortical iron depositions play a different role in E2 carriers. Furthermore, while no ApoE2-by-age interaction was found, ApoE2 effects were only detected in younger participants.
This could reflect the important role iron plays in the central nervous system; cerebral iron increases quickly early in development and is necessary for myelin production, neurotransmitter synthesis and breakdown, and microglial activation, among other essential functions (Ward et al. 2014). There is also evidence, however, that ApoE2 is not entirely benign, and is associated with an increased risk of cerebral amyloid angiopathy and type 2 diabetes mellitus (Santos-Ferreira et al. 2019). Further studies are necessary to disentangle the relationship between ApoE2 and cerebral iron.

As with all quantitative MRI methods, QSM values depend on acquisition and processing protocols (Deistung, Schweser, and Reichenbach 2017; Haacke et al. 2015). In the processing approach used here, we used the QSM values directly output from our pipeline without inter-individual harmonization through the use of a reference region. Ventricular CSF or specific white matter bundles may be used for this purpose; however, the choice of reference region is not standardized, and QSM variation within reference regions may further confound measurements (Ravanfar et al. 2021). It will be important to determine whether the choice of QSM processing affects the overall conclusions of this work. As with DTI, a number of limitations of NODDI have been identified since its inception. To avoid overparameterization, many model assumptions and fixed parameters are imposed that have not been widely validated and are highly dependent on acquisition parameters (Jelescu and Budde 2017); this complicates the biological specificity and interpretability of resulting microstructural measures.

In the largest QSM study to evaluate ApoE genotypes to date, we found that higher subcortical iron burden is associated with ApoE4. Our findings also offer some initial insights into ApoE2 mechanisms.
Both AD risk and protective genotypes showed higher magnetic susceptibility compared to the most common E3E3 genotype; however, these effects were driven by different age groups, and had inconsistent effects on diffusion metrics. Independent replication and further histopathological investigation of the effects of ApoE genotype on hippocampal iron deposition, and its relationship to age, are needed. QSM provides complementary information to more commonly acquired MRI modalities, and acquisition of these scans in prospective studies of aging may provide further insights into neurodegeneration.

3.1.5 ACKNOWLEDGMENTS

This work was completed using UK Biobank Resource under application number 11559. Funding support was provided by NIH grants (R01 AG059874, T32 AG058507, P41 EB015922), a Zenith Award from the U.S. Alzheimer’s Association, and a grant from Biogen Inc.

3.2 Robust Automatic Corpus Callosum Analysis Toolkit: Mapping Callosal Development Across Heterogeneous Multisite Data

This section is adapted from:

Zhu, A. H., Saremi, A., Amini, A., Pires, R., Thompson, P. M., & Jahanshad, N. (2018, December). Robust automatic corpus callosum analysis toolkit: mapping callosal development across heterogeneous multisite data. In 14th International Symposium on Medical Information Processing and Analysis (Vol. 10975, p. 109750M). International Society for Optics and Photonics.

The corpus callosum (CC) is the main neural pathway that communicates information between the brain’s hemispheres. Impairment of this pathway is evident in neurogenetic and developmental disorders, neurodegenerative diseases, and in many major psychiatric disorders, making the CC the focus of intense study. Prior studies often require manual input for segmentation, are limited to a single site or a single modality, or fail to report on the reliability and generalizability of segmentations.
We develop a Robust Automatic Corpus Callosum Analysis Toolkit (RACCAT) that segments the midsagittal CC from T1-weighted images, guided by diffusion MRI where available, to facilitate large-scale multimodal CC studies of its global, regional, and local (pointwise) structure. RACCAT was applied to data from 772 individuals aged 3-21 from the Pediatric Imaging, Neurocognition, and Genetics study, a developmental cohort imaged using multiple scanners and imaging protocols. CC area and fractional anisotropy were associated with age but also with site and scanner manufacturer; CC curvature also showed significant age associations but showed no detectable association with the scanning site, making it a robust developmental biomarker for multisite studies.

3.2.1 INTRODUCTION

The corpus callosum (CC) is the main white matter commissure that transfers information between the brain’s left and right hemispheres. It has a crucial role in healthy brain function. Structural alterations of the CC are thought to underlie some clinical abnormalities in Alzheimer’s disease and other dementias (Lyoo et al. 1997; Yamauchi et al. 2000; Tomimoto et al. 2004), neuropsychiatric conditions such as schizophrenia (Bachmann et al. 2003; Downhill et al. 2000), and neurodevelopmental disorders including autism spectrum disorders (Vidal et al. 2006). Many CC metrics can be conveniently derived from the midsagittal slice, offering a simple but useful metric of interhemispheric brain connectivity for large-scale studies. Studies of the midsagittal CC show that it continues to develop well into adolescence (Giedd et al. 1999; P. M. Thompson et al. 2000; Luders, Thompson, and Toga 2010), and abnormal CC development has consistently been reported in psychiatric and developmental disorders such as schizophrenia (Bachmann et al. 2003; Downhill et al. 2000) and ADHD (Hynd et al. 1991).
The growth and microstructural development of the corpus callosum have been characterized in several neurodevelopmental cohorts; however, smaller sample sizes have resulted in findings that are not always consistent (Giedd et al. 1999; Lenroot et al. 2007). Most corpus callosum segmentation methods require some manual input, hindering their application to large datasets. Scanner variations may also affect corpus callosum measures, making it challenging to combine data in multi-site studies.

We have developed the Robust Automatic Corpus Callosum Analysis Toolkit (RACCAT), a fully automated Python-based multi-modal pipeline that can be run on large cohorts to extract a range of CC metrics - from its global shape and area, to pointwise thickness measures along its length. Here, using a large multi-site publicly available dataset - the Pediatric Imaging, Neurocognition, and Genetics (PING) cohort (Jernigan et al. 2016) - we examine the reliability and robustness of our extractions. We further map out age-related trajectories for the global mid-sagittal CC metrics, and determine which metrics are relatively site-invariant, enabling more robust large-scale developmental efforts in collaborative international consortia, such as ENIGMA (Paul M. Thompson et al. 2014).

3.2.2 METHODS

3.2.2.1 PING demographics and imaging

The PING study is a publicly available multi-site dataset from a cognitively healthy developmental cohort, aged 3 to 21 years. Parents of children under the age of 18 provided written consent, and participants aged 18 years or older provided consent. The studies were approved at each individual site. Scanning was performed at 6 institutions around the United States. The structural and diffusion MRI imaging protocols were standardized across sites by scanner manufacturer (four Siemens, one Philips, and one GE) (Jernigan et al. 2016). Scan parameters for the T1-weighted (T1w) acquisitions are detailed by LeWinn et al. (2017).
The diffusion-weighted images (DWI) were acquired as follows: for Siemens: TE = 91 ms, TR = 9500 ms, slices with 2.5 x 2.5 x 2.5 mm³, 4 b0 images (at least one acquired in each of the P-A and A-P directions), 32 directions with a b-value of 1000 s/mm²; for Philips: TE = 91 ms, TR = 9000 ms, slices with 2.5 x 2.5 x 2.5 mm³, 3 b0 images (at least one acquired in each of the P-A and A-P directions), 30/30 directions with a b-value of 1000 s/mm²; for GE: minimum TE, TR = 8000 ms, slices with 1.875 x 1.875 x 2.5 mm³, 2 b0 images, 30 directions with a b-value of 1000 s/mm².

T1-weighted and diffusion-weighted MR images of 772 subjects (age range 3.25-21 years old, 362 females, 410 males) were analyzed. RACCAT was used to evaluate the CC in all 772 individuals, as described below. To validate the CC segmentation, 20 subjects were also manually segmented by three individuals and processed using FreeSurfer (FS) 5.3.0 (Fischl 2012). The midsagittal CC was also derived from FS processing in the same 20 subjects for comparison to RACCAT and manual raters. The following procedures are all included in RACCAT and were programmed in Python (http://www.python.org) unless otherwise indicated.

3.2.2.2 Preprocessing and segmentation

The T1w images are registered to the MNI152 1-mm standard space using FSL FLIRT (Jenkinson et al. 2002) and six degrees of freedom to preserve the size and shape of the brains. The DWI were processed using FSL’s dtifit (Jenkinson et al. 2012). The fractional anisotropy (FA) map derived from the diffusion tensor processing was registered to the subject’s T1-weighted anatomical image in standard space (using FLIRT and six degrees of freedom), and the resulting transform was used to also register the primary eigenvector (V1). Wherever opposing-phase b0 EPI scans were collected, susceptibility artifacts were corrected prior to tensor fitting.
The same mid-sagittal slice was selected for all subjects and modalities based on the MNI standard (x=89). Otsu thresholding (Otsu 1979) was applied solely on the center of the T1 mid-sagittal slice to segment regions with high intensity. A region of interest prior based on the MNI brain was used to select the CC and exclude other hyperintense signals or structures. The intensity-based segmentation at this stage occasionally included the fornix, a white matter tract that is adjacent to the CC and has a similar T1-weighted intensity. While other T1w-only options based on local curvature and smoothness have also been implemented, with a multimodal dataset we use the DWI to refine the segmentation. Unlike the fibers of the fornix, CC fibers are interhemispheric, with very little fiber crossing or corruption in the midsagittal region, making the primary eigenvector extremely reliable for discerning the CC from other tracts including the fornix. The FA and V1 maps were thresholded to isolate the WM and remove tracts where the FA was less than 0.2 and the x component of the primary eigenvector was less than 0.65. This mask from the DWI was then applied to the T1w CC mask.

An additional command line interface is also included to generate a quality control (QC) webpage. The mid-sagittal slice of each T1 image is displayed with the boundary of the CC segmentation in yellow and the quadratic fit of the skeleton in blue; Figure 3.2.1 shows this with a different coloring scheme.

Figure 3.2.1. Example QC images are displayed here. Left: The CC segmentation, indicated by yellow, is fit with a quadratic model, indicated by the blue curve, to estimate the curvature. These example subjects display the decreased curvature (flatter CC) observed with age (Top: Male age <5 yrs; Bottom: Male age 18-21 yrs). Right: The CC is segmented into five subregions (Witelson 1989).
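The two-stage segmentation described in this section (intensity thresholding on the T1w slice, then refinement with the diffusion maps) can be sketched in a few lines. This is a simplified, dependency-free illustration and not the RACCAT implementation: the function names, the flattened toy voxel lists, and the histogram binning are hypothetical. We also assume one reading of the thresholds, namely that a voxel is kept when FA is at least 0.2 and the absolute x component of V1 is at least 0.65; the absolute value is our addition, since the sign of an eigenvector is arbitrary.

```python
def otsu_threshold(values, n_bins=256):
    """Otsu (1979): choose the cut that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / n_bins
    hist = [0] * n_bins
    for v in values:
        hist[min(int((v - lo) / width), n_bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for i, h in enumerate(hist):
        w0 += h
        sum0 += i * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_bin, best_var = i, between
    # Upper edge of the last background bin; foreground is value >= threshold.
    return lo + (best_bin + 1) * width

def refine_cc_mask(t1_mask, fa, v1_x, fa_min=0.2, v1x_min=0.65):
    """Keep T1w-mask voxels that also look like left-right oriented white matter."""
    return [
        bool(m) and f >= fa_min and abs(x) >= v1x_min
        for m, f, x in zip(t1_mask, fa, v1_x)
    ]

# Hypothetical flattened voxels; index 6 mimics a bright, fornix-like voxel.
t1_slice = [12, 15, 10, 180, 190, 185, 175, 14]
fa = [0.05, 0.10, 0.05, 0.70, 0.80, 0.75, 0.45, 0.10]
v1_x = [0.1, 0.3, 0.2, 0.95, -0.9, 0.97, 0.2, 0.5]

cut = otsu_threshold(t1_slice)
t1_mask = [v >= cut for v in t1_slice]
refined = refine_cc_mask(t1_mask, fa, v1_x)
print(t1_mask)
print(refined)
```

In this toy case the intensity threshold alone keeps the fornix-like voxel, while the diffusion criteria remove it because its principal direction is not left-right. The region-of-interest prior and the registration steps described in the text are omitted here.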
3.2.2.3 Feature extraction

The subregions are segmented according to two common schemes: (1) into five segments, adapted from Witelson (Witelson 1989), and (2) into three segments of equal length. Derived features of the segmentation include metrics at various scale levels: whole ROI and subregion area and curvature along with pointwise curvature of the boundary. Global and subregion curvatures were calculated by applying a quadratic model to the midline skeleton. Point-wise curvature of the border was estimated by a circular model smoothed by six neighboring voxels. This study focuses on the global measures. The diffusion-based CC mask is segmented independently of the T1w mask, to reduce confounding by remaining susceptibility artifacts. The average diffusion metrics are collected in both the overall mask and subregions. No shape metrics are calculated for the diffusion-based segmentation mask. The histograms of all imaging metrics were checked to ensure a normal distribution.

3.2.2.4 Statistics

Reliability: Inter-rater reliability between the manual and automatic segmentations was assessed using the Dice similarity coefficient (DSC) and the correlation coefficient (r) for area and curvature. Assessments were done pair-wise between RACCAT, FS, and each of the manual raters and between the manual raters.

Age Effects: Multivariate linear regression models were used to test the association between the CC metrics and linear and quadratic age terms, as well as sex effects and age-by-sex interactions. These variables were modeled jointly, covarying for imaging site and ICV. Our primary interest was the effect of age and its developmental interactions with sex; however, we were also interested in whether the measures were related to the imaging sites, as this could reduce robustness in multisite initiatives. To account for the multiple regression models, the false discovery rate (FDR) was used to correct for multiple comparisons testing.
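The two statistics named above, the Dice similarity coefficient and an FDR adjustment, can be sketched as below. This is a hedged illustration only: the function names are hypothetical, and the Benjamini-Hochberg procedure is assumed because the text does not say which FDR method was used.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total


def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: BH is used; the source only says "false discovery rate".
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.clip(adjusted, 0.0, 1.0)
    out = np.empty(m)
    out[order] = adjusted
    return out


# Toy inputs (made-up values, for illustration only).
toy_dice = dice([1, 1, 0, 0], [1, 0, 1, 0])
toy_adjusted = fdr_bh([0.01, 0.04, 0.03, 0.005])
print(toy_dice, toy_adjusted)
```

The two toy masks each contain two voxels and share one, so the Dice value is 2 x 1 / (2 + 2) = 0.5.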
All reported p-values have been FDR-corrected.

3.2.3 RESULTS

Table 3.2.1 displays the median and average DSCs from the pairwise comparison of the different segmentation methods. The best correspondence was found between manual raters, with average DSC values ranging from 0.92 to 0.94; however, RACCAT also had good agreement with the manual raters (average 0.83-0.89). FS overlap with segmentations of manual raters resulted in average DSC values of 0.81 to 0.84. Visual inspection of the FS masks showed that many segmentations contained part or all of the fornix (Figure 3.2.2) and one was split into two sections. As detailed in Table 3.2.2, CC measures derived from RACCAT showed good correlation with those of manual raters, with curvature being more robust than area. FS did not perform as well as RACCAT.

Table 3.2.1. The upper and lower triangles contain the median and average Dice similarity coefficients respectively.
        | RACCAT | Rater 1 | Rater 2 | Rater 3 | FS
RACCAT  | -      | 0.88    | 0.83    | 0.89    | 0.71
Rater 1 | 0.88   | -       | 0.93    | 0.94    | 0.84
Rater 2 | 0.83   | 0.93    | -       | 0.92    | 0.81
Rater 3 | 0.89   | 0.94    | 0.92    | -       | 0.84
FS      | 0.70   | 0.81    | 0.78    | 0.81    | -

Figure 3.2.2. The CC of an example subject is shown with segmentations from our toolkit, a manual rater, and FreeSurfer. The fornix was a common problem for FS segmentations.

Table 3.2.2. The upper and lower triangles contain the correlation coefficients of total area and curvature respectively. RACCAT shows strong correlations with curvature from all raters, while FreeSurfer has low reliability.
        | RACCAT | Rater 1 | Rater 2 | Rater 3 | FS
RACCAT  | -      | 0.91    | 0.78    | 0.48    | 0.21
Rater 1 | 0.94   | -       | 0.87    | 0.74    | 0.13
Rater 2 | 0.93   | 0.95    | -       | 0.76    | 0.32
Rater 3 | 0.92   | 0.92    | 0.96    | -       | 0.47
FS      | 0.56   | 0.62    | 0.55    | 0.62    | -

We found that the area and FA of the total CC significantly increase with age (area: beta = 7.35, p < 0.001; FA: beta = 0.0024, p < 0.001) whereas the curvature significantly decreases with age (beta = -0.0002; p < 0.001). Area also significantly increases with ICV (beta = 6.05 x 10^-5, p < 0.001).
Metrics were adjusted for the linear effect of ICV, and the regressions were repeated on the residuals. The area and FA of the total CC still significantly increased with age (area: beta = 7.29, p < 0.001; FA: beta = 0.0024, p < 0.001), and the curvature still significantly decreased with age (beta = -0.0002, p < 0.001). In the full CC, males tended to have larger areas (beta = 26.47, p = 0.080) and were found to have significantly higher FA than females (beta = 0.020, p < 0.001), but there was no significant difference in total curvature (beta = 5.85 x 10^-4, p = 0.240). There was a significant age-by-sex interaction effect on FA (beta = 9.8 x 10^-4, p = 0.034) but not on area or curvature (area: beta = 1.20, p = 0.20; curvature: beta = 5.75 x 10^-5, p = 0.20). Area and FA of the full CC were found to be scanner manufacturer dependent: the sites that used either a GE or Philips scanner were found to have significant effects on area and FA (GE area: beta = -17.5, p = 0.034; Philips area: beta = 45.9, p < 0.001; GE FA: beta = -0.030, p < 0.001; Philips FA: beta = -0.061, p < 0.001). No sites were found to have a significant impact on curvature. Figure 3.2.3 shows scatterplots where data points are colored by scanner manufacturer.

3.2.4 DISCUSSION

RACCAT segmentations were shown to have good overlap with those of manual raters, with high DSCs. The correlation coefficients between the manual raters were high for total curvature and low for area, showing that curvature is a more robust inter-rater metric. Despite the standardization of the PING imaging protocol, manufacturer-based differences were found in area and FA measurements. Curvature was found to be site-independent as well as being a measure significantly associated with age, making it a promising developmental biomarker for multisite studies. Abnormal CC curvature has been found in neuropsychiatric conditions including schizophrenia (Joshi et al. 2013).

Figure 3.2.3.
The area, curvature, and FA of the full CC segmentation all show associations with age. Scatter plots are shown separately for males and females. The CC becomes flatter with age, while its area and FA increase. Points are colored and shaped by scanner manufacturer with shaded areas indicating one standard deviation from the fit line.

RACCAT is intended for multi-modal analysis of the CC and uses both T1w MRI and dMRI for segmentation. However, as diffusion scans are less common, the tool also has an option to segment the CC based on T1w scans alone. Here, branching points are detected and labeled for a random walker segmentation that removes attached structures such as the fornix. After the fornix is largely removed, the surface may not be smooth. A curvature analysis of the inferior border signals a second random walker segmentation to remove any bumps. This feature will be further explored in a different paper, which will also investigate the feasibility of applying RACCAT to clinical scans. A preliminary examination of the subregions showed all measures of the posterior body of the CC to have particularly high associations with age, and this will also be explored in more detail. Other future directions include deriving additional metrics, testing them for site-related variability, and relating metrics to neurocognitive test scores. Longitudinal processing is currently being developed, particularly tensor-based morphometry. Tractography will also be incorporated to improve the functional and anatomical subregion segmentation of the CC.

While area and FA have been shown to increase with age, to the best of our knowledge this is the largest study to replicate those results. We demonstrated the feasibility of using RACCAT in a large-scale, multi-site study and found that curvature has significant age associations and is the most site-invariant measure.
3.2.5 ACKNOWLEDGMENTS

Funding for this project was provided by NIH Big Data to Knowledge program: ENIGMA Center grant, U54 EB020403 (PI: Thompson) and Michael J. Fox Foundation Grant 14848 (PI: Jahanshad). The Pediatric Imaging, Neurocognition and Genetics Study (PING) dataset used in the preparation of this manuscript was obtained from the National Institute of Mental Health (NIMH) Data Archive (NDA). NDA is a collaborative informatics system created by the National Institutes of Health to provide a national resource to support and accelerate research in mental health. Dataset identifier(s): NIMH Data Archive Collection ID 2607. This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or of the Submitters submitting original data to NDA.

3.3 Assessing the Potential Integration of UK Biobank Diffusion MRI Data into ENIGMA Studies

This section is adapted from: Zhu, A. H., McIntosh, A. M., Thompson, P. M., & Jahanshad, N. (2018). Assessing the Potential Integration of UK Biobank Diffusion MRI Data into ENIGMA Studies. A technical report available at https://github.com/USC-IGC/enigma_ukb_comparison.

Introduction

The UK Biobank is a large single site study that began with the collection of biological samples and clinical history from more than 500,000 people around the United Kingdom from 2006-2010. Participants were asked to return for an imaging visit, including brain MRI, starting in 2014. The imaging data from over 10,000 subjects has been analyzed by the UKB team using standard methods (Alfaro-Almagro et al. 2018), and the images and imaging data phenotypes (IDPs) are available to research laboratories that apply for access to the data. Due to the broad inclusion criteria for subject selection, data from the UK Biobank may prove useful in increasing the sample sizes for various studies, by combining it with other data.
However, combining data from different scanners and processing pipelines is a non-trivial problem, and its feasibility is important to establish for future studies. The feasibility of integrating the UK Biobank data into Enhancing NeuroImaging Genetics through Meta Analysis (ENIGMA) studies is assessed here.

Methods

The raw and processed diffusion images of 447 subjects were downloaded. Subjects were between 46 and 73 years old, including 209 women. For our study, subjects were intentionally not screened to exclude neurological conditions as we aimed purely to assess methodological agreement, and we wanted to ensure there was no bias in terms of patient demographics. Images were re-processed using the ENIGMA diffusion processing pipeline (Jahanshad et al. 2013). The diffusion measures analyzed in this study were fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), radial diffusivity (RD), intracellular volume fraction (ICVF), and isotropic volume fraction (ISOVF). The two major differences between the pipelines were the use of ANTs (Avants et al. 2008) for non-linear registration and the ENIGMA template. Ordinary least squares models were fitted for every region of interest (ROI) of the Johns Hopkins University (JHU) ICBM-DTI-81 white-matter labels atlas (Mori et al. 2008), with metrics derived from the ENIGMA skeleton as the predictor and the UK Biobank derived metrics as the response. The fits were forced through the origin. The goodness of the fit was assessed using the coefficient of determination (R-squared).

Results

Despite the quality control (QC) already performed by the UK Biobank, three subjects were additionally dropped from the analysis, one because of a PA image acquired with an incomplete field of view and two because of bad FNIRT registrations, leaving 444 subjects for analysis.
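The per-ROI comparison described in the Methods, an ordinary least squares fit forced through the origin with R-squared as the goodness-of-fit measure, can be sketched as below. The function name and toy values are hypothetical. The sketch assumes the uncentered R-squared (residual sum of squares relative to the sum of squared responses), which is the usual convention for no-intercept models; the report does not state which definition was used.

```python
import numpy as np


def fit_through_origin(x, y):
    """Least-squares slope for y = slope * x, plus uncentered R-squared.

    x: predictor (for example, an ENIGMA-pipeline ROI metric per subject)
    y: response (for example, the UK Biobank-derived metric per subject)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1-D arrays of equal length")
    slope = np.dot(x, y) / np.dot(x, x)
    resid = y - slope * x
    # Assumption: uncentered total sum of squares, since there is no intercept.
    r_squared = 1.0 - np.dot(resid, resid) / np.dot(y, y)
    return slope, r_squared


# Toy values (made up, for illustration only).
toy_slope, toy_r2 = fit_through_origin([1.0, 2.0], [1.0, 3.0])
print(toy_slope, toy_r2)
```

For the toy data, the slope is (1 x 1 + 2 x 3) / (1 + 4) = 1.4, the residuals are -0.4 and 0.2, and the uncentered R-squared is 1 - 0.2 / 10 = 0.98.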
Most ROIs showed good correspondence between the two processing pipelines, though the corticospinal tract (CST), the fornix (FX), superior fronto-occipital fasciculus (SFO), and uncinate (UNC) were often inconsistent (Table 3.3.1, Figure 3.3.1). The fornix/stria terminalis (FX/ST) was notable for its large range of R-squared values (0.19-0.95). ICVF and MD were the most consistent diffusion metrics (average R-squared values of 0.89 and 0.82 respectively), whereas FA and ISOVF were the least (average R-squared values of 0.78 and 0.76 respectively).

Discussion

The ROIs that consistently had low R-squared values are in areas that are susceptible to partial voluming effects (juxtacortical or periventricular) and/or registration errors; the JHU atlas only considers the portion of the CST that is inferior to the thalamus (Acheson et al. 2017). These preliminary results indicate that reprocessing the diffusion images may not be necessary, depending on the ROI and diffusion metric of interest.

Figure 3.3.1. The between-pipeline consistency of both the fornix (FX) and the fornix/stria terminalis (FX/ST) varied across diffusion measures. The FA values in the scatter plots show the FX at its worst and the FX/ST at its best.

Table 3.3.1. R-squared values for all ROIs and diffusion measures. ROIs are ordered by average R-squared in descending order. R-squared values below 0.6 are colored red. This visually demonstrates that the diffusion measures of certain ROIs are less consistent between the two processing pipelines. In viewing the minimum and maximum R-squared columns, of particular note is the FX/ST, which has an R-squared range of 0.191-0.945 across diffusion metrics.

CHAPTER 4 Tensor-Based Morphometry

4.1 Longitudinal multi-site modeling of brain atrophy trajectories associated with amyloid and tau

This section is adapted from: Somu, S.*, Zhu, A. H.*, Narula, S., Jahanshad, N., Nir, T. M. (2024).
Longitudinal multi-site modeling of brain atrophy trajectories associated with amyloid and tau. Accepted by SIPAIM 2024.

Amyloid and tau pathogenesis are hallmarks of Alzheimer's disease and occur before observable volumetric changes in a brain MRI scan. Amyloid beta plaque deposition can occur years before tau-related neurodegeneration, and may be an important predictor of subsequent brain atrophy. In this study, we evaluated T1-weighted MRI data from four longitudinal Alzheimer's disease datasets to assess associations between amyloid beta and tau biomarkers and atrophy patterns using tensor-based morphometry. To account for potential non-linear effects on atrophy, we modeled age effects, time interval between subject visits and the acceleration of atrophy with aging in the context of amyloid and tau positivity. We found that amyloid beta and tau positivity were associated with greater atrophy in medial temporal lobe brain regions associated with Alzheimer's disease. Furthermore, we found that amyloid beta and tau positivity were associated with different acceleration patterns, namely that the rate of amyloid-related atrophy started high but decelerated while tau-related atrophy accelerated with age.

4.1.1 INTRODUCTION

Alzheimer's disease (AD) is characterized by accumulation of amyloid (Aβ) and tau pathology in the brain, often years before signs of brain atrophy and cognitive impairment emerge (Braak and Del Tredici 2015). Aβ biomarker abnormalities are thought to precede tau and neurodegeneration in many proposed models of AD progression. Understanding the relationship between early AD pathology and brain atrophy over time is crucial for tracking and predicting disease trajectories. Tensor-based morphometry (TBM) is an MRI analysis technique that maps local structural brain differences from the gradients of deformation fields that align one T1-weighted (T1w) image to another.
This method has been shown to be sensitive to longitudinal brain atrophy in AD (Hua et al. 2013). However, many longitudinal studies have been constrained to single cohort analyses. To enhance statistical power, validate hypotheses across various MRI protocols and study designs, and produce generalizable results across heterogeneous populations, multi-site data-pooling is imperative (Bayer, Thompson, et al. 2022). Many AD datasets are designed as longitudinal studies with multiple follow-ups to track subjects as they age (Weiner et al. 2017; LaMontagne et al. 2019). To accurately map disease progression over time and increase power to detect earlier, more subtle effects, it can be beneficial to incorporate all available MRI scans into the analysis. However, longitudinal data modeling requires consideration of different temporal components: the general age effect, the interval between subject visits, and potential acceleration or deceleration of atrophy over time. Non-linear rates of atrophy have been associated with aging and clinical AD diagnosis (Tustison et al. 2019), but the various temporal effects of Aβ and tau pathology have not been studied. The aim of the present study was to evaluate a comprehensive multi-study, multi-time point TBM pipeline on longitudinal datasets. T1w MRI scans from four independent AD studies were used to evaluate the relationship between localized brain atrophy and amyloid and tau biomarkers derived from cerebrospinal fluid (CSF) and positron emission tomography (PET). As tau neurofibrillary tangles are considered more closely correlated with neurodegeneration than Aβ plaques (Wolfe 2012; Vogel et al. 2020), we hypothesized that the trajectories of atrophy associated with Aβ and tau positivity would differ.

4.1.2 METHODS

4.1.2.1 Datasets

T1w brain MRI scans from four studies, ADNI3 (1057 subjects), HABS-MGH (284 subjects), OASIS3 (1085 subjects), and PREVENT-AD (343 subjects), were included in this analysis (Weiner et al.
2017; LaMontagne et al. 2019; Dagley et al. 2017; Tremblay-Mercier et al. 2021). Subjects with only a single time point and those scanned on different scanners at subsequent time points were excluded. All remaining scans were included. Demographic and clinical information are detailed in Tables 4.1.1 and 4.1.2; CSF and PET biomarker measures were only included if collected within 6 months of an included MRI scan. If more than one CSF measure was closest in time to a given MRI scan date, then both of them were included in the analysis with the same MRI repeated (repeated measures). CSF measures were provided as continuous measures, which were binarized to amyloid or tau positivity using cutoffs from (Blennow et al. 2019; Salimi et al. 2024). Binarized PET measures were made available by the studies.

Table 4.1.1. Dataset demographics.
Study | Scanner Manufacturer | Scanner Field Strength | Voxel volumes (mm³) | N subjects / Female | N scans | N followup visits | Scan Time Interval (years)
ADNI3 | Philips, GE, Siemens | 3T | 1, 1.2, 1.5 | 561 / 288 | 1429 | 2.7 (0.8) | 1.4 (0.6)
HABS-MGH | Siemens | 3T | 1.2, 1.5 | 228 / 133 | 640 | 2.8 (0.6) | 2.5 (0.9)
OASIS3 | Siemens | 1.5T, 3T | 1, 1.2, 1.5 | 100 / 51 | 210 | 2.1 (0.4) | 2.2 (0.8)
PREVENT-AD | Siemens | 3T | 1 | 273 / 197 | 1145 | 4.6 (1.3) | 0.9 (0.3)

Table 4.1.2. Baseline diagnosis, CSF and PET Biomarkers
N Total scans | CSF BATCH / PET tracer | Aβ PET (+/-) | Aβ42 CSF (+/-) | pTau181 CSF (+/-) | CN/MCI/AD* | MR Age (years)
ADNI3 | UPENNBIOMK10, UPENNBIOMK12, UPENNBIOMK13, FBB, FBP | 193/259 | 149/161 | 122/190 | 307/180/57 | 74.9 (8.2)
HABS-MGH | PiB | 51/133 | - | - | 199/2/0 | 73.7 (8.2)
OASIS3 | PiB | 7/24 | - | - | 56/16/3 | 72.9 (8.2)
PREVENT-AD | 1 CSF Batch | - | 3/96 | 3/96 | 273/0/0 | 63.8 (8.2)
*Diagnosis information is provided only for participants whose diagnosis date is within 6 months of their MRI scan. CN - Control Normal, MCI - Mild Cognitive Impairment, AD - Alzheimer's Disease

4.1.2.2 Image Processing

Each T1w scan was denoised using ANTs Denoising (Manjón et al.
2010), followed by brain extraction with HD-BET (Isensee et al. 2019), bias field correction using FSL FAST (Y. Zhang, Brady, and Smith 2001), and normalization by the median scan intensity. The FOV was refined using FSL robustfov (Woolrich et al. 2009). Each time point TP[i] was registered with the skull on, both linearly and nonlinearly, to the preceding TP[i-1], generating a LogJacobian_[i]to[i-1] that quantifies the deformation between the two time points. Linear registrations were conducted using FSL's FLIRT (Jenkinson et al. 2002) with 12 degrees of freedom, and non-linear registrations using ANTs (MI, SyN) (Tustison et al. 2019). This process was executed sequentially, resulting in N-1 LogJacobians for a subject with N scans (Figure 4.1.1). A study-specific minimal deformation template (MDT) was created for each of the four studies. Masked preprocessed baseline T1w scans from 50 subjects representative of the study in terms of age, sex and diagnosis were registered to the MNI152 1-mm template and then passed through the ANTs Multivariate Template Construction (CC, Bspline, SyN) to generate the study MDT. A multisite MDT was then created from the four study MDTs. Each TP[i-1] masked, preprocessed T1w scan was warped to its respective study MDT; this warp was concatenated with the warp to the multisite MDT and applied to the LogJacobian_[i]to[i-1] to bring it to the multi-site MDT space (Jahanshad et al. 2019) for pooled analyses across studies (Figure 4.1.1).

4.1.2.3 Statistical Analysis

Voxel-wise random effects linear regressions were used to test for associations between longitudinal brain volume changes (LogJacobian_[i]to[i-1]) and 1) Aβ PET, 2) Aβ42 CSF, or 3) pTau181 CSF positivity (i.e. +/-) at TP[i-1]. Statistical analyses were performed using the lmerTest package in R (Kuznetsova, Brockhoff, and Christensen 2017).
The model incorporated subject as the random effect and the following fixed effects:

LogJacobian ~ Biomarker Positivity + Biomarker Batch + Sex + Study + Voxel Volume + MRI Scanner Manufacturer + Baseline-to-MDT Log Jacobian + MRI-Biomarker Collection Time Interval + MRI Age + MRI Time Interval + MRI Age-by-MRI Time Interval

To account for the temporal components, MRI Age was included to model the general age effect (mean atrophy), MRI Time Interval for the rate of atrophy between visits, and MRI Age-by-MRI Time Interval interaction for acceleration (Guillaume et al. 2014). We used the false discovery rate for multiple comparisons correction across voxels (q = 0.05). In voxels that had significant associations between atrophy and biomarker positivity, two secondary analyses were performed: 1) a sensitivity analysis to evaluate whether effects remained significant in the subset of subjects who were cognitively normal (CN) at TP[i-1] (N = 293 CSF, N = 514 PET); and 2) to see if mean atrophy, rate of atrophy, or acceleration of atrophy differed by biomarker group, we evaluated the interaction between the biomarkers and MRI age, scan time interval and acceleration effect.

Figure 4.1.1. Tensor Based Morphometry showing the registration of TP[i] to TP[i-1] and spatial normalization of their LogJacobian to the Multi-site MDT.

4.1.3 RESULTS

4.1.3.1 CSF and PET Biomarker Associations with Volumetric Change

Aβ positivity in both CSF and PET was significantly associated with greater atrophy in the temporal lobe, deep white matter, thalamus, and corpus callosum, along with greater ventricular and sulcal expansion. The associations were more widespread in the PET measures as compared to the CSF measures (Figure 4.1.2A,B). pTau181 positivity was significantly associated with greater atrophy in the temporal lobe, caudate, deep white matter, corpus callosum and ventricular expansion. Sulcal expansion was also seen in a minor cluster of voxels (Figure 4.1.2C).
In our sensitivity analyses, all associations remained significant in the subset of CN participants.

Figure 4.1.2. Standardized beta estimates for significant associations between volumetric changes and Aβ and pTau biomarkers (q < 0.05; FDR corrected) show tissue atrophy (blue) and ventricular expansion (yellow) with greater pathology.

4.1.3.2 Rate and Acceleration of Volume Changes

In primary analyses of CSF and PET biomarkers, voxelwise atrophy was significantly associated with time interval (rate) and acceleration. Voxelwise atrophy was only significantly associated with MRI Age in models with CSF biomarkers. Across all models, the regions significantly associated with time-based covariates were adjacent to the frontal horns of the lateral ventricles and the splenium of the corpus callosum. In the PET Aβ model, rate and acceleration were also significant around the temporal horns of the ventricles, superior to the hippocampus. Secondary analyses evaluating biomarker-by-time covariate interactions found notable CSF biomarker-by-interval and CSF biomarker-by-acceleration effects. CSF Aβ+ participants had on average higher rates of atrophy than Aβ- participants (Figure 4.1.3B); however, this rate decelerated over time (Figure 4.1.3C). In contrast, pTau+ participants had on average lower rates of atrophy than pTau- participants (Figure 4.1.3B); however, this rate accelerated over time (Figure 4.1.3C). No interactions were found for Aβ PET.

Figure 4.1.3. Time effects on LogJacobians averaged across all voxels that showed a significant negative effect (i.e. contraction) in respective Aβ42 CSF and pTau181 CSF analyses. A.) Age effect: visualized by the relationship between mean LogJacobians and TP[i-1] MRI Age, the trend lines show the volume change over time while the scatter points with the secondary lines represent the volumes grouped by subject. B.)
Rate effect: Beta estimate from the linear regression model fit which accounts for change in volume over scan time interval. C.) Acceleration effect: Beta estimates from the linear regression model fit indicating acceleration or deceleration of volume changes with aging.

4.1.4 DISCUSSION AND CONCLUSION

We found significant associations between atrophy and the presence of amyloid and tau in regions typically affected by AD, such as the medial temporal lobe. These associations remained significant in the subset of cognitively normal participants, suggesting our longitudinal TBM pipeline is sensitive to early neurodegeneration. This is further supported by associations detected with CSF Aβ, which is thought to precede Aβ PET abnormalities in the AD biomarker cascade (Jack et al. 2013). Our study also aimed to determine whether the presence of amyloid and tau affects the rate and acceleration of atrophy in the brain. We found a mirrored effect of rate vs acceleration between Aβ and tau, which may reflect the proposed timing of pathological progression. As Aβ deposition is thought to occur first, more pronounced atrophy may occur earlier on and then slow. The acceleration of atrophy shown in our CSF pTau positive subset may correspond to the later appearance of neurofibrillary tangles in the brain. Further studies are needed to support these findings. The current analyses evaluated each Aβ and tau measure separately. As part of our future work, we would like to model the trajectories of subjects that are grouped based on both Aβ and tau status (i.e., Aβ-/tau- vs. Aβ+/tau- vs. Aβ+/tau+). This would help confirm whether the acceleration effects we found are truly reflective of the proposed biomarker timeline. While our longitudinal, multi-study TBM pipeline uses pairwise warps to the closest previous time point, other studies have used different approaches. Some TBM studies register all follow-up images to the subject's baseline (Hua et al.
2013) or create a subject-specific template (Reuter et al. 2012). We sought to remove the bias from dependence on a single time point and to avoid needing to recreate a subject template with added data. However, one potential limitation of our method is that our TBM maps only account for two images at a time. In the future, we hope to develop a method that will account for all the time points from a given subject in a single analysis and be flexible enough to account for newly added data. Overall, we found different temporal trajectories for brain atrophy between Aβ+ and pTau+ groups using our longitudinal, multi-study TBM pipeline. The sensitivity we have shown to early pathology is promising and warrants further study.

4.1.5 ACKNOWLEDGMENTS

Funding was provided by AARG-23-1149996 and the following NIH grants: R01AG059874, R01AG058854, S10OD032285.

4.2 Age-Related Heterochronicity Of Brain Morphometry May Bias Voxelwise Findings

This section is adapted from: Zhu, A. H., Thompson, P. M., & Jahanshad, N. (2021). Age-related heterochronicity of brain morphometry may bias voxelwise findings. In 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI) (pp. 836-839). IEEE.

Increased age has arguably one of the largest known effects on brain structure in healthy adult populations, often affecting different brain regions at different rates, and showing numerous interactions. Modeling age appropriately can be challenging. Tensor-based morphometry produces voxelwise maps of regional intersubject differences in the brain, often in relation to a single study-specific (3D) template. In studies with a wide age range, this single template may not be sufficient. Here, we create age-specific templates within smaller age bins (4D) to compare to the standard 3D model and evaluate the potential biases. We analyzed the morphological changes of nearly 26,000 subjects from the UK Biobank.
We found that age-related biases that existed with a 3D template were minimized with the 4D template. For effective modeling of age across the lifespan, a single template appears suboptimal.

4.2.1 INTRODUCTION

Structural magnetic resonance imaging (MRI) has become a common non-invasive tool for studying brain development and aging. Variations in brain morphology across populations may be influenced by age, sex, disease, and other effects. The degree of these effects may also vary regionally within the brain (Lemaitre et al. 2012). Various methods have been implemented to account for these differences, particularly with regard to age. Statistical models often model age with linear and sometimes non-linear terms (Lemaitre et al. 2012; Lebel et al. 2008). In some cases, it is the main effect of interest; in others, it is a nuisance variable modeled to fit a more subtle effect, such as the effect of a genetic variant, to the remaining variance. In studies that use deep learning on brain MRI to calculate an individual's 'brainAge', the metric ends up being correlated with age itself. It has been suggested to account for the residual age association by further regressing it out (Smith et al. 2019); however, new analyses suggest that such modifications may inflate model statistics (Butler et al. 2021). Brain MRI processing pipelines often reference a template image to determine inter-individual differences. Most templates are based on young adult brains, but the use of such templates in other populations may be problematic (Fonov et al. 2011; Machilsen et al. 2007). A minimum deformation template (MDT) is a study-specific template based on a subset of subjects. Non-linearly registering images to the MDT produces a warp field, which can be used to compute maps of morphological brain changes in an analysis pipeline known as tensor-based morphometry (TBM).
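As a hedged illustration of how a warp field yields a voxelwise morphometry map, the sketch below computes the log of the Jacobian determinant of the transform x -> x + u(x) from a displacement field u. It is not the ANTs implementation used in this work: the function name, the array layout (X, Y, Z, 3), and the finite-difference gradients are assumptions, and the sign convention (positive for local expansion of this transform) depends on the direction in which the warp is defined.

```python
import numpy as np


def log_jacobian(displacement, spacing=(1.0, 1.0, 1.0)):
    """Voxelwise log Jacobian determinant of x -> x + u(x).

    displacement: array of shape (X, Y, Z, 3) holding u in the same units
    as `spacing` (hypothetical layout, chosen for this sketch).
    """
    u = np.asarray(displacement, dtype=float)
    if u.ndim != 4 or u.shape[-1] != 3:
        raise ValueError("displacement must have shape (X, Y, Z, 3)")
    jac = np.zeros(u.shape[:3] + (3, 3))
    for i in range(3):
        # Finite-difference derivatives of component i along each axis.
        grads = np.gradient(u[..., i], *spacing)
        for j in range(3):
            jac[..., i, j] = grads[j]
        jac[..., i, i] += 1.0  # identity part of the transform
    det = np.linalg.det(jac)
    # Assumes det > 0 everywhere, as for a diffeomorphic (non-folding) warp.
    return np.log(det)


# Toy field: uniform 10% expansion along every axis (illustration only).
grid = np.stack(np.meshgrid(np.arange(4.0), np.arange(5.0), np.arange(6.0),
                            indexing="ij"), axis=-1)
toy_logjac = log_jacobian(0.1 * grid)
print(toy_logjac.mean())
```

For the toy field the Jacobian is 1.1 times the identity at every voxel, so the map is constant at 3 x log(1.1); a zero displacement field gives a map of zeros.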
These structural brain difference maps can be used to assess brain-wide differences associated with a number of factors, including diseases such as Alzheimer's disease and human immunodeficiency virus (Hua et al. 2008; Lepore et al. 2008). Rather than rely solely on the statistical methods mentioned, study-specific templates are often created to reflect the target population. Fonov et al. show that when an adult template is used for voxelwise registrations, a widespread bias is encountered where voxels are on average expanded compared to when age-appropriate templates are used (Fonov et al. 2011). This skew can result in a non-zero mean for the distribution of log-Jacobians of voxels across the population for many voxels; this would require nonparametric hypothesis testing to ensure that appropriate inferences are made (Leow et al. 2007). While discrepancies in overall head size and volume would no longer be the driving force underlying any potential biased effects in studies of adulthood, we hypothesize that age-inappropriate subtleties do occur in voxelwise studies of older, aging adults, which may result in biased output maps. This may be most problematic when the effect tested is of a variable that exhibits heterochronicity, or heteroskedastic effects with respect to age. Our analysis is particularly relevant in today's era of large-scale biobank studies that include tens of thousands of MRIs, making them highly powered to detect subtle effects (Smith and Nichols 2018). Here, we analyzed data from the UK Biobank, a cohort with neuroimaging data from over 25,000 adults aged 45-80, to test our hypothesis that age-appropriate templates may also be beneficial in studies of older individuals; specifically, we assess whether subtle age-related differences may result in biased inference maps if an age-appropriate template is not used.
We create a series of age-appropriate templates and compare effect maps within and across 5-year age bins to those using a single cohort-specific template.

4.2.2 METHODS

The UK Biobank is a population-based study that began in 2006 with the collection of cognitive tests, biological samples, etc. In 2014, the UK Biobank started collecting brain MRI scans with the goal of scanning 100,000 subjects and acquiring follow-up scans from 1,000 subjects. As of 2020, over 50,000 subjects have been scanned. Brain MRI acquisitions included a structural T1-weighted scan. Full details of scan parameters may be found in (Miller et al. 2016). The T1-weighted images of over 26,000 subjects aged 44-82 years old were processed using FreeSurfer 5.3 (Fischl 2012). Briefly, FreeSurfer processing includes bias field correction, skull stripping, subcortical segmentation, and cortical parcellation.

4.2.2.1 Creating the Templates

To create a minimum deformation template (MDT), we used the Symmetric Normalization (SyN) nonlinear transform available through Advanced Normalization Tools (ANTS) (Avants et al. 2008). Subject images were iteratively transformed to an evolving template until a minimal difference threshold was reached. The initial template was the MNI152 standard-space T1-weighted average structural template image with 1 mm³ isotropic voxel resolution. Details of the template creation method can be found in (Jahanshad et al. 2019). The lateral, third, and fourth ventricles were segmented in all templates using FreeSurfer 5.3 and combined to create a single ventricular mask. Two sets of MDTs were created. First, a 3D template was created from 60 healthy subjects (30 male, 30 female). The age distribution of the 60 subjects visually matched that of the whole sample. Then a 4D template was created with five-year intervals: 45-50, 50-55, 55-60, 60-65, 65-70, 70-75, and 75-80. For each age bin, 60 healthy subjects (30 male, 30 female) were selected to create an age-specific template.
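The selection of subjects for the age-specific templates can be illustrated with a short sketch. This is a hypothetical illustration and not the procedure actually used: the record layout (`id`, `age`, `sex`), the treatment of the bins as half-open intervals, the synthetic cohort, and the fixed random seeds are all assumptions. The code assigns subjects to the five-year bins listed above and draws an equal number of males and females from each bin.

```python
import random
from collections import defaultdict

# Five-year bins from the text; treating them as half-open [lo, hi) is an assumption.
AGE_BINS = [(45, 50), (50, 55), (55, 60), (60, 65), (65, 70), (70, 75), (75, 80)]


def assign_bin(age):
    """Return the (lo, hi) bin containing age, or None if age is outside 45-80."""
    for lo, hi in AGE_BINS:
        if lo <= age < hi:
            return (lo, hi)
    return None


def select_template_subjects(subjects, per_sex=30, seed=0):
    """Draw per_sex males and per_sex females at random from each age bin.

    subjects: list of dicts with hypothetical keys 'id', 'age', 'sex' ('M' or 'F').
    Bins without enough subjects of either sex are skipped.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for s in subjects:
        b = assign_bin(s["age"])
        if b is not None:
            pools[(b, s["sex"])].append(s["id"])
    selected = {}
    for b in AGE_BINS:
        males, females = pools[(b, "M")], pools[(b, "F")]
        if len(males) >= per_sex and len(females) >= per_sex:
            selected[b] = rng.sample(males, per_sex) + rng.sample(females, per_sex)
    return selected


# Synthetic cohort for illustration only (not UK Biobank data).
_rng = random.Random(1)
toy = [{"id": i, "age": _rng.uniform(44, 82), "sex": _rng.choice("MF")}
       for i in range(5000)]
chosen = select_template_subjects(toy)
```

In this sketch, `chosen` maps each age bin to the 60 subject ids that would be passed to template construction; subjects younger than 45 or aged 80 and over fall into no bin.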
There was an insufficient number of subjects outside of the 45-80 year old range to create age-specific templates. Healthy subjects were determined using ICD-10 and self-reported conditions. Exclusion criteria included all mental and behavioral disorders, diseases of the nervous system, and cancers.

4.2.2.2 TBM Processing

For each subject, a binarized brain mask was created based on the aparc+aseg.mgz output of the FreeSurfer processing pipeline. The brain mask was morphologically closed with a 6-mm kernel, and any remaining holes were filled. This mask was then applied to the bias-field corrected T1-weighted (T1w) image from the FreeSurfer outputs. The masked T1w images were then registered to the 3D MDT and to the age-appropriate volume of the 4D MDT, using FSL’s FLIRT (9 degrees of freedom) for linear alignment followed by non-linear registration with ANTS SyN. Subjects used in the creation of the templates were excluded from analysis. Subjects who were under 45 or over 80 years of age were also excluded. No subjects were excluded due to any diagnosis. Images were visually checked to ensure appropriate registration to the templates. Log Jacobian maps were extracted from the non-linear deformation maps. The mean log Jacobian within the ventricular mask was extracted for each subject. For the rest of this work, we refer to the log Jacobians in the full brain as the Jac-maps, and in the ventricles as the vent-Jacs.

4.2.2.3 Statistics

The ventricular Jacobian was analyzed in the whole cohort. Means and variances of the Jacobians were compared between the 3D and 4D MDTs. Age trends were also analyzed both within the age bins and across the whole cohort. As structural changes vary regionally, voxel-wise statistics were run in addition to ventricular log Jacobian analyses. Images were downsampled to 2 x 2 x 2 mm³ resolution to allow faster computation.
The age bins were not uniformly sampled (e.g., there were approximately nine times as many subjects in the 65-70 age range as in the 45-50 age range), which may introduce a bias from the large-N age bins in the 3D analyses and different sample size biases across the age bins in the 4D analyses. To avoid this, 529 subjects were randomly selected from each age bin, for a total of N=3,703. This also decreased the voxelwise modeling computation time. To observe regions where biases may occur, we ran voxelwise t-tests of the Jacobian maps of each age group in comparison to the Jacobian maps from registration to the 3D template. In each voxel, a linear regression was also performed with age, sex, and an age-by-sex interaction term as independent variables and the log Jacobian as the dependent variable. Multiple comparisons were corrected for using the false discovery rate (FDR) within each covariate image.

4.2.3 RESULTS

The age distribution of the subjects used in the creation of the 3D MDT largely matched that of the whole sample. The age distribution across the cohort skewed older, with the most subjects in the 65-70 age group. The sex distribution of most age groups matched that of the whole cohort, with a little over half being female.

4.2.3.1 Ventricular Jacobians

For registrations to the 3D template, the mean of the ventricular log Jacobian increased with age (Figure 4.2.1). Positive and negative log Jacobian values indicate ventricular expansion and shrinkage, respectively. This trend indicates that the ventricles of younger subjects were expanded to match those of the MDT, and the reverse was true of the older subjects. This age trend was less apparent with registrations to the age-specific templates. Ventricular log Jacobian means for individual age bins roughly centered around zero. The 4D implementation therefore better allows for modeling of age effects that can vary across the age groups.
The variance was largely consistent both across age groups and between the 3D and 4D template processing streams.

Figure 4.2.1. Age trends of the vent-Jacs to the 3D and 4D templates are plotted separately within the 5-year age bins. Registrations to the 3D template show a marked difference depending on age bin, while 4D vent-Jacs largely center around zero.

4.2.3.2 Voxelwise Analysis

Voxelwise t-test maps show stark regional differences between the age-appropriate and 3D templates (Figure 4.2.2). In age groups younger than 65, the t-values are significantly lower in the ventricles, i.e., the ventricles did not have to deform as severely for the age-appropriate MDT as they did for the 3D MDT. In age groups older than 65, the t-values are significantly higher, indicating the reverse. This crossover approximately matches the mean age of the subjects used in creation of the 3D MDT (63 years). The area reflecting ventricular deformation also decreases as age groups reach the cohort mean and increases as the age groups pass it. Other regions showing decreased and increased log-Jacobians also follow the same pattern of t-value reversal about the mean age. For age, sex, and the age-by-sex interaction terms, nominally significant voxels varied across age bins, as seen in the example of age-associated voxel maps in Figure 4.2.3. As expected, the different models show different overall age effects when the images are all pooled, while similar patterns are observed for each specific bin. Age-by-sex interaction effects also differ between the 3D and 4D registration methods: the 3D model indicates significant ventricular involvement whereas the 4D model does not, suggesting that the interaction effect may still depend on age itself.

Figure 4.2.2. Voxelwise t-value maps comparing log Jacobians between the 3D template and the age bin templates.
Red voxels indicate the age-appropriate registrations show higher mean Jacobians (more relative expansion), while blue voxels indicate the opposite (less relative expansion).

Figure 4.2.3. Voxelwise beta maps for age using both a single template and age-appropriate templates. Displayed regions are those that pass the nominal significance threshold of p < 0.05.

All terms had clusters that passed the FDR threshold in the whole cohort analysis (Table 4.2.1). In the age group specific analyses, age was only significant in the 75-80 age group. Most of these voxels were clustered in the superior parietal lobe. Smaller clusters were also found in the orbitofrontal regions and occipital lobes. Sex and age-by-sex effects did not reach FDR significance thresholds in any of the age group analyses.

Table 4.2.1. Jac-Map analysis results in the whole cohort (N=3703). The mask had a total of 177,451 voxels, with the number of significant voxels reported. Over 75% of the voxels are associated with age, but approximately 5% are associated with sex or age-by-sex.

Variable | 3D FDR thr | 3D N Sig Voxels | 4D FDR thr | 4D N Sig Voxels
Age | 0.039 | 138662 | 0.029 | 101703
Sex | 0.0022 | 7931 | 0.0016 | 6339
Age-by-Sex | 0.0024 | 8425 | 0.0018 | 5757

4.2.4 DISCUSSION

As in Fonov et al. (2011), we found that using different age-specific templates for processing affects the results. We found this to be true in both regional (ventricular) and voxelwise analyses. Deformations were heavily associated with age when not using age-specific templates. While voxelwise age associations were found using age-specific templates, the regions differed, confirming heterochronicity of morphological differences in the aging brain (Lemaitre et al. 2012). When pooling data across all age groups, the voxelwise effect sizes of age associations are lower with 4D TBM.
Particularly in large-scale population studies such as UKB, where older individuals exhibit more confounding comorbidities, this may prove helpful in isolating factors related to accelerated atrophy, independent of age. As an example region of interest, the ventricles are strongly associated with age in the 3D MDT model but are not in the 4D MDT model (Figure 4.2.3). Further work is needed to compare statistical effects obtained with age-specific 4D TBM and brain age (Smith et al. 2019), as these may provide complementary approaches for disentangling age from other effects on brain structure. We will extend this work by using overlapping, sliding-window-based age templates and by testing the extent to which these variations in templates may affect associations with age-related comorbidities, including cardiovascular conditions and cognitive decline.

4.2.5 ACKNOWLEDGMENTS

This research was conducted using the UK Biobank Resource under Application Number 11559. This work was supported by NIH grant R01AG059874.

CHAPTER 5 eHarmonize

This section is adapted from: Zhu, A. H., Nir, T. M., Javid, S., Villalon-Reina, J. E., Rodrigue, A. L., Strike, L. T., ... & Alzheimer’s Disease Neuroimaging Initiative. (2024). Reference curves for harmonizing multi-site regional diffusion MRI metrics across the lifespan. bioRxiv. Under revision at Scientific Data.

Age-related white matter (WM) microstructure maturation and decline occur throughout the human lifespan, complementing the process of gray matter development and degeneration. Here, we create normative lifespan reference curves for global and regional WM microstructure by harmonizing diffusion MRI (dMRI)-derived data from ten public datasets (N = 40,898 subjects; age: 3-95 years; 47.6% male). We tested three harmonization methods on regional diffusion tensor imaging (DTI) based fractional anisotropy (FA), a metric of WM microstructure, extracted using the ENIGMA-DTI pipeline.
ComBat-GAM harmonization provided multi-study trajectories most consistent with known WM maturation peaks. Lifespan FA reference curves were validated with test-retest data and used to assess the effect of the ApoE4 risk factor for dementia in WM across the lifespan. We found significant associations between ApoE4 and FA in WM regions associated with neurodegenerative disease even in healthy individuals across the lifespan, with regional age-by-genotype interactions. Our lifespan reference curves and tools to harmonize new dMRI data to the curves are publicly available as eHarmonize (https://github.com/ahzhu/eharmonize).

5.1 INTRODUCTION

Methodological variability and small sample sizes have contributed to poor reproducibility in population studies across many research fields, including neuroscience and brain imaging (Button et al. 2013; Smith and Nichols 2018). As brain MRI scanning is costly and time-intensive, large well-powered neuroimaging studies typically require pooling of data from smaller, existing studies or data collection across multiple sites. In either case, two primary sources of variance make it non-trivial to combine multi-site neuroimaging data. First, heterogeneous study design and subject inclusion criteria may yield results that are generalizable only to similar cohorts. Second, MRI scans and derived metrics vary considerably due to differences in scanner hardware and software, acquisition parameters such as spatial or angular resolution, and image processing pipelines (Jovicich et al. 2009). Even when acquisition protocols are harmonized, inter-site differences still persist (Vollmar et al. 2010). Multi-study consortia, such as the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) consortium (Paul M. Thompson et al. 2022), may account for study differences using a traditional two-stage (i.e., group-based) meta-analysis.
Meta-analyses allow individual studies to account for study-specific covariates and population substructure, but if specific traits or conditions of interest have a low prevalence, e.g., rare copy number variants in the genome, meta-analysis may not be possible as single sites may lack sufficient samples to run statistical models. When small studies are conducted, model assumptions of normally distributed sampling and known variances may be violated (Burke, Ensor, and Riley 2017), and inflated effect sizes from smaller studies may contribute to the meta-analytic results (Button et al. 2013). Meta-analyses may also be unable to distinguish between site differences and confounding biological variables that are correlated with site (Brouwer et al. 2022). Beyond this, in a lifespan study, age-dependent effects (e.g., specifically occurring in development or senescence) may be diluted using a meta-analysis approach. Brain imaging studies that analyze data from individuals across the lifespan may benefit instead from a “mega”-analysis approach, which pools data or derived imaging features for a single analysis based on all individual data points. Mega-analysis can accommodate a broader range of statistical models and better control of effect size estimation (Boedhoe et al. 2018).

Growth charts have been established for anthropometric measurements, such as height and weight, but until recently, study differences and the lack of standardization have hindered pooling datasets of sufficiently large sample sizes and age ranges to establish normative models for brain MRI measures. The emergence of large public datasets and data harmonization techniques has made it possible to create lifespan brain charts.
Initial ‘brain charts’ using structural MRI measures across the lifespan have combined multi-study data for mega-analyses with different approaches to account for site heterogeneity, including generalized additive models for location, scale, and shape (GAM-LSS) to account for non-linear trajectories of global and regional volume, surface area, and thickness measurements (Bethlehem et al. 2022), linear and non-linear hierarchical Bayes models of subcortical volume and cortical thickness (Rutherford et al. 2022; Bayer, Dinga, et al. 2022), and fractional polynomial regression applied to harmonized cortical thickness measures (Frangou et al. 2022). Ge et al. compared methods for normative modeling of brain morphometry and chose multivariate fractional polynomial regression models to create sex-specific lifespan charts for subcortical volumes and cortical surface area and thickness measures, available through CentileBrain (www.centilebrain.org) (Ge et al. 2024). ComBat is a commonly used harmonization method, initially developed to correct for batch effects in gene expression arrays (Johnson, Li, and Rabinovic 2007), but later applied to derived neuroimaging measures (Fortin et al. 2018, 2017). ComBat expanded upon previous models, implementing an empirical Bayes framework that is robust to small sample sizes. Recent extensions of ComBat include ComBat-GAM (Pomponio et al. 2020a) and CovBat (Chen et al. 2022), which account for non-linear effects of covariates on the brain measures, and adjust for site differences in data covariance, respectively. Structural imaging studies have shown that ComBat and ComBat-GAM may improve the detection of case-control group differences in schizophrenia and post-traumatic stress disorder (PTSD) cohorts compared to unharmonized mixed-effects models that model study as a random effect (Radua et al. 2020; Sun et al. 2022).
The initial application of ComBat in the neuroimaging literature was to data derived from diffusion-weighted MRI (Fortin et al. 2017). Diffusion-weighted MRI (dMRI) is sensitive to white matter (WM) microstructure and is influenced by a greater number of acquisition parameters than T1-weighted MRI, including spatial and angular resolution, diffusion times, b-value, and the number of shells (A. H. Zhu et al. 2019). In general, greater angular resolution and number of b-shells (i.e., diffusion weightings) give more stable diffusion metrics (Zhan et al. 2012; Correia, Carpenter, and Williams 2009), and smaller voxel sizes are less susceptible to partial voluming but may have more noise (Zhan et al. 2012; Papinutto, Maule, and Jovicich 2013). The initial application of ComBat to dMRI-derived data was limited to pooling together only two studies that had similar age ranges and acquisition parameters (Fortin et al. 2017). While it has since been used more widely in studies of epilepsy (Hatton et al. 2020), traumatic brain injury (Siqueira Pinto et al. 2023), and neurogenetic disorders (Villalón-Reina et al. 2020), harmonization of dMRI metrics across age ranges and acquisition protocols has not been explored in depth. Single-site dMRI studies have found lifespan trajectories of WM microstructure to differ from those of T1w-derived gray matter measures: peak WM maturation typically occurs over a decade later than corresponding peaks for gray matter metrics (P. Kochunov et al. 2012; Lebel et al. 2012), necessitating the development of dMRI-specific lifespan curves. Most prior dMRI studies have used measures extracted from the diffusion tensor imaging (DTI) model. The DTI model has been widely adopted in clinical studies to understand WM microstructure because, compared to more advanced models, it can be reliably fit to lower-resolution data, which are more convenient and faster to collect.
This widespread use is particularly advantageous when combining studies with different dMRI acquisitions.

Although harmonization approaches such as ComBat are widely used, limitations remain. ComBat regularizes site-specific harmonization parameters differently depending on sample size, and is sensitive to the input data, such that if a new cohort were to join the study, the harmonization would need to be repeated. To avoid incorrectly modeling true biological variability as a scanner-related effect, datasets must have a sufficient overlap in all covariates, e.g., age ranges. Multi-site mega-analyses often need a centralized database, which may limit participation by sites that cannot share individual level data. Harmonizing data to a template or reference that can be distributed would overcome this limitation. ComBat and its variants typically adjust input data so that the residuals of a statistical model have the same mean and variance, but none provide a reference dataset to perform normalization consistently across studies. However, care is required when choosing the reference dataset, as harmonized results may be biased toward the selected data. An ideal reference for harmonization would include multiple populations across a wide age range, such as the NiChart Reference Dataset aggregating structural MRI data for the iSTAGING consortium (Pomponio et al. 2020a), and provide a means to account for differences in MRI acquisition protocols. A normative model could be created, incorporating heterogeneous datasets to represent the different sources of variance, and used as a reference for ComBat harmonization. Here, we created a framework that harmonizes input data to built-in lifespan reference curves, allowing for distributed harmonization, easy incorporation of new datasets, and version control of updatable lifespan reference curves.
We harmonized DTI data from ten public datasets, totaling 13,297 subjects (aged 3-95), to create lifespan reference curves for white matter regions extracted with the ENIGMA-DTI pipeline (Mori et al. 2008; Jahanshad et al. 2013). The ENIGMA-DTI pipeline is a well-established pipeline based on tract-based spatial statistics (Smith et al. 2006) with standardized outputs, which has been run on data from over a hundred cohorts worldwide, in studies of over ten disorders (Peter Kochunov et al. 2022). We evaluated lifespan curves computed using ComBat, ComBat-GAM, and CovBat to determine which methods would most effectively capture the expected non-linear trends (P. Kochunov et al. 2012; Lebel et al. 2012, 2008). After creating an optimal set of lifespan reference curves, we harmonized an independent group of test datasets (2,161 subjects, aged 3-85), acquired with dMRI protocols previously unseen in our training data, to our newly established lifespan reference curves. We characterized the impact of acquisition parameters on harmonization parameters, the regional sex differences in our lifespan reference curves, and the performance of our framework on longitudinal data. Lastly, as the effect of genetic risk factors may vary across the lifespan, we evaluated our framework by assessing the effects of Apolipoprotein E (ApoE) genotype on white matter microstructure in a multi-study sample aged 3-85 (N = 30,915). The E4 allele is the major risk haplotype for late-onset Alzheimer’s disease (compared to the most common genotype, E3), whereas E2 is less common and may have a neuroprotective effect (C.-C. Liu et al. 2013). The overall framework, including the harmonization mechanism and our lifespan reference curves, is freely available in an open source Python package called ENIGMA Harmonize (eHarmonize; https://github.com/ahzhu/eharmonize).
5.2 METHODS

5.2.1 Datasets

Data for Building Reference Curves

Public dMRI datasets collected from individuals across a variety of age ranges were combined to span 3 to 95 years of age (N = 40,898 subjects; 47.6% male; Table 5.1). The pediatric and adolescent cohorts (age: 3-21) included the Pediatric Imaging, Neurocognition, and Genetics (PING) dataset (Jernigan et al. 2016); Human Connectome Project in Development (HCP-D) (Somerville et al. 2018); and the Adolescent Brain Cognitive Development (ABCD) (Hagler et al. 2019) studies. As the ABCD study contains twin and sibling data, one sibling per family was randomly selected to create an unrelated subset. Participants in the ABCD study were recruited from a very narrow age range (9-10 years old), yet at the time of analysis, two-year follow-up scans had also been released for most participants. We broadened the age range by including data from half the participants at baseline and an independent half at two-year follow-up. No additional exclusion criteria were applied to the PING and HCP-D datasets for this work.

Table 5.1. Study demographics and image acquisition parameters of each dataset. In multi-shell acquisitions, the shells used for DTI processing are indicated in bold. * Original acquisition dimensions. GE automatically zero-pads in k-space by default, so processing was performed on resampled data. ‡ QTIM participants were scanned twice, once with each protocol. ‡‡ Non-zero b-value shells were processed separately, for comparison.
Study/Dataset | Exclusion Criteria | Age Range (years) | Available N (% Male) | Sampled N (% Male) | Country | Manufacturer (Field Strength) | b-values (volumes) | Resolution (mm³)
PING | - | 3-21 | 906 (52.2) | 902 (52.1) | United States | Siemens (3T); GE (3T); Philips (3T) | 0 (4), 1000 (32); 0 (2), 1000 (30); 0 (3), 1000 (30) | 2.5 x 2.5 x 2.5; 2.5 x 2.5 x 2.5*; 2.5 x 2.5 x 2.5
HCP-D | - | 8-21 | 617 (48.5) | 617 (48.5) | United States | Siemens (3T) | 0 (28), 1500 (93), 3000 (92) | 1.5 x 1.5 x 1.5
ABCD | One sibling per family | 9-13 | 5611 (52.7) | 5000 (52.8) | United States | Siemens (3T) | 0 (7), 500 (6), 1000 (15), 2000 (15), 3000 (60) | 1.7 x 1.7 x 1.7
SLIM | - | 17-26 | 432 (45.1) | 432 (45.1) | China | Siemens (3T) | 0 (1), 1000 (30) | 2.0 x 2.0 x 2.0
HCP | One sibling per family | 22-36 | 363 (45.5) | 363 (45.5) | United States | Siemens (3T) | 0 (18), 1000 (90), 2000 (90), 3000 (90) | 1.25 x 1.25 x 1.25
CamCAN | - | 18-87 | 305 (35.1) | 305 (35.1) | United Kingdom | Siemens (3T) | 0 (3), 1000 (30), 2000 (30) | 2.0 x 2.0 x 2.0
PPMI | Parkinson’s disease | 30-81 | 83 (63.9) | 83 (63.9) | International | Siemens (3T) | 0 (1), 1000 (64) | 1.98 x 1.98 x 2.0
UK Biobank | ICD-10 codes: C70-C72, G, I6, Q0, R90, R940, S01-S09; self-report conditions: 1081, 1086, 1266, 1267, 1397, 1434, 1437, 1491, 1626 | 45-82 | 31,986 (46.8) | 5000 (47.0) | United Kingdom | Siemens (3T) | 0 (8), 1000 (50), 2000 (50) | 2.0 x 2.0 x 2.0
OASIS3 | Cognitive impairment | 42-92 | 225 (48.4) | 225 (48.4) | United States | Siemens (3T) | 0 (1), 1000 (64) | 2.5 x 2.5 x 2.5
ADNI3 | Cognitive impairment | 55-95 | 370 (41.2) | 370 (41.2) | United States, Canada | Siemens (3T); Siemens (3T); GE (3T); GE (3T); Philips (3T); Philips (3T) | 0 (1), 1000 (30); 0 (7), 1000 (48); 0 (4), 1000 (32); 0 (6), 1000 (48); 0 (1), 1000 (32); 0 (4), 1000 (32) | 2.0 x 2.0 x 2.0; 2.0 x 2.0 x 2.0; 2.0 x 2.0 x 2.0*; 2.0 x 2.0 x 2.0*; 2.0 x 2.0 x 2.0; 2.0 x 2.0 x 2.0
NIH Pediatric | Age < 3 years | 3-21 | 122 (44.3) | - | United States | Siemens (1.5T); GE (1.5T) | 0 (10), 100 (10), 300 (10), 500 (10), 800 (30), 1100 (50); 0 (9), 100 (10), 300 (10), 500 (10), 800 (30), 1100 (50) | 2.5 x 2.5 x 2.5; 2.5 x 2.5 x 2.5
TAOS | One sibling per family | 12-16 | 232 (49.6) | - | United States | Siemens (3T) | 0 (3), 700 (55) | 1.7 x 1.7 x 3
GOBS | Kinship > 0.125 | 23-78 | 189 (37.0) | - | United States | Siemens (3T) | 0 (3), 700 (55) | 1.7 x 1.7 x 3
QTIMd30‡ | One twin per family | 18-29 | 317 (35.6) | - | Australia | Bruker (4T) | 0 (3), 1159 (27) | 1.8 x 1.8 x 5.5
QTIMd105‡ | One twin per family | 18-29 | 317 (35.6) | - | Australia | Bruker (4T) | 0 (11), 1159 (94) | 1.8 x 1.8 x 2
HUNT | Multiple sclerosis, cysts, meningioma, infarction, post-op changes | 50-67 | 811 (46.7) | - | Norway | GE (1.5T) | 0 (5), 1000 (40) | 2.5 x 2.5 x 2.5*
ADNI2 | Cognitive impairment | 60-85 | 42 (40.5) | - | United States, Canada | GE (3T) | 0 (5), 1000 (41) | 2.7 x 2.7 x 2.7
ADNI3-S127‡‡ | Cognitive impairment | 61-85 | 64 (34.4) | - | United States | Siemens (3T) | 0 (13), 1000 (48), 2000 (60) | 2 x 2 x 2

The young adult cohorts (age: 17-36) included baseline data from the Southwest University Longitudinal Imaging Multimodal (SLIM) brain data repository (Qiu 2016; Y. Wang et al. 2014) and Human Connectome Project (HCP) (Van Essen et al. 2013). Each study acquired MRI data on a single scanner. As with ABCD, HCP is family-based, so only one sibling per family was selected. The Cam-CAN dataset included younger, midlife, and older adults (18 to 87 years) (Shafto et al. 2014). Mid-adulthood to older adults (age: 30-95) were included from Tier 1 of the Parkinson’s Progression Markers Initiative (PPMI) (Parkinson Progression Marker Initiative 2011), Open Access Series of Imaging Studies (OASIS3) dataset (LaMontagne et al. 2019), the population-based UK Biobank study (Miller et al. 2016), and a subset of the third phase of the Alzheimer’s Disease Neuroimaging Initiative with single-shell diffusion imaging acquisition (ADNI3) (Weiner et al. 2017). In the PPMI, OASIS3, and ADNI3 studies, participants with a diagnosis of Parkinson’s disease, mild cognitive impairment, or dementia were excluded from the reference training cohort.
Exclusion criteria applied to the UK Biobank included neurological and cerebrovascular disorders as well as incidental findings on brain MRI and head injuries.

Evaluation Datasets

Seven datasets were held out to evaluate the reference curves made from the other datasets, as detailed in the Reference Curve Evaluation section (Table 5.1). Children scanned as part of the NIH Pediatric MRI study with the extended diffusion MRI protocol were included as the pediatric test cohort (Walker et al. 2016). Children younger than 3 years of age were excluded. Brain MRIs from the Teen Alcohol Outcomes Study (TAOS) and the Genetics of Brain Structure and Function (GOBS) study were acquired on the same scanner with the same acquisition protocol (Peter Kochunov et al. 2011). As a result, they were combined to provide a lifespan dataset including both children and adults. One child per family was selected for the TAOS dataset. As the GOBS study is composed of extensive pedigrees, selecting one member per family would have decreased the sample size considerably. Instead, a kinship filter was implemented, where only individuals with less than a first-degree cousin relationship were included (kinship < 0.125). Older adults from the Trøndelag Health Study (HUNT) were also included (Åsvold et al. 2023; Eikenes et al. 2023).

To further evaluate our template in handling test data with differences in scan parameters, we included two datasets for which we calculated DTI metrics for the same individuals in different ways. The first dataset, the Queensland Twin IMaging (QTIM) study (de Zubicaray et al. 2008), included individuals scanned with two protocols of differing angular and spatial resolution (Jahanshad, Kohannim, Toga, et al. 2012). The second dataset was the subset of multi-shell diffusion MRI data from ADNI3, for which we calculated DTI metrics with both b-shells.
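The one-member-per-family rule applied to the family-based cohorts can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the `(subject_id, family_id)` record layout, the toy records, and the fixed seed are hypothetical. The kinship-based filter applied to GOBS needs a pairwise kinship matrix and is not shown.

```python
import random
from collections import defaultdict


def one_per_family(records, seed=0):
    """Randomly keep a single subject from each family.

    records: iterable of (subject_id, family_id) pairs (hypothetical layout).
    Returns the sorted list of retained subject ids.
    """
    rng = random.Random(seed)
    families = defaultdict(list)
    for subject_id, family_id in records:
        families[family_id].append(subject_id)
    # Iterate over sorted families and sorted members so that the draw
    # depends only on the seed, not on the order of the input records.
    kept = [rng.choice(sorted(families[fam])) for fam in sorted(families)]
    return sorted(kept)


# Toy records for illustration only.
toy_records = [("s1", "famA"), ("s2", "famA"), ("s3", "famB"),
               ("s4", "famC"), ("s5", "famC"), ("s6", "famC")]
unrelated = one_per_family(toy_records)
```

Fixing the seed makes the unrelated subset reproducible, which matters when the same subset has to be regenerated for later analyses.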
5.2.2 Image processing

A summary of dMRI acquisition parameters for template building and evaluation datasets can be found in Table 5.1. In the training datasets with a multi-shell acquisition, the most appropriate shell for diffusion tensor imaging (DTI) was chosen for processing, usually b = 1000 s/mm² (Soares et al. 2013; Kingsley and Monahan 2004). As with many large-scale multi-study initiatives in ENIGMA, preprocessing steps varied across studies as datasets were collected and processed over the past 15 years. Preprocessing followed guidelines provided by the ENIGMA-DTI team (on GitHub), which include steps such as EPI distortion and eddy current distortion correction. FSL’s dtifit was used to calculate fractional anisotropy (FA) and diffusivity maps. As ADNI3-S127 dMRI acquisitions were multi-shell, dtifit was run on volumes from the b=1000 and b=2000 s/mm² shells independently, for comparison. Images were then processed through the ENIGMA-DTI pipeline (Jahanshad et al. 2013). The FA maps were warped to the ENIGMA template and then skeletonized using tract-based spatial statistics (Smith et al. 2006). The mean FA value was extracted from the full skeleton as well as each region of interest (ROI) defined by the JHU atlas (Mori et al. 2008). Bilateral ROIs were combined by averaging the measures across hemispheres, and some subregion ROIs were combined to create a single measure for the entire structure, e.g., the three parts of the corpus callosum. In both cases, the combined average was weighted by the number of voxels in each ROI. A lifespan reference curve was created for all measures, lateralized and combined. In subsequent analyses, we report only the results for the combined measures (25 ROIs; Table 5.2).

Table 5.2. The age at which regional white matter skeleton FA trajectories peaked fell mostly below 20 years of age in the ComBat and CovBat reference curves.
In the ComBat-GAM reference, the peak ages largely fell between the ages of 20 and 40 years old, which Kochunov and Lebel had both previously reported. One exception is the SCC, which had a peak age of 68 across methods. The CST from the JHU atlas has shown low reliability, so the results for this ROI should be interpreted with caution (Acheson et al. 2017). The tapetum and uncinate fasciculus reported here reflect the current JHU atlas labels rather than the ENIGMA-DTI lookup table, which was based on a previous JHU atlas version.

5.2.3 Quality Control

We limited the ABCD subset in our training sample to include only data from Siemens scanners, after preliminary quality assurance as detailed in Supplementary Figure S5.1. In the PPMI dataset, where controls could have prodromal Parkinson’s disease, dMRI data underwent visual quality control (QC), and those with movement artifacts were excluded. Statistical QC was also performed by removing subjects who had any ROI metric fall outside of 5 standard deviations of the site mean.

Figure 5.1. (A) After harmonization using different methods, average FA in the full WM skeleton is plotted against age and colored by study. Age-binned boxplots of (B) unharmonized data and (C) data harmonized using ComBat-GAM show that median global FA values were quite different between protocols pre-harmonization and were more similar post-harmonization.

5.2.4 Harmonization Methods

Regional FA metrics from the training datasets were harmonized using three approaches: ComBat, ComBat-GAM, and CovBat (Johnson, Li, and Rabinovic 2007; Pomponio et al. 2020a; Chen et al. 2022). ComBat and ComBat-GAM were run using the neuroHarmonize package in Python (https://github.com/rpomponio/neuroHarmonize). CovBat was run using the R package of the same name (https://github.com/andy1764/CovBat_Harmonization). Across all methods, age and sex were used as covariates, with the dataset modeled as the batch effect.
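To make the role of the batch variable concrete, the following sketch shows a simplified per-site location and scale adjustment for a single feature. It is an illustration only and is not ComBat: it assumes no covariates and applies no empirical Bayes shrinkage, and the function name, toy FA values, and site labels are hypothetical. Each site's additive offset and multiplicative scale are estimated directly and removed, so that all sites share the pooled mean and spread.

```python
from statistics import mean, pstdev


def location_scale_adjust(values, sites):
    """Align each site's mean and spread to the pooled mean and spread.

    values: list of floats for one feature (e.g., mean FA in one ROI).
    sites:  list of site labels, same length as values.
    Simplifying assumptions: no covariates and no empirical Bayes
    shrinkage; every site needs a non-zero standard deviation.
    """
    grand_mean = mean(values)
    by_site = {}
    for v, s in zip(values, sites):
        by_site.setdefault(s, []).append(v)
    # Pooled within-site variance, weighted by the number of subjects per site.
    pooled_var = sum(len(v) * pstdev(v) ** 2 for v in by_site.values()) / len(values)
    pooled_sd = pooled_var ** 0.5
    adjusted = []
    for v, s in zip(values, sites):
        gamma = mean(by_site[s]) - grand_mean   # additive site offset
        delta = pstdev(by_site[s]) / pooled_sd  # multiplicative site effect
        adjusted.append((v - grand_mean - gamma) / delta + grand_mean)
    return adjusted


# Toy FA values from two hypothetical sites with different means and spreads.
toy_fa = [0.40, 0.42, 0.44, 0.50, 0.54, 0.58]
toy_sites = ["A", "A", "A", "B", "B", "B"]
harmonized = location_scale_adjust(toy_fa, toy_sites)
```

After the adjustment, both toy sites have the same mean and standard deviation. In practice, the covariate terms (age and sex here) are modeled first so that biological differences between cohorts are not removed along with the site effect.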
As the outputs of the ComBat family harmonization methods are weighted by the sample size of each study, harmonization was run using iterative subsampling of all datasets to provide more equal weighting across datasets (25 iterations; 200 subjects per dataset where available; 13,297 subjects in total) as the UK Biobank greatly outnumbered the other datasets. The larger datasets (UK Biobank and ABCD) were sampled without replacement, while the smaller datasets were sampled with replacement.

ComBat (Johnson, Li, and Rabinovic 2007): ComBat is a location and scale adjustment model where, for site $i$ and individual $j$, it assumes that the feature measurements are modeled as a linear combination of site effects and non-site effects, written as:

$$y_{ijv} = \alpha_v + X_{ij}^{T}\beta_v + \gamma_{iv} + \delta_{iv}\, e_{ijv}$$

where $\alpha_v$ is the overall mean per feature $v$, $\beta_v$ is the vector of corresponding coefficients to the covariate matrix $X$, $\gamma_{iv}$ is an offset from the grand mean per site $i$ and feature $v$, $e_{ijv}$ is the residual vector, and $\delta_{iv}$ is the multiplicative site effect of site $i$ on feature $v$. ComBat removes the additive and multiplicative effects from the residuals using

$$e_{ijv}^{\mathrm{ComBat}} = \frac{y_{ijv} - \hat{\alpha}_v - X_{ij}^{T}\hat{\beta}_v - \gamma_{iv}^{*}}{\delta_{iv}^{*}}$$

where $\gamma_{iv}^{*}$ and $\delta_{iv}^{*}$ are estimated using an empirical Bayes framework. The harmonized data are then calculated by

$$y_{ijv}^{\mathrm{ComBat}} = e_{ijv}^{\mathrm{ComBat}} + \hat{\alpha}_v + X_{ij}^{T}\hat{\beta}_v$$

where $\hat{\alpha}_v$ and $\hat{\beta}_v$ are the respective parameter estimates.

ComBat-GAM (Pomponio et al. 2020b): ComBat-GAM extends ComBat by using generalized additive models (GAM; Hastie and Tibshirani 1986) to model non-linear covariate effects. Modeling of the nonlinear covariates is achieved by placing one or more covariates within the function $f(x)$:

$$y_{ijv}^{\mathrm{ComBat\text{-}GAM}} = \frac{y_{ijv} - f(X_{ijv}) - \gamma_{iv}^{*}}{\delta_{iv}^{*}} + f(X_{ijv})$$

where $f(X_{ijv})$ is a smooth function over the covariates (Bayer, Thompson, et al. 2022).

CovBat (Chen et al. 2022): CovBat was proposed to harmonize both mean and covariance batch effects in data for multivariate pattern analysis.
First, CovBat applies ComBat to normalize the mean and variance of the residuals in a statistical model for $p$ features. The ComBat-adjusted residuals, $e_{ijv}^{\mathrm{ComBat}}$, are calculated for each feature independently, but may still retain site-specific covariance. Thus, in the second step, principal component analysis is performed on the residuals of the full dataset to identify covariance patterns and reduce the number of dimensions. The ComBat-adjusted residuals can now be expressed as

$$e_{ij}^{\mathrm{ComBat}} = \sum_{k=1}^{q} \xi_{ijk}\,\hat{\phi}_k$$

where $\hat{\phi}_k$ are the estimated principal components obtained as the eigenvectors of the full-data covariance matrix, $\xi_{ijk}$ are the principal component scores, and $q$ is the number of orthogonal axes. Treating the principal component scores analogously to those of the original features, site-specific covariance can be removed via

$$\xi_{ijk}^{\mathrm{CovBat}} = \frac{\xi_{ijk} - \hat{\mu}_{ik}}{\hat{\rho}_{ik}}$$

where $\hat{\mu}_{ik}$ and $\hat{\rho}_{ik}$ are the site-specific center and scale parameters, respectively, of each principal component. Finally, the CovBat-adjusted residuals are obtained by projecting the adjusted scores into residual space via

$$e_{ij}^{\mathrm{CovBat}} = \sum_{k=1}^{K} \xi_{ijk}^{\mathrm{CovBat}}\,\hat{\phi}_k + \sum_{l=K+1}^{q} \xi_{ijl}\,\hat{\phi}_l$$

where $K$ is the number of principal components chosen to capture the user-specified percent variation. Then the final CovBat-adjusted observations are obtained by adding the intercepts and covariates estimated in the first step using ComBat:

$$y_{ijv}^{\mathrm{CovBat}} = e_{ijv}^{\mathrm{CovBat}} + \hat{\alpha}_v + X_{ij}^{T}\hat{\beta}_v$$

5.2.5 Creating the Reference Curves

After the training datasets were subsampled and harmonized using the three ComBat methods, the outputs of each method were combined for evaluation. To create reference curves, GAM models were fit for each ROI and harmonization method, covarying for sex and smoothing across age using the mgcv package in R (basis function: 10 cubic splines). To evaluate the harmonization methods, we extracted the peak age, i.e., the age of maximum FA, from each region that exhibited a concave down trajectory.
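The peak-age extraction described above can be illustrated with a minimal Python sketch. It assumes the fitted reference curve has already been evaluated on a grid of ages (in this chapter the fitting was done with mgcv in R, which is not reproduced here), and it treats "an interior maximum" as a simple proxy for a concave down trajectory. The function name `peak_age` and the toy curve are hypothetical and serve only as an illustration.

```python
def peak_age(ages, fitted_fa):
    """Return the age at which the fitted FA curve is maximal.

    Returns None when the maximum sits at either end of the age grid,
    which is used here as a simple (assumed) proxy for a trajectory
    that is not concave down within the sampled age range.
    """
    if len(ages) != len(fitted_fa) or len(ages) < 3:
        raise ValueError("ages and fitted_fa must have equal length >= 3")
    # Index of the largest fitted value on the grid
    i_max = max(range(len(fitted_fa)), key=fitted_fa.__getitem__)
    if i_max == 0 or i_max == len(ages) - 1:
        return None
    return ages[i_max]


# Toy curves only (not data from this study)
ages = list(range(3, 96))
toy_curve = [0.5 - 0.0001 * (a - 30) ** 2 for a in ages]
rising_curve = [0.3 + 0.001 * a for a in ages]

print(peak_age(ages, toy_curve))
print(peak_age(ages, rising_curve))
```

A finer age grid gives a more precise peak estimate; the grid resolution used in the study is not stated, so the yearly grid here is an assumption.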
We compared our harmonized curves to previous single-site studies of the white matter microstructure that reported FA to peak between the ages of 20 and 40 years old (P. Kochunov et al. 2012; Lebel et al. 2012). We used these as our "silver standard", i.e., a standard established by in vivo models rather than histological studies.

Once the optimal harmonization method was selected, GAM models were fit to each ROI, covarying for sex and smoothing across age. We used the qgam package in R (basis function: 10 cubic splines) to fit quantile regression GAM models for each centile (0.1-0.99) to generate normative lifespan reference curves for each sex.

The ComBat-GAM Python package (neuroHarmonize) does not allow the user to specify a reference site, so we adapted the code to allow for that functionality (available through a neuroHarmonize fork: https://github.com/ahzhu/neuroharmonize). Evaluation datasets were harmonized to the lifespan reference curves, i.e., the location and scale parameters ($\gamma_{iv}^{*}$ and $\delta_{iv}^{*}$, respectively) of each site were calculated in relation to the mean and variance of the lifespan reference curves.

5.2.6 Reference Curve Evaluation

Characterizing the Reference Curves

In addition to age-at-peak comparisons, we also characterized the trajectories of our regional reference curves compared to that of the global FA. We calculated the Fréchet distance (FD), a curve similarity metric, between the global FA reference curve and that of each ROI.

We tested for sex differences in bilaterally averaged ROIs. GAM models were fitted to model sex effects on regional FA while covarying for age as a smoothed term (basis function: ten cubic splines). We also tested for smoothed age-by-sex interactions. Multiple comparisons were corrected for using the false discovery rate method (FDR; 25 tests) (Benjamini and Hochberg 1995).

Train vs. Test Datasets

Using the adapted ComBat-GAM code, all available data from the training datasets, as well as the seven held-out datasets, were harmonized to the newly created lifespan reference curves. Using each subject's age and sex, the differences between reference-predicted and site-harmonized values were calculated, and the mean absolute error (MAE) was calculated for each protocol and ROI. For each ROI, an unpaired t-test was used to compare the performance of our harmonization framework on training vs. test datasets (FDR-corrected; 25 tests).

Acquisition Effects on Model Parameters

Spatial resolution, number of diffusion directions, and choice of b-value are all known to affect DTI values. We tested for correlations between voxel volume and the ComBat output scale and shift parameters from both training and test datasets. We also analyzed the angular resolution and number of volumes (b0 and b-shell volumes combined); other acquisition parameters, such as b-value and scanner manufacturer, were highly homogeneous across studies, so we were not able to determine their effects on model parameters.

Longitudinal Studies

To determine how our framework would perform for longitudinal studies, we harmonized longitudinal data in two ways: first, we calculated the scale and shift parameters from the baseline data and applied them to the follow-up data; second, we calculated the scale and shift parameters from the baseline and follow-up data separately. We then ran mixed-effects models to determine how the different methods impacted the modeled age effects on each ROI, covarying for sex, age-by-sex interaction, and age², and including subject ID as a random effect. The mixed-effects models were also run on the unharmonized data for comparison. We used data from the UK Biobank, which has a subset of subjects with follow-up imaging visits acquired approximately two years after baseline (N = 1,384, baseline ages 47-80 years, mean time interval: 2.25 years).
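The first longitudinal strategy (estimate shift and scale at baseline, then reuse them at follow-up) can be sketched as below. This is a deliberately simplified location-scale adjustment against a reference curve, not the eHarmonize or neuroHarmonize implementation: it omits the empirical Bayes step, and it assumes the reference-predicted FA for each subject and the reference residual standard deviation are already available. All function names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def fit_shift_scale(values, reference_pred, reference_sd):
    """Estimate a site's shift (gamma) and scale (delta) relative to a reference.

    values:         observed FA for one site and one ROI
    reference_pred: reference-curve FA predicted for each subject's age and sex
    reference_sd:   residual standard deviation of the reference (assumed known)
    """
    residuals = [y - r for y, r in zip(values, reference_pred)]
    gamma = mean(residuals)                    # additive site effect
    delta = pstdev(residuals) / reference_sd   # multiplicative site effect
    if delta == 0:
        raise ValueError("site residuals have zero variance")
    return gamma, delta


def apply_shift_scale(values, reference_pred, gamma, delta):
    """Remove a previously estimated shift and scale from new observations."""
    return [(y - r - gamma) / delta + r for y, r in zip(values, reference_pred)]


# Toy example: a site that reads 0.02 too high with doubled spread
noise = [-0.01, 0.0, 0.01]
reference_sd = pstdev(noise)
baseline_ref = [0.45, 0.44, 0.43]
followup_ref = [0.449, 0.439, 0.429]
baseline = [r + 0.02 + 2 * n for r, n in zip(baseline_ref, noise)]
followup = [r + 0.02 + 2 * n for r, n in zip(followup_ref, noise)]

gamma, delta = fit_shift_scale(baseline, baseline_ref, reference_sd)
adjusted = apply_shift_scale(followup, followup_ref, gamma, delta)
```

The second strategy described above would simply call `fit_shift_scale` once per time point instead of reusing the baseline estimates.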
Case Studies

To examine the outcome of case-control analyses after harmonization, we chose to analyze the effects of ApoE across the lifespan, as the data was available in most datasets and requires no harmonization of its own. Genetic data was collected in seven of the ten training datasets and two of the evaluation datasets (Table 5.3). From datasets that released full genetic data, the ApoE SNPs rs7412 and rs429358 were extracted for analysis. Other datasets focusing on aging populations had only made ApoE genotypes available, and the provided data was used for this study. As genome-wide data was not available for all studies, subjects were filtered for European ancestry using self-provided race or ethnicity information. Only healthy controls as defined in Table 5.1 were included.

Analyses were run separately for E2 and E4 allele counts, each using E3E3 homozygotes as controls and excluding carriers of the other allele. Linear regressions tested for effects of E2 or E4 count on harmonized regional FA, adjusting for age, sex, age-by-sex, and age², with multiple comparisons corrected using FDR (25 tests). In secondary analyses, ApoE-by-age interactions were tested in nominally significant ROIs.

Table 5.3. Demographics and ApoE4 information for the datasets included in the regression models. The included subjects are healthy controls filtered for European ancestry.

Two test sets, ADNI3-S127 and QTIM, had data available from different dMRI protocols on the same subjects, as described above. For both datasets, we ran the ApoE regressions pre- and post-harmonization with both sets of FA metrics. We then compared the effect sizes to determine how much of an impact the differences in protocol made on the statistical outputs, and if they were different, to determine if harmonization would result in a convergence of the results.
In the QTIM study, we ran the ApoE4 analyses in both the low spatial and angular resolution scans and the high spatial and angular resolution scans. In the ADNI3-S127 dataset, we ran the same models in the different diffusion shells: b = 1000 vs. b = 2000 s/mm².

5.2.7 eHarmonize

Combining our modified ComBat-GAM code with the lifespan reference curves, we created the eHarmonize Python package (https://github.com/ahzhu/eharmonize), which comes equipped with command line tools to read in FA measures and harmonize them to the included centile reference curves while taking age and sex into account (Figure 5.6). The eHarmonize command line interface was written to harmonize data from a new site to the built-in lifespan reference curves and apply an existing harmonization model to new data from a known site. To account for dMRI acquisitions that do not cover a full field-of-view, eHarmonize detects which subset of ROIs are provided before calculating harmonization model parameters, as the underlying neuroHarmonize does not handle missing data.

5.3 RESULTS

5.3.1 Harmonization methods

The lifespan full WM skeleton FA references that were created are shown in Figure 5.1. Qualitatively, study FA trajectories across age were better aligned after harmonization, regardless of method. The performance of ComBat and CovBat appeared similar. The peak ages for white matter FA in these references were mostly before twenty years of age (Table 5.2). In the pediatric datasets, the steep increase in FA with age was maintained, but due to the larger age range and number of adult datasets, the linear model resulted in larger harmonized FA values than the outputs from the GAM model. The peak age of white matter FA in almost all of the ComBat-GAM references was between 20 and 40 years old, matching the expected values (P. Kochunov et al. 2012; Lebel et al. 2012). As a result, we used the ComBat-GAM harmonized data to create the lifespan reference curves.
5.3.2 The Lifespan Reference

As our harmonization was conducted with iterative sampling, we were able to plot a prospective reference for each iteration (Figure 5.2A). The iterations produced largely consistent results, although the sparse sampling of older subjects (age > 85 years old) resulted in larger confidence intervals at the older ages. Using the outputs from all iterations, we created one set of centile reference curves per sex, covering much of the lifespan (3-95 years) (Figure 5.2B).

Characterizing the Reference Curves

With the exception of the splenium of the corpus callosum (SCC), the age peaks of all lifespan reference curves fell between 20 and 40 years of age. Rather than peaking in the 20-40 age range, the SCC plateaus in that age range before rising again to its later age peak. In comparison to the global FA reference curve, most ROIs had lifespan trajectories with a high curve similarity (FD between 0.005 and 0.04). The exception was the fornix (FX), which had a Fréchet distance of 0.14 and a steeper slope of decline after the peak. Lifespan trajectories for SCC and FX can be found in Supplementary Figure S5.2.

Higher FA was found in females as compared to males in six ROIs: the fornix, the fornix/stria terminalis, posterior corona radiata, posterior thalamic radiation, sagittal stratum, and the tapetum. No significant sex differences were found in the body, splenium, or whole corpus callosum. The remaining ROIs showed higher FA in males compared to females. Effect sizes for all FA measures are reported in Supplementary Table S5.1. We found no significant age-by-sex interactions.

Figure 5.2. (A) Iteration-specific reference curves of the global FA measure as created by iterative subsampling of ∼200 participants from each study and ComBat-GAM harmonization are displayed (25 iterations; mean in black). (B) Sex-specific centile curves derived from the results of iterative subsampling harmonization make up the final lifespan reference curve.
(C) After applying our framework (eHarmonize) to held-out evaluation datasets, the harmonized datasets fall in line with the global FA lifespan reference curve.

5.3.3 Post-Harmonization Analyses

Train vs. Test Datasets

We harmonized the FA values of all training and test datasets to the newly created lifespan reference curves (Figure 5.2C). The MAE comparison between training and test datasets found no significant differences across ROIs (p > 0.09).

Acquisition Effects on Model Parameters

We extracted the scale and shift parameters for all protocols. Of the tested acquisition parameters, voxel size showed significantly negative correlations with the shift parameter (γ*; -0.57 < r < -0.27) but no correlation with scale (δ*) across most ROIs (Figure 5.3). Neither the number of directions nor the number of volumes showed a significant impact on either the shift or scale parameters.

Figure 5.3. For most ROIs, the shift parameter extracted from the ComBat-GAM model was significantly correlated with voxel volume. The negative correlation between the global FA shift parameter and voxel volume is shown here (r = -0.60; p < 0.001).

Case Studies

Nine datasets had ApoE data available for analysis. In total, 26,902 subjects (3 to 85 years of age) were included in the E4 analyses (Table 5.3) and 22,760 subjects (3 to 85 years of age) in the E2 analyses. No significant associations were found between E2 count and regional FA. In E4 carriers, significantly lower FA was found in the hippocampal cingulum (CGH; β = -0.027, p = 4.1 × 10⁻⁶), posterior thalamic radiation (PTR; β = -0.022, p = 1.6 × 10⁻⁴), overall skeleton (β = -0.020, p = 6.2 × 10⁻⁴), and splenium of the corpus callosum (SCC; β = -0.016, p = 6.8 × 10⁻³). Effect sizes for all ROIs may be found in Supplementary Table S5.2. Nominally significant ROIs included the sagittal stratum, overall corpus callosum, genu of the corpus callosum, retrolenticular part of the internal capsule, and the fornix (crus)/stria terminalis (FX/ST).
Secondary analyses in all ROIs passing the nominal significance threshold showed a significant age-by-ApoE4 interaction in the FX/ST (β = -7.7 × 10⁻⁴; p = 0.014).

In datasets with multiple protocols, a comparison of regional ApoE4 standardized beta estimates from regression models run pre-harmonization found similar results between the ADNI3-S127 protocols differing only in b-value (rβ = 0.97) and less similar results between the QTIM protocols differing in both spatial and angular resolution (rβ = 0.60). In the ADNI3-S127 dataset (N = 56), the CGH was found to be nominally significant in both protocols (b = 1000: β = -0.30, p = 0.016; b = 2000: β = -0.30, p = 0.019), but in the posterior corona radiata, a nominally significant result was only found in the b = 2000 dataset (b = 1000: β = -0.26, p = 0.060; b = 2000: β = -0.28, p = 0.046). There were no significant associations in the QTIM dataset (N = 316). After harmonization, the ApoE4 standardized betas of all individual datasets and protocols remained almost identical, indicating that harmonization does not change individual dataset findings (Figure 5.4).

Longitudinal Studies

In the UK Biobank, mixed-effects models showed negligible differences in the associations between age and regional FA when comparing regressions run on raw unharmonized data with those run on data harmonized by baseline parameters; the correlation of the age effects between methods, calculated across ROIs, was approximately r ∼ 1.0 (Figure 5.5). When baseline and follow-up data were harmonized independently, the age effect correlation with the raw unharmonized models was 0.92.

Figure 5.4. Age associations, residualized by sex, age-by-sex, and age², are plotted for the (A) CGH and (B) FXST. In the CGH, E4 carriers had significantly lower FA compared to their E3E3 counterparts. In the FXST, the FA of E4 carriers was higher at younger ages but, after approximately age 55 years, dropped below that of non-carriers.
A comparison of protocols in the (C) ADNI3-S127 and (D) QTIM datasets showed that harmonization does not converge the results of the same subjects acquired with different protocols. Each scatter point reflects the association (standardized beta) between ApoE4 and an ROI, corrected for age, sex, age-by-sex, and age².

Figure 5.5. Standardized betas for age associations with each WM ROI either pre-harmonization (x-axis) or post-harmonization (y-axis). Harmonization was either performed by (A) applying baseline parameters to follow-up data, or (B) modeling the ComBat parameters for each time point separately.

eHarmonize

Our package is set up to be adaptable. In addition to the lifespan reference curves, a JSON file is included containing meta information about the reference curves (e.g., version number, datasets used in their creation). This feature allows for updated references to be implemented in the future while preserving previous versions for ongoing studies to maintain consistency. References for other diffusion measures, or measures from any other modality (imaging or non-imaging), can easily be implemented by adding the appropriate information to the meta JSON.

The outputs of eHarmonize include the harmonized data, model parameters, and a text log file for provenance, reflecting a timestamp of when the tool was run and by whom, and the reference and ROIs that were used. If a study includes cases and controls, eHarmonize will harmonize the measures based on the controls and then apply the model to the cases, as is done in the neuroHarmonize package. A QC image per ROI is also output showing a line plot of the study data vs. age before and after harmonization, with the reference in the background (Figure 5.6C).

Figure 5.6. (A) The eHarmonize command line interface comes with two subcommands: harmonize-fa for harmonizing data from a new site, and apply-harmonization for applying an existing harmonization model to a known site.
(B) The ENIGMA-DTI template with the skeleton and ROIs overlaid. (C) QC output showing data before and after harmonization in relation to the reference curve, shown in gray.

5.4 DISCUSSION

In this work, we combined dMRI data from ten public datasets to create lifespan reference curves for global and regional white matter (WM) fractional anisotropy (FA). We found that ComBat-GAM best matched the previously reported non-linear age trends across the lifespan and expected age peaks seen in single cohort studies of WM development. Across most regions, our reference curves show a steep increase in FA during development, peaking between the ages of 20 and 40, followed by a continuous and gradual decrease for the rest of the lifespan.

One notable exception was the splenium of the corpus callosum (SCC). While its FA also increases steeply during development, it plateaus in the early 20s and then peaks later at 68 years. Previous studies have found the FA of the SCC to be relatively stable with age in adulthood (Gunning-Dixon et al. 2009), possibly due to posterior-to-anterior development and anterior-to-posterior aging trends (Krogsrud et al. 2016; Pietrasik et al. 2020). The widely diverging projections of the splenium may account for the late peak, as the loss of diverging fibers increases FA, followed by an overall decline (Friedrich et al. 2020). Another outlier region is the fornix, which has the same overall trends but a much sharper decline. Given its location, this likely reflects greater misregistration and partial voluming associated with age-related atrophy and nearby ventricular expansion (Metzler-Baddeley et al. 2012).

We found sex differences across the WM skeleton across the lifespan. Most regions showed higher FA in males than in females, and the opposite effect was found in six ROIs. Prior studies of sex differences in white matter microstructure have also reported regional variation (Eikenes et al. 2023; Lawrence et al.
2023; López-Vicente et al. 2021). These sex effects may be affected by covariates, such as intracranial volume (ICV). In a study using the HUNT dataset, analyses without an ICV covariate found regionally varying sex effects, but after including ICV, only females had regions with significantly higher FA (Eikenes et al. 2023). We did not incorporate ICV as a covariate in our harmonization framework as it is not an output of the ENIGMA-DTI pipeline and is generally estimated from T1-weighted MRI, as opposed to dMRI.

For longitudinal studies, we showed that harmonization parameters modeled on baseline data can be applied to follow-up data, or parameters can be modeled independently at each timepoint. Results from our comparisons to unharmonized data were largely consistent between the two methods. In the UK Biobank, which we used for the analyses, the time interval between visits was much smaller than the age range of the study population. In some studies, the follow-up age range may fall outside the modeled age range at baseline, and a separate follow-up model may be advisable. This may be particularly advantageous for data in the non-linear ranges of the lifespan reference curves.

One major benefit of a lifespan dMRI reference is that it can be used to study subtle effects, such as genetic influences, on brain WM microstructure throughout life. Here, we harmonized FA data from the healthy controls of ten datasets to our lifespan reference curves and found an effect of lower global FA in subjects with the ApoE4 genotype compared to E3E3 homozygotes. The same effect was found regionally in the hippocampal cingulum, the posterior thalamic radiation, and the splenium of the corpus callosum. As ApoE4 is a genetic risk factor for Alzheimer's disease (AD), our findings overlap with regions previously implicated in DTI studies of dementia (Nir et al. 2013; Peter Kochunov et al. 2021).
Previous studies of white matter microstructure in cognitively healthy controls have also found E4 carriers to have lower FA than noncarriers, though published results are inconsistent and some studies have also reported null findings (Harrison et al. 2020). Most previous studies were either limited by smaller sample sizes (N < 200) or a limited age range, with most focusing on individuals aged 60 years and above and fewer than five studies incorporating younger individuals. Our study was conducted in over 30,000 subjects and across the lifespan.

We also found a significant age-by-E4 interaction in the fornix (crus)/stria terminalis whereby the FA of the E4 carriers was higher than that of E3 homozygotes early in life until approximately age 55 years, after which E4 carriers showed lower FA than noncarriers. The fornix and the stria terminalis are major output tracts of the hippocampus and amygdala, respectively. In a longitudinal lifespan study of structural imaging measures, age-dependent associations were found between E4 and rates of volume change in the hippocampus and the amygdala (Brouwer et al. 2022). In both structures, E4 was associated with prolonged growth into adulthood and faster atrophy later in life. This age-by-E4 interaction may reflect the 'antagonistic pleiotropy' hypothesis that the ApoE4 genotype may be advantageous earlier in life (Han and Bondi 2008).

As part of our ApoE analyses, we further evaluated datasets that had subjects acquired with multiple dMRI protocols. Diffusion-weighted MRI, and FA in particular, is affected by many acquisition parameters (A. H. Zhu et al. 2019). We found that the use of our lifespan harmonization framework does not converge the statistical results of the same population acquired on different protocols. However, we note that differences in FA due to acquisition protocols may reflect different underlying anatomy or biological processes.
We found that voxel size was negatively correlated with the harmonization shift parameter, i.e., sites with larger voxels generally have lower FA values. Larger voxels are more likely to contain crossing fibers, which would result in lower FA values. In such a case, convergence of FA values between protocols at the expense of biological interpretability may not be desired. We combined our lifespan reference curves and harmonization mechanism into a Python package called eHarmonize. Dissemination of the package will allow collaborators to harmonize their data on-site and then share either the harmonized measures, or - in situations where raw data cannot be shared (e.g., genetics studies) - results of an agreed upon analysis. In addition, new sites will be free to join existing projects without requiring re-harmonization of the previously collected and harmonized data. We also created the package framework to be flexible and adaptable to update our existing reference curves and include reference curves for other measures, such as diffusivity measures or even those extracted from another modality. Version control ensures appropriate provenance for reproducibility. The datasets we used to build our lifespan reference curves are public and therefore available to researchers for many applications. As such, researchers harmonizing multi-site data for machine learning studies may be concerned about data leakage with these datasets in the harmonization reference. After applying the eHarmonize framework to training and testing datasets, we found that there was no significant difference in MAE between training and testing datasets (p > 0.09). Our iterative sampling approach likely limited the overfitting of large datasets, such as the UK Biobank, so harmonization is not driven by any single dataset. 
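The iterative subsampling scheme referred to here can be sketched in Python under one possible reading of the Methods: datasets large enough to supply every iteration are split into disjoint draws (no subject reused across iterations), while smaller datasets are redrawn independently in each iteration, so their subjects may recur across iterations, and contribute all available subjects when they have fewer than the target number. The size rule, function name, and toy dataset sizes are assumptions for illustration and are not taken from the eHarmonize code.

```python
import random


def subsample_datasets(datasets, n_per_dataset=200, n_iterations=25, seed=0):
    """Draw one subsample per dataset for each harmonization iteration.

    datasets: dict mapping dataset name -> list of subject IDs
    Returns a list (one entry per iteration) of dicts name -> sampled IDs.
    """
    rng = random.Random(seed)
    # Pre-shuffle the large datasets once so iterations take disjoint chunks
    shuffled = {}
    for name, subjects in datasets.items():
        if len(subjects) >= n_per_dataset * n_iterations:
            shuffled[name] = rng.sample(subjects, len(subjects))

    iterations = []
    for it in range(n_iterations):
        draw = {}
        for name, subjects in datasets.items():
            if name in shuffled:
                # Without replacement across iterations (assumed large-dataset rule)
                start = it * n_per_dataset
                draw[name] = shuffled[name][start:start + n_per_dataset]
            else:
                # Independent draw each iteration; subjects can recur across iterations
                k = min(n_per_dataset, len(subjects))
                draw[name] = rng.sample(subjects, k)
        iterations.append(draw)
    return iterations


# Toy dataset sizes only (not the real cohort sizes)
toy = {"large_study": list(range(6000)), "small_study": list(range(150))}
draws = subsample_datasets(toy)
```

Each iteration's draw would then be passed to the harmonization step, and the per-iteration outputs pooled to build the reference curves.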
eHarmonize may be an important tool for large-scale machine learning purposes, but we recommend that interested researchers formally evaluate it for data leakage when using one of the training datasets, as was done in the establishment of harmonizer (Marzi et al. 2024).

Our lifespan reference curves have a few limitations that we plan to address in future versions. First, we had uneven sampling across the age range. In particular, children younger than 8 years old and adults in the range of 35 to 45 years old were underrepresented. We also acknowledge that the protocols of our training data were largely homogeneous, good quality data, and our test sets were similar. For the next iteration of lifespan curves, we aim to include more data from our underrepresented age ranges and acquisition protocols. To address the connection between acquisition protocol and biological interpretation between studies, we will also evaluate the added value of separate lifespan reference curves for different acquisition parameters.

To expand on eHarmonize's functionality, future steps will include modeling functions that can take into account population-based sources of variability or additional model parameters. For the current study, we took care to only include unrelated and cross-sectional subsets of all studies. In the future, it would be important to examine nested random effects that account for a covariance (kinship) structure of the study population. There is currently one nested-ComBat package that implements both nested and Gaussian mixture model versions of ComBat (Horng et al. 2022); however, the functions are currently hardcoded for bi-modal or binary data. With regard to model parameters, the World Health Organization (WHO) recommends GAM-LSS for creating lifespan charts (Borghi et al. 2006). GAM-LSS models may fit the first four moments of a distribution, adding in skew and kurtosis parameters, which GAM does not.
Fitting more parameters robustly requires more data, and as many studies have small sample sizes, we elected to start with GAM models. For future studies wishing to combine only larger datasets, we will examine the effect of incorporating the ability to fit a GAM-LSS model. Additional features we will test include combining ComBat-GAM with CovBat to account for covariance between measures as well as the nonlinear age trend.

Overall, we successfully created lifespan reference curves for regional FA measures and made progress in applying ComBat-GAM to these references. Our framework provides studies with the ability to standardize their dMRI measures. Collaborators from different sites can also harmonize their data to our lifespan reference curves without worrying about their covariate overlap or differences in sample size. The framework is now available as a Python package at https://github.com/ahzhu/eharmonize.

5.5 ACKNOWLEDGMENTS

AHZ, TMN, PMT, and NJ conceptualized the study. AR, LS, GIdZ, KLM, MJW, SEM, JB, DCG, PK, and AKH provided access to study data. AHZ, TMN, and JEV-R processed images and aggregated data from the numerous datasets. AHZ created the lifespan reference curves, coded eHarmonize, and performed all statistical analysis. SJ helped with methods testing. AHZ, TMN, SJ, and NJ wrote the manuscript.

AHZ, TMN, SJ, JEV-R, PMT, and NJ received funding support from the NIH (R01MH134004, R01MH116147, R01AG058854, P41EB015922, RF1AG057892) and the Alzheimer's Association. JB received funding support from the following NIH grants: U54HG013247, P30AG059305, U19AG076581, R01AG078423, and R01AG058464.

5.6 SUPPLEMENTARY MATERIALS

Supplementary Figure S5.1. ABCD site effects on FA in the genu of the corpus callosum (GCC). Longitudinal data from ABCD subjects are provided as spaghetti plots, separated by site and colored by manufacturer. The trend line per site is also included in black.
Most sites with a GE scanner showed decreasing FA with age, contrary to the expected age effect. Philips was the least common scanner manufacturer. One Philips site had a downward trajectory (site01), and another had widely diverging subject-level trajectories (site19). As a result, we elected to use only ABCD data acquired on Siemens scanners.

Supplementary Figure S5.2. Outlier ROI lifespan trajectories. (A) FA in the SCC peaks at age 68 years, much later than all other ROIs. (B) After peaking at age 29 years, FA in the FX decreases faster than in any other ROI. The lifespan reference curve for the global FA measure is also included for reference.

Supplementary Table S5.1. Sex effects across the white matter. ROIs where males had significantly higher FA are in bold. ROIs where females had significantly higher FA are italicized. * p < 0.05; ** p < 0.001

ROI         Estimate   SE        t-value   p-value     pFDR
AverageFA   0.0020     0.00024   8.4       < 0.001**   < 0.001**
ACR         0.0018     0.00037   4.8       < 0.001**   < 0.001**
ALIC        0.0042     0.00040   10.5      < 0.001**   < 0.001**
BCC         -0.00078   0.00046   -1.7      0.089       0.092
CC          0.00014    0.00038   0.4       0.71        0.71
CGC         0.010      0.00048   21.7      < 0.001**   < 0.001**
CGH         0.0049     0.00066   7.4       < 0.001**   < 0.001**
CR          0.0013     0.00031   4.1       < 0.001**   < 0.001**
CST         0.0082     0.00053   15.5      < 0.001**   < 0.001**
EC          0.0030     0.00034   8.7       < 0.001**   < 0.001**
FX          -0.0068    0.00082   -8.3      < 0.001**   < 0.001**
FXST        -0.0050    0.00047   -10.5     < 0.001**   < 0.001**
GCC         0.0012     0.00048   2.5       0.011*      0.013*
IC          0.0032     0.00032   10.0      < 0.001**   < 0.001**
PCR         -0.00077   0.00037   -2.1      0.040       0.046
PLIC        0.0041     0.00036   11.5      < 0.001**   < 0.001**
PTR         -0.0035    0.00044   -7.9      < 0.001**   < 0.001**
RLIC        0.00082    0.00039   2.1       0.036       0.043
SCC         0.00076    0.00041   1.9       0.063       0.069
SCR         0.0014     0.00037   3.9       < 0.001**   < 0.001**
SFO         0.0044     0.00042   10.6      < 0.001**   < 0.001**
SLF         0.0030     0.00035   8.6       < 0.001**   < 0.001**
SS          -0.0051    0.00043   -11.9     < 0.001**   < 0.001**
TAP         -0.0034    0.00069   -4.9      < 0.001**   < 0.001**
UNC         0.0050     0.00060   8.4       < 0.001**   < 0.001**

Supplementary Table S5.2.
ApoE4 effects across the white matter. ROIs where ApoE4 was associated with significantly lower FA after multiple comparisons correction are in bold. * p < 0.05; ** p < 0.001

ROI        N      Estimate  SE      t-value  p-value     pFDR
AverageFA  26902  -0.020    0.0057  -3.42    < 0.001**   0.0052*
ACR        26902  -0.0081   0.0052  -1.56    0.12        0.25
ALIC       26902  -0.0039   0.0059  -0.67    0.50        0.63
BCC        26902  -0.0084   0.0060  -1.39    0.16        0.27
CC         26902  -0.014    0.0060  -2.27    0.02*       0.097
CGC        26902  -0.010    0.0058  -1.76    0.08        0.20
CGH        26902  -0.027    0.0059  -4.61    < 0.001**   < 0.001**
CR         26902  -0.0045   0.0056  -0.80    0.42        0.56
CST        26902  -0.0054   0.0059  -0.92    0.36        0.50
EC         26902  -0.0013   0.0059  -0.22    0.83        0.90
FX         26902  -0.0063   0.0044  -1.43    0.15        0.27
FXST       26902  -0.011    0.0056  -1.98    0.048*      0.13
GCC        26902  -0.012    0.0056  -2.10    0.036*      0.13
IC         26902  -0.0060   0.0060  -1.01    0.31        0.46
PCR        26902  -0.0078   0.0060  -1.32    0.19        0.29
PLIC       26902  0.00043   0.0060  0.071    0.94        0.94
PTR        26902  -0.022    0.0057  -3.78    < 0.001**   0.0020*
RLIC       26902  -0.012    0.0061  -2.05    0.040*      0.13
SCC        26902  -0.016    0.0059  -2.70    0.0068*     0.043*
SCR        26902  0.0031    0.0060  0.51     0.61        0.69
SFO        26902  0.00081   0.0060  0.14     0.89        0.93
SLF        26902  -0.0091   0.0060  -1.52    0.13        0.25
SS         26902  -0.015    0.0060  -2.56    0.01*       0.052
TAP        26902  -0.0096   0.0060  -1.59    0.11        0.25
UNC        26902  -0.0033   0.0060  -0.54    0.59        0.69

CHAPTER 6
Replication Studies and Future Directions

This section is adapted from:

Zhu, A.H., Thompson, P.M., Jahanshad, N. “Family history of suicide may have a sex-specific effect on brain structure in adolescents.” (2019) Society for Neuroscience. Chicago, IL, USA, October 19-23, 2019.

Zhu, A.H., Thompson, P.M., Jahanshad, N. “Sex specific neurodevelopmental associations with maternal and paternal history of suicide in ABCD.” (2020) Organization for Human Brain Mapping. Virtual, June 23-July 3, 2020.

Zhu, A.H., Jahanshad, N. “Cortical Structure Mediates Externalizing Behaviors In Children With Parental History Of Suicide. An ABCD Study.” (2021) Society of Biological Psychiatry. Virtual, April 28-May 1, 2021.
Zhu, A.H., Jahanshad, N. “The Effects of Parental History of Suicide on Neurodevelopment of Children from the Healthy Brain Network Study.” (2023) International Congress on Psychopharmacology. Belek, Antalya, Turkey, October 22-25, 2023.

6.1 INTRODUCTION

In recent years, suicide has consistently been listed as a leading cause of death in the United States, accounting for 2-3% of deaths per year (Heron 2017, 2019, 2021). Suicidal thoughts and behaviors (STBs) are even more common: for every death by suicide, there are estimated to be twenty suicide attempts (SA) (WHO 2014). Suicidal ideation (SI) is the most common STB, with a reported lifetime prevalence of 9.2% (Nock et al. 2008). Longitudinal studies have found that STBs are associated with reduced mental and physical health (Fairweather-Schmidt et al. 2016; Goldman-Mellor et al. 2014), and poorer mental health has been found in the loved ones of those who died by suicide (Mitchell et al. 2009).

Suicides and STBs are thought to be preventable; however, over fifty years of research have produced a stable list of risk factors that remain weak predictors of future STBs (Franklin et al. 2017). Thus far, the best predictor of a future SA has been a previous SA (WHO 2014). Understanding the mechanisms by which risk factors operate may help improve prediction efforts in the future.

Risk factors may have genetic and/or environmental underpinnings. Family history of suicide (FHoS) is a risk factor that combines both. Epidemiological and family studies have found heritability estimates of STBs to be within the range of 17-55% (Statham et al. 1998; Roy and Segal 2001; Sokolowski, Wasserman, and Wasserman 2014; Voracek and Loibl 2007). This heritability has been shown to be separate from that of psychiatric syndromes that may predispose people to STBs (Roy, Rylander, and Sarchiapone 1997; D. A. Brent et al. 1994).
People with FHoS are five times more likely to attempt suicide than those without (McGirr et al. 2009; David A. Brent et al. 2015). In addition to heritability, trauma from FHoS may also play a role. The suicide risk of FHoS offspring is approximately double that of offspring who suddenly lost a parent to other causes (Burrell, Mehlum, and Qin 2018).

Modulating factors may include demographics, such as the sex of both the parent and the offspring and the age of the offspring. Previous research has found conflicting results when studying the effects of parent and child sex on the suicide risk conferred by family history. Some studies have found female sex to be particularly influential (e.g., daughters, or offspring with a maternal history of suicide, are more affected) (Gravseth et al. 2010; Mittendorfer-Rutz, Rasmussen, and Lange 2012; Agerbo, Nordentoft, and Mortensen 2002), some have found the greatest impact in those who lost their same-sex parent (Cheng et al. 2014), and others have found no effect of sex from either the parent or the offspring (Burrell, Mehlum, and Qin 2018). With regard to age, offspring of suicide attempters tend to attempt suicide at a younger age (David A. Brent et al. 2003), and the risk from FHoS is increased in younger offspring (Burrell, Mehlum, and Qin 2018).

Suicide is the leading cause of non-accidental death in children and young adults (Heron 2017, 2019, 2021). STBs and non-suicidal self-harm often begin during adolescence (Hawton, Saunders, and O’Connor 2012). The two share many risk factors (including FHoS, low socioeconomic status, parental separation or divorce, interpersonal difficulties, mental disorders, drug and alcohol misuse, and hopelessness), and self-harm is itself a risk factor for suicide (Hawton, Saunders, and O’Connor 2012).
Previous studies of young adolescents have found that the onset of self-harm was associated with late-stage or completed puberty, rather than chronological age (Hawton, Saunders, and O’Connor 2012). Correspondingly, the prevalence of suicide is relatively low before the age of 15 years and then steadily increases into early adulthood (Bertolote and Fleischmann 2002). As a result, suicidality and self-harm studies in younger children are less common.

Neuroimaging studies can help elucidate the biological mechanisms of suicide risk, especially in children whose brains are still undergoing development. Neuroimaging studies of STBs in adolescents are less common than those conducted in adults, but they have found associations between SA and structural volume differences in regions such as the prefrontal cortex, anterior cingulate cortex, lateral temporal regions, and parahippocampal gyrus (Schmaal et al. 2020a). Studies of the effects of FHoS on neurodevelopment in children are even scarcer, but a recent study in adults involving suicide attempters, clinical controls, and relatives found smaller volumes in frontal and temporal brain regions in people with FHoS, regardless of the subject’s own attempt history (Jollant et al. 2018). In the same study, significant differences between the brains of suicide attempters and non-attempters were only found in a clinically depressed population without FHoS (Jollant et al. 2018). The authors suggest that their findings may implicate different neural mechanisms in FHoS and non-FHoS suicide attempters (Jollant et al. 2018). Brain MRI studies in developing children may help tease apart these neural mechanisms underlying the increased risk for STBs.

Some of the large datasets aimed at studying brain development also include extensive questionnaires on the family history of medical conditions, such as suicide.
The Adolescent Brain Cognitive Development (ABCD) Study, which has collected detailed imaging, demographic, behavior, and family history data from over 10,000 children aged 8-11 years (“The Conception of the ABCD Study: From Substance Use to a Broad NIH Collaboration” 2018), provides a large-scale opportunity to study how FHoS relates to brain development. The Healthy Brain Network (HBN) study has collected MRI of approximately 2,000 children and young adults aged 5-22 years and also contains FHoS data from their parents or caretakers (Alexander et al. 2017).

Here, we examined the effects of parental FHoS on the brain structure of children from the ABCD and HBN datasets. We hypothesized that maternal and paternal FHoS may have different effects on the developing brain, so we also evaluated these groups separately. We analyzed each dataset separately to gauge the reproducibility of our results.

6.2 FAMILY HISTORY OF SUICIDE IN THE ABCD STUDY

6.2.1 Methods

Baseline imaging data from ABCD release 2.0.1 were analyzed. FHoS information was obtained via the ABCD Family History Assessment, which asked whether specific family members had attempted or completed suicide. Children for whom a parent had a history of suicide or suicide attempt were selected for the FHoS group. We then filtered by family ID to obtain a subset of children who were unrelated to each other. An income-matched set of children was selected as controls (Controls: FHoS = 5:1). Children with a second-degree relative who attempted or completed suicide were excluded. STB in the children was assessed with the KSADS-5 with questions on lifetime self-harm behavior, suicidal ideation, and suicidal attempt (Kaufman 2013; Kobak and Kaufman 2015). Chi-squared analyses were used to determine any differences between the two groups. The ABCD data release includes cortical parcellations based on FreeSurfer version 5.3 (Fischl 2012) and regional measures of cortical thickness (TH) and surface area (SA).
For the purposes of our study, regional measures were bilaterally averaged. Differences in TH and SA between FHoS and control groups were assessed with mixed effects models. Fixed effect covariates included age, sex, and family income; data collection site was modeled as a random effect. Models were initially run without a global correction (mean TH or total SA); we then re-ran the associations with the global correction to emphasize regional effects. The false discovery rate (FDR) method was used to correct for multiple comparisons across all regions (NROI = 34), separately for each measure (cortical thickness and surface area) (Benjamini and Hochberg 1995). Maternal and paternal FHoS (MatHoS and PatHoS, respectively) were also evaluated separately, along with sex-stratified analyses to determine effects that were specific to boys or girls.

Nominally significant regions (p < 0.05) were modeled as mediators for the effect of FHoS on internalizing and externalizing behaviors as assessed by the Child Behavior Checklist (CBCL) (Achenbach 2009). Caregivers rated child behaviors over the past 6 months as “0 = Not True”, “1 = Somewhat/Sometimes True”, and “2 = Very True/Often True”. These responses were aggregated into internalizing or externalizing behavior scores, which are provided by the ABCD study. Boys, girls, and MatHoS or PatHoS were also analyzed separately. Significance for group and mediation analyses was assessed using FDR.

6.2.2 Results

525 children (240 F) had first-degree FHoS, and 2,624 (1,252 F) were identified as controls (Figure 6.1). 306 children (152 F) had maternal FHoS, and 197 children (84 F) had paternal FHoS. There were no significant differences in age or sex distribution between the groups. FHoS children were more likely to self-report STB than controls, and the effect was stronger in children with MatHoS than in children with PatHoS (Table 6.1).
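
As a worked check on the group comparisons in Table 6.1, the sketch below computes a 2x2 chi-squared statistic directly from the tabulated counts. It assumes Yates' continuity correction, which the text does not state, and the function name `chi2_yates` is hypothetical. Under that assumption the self-harm comparisons come out at roughly 13.7 (maternal) and 5.4 (paternal), in line with the tabulated values.

```python
def chi2_yates(a, b, c, d):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]].

    Applies Yates' continuity correction (an assumption; the source
    does not say which variant of the test was used).
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each |observed - expected| by 0.5, never below zero.
            deviation = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            stat += deviation ** 2 / expected
    return stat


# Self-harm counts from Table 6.1, ordered as (controls +, controls -, FHoS +, FHoS -).
self_harm_maternal = chi2_yates(151, 2449, 35, 270)
self_harm_paternal = chi2_yates(151, 2449, 20, 176)

print(round(self_harm_maternal, 1), round(self_harm_paternal, 1))
```

The same function can be applied to the suicidal ideation and attempt rows; the resulting p-values would still need the FDR step described in the Methods.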
The transverse temporal cortex was thicker in children with FHoS (t = 3.07, p = 0.002) when covarying for global thickness, and this was driven by MatHoS (t = 3.21, pFDR = 0.046). The SA of the pars opercularis was significantly larger in girls with MatHoS (without total SA: t = 3.61, pFDR = 0.01; with: t = 3.36, pFDR = 0.027, Figure 6.2). In children with paternal FHoS, significantly smaller areas were found in the banks of the superior temporal sulcus (STS) (t = -3.80, pFDR = 0.005) and temporal poles (t = -3.02, pFDR = 0.043). Many regions had a smaller SA in boys with paternal FHoS before covarying for global surface area (Figure 6.3). When accounting for global SA, children with paternal FHoS had smaller banks of the STS (t = -3.52, pFDR = 0.015) but a larger superior parietal cortex (t = 3.22, pFDR = 0.021). In boys alone, the superior parietal cortex was larger (t = 3.26, pFDR = 0.038, Figure 6.2).

Figure 6.1. ABCD Subject Selection. One child was selected from each family to form an unrelated subcohort. Children with second-degree (2°) FHoS were excluded. As socioeconomic status is strongly associated with suicide risk, control and first-degree (1°) FHoS children were matched by family income (ratio 5:1). There were no significant differences in the distribution of sex.

                    Controls       Maternal FHoS         Paternal FHoS
                    +      -       +     -     χ²        +     -     χ²
self-harm           151    2449    35    270   13.7**    20    176   5.4*
suicidal ideation   130    2470    37    268   24.3**    19    177   7.1*
suicidal attempt    25     2575    10    295   10.4*     6     190   5.5*

Table 6.1. Self-reported frequencies of suicidal thoughts and behaviors in children with or without a first-degree family history of suicide. * indicates significance at pFDR < 0.05; ** indicates significance at pFDR < 0.001.

Figure 6.2. Parental and child sex effects on cortical surface area. No results were found in children when the parent with a history of suicide attempt or completion was of the opposite sex.
In children with the same sex as their FHoS parent, larger surface areas were found in some regions.

Figure 6.3. Paternal FHoS effects in boys. When not covarying for total surface area, many regions of the cortex are smaller in boys with paternal FHoS. These effects all disappear when covarying for total surface area, and instead an effect is found in the opposite direction (larger superior parietal cortex, see Figure 6.2).

For children with PatHoS, cortical structure in temporal-parietal areas mediated the effect of FHoS on externalizing behaviors, largely driven by boys with PatHoS. Externalizing behaviors in boys were mediated by cortical structure in the anterior cingulate cortex, regardless of the sex of the parent with FHoS. In children with PatHoS, SA of the inferior parietal lobule significantly mediated internalizing behaviors, again largely driven by boys.

6.3 FAMILY HISTORY OF SUICIDE IN THE HBN STUDY

6.3.1 Methods

T1-weighted images from the HBN study were processed using FreeSurfer version 7.1.1 (Fischl 2012) to obtain 7 subcortical volumes as well as thickness and surface area for 34 cortical regions. Each subcortical and cortical measure was averaged bilaterally. The number of suicide attempts by each parent was collected by the HBN study as part of a comprehensive questionnaire on family history, and was binarized to determine offspring FHoS status. We created a subset of controls without FHoS matched by age, sex, annual household income, and site (Controls: FHoS = 2:1). Mixed-effects models were run to determine the effect of FHoS on each brain measure, covarying for age, sex, age-by-sex, age², income, and intracranial volume, with data collection site as a random effect. Multiple comparisons were accounted for using the FDR procedure, correcting for the number of ROIs in a given measure.
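
The FDR step can be illustrated with a minimal Benjamini-Hochberg sketch (Benjamini and Hochberg 1995). The function name `benjamini_hochberg` is hypothetical, and the software actually used for the correction is not specified here. The example takes the two uncorrected cortical thickness p-values reported in Section 6.3.2 (insula and rostral middle frontal cortex) and pads the remaining 32 of the 34 regions with 1.0 purely as a placeholder, since the other p-values are not reported. With that padding the adjusted values are upper bounds, and both fall below 0.05.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


n_rois = 34  # cortical regions tested per measure
# Reported p-values for insula and rostral middle frontal cortex;
# the 1.0 entries are placeholders, not data from the study.
p_values = [0.0015, 0.0016] + [1.0] * (n_rois - 2)
adjusted = benjamini_hochberg(p_values)

print([round(q, 4) for q in adjusted[:2]])
```

Because the adjusted value for a given rank is a running minimum over all larger ranks, replacing the placeholders with the true p-values could only lower these two adjusted values, never raise them.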
In regions that passed FDR correction, secondary analyses were run stratifying by parent, to determine maternal and paternal FHoS effects specifically.

6.3.2 Results

Of more than 2,000 HBN participants, 134 had maternal FHoS and 54 had paternal FHoS. Demographics for all groups are shown in Tables 6.2 and 6.3. There were no significant differences in age or sex distribution between the groups. In comparison to controls, FHoS offspring had significantly lower cortical thickness in the insula (t = -3.20, p = 0.0015) and rostral middle frontal cortex (t = -3.19, p = 0.0016) (Figure 6.4); no cortical area or subcortical volume differences were detected. When stratifying the analyses by parent sex, we found results were driven by the maternal FHoS group (insula: t = -2.80, p = 0.0054; rostral middle frontal cortex: t = -3.11, p = 0.0021); no paternal-specific effects were detected.

Table 6.2. Of the 2818 HBN participants, 184 children had FHoS, denoted as FHx (family history). A subset of 110 children had a complete set of demographics and imaging available.

Table 6.3. Demographics table of the FHx children stratified by sex of affected parent. Four children had both maternal and paternal FHoS (denoted as FHx for family history) and were included in both sets of analyses.

Figure 6.4. In FHoS children, cortical thickness was significantly lower in the insula and rostral middle frontal cortex, with an additional four ROIs reaching nominal significance.

6.4 FUTURE DIRECTIONS

We conducted similar analyses of the potential effects of parental history of suicide on neurodevelopment in two pediatric datasets, but the results did not replicate between the two studies. In the ABCD dataset, we found FHoS to be largely associated with increased cortical thickness and surface area in temporal regions. In sex-stratified analyses, we found brain differences in children who were of the same sex as the affected parent.
In the HBN dataset, children with FHoS had on average thinner insula and rostral middle frontal cortices. The results were driven by maternal FHoS. However, it should be noted that in both datasets children with maternal FHoS outnumbered those with paternal FHoS.

The difference in results is likely due to differences in study design between the two datasets. Both methodological differences, such as MRI acquisition and processing protocols, and study sample characteristics are known sources of site or study effects. One example of a difference in image processing pipelines is that structural MRI images from the ABCD and HBN datasets were processed with different versions of FreeSurfer (Haddad et al. 2023). However, the difference in subject recruitment between the two studies likely played a large role in the discrepancy of the results. The ABCD study recruited subjects within a very limited age range (“The Conception of the ABCD Study: From Substance Use to a Broad NIH Collaboration” 2018), while the HBN study included children from a wider age range (Alexander et al. 2017). The trajectories of structural brain measures such as cortical thickness and surface area differ during neurodevelopment, with cortical surface area peaking around the baseline ABCD age and cortical thickness declining steadily (Wierenga et al. 2014). With this in mind, the cortical surface area results of the ABCD study are particularly difficult to interpret, as higher or lower surface area may be interpreted differently on either side of the peak age.

The harmonization work presented in this dissertation thus far has primarily been focused on processing pipelines and creating normative models. Methods such as ComBat attempt to address MRI-specific sources of variation while preserving the effects of study-specific demographics.
The discrepancy in our multi-study FHoS results emphasizes the need to also include harmonization of non-imaging measures, such as demographics and questionnaires (Campos et al. 2023). In our study, different questionnaires for family history of suicide were used, with HBN assessing the number of attempts, while ABCD assessed a binary of attempted or completed suicides. Attempted and completed suicides are sometimes thought of as a single phenotype (Brent et al. 2015; Brent et al. 1996; Lieb et al. 2005), but it is possible that they have different effects on neurodevelopment. Also, when studying rare exposure variables such as FHoS, harmonization would prove crucial, as imposing additional exclusion restrictions to homogenize data can severely limit the amount of available data.

Overall, we found results in the frontotemporal cortical regions, which have been implicated in neuroimaging studies of suicidal thoughts and behaviors (Schmaal et al. 2020b). By identifying brain differences in children at possibly higher risk for suicidal behaviors, we may be able to identify novel, targeted treatments and care. We also found effects to be sex-specific. Boys and girls may require distinct therapies depending on the sex of the parent with suicidal behaviors. However, our lack of replicated results indicates an immense need for further work, which we hope to address in the future.

6.5 ACKNOWLEDGMENTS

Funding for this study was provided by NIH grants P41 EB015922, R01 MH117601, and R01 MH116147.

REFERENCES

Achenbach, T. M. 2009. The Achenbach System of Empirically Based Assessment (ASEBA): Development, Findings, Theory, and Applications. Burlington: University of Vermont, Research Center for Children, Youth, and Families.
Acheson, Ashley, S. Andrea Wijtenburg, Laura M. Rowland, Anderson Winkler, Charles W. Mathias, L. Elliot Hong, Neda Jahanshad, et al. 2017.
“Reproducibility of Tract-Based White Matter Microstructural Measures Using the ENIGMA-DTI Protocol.” Brain and Behavior 7 (2): e00615.
Acosta-Cabronero, Julio, Guy B. Williams, Arturo Cardenas-Blanco, Robert J. Arnold, Victoria Lupson, and Peter J. Nestor. 2013. “In Vivo Quantitative Susceptibility Mapping (QSM) in Alzheimer’s Disease.” PloS One 8 (11): e81093.
Agerbo, Esben, Merete Nordentoft, and Preben Bo Mortensen. 2002. “Familial, Psychiatric, and Socioeconomic Risk Factors for Suicide in Young People: Nested Case-Control Study.” BMJ 325 (7355): 74.
Alexander, Lindsay M., Jasmine Escalera, Lei Ai, Charissa Andreotti, Karina Febre, Alexander Mangone, Natan Vega-Potler, et al. 2017. “An Open Resource for Transdiagnostic Research in Pediatric Mental Health and Learning Disorders.” Scientific Data 4 (December):170181.
Alfaro-Almagro, Fidel, Mark Jenkinson, Neal K. Bangerter, Jesper L. R. Andersson, Ludovica Griffanti, Gwenaëlle Douaud, Stamatios N. Sotiropoulos, et al. 2018. “Image Processing and Quality Control for the First 10,000 Brain Imaging Datasets from UK Biobank.” NeuroImage 166 (February):400–424.
Åsvold, Bjørn Olav, Arnulf Langhammer, Tommy Aune Rehn, Grete Kjelvik, Trond Viggo Grøntvedt, Elin Pettersen Sørgjerd, Jørn Søberg Fenstad, et al. 2023. “Cohort Profile Update: The HUNT Study, Norway.” International Journal of Epidemiology 52 (1): e80–91.
Avants, B. B., C. L. Epstein, M. Grossman, and J. C. Gee. 2008. “Symmetric Diffeomorphic Image Registration with Cross-Correlation: Evaluating Automated Labeling of Elderly and Neurodegenerative Brain.” Medical Image Analysis 12 (1): 26–41.
Ayton, Scott, Noel G. Faux, Ashley I. Bush, and Alzheimer’s Disease Neuroimaging Initiative. 2015. “Ferritin Levels in the Cerebrospinal Fluid Predict Alzheimer’s Disease Outcomes and Are Regulated by APOE.” Nature Communications 6 (May):6760.
Bachmann, S., J. Pantel, A. Flender, C. Bottmer, M. Essig, and J. Schröder. 2003.
“Corpus Callosum in First-Episode Patients with Schizophrenia--a Magnetic Resonance Imaging Study.” Psychological Medicine 33 (6): 1019–27.
Bava, Sunita, Veronique Boucquey, Diane Goldenberg, Rachel E. Thayer, Megan Ward, Joanna Jacobus, and Susan F. Tapert. 2011. “Sex Differences in Adolescent White Matter Architecture.” Brain Research 1375 (February):41–48.
Bayer, Johanna M. M., Richard Dinga, Seyed Mostafa Kia, Akhil R. Kottaram, Thomas Wolfers, Jinglei Lv, Andrew Zalesky, Lianne Schmaal, and Andre Marquand. 2022. “Accommodating Site Variation in Neuroimaging Data Using Normative and Hierarchical Bayesian Models.” NeuroImage 264 (December):119699.
Bayer, Johanna M. M., Paul M. Thompson, Christopher R. K. Ching, Mengting Liu, Andrew Chen, Alana C. Panzenhagen, Neda Jahanshad, Andre Marquand, Lianne Schmaal, and Philipp G. Sämann. 2022. “Site Effects How-to and When: An Overview of Retrospective Techniques to Accommodate Site Effects in Multi-Site Neuroimaging Analyses.” Frontiers in Neurology 13 (October):923988.
Bell, Steven, Andreas S. Rigas, Magnus K. Magnusson, Egil Ferkingstad, Elias Allara, Gyda Bjornsdottir, Anna Ramond, et al. 2021. “A Genome-Wide Meta-Analysis Yields 46 New Loci Associating with Biomarkers of Iron Homeostasis.” Communications Biology 4 (1): 156.
Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B, Statistical Methodology 57 (1): 289–300.
Benyamin, Beben, Allan F. McRae, Gu Zhu, Scott Gordon, Anjali K. Henders, Aarno Palotie, Leena Peltonen, et al. 2009. “Variants in TF and HFE Explain Approximately 40% of Genetic Variation in Serum-Transferrin Levels.” American Journal of Human Genetics 84 (1): 60–65.
Bergen, J. M. G. van, X. Li, J. Hua, S. J. Schreiner, S. C. Steininger, F. C. Quevenco, M. Wyss, et al. 2016.
“Colocalization of Cerebral Iron with Amyloid Beta in Mild Cognitive Impairment.” Scientific Reports 6 (October):35514.
Bertolote, José Manoel, and Alexandra Fleischmann. 2002. “Suicide and Psychiatric Diagnosis: A Worldwide Perspective.” World Psychiatry: Official Journal of the World Psychiatric Association 1 (3): 181–85.
Bethlehem, R. A. I., J. Seidlitz, S. R. White, J. W. Vogel, K. M. Anderson, C. Adamson, S. Adler, et al. 2022. “Brain Charts for the Human Lifespan.” Nature 604 (7906): 525–33.
Blennow, Kaj, Leslie M. Shaw, Erik Stomrud, Niklas Mattsson, Jon B. Toledo, Katharina Buck, Simone Wahl, et al. 2019. “Predicting Clinical Decline and Conversion to Alzheimer’s Disease or Dementia Using Novel Elecsys Aβ(1-42), pTau and tTau CSF Immunoassays.” Scientific Reports 9 (1): 19024.
Boedhoe, Premika S. W., Martijn W. Heymans, Lianne Schmaal, Yoshinari Abe, Pino Alonso, Stephanie H. Ameis, Alan Anticevic, et al. 2018. “An Empirical Comparison of Meta- and Mega-Analysis With Data From the ENIGMA Obsessive-Compulsive Disorder Working Group.” Frontiers in Neuroinformatics 12:102.
Borghi, E., M. de Onis, C. Garza, J. Van den Broeck, E. A. Frongillo, L. Grummer-Strawn, S. Van Buuren, et al. 2006. “Construction of the World Health Organization Child Growth Standards: Selection of Methods for Attained Growth Curves.” Statistics in Medicine 25 (2): 247–65.
Braak, Heiko, and Kelly Del Tredici. 2015. “The Preclinical Phase of the Pathological Process Underlying Sporadic Alzheimer’s Disease.” Brain: A Journal of Neurology 138 (Pt 10): 2814–33.
Brent, D. A., J. A. Perper, G. Moritz, L. Liotus, J. Schweers, L. Balach, and C. Roth. 1994. “Familial Risk Factors for Adolescent Suicide: A Case-Control Study.” Acta Psychiatrica Scandinavica 89 (1): 52–58.
Brent, David A., Nadine M. Melhem, Maria Oquendo, Ainsley Burke, Boris Birmaher, Barbara Stanley, Candice Biernesser, et al. 2015.
“Familial Pathways to Early-Onset Suicide Attempt: A 5.6-Year Prospective Study.” JAMA Psychiatry 72 (2): 160–68.
Brent, David A., Maria Oquendo, Boris Birmaher, Laurence Greenhill, David Kolko, Barbara Stanley, Jamie Zelazny, et al. 2003. “Peripubertal Suicide Attempts in Offspring of Suicide Attempters with Siblings Concordant for Suicidal Behavior.” The American Journal of Psychiatry 160 (8): 1486–93.
Brouwer, Rachel M., Marieke Klein, Katrina L. Grasby, Hugo G. Schnack, Neda Jahanshad, Jalmar Teeuw, Sophia I. Thomopoulos, et al. 2022. “Genetic Variants Associated with Longitudinal Changes in Brain Structure across the Lifespan.” Nature Neuroscience 25 (4): 421–32.
Burke, Danielle L., Joie Ensor, and Richard D. Riley. 2017. “Meta-Analysis Using Individual Participant Data: One-Stage and Two-Stage Approaches, and Why They May Differ.” Statistics in Medicine 36 (5): 855–75.
Burrell, Lisa Victoria, Lars Mehlum, and Ping Qin. 2018. “Sudden Parental Death from External Causes and Risk of Suicide in the Bereaved Offspring: A National Study.” Journal of Psychiatric Research 96 (January):49–56.
Butler, Ellyn R., Andrew Chen, Rabie Ramadan, Trang T. Le, Kosha Ruparel, Tyler M. Moore, Theodore D. Satterthwaite, et al. 2021. “Pitfalls in Brain Age Analyses.” Human Brain Mapping 42 (13): 4092–4101.
Button, Katherine S., John P. A. Ioannidis, Claire Mokrysz, Brian A. Nosek, Jonathan Flint, Emma S. J. Robinson, and Marcus R. Munafò. 2013. “Power Failure: Why Small Sample Size Undermines the Reliability of Neuroscience.” Nature Reviews. Neuroscience 14 (5): 365–76.
Campos, Adrian I., Laura S. Van Velzen, Dick J. Veltman, Elena Pozzi, Sonia Ambrogi, Elizabeth D. Ballard, Nerisa Banaj, et al. 2023. “Concurrent Validity and Reliability of Suicide Risk Assessment Instruments: A Meta-Analysis of 20 Instruments across 27 International Cohorts.” Neuropsychology 37 (3): 315–29.
Chen, Andrew A., Joanne C. Beer, Nicholas J. Tustison, Philip A. Cook, Russell T.
Shinohara, Haochang Shou, and Alzheimer’s Disease Neuroimaging Initiative. 2022. “Mitigating Site Effects in Covariance for Machine Learning in Neuroimaging Data.” Human Brain Mapping 43 (4): 1179–95.
Cheng, C-C J., W-J Yen, W-T Chang, K. C-C Wu, M-C Ko, and C-Y Li. 2014. “Risk of Adolescent Offspring’s Completed Suicide Increases with Prior History of Their Same-Sex Parents’ Death by Suicide.” Psychological Medicine 44 (9): 1845–54.
Connor, J. R., S. L. Menzies, S. M. St Martin, and E. J. Mufson. 1992. “A Histochemical Study of Iron, Transferrin, and Ferritin in Alzheimer’s Diseased Brains.” Journal of Neuroscience Research 31 (1): 75–83.
Correia, Marta Morgado, Thomas A. Carpenter, and Guy B. Williams. 2009. “Looking for the Optimal DTI Acquisition Scheme given a Maximum Scan Time: Are More B-Values a Waste of Time?” Magnetic Resonance Imaging 27 (2): 163–75.
Dagley, Alexander, Molly LaPoint, Willem Huijbers, Trey Hedden, Donald G. McLaren, Jasmeer P. Chatwal, Kathryn V. Papp, et al. 2017. “Harvard Aging Brain Study: Dataset and Accessibility.” NeuroImage 144 (Pt B): 255–58.
Deistung, Andreas, Ferdinand Schweser, and Jürgen R. Reichenbach. 2017. “Overview of Quantitative Susceptibility Mapping.” NMR in Biomedicine 30 (4). https://doi.org/10.1002/nbm.3569.
Downhill, J. E., Jr, M. S. Buchsbaum, T. Wei, J. Spiegel-Cohen, E. A. Hazlett, M. M. Haznedar, J. Silverman, and L. J. Siever. 2000. “Shape and Size of the Corpus Callosum in Schizophrenia and Schizotypal Personality Disorder.” Schizophrenia Research 42 (3): 193–208.
Eikenes, Live, Eelke Visser, Torgil Vangberg, and Asta K. Håberg. 2023. “Both Brain Size and Biological Sex Contribute to Variation in White Matter Microstructure in Middle-Aged Healthy Adults.” Human Brain Mapping 44 (2): 691–709.
Elliott, Lloyd T., Kevin Sharp, Fidel Alfaro-Almagro, Sinan Shi, Karla L. Miller, Gwenaëlle Douaud, Jonathan Marchini, and Stephen M. Smith. 2018.
“Genome-Wide Association Studies of Brain Imaging Phenotypes in UK Biobank.” Nature 562 (7726): 210–16.
Fairweather-Schmidt, A. K., P. J. Batterham, P. Butterworth, and S. Nada-Raja. 2016. “The Impact of Suicidality on Health-Related Quality of Life: A Latent Growth Curve Analysis of Community-Based Data.” Journal of Affective Disorders 203 (October):14–21.
Fick, Rutger H. J., Demian Wassermann, and Rachid Deriche. 2019. “The Dmipy Toolbox: Diffusion MRI Multi-Compartment Modeling and Microstructure Recovery Made Easy.” Frontiers in Neuroinformatics 13 (October):64.
Fischl, Bruce. 2012. “FreeSurfer.” NeuroImage 62 (2): 774–81.
Fonov, Vladimir, Alan C. Evans, Kelly Botteron, C. Robert Almli, Robert C. McKinstry, D. Louis Collins, and Brain Development Cooperative Group. 2011. “Unbiased Average Age-Appropriate Atlases for Pediatric Studies.” NeuroImage 54 (1): 313–27.
Fortin, Jean-Philippe, Nicholas Cullen, Yvette I. Sheline, Warren D. Taylor, Irem Aselcioglu, Philip A. Cook, Phil Adams, et al. 2018. “Harmonization of Cortical Thickness Measurements across Scanners and Sites.” NeuroImage 167 (February):104–20.
Fortin, Jean-Philippe, Drew Parker, Birkan Tunç, Takanori Watanabe, Mark A. Elliott, Kosha Ruparel, David R. Roalf, et al. 2017. “Harmonization of Multi-Site Diffusion Tensor Imaging Data.” NeuroImage 161 (November):149–70.
Frangou, Sophia, Amirhossein Modabbernia, Steven C. R. Williams, Efstathios Papachristou, Gaelle E. Doucet, Ingrid Agartz, Moji Aghajani, et al. 2022. “Cortical Thickness across the Lifespan: Data from 17,075 Healthy Individuals Aged 3-90 Years.” Human Brain Mapping 43 (1): 431–51.
Franklin, Joseph C., Jessica D. Ribeiro, Kathryn R. Fox, Kate H. Bentley, Evan M. Kleiman, Xieyining Huang, Katherine M. Musacchio, Adam C. Jaroszewski, Bernard P. Chang, and Matthew K. Nock. 2017. “Risk Factors for Suicidal Thoughts and Behaviors: A Meta-Analysis of 50 Years of Research.” Psychological Bulletin 143 (2): 187–232.
Friedrich, Patrick, Christoph Fraenz, Caroline Schlüter, Sebastian Ocklenburg, Burkhard Mädler, Onur Güntürkün, and Erhan Genç. 2020. “The Relationship Between Axon Density, Myelination, and Fractional Anisotropy in the Human Corpus Callosum.” Cerebral Cortex 30 (4): 2042–56. Fry, Anna, Thomas J. Littlejohns, Cathie Sudlow, Nicola Doherty, Ligia Adamska, Tim Sprosen, Rory Collins, and Naomi E. Allen. 2017. “Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population.” American Journal of Epidemiology 186 (9): 1026–34. Ge, Ruiyang, Yuetong Yu, Yi Xuan Qi, Yu-Nan Fan, Shiyu Chen, Chuntong Gao, Shalaila S. Haas, et al. 2024. “Normative Modelling of Brain Morphometry across the Lifespan with CentileBrain: Algorithm Benchmarking and Model Optimisation.” The Lancet. Digital Health 6 (3): e211–21. Giedd, J. N., J. Blumenthal, N. O. Jeffries, J. C. Rajapakse, A. C. Vaituzis, H. Liu, Y. C. Berry, M. Tobin, J. Nelson, and F. X. Castellanos. 1999. “Development of the Human Corpus Callosum during Childhood and Adolescence: A Longitudinal MRI Study.” Progress in Neuro-Psychopharmacology & Biological Psychiatry 23 (4): 571–88. Gilmore, Rick O., Michele T. Diaz, Brad A. Wyble, and Tal Yarkoni. 2017. “Progress toward Openness, Transparency, and Reproducibility in Cognitive Neuroscience.” Annals of the New York Academy of Sciences 1396 (1): 5–18. Goldman-Mellor, Sidra J., Avshalom Caspi, Honalee Harrington, Sean Hogan, Shyamala Nada-Raja, Richie Poulton, and Terrie E. Moffitt. 2014. “Suicide Attempt in Young People: A Signal for Long-Term Health Care and Social Needs.” JAMA Psychiatry 71 (2): 119–27. Goodman, Steven N., Daniele Fanelli, and John P. A. Ioannidis. 2016. “What Does Research Reproducibility Mean?” Science Translational Medicine 8 (341): 341ps12. Gravseth, H. M., L. Mehlum, T. Bjerkedal, and P. Kristensen. 2010. 
“Suicide in Young Norwegians in a Life Course Perspective: Population-Based Cohort Study.” Journal of Epidemiology and Community Health 64 (5): 407–12. Grothe, Michel J., Sylvia Villeneuve, Martin Dyrba, David Bartrés-Faz, Miranka Wirth, and Alzheimer’s Disease Neuroimaging Initiative. 2017. “Multimodal Characterization of Older APOE2 Carriers Reveals Selective Reduction of Amyloid Load.” Neurology 88 (6): 569–76. Guillaume, Bryan, Xue Hua, Paul M. Thompson, Lourens Waldorp, Thomas E. Nichols, and Alzheimer’s Disease Neuroimaging Initiative. 2014. “Fast and Accurate Modelling of Longitudinal and Repeated Measures Neuroimaging Data.” NeuroImage 94 (July):287–302. Gunning-Dixon, Faith M., Adam M. Brickman, Janice C. Cheng, and George S. Alexopoulos. 2009. “Aging of Cerebral White Matter: A Review of MRI Findings.” International Journal of Geriatric Psychiatry 24 (2): 109–17. Haacke, E. Mark, Saifeng Liu, Sagar Buch, Weili Zheng, Dongmei Wu, and Yongquan Ye. 2015. “Quantitative Susceptibility Mapping: Current Status and Future Directions.” Magnetic Resonance Imaging 33 (1): 1–25. Haddad, Elizabeth, Fabrizio Pizzagalli, Alyssa H. Zhu, Ravi R. Bhatt, Tasfiya Islam, Iyad Ba Gari, Daniel Dixon, Sophia I. Thomopoulos, Paul M. Thompson, and Neda Jahanshad. 2023. “Multisite Test-Retest Reliability and Compatibility of Brain Metrics Derived from FreeSurfer Versions 7.1, 6.0, and 5.3.” Human Brain Mapping 44 (4): 1515–32. Hagler, Donald J., Jr, Seann Hatton, M. Daniela Cornejo, Carolina Makowski, Damien A. Fair, Anthony Steven Dick, Matthew T. Sutherland, et al. 2019. “Image Processing and Analysis Methods for the Adolescent Brain Cognitive Development Study.” NeuroImage 202 (November):116091. Han, S. Duke, and Mark W. Bondi. 2008. “Revision of the Apolipoprotein E Compensatory Mechanism Recruitment Hypothesis.” Alzheimer’s & Dementia: The Journal of the Alzheimer's Association 4 (4): 251–54. 
Harrison, Judith R., Sanchita Bhatia, Zhao Xuan Tan, Anastasia Mirza-Davies, Hannah Benkert, Chantal M. W. Tax, and Derek K. Jones. 2020. “Imaging Alzheimer’s Genetic Risk Using Diffusion MRI: A Systematic Review.” NeuroImage. Clinical 27 (July):102359. Hasan, Khader M., Arash Kamali, Humaira Abid, Larry A. Kramer, Jack M. Fletcher, and Linda Ewing-Cobbs. 2010. “Quantification of the Spatiotemporal Microstructural Organization of the Human Brain Association, Projection and Commissural Pathways across the Lifespan Using Diffusion Tensor Tractography.” Brain Structure & Function 214 (4): 361–73. Hastie, Trevor, and Robert Tibshirani. 1986. “Generalized Additive Models.” Statistical Science: A Review Journal of the Institute of Mathematical Statistics 1 (3): 297–310. Hatton, Sean N., Khoa H. Huynh, Leonardo Bonilha, Eugenio Abela, Saud Alhusaini, Andre Altmann, Marina K. M. Alvim, et al. 2020. “White Matter Abnormalities across Different Epilepsy Syndromes in Adults: An ENIGMA-Epilepsy Study.” Brain: A Journal of Neurology 143 (8): 2454–73. Hawton, Keith, Kate E. A. Saunders, and Rory C. O’Connor. 2012. “Self-Harm and Suicide in Adolescents.” The Lancet 379 (9834): 2373–82. Heron, Melonie. 2017. “Deaths: Leading Causes for 2015.” National Vital Statistics Reports: From the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System 66 (5): 1–76. ———. 2019. “Deaths: Leading Causes for 2017.” National Vital Statistics Reports: From the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System 68 (6): 1–77. ———. 2021. “Deaths: Leading Causes for 2019.” National Vital Statistics Reports: From the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System 70 (9): 1–114. Herting, Megan M., Emily C. Maxwell, Christy Irvine, and Bonnie J. Nagel. 2012. 
“The Impact of Sex, Puberty, and Hormones on White Matter Microstructure in Adolescents.” Cerebral Cortex 22 (9): 1979–92. Hibar, Derrek P., Jason L. Stein, Miguel E. Renteria, Alejandro Arias-Vasquez, Sylvane Desrivières, Neda Jahanshad, Roberto Toro, et al. 2015. “Common Genetic Variants Influence Human Subcortical Brain Structures.” Nature 520 (7546): 224–29. Horng, Hannah, Apurva Singh, Bardia Yousefi, Eric A. Cohen, Babak Haghighi, Sharyn Katz, Peter B. Noël, Russell T. Shinohara, and Despina Kontos. 2022. “Generalized ComBat Harmonization Methods for Radiomic Features with Multi-Modal Distributions and Multiple Batch Effects.” Scientific Reports 12 (1): 4493. Hostage, Christopher A., Kingshuk Roy Choudhury, Pudugramam Murali Doraiswamy, Jeffrey R. Petrella, and Alzheimer’s Disease Neuroimaging Initiative. 2013. “Dissecting the Gene Dose-Effects of the APOE ε4 and ε2 Alleles on Hippocampal Volumes in Aging and Alzheimer’s Disease.” PloS One 8 (2): e54483. Hua, Xue, Derrek P. Hibar, Christopher R. K. Ching, Christina P. Boyle, Priya Rajagopalan, Boris A. Gutman, Alex D. Leow, et al. 2013. “Unbiased Tensor-Based Morphometry: Improved Robustness and Sample Size Estimates for Alzheimer’s Disease Clinical Trials.” NeuroImage 66 (February):648–61. Hua, Xue, Alex D. Leow, Neelroop Parikshak, Suh Lee, Ming-Chang Chiang, Arthur W. Toga, Clifford R. Jack Jr, Michael W. Weiner, Paul M. Thompson, and Alzheimer’s Disease Neuroimaging Initiative. 2008. “Tensor-Based Morphometry as a Neuroimaging Biomarker for Alzheimer’s Disease: An MRI Study of 676 AD, MCI, and Normal Subjects.” NeuroImage 43 (3): 458–69. Hynd, G. W., M. Semrud-Clikeman, A. R. Lorys, E. S. Novey, D. Eliopulos, and H. Lyytinen. 1991. “Corpus Callosum Morphology in Attention Deficit-Hyperactivity Disorder: Morphometric Analysis of MRI.” Journal of Learning Disabilities 24 (3): 141–46. Isensee, Fabian, Marianne Schell, Irada Pflueger, Gianluca Brugnara, David Bonekamp, Ulf Neuberger, Antje Wick, et al. 
2019. “Automated Brain Extraction of Multisequence MRI Using Artificial Neural Networks.” Human Brain Mapping 40 (17): 4952–64. Jack, Clifford R., Jr, Josephine Barnes, Matt A. Bernstein, Bret J. Borowski, James Brewer, Shona Clegg, Anders M. Dale, et al. 2015. “Magnetic Resonance Imaging in Alzheimer’s Disease Neuroimaging Initiative 2.” Alzheimer’s & Dementia: The Journal of the Alzheimer's Association 11 (7): 740–56. Jack, Clifford R., Jr, David S. Knopman, William J. Jagust, Ronald C. Petersen, Michael W. Weiner, Paul S. Aisen, Leslie M. Shaw, et al. 2013. “Tracking Pathophysiological Processes in Alzheimer’s Disease: An Updated Hypothetical Model of Dynamic Biomarkers.” Lancet Neurology 12 (2): 207–16. Jahanshad, Neda, Joshua Faskowitz, Gennady Roshchupkin, Derrek P. Hibar, Boris A. Gutman, Nicholas J. Tustison, Hieab H. H. Adams, et al. 2019. “Multi-Site Meta-Analysis of Morphometry.” IEEE/ACM Transactions on Computational Biology and Bioinformatics / IEEE, ACM 16 (5): 1508–14. Jahanshad, Neda, Peter V. Kochunov, Emma Sprooten, René C. Mandl, Thomas E. Nichols, Laura Almasy, John Blangero, et al. 2013. “Multi-Site Genetic Analysis of Diffusion Images and Voxelwise Heritability Analysis: A Pilot Project of the ENIGMA-DTI Working Group.” NeuroImage 81 (November):455–69. Jahanshad, Neda, Omid Kohannim, Derrek P. Hibar, Jason L. Stein, Katie L. McMahon, Greig I. de Zubicaray, Sarah E. Medland, et al. 2012. “Brain Structure in Healthy Adults Is Related to Serum Transferrin and the H63D Polymorphism in the HFE Gene.” Proceedings of the National Academy of Sciences of the United States of America 109 (14): E851–59. Jahanshad, Neda, Omid Kohannim, Arthur W. Toga, Katie L. McMahon, Greig I. de Zubicaray, Narelle K. Hansell, Grant W. Montgomery, Nicholas G. Martin, Margaret J. Wright, and Paul M. Thompson. 2012. “DIFFUSION IMAGING PROTOCOL EFFECTS ON GENETIC ASSOCIATIONS.” Proceedings / IEEE International Symposium on Biomedical Imaging: From Nano to Macro. 
IEEE International Symposium on Biomedical Imaging, 944–47. Jahanshad, Neda, Habib Ganjgahi, Janita Bralten, Anouk den Braber, Joshua Faskowitz, Annchen R. Knodt, Hervé Lemaitre, et al. 2017. “Do Candidate Genes Affect the Brain’s White Matter Microstructure? Large-Scale Evaluation of 6,165 Diffusion MRI Scans.” bioRxiv. https://doi.org/10.1101/107987. Jelescu, Ileana O., and Matthew D. Budde. 2017. “Design and Validation of Diffusion MRI Models of White Matter.” Frontiers in Physics 28 (November). https://doi.org/10.3389/fphy.2017.00061. Jenkinson, Mark, Peter Bannister, Michael Brady, and Stephen Smith. 2002. “Improved Optimization for the Robust and Accurate Linear Registration and Motion Correction of Brain Images.” NeuroImage 17 (2): 825–41. Jenkinson, Mark, Christian F. Beckmann, Timothy E. J. Behrens, Mark W. Woolrich, and Stephen M. Smith. 2012. “FSL.” NeuroImage 62 (2): 782–90. Jernigan, Terry L., Timothy T. Brown, Donald J. Hagler Jr, Natacha Akshoomoff, Hauke Bartsch, Erik Newman, Wesley K. Thompson, et al. 2016. “The Pediatric Imaging, Neurocognition, and Genetics (PING) Data Repository.” NeuroImage 124 (Pt B): 1149–54. Johnson, W. Evan, Cheng Li, and Ariel Rabinovic. 2007. “Adjusting Batch Effects in Microarray Expression Data Using Empirical Bayes Methods.” Biostatistics 8 (1): 118–27. Jollant, Fabrice, Gerd Wagner, Stéphane Richard-Devantoy, Stefanie Köhler, Karl-Jürgen Bär, Gustavo Turecki, and Fabricio Pereira. 2018. “Neuroimaging-Informed Phenotypes of Suicidal Behavior: A Family History of Suicide and the Use of a Violent Suicidal Means.” Translational Psychiatry 8 (1): 120. Jorm, Anthony F., Karen A. Mather, Peter Butterworth, Kaarin J. Anstey, Helen Christensen, and Simon Easteal. 2007. “APOE Genotype and Cognitive Functioning in a Large Age-Stratified Population Sample.” Neuropsychology 21 (1): 1–8. Joshi, Shantanu H., Katherine L. Narr, Owen R. Philips, Keith H. Nuechterlein, Robert F. Asarnow, Arthur W. 
Toga, and Roger P. Woods. 2013. “Statistical Shape Analysis of the Corpus Callosum in Schizophrenia.” NeuroImage 64 (January):547–59. Jovicich, Jorge, Silvester Czanner, Xiao Han, David Salat, Andre van der Kouwe, Brian Quinn, Jenni Pacheco, et al. 2009. “MRI-Derived Measurements of Human Subcortical, Ventricular and Intracranial Brain Volumes: Reliability Effects of Scan Sessions, Acquisition Sequences, Data Analyses, Scanner Upgrade, Scanner Vendors and Field Strengths.” NeuroImage 46 (1): 177–92. Jovicich, Jorge, Moira Marizzoni, Beatriz Bosch, David Bartrés-Faz, Jennifer Arnold, Jens Benninghoff, Jens Wiltfang, et al. 2014. “Multisite Longitudinal Reliability of Tract-Based Spatial Statistics in Diffusion Tensor Imaging of Healthy Elderly Subjects.” NeuroImage 101 (November):390–403. Kagerer, Sonja M., Jiri M. G. van Bergen, Xu Li, Frances C. Quevenco, Anton F. Gietl, Sandro Studer, Valerie Treyer, et al. 2020. “APOE4 Moderates Effects of Cortical Iron on Synchronized Default Mode Network Activity in Cognitively Healthy Old-Aged Adults.” Alzheimer’s & Dementia: The Journal of the Alzheimer's Association 12 (1): e12002. Kaufman, J., Birmaher, B., et al., 2013. KSADS-PL. Yale University, New Haven, CT. Kingsley, Peter B., and W. Gordon Monahan. 2004. “Selection of the Optimum B Factor for Diffusion-Weighted Magnetic Resonance Imaging Assessment of Ischemic Stroke.” Magnetic Resonance in Medicine: Official Journal of the Society of Magnetic Resonance in Medicine / Society of Magnetic Resonance in Medicine 51 (5): 996–1001. Kobak, K.A., Kaufman, J., 2015. Ksads-comp. Center for Telepsychology, Madison, WI. Kochunov, Peter, David C. Glahn, Thomas E. Nichols, Anderson M. Winkler, Elliot L. Hong, Henry H. Holcomb, Jason L. Stein, et al. 2011. “Genetic Analysis of Cortical Thickness and Fractional Anisotropy of Water Diffusion in the Brain.” Frontiers in Neuroscience 5 (October):120. Kochunov, Peter, L. Elliot Hong, Emily L. Dennis, Rajendra A. Morey, David F. 
Tate, Elisabeth A. Wilde, Mark Logue, et al. 2022. “ENIGMA-DTI: Translating Reproducible White Matter Deficits into Personalized Vulnerability Metrics in Cross-Diagnostic Psychiatric Research.” Human Brain Mapping 43 (1): 194–206. Kochunov, Peter, Artemis Zavaliangos-Petropulu, Neda Jahanshad, Paul M. Thompson, Meghann C. Ryan, Joshua Chiappelli, Shuo Chen, et al. 2021. “A White Matter Connection of Schizophrenia and Alzheimer’s Disease.” Schizophrenia Bulletin 47 (1): 197–206. Kochunov, P., D. E. Williamson, J. Lancaster, P. Fox, J. Cornell, J. Blangero, and D. C. Glahn. 2012. “Fractional Anisotropy of Water Diffusion in Cerebral White Matter across the Lifespan.” Neurobiology of Aging 33 (1): 9–20. Krogsrud, Stine K., Anders M. Fjell, Christian K. Tamnes, Håkon Grydeland, Lia Mork, Paulina Due-Tønnessen, Atle Bjørnerud, et al. 2016. “Changes in White Matter Microstructure in the Developing Brain--A Longitudinal Diffusion Tensor Imaging Study of Children from 4 to 11 Years of Age.” NeuroImage 124 (Pt A): 473–86. Kuznetsova, Alexandra, Per B. Brockhoff, and Rune H. B. Christensen. 2017. “LmerTest Package: Tests in Linear Mixed Effects Models.” Journal of Statistical Software 82 (13). https://doi.org/10.18637/jss.v082.i13. LaMontagne, Pamela J., Tammie L. S. Benzinger, John C. Morris, Sarah Keefe, Russ Hornbeck, Chengjie Xiong, Elizabeth Grant, et al. 2019. “OASIS-3: Longitudinal Neuroimaging, Clinical, and Cognitive Dataset for Normal Aging and Alzheimer Disease.” medRxiv. https://doi.org/10.1101/2019.12.13.19014902. Lawrence, Katherine E., Zvart Abaryan, Emily Laltoo, Leanna M. Hernandez, Michael J. Gandal, James T. McCracken, and Paul M. Thompson. 2023. “White Matter Microstructure Shows Sex Differences in Late Childhood: Evidence from 6797 Children.” Human Brain Mapping 44 (2): 535–48. Lebel, C., M. Gee, R. Camicioli, M. Wieler, W. Martin, and C. Beaulieu. 2012. 
“Diffusion Tensor Imaging of White Matter Tract Evolution over the Lifespan.” NeuroImage 60 (1): 340–52. Lebel, C., L. Walker, A. Leemans, L. Phillips, and C. Beaulieu. 2008. “Microstructural Maturation of the Human Brain from Childhood to Adulthood.” NeuroImage 40 (3): 1044–55. Lemaitre, Herve, Aaron L. Goldman, Fabio Sambataro, Beth A. Verchinski, Andreas Meyer-Lindenberg, Daniel R. Weinberger, and Venkata S. Mattay. 2012. “Normal Age-Related Brain Morphometric Changes: Nonuniformity across Cortical Thickness, Surface Area and Gray Matter Volume?” Neurobiology of Aging 33 (3): 617.e1–9. Lenroot, Rhoshel K., Nitin Gogtay, Deanna K. Greenstein, Elizabeth Molloy Wells, Gregory L. Wallace, Liv S. Clasen, Jonathan D. Blumenthal, et al. 2007. “Sexual Dimorphism of Brain Developmental Trajectories during Childhood and Adolescence.” NeuroImage 36 (4): 1065–73. Leow, Alex D., Igor Yanovsky, Ming-Chang Chiang, Agatha D. Lee, Andrea D. Klunder, Allen Lu, James T. Becker, Simon W. Davis, Arthur W. Toga, and Paul M. Thompson. 2007. “Statistical Properties of Jacobian Maps and the Realization of Unbiased Large-Deformation Nonlinear Image Registration.” IEEE Transactions on Medical Imaging 26 (6): 822–32. Lepore, N., C. Brun, Y. Y. Chou, M. C. Chiang, R. A. Dutton, K. M. Hayashi, E. Luders, et al. 2008. “Generalized Tensor-Based Morphometry of HIV/AIDS Using Multivariate Statistics on Deformation Tensors.” IEEE Transactions on Medical Imaging 27 (1): 129–41. LeWinn, Kaja Z., Margaret A. Sheridan, Katherine M. Keyes, Ava Hamilton, and Katie A. McLaughlin. 2017. “Sample Composition Alters Associations between Age and Brain Structure.” Nature Communications 8 (1): 874. Li, Cheng, and Wing Hung Wong. 2003. “DNA-Chip Analyzer (dChip).” In Statistics for Biology and Health, 120–41. Statistics for Biology and Health. New York, NY: Springer New York. Liu, Chia-Chen, Chia-Chan Liu, Takahisa Kanekiyo, Huaxi Xu, and Guojun Bu. 2013. 
“Apolipoprotein E and Alzheimer Disease: Risk, Mechanisms and Therapy.” Nature Reviews. Neurology 9 (2): 106–18. Liu, Tian, Cynthia Wisnieff, Min Lou, Weiwei Chen, Pascal Spincemaille, and Yi Wang. 2013. “Nonlinear Formulation of the Magnetic Field to Source Relationship for Robust Quantitative Susceptibility Mapping.” Magnetic Resonance in Medicine: Official Journal of the Society of Magnetic Resonance in Medicine / Society of Magnetic Resonance in Medicine 69 (2): 467–76. Li, Zonghua, Francis Shue, Na Zhao, Mitsuru Shinohara, and Guojun Bu. 2020. “APOE2: Protective Mechanism and Therapeutic Implications for Alzheimer’s Disease.” Molecular Neurodegeneration 15 (1): 63. López-Vicente, Mónica, Sander Lamballais, Suzanne Louwen, Manon Hillegers, Henning Tiemeier, Ryan L. Muetzel, and Tonya White. 2021. “White Matter Microstructure Correlates of Age, Sex, Handedness and Motor Ability in a Population-Based Sample of 3031 School-Age Children.” NeuroImage 227 (February):117643. Luders, Eileen, Paul M. Thompson, and Arthur W. Toga. 2010. “The Development of the Corpus Callosum in the Healthy Human Brain.” The Journal of Neuroscience: The Official Journal of the Society for Neuroscience 30 (33): 10985–90. Lupton, Michelle K., Lachlan Strike, Narelle K. Hansell, Wei Wen, Karen A. Mather, Nicola J. Armstrong, Anbupalam Thalamuthu, et al. 2016. “The Effect of Increased Genetic Risk for Alzheimer’s Disease on Hippocampal and Amygdala Volume.” Neurobiology of Aging 40 (April):68–77. Lyoo, I. K., A. Satlin, C. K. Lee, and P. F. Renshaw. 1997. “Regional Atrophy of the Corpus Callosum in Subjects with Alzheimer’s Disease and Multi-Infarct Dementia.” Psychiatry Research 74 (2): 63–72. Machilsen, Bart, Emiliano d’Agostino, Frederik Maes, Dirk Vandermeulen, Horst K. Hahn, Lieven Lagae, and Peter Stiers. 2007. “Linear Normalization of MR Brain Images in Pediatric Patients with Periventricular Leukomalacia.” NeuroImage 35 (2): 686–97. 
Manjón, José V., Pierrick Coupé, Luis Martí-Bonmatí, D. Louis Collins, and Montserrat Robles. 2010. “Adaptive Non-Local Means Denoising of MR Images with Spatially Varying Noise Levels.” Journal of Magnetic Resonance Imaging: JMRI 31 (1): 192–203. Marzi, Chiara, Marco Giannelli, Andrea Barucci, Carlo Tessa, Mario Mascalchi, and Stefano Diciotti. 2024. “Efficacy of MRI Data Harmonization in the Age of Machine Learning: A Multicenter Study across 36 Datasets.” Scientific Data 11 (1): 115. McGibney, G., M. R. Smith, S. T. Nichols, and A. Crawley. 1993. “Quantitative Evaluation of Several Partial Fourier Reconstruction Algorithms Used in MRI.” Magnetic Resonance in Medicine: Official Journal of the Society of Magnetic Resonance in Medicine / Society of Magnetic Resonance in Medicine 30 (1): 51–59. McGirr, Alexander, Martin Alda, Monique Séguin, Sophie Cabot, Alain Lesage, and Gustavo Turecki. 2009. “Familial Aggregation of Suicide Explained by Cluster B Traits: A Three-Group Family Study of Suicide Controlling for Major Depressive Disorder.” The American Journal of Psychiatry 166 (10): 1124–34. Metzler-Baddeley, Claudia, Michael J. O’Sullivan, Sonya Bells, Ofer Pasternak, and Derek K. Jones. 2012. “How and How Not to Correct for CSF-Contamination in Diffusion MRI.” NeuroImage 59 (2): 1394–1403. Miller, Karla L., Fidel Alfaro-Almagro, Neal K. Bangerter, David L. Thomas, Essa Yacoub, Junqian Xu, Andreas J. Bartsch, et al. 2016. “Multimodal Population Brain Imaging in the UK Biobank Prospective Epidemiological Study.” Nature Neuroscience 19 (11): 1523–36. Mitchell, Ann M., Teresa J. Sakraida, Yookyung Kim, Leann Bullian, and Laurel Chiappetta. 2009. “Depression, Anxiety and Quality of Life in Suicide Survivors: A Comparison of Close and Distant Relationships.” Archives of Psychiatric Nursing 23 (1): 2–10. Mittendorfer-Rutz, Ellenor, Finn Rasmussen, and Theis Lange. 2012. 
“A Life-Course Study on Effects of Parental Markers of Morbidity and Mortality on Offspring’s Suicide Attempt.” PloS One 7 (12): e51585. Mori, Susumu, Kenichi Oishi, Hangyi Jiang, Li Jiang, Xin Li, Kazi Akhter, Kegang Hua, et al. 2008. “Stereotaxic White Matter Atlas Based on Diffusion Tensor Imaging in an ICBM Template.” NeuroImage 40 (2): 570–82. Munafò, Marcus R., Brian A. Nosek, Dorothy V. M. Bishop, Katherine S. Button, Christopher D. Chambers, Nathalie Percie du Sert, Uri Simonsohn, Eric-Jan Wagenmakers, Jennifer J. Ware, and John P. A. Ioannidis. 2017. “A Manifesto for Reproducible Science.” Nature Human Behaviour 1 (1): 0021. Nir, Talia M., Neda Jahanshad, Julio E. Villalon-Reina, Arthur W. Toga, Clifford R. Jack, Michael W. Weiner, Paul M. Thompson, and Alzheimer’s Disease Neuroimaging Initiative (ADNI). 2013. “Effectiveness of Regional DTI Measures in Distinguishing Alzheimer’s Disease, MCI, and Normal Aging.” NeuroImage. Clinical 3 (July):180–95. Nock, Matthew K., Guilherme Borges, Evelyn J. Bromet, Jordi Alonso, Matthias Angermeyer, Annette Beautrais, Ronny Bruffaerts, et al. 2008. “Cross-National Prevalence and Risk Factors for Suicidal Ideation, Plans and Attempts.” The British Journal of Psychiatry: The Journal of Mental Science 192 (2): 98–105. O’Dwyer, Laurence, Franck Lamberton, Silke Matura, Colby Tanner, Monika Scheibe, Julia Miller, Dan Rujescu, David Prvulovic, and Harald Hampel. 2012. “Reduced Hippocampal Volume in Healthy Young ApoE4 Carriers: An MRI Study.” PloS One 7 (11): e48895. Otsu, Nobuyuki. 1979. “A Threshold Selection Method from Gray-Level Histograms.” IEEE Transactions on Systems, Man, and Cybernetics 9 (1): 62–66. Papinutto, Nico Dario, Francesca Maule, and Jorge Jovicich. 2013. “Reproducibility and Biases in High Field Brain Diffusion MRI: An Evaluation of Acquisition and Analysis Variables.” Magnetic Resonance Imaging 31 (6): 827–39. Parkinson Progression Marker Initiative. 2011. 
“The Parkinson Progression Marker Initiative (PPMI).” Progress in Neurobiology 95 (4): 629–35. Pietrasik, Wojciech, Ivor Cribben, Fraser Olsen, Yushan Huang, and Nikolai V. Malykhin. 2020. “Diffusion Tensor Imaging of the Corpus Callosum in Healthy Aging: Investigating Higher Order Polynomial Regression Modelling.” NeuroImage 213 (June):116675. Pomponio, Raymond, Guray Erus, Mohamad Habes, Jimit Doshi, Dhivya Srinivasan, Elizabeth Mamourian, Vishnu Bashyam, et al. 2020a. “Harmonization of Large MRI Datasets for the Analysis of Brain Imaging Patterns throughout the Lifespan.” NeuroImage 208 (March):116450. ———. 2020b. “Harmonization of Large MRI Datasets for the Analysis of Brain Imaging Patterns throughout the Lifespan.” NeuroImage 208 (March):116450. Qiu, Jiang. 2016. “SLIM.” Child Mind Institute. https://doi.org/10.15387/fcp_indi.retro.slim. Radua, Joaquim, Eduard Vieta, Russell Shinohara, Peter Kochunov, Yann Quidé, Melissa J. Green, Cynthia S. Weickert, et al. 2020. “Increased Power by Harmonizing Structural MRI Site Differences with the ComBat Batch Adjustment Method in ENIGMA.” NeuroImage 218 (September):116956. Ravanfar, Parsa, Samantha M. Loi, Warda T. Syeda, Tamsyn E. Van Rheenen, Ashley I. Bush, Patricia Desmond, Vanessa L. Cropley, et al. 2021. “Systematic Review: Quantitative Susceptibility Mapping (QSM) of Brain Iron Profile in Neurodegenerative Diseases.” Frontiers in Neuroscience 15 (February):618435. Raven, Erika P., Po H. Lu, Todd A. Tishler, Panthea Heydari, and George Bartzokis. 2013. “Increased Iron Levels and Decreased Tissue Integrity in Hippocampus of Alzheimer’s Disease Detected in Vivo with Magnetic Resonance Imaging.” Journal of Alzheimer’s Disease: JAD 37 (1): 127–36. Reuter, Martin, Nicholas J. Schmansky, H. Diana Rosas, and Bruce Fischl. 2012. “Within-Subject Template Estimation for Unbiased Longitudinal Image Analysis.” NeuroImage 61 (4): 1402–18. Roy, A., G. Rylander, and M. Sarchiapone. 1997. “Genetics of Suicides. 
Family Studies and Molecular Genetics.” Annals of the New York Academy of Sciences 836 (December):135–57. Roy, A., and N. L. Segal. 2001. “Suicidal Behavior in Twins: A Replication.” Journal of Affective Disorders 66 (1): 71–74. Rutherford, Saige, Charlotte Fraza, Richard Dinga, Seyed Mostafa Kia, Thomas Wolfers, Mariam Zabihi, Pierre Berthet, et al. 2022. “Charting Brain Growth and Aging at High Spatial Precision.” eLife 11 (February). https://doi.org/10.7554/eLife.72904. Salimi, Y., D. Domingo-Fernández, M. Hofmann-Apitius, and C. Birkenbihl. 2024. “Data-Driven Thresholding Statistically Biases ATN Profiling across Cohort Datasets.” The Journal of Prevention of Alzheimer’s Disease 11 (1): 185–95. Santos-Ferreira, Cátia, Rui Baptista, Manuel Oliveira-Santos, Regina Costa, José Pereira Moura, and Lino Gonçalves. 2019. “Apolipoprotein E2 Genotype Is Associated with a 2-Fold Increase in the Incidence of Type 2 Diabetes Mellitus: Results from a Long-Term Observational Study.” Journal of Lipids 2019 (August):1698610. Schiepers, O. J. G., S. E. Harris, A. J. Gow, A. Pattie, C. E. Brett, J. M. Starr, and I. J. Deary. 2012. “APOE E4 Status Predicts Age-Related Cognitive Decline in the Ninth Decade: Longitudinal Follow-up of the Lothian Birth Cohort 1921.” Molecular Psychiatry 17 (3): 315–24. Schmaal, Lianne, Anne-Laura van Harmelen, Vasiliki Chatzi, Elizabeth T. C. Lippard, Yara J. Toenders, Lynnette A. Averill, Carolyn M. Mazure, and Hilary P. Blumberg. 2020a. “Imaging Suicidal Thoughts and Behaviors: A Comprehensive Review of 2 Decades of Neuroimaging Studies.” Molecular Psychiatry 25 (2): 408–27. ———. 2020b. “Imaging Suicidal Thoughts and Behaviors: A Comprehensive Review of 2 Decades of Neuroimaging Studies.” Molecular Psychiatry 25 (2): 408–27. Shafto, Meredith A., Lorraine K. Tyler, Marie Dixon, Jason R. Taylor, James B. Rowe, Rhodri Cusack, Andrew J. Calder, et al. 2014. 
“The Cambridge Centre for Ageing and Neuroscience (Cam-CAN) Study Protocol: A Cross-Sectional, Lifespan, Multidisciplinary Examination of Healthy Cognitive Ageing.” BMC Neurology 14 (October):204. Siqueira Pinto, Maíra, Stefan Winzeck, Evgenios N. Kornaropoulos, Sophie Richter, Roberto Paolella, Marta M. Correia, Ben Glocker, et al. 2023. “Use of Support Vector Machines Approach via ComBat Harmonized Diffusion Tensor Imaging for the Diagnosis and Prognosis of Mild Traumatic Brain Injury: A CENTER-TBI Study.” Journal of Neurotrauma 40 (13-14): 1317–38. Smith, Stephen M., Mark Jenkinson, Heidi Johansen-Berg, Daniel Rueckert, Thomas E. Nichols, Clare E. Mackay, Kate E. Watkins, et al. 2006. “Tract-Based Spatial Statistics: Voxelwise Analysis of Multi-Subject Diffusion Data.” NeuroImage 31 (4): 1487–1505. Smith, Stephen M., and Thomas E. Nichols. 2018. “Statistical Challenges in ‘Big Data’ Human Neuroimaging.” Neuron 97 (2): 263–68. Smith, Stephen M., Diego Vidaurre, Fidel Alfaro-Almagro, Thomas E. Nichols, and Karla L. Miller. 2019. “Estimation of Brain Age Delta from Brain Imaging.” NeuroImage 200 (October):528–39. Soares, José M., Paulo Marques, Victor Alves, and Nuno Sousa. 2013. “A Hitchhiker’s Guide to Diffusion Tensor Imaging.” Frontiers in Neuroscience 7 (March):31. Sokolowski, Marcus, Jerzy Wasserman, and Danuta Wasserman. 2014. “Genome-Wide Association Studies of Suicidal Behaviors: A Review.” European Neuropsychopharmacology: The Journal of the European College of Neuropsychopharmacology 24 (10): 1567–77. Somerville, Leah H., Susan Y. Bookheimer, Randy L. Buckner, Gregory C. Burgess, Sandra W. Curtiss, Mirella Dapretto, Jennifer Stine Elam, et al. 2018. “The Lifespan Human Connectome Project in Development: A Large-Scale Study of Brain Connectivity Development in 5-21 Year Olds.” NeuroImage 183 (December):456–68. Statham, D. J., A. C. Heath, P. A. Madden, K. K. Bucholz, L. Bierut, S. H. Dinwiddie, W. S. Slutske, M. P. Dunne, and N. G. Martin. 1998. 
“Suicidal Behaviour: An Epidemiological and Genetic Study.” Psychological Medicine 28 (4): 839–55. Sun, Delin, Gopalkumar Rakesh, Courtney C. Haswell, Mark Logue, C. Lexi Baird, Erin N. O’Leary, Andrew S. Cotton, et al. 2022. “A Comparison of Methods to Harmonize Cortical Thickness Measurements across Scanners and Sites.” NeuroImage 261 (November):119509. Teipel, Stefan J., Sigrid Reuter, Bram Stieltjes, Julio Acosta-Cabronero, Ulrike Ernemann, Andreas Fellgiebel, Massimo Filippi, et al. 2011. “Multicenter Stability of Diffusion Tensor Imaging Measures: A European Clinical and Physical Phantom Study.” Psychiatry Research 194 (3): 363–71. Tesseur, I., J. Van Dorpe, K. Bruynseels, F. Bronfman, R. Sciot, A. Van Lommel, and F. Van Leuven. 2000. “Prominent Axonopathy and Disruption of Axonal Transport in Transgenic Mice Expressing Human Apolipoprotein E4 in Neurons of Brain and Spinal Cord.” The American Journal of Pathology 157 (5): 1495–1510. “The Conception of the ABCD Study: From Substance Use to a Broad NIH Collaboration.” 2018. Developmental Cognitive Neuroscience 32 (August):4–7. Thompson, Paul M., Ole A. Andreassen, Alejandro Arias-Vasquez, Carrie E. Bearden, Premika S. Boedhoe, Rachel M. Brouwer, Randy L. Buckner, et al. 2017. “ENIGMA and the Individual: Predicting Factors That Affect the Brain in 35 Countries Worldwide.” NeuroImage 145 (Pt B): 389–408. Thompson, Paul M., Neda Jahanshad, Lianne Schmaal, Jessica A. Turner, Anderson M. Winkler, Sophia I. Thomopoulos, Gary F. Egan, and Peter Kochunov. 2022. “The Enhancing NeuroImaging Genetics through Meta-Analysis Consortium: 10 Years of Global Collaborations in Human Brain Mapping.” Human Brain Mapping 43 (1): 15–22. Thompson, Paul M., Jason L. Stein, Sarah E. Medland, Derrek P. Hibar, Alejandro Arias Vasquez, Miguel E. Renteria, Roberto Toro, et al. 2014. “The ENIGMA Consortium: Large-Scale Collaborative Analyses of Neuroimaging and Genetic Data.” Brain Imaging and Behavior 8 (2): 153–82. Thompson, P. 
M., J. N. Giedd, R. P. Woods, D. MacDonald, A. C. Evans, and A. W. Toga. 2000. “Growth Patterns in the Developing Brain Detected by Using Continuum Mechanical Tensor Maps.” Nature 404 (6774): 190–93. Tomimoto, Hidekazu, Jin-Xi Lin, Akinori Matsuo, Masafumi Ihara, Ryo Ohtani, Masunari Shibata, Yukio Miki, and Hiroshi Shibasaki. 2004. “Different Mechanisms of Corpus Callosum Atrophy in Alzheimer’s Disease and Vascular Dementia.” Journal of Neurology 251 (4): 398–406. Tremblay-Mercier, Jennifer, Cécile Madjar, Samir Das, Alexa Pichet Binette, Stephanie O. M. Dyke, Pierre Étienne, Marie-Elyse Lafaille-Magnan, et al. 2021. “Open Science Datasets from PREVENT-AD, a Longitudinal Cohort of Pre-Symptomatic Alzheimer’s Disease.” NeuroImage. Clinical 31 (June):102733. Tustison, Nicholas J., Andrew J. Holbrook, Brian B. Avants, Jared M. Roberts, Philip A. Cook, Zachariah M. Reagh, Jeffrey T. Duda, et al. 2019. “Longitudinal Mapping of Cortical Thickness Measurements: An Alzheimer’s Disease Neuroimaging Initiative-Based Evaluation Study.” Journal of Alzheimer’s Disease: JAD 71 (1): 165–83. Van Essen, David C., Stephen M. Smith, Deanna M. Barch, Timothy E. J. Behrens, Essa Yacoub, Kamil Ugurbil, and WU-Minn HCP Consortium. 2013. “The WU-Minn Human Connectome Project: An Overview.” NeuroImage 80 (October):62–79. Venkatraman, Vijay K., Christopher E. Gonzalez, Bennett Landman, Joshua Goh, David A. Reiter, Yang An, and Susan M. Resnick. 2015. “Region of Interest Correction Factors Improve Reliability of Diffusion Imaging Measures within and across Scanners and Field Strengths.” NeuroImage 119 (October):406–16. Vidal, Christine N., Rob Nicolson, Timothy J. DeVito, Kiralee M. Hayashi, Jennifer A. Geaga, Dick J. Drost, Peter C. Williamson, et al. 2006. “Mapping Corpus Callosum Deficits in Autism: An Index of Aberrant Cortical Connectivity.” Biological Psychiatry 60 (3): 218–25. Villalón-Reina, Julio E., Kenia Martínez, Xiaoping Qu, Christopher R. K. Ching, Talia M. 
Nir, Deydeep Kothapalli, Conor Corbin, et al. 2020. “Altered White Matter Microstructure in 22q11.2 Deletion Syndrome: A Multisite Diffusion Tensor Imaging Study.” Molecular Psychiatry 25 (11): 2818–31. Vogel, Jacob W., Yasser Iturria-Medina, Olof T. Strandberg, Ruben Smith, Elizabeth Levitis, Alan C. Evans, Oskar Hansson, Alzheimer’s Disease Neuroimaging Initiative, and Swedish BioFinder Study. 2020. “Spread of Pathological Tau Proteins through Communicating Neurons in Human Alzheimer’s Disease.” Nature Communications 11 (1): 2612. Vollmar, Christian, Jonathan O’Muircheartaigh, Gareth J. Barker, Mark R. Symms, Pamela Thompson, Veena Kumari, John S. Duncan, Mark P. Richardson, and Matthias J. Koepp. 2010. “Identical, but Not the Same: Intra-Site and Inter-Site Reproducibility of Fractional Anisotropy Measures on Two 3.0T Scanners.” NeuroImage 51 (4): 1384–94. 129 Voracek, Martin, and Lisa Mariella Loibl. 2007. “Genetics of Suicide: A Systematic Review of Twin Studies.” Wiener Klinische Wochenschrift 119 (15-16): 463–75. Walker, Lindsay, Lin-Ching Chang, Amritha Nayak, M. Okan Irfanoglu, Kelly N. Botteron, James McCracken, Robert C. McKinstry, et al. 2016. “The Diffusion Tensor Imaging (DTI) Component of the NIH MRI Study of Normal Brain Development (PedsDTI).” NeuroImage 124 (Pt B): 1125–30. Wang, Yingying, Chris Adamson, Weihong Yuan, Mekibib Altaye, Akila Rajagopal, Anna W. Byars, and Scott K. Holland. 2012. “Sex Differences in White Matter Development during Adolescence: A DTI Study.” Brain Research 1478 (October):1–15. Wang, Y., D. Wei, W. Li, and J. Qiu. 2014. “Individual Differences in Brain Structure and Resting-State Functional Connectivity Associated with Type A Behavior Pattern.” Neuroscience 272 (July):217–28. Ward, Roberta J., Fabio A. Zucca, Jeff H. Duyn, Robert R. Crichton, and Luigi Zecca. 2014. “The Role of Iron in Brain Ageing and Neurodegenerative Disorders.” Lancet Neurology 13 (10): 1045–60. Weiner, Michael W., Dallas P. Veitch, Paul S. 
Aisen, Laurel A. Beckett, Nigel J. Cairns, Jesse Cedarbaum, Michael C. Donohue, et al. 2015. “Impact of the Alzheimer’s Disease Neuroimaging Initiative, 2004 to 2014.” Alzheimer’s & Dementia: The Journal of the Alzheimer's Association 11 (7): 865–84. Weiner, Michael W., Dallas P. Veitch, Paul S. Aisen, Laurel A. Beckett, Nigel J. Cairns, Robert C. Green, Danielle Harvey, et al. 2017. “The Alzheimer’s Disease Neuroimaging Initiative 3: Continued Innovation for Clinical Trial Improvement.” Alzheimer’s & Dementia: The Journal of the Alzheimer's Association 13 (5): 561–71. Who. 2014. Preventing Suicide: A Global Imperative. Wierenga, Lara M., Marieke Langen, Bob Oranje, and Sarah Durston. 2014. “Unique Developmental Trajectories of Cortical Thickness and Surface Area.” NeuroImage 87 (February):120–26. Witelson, S. F. 1989. “Hand and Sex Differences in the Isthmus and Genu of the Human Corpus Callosum. A Postmortem Morphological Study.” Brain: A Journal of Neurology 112 ( Pt 3) (June):799–835. Wolfe, Michael S. 2012. “The Role of Tau in Neurodegenerative Diseases and Its Potential as a Therapeutic Target.” Scientifica 2012 (December):796024. Wolz, Robin, Adam J. Schwarz, Peng Yu, Patricia E. Cole, Daniel Rueckert, Clifford R. Jack Jr, David Raunig, Derek Hill, and Alzheimer’s Disease Neuroimaging Initiative. 2014. “Robustness of Automated Hippocampal Volumetry across Magnetic Resonance Field Strengths and Repeat Images.” Alzheimer’s & Dementia: The Journal of the Alzheimer's Association 10 (4): 430–38.e2. Woolrich, Mark W., Saad Jbabdi, Brian Patenaude, Michael Chappell, Salima Makni, Timothy Behrens, Christian Beckmann, Mark Jenkinson, and Stephen M. Smith. 2009. “Bayesian Analysis of Neuroimaging Data in FSL.” NeuroImage 45 (1 Suppl): S173–86. Yamauchi, H., H. Fukuyama, Y. Nagahama, Y. Katsumi, T. Hayashi, C. Oyanagi, J. Konishi, and H. Shio. 2000. 
“Comparison of the Pattern of Atrophy of the Corpus Callosum in Frontotemporal Dementia, Progressive Supranuclear Palsy, and Alzheimer’s Disease.” Journal of Neurology, Neurosurgery, and Psychiatry 69 (5): 623–29. Yeatman, Jason D., Brian A. Wandell, and Aviv A. Mezer. 2014. “Lifespan Maturation and Degeneration of Human Brain White Matter.” Nature Communications 5 (September):4932. 130 Zeineh, Michael M., Yuanxin Chen, Hagen H. Kitzler, Robert Hammond, Hannes Vogel, and Brian K. Rutt. 2015. “Activated Iron-Containing Microglia in the Human Hippocampus Identified by Magnetic Resonance Imaging in Alzheimer Disease.” Neurobiology of Aging 36 (9): 2483–2500. Zhang, Hui, Torben Schneider, Claudia A. Wheeler-Kingshott, and Daniel C. Alexander. 2012. “NODDI: Practical in Vivo Neurite Orientation Dispersion and Density Imaging of the Human Brain.” NeuroImage 61 (4): 1000–1016. Zhang, Y., M. Brady, and S. Smith. 2001. “Segmentation of Brain MR Images through a Hidden Markov Random Field Model and the Expectation-Maximization Algorithm.” IEEE Transactions on Medical Imaging 20 (1): 45–57. Zhang, Yuyao, Hongjiang Wei, Matthew J. Cronin, Naying He, Fuhua Yan, and Chunlei Liu. 2018. “Longitudinal Data for Magnetic Susceptibility of Normative Human Brain Development and Aging over the Lifespan.” Data in Brief 20 (October):623–31. Zhan, Liang, Daniel Franc, Vishal Patel, Neda Jahanshad, Yan Jin, Bryon A. Mueller, Matt A. Bernstein, et al. 2012. “HOW DO SPATIAL AND ANGULAR RESOLUTION AFFECT BRAIN CONNECTIVITY MAPS FROM DIFFUSION MRI?” Proceedings / IEEE International Symposium on Biomedical Imaging: From Nano to Macro. IEEE International Symposium on Biomedical Imaging, 1–6. Zhou, Dong, Tian Liu, Pascal Spincemaille, and Yi Wang. 2014. “Background Field Removal by Solving the Laplacian Boundary Value Problem.” NMR in Biomedicine 27 (3): 312–19. Zhu, Alyssa H., Daniel C. Moyer, Talia M. Nir, Paul M. Thompson, and Neda Jahanshad. 2019. 
“Challenges and Opportunities in dMRI Data Harmonization.” In Computational Diffusion MRI, 157– 72. Mathematics and Visualization. Cham: Springer International Publishing. Zhu, Tong, Rui Hu, Xing Qiu, Michael Taylor, Yuen Tso, Constantin Yiannoutsos, Bradford Navia, et al. 2011. “Quantification of Accuracy and Precision of Multi-Center DTI Measurements: A Diffusion Phantom and Human Brain Study.” NeuroImage 56 (3): 1398–1411. Zubicaray, Greig I. de, Ming-Chang Chiang, Katie L. McMahon, David W. Shattuck, Arthur W. Toga, Nicholas G. Martin, Margaret J. Wright, and Paul M. Thompson. 2008. “Meeting the Challenges of Neuroimaging Genetics.” Brain Imaging and Behavior 2 (4): 258–63.
Abstract
Advances in MRI have led to discoveries of factors that affect brain macro- and microstructure in health and disease. The small size of many neuroimaging studies has led to concerns about poor reproducibility of research findings and to calls for the comparison and pooling of multi-cohort datasets to establish the consistency of reported effects. Across studies, MRI acquisitions vary in sequence parameters, such as spatial resolution, as well as in the hardware used. Currently, two major approaches are used to achieve robust results: the collection of large datasets and the combination of smaller datasets. Here, I hope to address unique challenges presented by each approach. While large datasets may provide the statistical power needed to robustly estimate the small effect sizes often seen in neuroimaging studies, they can be difficult to use for researchers without vast computational resources. In Chapter 2, I present the Biobank Data Parser, which uses metadata to provide researchers with computationally efficient ways to query, process, and visualize data. While individual large biobanks are useful on their own, they can also be combined with other datasets to improve the generalizability of results. Comparing and pooling MRI measures requires concerted efforts to harmonize image processing pipelines and output measures to compensate for the variance introduced by differences in acquisition parameters and hardware. In Chapter 3, I start with a large-scale neuroimaging analysis conducted using data from a single biobank and transition to large-scale multi-site studies. Multi-site studies often use templates to harmonize image processing pipelines and statistical analyses. Chapter 4 introduces the concept of brain MRI templates in an image processing pipeline in both spatial and temporal dimensions. Then, in Chapter 5, I shift to a scalar template, in this case a set of lifespan reference curves for white matter microstructure. However, harmonization of MRI processing alone may not address all sources of multi-site variance, as datasets also vary in recruitment criteria. In Chapter 6, I demonstrate one such case, in which similar analyses of a rare exposure variable did not replicate between two pediatric datasets with very different age ranges and sample sizes. I end by describing future directions meant to address these remaining sources of variance, predominantly the harmonization of non-imaging data such as demographic and questionnaire data.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Unveiling the white matter microstructure in 22q11.2 deletion syndrome with diffusion magnetic resonance imaging
Neuroimaging markers of risk & resilience to brain aging and dementia
Pattern detection in medical imaging: pathology specific imaging contrast, features, and statistical models
Data modeling approaches for continuous neuroimaging genetics
Diffusion MRI of the human brain: signal modeling and quantitative analysis
Methods for improving reliability and consistency in diffusion MRI analysis
Decoding the neurological and genetic underpinnings of chronic pain
Using neuroinformatics to identify genomic and proteomic markers of suboptimal aging and Alzheimer's disease
The neuroimaging of cancer and chemotherapy
Applications of graph theory to brain connectivity analysis
Correction, coregistration and connectivity analysis of multi-contrast brain MRI
Brain connectivity in epilepsy
Novel computational techniques for connectome analysis based on diffusion MRI
Characterizing brain aging with neuroimaging, health, and genetic data
Characterization of the brain in early childhood
Concurrent monitoring and diagnosis of process and quality faults with canonical correlation analysis
Susceptibility-weighted MRI for the evaluation of brain oxygenation and brain iron in sickle cell disease
Efficient processing of streaming data in multi-user and multi-abstraction workflows
Novel theoretical characterization and optimization of experimental efficiency for diffusion MRI
A multi-site neuroimaging approach to studying hippocampal damage in chronic stroke
Asset Metadata
Creator
Zhu, Alyssa Huichao (author)
Core Title
Reproducibility and management of big data in brain MRI studies
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Biomedical Engineering
Degree Conferral Date
2024-12
Publication Date
01/13/2025
Defense Date
09/09/2024
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
big data, bioinformatics, brain MRI, data harmonization, human aging, image processing, neuroimaging, OAI-PMH Harvest, reproducibility, software development
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Jahanshad, Neda (committee chair), Marmarelis, Vasilis (committee member), Thompson, Paul (committee member)
Creator Email
ahzhoo@gmail.com,alyssahz@usc.edu
Unique identifier
UC11399FAGU
Identifier
etd-ZhuAlyssaH-13749.pdf (filename)
Legacy Identifier
etd-ZhuAlyssaH-13749
Document Type
Dissertation
Rights
Zhu, Alyssa Huichao
Internet Media Type
application/pdf
Type
texts
Source
20250113-usctheses-batch-1234 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu