Close
About
FAQ
Home
Collections
Login
USC Login
Register
0
Selected
Invert selection
Deselect all
Deselect all
Click here to refresh results
Click here to refresh results
USC
/
Digital Library
/
University of Southern California Dissertations and Theses
/
Novel theoretical characterization and optimization of experimental efficiency for diffusion MRI
(USC Thesis Other)
Novel theoretical characterization and optimization of experimental efficiency for diffusion MRI
PDF
Download
Share
Open document
Flip pages
Contact Us
Contact Us
Copy asset link
Request this asset
Transcript (if available)
Content
Novel Theoretical Characterization and Optimization of Experimental Eciency for Diusion MRI BY DIVYA VARADARAJAN DISSERTATION Submitted in partial fulllment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering in the Graduate School of University of Southern California, Los Angeles, California December, 2018 Doctoral Committee: Associate Professor Justin P. Haldar, Chair Professor Richard M. Leahy Professor Hanna Damasio c 2018 Divya Varadarajan Abstract Magnetic Resonance Imaging (MRI) has been used for several decades to study biological features such as anatomy, metabolism, and function of the human brain. However, most MRI modalities probe the brain at millimeter spatial scales where microscopic information is inaccessible. Diusion MRI (dMRI) has emerged as a powerful technique to study tissue structure at microscopic spatial scales by noninvasively quantifying the Brownian motion characteristics of water molecules trapped in and around the biological structures. One of the most useful applications of dMRI is in reconstructing the white matter pathways of the brain. In addition, dMRI has been used for studying eects on microstructure due to dynamic factors such as brain development, and pathology. However, long scan times continue to be a major challenge for dMRI. While spending an hour or more in the scanner is acceptable for ex-vivo tissue analysis or for motivated, healthy volunteers, it becomes challenging for in-vivo analysis for a number of important subject populations (such as younger children and sick individuals). In practice, scan times are constrained for human subjects, and studies have to make inferences working with very few samples. The overall performance is directly impacted by the eciency of the dMRI protocol, which consists of the sampling scheme used to acquire dMRI data, as well as the parameter estimation method used to make inferences from measured data. This work aims to improve eciency of dMRI experiments by developing new theory to characterize performance of linear dMRI estimation methods, optimize the parameter estimation protocol to maximize the amount of information extracted from the measured data, and accurately model noise to suppress the eects of noise contamination. To my husband, Srikanth Nori and in loving memory of N.S. Aravamudhan (1955-2018) Acknowledgment There are many people that I would like to thank for supporting me throughout my thesis process. I am grateful rst and foremost to my advisor Prof. Justin P. Haldar for teaching me and driving me to achieve my goals at every step of my doctoral studies. His immense knowledge of the subject and principled approach towards research have beneted me greatly and I hope to carry these learnings forward in my research. I am grateful for his trust in me to be his rst student and feel very happy to have been guided by him. I would also like to thank Prof. Richard M. Leahy for mentoring me since I joined USC as a Masters student. His outlook on research and the friendly and encouraging atmosphere that he has created in our lab is incredible. I am truly grateful to him for introducing me to my advisor, for his questions on my work and for several technical and thought provoking discussions. I am grateful to Prof. Krishna Nayak for his comments on my research and I really appreciate his enthusiasm towards research and life. Another person with whom I have had very insightful conversations with is Prof. Hanna Damasio. I am especcially grateful for her thoughts on my thesis and for making me understand the neuroscience perspectives of my work especially regarding validation. I have been lucky to be part of a lab with amazing colleagues who encourage each other and have so much fun uncovering new scientic ndings. Thanks to the present and past members of BIG lab at USC: Prof. Anand Joshi, Dr. Chitresh Bhushan, Dr. Daeun Kim, Minqi Chong, Soyoung Choi, Jian Li, Tae Hyung Kim, Rodrigo Lobos, Hossein Shahabi, Yunsong Liu, and Dakarai McCoy. During my Ph.D I have had the privilege of making long lasting friendships. My journey has been truly enjoyable and easy because of you all, and for that I am ever grateful: Ni- harika Gajawelli, Sandeep Nallan Chakravarthula, Md Nasir, Shiva Navabi, Mansi Kasliwal, Hiteshi Sharma, Pankaj Rajak, Harish Sridhara, Setu Mohta, Supriya Ghorpadkar, Ramya Sundararaman, Shyja Vipin, Saru Janahan and Yamini Arun. I am also grateful to all my family and friends back home. I am not able to name everyone here but want to convey that I miss you all and your encouragement keeps me going. Special thanks to Ganga Neelamraju, N V Chalapathi Rao, and Prasanth Nori for always being supportive of me. Lastly, this thesis would not have been possible if it wasn't for the love and support of my mother, Usha Varadarajan, my father, T.S. Varadarajan, my sister, Vidula Varadarajan, my brother Ujjwal Varadarajan and my husband, Srikanth Nori. I am grateful to my parents for all the sacrices they have made to make me who I am today, my brother and my sister for their unfailing love and aection, and my husband for encouraging me, inspiring me and showing me what it means to love someone. I am eternally indebted to you all. 6 Contents List of Figures 11 List of Tables 14 1 Introduction 1 1.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3.1 EAP Response Function (ERF) Theory . . . . . . . . . . . . . . . . . 7 1.3.2 Estimation Method Design . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3.3 Non-central Chi Noise Estimation . . . . . . . . . . . . . . . . . . . . 8 1.4 Organization of Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2 Background 10 2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.2 Diusion Propagator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.2.1 Orientation Distribution Function . . . . . . . . . . . . . . . . . . . . 15 2.2.2 Return to origin, axial and plane probabilities . . . . . . . . . . . . . 16 2.3 Diusion MRI Acquisition and Sampling . . . . . . . . . . . . . . . . . . . . 18 2.3.1 Pulsed Gradient Spin-echo . . . . . . . . . . . . . . . . . . . . . . . . 19 2.3.2 Narrow Pulse Approximation and Q-space . . . . . . . . . . . . . . . 20 7 2.4 Signal Processing in Diusion MRI . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.1 Fourier Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.4.2 Linear estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 3 EAP Response Function Theory 27 3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3.3 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.3.1 Background on Linear Estimation Methods . . . . . . . . . . . . . . 32 3.3.2 Theoretical Characterization of Linear Estimation . . . . . . . . . . . 34 3.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 3.4.1 Theoretical Insights into Estimation Parameter Selection . . . . . . . 40 3.4.2 Theoretical Insights into Resolution/Noise Trade-os . . . . . . . . . 47 3.4.3 Theoretical Comparisons of Dierent Estimation Methods . . . . . . 49 3.4.4 Theoretical Insights from Sampling Theory . . . . . . . . . . . . . . . 53 3.5 Discussion and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.A Appendix : Linear Basis Function Expansion Methods . . . . . . . . . . . . 59 4 Estimation Method Design for Diusion MRI 62 4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 4.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 4.3 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.3.1 EAP Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.3.2 Linear Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.4 ERFO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.4.1 Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.4.2 Theoretical Characteristics . . . . . . . . . . . . . . . . . . . . . . . . 73 8 4.4.3 Joint ERFO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 4.5 Experiment protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 4.5.1 Q-space protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4.5.2 ERFO training data . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.6 Simulation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.6.1 Test dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 4.6.2 ODF Simulation: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 4.6.3 RTAP Simulation: . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 4.6.4 RTOP Simulation: . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 4.7 Tracking Application Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.8 RTAP Estimation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.9 Joint-ERFO : An initial result . . . . . . . . . . . . . . . . . . . . . . . . . . 97 4.9.1 Joint-ERFO Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.10 Discussion and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5 Non-central Chi Noise Modeling 103 5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 5.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 5.3 Notation and Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 5.3.1 ML and PML Estimation . . . . . . . . . . . . . . . . . . . . . . . . 108 5.3.2 The Gaussian and NCC Distributions . . . . . . . . . . . . . . . . . . 108 5.3.3 Majorize-Minimize Algorithms . . . . . . . . . . . . . . . . . . . . . . 110 5.4 Quadratic Tangent Majorants forL ncc (yjx) . . . . . . . . . . . . . . . . . . 110 5.5 Example Application: Denoising . . . . . . . . . . . . . . . . . . . . . . . . . 112 5.5.1 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5.5.2 Real Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 5.6 Example Application: HARDI Estimation . . . . . . . . . . . . . . . . . . . 118 9 5.6.1 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 5.6.2 Real Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 5.6.3 Comparisons with Traditional Optimization Algorithms . . . . . . . . 125 5.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 5.7.1 Initialization and Local Minima . . . . . . . . . . . . . . . . . . . . . 128 5.7.2 Accurate Noise Modeling . . . . . . . . . . . . . . . . . . . . . . . . . 130 5.7.3 Choosing Images: Complex, Magnitude, or rSoS? . . . . . . . . . . . 135 5.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 6 Conclusion 139 7 Bibliography 143 10 List of Figures 1.1 Linear Diusion MRI Parameter Estimation . . . . . . . . . . . . . . . . . . 2 2.1 Microstructure: This image portrays the rich tissue characteristic information available at the microscopic level. . . . . . . . . . . . . . . . . . . . . . . . . 11 2.2 PGSE pulse sequence schematic diagram. . . . . . . . . . . . . . . . . . . . . 18 2.3 The eect of the PGSE pulse sequence on molecules that diuse and on molecules that are stationary (No Diusion). . . . . . . . . . . . . . . . . . . 19 3.1 Comparison of the ERFsg(R 0 ; R) for 3D-SHORE EAP estimation for dierent maximum radial orders N. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 3.2 A more detailed look at the 3D-SHORE ERFs shown in Fig. 3.1 for EAP estimation with dierent maximum radial orders N. . . . . . . . . . . . . . . 41 3.3 EAPs estimated using 3D-SHORE with dierent maximum radial orders N from noiseless two-tensor simulated data. . . . . . . . . . . . . . . . . . . . . 43 3.4 Empirical EAP estimation error for simulated noiseless multi-tensor datasets. 44 3.5 The ERFs for SPF-based RTOP estimation for dierent multi-shell sampling schemes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.6 Plots of the theoretically-derived ERF FWHM and noise variance for SPF- based RTOP estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 11 3.7 Plots of the empirically-observed RTOP estimation error for SPF-based esti- mation with dierent multi-shell sampling schemes. . . . . . . . . . . . . . . 49 3.8 Comparison of the ERFs for dierent ODF (FRT aand GQI) and CSA ODF (FRACT and 3D-SHORE) estimation methods for single-shell and multi-shell data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.9 ODFs (FRT and GQI) and CSA ODFs (FRACT and 3D-SHORE) estimated from real in vivo HCP brain data. . . . . . . . . . . . . . . . . . . . . . . . . 50 3.10 Aliasing and resolution for shell-based and Cartesian sampling schemes in diusion MRI. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.1 ERFO training data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 4.2 ERFO testing data - cylinders with 1 micron radius and 5 micron radius. . . 81 4.3 Comparison of orientation estimation errors of ERFO, FRACT and CSD es- timation methods for single shell numerical simulations. . . . . . . . . . . . . 82 4.4 ODF ERFs for single shell acquisition for ERFO and FRACT. . . . . . . . . 83 4.5 RTAP ERFs for gold standard, ERFO (multi-shell) and GQI. . . . . . . . . 84 4.6 Comparison of RTAP estimation errors of ERFO and GQI estimation methods for multishell and single shell numerical simulations. . . . . . . . . . . . . . . 85 4.7 RTOP ERFs for gold standard, ERFO (multi-shell) and 3D-SHORE. . . . . 88 4.8 Comparison of RTOP estimation errors of ERFO and 3D-SHORE for multi- shell shell numerical simulations. . . . . . . . . . . . . . . . . . . . . . . . . 89 4.9 Estimated ODFs using dierent estimation methods from a region-of-interest that is marked in (a). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 4.10 Estimated ODFs using dierent estimation methods from a region-of-interest that is marked in (a). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 4.11 Estimated ODFs using dierent estimation methods from a region-of-interest that is marked in (a). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 12 4.12 Normalized RTAP ERFs and RTAP estimates of ERFO and GQI calculated for the full MGH-USC protocol. . . . . . . . . . . . . . . . . . . . . . . . . . 95 4.13 RTAP estimation errors for ERFO and GQI for multiple multi-shell sampling schemes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 4.14 RTAP estimation errors for ERFO and GQI for a single sampling scheme realization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 4.15 Joint ERFO ODF design. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 4.16 Joint ERFO testing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5.1 Quantitative results for denoising simulations. . . . . . . . . . . . . . . . . . 115 5.2 Representative results from TV denoising of simulated Rician data. . . . . . 116 5.3 Root-mean-squared error images for the Rician denoising. . . . . . . . . . . . 117 5.4 Results from Rician TV denoising of real brain data. Also shown are the NRMSE values for each image. . . . . . . . . . . . . . . . . . . . . . . . . . 119 5.5 Error images for the Rician denoising. . . . . . . . . . . . . . . . . . . . . . . 120 5.6 Quantitative spherical harmonic (SH) errors for the HARDI simulations. . . 125 5.7 Comparison of diusion signal estimates from noisy data. . . . . . . . . . . . 126 5.8 Representative real-data spherical harmonic coecient estimation results. . . 127 5.9 Accuracy v/s compute time plots for proposed and alternative NCC noise suppression methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 5.10 Convergence plots for the proposed MM approach with dierent initializations.130 5.11 Results from GRAPPA reconstructed and adaptive combined TV denoising of real brain data. Also shown are the NRMSE values for each image. . . . . 133 5.12 Error images for GRAPPA reconstructed and adaptive combined TV denoising.134 5.13 Comparison between TV denoising of complex, magnitude, and rSoS images. The NRMSE for each image is shown below each result. . . . . . . . . . . . . 136 13 List of Tables 5.1 Mean, median, standard deviation, and total compute times (units of seconds) for dierent PML algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . 128 Chapter 1 Introduction 1 Figure 1.1: Linear Diusion MRI Parameter Estimation 1.1 Problem Statement This work addresses the problem of making dMRI experiments as ecient as possible, in- cluding: (i) developing novel theory to characterize accuracy of an arbitrary linear dMRI method, (ii) optimizing the parameter estimation protocol to maximize the amount of infor- mation extracted from the measured data, and (iii) developing statistical models that can be used to suppress noise contamination Data acquisition in diusion magnetic resonance imaging (dMRI) is often modeled (under a number of assumptions to be described later) through a Fourier transform relationship [1,2]: E (q) = Z P (r)e i2q T r dr +n (q); (1.1) where P (r) is the probability distribution function describing the molecular diusion at displacement r2R 3 called the ensemble average propagator (EAP), is the diusion time, andE (q) andn (q) are the acquired dMRI image and the measurement noise at the q-space location q2 R 3 respectively. Q-space is the reciprocal space of the EAP that is measured by the MRI scanner. In most dMRI experiments, we acquire a nite set of M q-space measurementsE(q m ) for m = [1; 2;:::;M] and estimate either the EAP, P (r) or a parameter, ^ () that is derived from the EAP. The variable describes the space of the EAP parameter and will have units depending on the context. For example, when we are estimating the EAP itself, represents 2 displacements and will have millimeter units. We know that on satisfying certain sampling requirements described by the sampling theorem [3, 4], we can perfectly reconstruct the EAP from its q-space measurements using the inverse Fourier transform (IFT). However, due to limited scan time, in practice M is too small and in such an undersampled regime IFT is known to be an ill-posed problem. Therefore, most experiments in dMRI are focused on estimating an EAP parameter that generally resides in a lower dimensional space than the EAP and does not require as many samples to estimate as the full EAP itself. Fig. 1.1 shows the steps involved in linear estimation of an arbitrary EAP parameter ^ (). The gure rst shows the evolution of the noisy diusion q-space data E (q) as given by Eq. (1.1) followed by a linear estimation method step to estimate the diusion parameter ^ (). Linear estimation will be described in detail in chapters 2 and 3. Blocks marked in green are the areas whose eciency is analyzed and improved by our research: Q-space Sampling Protocol. The number of q-space measurements (M) is pro- portional to the experiment duration. In addition, the sampling locations directly in uence the resolution of the parameter estimate ^ () as well as the errors due to aliasing [3, 4]. Existing q-space sampling schemes are based on intuitive or model based designs [5{8] that can be suboptimal [9,10]. Moreover, q-space sampling scheme characteristics such as resolution or aliasing remain unevaluated due to lack of theoret- ical characterization. Therefore, designing ecient q-space sampling scheme remains an open research question with the goal of making dMRI faster without compromising on estimation accuracy. Estimation method. As described earlier, most dMRI estimation methods work in regimes where M is too small to apply standard Fourier reconstruction techniques. Ex- isting estimators [11{19] are proposed intuitively, make sampling scheme assumptions that can be suboptimal, make approximations and/or make modeling assumptions that 3 are known to fail in biological tissue [9,10]. Additionally, the dMRI literature lacks a principled way of choosing between existing estimators or designing ecient estimators and both problems remain open questions. Noise. Thermal noise [20, 21] in MRI, caused by thermal uctuation of charged particles can cause random uctuations in the measured signal. Noise is problematic because (1) MR has low sensitivity, (2) higher spatial resolution is associated with higher noise, and (3) higher diusion weighting is associated with lower signal, and (4) diusion MRI is already a long experiment so there's no time for additional data aver- aging. Existing estimator design methods [11{19] do not take noise into consideration in their designs and tend to have low accuracy when estimating using low SNR q-space data. Additionally, the statistics of thermal noise can get complicated when dealing with magnitude dMRI images and noise removal can prove to be challenging [22{28]. Existing noise removal methods approximate the noise distribution inaccurately or rely on computationally intensive numerical approaches [29]. Thus, designing sam- pling schemes and estimation methods that mitigate noise related errors and designing noise removal methods that model noise accurately can potentially improve estimation eciency and is an active area of research in dMRI. The design problems listed for each block of the linear estimation pipeline can be at- tributed to the fact that methods in dMRI have largely ignored the extensive signal pro- cessing theory that has been developed to characterize the sampling requirements, accuracy, and stability of linear estimation methods. Lack of theoretical characterization has made it challenging to predict the behavior of linear dMRI methods. Presently, methods are evalu- ated empirically and it is challenging for the evaluations to be exhaustive enough to include all cases seen in biological tissue. This dissertation focuses on the problem of making the dMRI estimation pipeline as ecient as possible. We introduce a novel signal processing theory that can be used to char- 4 acterize the performance of arbitrary dMRI linear estimation methods as applied to arbitrary q-space sampling schemes. The characterization is similar to that of a point spread func- tion [30{34] that is widely used to describe spatial resolution of MR images. The new theory enables prediction of accuracy metrics such as estimation error, resolution and noise-resilience of dierent data acquisition and parameter estimation techniques. Prediction of accuracy inturn enables designing of optimal estimation methods and q-space sampling schemes. We then propose a new theory driven design method for estimation method design in dMRI that optimizes the overall estimation error. This design is useful for dMRI datasets that have already been acquired and takes thermal noise into consideration. In addition, we present an initial result of jointly designing the estimation method and the sampling scheme. This design is mainly for future experiments where both the acquisition protocol as well as the estimation method can be controlled. Lastly, we tackle noise in dMRI. In addition to our noise-aware designs, we present a new maximum likelihood estimation method for accurately removing non-central chi and Rician noise - a commonly encountered noise distribution in MRI. Noisy MRI magnitude and root sum-of-squares (SoS) images follow the Rician and non-central chi distribution respectively. In dMRI, estimation of diusion parameters can be inaccurate due to the noise bias introduced by these distributions. This work presents a new approach to model and estimate denoised q-space data as well as EAP parameters from Rician and non-central chi data. We show estimation results from both simulated and noisy real data, and demonstrate how this method can improve estimation compared to existing approaches. 1.2 Motivation Diusion Magnetic resonance imaging (dMRI) has proven useful to understand both normal and pathological tissue microstructure of a biological system. Diusion MRI is sensitive to tissue microstructure that cannot be detected at the macroscopic scale of conventional 5 MRI. This provides new investigative features useful for many applications. For exam- ple, it has shown promising results in understanding neurological disorders that change the tissue microstructure such as stroke [35], Alzheimer's disease [36], epilepsy [37], and mul- tiple sclerosis [38]. Another important use is in studying white matter connectivity of the brain [11{13, 39, 40]. Prior to its availability, only ex-vivo white matter studies were possi- ble [39,41]. dMRI has made it possible to study white matter connections in-vivo. However, long scan times continue to pose a major hurdle for dMRI. High quality dMRI scans ( 40 min) are time consuming, and scans get very challenging for a number of important human subject populations (children, patients etc.). In practice scan times are constrained and fewer samples are acquired making standard inverse Fourier transform based reconstruction unsuitable for estimation. Scan times can also be reduced by improving the experimental eciency for practical human dMRI scans. However, the experimental eciency of a system has been hard to predict due to a lack of a model that perfectly describes diusion in complex biological tissue. In place of ground truth, dMRI parameter estimates are validated using empirical evaluations that model diusion behavior. It is common for these diusion modeling assumptions to fail in dierent parts of the body and evaluations to not be exhaustive. In addition, it is unclear how to choose between dierent existing sampling schemes and estimation methods. Evaluating the optimality of a sampling scheme or estimation method remains an open question. There are also nancial costs associated with MR scanning { having a very long diusion scan in the protocol often means that there is no room to include other important MR experiments, and experimenters are often forced to make sacrices to try and t everything within a short experiment duration. This work contributes to improving the eciency of a typical dMRI experiment by (i) establishing a new signal processing framework for dMRI that can predict accuracy, (ii) 6 designing optimal estimation methods to extract maximum information from limited q- space samples, and (iii) accurately modeling the noise distribution for noise suppression. In addition we showcase an initial design for optimizing the q-space sampling scheme and estimation method jointly. 1.3 Main Results 1.3.1 EAP Response Function (ERF) Theory We propose a new linear estimation theory that analytically describes the relationship be- tween a diusion parameter estimate and the true EAP. We show several examples where the characteristics of this relationship can predict behaviors such as resolution, estimation error and noise resilience of an arbitrary estimation method applied to an arbitrary q-space sampling scheme. Prior to this work, existing diusion MRI data-sampling and signal-estimation methods have largely been developed based on approximations, intuition, and extensive empirical evaluation [18, 19, 42{50]. Our proposed theoretical characterization enables prediction of the performance of an approach without requiring extensive empirical testing. Notably, our proposed tools only assume a real and symmetric EAP and can be used even when traditional modeling assumptions are inaccurate. We expect these tools to be useful when choosing between dierent existing schemes, as well as for designing optimal sampling and estimation methods for dMRI. 1.3.2 Estimation Method Design We propose a new method for estimating EAP parameters for arbitrary q-space sampling schemes. Existing approaches can broadly be classied as model-based [10, 13, 39, 40] and model-free [11, 12, 17, 18, 48, 51, 52]. Model-based approaches are widely used, but rely on 7 modeling assumptions that are commonly violated in the brain [9]. In contrast, model- free approaches make fewer assumptions, but it is unclear if existing methods are optimal. Additionally, many of the existing methods are not applicable to arbitrary q-space sampling schemes, and instead assume a specic scheme. This work introduces a new approach, called ERFO (ERF Optimized ODF estimation), that learns an optimal linear EAP parameter estimator from training data and can accom- modate arbitrary q-space sampling. Our approach is guided and justied by the theoretical framework for linear diusion estimation described in chapter 3. We present both simulation and real data evaluation of the proposed method to estimate multiple EAP parameters. Our proposed method is the rst to provide a way to design an estimation method theoretically, without making diusion modeling assumptions or relying on intuition [5{7]. 1.3.3 Non-central Chi Noise Estimation We propose a new framework for statistical estimation problems involving non-central chi (NCC) probability distributions. We investigate estimation of HARDI diusion parameters and denoising of MRI magnitude and root sum of squares combined images. We demon- strate that using accurate Rician/NCC modeling of the statistical data distribution can substantially improve estimation results relative to conventional approaches. Existing approaches [22{26, 26, 27, 53{55] have either approximated noise to a Gaussian distribution or model the noise accurately but apply computationally expensive numerical optimization techniques to optimize the NCC likelihood. Gaussian approximation is inac- curate while numerical approaches used are computationally intensive with no guarantee of improved accuracy. Our approach is both accurate and faster. Additionally it is also more exible and can be applied jointly with the parameter estimation method, further improving the overall accuracy of the estimation. 8 1.4 Organization of Dissertation This dissertation is organized as follows: Chapter 2 describes some background that will aid in understanding the rest of the thesis. It gives an overview of the diusion propagator based formalism, some commonly estimated ensemble average propagator (EAP) parameters, diusion acquisition and narrow pulse assumption, Fourier reconstruction and linear estimation in diusion MRI. Chapter 3 describes a new theoretical framework that can characterize arbitrary linear diusion estimation methods with arbitrary q-space sampling schemes. Simulation and ex- perimental results are shown to demonstrate that this new theory can predict behaviors such as aliasing, resolution, noise resilience and estimation error of existing linear methods. Chapter 4 presents a new optimization framework to design estimation methods for esti- mating EAP based parameters. This chapter proposes a design method based on the theory described in chapter 3. The new method learns an optimal parameter estimation method from training data. We present simulation and real dMRI data results of estimating mul- tiple EAP parameters. In particular, our results optimize the estimation of the orientation distribution function , return to origin probability and return to axial probability. We also present an initial result for jointly designing the q-space sampling scheme and the parameter estimation method. Chapter 5 introduces a novel optimization technique to optimize non-central chi and Rician noise. We show both simulation and real data result to demonstrate the usefulness of the new framework for parameter estimation and denoising. Finally, Chapter 6 provides conclusions. 9 Chapter 2 Background 10 Figure 2.1: Microstructure: This image portrays the rich tissue characteristic information avail- able at the microscopic level. The image shows a T1-weighted image at mm 3 resolution and a microscopy image at m 3 resolution. The microscopy image occupies just a fraction of the MRI voxel and shows detailed microstructure that is inaccessible atmm 3 resolution of T1-weighted MRI. (Source: Myelinated Neuron (https://commons.wikimedia.org/wiki/File:Myelinated neuron.jpg) by Roadnottaken(https://en.wikipedia.org/wiki/User:Roadnottaken) is licensed under CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/), via Wikimedia Commons) 2.1 Overview Magnetic resonance imaging (MRI) can produce rich image contrasts that are sensitive to a large variety of biological features. However, most conventional MRI captures information at the millimeter scale. Fig. 2.1 shows a slice of a T1-weighted MR image at millimeter spatial scale and a slice of a microscopy image at micrometer spatial scale. The microscopy image forms a tiny fraction of the MRI voxel and contains a neuron cross-section of less than 2.5 m diameter. T1 MRI (and other millimeter scale MRI) misses out on the high resolution information available at the microscopic level seen in the microscopy image because it asso- ciates a single intensity value for all the structures contained inside a mm 3 resolution voxel. In addition, imaging at microscopic scale becomes important for many clinical applications because many pathologies originate at this level [56,57]. In the last few decades, diusion MRI has emerged as a powerful modality to image the microstructure. It is based on the principle that displacement of molecules due to 11 Brownian motion will follow the surrounding tissue geometry and by studying displacement characteristics we can indirectly infer the micro-architecture [2]. The ability to study micro- architecture has opened up a broad range of unique applications. For example diusion MRI is the only imaging modality that has enabled study of in-vivo white matter connectivity [11,18,39,40,43,46,48,51,52,58{62]. Other diusion MRI applications include studies trying to understand neuroanatomy by imaging microstructure [63,64]and diagnostic applications, especially important when the microscopic changes do not show up in other modalities. Some examples of brain pathologies where diusion MRI has been useful include stroke, tumors, neurodegenarative conditions, multiple sclerosis and traumatic brain injury [35{38,56,65{67]. Acquiring diusion MRI data has become routine in both clinical and research applica- tions. The popularity of dMRI is partly because of its unique ability to characterize tissue geometry at microscopic level and partly because of the constant development of methods to improve acquisition and estimation. However, there are still some key questions on eciency of a diusion MRI experiment that remain unanswered which we address in this thesis. In this chapter we cover some fundamental background on diusion propagator modeling, dMRI acquisition, and some signal processing concepts that will help in understanding the later chapters. 2.2 Diusion Propagator In this section we present a probability distribution function (PDF) based framework called the propagator that describes molecular diusion and has been extremely useful in predicting diusion behavior in biological applications. Molecular diusion is the random motion of molecules at the microscopic scale called Brownian motion due to the thermal energy contained in the system [68]. The nature of the movement of these molecules is governed by its surroundings and much of diusion physics is dedicated to describing this motion theoretically. The motion of a single molecule can 12 be completely described by its displacement r (t) at time t and since the displacement is a random quantity, it is mathematically represented using probability distribution functions (PDF) referred in dMRI literature as propagators.. Displacement at time t of a molecule will depend on the time instant t as well the molecule's initial position r 0 . So the displacement probability of a single molecule is usually described as a conditional PDF,P s (rjr 0 ;), which is the probability of a molecule to displace from r 0 to r in time [2,69]. This conditional PDF is referred as the propagator or the self correlation function [2]. However, in a typical imaging experiment we are always operating at nite resolution, which means that in every voxel we are observing superposition of signals from a large num- ber of molecules going about their random walks in dierent microscopic environments. The observed propagator is therefore an ensemble average over all single molecule propagators within the voxel and is called the ensemble average propagator (EAP). The EAP is mathe- matically related to the single molecule propagator by integrating over all initial positions: P (R) = Z P s (r 0 + Rjr 0 ;) (r 0 )dr 0 (2.1) where (r 0 ) is the initial spin density at r 0 and P s (r 0 + Rjr 0 ;) is the probability of a molecule to displace from r 0 to r 0 + R in time with R representing the displacement of the propagatorP s (r 0 + Rjr 0 ;). The ensemble average integrates out the starting positions r 0 and resulting EAP P (R) is a function of just the net displacement R. We follow this notation of the EAP in the rest of the thesis Furthermore, it turns out that for the special case of unrestricted isotropic diusion, the single molecule propagator has a nice analytical formulation given by P s (rjr 0 ;) = (4D) 3=2 exp (r r 0 ) 2 =4D ; (2.2) 13 where D is the diusion coecient that was related to the mean squared displacement of diusing molecules by Einstein [70]. The unrestricted propagator was derived by solving the dierential equation that describes the evolution of diusion in space and time along with appropriate initial and boundary conditions [2]. The derived propagator depends on the dierence in the position or displacement,(r r 0 ) of the diusing molecule, and hence irrespective of the initial position all molecules in the unrestricted medium will have the same propagator. Therefore the EAP too, which is the average over all propagators, will have a similar formulation as Eq. (2.2) given by: P (R) = (4D) 3=2 exp R 2 =4D : (2.3) Additionally, this kind of model can be extended to the case of unrestricted anisotropic Gaussian diusion with the diusion coecient, D being replaced by the diusion tensor D [71]: P (R) = jDj (4) 3 1=2 exp R T D 1 R =4 ; (2.4) The anisotropic Gaussian EAP is one of the most widely used diusion models to evaluate anisotropy in biology. In the last decade, new microstructure models [14{16, 72{74] have been proposed that can model more complex diusion behavior by decomposing the diusion signal from a voxel into sum of signal from multiple microscopic compartments. Each compartment is usually modeled using a simple unrestricted or restricted diusion model. However, studies have shown that even these complex models fail (atleast partially) in the real biological tissue such as the brain [9], and creating a more accurate model of tissue geometry still remains an active area of research. Additionally, the non-linearity of diusion models make it hard to predict and assess model failures, further complicating the validation of these model based approaches. 14 To counter some of these disadvantages, we focus our eort instead on developing methods that estimate the EAP or a parameter derived from the EAP that can inform on important structural properties such as geometry, density or size of the underlying microstructure with- out modeling the microstructure explicitly. We next describe some of the EAP parameters that have contributed to important applications in dMRI literature. 2.2.1 Orientation Distribution Function Early white matter studies using diusion MRI observed that the diusion measurements were sensitive to the orientation of the diusion sensitizing gradients [75,76] giving the rst clues of anisotropic diusion in white matter. Anisotropy was rst measured by [41] in the cat nervous system where white matter signal was shown to depend on gradient orientation. Low white matter signal intensity was measured by [41] when gradient was parallel to the ber orientation and since then anisotropy has been studied extensively, especially in the white matter of the human brain. Diusion tensor imaging (DTI) [39] was the rst method that made it possible to model anisotropy and study white matter connectivity. However, isotropic and simple anisotropic models such as DTI were insucient to model complex white matter structures such as crossing bers accurately. This led to development of more advanced models [11,13,17,40,43,51,52,61] that estimate orientation of white matter bers along multiple orientations. One of the EAP parameters that is used to estimate orientational anisotropy is the orientation distribution function (ODF) [11, 43]. [11] dened ODF along the direction u as the radial integral of the EAP along u, ODF (u) = Z 1 0 P (u)d: (2.5) This denition provided useful information about which orientations are associated with the most diusion, and Tuch [11] showed that there are relatively easy ways to approximately 15 estimate this denition of the ODF. Another denition of ODF preferred by many considers the volume element for the radial integral. In this denition the ODF radial integral is calculated over a cone of constant solid angle [40,43,77,78], ODF CSA (u) = Z 1 0 P (u) 2 d: (2.6) ODF CSA is a proper marginal probability distribution of the EAP calculated along a par- ticular orientation. Since ODF CSA is a PDF, it integrates to 1 over all possible directions: R u2S 2 ODF CSA (u)du = 1, whereS 2 is the set of all points on the unit sphere. ODF andODF CSA are used to study the orientational structure of the brain especially in white matter map of anisotropic axons, and is generally estimated along multiple directions u. The directions of ODF peaks are correlated with the directions of white matter berbundles because of which ODF estimation is a common rst step for white matter connectivity and tractography studies [64]. 2.2.2 Return to origin, axial and plane probabilities White matter structure is made up several intricate components such as axons, myelin sheath, astrocytes, various other type of glial cells, and more. These structures work intricately to perform several important functions to keep the brain healthy and their malfunctioning is responsible for various pathologies. Therefore, estimating various local microscopic features such as cellular packing density, axon diameters, and cell size can provide important informa- tion about healthy and pathological tissue [79]. Diusion MRI has contributed to estimating several scalar parameters [39,48,80] that shed light on some of these microstructure features. Here we discuss a few of the EAP scalar parameter that are analyzed in this thesis. The return to origin probability (RTOP) is the average probability of a molecule to not incur any net displacement in time or in other words the average probability that a molecule has returned back to its initial position in time. RTOP is simply the value of the 16 EAP at zero displacement (R = 0), RTOP =P (0): (2.7) RTOP has gained importance because under perfect restriction condition (zero permeability) it is proportional to the surface to volume ratio of the pore, where pore was dened as the region where the molecules are allowed to diuse [71,81]. Additionally, RTOP has also been shown to be equal to the inverse of the mean pore volume when the time is long enough for the molecules to traverse the entire pore space [48]. However, in most practical scans the diusion time is generally not long enough to traverse the pore completely so [48] proposed another parameter called the return-to-axis probability (RTAP). RTAP is the dened as the radial integral of the EAP along the axial direction, dened as the direction of maximum diusion and is given by RTAP = Z P [0; 0;z] T dz; (2.8) where we have used r = [x;y;z] T and assumed without loss of generality that the main diusion direction is oriented alongz. Eq. (2.8) is the same as Eq. (2.5) with the distinction that the integral is along the axial direction. RTAP was proposed with the motivation that in restricted geometry the molecules will be able to completely traverse the pore boundaries in the restricted directions and in turn satisfy the long time limit in those directions. One can think of RTAP as the average probability that a molecule will return to the axial line in time and the RTAP radial integral is integrating out the axial direction. Analogous to RTOP, RTAP was derived to be sensitive to the mean cross-sectional area of the pore. When we integrate out the radial plane instead, the return-to-plane probability (RTPP) is calculated as RTPP = Z P [x;y; 0] T dxdy; (2.9) 17 Figure 2.2: PGSE pulse sequence schematic diagram. using the same assumptions we used with RTAP. RTPP is measuring the average probability that a molecule will return to the origin along just the axial direction in time . Analogous to RTOP and RTAP, RTPP was derived to be sensitive to the length of the pore. However, since in most scans the diusion time is not long enough to traverse the pore completely along the axial direction, long diusion scan time is required to estimate RTPP accurately. Recent clinical applications have shown RTOP to correlate with axon volume and myeli- nation, and RTAP to correlate with axonal packing density and axon diameter [19, 82], relating EAP properties to useful biological biomarkers. Since changes in myelin and cel- lular/axonal volume and density are properties pertinent to neurological changes occuring in both health [63, 83, 84] and disease [85{87], we expect RTOP and RTAP to aid more neuroscience and clinical applications in the future. 2.3 Diusion MRI Acquisition and Sampling In this section we describe the acquisition of dMRI data with the pulsed gradient spin echo sequence (PGSE) and derive the Fourier transform relationship of the EAP and q-space under narrow pulse approximation. The Fourier relationship derived here is the central model assumed in the rest of the thesis. 18 (a) initial state (b) First gradient pulse (at s) (c) After time (at t) (d) Second gradient pulse (at u) Figure 2.3: The eect of the PGSE pulse sequence on molecules that diuse and on molecules that are stationary (No Diusion). 2.3.1 Pulsed Gradient Spin-echo Diusion was rst observed by Hahn [88] and then measured by Carr and Purcell [89] in nuclear magnetic resonance (NMR) experiments in the 1950s. But it was only a decade later in 1965 that a robust diusion MRI acquisition protocol called the pulsed gradient spin-echo (PGSE) was proposed by Stejskal and Tanner [1]. Till today, the PGSE sequence continues to be the most popular sequence for acquiring diusion MRI data. The PGSE pulse sequence diagram is shown in Fig. 2.2. The spins of molecules are initially excited by a 90 RF pulse, after which all spin phases are aligned in one direction. The RF pulse is followed by gradient pulse, g of duration . The gradient pulse introduces a position dependent phase-lag or phase-shift in the spin phase given by R t=0 g(t) r(t)dt, where is the gyromagnetic ratio, g is the diusion gradient vector and r is the position of the spin at time t. The spins are then allowed to diuse for time. The 180 RF pulse followed by another gradient pulse of duration induces an additional phase-lag which cancels out the lag introduced by the rst gradient pulse. The net spin phase induced by the two gradients is zero in molecules that are stationary and non-zero for molecules that have 19 diused. In order to relate the spin phase to the EAP, we have to make one more assumption called narrow pulse and remove the eects of motion when the diusion gradients are on. 2.3.2 Narrow Pulse Approximation and Q-space The narrow pulse approximation assumes that the diusion gradient pulse duration is much smaller than the duration between the two canceling gradient pulses : << Intuitively, assuming a short gradient pulse means that we can ignore the motion over the duration of the pulse. The gradient then encodes the position of the water molecule instantaneously, thereby getting rid of the time integral in the phase shift, R t=0 g(t) r(t)dt g r. This approximation can be reasonably realized (not completely most of the time) in real experiments and leads to very useful insight that can help describe diusion in complex geometry and hence is assumed in many methods. The narrow pulse approximation is assumed in the rest of this thesis as well. We now explain the PGSE sequence assuming narrow pulse by considering a toy example shown in Fig. 2.3, where we compare the eect of the sequence on molecules that diuse and molecules that do not diuse. The molecules are arranged in a vertical line and the arrows depicts the phase of the transverse magnetization of their spins. We use spins and molecules interchangeably for the rest of the chapter. The initial position in Fig. 2.3(a) is post excitation and all spin phases are aligned in one direction. The rst diusion gradient pulse introduces a position dependent phase-lag in the spin phase given by g r, where is the gyromagnetic ratio, g is the diusion gradient vector and r is the position of the spin at the instant when the gradient is turned on. We can observe the change in phase in our toy example in Fig. 2.3(b). The color bar shows the gradation in the phase-lag with the spin phase not changing much at positions close to low lag, while the phase ips by almost 180 degrees at positions close to high lag. The arrows can be seen turning and in-turn visualizing the phase shift and one can observe that every 20 molecule has a unique phase value encoded based on its position along the vertical direction. The spins are then allowed to diuse for time. The molecule that diuse move from their initial position as shown in Fig. 2.3(c). The 180 RF pulse followed by the second gradient pulse induces an additional phase- lag which cancels out the phase introduced by the rst gradient pulse. The two color bars in Fig. 2.3(d) depict the canceling gradients. The net spin phase is zero in molecules that are stationary and the spin phase of the stationary molecules returns to its inital value. However, in molecules that have diused, since their position has changed, the new gradient will not cancel the original phase shift and some residual spin phase will remain resulting in randomness in the spin phase. The randomness in displacement due to diusion is encoded into the randomness of the spin phase leading to the attenuation of the net transverse mag- netization of spins. The attenuated spin echo signal hence encodes the amount of diusion that occurred in time between the two gradient pulses. Let us now quantify the eect of narrow pulse. The PGSE signal S (g) measured from a single voxel is a superposition of the transverse magnetization [2,71] of all spins contained in it, S (g) =S 0 < exp (i)> (2.10) =S 0 Z exp (i)P ()d; (2.11) whereS 0 is the signal when no gradient is applied,<> is the statistical expectation opera- tor, is the spin phase andP () is the probability of the spin phase to be. Eq. (2.10) and Eq. (2.11) assume that the ensemble average over all spins given by Eq. (2.11), approximates the statistical expectation operator closely. Under narrow pulse approximation (<< ), the gradient signal can be written as g (u) = g ( f (0) f ()); (2.12) 21 where f () is a dirac delta function. The spin phase then becomes = g T (r (0) r ()) (2.13) = g T R (2.14) = q T R; (2.15) where R = (r (0) r ()) is the spin displacement in time between the two gradients and q = g is the q-space vector whose relevance you will see shortly. The spin phase is proportional to the spin displacement, which means that the probability of spin phase is proportional to the probability of spin displacement. Eq. (2.11) can be then rewritten as E (q) = Z e iq T R P (R)dR: (2.16) whereE (q) is the normalized q-space data (S (q)=S 0 ) at q2R 3 andP (R) is the EAP at R2R 3 . Eq. (2.16) is the central relationship for q-space diusion MRI because it establishes that the acquired q-space diusion data is related to the EAP by means of a Fourier trans- formation. We will use this Fourier transform relationship in the later chapters to dene new theory that allows us to design ecient diusion MRI experiments. 2.4 Signal Processing in Diusion MRI 2.4.1 Fourier Reconstruction Under narrow pulse approximation, the EAP can be reconstructed from q-space data by inverse Fourier transform (IFT) P (R) = Z e iq T R E (q)dq: (2.17) 22 Methods such as diusion spectrum imaging (DSI) [43] and generalized q-sampling imaging (GQI) [52] rely on this relationship to estimate EAP and ODF from q-space data. A typical DSI scan comprises of 515 samples. Acquiring such large amount of samples requires a lot of time ( 40-50 minutes) and is therefore limited to research environments. IFT based reconstruction of continuous band-limited signal from Cartesian (uniform) sampled data is a well studied problem in signal processing and on meeting certain sampling requirements we can recover the signal perfectly using IFT [3, 4]Let us consider a 1D con- tinuous signal x(t) sampled at frequency f s = 1=T s to generate samples x[n] =x(nT s ). Let us also assume that x (t) is band-limited (X ( ) = 0;8 > max ) with the highest non-zero frequency component at max , and has a continuous Fourier transform X( ) related by X( ) = Z 1 1 x (t)e j t dt; 2R: (2.18) The samplesx[n] will have a periodic discrete time Fourier transformX(e j! ) represented by X(e j! ) = 1 X k=1 x[n]e j!n ;!2 [;]: (2.19) The sampling theorem relates X(e j! ) to its aperiodic counterpart X( ) by X e j! = 1 T s 1 X k=1 X (j )j = ! Ts + 2k Ts : (2.20) The conditions for perfect reconstruction is a consequence of Eq. (2.20) because the sampling frequency (or the sampling period T s ) controls the amount of shift accrued by X (j ) and for perfect recovery of X(j ) (and in turn x(t)), we need that every period of X(e j! ) to contain only a single copy of X (j ). Sampling theorem [3,4] states that a band-limited signal can be perfectly recovered from its samples if the sampling frequency, f S is equal to or higher than twice the largest signal 23 frequency (F max = max =2) called the Nyquist rate. However, if the Nyquist rate is not met then for frequencies greater than f s =2 adjacent copies of X (j ) called alias will overlap and the frequency content will be replaced be corrupted by superposition of components from the overlapping aliases. The phenomenon of overlapping copies is called aliasing and it is not possible to recover the original signal from aliased frequency spectrum. In order to avoid aliasing it is common to apply an anti-aliasing lter tox(t) which is a low pass lter that allows frequencies less than f s =2 to pass through but loses the information at frequencies greater than f s =2. 2.4.2 Linear estimation In most practical scenarios (especially clinical applications), scan time is heavily constrained. In such situations, we do not have access to the EAP, and need to estimate the parameter of interest from limited q-space data. IFT from undersampled q-space data is an ill-posed problem [90,91] leading to non-unique solutions and so EAP is not estimated directly in any low scan-time experiments. However, EAP parameters occupy a lower dimensional space than the EAP, require lesser samples for estimation and can predict useful microstructure properties, making them suitable for studies with limited q-space data. Additionally, Cartesian q-space sampling schemes requiring large number of samples are not suitable for low scan time experiments. It is instead routine in a clinical environment to acquire undersampled q-space data with samples spread uniformly over a shell [8]. Such a single shell scheme is called the high angular resolution diusion imaging (HARDI) [11]. Multi-shell sampling schemes consisting of multiple such shells are also becoming common- place in both research and clinical experiments. Non-Cartesian sampling schemes and heavily undersampled q-space make IFT unsuitable for limited scan time applications and typically other estimators are used to estimate various parameters of EAP. In this thesis we focus on linear estimators that estimate an EAP parameter using a linear 24 combination of measured q-space data. Non-linear estimators such as [9,13,40,92] too have been proposed in literature. However, non-linear methods make diusion modeling assump- tions and as described in section 2.2, model failures are common when describing complicated microstructure geometry. Behavior of non-linear methods when modeling assumptions are violated is unknown and can lead to erroneous results. An arbitrary linear estimator a (), when estimating a parameter fromM measurements can be represented by vector of length M with the estimation procedure generalizing to ^ () = M X m=1 a m ()y m ; (2.21) where the a m values are the linear estimation coecients that form the M elements of the estimator: a () = [a 1 ();a 2 (); ;a M ()] and y m are the M measurements. is a context dependent additional variables that estimator and the output ^ () depend on, for example if we are estimating the EAP then will be the displacement vector. For the rst scenario let us assume that the measurements y m are noiseless or deter- ministic. The estimation represented in 2.21 is a deterministic system with input y = [y 1 ;y 2 ; ;y M ] and output ^ (). The system is linear because it satises the properties of superposition (additivity and homogeneity) given by: ^ a () =T fy a g; (2.22) ^ b () =T fy b g; (2.23) then; ^ a + ^ b =T fy a +y b g; (2.24) where T fg is the representation of the system, ^ a () and ^ b () are outputs of the system with inputs y a and y b respectively and ;2R. One of the main advantages of linearity is that it oers theoretical predictability of the performance of the estimator. The system in 2.21 can be completely characterized by its 25 impulse response given by the estimator coecients a (), where the impulse response is dened as the output of the system when the input is an MM identity matrix depicting Kronecker delta functions. The reason we carry the variable in our derivations in this thesis is because we are dealing with shift variant systems in diusion MRI. This means that output at two shifts ^ ( 1 ) and ^ ( 2 ) from the same input y will have dierent impulse response functions. For shift variant systems we will need to analyze the response at every shift uniquely to evaluate and understand overall errors. However, since we are dealing with nite number of estima- tions, this is not a big concern and instead of one error value, we will be analyzing a vector of errors instead. Generally, in estimator design problems a loss function describing error between the estimated parameter ^ () and true parameter () is chosen and statistically optimized to minimize the error and in turn design the estimator. In chapter 4 we will describe a Bayes risk loss based estimator design for EAP parameter estimation that minimizes the expected squared error loss and in chapter 5 we describe maximum likelihood based estimator design for the purpose of NCC noise elimination that minimizes the negative log-likelihood loss of the NCC distribution. 26 Chapter 3 EAP Response Function Theory 27 3.1 Overview The data measured in diusion MRI can be modeled as the Fourier transform of the Ensem- ble Average Propagator (EAP), a probability distribution that summarizes the molecular diusion behavior of the spins within each voxel. This Fourier relationship is potentially advantageous because of the extensive theory that has been developed to characterize the sampling requirements, accuracy, and stability of linear Fourier reconstruction methods. However, existing diusion MRI data sampling and signal estimation methods have largely been developed and tuned without the benet of such theory, instead relying on approxima- tions, intuition, and extensive empirical evaluation. This paper aims to address this discrep- ancy by introducing a novel theoretical signal processing framework for diusion MRI. The new framework can be used to characterize arbitrary linear diusion estimation methods with arbitrary q-space sampling, and can be used to theoretically evaluate and compare the accuracy, resolution, and noise-resilience of dierent data acquisition and parameter esti- mation techniques. The framework is based on the EAP, and makes very limited modeling assumptions. As a result, the approach can even provide new insight into the behavior of model-based linear diusion estimation methods in contexts where the modeling assumptions are inaccurate. The practical usefulness of the proposed framework is illustrated using both simulated and real diusion MRI data in applications such as choosing between dierent parameter estimation methods and choosing between dierent q-space sampling schemes. 1 1 The text and gures in this chapter have been previously published in [93], and are copyright of the Elsevier. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the Elsevier. c 2017. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ 28 3.2 Background Diusion MRI has the unique ability to noninvasively quantify the Brownian motion charac- teristics of water molecules as they take random walks through their local microenvironment. Within biological tissues, these random walks are strongly constrained by the local tissue microarchitecture, which means that the diusion MRI signal can be used to infer structural tissue features at microscopic spatial scales that are otherwise inaccessible through conven- tional millimeter-scale MRI. When applied to the brain, diusion MRI has emerged as an especially important tool for quantifying tissue microstructural characteristics that change as a result of brain development, plasticity, and pathology, as well as for reconstructing the white matter pathways that connect dierent brain regions [94,95]. In the diusion MRI literature, the process of molecular diusion is often summarized by a probability distribution P (R) known as the Ensemble Average Propagator (EAP). The EAP describes the ensemble-average probability density that a randomly selected water molecule from a given voxel will be found at a spatial displacement of R2 R 3 from its original position after undergoing random Brownian motion for a time interval of duration . The EAP is a popular summary of the diusion process because, under the narrow-pulse approximation and assuming a standard Stejskal-Tanner diusion encoding scheme [1], the EAP is related to the measured diusion MRI dataE (q) at q-space location q2R 3 through a Fourier transform relationship [2]. In Eq. (2.16) we established the Fourier transformation relationship with noiseless dMRI data, but here we consider noisy data and modify Eq. (2.16) as, E (q) = Z P (R)e i2q T R dR +n (q); (3.1) where n (q) represents measurement noise and it is assumed that the data E(q) has been normalized such that E(0) = 1 in the absence of noise. 29 Almost all of the quantitative diusion MRI literature 2 can be viewed as estimating the EAP (or quantitative measures derived from the EAP) from a nite set of M q-space measurements E(q m ) for m = 1; 2;:::;M. As a simple well-known example, the popular diusion tensor imaging (DTI) model of [39] is equivalent to assuming that the EAP has the form of a zero-mean multivariate Gaussian distribution, with the covariance matrix of the Gaussian corresponding to the conventional diusion tensor. As a result, DTI model tting is equivalent to EAP estimation under a specic parametric model of the EAP. The linear Fourier relationship of Eq. (3.1) may be viewed as especially advantageous because concepts from standard signal processing and linear systems theory (e.g., point- spread functions, the sampling theorem, the Nyquist rate, etc.) could potentially be applied to characterize and design both q-space sampling strategies and EAP estimation strategies. However, to the best of our knowledge, existing literature has largely ignored these possibil- ities for theoretical analysis. Instead, most methods for diusion MRI signal modeling and experiment design have been developed heuristically based on some combination of intuition, model-based approximation of the EAP, and empirical validation. There are two exceptions to our previous generalization. Specically, in seminal work, Tuch [11,71,97] relied on signal processing/linear systems theory to characterize an approach he proposed called the Funk-Radon Transform (FRT), and was able to show theoretically that the FRT estimated a "blurred" version of a specic EAP-based integral of interest. More recently, we generalized the FRT by introducing the Funk-Radon and Cosine Transform (FRACT) [12] and several variations [60,98]. These FRACT-based approaches were built o of Tuch's original theory and showed that the FRT estimation approach could be modied to yield improved diusion estimation with much better theoretical characteristics. However, a limitation of the existing theoretical analyses is that they focus narrowly on the estimation of a specic EAP-derived quantity (obtained by radial integration of the EAP) known as 2 A notable exception to this rule is the family of multiple diusion encoding MRI experiments [96], which adopt a more complicated representation as the result of interrogating spin displacement correlations during multiple diusion intervals. 30 the Orientation Distribution Function (ODF), rather than considering more general features that can potentially be extracted from the EAP. In addition, the existing approaches focused narrowly on diusion encoding schemes that sample data on a small number of densely- sampled shells in q-space, rather than allowing arbitrary q-space sampling patterns. This work introduces a novel and general theoretical framework that precisely describes the relationship between estimated EAP measures and the true original EAP. The framework is applicable to arbitrary linear EAP estimation methods and is compatible with arbitrary q-space sampling patterns. As will be illustrated, this has applications ranging from the design of q-space sampling patterns, to the selection of EAP estimation methods, and to a more detailed and more precise theoretical understanding of recently-reported empirically- observed diusion MRI aliasing phenomena [99, 100]. A preliminary account of portions of this work was previously presented in [101]. The fact that our framework is only valid for linear estimation methods may seem like a limitation of the proposed approach, due to the modern popularity of nonlinear diusion estimation methods that are used with DTI [39, 102], diusion kurtosis imaging [80], non- negativity constrained spherical deconvolution [13], diusion basis spectrum imaging [103], NODDI [15], and various other ODF/EAP estimation methods [40, 51, 104{106]. However, it is important to note that there are a large number of powerful existing linear diusion estimation methods, and the development of new linear diusion estimation methods is still an active area of research with demonstrated potential for success [107]. A few examples of linear methods include q-ball imaging [11, 71, 97, 108, 109], FRACT [12, 60], linear spher- ical deconvolution [42, 98, 110, 111], inverse Fourier transform [2, 43, 112{116], generalized q-sampling imaging (GQI) [52], and linear methods for estimating basis function expansions of the EAP [18,19,44{50]. In addition, it is also worth noting that nonlinear methods often rely on parametric diusion models. These parametric diusion models are often idealized and approximate in nature, and their performance can deteriorate unexpectedly in the pres- 31 ence of modeling errors [9, 10, 12]. In contrast, the theoretical framework we develop for linear methods will still provide insight into performance even when the modeling assump- tions of the linear estimation methods are violated, and will be accurate as long as the approximations underlying the q-space model of Eq. (3.1) are valid. 3.3 Theory 3.3.1 Background on Linear Estimation Methods In its most general form, our proposed theoretical framework is applicable to linear diusion estimation methods that generate an estimate of an EAP parameter() of interest according to ^ () = M X m=1 a m ()E(q m ) (3.2) for some linear coecients a m (), m = 1;:::;M, that may depend on additional variables but do not depend on the measured data values. The variables can be arbitrary and chosen in a context-dependent way, e.g., we will show examples where represents a spatial displacement vector and other examples where represents an orientation vector. In practice, common EAP parameters that are estimated in this way include the EAP P (R) itself; the ODF [11,71] ODF (u)/ Z 1 0 P (u)d (3.3) with u2S 2 whereS 2 is the 2-sphere; the constant solid angle (CSA) ODF [40,43,77,78] ODF CSA (u) = Z 1 0 P (u) 2 d (3.4) 32 with u2S 2 ; the return-to-origin probability (RTOP) [48,81,117] RTOP =P (0); (3.5) the return-to-axis probability [48]; the return-to-plane probability [48]; the mean-squared displacement [18, 113{115]; the mean fourth-order displacement [18]; and the generalized kurtosis [18]. All of these EAP-derived parameters can be interesting in practical applications because they are often sensitive to dierent aspects of the tissue microarchitecture. While there are many linear diusion estimation methods that have been proposed, it is worth noting that only a few of them were written in the form of Eq. (3.2) in the original references. Methods that directly use expressions of this form were usually derived us- ing numerical approximations (to account for nite/non-uniform data sampling) of analytic transform integrals, and include the inverse Fourier transform [2, 43, 112{115], the Gen- eralized q-sampling imaging (GQI) approach [52], and the Funk-Radon transform (FRT) approach [11,71]. For example, the inverse Fourier transform approaches often estimate the EAP according to ^ P (R) = M X m=1 w m e i2q T m R E(q m ); (3.6) where thew m coecients are appropriate numerical quadrature weights that can be used to account for non-uniform q-space sampling density. In this case, a m () =w m e i2q T m . On the other hand, many recent linear approaches are based on the use of basis function expansions [12, 18, 42, 44{50, 60, 98, 108{111], which require some manipulation in order to be expressed in the form of Eq. (3.2). For completeness, we describe the mapping of such methods into Eq. (3.2) in 3.A. 33 3.3.2 Theoretical Characterization of Linear Estimation We can gain theoretical insight into linear diusion estimation methods by inserting the data acquisition model from Eq. (3.1) into the linear EAP parameter estimation equation from Eq. (3.2), which leads to ^ () = M X m=1 a m () Z P (R)e i2q T m R dR +n (q m ) = Z P (R) M X m=1 a m ()e i2q T m R ! dR + M X m=1 a m ()n(q m ): (3.7) While we could analyze Eq. (3.7) directly, further simplications are possible if we assume a pure diusion process in which the EAP is zero-mean, real, and symmetric with P ( ~ R) = P ( ~ R) for all possible ~ R. It is well known that in this case and in the absence of noise, E(~ q) =E(~ q) for all possible ~ q [71,118]. This symmetry leads to the simplication ^ () = Z P (R)g(; R)dR | {z } Noise-Free Contribution + M X m=1 a m ()n(q m ) | {z } Noise-Only Contribution : (3.8) with g(; R), M X m=1 a m () cos(2q T m R): (3.9) Compared to Eq. (3.7), Eq. (3.8) has the advantages of not involving any complex-valued numbers and implicitly incorporating common q-space and EAP symmetry assumptions. Equation Eq. (3.8) provides the foundation for our theoretical analysis of linear diusion estimation methods. We believe that this result is powerful because it provides a direct relationship between the linearly-estimated EAP parameter ^ () with the true EAPP (R) and the noisen(q), and can be used to assess the accuracy and noise-sensitivity of arbitrary linear estimators. As indicated in Eq. (3.8), there are two distinct terms that contribute to the estimated 34 EAP parameter: a noiseless contribution from the true original EAP and a noise-only con- tribution term. We will discuss these two contributions in more detail in the following subsections. Contributions to the Linear Estimate from the True EAP In the absence of noise, Eq. (3.8) simplies to ^ noise-free () = Z P (R)g(; R)dR: (3.10) This equation provides a direct relationship between the true original EAP P (R) and the deterministic noise-free component of the estimated EAP parameter ^ (). Interestingly, this relationship is very similar to the standard point-spread function relationship that is commonly used to characterize the quality of MRI image reconstruction from sampled k-space data [119]. Analysis of the functiong(; R) in this relationship can be very insightful, because the shape of this function will determine the resolution/accuracy/bias of the estimator. To be consistent with the spatial response function terminology used for similar theoretical analyses in MRI image reconstruction [120{123], we refer to g(; R) as the EAP response function (ERF). As an illustrative example, consider the case where we are interested in using a linear estimation method to estimate the EAP at a point = R 0 from sampled data. In this case, the ideal true value of P (R 0 ) can be expressed as P (R 0 ) = Z P (R) f (R R 0 )dR; (3.11) or equivalently (due to symmetry) as P (R 0 ) = Z P (R) f (R R 0 ) + f (R + R 0 ) 2 dR; (3.12) 35 where f () denotes the Dirac delta function. In addition, we can also use Eq. (3.10) to nd that any linear estimate of ^ P (R 0 ) from noiseless sampled data will be related to the true EAP through the relationship ^ P (R 0 ) = Z P (R)g(R 0 ; R)dR: (3.13) for some ERF g(R 0 ; R) dened through Eq. (3.9). However, while Eq. (3.13) has similar integral form to Eqs. 3.11 and 3.12, the estimated EAP value is unlikely to perfectly match the true value unless the ERF satises g(R 0 ; R) = f (R R 0 ) + f (R + R 0 ) 2 : (3.14) Since it is impossible for a nite cosine series in the form of Eq. (3.9) to perfectly reproduce a delta function, we can infer that our linear estimate of ^ P (R 0 ) is, at best, related to a \blurred" version of the EAP. Importantly, our theoretical framework now provides a mechanism for constructing the ERF and understanding this blurring behavior, allowing us to predict the resolution characteristics of dierent linear estimation schemes and providing opportunities for optimizing the design of q-space sampling schemes and associated EAP estimation methods. As another illustrative example, consider the case where we are interested in estimating the CSA ODF from Eq. (3.4) along the direction specied by = u 0 = [0; 0; 1] T . In this case, the ideal CSA ODF from Eq. (3.4) can be rewritten as ODF CSA (u 0 ) = Z 1 0 P ([0; 0;z] T )z 2 dz = 1 2 Z P (R)z 2 f (x) f (y)dR; (3.15) while a linearly estimated CSA ODF from nitely-sampled noiseless data could be expressed 36 using Eq. (3.10) as ^ ODF CSA (u 0 ) = Z P (R)g(u 0 ; R)dR: (3.16) for some ERF g(u 0 ; R) dened through Eq. (3.9). Similar to our above analysis of EAP estimation and our previous analysis of ODF esti- mation [12], we should not expect to always get perfect estimation of the CSA ODF unless the ERF g(u 0 ; R) = z 2 f (x) f (y)=2. And because we cannot achieve Dirac delta functions using the nite cosine series from Eq. (3.9) that denes the ERF, the CSA ODF is also expected to be a \blurred" version of the desired CSA ODF. As for the previous case, our theoretical framework again provides a mechanism for understanding this blurring through the ERF, and provides opportunities for optimizing experiment design and parameter esti- mation methods to achieve an ERF with desirable characteristics. These two illustrative examples are representative of our general approach to theoretical characterization of linear estimation methods, which can also be used to analyze linear esti- mation for many other EAP parameters such as those described by Eqs. 3.3 - 3.5. Concrete numerical demonstrations of the ERF for dierent q-space sampling and linear estimation schemes will be presented later in Section 4.5. Contributions to the Linear Estimate from Noise In addition to the noise-free contribution from Eq. (3.10) which determines the resolu- tion/accuracy/bias of the linear estimator, it is also important to understand the noise-only contribution which determines the precision/variance of the estimator. In the absence of signal, Eq. (3.8) simplies to ^ noise-only () = M X m=1 a m ()n(q m ): (3.17) 37 N = 3 N = 6 N = 10 N = 12 Figure 3.1: Comparison of the ERFs g(R 0 ;R) for 3D-SHORE EAP estimation for dierent maximum radial orders N. Specically, the ERF is shown as a function of x and z (i.e., R = [x; 0;z] T ) for R 0 = [15; 0; 0] T m. The green and pink lines superimposed on these images dene 1D axes that pass through the peaks of the ERFs, with reference to Fig. 3.2. Analysis of this expression requires additional information about the characteristics of the noise samples n(q m ). The statistical modeling of diusion MRI images can be quite complicated [28, 29, 124], and our description will be presented as generically as possible to accommodate all of the dierent possible signal distributions that may be encountered in practice (ranging from Rician or non-central chi noise for magnitude images, to complex Gaussian noise for complex images). The main assumption we will make is that each noise sample n(q m ) is a random variable that is statistically independent from the other samples. For convenience, we will use m to denote the expected value<n(q m )> (where we have used<> to denote statistical expectation of a random variable) and 2 m to denote the variance <jn(q m )j 2 j m j 2 > of each noise perturbation n(q m ). Given this notation, standard results from probability theory indicate that the mean of ^ noise-only () will is given by < ^ noise-only ()>= M X m=1 a m () m ; (3.18) 38 and that the second-moment of ^ noise-only () is given by <j ^ noise-only ()j 2 >= M X m=1 a m () m ! 2 + M X m=1 ja m ()j 2 2 m : (3.19) As a result, if we know the mean and variance of the noise perturbations in our data, it is also straightforward to use the a m () values to compute the mean and variance associated with any linear EAP parameter estimate. In many cases, the means of each noise sample are either approximately zero (e.g., in the case of zero-mean Gaussian noise or high-SNR magnitude images) or can be estimated (e.g., in the case of low-SNR magnitude images with a rectied noise oor [28,29]) to demean the noise. In these cases, the noise-only component of the estimate will have approximately zero mean and an easily-calculated variance. 3.4 Results In this section, we show several illustrative examples of calculating ERFs for commonly used linear estimation methods, and demonstrate that the theoretically-derived ERF characteris- tics are predictive of the practical performance characteristics observed through substantially more complicated empirical evaluation. Moreover, we demonstrate that the insight provided by ERFs can help to simplify a number of practical diusion acquisition and parameter estimation decisions that are often encountered in practical experiments. While there are many interesting practical questions raised by our theoretical analysis, our main objective for this paper is not to provide comprehensive answers to these questions or to make any specic recommendations for how researchers should acquire or process their data { instead, our objective is simply to demonstrate that our proposed theoretical analysis tools have the potential to help answer a range of interesting questions across a range of dierent acquisi- tion schemes, estimation schemes, and diusion imaging contexts. As a result, the examples 39 we have chosen to show are intentionally diverse, and we intentionally move rapidly from one context to the next without digging too deeply into any specic application. The estimation implementations we have used in this section are available as part of the open-source BrainSuite (http://brainsuite.org/) software [125]. 3.4.1 Theoretical Insights into Estimation Parameter Selection Estimation methods typically have estimation parameters that need to be chosen by the user prior to data analysis (e.g., the number of basis functions to use or the regularization parameter for a linear basis function expansion method, as discussed in 3.A). In existing implementations, these parameters are usually either chosen heuristically based on visual assessment, or are chosen based on extensive empirical evaluation of dierent possibilities. In this section, we investigate the usefulness of ERFs for simplifying these kinds of decisions. For the sake of concreteness, we consider the case of EAP estimation using the 3D Simple Harmonic Oscillator based Reconstruction and Estimation (3D-SHORE) model [48], and will assess the usefulness of ERFs for assisting with the choice of model order parameters. The 3D-SHORE model uses a linear basis function expansion as explained in 3.A. Following [51], we consider the orthonormal 3D-SHORE basis functions given by: n`m (q) = s 2(n`)!kqk 2` 2 (`+ 3 2 ) (n + 3 2 ) e kqk 2 2 2 L `+ 1 2 n` kqk 2 2 Y m ` q kqk 2 ; (3.20) for ` = 0; 1;:::;L, n = `;` + 1;:::;N, and, m =`;` + 1;:::;` 1;`, where N is the maximum radial order parameter and L is the maximum angular order parameter of the 3D-SHORE basis. In this expression, L n () is the generalized Laguerre polynomial of order n with parameter,Y m ` () is the real and symmetric version of the spherical harmonic basis function of order` and degreem [109], and () is the Gamma function. Assuming a nominal diusion coecient of 1.2510 3 mm 2 /s and eective diusion time of 39.6 ms and using formulas from [45] and [51], we set the 3D-SHORE scale parameter =256 mm 2 . Given this 40 N = 3 N = 6 N = 10 N = 12 Figure 3.2: A more detailed look at the 3D-SHORE ERFs shown in Fig. 3.1 for EAP estimation with dierent maximum radial orders N. The 1D plots of the ERF values correspond to the (top) pink lines along thex-axis from Fig. 3.1 and (bottom) green lines parallel to thez-axis from Fig. 3.1 that pass through the dominant ERF peaks. The plots also show the positions of the ideal ERF peaks at R 0 =[15; 0; 0] T m. choice of basis functions, the 3D-SHORE reconstruction procedure was implemented using regularization as described in Eqs. 3.22 - 3.25 from 3.A, using Laplace-Beltrami regulariza- tion to encourage smooth angular and radial variations of the EAP. The Laplace-Beltrami operator was implemented as described by [45] and [51], with the regularization parameter set to = 10 8 [45]. For the results shown in this subsection, we considered a multi-shell q-space sampling scheme that we generated using the generalized multi-shell electrostatic repulsion technique described by [126]. Our design uses 3 shells (b-values of 1000, 2000, and 3000 s/mm 2 with an eective diusion time of 39.6 ms), with 64 samples per shell. Based on this reconstruction procedure and q-space sampling scheme, we inserted the results of Eq. (3.25) into Eq. (3.9) to analytically compute ERFs for EAP estimation for dierent choices of the radial model order parameter N (i.e.,N = 3; 6; 10; and 12), with the angular order parameter bounded by N. Figure 3.1 shows illustrative results for this case. Note that the ERF is a 3D function of R (or a 6D function of R and R 0 ), but 3D/6D images 41 are hard to visualize. We have chosen to only show a 2D image of thex-z plane, noting that the features of the ERF do not change substantially with rotation about the x-axis (i.e., the ERFs in this case are approximately axially symmetric). Additionally, due to the fairly uniform q-space sampling pattern, the ERFs in this case are also approximately rotation invariant with respect to R 0 , so we have chosen to only show R 0 = [15; 0; 0] T m. We have also chosen to only show the region in which x and z are smaller than 50 m, because in practical in vivo brain imaging scenarios at room temperature, a negligible fraction of spins would diuse farther than 50 m over a 39.6 ms time interval. If EAP estimation were perfect, we would see two delta functions in the ERF atR 0 . However, instead of observing delta functions, we instead see broad peaks in the ERF that are centered in approximately the correct positions, indicating that our estimate of the EAP at R 0 will be a blurred version of the true EAP. In order to obtain the highest possible resolution for our EAP estimate, it would desirable for the widths of these dominant peaks (i.e., the mainlobe width using signal processing terminology [127]) to be as small as possible. In addition, in order to avoid leakage of information from one part of the EAP into a distant region of the EAP, it is important for the magnitude of the ERF in regions outside of the dominant peaks (i.e., the sidelobe level using signal processing terminology [127]) to be as small as possible. 3 As can be seen from Fig. 3.1, the mainlobe and sidelobe characteristics have clear dierences for dierent values of N. These mainlobe and sidelobe characteristics are potentially even easier to visualize in 1D plots passing through the ERF peaks, as shown in Fig. 3.2. As can be seen from these gures, all four dierent choices of N lead to peaks that are centered in approximately the correct location. However, the ERF behavior is dierent in each case. Specically, it can be observed that the mainlobe sizes and shapes are fairly 3 It should be noted that interpretation of leakage in terms of the ERF requires some nuance. Specically, substantial signal leakage will occur only if the EAP and the ERF are both large simultaneously in regions outside of the dominant peaks. In practice, real EAPs tend to be highly concentrated and decay rapidly away from R =0, meaning that it is often not problematic to have large sidelobes for values of R that are beyond the eective range of the EAP. 42 (a) True EAP (b) N =3 (c) N =6 (d) N =10 (e) N =12 Figure 3.3: EAPs estimated using 3D-SHORE with dierent maximum radial orders N from noiseless two-tensor simulated data. For simplicity, we only show the 2D plane containing the principal directions of both tensors instead of showing the full 3D estimated EAPs. similar for N = 6; 10; and 12, while the mainlobe for N = 3 is much wider. Based on the structure of the ERFs, we would expect the N = 3 case to yield very dierent EAP estimation performance compared to the other cases, which are expected to be much more similar to one another. Simulations were performed to examine how ERF characteristics correlate with practical EAP estimation quality. Data was synthesized by modeling the EAP as the linear com- bination of two diusion tensor models. Each diusion tensor component was given the same relative volume fraction. The principal diusion tensor eigenvalue for both tensors was systematically varied between 0.510 3 mm 2 /s and 310 3 mm 2 /s, while the relative orientations of the principal eigenvectors were systematically varied between 0 and 90 de- grees, leading to a total of 400 dierent synthetic EAPs. The two smallest eigenvalues of the diusion tensors were always set to 0.210 3 mm 2 /s. For qualitative illustration, Figure 3.3 shows estimated EAPs resulting from applying 3D-SHORE to simulated noiseless data with the tensors crossing at 90 degrees and the largest diusion tensor eigenvalue set to 1:4 10 3 mm 2 /s. As would be expected based on the ERFs, the N = 3 case has much dierent behavior than the other cases, and leads to estimation performance that is much worse. Specically, the N = 3 case has failed to resolve the orientations of the two tensors, while the orientations of the two tensors are clearly resolved in all the remaining cases. The loss of angular resolution in the N=3 case is 43 Figure 3.4: (Left) Empirical EAP estimation error at R 0 = [15; 0; 0] T m for each of the 400 simulated noiseless multi-tensor datasets, plotted as a function of the maximal radial orderN of the 3D-SHORE basis. The data is shown using a box and whisker plot, in which the median estimation error is shown using a horizontal black line, while the green box shows the interquartile range between the 25th and 75th percentiles of the empirical distribution. The dashed red \whiskers" are lines spanning the range of the empirical distribution above and below the interquartile range, while observations beyond the whisker length are marked as outliers and denoted with red plus symbols. Samples are marked as outliers if their distance away from the interquartile range is more than 1.5 times the length of the interquartile range. (Center) The corresponding FWHM of the ERF as a function of N. (Right) The theoretical noise variance (normalized to be invariant to by dividing by 2 ) as a function of N, calculated using Eq. (3.19) assuming m = = 1 for all m. not surprising based on the much larger ERF mainlobe width observed in Figs. 3.1 and 3.2 for this case. Quantitative results are shown in Fig. 3.4. Specically, we quantied the 3D-SHORE EAP estimation error at R 0 = [15; 0; 0] T m for each of the 400 simulated noiseless datasets. Error was computed using the absolute dierence between the estimated EAP ^ P (R 0 ) and the ground truth EAP P (R 0 ), i.e., Error =jP (R 0 ) ^ P (R 0 )j. For comparison, we also computed the equivalent full-width at half-maximum (FWHM) of the ERF for the same EAP location R 0 = [15; 0; 0] T m. The FWHM was computed by computing the maximum value of the mainlobe peak, and then computing the volume of the mainlobe where the ERF magnitude was larger than half the maximum value. Finally, the FWHM was computed as the diameter of an equivalent sphere with the same volume. Note that the FWHM is a common metric used to assess the resolution of a point-spread function in a conventional imaging experiment, with smaller FWHM values corresponding to higher resolution [119]. 44 As can be seen in Fig. 3.4, the FWHM of the ERF follows the same trend as the EAP estimation error, and both curves have a minimum atN = 6. Quantitatively, the correlation coecient between the FWHM of the ERF and the median of the EAP estimation error is R = 0:9650, which is highly signicant (p< 10 5 ).These results provide a direct indication of the potential usefulness of the ERF, since the theoretically-derived FWHM curve could be used to choose the optimal radial order without requiring extensive simulations to come to the same conclusion. However, it should also be noted that the estimation error and ERF characteristics are both functions of R 0 , and that the behavior observed in Fig. 3.4 is specic to R 0 = [15; 0; 0] T m. We do not show extensive results for other values of R 0 due to space constraints, but generally speaking, we have observed that in regions where the true EAP has large intensity, the empirically observed estimation error follows the same basic trend as the FWHM. On the other hand, in regions where the true EAP intensity is small (e.g., at R 0 = [35; 0; 0] T m in Fig. 3.3, at which the true EAP value is approximately zero), we observe that the EAP estimation error instead follows the same trend as the maximum sidelobe level of the ERF. Both of these observations are consistent with intuition { when the EAP value is large, poor resolution (as measured using the FWHM) is likely to be one of the primary sources of estimation error. On the other hand, in regions where the EAP is very small, the errors are much more likely to be dominated by leakage of information from distant regions where the EAP has higher intensity. It is also worth mentioning that, perhaps surprisingly, the optimal choice of N, as identied using either the minimum value of the empirical estimation error or the theoretical characteristics of the ERF, has a strong dependence on R 0 . While N = 6 was optimal for R 0 = [15; 0; 0] T , we observe that larger values ofN should be preferred (both empirically and theoretically) for larger R 0 . While the previous results considered EAP estimation quality in the noiseless case, our theoretical framework can also provide a theoretical measure of the noise variance us- 45 (a) b-values: [1000; 2000; 3000] (b) b-values: [1500; 3000; 4500] (c) b-values: [2000; 4000; 6000] Figure3.5: The ERFs for SPF-based RTOP estimation for dierent multi-shell sampling schemes. The three b-values used for each multi-shell scheme are shown in units of s/mm 2 . ing Eq. (3.19). We have shown a plot of the value of Eq. (3.19) as a function ofN in Fig. 3.4, assuming m = 0 and m = 1 for all m. Interestingly (and perhaps coincidentally), we ob- serve that, in this case, the noise variance has its largest value atN = 6, which is exactly the same radial order at which the FWHM has its minimum value. As a result, our theoretical framework suggests that the optimal choice ofN will represent a trade-o between bias and variance in this case. Empirical results using noisy simulated data conrm this conclusion, though are not shown here due to space constraints and because we illustrate this trade-o in a dierent context in the sequel. It is also worth mentioning that while the ERF is easily calculated to provide insight into the theoretical characteristics of a given estimator, it does not necessarily illuminate the reasons why a given estimator will have a given theoretical characteristics. For instance, the ERF alone does not provide direct insight into what causes the case with N = 6 to have the smallest FWHM in this example. To answer questions like this, it would be necessary to perform a more detailed investigation into the specic numerical characteristics of 3D- SHORE with this specic sampling pattern. 46 3.4.2 Theoretical Insights into Resolution/Noise Trade-os In this subsection, we use our theoretical framework to explore the trade-os between reso- lution and noise variance in the context of RTOP estimation from multi-shell diusion data. For the purpose of illustration, we estimated RTOP using a dierent estimation model from that of our previous example: a linear basis function expansion method using the Spherical Polar Fourier (SPF) basis described by [44]. The estimation procedure was implemented by simply replacing the 3D-SHORE basis with the SPF basis in the 3D-SHORE implementation from the previous subsection, leaving all other aspects the same. We used a maximum radial order of 6 and a maximum angular order of 8 for the SPF basis, and the SPF scale parameter was set to =457 mm 2 (based on a nominal diusion coecient of 0.710 3 mm 2 /s). For further illustration, we considered three dierent multi-shell q-space sampling schemes. The rst sampling scheme was identical to that described in the previous subsection (3 shells, with 64 orientations per shell, and b-values of 1000, 2000, and 3000 s/mm 2 ). The remaining two sampling schemes were identical to the rst, except that we modied the b-values of each shell: one scheme used shells with b-values of 1500, 3000, and 4500 s/mm 2 , while the other scheme used shells with b-values of 2000, 4000, and 6000 s/mm 2 . The ERFs for SPF-based RTOP estimation are shown in Fig. 3.5. We observe from this gure that the mainlobe width of the ERF becomes smaller as we move from smaller to larger b-values. This observation is quantied using the FWHM (calculated using the same method from the previous subsection) of the ERF, which is plotted in Fig. 3.6. This theoretical characterization suggests that RTOP estimation is expected to be more accurate using the higher b-value sampling scheme, at least in situations when the SNR is high. On the other hand, we also used Eq. (3.19) to calculate the expected noise variance in each case. This is also plotted in Fig. 3.6, and we observe that the noise variance increases as we move from smaller larger b-values. This suggests that, holding all other aspects of parameter estimation xed, the choice of sampling scheme will represent a trade-o between bias and 47 Figure 3.6: Plots of the theoretically-derived ERF FWHM and noise variance (normalized to be invariant to by dividing by 2 ) values for SPF-based RTOP estimation. variance for RTOP. To test these theoretical predictions, we considered the same set of noiseless two-tensor EAP simulation models described in the previous subsection, and have added simulated zero- mean Gaussian noise with noise variance 2 . The Gaussian noise variance was designed to achieve signal-to-noise ratios (SNRs) of 500, 150, 80, and 40, with SNR dened as E(0)=. Subsequently, we calculated the magnitude of this noisy simulated data, resulting in Rician- distributed measurements. Subsequently, SPF-based RTOP estimation was applied to this data, and the resulting RTOP estimation error distributions are plotted in Fig. 3.7. The results from Fig. 3.7 demonstrate that at high SNR, the empirical RTOP estimation error follows the same trend as the FWHM of the ERF, while at low SNR, the empirical RTOP estimation error follows the same trend as the theoretical noise variance. Quanti- tatively, the correlation coecient between the FWHM of the ERF and the median of the RTOP error isR = 0:9995 for the highest SNR case (signicant withp< 0:05), while the the correlation coecient between the theoretical noise variance and the median of the RTOP error is R = 0:9595 for the lowest SNR case (while the correlation is very high, this result is not statistically signicant, though we suspect we would observe signicance in this case if we had analyzed more than three experiment designs). At intermediate SNR values, the empirical RTOP error can be viewed as a linear combination of the theoretically-derived 48 (a) SNR 500 (b) SNR 150 (c) SNR 80 (d) SNR 40 Figure 3.7: Plots of the empirically-observed RTOP estimation error for SPF-based estimation with dierent multi-shell sampling schemes. The data is shown using box and whisker plots, using the same formatting conventions as in Fig. 3.4. FWHM and noise variance trends. These empirical results are consistent with our theo- retical predicitions, and support the potential usefulness of our theoretical framework for practical decisions associated with noise/resolution trade-os and q-space sampling scheme selection. 3.4.3 Theoretical Comparisons of Dierent Estimation Methods In many situations, it is common for diusion MRI users to be unsure about which estimation procedure will give the best results with their data. In this subsection, we demonstrate that our theoretical framework can provide valuable information to assist users in making this kind of decision. For concreteness, we consider the problem of ODF estimation from single-shell diusion data using the FRT [11,109] and FRACT [12], and ODF estimation from multi-shell diusion data using GQI [52] and 3D-SHORE [48,51]. In our implementations of FRT and FRACT, we used a spherical harmonic basis expansion with maximum angular order L = 8 and Laplace-Beltrami regularization with = 0:006 [109]. Our implementation of 3D-SHORE used the same settings as in Section 3.4.1, with maximum radial orderN = 8 and maximum angular orderL = 8. The FRT and GQI were designed to estimate the ODF from Eq. (3.3), 49 FRT GQI FRACT 3D-SHORE Figure 3.8: Comparison of the ERFsg(u 0 ;R) for dierent ODF (FRT aand GQI) and CSA ODF (FRACT and 3D-SHORE) estimation methods for single-shell and multi-shell data. Specically, the ERFs are shown as a function ofx andz (i.e., R = [x; 0;z] T ) for the orientation u 0 = [0; 0; 1] T . The green and pink lines superimposed on the top row of images dene 1D axes that pass through the peaks of the ERFs, with reference to the 1D plots shown in the bottom two rows. FA FRT GQI FRACT 3D-SHORE Figure 3.9: ODFs (FRT and GQI) and CSA ODFs (FRACT and 3D-SHORE) estimated from real in vivo HCP brain data. ODFs and CSA ODFs are shown from a region of crossing white matter bers, as marked in the color-coded fractional anisotropy (FA) image. while FRACT and 3D-SHORE were designed to estimate the CSA ODF from Eq. (3.4). For multi-shell ODF and CSA ODF estimation, the acquisition was chosen identical to the Human Connectome Project (HCP) protocol [10], which has three shells at b-values of 50 1000, 2000, and 3000 s/mm 2 , with 90 samples per shell and diusion timing parameters of = 43:1 and = 10:6 ms. For single-shell ODF and CSA ODF estimation, we simply used the outermost (b = 3000 s/mm 2 ) shell from the HCP protocol. The ERFs for these dierent cases are shown for orientation u 0 = [0; 0; 1] T (see Eq. (3.16) for notation conventions) in Fig. 3.8. With this choice of u 0 , the ideal ERF is g(u 0 ; R) = f (x) f (y)=2 for ODF estimation using Eq. (3.3), while the ideal ERF isg(u 0 ; R) =z 2 f (x) f (y)=2 for CSA ODF estimation using Eq. (3.4). As can be seen in Fig. 3.8, the dierent sampling and reconstruction approaches yield ERFs that approximate these ideal ERFs with dierent levels of accuracy. We observe that along thez direction (parallel to the ODF orientation being estimated), the ERFs for the ODF-estimation methods FRT and GQI are approximately constant (as desired in the ideal ODF case) for a range of z values near the origin, although the ERFs undesirably decay away for larger values ofz. Specically, the ERF value is within 20% of the ideal ERF value forz values up tojzj = 15m for FRT andjzj = 18m for GQI. Similarly, the ERFs for CSA ODF-estimation methods FRACT and 3D-SHORE approximately follow a z 2 trend (as desired in the ideal CSA ODF case) for a range of z values near the origin, although the ERFs also undesirably decay away for larger values of z. The ERF value is within 20% of the ideal ERF value for z values up tojzj = 21 m for FRACT andjzj = 31 m for 3D-SHORE. We expect that the width of the mainlobe in the xy plane (orthogonal to the ODF orientation being estimated) will be a strong predictor of the angular resolution of the ODF estimate. Interestingly, we observe that the two estimation methods used with single-shell data (FRT and FRACT) have smaller mainlobe widths along the x-dimension than the corresponding ODF or CSA ODF estimation methods used with multi-shell data (GQI and 3D-SHORE). Specically, the mainlobe FWHMs along the x dimension are calculated as 11.08 m for FRT versus 13.96 m for GQI, and 12.07 m for FRACT versus 16.61 m for 51 3D-SHORE. This observation is perhaps surprising, because the single-shell data is a subset of the multi-shell data, and it would generally be expected that the use of more data would lead to improvements in angular resolution. Nevertheless, empirical tests using an EAP model with two crossing diusion tensors (similar to those described in previous subsections) validate that FRT demonstrates quantitatively better angular resolution than GQI and that FRACT demonstrates quantitatively better angular resolution than 3D-SHORE for the q- space sampling schemes and estimation parameters we assumed for this analysis. It should be noted, however, that the single-shell methods have produced ERFs with larger sidelobes than the multi-shell methods in this case (the ratio between the highest sidelobe value and the value of the ERF peak was 0.42 for FRT versus 0.28 for GQI, and 0.38 for FRACT versus 0.20 for 3D-SHORE), such that the use of more data in the multi-shell case is apparently helping to reduce the amount of potential signal leakage from distant regions of the true EAP. Results obtained by applying these four ODF estimation methods to real human brain data distributed by the HCP are shown in Fig. 3.9. Qualitatively, this gure demonstrates the expected behavior, with the FRT having apparently higher angular resolution (more pronounced separation between ODF peaks in a region of known crossing bers) than GQI, and the FRACT having apparently higher angular resolution than the 3D-SHORE estimate. Quantitative empirical simulation results are shown in supplementary Fig. S1, which similarly suggest that the FWHM of the ERF follows the same trend as empirically-observed angular resolution. Overall, this investigation has shown that ERFs can also provide valuable insight that can inform practitioners about the expected characteristics of dierent diusion estimation schemes. However, we would like to caution readers to not draw the conclusion that single- shell sampling is always better than multi-shell sampling, or that FRT and FRACT are always better than GQI and 3D-SHORE. These ndings were true for the specic data 52 Cartesian Cartesian Cartesian Multi-Shell Multi-Shell Moderate Density High Density Low Density Moderate Density Low Density Moderate Coverage Moderate Coverage Large Coverage Moderate Coverage Large Coverage Figure3.10: (Top row) Various Cartesian and multi-shell q-space sampling schemes with dierent sampling densities and dierent amounts of q-space coverage. Note that in the Cartesian case, the sampling points are spaced evenly in q-space, which results in parabolic distribution of the sampling points when represented using b-values. For the Cartesian case, we have only shown the sampling locations where the z-component of the b vector is zero to assist in visualization. For the multi- shell case, we have rotated each sampling point to the nearest point in the b x -b y plane to assist in visualization. (Middle row) ERFs g(R 0 ;R) for GQI-based EAP estimation, shown as a function of x and z (i.e., x = [x; 0;z] T ) for R 0 = [0; 0; 0] T m. (Bottom row) EAPs estimated from simulated noiseless multi-tensor data with two diusion tensors crossing at a 75 angle. sampling schemes and parameter estimation settings that we considered in this investigation, although results may dier substantially in dierent contexts. 3.4.4 Theoretical Insights from Sampling Theory In the modern q-space literature, it is relatively rare for authors to discuss fundamental sampling theory concepts like the Nyquist rate, aliasing, and resolution, even though such concepts should be fundamental for Fourier-based EAP reconstruction methods like diusion spectrum imaging (DSI) [43] and radial DSI [128]. When these concepts are discussed, they 53 are frequently examined empirically rather than theoretically [99,100], and popular methods for DSI sampling design are often based on extensive empirical testing [129, 130] instead of leveraging well-established insights from sampling theory. In this regard, q-space sampling design methods are very dierent from modern MRI k-space sampling design methods (which rely heavily on theoretical concepts like the point-spread function [119]), even though the underlying Fourier sampling and reconstruction problems are nearly identical between the dierent cases. In this subsection, we observe that a theoretical analysis using ERFs helps to bridge this longstanding gap, and allows new insights into the various trade-os associated with dierent sampling approaches. For the sake of concreteness, we considered GQI-based EAP reconstruction [52] using several Cartesian and multi-shell sampling schemes that cover a range of dierent sampling densities (with higher density characterized by smaller distance between nearby q-space sam- pling locations) and dierent amounts of q-space coverage (with larger coverage characterized by higher maximum b-values), as illustrated in Fig. 3.10. The Cartesian sampling schemes were designed using either the 515-sample (for the high sampling density case) or 203-sample (for the other two cases) DSI sampling patterns described by [129], with appropriate scaling to achieve a maximum b-value of 4000 s/mm 2 (for the moderate q-space coverage cases) or 10000 s/mm 2 (for large q-space coverage). The multi-shell schemes were designed with 3 shells with 68 samples per shell using generalized multi-shell electrostatic repulsion [126], with b-values of 1300, 2700, and 4000 s/mm 2 (for moderate q-space coverage) or 3300, 7000, and 10000 s/mm 2 (for large q-space coverage). In all cases, the diusion times were set to = 21:8ms and = 12:9 ms. In addition to sampling patterns, Fig. 3.10 also shows ERFs along with corresponding EAP reconstructions from simulated multi-tensor data with two diusion tensors crossing at a 75 angle. The behavior we observe is that the mainlobe width of the ERF and the resolution of the estimated EAP seems to be primarily in uenced by the extent of q-space coverage, 54 with smaller FWHMs and higher resolution when larger b-values are measured. There is not a substantial dierence in mainlobe width between Cartesian and multi-shell sampling when the amount of q-space coverage is matched. On the other hand, the sidelobe behavior is heavily in uenced by the sampling density, with smaller sidelobes when the sampling density is higher. There is also a substantial dierence in aliasing structure between Cartesian sampling versus multi-shell sampling, with Cartesian sampling producing highly coherent aliasing artifacts if data is not sampled at the Nyquist rate, while multi-shell sampling produces incoherent artifacts. All of these features are easily predicted by standard k-space sampling theory for Cartesian and radial sampling trajectories [119], though are not usually discussed or analyzed in a sampling-theoretic context for diusion MRI. When looking at these results, it may be tempting to try and decide which of these q-space sampling schemes is optimal. However, it is important to point out that this kind of question is impossible to answer denitively without additional context. Specically, the ERF is a tool that can be used to inform a practitioner about the behavior of dierent sampling schemes, but the ERF does not have any ability to infer which EAP features are most important for a given application. As a concrete example, the low-density large- coverage Cartesian sampling results from Fig. 3.10 can provide an excellent representation of the EAP if a practitioner knows to discard all of the unwanted aliases. On the other hand, if a practitioner blindly used the estimated EAP to generate an ODF estimate using Eq. (3.3), the presence of aliasing would cause the ODF estimate to be heavily biased [99]. While it is possible to use the ERF to select between dierent sampling patterns, it is generally necessary to have specic optimality criteria in mind (e.g., as was illustrated when comparing the specic angular resolution criterion between single-shell and multi-shell acquisition in the previous subsection). 55 3.5 Discussion and Conclusions This work described a new EAP-based theoretical framework for understanding the behavior of linear diusion estimation schemes, particularly with respect to resolution, signal leakage, and noise sensitivity. Since this framework makes minimal assumptions (i.e., it only relies on the standard Fourier model of q-space data), it is broadly applicable to a wide range of dierent diusion acquisition, estimation, and modeling methods. We also showed a variety of illustrative applications for such theoretical characterizations in the contexts of choosing between dierent estimation methods and choosing between dierent q-space sampling patterns. However, it is worth reemphasizing that our goal in each of these cases was only to be illustrative and not to be exhaustive { there are many interesting facets that we have left unexplored, and which provide potentially promising directions for future research. Nevertheless, we believe that even our simple examples have already produced new insights that have not been previously accessible, and also provide new opportunities for ERF-based optimization (based on mainlobe widths, sidelobe levels, etc.) and design of linear diusion estimation methods. These types of approaches have not been previously possible to the best of our knowledge, aside from our preliminary work on CSA ODF estimation [60]. However, it is also worth mentioning that there is not a single way of measuring mainlobe widths and sidelobe levels that is universally accepted within the signal processing literature [127], such that the choice of optimal methods for quantifying the \goodness" of an ERF remains an interesting open problem. On a related topic, using the ERF to select between two dierent alternative schemes often requires some nuance when the dierence between two methods is relatively small. For example, if a given estimation method has an ERF FWHM that is 0.002 microns smaller than that of another estimation method, does that really mean that the rst method is always better? Similar nuances have existed in the signal processing literature for decades [127]. For example, there is generally no universally optimal choice between the Hann window and 56 the Hamming window in signal processing applications. These two windows are very similar to one another, but represent slightly dierent tradeos in resolution, signal leakage, and noise characteristics. The choice between the Hann window and the Hamming window is nontrivial in practice, and will ultimately depend on more detailed contextual information about the goals of the specic application. This is exactly the same type of nuance that will exist whenever trying to choose between similar ERFs. Nevertheless, we believe that the existence of some form of theoretical characterization is better than having nothing at all. In addition, we believe that it can also be very valuable to have theoretical understanding of the expected input/output behavior of a data acquisition/processing system, rather than just building intuition based on anecdotal empirical observations. In practical scenarios where the EAP modeling assumptions of Eq. (3.1) do not perfectly hold (e.g., in the case of nite-width gradient pulses or in the case of physiological motion), our ERF-based characterization will only provide an approximation of the true character- istics. We expect our theoretical characterization to degrade gracefully in such cases, in much the same way that q-space data processing methods are still routinely applied to data that does not perfectly t the q-space model. However, it should be noted that a thorough exploration of such cases requires a more detailed model of diusion physics than the one considered in this paper, since the EAP alone is no longer sucient to describe the diusion encoding process. While our modeling framework can potentially be generalized to handle a more complicated diusion model that could capture such eects, this kind of extension is beyond the scope of this paper. It is also worth noting that an interesting feature of our proposed theoretical framework is that it enables a theoretical analysis of shift invariance in a way that was not previously possible. Furthermore, during the course of our investigation, we discovered that many linear diusion estimation systems are not shift-invariant in the way that we may have originally anticipated. This observation has a few consequences. From an ERF perspective, this ob- 57 servation means that it may sometimes be worthwhile to compute multiple complementary ERFs that explore dierent regions of the estimation space. In practice, we have found that calculating the ERF is not very computationally expensive, meaning that the need to per- form multiple ERF evaluations is not prohibitive. For example, for our basic unoptimized MATLAB-based implementation and considering EAP estimation using 3D-SHORE with the HCP q-space sampling protocol, it takes an Intel Core i7 3.4 GHz CPU roughly 6.3 seconds to evaluate the ERF at 10 6 dierent R values. However, there are also cases where the ERF characteristics demonstrate some form of invariance. For illustration, supplementary Figs. S2 and S3 analyze the rotation invariance of 3D-SHORE EAP and CSA ODF estimation from data sampled with the HCP protocol. These gures show that the ERFs computed for dierent orientations are approximately invariant to rotation in this case, as may be ex- pected since the spherical harmonic representation (used in the 3D-SHORE basis functions) is robust to rotation, and the q-space data for this case is sampled on uniformly-sampled shells with relatively high sampling density. Cases like this that possess approximate orien- tation invariance are convenient, since they obviate the need to evaluate ERF characteristics at many dierent orientations. Finally, from the perspective of an estimator designer, the fact that many of the current estimators do not seem to possess shift-invariance may be seen as an opportunity to develop better estimators, as has previously been done in other elds like positron emission tomography reconstruction [131,132] or X-ray computed tomography reconstruction [133]. In conclusion, we expect the theoretical tools proposed in this paper to be useful in a wide variety of diusion MRI contexts. In addition, since there are many other MRI-based imaging experiments that use a similar linear data acquisition model with linear parameter estimation (e.g., this is especially true for certain kinds of perfusion/ ow imaging, which can also be formulated in terms of EAPs [2]), we believe that the general approach we have taken is easily expanded to other quantitative MRI modalities. 58 3.A Appendix : Linear Basis Function Expansion Meth- ods As mentioned in Section 3.3.1, there are a variety of linear basis expansion methods [12, 18, 42, 44{50, 60, 98, 108{111] that, after some manipulation, can be expressed in the form of Eq. (3.2). This appendix describes how to perform these manipulations. A common assumption of the linear basis expansion methods is that, in the absence of noise, the noiseless diusion signal E(q) can be expressed as a weighted sum of K basis functions k (q) according to E(q) = K X k=1 c k k (q); (3.21) for some appropriate coecients c k . The basis functions k (q) are frequently chosen as combinations of the spherical harmonic basis to describe the angular component of the diusion signal and polynomial bases to describe the radial component of the diusion decay. Basis function representations are powerful, since the innite dimensional functionE(q) has been represented by a K-dimensional model which could potentially be estimated from as few asK dierent q-space measurements. Additionally, with suitably-chosen basis functions, these methods can enable fast compuational methods for evaluating the EAP parameters dened by the integral formulas from Eqs. 3.3 - 3.5. Estimation of the coecients c k is frequently performed by solving a regularized least- squares optimization problem of the form ^ c = arg min c2R K kHc ek 2 2 | {z } Data Consistency + kDck 2 2 | {z } Regularization ; (3.22) where e2R M is the vector of E(q m ) values, c2R K is the vector of c k values, the matrix H2R MK has entries [H] mk = k (q m ), the matrix D is often used to regularize the solution and reduce the ill-posedness of the matrix inversion (e.g., choosing D as a Laplace-Beltrami 59 operator will reduce noise amplication by encouraging the solution to be smooth), and is the regularization parameter that determines the strength of the regularization constraint. If the nullspaces of H and p D have trivial intersection, then the optimal solution to Eq. (3.22) is ^ c = Ge; (3.23) where the matrix G2R KM is given by G, H T H +D T D 1 H T : (3.24) Once the c k coecients have been estimated, it is common to use Eq. (3.21) to generate an estimate of the measured signal ^ E(q) at every point q, and then apply analytic transfor- mations to ^ E(q) to obtain estimates of the EAP parameters of interest. For example, it is straightforward to estimate the EAP using the inverse Fourier transform according to ^ P (R) = Z ^ E(q)e i2q T R dq = K X k=1 ^ c k Z k (q)e i2q T R dq = K X k=1 ^ c k k (R) = M X m=1 K X k=1 k (R)[G] km ! E(q m ); (3.25) where we have used the fact from Eq. (3.23) that ^ c k = M X m=1 [G] km E(q m ) (3.26) and have used k (R) to denote the inverse Fourier transform of k (q). Note that, as expected, Eq. (3.25) has exactly the desired linear form of Eq. (3.2) with = R anda m (R) = P K k=1 k (R)[G] km . When estimating other EAP-derived parameters, a common approach is to apply the integral formulas from Eqs. 3.3 - 3.5 to the estimated EAP from Eq. (3.25). For example, 60 the CSA ODF from Eq. (3.4) can be estimated as ^ ODF CSA (u) = Z 1 0 ^ P (u) 2 d = Z 1 0 K X k=1 ^ c k k (u) 2 d = M X m=1 K X k=1 B k (u)[G] km ! E(q m ); (3.27) with B k (u), Z 1 0 k (u) 2 d: (3.28) Again, Eq. (3.27) has the desired linear form of Eq. (3.2), with = u and a m (u) = P K k=1 B k (u)[G] km . Linear estimators in the form of Eq. (3.2) for estimating other EAP- derived parameters like those dened in Eqs. 3.3 - 3.5 can be obtained in similar ways. Acknowledgments This work was supported in part by NSF CAREER award CCF-1350563 and NIH grants R01-NS074980, R01-NS089212, and R21-EB022951. Some of the computation for the work described in this paper was supported by the University of Southern California's Center for High-Performance Computing (hpc.usc.edu). Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. 61 Chapter 4 Estimation Method Design for Diusion MRI 62 4.1 Overview Estimation method design in dMRI have been proposed intuitively or make diusion mod- eling assumptions. All these designs are either suboptimal in brain regions where modeling assumptions fail or have an intuitive cost function that might be suboptimal. In addition most designs need to set model parameters prior to the design or have other initial assump- tions that could be suboptimal begin with. These existing designs have also completely ignored the linear systems theory (ERF theory from chapter 3) available to us due to the Fourier transform relationship between the EAP and q-space dMRI signal. For the rst time, due to the ERF theory, we have a framework that relates the sampling scheme and estimation method to eciency of the dMRI protocol. Our work proposes a new design approach that provides a principled way to approach estimation method design. We also present rst initial result to jointly design both the q-space sampling scheme and parameter estimation method for estimating ODFs. 4.2 Introduction Diusion MRI (dMRI) has the unique ability to noninvasively quantify the Brownian motion of water molecules in living tissues. This is important because the Brownian motion of water molecules is highly constrained by the tissue microstructure, and quantitative parameters that are estimated from the dMRI signal have been shown to be highly sensitive to subtle microscopic tissue features that are not observable using other in vivo imaging methods. Because of this unique sensitivity to microstructural changes, the role of quantitative dMRI has become especially prominent in modern neuroscience and neurology research [64,73,94, 95]. In the dMRI literature, the random motion of water molecules is often described by a probability distribution function called the diusion propagator [2, 69]. In particular, the 63 diusion propagator ~ P (Rjr 0 ; ) is the conditional probability that a molecule which starts at a spatial position r 0 will later be found at the spatial position R after diusing for a time interval , and the ensemble average propagator (EAP) P (R) is obtained from the diusion propagator by integrating out the dependence on the initial position [2,69]: P (R) = ZZZ p(r 0 )P (r 0 + Rjr 0 ; )dr 0 ; (4.1) where p(r 0 ) is the probability of initially nding a water molecule at position r 0 . We estab- lished in Eq. (2.16) that EAP is important because, when using a standard Stejskal-Tanner pulsed gradient spin echo sequence under the narrow pulse approximation [1], the noise- less data observed in a dMRI experiment can be accurately represented using the \q-space model," in which the measured noiseless data E(q) and the EAP form a simple Fourier transform pair [2]. For completeness of the rest of the derivation in this chapter we rewrite the Fourier relationship here: E (q) = Z P (R)e i2q T R dR; (4.2) where the q-space sampling location q is a function of the diusion encoding gradients, and we have assumed that the data E (q) has been normalized such that E (0) = 1. Quantitative dMRI methods often attempt to estimate either the EAP itself or param- eters that could easily be derived from the EAP. It is well-known from Fourier sampling theory that perfectly estimating P (R) from E (q) would require an innite number of samples [118] unless strong additional prior knowledge is available. However, since practical dMRI datasets only acquire a limited number of q-space samples due to practical constraints on data acquisition time, the overwhelming majority of dMRI estimation methods have relied on tting the data with low-dimensional approximate signal models in order to enable robust estimation of dMRI parameters from a nite amount of data. A representative (but certainly 64 not comprehensive) set of approximate low-dimensional modeling approaches includes dif- fusion tensor imaging (DTI) (which relies on a signal model with 6 degrees of freedom) [6]; neurite orientation dispersion and density imaging (NODDI) [15] (which relies on a signal model with 6 degrees of freedom); and basis function expansion methods like diusion basis spectrum imaging (DBSI) [103], constrained spherical deconvolution (CSD) [13], and mean apparent propagator (MAP) MRI [48] which rely on signal models with a larger (in the dozens, but still nite) number of degrees of freedom. Importantly, the use of noisy nite q-space sampling together approximate nite-dimensional models can lead to errors in dMRI parameter estimation. While such errors have been char- acterized in certain contexts [9,10,99,100,134], it is generally dicult to predict the nature of estimation errors without making strong additional assumptions about the \true" nature of diusion within the voxel, and it may not be possible to incorporate assumptions that are realistic enough to accurately describe real tissue. To make matters worse, the biophysical mechanisms underlying dMRI contrast are complicated and not always well-understood, and the choice of dMRI models remains a somewhat controversial and contentious subject within the eld [135{137]. Together, these issues make it very dicult to verify that a given dMRI parameter estimation method will robustly produce accurate results. This work introduces a new approach to dMRI that we call EAP Response Function Optimized (ERFO) estimation, which we hypothesize can overcome some of these limitations of model-based dMRI parameter estimation methods. Preliminary descriptions of ERFO have previously been presented in recent conferences [61,138]. Rather than taking the conventional approach where the q-space data is t to a nite- dimensional signal model before producing a quantitative EAP parameter of interest, ERFO instead uses training data to directly learn a mapping from the q-space data to the parameter of interest without any biophysical modeling assumptions. In that sense, ERFO is similar to several learning-based dMRI parameter estimation methods that have very recently appeared 65 in the literature [139{143]. However, ERFO is also distinct from these previous machine learning approaches in a couple of important ways. First, the previous methods all generate dMRI parameter estimates assuming a nite-dimensional model of the diusion MRI signal, which is potentially problematic as described above. And second, the previous methods all yield nonlinear parameter estimation methods, as is conventional in modern machine learning. There are good reasons that modern machine learning methods use nonlinear estimation. In particular, linear estimators are a special case of nonlinear estimators, which means that the performance of a properly trained nonlinear estimator should be no worse than the performance of a properly trained linear estimator. In addition, it has been established that modern machine learning methods are capable of learning arbitrary nonlinear mappings with arbitrary accuracy [144]. This leads to a natural question: given these clear benets of nonlinearity, why would we intentionally design ERFO to generate a linear estimator? Our rationale for using linearity is based on one of the common criticisms of modern machine learning methods: while these methods can often work very well in practice, they are often considered as mysterious \black- box" methods whose decision processes are hard to explain, whose good performance is hard to understand from a theoretical perspective, and which can sometimes fail in very unexpected ways [145,146]. An important limitation of many such methods is that it can be dicult to predict whether a trained machine learning method will generalize to new data that it hasn't specically been trained to handle. This is an important problem in dMRI, since there are many cases where for which we can't obtain a substantial amount of relevant training data (e.g., in cases where rare and/or poorly understood pathology is present). The advantage of imposing linearity on the ERFO estimator is that our previous work introduced a novel signal processing theory that allows for the characterization of linear dMRI estimators using a concept we call the EAP Response Function (ERF) [93]. This 66 ERF-based characterization is virtually \model-free," and depends only on the assumption that the q-space model of Eq. Eq. (4.2) is accurate. Importantly this ERF theory can be used to precisely characterize the accuracy and noise sensitivity of dierent linear dMRI parameter estimation methods, and can be used as a guide for selecting between dierent linear estimation methods [93]. While many linear dMRI estimation methods have been proposed in previous literature [2, 11, 12, 18, 19, 42{50, 52, 60, 71, 98, 108{116] and have demonstrated some level of good empirical performance, we hypothesize that none of these methods is optimal within the class of linear estimation methods. In particular, these existing methods were largely based on nite-dimensional approximate model-tting approaches, and they were specically not designed for optimal performance in the presence of modeling violations and noise. While we do not claim that our new ERFO approach is universally optimal, it is optimal with respect to the datasets and noise level it has been trained for, and our ERF-based theory allows us to predict its performance on new datasets that it hasn't been trained for and which do not need to satisfy any restrictive biophysical modeling assumptions. Our proposed ERFO approach is very general, and as will be shown, can be used to estimate a variety of dierent dMRI parameters from arbitrary q-space sampling schemes. Our empirical results demonstrate that ERFO has good performance when compared against popular existing methods, both linear and nonlinear. 67 4.3 Background 4.3.1 EAP Parameters Our ERFO approach is designed to estimate dMRI parameters that are linear functions of the EAP, i.e., that can be expressed as = ZZZ g (R)P (R)dR; (4.3) where is the parameter of interest, andg (R) is a kernel function that denes the parameter we're interested in estimating. Many of the most popular dMRI parameters can be written in this integral form, including the orientation distribution function (ODF) [40,43], the return- to-axis probability (RTAP) [48], the return-to-origin probability (RTOP) [48, 81, 117], the mean-squared displacement [18,113{115], the mean fourth-order displacement [18], and the generalized kurtosis [18]. While ERFO is broadly applicable to all of these cases, this paper will show illustrative examples involving the ODF, RTAP, and RTOP parameters, so we present more detail about these specic parameters below. ODF The ODF is dened as the marginal distribution obtained by integrating out the radial dependence of the EAP ODF (u) = Z 1 0 P (u) 2 d; (4.4) where u is a unit-length orientation vector. Note that, without loss of generality, if we take R = [x;y;z] T and let u = [0; 0; 1] T , then this expression can be equivalently rewritten in a 68 form consistent with Eq. Eq. (4.3) as ODF (u) = ZZZ P (R) 1 2 (x)(y)z 2 | {z } g (R) dxdydz; (4.5) where() is the Dirac delta function and we have simplied the integration limits using the fact that the EAP for a pure diusion process is symmetric (i.e., P (R) =P (R) for any R) [71]. The ODF describes the probability of water displacement along the given orientation u, and is frequently used in tractography studies because the orientations associated with larger amounts of diusion often coincide with the orientations of the underlying white matter ber bundles. RTAP The RTAP is dened as RTAP = Z P (u)d; (4.6) where u is a unit-length orientation vector dening the `axis' from which RTAP takes its name. Again letting u = [0; 0; 1] T without loss of generality, this expression can also be equivalently rewritten in a form consistent with Eq. Eq. (4.3) as RTAP = ZZZ P (R) ((x)(y)) | {z } g (R) dxdydz: (4.7) The RTAP has been applied as a biomarker for axonal diameter, myelination, and packing density [19,82] in voxels with a single dominant white matter orientation, with u chosen to match the dominant orientation. 69 RTOP The RTOP is dened as RTOP =P (0) = ZZZ P (R) ((x)(y)(z)) | {z } g (R) dxdydz; (4.8) and has been applied as a biomarker for cell volume and cellularity [19,82]. 4.3.2 Linear Estimation While Eq. Eq. (4.3) describes the parameter we would like to estimate, we do not have access to the EAP in practical scenarios, and need to estimate the parameter of interest from the noisy diusion data. We focus in this work on linear parameter estimators, which from an estimate of the parameter of interest using a linear combination of measured q-space data samples according to: ^ = M X m=1 a m E(q m ); (4.9) where it is assumed that we haveM q-space measurements, and thea m values are the linear estimation coecients. While many dierent choices of a m have been proposed over the years using dierent assumptions in dierent linear dMRI parameter estimation contexts, the goal of ERFO will be to learn a set of a m coecients using training data without any biophysical modeling assumptions. 70 4.4 ERFO 4.4.1 Approach One can dene many cost functions for optimizing the ERF function. Our search has been based on the following criteria which are useful properties to have in any cost function : (i) easily optimized, (ii) leads to designs that improve application performance, (iii) ability to generalize and work with dierent models or applications. In our quest we have tried dierent cost functions [60] and found the above properties satised with the Bayes risk cost that we propose here. However, this is just one example cost function and there can be other metrics that work well, might also be better suited for dierent applications. The ERFO approach assumes that we are given training data that will be used to optimize the a m coecients. In particular, we assume that we are given P dierent examples of noiseless datafE p (q)g M m=1 together with corresponding values of the true EAP parameter value p , for p = 1;:::;P . When creating this training dataset, it is desirable that the training examples have somewhat similar diusion characteristics to the real dMRI data you are interested in estimating parameters from. For example, since this work focuses on brain imaging, the examples we show later in the paper create training data based on some of the common signal models that are used with brain dMRI data, with realistic parameters that represent the parameter ranges that are observed in real experiments. ERFO is designed to nd coecients a m that minimize the error between the estimated parameter ^ from Eq. Eq. (4.9) and the true parameter with respect to the training set. While there are many dierent error metrics that we could use, for the sake of simplicity and without loss of generality, the rest of this paper assumes that we are using the mean-squared error as a cost function. This means that ERFO tries to nd a m coecients that minimize the cost function: P X p=1 E 2 4 p M X m=1 a m (E p (q m ) +n(q m )) 2 3 5 ; (4.10) 71 where E[] denotes statistical expectation and n(q m ) is the noise perturbation associated with themth data sample. Our ERFO design approach is independent of the choice of noise distribution, and can for example be used with the Gaussian, Rician, or non-central chi noise distributions that are commonly encountered in dMRI applications [28]. To simplify the rest of our exposition and without loss of generality, the remainder of this paper will assume that we are working with independent and identically distributed Gaussian noise with zero mean and variance 2 , which is appropriate for high signal-to-noise ratio (SNR) data [22], phase-corrected data [147], or certain types of noise-transformed data [148]. In this case, the previous cost function from Eq. Eq. (4.10) simplies to: P X p=1 p M X m=1 a m E p (q m ) 2 | {z } bias +P 2 M X m=1 ja m j 2 | {z } variance : (4.11) As can be seen, there are two components in this expression. The rst term is associated with estimator bias, and represents the amount of error that would be obtained when applying the linear estimator to noiseless data. The second term represents the error associated with noise. The fact that the noise variance 2 appears explicitly in this cost function implies that the optimal ERFO estimator will depend on the noise level { the a m coecients estimated for high-SNR data will be dierent from the coecients estimated for low-SNR data, since the ERFO approach will automatically adapt to nd an optimal trade-o between bias and variance for the given noise level. Equation Eq. (4.11) can be equivalently rewritten as the simple least squares problem: kt Eak 2 2 +P 2 kak 2 2 ; (4.12) where t2R P is the vector of p parameters from the training data, the matrix E2R PM is also formed from the training data and has entries [E] pm = E p (q m ), and a2 R M is the 72 vector of a m coecients. This is a standard optimization problem that has a well-known analytic solution. In particular, the optimal solution for the a m coecients is given by a = (E T E +P 2 I) 1 E T t; (4.13) where I is the MM identity matrix. Since M is usually a small number in dMRI, this analytic training procedure is simple and computationally ecient relative to conventional machine learning methods. The steps described above provide optimized ERFO coecients for one specic parameter of interest (e.g., computing ODF (u) for one specic orientation u). If multiple parameters need to be estimated (e.g., if ODF (u) needs to be computed for several dierent choices of u), then this same procedure can be repeated for each parameter of interest. Note that the only thing that changes when switching from one parameter to another is the choice of t, which enables computational simplication if the matrix (E T E+P 2 I) 1 E T is precomputed and applied to each parameter of interest. 4.4.2 Theoretical Characteristics The previous section described the ERFO approach in detail. While it should be apparent that the ERFO approach will produce good parameter estimates for the datasets it has been trained on, it is less apparent whether the ERFO estimator will perform well on new datasets that it hasn't been trained to handle, which is a common concern for machine learning methods. However, as mentioned previously, the linearity of the ERFO estimator enables us to generate theoretical characterizations of the ERFO estimator that will give us important insights into its performance characteristics and will allow us to predict its ability to generalize beyond the training set. In particular, our previous work [93] showed that linear dMRI estimators can be charac- 73 terized using the relation that ^ = M X m=1 a m (E p (q m ) +n(q m )) = Z P (R)g(R)dR + M X m=1 a m ()n(q m ) (4.14) where the function g(R), M X m=1 a m () cos(2q T m R) (4.15) is the ERF. Equation Eq. (4.14) provides a direct relationship between the estimated parameter ^ and the true original EAP. In particular, the rst term in Eq. Eq. (4.14) does not depend on the noise, and is very similar to the desired expression from Eq. Eq. (4.3), except that the ideal kernel function g (R) has been replaced with the ERF g(R). Noise only appears in the second term of Eq. Eq. (4.14), and this second term allows us to predict the additional errors due to noise [93]. Combining Eq. Eq. (4.3) with Eq. Eq. (4.14) allows us to express the mean-square parameter estimation error as E j ^ j 2 = ZZZ P (R) (g (R)g(R))dR 2 + 2 kak 2 2 : (4.16) As can be seen, it would be desirable to haveg(R) =g (R), since in that case, we would have perfect estimation in the absence of noise. Unfortunately, due to nite sampling, it is generally not possible for g(R) to match g (R) exactly. Nevertheless, we can gain insight into the performance of the ERFO estimator by comparing how closely g(R) matches with g (R). It should be noted that the ERF g(R) is very similar to the concept of a point-spread function in traditional imaging [93], and often represents a blurred version of g (R). As a result, the amount of blurring we observe can be quantied, and can be used to predict the 74 \resolution" of our dMRI parameter estimation process. Importantly, this characterization does not require any biophysical signal modeling assumptions, and can therefore provide \model-free" insight into parameter estimation performance. This characterization is important, because it means that we do not simply have to blindly trust that our machine learning approach has produced a parameter estimator that will perform well. Rather, the characterization provided by the ERF provides us with a theoretical mechanism to verify whether or not we expect ERFO to have good generalization characteristics. This characterization is also important because it allows us some insight into the behavior of ERFO when the training data has not been chosen very well, which is important because we will rarely have perfect training data that accurately represents the features of real tissues. Specically, the ERF allows us to equivalently rewrite the bias term from Eq. Eq. (4.11) as P X p=1 p M X m=1 a m E p (q m ) 2 = P X p=1 ZZZ P p (R) (g (R)g(R))dR 2 ; (4.17) where P p (R) is the EAP associated with the pth set of training data. This representation makes it clear that the bias term from the ERFO cost function will implicitly encourage the ERF error (g (R)g(R)) to be small, while the EAPs associated with the training data can be interpreted as weights on this error. In particular, ERFO is incentivized to minimize the ERF error at locations R where the training EAPs P p (R) are large, while it is not incentivized to reduce ERF errors at locations where the training EAPs have very small values. From this, we can expect that it may be okay to train ERFO using training data that doesn't perfectly match the real practical scenario of interest, as long as the distribution of EAP energy is roughly similar to the real application. 75 4.4.3 Joint ERFO The ERFO cost function in 4.12 is modied to optimize the qspace sampling scheme and the estimation method jointly: arg min fa;qg kt Eak 2 2 +P 2 kak 2 2 ; (4.18) We use a variable projection approach [149, 150] to solve this problem. In particular, the optimal solution for the a m coecients is given by 4.13, and we can substitute for this and rewrite the cost in terms of sampling scheme: arg min q kt E(E T E +P 2 I) 1 E T tk 2 2 +P 2 k(E T E +P 2 I) 1 E T tk 2 2 ; (4.19) =k(I E(E T E +P 2 I) 1 E T )tk 2 2 +P 2 k(E T E +P 2 I) 1 E T tk 2 2 ; (4.20) We solve for q, which is an M x 3 matrix using gradient descent. Since E is a non-linear function of q, this is a non-convex problem. We use multiple initializations and choose the solution with the least value of the cost function. The initializations used are comprised of various single-shell, multishell (2-shell,3-shell and 4-shell), radial,Cartesian and random sampling schemes. Once we have the optimal q-space sampling scheme q then we can solve for the estimation method using eq. 4.13. 4.5 Experiment protocols In this section, we evaluate ERFO estimation method designs for three EAP parameters relevant to the white matter - ODF, RTOP and RTAP. As mentioned earlier ERFO is not limited to these parameters and can be applied to estimate other EAP parameters as well but here we select a few to demonstrate the usefulness of ERFO. The examples compare ERFO's performance with well known linear and non-linear es- 76 timators using both real and simulated dMRI data. We also demonstrate the exibility of ERFO to adapt to the qspace sampling scheme by experimenting with dierent sampling schemes. The estimation implementations we have used in this section will be made available as part of the open-source BrainSuite (http://brainsuite.org/) software (Shattuck and Leahy, 2002). 4.5.1 Q-space protocols In our experiments we have tried to train ERFO for a variety of q-space sampling scheme set- ting to showcase the exibility of this approach in adapting to dierent acquisition protocol. Here we list the protocols assumed for each of the EAP parameter: ODF Training: ERFO ODF estimators were designed for the single-shell NTU pro- tocol (b-value = 3000sec=mm 2 ) with 64 q-space directions with eective diusion time of 56.8 ms [52]. RTOP Training: ERFO RTOP estimator was designed for the 552-sample four-shell MGH-USC (b-values = [1000,3000,5000,10000]sec=mm 2 ) protocol with [64,64,128,256] q-space directions in each shell and an eective diusion time of 17.2 ms [151]. RTAP Training: ERFO RTAP estimator was designed for two multi-shell protocols: 1) the 288-sample three-shell (b-value = [1000,2000,3000]sec=mm 2 ) human connectome project (HCP) protocol with 90 samples per shell and an eective diusion time of 39.6 ms [10], and 2) the 552-sample four-shell (b-values = [1000,3000,5000,10000]sec=mm 2 ) MGH-USC protocol with [64,64,128,256] q-space samples in each shell and an eective diusion time of 17.2 ms [151]. Additionally, we have also trained ERFO for dierent SNRs (SNR = 25 and/or 35), with SNR here dened as the reciprocal of noise standard deviation observed in the absence of 77 any diusion weighting (b=0 image). 4.5.2 ERFO training data Apart from q-space protocol and SNR, the third input that ERFO assumes is the samples of the training data -ffE p (q m )g M m=1 ; p (g P p=1 . In section 4.4.2 we showed theoretically that the training data model informs on what points in EAP are important, the model thus need not be perfect. In addition, since we do not expect this weighting to change much between the existing models, we decided to choose a model that is simple and has less model parameters. Keeping simplicity in mind, we use the diusion tensor imaging (DTI) [39] as our training model. DTI allowed us to sample the model parameter space densely at a reasonable computational expense. However, ERFO itself is exible and can be used with other models as well. Our training dataset consisted of a total 510,000 (= P ) realizations of the DTI model. The largest eigenvalue of the training tensor was ( 1 ) randomly sampled from a uniform distribution with interval 0.6 to 1.4m 2 =sec and the smaller eigenvalues ( 2 = 3 ) were ran- domly sampled from a uniform distribution with interval 0:1 10 3 to 0:3 10 3 mm 2 =sec. Additionally, each tensor was rotated along 150 directions to ensure good coverage of tensor orientations. The 150 directions were uniformly distributed over a sphere using electrostatic repulsion based sampling [8]. Fig. 4.1 shows a few samples of th ODF ( p ) from the training dataset for the ODF estimation experiment. The samples show tensors with varying eigen- values depicted by the tensor size and varying tensor orientations depicted by the direction along which the tensor is oriented. 4.6 Simulation results We tested ERFO empirically using numerical simulations and compared ERFO's perfor- mance with existing estimation approaches. Estimation of three parameters - ODF, RTAP 78 Figure 4.1: Training Data : The gure shows ODF ( p ) of six of the tensors that were a part of the training dataset. Note that the these samples have varying orientations and the eigenvalues (tensor size). and RTOP was tested. 4.6.1 Test dataset For simulations, we modeled the ground truth as the noiseless dMRI signal observed from water diusing in two crossing truncated cylinders from [14, 152]. Cylindrical model is commonly used to model white matter and our ground truth here models a simple case of white matter with two crossing bers. Additionally, we also wanted to choose a ground truth test model that is dierent from the ERFO training model to test the generalizability of ERFO in scenarios that it has not been trained for. We next describe the test model parameters that was used. The number of cylinders was set to two, each was equally weighted with a volume fraction of 0.5 each. The pore size was set to either one or ve microns. For the one micron cylinder, diusivity parallel to the cylinder was linearly sampled from 0:510 3 to 1:210 3 mm 2 =sec and diusivity along the transverse direction was set to 0:3 10 3 mm 2 =sec. For the ve micron cylinder, both the diusivities parallel and transverse to the cylinder were linearly sampled from 0:8 10 3 to 1:2 10 3 mm 2 =sec. The range of diusivities were chosen to depict typical white matter. Lastly, the innite series in model was truncated at 20 for both the summations. In the rest of the paper we use the terms noiseless and gold standard interchangeably. Figure 4.2 shows example EAPs generated from testing model for a single and two crossing 79 cylinders. Each row represents a dierent radii of the cylinder modeled. The EAP was generated by taking the Fourier transform of the noiseless ground truth q-space data sampled on a grid at 208 times higher sampling rate than the typical diusion spectrum imaging protocol, with a total of 214,364 q-space samples and a max b-value of 20000sec=mm 2 . The single cylinder EAPs represent EAPs for cylinders at 45 degrees and 0 degrees from the z axis. The crossing cylinders shown are samples taken from our testing dataset with two cylinders crossing at 72 and 90 degrees. The test dataset model however does not provide analytical solutions for estimating the EAP parameters(ODF, RTAP and RTOP), so we estimate the gold standard parameter for all the experiments from the densely sampled EAP described in the earlier paragraph. We refer to these noiseless densely sampled EAPs as "gold standard EAP" in the rest of the paper. In addition, we theoretically validate that the gold standard estimate is superior by demonstrating that the ERFs associated with the gold standard has far better characteristics (such as better resolution, low estimation error and zero error variance) than the other methods being tested. Theoretical validation of the gold standard that we demonstrate in this paper, to the best of our knowledge, has never been done before. Noisy q-space data was then simulated from the testing data model by adding white Gaussian noise with noise variance 2 for the ODF and RTOP simulation and Rician noise for the RTAP simulation. Noisy data was then sampled at q-space locations of protocols described in section 4.5.1. We present simulation error results for many dierent SNRs (= 1=) in this section. 4.6.2 ODF Simulation: We reconstructed ODFs (or ber orientation distributions in the case of CSD) with ERFO, CSD [13], and FRACT [12] for one-micron two-cylinder test dataset with additve Gaussian noise at SNR 25 and 45. The test q-space data was sampled for the NTU protocol. 80 Figure 4.2: Testing data - cylinders with 1 micron radius (top row) and 5 micron radius (bottom row). The gure shows the EAP of eight cylinders (four single and four crossing) generated from our testing data model [14]. The crossing cylinders were a part of the test dataset. The single ber response function needed for the model-based CSD approach was esti- mated from two white matter regions in an in vivo dataset. In one case, the single ber response function was estimated from the white matter near the left postcentral gyrus (PG), while the other case estimated the single ber response function from white matter in the corpus callosum (CC). The single ber response was calculated as the rotational harmonic coecients of the mean tensor in each of the white matter regions [13]. We quantitatively evaluated the ability of the three approaches to predict the orientation of the cylinders. We calculated the location of the estimated peaks and the angle of separation between the two estimated peaks and compared them with those from the noiseless gold standard. The results of these simulations are shown in Fig. 4.3, where each box plot depicts the distribution of the normalized root mean squared errors (NRMSE) between the estimated and the gold standard peaks and angle of separation. We observe that ERFO has lower peak location error as well as lower angle of separation error than CSD and FRACT at both SNRs. First we compare ERFO and FRACT theoretically to understand the empirical behavior. 81 (a) SNR 25 (b) SNR 45 (c) SNR 25 (d) SNR 45 Figure 4.3: Comparison of orientation estimation errors for single shell numerical simula- tions. Four methods were compared - ERFO, Constrained spherical deconvolution (CSD(CC) and CSD(PG)) and FRACT Fig. 4.4 shows the ERFs for ERFO and FRACT at both SNRs. The ERFs were normalized to represent them in the same scale. In our previous work we have shown that angular resolution is correlated to the ERF mainlobe width and estimation errors due to false positive/negative orientation peaks or due to signal leakage is associated to the ERF side lobe level [93]. We had dened the ERF mainlobe width as the half maxima width of the central bright lobe at its peak. We observe that the ERF mainlobe width, of 12:3 m for ERFO and 12:2 m for FRACT, suggesting the resolution of both approaches are very similar. However, if you observe the arrows in ERFs, then you can observe that ERFO optimizes 82 (a) ERFO (SNR=25) (b) ERFO (SNR=45) (c) FRACT Figure 4.4: ODF ERFs for single shell acquisition for ERFO and FRACT. FRACT has the same ERF for both SNRs. See [93] for description of our ERF display conventions. the sidelobe leakage close to the origin very strongly compared to FRACT. This optimization is benecial because the EAP will be the highest around the origin and hence any sidelobe leakage there will contribute heavily to errors. The ERF theory here suggests that ERFO keeps the overall estimation error lower than FRACT by optimizing the side-lobe leakage more eciently. Additionally, we will demonstrate qualitatively in real data that in more complex cases, this error dierence can increase. Unfortunately, since CSD is a non linear approach we cannot compare it with ERFO theoretically. The simulation shows one case where ERFO was better, but the simulation ground truth itself does not completely model the white matter, so there might be regions in the brain where CSD might be better. Moreover, we observe in the results that performance of CSD is dependent on the white matter model that was chosen. Other ber response functions could potentially lead to improved results but there is no principled way to nd which response function will be the most optimal. More extensive empirical evaluation with real data will be required to judge if ERFO is better than CSD consistently in white matter. 4.6.3 RTAP Simulation: Our second experiment involved comparing RTAP estimation performance of two estimation methods - ERFO and GQI [52]. The mean diusion distance ratio parameter for GQI was set to 1.2, as recommended in [52]. 83 (a) gold standard (b) ERFO SS (SNR=25) (c) ERFO SS (SNR=10) (d) GQI SS (e) ERFO MS (SNR=25) (f) ERFO MS (SNR=10) (g) GQI MS Figure 4.5: RTAP ERFs for gold standard, ERFO (multi-shell) and GQI. See [93] for description of our ERF display conventions. The test dataset comprised of multiple crossing ve micron cylinders with Rician noise at SNR 10 and 25 added to it. SNR here refers to the reciprocal of the standard deviation of the Gaussian distribution that generated the Rician noise. The SNR was kept low (typical for clinical scanners) because this will be the regime where Rician noise will dier from the Gaussian noise that ERFO is trained for. The test q-space data was sampled for the HCP protocol. Here we calculate and compare a multishell RTAP estimate using the full HCP protocol and a single shell RTAP estimate using just the b-value = 3000 sec=mm 2 shell of the HCP protocol for both the methods. In order to compare we needed to choose a good gold standard RTAP. To achieve this, we estimated the RTAP integral from eq. 4.6 along multiple diusion directions z2S 2 for every gold standard EAP in the test data set (described in 4.6.1). S 2 refers to the unit sphere and we chose 321 unique directions that were uniformly distributed over S 2 using electrostatic repulsion [8]. GQI was then used to approximate the integral along each direction and the maxima of the integral from amongst the 321 estimates was set as the gold standard RTAP. 84 (a) Multishell (SNR=10) (b) Single shell (SNR=10) (c) Multishell (SNR=25) (d) Single shell (SNR=25) Figure 4.6: Comparison of RTAP estimation errors for multishell and single shell numerical simulations. Two methods - ERFO and GQI, were compared at SNR 10 with Rician noise. The direction,z max corresponding to the maxima was set as the direction of largest diusivity (or the axial direction). Additionally, ERF was used to theoretically validate if our chosen gold standard RTAP was more accurate than the test cases we compare in this experiment. ERFs for the gold standard, ERFO and GQI applied to full and single shell HCP protocol at SNRs 10 and 25 are shown in gure 4.12. The ERFs were normalized to represent them in the same scale. The gold standard ERF can be seen to be superior in resolution (main-lobe width = 5.5m) and to possess a uniformly lower side-lobe leakage characteristics than multishell ERFO (ERFO MS: main-lobe width = 10.5 m at SNR 25 and 11.2 m at SNR 10), multishell 85 GQI (GQI MS: main-lobe width = 14 m), single shell ERFO (ERFO SS: main-lobe width = 11.6 m at SNR 25 and 11.2 m at SNR 10) and single shell GQI (GQI SS: main-lobe width = 12.1 m). These characteristics suggest that the gold standard estimation is the most accurate of the three methods, thereby validating our choice. As mentioned earlier, such theoretical validation of the gold standard has never been done before. Dwelling more into ERF comparison we observe that ERFO has a lower mainlobe width than GQI in all scenarios, suggesting that ERFO is favoring to optimize angular resolution. For single shell sampling ERFO (ERFO SS) has a slightly lower mainlobe width than single shell GQI (GQI SS), but GQI SS has overall lower sidelobe leakage, barring the rst sidelobe near the origin where ERFO SS outperforms GQI SS at both SNRs. The side lobe peak along the origin for ERFO SS (SNR 25) is 34%, ERFO SS (SNR 10) is 37% and GQI SS is 39%. We believe ERFO compromises on side lobe leakage at higher displacement because probability of nding water molecules at those locations is low for our training data and so the signal from high displacement locations will not contribute much to the error. However, ERFO chooses to be more aggressive about the sidelobe near the origin because the diusion probability in that region is going to high and hence low displacement locations will contribute strongly to the error. We believe that ERFO SS is optimizing the sidelobe more eciently than GQI SS by containing the leakage better at high error contributing displacements as well as giving marginal improvements over resolution, so we predict ERFO SS to demonstrate a lower estimation error bias. In addition, since the SNR is considerably low, we also calculated the error variance prediction given by the variance term in eq 4.11. Here too we found ERFO SS (noise variance = 3:2 10 9 at SNR 25 and 1:9 10 9 at SNR 10) to have a lower noise variance than GQI SS (noise variance = 4:1 10 9 ), supporting the prediction that we should observe overall lower empirical errors for ERFO SS. Also note that as SNR drops, ERFO becomes conservative with error variance conrming that it optimizes the error variance strongly at lower SNRs. 86 The multishell ERF tells a very interesting story. We observe here that multishell ERFO (ERFO MS) aggressively optimizes resolution compromising side lobe leakage, while multi- shell GQI (GQI MS) aggressively optimizes the side lobe leakage compromising on resolution. ERFO MS shows a 3-3.5 m improvement in ERF mainlobe width compared to GQI MS. However, the rst side lobe along x axis of ERFO MS ( 40% at SNR 25 and 39% at SNR 10) is close to 10% higher than GQI MS (30%). It is not immediately clear which of the two tradeo will yield lesser error, but our squared error optimization seem to be preferring resolution over side lobe leakage, so we hypothesize that improved resolution benets the overall error bias more. In addition, ERFO MS (noise variance = 5:4 10 9 at SNR 25 and 1:9 10 9 at SNR 10) outperforms GQI MS (noise variance = 7 10 9 ) with regards to error variance at both SNRs, so we also expect less variance in our estimate as well. Empirical tests were conducted to validate our predictions. Fig. 4.6 plots the normalized mean squared error (NRMSE) between the gold standard RTAP, and the RTAP estimated with ERFO and GQI from noisy q-space data at SNR 10 and 25 for the full and single shell HCP protocol. Since RTAP estimation of all three methods apply an arbitrary scaling of the RTAP integral that depends on the q-space sampling scheme used, we calculate a scaling factor, e st for each of the estimation methods that minimizes the mean squared estimation error over the entire test dataset: est = arg min T X t=1 kRTAP (t) gstd RTAP (t) est k 2 ; (4.21) where T is the number of elements in the test dataset, and RTAP g std (t) and RTAP (t) est is the gold standard RTAP and the estimated RTAP values for the t th element respectively. NRMSE shown in the gures is the error calculated with the scaled estimate. As shown, at both SNRs ERFO has lower errors than GQI, validating that resolution is more critical and provides substantially more benet than optimizing leakage for the multishell case. One more interesting feature that ERFO MS ERF contains is the high side lobe around 87 (a) gold standard (b) ERFO (c) 3D-SHORE (d) ERF Figure 4.7: RTOP ERFs for gold standard, ERFO (multi-shell) and 3D-SHORE. See [93] for description of our ERF display conventions. 30 microns displacement, especially evident at SNR 25. This high sidelobe is most likely because most of our training set has low values of EAP here. We have also seen empirically that the overall errors are still low despite this high side lobe suggesting that EAP at 30 micron displacement does not contribute much to the estimate. However, we would like to caution the reader here that if the test data contained high EAP values at 30 microns, then ERFO MS would not have been suitable for estimation. More precisely, if our training set is very dierent from the testing set optimum performance cannot be guaranteed. It is true that we claim that ERFO is generalizable beyond the training set, but this claim is made with an assumption that high diusion probability regions will not be weighted low by the training model or in other words the training model is not drastically dierent than the testing scenario. However, the silver lining here is that the user can verify if the model is acceptable or erroneous by analyzing the ERF and then taking wise decisions. 4.6.4 RTOP Simulation: Our nal simuation experiment was to estimate RTOP for the MGH USC protocol with ERFO and 3D-SHORE [17, 51]. 3D-SHORE model parameters were chosen from [17, 51]: radial order was set to 6, the radial and angular regularization parameters were set to 10 8, and the scale factor was set to 1 (8 2 D sh ) with D sh = 0:7 10 3 mm 2 =sec and where is the 88 (a) Gaussian noise (SNR 25) (b) Rician noise (SNR 25) Figure 4.8: Comparison of RTOP estimation errors for multishell shell numerical simulations. Two methods were compared - ERFO and 3D-SHORE diusion time. The test dataset was identical to the ODF estimation case, except that it was sampled for the MGH USC protocol. We estimated the gold standard RTOP from noiseless gold standard EAP using 16th order 3D-SHORE basis based RTOP estimation (other model parameters were as described earlier). Here too we validated that our choice of gold standard is good by verifying that the gold standard ERF resolution characteristics (main-lobe width = 3.3 m) were better than ERFO (main-lobe width = 5.1 m) and 3D-SHORE(main-lobe width = 5.8 m) and that the gold standard side-lobe leakage was uniformly lower than ERFO and 3D-SHORE. All the three ERFs along with a line plot showing the ERF variation for all three methods along the x-axis (z = 0) at SNR=25 are shown in gure 4.7. The ERFs here were normalized such that the gold standard ERF is one at the origin. One can observe clearly in the line plot that ERFO and 3D-SHORE, both underestimate RTOP. Since 3D-SHORE has lower values in ERF we expect it to have lower error variance characteristics than ERFO. However, ERFO has a closer peak to the gold standard than 3D- SHORE, so we expect ERFO to have lesser error bias than 3D-SHORE. Empirical evaluations were conducted to check which of the two methods has overall lower errors. Fig. 4.8 plots 89 the normalized mean squared error (NRMSE) between the gold standard RTOP, and the RTOP estimated with ERFO and 3D-SHORE from noisy q-space data at SNR 25 and 35. As shown, at both SNRs ERFO has lower error than 3D-SHORE validating the tradeo chosen by ERFO to favor reducing error bias over variance. Fig. 4.8 plots the normalized mean squared error (NRMSE) between the gold standard RTOP, and the RTOP estimated with ERFO and 3D-SHORE from noisy q-space data at SNR 25. We show results with both Gaussian and Rician noise. As shown, at both SNRs ERFO has lower error than 3D-SHORE validating the bias-variance tradeo chosen by ERFO. Morevover, we observe that ERFO gives lower error even with Rician noise, however with a higher bias as will be expected because ERFO has not been trained for Rician noise. Also, please note that unlike RTAP simulation, Rician noise aected the estimates more strongly in case of RTOP estimation because the MGH protocol goes to much higher b- values where the SNR will be low. At low SNRs Rician and Gaussian noise will behave dierently, and hence in this result we observe a higher error with Rician noise for ERFO than Gaussian noise. In summary, we experimented on estimating a variety of diusion parameters from ar- bitrary sampled q-space data and under dierent noise conditions. All three experiments consistently showed improved performance of ERFO compared to alternative methods. 4.7 Tracking Application Results Tractography is an active area of research whose results depend on the estimate of the orientation of white matter bundles. One of the common parameters used to estimate the orientation is the orientation distribution function. In this application we analyze tracking results from a single shell 64-sample dMRI dataset from [52]. The b-value of the shell is 3000 sec=mm 2 . Representative reconstructed ODFs and tracks are shown in gures 4.9 and 4.10 from 90 a region of the brain with three crossing bers (the cortico-spinal tract, bers originating from the corpus callosum, and the superior longitudinal fasciculus) on the right and left hemisphere respectively. Tractography results shown in these gures are from a seed placed in this same three ber region of interest. DSI studio software [52] was used to generate the tracks for three orientation estimation methods - ERFO, FRACT [12] and CSD [59]. CSD is likely the most popular orientation estimator used in tracking today and in our experience [93] FRACT too has very good angular resolution characteristics that make it a suitable method for tracking. For the right hemisphere (Fig. 4.9), ERFO and CSD ODFs show three peaks and seem to be estimating the orientation corresponding to the superior longitudinal fasciculus (SLF), while FRACT has diculties resolving this orientation. Comparing ERFO and CSD, this orientation is represented more prominently in the ERFO reconstructions which we believe leads to less noisy tracks. For the left hemisphere (Fig. 4.10), all three methods are successful at estimating the SLF orientation, but here too ERFO has the strongest ODF peaks in that orientation and less noisy tracks. While we don't know gold standard, we believe that the ERFO results seems to encourage more regularity in tracks, performs well in case of multiple crossing bers, and we believe that the tracks have much better correspondence with the expected anatomy compared to the alternatives. The ERFs for this protocol were shown in [61,138], but our hypothesis there was identical to that made in Fig. 4.4. Even in this case ERFO and FRACT had very similar mainlobe widths (ERFO - 13 m and FRACT - 12.2 m) but ERFO had substantially improved sidelobe leakage characteristic especially close to the origin. The sidelobe level at peak ERF was 29% for ERFO and 37% for FRACT. As hypothesized in the earlier ERF analysis in [61,138] as well as in section 4.6.2, we expect FRACT to show high estimation errors due to side lobe leakage and false positive/false negative orientations. We believe that missing third crossing in gures 4.9 and 4.10 is indicative of the erroneous behavior predicted by 91 (a) Reference (b) ERFO (c) FRACT (d) CSD (CC) (e) ERFO (f) FRACT (g) CSD (CC) Figure 4.9: Estimated ODFs using dierent estimation methods from a region-of-interest that is marked in (a). theory. In our nal tractography result we show some common tracks - fanning of the corpus callosum, cingulum bundle, oribito-frontal track along wiht inferior longitudinal fasciculus and the cortico-spinal track, generated using orientation peaks from ERFO ODF. The tracks shown follow the anatomy well and demonstrates the ability of ERFO to estimate orientation peaks eectively for tracking applications. 4.8 RTAP Estimation Results RTAP has recently showed promise in white matter quantitative studies in clinical applica- tions and is seen to correlate with axon density [19, 50, 82]. In this study we quantitatively analyze and compare RTAP estimation with ERFO and GQI in white matter. Unlike our simulations we cannot acquire a densely sampled gold standard dataset. So the rst or- 92 (a) FA (b) ERFO (c) FRACT (d) CSD (CC) (e) FA (f) ERFO (g) FRACT (h) CSD (CC) (i) FA (j) ERFO (k) FRACT (l) CSD (CC) Figure 4.10: Estimated ODFs using dierent estimation methods from a region-of-interest that is marked in (a). der of business was to analyze if we can establish convergence of ERFO and GQI with a practical number of q-space samples. To establish this, we selected the largest of the three sampling protocols, the MGH-USC protocol. In this experiment we begin with theoretical evaluation to establish a valid gold standard and then proceed to compare RTAP estimation performance with the established gold standard. Figure 4.12 shows the RTAP ERFs obtained with ERFO and generalized q-sampling imaging (GQI) [52] applied to the 552-sample MGH-USC protocol. ERFO and GQI ERF have a main-lobe width of 4.595 and 4.718 microns, and the rst sidelobe level at 2.38% and 2.46% respectively. The theoretical resolution characterizations from ERF suggested that the two methods converge to a similar RTAP estimate when all 552 samples are used for estimation. However, theoretical noise variance given by the noise only component in Eq. (4.11) for ERFO (3.4714e+09) was lower than GQI (4.2755e+10) suggesting that there 93 (a) Corpus callosum (b) Cingulum (LH) (c) Cingulum (RH) (d) Inferior fasciculus and Orbito-frontal (LH + RH) (e) Cortico-spinal (LH + RH) Figure 4.11: Estimated ODFs using dierent estimation methods from a region-of-interest that is marked in (a). will be some dierence in the estimation error due to dierent noise resilience characteristics. Given the SNR was reasonably high for this dataset, we expected the resolution characteris- tics to dominate noise resilience characteristics and RTAP estimation to be similar between the two methods in majority of the white matter voxels. To validate our theoretical hypothesis and establish a gold standard, we masked the white matter voxels with an FA threshold of 0.5. A total of 28388 voxels were found to be above the FA threshold. Because of the directional nature of RTAP, here we are considering white matter voxels with some base level of anisotropy given by the FA threshold. Then we estimated the ERFO and GQI RTAP for all the WM voxels. The output of the two methods were normalized such that input q-space data of all ones produces an output RTAP of 1, this normalization normalizes the peak value in ERF to be 1. We have plotted the ERFO and GQI RTAP pairs as a scatter plot in g. 4.13. A linear polynomial with equation y = 1:04x 0:012 ts a line through all the RTAP pairs with a R-squared goodness of t metric of 0.9997, further supporting the convergence of RTAP estimates of the two methods. Fig. 4.13 shows a single slice after RTAP estimation and one can qualitatively observe the similarity. Theoretical, empirical and qualitative results, all establish that the two methods give similar results at full MGH-USC protocol and hence the result of one of these methods can be used as a gold standard. Another point to mention here is that both GQI and RTAP are based on the Fourier 94 (a) ERFO (b) GQI (c) scatter plot (d) ERFO (e) GQI Figure 4.12: (a) and (b) show the normalized RTAP ERFs of ERFO and GQI calculated for the full MGH-USC protocol. (c) shows the scatter plot of the RTAP estimates of the two methods. We can see that RTAP scatter follows the 45 degree line closely, suggesting that the two estimates are extremely similar. (e) and (f) show the RTAP estimate of a slice from the MGH-USC dataset for ERFO and GQI, further supporting the similarity argument. relationship between EAP and q-space, so ideally when we have innitely large number of samples these two approaches will estimate the same EAP and the dierence between them will diminish as the number of samples increase. However, acquiring 552 samples is not practical in most scenarios and hence comparing these approaches and choosing the method that has higher accuracy with fewer samples is important for applications. For the rest of the experiments in this section GQI full protocol RTAP estimate was selected as the gold standard to analyze performance of the two methods at sampling schemes with fewer samples. We subsampled the full MGH-USC protocol into 80, 120, 200, 280, 360 and 440 sample multi-shell schemes. The proportion of samples in each of the shell was maintained same as the full protocol. Additionally, we generated hundred realizations of q-space sampling 95 (a) 552 samples (b) 80 samples (c) 120 samples (d) 200 samples (e) 280 samples (f) 360 samples (g) 440 samples Figure 4.13: RTAP (NTU data) estimation errors for ERFO and GQI averaged at dierent sampling schemes. Error at each sampling protocol is the error spread over hundred dierent realizations. schemes for each of the subsampled protocol. Each realization was generated by selecting samples from each shell of the full protocol in a uniformly random manner. Every q-space point in the shell was assumed to have equal probability during selection. NRMSE between our gold standard and the RTAP estimate of subsampled protocol realizations was calculated as described in eq. 4.21. NRMSE at all the subsampled protocols over all realizations is shown in gure 4.13. The error in white matter is lower for ERFO than GQI in all cases. This is most likely because ERFO adapts to the q-space sampling scheme, noise level as well as the white matter tissue and designs the estimation method to reduce the overall error in that regime. We also show single realization errors using the rst N q-space samples of each shell of the MGH-USC protocol in 4.14 for N=80,200,280. Figure 4.14 analyzes the error by binning voxels based on the axial diusivity (AD) calculated from the diusion tensor estimate at each voxel and estimates the error at each bin. This result shows us that not only is the overall error for ERFO lower, but ERFO errors are lower in every AD bin compared to GQI. These results establish that ERFO has superior performance to GQI while estimating RTAP uniformly across the dierent diusion proles of white matter that were tested. 96 (a) ERFO (80 samples) (b) ERFO (120 samples) (c) ERFO (200 samples) (d) GQI (80 samples) (e) GQI (120 samples) (f) GQI (200 samples) Figure 4.14: RTAP (NTU data) estimation errors for ERFO and GQI for a single sampling scheme realization binned based on the axial diusivity of the voxel. We used the rst N samples of each shell of the MGH USC protocol. 4.9 Joint-ERFO : An initial result Experiment design and parameter estimation design methods in dMRI have primarily con- centrated on optimizing the sampling scheme for an assumed diusion model [5{7], optimiz- ing estimation method for a given sampling scheme [60] or optimizing intuitive properties of sampling schemes [8, 126, 153{155]. Designs originating from these approaches may po- tentially be sub-optimal because of the following reasons: It is fairly common for diusion models to fail (atleast partially) in the brain and it is uncertain how the optimal sampling schemes fare in those regions; In addition, there does not exist a theoretical proof that any particular sampling scheme is optimal for dMRI, so approaches where the sampling scheme is xed could, by nature, be suboptimal. Furthermore, under narrow-pulse approximation, we know that the measured data can be modeled as a Fourier transform of the true EAP. This opens up the vast theory of signal processing methods to quantify sampling requirements, accuracy and stability of Fourier reconstruction. However, existing dMRI sampling and EAP based parameter estimation 97 approaches have largely been developed and tuned without the benet of such theory, in- stead relying on intuition, approximations, modeling assumptions, and extensive empirical evaluation. In the previous chapter we introduced the novel EAP response theory that relates sam- pling scheme, transformation kernel and true EAP with the estimate. In this work we propose a new experiment design scheme that optimizes the EAP response function of the dMRI experiment. The proposed approach is expected to jointly solve for the sampling scheme and transform coecients. The advantage of such design methodology is that cost functions can be dened based on known signal processing performance measures such as accuracy, it does not require empirical testing, it does not model the EAP, and can be used even when traditional modeling assumptions are inaccurate. 4.9.1 Joint-ERFO Evaluation Here we present a design that optimizes both q-space sampling scheme and estimation method jointly as described in 4.20. The input parameters used were: SNR=20, num- ber of samples=64,diusion time 39.6 ms. Fig. 4.15 shows the Joint-ERFO ODF optimized sampling scheme. Optimal b-values histogram shows two peaks suggesting a two-shell sampling scheme with rst shell containing 14 samples at mean b-value 2000 and second shell with 50 samples at mean b-value 5000. We analyzed the directions by dividing the samples into 2 shells : shell 1 (b-values<3500) and shell 2 (b-values>3500). As can be seen, the optimized sampling directions appear to be uniformly distributed on both shells. This observation supports that existing intuitive experiment designs that optimize for uniform sampling are useful. However, the importance of joint-ERFO is in the fact that it provides a principled approach that calculates the b- values that are optimal without the need for trial and error. Moreover, at 64 samples it is more common to use single shell schemes rather than two shells and even though there are 98 Figure 4.15: Joint ERFO ODF design: The gure shows the Joint-ERFO optimized sampling scheme (N=64). Optimal b-values indicate a two-shell sampling scheme and optimal q-space sam- pling directions look uniformly distributed over the shell. many multi-shell schemes that have been proposed the optimal [2000, 5000] scheme with 14 and 50 samples respectively has not been proposed before. This is because even though empirical evaluations can tell how good the scheme is, they do not provide a way to nd the optimal one. We tested joint-ERFO ODF empirically using simulations. Twenty ground truth diusion signal consisting of two tensor [39] with separation angles from 0 to 90 degrees were simulated. These angles were linearly spaced from 0 to 90 degrees. Random Gaussian noise at SNR 20 was added and 100 noisy voxels for each ground truth was simulated and sampled for the following schemes: joint-ERFO sampling scheme, uniformly sampled single shell at b- value 3000, uniformly sampled single shell at b-value 5000. We reconstructed the ODFs for joint-ERFO ODF and 3D-SHORE at multiple sampling schemes and compared the errors across voxels. The errors are plotted in Fig. 4.16. Our simulations indicate joint-ERFO ODF performs better than 3D-SHORE for all the schemes tested implying the benet of optimizing acquisition and estimation method protocol. Our future direction includes verifying the empirical observations with real diusion MRI data. 99 Figure 4.16: Joint ERFO test: The gure compares Joint-ERFO and 3D-SHORE results for multiple sampling schemes. 4.10 Discussion and Conclusions This work described a new EAP-based estimator design framework for optimizing linear estimation of diusion parameters such as EAP, ODF, RTAP etc, from dMRI data. Since this framework is based on the standard Fourier model of q-space data, it is broadly applicable to arbitrary q-space sampling schemes. We applied ERFO to design estimation methods for ODF and RTAP estimations and demonstrated improved performance with simulated and real data. However, it is worth reemphasizing that our framework is not limited to these two parameters { there are many interesting parameters that we have left unexplored, and which provide potentially promising directions for future research. Nevertheless, we believe that the resulting estimators perform well in practice, and can generalize to cases they haven't been trained for because of nice theoretical characteristics. These types of approaches have not been previously possible to the best of our knowledge, aside from our preliminary work on CSA ODF estimator design [61,138]. In this work we have trained ERFO for typical values in the white matter because both parameters (ODF and RTAP) are mainly calculated to study white matter tissue. However, in principle ERFO is not limited to a single tissue and estimation methods for parameters of other tissue types such as gray matter can also be designed using ERFO. More generally, we 100 can parcellate the brain into regions that have similar characteristics (for e.g. same number of crossing bers) and then train ERFO for each of those regions separately. Such tissue specic training can further improve ERFO estimation performance because the average error is optimized for a smaller and more specic range of diusion characteristics. Thus, instead of using a single estimation method for all tissues, ERFO provides an avenue to further reduce errors by designing more appropriate methods that take tissue properties into consideration. It is also worth noting that ERFO cost function is minimizing the mean squared error of the estimate over a class of EAP models. In our experiments we have trained ERFO assuming a diusion tensor imaging (DTI) model [6]. Our choice of DTI was based on the simplicity of the model and availability of an analytical representation of ODF and RTAP leading to overall reduced complexity of the optimization. ERFO is however very exible and can be trained alongside other diusion signal or EAP models and potentially even with combination of multiple models. The choice of the training model is left to the designer and it yet remains to be established which amongst the existing models will be the best choice. We do believe that since the model is only aecting the error more softly any reasonable model should give similar performance. In addition, the mean squared error cost function presented in this paper was chosen because of its computational simplicity, good theoretical characteristics and its ability to generate improved performance. This cost function provided for a good rst step for training based design methodology. The choice of cost function is nontrivial in practice, and will ultimately depend on more detailed contextual information about the goals of the specic application. For example, in our previous work we established that the main-lobe width of the ERF can be a good indicator of the ODF angular resolution and hence, for ODF estimation (especially for tractography studies) it might be benecial to optimize the main- lobe width directly instead. Designing application specic cost functions is currently beyond 101 the scope of this paper but a potential future area of research. In conclusion, we expect the design method ERFO proposed in this paper to be useful in a wide variety of dMRI contexts. In addition, since there are many other MRI-based imaging experiments that use a similar linear data acquisition model with linear parameter estimation (e.g., this is especially true for certain kinds of perfusion/ ow imaging, which can also be formulated in terms of EAPs [2]), we believe that the general approach we have taken is easily expanded to other quantitative MRI modalities. Acknowledgments This work was supported in part by NSF CAREER award CCF-1350563 and NIH grants R01-NS074980, R01-NS089212, and R21-EB022951. Some of the computation for the work described in this paper was supported by the University of Southern California's Center for High-Performance Computing (hpc.usc.edu). Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. 102 Chapter 5 Non-central Chi Noise Modeling 103 5.1 Overview The statistics of many MR magnitude images are described by the non-central chi (NCC) family of probability distributions, which includes the Rician distribution as a special case. These distributions have complicated negative log-likelihoods that are nontrivial to opti- mize. This paper describes a novel majorize-minimize framework for NCC data that allows penalized maximum likelihood estimates to be obtained by solving a series of much simpler regularized least-squares surrogate problems. The proposed framework is general and can be useful in a range of applications. We illustrate the potential advantages of the framework with real and simulated data in two contexts: 1) MR image denoising and 2) diusion prole estimation in high angular resolution diusion MRI. The proposed approach is shown to yield improved results compared to methods that model the noise statistics inaccurately and faster computation relative to commonly-used nonlinear optimization techniques. 1 5.2 Introduction This paper considers model-based statistical estimation involving noisy images with statis- tics described by a non-central chi (NCC) distribution. Model-based statistical estimation methods are becoming increasingly prevalent in MRI [157], due to enhanced capabilities for accurately modeling MRI data acquisition physics and new possibilities for incorpo- rating prior information into image reconstruction and parameter estimation. These ap- proaches have proved useful in many applications, including reconstruction from sparsely- sampled data [105, 158{164], image denoising [55, 165{174], quantitative parameter estima- tion [26,54,148,175{181], and artifact correction [182{185]. 1 The text and gures in this chapter have been previously published in [28, 156], and are copyright of the IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. 104 Accurate noise modeling is essential for model-based statistical estimation. It is well known that thermal noise in single-channel complex-valued MRI k-space data samples can be modeled as independent and identically distributed (i.i.d.) zero-mean Gaussian with equal variances in the real and imaginary parts [186,187]. For this noise model, conventional maximum likelihood (ML) and penalized ML (PML) estimation reduce to solving simple least-squares (LS) or regularized LS problems. Due to the prevalence and simplicity of regu- larized LS cost functions, a growing set of fast and ecient algorithms exist that specically target these optimization problems. However, while the noise in MRI k-space data is Gaussian, the images retrieved from MRI scanners frequently have non-Gaussian statistics, primarily due to nonlinear processing operations that are applied when imaging data is reconstructed and saved [29, 124]. This work focuses on the NCC distributions, which arise when taking the root sum-of-squares (rSoS) of multiple Gaussian-distributed complex images. The rSoS operation is frequently used for combining multiple MRI images acquired from a phased-array of receiver coils [188]. An important special case of the NCC distribution is the Rician distribution, which occurs when taking the magnitude (i.e., discarding the phase) of a Gaussian image [22,186,189{191]. NCC distributions are well-approximated as Gaussian when the signal-to-noise ratio (SNR) is high [22, 188], and LS-based methods are often applied to NCC data in the high- SNR regime with negligible error. However, low-SNR images obeying an NCC distribution behave quite dierently. Specically, NCC noise perturbations have nonzero mean, leading to a bias in signal intensity. In addition, the bias and variance of NCC images depend nonlinearly on the (spatially-varying) noiseless signal amplitude. It has been observed in a variety of contexts that treating NCC data as if it were Gaussian can substantially degrade performance [26,54,55,148,165{170,172{174,176{181,184]. While there are clear advantages to accurate noise modeling, the optimization of NCC- based PML cost functions is considerably more complicated than solving a regularized LS 105 problem. Previous methods for optimizing NCC negative log-likelihoods (NLLs) have relied on the use of generic numerical optimization techniques [53] such as the Newton-Raphson [54], quasi-Newton [55], Nelder-Mead [176], brute-force grid search [184], and Levenberg- Marquardt methods [26,180]. However, as demonstrated in this work, the NLL for the NCC distribution is non-convex, meaning that many of these numerical optimization methods might not converge to the global optimum. In addition, the repeated evaluation of the NLL and its gradient requires computationally-expensive Bessel function evaluations, which can slow the optimization process. Partly due to computational issues, suboptimal approximations of ML and PML esti- mation are still widespread. Gaussian approximation is quite common, and there are a number of methods that are designed to reduce the noise bias without directly optimizing the NCC NLL. Examples include approximately transforming the NCC distribution to a Gaussian distribution [148, 181] and various simplied forms of nonlinear bias compensa- tion [22, 165, 167, 169, 173, 179, 190, 191]. While these simplications improve computation speed, performance is typically worse than would be obtained from the statistically-optimal ML or PML solutions. This work introduces a novel and simple quadratic tangent majorant for the NCC dis- tribution, which is used to develop a general majorize-minimize (MM) [192] optimization framework for solving NCC PML estimation problems. The primary advantage of this framewok is that the complicated NCC PML optimization problem can be replaced with a series of easy-to-solve regularized LS problems. The ubiquity of ecient algorithms for solving regularized LS problems makes our approach particularly attractive. Preliminary versions of portions of this work were originally presented in [156,193]. During the preparation of [156], we discovered existing methods that can be obtained as special cases of our general MM-based optimization framework [194, 195]. Solo and Noh [194] derived an expectation maximization (EM) algorithm for activation detection in Rician 106 functional MRI data. Zhu et al. [195] derived a similar EM algorithm for estimating the parameters of nonlinear image contrast models from Rician data. In both cases, tangent majorants for the Rician distribution were implicitly obtained in the expectation step of the EM algorithm by treating the missing phase information as latent variables. Interestingly, our MM framework yields identical optimization algorithms when applied to the same cost functions considered in previous EM-based work, though our majorants are derived using dierent methods that have distinct advantages. For example, the previous EM algorithms were only derived for the Rician distribution, not for the broader class of NCC distributions that we consider. 2 In addition, the EM derivations provide limited insight into the structure of the NCC NLL, while our derivations directly demonstrate that the NCC NLL is generally not a convex function. Finally, the previous work was narrowly focused on specic applications and only considered ML estimation. In contrast, our proposed framework can be applied generally to arbitrary ML and PML cost functions. In the context of PML estimation, a key contribution of our work is that the rapidly expanding range of regularized LS optimization algorithms can now be directly used to eciently solve NCC-based PML problems. This paper also provides the rst empirical evidence that statistically optimal approaches can have substantial advantages relative to some of the simplied noise modeling strategies [22,190] used in previous PML scenarios. This paper is organized as follows. After establishing notation and background in Sec. 5.3, our proposed majorants are presented in Sec. 5.4. Next, Secs. 5.5 and 5.6 respectively illustrate the proposed MM approach in the context of regularized image denoising and regularized diusion prole estimation in high angular resolution diusion imaging (HARDI). Discussion and conclusions are provided in Secs. 5.7 and 5.8. 2 If applying the EM approach to a general (non-Rician) NCC distribution, it is nontrivial to compute the integrals involved in the expectation step due to complicated expressions involving a large number of latent phase variables. 107 5.3 Notation and Background 5.3.1 ML and PML Estimation Model-based statistical estimation addresses the following problem: nd a good estimate ^ x of an unknown length-P parameter vector of interest x, given a length-M vector of measured data y and a likelihood function p (yjx) describing the probability of y conditioned on x. The classical PML estimator is obtained by choosing ^ x according to ^ x = arg min x L (yjx) +R (x); (5.1) whereL (yjx) = lnp (yjx) is the NLL andR(x) is a user-dened regularizing penalty. ML is obtained when R (x) 0. 5.3.2 The Gaussian and NCC Distributions With Gaussian noise, data measurement can be modeled as y =A (x) + e; (5.2) where theA operator is the ideal (noiseless) mapping between x and y, and e is a length-M zero-mean i.i.d. complex Gaussian noise vector. Assuming the real and imaginary parts of the noise each have variance 2 , the NLL is given by L gauss (yjx) = 1 2 2 M X m=1 j[A (x)] m [y] m j 2 = 1 2 2 kA (x) yk 2 2 ; (5.3) where we use the notation [a] j to denote the jth entry of the vector a,kk 2 is the standard Euclidean norm, and we have neglected an additive constant. Note that Eq. (5.1) with this 108 NLL is a simple regularized LS optimization problem. If the phase information from Eq. (5.2) were missing, i.e., y =jA (x) + ej; (5.4) then the data vector y follows the Rician distribution. The NLL for this case is given (neglecting additive constants) by L rice (yjx) = M X m=1 ( j[A(x)] m j 2 2 2 ln I 0 j[A(x)] m j [y] m = 2 ) ; (5.5) where I N () is the Nth-order modied Bessel function of the rst kind. In the rSoS scenario, the data model extends to y = v u u t N X n=1 jA n (x) + e n j 2 ; (5.6) where N is the number of measurements being combined by rSoS (e.g., the number of coils in phased-array coil combination [188]), and theA n operators and e n vectors are the natural generalizations of theA operator and e vector to the rSoS context. Assuming that the e n vectors are i.i.d. with the same statistical characteristics as e, then y will follow an NCC distribution with NLL given (neglecting additive constants) by L ncc (yjx) = M X m=1 ( [K(x)] 2 m 2 2 ln I N1 ([K(x)] m [y] m = 2 ) [K(x)] N1 m !) ; (5.7) whereK(x) is a length-M vector with entries [K(x)] m = q P N n=1 j[A n (x)] m j 2 . Note that L ncc (yjx) reduces toL rice (yjx) when N = 1. 109 The likelihood for the NCC distributions is 0 (and the NLL will be innite) whenever K(x) is negative, since magnitude and rSoS data can never be negative. Therefore, when solving Eq. (5.1) for NCC distributions, it is necessary to restrict optimization to the set =fx : [K (x)] m 0; m = 1; 2;:::;Mg. 5.3.3 Majorize-Minimize Algorithms Consider an arbitrary cost functionJ(x) that we wish to minimize. Instead of directly mini- mizingJ (x), the MM approach [192] sequentially minimizes surrogate cost functionsG (xj^ x i ) known as tangent majorants. This can greatly accelerate computations if the G (xj^ x i ) are easier to optimize than J (x). The MM procedure is described by Alg. 1. Algorithm 1 Majorize-Minimize Procedure 1) Initialize ^ x 0 . 2) for i = 0; 1; 2;::: until convergence Set ^ x i+1 = arg min x G (xj^ x i ). To be a valid tangent majorant, eachG(xj^ x i ) should satisfyJ (x)G (xj^ x i ) for8x, with J (^ x i ) =G (^ x i j^ x i ). Under these conditions, the iterates ^ x i are guaranteed to monotonically decrease the original cost function, i.e.,J (^ x i+1 )J (^ x i ). See [196] for a detailed description of MM convergence. 5.4 Quadratic Tangent Majorants forL ncc (yjx) With the preliminaries out of the way, we are now prepared to present the main theoretical results of this paper: Theorem 1 For x2 , the functionL ncc (yjx) dened in Eq. (5.7) has the quadratic tangent majorant G (xj^ x i ) = 1 2 2 kK(x) ~ y i k 2 2 +C i ; (5.8) 110 where ~ y i is the length-M vector dened by [~ y i ] m = [y] m I N ([K(^ x i )] m [y] m = 2 ) I N1 ([K(^ x i )] m [y] m = 2 ) ; (5.9) and C i is an iteration-dependent constant. For practical implementation, note that lim t!0 I N (t)=I N1 (t) = 0 for N 1. Combining Theorem 1 with the MM procedure leads to Alg. 2, our proposed MM-based estimation algorithm. Algorithm 2 MM Algorithm for NCC Data 1) Initialize ^ x 0 . 2) for i = 0; 1; 2;::: until convergence Set ^ x i+1 = arg min x2 1 2 2 kK(x) ~ y i k 2 2 +R (x): Algorithm 2 involves the iterative solution of regularized LS problems. As a result, our proposed MM framework enables us to directly leverage existing fast regularized LS algorithms. Our proof of Theorem 1 makes use of another result that is of independent interest: Theorem 2 LetN be a non-negative integer and2R be a positive scalar. Then ln x N I N (x) is strictly concave. Theorem 2 implies thatL ncc (yjx) is generally not a convex function of x, since Eq. (5.7) is the linear combination of strictly-convex quadratic terms and strictly-concave negative log- Bessel functions. This accentuates the potential diculties in optimizing the NCC NLL, and to the best of our knowledge, the convexity of this NLL has not been previously examined. The fact thatL ncc (yjx) is the sum of strictly-convex and strictly-concave terms also implies that we could use the concave-convex procedure [197] to minimize NCC NLLs. 3 It turns out that applying this procedure to NCC-based ML optimization leads to exactly 3 Thanks to Dr. D. Ruan for pointing out this connection. 111 the same iterations as Alg. 2. As a result, the convergence theory for the concave-convex procedure [197] can also have relevance to our MM framework. Proofs of Theorems 1 and 2 are given in the Supplementary Material. The following two sections illustrate and evaluate our general MM framework in dierent application contexts. 5.5 Example Application: Denoising Our rst example concerns the well-studied problem of denoising MR magnitude and rSoS images [55,165{174]. Specically, we are given the voxels of a noisy NCC image y, and are asked to nd an estimate ^ x of the noiseless image. For this case,K is an identity operator, and the MM iteration is: ^ x i+1 = arg min x0 1 2 2 kx ~ y i k 2 2 +R (x): (5.10) For simplicity, we show 2D examples with R (x) chosen as the edge-preserving total variation (TV) penalty [198] R TV (x) = X q r [D h x] q 2 + [D v x] q 2 ; (5.11) where D h and D v are sparse matrices that respectively compute rst-order nite dierences between adjacent voxels (approximating the image derivative) along the horizontal and ver- tical directions, and is a regularization parameter. 4 As enabled by our new MM framework, we solve Eq. (5.10) using a modern fast algorithm designed for solving LS problems with TV regularization. Specically, we use the fast rst- order primal-dual proximal algorithm of Chambolle and Pock [202] to solve Eq. (5.10) at each iteration. Note that we can neglect the positivity constraint x 0 when solving Eq. (5.10), 4 Note that other popular denoising methods such as non-local means (NLM) [199] are also easily incor- porated into our MM framework using the regularization frameworks described in [200,201]. 112 because the unconstrained solution will always be positive (y and ~ y i are always positive, and the regularization will have the eect of locally-smoothing these positive images). The results we show later in this section are based on the initialization ^ x 0 = y. For reference, we compare our PML results to those obtained using common non-ML regularization strategies. Specically, we compare against: Gaussian Approximation (GA). We solve: ^ x GA = arg min x kx yk 2 2 +R (x); (5.12) which we obtain by replacing the true NCC NLL with the LS function appropriate for Gaussian noise [22]. Squaring Transformation (ST). We solve: ^ d ST = arg min d d y 2 2 2 +R (d); (5.13) where y 2 denotes the sum-of-squares image obtained by squaring each element of the rSoS image y. This choice is motivated by the fact that y 2 has a constant, signal- independent noise bias equal to 2N 2 . After obtaining ^ d ST , we \debias" it (subtracting 2N 2 and setting negative elements to 0) and obtain a denoised image ^ x ST by taking the elementwise square-root. Many existing non-ML denoising methods are based on this kind of approach, e.g., [22,165,167,169,173,190,191]. 5.5.1 Simulations The proposed approach was rst evaluated with simulated data. Complex white Gaussian noise was added to a synthetic phantom containing features with dierent intensities and dierent spatial scales. The noiseless synthetic image is not piecewise constant, and thus is 113 not perfectly suited to TV. Noisy Rician magnitude images (N = 1) and NCC rSoS images (N = 4) were syn- thesized at several dierent SNRs (SNR = 1, 2, 5, and 10, with SNR dened as the ratio between the median value of the noiseless single-channel image magnitude and ). TV de- noising was performed using the MM, GA, and ST approaches described above, using perfect knowledge of the true value of 2 . Five dierent noise realizations were generated at each SNR. For each method and each SNR, the TV regularization parameter was optimized to yield the smallest possible error with respect to the gold-standard noiseless image. Error was measured using the normalized root mean-squared error (NRMSE) metric, dened as NRMSE =k^ x x gold k 2 =kx gold k 2 , where ^ x and x gold are the estimated and gold-standard images, respectively. NRMSE was also used to quantify the performance of the dierent de- noising methods. Since NRMSE has certain limitations [203], we also performed qualitative visual comparisons. Quantitative results from these simulations are shown in Fig. 5.1, while qualitative visual results are shown in Figs. 5.2 and 5.3 and Supplementary Figs. S1 and S2 (Figs. 5.2 and 5.3 show the Rician case, while Supplementary Figs. S1 and S2 show the rSoS case with N = 4). These results show that for all SNRs, the MM and ST images have substantially less error and bias than the GA images. This is not surprising because GA is based on an inaccurate noise model that completely neglects the noise bias, while MM and ST both incorporate information about the noise bias. Comparing MM and ST, both methods are eective at removing noise bias, though the MM results have smaller NRMSE values. This is also not surprising, since MM optimally models the full noise distribution, while the ST approach only models the bias. While the NRMSE dierences between MM and ST are relatively small in this simula- tion, the denoised images generated by these two approaches have an important qualitative dierence. Specically, we observe from Fig. 5.2 and Supplementary Fig. S1 that the ST 114 1 2 3 4 5 6 7 8 9 10 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 NRMSE SNR GA MM ST (a) Rician (N = 1) 1 2 3 4 5 6 7 8 9 10 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 NRMSE SNR GA MM ST (b) rSoS (N = 4) Figure 5.1: Quantitative results for the TV denoising simulations. Each plot shows the NRMSE of each denoising method as a function of SNR. approach smooths low-intensity image regions more strongly than high-intensity regions (i.e., image features in low-intensity regions look more blurred, while high-intensity image regions still have a speckled noisy appearance), while the MM approach appears to smooth all re- gions of the image in a scale-invariant manner. This behavior is especially evident at low SNRs, as highlighted in Supplementary Fig. S3 (which shows zoomed regions from Fig. 2). The uniform smoothing achieved by the MM approach should be expected, since TV is the- oretically a scale-invariant regularization function [204]. On the other hand, applying TV leads to scale-variant behavior in the ST approach because the squaring and square-rooting operations used in ST are not scale-invariant. 5.5.2 Real Data The proposed approach was also evaluated with real Rician and rSoS brain data. A high- resolution MPRAGE image (0.50.50.8 mm 3 voxels with a matrix size of 384384192) was acquired with a 3T scanner and a 32-channel head coil. Fully-sampled raw k-space data was obtained for ve dierent repetitions of the acquisition, and images were reconstructed using the standard Fourier transform. Denoising experiments were performed on a single- 115 Noisy MM GA ST SNR=1 SNR=2 SNR=5 SNR=10 Figure 5.2: Representative results from TV denoising of simulated Rician data. 116 Noisy MM GA ST SNR=1 SNR=2 SNR=5 SNR=10 Figure 5.3: Root-mean-squared error images (computed based on ve noise realizations) for the Rician TV denoising results shown in Fig. 5.2. For easier visualization, the image intensities for SNRs 1, 2, 5, and 10 were respectively scaled by factors of 1.5, 3, 6, and 10 relative to the images in Fig. 5.2. 117 coil magnitude image and an rSoS image generated from the rst repetition, while all ve datasets were complex averaged to generate a \noiseless" gold standard reference images. We applied the unregularized ST approach to reduce any noise bias remaining in the gold standard images. The proposed approach requires prior knowledge of 2 and N. We performed ML es- timation of 2 based on data from background regions of the noisy images. Due to noise correlations between the dierent receiver coil elements in an rSoS image, the parameter N of the NCC distribution should not be chosen as the number of coils unless the receiver channels are prewhitened [29, 173]. As a result, we also performed ML estimation of N for the rSoS data, yielding an estimated value of N = 9. As shown in Supplementary Fig. S4, our estimated noise models match the empirical noise distributions quite accurately. As in the simulations, TV denoising was performed using the MM, GA, and ST ap- proaches, with regularization parameters for each method chosen to achieve a minimum NRMSE. Quantitative and qualitative results for Rician denoising are shown in Figs. 5.4 and 5.5, while rSoS results are shown in Supplementary Figs. S5 and S6. Consistent with the simulation results, both MM and ST substantially outperform the GA approach. In addition, MM yields smaller NRMSE and more uniform spatial smoothing than ST. 5.6 Example Application: HARDI Estimation Our second example involves the estimation of diusion proles from HARDI data [11, 71]. Brie y, 5 HARDI acquisitions sample the diusion-weighted MR signal on the surface of a sphere, with one set of spherical samples available for each imaging voxel. It is common to t a spherical harmonic representation to this data to yield a continuous representation of the diusion prole for each voxel [108{110]. The spherical harmonic basis for signals dened on the sphere is analogous to the Fourier series basis for periodic functions, and provides a 5 See [12] for a detailed discussion of this problem and its typical solutions. 118 (a) Noisy (NRMSE=0.2691) (b) MM (NRMSE=0.1371) (c) GA (NRMSE=0.1965) (d) ST (NRMSE=0.1650) Figure 5.4: Results from Rician TV denoising of real brain data. Also shown are the NRMSE values for each image. 119 (a) Noisy (b) MM (c) GA (d) ST Figure 5.5: Error images for the Rician TV denoising results shown in Fig. 5.4. For easier visualization, the error image intensities were scaled by a factor of 4 relative to the images in Fig. 5.4. 120 convenient orthonormal basis that is widely used in a variety of applications. In HARDI, the tted spherical harmonic representation can then be used to estimate the orientations of ber structures within each voxel (useful for tracking white matter or muscle bers through a process called tractography), and to estimate quantitative diusion characteristics for each voxel like mean diusivity (MD) and generalized fractional anisotropy (GFA). Parameters like MD and GFA provide insight into tissue microstructure and biomarkers for disease [11]. The use of magnitude images is especially prevalent in diusion MRI, due to the fact that image phase is highly sensitive to subject motion in the presence of diusion encoding. This leads to phase inconsistencies for in vivo data. In addition, the physics of the diusion encod- ing process also means that diusion MR images typically have low-SNR [26,55,172,174,177]. This combination of characteristics means that PML-based NCC modeling should be partic- ularly attractive. However, most current HARDI studies denoise the data prior to estima- tion [55,172,174], and PML estimation has only been investigated to a limited extent [177]. Ref. [177] concluded that the PML approach is generally too computationally expensive to be worthwhile relative to simpler non-ML debiasing approaches. In this section, we demonstrate the potential value oered by our MM framework to enable computationally ecient PML HARDI estimation. It should be noted that [177] solved a maximum entropy spherical de- convolution problem for HARDI data, which is related but not identical to the more-common formulation based on Laplace-Beltrami regularized spherical harmonic estimation problem described below. To the best of our knowledge, this paper is the rst work to evaluate PML methods for this formulation of the problem. Given a data vector y of diusion prole samples from a single voxel, commonly-used tting procedures [12,108{110] estimate the spherical harmonic representation by solving ^ x = arg min x 1 2 2 kAx yk 2 2 +kLxk 2 2 ; (5.14) where the columns of the matrix A correspond to sampled versions of the truncated spherical 121 harmonic basis functions, 6 x is the unknown vector of spherical harmonic coecients, is a regularization parameter, and L is either the identity matrix (to penalize signals with large ` 2 -norm [108]) or the Laplace-Beltrami operator (to penalize non-smooth signals [109]). Comparing with Eq. (5.1), Eq. Eq. (5.14) can be viewed as a GA-based form of spherical harmonic estimation. This GA estimator is easily computed using the noniterative linear LS solution [109] ^ x = A H A + 2 2 L H L 1 A H y, y: (5.15) Instead of GA estimation, we investigate the use of NCC PML estimation as enabled by our proposed MM framework, by iterating according to: ^ x i+1 = arg min x:Ax0 1 2 2 kAx ~ y i k 2 2 +kLxk 2 2 : (5.16) Equation. Eq. (5.16) has the simple solution ^ x i+1 = ~ y i when A~ y i is non-negative. When non-negativity is violated, we solve Eq. (5.16) as a standard quadratic programming problem using Matlab's quadprog function. The results we show later in this section are initialized by setting ^ x 0 equal to the non-negativity constrained solution to Eq. (5.14). Our proposed MM approach was compared against the GA and ST approaches. For this problem, the ST approach debiases each measurement sample independently by subtracting the noise bias from magnitude-squared samples as described in the previous section, and then applies the standard GA estimator to the \debiased" data. Laplace-Beltrami regularization was used in all cases, and the regularization parameter was individually optimized for each case to yield the smallest NRMSE. 6 In our implementation, we use the modied (real-valued and symmetric) spherical harmonic basis func- tions proposed in [109]. 122 5.6.1 Simulations Noiseless diusion data was simulated at M = 200 dierent points on the sphere, based on a two-tensor model given by S(q m ) = S 0 2 h e bq H m D 1 qm +e bq H m D 2 qm i ; (5.17) for m = 1; 2;:::;M, where S(q m ) is the mth noiseless sample, S 0 is the signal without diusion weighting, the diusion encoding value is b =2,000 s/mm 2 , the q m are length- 3 unit vectors describing the spherical sampling locations, and D 1 and D 2 are diusion tensors [39] with identical eigenvalues ( 1 = 1:410 3 mm 2 /s, 2 = 3 = 0:210 3 mm 2 /s) but dierent orientations. The angle between the principal eigenvectors of the two tensors was systematically varied from 0 to 90 . Complex Gaussian noise was added to the diusion signal, and Rician data was obtained by taking the magnitude. Noisy data was simulated at SNR=3, 6, and 10, with SNR dened asS 0 =. Estimation was performed 1000 times (using dierent noise realizations) for each case. Estimation results were evaluated qualitatively and quantitatively. Quantitative perfor- mance was assessed using: Spherical Harmonic Error. Computed as the NRMSE of the estimated spherical harmonic coecients with respect to those estimated from noiseless data. Mean Diusivity (MD). Quantitative MD was computed according to [110] MD = 1 bM M X m=1 ln ([A^ x] m =S 0 +"): (5.18) where " is a small regularization parameter used reduce the sensitivity of the loga- rithm to small perturbations when [A^ x] m is near 0. This parameter was adjusted independently for each algorithm to yield minimal NRMSE. 123 Generalized Fractional Anisotropy (GFA). The quantitative GFA diusion mea- sure was computed as [205] q 1 [^ c] 1 =k^ ck 2 2 ]; (5.19) where ^ c is the spherical harmonic representation of the Funk-Radon Transform of ^ x [11], and the rst spherical harmonic coecient [c] 1 corresponds to the isotropic spherical harmonic basis function. Quantitative simulation results are shown in Fig. 5.6 and Supplementary Fig. S7. Con- sistent with our previous results, Fig. 5.6 shows that the proposed MM method has the smallest spherical harmonic error at all SNRs, while Supplementary Fig. S7 shows that the MM method frequently yields MD and GFA estimates with the smallest bias. However, it should also be noted that MM can yield MD and GFA estimates with higher variance than GA or ST. This behavior (lower bias but higher variance) is consistent with previous studies involving the Rician distribution and the diusion tensor model [26]. Qualitative results are shown in Fig. 5.7. We observe that all the methods work well at high SNR. As before, this is not surprising, since the Gaussian approximation is accurate at high SNR, while accurate noise modeling becomes more important at low SNR. At lower SNRs, both MM and ST yield substantially less bias than GA, while MM has slightly less bias than ST. These results further validate the potential benets of using PML estimation with the MM framework. 5.6.2 Real Data We also applied the MM approach to the 32-channel multi-shell diusion imaging data from [172], with shells at b-values of 1; 000, 2; 000, 3; 000, and 4; 000 s/mm 2 and M = 30 samples per shell. Unlike [172] (which performed specialized k-space based denoising of the data), we used standard Fourier reconstruction of the data, with the complex images from each coil combined using rSoS. 124 0 20 40 60 80 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 Tensor angle (degrees) SH coefficient error GA MM ST (a) SNR=10 0 20 40 60 80 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 Tensor angle (degrees) SH coefficient error GA MM ST (b) SNR=3 Figure 5.6: Quantitative spherical harmonic (SH) errors for the HARDI simulations. The lines and error bars respectively correspond to the means and standard deviations of the NRMSE values across 1000 noise realizations. Diusion proles from each shell were estimated separately within the HARDI framework using the MM, GA, and ST approaches. Noise parameters N and 2 were estimated as in Sec. 5.5.2. Since no gold standard is available, results were assessed qualitatively. Results are shown in Fig. 5.8. It can be seen that both MM and ST are able to sub- stantially reduce noise bias relative to GA. However, we also observe that the high b-value ST images possess much fewer low-intensity image structures relative to the MM and GA approaches, suggesting that it is less eective than MM at accurately preserving the diusion prole. These results indicate that noise modeling during spherical harmonic estimation has potential benets, and can complement the traditional approach where diusion images are denoised prior to spherical harmonic estimation [55,172,174]. 5.6.3 Comparisons with Traditional Optimization Algorithms Our previous results focused on the benets of using PML estimation instead of widely- used simplifed noise models. In this section, we compare the computational eciency of our proposed MM approach against generic optimization algorithms that have been previously 125 Angle SNR=3 SNR=6 SNR=10 0 40 70 90 Figure 5.7: Diusion prole 2D cross-sections (passing through the plane dened by the principal eigenvectors ofD 1 andD 2 ) computed from simulated data. Each image shows the noiseless prole (green) and average proles estimated from noisy data using the MM (red), ST (magenta), and GA (blue) approaches. applied to solve NCC optimization problems. Specically, we compared against standard quasi-Newton and Levenberg-Marquardt methods [53]. These algorithms were implemented using Matlab's fminunc and lsqnonlin functions, respectively, while our MM approach was evaluated using our own (relatively unoptimized) Matlab implementation. All algorithms were implemented with the same stopping criterion (iterations halted when the estimated change in spherical harmonic coecients was less than 10 6 , leading to a dierent number of iterations for each voxel and each algorithm. The number of MM iterations was limited to at most 100). Computations were performed using the same NCC-based cost function and 32-channel data from Sec. 5.6.2, using a workstation with a quad-core 2.27 GHz processor. Spherical harmonic coecients were estimated for all voxels within the FOV, and compute 126 b (s/mm 2 ) Noisy MM GA ST 1,000 2,000 3,000 4,000 Figure5.8: Representative real-data spherical harmonic estimation results. Each image row shows the estimated value of [A^ x] m for dierent choices of m corresponding to dierent b-values. times were saved for each algorithm. The results obtained using MM had the smallest overall cost function values, ensuring that our compute times are not biased in favor of the MM approach. Table 5.1 shows the mean and median compute times for each voxel, as well as the stan- dard deviation and the total compute time for the whole image. The MM-based algorithm was much faster than the other PML algorithms for this data. In particular, the proposed MM algorithm was more than 16 faster than the Levenberg-Marquardt method and more than 100 faster than the quasi-Newton method. Figure 5.9 compares the trade-o between 127 Table 5.1: Mean, median, standard deviation, and total compute times (units of seconds) for dierent PML algorithms. Algorithm Mean Median Stand. Dev. Total Quasi- Newton 1.923 2.010 0.516 31503.3 Levenberg- Marquardt 0.119 0.118 0.010 1941.3 MM 0.017 0.022 0.008 283.7 compute time and statistical optimality between the PML and regularized non-ML meth- ods. While the MM approach is not as fast as the non-ML GA and ST methods (with most compute times less than 4 ms per voxel), it yielded substantially smaller PML cost function values, providing a nice balance between quality and speed. When interpreting these speed results, it should be noted that the computational e- ciency of MM was dependent on the existence of ecient algorithms for solving regularized LS problems. For other cost functions where ecient regularized LS algorithms are not available, the MM approach may not be preferred. As with all optimization algorithms, computational eciency is best examined individually for each context. 5.7 Discussion Several issues relevant to our proposed framework are discussed in the following subsections. 5.7.1 Initialization and Local Minima Our empirical results indicate that the MM approach demonstrates better performance than the ST and GA methods for the applications we investigated. However, it is important to point out that the computational complexity of ST or GA is similar to the complexity of just one iteration of the MM approach, and that the MM approach needs to be initialized well for the dierence in computation time to be negligible. Good initializations are also 128 Figure 5.9: Plot of the cumulative PML cost function value for all voxels (normalized for easier visualization) versus the mean per-voxel compute time for dierent algorithms. The error bars show one standard deviation. important because our work showed that the NCC NLL is generally nonconvex, and therefore subject to local minima. We are also aware of one specic initialization that is particularly problematic: if ^ x 0 is chosen such thatK (^ x 0 ) = 0, then the corresponding ~ y 0 = 0, meaning that ^ x 1 will not depend in any way on the measured data y and that the MM framework can stagnate unless the regularization term R (x) specically prefers nonzero solutions. To empirically investigate convergence characteristics, we performed MM-based NCC image denoising as in Sec. 5.5.1 for four dierent initializations: (i) initialization with the noisy image ^ x 0 = y, (ii) initialization with the true noiseless image, (iii) initialization with the unregularized ^ x ST result, and (iv) random initialization. For each initialization, we ran 500 MM iterations while tracking the cost function value. Figure 5.10 shows the results. It can be seen that the results of all the initializations converge to a similar cost function value, though dierent initializations converge faster 129 0 100 200 300 400 500 10 −4 10 −2 10 0 10 2 10 4 10 6 10 8 Iterations NLL Noisy Reference ST output Random (a) SNR 5 0 100 200 300 400 500 10 −4 10 −2 10 0 10 2 10 4 10 6 Iterations NLL Noisy Reference ST output Random (b) SNR 1 Figure 5.10: Convergence plots for the proposed MM approach with dierent initializations. than others. The resulting denoised images (not shown) are quite similar to one another, though do contain small dierences in regions of the image where the true noiseless signal is zero. Even though the nal results obtained from these dierent initializations only have minor dierences for this denoising example, it is important to remember that there are initializations for which the MM approach will fail to come close to a useful solution (e.g., the all-zero initialization), and that other application contexts might be more severely impacted by nonconvexity. At least for this application, either the ST output or the noisy image seem like useful initializations for the MM approach, since both of these choices lead to fast convergence in Fig. 5.10. 5.7.2 Accurate Noise Modeling Using the NCC NLL from Eq. (5.7) relies on the prior knowledge of N and 2 , and assumes that noise is white and uncorrelated from voxel to voxel and from coil to coil. However, there are practical situations where these assumptions might be violated. For example, it is frequently inaccurate to assume that the noise between dierent coils is uncorrelated and has the same variance. Previous work has shown that in such cases, rSoS data is still modeled 130 accurately by an NCC distribution, except that the parameterN might not equal the number of images that were combined using rSoS [29,206{209]. For this case, it has been shown that ML estimation of N yields reasonable results [29], and for simplicity, we used this approach in the results we presented. Alternatively, another common approach would be to prewhiten the data from dierent coils using the noise correlation matrix [158], and apply rSoS to these instead. Since we used fully-sampled data and traditional Fourier reconstruction, our assumptions (thatN and 2 were spatially invariant, and that noise in dierent voxels was uncorrelated) will be valid for all the results shown in this paper. This means that our noise modeling assumptions are expected to be accurate, as conrmed by Supplementary Fig. S3. However, depending on the provenance of the images, there are cases where the noise may be spatially correlated and/or spatially varying. For example, spatial noise correlations and spatially- varying noise parameters can exist if images are reconstructed using parallel imaging methods like GRAPPA [210] or SENSE [158]. Spatially-varying 2 values can also exist if images from multiple coils are combined using adaptive combination methods [211]. Fortunately, it has been observed that even if spatial noise correlation is neglected, there is still a benet to modeling the data as if it were spatially-independent NCC [29]. In addition, there exist several schemes for directly estimating the parameters of spatially-varying noise elds from MRI images [29, 206{209], and there also exist methods for theoretically predicting noise parameters based on reconstruction parameters [212]. If spatially-varying noise parameters are known, the MM approach can be easily adapted: this would be achieved by replacing 2 and N in Eq. (5.7) by appropriate spatially-varying values that depend on m. To illustrate this, we used the same 32-channel MPRAGE dataset from Sec. 5.5.2 to generate GRAPPA-reconstructed images (acceleration factor 3) from single-repetition data. Images from each coil were combined using both rSoS and adaptive coil combination [211]. The combined images were subsequently denoised using TV regularization and the MM, 131 GA, and ST methods. For each method, the cost function was modied on a voxel-by-voxel basis to account for spatially varying noise. We used the pseudo multiple replica method to generate synthetic noise with the same statistics as the real noise [212], and estimated spatially-varying 2 andN values using ML (withN = 1 for adaptively-combined data, since adaptively-combined images have Rician statistics). As before, the regularization parameter was tuned to yield minimum NRMSE for each method, with NRMSE calculated based on a fully-sampled 5-averaged reference (debiased using ST). Figs. 5.11 and 5.12 show the results for adaptively combined data, while Supplementary Figs. S8 and S9 show the rSoS results. As expected, the MM approach can successfully accommodate spatially-varying noise, and has better denoising performance than either GA or ST. In many data processing pipelines, it is common to also perform multiple preprocessing steps that could in uence the noise distribution in the images. For example, motion correc- tion, distortion correction, and bias eld correction will all modify the characteristics of the noise distribution of an MR image in potentially undesirable ways that could invalidate the NCC assumptions. There are several possible ways of addressing this issue. A simple option would be to apply the NCC modeling at the rst stage of the data processing pipeline. This is the approach we used in this paper (none of our data was preprocessed). Another option would be to keep track of the eects of each preprocessing stage on the overall noise distri- bution. For example, bias eld correction will lead to a simple spatially-varying modication of 2 that can easily be computed if the bias eld correction parameters are known. A nal option might be to implement all of the dierent traditional image processing operations into a single integrated step. This might also have other advantages, since there is empirical evidence that supports the idea that integrated approaches yield better performance than stagewise processing in a variety of contexts (e.g., [213{215]). It should also be noted that certain types of image reconstruction (e.g., nonlinear image 132 (a) Noisy (NRMSE=0.2847) (b) MM (NRMSE=0.1994) (c) GA (NRMSE=0.2393) (d) ST (NRMSE=0.2491) Figure 5.11: Results from GRAPPA reconstructed and adaptive combined TV denoising of real brain data. Also shown are the NRMSE values for each image. 133 (a) Noisy (b) MM (c) GA (d) ST Figure 5.12: Error images for GRAPPA reconstructed and adaptive combined TV denoising results shown in Fig. 5.11. For easier visualization, the error image intensities were scaled by a factor of 4.2 relative to the images in Fig. 5.11. 134 reconstruction from sparsely-sampled and/or noisy data [55, 159]) could result in images with highly non-NCC statistical distributions. The proposed approach should be used with caution in such cases. Consistent with previous reports [26], we observed that estimation using an accurate NCC model can decrease estimator bias, but can sometimes also increase variance. For applications where consistency is more important than systematic bias, approaches like GA might have certain advantages relative to accurate NCC modeling, and it is important to keep these trade-os in mind when choosing a noise model for a specic MR application. 5.7.3 Choosing Images: Complex, Magnitude, or rSoS? In principle, there are scenarios (e.g., when the original k-space data is available from each of multiple dierent receiver coils) in which investigators can choose whether to process multiple dierent Gaussian complex images, multiple dierent Rician magnitude images, or a single NCC-distributed rSoS image. It is natural to ask: which one of these options should be preferred? It is known from the data processing inequality [216] that we should theoretically expect for complex images to contain more information than magnitude images, and that magnitude images should contain more information than rSoS images, based on the fact that rSoS images can be computed from magnitude images, which themselves can be computed from complex images. Empirical conrmation of the advantage of complex images relative to Rician images for image denoising was shown in [172] and some of its references. However, Sijbers et al. [23] showed the surprising result that Gaussian complex images should only be preferred over Rician images in cases where the phase is relatively consistent, while estimation based on Rician distributed images can yield smaller RMSE when the phase is less predictable. To the best of our knowledge, the question of whether to prefer complex images versus Rician images versus NCC images has never been addressed before. For this work, we 135 Noisy Complex Magnitude rSoS SNR=1 (a) 0.6786 (b) 0.1168 (c) 0.1574 (d) 0.1295 SNR=2 (e) 0.2746 (f) 0.0832 (g) 0.1401 (h) 0.1074 SNR=10 (i) 0.0463 (j) 0.0257 (k) 0.0334 (l) 0.0271 Figure 5.13: Comparison between TV denoising of complex, magnitude, and rSoS images. The NRMSE for each image is shown below each result. performed a small-scale simulation study to provide insight. Specically, we simulated a 4-coil data acquisition, where the noiseless image for each coil had constant phase and the same magnitude as the image used in Sec. 5.5.1, and each coil was subject to i.i.d. complex Gaussian noise. We compared the following reconstruction schemes: (i) each complex image was denoised idependently using Gaussian modeling with TV regularization, followed by rSoS coil combination; (ii) the magnitude images from each coil were denoised independently using Rician modeling with TV regularization, followed by rSoS coil combination; (iii) the images were combined via rSoS, and then denoised using NCC modeling with TV regularization. Regularization parameters were independently optimized for each scheme and SNR to achieve minimal NRMSE. Representative results are shown in Fig. 5.13. The results conrm the earlier nding 136 that complex Gaussian images should be preferred when phase is consistent. However, it is interesting to note that NCC modeling outperforms Rician modeling. This can be partially attributed to the fact that rSoS coil combination has the eect of noise averaging, while the independent Rician denoising problems each have relatively lower SNR. Our evaluation in this subsection was intentionally brief, and dierent reconstruction and/or coil-combination strategies could easily lead to dierent conclusions (e.g., joint reconstruction [217] or adap- tive coil combination [211]). In addition, images with dierent phase characteristics could yield very dierent results. For example, most MRI images have smooth phase, but spatial smoothing of complex images with rapidly varying phase would lead to poor results due to local phase cancellations. Despite the relatively small scope of the comparisons shown in Fig. 5.13, the main point we want to emphasize is that it's important to carefully consider how imaging data is stored and processed, since these decisions can have substantial impacts on the quality of the nal results. 5.8 Conclusions This paper proposed a new MM framework for statistical estimation problems involving NCC MRI images. A key advantage is that the original nonconvex and nonlinear PML optimization problem is reduced into a series of regularized LS problems, which enables the use of algorithms from the extensive and growing body of literature on regularized LS optimization. Our empirical results validated that the use of the true NCC NLLs can have substantial performance advantages over commonly-used estimation approaches such as GA and ST that solve simplied optimization problems. In addition, we demonstrated that our proposed MM approach can have substantially lower compute time relative to traditional methods that are invoked for solving nonlinear optimization problems. The computational eciency we observed is consistent with results reported by another group [218], who used our proposed MM framework (as we initially described in [156]) to achieve more than a 15 137 reduction in compute time for the denoising method described in [55]. Though we've only investigated the proposed framework in a couple scenarios, we expect it to be useful in a broad range of applications. A few examples include distortion correc- tion for EPI images [185], non-negative matrix factorization [219], and compressed sensing diusion MRI [105, 161{164], among many others. The proposed framework is also a nice complement to recent tools for optimally choosing regularization parameters when dealing with NCC data [220]. 138 Chapter 6 Conclusion 139 Diusion MRI has enabled study of microstructure architecture in biological tissues that was otherwise inaccessible in conventional MRI. Access to microscopic properties has led to many new applications, one of the most important being the study of white matter architecture of the human brain in-vivo. However, one of the most important limitation of dMRI scans is that they are inherently long because we have to additionally encode the diusion dimensions along with the standard spatial dimensions. Moreover, the signal from a single voxel contains information about the random motion of water molecules from multiple micro-environments contained in that voxel. Random motion in each of the micro-environments is complex and its characterization continues to be an active area of research. Therefore, in practice since the scan time is limited, we are constantly trying to estimate complex diusion information from very few samples. This dissertation addresses the issue of limited scan time by proposing several directions to improve overall eciency of a dMRI experiment. We developed new theory that can predict performance of linear dMRI methods, developed a new method to optimally design parameter estimation methods for dMRI, and developed a new algorithm to suppress noise. In our research we have used tools from (1) Fourier theory to develop linear estimation theory for dMRI, (2) estimation theory to propose cost functions for estimator design, (3) machine learning to learn ecient estimators from training data, and (4) optimization to propose ecient algorithms to optimize complex cost functions. The main contributions of this work include: The development of a new theoretical characterization to describe linear estimation in Diusion MRI. We derived a theoretical relationship between the EAP and the esti- mated EAP parameter. The derivation combines the Fourier transform relationship between q-space and EAP under narrow pulse assumption of the PGSE sequence and the general mathematical description of linear estimation to derive the relationship. The proposed theory can characterize arbitrary linear estimation methods in dMRI 140 when applied to arbitrary q-space sampling schemes. We have demonstrated that the new theory can predict important performance characteristics such as angular resolu- tion, estimation error and noise resilience and aliasing for any linear dMRI method. These predictions can help choose the sampling scheme and estimation method best suited for a particular application without empirical evaluations. The development of a novel EAP parameter estimation method, called ERFO (ERF Optimized parameter estimation) that learns a linear EAP parameter estimator from training data. We optimized a Bayes Risk cost function that minimized the squared error between true parameter and the linearly estimated parameter. The approach can be applied to arbitrary q-space sampling schemes, has strong theoretical justication, and it can be shown that the trained estimators will generalize to new settings they weren't trained for. We evaluated the ERFO estimator for estimating multiple EAP parameter with simulation and real data and demonstrated improved performance. We also demonstrated that ERFO has reduced errors even in the presence of Rician noise. The development of a new framework for statistical estimation problems involving Rician or non-central chi (NCC) distributed MRI images. Rician and NCC distri- bution accurately model noise in magnitude and root sum of squares combined MR images. Our estimation framework proposes an ecient majorize minimize algorithm for solving the maximum likelihood estimate of Rician and NCC distributions. We pre- sented empirical results to validate that the use of our approach can have substantial performance advantages over commonly-used estimation approaches. In addition, we demonstrated that our proposed MM approach can have substantially lower compute time relative to traditional methods. We expect these tools to be useful when choosing between dierent existing schemes, and improving overall accuracy by improving performance of the estimation and the noise re- moval methods. In addition, there are other extensions possible for improving eciency even 141 more and will become future directions for our research. In chapter 4 we presented one of our future directions by proposing a scheme to jointly optimize q-space sampling scheme and estimation method. Such joint optimization has never been done before and our initial evaluations are encouraging and suggest substantial improvement in eciency. Other po- tential improvements include (1) accurate noise modeling during estimation method design by including Rician or non-central chi noise, (2) extensions to other MR imaging modalities such as perfusion weighted imaging which shares similar EAP based MR physics, and (3) designing suitable cost functions that are specic to a given application. Most importantly, approaches in dMRI generally tend to optimize the acquisition and reconstruction performance separately. However, errors in the estimate have joint contri- butions from both these blocks. The work presented in this thesis is novel also because it is the rst step to evaluate linear dMRI estimation holistically and consider both dMRI acquisition and reconstruction jointly to optimize performance. In the future, developing general approaches that are broader than linear methods to holistically optimize estimation performance will be an exciting and important step forward. 142 Chapter 7 Bibliography [1] E. O. Stejskal, J. E. Tanner, Spin diusion measurements: Spin echoes in the presence of a time dependent eld gradient, J. Chem. Phys. 42 (1965) 288{292. [2] P. T. Callaghan, Principles of Nuclear Magnetic Resonance Microscopy, Clarendon Press, Oxford, 1991. [3] H. Nyquist, Certain topics in telegraph transmission theory, Transactions of the Amer- ican Institute of Electrical Engineers 47 (2) (1928) 617{644. [4] S. C. E., A mathematical theory of communication, Bell System Technical Journal 27 (3) 379{423. [5] D. C. Alexander, A general framework for experiment design in diusion mri and its application in measuring direct tissue-microstructure features, Magnetic Resonance in Medicine 60 (2) (2008) 439{448. [6] P. J. Basser, D. K. Jones, Diusion tensor mri: theory, experimental design and data analysis - a technical review, NMR in Biomedicine 15 (7-8) (2002) 456{467. 143 [7] D. H. J. Poot, A. J. den Dekker, E. Achten, M. Verhoye, J. Sijbers, Optimal experi- mental design for diusion kurtosis imaging, IEEE Transactions on Medical Imaging 29 (3) (2010) 819{829. [8] D. K. Jones, M. A. Horseld, A. Simmons, Optimal strategies for measuring diusion in anisotropic systems by magnetic resonance imaging, Magn. Reson. Med. 42 (1999) 515{525. [9] S. Jbabdi, S. N. Sotiropoulos, A. M. Savio, M. Gra~ na, T. E. J. Behrens, Model- based analysis of multishell diusion mr data for tractography: How to get over tting problems, Magnetic Resonance in Medicine 68 (6) (2012) 1846{1855. [10] S. N. Sotiropoulos, S. Jbabdi, J. Xu, J. L. Andersson, S. Moeller, E. J. Auerbach, M. F. Glasser, M. Hernandez, G. Sapiro, M. Jenkinson, D. A. Feinberg, E. Yacoub, C. Lenglet, D. C. V. Essen, K. Ugurbil, T. E. Behrens, Advances in diusion MRI acquisition and processing in the human connectome project, NeuroImage 80 (2013) 125{143. [11] D. S. Tuch, Q-ball imaging, Magn. Reson. Med. 52 (2004) 1358{1372. [12] J. P. Haldar, R. M. Leahy, Linear transforms for Fourier data on the sphere: Appli- cation to high angular resolution diusion MRI of the brain, NeuroImage 71 (2013) 233{247. [13] J.-D. Tournier, F. Calamante, A. Connelly, Robust determination of the bre orienta- tion distribution in diusion MRI: Non-negativity constrained super-resolved spherical deconvolution, NeuroImage 35 (2007) 1459{1472. [14] L. Raguin, D. Hernando, D. Karampinos, L. Ciobanu, B. Sutton, Z. Liang, J. Geor- giadis, Quantitative analysis of q-space mri data, Vol. 11, 2005. 144 [15] H. Zhang, T. Schneider, C. A. Wheeler-Kingshott, D. C. Alexander, NODDI: Prac- tical in vivo neuroite orientation dispersion and density imaging of the human brain, NeuroImage 61 (2012) 1000{1016. [16] W. Xiaojie, C. M. F., W. Yong, S. Peng, L. J. E., T. Kathryn, F. R. S., S. Sheng- Kwei, Diusion basis spectrum imaging detects and distinguishes coexisting subclin- ical in ammation, demyelination and axonal injury in experimental autoimmune en- cephalomyelitis mice, NMR in Biomedicine 27 (7) (2014) 843{852. [17] E. . Ozarslan, C. Koay, T. M. Shepherd, S. J. Blackband, P. J. Basser, Simple harmonic oscillator based reconstruction and estimation for three-dimensional q-space MRI, in: Proc. Int. Soc. Magn. Reson. Med., 2009, p. 1396. [18] L. Ning, C.-F. Westin, Y. Rathi, Estimating diusion propagator and its moments using directional radial basis functions, IEEE Transactions on Medical Image 34 (2015) 1{21. [19] A. V. Avram, J. E. Sarlls, A. S. Barnett, E. . Ozarslan, C. Thomas, M. O. Irfanoglu, E. Hutchinson, C. Pierpaoli, P. J. Basser, Clinical feasibility of using mean apparent propagator (MAP) MRI to characterize brain tissue microstructure, NeuroImage 127 (2016) 422{434. [20] H. Nyquist, Thermal agitation of electric charge in conductors, Phys. Rev. 32 (1928) 110{113. [21] J. B. Johnson, Thermal agitation of electricity in conductors, Phys. Rev. 32 (1928) 97{109. [22] H. Gudbjartsson, S. Patz, The Rician distribution of noisy MRI data, Magn. Reson. Med. 34 (1995) 910{914. 145 [23] J. Sijbers, A. J. den Dekker, Maximum likelihood estimation of signal amplitude and noise variance from MR data, Magn. Reson. Med. 51 (2004) 586{594. [24] S. Aja-Fern andez, M. Niethammer, M. Kubicki, M. E. Shenton, C.-F. Westin, Restora- tion of DWI data using a Rician LMMSE estimator, IEEE Trans. Med. Imag. 27 (2008) 1389{1403. [25] S. Aja-Fern andez, C. Alberola-L opez, C.-F. Westin, Noise and signal estimation in magnitude MRI and Rician distributed images: a LMMSE approach, IEEE Trans. Image Process. 17 (2008) 1383{1398. [26] J. L. R. Andersson, Maximum a posteriori estimation of diusion tensor parameters using a Rician noise model: Why, how and but, NeuroImage 42 (2008) 1340{1356. [27] A. C ardenas-Blanco, C. Tejos, P. Irarrazaval, I. Cameron, Noise in magnitude magnetic resonance images, Concepts Magn. Reson. A 32A (2008) 409{416. [28] D. Varadarajan, J. P. Haldar, A majorize-minimize framework for Rician and non- central chi MR images, IEEE Trans. Med. Imag. 34 (2015) 2191{2202. [29] S. Aja-Fern andez, A. Trist an-Vega, A review on statistical noise models for magnetic resonance imaging, Tech. Rep. LPI2013-01, Universidad de Valladolid (2013). [30] G. B. Airy, On the Diraction of an Object-glass with Circular Aperture, Transactions of the Cambridge Philosophical Society 5 (1835) 283. [31] K. Rossmann, Point spread-function, line spread-function, and modulation transfer function: tools for the study of imaging systems, Radiology 93 (2) (1969) 257{272. [32] M. M. Shashadhar, T. D. Je, P. R. John, B. C. Joseph, W. A. Don, Transfer function measurement and analysis for a magnetic resonance imager, Medical Physics 18 (6) (1991) 1141{1144. 146 [33] S. C. Michael, D. J. Dick, P. S. Frank, Computing the modulation transfer function of a magnetic resonance imager, Medical Physics 21 (3) (1994) 483{489. [34] M. D. Robson, J. C. Gore, R. T. Constable, Measurement of the point spread function in MRI using constant time imaging, Magn. Reson. Med. 38 (1997) 733{740. [35] K. van Everdingen, J. van der Grond, L. Kappelle, L. Ramos, W. Mali, Diusion- weighted magnetic resonance imaging in acute stroke, Stroke 29 (9) (1998) 1783{1790. [36] G. T. Stebbins, C. M. Murphy, Diusion tensor imaging in alzheimer's disease and mild cognitive impairment, Behavioural Neurology 21 (1-2) (2009) 39{49. [37] S. H. Eriksson, F. J. Rugg-Gunn, M. R. Symms, G. J. Barker, J. S. Duncan, Diusion tensor imaging in patients with epilepsy and malformations of cortical development, Brain 124 (3) (2001) 617{626. [38] M. Rovaris, A. Gass, R. Bammer, S. J. Hickman, O. Ciccarelli, D. H. Miller, M. Filippi, Diusion mri in multiple sclerosis, Neurology 65 (10) (2005) 1526{1532. [39] P. J. Basser, J. Mattiello, D. LeBihan, Estimation of the eective self-diusion tensor from the NMR spin echo, J. Magn. Reson. B 103 (1994) 247{254. [40] I. Aganj, C. Lenglet, G. Sapiro, E. Yacoub, K. Ugurbil, N. Harel, Reconstruction of the orientation distribution function in single- and multiple-shell q-ball imaging within constant solid angle, Magn. Reson. Med. 64 (2010) 554{566. [41] M. E. Moseley, Y. Cohen, J. Kucharczyk, J. Mintorovitch, H. S. Asgari, M. F. Wend- land, J. Tsuruda, D. Norman, Diusion-weighted mr imaging of anisotropic water diusion in cat central nervous system., Radiology 176 (2) (1990) 439{445. 147 [42] J.-D. Tournier, F. Calamante, D. G. Gadian, A. Connelly, Direct estimation of the ber orientation density function from diusion-weighted MRI data using spherical deconvolution, NeuroImage 23 (2004) 1176{1185. [43] V. J. Wedeen, P. Hagmann, W.-Y. I. Tseng, T. G. Reese, R. M. Weissko, Map- ping complex tissue architecture with diusion spectrum magnetic resonance imaging, Magn. Reson. Med. 54 (2005) 1377{1386. [44] H.-E. Assemlal, D. Tschumperl e, L. Brun, Ecient and robust computation of PDF features from diusion MR signal, Med. Image Anal. 13 (2009) 715{729. [45] J. Cheng, A. Ghosh, T. Jiang, R. Deriche, Model-free and analytical EAP reconstruc- tion via spherical polar fourier diusion MRI, in: Proc. MICCAI, 2010, pp. 590{597. [46] M. Descoteaux, R. Deriche, D. LeBihan, J.-F. Mangin, C. Poupon, Multiple q-shell diusion propagator imaging, Med. Image Anal. 15 (2011) 603{621. [47] A. P. Hosseinbor, M. K. Chung, Y.-C. Wu, A. L. Alexander, Bessel Fourier orientation reconstruction (BFOR): An analytical diusion propagator reconstruction for hybrid diusion imaging and computation of q-space indices, NeuroImage 64 (2013) 650{670. [48] E. . Ozarslan, C. G. Koay, T. M. Shepherd, M. E. Komlosh, M. O. Irfanoglu, C. Pier- paoli, P. J. Basser, Mean apparent propagator (MAP) MRI: A novel diusion imaging method for mapping tissue microstructure, NeuroImage 78 (2013) 16{32. [49] M. Zucchelli, L. Brusini, C. A. M endez, A. Daducci, C. Granziera, G. Menegaz, What lies beneath? Diusion EAP-based study of brain tissue microstructure, Med. Image Anal. 32 (2016) 145{156. [50] R. H. J. Fick, D. Wassermann, E. Caruyer, R. Deriche, MAPL: Tissue microstructure estimation using Laplacian-regularized MAP-MRI and its application to HCP data, NeuroImage 134 (2016) 365{385. 148 [51] S. L. Merlet, R. Deriche, Continuous diusion signal, EAP and ODF estimation via compressive sensing in diusion MRI, Med. Image Anal. 17 (2013) 556{572. [52] F.-C. Yeh, V. J. Wedeen, W.-Y. I. Tseng, Generalizedq-sampling imaging, IEEE Trans. Med. Imag. 29 (2010) 1626{1635. [53] D. Bertsekas, Nonlinear Programming, Athena Scientic, 1999. [54] O. T. Karlsen, R. Verhagen, W. M. M. J. Bov ee, Parameter estimation from Rician- distributed data sets using a maximum likelihood estimator: application to T 1 and perfusion measurements, Magn. Reson. Med. 41 (1999) 614{623. [55] F. Lam, S. D. Babacan, J. P. Haldar, M. W. Weiner, N. Schu, Z.-P. Liang, Denoising diusion-weighted magnitude MR images using rank and edge constraints., Magn. Reson. Med. 71 (2014) 1272{1284. [56] C. Matute, M. Domercq, A. P erez-Samart n, B. R. Ransom, Protecting white matter from stroke injury, Stroke 44 (4) (2013) 1204{1211. [57] K. FH, Brain Neurotrauma: Molecular, Neuropsychological, and Rehabilitation As- pects., 2015. [58] E. . Ozarslan, T. M. Shepherd, B. C. Vemuri, S. J. Blackband, T. H. Mareci, Resolution of complex tissue microarchitecture using the diusion orientation transform (DOT), NeuroImage 31 (2006) 1086{1103. [59] J.-D. Tournier, C.-H. Yeh, F. Calamante, K.-H. Cho, A. Connelly, C.-P. Lin, Resolving crossing bres using constrained spherical deconvolution: Validation using diusion- weighted imaging phantom data, NeuroImage 42 (2008) 617{625. 149 [60] D. Varadarajan, J. P. Haldar, MS-FRACT: Optimized linear transform methods for ODF estimation in multi-shell diusion MRI, in: Proc. IEEE Int. Symp. Biomed. Imag., 2015, pp. 1172{1175. [61] D. Varadarajan, J. P. Haldar, Towards optimal linear estimation of orientation dis- tribution functions with arbitrarily sampled diusion MRI data, in: Proc. IEEE Int. Symp. Biomed. Imag., 2018, pp. 743{746. [62] E. Garyfallidis, M.-A. C^ ot e, F. Rheault, J. Sidhu, J. Hau, L. Petit, D. Fortin, S. Cu- nanne, M. Descoteaux, Recognition of white matter bundles using local and global streamline-based registration and clustering, NeuroImage 170 (2018) 283 { 295. [63] H. Assal, D. Antonio, I. Beatriz, E. S. Matthew, D. Hanna, Music training and child development: a review of recent ndings from a longitudinal study, Annals of the New York Academy of Sciences 1423 (1) 73{81. [64] A. Ghosh, R. Deriche, A survey of current trends in diusion MRI for structural brain connectivity, J. Neural Eng. 13 (2016) 011001. [65] A review of diusion tensor imaging studies in schizophrenia, Journal of Psychiatric Research 41 (1) (2007) 15 { 30. [66] T. Moritani, A. Hiwatashi, D. A. Shrier, H. Z. Wang, Y. Numaguchi, P.-L. A. West- esson, Cns vasculitis and vasculopathy: Ecacy and usefulness of diusion-weighted echoplanar mr imaging, Clinical Imaging 28 (4) (2004) 261 { 270. [67] S. Choi, A. M. Bush, M. T. Borzage, A. A. Joshi, W. J. Mack, T. D. Coates, R. M. Leahy, J. C. Wood, Hemoglobin and mean platelet volume predicts diuse t1-mri white matter volume decrease in sickle cell disease patients, NeuroImage: Clinical 15 (2017) 239 { 246. 150 [68] T. R. Brown, B. M. Kincaid, K. Ugurbil, NMR chemical shift imaging in three dimen- sions, Proc. Natl. Acad. Sci. USA 79 (1982) 3523{3526. [69] J. K arger, W. Heink, The propagator representation of molecular transport in micro- porous crystallites, Journal of Magnetic Resonance 51 (1983) 1{7. [70] E. A., Uber die von der molekularkinetischen theorie der w arme geforderte bewegung von in ruhenden ussigkeiten suspendierten teilchen, Annalen der Physik 322 (8) 549{ 560. [71] D. S. Tuch, Diusion MRI of complex tissue structure, Ph.D. thesis, Massachusetts Institute of Technology (2002). [72] Y. Assaf, T. Blumenfeld-Katzir, Y. Yovel, P. J. Basser, AxCaliber: A method for measuring axon diameter distribution from diusion MRI, Magn. Reson. Med. 59 (6) (2008) 1347{1354. [73] H.-E. Assemlal, D. Tschumperl e, L. Brun, K. Siddiqi, Recent advances in diusion MRI modeling: Angular and radial reconstruction, Med. Image Anal. 15 (2011) 369{396. [74] T. Hosey, G. Williams, R. Ansorge, Inference of multiple ber orientations in high angular resolution diusion imaging, Magn. Reson. Med. 54 (2005) 1480{1489. [75] T. L. Chenevert, J. A. Brunberg, J. G. Pipe, Anisotropic diusion in human white matter: demonstration with mr techniques in vivo., Radiology 177 (2) (1990) 401{405. [76] M. DORAN, J. HAJNAL, N. VANBRUGGEN, M. KING, I. YOUNG, G. BYDDER, Normal and abnormal white matter tracts shown by mr imaging using directional diusion weighted sequences, Journal of Computer Assisted Tomography 14 (6) (1990) 865{873. 151 [77] A. Trist an-Vega, C.-F. Westin, S. Aja-Fern andez, Estimation of ber orientation prob- ability density functions in high angular resolution diusion imaging, NeuroImage 47 (2009) 638{650. [78] A. Barnett, Theory of q-ball imaging redux: Implications for ber tracking, Magn. Reson. Med. 62 (2009) 910{923. [79] J. M. Edgar, I. R. Griths, Chapter 5 - white matter structure: A microscopist's view, in: Diusion MRI, 2009, pp. 74 { 103. [80] J. H. Jensen, J. A. Helpern, MRI quantication of non-Gaussian water diusion by kurtosis analysis, NMR Biomed. 23 (2010) 698{710. [81] P. P. Mitra, L. L. Latour, R. L. Kleinberg, C. H. Sotak, Pulsed-eld-gradient NMR measurements of restricted diusion and the return-to-the-origin probability, J. Magn. Reson. A 114 (1995) 47{58. [82] S. Karmacharya, B. Gagoski, L. Ning, R. Vyas, H. Cheng, J. Soul, J. Newberger, M. Shenton, Y. Rathi, P. Grant, Advanced diusion imaging for assessing normal white matter development in neonates and characterizing aberrant development in congenital heart disease, NeuroImage: Clinical 19 (2018) 360{373. [83] T. Paus, Growth of white matter in the adolescent brain: Myelin or axon?, Brain and Cognition 72 (1) (2010) 26 { 35. [84] J. Scholz, M. C. Klein, T. E. J. Behrens, H. Johansen-Berg, Training induces changes in white-matter architecture, Nature neuroscience 12 (11). [85] S. Love, Demyelinating diseases, Journal of Clinical Pathology 59 (11) (2006) 1151{ 1159. 152 [86] D. KL, S. DG, F. JI, et al, White matter changes in schizophrenia: Evidence for myelin-related dysfunction, Archives of General Psychiatry 60 (5) (2003) 443{456. [87] N. Evangelou, D. Konz, M. M. Esiri, S. Smith, J. Palace, P. M. Matthews, Regional axonal loss in the corpus callosum correlates with cerebral white matter lesion volume and distribution in multiple sclerosis, Brain 123 (9) (2000) 1845{1849. [88] E. L. Hahn, Spin echoes, Phys. Rev. 80 (1950) 580{594. [89] H. Y. Carr, E. M. Purcell, Eects of diusion on free precession in nuclear magnetic resonance experiments, Phys. Rev. 94 (1954) 630{638. [90] M. Bertero, P. Boccacci, Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, London, 1998. [91] P. C. Hansen, Discrete Inverse Problems: Insight and Algorithms, SIAM, Philadelphia, 2010. [92] M. H. Khachaturian, J. J. Wisco, D. S. Tuch, Boosting the sampling eciency of q-ball imaging using multiple wavevector fusion, Magn. Reson. Med. 57 (2007) 289{296. [93] D. Varadarajan, J. P. Haldar, A theoretical signal processing framework for linear dif- fusion MRI: Implications for parameter estimation and experiment design, NeuroImage 161 (2017) 206 { 218. [94] D. Le Bihan, H. Johansen-Berg, Diusion MRI at 25: exploring brain tissue structure and function, NeuroImage 61 (2012) 324{341. [95] J.-D. Tournier, S. Mori, A. Leemans, Diusion tensor imaging and beyond, Magn. Reson. Med. 65 (2011) 1532{1556. [96] N. Shemesh, S. N. Jespersen, D. C. Alexander, Y. Cohen, I. Drobnjak, T. B. Dyrby, J. Finsterbusch, M. A. Koch, T. Kuder, F. Laun, M. Lawrenz, H. Lundell, P. P. Mitra, 153 M. Nilsson, E. . Ozarslan, D. Topgaard, C.-F. Westin, Conventions and nomenclature for double diusion encoding NMR and MRI, Magn. Reson. Med. 75 (2016) 82{87. [97] D. S. Tuch, T. G. Reese, M. R. Wiegell, V. J. Wedeen, Diusion MRI of complex neural architecture, Neuron 40 (2003) 885{895. [98] J. P. Haldar, R. M. Leahy, The equivalence of linear spherical deconvolution and model- free linear transform methods for diusion MRI, in: Proc. IEEE Int. Symp. Biomed. Imag., 2013, pp. 504{507. [99] L. M. Lacerda, J. I. Sperl, M. I. Menzel, T. Sprenger, G. J. Barker, F. Dell'Acqua, Dif- fusion in realistic biophysical systems can lead to aliasing eects in diusion spectrum imaging, Magn. Reson. Med. (Early View)doi:10.1002/mrm.26080. [100] Q. Tian, A. Rokem, R. D. Folkerth, A. Nummenmaa, Q. Fan, B. L. Edlow, J. A. McNab, Q-space truncation and sampling in diusion spectrum imaging, Magn. Reson. Med. (Early View)doi:10.1002/mrm.26071. [101] D. Varadarajan, J. P. Haldar, A theoretical framework for sampling and reconstrucing ensemble average propagators in diusion MRI, in: Proc. Int. Soc. Magn. Reson. Med., 2016, p. 2049. [102] C. G. Koay, L.-C. Chang, J. D. Carew, C. Pierpaoli, P. J. Basser, A unifying theoretical and algorithmic framework for least squares methods of estimation in diusion tensor imaging, J. Magn. Reson. 182 (2006) 115{125. [103] Y. Wang, Q. Wang, J. P. Haldar, F.-C. Yeh, M. Xie, P. Sun, T.-W. Tu, K. Trinkaus, R. S. Klein, A. H. Cross, S.-K. Song, Quantication of increased cellularity during in ammatory demyelination, Brain 134 (2011) 3587{3598. 154 [104] M. I. Menzel, E. T. Tan, K. Khare, J. I. Sperl, K. F. King, X. Tao, C. J. Hardy, L. Marinelli, Accelerated diusion spectrum imaging in the human brain using com- pressed sensing, Magn. Reson. Med. 66 (2011) 1226{1233. [105] B. Bilgic, K. Setsompop, J. Cohen-Adad, A. Yendiki, L. L. Wald, E. Adelsteinsson, Accelerated diusion spectrum imaging with compressed sensing using adaptive dic- tionaries, Magn. Reson. Med. 68 (2012) 1747{1754. [106] Y. Rathi, O. Michailovich, F. Laun, K. Setsompop, P. Grant, C.-F. Westin, Multi- shell diusion signal recovery from sparse measurements, Med. Image Anal. 18 (2014) 1143{1156. [107] L. Ning, F. Laun, Y. Gur, E. V. R. DiBella, S. Deslauriers-Gauthier, T. Megherbi, A. Ghosh, M. Zucchelli, G. Menegaz, R. Fick, S. St-Jean, M. Paquette, R. Aranda, M. Descoteaux, R. Deriche, L. O'Donnell, Y. Rathi, Sparse reconstruction challenge for diusion MRI: Validation on a physical phantom to determine which acquisition scheme and analysis method to use?, Med. Image Anal. 26 (2015) 316{331. [108] C. P. Hess, P. Mukherjee, E. T. Han, D. Xu, D. B. Vigneron, Q-ball reconstruction of multimodal ber orientations using the spherical harmonic basis, Magn. Reson. Med. 56 (2006) 104{117. [109] M. Descoteaux, E. Angelino, S. Fitzgibbons, R. Deriche, Regularized, fast, and robust analytical Q-ball imaging, Magn. Reson. Med. 58 (2007) 497{510. [110] A. W. Anderson, Measurement of ber orientation distributions using high angular resolution diusion imaging, Magn. Reson. Med. 54 (2005) 1194{1206. [111] M. Descoteaux, R. Deriche, T. R. K. Osche, A. Anwander, Deterministic and prob- abilistic tractography based on complex bre orientation distributions, IEEE Trans. Med. Imag. 28 (2009) 269{286. 155 [112] D. G. Cory, A. N. Garroway, Measurement of translational displacement probabilities by NMR: An indicator of compartmentation, Magn. Reson. Med. 14 (1990) 435{444. [113] Y. Cohen, Y. Assaf, High b-value q-space analyzed diusion-weighted MRS and MRI in neuronal tissues { a technical review, NMR Biomed. 15 (2002) 516{542. [114] Y.-C. Wu, A. L. Alexander, Hybrid diusion imaging, NeuroImage 36 (2007) 617{629. [115] Y.-C. Wu, A. S. Field, A. L. Alexander, Computation of diusion function measures in q-space using magnetic resonance hybrid diusion imaging, IEEE Trans. Med. Imag. 27 (2008) 858{865. [116] M. Paquette, G. Gilbert, M. Descoteaux, Optimal DSI reconstruction parameter rec- ommendations: Better ODFs and better connectivity, NeuroImage 142 (2016) 1{13. [117] M. D. H urlimann, L. M. Schwartz, P. N. Sen, Probability of return to the origin at short times: A probe of microstructure in porous media, Phys. Rev. B 51 (1995) 14936{14940. [118] R. N. Bracewell, The Fourier Transform and its Applications, McGraw-Hill, Boston, 2000. [119] Z.-P. Liang, P. C. Lauterbur, Principles of Magnetic Resonance Imaging: A Signal Processing Perspective, IEEE Press, New York, 2000. [120] T. H. Mareci, H. R. Brooker, High-resolution magnetic resonance spectra from a sen- sitive region dened with pulsed eld gradients, J. Magn. Reson. 57 (1984) 157{163. [121] T. H. Mareci, H. R. Brooker, Essential considerations for spectral localization using indirect gradient encoding of spatial information, J. Magn. Reson. 92 (1991) 229{246. [122] U. Dydak, M. Weiger, K. P. Pruessmann, D. Meier, P. Boesiger, Sensitivity-encoded spectroscopic imaging, Magnetic Resonance in Medicine 46 (4) (2001) 713{722. 156 [123] A. Greiser, M. von Kienlin, Ecient k-space sampling by density-weighted phase- encoding, Magn. Reson. Med. 50 (2003) 1266{1275. [124] O. Dietrich, J. G. Raya, S. B. Reeder, M. Ingrisch, M. F. Reiser, S. O. Schoenberg, In uence of multichannel combination, parallel imaging and other reconstruction tech- niques on MRI noise characteristics, Magn. Reson. Imag. 26 (2008) 754{762. [125] D. W. Shattuck, R. M. Leahy, BrainSuite: An automated cortical surface identication tool, Med. Image Anal. 6 (2002) 129{142. [126] E. Caruyer, C. Lenglet, G. Sapiro, R. Deriche, Design of multishell sampling schemes with uniform coverage in diusion MRI, Magn. Reson. Med. 69 (2013) 1534{1540. [127] F. J. Harris, On the use of windows for harmonic analysis with the discrete fourier transform, Proceedings of the IEEE 66 (1) (1978) 51{83. [128] S. H. Baete, S. Yutzy, F. E. Boada, Radial q-space sampling for DSI, Magn. Reson. Med. 76 (2016) 769{780. [129] L.-W. Kuo, J.-H. Chen, V. J. Wedeen, W.-Y. I. Tseng, Optimization of diusion spectrum imaging and q-ball imaging on clinical MRI system, NeuroImage 41 (2008) 7{18. [130] L.-W. Kuo, W.-Y. Chiang, F.-C. Yeh, V. J. Wedeen, W.-Y. I. Tseng, Diusion spec- trum MRI using body-centered-cubic and half-sphere sampling schemes, J. Neurosci. Methods 212 (2013) 143{155. [131] J. Qi, R. M. Leahy, Resolution and noise properties of MAP reconstruction for fully 3-D PET, IEEE Trans. Med. Imag. 19 (2000) 493{506. [132] J. W. Stayman, J. A. Fessler, Regularization for uniform spatial resolution properties in penalized-likelihood image reconstruction, IEEE Trans. Med. Imag. 19 (2000) 601{615. 157 [133] J. H. Cho, J. A. Fessler, Regularization designs for uniform spatial resolution and noise properties in statistical image reconstruction for 3D X-ray CT, IEEE Trans. Med. Imag. 34 (2015) 678{689. [134] A. L. Alexander, K. M. Hasan, M. Lazar, J. S. Tsuruda, D. L. Parker, Analysis of partial volume eects in diusion-tensor MRI, Magn. Reson. Med. 45 (2001) 770{780. [135] R. V. Mulkern, S. J. Haker, S. E. Maier, On high b diusion imaging in the human brain: ruminations and experimental insights, Magn. Reson. Imag. 27 (2009) 1151{ 1162. [136] D. A. Yablonskiy, A. L. Sukstanskii, Theoretical models of the diusion weighted MR signal, NMR Biomed. 23 (2010) 667{681. [137] D. S. Novikov, V. G. Kiselev, S. N. Jespersen, On modeling, Magn. Reson. Med. 79 (2018) 3172{3193. [138] D. Varadarajan, J. P. Haldar, ERFO: Improved ODF estimation by combining machine learning with linear estimation theory, in: Proc. Int. Soc. Magn. Reson. Med., 2018, p. 1557. [139] V. Golkov, A. Dosovitskiy, J. I. Sperl, M. I. Menzel, M. Czisch, P. S amann, T. Brox, D. Cremers, q-space deep learning: Twelve-fold shorter and model-free diusion mri scans, IEEE Transactions on Medical Imaging 35 (5) (2016) 1344{1351. [140] S. Koppers, D. Merhof, Direct estimation of ber orientations using deep learning in diusion imaging, in: Machine Learning in Medical Imaging, 2016, pp. 53{60. [141] C. Ye, Learning-based ensemble average propagator estimation, in: Proc. MICCAI, 2017, pp. 593{601. 158 [142] C. Ye, Tissue microstructure estimation using a deep network inspired by a dictionary- based framework, Med. Image Anal. 42 (2017) 288{299. [143] C. Ye, J. L. Prince, Fiber orientation estimation guided by a deep network, in: MIC- CAI, 2017, pp. 575{583. [144] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 359{366. [145] A. Nguyen, J. Yosinski, J. Clune, Deep neural networks are easily fooled: High con- dence predictions for unrecognizable images, in: IEEE Conf. Comp. Vision Patt. Recog., 2015, pp. 427{436. [146] R. Shwartz-Ziv, N. Tishby, Opening the black box of deep neural networks via infor- mation, PreprintArXiv: 1703.00810. [147] C. Eichner, S. F. Cauley, J. Cohen-Adad, H. E. Moeller, R. Turner, K. Setsompop, L. L. Wald, Real diusion-weighted MRI enabling true signal averaging and increased diusion contrast, NeuroImage 122 (2015) 373{384. [148] C. G. Koay, E. . Ozarslan, P. J. Basser, A signal transformational framework for breaking the noise oor and its applications in MRI, J. Magn. Reson. 197 (2009) 108{ 119. [149] G. H. Golub, M. Heath, G. Wahba, Generalized cross-validation as a method for choosing a good ridge parameter, Technometrics 21 (1979) 215{223. [150] G. Golub, V. Pereyra, Separable nonlinear least squares: the variable projection method and its applications, Inverse Probl. 19 (2003) R1{R26. [151] Q. Fan, T. Witzel, A. Nummenmaa, K. R. V. Dijk, J. D. V. Horn, M. K. Drews, L. H. Somerville, M. A. Sheridan, R. M. Santillana, J. Snyder, T. Hedden, E. E. Shaw, M. O. 159 Hollinshead, V. Renvall, R. Zanzonico, B. Keil, S. Cauley, J. R. Polimeni, D. Tisdall, R. L. Buckner, V. J. Wedeen, L. L. Wald, A. W. Toga, B. R. Rosen, Mgh-usc human connectome project datasets with ultra-high b-value diusion mri, NeuroImage 124 (2016) 1108 { 1114. [152] P. Callaghan, Pulsed-gradient spin-echo nmr for planar, cylindrical, and spherical pores under conditions of wall relaxation, Journal of Magnetic Resonance, Series A 113 (1) (1995) 53 { 59. [153] K. M. Jansons, D. C. Alexander, Persistent angular structure: New insights from dif- fusion mri data. dummy version, in: Information Processing in Medical Imaging: 18th International Conference, IPMI 2003, Ambleside, UK, July 20-25, 2003. Proceedings, 2003, pp. 672{683. [154] C. G. Koay, A simple scheme for generating nearly uniform distribution of antipodally symmetric points on the unit sphere, Journal of computational science 2 (2011) 377{81. [155] N. G. Papadakis, D. Xing, C. L.-H. Huang, L. D. Hall, T. Carpenter, A comparative study of acquisition schemes for diusion tensor imaging using mri, Journal of Magnetic Resonance 137 (1) (1999) 67 { 82. [156] D. Varadarajan, J. P. Haldar, A quadratic majorize-minimize framework for statistical estimation with noisy Rician- and noncentral chi-distributed MR images, in: Proc. IEEE Int. Symp. Biomed. Imag., 2013, pp. 708{711. [157] J. A. Fessler, Model-based image reconstruction for MRI, IEEE Signal Process. Mag. 27 (2010) 81{89. [158] K. P. Pruessmann, M. Weiger, M. B. Scheidegger, P. Boesiger, SENSE: Sensitivity encoding for fast MRI, Magn. Reson. Med. 42 (1999) 952{962. 160 [159] M. Lustig, D. Donoho, J. M. Pauly, Sparse MRI: The application of compressed sensing for rapid MR imaging, Magn. Reson. Med. 58 (2007) 1182{1195. [160] J. P. Haldar, Low-rank modeling of local k-space neighborhoods (LORAKS) for con- strained MRI, IEEE Trans. Med. Imag. 33 (2014) 668{681. [161] O. Michailovich, Y. Rathi, S. Dolui, Spatially regularized compressed sensing for high angular resolution diusion imaging, IEEE Trans. Med. Imag. 30 (2011) 1100{1115. [162] B. A. Landman, J. A. Bogovic, H. Wan, F. E. Z. ElShahaby, P.-L. Bazin, J. L. Prince, Resolution of crossing bers with constrained compressed sensing using diusion tensor MRI, NeuroImage 59 (2012) 2175{2186. [163] A. Gramfort, C. Poupon, M. Descoteaux, Denoising and fast diusion imaging with physically constrained sparse dictionary learning, Med. Image Anal. (2013) 36{49. [164] M. Paquette, S. Merlet, G. Gilbert, R. Deriche, M. Descoteaux, Comparison of sam- pling strategies and sparsifying transforms to improve compressed sensing diusion spectrum imaging, Magn. Reson. Med.In Press. DOI: 10.1002/mrm.25093. [165] R. D. Nowak, Wavelet-based Rician noise removal for magnetic resonance imaging, IEEE Trans. Image Process. 8 (1999) 1408{1419. [166] J. Sijbers, A. J. den Dekker, A. Van der Linden, M. Verhoye, D. Van Dyck, Adaptive anisotropic noise ltering for magnitude MR data, Magn. Reson. Imag. 17 (1999) 1533{1539. [167] A. Pi zurica, W. Philips, I. Lemahieu, M. Acheroy, A versatile wavelet domain noise ltration technique for medical imaging, IEEE Trans. Med. Imag. 22 (2003) 323{331. [168] L. He, I. R. Greenshields, A nonlocal maximum likelihood estimation method for Rician noise reduction in MR images, IEEE Trans. Med. Imag. 28 (2009) 165{172. 161 [169] J. V. Manj on, P. Coup e, L. Mart -Bonmat , D. L. Collins, M. Robles, Adaptive non- local means denoising of MR images with spatially varying noise levels, J. Magn. Reson. Imag. 31 (2010) 192{203. [170] J. Rajan, B. Jeurissen, M. Verhoye, J. V. Audekerke, J. Sijbers, Maximum likelihood estimation-based denoising of magnetic resonance images using restricted local neigh- borhoods, Phys. Med. Biol. 56 (2011) 5221. [171] J. Rajan, J. Veraart, J. V. Audekerke, M. Verhoye, J. Sijbers, Nonlocal maximum likelihood estimation method for denoising multiple-coil magnetic resonance images, Magn. Reson. Imag. 30 (2012) 1512{1518. [172] J. P. Haldar, V. J. Wedeen, M. Nezamzadeh, G. Dai, M. W. Weiner, N. Schu, Z.-P. Liang, Improved diusion imaging through SNR-enhancing joint reconstruction, Magn. Reson. Med. 69 (2013) 277{289. [173] S. Aja-Fern andez, V. Brion, A. Trist an-Vega, Eective noise estimation and ltering from correlated multiple-coil MR data, Magn. Reson. Imag. 31 (2013) 272{285. [174] V. Brion, C. Poupon, O. Ri, S. Aja-Fern andez, A. Trist an-Vega, J.-F. Mangin, D. LeBihan, F. Poupon, Noise correction for HARDI and HYDI data obtained with multi-channel coils and sum of squares reconstruction: An anisotropic extension of the LMMSE, Magn. Reson. Imag. 31 (2013) 1360{1371. [175] T. K. Leipert, D. W. Marquardt, Statistical analysis of NMR spin-lattice relaxation times, J. Magn. Reson. 24 (1976) 181{199. [176] J. Sijbers, A. J. den Dekker, E. Raman, D. Van Dyck, Parameter estimation from magnitude MR images, Int. J. Imag. Syst. Tech. 10 (1999) 109{114. 162 [177] R. A. Clarke, P. Scifo, G. Rizzo, F. Dell'Acqua, G. Scotti, F. Fazio, Noise correction on Rician distributed data for bre orientation estimators, IEEE Trans. Med. Imag. 27 (2008) 1242{1251. [178] J. Noh, V. Solo, Rician distributed FMRI: Asymptotic power analysis and Cram er-Rao lower bounds, IEEE Trans. Signal Process. 59 (2011) 1322{1328. [179] A. Kristoersen, Estimating non-Gaussian diusion model parameters in the presence of physiological noise and Rician signal bias, J. Magn. Reson. Imag. 35 (2012) 181{189. [180] M. Bouhrara, D. A. Reiter, H. Celik, J.-M. Bonny, V. Lukas, K. W. Fishbein, R. G. Spencer, Incorporation of Rician noise in the analysis of biexponential transverse re- laxation in cartilage using a multiple gradient echo sequence at 3 and 7 Tesla, Magn. Reson. Med.In Press. DOI: 10.1002/mrm.25111. [181] R. Bai, C. G. Koay, E. Hutchinson, P. J. Basser, A framework for accurate determina- tion of theT 2 distribution from multiple echo magnitude MRI images, J. Magn. Reson. 244 (2014) 53{63. [182] B. P. Sutton, D. C. Noll, J. A. Fessler, Fast, iterative image reconstruction for MRI in the presence of eld inhomogeneities, IEEE Trans. Med. Imag. 22 (2003) 178{188. [183] B. J. Wilm, C. Barmet, M. Pavan, K. P. Pruessmann, Higher order reconstruction for MRI in the presence of spatiotemporal eld perturbations, Magn. Reson. Med. 65 (2011) 1690{1701. [184] D. Hernando, D. C. Karampinos, K. F. King., J. P. Haldar, S. Majumdar, J. G. Georgiadis, Z.-P. Liang, Removal of olenic fat chemical shift artifact in diusion MRI, Magn. Reson. Med. 65 (2011) 692{701. 163 [185] C. Bhushan, A. A. Joshi, R. M. Leahy., J. P. Haldar, Improved B 0 -distortion correction in diusion MRI using interlaced q-space sampling and constrained reconstruction, Magn. Reson. Med.In Press. DOI: 10.1002/mrm.25026. [186] R. M. Henkelman, Measurement of signal intensities in the presence of noise in MR images, Med. Phys. 12 (1985) 232{233. [187] E. R. McVeigh, R. M. Henkelman, M. J. Bronskill, Noise and ltration in magnetic resonance imaging, Med. Phys. 12 (1985) 586{591. [188] C. D. Constantinides, E. Atalar, E. R. McVeigh, Signal-to-noise measurements in mag- nitude images from NMR phased arrays, Magn. Reson. Med. 38 (1997) 852{857. [189] S. O. Rice, Mathematical analysis of random noise, Bell Syst. Technol. J. 23 (1944) 282{332. [190] G. McGibney, M. R. Smith, An unbiased signal-to-noise ratio measure for magnetic resonance images, Med. Phys. 20 (1993) 1077{1078. [191] A. J. Miller, P. M. Joseph, The use of power images to perform quantitative analysis on low SNR MR images, Magn. Reson. Imag. 11 (1993) 1051{1056. [192] D. R. Hunter, K. Lange, A tutorial on MM algorithms, Am. Stat. 58 (2004) 30{37. [193] D. Varadarajan, J. P. Haldar, A novel approach for statistical estimation of HARDI diusion parameters from Rician and non-central chi magnitude images, in: Proc. Int. Soc. Magn. Reson. Med., 2014, p. 801. [194] V. Solo, J. Noh, An EM algorithm for Rician fMRI activation detection, in: Proc. IEEE Int. Symp. Biomed. Imag., 2007, pp. 464{467. 164 [195] H. Zhu, Y. Li, J. G. Ibrahim, X. Shi, H. An, Y. Chen, W. Gao, W. Lin, D. B. Rowe, B. S. Peterson, Regression models for identifying noise sources in magnetic resonance images, Journal of the American Statistical Association 104 (2009) 623{637. [196] M. W. Jacobsen, J. A. Fessler, An expanded theoretical treatment of iteration- dependent majorize-minimize algorithms, IEEE Trans. Image Process. 16 (2007) 2411{ 2422. [197] A. L. Yuille, A. Rangarajan, The concave-convex procedure, Neural Comput. 15 (2003) 915{936. [198] L. I. Rudin, S. Osher, E. Fatemi, Nonlinear total variation based noise removal algo- rithms, Physica D 60 (1992) 259{268. [199] A. Buades, B. Coll, J. M. Morel, Image denoising methods. a new nonlocal principle, SIAM Rev. 52 (2010) 113{147. [200] S. V. Venkatakrishnan, C. A. Bouman, B. Wohlberg, Plug-and-play priors for model based reconstruction, in: Proc. IEEE GlobalSIP, 2013, pp. 945{948. [201] Z. Yang, M. Jacob, Nonlocal regularization of inverse problems: A unied variational framework, IEEE Trans. Image Process. 22 (2013) 3192{3203. [202] A. Chambolle, T. Pock, A rst-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vis. 40 (2011) 120{145. [203] Z. Wang, A. C. Bovik, Mean squared error: Love it or leave it? a new look at signal delity measures, IEEE Signal Process. Mag. 26 (2009) 98{117. [204] C. Bouman, K. Sauer, A generalized Gaussian image model for edge-preserving MAP estimation, IEEE Trans. Image Process. 2 (1993) 296{310. 165 [205] H.-E. Assemlal, D. Tschumperl e, L. Brun, Fiber tracking on HARDI data using robust ODF elds, in: Proc. IEEE Int. Conf. Image Process., 2007, pp. 133{136. [206] S. Aja-Fern andez, A. Trist an-Vega, C. Alberola-L opez, Noise estimation in single-and multiple-coil magnetic resonance data based on statistical models, Magn. Reson. Imag. 27 (2009) 1397{1409. [207] S. Aja-Fern andez, A. Trist an-Vega, W. S. Hoge, Statistical noise analysis in GRAPPA using a parametrized noncentral chi approximation model, Magn. Reson. Med. 65 (2011) 1195{1206. [208] S. Aja-Fern andez, A. Trist an-Vega, In uence of noise correlation in multiple-coil statis- tical models with sum of squares reconstruction, Magn. Reson. Med. 67 (2012) 580{585. [209] V. Brion, C. Poupon, O. Ri, S. Aja-Fernandez, A. Tristan-Vega, J. F. Mangin, D. Le Bihan, F. Poupon, Parallel MRI noise correction: An extension of the LMMSE to non central chi distributions, in: Proc. MICCAI, 2011, pp. 226{233. [210] M. A. Griswold, P. M. Jakob, R. M. Heidemann, M. Nittka, V. Jellus, J. Wang, B. Kiefer, A. Haase, Generalized autocalibrating partially parallel acquisitions (GRAPPA), Magn. Reson. Med. 47 (2002) 1202{1210. [211] D. O. Walsh, A. F. Gmitro, M. W. Marcellin, Adaptive reconstruction of phased array MR imagery, Magn. Reson. Med. 43 (2000) 682{690. [212] P. M. Robson, A. K. Grant, A. J. Madhuranthakam, R. Lattanzi, D. K. Sodickson, C. A. McKenzie, Comprehensive quantication of signal-to-noise ratio and g-factor for image-based and k-space-based parallel imaging reconstructions, Magn. Reson. Med. 60 (2008) 895{907. [213] B. P. Sutton, D. C. Noll, J. A. Fessler, Dynamic eld map estimation using a spiral- in/spiral-out acquisition, Magn. Reson. Med. 51 (2004) 1194{1204. 166 [214] L. Ying, J. Sheng, Joint image reconstruction and sensitivity estimation in SENSE (JSENSE), Magn. Reson. Med. 57 (2007) 1196{1202. [215] S. Tao, J. D. Trzasko, Y. Shu, J. Huston, M. A. Bernstein, Integrated image recon- struction and gradient nonlinearity correction, Magn. Reson. Med.In Press. [216] T. M. Cover, J. A. Thomas, Elements of Information Theory, John Wiley & Sons, Inc., New York, 1991. [217] J. P. Haldar, Z.-P. Liang, Joint reconstruction of noisy high-resolution MR image sequences, in: Proc. IEEE Int. Symp. Biomed. Imag., 2008, pp. 752{755. [218] F. Lam, D. Liu, Z. Song, M. W. Weiner, N. Schu, Z.-P. Liang, A fast algorithm for rank and edge constrained denoising of magnitude diusion-weighted images, in: Proc. Int. Soc. Magn. Reson. Med., 2014, p. 401. [219] P. Sajda, S. Du, T. R. Brown, R. Stoyanova, D. C. Shungu, X. Mao, L. C. Parra, Nonnegative matrix factorization for rapid recovery of constituent spectra in magnetic resonance chemical shift imaging of the brain, IEEE Trans. Med. Imag. 23 (2004) 1453{1465. [220] F. Luisier, T. Blu, P. J. Wolfe, A CURE for noisy magnetic resonance images: chi- square unbiased risk estimation, IEEE Trans. Image Process. 21 (2012) 3454{3466. 167
Abstract (if available)
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
PDF
Methods for improving reliability and consistency in diffusion MRI analysis
PDF
Correction, coregistration and connectivity analysis of multi-contrast brain MRI
PDF
Optimization methods and algorithms for constrained magnetic resonance imaging
PDF
Theory and simulation of diffusion magnetic resonance imaging on brain's white matter
PDF
Diffusion MRI of the human brain: signal modeling and quantitative analysis
PDF
New theory and methods for accelerated MRI reconstruction
PDF
Novel computational techniques for connectome analysis based on diffusion MRI
PDF
Applications of graph theory to brain connectivity analysis
PDF
Shift-invariant autoregressive reconstruction for MRI
PDF
Performance prediction, state estimation and production optimization of a landfill
PDF
3D vessel mapping techniques for retina and brain as an early imaging biomarker for small vessel diseases
PDF
Novel optimization tools for structured signals recovery: channels estimation and compressible signal recovery
PDF
Improving sensitivity and spatial coverage of myocardial arterial spin labeling
PDF
On the theory and applications of structured bilinear inverse problems to sparse blind deconvolution, active target localization, and delay-Doppler estimation
PDF
A logic partitioning framework and implementation optimizations for 3-dimensional integrated circuits
PDF
Theoretical foundations for modeling, analysis and optimization of cyber-physical-human systems
PDF
Multidimensional characterization of propagation channels for next-generation wireless and localization systems
PDF
Theoretical and computational foundations for cyber‐physical systems design
PDF
Design and characterization of flow batteries for large-scale energy storage
Asset Metadata
Creator
Varadarajan, Divya
(author)
Core Title
Novel theoretical characterization and optimization of experimental efficiency for diffusion MRI
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Electrical Engineering
Publication Date
10/08/2018
Defense Date
10/04/2017
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
diffusion MRI,experiment design,linear theory,maximum likelihood estimation,MRI,non-central chi noise,OAI-PMH Harvest,optimization,Q-space,Rician noise
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Haldar, Justin P. (
committee chair
), Damasio, Hanna (
committee member
), Leahy, Richard M. (
committee member
)
Creator Email
divya.varadarajan1@gmail.com,dvaradar@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-74896
Unique identifier
UC11670709
Identifier
etd-Varadaraja-6791.pdf (filename),usctheses-c89-74896 (legacy record id)
Legacy Identifier
etd-Varadaraja-6791.pdf
Dmrecord
74896
Document Type
Dissertation
Format
application/pdf (imt)
Rights
Varadarajan, Divya
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
diffusion MRI
experiment design
linear theory
maximum likelihood estimation
MRI
non-central chi noise
optimization
Q-space
Rician noise