Probabilistic Medical Image Imputation Via Wasserstein and Conditional
Deep Adversarial Learning
by
Ragheb Raad
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MECHANICAL ENGINEERING)
December 2023
Copyright 2023 Ragheb Raad
Acknowledgements
Special thanks to Dr. Assad Oberai and Dr. Vinay Duddalwar for supervising this work throughout these 4+ years. Special thanks also to Dr. Deep Ray and Dr. Dhruv Patel for their mentoring, and to Dr. Darryl Hwang and Dr. Bino Varghese for their help with consultations. Special thanks to my labmates at the Computation and Data Driven Discovery Group, and to USC and Viterbi for giving me the opportunity to spend these wonderful years in Los Angeles, arguably the best city in the world. Special thanks to my sister Petra, my mom Zannoubia, and my dad Hassan for all the support they gave me throughout my life and career. The support of ARO grant W911NF2010050 and the Ming-Hsieh Institute at USC is also acknowledged. And most importantly, special thanks to the creator, the prophet, and the prophet's family.
Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
1.1 Introduction To Generative Algorithms in Medical Imaging
1.2 Introduction To Medical Image Imputation
Chapter 2: Methods and Literature Review
2.1 The WGAN as Prior
2.2 The CGAN
2.3 Literature Review
Chapter 3: The WGAN as Prior Approach
3.1 The Gradient Penalty
3.2 Bayesian Inference
3.3 Convergence of MCMC Algorithm
3.4 Enhancements to the WGAN and Style Loss
3.5 Architecture and Hyper-parameters of WGAN
Chapter 4: WGAN as Prior Results and Discussion
4.1 CECT Image Data
4.2 Results of Image Imputation
4.3 Validation of the Texture of the Imputed Images
4.4 Utility of Estimating Standard Deviation
Chapter 5: The CGAN Approach
5.1 Architecture and Training of the CGAN
5.2 Specific Choices for the Architecture
5.3 Generator and Critic Blocks
5.4 PIX2PIX: A Non-stochastic Algorithm
5.5 Other Generative Models
5.6 Patient Data
Chapter 6: The CGAN Results and Discussion
6.1 Standard Deviation and Uncertainty
6.2 Discussion
6.3 Applications of CGAN to Segmented Data and Comparison To First Approach
Chapter 7: Conclusions and Future Work
References
Appendices
Appendix A: Unsegmented Data Remaining Results from the CGAN Imputation
Appendix B: Segmented Data Remaining Results from the CGAN Imputation
List of Tables

1 R̂ values of Markov chains in z used to impute the images at the four time-points for Subject 1.
2 Generator architecture.
3 Discriminator architecture.
4 Hyper-parameters for the cWGAN.
5 Performance of the cGAN and PIX2PIX algorithms on test data (values averaged over 37 subjects). The metrics considered include the Structural Similarity Index Measure (SSIM), Peak Signal-to-Noise Ratio (PSNR) and Normalized Mean Square Error (NMSE).
6 Performance of different algorithms averaged over all time-points and subjects. Each entry contains the average and the standard deviation (in parentheses).
7 Performance of the two imputation algorithms averaged over all time-points and subjects. Each entry contains the average and the standard deviation (in parentheses).
List of Figures

1 Schematic diagram of algorithm.
2 True images, imputed images and their statistics for Subject 1.
3 True images, imputed images and their statistics for Subject 2.
4 True images, imputed images and their statistics for Subject 3.
5 True images, imputed images and their statistics for Subject 4.
6 Normalized difference in the entropic metrics for the true and imputed images for 4 subjects across 4 time-points.
7 Average estimated standard deviation versus the normalized L2 error for the standard and enhanced WGAN formulations. Subject 1: ◯, Subject 2: □, Subject 3: ⋆, Subject 4: ⋄. Standard GAN: red, Enhanced GAN: blue.
8 A schematic diagram of the imputation algorithm. In the above illustration, we have assumed that the corticomedullary image is missing from the sequence (j = 2). The first step involves constructing a linear-regression-based guess of the missing image through the operator R_j. This approximate sequence, and a sequence of random vectors z^(j), are used as input to the fully trained generator, G*. The generator produces an ensemble of likely complete sequences wherein each member is denoted by x^{G,(j)}. These are used to calculate the pixel-wise mean and standard deviation (SD) images. The best guess of the imputed image is extracted from the former, and the latter is used to determine the confidence in the imputation.
9 Architecture of generator and critic used in the conditional GAN.
10 Architecture of a DenseBlock.
11 Architecture of a Down block.
12 Architecture of an Up block.
13 True and imputed images for Subjects 1 and 2.
14 True and imputed images for Subjects 3 and 4.
15 True and imputed images for Subjects 5 and 6.
16 ROC curve for classifying a given imputed image as acceptable.
17 True and imputed images for Subject 1.
18 True and imputed images for Subject 2.
19 True and imputed images for Subject 3.
20 True and imputed images for Subject 4.
A1 True and imputed images for Subject 7.
A2 True and imputed images for Subject 8.
A3 True and imputed images for Subject 9.
A4 True and imputed images for Subject 10.
A5 True and imputed images for Subject 11.
A6 True and imputed images for Subject 12.
A7 True and imputed images for Subject 13.
A8 True and imputed images for Subject 14.
A9 True and imputed images for Subject 15.
A10 True and imputed images for Subject 16.
A11 True and imputed images for Subject 17.
A12 True and imputed images for Subject 18.
A13 True and imputed images for Subject 19.
A14 True and imputed images for Subject 20.
A15 True and imputed images for Subject 21.
A16 True and imputed images for Subject 22.
A17 True and imputed images for Subject 23.
A18 True and imputed images for Subject 24.
A19 True and imputed images for Subject 25.
A20 True and imputed images for Subject 26.
A21 True and imputed images for Subject 27.
A22 True and imputed images for Subject 28.
A23 True and imputed images for Subject 29.
A24 True and imputed images for Subject 30.
A25 True and imputed images for Subject 31.
A26 True and imputed images for Subject 32.
A27 True and imputed images for Subject 33.
A28 True and imputed images for Subject 34.
A29 True and imputed images for Subject 35.
A30 True and imputed images for Subject 36.
A31 True and imputed images for Subject 37.
B1 True and imputed images for Subject 5.
B2 True and imputed images for Subject 6.
B3 True and imputed images for Subject 7.
B4 True and imputed images for Subject 8.
B5 True and imputed images for Subject 9.
B6 True and imputed images for Subject 10.
B7 True and imputed images for Subject 11.
B8 True and imputed images for Subject 12.
B9 True and imputed images for Subject 13.
B10 True and imputed images for Subject 14.
B11 True and imputed images for Subject 15.
B12 True and imputed images for Subject 16.
B13 True and imputed images for Subject 17.
B14 True and imputed images for Subject 18.
B15 True and imputed images for Subject 19.
B16 True and imputed images for Subject 20.
B17 True and imputed images for Subject 21.
B18 True and imputed images for Subject 22.
B19 True and imputed images for Subject 23.
Abstract
Contrast Enhanced Computed Tomography (CECT) imaging has led to more
accurate detection of kidney cancer. CECT images are generated through
the injection of an intravenous contrast agent into the subject and then
through imaging at four distinct time-points. These are the pre-contrast, corticomedullary, nephrographic, and excretory time-points. However, in some cases
one of the four images corresponding to a given time-point may be missing. This may be due to corruption of the image, degradation caused by patient motion, or differing protocols across institutions. A missing image can reduce the utility of the whole scan, and the only recourse would be to call the patient back to the clinic to undergo the scan again. This places an undue burden on the patient, increases overall healthcare costs, and exposes the patient once again to radiation. To address this, we develop methods to predict the missing image given the images available at the other three time-points. To do this, we make use of two algorithms that can learn and sample from the probability distribution of the complete CECT sequence conditioned on the incomplete sequence. The first algorithm uses a generative adversarial network (GAN) as a prior distribution and then uses Hamiltonian Monte Carlo to probabilistically impute the missing image. The second algorithm employs a conditional GAN that can directly sample from the posterior distribution, thus probabilistically imputing the missing image. We apply these algorithms to data collected at the Keck School of Medicine at USC and quantify their effectiveness by computing quantitative measures of the difference between the true and imputed
images.
Chapter 1: Introduction
Medical image data acquired from ultrasound, X-rays (CT), MR and other
types of imaging modalities is routinely used in detecting, diagnosing and planning treatment for myriad diseases. The problem of missing data is ubiquitous
in medical imaging. Missing image data can be in the form of missing images
in a sequence of images, missing regions within a single image, or artifacts
like blurring, which degrade the image significantly. In all these cases, missing
data leads to the loss in utility of the images, and an accompanying loss in the
accuracy of detection, diagnosis, and treatment planning for a disease.
Missing or lost data can be attributed to many reasons. In some cases
patients may be initially scanned under one protocol, while the final management of the disease might require additional, or more thorough, scans.
However, this may not be feasible due to the patient’s inability to tolerate
additional scans, logistical issues such as visits to a tertiary care center, and
restrictions imposed by the insurance provider. In addition to this, the authors
in [1] refer to missing image data as the “leaky” radiological pipeline, and
lament the fact that the transition from analog to digital imaging and the
advent of standards like PACS and DICOM has not eliminated this problem.
They point to several causes for missing image data that include incompatibility between different vendors of medical imaging equipment, saturation of the
bandwidth of a device, and collateral damage during events like server errors.
In all these cases, a portion of image data is missing which renders useless the
portion that is collected, or diminishes its value.
Image imputation refers to the task of recovering the missing/corrupted
part of an image from the part that is available/not corrupted. This task has
typically relied on problem-specific computer vision algorithms. In contrast,
the advent of new deep learning algorithms has made it possible to develop
techniques that could work well across a wide range of image imputation tasks
[2, 3]. Further, by relying on statistical versions of these techniques, it is possible to not only impute the missing image data, but also provide quantitative
measures of confidence in the imputed solution. This additional information
can allow the clinician to make an informed decision about how much to trust
the imputed data when making their final decision. This principle, that is,
using novel statistical deep learning techniques to impute imaging data, and
to provide a measure of confidence in the imputation, forms the main thrust
of work described in this dissertation. While the techniques developed in this
dissertation can be applied to a wide range of image imputation tasks, we
focus on the task of imputing missing contrast enhanced CT (CECT) images
of renal tumors.
Over 90% of renal tumors are asymptomatic and are identified incidentally in 27–50% of all patients that are imaged. While all imaging modalities have their strengths and limitations, the widespread use of CECT imaging has led to the increased detection of kidney cancers. CECT images are generated by injecting an intravenous contrast agent into the subject and then imaging during four distinct time-points. These are the pre-contrast, corticomedullary, nephrographic, and excretory time-points. The four phases correspond to different stages of enhancement and therefore characterize the masses with reference to their vascularity (quantity of neovascularity) and washout (quality of the neovascularity). A complete sequence consists of images at all four time-points. In some cases, one or more of these images may be missing and
may need to be imputed. For instance, subject motion during the exam could
blur some images rendering them of little clinical value, or, under a clinical
protocol a subject with a renal tumor might undergo CECT imaging where
the pre-contrast image is not recorded. However, each image in the CECT
sequence is important and has clinical value. For example, the pre-contrast,
corticomedullary and nephrographic images are all required to evaluate intensity enhancement and washout within the tumor, kidney and other organs,
which is in turn used in evaluation and diagnosis [4]. Also, in the excretory
phase, the renal pelvis is clearly visualized and its location relative to the
tumor can be determined. This information is useful to a surgeon performing
nephrectomy to remove the tumor [5, 6].
Tumor vasculature (tumor neoangiogenesis) is characterized by disorganized branching and shunting and various degrees of leakiness. When a
radiologist assesses these, they are looking at both in relation to the adjacent
normal tissue. In addition, a qualitative diagnosis of the tumor is often based
on the combination of analyzing different tissue densities, change in density,
vascularity and washout, and its margins with adjacent normal tissue. Therefore, a conventional diagnosis on whether a tumor is benign or malignant is
based on the qualitative visual inspection of the four CECT phase images.
More recently, techniques such as radiomics or machine learning that rely on
quantitative evaluation of these images, are also being considered for this task.
Further, once a decision has been made to treat or resect the renal mass, these
images are used by the surgeon to plan the surgery. The results presented in
this present study are an initial attempt to recover missing images. In future
studies, these will be evaluated to determine whether they reveal the diagnostic
markers that radiologists are looking for.
The loss of any one or more CECT phase images due to any of the reasons
discussed above negatively impacts the management of renal masses. It leads
to less accurate diagnosis of malignant masses and adversely affects surgical planning in cases where surgical intervention is necessary. According to [7], images from at least three CECT phases are required to characterize a renal mass. We note that the problem of missing CECT images is significant. For example, from 2011 to 2017 the Keck School of Medicine at USC curated CECT images from 735 patients. However, out of these, images for all four phases are available for only 453 patients. That is, for nearly 40% of the patients at least one image is missing. Given that the Keck School is a tertiary care center, the proportion of missing images there is relatively small; for other smaller or secondary centers this percentage is substantially higher.
In this dissertation, the problem of imputing missing CECT images is
treated by developing statistical deep learning-based techniques. The goal is
to predict the CECT image corresponding to a missing time-point, given the
CECT images at the other three time-points.
1.1 Introduction To Generative Algorithms in Medical
Imaging
In the field of healthcare, medical imaging has long been an indispensable tool, enabling clinicians to visualize and diagnose various medical conditions non-invasively. With the emergence of artificial intelligence (AI), particularly generative AI algorithms, this field has experienced a revolutionary transformation. Generative AI algorithms have emerged as a powerful tool for enhancing the accuracy, speed, and efficiency of medical image analysis and testing, contributing significantly to improved patient care and outcomes [8, 9].
Generative AI algorithms are a subset of artificial intelligence that aims to create data rather than simply classify or recognize existing data. They are therefore characterized by their ability to generate new, synthetic data that closely resembles real-world examples. In the context of medical imaging, these algorithms are applied to many tasks, such as image synthesis, denoising, super-resolution, image-to-image translation, and more [10, 11].
One of the notable advancements in generative AI for medical imaging
is the application of Generative Adversarial Networks (GANs) [12]. GANs
consist of two neural networks, a generator and a discriminator (or in some
cases a critic), which engage in a continuous adversarial training process. The
generator attempts to produce synthetic images that are indistinguishable from
real medical images, while the discriminator evaluates these generated images
and provides feedback to the generator.
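The adversarial training loop described above can be illustrated with a minimal sketch. The toy example below is not the architecture used in this work: the one-dimensional data, the linear "generator," the logistic "discriminator," and all hyper-parameters are purely illustrative, chosen so the two-player structure is visible without any deep learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Toy "real" data: samples from N(2, 0.5). The generator maps latent
# noise z ~ N(0, 1) through x = a*z + b, so it can in principle match
# the real distribution; the discriminator is D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0      # generator parameters
w, c = 0.0, 0.0      # discriminator parameters
lr = 0.05

for step in range(2000):
    x_real = rng.normal(2.0, 0.5, 64)
    z = rng.normal(size=64)
    x_fake = a * z + b

    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    g_real = 1.0 - d_real        # gradient of log D w.r.t. its logit
    g_fake = -d_fake             # gradient of log(1 - D) w.r.t. its logit
    w += lr * (np.mean(g_real * x_real) + np.mean(g_fake * x_fake))
    c += lr * (np.mean(g_real) + np.mean(g_fake))

    # Generator step: ascend log D(fake), i.e. try to fool the discriminator.
    d_fake = sigmoid(w * x_fake + c)
    dx = (1.0 - d_fake) * w      # chain rule through the logit w*x + c
    a += lr * np.mean(dx * z)
    b += lr * np.mean(dx)

# After training, the generator's offset b should drift toward the real mean of 2.
```

The alternating updates are the essence of the adversarial game: the discriminator sharpens its decision boundary, and the generator shifts its output distribution toward the real one until the two are hard to tell apart.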
Generative AI algorithms have found a plethora of applications in medical imaging, with significant benefits for both clinicians and patients. These
applications include:
Data Augmentation
Generative algorithms can create additional training data by generating synthetic medical images [13–15], thus helping to address the issue of limited
annotated data. This is a common challenge in medical image analysis. Such an
application therefore aids in training more robust and accurate deep learning
models.
Image Denoising and Enhancement
Generative models can be used to denoise medical images, thereby improving image quality and aiding more accurate diagnoses. They can also enhance the resolution of images, revealing finer details that might be crucial for diagnosis [16–18].
Cross-Modality Image Translation
Generative AI can also translate medical images from one modality to another,
such as converting CT scans to MRI images. This capability facilitates multimodal analysis and can be particularly valuable in planning treatments [11,
19, 20].
Organ Segmentation
Accurate segmentation of organs and structures within medical images is vital
for treatment planning and diagnosis. Generative models can assist in such a
process, saving time for radiologists and enhancing accuracy [21, 22].
Pathology Detection
Such algorithms can be trained to detect specific pathologies or anomalies
within medical images, enabling early disease diagnosis and monitoring [23].
The integration of generative AI into medical imaging therefore not only
enhances diagnostic capabilities but also streamlines workflows, reduces the
burden on healthcare professionals, and improves patient care by enabling
more accurate and timely diagnoses.
In conclusion, generative AI algorithms have initiated a new era of innovation in medical imaging. Their ability to generate synthetic data and enhance
existing images has transformed the way medical professionals analyze and
interpret diagnostic imagery [17, 21, 24, 25]. As research in this field continues
to advance, the potential for generative AI to improve healthcare outcomes
and patient experiences in medical imaging will only continue to grow.
1.2 Introduction To Medical Image Imputation
The quality of medical images can be affected by various factors, including hardware limitations, patient movement, or incomplete data acquisition.
These imperfections can hinder accurate diagnosis and treatment planning.
In response to these challenges, medical image imputation has emerged as a
powerful technique to enhance the utility and reliability of medical imaging
data.
Medical image imputation refers to the process of predicting and filling
in missing or corrupted image data to create a complete and diagnostically
valuable image. It leverages advanced computational algorithms and artificial
intelligence techniques to restore the missing or deteriorated portions of an
image. Several factors contribute to the need for medical image imputation. In
diagnostic imaging modalities such as magnetic resonance imaging (MRI) or
computed tomography (CT), motion artifacts due to patient movement during
scanning can result in incomplete or distorted images [10]. Moreover, in the
era of big data, it is not uncommon to encounter missing data due to technical
glitches or incomplete scans. These challenges can compromise the accuracy
of diagnoses and hinder treatment planning.
The application of artificial intelligence, particularly deep learning models, has revolutionized medical image imputation. Generative models, such as Variational Autoencoders (VAEs) and GANs, have gained prominence in this area. VAEs learn probabilistic representations of image data and can generate plausible missing portions based on learned statistical patterns [26]. GANs, as mentioned earlier, can generate realistic images by learning the underlying distribution of the data [12]. These models can predict missing information, creating visually coherent and diagnostically valuable medical images.
Medical image imputation also has a wide range of clinical applications. It
is particularly beneficial in enhancing the quality of functional MRI (fMRI)
data by reducing motion artifacts and improving the accuracy of brain connectivity studies [27]. Beyond traditional medical imaging modalities, image
imputation has been applied to the field of histopathology, where it plays a
critical role in enhancing the quality of scanned tissue slides [23]. By filling
in missing or damaged regions in these high-resolution images, pathologists
can better analyze tissue samples, leading to more accurate cancer diagnoses
and treatment recommendations. In ophthalmology, retinal imaging is crucial for the diagnosis and monitoring of various eye diseases. However, factors
like eye movement can introduce artifacts in retinal scans. Image imputation
techniques can correct these artifacts, providing ophthalmologists with cleaner
and more reliable images for assessing conditions such as diabetic retinopathy
[28]. Image imputation holds great potential for improving remote healthcare
and telemedicine services. Patients in remote or underserved areas may have
limited access to high-quality imaging equipment. By using imputation techniques to enhance lower-quality images, healthcare providers can offer more
accurate remote diagnoses and treatment recommendations, expanding access
to medical expertise.
Chapter 2: Methods and Literature Review
We solve the image imputation problem using two different approaches: the first uses a Wasserstein Generative Adversarial Network (WGAN) as the prior in a Bayesian inference framework, and the second uses a Conditional Generative Adversarial Network (CGAN).
2.1 The WGAN as Prior
This technique uses the information encoded in a set of complete CECT images
(a set that includes all four time-points) to train a generative adversarial network (GAN) [12], whose generator learns this distribution, and can be used to
efficiently sample from it. This generator is then used to represent the prior
in a Bayesian inference problem, whose goal is to infer the distribution of the
CECT image corresponding to a missing time-point, given the CECT images
at all other time-points. Using the GAN allows us to encode the complex
information available through the sample set as a prior in the inference problem. It also allows us to formulate the inference problem in the latent space
of the GAN, whose dimension is much smaller than that of the image itself.
This dimension reduction in turn allows us to use methods like Markov Chain
Monte Carlo (MCMC) [29] to efficiently sample from the posterior distribution. Once the MCMC chain attains its equilibrium, it is used to compute
the desired statistics of the posterior distribution for the missing image. This
includes quantities such as the most likely image, and the pixel-wise mean and
standard deviation. We note that statistics like standard deviation provide a
quantitative measure of the uncertainty in the prediction, and may be used to
determine the confidence in the prediction.
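As a rough sketch of this machinery, the snippet below runs a Metropolis sampler in a low-dimensional latent space, with a fixed random linear map standing in for the trained generator G and half of the "pixels" treated as the missing image. The actual method uses Hamiltonian Monte Carlo and a trained WGAN generator; everything here (dimensions, noise level, step size) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a trained generator G: latent space -> image space. A fixed
# random linear map plays that role purely to exercise the inference
# machinery; in the actual method G is the trained WGAN generator.
d_latent, d_pixels = 8, 64
G_mat = rng.normal(size=(d_pixels, d_latent))

def G(z):
    return G_mat @ z

# Measurement: the first half of the "image" is observed (the second half
# plays the role of the missing image), with Gaussian noise of scale sigma.
observed = slice(0, d_pixels // 2)
sigma = 0.5
z_true = rng.normal(size=d_latent)
y = G(z_true)[observed] + sigma * rng.normal(size=d_pixels // 2)

def log_post(z):
    # log p(z | y) up to a constant: Gaussian likelihood on the observed
    # pixels plus the N(0, I) latent prior implied by GAN training.
    misfit = G(z)[observed] - y
    return -0.5 * np.sum(misfit**2) / sigma**2 - 0.5 * np.sum(z**2)

# Random-walk Metropolis in the low-dimensional latent space; the
# dissertation uses HMC, but the accept/reject logic is analogous.
z, lp = np.zeros(d_latent), log_post(np.zeros(d_latent))
samples = []
for it in range(5000):
    z_prop = z + 0.1 * rng.normal(size=d_latent)
    lp_prop = log_post(z_prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept step
        z, lp = z_prop, lp_prop
    if it >= 2000:                             # discard burn-in
        samples.append(G(z))

imgs = np.stack(samples)
mean_img = imgs.mean(axis=0)   # pixel-wise mean: best guess, missing part included
sd_img = imgs.std(axis=0)      # pixel-wise SD: confidence in the imputation
```

The key point is that the chain lives in the latent space of dimension 8, not the 64-dimensional image space, which is exactly the dimension reduction that makes MCMC sampling tractable here.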
The WGAN approach is applied to segmented images, where only the tumor is shown in the CECT image while the remaining parts of the image, showing other organs or parts of the kidney, are masked to black.
2.2 The CGAN
The image imputation problem here is solved using a Conditional Generative Adversarial Network which itself rests on a Bayesian inference framework [30]. This technique uses the information encoded in a set of complete CECT images to train a CGAN whose generator learns the posterior distribution of a Bayesian inference problem. The generator can then be used to sample directly from the inferred distribution, which makes computing the statistics much easier since no MCMC iterations are required. The drawback is that if a slightly different problem is to be solved, for example one with two missing time-points, then the entire network has to be retrained. As in the first approach, these samples can then be used to calculate important statistics such as the pixel-wise mean and standard deviation. The CGAN approach is applied to unsegmented images, where the tumor and everything around it in the CECT images are retained.
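The direct-sampling procedure can be sketched as follows. Here a hypothetical toy function stands in for the trained conditional generator; in the actual method this would be the CGAN generator, and the image and sequence dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

H = W = 16        # illustrative image size
n_phases = 4      # pre-contrast, corticomedullary, nephrographic, excretory

def generator(y, z):
    # Hypothetical stand-in for the trained CGAN generator G(y, z):
    # fill the missing phase with a z-perturbed average of the available
    # phases, leaving the observed phases untouched.
    filled = y.copy()
    missing = np.isnan(y[:, 0, 0])
    filled[missing] = np.nanmean(y, axis=0) + 0.1 * z
    return filled

# Incomplete sequence: say the corticomedullary phase (index 1) is missing.
y = rng.random((n_phases, H, W))
y[1] = np.nan

# Direct posterior sampling: one generator evaluation per latent draw,
# with no MCMC iterations.
draws = np.stack([generator(y, rng.normal(size=(H, W))) for _ in range(200)])

mean_seq = draws.mean(axis=0)  # pixel-wise mean: the imputed image
sd_seq = draws.std(axis=0)     # pixel-wise SD: confidence in the imputation
```

Because each sample is a single forward pass of the generator, the cost of estimating the mean and standard deviation scales with the number of draws rather than with the length of a Markov chain.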
2.3 Literature Review
At the beginning of the work on this dissertation, we were not aware of any prior work that attempts to impute missing images within a sequence of CECT images acquired at different time-points. However, in the broader field of medical image imputation, there was significant work demonstrating the effectiveness of modern machine learning algorithms for different imaging modalities [2, 3, 31, 32]. There are also several applications that utilize image-to-image algorithms inspired by GANs. These include algorithms like
the CycleGAN, StarGAN and CollaGAN [19, 20, 33]. In most instances these
algorithms are utilized in applications where medical images from a given
domain (say T1 or T2 MR images) are translated to another domain (say
FLAIR image). We note that these algorithms are significantly different from
the one described in this dissertation, since for a given input image they produce a single output image. More specifically, they do not solve a statistical
inference problem and therefore are not able to quantify the distribution of
likely answers. Stated simply, given an image of one type they produce only
one image of another type. They do not account for the fact that there may be
an ensemble of images that are consistent with a given input image. Specifically, for the case of imputing CECT images, there may be a whole collection of likely complete sequences that are consistent with a given incomplete sequence. The algorithms presented in this dissertation account for this and, in doing so, are shown to be more accurate while also providing a measure of confidence to the user.
Deep learning-based image imputation techniques have recently been used
for imputing and synthesizing CT images. This includes generating CT images
for data augmentation to eventually improve the performance of a CT-image
based classifier [14, 15, 34]. It also includes algorithms for generating CECT
images at a single time point for the lungs [35], and the kidneys [36], where
the latter study uses the concept of neural transfer for improved performance.
Recently, Wasserstein GANs have shown promise in efficiently solving statistical inference problems in physics and computer vision [18, 37]. This work
builds upon those ideas and applies them to real-world medical applications.
More recently, the augmented CycleGAN architecture has also been used to accomplish this task; however, to our knowledge this approach has not been
applied to imputing medical images [38]. Indeed, its application to the problem
described in this thesis will be an interesting area for future work.
When it comes to conditional GANs, in image-to-image translation problems they have shown high effectiveness in learning the relation between statistically dependent source images and their target-domain images [11]. Xia et al. [3] applied a CGAN to generate missing cardiac MR slices. Their results showed that the CGAN model consistently outperformed imputation baselines such as pix2pix.
Likewise, as the work moved forward and the WGAN paper was published, several GAN-based approaches were implemented and tested for imputing renal CECT images at different time-points [39]. The approaches tested include several standard algorithms and two novel methods, ReMIC and DiagnosisGAN, which were shown to be the most accurate. In ReMIC [40], a representational disentanglement scheme for multi-domain image completion was used to improve performance, whereas DiagnosisGAN [39] used, in addition to the CECT images themselves, other sources of information, such as the segmentation mask for the tumor and knowledge of the cancer sub-type, to improve the performance of the method. In the CGAN results section of this dissertation we compare our algorithm with these methods and conclude that our method is more accurate, while at the same time providing estimates of confidence in the imputation task.
Chapter 3: The WGAN as Prior Approach
A WGAN is comprised of two neural networks: a generator $G$ and a critic $D$. The generator $G : \Omega_Z \to \Omega_X$ takes as input random Gaussian noise $z \sim P_Z$ and makes a prediction that tries to approximate the true distribution $P_X$. Assuming a true dataset $S$ is formed of samples $x_i \sim P_X$, it is fed into the critic $D : \Omega_X \to \mathbb{R}$ along with the generator's prediction, and the Wasserstein distance between the two is estimated. The loss function of the WGAN, incorporating only the Wasserstein distance, leads us to solve the following min-max problem:
$$(G^*, D^*) = \arg\min_{G}\, \arg\max_{D \in f} \left[ \mathbb{E}_{x \sim P_X}[D(x)] - \mathbb{E}_{z \sim P_Z}[D(G(z))] \right] \quad (1)$$
where $x$ represents the real images belonging to the real distribution $P_X$ and $G(z)$ represents the generator's prediction. The critic must belong to $f$, the set of functions that are 1-Lipschitz continuous. The solution of this min-max problem ensures that the GAN has learnt the original distribution and can generate samples from it.
3.1 The Gradient Penalty
The authors in [41] show that issues with WGAN arise mainly because of the weight clipping used to enforce Lipschitz continuity on the critic. They therefore propose the WGAN-GP model, which uses a gradient penalty term. The gradient penalty replaces weight clipping with a constraint on the gradient norm of the critic to enforce Lipschitz continuity. This has been shown to allow more stable training of the network than WGAN and requires very little hyper-parameter tuning. The new loss terms and min-max problem for WGAN-GP are therefore:
$$L(G, D) = \mathbb{E}_{x \sim P_X}[D(x)] - \mathbb{E}_{z \sim P_Z}[D(G(z))] \quad (2)$$
$$D^*(G) = \arg\max_{D} \left[ L(G, D) - \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}} \left[ \left( \|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1 \right)^2 \right] \right] \quad (3)$$
$$G^* = \arg\min_{G}\, L(G, D^*) \quad (4)$$
where $P_{\hat{x}}$ is the distribution obtained by uniformly sampling along straight lines between the real distribution $P_X$ and the generated distribution $P_g$, which is the distribution of $G(z)$. The penalty coefficient $\lambda$ weights the gradient penalty term; in this thesis, $\lambda$ is set equal to 10 for all the experiments.
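As a concrete illustration of the penalty term in (3), the sketch below evaluates it with NumPy for a toy quadratic critic whose gradient is available in closed form. In practice the critic is a deep network and $\nabla_{\hat{x}} D$ is obtained by automatic differentiation; the names, dimensions, and critic used here are illustrative assumptions, not the implementation used in this thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy critic D(x) = x.w + 0.5 x.A.x, with analytic gradient
# grad D(x) = w + 0.5 (A + A^T) x.  (In the thesis D is a deep
# network and the gradient comes from automatic differentiation.)
d = 8
w = rng.normal(size=d)
A = 0.1 * rng.normal(size=(d, d))

def critic(x):
    return x @ w + 0.5 * x @ A @ x

def critic_grad(x):
    return w + 0.5 * (A + A.T) @ x

def gradient_penalty(x_real, x_fake, lam=10.0):
    """WGAN-GP penalty: sample x_hat uniformly on the segment between
    a real and a generated sample, then penalize deviations of the
    critic's gradient norm from 1 (the 1-Lipschitz target)."""
    eps = rng.uniform(size=(x_real.shape[0], 1))        # one eps per pair
    x_hat = eps * x_real + (1.0 - eps) * x_fake         # interpolates
    grads = np.apply_along_axis(critic_grad, 1, x_hat)  # (n, d) gradients
    norms = np.linalg.norm(grads, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)

x_real = rng.normal(size=(16, d))
x_fake = rng.normal(loc=2.0, size=(16, d))
gp = gradient_penalty(x_real, x_fake)  # added to the critic loss in (3)
```

This penalty is subtracted from $L(G, D)$ when maximizing over the critic, exactly as in (3); with $\lambda = 10$ as used throughout the experiments.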
For a fully trained WGAN (trained using the gradient penalty or some other means of enforcing the Lipschitz constraint) with infinite capacity, the push-forward of $P_Z$ by $G$ is weakly equivalent to $P_X$. That is, for any continuous, bounded function $l(x)$,
$$\mathbb{E}_{x \sim P_X}[l(x)] = \mathbb{E}_{z \sim P_Z}[l(G(z))], \quad (5)$$
which implies that all moment-based statistics, such as the mean and variance, will be equal. Thus, sampling x from PX is (statistically) equivalent
to sampling z from PZ and passing each sample through the fully-trained
generator.
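The weak equivalence (5) can be checked numerically in a setting where the push-forward is known exactly. In the hypothetical sketch below, the "trained generator" is a linear map $G(z) = Az + b$, so the push-forward of $P_Z = N(0, I)$ is exactly $N(b, AA^T)$, and Monte Carlo moments of $G(z)$ reproduce the target's mean and covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "trained generator": a linear map G(z) = A z + b.  The
# push-forward of P_Z = N(0, I) by G is N(b, A A^T), so sampling z
# and mapping it through G reproduces the target's moments.
Nz, Nx = 3, 5
A = rng.normal(size=(Nx, Nz))
b = rng.normal(size=Nx)

z = rng.normal(size=(200_000, Nz))  # z ~ P_Z
x_gen = z @ A.T + b                 # G(z)

# Compare moment-based statistics of the push-forward to the target
mean_err = np.max(np.abs(x_gen.mean(axis=0) - b))
cov_err = np.max(np.abs(np.cov(x_gen.T) - A @ A.T))
```

Both errors shrink at the usual Monte Carlo rate as the number of latent samples grows, which is the content of (5) for moment-based statistics.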
In this thesis, $x_i \in \Omega_{x_i} \subset \mathbb{R}^{N_x}$, $i = 1, \dots, 4$, denote CECT images, each with $N_x$ pixels, corresponding to the four distinct time-points. Let $x = [x_1, x_2, x_3, x_4]$ be a composite image that includes images from all four time-points. Further, let $S \equiv \{x^{(1)}, \dots, x^{(N)}\}$ be a collection of $N$ such composite images. These samples are drawn from an underlying distribution $P_X$, which is assumed unknown.
In the first stage of our algorithm (see Figure 1), the set $S$ is used to train the generator $G$ of a Wasserstein GAN (WGAN) [41, 42], which maps an $N_z$-dimensional latent space vector $z \in \Omega_Z \subset \mathbb{R}^{N_z}$ to $x$. The dimension of the latent space satisfies $N_z \ll N_x$, and the samples $z$ are drawn from an uncorrelated Gaussian distribution ($P_Z$) with zero mean and unit variance.
3.2 Bayesian Inference
Next, consider the restriction operator $R_i(x)$, which acts on $x$ and deletes the CECT image corresponding to the $i$-th time-point. We assume that we observe a noisy version of this restricted set of images. That is, we observe
$$y = R_i(x) + \eta, \quad (6)$$
where $\eta \sim p_\eta$ is assumed to be an uncorrelated Gaussian with zero mean and variance $\sigma$. The problem we wish to solve is: given this noisy, restricted version of the CECT image sequence, and the prior knowledge encoded in the set $S$, determine the CECT image corresponding to the missing time-point $i$.
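The composite image, restriction operator, and measurement model (6) can be sketched directly in NumPy. The sketch below uses hypothetical 16-pixel "images" in place of the $100 \times 100$ CECT slices; only the shapes and the structure of $R_i$ and $y$ are meant to carry over.

```python
import numpy as np

rng = np.random.default_rng(2)

# Composite image x = [x1, x2, x3, x4]: one row per time-point.
# Here nx = 16 stands in for the Nx = 100 x 100 pixels of the thesis.
n_time, nx = 4, 16
x = rng.uniform(size=(n_time, nx))

def restrict(x, i):
    """R_i: delete the image at time-point i (0-based) from the sequence."""
    return np.delete(x, i, axis=0)

# Noisy restricted observation y = R_i(x) + eta, eta uncorrelated Gaussian
sigma = 0.01
i_missing = 2
y = restrict(x, i_missing) + sigma * rng.normal(size=(n_time - 1, nx))
```

The inference task is then to recover the deleted row of `x` from `y`, together with a quantification of the uncertainty in that recovery.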
In order to solve this problem, we rely on Bayesian inference, which provides a suitable framework for solving ill-posed (inverse) problems and quantifying the underlying uncertainty [43]. We proceed by first defining a prior distribution $P_{\text{prior}}$ for the field $x$ to be inferred, before making the measurement/observation $y$. This distribution is typically constructed based on domain knowledge, snapshots and/or known constraints on $x$. Next, the restriction operator $R_i$ is used in conjunction with the measurement noise to define the likelihood distribution $P_{Y|X}(y \mid x) := p_\eta(y - R_i(x))$. This distribution captures the inherent uncertainty and loss of information in the measurement about the inferred field. Finally, we use Bayes' rule to obtain an expression for the posterior
probability of the composite image $x$, given the measurement $y$:
$$P_{X|Y}(x \mid y) = P_{Y|X}(y \mid x)\, P_{\text{prior}}(x)/Z = p_\eta(y - R_i(x))\, P_X(x)/Z, \quad (7)$$
where $Z$ is the evidence term, which ensures that the integral of the probability density is unity.
There are two fundamental challenges in using this formula to infer the posterior distribution $P_{X|Y}$. (1) The dimension of $x$ is large, and therefore methods like MCMC cannot be easily used to learn the posterior distribution. For example, in the problems studied in the results section, this dimension is $4 \times 100 \times 100 = 40{,}000$, and most existing MCMC methods work well only for dimensions of $O(100)$. (2) $P_X$ is not known explicitly; rather, it is known only through the set $S$. Thus it is difficult to construct a prior distribution that captures this information. We address these challenges by using the set $S$ to train a WGAN and mapping the expression for the posterior to the latent space of the WGAN. We accomplish this below by utilizing (5) and (7).
The expectation of any continuous bounded function $l(x)$ over the posterior density is given by
$$\begin{aligned}
\mathbb{E}_{x \sim P_{X|Y}}[l(x)] &= \mathbb{E}_{x \sim P_X}[l(x)\, p_\eta(y - R_i(x))/Z] \\
&= \mathbb{E}_{x \sim P_X}[m(x)] \\
&= \mathbb{E}_{z \sim P_Z}[m(G(z))] \\
&= \mathbb{E}_{z \sim P_Z}[l(G(z))\, p_\eta(y - R_i(G(z)))/Z] \\
&= \mathbb{E}_{z \sim P_{Z|Y}}[l(G(z))]. \quad (8)
\end{aligned}$$
In the first line of the equation above, we have used the definition of $P_{X|Y}$ in (7). The second line follows by defining $m(x) \equiv l(x)\, p_\eta(y - R_i(x))/Z$. The third line follows from the weak equivalence statement for a WGAN (5). The fourth line follows from the definition of $m(x)$. The fifth line makes use of the definition of the posterior density in the latent space, that is,
$$P_{Z|Y}(z \mid y) \equiv p_\eta(y - R_i(G(z)))\, P_Z(z)/Z. \quad (9)$$
Equating the left-hand side and the final expression on the right-hand side of (8), we have
$$\mathbb{E}_{x \sim P_{X|Y}}[l(x)] = \mathbb{E}_{z \sim P_{Z|Y}}[l(G(z))]. \quad (10)$$
Here $P_{Z|Y}$ is the posterior distribution in the latent space of the WGAN and is given by (9).
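For Gaussian measurement noise and the standard Gaussian latent prior, the latent posterior (9) has a simple unnormalized log-density, which is all an MCMC sampler needs. The sketch below uses a hypothetical linear "generator" and restriction as stand-ins for the trained WGAN generator and $R_i$; it is an illustration of the form of (9), not the thesis code.

```python
import numpy as np

rng = np.random.default_rng(3)

# Unnormalized log of the latent posterior in (9), assuming Gaussian
# noise with spread sigma and latent prior P_Z = N(0, I).  The
# evidence Z is omitted: MCMC only needs the density up to a constant.
def log_posterior(z, y, G, R, sigma):
    resid = y - R(G(z))
    log_lik = -0.5 * np.sum(resid ** 2) / sigma ** 2  # log p_eta term
    log_prior = -0.5 * np.sum(z ** 2)                 # log P_Z term
    return log_lik + log_prior

# Hypothetical stand-ins for illustration only: a linear "generator"
# and a restriction that drops the last entry of the sequence.
Nz, Nx = 2, 6
A = rng.normal(size=(Nx, Nz))
G = lambda z: A @ z
R = lambda x: x[:-1]

z_true = rng.normal(size=Nz)
y = R(G(z_true)) + 0.01 * rng.normal(size=Nx - 1)
lp_true = log_posterior(z_true, y, G, R, sigma=0.01)
lp_far = log_posterior(z_true + 5.0, y, G, R, sigma=0.01)
```

As expected, the log-posterior is much higher at the latent vector that generated the data than at a distant one, which is what drives the sampler toward the region of plausible imputations.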
The pair of equations (9) and (10) allow us to compute statistics of the posterior distribution in a computationally tractable way. We use the expression in (9) to construct an MCMC algorithm whose stationary distribution yields samples drawn from an approximation of $P_{Z|Y}$. This constitutes the second stage of our algorithm, and is described in some detail in the next section. Once this is done, we use (10) to approximate any statistic of the posterior using the samples from the Markov chain generated in Stage 2. That is,
$$\mathbb{E}_{x \sim P_{X|Y}}[l(x)] \approx \frac{1}{N_{\text{samp}}} \sum_{i=1}^{N_{\text{samp}}} l(G(z^{(i)})). \quad (11)$$
This is the third and final stage of our algorithm. The overall algorithm is
depicted pictorially in Figure 1.
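Stages 2 and 3 can be sketched end-to-end on a toy problem: sample the latent posterior (9) with MCMC, then estimate posterior statistics via (11). The thesis uses Hamiltonian Monte Carlo; the simpler random-walk Metropolis sampler below, together with a hypothetical linear "generator", is used purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy setup: linear "generator" G(z) = A z and a restriction R that
# drops one "time-point".  These are illustrative stand-ins for the
# trained WGAN generator and R_i of the thesis.
Nz, Nx, sigma = 2, 8, 0.1
A = rng.normal(size=(Nx, Nz))
G = lambda z: A @ z
R = lambda x: x[:-1]

z_true = rng.normal(size=Nz)
y = R(G(z_true)) + sigma * rng.normal(size=Nx - 1)

def log_post(z):  # unnormalized log of (9)
    resid = y - R(G(z))
    return -0.5 * np.sum(resid ** 2) / sigma ** 2 - 0.5 * np.sum(z ** 2)

# Stage 2: random-walk Metropolis chain targeting P_{Z|Y}
z, lp = np.zeros(Nz), log_post(np.zeros(Nz))
samples = []
for _ in range(20_000):
    z_prop = z + 0.1 * rng.normal(size=Nz)
    lp_prop = log_post(z_prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
        z, lp = z_prop, lp_prop
    samples.append(z)
samples = np.array(samples[5_000:])  # discard burn-in

# Stage 3: Monte Carlo approximation (11) of the posterior mean of G(z)
x_mean = np.mean([G(s) for s in samples], axis=0)
```

The same chain of samples can be pushed through $G$ to estimate pixel-wise variances, giving the confidence maps reported later in the thesis.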
3.3 Convergence of MCMC algorithm
MCMC algorithms are known to suffer from challenges when used in high-dimensional spaces, where the mass of the target density is typically concentrated in narrow regions on a lower-dimensional manifold [44]. To fully explore the regions of interest, the requirements on the length of the Markov chains [45] can make the algorithm computationally infeasible. Thus, posing the inference problem on the lower-dimensional latent space can alleviate this issue.
A number of diagnostic tools are available to analyse the convergence of the
MCMC algorithm, and thus determine the termination length of the generated
Markov chains. We direct interested readers to [46] for a summary of such
techniques. In the present work, we use the Gelman-Rubin diagnostic [47], which estimates convergence by considering multiple Markov chains and evaluating the between-chain and within-chain variances.
We consider $M$ chains of length $N$, each of which is generated by the MCMC algorithm from a different random initial point. Let $\mu_m$ and $\sigma_m^2$ denote the sample mean and variance of the $m$-th chain, and let $\mu$ denote the overall mean across all chains, i.e., $\mu = \sum_{m=1}^{M} \mu_m / M$. Then, we estimate the within-chain variance $W$, the between-chain variance $B$ and the pooled variance $\hat{V}$ as
$$W = \frac{1}{M} \sum_{m=1}^{M} \sigma_m^2, \qquad B = \frac{N}{M-1} \sum_{m=1}^{M} (\mu_m - \mu)^2, \qquad \hat{V} = \frac{N-1}{N}\, W + \frac{B}{N}. \quad (12)$$
Finally, we evaluate the potential scale reduction factor $\hat{R} = \sqrt{\hat{V}/W}$. Assuming that the initial points of the chains were sampled from an over-dispersed distribution compared to the target distribution, $\hat{V}$ is expected to overestimate the variance of the target distribution, while $W$ underestimates it. Thus, the
Table 1: $\hat{R}$ values of Markov chains in $z$ used to impute the images at the four time-points for Subject 1.

Time-point | N = 256K    | N = 512K    | N = 1024K
1          | 1.13 ± 0.10 | 1.08 ± 0.08 | 1.05 ± 0.04
2          | 1.28 ± 0.23 | 1.17 ± 0.15 | 1.12 ± 0.10
3          | 1.14 ± 0.15 | 1.09 ± 0.06 | 1.08 ± 0.06
4          | 1.19 ± 0.19 | 1.13 ± 0.17 | 1.11 ± 0.16
closer the value of $\hat{R}$ is to 1, the more assured we are about the convergence of the chains.
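The quantities in (12) translate directly into a few lines of NumPy. The sketch below computes $\hat{R}$ per latent dimension from an array of chains; the synthetic "good" and "bad" chains are illustrative assumptions used only to show the expected behavior of the diagnostic.

```python
import numpy as np

rng = np.random.default_rng(5)

# Gelman-Rubin diagnostic of (12): within-chain variance W,
# between-chain variance B, pooled variance V_hat, and the potential
# scale reduction factor R_hat = sqrt(V_hat / W), per latent dimension.
def gelman_rubin(chains):
    """chains: array of shape (M, N, Nz) -> R_hat of shape (Nz,)."""
    M, N, _ = chains.shape
    means = chains.mean(axis=1)                  # mu_m, shape (M, Nz)
    W = chains.var(axis=1, ddof=1).mean(axis=0)  # within-chain variance
    B = N * means.var(axis=0, ddof=1)            # between-chain variance
    V_hat = (N - 1) / N * W + B / N              # pooled variance
    return np.sqrt(V_hat / W)

# Four well-mixed chains drawn from the same target give R_hat near 1;
# chains offset toward different modes give R_hat well above 1.
good = rng.normal(size=(4, 10_000, 3))
bad = good + np.arange(4).reshape(4, 1, 1)
r_good = gelman_rubin(good)
r_bad = gelman_rubin(bad)
```

As in Table 1, the vector $\hat{R} \in \mathbb{R}^{N_z}$ returned here can be condensed to its mean ± standard deviation across dimensions.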
To demonstrate the utility of this tool, we consider the chains generated to impute the missing images at the four time-points for Subject 1. At each time-point, we use $M = 4$ chains for each of the lengths $N = 256\text{K}, 512\text{K}, 1024\text{K}$. Since the chains are generated for the latent variable $z \in \mathbb{R}^{N_z}$, we obtain a vector $\hat{R} \in \mathbb{R}^{N_z}$ for each configuration. To simplify the analysis, we condense this vector to a scalar by considering the dimensional mean ± standard deviation of $\hat{R}$, which is listed in Table 1. Note that this scalar value moves closer to 1 as $N$ increases, indicating convergence. In practice, a value of $\hat{R}$
Abstract
Contrast Enhanced Computed Tomography (CECT) imaging has led to more accurate detection of kidney cancer. CECT images are generated through the injection of an intravenous contrast agent into the subject and then through imaging at four distinct time-points. These are the pre-contrast, corticomedullary, nephrographic, and excretory time-points. However, in some cases one of the four images corresponding to a given time-point may be missing. This may be due to the corruption of the image, its degradation due to patient motion, or due to differing protocols at different institutions. This missing image can reduce the utility of the whole scan and the only solution would be to call the patient back to the clinic and have them undertake the scan again. This can lead to undue burden on the patient, increase the overall healthcare costs and expose the patient once again to radiation. To address this, in the proposed research we develop a method to predict the missing image, given images available at three time-points. To do this, we will make use of two algorithms which can learn and sample from the probability distribution of the complete CECT sequence conditioned on the incomplete sequence. The first algorithm makes use of generative adversarial networks (GANs) as a prior distribution and then uses Hamiltonian Monte Carlo to probabilistically impute the missing image sequence. The second algorithm employs conditional GANs that can directly sample from the posterior distribution thus probabilistically imputing the missing image. We will apply these algorithms to data collected at the Keck School of Medicine at USC and quantify their effectiveness by computing quantitative measures of the difference between the true and imputed images.