Imaging informatics-based electronic patient record and analysis system for multiple sclerosis research, treatment, and disease tracking
(USC Thesis Other)
IMAGING INFORMATICS-BASED
ELECTRONIC PATIENT RECORD
AND ANALYSIS SYSTEM FOR
MULTIPLE SCLEROSIS RESEARCH,
TREATMENT, AND DISEASE
TRACKING
A dissertation presented to the faculty of
the USC Graduate School at the University
of Southern California in partial fulfillment
of the requirements for the degree DOCTOR
OF PHILOSOPHY (Biomedical Engineering)
Kevin C. Ma
May 2018
Dedication
To my parents. Baba, Mama, thank you for always supporting me in everything that I do. I love
you.
To Ye-ye, Nai-nai. Thank you for your sacrifice and hard work to give me this life. Rest in peace.
To my brother Chi-Che, for your unending support and being the mature one even when I’m the
elder.
To my wife Abby, for being my best friend, my confidant, my foundation, the moon of my life.
Acknowledgements
This work has been done with partial contribution, collaboration, and guidance from the
following teachers, colleagues, and friends:
Dr. Brent Liu: PhD advisor
Dr. H.K. (Bernie) Huang: Professor emeritus, advisor
Dr. Lilyana Amezcua: Clinical collaborator
Dr. James Fernandez: Manual contour segmentation
Dr. Alex Lerner: Manual contour segmentation
Dr. Mark Shiroishi: Manual contour segmentation
Dr. Jerry Loo: GUI design
Nakul Reddy: EM CAD algorithm
Sydney Forsyth: Simulated lesion tracking evaluation
Ly Pham: CAD evaluation
Joseph Liu: Dynamic DICOM display
In addition, I would like to thank my friends and colleagues at USC, particularly those at the IPILab. I
will always cherish the time we have spent together and the experiences we have shared. To Dr.
Brent Liu, my sincerest gratitude for launching my PhD career and tirelessly guiding me through
the whole process. To Dr. Bernie Huang, thank you for your countless advice and support. To Lucy
Zhang, Jorge Documet, Jasper Lee, Anh Le, and Sinchai Tsao, thank you for mentoring me in my
earliest years as a PhD student, when I was at my most confident yet naïve. To Ruchi Deshpande,
Ximing Wang, and Sneha Verma, thank you for inspiring me and giving me the courage to continue this
journey. Finally, to my family and friends, thank you for being supportive and believing in me.
Table of Contents
Dedication
Acknowledgements
LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS
ABSTRACT
Chapter 1 Introduction of Multiple Sclerosis
1.1 Clinical Background
1.1.1 Diagnosis and Treatment of Multiple Sclerosis
1.1.2 Medical imaging for Multiple Sclerosis
1.2 Clinical data and data sources utilized for Multiple Sclerosis
1.2.1 Neurological data sources
1.2.2 Radiological data sources
1.2.3 Diagnostic Workflow
1.3 MS research based on ethnicity differences
1.3.1 Current research on Asian and African American MS patients
1.3.2 MS research proposal on Hispanic American patients
1.4 Current challenges of MS research and treatment
1.4.1 Data integration challenges
1.4.2 Data analysis challenges
1.4.3 System integration and implementation challenges
1.4.4 Proposed solution: MS eFolder
Chapter 2 Multiple Sclerosis eFolder System Overview
2.1 Introduction to imaging informatics based electronic patient record
2.2 MS eFolder design overview
2.2.1 System Diagram and Workflow/Dataflow Profile
2.2.2 System Components Overview
Chapter 3 CAD Module and MS Lesion Detection and Quantification Algorithm
3.1 CAD purpose and architecture
3.2 Lesion detection methods
3.2.1 First-iteration of CAD – based on KNN
3.2.2 Second-iteration of CAD – based on Expectation Maximization
3.2.3 Manual contouring
3.3 Post-processing
3.3.1 3-D lesion quantification
3.3.2 Calculation of Brain Parenchymal Fraction
3.3.3 Lesion localization and normalization for longitudinal tracking
Chapter 4 Multiple Sclerosis eFolder Database Module
4.1 Overall Database Schema Design
4.2 Patient Demographic Database
4.3 Patient Imaging Database
4.4 Lesion Quantification Results Database
Chapter 5 MS eFolder Graphical User Interface and System Design
5.1 Overall GUI design and Architecture
5.2 GUI Functionality and Features
5.2.1 Patient Demographic and Medical Data display
5.2.2 Data Input
5.2.3 Query/Retrieve
5.3 DICOM-based image viewer
5.4 Disease tracking and data analysis tools
5.5 Integration of MS eFolder within the clinical environment with IHE protocol
5.6 DICOM-SR for MS post-processing data
Chapter 6 Data Collection and System Evaluation
6.1 Data Collection
6.1.1 Introduction to data types collected
6.1.2 Institutional Review Board Approval for Human Subjects data collection
6.1.3 Data Selection Criteria
6.1.4 Patient clinical data
6.1.5 Imaging data
6.2 Evaluation of Lesion Quantification results
6.2.1 Evaluation methodology
6.2.2 Acquisition of manual segmentation
6.2.3 Evaluation results
6.3 Evaluation of longitudinal lesion tracking methodology
6.3.1 Evaluation methodology
6.3.2 Evaluation results and tracking results on web-based GUI
6.4 Evaluation of data mining capabilities
6.4.1 Lesion volume differences between Hispanic and Caucasian patients
6.4.2 BPF changes between Hispanic and Caucasian female patients
6.5 DICOM-SR Evaluation
6.5.1 DICOM-SR transfer protocol evaluation
6.5.2 DICOM-SR viewer
Chapter 7 Clinical significance of the MS eFolder: Use-case scenarios
7.1 Scenario 1: Clinical workflow improvement with the MS eFolder System
7.2 Scenario 2: Tracking patient’s progress over time
7.2.1 Background
7.2.2 Data mining and decision support
7.2.3 Tracking patient’s response to new treatment
7.3 Scenario 3: Comparing lesion volume for Hispanic and Caucasian patients
7.3.1 Subject recruitment and data collection
7.3.2 Data analysis and results
7.3.3 eFolder’s Role in Big Data analysis in Medical Imaging
Chapter 8 Current Status and Future Work
8.1 Completed components and features of MS eFolder
8.2 Future work
8.2.1 Multi-site big data storage and analysis
8.2.2 Future work needed for deployment for clinical use
REFERENCE
LIST OF PUBLICATIONS
List of Tables
Table 6.1 Evaluation results for the EM algorithm
Table 6.2 Simulated lesion growths for 4 longitudinal studies for 4 years.
Table 6.3 Summary and comparison of eFolder system evaluation
List of Figures
Figure 1.1 Three axial brain images of an MS patient.
Figure 2.1 MS eFolder system diagram.
Figure 2.2 MS eFolder workflow diagram with IHE postprocessing profile.
Figure 3.1 CAD Preprocessing workflow
Figure 3.2 Effects of image rotation via midsagittal line (in red).
Figure 3.3 Left: Realigned FLAIR image. Middle: Brain mask. Right: Segmented brain parenchyma
Figure 3.4 Workflow of lesion voxel classification.
Figure 3.5 3-dimensional features space from 4 sample training sets.
Figure 3.6 MS CAD results:
Figure 3.7 Workflows of the two CAD algorithms developed in the eFolder project.
Figure 3.8 26-connectivity diagram.
Figure 3.9 Workflow of MS lesion registration in longitudinal studies.
Figure 4.1 The MS eFolder patient data model.
Figure 4.2 General database schema for the MS eFolder
Figure 4.3 DICOM data model structure
Figure 5.1 Screenshot of comprehensive MS eFolder web-based GUI.
Figure 5.2 Close-up of the left module displaying patient’s MS history.
Figure 5.3 Data input page for MS eFolder
Figure 5.4 Data lookup page for MS eFolder.
Figure 5.5 The design and layout of the customized eFolder image viewer.
Figure 5.6 Scatter plot of disease duration vs. 3-D lesion volume for Hispanic and Caucasian MS patients.
Figure 5.7 Lesion volume tracking of the 4 patients over 4 studies per patient, with an example of mouse-over that shows statistics of that study.
Figure 5.8 IHE post-processing workflow diagram
Figure 5.9 MS eFolder workflow diagram with IHE postprocessing profile.
Figure 5.10 Status tracking page for MS eFolder post-processing workflow.
Figure 5.11 Status tracking page for MS eFolder post-processing workflow.
Figure 5.12 Workflow diagram for DICOM-SR conversion and display.
Figure 5.13 XML document to store MS CAD analysis results based on DICOM SR template
Figure 6.1 First out of 6 pages of Multiple Sclerosis questionnaire to collect patient data
Figure 6.2 Total lesion volumes of 36 Caucasian studies.
Figure 6.3 Total lesion volumes of 36 Hispanic studies.
Figure 6.4 Total lesion volumes of 36 Caucasian studies.
Figure 6.5 Total lesion volumes of 36 Hispanic studies.
Figure 6.6 Three consecutive normalized brain atlas axial slices with lesion contour overlays.
Figure 6.7 Graphical representation of longitudinal tracking of the four patients with simulated data from the first scenario.
Figure 6.8 MS eFolder’s graphical tool for tracking volume changes of a single lesion for patient C011.
Figure 6.9 Scatter plot of MS disease duration versus 3-D lesion volume from data mining.
Figure 6.10 Scatter plot of MS Disease duration versus brain parenchymal fraction between Hispanic female patients and Caucasian female patients.
Figure 6.11 Network diagram of DICOM-SR data transfer evaluation.
Figure 6.12 Screenshot of comprehensive MS eFolder web-based GUI.
Figure 6.13 The design and layout of the customized eFolder image viewer.
Abbreviations
CA Caucasian American
CAD Computer-aided detection/diagnosis
CIS Clinically Isolated Syndrome
CNS Central nervous system
COSTAR Computer Stored Ambulatory Record
CSF Cerebrospinal fluid
DICOM Digital Imaging and Communications in Medicine
EDSS Expanded Disability Status Scale
EM Expectation Maximization
ePR/eMR electronic patient record/electronic medical record
FLAIR fluid-attenuated inversion recovery
Gd gadolinium
GM Grey matter
GUI Graphical User Interface
HA Hispanic American
HCCII Healthcare Consultation Center II
HKHA Hong Kong Hospital Authority
HTML Hypertext markup language
HTTP Hypertext transfer protocol
IRB Institutional Review Board
KNN K-nearest neighbors
LAC Los Angeles County Hospital
MRI Magnetic resonance imaging
MS Multiple Sclerosis
MSP midsagittal plane
PACS Picture archiving and communication system
PHP PHP Hypertext Preprocessor
PI Principal Investigator
PPMS Primary-progressive multiple sclerosis
ROI Region of interest
RRMS relapsing-remitting multiple sclerosis
SC Secondary Capture
SNOMED-CT Systematized Nomenclature of Medicine – Clinical Terms
SPMS Secondary-Progressive multiple sclerosis
SR Structured Reporting
UID unique identifier
US United States
USC University of Southern California
VistA the Veterans Health Information Systems and Technology Architecture
WM White Matter
XML Extensible markup language
Abstract
Purpose: Multiple Sclerosis (MS) is a degenerative neurological disease that affects over 2.3
million people worldwide. Symptoms vary greatly and can be disabling and life-threatening in the
most severe cases. MRI is currently used to identify MS lesions in the brain and to assess the
affected regions and the severity of MS. The goals of treating MS are to prolong patient life, to limit
relapses and attacks, and to improve patients’ quality of life. It is essential to track MS
patients’ disease progression in longitudinal studies. However, disease tracking can be difficult
because of intra- and inter-observer variability when MS lesion volumes are calculated manually.
Therefore, a comprehensive, integrated patient record system, called the MS eFolder, has been designed
and developed to store and display patient data, perform quantitative imaging analysis, and provide
big-data analysis tools. The MS eFolder aims to aid in three areas: disease tracking, decision
support, and data mining for both clinical and research purposes.
Method: The MS eFolder includes three components: the eFolder database, the computer-aided
detection (CAD) algorithm, and a web-based graphical user interface (GUI). The three components
are integrated and interconnected for data mining and data retrieval. The eFolder
database stores patients’ demographic and clinical data, DICOM-based imaging data, and
quantitative data such as lesion volume, number of lesions, lesion locations, and brain parenchymal
volumes. The MS CAD system is built on the Statistical Parametric Mapping (SPM) toolkit
in MATLAB. The primary results of the CAD algorithm include the number of lesions, lesion
volumes, and brain parenchymal volumes. Further post-processing of lesion contours identifies the
location of each lesion, performs lesion registration, and tracks lesion volume changes in longitudinal
studies. CAD data are converted into DICOM-SR format for storage. A web-based GUI is designed
to display patients’ clinical information, imaging data, CAD data, and other data analysis results.
The eFolder system can be readily installed in a clinical environment based on the IHE post-processing
integration protocol.
Summary: The MS eFolder system has been designed and developed as an integrated data storage
and mining solution for both clinical and research environments, while providing unique features
such as quantitative lesion analysis and disease tracking over a longitudinal study. The
comprehensive database of integrated image and clinical data in the MS eFolder provides a
platform for treatment assessment, outcomes analysis, and decision support. The proposed system
also serves as a platform for future quantitative analysis derived automatically from CAD algorithms,
which can be integrated within the system for individual disease tracking and future MS-related
research. Ultimately, the eFolder provides a decision-support infrastructure that can eventually be
used as an add-on to the overall electronic medical record.
Chapter 1 Introduction of Multiple Sclerosis
1.1 Clinical Background
Multiple Sclerosis (MS) is a demyelinating disease in which the patient’s central nervous system
(CNS) degenerates, causing inflammation and brain and spinal atrophy. [1] The disease is caused
by autoreactive T-cells attacking myelin in the CNS, which impairs nerve signal conduction.
[2,3] As a result, MS patients suffer from cognitive impairment, loss of vision, dementia,
fatigue, and loss of bowel/bladder control with varying severity. [1,2] The scarred tissues left over
from the autoimmune attacks are called lesions, or plaques, in the CNS white matter [3]. As the
disease progresses, the effect on a patient’s life can be devastating. Disability can continue to
progress until an effective treatment regimen is developed for an individual patient. Direct medical
costs and associated assistive and rehabilitative expenditures can quickly reach tens of
thousands of dollars per year.
In the United States, there are currently around 400,000 patients diagnosed with MS, with
approximately 10,000 new patients diagnosed annually [4]. There is currently no cure for MS
because of the irreversible axonal injuries and the complex nature of what triggers the
autoimmune attacks. [2,3] It is believed that a combination of genetic factors, environmental
factors, relations with other diseases, and bacterial or viral infections contributes to the cause
of MS. [3,5] In addition, MS manifests itself differently among different ethnicities [6,7],
with differing prevalent symptoms, disabilities, MS lesion locations, and responses to treatments.
These factors make MS a complex disease, and many research projects are currently under
way to identify disease causes, treatments, and other general knowledge about MS.
There are four types of MS according to disease progression and nature:
clinically isolated syndrome (CIS), relapsing-remitting (RR), primary progressive (PP), and
secondary progressive (SP). [8,9] CIS is defined as the first episode of neurological symptoms
caused by demyelination in the central nervous system. It includes symptoms consistent with
MS but does not meet the diagnostic criteria for MS.
The two most common types of MS are RR and PP. Patients with RRMS experience clearly defined
attacks and worsening of neurologic functions over time. Each attack, or relapse, leaves the patient
in a worse condition than before. The disease condition, however, does not change between attacks.
Most MS patients (85%) are of the RR type. Patients with PPMS experience, from disease onset,
a gradual worsening of neurological functions over time, without attacks. Approximately 10% of
MS patients are of this type. SPMS patients first experience a relapsing-remitting stage of MS,
after which their disease course becomes a gradual worsening instead of intermittent periods of attacks
and stable conditions. [8,9] Other classifications of MS include optic-spinal MS, where affected
areas concentrate in the spinal cord rather than the cerebral cortex, and benign MS, a rare form
of relapsing-remitting MS with mild, infrequent attacks and little disease progression over 10 to
15 years.
For the purpose of this work, data from RR and some optic-spinal MS patients are collected for
the construction and validation of the imaging informatics-based MS system, because considering
all MS types at once would be too broad and complex. The system, however, has been designed
to store data for all types of MS patients.
1.1.1 Diagnosis and Treatment of Multiple Sclerosis
Because of the complexity of the disease, there is no simple test for diagnosing MS. Instead,
diagnosis of MS relies on a combination of patient history, neurological exams, laboratory tests,
and radiological exams. [3,10] Criteria for categorizing multiple sclerosis are based on the principle
of dissemination of lesions in time and space, which means that there must be at least two attacks
separated by a set amount of time (at least one month, as recommended by McDonald et al. [10]),
and at least two lesions appearing in different places in the central nervous system (CNS) [11].
Established by the International Panel on the Diagnosis of MS in 2001, the McDonald criteria have
been the more widely accepted diagnostic criteria for MS, based on brain imaging, extent of symptoms,
and duration of symptoms. [12] According to the criteria, an “attack” (or relapse) is defined as a
historical or current acute inflammatory demyelinating event in the CNS with a duration of at least
24 hours, without fever or infection. A current attack should be corroborated by neurological
examination findings, and a historical attack can be evidenced by multiple episodes of paroxysmal
symptoms or MRI findings of demyelination in the area of interest. For a “definite” diagnosis of
MS, a patient must have at least two documented attacks with clinical evidence of at least 2
lesions, or clinical evidence of 1 lesion with evidence of a prior attack. If a patient has a partial
clinical presentation of MS but not enough for a “definite” diagnosis, additional data are needed for
further evaluation. The additional data include, but are not limited to, lesion dissemination in space
and/or time, demonstrated by one or more lesions in T2/Gadolinium(Gd)-enhanced T1 MR scans
of the CNS; lesions appearing in at least 2 of 4 MS-typical regions of the CNS (periventricular,
juxtacortical, infratentorial, or spinal cord); at least one year of disease progression; and other
examinations such as follow-up MRI, awaiting second attacks, and other neurological and
radiological findings. The advantage of the McDonald criteria is their ability to incorporate patient
history, neurological exam results, and MRI findings. The various data needed for MS diagnosis
come from different sources, and thus it is important for physicians to have access to complete
patient records and results in order to give an accurate assessment for MS patients. An electronic
patient record system that integrates data from multiple sources can help the overall clinical workflow of
MS diagnosis, treatment, and disease management.
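To make the decision rules summarized above more concrete, the following is a minimal Python sketch of how they might be encoded. It is only an illustration of the logic as paraphrased in this section, not the published McDonald criteria in full and not part of the MS eFolder. All names (ClinicalPresentation, is_attack, dissemination_in_space, assess) are hypothetical, and the threshold of two clinically evidenced lesions is an assumption of the sketch.

```python
from dataclasses import dataclass, field

# The four MS-typical CNS regions named in the text.
MS_TYPICAL_REGIONS = frozenset(
    {"periventricular", "juxtacortical", "infratentorial", "spinal cord"}
)


@dataclass
class ClinicalPresentation:
    """Hypothetical container for the findings the rules need."""
    documented_attacks: int
    lesions_with_clinical_evidence: int
    prior_attack_evidence: bool = False
    lesion_regions: set = field(default_factory=set)


def is_attack(duration_hours: float, fever_or_infection: bool) -> bool:
    # An attack lasts at least 24 hours and occurs without fever or infection.
    return duration_hours >= 24 and not fever_or_infection


def dissemination_in_space(regions) -> bool:
    # Lesions must appear in at least 2 of the 4 MS-typical regions.
    return len(set(regions) & MS_TYPICAL_REGIONS) >= 2


def assess(p: ClinicalPresentation) -> str:
    # Assumed threshold: two or more clinically evidenced lesions, or one
    # lesion plus evidence of a prior attack, together with two attacks.
    enough_lesions = p.lesions_with_clinical_evidence >= 2 or (
        p.lesions_with_clinical_evidence == 1 and p.prior_attack_evidence
    )
    if p.documented_attacks >= 2 and enough_lesions:
        return "definite"
    return "additional data needed"
```

For example, a presentation with two documented attacks and two clinically evidenced lesions would be labeled "definite" by this sketch, whereas one attack with one lesion would fall into the "additional data needed" branch, where imaging evidence such as dissemination in space becomes relevant.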
Treatment strategies for MS typically involve modifying the disease course, reducing the number and
severity of attacks, preventing relapses, managing symptoms, and providing physiological support.
Different patients require personalized treatment strategies because MS manifestation varies among
patients. Interferon beta-1b and glatiramer acetate, two of the most common treatment
options for MS, are immunomodulating agents intended to suppress inflammation and immune
responses. Their effects include a reduced relapse rate and slowed disease progression. [13] Other
immune-suppressing agents, such as mitoxantrone, natalizumab, and dimethyl fumarate, can also
be prescribed according to patients’ responses to other treatments. [13] Symptom-specific treatment
options target symptoms such as fatigue, spasticity, depression, cognitive impairment, and others.
Physical rehabilitation is needed for patients suffering from disabilities. MS treatments depend heavily
on a patient’s conditions and symptoms, and therefore a thorough patient profile can help
physicians make the most informed decisions in prescribing treatment and rehabilitation options. A
comprehensive electronic patient record system for MS can use prior patient data to help with
treatment strategies and with tracking patients’ responses to treatments.
1.1.2 Medical imaging for Multiple Sclerosis
As mentioned above, the use of MR imaging has become a standard practice in both diagnosing
and monitoring multiple sclerosis because it reveals multiple lesions in the CNS of patients.
MRI offers three major benefits [14]:
1. MRI results, combined with characteristic symptoms, provide earlier and more confident
diagnosis as per the McDonald criteria
2. MRI supports multiple sclerosis training in neurology and radiology, contributing to the
understanding of the pathophysiology of MS and of how pathophysiological changes relate to
clinical manifestations of the disease
3. MRI results can be used to monitor treatment effects for both clinical and research purposes
The purpose of an MRI scan is to visually locate lesions, or scarred axons and tissues, in the central
nervous system. MS lesions appear hypointense in T1-weighted scans and hyperintense in
T2-weighted and FLAIR (fluid-attenuated inversion recovery) scans. Location and morphology of lesions
are also used to identify lesions caused by multiple sclerosis [14]. Figure 1.1 shows the three
mentioned axial slices of a multiple sclerosis patient’s brain.
Figure 1.1 Three axial brain images of an MS patient. The left-most image is T1-weighted,
the middle image is T2-weighted, and the right-most image is FLAIR. Hyperintense
regions in the FLAIR image indicate MS lesions in white matter.
The purpose of the FLAIR series is to suppress the intensity of fluids (such as CSF), thereby
highlighting periventricular lesions [12]. In the FLAIR image shown in Figure 1.1, brighter pixels
in the white matter regions are lesions. However, use of FLAIR axial sequences alone is not
adequate for MS diagnosis and monitoring, since they are susceptible to motion artifacts.
T2-weighted images are the most sensitive but have low specificity, and are thus used to accurately
measure the sizes of identified lesions. Gadolinium-enhanced T1-weighted sequences show active plaques
and blood-brain barrier damage, and increase the specificity of MS lesion detection [14]. The current
recommended MR scanning protocol for diagnosing MS is as follows [15]:
• Dual-echo, T2, and FLAIR axial of whole brain
• Optional dual-echo or FLAIR sagittal midline
• Both unenhanced and gadolinium-enhanced T1 scans
• Optional DWI (diffusion-weighted imaging)
Other factors, such as image slice thickness, magnetic field strength of the MR scanner, and dosage of
contrast agent, also affect detection and visualization of MS lesions. In my research, T1-w, T2-w,
and FLAIR axial images are collected because they are among the most basic images included in most
scanning protocols.
MS disease progress can be monitored by subsequent MR scans for changes in lesion characteristics.
In general, following initial diagnosis, disease activity is assessed by the number of relapses per
year (relapse rate), accumulation of disability as measured by the expanded disability status scale
(EDSS) [16] and changes in MRI lesion characteristics. The progression of the disease is variable,
and requires routine follow-up imaging studies to document disease exacerbation, improvement, or
stability of the characteristic MS lesions. MS lesion quantification requires a manual approach to
lesion measurement on MRI [17]. Quantifying lesion load is time consuming, and volumetric
calculations with current techniques yield, at most, a rough estimate. The task of tracking
MS lesion changes becomes even more difficult when several longitudinal MR studies of the same
patient (follow-up studies) require quantitative comparison, which is currently not performed in
clinical diagnosis. In addition, it has been shown that MS diagnosis, lesion detection, and lesion
load calculation suffer from inter- and intra-observer variability. [18]
Therefore, lesion detection and tracking could be greatly improved by expressing lesion load as a
quantifiable value using the proposed imaging informatics tools. Utilizing computer-aided
detection (CAD) methods is considered ideal for monitoring MS activity as defined by MRI lesion
load. [19]
1.2 Clinical data and data sources utilized for Multiple Sclerosis
During a patient’s clinical visit, a large amount of data is generated. Physicians observe the
collected data and assess the patient’s progress based on patient history and physician’s personal
knowledge. The amount of data for a patient increases because of repeated follow-up visits in
longitudinal studies. There are two main categories of data discussed here: neurological data and
radiological data.
1.2.1 Neurological data sources
Neurological data is collected by patients’ primary physicians and expert neurologists. It includes
patients’ demographic profiles (age, gender, ethnicity), genetic information, disease history (age of
onset, symptoms, treatments), and other biomarkers (CSF analysis, laboratory results). Depending
on patients’ disease history, clinical visits, and treatment options, the amount of neurological and
demographic data required for thorough documentation can be very large. A typical modern clinic
may have an electronic patient record (ePR) system for storing patient records. However,
information stored in an ePR may not be specific or detailed enough for quantitative data analysis.
For a disease like MS, collecting patient history, building a patient profile, and recording patient
progression become key in treating patients. Complex information and data need to be managed
and stored carefully.
1.2.2 Radiological data sources
MS patients’ radiological data are stored in imaging clinics’ storage systems. In most cases, the
images are archived in a PACS (Picture Archiving and Communication System). Patients’
afflicted organs in the nervous system are scanned with the clinic’s MR imaging modality. The digital
MR images are created and stored in the archive in DICOM format (Digital Imaging and Communications in
Medicine). Radiologists read the acquired studies and note
any significant changes, and the radiology reports are stored in PACS along with the studies. While
each clinic may have a different workflow for image acquisition and storage, all digital
medical images are stored in the DICOM format in medical imaging archives such as PACS. To
retrieve images outside of a clinic’s PACS, the most common solution is to burn
the MR studies onto a Compact Disc (CD) and give the CD to the patient for his or her own use, or
send CDs via mail to a referring physician. This process can be tedious and, in the age of the Internet
and digital communications, is outdated.
1.2.3 Diagnostic Workflow
In the clinical environment, MR cases suspected of MS are read by radiologists or neuroradiologists,
and cases are reviewed and diagnosed by neurologists. While individual readers approach cases
differently, here is how a typical MRI MS case is read:
1. Reader retrieves the current study from their patient/study worklist. If there are any
previous scans of the same patient, they are retrieved for comparison reads.
2. Reader looks at 3 scans: T1-weighted, T2-weighted, and FLAIR axial slices to identify
positive lesions. Readers note lesion locations in FLAIR slices and then cross-reference
with T2-weighted slices for a more accurate volume assessment. T1-weighted studies are used to
check activity of lesions. Readers assess lesion volumes either with a manual ruler or by
gross estimation.
3. Reader checks sagittal and coronal slices for any missed lesions in other views.
4. If available, reader checks enhanced T1-weighted series and diffusion images.
5. Reader makes the final diagnosis, which includes specific lesion locations, number of lesions,
estimated lesion sizes and total lesion load, and rough measurements of brain volume and CSF
volume to check for brain atrophy.
In short, MRI provides a useful diagnostic tool for multiple sclerosis. Its various uses for MS and
its role in the new McDonald diagnostic criteria make MR a necessary tool for multiple
sclerosis. MS lesions can be clearly identified with MR, which makes treatment response
monitoring and correlation of lesion locations with clinical manifestations
possible. However, MS lesions are currently not quantified in 3-D in the clinical diagnostic
workflow. While radiologists and physicians can observe size and shape changes in subsequent
MR studies, the volume changes are not quantified and therefore cannot be accurately tracked for
research and clinical purposes. A computer-aided quantification tool in an electronic patient record
system can record patient progress in terms of quantified 3-D lesion and brain volumes, which can
be key to assessing patient progress.
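As a minimal sketch of what tracking quantified volumes could look like, the following computes the change in total lesion volume between successive studies of one patient. The function name, the tuple layout, the millimetre-cubed unit, and the numbers are assumptions made for illustration; they are not taken from the eFolder implementation.

```python
def lesion_volume_changes(studies):
    """Change in total lesion volume between consecutive studies.

    studies: list of (study_date, total_lesion_volume_mm3) pairs, with
    ISO-formatted dates so that sorting orders them chronologically.
    Returns (study_date, absolute_change, percent_change) per follow-up.
    """
    ordered = sorted(studies)
    changes = []
    for (_, previous), (date, current) in zip(ordered, ordered[1:]):
        # Percent change is undefined when the previous volume is zero.
        percent = (current - previous) * 100.0 / previous if previous else None
        changes.append((date, current - previous, percent))
    return changes


# Made-up volumes for illustration only.
history = [("2015-06-01", 1100.0), ("2015-01-01", 1000.0), ("2016-01-01", 880.0)]
changes = lesion_volume_changes(history)
```

With such a record, a growing or shrinking lesion load becomes a number per follow-up visit rather than a visual impression.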
1.3 MS research based on ethnicity differences
Genetics is one of the leading contributing factors in MS prevalence and disease characteristics.
Genetics, i.e. the presence or absence of specific genes, can trigger MS symptoms and influence
lesion locations and symptoms, the length and severity of attacks, and the appropriate treatments
[20, 21], since several genes have been significantly linked to the disease. Another
possible cause of MS is environmental factors, i.e. geography, pathogen or
chemical exposure, and nutrition [21]. This section summarizes MS
research studies based on ethnicity to offer examples of similarities and differences between
Caucasian, Asian, and African-American MS patients. However, information regarding Hispanic
American MS patients is limited, even though Hispanic Americans have become the largest minority
group in the United States. A recently proposed study in the Los Angeles County area comparing
Hispanic-American and Caucasian MS patients is also discussed. The focus of my
dissertation research has been influenced by this study, particularly in observing disease
differences between Hispanic and Caucasian patients.
1.3.1 Current research on Asian and African American MS patients
MS in the East Asian population is characterized by heavy involvement of the optic nerve and
spinal cord, as well as low susceptibility [22, 23], while African American MS patients have more
severe symptoms and lower susceptibility than Caucasian patients [24, 25]. Several studies, such as
those by Kira (2003) and Shibasaki et al. (1981), have shown that while the overall prevalence of MS in
Asia is lower than that in Europe and North America, opticospinal MS prevalence is significantly
higher in Asia than in other regions (42% in Japan versus 6% in Britain, for example) [23, 26].
The high prevalence of opticospinal MS in Asia can possibly be attributed to both genetic and
environmental factors. Genetic studies have shown that opticospinal MS patients have different
genes present [27]; on the other hand, the ratio of opticospinal MS patients to
conventional MS patients increases as latitude decreases in Asia [21]. In general, African American (AA)
patients have a lower MS prevalence rate than Caucasian (CA) patients. [6] The MS prevalence rate for native
Africans is even lower, with the first well-documented MS case in the black South African population
reported as late as 1987 [25,28,29]. Studies conclude that both genetic and geographical factors play roles
in causing MS, and also suggest that AA patients are more susceptible than native Africans because most
African Americans today are of mixed racial background with European descent, and Europeans are more
susceptible to MS [6]. AA patients have a more aggressive disease course, as evidenced by a higher lesion
volume in T1-weighted and T2-weighted MRI findings [30]. They have a higher risk of ambulatory
disabilities and a higher rate of opticospinal MS [28].
1.3.2 MS research proposal on Hispanic American patients
For both Asian and African American MS patients, there is a long history of published studies and
research. However, there has not been a similar interest in studying Hispanic American
(HA) patients. In this section, the current status of research for Hispanic MS is summarized.
MS awareness has been raised in Latin America [31]. At the same time, there is substantial
motivation to study MS in HA patients because they are the largest minority group in the US, and MS
prevalence for HA is reported to be increasing. [32] MS-related studies in Latin America have
shown that Hispanic patients have a higher rate of opticospinal MS and a lower age of MS onset,
similar to Asian MS patients. On the other hand, because of European ancestry, MS patients in
Latin America also possess the same genes that have been found to be related to MS. [32]
To find out whether HA MS patients progress differently than patients of other races, there is a need for
large-scale longitudinal research studies that focus on HA patients. A pilot study by Amezcua et
al. [32] in the Los Angeles County area suggests that HA patients have a higher rate of
relapsing-remitting MS and a lower age of onset than CA and AA patients. There is also a comparison
study between MS patients who were born in the US and those who migrated to the US after age 15,
which found that optic neuritis is more common in US-born HA patients than in immigrant HA
patients. US-born patients also have a lower age of onset than immigrant patients. The results show
that certain trends distinguish HA patients from CA and AA patients, and the results
may aid in better diagnosis and treatment for HA MS patients in the US.
1.4 Current challenges of MS research and treatment
There are many challenges for the diagnosis and treatment of multiple sclerosis, both clinically and
for research.
1.4.1 Data integration challenges
As described before, the key to managing MS is continuous follow-up appointments
and exams to analyze the disease course, and use of the best available treatment for the
disease profile. Therefore, clinicians need to analyze a large amount of data, from both neurology
and radiology, for complete patient disease profiles. To retrieve data from both neurology
and radiology departments, physicians may have to access separate medical record systems.
Neurologists may not have images readily available, and on the other hand, radiologists may not
have patient history readily available for diagnosis and reading. Therefore, a centralized record
system for MS may provide users with complete data sets for easier and speedier diagnosis and
follow-up.
1.4.2 Data analysis challenges
Several features are used to determine MS progression and disease profile, such as lesion
locations and volumes, changes in lesion sizes, and new lesions that may relate to
new or existing symptoms. For that purpose, radiologists may attempt to measure and quantify lesion
contours and volumes.
MS lesion quantification on MRI requires a manual approach to lesion measurement [17]. Quantifying
lesion load is time consuming, and volumetric calculations with current techniques
yield, at most, a rough estimate. The task of tracking MS lesion changes becomes even more
difficult when several longitudinal MR studies of the same patient (follow-up studies) require
quantitative comparison, which is currently not performed in clinical diagnosis. In addition, it has been
shown that MS diagnosis, lesion detection, and lesion load calculation suffer from inter- and
intra-observer variability. [18] Therefore, lesion detection and tracking could be greatly improved by
expressing lesion load as a quantifiable value using the proposed imaging informatics tools. Utilizing
computer-aided detection (CAD) methods is considered ideal for monitoring MS activity as defined
by MRI lesion load. [19]
In addition, it is currently impossible for physicians to quickly and effectively analyze the relationship
between MRI results and patient history. To relate visual changes observable on MRI
to symptom progression and disease history, there is a need for an integrated system with data
analysis modules.
1.4.3 System integration and implementation challenges
Clinical systems communicate using specified protocols. For images, the standard is DICOM.
For text data in ePRs, the standard is HL7. When data is collected from different systems to form
a complete patient profile, the output data formats and protocols may be incompatible. Users may
need to convert data from other sources into a readable format, which is time- and
resource-consuming. Data collected in non-standardized ways, such as survey forms or Excel sheets, must
also be converted and categorized for storage and convenient data retrieval.
1.4.4 Proposed solution: MS eFolder
To build a complete system to aid in multiple sclerosis diagnosis, treatment, and research, I am
proposing the development of an ethnically-diverse imaging informatics-based system, called MS
eFolder, designed specifically for multiple sclerosis. The MS eFolder provides a database solution
for storing patients’ clinical information, an effective way for accessing patients’ MR images and
associated data, and also provides an automatic lesion quantification tool. By integrating these
components, the eFolder system can provide a data repository for treatment planning, correlate MR
lesion quantification data with clinical manifestations more effectively, and provide a data mining
tool for research purposes. For large-scale ethnicity-based MS research, there is a need for a large-
scale multi-ethnic patient database system, such as the MS eFolder system, that is designed to store,
display, and analyze research-related patient data. To facilitate MS-related research, the
eFolder system must be designed specifically for storing MS patient data, be flexible and
comprehensive for data mining, support querying and retrieving MR images, and include a lesion
quantification and localization algorithm to relate lesion load to clinical information in a more
standardized way.
The goal of the eFolder system is to provide an electronic patient record system that provides
comprehensive data storage, analysis, and display solutions for multiple sclerosis clinical care and
research, and a framework for system integration in the existing clinical environment.
Chapter 2 Multiple Sclerosis eFolder System Overview
2.1 Introduction to imaging informatics based electronic patient record
In order to build a complete system to aid in multiple sclerosis diagnosis, treatment, and research,
I have designed and developed the Multiple Sclerosis eFolder, an ethnically-diverse imaging
informatics-based system. The MS eFolder provides a storage solution for patients’ clinical
information and disease history, the ability to access patients’ MR images and associated data
effectively, and computer-aided data analysis tools that provide information that was previously
unavailable. By integrating these components, the eFolder system can provide a data repository for
treatment planning, correlate MR lesion quantification data with clinical manifestations more
effectively, and provide a data mining tool for research purposes.
The concept of the eFolder is directly derived from ePRs, or electronic patient records. An ePR is a digital,
comprehensive patient database that stores patients’ demographic information, medical history, and
any other information required by the ePR’s design. Several ePR systems have been
designed and implemented in clinical environments, such as VistA (The Veterans Health
Information Systems and Technology Architecture) used in the United States VA (Veterans Affairs)
hospitals [33], HKHA (Hong Kong Hospital Authority) [34], and COSTAR (Computer Stored
Ambulatory Record) in Massachusetts General Hospital [35]. In addition, an ePR system can also
be designed to fit specific clinical applications, such as surgery and radiation therapy [36, 37].
The MS eFolder is unique because it is specifically designed for patients diagnosed with multiple
sclerosis. The system is required to store relevant data from various sources that are related to the
particular disease and essential for diagnosis, treatment, and research. This allows the eFolder
system to have a more specialized design and development process, and in turn it can provide more
specific utility for the specialty of MS.
The concept of a disease-centric imaging informatics system is unique because of its functions,
integration of multi-departmental data, and potential for new quantitative data analysis. While the
MS eFolder is designed specifically for usage in imaging-related multiple sclerosis treatment and
research, its approach and design can be applied to the development of other disease-centric imaging
informatics systems that utilize clinical imaging data for disease treatment and management.
2.2 MS eFolder design overview
In this section, I will briefly highlight the design of the MS eFolder system, with a brief overview of
each component. The details of design and development of the system are presented in the
following chapters.
2.2.1 System Diagram and Workflow/Dataflow Profile
Figure 2.1 shows the system components of the MS eFolder system.
Figure 2.1 MS eFolder system diagram. Solid line represents dataflow of text data, dotted
line represents image dataflow, and dashed line represents post-processing dataflow.
The diagram shows the setup and communication protocols between the components of the eFolder
system. To create or update a patient profile, a user can enter clinical and patient data into the
database via a web-based form. The web server hosts both the database and the system interface, and the user
accesses the user interface for data retrieval. Different modules on the web interface provide various
data mining and analysis results.
For image acquisition and storage, a user may upload DICOM images into the system, or the image
source may transmit images directly into the system via the DICOM protocol. The DICOM parser
extracts image metadata and stores it in the database, and sends metadata and images to both the CAD
module for processing and the image repository for storage. After the CAD process is completed,
detection and quantification data are both stored in the database and converted to DICOM-SR
(structured reports), which are then sent to the image repository for storage.
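The image dataflow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the eFolder code: a DICOM object is modeled as a plain dictionary keyed by standard DICOM attribute keywords, the three stores are in-memory lists, and the names `EFolderStores`, `receive_image`, `store_cad_result`, and `total_lesion_volume_mm3` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EFolderStores:
    """Hypothetical in-memory stand-ins for three eFolder components."""
    database: list = field(default_factory=list)          # metadata and CAD rows
    image_repository: list = field(default_factory=list)  # archived objects
    cad_queue: list = field(default_factory=list)         # work items for CAD


def receive_image(stores, dicom_object):
    """Route one incoming DICOM object, modeled here as a plain dict."""
    # Keep a few standard DICOM attributes as searchable metadata.
    keys = ("PatientID", "StudyInstanceUID", "SeriesDescription")
    metadata = {key: dicom_object[key] for key in keys}
    stores.database.append({"type": "image", **metadata})
    stores.cad_queue.append((metadata, dicom_object))  # to the CAD module
    stores.image_repository.append(dicom_object)        # to the archive
    return metadata


def store_cad_result(stores, metadata, total_lesion_volume_mm3):
    """Save a CAD result as a database row and as an SR-like object."""
    result = {**metadata, "total_lesion_volume_mm3": total_lesion_volume_mm3}
    stores.database.append({"type": "cad", **result})
    structured_report = {"Modality": "SR", **result}
    stores.image_repository.append(structured_report)
    return structured_report


stores = EFolderStores()
meta = receive_image(stores, {"PatientID": "P001", "StudyInstanceUID": "1.2.3",
                              "SeriesDescription": "AX FLAIR", "PixelData": b""})
report = store_cad_result(stores, meta, 1234.5)
```

The point of the sketch is the fan-out: each image is registered once in the database and then handed to both the CAD module and the repository, and each CAD result is written twice, once as queryable rows and once as a structured report.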
The eFolder system workflow is designed to be seamlessly integrated into a clinical environment.
The key for this workflow design is to obey current clinical data and data transfer standards (i.e.
DICOM and HL7) and to have clearly defined system components that interface with other clinical
informatics system components and equipment. The design of the eFolder system workflow is
modeled after the IHE (Integrating the Healthcare Enterprise) post-processing workflow profile.
Figure 2.2 shows how the eFolder system may impact a clinical workflow.
Figure 2.2 MS eFolder workflow diagram with IHE postprocessing profile. The blue circle
indicates all of the components included in the eFolder. Steps 1 through 4 indicate the
order of workflow of the demonstration.
The eFolder system is integrated within a typical IHE post-processing workflow profile. The
system workflow therefore does not affect normal radiological data acquisition, storage, and
display. If the scanned MRI study is for MS analysis, an additional copy of the study is sent to the
eFolder system for analysis. Any other forms and figures can be uploaded into the eFolder system
via web-based user interface. System workflow design methodology and explanation are described
in detail in Chapter 5.
2.2.2 System Components Overview
The eFolder system is designed with three main software components: an automatic lesion
detection and quantification toolkit, eFolder system database, and a web-based graphical user
interface.
Computer-aided detection (CAD) Module
To address the inter- and intra-observer variability of MS lesion quantification in MRI and
to ease the lesion tracking process, a computer-aided detection (CAD) module for MS
white matter lesions has been designed. Its advantages include a standardized quantification
process, support for MS lesion tracking in longitudinal studies, and quantifiable results for
clinical trials and research. In addition, the CAD toolkit converts quantification data to structured
reports and identifies and registers lesion locations for tracking. The CAD system is integrated
with the database, and thus the results are stored and patient data can be queried via lesion
characteristics. This module is the centerpiece of the ePR system since it is crucial for MS disease
tracking and treatment management within the clinical workflow. The detailed CAD and
quantification methodology is presented in Chapter 3.
Database Module
The database is designed to store the patients’ information, such as their demographic information,
general medical history, MS history, treatments, and any other relevant data collected by clinicians
and researchers. The database also links patient profiles to medical images (in this case, patients’
brain and spinal MRI studies) to aid in diagnosis and disease tracking. The eFolder is designed to
have its own DICOM-based image repository; therefore, the database also handles DICOM-based
metadata. CAD results are also stored in the database for efficient retrieval. The MS eFolder data model
and database design are presented in Chapter 4.
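To make the linkage between patient profiles, imaging studies, and CAD results concrete, here is a small sketch using SQLite. The three-table layout, the column names, and the inserted rows are hypothetical and chosen only for illustration; the actual eFolder data model may differ.

```python
import sqlite3

# Hypothetical three-table layout; the actual eFolder data model may differ.
SCHEMA = """
CREATE TABLE patient (
    patient_id   TEXT PRIMARY KEY,
    ethnicity    TEXT,
    age_of_onset INTEGER
);
CREATE TABLE study (
    study_uid  TEXT PRIMARY KEY,
    patient_id TEXT REFERENCES patient(patient_id),
    study_date TEXT
);
CREATE TABLE cad_result (
    study_uid               TEXT REFERENCES study(study_uid),
    lesion_count            INTEGER,
    total_lesion_volume_mm3 REAL
);
"""


def patients_with_lesion_volume_above(conn, min_volume_mm3):
    """Return IDs of patients having any study above the given lesion volume."""
    rows = conn.execute(
        """SELECT DISTINCT p.patient_id
           FROM patient p
           JOIN study s ON s.patient_id = p.patient_id
           JOIN cad_result c ON c.study_uid = s.study_uid
           WHERE c.total_lesion_volume_mm3 > ?
           ORDER BY p.patient_id""",
        (min_volume_mm3,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up rows for illustration only.
conn.executemany("INSERT INTO patient VALUES (?, ?, ?)",
                 [("P001", "Hispanic", 28), ("P002", "Caucasian", 35)])
conn.executemany("INSERT INTO study VALUES (?, ?, ?)",
                 [("1.2.3", "P001", "2015-01-10"), ("1.2.4", "P002", "2015-02-20")])
conn.executemany("INSERT INTO cad_result VALUES (?, ?, ?)",
                 [("1.2.3", 12, 5400.0), ("1.2.4", 3, 800.0)])
high_load = patients_with_lesion_volume_above(conn, 1000.0)
```

Keeping CAD results in their own table keyed by study is what allows patient data to be queried by lesion characteristics.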
Web-based graphical user interface (GUI) Module
A GUI is needed for patient data viewing, input, management, and querying. It is web-based
so that the system can be accessed via Hypertext Transfer Protocol (HTTP) and Internet
connections. The advantage of the web-based GUI is that users can access the system with a variety of
devices at both on-site and remote locations. The GUI provides comprehensive patient
data display, including diagnostic MR images and CAD results. The GUI also provides
post-processing data analysis, including disease trend identification and data mining toolkits. The
overall GUI design and features are presented in Chapter 5.
Chapter 3 CAD Module and MS Lesion Detection and Quantification Algorithm
3.1 CAD purpose and architecture
Computer-aided detection of MS lesions, including lesion identification, 3-dimensional clustering
and volume quantification, anatomical location identification, and brain volume segmentation, has
been developed for inclusion in the MS eFolder system. The purpose of the CAD module is to
integrate imaging information with clinical information for a complete patient profile. This
eliminates the need for radiologists to contour lesions and calculate lesion volumes, a task that can be
time and labor intensive. The CAD results allow tracking of lesion volume changes over time, which
can be key to visualizing and managing an MS patient’s progress and can aid in treatment decisions and
outcome analysis.
The MS CAD algorithm is designed to output lesion volumes, lesion locations, and total lesion load.
The detailed algorithm design is split into three parts: preprocessing, lesion voxel identification
by probability thresholding, and lesion quantification. The algorithm is prototyped in
MATLAB.
3.2 Lesion detection methods
Throughout the design and development of the MS eFolder, there have been two CAD algorithms:
one based on k-nearest-neighbors (KNN) principles, and one based on expectation-maximization (EM)
methods. This section details both methodologies.
3.2.1 First-iteration of CAD – based on KNN
The KNN-based algorithm is designed for 3-D MRI brain images. It uses T1, T2, and FLAIR
(fluid-attenuated inversion recovery) axial slices of 3 mm or 5 mm thickness. The algorithm converts the
MR images into a three-dimensional matrix for analysis.
Preprocessing
Figure 3.1 shows the preprocessing workflow, including segmentation modules and required input
files. The first step in image preprocessing is to read and extract the input images: T1-weighted,
T2-weighted, and FLAIR axial slices. The images are loaded into a 3-D matrix in MATLAB.
Intensity values are normalized to 256 grey levels (8 bits). The resolutions of the three series are
examined and normalized.
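As an illustration of the intensity normalization step, the sketch below rescales a volume linearly to the 0 to 255 range. The text does not specify the rescaling rule, so min-max scaling is an assumption here, as are the function name and the use of nested Python lists in place of a MATLAB matrix.

```python
def normalize_to_8bit(volume):
    """Linearly rescale a 3-D nested list (slice, row, column) to 0..255.

    Min-max scaling is an assumed rule; other mappings are possible.
    """
    values = [v for image in volume for row in image for v in row]
    low, high = min(values), max(values)
    if high == low:
        # A constant volume carries no contrast; map everything to 0.
        return [[[0 for _ in row] for row in image] for image in volume]
    return [[[round((v - low) * 255 / (high - low)) for v in row]
             for row in image] for image in volume]


# Made-up intensities for a volume of 1 slice, 2 rows, 2 columns.
scaled = normalize_to_8bit([[[0, 10], [40, 40]]])
```

Bringing the three series to the same grey-level range matters later, because the voxel classifier compares intensities across T1, T2, and FLAIR directly.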
Figure 3.1 CAD Preprocessing workflow
After the three sequences are of uniform resolution, the second step is to realign the images such
that they are of uniform orientation. Both resolution standardization and realignment are required
to spatially register lesions and brain anatomy, since the completed lesion classification algorithm
requires voxel intensities and spatial information of T1, T2, and FLAIR images. The images are
realigned according to the midsagittal plane (MSP). The MSP is defined as a plane formed from
the interhemispheric fissure line segments having the dominant orientation [38], and the MSP is
identified by localizing the fissure line segments in the 3-D brain data set. From the identified
MSP, rotation angles of the brain (along the x and y planes) are calculated. Due to the low resolution
of the axial data set in the z direction, the rotation angle in that direction is not calculated.
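For intuition, the in-plane rotation angle can be estimated from fissure line segments with basic trigonometry. The sketch below is a simplification and not the method of [38]: it assumes each axial slice yields one fissure segment given as two (x, y) endpoints ordered so that y increases, and it takes the median tilt as a stand-in for the dominant orientation. All function names are hypothetical.

```python
import math
import statistics


def segment_tilt_degrees(p0, p1):
    """Angle of segment p0 -> p1 measured from the image's vertical (y) axis.

    Assumes the endpoints are ordered so that y increases from p0 to p1.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    return math.degrees(math.atan2(dx, dy))


def dominant_tilt_degrees(segments):
    """Median tilt over fissure segments found on several slices (assumed rule)."""
    return statistics.median(segment_tilt_degrees(p0, p1) for p0, p1 in segments)


# Made-up segments: two upright fissure lines and one outlier at 45 degrees.
segments = [((10, 0), (10, 20)), ((0, 0), (1, 1)), ((5, 0), (5, 9))]
tilt = dominant_tilt_degrees(segments)
```

Rotating each slice by the negative of this angle would bring the fissure line to vertical, which is the effect the realignment step aims for.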
Figure 3.2 shows an original FLAIR image and the image after realignment.
Figure 3.2 Effects of image rotation via midsagittal line (in red). Left: original MR image.
Right: Rotated image with normalized intensity values.
After aligning the images, the brain is segmented to select the region of the
head where MS lesions can occur, which is both the white and grey matter of the brain (with
emphasis on the white matter region). The brain is segmented with an automated
histogram-based algorithm for T1 and FLAIR images. The algorithm involves three steps: 1)
foreground/background thresholding, 2) disconnection of the brain from the skull via morphological
operations, and 3) removal of fragments such as sinuses and cerebrospinal fluid. [39] Figure
3.3 shows the segmented brain mask and the subsequent isolated grey and white matter.
Figure 3.3 Left: Realigned FLAIR image. Middle: Brain mask. Right: Segmented brain
parenchyma
Brain segmentation is important in the CAD component in two ways. First, it identifies the regions
of interest; second, it enables calculation of the brain parenchymal fraction, the ratio of brain matter to
intracranial space. Brain parenchymal fraction is another important indicator of MS, used in calculating
the brain atrophy rate [40].
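The brain parenchymal fraction reduces to a ratio of voxel counts once the two masks exist. A minimal sketch, assuming both masks are flattened to equal-length sequences of 0/1 labels over the same voxel grid (the function name and mask representation are illustrative choices):

```python
def brain_parenchymal_fraction(parenchyma_mask, intracranial_mask):
    """Ratio of brain-matter voxels to intracranial voxels.

    Both masks are flat sequences of 0/1 labels covering the same voxels.
    """
    if len(parenchyma_mask) != len(intracranial_mask):
        raise ValueError("masks must cover the same voxels")
    intracranial_voxels = sum(intracranial_mask)
    if intracranial_voxels == 0:
        raise ValueError("intracranial mask is empty")
    return sum(parenchyma_mask) / intracranial_voxels


# Made-up masks: 3 of 4 intracranial voxels are brain matter.
bpf = brain_parenchymal_fraction([1, 1, 1, 0], [1, 1, 1, 1])
```

Because voxel size is the same in numerator and denominator, the ratio does not depend on the scan's voxel dimensions.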
Probabilistic segmentation of multiple sclerosis lesions in 3-D
After the brain is extracted from the 3-D volume data, a method is designed to classify the voxels
within the brain mask of the preprocessed T1-weighted, T2-weighted, and FLAIR images. Each voxel
within the brain mask is examined and the probability of each voxel being a lesion or non-lesion
voxel is determined. Figure 3.4 shows the workflow of identifying lesion voxels.
Figure 3.4 Workflow of lesion voxel classification. Green box represents results from
preprocessing steps, and red represents preloaded data
The voxel identification section is based on the k-nearest neighbor (KNN) algorithm. [41] KNN
uses a set number of features to build a multi-dimensional feature space. Features of a single voxel
are extracted and put into the feature space. A number K of closest neighbors of the target voxel
are found within the feature space. The classification of the input voxel is determined by the
composition of classes of the K nearest neighbors. For the lesion classification algorithm, three
features are used to build a three-dimensional feature space: the voxel intensities of the T1, T2, and
FLAIR images.
The next step is to populate the feature space with lesion voxel features and non-lesion voxel
features. To know which voxels are part of an MS lesion and which are not, a set of 4 training
cases with 3 mm slices, or training sets, has been created to train the feature space. The 4 training sets
contain T1, T2, and FLAIR brain MR images with MS lesions of various severity present in their
white matter (see red box in Figure 3.4). Two neuroradiologists participated in manually
segmenting MS lesions from those 4 training sets, thus identifying all lesion voxels. The training
steps only need to be completed once, as all future voxel analyses are based on the constructed
feature space. Figure 3.5 illustrates a three-dimensional feature space.
Figure 3.5 3-dimensional feature space from 4 sample training sets. Red dots signify
“non-lesion” voxels, and blue circles are “lesion” voxels.
The training set includes voxel features from the 4 cases, with each voxel classified as “lesion” or
“non-lesion”. The general principle of this algorithm is to determine the probability of a voxel being
“lesion” by finding out the number of “lesion” nearest neighbors out of k nearest neighbors.
After the training set is built, resulting data from preprocessing can be evaluated. For every input
voxel, the features are extracted and put into the trained feature space. The distances between the
input point and the trained voxels are calculated one by one, and the 100 closest neighbors are
determined. The mathematical formula for calculating distances is
$$|d - d'| = \sqrt{\sum_{i=1}^{kd} (d_i - d_i')^2}$$
where kd = number of features (in this case, kd=6), d_i is the i-th feature of the input vector d,
and d_i' is the i-th feature of the vector d' that d is compared to. [42]
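The distance calculation above can be illustrated with a minimal brute-force sketch. It is written in Python for illustration only (the thesis implementation is not shown here), and the function name `k_nearest`, the toy intensities, and the label strings are assumptions rather than part of the original system.

```python
import heapq
import math

def k_nearest(query, training, k):
    """Return the k training samples closest to `query` (Euclidean distance).

    `training` is a list of (features, label) pairs; every sample is compared
    one by one, which is the slow approach the kd-tree is meant to speed up.
    """
    return heapq.nsmallest(k, training, key=lambda s: math.dist(query, s[0]))

# Hypothetical toy feature space: normalized (T1, T2, FLAIR) intensities.
training = [
    ((0.2, 0.3, 0.2), "non-lesion"),
    ((0.3, 0.2, 0.3), "non-lesion"),
    ((0.8, 0.9, 0.9), "lesion"),
    ((0.9, 0.8, 0.8), "lesion"),
]
neighbors = k_nearest((0.85, 0.85, 0.85), training, k=2)
```

Here `math.dist` computes the same square root of summed squared differences as the formula, for any number of features.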
A kd-tree structure is then applied to speed up this calculation. A kd-tree is a binary tree in
which each parent node has up to two child nodes. Each node splits the feature space into two
half-spaces, and since the child nodes can be split into more child nodes, the feature space is
divided into progressively smaller compartments. The algorithm first constructs a kd-tree and
arbitrarily selects the first parent node, which does not have to be a nearest neighbor. The
algorithm then searches through the child nodes to find a nearer neighbor. If the neighbor is
nearer than at least one of the 100 closest neighbors already found, it is saved. The program then
continues recursively until the end of the tree structure. After the 100 nearest neighbors are
identified, their classification as lesion or non-lesion is determined. The formula for
calculating lesion probability is as follows:
$$P_{lesion} = \frac{\text{Number of “lesion” nearest neighbors}}{100}$$
Probability thresholding is then applied to positively or negatively identify a voxel as a lesion
voxel. The threshold is currently set at 70% based on results of the KNN algorithm on the 4 training
sets; therefore, point A is classified as a non-lesion. The binary images then go through the clustering
process described in Section 3.3 to obtain lesion identifiers and 3D volumes. Figure 3.6 shows
lesion identification result of a single image.
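The kd-tree search, neighbor vote, and thresholding described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the thesis code: the dictionary-based tree, the function names, the synthetic training grid, and the choice of `>=` for the 70% threshold are all hypothetical.

```python
import heapq
import math

def build_kdtree(points, depth=0):
    """Recursively split (features, label) pairs at the median of one axis."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def knn_search(node, query, k, heap=None):
    """Keep the k nearest points in a heap of (-distance, tie_breaker, point)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    features, _ = node["point"]
    dist = math.dist(query, features)
    if len(heap) < k:
        heapq.heappush(heap, (-dist, id(node), node["point"]))
    elif dist < -heap[0][0]:  # nearer than the worst neighbor kept so far
        heapq.heapreplace(heap, (-dist, id(node), node["point"]))
    diff = query[node["axis"]] - features[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    knn_search(near, query, k, heap)
    # Search the far side only if the splitting plane is within reach.
    if len(heap) < k or abs(diff) <= -heap[0][0]:
        knn_search(far, query, k, heap)
    return heap

def lesion_probability(tree, query, k=100):
    """Fraction of the k nearest neighbors that are labeled 'lesion'."""
    neighbors = knn_search(tree, query, k)
    return sum(1 for _, _, (_, label) in neighbors if label == "lesion") / k

# Synthetic training grid (assumption): bright voxels are labeled as lesion.
training = [((x / 10, y / 10, z / 10),
             "lesion" if x + y + z >= 21 else "non-lesion")
            for x in range(10) for y in range(10) for z in range(10)]
tree = build_kdtree(training)
THRESHOLD = 0.70  # 70% threshold from the text; >= is an assumption
is_lesion = lesion_probability(tree, (0.9, 0.9, 0.9), k=10) >= THRESHOLD
```

The pruning test on the splitting plane is what avoids comparing the input voxel against every training voxel; the returned neighbors are the same as in a brute-force search.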
Figure 3.6 MS CAD results: Left upper: FLAIR image with brain matter segmented, Right
upper: Lesion probability map of the original image in grey-scale. The whiter the voxel is,
the more likely it is a lesion voxel. The blue arrows indicate darker regions that won’t be
identified as lesions after thresholding. Left bottom: black-and-white lesion map produced
by thresholding the probability map at 70%. Right bottom: lesion voxels (in red) overlaid on
top of the original image
The drawback of the KNN method is that it is computationally intensive. Initial testing showed that
each input case produces results in 5 to 8 hours, depending on how many voxels the data set has.
It is therefore impractical to produce the real-time results demanded by the MS eFolder design, and
a second CAD methodology is explored to emphasize computational efficiency.
3.2.2 Second-iteration of CAD – based on Expectation Maximization
The goal of the second iteration of the CAD algorithm is to achieve MS lesion voxel classification
results in an efficient way. The second algorithm employs new methods of preprocessing,
segmentation of regions of interest, and normal/abnormal voxel identification. The volume
quantification methods are identical to those of the KNN-based algorithm and are detailed in Section 3.3.
Preprocessing
Brain matter segmentation is based on the Statistical Parametric Mapping (SPM) brain image analysis
toolkit for MATLAB [43]. First, the algorithm reads T1 axial and FLAIR axial images as 3-D
volumes. If the image sizes do not match, the T1 images are affine-transformed to match the size of
the FLAIR images. T1 images are used for grey and white matter segmentation, while FLAIR images are
used for identifying “abnormal” voxels. Second, the grey matter (GM), white matter (WM), and
cerebrospinal fluid (CSF) regions of the brain are segmented via voxel intensity estimation and
probabilistic maps provided by SPM, which utilizes voxel-based morphometry (VBM) [44]. VBM of brain
MRI involves spatially normalizing all of the images in a study into a standardized stereotactic
space. The algorithm then extracts the grey and white matter voxels from the normalized space,
performs smoothing, and then performs a Bayesian statistical analysis to further calculate grey and
white matter voxel probabilities [45]. The output of this SPM-based preprocessing is a
probabilistic map of grey matter and white matter segmentation. Thresholding the probabilistic map
produces a brain mask to segment regions of interest on the FLAIR images.
Expectation-Maximization
An expectation-maximization (EM) algorithm for a mixture of k multidimensional Gaussians is applied
to the segmentation results to detect abnormal (lesion) voxels [46,47]. The algorithm involves
calculating the expected voxel intensity range of white matter and grey matter within the
segmentation results. The mean and variance of each voxel group are estimated, and a Gaussian model
is created for each voxel group. The maximization step then re-classifies the voxels based on the
results of the Gaussian mixture classification. A two-component Gaussian mixture model is used: one
component for voxels within the normal brain matter, and one for voxels within MS lesions.
The estimation results are used to determine the likelihood of a lesion voxel based on whether the
voxel intensity is outside the calculated normal range within the white matter region. The normal
range is currently set at within 3 standard deviations of the mean normal FLAIR intensity. Once all
abnormal voxels are identified, the binary images (where “true” means abnormal and “false” means
normal) are part of the output. The results then go through the post-processing algorithm described
in Section 3.3. Figure 3.7 shows the workflows of both CAD algorithms and highlights the
similarities and differences between the two methods.
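As a rough illustration of the approach described above, the following Python sketch fits a two-component Gaussian mixture to one-dimensional intensities with EM and then flags voxels lying more than 3 standard deviations from the mean of the dominant ("normal") component. It is a simplified sketch under stated assumptions, not the thesis implementation: the actual algorithm is multidimensional and MATLAB-based, and the function names, initialization, and synthetic intensities here are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(values, iterations=50):
    """EM for a 1-D mixture of two Gaussians; returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    variances = [max(((hi - lo) / 4) ** 2, 1e-6)] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each intensity value.
        resp = []
        for x in values:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate the weight, mean, and variance of each component.
        for j in range(2):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)
    return weights, means, variances

# Synthetic FLAIR-like intensities (assumption): 550 normal, 25 hyperintense.
values = [95 + (i % 11) for i in range(550)] + [158 + (i % 5) for i in range(25)]
weights, means, variances = fit_two_gaussians(values)
normal = 0 if weights[0] >= weights[1] else 1  # dominant component = normal tissue
limit = 3 * math.sqrt(variances[normal])
flags = [abs(x - means[normal]) > limit for x in values]  # True means abnormal
```

Treating the component with the larger weight as normal tissue is a design choice of this sketch; it relies on lesion voxels being a small minority of the masked region.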
Figure 3.7 Workflows of the two CAD algorithms developed in the eFolder project. The
two algorithms are developed separately, with different methodologies for preprocessing
and lesion voxel classification. The post-processing method is shared between the two
CAD algorithms.
3.2.3 Manual contouring
Lesion contours drawn by neuroradiologists at USC have been collected to act as the gold standard
for lesion detection. The contours were drawn manually by two neuroradiologists on the Fuji Synapse
3D post-processing client in the clinical environment. Currently the manual contours are used in
this project, as the CAD algorithm is still being refined for more consistent accuracy in its
results. Figure 3 shows the MATLAB results of a sample data set. The original DICOM images are
converted to NIFTI format for processing. The output lesion contour images are saved in NIFTI
format as well. The results of manual contouring are compared against the CAD results in Chapter 7.
3.3 Post-processing
3.3.1 3-D lesion quantification
Lesion voxels in the binary segmentation are clustered in 3-D with 26-connectivity, meaning that
two lesion voxels are defined as “of the same cluster” if they touch each other at any of their
6 faces, 12 edges, or 8 corners. Figure 3.8 shows how 26-connectivity works: for the blue voxel,
any lesion voxel that resides in one of the 26 red spaces is considered “connected” and thus forms
part of the same cluster.
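A minimal sketch of 26-connected clustering, assuming the lesion voxels are given as integer (z, y, x) coordinates; the function name and the example coordinates are illustrative only.

```python
from collections import deque
from itertools import product

# The 26 neighbor offsets: every (dz, dy, dx) in {-1, 0, 1}^3 except (0, 0, 0).
OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

def label_clusters(lesion_voxels):
    """Group lesion voxel coordinates into 26-connected clusters."""
    remaining = set(lesion_voxels)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = [seed], deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            z, y, x = queue.popleft()
            for dz, dy, dx in OFFSETS:
                neighbor = (z + dz, y + dy, x + dx)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    cluster.append(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(cluster))
    return sorted(clusters)

# (1, 1, 1) touches (0, 0, 0) at a corner and (2, 2, 1) at an edge.
voxels = [(0, 0, 0), (1, 1, 1), (2, 2, 1), (5, 5, 5), (5, 5, 6)]
clusters = label_clusters(voxels)
```

Each returned cluster corresponds to one lesion; its position in the list can serve as a per-study lesion identifier.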
Figure 3.8 26-connectivity diagram. Red voxels are “26-connected” to the blue (center)
voxel.
The lesion load (total lesion volume in the brain) and the number of lesions can be calculated using
these clusters. Lesion volume is obtained by multiplying the number of voxels in a lesion by the
voxel size, which is extracted from the DICOM headers of the images. Lesion load is obtained by
summing all lesion volumes, and lesion locations are identified by the coordinates of their centroids.
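The quantification step can be sketched as below, assuming each cluster is a list of (z, y, x) voxel indices and the voxel size is supplied in millimetres; the dictionary keys and example values are hypothetical.

```python
def lesion_statistics(clusters, voxel_size_mm):
    """Return per-lesion volume (mm^3) and centroid, plus the total lesion load."""
    dz, dy, dx = voxel_size_mm
    voxel_volume = dz * dy * dx
    lesions = []
    for lesion_id, voxels in enumerate(clusters, start=1):
        n = len(voxels)
        # Centroid = mean voxel index along each axis.
        centroid = tuple(sum(v[axis] for v in voxels) / n for axis in range(3))
        lesions.append({"id": lesion_id,
                        "volume_mm3": n * voxel_volume,
                        "centroid": centroid})
    lesion_load = sum(lesion["volume_mm3"] for lesion in lesions)
    return lesions, lesion_load

# Two toy lesions on 3 mm slices with 1 mm x 1 mm in-plane spacing (assumed values).
lesions, lesion_load = lesion_statistics(
    [[(0, 0, 0), (0, 0, 1)], [(4, 4, 4)]], voxel_size_mm=(3.0, 1.0, 1.0))
```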
3.3.2 Calculation of Brain Parenchymal Fraction
Brain parenchymal fraction is calculated from the brain segmentation results of the CAD process.
In the second CAD methodology, white matter, grey matter, and cerebrospinal fluid (CSF) spaces
are segmented. The volume of each region of interest is calculated in the same way as lesion
volume, as the number of voxels multiplied by the voxel size. The formula for calculating
brain parenchymal fraction is shown below.
$$BPF = \frac{WMV + GMV}{WMV + GMV + CSFV}$$
WMV is white matter volume, GMV is grey matter volume, and CSFV is CSF volume. Results are
included in the structured report and in the database.
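A worked example of the formula, assuming voxel counts per tissue class and a common voxel volume; the function name and the counts are made up for illustration.

```python
def brain_parenchymal_fraction(wm_voxels, gm_voxels, csf_voxels, voxel_volume_mm3):
    """BPF = (WMV + GMV) / (WMV + GMV + CSFV), with volumes from voxel counts."""
    wmv = wm_voxels * voxel_volume_mm3
    gmv = gm_voxels * voxel_volume_mm3
    csfv = csf_voxels * voxel_volume_mm3
    return (wmv + gmv) / (wmv + gmv + csfv)

# Hypothetical counts: 500,000 WM, 600,000 GM, and 275,000 CSF voxels of 1 mm^3.
bpf = brain_parenchymal_fraction(500_000, 600_000, 275_000, voxel_volume_mm3=1.0)
```

With these counts the parenchyma occupies 1,100,000 of 1,375,000 mm^3, giving a BPF of 0.8.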
3.3.3 Lesion localization and normalization for longitudinal tracking
In order to identify each individual lesion of the same patient across separate longitudinal
studies, the studies have to be normalized to a template brain so that the lesions can be mapped
based on their locations. To accomplish this, the brain warping technique of the Statistical
Parametric Mapping (SPM) toolkit for MATLAB is used to prototype the normalization methodology.
SPM’s voxel-based morphometry methodology is able to warp the subject brain into a template brain
with refinement in the subcortical structures [45, 48]. The template used in this algorithm is the
ICBM 152 Nonlinear Atlases version 2009 [49, 50], which includes labeling of 152 different
subcortical structures that are needed for lesion location identification. A semi-automatic
methodology, which is included in the SPM toolkit, is applied to the warping of the brain images.
Firstly, the SPM12 module titled “Normalisation: Estimate and Write” is launched via MATLAB.
The normalization process requires three inputs: the FLAIR axial MRI images, the lesion contour
images (in registration with the FLAIR images), and the tissue probability map (the ICBM
template). The FLAIR images are warped according to the tissue probability map, and the resulting
warp parameter is calculated and applied to the lesion contour images, thus warping the lesion
space to the template space. The estimation options, including bias regularization, Gaussian
smoothing of bias, affine regularization, and warping regularization have been set.
After brain normalization is completed, the lesion contour images are registered to the tissue
probability map, which includes a labeling map. The labels, in XML format, are included in the
SPM toolkit. The centroid coordinates of each lesion cluster are first extracted, and using the
registered coordinates of the label map, the label of the centroid location is obtained.
During the clustering procedures, each lesion is assigned an identifier. Because parameters such as
the number of lesions differ between studies, the identifiers are not consistent from one imaging
study to the next in a longitudinal series. Therefore, a lesion registration algorithm is applied to
link different lesion identifiers. The registration algorithm involves matching two normalized
studies against each other. Because of lesion growth variations and contouring differences, it is
possible for a lesion in a subsequent study to be registered to multiple lesions in the previous
study, and one lesion in the previous study may occupy the same coordinate space as multiple
lesions in the subsequent study. For our prototyping purposes, we chose the lesion from the previous
study that occupies the largest share of the targeted lesion’s contour area as the “registered
lesion”. To achieve this, for each lesion in the subsequent study, a histogram count is performed
to determine the location of the lesion relative to the lesion space in the previous study.
Figure 3.9 illustrates the lesion registration methodology.
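The histogram-count matching can be sketched as follows, assuming both studies are already in the same normalized space. The data layout (a voxel-to-identifier dictionary for the previous study), the function name, and the convention that an unmatched lesion maps to `None` are assumptions of this sketch.

```python
from collections import Counter

def register_lesions(previous_labels, current_lesions):
    """Map each current lesion ID to the previous lesion ID it overlaps most.

    previous_labels: dict of voxel coordinate -> lesion ID in the earlier study.
    current_lesions: dict of lesion ID -> list of voxel coordinates in the later study.
    """
    matches = {}
    for lesion_id, voxels in current_lesions.items():
        # Histogram of previous-study lesion IDs found under this lesion's voxels.
        histogram = Counter(previous_labels[v] for v in voxels if v in previous_labels)
        matches[lesion_id] = histogram.most_common(1)[0][0] if histogram else None
    return matches

previous_labels = {(0, 0, 0): 1, (0, 0, 1): 1, (0, 0, 2): 2, (5, 5, 5): 3}
current_lesions = {10: [(0, 0, 0), (0, 0, 1), (0, 0, 2)], 11: [(9, 9, 9)]}
matches = register_lesions(previous_labels, current_lesions)
```

In this example lesion 10 overlaps previous lesion 1 in two voxels and lesion 2 in one, so it is registered to lesion 1; lesion 11 has no overlap and is left unmatched.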
Figure 3.9 Workflow of MS lesion registration in longitudinal studies.
The eFolder system stores the quantification results in two ways: converting the quantification
results into DICOM structured reporting (DICOM-SR) objects and storing them in the eFolder data
storage module, and uploading the quantification results directly into the eFolder database.
This is to ensure that:
1. The DICOM-SR objects can be accessed by other system components or other DICOM-
compliant informatics systems.
2. Results in the quantification database can be included in data querying and data mining
applications.
The methodology of DICOM-SR storage solution will be discussed in Chapter 5, and the database
design will be discussed in Chapter 4.
Chapter 4 Multiple Sclerosis eFolder Database Module
The eFolder database stores basic text data such as patient information, medical history, social
history, DICOM-based image metadata, and 3-D lesion volumes and location coordinates. The
database is implemented in MySQL, an open-source relational database management system. The
database structure is built such that each patient has a unique data entry for demographics and
social data, a list of all MR studies related to MS, and a list of all CAD results (in SR format)
available for that patient. The data is therefore patient-centric and allows quick access to a
patient’s historical data. The design of the database model follows current data model designs for
clinical electronic medical record (EMR) systems. Because the schema is similar to those of
existing clinical systems, integrating the eFolder system into EMRs is more straightforward.
Figure 4.1 shows a typical data model of an MS patient in the database.
Figure 4.1 The MS eFolder patient data model. Each patient entry includes basic text data
in black, and each clinical visit generates new data to be stored in the patient’s profile.
Each patient profile includes demographic data, social history, medical history, and MS disease
history. The disease history component gets updated with every clinical visit. During each clinical
visit by the patient, new data is created. Neurological exams, lab results, and other updates are
recorded in the database, and each MRI study is uploaded into the system. The database is updated
with metadata from each new imaging study and quantification results from CAD study.
4.1 Overall Database Schema Design
Figure 4.2 shows the general database structure.
Figure 4.2 General database schema for the MS eFolder. Arrows indicate how each table
is related to each other in the schema, and the patient demographics table is assigned as the
master table for information lookup.
The database design consists of three parts: patient demographics database (blue in the figure),
imaging database (grey), and quantitative results (orange). Each box represents a separate group of
tables that share a common topic, e.g. medical history or MS history. Each box includes names
of tables or columns. The breakdown of the schema into sections (patient demographics, DICOM
imaging database, and quantification results database) is designed such that each section
is similar to current clinical informatics systems. The demographics database is modeled after
EMRs, the DICOM-based imaging database is modeled after PACS, and the quantitative database
is structured to fit the DICOM real-world model. Therefore, the compartmentalized design can be
easily integrated with other clinical informatics systems. Each part of the schema is described in
more detail in the following sections of this chapter.
4.2 Patient Demographic Database
The information stored in patient demographic database tables includes race, sex, ethnicity,
birthplace, childhood illnesses, vaccines, and other past medical history. The demographics
information table is assigned as the master table. The internal patient UID (unique identifier),
assigned by the eFolder system and independent of clinic origin, is used as the key to link to
other tables. The
database looks up data from other tables in the demographics database using the internal patient
UID stored in the demographics table.
The other tables in the demographics group include the patient’s MS history, consisting of the year
of first diagnosis, any family members with MS, MS type, MS symptoms and frequency, treatment
history, and so forth. This data is collected from the patient’s existing medical records and can be
entered via the web-based GUI. Social history data is gathered through patient interviews,
survey forms, medical reports, and physician input.
The stored data, when possible, follows SNOMED-CT (Systematized Nomenclature of Medicine Clinical
Terms) nomenclature to standardize names and codes for categories such as symptoms,
race, country of origin, and vaccines [51]. SNOMED allows commonly-used clinical terms
regarding MS (and other diseases) to be stored and searched efficiently in the database. The purpose
of this design is to have uniform terminology that is understood by all users.
4.3 Patient Imaging Database
The imaging database stores all MR images of the patients. The database structure is designed
following the DICOM structure, which is illustrated in Figure 4.3: from the patient level to the
study level, the series level, and finally the image level. It follows a tree structure, where the
single patient is the root, and the first level of branching is studies: all MRI studies on record
for that patient.
The second level of branching is series: each MR study contains several series, defined by the
different ways of image acquisition. Under each series there are individual MR images. Since the
images themselves are not stored in a SQL-based database, the filenames of image files are stored.
Figure 4.3 DICOM data model structure. The same format is followed to store MR images
in the eFolder database [52]
Unlike the patient demographic database, data in the imaging database is automatically populated.
Table values, such as study instance UID (unique identifier) and series instance UID, are parsed
directly from the headers of the images. During the DICOM transfer or FTP-based upload of DICOM
files into the eFolder system, the DICOM header data is parsed by the system, the studies are
matched with an existing patient via clinical patient ID (the identifier used by the data origin),
and the database is updated automatically.
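The patient, study, series, image hierarchy and the automatic population step can be sketched as below. This sketch uses Python's built-in SQLite module as a stand-in for MySQL, and every table and column name is hypothetical rather than taken from the eFolder schema. The header is passed in as a plain dictionary whose keys follow standard DICOM attribute keywords; actual DICOM parsing is out of scope here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE patient (patient_uid INTEGER PRIMARY KEY,
                      clinical_patient_id TEXT UNIQUE);
CREATE TABLE study  (study_instance_uid TEXT PRIMARY KEY,
                     patient_uid INTEGER NOT NULL REFERENCES patient(patient_uid));
CREATE TABLE series (series_instance_uid TEXT PRIMARY KEY,
                     study_instance_uid TEXT NOT NULL REFERENCES study(study_instance_uid));
CREATE TABLE image  (sop_instance_uid TEXT PRIMARY KEY,
                     series_instance_uid TEXT NOT NULL REFERENCES series(series_instance_uid),
                     filename TEXT NOT NULL);
"""

def register_image(conn, header, filename):
    """Store one image's metadata, matching the study to an existing patient."""
    row = conn.execute(
        "SELECT patient_uid FROM patient WHERE clinical_patient_id = ?",
        (header["PatientID"],)).fetchone()
    if row is None:
        raise LookupError("no existing patient matches this clinical patient ID")
    # INSERT OR IGNORE (SQLite syntax) keeps repeated studies/series idempotent.
    conn.execute("INSERT OR IGNORE INTO study VALUES (?, ?)",
                 (header["StudyInstanceUID"], row[0]))
    conn.execute("INSERT OR IGNORE INTO series VALUES (?, ?)",
                 (header["SeriesInstanceUID"], header["StudyInstanceUID"]))
    conn.execute("INSERT OR IGNORE INTO image VALUES (?, ?, ?)",
                 (header["SOPInstanceUID"], header["SeriesInstanceUID"], filename))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO patient (clinical_patient_id) VALUES ('MRN-001')")
register_image(conn, {"PatientID": "MRN-001", "StudyInstanceUID": "1.2.3",
                      "SeriesInstanceUID": "1.2.3.1", "SOPInstanceUID": "1.2.3.1.1"},
               "img0001.dcm")
```

Only the filename is stored at the image level, mirroring the point that the pixel data stays outside the SQL database.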
4.4 Lesion Quantification Results Database
As described in Chapter 3, results from the CAD program of the MS eFolder are stored both in the
database and in structured report format. The database thus allows data querying and data
lookups based on lesion quantification data, and allows the system to perform disease tracking and
display on the web-based GUI.
The CAD algorithm produces results at both the “study level” and the “image level”. The “study level”
quantification results include lesion indexes, lesion sizes in three dimensions, lesion centroid
coordinates, centroid location label, total lesion load, volumes of white matter, grey matter, and
CSF, and voxel sizes. At the “image level”, the database stores quantification results for an image,
including contour size of each lesion on the image, number of images, and location coordinates of
contour centroids. Besides the database storing quantified data, the eFolder system converts all
CAD analysis results into structured reports according to the DICOM standard. The reports, or
DICOM-SR objects, are stored in the eFolder repository for DICOM-based query and retrieve.
The design and development of the eFolder database has been based on clinical imaging informatics
systems such as EMR and PACS. The design feature allows the eFolder system to be integrated
seamlessly into the clinical environment. The database development is web-based to allow system
access from web browsers. The system interface, therefore, needs to be web-based and
browser-independent to fully display the eFolder’s features and patient data. The design and
development of the eFolder GUI and accompanying features for clinical system and workflow
integration are described in Chapter 5.
Chapter 5 MS eFolder Graphical User Interface and System Design
5.1 Overall GUI design and Architecture
The following are the design criteria for the eFolder user interface:
• The GUI needs to be web-based to allow remote access using thin-client architecture.
Computations and visualizations are completed on the server side for a light-weight and
fast GUI.
• The GUI needs to be comprehensive. It needs to display patient clinical data, imaging data,
and quantification results on the same interface. It allows physicians and radiologists to
access information related to the data query.
• The system needs to be dynamic and allow display of 3D images and manipulation of the
images presented. An organized viewing interface allows for a clearer presentation.
• The GUI needs to allow flexible and intelligent data mining. With many patients’
information stored in the eFolder system, any clinician and researcher should be able to
look up MS patients on a variety of different search criteria, ranging from patient
demographic data to lesion analytical results.
Based on the PHP scripting language, the dynamic GUI guides the user to look up a specific patient’s
disease history with images, and it allows querying for patients by various criteria. A
JavaScript-based DICOM object viewer is used to view patients’ images.
5.2 GUI Functionality and Features
Figure 5.1 shows design of the comprehensive web-based GUI for MS eFolder. The details of each
section on the design are explained in the next section.
Figure 5.1 Screenshot of comprehensive MS eFolder web-based GUI. Left panel: patient’s
clinical and demographic data. Right top panel: DICOM image viewer embedded in
eFolder GUI. Right bottom panel: SR document content in tabulated format.
The GUI displays the initial information of the patient in three different modules. The left module
contains the patient’s demographic information, medical history, MS history, and social history
data. The right top module contains the image display. The module leads to a separate page (a full-
page image display) that shows the patients’ study list. The bottom-right module displays links to
the data analysis tools, including plotting lesion volume changes, getting the CAD results from
DICOM-SR, and tracking individual lesion changes over time.
5.2.1 Patient Demographic and Medical Data display
The medical and demographics data is displayed in this module. The demographics data is shown
on the top of the page (as seen in Figure 5.1), which includes patient name, ID, birthdate, gender,
and ethnicity. Patient’s medical records are displayed on the left module, and broken down into
three subcategories: MS History, Medical History, and Social History. Clicking on different tabs
changes content of the module to the indicated subcategory. The display layout is shown in Figure
5.2.
Figure 5.2 Close-up of the left module displaying patient’s MS history.
5.2.2 Data Input
A data input page has been created to manually input a patient’s data. The online form can replace
the paper-based survey forms and paper-based medical records. Figure 5.3 shows a screenshot of
the data input page.
Figure 5.3 Data input page for MS eFolder
5.2.3 Query/Retrieve
A data query/retrieve page has been created to allow patient data lookup and data query based on a
variety of search criteria. Criteria include demographic terms such as gender and ethnicity, or
medical records such as types of MS, length of disease, treatment types, and so on. This tool allows
data mining across large data sets and patient populations, which can support complex data analysis. Figure 5.4
shows the search page for query/retrieve of patient data.
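A multi-criteria lookup of this kind can be sketched as a parameterized query builder. The sketch below uses Python with SQLite in place of the PHP and MySQL stack described in this chapter, and the table, column names, and sample rows are hypothetical.

```python
import sqlite3

ALLOWED = {"gender", "ethnicity", "ms_type"}  # hypothetical searchable columns

def build_query(criteria):
    """Turn {column: value} into a parameterized SELECT; unknown columns are rejected."""
    unknown = set(criteria) - ALLOWED
    if unknown:
        raise ValueError(f"unsupported search fields: {sorted(unknown)}")
    columns = sorted(criteria)
    # Column names come only from the whitelist; values are bound as parameters.
    where = " AND ".join(f"{column} = ?" for column in columns)
    sql = "SELECT patient_uid FROM patient" + (f" WHERE {where}" if where else "")
    return sql + " ORDER BY patient_uid", [criteria[c] for c in columns]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_uid INTEGER PRIMARY KEY, "
             "gender TEXT, ethnicity TEXT, ms_type TEXT)")
conn.executemany(
    "INSERT INTO patient (gender, ethnicity, ms_type) VALUES (?, ?, ?)",
    [("F", "Hispanic", "RRMS"), ("M", "Caucasian", "RRMS"), ("F", "Hispanic", "SPMS")])
sql, params = build_query({"ethnicity": "Hispanic", "ms_type": "RRMS"})
matches = [row[0] for row in conn.execute(sql, params)]
```

An empty criteria dictionary returns every patient, which corresponds to a "view all records" style lookup.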
Figure 5.4 Data lookup page for MS eFolder. Click on “View All Records” for complete
patient list
5.3 DICOM-based image viewer
The top right module shows the DICOM-based imaging studies of the current patient. A built-in
DICOM viewer based on the open-source CornerStone JavaScript library is used to display DICOM
objects in a web browser. Figure 5.5 shows a screenshot of the eFolder image viewer.
Figure 5.5 The design and layout of the customized eFolder image viewer.
The DICOM study viewer is based on the open-source CornerStone JavaScript library and also
incorporates jQuery, HTML, and HTML5. For the eFolder system, the viewer is integrated with PHP
to query and retrieve study information from the eFolder MySQL database.
Additionally, there are several design features to view images from longitudinal studies:
1. DICOM-SC and DICOM-SR objects are displayed alongside images from the study. This
feature can be toggled on and off.
2. User can query and retrieve all of the studies of a particular patient, as shown in Figure 6.
3. Users are able to use the second (right) viewing window to display other studies of the
same subject via a drop-down menu, for direct comparison between the studies.
5.4 Disease tracking and data analysis tools
The bottom right module provides links to a variety of data analysis toolkits, including structured
reports of all studies, disease tracking toolkit, plotting of brain parenchymal fraction (BPF) against
other patients, and comparison of lesion volumes and progression against other patients. A data
analysis toolkit for the eFolder has been developed to display analysis results based on user-defined
parameters. The web-based toolkit utilizes JQuery and HTML5 libraries provided by open-source
toolkit HighCharts, and is integrated with PHP and MySQL to access information from the eFolder
database. Figure 5.6 is a sample plot that shows lesion volumes of Hispanic and Caucasian
patients versus the number of years since their MS diagnosis. This analysis is
one of the first steps in comparing MS disease differences between Hispanic and Caucasian patients.
The graphical tool allows users to query and view analysis results on the web-based GUI.
Figure 5.6 Scatter plot of disease duration vs. 3-D lesion volume for Hispanic and
Caucasian MS patients.
Figure 5.6 demonstrates that the data analysis tool can attempt to correlate quantitative data with
patient history to observe potential data trends. As another example of data analysis tools, figure
5.7 shows the lesion volume tracking results of 4 longitudinal studies. The graphical analysis tool
can visually track and quantify 3-D lesion volume changes over 4 studies of the same patients.
Combining disease progression and patient history can provide insight on the causes of the disease
changes. Further studies and data are being collected to perform comprehensive longitudinal study
analysis using the eFolder tool.
Figure 5.7 Lesion volume tracking of the 4 patients over 4 studies per patient, with an
example of mouse-over that shows statistics of that study.
5.5 Integration of MS eFolder within the clinical environment with IHE protocol
The goal of MS eFolder is to integrate components of eFolder together and fit into a real-life clinical
workflow. IHE (Integrating the Healthcare Enterprise) provides a design of how to include a post-
processing step inside a typical clinical workflow [53]. Figure 5.8 shows the integrated workflow
with a post-processing workstation.
First, images are acquired from the modality and then sent to PACS for storage. The RIS (radiology
information system) is informed of the existence of the study and provides a status tracking service
to mark the study as claimed, in progress, or completed. The post-processing workstation queries
PACS for a worklist to retrieve the studies. The studies are processed, and the resulting images and
documents are sent back to PACS for storage.
The MS eFolder design is modeled after the IHE workflow profile to show its use in a clinical
environment. To accomplish the tasks, a simulated clinical environment with MS eFolder is set up
in the Image Processing and Informatics laboratory. Figure 5.9 shows the conceptual diagram of
MS eFolder’s setup, modified from Figure 5.8.
Figure 5.8 IHE post-processing workflow diagram. The workflow begins with image
acquisition and includes an image storage solution (PACS) and an additional post-processing step.
The entire process is monitored by the RIS (radiology information system) [53]
Figure 5.9 MS eFolder workflow diagram with IHE postprocessing profile. The blue circle
indicates all components included in the eFolder. Steps 1 through 4 indicate the order
of the workflow in the demonstration.
The workflow for MS eFolder integration is defined in four steps:
1. MR images are sent from modality simulator to the eFolder server for archiving
2. The eFolder server sends a copy of the images to the CAD Module for postprocessing
analysis
3. The CAD Module sends the completed CAD report back to eFolder server for archiving
4. At the completion of each of the previous steps, a status tracking tool inside eFolder
displays alerts of the study progress to the user
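The four steps above can be sketched as a small status tracker that advances a study through the workflow and records an alert at each step. The status names and the class are hypothetical; the actual tracker is a database-backed web page rather than an in-memory object.

```python
from enum import Enum

class StudyStatus(Enum):
    ARCHIVED = 1         # step 1: images received and archived by the eFolder server
    SENT_TO_CAD = 2      # step 2: copy sent to the CAD module
    REPORT_ARCHIVED = 3  # step 3: CAD report returned and archived

class StatusTracker:
    def __init__(self):
        self.status = {}  # study ID -> current StudyStatus
        self.alerts = []  # (study ID, status name) pairs shown to the user

    def advance(self, study_id):
        """Move a study to its next step and record an alert (step 4)."""
        current = self.status.get(study_id)
        next_value = 1 if current is None else current.value + 1
        if next_value > len(StudyStatus):
            raise ValueError("study has already completed the workflow")
        self.status[study_id] = StudyStatus(next_value)
        self.alerts.append((study_id, self.status[study_id].name))
        return self.status[study_id]

tracker = StatusTracker()
for _ in range(3):
    tracker.advance(study_id=10)
```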
The first step is to simulate DICOM images being transferred from the imaging modality to the PACS
server for storage. In this system simulation, the eFolder web-based server serves as the image
storage server. Images are sent to the eFolder server from a laptop via the open-source DICOM
toolkit dcmtk, installed on both the laptop (image sender) and the eFolder server (image receiver).
The laptop acts as the Storage SCU and the eFolder web server acts as the Storage SCP. The Storage
SCP constantly listens on the dedicated DICOM port for incoming DICOM images. When DICOM images are
sent to the server, the Storage SCP stores all DICOM images in a temporary folder. A PHP script is
then used to parse the DICOM data and store the metadata inside the eFolder database, alert the
status tracker to the existence of the study, and move the DICOM data into an archiving folder.
The second step is for the DICOM studies to be sent to the CAD module for post-processing
analysis. In this simulation, users can access the status tracking page from the eFolder GUI (as
shown in Figure 5.10) to select a study whose images are to be sent to the CAD module.
Figure 5.10 Status tracking page for MS eFolder post-processing workflow. Users can
select a study to be sent to the CAD module for quantitative analysis.
The selected DICOM study is then sent to the CAD module via FTP (file transfer protocol). The
CAD program runs continuously and monitors the destination folder of the FTP transfer in order to
process each new DICOM study. During this time, the status tracker shows that the selected
study has been sent for post-processing, as shown in Figure 5.11.
Figure 5.11 Status tracking page for MS eFolder post-processing workflow after a study
(Study ID 10) has been sent to the CAD module, and after another study (Study ID 15) has
been processed and sent back to the server.
After the post-processing step is completed, the CAD module sends the results back to the storage
server. A DICOM-SR (DICOM structured report) object is created to accompany the imaging
study for archiving. The MATLAB program first outputs the CAD results, including DICOM
header information, reference images, and MS lesion quantification analysis results, into an XML
document. The document is then sent to the eFolder server, where it is converted to a DICOM-SR
object via the dcmtk toolkit and stored in the archive folders. The eFolder GUI provides a web-based
DICOM-SR viewer for users to view post-processing results.
The CAD module also connects to the MySQL-based eFolder database directly via a MATLAB-
MySQL connector. The program updates the status tracker database and the CAD results database
automatically. Figure 5.11, as shown previously, shows an example of a study that has been
completed on the status tracker. The complete dataflow of this one imaging study shows that the
eFolder can fulfill all of the actions required in an IHE post-processing workflow profile.
5.6 DICOM-SR for MS post-processing data
The workflow of converting CAD results from MATLAB to DICOM-SR, then converting to web-
based display format is shown in Figure 5.12.
Figure 5.12 Workflow diagram for DICOM-SR conversion and display. Top row:
converting MS CAD results to DICOM-SR for storage. Bottom row: querying and
displaying information within DICOM-SR objects
The first step is to convert the CAD results to an XML document, which is then converted to
DICOM-SR via the dcmtk open-source toolkit. To generate the correct XML document, sample DICOM-SR
files are obtained online and converted to XML. The resulting XML template is then modified and
customized to store the CAD quantification values. A MATLAB script is used to convert the CAD
output from a study to an XML file according to the customized template. Figure 5.13 shows a
partial screenshot of the resultant XML document.
Figure 5.13 XML document to store MS CAD analysis results based on DICOM SR
template
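The template-filling step can be illustrated with a minimal Python sketch (the implementation described here is a MATLAB script). The element names `cad_report` and `measurement`, the attribute names, and the sample values are hypothetical placeholders; they do not follow the DICOM-SR XML schema that dcmtk expects, which a real document would take from the customized template.

```python
import xml.etree.ElementTree as ET


def build_cad_xml(study_uid, measurements):
    """Serialize CAD quantification values into a small XML document.

    The element and attribute names are hypothetical placeholders and do
    not follow the DICOM-SR XML schema used by dcmtk.
    """
    root = ET.Element("cad_report", {"study_uid": study_uid})
    for name, (value, unit) in measurements.items():
        node = ET.SubElement(root, "measurement", {"name": name, "unit": unit})
        node.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up values for illustration, not results from this work.
demo_xml = build_cad_xml(
    "1.2.3.4",
    {"total_lesion_volume": (1.5, "cm3"), "lesion_count": (3, "count")},
)
print(demo_xml)
```

In practice the resulting file would be handed to the conversion tool; the sketch only shows how quantification values can be written into named fields of a fixed template.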
The DICOM-SR object is created by dcmtk’s xml2dsr tool and is sent to the archive via the DICOM
protocol [54]. Secondary captures of the MS CAD results are converted to DICOM-SC objects for
storage and for display in a DICOM image viewer, which keeps the workflow in the MS eFolder
system DICOM-compliant. For 2D image captures, a MATLAB script is used to overlay
2D lesion contours on top of the FLAIR axial images to create a new series under the DICOM
study. DICOM-SCs are stored in the archive and can be viewed alongside the original DICOM
images in the DICOM web-based viewer. An example of how DICOM-SCs are displayed in the
eFolder’s GUI is shown in Figure 5.5.
To display the DICOM-SR object on the web-based GUI, the GUI automatically queries
the SR and images from the DICOM archive when the user accesses a patient’s profile in the
eFolder. While the DICOM studies are loaded into the image viewer, the SR document is displayed
separately in an HTML table. The SR object is converted back to the original XML document, and
a custom XML-parser is built in PHP to read the document and extract data in a tabulated format.
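The parsing step can be sketched as follows. The eFolder parser is written in PHP; this Python version is only an illustration, and the `report` and `measurement` element names, the attributes, and the sample values are assumptions rather than the actual structure of the converted SR document.

```python
import html
import xml.etree.ElementTree as ET


def sr_xml_to_rows(xml_text):
    """Extract (name, value, unit) tuples from a simplified report XML."""
    root = ET.fromstring(xml_text)
    return [
        (node.get("name", ""), (node.text or "").strip(), node.get("unit", ""))
        for node in root.iter("measurement")
    ]


def rows_to_html_table(rows):
    """Render the extracted rows as a plain HTML table."""
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


# Hypothetical document with made-up values.
sample = (
    "<report>"
    '<measurement name="total_lesion_volume" unit="cm3">1.5</measurement>'
    '<measurement name="lesion_count" unit="count">3</measurement>'
    "</report>"
)
rows = sr_xml_to_rows(sample)
table_html = rows_to_html_table(rows)
```

Escaping each cell before building the table keeps report text from being interpreted as markup when it is shown in the browser.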
The MS eFolder system interface is first designed to display all patient information, including
demographic data, patient history, brain MRI, and quantification data. Over time, more system
features, such as lesion registration and tracking, longitudinal data analysis, DICOM-SR creation
and display have been developed and added to the user interface. These additional features provide
a more integrated and complete system for MS disease management, research, and data analysis.
The system workflow and dataflow have also been designed according to current clinical
environments for future integration with other clinical informatics systems. Once the system has
been designed and developed, an adequate amount of data is needed to perform system testing and
evaluation. The data collection and system evaluation process will be discussed in the next chapter.
Chapter 6 Data Collection and System Evaluation
The goal of the eFolder system is to provide data integration and data analysis that have been
previously unavailable to physicians and other users. The key to evaluating the eFolder system lies
in testing essential system features and capabilities to ensure the delivery of fast and accurate
information to the users. I have designed the system evaluation process, which focuses on three
features: MS lesion detection and quantification, longitudinal tracking of lesion volume changes,
and data mining. In this chapter, I will first discuss data collected for system development and
evaluation, then I will present the evaluation design, describe the methodology of each evaluation
process, and discuss evaluation results.
6.1 Data Collection
The MS eFolder system requires patient demographic data, clinical data, and MR imaging data. To
build and validate the system, patient data has been acquired from the Department of Neurology
and Department of Radiology at University of Southern California (USC). There are two main
goals of data collection: the first goal is to acquire sample data to build system infrastructure, and
the second goal is to acquire a larger amount of data to complete system validation and evaluation.
In this section, data collection procedures, protocols, and the number of patients collected for the
eFolder project are described.
6.1.1 Introduction to data types collected
Clinical and demographic data has been collected from the Department of Neurology at USC. Both
demographic data and clinical data are compiled by research assistants in an Excel table. The texts
are then manually entered into the MS eFolder database to create patient profiles. Patient ID or
MRN (medical record number) is used to collect the relevant brain MR studies from three sites at
USC: Keck Hospital of USC, USC Health Consultation Center II (HCCII) outpatient imaging center, and
Los Angeles County Hospital+USC Medical Center. Imaging studies are exported via physical
media, i.e. compact disks, and are uploaded to the eFolder system via a built-in DICOM uploader.
Radiologists’ manual contour data, created by radiology and neuro-radiology fellows, are collected
via Fuji Synapse 3D and exported as 3D digital files in NIFTI (neuroimaging informatics
technology initiative) format. The NIFTI files are converted to create DICOM-SC objects for
storage in the eFolder.
6.1.2 Institutional Review Board Approval for Human Subjects data collection
Data collection for the MS eFolder has been approved by the Institutional Review Board (IRB) at the
University of Southern California under “Multiple Sclerosis in Hispanic Populations” (ID: HS-08-00269),
with Dr. Lilyana Amezcua, Department of Neurology, as Principal Investigator (PI).
6.1.3 Data Selection Criteria
To evaluate system performance, a group of 72 patients has been selected based on ethnicity,
gender, age, and disease duration. The group is divided into 36 Hispanic and 36 Caucasian
patients. The subjects of the two groups are matched by gender, age (within 5 years), disease duration
(within 5 years), and disease type (all are relapsing-remitting). Candidate criteria include
having been diagnosed with multiple sclerosis, having had appointments at the Neurology department
at USC, and having had MR brain images (T1-w, T2-w, and FLAIR axial slices) taken at various
imaging centers and available in digital format at USC. The purpose of this data collection is to
develop and evaluate a methodology of observing differences in Hispanic and Caucasian patients
based on their imaging studies.
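The matching criteria above can be written as a simple predicate. This is a minimal sketch: the `Patient` fields, the example IDs and values, and the choice to treat "within 5 years" as an inclusive bound are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    ethnicity: str
    gender: str
    age: int
    disease_duration: float  # years since diagnosis
    disease_type: str


def is_match(a, b, max_gap_years=5):
    """Return True if two patients from different groups satisfy the criteria.

    The inclusive 5-year bound is an assumption about "within 5 years".
    """
    return (
        a.ethnicity != b.ethnicity
        and a.gender == b.gender
        and a.disease_type == b.disease_type
        and abs(a.age - b.age) <= max_gap_years
        and abs(a.disease_duration - b.disease_duration) <= max_gap_years
    )


# Hypothetical patients, not records from the study.
hispanic = Patient("H-demo", "Hispanic", "F", 34, 6.0, "relapsing-remitting")
caucasian = Patient("C-demo", "Caucasian", "F", 38, 9.5, "relapsing-remitting")
matched = is_match(hispanic, caucasian)
```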
6.1.4 Patient clinical data
Clinical patient data collection at USC began in 2008. Patients visiting the MS clinic were
asked to fill out a questionnaire form regarding their basic information, social history, ethnic
background, medical history, and childhood medical history. A sample form is shown in Figure 6.1.
Figure 6.1 First of 6 pages of the Multiple Sclerosis questionnaire used to collect patient data
Prior to the start of this project, data collected from the forms were manually typed by the PI’s
research assistants into Microsoft® Excel® spreadsheets for storage.
6.1.5 Imaging data
MR brain studies from the selected patients are collected from the USC Medical Center, which
includes the three sites mentioned above: Los Angeles County Hospital, USC Hospital, and HCCII
outpatient imaging center. Most studies have at least T1 axial non-contrast, T2 axial, and FLAIR
axial series. All 3 series are needed for the KNN-based CAD module, and T1 and FLAIR are needed
for the EM-based CAD module. The imaging data collected are used to build and test the two CAD
modules, the DICOM-based database schema, and DICOM-based eFolder services such as the
parser, uploader, IHE workflow profile, and web-based image viewer.
There have been three phases of imaging data collection. The first phase of data collection is to
collect both Hispanic and Caucasian images to build and test the KNN lesion detection algorithm.
During this phase, 20 MR cases (of 17 patients, with 3 longitudinal studies) have been collected
from USC’s Health Consultation Center II (HCCII), an outpatient imaging center. These cases are
of 3mm slice thickness with 0mm gap between slices. The second phase of data collection is to
collect lesion quantification data for Hispanic (HA) and Caucasian (CA) MS patients. Imaging
studies for 36 Hispanic and 36 Caucasian patients have been collected from clinical sites at USC.
These represent the 72 matched MS patient cases described earlier. The third phase is to
collect longitudinal studies for 4 selected patients, each of whom has 3 additional imaging
studies. In total, the number of imaging studies collected for Phase I is 20 studies, for Phase II is
72 studies, and for Phase III is 12 studies.
The scanning protocols of cases from Phase I are inconsistent with those from Phases II and
III because the studies were taken at different locations and at different times. This is normal in clinical
workflow and true of most data aggregated from multiple sites. However, it is a challenge for
developing the CAD algorithm: varying image resolutions affect CAD performance, and it is difficult
to develop CAD algorithms that work for all studies.
All imaging studies from Phase I have uniform resolution and slice thickness – all studies are of
3mm slice thickness with no gaps between slices. Imaging protocols from Phase II have varying
resolutions: slice thickness can be 3mm with no gap or 5mm with a 1.5mm or 2mm gap. It should be
noted that while the scanning protocol of 3mm with no gaps is better suited to lesion quantification,
few such cases exist as of now, so the 3mm cases have only been used to build and test
the CAD algorithm. For comparing Hispanic and Caucasian differences, only Phase II data will be
used, and data from Phase II is also used to build and evaluate the CAD algorithm based on
expectation-maximization.
Phase III data is used for longitudinal data evaluation as well as development and evaluation of
lesion tracking methodologies. Out of the group of 72 patients, 4 patients have been selected for
longitudinal data collection. The criterion for selecting these patients is that they have
had yearly MRI scans completed at USC-LAC hospitals. Three imaging studies for each patient,
taken from years 2009 to 2014, are collected, for a total of 12 studies (in addition to their initial
studies, which are part of studies collected in Phase II). These imaging studies are then evaluated
by the expectation-maximization algorithm for lesion contours, and lesion contours are also
manually evaluated by radiologists and neuro-radiologists.
6.2 Evaluation of Lesion Quantification results
The two computer-aided detection algorithms, KNN-based voxel classification and
expectation-maximization (EM)-based segmentation, are both evaluated for the accuracy of their lesion
quantification results. The goal of this evaluation is to present the detection accuracy of the
prototype CAD algorithms, compare and discuss advantages and disadvantages of each algorithm,
and set up gold-standard detection results for all other system evaluations.
6.2.1 Evaluation methodology
For each detection algorithm, the result indicates which voxels are “abnormal” and which are within
the definition of “normal” features. Therefore, the evaluation of the detection result is based on whether the
algorithm is able to successfully identify “abnormal” voxels. The definitions of “normal” and
“abnormal” pixels, i.e. identifying MS lesions and determining the size and shape of lesions, are
established by radiologists and neuro-radiologists, who are called “expert readers”. The expert
readers manually identify MS lesions in brain MRI studies and manually contour the lesions. The
results are the “gold standard” of the eFolder’s CAD system: all pixels inside the lesion contours
are “abnormal”, and all other pixels are “normal”. The accuracy of the CAD detection algorithm,
therefore, depends on whether the CAD results match the manual results from the expert readers. Based
on the results from CAD and manual contours, the numbers of “true positive (TP)”, “true negative (TN)”,
“false positive (FP)”, and “false negative (FN)” pixels per study are calculated. From these values,
the sensitivity, specificity, and positive predictive value (precision) are calculated with the
following formulas:
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$
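The three formulas can be computed directly from a pair of binary masks. The following is a minimal sketch, assuming the CAD output and the manual contours are available as arrays of the same shape in which nonzero entries mark "abnormal" pixels; the function name and the use of NumPy are choices made for this illustration (the comparison in this work is done in MATLAB).

```python
import numpy as np


def detection_metrics(cad_mask, manual_mask):
    """Compute sensitivity, specificity, and precision from two binary masks."""
    cad = np.asarray(cad_mask, dtype=bool)
    gold = np.asarray(manual_mask, dtype=bool)
    if cad.shape != gold.shape:
        raise ValueError("masks must have the same shape")

    tp = int(np.sum(cad & gold))    # flagged by CAD and by the expert readers
    tn = int(np.sum(~cad & ~gold))  # flagged by neither
    fp = int(np.sum(cad & ~gold))   # flagged by CAD only
    fn = int(np.sum(~cad & gold))   # flagged by the expert readers only

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Toy masks for illustration only.
metrics = detection_metrics([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0])
```

Because lesions occupy a small fraction of the brain, true negatives dominate the pixel counts, which is why specificity can stay very high even when sensitivity and precision are low.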
6.2.2 Acquisition of manual segmentation
The first step of the CAD evaluation is to acquire the manual contour data. The contours were
drawn manually by two neuro-radiology fellows on a Fuji Synapse® 3D post-processing client
at the Keck Hospital of USC. All of the imaging studies collected for the eFolder project have been
read by the same two fellows. To save time and effort, the studies were contoured by radiology
residents first and were then edited and confirmed by the two fellows. Once the studies are read, the
results are stored on the Fuji Synapse 3D system. The system has a functionality to output the
contour data in NIFTI (3D medical imaging data) format, which is read by MATLAB and converted
to DICOM for storage in the eFolder.
6.2.3 Evaluation results
The CAD detection results are compared with manual contour results in MATLAB. First, the
results are compared visually. Figures 6.2 and 6.3 show the quantified lesion volumes for
Caucasian and Hispanic studies using the expectation-maximization algorithm.
Figure 6.2 Total lesion volumes of 36 Caucasian studies. Dashed lines represent CAD
results (from EM algorithm) and the solid line represents the manual contouring results.
Figure 6.3 Total lesion volumes of 36 Hispanic studies. Dashed lines represent CAD results
(from EM algorithm) and the solid line represents the manual contouring results.
As the above figures show, the EM algorithm generally overestimates lesion volumes
when compared to the radiologists’ markings.
Table 6.1 shows the sensitivity, specificity, and precision (positive predictive values) of the
expectation-maximization CAD algorithm for both Caucasian and Hispanic studies.
Table 6.1 Evaluation results for the EM algorithm
| Metric | Caucasian patients | Hispanic patients | Overall |
|---|---|---|---|
| Sensitivity | 37.79% ± 26.73% | 43.88% ± 27.86% | 40.83% ± 27.47% |
| Specificity | 99.58% ± 0.60% | 98.99% ± 2.48% | 99.29% ± 1.83% |
| Precision | 19.17% ± 26.49% | 15.38% ± 20.77% | 17.27% ± 23.88% |
The low sensitivity and precision indicate that the results have a high number of false positive and
false negative pixels. This confirms that the algorithm is overestimating lesion volumes and is not
precise enough to be a reliable indicator of the true lesion volumes of the studies. For this reason,
all quantification-related figures and analyses currently use the manual contour results rather than the
automatic detection results. The results of the current automatic CAD algorithm are not accurate
for the following reasons:
1. Segmentation errors. For a 23-slice study, most lesions reside between slices 12-20 because
those are near the top of the head, i.e. most cerebral matter is included in these slices.
Lower-numbered slices (slices 1 through 10) contain mostly brain stem, cerebellum, and nasal and
oral cavities. However, the CAD system is detecting and quantifying lesion voxels in
lower-numbered slices, which should not have been included in the segmented regions of
interest. Therefore, there is some error in segmenting white and grey matter using SPM
built-in functions in the preprocessing step. Errors in white matter and grey matter
segmentation also affect the expectation-maximization algorithm because the falsely
segmented voxels are outside of the “normal” range and thus are identified as false positives.
2. Lack of a sophisticated method for eliminating false positives. Using information like
anatomical locations may help with identifying false positive detections before the
quantification result is finalized.
However, since the CAD module is only a part of the overall modular design of the eFolder system,
future work and improvements can be made to the CAD module which will benefit the overall
system performance. For further evaluation of the eFolder system and the location and tracking
algorithm, the gold standard manual contours made by the expert readers were utilized.
Figures 6.4 and 6.5 below show the comparison of manual contour results and results from the
KNN-based CAD algorithm.
Figure 6.4 Total lesion volumes of 36 Caucasian studies. Dashed lines represent results
from KNN-based algorithm and the solid line represents the manual contouring results.
Figure 6.5 Total lesion volumes of 36 Hispanic studies. Dashed lines represent results from
KNN-based algorithm and the solid line represents the manual contouring results.
The figures show that the KNN-based algorithm underestimates the lesion volumes. Combined with
long computing times required for analysis, the KNN algorithm is not currently suitable for
inclusion in the eFolder. With planned future work, an improvement in detection accuracy can be
expected for the EM-based CAD algorithm. Because the overall eFolder system is a modular
design, any future work to improve the precision and accuracy of the CAD algorithm will ultimately
benefit the entire system.
6.3 Evaluation of longitudinal lesion tracking methodology
One of the distinguishing features of the MS eFolder is the lesion registration and tracking algorithm.
It can track changes in individual lesion sizes over time. This can identify or confirm the area of
most active lesion growth, which in turn may help clinicians understand changes in the patient’s
disease course, confirm treatment effectiveness, and open other possibilities in knowledge
discovery.
To ensure that the lesion tracking algorithm is working as intended, I designed a validation and
evaluation methodology. The evaluation process tests the algorithm’s ability to accurately track
overall lesion volume changes, its ability to track an individual lesion’s volume change, and
how well the GUI displays the tracking information. The evaluation methodology is based on
the manual contours performed by the two expert readers, which are considered the current gold standard.
6.3.1 Evaluation methodology
Simulation plans for the four longitudinal studies have been created. For each study, three lesion
volume progression scenarios are planned:
1. The overall lesion volume has significantly increased
2. The overall lesion volume does not change significantly, but individual lesion volumes
may change
3. The overall lesion volume has decreased
Based on the three progression scenarios, I have simulated 4 years of yearly studies for each
scenario per study (for a total of 48 simulated studies). Table 6.2 shows the lesion volume data for
simulated lesion growth.
Table 6.2 Simulated lesion growths for 4 longitudinal studies for 4 years.
| Patient | Scenario | Initial Vol (cm³) | Year 1 | Year 2 | Year 3 | Year 4 |
|---|---|---|---|---|---|---|
| C010 | Volume increase | 4.81 | 6.215 | 5.245 | 6.39 | 6.798 |
| | No change | | 4.903 | 4.691 | 5.116 | 5.728 |
| | Volume decrease | | 4.498 | 4.283 | 4.578 | 5.007 |
| C011 | Volume increase | 17.2 | 19.9125 | 17.99 | 19.63 | 21.89 |
| | No change | | 17.072 | 15.83 | 17.2 | 18.66 |
| | Volume decrease | | 15.264 | 14.89 | 15.58 | 16.39 |
| H013 | Volume increase | 2.23 | 2.975 | 2.779 | 3.286 | 3.77 |
| | No change | | 2.207 | 1.951 | 2.351 | 2.882 |
| | Volume decrease | | 1.887 | 1.692 | 2.415 | 2.704 |
| H021 | Volume increase | 1.31 | 1.968 | 1.835 | 2.346 | 3.067 |
| | No change | | 1.314 | 1.044 | 1.449 | 1.934 |
| | Volume decrease | | 0.751 | 0.7378 | 0.8429 | 1.143 |
Once the simulated studies were created, the lesion registration and tracking algorithm was used to
verify if the results, including which lesions are changed and by how much, match the known values.
6.3.2 Evaluation results and tracking results on web-based GUI
After applying lesion registration methodologies to each simulated case, all lesion volume changes
can be observed in the MS eFolder. Figure 6.6 shows how the lesion changes are observed in
MATLAB.
Figure 6.6 Three consecutive normalized brain atlas axial slices with lesion contour
overlays. The blue region indicates lesions in a previous study, and the red regions indicate
how those lesions changed in a subsequent study. The findings are consistent with
simulated data.
In the lesion registration algorithm, all images are normalized to the brain atlas template. Figure
6.6 shows three sample slices of the template, and how the lesion contours are mapped on the atlas.
The blue region is from a previous study, and the red region is from the subsequent study. Lesion
contour changes can be observed and calculated based on the number of non-overlapping pixels
between the two studies.
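The non-overlap calculation can be sketched as below. The sketch assumes both lesion masks have already been normalized to the same atlas grid, so that corresponding array positions refer to the same location; the function name, the returned fields, and the `voxel_volume_mm3` parameter are hypothetical and are not taken from the eFolder implementation.

```python
import numpy as np


def lesion_change(prev_mask, next_mask, voxel_volume_mm3=1.0):
    """Compare two co-registered binary lesion masks.

    Assumes both masks share the same atlas grid; voxel_volume_mm3 is a
    hypothetical parameter for converting voxel counts to volume.
    """
    prev = np.asarray(prev_mask, dtype=bool)
    nxt = np.asarray(next_mask, dtype=bool)
    if prev.shape != nxt.shape:
        raise ValueError("masks must be on the same grid")

    grown = nxt & ~prev    # lesion only in the subsequent study
    shrunk = prev & ~nxt   # lesion only in the previous study
    net_voxels = int(nxt.sum()) - int(prev.sum())
    return {
        "grown_voxels": int(grown.sum()),
        "shrunk_voxels": int(shrunk.sum()),
        "net_change_mm3": net_voxels * voxel_volume_mm3,
    }


# Toy 2 x 3 masks for illustration only.
change = lesion_change(
    [[1, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 0]],
    voxel_volume_mm3=2.0,
)
```

Reporting growth and shrinkage separately, in addition to the net change, distinguishes a lesion that changed shape from one whose volume stayed the same because nothing moved.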
The following figures show the lesion registration outputs displayed in the eFolder GUI.
Figure 6.7 Graphical representation of longitudinal tracking of the four patients with
simulated data from the first scenario.
Figure 6.8 MS eFolder’s graphical tool for tracking volume changes of a single lesion for
patient C011. Blue line represents the overall change, and the black line represents how
much that particular lesion volume has changed.
Overall, the lesion tracking algorithm can track individual lesion volume changes and detect
changes consistent with the simulated data. The web-based GUI is able to graphically show
the changes over time and to display the contour changes. While 48 studies have been planned
to complete the lesion tracking evaluation, analysis for the first 4 has been completed, as shown
in Figure 6.8. The tracking evaluation is a lengthy process, and the remaining evaluations
are to be completed as part of future work.
6.4 Evaluation of data mining capabilities
The data mining feature of the eFolder system was also evaluated. Using aggregated data in the
database and the web-based GUI, a series of queries were performed to find out if there are any
observable differences in lesion volumes and brain parenchymal fractions between Hispanic and
Caucasian patients.
6.4.1 Lesion volume differences between Hispanic and Caucasian patients
Using the data lookup page of the system GUI, all Hispanic and Caucasian patients with
relapsing-remitting MS were identified. Disease duration, total lesion volume, and total number of lesions
were recorded for each patient. The eFolder system was then used to plot lesion volumes in both linear
and scatter plots. Figure 6.9 below displays how data mining results are presented in the eFolder
system.
Figure 6.9 Scatter plot of MS disease duration versus 3-D lesion volume from data mining.
The volume data is separated into two groups: Hispanic patients and Caucasian patients.
The best-fit regression line is automatically applied and shown in the same plot
The data mining results and display show that the eFolder provides the data repository, database, and
user interface needed to perform data mining tasks. Throughout the system development and evaluation,
the data mining process has been repeated more than 100 times, and the eFolder has returned
the expected query results every time. Data analysis toolkits in the eFolder system can provide
big-data analysis and display results.
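The per-group best-fit line drawn on the scatter plot can be illustrated with an ordinary least-squares fit. This is a sketch only: the text does not state which fitting method the eFolder plotting tool uses, so least squares is an assumption, and the record layout and the sample numbers are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(records):
    """Fit one line per group from (group, duration, volume) records."""
    groups = {}
    for group, duration, volume in records:
        xs, ys = groups.setdefault(group, ([], []))
        xs.append(duration)
        ys.append(volume)
    return {group: fit_line(xs, ys) for group, (xs, ys) in groups.items()}


# Made-up records for illustration, not study data.
fits = fit_by_group(
    [
        ("A", 1, 2.0), ("A", 2, 4.0), ("A", 3, 6.0),
        ("B", 0, 1.0), ("B", 2, 1.0),
    ]
)
```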
6.4.2 BPF changes between Hispanic and Caucasian female patients
The second data mining evaluation method is to look up all brain parenchymal fraction (BPF)
numbers of Hispanic and Caucasian female patients. The field “gender = female” as a search
criterion was added to the search for all patients, and the result returns a list of patients fitting all
criteria. BPF values are found by looking up CAD result table with the returned list of patient IDs.
Results are then collected and plotted using the eFolder’s graphical analysis tool. Figure 6.10 shows
the results of the data mining and analysis process.
Figure 6.10 Scatter plot of MS Disease duration versus brain parenchymal fraction between
Hispanic female patients and Caucasian female patients.
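The two-step lookup described in this section (filter patients, then fetch their BPF values from the CAD result table) can be sketched as follows. The eFolder database is MySQL-based; this illustration substitutes an in-memory SQLite database so that it is self-contained, and the table names, column names, and rows are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patients (patient_id TEXT PRIMARY KEY, ethnicity TEXT, gender TEXT);
        CREATE TABLE cad_results (patient_id TEXT, bpf REAL);
        """
    )
    # Made-up rows for illustration only.
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [("P1", "Hispanic", "female"), ("P2", "Caucasian", "female"), ("P3", "Hispanic", "male")],
    )
    conn.executemany(
        "INSERT INTO cad_results VALUES (?, ?)",
        [("P1", 0.80), ("P2", 0.82), ("P3", 0.79)],
    )
    return conn


def female_patient_ids(conn, ethnicities):
    """Step 1: return IDs of female patients in the given ethnic groups."""
    marks = ",".join("?" * len(ethnicities))
    rows = conn.execute(
        f"SELECT patient_id FROM patients WHERE gender = ? "
        f"AND ethnicity IN ({marks}) ORDER BY patient_id",
        ["female", *ethnicities],
    )
    return [row[0] for row in rows]


def bpf_for_patients(conn, patient_ids):
    """Step 2: look up BPF values for the returned patient IDs."""
    marks = ",".join("?" * len(patient_ids))
    rows = conn.execute(
        f"SELECT patient_id, bpf FROM cad_results "
        f"WHERE patient_id IN ({marks}) ORDER BY patient_id",
        list(patient_ids),
    )
    return rows.fetchall()


conn = make_demo_db()
ids = female_patient_ids(conn, ["Hispanic", "Caucasian"])
bpf_rows = bpf_for_patients(conn, ids)
```

Only the placeholder marks are interpolated into the SQL text; the search values themselves are passed as parameters.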
While the two data mining scenarios show inconclusive results due to the small sample size, the
evaluation procedures show that the eFolder is capable of data mining tasks for data analysis and
knowledge discovery. Neurological data and lesion quantification data can be integrated to allow
new data analysis that was previously unavailable or tedious to obtain. As more patient data are
collected, the database is further enriched both for the CAD algorithm and for knowledge
discovery of potential ethnic differences in MS disease presentation amongst Caucasian and
Hispanic population groups. A detailed exploration of how the eFolder system can contribute to
medical and research scenarios is discussed in Chapter 9.
6.5 DICOM-SR Evaluation
DICOM-SR objects, created as output of the CAD module, are designed to be stored in
DICOM-compliant data storage. Since the SR objects are DICOM-compliant, they can be
recognized and read by other DICOM-compliant components in the clinical environment. This
makes the integration of the eFolder with current clinical systems possible. Therefore, it is important
to make sure that the quantitative data outputs from the CAD module are DICOM-SR objects. The
plan for evaluating DICOM-SR objects includes two aspects:
1. The SR objects should be transmitted using DICOM transfer protocols.
2. The SR objects should be recognized and read by a DICOM-SR viewer.
The eFolder’s DICOM-SR objects are evaluated to ensure the two criteria are met. The DICOM-
SR objects include two components and both are evaluated: the structured report that is embedded
in the object’s metadata, and the DICOM-compliant secondary captures (DICOM-SC) that are
created by overlaying lesion contours on original MR images.
6.5.1 DICOM-SR transfer protocol evaluation
To test if the DICOM-SR object can be transferred using DICOM transfer protocols, a simulated
PACS environment has been set up in the Image Processing and Informatics Laboratory (IPILab).
The workstations and servers in the Lab room are connected to each other via intranet. Specifically,
two separate workstations in the Lab network are set up to transmit DICOM files and objects
between them. One workstation is set up to use the DICOM-send function in dcmtk (the
workstation is designated as StoreSCU, or SCU for short), and the other workstation is set up to
use the DICOM-receive function in dcmtk (designated as StoreSCP, or SCP for short). The SCP
(service class provider) in a DICOM-compliant network provides the storage service, i.e. receiving
DICOM objects for storage, and the SCU (service class user) utilizes the store service, sending the
DICOM object to SCP. Figure 6.11 shows the network diagram between SCU and SCP.
Figure 6.11 Network diagram of DICOM-SR data transfer evaluation. StoreSCU uses the
DICOM-send protocol to send DICOM-SR and DICOM-SC objects in the in-lab network.
The StoreSCP receives the DICOM-SR and DICOM-SC objects and stores in the
destination repository.
The SCU and SCP are first tested with DICOM images. Once the images are confirmed to be sent
by the SCU and received by the SCP, the DICOM-SR objects are transferred with the same
procedure. The result confirms that the DICOM-SR objects are recognized by both the SCP and
SCU, and the object is stored in the repository, same as the test DICOM images. DICOM-SR
objects for Phase II studies (72 studies total) are used in the testing and transfer protocol evaluation,
and all 72 cases were sent and received by the two workstations.
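The send/receive test can be outlined with the two dcmtk command-line tools named above. The sketch below only builds the argument lists; the host name, port, directory, and file name are hypothetical, and the options should be checked against the installed dcmtk version before use.

```python
def storescp_command(port, output_dir):
    """Build the argument list for a dcmtk storescp receiver (the SCP)."""
    return ["storescp", "--output-directory", output_dir, str(port)]


def storescu_command(host, port, files):
    """Build the argument list for a dcmtk storescu sender (the SCU)."""
    return ["storescu", host, str(port), *files]


# Hypothetical host, port, and paths. On machines with dcmtk installed,
# each list could be passed to subprocess.run(cmd, check=True).
receiver_cmd = storescp_command(11112, "/tmp/received")
sender_cmd = storescu_command("scp-workstation", 11112, ["report_sr.dcm"])
```

The receiver is started first on the SCP workstation; the sender is then run on the SCU workstation once for each batch of SR or SC files.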
6.5.2 DICOM-SR viewer
Since DICOM-SR and DICOM-SC are DICOM-compliant, the content of those objects should be
parsed by a DICOM parser and displayed via a DICOM object viewer. Therefore, the SR and SC
objects are evaluated by using eFolder’s built-in parser and viewer to display the content. The
DICOM-SR objects are displayed on the web-based eFolder GUI. The GUI uses a DICOM to XML
converter to convert DICOM-SR report content into a web-based XML format. Figure 6.12
(replicated from Figure 5.1) shows that the DICOM-SR content is displayed on the bottom right
panel of the main eFolder UI.
Figure 6.12 Screenshot of comprehensive MS eFolder web-based GUI. Left panel:
patient’s clinical and demographic data. Right top panel: CornerStone image viewer
embedded in eFolder GUI. Right bottom panel: SR document content in tabulated format.
The DICOM-SC images are displayed via the eFolder’s DICOM image viewer. Figure 6.13
(replicated from Figure 5.5) shows how the DICOM-SC images are displayed in the image viewer.
Figure 6.13 The design and layout of the customized eFolder image viewer. The DICOM-
SC images are displayed on the right, while an original MR image is displayed on the left.
All DICOM-SR and DICOM-SC objects are uploaded to the eFolder system, and the system viewer
is able to parse all of the studies (from both Phase II and Phase III, a total of 84 studies) and display
all of the information.
In summary, several key components and features of the eFolder have been evaluated. Table 6.3 shows
what has been covered in this chapter.
Table 6.3 Summary and comparison of eFolder system evaluation
| Features evaluated | Type of tasks performed | Completion status | Success | Short summary |
|---|---|---|---|---|
| CAD module | Sensitivity and specificity analysis | Completed | No | More work is needed to improve CAD module accuracy and performance. Instead, manual contour results are used in the eFolder system. |
| Lesion tracking and registration | Detecting changes in lesion volumes | Ongoing | Yes | Lesion tracking can successfully detect volume changes in the first few studies. 48 studies are planned for evaluation. |
| Data mining | Query/retrieve patient data and perform data analysis | Completed | Yes | The eFolder GUI is used to query the database for all patients to return quantification data to draw data analysis. Multiple data queries are performed and all returned desired results. |
| DICOM-SR | DICOM transfer protocol and DICOM content display | Completed | Yes | The system recognizes SR and SC as DICOM objects, able to send/receive SR/SC objects and able to view content on DICOM-based viewers. |
This chapter has presented the evaluation of various system components with limited available data.
The overall impact of the eFolder system on the current clinical environment is still to be determined
and reserved for future work. In the next chapter, several use-case scenarios are introduced and
highlighted to showcase the potential impact an MS eFolder can have on disease management, MS
research, data analysis, treatment planning, and decision support.
Chapter 7 Clinical significance of the MS eFolder: Use-case scenarios
The MS eFolder is designed to improve clinical workflows and provide data analysis tools for
research. In this chapter, a few clinical use-case scenarios are used as examples of how the
eFolder can have an impact in the clinical environment and on MS-related research.
7.1 Scenario 1: Clinical workflow improvement with the MS eFolder System
The MS eFolder, as an integrated data repository and management system, can improve the clinical
workflow in several ways during a typical patient visit:
1. Provide simpler data collection tools. When a patient checks in at the clinic for the first
time, she fills out a paper-based form for her medical history, medications, etc. Answers
on the paper forms have to be manually entered into a digital spreadsheet or a database
later by staff. With the eFolder system, the data entry form is integrated in the web-based
user interface, and therefore the patient can directly enter the form answers into the system.
The trained staff is also able to help the patient with entering data on the form.
2. Centralize data storage. During the clinical visit, the patient is subject to a neurological
exam and is scheduled for an MRI study. After the clinical visit, the neurological
examination data is either printed and stored with other paperwork items in the patient’s
profile form, or the data is stored in an electronic patient record system at the clinic. The
imaging study is stored in the clinical PACS, and the clinician or physician can access the
images via a PACS workstation or data exported onto a CD. With an integrated eFolder
system, the user is able to access both neurological examination data and the imaging
studies in one system on one interface. The physician can examine and compare the data
and make informed diagnosis and decisions.
3. Open access to other authorized users. The patient’s family doctor, who referred the patient
to the neurological clinic, would like to review the data and diagnosis. Without the web-
based interface of the eFolder, the family doctor would need to either request data access
or wait for the data to be delivered as a parcel, or the patient may bring the data home and
deliver it to the family doctor during the next visit. However, with a web-based interface, the
family doctor is able to request a user account to log into the eFolder system and view the
patient’s results directly. This saves the family doctor valuable time and effort.
4. Make it easier for patients to visit other clinical locations. The patient decides to visit a
different neurological clinic for another exam. The patient data, already stored on the eFolder,
can easily be accessed by her or by authorized users at the new clinic.
The key features of the eFolder system highlighted are its web-based interface for data entry and
data viewing, integration of neurological data and radiological data, and the web server to allow
authorized remote data access. Because the eFolder is designed for multiple sclerosis studies,
users can view clinical records, medical images, and quantification results on any web
browser and perform disease tracking, disease management tasks, and review of patient data. This
is important because other informatics systems, such as a hospital EMR, may not hold all of the
information stored in the MS eFolder.
7.2 Scenario 2: Tracking patient’s progress over time
The eFolder system provides disease tracking and decision support for physicians in treating and
managing the disease course. The following scenario highlights how the eFolder can provide
previously unavailable information that is critical in patient care.
7.2.1 Background
An MS patient has been visiting the clinic regularly for several years. Every year he has a brain MRI
study taken, and his medical history and medical imaging studies are stored in the MS eFolder
system. According to his records, he has been taking Copaxone ever since he was diagnosed with
MS. He has had a few attacks, and his symptoms are getting steadily worse. Lesion volume analysis
from his MRI studies shows a steady increase of MS lesion volumes over time, particularly in the
left motor cortex. The increase of lesion activity in that region matches the patient’s complaint of
numbness in his right leg.
7.2.2 Data mining and decision support
The neurologist, after reviewing the patient’s history, determines that Copaxone alone may be
ineffective and wants to switch the patient to a different treatment course. In the MS eFolder
database, the neurologist looks up patients who match his patient on the same criteria: similar age,
gender, ethnicity, MS disease type, age of onset, and treatment. From this subset of patients, the
neurologist further considers any patients who have had left motor cortex lesions, their treatment
plans, and their disease progression course. The extensive search leads the neurologist to discover
that Tysabri has been shown to be effective in treating the type of disease progression the patient
is having. The neurologist then decides to put the patient on Tysabri. Imaging studies and
neurological examinations are taken during subsequent follow-up visits.
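The cohort lookup in this scenario can be sketched as a simple filter over patient records. This is only an illustration under stated assumptions, not the eFolder's actual query code: the `Patient` record, its field names, the five-year tolerance, and the sample rows are all hypothetical. Treatment is deliberately left out of the match in this sketch so that the matched patients can be grouped by the treatment they received.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout; the real eFolder schema is not shown here.
    patient_id: str
    age: int
    gender: str
    ethnicity: str
    ms_type: str
    age_of_onset: int
    treatment: str
    lesion_regions: frozenset


def similar_patients(index, cohort, tolerance=5, region=None):
    """Return cohort members who resemble the index patient."""
    matches = []
    for p in cohort:
        if p.patient_id == index.patient_id:
            continue  # skip the index patient himself
        if abs(p.age - index.age) > tolerance:
            continue
        if abs(p.age_of_onset - index.age_of_onset) > tolerance:
            continue
        if (p.gender, p.ethnicity, p.ms_type) != (
            index.gender, index.ethnicity, index.ms_type
        ):
            continue
        if region is not None and region not in p.lesion_regions:
            continue
        matches.append(p)
    return matches


def group_by_treatment(patients):
    """Map each treatment name to the IDs of the patients who received it."""
    groups = defaultdict(list)
    for p in patients:
        groups[p.treatment].append(p.patient_id)
    return dict(groups)


# Invented sample rows, for illustration only.
motor = frozenset({"left motor cortex"})
index = Patient("P0", 42, "M", "Hispanic", "RRMS", 30, "Copaxone", motor)
cohort = [
    index,
    Patient("P1", 44, "M", "Hispanic", "RRMS", 31, "Tysabri", motor),
    Patient("P2", 40, "M", "Hispanic", "RRMS", 29, "Copaxone",
            frozenset({"occipital lobe"})),
    Patient("P3", 60, "M", "Hispanic", "RRMS", 30, "Tysabri", motor),
]

hits = similar_patients(index, cohort, region="left motor cortex")
by_treatment = group_by_treatment(hits)
```

The disease progression of each treatment group would then be reviewed by the neurologist; the sketch only narrows the list of records to review.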
7.2.3 Tracking patient’s response to new treatment
Imaging studies from subsequent visits show that the patient’s lesion volume has not changed since
Tysabri was prescribed. The patient also has not experienced worsening symptoms. This suggests that
Tysabri may have helped in managing MS progression for this patient.
The key features of the MS eFolder used in this scenario are its data mining, data analysis, and
knowledge discovery capabilities. Physicians can make knowledge-based decisions to determine disease
progression, consider multiple treatment options, and track the effectiveness of treatments.
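In the simplest case, tracking the response to a treatment change amounts to comparing the lesion volume trend before and after the switch. The sketch below fits an ordinary least-squares slope to each period. The function names, years, and volumes are invented for illustration, and a real evaluation would also have to account for measurement variability between studies.

```python
def slope(points):
    """Ordinary least-squares slope of (year, volume_ml) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two time points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def trend_before_after(history, switch_year):
    """Return (slope before, slope after); the switch year is in both periods."""
    before = [(t, v) for t, v in history if t <= switch_year]
    after = [(t, v) for t, v in history if t >= switch_year]
    return slope(before), slope(after)


# Invented yearly lesion volumes in millilitres; treatment switched in 2013.
history = [(2010, 4.0), (2011, 5.0), (2012, 6.0),
           (2013, 7.0), (2014, 7.0), (2015, 7.0)]
rate_before, rate_after = trend_before_after(history, switch_year=2013)
```

A slope near zero after the switch corresponds to the "lesion volume has not changed" observation in the scenario.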
7.3 Scenario 3: Comparing lesion volume for Hispanic and Caucasian patients
As a data repository and data analysis tool, the MS eFolder can provide quantitative results for
various MS-related research projects. This scenario depicts how a researcher can utilize the MS
eFolder to gather the necessary data for research.
7.3.1 Subject recruitment and data collection
The research project aims to find imaging markers for differences between Hispanic and Caucasian MS
patients. As described in Section 1.3, Hispanic American MS patients are found to have a higher
rate of opticospinal MS and a lower age of onset; both characteristics are similar to those of Asian
MS patients [32]. Hispanic MS patients have a greater prevalence of optic neuritis and spinal cord
syndromes than non-Hispanic patients [55]. The eFolder can be used in this study to confirm imaging
markers consistent with higher opticospinal MS prevalence.
To do this, a large number of Hispanic, Asian, and non-Hispanic/non-Asian MS patients must
be recruited, both brain and spinal MRI studies must be gathered, and imaging analysis must be
completed. First, the research staff uses the MS eFolder to record subject profiles in a database.
Using the database, the subjects can be filtered by age, gender, age of onset, disease type, symptoms,
and other criteria that serve as control variables. Second, the imaging studies are gathered from PACS
or other imaging repositories and uploaded into the MS eFolder through the DICOM uploader.
The development of the eFolder system and the data collection of MS patients were partly driven
by this potential research use-case scenario.
7.3.2 Data analysis and results
In this scenario, the spinal MS analysis has been added into the overall CAD module of the MS
eFolder. The spinal module is able to read spinal and brain MRI and identify and quantify MS
lesions. Once sufficient data is gathered, the researcher performs the quantitative imaging analysis
to obtain 3-dimensional lesion volumes, lesion locations, and brain parenchymal volumes. The
eFolder’s data analysis tool can output the data, and the researcher can perform statistical analysis
to find any correlation between ethnicity and imaging markers. With the quantitative results and the
eFolder’s analysis tools, the researchers can determine that Hispanic American MS patients have more
lesions present in the optic nerve and spinal cord regions. Researchers can also correlate lesion
location with symptoms present in Hispanic American patients. This scenario shows how the
eFolder can provide new quantification data and data analysis tools to help in MS research.
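As one example of the statistical step, two groups of exported lesion volumes could be compared with Welch's t statistic. The choice of test is an assumption of this sketch, since the text does not name one, and the group values are invented placeholders with no clinical meaning.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom.

    Each sample needs at least two values, and the two samples must not
    both have zero variance.
    """
    va = statistics.variance(a) / len(a)  # squared standard error, group a
    vb = statistics.variance(b) / len(b)  # squared standard error, group b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented lesion volumes (ml) for two hypothetical patient groups.
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(group_a, group_b)
```

The statistic and degrees of freedom would then be looked up against a t distribution to obtain a p-value. With real data, the subject filters described in Section 7.3.1 would be applied before the comparison.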
7.3.3 eFolder’s Role in Big Data analysis in Medical Imaging
Another important aspect of the eFolder’s contribution to the field of research is its capability of
storing and analyzing large amounts of data. “Big data” describes the collection and analysis of a
large amount and variety of data [56]. The idea and application of big data are frequently referred
to in business analytics, system integration, machine learning, simulation, and visualization [57].
The characteristics of big data include the volume, variety, velocity, and complexity of the data
collected. The advantages of analyzing big data are that it makes data quickly and easily available
to users and offers unique and innovative data analysis tools that may reveal trends in research
subjects. Results from big data analysis can help draw conclusions about complex problems and help
improve a system’s performance.
Medical imaging data fits the idea of big data in the four key characteristics mentioned above. A
typical radiology clinic generates a very large amount of data per year, and the amount only
increases as enterprise healthcare solutions are on the rise and more and more data is generated on
a daily basis (volume and velocity). For a hospital or clinic, data includes images, videos,
waveforms, text data, reports, and so on (variety). For each patient, his or her health record in a
hospital also includes a variety of data, which needs to be integrated to perform advanced analysis
(complexity). With a large number of subjects and a very large amount of data, the MS eFolder is
a system that is capable of handling the new trend of big data analysis in medical imaging.
These three use-case scenarios in the clinical environment highlight the eFolder system’s
contributions and prowess in various aspects of MS treatment and research. With very large data
sets, the eFolder can handle complex data analysis and provide new overall trends and discoveries
in the MS population. This also showcases the idea of a patient-centric informatics system and how
it may help in patient diagnosis and treatment, quantitative research, and big data analysis in the
future.
Chapter 8 Current Status and Future Work
In this chapter, I will discuss the components and aspects of the eFolder system that have been
completed, the further testing and development planned, and plans for applying the system in the real world.
8.1 Completed components and features of MS eFolder
As of December 2017, the MS eFolder is hosted on a Windows-based web server managed by the
Image Processing and Informatics Laboratory at the University of Southern California. It is used for
project demonstrations and feature testing.
The three major components of the eFolder (the database, the web-based GUI, and the automatic lesion
detection algorithm) have been completed and hosted on the eFolder server. Over the years,
additional features have been integrated into the eFolder system and tested in the
laboratory environment. To summarize, the additional features of the eFolder include:
• DICOM-SR object conversion, management, and display.
• Longitudinal study lesion volume analysis and display. This includes overall volumes and
lesion volumes divided by cortical structural sub-regions.
• Lesion location classification, lesion registration, and display of lesion contour changes.
• IHE post-processing integration profile. This includes work status tracking and a
simulation of workflow in-lab.
• “Big data” analysis toolkit and display. This displays the results of the analysis comparing
Hispanic and Caucasian MS patients.
Data collection for 72 Hispanic and Caucasian MS patients has been completed, and initial testing
of lesion tracking in longitudinal studies has concluded. The automatic CAD module needs to be
further developed to increase detection accuracy, but the system architecture and modular structure
are in place for future upgrades or replacements. Currently, the manual contour results are used in
the eFolder system development and evaluation. Additional data is needed to test and confirm
many of the eFolder’s features and capabilities, which will be discussed in the next section.
8.2 Future work
8.2.1 Multi-site big data storage and analysis
To evaluate the “big data” storage and analysis capabilities of the eFolder system, a large amount
of data is needed to build up the data repository. Data collection at USC needs to continue to build
up the patient base, and imaging data from other imaging centers and clinical sites is needed.
A multi-site evaluation of the eFolder would address the following:
1. The eFolder’s capability of handling a large amount of data. Several aspects of the eFolder will
be tested: the handling of the DICOM uploader and parser to populate the imaging database,
the speed of executing CAD components for new imaging data, and the storage and display of
multiple DICOM-SR objects.
2. Improvement of CAD accuracy. The CAD module performance has room for improvement.
The CAD needs to be re-evaluated for more accurate ROI segmentation. With more data,
the CAD accuracy can be further refined and tested with images from different sources.
3. Addition of spinal MR analysis to the CAD modules to provide more tools to further study
Hispanic American MS patients.
4. Imaging data from different sources to test the DICOM-based functions of the eFolder. The
eFolder system has been built on images from MR scanners located at Keck Hospital at
USC, USC+LAC hospital, and USC Healthcare Consultation Center II. While the eFolder
is able to read and register imaging data from different scanners (made by different
manufacturers), more data is needed to confirm the robustness of the eFolder’s DICOM-based
modules.
5. Testing various data analysis toolkits. The large data sets, especially more patients with
longitudinal studies, can test the eFolder’s lesion registration and tracking toolkit. The large
data sets can also test eFolder’s data mining and plotting tools.
6. Continue collecting more longitudinal studies and continue evaluating lesion registration
and tracking processes. To ensure the effectiveness and accuracy of the lesion registration
algorithm, more real longitudinal studies need to be collected. More data will also increase
the power of statistical analysis involving longitudinal studies. Furthermore, with a more
precise brain atlas, the lesion registration algorithm will be able to pinpoint the subcortical
regions of the lesions, so that location information can be used in data analysis and data mining.
7. Ethnicity-based analysis results and existence of any large-scale trends in neurological
findings and imaging markers. This ties into the previous point 6. With a large enough
dataset, data analysis of observable differences in imaging markers between Hispanic and
Caucasian MS patients becomes significant and thus relevant to learning more about the
nature of multiple sclerosis.
8. Testing the eFolder’s effectiveness in imaging-based clinical trials. The unique ability of
eFolder to quantify and track results can be used in imaging and result-based clinical trials.
The system is also designed to handle large-scale multi-site trials with its data storage and
remote access via web GUI.
The additional data collection for the MS eFolder for big data analysis is key to demonstrating the
unique features and capabilities of the system.
8.2.2 Future work needed for deployment for clinical use
Just like a typical clinical PACS or EPR, the eFolder system needs to be robust enough for day-to-
day operations of clinical workflow. This includes testing of large workloads in a large-scale
clinical hospital like the USC Hospital system. In particular, the CAD component needs either to
process incoming imaging data concurrently or to have a queue system that handles
incoming quantification jobs one by one. The system should be tested against high data traffic and
simultaneous access from multiple users. This benchmark would be the first step in deployment
into a real-world environment.
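The one-by-one option can be sketched with a first-in, first-out job queue served by a single worker thread. This is a minimal sketch, not the eFolder's implementation: `run_cad`, the study identifiers, and the result strings are hypothetical placeholders for the real CAD call and its output.

```python
import queue
import threading


def run_cad(study_id):
    # Placeholder for the real CAD call, which is not shown in this sketch.
    return f"quantified:{study_id}"


def cad_worker(jobs, results):
    """Process queued studies one at a time until a None sentinel arrives."""
    while True:
        study_id = jobs.get()
        try:
            if study_id is None:
                return  # sentinel received: stop the worker
            results.append(run_cad(study_id))
        finally:
            jobs.task_done()  # mark this item as handled, sentinel included


jobs = queue.Queue()
results = []
worker = threading.Thread(target=cad_worker, args=(jobs, results))
worker.start()

# Incoming studies are queued in arrival order (identifiers are invented).
for study_id in ["study-001", "study-002", "study-003"]:
    jobs.put(study_id)
jobs.put(None)  # sentinel: no more work

jobs.join()    # wait until every queued item has been processed
worker.join()
```

With a single worker, studies are processed strictly in arrival order. Starting several workers on the same queue would give the concurrent alternative, at the cost of results arriving out of order.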
Another real-world requirement for the eFolder system is fault tolerance. In case of a system-wide
failure, eFolder data should be readily available from back-up servers and services. The eFolder
system requires regular data back-ups, and the eFolder back-up procedures and workflows can be
integrated into the fault-tolerance workflow of the existing clinical PACS and ePR.
In addition to integrating the fault-tolerance workflows and guidelines of the eFolder with those of
existing clinical systems, the eFolder system should also be able to communicate with those systems
for data acquisition and analysis. This includes technical challenges such as knowing the transfer
protocols of the existing system (both text and images) and building custom interfaces for the eFolder to pull
data from, or store data to, those existing systems. Currently, I have not tested other HL7 or
DICOM-based interfaces to access data from a PACS or RIS/HIS, and this integration step can be useful
for more direct data collection. It would be more convenient for users than having other systems
output imaging data to another storage medium before it is uploaded into the eFolder, or manually
entering patient data that should already exist in the RIS/HIS.
References
[1] R Benedict, J Bobholz “Multiple Sclerosis” Semin Neurol 2007; 27(1): 078-085
[2] S Ringold, C Lynn, R Glass “Multiple Sclerosis” JAMA. 2005;293(4):514
[3] D Hafler “Multiple Sclerosis” J. Clin. Invest. 2004, 113(6):788-794
[4] Multiple Sclerosis Association of America. MS Overview & Glossary: Who Gets Multiple
Sclerosis. Available at http://mymsaa.org/ms-information/overview/who-gets-ms/. Last accessed
June 8, 2016.
[5] D Hafler “The distinction blurs between an autoimmune versus microbial hypothesis in
multiple sclerosis” J. Clin. Invest. 1999, 104:527-529
[6] B Cree, O Khan, D Bourdette, et al. “Clinical characteristics of African Americans vs
Caucasian Americans with Multiple Sclerosis” Neurology 63:2039-2045 (2004)
[7] K Yamasaki, J Kira, Y Kawano, et al. “Western versus Asian types of multiple sclerosis:
Immunogenetically and clinically distinct disorders” Annals of Neurology 40:4 569-574 (2004)
[8] FD Lublin, SC Reingold. “Defining the clinical course of multiple sclerosis: results of an
international survey. National Multiple Sclerosis Society (USA) Advisory committee on Clinical
Trials of New Agents in Multiple Sclerosis” Neuro. 1996; 46(4):907-11
[9] FD Lublin, SC Reingold, JA Cohen, et al. Defining the clinical course of multiple sclerosis:
The 2013 revisions. Neurology. 2014;83(3):278-286. doi:10.1212/WNL.0000000000000560.
[10] KO Lovblad, N Anzalone, A Dorfler, et al. “MR Imaging in Multiple Sclerosis: Review and
Recommendations for Current Practice” Am J Neuroradiol 2010; 31:981-89
[11] WI McDonald, SA Hawkins. “Clinical study of primary progressive multiple sclerosis:
guidelines from the International Panel on the diagnosis of multiple sclerosis” Ann Neurol
2001;50:121-7
[12] CH Polman, SC Reingold, B Banwell, et al. Diagnostic criteria for multiple sclerosis: 2010
Revisions to the McDonald criteria. Annals of Neurology. 2011;69(2):292-302.
[13] Canadian Agency for Drugs and Technologies in Health (CADTH). Recommendations for
drug therapies for relapsing-remitting multiple sclerosis. Ottawa (ON): Canadian Agency for
Drugs and Technologies in Health (CADTH); 2013 Oct. 19. Accessed online at
https://www.guideline.gov/summaries/summary/48222/recommendations-for-drug-therapies-for-relapsingremitting-multiple-sclerosis
on 2016 Jul. 9.
[14] MW Tas, F Barkhof, MA van Walderveen, et al. “The effect of gadolinium on the sensitivity
and specificity of MR in the initial diagnosis of multiple sclerosis” Am J Neuroradiol
1995;16:259-64
[15] KO Lovblad, N Anzalone, A Dorfler, et al. “MR Imaging in Multiple Sclerosis: Review and
Recommendations for Current Practice” Am J Neuroradiol 2010; 31:981-89
[16] Kurtzke JF. “Rating neurologic impairment in multiple sclerosis: an Expanded Disability
Status Scale (EDSS)”. Neurology. 1983;33:1444-1452.
[17] Miller DH, et al. “MRI outcomes in placebo-controlled trial of natalizumab in relapsing
MS”. Neurology 2007;68:1390-1401.
[18] M Filippi, MA Horsfield, M Rovaris, et al. “Intraobserver and Interobserver Variability in
Schemes for Estimating Volume of Brain Lesions on MR Images in Multiple Sclerosis” Am J
Neuroradiol 1998;19:239-44
[19] A Wong, A Gertych, CS Zee, et al. “A CAD system for assessment of MRI findings to track
the progression of multiple sclerosis”. Proceedings of SPIE Medical Imaging, Vol. 65142U-1-7,
February 2007.
[20] B Weinstock-Guttman, M Ramanathan, K Hashimi, et. al. “Increased tissue damage and
lesion volumes in African Americans with multiple sclerosis” Neuro. 2010; 74:538-544
[21] J McElroy, J Oksenberg. “Multiple Sclerosis Genetics” from Advances in multiple Sclerosis
and Experimental Demyelinating Diseases, Vol. 318 pp 45-72
[22] SJ Kenealy, MC Babron, Y Bradford, et. al. “A second-generation genomic screen for
multiple sclerosis” (2004) Am J Hum Genet 75: 1070-1078
[23] J. Kira. “Multiple sclerosis in the Japanese population” Lancet Neuro. 2003; 2:117-27
[24] B Weinstock-Guttman, M Ramanathan, K Hashimi, et. al. “Increased tissue damage and
lesion volumes in African Americans with multiple sclerosis” Neuro. 2010; 74:538-544
[25] I Kister, E Chamot, J Bacon et. al. “Rapid disease course in African Americans with multiple
sclerosis” Neuro. 2010; 75:217-223
[26] H Shibasaki, W McDonald, Y Kuroiwa. “Racial modification of clinical picture of multiple
sclerosis: comparison between British and Japanese patients.” J Neuro Sci 1981; 49:253-71
[27] JR Oksenberg, SE Baranzini, LF Barcellos, SL Hauser (2001) “Multiple sclerosis: genomic
rewards.” J Neuroimmunol 113:171–184
[28] B. Weinstock-Guttman, L. Jacobs, C. Brownscheidle, et. al. “Multiple Sclerosis
characteristics in African American patients in the New York State Multiple Sclerosis
Consortium” Mult. Scle. 2003; 9:293-8
[29] G Dean, A Bhigjee, P Bill, et. al. “Multiple sclerosis in black South Africans and
Zimbabweans” J Neurol Neurosurg Psychiatry 1994;57:1064-69
[30] J Haines, M Ter-Minassian, J Gusella, et. al. “A complete genomic screen for multiple
sclerosis underscores a role for the major histocompatibility complex” (1996) The Multiple
Sclerosis Genetic Group. Nat Genet 13:469-471
[31] V Rivera, “Multiple Sclerosis in Latin America: Reality and Challenge” Neuroepidemiology
2009;32:394-395
[32] L. Amezcua “MS in a Hispanic population” 2010 Consortium of Multiple Sclerosis Centers,
May/02/2010
[33] B Robinson, “A model patient records system” FEDCW (2005) 19:4 32
[34] AC Sek, NT Cheung, KM Choy, “A territory-wide electronic health record--from concept to
practicality: the Hong Kong experience” Stud Health Technol Inform. 2007;129(Pt 1):293-6
[35] MA Hattwich, “Computer Stored Ambulatory Record Systems in Real Life Practice” Proc
Annu Symp Comput Appl Med Care 1979 October 17:761-764
[36] MYY Law, B Liu, LW Chan, “DICOM-RT-based Electronic Patient Record Information
System for Radiation Therapy” Radiographics 2009; 29:961-972
[37] J Documet, A Le , B Liu, et. al. “A multimedia electronic patient record system for image-
assisted minimally invasive spinal surgery” JCARS 2010; 5:3 195-209
[38] Q Hu, W Nowinski, “A rapid algorithm for robust and automatic extraction of the
midsagittal plane of the human cerebrum from neuroimages based on local symmetry and outlier
removal.” NeuroImage, 20(4):2153-2165, Dec 2003.
[39] Z Shan, G Yue, J Liu, “Automated histogram-based brain segmentation in t1-weighted three-
dimensional magnetic resonance head images.” NeuroImage, 17(3):1587-1598, Nov 2002.
[40] M Vågberg, T Lindqvist, K Ambarki, et al “Automated Determination of Brain Parenchymal
Fraction in Multiple Sclerosis” American Journal of Neuroradiology March 2013, 34 (3) 498-
504; DOI: https://doi.org/10.3174/ajnr.A3262
[41] P Anbeek, K Vincken, M van Osch, et. al. “Probabilistic segmentation of white matter
lesions in MR imaging.” NeuroImage, 21(3):1037-1044, Mar 2004.
[42] A Moore, “Efficient Memory-based Learning for Robot Control” PhD. Thesis; Technical
Report No. 209, Computer Laboratory, University of Cambridge, 1991, Chapter 6
[43] K Friston, “Statistical Parametric Mapping the Analysis of Functional Brain Images.”
London: Academic, 2007. Print.
[44] J Ashburner, K Friston "Unified Segmentation." NeuroImage 26.3 (2005): 839-51. Web.
[45] J Ashburner, K Friston "Voxel-Based Morphometry—The Methods." NeuroImage 11.6
(2000): 805-21. Web.
[46] (Dec. 23rd, 2014) P Tsui “Fast EM_GM”
http://www.mathworks.us/matlabcentral/fileexchange/9659 MATLAB Central
[47] K Leemput, F Van, D Maes, et al "Automated Segmentation of Multiple Sclerosis Lesions
by Model Outlier Detection." IEEE Transactions on Medical Imaging 20.8 (2001): 677-88. Web.
[48] A Mechelli, C Price, K Friston, J Ashburner “Voxel-Based Morphometry of the Human
Brain: Methods and Applications” Current Medical Imaging Reviews 2005, 1 (2) 105-113; DOI:
https://doi.org/10.2174/1573405054038726
[49] VS Fonov, AC Evans, K Botteron, CR Almli, RC McKinstry, DL Collins and BDCG,
Unbiased average age-appropriate atlases for pediatric studies, NeuroImage,Volume 54, Issue 1,
January 2011, ISSN 1053–8119, DOI: 10.1016/j.neuroimage.2010.07.033
[50] VS Fonov, AC Evans, RC McKinstry, CR Almli and DL Collins, Unbiased nonlinear
average age-appropriate brain templates from birth to adulthood, NeuroImage, Volume 47,
Supplement 1, July 2009, Page S102 Organization for Human Brain Mapping 2009 Annual
Meeting, DOI: http://dx.doi.org/10.1016/S1053-8119(09)70884-5
[51] RA Cote, et. al. SNOMED international: the systematized nomenclature of human and
veterinary medicine Northfield: College of American Pathologists; Schaumburg, IL: American
Veterinary Medical Association, 1993
[52] HK Huang. PACS and Imaging Informatics: Basic Principles and Application. (2010)
Wiley-Blackwell; Ed. 2
[53] IHE Post-processing Workflow http://wiki.ihe.net/index.php?title=Post-Processing_Workflow
[54] (Nov. 2013) Xml2dsr: Convert DICOM SR file and data set to XML
http://support.dcmtk.org/docs/xml2dsr.html
[55] M Langille, T Islam, M Burnett, L Amezcua. “Clinical Characteristics of Pediatric-Onset
and Adult-Onset Multiple Sclerosis in Hispanic Americans”. Journal of Child Neurology, (2016)
31(8), 1068-1073.
[56] (Dec. 2013) “What is Big Data?” by University Alliance at Villanova University
http://www.villanovau.com/resources/bi/what-is-big-data/#.VOUo6JjF_Vo
[57] J Manyika et al. “Big data: The next frontier for innovation, competition, and productivity.”
McKinsey&Company, May 2011.
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
List of Publications
Peer-reviewed manuscripts
Ma KC, Fernandez JR, Amezcua L, Lerner A, Shiroishi MS, Liu BJ. Design and development of
an ethnically-diverse imaging informatics-based eFolder system for multiple sclerosis patients.
Computerized medical imaging and graphics : the official journal of the Computerized Medical
Imaging Society. 2015;46(Pt 2):257-268. doi:10.1016/j.compmedimag.2015.09.007.
Huang HK, Deshpande R, Documet J, Le AH, Lee J, Ma K, Liu BJ. Medical imaging informatics
simulators: a tutorial Int J CARS (2014) 9: 433. doi:10.1007/s11548-013-0939-y
Conference proceedings
Ma K, Liu J, Zhang X, et. al. Lesion registration for longitudinal disease tracking in an imaging
informatics-based multiple sclerosis eFolder. Proc. SPIE 9789, Medical Imaging 2016: PACS and
Imaging Informatics: Next Generation and Innovations, 97890F (25 March 2016); doi:
10.1117/12.2217903
Hwang DH, Ma K, Yepes F, et. al. Multidimensional Interactive Radiology Report and Analysis:
standardization of workflow and reporting for renal mass tracking and quantification. Proc. SPIE
9681, 11th International Symposium on Medical Information Processing and Analysis, 96810C (22
December 2015); doi: 10.1117/12.2211526
Ma K, Wang X, Lerner A, et. al. Big data in multiple sclerosis: development of a web-based
longitudinal study viewer in an imaging informatics-based eFolder system for complex data
analysis and management. Proc. SPIE 9418, Medical Imaging 2015: PACS and Imaging
Informatics: Next Generation and Innovations, 941809 (17 March 2015); doi: 10.1117/12.2082650
Ma K, Wong J, Zhong M, et. al, Development of a web-based DICOM-SR viewer for CAD data
of multiple sclerosis lesions in an imaging informatics-based efolder. Proc. SPIE 9039, Medical
Imaging 2014: PACS and Imaging Informatics: Next Generation and Innovations, 903903 (2014);
doi:10.1117/12.2044076.
Ma K, Reddy N, Deshpande R, et al., Integration of imaging informatics-based multiple sclerosis
eFolder system for multi-site clinical trials utilizing IHE workflow profiles, Proceedings of SPIE
Vol. 8674, 86740A (2013)
Ma K, Fernandez J, Amezcua L, et al., Lesion comparison of multiple sclerosis in hispanic and
caucasian patients utilizing an imaging informatics-based eFolder system, Proceedings of SPIE Vol.
8319, 83190G (2012)
Suh J, Ma K, Le A, Improvement of MS (multiple sclerosis) CAD (computer aided diagnosis)
performance using C/C++ and computing engine in the graphical processing unit (GPU).
Proceedings of SPIE Vol. 7967, 79670V (2011)
Ma K, Fernandez J, Amezcua L, Lerner A, Liu B, Evaluation of an automatic multiple sclerosis
lesion quantification tool in an informatics-based MS e-folder system. Proceedings of SPIE Vol.
7967, 79670K (2011)
Liu M, Loo J, Ma K, Liu B Development of a data mining and imaging informatics display tool for
a multiple sclerosis e-folder system. Proceedings of SPIE Vol. 7967, 796709 (2011)
Le AH, Park YW, Ma, K, Jacobs C, Liu BJ. Performance evaluation for volumetric segmentation
of multiple sclerosis lesions using MATLAB and computing engine in the graphical processing
unit (GPU). Proceedings of SPIE Vol. 7628, 76280W (2010)
Jacobs C, Ma K, Moin P, Liu B. An automatic quantification system for MS lesions with integrated
DICOM structured reporting (DICOM-SR) for implementation within a clinical environment.
Proceedings of SPIE Vol. 7628, 76280K (2010)
Ma K, Jacobs C, Fernandez J, Amezcua L, Liu B, (2010) "The development of a disease oriented
eFolder for multiple sclerosis decision support," Proceedings of SPIE Vol. 7628, 76280G
Ma K, Moin P, Zhang A, Liu B, (2010) "Computer-aided bone age assessment for ethnically diverse
older children using integrated fuzzy logic system," Proceedings of SPIE Vol. 7628, 76280I
Moin P, Ma K, Amezcua L, Gertych A, Liu B,(2009) "The development of an MRI lesion
quantifying system for multiple sclerosis patients undergoing treatment," Proceedings of SPIE Vol.
7264, 72640J
Ma KC, Zhang A, Moin P, Fleshman M, Vachon L, Liu B, Huang HK, (2009) "An online real-time
DICOM web-based computer-aided diagnosis system for bone age assessment of children in a
PACS environment," Proceedings of SPIE Vol. 7264, 726418
Lee JC, Ma KC, Liu, BJ, (2008) "Assuring image authenticity within a data grid using lossless
digital signature embedding and a HIPAA-compliant auditing system," Proceedings of SPIE Vol.
6919, 69190O
Zhang A, Uyeda J, Tsao S, Ma K, Vachon LA, Liu BJ, Huang HK (2008) "Web-based computer-
aided-diagnosis (CAD) system for bone age assessment (BAA) of children," Proceedings of SPIE
Vol. 6919, 69190H
Zhou Z, Ma K, Talini E, Documet J, Liu B, (2007) "Design and implementation of a web-based
data grid management system for enterprise PACS backup and disaster recovery," Proceedings of
SPIE Vol. 6516, 65160S.
Conference presentations
Ma K, Zhang J, Zhong H, Liu B. Web-based DICOM-SR Viewer for CAD Data of Multiple
Sclerosis Lesions in an Imaging Informatics-based eFolder. Radiological Society of North America
2013 Scientific Assembly and Annual Meeting, Applied Science (Computer Exhibit)
Ma K, Reddy N, Amezcua L, Liu B. “Bridging the Gap between Post-processing Innovations and
Real-World Clinical Use: The Utilization of an eFolder with Embedded IHE Workflow Profile for
MS Clinical Trials” RSNA 2012 Applied Science (Computer Exhibit) LL-INE1240-SUB
Ma K, Wang X, Reddy N, Liu B. “Challenges and Lessons Learned in Normalizing Pathologic
Brains in MS and Stroke Studies with Clinical Image Scanning Protocols” RSNA 2012 Scientific
Informal (Poster) Presentation LL-INE1267
Ma K, Garrison K, Fernandez J, Lerner A, Shiroishi M, Amezcua L, Liu B. “An Automatic Multiple
Sclerosis Lesion Tracking Tool for Longitudinal MRI Studies” RSNA 2011 Scientific Informal
(Poster) Presentation LL-INS-SU9B
Loo J, Ma K, Liu B, Amezcua L. “A timeline widget correlating MRI lesion load and EDSS
parameters in a large cohort for MS patients” RSNA 2011 Education Exhibit #11003141
Documet J, Ma K, Maziad A, Huang HK. “A Novel Multimedia Electronic Patient Record (ePR)
system to Enhance the Workflow, Accuracy, and Management of Various Types of Image-assisted
(IA) Surgery” RSNA 2011 Education Exhibit #11003593
Lerner A, Ma K, Fernandez J, et al. “Lesion Volume and Cerebral Volume Loss in Patients of
Hispanic Descent with Multiple Sclerosis” RSNA 2010 Scientific Formal (Paper) Presentations.
Abstract ID: 9004633
Ma K, Fernandez J, Loo J, Amezcua L, Lerner A, Shiroishi M, et. al. “Evaluation Methodology for
an Automatic Multiple Sclerosis Lesion Detection and Quantification System in a Clinical
Environment” RSNA 2010 Scientific Informal (Paper) Presentations. Abstract ID: 9007545
Ma K, Fernandez J, Amezcua L, Loo J, Liu M, Liu B. “A Comprehensive Decision Support Tool
for Multiple Sclerosis Treatment Targetting Hispanic Population” RSNA 2010 Education Exhibits.
Abstract ID: 9004623
Le A, Ma K, Suh J, Park YW, Deshpande R, Liu B. “A 3D Volumetric CAD of Multiple Sclerosis
Using Multi-core CPUs and General Purpose Graphics Processing Units (GPGPU)” RSNA 2010
Education Exhibits. Abstract ID: 9005851
Ma K, Moin P, Hong X, Law MY, Liu BJ, Huang HK, et al. (2009) “Multi-Modality Electronic
Patient Record (ePR) for Diagnostic Breast Imaging Centers,” RSNA 2009 Scientific Poster, LL-IN2121-D06
Documet J, Lee R, Ma K, Liu B (2009) “A Benchmark Study to Measure Native Client-side Image
Manipulation Performance Utilizing Firefox 3.5, Safari 4, Chrome 2, and Internet Explorer 8,”
RSNA 2009 Education Exhibit, LL-UR4738
Ma K, Tsai CC, Moin P, Lee J, Liu BJ (2009) “Dynamic Digital MR Image Visual Templates for
Grading White-Matter Lesions from Vascular Dementia in a PACS Environment,” RSNA 2009
Education Exhibit, LL-IN3538
Ma K, Zhang A, Fleshman M, Vachon L, Liu B, Moin P, Huang HK. “An Online Real-Time
DICOM Web-based Computer-aided Diagnosis (CAD) System for Bone Age Assessment of
Children in a PACS Environment: Clinical Validation,” RSNA 2008 Education Exhibit, LL-PD4093-H07
Ma KC, Le AH, Fernandez JF, Chan T, Liu BJ, Huang HK, et al. “A Timely Computer-aided
Detection System for Acute Ischemic and Hemorrhagic Stroke on CT in an Emergency
Environment,” RSNA 2008 Education Exhibit, LL-IN1105
Chao S, Le A, Ma K. “Image Sharing and Distribution of a Regional Health Information
Organization (RHIO) through the Utilization of a DICOM, IHE, HL7 Compliant Data Grid using
XDS,” RSNA 2007 Education Exhibit, LL-IN5282
Lee J, Zhou Z, Talini E, Ma K, Liu B, Huang HK. “A Data Grid for Enterprise PACS Tier 2 Storage
and Disaster Recovery,” RSNA 2007 InfoRAD, LL-IN5226-B
Abstract
Purpose: Multiple sclerosis (MS) is a degenerative neurological disease that affects over 2.3 million people worldwide. Symptoms vary greatly and can be disabling and, in the most severe cases, life-threatening. MRI is currently used to identify MS lesions in the brain and to assess the affected regions and the severity of the disease. The goals of treating MS are to prolong patient life, to limit relapses and attacks, and to improve patients’ quality of life. Tracking MS patients’ disease progression in longitudinal studies is essential. However, disease tracking can be difficult because of intra- and inter-personal variability when MS lesion volumes are calculated manually. A comprehensive, integrated patient record system, called the MS eFolder, was therefore designed and developed to store and display patient data, perform quantitative imaging analysis, and provide big-data analysis tools. The MS eFolder aims to aid in three areas: disease tracking, decision support, and data mining, for both clinical and research purposes.
Method: The MS eFolder includes three components: the eFolder database, the computer-aided detection (CAD) algorithm, and a web-based graphical user interface (GUI). The three components are integrated and interconnected for data mining and data retrieval. The eFolder database stores patients’ demographic and clinical data, DICOM-based imaging data, and quantitative data such as lesion volume, number of lesions, lesion locations, and brain parenchymal volumes. The MS CAD system is built on the Statistical Parametric Mapping (SPM) toolkit in MATLAB. The primary results of the CAD algorithm include the number of lesions, lesion volumes, and brain parenchymal volumes. Further post-processing of lesion contours identifies the location of each lesion, performs lesion registration, and tracks lesion volume changes in longitudinal studies. CAD data is converted into DICOM-SR format for storage. A web-based GUI displays patients’ clinical information, imaging data, CAD data, and other data analysis results. The eFolder system can be readily installed in a clinical environment based on the IHE post-processing integration protocol.
Summary: The MS eFolder system has been designed and developed as an integrated data storage and mining solution for both clinical and research environments, and it provides unique features such as quantitative lesion analysis and disease tracking over a longitudinal study. The comprehensive database of integrated image and clinical data in the MS eFolder provides a platform for treatment assessment, outcomes analysis, and decision support. The proposed system also serves as a platform into which future quantitative analyses, derived automatically from CAD algorithms, can be integrated for individual disease tracking and future MS-related research. Ultimately, the eFolder provides a decision-support infrastructure that can eventually add value to the overall electronic medical record.
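As a rough illustration of the longitudinal lesion-volume tracking described in the Method paragraph, the Python sketch below computes the change in total lesion volume between consecutive studies of one patient. The `StudyResult` record, its field names, and the `volume_changes` helper are hypothetical and are not taken from the eFolder implementation, which the abstract describes as built on SPM in MATLAB; the numbers are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StudyResult:
    """Hypothetical per-study CAD summary; field names are illustrative only."""
    study_date: date
    lesion_count: int
    total_lesion_volume_ml: float
    brain_parenchymal_volume_ml: float


def volume_changes(studies):
    """Return (study_date, absolute change, percent change) versus the previous study."""
    ordered = sorted(studies, key=lambda s: s.study_date)
    changes = []
    for prev, curr in zip(ordered, ordered[1:]):
        delta = curr.total_lesion_volume_ml - prev.total_lesion_volume_ml
        # Percent change is undefined when the earlier study has zero lesion volume.
        if prev.total_lesion_volume_ml == 0:
            pct = None
        else:
            pct = 100.0 * delta / prev.total_lesion_volume_ml
        changes.append((curr.study_date, delta, pct))
    return changes


# Made-up values for illustration, not data from the dissertation.
example = [
    StudyResult(date(2011, 6, 1), 14, 5.0, 1100.0),
    StudyResult(date(2010, 6, 1), 12, 4.0, 1120.0),
]
for when, delta, pct in volume_changes(example):
    print(when, delta, pct)
```

Sorting by study date before pairing means the input order does not matter, which is convenient when study summaries arrive from a database query without a guaranteed order.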
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
A medical imaging informatics based human performance analytics system
Development of an integrated biomechanics informatics system (IBIS) with knowledge discovery and decision support tools based on imaging informatics methodology
Knowledge‐driven decision support for assessing radiation therapy dose constraints
An electronic patient record (ePR) system for image-assisted minimally invasive spinal surgery
Decision support system in radiation therapy treatment planning
Identifying injury risk, improving performance, and facilitating learning using an integrated biomechanics informatics system (IBIS)
Molecular imaging data grid (MIDG) for multi-site small animal imaging research based on OGSA and IHE XDS-i
Pattern detection in medical imaging: pathology specific imaging contrast, features, and statistical models
Data-driven and logic-based analysis of learning-enabled cyber-physical systems
Learning from limited and imperfect data for brain image analysis and other biomedical applications
Theoretical foundations for modeling, analysis and optimization of cyber-physical-human systems
Asset Metadata
Creator
Ma, Kevin Chikai (author)
Core Title
Imaging informatics-based electronic patient record and analysis system for multiple sclerosis research, treatment, and disease tracking
School
Viterbi School of Engineering
Degree
Doctor of Philosophy
Degree Program
Biomedical Engineering
Publication Date
02/08/2018
Defense Date
01/24/2018
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
computer-aided detection, imaging informatics, medical imaging, multiple sclerosis, OAI-PMH Harvest, system integration
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Liu, Brent (committee chair), Zhou, Qifa (committee chair), McNitt-Gray, Jill (committee member)
Creator Email
kevincma@usc.edu,kma830@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-471018
Unique identifier
UC11266945
Identifier
etd-MaKevinChi-6019.pdf (filename), usctheses-c40-471018 (legacy record id)
Legacy Identifier
etd-MaKevinChi-6019.pdf
Dmrecord
471018
Document Type
Dissertation
Rights
Ma, Kevin Chikai
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
computer-aided detection
imaging informatics
medical imaging
multiple sclerosis
system integration