Making Way for Artificial Intelligence in Cancer Care:
How Doctors, Data Scientists and Patients Must Adapt to a Changing Landscape
Alexandra Demetriou
Specialized Journalism
Master of Arts
University of Southern California
August 2019
Table of Contents

I. Cancer Through the Eyes of AI
II. What Makes a Model?
III. Not All Data Are Created Equal
IV. AI in Practice: Ensuring Equity
V. Bibliography
Cancer Through the Eyes of AI
Dan Ruderman sits in front of his computer, toggling knobs on a control panel with the
enthusiasm of a young kid at an arcade. But this is no video game — he’s navigating through a
3D model of a breast tumor, demonstrating the way his research can help pathologists probe
further into biopsy samples and diagnose cancers with greater accuracy than ever before. Much
of this is made possible thanks to artificial intelligence.
His eyes light up as he reminisces about his humble beginnings in programming, when as a
child he would visit the local RadioShack on Saturday mornings to run programs he had
handwritten the night before. “They loved that,” Ruderman laughs, “because here's this 12-year-old
kid, and if he can do it, anybody can do it.”
He followed his interest in programming through graduate school and earned his doctorate in
theoretical physics from the University of California at Berkeley. But even still, the computing
power and data he could leverage was limited. “My PhD thesis was about analyzing statistical
properties of a data set of 45 images — 45 pictures of the forest at 256 x 256 pixels, that I
went out and took with a still video camera. And that was not that easy to do,” he adds.
Flash forward to today, and he and his colleagues at the University of Southern California are
working to advance cancer diagnostics by designing machine learning algorithms that can read
pathology slides and draw insights the human eye cannot. When diagnosing breast cancer, for
example, one of the most important questions doctors need to answer is whether or not a
patient’s cancer cells have a molecule on their surfaces called estrogen receptor, which causes
1. Ruderman, Dan. (Assistant Professor of Research Medicine, USC), in discussion with the author. May 9, 2019.
the tumor to grow in response to the hormone estrogen. If they do, oncologists prescribe a
therapy targeted against the estrogen receptor molecules — but determining whether or not a
patient’s tumor responds to estrogen in the first place requires staining biopsy samples in a
process that can be expensive and inconsistent depending on where it's done.
“We wanted to understand whether that can be figured out from the tissue architecture itself
rather than by actually staining the tissue for the molecule estrogen receptor,” Ruderman
explains. “We don't really understand how tumors grow, how they get from point A to point B.
Nobody can ever create a movie of how a tumor came into being. But how tumors grow, the
shapes they take, and how different cells arrange themselves inside a tumor really speaks to the
biology of that tumor. And so the question is, can you work backwards by looking at the pattern
of how this tissue arises and how it ends up, to say, ‘it's this kind of a tumor and it’s going to
respond to this kind of drug’?”
The answer is yes, you can. Ruderman worked with graduate student Rishi Rawat at USC’s
Ellison Institute for Transformative Medicine to create a deep neural network that analyzes
images of breast cancer and can spot patterns in the tissue’s architecture to predict whether the
cells have estrogen receptor on their surfaces. The network did so by discovering that the nucleus
of a breast cancer cell with estrogen receptors is shaped slightly differently than a cell without —
and it discovered that distinction on its own, with no human guidance about what to look for.
2. Rawat, Rishi R., et al. “Correlating Nuclear Morphometric Patterns with Estrogen Receptor Status in Breast Cancer Pathologic Specimens.” Npj Breast Cancer, vol. 4, no. 1, 2018, doi:10.1038/s41523-018-0084-4.
“One of the problems with human vision is that we have foveation. We look at one part of the
world with very high detail,” Ruderman says. “Then we move our eyes around to take in the
whole scene. But if you think about it, just take your thumb out at arm’s length and look at it —
your thumb is this tiny, detailed little region, but everything else outside of that region is
amazingly blurry. And yet we get this sense that we have all this information about the world.
We really don't. It's really an illusion.”
By contrast, Ruderman explains, computers can take in every single pixel of an image at once
and process them all simultaneously. “Our eyes are not optimized for doing pathology. Whereas
we can build a computer vision system, where all it's ever seen in its life is pathology images of
breast cancer, and see if it can do as good or maybe better than what the pathologist can do,”
Ruderman says. “So just at a very fundamental level, there's this notion that using computers, we
should be able to do a better job of making these decisions than we can by eye,” he concludes.
“There's also the notion that different pathologists will see different things on the exact same
slide, and they will make contrasting decisions about 30% of the time. And that's a lot.”
As of now, the algorithm has yet to surpass the diagnostic abilities of molecular staining — the
gold standard in pathology — but it certainly can do more to predict estrogen receptor status
than a human ever could by eye. “The key thing is we can do something,” says Ruderman.
Rawat and Ruderman’s work is one of many examples of the ways artificial intelligence holds
the power to draw previously unnoticed connections from existing data, assist physicians in
diagnosis and treatment planning, and, potentially, revolutionize cancer care as we know it.
“We went through all these waves of medicine. First was molecular medicine, then there was big
data, and now we need something to integrate it all,” explains Dr. Patricia Raciti, a pathologist
at Paige.AI. “It just makes sense that as we get more and more data, AI is a great way to
integrate it all and derive insights from that data that we as humans might not be able to derive.”
Answers to some of the most perennially confounding questions about cancer could be right
under doctors’ and researchers’ noses, but humans just can’t see those connections in all of this
data. A machine, however, just might be able to.
“It’s just the reality of this kind of data deluge that clinicians are facing,” says Andrew Rebhan, a
research consultant and AI specialist with Advisory Board's Health Care IT Advisor. From
electronic health records and increased digitization, to personalized genomic data and input from
wearable devices, Rebhan says it’s inevitable that doctors will have to integrate artificial
intelligence into their workflows. “Quite frankly, I don't know how a human being could
possibly keep up with all of this.”
Worldwide, the number of newly diagnosed cancer cases per year is expected to rise to 23.6
million by 2030, and the United States estimates its expenditures on cancer will far surpass the
approximately $147.3 billion spent in 2017. The field of oncology is therefore ripe for any
3. Raciti, Patricia. (Pathologist, Paige.AI), in discussion with the author. June 18, 2019.
4. Rebhan, Andrew. (Research consultant, Advisory Board), in discussion with the author. June 14, 2019.
advancements AI can deliver to help doctors and researchers keep up with cancer’s tremendous
pace and costs.
"AI is important in cancer care specifically because cancer is such a complex, expensive
conglomeration of diseases to treat, and we’ve been trying for many years with varied degrees of
success,” Raciti says. “So the exploration for ways to detect it, ways to improve treatment
response, and ways to detect survivability is so important — anything that's out there is worth
exploring.”
Beyond helping pathologists classify cancers upon diagnosis, Raciti adds that a promising boon
of machine learning involves feeding algorithms additional historical data — including former
patients’ responses to drugs and their ultimate outcomes — to create a model that can predict
future patients’ disease progressions.
“You can essentially give an algorithm a thousand pathology slides of cancer in patients that did
fantastic, and a thousand slides of the same cancer in which patients did not have a good course
and perhaps died very prematurely — even though at diagnosis they looked the same and the
parameters were reported the same by the pathologist,” Raciti says. The algorithm can then
independently learn from the two data sets to pinpoint differences that might help doctors better
predict progression, and, hopefully, find ways to intervene sooner and more successfully. “That,
to me, is very exciting,” Raciti says, “because we can certainly re-create the wheel and just make
a computer do what a pathologist does, or we can have a computer do something completely
different and add way more value than the pathologist currently can in the care of a patient."
AI applications in cancer care reach far beyond pathology. Algorithms can be implemented to
plan radiation therapy, provide clinical decision support, speed up drug discovery, match
patients to clinical trials, optimize drug dosing, and generally streamline the ways oncologists
and researchers can help patients. But these applications are largely still in their experimental
phases, and there are a number of obstacles doctors and data scientists must overcome before AI
can improve cancer care across the board. Whether it meets its potential or falls short of its hype
has more to do with the humans operating the AIs than with the machines themselves. The
initiative that doctors and the public take to educate themselves and prepare the way for AI will
ultimately determine how soon and how much machine learning will disrupt cancer care.
What Makes a Model?
The term artificial intelligence generally refers to computers that behave intelligently, and it can
describe anything from basic programs that automate repetitive or administrative tasks, to deep
learning algorithms that are capable of carrying out unsupervised learning and creating new
connections on their own.
For a relatively simple example of AI applied in cancer care, Rebhan describes an algorithm that
can be trained to scan x-rays and identify pathological signs. “It could be trained to say, ‘This
image gives an indication of pneumonia.’ It could do that 99 times out of a hundred,” he
explains, “but that same algorithm can't tell the difference between an x-ray of a dog and a cat.”

5. Suleyman, Mustafa. “The promising role of AI in helping plan treatment for patients with head and neck cancers.” DeepMind. September 13, 2018.
6. “Philips and Dana-Farber operationalize and scale Clinical Pathways at ASCO 2018.” Philips News Center. May 31, 2018.
7. Fleming, Nick. “How artificial intelligence is changing drug discovery.” Nature. May 30, 2018.
8. Murphy, Chris. “Developers Use Artificial Intelligence To Match Patients To Clinical Trials.” Forbes. March 12, 2019.
9. Neely, Michael. (Chief, Division of Infectious Diseases, Children’s Hospital Los Angeles), in discussion with the author. June 14, 2019.
This type of machine learning is called supervised learning, and it mostly functions to automate
routine tasks that humans could perform fairly easily. Unsupervised learning, on the other hand,
is where much of AI’s potential in oncology starts to unfold.
“The idea of a neural network is designed, roughly speaking, on how the nervous system works,”
explains Ruderman. He compares the way a machine analyzes pixels of an image with the way
our neurons integrate vast amounts of sensory information to draw simplified conclusions. A
deep neural network is essentially an intricate web of nodes. A computer can evaluate millions of
potential connections between the nodes to pull forth relationships hidden in the data. To train a
network — much as the human brain learns from repeated experience — researchers input data,
such as images of breast cancer cells paired with outcomes like positive or negative for estrogen
receptor. The computer establishes relationships between the inputs and outputs, and the more
data it sees, the more strongly meaningful connections are reinforced. Feed a
network enough of the right data, and it’s possible to come up with innovative results like Rawat
and Ruderman’s and harness AI to discover patterns that surpass what pathologists can notice on
their own.
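In miniature, that training procedure looks something like the Python sketch below — a single artificial neuron shown synthetic, made-up “shape” measurements paired with labels, its connection weights nudged after every pass. This is a toy illustration of the mechanism, not the Ellison Institute’s actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 "cells", each with two hypothetical shape
# measurements. The hidden rule linking shape to receptor status is known
# here only so we can check that the neuron finds it.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # 1 = receptor-positive

w = np.zeros(2)  # connection weights, reinforced during training
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Repeated exposure to labeled examples nudges each weight in whatever
# direction reduces the error between predictions and known outcomes.
for _ in range(500):
    p = sigmoid(X @ w + b)           # current predictions
    grad_w = X.T @ (p - y) / len(y)  # each weight's contribution to the error
    grad_b = float(np.mean(p - y))
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

accuracy = float(np.mean((sigmoid(X @ w + b) > 0.5) == y))
```

A real network stacks thousands of such neurons, but the loop is the same: predictions compared against known outcomes, connections adjusted, repeat.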
The same types of networks that can process millions of pixels of image data could potentially
mine vast amounts of clinical data to predict how patients might respond to a given drug, or what
their individual survival probabilities look like down the road.
Dr. Jeremy Mason, the Mathematical Oncology Lead at USC’s Convergent Science Initiative in
Cancer, is using FDA data to do exactly that. “You have side effect data, you have survival data,
you have progression data, you have baseline demographic data, mutational status, gender, age,
height, weight, body mass index — you have all this information, so it’s just a matter of pooling
all this data together and asking the right questions,” Mason explains.
“So one question could
be, if I’m looking at eight clinical trials that are testing three different drugs, which one is the
right one for a patient? Which one is the one that will extend survival or minimize progression or
side effects? All that data exists, so from the modeling side, it’s simply, ‘What are the right
inputs to predict an output?’ You just have to define it,” he says. “It all comes down to the
question you’re trying to ask.”
So once the question is defined, how exactly does the network process so many different metrics
to make predictions?
“You have a network of neurons, which is made in layers,” Ruderman says. “The early layers
take all the data and look at it, the later layers have fewer and fewer neurons. At the very end
when you're making a decision, you have essentially one neuron that says yes or no.” Networks
with more layers can carry out increasingly complex tasks, but they also require more computing
power, and their connections are often difficult for a human to follow or explain. “You have to
give up on some of that stuff,” Ruderman says. "What you get is the power that this thing can do.”
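That funnel shape — wide early layers narrowing down to a single yes/no neuron — can be sketched as a forward pass in Python. The layer sizes and random weights here are arbitrary and untrained, chosen only to show the structure:

```python
import numpy as np

rng = np.random.default_rng(1)

def layer(n_in, n_out):
    """One layer of neurons: a weight matrix and a bias per neuron."""
    return rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out)

# Early layers take in all the data; later layers have fewer and fewer
# neurons, ending in a single neuron that says yes or no.
sizes = [64, 32, 8, 1]
layers = [layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(x):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)        # hidden neurons fire or stay silent
    return 1.0 / (1.0 + np.exp(-x))       # final neuron: a yes/no probability

x = rng.normal(size=(1, 64))  # one input with 64 measurements
p = forward(x)                # a single number between 0 and 1
```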
10. Mason, Jeremy. (Mathematical Oncology Lead, CSI-Cancer, USC Michelson Center for Convergent Bioscience), in discussion with the author. February 11, 2019.
This feature of deep learning has already led to seminal advances in research, such as when an
automated software called C-Path discovered that characteristics of the tissue surrounding a
tumor — known as the stroma — are more useful for predicting a breast cancer patient’s survival
than details about the cancer itself.
For an example outside the science world, Ruderman points
to Google’s AlphaGo, a deep learning algorithm that was able to beat the world’s champion at
the board game Go by coming up with a new move that humans had never seen before.
“People
in the Go community actually use that move now. So it's an example where a neural network can
help train people to do a better job.”
But when it comes to medicine, the same unsupervised learning that has the potential to teach
doctors and researchers to do better can also stir a bit of fear. “This is the AI explainability
problem, and it's a big one,” Ruderman says. “This is the notion that the AI goes off and learns
something, it does something, but it doesn't tell you how it's doing it. This is an active area of
research.”
“The explainability and the black box effect have thrown some caution into this space," Rebhan
says. "Even if the system is correct the majority of the time, in healthcare, people want to
know why a care decision was made."
11. Beck, Andrew H., Ankur R. Sangoi, Samuel Leung, Robert J. Marinelli, Torsten O. Nielsen, Marc J. Van De Vijver, Robert B. West, Matt Van De Rijn, and Daphne Koller. “Systematic analysis of breast cancer morphology uncovers stromal features associated with survival.” Science Translational Medicine 3, no. 108 (2011): 108ra113.
12. Metz, Cade. “In Two Moves, AlphaGo and Lee Sedol Redefined the Future.” Wired. March 16, 2016.
“There are some things you can do, like asking, ‘What part of this image was the most important
for that decision?’ And it can tell you, ‘Oh, it's this region here,’” Ruderman says, adding that
researchers are learning how to decode the decisions a network makes to both better train the
physicians using the network, and to gain their trust in the machine.
“From the academic space, we've shown that these machines can be trained quite well and
perform quite well. But I think that there's still a bit of a hesitancy when it comes to the clinical
space,” Rebhan adds, juxtaposing the skepticism surrounding machine learning with the way
doctors readily lean on their own discernment and intuition when it comes to diagnosis. In the
same vein, he notes that pharmaceutical companies and researchers often can’t explain why a
drug works in some cases and not in others, but doctors and patients are still willing to use the
drug if it works on the majority of clinical trial subjects. “Yet there’s this kind of tension
demanding that AI be fully transparent and perfect before it becomes a wide-scale, highly
adopted practice,” Rebhan says. “It's an interesting dynamic in the space.”
The Cleveland Clinic’s Center for Clinical Artificial Intelligence, one of the world’s leaders in
implementing AI into medical practice, has an innovative solution to this problem: the doctors
program their own models.
“I'm a programmer. I'm a physician, but a programmer,” says Dr. Aziz Nazha, the hematologist
and medical oncologist who leads the Center for Clinical Artificial Intelligence.
The center is
one of the only large-scale artificial intelligence centers based in a hospital and run by
13. Nazha, Aziz. (Director, The Cleveland Clinic Center for Clinical Artificial Intelligence), in discussion with the author. June 20, 2019.
physicians, and they are working to employ emerging AI technologies to improve diagnostics,
treatment planning and predictive modeling. One of the center’s missions is to produce
“physician-data scientists,” whom Dr. Nazha describes as clinicians who see patients in addition
to conducting research and programming models they can use in clinical practice. The Center has
just started offering a course designed to familiarize physician-data scientists with platform
programming, and a course for medical students is underway as well. “I believe we have to start
from the medical students and teach them about this technology and how they incorporate it into
their future plans in terms of research and clinical application,” Dr. Nazha says.
Dr. Nazha specializes in treating patients with leukemia, with a research focus on a subtype
called myelodysplastic syndrome (MDS). Clinicians often use the Revised International
Prognostic Scoring System (IPSS-R) to stratify MDS patients and make treatment decisions
based on their risks of disease progression and mortality. However, Dr. Nazha noticed
discrepancies between the scoring system’s predictions and the realities of his clinical practice.
He and his team used clinical, mutational and genomic data to build a prediction algorithm that
could churn out survival probabilities at different timepoints of disease — all personalized to the
individual. His team’s model outperformed all other commonly used models for predicting MDS
progression and overall survival.
The team is now building a website to share their model with
other physicians, as well as patients who want to better understand their prognoses.
“I build models, and the students, medical students and residents who work with me program
models,” Dr. Nazha explains, “so when we use our algorithms, we know the ins and outs.” Dr.
14. Nazha, Aziz, et al. “A Personalized Prediction Model to Risk Stratify Patients with Myelodysplastic Syndromes (MDS).” Blood 130, no. Suppl 1 (2017): 160.
Nazha emphasizes the value he places on using their own models rather than those built by an
outside data scientist — regardless of how high quality another programmer’s model might also
be — because they can better peer inside the black box if they design and understand the models
themselves. “We always try to dissect the features — number one, to make sure that the
variables included in the final model make sense clinically to us, and number two, to also try to
learn from the machine learning algorithm. We’ve actually built models that have very high
accuracy, but when we tried to dissect them clinically, they didn't make sense to us, so we
trashed the models,” he recounts. “And I think that's the power when you are a physician who's
trying to build a model. You understand what the physician needs, you understand what the
patient needs, and at the same time, you understand the strengths and weaknesses of the
algorithm.”
Nazha adds that this type of explainability is possible, even for very large and complex data sets.
He refers to another one of their projects, which used data from 1.5 million Cleveland Clinic
admissions to predict readmission risk and patients’ mental state within 24 to 48 hours of
readmission. “This is a huge amount of data. You have about 600 to 700 variables per patient, so
you're talking about almost a billion data points,” Nazha remarks. Importantly, they weren’t
merely trying to figure out if a given patient would be readmitted or not. “I can predict if a
patient is going to come back to the hospital, but for me as a physician, that’s useless,” Nazha
says. “What are really important to me are the factors that brought the patient back. And if I
know the factors, I might be able to do something to avert that outcome,” he concludes.
The team designed an algorithm with explainability in mind so they could extract features from
the model, then plotted those metrics to demonstrate the ways certain variables impacted a
patient’s readmission more than others. “I think the major issue with physicians accepting the
models is that machine learning seems like a black box, and they don't understand it,” Nazha
says. “But there are several techniques that you can use to extract the important features that
impact the algorithm’s decision, and then you can plug those features in and learn from them.
We have done that, and we’ve learned a lot.”
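One simple, model-agnostic version of such a technique is permutation importance: scramble one variable at a time and measure how much the model’s accuracy falls. The sketch below uses made-up data and a stand-in “black box” — an illustration of the idea, not the Cleveland Clinic’s method:

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up records: three variables per patient, of which only the first
# two actually drive the (synthetic) readmission outcome.
X = rng.normal(size=(500, 3))
y = (2.0 * X[:, 0] + X[:, 1] > 0).astype(float)

def model(X):
    """Stand-in for a trained black-box model that happens to be right."""
    return (2.0 * X[:, 0] + X[:, 1] > 0).astype(float)

base_acc = np.mean(model(X) == y)

# Shuffle one variable at a time: a large accuracy drop marks a factor
# the model's decisions genuinely depend on.
importance = []
for j in range(X.shape[1]):
    X_shuffled = X.copy()
    X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
    importance.append(float(base_acc - np.mean(model(X_shuffled) == y)))
```

Here the first variable matters most, the second less, and shuffling the third — which the model never uses — costs nothing: exactly the ranking of factors a physician would want extracted.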
The highest-performing networks tend to also be the least explainable, so striking a balance
between the complexity clinicians want in a model and the transparency they need is one of the
strongest assets of a center like Cleveland Clinic’s. As AI continues to take hold in cancer care,
medicine and data science will have to go hand in hand so programmers will know the right
questions to ask of their models, and physicians will be able to pull clinically useful information
out of them.
“Everybody often thinks of AI as replacing a clinician or trying to re-create the clinician's role,
but what I think is the most exciting and promising part is that it can actually enhance the
clinician's role,” says Raciti. “Not just to make what currently exists more efficient, more cost-
effective and more accurate, but to actually introduce a whole new dimension to the care of a
patient to provide personalized care in a way that has been limited in current practice.”
Dr. Laurence Court, a physicist at MD Anderson Cancer Center who uses machine learning to
improve radiation therapy planning, adds that artificial intelligence tools have their limitations,
15. Došilović, Filip Karlo, Mario Brčić, and Nikica Hlupić. “Explainable artificial intelligence: A survey.” In 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 0210-0215. IEEE, 2018.
and their recommendations should be verified by a human operator.
“Just like a human process
can go wrong, so can an automated process,” says Court. “A person will see things that may
seem obvious [to a human], but they may not be obvious to a computer.”
A famous, non-medical proof of concept is that of the panda-gibbon adversarial example: an
image of a panda is uploaded to a neural network, and the algorithm reads the image as a panda
with 57.7% confidence. A hacker can superimpose a layer of noise over the image — which still
looks exactly like a panda to the human eye — and the algorithm will read it as a gibbon with
99.3% confidence.
The point is that humans and algorithms clearly have different skill sets
when it comes to information analysis, so people and machines will always have to pick up the
slack the other leaves behind.
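The mechanism behind the panda-gibbon trick is strikingly simple: nudge every pixel a tiny amount in whichever direction most increases the model’s error. It can be reproduced in miniature on a toy linear classifier — the published example attacked a deep network; this sketch only shows the principle:

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A toy linear classifier over a 100-"pixel" image: p > 0.5 reads class A.
w = rng.normal(size=100)                 # the model's learned weights
x = rng.normal(size=100)                 # a random input image
x = x - w * ((w @ x - 1.0) / (w @ w))    # adjust it so the model leans class A

p_clean = float(sigmoid(w @ x))          # roughly 0.73: "class A"

# Fast-gradient-sign attack: shift every pixel by at most 0.08 in the
# direction that pushes the score toward the wrong class.
eps = 0.08
x_adv = x - eps * np.sign(w)
p_adv = float(sigmoid(w @ x_adv))        # now the model reads class B

# Per pixel, the two images differ by less than a tenth of a typical
# pixel value -- imperceptible to a human, decisive for the model.
```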
It’s, hopefully, pretty easy for a human to spot machine errors like a mislabeled panda, but what
about in cancer care, where oncologists don’t have a way of knowing how a unique patient
should respond to an algorithm’s suggested treatment? What can doctors do to ensure their AI
does no harm?
“The quality of data, the type of data, the amount of data — that's the most important part,”
Nazha insists, “so we take careful consideration when we think about these projects.”
16. Court, Laurence. (Assistant Professor Tenure Track, Department of Radiation Physics, Division of Radiation Oncology, The University of Texas MD Anderson Cancer Center), in discussion with the author. June 17, 2019.
17. Goodfellow, Ian, Nicolas Papernot, Sandy Huang, Rocky Duan, Pieter Abbeel, and Jack Clark. “Attacking Machine Learning with Adversarial Examples.” OpenAI.com. February 24, 2017.
This idea of the data being paramount is echoed throughout the world of medical AI, and it's
linked to an essential problem: as machine learning inevitably makes its way into clinical
practice, doctors will need to be able to trust the algorithms they’ve built. But as a network’s
computing power increases, a physician’s ability to explain it starts to decay. More likely than
not, the black box problem isn’t going anywhere. To be certain medical AI of the future is as
reliable and as safe as possible, perhaps doctors should start looking more toward something they
can do a much better job of controlling: the data that trains an algorithm and determines the
quality of its outputs in the first place.
Not All Data Are Created Equal
Granted, data isn’t exactly a sexy topic. That’s why it often gets passed over by the media in
favor of hyping the latest AI to outperform doctors at what doctors do best, or painted as the
villain when people remember that bad data can have bad consequences. But the reality is that
data needs more attention, because the use of AI in cancer care is still very much in its nascent
stages. Whether it progresses forward as a paradigm-shifting tool or stumbles along
inconsistently depends heavily on the data doctors and scientists feed their algorithms.
“The quality of the data is incredibly important, but less talked about,” says Raciti, referring back
to her own field of pathology and noting that, in theory, any hospital could start digitizing glass
18. Olson, Parmy. “This AI Just Beat Human Doctors On A Clinical Exam.” Forbes. June 28, 2018.
19. Ross, Casey. “What if AI in health care is the next asbestos?” STAT. June 19, 2019.
slides given the money and trained personnel. Pairing a slide with the right diagnosis, on the
other hand, is where it gets tricky.
Raciti works for Paige.AI, a computational pathology company that, thanks to a partnership with
Memorial Sloan Kettering Cancer Center, has access to a repository of over 25 million tissue
pathology slides diagnosed by some of the world’s top doctors.
Pathologists at Paige.AI don’t
have to worry about cracked slides, poor quality staining, or holes in the tissue, but these are very
real and confounding concerns for data scientists trying to train an algorithm off of less refined
images and diagnoses. “An algorithm is created and it's not meant to think,” Raciti says. “It's
meant to take an input, you program it, and it puts out an output. And if the input is garbage, the
output will be garbage.”
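That garbage-in, garbage-out principle is easy to demonstrate. In the Python sketch below — entirely synthetic, with a hypothetical “staining artifact” standing in for bad inputs — the same toy classifier is trained twice, once on correct labels and once on labels that record the artifact rather than the truth:

```python
import numpy as np

rng = np.random.default_rng(5)

def train(X, y, steps=300, lr=0.5):
    """Fit a single-neuron classifier (a toy stand-in for a real model)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

# Two made-up tissue measurements per sample; the true status depends
# only on the first one.
X = rng.normal(size=(400, 2))
y_true = (X[:, 0] > 0).astype(float)

# Garbage labels: they record a hypothetical staining artifact (driven
# by the second measurement) instead of the true status.
y_garbage = (X[:, 1] > 0).astype(float)

X_test = rng.normal(size=(400, 2))
y_test = (X_test[:, 0] > 0).astype(float)

def accuracy(w, b):
    return float(np.mean(((X_test @ w + b) > 0) == y_test))

acc_clean = accuracy(*train(X, y_true))       # learns the real pattern
acc_garbage = accuracy(*train(X, y_garbage))  # faithfully learns the artifact
```

The second model is not broken — it learns its garbage labels perfectly well — which is precisely what makes bad training data so dangerous.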
“Generally speaking, an algorithm cannot be better than the data that has been used to train that
algorithm,” says Dr. Issam El Naqa, a professor of radiation oncology at the University of
Michigan.
He references the widely publicized example of Amazon’s facial recognition
algorithm that demonstrated bias against women and people of color,
and he points out that
cases like this are often due to biased data sets rather than issues with the algorithms themselves.
Similar bias exists in medical data: clinical trials produce some of the health care industry’s
highest quality data and therefore the best material for modeling, yet minorities are largely
20. Lunden, Ingrid. “Paige.AI nabs $25M, inks IP deal with Sloan Kettering to bring machine learning to cancer pathology.” TechCrunch.com. 2018.
21. El Naqa, Issam. (Professor of Radiation Oncology, University of Michigan), in conversation with the author. June 5, 2019.
22. Metz, Cade, and Natasha Singer. “A.I. Experts Question Amazon’s Facial-Recognition Technology.” The New York Times. April 3, 2019.
underrepresented in this data, and only about five percent of all cancer patients participate in
clinical trials in the first place.
“The first step is more complete data,” Mason says. “There are so few patients, and there’s a
lot of missingness to the data, which is a problem we’re looking at and something
computer science is trying to handle.”
Ruderman points out that too little data can be a serious obstacle when designing a model. “If
you don't have enough training data, you're going to have a neural network that's under-trained,”
he says. “It has more parameters than you'll have data to tell those parameters what to be.”
The amount of useful data shrinks even further in practice, when data scientists have to account
for the fact that cancer is not one, but many varied diseases that require unique models. “You
can’t build a cancer prediction model. Never going to work,” Mason explains. “A lung cancer
prediction model will work better, but still probably not good enough. And the more specific you
get, the fewer and fewer patients will be in that pool.”
“Cancer’s not one time point. It changes,” Mason continues. “Does the data reflect that? How do
[we] build a model when one patient we have has 17 blood draws, but the others have an average
of 2?” he muses. “How do you build models where you don’t lose that information, but you also
don’t make it an N of 1 problem?”
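One common way to handle that mismatch — sketched here with hypothetical numbers, not Mason’s actual pipeline — is to summarize each patient’s series, however long, into the same fixed set of features, so a trend computed from 17 draws and a trend computed from 2 land in the same model inputs:

```python
import numpy as np

# Hypothetical blood-draw values: one patient with many measurements,
# two with only a couple (illustrative numbers only).
patients = {
    "A": [11.2, 11.0, 10.8, 10.5, 10.1, 9.8, 9.4],
    "B": [12.1, 11.9],
    "C": [9.5, 9.6],
}

def summarize(values):
    """Map a series of any length to one fixed set of features,
    keeping trend information instead of discarding extra draws."""
    v = np.asarray(values, dtype=float)
    t = np.arange(len(v), dtype=float)
    slope = float(np.polyfit(t, v, 1)[0]) if len(v) > 1 else 0.0
    return {"n_draws": len(v), "mean": float(v.mean()), "slope": slope}

features = {pid: summarize(vals) for pid, vals in patients.items()}
```

Every patient now contributes the same three inputs, and the richer series still informs the trend — without turning each unusual patient into an N-of-1 model.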
23. “Minorities in Clinical Trials.” U.S. Food and Drug Administration. August 6, 2018.
24. Zimmerman, Brian. “Just 5% of cancer patients participate in clinical trials: 5 things to know.” Becker’s Hospital Review. December 11, 2017.
25. Mason, Jeremy. Ibid.
Mason and many other members of the research community advocate for more data sharing to
help fill these deficits. But concerns over patient privacy, institutions’ unwillingness to give
away valuable research data they’ve collected, and the sheer effort it can take to aggregate the
right data from patients’ records all confound the sharing process.
“Data sharing is a huge challenge, quite honestly, in terms of scaling any efforts that you're
trying to do beyond your local populations,” admits Court, who is studying ways to globalize the
same AI radiation planning technology his team is implementing at MD Anderson. Cutting
through institutional red tape has been one of his biggest challenges so far.
Joseph Paul Cohen, a postdoctoral fellow with Turing Award laureate Yoshua Bengio at Mila
and the University of Montreal, insists that this lack of data sharing slows down progress.[26] But
he also acknowledges that hospitals view data as an asset, and they’re not likely to want to give
that asset away any time soon. Mason jokes that the way each hospital or institution currently
guards its own data mirrors the music industry in the 1990s. “Every recording industry had its
own music format, and they all thought music was theirs,” he says. “Then Napster came along
and started sharing with everybody, and it completely revolutionized music.”
Mason’s comment is not to encourage haphazard sharing of patients’ data. Patient privacy is
sacred in the medical community, and researchers and doctors are duly cautious about who is
granted access to work with potentially sensitive patient information. But Mason brings up
[26] Cohen, Joseph Paul. (Leader of the Medical Research Group, Mila, Montreal), in discussion with the author. June 16, 2019.
another interesting point through a new project he’s starting, which speaks to the need to rethink
data sharing in an age when so much of our personal data is already online.
Mason’s upcoming project involves partnering with USC’s Information Sciences Institute to
mine either Facebook or Twitter data for patient information that will be fed into their models.
“It’s all public data,” Mason insists, pointing out that many patients actively post details about
their cancers in public support groups or through chats organized by hashtag.
“Every Monday night at 9 pm Eastern, for one hour they have a breast cancer social media chat.
#BCSM. Every Monday night, for about three years now, people get on and just tweet with that
hashtag,” Mason explains. “And there are sites where you can pull a transcript from a certain day
with a certain time range with a certain tag, and they have data going back throughout the years.
So if you pull that entire transcript and [if] you have a certain question, you can mine that data.
It’s all doable,” he concludes.
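The kind of transcript mining Mason describes can be illustrated with a short sketch. The records and field names below are hypothetical, and the real step of pulling tweets from Twitter or an archive site is not shown; the sketch only shows the filtering step, keeping posts that carry a given hashtag inside a given time window.

```python
# Hypothetical sketch: filter an already-downloaded tweet transcript down
# to posts carrying a hashtag within a chosen time window.
from datetime import datetime

def filter_transcript(tweets, hashtag, start, end):
    """tweets: list of {'time': datetime, 'text': str} records."""
    return [
        t for t in tweets
        if hashtag.lower() in t["text"].lower() and start <= t["time"] <= end
    ]

transcript = [
    {"time": datetime(2019, 6, 3, 21, 10), "text": "Joining #BCSM tonight"},
    {"time": datetime(2019, 6, 3, 23, 0),  "text": "#BCSM recap later"},
    {"time": datetime(2019, 6, 3, 21, 30), "text": "unrelated post"},
]

# Keep only the Monday 9-10 pm chat hour Mason describes.
chat_hour = filter_transcript(
    transcript, "#BCSM",
    datetime(2019, 6, 3, 21, 0), datetime(2019, 6, 3, 22, 0),
)
print(len(chat_hour))  # 1 tweet falls inside the window with the hashtag
```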
The #BCSM Community is a worldwide volunteer organization[27] dedicated to using social media
to share information and support those affected by breast cancer. The mere existence of online
communities like #BCSM raises the question: if patients and survivors are willing to share
details about their cancers on social media to benefit others going through the same experiences,
might they also be willing to contribute their data to open repositories that researchers could use
to create disease models? Most researchers seem to agree that now is the time to tip the scales
and encourage patients to take a more active role in data sharing than ever before.
[27] Breast Cancer Social Media Community Website. https://bcsm.org/
Cohen points out that patients simply consenting to have their data shared when they’re at the
hospital isn’t enough, because the data often end up getting stuck within that hospital’s system
and only shared via research agreements. An alternative is for patients to take a more active role
in ensuring they are represented in research by sharing their data themselves. For example, the
All of Us Research Program,[28] run by the NIH, represents a good model for essentially
crowdsourcing medical data to increase the quantity and diversity of data that can help advance
precision medicine. Individuals can voluntarily contribute their data to the study, and their
information will be aggregated into a large database. The NIH is entrusted with the task of
protecting contributors’ privacy, and it can in turn share the data with approved researchers
around the world.
The All of Us Research Program is a promising step toward getting people outside of the medical
community invested in data sharing and encouraging more people to value the impact their data
can have on the future of medicine. But oftentimes the subsets of the population historically
underrepresented in medical data and clinical trials are also less likely to step up and start sharing
their data on their own. Dr. Christina Chapman, a radiation oncologist at the University of
Michigan Medical Center who focuses on delivering technologically advanced medical care to
all populations, says one of the major obstacles to getting necessary data from diverse
populations is a lack of trust. Marginalized patients have little incentive to share their data with a
health care system already stained with inequity. She calls upon researchers to conduct more
community-level outreach and speak to diverse populations to build that trust. If not, the same
[28] “About the All of Us Research Program.” The National Institutes of Health website.
bias that already exists in medical data will perpetuate itself on a larger scale in the algorithms
that inform cancer care of the future.
Mason concludes that modeling a disease as complex as cancer will ultimately require more
complete data and more data sharing – and a certain degree of standardization as well. He
describes the cumbersome nature of combing through FDA clinical trial data, in which different
studies record their data in different formats, and having to use one algorithm just to extract and
standardize all the metrics before feeding them into any of his prediction algorithms. “In the
past,” Mason says, “not as many people cared how this data was organized, because not as many
people were thinking about modeling cancer the way we are today. It wasn’t at the forefront of
their minds.”
Standardization can come down to relatively simple matters like converting between metric
versus imperial systems, but it gets complicated when researchers start adding details like
personalized genomic data into a model. “There are a lot of different systems out there to
actually compute and measure your genomics, but which one is ‘right’? And even if they’re all
‘right,’ how should you record in that format so you can easily translate to another format?”
Mason contemplates. “Because same as language, English isn’t necessarily right compared to
French or compared to Spanish –– you just need some base that you can communicate across.”
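The unit-standardization step Mason mentions can be sketched in a few lines. The record format and field names below are hypothetical; the point is simply that one conversion pass puts mixed-unit records on a common base before any modeling begins.

```python
# Hypothetical sketch: normalize patient weights recorded in mixed units
# (imperial vs. metric) onto a single metric base.

LB_PER_KG = 2.2046226218

def to_kg(record):
    """Normalize a {'weight': float, 'unit': 'kg'|'lb'} record to kilograms."""
    if record["unit"] == "kg":
        return record["weight"]
    if record["unit"] == "lb":
        return record["weight"] / LB_PER_KG
    raise ValueError(f"unknown unit: {record['unit']}")

trial_a = {"weight": 70.0, "unit": "kg"}   # from a metric-system site
trial_b = {"weight": 154.3, "unit": "lb"}  # from an imperial-system site

print(round(to_kg(trial_a), 1))  # 70.0
print(round(to_kg(trial_b), 1))  # 70.0 -- same weight, now on one base
```

Weights are the easy case; as Mason notes, the same idea gets far harder for things like genomic measurements, where the competing formats do not have a single agreed conversion.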
“People should release more public data, and then we can discuss what's inconsistent,” Cohen
remarks, and refers back to his own experiences dealing with the discrepancies in medical
diagnosis and data recording that occur around the world.
Cohen leads the medical research group at the Montreal Institute for Learning Algorithms,
known as Mila. In January, he and his team released a free web-based chest x-ray analysis tool,
aptly named Chester,[29] to give medical students and doctors the opportunity to tinker with AI’s
diagnostic potential. Chester was trained on the NIH’s public x-ray data, so the algorithm’s
diagnoses mimic the way the NIH classifies findings on a chest x-ray. But Cohen frequently
hears feedback from doctors using Chester in other countries who point out that they diagnose
the same radiological findings differently. He predicts that as more institutions continue to share
data, the discussion surrounding diagnostic differences will come to the forefront, and the global
medical community will have to determine how best to accommodate different diagnostic truths
into machine learning tools.
Ideally, oncologists and data scientists will get to a point where they have enough high-quality
data and access to that data to create algorithms that are well tailored to the unique and diverse
populations benefitting from them. “If you have a patient, and you have enough data from a
cohort that's like that patient, you can make a specific neural network that will work best for that
patient,” Ruderman explains, though he qualifies that accumulating enough of the right
information to train such networks is paramount.
AI in cancer care has a long way to go before this goal becomes a reality. But as doctors and
researchers hone machine learning tools and gradually move closer to that reality, who will
benefit from AI applications in oncology as they come to fruition? Emerging technologies like
[29] Cohen, Joseph Paul, Paul Bertin, and Vincent Frappier. "Chester: A Web Delivered Locally Computed Chest X-Ray Disease Prediction System." arXiv preprint arXiv:1901.11210 (2019).
precision medicine have come under fire for disproportionately benefiting the wealthy and
elite,[30] but AI doesn’t have to follow that path if those implementing the tools proceed with care.
In fact, many doctors and researchers suggest –– albeit optimistically –– that skillfully designed
algorithms trained on the right data could encapsulate high-quality care in their designs and bring
unprecedented benefits to cancer patients who would never otherwise have access to that level of
care.
AI in Practice: Ensuring Equity
Designing AIs that account for the unique resources, patient populations, and clinical practices
where the AI is deployed will be essential as oncologists integrate machine learning into their
practices. The balance between quality and customization is not to be underestimated: one of the
biggest reasons IBM’s Watson for Oncology fell short[31] of many of its promises is that the
designers thought they could democratize cancer care by simply allowing oncologists around the
world to tap into Sloan Kettering’s top-notch data and clinical expertise. But in reality,
packaging up knowledge from the world’s best oncologists at one institution and shipping it off
globally in the form of an algorithm didn’t democratize care –– it just made the machine’s
recommendations incongruent with the realities of medicine everywhere else.
“Before people do this type of AI work, they've got to have somebody on the team who actually
understands equity and understands dissemination and implementation,” Chapman says. She
[30] Arcaya, Mariana C., and José F. Figueroa. "Emerging trends could exacerbate health inequities in the United States." Health Affairs 36, no. 6 (2017): 992-998.
[31] Casey Ross and Ike Swetlitz. “IBM pitched its Watson supercomputer as a revolution in cancer care. It’s nowhere close.” STAT. September 5, 2017.
refers back to her days as an undergraduate biomedical engineering student at Johns Hopkins
University, where she first noticed the stark contrast between the medical advancements taking
form in academia and the realities of translating those technologies to populations in need.
“A lot of the innovations that were happening on campus and the care that was happening at the
hospital remained inaccessible to many of the people who literally lived right next door to the
hospital,” Chapman explains. “I realized there's this huge disconnect and that as you keep
innovating, if you don't take a second to pause and think about…whether people are even getting
the advances of yesterday, you're just going to continue to have this gap.”
She adds that while her field of radiation oncology has focused heavily on innovation in the
realm of academia, many community physicians are struggling to keep up with the latest
advancements. “We haven’t focused enough on implementation and dissemination science to ensure
that these advances can actually be accessible to everybody.”
Speaking on his experience implementing AI to assist in radiation therapy planning in Africa,
Court says that considering the needs of the target population is key when designing a machine
learning intervention. Those needs can vary greatly from place to place. “The details of how you
will treat a person vary from institution to institution and also from country to country,” he says,
“so creating a tool that can be used globally is a challenge because of those differences in
practice and resources. You have to accommodate them to some degree.”
Chapman explains that training health care professionals to implement new technologies requires
time, effort, and resources, and there will always be a learning curve to account for as well. She
adds that there are many patients in our country lacking health insurance and access to basic
medical care, and those gaps in rudimentary care need to be addressed first and foremost. For
example, approximately 14 percent of office-based physicians have yet to adopt electronic health
record systems[32] –– if data isn’t even digitized in the first place, it’s difficult to open a dialogue
about implementing algorithms to analyze patient data.
But acknowledging these obstacles doesn’t mean discounting the potential of AI to benefit
patients at both high- and low-resource clinics. AI is poised to fill many existing gaps in care,
which is why some doctors and researchers see it as a potential equalizer of care. As long as a
medical center can meet a certain threshold of resources, AI can, in theory, be implemented to
streamline and improve many steps in the process of diagnosing and treating cancer.
In the field of pathology, an algorithm could essentially serve as an expert consult in
understaffed clinics to arrive at better diagnoses and enable patients to access the right care
sooner.[33] “All of a sudden we start to talk about leveling the playing field, such that a patient
doesn't need to go to a major cancer center... to be able to get the best diagnosis,” Raciti says.[34]
In radiology, Court explains that one of the biggest challenges he has observed while working to
implement radiation planning algorithms in Africa is the lack of properly trained radiology staff.
[32] Office of the National Coordinator for Health Information Technology. “Office-based Physician Electronic Health Record Adoption,” Health IT Quick-Stat #50. January 2019.
[33] Synthesis of information from interviews with pathology experts such as Ruderman and Raciti.
[34] Raciti, Patricia. Ibid.
“There's data out that indicates that you do have better outcomes when you go to a large
academic center compared to if you go to a … smaller center,” Court explains. “There are many
reasons for that, but one of them is just consistency in the treatment and review of plans by your
peers.” Court says that AI can make radiation planning more consistent across institutions and
somewhat compensate for a lack of highly trained personnel. He adds that his team was able to
access their AI tool using the free Wi-Fi on Turkish Airlines, driving home the point that AI
interventions can be made widely accessible, provided an institution has adequate Internet
connectivity. Of course, higher power computing requires certain hardware, but simple AI
interventions can be designed to be readily translated to under-resourced areas and thus help
equalize care.
Mason echoes this theme, explaining that compared to drug development, designing predictive
models is relatively cheap and efficient. “A drug has an entire pipeline, because you have to do
all the chemistry behind it, and the animal testing, and go through all the clinical trials and you
have to pay all the people involved every step of the way. For the model, if you trust all the data
going into that model, which means you trust everything that’s happened up to that point, it takes
a lot less time and a lot less man hours. And it’s cheaper,” Mason explains.
In many regards, AI can be easily deployed in all types of medical settings to help standardize
and streamline cancer care; the question comes down to how and where people will choose to
implement it –– and how much they will charge.
“It’s just a matter of ensuring that we are spreading the capabilities here. How are we
democratizing and sharing IT resources across different stakeholders?” Rebhan asks. “AI can be
a differentiator for a lot of health systems,” he explains, and cautions against it following the
route of pharmaceuticals, where high prices create disparities in access. After all, just because an
AI intervention can be created and implemented cheaply doesn’t mean companies designing
algorithms will want to charge low premiums for their services.
Cohen says that even if companies decide to charge high prices for their algorithms, groups like
his research team at Mila will be around to make free versions of AI tools, “just to add a
common denominator to make sure that people aren't exploited by these tools.”
Chapman notes, however, that as of today, ensuring equity in health care often takes the form of
a diversity requirement in clinical trials or a section researchers must cover in their grant
proposals. It has yet to establish itself as a key focus of the medical establishment in the wake of
the constant pressure to innovate. “I think AI can be used to improve access and address equity,
but I think people have to have an equity lens up front –– and I just don't really think many
people have that lens. I don't think people learn how to have that lens because I don't think it's a
central tenet of academia,” she says.
The field of oncology is just beginning to scratch the surface of leveraging AI to improve cancer
care, and doctors have the opportunity to start educating themselves now to better work with AI
before it fully makes its way into the clinical setting. Moreover, we as a society have the
opportunity to encourage data sharing and advocate so that diverse populations benefit from this
potential equalizer. The action that doctors, data scientists, and the public take to pave the way
for AI’s implementation will dictate whether AI revolutionizes cancer care for a wide range of
patients, or follows the pattern of previous technologies and disproportionately benefits those
with easy access to care and money to spare.
“AI is a tool, and like any other tool, you can use it poorly, or you can use it effectively,” says
Dr. Michelle Ng Gong, who has worked to integrate artificial intelligence into clinical practice
within the Montefiore Health System.[35] “How you use the tool, how you measure it, how you
determine whether it's working or not, will be key to advancing the field.”
[35] Ng Gong, Michelle. (Chief, Division of Critical Care, Einstein/Montefiore Department of Medicine), in discussion with the author, June 21, 2019.
Bibliography
Arcaya, Mariana C., and José F. Figueroa. "Emerging trends could exacerbate health inequities
in the United States." Health Affairs 36, no. 6 (2017): 992-998.
Nazha, Aziz, et al. "A Personalized Prediction Model to Risk Stratify Patients with
Myelodysplastic Syndromes (MDS)." Blood 130, no. Suppl 1 (2017): 160.
Beck, Andrew H., Ankur R. Sangoi, Samuel Leung, Robert J. Marinelli, Torsten O. Nielsen,
Marc J. Van De Vijver, Robert B. West, Matt Van De Rijn, and Daphne Koller.
"Systematic analysis of breast cancer morphology uncovers stromal features associated
with survival." Science translational medicine 3, no. 108 (2011): 108ra113-108ra113.
Ross, Casey, and Ike Swetlitz. “IBM pitched its Watson supercomputer as a revolution in cancer
care. It’s nowhere close.” STAT. September 5, 2017.
Cohen, Joseph Paul, Paul Bertin, and Vincent Frappier. "Chester: A Web Delivered Locally
Computed Chest X-Ray Disease Prediction System." arXiv preprint
arXiv:1901.11210 (2019).
Došilović, Filip Karlo, Mario Brčić, and Nikica Hlupić. "Explainable artificial intelligence: A
survey." In 2018 41st International convention on information and communication
technology, electronics and microelectronics (MIPRO), pp. 0210-0215. IEEE, 2018.
Fleming, Nic. "How artificial intelligence is changing drug discovery." Nature 557, no. 7706
(2018): S55-S55.
Goodfellow, Ian, Nicolas Papernot, Sandy Huang, Rocky Duan, Pieter Abbeel, and Jack Clark.
“Attacking Machine Learning with Adversarial Examples.” Openai.com. February 24,
2017.
Lunden, Ingrid. “Paige.AI nabs $25M, inks IP deal with Sloan Kettering to bring machine
learning to cancer pathology.” Techcrunch.com. 2018.
Metz, Cade and Singer, Natasha. “A.I. Experts Question Amazon’s Facial-Recognition
Technology.” The New York Times. April 3, 2019.
“Minorities in Clinical Trials.” U.S. Food and Drug Administration. August 6, 2018.
Murphy, Chris. “Developers Use Artificial Intelligence To Match Patients To Clinical Trials.”
Forbes. March 12, 2019.
Office of the National Coordinator for Health Information Technology. “Office-based Physician
Electronic Health Record Adoption,” Health IT Quick-Stat #50. January 2019.
Olson, Parmy. “This AI Just Beat Human Doctors On A Clinical Exam.” Forbes. June 28, 2018.
Rawat, Rishi R., et al. “Correlating Nuclear Morphometric Patterns with Estrogen Receptor
Status in Breast Cancer Pathologic Specimens.” Npj Breast Cancer, vol. 4, no. 1, 2018,
doi:10.1038/s41523-018-0084-4.
“Philips and Dana-Farber operationalize and scale Clinical Pathways at ASCO 2018.” Philips
News Center. May 31, 2018.
Ross, Casey. “What if AI in health care is the next asbestos?” STAT. June 19, 2019.
Suleyman, Mustafa. “The promising role of AI in helping plan treatment for patients with head
and neck cancers.” DeepMind. September 13, 2018.
Zimmerman, Brian. “Just 5% of cancer patients participate in clinical trials: 5 things to know.”
Becker’s Hospital Review. December 11, 2017.
Abstract
Artificial intelligence (AI) and machine learning are poised to revolutionize cancer care in the future, from helping pathologists diagnose cancers with more speed and accuracy to aiding in radiation planning and disease progression modeling. But before AI can meet its full potential, doctors and researchers need to solve a handful of problems related to the data they use to train their algorithms. “Big data” has been touted as a huge boon to medical research advancement, but big doesn't necessarily imply complete, high quality or accessible data—and all of those attributes are essential to designing effective models that can benefit a wide catchment of patients, not just the few who have access to top-notch cancer care. These issues are often glossed over in favor of focusing on AI’s powerful potential, so I based my thesis on interviews with doctors and data scientists that discuss the data-related obstacles, such as data sharing and completeness of data sets, that cancer researchers and physicians must overcome to maximize the impact of the algorithms they design. I also look into issues such as the black box or explainability problem, and discuss how doctors can take part in the process of designing algorithms to better inform those models with a clinical perspective and understand their limitations. Finally, since a concern with AI is that it could exacerbate disparities if only the wealthiest echelon of patients access it at top cancer centers, I asked all my interviewees about who will benefit from AI implementation in oncology and I dedicated the last section of my thesis to discussing implementation and equity of care in the age of AI.
Asset Metadata
Creator
Demetriou, Alexandra Nicole
(author)
Core Title
Making way for artificial intelligence in cancer care: how doctors, data scientists and patients must adapt to a changing landscape
School
Annenberg School for Communication
Degree
Master of Arts
Degree Program
Specialized Journalism
Publication Date
08/15/2019
Defense Date
08/14/2019
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
artificial intelligence,cancer,cancer care,cancer diagnosis,data science,disease modeling,machine learning,neural network,OAI-PMH Harvest,oncologist,oncology,Pathology,physician,Radiology,Research,treatment planning
Format
application/pdf
(imt)
Language
English
Contributor
Electronically uploaded by the author
(provenance)
Advisor
Saltzman, Joe (
committee chair
), Agus, David (
committee member
), Levander, Michelle (
committee member
)
Creator Email
andemetr@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-214873
Unique identifier
UC11663036
Identifier
etd-DemetriouA-7794.pdf (filename),usctheses-c89-214873 (legacy record id)
Legacy Identifier
etd-DemetriouA-7794.pdf
Dmrecord
214873
Document Type
Thesis
Format
application/pdf (imt)
Rights
Demetriou, Alexandra Nicole
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA