Developing a robust single cell whole genome bisulfite sequencing protocol to analyse circulating tumor cells
DEVELOPING A ROBUST SINGLE CELL WHOLE GENOME BISULFITE SEQUENCING
PROTOCOL TO ANALYSE CIRCULATING TUMOR CELLS
by
Veronica Ortiz
A Dissertation Presented to the
FACULTY OF THE USC KECK SCHOOL OF MEDICINE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(CANCER BIOLOGY AND GENOMICS)
May 2022
Copyright 2021 Veronica Ortiz
DEDICATION
I would like to dedicate this PhD degree and all my accomplishments to my parents and
my Tio Vidal. They came to this country with nothing, and they sacrificed everything so that I
could pursue my dreams. I represent this degree, but I truly believe this degree is more theirs than
mine. Cancer is a terrible disease and COVID19 is a terrible pandemic that took loved ones from
me, but they left me with a strong foundation. I thank mi mama, mi papa and my loving Tio Vidal
with all my heart and hope I will make them proud. Los quiero mucho.
I would also love to dedicate this to the man that helped me make this possible. He is the
most caring, most beautiful and understanding angel that I do not deserve. Manpreet, thank you
for picking me up and walking for both of us when I couldn’t.
Thank you to my heavenly Father for making the impossible possible before my eyes
numerous times and for continuing to do so (John 14:27).
ACKNOWLEDGEMENTS
I was always a curious kid growing up, but it was not until college that I was fortunate enough to
learn what a PhD consists of. Since then, things have fallen into place and I have been able to
channel my curiosity into something I enjoy. I give a big thank you to USC for allowing me to pursue
my curiosity and for challenging me to become the scientist I am today. The Trojan family does
exist, and this family has helped me, even through my darkest times. USC, you opened your arms
and invited me in, and I will forever be grateful for this wonderful opportunity.
First and foremost, I want to thank my mentor Doctor Min Yu for being the most supportive
and loving mentor I have ever had. Before even rotating in your lab, I had a list of all the things I
wanted the lab I would join to be, and I was told it would be impossible to find all those things.
Rotating through your lab made me realize that I had found my home and everything I ever wanted.
Not only was this lab my home for many years, but it was also a place where I would find my
family. I have faced multiple trials through my life, and two huge losses during graduate school,
but despite that, Min has been my biggest cheerleader and her continued support made one foot go
after the other when I thought I couldn’t. Thus, thank you, Min, for facing things with me and for
your everlasting faith in me when I didn’t believe in myself.
Next, I would like to thank the Yu lab for being the greatest lab family I could ever have.
Thank you, Lin, for pushing me to be sassy, but to also be the best I could be. You helped me start
this project and you taught me so much. I enjoyed our laughs and jokes so much, and I am grateful
for all that you taught me. It really, really, really helped me get this far. I would like to thank
Desirea for being my conscience, especially when I wasn’t able to think straight, and when I wasn’t
doing well. Our talks have helped me and have pushed me to have faith in myself and have pushed
me to realize that my life means more than my problems. I would like to thank Ebony for listening
to me during my gloomiest times and for giving me the most thoughtful advice. You helped me
when I had doubt in myself and that pushed me forward and our talks continue to be cathartic. I
want to thank my “brothers” Sathish and Jojo, AKA Yonathan, for always being there for me, even
for the smallest things. We took bioinformatic courses together and those times are so precious to
me, especially the random talks we had. Thank you for being the best lab brothers I could ever ask
for. I would also like to thank the hacker Remi for being literally the best superhuman anyone
could ever ask for. Your skills and continued effort to learn things have pushed me to also want to
be like you. You have also been so patient and kind with me, and our random talks about everything
have been my escapes during lab. A special thank you to all my lab members: Oihana (love our
talks about Netflix Manpreet), Alee (persevere), Mohamad (those scary stories late at night picking
cells through Rarecyte), Diane (thank you for telling me not to quit), Amber (awkward dance
moves, enough said), Yilin (thank you for introducing me to historical Chinese figures) and Sarai
(thank you for “cheating” the system and teaching me).
Thank you to Doctor Andrew Smith and his lab. Andrew, you still scare me, but you are
also someone I highly look up to. You have been so supportive, and your guidance has been a huge
part of me finally coming to this point. Our random nerd talks and you pushing me to play video
games have been such great memories. I also want to thank his graduate students Doctor Ben
Decato and Guilherme De Sena Brandine. Andrew, Ben and Gui have all taught me a lot about
next generation sequencing and bioinformatics. You guys have helped me develop a whole new
appreciation and curiosity for this wonderful field. I am forever grateful for your help and I would
not be at this stage of my PhD if it weren’t for your help.
I also want to thank Doctor David Craig, Doctor Brook Hjelm and Doctor Enrique
Velasquez for taking me with no expertise in bioinformatics, and for teaching me and helping me
learn various programming languages. I am especially grateful to Dr. Hjelm, who continues to be an incredible
mentor in my life and someone I can always ask for help. Thank you for your continued support
and for being incredibly kind and a cheerleader in my life.
Lastly, I want to thank my dissertation committee: Doctor Ite Offringa, Doctor Michael
Stallcup, Doctor Gangning Liang and Doctor Kimberly Siegmund. All of you have been such a
tremendous blessing in my life and thank you for all your encouraging advice through these years.
Your guidance has been fundamental towards my PhD, and I wouldn’t be here at the end without
all of you. Thank you to Doctor Stallcup for being there with me even before the beginning. You
were my last interview at USC, and before we finished, you said that if I chose USC, you’d help
me and guide me in finding a lab. I took those words to heart and met with you again. During that
meeting you recommended Min and the rest is history after that. You were in my qualifying
committee and are now seeing me finish by being part of my dissertation committee. Thank you
so much for being such a great and caring mentor to me and Min. You have given so many beautiful
words of advice that I will treasure forever.
ABSTRACT
Breast cancer accounts for over 40,000 deaths annually in the United States and over ninety
percent of these deaths are attributed to metastasis. Circulating tumor cells (CTCs), shed from the
primary or metastatic tumors into the circulatory system, are deposited in distant organs where
metastatic tumors eventually arise. Revealing the molecular properties underlying heterogeneous
CTCs will have a significant impact in overcoming therapeutic resistance in patients and
understanding metastatic mechanisms. CTCs have been studied in the context of genomic
mutations and transcriptional profiling. However, the influence of epigenetic changes contributing
to the characteristics of CTCs is unknown. DNA methylation plays a fundamental role in
regulating many cellular processes, and alterations in methylation, especially at promoters, are a
typical hallmark of cancer. Our lab established several CTC lines that were derived from breast
cancer patients, allowing us to analyze the CTCs’ methylomes for the first time. My focus started
with understanding the heterogeneous nature of CTCs by utilizing three published protocols for
single cell whole genome bisulfite sequencing (scWGBS). However, I found that these scWGBS
protocols produced inconsistent bisulfite conversion rates in single CTC samples. A highly
efficient bisulfite conversion rate is absolutely crucial for accurate methylation analysis, since
inefficient conversion of unmethylated cytosine residues into uracil will lead to false positive
results. In contrast, I could produce highly efficient bisulfite conversion rates using naked DNA
and single euchromatic mouse embryonic stem cells. These results led me to hypothesize that
proteins in the heterochromatic genomes of CTCs may shield the DNA from the bisulfite reaction,
thus leading to inconsistent conversion in single CTCs. Since CTCs are a rare cell type and every
cell matters, I then focused on developing an improved scWGBS protocol that can be applied
reliably to CTCs and other rare cell types. By finding an appropriate lysis buffer that can
effectively lyse the cell and denature DNA-bound proteins, as well as an optimal proteinase that can
degrade proteins, I successfully generated scWGBS libraries in cancer and normal cells, with a
consistent bisulfite conversion rate of >95%. Application of this robust scWGBS protocol led to
the identification of unique methylation regions in CTCs and other cancer cell lines. Moreover, I
applied the protocol to a CTC line treated with a DNA demethylating agent,
5-aza-2'-deoxycytidine (5-AZA-CdR), and showed that responses to the treatment are heterogeneous at
the single-cell level, which cannot be revealed by bulk analysis. In conclusion, I have
successfully developed a robust scWGBS protocol that can be applied to analyze DNA
methylomes of rare cell types.
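The effect of incomplete conversion on methylation calls can be illustrated with a short calculation. The sketch below is a minimal illustration, not the analysis pipeline used in this work. It assumes that the conversion rate is estimated from cytosines known to be unmethylated (for example, an unmethylated control sequence), and the function names are hypothetical.

```python
def bisulfite_conversion_rate(converted: int, unconverted: int) -> float:
    """Fraction of known-unmethylated cytosines that were read as T.

    Assumption: every cytosine counted here is truly unmethylated, so any
    cytosine still read as C reflects a conversion failure.
    """
    total = converted + unconverted
    if total == 0:
        raise ValueError("no cytosine observations")
    return converted / total


def apparent_methylation(true_methylation: float, conversion_rate: float) -> float:
    """Expected observed methylation level at a site under incomplete conversion.

    Unmethylated cytosines that fail to convert are read as C and are
    therefore counted as methylated (false positives).
    """
    return true_methylation + (1.0 - true_methylation) * (1.0 - conversion_rate)


if __name__ == "__main__":
    rate = bisulfite_conversion_rate(converted=950, unconverted=50)
    print(f"conversion rate: {rate:.2%}")
    print(f"apparent methylation of an unmethylated site: {apparent_methylation(0.0, rate):.2%}")
```

Under this simple model, a fully unmethylated site would be expected to appear about 5% methylated at a 95% conversion rate, which is why a consistently high conversion rate matters for single-cell libraries.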
TABLE OF CONTENTS
Dedication
Acknowledgements
Abstract
List of Figures
List of Tables
Chapter 1: Literature Review
1.1 Introduction to CTCs
1.2 Current knowledge of CTCs using single-cell techniques
1.2.1 scRNA-seq in CTCs
1.2.2 scDNA-Seq in CTCs
1.3 Proteomics in CTCs
1.4 The epigenetic features of CTCs and unknown questions
1.5 Introduction to DNA methylation and its role in cancer
1.6 Whole genome bisulfite sequencing
1.7 Single-cell whole genome bisulfite sequencing
1.8 Sodium bisulfite conversion rate in whole genome bisulfite sequencing
1.9 Introduction to 5-AZA-CdR
Chapter 2: Material and Methods
2.1 Isolation of single cells
2.1.1 Isolation of single cells from cell cultures
2.1.2 Isolation of single white blood cells from mouse blood
2.2 Lysing single cells
2.3 Sodium bisulfite treatment of single cells
2.4 Library preparation for single-cell whole genome bisulfite sequencing
2.4.1 TruSeq DNA methylation library kit
2.4.2 Accel-NGS Adaptase Module for Single-Cell Methyl-Seq Library Preparation kit
2.4.3 Pico Methyl-Seq Library Prep Kit
2.5 Sequencing of single-cell whole genome bisulfite sequencing libraries
2.6 Cell culture
2.7 Generation of bulk whole genome bisulfite sequencing libraries
2.8 Isolation of single CTCs and WBCs from breast cancer patients
2.9 5-AZA-CdR treatment
2.10 Data processing and analysis
2.10.1 Bulk WGBS processing
2.10.2 scWGBS processing
2.10.3 Clustering analysis by Euclidean distance
2.10.4 Mean CpG methylation analysis
2.10.5 Comparing new scWGBS protocol to past published scWGBS protocols
2.10.6 Creating IGC tracks
Chapter 3: Evaluation of Current Single-Cell WGBS Protocols and Challenges
3.1 Background
3.2 Evaluation of the reported scWGBS protocols via bulk WGBS
3.3 Evaluation of the Farlik scWGBS protocol on CTC line BRx07
3.4 Validation of past report with ESCs
3.5 Optimizing the lysis and bisulfite conversion steps
3.6 Evaluation of the Gravina scWGBS protocol
3.7 Optimization of the lysing step with RLT buffer
3.8 Further optimization of the lysing step
3.9 Evaluation of a new single-cell methyl-seq library kit
3.10 Summary
Chapter 4: Optimization for a Robust scWGBS Protocol
4.1 Background
4.2 The optimized scWGBS protocol produced consistent bisulfite conversion rates on various cell types
4.3 The improved protocol produces comparable or enhanced library complexity than those from published protocols
4.4 Evaluating the dose and effect of 5-AZA-CdR on BRx07 CTC line survival
4.5 scWGBS analysis reveals heterogeneity in global demethylation by 5-AZA-CdR
4.6 Summary
Chapter 5: Discussion
5.1 Overview of results
5.2 Perspectives
References
LIST OF FIGURES
CHAPTER 3: TESTING CURRENT scWGBS PROTOCOLS AND CHALLENGES
Figure 1: General steps for scWGBS.
Figure 2: Evaluation of library complexity for samples prepared by the Farlik and Smallwood
protocols.
CHAPTER 4: A ROBUST SINGLE CELL WGBS PROTOCOL
Figure 1: Images showing a CTC and immune cell isolated from a stage IV breast cancer patient.
Figure 2: VO_scWGBS produces WGBS libraries with a bisulfite conversion rate ranging from
90-100%.
Figure 3: scWGBS libraries from improved protocol compares well with published scWGBS
protocols.
Figure 4: 5-AZA-CdR attenuates cell proliferation in BRx07.
Figure 5: Status of global CpG mean methylation levels in bulk WGBS analysis of BRx07
treated with 5-Aza-CdR.
Figure 6: Vehicle and Week2 libraries produced a bisulfite conversion rate of >97%.
Figure 7: 5-AZA-CdR treatment resulted in heterogenous demethylation at single cell level.
LIST OF TABLES
CHAPTER 3: TESTING CURRENT scWGBS PROTOCOLS AND CHALLENGES
Table 1: Bisulfite conversion failed at first attempt with scWGBS using BRx07.
Table 2: The Farlik scWGBS protocol is successful in producing high bisulfite conversion in
mouse embryonic stem cells.
Table 3: Lysis optimization did not improve the bisulfite conversion rate.
Table 4: Increase in conversion time did not improve the bisulfite conversion rate.
Table 5: Testing Gravina’s protocol and the elimination of PBS.
Table 6: RLT buffer did not improve bisulfite conversion.
Table 7: P2 buffer plus NEB’s proteinase K works in improving bisulfite conversion.
Table 8: New scWGBS protocol is successful in both human and mouse tumor cells.
CHAPTER 4: A ROBUST SINGLE CELL WGBS PROTOCOL
Table 1: Various cell types used to test optimized scWGBS protocol.
Table 2: Optimized scWGBS protocol is successful in bisulfite conversion of various cell types.
CHAPTER 1: LITERATURE REVIEW
1.1 Introduction to CTCs
Circulating tumor cells (CTCs) were first observed in 1869 in the blood of a cancer patient
with metastatic progression by Thomas Ashworth (Ashworth, 1869). He deduced that the cells in
the blood resembled the cells from the cancer, thus concluding that these cells arrived in the veins
from the tumor via draining circulation (Ashworth, 1869). Subsequent investigations have
corroborated his findings that CTCs derive from the primary and metastatic tumors (Lambert, et
al., 2017). Starting initially from the primary tumor, CTCs are cells that acquire specific properties
that enable them to leave their place of origin and enter the blood circulation. Once in circulation,
they are carried around the body and can eventually seed distant metastatic lesions. The active
metastatic tumors can shed more CTCs and potentially seed additional metastases. However,
only a small fraction of CTCs survive, and fewer than 0.01% of these have high metastatic potential
to initiate distant metastases (Krebs, et al., 2010). Although the metastatic process is inefficient,
metastasis is the major cause of cancer-related deaths (Martin, et al., 2013).
1.2 Current knowledge of CTCs using single-cell techniques
CTCs can be isolated from the peripheral blood through a minimally invasive blood draw.
Therefore, it is feasible to sample CTCs repeatedly as a real-time liquid biopsy to monitor disease
progression. In comparison, tissue biopsies are invasive because they require surgery, and they often fail
to reflect the most active part of the tumor due to localized sampling. In addition, it has been
suspected that surgeries could dislodge tumor cells into circulation and contribute to the seeding
of metastatic tumors (Alieva, et al., 2018). Based on mutational analysis, CTCs were shown to
carry the driver mutations of the active growing tumor, indicating their shedding from the active
regions of the tumor (Kowalik, et al., 2017). Thus, longitudinal analysis of CTCs from a single
patient can give insights into the progression of the disease and treatment effectiveness. For
instance, CTC enumeration can reflect the disease progression or therapy response (Toss, et al.,
2014), which has led to the approval of CellSearch for CTC enumeration by the U.S. Food and Drug
Administration (FDA).
It is estimated that in 1 milliliter of blood there are 1 billion red blood cells (RBCs), 1
million white blood cells (WBCs) and 0-10 CTCs (Akpe, et al., 2020), making CTCs
extremely rare and difficult to detect. This rarity makes isolating and characterizing CTCs
challenging. In the last two decades, a plethora of technologies have been developed for CTC isolation. In
addition to CellSearch, these technologies include both positive selection of CTCs, via epithelial surface
markers on tumor cells or tumor cell physical properties, and negative
selection, which aims to eliminate WBCs and RBCs. However, even with the most efficient
isolation technologies, the low numbers of CTCs that are captured are not enough for various
downstream analyses, which often require a large amount of input material. To overcome this hurdle, Yu and
colleagues developed a method that can isolate and expand CTCs in culture from stage IV, luminal
breast cancer patients (Yu, et al., 2015). The sufficient quantity of ex vivo expanded CTCs made it
possible to perform various experiments that were previously not feasible. These
functional and molecular analyses have provided insights into various characteristics of CTCs.
However, these bulk methods fail to assess the heterogeneous properties of actively shed CTCs
(Ortiz, et al., 2018). Because CTCs are rare, each one counts. Thus, single-cell omics technologies are
crucial to study these rare CTCs.
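The per-milliliter cell counts cited earlier in this section can be turned into a rough sense of scale. The short sketch below is only a back-of-the-envelope illustration using those cited estimates; the function name is hypothetical.

```python
# Approximate cell counts per milliliter of blood, as cited in the text
# (Akpe, et al., 2020).
RBC_PER_ML = 1_000_000_000
WBC_PER_ML = 1_000_000


def blood_cells_per_ctc(ctcs_per_ml: float) -> float:
    """Number of normal blood cells (RBCs + WBCs) accompanying each CTC."""
    if ctcs_per_ml <= 0:
        raise ValueError("at least one CTC per milliliter is required")
    return (RBC_PER_ML + WBC_PER_ML) / ctcs_per_ml


if __name__ == "__main__":
    # Upper end of the cited range: 10 CTCs per milliliter.
    print(f"{blood_cells_per_ctc(10):,.0f} blood cells per CTC")
    # WBCs alone, which are the main contaminant after RBC removal.
    print(f"{WBC_PER_ML / 10:,.0f} WBCs per CTC")
```

Even at the upper end of the cited range, each CTC is accompanied by roughly one hundred million other blood cells, which is why enrichment and careful single-cell handling are needed.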
1.2.1 scRNA-seq in CTCs
Intratumor heterogeneity plays a significant role in disease progression under selective
pressures, such as the multiple steps of the metastatic cascade and various therapies (Lovly, et al.,
2016). Due to the invasive and limited properties of surgical biopsies, analyzing expression
patterns of CTCs at the single-cell level can offer insights into molecular pathways modified
during metastasis and the evolution of a heterogeneous population during therapy resistance.
Full-length single-cell RNA-sequencing (scRNA-seq) was first applied to six CTCs from a melanoma
patient and the resulting data were compared to several primary melanocytes, melanoma
cancer cell lines, human embryonic stem cells (ESCs), and immune cells (Ramskold, et al., 2012).
The scRNA-seq analysis distinguished CTCs from other cell types and revealed signaling
pathways altered in CTCs relative to melanocytes, such as an upregulation of melanoma-associated
antigens and a downregulation of cell death and MHC class I genes (Ramskold, et al., 2012).
Furthermore, this study identified an upregulation of several genes encoding plasma membrane-
associated proteins and a downregulation of genes related to escaping immune surveillance
(Ramskold, et al., 2012). The upregulated surface markers can possibly differentiate CTCs from
primary melanocytes and blood cells, which can facilitate future CTC isolation in melanoma
patients.
In a subsequent scRNA-seq study of pancreatic CTCs from both a mouse model and
human patients, elevated expression of stromal-derived extracellular matrix (ECM) proteins
was discovered when CTCs were compared with matching primary tumors (Ting, et al., 2014).
Knocking down the expression of SPARC, an ECM protein, suppressed cell migration and
invasiveness, suggesting that the irregular expression of stromal ECM proteins in CTCs can
contribute to their metastatic spread to distant organs (Ting, et al., 2014).
It has been suspected that clusters of CTCs have a higher metastatic potential than single
CTCs. One group discovered that CTC clusters were associated with a worse prognosis in breast and prostate
cancer patients and validated this hypothesis with cell line models (Aceto, et al., 2014). To identify
the molecular drivers, they applied scRNA-seq to analyze single CTCs and clustered CTCs from
breast cancer patients and consequently uncovered a list of differentially expressed genes,
including plakoglobin, which was implicated in cluster formation. Suppressing plakoglobin levels
disrupted cluster development and significantly suppressed the metastatic potential of those cells
(Aceto, et al., 2014).
In a more recent study, scRNA-seq was applied to CTCs isolated from breast cancer patients with
progressive metastatic lesions in bones or visceral organs (Aceto, et al., 2018). The analysis
revealed several enriched signaling pathways, including activated androgen receptor (AR)
signaling in bone metastases. These patients had received longer aromatase inhibitor (AI) treatment
than patients with progressive visceral metastases (Aceto, et al., 2018). This discovery
points to the role of AR signaling in promoting bone metastasis under the selective pressure of
prolonged AI treatment, thus suggesting a potential therapeutic opportunity to use AR-targeted
therapies that are already implemented in prostate cancers. These studies demonstrate how
single-cell transcriptomic analysis can facilitate the discovery of metastatic mechanisms.
Additionally, scRNA-seq has provided critical insights into CTC transcriptional
heterogeneity and its involvement in therapy resistance. It has been determined that transcriptional
heterogeneity exists in CTCs from genetically engineered pancreatic cancer mouse models (Ting,
et al., 2014).
Furthermore, many epithelial CTCs express mesenchymal markers at various levels,
which is consistent with a similar finding in breast cancer CTCs via multiplex fluorescent
RNA-ISH assay (Yu, et al., 2013). The expression of these mesenchymal markers could potentially
contribute to the cancer stem cell-like characteristics (Mani, et al., 2008) and resistance to various
therapies (Yu, et al., 2013, Fisher, et al., 2015, Zheng, et al., 2015). In patients, transcriptional
heterogeneity is even more evident. scRNA-seq performed on prostate cancer CTCs revealed
tremendous heterogeneity in transcriptomes, which could contribute to the various resistance
mechanisms for AR-targeted therapies (Miyamoto, et al., 2015). Transcriptional analyses comparing
CTCs from patients with anti-androgen therapy (enzalutamide) resistance and treatment-naïve patients
revealed two different resistance mechanisms: activation of glucocorticoid receptor and
noncanonical Wnt signaling. Both pathways coexisted in CTCs to varying degrees in different
patients, including some CTCs from the treatment-naïve group, pointing to the challenge of
treating cancers with heterogeneous transcriptomes (Miyamoto, et al., 2015). Similarly, another
study showed heterogeneous levels of SPINK1 and BIRC5 (two genes associated with aggressive
castration-resistant prostate cancers) in CTCs isolated from different prostate cancer patients
(Cann, et al., 2012).
It is evident that scRNA-seq can be utilized in CTCs as a powerful tool to discover new
tumor progression markers, which can ultimately revolutionize personalized patient care.
However, current scRNA-seq techniques have several limitations. Previous publications (Ting, et
al., 2014, Miyamoto, et al., 2015, Cann, et al., 2012) have revealed that the success rate of overall
amplification and library preparation is less than 60%, due to sample management and the fragility
of CTCs (Zhu, et al., 2018). In addition, single-cell experiments are extremely prone to input loss.
It is estimated that only about 10–20% of transcripts are reverse transcribed during library preparation,
thus favoring highly expressed genes over lowly expressed genes (Kolodziejczyk et al., 2015).
Emerging techniques, such as RNA sequential probing of targets (SPOTS) (Eng, et al.,
2017) and sequential fluorescent in situ hybridization (FISH) (Lubeck, et al., 2012), can possibly
overcome this challenge, due to their sensitivity for lowly expressed genes.
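The bias toward highly expressed genes at a 10–20% capture efficiency can be illustrated with a simple probability model. The sketch below assumes that each transcript molecule is reverse transcribed independently with the same probability, which is a simplification and not a claim made by the cited studies; the function name is hypothetical.

```python
def detection_probability(n_transcripts: int, capture_efficiency: float) -> float:
    """Probability that at least one copy of a gene's transcripts is captured.

    Assumption: each of the n transcript molecules is captured independently
    with probability `capture_efficiency`, so the gene is missed only if
    every copy is missed.
    """
    if n_transcripts < 0:
        raise ValueError("n_transcripts must be non-negative")
    if not 0.0 <= capture_efficiency <= 1.0:
        raise ValueError("capture_efficiency must be between 0 and 1")
    return 1.0 - (1.0 - capture_efficiency) ** n_transcripts


if __name__ == "__main__":
    for copies in (1, 5, 50):
        p = detection_probability(copies, capture_efficiency=0.10)
        print(f"{copies:>3} copies per cell: detected with probability {p:.3f}")
```

Under this model, a gene present at a single copy per cell would be detected only about 10% of the time at 10% efficiency, whereas a gene present at 50 copies would almost always be detected.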
Finally, CTCs that enter the bloodstream can potentially derive from various clones, or can
be at different stages of replication, possibly confounding the analysis of expression patterns (Ting,
et al., 2014). Nonetheless, scRNA-seq has become one of the most advanced single-cell
methods and continues to improve. In parallel, CTC isolation is also progressing. In addition, studying
numerous CTCs isolated from one patient or mouse model across multiple time points can
overcome some of the technical limitations and aid in interpreting various discoveries.
1.2.2 scDNA-Seq in CTCs
Analyzing genetic alterations in single CTCs can reveal inter- and intrapatient
heterogeneity and its association with therapeutic response in real time. One study used multiple
annealing and looping-based amplification cycles (MALBAC) for exome sequencing and copy number
variation (CNV) profiling of single CTCs from seven patients with metastatic lung adenocarcinoma (ADC) and
showed that CNVs in CTCs were highly stable (Ni, et al., 2013). Intriguingly, the CNV profiles
from ADC and small-cell lung cancer (SCLC) subtypes were considerably dissimilar, whereas the
global patterns of CNVs in different patients from the same cancer subtype displayed
unexpected conservation. In addition, CTC profiles more closely resembled those of the metastatic tumors than those of the
primary tumors, including in one patient who switched from an ADC profile in the primary tumor to
an SCLC profile in a liver metastasis. Treatment with an SCLC standard regimen of etoposide and
cisplatin resulted in a dramatic response in this patient, confirming the diagnostic value of CTCs
in clinical management. However, it was demonstrated that there was significant heterogeneity in
point mutations and INDELs among single CTCs from one patient, although a substantial number
of point mutations (59%) from the primary and metastatic tumors were also detected in CTCs in
this study. Another study performed CNV sequencing via whole-genome amplification in
single CTCs, isolated across four time points, from a metastatic prostate cancer patient treated with
chemotherapy and abiraterone (Dago, et al., 2014). The authors discovered three different CNV
subclones, each exhibiting various degrees of amplification in genes that potentially mediate
therapeutic resistance, such as the AR and MYC genes. Additionally, they identified a specific
CTC population that showed genetic similarity to a retrospective biopsy taken from a bone
metastasis, suggesting the origin for this CTC clone.
In a different study, PTEN was found to be deleted in all CTCs isolated from one patient, and
a unique clone with BRAF amplification was found in another patient (Ruiz, et al., 2015). Both
PTEN deletion and BRAF amplification have been defined as resistance mechanisms to BRAF
inhibitor therapy in melanoma patients (Ruiz, et al., 2015). Thus, by performing single-cell CNV
analysis at different time points throughout therapy, this study identified unique clones and
underlying genetic modifications that arise during the selective pressures of therapy. Additionally,
CNV data from single CTCs were investigated to determine whether they could discriminate chemosensitive
from chemorefractory SCLC patients (Carter, et al., 2017). The authors derived 16 CNV baseline
profiles that demonstrated separation of chemosensitive and chemorefractory status. This was the
first study to identify potential genomic biomarkers in CTCs that indicate the chemosensitivity of
an individual’s SCLC before the start of treatment. Functional studies to identify the genetic
mechanisms or regulatory regions in these locations would be even more informative for guiding
effective chemotherapies. These studies have shown the power of using single-cell CNV in CTCs
for the discovery of novel mechanisms of resistance, as well as potential biomarkers and
therapeutic targets for future research.
The accurate evaluation of point mutations in single CTCs is particularly challenging due
to technical issues: amplification bias introduced by whole-genome amplification, sequencing
errors (Sabina, 2015), and the scarcity and fragility of CTCs in a vial of blood. One group has
tackled this issue by developing improved computational methods, sequencing multiple
independent libraries of CTCs, and focusing on common mutations in primary and metastatic
tumors (Lohr, et al., 2014). Relying on these advancements, they performed exome
sequencing in single CTCs and compared their results with multiple regions of the primary tumor.
Remarkably, there was clear heterogeneity among various regions of the primary tumor, and CTCs
resembled one region more closely than the others, suggesting that they share a more recent common ancestor with that region (Lohr,
et al., 2014). In addition, novel single-cell whole-genome sequencing and computational methods,
such as AneuFinder (a bioinformatic package that enables the detection of CNVs in single-cell
sequencing data), will assist in addressing some of these issues (van den Bos, et al., 2016).
In another study, single CTCs isolated from metastatic breast cancer patients were
studied for heterogeneous mutations in PIK3CA. Different patients
carried various mutations within PIK3CA, and one patient displayed three different
PIK3CA variants in distinct single CTCs although PIK3CA appeared wild type in bulk samples.
A different study revealed that KRAS mutations were detected in 5 of 15 CTCs from one patient
and PIK3CA mutations in 14 of 36 CTCs from four patients (Gasch, et al., 2016). Furthermore,
molecular classification of single CTCs demonstrated considerable intra- and interpatient
heterogeneity of genetic alterations in KRAS, EGFR and PIK3CA, possibly explaining the
variable response rates to EGFR inhibition in patients with colorectal cancer. These studies have
shown the value of single CTC analysis for specific genetic alterations that have clear clinical
implications.
Amplifying the whole genome to develop a workable library for analysis can potentially
result in several technical errors, including low physical sequencing coverage, non-uniform
coverage, allelic dropout events, false-positive errors, and false-negative results due to insufficient
coverage; the technical details for these errors were extensively discussed in other reviews (Sabina,
2015, Navin, 2014). Special precaution must be taken in the postprocessing of data to avoid the
dilemma of calling every discovered variant heterogeneous (Navin, 2014). In addition, in situ
sequencing has been developed even on samples fixed on slides (Lee, et al., 2015). This method
reduces the difficulty of sample handling and the cell damage or loss that can occur during CTC
enrichment and isolation (Zhu, et al., 2018). Finally, other advancements have led to studying the
transcriptome and genomic DNA in parallel in single cells, such as SIDR-seq (simultaneous
isolation of genomic DNA and total RNA), thus introducing another opportunity for studying
single CTCs (Han, et al., 2018).
1.3 Proteomics in CTCs
Proteomics is the study of protein structures and their localization, functional status, and
interactions. The transcription and translation of a gene can create more than one protein, and these
proteins could be post-translationally modified or their concentration could be altered based on
their function. In cancer, phosphatases and kinases are frequently deregulated and constitutively
active. Therefore, several targeted drugs have been developed to combat this disease.
Concerning single-cell proteomics, recent developments in mass spectrometry, western blot and
sample management strategies are quickly making proteomic analyses of single cells feasible.
Kang and colleagues developed a single-cell resolution western blot (scWB) (Kang et al., 2016)
that has enabled the study of a panel of 8 proteins in single CTCs derived from estrogen receptor-
positive breast cancer (Sinkala et al., 2017).
1.4 The epigenetic features of CTCs and unknown questions
Individual CTCs from breast cancer patients are assumed to result from unique patterns of
altered cancer genes. However, the contribution of epigenetic changes to the development of CTC
characteristics is unknown. Epigenetic research has been done in CTCs, but the literature remains
limited because of the high input requirements of epigenetic experiments. A recent study has shown that
during in vitro expansion, individual CTCs can switch their HER2 expression to create an
equilibrium of HER2 positive and negative subpopulations, suggesting an underlying epigenetic
regulation (Jordan, et al., 2016). As methods for expanding CTCs in culture and single-cell
techniques continue to improve, these epigenetic experiments will become possible.
One epigenetic attribute that has been studied in CTCs is DNA methylation. Using the 27K
methylation array, a study found that there was hypermethylation in genes correlated with
apoptosis, angiogenesis, and the VEGF pathway in CTCs, when compared to non-matched primary
tumors (Freindlander, et al., 2014). However, this analysis examined less than 1% of all 28 million
CpGs. Another study using pyrosequencing revealed that Hgf and c-Met were overexpressed and
were associated with hypomethylation at their promoters from CTC lines derived from mice
(Ogunwobi, et al., 2013). There have also been single-cell methylation studies performed on CTCs.
One study investigated the overall methylation levels of both DNA and RNA through mass
spectrometry (Huang, et al., 2016), and another studied a few epithelial-mesenchymal transition
genes through single-cell multiplexed-agarose-embedded bisulfite sequencing (Pixberg, et al.,
2017). The most recent scWGBS paper investigated the DNA methylation landscape of single
CTCs in lung cancer patients, and a CTC DNA methylation signature was discovered that differed
from primary lung cancer tissues (Zhao, et al., 2021). In addition, another paper investigated the
methylomes of single CTCs and CTC clusters from breast cancer patients and mouse models on a
genome-wide scale (Gkountela, et al., 2019). They discovered that binding sites for stemness- and
proliferation-associated transcription factors, which promote stemness and metastasis, were
hypomethylated in CTC clusters (Gkountela, et al., 2019).
Chromatin accessibility has also been examined, using the Assay for Transposase-Accessible
Chromatin using sequencing (ATAC-seq) in CTC-derived
metastatic cells. Principal component analysis revealed a pronounced clustering of samples
according to the patient, which is consistent with previous reports of extensive
transcriptional diversity between tumors (Klots, et al., 2020).
The CTC methylome and its chromatin accessibility have been studied only minimally, and
studies of chromatin organization and protein-DNA interactions in CTCs are absent from the
literature. How the epigenome contributes to the characteristics of
CTCs, such as therapeutic resistance, CTC evolution, survival in blood circulation and, potentially,
organ tropism, therefore remains a mystery. Recent advancements have made it possible to
study different molecular layers at the same time. scNOMe-seq (single cell nucleosome occupancy and
methylome-sequencing) enables the study of methylation and accessibility in DNA at the same
time from a single cell (Pott, 2017). In addition, scNMT-seq (single cell nucleosome, methylation,
and transcription sequencing) permits the investigation of DNA accessibility, the methylome and
transcriptome in a single cell by using GpC methyltransferase to label open chromatin followed
by bisulfite treatment and RNA sequencing (Clark, et al., 2018).
1.5 Introduction to DNA methylation and its role in cancer
DNA methylation was one of the first epigenetic marks to be discovered and remains the
most studied. It is an epigenetic mechanism in which a family of DNA methyltransferases (DNMTs)
transfers a methyl group from S-adenosylmethionine (SAM) to the fifth carbon (C5) of a cytosine
residue to form 5-methylcytosine (Moore, et al., 2013).
DNMT3a and DNMT3b can establish a new methylation pattern on unmodified DNA and are thus
known as de novo DNMTs (Moore, et al., 2013). On the other hand, DNMT1 functions during
DNA replication to copy the DNA methylation pattern from the parental DNA strand onto the
newly synthesized daughter strand (Moore, et al., 2013). DNA methylation is an important
regulator of gene transcription, acting either by recruiting proteins involved in gene repression or by
inhibiting the binding of transcription factors to DNA (Moore, et al., 2013). During development,
certain patterns of DNA methylation change because of a dynamic process involving both de
novo DNA methylation and demethylation (Moore, et al., 2013). Consequently, differentiated cells
develop a stable and unique DNA methylation pattern that regulates tissue-specific gene
transcription.
DNA methylation at cytosines in CpG dinucleotides is critical in programming gene
expression and its alteration is one of the hallmarks of cancer. Vertebrate CpG islands are regions
with a high frequency of CpG dinucleotides that span at least 1000bp (Bird, et al., 1985; Panchin,
et al., 2016). CpG islands are found at promoters at a high frequency, estimated to be at least 70%
of all human promoters (Saxonov, et al., 2006; Babenko, et al., 2017). In normal cells, CpG islands
found at promoters usually remain unmethylated, which is a state associated with the capacity for
transcription. In cancer cells, some CpG islands become heavily methylated, which results in
repression of gene transcription (Phillips, et al., 2008; Moore, et al., 2013). The heavy
methylation at CpG islands in cancer cells may arise because selection favors methylation and
silencing at the promoters of genes important for normal cell homeostasis, such as tumor
suppressors (Phillips, et al., 2008). In addition, there are regions called partially methylated
domains (PMDs), which are megabase-scale domains with lower-than-average methylation that
are found in gene-poor regions (Berman et al., 2012). PMDs correspond to
lamina-attachment domains that are thought to be linked to gene expression, but little is known about their
corresponding domains or their composition at the single cell level (Berman et al., 2012). As of
now, PMDs have only been found in cancer, cell lines and the placenta. Furthermore, it has been
found that methylation patterns in enhancers control the expression of a cohort of genes (Blattler et al.,
2014). Thus, misregulation of DNA methylation could lead to gene expression changes associated
with diverse phenotypes in CTCs.
DNA methylation seems to be promising for translational use in patients. For example,
hypermethylated promoters may serve as biomarkers. Additionally, unlike genetic alterations,
DNA methylation is reversible, which makes it extremely interesting for therapeutic applications.
1.6 Whole genome bisulfite sequencing
Whole genome bisulfite sequencing (WGBS) uses sodium bisulfite treatment on DNA to
determine DNA methylation patterns and it is the gold standard of technology for the detection of
genome-wide DNA methylation (Cokus et al., 2008; Lister et al., 2009). Treatment of DNA with
sodium bisulfite converts unmethylated cytosines to uracil, but leaves 5-
methylcytosine (methylated cytosines) untouched. Therefore, DNA that has been treated with
sodium bisulfite preserves only methylated cytosines and causes chemical changes of individual
unmethylated cytosine residues, yielding single-nucleotide resolution information. Bisulfite
treatment also leads to DNA fragmentation. Thus, this type of sequencing relies on sodium
bisulfite to convert unmethylated cytosines and to fragment DNA, thereby preparing it for
downstream library preparation.
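The read-out principle described above can be illustrated with a minimal in silico sketch. It is not part of the protocol; the function names and the toy sequence are hypothetical, and it models only one strand, with unmethylated cytosines read as thymine after amplification while methylated cytosines are still read as cytosine.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return how one DNA strand reads after bisulfite treatment and PCR.

    Unmethylated C is deaminated to U and read as T after amplification;
    methylated C (5-methylcytosine) is protected and still read as C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted):
    """Map each reference cytosine position to True (methylated) or False."""
    return {
        i: converted[i] == "C"
        for i, base in enumerate(reference)
        if base == "C"
    }


# Toy example (hypothetical sequence): cytosines at positions 1 and 9 are methylated.
reference = "ACGTCCGATCG"
converted = bisulfite_convert(reference, methylated_positions=[1, 9])
calls = call_methylation(reference, converted)
print(converted)
print(calls)
```

Comparing the converted read with the reference at every cytosine is what gives the single-nucleotide resolution mentioned above.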
1.7 Single-cell whole genome bisulfite sequencing
Single-cell whole genome bisulfite sequencing (scWGBS) allows the detection of DNA
methylation at the single-cell level and is a versatile tool to investigate DNA methylation in rare
cells and heterogeneous populations. The first reported scWGBS protocol was published in
2014 by Smallwood and colleagues. This protocol includes Post-Bisulfite Adaptor Tagging
(PBAT), in which sodium bisulfite treatment is performed first and is followed by adaptor tagging
(Miura, et al., 2014). PBAT reduces the loss of information caused by harsh sodium bisulfite treatment,
which could degrade adaptor-tagged fragments (Miura, et al., 2014). Integrating the adaptors after
treatment increases the chance of material being sequenced, thereby improving the complexity
of the sample.
Another unique feature of Smallwood’s protocol is that it includes five rounds of
pre-amplification to ensure that a high number of DNA strands are tagged, further improving the
complexity of the final library (Smallwood, et al., 2014). Their study was performed
in metaphase-II oocytes and mouse embryonic stem cells, and both populations manifested
heterogeneity within their own methylomes (Smallwood, et al., 2014). They were able to assess
about 3.7 million CpGs of the 28 million in an individual mouse diploid cell (Smallwood, et al.,
2014).
Second to publish a scWGBS protocol was Farlik, et al. in 2015. This approach was
different from the Smallwood protocol in that it did not require a pre-amplification step and utilized
two kits, in contrast to many homemade reagents in the Smallwood one. One kit is the EZ DNA
Methylation-Direct Kit (Zymo Research, D5020) which includes the lysis buffer and conversion
reagent. The second kit is the TruSeq DNA Methylation Library Preparation Kit (Illumina,
discontinued), which allows up to 18 cycles of amplification. The Farlik protocol also incorporated
PBAT and successfully evaluated roughly 1-2 million CpGs in K562, HL60,
KBM7 and mouse ESCs (Farlik, et al., 2015).
The third published approach to scWGBS, and the simplest one, is by Gravina, et al. in
2016. This approach uses one kit called Pico Methyl-Seq Prep Kit (Zymo Research, D5455) that
includes the conversion reagent and library preparation kit all in one (Gravina, et al., 2016). The
lysis materials come separately but are from the same company (Zymo Research). Like the
previous two protocols, this one also incorporates PBAT, and it is the first to
approach scWGBS with one commercial kit. The authors demonstrated the effectiveness of this
kit, originally designed for samples with a low yield of 10pg-100ng DNA, to investigate 1-2
million CpGs in single mouse fibroblasts and liver cells (Gravina, et al., 2016).
1.8 Sodium bisulfite conversion rate in whole genome bisulfite sequencing
WGBS relies on the conversion of every single unmethylated cytosine to uracil. To
generate successful scWGBS libraries, the bisulfite conversion rate must be measured. This
measurement indicates how efficient the reaction was in converting single unmethylated cytosine
residues into uracil. Incomplete bisulfite conversion leaves unmethylated cytosines unconverted, so they
are read as methylated and produce false-positive methylation calls. Only cytosines in single-stranded
DNA are susceptible to attack by sodium bisulfite; therefore, denaturation of the DNA is critical (Ermakova, et al., 1996; Chan, et al.,
2012). Incomplete denaturation of double-stranded DNA before conversion can lead to
inefficiency and incomplete bisulfite conversion. In addition, it is important to ensure that reaction
parameters such as temperature and salt concentration are suitable to maintain the DNA in a single-
stranded conformation for complete conversion (Panayiotis, et al., 2010). Lastly, it is imperative
that the DNA be naked, without any proteins bound to it, for it to be accessible to the chemical
bisulfite reaction.
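As an illustration of how a conversion rate can be estimated, the sketch below counts converted and unconverted reads at reference cytosines of a spike-in control. It assumes the spike-in DNA is entirely unmethylated, so every cytosine should read as thymine after full conversion. The function names, the toy reference and the simplified forward-strand, gap-free alignments are hypothetical; this is not the MethPipe implementation.

```python
def count_spike_in_calls(reference, aligned_reads):
    """Count converted (T) and unconverted (C) bases at reference cytosines.

    aligned_reads is an iterable of (start, sequence) pairs aligned to the
    forward strand of the reference without gaps (a simplifying assumption).
    """
    converted = 0
    unconverted = 0
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue  # only reference cytosines are informative
            if base == "T":
                converted += 1
            elif base == "C":
                unconverted += 1
    return converted, unconverted


def bisulfite_conversion_rate(converted, unconverted):
    """Fraction of unmethylated cytosines that were read as thymine."""
    total = converted + unconverted
    if total == 0:
        raise ValueError("no reads cover spike-in cytosines")
    return converted / total


# Toy example: reference cytosines at positions 1, 2, 5 and 7.
reference = "ACCGTCAC"
reads = [(0, "ATTGTTAT"), (1, "TCGTTA")]
counts = count_spike_in_calls(reference, reads)
rate = bisulfite_conversion_rate(*counts)
print(counts, round(rate, 3))
```

Any cytosine that survives in the unmethylated control indicates a conversion failure, and the same failure rate in the sample DNA would appear as false-positive methylation.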
1.9 Introduction to 5-AZA-CdR
Aberrant DNA methylation is a typical hallmark of cancer that consequently leads to the
repression of many genes that suppress malignancy. However, these events are reversible, which
makes this phenomenon an attractive target for specific inhibitors of DNA methylation, such as
5-aza-2'-deoxycytidine (5-AZA-CdR). 5-AZA-CdR was first synthesized in 1964 by Pliml and Sorm
(Pliml and Sorm, 1964) and its effectiveness in leukemia was reported in 1968 by Sorm and
Vesely (Sorm and Vesely, 1968). In addition, Saito and colleagues discovered that 5-AZA-CdR
could activate miRNAs that acted as tumor suppressors in cancer cells (Saito, et al., 2006). Since
5-AZA-CdR is a prodrug, it must be activated by deoxycytidine kinase to its monophosphate form
and by other kinases to its triphosphate form, which is then incorporated into DNA by DNA
polymerase (Saliba, et al., 2021). Once incorporated, the analog traps DNMTs, resulting in their
proteasomal degradation and heritable global demethylation as cells divide (Kelly et al., 2010).
The DNA demethylation effect by 5-AZA-CdR in cancer cells can lead to the reactivation
of silent tumor-suppressor genes, induction of differentiation or senescence, growth inhibition, and
loss of clonogenicity. In addition, recent studies have uncovered that 5-AZA-CdR reactivates
interferon-responsive genes through dsRNAs derived from endogenous retroviruses, which are
typically repressed by promoter DNA methylation (Chiappinelli et al., 2015; Roulois, et al., 2015).
Consequently, this process activates dsRNA-sensing pathways and, in turn, the response to
type I interferon (IFN), a phenomenon termed “viral mimicry” (Chiappinelli et al., 2015; Roulois,
et al., 2015). Additionally, vitamin C increases viral mimicry induced by 5-AZA-CdR (Liu, et al.,
2016).
In the past, the purpose of gene body DNA methylation was poorly understood, although
several studies had begun to demonstrate a potential effect of gene body DNA methylation on
gene expression (Kulis et al., 2012; Lister et al., 2009; Maunakea et al., 2010; Varley et al., 2013).
Yang, et al. demonstrated, by applying 5-AZA-CdR to a colon cancer cell line, that gene body DNA
methylation increases gene expression, and that this methylation depends on the presence of the
DNA methyltransferase DNMT3B. By applying 5-AZA-CdR, they uncovered that demethylation
within transcription units can turn off the transcription of genes, including some oncogenes (Yang,
et al., 2014). In addition, they identified 4 different groups with varying degrees of DNA
methylation recovery from 5-AZA-CdR, with two groups demonstrating either a slow or quick
recovery of DNA methylation and the other two groups being in between both extremes (Yang, et
al., 2014). Genes that manifested rapid remethylation in their gene bodies were linked with
increased cellular growth and these genes were enriched in metabolic pathways or were
upregulated by c-MYC in uncultured human cancers (Yang, et al., 2014). This study demonstrated
that 5-AZA-CdR not only reactivates genes silenced by promoter methylation but also
induces DNA demethylation across all genomic features (Yang, et al., 2014).
Gene body DNA methylation is an interesting potential target for therapy, since its removal can induce the
downregulation of oncogenes and metabolic genes (Yang, et al., 2014).
CHAPTER 2: MATERIALS AND METHODS
2.1 Isolation of single cells
2.1.1 Isolation of single cells from cell cultures
For adherent cell lines, cells were dissociated into single cells using 0.25% Trypsin-EDTA
solution (Gibco, 15400-054). Afterwards, they were harvested into 1.5mL microcentrifuge tubes
and spun down at 500g, 5 min, 21°C. Cells were then washed with 1X DPBS (Corning, 21-031-
CV) and spun down at 500g, 5 min, 21°C. Cells were resuspended and plated into ultra-low
attachment 6-well plates (Corning, 3471) using 3mL of 1X DPBS (Corning, 21-031-CV) into each
well. Single cells were isolated using single-cell micromanipulation and were deposited into 2uL
of 1X DPBS (Corning, 21-031-CV) into the sides of PCR tubes (VWR, 20170-012). Afterwards,
10uL of lysis buffer was deposited into the cap of the PCR tube and then spun down using a
tabletop mini centrifuge (Benchmark, C1012) for 1 minute. Picked cells were put on ice while
more cells were picked, and/or were stored at -80°C for later use.
For non-adherent cells, cells were harvested into 1.5mL microcentrifuge tubes and spun
down at 500g, 5 min, 21°C. Cells were then washed with 1X DPBS (Corning, 21-031-CV) and
spun down at 500g, 5 min, 21°C. Cells were resuspended and plated into ultra-low attachment 6
well plates (Corning, 3471) using 3mL of 1X DPBS (Corning, 21-031-CV) into each well. Single
cells were isolated using single-cell micromanipulation and were deposited into 2uL of 1X DPBS
(Corning, 21-031-CV) into the sides of PCR tubes (VWR, 20170-012). Afterwards, 10uL of lysis
buffer was deposited into the cap of the PCR tube and then spun down using a tabletop mini
centrifuge (Benchmark, C1012) for 1 minute at room temperature (RT). Picked cells were put on
ice while more cells were picked, and/or were stored at -80°C.
2.1.2 Isolation of single white blood cells from mouse blood
Each mouse experiment was carried out in compliance with protocols approved by the
Institutional Animal Care and Use Committee at USC. NOD/scid GAMMA (NSG) mice (Jackson
Laboratory, 005557) were kept in pathogen-free conditions as required by USC’s animal facility.
NSG female mice were sedated with isoflurane and the chest wall was shaved, followed
by sterilization via 10% iodine and alcohol wipes (Fisher Scientific, 19-090-834). Afterwards,
a terminal bleed was performed by slowly drawing about 0.5-1mL of blood, followed by
euthanizing per animal protocol guidelines. RBC lysis buffer (Miltenyi Biotec, 130-094-183) was
set to warm to room temperature (RT) before proceeding. RBC lysis buffer was diluted to 1X with
dH2O into a 50mL falcon tube. 0.5-1mL of blood was added to the tube and vortexed for 5 seconds
only. Afterwards, it was incubated for 10 min followed by centrifugation at 300g for 10min at 21°C. The
supernatant was discarded, and the pellet was washed with 10 mL of 1X DPBS (Corning, 21-031-
CV). Then it was centrifuged at 300g for 10min at 21°C, followed by the supernatant being
discarded. The pellet was resuspended in 200uL of cold MACS buffer (1X PBS, 0.5% bovine
serum albumin and 2mM EDTA). Staining was performed with 2uL of 1:10 mCD45 (BioLegend,
103123). Afterwards, it was incubated in the dark and on ice for 30min, then washed with 20mL of cold
MACS buffer and spun down at 300g for 15min at 4°C. The pellet was resuspended in 5mL of cold
MACS buffer and 2.5mL was added to each well of a polyHEMA coated 2-chamber slide. The
cells were allowed to settle for 2 hours at 4°C in the dark before scanning and picking via RareCyte.
Cells were then imaged using RareCyte, and cells that manifested a positive fluorescent
mCD45 signal were picked and were deposited into 50uL of 1X DPBS (Corning, 21-031-CV) in
a specialized PCR tube (RareCyte, 22-1056-001). Afterwards, 40uL of 1X DPBS (Corning,
21-031-CV) was carefully removed, and PCR tubes were spun down at a speed of 1000RPM for
5 minutes. Each tube was inspected under a microscope to validate that a single cell was still present.
10uL of lysis buffer was then added, spun down using a mini centrifuge (Benchmark, C1012) for
30sec at RT and then stored at -80°C.
2.2 Lysing single cells
Picked single cells were brought to a final volume of 20uL with water (or, if necessary,
PBS) after adding 10uL of lysing buffer (200mM NaOH, 1% SDS (w/v)), so that the lysing buffer
was diluted 1:1. Lambda phage DNA (Promega, D1521) was spiked into the tube to evaluate
the bisulfite conversion rate. Afterwards, the following PCR program was run: 98°C 5min, 55°C 2min,
pause, 55°C 60min, 4°C hold. After the 55°C 2min incubation, the program was paused and
1uL of proteinase K (NEB, P8107) was added to each sample. Afterwards, the program was continued
until the end. Samples were stored at -80°C.
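The 1:1 dilution can be checked with a short calculation. The sketch below is illustrative only and uses a hypothetical helper; it takes the stock concentrations and volumes stated above (10uL of 200mM NaOH, 1% SDS brought to 20uL) and ignores the 1uL of proteinase K added later.

```python
def final_concentration(stock_concentration, stock_volume_ul, final_volume_ul):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    return stock_concentration * stock_volume_ul / final_volume_ul


# 10 uL of lysing buffer in a 20 uL lysis reaction (values from the text).
naoh_mM = final_concentration(200.0, 10.0, 20.0)
sds_percent = final_concentration(1.0, 10.0, 20.0)
print(f"NaOH: {naoh_mM} mM, SDS: {sds_percent}% (w/v)")
```

Under these assumptions the working concentrations during lysis are 100mM NaOH and 0.5% SDS (w/v).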
2.3 Sodium bisulfite treatment of single cells
Sodium bisulfite conversion of genomic DNA for single cells was performed by using the
EZ DNA Methylation Kit (Zymo Research, D5020). The CT conversion reagent was prepared and
130uL was added directly to lysed cells and the manufacturer’s protocol was followed. Elution
buffer was incubated within the column for 10min before spinning down and samples were stored
at -20°C for short-term storage or -80°C for long-term storage.
2.4 Library preparation for single-cell whole genome bisulfite sequencing
2.4.1 TruSeq DNA methylation library kit
10uL of eluted converted DNA was used as input for the TruSeq DNA
Methylation Library Preparation Kit (Illumina, discontinued). Libraries were made according to
the manufacturer’s protocol, with the modification of eluting with 10uL after the second clean-up.
2.4.2 Accel-NGS Adaptase Module for Single-Cell Methyl-Seq Library Preparation kit
10uL of eluted converted DNA was used as input for the library kit Accel-NGS Adaptase
Module for Single-Cell Methyl-Seq Library Preparation (Swift Biosciences, 33096). Libraries
were made according to the manufacturer’s protocol, with the modification of eluting with 10uL
after the second clean-up.
2.4.3 Pico Methyl-Seq Library Prep Kit
The Pico Methyl-Seq Library Prep Kit (Zymo Research, D5455) was used for scWGBS as
mentioned in Gravina, et al. 2016. Briefly, single cells were lysed with 10uL digestion buffer and
1uL of proteinase K (Zymo Research, D5020). The bisulfite conversion and library preparation
were performed on cell lysates using the Pico Methyl-Seq Library Prep Kit (Zymo Research,
D5455) according to the supplier’s instructions with some alterations. First, the primer
concentration in the pre-amplification step was reduced
(20 μM) to avoid primer dimers in the final library. Second, the amplification
step used 11 cycles of PCR amplification in total. DNA was eluted in 10uL after the second clean-up.
2.5 Sequencing of single-cell whole genome bisulfite sequencing libraries
Quality control for the final libraries was performed by measuring the DNA concentration
with the QuBit dsDNA HS assay (Life Technologies, Q32851) on QuBit 2.0 Fluorometer (Life
Technologies, Q32866) and by determining library fragment sizes with the Experion DNA 1K
Analysis kit (Bio-Rad, 700-7107) on the Experion Automated Electrophoresis Station (Bio-Rad,
701-7000) or through 4200 TapeStation system (Agilent, G2991BA) using High Sensitivity
ScreenTape D5000 (Agilent, 5067-5592). Sequencing was performed on Illumina NextSeq and
NovaSeq.
2.6 Cell culture
The BRx07 CTC line was derived from a luminal, stage IV breast cancer patient (Yu, et al.,
2013) and was cultured in Corning ultra-low attachment plates (VWR, 29443-030) with RPMI-1640
medium (ThermoFisher, 11875119), supplemented with EGF (20 ng/mL) (PeproTech, AF-100-15),
basic fibroblast growth factor (20 ng/mL) (PeproTech, 100-18B), 1× B27 (ThermoFisher
Scientific, 0080085SA), and 1× antibiotic/antimycotic (ThermoFisher, 15240-096), at 37°C, 4% O2
and 5% CO2. Py001 (a cell line made in our lab and derived from the primary tumor of the
mouse model MMTV-PyMT) was maintained in RPMI-1640 medium (ThermoFisher, 11875119),
with 10% FBS (fetal bovine serum) (Genesee, 25-550) and 1% p/s (penicillin and streptomycin)
(Sigma Aldrich, P0781-100ML) at 37°C, 21% O2 and 5% CO2. A549 and MDA-MB-231 were
maintained in DMEM medium (VWR, 45000-304) with 10% FBS (Genesee, 25-550) and 1% p/s
(Sigma Aldrich, P0781-100ML) at 37°C, 21% O2 and 5% CO2. HCT116 was maintained in
McCoy's 5a Medium Modified (Fisher Scientific, 30-2007) with 10% FBS (Genesee, 25-550) and
1% p/s (Sigma Aldrich, P0781-100ML) at 37°C, 21% O2 and 5% CO2. HMC3 was maintained in
DMEM medium (VWR, 45000-304) with 10% FBS (Genesee, 25-550) and 1% p/s (Sigma
Aldrich, P0781-100ML) at 37°C, 21% O2 and 5% CO2. HBMEC was maintained in MEBM Basal
Medium (Lonza, CC-3151) and MEGM SingleQuots Supplement Pack (Lonza, CC-4136) at 37°C,
21% O2 and 5% CO2. HBMEC (ScienceCell, 1000) was maintained in Endothelial Cell Medium
(ScienceCell, 1001), FBS (ScienceCell, 0025), p/s (ScienceCell, 0503) and endothelial cell growth
supplement (ScienceCell, 1052) at 37°C, 21% O2 and 5% CO2.
2.7 Generation of bulk whole genome bisulfite sequencing libraries
Bulk DNA was isolated and eluted in 20uL via QIAamp DNA Micro Kit (QIAGEN,
56304) according to the “User-Developed Protocol” instructions.
Sodium bisulfite conversion was performed on 50-100ng of input purified DNA from bulk
of cells via EZ DNA Methylation Kit (Zymo, D5020) and the manufacturer’s protocol was
followed. Elution buffer was incubated within the column for 5min before spinning down and
samples were stored at -20°C for short-term storage or -80°C for long-term storage.
10uL of eluted converted bulk DNA was used as input for the library kit TruSeq DNA
Methylation Library Preparation Kit (Illumina, discontinued). Libraries were made according to
the manufacturer’s protocol, with the modification of eluting with 10uL after the second clean-up.
Quality control for the final libraries was performed by measuring the DNA concentration
with the QuBit dsDNA HS assay (Life Technologies, Q32851) on QuBit 2.0 Fluorometer (Life
Technologies, Q32866) and by determining library fragment sizes with the Experion DNA 1K
Analysis kit (Bio-Rad, 700-7107) on the Experion Automated Electrophoresis Station (Bio-Rad,
701-7000) or through 4200 TapeStation system (Agilent, G2991BA) using High Sensitivity
ScreenTape D5000 (Agilent, 5067-5592). Sequencing was performed on Illumina NextSeq and
NovaSeq.
2.8 Isolation of single CTCs and WBCs from breast cancer patients
Single CTCs and WBCs were picked by using the negative selection staining protocol from
Kamal, et al. 2019. Briefly, buffy coats acquired from AccuCyte were stained with a cocktail of
antibodies against immune markers (IM) consisting of anti-CD45, anti-CD14 and anti-CD16
antibodies (BD Biosciences) and a live cell dye called Cell-tracker green. Cells were scanned and
picked using RareCyte and cells that were negative for IMs and positive for the Cell-tracker dye
were picked as potential CTCs. Cells positive for IMs and Cell-tracker green were picked as
potential WBCs.
2.9 5-AZA-CdR treatment
50,000 BRx07 CTCs were plated in a 24-well ULA plate and were treated with 50nM
5-AZA-CdR or 1X DPBS (Corning, 21-031-CV) as the Vehicle daily for 7 days. After the 7-day
treatment, 10 single cells were collected every 2 weeks (protocol mentioned above) and the
rest were utilized for bulk WGBS, for a total of 18 weeks. Single cells were stored at -80°C after
lysis. Bulk DNA was isolated and eluted in 20uL via QIAamp DNA Micro Kit (QIAGEN, 56304)
according to the “User-Developed Protocol” instructions and stored at -20°C. scWGBS and bulk
WGBS libraries were created according to methods mentioned above.
2.10 Data processing and analysis
2.10.1 Bulk WGBS processing
Quality control was run on each individual sample using falco (v0.3.0,
https://github.com/smithlabcode/falco). Raw sequence reads were trimmed for adaptor
contamination and poor-quality reads using Trim Galore! (v0.3.5,
http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Trimmed sequences were first
mapped to the human genome (build GRCh38) using WALT (v1.01,
https://github.com/smithlabcode/walt). Then, each sample was run through MethPipe (v5.0.0,
https://github.com/smithlabcode/methpipe) for the removal of duplicate reads, measurement of the
bisulfite conversion rate and extraction of methylation calls.
2.10.2 scWGBS processing
Quality control was run on each individual sample using falco (v0.3.0,
https://github.com/smithlabcode/falco). Raw sequence reads were trimmed for adaptor
contamination and poor-quality reads using Trim Galore! (v0.3.5,
http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Trimmed sequences were first
mapped to the human genome (build GRCh38) or the mouse genome (build GRCm38) using
abismal (v1.0.0, https://github.com/smithlabcode/abismal/). Then, each sample was run through
MethPipe (v5.0.0, https://github.com/smithlabcode/methpipe) for the removal of duplicate
reads, measurement of the bisulfite conversion rate and extraction of methylation calls.
2.10.3 Clustering analysis by Euclidean distance
Symmetrical CpG methylation files from each sample were acquired from the program
symmetric-cpgs from the pipeline MethPipe (v5.0.0, https://github.com/smithlabcode/methpipe).
The GRCh38 build was divided into non-overlapping 3kb windows using bedtools (v2.30.0,
https://bedtools.readthedocs.io/en/latest/). This 3kb window file was overlapped
with the symmetrical CpG files using the roimethstat program from MethPipe (v5.0.0,
https://github.com/smithlabcode/methpipe) and windows were defined as informative if at least 6
CpGs per window were sequenced. Methylome similarity and dissimilarity between samples were
measured by Euclidean distance. Data visualization and analysis were performed using custom R
scripts.
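The windowing and distance steps can be sketched as follows. This is an illustrative Python sketch, not the custom R scripts used in this work. It assumes per-CpG methylation levels are already available as (chromosome, position, level) tuples, uses an unweighted mean per window, and compares two cells only over windows that are informative in both; all three choices, and the function names, are assumptions.

```python
import math
from collections import defaultdict

WINDOW_BP = 3000  # non-overlapping 3 kb windows
MIN_CPGS = 6      # a window is informative with at least 6 sequenced CpGs


def window_methylation(cpgs, window_bp=WINDOW_BP, min_cpgs=MIN_CPGS):
    """Mean methylation per informative window.

    cpgs is an iterable of (chrom, position, methylation_level) tuples.
    Returns {(chrom, window_start): mean_level}.
    """
    levels = defaultdict(list)
    for chrom, pos, level in cpgs:
        window_start = (pos // window_bp) * window_bp
        levels[(chrom, window_start)].append(level)
    return {
        window: sum(values) / len(values)
        for window, values in levels.items()
        if len(values) >= min_cpgs
    }


def euclidean_distance(sample_a, sample_b):
    """Euclidean distance over windows informative in both samples."""
    shared = sample_a.keys() & sample_b.keys()
    if not shared:
        raise ValueError("no shared informative windows")
    return math.sqrt(sum((sample_a[w] - sample_b[w]) ** 2 for w in shared))


# Toy data: two cells differ in the first window and agree in the second.
cell_a = [("chr1", 100 * i + 10, 1.0) for i in range(6)]
cell_a += [("chr1", 3000 + 100 * i, 0.0) for i in range(6)]
cell_b = [("chr1", 100 * i + 10, 0.0) for i in range(6)]
cell_b += [("chr1", 3000 + 100 * i, 0.0) for i in range(6)]
cell_b += [("chr1", 6000, 1.0), ("chr1", 6100, 1.0)]  # only 2 CpGs: not informative

windows_a = window_methylation(cell_a)
windows_b = window_methylation(cell_b)
distance = euclidean_distance(windows_a, windows_b)
print(distance)
```

Restricting the comparison to shared informative windows matters for sparse single-cell data, because a window missing in one cell would otherwise be treated as a methylation difference.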
2.10.4 Mean CpG methylation analysis
Mean methylation of CpGs for Vehicle vs Week2 scWGBS samples were extracted by
using the levels program from MethPipe (v5.0.0, https://github.com/smithlabcode/methpipe). Data
visualization and analysis were performed using custom R scripts.
2.10.5 Comparing new scWGBS protocol to past published scWGBS protocols
To predict library complexity and genome coverage, the gc_extrap program from the preseq
package (v3.1.2, https://github.com/smithlabcode/preseq) was used. Bulk mapped
reads from WALT were fed into preseq. For scWGBS, .bed files converted from mapped .sam files
were fed into preseq. To compare libraries, the x-axis was scaled to the total number of sequenced
reads by dividing by the total number of bases and multiplying by the number of sequenced reads.
Data visualization and analysis were performed using custom R scripts.
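The x-axis rescaling described above amounts to one line of arithmetic. The Python sketch below illustrates it under the assumption that the extrapolation curve has x values in bases; the function name and the example numbers are hypothetical, and the actual plots were made with custom R scripts.

```python
def bases_to_reads(x_values_in_bases, total_bases, sequenced_reads):
    """Rescale x-axis values from bases to sequenced reads.

    Each value is divided by the total number of bases in the library and
    multiplied by the number of sequenced reads.
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return [x / total_bases * sequenced_reads for x in x_values_in_bases]


# Hypothetical library: 1e6 reads of 100 bases each (1e8 bases in total).
x_bases = [0.0, 1e8, 2e8]
x_reads = bases_to_reads(x_bases, total_bases=1e8, sequenced_reads=1e6)
print(x_reads)
```

Putting every library on a reads scale allows curves from libraries with different read lengths to be compared on the same axis.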
2.10.6 Creating IGV tracks
The wigToBigWig program from the UCSC Genome Browser directory of binary utilities
(http://hgdownload.cse.ucsc.edu/admin/exe/) was used to create bigWig (.bw) files, together with the
hg38.chrom.sizes file (http://hgdownload.cse.ucsc.edu/goldenpath/hg38/bigZips/).
Methcount output files from MethPipe were first converted to .bed files, which were then used
to create the .bw files. The bigWig files were loaded into the Integrative Genomics Viewer (IGV)
program to visualize and explore scWGBS methylomes.
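The conversion of per-cytosine methylation calls to an interval file can be sketched in Python as follows. This is an illustration rather than the script used here. It assumes six whitespace-separated input columns (chromosome, 0-based position, strand, context, methylation level, read coverage), and the function name and the coverage filter are hypothetical choices.

```python
def methcounts_to_bedgraph(lines, min_coverage=1):
    """Yield 'chrom<TAB>start<TAB>end<TAB>level' records.

    Assumed input columns: chrom, position (0-based), strand, context,
    methylation level, coverage. Sites below `min_coverage` are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos, _strand, _context, level, coverage = line.split()[:6]
        if int(coverage) < min_coverage:
            continue  # no reads at this site, so there is no level to display
        start = int(pos)
        yield f"{chrom}\t{start}\t{start + 1}\t{float(level):.4f}"


# Two made-up records: one covered CpG and one with no coverage.
records = [
    "chr1\t10468\t+\tCpG\t0.75\t4",
    "chr1\t10470\t+\tCpG\t0\t0",
]
for row in methcounts_to_bedgraph(records):
    print(row)
```

The resulting four-column intervals, sorted by chromosome and start, can then be passed with the chromosome sizes file to the bigWig conversion step.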
CHAPTER 3: EVALUATION OF CURRENT SINGLE-CELL
WGBS PROTOCOLS AND CHALLENGES
3.1 Background
Whole genome bisulfite sequencing (WGBS) is the gold standard of technology for the
detection of genome-wide DNA methylation (Cokus et al., 2008; Lister et al., 2009). Single-cell
whole genome bisulfite sequencing (scWGBS) is a technique that offers single-base resolution and
absolute quantification of the methylome within an individual cell. Compared to the single-cell
reduced representation bisulfite sequencing (scRRBS) protocol focusing on CpG islands,
scWGBS is relatively unbiased and provides cumulative coverage for more than 90% of CpG loci
in merged libraries from a relatively homogenous cell population. Uncovering epigenetic
information from single cells will enhance our understanding of gene regulation and epigenetic
heterogeneity. Achieving understanding at the single-cell level will also help address many
unanswered questions, such as what the epigenomic compositions of heterogeneous organs are and
how epigenetic heterogeneity results in therapeutic resistance. In the past, WGBS protocols
involved ligating adapters to DNA prior to bisulfite conversion, which resulted in loss of
information because DNA is degraded by treatment with sodium bisulfite. The
implementation of Post-Bisulfite Adaptor Tagging (PBAT) in scWGBS during library
preparation (Figure 1) preserves DNA because the sequencing
adapters are added after bisulfite conversion.
One key criterion for a successful scWGBS library is that the bisulfite conversion rate must
be high (>95%), since an unmethylated cytosine that is not converted to uracil will be
falsely called as a methylated cytosine.
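As an illustration of this criterion, the conversion rate can be computed from counts at cytosines assumed to be unmethylated (for example, non-CpG cytosines or spiked-in unmethylated lambda DNA). The following Python sketch uses hypothetical function names and made-up counts; it is not the MethPipe implementation.

```python
def bisulfite_conversion_rate(converted, unconverted):
    """Percent of assumed-unmethylated cytosines that were read as T.

    `converted` counts cytosines read as T; `unconverted` counts those
    still read as C after bisulfite treatment.
    """
    total = converted + unconverted
    if total == 0:
        raise ValueError("no cytosines observed")
    return 100.0 * converted / total


def passes_qc(rate_percent, threshold=95.0):
    """Apply the >95% criterion described in the text."""
    return rate_percent > threshold


# Made-up counts for one hypothetical library.
rate = bisulfite_conversion_rate(converted=9850, unconverted=150)
print(rate, passes_qc(rate))
```

Every unconverted cytosine at these positions is a potential false methylation call, which is why libraries below the threshold are treated as failed.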
Although DNA methylation of CTCs has been analyzed, past methods evaluated less than
1% of CpGs in the genome. Since WGBS can cover the whole methylome, scWGBS can be used
to reveal unique features in CTCs that are relevant to many open questions about the metastatic
process. For example, how heterogeneous is the DNA methylome in CTCs? Are there specific
methylation features associated with the metastasis-initiating abilities of single CTCs at distinct
organs? Can we identify methylome changes associated with therapy-resistant CTCs?
Compared to other epigenetic marks such as histone modifications, which are characterized by
fast turnover and are technically difficult to analyze in single CTCs, DNA methylation is relatively
stable and can be assessed in single CTCs. Therefore, information on DNA methylomes
in CTCs has the potential to be developed into biomarkers that impact
patient monitoring and outcomes.
Figure 1: General steps for scWGBS. Figure demonstrating the general steps for scWGBS,
which includes 1) picking a single cell via single-cell micromanipulation, 2) lysing a cell, 3)
bisulfite conversion directly on a lysed cell, 4) library preparation, and 5) next generation
sequencing.
3.2 Evaluation of the reported scWGBS protocols via bulk WGBS
In order to investigate the DNA methylome in single CTCs, I evaluated the efficacy of the
two reported scWGBS protocols (Smallwood et al., 2014; Farlik et al., 2015). Both protocols were
first tested using genomic DNA isolated from a bulk of 50,000 CTCs to determine which one was
the best option for downstream scWGBS analysis. Two different CTC lines established from two
different stage IV breast cancer patients (BRx07 and BRx68) were used in the comparison. All
samples had a bisulfite conversion rate of >98% and were successfully mapped to the human
genome. The bisulfite conversion rate was measured using non-CpGs known to be unmethylated. Next, to
evaluate the complexity of the libraries, WALT and preseq (Daly, et al. 2014, Chen, et al., 2016)
were utilized to plot the expected distinct reads versus the total reads sequenced (Figure 2).
Interestingly, the Farlik protocol produced more complex libraries than those made with the
Smallwood protocol (Figure 2).
Figure 2. Evaluation of library complexity for samples prepared by the Farlik and
Smallwood protocols. Graph showing estimated library complexity based on the expected distinct
reads (y-axes) vs. total sequenced reads (x-axes). Bulk WGBS libraries from BRx68 or BRx07
CTC lines showed higher complexity in the Farlik protocol than those made with the Smallwood
protocol.
3.3 Evaluation of the Farlik scWGBS protocol on CTC line BRx07
Based on the results from the bulk WGBS analysis, I used the Farlik protocol to produce 6
scWGBS libraries from the BRx07 CTC line. However, I found an inconsistency in the bisulfite
conversion rate (Table 1). A bisulfite conversion is considered successful when the rate, measured at
non-CpGs known to be unmethylated, exceeds 95%. However, sequencing results of the 6
single CTC libraries showed a wide range of 3-80% in the bisulfite conversion rate. In contrast,
the spiked-in unmethylated lambda DNA used as a positive control reached a bisulfite conversion rate of
>98% (Table 1), indicating a problem with bisulfite conversion efficiency within the CTCs rather than with
the sodium bisulfite reagent itself.
Table 1: Bisulfite conversion failed at first attempt with scWGBS using BRx07. SC1-SC6 are
6 individual scWGBS libraries from the CTC line BRx07. Unmethylated lambda DNA was spiked-in
as a positive control. The bisulfite conversion rate was measured by unmethylated non-CpGs.
Sample Bisulfite Conversion Rate (%)
SC1 6.94
SC2 3.34
SC3 79.3
SC4 7.88
SC5 14.0
SC6 33.9
Lambda Phage DNA 98.4
3.4 Validation of past reports with ESCs
To exclude potential technical issues, I evaluated the performance of the scWGBS
protocol on the mouse embryonic stem cells (mESCs) that were used in the published scWGBS
protocols. I used the Farlik protocol to create 6 scWGBS libraries using 6 individual mESCs.
The result showed that 5 out of 6 libraries successfully reached a bisulfite conversion rate of >96%
(Table 2). Since ESCs have a more euchromatic genome than that of a cancer cell, this result
suggested that either cell debris or proteins bound to the heterochromatic genome of cancer cells
may be reducing the efficiency of the bisulfite conversion.
Table 2: The Farlik scWGBS protocol is successful in producing high bisulfite conversion
in mouse embryonic stem cells. Five out of six libraries of the single mESCs prepared by the
Farlik protocol were successful in generating a bisulfite conversion rate of >96%. The bisulfite
conversion rate was measured by unmethylated non-CpGs.
Sample Bisulfite Conversion Rate (%)
mESC1 96.6
mESC2 44.4
mESC3 96.5
mESC4 96.5
mESC5 96.7
mESC6 97.2
3.5 Optimizing the lysis and bisulfite conversion steps
Based on the successful bisulfite conversion of the naked spiked-in lambda DNA and
mESCs, I hypothesized that cell debris in the tube hindered the reaction, or that the nuclear membrane
was not lysed effectively. To test these potential issues, I performed scWGBS on another set of
single cells with two conditions changed: 1) lysing for one hour at 55 °C instead of 20 min at 50 °C
(https://www.goldbio.com/articles/article/20-answers-to-important-proteinase-k-questions-plus-free-printable-fact-sheet);
2) increasing the bisulfite conversion time from 3.5 hours to 4 hours.
Unfortunately, neither condition improved the bisulfite conversion rate, which
continued to show a low, wide range (Table 3 and Table 4).
Table 3: Lysis optimization did not improve the bisulfite conversion rate. SC1-SC6 are 6
individual scWGBS libraries from the CTC line BRx07. The optimization of increasing the lysis
time from 20 min to 1 hour and the temperature from 50 °C to 55 °C was tested. Unmethylated lambda
DNA was spiked-in as a positive control. The bisulfite conversion rate was measured by
unmethylated non-CpGs.
Sample Bisulfite Conversion Rate (%)
SC1 74.0
SC2 83.3
SC3 12.7
SC4 77.2
SC5 48.3
SC6 51.9
Lambda Phage DNA 98.0
Table 4: Increase in conversion time did not improve the bisulfite conversion rate. SC7-SC12
are 6 individual scWGBS libraries from the CTC line BRx07. An increase in bisulfite conversion time from
3.5 hours to 4 hours was tested. Unmethylated lambda DNA was spiked-in as a positive control.
The bisulfite conversion rate was measured by unmethylated non-CpGs.
Sample Bisulfite Conversion Rate (%)
SC7 82.2
SC8 9.17
SC9 7.69
SC10 18.8
SC11 82.1
SC12 62.9
Lambda Phage DNA 98.8
Sample Protocol Bisulfite conversion rate (%)
G1 Gravina 10.5
G2 Gravina 62.2
G3 Gravina 81.9
F4 Farlik 52.4
F5 Farlik 88.7
F6 Farlik 84.2
Table 5: Testing Gravina’s protocol and the elimination of PBS. G1-G3 are three individual
scWGBS libraries from BRx07 testing the Gravina, et al. protocol without PBS. F4-F6 are three
individual scWGBS libraries from BRx07 testing the Farlik, et al. protocol without PBS. The
bisulfite conversion rate was measured by unmethylated non-CpGs.
3.6 Evaluation of the Gravina scWGBS protocol
I then tested a newly published scWGBS protocol by Gravina, et al., 2016. Communication
with Dr. Gravina led to the suggestion of removing PBS from the Farlik protocol due to a concern
that a high-salt buffer may hinder the bisulfite reaction. I tested the Gravina protocol on single
CTCs picked and directly deposited into the lysis buffer, as well as the Farlik protocol with the
elimination of PBS. However, the problem of inefficient bisulfite conversion persisted (Table 5).
The Farlik protocol on bulk samples consistently generated libraries with a bisulfite conversion
rate of >98%, suggesting that the difference may be due to the input, which starts off as purified
naked DNA without protein or salt contamination. Therefore, I hypothesized that the
heterochromatic nature of CTCs, with compacted nucleosomes and/or bound transcription factors, may
mask the DNA from the bisulfite reaction. In addition, the proteins bound to the DNA
could hinder its denaturation into single strands, which is necessary for the reaction to work properly.
Thus, I focused on testing lysis buffers that could successfully break down the cell and nuclear
membranes and denature proteins, as well as adding a proteinase that can further degrade proteins
bound to DNA.
3.7 Optimization of the lysing step with RLT buffer
A follow-up report based on the Smallwood scWGBS protocol by Clark, et al., 2017
introduced many modifications, including a change from their homemade lysis buffer to Qiagen’s
RLT buffer, aimed at improving the lysing process and bisulfite conversion rate. Their new
protocol also reported an 88% success rate for scWGBS. Thus, I tested this new lysis buffer
with the Farlik protocol on 9 CTC clones picked from the BRx07 line. Unfortunately, this
resulted in a 0% success rate (Table 6).
Sample Bisulfite Conversion Rate (%)
SC1 1.57
SC2 75.9
SC3 3.11
SC4 67.1
SC5 43.8
SC6 51.3
SC7 51.1
SC8 38.4
SC9 53.3
Table 6: RLT buffer did not improve the bisulfite conversion. SC1-SC9 are 9 individual
scWGBS libraries from BRx07. The Qiagen RLT lysis buffer was tested on all cells (Clark, et
al., 2017). The bisulfite conversion rate was measured by unmethylated non-CpGs.
3.8 Further optimization of the lysing step
Next, I tested two different lysing buffers known to be effective at lysing the cell and
nuclear membranes and denaturing proteins bound to DNA. One is an SDS buffer consisting of
0.5% SDS and 0.05M Tris-HCl with proteinase K from New England BioLabs (NEB). Proteinase
K is frequently used in DNA isolation due to its robustness in degrading proteins such as
histones and transcription factors bound to DNA, and it is more robust when SDS is present
(https://www.goldbio.com/articles/article/20-answers-to-important-proteinase-k-questions-plus-free-printable-fact-sheet).
The other is the P2 buffer from Qiagen, consisting of 200mM NaOH
and 1% SDS, with NEB’s proteinase K. I also re-tested the lysing buffer from the Farlik protocol,
but with the addition of NEB’s proteinase K instead of Zymo’s. I included a non-converted
single-cell library as a negative control, as well as spiked-in unmethylated lambda DNA as a positive
control. The Qiagen P2 buffer plus proteinase K from NEB consistently produced scWGBS
libraries with a bisulfite conversion rate of >96% (Table 7), indicating that this lysis buffer
improves the bisulfite conversion.
Sample BS Conversion rate (%)
EZ1 85.4
EZ2 73.6
EZ3 46.9
P2-1 97.8
P2-2 96.7
P2-3 97.7
SDS1 85.6
SDS2 80.4
SDS3 77.9
Unconverted Control 4.41
Lambda Phage DNA 97.3
Table 7: P2 buffer plus NEB’s proteinase K improves bisulfite conversion. Two
different buffers plus NEB’s proteinase K were tested to see whether they improved the bisulfite conversion
rate in scWGBS libraries from BRx07. EZ1-EZ3 are three scWGBS libraries testing proteinase K
from NEB with Farlik’s lysing buffer from Zymo. P2-1 through P2-3 are three scWGBS libraries
testing the Qiagen P2 buffer plus proteinase K from NEB. SDS1-SDS3 are three scWGBS libraries
testing the SDS buffer plus proteinase K from NEB. An unconverted scWGBS library was
included as a negative control, in addition to spiked-in unmethylated lambda DNA as a positive
control. The bisulfite conversion rate was measured by unmethylated non-CpGs.
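The comparison in Table 7 can be summarized per lysis condition with a short Python sketch. The conversion rates are copied from Table 7; the function name and the use of a strict >96% cutoff as a pass criterion are illustrative assumptions.

```python
# Bisulfite conversion rates (%) from Table 7, grouped by lysis condition.
TABLE_7 = {
    "EZ": [85.4, 73.6, 46.9],
    "P2": [97.8, 96.7, 97.7],
    "SDS": [85.6, 80.4, 77.9],
}
THRESHOLD = 96.0  # cutoff quoted in the text for the P2 libraries


def summarize(rates, threshold=THRESHOLD):
    """Return the range of rates and whether every library exceeds the cutoff."""
    return {
        "min": min(rates),
        "max": max(rates),
        "all_pass": all(r > threshold for r in rates),
    }


for condition, rates in TABLE_7.items():
    print(condition, summarize(rates))
```

Only the P2 condition has every library above the cutoff, and it also has the narrowest range of the three conditions.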
3.9 Evaluation of a new single-cell methyl-seq library kit
In 2019, the TruSeq DNA Methylation Library Preparation Kit from Illumina used in the
Farlik protocol was discontinued. The same year, Swift Biosciences released its Accel-NGS
Adaptase Module for Single-Cell Methyl-Seq Library Kit. I tested the Swift Biosciences kit on 3
single CTCs from BRx07 and 3 single cells from a mouse cell line called Py001 that I established
from the breast cancer mouse model MMTV-PyMT by sorting the dissociated tumor cells with an
anti-EpCAM antibody. Both a human and a mouse cell line were tested to determine whether the kit would perform
well with both species. I used the Qiagen P2 buffer with the NEB proteinase K and the new
Single-Cell Methyl-Seq Library Kit to generate the scWGBS libraries. The results showed consistent and
reliable bisulfite conversion rates of >96% for cancer cells from both human and mouse
(Table 8).
Table 8: New scWGBS protocol is successful in both human and mouse tumor cells. C1-C3
are 3 scWGBS libraries from BRx07. P4-P6 are 3 scWGBS libraries from Py001. The Qiagen P2
buffer, NEB proteinase K and Swift Biosciences library kit were tested. The bisulfite conversion rate
was measured by unmethylated non-CpGs.
Sample Bisulfite Conversion Rate (%)
C1 98.5
C2 97.6
C3 97.6
P4 97.1
P5 98.3
P6 98.8
3.10 Summary
All published scWGBS protocols produced unreliable, low bisulfite conversion rates in the
CTC line BRx07. However, the Farlik protocol produced consistently successful WGBS libraries
from bulk samples and scWGBS libraries from mESCs. Including the Qiagen P2 buffer plus
proteinase K from NEB consistently produced scWGBS libraries with a bisulfite conversion rate of
>96%. In addition, when compared to past results, the optimizations produced a narrow range in the
bisulfite conversion rate. Although the sample size is small, the success of all libraries indicates
that these new modifications could lead to a more robust version of scWGBS that is
applicable to rare cell types. The next step is to evaluate how dependable this new
protocol is in various cell types.
CHAPTER 4: VALIDATION AND APPLICATION OF A ROBUST
SINGLE CELL WGBS PROTOCOL
4.1 Background
High-throughput single-cell technologies have great potential to uncover new cell types.
DNA methylation was one of the first epigenetic marks to be discovered and remains the most
studied. Single-cell whole genome bisulfite sequencing (scWGBS) allows the detection of DNA
methylation at the single-cell level and is a versatile tool to investigate DNA methylation in rare
cells and heterogeneous populations. Over the past several years, I tested the first three published
scWGBS protocols on the CTC line BRx07 (Smallwood, et al., 2014; Farlik, et al., 2015; Gravina,
et al., 2016). Unfortunately, all three failed to produce reliable bisulfite conversion rates, a
problem that appeared to be specific to cancer cells. To solve this issue, I developed a new
protocol that delivers a much higher success rate than the published protocols. This
protocol is reliable enough to be used on rare cell types. Rare cells exist in many
conditions. Many tumors are heterogeneous and consist of various tumor cell populations and an
ever-changing microenvironment. Cancer stem cells or some immune cells can be rare compared
to other cell types. Moreover, therapeutic drugs can establish unique clones that are resistant to
treatment and responsible for disease progression. Aside from cancer biology, Busslinger, et al.
2021 identified novel rare cell types that were linked to the secretion of high volumes of water in
the small intestine.
Understanding all aspects of genetic and epigenetic features of CTCs could provide vital
information for cancer progression. Genetic and transcriptional single-cell studies have revealed
the heterogenous nature of CTCs and potential pathways involved in therapeutic resistance.
However, the epigenome of CTCs is not well studied. Recent reports by Gkountela, et al., 2019
and Zhao, L. et al., 2021 are the only ones to utilize scWGBS on single CTCs in breast cancer and
lung cancer, respectively. Another study has shown that during in vitro expansion, individual
CTCs can switch their HER2 expression to create an equilibrium of HER2 positive and negative
subpopulations, suggesting an underlying epigenetic regulation (Jordan, et al., 2016). Therefore,
it will be important to understand how DNA methylation in CTCs could be playing a part in
regulating transcriptomes of CTCs during the metastatic cascade and therapeutic resistance. CTCs
isolated from a blood sample at a given time are generally very rare, thus, each CTC counts.
4.2 The optimized scWGBS protocol produced consistent bisulfite conversion rates on
various cell types
Encouraged by the success of my optimized scWGBS protocol (VO_scWGBS) on a small
sample size, I set out to test whether it could be used universally across various cell types. I verified my
protocol on several cell types from both human and mouse (Table 1). In addition, this protocol was
tested on individual CTCs and WBCs isolated from a stage IV breast cancer patient (Figure 1 and
Table 1). The new VO_scWGBS protocol produced a bisulfite conversion rate of >95% in 66
samples and a bisulfite conversion rate of >90% in 2 samples (Table 2, highlighted in yellow). It
produced a bisulfite conversion rate of >95% in all freshly picked cells (mouse WBCs and patients’
cells). The reliability and success of these results demonstrate that optimizing for an effective lysis
buffer and proteinase K is vital in producing dependable bisulfite conversion rates for any cell
type. In addition, when compared to past results from published scWGBS protocols, the optimized
protocol produced a tight range of 90-99% in the bisulfite conversion rate (Figure 2).
Table 1: Various cell types used to test optimized scWGBS protocol. Different cell types were
tested on the new optimized scWGBS protocol. “Samples” column describes the name of the cell
line or cell type used. “Cell Line” column explains if the sample originates from a cell line or
freshly isolated cells. The “Species” column explains the species of origin. The “Cell Type”
column explains the origin of the cells. The column termed “Number Picked” explains how many
cells were picked for each cell type. The “Alias” column explains what the cells are named for the
scWGBS libraries.
Samples      Cell Line   Species   Cell Type                             Number Picked   Alias
MDA-MB-231   Yes         Human     Triple-Negative Breast Cancer         5               MDA231
A549         Yes         Human     Lung Cancer                           5               A549
HCT116       Yes         Human     Colon Cancer                          5               HCT
HMEC         Yes         Human     Normal Breast                         5               HMEC
HMC3         Yes         Human     Microglia                             5               glia
HBMEC        Yes         Human     Brain Endothelial Cells               5               Endo
BC343_CTCs   No          Human     Stage IV Breast Cancer CTCs           4               CTC3
BC343_WBCs   No          Human     Stage IV Breast Cancer Immune Cells   3               IC
Mouse WBCs   No          Mouse     WBCs from Blood                       11              V3
Vehicle      Yes         Human     BRx07 Untreated                       10              Veh
Week 2       Yes         Human     BRx07 Treated with 5-AZA-CdR          10              Wk2
Figure 1: Images showing a CTC and immune cell isolated from a Stage IV breast cancer
patient. The blood sample was processed by the negative selection approach of the PIC&RUN
protocol (Kamal, et al., 2019) which involves a Cell-Tracker green viability stain and a cocktail of
immune marker (IM) antibodies. (A) A CTC is defined as a cell with IM−/Cell-Tracker green+.
(B) An immune cell is defined as a cell with IM+/Cell-Tracker green+.
Sample Bisulfite Conversion Rate (%)
A549_24 97.5
A549_25 98.5
A549_26 96.0
A549_27 97.8
A549_28 98.1
CTC3_210 98.8
CTC3_211 98.8
CTC3_212 98.7
CTC3_29 98.2
Endo_610 94.7
Endo_66 96.9
Endo_67 97.1
Endo_68 98.0
Endo_69 95.8
glia_111 97.5
glia_112 95.4
glia_21 98.1
glia_22 98.8
glia_23 98.7
HCT_110 93.1
HCT_16 98.7
HCT_17 96.9
HCT_18 96.9
HCT_19 97.0
HMEC_61 98.7
HMEC_62 98.5
HMEC_63 98.5
HMEC_64 97.8
HMEC_65 97.8
IC_312 97.9
IC_41 98.7
IC_42 95.8
MDA231_11 96.3
MDA231_12 97.8
MDA231_13 98.4
MDA231_14 95.4
MDA231_15 91.3
Veh_410 98.4
Veh_411 98.7
Veh_412 98.5
Veh_43 98.5
Veh_44 98.1
Veh_45 98.4
Veh_46 99.0
Veh_47 98.5
Veh_48 97.8
Veh_49 98.5
Wk2_51 98.7
Wk2_52 98.6
Wk2_53 98.7
Wk2_54 98.7
Wk2_55 98.8
Wk2_56 98.7
Wk2_57 98.8
Wk2_58 98.1
Wk2_59 98.8
Wk2_510 98.8
V3_31 98.9
V3_310 98.8
V3_311 98.9
V3_32 98.9
V3_33 98.9
V3_34 98.5
V3_35 98.9
V3_36 99.0
V3_37 98.9
V3_38 99.0
V3_39 98.9
Table 2: Optimized scWGBS protocol is successful in bisulfite conversion of various cell
types. Different cell types were tested on the new optimized scWGBS protocol to determine if it
produced reliable bisulfite conversion rates. The “Sample” column explains the alias (alias
information can be found in Table 1). The optimized protocol produced 66/68 samples with a
bisulfite conversion rate of >95%. 2 samples (highlighted in yellow) produced a bisulfite
conversion rate of >90%. The bisulfite conversion rate was measured by analyzing non-CpGs
known to be unmethylated.
Figure 2: VO_scWGBS produces WGBS libraries with a bisulfite conversion rate ranging
from 90-100%. The x-axis explains the different optimizations tested and the corresponding
bisulfite conversion rates from each sample (y-axis), which is noted as a dotplot. Each dot
corresponds to an individual scWGBS library. Optimizations “Attempt1, RLT, TimeInc, Gravina,
LysisO, EZ, noPBS, SDS, mESCs and P2” are explained in chapter 3. Attempt1 explains the first
attempt using the Farlik scWGBS protocol. RLT tests the optimization from Clark, et al. TimeInc
corresponds to the time increase tested. Gravina corresponds to Gravina, et al. protocol. LysisO
defines the lysis optimization done to the time and temperature. EZ corresponds to the Farlik
protocol to be tested with NEB proteinase K. noPBS corresponds to testing Farlik’s protocol
without PBS. SDS corresponds to lysis buffer tested. mESCs defines the cells tested for replication
purposes from published protocols. P2 corresponds to the lysis optimization from Qiagen. The
protocol termed VO_scWGBS is the new optimized protocol and it demonstrates how its bisulfite
conversion range is narrower and above 90%, when compared to the wide range from past data.
VO_scWGBS represents the samples described in Table 2.
4.3 The improved protocol produces comparable or enhanced library complexity compared
with those from published protocols
To evaluate whether the VO_scWGBS protocol produces complex libraries, I used the
preseq package to determine coverage and complexity of the libraries made by the VO_scWGBS
and the two previous protocols. The preseq package uses the initial sequenced reads to extrapolate
how many more unique bases could be captured if we were to sequence further. The new protocol
produced libraries that were comparable to or more complex than those made with previous protocols
(Figure 3).
Figure 3: scWGBS libraries from the improved protocol compare well with published scWGBS
protocols. Two random scWGBS libraries were taken from each scWGBS protocol (Smallwood,
Farlik and VO_scWGBS). Mapped reads were converted to .bed files and were input into
preseq, which evaluates the coverage and complexity of next-generation sequencing libraries. The
x-axis was scaled to the total number of sequenced reads by dividing by the total number of bases
and multiplying by the number of sequenced reads. The y-axis shows how many distinct bases are
covered. For example, around 5.0e+08, it would be expected that the lower curves
from VO_scWGBS and Smallwood 2014 would reach saturation if sequenced further, thus
covering bases more than once.
[Figure 3 plot. x-axis: Sequenced Reads; y-axis: Covered Bases; legend: o Smallwood 2014, * Farlik 2015, + VO_scWGBS.]
4.4 Evaluating the dose and effect of 5-AZA-CdR on BRx07 CTC line survival
Aberrant DNA methylation has been revealed to modulate the expression of many genes
that are involved in tumorigenesis and progression (Esteller, 2005). Because the epigenetic
regulation is reversible, epigenetic drugs are attractive therapeutic strategies in cancer, such as a
well-known DNA methylation inhibitor, 5-aza-2'-deoxycytidine (5-AZA-CdR). 5-AZA-CdR is a
prodrug that requires activation via phosphorylation by deoxycytidine kinase (Momparler, et al.,
2005). The resulting nucleotide analog is incorporated into DNA and inactivates DNA
methyltransferase. DNA demethylation by this analog in neoplastic cells can lead to reactivation
of tumor-suppressor genes, induction of differentiation or senescence, growth inhibition, and loss
of clonogenicity (Momparler, et al., 2005). Using 5-AZA-CdR in a colon cancer cell line, Yang, et al.
demonstrated that gene body DNA methylation increases gene
expression and that this methylation is dependent on the presence of the DNA methyltransferase
DNMT3B. By applying 5-AZA-CdR, they uncovered that demethylation within transcription units
can turn off the transcription of genes, including some oncogenes (Yang, et al., 2014). As an
S-phase-specific agent, 5-AZA-CdR has been shown to be an effective antineoplastic agent against
leukemia, myelodysplastic syndrome, and non-small cell lung cancer (Momparler, et al., 2005).
However, the effect of 5-AZA-CdR on the CTC methylome at the single-cell level is unknown.
Therefore, I applied the new VO_scWGBS protocol to address this question by treating the
CTC line BRx07 with 5-AZA-CdR.
Pharmacokinetic studies have shown that 5-AZA-CdR has a short in vivo half-life of 15 to
25 minutes (Momparler, et al., 2005). Thus, I treated 50,000 cells from BRx07 daily for one week with a titration of
different concentrations of 5-AZA-CdR, because the doubling time of BRx07
is 7 days. Afterwards, the cells were examined every week to determine the effects of
the drug. After a month, I observed that all concentrations of 5-AZA-CdR caused a pause in cell
growth (Figure 4). I then chose the lowest concentration of 50nM for downstream experiments
with single cells.
Figure 4: 5-AZA-CdR attenuates cell proliferation in BRx07. The CTC line BRx07 was treated
with 3 different concentrations (50nM, 100nM, 300nM) of 5-AZA-CdR to assess at which
concentration an effect would be seen. 50,000 CTCs were treated daily for 7 days, based on the cell
doubling time of this cell line. Vehicle represents the untreated group. Images shown are phase
contrast images taken at one month after the start of treatment.
4.5 scWGBS analysis reveals heterogeneity in global demethylation by 5-AZA-CdR
To determine the effect of 5-AZA-CdR on DNA methylation in single cells, 50,000 BRx07
cells were treated with 5-AZA-CdR for 7 days. Then, every two weeks for a total of 18 weeks, 10 single cells were
harvested for scWGBS, and the remaining cells were used to generate bulk WGBS libraries.
I first analyzed the bulk methylomes to identify the time point at which the most
demethylation occurred. The global methylation level dropped markedly at
week 2 and then recovered gradually, reaching levels comparable to
the vehicle at week 12 (Figure 5). Based on this result, I performed scWGBS analysis of single
cells from the vehicle-treated sample and from the week 2 time point to determine the global effect of
5-AZA-CdR on CpGs. All libraries produced a bisulfite conversion rate of >97% (Figure 6).
Intriguingly, whereas single cells treated with vehicle showed a homogeneous methylation status,
5-AZA-CdR treatment led to heterogeneous demethylation, with individual cells showing various degrees
of demethylation (Figure 7A, C and D). One of the scWGBS libraries (sample Wk2_56) exhibits
a methylation level like that of the vehicle cluster, demonstrating a minimal effect of 5-AZA-CdR
in this cell (Figure 7B). Additionally, the week 2 single cells formed two distinct clusters, with
one group showing greater demethylation (Figure 7B and Figure 7C). Furthermore,
despite a consistently high global level of methylation, the
methylomes of the vehicle-treated single cells showed heterogeneity at specific genomic locations (Figure 7C).
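The comparison of global methylation between cells can be sketched as follows. This is a minimal illustration, assuming each library is summarized as per-CpG (methylated reads, total reads) counts; the function names, cell names, and all numbers are hypothetical and are not values from this study.

```python
from statistics import mean


def global_mean_methylation(cpg_counts):
    """Mean of per-CpG methylation fractions for one library.

    cpg_counts: iterable of (methylated_reads, total_reads), one pair per CpG.
    CpGs with zero coverage are skipped.
    """
    levels = [m / t for m, t in cpg_counts if t > 0]
    if not levels:
        raise ValueError("no covered CpG sites")
    return mean(levels)


def cells_within_vehicle_range(vehicle_means, treated_means):
    """Names of treated cells whose mean lies inside the vehicle min-max range."""
    low = min(vehicle_means.values())
    high = max(vehicle_means.values())
    return sorted(name for name, level in treated_means.items()
                  if low <= level <= high)


if __name__ == "__main__":
    # Made-up per-cell means, for illustration only.
    vehicle = {"veh_1": 0.70, "veh_2": 0.74}
    treated = {"trt_1": 0.45, "trt_2": 0.72, "trt_3": 0.55}
    print(cells_within_vehicle_range(vehicle, treated))
```

A treated cell that falls inside the vehicle range would be a candidate for the kind of minimally affected cell described above; a min-max range is only one simple criterion, and clustering on windowed methylation is more informative.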
Figure 5: Status of global CpG mean methylation levels in bulk WGBS analysis of BRx07
treated with 5-AZA-CdR. Graph showing the mean methylation of CpGs in bulk BRx07
collected at different time points (x-axis) after treatment with vehicle or 50nM of 5-AZA-CdR.
Vehicle, weeks 2-8, week 12 and week 16 were sequenced.
[Figure 5 plot. Y-axis: CpG Mean Methylation (0 to 0.8). X-axis: Bulk WGBS Samples (veh, wk2, wk4, wk6, wk8, wk12, wk16).]
Figure 6: Vehicle and Week 2 libraries produced a bisulfite conversion rate of >97%. Mapped
reads were fed into MethPipe to measure the bisulfite conversion rate. X-axis shows the two
treatment groups: Untreated group (Veh) and the 5-AZA-CdR treated group after two weeks
(Wk2). Each dot corresponds to an individual scWGBS library. Y-axis shows the bisulfite
conversion rate from 96-100%. All 20 samples produced a dependable conversion rate >97%.
[Figure 6 plot. X-axis: Samples (Veh, Wk2). Y-axis: Bisulfite Conversion Rate (96 to 100).]
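The conversion-rate check can be illustrated with a minimal sketch. It assumes the rate is estimated as the percentage of converted reads at cytosines presumed to be unmethylated (for example, non-CpG cytosines or an unmethylated spike-in); this is a common estimate and is not necessarily the exact calculation performed by MethPipe. The function names, library names, and counts are hypothetical; the 97% threshold is the one quoted in the text.

```python
def conversion_rate(converted: int, unconverted: int) -> float:
    """Percent of presumed-unmethylated cytosines read as converted."""
    total = converted + unconverted
    if total == 0:
        raise ValueError("no informative cytosines")
    return 100.0 * converted / total


def passing_libraries(counts, threshold=97.0):
    """Names of libraries whose conversion rate exceeds the threshold.

    counts: dict mapping library name -> (converted, unconverted).
    """
    return sorted(name for name, (conv, unconv) in counts.items()
                  if conversion_rate(conv, unconv) > threshold)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    example = {"lib_a": (985, 15), "lib_b": (960, 40)}
    print(passing_libraries(example))
```

Incomplete conversion leaves unmethylated cytosines looking methylated, so a library below the threshold would overestimate methylation and is best excluded before downstream comparison.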
[Figure 7 panels. (A) Dot plot; x-axis: Samples (Veh_Bulk, Veh_sc, Wk2_Bulk, Wk2_sc); y-axis: CpG Mean Methylation. (B) Cluster dendrogram, hclust (*, "ward.D2"); axis: Height. (C) Heatmap with color key and histogram (Value, Count). (D) IGV browser tracks.]
Figure 7: 5-AZA-CdR treatment resulted in heterogeneous demethylation at the single cell
level. (A) Dot plot showing the mean CpG methylation levels of different samples. The x-axis shows
4 different groups. Veh_Bulk and Wk2_Bulk correspond to the bulk WGBS libraries
generated. Veh_sc and Wk2_sc correspond to the scWGBS libraries picked for each condition
(Vehicle and Week 2). The y-axis corresponds to the mean methylation of CpGs.
(B) Hierarchical clustering map of the bulk and single cell WGBS libraries. Non-overlapping 3kb
windows were used to estimate the mean methylation rate across the genome of each single cell
and bulk WGBS library. The dendrogram depicts the similarity and dissimilarity between libraries using Euclidean
distance.
(C) Heatmap of the mean methylation rates from 3kb non-overlapping windows. Blue to yellow
depicts methylation levels from 0-100%, respectively. The cluster dendrogram (as in B) is at the top of
the heatmap and the sites are to the left.
(D) Representative Integrative Genomics Viewer (IGV) browser tracks of vehicle-treated BRx07
single cells (blue) and week 2 5-AZA-CdR-treated BRx07 single cells (green). All IGV tracks
use the same scaling factor for the y-axis, as indicated in the upper left-hand region of each
track. A region showing heterogeneous demethylation in sample Wk2_510 is indicated
near the top of the panel in red and with black lines through the tracks. The RefSeq gene map is
presented in blue at the bottom, showing the overall gene structure.
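The windowing described in panels B and C can be sketched as below. This is an illustration under stated assumptions, not the analysis code used for the figure: CpG positions are taken as 0-based integers, each CpG carries a methylation fraction between 0 and 1, and the distance is computed only over windows covered in both libraries (a choice made here because single-cell coverage is sparse). All names are hypothetical, and the linkage step that builds the dendrogram is omitted.

```python
import math
from collections import defaultdict

WINDOW_BP = 3000  # window size stated in the figure caption


def window_means(cpgs, window_bp=WINDOW_BP):
    """Mean methylation per non-overlapping window.

    cpgs: iterable of (chrom, position, methylation_fraction).
    Returns {(chrom, window_index): mean methylation in that window}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, CpG count]
    for chrom, pos, level in cpgs:
        key = (chrom, pos // window_bp)
        sums[key][0] += level
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


def euclidean_distance(a, b):
    """Euclidean distance over windows present in both libraries.

    Returns None when the two libraries share no covered window.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return None
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


if __name__ == "__main__":
    # Made-up CpG calls, for illustration only.
    cell_a = window_means([("chr1", 100, 1.0), ("chr1", 200, 0.0), ("chr1", 3500, 1.0)])
    cell_b = window_means([("chr1", 150, 0.5), ("chr1", 3100, 0.0), ("chr2", 10, 1.0)])
    print(euclidean_distance(cell_a, cell_b))
```

A pairwise distance matrix built this way is the usual input to a hierarchical clustering routine; how windows missing from one library are handled affects the distances and should be stated alongside any dendrogram.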
4.6 Summary
The new scWGBS protocol produced consistently high bisulfite
conversion rates and complex libraries in single cells. This method can be applied to
investigate various rare cell types and to enable novel discoveries at the single cell level. As an
example, scWGBS analysis of cells treated with 5-AZA-CdR showed that the effect
of 5-AZA-CdR on the CTC line BRx07 is heterogeneous, which cannot be discovered from
bulk analysis. Additionally, a rare cell showed only a marginal effect of the drug. The robust
scWGBS protocol opens the door to many research questions that require analysis of DNA
methylation in single cells.
CHAPTER 5: DISCUSSION
5.1 Overview of results
Here I have described an optimized approach for single cell methylome sequencing and
demonstrated its robustness in various types of cells. This protocol is reliable in freshly isolated
cells such as CTCs and WBCs, works effectively in both mouse and human cells, and can be used
to study drug-induced epigenome remodeling at the single cell level, as shown with 5-AZA-CdR. I have
demonstrated the importance of a dependable lysis buffer and an effective proteinase in producing
consistent and reliable bisulfite conversion rates of >95% in single cells. The workflow is
comparable to recently published scWGBS protocols (Smallwood, et al. 2014, Farlik, et al. 2015).
However, in contrast to the Smallwood protocol, my protocol does not require any
pre-amplification, which offers several advantages, including reduced reagent cost, hands-on time, and
amplification bias, as well as precise measurement of PCR duplicates, without sacrificing
library complexity.
This improved protocol has enabled the discovery of methylome variation in multiple
cell types at the single cell level. In addition, I have discovered heterogeneous responses of a CTC
line to 5-AZA-CdR treatment at the single cell level. Intriguingly, I also found a rare cell
on which 5-AZA-CdR had only a marginal effect, leaving its methylome similar to that of the untreated group. This
finding suggests several possibilities. Because 5-AZA-CdR is an S-phase-specific agent, the cell that did not respond to the treatment
could have been in a quiescent state, or it could have recovered much more quickly than the other cells. These are
testable hypotheses that can be explored further. Moreover, the treated cells formed two groups
that clustered by their degree of demethylation in response to 5-AZA-CdR. The development
of a robust scWGBS protocol allows us to identify such effects of 5-AZA-CdR at the single cell
level that could not be detected at the bulk level. These findings point to the most strongly
demethylated regions and to potentially resistant clones, which could have
important clinical implications.
DNA methylation is an important epigenetic regulator in cancer and other diseases.
Dysregulation of DNA methylation can serve as a biomarker in specific clinical contexts.
Moreover, unlike genetic alterations, DNA methylation is reversible and thereby holds promise for
therapeutic interventions. Therefore, a greater understanding of DNA methylation events at the
single cell level may contribute to its future applications in the management of malignant diseases.
5.2 Perspectives
The robust scWGBS protocol that I developed can enable numerous studies in different
areas of research. For example, the International Human Epigenome Consortium (IHEC), a global
consortium for studying the human epigenome, aims to understand disease-related
methylation patterns and the heterogeneity between different cell types. In addition, the single-cell
Methylation Bank (scMethBank), a comprehensive and curated database that integrates single-cell
methylation data and metadata from publicly available datasets, would benefit from more single cell
methylome data. Collecting such wide-ranging data could deepen our understanding of
methylation. The accumulation of scWGBS methylomes will make it possible to find methylation
hotspot regions that vary between tissue types, experimental or environmental
conditions, and heterogeneous diseases such as cancer. Furthermore, these data would support
the analysis of cellular heterogeneity through visualization of single-cell DNA methylation data,
including the assignment of cell clusters in a t-SNE plot.
In addition, the robust scWGBS protocol could be applied to understand how the CTC
methylome varies within patients, as well as its role in therapeutic resistance and cancer
progression. Additionally, it could provide insights into how specific DNA methylation patterns
lead to the establishment of CTCs or the development of metastatic organ tropisms. I believe that
additional data will, in the future, clarify the association between methylation and its
biological role in disease.
Finally, as methylation changes occur early and can be detected in body fluids, they may be of
use in the early detection of tumors and for prognostic purposes (Das, et al., 2004). Because
DNA methylation is reversible, drugs like 5′-azacytidine are being used to treat a variety of
tumors. Novel demethylating agents such as antisense DNA methyltransferase and small
interfering RNA are being developed (Das, et al., 2004). Therefore, the ability to analyze the
DNA methylation status within single cells is of great interest, especially for understanding the
heterogeneous effects of various drugs and how the methylome contributes to therapeutic
resistance.
REFERENCES
Ashworth T. (1869) A case of cancer in which cells similar to those in the tumors were seen in the
blood after death. Aust Med J 14: 146–149
Lambert AW, Pattabiraman DR, Weinberg RA. Emerging Biological Principles of
Metastasis. Cell. 2017;168(4):670-691
Krebs MG, Hou JM, Ward TH, Blackhall FH, Dive C. Circulating tumour cells: their utility in
cancer management and predicting outcomes. Ther Adv Med Oncol. 2010;2(6):351-365
Martin TA, Ye L, Sanders AJ, et al. Cancer Invasion and Metastasis: Molecular and Cellular
Perspective. In: Madame Curie Bioscience Database [Internet]. Austin (TX): Landes Bioscience;
2000-2013. Available from: https://www.ncbi.nlm.nih.gov/books/NBK164700/
Alieva M, van Rheenen J, Broekman MLD. Potential impact of invasive surgical procedures on
primary tumor growth and metastasis. Clin Exp Metastasis. 2018;35(4):319-331
Kowalik A, Kowalewska M, Góźdź S. Current approaches for avoiding the limitations of
circulating tumor cells detection methods-implications for diagnosis and treatment of patients with
solid tumors. Transl Res. 2017 Jul;185:58-84.e15
Toss A, Mu Z, Fernandez S, Cristofanilli M. CTC enumeration and characterization: moving
toward personalized medicine. Ann Transl Med. 2014;2(11):108. doi:10.3978/j.issn.2305-
5839.2014.09.06
Akpe V, Kim TH, Brown CL, Cock IE. Circulating tumour cells: a broad perspective. J R Soc
Interface. 2020;17(168):20200065. doi:10.1098/rsif.2020.0065
Lovly CM, Salama AK, Salgia R. Tumor Heterogeneity and Therapeutic Resistance. Am Soc Clin
Oncol Educ Book. 2016;35:e585-93. doi: 10.1200/EDBK_158808
Ramsköld D, Luo S, Wang YC, Li R, Deng Q, Faridani OR, Daniels GA, Khrebtukova I, Loring
JF, Laurent LC, Schroth GP, Sandberg R. Full-length mRNA-Seq from single-cell levels of RNA
and individual circulating tumor cells. Nat Biotechnol. 2012 Aug;30(8):777-82. doi:
10.1038/nbt.2282. Erratum in: Nat Biotechnol. 2020 Mar;38(3):374
Ting DT, Wittner BS, Ligorio M, Vincent Jordan N, Shah AM, Miyamoto DT, Aceto N, Bersani
F, Brannigan BW, Xega K, Ciciliano JC, Zhu H, MacKenzie OC, Trautwein J, Arora KS, Shahid
M, Ellis HL, Qu N, Bardeesy N, Rivera MN, Deshpande V, Ferrone CR, Kapur R, Ramaswamy
S, Shioda T, Toner M, Maheswaran S, Haber DA. Single-cell RNA sequencing identifies
extracellular matrix gene expression by pancreatic circulating tumor cells. Cell Rep. 2014 Sep
25;8(6):1905-1918
Ortiz V, Yu M. Analyzing Circulating Tumor Cells One at a Time. Trends Cell Biol.
2018;28(10):764-775
Aceto N, Bardia A, Miyamoto DT, Donaldson MC, Wittner BS, Spencer JA, Yu M, Pely A,
Engstrom A, Zhu H, Brannigan BW, Kapur R, Stott SL, Shioda T, Ramaswamy S, Ting DT, Lin
CP, Toner M, Haber DA, Maheswaran S. Circulating tumor cell clusters are oligoclonal precursors
of breast cancer metastasis. Cell. 2014 Aug 28;158(5):1110-1122
Aceto N, Bardia A, Wittner BS, Donaldson MC, O'Keefe R, Engstrom A, Bersani F, Zheng Y,
Comaills V, Niederhoffer K, Zhu H, Mackenzie O, Shioda T, Sgroi D, Kapur R, Ting DT, Moy B,
Ramaswamy S, Toner M, Haber DA, Maheswaran S. AR Expression in Breast Cancer CTCs
Associates with Bone Metastases. Mol Cancer Res. 2018 Apr;16(4):720-727
Yu M, Bardia A, Wittner BS, Stott SL, Smas ME, Ting DT, Isakoff SJ, Ciciliano JC, Wells MN,
Shah AM, Concannon KF, Donaldson MC, Sequist LV, Brachtel E, Sgroi D, Baselga J,
Ramaswamy S, Toner M, Haber DA, Maheswaran S. Circulating breast tumor cells exhibit
dynamic changes in epithelial and mesenchymal composition. Science. 2013 Feb
1;339(6119):580-4
Mani SA, Guo W, Liao MJ, Eaton EN, Ayyanan A, Zhou AY, Brooks M, Reinhard F, Zhang CC,
Shipitsin M, Campbell LL, Polyak K, Brisken C, Yang J, Weinberg RA. The epithelial-
mesenchymal transition generates cells with properties of stem cells. Cell. 2008 May
16;133(4):704-15
Fischer KR, Durrans A, Lee S, Sheng J, Li F, Wong ST, Choi H, El Rayes T, Ryu S, Troeger J,
Schwabe RF, Vahdat LT, Altorki NK, Mittal V, Gao D. Epithelial-to-mesenchymal transition is
not required for lung metastasis but contributes to chemoresistance. Nature. 2015 Nov
26;527(7579):472-6
Zheng X, Carstens JL, Kim J, Scheible M, Kaye J, Sugimoto H, Wu CC, LeBleu VS, Kalluri R.
Epithelial-to-mesenchymal transition is dispensable for metastasis but induces chemoresistance in
pancreatic cancer. Nature. 2015 Nov 26;527(7579):525-530
Miyamoto DT, Zheng Y, Wittner BS, Lee RJ, Zhu H, Broderick KT, Desai R, Fox DB, Brannigan
BW, Trautwein J, Arora KS, Desai N, Dahl DM, Sequist LV, Smith MR, Kapur R, Wu CL, Shioda
T, Ramaswamy S, Ting DT, Toner M, Maheswaran S, Haber DA. RNA-Seq of single prostate
CTCs implicates noncanonical Wnt signaling in antiandrogen resistance. Science. 2015 Sep
18;349(6254):1351-6
Cann GM, Gulzar ZG, Cooper S, Li R, Luo S, Tat M, Stuart S, Schroth G, Srinivas S, Ronaghi M,
Brooks JD, Talasaz AH. mRNA-Seq of single prostate cancer circulating tumor cells reveals
recapitulation of gene expression and pathways found in prostate cancer. PLoS One.
2012;7(11):e49144
Zhu Z, Qiu S, Shao K, Hou Y. Progress and challenges of sequencing and analyzing circulating
tumor cells. Cell Biol Toxicol. 2018 Oct;34(5):405-415.
Kolodziejczyk AA, Kim JK, Svensson V, Marioni JC, Teichmann SA. The technology and biology
of single-cell RNA sequencing. Mol Cell. 2015 May 21;58(4):610-20
Eng CL, Shah S, Thomassie J, Cai L. Profiling the transcriptome with RNA SPOTs. Nat Methods.
2017 Dec;14(12):1153-1155
Lubeck E, Cai L. Single-cell systems biology by super-resolution imaging and combinatorial
labeling. Nat Methods. 2012 Jun 3;9(7):743-8
Ni X, Zhuo M, Su Z, Duan J, Gao Y, Wang Z, Zong C, Bai H, Chapman AR, Zhao J, Xu L, An T,
Ma Q, Wang Y, Wu M, Sun Y, Wang S, Li Z, Yang X, Yong J, Su XD, Lu Y, Bai F, Xie XS,
Wang J. Reproducible copy number variation patterns among single circulating tumor cells of lung
cancer patients. Proc Natl Acad Sci U S A. 2013 Dec 24;110(52):21083-8
Dago AE, Stepansky A, Carlsson A, Luttgen M, Kendall J, Baslan T, Kolatkar A, Wigler M, Bethel
K, Gross ME, Hicks J, Kuhn P. Rapid phenotypic and genomic change in response to therapeutic
pressure in prostate cancer inferred by high content analysis of single circulating tumor cells. PLoS
One. 2014 Aug 1;9(8):e101777
Ruiz C, Li J, Luttgen MS, Kolatkar A, Kendall JT, Flores E, Topp Z, Samlowski WE, McClay E,
Bethel K, Ferrone S, Hicks J, Kuhn P. Limited genomic heterogeneity of circulating melanoma
cells in advanced stage patients. Phys Biol. 2015 Jan 9;12(1):016008
Carter L, Rothwell DG, Mesquita B, Smowton C, Leong HS, Fernandez-Gutierrez F, Li Y, Burt
DJ, Antonello J, Morrow CJ, Hodgkinson CL, Morris K, Priest L, Carter M, Miller C, Hughes A,
Blackhall F, Dive C, Brady G. Molecular analysis of circulating tumor cells identifies distinct
copy-number profiles in patients with chemosensitive and chemorefractory small-cell lung cancer.
Nat Med. 2017 Jan;23(1):114-119
Sabina J, Leamon JH. Bias in Whole Genome Amplification: Causes and Considerations. Methods
Mol Biol. 2015; 1347:15-41
Lohr JG, Adalsteinsson VA, Cibulskis K, Choudhury AD, Rosenberg M, Cruz-Gordillo P, Francis
JM, Zhang CZ, Shalek AK, Satija R, Trombetta JJ, Lu D, Tallapragada N, Tahirova N, Kim S,
Blumenstiel B, Sougnez C, Lowe A, Wong B, Auclair D, Van Allen EM, Nakabayashi M, Lis RT,
Lee GS, Li T, Chabot MS, Ly A, Taplin ME, Clancy TE, Loda M, Regev A, Meyerson M, Hahn
WC, Kantoff PW, Golub TR, Getz G, Boehm JS, Love JC. Whole-exome sequencing of circulating
tumor cells provides a window into metastatic prostate cancer. Nat Biotechnol. 2014
May;32(5):479-84
van den Bos H, Spierings DC, Taudt AS, Bakker B, Porubský D, Falconer E, Novoa C, Halsema
N, Kazemier HG, Hoekstra-Wakker K, Guryev V, den Dunnen WF, Foijer F, Tatché MC, Boddeke
HW, Lansdorp PM. Single-cell whole genome sequencing reveals no evidence for common
aneuploidy in normal and Alzheimer's disease neurons. Genome Biol. 2016 May 31;17(1):116
Gasch C, Oldopp T, Mauermann O, Gorges TM, Andreas A, Coith C, Müller V, Fehm T, Janni
W, Pantel K, Riethdorf S. Frequent detection of PIK3CA mutations in single circulating tumor
cells of patients suffering from HER2-negative metastatic breast cancer. Mol Oncol. 2016
Oct;10(8):1330-43
Navin NE. Tumor evolution in response to chemotherapy: phenotype versus genotype. Cell Rep.
2014 Feb 13;6(3):417-9
Lee JH, Daugharthy ER, Scheiman J, Kalhor R, Ferrante TC, Terry R, Turczyk BM, Yang JL, Lee
HS, Aach J, Zhang K, Church GM. Fluorescent in situ sequencing (FISSEQ) of RNA for gene
expression profiling in intact cells and tissues. Nat Protoc. 2015 Mar;10(3):442-58
Jordan NV, Bardia A, Wittner BS, Benes C, Ligorio M, Zheng Y, Yu M, Sundaresan TK, Licausi
JA, Desai R, O'Keefe RM, Ebright RY, Boukhali M, Sil S, Onozato ML, Iafrate AJ, Kapur R,
Sgroi D, Ting DT, Toner M, Ramaswamy S, Haas W, Maheswaran S, Haber DA. HER2 expression
identifies dynamic functional states within circulating breast cancer cells. Nature. 2016 Sep
1;537(7618):102-106
Friedlander TW, Ngo VT, Dong H, Premasekharan G, Weinberg V, Doty S, Zhao Q, Gilbert EG,
Ryan CJ, Chen WT, Paris PL. Detection and characterization of invasive circulating tumor cells
derived from men with metastatic castration-resistant prostate cancer. Int J Cancer. 2014 May
15;134(10):2284-93
Ogunwobi OO, Puszyk W, Dong HJ, Liu C. Epigenetic upregulation of HGF and c-Met drives
metastasis in hepatocellular carcinoma. PLoS One. 2013 May 28;8(5):e63765. doi:
10.1371/journal.pone.0063765
Huang W, Qi CB, Lv SW, Xie M, Feng YQ, Huang WH, Yuan BF. Determination of DNA and
RNA Methylation in Circulating Tumor Cells by Mass Spectrometry. Anal Chem. 2016 Jan
19;88(2):1378-84. doi: 10.1021/acs.analchem.5b03962. Epub 2016 Jan 7. Erratum in: Anal Chem.
2016 Apr 19;88(8):4581
Pixberg CF, Raba K, Müller F, Behrens B, Honisch E, Niederacher D, Neubauer H, Fehm T,
Goering W, Schulz WA, Flohr P, Boysen G, Lambros M, De Bono JS, Knoefel WT, Sproll C,
Stoecklein NH, Neves RPL. Analysis of DNA methylation in single circulating tumor cells.
Oncogene. 2017 Jun 8;36(23):3223-3231
Gkountela S, Castro-Giner F, Szczerba BM, Vetter M, Landin J, Scherrer R, Krol I, Scheidmann
MC, Beisel C, Stirnimann CU, Kurzeder C, Heinzelmann-Schwarz V, Rochlitz C, Weber WP,
Aceto N. Circulating Tumor Cell Clustering Shapes DNA Methylation to Enable Metastasis
Seeding. Cell. 2019 Jan 10;176(1-2):98-112.e14
Klotz R, Thomas A, Teng T, Han SM, Iriondo O, Li L, Restrepo-Vassalli S, Wang A, Izadian N,
MacKay M, Moon BS, Liu KJ, Ganesan SK, Lee G, Kang DS, Walmsley CS, Pinto C, Press MF,
Lu W, Lu J, Juric D, Bardia A, Hicks J, Salhia B, Attenello F, Smith AD, Yu M. Circulating Tumor
Cells Exhibit Metastatic Tropism and Reveal Brain Metastasis Drivers. Cancer Discov. 2020
Jan;10(1):86-103
Moore, L., Le, T. & Fan, G. DNA Methylation and Its Basic Function. Neuropsychopharmacol
38, 23–38 (2013).
Bird A, Taggart M, Frommer M, Miller OJ, Macleod D. A fraction of the mouse genome that is
derived from islands of nonmethylated, CpG-rich DNA. Cell. 1985 Jan;40(1):91-9. doi:
10.1016/0092-8674(85)90312-5. PMID: 2981636.
Saxonov S, Berg P, Brutlag DL. A genome-wide analysis of CpG dinucleotides in the human
genome distinguishes two distinct classes of promoters. Proc Natl Acad Sci U S A. 2006 Jan
31;103(5):1412-7. doi: 10.1073/pnas.0510310103. Epub 2006 Jan 23. PMID: 16432200; PMCID:
PMC1345710.
Phillips, T. (2008) The role of methylation in gene expression. Nature Education 1(1):116
Berman BP, Weisenberger DJ, Aman JF, et al. Regions of focal DNA hypermethylation and long-
range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat
Genet. 2011;44(1):40-46. Published 2011 Nov 27. doi:10.1038/ng.969
Blattler A, Yao L, Witt H, Guo Y, Nicolet CM, Berman BP, Farnham PJ. Global loss of DNA
methylation uncovers intronic enhancers in genes showing expression changes. Genome Biol.
2014 Sep 20;15(9):469. doi: 10.1186/s13059-014-0469-0. PMID: 25239471; PMCID:
PMC4203885.
Cokus, S., Feng, S., Zhang, X. et al. Shotgun bisulphite sequencing of the Arabidopsis genome
reveals DNA methylation patterning. Nature 452, 215–219 (2008).
https://doi.org/10.1038/nature06745
Lister, R., Pelizzola, M., Dowen, R. et al. Human DNA methylomes at base resolution show
widespread epigenomic differences. Nature 462, 315–322 (2009).
https://doi.org/10.1038/nature08514
Smallwood SA, Lee HJ, Angermueller C, Krueger F, Saadeh H, Peat J, Andrews SR, Stegle O,
Reik W, Kelsey G. Single-cell genome-wide bisulfite sequencing for assessing epigenetic
heterogeneity. Nat Methods. 2014 Aug;11(8):817-820. doi: 10.1038/nmeth.3035. Epub 2014 Jul
20. PMID: 25042786; PMCID: PMC4117646.
Miura F, Ito T. Highly sensitive targeted methylome sequencing by post-bisulfite adaptor tagging.
DNA Res. 2015 Feb;22(1):13-8. doi: 10.1093/dnares/dsu034. Epub 2014 Oct 16. PMID:
25324297; PMCID: PMC4379973.
Farlik M, Sheffield NC, Nuzzo A, Datlinger P, Schönegger A, Klughammer J, Bock C. Single-cell
DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell
Rep. 2015 Mar 3;10(8):1386-97. doi: 10.1016/j.celrep.2015.02.001. Epub 2015 Feb 26. PMID:
25732828; PMCID: PMC4542311.
Gravina S, Dong X, Yu B, Vijg J. Single-cell genome-wide bisulfite sequencing uncovers
extensive heterogeneity in the mouse liver methylome. Genome Biol. 2016;17(1):150. Published
2016 Jul 5. doi:10.1186/s13059-016-1011-3
Ermakova-Gerdes, S., Shestakov, S. & Vermaas, W. Random chemical mutagenesis of a
specific psbDI region coding for a lumenal loop of the D2 protein of photosystem II
in Synechocystis sp. PCC 6803. Plant Mol Biol 30, 243–254 (1996).
https://doi.org/10.1007/BF00020111
Panayiotis G. Menounos, George P. Patrinos, Chapter 4 - Mutation Detection by Single Strand
Conformation Polymorphism and Heteroduplex Analysis, Editor(s): George P. Patrinos, Wilhelm
J. Ansorge, Molecular Diagnostics (Second Edition), Academic Press, 2010, Pages 45-58, ISBN
9780123745378,https://doi.org/10.1016/B978-0-12-374537-8.00004-3.
(https://www.sciencedirect.com/science/article/pii/B9780123745378000043)
Kamal, M., Saremi, S., Klotz, R. et al. PIC&RUN: An integrated assay for the detection and
retrieval of single viable circulating tumor cells. Sci Rep 9, 17470 (2019).
https://doi.org/10.1038/s41598-019-53899-4
de Sena Brandine G and Smith AD. Falco: high-speed FastQC emulation for quality control of
sequencing data. F1000Research 2021, 8:1874 (https://doi.org/10.12688/f1000research.21142.2)
Song Q, Decato B, Hong E, Zhou M, Fang F, Qu J, Garvin T, Kessler M, Zhou J, Smith AD (2013)
A reference methylome database and analysis pipeline to facilitate integrative and comparative
epigenomics. PLOS ONE 8(12): e81148
Guilherme de Sena Brandine, Andrew D. Smith
bioRxiv 2020.12.21.423849; doi: https://doi.org/10.1101/2020.12.21.423849
Chen H, Smith AD, Chen T. WALT: fast and accurate read mapping for bisulfite sequencing.
Bioinformatics. 2016 Nov 15;32(22):3507-3509
Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic
features. Bioinformatics. 26, 6, pp. 841–842
Felix Krueger; Frankie James; Phil Ewels; Ebrahim Afyounian; Benjamin Schuster-Boeckler.
FelixKrueger/TrimGalore: v0.6.7. 2021 July 23
Daley T, Smith AD. Modeling genome coverage in single-cell sequencing. Bioinformatics. 2014
Nov 15;30(22):3159-65
Busslinger GA, Weusten BLA, Bogte A, Begthel H, Brosens LAA, Clevers H. Human
gastrointestinal epithelia of the esophagus, stomach, and duodenum resolved at single-cell
resolution. Cell Rep. 2021 Mar 9;34(10):108819
Zhao, L., Wu, X., Zheng, J. et al. DNA methylome profiling of circulating tumor cells in lung
cancer at single base-pair resolution. Oncogene 40, 1884–1895 (2021).
Esteller M. Aberrant DNA methylation as a cancer-inducing mechanism. Annu Rev Pharmacol
Toxicol. 2005;45:629-56
Momparler RL. Epigenetic therapy of cancer with 5-aza-2'-deoxycytidine (decitabine). Semin
Oncol. 2005 Oct;32(5):443-51
Das PM, Singal R. DNA methylation and cancer. J Clin Oncol. 2004 Nov 15;22(22):4632-42
Han KY, Kim KT, Joung JG, Son DS, Kim YJ, Jo A, Jeon HJ, Moon HS, Yoo CE, Chung W, Eum
HH, Kim S, Kim HK, Lee JE, Ahn MJ, Lee HO, Park D, Park WY. SIDR: simultaneous isolation
and parallel sequencing of genomic DNA and total RNA from single cells. Genome Res. 2018
Jan;28(1):75-87.
Kang CC, Yamauchi KA, Vlassakis J, Sinkala E, Duncombe TA, Herr AE. Single cell-resolution
western blotting. Nat Protoc. 2016 Aug;11(8):1508-30.
Sinkala, E., Sollier-Christen, E., Renier, C. et al. Profiling protein expression in circulating tumour
cells using microfluidic western blotting. Nat Commun 8, 14622 (2017).
Pott S. Simultaneous measurement of chromatin accessibility, DNA methylation, and nucleosome
phasing in single cells. Elife. 2017 Jun 27;6:e23203.
Clark, S.J., Argelaguet, R., Kapourani, CA. et al. scNMT-seq enables joint profiling of chromatin
accessibility DNA methylation and transcription in single cells. Nat Commun 9, 781 (2018).
Panchin AY, Makeev VJ, Medvedeva YA. Preservation of methylated CpG dinucleotides in
human CpG islands. Biol Direct. 2016;11(1):11. Published 2016 Mar 22.
Babenko VN, Chadaeva IV, Orlov YL. Genomic landscape of CpG rich elements in human. BMC
Evol Biol. 2017;17(Suppl 1):19. Published 2017 Feb 7.
Chan K, Sterling JF, Roberts SA, Bhagwat AS, Resnick MA, Gordenin DA. Base damage within
single-strand DNA underlies in vivo hypermutability induced by a ubiquitous environmental
agent. PLoS Genet. 2012;8(12):e1003149.
Pliml J, Sorm F. Synthesis of 2'-deoxy-D-ribofuranosyl-5-azacytosine. Coll Czech Chem
Commun. 1964;29:2576–2577.
Sorm F, Vesely J. Effect of 5-aza-2'-deoxycytidine against leukemic and hemopoietic tissues in
AKR mice. Neoplasma. 1968;15:339–343.
Saito Y, Liang G, Egger G, Friedman JM, Chuang JC, Coetzee GA, Jones PA. Specific activation
of microRNA-127 with downregulation of the proto-oncogene BCL6 by chromatin-modifying
drugs in human cancer cells. Cancer Cell. 2006 Jun;9(6):435-43.
Saliba AN, John AJ, Kaufmann SH. Resistance to venetoclax and hypomethylating agents in acute
myeloid leukemia. Cancer Drug Resist. 2021;4:125-142.
Kelly TK, De Carvalho DD, Jones PA. Epigenetic modifications as therapeutic targets. Nat
Biotechnol. 2010 Oct;28(10):1069-78.
Chiappinelli KB, Strissel PL, Desrichard A, Li H, Henke C, Akman B, Hein A, Rote NS, Cope
LM, Snyder A, Makarov V, Budhu S, Slamon DJ, Wolchok JD, Pardoll DM, Beckmann MW,
Zahnow CA, Merghoub T, Chan TA, Baylin SB, Strick R. Inhibiting DNA Methylation Causes an
Interferon Response in Cancer via dsRNA Including Endogenous Retroviruses. Cell. 2015 Aug
27;162(5):974-86. doi: 10.1016/j.cell.2015.07.011. Erratum in: Cell. 2016 Feb 25;164(5):1073.
Buhu, Sadna [corrected to Budhu, Sadna]; Mergoub, Taha [corrected to Merghoub, Taha]. Erratum
in: Cell. 2017 Apr 6;169(2):361.
Roulois D, Loo Yau H, Singhania R, Wang Y, Danesh A, Shen SY, Han H, Liang G, Jones PA,
Pugh TJ, O'Brien C, De Carvalho DD. DNA-Demethylating Agents Target Colorectal Cancer
Cells by Inducing Viral Mimicry by Endogenous Transcripts. Cell. 2015 Aug 27;162(5):961-73.
Liu M, Ohtani H, Zhou W, Ørskov AD, Charlet J, Zhang YW, Shen H, Baylin SB, Liang G,
Grønbæk K, Jones PA. Vitamin C increases viral mimicry induced by 5-aza-2'-deoxycytidine. Proc
Natl Acad Sci U S A. 2016 Sep 13;113(37):10238-44.
Kulis M, Heath S, Bibikova M, Queiros AC, Navarro A, Clot G, Martinez-Trillos A, Castellano
G, Brun-Heath I, Pinyol M, et al. Epigenomic analysis detects widespread gene-body DNA
hypomethylation in chronic lymphocytic leukemia. Nat Genet. 2012
Maunakea AK, Nagarajan RP, Bilenky M, Ballinger TJ, D'Souza C, Fouse SD, Johnson BE, Hong
C, Nielsen C, Zhao Y, et al. Conserved role of intragenic DNA methylation in regulating
alternative promoters. Nature. 2010;466:253–257.
Varley KE, Gertz J, Bowling KM, Parker SL, Reddy TE, Pauli-Behn F, Cross MK, Williams BA,
Stamatoyannopoulos JA, Crawford GE, et al. Dynamic DNA methylation across diverse human
cell lines and tissues. Genome Res. 2013;23:555–567.
Yang X, Han H, De Carvalho DD, Lay FD, Jones PA, Liang G. Gene body methylation can alter
gene expression and is a therapeutic target in cancer. Cancer Cell. 2014;26(4):577-590.
Abstract
Breast cancer accounts for over 40,000 deaths annually in the United States, and over ninety percent of these deaths are attributed to metastasis. Circulating tumor cells (CTCs), shed from primary or metastatic tumors into the circulatory system, are carried to distant organs where metastatic tumors eventually arise. Revealing the molecular properties underlying heterogeneous CTCs will have a significant impact on overcoming therapeutic resistance in patients and on understanding metastatic mechanisms. CTCs have been studied in the context of genomic mutations and transcriptional profiling. However, the contribution of epigenetic changes to the characteristics of CTCs is unknown. DNA methylation plays a fundamental role in regulating many cellular processes, and alterations in methylation, especially at promoters, are a typical hallmark of cancer. Our lab established several CTC lines derived from breast cancer patients, allowing us to analyze the CTCs' methylomes for the first time. My focus started with understanding the heterogeneous nature of CTCs by utilizing three published protocols for single cell whole genome bisulfite sequencing (scWGBS). However, I found that these scWGBS protocols produced inconsistent bisulfite conversion rates in single CTC samples. A highly efficient bisulfite conversion rate is crucial for accurate methylation analysis: any unmethylated cytosine that is not converted into uracil is read as methylated, which produces false positive methylation calls. In contrast, I could produce highly efficient bisulfite conversion rates using naked DNA and single euchromatic mouse embryonic stem cells. These results led me to hypothesize that proteins in the heterochromatic genomes of CTCs may shield the DNA from the bisulfite reaction, thus leading to inconsistent conversion in single CTCs.
Since CTCs are a rare cell type and every cell matters, I then focused on developing an improved scWGBS protocol that can be applied reliably to CTCs and other rare cell types. By finding an appropriate lysis buffer that can effectively lyse the cell and strip denatured proteins from the DNA, as well as an optimal proteinase that can degrade those proteins, I successfully generated scWGBS libraries in cancer and normal cells with a consistent bisulfite conversion rate of >95%. Application of this robust scWGBS protocol led to the identification of unique methylation regions in CTCs and other cancer cell lines. Moreover, I applied the analysis to a CTC line treated with a DNA demethylating agent, 5-aza-2'-deoxycytidine (5-AZA-CdR), and showed that there are heterogeneous responses to the treatment at the single cell level, which cannot be revealed by bulk analysis. In conclusion, I have successfully developed a robust scWGBS protocol that can be applied to analyze DNA methylomes of rare cell types.
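To make the conversion-rate criterion concrete, the following is a minimal sketch of how a per-cell bisulfite conversion rate could be computed and checked against the >95% level mentioned above. It assumes the rate is estimated from cytosines presumed to be unmethylated (for example, an unmethylated spike-in control or non-CpG sites); the abstract does not say which estimate was used. The function names and the per-cell counts are hypothetical and are not data from the thesis.

```python
def conversion_rate(converted: int, unconverted: int) -> float:
    """Fraction of presumed-unmethylated cytosines that were read as T.

    converted   -- calls where the cytosine was converted (read as T)
    unconverted -- calls where the cytosine was not converted (read as C)
    """
    total = converted + unconverted
    if total == 0:
        raise ValueError("no cytosine calls to evaluate")
    return converted / total


def passes_threshold(converted: int, unconverted: int,
                     threshold: float = 0.95) -> bool:
    """True if the conversion rate is strictly above the threshold."""
    return conversion_rate(converted, unconverted) > threshold


if __name__ == "__main__":
    # Hypothetical counts per cell: (converted, unconverted). Illustrative only.
    cells = {"cell_A": (9_900, 100), "cell_B": (8_000, 2_000)}
    for name, (conv, unconv) in cells.items():
        rate = conversion_rate(conv, unconv)
        status = "pass" if passes_threshold(conv, unconv) else "fail"
        # 1 - rate is the share of unmethylated cytosines that would be
        # miscalled as methylated because they escaped conversion.
        print(f"{name}: conversion {rate:.1%}, "
              f"false-methylated {1 - rate:.1%}, {status}")
```

In this sketch, a cell with 80% conversion would report roughly one in five unmethylated cytosines as methylated, which is the false positive effect the abstract describes.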
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Ancestral inference and cancer stem cell dynamics in colorectal tumors
DNA methylation changes in the development of lung adenocarcinoma
Understanding DNA methylation and nucleosome organization in cancer cells using single molecule sequencing
Efficient algorithms to map whole genome bisulfite sequencing reads
Effects of chromatin regulators during carcinogenesis
Limit of detection analysis for cell-free DNA methylation using targeted bisulfite sequencing
Ectopic expression of a truncated isoform of hair keratin 81 in breast cancer alters biophysical characteristics to promote metastatic propensity
Applying multi-omics in cancer liquid biopsy for improved patient monitoring and biomarker discovery
DNA methylation markers for blood-based detection of small cell lung cancer in mouse models
Exploring the effects of CXCR4 inhibition on circulating tumor cell populations in metastatic prostate cancer
DNA methylation as a biomarker in human reproductive health and disease
Functional DNA methylation changes in normal and cancer cells
Application of tracing enhancer networks using epigenetic traits (TENET) to identify epigenetic deregulation in cancer
RNA methylation in cancer plasticity and drug resistance
Heterogeneity and plasticity of malignant and non-malignant circulating analytes in breast carcinomas
Identification of CBP/FOXM1 as a molecular target in triple negative breast cancer
Identification and characterization of cancer-associated enhancers
Mechanistic basis for chromosomal translocations at the E2A gene
Development of immunotherapy for small cell lung cancer using iso-aspartylated antigen
TLR8-transferred miR-192 acts as a tumor suppressor in neuroblastoma by inhibiting CTCF
Asset Metadata
Creator
Ortiz, Veronica (author)
Core Title
Developing a robust single cell whole genome bisulfite sequencing protocol to analyse circulating tumor cells
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Cancer Biology and Genomics
Degree Conferral Date
2022-05
Publication Date
06/21/2024
Defense Date
11/14/2021
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
breast cancer,cancer,CTCs,DNA methylation,epigenetics,Metastasis,OAI-PMH Harvest,scWGBS,single cell,WGBS
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Offringa, Ite A. (committee chair), Liang, Gangning (committee member), Siegmund, Kimberly (committee member), Stallcup, Michael (committee member), Yu, Min (committee member)
Creator Email
veronico@usc.edu,veroniconato@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC111345313
Unique identifier
UC111345313
Legacy Identifier
etd-OrtizVeron-10775
Document Type
Dissertation
Rights
Ortiz, Veronica
Internet Media Type
application/pdf
Type
texts
Source
20220622-usctheses-batch-948 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu