Dissecting the heterogeneity of mouse hematopoietic stem cells in vivo
DISSECTING THE HETEROGENEITY OF
MOUSE HEMATOPOIETIC STEM CELLS IN VIVO
By
Du Jiang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
DEVELOPMENT, STEM CELL AND REGENERATIVE MEDICINE
August 2020
Copyright 2020 Du Jiang
Dedication
To my family
Acknowledgements
I am extremely grateful to my mentor, Dr. Rong Lu, for all her insightful guidance,
patience, and generosity. It is such a great privilege to be among her first graduate students, and I
have been so fortunate to receive her tireless instruction and inspiration. From collecting
peripheral blood to interpreting single-cell RNA-seq data, from designing algorithms to writing
manuscripts, she always keeps her door open and uses her enthusiasm to encourage me to tackle
any problem. Her optimism and involvement have kept the past six years from ever feeling too
long.
I am also thankful for all the help I have received on my journey toward the degree,
especially from my dissertation committee members, Dr. Qi-Long Ying, Dr. Andy McMahon, Dr.
Adam MacLean, and former member Dr. Akil Merchant, for their advice, encouragement, and
criticism over the past several years.
I thank all the current and former members of the Lu Lab for assisting my research as
well as making my life in the lab so joyful and unforgettable. Specifically, I thank Ania for
managing the lab and making my life easy, and leading the aging project; Adnan for pioneering
the single-cell RNA-seq in the lab; Humberto for leading the leukemia project; Charlie for
putting together the Nature Protocols manuscript.
Finally, I thank my family, especially my parents, for their constant support. Thousands
of miles apart, I feel their love and prayers every day. To my two deceased grandfathers, I am
proud to present you my degree, and may you rest in peace. To my spouse, Yichen, thank you for
all your love and congratulations for marrying a doctor.
TABLE OF CONTENTS
Dedication
Acknowledgements
List of Figures
List of Tables
Abstract
Chapter 1: Introduction
Isolation and characterization of mouse hematopoietic stem cells
Hematopoietic hierarchy
Methods of HSC clonal tracking
Single cell RNA sequencing technology
HSC heterogeneity
Chapter 2: Improvement of clonal tracking barcode system
Abstract
Introduction
Applications of the method
Comparisons with other methods
Limitations
Experimental design
Plasmid Generation
Lentivirus Packaging
Transducing Experimental Cells
Barcode Extraction
DNA Quantification & High-throughput Sequencing
Analyzing Sequencing Data
Anticipated results
Author contribution
Figures and Tables
Chapter 3: HSC heterogeneity is cell autonomous
Abstract
HSCs derived from the same clone behave similarly in different hosts
Mobilization does not alter HSC differentiation
Discussion
Methods
Figures
Chapter 4: Identifying genes modulating functional differences between individual HSCs
Abstract
Introduction
Molecular bridges linking genetic barcode tracking and single cell RNA sequencing
Identifying genes significantly associated with cellular activities across individual cells
Functional relevance of identified genes
Quantitative associations between gene expression and cellular activity
Discussion
Methods
Figures and Tables
Chapter 5: Transplantation alters HSC differentiation
Abstract
Transplantation conditions alter HSC differentiation at the clonal level
Transplantation modulates cellular activities in vivo
Discussion
Methods
Figures and Tables
Chapter 6: Cellular heterogeneity associated with aging and leukemia
Abstract
Temporal variability in the onset of aging
Leukemia progression and chemotherapy response
Methods
Figures and Tables
References
List of Figures
Fig. 2.1 Experimental workflow
Fig. 2.2 Comparing barcode extraction replicates
Fig. 2.3 Q-PCR amplification of barcodes
Fig. 3.1 Donor chimerism of chemotherapy-mediated transplantation
Fig. 3.2 HSCs derived from the same ancestor differentiate similarly in different mice
Fig. 3.3 FACS gating for cell isolation
Fig. 3.4 Comparing the self-renewal of HSCs derived from the same ancestor
Fig. 3.5 Establishing HSC mobilization assay
Fig. 3.6 Tracking HSC clones before and after mobilization
Fig. 4.1 Scheme of the integrative experimental system
Fig. 4.2 Extracting “molecular bridges”
Fig. 4.3 Bioinformatic pipeline for mapping single cell transcriptomes and activities
Fig. 4.4 Comparing the transcriptomes and activities of barcoded and non-barcoded HSCs
Fig. 4.5 Identifying genes significantly associated with cellular activities across individual cells
Fig. 4.6 Quantitative association between gene expression and cellular activity
Fig. 4.7 Classifying quantitative association patterns
Fig. 4.8 Complete list of genes identified as significantly associated with HSC activities
Fig. 4.9 Gene-gene interaction depicted by the association patterns
Fig. 5.1 HSC lineage bias after unconditioned or irradiation-mediated transplantation
Fig. 5.2 Comparing the lineage bias in two secondary recipients
Fig. 5.3 HSC clones with distinct lineage bias and balance exhibited similar blood abundance consistency
Fig. 5.4 Comparing the lineage bias between primary and secondary recipients
Fig. 5.5 Lymphoid biased HSC clones produced less blood cells in secondary recipients
Fig. 5.6 HSCs systematically change blood production during serial transplantation
Fig. 5.7 HSCs can be activated or inactivated upon transplantation
Fig. 5.8 Fewer HSCs produce blood in secondary recipients compared with primary recipients
Fig. 5.9 Clonal dominance increases upon transplantation
Fig. 5.10 Characteristics of activated and inactivated HSC clones
Fig. 5.11 Contribution of persistent clones in primary and secondary recipients
Fig. 5.12 Myl10 expression is associated with HSC lineage bias shift upon transplantation
Fig. 6.1 Identifying genes expressed differently between HSCs aged differently
Fig. 6.2 HSCs aged in young and old niches exhibited different gene ontology signatures
Fig. 6.3 Identifying differentially expressed genes between blood and legs
Fig. 6.4 Identifying differentially expressed genes between blood and ovary
Fig. 6.5 Identifying genes associated with different clonal response to chemotherapy
List of Tables
Table 2.1 Barcode Oligos for each Library ID
Table 2.2 List of primers
Supplemental Table 4.1 Complete list of genes exhibiting significant associations with HSC activities
Supplemental Table 4.2 Summary of previous studies
Supplementary Table 5.1 Significance of the change in lineage bias determined by Monte Carlo simulation
Supplementary Table 5.2 Significance of the change in lineage bias determined by probability calculation
Supplementary Table 5.3 T-test analysis between different clones
Supplementary Table 6.1 Summary of previous studies
Abstract
Stem cell heterogeneity plays important roles during development, aging, regeneration,
and disease progression. However, its underlying mechanisms remain largely elusive. Here, we
improved an embedded viral barcoding technology and combined it with droplet-based single
cell RNA sequencing technology to study hematopoietic stem cells (HSCs) in mice. By
simultaneously measuring the transcriptomes and in vivo cellular activities of hundreds of
individual HSCs, we show that intercellular variations in the expression levels of dozens
of genes are significantly correlated with distinct activity levels of individual HSCs. Our data
illustrate a novel approach for studying molecular regulatory mechanisms through quantitatively
dissecting intercellular variations.
Chapter 1: Introduction
Isolation and characterization of mouse hematopoietic stem cells
The term “stem cell” was first used by Ernst Haeckel, famous for his biogenetic law,
to describe the fertilized egg (Häckel, 1868). Since then, the stem cell concept
has been framed as a tree-like model, in which the stem cell sits at the root of a branching tree
and gives rise to its progeny through an ordered series of steps (Laurenti & Göttgens, 2018).
Although the existence of stem cells or common progenitors for different organs had long been
suspected, the first evidence was not observed until the mid-20th century. The atomic bomb
explosions in Japan in 1945 showed the tragic consequences that war can bring to humanity, but
they unexpectedly shed light on stem cell research. Among the large population exposed to
radiation, those who died from the lowest lethal doses of irradiation were found to have died
mostly of hematopoietic failure. A few years later, this radiation syndrome was recapitulated in
mouse models and prevented by injecting marrow cells (Jacobson et al., 1951; Lorenz et al., 1951;
Weissman & Shizuru, 2008). This work not only started the field of hematopoietic cell
transplantation, but also implied the existence of a common blood progenitor within the bone marrow.
Starting in 1961, Till and McCulloch performed a series of experiments that are
seen as the initial prospective efforts to find the common blood progenitors, or hematopoietic
stem cells (HSCs). By injecting bone marrow cells into lethally irradiated mice and observing
colony formation within the spleen or in vitro, they found that bone marrow cells seemed to be
capable of making more of themselves, as well as giving rise to both myelo-erythroid progeny and
lymphocytes. Although this capacity for self-renewal and differentiation meets our current
definition of stem cells, the relatively low resolution of these assays was not sufficient to
conclude the existence of HSCs, because the differentiation outcome could reflect contributions
from various cell types, and self-renewal could be a property of multiple hematopoietic cell types
as well. However, both bone marrow transplantation and the colony-forming assay became powerful
tools for the decades to come, and greatly advanced our understanding of hematopoiesis.
With the retrospective genetic marker evidence that HSCs existed, researchers moved on
to the prospective isolation of such stem cells from bone marrow. Bone marrow cells were
initially separated by size and density (N. Iscove, 1990; N. N. Iscove et al., 1972), and then, with
the advancement of multiparametric fluorescence activated cell sorting (FACS) (Hulett et al., 1969)
and monoclonal antibody production (Köhler & Milstein, 1975), by reagents against cell surface
markers (Visser et al., 1984). The development of FACS and monoclonal antibodies also eased the
evaluation of the differentiation potential of isolated cells. Initially, this had to be done by the
spleen colony assay or by in vitro factor-dependent colony-forming assays (Ezine et al., 1984, 1985;
Lepault & Weissman, 1981; Spangrude, Muller-Sieburg, et al., 1988; Visser et al., 1984;
Whitlock & Witte, 1982). Later, mouse strains on the C57BL background congenic for two alleles
of CD45, the cell surface leukocyte common antigen, were developed (F. W. Shen et al.,
1986), and cell surface markers that delineate mature cell populations, such as B220 (B cell)
(Coffman & Weissman, 1981) and Gr1 (granulocyte) (Holmes et al., 1986), were also identified.
As a result, one could easily evaluate the differentiation potential of transplanted cells by
performing FACS analysis on peripheral blood cells.
With these tools in hand, the isolation of HSCs became a process of subsetting cells by
FACS and evaluating their differentiation potential. An early experiment found that cells lacking
expression of B220 produced myeloid/B clones in vitro. This inspired the first subsetting
criterion, a panel of 7-10 lineage antibodies used for negative selection (Lin-) (C. E.
Muller-Sieburg et al., 1986). Later, one of the monoclonal antibodies raised against putative pre-T
hybridomas, Sca1, was shown to separate cells into two subpopulations, of which only the
Sca1-positive population has in vivo reconstitution potential (Spangrude, Heimfeld, et al., 1988).
A similar approach showed that c-Kit is also an important marker for the HSC population (Ikuta &
Weissman, 1992). Lin-cKit+Sca1+ (KLS) bone marrow cells have since been taken as the hematopoietic
stem and progenitor population, with additional markers reported to further purify HSCs. Inspired
by the report that all colony-forming activity of human bone marrow cells was found in the
CD34-positive fraction (Baum et al., 1992), Osawa et al. generated an antibody against mouse CD34
and showed that the opposite holds in mouse: CD34 should be used for negative selection of
HSCs (Osawa et al., 1996). Using a different approach, the Sean Morrison laboratory sought to
identify genes differentially expressed between KLS HSCs and multipotent progenitors. They
found that SLAM family receptors, including CD150, CD48 and CD244, were differentially
expressed among functionally distinct progenitor populations (Kiel et al., 2005; Oguro et al.,
2013). Although there have recently been reports proposing the expression of a single gene to
define HSCs, such as Hoxb5 (J. Y. Chen et al., 2016) and Tie2 (Busch et al., 2015), multicolor flow
sorting so far remains the gold standard for isolating phenotypic HSCs. Our lab defines
HSCs as Lin-cKit+Sca1+Flk2-CD34-Slam+ cells within the bone marrow.
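The phenotype definition above is a conjunction of positive and negative marker gates, which can be sketched in code. The following is a minimal illustration, assuming each marker has already been reduced to a positive/negative call. The Cell class, its field names, and the single slam flag standing in for the SLAM gate are hypothetical and are not part of any sorting protocol described here, where gates are drawn on continuous fluorescence intensities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Binarized marker calls for one cell (True = falls in the positive gate).

    Hypothetical structure for illustration only; real FACS data are
    continuous intensities and gates are drawn per experiment.
    """
    lin: bool
    ckit: bool
    sca1: bool
    flk2: bool
    cd34: bool
    slam: bool  # single flag standing in for the SLAM gate (assumption)


def is_kls(cell: Cell) -> bool:
    """Lin- cKit+ Sca1+ (KLS): the stem and progenitor fraction."""
    return (not cell.lin) and cell.ckit and cell.sca1


def is_phenotypic_hsc(cell: Cell) -> bool:
    """Lin- cKit+ Sca1+ Flk2- CD34- Slam+, as stated in the text."""
    return is_kls(cell) and (not cell.flk2) and (not cell.cd34) and cell.slam


# Toy cells: one matching the HSC phenotype, one CD34+ KLS cell.
hsc_like = Cell(lin=False, ckit=True, sca1=True, flk2=False, cd34=False, slam=True)
cd34_pos = Cell(lin=False, ckit=True, sca1=True, flk2=False, cd34=True, slam=True)

print(is_phenotypic_hsc(hsc_like), is_phenotypic_hsc(cd34_pos))
```

Expressing the phenotype as nested predicates mirrors how the sort is built up historically: the KLS gate first, then additional negative and positive markers that narrow it.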
Hematopoietic hierarchy
Progress in the prospective isolation of HSCs has also led to the identification and
isolation of a hierarchy of progenitors. Around the time when cKit was described, the Lin-Sca1+
population was resolved into three subpopulations based on the relative fluorescence levels of
Mac1 and CD4 (S. J. Morrison et al., 1997; Sean J. Morrison & Weissman, 1994), two lineage
markers. The three subpopulations were later designated long-term HSCs (LT-HSCs), short-term
HSCs (ST-HSCs), and multipotent progenitors (MPPs), based on their self-renewal
potential shown in CFU-S and competitive transplantation assays. LT-HSCs give rise to ST-HSCs, and
ST-HSCs give rise to MPPs, with no recorded de-differentiation. Only LT-HSCs can provide
long-term multilineage reconstitution (LTMR) upon transplantation (Uchida et al., 1994).
The isolation of oligo-lineage progenitors further demonstrated that only LT-HSCs hold
the ability for LTMR. Based on the prior knowledge that interleukin-7 (IL7) acts as a
nonredundant cytokine for both B and T cell development, an effect mediated by the IL7 receptor
(IL7R), Kondo et al. used IL7R expression as a marker to search mouse bone marrow. They
demonstrated that a single KLS IL7R+ cell can differentiate into both B and T cells but not into
any myeloerythroid lineage, and therefore represents the common lymphoid progenitor (CLP)
population (Kondo et al., 1997). Similarly, based on the prior knowledge that Fcγ receptor
(FcgR) is an important marker for myelomonocytic cells and a progenitor marker in fetal liver
hematopoiesis, Akashi et al. divided the IL7R-Lin-cKit+Sca1- population into three subsets by
FACS and evaluated their in vivo differentiation potential, demonstrating that they are common
myeloid progenitors (CMPs), granulocyte-macrophage progenitors (GMPs), and
megakaryocyte-erythroid progenitors (MEPs) (Akashi et al., 2000). Transplantation of CMPs gave
rise to all myeloerythroid progeny; MEPs gave rise only to red blood cells and platelets; GMPs gave
rise only to granulocytes, monocytes, and cells of the macrophage lineage. Taken together, these
data established a tree-like hematopoietic hierarchy (Weissman & Shizuru, 2008): at the top of the
hierarchy sits the HSC, which is capable of long-term self-renewal and gives rise to MPPs,
which do not self-renew; MPPs can differentiate into the myeloid lineage (CMPs) and the lymphoid
lineage (CLPs); CMPs differentiate into GMPs and MEPs; GMPs differentiate into granulocytes
and macrophages; MEPs differentiate into erythrocytes and platelets; CLPs differentiate into B
cells, T cells and natural killer cells.
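The classic hierarchy described above can be written down as a small tree data structure. The sketch below is only an illustration of the node-and-line model as summarized in this section. The dictionary and function names are hypothetical, and LT-HSCs and ST-HSCs are collapsed into a single HSC node for brevity.

```python
# Parent -> children edges of the classic tree, as listed in the text.
# Names are illustrative; LT-HSC and ST-HSC are merged into "HSC" (assumption).
HIERARCHY = {
    "HSC": ["MPP"],
    "MPP": ["CMP", "CLP"],
    "CMP": ["GMP", "MEP"],
    "GMP": ["granulocyte", "macrophage"],
    "MEP": ["erythrocyte", "platelet"],
    "CLP": ["B cell", "T cell", "natural killer cell"],
}


def mature_progeny(cell_type, tree=HIERARCHY):
    """Return the set of terminal (leaf) cell types reachable from cell_type."""
    children = tree.get(cell_type)
    if not children:
        # A node without outgoing edges is treated as a mature cell type.
        return {cell_type}
    result = set()
    for child in children:
        result |= mature_progeny(child, tree)
    return result


for progenitor in ("HSC", "CMP", "CLP", "MEP"):
    print(progenitor, "->", sorted(mature_progeny(progenitor)))
```

In this representation each progenitor has exactly one parent and a fixed set of fates, which is precisely the assumption that later studies of biased MPP subsets and continuous differentiation call into question.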
Further investigations have proposed modifications to the classic hematopoietic
hierarchy tree. By combining FACS and transplantation assessment, MPPs have been divided
into up to four subpopulations with differently biased differentiation potentials
(Cabezas-Wallscheid et al., 2014; Pietras et al., 2015; Wilson et al., 2008). In a human study, by
refining the FACS scheme and performing single cell transplantation, Notta et al. demonstrated
that the tree structure shifts during human development (Notta et al., 2016): in fetal liver,
oligopotent progenitors with distinct activities were a prominent component, whereas the bone
marrow was dominated by unilineage progenitors; in fetal liver, megakaryocyte progenitors were not
restricted to the stem cell compartment, whereas in bone marrow, the megakaryocyte lineage was
closely tied to the fate of multipotent cells. Recent progress based on single cell RNA sequencing
analysis has also challenged the node-and-line representation of the hierarchy, arguing that
differentiation is a continuous process (Nestorowa et al., 2016).
Methods of HSC clonal tracking
In order to study the heterogeneity within the HSC population, particularly interactions
between different HSCs, it is critically important to track a large number of individual HSCs or
HSC clones simultaneously. This requires technologies that can label individual cells or clones
with distinct markers, and one of the first of its kind was actually developed before HSCs were
definitively characterized. In one of Till and McCulloch’s early experiments, they sought to
determine whether spleen colonies observed after bone marrow transplantation developed
from single cells and hence were clones. They first exposed mice to a low dose of irradiation
(250 cGy); then, right after injecting bone marrow cells from donor mice of the same strain, they
exposed the recipient mice to additional irradiation (650 cGy) to reach a lethal dose. In this way,
the host cells were deprived of colony-forming ability, while the donor cells would randomly carry
chromosome alterations induced by radiation (Becker et al., 1963). They found that the same
chromosome markers were seen among cells within the same spleen colony, but not across
different spleen colonies, proving that each spleen colony arises from a single clonogenic cell.
This result suggested that clonal populations of hematopoietic cells can be quantitatively studied
in vivo. However, the relatively low resolution prevented exploration of other questions
important for clonal tracking studies, such as whether different clones have different proliferation
and differentiation potentials. Nonetheless, the idea of introducing new genetic information into
HSCs followed by transplantation later proved to be key to deciphering the fate of individual
HSCs.
Around 1985, when several populations with various differentiation potentials had been
identified and interest in the hematopoietic hierarchy was growing, Keller et al.,
Dick et al., and Lemischka et al. each developed tools to track the fate of hematopoietic
precursors and analyze cell lineage relationships (Dick et al., 1985; Keller et al., 1985; Ihor R.
Lemischka et al., 1986). They reasoned that retroviral insertion of foreign genes should occur
at different genomic locations in different cells, and that each insertion would be stably
inherited by the corresponding progeny. Therefore, in their experiments, mouse bone marrow cells
were first co-cultured with cell lines that produce retrovirus carrying the neo gene, then
transplanted into irradiated mice. Because multicolor flow cytometry was not yet available at that
time, they then used colonies derived from the marrow, thymus, lymph nodes, and spleen of host mice,
representing different lineage outcomes, to perform restriction enzyme digestion and Southern
blotting. They were able to identify a few different DNA fragment sizes, which indicate
different virus integration sites (ISs); some were present in multiple lineages and some were
present in secondary hosts. However, the low-resolution nature of Southern blotting limited the
number of ISs detectable.
Derivatives of the polymerase chain reaction (PCR) were then used to facilitate IS
detection. These methods, such as inverse PCR (iPCR) (Kohn et al., 1995), ligation-mediated
PCR (LM-PCR) (Kustikova et al., 2005), and linear-amplification-mediated PCR (LAM-PCR)
(Kiem et al., 2004; Schmidt et al., 2002, 2003), differ slightly but all begin with linear
amplification using a single primer that anneals to the known viral sequence. Although these
PCR-based assays greatly improved the resolution in detecting ISs, they still relied on restriction
enzyme cutting to distinguish different fragment sizes, and thus could detect at most dozens of
ISs. It also remained unclear whether one fragment size resulted from one viral integration
event or from independent viral integrations lying at the same distance from restriction enzyme sites.
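The ambiguity described above can be made concrete with a toy model. The sketch below assumes that the observed fragment size depends only on the genomic distance from the integration position to the next downstream restriction site; the fixed viral and linker sequence that real LM-PCR or LAM-PCR products also contain is ignored. All coordinates and function names are hypothetical.

```python
import bisect


def fragment_length(integration_pos, restriction_sites):
    """Distance from an integration position to the next downstream
    restriction site, or None if no site lies downstream.

    Toy model (assumption): only the genomic part of the fragment is counted.
    """
    sites = sorted(restriction_sites)
    i = bisect.bisect_right(sites, integration_pos)
    if i == len(sites):
        return None
    return sites[i] - integration_pos


def group_by_fragment_length(integration_sites, restriction_sites):
    """Map each fragment length to the integration positions that produce it."""
    groups = {}
    for pos in integration_sites:
        length = fragment_length(pos, restriction_sites)
        groups.setdefault(length, []).append(pos)
    return groups


# Hypothetical coordinates on one chromosome.
restriction_sites = [1000, 5000, 9000]
integration_sites = [700, 4700, 8200]

groups = group_by_fragment_length(integration_sites, restriction_sites)
for length, positions in sorted(groups.items()):
    print(length, positions)
```

Here two independent integrations, at positions 700 and 4700, both lie 300 bp upstream of a restriction site and would be indistinguishable by fragment size alone, which is why the flanking genomic sequence is needed to tell them apart.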
To distinguish different ISs that share the same fragment size, it is necessary to
identify the nearby genomic sequence. Therefore, several groups used high throughput
sequencing technology, at that time Roche 454 pyrosequencing, to sequence the genomic regions
amplified by LM-PCR or LAM-PCR (Cartier et al., 2009; Hai et al., 2008; Maetzig et al., 2011;
G. P. Wang et al., 2010). These methods were able to identify thousands of ISs and are still in
use, particularly in the field of gene therapy. However, such studies revealed that
integrations cluster around common insertion sites more frequently than expected by chance (Hai
et al., 2008), raising the concern of whether individual ISs truly represent individual clones.
Additionally, the difficulty of recovering genomic ISs precludes obtaining the quantitative data
needed for most clonal tracking questions.
Synthetic DNA segments, by contrast, are easy to recover and produce highly
quantitative results when used to barcode individual cellular clones. In this system, artificial
DNA segments are flanked by known sequences before being delivered into cells by retro- or
lentiviral vectors, so they can be extracted from genomic DNA by a simple PCR. The idea of using
synthetic DNA segments for cellular barcoding was initially implemented in conjunction with
microarrays (Schepers et al., 2008), and it proved to be a powerful tool by showing how locally
primed T cells disperse. However, it requires designing and manufacturing the microarray probes ad
hoc, and the diversity of the synthetic library is largely limited by the capacity of the array.
Later, our lab and others combined synthetic DNA barcodes with high throughput sequencing
(Cheung et al., 2013; Gerrits et al., 2010; Lu et al., 2011; Lyne et al., 2018; Naik et al., 2013;
L. V. Nguyen, Makarem, et al., 2014), which provides high sensitivity and throughput and allows
precise quantification of cellular progeny. In our system, a lentiviral vector delivers barcodes
from a large library into a small number of cells at a titer low enough that most cells receive
a single unique barcode. After transplantation, the progeny of the donor cells are harvested, and
barcodes are recovered from their genomic DNA by a single PCR reaction, which also
introduces Illumina handles to both ends of the product. Barcodes are then identified and
quantified using high throughput sequencing and bioinformatic analysis.
Recently, several approaches have been developed to eliminate the transduction
and transplantation steps, which are otherwise unavoidable in the barcoding procedure for in vivo studies.
These approaches use transgenic organisms whose cells were engineered with transposon,
Polylox, or CRISPR/Cas9 technologies. The transposon-based approach temporarily expresses a
transposase in the HSPC population to activate the mobilization of a DNA segment, i.e. a
transposon, which is randomly inserted into the genome to label individual cells
(Rodriguez-Fraticelli et al., 2018; Sun et al., 2014). Transposon insertion sites are then identified, similar to
traditional retroviral IS identification, by restriction enzyme cutting, LM-PCR, and sequencing.
As a result, this approach also suffers from poor quantification due to difficulties in recovering
ISs. The Polylox-based approach uses a series of unique loxP sites embedded in the genome that
can be excised randomly when Cre recombinase is activated. This approach has been used to
generate DNA segment combinations (Pei et al., 2017) or fluorescent protein combinations (Yu
et al., 2016). While advances in FACS have made more fluorophores distinguishable, fluorescent
protein combination-based assays can still label far fewer clones
than DNA sequence-based assays. DNA segment combinations, theoretically capable of labeling
at least hundreds of thousands of clones, could in practice track only a few dozen clones.
Additionally, the Polylox-based approach assumes that recombination occurs randomly, which is
not necessarily true (Rüfer & Sauer, 2002), further raising the question of whether it truly labels
individual clones. The CRISPR/Cas9-based approach uses the novel gene editing tool to edit
DNA segments embedded in the genome, with the help of guide RNA (Frieda et al., 2017;
Kalhor et al., 2018). This approach has been applied to track cell lineages during whole-organism
development. Similar to the Polylox-based approach, it also suffers from the fact that editing is
more likely to occur in some sites than others.
Single cell RNA sequencing technology
It is critically important to measure gene expression at the single-cell level when studying
highly heterogeneous tissues, but this was long technically challenging due to the limited amount
of material available for analysis. The first successful attempt to amplify transcriptome material from
single cells to a sufficient amount was made in the early 1990s (Brady et al., 1990). In this attempt,
synthesis of cDNA was primed with oligo(dT) and the resulting strands were tailed with poly(dA);
the cDNA was then PCR-amplified using oligo(dT)-containing primers. Almost two decades
later, this approach was improved in conjunction with microarrays (Kurimoto et al., 2006), then
with high throughput sequencing (Tang et al., 2009), the latter marking the first mRNA-seq
whole-transcriptome analysis on a single cell. While this method captured the full-length cDNA
and identified many previously unknown transcripts and splicing variants, it inevitably inherited
some limitations of mRNA-seq at that time, such as bias introduced during multiple rounds of
PCR amplification, and preferential amplification of the 3’ ends of mRNAs. To eliminate the PCR
bias in mRNA-seq experiments, unique molecular identifiers (UMIs) were applied to count
individual RNA molecules (Kivioja et al., 2012). In this method, each cDNA molecule within a
library is tagged with a random DNA sequence label, i.e. a UMI, before PCR amplification and
sequencing. Because the number of distinct UMIs acts as a molecular memory of the number of
molecules in the original sample, upon deep sequencing, the number of original DNA molecules
can be determined by counting each UMI only once, regardless of how many times it was
observed. To improve the coverage across full transcripts, Smart-seq (Ramsköld et al., 2012) and
Smart-seq2 (Picelli et al., 2013) were developed. In these methods, a template switching oligo (TSO)
is added instead of a poly(dA) tail at the end of the cDNA. This template switching
technology is especially helpful when the amount of RNA is very limited (Ramsköld et al., 2012).
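The UMI counting principle described above can be illustrated with a minimal sketch. It assumes that each sequencing read has already been reduced to a (gene, UMI) pair; the gene names, UMI sequences, and the function name `count_molecules` are hypothetical and chosen only for illustration. Sequencing errors within UMIs, which real pipelines usually correct, are ignored here.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per gene, so each original molecule is counted once."""
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        # A set keeps each UMI once, no matter how many PCR copies were sequenced.
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Hypothetical reads: the first Gata1 molecule was amplified and sequenced 3 times.
reads = [
    ("Gata1", "ACGT"), ("Gata1", "ACGT"), ("Gata1", "ACGT"),
    ("Gata1", "TTAG"),
    ("Spi1", "CCAT"), ("Spi1", "CCAT"),
]
umi_counts = count_molecules(reads)
print(umi_counts)
```

In this toy input, six reads collapse to two Gata1 molecules and one Spi1 molecule, which is the sense in which UMIs remove PCR amplification bias from the counts.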
Aside from efforts to improve the chemistry for measuring mRNA within single
cells, another important aspect of the field is throughput, that is, the number of
cells that can be measured in a timely and cost-effective manner. The original method
(Tang et al., 2009) requires manually picking single cells and processing them in individual PCR
tubes, which is very time consuming and error-prone. CEL-Seq was among the first attempts to
reduce the labor of single cell mRNA-seq. In this method, mRNAs from individual cells are
still reverse transcribed in separate tubes, but a distinct cell barcode sequence is introduced
during reverse transcription. After this step, the cDNA can be pooled for downstream processing. By
performing paired-end sequencing, the cell barcode of each mRNA can be identified,
and the expression measurement can be assigned to the original cell bioinformatically. This idea
was then automated as MARS-Seq (Jaitin et al., 2014), which brought the practical throughput to
thousands of cells. Other attempts sought to isolate individual cells more efficiently,
such as by flow cytometry (Shalek et al., 2013) or the C1 microfluidic system (Shalek et al., 2014).
However, none of them is fast, scalable, and cost-effective.
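The bioinformatic assignment of pooled reads to their cells of origin can be sketched as follows. This is a simplified illustration, not the CEL-Seq or MARS-Seq pipeline: it assumes that the cell barcode occupies the first 6 bases of read 1 and that read 2 has already been mapped to a gene name. The barcode length, the barcode sequences, and the function name `assign_reads_to_cells` are hypothetical.

```python
from collections import Counter


def assign_reads_to_cells(read_pairs, known_barcodes, barcode_length=6):
    """Group gene counts by the cell barcode found at the start of read 1."""
    counts = {barcode: Counter() for barcode in known_barcodes}
    for read1, gene in read_pairs:
        barcode = read1[:barcode_length]
        # Reads whose barcode matches no known cell are discarded.
        if barcode in counts:
            counts[barcode][gene] += 1
    return counts


# Hypothetical pooled reads from two cells plus one unassignable read.
known_barcodes = ["AAACCC", "GGGTTT"]
read_pairs = [
    ("AAACCCTTTT", "Gata1"),
    ("AAACCCGGGG", "Gata1"),
    ("GGGTTTAAAA", "Spi1"),
    ("NNNNNNAAAA", "Spi1"),
]
per_cell = assign_reads_to_cells(read_pairs, known_barcodes)
print(per_cell)
```

Because the barcode travels with every cDNA molecule, cells can be pooled immediately after reverse transcription and separated again only at the analysis stage.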
The progress described above inspired the emergence of Drop-seq
(Klein et al., 2015; Macosko et al., 2015), which turned out to be revolutionary for the field.
Drop-seq first encapsulates cells with microparticles in nanoliter-scale aqueous
droplets by precisely combining aqueous and oil flows in a microfluidic device. Within each
droplet, the cell is lysed, and its mRNAs bind to the primers on its companion microparticle.
The primers consist of a partial Illumina sequencing handle, a cell barcode sequence that is
unique to each microparticle, and a UMI sequence that is unique to each primer. After
reverse transcription capped with a TSO, the resulting full-length cDNAs can be pooled, PCR
amplified, and built into a library for paired-end high throughput sequencing. This way, mRNA
molecules from individual cells can be digitally counted, and thousands of cells can be measured
simultaneously. Soon after this, 10X Genomics commercialized the Drop-seq idea (Zheng et al.,
2017), making single-cell RNA sequencing more widely available.
As Drop-seq spread, researchers continued to develop new single cell
RNA sequencing technologies to further lower the cost of the experiment, reduce cell loss
during the process, and reduce the reliance on specific equipment. Such methods include
Microwell-seq (Han et al., 2018) and SPLiT-seq (Rosenberg et al., 2018). In Microwell-seq,
individual cells are trapped in a microarray constructed from agarose and mRNAs are captured on
magnetic barcoded beads. Microwell-seq is reported to offer advantages in convenience and
simplicity, and its authors have used it to construct a mouse cell atlas (Han et al., 2018) and a
human cell landscape (Han et al., 2020). SPLiT-seq splits and pools cells over multiple
rounds while ligating a different barcode in each round, so that the transcriptome of each
single cell is distinguished by its combination of barcodes. This method requires almost no customized equipment
and would be particularly useful for fixed cells or nuclei.
HSC heterogeneity
A key observation from early studies of spleen colony forming assay was the high degree
of variability in the numbers and types of daughter cells produced (Till et al., 1964). Although
many of such early observations could later be explained by the characterization of various
progenitor populations, it remained elusive whether self-renewal and differentiation of HSCs are
randomly regulated by intrinsic and environmental influences. Using limiting dilution assays,
Müller-Sieburg et al proposed the then-controversial idea that the differences in HSC behavior
are due to their possession of different properties (Müller-Sieburg et al., 2002). It was not until the
characterization of LT-HSCs (Kiel et al., 2005) that observed in vivo activity could be attributed
to single cells. Various studies have since demonstrated that upon differentiation, individual HSCs
produce different amounts of blood cells and exhibit distinct preferences, or “lineage biases”, in
producing different types of blood cells: Beerman et al prospectively separated HSCs into
subpopulations, and showed that myeloid-biased HSCs expressed high levels of CD150, a
SLAM family member, whereas HSCs expressing lower levels of CD150 exhibited a balanced
lineage output (Beerman, Bhattacharya, et al., 2010); Dykstra et al performed a large cohort of
single HSC transplantations, classifying HSCs into subtypes as “lymphoid-deficient”,
“balanced”, or “myeloid-deficient” (Dykstra et al., 2007); in another single HSC transplantation
study, the repopulating kinetic patterns post transplantation were classified into at least 16 types
(Sieburg et al., 2006); using a transgenic mouse model, Ergen et al showed that the expression
level of an inflammatory cytokine results in alteration of mTOR signaling and myeloid skewing
(Ergen et al., 2012).
Despite growing efforts to dissect HSC heterogeneity, it remains elusive what
functional roles it plays and what the underlying regulatory mechanisms could be. HSCs of
different lineage biases have been associated with aging (C. E. Muller-Sieburg et al., 2012):
lymphoid-biased HSCs were found predominantly early in the lifespan, whereas myeloid-biased
HSCs accumulate in aged mice. It is also not clear whether intrinsic or extrinsic
factors define the heterogeneity. The work of Ergen et al showed that cytokines regulate
HSC lineage bias (Ergen et al., 2012), indicating an extrinsic regulatory mechanism; in
contrast, Yamamoto et al used a paired daughter cell (PDC) assay to suggest the opposite
(Yamamoto et al., 2013). In the PDC assay, a single HSC was cultured until its first division,
after which the two daughter cells were transplanted separately into parallel hosts. Around
half of the pairs showed strong similarity in the peripheral blood output. However, due to the low
throughput nature of single cell transplantation, one should be cautious drawing any conclusion
on the heterogeneity of the entire HSC population.
In addition to functional activity measurement, explorations of HSC heterogeneity have
also taken place at the molecular level, thanks to technological advances in digital RT-PCR
(Warren et al., 2006), microarray (Moignard et al., 2013), and more recently, single cell
RNA sequencing (Grover et al., 2016; Nestorowa et al., 2016). However, none of these studies
were able to resolve the relationship between heterogeneous cellular activities and heterogeneous
gene expression.
Chapter 2: Improvement of clonal tracking barcode system
This work was published:
Bramlett, C., Jiang, D., Nogalska, A., Eerdeng, J., Contreras, J., & Lu, R. (2020). Clonal tracking
using embedded viral barcoding and high-throughput sequencing. Nature Protocols, 15(4), 1436-
1458.
Abstract
Embedded viral barcoding in combination with high throughput sequencing is a powerful
technology to track single cell clones. It can provide clonal level insights into cellular
proliferation, development, differentiation, migration, and treatment efficacy. Here, we present a
detailed protocol of a viral barcoding procedure including the creation of barcode libraries, the
viral delivery of barcodes, the recovery of barcodes, and the computational analysis of barcode
sequencing data. The entire procedure can be completed within a few weeks. This barcoding
method requires cells to be susceptible to viral transduction. It provides high sensitivity and
throughput, and allows precise quantification of cellular progeny. It is cost efficient and does not
require any advanced skills. It can also be easily adapted to many types of applications, including
both in vitro and in vivo experiments.
Introduction
A cell is a basic unit of biological systems. It can divide to produce progeny cells,
forming a cell clone. Tracking cell clones over time and through space can provide critical
insights into cellular behavior. As genetic material is conserved during cell division, a cell can be
marked and tracked when unique genetic information is inserted into its genomic DNA, a
procedure called genetic barcoding. Since genetic barcodes are inherited by all progeny cells, the
abundance of each barcode in a cellular population is proportional to the number of cells derived
from the original barcoded cell. In conjunction with high throughput sequencing, genetic
barcoding is a powerful technique that allows for tracking clonal behaviors in a high-throughput
manner (Lu et al., 2011).
The original approach for genetic barcoding used retroviral insertion sites to mark
individual cell clones and Southern blots to analyze the results (Dick et al., 1985; Keller et al.,
1985; I. R. Lemischka et al., 1986). Later, synthetic random DNA barcodes were used in
conjunction with microarrays (Schepers et al., 2008). Recently, we and others developed viral
genetic barcodes that mark cells using synthetic DNA segments embedded within a viral
construct that can be easily quantified by high throughput sequencing (Cheung et al., 2013;
Gerrits et al., 2010; Lyne et al., 2018; Naik et al., 2013; L. V. Nguyen, Makarem, et al., 2014)
(Fig. 2.1). The embedded viral barcoding technology provides high sensitivity and throughput,
and allows precise quantification of cellular progeny (Brewer et al., 2016; Lu et al., 2019; L.
Nguyen et al., 2018; C. Wu et al., 2014). The high-throughput nature of the improved technique
reduces the impact of experimental noise associated with single-cell measurements by greatly
increasing the number of measurements. The high sensitivity of barcode recovery provided by a
single step of PCR allows for the identification of small changes in barcode abundances. In
addition, embedded viral barcoding generates data with single cell resolution through the use of
randomized barcodes and does not involve the handling of single cells at any point. For
simplicity, the term “barcoding” will refer to embedded viral barcoding throughout unless
otherwise stated.
The barcoding method has been utilized and improved by several groups(Lyne et al.,
2018; Bystrykh et al., 2014; Bystrykh & Belderbos, 2016; Naik et al., 2014; Thielecke et al.,
2017). However, there are no standards in the field to generate and analyze barcode data(Lyne et
al., 2018). Here, we provide a detailed and easy-to-replicate protocol to generate and implement
genetic barcodes for cellular tracking studies. Since its first publication(Lu et al., 2011), our
protocol has been significantly optimized to improve sensitivity and detection limits(Brewer et
al., 2016; Lu et al., 2019; L. Nguyen et al., 2018; C. Wu et al., 2014). These improvements
primarily involve upgraded data analysis algorithms and experimental procedures for barcode
recovery. Here, we outline the protocol in a general way so that it can be adapted to many types
of applications, including both in vitro and in vivo experiments. Our protocol allows new users to
easily set up barcoding at a low cost by creating their own barcode libraries and performing
computational analysis in their own labs.
Applications of the method
Barcoding can be applied to any cells that are susceptible to lentivirus
infection(Kebschull & Zador, 2018; Naik et al., 2014; Thielecke et al., 2017; Woodworth et al.,
2017) and generates clonal behavior information that is important for many fields of research.
For example, it can identify the cell of origin during development and track the differentiation
patterns of stem cells. Using this approach, we have identified a distinct lineage origin for natural
killer cells in a rhesus macaque transplantation model(C. Wu et al., 2014). The high throughput
nature of this technology allows for comparing many individual cells simultaneously, and
provides a direct assay of cellular heterogeneity. For example, we used barcoding to show how
hematopoietic stem cells heterogeneously differentiate after transplantation in mice(Brewer et
al., 2016; Lu et al., 2019; L. Nguyen et al., 2018).
Barcoding can also be used to study diseases, particularly those that originate from rare
cells such as cancer(L. V. Nguyen, Cox, et al., 2014; L. V. Nguyen et al., 2015; Woodworth et
al., 2017). For example, barcoding can help reveal the cellular origins of cancer genesis, relapse
and metastasis. It can also reveal the heterogeneous responses of cancer cells to treatment. These
studies require ex vivo barcoding of candidate cells, typically samples from patients or animal
models. Tracking can then be performed in vitro or in animal models.
Barcoding has also been applied to facilitate gene and drug screens. For example,
barcoding has been used in CRISPR screens, where the gRNAs serve as genetic
barcodes(Shalem et al., 2014; T. Wang et al., 2014). While these studies typically do not require
single cell resolution, genetic barcodes still provide high throughput and tremendous cost
savings. Moreover, single cell information gathered regarding cellular heterogeneity may provide
further insights for these screens.
Comparisons with other methods
The conventional strategy for studying clonal behaviors is simply to track a single cell at
a time, e.g. single cell transplantation and single cell culture(Dykstra et al., 2007; Osawa et al.,
1996; Sieburg et al., 2006). While these approaches do not require viral transduction, they are
labor intensive and cost prohibitive for most applications. To increase throughput and reduce
cost, fluorescent proteins, either singly or in combination, have been used to mark clonal
identities(Cornils et al., 2014; Livet et al., 2007; Rios et al., 2014; Weber et al., 2012). However,
the number of fluorescent colors is small, limiting the number of cells that can be confidently
tracked at the clonal level. In contrast, the synthetic DNA segments that we use have virtually
unlimited variations, allowing thousands of clones to be tracked with a quantifiably high degree
of accuracy in the clonal level labeling. It is also cost effective as our viral library designs allow
multiple samples to be sequenced together.
Other techniques have tried to overcome the limited number of fluorescent colors using
viral insertion sites paired with Linear Amplification-Mediated PCR (LAM-PCR)(Harkey et al.,
2007; Schmidt et al., 2007; C. Wu et al., 2011, 2013) and quantitative shearing linear
amplification PCR(Zhou et al., 2014). These methods are still in use for human studies
particularly those that involve gene therapy. However, the difficulty of recovering genomic
insertion sites precludes obtaining the quantitative data needed for most clonal tracking
questions. In contrast, the synthetic DNA segments used in barcoding are easy to recover and
produce highly quantitative results. The design of our genetic barcodes allows for their recovery
using a single step of PCR, during which primers that are needed for downstream sequencing are
simultaneously incorporated. This simple and elegant approach greatly reduces experimental
noise during barcode recovery and produces quantitative and reproducible data. In our day-to-
day experiments, replicate samples are highly consistent (Fig. 2.2), and barcode quantification is
directly proportional to donor cell doses(Brewer et al., 2016), demonstrating the high fidelity of
our quantitative measurements when applied to in vivo experiments.
The barcoding procedure for in vivo studies requires cell transduction and transplantation.
Several approaches attempt to eliminate the transduction and transplantation steps
by engineering cells with transposon, Polylox or CRISPR/Cas9 technologies.
Transposon-based methods temporarily express a transposase to activate the mobilization of a
DNA segment, called a transposon, which is randomly inserted into the genome to label
individual cells(Sun et al., 2014). Similar to the viral insertion site detection technique, this
approach suffers from poor quantification because of technical difficulties in recovering the
genomic insertion sites.
The Polylox-based system uses a series of unique loxP sites embedded in the genome that
are excised randomly upon exposure to Cre recombinase(Pei et al., 2017). This approach has
been commonly used to generate fluorescent protein combinations(Livet et al., 2007). The
CRISPR/Cas9-based system edits genomic DNA, or synthetic DNA segments embedded in the
genome, with the help of guide RNA (gRNA)(Frieda et al., 2017; Kalhor et al., 2018). Both
Polylox and CRISPR/Cas9 rely on the assumption that the DNA recombination is random, which
is not entirely true for either system(Frieda et al., 2017; Kalhor et al., 2018; Rüfer & Sauer, 2002;
M. W. Shen et al., 2018). Meanwhile, the viral barcoding method suffers from random multiple
labeling as well.
Taken together, the transposon, Polylox and CRISPR/Cas9 systems enable endogenous
activation of cellular labeling. Compared to viral barcoding, these approaches do not require cell
isolation, culture, transduction or transplantation, thus enabling the study of native cellular
behaviors in addition to ex vivo and transplantation-mediated studies. Moreover, tissue-specific
promoters can be implemented to address tissue specific questions. However, these approaches
can be challenging for cell types that cannot be defined by a single promoter.
To overcome the transgenic requirement, new retrospective methods of clonal tracking
take advantage of naturally occurring mutations. These methods rely on rare somatic mutations
to reconstruct the lineage relationships between individual cells (Chapal-Ilani et al., 2013;
Lee-Six et al., 2018; Osorio et al., 2018; Wasserstrom et al., 2008). Since neutral mutations occur
during cell division in a seemingly random process, these methods link cells to one another when
they share common mutations. However, these methods require enough cell divisions to
accumulate rare mutations and cannot track cells that do not carry any mutations. Moreover, they
require whole genome sequencing to identify the rare mutations, which is cost prohibitive for
most applications. They are also computationally intensive and require specialized knowledge of
lineage reconstruction using a population genetics approach. In contrast, barcoding can label any
cells that are accessible to viral infection and integration, is easy and inexpensive, and does not
require any advanced computational skills.
Limitations
The barcoding strategy presented here is limited to systems that tolerate cell isolation,
short term culture, and transplantation. In our protocol, cells of interest need to be isolated from
their respective tissues in order to be transduced by the lentiviral vector carrying genetic
barcodes. This can be a problem for cells that cannot be readily isolated or require maintenance
of endogenous tissue architecture. In some cases, in vivo injection of barcode-carrying virus can
be used as an alternative strategy, although it creates new problems of labeling unwanted cells
and uneven transduction. In vivo delivery of the barcodes is not discussed further here.
Cells may potentially change their properties during culture and barcode transduction.
While many studies have shown that lentiviral integration does not cause any apparent change in
biological functions of the transduced cells(Gonzalez-Murillo et al., 2008; McKenzie et al.,
2006; Naik et al., 2014; L. V. Nguyen, Cox, et al., 2014; L. V. Nguyen et al., 2015), it is still
possible that a particular lentiviral vector may be randomly inserted into some genomic regions
and alter cell behavior. Therefore, experimental replicates and controls must be rigorously used
to exclude the possibility that rare viral insertions cause the observed phenotype.
Different cell types have different transduction rates. The technique reported here was
optimized for mouse cells but has been used for studying primate cells as well(C. Wu et al.,
2014, 2018). Transduction efficiencies for primary human cells are generally lower than for
mouse cells in our hands but are sufficient to yield meaningful results. Both mouse and human
cell lines generally exhibit higher transduction efficiencies than primary cells.
While modern high throughput sequencing has greatly improved barcode recovery,
barcode detection is still limited by experimental loss during barcode extraction and cell
collection. For example, cells collected from a part of a solid tissue may not be representative of
the whole tissue. Additionally, the sizes of some cell populations may numerically exceed the
limit of our barcode extraction protocol, and only a fraction of the cells can be analyzed.
Sequencing depth, as well as the barcode extraction procedure, may also limit the detection of
barcodes with low abundance (Bystrykh et al., 2012). Furthermore, detection limits may vary
between samples with differing cell numbers.
Experimental design
Plasmid Generation
The oligo library template can be obtained from IDT or other vendors. We suggest HPLC
purification for best results. The synthetic DNA oligos that we use to generate barcode libraries
are comprised of several parts: a BamHI restriction enzyme site, a forward primer binding site, a
6bp library ID, a 27bp random sequence, a reverse primer binding site, and an EcoRI restriction
enzyme site (Fig. 2.1a). The 6bp library ID allows different cell populations to be
simultaneously barcoded and combined during downstream biological treatment, barcode
extraction and sequencing. This saves considerable experimental time and resources. The 27bp random
sequence generates a maximum of 4^27 different barcodes in theory. This number is reduced by
excluding sequences that contain restriction enzyme cutting sites or features that are difficult for PCR
and sequencing, such as poly-N stretches. Longer or shorter random sequences and random sequences
with interspersed fixed sequences can also be used. The random sequence should be neither so long that it
exceeds the sequencing capacity nor so short that it limits library diversity. A 6bp
sequence is added to both ends to ensure proper restriction enzyme cutting (Table 2.1). The
primer binding sites enable targeted PCR for barcode extraction (Fig. 2.1b). The BamHI and
EcoRI sites are designed for cloning the double-stranded barcode DNA into lentiviral backbones,
such as the pCDH plasmid. Other types of vectors are also applicable as long as they can insert
DNA barcodes into the host cell’s genome. The plasmid may also include fluorescent proteins,
such as GFP, to signal the presence of barcodes and to evaluate the transduction efficiency. The
primer design can be customized as needed. The 6bp library ID and 27bp random sequence can
be readily replaced to accommodate alternative barcode designs. Alternative barcode designs
include interspersed fixed sequences and a library of known barcode sequences. Using
partially or fully pre-designed barcode sequences can avoid restriction cutting sites and poly-N
stretches.
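The theoretical diversity of the 27bp random sequence, and the effect of excluding problematic sequences, can be explored with a short sketch. The BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences are standard; the homopolymer cutoff `MAX_HOMOPOLYMER`, the function names, and the sampling approach are assumptions made for illustration and are not the filtering criteria of the protocol.

```python
import random
import re

BAMHI_SITE = "GGATCC"
ECORI_SITE = "GAATTC"
BARCODE_LENGTH = 27
MAX_HOMOPOLYMER = 5  # hypothetical cutoff: runs longer than this are rejected

# Upper bound on diversity: 4 possible bases at each of 27 positions.
theoretical_max = 4 ** BARCODE_LENGTH


def is_usable(barcode, max_run=MAX_HOMOPOLYMER):
    """Reject barcodes with a cloning site or a long single-base run."""
    if BAMHI_SITE in barcode or ECORI_SITE in barcode:
        return False
    # (.)\1{n,} matches one base followed by at least n more copies of it.
    return re.search(r"(.)\1{%d,}" % max_run, barcode) is None


def estimate_usable_fraction(n_samples=100_000, seed=0):
    """Estimate by random sampling the fraction of 27-mers that pass the filter."""
    rng = random.Random(seed)
    usable = sum(
        is_usable("".join(rng.choices("ACGT", k=BARCODE_LENGTH)))
        for _ in range(n_samples)
    )
    return usable / n_samples


print(theoretical_max)
print(estimate_usable_fraction())
```

Under these assumed criteria only a small share of random 27-mers is excluded, so in practice library diversity is bounded far more by cloning and bacterial transformation than by sequence space.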
Synthesized DNA oligos are made double-stranded using a single primer “Strand2”
(Table 2.2). After cloning, each plasmid library is transformed into competent cells (E. coli), and
all bacterial colonies are amplified to achieve high barcode diversity. Bacterial cultures are
grown overnight in an incubator. Plasmids are isolated from bacterial culture using the Qiagen
Plasmid Maxi kit. Plasmid DNA concentration is then measured using NanoDrop. Before
proceeding to the next step, the plasmid needs to be sequenced for evaluating barcode diversity,
i.e. the number of unique barcodes and their relative abundances in the library(Lu et al., 2011;
Naik et al., 2014). A high library diversity (high barcode numbers and equal representation of
unique barcodes) is essential to reducing the chance that more than one cell is labeled by the
same barcode. Optimizing the bacterial transformation step is the key to improving library
diversity. The diversity of the library dictates the number of clones that can be tracked in a single
experiment such that each barcode represents a single cell clone with statistical certainty. An
exact calculation for this limit was provided in our previous study(Lu et al., 2011). A user-
friendly calculation tool is provided with this protocol. As a rule of thumb, a library of 40,000-
50,000 evenly distributed barcodes allows around 1,000 cells to be tracked with greater than 95%
probability that more than 95% of the barcodes represent single cells.
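The rule of thumb above can be checked with a simplified model. The sketch below is neither the exact calculation of Lu et al., 2011 nor the calculation tool provided with the protocol; it assumes that every barcode is equally abundant and that each cell receives exactly one barcode, drawn independently. The function names and the number of simulation trials are hypothetical.

```python
import random
from collections import Counter


def expected_unique_fraction(n_cells, library_size):
    """Expected fraction of cells whose barcode is shared with no other cell."""
    return (1 - 1 / library_size) ** (n_cells - 1)


def prob_library_sufficient(n_cells, library_size, threshold=0.95,
                            trials=200, seed=0):
    """Simulated probability that more than `threshold` of observed barcodes
    mark a single cell."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        # Each cell draws one barcode uniformly at random from the library.
        draws = Counter(rng.randrange(library_size) for _ in range(n_cells))
        single = sum(1 for count in draws.values() if count == 1)
        if single / len(draws) > threshold:
            successes += 1
    return successes / trials


print(expected_unique_fraction(1000, 40000))
print(prob_library_sufficient(1000, 40000))
```

Real libraries are not perfectly even, so highly abundant barcodes are shared more often than this model predicts; this is why the sequenced barcode distribution of each library should be used for the actual calculation.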
Lentivirus Packaging
HEK293T cells are used to produce lentiviral particles. HEK293T cells are transfected
with pCDH barcode plasmid, lentivectors Pax2 and VSV-G, in the presence of SuperFect
Transfection Reagent. The supernatant is collected and the media changed at 48, 72, and 96
hours. The virus should always be kept on ice or at 4°C after harvesting. After pooling and
concentrating using 50% PEG-8000, the virus should be aliquoted and kept at -80°C for long
term storage. Each lentiviral library needs to be tested on cell lines before use, by transduction,
barcode extraction and sequencing, to evaluate the viral titer and barcode diversity, i.e. the
number of unique barcodes and their relative abundances in the library. Results from sequencing
plasmids and transduced cell lines can create reference libraries that facilitate downstream
bioinformatics analysis(Naik et al., 2014).
Transducing Experimental Cells
The exact transduction time depends on the research purpose, the type of cells, and how
well the culture condition supports cellular properties. Using a low viral titer helps ensure that each
cell receives only one viral insertion and therefore one barcode. Cells that receive more than
one barcode will be overrepresented in the results. We previously reported that ~50%
transduction efficiency resulted in >95% cells carrying only a single barcode(Lu et al., 2011).
Other studies have applied lower transduction efficiency (~15%) to further reduce the chance of
double barcoding(Naik et al., 2014). After incubation, cells should be washed thoroughly to
remove any remaining virus. Labeled cells are now ready for experimental use.
It is important to use the same viral libraries for both control and experiment groups, and
to include biological replicates using different viral libraries or viral infection wells if possible,
to avoid experimental noise associated with viral infection. We recommend evaluating the
percentage of infected cells in every experiment by analyzing an aliquot of the experimental
cells. The fractions of cells receiving single and multiple barcodes must be determined
experimentally by analyzing the barcode copy number in genomic DNA at the clonal level.
Multiple infections and barcoding of the same cells are acceptable if these cells are expected to
produce similar results as singly infected cells. The number of experimental cells to be barcoded
should be limited based on the barcode diversity in the library(Lu et al., 2011). This limit is
particularly important for experiments using cell lines where each barcode is meant to represent a
single cell with statistical confidence. In addition to the cell number and viral incubation time,
other experimental parameters, such as cell numbers for barcode analysis and time to harvest the
cells, also influence experimental results and can be adapted from previously published studies
with similar experimental conditions (Guernet et al., 2016; Lu et al., 2019; Merino et al., 2019; L.
Nguyen et al., 2018; L. V. Nguyen, Cox, et al., 2014; L. V. Nguyen et al., 2015; L. V. Nguyen,
Makarem, et al., 2014; Shalem et al., 2014; Verovskaya et al., 2013; T. Wang et al., 2014;
Woodworth et al., 2017; C. Wu et al., 2014).
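To illustrate why the number of barcoded cells should be limited by the barcode diversity of the library, the following Python sketch estimates the expected fraction of cells whose barcode is shared with no other cell. It is a simplified model, not part of the published protocol: it assumes that each cell receives exactly one barcode drawn uniformly at random from the library, and the function names and the library size of 100,000 are hypothetical illustrative choices.

```python
import math


def fraction_unique(n_cells: int, library_diversity: int) -> float:
    """Expected fraction of cells whose barcode is shared with no other cell.

    Assumption: each cell draws one barcode uniformly at random from the library.
    """
    if n_cells < 1 or library_diversity < 2:
        raise ValueError("need at least one cell and at least two barcodes")
    # A given cell is unique if none of the other n_cells - 1 cells drew its barcode.
    return (1.0 - 1.0 / library_diversity) ** (n_cells - 1)


def max_cells(library_diversity: int, min_unique: float = 0.95) -> int:
    """Largest cell number that keeps fraction_unique at or above min_unique."""
    if not 0.0 < min_unique < 1.0:
        raise ValueError("min_unique must be strictly between 0 and 1")
    # Solve (1 - 1/D)^(n - 1) >= min_unique for n.
    per_cell = math.log(1.0 - 1.0 / library_diversity)
    return 1 + math.floor(math.log(min_unique) / per_cell)


if __name__ == "__main__":
    diversity = 100_000  # hypothetical library size, for illustration only
    print("max cells for 95% unique barcodes:", max_cells(diversity))
    print("fraction unique at 20,000 cells:", round(fraction_unique(20_000, diversity), 3))
```

Under these assumptions, the acceptable cell number scales roughly linearly with library diversity, which is why libraries with higher diversity allow more cells to be barcoded in a single experiment.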
Barcode Extraction
Barcodes are recovered by isolating genomic DNA from cells of interest. For a given
population, the number of cells required for analysis depends on the desired barcode detection
sensitivity. To identify barcodes that are as rare as 1 in 1,000, at least 1,000 barcoded cells have
to be collected for barcode recovery. If possible, more than 10,000 barcoded cells should be
collected for best results. High cell numbers allow for identifying rare barcodes, but too many, as
well as too few, cells may reduce barcode recovery rates and present problems during barcode
extraction. Sorting is not required for collecting cells, but the collected cells should be prepared
for genomic DNA extraction and counted in preparation for the barcode extraction procedure.
From the isolated genomic DNA, barcodes are PCR amplified using designed primers (Table
2.2) that flank the barcode region and provide binding sites for downstream high-throughput
sequencing. These primers also add indexes that enable multiplex sequencing. To ensure precise
quantification, the PCR should be halted during the exponential phase of amplification (typically
20–27 cycles) before the curve plateaus (Fig. 2.3). Samples with different numbers of cells may
require different numbers of cycles. Compared to conventional PCR that uses a predetermined
cycle number, stopping the PCR during the exponential phase prevents over-amplification
and reduces PCR bias, in line with the idea behind quantitative PCR. After PCR,
barcode DNA is purified using magnetic beads.
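The relationship between the number of collected cells and detection sensitivity can be sketched with a simple sampling model. The Python example below is illustrative only: it assumes that cells are sampled independently and that every collected barcode is successfully recovered, which real extractions will not fully achieve. The function name is hypothetical.

```python
def detection_probability(clone_fraction: float, n_cells: int) -> float:
    """Probability that at least one cell of a clone is among n_cells sampled cells.

    Assumptions: independent sampling and perfect barcode recovery.
    """
    if not 0.0 <= clone_fraction <= 1.0 or n_cells < 0:
        raise ValueError("clone_fraction must be in [0, 1] and n_cells >= 0")
    # Complement of the chance that every sampled cell belongs to another clone.
    return 1.0 - (1.0 - clone_fraction) ** n_cells


if __name__ == "__main__":
    for n in (1_000, 10_000):
        p = detection_probability(0.001, n)
        print(f"{n} cells: P(detect a 1-in-1,000 clone) = {p:.4f}")
```

In this model, 1,000 cells is the minimum at which a 1-in-1,000 barcode is expected to appear once on average, whereas collecting 10,000 cells makes its detection nearly certain, consistent with the recommendation to collect more cells when possible.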
DNA Quantification & High-throughput Sequencing
The amplified barcodes need to be precisely quantified before sequencing. It is important
to choose a quantification method that is sensitive and robust. We choose fluorescence-based
quantification (Qubit assay), but other methods such as TapeStation ScreenTape assay may also
suffice.
Barcode samples prepared using different reverse primers may be pooled for sequencing
as one sample to reduce cost. Our library ID design provides an additional option for
multiplexing different barcoded samples. Additional index primers and library IDs can reduce
sequencing cost at the expense of the additional resources to create them. While we recommend
HPLC purified primers, desalted primers are also acceptable. The depth of sequencing depends
on the number of barcoded cells used during barcode extraction. We recommend sequencing
around 100 reads per barcoded cell to ensure precise barcode quantification. While the barcode is
only 33bp, we typically sequence at least 50bp single end, so that the sequence from the 34th to the 50th
bp can be used as a quality control check.
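As a worked example of the sequencing depth recommendation, the short Python sketch below estimates the total number of reads to request for a pooled run. The function name and the cell numbers are hypothetical; the default of 100 reads per barcoded cell follows the recommendation above.

```python
def reads_required(barcoded_cells_per_sample, reads_per_cell=100):
    """Total reads for a pooled run, at reads_per_cell reads per barcoded cell."""
    if any(n < 0 for n in barcoded_cells_per_sample):
        raise ValueError("cell numbers must be non-negative")
    return sum(n * reads_per_cell for n in barcoded_cells_per_sample)


if __name__ == "__main__":
    pooled_samples = [10_000, 50_000, 2_500]  # hypothetical barcoded cell numbers
    print(reads_required(pooled_samples), "reads needed for the pooled run")
```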
Analyzing Sequencing Data
We developed custom Python scripts to extract barcodes from the raw sequencing results.
The scripts consist of three major steps. In the first step, the code extracts the first 50bp of each
read. This 50bp should consist of the 6bp library ID, a 27bp random sequence, and the 17bp PCR
handle. In the second step, the code aligns the last 17bp of each read to the expected sequence.
The reads containing the expected 17bp are then separated based on their first 6bp sequence, i.e.,
library IDs. In this step, the code also counts the copy number for each unique sequence. In the
third step, the code generates the final results that consist of master barcodes and their copy
numbers. The generation and use of master barcodes are explained in detail below.
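The first two steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the published scripts: the function names are hypothetical, the 17bp handle is assumed to be the first 17 bases that follow the random sequence in the oligos of Table 2.1, and a simple mismatch count stands in for the edit-distance alignment of the handle.

```python
from collections import Counter, defaultdict

LIBRARY_ID_LEN = 6
RANDOM_LEN = 27
# Assumption: first 17 bases downstream of the 27 N's in the Table 2.1 oligos.
HANDLE = "AGATCGGAAGAGCTCGT"
READ_LEN = LIBRARY_ID_LEN + RANDOM_LEN + len(HANDLE)  # 50bp


def split_read(read):
    """Step 1: keep the first 50bp and split into library ID, random part, handle."""
    read = read[:READ_LEN]
    if len(read) < READ_LEN:
        return None  # read too short to contain all three parts
    id_end = LIBRARY_ID_LEN
    random_end = LIBRARY_ID_LEN + RANDOM_LEN
    return read[:id_end], read[id_end:random_end], read[random_end:]


def count_barcodes(reads, expected_ids, max_mismatches=4):
    """Step 2: validate the handle, separate reads by library ID, count copies."""
    counts = defaultdict(Counter)
    for read in reads:
        parts = split_read(read)
        if parts is None:
            continue
        library_id, random_part, handle = parts
        # Simplification: position-wise mismatches instead of edit distance.
        mismatches = sum(a != b for a, b in zip(handle, HANDLE))
        if mismatches > max_mismatches:
            continue
        if library_id not in expected_ids:
            continue  # library IDs must match exactly
        # The 33bp barcode is the library ID plus the random sequence.
        counts[library_id][library_id + random_part] += 1
    return counts
```

A call such as `count_barcodes(reads, {"CGTGAT", "ACATCG"})` returns one counter of unique 33bp sequences per library ID, which is the input needed for the third step.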
As PCR and sequencing can both generate errors (Bystrykh et al., 2012), we combine
sequences that closely resemble each other, following the conventional strategy used for
analyzing high-throughput sequencing data. We use the Levenshtein edit distance to quantify the
similarity between different sequences. Each nucleotide substitution results in an edit distance of
1. Each indel results in an edit distance of 2 because all sequences are the same length and an
indel creates an additional difference at the last base pair. By default, if the edit distance between
two sequences is no more than 4, they are considered to be derived from the same sequence
(Fig. 2.4). Our Python scripts allow users to customize the edit distance threshold.
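The edit distance arithmetic described above can be checked with a small Python example. The function below is a standard dynamic-programming implementation of the Levenshtein distance, written here for illustration rather than copied from the published scripts, and the 10bp sequences are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein edit distance (substitutions, insertions, deletions cost 1 each)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion from a
                current[j - 1] + 1,                    # insertion into a
                previous[j - 1] + (char_a != char_b),  # match or substitution
            ))
        previous = current
    return previous[-1]


original = "ACGTACGTAC"
substituted = "ACGTTCGTAC"  # one substitution at the fifth base
# One base (G) deleted; because reads have a fixed length, the next base
# downstream (here T) fills the last position.
deleted = "ACTACGTACT"

print(levenshtein(original, substituted))  # substitution
print(levenshtein(original, deleted))      # deletion plus shifted last base
```

With fixed-length reads, the deletion costs one edit for the missing base and a second edit for the new base that enters at the end of the read, which is why a threshold of 4 tolerates up to four substitutions or two indels.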
In the second step, we allow a maximum edit distance of 4 when aligning the 17bp. We
exclude reads whose first 6bp do not exactly match any expected library ID. In the third
step, the code performs pairwise comparisons of all the unique sequences, and groups the pairs
that share a common sequence and differ by an edit distance of no more than 4. Within each group, the
sequence with the highest copy number is kept as the “master barcode”. The copy number for
each master barcode is the sum of the copy numbers of all barcodes within an edit distance of 4
of the master barcode. The master barcodes are used to represent the
original barcodes delivered by the lentivirus. If a reference library from sequencing plasmids and
transduced cell lines is used, the master barcode sequences can be drawn from the reference
library instead. The sequences of the master barcodes can facilitate comparisons between
different samples that are derived from the same barcoded cell population. The third step of the
code generates a file reporting the distance between each unique sequence and its master
barcode, as well as the distance between different master barcodes. This information can help
users adjust the edit distance threshold. While there is an R package, ‘genBaRcode’, available for
similar barcode analysis (Thielecke et al., 2019), our Python code provides a flexible alternative
that is easy to implement for users with little programming experience. Downstream data analysis and
visualization are contingent on the specific biological questions and can be adapted from previous
studies (Brewer et al., 2016; Bystrykh et al., 2014; Bystrykh & Belderbos, 2016; Lu et al., 2019;
Lyne et al., 2018; Naik et al., 2013, 2014; L. Nguyen et al., 2018; L. V. Nguyen, Cox, et al.,
2014; L. V. Nguyen et al., 2015; L. V. Nguyen, Makarem, et al., 2014; C. Wu et al., 2014, 2018).
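The grouping of unique sequences into master barcodes can be sketched as follows. This is a simplified greedy variant written for illustration, not the published step-3 script: sequences are visited from the highest to the lowest copy number, and each one is merged into the first existing master barcode within the edit distance threshold, or otherwise becomes a new master barcode. The function names and the example counts are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein edit distance by dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (char_a != char_b)))
        previous = current
    return previous[-1]


def group_into_masters(counts, max_distance=4):
    """Greedy grouping: returns (master -> summed copies, sequence -> master)."""
    masters = {}
    assignment = {}
    # Most abundant sequences first, so they become the master barcodes.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    for sequence, copies in ordered:
        for master in masters:
            if levenshtein(sequence, master) <= max_distance:
                masters[master] += copies
                assignment[sequence] = master
                break
        else:
            # No existing master is close enough: start a new group.
            masters[sequence] = copies
            assignment[sequence] = sequence
    return masters, assignment


if __name__ == "__main__":
    example = {"AAAAAAAAAA": 100, "AAAAAAAAAT": 3, "CCCCCCCCCC": 50, "CCCCCCCCCG": 1}
    masters, _ = group_into_masters(example, max_distance=2)
    print(masters)
```

Because every sequence is compared against the current master barcodes, the cost grows with the number of master barcodes times the number of unique sequences, so a production pipeline may need a faster distance implementation.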
Anticipated results
Plasmid Generation
Barcode oligos should be inserted at the BamHI and EcoRI restriction enzyme sites.
Plasmid should be ~100 bp larger (~7250 bp if using the pCDH lentivirus vector) and
circularized when ligation is successful. PCR using qPCR primers from Table 2.2 should
produce a ~150 bp product. Barcode diversity should be sufficient for the intended experiment.
Lentivirus Packaging
Virus should have an appropriate viral titer that ensures 15–50% transduction efficiency,
and should have enough barcode diversity for the intended experiment.
Transducing Experimental Cells
Transduction should produce 15–50% GFP expression via flow cytometry to reduce the
chance of double barcoding. When testing barcode diversity using control cell lines, a higher
percentage of GFP-positive cells is acceptable.
Barcode Extraction
qPCR amplification of barcodes should produce a typical exponential curve, which is
absent for the negative control (Fig. 2.3).
DNA Quantification & High-throughput Sequencing
Barcode DNA should be ~ 150 bp and yield > 1 ng/µl per sample. High throughput
sequencing will provide one “.fastq” file for each used reverse primer. The file names should
begin with the 6bp Reverse Primer Index (Table 2.2).
Analyzing Sequencing Data
Barcode quantification results will be generated (Fig. 2.5). Additionally, “step-2_combine-library-ID.py”
and “step-3_combine-barcodes.py” will each generate a “stats.txt” file for quality
checks.
In stats.txt files generated in step 2 of the analysis:
1. “% valid reads based on 17bp ending” should be at least 70–80%.
2. “Numbers of reads with expected virus ID” should be higher than those with unexpected
library IDs. (Fig. 2.5a)
Author contribution
R.L. conceived and developed the protocol. C.B., D.J., J.C., and A.N. optimized the barcode
extraction protocol. D.J. and J.E. improved the Python code for analyzing high-throughput
sequencing data. C.B., D.J., and R.L. prepared the manuscript. J.C. and A.N. provided assistance
in manuscript text preparation.
Figures and Tables
Fig. 2.1 Experimental workflow
a. Synthesized semi-random barcode oligos are cloned into plasmids before packaging into a
lentiviral vector. Cells of interest are then transduced. To retrieve barcodes, genomic DNA is
extracted before qPCR amplification and high-throughput sequencing. Raw sequencing data is
processed by a custom data analysis pipeline to quantify the abundance of each barcode. b. PCR
strategy. The 33bp cellular barcode is flanked by an Illumina TruSeq read1 sequence and a
custom read2 sequence so that a single PCR reaction can add the Illumina P5 and P7 adaptors to
the ends of each barcode. Designed by Charles Bramlett (cbramlet@usc.edu).
Fig 2.2 Comparing barcode extraction replicates
Primary mouse hematopoietic stem cells were barcoded and transplanted into recipient mice. 4
months after transplantation mice were bled, and white blood cells were collected and processed.
Cell lysates were divided into two replicate samples, and processed separately for genomic DNA
extraction, barcode amplification, and sequencing. Each dot represents a barcode. Barcode
abundance is highly consistent between the replicate samples. Pearson correlation: 0.99; P
value: 5.3 × 10^-144.
Fig. 2.3 Q-PCR amplification of barcodes
BB88 cells were barcoded and 50,000 GFP positive cells were sorted via FACS one week after
transduction. gDNA was isolated and amplified. Shown is a multi-component plot of barcode
amplification. EvaGreen fluorescent dye (green lines) is used to quantify DNA amount, thus no
Rox signal was observed (red lines). a. Two samples with similar amounts of genomic DNA
were amplified together, and their exponential curve emerged at similar PCR cycles. We stopped
the reaction at cycle 25, which is about halfway through the exponential phase. This is to avoid
over-amplification and to reduce background signals. No DNA template control samples show
no amplification (two flat green lines). b. One sample was amplified to saturation. This is an
example of over-amplification.
Fig 2.4 Optimizing the edit distance thresholds
Histograms show the distances between unique sequences and their corresponding master
barcode (red), as well as the distances between different master barcodes (blue). Each row shows
one edit distance threshold. Data from two independent samples are shown in two columns.
Threshold of edit distance 4 was chosen where the distances between master barcodes are higher
than and separated from the distances between unique sequences and their master barcodes.
Fig. 2.5 Python pipeline outputs
Primary human acute lymphoblastic leukemia (ALL) cells were barcoded and transplanted into
NSG mice. Two months after transplantation, mice were bled, ALL cells were collected and
processed. ALL cells barcoded with virus library 8 and 9 were used for this example. a. Custom
algorithms written in Python code group reads based on their library IDs. b. The Python
algorithm quantifies each barcode with consideration to sequencing errors. Each color represents
a unique barcode, and the size represents its relative abundance. Shown are data from library 9 in
Fig. 2.5a.
Table 2.1 Barcode Oligos for each Library ID.
The core of each oligo consists of a 6bp library ID and a 27bp random sequence represented as
N’s. The core is flanked by forward and reverse primer binding sites as well as restriction
enzyme sites. Additional 6bp sequence is added at both ends to ensure proper restriction enzyme
cutting.
Virus
library
Library
ID
DNA Oligos (5’ – 3’)
1 CGTGAT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTCGTGATNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
2 ACATCG
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTACATCGNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
3 GCCTAA
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTGCCTAANNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
4 TGGTCA
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTTGGTCANNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
5 CACTGT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTCACTGTNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
6 ATTGGC
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTATTGGCNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
7 GATCTG
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTGATCTGNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
8 TCAAGT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTTCAAGTNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
9 CTGATC
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGATCNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
10 AAGCTA
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTAAGCTANNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
11 GTAGCC
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTGTAGCCNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
12 TACAAG
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTTACAAGNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
13 ATGACA
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTATGACANNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
14 AGCGGT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTAGCGGTNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
15 ACTCAG
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTACTCAGNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
16 TAACGT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTTAACGTNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
17 TGTTAC
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTTGTTACNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
18 TCCGTA
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTTCCGTANNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
19 GAGTTC
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTGAGTTCNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
20 GTCGAG
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTGTCGAGNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
21 GCAACT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTGCAACTNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
22 CAGTGC
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTCAGTGCNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
23 CTTATG
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTCTTATGNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
24 CGACCT
CGCCGCGGATCCACACTCTTTCCCTACACGACGCTCTTCCGATCTCGACCTNNNNNNNNNNNNNN
NNNNNNNNNNNNNAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAATTCCGGCG
Table 2.2 List of primers
Procedure | Primer | Sequence (5’-3’)
Plasmid generation | Strand2 | CGCCGGAATTCCAAGCAGAAGACGGCATACGA
qPCR | Forward | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
qPCR | R1 (GCCAAT) | CAAGCAGAAGACGGCATACGAGATGCCAATACGGCATACGAGCTCTTCCGATCT
qPCR | R2 (GATCTG) | CAAGCAGAAGACGGCATACGAGATGATCTGACGGCATACGAGCTCTTCCGATCT
qPCR | R3 (TCAAGT) | CAAGCAGAAGACGGCATACGAGATTCAAGTACGGCATACGAGCTCTTCCGATCT
Sequencing | Custom Index Primer | AGATCGGAAGAGCTCGTATGCCGT
Chapter 3: HSC heterogeneity is cell autonomous
Abstract
HSCs are the key therapeutic component of bone marrow transplantation, the first and
most prevalent clinical stem cell therapy. However, it is difficult to study and leverage stem cell
heterogeneity when each cell is unique and unpredictable. In this chapter, we used embedded
viral barcoding technology to track the blood production of individual HSCs in mice. We found
that, upon conditioned transplantation mediated by either irradiation or chemotherapy, HSCs
derived from the same ancestor HSC exhibit similar clonal expansion and lineage bias
characteristics in different mice. This suggests that the diverse differentiation programs of
individual HSCs do not depend on stochastic mechanisms, but instead are cell autonomous
properties. These discoveries open new avenues of research for identifying intrinsic HSC
regulatory factors. They also allow us to infer the behavior of single HSCs from clonal tracking analysis.
HSCs derived from the same clone behave similarly in different hosts
Irradiation and chemotherapy are both important regimens for cancer treatment as well as
bone marrow transplantation. Clinically, damage to the hematopoietic system often forces
treatment protocols to be cut short and compromises therapeutic outcomes (Y. Wang et al., 2006).
We sought to investigate whether individual HSCs behave differently when reconstituting
the hematopoietic system of hosts pre-conditioned with irradiation or chemotherapeutic drugs.
Among the drugs we tested, we chose Busulfan for further investigation because it achieves
donor chimerism in peripheral blood comparable to that of irradiation-mediated transplantation (Fig.
3.1) (Down et al., 1994).
We first determined whether it is possible to deduce single HSC activities from clonal tracking
assays (Fig. 3.2). Here, HSC activities, including self-renewal and differentiation, are quantified
based on the amount of progeny HSCs, granulocytes, and B cells (Fig. 3.3) that are derived from
a barcoded HSC. Taking advantage of the “portable” property of HSCs via transplantation, we
compared the in vivo activities of HSCs derived from the same clone in different mice (Fig.
3.2a). The data show that HSCs derived from the same ancestor produced consistent amounts of
granulocytes and B cells in different mice (Figs. 3.2b-e), although their self-renewal levels
varied slightly (Fig. 3.4). The variation in self-renewal may arise from the difficulty in collecting
all HSCs dispersed in the bone marrow throughout the body. The consistency in blood
production was observed in all examined mice, regardless of the use of radiation or Busulfan-
mediated chemo-treatment as pre-transplantation conditioning (Figs. 3.2d-e). These data suggest
that HSCs derived from the same ancestor inherit the same differentiation program, and that the
differentiation activities of individual HSCs can be estimated from clonal level measurements,
assuming individual HSCs within a clone equally contribute to differentiation.
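The comparison of clonal abundance between mice that share ancestor HSCs can be sketched in Python as follows. This is an illustrative example only: the function names and abundance values are hypothetical, and treating a barcode that is undetected in one mouse as zero abundance is an assumption made here, not a rule stated in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def compare_mice(abundance_a, abundance_b):
    """Correlate clonal abundances (barcode -> % WBC) between two mice."""
    # Assumption: a barcode missing from one mouse counts as zero abundance.
    barcodes = sorted(set(abundance_a) | set(abundance_b))
    x = [abundance_a.get(barcode, 0.0) for barcode in barcodes]
    y = [abundance_b.get(barcode, 0.0) for barcode in barcodes]
    return pearson_r(x, y)


if __name__ == "__main__":
    secondary_1 = {"bc1": 1.2, "bc2": 0.3, "bc3": 4.0}  # hypothetical values
    secondary_2 = {"bc1": 1.0, "bc2": 0.4, "bc3": 3.5, "bc4": 0.1}
    print(round(compare_mice(secondary_1, secondary_2), 3))
```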
Mobilization does not alter HSC differentiation
To determine whether HSC clones exhibit persistent blood production characteristics in
different niches independent of transplantation, we performed mobilization assays (Fig. 3.5a)
using G-CSF and AMD3100 (Broxmeyer et al., 2005), which effectively mobilize phenotypic
HSCs from bone marrow to peripheral blood (Fig. 3.5b). We mobilized HSCs in mice carrying
barcoded HSCs and then compared blood production 20 days before and 20 days after
mobilization. We found that HSC clones exhibited highly consistent blood production quantities
before and after mobilization (Fig. 3.6a), suggesting that the random changes in the quantity of
HSC blood production upon transplantation may be associated with the transplantation
procedure. The lineage bias of HSC clones is generally persistent (Fig. 3.6b), although a slight
shift towards the lymphoid lineage is also observed (Fig. 3.6c).
Discussion
In summary, these data demonstrate that individual HSCs derived from the same ancestor
HSC exhibit similar differentiation characteristics, such as blood production quantity and lineage
bias. These characteristics are inherited and reset during transplantation mediated by either
radiation or chemotherapy. These findings suggest that the specific differentiation program of an
individual HSC does not depend on its niche or stochastic mechanisms, but is a cell autonomous
characteristic. While previous population-level studies have shown that the HSC niche provides
growth factors essential for sustaining general HSC function, our current clonal-level study
reveals that functional variations between individual HSCs are instead determined by intrinsic
cellular mechanisms. In other words, individual HSCs differ in their responses to niche signals.
These intrinsic mechanisms may play an essential role in maintaining the robustness of the
overall blood supply, but how they are coordinated to sustain a balanced supply is not yet known.
Methods
Mice
The primary donor mice used in all the experiments were eight to twelve-week-old
B6 (C57BL/6J, CD45.2+, The Jackson Laboratory, #000664) or F1 (CD45.2+/CD45.1+, bred in-house)
mice. The recipient mice were eight to twelve-week-old BLY (B6.SJL-Ptprca Pepcb/BoyJ,
CD45.1+, The Jackson Laboratory, #002014) mice. Helper cells that were co-transplanted
alongside HSCs consisted of F1 or BLY whole bone marrow cells. Prior to
transplantation, recipient mice were treated with one of the following two conditions: (a)
irradiation with 950 cGy immediately before transplantation (nine primary recipients and their
corresponding secondary recipients); or (b) intra-peritoneal injection with busulfan (Sigma-
Aldrich, #150606) at a dose of 50 mg per kg body weight 24 hours before transplantation (Down
et al., 1994) (seven primary recipients and their corresponding secondary recipients). All animal
procedures were approved by the Institutional Animal Care and Use Committee.
HSC isolation and transplantation
In the primary transplantation experiments, HSCs (lineage (CD3, CD4, CD8, B220, Gr1, Mac1,
Ter119)-/cKit+/Sca1+/Flk2-/CD34-/CD150+) were obtained from the crushed bones of donor
mice and isolated using fluorescence-activated cell sorting (FACS) with the FACS-Aria II (BD
Biosciences, San Jose, CA) after enrichment using CD117 (ckit) microbeads (AutoMACS,
Miltenyi Biotec, #130-091-224). Prior to transplantation, HSCs were infected with lentivirus
carrying barcodes and GFP for 15 hours. HSC clonal labeling was carried out as previously
described (Bramlett et al., 2020; Lu et al., 2011). Barcoded donor HSCs (irradiation group: 2,500
HSCs for three mice and 7,000 HSCs for five mice; busulfan group: 2,500 HSCs for three mice
and 4,000 HSCs for four mice) and 250,000 helper bone marrow cells were transplanted into
each recipient in four batches to ensure biological reproducibility. Donor cell dose did not
influence any results reported in this study, so data from mice transplanted with different HSC
doses were combined in the same group. During the secondary transplantation, CD117-
microbead-enriched cells from each primary recipient were equally divided into four portions,
and transplanted into secondary recipients. Both primary and secondary transplantations were
performed via retro-orbital injection.
Blood sample collection and FACS analysis
Blood samples were collected into PBS containing 10mM EDTA via a small transverse cut in the
tail vein. To separate blood cells, 2% dextran was added, and the remaining blood cells were
treated with ammonium-chloride-potassium lysis buffer on ice for five minutes to remove
residual red blood cells. After a 30-minute antibody incubation at 4 °C, samples were
resuspended in propidium iodide solution (1:5000 in PBS with 2% FBS). Cells were sorted using
FACS-Aria II and separated into granulocytes and B cells. Antibodies were obtained from
eBioscience and BioLegend, as previously described (Brewer et al., 2016; L. Nguyen et al.,
2018). Donor cells were sorted based on the CD45 marker. The following cell-surface markers
were used to sort the different blood cell types:
Granulocytes: CD4-/CD8-/B220-/CD19-/Mac1+/Gr1+
B cells: CD4-/CD8-/Gr1-/Mac1-/B220+/CD19+
Flow cytometry data were analyzed using BD FACSDiva software version 8.0.
Quantification of HSC activity
To enable comparisons across mice, HSC differentiation activities were quantified as the amount
of granulocytes and B cells that a barcoded HSC produced as a fraction of the entire white blood
cell population within a mouse. We harvested bone marrow from the spine and all bones in the
legs and arms. HSC self-renewal activities were quantified as the abundance of HSCs that share
an identical tracking barcode among all harvested HSCs with barcodes. The specific calculations
are as follows:
\[
\text{Clonal abundance in the blood (\% WBC)} = (\text{cell type \% white blood cells}) \times (\text{donor \%}) \times (\text{GFP \% among donor cells}) \times \frac{\text{read number of one barcode}}{\text{total reads of all barcodes}}
\]
\[
\text{Clonal abundance of HSCs (\% barcoded HSCs)} = \frac{(\text{read number of one barcode}) \times 100\%}{\text{total reads of all barcodes}}
\]
Clones whose contributions to white blood cells were lower than 0.001% were excluded from all
analyses.
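The two calculations above, together with the 0.001% exclusion threshold, can be expressed as a short Python sketch. The function names are hypothetical, and the example assumes that all percentages are supplied on a 0 to 100 scale and converted to fractions before multiplication; the numerical values are invented for illustration.

```python
def blood_clonal_abundance(cell_type_pct_wbc, donor_pct, gfp_pct_of_donor,
                           barcode_reads, total_reads):
    """Clonal abundance in the blood, returned as % of white blood cells.

    Assumption: all percentages are given on a 0-100 scale.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    fraction = ((cell_type_pct_wbc / 100.0)
                * (donor_pct / 100.0)
                * (gfp_pct_of_donor / 100.0)
                * (barcode_reads / total_reads))
    return 100.0 * fraction


def hsc_clonal_abundance(barcode_reads, total_reads):
    """Clonal abundance of HSCs, returned as % of barcoded HSCs."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * barcode_reads / total_reads


def filter_clones(abundance_pct_wbc, min_pct_wbc=0.001):
    """Drop clones contributing less than min_pct_wbc (% WBC)."""
    return {barcode: value for barcode, value in abundance_pct_wbc.items()
            if value >= min_pct_wbc}


if __name__ == "__main__":
    # Hypothetical mouse: granulocytes are 40% of WBC, 80% donor, 50% GFP+,
    # and one barcode accounts for 100 of 1,000 barcode reads.
    print(blood_clonal_abundance(40, 80, 50, 100, 1000))  # % WBC
    print(hsc_clonal_abundance(25, 1000))                 # % barcoded HSCs
```

Expressing each clone as a fraction of all white blood cells, rather than as a fraction of barcode reads alone, is what allows abundances to be compared between mice with different donor chimerism and GFP percentages.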
HSC mobilization
Six months after irradiation-mediated transplantation, mice were treated with G-CSF and
AMD3100, according to a published protocol (Broxmeyer et al., 2005). Specifically, mice were
subcutaneously injected with 2.5 μg G-CSF (Amgen, #121181-53-1) per mouse, twice a day for
two days. Eighteen hours after the last injection, mice were subcutaneously injected with
AMD3100 (Sigma-Aldrich, #A5602) at a dose of 5 mg per kg body weight. Blood samples were
collected 20 days before and 20 days after mobilization for analysis, as described above.
Figures
Fig. 3.1 Donor chimerism of chemotherapy-mediated transplantation
Recipient wild type mice were preconditioned with either 5-FU (Goebel et al., 2004) (a) or
Busulfan (Down et al., 1994) (b) as described, or with PBS as control (c). Bone marrow cells
from donor mice were c-Kit enriched before being retro-orbitally transplanted into recipients. Six months
after transplantation, blood was collected from the tail and processed for FACS
analysis.
Fig. 3.2 HSCs derived from the same ancestor differentiate similarly in different mice
a. Schematic illustration of the one-to-multiple serial transplantation experiment that compares
the differentiation activities of HSCs derived from the same ancestor. b-c. Comparing the clonal
abundance of granulocytes (b) and B cells (c) from primary (1°) and secondary recipients (2°), as
well as between two secondary recipients that share a common primary recipient. Each open
circle represents an HSC clone, and its position indicates its abundance. Clones from all
experimental mice are plotted in each panel. “r” depicts Pearson correlation coefficient. d-e.
Pearson correlation coefficient for the clonal abundance of granulocytes (d) or B cells (e)
between one primary recipient and one secondary recipient (1° vs. 2°), and between two
secondary recipients (2° vs. 2°). Each marker represents a comparison between a pair of mice. **
P < 0.01.
Fig. 3.3 FACS gating for cell isolation
a. FACS gating for sorting granulocytes and B cells from the peripheral white blood cells. b.
FACS gating for sorting HSCs from c-Kit enriched bone marrow cells.
Fig. 3.4 Comparing the self-renewal of HSCs derived from the same ancestor
Shown are the abundances of HSC clones from secondary (2°) recipients that share a common
primary recipient. Each open circle represents an HSC clone. Clones from eight secondary
recipients are plotted. “r” depicts Pearson correlation coefficient.
Fig. 3.5 Establishing HSC mobilization assay
G-CSF and AMD3100 were administered at doses of 2.5 μg per mouse and 5 mg/kg body weight,
respectively. One hour after AMD3100 administration, peripheral blood was collected and c-Kit
enriched before performing either HSC staining and FACS analysis immediately (a), or
transplantation with 250,000 bone marrow helper cells. Four months after transplantation, blood
samples were collected from the tail vein and stained for FACS analysis (b). GA, mice treated with
G-CSF plus AMD3100.
Fig. 3.6 Tracking HSC clones before and after mobilization
a, Schematic illustration of the HSC mobilization experiment. b-c, Comparing the clonal
abundance of granulocytes (b, left) and B cells (b, right), as well as the lineage bias (c), before
and after HSC mobilization.
Chapter 4: Identifying genes modulating functional differences
between individual HSCs
Abstract
Single cell transcriptomics has shown that considerable variations exist in the gene
expression of cells that perform an identical function. These variations may reflect the fine-
tuning of cellular function and may provide new opportunities for studying cellular regulation.
Here, by simultaneously measuring the transcriptomes and in vivo cellular activities of hundreds
of individual HSCs in mice, we show that intercellular variations in the expression levels of
dozens of genes are significantly correlated with distinct activity levels of individual HSCs.
Some of these genes are known regulators of hematopoiesis, and variations in their expression
can fine-tune activities of individual HSCs. Moreover, the expression of these genes and their
respective cellular activities exhibit four general quantitative association patterns, revealing the
prevalence of distinct dose-response mechanisms between molecular and cellular levels. These
data illustrate a novel approach for studying molecular regulatory mechanisms through
quantitatively dissecting intercellular variations.
Introduction
In multicellular organisms, a single biological function is often carried out by multiple
cells expressing genes essential for that function. Recent studies on single cell transcriptomes
show that cells with seemingly identical functions often exhibit noticeable differences in their
gene expression (Haber et al., 2017; Han et al., 2018; Regev et al., 2017; Shalek et al., 2014;
Stubbington et al., 2017). Studies using single-cell organisms suggest that these intercellular
variations can arise from stochastic transcriptional events and have no functional
consequence (Blake et al., 2003; Elowitz et al., 2002; Losick & Desplan, 2008; Newman et al.,
2006; Raser & O’Shea, 2004). However, in higher-level organisms, stochastic mechanisms may
not be the sole source of variability in gene expression among cells with an identical function.
Instead, intercellular variations may reflect active modifications of cellular activity necessary to
coordinate overall biological function within an organism. If true, intercellular variations may
provide new opportunities for studying cellular regulation and molecular function.
Here, we investigate the functional role of transcriptional variation in mice using HSCs as
a model system. We have demonstrated that our viral embedded barcode tracking system allows
single cell activities to be deduced from clonal tracking data derived from genetic barcode tracking.
Using this system, in conjunction with “molecular bridges” that connect genetic barcode tracking
and droplet-based single-cell RNA sequencing (Macosko et al., 2015; Zheng et al., 2017), we
have been able to directly compare single cell activity and single cell gene expression. In
addition, we have developed a new bioinformatics strategy that takes advantage of the high-
throughput nature of our data and biological replicates to overcome the experimental noise
inherent in single cell measurements. With these technical advances, we have identified genes
whose expression levels are significantly associated with cellular activity levels. Moreover, we
show how these genes are quantitatively associated with cellular activities across individual
cells.
Molecular bridges linking genetic barcode tracking and single cell RNA sequencing
We developed an experimental system that simultaneously quantifies in vivo HSC
activities and their gene expression at the single cell level in a high-throughput manner (Fig.
4.1). The cellular activities of hundreds of individual HSCs were quantified by clonal tracking
using unique genetic tracking barcodes (Brewer et al., 2016; Lu et al., 2011, 2019; L. Nguyen et
al., 2018). The gene expression of individual HSCs was analyzed by droplet-based single-cell
RNA sequencing (Macosko et al., 2015; Zheng et al., 2017), where cDNA libraries of individual
HSCs were tagged by unique cellular barcodes to allow thousands of individual cells to be
simultaneously analyzed. To link data from genetic barcode tracking and single cell RNA
sequencing together, we generated a library of “molecular bridges”, taking advantage of the
presence of transcribed genetic tracking barcodes in the single cell cDNA libraries (Fig. 4.2).
Molecular bridges contain both genetic tracking barcodes and cellular cDNA barcodes (Fig.
4.2a). Therefore, they constitute, in essence, a directory connecting the two types of barcodes
and linking single cell activities with single cell transcriptomes (Fig. 4.3). In four primary
recipient mice four months post transplantation, the molecular bridges successfully matched
single cell gene expression and single cell activity data from 654 HSCs affiliated with 156
clones. HSCs with and without matched datasets exhibited similar transcriptomes (Fig. 4.4a) and
cellular activities (Fig. 4.4b), except that HSCs with high levels of self-renewal activities are
over-represented among the mapped clones, as expected.
Identifying genes significantly associated with cellular activities across individual cells
To identify genes whose expressions are associated with HSC activities, including self-
renewal (HSC abundance), myeloid differentiation (granulocyte abundance), lymphoid
differentiation (B cell abundance), differentiation in both lineages (total of granulocyte and B
cell abundance), and lineage bias (ratio of B cell to granulocyte abundance), we compared the
gene expression profiles of HSCs that exhibited high and low levels of each cellular activity
(Fig. 4.5a). First, for each individual mouse, we compared the expression levels of each gene
between HSCs with high and low levels of a particular activity (Fig. 4.5b). We then compared
the results across mice to identify genes that were significantly differentially expressed. To
estimate the false positive rate of our gene identification, we compared experimental data with
100 sets of scrambled data which were generated by randomly mapping cellular activity data to
gene expression data (Fig. 4.5a). Genes identified using this methodology exhibited expression
levels that were robustly associated with respective HSC activities in all experimental mice (Fig.
4.5c).
We repeated the differential gene expression analyses (Fig. 4.5a), separating cellular
activities into high and low groups at every detected activity level (Fig. 4.6a). For each HSC
activity, we identified 20–40 genes with at least one false positive score less than 0.05
(Supplementary Table 4.1). The variation in expression levels of the identified genes correlated
with cellular activity levels in a non-random way as validated by a scrambling analysis (Fig.
4.5a). The identification of these genes indicates that intercellular variations in gene expression
are connected with differential levels of cellular activities in mice. These genes may be essential
in performing the cellular activities with which they are specifically associated.
Functional relevance of identified genes
Among the genes that we identified to be associated with HSC activities through
intercellular variation analyses (Supplementary Table 4.1), some had already been shown to
regulate hematopoiesis (Supplementary Table 4.2). For example, over-expression of Abcg2 in
mice has been shown to drive myelodysplastic syndrome, and Abcg2 expression is particularly
high in myelodysplastic syndrome patients (Kawabata et al., 2018). This is consistent with our
data that HSCs with higher levels of Abcg2 expression produced more granulocytes. A large
cohort study of leukemia patients showed that Btg1 deletion is associated with
leukemia (Waanders et al., 2012), consistent with our data that HSCs with lower levels of Btg1
expression exhibited increased self-renewal activities. In addition, Arid4a, Hhex, Rheb and
Akap9 have been previously studied using knockout mouse models, and have been shown to
regulate hematopoiesis in the ways revealed by our data (Supplementary Table 4.2) (Jackson et
al., 2015; Peng et al., 2018; M.-Y. Wu et al., 2008).
Quantitative associations between gene expression and cellular activity
To determine how cellular activities change with gene expression levels, we compared the P
values of the differential gene expression analyses across various levels of HSC activities (Fig.
4.6a). Among the genes significantly associated with HSC activities, we identified four general
patterns (Fig. 4.6b, Fig. 4.7, Fig. 4.8 and Supplementary Table 4.1). (1) Constant association: P
values are significant and steady, suggesting that these genes are differentially expressed at all
levels of HSC activities. This pattern indicates a dose-response relationship between gene
expression and HSC activity. (2) Discrete association: P values shift abruptly and are generally
constant outside of the shifting points. This pattern indicates that the association between gene
expression and HSC activity is limited to a specific range of HSC activity levels. If these genes
are essential for their respective HSC activities, then alternative mechanisms must exist to carry
out the HSC activities outside of the association ranges. (3) Unimodal association: P value
distribution forms one clear peak and generally follows a normal distribution. These genes are
most significantly differentially expressed at one specific level of the corresponding HSC
activities. If these genes are essential for their respective HSC activities, then their expression
levels may need to reach a critical threshold in order to turn on the cellular activities. (4)
Multimodal association: P value distribution forms multiple peaks. These genes are most
significantly differentially expressed at several distinct levels of HSC activities. If these genes
are essential for their respective HSC activities, then they might alter HSC activities at multiple
thresholds possibly through multiple binding partners in different pathways.
Unimodal and multimodal associations were the most common association patterns among all
five HSC activities that we examined. No constant association pattern was found to govern HSC
self-renewal and granulocyte production. The discrete association pattern was also absent from
granulocyte production and was rare for HSC self-renewal activity. This suggests that dose-
response mechanisms may not be involved in regulating HSC self-renewal and myeloid
differentiation. For the discrete association pattern, four genes showed abrupt increases or
decreases in P values at a similar level of B cell production (Fig. 4.9a). These genes may be
involved in alternative molecular programs that prepare HSCs for either high or low B cell
production. In addition, seven genes associating with HSC self-renewal shared association peaks
around the same level where self-renewal was high (Fig. 4.9b). The expressions of these genes
were also correlated with each other across individual HSCs (Fig. 4.9c), and these genes are
mostly relevant to cell division. The co-expression and co-localized association peaks between
these genes suggest that they may collectively act in the same molecular pathway in HSCs.
Discussion
In this study, we show how intercellular variations in gene expression levels are
associated with cellular activity levels in mouse HSCs, and we identified specific genes that
manifest the association. These data indicate that intercellular variations in gene expression
reflect active management of cellular activity at a fine resolution. In multi-cellular organisms,
multiple cells performing an identical function must be coordinated. Part of this coordination
may rely on fine-tuning the specific activity level of each cell. Alternatively, coordination may
depend upon controlling cell numbers. For example, we have previously shown that the number
of engrafted HSCs in a mouse is tightly controlled, regardless of the donor HSC dose (Brewer et
al., 2016). Yet, altering cell numbers may be less efficient than modulating cellular activity
levels, particularly in response to constant biological changes. It is likely that other tissues and
organs also modulate their cellular activity levels in order to coordinate their overall biological
functions. Our findings provide a new approach for studying cellular regulation and molecular
function through the analysis of intercellular variations. This approach, including the
bioinformatics strategy that we developed, can be readily extended to many biomedical systems.
It is particularly relevant given the recent growth in single-cell studies.
Our data revealed striking quantitative association patterns between gene expression and cellular
activities as well as specific genes and cellular activities that manifest each pattern. These
association patterns reflect distinct quantitative relationships between the cellular and molecular
levels, and provide a new perspective for understanding complex gene regulatory networks. For
example, genes that share a common turning point in their association patterns, such as a
common association peak or discrete site, may be closely related in the molecular regulatory
networks underlying the cellular function. Moreover, understanding the nature of these
molecular and cellular associations can help achieve better control of cellular activities in
biomedical applications. For instance, genes exhibiting the constant association pattern can be
good therapeutic targets for activating cellular activities, whereas genes specifically associated
with high levels of a cellular activity can be good therapeutic targets for tuning down, but not
completely blocking, cellular activities such as proliferation.
Methods
Quantification of HSC activities
The activities of each HSC clone were quantified as described in Chapter 3. While we were able
to confidently quantify HSC differentiation by harvesting the vast majority of the peripheral
blood cells through perfusion, our measurement of HSC self-renewal may have been
compromised by the limited accessibility of HSCs in the bone marrow, which are difficult to
recover completely. Because the results from Chapter 3 suggested that single cell activities can be
deduced from clonal level data, here we determined an HSC’s self-renewal level by the clonal
abundance recovered from the HSC population, and an HSC’s differentiation level by the clonal
abundance in the peripheral blood divided by the clonal abundance in HSCs.
Lineage bias analysis
The calculation of lineage bias level is as follows:
If clonal abundance in granulocytes = 0 and clonal abundance in B cells > 0,
lineage bias level = 1
If clonal abundance in granulocytes > 0 and clonal abundance in B cells = 0,
lineage bias level = −1
If clonal abundance in granulocytes > 0 and clonal abundance in B cells > 0,
$$\text{lineage bias level} = \frac{\arctan\left(\dfrac{(\text{clonal abundance in B cells}) / (\text{total B cells \% white blood cells})}{(\text{clonal abundance in granulocytes}) / (\text{total granulocytes \% white blood cells})}\right) - \pi/4}{\pi/4}$$
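The three cases of the lineage bias calculation can be sketched in Python as follows. This is a minimal illustration rather than the analysis code used in this study: the function and argument names are hypothetical, the two "total ... % white blood cells" terms are assumed to be the percentages of B cells and granulocytes among white blood cells, and the case in which a clone is absent from both lineages is not defined in the text, so the sketch returns None for it.

```python
import math


def lineage_bias_level(b_clonal, gr_clonal, b_pct_wbc, gr_pct_wbc):
    """Lineage bias level in [-1, 1]; positive values indicate B cell bias.

    b_clonal, gr_clonal: clonal abundance in B cells and in granulocytes.
    b_pct_wbc, gr_pct_wbc: total B cells and total granulocytes as a
    percentage of white blood cells (hypothetical argument names).
    """
    if gr_clonal == 0 and b_clonal > 0:
        return 1.0
    if gr_clonal > 0 and b_clonal == 0:
        return -1.0
    if gr_clonal == 0 and b_clonal == 0:
        return None  # assumption: this case is not covered by the text
    ratio = (b_clonal / b_pct_wbc) / (gr_clonal / gr_pct_wbc)
    # arctan maps the ratio (0, inf) onto (0, pi/2); subtracting pi/4 and
    # dividing by pi/4 rescales this to (-1, 1), with 0 for a balanced clone.
    return (math.atan(ratio) - math.pi / 4) / (math.pi / 4)
```

The arctangent keeps the score bounded and symmetric: swapping the normalized B cell and granulocyte abundances flips the sign of the score without changing its magnitude.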
Single cell RNA sequencing
HSCs were FACS-sorted as described above. Single HSC suspensions were washed with ice cold
0.04% BSA in PBS, loaded onto 3’ library chips, and processed according to the manufacturer’s
protocol for the Chromium Single Cell 3’ Library (10X Genomics, V2) with minor modifications
as follows. 5,000–9,000 HSCs from each mouse were loaded into each reaction channel. The
resulting cDNAs were then amplified per the manufacturer’s recommendation with one additional
cycle. After cDNA amplification, half of the amplified cDNA was used for the downstream
fragmentation, adaptor ligation and sample index PCR. The other half of the cDNA was subject
to amplifying “molecular bridges” that contain the genetic tracking barcodes and the Chromium
cellular barcodes (see below). The indexed cDNA libraries from the first half were first sequenced
on an Illumina NextSeq 500 at a target coverage of 5,000 raw reads per cell to estimate cell
numbers, and were then sequenced more deeply on an Illumina HiSeq 4000 at a target coverage of
50,000 raw reads per cell (paired-end; read 1: 26 cycles; i7 index: 8 cycles; read 2: 98 cycles). The
sequencing results were processed using the Cell Ranger pipeline (10X Genomics, v 2.1.0) for
cellular barcode assignment and unique molecular identifier (UMI) quantification. Cells with
more than 5% UMIs mapped to mitochondrial genes were excluded. Only genes with at least 3
UMIs in at least 5% of cells were used for downstream analyses. Expression values for gene i in
cell j were calculated by dividing the UMI count for gene i in cell j by the total UMI count in
cell j, multiplying by 10,000 to create TPM-like values, and then transforming to
log2(TPM+1).
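The filtering and normalization steps described above can be sketched with NumPy as follows. This is an illustrative sketch, not the pipeline used in this study: the function names and the genes-by-cells matrix layout are assumptions, and the text does not state the order in which the cell filter, the gene filter, and the normalization were applied.

```python
import numpy as np


def keep_cells(counts, is_mito_gene, max_mito_fraction=0.05):
    """Boolean mask of cells with at most 5% of UMIs from mitochondrial genes.

    counts: genes x cells array of UMI counts (assumed layout).
    is_mito_gene: boolean array marking mitochondrial genes.
    """
    counts = np.asarray(counts, dtype=float)
    mito_fraction = counts[np.asarray(is_mito_gene)].sum(axis=0) / counts.sum(axis=0)
    return mito_fraction <= max_mito_fraction


def keep_genes(counts, min_umis=3, min_cell_fraction=0.05):
    """Boolean mask of genes with at least 3 UMIs in at least 5% of cells."""
    counts = np.asarray(counts)
    return (counts >= min_umis).mean(axis=1) >= min_cell_fraction


def log_normalize(counts, scale=10_000):
    """log2(TPM-like + 1): counts scaled to 10,000 UMIs per cell, then logged."""
    counts = np.asarray(counts, dtype=float)
    per_cell_total = counts.sum(axis=0, keepdims=True)  # assumes no empty cells
    return np.log2(counts / per_cell_total * scale + 1.0)
```

Scaling every cell to the same total before the log transform removes differences in sequencing depth between cells, so that the remaining variation reflects relative expression.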
Generating and mapping molecular bridges
To specifically amplify molecular bridges containing both a clonal tracking barcode and a cell
barcode, we performed PCR using the single cell cDNA library as template and a single primer
(5’- sample index-ACACTCTTTCCCTACACGAC) (Fig. 4.2a). The PCR products were more
than 1.5 kb long (Fig. 4.2a), and were purified from an agarose gel band (Zymo Research, #11-
300C) (Fig. 4.2b) and sequenced using the PacBio Sequel sequencer (Pacific Biosciences, v2.1).
Raw PacBio sequencing data were analyzed using Circular Consensus Sequences (CCS)
application of SMRT Analysis software (Pacific Biosciences, SMRT Link Version 5.1.0) with
default parameters. The resulting “fasta” files were processed by a customized Python script (Fig.
4.3). We first aligned the molecular bridges to the viral sequence, the main part of the molecular
bridges. Then, the well-aligned molecular bridges were compared to the genetic tracking barcode
list and the Chromium cellular barcode list generated from the same HSC population (Fig. 4.1).
We allowed one tracking barcode to map to multiple cellular barcodes, as each tracking barcode
represents one HSC clone that may consist of multiple HSCs upon self-renewal. Around 20% of
cellular barcodes were mapped to multiple tracking barcodes. This may arise from multiple viral
infections during genetic barcode labeling, or from multiple cells captured in the same droplet
during single cell cDNA library generation. Both are random processes and difficult to avoid.
We discarded these cellular barcodes from the downstream analysis.
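The mapping rules above can be illustrated with a short Python sketch. It is not the customized script used in this study: the function name and the input format (pairs of tracking barcode and cellular barcode already extracted from well-aligned molecular bridges) are assumptions made for illustration.

```python
from collections import defaultdict


def build_barcode_directory(bridge_pairs):
    """Link cellular barcodes to tracking barcodes (clones).

    bridge_pairs: iterable of (tracking_barcode, cell_barcode) tuples, one per
    molecular bridge read (hypothetical input format).
    Returns (cell_to_clone, clone_to_cells).
    """
    tracking_by_cell = defaultdict(set)
    for tracking_barcode, cell_barcode in bridge_pairs:
        tracking_by_cell[cell_barcode].add(tracking_barcode)

    # Keep only cellular barcodes that map to exactly one tracking barcode;
    # ambiguous cellular barcodes are discarded, as described in the text.
    cell_to_clone = {
        cell: next(iter(tracking))
        for cell, tracking in tracking_by_cell.items()
        if len(tracking) == 1
    }

    # One tracking barcode (clone) may legitimately map to many cells.
    clone_to_cells = defaultdict(list)
    for cell, clone in cell_to_clone.items():
        clone_to_cells[clone].append(cell)
    return cell_to_clone, dict(clone_to_cells)
```

The asymmetry is deliberate: a clone containing several HSCs is expected after self-renewal, whereas a single cellular barcode carrying several tracking barcodes cannot be assigned to one clone.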
Identifying differentially expressed genes
To identify genes whose expression levels are associated with variations in HSC activity levels,
we generated 100 sets of scrambled data for the experimental mice by randomly mapping genetic
tracking barcode data to gene expression data. For each gene and each HSC activity, P-values
were calculated by one-sided Mann-Whitney test for both the experimental data and each of the
100 sets of scrambled data, and P-values from replicate mice were then combined using Fisher’s
method. To reduce false positives, we calculated a false positive score for each gene, defined as
the number of genes whose P values were equal to or smaller than the given gene’s P value in the
scrambled data (median of the 100 sets), divided by the corresponding number of genes in the
experimental data. Only genes with a score lower than 0.05 were considered significant.
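The per-gene statistics described above can be sketched with SciPy as follows. This is a minimal sketch under stated assumptions, not the code used in this study: the function names and array layouts are hypothetical, and the direction of the one-sided test is left as a parameter because the text does not specify it.

```python
import numpy as np
from scipy.stats import combine_pvalues, mannwhitneyu


def combined_p_value(expr_by_mouse, is_high_by_mouse, alternative="greater"):
    """One-sided Mann-Whitney test per mouse, combined with Fisher's method.

    expr_by_mouse: one array of expression values (one gene) per mouse.
    is_high_by_mouse: matching boolean arrays marking high-activity HSCs.
    """
    p_values = []
    for expr, is_high in zip(expr_by_mouse, is_high_by_mouse):
        expr = np.asarray(expr, dtype=float)
        is_high = np.asarray(is_high, dtype=bool)
        result = mannwhitneyu(expr[is_high], expr[~is_high], alternative=alternative)
        p_values.append(result.pvalue)
    return combine_pvalues(p_values, method="fisher")[1]


def false_positive_scores(experimental_p, scrambled_p):
    """False positive score for each gene.

    experimental_p: shape (n_genes,), combined P values from real data.
    scrambled_p: shape (n_sets, n_genes), combined P values from scrambled sets.
    """
    experimental_p = np.asarray(experimental_p, dtype=float)
    scrambled_p = np.asarray(scrambled_p, dtype=float)
    # Genes at or below each gene's P value in the experimental data.
    n_experimental = (experimental_p[None, :] <= experimental_p[:, None]).sum(axis=1)
    # The same count in every scrambled set, then the median across sets.
    n_per_set = (scrambled_p[:, None, :] <= experimental_p[None, :, None]).sum(axis=2)
    n_scrambled = np.median(n_per_set, axis=0)
    return n_scrambled / n_experimental
```

Genes with a score below 0.05 would then be retained. The score behaves like an empirical false discovery estimate: it asks how many genes would look at least this significant if cellular activities were assigned to transcriptomes at random.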
Classifying quantitative association patterns
To classify the dynamic association patterns of P-values across various levels of HSC activities,
we plotted $-\log_{10}(\text{combined } P \text{ value})$ against HSC activity by rank (Fig. 4.6b), and we fit the
data points to the following three formulas in order (Fig. 4.7):
(1) $y = a$
(2) $y = b \times e^{-\frac{(x-\mu)^2}{2\sigma^2}} + c$
(3) $y = \begin{cases} d, & \text{if } x \le f \\ e, & \text{if } x > f \end{cases}$
a, b, c, d, e, f, μ, and σ are constants calculated by the Python function “scipy.optimize.curve_fit”.
Genes whose data points best fit formula (1) were classified as the constant association group. For
the remaining genes, those whose data points best fit formula (2) were classified as the unimodal
association group. For the remaining genes, those whose data points best fit formula (3) were
classified as the discrete association group. Lastly, data from all the remaining genes were fitted
to a polynomial formula using the Python function “numpy.polyfit”, using the minimum degree at
which the r² of the fit reaches 0.9. We then identified the number of peaks on the polynomial
curves to classify the genes into the unimodal group (one peak) or multimodal group (multiple
peaks).
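The sequential classification can be sketched in Python as follows. This is an illustration under several assumptions, not the code used in this study. A single hypothetical `cutoff` on the Euclidean distance stands in for the cutoffs that were derived from the distance distributions (Fig. 4.7). The step function of formula (3) is fitted by scanning candidate breakpoints instead of calling `curve_fit`, because a step is not smooth in f. Peaks are counted with `scipy.signal.find_peaks`, which only detects interior maxima.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import find_peaks


def constant(x, a):
    """Formula (1): y = a."""
    return np.full_like(x, a, dtype=float)


def gaussian(x, b, mu, sigma, c):
    """Formula (2): y = b * exp(-(x - mu)^2 / (2 sigma^2)) + c."""
    return b * np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) + c


def fit_distance(model, x, y, p0):
    """Euclidean distance between the data and the least-squares fit."""
    try:
        params, _ = curve_fit(model, x, y, p0=p0, maxfev=10000)
    except RuntimeError:  # the optimizer did not converge
        return np.inf
    return float(np.linalg.norm(y - model(x, *params)))


def step_distance(x, y):
    """Formula (3): scan breakpoints f; d and e are the means on each side."""
    best = np.inf
    for f in np.unique(x)[:-1]:  # leaves at least one point on each side
        left = x <= f
        fitted = np.where(left, y[left].mean(), y[~left].mean())
        best = min(best, float(np.linalg.norm(y - fitted)))
    return best


def count_polynomial_peaks(x, y, r2_target=0.9, max_degree=15):
    """Lowest-degree polynomial with r^2 >= target; returns (degree, n_peaks)."""
    ss_total = np.sum((y - y.mean()) ** 2)
    for degree in range(1, max_degree + 1):
        fitted = np.polyval(np.polyfit(x, y, degree), x)
        r2 = 1.0 - np.sum((y - fitted) ** 2) / ss_total
        if r2 >= r2_target:
            peaks, _ = find_peaks(fitted)
            return degree, len(peaks)
    return None, None


def classify_association(x, y, cutoff):
    """Classify one gene's -log10(P) curve; `cutoff` is a hypothetical threshold."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = (x - x.min()) / (x.max() - x.min())  # rescale ranks for stable fits
    if fit_distance(constant, x, y, [y.mean()]) <= cutoff:
        return "constant"
    p0 = [y.max() - y.min(), x[np.argmax(y)], 0.25, y.min()]
    if fit_distance(gaussian, x, y, p0) <= cutoff:
        return "unimodal"
    if step_distance(x, y) <= cutoff:
        return "discrete"
    _, n_peaks = count_polynomial_peaks(x, y)
    if n_peaks == 1:
        return "unimodal"
    if n_peaks is not None and n_peaks > 1:
        return "multimodal"
    return "unclassified"  # no interior peak, or no polynomial reached r^2 >= 0.9
```

Testing the simplest model first means a gene is assigned to the most parsimonious pattern that describes it adequately, and the polynomial fit is only used for curves that none of the three formulas explains.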
Figures and Tables
Fig. 4.1 Scheme of the integrative experimental system
Schematic illustration of the experimental system that simultaneously measures HSC
transcriptomes and in vivo activities.
Fig. 4.2 Extracting “molecular bridges”
a. The PCR primers were designed to specifically amplify the molecules that contain both a
tracking barcode (TBC) and a cell barcode (CBC) from single cell cDNA libraries. b. Agarose
gel showing the PCR product. The arrow points to the desired products, the molecular bridges,
that were cut and purified for PacBio sequencing.
Fig. 4.3 Bioinformatic pipeline for mapping single cell transcriptomes and activities
Bioinformatic pipeline for mapping single cell transcriptomes and single cell activities. Yellow
bands in the histograms highlight sequencing reads selected for downstream analysis.
Fig. 4.4 Comparing the transcriptomes and activities of barcoded and non-barcoded HSCs
a. t-SNE (t-distributed stochastic neighbor embedding) plot of HSC single-cell RNA sequencing
data comparing HSCs that were mapped to mice through tracking barcodes and HSCs that were
not mapped. b. Comparing activity levels of HSCs that were mapped to single-cell RNA
sequencing data and HSCs that were not mapped.
Fig. 4.5 Identifying genes significantly associated with cellular activities across individual cells
a. Workflow to determine differentially expressed genes. b. Individual HSCs are ranked based
on their granulocyte production abundance and separated into high and low granulocyte
producing groups. Colors denote HSCs from different mice. c. Gnas gene is expressed
significantly higher in low granulocyte-producing HSCs compared to high granulocyte-
producing HSCs in the experimental mice (M1~4), but not in the “scramble mice” which are
mock mice generated by scrambling the mapping between tracking barcode data and gene
expression data (M1’~4’, one of the 100 scramble sets is shown). In (b) and (c), 50% rank of
granulocyte production was used as a representative threshold to define low and high
granulocyte producing HSCs. * P < 0.05; ** P < 0.01; ns, not significant.
Fig. 4.6 Quantitative association between gene expression and cellular activity
a. Workflow to compare the association between gene expression and HSC activities across
different levels of HSC activities. b. Four general association patterns. Each grey dot represents a
P-value calculated using one detected HSC activity level as the threshold. Colored lines show
fitted curves. Red triangles denote peak P values identified in unimodal and multimodal
association patterns. c. Number of genes that exhibit different association patterns with HSC
activities.
Fig. 4.7 Classifying quantitative association patterns
a. Process of classifying all the genes significantly associated with B cell production based on
their quantitative association patterns. After plotting -log10(combined P-value) against HSC
activity by rank, the Python function “scipy.optimize.curve_fit” was used to determine the constants
a, b, c, d, e, f, μ, and σ that minimize the Euclidean distance between the actual data
points and the fitted line. The distance distribution is used to determine the cutoff for classifying
constant association, unimodal association that can be fitted with a normal distribution, and
discrete association. The remaining genes were classified into unimodal or multimodal
association based on the number of peaks on the fitted polynomial curve. b-e, The Euclidean
distance distributions for classifying genes that are significantly associated with granulocyte
production (b), HSC self-renewal (c), Granulocyte + B cell production (d), and B cell /
granulocyte lineage bias (e). See method section of Chapter 4 for additional details.
Fig. 4.8 Complete list of genes identified as significantly associated with HSC activities
Shown are all genes identified as significantly associated with HSC activities including HSC
self-renewal (a), granulocyte production (b), B cell production (c), granulocyte (Gr) and B cell
production (d), and B-cell-to-granulocyte lineage bias (e). “†” in (a) denotes genes highlighted in
Fig. 4.9b. “#” in (c) denotes genes highlighted in Fig. 4.9a.
Fig. 4.9 Gene-gene interaction depicted by the association patterns
Gene-gene interactions depicted by the association patterns between gene expression and cellular
activity across individual cells. a. Top, the discrete sites showing the HSC activity levels where
the P values change abruptly. Shown are all genes that exhibit significant discrete association
with B cell production. Yellow band highlights the B cell production level where discrete sites of
four genes co-localize. Bottom, quantitative association of the four genes highlighted in yellow
at the top. Each dot represents a P-value calculated using one detected B cell production level as
the threshold. b. Top, the distribution of P-value peaks across HSC self-renewal levels. Shown
are all genes that exhibit unimodal or multimodal association with HSC self-renewal activity.
Note that the lowest and highest peaks are comprised of genes whose real peaks fall outside of
the HSC self-renewal range. Yellow arrow highlights the HSC self-renewal level where seven
peaks co-localize. Bottom, smoothed association curves of genes highlighted by the yellow
arrow at the top. c. Correlation in gene expression across individual HSCs between genes
associated with HSC self-renewal. Each circle represents a comparison of two genes highlighted
in yellow in (b). Each “x” mark represents a comparison of the remaining genes associated with
HSC self-renewal. r² was used to compare both positive and negative correlations. *** P < 0.001.
Supplementary Table 4.1 Complete list of genes exhibiting significant associations with HSC
activities
A complete list of genes exhibiting significant associations with HSC activities across individual
HSCs.
Supplementary Table 4.2 Summary of previous studies
Summary of previous studies that showed relevant functions of the genes identified in our study.
Chapter 5: Transplantation alters HSC differentiation
Abstract
Recent studies suggest that HSC differentiation is cell autonomous, as it persists
across transplantation. Here we show that several key aspects of HSC differentiation do not
follow classical autonomous regulation. Lineage bias was absent in unconditioned
transplantation, but present in irradiation- or chemotherapy-mediated transplantation. With
pre-transplantation conditioning, the lineage bias and balance of individual HSC clones are not
autonomous across transplantation as they shift in a systematic and predictable way. In addition,
the amount of blood production by individual HSC clones is reset randomly. Transplantation can
also activate and inactivate the differentiation of HSC clones with distinct characteristics. After
transplantation, HSCs derived from the same ancestor generally conform to an autonomous
model of regulation as they produce similar amounts of blood cells and exhibit similar lineage
biases, although not lineage balance, even when they reside in different recipients. These
findings demonstrate that the differentiation of individual HSCs is regulated by non-classical
autonomous mechanisms that may provide robustness and flexibility to blood production.
Transplantation conditions alter HSC differentiation at the clonal level
Irradiation-mediated transplantation is used in the vast majority of HSC studies, leading
to many discoveries on HSC characteristics including lineage bias and clonal expansion
(Beerman, Bhattacharya, et al., 2010; Dykstra et al., 2007; Ergen et al., 2012; Sieburg et al.,
2006). While preconditioning the recipient is necessary to obtain high levels of HSC
engraftment, all conditioning regimens, to various degrees, injure and derange the niches that
normally regulate HSCs (Dominici et al., 2009; Pietras et al., 2015). Although damaged niches
can be restored to some extent after conditioning, it is unclear whether HSC regulation in
restored niches still resembles that under normal physiological conditions. HSCs can be
transplanted without the use of conditioning by taking advantage of the natural HSC migration in
the peripheral blood, but it produces low engraftment rates even after repeat transplantations
(Bhattacharya et al., 2009; Czechowicz et al., 2007). Using our high throughput, high sensitivity
embedded viral tracking technology, we sought to compare the clonal level HSC differentiation
under unconditioned or conditioned transplantation.
For the unconditioned transplantation group, we performed gender-matched transplantation
every other day for a total of five to seven times, each time with 1,000 barcoded HSCs. This
resulted in around 1% donor chimerism in the peripheral blood 22 weeks post transplantation. For
irradiation-mediated transplantation, host mice received a 950 cGy dose of radiation prior to being
transplanted with 5,000-9,000 barcoded HSCs. The lineage bias of an HSC clone is determined
by its relative contribution to myeloid versus lymphoid lineages. For example, HSC barcodes
with myeloid bias have relatively high copy numbers in myeloid cell types such as granulocytes
and relatively low copy numbers in lymphoid cell types such as B cells. While no lineage bias
was observed when pre-transplantation conditioning was absent (Fig. 5.1, left), after irradiation-
mediated transplantation, donor-derived HSC clones were separated into three groups using the
ratio of granulocyte barcode copy numbers to B cell barcode copy numbers (Fig. 5.1, right);
these three groups represented myeloid bias, lymphoid bias, and lineage balance, consistent with
previous studies (C. E. Muller-Sieburg et al., 2012). These data indicate that the conditioning
regimen used in the previous studies may have contributed to the observed HSC heterogeneity.
Thus, future studies must be carefully designed to distinguish normal HSC physiology from
emergency modes.
Transplantation modulates cellular activities in vivo
The significant association between intercellular variations in gene expression and
cellular activity levels indicates that the differentiation levels of individual HSCs undergo active
adjustments. This finding is surprising given that it has been previously shown that the
heterogeneous differentiation programs of individual HSCs are cell autonomous (Yu et al., 2016)
and persist during serial transplantations (Dykstra et al., 2007). In our abovementioned serial
transplantation experiment (Chapter 3), we found that although HSCs derived from the same
ancestor generally conform to an autonomous model of regulation, some key aspects of HSC
differentiation do not persist across transplantation.
Comparing the quantity of blood cells that each HSC clone produced in the primary and
secondary recipients, we found that the overall blood production of individual HSC clones
changed dramatically between the primary and secondary recipients (Chapter 3). We also found
that HSCs originating from the same ancestor exhibited similar lineage biases in different
secondary recipients (Fig. 5.2). This similarity was even greater in clones that exhibited extreme
lineage biases (Fig. 5.2). However, lineage-balanced clones were not consistently balanced
across different secondary recipients (Fig. 5.2). These clones exhibited similar consistency in the
amount of their overall blood production as the lineage-biased clones (Fig. 5.3). Thus,
lineage-balanced HSC clones appear to possess the flexibility to adjust the overall balance of
different blood cell types, but not the flexibility to regulate the overall quantity of blood
production.
Interestingly, most HSCs changed their lineage biases systematically towards the
lymphoid lineage during transplantation (Fig. 5.4). This may be related to the reduced blood
production of lymphoid-biased HSCs after transplantation (Pietras et al., 2015; Yamamoto et al.,
2013) (Fig. 5.5). Our data show that HSC clones that were lineage-balanced in primary
recipients developed a lymphoid bias in secondary recipients (Fig. 5.6a). Myeloid-biased clones
in primary recipients were more likely to become lineage-balanced in secondary recipients (Fig.
5.6b). Surprisingly, some lineage committed clones, which did not produce all measured blood
cell types in primary recipients, adopted a classic HSC behavior contributing to both lineages in
secondary recipients (Figs. 5.6a-b). Using both Monte Carlo simulations and probability
calculations, we show that the observed lineage bias change is significantly different from a
random distribution (Supplementary Tables 5.1 and 5.2). In addition, when data from all but
one mouse were used to build a predictive model, the behavior of HSCs in the remaining mouse
could be faithfully predicted (Fig. 5.6a). Taken together, these data suggest that HSC blood
production was altered in a systematic and predictable way during hematopoietic reconstitution
post transplantation.
Previous population-level studies suggest that HSCs are the only cells that produce blood
four months post transplantation, that they supply all blood cell types, and that their unique self-
renewal capacity allows them to continue supplying blood cells in secondary recipients (Eaves,
2015; Purton & Scadden, 2007; Seita & Weissman, 2010; Bryder et al., 2006). However, our
clonal-level data identified a substantial number of HSC-like cells that challenge this dogma. We
found that some HSCs did not produce all measured blood cell types four months post
transplantation in the primary recipients (Brewer et al., 2016; Yamamoto et al., 2013). We call
these cells “lineage committed”. Here, we found that these committed HSC clones could take up
the classic HSC behavior in the secondary recipients (Fig. 5.7a). In addition, we found a
substantial number of HSC clones that only supplied blood cells in the secondary recipients, but
not in the primary recipients (Fig. 5.7a). Strikingly, a majority of these clones were
simultaneously activated in multiple secondary recipients (Fig. 5.7b), suggesting that the
activation of HSC differentiation by transplantation is highly consistent among HSCs derived
from the same ancestor.
Transplanting HSCs from primary recipients into secondary recipients can activate or
inactivate HSC differentiation. We found that fewer HSC clones supplied blood cells in the
secondary recipients than in the primary recipients (Fig. 5.8). Furthermore, the number of clones
that supplied blood cells was similar between secondary recipient mice that shared the same
primary recipient (Fig. 5.8). This is expected as it is technically impossible to extract every HSC
from the bone marrow of the primary recipient during the secondary transplantation. In addition,
clonal dominance consistently increased after secondary transplantation (Fig. 5.9). Here, we
quantify clonal dominance as the total amount of blood cells produced by the top five most
abundant HSC clones. We found that clonal dominance was higher in the secondary recipients
than in the primary recipients in all examined mice. We obtained similar results when using
different definitions of clonal dominance, such as analyzing the top one or three most abundant
clones, or when solely considering the production of a single blood cell type (Fig. 5.9).
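The clonal dominance measure defined above can be written as a short Python function. This is a sketch with a hypothetical function name and input format. It also assumes that the output of the top clones is expressed as a fraction of the total output of all measured clones, which the text does not state explicitly.

```python
def clonal_dominance(clone_abundance, top_n=5):
    """Fraction of blood cells produced by the top_n most abundant clones.

    clone_abundance: dict mapping clone ID to its blood cell output, either
    summed over lineages or restricted to a single blood cell type.
    """
    total = sum(clone_abundance.values())
    top = sorted(clone_abundance.values(), reverse=True)[:top_n]
    return sum(top) / total
```

Setting `top_n` to 1 or 3, or passing the abundances for a single blood cell type, corresponds to the alternative definitions mentioned in the text.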
We further determined how secondary transplantation activates or inactivates the
differentiation of individual HSCs (Fig. 5.10). We found that clones that were active in both
primary and secondary recipients produced most, but not all blood cells (Fig. 5.10a). In
particular, they produced more lymphoid than myeloid cells in secondary recipients, consistent
with the lineage bias shift towards lymphoid lineage at the clonal level (Fig. 5.10b-c).
Additionally, persistently active HSC clones supplied more myeloid than lymphoid cells in
primary recipients that received irradiation treatment, but did not do so in primary recipients that
received busulfan treatment (Fig. 5.11). This is the only difference between the two treatments
that we identified in this study. In general, pre-transplantation radiation and busulfan treatments
appear to have similar influence on HSC differentiation. Moreover, these clones were more
abundant (Fig. 5.10a) and more likely to be found in all examined blood lineages than other
clones (Fig. 5.10b). Persistently active and committed clones did not exhibit any lineage
preference in the primary recipients, but exhibited significantly higher levels of lymphoid
commitment in the secondary recipients (Fig. 5.10c). Similarly, HSC clones that only supply
blood cells in either primary or secondary recipients also exhibited significantly higher levels of
lymphoid commitment (Fig. 5.10c). In addition, clones that only supply blood cells in secondary
recipients tended to exhibit myeloid bias (Fig. 5.10d). They balanced out the systematic lineage
bias shift of persistent clones towards the lymphoid lineage (Fig. 5.6). These data suggest that
HSC clones whose differentiation is activated or inactivated during transplantation also exhibit
specific differentiation characteristics.
To identify genes associated with changes in cellular activities, we compared the
transcriptomes of HSCs that did not change lineage bias with those that shifted towards the
lymphoid lineage. We found that Myl10 expression is significantly higher in HSCs that changed
lineage bias (Fig. 5.12). Myl10 is known to play an important role during lymphocyte
development (Oltz et al., 1992). Our data now suggest that it may be involved in the lymphoid
priming of HSCs.
Discussion
In summary, our data demonstrate for the first time how distinct differentiation
characteristics of individual HSCs exhibit non-classical autonomy during serial transplantation.
While blood production abundance, lineage bias and activation of differentiation are highly
consistent among HSCs derived from the same ancestor, suggesting cell autonomous regulation,
the blood production abundance of individual HSCs is randomly reset by transplantation, and
lineage bias shifts in a systematic and predictable way. These distinct types of changes suggest
that some differentiation properties, such as blood production abundance, are autonomously
regulated and not dependent on the prior behavior of an HSC clone. Other autonomous properties,
such as lineage bias, are dependent on prior behavior of the HSC clone. Moreover, lineage
balanced differentiation is not an autonomous property of HSCs. This implies that lineage
balanced HSC clones play a key role in fine-tuning the balance between different blood cell
types.
Our discovery elucidates the regulatory mechanisms underlying the heterogeneity of HSC
differentiation and their alteration by transplantation. Our findings also demonstrate the
feasibility of isolating HSCs with distinct blood production characteristics. A better
understanding of the non-classical autonomous regulation of HSC differentiation may help to
improve bone marrow transplantation therapy. For example, clinicians could transplant myeloid-
biased HSC clones to produce a balanced blood supply or target specific HSC clones to address
the needs of an individual patient. This would ultimately improve clinical outcomes and create
opportunities for better and safer precision medicine.
Methods
Unconditioned transplantation
HSCs were obtained from crushed bones of donor mice as described in Chapter 3,
transduced with lentivirus carrying clonal tracking barcodes as described in Chapter 2, and then
transplanted into recipient mice via retro-orbital injection. Donor and recipient mice were of the
same gender to reduce immune rejection. 1,000 barcoded HSCs were transplanted into each
mouse every other day for 18 days (9,000 donor HSCs total). Long-term stable engraftment of
∼1% donor chimerism was consistently obtained.
Lineage bias analysis
Lineage-committed clones are defined as clones that were only present in either myeloid
(granulocytes) or lymphoid (B cell) lineages. Lineage-biased clones are defined as clones present
in both lineages, but whose relative abundance in one lineage is more than 2.4142 (cotangent π/8)
times their relative abundance in the other lineage. The remaining clones are defined as
balanced clones.
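The three definitions above can be written as a small classifier. The sketch below is illustrative only: the function name `classify_clone`, the use of a strict inequality at the threshold, and the handling of undetected clones are assumptions. The category labels (MC, MB, Ba, LB, LC) follow the abbreviations used in this chapter. Note that a ratio above cot(π/8) = 1 + √2 is equivalent to the clone lying within an angle of π/8 of one axis when granulocyte abundance is plotted against B cell abundance.

```python
import math

# cot(pi/8) = 1 + sqrt(2), approximately 2.4142
BIAS_THRESHOLD = 1 / math.tan(math.pi / 8)


def classify_clone(gr, b):
    """Classify a clone from its relative abundance in granulocytes (gr) and B cells (b).

    Returns "MC", "MB", "Ba", "LB", "LC", or None if the clone is detected in neither lineage.
    """
    if gr <= 0 and b <= 0:
        return None  # not detected; outside the scope of the definitions
    if b <= 0:
        return "MC"  # myeloid committed: present only in granulocytes
    if gr <= 0:
        return "LC"  # lymphoid committed: present only in B cells
    if gr / b > BIAS_THRESHOLD:
        return "MB"  # myeloid biased
    if b / gr > BIAS_THRESHOLD:
        return "LB"  # lymphoid biased
    return "Ba"      # balanced
```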
Violin graphs (Fig. 5.6) were plotted using the Python package “matplotlib”, in which the estimator
bandwidths were calculated using the default Scott method. The maximum widths of
individual violin bars were scaled based on the clone numbers in different categories.
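As a rough illustration of the two quantities mentioned above, the sketch below computes a one-dimensional Scott bandwidth (sample standard deviation multiplied by n^(-1/5)) and a set of maximum violin widths proportional to clone counts. It uses only the standard library and does not call matplotlib. The function names, the linear scaling rule, and the `max_width` value are assumptions, since the exact scaling used for Fig. 5.6 is not specified here.

```python
import statistics


def scott_bandwidth(values):
    """Scott's rule for 1-D data: sample standard deviation times n**(-1/5)."""
    n = len(values)
    return statistics.stdev(values) * n ** (-1 / 5)


def scaled_max_widths(clone_counts, max_width=0.8):
    """Maximum violin width per category, proportional to its clone count.

    clone_counts: mapping of category label to number of clones.
    The category with the most clones receives max_width (an assumed plotting value).
    """
    largest = max(clone_counts.values())
    return {category: max_width * n / largest for category, n in clone_counts.items()}
```

The resulting widths could then be supplied to a violin plotting routine, one value per category.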
To perform the Monte Carlo simulations (Supplementary Table 5.1), clones were first
categorized into five categories based on their behaviors in primary or secondary recipients: MC,
myeloid committed; MB, myeloid biased; Ba, balanced; LB, lymphoid biased; LC, lymphoid
committed. Then, clones in each category were classified into five groups (MC, MB, Ba, LB,
LC) based on their behaviors in secondary or primary recipients, respectively. The numbers of
clones in each group were determined for each category and for the five categories pooled
together. The latter were used to create five bins of corresponding sizes. For each simulation,
clones in each category were randomly assigned to one of the five bins. We conducted 1,000,000
simulations, and calculated the number of runs where each simulated bin is larger or smaller than
the actual number of clones in each bin. The probability that the simulated number is smaller
than the actual number is shown in the table.
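One way to read the simulation procedure above is as a permutation test: the group labels are shuffled across all clones, so that the five bins keep their pooled sizes, and the simulated count in each category and group combination is compared with the observed count. The sketch below follows that reading. The function name, the input format, and the default number of simulations (far fewer than the 1,000,000 used in this study) are assumptions made for illustration.

```python
import random
from collections import Counter

LABELS = ["MC", "MB", "Ba", "LB", "LC"]


def monte_carlo_table(pairs, n_sim=10_000, seed=0):
    """Estimate P(simulated count < observed count) for each (category, group) cell.

    pairs: one (category_label, group_label) tuple per clone, for example its
    behavior in the primary and in the secondary recipient.
    """
    rng = random.Random(seed)
    categories = [c for c, _ in pairs]
    groups = [g for _, g in pairs]
    observed = Counter(pairs)
    cells = [(c, g) for c in LABELS for g in LABELS]
    below = Counter()
    for _ in range(n_sim):
        shuffled = groups[:]
        rng.shuffle(shuffled)  # random assignment that preserves pooled bin sizes
        simulated = Counter(zip(categories, shuffled))
        for cell in cells:
            if simulated[cell] < observed[cell]:
                below[cell] += 1
    return {cell: below[cell] / n_sim for cell in cells}
```

Under this reading, values near 1 mark cells with more clones than expected by chance, and values near 0 mark cells with fewer.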
To calculate the probability table (Supplementary Table 5.2), clones were first categorized into
five categories (MC, MB, Ba, LB, LC) based on their behavior in primary recipients. Then
clones in each category were classified into five groups (MC, MB, Ba, LB, LC) based on their
behaviors in secondary recipients. The numbers of clones in each group were counted for each
category and for the five categories pooled together. The probability of a group in a category is
calculated as:
$$\text{Probability} = \frac{\sum_{i=0}^{k} \binom{K}{i} \binom{N-K}{n-i}}{\binom{N}{n}}$$
Where 𝑁 is the number of all clones analyzed, 𝑛 is the number of clones in this category, 𝐾 is
the pooled number of clones for this group from all categories, and 𝑘 is the number of clones in
this group from its category. This generates the probability that the number of clones in a random
distribution is equal to or smaller than the actual number of clones for a group in a category.
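The probability defined above is the cumulative distribution function of a hypergeometric distribution, and it can be evaluated exactly with integer binomial coefficients. The following sketch uses the same symbols (N, n, K, k) as the definition. The function name `prob_at_most` is hypothetical, and this is not the code used to generate Supplementary Table 5.2.

```python
from math import comb


def prob_at_most(N, n, K, k):
    """P(X <= k), where X counts clones of a given group among n clones drawn from N.

    N: number of all clones analyzed
    n: number of clones in the category
    K: pooled number of clones for the group from all categories
    k: observed number of clones in the group from this category
    """
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k + 1))
    return numerator / comb(N, n)
```

For example, with N = 10, n = 5, K = 5 and k = 0, only one of the 252 possible draws contains no clone from the group, so the probability is 1/252.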
Figures and Tables
Fig. 5.1 HSC lineage bias after unconditioned or irradiation-mediated transplantation
Scatter plots comparing barcode copy numbers from granulocytes with barcode copy numbers
from B cells in the peripheral blood collected at 22 weeks post transplantation. Each dot
represents a unique barcode that is used to track a single HSC clone. Colors are assigned
according to the ratios of the granulocytes barcode copy numbers (myeloid lineage) to B cells
barcode copy numbers (lymphoid lineage). WT: wild type. Data of irradiation mediated
transplantation shown here were generated by Rong Lu.
Fig. 5.2 Comparing the lineage bias in two secondary recipients
Comparing the lineage bias between granulocytes and B cells in two secondary recipients (2°)
that share a common primary recipient. Bottom, each open circle represents an HSC clone. The
darkness of the color depicts the clonal abundance. Top, HSC clones are separated into five bins
of equal sizes based on their lineage biases. Shown is the P-value of the Pearson correlation
comparing the lineage bias of the clones in each bin between secondary recipients.
Fig. 5.3 HSC clones with distinct lineage bias and balance exhibited similar blood abundance
consistency
Shown is the clonal abundance of granulocytes (top) and B cells (bottom) between two
secondary (2°) recipients that share a common primary recipient. Clones were separated into five
groups, shown in five columns, based on their lineage bias in primary (1°) recipients. Each open
circle represents an HSC clone, and its position indicates its clonal abundance. Clones from all
experimental mice are plotted in each panel. “r” depicts Pearson correlation coefficient.
Fig. 5.4 Comparing the lineage bias between primary and secondary recipients
Comparing the lineage bias of HSC clones between primary and secondary recipients. Each open
circle represents an HSC clone.
Fig. 5.5 Lymphoid biased HSC clones produced fewer blood cells in secondary recipients.
Comparing lineage bias level in primary recipients (x axis) and clonal abundance in the
peripheral blood of secondary recipients (y axis. Upper row, granulocytes; lower row, B cells).
Each open circle represents an HSC clone.
Fig. 5.6 HSCs systematically change blood production during serial transplantation.
a. Lineage bias changes during transplantation. HSC clones are categorized based on their
lineage bias in primary recipients. Clones from all but one primary recipient and its
corresponding secondary recipients were used to generate the violin bars. Clones from the
remaining primary and secondary recipients were plotted as dots. MC, myeloid committed; MB,
myeloid biased; Ba, balanced; LB, lymphoid biased; LC, lymphoid committed. Lineage
committed clones exclusively differentiated into one lineage, while lineage biased clones
differentially differentiated into both lineages. b. HSC clones are categorized based on their
differentiation behavior in secondary recipients and shown as described in (a).
Fig. 5.7 HSCs can be activated or inactivated upon transplantation
a. Overlap of clones that produce white blood cells four months after transplantation in primary
and secondary recipients. Shown are the mean ± s.e.m. of each pair of primary and secondary
recipients. b. Average fraction of clones detected only in the peripheral blood of secondary
recipients but not in the primary recipients.
Fig. 5.8 Fewer HSCs produce blood in secondary recipients compared with primary recipients
Number of clones detected in primary (1°) and secondary (2°) recipients. The average number of
clones in secondary recipients that share a common primary recipient is shown. The blue error
bar depicts s.e.m. of secondary recipients that share a common primary recipient.
Fig. 5.9 Clonal dominance increases upon transplantation
Total abundance of the top five most abundant clones among white blood cells (WBCs). Markers
representing secondary recipients are aligned to the marker of the corresponding primary
recipient.
Fig. 5.10 Characteristics of activated and inactivated HSC clones
a. Percentage of clones that are of high or medium abundance. High abundance clones refer to
the top one-third most abundant clones in each mouse; medium abundance clones refer to the
middle one-third abundant clones. See Supplemental Table 5.3a for t-test analysis comparing
different categories of clones. b. Percentage of clones that are multi-lineage or lineage
committed. Multi-lineage clones are defined as those that produce all measured blood cell types.
Lineage committed clones are defined as those that do not produce every blood cell type. See
Supplemental Table 5.3b for t-test analysis comparing different categories of clones. c.
Percentage of clones that are myeloid committed or lymphoid committed. See Supplemental
Table 5.3c for t-test analysis comparing different categories of clones. d. Percentage of clones
that are lineage biased or lineage balanced. See Supplemental Table 5.3d for t-test analysis
comparing different categories of clones.
Fig. 5.11 Contribution of persistent clones in primary and secondary recipients
Contribution of persistent clones to different blood cell types in primary and secondary
recipients. Each dot represents data from one mouse. Gr, granulocyte; B, B cell; 4T, CD4+ T cell;
8T, CD8+ T cell.
Fig. 5.12 Myl10 expression is associated with HSC lineage bias shift upon transplantation
Left, HSC clones mapped with single cell gene expression data form three clusters based on their
lineage bias changes between primary and secondary recipients. Marker shapes denote individual
mice. Grey circles denote unmapped clones. Right, Myl10 is significantly differentially
expressed between HSC clones that changed toward lymphoid production and those that did not
change their lineage bias. Colors at the bottom correspond to the clusters shown on the top.
*** P < 0.001.
Supplementary Table 5.1 Significance of the change in lineage bias determined by Monte Carlo
simulation
Significance of the change in lineage bias upon secondary transplantation as determined by
Monte Carlo simulation. P-values generated by Monte Carlo simulation indicate whether the
number of clones observed is significantly different from a random distribution. P-values less
than 0.05 indicate changes significantly smaller than random. P-values more than 0.95 indicate
changes significantly larger than random.
Supplementary Table 5.2 Significance of the change in lineage bias determined by probability
calculation
Significance of the change in lineage bias upon secondary transplantation as determined by
probability calculation. Direct calculation of the probability that the observed number of clones
exhibit lineage bias change between the primary and secondary recipients. Shown are the
probabilities that the numbers of clones are equal or lower than observed numbers. The
calculations starting from the primary and secondary recipients generate the same results.
P-values less than 0.05 indicate changes significantly smaller than random. P-values more than
0.95 indicate changes significantly larger than random.
Supplementary Table 5.3 T-test analysis between different clones
T-test analysis of different clones. Related to Fig. 5.9. 1° per, persistent clones in primary
recipients; 1°-only, clones only detected in primary recipients; 2° per, persistent clones in
secondary recipients; 2°-only, clones only detected in secondary recipients; *** P < 0.001; ** P
< 0.01; * P < 0.05; ns, not significant; - not applicable.
Chapter 6: Cellular heterogeneity associated with aging and
leukemia
Abstract
Age-related physiological decline and leukemia are two important pathologies
associated with the hematopoietic system. Here, we explored their respective underlying molecular
heterogeneity at the clonal level using our integrative technology that combines viral embedded
barcode tracking and single cell RNA sequencing. In the aging study, we characterized genes
associated with pro-aging or anti-aging behaviors of stem cells. In the leukemia study, we
identified distinct gene expression signatures that drive leukemic cell expansion and
heterogeneous responses to chemotherapies. Our data suggest the possibility of delaying aging
by targeting specific subsets of stem cells and reveal new insights into cancer cell heterogeneity.
They also demonstrate the power and significance of simultaneously analyzing gene expression and
cellular activity at the single cell level.
Temporal variability in the onset of aging
While all living organisms age, the aging phenotype is manifested at different
chronological ages across individuals (Collier & Coleman, 1991; Mitnitski et al., 2017; Yashin et
al., 2016). The variability in the onset of aging is well recognized and intensely investigated in
human studies where genetic background and lifestyle choices are found to be dominant factors
(Yashin et al., 2016; Mitnitski et al., 2017). However, little is known about the cellular and
molecular mechanisms underlying this variability, even in animal models where genetic
background and environment can be well controlled. A large part of the difficulty lies in the
similarity between organisms of the same chronological age, which obliges most aging studies to
compare young and old organisms. While this approach has discovered many biological
differences between the young and old conditions, it has been challenging to distinguish the
mechanisms that trigger the aging phenotype when many biological changes simultaneously take
place and influence each other. These differences are undetectable at the population level, but
can be teased out by single cell studies. Of all stem cells, hematopoietic stem cells (HSCs)
provide an ideal model system for studying aging onset (Orkin & Zon, 2008). In the hematopoietic
system, the aging phenotype is known as an imbalance between different blood cell types, namely the
myeloid overabundance and lymphoid deficiency (MOLD) phenotype (Beerman,
Maloney, et al., 2010; C. Muller-Sieburg & Sieburg, 2008; Pang et al., 2017). It has been shown
that MOLD arises from the aging of HSCs (Ergen & Goodell, 2010; C. Muller-Sieburg &
Sieburg, 2008; Sudo et al., 2000). However, it is unclear how the aging process develops within
the HSC network as HSCs have recently been found to be heterogeneous. In this study, we
tracked the dynamic changes of individual HSC clones over time and investigated the underlying
molecular mechanisms underlying the onset of the aging phenotype.
Clonal tracking data show that the temporal variability in the onset of aging is associated
with subgroups of HSCs (data not shown). We then sought to compare the gene expression
profiles of HSCs aged in young niche from mice that displayed early and delayed aging. We
identified a short list of genes that were significantly differentially expressed and found that their
known functions are largely related to the cellular mechanisms that we have identified (Fig. 6.1
and Supplementary Table 6.1). Genes expressed at significantly higher levels in delayed aging mice are
known to maintain stem cell self-renewal (Dkc1, Khdrbs1, and Cbfa2t3), to inhibit stem cell
differentiation (Dkc1 and Cbfa2t3), or to be expressed at lower levels in advanced age (Mrps33 and Ctla2a).
This is consistent with the two cellular mechanisms that we have identified. Genes expressed at
significantly higher levels in early aging mice are known to inhibit cellular senescence (Prmt1 and Srsf3),
which may be related to the clonal succession mechanism specific to intrinsically aged HSCs.
We also found striking differences between the gene expression profiles of HSCs from
old niches and from young niches in mice that displayed early aging. More than half of all genes
that we analyzed were expressed at significantly lower levels in HSCs aged in young niches than in
HSCs aged in old niches (Fig. 6.2a). These genes were
significantly enriched with functions relevant to extracellular exosomes, suggesting that HSC-niche
communication may play a role in aging, consistent with our results from the cellular and
organism levels. A small number of genes were expressed at significantly higher levels in HSCs aged in
young niches, and these genes were significantly enriched with functions relevant to ribosomes
(Fig. 6.2b). Global inhibition of protein synthesis has been shown to increase lifespan in many
organisms. Our data now link ribosomal functions to the intrinsic aging of HSCs.
Taken together, these findings demonstrate that the particular mechanisms underlying the
temporal variability in the onset of aging can be different from general aging mechanisms
derived from comparisons between young and old conditions. These findings also reveal that a
subset of cells can behave opposite to the overall cell population and substantially influence the
overall biological phenotype.
Leukemia progression and chemotherapy response
Cancer is an evolving disease driven by genetic and epigenetic changes. The evolution of
these molecular alterations generates tremendous cellular heterogeneity (Kreso et al., 2013).
Consequently, individual cancer cells differentially proliferate, selectively metastasize, and
sporadically escape therapeutic treatment. Cellular heterogeneity has arisen as a major hurdle in
cancer treatment (Dagogo-Jack & Shaw, 2018). To tackle this problem, many studies have
investigated gene expression differences between individual cancer cells using single cell mRNA
sequencing (Darmanis et al., 2017; Gawad et al., 2014). However, it is unclear how the
molecular differences contribute to the heterogeneous activities of individual cancer cells during
cancer progression and treatment. Here, we investigate the heterogeneity of cancer cells in their
individual activities during expansion, circulation, and chemotherapy responses. We show that
cells exhibiting distinct activities are characterized by distinct gene expression signatures.
We used B-cell acute lymphoblastic leukemia (B-ALL) and assessed the activities of
barcoded cancer cells in a patient-derived xenograft (PDX) model, which is particularly suitable
for testing metastasis and therapeutic response. Clonal tracking analysis characterized spatially
confined clonal expansion (Fig. 6.3a), which contradicts the prevailing view that
leukemia, a “liquid cancer”, uniformly spreads throughout the body. We then used single-cell
RNA sequencing data to compare the gene expression of clones more abundant in the bone
marrow and clones more abundant in the blood. We found three genes—BTK, DNAJC, and
LRIF1— that were significantly differentially expressed (Fig. 6.3b). BTK (Bruton’s tyrosine
kinase) is critical for signal transduction downstream of pre-B cell receptor (pre-BCR) and
functions as a tumor suppressor in B-ALL (Feldhahn et al., 2005; Kersseboom et al., 2003). A
BTK-binding molecule, Ibrutinib, is currently being tested in a clinical trial for treating B-ALL
(ClinicalTrials.gov Identifier: NCT02997761).
In addition to spatially confined clonal expansion in the bone marrow, we found organ-
specific extramedullary clonal expansion in the kidney, stomach and ovaries, associated with
three patient samples, respectively (Fig. 6.4a). These data demonstrate clonal selection in
extramedullary expansion. We analyzed the gene expression profiles of donor cells comparing
clones that expanded in the ovary with those that did not (Fig. 6.4b). We identified a largely
unknown gene, CMC2 (COX assembly mitochondrial protein 2 homolog), that was expressed
significantly higher in clones overrepresented in the ovary (Fig. 6.4c).
Chemotherapy treatment of ALL is one of the great successes in medical oncology—
transforming a universally fatal disease into one where most children and many adults can be
cured. One peculiarity of ALL therapy is that multiple cycles of low-dose maintenance therapy
after high intensity therapy are necessary for long-term cure (NCCN clinical practice guidelines:
Acute Lymphoblastic Leukemia Version 2.2019; Pediatric Acute Lymphoblastic Leukemia
Version 1.2020). These complex regimens were derived empirically from decades of methodical
clinical research. However, a mechanistic understanding of how intensive and maintenance
therapies synergize has never been presented. We hypothesized that intensive and maintenance
therapies target different ALL clones. We included PDX from three patients, labeled as ALL04,
ALL06, and ALL20, in this experiment. We investigated donor leukemia cells to determine if the
clones that exhibit a distinct chemotherapy response share a common gene expression signature
prior to the treatment. In ALL04 prior to any treatment, the clones that responded better to
combination therapy than to intensive therapy expressed significantly lower levels of EBPL and
significantly higher levels of MESDC compared to all other clones (Fig. 6.5a). Additionally,
some clones from ALL04 responded significantly better to maintenance therapy than to intensive
therapy, and these clones expressed lower levels of CAPNS1 prior to treatments compared to all
other clones (Fig. 6.5b). This is in line with the previous finding that inhibiting CAPNS1
sensitizes prostate cancer cells to methotrexate treatment (Jorfi et al., 2015). From another
patient sample ALL20, some clones responded significantly better to intensive therapy than to
maintenance therapy (Fig. 6.5c). Prior to treatments, these clones expressed higher levels of
BTG2, CD38, GTF2A2, ICOSLG, ITGAE and ZRANB2 compared to all other clones (Fig.
6.5c). BTG2 is a tumor suppressor in B-ALL and a known target of p53 (Tijchon et al., 2016). It
is upregulated during chemotherapy-mediated apoptosis in cancer cells (J.-G. Chen et al., 2003;
Islaih et al., 2005). CD38 and ITGAE (CD103) are both activation markers of leukemia, and our
data are consistent with the idea that intensive chemotherapy selectively targets the highly
proliferative tumor fraction (Delgado et al., 2002). Monoclonal antibodies targeting CD38
(daratumumab, isatuximab and MOR202) have been used in many clinical trials for
hematopoietic malignancies (van de Donk et al., 2016). CD38 and ITGAE were activated in
response to pentostatin (Delgado et al., 2002), an antimetabolite drug that disrupts nucleic acid
synthesis like methotrexate. This is consistent with our finding that clones with higher expression
of these genes were less sensitive to methotrexate. ICOSLG has been found to be upregulated in
trastuzumab-resistant breast cancer cells (Nam et al., 2015), suggesting it plays a role in
therapeutic resistance. Taken together, the data provide original experimental evidence that
different subsets of ALL clones differentially responded to intensive and maintenance therapies,
and exhibited distinct gene expression signatures.
Methods
Procedures associated with aging project
HSCs were obtained from crushed bones of donor mice as described in Chapter 3,
transduced with lentivirus carrying clonal tracking barcodes as described in Chapter 2, and then
transplanted into recipient mice via retro-orbital injection. Recipient mice were irradiated with
950 cGy before receiving the transplant. To collect HSCs aged in old niche, the end time point of
the experiment was set at 15 months post-transplantation, but some mice were sacrificed 12-14
months post-transplantation because they had reached their end-of-life point. To collect HSCs
aged in young niche, after the primary transplantation, c-kit enriched bone marrow cells were
isolated from all of the limbs, sternum and spine of each mouse five months post transplantation
and subsequently transplanted into a newly irradiated young recipient.
For gene expression analysis, barcoded GFP+ HSCs were FACS-sorted from a) three recipient
mice 15 months post transplantation, and b) six serial transplantation recipient mice 5 months
after the 8th transplantation. Sorted cells were washed in PBS with 0.04% bovine serum
albumin (BSA) and processed using the Chromium Single-cell 3′ v2 Library Kit (10× Genomics,
Pleasanton, CA) following the manufacturer’s instructions. 12,000-15,000 single cells were
loaded for capture. Subsequent steps for mapping single HSC activity and gene expression are
described in Chapter 4.
To identify differentially expressed genes, we only considered genes with more than 3 UMIs in
more than 5% of cells. P-values were calculated using a one-tailed Wilcoxon test. P-values from
replicate mice were combined using Fisher’s method. Genes with an adjusted P-value < 0.05 were
considered differentially expressed genes (DEGs). DEG lists were submitted to the Database for
Annotation, Visualization and Integrated Discovery (DAVID) v6.8 for enrichment analysis of
the significantly overrepresented GO terms (Huang et al., 2009a, 2009b).
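Two of the steps above, the expression filter and the combination of P-values across replicate mice, can be sketched with the standard library alone. The sketch is illustrative: the function names and input formats are hypothetical, the Wilcoxon test and the multiple-testing adjustment are not implemented, and input P-values are assumed to be strictly greater than zero. Fisher's method uses the statistic X = -2 Σ ln p, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. For an even number of degrees of freedom, the tail probability has the closed form used below.

```python
import math


def passes_expression_filter(umi_counts, min_umi=3, min_fraction=0.05):
    """Keep a gene if more than min_fraction of cells have more than min_umi UMIs.

    umi_counts: UMI count of one gene in each cell (hypothetical input format).
    """
    expressing = sum(1 for count in umi_counts if count > min_umi)
    return expressing > min_fraction * len(umi_counts)


def fisher_combine(p_values):
    """Combine independent P-values (each > 0) using Fisher's method."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2, where X = -2 * sum(ln p)
    # Chi-square survival function with 2k degrees of freedom:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total
```

With a single replicate, the combined value equals the input P-value, which provides a quick sanity check.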
Procedures associated with leukemia project
Human B-cell acute lymphoblastic leukemia (B-ALL) cells were sorted for human CD45 and
CD19 from cryopreserved clinical specimens obtained from adult patients. These cells were
barcoded as described in Chapter 2 with modifications. Briefly, cells were cultured in
StemSpan™ Serum-Free Expansion Medium II (SFEM II) (Stem Cell Technologies) in the
presence of 20 ng/ml human FLT-3 ligand (Gibco by Life Technologies), 20 ng/ml human
Interleukin-3 (IL-3) (Gibco by Life Technologies) and 50 ng/ml human Stem Cell Factor (SCF)
(Gibco by Life Technologies); after 24 hours of pre-stimulation under the culture condition, cells
were washed and incubated for another 16 hours in the same medium with addition of lentivirus
carrying the DNA barcodes. 8 ng/µl polybrene was added into the culture to facilitate viral
transduction. B-ALL cells were washed three times with Dulbecco’s Phosphate Buffered Saline
(D-PBS) (Gibco by Life Technologies) prior to transplantation into NOD.Cg-Prkdc<scid> Il2rg<tm1Wjl>
(NSG, JAX stock number 005557) or NOD.Cg-Prkdc<scid> Il2rg<tm1Wjl>
Tg(CMV-IL3,CSF2,KITLG)1Eav/MloySzJ (NSG-SGM3, JAX stock number 013062) mice, which were
irradiated with 150 cGy and transplanted with 100,000 to 200,000 human B-ALL cells via tail
vein injection. Mice were monitored daily for evidence of distress and were euthanized when
human chimerism exceeded 90% of total MNC. At the endpoint, mouse peripheral blood was
collected via perfusion using D-PBS with 10 mM Ethylenediaminetetraacetic acid (EDTA)
(Sigma-Aldrich). Spleen, bones, and tissues with noticeable extramedullary expansion were
collected.
To compare responses to different chemotherapy treatments, once human leukemia cell
contribution reached 20-40% of total MNC, mice were randomized and placed into one of the
five chemotherapy groups. The five groups were: (i) vehicle (V) control, (ii) short-term intensive
(I) therapy, (iii) short-term intensive therapy followed by long-term maintenance (IM) therapy,
(iv) prolonged intensive (pI) therapy, and (v) maintenance (M) therapy. The chemotherapy
treatment consisted of an intensive and a maintenance phase. The intensive therapy phase
consisted of Vincristine (Hospira Pharmaceuticals) (0.25mg/kg) administered weekly via
intravenous (IV) injection, Dexamethasone (AuroMedics Pharma) (7.5mg/kg) administered
Monday, Wednesday, Friday via intraperitoneal (IP) injection, and L-asparaginase (Sigma-Tau
Pharmaceuticals) (100 IU/kg) administered bi-weekly via IP injection. Maintenance therapy
phase consisted of weekly Methotrexate (Accord Healthcare) (5mg/kg) administered via IV or
intramuscular injection. The groups were treated as follows: ‘V’ group received weekly IV
injections of Bacteriostatic Water (Hospira Pharmaceuticals) during the life span of the mice.
The short-term ‘I’ group only received four weeks of intensive therapy. The ‘IM’ group received
four weeks of the intensive therapy and was followed with maintenance therapy. The ‘pI’ group
was treated with an intensive phase for as long as the mice tolerated high dose treatment.
Toxicity was assessed based on body weight change. The range of treatment was 7 weeks for
ALL04 and ALL20 and 10 weeks for ALL06. The ‘M’ group was treated with maintenance
therapy continuously during the life span of the mice. The range was based on patient sample (5
weeks for ALL20, 14 weeks for ALL04 and 28 weeks for ALL06). Leukemia progression was
monitored throughout the duration of treatment by analyzing the peripheral blood. In addition,
mouse weight was monitored weekly throughout treatment to assess signs of therapy toxicity. If
body weight dropped more than 20% of the starting weight, chemotherapy doses were adjusted.
Clonal tracking barcode extraction and analysis are described in Chapter 2. We further
combined sequencing data with FACS data to calculate the clonal abundance for each clone as
follows:
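$$\text{Clonal abundance \%} = 100 \times \left[\frac{\#\text{ of reads for each barcode}}{\text{total reads for all barcodes}}\right] \left[\frac{\#\text{ of human cells}}{\text{total MNC}}\right] \left[\frac{\#\text{ of GFP cells}}{\text{total human cells}}\right]$$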
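The clonal abundance calculation multiplies three fractions: the share of sequencing reads belonging to a barcode, the share of human cells among total MNC, and the share of GFP cells among human cells. A minimal sketch is given below. The function and parameter names are hypothetical, and the numbers in the example are made up for illustration.

```python
def clonal_abundance_percent(barcode_reads, total_barcode_reads,
                             human_cells, total_mnc, gfp_cells):
    """Clonal abundance (%) of one barcode, combining sequencing and FACS data."""
    read_fraction = barcode_reads / total_barcode_reads  # reads for this barcode / all barcode reads
    human_fraction = human_cells / total_mnc             # human cells / total MNC
    gfp_fraction = gfp_cells / human_cells               # GFP cells / total human cells
    return 100.0 * read_fraction * human_fraction * gfp_fraction


# Made-up example: a barcode with 5% of reads, 80% human chimerism, 50% GFP among human cells.
example = clonal_abundance_percent(500, 10_000, 8_000, 10_000, 4_000)
```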
Figures and Tables
Fig. 6.1 Identifying genes expressed differently between HSCs aged differently
Examples of significantly differentially expressed genes between the HSC clones aged in a young
niche from early (n=3) and delayed (n=3) aging mice. The horizontal bar denotes the mean across all
clones; the white dot denotes the median. For the analysis shown here, the upstream mouse experiment
and HSC collection were performed by Anna Nogalska (nogalska@usc.edu).
Fig. 6.2 HSCs aged in young and old niches exhibited different gene ontology signatures
Differentially expressed genes between the HSC clones aged in young (n=3) versus old (n=2)
niches from early aging mice. Genes associated with the most significantly enriched Gene
Ontology (GO) terms (P<0.01) were highlighted in colors. For the analysis shown here, the
upstream mouse experiment and HSC collection were performed by Anna Nogalska
(nogalska@usc.edu).
Fig. 6.3 Identifying differentially expressed genes between blood and legs
a. Comparison of clonal abundance between the peripheral blood and the leg BM. Markers of the
same shape represent data from one mouse. 99% confidence intervals were determined by the
blood and spleen comparison and highlighted by dashed lines. b. Genes significantly
differentially expressed in clones more abundant in the BM compared to clones more abundant
in the blood. For the analysis shown here, the mouse experiment, leukemia cell collection, and
clonal abundance comparison were performed by Humberto Contreras-Trujillo
(hcontrer@usc.edu).
Fig. 6.4 Identifying differentially expressed genes between blood and ovary
a. Clonal abundance was assessed in different tissues and organs, and mapped to single-cell RNA
sequencing data. b. Comparison of clonal abundance between the peripheral blood and the ovary.
Graphs are described as in Fig. 6.3. c. The CMC2 gene was significantly upregulated in clones
more abundant in the ovary than in clones more abundant in the blood. Black bar indicates the
mean, and white dot represents the median. For the analysis shown here, the mouse experiment,
leukemia cell collection, and clonal abundance comparison were performed by Humberto
Contreras-Trujillo (hcontrer@usc.edu).
Fig. 6.5 Identifying genes associated with different clonal response to chemotherapy
Representative clonal dynamics during combination therapy in two PDX mice. Comparing
chemotherapy responses of the same clones in different tissues led to the identification of clones
that responded differently under various chemotherapy treatments. (a, left) ALL04 clones that
responded better to combination therapy than to intensive therapy. (b, left) ALL04 clones that
responded better to maintenance therapy than to intensive therapy. (c, left) ALL20 clones that
responded better to intensive therapy than to maintenance therapy. (a-c, right) Genes
significantly differentially expressed in the clones shown on the left (blue) compared to all other
clones from the same patient samples (orange). Black bar indicates the mean, and white dot
represents the median. For the analysis shown here, the mouse experiment, leukemia cell
collection, and clonal abundance comparison were performed by Humberto Contreras-Trujillo
(hcontrer@usc.edu).
Supplementary Table 6.1 Summary of previous studies
Summary of previous studies that showed relevant functions for the genes identified in our study.
The information shown here was collected with assistance from Anna Nogalska
(nogalska@usc.edu).
References
Akashi, K., Traver, D., Miyamoto, T., & Weissman, I. L. (2000). A clonogenic common myeloid
progenitor that gives rise to all myeloid lineages. Nature, 404(6774), 193–197.
https://doi.org/10.1038/35004599
Baum, C. M., Weissman, I. L., Tsukamoto, A. S., Buckle, A. M., & Peault, B. (1992). Isolation
of a candidate human hematopoietic stem-cell population. Proceedings of the National
Academy of Sciences of the United States of America, 89(7), 2804–2808.
Becker, A. J., McCulloch, E. A., & Till, J. E. (1963). Cytological Demonstration of the
Clonal Nature of Spleen Colonies Derived from Transplanted Mouse Marrow Cells.
Nature, 197(4866), 452–454. https://doi.org/10.1038/197452a0
Beerman, I., Bhattacharya, D., Zandi, S., Sigvardsson, M., Weissman, I. L., Bryder, D., & Rossi,
D. J. (2010). Functionally distinct hematopoietic stem cells modulate hematopoietic
lineage potential during aging by a mechanism of clonal expansion. Proceedings of the
National Academy of Sciences, 107(12), 5465–5470.
Beerman, I., Maloney, W. J., Weissmann, I. L., & Rossi, D. J. (2010). Stem cells and the aging
hematopoietic system. Current Opinion in Immunology, 22(4), 500–506.
https://doi.org/10.1016/j.coi.2010.06.007
Bhattacharya, D., Czechowicz, A., Ooi, A. L., Rossi, D. J., Bryder, D., & Weissman, I. L.
(2009). Niche recycling through division-independent egress of hematopoietic stem cells.
Journal of Experimental Medicine, 206(12), 2837–2850.
Blake, W. J., Kærn, M., Cantor, C. R., & Collins, J. J. (2003). Noise in eukaryotic gene
expression. Nature, 422(6932), 633–637. https://doi.org/10.1038/nature01546
Brady, G., Barbara, M., & Iscove, N. N. (1990). Representative in Vitro cDNA Amplification
From Individual Hemopoietic Cells and Colonies. Methods in Molecular and Cellular
Biology, 2, 17–25.
Bramlett, C., Jiang, D., Nogalska, A., Eerdeng, J., Contreras, J., & Lu, R. (2020). Clonal tracking
using embedded viral barcoding and high-throughput sequencing. Nature Protocols,
15(4), 1436–1458. https://doi.org/10.1038/s41596-019-0290-z
Brewer, C., Chu, E., Chin, M., & Lu, R. (2016). Transplantation dose alters the differentiation
program of hematopoietic stem cells. Cell Reports, 15(8), 1848–1857.
Broxmeyer, H. E., Orschell, C. M., Clapp, D. W., Hangoc, G., Cooper, S., Plett, P. A., Liles, W.
C., Li, X., Graham-Evans, B., Campbell, T. B., Calandra, G., Bridger, G., Dale, D. C., &
Srour, E. F. (2005). Rapid mobilization of murine and human hematopoietic stem and
progenitor cells with AMD3100, a CXCR4 antagonist. The Journal of Experimental
Medicine, 201(8), 1307–1318. https://doi.org/10.1084/jem.20041385
Bryder, D., Rossi, D. J., & Weissman, I. L. (2006). Hematopoietic stem cells: The paradigmatic
tissue-specific stem cell. The American Journal of Pathology, 169(2), 338–346.
https://doi.org/10.2353/ajpath.2006.060312
Busch, K., Klapproth, K., Barile, M., Flossdorf, M., Holland-Letz, T., Schlenner, S. M., Reth,
M., Höfer, T., & Rodewald, H.-R. (2015). Fundamental properties of unperturbed
haematopoiesis from stem cells in vivo. Nature, 518(7540), 542–546.
https://doi.org/10.1038/nature14242
Bystrykh, L. V., & Belderbos, M. E. (2016). Clonal Analysis of Cells with Cellular Barcoding:
When Numbers and Sizes Matter. In K. Turksen (Ed.), Stem Cell Heterogeneity: Methods
and Protocols (pp. 57–89). Springer New York. https://doi.org/10.1007/7651_2016_343
Bystrykh, L. V., de Haan, G., & Verovskaya, E. (2014). Barcoded Vector Libraries and
Retroviral or Lentiviral Barcoding of Hematopoietic Stem Cells. In K. D. Bunting &
C.-K. Qu (Eds.), Hematopoietic Stem Cell Protocols (pp. 345–360). Springer New York.
https://doi.org/10.1007/978-1-4939-1133-2_23
Bystrykh, L. V., Verovskaya, E., Zwart, E., Broekhuis, M., & de Haan, G. (2012). Counting stem
cells: Methodological constraints. Nature Methods, 9, 567.
Cabezas-Wallscheid, N., Klimmeck, D., Hansson, J., Lipka, D. B., Reyes, A., Wang, Q.,
Weichenhan, D., Lier, A., von Paleske, L., Renders, S., Wünsche, P., Zeisberger, P.,
Brocks, D., Gu, L., Herrmann, C., Haas, S., Essers, M. A. G., Brors, B., Eils, R., …
Trumpp, A. (2014). Identification of Regulatory Networks in HSCs and Their Immediate
Progeny via Integrated Proteome, Transcriptome, and DNA Methylome Analysis. Cell
Stem Cell, 15(4), 507–522. https://doi.org/10.1016/j.stem.2014.07.005
Cartier, N., Hacein-Bey-Abina, S., Bartholomae, C. C., Veres, G., Schmidt, M., Kutschera, I.,
Vidaud, M., Abel, U., Dal-Cortivo, L., & Caccavelli, L. (2009). Hematopoietic stem cell
gene therapy with a lentiviral vector in X-linked adrenoleukodystrophy. Science,
326(5954), 818–823.
Chapal-Ilani, N., Maruvka, Y. E., Spiro, A., Reizel, Y., Adar, R., Shlush, L. I., & Shapiro, E.
(2013). Comparing Algorithms That Reconstruct Cell Lineage Trees Utilizing
Information on Microsatellite Mutations. PLOS Computational Biology, 9(11), e1003297.
https://doi.org/10.1371/journal.pcbi.1003297
Chen, J. Y., Miyanishi, M., Wang, S. K., Yamazaki, S., Sinha, R., Kao, K. S., Seita, J., Sahoo,
D., Nakauchi, H., & Weissman, I. L. (2016). Hoxb5 marks long-term haematopoietic
stem cells and reveals a homogenous perivascular niche. Nature, 530(7589), 223–227.
https://doi.org/10.1038/nature16943
Chen, J.-G., Yang, C.-P. H., Cammer, M., & Horwitz, S. B. (2003). Gene expression and mitotic
exit induced by microtubule-stabilizing drugs. Cancer Research, 63(22), 7891–7899.
Cheung, A. M. S., Nguyen, L. V., Carles, A., Beer, P., Miller, P. H., Knapp, D. J. H. F., Dhillon,
K., Hirst, M., & Eaves, C. J. (2013). Analysis of the clonal growth and differentiation
dynamics of primitive barcoded human cord blood cells in NSG mice. Blood, 122(18),
3129–3137. https://doi.org/10.1182/blood-2013-06-508432
Coffman, R. L., & Weissman, I. L. (1981). B220: A B cell-specific member of the T200
glycoprotein family. Nature, 289(5799), 681–683. https://doi.org/10.1038/289681a0
Collier, T. J., & Coleman, P. D. (1991). Divergence of biological and chronological aging:
Evidence from rodent studies. Neurobiology of Aging, 12(6), 685–693.
https://doi.org/10.1016/0197-4580(91)90122-Z
Cornils, K., Thielecke, L., Hüser, S., Forgber, M., Thomaschewski, M., Kleist, N., Hussein, K.,
Riecken, K., Volz, T., Gerdes, S., Glauche, I., Dahl, A., Dandri, M., Roeder, I., & Fehse,
B. (2014). Multiplexing clonality: Combining RGB marking and genetic barcoding.
Nucleic Acids Research, 42(7), e56. https://doi.org/10.1093/nar/gku081
Czechowicz, A., Kraft, D., Weissman, I. L., & Bhattacharya, D. (2007). Efficient transplantation
via antibody-based clearance of hematopoietic stem cell niches. Science, 318(5854),
1296–1299.
Dagogo-Jack, I., & Shaw, A. T. (2018). Tumour heterogeneity and resistance to cancer therapies.
Nature Reviews. Clinical Oncology, 15(2), 81–94.
https://doi.org/10.1038/nrclinonc.2017.166
Darmanis, S., Sloan, S. A., Croote, D., Mignardi, M., Chernikova, S., Samghababi, P., Zhang,
Y., Neff, N., Kowarsky, M., Caneda, C., Li, G., Chang, S. D., Connolly, I. D., Li, Y.,
Barres, B. A., Gephart, M. H., & Quake, S. R. (2017). Single-Cell RNA-Seq Analysis of
Infiltrating Neoplastic Cells at the Migrating Front of Human Glioblastoma. Cell Reports,
21(5), 1399–1410. https://doi.org/10.1016/j.celrep.2017.10.030
Delgado, J., Bustos, J. G., Jimenez, M. C., Quevedo, E., & Hernandez-Navarro, F. (2002). Are
activation markers (CD25, CD38 and CD103) predictive of sensitivity to purine
analogues in patients with T-cell prolymphocytic leukemia and other lymphoproliferative
disorders? Leukemia & Lymphoma, 43(12), 2331–2334.
https://doi.org/10.1080/1042819021000040035
Dick, J. E., Magli, M. C., Huszar, D., Phillips, R. A., & Bernstein, A. (1985). Introduction of a
selectable gene into primitive stem cells capable of long-term reconstitution of the
hemopoietic system of W/Wv mice. Cell, 42(1), 71–79.
https://doi.org/10.1016/S0092-8674(85)80102-1
Dominici, M., Rasini, V., Bussolari, R., Chen, X., Hofmann, T. J., Spano, C., Bernabei, D.,
Veronesi, E., Bertoni, F., & Paolucci, P. (2009). Restoration and reversible expansion of
the osteoblastic hematopoietic stem cell niche after marrow radioablation. Blood,
114(11), 2333–2343.
Down, J. D., Boudewijn, A., Dillingh, J. H., Fox, B. W., & Ploemacher, R. E. (1994).
Relationships between ablation of distinct haematopoietic cell subsets and the
development of donor bone marrow engraftment following recipient pretreatment with
different alkylating drugs. British Journal of Cancer, 70(4), 611–616.
https://doi.org/10.1038/bjc.1994.359
Dykstra, B., Kent, D., Bowie, M., McCaffrey, L., Hamilton, M., Lyons, K., Lee, S.-J., Brinkman,
R., & Eaves, C. (2007). Long-term propagation of distinct hematopoietic differentiation
programs in vivo. Cell Stem Cell, 1(2), 218–229.
Eaves, C. J. (2015). Hematopoietic stem cells: Concepts, definitions, and the new reality. Blood,
125(17), 2605–2613. https://doi.org/10.1182/blood-2014-12-570200
Elowitz, M. B., Levine, A. J., Siggia, E. D., & Swain, P. S. (2002). Stochastic Gene Expression
in a Single Cell. Science, 297(5584), 1183–1186.
https://doi.org/10.1126/science.1070919
Ergen, A. V., Boles, N. C., & Goodell, M. A. (2012). Rantes/Ccl5 influences hematopoietic stem
cell subtypes and causes myeloid skewing. Blood, 119(11), 2500–2509.
https://doi.org/10.1182/blood-2011-11-391730
Ergen, A. V., & Goodell, M. A. (2010). Mechanisms of hematopoietic stem cell aging.
Experimental Gerontology, 45(4), 286–290.
https://doi.org/10.1016/j.exger.2009.12.010
Ezine, S., Weissman, I. L., & Rouse, R. V. (1984). Bone marrow cells give rise to distinct cell
clones within the thymus. Nature, 309(5969), 629–631. https://doi.org/10.1038/309629a0
Ezine, S., Weissman, I. L., & Rouse, R. V. (1985). Thymus homing clonogenic bone marrow
cells. Advances in Experimental Medicine and Biology, 186, 223–227.
https://doi.org/10.1007/978-1-4613-2463-8_27
Feldhahn, N., Río, P., Soh, B. N. B., Liedtke, S., Sprangers, M., Klein, F., Wernet, P., Jumaa, H.,
Hofmann, W.-K., Hanenberg, H., Rowley, J. D., & Müschen, M. (2005). Deficiency of
Bruton’s tyrosine kinase in B cell precursor leukemia cells. Proceedings of the National
Academy of Sciences of the United States of America, 102(37), 13266–13271.
https://doi.org/10.1073/pnas.0505196102
Frieda, K. L., Linton, J. M., Hormoz, S., Choi, J., Chow, K.-H. K., Singer, Z. S., Budde, M. W.,
Elowitz, M. B., & Cai, L. (2017). Synthetic recording and in situ readout of lineage
information in single cells. Nature, 541(7635), 107–111.
https://doi.org/10.1038/nature20777
Gawad, C., Koh, W., & Quake, S. R. (2014). Dissecting the clonal origins of childhood acute
lymphoblastic leukemia by single-cell genomics. Proceedings of the National Academy
of Sciences, 111(50), 17947–17952. https://doi.org/10.1073/pnas.1420822111
Gerrits, A., Dykstra, B., Kalmykowa, O. J., Klauke, K., Verovskaya, E., Broekhuis, M. J. C., de
Haan, G., & Bystrykh, L. V. (2010). Cellular barcoding tool for clonal analysis in the
hematopoietic system. Blood, 115(13), 2610–2618.
https://doi.org/10.1182/blood-2009-06-229757
Goebel, W. S., Pech, N. K., Meyers, J. L., Srour, E. F., Yoder, M. C., & Dinauer, M. C. (2004).
A murine model of antimetabolite-based, submyeloablative conditioning for bone
marrow transplantation: Biologic insights and potential applications. Experimental
Hematology, 32(12), 1255–1264. https://doi.org/10.1016/j.exphem.2004.08.007
Gonzalez-Murillo, A., Lozano, M. L., Montini, E., Bueren, J. A., & Guenechea, G. (2008).
Unaltered repopulation properties of mouse hematopoietic stem cells transduced with
lentiviral vectors. Blood, 112(8), 3138–3147.
https://doi.org/10.1182/blood-2008-03-142661
Grover, A., Sanjuan-Pla, A., Thongjuea, S., Carrelha, J., Giustacchini, A., Gambardella, A.,
Macaulay, I., Mancini, E., Luis, T. C., Mead, A., Jacobsen, S. E. W., & Nerlov, C.
(2016). Single-cell RNA sequencing reveals molecular and functional platelet bias of
aged haematopoietic stem cells. Nature Communications, 7(1), 1–12.
https://doi.org/10.1038/ncomms11075
Guernet, A., Mungamuri, S. K., Cartier, D., Sachidanandam, R., Jayaprakash, A., Adriouch, S.,
Vezain, M., Charbonnier, F., Rohkin, G., Coutant, S., Yao, S., Ainani, H., Alexandre, D.,
Tournier, I., Boyer, O., Aaronson, S. A., Anouar, Y., & Grumolato, L. (2016).
CRISPR-Barcoding for Intratumor Genetic Heterogeneity Modeling and Functional Analysis of
Oncogenic Driver Mutations. Molecular Cell, 63(3), 526–538.
https://doi.org/10.1016/j.molcel.2016.06.017
Haber, A. L., Biton, M., Rogel, N., Herbst, R. H., Shekhar, K., Smillie, C., Burgin, G., Delorey,
T. M., Howitt, M. R., Katz, Y., Tirosh, I., Beyaz, S., Dionne, D., Zhang, M.,
Raychowdhury, R., Garrett, W. S., Rozenblatt-Rosen, O., Shi, H. N., Yilmaz, O., …
Regev, A. (2017). A single-cell survey of the small intestinal epithelium. Nature,
551(7680), 333–339. https://doi.org/10.1038/nature24489
Häckel, E. (1868). Natürliche Schöpfungsgeschichte.
Hai, M., Adler, R. L., Bauer, T. R., Tuschong, L. M., Gu, Y.-C., Wu, X., & Hickstein, D. D.
(2008). Potential genotoxicity from integration sites in CLAD dogs treated successfully
with gammaretroviral vector-mediated gene therapy. Gene Therapy, 15(14), 1067–1071.
https://doi.org/10.1038/gt.2008.52
Han, X., Wang, R., Zhou, Y., Fei, L., Sun, H., Lai, S., Saadatpour, A., Zhou, Z., Chen, H., Ye,
F., Huang, D., Xu, Y., Huang, W., Jiang, M., Jiang, X., Mao, J., Chen, Y., Lu, C., Xie, J.,
… Guo, G. (2018). Mapping the Mouse Cell Atlas by Microwell-Seq. Cell, 172(5), 1091-
1107.e17. https://doi.org/10.1016/j.cell.2018.02.001
Han, X., Zhou, Z., Fei, L., Sun, H., Wang, R., Chen, Y., Chen, H., Wang, J., Tang, H., Ge, W.,
Zhou, Y., Ye, F., Jiang, M., Wu, J., Xiao, Y., Jia, X., Zhang, T., Ma, X., Zhang, Q., …
Guo, G. (2020). Construction of a human cell landscape at single-cell level. Nature, 1–9.
https://doi.org/10.1038/s41586-020-2157-4
Harkey, M. A., Kaul, R., Jacobs, M. A., Kurre, P., Bovee, D., Levy, R., & Blau, C. A. (2007).
Multiarm High-Throughput Integration Site Detection: Limitations of LAM-PCR
Technology and Optimization for Clonal Analysis. Stem Cells and Development, 16(3),
381–392. https://doi.org/10.1089/scd.2007.0015
Holmes, K. L., Langdon, W. Y., Fredrickson, T. N., Coffman, R. L., Hoffman, P. M., Hartley, J.
W., & Morse, H. C. (1986). Analysis of neoplasms induced by Cas-Br-M MuLV tumor
extracts. The Journal of Immunology, 137(2), 679–688.
Huang, D. W., Sherman, B. T., & Lempicki, R. A. (2009a). Bioinformatics enrichment tools:
Paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids
Research, 37(1), 1–13. https://doi.org/10.1093/nar/gkn923
Huang, D. W., Sherman, B. T., & Lempicki, R. A. (2009b). Systematic and integrative analysis
of large gene lists using DAVID bioinformatics resources. Nature Protocols, 4(1), 44–57.
https://doi.org/10.1038/nprot.2008.211
Hulett, H. R., Bonner, W. A., Barrett, J., & Herzenberg, L. A. (1969). Cell sorting: Automated
separation of mammalian cells as a function of intracellular fluorescence. Science (New
York, N.Y.), 166(3906), 747–749. https://doi.org/10.1126/science.166.3906.747
Ikuta, K., & Weissman, I. L. (1992). Evidence that hematopoietic stem cells express mouse c-kit
but do not depend on steel factor for their generation. Proceedings of the National
Academy of Sciences of the United States of America, 89(4), 1502–1506.
https://doi.org/10.1073/pnas.89.4.1502
Iscove, N. (1990). Haematopoiesis. Searching for stem cells. Nature, 347(6289), 126–127.
https://doi.org/10.1038/347126a0
Iscove, N. N., Messner, H., Till, J. E., & McCulloch, E. A. (1972). Human marrow cells forming
colonies in culture: Analysis by velocity sedimentation and suspension culture. Series
Haematologica (1968), 5(2), 37–49.
Islaih, M., Halstead, B. W., Kadura, I. A., Li, B., Reid-Hubbard, J. L., Flick, L., Altizer, J. L.,
Thom Deahl, J., Monteith, D. K., Newton, R. K., & Watson, D. E. (2005). Relationships
between genomic, cell cycle, and mutagenic responses of TK6 cells exposed to DNA
damaging chemicals. Mutation Research, 578(1–2), 100–116.
https://doi.org/10.1016/j.mrfmmm.2005.04.012
Jackson, J. T., Nasa, C., Shi, W., Huntington, N. D., Bogue, C. W., Alexander, W. S., &
McCormack, M. P. (2015). A crucial role for the homeodomain transcription factor Hhex
in lymphopoiesis. Blood, 125(5), 803–814.
https://doi.org/10.1182/blood-2014-06-579813
Jacobson, L. O., Simmons, E. L., Marks, E. K., & Eldredge, J. H. (1951). Recovery from
radiation injury. Science (New York, N.Y.), 113(2940), 510–511.
https://doi.org/10.1126/science.113.2940.510
Jaitin, D. A., Kenigsberg, E., Keren-Shaul, H., Elefant, N., Paul, F., Zaretsky, I., Mildner, A.,
Cohen, N., Jung, S., Tanay, A., & Amit, I. (2014). Massively Parallel Single-Cell RNA-
Seq for Marker-Free Decomposition of Tissues into Cell Types. Science, 343(6172),
776–779. https://doi.org/10.1126/science.1247651
Jorfi, S., Ansa-Addo, E. A., Kholia, S., Stratton, D., Valley, S., Lange, S., & Inal, J. (2015).
Inhibition of microvesiculation sensitizes prostate cancer cells to chemotherapy and
reduces docetaxel dose required to limit tumor growth in vivo. Scientific Reports, 5.
https://doi.org/10.1038/srep13006
Kalhor, R., Kalhor, K., Mejia, L., Leeper, K., Graveline, A., Mali, P., & Church, G. M. (2018).
Developmental barcoding of whole mouse via homing CRISPR. Science, 361(6405).
https://doi.org/10.1126/science.aat9804
Kawabata, K. C., Hayashi, Y., Inoue, D., Meguro, H., Sakurai, H., Fukuyama, T., Tanaka, Y.,
Asada, S., Fukushima, T., Nagase, R., Takeda, R., Harada, Y., Kitaura, J., Goyama, S.,
Harada, H., Aburatani, H., & Kitamura, T. (2018). High expression of ABCG2 induced
by EZH2 disruption has pivotal roles in MDS pathogenesis. Leukemia, 32(2), 419–428.
https://doi.org/10.1038/leu.2017.227
Kebschull, J. M., & Zador, A. M. (2018). Cellular barcoding: Lineage tracing, screening and
beyond. Nature Methods, 15(11), 871–879. https://doi.org/10.1038/s41592-018-0185-x
Keller, G., Paige, C., Gilboa, E., & Wagner, E. F. (1985). Expression of a foreign gene in
myeloid and lymphoid cells derived from multipotent haematopoietic precursors. Nature,
318(6042), 149–154. https://doi.org/10.1038/318149a0
Kersseboom, R., Middendorp, S., Dingjan, G. M., Dahlenborg, K., Reth, M., Jumaa, H., &
Hendriks, R. W. (2003). Bruton’s Tyrosine Kinase Cooperates with the B Cell Linker
Protein SLP-65 as a Tumor Suppressor in Pre-B Cells. The Journal of Experimental
Medicine, 198(1), 91–98. https://doi.org/10.1084/jem.20030615
Kiel, M. J., Yilmaz, Ö. H., Iwashita, T., Yilmaz, O. H., Terhorst, C., & Morrison, S. J. (2005).
SLAM Family Receptors Distinguish Hematopoietic Stem and Progenitor Cells and
Reveal Endothelial Niches for Stem Cells. Cell, 121(7), 1109–1121.
https://doi.org/10.1016/j.cell.2005.05.026
Kiem, H.-P., Sellers, S., Thomasson, B., Morris, J. C., Tisdale, J. F., Horn, P. A., Hematti, P.,
Adler, R., Kuramoto, K., Calmels, B., Bonifacino, A., Hu, J., von Kalle, C., Schmidt, M.,
Sorrentino, B., Nienhuis, A., Blau, C. A., Andrews, R. G., Donahue, R. E., & Dunbar, C.
E. (2004). Long-Term Clinical and Molecular Follow-up of Large Animals Receiving
Retrovirally Transduced Stem and Progenitor Cells: No Progression to Clonal
Hematopoiesis or Leukemia. Molecular Therapy, 9(3), 389–395.
https://doi.org/10.1016/j.ymthe.2003.12.006
Kivioja, T., Vähärautio, A., Karlsson, K., Bonke, M., Enge, M., Linnarsson, S., & Taipale, J.
(2012). Counting absolute numbers of molecules using unique molecular identifiers.
Nature Methods, 9(1), 72–74. https://doi.org/10.1038/nmeth.1778
Klein, A. M., Mazutis, L., Akartuna, I., Tallapragada, N., Veres, A., Li, V., Peshkin, L., Weitz,
D. A., & Kirschner, M. W. (2015). Droplet Barcoding for Single-Cell Transcriptomics
Applied to Embryonic Stem Cells. Cell, 161(5), 1187–1201.
https://doi.org/10.1016/j.cell.2015.04.044
Köhler, G., & Milstein, C. (1975). Continuous cultures of fused cells secreting antibody of
predefined specificity. Nature, 256(5517), 495–497. https://doi.org/10.1038/256495a0
Kohn, D. B., Weinberg, K. I., Nolta, J. A., Heiss, L. N., Lenarsky, C., Crooks, G. M., Hanley, M.
E., Annett, G., Brooks, J. S., El-Khoureiy, A., Lawrence, K., Wells, S., Moen, R. C.,
Bastian, J., Williams-Herman, D. E., Elder, M., Wara, D., Bowen, T., Hershfield, M. S.,
… Parkman, R. (1995). Engraftment of gene–modified umbilical cord blood cells in
neonates with adenosine deaminase deficiency. Nature Medicine, 1(10), 1017–1023.
https://doi.org/10.1038/nm1095-1017
Kondo, M., Weissman, I. L., & Akashi, K. (1997). Identification of Clonogenic Common
Lymphoid Progenitors in Mouse Bone Marrow. Cell, 91(5), 661–672.
https://doi.org/10.1016/S0092-8674(00)80453-5
Kreso, A., O’Brien, C. A., van Galen, P., Gan, O. I., Notta, F., Brown, A. M. K., Ng, K., Ma, J.,
Wienholds, E., Dunant, C., Pollett, A., Gallinger, S., McPherson, J., Mullighan, C. G.,
Shibata, D., & Dick, J. E. (2013). Variable clonal repopulation dynamics influence
chemotherapy response in colorectal cancer. Science (New York, N.Y.), 339(6119), 543–
548. https://doi.org/10.1126/science.1227670
Kurimoto, K., Yabuta, Y., Ohinata, Y., Ono, Y., Uno, K. D., Yamada, R. G., Ueda, H. R., &
Saitou, M. (2006). An improved single-cell cDNA amplification method for efficient
high-density oligonucleotide microarray analysis. Nucleic Acids Research, 34(5), e42–
e42. https://doi.org/10.1093/nar/gkl050
Kustikova, O., Fehse, B., Modlich, U., Yang, M., Düllmann, J., Kamino, K., Neuhoff, N. von,
Schlegelberger, B., Li, Z., & Baum, C. (2005). Clonal Dominance of Hematopoietic Stem
Cells Triggered by Retroviral Gene Marking. Science, 308(5725), 1171–1174.
https://doi.org/10.1126/science.1105063
Laurenti, E., & Göttgens, B. (2018). From haematopoietic stem cells to complex differentiation
landscapes. Nature, 553(7689), 418–426. https://doi.org/10.1038/nature25022
Lee-Six, H., Øbro, N. F., Shepherd, M. S., Grossmann, S., Dawson, K., Belmonte, M., Osborne,
R. J., Huntly, B. J. P., Martincorena, I., Anderson, E., O’Neill, L., Stratton, M. R.,
Laurenti, E., Green, A. R., Kent, D. G., & Campbell, P. J. (2018). Population dynamics of
normal human blood inferred from somatic mutations. Nature, 561(7724), 473–478.
https://doi.org/10.1038/s41586-018-0497-0
Lemischka, I. R., Raulet, D. H., & Mulligan, R. C. (1986). Developmental potential and dynamic
behavior of hematopoietic stem cells. Cell, 45(6), 917–927.
Lepault, F., & Weissman, I. L. (1981). An in vivo assay for thymus-homing bone marrow cells.
Nature, 293(5828), 151–154. https://doi.org/10.1038/293151a0
Livet, J., Weissman, T. A., Kang, H., Draft, R. W., Lu, J., Bennis, R. A., Sanes, J. R., &
Lichtman, J. W. (2007). Transgenic strategies for combinatorial expression of fluorescent
proteins in the nervous system. Nature, 450, 56.
Lorenz, E., Uphoff, D., Reid, T. R., & Shelton, E. (1951). Modification of irradiation injury in
mice and guinea pigs by bone marrow injections. Journal of the National Cancer
Institute, 12(1), 197–201.
Losick, R., & Desplan, C. (2008). Stochasticity and Cell Fate. Science (New York, N.Y.),
320(5872), 65–68. https://doi.org/10.1126/science.1147888
Lu, R., Czechowicz, A., Seita, J., Jiang, D., & Weissman, I. L. (2019). Clonal-level lineage
commitment pathways of hematopoietic stem cells in vivo. Proceedings of the National
Academy of Sciences of the United States of America, 116(4), 1447–1456.
https://doi.org/10.1073/pnas.1801480116
Lu, R., Neff, N. F., Quake, S. R., & Weissman, I. L. (2011). Tracking single hematopoietic stem
cells in vivo using high-throughput sequencing in conjunction with viral genetic
barcoding. Nature Biotechnology, 29(10), 928–933.
Lyne, A.-M., Kent, D. G., Laurenti, E., Cornils, K., Glauche, I., & Perié, L. (2018). A track of
the clones: New developments in cellular barcoding. Experimental Hematology, 68, 15–
20. https://doi.org/10.1016/j.exphem.2018.11.005
Macosko, E. Z., Basu, A., Satija, R., Nemesh, J., Shekhar, K., Goldman, M., Tirosh, I., Bialas, A.
R., Kamitaki, N., Martersteck, E. M., Trombetta, J. J., Weitz, D. A., Sanes, J. R., Shalek,
A. K., Regev, A., & McCarroll, S. A. (2015). Highly Parallel Genome-wide Expression
Profiling of Individual Cells Using Nanoliter Droplets. Cell, 161(5), 1202–1214.
https://doi.org/10.1016/j.cell.2015.05.002
Maetzig, T., Brugman, M. H., Bartels, S., Heinz, N., Kustikova, O. S., Modlich, U., Li, Z., Galla,
M., Schiedlmeier, B., Schambach, A., & Baum, C. (2011). Polyclonal fluctuation of
lentiviral vector–transduced and expanded murine hematopoietic stem cells. Blood,
117(11), 3053–3064. https://doi.org/10.1182/blood-2010-08-303222
McKenzie, J. L., Gan, O. I., Doedens, M., Wang, J. C. Y., & Dick, J. E. (2006). Individual stem
cells with highly variable proliferation and self-renewal properties comprise the human
hematopoietic stem cell compartment. Nature Immunology, 7, 1225.
Merino, D., Weber, T. S., Serrano, A., Vaillant, F., Liu, K., Pal, B., Di Stefano, L., Schreuder, J.,
Lin, D., Chen, Y., Asselin-Labat, M. L., Schumacher, T. N., Cameron, D., Smyth, G. K.,
Papenfuss, A. T., Lindeman, G. J., Visvader, J. E., & Naik, S. H. (2019). Barcoding
reveals complex clonal behavior in patient-derived xenografts of metastatic triple
negative breast cancer. Nature Communications, 10(1), 766.
https://doi.org/10.1038/s41467-019-08595-2
Mitnitski, A., Howlett, S. E., & Rockwood, K. (2017). Heterogeneity of Human Aging and Its
Assessment. The Journals of Gerontology. Series A, Biological Sciences and Medical
Sciences, 72(7), 877–884. https://doi.org/10.1093/gerona/glw089
Moignard, V., Macaulay, I. C., Swiers, G., Buettner, F., Schütte, J., Calero-Nieto, F. J., Kinston,
S., Joshi, A., Hannah, R., Theis, F. J., Jacobsen, S. E., de Bruijn, M. F., & Göttgens, B.
(2013). Characterization of transcriptional networks in blood stem and progenitor cells
using high-throughput single-cell gene expression analysis. Nature Cell Biology, 15(4),
363–372. https://doi.org/10.1038/ncb2709
Morrison, S. J., Wandycz, A. M., Hemmati, H. D., Wright, D. E., & Weissman, I. L. (1997).
Identification of a lineage of multipotent hematopoietic progenitors. Development,
124(10), 1929–1939.
Morrison, S. J., & Weissman, I. L. (1994). The long-term repopulating subset of
hematopoietic stem cells is deterministic and isolatable by phenotype. Immunity, 1(8),
661–673.
Müller-Sieburg, C. E., Cho, R. H., Thoman, M., Adkins, B., & Sieburg, H. B. (2002).
Deterministic regulation of hematopoietic stem cell self-renewal and differentiation.
Blood, 100(4), 1302–1309.
https://doi.org/10.1182/blood.V100.4.1302.h81602001302_1302_1309
Muller-Sieburg, C. E., Sieburg, H. B., Bernitz, J. M., & Cattarossi, G. (2012). Stem cell
heterogeneity: Implications for aging and regenerative medicine. Blood, 119(17), 3900–
3907.
Muller-Sieburg, C. E., Whitlock, C. A., & Weissman, I. L. (1986). Isolation of two early B
lymphocyte progenitors from mouse marrow: A committed Pre-Pre-B cell and a
clonogenic Thy-1lo hematopoietic stem cell. Cell, 44(4), 653–662.
https://doi.org/10.1016/0092-8674(86)90274-6
Muller-Sieburg, C., & Sieburg, H. B. (2008). Stem cell aging: Survival of the laziest? Cell Cycle,
7(24), 3798–3804. https://doi.org/10.4161/cc.7.24.7214
Naik, S. H., Perié, L., Swart, E., Gerlach, C., van Rooij, N., de Boer, R. J., & Schumacher, T. N.
(2013). Diverse and heritable lineage imprinting of early haematopoietic progenitors.
Nature, 496(7444), 229–232. https://doi.org/10.1038/nature12013
Naik, S. H., Schumacher, T. N., & Perié, L. (2014). Cellular barcoding: A technical appraisal.
Experimental Hematology, 42(8), 598–608.
https://doi.org/10.1016/j.exphem.2014.05.003
Nam, S., Chang, H. R., Jung, H. R., Gim, Y., Kim, N. Y., Grailhe, R., Seo, H. R., Park, H. S.,
Balch, C., Lee, J., Park, I., Jung, S. Y., Jeong, K.-C., Powis, G., Liang, H., Lee, E. S., Ro,
J., & Kim, Y. H. (2015). A pathway-based approach for identifying biomarkers of tumor
progression to trastuzumab-resistant breast cancer. Cancer Letters, 356(2 Pt B), 880–890.
https://doi.org/10.1016/j.canlet.2014.10.038
Nestorowa, S., Hamey, F. K., Pijuan Sala, B., Diamanti, E., Shepherd, M., Laurenti, E., Wilson,
N. K., Kent, D. G., & Göttgens, B. (2016). A single-cell resolution map of mouse
hematopoietic stem and progenitor cell differentiation. Blood, 128(8), e20-31.
https://doi.org/10.1182/blood-2016-05-716480
Newman, J. R. S., Ghaemmaghami, S., Ihmels, J., Breslow, D. K., Noble, M., DeRisi, J. L., &
Weissman, J. S. (2006). Single-cell proteomic analysis of S. cerevisiae reveals the
architecture of biological noise. Nature, 441(7095), 840–846.
https://doi.org/10.1038/nature04785
Nguyen, L. V., Cox, C. L., Eirew, P., Knapp, D. J. H. F., Pellacani, D., Kannan, N., Carles, A.,
Moksa, M., Balani, S., Shah, S., Hirst, M., Aparicio, S., & Eaves, C. J. (2014). DNA
barcoding reveals diverse growth kinetics of human breast tumour subclones in serially
passaged xenografts. Nature Communications, 5, 5871.
Nguyen, L. V., Makarem, M., Carles, A., Moksa, M., Kannan, N., Pandoh, P., Eirew, P., Osako,
T., Kardel, M., Cheung, A. M. S., Kennedy, W., Tse, K., Zeng, T., Zhao, Y., Humphries,
R. K., Aparicio, S., Eaves, C. J., & Hirst, M. (2014). Clonal analysis via barcoding
reveals diverse growth and differentiation of transplanted mouse and human mammary
stem cells. Cell Stem Cell, 14(2), 253–263. https://doi.org/10.1016/j.stem.2013.12.011
Nguyen, L. V., Pellacani, D., Lefort, S., Kannan, N., Osako, T., Makarem, M., Cox, C. L.,
Kennedy, W., Beer, P., Carles, A., Moksa, M., Bilenky, M., Balani, S., Babovic, S., Sun,
I., Rosin, M., Aparicio, S., Hirst, M., & Eaves, C. J. (2015). Barcoding reveals complex
clonal dynamics of de novo transformed human mammary cells. Nature, 528, 267.
Nguyen, L., Wang, Z., Chowdhury, A. Y., Chu, E., Eerdeng, J., Jiang, D., & Lu, R. (2018).
Functional compensation between hematopoietic stem cell clones in vivo. EMBO
Reports, 19(8). https://doi.org/10.15252/embr.201745702
Notta, F., Zandi, S., Takayama, N., Dobson, S., Gan, O. I., Wilson, G., Kaufmann, K. B.,
McLeod, J., Laurenti, E., Dunant, C. F., McPherson, J. D., Stein, L. D., Dror, Y., & Dick,
J. E. (2016). Distinct routes of lineage development reshape the human blood hierarchy
across ontogeny. Science, 351(6269). https://doi.org/10.1126/science.aab2116
Oguro, H., Ding, L., & Morrison, S. J. (2013). SLAM family markers resolve functionally
distinct subpopulations of hematopoietic stem cells and multipotent progenitors. Cell
Stem Cell, 13(1), 102–116. https://doi.org/10.1016/j.stem.2013.05.014
Oltz, E. M., Yancopoulos, G. D., Morrow, M. A., Rolink, A., Lee, G., Wong, F., Kaplan, K.,
Gillis, S., Melchers, F., & Alt, F. W. (1992). A novel regulatory myosin light chain gene
distinguishes pre-B cell subsets and is IL-7 inducible. The EMBO Journal, 11(7), 2759–
2767.
Orkin, S. H., & Zon, L. I. (2008). Hematopoiesis: An Evolving Paradigm for Stem Cell Biology.
Cell, 132(4), 631–644. https://doi.org/10.1016/j.cell.2008.01.025
Osawa, M., Hanada, K., Hamada, H., & Nakauchi, H. (1996). Long-Term Lymphohematopoietic
Reconstitution by a Single CD34-Low/Negative Hematopoietic Stem Cell. Science,
273(5272), 242–245. https://doi.org/10.1126/science.273.5272.242
Osorio, F. G., Rosendahl Huber, A., Oka, R., Verheul, M., Patel, S. H., Hasaart, K., de la
Fonteijne, L., Varela, I., Camargo, F. D., & van Boxtel, R. (2018). Somatic Mutations
Reveal Lineage Relationships and Age-Related Mutagenesis in Human Hematopoiesis.
Cell Reports, 25(9), 2308-2316.e4. https://doi.org/10.1016/j.celrep.2018.11.014
Pang, W. W., Schrier, S. L., & Weissman, I. L. (2017). Age-associated changes in human
hematopoietic stem cells. Seminars in Hematology, 54(1), 39–42.
https://doi.org/10.1053/j.seminhematol.2016.10.004
Pei, W., Feyerabend, T. B., Rössler, J., Wang, X., Postrach, D., Busch, K., Rode, I., Klapproth,
K., Dietlein, N., Quedenau, C., Chen, W., Sauer, S., Wolf, S., Höfer, T., & Rodewald, H.-
R. (2017). Polylox barcoding reveals haematopoietic stem cell fates realized in vivo.
Nature, 548(7668), 456–460. https://doi.org/10.1038/nature23653
Peng, H., Kasada, A., Ueno, M., Hoshii, T., Tadokoro, Y., Nomura, N., Ito, C., Takase, Y., Vu,
H. T., Kobayashi, M., Xiao, B., Worley, P. F., & Hirao, A. (2018). Distinct roles of Rheb
and Raptor in activating mTOR complex 1 for the self-renewal of hematopoietic stem
cells. Biochemical and Biophysical Research Communications, 495(1), 1129–1135.
https://doi.org/10.1016/j.bbrc.2017.11.140
Picelli, S., Björklund, Å. K., Faridani, O. R., Sagasser, S., Winberg, G., & Sandberg, R. (2013).
Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nature
Methods, 10(11), 1096–1098. https://doi.org/10.1038/nmeth.2639
Pietras, E. M., Reynaud, D., Kang, Y.-A., Carlin, D., Calero-Nieto, F. J., Leavitt, A. D., Stuart, J.
M., Göttgens, B., & Passegué, E. (2015). Functionally distinct subsets of lineage-biased
multipotent progenitors control blood production in normal and regenerative conditions.
Cell Stem Cell, 17(1), 35–46.
Purton, L. E., & Scadden, D. T. (2007). Limiting factors in murine hematopoietic stem cell
assays. Cell Stem Cell, 1(3), 263–270. https://doi.org/10.1016/j.stem.2007.08.016
Ramsköld, D., Luo, S., Wang, Y.-C., Li, R., Deng, Q., Faridani, O. R., Daniels, G. A.,
Khrebtukova, I., Loring, J. F., Laurent, L. C., Schroth, G. P., & Sandberg, R. (2012).
Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor
cells. Nature Biotechnology, 30(8), 777–782. https://doi.org/10.1038/nbt.2282
Raser, J. M., & O’Shea, E. K. (2004). Control of Stochasticity in Eukaryotic Gene Expression.
Science, 304(5678), 1811–1814. https://doi.org/10.1126/science.1098641
Regev, A., Teichmann, S. A., Lander, E. S., Amit, I., Benoist, C., Birney, E., Bodenmiller, B.,
Campbell, P., Carninci, P., Clatworthy, M., Clevers, H., Deplancke, B., Dunham, I.,
Eberwine, J., Eils, R., Enard, W., Farmer, A., Fugger, L., Göttgens, B., … Human Cell
Atlas Meeting Participants. (2017). The Human Cell Atlas. eLife, 6.
https://doi.org/10.7554/eLife.27041
Rios, A. C., Fu, N. Y., Lindeman, G. J., & Visvader, J. E. (2014). In situ identification of
bipotent stem cells in the mammary gland. Nature, 506, 322.
Rodriguez-Fraticelli, A. E., Wolock, S. L., Weinreb, C. S., Panero, R., Patel, S. H., Jankovic, M.,
Sun, J., Calogero, R. A., Klein, A. M., & Camargo, F. D. (2018). Clonal analysis of
lineage fate in native haematopoiesis. Nature, 553(7687), 212–216.
https://doi.org/10.1038/nature25168
Rosenberg, A. B., Roco, C. M., Muscat, R. A., Kuchina, A., Sample, P., Yao, Z., Graybuck, L.
T., Peeler, D. J., Mukherjee, S., Chen, W., Pun, S. H., Sellers, D. L., Tasic, B., & Seelig,
G. (2018). Single-cell profiling of the developing mouse brain and spinal cord with split-
pool barcoding. Science, 360(6385), 176–182. https://doi.org/10.1126/science.aam8999
Rüfer, A. W., & Sauer, B. (2002). Non-contact positions impose site selectivity on Cre
recombinase. Nucleic Acids Research, 30(13), 2764–2771.
Schepers, K., Swart, E., van Heijst, J. W. J., Gerlach, C., Castrucci, M., Sie, D., Heimerikx, M.,
Velds, A., Kerkhoven, R. M., Arens, R., & Schumacher, T. N. M. (2008). Dissecting T
cell lineage relationships by cellular barcoding. Journal of Experimental Medicine,
205(10), 2309–2318. https://doi.org/10.1084/jem.20072462
Schmidt, M., Carbonaro, D. A., Speckmann, C., Wissler, M., Bohnsack, J., Elder, M., Aronow,
B. J., Nolta, J. A., Kohn, D. B., & von Kalle, C. (2003). Clonality analysis after
retroviral-mediated gene transfer to CD34+ cells from the cord blood of ADA-deficient
SCID neonates. Nature Medicine, 9(4), 463–468. https://doi.org/10.1038/nm844
Schmidt, M., Schwarzwaelder, K., Bartholomae, C., Zaoui, K., Ball, C., Pilz, I., Braun, S.,
Glimm, H., & von Kalle, C. (2007). High-resolution insertion-site analysis by linear
amplification–mediated PCR (LAM-PCR). Nature Methods, 4, 1051.
Schmidt, M., Zickler, P., Hoffmann, G., Haas, S., Wissler, M., Muessig, A., Tisdale, J. F.,
Kuramoto, K., Andrews, R. G., Wu, T., Kiem, H.-P., Dunbar, C. E., & von Kalle, C.
(2002). Polyclonal long-term repopulating stem cell clones in a primate model. Blood,
100(8), 2737–2743. https://doi.org/10.1182/blood-2002-02-0407
Seita, J., & Weissman, I. L. (2010). Hematopoietic stem cell: Self-renewal versus differentiation.
Wiley Interdisciplinary Reviews. Systems Biology and Medicine, 2(6), 640–653.
https://doi.org/10.1002/wsbm.86
Shalek, A. K., Satija, R., Adiconis, X., Gertner, R. S., Gaublomme, J. T., Raychowdhury, R.,
Schwartz, S., Yosef, N., Malboeuf, C., Lu, D., Trombetta, J. J., Gennert, D., Gnirke, A.,
Goren, A., Hacohen, N., Levin, J. Z., Park, H., & Regev, A. (2013). Single-cell
transcriptomics reveals bimodality in expression and splicing in immune cells. Nature,
498(7453), 236–240. https://doi.org/10.1038/nature12172
Shalek, A. K., Satija, R., Shuga, J., Trombetta, J. J., Gennert, D., Lu, D., Chen, P., Gertner, R. S.,
Gaublomme, J. T., Yosef, N., Schwartz, S., Fowler, B., Weaver, S., Wang, J., Wang, X.,
Ding, R., Raychowdhury, R., Friedman, N., Hacohen, N., … Regev, A. (2014). Single-
cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature, 510(7505),
363–369. https://doi.org/10.1038/nature13437
Shalem, O., Sanjana, N. E., Hartenian, E., Shi, X., Scott, D. A., Mikkelsen, T. S., Heckl, D.,
Ebert, B. L., Root, D. E., Doench, J. G., & Zhang, F. (2014). Genome-Scale CRISPR-
Cas9 Knockout Screening in Human Cells. Science, 343(6166), 84.
https://doi.org/10.1126/science.1247005
Shen, F. W., Tung, J. S., & Boyse, E. A. (1986). Further definition of the Ly-5 system.
Immunogenetics, 24(3), 146–149. https://doi.org/10.1007/bf00364741
Shen, M. W., Arbab, M., Hsu, J. Y., Worstell, D., Culbertson, S. J., Krabbe, O., Cassa, C. A.,
Liu, D. R., Gifford, D. K., & Sherwood, R. I. (2018). Predictable and precise template-
free CRISPR editing of pathogenic variants. Nature, 563(7733), 646–651.
https://doi.org/10.1038/s41586-018-0686-x
Sieburg, H. B., Cho, R. H., Dykstra, B., Uchida, N., Eaves, C. J., & Muller-Sieburg, C. E.
(2006). The hematopoietic stem compartment consists of a limited number of discrete
stem cell subsets. Blood, 107(6), 2311–2316.
Spangrude, G. J., Heimfeld, S., & Weissman, I. L. (1988). Purification and characterization of
mouse hematopoietic stem cells. Science, 241(4861), 58–62.
https://doi.org/10.1126/science.2898810
Spangrude, G. J., Muller-Sieburg, C. E., Heimfeld, S., & Weissman, I. L. (1988). Two rare
populations of mouse Thy-1lo bone marrow cells repopulate the thymus. The Journal of
Experimental Medicine, 167(5), 1671–1683.
Stubbington, M. J. T., Rozenblatt-Rosen, O., Regev, A., & Teichmann, S. A. (2017). Single-cell
transcriptomics to explore the immune system in health and disease. Science,
358(6359), 58–63. https://doi.org/10.1126/science.aan6828
Sudo, K., Ema, H., Morita, Y., & Nakauchi, H. (2000). Age-Associated Characteristics of
Murine Hematopoietic Stem Cells. Journal of Experimental Medicine, 192(9), 1273–
1280. https://doi.org/10.1084/jem.192.9.1273
Sun, J., Ramos, A., Chapman, B., Johnnidis, J. B., Le, L., Ho, Y.-J., Klein, A., Hofmann, O., &
Camargo, F. D. (2014). Clonal dynamics of native haematopoiesis. Nature, 514(7522),
322–327.
Tang, F., Barbacioru, C., Wang, Y., Nordman, E., Lee, C., Xu, N., Wang, X., Bodeau, J., Tuch,
B. B., Siddiqui, A., Lao, K., & Surani, M. A. (2009). mRNA-Seq whole-transcriptome
analysis of a single cell. Nature Methods, 6(5), 377–382.
https://doi.org/10.1038/nmeth.1315
Thielecke, L., Aranyossy, T., Dahl, A., Tiwari, R., Roeder, I., Geiger, H., Fehse, B., Glauche, I.,
& Cornils, K. (2017). Limitations and challenges of genetic barcode quantification.
Scientific Reports, 7, 43249.
Thielecke, L., Cornils, K., & Glauche, I. (2019). GenBaRcode – a comprehensive R package for
genetic barcode analysis. BioRxiv, 696229. https://doi.org/10.1101/696229
Tijchon, E., van Emst, L., Yuniati, L., van Ingen Schenau, D., Havinga, J., Rouault, J.-P.,
Hoogerbrugge, P. M., van Leeuwen, F. N., & Scheijen, B. (2016). Tumor suppressors
BTG1 and BTG2 regulate early mouse B-cell development. Haematologica, 101(7),
e272-276. https://doi.org/10.3324/haematol.2015.139675
Till, J. E., McCulloch, E. A., & Siminovitch, L. (1964). A stochastic model of stem cell
proliferation, based on the growth of spleen colony-forming cells. Proceedings of the
National Academy of Sciences of the United States of America, 51(1), 29–36.
Uchida, N., Aguila, H. L., Fleming, W. H., Jerabek, L., & Weissman, I. L. (1994). Rapid and
sustained hematopoietic recovery in lethally irradiated mice transplanted with purified
Thy-1.1lo Lin-Sca-1+ hematopoietic stem cells. Blood, 83(12), 3758–3779.
van de Donk, N. W. C. J., Janmaat, M. L., Mutis, T., Lammerts van Bueren, J. J., Ahmadi, T.,
Sasser, A. K., Lokhorst, H. M., & Parren, P. W. H. I. (2016). Monoclonal antibodies
targeting CD38 in hematological malignancies and beyond. Immunological Reviews,
270(1), 95–112. https://doi.org/10.1111/imr.12389
Verovskaya, E., Broekhuis, M. J. C., Zwart, E., Ritsema, M., van Os, R., de Haan, G., &
Bystrykh, L. V. (2013). Heterogeneity of young and aged murine hematopoietic stem
cells revealed by quantitative clonal analysis using cellular barcoding. Blood, 122(4),
523–532. https://doi.org/10.1182/blood-2013-01-481135
Visser, J. W., Bauman, J. G., Mulder, A. H., Eliason, J. F., & de Leeuw, A. M. (1984). Isolation
of murine pluripotent hemopoietic stem cells. The Journal of Experimental Medicine,
159(6), 1576–1590. https://doi.org/10.1084/jem.159.6.1576
Waanders, E., Scheijen, B., van der Meer, L. T., van Reijmersdal, S. V., van Emst, L., Kroeze,
Y., Sonneveld, E., Hoogerbrugge, P. M., van Kessel, A. G., van Leeuwen, F. N., &
Kuiper, R. P. (2012). The origin and nature of tightly clustered BTG1 deletions in
precursor B-cell acute lymphoblastic leukemia support a model of multiclonal evolution.
PLoS Genetics, 8(2), e1002533. https://doi.org/10.1371/journal.pgen.1002533
Wang, G. P., Berry, C. C., Malani, N., Leboulch, P., Fischer, A., Hacein-Bey-Abina, S.,
Cavazzana-Calvo, M., & Bushman, F. D. (2010). Dynamics of gene-modified progenitor
cells analyzed by tracking retroviral integration sites in a human SCID-X1 gene therapy
trial. Blood, 115(22), 4356–4366. https://doi.org/10.1182/blood-2009-12-257352
Wang, T., Wei, J. J., Sabatini, D. M., & Lander, E. S. (2014). Genetic Screens in Human Cells
Using the CRISPR-Cas9 System. Science, 343(6166), 80.
https://doi.org/10.1126/science.1246981
Wang, Y., Probin, V., & Zhou, D. (2006). Cancer therapy-induced residual bone marrow injury-
Mechanisms of induction and implication for therapy. Current Cancer Therapy Reviews,
2(3), 271–279. https://doi.org/10.2174/157339406777934717
Warren, L., Bryder, D., Weissman, I. L., & Quake, S. R. (2006). Transcription factor profiling in
individual hematopoietic progenitors by digital RT-PCR. Proceedings of the National
Academy of Sciences of the United States of America, 103(47), 17807–17812.
https://doi.org/10.1073/pnas.0608512103
Wasserstrom, A., Adar, R., Shefer, G., Frumkin, D., Itzkovitz, S., Stern, T., Shur, I., Zangi, L.,
Kaplan, S., Harmelin, A., Reisner, Y., Benayahu, D., Tzahor, E., Segal, E., & Shapiro, E.
(2008). Reconstruction of Cell Lineage Trees in Mice. PLOS ONE, 3(4), e1939.
https://doi.org/10.1371/journal.pone.0001939
Weber, K., Thomaschewski, M., Benten, D., & Fehse, B. (2012). RGB marking with lentiviral
vectors for multicolor clonal cell tracking. Nature Protocols, 7, 839.
Weissman, I. L., & Shizuru, J. A. (2008). The origins of the identification and isolation of
hematopoietic stem cells, and their capability to induce donor-specific transplantation
tolerance and treat autoimmune diseases. Blood, 112(9), 3543–3553.
https://doi.org/10.1182/blood-2008-08-078220
Whitlock, C. A., & Witte, O. N. (1982). Long-term culture of B lymphocytes and their
precursors from murine bone marrow. Proceedings of the National Academy of Sciences
of the United States of America, 79(11), 3608–3612.
Wilson, A., Laurenti, E., Oser, G., van der Wath, R. C., Blanco-Bose, W., Jaworski, M., Offner,
S., Dunant, C. F., Eshkind, L., & Bockamp, E. (2008). Hematopoietic stem cells
reversibly switch from dormancy to self-renewal during homeostasis and repair. Cell,
135(6), 1118–1129.
Woodworth, M. B., Girskis, K. M., & Walsh, C. A. (2017). Building a lineage from single cells:
Genetic techniques for cell lineage tracking. Nature Reviews Genetics, 18, 230.
Wu, C., Espinoza, D. A., Koelle, S. J., Yang, D., Truitt, L., Schlums, H., Lafont, B. A.,
Davidson-Moncada, J. K., Lu, R., Kaur, A., Hammer, Q., Li, B., Panch, S., Allan, D. A.,
Donahue, R. E., Childs, R. W., Romagnani, C., Bryceson, Y. T., & Dunbar, C. E. (2018).
Clonal expansion and compartmentalized maintenance of rhesus macaque NK cell
subsets. Science Immunology, 3(29). https://doi.org/10.1126/sciimmunol.aat9781
Wu, C., Jares, A., Winkler, T., Xie, J., Larochelle, A., & Dunbar, C. E. (2011). Tracking
Retroviral-Integrated Clones with Modified Non-Restriction Enzyme LAM-PCR
Technology. Molecular Therapy, 19, S45. https://doi.org/10.1016/S1525-0016(16)36685-0
Wu, C., Jares, A., Winkler, T., Xie, J., Metais, J.-Y., & Dunbar, C. E. (2013). High efficiency
restriction enzyme-free linear amplification-mediated polymerase chain reaction
approach for tracking lentiviral integration sites does not abrogate retrieval bias. Human
Gene Therapy, 24(1), 38–47. https://doi.org/10.1089/hum.2012.082
Wu, C., Li, B., Lu, R., Koelle, S. J., Yang, Y., Jares, A., Krouse, A. E., Metzger, M., Liang, F.,
Loré, K., Wu, C. O., Donahue, R. E., Chen, I. S. Y., Weissman, I., & Dunbar, C. E.
(2014). Clonal tracking of rhesus macaque hematopoiesis highlights a distinct lineage
origin for natural killer cells. Cell Stem Cell, 14(4), 486–499.
https://doi.org/10.1016/j.stem.2014.01.020
Wu, M.-Y., Eldin, K. W., & Beaudet, A. L. (2008). Identification of chromatin remodeling genes
Arid4a and Arid4b as leukemia suppressor genes. Journal of the National Cancer
Institute, 100(17), 1247–1259. https://doi.org/10.1093/jnci/djn253
Yamamoto, R., Morita, Y., Ooehara, J., Hamanaka, S., Onodera, M., Rudolph, K. L., Ema, H., &
Nakauchi, H. (2013). Clonal analysis unveils self-renewing lineage-restricted progenitors
generated directly from hematopoietic stem cells. Cell, 154(5), 1112–1126.
Yashin, A. I., Arbeev, K. G., Arbeeva, L. S., Wu, D., Akushevich, I., Kovtun, M., Yashkin, A.,
Kulminski, A., Culminskaya, I., Stallard, E., Li, M., & Ukraintseva, S. V. (2016). How
the effects of aging and stresses of life are integrated in mortality rates: Insights for
genetic studies of human health and longevity. Biogerontology, 17(1), 89–107.
https://doi.org/10.1007/s10522-015-9594-8
Yu, V. W. C., Yusuf, R. Z., Oki, T., Wu, J., Saez, B., Wang, X., Cook, C., Baryawno, N., Ziller,
M. J., Lee, E., Gu, H., Meissner, A., Lin, C. P., Kharchenko, P. V., & Scadden, D. T.
(2016). Epigenetic Memory Underlies Cell-Autonomous Heterogeneous Behavior of
Hematopoietic Stem Cells. Cell, 167(5), 1310-1322.e17.
https://doi.org/10.1016/j.cell.2016.10.045
Zheng, G. X. Y., Terry, J. M., Belgrader, P., Ryvkin, P., Bent, Z. W., Wilson, R., Ziraldo, S. B.,
Wheeler, T. D., McDermott, G. P., Zhu, J., Gregory, M. T., Shuga, J., Montesclaros, L.,
Underwood, J. G., Masquelier, D. A., Nishimura, S. Y., Schnall-Levin, M., Wyatt, P. W.,
Hindson, C. M., … Bielas, J. H. (2017). Massively parallel digital transcriptional
profiling of single cells. Nature Communications, 8(1), 1–12.
https://doi.org/10.1038/ncomms14049
Zhou, S., Bonner, M. A., Wang, Y.-D., Rapp, S., De Ravin, S. S., Malech, H. L., & Sorrentino,
B. P. (2014). Quantitative Shearing Linear Amplification Polymerase Chain Reaction: An
Improved Method for Quantifying Lentiviral Vector Insertion Sites in Transplanted
Hematopoietic Cell Systems. Human Gene Therapy Methods, 26(1), 4–12.
https://doi.org/10.1089/hgtb.2014.122
Abstract
Stem cell heterogeneity plays important roles during development, aging, regeneration, and disease progression. However, its underlying mechanisms remain largely elusive. Here, we improved an embedded viral barcoding technology and combined it with droplet-based single-cell RNA sequencing to study hematopoietic stem cells (HSCs) in mice. By simultaneously measuring the transcriptomes and in vivo cellular activities of hundreds of individual HSCs, we show that intercellular variations in the expression levels of dozens of genes are significantly correlated with distinct activity levels of individual HSCs. Our data illustrate a novel approach for studying molecular regulatory mechanisms by quantitatively dissecting intercellular variations.
Asset Metadata
Creator
Jiang, Du (author)
Core Title
Dissecting the heterogeneity of mouse hematopoietic stem cells in vivo
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Development, Stem Cells and Regenerative Medicine
Publication Date
01/29/2021
Defense Date
05/06/2020
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
clonal tracking,hematopoietic stem cells,OAI-PMH Harvest,single-cell RNA-sequencing,Transplantation
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Ying, Qi-Long (committee chair), Lu, Rong (committee member), MacLean, Adam (committee member), McMahon, Andrew (committee member)
Creator Email
chinajiangdu@gmail.com,dujiang@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-353862
Unique identifier
UC11666283
Identifier
etd-JiangDu-8851.pdf (filename),usctheses-c89-353862 (legacy record id)
Legacy Identifier
etd-JiangDu-8851.pdf
Dmrecord
353862
Document Type
Dissertation
Rights
Jiang, Du
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA