Functional Analysis of a Prostate Cancer Risk Enhancer at 7p15.2
By
Zhifei Luo
A Thesis Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(Biochemistry and Molecular Biology)
August 2016
ACKNOWLEDGMENTS
I would like to express my deepest appreciation and thanks to my mentor, Dr. Peggy
Farnham, for her consistent help and guidance in my research and career, as well as for the
spirit of adventure and endless passion for research that she conveyed. I also highly appreciate
the respect she showed for all my naïve and audacious research ideas and her encouragement to
try nearly all of them.
I would also like to thank Dr. Zoltan Tokes, the advisor of the Biochemistry and
Molecular Biology Master’s Program and my committee member, who agreed to read my
dissertation at the last minute and brought me to this wonderful place. Thanks to my committee
member Dr. Michael Stallcup for his wise suggestions about this project and his kind help.
To all the Farnham lab members, thank you for the great help and support in my
science journey. In addition, I would like to thank all my classmates and friends for their help
over the last two years.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LISTS OF FIGURES AND TABLES
LIST OF ABBREVIATIONS
ABSTRACT
CHAPTER 1 INTRODUCTION
CHAPTER 2 DELETION OF E7P15 IN PROSTATE CELL LINES
2.1 The design of gRNA and plasmid construction
2.2 Cell culture
2.3 Transfection
2.4 Guide RNA Validation
2.5 Enrichment of transfected cells
2.6 Generation of single cell colonies
2.7 Detection of cells having enhancer deletions
2.8 Detection of the copy number of remaining chromosomes
2.9 Summary
CHAPTER 3 INVESTIGATION OF PROLIFERATION CHANGES IN E7P15 DELETED CELLS
3.1 Colony formation assay
3.2 Cell proliferation assay
CHAPTER 4 RNA-SEQ AND DATA ANALYSIS
4.1 RNA-seq library construction
4.2 DHT treated RNA-seq
4.3 Data analysis
4.3 Pathway analysis
CHAPTER 5 DISCUSSION
CHAPTER 6 FUTURE DIRECTIONS
REFERENCES
LISTS OF FIGURES AND TABLES
Figure 1: Chromatin signatures around the 7p15.2 enhancer in multiple prostate cell lines
Figure 2: Allele-specific effects of the risk SNP in an enhancer luciferase assay
Figure 3: CRISPR/Cas9 deletion region
Figure 4: qPCR results and the C4-2B karyotype information
Figure 5: Gel images of genotyping
Figure 6: Summary of the CRISPR pipeline
Figure 7: C4-2B colony formation assay results
Figure 8: RWPE-1 colony formation assay results
Figure 9: Cell proliferation assay results
Figure 10: PCA plots
Figure 11: Pathway analysis
Figure 12: Gene expression changes within 2 Mb
Figure 13: Circos plot of top 30 differentially expressed genes
Figure 14: CTCF peaks around the enhancer region
Table 1: Prostate cancer risk loci
Table 2: gRNA sequences and coordinates
Table 3: Oligonucleotides used to synthesize gRNAs
Table 4: Primers for genotyping
Table 5: qPCR primers
Table 6: Summary of genome editing results
Table 7: Lists of control cells and fully deleted cells
Table 8: Top 30 downregulated genes in RWPE-1
Table 9: Top 30 downregulated genes in C4-2B
Table 10: Top 30 downregulated genes in C4-2B after DHT treatment
Table 11: Number of differentially expressed genes in enhancer-deleted cells
LIST OF ABBREVIATIONS
PCa Prostate cancer
CRISPR Clustered regularly interspaced short palindromic repeats
GWAS Genome-wide association studies
SNP Single nucleotide polymorphism
LD Linkage disequilibrium
ChIP-seq Chromatin immunoprecipitation sequencing
DHT Dihydrotestosterone
eQTL Expression quantitative trait loci
AR Androgen receptor
PSA Prostate specific antigen
FACS Fluorescence-activated cell sorting
LDH Lactate dehydrogenase
IFN Interferon
FPKM Fragments per kilobase of transcript per million mapped reads
PCA Principal component analysis
SKY Spectral karyotyping
4C Circular chromosome conformation capture
CTCF CCCTC-binding factor
FISH Fluorescence in situ hybridization
CASFISH Cas9-mediated fluorescence in situ hybridization
PAC Puromycin N-acetyltransferase
ABSTRACT
Prostate cancer (PCa) is the second most commonly diagnosed cancer in men in the
United States and one of the most common types of cancer in men worldwide[5, 6]. To better
understand the gene regulation mechanisms underpinning prostate cancer progression, a PCa risk
locus was deleted using clustered regularly interspaced short palindromic repeats (CRISPR) in
two prostate cell lines[7], with and without an active enhancer mark encompassing the risk SNP
within the locus. RNA-seq, colony formation assays, and cell proliferation assays were
performed to investigate the transcriptomic and phenotypic changes. My results suggest that this
locus does affect prostate cancer susceptibility.
CHAPTER 1
INTRODUCTION
Other than skin cancer, prostate cancer is the most commonly diagnosed cancer in men in
the United States. Although the 5-year survival rate of PCa is 99%, PCa is still the
second-leading cause of cancer death in American men, after lung and bronchial cancer. Current
estimates are that 1 in 7 men will be diagnosed with PCa during their lifetime and 1 in 39 men
will die from the disease. The estimated numbers of newly diagnosed cases and deaths for 2016
are 180,890 and 26,120, respectively. However, it should be noted that the incidence and
mortality rates of PCa vary significantly among ethnic populations, with the highest rates in
African Americans, followed by Caucasian, Hispanic, and Asian populations in descending
order[5, 6]. This variation suggests that there may be a substantial genetic contribution to the
risk of developing prostate cancer.
To identify critical genetic variants (known as single nucleotide polymorphisms, or SNPs)
that increase the risk for PCa, several genome-wide association studies (GWAS) have been
performed based on different ethnic groups; in total, these studies have discovered 100 risk loci
with 83% of the risk-associated genomic regions being shared across different ethnic populations,
suggesting a common causal mechanism[1, 4, 8-10]. These 100 loci are distributed throughout
the genome, indicating that there are many different regions that contribute to risk for prostate
cancer (Table 1). Although the GWAS-identified SNPs point investigators to a certain genomic
Table 1: Prostate cancer risk loci
Locus  SNP  Normal allele  Risk allele  Risk allele frequency  Odds ratio for risk allele  Nearby genes
1p35 rs636291 G A 0.16 1.18 PEX14
1q21 rs1218582 A G 0.45 1.06 KCNN3
1q21 rs17599629 A G 0.22 1.08 GOLPH3L
1q32 rs4245739 A C 0.25 1.10 MDM4, PIK3C2B
1q32 rs1775148 T C 0.27 1.06 SLC41A1
2p11 rs10187424 A G 0.41 1.09 GGCX/VAMP8
2p15 rs721048 G A 0.19 1.15 EHBP1
2p21 rs1465618 G A 0.23 1.08 THADA
2p24 rs13385191 A G 0.56 1.15 C2orf43
2p25 rs11902236 G A 0.27 1.07 TAF1B:GRHL1
2p25 rs9287719 T C 0.46 1.06 NOL10
2q31 rs12621278 A G 0.06 1.33 ITGA6
2q37 rs2292884 A G 0.25 1.14 MLPH
2q37 rs3771570 G A 0.15 1.12 FARP2
3p11 rs2055109 T C 0.90 1.20 Unknown
3p12 rs2660753 C T 0.11 1.18 Unknown
3q13 rs7611694 A C 0.41 1.10 SIDT1
3q21 rs10934853 C A 0.28 1.12 EEFSEC
3q23 rs6763931 C T 0.45 1.04 ZBTB38
3q26 rs10936632 A C 0.48 1.11 CLDN11/SKIL
4q13 rs1894292 G A 0.48 1.10 AFM, RASSF6
4q13 rs10009409 C T 0.32 1.08 COX18
4q22 rs17021918 C T 0.34 1.11 PDLIM5
4q22 rs12500426 C A 0.46 1.08 PDLIM5
4q24 rs7679673 C A 0.45 1.10 TET2
5p12 rs2121875 T G 0.34 1.05 FGF10
5p15 rs2242652 G A 0.19 1.15 TERT
5p15 rs12653946 C T 0.44 1.26 IRX4
5q35 rs6869841 G A 0.21 1.07 FAM44B (BOD1)
6p21 rs130067 T G 0.21 1.05 CCHCR1
6p21 rs1983891 C T 0.41 1.15 FOXP4
6p21 rs3096702 G A 0.40 1.07 NOTCH4
6p21 rs2273669 A G 0.15 1.07 ARMC2, SESN1
6p21 rs115306967 C G 0.65 1.06 HLA-DRB6
6p22 rs115457135 G A 0.22 1.07 TRIM31
6p24 rs4713266 T C 0.52 1.06 NEDD9
6q14 rs9443189 A G 0.14 1.08 MYO6
6q22 rs339331 C T 0.63 1.22 RFX6
6q25 rs9364554 C T 0.29 1.17 SLC22A3
6q25 rs1933488 A G 0.41 1.12 RGS17
7p12 rs56232506 G A 0.45 1.06 TNS3
7p15 rs10486567 A G 0.77 1.35 JAZF1
7p21 rs12155172 G A 0.23 1.11 SP8
7q21 rs6465657 T C 0.46 1.12 LMTK2
8p21 rs2928679 C T 0.42 1.05 SLC25A37
8p21 rs1512268 G A 0.45 1.18 NKX3.1
8p21 rs11135910 G A 0.16 1.11 EBF2
8q24 rs1447295 C A 0.13 1.62 Unknown
8q24 rs6983267 T G 0.50 1.26 Unknown
8q24 rs16901979 C A 0.09 1.79 Unknown
8q24 rs10086908 T C 0.30 1.15 Unknown
8q24 rs12543663 A C 0.31 1.08 Unknown
8q24 rs620861 C T 0.39 1.11 Unknown
9p21 rs17694493 C G 0.14 1.08 CDKN2B-AS1
9q31 rs817826 T C 0.08 1.41 RAD23B–KLF4
9q33 rs1571801 C A 0.25 1.27 DAB2IP
10q11 rs10993994 C T 0.40 1.25 MSMB
10q11 rs76934034 C T 0.91 1.13 MARCH8
10q24 rs3850699 A G 0.29 1.10 TRIM8
10q26 rs4962416 T C 0.27 1.20 CTBP2
10q26 rs2252004 T G 0.77 1.16 Unknown
11p15 rs7127900 G A 0.20 1.22 Unknown
11q12 rs1938781 T C 0.30 1.16 FAM111A
11q13 rs7931342 G T 0.49 1.19 Unknown
11q22 rs11568818 A G 0.44 1.10 MMP7
11q23 rs11214775 A G 0.71 1.07 HTR3B
12q13 rs10875943 T C 0.31 1.07 TUBA1C/PRPH
12q13 rs902774 G A 0.15 1.17 KRT8
12q13 rs80130819 C A 0.91 1.14 RP1-228P16.4
12q24 rs1270884 G A 0.49 1.07 TBX5
13q22 rs9600079 G T 0.38 1.18 Unknown
14q22 rs8008270 G A 0.18 1.12 FERMT2
14q23 rs7153648 G C 0.06 1.11 SIX1
14q24 rs7141529 A G 0.50 1.09 RAD51L1
14q24 rs8014671 A G 0.59 1.06 TTC9
16q22 rs12051443 G A 0.34 1.06 PHLPP2
17p13 rs684232 A G 0.36 1.10 VPS53, FAM57A
17q12 rs4430796 G A 0.49 1.22 HNF1B
17q12 rs11649743 A G 0.80 1.28 HNF1B
17q21 rs7210100 A G 0.05 1.51 ZNF652
17q21 rs11650494 G A 0.08 1.15 SPOP, HOXB13
17q24 rs1859962 T G 0.46 1.20 Unknown
18q23 rs7241993 G A 0.30 1.09 SALL3
19q13 rs2735839 G A 0.15 1.20 KLK2/KLK3
19q13 rs8102476 T C 0.54 1.12 Unknown
19q13 rs11672691 G A 0.76 1.12 Unknown
19q13 rs103294 T C 0.24 1.28 LILRA3
20q13 rs2427345 G A 0.37 1.06 GATA5, CABLES2
20q13 rs6062509 A C 0.30 1.12 ZGPAT
20q13 rs12480328 C T 0.93 1.13 ADNP
21q22 rs1041449 A G 0.44 1.06 TMPRSS2
22q11 rs2238776 A G 0.80 1.08 TBX1
22q13 rs5759167 G T 0.47 1.16 BIK/TTLL1
Xp11 rs5945619 T C 0.36 1.19 NUDT11
Xp11 rs2807031 T C 0.18 1.07 XAGE3
Xp22 rs2405942 A G 0.21 1.14 SHROOM2
Xq12 rs5919432 A G 0.19 1.06 AR
Xq13 rs6625711 T A 0.41 1.04 SLC7A
Xq13 rs4844289 A G 0.39 1.04 NLGN3-BCYRN1
Prostate cancer risk loci. This list of independent prostate cancer risk loci is modified from
"The genetic epidemiology of prostate cancer and its clinical implications" [1] and "A
meta-analysis of 87,040 individuals identifies 23 new susceptibility loci for prostate cancer" [4].
region, these studies are not precise enough to determine the exact genetic changes that are
causal for prostate cancer risk. This is because a SNP on the GWAS array (known as an index
SNP) is simply a surrogate for many co-inherited SNPs in a given genomic region. Thus, one
cannot assume that the GWAS index SNP is the actual SNP linked to increased risk for prostate
cancer; any of the SNPs in high linkage disequilibrium (LD) with the index SNP are candidate
causal SNPs. This creates a large problem because there are often hundreds of SNPs in high LD
with the index SNPs; e.g. an analysis of 77 of the prostate cancer risk SNPs identified 727 SNPs
in high LD (r^2 > 0.5). Another difficulty in understanding the mechanisms that lead to
increased risk for prostate cancer is that the majority of the index SNPs and the correlated SNPs
fall within non-coding regions rather than coding regions. A SNP that falls within an exon could
cause nonsense mutations or missense mutations, leading to reduced or inactive protein function.
However, it is more challenging to understand the mechanism by which a SNP that falls within a
noncoding region could influence prostate cancer risk. One possibility is that the non-coding
SNPs alter transcription factor binding motifs and thus influence activity of regulatory elements.
To test this hypothesis, an R/Bioconductor software package called FunciSNP has been used to
characterize the prostate cancer risk SNPs[11]. FunciSNP integrates index SNP genomic
locations with chromatin biofeature annotations and 1000 Genomes genotyping data, with the
goal of identifying SNPs that are in high linkage disequilibrium with the index SNP and lie
within a functional element (e.g. exons, promoters, or enhancers). For my studies, the functional elements
of interest are enhancers. Of note, a recent comprehensive functional annotation of 77 of the
prostate cancer risk loci using FunciSNP identified 727 SNPs having an r^2 value > 0.5 and, of
these, 663 are in enhancer regions. Of particular interest is rs10486567, which is itself an index
SNP and is located in an enhancer in the third intron of the JAZF1 gene[3]. This SNP
is also listed as a prioritized functional candidate in a recent multiethnic prostate cancer
fine-mapping study[12].
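The filtering logic described above (keep SNPs with r^2 > 0.5 relative to the index SNP that also fall inside a functional element) can be sketched in a few lines of Python. This is only an illustration of the idea and is not FunciSNP's interface, which is an R package; the function names, the toy haplotypes, the SNP names, and the coordinates below are all hypothetical, and r^2 is computed here from phased 0/1 haplotypes using the standard definition r^2 = D^2 / (pA(1-pA)pB(1-pB)).

```python
from typing import List, Sequence, Tuple


def ld_r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic SNPs given phased haplotypes.

    Each vector holds one 0/1 allele per haplotype, in the same order.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                            # monomorphic site: treat LD as 0
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / denom


def in_any_interval(position: int, intervals: Sequence[Tuple[int, int]]) -> bool:
    """True if position lies inside any inclusive (start, end) interval."""
    return any(start <= position <= end for start, end in intervals)


def candidate_functional_snps(
    index_hap: Sequence[int],
    snps: Sequence[Tuple[str, int, Sequence[int]]],
    enhancers: Sequence[Tuple[int, int]],
    r2_threshold: float = 0.5,
) -> List[str]:
    """Names of SNPs in high LD with the index SNP that overlap an enhancer."""
    return [
        name
        for name, position, hap in snps
        if ld_r_squared(index_hap, hap) > r2_threshold
        and in_any_interval(position, enhancers)
    ]


# Hypothetical toy data: 8 haplotypes, one enhancer interval.
index_hap = [1, 1, 0, 0, 1, 0, 1, 0]
snps = [
    ("snp_a", 150, [1, 1, 0, 0, 1, 0, 1, 0]),  # perfectly correlated, inside enhancer
    ("snp_b", 160, [1, 0, 1, 0, 1, 0, 1, 0]),  # weakly correlated, inside enhancer
    ("snp_c", 900, [1, 1, 0, 0, 1, 0, 1, 0]),  # perfectly correlated, outside enhancer
]
enhancers = [(100, 200)]
selected = candidate_functional_snps(index_hap, snps, enhancers)
print(selected)
```

In this toy example only the SNP that passes both filters is retained, which mirrors how the 727 high-LD SNPs were narrowed to those lying in enhancer regions.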
As noted above, one possibility for how a non-coding SNP could influence prostate
cancer is that it might disrupt transcription factor binding in a regulatory element, causing
changes in expression of target genes. Interestingly, the risk-associated G allele (which is present
in 77% of the European population) at rs10486567 creates a recognition motif for the DNA
binding protein NKX3-1 (a tumor suppressor) whereas the non-risk A allele forms a motif for the
DNA binding protein FOXA1 (an oncogene) (Figure 2B). The region encompassing this SNP
has marks of an active enhancer (H3K27Ac, DHS, etc.) in several prostate cancer cell lines
(Figure 1) and has been shown to have enhancer activity in a luciferase reporter assay using
LNCaP prostate cancer cells, with the non-risk A allele having higher (1.39-fold) activity than
the risk-associated G allele (Figure 2). In addition, dihydrotestosterone (DHT) significantly
increased reporter activity when either the risk or non-risk SNP-containing fragment
was cloned upstream of a promoter[3].
Figure 1: Chromatin signatures around the 7p15.2 enhancer in multiple prostate cell
lines. A. Shown is the H3K27Ac ChIP-seq signal, a marker of active enhancers, at the 7p15.2
locus. The blue line indicates the position of the rs10486567 SNP. There is no enhancer
signal in normal prostate cell lines or in the PC3 or RWPE-2 cell lines. B. Published
ChIP-seq data indicate that there is a FoxA1 ChIP-seq peak in LNCaP cells and an androgen
receptor binding site (using data from 13 tumor and 7 normal cell lines). According to
DNase-seq data, this locus is not an open region in normal prostate cell lines (PrEC and
RWPE-1). The H3K4me1 and H3K4me3 data from LNCaP indicate that the locus is an
enhancer region rather than a promoter, because these modifications mark active enhancers
and active promoters, respectively.
Figure 2: Allele-specific effects of the risk SNP in an enhancer luciferase assay. A.
Shown are the results of a reporter assay in which the risk and non-risk SNP-containing
fragments were cloned upstream of the thymidine kinase minimal promoter. The enhancer
luciferase assay was performed using LNCaP cells with or without DHT treatment;
luciferase reporter activity was increased by DHT treatment.
The risk-associated G allele produced a lower level of luciferase expression, suggesting
weaker enhancer activity. B. Motif analysis of the region surrounding rs10486567, indicating
that the risk G allele will change the binding factor from NKX3-1 to AR-FOXA. Adapted from
Hazelett et al. [3].
As noted above, the rs10486567 SNP is located within an enhancer in the JAZF1 gene.
Because the JAZF1 gene encodes a strong tumor suppressor, one could hypothesize that JAZF1
is regulated by the rs10486567-containing enhancer located at 7p15.2 (identified throughout this
text as E7P15), with the risk allele lowering the activity of the enhancer, reducing the expression
of JAZF1, and leading to increased tumorigenicity. However, a recent prostate cancer expression
quantitative trait loci (eQTL) study using allele-specific expression patterns in association
analyses has suggested that E7P15 regulates the TAX1BP1 gene, which is located 196 kb from
the enhancer[13]. In addition, another paper investigated prostate cancer risk-associated
cis-regulatory modules by integrating sequencing data from 295 PCas with gene expression
datasets from 602 prostate tumor samples. rs10486567 was involved in one of the new gene
regulatory mechanisms identified, in which the risk-associated SNP interrupted the ternary
androgen receptor (AR)-FOXA1 and AR-HOXB13 complexes and competitive binding mechanisms.
According to their allele-specific eQTL analysis, which included TCGA prostate cancer data and
an in-house cohort of prostate cancers, rs10486567 is associated with expression of the HOXA13
gene, which is 783 kb away[14]. There is a cluster of HOXA genes near HOXA13; HOXA genes
encode important transcription factors during embryonic development and play an important role
in prostate development[15]. Most importantly, they can regulate one another's expression
according to several studies[16, 17]. Also, chromatin conformation studies have suggested that an
enhancer does not usually loop to the closest promoter [18], which would be the JAZF1 promoter.
Therefore, to determine if E7P15 regulates JAZF1, TAX1BP1, HOXA13 or other more distal
genes, I have deleted the enhancer using CRISPR/Cas9 and examined effects on gene regulation
by RNA-seq. In order to examine the effects of enhancer deletion on cell phenotype, I have
performed cell proliferation and colony formation assays in control and enhancer-deleted cells.
CHAPTER 2
DELETION OF E7P15 IN PROSTATE CELL LINES
The genomic region encompassing the rs10486567 SNP has marks of an active enhancer
in the majority of prostate tumor cell lines but not in normal prostate cells (Figure 1A). If the
SNP confers risk based on altered activity of the E7P15 enhancer, then, based on the current
model in the field, deleting this region should have an effect on target gene expression in tumor,
but not normal, cells. Therefore, I deleted E7P15 from the prostate cancer cell line C4-2B,
which has a strong H3K27Ac peak, and from the normal prostate cell line RWPE-1, which does
not have an active enhancer mark at rs10486567.
Based on unpublished H3K27Ac ChIP-seq data generated by members of the Farnham
lab, two of the prostate cancer cell lines, PC3 and RWPE-2, also do not have the mark of an
active enhancer at this locus. However, two important criteria for a good prostate cancer cell
model are the expression of prostate specific antigen (PSA) and the androgen receptor[19].
Among the cancer cell lines mentioned above, PC3 does not express either of these and is thus not
considered a good prostate cancer model. Also, others in the lab have shown that, in general,
the set of enhancers in PC3 does not greatly overlap the enhancers in the other prostate cancer cell
lines. RWPE-2 is derived from the non-neoplastic prostate cell line RWPE-1 by transformation with
v-Ki-ras (also called K-ras) and is thus not obtained from a real prostate tumor[20]. Although it
expresses PSA and AR in response to DHT treatment, it is still basically a normal cell line that
has been transformed in vitro to have certain cancer phenotypes. Thus, even though the
H3K27Ac mark does not exist in PC3 and RWPE-2, the enhancer could still be considered
prostate cancer specific because it is present in the other four tumor lines (LNCaP, C4-2B,
VCaP, and 22RV1). LNCaP has been widely used as a good PCa model. However, it grows very
slowly and has been difficult to use for genome editing due to extreme difficulty in isolating
single cell clones. The androgen-independent LNCaP subline C4-2B is derived from a bone
metastasis from a mouse inoculated with C4-2, a LNCaP subline, and it grows much faster and is
more amenable to cell cloning[2]. Hence, I chose C4-2B as the cancer cell line in my research.
Primary prostate epithelial cells (PrEC) are normal prostate cells. However, each
isolate of these cells is derived from a different patient and can thus vary from batch to batch,
especially in SNPs. PrEC are not immortalized and thus can only be passaged approximately 10
times; accordingly, these cannot be used for CRISPR-mediated deletion experiments. In contrast,
RWPE-1 are well-characterized, non-neoplastic, prostatic epithelial cells that were immortalized
using human papillomavirus 18. These cells express cytokeratins 8 and 18, markers of luminal
prostatic epithelial cells, and exhibit growth and differentiation characteristics of normal prostate
epithelial cells. For these reasons, I chose to use RWPE-1 as the normal prostate cell model in
my research.
2.1 The design of gRNA and plasmid construction
The gRNAs were designed using a web tool (http://crispr.mit.edu) based on the hg19
genome. All the gRNAs used in my research had scores of at least 76 and had no off-target
sequences in the genome differing by a single mismatch (the last 17 bp of each guide RNA,
including the PAM sequence, was unique in the genome). When designing guide RNAs, repeat
regions were excluded. I designed 4 pairs of gRNAs that flank the H3K27Ac signals and 2 pairs
of gRNAs that fell within the peak but still would delete the SNP when used as a pair (Figure 3
and Table 2). Three pairs of gRNAs, denoted small, medium, and large, were cloned into
Cas9-GFP plasmids and three pairs of new gRNAs were cloned into a vector that allows selection
using puromycin.
Figure 3: CRISPR/Cas9 deletion region. The blue line indicates the location of the target SNP rs10486567.
The green luciferase bar indicates the region included in the enhancer luciferase assay mentioned above.
The gRNA tracks show the genomic regions that will be deleted using the different pairs of guide RNAs.
The small, medium, and large purple bars are the regions deleted using the Cas9-GFP selection method.
Similarly, the grey bars indicate the sequences deleted by the three different pairs of gRNAs using the
puromycin selection method.
Table 2: gRNA sequences and coordinates
Name                     Coordinates of target DNA sequence  Strand  Target DNA sequence      Score
7p15.2 Left Large        27974181-27974203                   +       TACACCTTGGCGGTAGCCTCTGG  92
7p15.2 Right Large       27979057-27979079                   +       GCCATACCCCCAATACTCTGAGG  86
7p15.2 Left Medium       27975638-27975660                   -       GTGGCATGGAAAGCGGAACCAGG  82
7p15.2 Right Medium      27978064-27978086                   -       GACCTCCATGAAATCGCTACAGG  93
7p15.2 Left Small        27976019-27976041                   -       TGTTGGTATAGACACGTATGAGG  91
7p15.2 Right Small       27976825-27976847                   -       TGCTTGAGGCGTAAATGTTGTGG  83
7p15.2 New Left Large    27973418-27973437                   +       GAACATAATGGGAAGGCGGTTGG  82
7p15.2 New Right Large   27977883-27977902                   +       CCTTAGGTTATATTGTATCCAGG  78
7p15.2 New Left Medium   27975176-27975195                   +       AGTGACAACTAATGTGTACGTGG  86
7p15.2 New Right Medium  27977767-27977786                   -       TCAAACATAGAGATTGGGTCAGG  75
7p15.2 New Left Small*   27975607-27975626                   -       GGCAAGGCGCACGAGGTACTGGG  89
7p15.2 New Right Small*  27977221-27977240                   -       AAGCAGTAAACCTATTCTACAGG  71
gRNA sequences and coordinates. A left and a right gRNA were used together as a pair to delete
the enhancer. The coordinates are based on hg19. Strand indicates whether the guide RNA is on
the positive strand or the negative strand. The final three nucleotides of each target sequence
(marked in green in the original table) are the PAM site. Scores are
from crispr.mit.edu. * These two gRNAs were cloned but not used in the thesis.
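As a rough check on how a pair of gRNAs translates into a deletion, the expected deletion size can be estimated from the two predicted cut sites. The sketch below assumes that SpCas9 cuts 3 bp upstream of the PAM and that each coordinate range is inclusive and covers the full 23-bp target (20-nt protospacer plus PAM), as appears to be the case for the first six rows of Table 2; the function names are hypothetical and are not part of any tool used in this work.

```python
from typing import Tuple

Target = Tuple[int, int, str]  # (start, end, strand), inclusive plus-strand coordinates


def last_base_before_cut(start: int, end: int, strand: str) -> int:
    """Return the last plus-strand base to the left of the predicted cut.

    Assumes a 23-bp target (protospacer + PAM) and a cut 3 bp from the PAM.
    """
    if end - start + 1 != 23:
        raise ValueError("expected a 23-bp target including the PAM")
    if strand == "+":
        # PAM occupies end-2..end; the cut lies between end-6 and end-5.
        return end - 6
    if strand == "-":
        # PAM occupies start..start+2; the cut lies between start+5 and start+6.
        return start + 5
    raise ValueError("strand must be '+' or '-'")


def expected_deletion_size(left: Target, right: Target) -> int:
    """Number of bases removed when both cuts occur and the ends are rejoined."""
    return last_base_before_cut(*right) - last_base_before_cut(*left)


# Coordinates copied from Table 2 (hg19, chromosome 7).
large = expected_deletion_size((27974181, 27974203, "+"), (27979057, 27979079, "+"))
medium = expected_deletion_size((27975638, 27975660, "-"), (27978064, 27978086, "-"))
print(large, medium)
```

Under these assumptions the large and medium pairs give 4876 bp and 2426 bp. Actual deletions can differ by a few base pairs because repair of the two ends is often imprecise.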
Table 3: Oligonucleotides used to synthesize gRNAs
Name                     Top Oligos                                                    Bottom Oligos                                                 Deletion Size*
7p15.2 Left Large        TTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGACACCTTGGCGGTAGCCTC  GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACGAGGCTACCGCCAAGGTGTC  4876bp
7p15.2 Right Large       TTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGCCATACCCCCAATACTCTG  GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACCAGAGTATTGGGGGTATGGC  4876bp
7p15.2 Left Medium       TTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGTGGCATGGAAAGCGGAACC  GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACGGTTCCGCTTTCCATGCCAC  2426bp
7p15.2 Right Medium      TTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGACCTCCATGAAATCGCTAC  GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACGTAGCGATTTCATGGAGGTC  2426bp
7p15.2 Left Small        TTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGGTTGGTATAGACACGTATG  GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACCATACGTGTCTATACCAACC  816bp
7p15.2 Right Small       TTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGGCTTGAGGCGTAAATGTTG  GACTAGCCTTATTTTAACTTGCTATTTCTAGCTCTAAAACCAACATTTACGCCTCAAGCC  816bp
7p15.2 New Left Large    CACCGAACATAATGGGAAGGCGGT                                      AAACACCGCCTTCCCATTATGTTC                                      4484bp
7p15.2 New Right Large   CACCGCCTTAGGTTATATTGTATCC                                     AAACGGATACAATATAACCTAAGGC                                     4484bp
7p15.2 New Left Medium   CACCGAGTGACAACTAATGTGTACG                                     AAACCGTACACATTAGTTGTCACTC                                     2610bp
7p15.2 New Right Medium  CACCGTCAAACATAGAGATTGGGTC                                     AAACGACCCAATCTCTATGTTTGAC                                     2610bp
7p15.2 New Left Small*   CACCGGCAAGGCGCACGAGGTACT                                      AAACAGTACCTCGTGCGCCTTGCC                                      1633bp
7p15.2 New Right Small*  CACCGAAGCAGTAAACCTATTCTAC                                     AAACGTAGAATAGGTTTACTGCTTC                                     1633bp
Oligonucleotides used to synthesize gRNAs. The oligonucleotides were ordered from Integrated
DNA Technologies (Coralville, IA). There are two different lengths of oligonucleotides because two
different protocols were used to synthesize the gRNA vectors. The sequences marked in purple are
either the gRNA sequences for the GFP-based plasmid or the restriction enzyme site for the
puromycin vector. * These two gRNAs were made but not used in the thesis.
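The short oligonucleotides for the puromycin vector in Table 3 appear to follow a simple pattern: a CACC overhang plus the 20-nt guide on the top strand, and an AAAC overhang plus the reverse complement of the guide on the bottom strand, with an extra G (and a matching trailing C) added when the guide does not already begin with G. The following sketch reproduces that pattern as inferred from the table; the function names are hypothetical, and this is an illustration rather than the exact procedure of the cited protocol.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def puromycin_vector_oligos(target_with_pam: str) -> Tuple[str, str]:
    """Build (top, bottom) cloning oligos from a 23-nt target ending in an NGG PAM.

    Assumed pattern (inferred from Table 3): CACC/AAAC overhangs, with an extra
    5' G on the top oligo when the guide does not start with G.
    """
    target = target_with_pam.upper()
    if len(target) != 23 or target[-2:] != "GG":
        raise ValueError("expected a 20-nt guide followed by an NGG PAM")
    guide = target[:20]                      # drop the 3-nt PAM
    needs_g = not guide.startswith("G")      # add a G for transcription from the U6 promoter
    top = "CACC" + ("G" if needs_g else "") + guide
    bottom = "AAAC" + reverse_complement(guide) + ("C" if needs_g else "")
    return top, bottom


# Target sequences copied from Table 2.
new_left_large = puromycin_vector_oligos("GAACATAATGGGAAGGCGGTTGG")
new_right_large = puromycin_vector_oligos("CCTTAGGTTATATTGTATCCAGG")
print(new_left_large)
print(new_right_large)
```

The longer oligonucleotides used with the GFP-based plasmids follow a different design (a constant 40-nt flank followed by the guide) and are not covered by this sketch.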
I used both a GFP-based FACS approach and a puromycin selection method in my research.
For the FACS-based method, the gRNA vector (Addgene, cat. no. 41824) and a plasmid
expressing Cas9-GFP (Addgene, cat. no. 41824) were prepared according to a standard protocol
[21]; all oligonucleotides used are listed in Table 3. This method requires co-transfection of three
different plasmids, two for the different gRNAs and one for Cas9-GFP, which leads to low
transfection efficiency and depends upon cell sorting to enrich transfected cells. Because cell
sorting was too harsh for some cell lines, a second, more efficient puromycin-based approach was
also developed. The puromycin vector (Addgene, cat. no. 62988) contains two expression
cassettes, one for a human codon-optimized SpCas9 and one for a single guide RNA; plasmids
were constructed based on the protocol from Feng Zhang’s lab[22]. This vector showed higher
efficiency in genome editing applications not only because it required transfection of only two
plasmids but also because it was based on a new chimeric RNA design with a longer tracrRNA
hybrid, which has been claimed to be more effective than the original guide RNA structure. This
vector also had improved puromycin selection because it corrects a point mutation present in the
puromycin N-acetyltransferase (PAC) gene of previous versions of puromycin-selection
plasmids. I note that there is also a double nickase version of Cas9 which could reduce off-target
binding effects[23]. However, editing efficiency is the most important concern for
CRISPR-mediated deletions using aneuploid prostate tumor lines and the double nickase would lower the
chance to get full knockout cells. Fortunately, CRISPR off-target effects do not seem to be a
major problem for most applications [24, 25].
2.2 Cell culture
The human prostate cell line C4-2B was grown at 37ºC in 5% CO2 in RPMI 1640
medium (Corning Cellgro) with 10% fetal bovine serum (Gibco by Life Technologies) without
antibiotics. The RWPE-1 (ATCC® CRL-11609™) cell line was cultured in Keratinocyte Serum
Free Medium (Thermo Fisher Scientific, cat. no. 17005-042) without antibiotics. Mycoplasma
contamination is invisible and could alter the outcome of my experiments. Antibiotics were
therefore not generally used in cell culture, so that the growth of bacteria or yeast could
serve as a monitor of contamination and contaminated cells could be discarded. In addition, I
found that good aseptic technique was sufficient to prevent contamination, so antibiotics
were not necessary. However, 2% penicillin and streptomycin (USC Norris Comprehensive
Cancer Center Bioreagent & Cell Culture Core) were still used for cell sorting, where bacterial
contamination was hard to avoid. All cell lines were recently tested for mycoplasma
contamination by the Bioreagent & Cell Culture Core and were found not to be contaminated.
In the colony selection step, single cells were cultured in conditioned medium, which was
prepared by mixing 40% used medium with 60% new medium. The used medium was
collected between 24 and 48 hours before cells became confluent. It was then
centrifuged, and the supernatant was filtered and saved for mixing with new medium.
2.3 Transfection
C4-2B cells were transfected using Lipofectamine 3000 (Life Technologies,
catalog #3000008) according to the manufacturer's protocol. However, the transfection
efficiency was low using this method (20~30% efficiency). To overcome that problem, C4-2B
cells were also electroporated using a Neon transfection kit and device (Invitrogen,
catalog #MPK5000), which gave better results. I found that the transfection efficiency
varied greatly with different combinations of pulse voltage, pulse width, and pulse number, and I
optimized these parameters according to the manufacturer’s instructions; in total, 31 different
combinations were tested. Different amounts of plasmid were also tested, and I found
that the efficiency was higher when more plasmid, within a certain range, was added. As a
result, 1x10^5 cells were mixed with 1.5µg plasmid and electroporated in a 10µl tip using the
following conditions: pulse voltage 1250 V, pulse width 20 ms, pulse number 1. Each tip was used 3
times and the 3x10^5 transfected cells were seeded into 1 well of a 6-well plate (>80% efficiency).
However, even after optimizing all conditions, I found that electroporation did not give better
results than Lipofectamine 3000 for RWPE-1 cells. Consequently, RWPE-1 cells
were transfected with 5000ng plasmid using 7.5µl Lipofectamine 3000 reagent in 6-well plates
(10% efficiency).
2.4 Guide RNA Validation
Two days after transfection, genomic DNA was isolated using the QIAamp DNA mini kit
(Qiagen, cat. no. 51306) and tested by PCR using GoTaq green master mix (Promega, cat. no.
M7123) with primers flanking the target regions. Guide RNA pairs that failed would produce only
the large band, whereas gRNA pairs that successfully deleted the target region would
generate a smaller band. However, the PCR method cannot measure the efficiency of each individual
guide RNA in the pair. Therefore, certain gRNAs were also tested using the Surveyor® mutation
detection kit (IDT, cat. no. 706025). Although the Surveyor kit can measure the editing
efficiency of each gRNA, I found that the PCR analysis was sufficient for most situations and
more convenient. Based on these results, all gRNAs successfully targeted Cas9 to the appropriate
genomic site (data not shown).
2.5 Enrichment of transfected cells
The most laborious part of genome editing is genotyping all the single cell colonies. To
reduce the number of colonies that must be genotyped, the transfected cells must be further enriched
unless the transfection efficiency reaches 100%. Cells transfected with GFP were enriched via
fluorescence-activated cell sorting (FACS). 48 hours after transfection, C4-2B cells were sorted
into one well of a 24-well plate using a BD FACS Aria I or II cell sorter (BD Biosciences) at the
flow cytometer core at the Eli and Edythe Broad CIRM center. The sorted cells were cultured in
plates for one week to recover and were then replated as single cells, one per well. Alternatively,
the C4-2B cells were sorted directly into 96-well plates, one cell per well. The latter
approach generates single cell colonies directly. However, not all C4-2B cells continued to divide
when they were grown as single cells after sorting; the recovery rate was around 20%.
The cells transfected with puromycin vectors were enriched by puromycin selection. I
performed a kill curve experiment in advance to determine the optimal dose. Two days after
transfection, cells were approximately confluent in the plate. I found that the results of
puromycin treatment were highly influenced by cell density [26]. Specifically, more confluent
cells were more resistant to puromycin and over-confluent cells were extremely hard to kill.
When puromycin was added directly to these cells, only a few cells with high puromycin
resistance survived in the center of the plate, but many cells around the border were still
alive. Thus, the cells were trypsinized and seeded into 10cm or 15cm dishes to reduce the cell
density, and puromycin was added to plates with cells at less than 30% confluency. Puromycin
concentrations were tested from 300ng/ml to 1600ng/ml at 100ng/ml intervals. Based on the kill
curve, 400ng/ml was the minimum dose that killed all C4-2B cells after 4 days and 1000ng/ml
was the lowest concentration that killed all RWPE-1 cells after 5 days. Therefore, for my
experiments, cells were selected two days after transfection using the above doses and times.
After puromycin selection, cells were cultured in normal medium for 2 to 5 days to recover and
then processed for single cell isolation.
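The kill-curve decision described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example survival values are hypothetical and are not taken from the experiments; the only rule borrowed from the text is "choose the lowest dose that leaves no surviving cells".

```python
def minimum_lethal_dose(survival_by_dose):
    """Return the lowest dose at which no cells survive.

    survival_by_dose maps a puromycin dose to the fraction of cells
    still alive at the end of the selection period.
    Returns None if every tested dose leaves survivors.
    """
    lethal_doses = [dose for dose, surviving_fraction in survival_by_dose.items()
                    if surviving_fraction == 0]
    return min(lethal_doses) if lethal_doses else None


# Hypothetical kill-curve readout (dose -> surviving fraction), for illustration only.
example_curve = {300: 0.15, 400: 0.0, 500: 0.0, 600: 0.0}
print(minimum_lethal_dose(example_curve))
```

Using the minimum lethal dose rather than a higher one keeps the selection as gentle as possible for cells that did take up the vector.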
2.6 Generation of single cell colonies
Four different approaches were used to generate single cell colonies: cell sorting, serial
dilution, cloning discs (Sigma-Aldrich, Z374431-100EA) and an in-house ocular sorting method.
Cell sorting: C4-2B and RWPE-1 cells were sorted into 96-well plates, one cell per
well. Cells that had been enriched were sorted using a strong GFP signal; cells without
Cas9-GFP were sorted directly. Although this approach was the
most efficient way to seed single cells, the prostate cells were sensitive to sorting and the
survival rate in subculture was low.
Serial dilution: Single cells were also obtained by serial dilution. Cells were
counted using a hemocytometer and 30 or 100 cells were seeded into a 96-well plate. Using this
method, I found that only around 20 percent of the wells contained exactly 1 cell. It was very
time-consuming to carefully examine all the wells in the following days to make sure that
each colony grew from a well containing only 1 cell.
Cloning discs: Using this method, 100
or 200 cells were seeded into one 10cm or 15cm dish. After one or two weeks, cell colonies
were picked using cloning discs. Although well-separated colonies were chosen, there is no
guarantee that every colony arose from a single cell.
Ocular sorting: To overcome the
drawbacks mentioned above, I developed an ocular sorting technique. Cells were trypsinized and
diluted to low density in a 10cm dish. Single cells were picked up one at a time with a 2µl
pipette under a microscope and seeded into 96-well plates. The 36 wells at
the edge of the plate were filled with 200µl DPBS because edge wells evaporate much faster during
long culture times; only the 60 inner wells, each containing 150µl medium, were used to grow
single cells. Although this technique required good skills, it was very efficient, time saving and
cell friendly. Because I determined that single RWPE-1 cells required conditioned medium for
survival, these cells were cultured in the mix of conditioned and regular media described above.
However, conditioned medium had a negligible effect on C4-2B cells and therefore was not
generally used for these cells. To avoid accidentally aspirating cells, only 50% of the medium was
replaced, as needed, over the colony growth period.
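The observation that only about 20 percent of wells held exactly one cell after serial dilution is roughly what a simple Poisson model predicts. The sketch below assumes that cells are distributed randomly and independently across the 96 wells, which is an idealization and not something measured in this work; the function name is hypothetical.

```python
import math


def fraction_single_cell_wells(cells_seeded, wells=96):
    """Poisson estimate of the fraction of wells that receive exactly one cell."""
    mean_cells_per_well = cells_seeded / wells
    # P(k = 1) for a Poisson distribution with the given mean.
    return mean_cells_per_well * math.exp(-mean_cells_per_well)


for cells in (30, 100):
    print(cells, "cells per plate:", round(fraction_single_cell_wells(cells), 3))
```

Under this model, seeding 30 cells per plate gives a little over one fifth of the wells with a single cell, and no seeding density can push that fraction above about 37%, which is one reason the dilution approach required careful inspection of every well.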
2.7 Detection of cells having enhancer deletions
When 50% confluent in 96-well plates, the C4-2B cells were dissociated directly by
vigorous pipetting (20 times). 70% of the cells were collected for genotyping and the rest were
left in the plate for subculture. Because RWPE-1 cells stick tightly to the plate, they
were removed by trypsinization (which was then stopped by the addition of
serum). After the cells were resuspended in medium, 30% of the cells were transferred to 24-well
plates and the medium was changed the next day; 70% of the cells were used for DNA isolation.
Genomic DNA was extracted using the QuickExtract DNA extraction solution (Epicentre,
QE09050) according to the manufacturer’s protocol; 20~30µl of the solution was added and
2µl of the extract was used for genotyping. Deletions were tested by PCR using GoTaq green master
mix (Promega, cat. no. M7123) with primers both flanking and within the target regions (Table
4); 34 cycles were used.
Three different outcomes were detected in the genotyping analyses: full deletions, partial
deletions, and no deletions. The fully deleted colonies had only the short PCR band generated by
the primers flanking the target region. The colonies that had no alleles deleted showed a band
with the inside primers and only the long band with the flanking primers. The partially deleted
colonies had three bands: one from the inside primer pair, and a short and a long band from the
primers flanking the deletion region (Figure 5).
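The band-reading rules above can be written as a small decision function. This is a sketch for illustration only: the function and argument names are hypothetical, and the "inconclusive" branch (for example, a failed PCR) is an added assumption rather than a category used in the thesis.

```python
def classify_colony(inside_band, short_flanking_band, long_flanking_band):
    """Classify a colony from the presence (True/False) of three PCR bands.

    inside_band:          product of the primers located within the target region
    short_flanking_band:  short product of the flanking primers (deleted allele)
    long_flanking_band:   long product of the flanking primers (intact allele)
    """
    if short_flanking_band and not inside_band and not long_flanking_band:
        return "full deletion"
    if short_flanking_band and (inside_band or long_flanking_band):
        return "partial deletion"
    if inside_band and not short_flanking_band:
        return "no deletion"
    return "inconclusive"  # assumption: e.g. no product at all


print(classify_colony(inside_band=True, short_flanking_band=True, long_flanking_band=True))
```

The long flanking band is treated as optional evidence of an intact allele because very long products do not always amplify, so the inside primer pair is the more reliable indicator that an undeleted copy remains.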
Table 4: Primers for genotyping
Name                  Primer sequence         Location            Product size
Forward Large1        CCACCTAAAGGGCATTTCAA    27974006 27974025   5288bp
Forward Large2        CGTGGTTTACCAAAGCAAGA    27973928 27973947   5198bp
Forward Medium1       AGCACAGCCTCTGTGATCCT    27975410 27975429   3216bp
Forward Medium2       CCCCCAGCAAAATAGAAGGT    27975365 27975384   3331bp
Forward Small1        GCTGAGCCTGGAAGAAAAAC    27975846 27975865   1267bp
Forward Small2        CATGGTCTTCCACAACATGC    27975957 27975976   1339bp
Forward In1           TGGCCTTACTGCTACCCAAT    27976111 27976130   374bp
Forward In2           TGCACAAACTCAGGGACAAA    27976219 27976238   536bp
Reverse Large1        ATACAGGGTCTTCCCCCTCT    27979274 27979293   5288bp
Reverse Large2        ATCCCCTTCTCTCCAGATCC    27979106 27979125   5198bp
Reverse Medium1       TCAGGGAGAATGGCTCTCAC    27978607 27978626   3216bp
Reverse Medium2       AAATGGCCCAAGCTTCAGAT    27978677 27978696   3331bp
Reverse Small1        GGATGTCATGGGAACATCCT    27977093 27977112   1267bp
Reverse Small2        TCCATCCATAAAACCAAGAGG   27977275 27977295   1339bp
Reverse In1           TGTTTAGCTGGAGCATGGAG    27976466 27976485   374bp
Reverse In2           GGCAAGGCTTTGACAACTCT    27976736 27976755   536bp
Forward New Small*    GAAGGGGGAAACAATGACAA    27975460 27975479   1835bp
Forward New Medium*   GCAGCCAGACACTTTAGGTCA   27975057 27975077   3639bp
Forward New Large*    TGGACAAGATGTTGGCTTCA    27973297 27973316   5828bp
Primers for genotyping. One forward and one reverse primer were used together to test for
deletions. For instance, forward Large1 was used with reverse Large1. The In primers
are located within the deletion region. * Forward New Small, Forward New Medium and Forward New
Large were used with Reverse Small2, Reverse Medium2 and Reverse Large2, respectively.
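The product size expected from an intact allele follows from the primer coordinates. The sketch below assumes inclusive, 1-based genomic coordinates, so an individual table entry may differ by a base if it was derived with a different convention; the function name is hypothetical.

```python
def amplicon_size(forward_start, reverse_end):
    """Length of the PCR product spanning from the first base of the forward
    primer to the last base of the reverse primer (inclusive coordinates)."""
    return reverse_end - forward_start + 1


# Forward Large1 starts at 27974006; Reverse Large1 ends at 27979293.
print(amplicon_size(27974006, 27979293))
# Forward Small1 starts at 27975846; Reverse Small1 ends at 27977112.
print(amplicon_size(27975846, 27977112))
```

A deleted allele gives a product shorter than this by the length of the excised segment, which is why the flanking primers produce a short band only when the deletion has occurred.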
2.8 Detection of the copy number of remaining chromosomes
Because there are 5 copies of chromosome 7 in C4-2B cells, qPCR was used to quantitate
the number of undeleted copies remaining in the clones using SsoFast EvaGreen supermix (Bio-Rad,
cat. no. 1725202) (Figure 4A). The same amount of genomic DNA was used in each reaction to minimize
variation in the assays. In addition to primers within the deletion region, additional pairs of
primers outside the deletion region but on the same chromosome were used as controls (Table 5).
According to qPCR, only 1 undeleted copy of chromosome 7 remained for the 3 small and medium clones
tested whereas 1 or 2 copies remained for the 3 large clones tested.
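The text does not state how the qPCR signal was converted into a copy number. One common way to do this is the comparative Ct (delta-delta Ct) method, sketched below under three explicit assumptions: amplification efficiency is 100%, the control amplicon outside the deletion is present at the same copy number in the clone and in the parental cells, and the parental cells carry 5 intact copies. The function and argument names are hypothetical.

```python
def remaining_copies(ct_target_clone, ct_control_clone,
                     ct_target_parent, ct_control_parent,
                     parental_copies=5):
    """Estimate intact copies of the target region by the delta-delta Ct method."""
    delta_clone = ct_target_clone - ct_control_clone      # clone, normalized to control locus
    delta_parent = ct_target_parent - ct_control_parent   # parental cells, same normalization
    relative_amount = 2 ** -(delta_clone - delta_parent)  # fraction of target signal left
    return parental_copies * relative_amount


# Hypothetical Ct values: the target amplifies one cycle later in the clone.
print(remaining_copies(23.0, 22.0, 22.0, 22.0))
```

A one-cycle delay corresponds to half of the original signal, so distinguishing 1 remaining copy from 2 out of 5 requires resolving Ct shifts of roughly one cycle, which is why equal DNA input matters.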
Figure 4: qPCR results and the C4-2B karyotype information.
A. The red box indicates the 5 copies of chromosome 7 [2]. B.
Three different clones were analyzed for the small, medium, and
large guide RNA pairs.
Table 5: qPCR primers
Primer name        Sequence
qPCR F1            GGCATAAAACGGCTCGTAAG
qPCR R1            GCTCATTCTGGAGGGGATAA
qPCR F2            TGGCCTTACTGCTACCCAAT
qPCR R2            GCATGCAGAGAGTGTGTGCT
qPCR control F1    CCAAATGTAGTGGCAGGAGG
qPCR control R1    CTAAAGAAGAGGGCCAGGCT
qPCR control F2    TGAGGAACAGGGAGAAGGGA
qPCR control R2    TCCTGGGAAACTCTGAAGGG
qPCR primers. The qPCR F1, R1, F2 and R2 primers are located within the deletion region. The 4
control primers are outside of the deletion locus.
2.9 Results
To ensure that any observed phenotypic and transcriptomic changes were not caused by
off-target effects of Cas9, three different pairs of gRNAs (small, medium and large) were used
to delete the risk enhancer in C4-2B cells using the GFP-based approach. An empty vector
expressing only GFP-Cas9 was used as a control, and the control cells were put through exactly
the same treatment (cell sorting, forming single cell colonies, etc.) as the deleted colonies
(because no guide RNAs were included, there should have been no deletions). Although around 200
single cell colonies were genotyped, no fully deleted clones were identified because C4-2B
cells have 5 copies of chromosome 7 (Table 6). By performing qPCR on the partially deleted
colonies, 3 colonies for each pair of gRNAs that had the majority of the alleles deleted were
identified. The best 3 colonies from the small gRNAs and the best 3 colonies from the large
gRNAs, plus three different control colonies, were sent out for RNA-seq analysis. However, no
genes were significantly downregulated within 2Mb of the deleted region (data not shown).
To obtain a complete knockout of the targeted region on all chromosome 7 alleles, a
more efficient puromycin approach was developed. I began with the best colonies (those with
the fewest remaining copies of the targeted region) originally obtained using the medium and
large guide RNA pairs. These colonies were transfected with the same guide RNAs that had
been used to create the partial deletions. However, it is possible that on alleles where the
region was not deleted, the gRNA binding sites had still been destroyed by Cas9. To avoid that
issue, I also designed two new pairs of gRNAs (new medium and new large guide RNAs) that were
slightly offset from the original guide RNA binding sites. The best colony from the original
medium pair of gRNAs was transfected with the new medium or the new large gRNAs. Similarly, the
best colony from the original large pair of gRNAs was transfected with the new medium or the
new large gRNAs. Although the CRISPR pipeline adapted from the Feng Zhang lab’s protocol
worked well [22], it was still very difficult to delete all the copies of E7P15. After genotyping
more than 250 single cell colonies, 3 fully deleted colonies were finally obtained using the new
large pair of gRNAs (Figure 5). Two of the three were derived from the best original large colony
and one from the best original medium colony. The new left large gRNA was also transfected
along with Cas9 to create control cells that were put through the same treatment (puromycin
selection, forming single cell colonies, etc.); three different colonies were saved as controls.
The nearly diploid RWPE-1 cells were much more easily targeted by CRISPR. The new
large pair of gRNAs and the new medium pair of gRNAs were used with the puromycin method. After
screening around 150 colonies, 15 fully deleted colonies were obtained using the new large pair of
gRNAs. Genotyped colonies that did not have deletions were saved as controls. This kind of
control was termed an absolute control because it was transfected with the same guide RNAs and
could therefore account for off-target effects. Thus, although all fully deleted RWPE-1 colonies
were obtained with the same pair of gRNAs, these controls reduce the need to obtain cells edited
by another pair of gRNAs. A summary of the deletion experiments is shown in Figure 6.
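The difference in difficulty between the two cell lines can be illustrated with a deliberately simple model. The sketch assumes that each copy of the target region is deleted independently with the same probability, which is an idealization, and the per-allele efficiency of 0.5 is a hypothetical value chosen only for illustration.

```python
def full_deletion_probability(per_allele_efficiency, copies):
    """Probability that every copy of the target region is deleted in one cell,
    assuming independent and equally likely deletion of each copy."""
    return per_allele_efficiency ** copies


hypothetical_efficiency = 0.5  # illustrative value, not a measured efficiency
for label, copies in (("2 copies (near diploid)", 2), ("5 copies", 5)):
    print(label, full_deletion_probability(hypothetical_efficiency, copies))
```

Because the probability falls geometrically with the number of copies, a cell line carrying 5 copies is expected to yield far fewer fully deleted colonies than a nearly diploid line at the same per-allele efficiency.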
Table 6: Summary of genome editing results
C4-2B first round editing    Small pair    Medium pair   Large pair    Total
Genotyped                    69            56            68            193
Partial deletions            9             24            46            78
Tested by qPCR               9             10            9             28

C4-2B second round editing   Other     M partial+new large pair    L partial+new large pair    Total
Genotyped                    150+      50+                         50+                         250+
Full deletions               0         1                           2                           3

RWPE-1                       New medium pair   New large pair   Total
Genotyped                    18                139              157
Full deletions               0                 15               15
Figure 5: Gel images of genotyping. A. 0.1~10kb ladder (New England Biolabs, cat. no. N3270S). B.
Only partially deleted C4-2B cells were obtained from the first round of CRISPR editing. The expected
sizes of the bands with and without the deletion are shown. Colonies having bands of both sizes have
partial deletions. C. One fully deleted C4-2B colony was genotyped with 3 different pairs of primers. Both
the M+L deletion and its control were previously edited using the medium pair of gRNAs. * The second line,
marked by an asterisk, indicates that some chromosomes were cut by the medium pair of gRNAs. D. Two different
C4-2B L+L full deletions and a control. The 5kb band was too large to amplify by PCR. E and F. The 7 RWPE-1
fully deleted samples used in RNA-seq were tested with 3 pairs of primers.
Figure 6: Summary of the CRISPR pipeline. A. Pipeline for C4-2B cells.
Two different approaches were used; 3 clones and 3 deletions were analyzed
by RNA-seq. B. Pipeline for RWPE-1 cells; 5 controls and 7 of the 15 fully
knocked out colonies were sent out for RNA-seq.
Table 7: Lists of control cells and fully deleted cells
Cell type   Stock name               gRNAs used
C4-2B       Control1                 New Left Large*
C4-2B       Control2                 New Left Large*
C4-2B       Control3                 New Left Large*
C4-2B       M+L full deletion        Left Medium + Right Medium + New Left Large + New Right Large
C4-2B       L+L full deletion 1      Left Large + Right Large + New Left Large + New Right Large
C4-2B       L+L full deletion 2      Left Large + Right Large + New Left Large + New Right Large
RWPE-1      Control1                 New Left Large + New Right Large
RWPE-1      Control2                 New Left Large + New Right Large
RWPE-1      Control3                 New Left Large + New Right Large
RWPE-1      Control4                 New Left Large + New Right Large
RWPE-1      Control5                 New Left Large + New Right Large
RWPE-1      Control6                 New Left Large + New Right Large
RWPE-1      Control7                 New Left Large + New Right Large
RWPE-1      New L full deletion1     New Left Large + New Right Large
RWPE-1      New L full deletion2     New Left Large + New Right Large
RWPE-1      New L full deletion3     New Left Large + New Right Large
RWPE-1      New L full deletion4     New Left Large + New Right Large
RWPE-1      New L full deletion5     New Left Large + New Right Large
RWPE-1      New L full deletion6     New Left Large + New Right Large
RWPE-1      New L full deletion7     New Left Large + New Right Large
List of controls and fully deleted cells. * The C4-2B control cells went through 2 rounds of
CRISPR editing. In addition to the new left large gRNA transfected in the second round
experiment, they were also transfected with a vector expressing only Cas9 and GFP.
CHAPTER 3
INVESTIGATION OF PROLIFERATION CHANGES IN E7P15 DELETED CELLS
To investigate the phenotypic differences between deleted cells and control cells, cell
proliferation and colony formation assays were performed using both C4-2B and RWPE-1 cells.
Because genome editing via the CRISPR technique relies on single cell colonies, colony-to-colony
variation might be observed. To obtain reliable results, multiple colonies were tested. For
C4-2B, control 1, the M+L full deletion, and the L+L full deletion 1 were tested. For RWPE-1,
control 1 and 2 (mixed together before use) and new L full deletion 1 and 2 (mixed together
before use) were tested.
3.1 Colony formation assay
1500 or 6500 cells from each sample were sorted directly into one well of a 6-well plate
by FACS to obtain an accurate cell number; each sample was plated in triplicate. To make
the FACS process less harmful to cells, the flow rate was set as low as 2. Each well contained 3
ml of medium, which was replaced every 3 days. After 10 days, the medium was aspirated and the
plates were rinsed once with ice-cold DPBS. Then the plates were placed on ice and the cells were
fixed with 2ml ice-cold methanol for 20min. After the methanol was aspirated, 0.5% crystal violet
was added to cover the bottom of each well and the plates were incubated at room temperature for
30min. The crystal violet solution was then collected and the plates were carefully submerged in
water to remove the residue. The plates were left to dry at room temperature. Colonies
containing at least 50 cells were identified using a microscope and
counted by eye. Due to the overwhelming number of colonies, only the plates seeded with 1500
C4-2B cells were counted (Figure 7 and 8).
Figure 7: C4-2B colony formation assay results. A. C4-2B control plates. B. C4-2B M+L full
deletion plates. C. C4-2B L+L full deletion 1 plates. D. Histogram of the number of cell colonies
which have at least 50 cells. The counting was based on the wells seeded with 1500 cells.
Figure 8: RWPE-1 colony formation assay results. A. RWPE-1 control
plates. B. RWPE-1 full deletion plates
3.2 Cell proliferation assay
To avoid the variation caused by cell counting with a hemocytometer, 1500 or 5000
cells from each sample were sorted directly into one well of a 96-well plate for accurate
quantitation. To generate the same environment for each sample, the 36 wells at the edge were
filled with 200µl DPBS, and the 60 inner wells were filled with 150µl medium. Controls were
seeded into rows B and C and deletions were seeded into rows F and G. After 3, 5, 7 or 9 days,
15µl WST-1 assay reagent (Sigma Aldrich, cat. no. 5015944001) was added directly into each
well and the plate was incubated for 1 hour. Before being read on a plate reader (BioTek
Synergy 2 multi-mode reader), the plate was shaken for 1 min. Absorbance was measured at
440nm, with 690nm as the reference wavelength. The reference value was subtracted from the 440nm
reading, and the value of the empty control was then subtracted from each sample reading. To
obtain reliable statistical results, 20 replicates were performed, and a two-sample t test was
used to compare the groups (Figure 9).
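The two subtraction steps and the group comparison can be sketched as follows. Two points are assumptions rather than details given in the text: the empty-control well is reference-corrected in the same way as the samples, and the t statistic uses the pooled-variance (equal variance) form of the two-sample test. All names and numbers are hypothetical.

```python
from statistics import mean, variance


def corrected_absorbance(a440, a690, blank_a440, blank_a690):
    """Subtract the 690nm reference from the 440nm reading, then subtract
    the (reference-corrected) empty-control well."""
    return (a440 - a690) - (blank_a440 - blank_a690)


def two_sample_t(group_a, group_b):
    """Student's t statistic for two independent groups (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * variance(group_a) +
                       (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = (pooled_variance * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical readings, for illustration only.
print(corrected_absorbance(1.2, 0.1, 0.3, 0.1))
print(two_sample_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

The statistic is compared against a t distribution with n_a + n_b - 2 degrees of freedom (38 for two groups of 20 replicates) to obtain the P value.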
Figure 9: Cell proliferation assay results. A. C4-2B 1500 cells after 5 days of
incubation. B. C4-2B 1500 cells after 7 days of incubation. C. C4-2B 5000 cells after 3
days of incubation. D. C4-2B 5000 cells after 3 days of incubation. E. RWPE-1 1500
cells after 10 days of incubation. F. RWPE-1 5000 cells after 9 days of incubation.
CHAPTER 4
RNA-SEQ AND DATA ANALYSIS
In many studies, RNA-seq experiments are done in triplicate. However, it is very
likely that some of the nearly twenty thousand genes in the human genome will
coincidentally be expressed at a higher or lower level in all three samples (as compared to the
controls), which leads to false positive results. This is especially problematic when studying the
effects of an enhancer deletion, in which only a modest reduction in expression of a target gene
may occur (because the basal expression mediated by the promoter is still intact). Another
confounding issue is that, due to the karyotype instability of prostate cancer cell lines, different
single cell colonies may have different copy numbers of the same chromosome as well as
different translocations. To overcome these issues, an increased number of biological and technical
replicates should be used in RNA-seq. Also, different data analysis approaches should be used
to decipher the true target genes.
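The argument for using more replicates can be made concrete with a toy calculation. The model below is a strong simplification introduced here for illustration: it assumes that, for a gene with no real change, each deleted sample is equally likely to fall above or below the control level, independently of the others, and it ignores effect size and statistical testing. The function name and the gene count of 20,000 are illustrative.

```python
def expected_coincident_genes(n_genes, n_replicates):
    """Expected number of unaffected genes whose replicates all deviate from the
    control in the same direction purely by chance (toy coin-flip model)."""
    probability_same_direction = 2 * 0.5 ** n_replicates  # all higher or all lower
    return n_genes * probability_same_direction


for replicates in (3, 7):
    print(replicates, "replicates:", expected_coincident_genes(20000, replicates))
```

Even in this crude model, going from 3 to 7 replicates reduces the number of genes that look consistently changed by chance by a factor of 16, which is the motivation for the larger number of replicates used here.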
4.1 RNA-seq library construction
The colonies used to prepare RNA-seq libraries are listed in Figure 6. In total, 7
biological replicates of fully deleted RWPE-1 colonies plus 5 different control colonies were
used. To increase the number of replicates for C4-2B cells, technical triplicates were made for
each sample, using 3 different growth passages. RNA was isolated using TRIzol reagent
(Thermo Fisher Scientific, cat. no. 15596018). The RNA quality was checked using a 2100
Bioanalyzer instrument (Agilent Technologies, cat. no. G2939AA) and an RNA 6000 Nano kit
(Agilent Technologies, cat. no. 50671511). ERCC spike-in mix 1 (Thermo Fisher
Scientific, cat. no. 4456740) was added. The libraries were made using KAPA stranded mRNA-seq kits
(KAPA Biosystems, cat. no. kk8421) according to the manufacturer’s instructions, and the library
quality was checked using high sensitivity DNA kits (Agilent Technologies, cat. no. 50674626).
4.2 DHT treated RNA-seq
According to published luciferase assays, the activity of the inserted enhancer
sequence can be boosted by DHT, with a significant difference in response between the two
alleles. Although the growth of C4-2B cells is not DHT sensitive, gene expression can still be
affected by DHT with a longer treatment time. Since the purpose was not to investigate DHT-induced
genes, there was no need to culture cells in phenol red-free medium with charcoal-stripped
serum. Therefore, to study the enhancer under constitutive “high hormone” conditions, C4-2B
cells were treated with 10nM DHT for 48 hours before RNA isolation.
4.3 Data analysis
Numerous packages have been developed to interpret RNA-seq data; however, the results
can vary depending on the analysis method that is used and none of the methods has been proven
to be the best for all circumstances. Thus, I compared 3 genome mappers (TopHat, TopHat2 and
STAR) and two differential gene expression tools, the Tuxedo tools (Cufflinks, Cuffdiff) and
DESeq2 [27-30]. STAR was the most accurate aligner (~98% total alignment rate)
and took 60% less time than TopHat2 (90%~93% total alignment rate) and TopHat (89%~92%
total alignment rate). Cufflinks uses the classic fragments per kilobase of transcript per million
mapped reads (FPKM) as its normalization method. However, many papers have criticized this as
inferior to the approaches used by other packages, such as DESeq2 [31]. I found
that DESeq2 gave results similar to those of the Tuxedo tools for my study but ran much faster.
It is difficult to judge which result is more accurate, but the FPKM value is more useful than
read counts for measuring the gene expression level.
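For reference, the FPKM normalization mentioned above divides the fragment count of a gene by both the transcript length (in kilobases) and the sequencing depth (in millions of mapped fragments). The sketch below shows only this basic formula; it does not reproduce how Cufflinks estimates effective transcript lengths or assigns fragments to isoforms, and the example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2 kb transcript in a library of 25 million fragments.
print(fpkm(500, 2000, 25_000_000))
```

Because the value is scaled by transcript length, FPKM can be compared between genes within a sample, which raw read counts do not allow.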
Thus, the RNA-seq data were mapped with STAR and differentially expressed genes
were identified using the default parameters of both DESeq2 and Cufflinks. Low-count genes
were filtered using default parameters. By principal component analysis (PCA), the 3 control
samples derived from control colony 3 in the C4-2B RNA-seq data without DHT treatment were
identified as outliers and removed from subsequent analysis (Figure 10). Downregulated genes were
defined as having at least 1.5 FPKM in controls and at least a 20% decrease with a statistically
significant P value. Similarly, upregulated genes were required to have at least 1.5 FPKM in
deleted samples and at least a 20% increase. The numbers of differentially expressed genes are
listed in Table 11 and the top 30 downregulated genes are shown in Tables 8, 9 and 10.
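The definition of a downregulated gene given above can be expressed as a small filter. This is a sketch, not the script used for the analysis: the function name is hypothetical, and the significance cutoff of 0.05 on the q value is an assumption, since the text only specifies a statistically significant P value.

```python
def is_downregulated(control_fpkm, deletion_fpkm, q_value,
                     min_fpkm=1.5, min_change=0.20, alpha=0.05):
    """Apply the expression, fold-change and significance criteria to one gene."""
    if control_fpkm < min_fpkm or q_value >= alpha:  # alpha is an assumed cutoff
        return False
    fractional_decrease = (control_fpkm - deletion_fpkm) / control_fpkm
    return fractional_decrease >= min_change


# Two rows taken from Tables 9 and 10 (ADAMTSL3 and AURKB).
print(is_downregulated(2.17, 0.22, 6.88e-04))
print(is_downregulated(39.67, 30.91, 4.06e-02))
```

The upregulated set is obtained by the mirror-image test, requiring at least 1.5 FPKM in the deleted samples and an increase of at least 20% relative to controls.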
Table 8: Top 30 down regulated genes in RWPE-1
Gene Locus Control FPKM Deletion FPKM Ratio q Value
MIR2861 chr9 24.42 0.00 -25006.23 2.33E-02
HS6ST2 chrX 4.69 0.02 -225.97 5.79E-03
SLITRK5 chr13 2.60 0.03 -91.77 7.81E-04
MX2 chr21 13.09 0.21 -63.56 7.81E-04
RTP4 chr3 2.14 0.04 -49.87 4.39E-02
PADI2 chr1 3.10 0.07 -45.57 7.81E-04
BST2 chr19 13.74 0.31 -43.71 7.81E-04
SAA2 chr11 13.93 0.36 -38.85 7.81E-04
MX1 chr21 126.24 3.84 -32.90 7.81E-04
XAF1 chr17 34.78 1.38 -25.28 7.81E-04
THY1 chr11 2.97 0.14 -21.71 4.29E-03
CMPK2 chr2 10.70 0.52 -20.68 7.81E-04
AGR2 chr7 1.93 0.09 -20.68 1.44E-03
ELN chr7 26.72 1.36 -19.70 7.81E-04
CCL2 chr17 53.82 2.95 -18.13 7.81E-04
IFI6 chr1 306.55 16.80 -18.13 7.81E-04
ETV7 chr6 1.54 0.09 -16.91 4.78E-03
IFI44L chr1 48.17 2.93 -16.45 7.81E-04
CXCL11 chr4 20.82 1.51 -13.83 7.81E-04
CLDN11 chr3 9.65 0.73 -13.18 7.81E-04
SAMD9L chr7 7.26 0.59 -12.38 7.81E-04
ACKR1 chr1 1.96 0.16 -11.88 7.81E-04
OAS1 chr12 60.97 5.13 -11.88 7.81E-04
SERPINA3 chr14 8.11 0.75 -10.78 7.81E-04
RSAD2 chr2 7.06 0.66 -10.70 7.81E-04
LAMP3 chr3 3.23 0.31 -10.27 7.81E-04
TRIM22 chr11 17.03 1.79 -9.51 7.81E-04
IFIH1 chr2 16.68 1.78 -9.45 7.81E-04
SAA1 chr11 209.38 22.32 -9.38 7.81E-04
TLR2 chr4 3.25 0.38 -8.57 7.81E-04
Table 9: Top 30 down regulated genes in C4-2B
Gene Locus Control FPKM Deletion FPKM Ratio q Value
ADAMTSL3 chr15 2.17 0.22 -9.85 6.88E-04
PLA2G3 chr22 2.55 0.27 -9.51 6.88E-04
PTGER4 chr5 1.97 0.32 -6.15 6.88E-04
RAC2 chr22 3.07 0.57 -5.43 6.88E-04
TERC chr3 1.65 0.32 -5.17 1.94E-02
PLA2G4D chr15 3.84 0.80 -4.79 6.88E-04
HTR5A-AS1 chr7 1.88 0.45 -4.20 6.88E-04
BMP7 chr20 5.62 1.36 -4.14 6.88E-04
PREX1 chr20 1.68 0.42 -4.03 6.88E-04
H19 chr11 3.36 0.91 -3.68 6.88E-04
KCNH2 chr7 2.53 0.71 -3.56 6.88E-04
LGALS1 chr22 5.82 1.65 -3.53 6.88E-04
CHMP4A chr14 3.78 1.08 -3.53 6.88E-04
NEDD9 chr6 3.68 1.13 -3.27 6.88E-04
MAFF chr22 1.84 0.56 -3.25 6.88E-04
FGF21 chr19 2.13 0.68 -3.16 6.88E-04
PIP chr7 3.81 1.22 -3.10 6.88E-04
CACNA1H chr16 3.23 1.05 -3.07 6.88E-04
RTN4RL1 chr17 2.75 0.90 -3.07 6.88E-04
HYAL1 chr3 3.43 1.13 -3.03 6.88E-04
GNAL chr18 3.46 1.17 -2.95 6.88E-04
IFITM1 chr11 9.19 3.20 -2.89 6.88E-04
STC1 chr8 3.41 1.21 -2.83 6.88E-04
CLDN14 chr21 1.27 0.46 -2.79 6.88E-04
BAGE,BAGE5 chr21 5.66 2.07 -2.73 1.88E-03
CERS1 chr19 1.73 0.64 -2.71 6.88E-04
INSIG1 chr7 335.46 125.37 -2.68 6.88E-04
CA8 chr8 7.06 2.66 -2.66 6.88E-04
CDKN1C chr11 5.39 2.03 -2.66 6.88E-04
COLGALT2 chr1 1.80 0.68 -2.66 6.88E-04
Table 10: Top 30 down regulated genes in C4-2B after DHT treatment
Gene Locus Control FPKM Deletion FPKM Ratio q Value
SPIC chr12 3.89 2.01 -1.93 4.53E-02
EPHA3 chr3 5.21 2.79 -1.87 6.44E-03
SYTL2 chr11 7.36 4.50 -1.62 6.44E-03
PLK2 chr5 4.47 2.83 -1.58 6.44E-03
SPRY1 chr4 18.64 12.82 -1.45 6.44E-03
SLC30A1 chr1 15.67 10.85 -1.44 6.44E-03
HELLS chr10 14.62 10.27 -1.43 6.44E-03
CLSPN chr1 10.93 7.78 -1.40 6.44E-03
ASF1B chr19 40.79 29.24 -1.39 6.44E-03
CBLN2 chr18 40.50 29.65 -1.37 6.44E-03
SMAD6 chr15 14.42 10.63 -1.36 6.44E-03
UHRF1 chr19 10.06 7.41 -1.36 6.44E-03
GINS2 chr16 27.86 20.68 -1.35 2.49E-02
S1PR3 chr9 15.67 11.79 -1.33 6.44E-03
LINC01004 chr7 4.20 3.18 -1.32 6.44E-03
ORC6 chr16 21.26 16.11 -1.32 4.06E-02
DUSP2 chr2 37.27 28.05 -1.32 4.53E-02
BRCA2 chr13 14.52 11.08 -1.31 6.44E-03
FAM110B chr8 28.64 21.86 -1.31 6.44E-03
KIF18B chr17 12.13 9.32 -1.30 2.49E-02
ZNF460 chr19 13.09 10.06 -1.30 2.49E-02
DDIAS chr11 12.73 9.71 -1.30 3.16E-02
LIN7A chr12 119.43 91.77 -1.30 4.06E-02
POLQ chr3 10.13 7.84 -1.29 1.13E-02
E2F7 chr12 7.31 5.66 -1.29 2.49E-02
EXPH5 chr11 5.39 4.14 -1.29 2.85E-02
EXO1 chr1 12.47 9.65 -1.29 4.53E-02
TCF19 chr6_ssto_hap7 28.84 22.47 -1.28 6.44E-03
E2F1 chr20 25.63 19.97 -1.28 3.54E-02
AURKB chr17 39.67 30.91 -1.28 4.06E-02
Figure 10: PCA plots. Red dots are controls and blue dots are deletions. A. C4-2B without outliers via
Tuxedo tools. B. RWPE-1 via Tuxedo tools. C. C4-2B after DHT treatment via Tuxedo tools. D. C4-2B
without outliers via DESeq2. E. RWPE-1 via DESeq2. F. C4-2B after DHT treatment via DESeq2. G.
C4-2B with outliers via Tuxedo tools. H. C4-2B with only one outlier removed via Tuxedo tools.
Table 11: Number of differentially expressed genes in enhancer deleted cells
Cell line        Genome-wide downregulated   Genome-wide upregulated   Downregulated on Chr7   Upregulated on Chr7
RWPE-1           970                         518                       15                      38
C4-2B            544                         567                       30                      51
C4-2B with DHT   38                          125                       1                       6
4.4 Pathway analysis
The genome-wide downregulated genes were further characterized using the Ingenuity
Pathway Analysis software (Figure 11)[32].
Figure 11: Pathway analysis. A. RWPE-1 results. B. C4-2B results. C. C4-2B after DHT treatment results.
CHAPTER 5
DISCUSSION
Originally, I had predicted that I would observe transcriptomic and phenotypic changes
when E7P15 was deleted in C4-2B cells, but not in RWPE-1 cells. In addition, I had hoped to be
able to confirm or refute the predicted target genes of this enhancer. Surprisingly, I
found that both the C4-2B cells (having an active enhancer mark over rs10486567) and the
RWPE-1 cells (lacking an active enhancer mark over rs10486567) showed phenotypic and
transcriptomic changes after deletion of the 7p15.2 enhancer locus. In addition, the changes I
observed in the two cell lines were completely different. Finally, my results did not support the
previous bioinformatics predictions of target genes of this SNP/enhancer.
Because C4-2B cells are prostate cancer cells that have an active enhancer at E7P15, I
predicted that deletion of the enhancer would slow cell proliferation. According to the colony
formation assays, the growth of C4-2B cells was reduced when the enhancer was deleted (P <
0.001). More specifically, the C4-2B L+L deletion clearly showed fewer colonies than
the M+L deletions (Figure 7 B, C). However, in the cell proliferation assays the M+L colonies
grew more slowly than the L+L colonies (although both grew more slowly than control cells) (Figure 9 A, C).
To confirm these results, the proliferation assay was repeated for C4-2B (Figure 9 B D), with
similar results. The inconsistency between the two assays (colony formation and proliferation)
could be explained by the principles upon which they are based. The clonogenic assay is an in
vitro cell survival assay based on the ability of a single cell to grow into a colony, which
essentially tests the capacity of each cell to undergo unlimited division[33]. In contrast, nearly all
metabolic proliferation assays (e.g. WST-1, MTT, XTT, and MTS) measure the amount of
formazan dye generated by lactate dehydrogenase (LDH) using either tetrazolium salts or similar
products such as alamar blue. The rationale is that cancer cells have increased glycolysis and the
elevated pyruvate will be largely converted to lactate instead of being consumed by mitochondria
under aerobic conditions. This lactate generation pathway normally happens in an anaerobic
environment. However, it is still active in cancer cells even under oxygen sufficient conditions;
this is termed the Warburg effect[34, 35]. The lactate generation is catalyzed by LDHA, the A
form of LDH, and the active LDHA turns the glucose metabolism from energy generation to
organic compound storage resulting in biomass incorporation and proliferation. According to the
RNA-seq results, the cell proliferation results were highly correlated with the FPKM value of the
LDHA gene; the C4-2B M+L deletion cells had lower LDHA expression than did the L+L
deletion cells. There was no significant difference in the proliferation assays upon deletion of the
enhancer in RWPE-1 cells (Figures 7, 8, and 9), and these cells have LDHA expression similar to
that of the control cells. Thus, short-term assays may simply be monitoring levels of LDHA. If so,
then short-term cell proliferation assays may be less informative than a long-term colony
formation assay for my research because mechanisms other than LDHA expression level also
play pivotal roles in cell proliferation. When the proliferation assay was extended from 5 days to
7 days, the L+L deletion showed lower values than the M+L deletion, which could be caused by the
L+L deletion having fewer cells at the later time point. Therefore, I recommend that the colony
formation assay, or a long-term cell proliferation assay that monitors actual cell number, be
used for future experiments[36].
As indicated above, three previous studies predicted three different target genes
for this enhancer: JAZF1 at 100 kb, TAXBP1 at 196 kb, and HOXA13 at 873 kb from the enhancer.
JAZF1 is not expressed in C4-2B cells before or after deletion of the enhancer; similarly, the low expression of
JAZF1 in RWPE-1 cells remained unchanged upon enhancer deletion. TAXBP1 is highly expressed
in both cell lines and is not affected by deleting the enhancer region. The HOXA13 gene has a
low expression level in C4-2B cells and shows only a 20% downregulation upon enhancer deletion
(P<0.05). My initial hypothesis was that the risk SNP affects an enhancer, which leads to altered
gene expression. Therefore, my expected result for RWPE-1 was no change at all in gene
expression (since there was no active enhancer signal in that cell line). Instead, the expression of
HOXA13 was significantly upregulated, by 6.34-fold (P<0.001), in RWPE-1 cells having the
enhancer deletion, which suggests a true association (albeit the opposite of what was predicted)
between the 7p15.2 enhancer locus and the HOXA13 gene in that cell line (Figures 12 and 13).
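Figure 12 reports expression changes as log2 fold changes, whereas the paragraph above quotes linear values. As a worked conversion, the Python sketch below maps the two HOXA13 values onto the log2 scale. It assumes that "a 20% downregulation" corresponds to a deletion/control ratio of 0.80 and that "6.34 fold" is a deletion/control ratio of 6.34; the function name is illustrative.

```python
import math


def log2_fold_change(ratio: float) -> float:
    """Convert a deletion/control expression ratio to a log2 fold change."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.log2(ratio)


hoxa13_c42b = log2_fold_change(0.80)   # assumed ratio for a 20% downregulation in C4-2B
hoxa13_rwpe1 = log2_fold_change(6.34)  # 6.34-fold upregulation in RWPE-1

print(f"C4-2B:  {hoxa13_c42b:.2f}")
print(f"RWPE-1: {hoxa13_rwpe1:.2f}")
```

On this scale the C4-2B change is a fraction of one log2 unit, while the RWPE-1 change exceeds two log2 units, which is why the two effects look so different in magnitude on a log2 axis.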
Figure 12: Gene expression changes within 2 MB of rs10486567. The Y axis is the
log2 fold change. The X axis is the distance between the transcription
start site and rs10486567. Red dots are genes that are not sufficiently
expressed. Dot size is correlated with FPKM value. A. Gene
expression changes in C4-2B cells. B. Gene expression changes
in C4-2B cells after DHT treatment. C. Gene expression changes
in RWPE-1 cells.
Figure 13: Circos plots of the top 30 differentially expressed genes. A. C4-2B top 30
downregulated genes. B. C4-2B top 30 upregulated genes. C. C4-2B top 30 downregulated genes
after DHT treatment. D. C4-2B top 30 upregulated genes after DHT treatment. E. RWPE-1
top 30 downregulated genes. F. RWPE-1 top 30 upregulated genes.
How could deleting a region without an active enhancer marker boost the expression of a
gene that is 873 kb away? One hypothesis is that the gene is affected by other regulatory elements
falling in the deletion region, such as a CCCTC-binding factor (CTCF) site, which is a chromatin
domain element[37]. If a chromatin topological domain was changed by deleting a CTCF site,
then one could expect to see changes in the expression of distal genes (in both a positive and a
negative direction). ENCODE LNCaP CTCF ChIP-seq data show one or two CTCF peaks within
the deleted region, which is consistent with our data from the other prostate cell lines 22RV1, PrEC,
RWPE-2 and VCaP (Figure 14 B, C). These data reveal a consistent CTCF binding pattern at
the 7p15.2 enhancer locus across several cell lines, which exists in both normal and tumor cells
regardless of DHT treatment. However, according to the CTCF ChIP-seq data from the Farnham
lab, there is no CTCF binding site in C4-2B, LNCaP and RWPE-1 cells within the deleted region
(Figure 14 A). Our CTCF ChIP-seq experiments were all performed using the same antibody but
with two different lot numbers; C4-2B, LNCaP and RWPE-1 libraries were generated with one
lot number but the ChIPs for the other cell lines were done with another lot of antibody. Perhaps
the observed lack of a CTCF site in the C4-2B and RWPE-1 cell lines was caused by the specific
lot number of the CTCF antibody; this can be tested in future experiments. I also note that
deletion of the putative CTCF site in C4-2B cells did not lead to upregulation of HOXA13.
Perhaps the chromatin domains are different in the two cell types.
RNA-seq identified more than a thousand differentially expressed genes genome-wide
in each cell line (Table 11). However, no genes were significantly downregulated within 2 MB of
the SNP in either cell line. The results in C4-2B cells (showing that only genes on different
chromosomes were greatly affected by deletion of the enhancer) could perhaps be explained by
translocations: a remote gene may have been translocated near the enhancer, or vice versa.
In addition, chromosome instability is a severe issue in many prostate cancer cell lines[38, 39]. PCa
cells can randomly gain or lose entire sets of chromosomes. According to a previous study using
DU145 prostate cancer cells, two weeks after plating as single cells, some colonies had two-fold
more chromosomes than others[39]. CRISPR/Cas9 editing relies on isolating single-cell colonies, which
requires a single cell to divide around 15 times. The C4-2B cells in my research underwent
two rounds of CRISPR to obtain a full knockout, which might have resulted in chromosomal
abnormalities. Therefore, in the future it would be very helpful to obtain translocation and
karyotyping information for the C4-2B cells.
Interestingly, the affected genes were very different between the tumor and normal lines.
Surprisingly, the DHT treatment rescued and even reversed the effects of the deletion on the
expression of the majority of the genes, suggesting that the effects caused by deleting the
enhancer were minimized by growth in high levels of androgen hormone. Perhaps the DHT
boosted gene expression, including that of the downregulated genes, and due to complex gene
regulation networks the previously identified differentially expressed genes reached a balance in
both control and deleted cells. According to the pathway analysis, cholesterol synthesis
was the most strongly changed pathway in C4-2B cells. In DHT-treated C4-2B cells, by contrast,
the cholesterol pathway was not affected, and no other pathway was significantly
affected. Strikingly, the differentially expressed genes for RWPE-1 are significantly enriched for
interferon (IFN)-inducible genes, which have been shown to be important in prostate cancer
therapy[40, 41]. In a previous study, researchers compared the expression profiling of benign
prostate cancer cells and nontumorigenic parental cells. A significant portion of the down-
regulated genes belonged to interferon-inducible genes[42]. This result suggests that the deleted
region at the 7p15.2 locus can strongly suppress prostate tumor progression. As indicated above,
in RWPE-1 cells the expression of HOXA13 increased several fold. HOXB13 has been
confirmed to play an important role in prostate cancer progression, and mutation of HOXB13
is associated with a significant increase in prostate cancer susceptibility. These observations suggest that
HOXA13 may also contribute to prostate cancer initiation.
Figure 14: CTCF peaks around the enhancer region. A. CTCF ChIP-seq using
antibody lot#34614003. B. CTCF ChIP-seq data from ENCODE. C. CTCF ChIP-seq
using antibody lot#23913002.
CHAPTER 6
FUTURE DIRECTIONS
My major findings are: 1) deletion of a small genomic region can have large effects on
the transcriptome and on cell growth assays; 2) the top deregulated genes are scattered through
the genome and are not near the enhancer; 3) deletion of the same region in different cells can
have very different effects, and 4) deletion of a region lacking an active enhancer mark can
greatly affect the transcriptome. These findings bring up several different interesting questions
concerning gene regulation.
1) Are any of the deregulated genes direct target genes? As noted above, none of the
downregulated genes are near the deleted region. This suggests that perhaps a) the enhancer can
loop to a promoter very far away but on the same chromosome, b) the enhancer can loop to
another chromosome to activate a promoter, c) the genes showing large downregulation have been
translocated near the enhancer, or d) the deletion affects a hitherto unannotated
regulatory element (such as a long noncoding RNA or a chromosome structural domain) that
affects gene expression in trans.
To continue to address these questions, I propose that several different experiments
should be undertaken.
Expt 1: Are the observed effects due to off-target binding of Cas9? To make sure
the observed changes were not caused by off-target effects, PCR products covering the top 20
potential off-target sites predicted by bioinformatics tools can be sequenced, although this kind
of verification in other studies has not revealed any severe off-target effects[43].
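As a rough illustration of how candidate off-target sites might be shortlisted for PCR and sequencing, the Python sketch below ranks candidate sites by the number of mismatches to a guide sequence. This is a deliberately simplified stand-in for the bioinformatics tools mentioned above, which also weigh factors such as mismatch position and the PAM. The guide sequence, site names, and site sequences are hypothetical and are not the ones used in this work.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count positions where a candidate site differs from the guide sequence."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def rank_sites(guide: str, sites: dict, top_n: int = 20) -> list:
    """Return the names of the top_n candidate sites with the fewest mismatches."""
    scored = [(count_mismatches(guide, seq), name) for name, seq in sites.items()]
    return [name for _, name in sorted(scored)[:top_n]]


# Hypothetical 20-nt guide and candidate sites, for illustration only.
guide = "GACGTTACCGGATCAAGCTT"
candidates = {
    "site_a": "GACGTTACCGGATCAAGCTA",  # 1 mismatch
    "site_b": "GACCTTACGGGATCAAGCTT",  # 2 mismatches
    "site_c": "TACGTTACCGGTTCAAGGTT",  # 3 mismatches
}

print(rank_sites(guide, candidates, top_n=2))
```

With top_n set to 20, the same function would return the shortlist of sites to amplify and sequence.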
Expt 2: Is chr7p15 in its natural genomic location? The question as to why there are huge
transcriptomic changes but no genes significantly downregulated within 2 MB might be
answered by translocations. Due to the limited translocation information about the C4-2B and
RWPE-1 cells, spectral karyotyping (SKY) should be done to map the translocation events. This
technique involves painting each chromosome with artificially colored probes and then
visualizing translocations. Alternatively, genomic DNA can be sequenced to find translocation
sites using the sequencing technology provided by 10x Genomics.
Expt 3: Does E7P15 engage in long-range and/or interchromosomal enhancer-promoter
interactions? Although such interactions are controversial, some papers claim that they can be
detected[44]. Looping assays such as circular chromosome conformation capture (4C) can be
used to discover promoters that interact with the enhancer[45]. However, many investigators feel
that 4C cannot accurately detect interchromosomal interactions. An alternative way to confirm
interchromosomal regulation would be to use fluorescence in situ hybridization (FISH) to
visualize the chromosome interactions, or Cas9-mediated fluorescence in situ
hybridization (CASFISH)[46]. The latter technique uses CRISPR combined with an RNA probe to
label the target sequence in the intact genome, which preserves spatial relationships and is very
convenient to perform.
Expt 4: Does E7P15 overlap a chromatin structural domain element? In addition to
enhancers, a class of proteins has been identified that can control distal gene expression,
the best characterized being CTCF, cohesin, and ZNF143[47-49]. Disruption of these complexes
can alter the structure of large topologically associating domains (TADs), resulting in changes in
gene expression[50]. I note that in my studies, not only the risk SNP but a region of
nearly 5 kb was deleted. Previous studies have shown that CTCF can bind to this
region in some cell types, but we have not been able to show binding in RWPE-1 or C4-2B cells.
However, there is some concern about the antibody used in the experiments. Therefore, future
experiments should include ChIP for CTCF, ZNF143, and cohesin in these cells to
determine if the E7P15 region is an important structural domain element. Alternatively, Hi-C could be
performed in the control and deleted cells to see if major structural changes occur throughout the
genome[51].
REFERENCES
1. Eeles, R., et al., The genetic epidemiology of prostate cancer and its clinical implications.
Nature Reviews Urology, 2014. 11(1): p. 18-31.
2. Thalmann, G.N., et al., Androgen-independent cancer progression and bone metastasis in
the LNCaP model of human prostate cancer. Cancer research, 1994. 54(10): p. 2577-
2581.
3. Hazelett, D.J., et al., Comprehensive functional annotation of 77 prostate cancer risk
loci. PLoS Genet, 2014. 10(1): p. e1004102.
4. Al Olama, A.A., et al., A meta-analysis of 87,040 individuals identifies 23 new
susceptibility loci for prostate cancer. Nature genetics, 2014. 46(10): p. 1103-1109.
5. Howlader N, N.A., Krapcho M, Miller D, Bishop K, Altekruse SF, Kosary CL, Yu M,
Ruhl J, Tatalovich Z, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA (eds). SEER
Cancer Statistics Review, 1975-2012, National Cancer Institute. Bethesda, MD. 2012.
6. Siegel, R.L., K.D. Miller, and A. Jemal, Cancer statistics, 2016. CA: a cancer journal for
clinicians, 2016. 66(1): p. 7-30.
7. Horvath, P. and R. Barrangou, CRISPR/Cas, the immune system of bacteria and archaea.
Science, 2010. 327(5962): p. 167-170.
8. Haiman, C.A., et al., Multiple regions within 8q24 independently affect risk for prostate
cancer. Nature genetics, 2007. 39(5): p. 638-644.
9. Gudmundsson, J., et al., Common sequence variants on 2p15 and Xp11.22 confer
susceptibility to prostate cancer. Nature genetics, 2008. 40(3): p. 281-283.
10. Eeles, R.A., et al., Identification of seven new prostate cancer susceptibility loci through
a genome-wide association study. Nature genetics, 2009. 41(10): p. 1116-1121.
11. Coetzee, S.G., et al., FunciSNP: an R/bioconductor tool integrating functional non-
coding data sets with genetic association studies to identify candidate regulatory SNPs.
Nucleic acids research, 2012: p. gks542.
12. Han, Y., et al., Integration of multiethnic fine-mapping and genomic annotation to
prioritize candidate functional SNPs at prostate cancer susceptibility regions. Human
molecular genetics, 2015: p. ddv269.
13. Larson, N.B., et al., Comprehensively evaluating cis-regulatory variation in the human
prostate transcriptome by using gene-level allele-specific expression. The American
Journal of Human Genetics, 2015. 96(6): p. 869-882.
14. Whitington, T., et al., Gene regulatory mechanisms underpinning prostate cancer
susceptibility. Nature genetics, 2016. 48(4): p. 387-397.
15. Podlasek, C.A., J.Q. Clemens, and W. Bushman, Hoxa-13 gene mutation
results in abnormal seminal vesicle and prostate development. The Journal of urology,
1999. 161(5): p. 1655-1661.
16. Li, Z., et al., The long non-coding RNA HOTTIP promotes progression and gemcitabine
resistance by regulating HOXA13 in pancreatic cancer. J Transl Med, 2015. 13: p. 84.
17. Zhang, S., et al., Long noncoding RNA HOTTIP contributes to the progression of
prostate cancer by regulating HOXA13. Cellular and molecular biology (Noisy-le-Grand,
France), 2015. 62(3): p. 84-88.
18. Yao, L., B.P. Berman, and P.J. Farnham, Demystifying the secret mission of enhancers:
linking distal regulatory elements to target genes. Critical reviews in biochemistry and
molecular biology, 2015. 50(6): p. 550-573.
19. Russell, P.J., P. Jackson, and E.A. Kingsley, Prostate cancer methods and protocols. Vol.
81. 2003: Springer.
20. Bello, D., et al., Androgen responsive adult human prostatic epithelial cell lines
immortalized by human papillomavirus 18. Carcinogenesis, 1997. 18(6): p. 1215-1223.
21. Mali, P., et al., RNA-guided human genome engineering via Cas9. Science, 2013.
339(6121): p. 823-826.
22. Ran, F.A., et al., Genome engineering using the CRISPR-Cas9 system. Nature protocols,
2013. 8(11): p. 2281-2308.
23. Ran, F.A., et al., Double nicking by RNA-guided CRISPR Cas9 for enhanced genome
editing specificity. Cell, 2013. 154(6): p. 1380-1389.
24. Veres, A., et al., Low incidence of off-target mutations in individual CRISPR-Cas9 and
TALEN targeted human stem cell clones detected by whole-genome sequencing. Cell
stem cell, 2014. 15(1): p. 27-30.
25. Kim, D., et al., Digenome-seq: genome-wide profiling of CRISPR-Cas9 off-target effects
in human cells. Nature methods, 2015. 12(3): p. 237-243.
26. Cass, C.E., Density-dependent resistance to puromycin in cell cultures. Journal of
cellular physiology, 1972. 79(1): p. 139-146.
27. Trapnell, C., et al., Differential gene and transcript expression analysis of RNA-seq
experiments with TopHat and Cufflinks. Nature protocols, 2012. 7(3): p. 562-578.
28. Kim, D., et al., TopHat2: accurate alignment of transcriptomes in the presence of
insertions, deletions and gene fusions. Genome Biol, 2013. 14(4): p. R36.
29. Dobin, A., et al., STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 2013.
29(1): p. 15-21.
30. Love, M.I., W. Huber, and S. Anders, Moderated estimation of fold change and
dispersion for RNA-seq data with DESeq2. Genome biology, 2014. 15(12): p. 1-21.
31. Conesa, A., et al., A Survey of Best Practices for RNA-seq Data Analysis. 2016.
32. The [networks, functional analyses, etc.] were generated through the use of QIAGEN’s
Ingenuity Pathway Analysis (IPA®, QIAGEN Redwood City,
http://www.qiagen.com/ingenuity).
33. Franken, N.A., et al., Clonogenic assay of cells in vitro. Nature protocols, 2006. 1(5): p.
2315-2319.
34. Turner, J. and E. Brittain, Oxygen as a factor in photosynthesis. Biol. Rev, 1962. 37: p.
130-170.
35. Vander Heiden, M.G., L.C. Cantley, and C.B. Thompson, Understanding the Warburg
effect: the metabolic requirements of cell proliferation. Science, 2009. 324(5930): p.
1029-1033.
36. Cai, Y., et al., Loss of Chromosome 8p Governs Tumor Progression and Drug Response
by Altering Lipid Metabolism. Cancer cell, 2016. 29(5): p. 751-766.
37. Ong, C.-T. and V.G. Corces, CTCF: an architectural protein bridging genome topology
and function. Nature Reviews. Genetics, 2014. 15(4): p. 234.
38. Pan, Y., et al., Characterization of chromosomal abnormalities in prostate cancer cell
lines by spectral karyotyping. Cytogenetic and Genome Research, 2000. 87(3-4): p. 225-
232.
39. Beheshti, B., et al., Evidence of chromosomal instability in prostate cancer determined by
spectral karyotyping (SKY) and interphase FISH analysis. Neoplasia, 2001. 3(1): p. 62-69.
40. Bulbul, M., R. Huben, and G. Murphy, Interferon-β treatment of metastatic prostate
cancer. Journal of surgical oncology, 1986. 33(4): p. 231-233.
41. Ren, C., et al., Cancer gene therapy using mesenchymal stem cells expressing interferon-
β in a mouse prostate cancer lung metastasis model. Gene therapy, 2008. 15(21): p.
1446-1453.
42. Shou, J., et al., Expression profiling of a human cell line model of prostatic cancer
reveals a direct involvement of interferon signaling in prostate tumor progression.
Proceedings of the National Academy of Sciences, 2002. 99(5): p. 2830-2835.
43. O'Geen, H., et al., A genome-wide analysis of Cas9 binding specificity using ChIP-seq
and targeted sequence capture. Nucleic acids research, 2015. 43(6): p. 3389-3404.
44. Cai, M., et al., 4C-seq revealed long-range interactions of a functional enhancer at the
8q24 prostate cancer risk locus. Scientific reports, 2016. 6.
45. Zhao, Z., et al., Circular chromosome conformation capture (4C) uncovers extensive
networks of epigenetically regulated intra-and interchromosomal interactions. Nature
genetics, 2006. 38(11): p. 1341-1347.
46. Deng, W., et al., CASFISH: CRISPR/Cas9-mediated in situ labeling of genomic loci in
fixed cells. Proceedings of the National Academy of Sciences, 2015. 112(38): p. 11870-
11875.
47. Bell, A.C., A.G. West, and G. Felsenfeld, The protein CTCF is required for the enhancer
blocking activity of vertebrate insulators. Cell, 1999. 98(3): p. 387-396.
48. Parelho, V., et al., Cohesins functionally associate with CTCF on mammalian
chromosome arms. Cell, 2008. 132(3): p. 422-433.
49. Zlotorynski, E., Chromatin: ZNF143 in the loop. Nature Reviews Molecular Cell
Biology, 2015. 16(3): p. 127-127.
50. Narendra, V., et al., CTCF establishes discrete functional chromatin domains at the Hox
clusters during differentiation. Science, 2015. 347(6225): p. 1017-1021.
51. Belton, J.-M., et al., Hi–C: a comprehensive technique to capture the conformation of
genomes. Methods, 2012. 58(3): p. 268-276.
Abstract
Prostate cancer (PCa) is the second most commonly diagnosed cancer in men in the United States and one of the most common types of cancer in men worldwide. To better understand the gene regulation mechanisms underpinning prostate cancer progression, a PCa risk locus was deleted using clustered regularly interspaced short palindromic repeats (CRISPR) in two prostate cell lines, with and without an active enhancer mark encompassing the risk SNP within the locus. RNA-seq, colony formation assays, and cell proliferation assays were performed to investigate the transcriptomic and phenotypic changes. My results suggest that this locus does affect prostate cancer susceptibility.
Asset Metadata
Creator: Luo, Zhifei (author)
Core Title: Functional analysis of a prostate cancer risk enhancer at 7p15.2
School: Keck School of Medicine
Degree: Master of Science
Degree Program: Biochemistry and Molecular Biology
Publication Date: 07/26/2018
Defense Date: 06/23/2016
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: enhancer, epigenetics, genomics, OAI-PMH Harvest, prostate cancer
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Farnham, Peggy (committee chair), Stallcup, Michael (committee member), Tokes, Zoltan (committee member)
Creator Email: zhifeilu@usc.edu, zhifeiluo1208@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c40-284743
Unique identifier: UC11279589
Identifier: etd-LuoZhifei-4666.pdf (filename), usctheses-c40-284743 (legacy record id)
Legacy Identifier: etd-LuoZhifei-4666.pdf
Dmrecord: 284743
Document Type: Thesis
Rights: Luo, Zhifei
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA