IDENTIFICATION OF NOVEL ANDROGEN RECEPTOR
TARGET GENES IN PROSTATE CANCER
by
Unnati Jariwala
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(BIOCHEMISTRY AND MOLECULAR BIOLOGY)
August 2007
Copyright 2007 Unnati Jariwala
Dedication
This dissertation is dedicated to all those who battle with cancer. I also dedicate this
thesis to my family and friends.
Acknowledgements
I want to acknowledge the many individuals who have contributed to my scientific
training, this thesis and those who have provided inspiration and moral support. This
is of course a partial list.
Dr. Baruch Frenkel – for taking me into his lab when I needed a new home, and for
guiding my research with creative ideas, optimism, encouragement and excellent
mentoring.
To the Frenkel Lab – for making our lab a great place to work in and for their
support and camaraderie. Thanks especially to Dr. Nathalie Leclerc and Jon Cogan.
To the Coetzee Lab – for collaborating on the ChIP Display and ChIP-on-chip
projects. Thanks especially to Dr. Jen Prescott, Dr. Li Jia, and Dr. Grant Buchanan.
To former Anderson lab members – for their constant support and encouragement.
To my family and friends – for moral support and encouragement.
Table of Contents
Dedication
Acknowledgements
List of tables
List of figures
Abstract
Chapter 1: Introduction
1.1 Background on prostate cancer
1.2 Androgen regulation of prostate development
1.3 AR gene and structure
1.4 Molecular mechanisms of AR action
1.5 AR – A culprit in prostate cancer
1.6 AR target genes
1.6A. PSA
1.6B. FGF8
1.6C. PMEPA1
1.6D. TMPRSS2
1.7 Dissertation overview
Chapter 2: Materials and Methods
2.1 Cell culture and materials
2.2 ChIP
2.3 ChIP Display
2.4 AR siRNA
2.5 RT-PCR
2.6 Gene expression analysis in clinical prostate cancer specimens
2.7 Oligonucleotides used for ChIP Display studies
2.7A. ChIP Display oligonucleotides
2.7B. ChIP primers
2.7C. RT-qPCR primers
2.7D. siRNA oligonucleotides
2.8 ChIP-on-chip
2.9 Peak calling of AR-occupied regions
2.10 Location analysis
2.11 Bioinformatics
2.12 Cloning
2.13 Luciferase assays
2.14 Gene expression analysis
2.15 Expression tiling array
2.16 Oligonucleotides used for ChIP-on-chip studies
2.16A. ChIP validation primers
2.16B. Cloning primers
2.16C. RT-qPCR primers
Chapter 3: Identification of novel AR target genes by ChIP Display
3.1 Technical aspects and usefulness of ChIP Display
3.2 ChIP Display of AR targets in C4-2B cells: an example
3.3 ChIP Display discloses 19 novel AR binding sites in PCa cells
3.4 AR occupied regions are associated with DHT-stimulated and DHT-repressed genes in C4-2B cells
3.5 AR-dependent, DHT-independent regulation of OAT and MRFAP1
3.6 Differential regulation of genes near AR-occupied regions in LNCaP versus C4-2B cells
3.7 Clinical relevance of novel AR target genes
3.8 Discussion
3.8A. AR occupancy is not biased towards 5’ promoter-proximal regions
3.8B. AR location analysis discloses ligand-independent, AR-dependent gene regulation
3.8C. Differential basal gene expression in C4-2B versus LNCaP cells
3.8D. Novel AR target genes: Potential mechanisms contributing to PCa progression
3.8D(a). PYCR1
3.8D(b). OAT
3.8D(c). PRKCD
3.8D(d). CRELD2 and DDT
3.8D(e). MUC6
3.8D(f). TRPV3 and GSTT2
3.8D(g). AR occupancy at prostate cancer linked loci
Chapter 4: Identification of novel AR target genes by ChIP-on-chip
4.1 Technical details and usefulness of ChIP-on-chip
4.2 AR ChIP-on-chip
4.3 ChIP-on-chip discloses 62 novel AR binding sites on chromosome 19 and 20 in C4-2B PCa cells
4.4 Location Analysis of AR occupied sites
4.5 Half of our AR-occupied sites are associated with histone acetylation
4.6 Conservation of non-exonic AR occupied regions
4.7 Many novel AR-occupied sites can function as enhancers
4.8 AR regulates only a small number of genes in the vicinity of the 62 occupied loci
4.9 AR-mediated regulation of a conserved transcript of unknown function
4.10 Discussion
4.10A. ChIP-on-chip data confirms the notion that AR-occupancy is not biased to 5’ promoter proximal regions
4.10B. AR binds to more highly conserved sequences, but far from TSS when compared to ER
4.10C. AR occupancy at the majority of our 62 loci is not sufficient for gene expression alterations
4.10D. AR mediated regulation of TUF
4.10E. ChIP-chip discloses AR target genes with (potential) roles in PCa
4.10E(a). JAG1
4.10E(b). MMP24
4.10E(c). TGM2
4.10E(d). CBFA2T2
Chapter 5: Overall conclusion
References
List of Tables
Table 1 AR targets identified by ChIP Display
Table 2 AR targets identified by ChIP-on-chip
Table 3 Low confidence ARORs reproduced in 1/3 experiments
Table 4 Low confidence ARORs reproduced in 2/3 experiments
List of Figures
Figure 1 AR gene and protein structure
Figure 2 Principles of ChIP Display
Figure 3 ChIP Display (CD) demonstrates a putative AR target
Figure 4 Gene expression analysis
Figure 5 Effects of AR siRNA-knockdown on gene expression
Figure 6 Gene expression in C4-2B versus LNCaP cells
Figure 7 Expression of CD-disclosed genes in PCa tumors
Figure 8 Principles of ChIP-on-chip
Figure 9 ChIP-on-chip discloses 62 novel AR binding sites
Figure 10 AR occupied loci
Figure 11 Analysis of AR-occupied regions
Figure 12 Correlation of Pol II and AR-occupied regions to histone acetylation
Figure 13 Conservation of AR-occupied regions
Figure 14 Map of TkLuc2+
Figure 15 DHT time course experiment and gene expression analysis
Figure 16 TUF1
Figure 17 TUF1 sequence conservation
Abstract
The androgen receptor (AR) has unequivocal roles throughout all phases of
prostate cancer (PCa). The downstream target genes that mediate the receptor’s
oncogenic functions, however, remain ill-defined. This thesis describes studies
undertaken to identify AR target genes in C4-2B advanced human PCa cells using
two chromatin immunoprecipitation-based methodologies: ChIP Display and ChIP-
on-chip.
Using ChIP Display we discovered 19 AR-occupied regions (ARORs). Our
data suggest that AR does not prefer binding to 5’ proximal-promoter sequences;
rather, it tends to occupy intragenic sequences, mostly introns.
Approximately half of the genes near ARORs were regulated by the AR, usually
positively, and usually, but not always, in a ligand-dependent manner. Although the
AR generally bound the same regions in the androgen-dependent LNCaP and the
castrate resistant cell line C4-2B, we detected differences in how nearby genes are
regulated. Finally, we provide evidence in support of AR-mediated regulation of
some of the newly discovered AR target genes in vivo.
Using ChIP-on-chip we identified an additional 62 novel ARORs on
chromosomes 19 and 20. Again, AR demonstrated no preference for binding to
5’gene-flanking sequences. Sixty percent of the 62 ARORs were intragenic. Most
of the 62 ARORs were far from transcription start sites, overlapped with histone
acetylation, and exhibited some conservation across species. Sixty percent of
ARORs demonstrated enhancer activity in reporter assays. Several approaches we
took to analyze gene expression showed that only 9/95 annotated genes in proximity
to ARORs responded to DHT treatment. The use of a high-resolution tiling
microarray led us to identify TUF1 (transcript of unknown function 1), which was
highly stimulated by DHT. Sequence analysis revealed that TUF1 encodes a
putative collagenous protein with a high degree of conservation.
Several genes we identified as AR targets (for example, PRKCD, PYCR1,
JAG1 and MMP24) have been implicated in cancer progression or have functions
which make them likely candidates for mediating AR’s oncogenic functions. Future
studies on the molecular mechanisms of these target genes in PCa will shed light on
aspects of disease progression and may provide novel targets for drug development.
Chapter 1: Introduction
1.1 Background on prostate cancer
Prostate cancer (PCa) is the most common non-cutaneous cancer diagnosed
in men and remains the second leading cause of cancer-related mortality among
men in the United States (1). Each year around 230,000 men in the USA are
diagnosed with PCa and of these, about 30,000 succumb to this disease (1). Age
seems to be the single most important risk factor for developing PCa with incidence
of the disease rising dramatically in men over the age of 50 (49). Other key risk
factors include ethnic background and heredity. African American men have 40%
higher incidence of PCa compared to Caucasian men in the same age group and
having a first-degree relative with PCa doubles the risk of men getting PCa (49).
Lastly, environmental factors such as geographical location (91) and high-fat diets
(92) are also thought to contribute to PCa in men with no familial history of the
disease.
1.2 Androgen regulation of prostate development
The human prostate is approximately the size and shape of a walnut and is
located right below the bladder and above the rectum. Prostate growth and
development begin in fetal life and these processes are initiated by and dependent on
androgens produced by fetal testis (66). The prostate develops upon receiving
androgenic stimulation from the urogenital sinus (UGS), a structure composed of
both epithelial and mesenchymal layers. The first known response to androgens by
the (murine) UGS is the expression of Nkx3.1, a homeobox gene which leads to the
outgrowth of prostatic epithelial buds (89) from the UGS into the mesenchymal
layer. Next, the epithelial buds grow in a specific spatial pattern, elongate, and
bifurcate. This process of branching morphogenesis is also driven by androgens and
gives rise to three prostatic zones: a central zone, a transition zone and a peripheral
zone (66). Each of these three zones reflects a different duct in the human
prostate. Concurrent with the processes described above, epithelial and mesenchymal
differentiation occurs, providing cells with specialized functions, such as
production of prostatic secretory proteins (66). The process of differentiation is also
regulated by androgens leading to mature prostate ducts containing three major cell
types: luminal secretory epithelial cells, basal epithelial cells and stromal smooth
muscle cells (66). As androgens are essential to prostate development, mice (31) and
humans (10) harboring inactivating mutations in the AR fail to develop a prostate. In
adult men, androgens continue to play a role in cell survival. Consequently,
androgen ablation (withdrawal) leads to prostate involution and epithelial cell loss
via apoptosis. Re-administration of androgens results in the return to normal size
and function of the prostate due to rapid proliferation and differentiation of stem
cells in the basal epithelial compartment (57).
1.3 AR gene and structure
The AR is a member of the nuclear hormone receptor family and is
responsible for mediating the physiological effects of androgens described above.
The AR gene resides on chromosome Xq11-12 and spans ~90-kb of DNA containing
eight exons (9, 63) (see Figure 1). The gene encodes the 110 kDa AR protein,
which, along with other nuclear hormone receptor family members such as
Glucocorticoid Receptor (GR), Progesterone Receptor (PR) and Estrogen Receptor
(ER), shares the following conserved structural features. Exons 2 and 3 code for the
central DNA binding domain (DBD) consisting of two zinc-finger motifs. Adjacent
to the DBD lies the carboxyl terminal ligand binding domain (LBD), coded by exons
4-8 (Figure 1) (25). The LBD also contains an activation domain (AF-2) that is
important for the transcriptional activity of the receptor. A nuclear localization
signal is embedded in a small hinge region in between the DBD and the LBD (Figure
1). In contrast to the highly conserved regions described above, the N terminal
region of nuclear hormone receptors is the least conserved and can vary from a few
amino acids up to 500 in the case of AR (1). The AR N-terminal domain contains a
second activation domain (AF-1) and Poly Q repeats which can vary in size. AR
with shorter repeats has higher activity and some studies have found a correlation
between Poly-Q tract length and prostate cancer risk (25).
Figure 1) AR gene and protein structure. The location of AR on chromosome X,
its intron / exon organization and the functional protein domains are depicted (25).
1.4 Molecular mechanism of AR action
In the absence of ligand, the AR resides in the cytoplasm in association with
chaperone proteins called heat shock proteins (HSPs). Ligand binding causes
conformational changes in the AR leading to dissociation from HSPs, receptor
dimerization and subsequent translocation into the nucleus. Liganded AR binds to
specific DNA sequences called androgen responsive elements and initiates the
formation of a protein complex (classically thought to occur at gene promoters, but
see our results and discussion below). This protein complex includes, among others,
RNA Pol II and primary and secondary co-activators, which serve to enhance AR
target gene transcription (1). Examples of primary AR co-activators, which bind
directly to AR and which serve as a scaffold for other co-regulatory proteins, include
the p160 members SRC-1 and GRIP1/TIF-2 (1). Secondary co-activators, which do
not bind directly to AR, include p300/CBP and possess histone acetyl transferase
activity (1). These co-activator proteins serve important roles in AR-mediated gene
regulation and not surprisingly, altered expression of these proteins is also implicated
in prostate carcinogenesis (85).
1.5 AR – A culprit in PCa
In the previous sections the importance of androgens to all aspects of normal
prostate development and growth was described. Not surprisingly, prostate cancer
cells remain dependent on androgens for growth and survival, at least initially. Due
to this dependency on androgens, surgical or pharmacological castration (with AR
antagonists) is the standard therapeutic option for PCa management. Initially,
ablation therapy markedly inhibits AR, as evidenced by tumor regression. However, in
most cases this is inevitably followed by recurrence of the cancer, often at metastatic
sites. Most metastatic PCa are castrate-resistant (i.e. resistant to further hormone
manipulation). Surprisingly though, AR expression and function are maintained in
advanced stage, castrate-resistant, disease (86, 87). Moreover, the growth of
castrate-resistant PCa remains AR dependent as exemplified by the following lines
of evidence. Zegarra-Moro et al. showed that disruption of the AR by a specific
antibody or ribozyme inhibited proliferation in ablation-resistant PCa cells in the
absence of androgen (116). Using human PCa xenografts in mice, Chen et al. found
that increased AR expression was necessary and sufficient to convert androgen-
sensitive PCa to an ablation-resistant state (15). The ability of AR to convert
androgen-sensitive PCa to an ablation-resistant state was dependent on having a
functional ligand-binding domain and a functional DNA binding domain, suggesting
that in spite of a lack of androgen dependency, androgens are still necessary for
disease progression and that AR-mediated tumorigenesis is a result of AR’s
genotropic effects (15). Finally, specific expression in mouse prostate epithelial cells
of an AR transgene containing a gain-of-function mutation (with increased basal
activity and response to coregulators), resulted in PCa development in 100% of the
animals, thus proving that aberrant AR signaling was sufficient to cause PCa and that
under certain conditions the AR acts as an oncogene (29). The studies cited above
all suggest that the AR is a key, causative, player during early and late stages of PCa.
1.6 AR target genes
As AR is a transcription factor, its oncogenic functions are likely mediated
through specific target genes. Prostate specific antigen (PSA) is the best-known
AR target gene. Rising serum PSA levels are currently used for the diagnosis
and prognosis of PCa in men and help direct patient care management. PSA is
thought to contribute to PCa progression through its protease activity and its ability
to induce epithelial-mesenchymal transition and cell migration. Although certain
aspects of PCa progression can be explained by the functions attributed to PSA, AR
driven PSA expression alone does not fully explain the contributions of AR to
disease progression. Interestingly enough, mice do not have a homolog of PSA
(107), yet they still develop prostate cancer, suggesting that other AR target genes
likely play a causal role in the advancement of the disease. Additional AR target
genes implicated in PCa progression are FGF8, Cdk1 and Cdk2, as well as PMEPA1
and TMPRSS2. A brief description of some of these known AR target genes and
their functions follows.
1.6A. PSA
Prostate specific antigen (PSA) or KLK3 is a 33-kDa glycoprotein which
belongs to the kallikrein (KLK) family of serine proteases (17). PSA is one of 178
human serine proteases, making the kallikreins the largest contiguous gene family in
the human genome (78). PSA and fourteen other members of this family of
proteases reside on chromosome 19q13.4 (78). PSA’s major substrates are proteins
made and secreted by the seminal vesicles, semenogelin I (SgI) and semenogelin II
(SgII), which are the gel forming proteins in freshly ejaculated semen (107). To
date, PSA remains the most well studied AR target gene, its androgenic regulation
being mediated by functional AREs both in the proximal promoter (72) and upstream
enhancer sequences (18).
High levels of PSA are produced in the prostate by epithelial cells. Under
normal circumstances, PSA leakage into the stroma is prevented by tight junctions.
During prostate cancer, however, due to the disruption of the normal prostate
architecture, a large amount of PSA leaks out and can be detected in the serum. This
phenomenon of PSA leakage has made it possible to utilize serum PSA as a
biomarker. It is currently used extensively to screen for PCa and to detect recurrence
following therapy of local and metastatic disease (107). While PSA screening has
profoundly aided the detection and management of advanced PCa, PSA levels for a
large fraction of localized PCa overlap with those levels found in men without PCa,
thus making PSA screening somewhat controversial (4).
As PSA expression is maintained in androgen sensitive as well as castrate
resistant disease, it is thought that PSA likely has functional roles in PCa biology,
and is not just a serum biomarker. PSA has been shown to cleave IGFBP3, which
prevents IGF-1 binding to IGFBP3. Free IGF-1 then is thought to contribute to PCa
proliferation (4). PSA is also known to cleave fibronectin and laminin, two
extracellular matrix proteins, which can lead to cell migration and metastasis (4).
PSA also has mitogenic activity for osteoblasts in vitro, which may be mediated by
TGFβ activation and/or by proteolytic modulation of osteoblast cell surface proteins
(4). This potentially explains the high propensity for PCa to metastasize to bone.
1.6B. FGF8
Fibroblast growth factor 8 (FGF8) was initially identified as being androgen
induced in mouse mammary cancer SC3 cells (96). Subsequently, Gnanapragasam
et al. demonstrated AR-mediated regulation of FGF8b in human PCa (26). FGF8b
mRNA levels were found to be induced by testosterone and decreased by castration
in androgen-sensitive CWR22 prostate xenografts. Furthermore, FGF8b protein was
expressed in both primary untreated and castrate-resistant PCa. The proximal
promoter region of FGF8 was able to bind AR and contained putative androgen
responsive elements. In another study, increased FGF8 expression was correlated
with higher Gleason scores and advanced tumor stage, and men with tumors
expressing higher FGF8 experienced poor survival (19). Studies in LNCaP cells also
demonstrated growth inhibition of PCa with the use of FGF8 neutralizing antibodies,
which confirmed the biological relevance of FGF8 in carcinogenesis (19).
1.6C. PMEPA1
The application of Serial Analysis of Gene Expression (SAGE) in LNCaP
cells, to find androgen regulated genes, led to the identification of PMEPA1 in 2004
(62). Further studies to characterize PMEPA1 functions found that it interacted with
Nedd4, an E3 ubiquitin ligase involved in proteasome-mediated protein degradation.
Overexpression of PMEPA1 in various PCa cell lines led to cell growth inhibition
(112). In conjunction with its growth inhibitory effects, the relevance of PMEPA1 in
tumorigenesis is evident from its decreased expression in PCa tumors from patients
with more advanced stage disease / higher serum PSA compared to normal PCa
samples (112).
1.6D. TMPRSS2
Gene expression profiling by cDNA microarray analysis disclosed
TMPRSS2, a membrane-bound serine protease, to be an androgen induced transcript
in LNCaP cells (56). TMPRSS2 transcripts were also found to be overexpressed in a
large number of PCa patients (101). More recently, TMPRSS2 was shown to
activate protease-activated receptor 2 (PAR-2), a G-protein coupled receptor, leading
to a rise in intracellular calcium in PCa cells (109). Interestingly, the AR response
mechanism of TMPRSS2 drives oncogenic Ets family members in many castrate
resistant tumors due to TMPRSS2:Ets chromosomal translocations (100). These
studies demonstrate the importance of TMPRSS2 in contributing to PCa progression.
The examples of AR target genes provided above highlight recent strides
made in understanding the mechanisms of PCa progression. However, considering
that the AR plays unequivocal roles during all stages of clinical PCa, there remains
a paucity of knowledge about the target genes it regulates. The identification of
additional target genes will greatly expand our knowledge of the mechanisms of
AR-mediated PCa progression and can aid the development of new approaches for
disease management.
1.7 Dissertation overview
This thesis describes studies undertaken to identify AR target genes based on
their physical interaction with the AR. Two recently developed methodologies were
utilized for identifying novel AR target genes: Chromatin Immunoprecipitation
Display (ChIP Display or CD) and Chromatin Immunoprecipitation microarray
(ChIP-on-chip). C4-2B human PCa cells, a model for castrate-resistant disease, were
utilized in our studies. CD led to the identification of 19 novel regions occupied by
the AR and ChIP-on-chip led to the discovery of 62 additional targets. The
expression patterns of genes within these AR-occupied loci, along with functions
attributed to these genes, render some of them potential PCa therapeutic targets.
Chapter 2: Materials and Methods
2.1 Cell culture and materials
C4-2B cells, a model for castrate-resistant PCa, were obtained from ViroMed
Laboratories Inc. (Minnetonka, MN), and their parental, androgen-dependent LNCaP
cells were obtained from ATCC (Manassas, VA). Both C4-2B and LNCaP cells
were maintained in RPMI-1640 medium (Invitrogen, Carlsbad, CA) supplemented
with 5% fetal bovine serum (FBS; Invitrogen). Dihydrotestosterone (DHT; Sigma
Chemical Co., St. Louis, MO) was administered in phenol red-free RPMI-1640
supplemented with 5% charcoal-stripped FBS (CSS; Gemini, West Sacramento,
CA). The N-terminal AR antibody (N20) and RNA Pol II antibody were both
purchased from Santa Cruz Biotechnology (Santa Cruz, CA). The AcH3 antibody
was purchased from Upstate Biotechnology (Lake Placid, NY).
2.2 ChIP
ChIP was carried out essentially as described previously (41). C4-2B and
LNCaP cells were cultured for 3 days in phenol red-free RPMI-1640 supplemented
with 5% CSS, then treated for 4-hr with 10 nM DHT, followed by cross-linking with
1% formaldehyde for 10 minutes. After sonication, chromatin was
immunoprecipitated overnight at 4˚C with either anti-AR antibodies or isotype-
matched IgG. For ChIP Display studies, AR occupancy was assessed by PCR with
locus-specific primers (see ‘ChIP primers’ below). Serial dilutions of genomic DNA
were amplified to ensure that PCR was performed within the dynamic range. For
ChIP-on-chip studies, AR occupancy was assessed by quantitative real-time RT-
PCR.
2.3 ChIP Display (CD)
We have recently described the CD procedure in detail (5). Briefly, DNA
from AR ChIP and IgG control ChIP was dephosphorylated using shrimp alkaline
phosphatase (NEB, Ipswich, MA) and digested with AvaII (NEB). The AvaII
fragments were subjected to ligation-mediated PCR using each of 36 combinations
of eight primers. Each primer had A or T at the +3 position of the AvaII site, and
A, T, G, or C at the so-called +6 position, immediately internal to the AvaII site (5).
In the present paper, primers are named by the nucleotides occupying these two
positions. For example, the PCR primer ‘AC’ is the one with A at the +3 and C at
the +6 position. Each PCR reaction in the present study was performed in duplicate,
with a 1°C difference in the annealing temperature (see Fig. 1A). The amplified
material from 2-3 independent AR ChIPs and 2-3 controls was resolved by
polyacrylamide gel electrophoresis (PAGE), and bands enriched in the AR ChIPs
were excised and reamplified. They were then subjected to secondary digestion with
HaeIII, HinfI and MspI (NEB), and sub-fragments were isolated by agarose gel
electrophoresis and sequenced. The sequences were mapped to the human genome
using the SSAHA program on ENSEMBL (www.ensembl.org).
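The primer naming scheme and the count of 36 combinations can be illustrated with a short sketch. It assumes that the 36 combinations are the unordered pairs of the eight primers, including each primer paired with itself (8 + 28 = 36); that reading is an interpretation of the text, not something stated explicitly. The primer sequences are rebuilt from the pattern visible in Section 2.7A, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Fixed 5' portion shared by all eight PCR primers listed in Section 2.7A,
# ending in the G that immediately precedes the +3 position.
PRIMER_PREFIX = "CGGCCGCACG"

def build_primers():
    """Return {name: sequence} for the eight ChIP Display PCR primers."""
    primers = {}
    for plus3 in "AT":        # A or T at the +3 position of the AvaII site
        for plus6 in "ATGC":  # any base at the +6 position
            primers[plus3 + plus6] = PRIMER_PREFIX + plus3 + "CC" + plus6
    return primers

def primer_pairs(primers):
    """Unordered primer pairs, including a primer paired with itself (assumed)."""
    return list(combinations_with_replacement(sorted(primers), 2))

primers = build_primers()
pairs = primer_pairs(primers)
print(len(primers), "primers,", len(pairs), "pairwise combinations")
print("AC primer:", primers["AC"])
```

Under this reading, a pair such as ('AC', 'TG') corresponds to one ligation-mediated PCR reaction that amplifies AvaII fragments whose two ends match those two primers.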
2.4 AR siRNA
C4-2B cells (1.5 × 10^5 cells/well in 6-well plates) were cultured for two
days in phenol red-free RPMI-1640 supplemented with 5% CSS. The cells were
then transfected using OligofectAMINE (Invitrogen) with 100 nM of either AR-
specific or non-specific siRNA (see ‘siRNA oligonucleotides’ below) as previously
described (41). After two days, cells were treated for 16 hours with 10 nM DHT or
ethanol vehicle prior to analysis of gene expression.
2.5 RT-PCR
Cells were grown in six-well plates and RNA was extracted using Biorad’s
total RNA mini kit according to manufacturer’s protocols (Biorad, Hercules, CA).
RNA quality was assessed spectrophotometrically and by agarose gel
electrophoresis. High quality RNA (200-1000 ng) was reverse-transcribed with
random hexamers using the Taqman reverse transcription reagents kit (Applied
Biosystems, Foster City, CA). cDNAs of interest were amplified using gene-specific
primers (see ‘RT-qPCR primers’ below) and the iQ SYBR Green Supermix
(Biorad). Amplification was performed in triplicate in a 96-well format and
monitored in real time using the Opticon 2 DNA Engine (Biorad). Negative controls
without RNA, without RT and without cDNA were always included to rule out
contamination. Expression levels were determined using standard curves for each
gene and corrected for 18S ribosomal RNA levels.
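The standard-curve quantification and 18S correction described above can be sketched as follows. This is a minimal illustration, assuming a log-linear relationship between threshold cycle (Ct) and input quantity; the dilution series, Ct values, and function names are hypothetical and are not taken from the experiments.

```python
import math

def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, curve):
    """Invert the standard curve to obtain a relative quantity."""
    slope, intercept = curve
    return 10 ** ((ct - intercept) / slope)

def normalized_expression(gene_ct, gene_curve, ref_ct, ref_curve):
    """Gene quantity divided by the reference (here 18S rRNA) quantity."""
    return quantity_from_ct(gene_ct, gene_curve) / quantity_from_ct(ref_ct, ref_curve)

# Hypothetical 10-fold dilution series (arbitrary units) and Ct readings.
dilutions = [1000, 100, 10, 1]
gene_curve = fit_standard_curve(dilutions, [20.0, 23.3, 26.6, 29.9])
ref_curve = fit_standard_curve(dilutions, [10.0, 13.3, 16.6, 19.9])

# A sample with gene Ct 23.3 and 18S Ct 13.3 maps to equal relative quantities.
print(normalized_expression(23.3, gene_curve, 13.3, ref_curve))
```

Fitting a separate curve for each gene, as the text describes, means differences in amplification efficiency between primer pairs are absorbed into each curve's slope rather than assumed equal.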
2.6 Gene expression analysis in clinical PCa specimens
Expression of genes disclosed by ChIP Display was analyzed in prostate
cancer samples using microarray data collected as part of our previous studies (36).
Briefly, clinical samples were from 40 primary prostate cancers obtained during
radical prostatectomy and 7 AR-positive metastatic prostate cancer lesions. Twenty-
three of the primary tumors were from patients receiving no therapy before surgery
and the remaining 17 were from patients after 3 months of goserelin plus flutamide
androgen-ablation therapy. All tissues were obtained during routine clinical
management at the Memorial Sloan-Kettering Cancer Center, New York, NY, under
protocols approved by the Institutional Review Board. RNA was extracted from
manually-microdissected tissue consisting of 60-80% prostate cancer cell nuclei, and
analyzed as previously described (36) using the Affymetrix U95 A-E array set. The
results are displayed as a heat-map generated using ‘Heatmap Builder’ (Stanford
University), with data routed to 50 equal gates for each probeset (row) using a linear
grey scale gradient from white (lowest value) to black (highest value).
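The per-row gating can be illustrated with a small sketch. It assumes that "50 equal gates" means 50 equal-width bins spanning each probeset's own minimum and maximum; the actual behavior of Heatmap Builder is not documented here, and the function names and example values are hypothetical.

```python
def gate_row(values, n_gates=50):
    """Assign each value in one probeset row to one of n_gates equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat row carries no contrast; place every sample in the lowest gate.
        return [0] * len(values)
    # The row maximum lands on the upper edge, so clamp it into the top gate.
    return [min(int((v - lo) * n_gates / (hi - lo)), n_gates - 1) for v in values]

def gate_to_grey(gate, n_gates=50):
    """Linear grey level: 255 (white) for the lowest gate, 0 (black) for the highest."""
    return round(255 * (1 - gate / (n_gates - 1)))

# Hypothetical expression values for one probeset across four samples.
row = [2.0, 4.0, 7.0, 12.0]
gates = gate_row(row)
print(gates)
print([gate_to_grey(g) for g in gates])
```

Because the scaling is done per row, shading is comparable across samples within a probeset but not between probesets.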
2.7 Oligonucleotides used in ChIP Display studies
2.7A. ChIP Display oligonucleotides
Short linker oligo 5’ -TTC GCG GCC GCA C- 3’
Long A linker oligo 5’ -GAC GTG CGG CCG CGA A- 3’
Long T linker oligo 5’ -GTC GTG CGG CCG CGA A- 3’
"AA" PCR primer 5’ -CGG CCG CAC GAC CA- 3’
"AT" PCR primer 5’ -CGG CCG CAC GAC CT- 3’
"AG" PCR primer 5’ -CGG CCG CAC GAC CG- 3’
"AC" PCR primer 5’ -CGG CCG CAC GAC CC- 3’
"TA" PCR primer 5’ -CGG CCG CAC GTC CA- 3’
"TT" PCR primer 5’ -CGG CCG CAC GTC CT- 3’
"TG" PCR primer 5’ -CGG CCG CAC GTC CG- 3’
"TC" PCR primer 5’ -CGG CCG CAC GTC CC- 3’
2.7B. ChIP primers
Chr. Forward Primer Reverse Primer
1q25.2 CCTTCCGGACAATGAAGAAG AAGCAAGCCACTCACCCTAC
1p35.2 AGGGACAACATCACCAGGAG AGCACGGCTACTGCACCTAC
2q37.3 GCTTCTCCAGCGTCCAGTAG CTGACGAGTGGTCATCTTCC
3p21.1 CCAGAGAACAGCATGCTCAA TACAACCAGGCAGGATGACA
4p16.1 GTGGAAACTGTTGGGTGGAG CTGGTCCCCACTGAGTCTTC
7q11.23 CTCAAGAGGATCGGGAGATG TCCTGGGCTCAAGTGATTCT
7q11.23 ACCAGAGGGGTCTGTGTGTC AAGCGAGAAGCGCTAATAGG
8q24.3 GTGGATGGATGGCAAATAGG GCTCCTTTGTCAGGGATCAG
10q12.1 GTGGGCTCAGCACTGGAC CAGGGAAACCCCAGAATCAG
10q26.13 CTCCCAAACTCCTCCAACTG GGGTAGAACATCAGGGCAAC
11p15.4 GTGATGCCGTTGATGACAGT ACGGGTAACACCACCTTCAG
11q12.3 CATTAAGTCATTGTAAGGCCTGTG TCCAGGATCAGGAACTCACC
11q25 GTCCTACCCTGGAGGGACTG GGAGGAGAGGAATCGAGGTC
14q31.3 CCATCTGCTTAGATGTTCATGC TGGGATCTTTGAGGGGATAAC
17p13.2 GCAAAAATGATGGGAAAAGC CTTGAAGGCGGTTGCTACTC
17q25.3 ACCCCAACTCCTCCTCACAG CCAAGAAGATTCTGGGGTGA
22q11.23 GATTGGCCATCAGGGAGTAG TGAGATCAGCCAGTGTCACC
22q13.1 CATGGGAGATGCACTCTTGA GTTCAGTGGGTTGTCCTTGG
22q13.3 ACCACCCACTCCTCAGTCAC ACAGGGGCTCTTCCAATACC
11p11.2 (non-target) CCGACTTCCTCTCCTGACTG TCAGCTTGCTCCCCATTTAT
PSA enhancer TGAAAACAGACCTACTCTGGA AGCAAAGACAGCAACACCTT
2.7C. RT-qPCR primers
Gene Forward Primer Reverse Primer
QSCN6 ACCCTCAACTTCCTCAAG TCATCATCTCAGGCTTCC
LHX4 CAGGCGGACAGTTAATGAATGG GGACGATATGGAGGATGGAGAC
CAP350 AGAATGGAGCCAAAAGAGCA CAAGAATGCCACGAATTTCA
ACBD6 GGCCTGTGATCGAGGACATA TAAGCCTTGCCAGTTGTGTG
KIF1A AAGGCCTCCTCCTAGACAGC CTGTGTTCTTCAGGGGCTCT
PRKCD CCTGACTATATCGCCCCTGA GTCCTTGGACTCCTTGGTGA
MAN2B2 GGGTGTACCCCAACATGAGT CTGTGGAATAGGGCAGGAAG
MRFAP1 TGCTCATCCAGATCAAAACG CAAAAGGCTCTCTGGTTTCG
FZD9 AGACCATCGTCATCCTGACC CCGATCTTGACCATGAGCTT
BAZ1B AAAGCCTTCCACCTGTTTTG GCAAACCAGCCACCTCATAA
WBSCR28 AGTGACCTGGAGGGTGTGTC CTGGGTCGTGTGCTCAAAG
WBSCR27 GTCTGACCACCAGGACCAAC AGACAATGCCGGAGATGAAG
CLDN4 TGCTTTGTTCTTCCCTGGAC ACCACCACACCCTGTCACTT
KIAA1217 CCATGAGTGCCAAGAACAGA TTGACTCTGCGGTGAGAATG
OAT TTCTGGGGTAGGACGTTGTC AGCTCTCGCACTCCCATTAG
LHPP GAGGTTCTGCACCAACGAGT CACACAGTTTGGGTTGGATG
MUC6 AACATCATCACCCAGCAGGT TGGTGGGTGTTTTCCTGTCT
AP2A2 TGACGTCTGCATCCACAGAT TGCTGGACCTTCTTCGACTT
SLC22A6 ACCCTCCGCCACCTCTTCC GGCAGGCAGGTCCACAGC
CHRM1 CCGCTACTTCTCCGTGACTC GTGCTCGGTTCTCTGTCTCC
TRPV1 GCCCATGGGGACTTCTTTA TTCCCTTCTTGTTGGTGAGC
CARKL AATGGACAGAGGGAGGGATT TACGTTCCAGCTTTGGCTCT
TRPV3 GAGCCTGTCCAGGAAGTTCA GTGCTTGGCAAACTTCTTCC
MAFG GAGAAGCTGGCCTCAGAGAA GGCATCCGTCTTGGACTTTA
PYCR1 ACACCCCACAACAAGGAGAC CTGGAGTGTTGGTCATGCAG
SIRT7 GGACCTGGTAACGGAGCTG CGCCTGTGTAGACGACCAAG
GSTT2 CAATGGCTGGAGGACAAGTT CCTGATAGGCCTCTGGTGAG
DDT CTGGAGCTGGACACGAATTT GGCTAGCTCCTTGGTGAGAA
SYNGR1 TCTGCATCTACAACCGCAAC TTCAGTGGGTTGTCCTTGG
MAP3K7IP1 CCAAGCTGGACAGATGACCT CCACGAAGTTGGTCACTCG
CRELD2 GGAGATGGGAGCAGACAGG ACCCAGCCCACTTCACACT
ALG12 GCGTGATTTTTGGACTCTGG GAACACGATGATGGCGAAG
18S rRNA CCGCAGCTAGGAATAATGGA CGGTCCAAGAATTTCACCTC
AR CTGGACACGACAACAACCAG CAGATCAGGGGCGAAGTAGA
2.7D. siRNA oligonucleotides
Target Forward Primer Reverse Primer
AR ACGUUUACUUAUCUUAUGCTT GCAUAAGAUAAGUAAACGUTT
Non specific AAUUUUACUCGCUCGAUUUTT AAAUCGAGCGAGUAAAAUUTT
2.8 ChIP-on-chip
Chromatin immunoprecipitations were carried out as described above with
anti-AR, anti-histone H3/H4, and anti-Pol II antibodies (all from Santa Cruz). These
immunoprecipitated samples, along with genomic input DNA, were amplified by
linker-mediated PCR (LM-PCR) as described elsewhere (81). The amplified DNAs
were differentially labeled with Cy3 and Cy5 and hybridized onto a tiled array
manufactured by NimbleGen Systems, Inc. (www.nimblegen.com). Array #35 was
chosen for this study; it contains 46.6 Mb (72.7%) of DNA sequence on
chromosome 19 and 48.1 Mb (77.1%) on chromosome 20. The specific positions of
the regions represented on this array are as follows: Chromosome 19: 17,458,552 – 63,805,962;
Chromosome 20: 8,641 – 48,146,322. ChIP samples were prepared by
Dr. Li Jia.
2.9 Peak calling of AR-occupied regions
To identify peaks defining AR-occupied loci, we first constructed a window
centered on any given probe i. The window covered neighboring probes on the
chromosome that were less than or equal to 600 bp away from probe i, including
probe i itself. With this window, a statistic called the Moving Average (MA) was defined as:
$$M_i = \frac{1}{\#NP_i} \sum_{j \in NP_i} r_j,$$
where $r_j$ was the log2 ratio of probe j, $NP_i$ was the set of probes that were
within the window at probe i, and $\#NP_i$
was the size of this set. We assigned the probes inside the window as the
“neighboring probes.” If there were fewer than 3 probes in any given window, we did
not calculate the moving average for it. Therefore, if we denoted the total number of
probes that had moving averages by $N'$, then $N' < N$. To calculate the significance
of all the $M_i$’s, we compared each of them with a normal distribution, which was
called the “background model.” For the normal distribution of the background
model, we estimated the mean and variance by permutation. For a given probe i, if
there were in total m neighboring probes inside a given window, we randomly
sampled m probes’ log2 ratios from all the log2 ratios on the array. We then took
these randomly picked m ratios as the ratios of the probes inside the window and
calculated a new moving average called the ‘permuted moving average’, denoted by
$M^*$. If the random sampling was repeated T times (T = 10000 in our data), we
obtained a vector $(M^*_1, M^*_2, \ldots, M^*_T)$. The sample mean and variance of these $M^*_i$’s
were denoted by $\hat{\mu}$ and $\hat{\sigma}^2$, respectively. The significance, or p-value, was estimated
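by
$$p_i = P\left(Z_{(\hat{\mu},\hat{\sigma}^2)} > M_i\right) = \int_{M_i}^{+\infty} \frac{1}{\sqrt{2\pi}\,\hat{\sigma}} \exp\left\{-\frac{(z-\hat{\mu})^2}{2\hat{\sigma}^2}\right\} dz,$$
where $Z_{(\hat{\mu},\hat{\sigma}^2)}$ is a random variable with the normal distribution $N(\hat{\mu},\hat{\sigma}^2)$.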
Having obtained a p-value for each probe, we could determine which probes
were potentially in the binding regions. Since we had more than 150,000 probes, we
used the method introduced by Benjamini and Hochberg (1995) to control the false
discovery rate (FDR) on each chromosome. According to this method, all the
probes were first sorted in increasing order of their p-values. Suppose the ordered
p-values were $p_{(1)}, p_{(2)}, \ldots, p_{(N')}$ and the corresponding indices of the probes were
$I_{(1)}, I_{(2)}, \ldots, I_{(N')}$; then the significant probes were defined by:
$$\Re(q) = \left\{ I_{(i)} : i \in \{1, 2, \ldots, N'\},\ p_{(i)} \le \frac{q}{N'} \cdot i \right\},$$
where q was the overall false discovery rate threshold. In our data, q = 0.005.
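The selection rule above can be sketched in Python as follows. This is an illustration rather than the code used in this study; the function name and the dictionary input are assumptions. By default the sketch applies the set definition literally (each ranked p-value is compared with its own threshold). The optional step_up flag shows the classical step-up variant of the procedure, which keeps every probe up to the largest rank that passes; which variant was applied here is not stated in the text.

```python
def significant_probes(p_values, q=0.005, step_up=False):
    """p_values maps probe index -> p-value; returns the set of significant probe indices."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    n = len(ordered)
    # Ranks i (1-based) whose p-value satisfies p_(i) <= q * i / N'.
    passing = [rank for rank, (_, p) in enumerate(ordered, start=1) if p <= q * rank / n]
    if step_up:
        cutoff = max(passing, default=0)
        return {probe for probe, _ in ordered[:cutoff]}
    return {ordered[rank - 1][0] for rank in passing}
```

For example, with four hypothetical probes and q = 0.005, the thresholds for ranks 1 to 4 are 0.00125, 0.0025, 0.00375 and 0.005.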
If a region is a binding region, then we would expect the probes’ ratios to
form a peak when placed consecutively according to their positions on the chromosome.
Using our method, the potential binding regions were defined to be those that had at
least 4 consecutive significant probes within a 600-bp window. If there
were more than 4 consecutive significant probes in one region, we extended the
binding region until the next probe was not significant. The starting position of each
potential binding region was the starting position of the first significant probe
among the 4 consecutive probes. The ending position of the peak was then the
ending position of the last significant probe. This work was conducted by Xiting
Yan from the Department of Computational Biology at USC under the mentorship of
Dr. Ting Chen.
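The region-calling rule can be illustrated with the following Python sketch, which is an assumption-laden example and not the code used in this study. Probes are represented as hypothetical (start, end) tuples sorted by position, and the 600-bp window is measured between the start of the first and the start of the fourth consecutive significant probe; the text does not specify how the window was measured.

```python
def call_regions(probes, significant, window=600, min_probes=4):
    """probes: list of (start, end) sorted by start; significant: set of probe indices."""
    regions = []
    i, n = 0, len(probes)
    while i < n:
        last_seed = i + min_probes - 1
        if (last_seed < n
                and all(k in significant for k in range(i, last_seed + 1))
                and probes[last_seed][0] - probes[i][0] <= window):
            # Extend the region until the next probe is not significant.
            j = last_seed
            while j + 1 < n and (j + 1) in significant:
                j += 1
            regions.append((probes[i][0], probes[j][1]))
            i = j + 1
        else:
            i += 1
    return regions
```

Each returned tuple gives the start of the first significant probe and the end of the last significant probe of one putative binding region.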
2.10 Location analysis
The distance from the center of each AR-occupied locus to the closest point of
the nearest gene was determined using Ensembl (www.ensembl.org). It was also
noted whether the AR binding region was 5’ or 3’ to a gene or within the gene body,
based on the annotation found in Ensembl. Additionally, the distance from each
binding site to the nearest known transcription start site (TSS) was determined
bioinformatically.
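As an illustration of this classification, the following Python sketch computes the distance from a binding-region center to the closest point of the nearest gene and reports whether the site is 5’, 3’ or within the gene body. The Gene record, the function names and the coordinates in the usage example are hypothetical; the analysis in this study relied on the Ensembl annotation and not on this code.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # lower genomic coordinate
    end: int     # higher genomic coordinate
    strand: str  # '+' or '-'


def locate(center, gene):
    """Return (distance to closest point of gene, position relative to gene)."""
    if gene.start <= center <= gene.end:
        return 0, "within gene"
    if center < gene.start:
        # Lower coordinates are upstream (5') of a '+' gene, downstream (3') of a '-' gene.
        return gene.start - center, "5'" if gene.strand == "+" else "3'"
    return center - gene.end, "5'" if gene.strand == "-" else "3'"


def nearest_gene(center, genes):
    """Return (gene name, distance, relative position) for the closest gene."""
    best = min(genes, key=lambda g: locate(center, g)[0])
    return (best.name, *locate(center, best))
```

For instance, with a hypothetical '+' strand gene spanning 1,000 to 5,000 and a '-' strand gene spanning 9,000 to 12,000, a site centered at 7,500 lies 1,500 bp 3’ of the second gene.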
2.11 Bioinformatics
The comparisons of the distance of AR-occupied loci to the TSS (Figure 11B) and to the
nearest AcH3 site (Figure 12), as well as the conservation analysis of AR-occupied regions
(Figure 13), were conducted using bioinformatics tools by Dr. Ben Berman at USC
(Department of Preventive Medicine).
2.12 Cloning
DNA fragments representing the ARORs were amplified using C4-2B
genomic DNA and site-specific primers (see cloning primers) with engineered
EcoRI, BglII or SacII restriction enzyme sites. The amplified fragments were cloned
into the Tk2+ Luciferase construct in both forward and reverse orientations. We
were able to clone all but one, R16, due to technical difficulty arising from repetitive
sequences in this region. The cloned DNA fragments were 600 bp long on average
(SD = 222), ranging from 326 to 1287 bp. Each fragment encompassed the AROR as
defined by the moving average methodology (described above), with the inclusion of
an extra 100 bp on each side of the AROR to allow for optimal primer
design. The majority of the cloning effort was led by Dr. Grant Buchanan at USC
(Departments of Preventive Medicine, Urology, and Dame Roma Mitchell Cancer
Research Laboratories, The University of Adelaide / Hanson Institute, Australia) and
Allison Walters.
2.13 Luciferase assays
C4-2B cells were seeded in 12-well plates at a density of 125,000 cells/well and
cultured for 2 days in phenol red-free RPMI-1640 medium containing 5% CSS.
Cells were transfected with 1 µg plasmid/well using Lipofectamine LTX and PLUS
reagents (Invitrogen) in Opti-MEM medium. As a positive control, in each experiment
we transfected cells with the pGL3-PSA5.85 construct, which contains a 5.85-kb
PSA upstream fragment (61) (a gift from Dr. Hong-Wu Cheng). Four hours post-transfection,
the plasmid-containing medium was removed and cells were treated with 5% CSS-supplemented
RPMI-1640 medium containing EtOH vehicle or 10 nM DHT. Cells
were lysed 24 hrs post-treatment with 250 µl passive lysis buffer (Promega, Madison, WI)
per well, and luciferase activity in the cell lysates was quantified using the
Luciferase Assay System kit (Promega). Data represent the mean of up to three
independent experiments.
2.14 Gene expression analysis
We identified all annotated genes within a 100-kb window surrounding each
AR binding site. A total of 95 genes were found in proximity to our 62 sites based
on the gene annotation in Ensembl (www.ensembl.org). For gene expression
studies, cDNAs were synthesized initially from total RNA collected (using Bio-Rad’s
total RNA mini kit) from C4-2B cells treated for 16 hrs with ethanol or 10 nM DHT
and subsequently from an extensive DHT time course. In both cases, gene
expression was assessed by RT-qPCR using gene-specific primers (see Section 2.16C,
RT-qPCR primers) and the iQ SYBR Green Supermix (Bio-Rad). Amplification was
performed in triplicate in a 96-well format and monitored in real time using the
Opticon 2 DNA Engine (Bio-Rad). Negative controls without RNA, without RT and
without cDNA were always included to rule out contamination. Expression levels
were determined using standard curves for each gene and corrected for 18S
ribosomal RNA levels.
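The standard-curve quantification and 18S correction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the function names are hypothetical, the standard curve is assumed to be a least-squares line of threshold cycle (Ct) against log10 of the input quantity, and the numbers in the usage note are invented for the example.

```python
def fit_standard_curve(log10_quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(cts)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_quantities, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the relative quantity of a sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(gene_ct, gene_curve, ref_ct, ref_curve):
    """Gene quantity divided by the reference (e.g. 18S rRNA) quantity."""
    return quantity_from_ct(gene_ct, *gene_curve) / quantity_from_ct(ref_ct, *ref_curve)
```

With a hypothetical dilution series of 1, 10, 100 and 1000 units giving Ct values of 30, 26.7, 23.4 and 20.1, the fitted slope is about -3.3 cycles per tenfold dilution. The ratio of normalized expression between DHT-treated and ethanol-treated samples would then give the fold change.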
2.15 Expression tiling array
To identify potential unannotated, AR-regulated transcripts in C4-2B cells,
we hybridized cDNAs onto a genomic tiling array representing sequences of
chromosome 20 from position 10,108,999 to 45,999,517. Contiguous probes on this
array overlap by 10 bp and tile the entire region described above. C4-2B cells were
treated with 10 nM DHT or 0.01% ethanol vehicle for 16 hrs, and total RNA was
collected as described above. The total RNA was depleted of ribosomal RNA
(rRNA) using the RiboMinus human/mouse module from Invitrogen according to
the manufacturer’s protocol. As rRNA constitutes a substantial fraction of total RNA
content, it was necessary to deplete this fraction from our samples to ensure mRNA
enrichment and to prevent signal squelching by the rRNAs upon fluorescent labeling.
To confirm depletion, 100 ng of each RNA sample, before and after depletion, was analyzed
by agarose gel electrophoresis, and a substantial decrease in the 28S and 18S rRNA
bands was observed. Double-stranded (ds) cDNAs were synthesized from the ribo-depleted
RNA samples using the SuperScript Double-Stranded cDNA Synthesis Kit
(Invitrogen) according to the manufacturer’s recommended protocol. The ds cDNAs were
sent to NimbleGen Systems, Inc. for labeling and hybridization.
2.16 Oligonucleotides used for ChIP-on-chip studies
2.16A. ChIP validation primers
Regions Forward Reverse
PSA TGAAAACAGACCTACTCTGGA AGCAAAGACAGCAACACCTT
NC1 TCCTGCCCTGGAGAACTTAAAG TAGTGGTCAGCAGGCAGTGC
NC2 GTGAGTGCCCAGTTAGAGCATCTA GGAACCAGTGGGTCTTGAAGTG
NC3 TCATCATGAATCGCACTGTTAGC GCCCAAGTGCCTTGGTATACC
NC4 CAGTGGCCATGAGTTTTGTTTG AACCAATCCAACTGCATTATACACA
1 CATGGTGTACCCAGTTGGCA GGGCACCTACTGAACATCTGCTA
2 CCCAGAGAGGCATAAACAATGC TTGTCCAATGCCAGGAAGC
5 GCACAGAATCTCTGTGTGTGCATA GCATACACAAATACTTTGGGATTAAGG
6 GGTACTTACAGGAAAACTCCAAAATTGA TGCCAAGAACATAGATGAGTAGCATAG
9 CCCAGAAACTGCTGAGTCACTG TCCGAGGGTTCCTCTCTAAGG
12 GGAGGTGGTACATGGCTTCTG CAGTGTCTGGGCAGTAGTCCTG
13 TCTGGGCCTCTTTCTTCCTTG AAACACAGGCAGCAGCAGATAC
20 GGGTTCTGTCCTCTCACCTTGA TGCTCCTCTGCAGTGTGATATAGC
24 GTGTTCTGCTTCAGCGTCTGC CAGGGAAGAAGGCTGCAATG
26 ACCAAGCACTGGAGCATGTG CAGCAGCCACCGGTAAACA
32 TGTGCATACACCAACCACACTG CCGCTGGTTGATGGCATAG
41 GCTTCCTGGAGGTGGTGATG GTGTCCCTCTCCCTCCCAAA
43 ATTGTTCAGCTCCTCGTGCA ACCCAGGTGTCGAAGTCAGAA
44 CTGCCAAATGGAATAGAAGAGAAAC ATTCCCAAACTTCCTTTTCACG
47 GAAGGACCCTGGCCAATTG TGTTCCCCATCATCCATTCC
48 TGCTCACTGCCTCTACATGAAGTC GTGGACATCTTAAGGGCTGAGAGA
50 CACGGTAAGAGTTTTATGGGCATC CAGCCAATCTCATGACCACACTA
51 GCTAGAGGGCATGTGGCAGA ATGCCAGGACAGTGTTCTTTCC
2.16B. Cloning primers
Regions Forward
1 GATCCGGAATTCCGTGCCCACACAAGGAAATA
2 GATCCGGAATTCATCAAAGCACCCACATAGCC
3 GATCCGGAATTCGCAAAACACATACTGCAGGTACA
4 GATCCGGAATTCTCACAGGAATCATGAGGTTGA
5 GACTATGAATTCTACAACCTTAAAAGGAG
6 GATCCGGAATTCCTGTGAGAACAGGCTGTCTTG
7 GATGGAAGATCTCATTCCTCTTAGCAGACTCAACAC
8 GATCCGGAATTCCTGTCCTCACCCTGCTTTGT
9 GATTCGAATTCTTCGAGTGTTTGGTGCTTGTAG
10 GATCCGGAATTCAGTAGCTGCTGTGTGGGCTA
11 GATCCGGAATTCCTGGAATGAGCACCCAATCT
12 GATTCGAATTCTGGCTTCTGGAGGGCCCTGAG
13 TCCCCGCGGGGAGCTTTGATGTGCTAGTTGGGTA
14 GATCCGGAATTCTGCCTCCAAATGTCTCAGGT
15 GATCCGGAATTCGGTTTTGCTCACTGCCTTGT
16 N/A
17 GATCCGGAATTCCCATCACCTTTGCTCTCTCC
18 GATCCGGAATTCGGAATATAAGTCACTTTTTGCAAGC
19 GATCCGGAATTCTTTGTGGTCCAGTTCCCAGT
20 GATGGAAGATCTTGCTTCTCCTGTTGAAAGTTG
21 GATCCGGAATTCCTGGGGCAGGTTTCAGTTAG
22 GATCCGGAATTCAGTTTTCTTTACAATTCAGCTCATC
23 GATCCGGAATTCTAAGGAGGCCAAGGAAGTGA
24 GATCCGGAATTCATTGCAGCCAAAGCCATAGT
25 GATCCGGAATTCAATTGTGTCCTGGACGTCTCT
26 GATCCGGAATTCCTGAAGGGTACGTGCTGAAGG
27 GATCCGGAATTCTTGAAGTTTTTGGTAATGTATTGCTT
28 GATCCGGAATTCGCAAATTTGTGCCCAAAGAA
29 GATCCGGAATTCGGCAGCAAGGAGCAAGATAC
30 GATCCGGAATTCGTCAGTTAGGCCTGGCAATG
31 GATCCGGAATTCAAAAGGGACTATGCACACATTTT
32 GATCCGGAATTCCCTTCTCCCGTGTGCTTTTA
33 GATCCGGAATTCGTATGGCCTAAAGCCAACCA
34 GATGGAAGATCTCATGCATGTATGCTCTCAGC
35 GATCCGGAATTCTTTGGCTTCTGGCTCAGTCT
36 GATCCGGAATTCGCACCAGGCCAAGTTTTTAT
37 GATCCGGAATTCTTGAATCTACTTTATTTGCTTAGTTGA
38 GATCCGGAATTCCACATGTCAATCGTCCCTCTT
39 GATCCGGAATTCAGCTAGCAGAGGTGGAGCTG
40 GATGGAAGATCTGTGTATGACTGGCCCTGGAC
41 GATCCGGAATTCGTAGAAACCCAAGCCCCTCT
42 GATCCGGAATTCGGTGCTGTTCCCATTTTACC
43 GATCCGGAATTCGGAGGGAGAGGGACACCTAC
44 GATCCGGAATTCCACTGGTGTCCAACCAAGTT
45 GATCCGGAATTCAAGTCCATCGGTCACTACCC
46 GATCCGGAATTCCTGTCTTCATCTCTGTTAACCTTCC
47 GATCCGGAATTCTGTCCATACCCCTTTCTCCA
48 GATCCGGAATTCGGCTGGTAGGAATCACAAGC
49 GATCCGGAATTCCGACAGTGTATTGGGCACAG
50 GATCCGGAATTCGGGCTGACGTATCAAGAAGG
51 GATCCGGAATTCAACCAGGGTTCTGCAAACAT
52 GATGGAAGATCTCCCTTAGGGACCCCTTTACA
53 TCCCCGCGGGGAGAGGTGGAGCAAGGATTTGA
54 GATCCGGAATTCTCACCAGAAGTAGATGCCGATA
55 GATCCGGAATTCCCACTATGCCCATATGCCTAA
56 GATGGAAGATCTTACTTTGCACCTGTGCCTGT
57 GATCCGGAATTCCTAAAACTAGGGGCGCAACA
58 GATCCGGAATTCCGTAGCAGGGTGGGGTTTCTT
59 GATCCGGAATTCCCAGCCCCAGATGATGTTGTA
60 GATCCGGAATTCCTTGGGGTGACTTCCAGATTT
61 GATCCGGAATTCCTGGTCCATGACCCTGATCTA
62 GATCCGGAATTCCTTTTTGGTCTCAATTCGTATGTT
Regions Reverse
1 GATCCGGAATTCCAGGGAGGTCAGATTCCAGA
2 GATCCGGAATTCCCAGTTCCCTGGAAGTCTGT
3 GATCCGGAATTCAAAAGGGATTTTGCTTTAGTCA
4 GATCCGGAATTCTGCATTGTGATTTCCGTATGA
5 GACTATGAATTCACTGTACTCCAGCCCGG
6 GATCCGGAATTCTTGGATATTCAGAGGCAACTACC
7 GATGGAAGATCTGGCTAGCTAGGCTTGCTGAA
8 GATCCGGAATTCCCTCAGCAGCCTTTTACCTG
9 GATTCGAATTCTGGAGGCTAAGTGGGGCTCAAG
10 GATCCGGAATTCAGATGCCCTCCAGCTGTAAA
11 GATCCGGAATTCGAAGCTGTGCTTTCCTGACC
12 GATTCGAATTCTCACATTATTTCCCATGTGTTC
13 TCCCCGCGGGGATTCCACCTCAGGCCAGAAAG
14 GATCCGGAATTCCAGCAAACCCTGCATTTACC
15 GATCCGGAATTCCCAGTACTACTACCGCCCACA
16 N/A
17 GATCCGGAATTCGAGCCTCCTTCTAATTGGAACAT
18 GATCCGGAATTCCAAATGGAACAATGTGGTATGG
19 GATCCGGAATTCCCCCAGTATTGCAGCCTAAA
20 GATGGAAGATCTCTGCTCCTCTGCAGTGTGATA
21 GATCCGGAATTCACCCTCACGTTGTTTAAATGG
22 GATCCGGAATTCTGTGGGACAAAGAACCCTGT
23 GATCCGGAATTCAGGCAGAGTGAGCAGCTATTG
24 GATCCGGAATTCGACAGAGGTTGATTTGTTGTGC
25 GATCCGGAATTCTGTTGGTTAAGCCCATTTGC
26 GATCCGGAATTCCGCATGTTTTATTTAAGCCCAGGT
27 GATCCGGAATTCTGACAAAACTGGGGTAGGTCA
28 GATCCGGAATTCTTTCTTTAGGGTCTTTGCCTGT
29 GATCCGGAATTCGGGCTTGCTATTTGTTGGTG
30 GATCCGGAATTCCCTTTGAAATAGCTGTCGTAAAGTC
31 GATCCGGAATTCAAACAAATGGAAAGGCCTCA
32 GATCCGGAATTCGAAGAGGCCTTACCTGCTGA
33 GATCCGGAATTCAAATCAACTACCAATTGCCAAGA
34 GATGGAAGATCTACTTACGCCACCAGCAGAGT
35 GATCCGGAATTCTGAAACTCAGGCTTCTGGCTA
36 GATCCGGAATTCGGGCCATAGAGAATGTTCCA
37 GATCCGGAATTCATCTCCAGTGTACTTGGCTTAAAGA
38 GATCCGGAATTCCAAACAGCATTTGCCCTAGA
39 GATCCGGAATTCTCTCTCCTGAGAGCCCGTTA
40 GATGGAAGATCTAAATGGCAGGAATGAGATGC
41 GATCCGGAATTCTTCAAGGAGCAAAGCACCTC
42 GATCCGGAATTCGCCTGGCTTACAAATAGGTCA
43 GATCCGGAATTCCCACCCAGCTGGAAACTACA
44 GATCCGGAATTCCCCTAGATTTACACTCGGAGAGC
45 GATCCGGAATTCCTGTGCAGCGTCTCAAAGTC
46 GATCCGGAATTCCTGCTGTGTGTTTTTGTGATTT
47 GATCCGGAATTCCTCCCTTTGCACTCCAAATC
48 GATCCGGAATTCTCTTTGGAGAGGTAAGCCAGA
49 GATCCGGAATTCCAAGATGAGACCGAGTTGAGG
50 GATCCGGAATTCGACAATCTGGCGATGCAAT
51 GATCCGGAATTCCCCACACCCTCATATCCATC
52 GATGGAAGATCTTGCTAACGAATGGCATCAAA
53 TCCCCGCGGGGACAGACCACCGAGAATCTGGA
54 GATCCGGAATTCGGCCAAGTGGAGCCATAATA
55 GATCCGGAATTCCCATTTAACTGCCCCAAAGA
56 GATGGAAGATCTCCTTTTGGCAAGGGAATTTT
57 GATCCGGAATTCCAGAATGCCATTTGGAAAAA
58 GATCCGGAATTCCGCAAAATTACTGTACAACCCATAA
59 GATCCGGAATTCCACCAGTGTAACCACCAAGTTCC
60 GATCCGGAATTCCAGGACAGTCTGGGAGCACTTA
61 GATCCGGAATTCCGGGCACAGGAATCTTTGAGA
62 GATCCGGAATTCCGAGCGAGACTCCGATTCAAA
2.16C. RT-qPCR primers
Gene Forward Reverse
ABHD12 GGTGTGACTGAAGCCAACAG TGTAAGGTTTGCAGCTCAGG
AKT2 ACTGACTTTGGCCTCTGCAA TCTGCTTGGGGTCCTTCTTA
ANGPT4 GTGGCCTGTCAAACCTCAAC GCACAGCTGCTCGTTAGATG
APBA2BP ACTGAGCCAGACCTGCACAC TAGAAAGCAAGCAGCCCAGT
BCL2L1 GTTCTGCTGGGCTCACTCTT GGGCTGCATGTAGTGGTTCT
CBFA2T2 CCCTCCCACCAATAAATCCT GGGAGATGTCATTGCCAAAC
CHMP4B GGACGACGACATGAAGGAAT GAGAGTCGAAAGCGATGGAA
CNTD2 CTCGACGCAGAGCCTTCA AGATGTCCCCGGCGTACT
COMP TGGGTTGGAAGGACAAGAAG TTGGCCCAGATGATGTTCTC
COX412 GGCTCCAGTTCAATGAGACC AAATCACCAGAGCTGCGAAT
CRTC1 CTGCACCAGAGCACAATGAC CTTGCTTGGAAAGGTTTTTGTC
CST3 CAGATCTACGCTGTGCCTTG GGTGGGAGGTGTGCATAAGA
CST4 CCGCACCATATGTACCAAGTC GCACTACAGTGGGTGGGAGT
CST9 CCTATGTTCCTCGCCACAGT GACCTGAGGAAAGCTGATGC
CTNNLB1 GGCACAGACAACTGCCATAA TCTCAGGTTCCGCAGGAG
CYP2A6 ACTTTGGGGTGCATTCTCAC CACCATGAATCCTGCCTAGC
CYP2A7P1 TTATCTGTTGCCCGCTCCTA GTGAACCCCACCTAGCTTTG
C19ORF40 GGCAGGAAATGGCTACAGAA GGGCTGGGAAGTATTGTTCA
C20ORF30 GGTGTTCCTACCCGGATTTT CTGGGACAGTTCCACTGTGA
C20ORF55 GACTAAAGCCCTCGGGATATG GCGAGTCTCAGGGCTAAAGA
C20ORF70 CGAATGAGGAGGACCACTGT AAGGTCCTTCTGGTGGGACT
C20ORF74 CGAGCTTGGAGTGGAAGAAT AAACAACACGAGCTGCCATA
C20ORF77 TCTGTCCTTGTGGATGCTTTT CACTTCGTTCTTGCCAGATG
C20ORF94 TCCTCAAGAGAGGGATACGC GATCACGACTGAATGCAAGC
C20ORF100 CTGCTCCCTGCCTACTCCTA CGAGTCATAGGCAGCCACTT
C20ORF102 GACGAAGGCACCTACGAGTG CAGATGCAGGACGGAGTTCT
C20ORF133 TGACCTTAGAAGAGAGACGCAAA TTGCCCTTCATCTCCTCCTT
C20ORF152 TCTTTCGGAAGGACCTGTTG GGTATTTCTGGGGTGCCATA
C20ORF196 AATAACTCCTGGACCGCTGA GAGACAGGCACTTGCAGACA
PDCD2L AGGGACTTTGTCAACCTGGA CAAAATCTGCTCCTGACAAGC
DHX35 AGAGGAGCCCACAGCTACAG CAGAGCAACCAAACATTCCA
ELMO2 AGCCATGGACTTTACCCAGA CTGCAGGATTTCACAGAGCA
ENTPD6 CGAGGTCTTCTACGGGATCA GGGCCTTTTCTCCAGGTAAC
EPB41L1 GGTCTGCATCGAGCATCATA ACATGGTGTACCGTTTGCTG
EPN1 CCCGACGAGTTCTCTGACTT ATGACTCCGGCGTCTTCC
EYA2 GATCAGCAGCATCTCCACCT CCTTCCTGGTGTCCAACATC
FAM83D ATTCTGTCTGGCCAAGTGGT GACTGTGGTTTTCGGTTGGT
GPATC1 GGAAGAGCATGCACCAGAAT CGGAGGATGAGGACTTTTCA
GSTG30 CGAGGTTCATCACAATGACG AGGATGAGGATGAGCAGGTG
JAG1 GGCAAGGCCTGTACTGTGAT CTCAGGGCAGGAACACTGAT
KCNSI CTGTACCTGGCTGTGGGTGT CAGGATGCCCCCTAGGAT
KIAA0355 GCGCACCACTCTATGCAGTC GGGTCAGTACTGGGGCAGATAG
KIAA0406 TGATGAAGGTGGACCCAGAC GTCGTGTAGGGGTTCTGCTG
KLH26 TCCTCGATGTTGTGCTGACT TAGGCGAAGTCGATGATGTG
LPIN3 GAACCTGAACCCACTCTGGA TGGTAAGAGACGCTGTGCTG
LSM14A CCAAAGTAGTGCGGTTGGTT CACTGCTTGTTCCATGGTTG
MMP24 GATCGAGCAGTTCTGGAAGG TAGGTCTTGCCCACAGGTTC
NALP9 GACCCTGAAACTTGGGCATA CTCAATGCCTCACACAGCAC
NAT5 TCTGTCTGTTGCCCCAGAAT CACTGTAGCCCAACTGCTTG
NCOA3 CTGACATCTCTGCACCAGGA GATGGAGCTCAAAGCTGGTC
PCNA CTGTGCAAAAGACGGAGTGA TCACCGTTGAAGAGAGTGGA
PIGT TCCCAGTCTCTGATGGCTCT AGGCAGATCACGTTGTAGGG
PLCB1 CGAGATCCTCGGCTTAATGA CCACTCAGATAGCGCATGAA
PLCB4 CGAACTCGCATGGTTATGAA TCGAAGGGAAATGTGTCGAT
PLCG1 AATGGAGACAACCGCCTCTA TTTTTGCCTTCATCCAGAGC
PRNP AGCCTGGAGGATGGAACACT ACTCGGCTTGTTCCACTGAC
PROCR CTACTTCCGCGACCCCTATC ACTGGAGCAGGTAGGACTGC
PYGB CCTGTGCATACACCAACCAC CCCTCCTCGATCACAGACAT
Q4VXU4 ATGATGACACTGGGGACCAT GAGATGGCTAGGGGTTCTCC
Q5TG32 AGTGGATGGGAGAGACTTGC GGCTGGAGGTCATCTGTGAT
Q5TG30 GGGAACAGCACAACAAGATG TGCACCATTATCTGCACCAC
Q6KAL7 TTTAGAGATGTGGCCGTGGT CCCTTGCTTCAAAGAGGACA
Q6UXV6 CAATGGCAGGGAGTTCTGTT ATGTGGGTAATGTCGCCTCT
Q6ZSU1 TCTCTCTATGCTCGCTGGTG GCCTCCTCTTAGGGATGCTT
Q6ZUK9 CTCCATTGCCCATTTCTCTC CTCCTCCACTGGGAAATGAA
Q9H410 CTTTTGGCAGATTGCAGTGA TCATGAAATTCTTCTGGCAATG
Q9NT59 GCAGAGGAACGGAAGCTAGA TGTGCACTGTCTCCCTGTGT
RAD21L1 CCTGAAGACGCAGCTGTCAC CAGCGTCTCCAGCTTCTTGT
RIN2 TGTATGGCGCTGATGACTTC GTTGGTGGTTCTCCGTTTGT
RHPN2 GTGGCCACAAACTCAAAGGT AGAGGAATCAGGGGAATCGT
RP11/LATH CCTGACCGAAGATGCTGAAT GATGTCCAGGAGGCCTGATA
RSPO4 ACTGTCCCCCTGGGTACTTC GGCAGACACTTCCCCTTGTA
SDCB2 GACTCCAAGCTAGGCTGAGG CCAGTTGCGACTTTAAGCAG
SIPA1L3 GTGTTGGAAGTTCCCAAGGA CTCATCCACGCCGAAGTAAT
SIX5 CCAATGTGCACCTCATCAAC CCGTCTCTGGCTTCAGTGG
SLC23A2 ACGGCATGGAGTCGTACAAT ATCTGAACTCCGGCTGTTGT
SPTLC2L GGAAGCACCCCTTCATGTTA CAGCTGCGTTGCATTTTTCT
STK4 GAAATGGATTCTGGCACGAT CAGTTGTGATGGCAACGTGT
SULF2 ACACGTACTGGTGCATGAGG TGGTAGGGGTCTGTGTTGAG
TGM2 GTCAACTGCAACGATGACCA TGCGGAAGTACTCGATGAGA
TOMM34 TTGCAGACATCAGCAACCTC TTGCCCTGTTGGGTTTTTAG
TPX2 TTCAACCTGTCCCAAGGAAA TTGCAATTTCTCGAGCTCCT
TUF1 AGAGGATTTTGCCAGAGCAG CCAAAGCCACCTTGCAAATA
TXNDC13 GCAGAATCGGAGATCAGAGG CTTTTACGCTGCCTCAAGGA
UBA2 AGCAAGCCAGAGGTGACTGT CTTCCGTCTCTCCCTCTTCG
U2AF2 ATTTCTTCAACGCCCAGATG GAGGCCTGCGGATCTTTAGT
WFDC2 CCCCCAGGTGAACATTAACT ATTGGGAGTGACACAGGACA
WTIP CGAGACAACCATCCGTGTG GTGGCAACGACGACACAGTA
ZHX3 CCTCATCGAGAAGCTGGAAC GCATCTTGCAACACCACAGT
ZNF341 AACACATGCAGACCCACAAG GAGCTGTCGATGAGGACCAC
ZNF536 CAAGTGTCCGCACTGTGACT AGAACCCCTGATGTCCACTG
ZNF542 ACTCAAAGTGCCCAGCTCAT TGAACTGCATCACACGACAA
ZNF663 GAGTACGGGGAAGCTTTTCA ATAGGAAAGCGAGGGGAAGA
ZSCAN5 GAAAGACGTGTCCAGCCAAC CCTTCTCCAAGGTCTGCTTG
Chapter 3: Identification of novel AR target genes by ChIP Display
3.1 Technical aspects and usefulness of ChIP Display
ChIP Display (see Figure 2 for a schematic diagram) was developed in our
lab by a former graduate student who successfully applied the technique to identify
novel target genes for the osteoblast specific transcription factor RUNX2 (5). As its
name implies, ChIP Display begins with chromatin immunoprecipitation and shares
some similarities with differential display. Immunoprecipitations are carried out
using specific and control (IgG) antibodies to pull down DNA fragments bound by a
transcription factor of interest (Figure 2A). With ChIP, a specific genomic target
bound by a TF is significantly enriched; however, the non-specifically co-
precipitated fragments across the whole genome altogether overwhelm the specific
ones, thereby excluding the feasibility of cloning TF targets directly from the ChIP
material. ChIP Display solves this problem in two main steps: concentrating TF
bound fragments of interest and scattering the non-specifically co-precipitated DNA
fragments. Restriction enzyme digestion with Ava II (Figure 2C), whose recognition
sequence is expected every 500 bps on average, standardizes all genomic fragments
representing one TF bound locus, to approximately the same size, allowing this locus
to be concentrated on a gel after PCR for ease of visualization. Scattering of the
overwhelming number of non-specifically co-precipitated DNA fragments is
achieved by dividing the pool of Ava II digested immunoprecipitates into sub-families
based on DNA sequence permutations outside of, and internal to, the Ava II
recognition sequence, which reads GGWCCN (Figure 2H). The nucleotide at the
center, W, provides two permutations (as this is part of the Ava II recognition
sequence and can be either A or T) and the nucleotide just internal to the restriction
site, N, provides four permutations (A, T, G or C). Each fragment end is therefore
one of eight types, and because every fragment has two ends, the Ava II digested
immunoprecipitates can be divided into 36 possible CD families based on the
combination of permutations at the W and N positions (Figure 2I). All
restriction fragments representing a particular locus will have the same nucleotides at
W and N and will remain in the same CD family (Figure 2E). In contrast, non-specifically
co-precipitated fragments are scattered into many different families and
their signal is diluted. Based on these principles, thirty-six combinations of
eight nested primers are utilized to amplify DNA fragments belonging to one family
at a time. The PCR products from 2-3 independent experimental and control
immunoprecipitates are resolved by PAGE (Figure 2F). Bands more prominently
and reproducibly amplified in experimental ChIPs than in control ChIPs are
considered putative targets (Figure 2F) and excised from the gel for further
characterization by restriction digests, sequencing and mapping (Figure 2G).

Figure 2. Principles of ChIP Display (CD). Chromatin immunoprecipitations are
carried out with a specific antibody of interest, or IgG control, to pull down all DNA
fragments bound to a protein (A). The precipitated DNA is then treated with SAP
and digested with AvaII (B&C). The digested fragments are ligated with linkers for
subsequent PCRs using specific primers (D&E). The PCR products are separated by
PAGE and bands consistently present in experimental lanes compared to control
lanes are cut for reamplification, digestion and sequencing (F&G). Nested primers
are designed to amplify linker ligated DNA fragments by making use of
permutations in the linkers at positions +3 and +6 (H). These permutations, based
on the AvaII recognition sequence, allow for 36 possible combinations of primers to
use for PCR amplification of the immunoprecipitated material (I) (5).
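The arithmetic behind the 36 CD families and the expected Ava II fragment size can be checked with a short Python sketch. It is an illustration only: the variable names are hypothetical, and the spacing estimate assumes that the four bases occur independently and with equal frequency, which is only an approximation for genomic DNA.

```python
from itertools import combinations_with_replacement, product

W_OPTIONS = "AT"    # W, the central nucleotide of the GGWCC recognition sequence
N_OPTIONS = "ACGT"  # N, the nucleotide adjacent to the site

# Each fragment end is described by one W and one N: 2 x 4 = 8 end types,
# corresponding to the eight nested primers.
end_types = ["".join(pair) for pair in product(W_OPTIONS, N_OPTIONS)]

# A fragment has two ends, and the order of the ends does not matter,
# so families are unordered pairs of end types, identical pairs included.
families = list(combinations_with_replacement(end_types, 2))

# Four fixed positions (probability 1/4 each) and one W position (probability 1/2).
expected_spacing_bp = 4 ** 4 * 2
```

Under these assumptions the sketch yields 8 end types, 36 families and an expected site spacing of 512 bp, consistent with the approximate value of 500 bp quoted above. Pairs such as ('AC', 'TG') correspond to the primer combinations listed in Table 1.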
Although high throughput methods have been developed since the invention
of CD, this method still offers several advantages. Firstly, CD is a relatively
inexpensive procedure which can be adopted in any standard molecular biology lab.
Secondly, ChIP Display allows us to map transcription factor binding sites,
information which is important in terms of understanding gene regulation and which
enables us to experimentally decipher genes in the vicinity that are being regulated.
This allows for the unbiased identification of novel targets, which are most likely to
be direct targets. The above advantages are not offered by other assays such as
microarray expression profiling, which biases the results to the set of genes present
on the array. Another disadvantage of microarray analysis is that one cannot tease
out whether the responsive genes are direct or indirect targets. Others have sought
to identify transcription factor targets via hybridization of chromatin
immunoprecipitates to promoter arrays; however, this again biases the results to
promoter-proximal regulatory regions, which mounting evidence suggests may not
be the primary binding sites for transcription factors (see our results and discussion
that follows). The application of genome-wide ChIP-on-chip analysis does address
the concerns inherent in microarray analysis; however, the high cost of ChIP-on-chip
technology limits its accessibility for many researchers.
3.2 ChIP Display of AR targets in C4-2B cells: an example
To identify AR targets in PCa, C4-2B cells were subjected to ChIP Display
(3.1 above) (5). ChIP was performed with either AR or IgG control antibodies, and
the purified DNA was digested with AvaII. The AvaII fragments were amplified
using ligation-mediated PCR with each of 36 possible primer combinations (5) and
resolved by PAGE. Figure 3 describes an example of the procedure leading to the
identification of one novel target, and Table 1 summarizes all the AR targets
identified in this study.
Figure 3. ChIP Display (CD) demonstrates a putative AR target. A) CD Gel.
C4-2B cells were treated with 10 nM DHT for 4 hours to enhance AR association
with target loci. Two independent AR ChIPs, and IgG control ChIPs were subjected
to the CD procedure as described in Materials and Methods. In the example shown
here, PCRs were performed with the ‘AC’ and the ‘TG’ PCR primers (see Materials
and Methods and Table 1), with the annealing temperature set at either 70°C or 71°C
as indicated. Amplified products were resolved using 8% PAGE and visualized by
EtBr staining. The arrowheads point at bands amplified more prominently in the AR
compared to the Control (IgG) lanes. M, marker DNA; numbers above bands
indicate size in bps. B) Re-amplification and digestion. The two bands indicated in
panel A by arrowheads were excised, purified and re-amplified with the same ‘AC’
and ‘TG’ primers used for CD. The products were subjected to secondary digestion
with the indicated enzymes, followed by agarose gel electrophoresis. Arrowheads
point at similar HaeIII sub-fragments obtained from the two AR ChIPs. –C, no
template control, UC, uncut, M, marker DNA. C) Mapping of AR target. The
HaeIII subfragments from B were excised, purified and sequenced. By blasting
against the human genome (www.ensembl.org), both sequences mapped to
chromosome 17q25.3, ~1.5-kb upstream of the MAFG gene and ~2.5-kb
downstream of the PYCR1 gene as shown in the diagram. The two genes are
transcribed in the same direction as indicated by the horizontal arrows. pA,
polyadenylation signal. The AR binding region discovered through CD (“hit”) abuts
a CpG island (bottom, striped rectangle), but does not overlap with any repetitive
elements (bottom, black rectangles). Several AREs (checkerboard triangles) were
identified in this region using Consite (http://mordor.cgb.ki.se/cgi-bin/CONSITE/consite).
D) Validation of target by conventional ChIP analysis.
AR occupancy at the PYCR1/MAFG locus was tested by conventional ChIP assay.
The PSA enhancer serves as positive control. A non-target locus serves as the
negative control. Genomic DNA was used to demonstrate that the ChIP
amplification was performed within a dynamic range. –C, no template control. M,
marker DNA.
The example shown in Figure 3A entails the amplification and PAGE of two
independent AR-ChIPs and two mock ChIPs using one of the 36 primer
combinations – the ‘AC’ and the ‘TG’ primer (see Materials and Methods and
Table 1). The arrowheads in Figure 3A point to bands more prominently amplified
in the AR ChIPs as compared to the IgG ChIPs. These bands, representing a putative
AR binding region, were excised, reamplified and further characterized by secondary
restriction digests and agarose gel electrophoresis (Figure 3B). The major HaeIII
digestion product from each of the two ChIPs was sequenced and mapped to human
chromosome 17q25.3, 1.5-kb upstream of the MAFG gene and 2.5-kb 3’ of the
PYCR1 gene (Figure 3C). The AvaII fragment displayed in the original PAGE
(Figure 3A), depicted in Figure 3C as “hit,” does not contain repetitive sequences
and is located between two canonical androgen response elements (AREs) (Figure
3C). A 2.4-kb CpG island is present adjacent to the hit (Figure 3C).
To validate AR occupancy at the region described above, we performed
conventional ChIP assays with locus-specific primers (see Materials and Methods). Four
independent experiments with C4-2B cells showed that the PYCR1/MAFG locus
was enriched in AR ChIPs as compared to paired IgG control ChIPs (Table 1, and
see a representative result in Figure 3D).
3.3 ChIP Display discloses 19 novel AR binding sites in PCa cells
The CD procedure, exemplified above for the ‘AC’ and ‘TG’ primer pair,
was performed using all 36 possible primer combinations (5), resulting in the
independent conventional ChIP assays of C4-2B cells (Table 1). Whereas only four
of the 19 AR binding regions were up to 10-kb 5’ of the nearest gene (indicated by
bolded and italicized text in Table 1), many of the binding regions were either within
the body of annotated genes (8 of the 19 regions) or up to 4-kb 3’ of the nearest gene
(3 of the 19 regions), indicating that AR-bound regions are not preferentially found
within so-called 5’-flanking gene regulatory sequences. Four of the 19 CD hits were
mapped to regions more than 197-kb away from any annotated gene (Table 1).
An important enigma in prostate cancer research is the molecular nature of
the transition from androgen-dependent to castrate-resistant disease. In this context,
the C4-2B cell line serves as a model of the latter, whereas its parent cell line,
LNCaP, serves as a model of the former (98). We speculated that many of the AR-
occupied regions in C4-2B cells could become targets for this transcription factor
only during the transition from androgen dependence to castrate-resistance and
would therefore not be occupied by the AR in LNCaP cells. However, results of
ChIP analysis in LNCaP cells were inconsistent with this notion, as 16 of 17 regions
that we tested, which were occupied by the AR in C4-2B cells, were also occupied in
LNCaP cells at least in one conventional ChIP assay (Table 1). Be that as it may,
several of the AR-occupied regions are located near genes that have been linked to
prostate or other cancers (see Discussion below and references in Table 1).
Table 1: AR targets identified by ChIP Display
CD Primers¹ | Band² | AvaII–AvaII³ | Nearby Genes⁴, Position of CD Hit Relative to Gene | ChIP validation⁵: C4-2B | LNCaP
AT, TA | 1p35.2 | 30,152,547 - 30,152,728 | Nearest gene is 626-kb away | 4 | 2
AA, TC | 1q25.2 | 178,433,288 - 178,433,456 | QSCN6 (77, 111), exon 13; LHX4 (114), 32.7-kb 5'; CEP350, 84.2-kb 3'; ACBD6, 90.6-kb 3' | 3 | 1
AT, AT | 2q37.3 | 241,348,804 - 241,348,990 | KIF1A, intron 23 / exon 24; AQP12, 62.4-kb 3' | 3 | 1
TT, TT | 3p21.1 | 53,169,093 - 53,169,401 | PRKCD (24, 45), 0.8-kb 5' | 4 | nd
AT, AC | 4p16.1 | 6,644,411 - 6,644,619 | MAN2B2, intron 5; MRFAP1, 48.7-kb 5' | 3 | 3
AA, TC | 7q11.23 | 72,483,118 - 72,483,317 | FZD9 (104), 2.7-kb 5'; BAZ1B, 10-kb 3' | 4 | nd
AT, AG | 7q11.23 | 72,922,165 - 72,922,474 | WBSCR28, 4-kb 3'; WBSCR27, 27.4-kb 5'; CLDN4 (33), 37.2-kb 3' | 4 | 2
AA, AG | 8q24.3 | 143,094,298 - 143,094,518 | Nearest gene is 197-kb away (23) | 3 | 2
AC, TC | 10p12.1 | 24,584,349 - 24,584,579 | KIAA1217, intron 2 | 4 | 3
AT, AG | 10q26.13 | 126,072,189 - 126,072,473 | OAT, 3.4-kb 3'; LHPP, 67.9-kb 5' | 2 | 0
AC, TC | 11p15.4 | 1,017,234 - 1,017,529 | MUC6 (35), 10-kb 5'; AP2A2, 15-kb 3' | 3 | 3
AT, AT | 11q12.3 | 62,532,814 - 62,532,977 | SLC22A8, intron 2; SLC22A6, 23.9-kb 5'; CHRM1, 87.3-kb 5' | 5 | 2
AT, AT | 11q25 | 134,102,859 - 134,103,167 | Nearest gene is 315-kb away | 2 | 3
AT, AT | 14q31.3 | 86,510,091 - 86,510,360 | Nearest gene is 959-kb away | 3 | 2
AT, AC | 17p13.2 | 3,446,846 - 3,447,080 | TRPV1, exon 1 / intron 1; CARKL, 11.4-kb 3'; TRPV3 (88) (22), 39-kb 5' | 3 | 1
AC, TG | 17q25.3 | 77,480,155 - 77,480,527 | MAFG, 1.5-kb 5'; PYCR1 (14, 21), 2.5-kb 3'; SIRT7 (3, 51), 12-kb 5' | 4 | 2
AA, AA | 22q11.23 | 22,655,127 - 22,655,462 | GSTT2 (30), exon 4 / intron 4; DDT, 3.1-kb 5' | 3 | 2
AG, AG | 22q13.1 | 38,101,611 - 38,101,989 | SYNGR1, intron 2; MAP3K7IP1, 25-kb 5' | 3 | 2
AC, TC | 22q13.3 | 48,707,684 - 48,707,956 | CRELD2, 1-kb 3'; ALG12, 10-kb 5' | 3 | 1
¹ CD primers are defined based on two variable nucleotides as described in Materials and Methods.
² Cytogenetic band containing the CD hit.
³ Absolute positions of the AvaII sites flanking the fragment displayed by PAGE.
⁴ The nearest Refseq gene annotated in Ensembl (www.ensembl.org) is shown in bolded and italicized
text. Published papers suggesting relevance to cancer are referenced near the respective gene name.
⁵ Number of independent conventional ChIP assays in which AR occupancy was confirmed. nd, not
determined.
3.4 AR-occupied regions are associated with DHT-stimulated and DHT-repressed genes in C4-2B cells
One of the goals of this study was to identify primary AR-responsive target
genes in PCa cells. Of the 19 AR-occupied regions, 15 were within 10-kb of Refseq-
annotated genes. We initially measured the androgen responsiveness of genes
nearest to each of these 15 AR-occupied regions (gene names bolded and italicized
in Table 1). C4-2B cells were depleted of steroids and treated with DHT or
vehicle for 0, 2, 4, 8, 16, 24 or 48 hours. Gene expression was assessed by RT-
qPCR. Of the 15 genes nearest AR occupied regions, expression of all but SLC22A8
was detectable, and only 6 of the remaining 14 genes responded to DHT treatment in
a consistent manner. CRELD2, PRKCD and GSTT2 were stimulated (Figure 4B, C,
D, solid lines), whereas MUC6, KIAA1217, and WBSCR28 were repressed (Figure
4ZC, ZE, ZF, solid lines). Because the remaining 8 detectable genes nearest the AR-
occupied regions did not respond to DHT, we tested the expression of 19 additional
nearby genes, up to 100-kb away from AR-occupied regions. Of these 19 genes, the
expression of all but AQP12 was detectable, but only eight responded to DHT
treatment. DDT, TRPV3, PYCR1, AP2A2, ACBD6, SIRT7 and MRFAP1 were
stimulated (Figure 4A and E-J, solid lines), and CHRM1 was repressed (Figure 4ZD).
Altogether, of 32 genes within 100-kb from AR-occupied regions, many of which
have been implicated in cancer progression (see references next to gene names in
Table 1), ten were stimulated and four were repressed in DHT-treated C4-2B cells.
More detailed investigation of the repressed genes is described elsewhere (79).
Figure 4. Gene expression analysis. C4-2B (solid lines) and LNCaP cells (broken
lines) were maintained in 5% CSS-containing medium for three days, and then re-fed
(time 0) with the same medium supplemented with either 10 nM DHT or ethanol
vehicle. RNA was extracted at the indicated time points during the time course and
expression of the specified genes was measured by RT-qPCR. Expression levels
relative to 18S rRNA (which itself stayed stable throughout the time course) are
shown with the 0 time values defined as 1 for each cell line. Representative data is
shown from two independent experiments with n=3, except for panels 4L, O, U, Y
and ZC, where n=6. Error bars are SEM. Genes are roughly ordered based on the
DHT-responsiveness in C4-2B cells, with stimulated genes first (panels A-J) to
repressed genes last (panels ZC-ZF). TRPV3 mRNA was barely detectable in
LNCaP cells.
Notably, there were four loci in bands 2q37.3, 7q11.23, 10q26.13 and 22q13.1,
where no nearby genes responded to DHT despite AR occupancy (Table 1 and
Figure 4).
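The normalization described in the Figure 4 legend (expression relative to 18S rRNA, with the time 0 value defined as 1) can be illustrated with a minimal sketch. The legend does not state the exact formula used, so the comparative Ct (2^-ddCt) calculation below is an assumption, as is the roughly 100% amplification efficiency it implies; the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_gene, ct_ref, ct_gene_t0, ct_ref_t0):
    """Fold change versus time 0, normalized to a reference transcript.

    Assumes one doubling per PCR cycle (about 100% efficiency).
    """
    delta_ct = ct_gene - ct_ref            # normalize to 18S rRNA at this time point
    delta_ct_t0 = ct_gene_t0 - ct_ref_t0   # same normalization at time 0
    return 2.0 ** -(delta_ct - delta_ct_t0)


# Hypothetical (gene Ct, 18S Ct) pairs keyed by hours of DHT treatment
time_course = {0: (28.0, 10.0), 4: (27.0, 10.0), 24: (26.0, 10.0)}
ct_gene_t0, ct_ref_t0 = time_course[0]
fold = {
    hours: relative_expression(ct_gene, ct_ref, ct_gene_t0, ct_ref_t0)
    for hours, (ct_gene, ct_ref) in time_course.items()
}
print(fold)  # {0: 1.0, 4: 2.0, 24: 4.0}
```

A gene whose Ct drops by one cycle relative to 18S rRNA is therefore plotted at 2 on the relative scale, and a repressed gene falls below 1.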
3.5 AR-dependent, DHT-independent regulation of OAT and MRFAP1
Genes near AR-occupied regions that did not respond to DHT could still be
regulated by the AR in a ligand-independent manner. To address this possibility, we
treated C4-2B cells with AR siRNA duplexes (41) and assessed the effects on gene
expression in the absence (and presence – as control) of DHT. Of eight genes near
the four AR-occupied regions that were not associated with DHT-responsiveness in
C4-2B cells, we found one, OAT, which was repressed in three of three siRNA
experiments (Figure 5A), suggesting that it is indeed stimulated by the AR in the
absence of ligand, despite its DHT non-responsiveness (Figure 4L). The other seven
genes, KIF1A, AQP12, FZD9, BAZ1B, LHPP, SYNGR1 and MAP3K7IP1 did not
respond to the siRNA treatment (data not shown), suggesting that AR occupancy at
these loci may be without functional consequences in cultured C4-2B cells.
As controls for the AR siRNA experiments we also measured expression of
AR itself and the DHT-stimulated genes PSA (40), PRKCD, PYCR1 and MRFAP1
(Figure 4C, F, J). As expected, the AR knockdown (Figure 5B) was associated with
loss of DHT-stimulation (Figure 5C, D, and E). Interestingly, however, one of these
controls, MRFAP1, displayed an unexpected phenotype. In addition to the DHT-
stimulation, it was reproducibly stimulated in cells treated with AR siRNA (Figure
5F). Taken together, our data suggest that unliganded AR supports basal OAT
expression (Figure 5A) without further stimulation by DHT (Figure 4L), while basal
MRFAP1 expression is suppressed by unliganded AR (Figure 5F), yet stimulated by
DHT (Figure 4J).
Figure 5. Effects of AR siRNA-knockdown on gene expression. C4-2B cells were
treated with AR siRNA (white bars) or a non-specific siRNA (black bars), followed
by administration of either DHT (10 nM) or ethanol vehicle for 16 hours.
Expression levels of the indicated genes were analyzed in triplicate by RT-qPCR and
corrected for 18S rRNA levels. Values measured with the non-specific siRNA and
ethanol were defined as 1. Results (Mean ± SD) are representative of three
independent experiments.
3.6 Differential regulation of genes near AR-occupied regions in LNCaP versus
C4-2B cells
Although most of the regions occupied by the AR in C4-2B cells (a model of
castrate-resistant PCa) were also occupied in LNCaP cells (a model of androgen-
dependent PCa) (Table 1), we suspected that the functional consequences of AR
occupancy at these loci might differ between the two cell lines. We therefore
complemented the DHT time course studies in C4-2B cells (Figure 4, solid lines)
with parallel expression analysis of the same genes in LNCaP cells under a similar
experimental protocol (Figure 4, dashed lines). Both similarities and differences
between the two cell lines were observed. The three genes most strongly stimulated
by DHT in C4-2B cells – DDT, CRELD2 and PRKCD – were also stimulated in
LNCaP cells (Figure 4A, B, C), although the stimulation of DDT and PRKCD was
more modest in LNCaP cells. Genes that were more moderately stimulated by DHT
in C4-2B cells were slightly (PYCR1, Figure 4F) or not at all stimulated in LNCaP
cells (GSTT2, AP2A2, ACBD6, and SIRT7; Figure 4, panels D, G, H, I). Repressed
genes displayed a mirror image. The four genes most strongly repressed in C4-2B
cells – MUC6, CHRM1, KIAA1217, and WBSCR28 – were also repressed in
LNCaP cells, but repression generally occurred faster in the C4-2B cells (Figure
4ZC, ZD, ZE, ZF). Subtle responses to DHT were less consistent between C4-2B
and LNCaP cells, although three genes, CARKL, MAN2B2, and LHX4, displayed
remarkably similar expression patterns (Figure 4, panels M, N, O).
Figure 6. Gene expression in C4-2B versus LNCaP cells. RNA was extracted
from C4-2B and LNCaP cultures that were maintained for two days in
CSS-supplemented medium. Gene expression was analyzed side-by-side by RT-qPCR
and corrected for 18S rRNA. Bars represent the ratio between the expression in
C4-2B and LNCaP cells, where the expression level in LNCaP cells is defined as 1 (n=3;
Mean ± SD). Only genes with significant differences between the two cell lines are
shown.
An emerging concept in PCa research is that ligand-independent AR-
mediated gene expression contributes to the acquisition of a castrate-resistant growth
state. If the derivation of C4-2B from LNCaP cells (98) were associated with such a
mechanism, then one could expect expression of some genes near AR-occupied
regions to be higher in hormone-deprived C4-2B as compared to hormone-deprived
LNCaP cells. We therefore compared expression of the 32 genes near the AR-
occupied regions between the two cell lines, and found four that were expressed in
C4-2B cells at levels between 2 and 12-fold higher than in LNCaP cells (Figure 6).
Not surprisingly, one of these genes was OAT, which was repressed after AR
knockdown in C4-2B cells (Figure 5A). The other three were QSCN6, GSTT2, and
TRPV3. Interestingly, two genes, KIF1A and MAN2B2, were expressed at 2-fold lower
levels in C4-2B than in LNCaP cells. Thus, differences in gene expression between
LNCaP and C4-2B cells, both under androgen deprivation and after DHT
stimulation, may be involved in mechanisms of progression from early to late stage
disease.
3.7 Clinical relevance of novel AR target genes
To examine whether genes found in proximity to AR-occupied regions in our
culture model are potentially regulated by the AR during PCa progression, we mined
a microarray database for expression profiles of our genes during various stages of
prostate cancer progression. This study was designed by Drs. William Gerald and
Howard I. Scher from Memorial Sloan-Kettering Cancer Center, New York, NY and
Dr. Wayne Tilley from Dame Roma Cancer Hospital in Adelaide, Australia. The
database contained gene expression profiles from clinical samples which consisted of
snap-frozen prostatic tissues obtained during routine clinical management at the
Memorial Sloan-Kettering Cancer Center, New York, NY, and included 23 primary
prostate cancers from patients undergoing radical prostatectomy with no therapy
before surgery, 17 primary prostate cancers after 3 months of androgen ablation
therapy (goserelin and flutamide) before radical prostatectomy, and 7 AR-positive
metastatic prostate cancer lesions. RNA was extracted from tissue consisting of 60-
80% prostate cancer cell nuclei obtained by manual microdissection of each sample,
and analysed as previously described using the Affymetrix U95 A-E array set (36).
Figure 7 illustrates expression of the in vitro CD-disclosed genes in this microarray
analysis. Expression of several of these genes was consistent with in vivo regulation
by the AR. The mRNAs for DDT, CRELD2, PRKCD, and PYCR1, which were
DHT-stimulated in vitro (Figure 4), were decreased in the androgen-ablated as
compared to the primary untreated tumors (Figure 7, Group II). Furthermore, when
compared to the androgen-ablated tumors, the expression of these four genes was
elevated in the metastatic tumors (Figure 7), presumably due to reactivation of the
AR (86, 87). The similarity between the expression profiles of the CD-disclosed
targets DDT, CRELD2, PRKCD, and PYCR1 in the clinical samples and those of the
established AR target genes PSA/KLK3 (8) and TMPRSS2 (56) (Figure 7, Group I)
suggests that the four genes discovered in our in vitro study are indeed AR targets in
vivo. Interestingly, expression of ALG12, LHX4 and TRPV1, which was
unresponsive to DHT in vitro, was also decreased in androgen-ablated as compared
to untreated primary tumors (Figure 7, Group II), possibly reflecting alternative
mechanisms of AR signaling in clinical samples compared to our in vitro cell culture
system. Two microarray probesets that interrogate KIAA1217 expression suggest
that this gene is expressed more strongly in androgen-ablated as compared to
untreated PCa tumors (Figure 7, Group III), consistent with the DHT-mediated
repression observed in C4-2B and LNCaP cells (Figure 4ZE). However, results from
other KIAA1217 probesets (Figure 7) suggest that the in vivo regulation of
KIAA1217 by the AR is either variable or unique to specific isoforms. Among the
other genes apparently downregulated by AR in vivo (Figure 7, Group III) are
QSCN6 and SYNGR1, which exhibited a trend for DHT-mediated repression in vitro
(Figure 4Z and ZB). Group III of Figure 7 also contains genes near AR-occupied
regions, such as MAN2B2 and CLDN4, which appear to have been downregulated
by androgen signaling in vivo but not in vitro. Comparison of the expression
patterns in the tumor samples and in culture also suggests that a few genes such as
Figure 7. Expression of CD-disclosed genes in PCa tumors. RNA from 47 PCa
tumors (columns) was analyzed using Affymetrix U95 A-E microarray sets (21) and
results were mined for all probesets (rows) interrogating each of the 32 CD-disclosed
genes (Table 1). Heat map shows relative expression for each of the indicated
probesets, where darker shades represent higher mRNA levels. Tumors included 23
primary prostate cancers from patients not receiving therapy (primary), 17 primary
prostate cancers following 3-month neoadjuvant androgen ablation therapy
(primary+AAT), and 7 AR-positive metastatic lesions (mets). All probesets
interrogating each gene are shown, except for probesets 59776_at (WBSCR28) and
36904_at (KIF1A), which did not detect significant expression in any sample.
Probesets are grouped and ranked as follows. Group I - probesets for the known
AR-stimulated genes KLK3/PSA and TMPRSS2. Group II - probesets exhibiting
statistically greater mean expression in untreated compared to AAT-treated primary
PCa samples (p<0.05), thereby representing putative AR-stimulated genes. Group III
- probesets exhibiting statistically lower mean expression in untreated compared to
AAT-treated PCa samples (p<0.05), thereby representing putative AR-repressed
genes. Group IV - probesets exhibiting no statistical difference between samples
without or with AAT. Probesets in Groups II-IV are ranked by p-value in descending
order.
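The grouping rule stated in the Figure 7 legend (Groups II-IV, based on whether mean expression differs between untreated and AAT-treated primary tumors at p<0.05) can be sketched as follows. The legend does not name the statistical test that was used, so a two-sided permutation test on the difference in means is substituted here purely to keep the example self-contained; the function names and the expression values are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)


def assign_group(untreated, aat_treated, alpha=0.05):
    """Return 'II', 'III' or 'IV' following the rule in the Figure 7 legend."""
    if permutation_p_value(untreated, aat_treated) >= alpha:
        return "IV"    # no statistical difference
    if mean(untreated) > mean(aat_treated):
        return "II"    # higher without AAT: putative AR-stimulated
    return "III"       # lower without AAT: putative AR-repressed


# Hypothetical probeset intensities (not data from this study)
high = [10, 11, 12, 10, 11, 12, 11, 10]
low = [1, 2, 3, 1, 2, 3, 2, 1]
print(assign_group(high, low))   # II
print(assign_group(low, high))   # III
print(assign_group([5, 5, 5, 5], [5, 5, 5, 5]))  # IV
```

Group I is not covered by this rule because it is defined by gene identity (KLK3/PSA and TMPRSS2) rather than by a statistical comparison.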
3.8 Discussion
3.8A. AR occupancy is not biased towards 5’ promoter-proximal regions
The classical view of gene regulation places 5’-flanking sequences at the
center stage. Consistent with this view, functional AREs have been mapped within
0.5-kb upstream of the AR-responsive genes probasin, KLK2 and KLK3 (PSA) (72,
82, 83). While four of the 19 AR-occupied regions disclosed in our study were
located within 10-kb upstream of annotated transcription start sites, many more were
found within gene bodies (8/19) or within the 4-kb sequences downstream from the
3’ ends of annotated genes. Our findings are consistent with several recent genome-
wide location analyses of other transcription factors. For example, only 4% of
estrogen receptor (ER) binding sites were mapped to 1-kb promoter-proximal
regions by ChIP-chip analysis (11). Similarly, genome-wide location analysis
indicates that p53 has no preference for binding to 5’ promoter-proximal regions
(106). Thus, accumulating evidence suggests that promoter-proximal regions
constitute only a small fraction of mammalian gene regulatory sequences.
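The positional tally discussed above can be reproduced directly from Table 1. The sketch below transcribes, for each of the 19 CD hits, the position relative to the nearest gene (the gene with the smallest listed distance in each table row) and counts the categories; the category labels and the parsing rule are illustrative choices, not part of the original analysis.

```python
import re
from collections import Counter

# (cytogenetic band, nearest gene, position of CD hit), transcribed from Table 1.
# None marks the four hits with no annotated gene within 197 kb.
HITS = [
    ("1p35.2", None, None),
    ("1q25.2", "QSCN6", "exon 13"),
    ("2q37.3", "KIF1A", "intron 23 / exon 24"),
    ("3p21.1", "PRKCD", "0.8-kb 5'"),
    ("4p16.1", "MAN2B2", "intron 5"),
    ("7q11.23", "FZD9", "2.7-kb 5'"),
    ("7q11.23", "WBSCR28", "4-kb 3'"),
    ("8q24.3", None, None),
    ("10p12.1", "KIAA1217", "intron 2"),
    ("10q26.13", "OAT", "3.4-kb 3'"),
    ("11p15.4", "MUC6", "10-kb 5'"),
    ("11q12.3", "SLC22A8", "intron 2"),
    ("11q25", None, None),
    ("14q31.3", None, None),
    ("17p13.2", "TRPV1", "exon 1 / intron 1"),
    ("17q25.3", "MAFG", "1.5-kb 5'"),
    ("22q11.23", "GSTT2", "exon 4 / intron 4"),
    ("22q13.1", "SYNGR1", "intron 2"),
    ("22q13.3", "CRELD2", "1-kb 3'"),
]


def classify(position):
    """Map a Table 1 position annotation to a coarse category."""
    if position is None:
        return "no gene nearby"
    flank = re.fullmatch(r"[\d.]+-kb ([53])'", position)
    if flank is None:
        return "gene body"              # exon/intron annotations
    return flank.group(1) + "' flank"   # 5' or 3' of the gene


counts = Counter(classify(position) for _, _, position in HITS)
# gene body: 8, 5' flank: 4, 3' flank: 3, no gene nearby: 4
for category, n in counts.items():
    print(f"{category}: {n} of {len(HITS)}")
```

The counts match the figures quoted in the text: 8 of 19 hits within gene bodies, 4 within 10-kb upstream, 3 within 4-kb downstream, and 4 far from any annotated gene.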
Many of the AR-occupied regions identified in the present study, which
cannot be designated classical 5’ promoter regions, were still close to an annotated
gene that they could potentially regulate. There is no consensus as to how far a
transcription factor-binding region can be from a gene and still be considered a
putative cis-acting regulatory domain for that gene. Values ranging from 1-kb to
100-kb from the transcription start site have been used by various investigators (11, 106), but
experimental evidence in support of any value is scarce. Systematic analyses of
transcription factor binding regions across the genome have become feasible only
recently. Such studies, including the present one, illustrate the need for mutagenesis
of transcription factor-binding regions that are distant from annotated genes in order
to identify functionally relevant regions. In particular, it would be interesting to
decipher the role of binding regions located hundreds of kbs away from the nearest
annotated gene. Such regions may still regulate distant annotated genes on the same
(105) or even other chromosomes (93), or they may regulate nearby unannotated
transcripts (108).
3.8B. AR location analysis discloses ligand-independent, AR-dependent gene
regulation
Comprehensive gene expression analysis is frequently employed for the
discovery of target genes for transcription factors, including the AR (21, 36, 73,
113). Such expression analyses cannot differentiate between direct and indirect
targets and they do not provide information on the location of regulatory elements.
Another, frequently under-appreciated limitation of expression studies is that they
only disclose target genes that respond to the transcription factor of interest under the
specific experimental conditions utilized by the investigator. In contrast, ChIP
Display and other approaches for location analysis such as ChIP-chip (12, 37, 81),
SABE (16), STAGE (46), ChIP-PET (106), GMAT (84), SACO (39) and DamID
(102) all rely on physical interaction, not gene expression, which has both pros and
cons. For example, in the present study, we discovered OAT as an AR-regulated
gene, although it did not respond to DHT. Other AR-target genes are likely missed
in expression-based studies because of the limited sensitivity and specificity of
microarray hybridization as compared to RT-qPCR. Ideally, expression-based and
location-based approaches to target identification should complement each
other. In the present study, we showed that many (but not
all) of the genes near AR-occupied regions are DHT-responsive. For OAT, which
was disclosed here by ChIP Display and could not have been disclosed by
comprehensive analysis of gene expression in response to androgen treatment, we
used siRNA knockdown to demonstrate the ligand-independent regulation by the
AR. Furthermore, of the genes near AR-occupied regions that responded to neither
DHT nor AR siRNA in our study, some may be AR-regulated under specific,
possibly transient physiological or pathological conditions not modeled by the
experimental systems we employed. This is particularly important in the context of
castrate-resistant PCa, where AR activation can occur through various signaling
pathways, including Her2, AKT, and MAPK (87).
3.8C. Differential basal gene expression in C4-2B versus LNCaP cells
The AR plays critical roles during all stages of PCa progression (15, 28, 86).
It is not clear, however, whether AR regulates different sets of genes before and after
ablation therapy. In our study, AR occupancy at most of the regions disclosed by
CD was similar in LNCaP and C4-2B cells, models of early and late stage PCa,
respectively. Our data are therefore consistent with the idea that the AR continues to
regulate the same genes before and after ablation therapy, but that the nature of this
regulation changes during disease progression. For many genes near AR-occupied
regions, ligand-bound AR had the same qualitative effects in the two cell lines,
although the effects were stronger in the C4-2B than in the LNCaP model (e.g.,
DDT, PRKCD, GSTT2, PYCR1; Figure 4). Some other genes near AR-occupied
regions were expressed at higher basal levels in C4-2B as compared to LNCaP
cells (e.g., OAT, GSTT2, TRPV3; Figure 6). The higher basal expression of these
genes could be a direct result of ligand-independent activation by the AR due to, for
example, alterations in cofactor expression (13, 53) and/or chromatin reorganization
(41).
3.8D. Novel AR target genes: Potential mechanisms contributing to PCa
progression
PSA (KLK3) remains the best-studied AR target gene in the PCa
literature to date. Since its approval in 1986, serum PSA testing has been routinely used to aid the
early diagnosis and prognosis of PCa in men. However, AR-driven PSA expression
alone does not fully explain the role of AR in PCa development and progression.
Although additional AR target genes have been recently discovered, e.g., FKBP5
(64) and TMPRSS2 (56), most remain elusive. Some of the AR target genes
discovered in the present study, and more to be discovered in the future, may open
new research avenues and help develop novel therapeutic approaches to manage
PCa.
3.8D(a). PYCR1
Pyrroline-5-carboxylate reductase 1 (PYCR1) catalyzes the NAD(P)H-
dependent conversion of pyrroline-5-carboxylate (P5C) to proline. Stimulation of
PYCR1 by the AR could contribute to PCa progression because P5C is pro-apoptotic
(69) and proline is anti-apoptotic (14). Indeed, a role for PYCR1 in PCa was
suggested by a 4-fold increased expression in human prostate tumors compared to
adjacent normal tissue (21). In the present study we demonstrate AR occupancy at
the PYCR1 locus in living PCa cells, the functionality of which is suggested by
DHT-mediated stimulation of gene expression. Consistent with these in vitro data,
we also demonstrate decreased PYCR1 expression in PCa biopsies from men
undergoing androgen ablation therapy as compared to untreated controls. The
highest PYCR1 expression in our PCa samples was found in biopsies from
metastatic tumors, possibly as a result of atypical AR activation. Of the AR targets
discovered in the present study, PYCR1 is a strong candidate for mediating the
oncogenic action of AR signaling in PCa.
3.8D(b). OAT
Interestingly, another AR target gene discovered in this study also
participates in proline metabolism. Ornithine aminotransferase (OAT) converts
ornithine to glutamate γ-semialdehyde, which spontaneously cyclizes to form
pyrroline-5-carboxylate (P5C), a proline precursor and the substrate for PYCR1.
The functional evidence for AR-mediated regulation of OAT is weaker than that for
PYCR1. While OAT mRNA was not significantly altered in response to DHT, it
was repressed after siRNA-mediated knockdown of the AR, and also displayed a
3.3-fold higher basal expression in C4-2B as compared to LNCaP cells.
Interestingly, OAT’s expression pattern in our tumor samples does not suggest AR-
mediated stimulation, but rather repression, possibly reflecting interactions of AR
signaling with input from other cell types or components of the extracellular matrix,
which only occur in vivo.
3.8D(c). PRKCD
In this study, we mapped AR occupancy to a region 0.8-kb upstream of the
gene encoding Protein Kinase C delta (PRKCD), which has received much attention
in the PCa literature. PRKCD mRNA levels were higher in DHT-treated as
compared to untreated PCa cell cultures and were reduced in PCa biopsies from
patients undergoing androgen ablation therapy as compared to those from untreated
patients. The observed high PRKCD mRNA levels in PCa metastases, which may
reflect ligand-independent AR activation, possibly play a role in late stage disease
because PRKCD is implicated in growth, migration and invasion of cancer cells,
including PCa (44, 45). PRKCD has also been implicated in the control of cell
survival, although most studies suggest it is in fact pro-apoptotic (24, 94, 97). Future
studies will have to address how PRKCD’s pro-apoptotic activity is overcome in
advanced PCa cells.
3.8D(d). CRELD2 and DDT
The gene expression data from both the cell culture models and the clinical
tumor samples suggest androgen-mediated stimulation of CRELD2 and DDT.
Although a role for CRELD2 in carcinogenesis remains to be investigated, this
Cysteine-rich with EGF-like Domains 2 (CRELD2) protein has been shown to
interact with neuronal acetylcholine receptors (76). Likewise, no role in tumor
progression has been assigned yet to DDT, a protein with homology to macrophage
migration inhibitory factor (MIF) (75).
3.8D(e). MUC6
Among the few genes near AR-occupied regions that were repressed by DHT
was MUC6. Mucin 6 is a secreted glycoprotein that forms a protective gel layer
around the producing cell (35). Other mucins are aberrantly expressed in cancer (35)
and MUC2 was ascribed a tumor suppression function (103). Conceivably, AR-
mediated MUC6 repression can contribute to PCa progression. Notably, however,
MUC6 mRNA was neither increased in the androgen-ablated compared to the
untreated tumors, nor was it absent in the metastatic samples. Although our in vivo
data do not support a role for MUC6 in PCa progression, MUC6 could still play a
transient role during a short period of time not captured by our clinical samples.
3.8D(f). TRPV3 and GSTT2
As with the androgen-repressed MUC6, evidence for roles of the androgen-
stimulated TRPV3 and GSTT2 genes in PCa progression comes only from our
in vitro data. These two genes are not only androgen stimulated but are also
expressed in C4-2B cells more strongly than in LNCaP cells. TRPV3 is a member of
the transient receptor potential (TRP) family of thermosensory ion channel genes.
Another member of this family, TRPV6, potentiates calcium-dependent cell
proliferation (88), and its expression has been linked to human PCa progression (22).
Glutathione S-transferase theta 2 (GSTT2) belongs to a family of detoxification
enzymes, overexpression of which is thought to provide cells with protection against
oxidative stress and various drugs (30).
3.8D(g). AR occupancy at PCa-linked loci
Two of the AR-occupied regions disclosed by ChIP Display were near loci
previously linked to PCa: (i) the hereditary prostate cancer 1 (HPC1) locus, which
has been mapped to 1q24-25 (111); and (ii) the 8q24 locus, recently linked to PCa
through admixture mapping in African American men (23). Although fine mapping
of specific genetic elements has not been achieved yet for either of these loci, their
contribution to PCa progression in a mechanistic sense may be related to the
observed AR occupancy.
In summary, we describe 19 novel AR-occupied regions in PCa cells, many
of which are associated with genes that are regulated by the AR in either a ligand-
dependent or ligand-independent manner. Furthermore, some of the newly identified
AR target genes are differentially regulated in cell models for, and/or biopsies from,
different stages of PCa progression. These genes provide opportunities for future
research to better understand the role of the AR in PCa and eventually improve
patient care, especially in the context of castrate-resistant disease.
Chapter 4: Identification of novel AR target genes by ChIP-on-chip
4.1 Technical details and usefulness of ChIP-on-chip
The human genome is composed of three billion base pairs with only 1.5% of
our genome coding for proteins (110). The majority of the nonprotein-coding
sequences are thought to contain the instructions for the program of gene expression
in every cell. Embedded within this massive pool of non-coding DNA are sequences
that interact with regulatory proteins to modulate the transcriptional output of a cell.
Only recently, upon the completion of the human genome sequencing project in 2001,
has it become feasible to map a specific transcription factor’s binding sites
across the genome. This comprehensive perspective has become possible by combining
chromatin immunoprecipitation with microarray technology. The methodology is
referred to as ChIP-on-chip or ChIP-chip and is paving the way for large scale,
genome-wide, studies on transcription factor binding as well as histone
modifications, and DNA methylation, just to name a few.
ChIP-on-chip (see Figure 8A for a schematic diagram) begins with
crosslinking of proteins to DNA in living cells. The chromatin is extracted and
fragmented by sonication and chromatin immunoprecipitation is carried out with an
antibody specific for the protein of interest. The immunoprecipitated DNA is
amplified and fluorescently labeled along with control genomic DNA. The two
samples are labeled differentially and co-hybridized to microarray slides containing
probes representing the non-repetitive sequences of the genome. The microarray
data from ChIP-on-chip reflect the relative enrichment of protein-bound
sequences in the immunoprecipitated DNA compared to the control samples.
Binding sites are identified as multiple, clustered positive signals along
the length of DNA (Figure 8B). The probes representing enriched protein-bound
sequences usually form a peak and represent the footprint of the bound protein
(Figure 8B) (47, 110).
Figure 8. Principles of ChIP-on-chip. Chromatin immunoprecipitations are carried
out with an antibody specific for the protein of interest. Input DNA is used as
control. The control DNA and the immunoprecipitates are differentially labeled
using Cy3 and Cy5 and cohybridized onto a microarray containing probes tiling the
genome (A). Transcription factor binding sites are indicated by sets of consecutive
probes that exhibit Cy5/Cy3 ratios significantly above background. These regions are
defined by a peak in the array data and represent the protein’s footprint (B) (47).
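The idea that a binding site appears as a run of consecutive probes with elevated Cy5/Cy3 ratios can be illustrated with a minimal sketch. The fixed log2 ratio threshold, the minimum run length, and the intensities below are all hypothetical choices made for illustration; they are not the statistical procedure applied to the array data in this study, and the sketch assumes that the immunoprecipitate is the Cy5 channel and the control is the Cy3 channel.

```python
import math


def call_peaks(positions, cy5, cy3, min_log2_ratio=1.0, min_probes=4):
    """Return (start, end) positions of runs of consecutive enriched probes."""
    peaks, run = [], []
    for pos, ip_signal, control_signal in zip(positions, cy5, cy3):
        if math.log2(ip_signal / control_signal) >= min_log2_ratio:
            run.append(pos)              # probe is enriched: extend the current run
        else:
            if len(run) >= min_probes:   # run ended: keep it only if long enough
                peaks.append((run[0], run[-1]))
            run = []
    if len(run) >= min_probes:           # handle a run that reaches the last probe
        peaks.append((run[0], run[-1]))
    return peaks


# Ten hypothetical probes; probes 4 to 7 show about 4-fold enrichment
positions = list(range(0, 1000, 100))
cy5 = [100, 110, 95, 400, 450, 500, 420, 105, 98, 102]
cy3 = [100] * 10
print(call_peaks(positions, cy5, cy3))  # [(300, 600)]
```

Requiring several adjacent probes to agree is what distinguishes a footprint-like peak from an isolated noisy probe.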
Using ChIP-on-chip, it is now possible to interrogate all non-repetitive DNA
sequences in the entire genome for all known transcription factors (given a good
antibody is available) during a specific developmental stage, after a specific stimulus
or even at different disease states. This enables a comprehensive analysis of DNA-
protein interaction. Additionally, the close spacing of probes on the array provides
high resolution data. It is also possible to customize arrays to suit specific needs. A
limiting factor is the high cost of performing genome wide analysis, although the
technology is expected to become more and more affordable as more companies
offer ChIP-on-chip services. Additionally, the amount of data ChIP-on-chip
generates can be overwhelming and difficult to manage. Bioinformatics support is
absolutely essential for data analysis and can be key in mining the gold in the data.
4.2 AR ChIP-on-chip
Subsequent to our ChIP Display studies, which resulted in a sampling of AR
binding sites, we became interested in a more comprehensive approach for the
identification of AR targets in prostate cancer. We therefore utilized ChIP-on-chip
to meet our goal. Due to financial constraints, we chose to map AR binding sites
chromosome-wide rather than genome-wide. We utilized the ChIP-on-chip
services offered by Nimblegen Inc., which provides high-density long-oligomer
arrays that enable very high resolution mapping. On these arrays, isothermally
matched 50-mer probes are spaced 50 bp apart. Array #35 was utilized in this study,
which contains 46.6Mb (72.7%) of DNA sequence on chromosome 19 and 48.1Mb
(77.1%) of chromosome 20. The coverage offered by this array represents 3.07% of
the entire genome.
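The coverage figures quoted for Array #35 can be cross-checked with a few lines of arithmetic. This is only a consistency check on the numbers given in the text; the chromosome and genome sizes it prints are implied by those numbers and are not taken from any independent reference.

```python
# Tiled sequence (Mb) and fraction of each chromosome covered, as quoted in the text
tiled = {"chr19": (46.6, 0.727), "chr20": (48.1, 0.771)}
genome_fraction = 0.0307   # stated share of the entire genome on the array

total_tiled_mb = sum(mb for mb, _ in tiled.values())
implied_chr_mb = {name: mb / frac for name, (mb, frac) in tiled.items()}
implied_genome_mb = total_tiled_mb / genome_fraction

print(f"total tiled: {total_tiled_mb:.1f} Mb")
for name, size in implied_chr_mb.items():
    print(f"implied {name} size: {size:.1f} Mb")
print(f"implied genome size: {implied_genome_mb:.0f} Mb")
```

The array therefore tiles roughly 94.7 Mb, and the stated 3.07% corresponds to a genome of a little over 3,000 Mb, in line with the three billion base pairs mentioned in Section 4.1.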
4.3 ChIP-on-chip discloses 62 novel AR binding sites on chromosomes 19 and 20 in C4-2B PCa cells
In our search for novel AR target genes, we applied ChIP-on-chip to identify
AR-bound loci in living C4-2B cells. DHT-stimulated (4 hrs) C4-2B cells were
subjected to chromatin immunoprecipitation with an AR-specific antibody to pull
down genomic loci occupied by the activated AR. Three independent AR chromatin
immunoprecipitates along with input DNA were sent to Nimblegen for labeling and
hybridization onto microarrays representing chromosomes 19 and 20, tiled with
50-mer probes spaced 50 bp apart. Bioinformatic analysis was applied to the array data from
all three experiments to identify probes that were significantly enriched in the AR
immunoprecipitates as compared to the input DNA. A false discovery rate (FDR) of
0.5 % was set as the cutoff to identify strong AR bound loci. Using this stringent
criterion, and requiring a locus to be enriched in 3 of 3 independent experiments, we
68
identified a total of 62 AR occupied sites (Figure 9 and Table 2). However, there
were a number of loci, not part of our 62, which were significant in at least one
experiment (see Table 3 for 1/3 ARORS and Table 4 for 2/3 ARORs). Fourteen of
our high confidence AR-occupied regions were found on chromosome 19 (regions
(R) 1-14 in Table 1) and 48 on chromosome 20 (R15-62 in Table 1). Based on these
numbers it would be predicted that, genome-wide, there are a little over 2000 AR
binding sites, with one site every 1.5Mb on average.
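
The genome-wide estimate follows from simple proportional scaling of the numbers
given above. The short sketch below reproduces that arithmetic; it assumes that AR
binding sites are distributed across the rest of the genome at the same density as
on the arrayed portions of chromosomes 19 and 20, which is the assumption implicit
in the extrapolation. The function and variable names are illustrative only.

```python
# Figures taken from the text; the uniform-density scaling is the assumption.
COVERED_MB = 46.6 + 48.1      # Mb of chromosomes 19 and 20 represented on the array
GENOME_FRACTION = 0.0307      # array coverage as a fraction of the whole genome
SITES_FOUND = 62              # high-confidence AR-occupied regions (3 of 3 experiments)


def extrapolate_sites(sites: int, fraction: float) -> float:
    """Scale an observed site count to the whole genome, assuming uniform density."""
    return sites / fraction


def mean_spacing_mb(covered_mb: float, sites: int) -> float:
    """Average distance between sites (in Mb) over the covered sequence."""
    return covered_mb / sites


genome_wide_sites = extrapolate_sites(SITES_FOUND, GENOME_FRACTION)
spacing = mean_spacing_mb(COVERED_MB, SITES_FOUND)
print(f"about {genome_wide_sites:.0f} sites genome-wide, one every {spacing:.2f} Mb")
```

Because only the 3-of-3 regions are counted, this figure is best read as an estimate
for strongly reproducible sites under the chosen FDR cutoff.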
Representative data from 10 of the 62 regions are shown in Figure 10A, where
rises in consecutive probes form peaks that define the AR-occupied regions. The
high reproducibility of ChIP-on-chip is evident from the similarity of the data
from all three experiments, indicated by different colors. Chromosome 19 afforded
us a built-in positive control, the PSA locus, whose upstream enhancer was also
confirmed to be occupied by the AR (data not shown). To validate AR occupancy at
several randomly chosen loci, including some regions shown in Figure 10A, we
performed conventional ChIP assays with locus-specific primers and real-time
quantitative PCR (RT-qPCR). We also included non-AR-occupied loci (NC) as negative
controls, and the PSA enhancer served as a positive control. C4-2B cells were
treated with vehicle or DHT for 4 hrs, followed by immunoprecipitation with either
AR or IgG antibodies. As shown in Figure 10B (data courtesy of Dr. Li Jia), AR
occupancy at all 18 regions we tested was confirmed, as demonstrated by the
enrichment of these loci in AR ChIPs compared to IgG ChIPs in the DHT-treated
samples. Interestingly, about 50% of the
Figure 9. ChIP-on-chip discloses 62 novel AR binding sites. C4-2B PCa cells
were stimulated with DHT for 4 hrs to activate the AR, after which chromatin
immunoprecipitations were carried out with anti-AR antibodies. Cy3-labeled
genomic DNA and Cy5-labeled AR-immunoprecipitated DNA were co-hybridized
onto a Nimblegen-manufactured array containing 50-mer probes spaced 50 bp apart,
representing 46.6 Mb (72.7%) of chromosome 19 and 48.1 Mb (77.1%) of chromosome
20. Red lines denote each of the 62 AR-occupied regions discovered in the present
study. Fourteen were found on chromosome 19 and 48 on chromosome 20. Gray
areas denote chromosomal regions not represented on the array.
Table 2: AR targets identified by ChIP-on-chip
| Region | Center of Peak (1) | Nearby Genes (2) | Position of AR Binding Site Relative to Gene | Fold Change (DHT / EtOH) | Luciferase F | Luciferase R |
| 1 | 18670307 | CRTC1 | intron 2 | 0.99 | 64.9 | 24.44 |
|  |  | KLH26 | 28-kb 5' | 1.11 |  |  |
|  |  | COMP | 84-kb 3' | 0.58 |  |  |
| 2 | 33472497 | Gene desert |  |  | 2.93 | 2.88 |
| 3 | 35625325 | ZNF536 | intron 1 | N/E | 0.84 | 1.65 |
| 4 | 36296286 | Q9NT59 | 36-kb 5' | N/E | 1.35 | 0.75 |
| 5 | 38208445 | RHPN2 | intron 3 | 1 | 0.95 | 0.59 |
|  |  | C19ORF40 | 48-kb 3' | 0.65 |  |  |
|  |  | GPATC1 | 55-kb 5' | 0.72 |  |  |
| 6 | 39493284 | KIAA0355 | intron 2 | 1.27 | 4.13 | 8.95 |
|  |  | GPATC1 | 54-kb 5' | 0.72 |  |  |
|  |  | LSM14A | 81-kb 3' | 0.82 |  |  |
| 7 | 39640721 | UBA2 | intron 12 | 0.92 | 2.24 | 2.17 |
|  |  | WTIP | 24-kb 5' | 2.58 |  |  |
|  |  | DCD2L | 32-kb 3' | 1.1 |  |  |
| 8 | 39655161 | UBA2 | 2.5-kb 3' | 0.92 | 1.07 | 1.75 |
|  |  | WTIP | 10-kb 5' | 2.58 |  |  |
| 9 | 43253280 | SIPA1L3 | intron 2 | 0.98 | 1.45 | 2.37 |
| 10 | 45447198 | AKT2 | intron 4 | 0.85 | 1.7 | 0.75 |
|  |  | CNTD2 | 23-kb 5' | N/E |  |  |
|  |  | Q6UXV6 | 24-kb 3' | 0.64 |  |  |
| 11 | 46119127 | CYP2A6 | intron 1 | N/E | 4.21 | 22.51 |
|  |  | CYP2A7P1 | intron 1 | N/E |  |  |
|  |  | Q6ZSU1 | 21-kb 3' | 0.58 |  |  |
| 12 | 50938325 | SIX5 | 22-kb 3' | 0.42 | 13.6 | 15.06 |
| 13 | 60879769 | EPN1 | intron 1 | 0.96 | 1.3 | 12.09 |
|  |  | U2AF2 | 1.9-kb 3' | 1.2 |  |  |
|  |  | NALP9 | 39-kb 3' | N/E |  |  |
| 14 | 61558873 | ZSCAN5 | intron 2 | 1.48 | 0.84 | 4.83 |
|  |  | ZNF542 | 22-kb 5' | N/E |  |  |
| 15 | 820094 | ANGPT4 | exon 1 / intron 1 | N/E | 2 | 2.58 |
|  |  | C20ORF55 | 45-kb 3' | 0.65 |  |  |
|  |  | RSPO4 | 67-kb 3' | N/E |  |  |
| 16 | 1187516 | RAD21L1 | intron 4 | N/E | -- (3) | -- (3) |
|  |  | Q9H410 | 25-kb 3' | N/E |  |  |
|  |  | SDCB2 | 51-kb 3' | N/E |  |  |
| 17 | 4532559 | PRNP | 82-kb 5' | 0.91 | 0.96 | 1.4 |
| 18 | 5028803 | C20ORF30 | exon 4 | 0.95 | 4.87 | 5.7 |
|  |  | PCNA | 15-kb 3' | 0.93 |  |  |
|  |  | SLC23A2 | 90-kb 5' | 0.96 |  |  |
| 19 | 5694840 | C20ORF196 | intron 1 | 0.98 | 11.01 | 7.87 |
| 20 | 8023222 | PLCB1 | 38-kb 5' | 0.99 | 3.53 | 3.29 |
|  |  | TXNDC13 | 75-kb 5' | 1.01 |  |  |
| 21 | 8148258 | PLCB1 | intron 2 | 0.99 | 15.8 | 17.31 |
| 22 | 9235287 | PLCB4 | intron 4 | 0.84 | 6.88 | 3.36 |
| 23 | 9263828 | PLCB4 | intron 3 | 0.84 | 5.76 | 27.42 |
| 24 | 10587245 | JAG1 | intron 3 | 2.2 | 2.93 | 1.94 |
|  |  | C20ORF94 | 35-kb 3' | 0.63 |  |  |
| 25 | 10790472 | Gene desert |  |  | 21 | 4.2 |
| 26 | 11086310 | Gene desert |  |  | 1.57 | 3.2 |
| 27 | 11959323 | Gene desert |  |  | 1.45 | 15.77 |
| 28 | 12875527 | SPTLC2L | 64-kb 5' | 0.24 | 31.2 | 4.87 |
| 29 | 14137745 | C20ORF133 | intron 3 | 0.71 | 0.91 | 1.21 |
| 30 | 15348646 | C20ORF133 | intron 6 | 0.71 | 14.5 | 18.01 |
| 31 | 15791051 | C20ORF133 | intron 8 | 0.71 | 0.34 | 0.33 |
|  |  | Q6ZUK9 | 31-kb 3' | N/E |  |  |
| 32 | 19893378 | RIN2 | intron 5 | 1.23 | 7.29 | 10.11 |
|  |  | NAT5 | 53-kb 5' | 0.94 |  |  |
| 33 | 19895901 | RIN2 | intron 6 | 1.23 | 1.73 | 1.67 |
|  |  | NAT5 | 50-kb 5' | 0.94 |  |  |
| 34 | 20506254 | C20orf74 | intron 14 | 0.96 | 6.66 | 2.53 |
|  |  | Q4VXU4 | 82-kb 5' | N/E |  |  |
| 35 | 20672274 | QXUXV4 | 84-kb 3' | N/E | 2.1 | 1.71 |
| 36 | 20688293 | C20ORF74 | 47-kb 5' | 0.96 | 1.53 | 1.54 |
| 37 | 20823022 | Gene desert |  |  | 1.72 | 1.24 |
| 38 | 21918387 | Gene desert |  |  | 1.27 | 1.06 |
| 39 | 22224882 | Gene desert |  |  | 1.54 | 3.1 |
| 40 | 23564120 | CST3 | intron 2 | 1.07 | 1.55 | 0.97 |
|  |  | CST9 | 30-kb 5' | N/E |  |  |
|  |  | CST4 | 50-kb 3' | N/E |  |  |
| 41 | 25208869 | PYGB | intron 9 | 1.47 | 1.43 | 0.65 |
|  |  | ABHD12 | 14-kb 3' | -- |  |  |
|  |  | ENTPD6 | 53-kb 3' | 1.2 |  |  |
| 42 | 29731853 | BCL2L1 | intron 2 | 1.54 | 0.82 | 0.68 |
|  |  | COX412 | 35-kb 3' | N/E |  |  |
|  |  | TPX2 | 59-kb 5' | 0.79 |  |  |
| 43 | 31243201 | RP11/LATH | 1.9-kb 5' | N/E | 23.2 | 21.53 |
|  |  | C20ORF70 | 10-kb 3' | N/E |  |  |
| 44 | 31663541 | CBFA2T2 | intron 4 | 0.56 | 58.7 | 6.33 |
|  |  | APBA2BP | 45-kb 3' | N/E |  |  |
| 45 | 31878408 | CHMP4B | intron 1 | 0.89 | 2.18 | 1.5 |
|  |  | ZNF341 | 35-kb 3' | 0.51 |  |  |
| 46 | 33259664 | MMP24 | 18-kb 5' | 0.42 | 1.7 | 1.17 |
|  |  | PROCR | 31-kb 3' | 0.83 |  |  |
| 47 | 34145659 | EPB41L1 | 2.8-kb 5' | 0.82 | 1.34 | 2.57 |
|  |  | C20ORF152 | 63-kb 3' | 0.65 |  |  |
| 48 | 35821838 | CTNNLB1 | intron 5 | 0.8 | 1.31 | 1.57 |
| 49 | 36052084 | KIAA0406 | intron 8 | N/E | 3.03 | 1.87 |
|  |  | C20ORF77 | 43-kb 5' | 0.99 |  |  |
|  |  | C20ORF102 | 45-kb 3' | N/E |  |  |
| 50 | 36171508 | C20ORF77 | 17-kb 3' | 0.99 | 2.48 | 7.96 |
|  |  | TGM2 | 19-kb 3' | 7.98 |  |  |
| 51 | 36667128 | Q5TG32 | 17-kb 3' | N/E | 1.52 | 0.66 |
|  |  | Q5TG30 | 30-kb 5' | N/E |  |  |
| 52 | 37062429 | DHX35 | intron 8 | 0.74 | 1.21 | 1.38 |
|  |  | FAM83D | 47-kb 3' | 0.85 |  |  |
| 53 | 37923752 | Gene desert |  |  | 1.11 | 0.68 |
| 54 | 38531504 | Gene desert |  |  | 24.7 | 70.98 |
| 55 | 38536345 | Gene desert |  |  | 0.78 | 0.88 |
| 56 | 39310859 | ZHX3 | intron 2 | 0.62 | 0.99 | 0.74 |
|  |  | PLCG1 | 73-kb 3' | 0.66 |  |  |
|  |  | LPIN3 | 92-kb 5' | 0.51 |  |  |
| 57 | 41912169 | C20ORF100 | 65-kb 5' |  | 79.2 | 7.33 |
| 58 | 43107637 | STK4 | intron 10 | 0.89 | 40.1 | 1.96 |
|  |  | KCNSI | 47-kb 3' | 4.74 |  |  |
|  |  | TOMM34 | 85-kb 5' | 0.7 |  |  |
| 59 | 43527534 | WFDC2 | 4-kb 5' | N/E | 1.22 | 1.43 |
|  |  | PIGT | 39-kb 3' | 0.71 |  |  |
| 60 | 44512835 | ZNF663 | intron 3 | 0.67 | 1.55 | 4.28 |
|  |  | Q6KAL7 | 34-kb 5' | N/E |  |  |
|  |  | ELMO2 | 44-kb 5' | 1.58 |  |  |
| 61 | 45071372 | EYA2 | intron 4 | N/E | 0.92 | 1.12 |
| 62 | 45704721 | NCOA3 | exon 16 / intron 16 | N/E | 13.7 | 3.25 |
|  |  | SULF2 | 14-kb 3' | 0.89 |  |  |
(1) The position of the centermost probe defining the AR binding site on the genome.
(2) Genes up to 100 kb away from a given AR binding site.
(3) This AR binding site was not clonable into the Tk-2+ Luc vector.
N/E = This gene was found not to be expressed in C4-2B cells.
F = forward, R = reverse
Table 3: Low confidence ARORs reproduced in 1/3 experiments
Location Start End Location Start End
chr19 17656390 17656739 chr19 33971962 33972311
chr19 18351874 18352323 chr19 33974454 33974903
chr19 18478645 18479394 chr19 34020928 34021677
chr19 18669547 18671116 chr19 34068938 34069387
chr19 18713145 18713994 chr19 34352006 34352355
chr19 19133133 19133482 chr19 34389563 34390212
chr19 19352917 19353366 chr19 34691472 34691821
chr19 19699345 19699694 chr19 34861435 34861984
chr19 19806573 19806922 chr19 34948701 34949150
chr19 19922177 19922526 chr19 35127241 35127790
chr19 20231563 20232012 chr19 35128451 35129447
chr19 20286910 20287459 chr19 35152259 35152708
chr19 20580374 20580823 chr19 35154104 35155053
chr19 21077047 21077696 chr19 35168031 35168780
chr19 21195182 21195831 chr19 35192128 35192877
chr19 21674779 21675128 chr19 35524578 35525027
chr19 22476379 22476928 chr19 35624825 35626074
chr19 22632316 22632665 chr19 35674595 35675044
chr19 23307192 23307541 chr19 35715178 35715527
chr19 23336249 23336898 chr19 35793805 35794154
chr19 23350598 23350947 chr19 35801964 35802313
chr19 23602517 23602966 chr19 35804620 35804969
chr19 24083960 24084309 chr19 35916609 35916958
chr19 33085032 33085581 chr19 35971526 35972275
chr19 33118594 33118943 chr19 36059700 36060049
chr19 33123467 33124216 chr19 36076042 36076391
chr19 33134594 33135286 chr19 36090539 36091249
chr19 33171922 33172371 chr19 36212951 36213300
chr19 33192971 33193420 chr19 36255371 36255820
chr19 33205315 33205664 chr19 36271788 36272337
chr19 33232410 33232759 chr19 36295886 36296635
chr19 33299662 33300311 chr19 36316977 36317326
chr19 33303333 33304082 chr19 36372882 36373231
chr19 33318465 33318814 chr19 36445275 36445724
chr19 33382059 33382508 chr19 36455036 36455585
chr19 33472147 33472896 chr19 36475900 36476449
chr19 33643280 33643629 chr19 36543934 36544383
chr19 33677124 33677473 chr19 36629924 36630273
chr19 33684820 33685369 chr19 36755575 36756424
chr19 33732802 33733251 chr19 36857393 36858242
chr19 33739567 33740016 chr19 36895321 36895770
chr19 36901062 36901511 chr19 42033806 42034155
chr19 36902162 36902811 chr19 42160600 42161049
chr19 36910226 36910675 chr19 42195601 42195950
chr19 37069631 37070280 chr19 42336950 42337499
chr19 37189933 37190282 chr19 42569471 42570020
chr19 37216379 37217028 chr19 42680103 42680552
chr19 37296872 37297274 chr19 42745123 42745572
chr19 37452445 37453228 chr19 42782444 42782893
chr19 37501359 37501708 chr19 43004883 43005532
chr19 37503038 37503387 chr19 43006283 43006832
chr19 37532493 37533142 chr19 43013630 43013979
chr19 37565499 37566048 chr19 43047490 43047939
chr19 37682073 37682522 chr19 43069299 43070048
chr19 37719417 37719966 chr19 43071299 43071953
chr19 38116239 38116588 chr19 43240486 43240835
chr19 38122941 38123290 chr19 43253030 43253579
chr19 38208245 38208694 chr19 43481437 43481886
chr19 38215415 38215864 chr19 43558061 43558410
chr19 38359610 38360029 chr19 43611862 43612311
chr19 38475820 38476469 chr19 43672604 43673053
chr19 38997948 38998397 chr19 43722972 43723521
chr19 39149123 39149472 chr19 43830695 43831244
chr19 39401049 39401698 chr19 43837069 43837518
chr19 39405019 39405468 chr19 44240226 44240675
chr19 39480316 39480865 chr19 44308544 44308893
chr19 39492984 39493633 chr19 44584577 44585526
chr19 39509942 39510391 chr19 44771502 44771878
chr19 39640521 39640970 chr19 44781483 44781932
chr19 39644138 39644587 chr19 44792159 44792808
chr19 39654911 39655460 chr19 44940919 44941268
chr19 39724304 39724853 chr19 45205975 45206524
chr19 39868066 39868515 chr19 45252282 45252666
chr19 39956817 39957166 chr19 45423619 45424168
chr19 40094410 40094859 chr19 45427983 45428432
chr19 40622540 40622989 chr19 45447048 45447397
chr19 40896809 40897258 chr19 45831931 45832480
chr19 41126095 41126444 chr19 46118727 46119776
chr19 41397812 41398261 chr19 46191406 46191955
chr19 41398412 41398761 chr19 47121364 47121713
chr19 41587950 41588499 chr19 47125080 47125429
chr19 41990173 41990594 chr19 47273437 47273786
chr19 48255793 48256142 chr19 55126891 55127375
chr19 48365604 48366253 chr19 55689782 55690131
chr19 48474158 48474607 chr19 55833557 55833906
chr19 48550588 48551037 chr19 55895096 55895445
chr19 48575754 48576103 chr19 56012353 56012702
chr19 48890297 48891116 chr19 56046139 56046488
chr19 48977179 48977828 chr19 56084261 56084710
chr19 49360940 49361600 chr19 56184845 56185494
chr19 49431751 49432300 chr19 56261196 56261845
chr19 49667864 49668313 chr19 56806344 56806693
chr19 50105336 50105685 chr19 56839711 56840060
chr19 50234581 50235030 chr19 56889133 56889682
chr19 50288685 50289034 chr19 57082441 57082890
chr19 50289847 50290196 chr19 57136597 57137046
chr19 50499952 50500401 chr19 57146913 57147262
chr19 50673538 50674135 chr19 57184022 57184471
chr19 50938148 50938550 chr19 57314667 57315016
chr19 51603338 51603687 chr19 57330206 57330655
chr19 52059179 52060028 chr19 57572038 57572387
chr19 52060479 52060928 chr19 57842647 57843196
chr19 52097268 52098017 chr19 58007853 58008302
chr19 52108253 52108702 chr19 58051547 58051896
chr19 52108857 52109406 chr19 58136501 58136850
chr19 52117099 52117648 chr19 58231301 58231650
chr19 52130470 52130819 chr19 58305631 58306080
chr19 52161644 52162193 chr19 58467824 58468273
chr19 52171628 52171977 chr19 58664493 58664842
chr19 52291753 52292202 chr19 58967875 58968424
chr19 52329360 52329909 chr19 59488133 59488682
chr19 52343308 52343657 chr19 59733905 59734354
chr19 52347494 52348043 chr19 59769790 59770439
chr19 52479338 52479687 chr19 59793160 59793509
chr19 52704939 52705488 chr19 59805100 59805949
chr19 52859057 52859406 chr19 59883468 59883817
chr19 52976927 52977476 chr19 59946120 59946525
chr19 53225122 53225571 chr19 59962754 59963161
chr19 53256582 53257033 chr19 60006501 60006850
chr19 53456330 53456879 chr19 60041609 60041958
chr19 54355682 54356231 chr19 60170683 60171032
chr19 54445718 54446067 chr19 60859342 60859691
chr19 54485158 54485507 chr19 60879409 60880178
chr19 60912999 60913348 chr20 630226 630675
chr19 60927231 60927680 chr20 819776 820461
chr19 60935697 60936146 chr20 1076086 1076435
chr19 61023461 61023810 chr20 1187171 1187910
chr19 61074053 61074602 chr20 1685374 1686123
chr19 61137804 61138453 chr20 2035046 2035676
chr19 61386414 61386763 chr20 2069777 2070126
chr19 61419132 61419681 chr20 2165896 2166345
chr19 61490606 61490955 chr20 2510712 2511061
chr19 61554901 61555250 chr20 2542211 2542560
chr19 61558523 61559272 chr20 3302092 3302816
chr19 61598127 61598768 chr20 3362400 3362849
chr19 61610508 61610957 chr20 3534573 3535122
chr19 61707966 61708315 chr20 3650566 3650915
chr19 61713405 61714154 chr20 3703808 3704315
chr19 61820867 61821216 chr20 3856309 3856758
chr19 61878530 61878879 chr20 4210384 4210933
chr19 61890285 61890634 chr20 4381486 4381935
chr19 62049593 62049942 chr20 4506222 4506871
chr19 62148820 62149169 chr20 4532309 4532858
chr19 62221243 62221692 chr20 4563251 4563600
chr19 62291764 62292113 chr20 4895503 4895852
chr19 62331726 62332375 chr20 5028393 5029112
chr19 62397011 62397360 chr20 5159053 5159502
chr19 62414650 62414999 chr20 5472356 5472705
chr19 62427663 62428312 chr20 5520506 5520855
chr19 62463427 62463976 chr20 5571126 5571475
chr19 62692220 62692569 chr20 5694565 5695114
chr19 62743083 62743532 chr20 6229806 6230155
chr19 62749801 62750150 chr20 6392585 6393034
chr19 62887331 62887780 chr20 6413872 6414421
chr19 63043293 63043742 chr20 6426986 6427335
chr19 63076894 63077243 chr20 6642241 6642590
chr19 63791855 63792827 chr20 6730858 6731407
chr19 63795783 63796632 chr20 6787547 6787896
chr20 46720 47069 chr20 6997703 6998452
chr20 85725 86474 chr20 7052647 7053196
chr20 186129 186578 chr20 7163459 7164008
chr20 374140 374889 chr20 7297244 7297593
chr20 408142 408689 chr20 7567349 7567798
chr20 625470 625819 chr20 7703830 7704179
chr20 7916589 7917338 chr20 10921162 10921711
chr20 7931023 7931372 chr20 11051363 11051812
chr20 7995357 7996021 chr20 11085885 11087134
chr20 8021313 8021662 chr20 11107422 11107971
chr20 8022897 8023746 chr20 11174252 11175201
chr20 8143246 8144295 chr20 11312634 11312983
chr20 8147983 8148532 chr20 11835412 11836461
chr20 8175298 8175647 chr20 11902860 11903309
chr20 8178341 8178690 chr20 11953659 11954208
chr20 8352642 8353191 chr20 11958998 11959647
chr20 8376449 8376898 chr20 11969382 11969731
chr20 8385739 8386188 chr20 12015314 12015663
chr20 8435012 8435361 chr20 12030959 12031308
chr20 8730838 8731187 chr20 12107346 12107695
chr20 8744230 8744779 chr20 12193164 12193713
chr20 9004279 9004812 chr20 12234878 12235427
chr20 9126753 9127502 chr20 12317429 12318078
chr20 9228615 9229064 chr20 12600252 12601001
chr20 9235062 9235511 chr20 12652934 12653483
chr20 9236607 9237292 chr20 12839383 12839832
chr20 9263603 9264969 chr20 12865554 12866003
chr20 9329181 9329630 chr20 12875352 12875701
chr20 9371413 9371862 chr20 12910554 12911003
chr20 9618592 9618941 chr20 12946880 12947429
chr20 9894964 9895413 chr20 12963681 12964030
chr20 10044100 10044549 chr20 13352244 13353093
chr20 10193747 10194296 chr20 13446643 13447392
chr20 10217651 10218100 chr20 13455499 13456048
chr20 10295735 10296184 chr20 13550982 13551431
chr20 10330592 10331141 chr20 13582038 13582587
chr20 10433199 10434048 chr20 13585270 13585719
chr20 10434490 10435039 chr20 13667594 13668543
chr20 10501830 10502779 chr20 13796637 13797086
chr20 10568079 10568428 chr20 13906032 13906381
chr20 10586970 10587519 chr20 14022133 14022482
chr20 10655672 10656347 chr20 14126868 14127317
chr20 10779152 10779601 chr20 14137170 14138119
chr20 10785126 10785675 chr20 14139212 14139761
chr20 10789892 10790946 chr20 14193379 14193928
chr20 10890371 10891120 chr20 14242185 14242834
chr20 10898730 10899579 chr20 14312957 14314606
chr20 14381754 14382103 chr20 19058624 19059073
chr20 14509188 14509837 chr20 19146116 19146465
chr20 14532487 14533036 chr20 19153809 19154158
chr20 14700478 14700827 chr20 19376333 19376882
chr20 14702877 14703326 chr20 19453098 19453896
chr20 14704376 14705153 chr20 19456480 19456829
chr20 14706107 14706556 chr20 19634983 19635632
chr20 14718622 14719171 chr20 19760120 19760569
chr20 14726836 14727185 chr20 19893053 19893802
chr20 15048537 15049110 chr20 19895467 19896334
chr20 15225703 15226152 chr20 19939387 19939836
chr20 15348221 15349170 chr20 19964988 19966037
chr20 15368871 15369520 chr20 20014385 20014734
chr20 15408918 15409467 chr20 20427040 20427589
chr20 15425131 15425680 chr20 20500082 20500731
chr20 15435424 15435773 chr20 20505679 20506828
chr20 15662319 15662968 chr20 20509479 20509828
chr20 15731942 15732391 chr20 20538883 20539332
chr20 15790426 15791475 chr20 20584670 20585319
chr20 15799939 15800726 chr20 20586770 20587219
chr20 15863846 15864295 chr20 20603703 20604352
chr20 15882754 15883103 chr20 20609820 20610169
chr20 15916445 15916994 chr20 20671911 20672837
chr20 15929916 15930265 chr20 20681310 20681859
chr20 16217099 16217494 chr20 20687968 20688617
chr20 16491826 16492275 chr20 20727735 20728084
chr20 16668812 16670061 chr20 20800505 20800954
chr20 16675752 16676386 chr20 20822783 20823660
chr20 16786870 16787219 chr20 20941110 20941959
chr20 16956673 16957122 chr20 21017061 21018910
chr20 17000345 17000694 chr20 21039048 21039397
chr20 17174943 17175292 chr20 21093753 21094202
chr20 17209164 17209513 chr20 21419896 21420345
chr20 17319063 17319412 chr20 21424264 21424613
chr20 17334425 17335074 chr20 21716519 21716968
chr20 17499430 17499979 chr20 21875477 21875826
chr20 17904222 17904671 chr20 21918062 21918811
chr20 17981209 17981858 chr20 21923111 21923624
chr20 18564469 18565218 chr20 21982378 21982727
chr20 18739975 18740324 chr20 21987274 21987823
chr20 19020845 19021194 chr20 21992850 21993889
chr20 22038701 22039150 chr20 26048774 26049223
chr20 22092202 22093151 chr20 26053804 26054153
chr20 22151790 22152339 chr20 26069816 26070265
chr20 22224407 22225356 chr20 26074573 26075122
chr20 22311843 22312192 chr20 26079008 26079390
chr20 22314525 22314974 chr20 26115514 26115963
chr20 22380892 22381941 chr20 26144925 26145274
chr20 22487244 22487593 chr20 26152136 26152685
chr20 22488246 22488795 chr20 26158302 26158751
chr20 22516187 22516536 chr20 26186557 26186906
chr20 22627552 22628001 chr20 28034211 28034560
chr20 22673056 22674105 chr20 28038191 28039240
chr20 22739690 22740039 chr20 28059883 28060432
chr20 22774445 22774894 chr20 28062993 28063542
chr20 22993585 22994334 chr20 28067939 28068688
chr20 23020980 23021329 chr20 28084281 28084630
chr20 23130509 23130858 chr20 28091368 28091917
chr20 23468415 23468764 chr20 28219815 28220764
chr20 23563845 23564494 chr20 28223565 28224116
chr20 23702186 23702635 chr20 28227828 28228520
chr20 23945100 23945649 chr20 28237003 28238074
chr20 24094249 24094598 chr20 28244406 28244855
chr20 24189312 24189661 chr20 28246656 28247705
chr20 24355056 24355505 chr20 28251807 28252156
chr20 24549564 24550213 chr20 29313069 29313418
chr20 24658466 24658915 chr20 29731678 29732127
chr20 24918795 24919844 chr20 29763946 29764695
chr20 24968338 24969187 chr20 29821888 29822437
chr20 24976423 24976772 chr20 29835980 29836504
chr20 25118719 25119168 chr20 29994537 29994886
chr20 25201931 25202380 chr20 30345547 30345896
chr20 25206276 25206725 chr20 30377431 30377880
chr20 25208394 25209343 chr20 30803718 30804067
chr20 25282646 25283295 chr20 30855639 30856088
chr20 25823424 25823873 chr20 31242676 31243725
chr20 25844881 25845430 chr20 31256432 31256781
chr20 25847892 25848452 chr20 31425357 31425858
chr20 25852910 25853559 chr20 31584217 31584566
chr20 25858506 25858955 chr20 31590190 31591132
chr20 25869373 25869922 chr20 31663266 31663815
chr20 25873655 25874004 chr20 31696346 31697195
chr20 31783803 31784452 chr20 36940137 36940586
chr20 31878083 31878632 chr20 37062104 37062753
chr20 32078854 32079303 chr20 37237899 37238248
chr20 32231395 32232002 chr20 37248188 37248837
chr20 32265360 32266609 chr20 37281631 37282811
chr20 32321740 32322268 chr20 37320979 37321428
chr20 32489573 32490022 chr20 37326436 37326885
chr20 32497610 32498159 chr20 37454859 37455308
chr20 33201573 33201922 chr20 37525502 37525951
chr20 33259294 33260034 chr20 37551109 37551658
chr20 33353628 33354177 chr20 37766328 37766977
chr20 33370487 33370936 chr20 37768424 37768973
chr20 33425733 33426082 chr20 37836067 37836416
chr20 33700316 33700765 chr20 37923427 37924076
chr20 33791726 33792175 chr20 38018155 38018504
chr20 34000098 34000447 chr20 38066555 38067817
chr20 34145184 34146133 chr20 38081624 38081973
chr20 34315291 34315640 chr20 38181050 38181399
chr20 34685974 34686761 chr20 38212546 38213095
chr20 35199473 35199822 chr20 38252487 38252836
chr20 35225418 35225767 chr20 38308120 38308469
chr20 35234619 35234968 chr20 38381807 38382256
chr20 35530695 35531170 chr20 38421483 38422232
chr20 35738119 35738601 chr20 38469228 38469577
chr20 35771672 35772391 chr20 38477074 38477523
chr20 35774351 35774900 chr20 38499837 38500186
chr20 35785342 35786122 chr20 38531029 38532131
chr20 35791051 35792100 chr20 38535670 38536619
chr20 35807610 35808059 chr20 38723173 38723522
chr20 35821007 35822312 chr20 38748246 38748795
chr20 35860225 35860674 chr20 38954777 38955226
chr20 35901151 35901700 chr20 39148778 39149727
chr20 35907745 35908964 chr20 39310384 39311333
chr20 35914004 35914453 chr20 39330344 39330904
chr20 36051213 36052955 chr20 39369185 39369534
chr20 36073949 36074298 chr20 39536729 39537178
chr20 36127528 36128077 chr20 39555690 39556339
chr20 36170828 36172187 chr20 39559274 39559623
chr20 36305860 36306259 chr20 39570452 39570901
chr20 36617236 36617685 chr20 39613253 39613702
chr20 36666815 36667440 chr20 39613839 39614388
chr20 39676517 39677066 chr20 43527309 43527983
chr20 39712260 39712909 chr20 43577457 43577906
chr20 39997121 39997470 chr20 43612296 43612745
chr20 40226501 40227350 chr20 43841164 43841513
chr20 40246990 40247339 chr20 44141378 44141827
chr20 40316711 40317160 chr20 44150550 44150899
chr20 40495666 40496315 chr20 44152514 44152863
chr20 40540290 40541422 chr20 44176571 44176920
chr20 40557240 40557589 chr20 44294518 44295567
chr20 40600298 40600647 chr20 44512660 44513109
chr20 40638891 40639440 chr20 44629499 44629948
chr20 40644656 40645005 chr20 44959234 44959583
chr20 40675542 40675991 chr20 45038929 45039374
chr20 40764025 40764674 chr20 45071133 45071611
chr20 40994846 40995195 chr20 45273314 45273663
chr20 41148179 41148728 chr20 45334526 45334875
chr20 41197623 41198172 chr20 45345031 45345480
chr20 41450411 41450960 chr20 45380409 45381458
chr20 41685767 41686116 chr20 45413204 45413753
chr20 41698780 41699129 chr20 45417104 45417653
chr20 41827775 41828124 chr20 45602841 45603290
chr20 41911859 41912479 chr20 45630153 45630502
chr20 41941752 41942401 chr20 45643479 45643928
chr20 42166577 42166926 chr20 45661817 45662166
chr20 42258317 42258666 chr20 45664051 45664700
chr20 42407141 42407490 chr20 45704446 45704995
chr20 42594516 42594865 chr20 45867439 45867888
chr20 42775841 42776290 chr20 45869395 45869744
chr20 42841857 42842206 chr20 45892485 45892834
chr20 42913525 42914074 chr20 45986879 45987228
chr20 42970841 42971290 chr20 46053868 46054317
chr20 43042152 43042701 chr20 46093608 46093957
chr20 43064202 43064846 chr20 46493258 46493907
chr20 43077928 43078777 chr20 46714883 46715232
chr20 43087138 43087487 chr20 47045372 47045921
chr20 43107236 43107811 chr20 47190748 47191197
chr20 43255081 43255630 chr20 47217180 47217729
chr20 43269554 43270003 chr20 47468203 47468552
chr20 43272800 43273249 chr20 47633544 47633893
chr20 43284501 43285150 chr20 47691380 47691975
chr20 43313624 43314273 chr20 47900431 47900880
A list of low confidence ARORs (reproduced in 1/3 ChIP experiments) is presented with the
chromosomal location and start and end positions of the AROR as defined by the moving average
methodology (see materials and methods). Chr. = chromosome
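
The caption above refers to a moving average methodology for defining AROR
boundaries. As a minimal sketch of the general idea only, the code below smooths
per-probe enrichment values with a centered moving average and reports runs of
consecutive probes whose smoothed value stays at or above a threshold. The window
size, threshold, probe spacing, and all function names are assumptions made for
illustration and are not taken from the analysis actually used in this study.

```python
from typing import List, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Centered moving average; the window is truncated at the array edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def call_regions(
    positions: List[int],      # probe start coordinates, sorted ascending
    ratios: List[float],       # per-probe enrichment (e.g. log2 ChIP / input)
    window: int = 3,           # hypothetical smoothing window, in probes
    threshold: float = 1.0,    # hypothetical cutoff on the smoothed value
    probe_len: int = 50,       # 50-mer probes, as on the array described here
) -> List[Tuple[int, int]]:
    """Return (start, end) for each run of probes at or above the threshold."""
    smoothed = moving_average(ratios, window)
    regions: List[Tuple[int, int]] = []
    start = last = None
    for pos, value in zip(positions, smoothed):
        if value >= threshold:
            if start is None:
                start = pos
            last = pos
        elif start is not None:
            regions.append((start, last + probe_len - 1))
            start = None
    if start is not None:
        regions.append((start, last + probe_len - 1))
    return regions


# Toy data: ten probes whose starts are 100 bp apart, with one enriched stretch.
toy_positions = list(range(1000, 2000, 100))
toy_ratios = [0.1, 0.0, 0.2, 2.0, 2.5, 2.2, 0.1, 0.0, 0.1, 0.2]
print(call_regions(toy_positions, toy_ratios))
```

With probe starts 100 bp apart and 50-mer probes, region lengths in this sketch take
the form 100k + 49 bp, which resembles many of the interval lengths listed in the
table, although that resemblance does not confirm the parameters used here.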
Table 4: Low confidence ARORs reproduced in 2/3 experiments
Location Start End Location Start End
chr19 18478645 18479294 chr19 49431851 49432200
chr19 18669547 18671116 chr19 49667964 49668313
chr19 18713245 18713894 chr19 50938148 50938550
chr19 19133133 19133482 chr19 51603338 51603687
chr19 23602617 23602966 chr19 52059179 52059628
chr19 33085132 33085581 chr19 52329360 52329909
chr19 33123667 33124216 chr19 52977227 52977476
chr19 33134594 33134943 chr19 56046139 56046488
chr19 33171922 33172371 chr19 56084261 56084710
chr19 33299762 33300211 chr19 57082441 57082790
chr19 33318465 33318814 chr19 58305631 58306080
chr19 33472147 33472896 chr19 59883468 59883817
chr19 34861535 34861984 chr19 60879409 60880178
chr19 35625025 35625674 chr19 60935697 60936146
chr19 35971626 35972275 chr19 61137904 61138253
chr19 36090639 36090988 chr19 61558623 61559272
chr19 36295986 36296635 chr19 62221243 62221692
chr19 36475900 36476249 chr19 62463527 62463876
chr19 36902362 36902911 chr19 62743183 62743532
chr19 37452645 37453028 chr19 63043293 63043642
chr19 38208245 38208694 chr19 63796283 63796632
chr19 38998048 38998397 chr20 86025 86474
chr19 39401049 39401598 chr20 186129 186478
chr19 39493084 39493633 chr20 374240 374689
chr19 39640521 39640970 chr20 630326 630675
chr19 39654911 39655460 chr20 819776 820461
chr19 40896909 40897258 chr20 1187171 1187910
chr19 41398412 41398461 chr20 1685474 1686123
chr19 42160600 42161049 chr20 2510712 2511061
chr19 43004883 43005432 chr20 3302092 3302816
chr19 43047590 43047839 chr20 3362400 3362849
chr19 43253030 43253579 chr20 3650566 3650915
chr19 43722972 43723521 chr20 4210584 4210933
chr19 44584577 44585426 chr20 4532309 4532858
chr19 45205975 45206424 chr20 5028493 5029112
chr19 45423619 45424168 chr20 5520806 5520855
chr19 45447048 45447397 chr20 5571126 5571375
chr19 46118727 46119576 chr20 5694565 5695114
chr19 47125080 47125429 chr20 6392685 6393034
chr19 48890297 48891116 chr20 6413972 6414321
chr19 49361040 49361389 chr20 7995357 7995806
chr20 8022897 8023646 chr20 19895467 19896334
chr20 8143246 8143995 chr20 19939387 19939836
chr20 8147983 8148532 chr20 19964988 19965537
chr20 8376449 8376898 chr20 20014385 20014734
chr20 8385839 8386188 chr20 20505679 20506828
chr20 8744430 8744779 chr20 20538883 20539332
chr20 9235062 9235511 chr20 20672188 20672737
chr20 9263603 9264052 chr20 20681410 20681759
chr20 10295735 10296084 chr20 20687968 20688617
chr20 10433499 10433848 chr20 20822783 20823560
chr20 10568179 10568628 chr20 21017161 21018410
chr20 10586970 10587519 chr20 21419896 21420345
chr20 10790097 10790846 chr20 21918062 21918711
chr20 10890471 10891020 chr20 21923111 21923624
chr20 10921262 10921811 chr20 21992850 21993099
chr20 11085985 11086734 chr20 22038701 22039150
chr20 11107422 11107871 chr20 22092602 22093051
chr20 11174652 11175101 chr20 22224307 22225256
chr20 11312634 11312983 chr20 22673456 22673805
chr20 11835812 11835961 chr20 22774445 22774894
chr20 11958998 11959647 chr20 23468415 23468764
chr20 12317529 12317978 chr20 23563845 23564394
chr20 12600252 12600701 chr20 24540811 24541060
chr20 12875352 12875701 chr20 24919295 24919844
chr20 13446843 13447292 chr20 24968738 24969087
chr20 13455599 13455948 chr20 25201931 25202380
chr20 13667994 13668343 chr20 25208394 25209343
chr20 14137370 14138119 chr20 25282846 25283295
chr20 14139212 14139761 chr20 25823624 25823973
chr20 14509388 14509837 chr20 25875906 25875955
chr20 14704376 14704925 chr20 26152136 26152685
chr20 15348221 15348970 chr20 28038191 28038940
chr20 15790826 15791275 chr20 28062893 28063242
chr20 16669212 16669961 chr20 28246656 28247405
chr20 16675752 16676386 chr20 29731678 29732127
chr20 17319063 17319412 chr20 30377431 30377880
chr20 18564469 18565118 chr20 30855739 30856088
chr20 18739975 18740424 chr20 31242776 31243625
chr20 19146116 19146465 chr20 31425357 31425858
chr20 19453198 19453796 chr20 31590490 31590832
chr20 19893053 19893702 chr20 31663266 31663815
chr20 31696446 31697095 chr20 39712360 39712909
chr20 31784103 31784452 chr20 40226501 40227250
chr20 31878183 31878632 chr20 40495766 40496215
chr20 32078854 32079453 chr20 40540853 40541322
chr20 32265260 32265609 chr20 40764025 40764574
chr20 32321740 32322268 chr20 40994846 40995195
chr20 33259294 33260034 chr20 41911859 41912479
chr20 33353728 33354077 chr20 43078028 43078677
chr20 34145184 34146133 chr20 43087038 43087387
chr20 34685974 34686461 chr20 43107462 43107811
chr20 35774351 35774800 chr20 43269654 43270003
chr20 35785573 35786022 chr20 43284401 43285050
chr20 35807610 35808059 chr20 43527309 43527758
chr20 35821463 35822312 chr20 43577457 43577906
chr20 35907945 35908964 chr20 43612296 43612745
chr20 36051313 36052955 chr20 43671754 43672103
chr20 36127528 36127977 chr20 43841164 43841513
chr20 36170828 36172187 chr20 44141378 44141827
chr20 36666815 36667640 chr20 44150550 44150899
chr20 37062104 37062753 chr20 44295018 44295467
chr20 37248188 37248437 chr20 44512660 44513109
chr20 37284273 37284522 chr20 45038929 45039374
chr20 37768424 37768973 chr20 45071133 45071611
chr20 37923427 37924076 chr20 45345131 45345480
chr20 38421583 38422032 chr20 45380709 45381358
chr20 38531029 38531978 chr20 45643479 45643928
chr20 38535870 38536619 chr20 45704446 45704995
chr20 38723173 38723522 chr20 46053868 46054117
chr20 39310384 39311333 chr20 47468203 47468652
chr20 39555790 39556339 chr20 47633544 47633893
chr20 39614039 39614388 chr20 47691580 47691975
chr20 39676517 39676966
A list of low confidence ARORs (reproduced in 2/3 ChIP experiments) is presented with the
chromosomal location and start and end positions of the AROR as defined by the moving average
methodology (see materials and methods). Chr. = chromosome
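
The 1/3, 2/3 and 3/3 categories used in Tables 3 and 4 and in the text amount to
counting how many independent experiments support a given region. The sketch below
shows one simple way to do that count, assuming that a replicate supports a region
when any interval called in that replicate overlaps it. The overlap rule, the names,
and the toy coordinates are hypothetical and are not drawn from the study's data.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end) on a single chromosome, inclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def replicate_support(region: Interval, replicates: Sequence[List[Interval]]) -> int:
    """Count the replicates that have at least one call overlapping the region."""
    return sum(any(overlaps(region, call) for call in calls) for calls in replicates)


def confidence_label(support: int, total: int = 3) -> str:
    """High confidence only when every replicate supports the region."""
    return "high" if support == total else "low"


# Hypothetical calls from three replicate experiments (toy coordinates).
rep1 = [(1000, 1449), (5000, 5349)]
rep2 = [(1100, 1549)]
rep3 = [(1050, 1399), (9000, 9349)]
toy_replicates = [rep1, rep2, rep3]

for region in [(1000, 1549), (5000, 5349), (9000, 9349)]:
    n = replicate_support(region, toy_replicates)
    print(region, f"{n}/3", confidence_label(n))
```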
Figure 10. AR-occupied loci. A) Representative examples. Several AR-occupied
loci identified in our study are presented, where AR occupancy is defined in the
array data by a rise (over background) in consecutive probes, forming a peak. The
top panel provides the chromosomal position, followed by three independent
experimental replicates that demonstrate AR occupancy. The last panel is the gene
annotation track. B) Validation of AR occupancy by conventional ChIP. AR
occupancy at 18 randomly chosen loci from the 62 shown in figure 1A was tested by
independent conventional ChIP assays. Locus-specific primers were designed and
occupancy was assessed by RT-qPCR. The PSA enhancer served as our positive
control and non-AR-occupied loci (NC) on the array served as our negative control.
These data are courtesy of Dr. Li Jia.
18 regions we analyzed also appeared to be occupied by the AR in the absence of
ligand, with R12 and R41 showing substantial AR occupancy (more than that observed
at the PSA locus) without ligand stimulation. These AR-occupied regions may be
important in the context of castrate-resistant disease, where AR activation is known
to occur through ligand-independent pathways (85). The PSA locus was also bound
by the AR in the basal state, and this occupancy increased upon DHT treatment,
consistent with previous observations. Equally important, none of the negative
control (NC) regions demonstrated AR occupancy (Figure 10B). Given that all 18 of
the randomly chosen sites were validated as AR-occupied in our ChIP experiments,
we can say with high confidence that the AR-occupied regions identified through
ChIP-on-chip are likely to be bound by the AR in living cells.
4.4 Location analysis of AR occupied sites
The comprehensive AR binding data from this study gave us an ideal
opportunity to analyze both the position and the distance of our binding sites
relative to nearby annotated genes and transcription start sites (TSS). We
measured the distance from the center of each AR-occupied region (as defined by the
highest ranked probe) to the closest point of the nearest annotated gene using
Ensembl. We also noted the position (5’, 3’ or intragenic) of each AR-occupied
region relative to the closest gene. As indicated in figure 11A, only 3 of our AR-
Figure 11. Analysis of AR-occupied regions. A) Location analysis of the 62
AR-occupied loci. Using the gene annotation data on Ensembl (www.ensembl.org),
location analysis of AR-occupied regions was conducted to determine whether AR
binding was 5’ gene flanking, intragenic, or at the 3’ end of the closest annotated gene.
The distance of AR-occupied regions to the closest genes was also noted. B)
Relationship of AR-occupied regions to transcription start sites (TSS).
The cumulative percentage of AR-occupied regions (blue line) is graphed as a function
of their distance from the nearest transcription start site. For comparison, 125
ER binding sites (green line) (11) identified in MCF7 cells on chromosomes 19 and
20 are also included in this analysis. As a control, a set of sequences (dotted black
line) consisting of 6200 non-AR-occupied regions on chromosomes 19 and 20, size
matched with the AR-occupied regions, was utilized. This graph is courtesy of Dr.
Ben Berman.
occupied regions were within 10 kb 5’ of annotated genes and 7 were within 82 kb 5’
of annotated genes. Five AR-occupied regions were found 3’ of annotated
genes, 1 being within 10 kb and 4 being within 84 kb. Ten regions were in gene
deserts (no genes within 100 kb on either side of the AR-occupied region). The
majority (60%) of our AR-occupied regions were intragenic. Our data suggest that,
contrary to the dogma, AR prefers to bind enhancers rather than promoters.
As a corollary, most of our AR-occupied regions were distal to the TSS
of annotated genes, as shown in Figure 11B (AR data indicated by the blue line). We
also compared the proximity to TSS of our 62 AR-occupied loci with that of the 125
ER-occupied loci in MCF-7 cells (green line, figure 11B) on chromosomes 19 and 20 (11).
This comparison revealed that neither ERORs nor ARORs are greatly enriched
within 100 bp of TSS. Beyond 100 bp from TSS, a gradual increase in
ER binding sites can be noted, but few ARORs are found in comparison. At
about 1 kb from TSS, we find about 20% more ER-occupied regions than
AR-occupied regions, and this difference is maintained at longer distances from TSS. Our
data suggest that, interestingly, ER binding tends to be closer to TSS than AR
binding. This conclusion, although intriguing, is tentative, as the
number of AR-occupied regions we have identified is quite small.
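The location categories used in this section can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the Gene record, the classify_region function, the 100-kb gene desert cutoff expressed as a parameter, and the example coordinates are all hypothetical and serve only to show how a region center could be assigned to the 5’, 3’, intragenic or gene desert class while accounting for gene strand. The final lines simply re-add the counts reported above.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"

def classify_region(center, genes, desert_kb=100):
    """Classify a region center as intragenic, 5', 3' or gene desert.

    Returns (category, distance_bp, gene_name). Distance is measured to the
    closest point of the nearest gene on the same chromosome.
    """
    nearest = None
    for g in genes:
        if g.start <= center <= g.end:
            return ("intragenic", 0, g.name)
        if center < g.start:
            dist = g.start - center
            # Left of a plus-strand gene is upstream (5'); for minus strand it is 3'.
            side = "5'" if g.strand == "+" else "3'"
        else:
            dist = center - g.end
            side = "3'" if g.strand == "+" else "5'"
        if nearest is None or dist < nearest[1]:
            nearest = (side, dist, g.name)
    if nearest is None or nearest[1] > desert_kb * 1000:
        return ("gene desert", None, None)
    return nearest

# Hypothetical genes on one chromosome (illustrative coordinates only).
genes = [Gene("geneA", 1_000, 5_000, "+"), Gene("geneB", 200_000, 210_000, "-")]

# Counts reported in the text: 10 sites 5', 5 sites 3', 10 gene deserts, 37 intragenic.
tally = {"5'": 10, "3'": 5, "gene desert": 10, "intragenic": 37}
intragenic_pct = round(100 * tally["intragenic"] / sum(tally.values()))
```

With these numbers the four classes add up to the 62 AR-occupied regions, and the intragenic class accounts for roughly 60% of them.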
4.5 Half of our AR-occupied sites are associated with histone acetylation
In parallel with AR ChIP-on-chip, we also conducted RNA PolII and
acetylated histone H3/H4 ChIP-on-chip analysis in C4-2B cells to determine whether
our AR-occupied sites recruited components, or led to chromatin alterations,
indicative of active transcription. Immunoprecipitates from RNA PolII and
acetylated histone ChIPs were sent for labeling and hybridization onto NimbleGen’s
array # 35. As shown in figure 12, 85% of PolII-occupied regions on chromosomes
19 and 20 overlap with histone acetylation. This high percentage of overlap with
histone acetylation is expected for PolII, as chromatin opening (indicated by histone
acetylation) is known to facilitate RNA PolII recruitment and subsequent
transcription (74). Fifty percent of our robust (3/3 ChIPs) AR-bound loci were found to
overlap with histone acetylation, suggesting that at half of our 62 loci, AR
binding potentially leads to chromatin opening.
Because we were concerned about missing potential AR-bound loci due to
our stringent cutoffs, we decided to assess histone acetylation at AR-occupied
regions found using lower cutoffs. From figure 12, it appears that the higher our
cutoffs, the more robust the association between AR occupancy and histone
acetylation (compare the data from 1/3 and 2/3 vs. 3/3 ChIPs). This supports our
decision to select and study the 62 AR-occupied loci that were reproduced in 3/3
experiments.
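The overlap percentages quoted in this section can be thought of as a simple interval comparison. The following Python sketch is only an illustration under stated assumptions: regions and acetylation sites are represented as half-open (start, end) intervals on a single chromosome, the function names are hypothetical, and the coordinates are made up. It does not reproduce the analysis behind figure 12.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]

def fraction_overlapping(regions, marks):
    """Fraction of `regions` that overlap at least one interval in `marks`."""
    if not regions:
        return 0.0
    hits = sum(any(overlaps(r, m) for m in marks) for r in regions)
    return hits / len(regions)

# Hypothetical AR-occupied regions and acetylated intervals (illustrative only).
ar_regions = [(100, 200), (300, 400), (500, 600), (700, 800)]
acetylated = [(150, 160), (390, 450)]

ar_fraction = fraction_overlapping(ar_regions, acetylated)
```

Running the same function on region sets defined with different reproducibility cutoffs (1/3, 2/3 or 3/3 ChIPs) would give the kind of comparison described above.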
Figure 12. Correlation of Pol II and AR-occupied regions to histone acetylation.
The percentages of Pol II (red line) and high confidence AR-occupied regions (3 of 3,
pink line) are graphed as a function of their distance from the nearest histone
acetylation site. Also included for comparison are low confidence AR-occupied regions
(2 of 3, green line and 1 of 3, blue line). Control sequences (dotted black line) are 6000
non-AR-occupied regions on chromosomes 19 and 20 that are size matched with
AR-occupied regions. This graph is courtesy of Dr. Ben Berman.
4.6 Conservation of non-exonic AR occupied regions
To assess the extent of evolutionary conservation, we compared the
sequences of AR-occupied regions between 17 vertebrate species using the UCSC
genome browser. For comparison, we also included 111 ER-occupied loci on
chromosomes 19 and 20 in MCF-7 cells (11). As controls we utilized sequences of
6200 non-AR-occupied regions, which are not expected to be conserved. The mean
conservation data from the control sequences is graphed as a belt, with the top gray
line of the belt representing +1 standard error (labeled control + 1st) and the bottom
gray line of the belt representing -1 standard error (labeled control -1st). Data within
this belt is considered non-significant, while the data outside the belt is considered
significant. From Figure 13, we can see that low confidence AR-occupied regions
(blue line, labeled 1/3 ARORs) and ER-occupied regions (red line, labeled
ERORs) are not conserved, as these data sets fall within the standard error belt of
the control sequences. Only the center of the high confidence ARORs (green line,
labeled 3/3 ARORs) is slightly but significantly conserved, as this region lies above
the standard error belt of the control sequences. Our data, which imply that AR
binding sites are more evolutionarily conserved than ER binding sites, would not be
expected; in fact, the opposite is anticipated based on the literature that supports
the view that the AR evolved much later than did ER (52, 99). A possible
explanation for this contradiction is that we utilized more stringent cutoffs to identify
our high confidence AR-occupied regions than were utilized by Carroll et al. in
identifying the ER-occupied regions. Our high stringency reduces
Figure 13. Conservation of AR-occupied regions. Sequence conservation
analysis was conducted using the UCSC genome browser. The conservation score is
graphed relative to the center of the peak, spanning 2500 bp on both sides. Data is
presented for high confidence AR-occupied regions (green line) and low confidence
AR-occupied regions (blue line). For comparison, ER-occupied regions are also
included. As controls we utilized sequences of 6200 non-AR-occupied regions,
which are not expected to be conserved. The mean conservation data from the
control sequences is graphed as a belt, with the top gray line of the belt representing
+1 standard error (labeled control + 1st) and the bottom gray line of the belt
representing -1 standard error (labeled control -1st) from the mean.
contaminants (which may represent weak binders) and enriches for strong binders,
which may be more likely to be conserved. This is supported by the data in Figure
13, where low confidence AR-occupied regions show a lack of conservation (just like
the ER-occupied regions) when compared to high confidence AR-occupied regions.
Thus it is likely that if ER-occupied regions were identified using stringent criteria
comparable to those we utilized, these high confidence ER-occupied regions would be
as conserved as our high confidence ARORs.
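The "belt" criterion used in this section (mean of the control sequences plus or minus one standard error) can be sketched in a few lines of Python. This is a minimal illustration, assuming the belt is computed independently at each position relative to the peak center; the function names and the conservation scores below are hypothetical and are not taken from the UCSC data used in the study.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error_belt(control_scores):
    """Return (mean - SE, mean + SE) for control scores at one position."""
    m = mean(control_scores)
    se = stdev(control_scores) / sqrt(len(control_scores))
    return (m - se, m + se)

def is_outside_belt(score, belt):
    """A score is called significant here only if it falls outside the belt."""
    low, high = belt
    return score < low or score > high

# Hypothetical conservation scores of control regions at the peak center.
control = [0.10, 0.12, 0.08, 0.11, 0.09]
belt = standard_error_belt(control)
```

A test set whose mean score at a given position lies above the upper edge of the belt would be read, in the terms used above, as conserved at that position.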
4.7 Many novel AR-occupied sites can function as enhancers
To demonstrate functionality of the 62 AR-occupied sites, we proceeded to
clone each region, in both forward (F) and reverse (R) orientations, into the TK2+
luciferase construct, composed of a pGL3 backbone with a minimal thymidine
kinase promoter (Figure 14). Each AR binding site, 600 bp on average, was
amplified from C4-2B genomic DNA using site-specific primers containing
engineered EcoRI, BglII, or SacII restriction sites. We were able to clone all but one
AR-occupied region, R16, which could not be cloned due to the presence of repetitive
sequences in this region.
The constructs were transiently transfected into C4-2B PCa cells that were androgen
deprived for 2 days, and each construct’s responsiveness to DHT was measured 24
hrs after DHT treatment. Representative median fold induction (DHT/EtOH) data
from up to 4 independent experiments is shown in Table 2 under the “luciferase”
heading. In an attempt to decipher a mechanism for AR-mediated activation, each
construct’s DHT responsiveness was categorized based on its extent of response to
AR activation. From our 122 constructs, 5 responded 40-80 fold, 20 responded 9-35
fold, 16 responded 4-8 fold, and 20 responded 2-3.5 fold. Lastly, and surprisingly,
61 of the 122 constructs had no appreciable activity (< 2 fold), despite confirmed AR
occupancy. Overall, 37 of our 62 AR binding sites (60%) demonstrated enhancer
activity in at least one orientation. Within the responding constructs we made the
following observations. As expected of enhancers, several of the AR-occupied regions
we cloned were equally responsive in both orientations, with R2, R15, R18, R20,
R21, R30 and R43 being examples. Interestingly, several AR-occupied regions
responded to ligand treatment in an orientation-dependent manner, which is not
classically how enhancers are thought to behave. Examples include R1, R25, R28,
R44, R57, R58 and R11, R13, R14, R23, R27, R54, with the former set responding
more strongly in the forward orientation and the latter set responding more strongly
in the reverse orientation. Currently, the stratifications we have made using the
responsiveness of our AR-occupied regions are being analyzed via bioinformatics to
learn about the “rules” of AR-mediated transcription. Our hypothesis is that the
extent of responsiveness, and perhaps even the directionality of response, may be
related to subtle differences in the AR binding sequences in these constructs. This is
currently being addressed bioinformatically through the generation of ARE logos
and will be addressed experimentally by mutagenesis of the putative important bases.
An alternative explanation for the observed differences in the responsiveness of these
constructs is a varying requirement for AR collaborating factors at each AROR.
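The stratification of constructs by fold induction can be expressed as a small Python sketch. The class names follow the ranges given above, but the exact boundaries between classes are an assumption made here for illustration (the lower bound of each reported range is used as the threshold, so a value such as 3.7 would fall into the 2-3.5 fold class); the function name is hypothetical.

```python
def induction_category(fold):
    """Assign a DHT/EtOH fold induction to one of the response classes in the text.

    Thresholds are the lower bounds of the reported ranges (an assumption).
    """
    if fold >= 40:
        return "40-80 fold"
    if fold >= 9:
        return "9-35 fold"
    if fold >= 4:
        return "4-8 fold"
    if fold >= 2:
        return "2-3.5 fold"
    return "< 2 fold"

# Construct counts per class as reported in the text.
reported_counts = {
    "40-80 fold": 5,
    "9-35 fold": 20,
    "4-8 fold": 16,
    "2-3.5 fold": 20,
    "< 2 fold": 61,
}
total_constructs = sum(reported_counts.values())
```

The reported counts add up to the 122 constructs tested (61 cloned regions in two orientations).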
Figure 14. Map of TkLuc2+. A schematic map of the Tk2+Luc plasmid, which was
used to clone AR-occupied regions, is depicted. The multiple cloning site was used
to clone in PCR fragments representing the AR-occupied regions.
4.8 AR regulates only a small number of genes in the vicinity of the 62 occupied
loci
Following the luciferase assays, which demonstrated functionality of 60% of
the AR-occupied sites in C4-2B cells, we next pursued gene expression analysis with
the goal of identifying novel AR-responsive target genes. Of our 62 AR-occupied
regions, 10 were in gene deserts and 52 were proximal to annotated genes. Within a
100-kb window from these 52 AR-occupied sites, there were a total of 95 annotated
genes. Table 2 lists all the annotated genes within this 100-kb window. The closest
gene to each AR-occupied region is listed first, and is bolded and italicized. By
RT-qPCR, we initially measured mRNA levels of all 95 genes at a single time point, 16
hrs, in DHT- or vehicle-treated C4-2B cells. The gene names are listed in table 2
under the “nearby genes” column. The expression data for each gene is also
presented in table 2 under the “fold change” heading. Of the 95 genes, very few
responded to ligand stimulation. Four genes were induced at least two fold (WTIP,
JAG1, TGM2, and STK4) and five were repressed at least two fold (SIX5, SPTLC2L,
ZNF341, MMP24, and LPIN3) (Table 2). Twenty-nine of the 95 genes were not
expressed in our cell line (confirmed by primers producing a product with genomic
DNA but not with cDNA, although we cannot rule out the possibility of exon skipping or
alternative splice forms of these genes being the actual targets of AR). The
expression of the majority of genes was not significantly altered by ligand
stimulation. We were quite surprised to find that only 9 of the 66 expressed genes
from our original list of 95 responded to AR activation in C4-2B cells. We
reasoned that perhaps our restricted analysis of gene expression at a single time point
could explain the limited number of responding genes. We therefore conducted a
DHT time course experiment to catch gene expression alterations that we might have
missed at time points other than 16 hrs. We selected at random 6 expressed genes
that did not respond to DHT treatment. As positive controls, we included in our
analysis two genes, JAG1 and TGM2, which were responsive at the 16 hr time point.
We also included CBFA2T2, which seemed repressed (although not two fold), due to
its interesting biology. As seen in figure 15, JAG1 and TGM2 were confirmed to be
induced in the time course experiments, as was our positive control, PSA.
CBFA2T2 also appeared to be persistently repressed in our time course experiment.
However, the other 6 genes (EPB41L1, RHPN2, C20ORF77, EPN1, U2AF2 and
ELMO2), which did not respond in the 16 hr experiment, also did not appear to be
regulated by androgens in the time course experiment. Thus, studying gene
expression at a single time point is unlikely to explain the paucity of genes identified
as being AR-regulated.
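The two-fold criterion applied to the RT-qPCR data can be illustrated with a short Python sketch. The text states only that expression was normalized to 18S rRNA and expressed as a DHT/ethanol ratio; the use of the 2^-ΔΔCt calculation below is an assumption made for illustration, and the function names and Ct values are hypothetical.

```python
def fold_change(ct_gene_dht, ct_ref_dht, ct_gene_veh, ct_ref_veh):
    """DHT/vehicle expression ratio from Ct values, normalized to a reference gene.

    Uses the 2^-ddCt calculation (assumed here, not stated in the text).
    """
    d_dht = ct_gene_dht - ct_ref_dht   # gene relative to reference, DHT sample
    d_veh = ct_gene_veh - ct_ref_veh   # gene relative to reference, vehicle sample
    return 2.0 ** -(d_dht - d_veh)

def call_response(fold, threshold=2.0):
    """Label a gene as induced, repressed or unchanged using a fold threshold."""
    if fold >= threshold:
        return "induced"
    if fold <= 1.0 / threshold:
        return "repressed"
    return "unchanged"

# Hypothetical Ct values (target gene vs. 18S rRNA reference).
induced_fold = fold_change(24.0, 10.0, 26.0, 10.0)
repressed_fold = fold_change(27.0, 10.0, 26.0, 10.0)
flat_fold = fold_change(26.0, 10.0, 26.0, 10.0)
```

Applied to each time point of a time course, the same call would show whether a gene that was unchanged at 16 hrs crosses the two-fold threshold at any other time.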
4.9 AR-mediated regulation of a novel, conserved transcript of unknown
function
We were quite perplexed to find only a few responsive genes in the vicinity
of AR-occupied loci, especially considering that many of the 62 regions were
functional as transcriptional enhancers in luciferase assays. Our literature search
made us aware of other genomic elements such as micro RNAs (MIRs) or transcript
Figure 15. DHT time course experiment and gene expression analysis. C4-2B
cells were maintained in 5% CSS-containing medium for three days, and then re-fed
(time 0) with the same medium supplemented with either 10nM DHT or ethanol
vehicle. RNA was extracted at the indicated time points and gene expression of the
95 annotated genes proximal to our AR-occupied regions was measured by RT-
qPCR. Expression levels are normalized to 18S rRNA and the data is graphed as a
comparative ratio (DHT/ethanol).
of unknown functions (TUFs) (6) that have recently received attention in the gene
regulation field, and which we had not previously considered in our analysis as
potentially regulated by the AR. To address the hypothesis that AR regulates
unannotated genomic elements in the vicinity of our 62 AR-occupied loci, we
conducted expression analysis using a genomic tiling microarray. This methodology
would also allow us to determine whether annotated genes outside of the 100-kb window
from each AR-occupied region were regulated by the AR. The array was designed with
50-mer probes with 10-bp overlaps covering ~75% of chromosome 20 originally
present on the ChIP-on-chip array. This high resolution tiling array would enable the
identification of even small AR-regulated genomic elements in an unbiased
manner. For hybridization onto this array we synthesized double-stranded (ds) cDNA
from total RNA collected from 16 hr EtOH-treated and 16 hr DHT-treated C4-2B cells.
The cDNAs from the two samples were differentially labeled with fluorescent dyes
(Cy3 vs. Cy5) and co-hybridized onto the genomic tiling array described above. We were
particularly interested in identifying areas where AR occupancy was detected in our
ChIP-chip analysis, but where gene annotation was absent. Proximal to such areas,
positive signals in the genomic tiling array would indicate the presence of
unannotated DNA elements being regulated by the AR upon ligand stimulation.
The array data was analyzed to identify regions of consecutive probes that
were above the background noise. We utilized the expression pattern of JAG1 and
TGM2, two genes which we knew to be induced upon androgen treatment from our
time course experiment and 16-hr experiment, as indicators of success for the
experiment. Indeed, these two genes were found to be upregulated upon DHT
treatment (data not shown). However, the tiling array data did not reveal additional
annotated genes in the vicinity of AR-occupied regions (beyond 100 kb) being
regulated upon androgen stimulation. This reassured us that our previous strategy
of measuring gene expression within 100 kb of each AR-occupied region was a
reasonable one, as we were not missing additional AR-regulated transcripts outside this
100-kb window. The tiling array data confirmed that AR-occupied regions did not
regulate many annotated genes within or outside a 100-kb window from an AROR.
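The idea of calling a signal from "consecutive probes above the background noise" on a tiling array can be sketched in Python as follows. This is a minimal illustration, not the analysis pipeline used for the array: the function names, the threshold, the minimum run length of three probes and the signal values are all hypothetical. The first function only shows that 50-mer probes with 10-bp overlaps imply a 40-bp step between probe starts.

```python
def probe_starts(region_start, region_end, probe_len=50, overlap=10):
    """Start coordinates of tiled probes; consecutive probes share `overlap` bases."""
    step = probe_len - overlap
    return list(range(region_start, region_end - probe_len + 1, step))

def runs_above(signal, threshold, min_probes=3):
    """Return (first_index, last_index) of each run of at least `min_probes`
    consecutive probes whose signal exceeds `threshold`."""
    runs, start = [], None
    for i, value in enumerate(signal):
        if value > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_probes:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last probe.
    if start is not None and len(signal) - start >= min_probes:
        runs.append((start, len(signal) - 1))
    return runs

# Hypothetical DHT/EtOH log ratios for 12 consecutive probes.
signal = [0.1, 0.2, 1.5, 1.8, 2.0, 0.3, 1.2, 0.1, 1.1, 1.3, 1.4, 1.6]
called = runs_above(signal, threshold=1.0)
```

Requiring several adjacent probes to exceed the threshold discards isolated high probes (index 6 in the example), which is the rationale for looking at consecutive probes rather than single ones.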
We did, however, to our surprise, find an unannotated region, which we call
TUF1, that happened to be AR-regulated. The expression data from the genomic
tiling array for TUF1 is presented in figure 16A. It can be seen in Fig. 16A (top
panel) that a portion of the genome shown in the tiling array is AR-responsive (as
indicated by a rise in the probes in the area labeled TUF1). This region lies
roughly 1 kb downstream of R26 (Table 1), one of the 62 AR-occupied regions
identified in our study (Figure 16A, tracks 2-4).
Figure 16. TUF1 A) Identification. To determine whether AR activation leads to
transcription of unannotated genomic elements, we co-hybridized fluorescently
labeled, double-stranded cDNA from 16-hr DHT- and ethanol-treated C4-2B
cells onto a genomic tiling array. Overlapping 50-mer probes on this array
represented 75% of chromosome 20 which we initially interrogated for AR binding
sites. An example of an unannotated, AR-responsive transcript, which we call
TUF1, is shown in panel 1. Alignment of this genomic region with the AR ChIP-on-
chip data reveals the presence of a high confidence AR-binding site (panels 2-4) ~1-
kb upstream of the TUF1 locus. Panel 5 indicates the histone acetylation profile in
this region. The last panel (6) is shown to indicate the absence of any current gene
annotation data in this region. B) Primer Design. To confirm TUF regulation by
the AR, we conducted expression analysis of TUF1 in response to DHT stimulation
over a time course. Primers were designed, as indicated by arrows, within a region
that appeared to be responsive to AR in the tiling array experiment. C) AR-
mediated regulation of TUF1. mRNA levels of TUF1 were highly responsive to
androgen stimulation. Data is normalized to 18S rRNA and graphed as a
comparative ratio (DHT/ethanol).
The fifth track of Fig. 16A presents the acetylation profile in the TUF1
region, which is higher than background and indicative of open chromatin. The
latter two data sets support the notion that AR occupancy at this locus regulates
transcription of TUF1. To confirm TUF1 regulation by androgens, we designed
primers within a portion of the AR-responsive region in the genomic tiling array
(indicated by arrows in Fig. 16B) and assessed TUF1 expression in the DHT time
course experiment. A –RT (no reverse transcriptase) control was included to rule out
DNA contamination. As shown in Figure 16C, TUF1 is highly responsive to androgen
treatment in a time-dependent manner in C4-2B cells.
TUF1’s functional significance is hinted at by the sequence conservation data
shown in figure 17. We queried the genomic sequence containing the AR-occupied
locus and the TUF1 transcribed region using the UCSC genome browser. The
conservation data is aligned with the microarray data in figure 17. The AR-occupied
region (labeled AROR and demarcated by red lines in figure 17) which lies 1-kb
upstream of TUF1 is highly conserved. The AR-responsive TUF1 coding region
(labeled TUF1 and demarcated by red lines in figure 17), within which we designed
RT-PCR primers to measure androgen responsiveness, is also highly conserved.
Multiple open reading frames (ORFs) can be found in the TUF1 transcript (data not
shown) and preliminary sequence analysis suggests that it may encode a protein with
homology to collagen (data not shown). Our data opens up the possibility that AR
can regulate TUFs in the genome, which could encode a functional protein.
Figure 17. TUF1 sequence conservation. The conservation of TUF1 and its
surrounding region was assessed using the UCSC genome browser. Areas of
conservation are demarcated by red lines.
4.10 Discussion
4.10A. ChIP-on-chip data confirms the notion that AR-occupancy is not biased
to 5’ promoter proximal regions
Recent genome-wide location analyses for transcription factors such as ER
(11) and p53 (106) have clearly shown that, contrary to the dogma, these transcription
factors demonstrate no preference for binding to 5’ gene flanking sequences, which
are thought to be the classical gene regulatory elements. Our ChIP Display data
initially hinted that the same was true for the AR. Our ChIP-on-chip
study further corroborated this notion for the AR. Only 10 of our AR-occupied
regions were 5’ of annotated genes, with seven being within 82 kb of transcriptional
start sites. Five of our sites were up to 84 kb downstream of 3’ ends of genes. Most
(37) of our AR-occupied regions were found to be intragenic, as has been
demonstrated for two other transcription factors, Oct4 and Nanog (59). As both
experimental approaches, ChIP-on-chip and ChIP Display, have led us to the same
conclusion, our results are likely not simply due to chance or experimental technique
bias. Further support for our findings is provided by a recent publication by
Takayama et al., a group that applied AR ChIP-on-chip using the ENCODE array.
This microarray tiles about 1% (30 Mb) of the whole genome, with certain regions
having been manually selected and others selected at random by the research
consortium of the ENCyclopedia of DNA Elements (ENCODE Project Consortium,
2004). Of the ten AR-binding sites identified in their study, only two were within
3 kb upstream of TSS (95). The comprehensive analysis of protein-DNA
interactions, such as the one presented here for AR, is paving the way for a deeper, and
more complex, understanding of gene regulation and suggests that the majority of
gene regulation, at least for the TFs mentioned here, occurs via distal sites rather
than from gene proximal sequences.
4.10B. AR binds to more highly conserved sequences, but farther from TSS
when compared to ER
Phylogenetic analysis of steroid hormone receptors in vertebrates indicates
that the first steroid receptor to evolve was the estrogen receptor, followed by the
progesterone receptor. The full complement of mammalian steroid receptors,
including the androgen receptor, arose from these two ancient receptors through two
large-scale genome expansions (52, 99). As the regulation of physiological
processes by androgens and corticoids is a relatively new innovation, a model of
ligand exploitation has been proposed. In this model, the receptor for the terminal
ligand in a biosynthetic pathway, here estrogen, evolved first. Selection for
estrogen also led to selection for the synthesis of the intermediates in this pathway,
even if initially the receptors for these intermediates were absent. Duplication of
receptors allowed them to evolve affinity for these intermediates, leading to the
creation of new hormone-receptor pairs (99). As the estrogen receptor is more
ancient than the androgen receptor, it would be anticipated that binding sequences
for ER would be more highly conserved than for AR. However, based on the
comparison of our AR-occupied loci with ER-occupied loci on chromosome 19 and
20, we find this not to be the case.
Although we are quite perplexed by this finding, a possible explanation,
provided earlier, is that the AR-occupied regions are more conserved because we
used higher stringency (reproduced in 3/3 independent experiments) cutoffs to define
these binding sites than were used by Carroll et al. for ER-occupied regions. These
stringently mined AR-occupied regions may represent DNA sequences that tightly
bind to AR and therefore are more likely to be conserved. Another explanation for
the AR vs. ER difference could be that conservation of our AR-binding sites reflects
the conservation contributed by other, as yet unidentified, AR-cooperating protein
binding sequences. The basis for this reasoning is that the conservation spans ~200
bp to the right and to the left of the center of the AR-occupied regions. AREs,
however, are only 15-bp long palindromic sequences (38).
Another intriguing finding from our studies was that AR binding, compared
to ER binding, is found to be quite far from the TSS of genes. Consistent with ER
binding closer to TSS, Carroll et al. report that gene expression in MCF-7 cells is
correlated with ER binding in the vicinity of estrogen-regulated genes (11). Our data
suggest, although we cannot be completely sure as we have not exhaustively
addressed this issue, that the same is not the case for the AR in C4-2B cells, where
the closest genes did not respond to DHT and, in fact, only a handful of genes were
found to be responsive to AR activation (within a 100-kb window of an AR-occupied
region). This is likely related to AR binding far from TSS. These
observations all lead us to ask, what, then, is the main function of AR-binding at
these loci? One possibility is that AR binding at sites far from TSS mainly facilitates
chromatin opening, through binding and subsequent recruitment of chromatin
remodeling complexes. This is supported by our data that 50% of our AR binding
sites have histone acetylation, which is indicative of open chromatin. Chromatin
opening upon AR binding to these sites may be followed by recruitment of other
cooperating transcription factors that, in conjunction with AR, regulate expression of
distal genes. As such, these AR binding sites can be thought of as functioning
analogously to locus control regions (LCRs). LCRs are known to be quite far from
genes they regulate, demonstrate classical enhancer activity in transient transfection
assays, and possess chromatin opening activity (54).
4.10C. AR occupancy at the majority of our 62 loci is not sufficient for gene
expression alterations
In addition to learning about the rules that govern AR mediated gene
regulation, one of our goals in conducting this study was to find primary AR target
genes in PCa. Therefore, we were quite disappointed to find only a handful of genes
(9/95) that were responsive to androgen treatment. Extensive time course
experiments for some of the genes suggest that the single time point (16 hrs) used to
determine expression of all genes was not the cause of the observed low number of
responding genes. The paucity of AR-regulated genes is perplexing considering that
a large number of the AR-occupied regions we cloned demonstrated enhancer
activity in luciferase reporter assays. One possible explanation could be that
measuring gene expression within 100 kb of the AR-occupied locus is too narrow
a window. We do not believe this to be the likely explanation for the low number
of AR-responsive genes because our expression tiling array data also confirmed that
AR activation does not modulate the expression of many genes (within or outside a
100-kb window). Another possibility for the low number of AR-responsive genes is
that AR regulates genes very far away, possibly on other chromosomes (which could
not be captured using the tiling array as it covered only a portion of chromosome
20). The bioinformatics analysis of the 62 AR-occupied sites identified by
ChIP-on-chip revealed that AR tends to bind far from the TSS of genes, suggesting a
distal mode of gene regulation. This raises the possibility that many genes that the AR
regulates may be quite far from the identified AR-bound loci. To address this
concern, we are currently conducting genome-wide gene expression profiling to get
an unbiased profile of all AR-regulated genes on chromosomes 19 and 20 in C4-2B
cells. The androgen responsiveness of genes from such a study can then be
correlated with proximity to AR-occupied regions to determine whether a
relationship exists between AR binding and gene regulation.
Another possibility is that our cell culture model system does not entirely
mimic the in vivo conditions required for AR-mediated gene expression alterations.
Perhaps binding of AR needs to be followed by other cues not present in our culture
system, for example the cooperation of the correct co-regulatory molecules, which may
be present in other cell types or under alternative environmental conditions. To this
end, we are currently mining the expression profiles of the genes on chromosomes 19
and 20 in a prostate cancer microarray study. Our hypothesis is that we will find
more AR-responsive genes in the vicinity of the 62 AR-occupied loci in vivo than
we did in our culture system, as the tumor samples, being derived from living tissue,
contain all the components and environmental cues necessary for AR-mediated
gene regulation.
Another intriguing explanation could be that AR regulates expression of non-
coding RNA, or as yet un-annotated transcripts. This possibility is discussed in
depth below.
4.10D. AR-mediated regulation of TUF
Gene annotation data from Ensembl currently lists ~22,000 known or
predicted protein coding loci in the human genome (68). This translates into only
2% of the total fraction of bases in the genome being transcribed and therefore
“functional.” This leaves the vast majority of human DNA to be labeled as “junk
DNA,” represented by the intergenic and intronic sequences, which are generally
thought not to be transcribed. However recent studies with genome tiling
microarrays, which offer comprehensive and unbiased investigation of RNA coding
regions, have noted a high degree of novel transcription that is beyond expectation
based on current gene annotation data.
Chromosome 21 and 22 tiling microarray experiments using 11 human cell
lines revealed that 94% of positive probes lay outside currently annotated exon
positions. This study concluded that there appears to be as much as an order of
magnitude more transcription than can be accounted for by the 770 well-
characterized and predicted genes on these two chromosomes (43). Similarly,
Bertone and colleagues identified a large number of novel transcribed sequences
using pooled human polyA+ RNA (6). Similar results have been obtained by
expression tiling array experiments in other species, including E. coli (90) and
Drosophila (34). These novel transcribed regions are thought to correspond to (i)
unannotated exons from alternatively spliced mRNAs, (ii) under-represented 5’ or 3’
untranslated regions, (iii) non-coding RNAs, or (iv) novel transcripts coding for
functional proteins.
In our search for novel AR-target genes we also conducted an expression
tiling array experiment for a portion of chromosome 20. This led to the
identification of androgen-regulated transcription of TUF1, which corresponds to a
region of the genome that lacks gene annotation. Sequence analysis of TUF1
revealed regions, possibly corresponding to exons, that are highly conserved
across species, suggesting that TUF1 may encode a functional protein. Only one
other study has reported novel expressed sequences in prostate cancer cells (80).
That study, however, did not demonstrate androgen responsiveness of any of its
novel transcribed sequences. As such, TUF1 is the first androgen-responsive novel
transcript to be identified using an expression tiling array.
Our data hint at the possibility that there may be many more such AR-regulated
transcripts that have yet to be discovered. Extrapolating from our discovery of one
unannotated transcript in the portion of chromosome 20 represented on the
expression tiling array (covering ~36,000,000 bp) suggests that there may be ~90
more such AR-regulated unannotated transcripts in the human genome. Such
discoveries will help to fully annotate the human genome and provide novel avenues
for scientific research.
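The extrapolation above can be reproduced with simple proportional scaling. The
sketch below assumes a haploid human genome of ~3.2 billion bp (a figure not
stated in the text) and a uniform density of such transcripts across the genome;
the function and variable names are illustrative only.

```python
# Back-of-the-envelope extrapolation of AR-regulated unannotated transcripts.
# ASSUMPTION: haploid human genome size of ~3.2e9 bp (not given in the text).
# ASSUMPTION: such transcripts are spread uniformly across the genome.
GENOME_BP = 3_200_000_000
TILED_BP = 36_000_000   # portion of chromosome 20 on the tiling array (from the text)
FOUND_ON_ARRAY = 1      # TUF1, the single transcript identified on the array


def extrapolate_transcripts(found: int, tiled_bp: int, genome_bp: int) -> float:
    """Scale an observed count linearly with the number of bases surveyed."""
    if tiled_bp <= 0:
        raise ValueError("tiled_bp must be positive")
    return found * genome_bp / tiled_bp


expected_total = extrapolate_transcripts(FOUND_ON_ARRAY, TILED_BP, GENOME_BP)
expected_additional = expected_total - FOUND_ON_ARRAY

print(f"expected genome-wide: ~{expected_total:.0f}")
print(f"expected beyond TUF1: ~{expected_additional:.0f}")
```

Under these assumptions the estimate comes to roughly 89 transcripts genome-wide,
in line with the ~90 quoted above. Because it scales from a single observation,
the figure should be read as an order-of-magnitude guide rather than a prediction.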
4.10E. ChIP-chip discloses AR target genes with (potential) roles in PCa
Although our in vitro gene expression studies disclosed only a handful of
genes that were AR-regulated upon ligand treatment, our study has expanded the
list of known direct AR target genes. These genes offer new avenues for research
and may provide novel insights into mechanisms of AR-mediated prostate cancer
progression. Some of these genes have functions of particular interest in cancer
biology and are briefly discussed below.
4.10E(a). JAG1
JAG1 encodes the protein Jagged 1, a ligand for the receptor
Notch1. Signaling through Notch1 is known to control cell fate determination and
epithelial-mesenchymal transition (115). Martin et al. applied high-throughput
quantitative proteomics in LNCaP cells to identify proteins of importance in
neoplastic prostate epithelium and found Jagged 1 protein to be up-regulated by
androgens (67). The molecular mechanism for this increase, however, was not
pursued in their studies. We identified an AR binding site in intron 3 of JAG1,
which was functional in luciferase reporter assays, and our gene expression studies
demonstrated a 2-fold induction of JAG1 mRNA upon androgen treatment. Our
results indicate that JAG1 is a direct target of the androgen receptor in prostate
cancer cells. As for the biological functions of Jagged 1, several studies have
demonstrated its importance in cancer. Santagata et al. conducted
immunohistochemical analysis and found Jagged 1 to be significantly overexpressed
in metastatic prostate cancer as compared to localized prostate cancer or benign
prostate tissue (85). Furthermore, high Jagged 1 expression was significantly
associated with prostate cancer recurrence in a subset of clinically localized
disease, independent of other clinical parameters (85). Ergun et al. took a network
biology approach to identify genes and pathways that could be mediators of prostate
cancer progression. This study also identified JAG1 as one of the top 100 potential
genetic mediators for metastatic prostate cancer (20). Lastly, down-regulation of
Jagged 1 was shown to inhibit cell growth and induce S-phase cell cycle arrest in
prostate cancer cells (117). The results from all the studies cited above, in
combination with our findings, suggest that JAG1 is an AR target gene that could
be utilized as a prognostic marker and as a potential therapeutic target for
prostate cancer.
4.10E(b). MMP24
Matrix metallopeptidase 24 is a member of the membrane-type MMP (MT-
MMP) family and is expressed at the cell surface. Members of this family are
central mediators of proteolytic events at the cell surface that regulate tumor cell
motility via extracellular matrix (ECM) degradation, metastasis and angiogenesis
(32). The MMP24 protein is known to activate proMMP2 (progelatinase A) by cleavage
(58). We identified an AR binding site 18 kb 5’ of MMP24, which did not appear to
be functional in our reporter assays. This lack of activity could be explained by
the fact that our reporter assay uses the TK2+ vector, whose minimal thymidine
kinase promoter maintains low basal activity in C4-2B cells; the assay is therefore
better suited to identifying enhancers than repressive DNA elements. However, our
gene expression studies revealed that MMP24 is significantly repressed by androgen
treatment. Consistent with our observations, Jung et al. have also reported that
down-regulation of most MT-MMPs, including MMP24, is typical for prostate carcinoma
(42). Although it might be expected that expression of proteins such as
MMP24 would be up-regulated in cancers to promote disease progression, MMP24 may
have as yet unidentified tumor-suppressive functions. In this regard, it would
make sense that the AR’s oncogenic activity leads to repression
of MMP24 expression. Further studies are required to address the functional
significance of MMP24 gene repression in the context of PCa progression.
4.10E(c). TGM2
TGM2 is a calcium-dependent tissue transglutaminase, which catalyzes the
crosslinking of proteins by forming covalent bonds between the γ-carboxamide
groups of peptide-bound glutamine residues and various primary amines, including
the ε-amino group of lysine in certain proteins (27). In doing so, TGM2 acts as
“biological glue,” and contributes to the processes of cell death and survival, cell-
matrix interaction and stabilization and maintenance of tissue integrity (27).
Additionally, the unique C terminus of TGM2, which is not involved in TGase
activity, functions as a G protein in receptor signaling (27). The enzyme is mainly
localized in the cytoplasm, with a small fraction being found in the membrane and
extracellularly.
Several studies implicate TGM2 in cancer biology. Increased
TGM2 expression has been observed in drug-resistant and metastatic breast cancer
cells (70). Mangala et al. found TGM2 to be closely associated with integrins on the
cell surface, and this association promoted cell attachment, motility, invasion and
cell survival. TGM2 siRNA treatment impaired these functions in breast cancer
cells, confirming the important role TGM2 plays in the metastatic phenotype of these
cells (65). Another study found that TGM2 protects against apoptosis by protecting Rb
from caspase-induced degradation in a transamidation-dependent manner (7).
Multiple transcriptional activators, such as TNF-α, retinoic acid, vitamin D
and progesterone, have been found to regulate TGM2 expression (27). Our study is
the first to report transcriptional regulation of TGM2 by androgens in the prostate.
We mapped an AR-binding site 19 kb 3’ of TGM2, which was functional
in our reporter assay experiments. Our gene expression studies found TGM2 to be
induced ~8-fold by androgen treatment. Considering the significant induction of
TGM2 expression by androgens and this protein’s various roles in cancer
biology, TGM2 is a likely candidate for mediating the androgen receptor’s oncogenic
potential during PCa progression.
4.10E(d). CBFA2T2
CBFA2T2 stands for core-binding factor, runt domain, alpha subunit 2;
translocated to 2. Another name for this gene is MTGR1 (myeloid transforming
gene-related protein 1). CBFA2T2 is a relative of CBFA2T1 / MTG8, also known as
ETO (eight twenty one), which is disrupted by a translocation between chromosomes
8 and 21 in about 15% of myeloid leukemia cases (2). This translocation fuses
MTG8 to the DNA binding domain of AML1 / RUNX1, leading to the creation of
AML1-ETO, which causes repression of RUNX1-regulated genes (55). In contrast to
CBFA2T1 / MTG8, which is expressed in the brain and in some hematopoietic cells,
CBFA2T2 / MTGR1 is ubiquitously expressed (55). Also, whereas CBFA2T1 /
MTG8 is known to undergo chromosomal translocation, CBFA2T2 / MTGR1 has
not yet been reported to undergo translocations (71).
The ETO family members share four conserved regions, NHR1 to NHR4,
and have two conserved zinc finger motifs that are involved in interaction with
proteins but not DNA (48). Members of this family are generally regarded as
transcriptional corepressors as they can interact with various corepressor proteins
and histone deacetylases (50). CBFA2T2 / MTGR1 also functions as a
transcriptional corepressor (48). Additionally, CBFA2T2 interacts with the
AML1-ETO fusion protein, and this interaction is thought to contribute to
leukemogenesis by repressing AML1-dependent transcription (48).
We identified an AR-binding site in intron 4 of CBFA2T2, and this binding
site was highly functional in reporter assays. Androgen treatment led to a persistent
repression of CBFA2T2 mRNA levels (although the repression did not reach two-fold,
which we set as an arbitrary threshold for measuring responsiveness). Because the
literature describes CBFA2T2 as a transcriptional repressor, its own repression by AR
may contribute to prostate cancer progression by leading to the inappropriate
activation of genes it normally represses. Supporting our hypothesis that CBFA2T2 may
be important in AR-mediated prostate cancer progression, Ergun et al. identified
CBFA2T2 as one of the top 100 potential genetic mediators for metastatic prostate
cancer (20). Future studies will need to elucidate the molecular contribution of
CBFA2T2 to PCa.
Chapter 5: Overall conclusion
In summary, our studies have identified a total of 81 novel AR-occupied
regions in C4-2B human prostate cancer cells. It is intriguing that most of these
binding sites were intragenic, rather than residing in 5’ proximal-promoter
sequences. Our findings, as well as data from other studies, suggest that previous
gene regulation studies have been largely biased towards the interrogation of 5’
gene-flanking sequences. Additionally, most gene regulation, at least by AR, ER, p53,
Oct4 and Nanog, appears to occur via distal binding sites. This raises further questions
as to how, mechanistically, these distal transcription factor binding sites regulate
transcription. Future studies will be needed to build models of gene regulation at a
distance. Such mechanisms could be teased out, for example, by conducting
transcription factor binding site mutagenesis or knock-out experiments followed by
global gene expression analysis. Recently, Lomvardas et al. used chromosome conformation
capture to demonstrate association of an enhancer element, H, on chromosome 14
with multiple olfactory receptor gene promoters on different chromosomes (60).
Chromosome conformation capture experiments with the AR-occupied regions
identified in the present study may help to shed light on distal genes which these
enhancers may be regulating.
Related to AR binding far away from the TSS, another interesting finding from
our studies is that linear proximity of a gene to a transcription factor binding site
does not necessarily lead to gene expression alterations. Perhaps as yet unidentified
insulator elements prevent transcription from initiating at a given gene, or perhaps
transcription factor binding on one chromosome leads to transcription of genes on
other chromosomes or at a distant location due to juxtaposition of these sites in the
3D context.
Lastly, we have identified a number of AR target genes that could potentially
participate in AR-mediated prostate cancer progression. These include PRKCD,
PYCR1, JAG1, MMP24, TGM2 and TUF1, to name a few. It will be interesting to
see, in the future, how these genes contribute to disease progression. If they are
found to play significant roles, they may serve as potential therapeutic targets for
disease management.
References
1. Agoulnik, I. U., and N. L. Weigel. 2006. Androgen receptor action in
hormone-dependent and recurrent prostate cancer. J Cell Biochem 99:362-72.
2. Amann, J. M., B. J. Chyla, T. C. Ellis, A. Martinez, A. C. Moore, J. L.
Franklin, L. McGhee, S. Meyers, J. E. Ohm, K. S. Luce, A. J. Ouelette,
M. K. Washington, M. A. Thompson, D. King, S. Gautam, R. J. Coffey,
R. H. Whitehead, and S. W. Hiebert. 2005. Mtgr1 is a transcriptional
corepressor that is required for maintenance of the secretory cell lineage in
the small intestine. Mol Cell Biol 25:9576-85.
3. Ashraf, N., S. Zino, A. Macintyre, D. Kingsmore, A. P. Payne, W. D.
George, and P. G. Shiels. 2006. Altered sirtuin expression is associated with
node-positive breast cancer. Br J Cancer 95:1056-61.
4. Balk, S. P., Y. J. Ko, and G. J. Bubley. 2003. Biology of prostate-specific
antigen. J Clin Oncol 21:383-91.
5. Barski, A., and B. Frenkel. 2004. ChIP Display: novel method for
identification of genomic targets of transcription factors. Nucleic Acids Res
32:e104.
6. Bertone, P., V. Stolc, T. E. Royce, J. S. Rozowsky, A. E. Urban, X. Zhu,
J. L. Rinn, W. Tongprasit, M. Samanta, S. Weissman, M. Gerstein, and
M. Snyder. 2004. Global identification of human transcribed sequences with
genome tiling arrays. Science 306:2242-6.
7. Boehm, J. E., U. Singh, C. Combs, M. A. Antonyak, and R. A. Cerione.
2002. Tissue transglutaminase protects against apoptosis by modifying the
tumor suppressor protein p110 Rb. J Biol Chem 277:20127-30.
8. Borgono, C. A., and E. P. Diamandis. 2004. The emerging roles of human
tissue kallikreins in cancer. Nat Rev Cancer 4:876-90.
9. Brown, C. J., S. J. Goss, D. B. Lubahn, D. R. Joseph, E. M. Wilson, F. S.
French, and H. F. Willard. 1989. Androgen receptor locus on the human X
chromosome: regional localization to Xq11-12 and description of a DNA
polymorphism. Am J Hum Genet 44:264-9.
10. Brown, T. R., P. A. Scherer, Y. T. Chang, C. J. Migeon, P. Ghirri, K.
Murono, and Z. Zhou. 1993. Molecular genetics of human androgen
insensitivity. Eur J Pediatr 152 Suppl 2:S62-9.
11. Carroll, J. S., C. A. Meyer, J. Song, W. Li, T. R. Geistlinger, J.
Eeckhoute, A. S. Brodsky, E. K. Keeton, K. C. Fertuck, G. F. Hall, Q.
Wang, S. Bekiranov, V. Sementchenko, E. A. Fox, P. A. Silver, T. R.
Gingeras, X. S. Liu, and M. Brown. 2006. Genome-wide analysis of
estrogen receptor binding sites. Nat Genet 38:1289-97.
12. Cawley, S., S. Bekiranov, H. H. Ng, P. Kapranov, E. A. Sekinger, D.
Kampa, A. Piccolboni, V. Sementchenko, J. Cheng, A. J. Williams, R.
Wheeler, B. Wong, J. Drenkow, M. Yamanaka, S. Patel, S. Brubaker, H.
Tammana, G. Helt, K. Struhl, and T. R. Gingeras. 2004. Unbiased
mapping of transcription factor binding sites along human chromosomes 21
and 22 points to widespread regulation of noncoding RNAs. Cell 116:499-
509.
13. Chang, H. C., S. C. Chen, J. Chen, and J. T. Hsieh. 2005. In vitro gene
expression changes of androgen receptor coactivators after hormone
deprivation in an androgen-dependent prostate cancer cell line. J Formos Med
Assoc 104:652-8.
14. Chen, C., and M. B. Dickman. 2005. Proline suppresses apoptosis in the
fungal pathogen Colletotrichum trifolii. Proc Natl Acad Sci U S A 102:3459-
64.
15. Chen, C. D., D. S. Welsbie, C. Tran, S. H. Baek, R. Chen, R. Vessella, M.
G. Rosenfeld, and C. L. Sawyers. 2004. Molecular determinants of
resistance to antiandrogen therapy. Nat Med 10:33-9.
16. Chen, J., and I. Sadowski. 2005. Identification of the mismatch repair genes
PMS2 and MLH1 as p53 target genes by using serial analysis of binding
elements. Proc Natl Acad Sci U S A 102:4813-8.
17. Clements, J. A. 1989. The glandular kallikrein family of enzymes: tissue-
specific expression and hormonal regulation. Endocr Rev 10:393-419.
18. Cleutjens, K. B., H. A. van der Korput, C. C. van Eekelen, H. C. van
Rooij, P. W. Faber, and J. Trapman. 1997. An androgen response element
in a far upstream enhancer region is essential for high, androgen-regulated
activity of the prostate-specific antigen promoter. Mol Endocrinol 11:148-61.
19. Dorkin, T. J., M. C. Robinson, C. Marsh, A. Bjartell, D. E. Neal, and H.
Y. Leung. 1999. FGF8 over-expression in prostate cancer is associated with
decreased patient survival and persists in androgen independent disease.
Oncogene 18:2755-61.
20. Ergun, A., C. A. Lawrence, M. A. Kohanski, T. A. Brennan, and J. J.
Collins. 2007. A network biology approach to prostate cancer. Mol Syst Biol
3:82.
21. Ernst, T., M. Hergenhahn, M. Kenzelmann, C. D. Cohen, M. Bonrouhi,
A. Weninger, R. Klaren, E. F. Grone, M. Wiesel, C. Gudemann, J.
Kuster, W. Schott, G. Staehler, M. Kretzler, M. Hollstein, and H. J.
Grone. 2002. Decrease and gain of gene expression are equally
discriminatory markers for prostate carcinoma: a gene expression analysis on
total and microdissected prostate tissue. Am J Pathol 160:2169-80.
22. Fixemer, T., U. Wissenbach, V. Flockerzi, and H. Bonkhoff. 2003.
Expression of the Ca2+-selective cation channel TRPV6 in human prostate
cancer: a novel prognostic marker for tumor progression. Oncogene 22:7858-
61.
23. Freedman, M. L., C. A. Haiman, N. Patterson, G. J. McDonald, A.
Tandon, A. Waliszewska, K. Penney, R. G. Steen, K. Ardlie, E. M. John,
I. Oakley-Girvan, A. S. Whittemore, K. A. Cooney, S. A. Ingles, D.
Altshuler, B. E. Henderson, and D. Reich. 2006. Admixture mapping
identifies 8q24 as a prostate cancer risk locus in African-American men. Proc
Natl Acad Sci U S A 103:14068-73.
24. Fujii, T., M. L. Garcia-Bermejo, J. L. Bernabo, J. Caamano, M. Ohba,
T. Kuroki, L. Li, S. H. Yuspa, and M. G. Kazanietz. 2000. Involvement of
protein kinase C delta (PKCdelta) in phorbol ester-induced apoptosis in
LNCaP prostate cancer cells. Lack of proteolytic cleavage of PKCdelta. J
Biol Chem 275:7574-82.
25. Gelmann, E. P. 2002. Molecular biology of the androgen receptor. J Clin
Oncol 20:3001-15.
26. Gnanapragasam, V. J., C. N. Robson, D. E. Neal, and H. Y. Leung. 2002.
Regulation of FGF8 expression by the androgen receptor in human prostate
cancer. Oncogene 21:5069-80.
27. Griffin, M., R. Casadio, and C. M. Bergamini. 2002. Transglutaminases:
nature's biological glues. Biochem J 368:377-96.
28. Haag, P., J. Bektic, G. Bartsch, H. Klocker, and I. E. Eder. 2005.
Androgen receptor down regulation by small interference RNA induces cell
growth inhibition in androgen sensitive as well as in androgen independent
prostate cancer cells. J Steroid Biochem Mol Biol 96:251-8.
29. Han, G., G. Buchanan, M. Ittmann, J. M. Harris, X. Yu, F. J. Demayo,
W. Tilley, and N. M. Greenberg. 2005. Mutation of the androgen receptor
causes oncogenic transformation of the prostate. Proc Natl Acad Sci U S A
102:1151-6.
30. Hayes, J. D., and D. J. Pulford. 1995. The glutathione S-transferase
supergene family: regulation of GST and the contribution of the isoenzymes
to cancer chemoprotection and drug resistance. Crit Rev Biochem Mol Biol
30:445-600.
31. He, W. W., M. V. Kumar, and D. J. Tindall. 1991. A frame-shift mutation
in the androgen receptor gene causes complete androgen insensitivity in the
testicular-feminized mouse. Nucleic Acids Res 19:2373-8.
32. Hernandez-Barrantes, S., M. Bernardo, M. Toth, and R. Fridman. 2002.
Regulation of membrane type-matrix metalloproteinases. Semin Cancer Biol
12:131-8.
33. Hewitt, K. J., R. Agarwal, and P. J. Morin. 2006. The claudin gene family:
expression in normal and neoplastic tissues. BMC Cancer 6:186.
34. Hild, M., B. Beckmann, S. A. Haas, B. Koch, V. Solovyev, C. Busold, K.
Fellenberg, M. Boutros, M. Vingron, F. Sauer, J. D. Hoheisel, and R.
Paro. 2003. An integrated gene annotation and transcriptional profiling
approach towards the full gene content of the Drosophila genome. Genome
Biol 5:R3.
35. Hollingsworth, M. A., and B. J. Swanson. 2004. Mucins in cancer:
protection and control of the cell surface. Nat Rev Cancer 4:45-60.
36. Holzbeierlein, J., P. Lal, E. LaTulippe, A. Smith, J. Satagopan, L. Zhang,
C. Ryan, S. Smith, H. Scher, P. Scardino, V. Reuter, and W. L. Gerald.
2004. Gene expression analysis of human prostate carcinoma during
hormonal therapy identifies androgen-responsive genes and mechanisms of
therapy resistance. Am J Pathol 164:217-27.
37. Horak, C. E., and M. Snyder. 2002. ChIP-chip: a genomic approach for
identifying transcription factor binding sites. Methods Enzymol 350:469-83.
38. Horie-Inoue, K., H. Bono, Y. Okazaki, and S. Inoue. 2004. Identification
and functional analysis of consensus androgen response elements in human
prostate cancer cells. Biochem Biophys Res Commun 325:1312-7.
39. Impey, S., S. R. McCorkle, H. Cha-Molstad, J. M. Dwyer, G. S. Yochum,
J. M. Boss, S. McWeeney, J. J. Dunn, G. Mandel, and R. H. Goodman.
2004. Defining the CREB regulon: a genome-wide analysis of transcription
factor regulatory regions. Cell 119:1041-54.
40. Jia, L., J. Kim, H. Shen, P. E. Clark, W. D. Tilley, and G. A. Coetzee.
2003. Androgen receptor activity at the prostate specific antigen locus:
steroidal and non-steroidal mechanisms. Mol Cancer Res 1:385-92.
41. Jia, L., H. C. Shen, M. Wantroba, O. Khalid, G. Liang, Q. Wang, E.
Gentzschein, J. K. Pinski, F. Z. Stanczyk, P. A. Jones, and G. A. Coetzee.
2006. Locus-wide chromatin remodeling and enhanced androgen receptor-
mediated transcription in recurrent prostate tumor cells. Mol Cell Biol
26:7331-41.
42. Jung, M., A. Romer, G. Keyszer, M. Lein, G. Kristiansen, D. Schnorr, S.
A. Loening, and K. Jung. 2003. mRNA expression of the five membrane-
type matrix metalloproteinases MT1-MT5 in human prostatic cell lines and
their down-regulation in human malignant prostatic tissue. Prostate 55:89-98.
43. Kapranov, P., S. E. Cawley, J. Drenkow, S. Bekiranov, R. L. Strausberg,
S. P. Fodor, and T. R. Gingeras. 2002. Large-scale transcriptional activity
in chromosomes 21 and 22. Science 296:916-9.
44. Kharait, S., R. Dhir, D. Lauffenburger, and A. Wells. 2006. Protein kinase
Cdelta signaling downstream of the EGF receptor mediates migration and
invasiveness of prostate cancer cells. Biochem Biophys Res Commun
343:848-56.
45. Kiley, S. C., K. J. Clark, M. Goodnough, D. R. Welch, and S. Jaken.
1999. Protein kinase C delta involvement in mammary tumor cell metastasis.
Cancer Res 59:3230-8.
46. Kim, J., A. A. Bhinge, X. C. Morgan, and V. R. Iyer. 2005. Mapping
DNA-protein interactions in large genomes by sequence tag analysis of
genomic enrichment. Nat Methods 2:47-53.
47. Kim, T. H., and B. Ren. 2006. Genome-Wide Analysis of Protein-DNA
Interactions. Annu Rev Genomics Hum Genet 7:81-102.
48. Kitabayashi, I., K. Ida, F. Morohoshi, A. Yokoyama, N. Mitsuhashi, K.
Shimizu, N. Nomura, Y. Hayashi, and M. Ohki. 1998. The AML1-MTG8
leukemic fusion protein forms a complex with a novel member of the
MTG8(ETO/CDR) family, MTGR1. Mol Cell Biol 18:846-58.
49. Konstantinos, H. 2005. Prostate cancer in the elderly. Int Urol Nephrol
37:797-806.
50. Kumar, R., J. Manning, H. E. Spendlove, G. Kremmidiotis, R. McKirdy,
J. Lee, D. N. Millband, K. M. Cheney, M. R. Stampfer, P. P. Dwivedi, H.
A. Morris, and D. F. Callen. 2006. ZNF652, a novel zinc finger protein,
interacts with the putative breast tumor suppressor CBFA2T3 to repress
transcription. Mol Cancer Res 4:655-65.
51. Langley, E., M. Pearson, M. Faretta, U. M. Bauer, R. A. Frye, S.
Minucci, P. G. Pelicci, and T. Kouzarides. 2002. Human SIR2 deacetylates
p53 and antagonizes PML/p53-induced cellular senescence. Embo J 21:2383-
96.
52. Laudet, V. 1997. Evolution of the nuclear receptor superfamily: early
diversification from an ancestral orphan receptor. J Mol Endocrinol 19:207-
26.
53. Li, P., X. Yu, K. Ge, J. Melamed, R. G. Roeder, and Z. Wang. 2002.
Heterogeneous expression and functions of androgen receptor co-factors in
primary prostate cancer. Am J Pathol 161:1467-74.
54. Li, Q., K. R. Peterson, X. Fang, and G. Stamatoyannopoulos. 2002. Locus
control regions. Blood 100:3077-86.
55. Licht, J. D. 2001. AML1 and the AML1-ETO fusion protein in the
pathogenesis of t(8;21) AML. Oncogene 20:5660-79.
56. Lin, B., C. Ferguson, J. T. White, S. Wang, R. Vessella, L. D. True, L.
Hood, and P. S. Nelson. 1999. Prostate-localized and androgen-regulated
expression of the membrane-bound serine protease TMPRSS2. Cancer Res
59:4180-4.
57. Litvinov, I. V., A. M. De Marzo, and J. T. Isaacs. 2003. Is the Achilles'
heel for prostate cancer therapy a gain of function in androgen receptor
signaling? J Clin Endocrinol Metab 88:2972-82.
58. Llano, E., A. M. Pendas, J. P. Freije, A. Nakano, V. Knauper, G.
Murphy, and C. Lopez-Otin. 1999. Identification and characterization of
human MT5-MMP, a new membrane-bound activator of progelatinase A
overexpressed in brain tumors. Cancer Res 59:2570-6.
59. Loh, Y. H., Q. Wu, J. L. Chew, V. B. Vega, W. Zhang, X. Chen, G.
Bourque, J. George, B. Leong, J. Liu, K. Y. Wong, K. W. Sung, C. W.
Lee, X. D. Zhao, K. P. Chiu, L. Lipovich, V. A. Kuznetsov, P. Robson, L.
W. Stanton, C. L. Wei, Y. Ruan, B. Lim, and H. H. Ng. 2006. The Oct4
and Nanog transcription network regulates pluripotency in mouse embryonic
stem cells. Nat Genet 38:431-40.
60. Lomvardas, S., G. Barnea, D. J. Pisapia, M. Mendelsohn, J. Kirkland,
and R. Axel. 2006. Interchromosomal interactions and olfactory receptor
choice. Cell 126:403-13.
61. Louie, M. C., H. Q. Yang, A. H. Ma, W. Xu, J. X. Zou, H. J. Kung, and
H. W. Chen. 2003. Androgen-induced recruitment of RNA polymerase II to
a nuclear receptor-p160 coactivator complex. Proc Natl Acad Sci U S A
100:2226-30.
62. Lu, J., A. Lal, B. Merriman, S. Nelson, and G. Riggins. 2004. A
comparison of gene expression profiles produced by SAGE, long SAGE, and
oligonucleotide chips. Genomics 84:631-6.
63. Lubahn, D. B., D. R. Joseph, P. M. Sullivan, H. F. Willard, F. S. French,
and E. M. Wilson. 1988. Cloning of human androgen receptor
complementary DNA and localization to the X chromosome. Science
240:327-30.
64. Magee, J. A., L. W. Chang, G. D. Stormo, and J. Milbrandt. 2006. Direct,
androgen receptor-mediated regulation of the FKBP5 gene via a distal
enhancer element. Endocrinology 147:590-8.
65. Mangala, L. S., J. Y. Fok, I. R. Zorrilla-Calancha, A. Verma, and K.
Mehta. 2006. Tissue transglutaminase expression promotes cell attachment,
invasion and survival in breast cancer cells. Oncogene.
66. Marker, P. C., A. A. Donjacour, R. Dahiya, and G. R. Cunha. 2003.
Hormonal, cellular, and molecular control of prostatic development. Dev Biol
253:165-74.
67. Martin, D. B., D. R. Gifford, M. E. Wright, A. Keller, E. Yi, D. R.
Goodlett, R. Aebersold, and P. S. Nelson. 2004. Quantitative proteomic
analysis of proteins released by neoplastic prostate epithelium. Cancer Res
64:347-55.
68. Mattick, J. S., and I. V. Makunin. 2006. Non-coding RNA. Hum Mol
Genet 15 Spec No 1:R17-29.
69. Maxwell, S. A., and G. E. Davis. 2000. Differential gene expression in p53-
mediated apoptosis-resistant vs. apoptosis-sensitive tumor cell lines. Proc
Natl Acad Sci U S A 97:13009-14.
70. Mehta, K., J. Fok, F. R. Miller, D. Koul, and A. A. Sahin. 2004.
Prognostic significance of tissue transglutaminase in drug resistant and
metastatic breast cancer. Clin Cancer Res 10:8068-76.
71. Morohoshi, F., S. Mitani, N. Mitsuhashi, I. Kitabayashi, E. Takahashi,
M. Suzuki, N. Munakata, and M. Ohki. 2000. Structure and expression
pattern of a human MTG8/ETO family gene, MTGR1. Gene 241:287-95.
72. Murtha, P., D. J. Tindall, and C. Y. Young. 1993. Androgen induction of a
human prostate-specific kallikrein, hKLK2: characterization of an androgen
response element in the 5' promoter region of the gene. Biochemistry
32:6459-64.
73. Nelson, P. S., N. Clegg, H. Arnold, C. Ferguson, M. Bonham, J. White, L.
Hood, and B. Lin. 2002. The program of androgen-responsive genes in
neoplastic prostate epithelium. Proc Natl Acad Sci U S A 99:11890-5.
74. Nightingale, K. P., R. E. Wellinger, J. M. Sogo, and P. B. Becker. 1998.
Histone acetylation facilitates RNA polymerase II transcription of the
Drosophila hsp26 gene in chromatin. Embo J 17:2865-76.
75. Nishihira, J., M. Fujinaga, T. Kuriyama, M. Suzuki, H. Sugimoto, A.
Nakagawa, I. Tanaka, and M. Sakai. 1998. Molecular cloning of human D-
dopachrome tautomerase cDNA: N-terminal proline is essential for enzyme
activation. Biochem Biophys Res Commun 243:538-44.
76. Ortiz, J. A., M. Castillo, E. D. del Toro, J. Mulet, S. Gerber, L. M. Valor,
S. Sala, F. Sala, L. M. Gutierrez, and M. Criado. 2005. The cysteine-rich
with EGF-like domains 2 (CRELD2) protein interacts with the large
cytoplasmic domain of human neuronal nicotinic acetylcholine receptor
alpha4 and beta2 subunits. J Neurochem 95:1585-96.
77. Ouyang, X., T. L. DeWeese, W. G. Nelson, and C. Abate-Shen. 2005.
Loss-of-function of Nkx3.1 promotes increased oxidative damage in prostate
carcinogenesis. Cancer Res 65:6773-9.
78. Paliouras, M., C. Borgono, and E. P. Diamandis. 2007. Human tissue
kallikreins: The cancer biomarker family. Cancer Lett 249:61-79.
79. Prescott, J., U. Jariwala, L. Jia, J. P. Cogan, A. Barski, S. Pregizer, A.
Arasheben, J. J. Neilson, B. Frenkel, and G. A. Coetzee. 2007
Provisionally Accepted. Androgen Receptor-Mediated Repression of Novel
Target Genes. Prostate.
80. Quayle, S. N., H. Hare, A. D. Delaney, M. Hirst, D. Hwang, J. E. Schein,
S. J. Jones, M. A. Marra, and M. D. Sadar. 2007. Novel expressed
sequences identified in a model of androgen independent prostate cancer.
BMC Genomics 8:32.
81. Ren, B., F. Robert, J. J. Wyrick, O. Aparicio, E. G. Jennings, I. Simon, J.
Zeitlinger, J. Schreiber, N. Hannett, E. Kanin, T. L. Volkert, C. J.
Wilson, S. P. Bell, and R. A. Young. 2000. Genome-wide location and
function of DNA binding proteins. Science 290:2306-9.
82. Rennie, P. S., N. Bruchovsky, K. J. Leco, P. C. Sheppard, S. A.
McQueen, H. Cheng, R. Snoek, A. Hamel, M. E. Bock, B. S. MacDonald,
et al. 1993. Characterization of two cis-acting DNA elements involved
in the androgen regulation of the probasin gene. Mol Endocrinol 7:23-36.
83. Riegman, P. H., R. J. Vlietstra, J. A. van der Korput, A. O. Brinkmann,
and J. Trapman. 1991. The promoter of the prostate-specific antigen gene
contains a functional androgen responsive element. Mol Endocrinol 5:1921-
30.
84. Roh, T. Y., W. C. Ngau, K. Cui, D. Landsman, and K. Zhao. 2004. High-
resolution genome-wide mapping of histone modifications. Nat Biotechnol
22:1013-6.
85. Santagata, S., F. Demichelis, A. Riva, S. Varambally, M. D. Hofer, J. L.
Kutok, R. Kim, J. Tang, J. E. Montie, A. M. Chinnaiyan, M. A. Rubin,
and J. C. Aster. 2004. JAGGED1 expression is associated with prostate
cancer metastasis and recurrence. Cancer Res 64:6854-7.
86. Scher, H. I., G. Buchanan, W. Gerald, L. M. Butler, and W. D. Tilley.
2004. Targeting the androgen receptor: improving outcomes for castration-
resistant prostate cancer. Endocr Relat Cancer 11:459-76.
87. Scher, H. I., and C. L. Sawyers. 2005. Biology of progressive, castration-
resistant prostate cancer: directed therapies targeting the androgen-receptor
signaling axis. J Clin Oncol 23:8253-61.
88. Schwarz, E. C., U. Wissenbach, B. A. Niemeyer, B. Strauss, S. E. Philipp,
V. Flockerzi, and M. Hoth. 2006. TRPV6 potentiates calcium-dependent
cell proliferation. Cell Calcium 39:163-73.
89. Sciavolino, P. J., E. W. Abrams, L. Yang, L. P. Austenberg, M. M. Shen,
and C. Abate-Shen. 1997. Tissue-specific expression of murine Nkx3.1 in
the male urogenital system. Dev Dyn 209:127-38.
90. Selinger, D. W., K. J. Cheung, R. Mei, E. M. Johansson, C. S. Richmond,
F. R. Blattner, D. J. Lockhart, and G. M. Church. 2000. RNA expression
analysis using a 30 base pair resolution Escherichia coli genome array. Nat
Biotechnol 18:1262-8.
91. Shimizu, H., R. K. Ross, L. Bernstein, R. Yatani, B. E. Henderson, and T.
M. Mack. 1991. Cancers of the prostate and breast among Japanese and
white immigrants in Los Angeles County. Br J Cancer 63:963-6.
92. Sonn, G. A., W. Aronson, and M. S. Litwin. 2005. Impact of diet on
prostate cancer: a review. Prostate Cancer Prostatic Dis 8:304-10.
93. Spilianakis, C. G., M. D. Lalioti, T. Town, G. R. Lee, and R. A. Flavell.
2005. Interchromosomal associations between alternatively expressed loci.
Nature 435:637-45.
94. Sumitomo, M., M. Ohba, J. Asakuma, T. Asano, T. Kuroki, T. Asano,
and M. Hayakawa. 2002. Protein kinase Cdelta amplifies ceramide
formation via mitochondrial signaling in prostate cancer cells. J Clin Invest
109:827-36.
95. Takayama, K., K. Kaneshiro, S. Tsutsumi, K. Horie-Inoue, K. Ikeda, T.
Urano, N. Ijichi, Y. Ouchi, K. Shirahige, H. Aburatani, and S. Inoue.
2007. Identification of novel androgen response genes in prostate cancer cells
by coupling chromatin immunoprecipitation and genomic microarray
analysis. Oncogene.
96. Tanaka, A., K. Miyamoto, N. Minamino, M. Takeda, B. Sato, H. Matsuo,
and K. Matsumoto. 1992. Cloning and characterization of an androgen-
induced growth factor essential for the androgen-dependent growth of mouse
mammary carcinoma cells. Proc Natl Acad Sci U S A 89:8928-32.
97. Tanaka, Y., M. V. Gavrielides, Y. Mitsuuchi, T. Fujii, and M. G.
Kazanietz. 2003. Protein kinase C promotes apoptosis in LNCaP prostate
cancer cells through activation of p38 MAPK and inhibition of the Akt
survival pathway. J Biol Chem 278:33753-62.
98. Thalmann, G. N., P. E. Anezinis, S. M. Chang, H. E. Zhau, E. E. Kim, V.
L. Hopwood, S. Pathak, A. C. von Eschenbach, and L. W. Chung. 1994.
Androgen-independent cancer progression and bone metastasis in the LNCaP
model of human prostate cancer. Cancer Res 54:2577-81.
99. Thornton, J. W. 2001. Evolution of vertebrate steroid receptors from an
ancestral estrogen receptor by ligand exploitation and serial genome
expansions. Proc Natl Acad Sci U S A 98:5671-6.
100. Tomlins, S. A., D. R. Rhodes, S. Perner, S. M. Dhanasekaran, R. Mehra,
X. W. Sun, S. Varambally, X. Cao, J. Tchinda, R. Kuefer, C. Lee, J. E.
Montie, R. B. Shah, K. J. Pienta, M. A. Rubin, and A. M. Chinnaiyan.
2005. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in
prostate cancer. Science 310:644-8.
101. Vaarala, M. H., K. Porvari, A. Kyllonen, O. Lukkarinen, and P. Vihko.
2001. The TMPRSS2 gene encoding transmembrane serine protease is
overexpressed in a majority of prostate cancer patients: detection of mutated
TMPRSS2 form in a case of aggressive disease. Int J Cancer 94:705-10.
102. van Steensel, B., and S. Henikoff. 2000. Identification of in vivo DNA
targets of chromatin proteins using tethered dam methyltransferase. Nat
Biotechnol 18:424-8.
103. Velcich, A., W. Yang, J. Heyer, A. Fragale, C. Nicholas, S. Viani, R.
Kucherlapati, M. Lipkin, K. Yang, and L. Augenlicht. 2002. Colorectal
cancer in mice genetically deficient in the mucin Muc2. Science 295:1726-9.
104. Verras, M., and Z. Sun. 2006. Roles and regulation of Wnt signaling and
beta-catenin in prostate cancer. Cancer Lett 237:22-32.
105. Wang, Q., J. S. Carroll, and M. Brown. 2005. Spatial and temporal
recruitment of androgen receptor and its coactivators involves chromosomal
looping and polymerase tracking. Mol Cell 19:631-42.
106. Wei, C. L., Q. Wu, V. B. Vega, K. P. Chiu, P. Ng, T. Zhang, A. Shahab,
H. C. Yong, Y. Fu, Z. Weng, J. Liu, X. D. Zhao, J. L. Chew, Y. L. Lee, V.
A. Kuznetsov, W. K. Sung, L. D. Miller, B. Lim, E. T. Liu, Q. Yu, H. H.
Ng, and Y. Ruan. 2006. A global map of p53 transcription-factor binding
sites in the human genome. Cell 124:207-19.
107. Williams, S. A., P. Singh, J. T. Isaacs, and S. R. Denmeade. 2007. Does
PSA play a role as a promoting agent during the initiation and/or progression
of prostate cancer? Prostate 67:312-29.
108. Willingham, A. T., and T. R. Gingeras. 2006. TUF love for "junk" DNA.
Cell 125:1215-20.
109. Wilson, S., B. Greer, J. Hooper, A. Zijlstra, B. Walker, J. Quigley, and S.
Hawthorne. 2005. The membrane-anchored serine protease, TMPRSS2,
activates PAR-2 in prostate cancer cells. Biochem J 388:967-72.
110. Wu, J., L. T. Smith, C. Plass, and T. H. Huang. 2006. ChIP-chip comes of
age for genome-wide functional analysis. Cancer Res 66:6899-902.
111. Xu, J., S. L. Zheng, B. Chang, J. R. Smith, J. D. Carpten, O. C. Stine, S.
D. Isaacs, K. E. Wiley, L. Henning, C. Ewing, P. Bujnovszky, E. R.
Bleeker, P. C. Walsh, J. M. Trent, D. A. Meyers, and W. B. Isaacs. 2001.
Linkage of prostate cancer susceptibility loci to chromosome 1. Hum Genet
108:335-45.
112. Xu, L. L., Y. Shi, G. Petrovics, C. Sun, M. Makarem, W. Zhang, I. A.
Sesterhenn, D. G. McLeod, L. Sun, J. W. Moul, and S. Srivastava. 2003.
PMEPA1, an androgen-regulated NEDD4-binding protein, exhibits cell
growth inhibitory function and decreased expression during prostate cancer
progression. Cancer Res 63:4299-304.
113. Xu, L. L., Y. P. Su, R. Labiche, T. Segawa, N. Shanmugam, D. G.
McLeod, J. W. Moul, and S. Srivastava. 2001. Quantitative expression
profile of androgen-regulated genes in prostate cancer cells and identification
of prostate-specific genes. Int J Cancer 92:322-8.
114. Yamaguchi, M., K. Yamamoto, and O. Miura. 2003. Aberrant expression
of the LHX4 LIM-homeobox gene caused by t(1;14)(q25;q32) in chronic
myelogenous leukemia in biphenotypic blast crisis. Genes Chromosomes
Cancer 38:269-73.
115. Zavadil, J., L. Cermak, N. Soto-Nieves, and E. P. Bottinger. 2004.
Integration of TGF-beta/Smad and Jagged1/Notch signalling in epithelial-to-
mesenchymal transition. EMBO J 23:1155-65.
116. Zegarra-Moro, O. L., L. J. Schmidt, H. Huang, and D. J. Tindall. 2002.
Disruption of androgen receptor function inhibits proliferation of androgen-
refractory prostate cancer cells. Cancer Res 62:1008-13.
117. Zhang, Y., Z. Wang, F. Ahmed, S. Banerjee, Y. Li, and F. H. Sarkar.
2006. Down-regulation of Jagged-1 induces cell growth inhibition and S
phase arrest in prostate cancer cells. Int J Cancer 119:2071-7.
Abstract
The androgen receptor (AR) has unequivocal roles throughout all phases of prostate cancer (PCa). The downstream target genes that mediate the receptor's oncogenic functions, however, remain ill-defined. This thesis describes studies undertaken to identify AR target genes in C4-2B advanced human PCa cells using two chromatin immunoprecipitation-based methodologies: ChIP Display and ChIP-on-chip.
Asset Metadata
Creator
Jariwala, Unnati (author)
Core Title
Identification of novel androgen receptor target genes in prostate cancer
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Biochemistry and Molecular Biology
Degree Conferral Date
2007-08
Publication Date
06/26/2007
Defense Date
05/02/2007
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
androgen receptor,C4-2B,castrate resistant,ChIP,OAI-PMH Harvest,prostate cancer
Language
English
Advisor
Frenkel, Baruch (committee chair), [illegible] (committee member)
Creator Email
jariwala@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m552
Unique identifier
UC1145033
Identifier
etd-Jariwala-20070626 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-507998 (legacy record id),usctheses-m552 (legacy record id)
Legacy Identifier
etd-Jariwala-20070626.pdf
Dmrecord
507998
Document Type
Dissertation
Rights
Jariwala, Unnati
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu