Characterizing ZFX-mediated gene regulation to reveal possible candidates for clinical intervention
(USC Thesis Other)
Characterizing ZFX-mediated gene regulation to reveal possible candidates for clinical intervention

Lijun Yao
Mentor: Dr. Peggy J. Farnham
Department of Biochemistry and Molecular Medicine
Master of Science
University of Southern California
August 8th, 2017

Acknowledgements

Before I start my story about ZFX, I would like to express my gratitude to my mentor, Dr. Peggy Farnham. She is not only a distinguished scientist but also a very responsible mentor. For academic studies, I am often inspired by her innovative thinking, passion for scientific research and diligent attitude. For my individual development, I benefited from her instructions on developing short term goals and a long term career plan. I am very fortunate to study in our lab and to have this great chance to learn experiment techniques and bioinformatics analysis methods, as well as to attend some interesting academic conferences.

In addition, I want to thank all my fellow lab members for their help over these last 2 years, including Yu (Phoebe) Guo, Zhifei Luo, Carol Munoz, Charlie Nicolet, Andrew Perez, Suhn Kyong Rhie, Shannon Schreiner, Heather Witt, and Songren Wang. Especially, I am very grateful to Heather and Phoebe for teaching me experiment skills and to Suhn for teaching me bioinformatics analysis while they were very busy with their own lab work. I really appreciate their time and patient guidance.

I would like to thank my committee member, Dr. Michael R. Stallcup, who has been very supportive during my whole MS journey, as well as Dr. Judd Rice, who gave very interesting lectures in my classes. I appreciate their time in reviewing my thesis and giving me suggestions for improving my project.

I thank the Stanford Center for Genomics and Personalized Medicine (SCGPM) for ChIP-seq and RNA-seq sequencing and the University of Southern California's Norris Medical Library Bioinformatics Service for assisting with sequencing data analysis.
Finally, I would like to thank my parents, Bin Yao and Junfeng Kang, my sister, Liping Yao, and my uncle, Jungen Kang, for supporting me in studying abroad and pursuing my MS at USC. Looking back, it was really a good choice. Also, I have been very happy living with my roommates, Yuting Cheng, Mengyao Zeng and Yishu Qu. Thanks for being so friendly and bringing me lots of fun.

Table of Contents

Acknowledgements
List of Figures
List of Tables
Abstract
List of Abbreviations
Chapter 1 Introduction
1.1 Expression of Zinc Finger Protein, X-Linked (ZFX), correlates with tumorigenesis and poor patient survival
1.2 ZFX gene structure and protein structure
1.3 ZFX might act as transcriptional regulator
Chapter 2 Materials and Methods
2.1 Cell Culture
2.2 Antibody Validation
2.3 Chromatin Immunoprecipitation Library Construction and Sequencing
2.4 qPCR analyses
2.5 ChIP-seq data processing
2.6 siRNA knockdown and RNA-seq library construction
2.7 RNA-seq data processing
Chapter 3 Characterizing binding patterns for ZFX in different tumor cell lines
3.1 ENCODE TF Antibody Characterization
3.2 Creation of high quality, duplicate ZFX ChIP-seq datasets in 3 different cancer cell lines
3.3 ZFX binds to many of the same promoter regions in different cancer cell types
3.4 ZFX binds downstream of the TSS in promoter regions
3.5 ZFX binding motifs are similar in C42B, MCF7 and HCT116 cells
Chapter 4 Functional analysis of ZFX in human cancer cells
4.1 Many genes downregulated upon knockdown of ZFX are direct ZFX targets
4.2 Binding patterns of direct ZFX target genes
4.3 Identification of top diseases and biological functions affected by knockdown of ZFX in C42B and MCF7 cells
Chapter 5 Discussion and Future Directions
What distinguishes a “functional” bound ZFX from a “non-functional” bound ZFX?
Does ZNF711 substitute for ZFX in MCF7 cells?
Could inhibition of ZFX-regulated pathways provide a therapeutic option?
References

List of Figures

Figure 1.1 The influence of ZFX overexpression on prostate, breast and colon cancer
Figure 1.2 ZFX gene and protein structure
Figure 1.3 ZFX can transactivate the SET promoter
Figure 2.1 Chromatin Immunoprecipitation flow chart
Figure 2.2 An example of ChIP enrichment checks
Figure 2.3 ZFX siRNA target sites
Figure 3.1 ENCODE primary antibody validation
Figure 3.2 ENCODE secondary antibody validation
Figure 3.3 ChIP-seq flow chart
Figure 3.4 ZFX ChIP-seq binding patterns
Figure 3.5 ZFX ChIP-seq peak location analysis
Figure 3.6 Overlapping ZFX peaks in different cell lines
Figure 3.7 Heat map of all C42B and MCF7 binding sites
Figure 3.8 Location analysis of ZFX binding sites relative to the nearest TSS in C42B cells
Figure 3.9 Treefam gene tree analysis of ZFX
Figure 4.1 Knockdown of ZFX by siRNA in C42B and MCF7
Figure 4.2 Most direct target sites are located ~ 240bp downstream of TSS
Figure 4.3 UCSC browser screenshot of a direct target gene, DIS3L
Figure 4.4 There are many more common downregulated genes than upregulated genes between C42B and MCF7
Figure 5.1 CRISPR-mediated knockout of the ZFX gene
Figure 5.2 ZFX and ZNF711 expression levels in C42B and MCF7 cells
Figure 5.3 Top 5 related pathways upon ZFX knockdown
Figure 5.4 ZFX regulated genes in the BRCA1 DNA damage response pathway
Figure 5.5 Pathway comparison upon ZFX knockdown in C42B and MCF7

List of Tables

Table 2.1 Cell line information
Table 2.2 qPCR primer sequences for ChIP enrichment checks
Table 2.3 ZFX siRNA target information
Table 3.1 ChIP-seq read depth for each cell line
Table 3.2 ChIP-seq library complexity parameters for each cell line
Table 3.3 ChIP-seq enrichment parameters for each cell line
Table 3.4 ChIP-seq IDR parameters for each cell line
Table 3.5 Top 4 preferred motifs in C42B
Table 3.6 Top 4 preferred motifs in MCF7
Table 3.7 Top 4 preferred motifs in HCT116
Table 4.1 ZFX siRNA qPCR primer information
Table 4.2 RNA-seq total reads and Tophat 2 Alignment rate
Table 4.3 Many downregulated genes have ZFX bound to their promoter regions
Table 4.4 Top 4 motifs of direct target sites in C42B
Table 4.5 Top 4 motifs of direct target sites in MCF7
Table 4.6 The top 30 most downregulated genes in C42B
Table 4.7 The top 30 most upregulated genes in C42B
Table 4.8 The top 30 most downregulated genes in MCF7
Table 4.9 The top 30 most upregulated genes in MCF7
Table 4.10 Top disease and biological functions of differentially expressed genes upon ZFX knockdown in C42B
Table 4.11 Top disease and biological functions of differentially expressed genes upon ZFX knockdown in MCF7
Table 5.1 Chemotherapeutic drugs that target the DNA damage response pathway
Table 5.2 Chemotherapeutic drugs that target death receptor signaling pathway

Abstract

High expression of the transcription factor ZFX has been linked to increased proliferation and tumorigenesis in multiple types of malignant tumors. In addition, ZFX overexpression is correlated with poor patient survival in colorectal, gallbladder, and renal cancers. However, the mechanism by which ZFX mediates transcriptional regulation has not been studied and ZFX target genes in humans are not known. I assisted with ChIP-seq assays in three cancer cell lines (derived from prostate, breast and colon cancers) to identify ZFX-binding sites throughout the human genome. Using stringent quality control metrics, I identified ~9 thousand binding sites in each cell type. Interestingly, the binding patterns were very similar in all cell types, with ~65% of the ZFX-binding sites being located within +/- 2 kb of a transcription start site.
To determine if ZFX is responsible for regulation of the promoters to which it is bound, I performed RNA-seq analysis after knockdown of ZFX by siRNA in C42B prostate cancer cells, identifying 911 upregulated and 1236 downregulated genes. Interestingly, 515 (41%) of the downregulated genes have ZFX-binding sites in their promoter regions whereas only 101 (11%) of the upregulated genes have ZFX-binding sites in their promoter regions. Similar knockdown studies were performed in MCF7 breast cancer cells; I found that 181 (36%) out of 509 downregulated genes but only 29 (13%) out of 218 upregulated genes have ZFX-binding sites in their promoter regions. Taken together with the fact that ZFX binds to promoters that have active histone modifications, my results suggest that ZFX may act as a transcriptional activator for ~500 target genes in prostate cancer cells and ~200 target genes in breast cancer cells. To determine if ZFX regulates the same genes and pathways in different cancer types, Ingenuity Pathway Analysis of ZFX-regulated genes was performed. I found that the BRCA1 DNA repair pathway was inhibited in the ZFX knockdown prostate cancer cells and the death receptor signaling pathway was inhibited in the ZFX knockdown breast cancer cells. Interestingly, both of these pathways have known chemotherapeutic targets, which may be useful for tumors overexpressing ZFX.
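As a worked check of the proportions quoted in the abstract, the short Python sketch below recomputes the fraction of differentially expressed genes that have a ZFX-binding site in their promoter. The gene counts are taken from the abstract; the names `COUNTS` and `bound_fraction` are illustrative only and are not part of the thesis analysis.

```python
# Gene counts quoted in the abstract: (genes with a promoter ZFX site, all genes in the group).
COUNTS = {
    ("C42B", "down"): (515, 1236),
    ("C42B", "up"): (101, 911),
    ("MCF7", "down"): (181, 509),
    ("MCF7", "up"): (29, 218),
}


def bound_fraction(bound, total):
    """Fraction of genes in a group whose promoter carries a ZFX-binding site."""
    return bound / total


for cell_line in ("C42B", "MCF7"):
    down = bound_fraction(*COUNTS[(cell_line, "down")])
    up = bound_fraction(*COUNTS[(cell_line, "up")])
    # The ratio shows how much more often downregulated genes are bound than upregulated genes.
    print(f"{cell_line}: down {down:.1%}, up {up:.1%}, ratio {down / up:.1f}x")
```

The recomputed values agree with the quoted percentages to within about one percentage point, and in both cell lines the downregulated genes are bound by ZFX roughly three to four times as often as the upregulated genes, which is the observation behind the suggestion that ZFX acts as an activator.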
List of Abbreviations

Cas9      CRISPR-associated protein-9 nuclease
ChIP      Chromatin immunoprecipitation
ChIP-seq  ChIP sequencing
CRC       Colorectal cancer
CRISPR    Clustered regularly interspaced short palindromic repeats
DBD       DNA binding domain
ESC       Embryonic stem cells
gRNA      Guide RNA
GSA       Gene Specific Analysis
HSC       Hematopoietic stem cells
IDR       Irreproducible discovery rate
IHC       Immunohistochemistry
IPA       Ingenuity pathway analyses
KD        Knock down
NLS       Nuclear localization sequence
NRF       Non-Redundant Fraction
NSC       Normalized strand cross-correlation coefficient
PCR       Polymerase chain reaction
RNA-seq   RNA sequencing
RSC       Relative strand cross-correlation coefficient
siRNA     Small (or short) interfering RNA
TAD       Transcriptional activation domain
TSS       Transcription start site
ZFX       Zinc finger protein, X-Linked

Chapter 1 Introduction

1.1 Expression of Zinc Finger Protein, X-Linked (ZFX), correlates with tumorigenesis and poor patient survival

Previous studies have revealed that ZFX is overexpressed in multiple types of malignant tumors and that ZFX is required for the self-renewal of embryonic stem cells and hematopoietic stem cells (Galan-Caridad JM et al., 2007). Several studies have implicated ZFX in multiple human cancers, including prostate cancer, breast cancer, colorectal cancer, glioma (Zhou Y et al., 2011), renal carcinoma (Fang Q et al., 2014), gastric cancer (Nikpour P et al., 2012), gallbladder adenocarcinoma (Weng et al., 2015), non-small cell lung carcinoma (Li 2013) and laryngeal squamous cell carcinoma (Fang J et al., 2012). In these studies, it was shown that high expression of ZFX is linked to tumorigenesis and that knocking down ZFX can significantly suppress cellular proliferation and increase the proportion of apoptotic cells. In addition, high ZFX expression has been shown to correlate with poor survival of cancer patients.
For example, ZFX expression is significantly related to histological grade (P<0.001) in gallbladder adenocarcinoma, and patients who survived less than 1 year were found to have significantly higher ZFX expression than patients who survived more than 1 year (Weng et al., 2015). Taken together, these studies suggest that ZFX may function as an oncogene and serve as a potential therapeutic intervention target for cancer patients. Our laboratory has previously focused on the molecular mechanisms that drive prostate, breast, and colon cancers (Rhie SK et al., 2016, Lay FD et al., 2015) and therefore I describe in more detail below what is known about the role of ZFX in these 3 cancer types.

In an early study of ZFX and ZFY (a highly related protein on the Y chromosome), researchers found that ZFX transcripts were not detected in normal hypertrophic prostate tissue using Northern analysis. However, in a prostate adenocarcinoma, high levels of the 8.0 and 6.3 kb ZFX transcripts were present. In addition, RT-PCR demonstrated that 20 of 31 high-grade prostate tumors expressed ZFX and/or ZFY transcripts (Tricoli et al., 1993). Another study used immunohistochemical (IHC) staining to show that prostate cancer tissues exhibit significantly higher ZFX expression than benign prostatic hyperplasia and adjacent tissues (Figure 1.1A). Moreover, siRNA-mediated knockdown of ZFX showed that reduction of ZFX levels effectively suppresses the cellular proliferation of PC-3 prostate cancer cells and significantly reduces the number of colonies in colony forming assays. Furthermore, ZFX knockdown affected cell cycle progression and led to apoptosis in PC-3 cells (Jiang H et al., 2012).

A study that investigated the role of ZFX in human breast cancer suggests that ZFX plays a key role in breast cancer development (Yang H et al., 2014). Immunohistochemistry showed that expression of ZFX is higher in more advanced invasive breast cancers.
Also, ZFX is overexpressed in multiple breast cancer cell lines, and the proliferation rate of breast cancer cells was significantly suppressed over time after a lentivirus expressing shRNAs against ZFX was introduced into MCF-7 and MDA-MB-231 breast cancer cells (Figure 1.1B). Moreover, ZFX silencing results in cell cycle arrest in the G0/G1 phase in breast cancer cells (Yang H et al., 2014).

Studies of ZFX in colorectal cancer (CRC) have revealed that ZFX plays an important role in CRC tumorigenicity. High expression of ZFX promotes tumor growth and CRC patients with high ZFX expression have poorer overall and disease-free survival (Yan X et al., 2016). Also, CRC patients with higher ZFX expression exhibit a significantly shorter survival time (Jiang J et al., 2015) (Figure 1.1C). Moreover, knockdown of ZFX significantly suppressed proliferation and invasion of the CRC cell lines HCT116 and LoVo (Jiang J et al., 2015).

Figure 1.1 The influence of ZFX overexpression on prostate, breast and colon cancer. A. Expression levels of ZFX are higher in prostate cancer tissues. ZFX expression was analyzed by immunohistochemistry in prostate cancer tissue, adjacent tissue and benign prostatic hyperplasia tissues. Data are representative of 45 human prostate specimens and 16 benign tissues (Jiang H et al., 2012). B. Effect of ZFX silencing on the proliferation of breast cancer cells. The proliferation rate was analyzed with the MTT assay in MCF-7 cells using lentiviruses to express shRNA against ZFX (Yang H et al., 2014). C. Association between ZFX expression and overall survival in colorectal cancer (CRC) patients. The 90 CRC patients were stratified into ZFX high and ZFX low expression groups according to the immunostaining intensity of ZFX (Jiang J et al., 2015).

1.2 ZFX gene structure and protein structure

As described above, ZFX is overexpressed in various tumor types and associated with increased cell proliferation and poor survival in cancer patients.
However, the mechanism by which ZFX influences cancer initiation or progression has not been well-studied. Clues to possible modes of action come from analysis of its gene structure. ZFX, also known as ZNF926, is located on the X chromosome and is structurally similar to ZFY, which is located on the Y chromosome. The full length ZFX protein contains an acidic transcriptional activation domain (TAD), a nuclear localization sequence (NLS) and a DNA binding domain (DBD) consisting of 13 C2H2-type zinc fingers (Figure 1.2). Overall protein homology is 92% between ZFX and ZFY, with the zinc finger domain having 97% homology. Interestingly, the ZFX gene escapes X inactivation (Schneider-Gadicke et al., 1989).

Figure 1.2 ZFX gene and protein structure. A. ZFX is located at chrXp21.3 and is transcribed in the forward strand direction. The full length coding region, from the start codon to the stop codon, is about 67 Kbp. B. Shown is the exon/intron structure of ZFX. There are 7 transcript variants in Gencode V19, encoding 4 isoforms of the ZFX protein. Variants 1, 2 and 3 (NM_003410.3, NM_001178084.1, NM_001178085.1) encode the same isoform. Exons 1 to 6 in isoform 1 include the 5' UTR and encode the N-terminal acidic domain. Exon 10 encodes the C-terminal zinc finger-containing domain for isoforms 1 and 2 and includes the 3' UTR (Schneider-Gadicke et al., 1989a). C. ZFX isoform 1 (NP_001171555.1), the longest isoform, contains 805 amino acids and consists of an acidic transcriptional activation domain (TAD), a nuclear localization sequence (NLS), and a DNA binding domain (DBD). D. The DBD of isoform 1 consists of 13 C2H2-type zinc fingers.

1.3 ZFX might act as transcriptional regulator

ZFX, similar to many ZNF proteins, contains an acidic transcriptional activation domain and likely functions as a transcription factor (Schneider-Gadicke et al., 1989).
Studies in mouse embryonic stem cells (ESC) and adult hematopoietic stem cells (HSC) showed that ZFX directly activates common target genes in ESC and HSC, as well as ESC-specific target genes including the ESC self-renewal regulators Tbx3 and Tcl1 (Galan-Caridad et al., 2007). The deletion of ZFX impaired stem cell self-renewal, whereas ZFX overexpression facilitated ESC self-renewal by opposing differentiation. This study suggests that ZFX acts as a transcriptional regulator for self-renewal of both types of stem cells. Because ZFX acts as a transcriptional regulator in ESC, one study focusing on the core transcriptional network in ESC performed ZFX ChIP-seq in mouse embryonic stem cells and derived ZFX consensus sequence motifs using a de novo motif-discovery algorithm (Chen X et al., 2008). This analysis provides the ZFX motif used in the further studies described below.

Support for the hypothesis that ZFX is a transcriptional regulator also comes from analysis of the SET (SE Translocation) gene (Xu S et al., 2016). The authors found that ZFX binds to and transactivates the SET promoter in HeLa cells. The Genomatix software (Munich, Germany; http://genomatix.de/cgi-bin/matinspector_prof/mat_fam.pl.) and JASPAR database (Copenhagen, Denmark; http://jaspar.binf.ku.dk/cgi-bin/jaspar_db.pl.) were used to predict transcription factor binding sites. Based on cis-elements in the SET core promoter, they predicted that Sp1, E2F1, E2F3, E2F4, EGR1, and ZFX might control SET expression. By transfection with plasmids expressing these transcription factors, they found that ZFX appeared to account for a large portion of the SET promoter activity. Also, the SET proximal promoter region was shown to contain four predicted ZFX-binding sites (Figure 1.3A), identified using the mouse ZFX motif (discussed above). This suggests that these sites may play a role in the ZFX-mediated regulation of the SET gene.
Mutagenesis studies indicated that the ZFX-binding motif located closest to the transcription start site (Site 4) accounts for most of the ZFX-mediated transactivation (Figure 1.3B and 1.3C). Also, siRNA-mediated knockdown confirmed the significance and specificity of the ZFX-mediated SET promoter activation. Chromatin immunoprecipitation results verified that ZFX was able to bind to the native SET promoter in HeLa cells.

Figure 1.3 ZFX can transactivate the SET promoter. A. Positions and sequences of cis-elements in the SET core promoter (−157/+47). The transcription start site is indicated by an arrow. B. Structure of the SET2 promoter; individual ZFX motifs were mutated (indicated by filled circles) and used in promoter reporter assays. C. HeLa cells were transfected with ZFX or a control vector. Significantly increased reporter activity was found in P5 and M1-3, but not M4, upon ZFX overexpression.

Taken together, these data (Xu S et al., 2016) provide strong evidence for a significant role of the fourth ZFX site in ZFX-mediated regulation of the SET promoter.

Although a few isolated mechanistic studies of ZFX have been performed, to date a genome-wide characterization of ZFX in human cancers has not been accomplished. Therefore, to identify the ZFX binding patterns in 3 different tumor cell lines (derived from colon, prostate, and breast cancers), I used the Chromatin Immunoprecipitation (ChIP) assay and analyzed ChIP-seq data using the ENCODE ChIP-seq analysis pipeline. To further understand the role of ZFX in transcriptional regulation, I reduced the levels of ZFX by siRNA and performed RNA-seq to identify genes responsive to changes in levels of ZFX. Using Ingenuity Pathway Analyses (IPA), I found that the BRCA1 DNA repair pathway was inhibited in the ZFX knockdown prostate cancer cells and the death receptor signaling pathway was inhibited in the ZFX knockdown breast cancer cells.
Interestingly, both these pathways have known chemotherapeutic targets which may be useful for tumors overexpressing ZFX. 8 Chapter 2 Materials and Methods 2.1 Cell Culture C42B, MCF7 and HCT116 human cell lines were cultured in the corresponding medium (Corning Cellgro) shown in Table 2.1 with 10% fetal bovine serum (Gibco by Life Technologies) and 1% penicillin/ streptomycin at 37℃ with 5% CO 2 . The identity of each cell line was confirmed using STR markers and the cell stocks were shown to be free of mycoplasma. Cells used in Chromatin Immunoprecipitation (ChIP) assays and cells treated with siRNAs were harvested at 80%-90% confluence. Table 2.1 Cell line information 2.2 Antibody Validation The ZFX antibody was validated according to ENCODE standards (https://www.encodeproject.org/documents/c7cb0632-7e5f-455e-9119- 46a54f160711/@@download/attachment/ENCODE_Approved_May_2016_TF_Antibody%20C haracterization_Guidelines.pdf). This involves first demonstrating that the antibody recognizes a protein of the correct size (and not other proteins) on a Western blot and that knockdown of the transcript reduces the band on the Western blot. This validation must be performed in each cell line that is used for ChIP analysis. 
The step-by-step protocol that I used for antibody validation is: Label name ZFX target Strand Primer Sequence (5' -> 3') DIS3L_2 positive Forward GCTTGGCTAACCAGCTCTCA Reverse CTGTGTCAAGCTTCTGCACG LRRC41_1 positive Forward CGGTCGCTTAGTCAGTTTGG Reverse CAGATTGGAGAGCGAGGGAA NSMAF_2 positive Forward CAGGATCCGACCTCACACAC Reverse TGGCCAACAGATTGGTGGTT ZFX_3 positive Forward CTCTGAAACACGGGTACATAGG Reverse GGAGGGAGATGAGCAAAGTT CDH1_UP negative Forward CTGCCATAAGGAAACCTGGA Reverse GCATCACTGGGGAAAAGAAA HOXA13 negative Forward CCCAAGAACCAGTCCAAGAA Reverse TGGTTCTTCAGCACCAACAC ZNF180_3' negative Forward TGATGCACAATAAGTCGAGCA Reverse TGCAGTCAATGTGGGAAGTC ZNF554_3' negative Forward CGGGGAAAAGCCCTATAAAT Reverse TCCACATTCACTGCATTCGT Cell Line Media ATCC # C42B RPMI 1640 NA MCF7 DMEM ATCC HTB-22™ HCT116 McCoy’s 5a ATCC CCL-247™ 9 1) Obtain nuclear extracts. Harvest cells at 80-90% confluence. Break open the cells and release the nuclei using Cell Lysis Buffer (5mM PIPES pH 8.0, 85mM KCl, Igepal 10µL/mL, add protease inhibitors immediately before use) and Nuclei Lysis Buffer (50mM Tris-Cl pH 8.1, 10mM EDTA, 1% SDS, add protease inhibitors immediately before use). If the nuclei do not lyse when the buffer is added or the solution is still viscous, then a 5 second pulse sonication using Sonic Dismembrator (Fisher Scientific, Model 100) should be performed. 2) Quantify the protein concentration in the nuclear extracts using the Qubit Protein Assay Kit (Q33212); make 40ug aliquots for the Western Blots. 3) Separate the proteins using an SDS-PAGE gel. To denature the proteins, heat a mixture of 5uL 4 ×Laemmili sample buffer, 10% 2-Mercaptoethanol ( βME) and 40ug nuclear extract at 95°C for 5mins. Load the denatured proteins, as well as 10uL of the protein markers (Precision Plus Protein™ All Blue Prestained Protein Standards, Cat#1610373), on a Precast SDS-PAGE gel (BIO RAD Mini-PROTEAN TGX Gels, Cat#456-9025). Run the gel at 100V for 90min. 
4) Transfer the proteins to a nitrocellulose sheet. Gel transfer is performed at 90V for 1 hour in a cold room. 5) Immunoblot and visualize antigen bands. Pre-block membranes with 1×TBST (1×TBS and 0.1% Tween-20) with 5% milk and incubate membranes with diluted primary antibody (ZFX Mouse mAb, CST#L28B6, Lot#1). The next day, wash the membranes and then perform a secondary antibody incubation for 1 hour at room temperature. Scan the membranes and save the image showing the bands detected by the ZFX antibody. Finally, incubate the membrane with the nucleoporin control antibody (BD Transduction Laboratories cat# 610497, Lot# 77287) for 1 hour at room temperature followed by the same secondary antibody incubation process. Scan the 10 membrane and save the final image showing the bands detected by the control antibody; this will serve as a loading control. 2.3 Chromatin Immunoprecipitation Library Construction and Sequencing ChIP-seq experiments were performed in duplicate for each cell line. A detailed protocol can be found at (O'Geen et al., 2011). Briefly, there are 6 critical steps in the ChIP Library construction process (Figure 2.1). 1) Grow cells in 245×245×20mm square culture dishes and crosslink cells when they reach 80%-90% confluence. It is critical to keep cells healthy and avoid overgrowth. Crosslink cells at room temperature with 1% formaldehyde for 10 minutes with agitation and stop the crosslinking reaction using glycine to a final concentration of 0.125M. Continue to agitate ≥5 min at room temperature. 2) Break open the cells and release the nuclei using Cell Lysis Buffer and Nuclei Lysis Buffer containing protease inhibitors. Sonicate cells at 4 °to achieve an average chromatin length of 500bp using the Diagenode BioRuptor Pico (Cat# B01060001) with 30 seconds on and 30 seconds off sonication cycles. 3) Save 500ng of sonicated chromatin for the Input. 
Add 30 µL (~4 µg) ZFX antibody (ZFX Mouse mAb, CST#L28B6, Lot#1) to 400 µg sonicated chromatin to precipitate the chromatin. Incubate on a rotating platform at 4 °C overnight.
4) Capture and wash the antibody-bound protein/DNA complexes using magnetic protein A/G beads (Pierce Protein A/G beads, Prod #88803). Elute the antibody/chromatin complexes using IP elution buffer.
5) Reverse the formaldehyde crosslinks in each ChIP sample and the saved Input using NaCl (approximately 0.6 M final concentration) at 67 °C overnight. Purify the DNA using a QIAGEN QIAquick PCR Purification Kit (Cat# 28104). Before proceeding to library construction, verify the ChIP enrichment by qPCR. Typically, a successful ChIP has more than 10-fold enrichment over input at positive target sites.
6) Construct a ChIP-seq library using the KAPA Hyper Prep Kit (KAPA Biosystems #KK8505).
   a. Convert ChIP DNA fragments to blunt-ended, 5’-phosphorylated DNA.
   b. Ligate NEXTflex™ DNA Barcodes adapters to DNA fragments.
   c. Clean up the fragment library using AMPure magnetic beads (Beckman Coulter Agencourt AMPure XP, A63881). Avoid letting the beads dry.
   d. Amplify half of the adaptor-modified DNA fragments using the following PCR protocol:
      98 °C for 45 sec
      10-15 cycles of:
         98 °C for 15 sec
         60 °C for 30 sec
         72 °C for 30 sec
      72 °C for 1 min
      10 °C hold
   e. Purify the library using AMPure magnetic beads and determine the library concentration using a Qubit dsDNA HS Assay Kit (Q32851) and/or a KAPA Quantification Kit (KK4844).
   f. Check the library quality using a BioAnalyzer (Agilent Technologies) and check the ChIP enrichment using multiple positive and negative target primers in qPCR. Positive, but not negative, targets should be at least 10-fold enriched over Input.
   g. Sequence high quality ChIP-seq libraries on a HiSeq2000 sequencing machine. Paired-end sequencing was performed at Stanford University and the ChIP-seq data analyzed in this dissertation were mapped to the hg19 version of the human genome using BWA (default parameters).
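The enrichment checks in steps 5 and 6f reduce to a small calculation on qPCR cycle thresholds. The following is a minimal sketch, assuming the common 2^ΔCt approach with roughly 100% primer efficiency and equal DNA input for ChIP and Input reactions; the function name and all Ct values are hypothetical, and the normalization to a negative-control region follows the description in Section 2.4 rather than a formula given in this thesis.

```python
from statistics import mean


def fold_enrichment(ct_chip, ct_input, ct_chip_neg, ct_input_neg):
    """Return ChIP/Input enrichment at a target site, normalized to a negative site.

    Each argument is a list of replicate Ct values (for example a qPCR triplicate).
    Assumes ~100% primer efficiency, so one Ct cycle equals a 2-fold difference.
    """
    # ChIP relative to Input at the target site (log2 scale)
    delta_target = mean(ct_input) - mean(ct_chip)
    # The same quantity at the negative-control site
    delta_negative = mean(ct_input_neg) - mean(ct_chip_neg)
    return 2.0 ** (delta_target - delta_negative)


# Hypothetical Ct values, for illustration only
enrichment = fold_enrichment(
    ct_chip=[24.0, 24.0, 24.0],
    ct_input=[28.0, 28.0, 28.0],
    ct_chip_neg=[27.0, 27.0, 27.0],
    ct_input_neg=[28.0, 28.0, 28.0],
)
print(f"fold enrichment: {enrichment:.1f}")
print("passes 10-fold check:", enrichment >= 10)
```

With these made-up numbers the target site is 8-fold enriched relative to the negative site, so it would fall short of the 10-fold guideline.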
Figure 2.1 Chromatin Immunoprecipitation flow chart. A schematic of the major steps involved in ChIP-seq is shown; see O’Geen et al., 2011 for details.

2.4 qPCR analyses

Quantitative real-time PCR was performed using ChIP and library preparations. When I began my experiments, I did not have a positive target site and could not perform the standard enrichment checks. Therefore, the first ChIP library was sequenced without enrichment checks. I then identified positive and negative target sites by visually inspecting ZFX peaks on the genome browser using the first set of ZFX ChIP-seq data. Positive target primers were designed within regions of high enrichment, whereas negative target primers were designed within regions of almost no enrichment; 4 positive target sites and 4 negative target sites were chosen to check the enrichment in subsequent ChIP-seq experiments (Figure 2.2). Quantitative real-time PCR was performed using SsoFast EvaGreen Supermix (BIO-RAD, Cat. No. 1725201) and SYBR reagent. The average cycle threshold was determined from the triplicate reactions of each sample. The amount of DNA was normalized using one of the negative controls; positive targets should be at least 10-fold enriched over input. See Table 2.2 for qPCR primer information for positive and negative targets.

[Figure 2.2 chart values. ZFX ChIP in MCF7 cells; y-axis: Enrichment / Input, normalized to the CDH1_Up negative region. DIS3L-2 (+) 3.62; LRRC41_F1/R1 (+) 35.14; NSMAF-2 (+) 9.21; ZFX-3 (+) 6.54; HOXA13 (-) 0.55; ZNF180-3' (-) 1.40; ZNF554-3' (-) 0.72]

Table 2.2 qPCR primer sequences for ChIP enrichment checks.
Label name    ZFX target    Strand     Primer Sequence (5' -> 3')
DIS3L_2       positive      Forward    GCTTGGCTAACCAGCTCTCA
DIS3L_2       positive      Reverse    CTGTGTCAAGCTTCTGCACG
LRRC41_1      positive      Forward    CGGTCGCTTAGTCAGTTTGG
LRRC41_1      positive      Reverse    CAGATTGGAGAGCGAGGGAA
NSMAF_2       positive      Forward    CAGGATCCGACCTCACACAC
NSMAF_2       positive      Reverse    TGGCCAACAGATTGGTGGTT
ZFX_3         positive      Forward    CTCTGAAACACGGGTACATAGG
ZFX_3         positive      Reverse    GGAGGGAGATGAGCAAAGTT
CDH1_UP       negative      Forward    CTGCCATAAGGAAACCTGGA
CDH1_UP       negative      Reverse    GCATCACTGGGGAAAAGAAA
HOXA13        negative      Forward    CCCAAGAACCAGTCCAAGAA
HOXA13        negative      Reverse    TGGTTCTTCAGCACCAACAC
ZNF180_3'     negative      Forward    TGATGCACAATAAGTCGAGCA
ZNF180_3'     negative      Reverse    TGCAGTCAATGTGGGAAGTC
ZNF554_3'     negative      Forward    CGGGGAAAAGCCCTATAAAT
ZNF554_3'     negative      Reverse    TCCACATTCACTGCATTCGT

Cell Line    Media         ATCC #
C42B         RPMI 1640     NA
MCF7         DMEM          ATCC HTB-22™
HCT116       McCoy’s 5a    ATCC CCL-247™

Figure 2.2 An example of ChIP enrichment checks. ChIP was performed in MCF7 breast cancer cells. The enrichment at the four positive binding sites (DIS3L, LRRC41, NSMAF, ZFX3) was higher than the enrichment at the negative sites (HOXA13, ZNF180, ZNF554); the values were normalized to the CDH1 negative region. At least one of the positive sites showed more than 10-fold enrichment in the ChIP samples from MCF7 cells; note that the positive controls were chosen based on ChIP-seq data from C42B cells.

2.5 ChIP-seq data processing

The following tools were used to analyze the ChIP-seq data.

BedGraph. The HOMER makeTagDirectory script was used to create tag directories. The makeUCSCfile script with “-res 1 -fsize 5e7” was used to make bedGraph files for UCSC genome browser visualization.
Genome TSS. The UCSC “Table Browser” with “Gencode Genes V19” and output format “all fields from selected table” was used to download all human TSS sites.
Tag density plots. The HOMER annotatePeaks.pl script with “-hist 20 -size 4000” was used to plot ChIP-seq tag density relative to the center of TSS +/- 2kb.
Homer motif searches. The HOMER findMotifsGenome.pl script was used to identify motifs enriched in the ZFX binding sites.
Parameter “-len 45” was used since ZFX has 13 zinc fingers.
Overlap peaks among different cell lines. The HOMER mergePeaks script with parameters “-d given -prefix filename -venn filename” was used to identify overlapping peaks.
Motif density plots. The HOMER annotatePeaks.pl script with “-hist 20 -size 4000 -m <motif file>” was used to plot motifs in relation to the center of TSS +/- 2kb. The motif files that I used are from the HOMER known motifs.
Venn Diagram. Venn diagrams were made in R using the VennDiagram library.

2.6 siRNA knockdown and RNA-seq library construction

To characterize the role of ZFX in gene regulation, I knocked down the ZFX mRNA in C42B prostate cancer cells and MCF7 breast cancer cells using siRNAs and then analyzed the transcriptome using RNA-seq.

1) Knockdown ZFX by siRNA
   a. Plate ~10^5 cells/well in 6-well plates and incubate cells at 37 °C in 5% CO2 overnight.
   b. Transfect cells with either ZFX siRNA (Dharmacon, ON-TARGETplus Human ZFX (7543) siRNA - SMARTpool, 5 nmol, Cat No. L-006572-00-0005) or control siRNA (Dharmacon, ON-TARGETplus Non-targeting Pool, Cat No. D-001810-10-05); see Table 2.3 for the sequences of the ZFX siRNAs and Figure 2.3 for the targeting locations of the ZFX siRNAs. DharmaFECT 3 Transfection Reagent (Dharmacon, Cat No. T-2003-01) was used for the C42B transfections and DharmaFECT 1 Transfection Reagent (Dharmacon, Cat No. T-2001-01) was used for the MCF7 transfections. Both the ZFX siRNA and control siRNA transfections were performed in biological triplicate. Cells were incubated for 24 hours and transfected again with the same siRNA concentration; the incubation was continued for an additional 24 hours.
   c.
Extract RNA from transfected cells with TRIzol® Reagent (Thermo Fisher Scientific, Cat. No. 15596018). Check the RNA quality using the RNA 6000 Nano kit (Agilent Technologies, Cat. No. 50671511) and the 2100 Bioanalyzer (Agilent Technologies, Cat. No. G2939AA). PCR analysis was used to demonstrate that the siRNA treatment resulted in reduced ZFX mRNA levels prior to proceeding with RNA-seq (see Results).
   d. Construct RNA-seq libraries using KAPA Stranded mRNA-Seq kits (KAPA Biosystems, Cat. No. KK8421). Check the library quality using High Sensitivity DNA kits (Agilent Technologies, Cat. No. 50674626) on the BioAnalyzer.

Table 2.3 ZFX siRNA target information

Figure 2.3 ZFX siRNA target sites. There are 7 splice variants of ZFX in Gencode V19. The variant ENST00000379177.1, shown here, is the most abundant one in C42B prostate cancer cells and MCF7 breast cancer cells. The 4 siRNAs correspond to positions within exon 7, exon 9, and exon 11 of the ZFX gene.

2.7 RNA-seq data processing

Partek NGS data analysis software was used to analyze the RNA-seq data. TopHat2 was used within the Partek platform to align the reads, and the differentially expressed genes were identified using the Partek Gene Specific Analysis (GSA) algorithm.
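Partek GSA is a commercial algorithm and its internals are not described here. Purely to illustrate the comparison being made (triplicate siZFX samples versus triplicate siCtrl samples), the sketch below computes a log2 fold change per gene from normalized expression values. This is a simplified stand-in and not the GSA method; the gene names, expression values, and pseudocount are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(treated, control, pseudocount=1.0):
    """Log2 ratio of mean treated to mean control expression.

    A pseudocount is added to both means so that genes with zero
    expression in one condition do not cause a division by zero.
    """
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


# Hypothetical normalized expression values for three replicates per condition
expression = {
    "GENE_A": {"siZFX": [7.0, 7.0, 7.0], "siCtrl": [31.0, 31.0, 31.0]},
    "GENE_B": {"siZFX": [15.0, 15.0, 15.0], "siCtrl": [15.0, 15.0, 15.0]},
}

changes = {
    gene: log2_fold_change(values["siZFX"], values["siCtrl"])
    for gene, values in expression.items()
}

for gene, lfc in changes.items():
    print(f"{gene}: log2 fold change = {lfc:+.2f}")
```

A fold change alone does not establish significance; a real analysis, such as the one performed in Partek, also models replicate variability and corrects for multiple testing.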
Cell line    Treatment    Sample Name          Total Reads    Total Alignment Rate
C4-2B        siCtrl       C4-2B_siCTRL_rep1    39,352,578     89.98%
C4-2B        siCtrl       C4-2B_siCTRL_rep2    37,889,627     90.06%
C4-2B        siCtrl       C4-2B_siCTRL_rep3    43,184,055     89.90%
C4-2B        siZFX        C4-2B_siZFX_rep1     43,747,503     88.89%
C4-2B        siZFX        C4-2B_siZFX_rep2     39,640,681     89.43%
C4-2B        siZFX        C4-2B_siZFX_rep3     38,565,814     87.79%
MCF7         siCtrl       MCF7_siCtrl_rep1     48,009,367     89.29%
MCF7         siCtrl       MCF7_siCtrl_rep2     43,215,478     89.28%
MCF7         siCtrl       MCF7_siCtrl_rep3     48,232,039     89.61%
MCF7         siZFX        MCF7_siZFX_rep1      46,499,259     88.97%
MCF7         siZFX        MCF7_siZFX_rep2      44,882,433     89.67%
MCF7         siZFX        MCF7_siZFX_rep3      40,748,068     90.04%

siRNA target               Sequence               Location
siRNA target sequence-1    UGAAAUCGCUGACGAAGUU    chrX:24226342-24226360
siRNA target sequence-2    GAAUGACCAUGGACACAGA    chrX:24225482-24225500
siRNA target sequence-3    GCAACAUGCUAGUUACUUU    chrX:24229708-24229726
siRNA target sequence-4    CCAAGUAGUAGUUGUUUAA    chrX:24230419-24230437

[Figure 2.3 image: genome browser view (hg19, chrX:24,225,000-24,231,000, Xp22.11) with tracks for the 4 siRNA targets, RefSeq Genes (ZFX), and Common SNPs(144); labels: Exon 7, Exon 9, Exon 11]

Chapter 3 Characterizing binding patterns for ZFX in different tumor cell lines

3.1 ENCODE TF Antibody Characterization

For each transcription factor ChIP-seq antibody, a primary and a supporting secondary antibody characterization should be performed. The primary characterization can be either a standard Western blot or an IP-Western. In this case, a Western blot was chosen. The current standard is that the major detected band should be within 20% of the size predicted from the size of the coding region. The ZFX protein is about 135 kDa and the antibody binds to the ZFX N-terminus at about aa 400 (in exon 9).
As shown in Figure 3.1, the ZFX antibody meets expectations for the primary characterization. Three cancer cell lines (boxed in red) were chosen for ChIP-seq experiments.

Figure 3.1 ENCODE primary antibody validation. The ZFX protein is about 135 kDa and the ZFX antibody detected a protein of this size, verifying the validity of the ZFX antibody. The Nucleoporin p62 antibody was used as a loading control; the p62 protein band is shown below the main image.

In addition to a primary characterization, a secondary characterization method is required to support successful immunoblot data. ENCODE suggests using 1 of 4 different secondary characterization methods; I used siRNA against the mRNA of the target protein (ZFX) to verify the antibody specificity. C42B cells were transfected with 50 nM control siRNAs or 50 nM ZFX-targeting siRNAs in duplicate. According to ENCODE standards, for siRNA knockdown characterization, the band signal should be reduced by at least 50% relative to the control. The ZFX signal was significantly reduced in the siZFX transfections, but not in the siCtrl transfections, verifying that the band observed on the prior Western blot corresponds to the ZFX protein (Figure 3.2).

Figure 3.2 ENCODE secondary antibody validation. Duplicate siRNA treatments of C42B cells were performed; 40 µg of nuclear extract was analyzed on the left side of the gel and 56 µg of nuclear extract was analyzed on the right side of the gel. A significantly reduced signal in the siZFX lanes can be seen using both amounts of nuclear extract. The Nucleoporin p62 antibody was used as a loading control; the p62 Western blot image is shown below the main image.

3.2 Creation of high quality, duplicate ZFX ChIP-seq datasets in 3 different cancer cell lines

ChIP-seq assays were performed using at least 2 biological replicates for 3 cancer cell lines (C42B prostate cancer cells, MCF7 breast cancer cells, and HCT116 colon cancer cells).
ChIP-seq data was analyzed using the ENCODE ChIP-seq pipeline to perform read mapping, check the quality metrics, and identify reproducible peaks (Figure 3.3). HOMER bedGraph files were generated to visually inspect peaks on the UCSC genome browser (Figure 3.4).

Figure 3.3 ChIP-seq flow chart

Figure 3.4 ZFX ChIP-seq binding patterns. For each cell line, a UCSC browser track of one of 2 replicates of the ChIP-seq datasets is shown for a ~15 Mb region of chromosome 1. Robust peaks over background verified that the ChIP-seq experiments were successful.

[Figure 3.4 image: UCSC genome browser view of chr1 (hg19, 1p36.23-p36.12; scale marks at 10,000,000, 15,000,000 and 20,000,000) with RefSeq Genes, Common SNPs(144) and CpG tracks; ChIP-seq tracks labeled ZFX_C42B, ZFX_MCF7 and ZFX_HCT116]
[Genome browser screenshot; tracks shown: ZFX ChIP-seq for C42B, MCF7, HCT116 and HEK293T cells and INPUT for HCT116 and HEK293T cells (track scale 0-20), plus LNCaP CTCF tracks.]

I further analyzed the ChIP-seq datasets to determine whether they pass the thresholds of the ENCODE quality control standards. Because my project used H3K27Ac (an enhancer mark) ChIP-seq data previously generated in the lab by Yu (Phoebe) Guo, the quality metrics for that mark are also included here. The H3K27Ac ChIP-seq libraries from MCF7 and HCT116 were single-end sequenced, so only Read 1 is available.

1) Read depth. For transcription factor experiments and narrow-peak histone marks such as H3K27Ac, it is recommended that each replicate have a minimum of ~20 million usable fragments. As shown in Table 3.1, all ZFX datasets meet this criterion, as do the H3K27Ac datasets with the exception of HCT116 replicate A (USC149_H3K27Ac_RepA; ~13.5 million reads).

2) Library Complexity. Library complexity is measured using the Non-Redundant Fraction (NRF) and PCR Bottlenecking Coefficients 1 and 2 (PBC1 and PBC2). The preferred values are NRF > 0.9, PBC1 > 0.9, and PBC2 > 10; all ZFX datasets meet these thresholds (Table 3.2).

Table 3.1 ChIP-seq read depth for each cell line.
| Cell line | Mark | Replicate ID | Uniquely Mapped PF Reads (Read 1) | Uniquely Mapped PF Reads (Read 2) |
|---|---|---|---|---|
| C42B | ZFX | USC835_ZFX_RepA | 47,263,928 | 47,067,279 |
| C42B | ZFX | USC1049_ZFX_C42B_RepB | 55,358,418 | 51,973,903 |
| C42B | INPUT | USC844_INPUT_RepA | 27,979,461 | 27,417,288 |
| C42B | INPUT | USC851_INPUT_RepB | 27,182,668 | 26,768,261 |
| C42B | H3K27Ac | USC1054_H3K27Ac_C42B_RepA | 25,769,585 | 23,476,507 |
| C42B | H3K27Ac | USC1055_H3K27Ac_C42B_repB | 41,976,903 | 39,249,270 |
| HCT116 | ZFX | USC1040_ZFX_HCT116_RepA | 42,687,533 | 39,922,819 |
| HCT116 | ZFX | USC990_ZFX_HCT116_RepB | 68,307,628 | 66,012,919 |
| HCT116 | INPUT | USC1046_INPUT_HCT116_RepA | 24,470,542 | 17,928,109 |
| HCT116 | INPUT | USC987_INPUT_HCT116_RepB | 50,852,409 | 49,037,255 |
| HCT116 | H3K27Ac | USC149_H3K27Ac_RepA | 13,524,166 | NA |
| HCT116 | H3K27Ac | USC224_H3K27Ac_RepB | 22,014,646 | NA |
| MCF7 | ZFX | USC841_ZFX_RepA | 61,103,108 | 60,692,136 |
| MCF7 | ZFX | USC1050_ZFX_MCF7_RepB | 55,154,361 | 52,046,539 |
| MCF7 | INPUT | USC845_INPUT_RepA | 22,137,158 | 22,037,245 |
| MCF7 | INPUT | USC846_INPUT_RepB | 26,769,785 | 26,620,796 |
| MCF7 | H3K27Ac | USC139_H3K27Ac_Rep1 | 29,397,051 | NA |
| MCF7 | H3K27Ac | USC141_H3K27Ac_Rep2 | 26,955,690 | NA |

3) Enrichment. The Normalized Strand Cross-correlation coefficient (NSC) and the Relative Strand Cross-correlation coefficient (RSC) are measures of enrichment that do not depend on prior determination of enriched regions; higher values indicate more enrichment. It is recommended that NSC values be greater than 1.05 and that RSC values be greater than 1. All ZFX datasets have RSC > 1, and all but two ZFX ChIP-seq datasets have NSC > 1.05 (USC835 and USC990 are slightly lower, but in each case the other replicate has NSC > 1.05); see Table 3.3.

Table 3.2 ChIP-seq library complexity parameters for each cell line.
| Cell line | Mark | Replicate ID | NRF | PBC1 | PBC2 |
|---|---|---|---|---|---|
| C42B | ZFX | USC835_ZFX_RepA | 0.96 | 0.96 | 24.59 |
| C42B | ZFX | USC1049_ZFX_C42B_RepB | 0.94 | 0.94 | 15.45 |
| C42B | INPUT | USC844_INPUT_RepA | 0.98 | 0.98 | 50.68 |
| C42B | INPUT | USC851_INPUT_RepB | 0.99 | 0.99 | 69.14 |
| MCF7 | ZFX | USC841_ZFX_RepA | 0.96 | 0.96 | 22.86 |
| MCF7 | ZFX | USC1050_ZFX_MCF7_RepB | 0.95 | 0.95 | 20.59 |
| MCF7 | INPUT | USC845_INPUT_RepA | 0.98 | 0.98 | 47.87 |
| MCF7 | INPUT | USC846_INPUT_RepB | 0.98 | 0.98 | 47.53 |
| HCT116 | ZFX | USC1040_ZFX_HCT116_RepA | 0.94 | 0.94 | 15.65 |
| HCT116 | ZFX | USC990_ZFX_HCT116_RepB | 0.94 | 0.94 | 15.94 |
| HCT116 | INPUT | USC1046_ZFX_INPUT_RepA | 0.96 | 0.96 | 25.57 |
| HCT116 | INPUT | USC987_ZFX_INPUT_RepB | 0.96 | 0.96 | 27.00 |

Table 3.3 ChIP-seq enrichment parameters for each cell line.

| Cell line | Mark | Replicate ID | NSC | RSC |
|---|---|---|---|---|
| C42B | ZFX | USC835_ZFX_RepA | 1.03 | 1.56 |
| C42B | ZFX | USC1049_ZFX_C42B_RepB | 1.11 | 1.89 |
| C42B | INPUT | USC844_INPUT_RepA | 1.09 | 2.80 |
| C42B | INPUT | USC851_INPUT_RepB | 1.02 | 1.73 |
| MCF7 | ZFX | USC841_ZFX_RepA | 1.06 | 1.77 |
| MCF7 | ZFX | USC1050_ZFX_MCF7_RepB | 1.16 | 2.31 |
| MCF7 | INPUT | USC845_INPUT_RepA | 1.10 | 3.71 |
| MCF7 | INPUT | USC846_INPUT_RepB | 1.07 | 3.10 |
| HCT116 | ZFX | USC1040_ZFX_HCT116_RepA | 1.18 | 1.78 |
| HCT116 | ZFX | USC990_ZFX_HCT116_RepB | 1.02 | 1.41 |
| HCT116 | INPUT | USC1046_ZFX_INPUT_RepA | 1.13 | 3.00 |
| HCT116 | INPUT | USC987_ZFX_INPUT_RepB | 1.01 | 1.56 |

4) Assessment of reproducibility for biological replicates (IDR). Ideal ChIP-seq data should have a Self-consistency Ratio < 2.0 and a Rescue Ratio < 2.0. As shown in Table 3.4, the C42B, MCF7, and HCT116 ChIP-seq datasets pass these criteria.

Table 3.4 ChIP-seq IDR parameters for each cell line.

| Dataset | Rescue Ratio | Self-consistency Ratio |
|---|---|---|
| C42B_ZFX | 1.20 | 1.61 |
| MCF7_ZFX | 1.43 | 1.34 |
| HCT116_ZFX | 1.23 | 1.92 |

3.3 ZFX binds to many of the same promoter regions in different cancer cell types.

After determining that the sequenced ChIP-seq libraries were of high quality, I used MACS2 (https://github.com/taoliu/MACS) to call peaks and identified a set of significant and reproducible peaks for each dataset using IDR (Irreproducible Discovery Rate). The IDR analysis assesses the rank consistency of identified peaks between replicates and outputs the number of peaks that pass a user-specified reproducibility threshold (Landt SG et al., 2012). IDR analysis indicated that ~9000 peaks were reproducible. Therefore, I merged the two replicates, called peaks, and used the top 9000 peaks for further characterization.

To determine the binding pattern of ZFX relative to gene structure and active enhancer regions, I overlapped the genomic locations of the 9000 ZFX peaks with the genomic locations of all known human transcription start sites and with H3K27Ac ChIP-seq data from the same cell type. I found that, in contrast to many other transcription factors, ZFX does not bind mostly to distal elements. Rather, most ZFX binding sites are localized in promoter regions. For C42B, ~65% of the ZFX binding sites (5777 out of 9000) are localized in promoter regions, whereas only ~32% of binding sites correspond to distal enhancers (as determined by comparing the ZFX peaks to H3K27Ac peaks) and ~3% are located at other distal regions (Figure 3.5). To determine if one category of peaks has higher peak heights than the other categories, I compared the peak heights of promoter-bound peaks, distal enhancer-bound peaks, and peaks bound to other regions. The ranking pattern reveals that promoter-bound sites have slightly higher peaks than the enhancer-bound sites but that both of these categories are much higher than the “other” sites.

Figure 3.5 ZFX ChIP-seq peak location analysis. Promoter regions are defined as +/- 2kb from a transcription start site (TSS) and distal H3K27Ac sites were identified using ChIP-seq data from the appropriate cell line. The number of peaks for each ChIP-seq dataset is shown on top of each bar graph. The median peak height for each set of peaks is shown as purple numbers within the bars.

As shown above in Figure 3.4, the binding pattern of ZFX is visually very similar between different cancer cell lines. To quantitatively compare the binding patterns in the different cell lines, I used Homer mergePeaks to identify the overlapping and non-overlapping peaks between different cell lines (Figure 3.6).
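The comparison of peak sets between cell lines can be illustrated with a short sketch. The code below is not HOMER's mergePeaks implementation; it is a minimal Python example, assuming each peak is a (chromosome, start, end) tuple and that two peaks count as "common" when their intervals overlap by at least one base. The function name `split_common_unique` and the toy coordinates are hypothetical.

```python
from collections import defaultdict


def split_common_unique(peaks_a, peaks_b):
    """Split peaks_a into peaks that overlap any peak in peaks_b and peaks that do not.

    Peaks are (chrom, start, end) tuples with half-open coordinates [start, end).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    common, unique = [], []
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        hit = any(start < b_end and b_start < end for b_start, b_end in by_chrom[chrom])
        (common if hit else unique).append((chrom, start, end))
    return common, unique


# Toy peak sets (hypothetical coordinates, not taken from the thesis data).
c42b = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chrX", 50, 250)]
mcf7 = [("chr1", 250, 450), ("chrX", 400, 600)]

common, unique = split_common_unique(c42b, mcf7)
print(len(common), "common;", len(unique), "unique to the first set")
```

The all-against-all scan within each chromosome is adequate for peak sets of a few thousand entries; for much larger sets, sorting the intervals and sweeping through them once would be the more usual choice.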
Approximately 5842 common ZFX binding sites (65%) were identified in C42B, MCF7 and HCT116. I further compared the ZFX peaks in the C42B and MCF7 cell lines. I found that 6381 common ZFX binding sites (70%) were identified in both C42B and MCF7. There are 4385 (77%) common promoter-bound peaks, 1426 (49%) common distal enhancer-bound peaks and 27 (11%) common peaks bound in other regions of the genome. Therefore, the similar ZFX binding patterns observed in the different cell lines arise primarily because ZFX binds to the same promoter regions in the different cell types.

[Figure 3.5 bar chart, "Location analysis of ZFX binding sites". Categories: TSS+/-2KB, distal H3K27Ac, others; series: C42B, MCF7, HCT116. Values as extracted, peak counts: 5777, 2931, 259; 5305, 2124, 951; 5970, 2805, 198. Median peak heights: 43, 46, 85; 34, 42, 72; 14, 24, 40.]

One interpretation of this analysis might be that the overlapping peaks are the reproducible peaks. However, it is important to note that each of the non-overlapping peaks is from a robust and reproducible dataset for that particular cell line. To further determine whether there really is a set of cell type-specific ZFX peaks, I calculated the median peak height in the sets of common peaks and “unique” peaks. I found that the “unique” peaks have a lower median peak height than the common peaks. This suggests that the “unique” peaks may simply have been below the cut-off for peak calling in one of the cell lines (e.g. the peak may have been just above the cut-off in C42B cells but just below the cut-off in MCF7 cells). To investigate further, I created a heat map comparing C42B and MCF7 peaks (Figure 3.7). I found that there are no significantly different signal blocks at the C42B “unique” peaks in MCF7 cells and vice versa. Therefore, it appears that ZFX binds to basically the same sites in the different cell types.

Figure 3.6 Overlapping ZFX peaks in different cell lines.
The yellow circles represent C42B ZFX peak sets, the blue circles represent MCF7 ZFX peak sets, and the grey circle represents HCT116 ZFX peak sets. The sizes of the circles are roughly proportional to the number of peaks. Purple numbers refer to the median peak height.

Figure 3.7 Heat map of all C42B and MCF7 binding sites. The left panel is all ZFX binding sites in C42B and MCF7 cells. The middle panel is “unique” ZFX binding sites in C42B versus MCF7, clustering “unique” binding sites in C42B. The right panel is “unique” ZFX binding sites in C42B versus MCF7, clustering “unique” binding sites in MCF7. The window size is +/- 1kb from the peak center.

3.4 ZFX binds downstream of the TSS in promoter regions

From the analyses performed above, I knew that most ZFX binding sites are within +/- 2kb of a TSS. However, the previous analyses did not determine whether there is a preferred distance from the TSS for ZFX binding. To further characterize ZFX binding sites, I generated a tag density plot for all ZFX binding sites in C42B relative to the TSS of the nearest gene. Interestingly, I found that most binding sites are located downstream of the nearest TSS (Figure 3.8A). I then calculated the distance from the summit of each peak to the nearest TSS; the average distance is about 240 bp, which is consistent with the tag density plot. ZFX binds to its own promoter region, and the binding site is shown in Figure 3.8B as an example of a site located downstream of +1.

Figure 3.8 Location analysis of ZFX binding sites relative to the nearest TSS in C42B cells. A. The X axis is the distance from the center of the TSS. Tag density (y axis) was plotted within +/- 2kb of the nearest TSS; the summit of the peak is about 200-300 bp downstream from the TSS. B. Shown is the ZFX peak at the ZFX promoter region.

3.5 ZFX binding motifs are similar in C42B, MCF7 and HCT116 cells.
To determine the sequence motif to which ZFX prefers to bind, I performed a motif analysis using the HOMER program and the findMotifsGenome script. The top 2 identified motifs in C42B cells are essentially the same (AGGCCTAG) (Table 3.5), and are similar to a motif that was identified in a ChIP-seq experiment performed previously using an antibody to mouse ZFX in mES cells (Chen X et al., 2008). Interestingly, the top motif was also identified in a ChIP-seq assay using an antibody to ZNF711 in SHSY5Y human neuroblastoma cells. C2H2 zinc finger proteins comprise the largest class of site-specific DNA-binding proteins encoded in the human genome. Of the 2000 predicted DNA-binding transcription factors, ~900 contain C2H2 zinc finger domains. The ZNFs have arisen through gene duplication followed by mutation and thus a given ZNF is highly related to a set of other ZNFs.
A comparison of ZFX to other proteins using the Treefam program (http://www.treefam.org/family/TF335557#tabview=tab1) revealed that, as expected, ZFX is highly related to ZFY (Palmer et al., 1990); see Figure 3.9. Interestingly, the next most highly related protein is ZNF711. This suggests that ZFX and ZNF711 probably have very similar, if not identical, DNA binding domains, which would explain why the same motif is obtained in ChIP-seq experiments using the human ZFX and the human ZNF711 antibodies. I used the NCBI protein BLAST tool to determine the homology between ZFX and ZNF711. There is 55% identity between the entire ZFX and ZNF711 proteins, with the zinc finger domains having 87% identity.

I also found that the top 2 most preferred ZFX motifs for C42B, MCF7 and HCT116 are the same (Tables 3.5, 3.6 and 3.7), which is not surprising since many of the same sites are bound in all 3 cell types. To determine if the promoter-located sites had the same motifs as the non-promoter sites, I performed the motif analysis using only the non-promoter-bound ZFX sites from C42B cells. I found that the top 4 motifs in non-promoter-bound binding sites are the same as those identified in “all” binding sites and that the percentage of sites containing each motif is also very high (58.93%, 40.03%, 31.03%, and 24.33%, respectively). This provides more evidence that the non-promoter binding sites are in fact direct ZFX binding sites.
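The "% of target sequences with motif" values reported for these motif analyses can be illustrated with a simplified sketch. HOMER scores position weight matrices; the Python example below instead assumes an exact match to the consensus string AGGCCTAG (the motif quoted above) on either strand, so it only approximates the idea. The function names and the toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def fraction_with_motif(sequences, motif):
    """Fraction of sequences containing the motif on either strand."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = sum(1 for s in sequences if motif in s.upper() or rc in s.upper())
    return hits / len(sequences) if sequences else 0.0


# Toy peak sequences (hypothetical, not taken from the thesis data).
peaks = [
    "TTAGGCCTAGCA",  # motif on the forward strand
    "GGCTAGGCCTTT",  # reverse complement of the motif (CTAGGCCT)
    "ACGTACGTACGT",  # no match
    "CCCCAGGCCTAG",  # motif on the forward strand
]

pct = 100 * fraction_with_motif(peaks, "AGGCCTAG")
print(f"{pct:.2f}% of target sequences contain the motif")
```

Running the same function separately on promoter and non-promoter peak sequences would give the kind of side-by-side percentages compared in this section, under the exact-match assumption stated above.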
Table 3.5 Top 4 preferred motifs in C42B

| Rank | Motif name | % of target sequences with motif |
|---|---|---|
| 1 | ZNF711(Zf)/SHSY5Y-ZNF711-ChIP-Seq(GSE20673)/Homer | 57.40% |
| 2 | ZFX(Zf)/mES-Zfx-ChIP-Seq(GSE11431)/Homer | 36.82% |
| 3 | AP-2gamma(AP2)/MCF7-TFAP2C-ChIP-Seq(GSE21234)/Homer | 31.77% |
| 4 | AP-2alpha(AP2)/Hela-AP2alpha-ChIP-Seq(GSE31477)/Homer | 25.29% |

Table 3.6 Top 4 preferred motifs in MCF7

| Rank | Motif name | % of target sequences with motif |
|---|---|---|
| 1 | ZNF711(Zf)/SHSY5Y-ZNF711-ChIP-Seq(GSE20673)/Homer | 54.27% |
| 2 | ZFX(Zf)/mES-Zfx-ChIP-Seq(GSE11431)/Homer | 34.16% |
| 3 | Elk4(ETS)/Hela-Elk4-ChIP-Seq(GSE31477)/Homer | 15.96% |
| 4 | Elk1(ETS)/Hela-Elk1-ChIP-Seq(GSE31477)/Homer | 15.93% |

Table 3.7 Top 4 preferred motifs in HCT116

| Rank | Motif name | % of target sequences with motif |
|---|---|---|
| 1 | ZNF711(Zf)/SHSY5Y-ZNF711-ChIP-Seq(GSE20673)/Homer | 56.81% |
| 2 | ZFX(Zf)/mES-Zfx-ChIP-Seq(GSE11431)/Homer | 37.35% |
| 3 | Elk4(ETS)/Hela-Elk4-ChIP-Seq(GSE31477)/Homer | 15.85% |
| 4 | AP-2alpha(AP2)/Hela-AP2alpha-ChIP-Seq(GSE31477)/Homer | 24.89% |
Figure 3.9 Treefam gene tree analysis of ZFX.

Chapter 4 Functional analysis of ZFX in human cancer cells

4.1 Many genes downregulated upon knockdown of ZFX are direct ZFX targets

To determine the role of ZFX in cancer, I wanted to identify genes regulated by ZFX. I attempted to inactivate the ZFX gene using CRISPR/Cas9 technology but could not identify any homozygous knockouts (see Figure 5.1 in the next chapter). Therefore, I used siRNA to knock down the levels of ZFX RNA and then performed RNA-seq to analyze effects on the transcriptome. I performed siRNA-mediated ZFX knockdown in C42B and MCF7 cells. Cells were transfected in triplicate with control or ZFX siRNAs and knockdown efficiency was checked by qPCR before RNA-seq library construction (Table 4.1 and Figure 4.1).

Table 4.1 ZFX siRNA qPCR primer information

| Name | Direction | Primer sequence |
|---|---|---|
| ZFX.mRNA.v1 | Forward (5'->3') | TGAGCTGTGCTTTACGCTGG |
| ZFX.mRNA.v1 | Reverse (5'->3') | CCCATCTTCATCCATGGCCT |
| ZFX.mRNA.v3 | Forward (5'->3') | GTGCTCCAGAGAAAGGCCG |
| ZFX.mRNA.v3 | Reverse (5'->3') | CTTTCCCAGCGTAAAGCACAG |
| ZFX.mRNA.v5 | Forward (5'->3') | TGGGCAGCAGCTTATGGTAATA |
| ZFX.mRNA.v5 | Reverse (5'->3') | TGCGCCATGGAACTCGTG |

Figure 4.1 Knockdown of ZFX by siRNA in C42B and MCF7. By using 3 different primer sets (see Table 4.1), I measured expression of different ZFX variants in control and ZFX siRNA-treated cells. A. C42B cells were treated with 100 nM siZFX (a mixture of 4 siRNAs against ZFX mRNA was used) or 100 nM siCtrl in triplicate. B. MCF7 cells were treated with 70 nM siZFX or 70 nM siControl in triplicate.

Having demonstrated that ZFX mRNA was reduced in the ZFX siRNA-treated cells, I proceeded with making libraries for RNA-seq.
I analyzed the RNA-seq data using the Tophat2 program; the total alignment rate for the RNA-seq libraries ranged from approximately 88% to 90%, demonstrating high quality sequencing data (Table 4.2).

Table 4.2 RNA-seq total reads and Tophat2 alignment rate

Using the Partek GSA algorithm, I found that C42B cells express 18781 genes. To identify genes showing expression changes upon reduction of ZFX levels, I chose an FDR cutoff of 0.05 and a fold-change cutoff of 1.5. Using these values, I identified 991 upregulated genes and 1282 downregulated genes. Genes identified as responsive to changes in the level of a transcription factor include both direct target genes and genes that are in downstream signaling pathways regulated by the direct target genes (i.e. indirect targets). One approach to identifying direct target genes is to determine which of the deregulated genes have ZFX binding sites in their promoter regions. I found that 515 downregulated genes and 101 upregulated genes have ZFX binding sites within 2 kb of their TSS (Table 4.3). There are 104 ZFX binding sites associated with the 101 upregulated genes, indicating that most of these genes have one bound ZFX. However, there are 556 ZFX binding sites associated with the 515 downregulated genes, indicating that some of the downregulated genes have more than one ZFX bound to their promoter regions.
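The promoter-overlap approach described above can be sketched in a few lines. This is a minimal Python illustration, not the analysis pipeline used in this work: it assumes a table of per-gene results with an FDR and a signed fold change (negative for genes that go down after ZFX knockdown), one TSS position per gene, and a list of peak summit positions. The thresholds (FDR 0.05, fold change 1.5, 2 kb window) come from the text; the data structures, function names, and toy values are hypothetical.

```python
FDR_CUTOFF = 0.05
FOLD_CUTOFF = 1.5
PROMOTER_WINDOW = 2000  # bp on either side of the TSS


def classify_genes(results):
    """Return (up, down) gene lists passing the FDR and fold-change cutoffs.

    results maps gene -> (fdr, signed_fold_change); -2.0 means 2-fold down.
    """
    up = [g for g, (fdr, fc) in results.items() if fdr < FDR_CUTOFF and fc >= FOLD_CUTOFF]
    down = [g for g, (fdr, fc) in results.items() if fdr < FDR_CUTOFF and fc <= -FOLD_CUTOFF]
    return up, down


def promoter_bound(genes, tss, summits):
    """Genes with at least one peak summit within the promoter window of their TSS."""
    bound = []
    for gene in genes:
        chrom, pos = tss[gene]
        if any(c == chrom and abs(s - pos) <= PROMOTER_WINDOW for c, s in summits):
            bound.append(gene)
    return bound


# Toy inputs (hypothetical genes and coordinates, not taken from the thesis data).
results = {
    "geneA": (0.01, -2.3),
    "geneB": (0.20, -3.0),   # fails the FDR cutoff
    "geneC": (0.03, 1.8),
    "geneD": (0.001, -1.6),
}
tss = {
    "geneA": ("chr1", 10_000),
    "geneB": ("chr1", 50_000),
    "geneC": ("chr2", 7_000),
    "geneD": ("chr3", 90_000),
}
summits = [("chr1", 10_240), ("chr2", 30_000), ("chr3", 95_000)]

up, down = classify_genes(results)
direct = promoter_bound(down, tss, summits)
print("down:", down, "| candidate direct targets:", direct)
```

Counting summits per gene rather than testing for at least one would additionally show which promoters carry more than one ZFX site.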
| Cell line | siRNA | Sample Name | Total Reads | Total Alignment Rate |
|---|---|---|---|---|
| C4-2B | siCtrl | C4-2B_siCTRL_rep1 | 39,352,578 | 89.98% |
| C4-2B | siCtrl | C4-2B_siCTRL_rep2 | 37,889,627 | 90.06% |
| C4-2B | siCtrl | C4-2B_siCTRL_rep3 | 43,184,055 | 89.90% |
| C4-2B | siZFX | C4-2B_siZFX_rep1 | 43,747,503 | 88.89% |
| C4-2B | siZFX | C4-2B_siZFX_rep2 | 39,640,681 | 89.43% |
| C4-2B | siZFX | C4-2B_siZFX_rep3 | 38,565,814 | 87.79% |
| MCF7 | siCtrl | MCF7_siCtrl_rep1 | 48,009,367 | 89.29% |
| MCF7 | siCtrl | MCF7_siCtrl_rep2 | 43,215,478 | 89.28% |
| MCF7 | siCtrl | MCF7_siCtrl_rep3 | 48,232,039 | 89.61% |
| MCF7 | siZFX | MCF7_siZFX_rep1 | 46,499,259 | 88.97% |
| MCF7 | siZFX | MCF7_siZFX_rep2 | 44,882,433 | 89.67% |
| MCF7 | siZFX | MCF7_siZFX_rep3 | 40,748,068 | 90.04% |

| | Sequence | Location |
|---|---|---|
| siRNA target sequence-1 | UGAAAUCGCUGACGAAGUU | chrX:24226342-24226360 |
| siRNA target sequence-2 | GAAUGACCAUGGACACAGA | chrX:24225482-24225500 |
| siRNA target sequence-3 | GCAACAUGCUAGUUACUUU | chrX:24229708-24229726 |
| siRNA target sequence-4 | CCAAGUAGUAGUUGUUUAA | chrX:24230419-24230437 |

Given that ~41% of the downregulated genes have ZFX binding sites in their promoters whereas only ~11% of the upregulated genes have ZFX binding sites in their promoters, my results suggest that ZFX may act as a transcriptional activator for ~500 target genes in prostate cancer cells.

I also performed ZFX knockdown and RNA-seq analysis in MCF7 breast cancer cells using the same alignment and differential expression tools. I found that MCF7 expresses 16779 genes. Using the same cutoffs, I identified 218 upregulated genes and 509 downregulated genes. I found that 29 upregulated genes (13%) and 181 downregulated genes (36%) have ZFX binding sites within 2 kb of their TSS (Table 4.3).

Table 4.3 Many downregulated genes have ZFX bound to their promoter regions.

| Cell line | | Upregulated genes | Downregulated genes |
|---|---|---|---|
| C42B | Number of genes showing expression changes | 911 | 1236 |
| C42B | Subset of those genes with ZFX bound to their promoter | 101 (11%) | 515 (41%) |
| MCF7 | Number of genes showing expression changes | 218 | 509 |
| MCF7 | Subset of those genes with ZFX bound to their promoter | 29 (13%) | 181 (36%) |

4.2 Binding patterns of direct ZFX target genes

I have defined the set of downregulated genes that have ZFX bound to their promoter regions as ZFX direct target genes. I have further characterized the ZFX binding patterns in the set of direct target promoters. As shown above, the set of all ZFX binding sites showed an unusual location analysis, binding ~240 bp downstream of the TSS. However, because only a small percentage of the total number of ZFX binding sites may be functional (5%; 515 out of ~9000 sites in C42B cells), it was important to determine the location analysis of the functional sites. To study the binding patterns of direct ZFX target genes (i.e. downregulated genes that have ZFX bound to their promoter region), I generated a tag density plot. Interestingly, the binding sites in the direct target genes are more highly enriched at the location downstream of the TSS than is the set of “all ZFX” binding sites (Figure 4.2).

Figure 4.2 Most direct target sites are located ~240 bp downstream of the TSS. Shown are tag density plots of “all ZFX” binding sites (left panel) vs tag density plots for those ZFX binding sites in the promoters of the downregulated genes (right panel). Note that the scale of the y axis is different for the two panels.

[Figure 4.2 panels: "All binding sites in relation to TSS" (tag density 0 to 0.6) and "Direct target sites in relation to TSS" (tag density 0 to 4.5); x axis: distance from the center of the TSS, -2000 to +2000 bp.]

An example of a direct ZFX target gene is shown in Figure 4.3. DIS3L expression is reduced upon ZFX knockdown and ZFX binds ~240 bp downstream of the DIS3L TSS.
Interestingly, DIS3L has recently been shown to be a target of ZFX in a human medulloblastoma cell line (Palmer CJ et al., 2014).

Figure 4.3 UCSC browser screenshot of a direct target gene, DIS3L. A. A UCSC browser track displays RNA levels in control vs siRNA-treated C42B prostate cancer cells and the ZFX ChIP-seq data. B. A close-up of the ZFX binding site shows that the peak is 200-300 bp downstream of the TSS.

I next determined whether the same motif would be identified in the binding sites at the subset of direct ZFX target promoters. For C42B, the top 4 preferred motifs in direct target sites are exactly the same as in all binding sites, and the percentage of sites containing each preferred motif is slightly higher in direct target sites than in all binding sites (Table 4.4). For MCF7, the top 2 most preferred motifs in direct target sites are exactly the same as in all binding sites (Table 4.5).
Table 4.4 Top 4 motifs of direct target sites in C42B
Rank   Motif Name   % of target sequences with motif (direct target sites)   % of target sequences with motif (all binding sites)
1   ZNF711(Zf)/SHSY5Y-ZNF711-ChIP-Seq(GSE20673)/Homer   60.97%   57.40%
2   ZFX(Zf)/mES-Zfx-ChIP-Seq(GSE11431)/Homer   38.49%   36.82%
3   AP-2alpha(AP2)/Hela-AP2alpha-ChIP-Seq(GSE31477)/Homer   33.27%   25.29%
4   AP-2gamma(AP2)/MCF7-TFAP2C-ChIP-Seq(GSE21234)/Homer   39.75%   31.77%

Table 4.5 Top 4 motifs of direct target sites in MCF7
Rank   Motif Name   % of target sequences with motif (direct target sites)   % of target sequences with motif (all binding sites)
1   ZNF711(Zf)/SHSY5Y-ZNF711-ChIP-Seq(GSE20673)/Homer   66.15%   54.27%
2   ZFX(Zf)/mES-Zfx-ChIP-Seq(GSE11431)/Homer   39.49%   34.16%
3   ZNF692(Zf)/HEK293-ZNF692.GFP-ChIP-Seq(GSE58341)/Homer   8.21%   NA
4   AP-2gamma(AP2)/MCF7-TFAP2C-ChIP-Seq(GSE21234)/Homer   41.03%   NA

For C42B cells, the top 30 downregulated genes and the top 30 upregulated genes are shown below in Tables 4.6 and 4.7, respectively; the genes that have ZFX bound to their promoter regions are shown in red. 12 of the 30 most differentially downregulated genes are direct target genes, whereas none of the 30 most differentially upregulated genes are direct target genes.

Table 4.6 The top 30 most downregulated genes in C42B.
Rank   Gene Name   P-value (KD vs. WT)   FDR step up (KD vs. WT)   Fold change (KD vs. WT)
1   RP11-407G23.3   0.0253   0.0482   -11.5126
2   RP11-290L1.3   0.0175   0.0351   -8.3598
3   RP11-529K1.3   0.0149   0.0305   -6.5788
4   RP11-350O14.18   0.0013   0.0035   -6.2860
5   XXcos-LUCA11.5   0.0027   0.0068   -6.0080
6   CCDC177   0.0008   0.0022   -5.9672
7   GATSL1   0.0017   0.0044   -5.5971
8   AS3MT   0.0200   0.0394   -5.4736
9   RP11-96O20.4   0.0029   0.0072   -5.2889
10   LYRM9   0.0000   0.0000   -5.2708
11   RP4-734P14.4   0.0027   0.0067   -5.2533
12   CBLN3   0.0000   0.0001   -4.9352
13   ARNT2   0.0000   0.0000   -4.9211
14   SLC6A19   0.0016   0.0043   -4.9069
15   CTSZ   0.0000   0.0000   -4.8931
16   CEP112   0.0000   0.0000   -4.8821
17   RP11-229P13.23   0.0012   0.0033   -4.5786
18   SYDE1   0.0038   0.0091   -4.5395
19   MVB12B   0.0000   0.0000   -4.5266
20   CTB-54O9.9   0.0046   0.0108   -4.3864
21   ANKRD65   0.0000   0.0000   -4.3750
22   MAGI2   0.0000   0.0000   -4.3666
23   CTD-3214H19.4   0.0001   0.0005   -4.2288
24   KIAA2022   0.0000   0.0001   -4.1847
25   ARAP3   0.0000   0.0000   -4.1706
26   CTD-2116N17.1   0.0000   0.0000   -4.1357
27   CTD-3092A11.1   0.0000   0.0000   -3.9178
28   GHRHR   0.0007   0.0021   -3.8768
29   FAM149A   0.0000   0.0000   -3.8023
30   RP13-616I3.1   0.0000   0.0000   -3.7716

Table 4.7 The top 30 most upregulated genes in C42B.
Rank   Gene Name   P-value (KD vs. WT)   FDR step up (KD vs. WT)   Fold change (KD vs. WT)
1   HIST1H4J   0.0000   0.0000   15.8645
2   AKR1C1   0.0000   0.0000   14.3618
3   RP11-282O18.6   0.0000   0.0001   11.5480
4   AKR1B10   0.0000   0.0001   11.2040
5   HMOX1   0.0000   0.0000   9.2728
6   RP11-24N18.1   0.0001   0.0002   9.0391
7   DIO3   0.0000   0.0000   8.8522
8   AKR1C2   0.0000   0.0000   7.8845
9   AC119673.1   0.0037   0.0089   7.4546
10   FAM171B   0.0013   0.0035   7.0895
11   AC104534.3   0.0000   0.0000   6.9040
12   MSTN   0.0002   0.0006   6.3654
13   DMRT1   0.0007   0.0020   5.7297
14   FAM21D   0.0000   0.0000   5.5801
15   IFIT2   0.0000   0.0000   4.9057
16   CDSN   0.0027   0.0067   4.7845
17   RP11-216L13.17   0.0043   0.0102   4.6869
18   RP11-449P15.1   0.0244   0.0468   4.6547
19   LAMC2   0.0000   0.0000   4.6011
20   RP11-391L3.1   0.0003   0.0010   4.4885
21   GDF15   0.0000   0.0000   4.3099
22   AKR1C3   0.0012   0.0034   4.3061
23   MB21D2   0.0000   0.0000   4.2759
24   PGM5   0.0000   0.0000   4.2739
25   SNORD3A   0.0000   0.0001   4.1460
26   DISC1   0.0009   0.0024   4.0705
27   CTC-492K19.4   0.0001   0.0002   3.9748
28   ANGPT2   0.0000   0.0000   3.9335
29   RP11-862L9.3   0.0030   0.0075   3.8926
30   UGT1A6   0.0005   0.0016   3.8861

For MCF7 cells, the top 30 downregulated genes and the top 30 upregulated genes are shown in Table 4.8 and Table 4.9, respectively; the genes that have ZFX bound to their promoter regions are shown in red. 10 of the 30 most differentially downregulated genes are direct target genes, whereas only 1 (RAB4B-EGLN2) of the 30 most differentially upregulated genes is a direct target gene. Instead of having a binding site ~240 bp downstream, ZFX binds near the TSS of RAB4B-EGLN2.

Table 4.8 The top 30 most downregulated genes in MCF7.
Rank   Gene Name   P-value (KD vs. WT)   FDR step up (KD vs. WT)   Fold change (KD vs. WT)
1   WHAMMP2   0.0000   0.0003   -8.4307
2   AS3MT   0.0000   0.0000   -8.1591
3   RP11-108P20.1   0.0011   0.0062   -6.9844
4   LINC00284   0.0003   0.0022   -5.3726
5   SFT2D3   0.0001   0.0007   -5.1094
6   ZFX   0.0000   0.0000   -5.0582
7   NTF4   0.0035   0.0154   -4.5697
8   MAP6   0.0001   0.0010   -4.5012
9   AP000347.4   0.0076   0.0290   -3.8768
10   MAGEE1   0.0000   0.0001   -3.8737
11   RP11-341N2.1   0.0042   0.0179   -3.8679
12   MVB12B   0.0000   0.0000   -3.8559
13   RAB36   0.0000   0.0000   -3.3832
14   CASP16   0.0020   0.0101   -3.2825
15   HOXA-AS3   0.0063   0.0252   -3.2024
16   ECHDC3   0.0000   0.0000   -3.1859
17   RP11-566K11.5   0.0081   0.0306   -3.1310
18   FAM203B   0.0000   0.0004   -3.0587
19   AKAP7   0.0000   0.0001   -2.9795
20   WHAMMP3   0.0001   0.0008   -2.9726
21   FAM66C   0.0001   0.0006   -2.9672
22   CTD-2292P10.4   0.0001   0.0007   -2.9516
23   KIAA2022   0.0001   0.0012   -2.9177
24   FAM212A   0.0079   0.0299   -2.9141
25   ZDHHC8P1   0.0001   0.0006   -2.7997
26   RP11-82L18.2   0.0080   0.0302   -2.7586
27   RP11-67L2.2   0.0000   0.0000   -2.7428
28   FOXD2-AS1   0.0000   0.0003   -2.7394
29   FAXC   0.0000   0.0004   -2.7025
30   MAGI2   0.0007   0.0043   -2.7012

Table 4.9 The top 30 most upregulated genes in MCF7.
Rank   Gene Name   P-value (KD vs. WT)   FDR step up (KD vs. WT)   Fold change (KD vs. WT)
1   RP11-196G18.24   0.0009   0.0054   7.5227
2   GATSL1   0.0036   0.0161   6.7225
3   PALM2-AKAP2   0.0011   0.0064   5.7780
4   ASB3   0.0013   0.0070   4.8965
5   CTD-3105H18.14   0.0031   0.0142   4.5037
6   RP13-608F4.8   0.0027   0.0125   3.7895
7   RP4-548D19.3   0.0105   0.0376   3.7085
8   CYP1A1   0.0000   0.0002   3.6915
9   RAB4B-EGLN2   0.0018   0.0091   3.2597
10   MUC4   0.0007   0.0046   3.1053
11   LINC00341   0.0025   0.0120   2.9719
12   AC104134.2   0.0111   0.0392   2.8249
13   MYZAP   0.0024   0.0113   2.7836
14   HMOX1   0.0000   0.0000   2.7803
15   AC005077.12   0.0141   0.0476   2.7475
16   COLEC12   0.0098   0.0355   2.6267
17   CTD-3099C6.11   0.0085   0.0317   2.6253
18   CTC-499J9.1   0.0120   0.0419   2.5813
19   TLR6   0.0023   0.0109   2.5743
20   PEAR1   0.0000   0.0002   2.5346
21   KLHDC7B   0.0000   0.0000   2.5187
22   PTPRH   0.0000   0.0001   2.4559
23   FUT3   0.0020   0.0101   2.4507
24   NHLRC4   0.0033   0.0147   2.4001
25   PDLIM3   0.0008   0.0050   2.3510
26   CTD-2116N17.1   0.0001   0.0012   2.3344
27   CCNA1   0.0004   0.0031   2.2858
28   DYX1C1-CCPG1   0.0046   0.0193   2.2787
29   LINC00346   0.0035   0.0155   2.2676
30   ATF3   0.0000   0.0000   2.2633

As shown above, ~77% of the promoters (a total of 4386 promoters) bound by ZFX in C42B and MCF7 cells are the same (Figure 3.6). However, only a small percentage of bound promoters show changes in expression in the ZFX knockdown cells. To determine which, if any, of the bound promoters are commonly regulated in the two cell types, I calculated the percentage of common genes in all upregulated and downregulated genes versus in the subset of direct ZFX target genes between C42B and MCF7 (Figure 4.4). There are many more common genes in the sets of downregulated genes than in the sets of upregulated genes between the 2 cell lines, both for all downregulated genes and for the subset of those genes with ZFX bound to their promoters. This suggests that ZFX might regulate ~90 genes in both C42B and MCF7 by directly binding to their promoter regions; however, many of the ZFX-regulated genes are cell type-specific.
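The percentages of common genes reported in Figure 4.4 can be re-derived from the Venn diagram counts with a few lines of Python. This is a minimal sketch for illustration: the function name percent_shared and the dictionary layout are hypothetical, while the counts themselves are the C42B-only, common, and MCF7-only numbers shown in the figure. As in the figure, each percentage is expressed relative to the MCF7 gene set.

```python
def percent_shared(shared, mcf7_only):
    """Common genes as a percentage of the whole MCF7 set (common + MCF7-only)."""
    return 100.0 * shared / (shared + mcf7_only)


# (C42B-only, common, MCF7-only) gene counts from the Figure 4.4 Venn diagrams.
venn_counts = {
    "all downregulated genes": (998, 238, 271),
    "downregulated genes with ZFX-bound promoter": (425, 90, 91),
    "all upregulated genes": (903, 8, 210),
    "upregulated genes with ZFX-bound promoter": (94, 7, 22),
}

for category, (c42b_only, shared, mcf7_only) in venn_counts.items():
    pct = percent_shared(shared, mcf7_only)
    print(f"{category}: {shared} common genes, {pct:.0f}% of the MCF7 set")
```

Rounded to whole numbers, these ratios give 47%, 50%, 4%, and 24%, matching the values shown in the figure; the C42B-only and common counts in each category also sum to the C42B totals (1236, 515, 911, and 101).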
Figure 4.4 There are many more common downregulated genes than upregulated genes between C42B and MCF7. Yellow circles represent C42B gene sets and blue circles represent MCF7 gene sets. The size of the circles is roughly proportional to the number of genes. The percentage shown in the common area is the common gene number relative to the MCF7 gene number for each specific category.
Gene set   C42B only   Common   MCF7 only
All downregulated genes   998   238 (47%)   271
Downregulated genes with ZFX bound to their promoter   425   90 (50%)   91
All upregulated genes   903   8 (4%)   210
Upregulated genes with ZFX bound to their promoter   94   7 (24%)   22

4.3 Identification of top diseases and biological functions affected by knockdown of ZFX in C42B and MCF7 cells.

Using the Ingenuity Pathway Analysis (IPA) software, I performed Diseases and Functions analysis of the sets of all (direct and indirect) downregulated and upregulated genes in C42B and MCF7 cells. Differentially expressed genes upon ZFX knockdown in both C42B and MCF7 cells are linked to cancer, cellular function and maintenance, and the cell cycle, as would be expected for genes regulated by a transcription factor involved in tumorigenesis (Table 4.10 and Table 4.11).
Table 4.10 Top disease and biological functions of differentially expressed genes upon ZFX knockdown in C42B
Diseases and Disorders
Name   p-value range
Cancer   7.42E-04 - 1.98E-38
Organismal Injury and Abnormalities   7.80E-04 - 1.98E-38
Gastrointestinal Disease   7.42E-04 - 4.70E-29
Reproductive System Disease   6.40E-04 - 5.18E-11
Developmental Disorder   7.02E-04 - 7.30E-10
Molecular and Cellular Functions
Name   p-value range
Cellular Assembly and Organization   4.11E-04 - 1.73E-12
Cellular Function and Maintenance   7.55E-04 - 1.73E-12
Cell Cycle   7.02E-04 - 1.85E-10
DNA Replication, Recombination, and Repair   4.71E-04 - 3.09E-09
Cell Morphology   6.64E-04 - 3.34E-08

Table 4.11 Top disease and biological functions of differentially expressed genes upon ZFX knockdown in MCF7
Diseases and Disorders
Name   p-value range
Cancer   6.51E-03 - 5.02E-10
Organismal Injury and Abnormalities   6.51E-03 - 5.02E-10
Gastrointestinal Disease   6.49E-03 - 5.73E-08
Connective Tissue Disorders   3.96E-03 - 1.39E-06
Inflammatory Response   6.55E-03 - 1.39E-06
Molecular and Cellular Functions
Name   p-value range
Cellular Function and Maintenance   6.51E-03 - 6.85E-06
Cellular Compromise   3.96E-03 - 1.20E-05
Cell Death and Survival   6.49E-03 - 1.78E-05
Cellular Movement   6.49E-03 - 6.99E-05
Cell-To-Cell Signaling and Interaction   6.65E-03 - 8.59E-05

Chapter 5 Discussion and Future Directions

ZFX is upregulated in a variety of human cancers, and high expression of ZFX has been shown to correlate with poor patient survival. Therefore, I reasoned that inhibition of ZFX might be included in a “personalized medicine” approach to the treatment of cancer patients. However, transcription factors are very hard to inhibit directly (mainly because they lack enzymatic activity) and thus are not good therapeutic targets. Nevertheless, understanding the mechanism of action of ZFX and/or identifying key target genes or pathways regulated by ZFX may provide insight into the treatment of patients with high levels of ZFX in their tumors. Towards this goal, I performed a genome-wide functional analysis of the ZFX transcription factor in human cancer cell lines, using ChIP-seq and RNA-seq analyses. I identified ~9000 binding sites for ZFX in prostate, breast, and colon cancer cells. Interestingly, the majority of these binding sites were common to all 3 tumor types and were located in promoter regions. However, only a small percentage of bound promoters showed changes in expression upon ZFX knockdown. I also found that ZFX binds ~240 bp downstream of the TSS, an unusual location for a DNA-binding TF, and that it binds to an 8-nt motif. As mentioned in the introduction, ZFX has 13 zinc finger domains.
It is known that zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain can contact three adjacent base pairs of DNA through three key residues (Desjarlais JR et al., 1992). Therefore, the 13 zinc finger domains of ZFX could in principle bind a ~39 bp DNA motif (13 fingers x 3 bp per finger). Our lab has previously studied 6-finger ZNF proteins and has shown that these ZNFs prefer a 12-nt motif, representing binding of only 4 of the 6 fingers (Grimmer et al., 2014). My studies of ZFX are consistent with these previous studies, indicating that not all fingers of a multi-finger ZNF are involved in DNA sequence recognition. Finally, I identified several pathways that could possibly serve as chemotherapeutic targets for patients with high levels of ZFX.

Although my studies have provided new insights into the role of ZFX in transcriptional regulation and in cancer, they have also suggested new questions and areas of investigation.

What distinguishes a “functional” bound ZFX from a “non-functional” bound ZFX?

Although ZFX binds to ~5000-6000 promoters in a given cell type, knockdown of ZFX resulted in altered activity of only a small subset of these promoters; the widespread binding, but limited functional consequences, of ZFX is similar to what previous studies in our lab found for other ZNF proteins (Grimmer et al., 2014). To check that the lack of regulation is not due to a set of false-positive peaks, I compared the median peak heights in regulated vs non-regulated promoters and found that ZFX ChIP-seq peaks in the regulated promoters (median peak height: 50) and in the non-regulated promoters (median peak height: 43) are of similar height. One possibility to explain why ZFX regulates only a small percentage of the promoters to which it is bound is that ZFX binding may be redundant with other factors bound to those promoters. However, I could not identify any significant novel motifs for TFs in the set of ZFX-bound but not regulated promoters.
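The peak-height comparison described above (median height of peaks at regulated vs non-regulated promoters) can be sketched in a few lines of Python. This is only an illustration of the comparison, not the analysis code used for the thesis: the function name median_peak_heights, the gene labels, and the peak heights are hypothetical toy values, chosen so that the two medians equal the 50 and 43 quoted in the text.

```python
from statistics import median


def median_peak_heights(peak_heights, regulated_genes):
    """Return (median height at regulated promoters, median at non-regulated ones).

    peak_heights maps a gene name to the height of the ZFX peak at its promoter;
    regulated_genes is the set of genes whose expression changed upon knockdown.
    """
    regulated = [h for gene, h in peak_heights.items() if gene in regulated_genes]
    non_regulated = [h for gene, h in peak_heights.items() if gene not in regulated_genes]
    return median(regulated), median(non_regulated)


# Hypothetical toy data: promoter peak heights for six ZFX-bound genes.
toy_peak_heights = {
    "geneA": 62, "geneB": 50, "geneC": 41,  # regulated upon knockdown
    "geneD": 48, "geneE": 43, "geneF": 30,  # bound but not regulated
}
toy_regulated = {"geneA", "geneB", "geneC"}

regulated_median, non_regulated_median = median_peak_heights(toy_peak_heights, toy_regulated)
print(f"regulated: {regulated_median}, non-regulated: {non_regulated_median}")
```

Comparing medians rather than means keeps a few very tall peaks from dominating the comparison, which is one reason a median is a reasonable summary for peak heights.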
Another possibility is that peak location may play a role in distinguishing functional vs nonfunctional ZFX binding. Interestingly, the binding sites in the set of “regulated” promoters are more enriched at +240 than the binding sites in the set of “all” ZFX-bound promoters (see Figure 4.2). However, we do not yet know whether ZFX must actually bind at +240 relative to the TSS to regulate transcription. A follow-up experiment could be to perform promoter-reporter assays in which the ZFX binding site is moved upstream or downstream of its normal position, to see if the location affects regulation.

Finally, a third possibility is that the amount of ZFX protein remaining after siRNA treatment is sufficient to activate some of its target genes. I attempted to address this question using CRISPR-mediated knockout of the ZFX gene (Figure 5.1A). However, after screening a large number of colonies, I could not identify homozygous deletions (Figure 5.1B). This suggests that perhaps complete knockout of ZFX is lethal. Alternatively, I note that C42B cells are aneuploid, and it may simply have been difficult to achieve CRISPR-mediated deletion of all alleles. A future experiment could be to perform siRNA treatment of the partially deleted cells, to see if very low levels of ZFX can be achieved. If so, the RNA-seq experiments could be repeated to see if more genes that have ZFX bound to their promoters are affected.

Figure 5.1 CRISPR-mediated knockout of the ZFX gene. A. Guide RNAs designed to delete the 13 ZFX zinc finger domains are shown in the first track. To select the homozygous deletions, 3 sets of primers were designed, as shown in the bottom 3 tracks. The outer primers were designed to amplify the whole region, including the predicted gRNA deletion region. The inner left primers were designed to amplify the region flanking the left gRNA, whereas the inner right primers were designed to amplify the region flanking the right gRNA. B. Examples of clones screened for the homozygous deletion. Clone names are labelled at the bottom of the gel image. Each clone was screened using 3 sets of primers. Lane 1 is the outer primer amplification fragment; lanes 2 and 3 are the inner left primer amplification fragment and the inner right primer amplification fragment, respectively. A Cas9 vector control was also performed (the right clone). Only the clone L1R2-3 C2 (in red) has deletions of some alleles (as shown by the shift of the 2574 bp fragment to 1257 bp). However, not all alleles are deleted, since there are still fragments amplified by the inner left and inner right primers.

Does ZNF711 substitute for ZFX in MCF7 cells?

I noticed that fewer genes showed differential expression upon knockdown of ZFX in MCF7 cells (727 genes) than in C42B cells (2147 genes). However, this was not due to the relative reduction of ZFX mRNA in the two cell types. As shown in Figure 5.2A, the knockdown efficiency of ZFX was higher in MCF7 (~80%) than in C42B (almost 70%). Considering that ZFX is structurally homologous to ZNF711 and that the two transcription factors have very similar binding motifs, I checked the ZNF711 expression levels in both cell lines (Figure 5.2B). Interestingly, the ZNF711 expression level is much higher in MCF7 cells than in C42B cells. This suggests that the effect of ZFX knockdown might be reduced in MCF7 cells because ZNF711 might substitute for ZFX function in these cells. A future study of ZNF711 in MCF7 cells could test this hypothesis. Knocking down ZNF711 by siRNA followed by RNA-seq in both control and siZFX-treated cells would determine if ZFX and ZNF711 regulate the same set of genes.

Figure 5.2 ZFX and ZNF711 expression levels in C42B and MCF7 cells. The Y axis shows normalized reads, as a measure of expression level, and the X axis shows the different cancer cell types with different siRNA treatments. The average normalized reads of triplicates is labelled on top of each bar. A. ZFX expression levels in C42B and MCF7 cells, before and after treatment with ZFX siRNAs (C42B: control 56, knockdown 19; MCF7: control 35, knockdown 7). B. ZNF711 expression levels in C42B and MCF7 cells, before and after treatment with ZFX siRNAs (C42B: control 0.09, knockdown 0.2; MCF7: control 4.98, knockdown 6.54).

Could inhibition of ZFX-regulated pathways provide a therapeutic option?

As described above, I identified several hundred genes that were affected by knockdown of ZFX in prostate and breast cancer cells. To determine if these genes are involved in any specific pathways, I first performed a pathway analysis of the sets of all (direct and indirect) downregulated and upregulated genes in C42B cells using the Ingenuity Pathway Analysis software (Figure 5.3).

Figure 5.3 Top 5 related pathways upon ZFX knockdown. (A) Pathways identified upon knockdown of ZFX in C42B cells: RAR Activation, Molecular Mechanisms of Cancer, Role of BRCA1 in DNA Damage Response, Mitotic Roles of Polo-like Kinase, and Estrogen-mediated S-phase Entry. (B) Pathways identified upon knockdown of ZFX in MCF7 cells: Unfolded protein response, Death Receptor Signaling, Hepatic Fibrosis / Hepatic Stellate Cell Activation, Role of JAK family kinases in IL-6-type Cytokine Signaling, and Antioxidant Action of Vitamin C. Bars show -log(p-value). Blue bars indicate that the pathways were inhibited (the darker the blue, the higher the significance of inhibition); grey bars indicate that the direction of pathway change, inhibition or activation, is not known based on the IPA database. Yellow square dots represent the ratio of pathway members identified in my experiments to the total number of members in the specific pathway.

Upon ZFX knockdown in C42B cells, the BRCA1 DNA damage response was shown to be statistically significantly inhibited (high -log(p-value)) in the IPA analysis (Figure 5.3).
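Because the pathway bars in Figure 5.3 are plotted as -log(p-value), a smaller p-value gives a taller bar. The short Python sketch below illustrates that transformation and a ranking by significance. It is a hedged illustration only: the pathway labels and p-values are hypothetical placeholders rather than values from the IPA output, base-10 logarithms are assumed, and the 0.05 cutoff is a conventional threshold that I assume here for the example.

```python
import math


def neg_log10(p_value):
    """Convert a p-value into the -log10 scale used on pathway bar charts."""
    return -math.log10(p_value)


# Hypothetical p-values for three placeholder pathways (not IPA results).
toy_pathways = {"pathway A": 0.0005, "pathway B": 0.02, "pathway C": 0.2}

# Rank pathways from most to least significant (largest -log10 p first).
ranked = sorted(toy_pathways, key=lambda name: neg_log10(toy_pathways[name]), reverse=True)

# Assumed conventional cutoff: p < 0.05, i.e. -log10(p) above about 1.3.
significant = [name for name in ranked if toy_pathways[name] < 0.05]

for name in ranked:
    print(f"{name}: p = {toy_pathways[name]}, -log10(p) = {neg_log10(toy_pathways[name]):.2f}")
```

On this scale a p-value of 0.001 corresponds to a bar height of 3, so each additional unit of bar height represents a tenfold smaller p-value.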
Therefore, I performed a more detailed analysis of the BRCA1 DNA damage response pathway (Figure 5.4). Interestingly, this pathway can be targeted by several existing drugs (Table 5.1). By applying these drugs, the BRCA1 DNA damage response pathway could be inhibited, which may be useful for tumors overexpressing ZFX.

Figure 5.4 ZFX-regulated genes in the BRCA1 DNA damage response pathway. Circles filled in red indicate that the section of the pathway is activated upon ZFX knockdown, and circles filled in green indicate that the section of the pathway is inhibited upon ZFX knockdown. Outlined purple circles indicate that the genes are differentially expressed upon ZFX knockdown.

Table 5.1 Chemotherapeutic drugs that target the DNA damage response pathway
Drug Name   Targets   Actions   Indications/Status
APR-246   TP53   inhibitor   esophageal carcinoma/Phase 1/Phase 2; hematological system tumor/Phase 1; high-grade serous ovarian cancer/Phase 1/Phase 2
AZD0156   ATM   inhibitor   cancer/Phase 1; metastasis/Phase 1
AZD6738   ATR   inhibitor   acute lymphocytic leukemia, type L3/Phase 1; acute lymphocytic leukemia/Phase 1; advanced gastric adenocarcinoma/Phase 1

I next performed pathway analysis of genes differentially regulated by knockdown of ZFX in MCF7 cells. As shown in Figure 5.3B, the death receptor signaling pathway was statistically significantly inhibited upon ZFX knockdown in MCF7. I also found drugs that target members of this signaling pathway and act as inhibitors, which may also be useful for tumors overexpressing ZFX (Table 5.2).

Table 5.2 Chemotherapeutic drugs that target the death receptor signaling pathway
Drug Name   Targets   Actions   Indications/Status
(-)-gossypol   BCL2   inhibitor   adrenal cortex carcinoma/Phase 2; adult Burkitt lymphoma/Phase 1; adult diffuse large-cell lymphoma/Phase 1
ABT-767   PARP1, PARP2   inhibitor   primary peritoneal cancer/Phase 1

IPA has a function that can be used to compare pathways enriched in different datasets. Therefore, I used this method to compare the pathways influenced by ZFX knockdown in C42B versus MCF7. Although the death receptor signaling pathway is also inhibited in C42B (Figure 5.5), it is not as statistically significant as the BRCA1 DNA damage response pathway, which is why it does not appear in the list of top canonical pathways (Figure 5.3A), which are ranked by p-value. Therefore, I would still recommend choosing the BRCA1 DNA damage response as a targetable pathway in prostate cancer and the death receptor signaling pathway as a targetable pathway in breast cancer.

Figure 5.5 Pathway comparison upon ZFX knockdown in C42B and MCF7.
Red squares indicate that the specific pathway is activated, blue squares indicate that the specific pathway is inhibited, and white squares indicate that the direction of pathway change, either activation or inhibition, is not clear. The two targetable pathways mentioned in the text are labelled in red. The two columns correspond to C42B and MCF7.

Future studies could include treating control and ZFX siRNA-treated cells with inhibitors of the BRCA1-mediated DNA repair and the death receptor signaling pathways to determine if cells expressing high levels of ZFX are sensitive to the inhibitors.

References

Chen X, Xu H, Yuan P, Fang F et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell. 2008 Jun 13;133(6):1106-17. PubMed PMID: 18555785.

Desjarlais JR, Berg JM. Toward rules relating zinc finger protein sequences and DNA binding site preferences. Proc Natl Acad Sci U S A. 1992 Aug 15;89(16):7345-9. PubMed PMID: 1502144; PubMed Central PMCID: PMC49706.

Fang J, Yu Z, Lian M, Ma H, Tai J, Zhang L, Han D. Knockdown of zinc finger protein, X-linked (ZFX) inhibits cell proliferation and induces apoptosis in human laryngeal squamous cell carcinoma. Mol Cell Biochem. 2012 Jan;360(1-2):301-7. doi: 10.1007/s11010-011-1069-x. Epub 2011 Oct 19. PubMed PMID: 22009483.

Fang Q, Fu WH, Yang J, Li X, Zhou ZS, Chen ZW, Pan JH. Knockdown of ZFX suppresses renal carcinoma cell growth and induces apoptosis. Cancer Genet. 2014 Oct-Dec;207(10-12):461-6. doi: 10.1016/j.cancergen.2014.08.007. Epub 2014 Sep 12. PubMed PMID: 25441684.

Fang X, Huang Z, Zhou W, Wu Q, Sloan AE, Ouyang G, McLendon RE, Yu JS, Rich JN, Bao S. The zinc finger transcription factor ZFX is required for maintaining the tumorigenic potential of glioblastoma stem cells. Stem Cells. 2014 Aug;32(8):2033-47. doi: 10.1002/stem.1730. PubMed PMID: 24831540; PubMed Central PMCID: PMC4349564.

Galan-Caridad JM, Harel S, Arenzana TL, Hou ZE, Doetsch FK, Mirny LA, Reizis B. Zfx controls the self-renewal of embryonic and hematopoietic stem cells. Cell. 2007 Apr 20;129(2):345-57. PubMed PMID: 17448993; PubMed Central PMCID: PMC1899089.

Grimmer MR, Farnham PJ. Can genome engineering be used to target cancer-associated enhancers? Epigenomics. 2014;6(5):493-501.

Harel S, Tu EY, Weisberg S, Esquilin M, Chambers SM, Liu B, Carson CT, Studer L, Reizis B, Tomishima MJ. ZFX controls the self-renewal of human embryonic stem cells. PLoS One. 2012;7(8):e42302. doi: 10.1371/journal.pone.0042302. Epub 2012 Aug 3. PubMed PMID: 22879936; PubMed Central PMCID: PMC3411758.

Jiang H, Zhang L, Liu J, Chen Z, Na R, Ding G, Zhang H, Ding Q. Knockdown of zinc finger protein X-linked inhibits prostate cancer cell proliferation and induces apoptosis by activating caspase-3 and caspase-9. Cancer Gene Ther. 2012 Oct;19(10):684-9. doi: 10.1038/cgt.2012.53. Epub 2012 Aug 17. PubMed PMID: 22898899.

Jiang J, Liu LY. Zinc finger protein X-linked is overexpressed in colorectal cancer and is associated with poor prognosis. Oncol Lett. 2015 Aug;10(2):810-814. Epub 2015 Jun 10. PubMed PMID: 26622575; PubMed Central PMCID: PMC4509075.

Jiang M, Xu S, Yue W, Zhao X, Zhang L, Zhang C, Wang Y. The role of ZFX in non-small cell lung cancer development. Oncol Res. 2012;20(4):171-8. PubMed PMID: 23461064.

Li C, Li H, Zhang T, Li J, Ma F, Li M, Sui Z, Chang J. ZFX is a Strong Predictor of Poor Prognosis in Renal Cell Carcinoma. Med Sci Monit. 2015 Nov 5;21:3380-5. PubMed PMID: 26540164; PubMed Central PMCID: PMC4638281.

Lay FD, Liu Y, Kelly TK, Witt H, Farnham PJ, Jones PA, Berman BP. The role of DNA methylation in directing the functional organization of the cancer epigenome. Genome Res. 2015 Apr;25(4):467-77. doi: 10.1101/gr.183368.114. Epub 2015 Mar 6. PubMed PMID: 25747664; PubMed Central PMCID: PMC4381519.

Laity JH, Lee BM, Wright PE. Zinc finger proteins: new insights into structural and functional diversity. Curr Opin Struct Biol. 2001 Feb;11(1):39-46. Review. PubMed PMID: 11179890.

Landt SG, Marinov GK, Kundaje A, et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Research. 2012;22(9):1813-1831. doi: 10.1101/gr.136184.111.

Li K, Zhu ZC, Liu YJ, Liu JW, Wang HT, Xiong ZQ, Shen X, Hu ZL, Zheng J. ZFX knockdown inhibits growth and migration of non-small cell lung carcinoma cell line H1299. Int J Clin Exp Pathol. 2013 Sep 15;6(11):2460-7. eCollection 2013. PubMed PMID: 24228108; PubMed Central PMCID: PMC3816815.

Ma H, Yang F, Lian M, Wang R, Wang H, Feng L, Shi Q, Fang J. Dysregulation of zinc finger protein, X-linked (ZFX) impairs cell proliferation and induces apoptosis in human oral squamous cell carcinorma. Tumour Biol. 2015 Aug;36(8):6103-12. doi: 10.1007/s13277-015-3292-7. Epub 2015 Apr 28. PubMed PMID: 25916205; PubMed Central PMCID: PMC4546697.

Nikpour P, Emadi-Baygi M, Mohammad-Hashem F, Maracy MR, Haghjooy-Javanmard S. Differential expression of ZFX gene in gastric cancer. J Biosci. 2012 Mar;37(1):85-90. PubMed PMID: 22357206.

O'Geen H, Echipare L, Farnham PJ. Using ChIP-seq technology to generate high-resolution profiles of histone modifications. Methods Mol Biol. 2011;791:265-86. doi: 10.1007/978-1-61779-316-5_20. PubMed PMID: 21913086; PubMed Central PMCID: PMC4151291.

Palmer CJ, Galan-Caridad JM, Weisberg SP, Lei L, Esquilin JM, Croft GF, Wainwright B, Canoll P, Owens DM, Reizis B. Zfx facilitates tumorigenesis caused by activation of the Hedgehog pathway. Cancer Res. 2014 Oct 15;74(20):5914-24. doi: 10.1158/0008-5472.CAN-14-0834. Epub 2014 Aug 27. PubMed PMID: 25164012; PubMed Central PMCID: PMC4199880.

Rohs R, Jin X, West SM, Joshi R, Honig B, Mann RS. Origins of specificity in protein-DNA recognition. Annual Review of Biochemistry. 2010;79:233-269. doi: 10.1146/annurev-biochem-060408-091030.

Rhie SK, Guo Y, Tak YG, Yao L, Shen H, Coetzee GA, Laird PW, Farnham PJ. Identification of activated enhancers and linked transcription factors in breast, prostate, and kidney tumors by tracing enhancer networks using epigenetic traits. Epigenetics Chromatin. 2016;9:50. PubMed PMID: 27833659; PubMed Central PMCID: PMC5103450.

Schneider-Gädicke A, Beer-Romero P, Brown LG, Nussbaum R, Page DC. ZFX has a gene structure similar to ZFY, the putative human sex determinant, and escapes X inactivation. Cell. 1989 Jun 30;57(7):1247-58. PubMed PMID: 2500252.

Tricoli JV, Bracken RB. ZFY gene expression and retention in human prostate adenocarcinoma. Genes Chromosomes Cancer. 1993 Feb;6(2):65-72. PubMed PMID: 7680890.

Weng H, Wang X, Li M, Wu X, Wang Z, Wu W, Zhang Z, Zhang Y, Zhao S, Liu S, Mu J, Cao Y, Shu Y, Bao R, Zhou J, Lu J, Dong P, Gu J, Liu Y. Zinc finger X-chromosomal protein (ZFX) is a significant prognostic indicator and promotes cellular malignant potential in gallbladder cancer. Cancer Biol Ther. 2015;16(10):1462-70. doi: 10.1080/15384047.2015.1070994. Epub 2015 Jul 31. PubMed PMID: 26230915; PubMed Central PMCID: PMC4846125.

Xu S, Duan P, Li J, Senkowski T, Guo F, Chen H, Romero A, Cui Y, Liu J, Jiang SW. Zinc Finger and X-Linked Factor (ZFX) Binds to Human SET Transcript 2 Promoter and Transactivates SET Expression. Int J Mol Sci. 2016 Oct 20;17(10). pii: E1737. PubMed PMID: 27775603; PubMed Central PMCID: PMC5085766.

Yang F, Ma H, Feng L, Lian M, Wang R, Fan E, Fang J. Zinc finger protein x-linked (ZFX) contributes to patient prognosis, cell proliferation and apoptosis in human laryngeal squamous cell carcinoma. Int J Clin Exp Pathol. 2015 Nov 1;8(11):13886-99. eCollection 2015. PubMed PMID: 26823701; PubMed Central PMCID: PMC4713487.

Yang H, Lu Y, Zheng Y, Yu X, Xia X, He X, Feng W, Xing L, Ling Z. shRNA-mediated silencing of ZFX attenuated the proliferation of breast cancer cells. Cancer Chemother Pharmacol. 2014 Mar;73(3):569-76. doi: 10.1007/s00280-014-2379-y. Epub 2014 Jan 22. PubMed PMID: 24448637.

Yan X, Shan Z, Yan L, Zhu Q, Liu L, Xu B, Liu S, Jin Z, Gao Y. High expression of Zinc-finger protein X-linked promotes tumor growth and predicts a poor outcome for stage II/III colorectal cancer patients. Oncotarget. 2016 Apr 12;7(15):19680-92. doi: 10.18632/oncotarget.7547. PubMed PMID: 26967242; PubMed Central PMCID: PMC4991411.

Yan X, Yan L, Su Z, Zhu Q, Liu S, Jin Z, Wang Y. Zinc-finger protein X-linked is a novel predictor of prognosis in patients with colorectal cancer. Int J Clin Exp Pathol. 2014 May 15;7(6):3150-7. eCollection 2014. PubMed PMID: 25031734; PubMed Central PMCID: PMC4097274.

Zhou Y, Su Z, Huang Y, Sun T, Chen S, Wu T, Chen G, Xie X, Li B, Du Z. The Zfx gene is expressed in human gliomas and is important in the proliferation and apoptosis of the human malignant glioma cell line U251. J Exp Clin Cancer Res. 2011 Dec 20;30:114. doi: 10.1186/1756-9966-30-114. PubMed PMID: 22185393; PubMed Central PMCID: PMC3259083.
Abstract
High expression of the transcription factor ZFX has been linked to increased proliferation and tumorigenesis in multiple types of malignant tumors. In addition, ZFX overexpression is correlated with poor patient survival in colorectal, gallbladder, and renal cancers. However, the mechanism by which ZFX mediates transcriptional regulation has not been studied and ZFX target genes in humans are not known. I assisted with ChIP-seq assays in three cancer cell lines (derived from prostate, breast and colon cancers) to identify ZFX-binding sites throughout the human genome. Using stringent quality control metrics, I identified ~9 thousand binding sites in each cell type. Interestingly, the binding patterns were very similar in all cell types, with ~65% of the ZFX-binding sites being located within ±2 kb of a transcription start site. To determine if ZFX is responsible for regulation of the promoters to which it is bound, I performed RNA-seq analysis after knockdown of ZFX by siRNA in C42B prostate cancer cells, identifying 911 upregulated and 1236 downregulated genes. Interestingly, 515 (41%) of the downregulated genes have ZFX-binding sites in their promoter regions whereas only 101 (11%) of the upregulated genes have ZFX-binding sites in their promoter regions. Similar knockdown studies were performed in MCF7 breast cancer cells
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Do the ZFX and ZFY transcription factors have redundant or unique functions?
Do ZFX and ZNF711 regulate the same genes in HEK293T cells?
Functional characterization of a prostate cancer risk region
Identification and characterization of cancer-associated enhancers
Identification of target genes and protein partners of ZNF711 in glioblastoma cells
Functional analysis of a prostate cancer risk enhancer at 7p15.2
Using CRISPR-mediated deletion to study prostate cancer regulatory elements located at loop anchors identified by Hi-C
Positive regulation of RNA polymerase III-mediated transcription of tRNA genes by the Mediator kinase submodule
Functional characterization of colon cancer risk-associated enhancers: connecting risk loci to risk genes
Using epigenetic toggle switches to repress tumor-promoting gene expression
Using genomics to understand the gene selectivity of steroid hormone receptors
Estrogen receptor-β characterization in breast cancer: development of a reliable assay for measuring expression
Inhibitory effects of estradiol and SERMs on RUNX2-driven osteoblast differentiation and gene expression
Vpr-binding protein negatively regulates p53 by site-specific phosphorylation through intrinsic kinase activity
Characterization of a new chromobox protein 8 (CBX8) antagonist in a model of human colon cancer
Mapping transcription factor networks linked to glioblastoma multiform: identifying target genes of the oncogenic transcription factor ZFX in glioblastoma multiforme
Functional characterization of colon cancer risk enhancers
Impacts of post-translational modifications on interactions between G9a and its N-terminus binding partners
Placenta growth factor-miRNAs-lncRNAs axis in the regulation of ET-1 gene involved in pulmonary hypertension in sickle cell disease
Characterization of the ZFX family of transcription factors that bind downstream of the start site of CpG island promoters
Asset Metadata
Creator
Yao, Lijun (author)
Core Title
Characterizing ZFX-mediated gene regulation to reveal possible candidates for clinical intervention
School
Keck School of Medicine
Degree
Master of Science
Degree Program
Biochemistry and Molecular Medicine
Publication Date
05/31/2019
Defense Date
05/05/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
cancer, chromatin immunoprecipitation, gene regulation, OAI-PMH Harvest, ZFX
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Farnham, Peggy (committee chair), Rice, Judd (committee member), Stallcup, Michael (committee member)
Creator Email
lijunyao@usc.edu, yaolj601@163.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-377551
Unique identifier
UC11257982
Identifier
etd-YaoLijun-5353.pdf (filename), usctheses-c40-377551 (legacy record id)
Legacy Identifier
etd-YaoLijun-5353.pdf
Dmrecord
377551
Document Type
Thesis
Rights
Yao, Lijun
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA