Characterization of the ZFX family of
transcription factors that bind downstream of the
start site of CpG island promoters
By
Stephanie Weiya Ni
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(Cancer Biology and Genomics)
December 2020
Acknowledgments
I would like to express my sincere gratitude and respect to my mentor, role model, and dear friend,
Dr. Peggy Farnham, who has given me tremendous guidance, trust, and support. I am forever
grateful to Peggy for believing in me and taking me into her lab as the 30th Ph.D. student. She
has offered me amazing opportunities to learn and grow, both professionally and personally, from
day one. Peggy has made going to graduate school one of my best life decisions. Peggy is one
of the two most important women in my life. I hope to grow into a scientist, leader, and influencer
like her in the future, who is capable, wise, and kind.
I have always enjoyed my time in the lab because of the lovely people there. I would like to
express my sincere appreciation to Dr. Charles Nicolet, who has been giving me continuous
scientific and life advice, laughter, and encouragement. Charlie’s bright and witty personality is
one of a kind and contagious. I am lucky to have him as my bay-mate. Special thanks to Shannon
Schreiner, Dr. Zhifei Luo, Andrew Perez, and Dr. Suhn Rhie who have provided substantial help
in performing experiments, data analysis, and understanding the projects. I would also like to
thank Dr. Yu Guo, Wei Zhu, Jiani Shi, and Karly Nisson for making the lab somewhere I feel
happy, safe, and fulfilled.
I would also like to thank my Qualifying Exam Committee and Dissertation Committee, Dr. Michael
Stallcup, Dr. Oliver Bell, Dr. Judd Rice, Dr. Min Yu, Dr. Kimberly Siegmund, and Dr. Neil Segil,
for always being responsive, sharing, and giving me extensive suggestions on my project.
I am who I am because of this wonder woman: my mother, Mingxia Yin, who has loved and
supported me unconditionally at every stage of my life. My mother opens her arms widely to
protect me when needed but always gives me ample freedom to explore and make mistakes. She
is my best friend, coach, and the woman I look up to.
For all the wonderful time we spent together and all the ups and downs we have carried each
other through, I would like to thank my boyfriend and my great friends, Dr. Nicholas Scianmarello,
Yiwen Xu, Yang Song, Xinyan Liang, Mengmei Zheng, Luping Chen, Dr. Josephine Fang, and
many others.
Table of Contents
Acknowledgments
List of Tables
List of Figures
Abbreviations
Abstract
Chapter 1
Overview of human transcription factors and C2H2 zinc finger proteins
1.1 The encyclopedia of human regulatory elements
1.2 How do TFs work?
1.3 Classification of TFs
1.4 The Zinc finger family of TFs
1.5 ZNFs and human disease
1.6 Outstanding questions about ZNFs
1.7 Overall study design
Chapter 2
Materials and methods
2.1 Cell culture
2.2 CRISPR/Cas9-mediated genomic deletions
2.3 Cell cycle analysis
2.5 Construction of ZFX zinc finger deletion mutants
2.6 Transient transfection assays
2.7 ChIP-seq
2.8 ChIP-exo
2.9 DNA methylation EPIC arrays
2.10 Western Blot
2.11 Immunoprecipitation-Mass Spectrometry (IP-MS)
Chapter 3
Characterization of the ZFX family of transcription factors that bind downstream of the start site of CpG island promoters
3.1 Abstract
3.2 Introduction
3.3 Results
3.3.1 Loss of ZFX and ZNF711 inhibits cell proliferation and causes large changes in the transcriptome of HEK293T cells
3.3.2 ZFX family members have essentially identical binding patterns at CpG island promoters
3.3.3 ZFX and ZNF711 have properties of a transcription activator when bound downstream of the TSS
3.3.4 ZFX family members bind throughout the first several hundred base pairs of the transcribed region of their target genes
3.3.5 The first 10 C2H2 zinc fingers of ZFX are dispensable for DNA binding and transcriptional activity
3.4 Discussion
Chapter 4
Investigation of the mechanisms by which the ZFX family members regulate the transcriptome
4.1 Abstract
4.2 Introduction
4.3 Results
4.3.1 Does binding of ZFX and ZNF711 affect the DNA methylation level at target promoters?
4.3.1 Identifying protein interaction partners of ZFX.
4.3.2 Are ZFX and ZNF711 involved in transcriptional elongation?
4.4 Discussion
Chapter 5
Exploration of differentiating traits between responsive target genes and non-responsive target genes of ZFX
5.1 Abstract
5.2 Introduction
5.3 Results
5.3.1 Do responsive promoters have different binding patterns of ZFX family members than non-responsive promoters.
5.3.2 Can promoter-associated histone modifications identify ZFX-responsive promoters?
5.3.3 Do cellular compensatory changes obscure the identification of direct target genes in the DKO cells?
5.3.3 Are protein-protein interactions responsible for specifying responsive vs non-responsive promoters?
5.4 Discussion
Chapter 6
Discussion and future directions
6.1 Discussion
6.1.1 Why don’t the majority of the ZFX-bound promoters respond to the loss of ZFX family members?
6.1.2 Does NSD1 contribute to the function of the ZFX family?
6.2 Future directions
6.2.1 ZFX and GBM
References
List of Tables
Table 1.1* = Supplementary Table S1 in Ni et al. 2020
Table 2.1* = Supplementary Table S2 in Ni et al. 2020
Table 3.1* = Supplementary Table S3A-G in Ni et al. 2020
Table 3.2* = Supplementary Table S3J & S3K in Ni et al. 2020
Table 3.3* = Supplementary Table S3L & S3M in Ni et al. 2020
Table 3.4* = Supplementary Table S3N in Ni et al. 2020
Table 5.1 = Levels of remaining ZFX and/or ZNF711 mRNAs in knockdown experiments
* These Supplementary Tables are all very large and cannot be easily provided in print form. They
are all published in Ni, W., A. A. Perez, S. Schreiner, C. M. Nicolet, and P. J. Farnham. 2020.
"Characterization of the ZFX family of transcription factors that bind downstream of the start site
of CpG island promoters." Nucleic Acids Res. doi: 10.1093/nar/gkaa384 and can be accessed
directly at https://academic.oup.com/nar/article/48/11/5986/5837054#supplementary-data.
List of Figures
Figure 1.1 Numbers of ENCODE assays performed through Phase 3
Figure 1.2 ENCODE 3 genome annotations
Figure 1.3 The Human TF Repertoire
Figure 2.1 ZFY antibody validation in female HEK293T cells
Figure 3.1 The ZFX gene family
Figure 3.2 The ZFX family gene structure comparison
Figure 3.3 Amino acid alignment of the ZFX family members
Figure 3.4 ZFX and/or ZNF711 CRISPR KO in HEK293T
Figure 3.5 Loss of ZFX and ZNF711 in HEK293T cells inhibits cell proliferation
Figure 3.6 Reduction in ZFX and ZNF711 levels causes large effects on the transcriptome
Figure 3.7 Gene ontology analysis of common DEGs in all 3 DKO clones
Figure 3.8 Cell cycle and pathway analysis of downregulated genes in all 3 DKO clones
Figure 3.9 ZFX family members bind to essentially identical CpG island promoters
Figure 3.10 K-means clustering of ZFX and ZNF711 ChIP-seq peaks in HEK293T
Figure 3.11 ZFX and ZNF711 have properties of a transcription activator when bound downstream of the TSS
Figure 3.12 Characterization of ZFX and ZNF711 binding sites
Figure 3.13 Motif analysis of ZNF711 ChIP-seq, ChIP-exo read2, and ChExMix peaks
Figure 3.14 Visualization of example peaks in ZNF711 ChIP-seq and ChIP-exo replicates
Figure 3.15 Sequences and protein levels of ZFX ZF mutant constructs
Figure 3.16 In vivo binding of FLAG-tagged ZFX protein
Figure 3.17 Motif predictions for ZFX zinc fingers
Figure 3.18 Functional analysis of the ZFX protein
Figure 3.19 ZFX ZF11-13 has very similar transcriptional activities as wt ZFX
Figure 4.1 DNA methylation analysis of DKO cells
Figure 4.2 Volcano plot of enriched ZFX-interacting proteins identified by IP-MS in HEK293T
Figure 4.3 NSD1 protein interacts with ZFX protein
Figure 4.4 H3K36me3 analyses in wt HEK293T and DKO clones
Figure 5.1 Characterization of putative direct targets in wt HEK293T
Figure 5.2 ZFX and ZNF711 peaks at promoters of genes with no expression change in DKO
Figure 5.3 ZNF711 ChIP-seq and ChIP-exo peaks in bound responsive vs non-responsive promoters
Figure 5.4 H3K4me3 and H3K27ac marks in DKO cells compared to wt HEK293T
Figure 5.5 Volcano plots of ZFX and/or ZNF711 siTOG experiments
Figure 5.6 The ZFX family ChIP-seq in various cell lines browser track screenshot
Figure 5.7 siRNA knockdown of ZFX in three different cell lines
Abbreviations
5C Chromatin conformation capture carbon copy
AID Auxin-inducible degron
ATAC-seq Assay for transposase-accessible chromatin using sequencing
BTB BR-C, ttk, and bab (also called POZ, Pox virus and Zinc finger)
CAGE Cap analysis gene expression
CASTing Cyclic amplification and selection of targets
ChIP-seq ChIP sequencing
CRISPR Clustered regularly interspaced short palindromic repeats
CRISPRi CRISPR interference
CTCF CCCTC-binding factor
DBD DNA binding domain
DEG Differentially expressed genes
DKO Double knockout
eCLIP Enhanced cross-linking immunoprecipitation
EGR Early growth response
ENCODE Encyclopedia of DNA Elements
ERG ETS-related gene
ETS Erythroblast Transformation Specific
ETV1 ETS translocation variant 1
FAIRE-seq Formaldehyde-assisted isolation of regulatory elements with sequencing
GBM Glioblastoma multiforme
genotyping HTS genotyping by high-throughput sequencing
GLIS GLI-Similar
GOI Gene of interest
HAT Histone acetyltransferase
HLH Helix–Loop–Helix
H3K4me3 Histone 3 lysine 4 trimethylation
H3K9me3 Histone 3 lysine 9 trimethylation
H3K27ac Histone 3 lysine 27 acetylation
H3K36me3 Histone 3 lysine 36 trimethylation
iCLIP Individual nucleotide-resolution CLIP
IAA Indole-3-acetic acid
IDR Irreproducible discovery rate
IP-MS Immunoprecipitation-Mass Spectrometry
KLF Krüppel-like factors
KO Knockout
KRAB Krüppel-associated box
MeDIP-seq Methylated DNA immunoprecipitation coupled with next-generation sequencing
MNase-seq Micrococcal nuclease digestion with deep sequencing
MRE-seq Methylation-sensitive restriction enzyme sequencing
MS/MS Tandem mass spectrometry
NAA 1-naphthaleneacetic acid
NLS Nuclear localization sequence
NSD1 Nuclear Receptor Binding SET Domain Protein 1
OFT Outflow tract
PCR Polymerase chain reaction
PDB Proximity-dependent biotinylation
PLA Proximity ligation assay
POI Protein of interest
POZ Pox virus and Zinc finger
RAMPAGE RNA annotation and mapping of promoters for the analysis of gene expression
RCA Rolling-circle amplification
Repli-chip Replication strand arrays
Repli-seq Nascent DNA replication strand sequencing
RIP-chip RNA-binding protein immunoprecipitation-microarray profiling
RIP-seq RNA immunoprecipitation sequencing
RNA-PET RNA paired-end tag
RNA pol2 RNA polymerase 2
RNA-seq RNA sequencing
RRBS Reduced-representation bisulfite sequencing
RT-qPCR Real-time quantitative polymerase chain reaction
SCAN SRE-ZBP, CTfin51, AW-1, and Number 18 cDNA
SCF Skp1, Cullin and F-box
SETDB1 SET domain, bifurcated 1
shRNA Short hairpin RNA
siTOG siRNA with CRISPR/dCas9 (the deactivated form of Cas9) toggle switch
siRNA Small interfering RNA
SP Specificity proteins
TALEN Transcription activator-like effector nuclease
TF Transcription factor
TIR1 The F-box transport inhibitor response 1
TSS Transcription start site
UTR Untranslated region
WGBS Whole genome bisulfite sequencing
wt Wildtype
WT Wilms tumor
WT1 Wilms' tumor suppressor protein 1
XILD X-linked intellectual disability
ZF Zinc finger
ZNF Zinc finger protein
ZFX Zinc finger protein, X-linked
ZFY Zinc finger protein, Y-linked
ZIC Zinc finger protein of the cerebellum
Abstract
C2H2 zinc finger proteins (ZNFs) constitute the largest transcription factor (TF) family in humans,
yet they are the least studied because of the nature of their protein structures, the lack of
high-quality antibodies, and their low levels of expression. My research has focused on studying the C2H2 zinc
finger protein, X-linked (ZFX) transcription factor family which consists of three members: ZFX,
zinc finger protein, Y-linked (ZFY), and zinc finger protein 711 (ZNF711). Although their protein
structure suggests that ZFX, ZFY, and ZNF711 are transcriptional regulators, the mechanisms by
which they influence transcription have not yet been elucidated. In this study, I mapped and
compared the genome-wide DNA binding sites of the ZFX family in different cell types. I created
CRISPR-mediated ZFX and/or ZNF711 knockouts in female HEK293T cells (which naturally lack
ZFY) and found that these TFs function as transcription activators and critical regulators of cell
proliferation. I have also identified the zinc fingers responsible for DNA binding and
transcriptional activation, identified interacting partners of ZFX, and explored the epigenetic
mechanisms of the ZFX family. These new findings on the ZFX family provide important insights into
transcriptional regulation in human cells by members of the large, but under-studied family of
C2H2 ZNFs.
Chapter 1
Overview of human transcription factors and C2H2 zinc finger proteins
1.1 The encyclopedia of human regulatory elements
The human body is composed of a large variety of cells and tissues. Although these myriad cell
types have some common functions (e.g. maintaining their numbers through cell proliferation and
maintaining metabolic homeostasis to ensure survival), each tissue type also has specialized
functions that, when taken together, create the intricate symphony of the human body. Both the
general and specific functions of a given tissue are accomplished by the actions of the set of
proteins expressed in that particular cell type. As succinctly (and somewhat simplistically)
summarized in the central dogma of molecular biology (i.e. DNA makes RNA makes proteins),
the production of cellular proteins begins with instructions encoded in our genome. However, all
cells in our body have the same genome and yet the sets of proteins produced in different tissues
are quite distinct. Proteins which are involved in common cellular functions (often called
housekeeping proteins) are produced using genomic regions called regulatory elements that are
accessible (open) in all cell types. In contrast, proteins that specify tissue-specific milieus are
produced as a result of tissue-specific accessibility of regulatory elements. Both the common and
tissue-specific accessible genomic regions are “read” by a specific type of protein called a
transcription factor. A transcription factor (TF) can be defined as a protein that binds to a
regulatory element either directly through a DNA binding domain (DBD) or indirectly through
protein-protein interactions with a DNA binding protein and subsequently increases or decreases
gene expression by influencing the initiation or elongation of an RNA polymerase complex through
the body of a coding or non-coding gene. It is the combination of the set of accessible regulatory
elements and the set of transcription factors produced in each cell type that produces tissue-
specific phenotypes from cells that all have the same genome. Therefore, to understand the
intricacies of the human body, it is critical that we have a comprehensive and detailed catalog of
all regulatory elements and identify the transcription factors that use these elements to produce
the total set of common and cell-specific proteins in each cell type (called the cell proteome).
The Encyclopedia of DNA Elements (ENCODE) Consortium was founded in 2003, with the goal
of mapping all regulatory elements in the human genome (Birney et al. 2007); in Phases 2 and 3
the Consortium efforts were expanded to also include the analysis of certain model organisms
(Consortium 2012, Moore et al. 2020). All data from Consortium experiments are available at the
ENCODE portal (http://www.encodeproject.org), and raw and processed data are available
directly from the cloud as an Amazon Public Data Set (https://registry.opendata.aws/encode-project/);
see Figure 1.1 for a snapshot of the different types of datasets collected since the beginning of
the project.
Figure 1.1 Numbers of ENCODE assays performed through Phase 3
3D chromatin structure experiments include ChIA-PET, Hi-C, and chromatin conformation capture carbon
copy (5C). Chromatin accessibility experiments include DNase-seq, assay for transposase-accessible
chromatin using sequencing (ATAC-seq), transcription activator-like effector nuclease (TALEN)-modified
DNase-seq, formaldehyde-assisted isolation of regulatory elements with sequencing (FAIRE-seq) and
micrococcal nuclease digestion with deep sequencing (MNase-seq). DNA methylation experiments include
DNAme arrays, whole genome bisulfite sequencing (WGBS), reduced-representation bisulfite sequencing
(RRBS), methylation-sensitive restriction enzyme sequencing (MRE-seq) and methylated DNA
immunoprecipitation coupled with next-generation sequencing (MeDIP-seq). Histone modification
experiments include ChIP–seq of histone and modified histone targets. Knockdown transcription
experiments include RNA-seq after treatment with small interfering RNA (siRNA), short hairpin RNA
(shRNA), clustered regularly interspaced short palindromic repeats (CRISPR) or CRISPR interference
(CRISPRi). RNA binding experiments include enhanced cross-linking immunoprecipitation (eCLIP), RNA
bind-n-seq, RNA immunoprecipitation sequencing (RIP-seq), RNA-binding protein immunoprecipitation-
microarray profiling (RIP-chip), individual nucleotide-resolution CLIP (iCLIP) and Switchgear. Transcription
experiments include RNA annotation and mapping of promoters for the analysis of gene expression
(RAMPAGE), cap analysis gene expression (CAGE), RNA paired-end tag (RNA-PET), microRNA-seq,
microRNA counts, more classical RNA-seq and RNA-microarrays. TF binding experiments are ChIP–seq
on non-histone targets. Other assays include genotyping arrays, nascent DNA replication strand
sequencing (Repli-seq), replication strand arrays (Repli-chip), tandem mass spectrometry (MS/MS),
genotyping by high-throughput sequencing (genotyping HTS) and DNA-PET. All data can be examined in
detail at https://www.encodeproject.org. This figure was taken from Snyder et al. 2020; my mentor Dr.
Farnham is a co-author of the manuscript.
One of the major goals of ENCODE has been to identify regions of open chromatin bound by
transcription factors. Phase 1 was limited to the use of microarrays and focused on only 1% of
the human genome. In Phase 2, regulatory elements were identified genome-wide using methods that
detect accessible DNA, relying either on specifically modified histones or on hypersensitivity to
DNase treatment. In Phase 3, additional methods to identify regulatory elements, such as ones
based on the accessibility of the genome to transposases, were also included in the analyses.
These experiments identified 2,157,387 open chromatin regions, which include 750,392 regions
marked by modified histones that are associated with active chromatin, such as mono-, di- or
tri-methylation of histone H3 at lysine 4 (H3K4me1, H3K4me2 or H3K4me3) or acetylation of
histone H3 at lysine 27 (H3K27ac).
Taken together, these regions of open chromatin correspond to ~30-50% of the human genome
(Figure 1.2). Of course, not all open regions are the same in each cell or tissue type.
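To make the notion of genome coverage concrete, the following is a minimal sketch of how the fraction of a genome covered by a set of regions could be computed: overlapping intervals are merged so that no base is counted twice, and the merged length is divided by the genome length. The function names, the toy coordinates, and the toy genome length are illustrative assumptions; this is not the ENCODE analysis pipeline.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or abuts) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(intervals: Iterable[Interval], genome_length: int) -> float:
    """Fraction of a genome of the given length covered by the intervals."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


# Toy example: hypothetical regions on a hypothetical 1,000 bp "genome".
toy_regions = [(100, 300), (250, 400), (700, 800)]
toy_fraction = covered_fraction(toy_regions, 1000)
```

Merging before summing matters because regions called by different assays (for example, DNase-seq and ATAC-seq peaks) frequently overlap.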
Figure 1.2 ENCODE 3 genome annotations
Shown is the percentage of the human genome covered by open chromatin (green bar) and active histone
modifications (yellow and orange bars); the blue bar indicates the percent of the genome bound by the
human TFs studied to date. This figure was taken from Snyder et al. 2020; my mentor Dr. Farnham is a co-
author of the manuscript.
As described above, open chromatin that corresponds to regulatory elements can be identified by
sequencing genomic regions isolated a) using antibodies to modified histones, with a technique
known as ChIP-seq (chromatin immunoprecipitation followed by sequencing), b) following treatment
with DNase, with a technique known as DNase-seq, or c) using a transposase, with a technique
known as ATAC-seq. The majority of these regions of open chromatin are bound
by one or more transcription factors. However, while the open chromatin assays can identify
where a transcription factor likely binds, that information does not identify the specific TF(s) bound
to the element. Before the advent of ChIP-seq, in vitro assays such as CASTing (cyclic
amplification and selection of targets) were used to identify short consensus motifs for site-specific
DNA-binding transcription factors. Some investigators have used in vitro-derived motifs to identify
candidate TFs bound to specific sets of open chromatin elements. However, using motif analysis
to identify which TF binds to a specific genomic region has several problems. For example, many
TFs have very closely related family members that have the same motif, making it difficult to know
which family member binds to an element in a particular cell type. Also, due to interaction with
protein partners which were not included in the in vitro studies, some TFs have different motifs in
vivo than in vitro. More recently, TFs bound to genomic regulatory elements have been identified
using antibodies to specific TFs in ChIP-seq assays. In fact, genome-wide mapping of the binding
patterns of site-specific DNA-binding TFs was a major goal of ENCODE. To date, the results of more
than 4000 TF ChIP-seq experiments can be found on the Experiment Matrix at the ENCODE portal
(https://www.encodeproject.org/matrix/?type=Experiment&status=released&assay_slims=DNA+binding&assay_slims=DNA+binding&assay_title=TF+ChIP-seq).
These TF binding datasets include information about TFs from most, if not all, TF families (the
different categories of TF families are described in Section 1.3). However, many TFs have not been
analyzed because suitable antibodies are lacking or because they are expressed only in rare cell
types or in tissue types that are difficult to study. Also, the genomic binding pattern of a given
TF can differ from cell type to cell type, so many additional experiments will be needed before a
comprehensive map of TF binding can be completed.
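The motif-based approach discussed above, and its ambiguity among closely related family members, can be sketched as follows. This is a minimal illustration, assuming a consensus motif written in IUPAC nucleotide codes and an exact-match scan of both strands. The factor names TF_A and TF_B, the consensus AGGCCY, and the 10 bp region are hypothetical and are not taken from any dataset in this study.

```python
import re
from typing import Dict, List, Tuple

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC: Dict[str, str] = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(sequence: str, consensus: str) -> List[Tuple[int, str]]:
    """Return (0-based forward-strand start, strand) for every consensus match."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")  # lookahead finds overlapping hits
    seq = sequence.upper()
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    # Scan the reverse complement and convert back to forward-strand coordinates.
    rc = reverse_complement(seq)
    hits += [(len(seq) - m.start() - len(consensus), "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Hypothetical example: two related family members share one consensus motif,
# so a motif match alone cannot say which of them occupies the region.
family_motifs = {"TF_A": "AGGCCY", "TF_B": "AGGCCY"}
region = "ACAGGCCCTG"
candidates = [name for name, motif in family_motifs.items() if find_motif(region, motif)]
```

Both hypothetical factors are returned as candidates for the same region, which is the limitation that TF-specific ChIP-seq is meant to overcome.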
1.2 How do TFs work?
Open regions of chromatin (e.g. those regions identified by ATAC-seq, DNAase-seq, or modified
histones) generally correspond to promoters (defined as regulatory regions near the transcription
start site (TSS)) or enhancers (defined as regulatory regions far from the TSS). A core promoter
region is defined as a region ±50 bp from the TSS of a gene. Core promoters are composed of
common sequence elements such as a TATA box or a CpG island (which is a genomic region
with high GC content and a high density of CpG dinucleotides). TATA box-containing promoters
often produce cell type-specific or induced (e.g. by a hormone) transcripts, whereas
housekeeping genes are often driven by CpG island promoters (Saxonov, Berg, and Brutlag
2006). Both types of core promoters are bound by general TFs such as RNA Polymerase 2 (pol2)
and other components of the pre-initiation complex. However, a core promoter alone does not
provide robust transcription, due to unstable interactions of the general transcriptional machinery
with the DNA. Promoter activity can be increased by the action of site-specific, DNA-binding TFs
that bind either to the proximal promoter region or to distal enhancer elements.
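To make the CpG island definition given above concrete (high GC content and a high density of CpG dinucleotides), the following minimal Python sketch computes the GC fraction and the observed/expected CpG ratio of a sequence. The function names and the default thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected ratio of at least 0.6) are assumptions chosen for illustration; they follow commonly used criteria and are not taken from the analyses in this dissertation.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction, and observed/expected CpG ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    expected = (c * g) / n  # CpG count expected from the base composition alone
    return {
        "length": n,
        "gc_fraction": (c + g) / n,
        "obs_exp_cpg": cpg / expected if expected > 0 else 0.0,
    }


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply illustrative (assumed) thresholds to the statistics above."""
    stats = cpg_stats(seq)
    return (stats["length"] >= min_len
            and stats["gc_fraction"] >= min_gc
            and stats["obs_exp_cpg"] >= min_obs_exp)
```

For example, `looks_like_cpg_island("CG" * 100)` is true, whereas an AT-only sequence of the same length fails both the GC and the CpG criteria.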
The ChIP-seq experiments performed to date have shown that there are two basic types of TF
binding patterns; some TFs mainly bind to promoter regions and some TFs mainly bind to open
chromatin regions outside of promoters. Generally speaking, site-specific DNA-binding TFs do
not have enzymatic activities (Chen and Koehler 2020). Most eukaryotic TFs bind to the genome
using their DNA binding domains and then utilize unstructured regions outside of their DNA
binding domains to recruit other proteins that have enzymatic activities used in regulating
transcription (Reiter, Wienerroither, and Stark 2017, Brayer, Kulshreshtha, and Segal 2008). Such
cofactors (known as co-activators or co-repressors) commonly contain effector domains which
can mediate processes such as chromatin remodeling and histone modifications (Frietze and
Farnham 2011). Because co-activators, such as histone acetyltransferases (HATs) or protein
kinases, lack the ability to bind to DNA on their own (Nakagawa et al. 2018), they rely on their
interactions with site-specific DNA-binding TFs to regulate transcription. TFs that bind in proximal
promoter regions recruit co-activators that function by stabilizing the recruitment of the
transcriptional machinery. In contrast, TFs that bind to enhancer regions recruit co-regulators that
mediate enhancer-promoter contact via chromatin looping (Fuda, Ardehali, and Lis 2009, Haberle
and Stark 2018, Core and Adelman 2019, Vihervaara, Duarte, and Lis 2018, Farnham 2009,
Zabidi and Stark 2016, Frietze and Farnham 2011).
1.3 Classification of TFs
There are ∼1600 TFs with sequence-specific DNA binding and gene regulation properties,
constituting ~8% of the set of human genes (Vaquerizas et al. 2009, Lambert et al. 2018).
Site-specific DNA-binding TFs are classified according to their DNA binding domains, which provide
useful information concerning their DNA binding patterns and their evolutionary relatedness
(Vaquerizas et al. 2009). TF families are evolutionarily conserved amongst many species. For
example, HOX proteins (members of the homeodomain family) display essentially identical
sequence specificity and conserved physiological roles between human and Drosophila in
controlling the body plan during development (Bürglin 2011, Nitta et al. 2015, Luo, Rhie, and
Farnham 2019). However, there are striking differences in the relative ratio of numbers of
members in certain TF families in different species. For example, C2H2 zinc finger proteins
(ZNFs) have greatly expanded in vertebrates as compared to other organisms (Charoensawan,
Wilson, and Teichmann 2010). The three largest families of sequence-specific human TFs are
proteins that achieve DNA binding using zinc finger (ZF) domains, homeodomains, and
helix–loop–helix (HLH) domains. Of these, C2H2 ZNFs comprise the largest class of site-specific DNA
binding proteins encoded in the human genome (Tupler, Perini, and Green 2001); of the ∼1600
predicted human DNA binding transcription factors, 747 contain C2H2 zinc finger domains
(Lambert et al. 2018).
1.4 The Zinc finger family of TFs
C2H2 ZNFs are classified by differences in types of ZF modules: triple-ZFs (e.g., SP1, KLF3),
multiple-adjacent-ZFs (e.g., TFIIIA, ZNF423), and separated-paired-ZFs (e.g., PRDII-BFI, BNC1)
(Iuchi 2000). Most C2H2 ZNFs have 3-5 or 10-15 ZFs and some ZNFs have more than 30 ZFs
(e.g., ZNF208 has up to 36 ZFs) (Figure 1.3) (Lambert et al. 2018). C2H2 ZNFs can also be
subclassified by conserved functional domains outside of the ZF DBDs. For example, some ZNFs
contain a Krüppel-associated box (KRAB) domain, which is found in ~350 human C2H2 ZNFs.
The KRAB domain is a repressive effector domain (Vaquerizas et al. 2009) that recruits a protein
complex containing TRIM28/KAP1, HP1 alpha, and SETDB1 (Ecco, Imbeault, and Trono 2017)
that deposits H3K9me3 histone marks. In general, KRAB-containing ZNFs have more ZFs than
other ZNFs. Other common domains in ZNFs include the Pox virus and Zinc finger (POZ)/BR-C,
ttk, and bab (BTB) domain and the SRE-ZBP, CTfin51, AW-1, and Number 18 cDNA (SCAN)
domain (Nowick et al. 2010, Mackeh et al. 2018, Williams, Blacklow, and Collins 1999); see inset
in Figure 1.3. However, many ZNFs do not have well-defined effector domains and thus are only
classified by the arrangement of their zinc fingers. Perhaps the ZNFs which function as activators
(and not repressors) compose the category of “none” because many transactivation domains
contain classical acidic sequences that are of low-complexity (Garza, Ahmad, and Kumar 2009).
Figure 1.3 The Human TF Repertoire
Shown is the number of TFs and motif status for each DBD family; the inset displays the distribution of the
number of C2H2 ZF domains for classes of effector domains (KRAB, SCAN, or BTB domains); ‘‘Classic’’
indicates the related and highly conserved C2H2 ZNFs that have 3 ZFs, including the specificity proteins
(SP), Krüppel-like factors (KLF), Early Growth Response (EGR), GLI-Similar (GLIS), zinc finger protein of
the cerebellum (ZIC), and Wilms tumor (WT) proteins. This figure was taken from Mackeh et al. 2018.
The fact that C2H2 ZNFs are the largest family of TFs suggests that they may be critical regulators
of many important biological networks. However, the majority of these ZNFs have not been well-
studied due to issues related to low expression levels, poor antibody quality, and a lack of
knowledge as to what tissues or physiological processes they may regulate (Ecco, Imbeault, and
Trono 2017). Evolutionary studies of C2H2-ZNF genes have shown a large discrepancy in the
numbers and sequence of ZNFs in homologous clusters across mammals (i.e., human,
chimpanzee, mouse, rat, and dog). In fact, species-specific duplications have resulted in a large
expansion of C2H2-ZNFs, more particularly KRAB-ZNFs, in primates (Tadepally, Burger, and
Aubry 2008, Huntley et al. 2006). Therefore, most C2H2-ZNF genes cannot be studied using mice
or other model systems such as worms or flies. Taken together, these issues have greatly limited
the study of human C2H2 ZNFs.
1.5 ZNFs and human disease
Alterations in gene expression caused by the inappropriate level, structure, or function of a site-
specific, DNA-binding TF have been associated with a diverse set of human diseases, including
cancers and developmental disorders (Augello, Hickey, and Knudsen 2011, Vaquerizas et al.
2009, Weedon 2007), indicating the importance of understanding the normal and abnormal
functions of these regulatory proteins. For example, MYC, a site-specific DNA-binding TF, is one
of the most frequently overamplified and translocated oncogenes in a multitude of human cancers
(Beroukhim et al. 2010, Shou et al. 2000, Stine et al. 2015). Mutations, overexpression, and loss
of expression of GATA family factors have also been identified in different cancers including
breast, colorectal, ovarian, and leukemia (Chou, Provot, and Werb 2010, Zheng and Blobel 2010,
Vicente et al. 2012). The dysregulation of Erythroblast Transformation Specific (ETS) family of
TFs (e.g., overexpression of ETS-related gene (ERG) and translocation variant 1 (ETV1)) has
been shown to drive tumorigenesis in prostate cancer and melanoma (Tomlins et al. 2009, Clark
and Cooper 2009, Chen et al. 2013, Jané-Valbuena et al. 2010). Mutations in HOXA1 have been
associated with cardiac outflow tract (OFT) malformations (Holve et al. 2003, Tischfield et al. 2005)
whereas mutations in the DBD of HOXD13 and loss-of-function mutations of HOXA13 can lead
to limb developmental anomalies (Goodman 2002, Barrera et al. 2016).
Studies have also shown an association of dysregulation of C2H2 ZNFs with human disease. For
example, multiple KRAB-ZNFs (ZNF41, ZNF81, ZNF142, ZNF148, ZNF673, and ZNF674)
encoded on the X chromosome are reported to be driver genes for X-linked intellectual disability
(XLID) (Lugtenberg et al. 2006, Raymond and Tarpey 2006, Mandel and Chelly 2004, Khan et al.
2019). In addition, mutations in ZNF711 are reported in XLID patients with autistic features (van
der Werf et al. 2017). ZNFs are also shown to be causative genes in cancer progression.
ZNF322A is upregulated via copy number amplification in lung cancer patients (Lo et al. 2012).
ZFX is overexpressed in glioblastoma, lung, oral, and gallbladder cancers (Jiang, Xu, et al. 2012,
Wu et al. 2013, Ma et al. 2015, Weng et al. 2015, Fang, Huang, et al. 2014) and ZNF395 is
overexpressed in Ewing’s sarcoma, osteosarcoma, and renal cell carcinoma (Yabe et al. 2008,
Tsukahara et al. 2004, Dalgin et al. 2007); the overexpression of these TFs may drive
tumorigenesis through promoting cell growth, migration, invasion, and overactivation of cell-cycle
regulating intracellular signaling pathways (e.g., PI3K/AKT or ERK/MAPK pathways). In contrast,
ZNF304 functions as an oncogene in KRAS-positive human colorectal cancer cell lines and
tumors by silencing tumor suppressor genes through recruiting a co-repressor complex, including
DNMT1, to promoter regions (Serra et al. 2014). In addition, mutations in binding sites of the well-
known ZNF family member CCCTC-binding factor (CTCF) have been identified in gastrointestinal
cancer patients (Guo et al. 2018).
1.6 Outstanding questions about ZNFs
There are several important outstanding questions that require further studies to understand how
the members of the largest family of TFs normally regulate gene expression and how they may also contribute to
human disease.
1) What are the roles of the multiple fingers in DNA-binding and gene regulation? Typical
ZNF binding motifs are 6-8 nt, which would suggest that only 3 ZFs would be needed to recognize
the motif (each ZF recognizes 3 nts); however, most ZF DBDs in C2H2 ZNFs are much longer
than 3 fingers. Structural dynamic studies have shown that there are two conformational modes
of the DBD of Zif268; it has been suggested that one mode facilitates a DNA scanning process to
help identify preferable binding sites and the other mode mediates direct motif recognition
activities (Zandarashvili et al. 2015). Some ZNFs have been shown to bind to both DNA and RNA.
For example, TFIIIA has nine ZFs and utilizes ZF1-3 for interacting with DNA and ZF4-7 for
interacting with 5S RNA (McConkey and Bogenhagen 1988, Ryan and Darby 1998, Nolte et al.
1998). Wilms' tumour suppressor protein (WT1), an essential ZNF for fetal development, has also
been shown to have both DNA and RNA-binding activities (Ullmark, Montano, and Gullberg 2018).
As described within my dissertation, the ZFX family members bind to genomic locations that are
also encoded in the RNA transcripts of the target genes, suggesting that both DNA and RNA
binding might be properties of this family of C2H2 ZNFs.
2) Why do certain ZNFs have unusual genomic binding patterns? ZNF274 binds only to
the 3’ coding exons of ZNF genes; it is unclear if this binding affects transcription initiation,
elongation, or termination, or whether these proteins have both DNA and RNA binding capability.
As described within my dissertation, the ZFX family members bind almost exclusively to the
transcribed region immediately downstream of the TSS. Do these unusual binding locations of
ZNFs contribute to gene regulation? Do other ZNFs (as yet unstudied) have unusual binding
patterns?
3) ZNF genes have undergone duplication through evolution to produce many families of
highly related proteins. Do highly related family members have unique or redundant functions?
ZFX, zinc finger protein Y-linked (ZFY), and ZNF711 belong to the same TF family and have highly
similar gene structures and essentially identical genome-wide binding profiles in a number of cell
types. However, further studies are needed to investigate the unique and overlapping functions
of the family members (Ni et al. 2020, Rhie et al. 2018).
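The finger-count arithmetic raised in question 1 above can be written out explicitly. The short Python sketch below assumes, as stated in the text, that each ZF recognizes 3 nt; the function names are hypothetical and the calculation is only a back-of-the-envelope illustration.

```python
import math

NT_PER_FINGER = 3  # from the text: each ZF recognizes 3 nt


def fingers_needed(motif_length_nt: int) -> int:
    """Minimum number of ZFs required to contact a motif of the given length."""
    return math.ceil(motif_length_nt / NT_PER_FINGER)


def max_footprint_nt(n_fingers: int) -> int:
    """Longest motif that could be contacted if every finger bound DNA."""
    return n_fingers * NT_PER_FINGER
```

A 6-8 nt motif needs only 2-3 fingers by this estimate, whereas the 13 ZFs of ZFX could in principle contact 39 nt, which illustrates why the role of the additional fingers is an open question.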
1.7 Overall study design
In my dissertation research, I have focused on the 3-member ZFX family of C2H2 ZNFs. To gain
insight into their function, I created single and double knockout clones lacking ZFX and ZNF711
from female HEK293T cells (which naturally lack ZFY) and performed RNA-seq to examine
effects on the transcriptome. I analyzed ChIP-seq datasets (extending the studies to include a
male cell line to allow analysis of all three family members) and ChIP-exo datasets to identify
direct target genes of these TFs. I classified the ZFX family member binding sites using all known
TSS from GENCODE release 19 (GRCh37.p13) and known CpG islands from the UCSC Table
Browser (http://genome.ucsc.edu/cgi-bin/hgTables). I performed a series of ChIP-seq and DNA
methylation array experiments designed to investigate the mechanism by which ZFX family
members mediate transcription. Finally, I collaborated with other laboratory members to assay a
series of FLAG-tagged ZFX mutant proteins for DNA binding and transcriptional activity. A list of
all genomic datasets used in this study can be found in Table 1.1 (which is Supplementary Table
S1 in Ni et al. 2020).
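The classification of binding sites relative to TSSs and CpG islands described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the pipeline used in this study: the category names, the 2 kb promoter window, and the restriction to a single chromosome are all hypothetical choices made to keep the example short.

```python
from typing import List, NamedTuple, Tuple


class TSS(NamedTuple):
    pos: int     # genomic coordinate of the transcription start site
    strand: str  # "+" or "-"


def signed_distance(summit: int, tss: TSS) -> int:
    """Distance from TSS to peak summit; positive means downstream (transcribed side)."""
    d = summit - tss.pos
    return d if tss.strand == "+" else -d


def classify_peak(summit: int, tss_list: List[TSS],
                  cpg_islands: List[Tuple[int, int]],
                  window: int = 2000) -> Tuple[str, int, bool]:
    """Return (location label, signed distance to nearest TSS, overlaps a CpG island)."""
    nearest = min(tss_list, key=lambda t: abs(summit - t.pos))
    d = signed_distance(summit, nearest)
    if abs(d) > window:
        location = "distal"
    elif d >= 0:
        location = "promoter_downstream"
    else:
        location = "promoter_upstream"
    # CpG islands are half-open intervals [start, end)
    in_island = any(start <= summit < end for start, end in cpg_islands)
    return location, d, in_island
```

Taking strand into account matters here: a peak at a higher coordinate than a minus-strand TSS lies upstream of that gene, not downstream.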
Chapter 2
Materials and methods
2.1 Cell culture
Human kidney HEK293T (ATCC #CRL-3216) and prostate cancer 22Rv1 (ATCC #CCL-2505)
cells were obtained from ATCC (https://www.atcc.org/). Cells were cultured in appropriate media
(HEK293T in DMEM and 22Rv1 in RPMI 1640) supplemented with 10% fetal bovine serum (Gibco
by Thermo Fisher #10437036) plus 1% penicillin and 1% streptomycin at 37 °C with 5% CO2. Cell
lines were authenticated via the STR method and validated to be mycoplasma free using a
universal mycoplasma detection kit (ATCC #30-1012K).
2.2 CRISPR/Cas9-mediated genomic deletions
Guide RNAs used to create ZFX and ZNF711 functional deletions (see Table 2.1 which is
Supplementary Table S2 in Ni et al. 2020) were cloned into pSpCas9(BB)-2A-Puro (PX459) V2.0
plasmid (Addgene #62988). HEK293T cells were transfected with PX459 V2.0 expressing Cas9
plus the gRNAs or with the PX459 V2.0 vector only (which expressed Cas9 but not guide RNAs)
using Lipofectamine 3000 (Thermo Fisher #L3000015), according to the manufacturer’s protocol.
24 hours after transfection, cells were selected with 2 ng/uL puromycin for 24 hours and then
harvested. Post-selection cell pools were stained with DAPI (Thermo Fisher #62248) and sorted
for live cells using BD FACSAria IIu SORP (USC Flow Cytometry Facility). Live single cells were
sorted individually into a well of 96-well plates containing growth media for HEK293T (described
above). Genomic DNA of single cell-derived clonal populations was extracted using QuickExtract
DNA Extraction Solution (Epicentre #QE9050), following the manufacturer’s protocol and was
used in PCR-based homozygous deletion screening assays with primers listed in Table 2.1
(which is Supplementary Table S2 in Ni et al. 2020). I identified multiple colonies that showed
complete deletion of the DNA between the paired guide RNAs (not shown). RNA from those single
cell-derived clonal populations was harvested using DirectZol RNA MiniPrep kit (Zymo #R2052)
according to the manufacturer’s protocol. cDNA was synthesized using the SuperScript VILO
cDNA Synthesis Kit (Life Technologies #11754-050) following the manufacturer’s protocol and
used in qPCR-based (Quantabio #95054-02K) assays with primers listed in Table 2.1 (which is
Supplementary Table S2 in Ni et al. 2020). These assays demonstrated that there was no
detectable RNA corresponding to the region within the deleted coding regions (not shown). Finally,
a Western blot was performed to demonstrate that there was no expression of ZFX or ZNF711
protein in the clones (see Figure 3.5).
2.3 Cell cycle analysis
Cells of wildtype (wt) HEK293T, two ZFX knockout (KO) clones, two ZNF711 KO clones, and
three ZFX and ZNF711 double knockout (DKO) clones were treated with 70% ethanol for 2 hours
on ice, washed twice with cold PBS, and then labelled with DAPI (Thermo Fisher #62248) at a
final concentration of 10 ug/mL for 30 minutes on ice, protected from light. The flow cytometry
assay was performed using BD LSR II (USC Flow Cytometry Facility). Fixed cells were gated on
single cells via Width and Area signals. The percentages of cells in the G0/G1, S, and
G2/M phases were calculated from the DAPI-area histogram using ImageJ
(https://imagej.nih.gov/ij/).
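As a simplified illustration of how phase percentages can be derived from DAPI-area values, the Python sketch below assigns each cell to a phase using two gate boundaries. The boundaries and the function name are hypothetical, and fixed-threshold gating is a stand-in chosen for clarity; it is not a description of the ImageJ procedure used here.

```python
from typing import Dict, Sequence


def phase_percentages(dapi_area: Sequence[float],
                      g1_upper: float, g2_lower: float) -> Dict[str, float]:
    """Percentage of cells in G0/G1, S, and G2/M given two assumed gate boundaries."""
    if not dapi_area:
        raise ValueError("no cells provided")
    if g1_upper >= g2_lower:
        raise ValueError("g1_upper must be below g2_lower")
    counts = {"G0/G1": 0, "S": 0, "G2/M": 0}
    for value in dapi_area:
        if value <= g1_upper:
            counts["G0/G1"] += 1   # 2N DNA content
        elif value < g2_lower:
            counts["S"] += 1       # intermediate DNA content
        else:
            counts["G2/M"] += 1    # 4N DNA content
    total = len(dapi_area)
    return {phase: 100.0 * n / total for phase, n in counts.items()}
```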
2.4 RNA-seq
Total RNA was extracted using DirectZol RNA MiniPrep kit (Zymo #R2052) following the
manufacturer’s protocol. RNA integrity was checked using RNA 6000 Nano kit (Agilent
Technologies #50671511) on a 2100 Bioanalyzer (Agilent Technologies #G2939AA). RNA-seq
libraries for controls, ZFX and ZNF711 KO clones, and the DKO clones were made using the
KAPA Stranded mRNA kit with beads (Roche #KK8421) following the manufacturer’s protocol.
Samples were sequenced on an Illumina HiSeq3000 with 50bp single-ended reads. The RNA-
seq libraries of DKO cells transfected with a control plasmid, wt ZFX FLAG, or ZFX ZF11-13 FLAG
were prepared by Novogene. Paired-end sequencing was performed by the company. RNA-seq
reads were aligned to GENCODE v19 and counted using STAR
(https://github.com/alexdobin/STAR). Differentially expressed genes with absolute fold
change >1.5 were determined using edgeR
(https://bioconductor.org/packages/release/bioc/html/edgeR.html). DAVID
(https://david.ncifcrf.gov/summary.jsp) was used for gene ontology analyses; specifically, the
Functional Annotation Clustering tool and the INTERPRO protein domain category were used, with
default settings (3 genes required per category) and medium stringency.
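The fold-change filter described above can be expressed on the log2 scale, which is the scale on which edgeR reports fold changes. The following Python sketch is only an illustration of the 1.5-fold threshold: the input format (gene name paired with a log2 fold change) and the function name are assumptions, and no statistical cutoff is applied because none is stated in the text.

```python
import math
from typing import Iterable, List, Tuple

# An absolute fold change of 1.5 corresponds to |log2FC| of about 0.585
LOG2_FC_CUTOFF = math.log2(1.5)


def split_by_fold_change(rows: Iterable[Tuple[str, float]],
                         cutoff: float = LOG2_FC_CUTOFF) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene names from (gene, log2FC) pairs."""
    up, down = [], []
    for gene, log2_fc in rows:
        if log2_fc > cutoff:
            up.append(gene)
        elif log2_fc < -cutoff:
            down.append(gene)
    return up, down
```

Working on the log2 scale treats a 1.5-fold increase and a 1.5-fold decrease symmetrically, which a threshold on the raw ratio would not.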
2.5 Construction of ZFX zinc finger deletion mutants
ZFX mutant expression constructs were generated by amplifying the ZFX-Myc-DDK expression
vector (Origene #RC214045) using primers with 15 bp complementary overhangs flanking
different ZFs to create constructs containing ZF1-8, ZF9-13, ZF9-11, ZF11-13, or no ZF (see
Table 2.1 which is Supplementary Table S2 in Ni et al. 2020). The resulting constructs were
transformed into CopyCutter™ EPI400™ Chemically Competent E. coli (Lucigen #C400CH10)
and induced to high copy number according to the manufacturer’s protocol. Plasmids were
purified using Qiagen miniprep kit (Qiagen #D4068) and the deletions were validated via Sanger
sequencing. Primers used for cloning and sequencing are listed in Table 2.1 (which is
Supplementary Table S2 in Ni et al. 2020).
2.6 Transient transfection assays
To test transcriptional activity of the ZFX deletion mutants, HEK293T cells were seeded into 6
well plates and transfected during log phase growth. Transfection was carried out with
Lipofectamine 3000 (ThermoFisher #L3000015) according to manufacturer’s instructions. After
24 hours, cells were lysed in TRI Reagent (Zymo #R2050-1-200) and RNA was recovered by
precipitation. Total RNA was converted to cDNA using iScript (Bio-Rad #1708841BUN). Real-time
quantitative polymerase chain reaction (RT-qPCR) was carried out using SYBR on a BioRad
CFX 1000. Data points represent results from triplicate wells and duplicate RT-qPCR readings.
Primers used to monitor expression of endogenous genes are provided in Table 2.1 (which is
Supplementary Table S2 in Ni et al. 2020).
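To illustrate how triplicate wells with duplicate readings can be combined into one expression value, the Python sketch below averages the technical readings within each well, averages across wells, and then applies the comparative Ct (2^-ΔΔCt) calculation. The use of the comparative Ct method, the reference gene, and all names are assumptions made for illustration; the text does not state which quantification model was applied.

```python
from statistics import mean
from typing import List


def well_means(ct_readings: List[List[float]]) -> List[float]:
    """Average the technical Ct readings within each well."""
    return [mean(well) for well in ct_readings]


def relative_expression(target_sample: List[List[float]], ref_sample: List[List[float]],
                        target_control: List[List[float]], ref_control: List[List[float]]) -> float:
    """Comparative Ct estimate of target expression in sample relative to control."""
    delta_sample = mean(well_means(target_sample)) - mean(well_means(ref_sample))
    delta_control = mean(well_means(target_control)) - mean(well_means(ref_control))
    return 2 ** -(delta_sample - delta_control)
```

With three wells of two readings each, a target Ct that is one cycle lower in the sample than in the control (at equal reference Ct) gives a relative expression of 2.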
2.7 ChIP-seq
ZFX (Cell Signaling Technology # 5419S), ZNF711 (Kleine-Kohlbrecher et al. 2010), and ZFY
(Sigma #SAB2102775-100UL) antibodies were used for ChIP assays in HEK293T and 22Rv1
cells, as previously described (7). 400-900 ug chromatin was used for ZFX (30 uL antibody),
ZNF711 (5 ug antibody), and ZFY (10 uL antibody) ChIP assays. For ZFX and ZNF711 antibody
validation, Western blots were performed in wildtype and knockout cells. For ZFY antibody
validation, it was demonstrated that ZFY can be ChIPed in male 22Rv1 cells but not in female
HEK293T cells, thus demonstrating that there is no cross reactivity with the other 2 family
members (Figure 2.1).
Figure 2.1 ZFY antibody validation in female HEK293T cells
ChIP assays were performed using an antibody to ZFY or ZNF711 in female HEK293T cells and binding
was analyzed using primers for a known ZFY and ZNF711 binding site (LRRC41) and a negative region
(ZNF554-3’). Although ZNF711 showed very robust binding to LRRC41, ZFY binding was not detected. For
comparison, the inset shows that ZFX binding to the LRRC41 promoter is detected at similar levels as
ZNF711 binding. Andrew Perez performed the ChIP experiments.
[Figure 2.1 image, labelled "Ni. Figure S1" in the source. Bar chart; y-axis: Relative Normalized Expression (0-300); x-axis: LRRC41 and ZNF554-3’; series: Input, ZFY ChIP, and ZNF711 ChIP. Inset: ZNF711 and ZFX.]
All ChIP-seq samples for endogenous TFs were performed in duplicate, following ENCODE
standards. ChIP-seq libraries were prepared using the KAPA HyperPrep kit (Roche #KK8503)
following the manufacturer’s protocol. Samples were sequenced on an Illumina HiSeq3000
machine using 100 bp paired-end reads for ZFX and 50bp single-end reads for all other samples.
All ChIP-seq data were processed according to the ENCODE3 ChIP-seq pipeline
(https://www.encodeproject.org/chip-seq/), and mapped to hg19; all data passed ENCODE
quality standards. ChIP-seq peaks were called using MACS2 (https://github.com/taoliu/MACS),
followed by identifying common peaks between duplicates using irreproducible discovery rate
(IDR) (https://github.com/nboley/idr). To test DNA binding activity of mutant ZFX proteins,
HEK293T cells were transfected with a plasmid expressing a FLAG-tagged wt ZFX or a mutated
ZFX construct using Lipofectamine 3000 (Thermo Fisher #L3000015) according to the
manufacturer’s protocol. Cells were harvested 24 hours after transfection for ChIP assays. For
each ChIP assay, 5ug of FLAG antibody (Sigma-Aldrich #F1804-200UG) was used with 150 ug
chromatin. Also, 40 ug of chromatin, along with an antibody to histone 3 lysine 36 trimethylation
(H3K36me3) (Cell Signaling Technology #9763S), was used for ChIP-seq analysis of wt
HEK293T and three DKO clones; the antibody was validated by the company to demonstrate no
cross-reactivity to unmodified, mono-, or di-methylated H3K36. ChIP-seq was performed and
analyzed as described above.
2.8 ChIP-exo
Approximately 100 million HEK293T cells were crosslinked for each ChIP-exo assay using the
ChIP-seq protocol described above. Crosslinked cells, ZFX antibody (Cell Signaling Technology
# 5419S), and ZNF711 antibody (Thermo Fisher #PA5-31815) were sent to Peconic, where the
ChIP-exo assay was performed (http://www.peconicgenomics.com/services.html). Samples were
sequenced on an Illumina NextSeq 500 machine using 2 × 40 bp paired-end sequencing
generating ~40 million reads per sample. Sequence reads were aligned to the human (hg19) genome
using bwa-mem (v0.7.9a) (http://bio-bwa.sourceforge.net/). Peaks in ChIP-exo data were
called using ChExMix (http://mahonylab.org/software/chexmix/).
2.9 DNA methylation EPIC arrays
500 ng genomic DNA was extracted from wt HEK293T cells and the three DKO clones using the
Zymo Quick-DNA Miniprep kit (Zymo #D3024); the genomic DNA was bisulfite-converted using
the Zymo EZ DNA Methylation kit (Zymo #D5001) according to the manufacturer’s protocol. The
bisulfite-converted DNA was analyzed using Illumina EPIC BeadArrays (Moran, Arribas, and
Esteller 2016). The BeadArrays were scanned and the raw signal intensities were extracted from
the *.IDAT files using the ‘noob’ function in the minfi R package. The beta value (a measure of
the DNA methylation level) was calculated as (M/(M+U)), in which M and U refer to the
(pre-processed) mean methylated and unmethylated probe signal intensities, respectively.
Measurements in which the fluorescent intensity was not statistically significantly above
background signal (detection p value > 0.05) were removed from the dataset. Probes located from
-1500 bp relative to the TSS and extending through the first coding exon (using the Illumina
MethylationEPIC Manifest RefGene annotation) were included in the analysis as a defined set of
“promoter” probes for downstream analysis. The cut off used for identifying hypomethylated or
hypermethylated probes was 0.2 for the absolute beta value difference between the methylation
level of a probe in the DKO cells vs. the wt HEK293T cells.
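The beta value calculation and the filtering steps described above can be summarized in a short Python sketch. It uses the formula M/(M+U), the detection p value cutoff of 0.05, and the 0.2 beta difference cutoff from the text; the function names, the category labels, and the choice to treat a difference of exactly 0.2 as a change are assumptions made for illustration.

```python
def beta_value(m: float, u: float) -> float:
    """Beta = M / (M + U), where M and U are methylated and unmethylated intensities."""
    total = m + u
    if total <= 0:
        raise ValueError("M + U must be positive")
    return m / total


def classify_probe(beta_dko: float, beta_wt: float,
                   detection_p_dko: float, detection_p_wt: float,
                   delta_cutoff: float = 0.2, p_cutoff: float = 0.05) -> str:
    """Label a probe by its beta value difference between DKO and wt cells."""
    # Drop measurements not significantly above background in either sample
    if detection_p_dko > p_cutoff or detection_p_wt > p_cutoff:
        return "removed"
    delta = beta_dko - beta_wt
    if delta >= delta_cutoff:
        return "hypermethylated"
    if delta <= -delta_cutoff:
        return "hypomethylated"
    return "unchanged"
```

For example, a probe with M = 300 and U = 100 has a beta value of 0.75, and a probe going from 0.40 in wt cells to 0.75 in DKO cells would be labelled hypermethylated.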
2.10 Western Blot
Nuclear lysate was prepared using cell lysis buffer (5mM PIPES pH 8, 85mM KCl, 10uL/mL NP40)
with protease inhibitors (cOmplete™ Protease Inhibitor Cocktail, Sigma Aldrich), followed by
nuclei lysis buffer (50mM Tris pH 8, 10mM EDTA, 1% SDS) with protease inhibitors. Nuclear
lysates were briefly sonicated using a Bioruptor Pico (Diagenode) sonicator to reduce sample
viscosity. Extracted proteins were denatured at 95 °C for 5 min. Denatured proteins were separated
on 4%-15% SDS-PAGE gels (BioRAD #4561085), and then transferred to a nitrocellulose
membrane. The membranes were blocked for 1 hour at room temperature with 5% nonfat milk
(BioRAD #1706404XTU) in TBST (20mM Tris pH 7.4, 150mM NaCl, 0.1% Tween 20) then
washed 3 times with TBST. Primary antibodies were added at 1:1000 dilution in TBST and
incubated overnight at 4 °C. The primary antibodies are described in Chapter 2.7 ChIP-seq
experiments; as a loading control, a primary antibody raised against Nucleoporin p62 (BD
Biosciences #610497) was used. The next day, the membranes were incubated with secondary
antibodies (Thermo Scientific, Goat Anti-Mouse IgG (H+L), DyLight 680 Conjugated, and Goat
Anti-Rabbit IgG (H+L), DyLight 800 Conjugated) at 1:1000 dilution for 1 hour at room temperature
and washed 3 times with TBST. Fluorophore signals were detected using the Li-COR Odyssey
system.
2.11 Immunoprecipitation-Mass Spectrometry (IP-MS)
1 × 10⁷ cells per IP were scraped from the plates using ice cold PBS and lysed in cell lysis buffer
(5mM PIPES pH8, 85mM KCl, 1% Igepal) with protease inhibitors (cOmplete™ Protease
Inhibitor Cocktail, Sigma Aldrich) on ice for 15 minutes. Lysed cells were centrifuged at 4ºC and
the buffer was removed. Pelleted nuclei were resuspended in 1x RIPA buffer (50mM Tris pH 7.4,
150mM NaCl, 1% Igepal, 0.25% deoxycholic acid, and 1mM EDTA, pH 8.0) with protease
inhibitors and incubated on ice for 30min. Nuclei were centrifuged at maximum speed for 15
minutes at 4ºC; the lysate was transferred to a new tube for each IP and diluted using 1x RIPA
buffer with protease inhibitors. The lysate was incubated with rabbit IgG at 4ºC for 1-2 hours for
preclearing. Protein A/G beads (VWR, PI88803) were used to remove the IgG. The supernatant
was transferred to a new tube and antibody added to each IP and incubated on a rotator at 4ºC
overnight. The next day, protein A/G beads were added to each IP and incubated on a rotator for
an additional 2 hours. Beads were allowed to separate on a magnetic rack; the supernatant was
discarded. The beads were washed three times with 1x RIPA with protease inhibitors. Four buffer
exchanges were performed using 50mM ammonium bicarbonate. The beads were shaken for 20
minutes at 4ºC between each exchange. After the 4th exchange, 0.5uL trypsin (Promega) was
added and the beads were shaken at room temperature overnight. The next morning, beads were
separated from the supernatant using a magnet and removed, saving the supernatant. An
additional 25uL of 50mM ammonium bicarbonate was added to the beads and shaken for an
additional 20 minutes. Again, the beads were separated from the supernatant using a magnet
and the second supernatant was combined with the first supernatant. Trypsin-digested samples
were then flash frozen using liquid nitrogen and stored at -80ºC until shipped to UC Davis for
mass spectrometry and analyses. Scaffold (.sf3) files received from the UC Davis proteomics core
were opened in Scaffold Viewer. Experiment settings were adjusted to categorize the ZFX IP as
target and IgG IP as control. Data were filtered to include proteins with a 1.0% FDR, with a
minimum of 1 peptide and 95% peptide threshold. Quantitative analysis of the total spectra was
performed using Scaffold to calculate fold change by category and perform a Fisher’s Exact Test
with no correction and a significance of p < 0.05 using the control as reference. Using these
settings and calculations, a scatterplot was generated to visualize the data.
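For readers unfamiliar with the test named above, the following Python sketch implements a two-sided Fisher's Exact Test on a 2 × 2 table of spectral counts using only the standard library. The table layout (spectra assigned to one protein versus all other spectra, in the ZFX IP versus the IgG IP) and the example counts are assumptions made for illustration; how Scaffold constructs its table internally is not described in the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's Exact Test p value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a = protein spectra in target IP, b = other spectra in target IP,
    c = protein spectra in control IP, d = other spectra in control IP.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables at least as extreme (as improbable) as the observed one
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)
```

A protein would pass the filter described above when the returned value is below 0.05, with no multiple-testing correction applied.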
Chapter 3
Characterization of the ZFX family of transcription factors that bind downstream of the
start site of CpG island promoters
Most of the work described in this chapter has been published in Ni, Perez, Schreiner, Nicolet,
and Farnham. 2020. "Characterization of the ZFX family of transcription factors that bind
downstream of the start site of CpG island promoters." Nucleic Acids Res. doi:
10.1093/nar/gkaa384.
3.1 Abstract
Although their protein structure suggests that ZFX, ZFY, and ZNF711 are transcriptional
regulators, the mechanisms by which they influence transcription have not yet been elucidated. I
used CRISPR-mediated deletion to create bi-allelic knockouts of ZFX and/or ZNF711 in female
HEK293T cells (which naturally lack ZFY). I found that loss of either ZFX or ZNF711 reduced cell
growth and that the DKO cells have major defects in proliferation. RNA-seq analysis revealed that
thousands of genes showed altered expression in the DKO clones, suggesting that these TFs are
critical regulators of the transcriptome. To gain insight into how these TFs regulate transcription,
mutant ZFX proteins were created and analyzed for DNA binding and transactivation capability.
These experiments demonstrated that ZFs 11-13 are necessary and sufficient for DNA binding
and, in combination with the N terminal region, constitute a functional transactivator. These
functional analyses of the ZFX family provide important new insights into transcriptional
regulation in human cells by members of the large, but under-studied family of C2H2 ZNFs.
3.2 Introduction
The studies presented in this chapter are focused on a small family of human C2H2 ZNFs that
are ubiquitously expressed in human tissues. A Treefam (http://www.treefam.org) analysis
reveals that the members of the family are ZFX, ZFY, and ZNF711 (Figure 3.1).
Figure 3.1 The ZFX gene family
Shown is a Treefam (www.treefam.org/) alignment for ZFX, identifying the closely related ZFY and ZNF711
proteins.
ZFX and ZFY are nearly identical proteins encoded on the X and Y chromosomes, respectively
(having 96% overall similarity, with 99% similarity in the zinc finger domains). ZNF711 is highly
related to the other two family members, having 67% overall similarity with ZFX and 87% similarity
in the zinc finger domains (Figure 3.2).
Figure 3.2 The ZFX family gene structure comparison
Shown are gene structure schematics for ZFX, ZFY, and ZNF711. Dashed lines indicate zinc fingers
conserved between ZFX and the other two family members. NLS: nuclear localization sequence. Weiya Ni
performed the alignments.
Ni. Figure S1
Figure S1. Amino acid alignment of the ZFX family members. A) Shown is a Treefam (www.treefam.org/) alignment for ZFX,
identifying the closely related ZFY and ZNF711 proteins. B) Alignments were generated using sequences (ZFX-P17010,
ZFY-P08048, ZNF711-Q9Y462) from UniProt (www.uniprot.org), aligned using MUSCLE (www.ebi.ac.uk/Tools/msa/muscle/), output was
generated in ClustalW format (www.genome.jp/tools-bin/clustalw), and viewed using Jalview v2.10.4 (jalview.org). Each of the 13
C2H2 zinc fingers of ZFX are boxed; for each finger, the cysteines are marked in red and the histidines are marked in green;
cysteines and histidines not related to the zinc finger structure are not color coded. Optimal DNA binding zinc fingers have the
structure X-C-X_{2-5}-C-X_{3}-ψ-X_{5}-ψ-X_{2}-H-X_{3-5}-H (where ψ is a hydrophobic amino acid); if present in the right position, each ψ is marked
in blue. C) ChIP assays were performed using an antibody to ZFY or ZNF711 in female HEK293T cells and binding was analyzed
using primers for a known ZFY and ZNF711 binding site (LRRC41) and a negative region (ZNF554-3’). Although ZNF711 showed
very robust binding to LRRC41, ZFY binding was not detected. For comparison, the inset shows that ZFX binding to the LRRC41
promoter is detected at similar levels as ZNF711 binding.
Panel C labels: Relative Normalized Expression (0 to 300) at LRRC41 and ZNF554-3’ for Input, ZFY ChIP, and ZNF711 ChIP; inset: ZNF711, ZFX.
Ni. Figure 1
Panel labels: ZFY (801 aa), ZFX (805 aa), ZNF711 (761 aa); acidic domain, NLS, zinc finger (ZF) domain.
Similarity to ZFX: ZFY, 96% overall and 99% in the ZF domain; ZNF711, 67% overall and 87% in the ZF domain.
Although previous studies have recognized the high similarity of ZFX and ZFY (North et al. 1991),
the relationship of ZNF711 to ZFX and ZFY has been noted only recently (Rhie et al. 2018). The
next closest human ZNF identified by the Treefam analysis is ZNF639. However, we have not
included ZNF639 in the ZFX family because it has only a 25% similarity to ZFX. ZFX and ZFY
have 13 zinc finger domains at the C-terminal end of the protein; ZNF711 has amino acid
differences that disrupt ZF3 and ZF7 and thus has only 11 ZFs. All 3 proteins have an acidic
domain at the N-terminus and a nuclear localization signal between the acidic domain and the
zinc finger domains; see Figure 3.3 for a comparison of the amino acid sequences of the ZFX
family members.
Figure 3.3 Amino acid alignment of the ZFX family members
Alignments were generated using sequences (ZFX-P17010, ZFY- P08048, ZNF711-Q9Y462) from UniProt
(www.uniprot.org), aligned using MUSCLE (www.ebi.ac.uk/Tools/msa/muscle/), output was generated in
ClustalW format (www.genome.jp/tools-bin/clustalw), and viewed using Jalview v2.10.4 (jalview.org). Each
of the 13 C2H2 zinc fingers of ZFX is boxed; for each finger, the cysteines are marked in red and the
histidines are marked in green; cysteines and histidines not related to the zinc finger structure are not color
coded. Optimal DNA binding zinc fingers have the structure X-C-X_{2-5}-C-X_{3}-ψ-X_{5}-ψ-X_{2}-H-X_{3-5}-H (where ψ is
a hydrophobic amino acid); if present in the right position, each ψ is marked in blue. Shannon Schreiner
performed the alignments.
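The consensus structure quoted in the caption can be written as a regular expression, which is a convenient way to scan a protein sequence for candidate C2H2 fingers. The sketch below is only an illustration: the set of residues counted as hydrophobic (ψ) is my assumption, the function name is hypothetical, and the toy sequence in the usage note is synthetic rather than taken from ZFX.

```python
import re

# Assumed hydrophobic residue set for ψ; the caption does not list the residues.
HYDROPHOBIC = "AVILMFWY"

# X-C-X(2-5)-C-X(3)-ψ-X(5)-ψ-X(2)-H-X(3-5)-H, with "." standing for any residue X.
C2H2_PATTERN = re.compile(
    r".C.{2,5}C.{3}[%s].{5}[%s].{2}H.{3,5}H" % (HYDROPHOBIC, HYDROPHOBIC)
)

def find_c2h2_fingers(protein_seq):
    """Return (start, end) spans of non-overlapping matches to the consensus."""
    return [m.span() for m in C2H2_PATTERN.finditer(protein_seq.upper())]
```

For example, `find_c2h2_fingers("YKCPDCGKAFSRSDELTRHIRIH")` reports one candidate finger in this synthetic sequence; replacing the first hydrophobic position with a charged residue removes the match.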
Of the three family members, ZFX has been the most studied in relation to a variety of human
cancers. In fact, it has been implicated in the initiation or progression of many different types of
human cancers, including prostate cancer, breast cancer, colorectal cancer, glioma, renal
carcinoma, gastric cancer, gallbladder adenocarcinoma, non-small cell lung carcinoma and
laryngeal squamous cell carcinoma (Fang et al. 2012, Li et al. 2013, Weng et al. 2015, Nikpour
et al. 2012, Fang, Huang, et al. 2014, Fang, Fu, et al. 2014, Zhou et al. 2011, Jiang, Wang, et al.
2012, Jiang, Xu, et al. 2012). In these previous studies, it was shown that high expression of ZFX
correlates with poor survival of cancer patients. Based on its increased levels and association
with poor survival in many different cancer types, ZFX does not appear to be a tumor type-specific
oncogene, but rather increased levels of ZFX (and perhaps also ZFY and ZNF711) may generally
contribute to metaplastic transformation by causing tumor-promoting changes in the
transcriptome. However, the mechanism(s) by which the ZFX family influences transcriptional
regulation has not been determined. Therefore, I created knockout cells lacking expression of all
ZFX family members, identified genes responsive to loss of these TFs, characterized and
compared the binding patterns of ZFX, ZFY, and ZNF711 using chromatin immunoprecipitation
(ChIP)-seq and ChIP-exo, and performed structure-function analyses of the ZFX protein,
identifying regions sufficient for DNA binding and transactivation.
3.3 Results
3.3.1 Loss of ZFX and ZNF711 inhibits cell proliferation and causes large changes in the
transcriptome of HEK293T cells
For the initial investigations into the function of the ZFX family, I used the CRISPR/Cas9 system
to functionally inactivate the ZFX and ZNF711 genes in female HEK293T cells. I chose to use
these cells because they express similar levels of ZFX and ZNF711 (Figure 3.4A) but lack ZFY
(which is encoded on the Y chromosome). Because ZFX and ZFY are so similar (96% overall), it
is likely they have a similar function and the use of female cells meant that we only had to delete
two TFs and not three to study the consequences of loss of the entire family. Paired sets of
plasmids encoding guide RNAs designed to delete specific coding regions of ZFX or ZNF711
(Figure 3.4B) and co-expressing Cas9 were transfected into HEK293T cells; after 48 hours
individual cells were isolated using flow cytometry and then grown into colonies. Genomic DNA
was extracted and analyzed using specific primers that spanned the deletion region. See Table
2.1 (which is Supplementary Table S2 in Ni et al. 2020) for the sequence of all guide RNAs and
primers used in this study.
Figure 3.4 ZFX and/or ZNF711 CRISPR KO in HEK293T
A) Expression levels (log2 CPM) of ZFX/ZFY/ZNF711 in wt HEK293T cells. B) Locations of gRNAs used to create
CRISPR/Cas9-mediated ZFX and/or ZNF711 knockouts, shown on the hg19 assembly (ZFX: chrX:24,167,762-24,234,372;
ZNF711: chrX:84,498,997-84,528,368); the deletions annotated in the panel are 7.7 kb, 294 bp, and 122 bp.
The deletion of ZFX in ZFX KO clone1 and clone2
and the DKO clones was generated using ZFX gRNA1 and gRNA2. The deletion of ZNF711 in ZNF711
KO clone1 and the DKO clones was generated using ZNF711 gRNA1 and gRNA2; the deletion in ZNF711
KO clone2 was generated using ZNF711 gRNA2 and gRNA3. Weiya Ni performed the gRNA design and
CRISPR KO experiments.
I identified multiple colonies that showed no expression of ZFX or ZNF711 (Figure 3.5A).
However, my initial transfections did not produce any cells lacking both ZFX and ZNF711, despite
screening a large number of colonies. Therefore, I next transfected guide RNAs that target ZFX
into the ZNF711 KO clone1 and selected single cell-derived colonies, this time using conditioned
media (70% regular growth media plus 30% filtered growth media taken from growing cultures of
the wt HEK293T) to provide a more supportive growth environment. I obtained several DKO cell
clones that lacked expression of both ZFX and ZNF711 (Figure 3.5A). The difficulty in obtaining
DKO clones suggested that reduction of both ZFX and ZNF711 may have negatively affected cell
proliferation. To test this hypothesis, I performed proliferation assays over a 168-hr time course.
As shown in Figure 3.5B, loss of either ZFX or ZNF711 reduced the proliferation rate of HEK293T
cells to approximately the same level, whereas loss of both ZFX and ZNF711 caused a severe
inhibition of cell proliferation; in general, I have observed that DKO cells grow slowly and must be
kept at a high density to maintain viable cell populations.
Figure 3.5 Loss of ZFX and ZNF711 in HEK293T cells inhibits cell proliferation
A) Western blots showing the protein levels of ZFX and ZNF711 in wt HEK293T, ZFX KO clones, ZNF711
KO clones, and DKO clones; also shown is the level of p62 as a loading control. B) Proliferation assays
using wt HEK293T, two different ZFX and two different ZNF711 KO clones, and two DKO clones; data
points are the mean of three biological replicates. Weiya Ni performed the experiments and data analysis.
The severe effects on proliferation in the ZFX and ZNF711 KO and DKO cells suggested that loss
of these TFs was likely to cause major changes in the transcriptome of HEK293T cells. To test
this hypothesis, I performed RNA-seq analysis of two ZFX KO clones, two ZNF711 KO clones,
three DKO clones lacking both ZFX and ZNF711, and controls; each clone was analyzed using 3
biological replicates (producing 24 RNA-seq datasets in total). Volcano plots of the
differentially expressed genes (DEGs) in both of the ZFX KO clones, both of the ZNF711 KO
clones, and the three DKO clones are shown in Figure 3.6. See Table 3.1 (which is
Supplementary Table S3A-G in Ni et al. 2020) for the gene expression changes in all single and
double knockout clones.
Figure 3.6 Reduction in ZFX and ZNF711 levels causes large effects on the transcriptome
Volcano plots showing the DEGs identified via RNA-seq in comparisons of wt HEK293T vs ZFX KO clone1,
KO clone2, ZNF711 KO clone1, KO clone2, DKO clone1, DKO clone2, or DKO clone3. The numbers of
significantly up- and downregulated genes in each clone are shown in the upper right or left corners,
respectively, of each panel. Weiya Ni performed the experiments and data analysis.
In general, I observed that cells lacking ZNF711 but retaining ZFX had fewer changes in the
transcriptome than did cells lacking ZFX but retaining ZNF711; cells lacking both TFs showed the
greatest number of upregulated and downregulated genes. To address any potential issues due
to clonal variation, I compared the genes showing altered regulation in each of the 3 individually
derived clonal populations that lacked both ZFX and ZNF711, identifying 2428 genes
downregulated in at least 2 of the 3 DKO clones and 1166 genes commonly downregulated in all 3 DKO
clones (Figure 3.7A). I also identified 3784 genes upregulated in at least 2 of the 3 DKO clones and 2124
genes commonly upregulated in all 3 of the DKO clones. Gene ontology analyses of the commonly
deregulated genes in all 3 DKO clones revealed that different categories of genes were
upregulated vs downregulated (Figure 3.7B). For example, genes that are upregulated upon loss
of ZFX and ZNF711 include histone genes, zinc finger TFs, and cadherins whereas genes that
are downregulated upon loss of the two TFs include kinases, ATPases, peptidases, chaperone
proteins, and oxidoreductases. A complete list of the clusters and all genes identified in each
cluster can be found in Table 3.2 (which is Supplementary Table S3J & S3K in Ni et al. 2020).
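The clone-to-clone comparison described above amounts to counting, for each gene, how many DKO clones call it differentially expressed. The following is a minimal sketch of that bookkeeping, reading "2 of the 3 DKO clones" as "at least 2 of the 3"; the function name and the toy gene identifiers are hypothetical, and the real input would be the per-clone DEG lists from Table 3.1.

```python
from collections import Counter

def genes_shared_by_clones(deg_sets, min_clones):
    """Return genes that appear in at least `min_clones` of the given DEG sets."""
    counts = Counter(gene for genes in deg_sets for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_clones}

# Toy example with made-up gene names (not real results).
clone1 = {"geneA", "geneB", "geneC"}
clone2 = {"geneB", "geneC", "geneD"}
clone3 = {"geneC", "geneD", "geneE"}

in_two_or_more = genes_shared_by_clones([clone1, clone2, clone3], min_clones=2)
in_all_three = genes_shared_by_clones([clone1, clone2, clone3], min_clones=3)
```

Requiring a gene to be called in every independently derived clone is the stricter filter, and it is that set that was carried forward into the gene ontology analysis.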
Figure 3.7 Gene ontology analysis of common DEGs in all 3 DKO clones
A) Comparison of DEGs commonly downregulated or upregulated in all 3 DKO clones. B) Gene ontology
analysis of the 1166 commonly downregulated and 2124 commonly upregulated genes in all 3 DKO clones.
Weiya Ni performed the data analysis.
In support of my finding that loss of ZFX and ZNF711 resulted in proliferation defects, the term
“Cyclins and Cell Cycle Regulation” was one of the top identified pathways in the set of
downregulated genes; additionally, flow cytometry cell cycle analysis revealed that the DKO cells
have a higher percentage of G0/G1 cells and a lower percentage of G2/M cells than wt HEK293T
cells (Figure 3.8).
Figure 3.7 panel labels
A) Venn diagrams of DKO clone1, DKO clone2, and DKO clone3. Downregulated genes: 1166 shared by all 3 clones; other
region counts 1014, 585, 510, 793, 334, 418. Upregulated genes: 2124 shared by all 3 clones; other region counts
935, 525, 625, 585, 465, 570.
B) Enriched terms (x-axis: -log10(p-Value)).
Upregulated genes: Helicase, superfamily 1/2, ATP-binding domain, DinG/Rad3-type; Mammalian uncoordinated homology 13,
subgroup, domain 2; Thrombospondin, type 1 repeat; Ras guanine nucleotide exchange factor, domain; Cadherin; Histone H4,
conserved site; Zinc finger C2H2-type/integrase DNA-binding domain; EGF-like calcium-binding; MHC class II, alpha/beta
chain, N-terminal; Histone-fold.
Downregulated genes: Peptidase C19, ubiquitin carboxyl-terminal hydrolase 2; Serine/threonine-protein kinase, active site;
Cyclin, N-terminal; Helicase, C-terminal; AAA+ ATPase domain; Mitochondrial carrier domain; Pyridine
nucleotide-disulphide oxidoreductase, FAD/NAD(P)-binding domain; Histidine kinase-like ATPase, ATP-binding domain;
Chaperone tailless complex polypeptide 1 (TCP-1); Tetratricopeptide repeat-containing domain.
Figure 3.8 Cell cycle and pathway analysis of downregulated genes in all 3 DKO clones
A) Cells (wt HEK293T, two ZFX KO clones, two ZNF711 KO clones, and three DKO clones) were fixed,
labelled with DAPI (10 µg/mL), and analyzed using a BD LSR II, gating on single cells via Width and Area
signals. The percentages of cells in the G0/G1, S, and G2/M phases were calculated from the DAPI-area histogram.
B) The top identified pathways for the set of genes commonly downregulated in all 3 DKO clones are shown.
Weiya Ni performed the experiments and data analysis.
Figure 3.8 panel content
A) Cell cycle distribution (BD FACSDiva 8.0 output):
wt HEK293T: G0/G1 58.6%, S phase 21.4%, G2/M 19.5%
ZFX KO clone1: G0/G1 58.2%, S phase 22.1%, G2/M 18.8%
ZFX KO clone2: G0/G1 60.2%, S phase 20.5%, G2/M 18.2%
ZNF711 KO clone1: G0/G1 52.4%, S phase 24.4%, G2/M 21.9%
ZNF711 KO clone2: G0/G1 53.3%, S phase 20.8%, G2/M 24.9%
DKO clone1: G0/G1 63.1%, S phase 19.3%, G2/M 16.9%
DKO clone2: G0/G1 63.2%, S phase 20.6%, G2/M 14.9%
DKO clone3: G0/G1 71.3%, S phase 19.3%, G2/M 9.7%
B) Pathways listed (x-axis: -log10(p-Value), 0 to 8):
Glucose and Glucose-1-phosphate Degradation
Ketogenesis
Hypoxia Signaling in the Cardiovascular System
Oxidative Ethanol Degradation III
Citrulline Biosynthesis
GDP-glucose Biosynthesis
Superpathway of Citrulline Metabolism
Isoleucine Degradation I
Unfolded protein response
Tryptophan Degradation X (Mammalian, via Tryptamine)
Cyclins and Cell Cycle Regulation
UDP-N-acetyl-D-glucosamine Biosynthesis II
Antiproliferative Role of TOB in T Cell Signaling
UDP-N-acetyl-D-galactosamine Biosynthesis II
PI3K/AKT Signaling
Cell Cycle: G2/M DNA Damage Checkpoint Regulation
RAN Signaling
Superpathway of Geranylgeranyldiphosphate Biosynthesis I (via Mevalonate)
HIPPO signaling
Mitotic Roles of Polo-Like Kinase
Mevalonate Pathway I
Superpathway of Cholesterol Biosynthesis
Protein Ubiquitination Pathway
3.3.2 ZFX family members have essentially identical binding patterns at CpG island
promoters
The next step in characterizing ZFX and ZNF711 was to define their genome-wide binding profiles
by performing ChIP-seq in HEK293T cells using antibodies to ZFX and ZNF711; I note that the
antibodies used for these experiments have passed ENCODE validation criteria, as all signal on
a Western blot is eliminated in the individual knockout clones (Figure 3.5A). All ChIP-seq
experiments were performed using biological duplicates (see Table 1.1, which is Supplementary
Table S1 in Ni et al. 2020); browser tracks from a single replicate of ZFX and ZNF711 ChIP-seq
are shown in Figure 3.9A. I found that the binding profiles are very similar for ZFX and ZNF711.
As noted in Figure 3.2, ZFY is also highly related to ZFX and, based on the binding profiles of
ZFX and ZNF711, one might expect that ZFY would also have a binding pattern similar to that of ZFX.
However, ZFY is not expressed in female HEK293T cells. To allow a comparison of the binding
patterns of ZFX, ZFY, and ZNF711, replicate ChIP-seq experiments were performed in male
22Rv1 prostate cells for all 3 family members (ZFY antibody validation was performed by
demonstrating that no signal was detected by ChIP using female HEK293T cells). I found that all
3 family members showed highly correlated binding patterns throughout the human genome
(pairwise correlation of 0.97 for each pair of factors; Figure 3.9A, Figure 3.9B). Peaks were
identified for all ChIP-seq datasets and annotated into promoter (within 2 kb of a TSS) vs.
non-promoter binding sites. I found that each factor binds mainly to promoters that
are CpG islands (Figure 3.9C). The CpG island promoters bound by the 3 factors are essentially
the same, with a total of 10,723 CpG island promoters bound by the union of ZFX, ZFY, and
ZNF711 (Figure 3.9D), corresponding to 72% of the active CpG island promoters in 22Rv1 cells.
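The promoter vs. non-promoter annotation can be sketched as a nearest-TSS lookup using the +/- 2 kb window given in the Figure 3.9C legend. This is an illustration under simplifying assumptions, not the pipeline that was actually run: positions are taken from a single chromosome, strand is ignored, and the function names and coordinates are hypothetical.

```python
from bisect import bisect_left

def is_promoter_peak(summit, sorted_tss, window=2000):
    """True if the peak summit lies within `window` bp of a TSS (same chromosome)."""
    i = bisect_left(sorted_tss, summit)
    # Only the TSSs immediately to the left and right of the summit can be nearest.
    neighbors = sorted_tss[max(0, i - 1): i + 1]
    return any(abs(summit - tss) <= window for tss in neighbors)

def split_peaks(summits, tss_positions, window=2000):
    """Split peak summits into (promoter, non_promoter) lists."""
    tss = sorted(tss_positions)
    promoter = [s for s in summits if is_promoter_peak(s, tss, window)]
    non_promoter = [s for s in summits if not is_promoter_peak(s, tss, window)]
    return promoter, non_promoter
```

A real annotation would repeat this per chromosome and then intersect the promoter set with CpG island coordinates to obtain the CGI vs. non-CGI split.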
Figure 3.9 ZFX family members bind to essentially identical CpG island promoters
A) Browser tracks showing ZFX family member binding profiles in female HEK293T kidney cells and male
22Rv1 prostate cells. Also shown is a zoom in on a single peak located in the DOCK7 promoter region. B)
Shown is a heatmap illustrating the genome-wide correlation of ZFX family member binding patterns in
22Rv1 cells. C) Bar graph of genomic distributions of ZFX family member binding sites in 22Rv1 cells in
promoter and non-promoter regions (left) and bar graph showing the relative distribution of binding sites in
CpG island (CGI) promoters and non-CpG island promoters (right). D) Venn diagrams comparing the sets
of CpG island promoters bound by ZFX, ZFY, and ZNF711 in 22Rv1 cells. Andrew Perez performed the
ChIP-seq experiments and Weiya Ni performed the data analysis.
3.3.3 ZFX and ZNF711 have properties of a transcription activator when bound downstream
of the TSS
The binding patterns shown above demonstrate that ZFX family members bind to CpG island
promoters. To further investigate the binding pattern of these TFs, I performed a K-means
clustering based on the peak locations relative to the nearest TSS, identifying 4 groups of binding
sites for ZFX and ZNF711 (Figure 3.10). Interestingly, the strongest binding sites comprise ~1200
peaks (cluster 1) which are located downstream of the TSS. An additional larger set of ~4700
peaks (cluster 3) has a similar downstream location, but a slightly weaker binding profile. We also
identified ~1400 peaks (cluster 2) that are located upstream of the TSS and a set of weaker peaks
(cluster 4) that appear to have a Y-shaped pattern. Further analysis of the peaks in cluster 4
revealed peaks that are upstream (cluster 4.1), downstream (cluster 4.2), and over the TSS
(cluster 4.3), as well as a set of peaks that are very small and have no distinct binding pattern
(cluster 4.4). The upstream and downstream peaks in cluster 4 have a different location than the
peaks in clusters 1, 2, and 3. The peaks in clusters 1 and 3 are located downstream of, but quite near,
the TSS whereas the peaks in cluster 4.2 are much farther downstream (close to +2 kb). Similarly,
the peaks in cluster 2 are located upstream of, but near, the TSS whereas the peaks in cluster 4.1
are much farther upstream (close to -2 kb).
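For readers unfamiliar with the method, the sketch below shows how K-means can separate peaks by the shape of their signal around the TSS. It is a plain implementation of Lloyd's algorithm and not the software used for Figure 3.10; representing each peak as a short vector of binned signal, the deterministic farthest-point initialization, and all names are assumptions made for illustration.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(profiles, k, n_iter=50):
    """Lloyd's algorithm on signal profiles; returns (labels, centers)."""
    # Deterministic initialization: start from the first profile, then repeatedly
    # add the profile that is farthest from its closest existing center.
    centers = [list(profiles[0])]
    while len(centers) < k:
        farthest = max(profiles,
                       key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: each profile joins its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                  for p in profiles]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the previous center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy profiles: signal in four bins spanning -2 kb to +2 kb around the TSS.
toy_profiles = [
    [0, 9, 1, 0], [0, 8, 1, 0],    # signal just upstream of the TSS
    [0, 1, 9, 0], [0, 1, 10, 0],   # signal just downstream of the TSS
    [1, 1, 1, 1], [1, 1, 1, 2],    # weak signal with no distinct pattern
]
toy_labels, toy_centers = kmeans(toy_profiles, k=3)
```

In this toy case the upstream, downstream, and flat profiles fall into three separate clusters, which mirrors the way the weaker cluster 4 could be re-clustered into subsets with distinct positions.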
Figure 3.10 K-means clustering of ZFX and ZNF711 ChIP-seq peaks in HEK293T
ZFX and ZNF711 peak sets from HEK293T cells were clustered using K-means clustering, identifying 4
sets of peaks with distinct binding sites (left); cluster 4 (combination peaks) was subsequently re-clustered,
identifying 4 subsets (right). Tag density plots for each of the 4 different clusters are presented on top of
the heatmaps. Weiya Ni performed the data analysis.
Figure 3.10 panel labels
Cluster 1, n=1195; Cluster 2, n=1382; Cluster 3, n=4708; Cluster 4, n=5971 (re-clustered into clusters 4.1, 4.2, 4.3, and 4.4).
Heatmap columns: ZFX, ZNF711. Axes: Distance to TSS (bp), from -2000 to 2000; ChIP-seq Fragment Depth (per bp per peak).
Figure 3.11 panel labels
Rows: genes with promoters bound by the TF; genes downregulated in all 3 DKO clones; genes upregulated in all 3 DKO clones.
Pie chart categories: downstream TSS peaks (cluster 1); downstream TSS peaks (cluster 3); upstream TSS peaks (cluster 2);
combination peaks (cluster 4); no ZFX or ZNF711 binding.
The fact that most of the strongest ZFX and ZNF711 peaks are downstream of the TSS (clusters
1 and 3) raises several questions. For example, do these TFs regulate transcription from a
location downstream of the TSS or is regulation achieved only when the TFs are bound to the
minority of sites upstream of the TSS? Also, do the TFs function as direct activators or repressors
and, if so, does their activity differ depending on the binding location? To answer these questions,
I compared the binding profiles of ZFX and ZNF711 at all bound promoters and at promoters that
we identified as commonly downregulated or upregulated in all 3 DKO clones (Figure 3.11). The
tag density plots of all ZFX or ZNF711 peaks were quite broad and showed a large peak at +240 bp
and a shoulder at -240 bp relative to the TSS. Interestingly, the promoters that are downregulated upon loss of ZFX and
ZNF711 have very strong peaks downstream of the TSS with a frequency peak at +240 bp,
suggesting that ZFX and ZNF711 function as activators when bound downstream of the TSS on
that group of promoters. In contrast, promoters that are upregulated upon loss of ZFX and ZNF711
have very flat binding profiles, suggesting that genes that show increased expression in the DKO
cells are indirectly regulated by ZFX and ZNF711, perhaps because they are components of
affected signaling pathways. The pie charts show the percentage of deregulated genes that have
promoters bound by ZFX or ZNF711, broken into the different clusters; in total, 86% of the
downregulated genes are bound by ZFX or ZNF711 whereas only 24% of the upregulated genes
are bound by ZFX or ZNF711 (and most of these have peaks located in the weaker cluster 4).
Therefore, ZFX and ZNF711 appear to function mainly as transcriptional activators, but only when
they are bound downstream of the TSS.
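The pie-chart percentages summarized above come down to asking, for each deregulated gene, which peak cluster (if any) its promoter falls in. A minimal sketch of that tally follows; the function name, the gene-to-cluster dictionary, and the toy gene names are hypothetical, and unbound genes are reported under the key `None`.

```python
from collections import Counter

def cluster_breakdown(genes, gene_to_cluster):
    """Percentage of `genes` in each peak cluster; genes without a peak map to None."""
    genes = set(genes)
    if not genes:
        raise ValueError("gene set is empty")
    counts = Counter(gene_to_cluster.get(gene) for gene in genes)
    return {cluster: 100.0 * n / len(genes) for cluster, n in counts.items()}

# Toy example with made-up genes: g1 has a cluster 1 peak, g2 and g3 have cluster 3 peaks.
toy_breakdown = cluster_breakdown(
    ["g1", "g2", "g3", "g4"],
    {"g1": 1, "g2": 3, "g3": 3},
)
toy_percent_bound = 100.0 - toy_breakdown.get(None, 0.0)
```

Applied to the commonly downregulated and upregulated gene sets, the bound percentage from this kind of tally is what is being compared in the 86% vs. 24% contrast.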
Figure 3.11 ZFX and ZNF711 have properties of a transcription activator when bound downstream
of the TSS
Average signals of ZFX and ZNF711 ChIP-seq reads in wt HEK293T at all promoters bound by each TF
(top), promoters of genes with decreased expression in all three DKO clones (middle), and promoters of
genes with increased expression in all three DKO clones (bottom). Also shown, for both the downregulated
and the upregulated gene categories, is the percentage of genes whose promoters are bound by ZFX or
ZNF711 in peak categories 1-4, or not bound by ZFX or ZNF711. Weiya Ni performed the data analysis.
3.3.4 ZFX family members bind throughout the first several hundred base pairs of the
transcribed region of their target genes
Because the majority of the ZFX binding sites occur downstream of the TSS within the transcribed
region, we annotated the position of the downstream ZFX binding sites relative to gene structure
(Figure 3.12A). I found that most of these binding sites fall within the 5’ untranslated region (UTR),
the first coding exon, or the first intron, suggesting that there was not a preference for binding to
coding or non-coding regions downstream of the TSS. This was true for the set of all ZFX peaks
and for the set of ZFX peaks found at the genes that are commonly downregulated in all 3 of the
DKO cells. However, although we used the genomic location of the summit of the called ChIP-
seq peaks for the location analysis, the “genomic summit” of a ChIP-seq peak does not
necessarily correspond to the location of the precise binding site (e.g. due to the random nature
of the sonication of the chromatin). The precise identification of a peak summit may be
further complicated when analyzing ZFX and ZNF711. I note that the tag density plots shown in Figure
3.10 show a fairly broad binding profile for ZFX family members. Also, close inspection of single
peaks reveals a relatively wide peak at individual promoters (see Figure 3.9A for the single ChIP-
seq peak in the DOCK7 promoter). For comparison to another multi-finger ZNF, I calculated the
average peak width of ZFX (13 ZFs) and CTCF (11 ZFs) peaks and found that the ZFX peaks
(average width of 1816 bp) are quite a bit wider than the CTCF peaks (average width of 747 bp);
the ChIP-seq experiments for both TFs were performed in our lab using the same protocol. The
broad ZFX and ZNF711 peak widths suggested a need for a more precise delineation of the
binding sites. Therefore, I turned to ChIP-exo, a modification of ChIP-seq that improves the
resolution of binding sites (Rossi, Lai, and Pugh 2018). The use of ChIP-exo reduced the average
width of the ZNF711 binding sites from ~1800 to ~300 bp, providing a more distinct pattern of
upstream and downstream binding (Figure 3.12C). I compared the genomic locations of the wide
ZNF711 ChIP-seq binding sites to the narrow ChIP-exo peaks (in both cases, using the peak
summits obtained using the ENCODE pipeline). I also used peak information from ChExMix, a
program designed specifically to identify precise binding sites from ChIP-exo data (Figure 3.12B).
In all cases, the downstream ZNF711 binding sites are spread throughout the 5’UTR, first coding
exon, and first intron. These results suggest that the localization of ZNF711 is not related to the
classification of the transcribed region to which it binds.
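The location analysis above assigns each downstream binding site to the part of the transcript that contains its summit. A simplified sketch of that assignment is shown below; it assumes a single transcript on the plus strand with half-open coordinates, and the function names, feature coordinates, and summit positions are all hypothetical.

```python
from collections import Counter

def classify_site(position, features):
    """Return the label of the first feature containing `position`, else 'other'.

    features: list of (label, start, end) tuples with half-open [start, end) coordinates.
    """
    for label, start, end in features:
        if start <= position < end:
            return label
    return "other"

# Toy transcript model (made-up coordinates, TSS at position 1000).
toy_features = [
    ("5' UTR", 1000, 1200),
    ("first coding exon", 1200, 1400),
    ("first intron", 1400, 3000),
]
toy_summits = [1100, 1240, 2000, 5000]
toy_tally = Counter(classify_site(s, toy_features) for s in toy_summits)
```

The same lookup can be applied to ChIP-seq summits, ChIP-exo summits, or single-nucleotide ChExMix positions, which is what makes the three classifications directly comparable.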
Figure 3.12 Characterization of ZFX and ZNF711 binding sites
A) Classification of binding sites based on genomic locations of all ZFX peaks located downstream of the
TSS and ZFX downstream peaks at promoters of genes down-regulated in all 3 DKO clones. B)
Classification of ZNF711 downstream ChIP-seq peaks, downstream peaks identified using read2 of the
ChIP-exo dataset, and downstream peaks identified by the ChExMix program (+/-10 nt from the nt identified
as the binding site by the program). C) Tag density plots of all ZNF711 peaks from standard ChIP-seq and
from ChIP-exo. Andrew Perez prepared cross-linked samples for Peconic and Peconic performed the ChIP-
exo experiments. Weiya Ni performed the data analysis.
Previous studies have identified a ZNF711 motif (AGGCCTAG) using ChIP-seq data from a brain
tumor cell line SH-SY5Y (Kleine-Kohlbrecher et al. 2010). However, these studies used the entire
ChIP-seq peak width (which, as shown above, covers a very large area of the proximal promoter
region), making it difficult to be sure if the identified motif was involved in direct recruitment of
ZNF711 or if it was instead a motif commonly found in CpG island promoters. Also, the ChExMix
program, which is used to call motifs in ChIP-exo data, identified a smaller motif of GGCCT. This
shorter motif is similar to a short motif GGCC identified for mouse Zfx using ChIP-seq data (Chen
et al. 2008) and for ZFY using in vitro assays (Grants et al. 2010, Taylor-Harris, Swift, and
Ashworth 1995, Weirauch et al. 2014). To more precisely define the ZNF711 binding motif, I
performed motif analysis using the top 5000 ZNF711 peaks identified by ChIP-seq (using the
entire width of the MACS2 peaks, which is what is usually done in motif analyses), the top 5000
peaks identified by ChIP-exo (using the entire width of the MACS2 peaks) or the top 5000 peaks
identified by the ChExMix program (in this case, because ChExMix outputs a single nt summit for
each peak, the sequence was extended +/- 10 bp for motif analysis). I found that essentially all of
the top 5000 ZNF711 ChIP-seq peaks contain the known ZNF711 motif and the ChIP-exo GGCCT
motif (Figure 3.13). However, because the ZNF711 peaks are quite wide (~2 kb), they span a
large proportion of the promoter region. As shown in Figure 3.9, ZNF711 binds mainly to GC-rich
CpG island promoters. This suggests that these motifs may have been identified because they
are GC-rich and commonly found in CpG island promoters. In fact, when I analyzed 5000 2 kb
regions randomly selected from CpG island promoters, I found that nearly all 2 kb randomized promoter
regions also contain these same motifs. As noted above, ChIP-exo reduced the peak widths to
an average size of 200-300 nt. Motif analysis of the ChIP-exo peaks showed a reduction in the
number of peaks that contained the known ZNF711 motif or the shorter GGCCT motif, although
the peaks did have a higher percentage of both motifs than did randomly selected 200 bp regions
from CpG island promoter downstream regions. Finally, analyzing the sequences +/- 10 nt from
the ChExMix peak summits resulted in a further drop in the percentage of peaks that contain the
motifs. In this case, ~25% of the ChExMix peak locations contain the known ZNF711 motif and
~40% contain the smaller GGCCT motif. However, of note, randomized 20 bp regions contain these
motifs at a very low frequency (~5%). These results suggest that the ZNF711 binding sites are
enriched in both the known motif and the GGCCT motif, but the majority of sites do not contain
either motif. I also note that both the ZNF711 motif and the GGCCT motif are present throughout
the genome, albeit at a higher density in CpG islands (data not shown). Thus, the presence of a
motif may support binding but appears to be neither strictly required nor sufficient
for binding.
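The motif comparison above amounts to asking what fraction of a set of regions contains a given motif. A minimal sketch follows, assuming exact string matching on both strands and a +/- 10 nt window around a single-nucleotide summit. The function names and toy sequences are hypothetical, and the analysis itself may have used a different matching method.

```python
from typing import Iterable

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def contains_motif(seq: str, motif: str) -> bool:
    """True if the motif occurs on either strand (exact match, an assumption)."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq


def fraction_with_motif(seqs: Iterable[str], motif: str) -> float:
    """Fraction of sequences that contain the motif at least once."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    return sum(contains_motif(s, motif) for s in seqs) / len(seqs)


def summit_window(chrom_seq: str, summit: int, flank: int = 10) -> str:
    """Sequence from summit - flank to summit + flank (0-based summit)."""
    return chrom_seq[max(0, summit - flank): summit + flank + 1]


# Toy regions, not real peak sequences.
regions = ["TTTGGCCTAAA", "AAAGGCCAAA", "ATATATATAT"]
print(fraction_with_motif(regions, "GGCCT"))     # shorter ChIP-exo motif
print(fraction_with_motif(regions, "AGGCCTAG"))  # previously reported motif
```

Running the same function on width-matched random promoter regions gives the background frequency against which the peak sets are compared.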
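Figure 3.13 Motif analysis of ZNF711 ChIP-seq, ChIP-exo read2, and ChExMix peaks
Motif analysis using the top 5000 peaks identified from standard ZNF711 ChIP-seq (average width 1800
nt), ChIP-exo read2 (average width 300 nt), ChIP-exo by the ChExMix program (20 nt), and randomized
CpG island promoter regions (width 2 kb, 200 bp, and 20 nt). The peaks were searched for the known
ZNF711 motif and the 5 nt motif identified by ChIP-exo. Weiya Ni performed the data analysis.
Percentage of regions containing each motif, as recovered from the figure panel (known ZNF711 motif / ChIP-exo motif):
ZNF711 ChIP-seq: 98.66% / 99.92%
2 kb randomized CGI promoter regions: 90.08% / 98.08%
ZNF711 ChIP-exo read2: 56.86% / 73.88%
200 bp randomized CGI promoter regions: 23.14% / 53.92%
ZNF711 ChIP-exo ChExMix: 24.74% / 39.32%
20 bp randomized CGI promoter regions: 3.78% / 6.64%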
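As a worked example of the enrichment argument, the sketch below divides the motif frequency in each peak set by the frequency in randomized regions of matching width. The percentages are those shown in the Figure 3.13 panel. Pairing each peak set with the width-matched randomized set is my reading of the figure, and the dictionary keys and function name are illustrative.

```python
def fold_enrichment(observed_pct: float, background_pct: float) -> float:
    """Ratio of motif frequency in peaks to frequency in random regions."""
    if background_pct <= 0:
        raise ValueError("background percentage must be positive")
    return observed_pct / background_pct


# (peak set %, width-matched randomized CGI promoter %) per motif.
motif_pct = {
    "ChIP-seq vs 2 kb random": {"known": (98.66, 90.08), "GGCCT": (99.92, 98.08)},
    "ChIP-exo read2 vs 200 bp random": {"known": (56.86, 23.14), "GGCCT": (73.88, 53.92)},
    "ChExMix vs 20 bp random": {"known": (24.74, 3.78), "GGCCT": (39.32, 6.64)},
}

folds = {
    comparison: {motif: fold_enrichment(obs, bg) for motif, (obs, bg) in motifs.items()}
    for comparison, motifs in motif_pct.items()
}

for comparison, motifs in folds.items():
    print(comparison, {m: round(f, 2) for m, f in motifs.items()})
```

The ratios grow as the window narrows, which is the basis for saying the motifs are enriched at binding sites even though most sites lack them.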
Visual inspection of individual promoters revealed that not only did the ChIP-exo method result in
narrower peaks overall, but the broad ChIP-seq peaks were fractured into multiple peaks in the
ChIP-exo datasets (Figure 3.14). These results suggest that there are multiple ZNF711 binding
events for each promoter. Due to limitations of the ChIP assay, I cannot distinguish between
multiple ZNF711 molecules bound to a given promoter in the same cell or a single ZNF711
molecule binding at different locations in a given promoter in different cells. Perhaps the multiple
copies of GGCCT elements within CpG island promoters simply help to localize ZFX family
members to the region of open chromatin in a CpG island promoter, with the exact distance from
the TSS not being important for regulation as long as binding is downstream of the TSS. I note
that I performed similar ChIP-exo experiments using a ZFX antibody. Unfortunately, although the
overall patterns were the same as for ZNF711, the ZFX antibody did not perform as well in
ChIP-exo in either of two independent experiments (producing much smaller peaks overall, but in the
same locations), and therefore these data were not included in my analyses.
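The observation that one broad ChIP-seq peak splits into several ChIP-exo peaks can be quantified by counting narrow peaks that overlap each broad peak. The sketch below assumes peaks are (chromosome, start, end) tuples with half-open coordinates. The function names and coordinates are hypothetical and are not taken from the datasets described here.

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks are on the same chromosome and share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def subpeaks_per_broad_peak(broad: List[Peak], narrow: List[Peak]) -> Dict[Peak, int]:
    """Count the narrow peaks that fall within or overlap each broad peak."""
    return {b: sum(overlaps(b, n) for n in narrow) for b in broad}


# Toy peaks: one ~1.8 kb ChIP-seq peak split into three ~300 bp ChIP-exo peaks.
chip_seq = [("chr1", 1000, 2800), ("chr2", 500, 2300)]
chip_exo = [
    ("chr1", 1100, 1400),
    ("chr1", 1900, 2200),
    ("chr1", 2500, 2750),
    ("chr2", 3000, 3300),
]
counts = subpeaks_per_broad_peak(chip_seq, chip_exo)
print(counts)
```

The nested loop is quadratic, which is fine for a sketch. Genome-scale peak sets would normally be sorted or indexed first.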
Figure 3.14 Visualization of example peaks in ZNF711 ChIP-seq and ChIP-exo replicates
Zoom-in comparison of peaks from ZNF711 standard ChIP-seq replicates and ChIP-exo replicates. Weiya
Ni performed the data analysis.
3.3.5 The first 10 C2H2 zinc fingers of ZFX are dispensable for DNA binding and
transcriptional activity
As the next step, I wished to define which of the C2H2 ZFs were involved in recruitment of the
ZFX family to chromatin. As noted above, ZFX and ZFY have 13 C2H2 ZFs but ZNF711 has
amino acid changes that eliminate the C2H2 structure for ZF3 and ZF7 (Figure 3.3), suggesting
that perhaps ZFs closer to the C-terminus are used for DNA binding.
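To make the idea of a C2H2 zinc finger concrete, the sketch below scans a protein sequence with a commonly cited simplified consensus, C-X(2-4)-C-X(12)-H-X(3-5)-H. The regular expression, the function name, and the serine substitution are illustrative assumptions. The test fragment is the opening stretch of the ZFX sequence shown in Figure 3.15B, and the substitution is hypothetical, not the actual ZNF711 change.

```python
import re
from typing import List, Tuple

# Simplified C2H2 consensus; real finger annotation uses richer models.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_fingers(protein: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, sequence) for each match, 1-based inclusive."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in C2H2_PATTERN.finditer(protein.upper())
    ]


# Opening residues of the ZF-containing portion of ZFX (Figure 3.15B).
fragment = "TAIIIGPDGHPLTVYPCMICGKKFKSRGFLKRHMKNHPEH"
print(find_c2h2_fingers(fragment))

# Hypothetical Cys-to-Ser change at the second cysteine of the finger.
mutated = fragment.replace("CMIC", "CMIS")
print(find_c2h2_fingers(mutated))
```

Losing a single zinc-coordinating residue removes the match, which is the sense in which an amino acid change can eliminate the C2H2 structure of a finger.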
To test this hypothesis, ZFX protein constructs were created that contained the N-terminus and
only ZF1-8 or the N-terminus and only ZF9-13 (Figure 3.15A; 3.15B). Plasmids expressing
FLAG-tagged versions of wt and mutant ZFX proteins were transfected into HEK293T and/or
DKO cells, in vivo expression was confirmed by Western blot (Figure 3.15C), and ChIP-seq was
performed using a FLAG antibody.
Figure 3.15 Sequences and protein levels of ZFX ZF mutant constructs
A) Schematic of FLAG-tagged ZFX ZF mutant constructs. B) Peptide sequences used in the zinc finger
mutants. Only the zinc finger-containing portion of ZFX from amino acids 411 to 805 (UniProtKB P17010
designation) is shown for clarity. The positions of the 13 ZFs are indicated by numbered boxes. Colored
underlines indicate the extent of the regions present in the expression constructs: green, ZF1-8; blue,
ZF9-13; red, ZF9-11; purple, ZF11-13. The black triangle indicates the position of the last ZFX amino acid
present in the no zinc finger mutant and the red triangle shows the junction point between the N-terminal
half of the protein and the ZF regions in mutants ZF9-13, ZF9-11, and ZF11-13. C) Shown is an anti-ZFX
Western blot demonstrating the protein expression levels of endogenous ZFX in wt HEK293T cells and
ZFX ZF mutant constructs in DKO cells; p62 was used as a loading control. The doublets in the FLAG ZFX
lanes could be due to conformational differences, protein modifications, or degradation products. We note
that in this particular transfection, ZFX ZF9-11 is expressed at lower levels than the other mutants;
however, in other transfections it was expressed at levels as high as full-length ZFX. Charles Nicolet
cloned the ZF mutant ZFX constructs and Weiya Ni performed the Western blot.
Peptide sequence shown in panel B:
TAIIIGPDGH PLTVYPCMIC GKKFKSRGFL KRHMKNHPEH LAKKKYRCTD CDYTTNKKIS LHNHLESHKL TSKAEKAIEC
DECGKHFSHA GALFTHKMVH KEKGANKMHK CKFCEYETAE QGLLNRHLLA VHSKNFPHIC VECGKGFRHP SELKKHMRIH
TGEKPYQCQY CEYRSADSSN LKTHVKTKHS KEMPFKCDIC LLTFSDTKEV QQHALIHQES KTHQCLHCDH KSSNSSDLKR
HIISVHTKDY PHKCDMCDKG FHRPSELKKH VAAHKGKKMH QCRHCDFKIA DPFVLSRHIL SVHTKDLPFR CKRCRKGFRQ
QSELKKHMKT HSGRKVYQCE YCEYSTTDAS GFKRHVISIH TKDYPHRCEY CKKGFRRPSE KNQHIMRHHK EVGLP*
The FLAG-tagged wt ZFX produced a genomic binding pattern similar to the pattern obtained
using the endogenous ZFX antibody, as did the FLAG-tagged ZFX that lacked ZF1-8 but
contained ZF9-13 (Figure 3.16). In contrast, FLAG-tagged ZFX containing ZF1-8 but lacking ZF9-
13 did not bind to the genome, even though it was expressed at the same level as the FLAG-
tagged wt ZFX. These results suggested that ZF9-13 are involved in binding. Many C2H2 ZNFs,
such as the Sp1 and Kruppel-like family (KLF) members, use 3 ZFs to bind to DNA (Swamynathan
2010, Vihervaara, Duarte, and Lis 2018, Iuchi 2000). Therefore, additional mutant ZFX proteins
were created: one containing only ZF9-11, one containing only ZF11-13, and one that
lacked all ZF (no ZF). ChIP analysis revealed that ZF11-13 are sufficient for recruitment of ZFX
to promoter regions (Figure 3.16D). For comparison, a prediction of the DNA binding motifs for
the different ZFX mutant constructs was performed using the website tool “DNA-binding
Specificities of Cys2His2 Zinc Finger Proteins” (http://zf.princeton.edu/); the predicted motif for
ZFX ZF11-13 closely matches the motif identified using the ChIP-exo peaks (Figure 3.17).
Figure 3.16 In vivo binding of FLAG-tagged ZFX protein
A) Browser tracks showing genomic binding profiles of endogenous ZFX and FLAG-tagged wt ZFX and
ZFX mutants in HEK293T cells. B) Tag density plots of ChIP-seq peaks comparing endogenous ZFX and
ZFX ZF9-13 peak locations in HEK293T cells. C) Heatmaps showing ChIP-seq data from FLAG-tagged wt
ZFX and ZFX ZF9-13 centered on the genomic locations of the endogenous ZFX peaks. D) ChIP-qPCR
analysis of ZF mutant enrichment at promoter regions. wt HEK293T cells were transfected with FLAG-
tagged ZFX mutant constructs and ChIPs were performed using a FLAG antibody. Shown are qPCR results
using ChIP DNA demonstrating that ZFX ZF11-13 FLAG can bind to target promoters. LRRC41 and
STRADB promoters are bound by endogenous ZFX in wt HEK293T cells; ZNF180 was used as a negative
control. Shannon Schreiner performed FLAG-tagged ZFX ZF mutant ChIP-seq experiments and Weiya Ni
performed the data analysis.
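The tag density plots referred to in these captions report fragment depth per bp per peak as a function of distance from the TSS. A minimal sketch of that calculation is given below, assuming fragments on a single chromosome given as half-open (start, end) intervals and ignoring strand. The function name and toy numbers are hypothetical, not the plotting code used for the figures.

```python
from typing import List, Tuple


def tag_density(
    fragments: List[Tuple[int, int]], centers: List[int], flank: int
) -> List[float]:
    """Mean per-bp fragment depth from -flank to +flank around each center."""
    if not centers:
        raise ValueError("at least one center (e.g. a TSS) is required")
    width = 2 * flank + 1
    profile = [0.0] * width
    for center in centers:
        window_start = center - flank
        for start, end in fragments:
            # Clip each fragment to the window, then add 1 per covered base.
            lo = max(start, window_start)
            hi = min(end, center + flank + 1)
            for pos in range(lo, hi):
                profile[pos - window_start] += 1
    return [depth / len(centers) for depth in profile]


# Toy example: two overlapping fragments around one TSS at position 100.
profile = tag_density([(95, 105), (100, 110)], centers=[100], flank=5)
print(profile)
```

Averaging over many centers smooths the profile, so broad peaks appear as wide humps and narrow ChIP-exo peaks as sharper ones.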
ARHGAP25
BMP10
GKN2
GKN1
ANTXR1
MIR3126
GFPT1
Y_RNA
NFU1
AAK1
SNORA36C
ANXA4
AK125871
GMCL1
SNRNP27
MXD1
FW340055
ASPRV1
PCBP1-AS1
PCBP1
LOC100133985
C2orf42
TIA1
TRNA
PCYOX1
SNRPG
FAM136A
TRNA_Pseudo
TGFA
Mir_548
ADD2
FIGLA
CLEC4F
CD207
VAX2
ATP6V1B1
ANKRD53
TEX261
OR7E91P
TRNA_Pseudo
NAGK
MCEE
MPHOSPH10
PAIP2B
AF090102
ZNF638
U6
DYSF
CYP26B1
EXOC6B
U2
SNORD78
SPR
EMX1
SFXN5
SFXN5
RAB11FIP5
RAB11FIP5
DQ580250
NOTO
SMYD5
PRADC1
CCT7
FBXO41
AK125051
EGR4
U6
ALMS1
NAT8
ALMS1P
NAT8B
TPRKB
DUSP11
C2orf78
STAMBP
ACTG2
DGUOK
Mir_598
5S_rRNA
TET3
BOLA3
BOLA3-AS1
MOB1A
MTHFD2
SLC4A5
DCTN1
DCTN1-AS1
C2orf81
DQ588163
WDR54
RTKN
INO80B
WBP1
MOGS
MRPL53
CCDC142
TTC31
LBX2
LBX2-AS1
PCGF1
TLX2
TLX2
DQX1
AUP1
HTRA2
LOXL3
DOK1
M1AP
SEMA4F
HK2
TRNA_Glu
AK125960
POLE4
TACR1
MIR5000
EVA1A
MRPL19
GCFC2
LRRTM4
5S_rRNA
TRNA_Pseudo
SNAR-H
BC030125
BC024248
REG3G
REG1B
REG1A
REG1P
REG3A
CTNNA2
U6
MIR4264
LRRTM1
5S_rRNA
LOC1720
2p16.1
2p15
2p14
2p13.3
2p13.2
2p13.1
2p12
p11.2
Endogenous ZFX
25 -
wt ZFX FLAG
18 -
ZFX ZF 9-13 FLAG 28 -
ZFX ZF 1-8 FLAG
25 -
chr2:
UCSC RefSeq
CpG Islands
Basic
65,000,000 75,000,000
Mir_548
5S_rRNA
JB153659
JB153659
MIR4432
BCL11A
AL833181
Mir_562
PAPOLG
Metazoa_SRP
FLJ16341
REL
PUS10
5S_rRNA
PEX13
KIAA1841
LOC339803
C2orf74
AHSA2
USP34
SNORA70B
XPO1
FAM161A
CCT4
COMMD1
BC071802
B3GNT2
MIR5192
Metazoa_SRP
TMEM17
BC038779
EHBP1
Y_RNA
LOC100132215
OTX1
DBIL5P2
WDPCP
MDH1
UGP2
VPS54
PELI1
LINC00309
AL355732
LGALSL
AFTPH
MIR4434
LOC339807
SERTAD2
AK097952
LOC400958
SLC1A4
CEP68
RAB1A
ACTR2
SPRED2
FLJ16124
AK131224
MIR4778
MEIS1-AS3
MEIS1
BC040863
LOC644838
ETAA1
C1D
WDR92
PNO1
PPP3R1
CNRIP1
PLEK
FBXO48
APLF
PROKR1
ARHGAP25
BMP10
GKN2
GKN1
ANTXR1
MIR3126
GFPT1
Y_RNA
NFU1
AAK1
SNORA36C
ANXA4
AK125871
GMCL1
SNRNP27
MXD1
FW340055
ASPRV1
PCBP1-AS1
PCBP1
LOC100133985
C2orf42
TIA1
TRNA
PCYOX1
SNRPG
FAM136A
TRNA_Pseudo
TGFA
Mir_548
ADD2
FIGLA
CLEC4F
CD207
VAX2
ATP6V1B1
ANKRD53
TEX261
OR7E91P
TRNA_Pseudo
NAGK
MCEE
MPHOSPH10
PAIP2B
AF090102
ZNF638
U6
DYSF
CYP26B1
EXOC6B
U2
SNORD78
SPR
EMX1
SFXN5
SFXN5
RAB11FIP5
RAB11FIP5
DQ580250
NOTO
SMYD5
PRADC1
CCT7
FBXO41
AK125051
EGR4
U6
ALMS1
NAT8
ALMS1P
NAT8B
TPRKB
DUSP11
C2orf78
STAMBP
ACTG2
DGUOK
Mir_598
5S_rRNA
TET3
BOLA3
BOLA3-AS1
MOB1A
MTHFD2
SLC4A5
DCTN1
DCTN1-AS1
C2orf81
DQ588163
WDR54
RTKN
INO80B
WBP1
MOGS
MRPL53
CCDC142
TTC31
LBX2
LBX2-AS1
PCGF1
TLX2
TLX2
DQX1
AUP1
HTRA2
LOXL3
DOK1
M1AP
SEMA4F
HK2
TRNA_Glu
AK125960
POLE4
TACR1
MIR5000
EVA1A
MRPL19
GCFC2
LRRTM4
5S_rRNA
TRNA_Pseudo
SNAR-H
BC030125
BC024248
REG3G
REG1B
REG1A
REG1P
REG3A
CTNNA2
U6
MIR4264
LRRTM1
5S_rRNA
LOC1720
2p16.1
2p15
2p14
2p13.3
2p13.2
2p13.1
2p12
p11.2
Endogenous ZFX
25 -
wt ZFX FLAG
18 -
ZFX ZF 9-13 FLAG 28 -
ZFX ZF 1-8 FLAG
25 -
chr2:
UCSC RefSeq
CpG Islands
Basic
65,000,000 75,000,000
Mir_548
5S_rRNA
JB153659
JB153659
MIR4432
BCL11A
AL833181
Mir_562
PAPOLG
Metazoa_SRP
FLJ16341
REL
PUS10
5S_rRNA
PEX13
KIAA1841
LOC339803
C2orf74
AHSA2
USP34
SNORA70B
XPO1
FAM161A
CCT4
COMMD1
BC071802
B3GNT2
MIR5192
Metazoa_SRP
TMEM17
BC038779
EHBP1
Y_RNA
LOC100132215
OTX1
DBIL5P2
WDPCP
MDH1
UGP2
VPS54
PELI1
LINC00309
AL355732
LGALSL
AFTPH
MIR4434
LOC339807
SERTAD2
AK097952
LOC400958
SLC1A4
CEP68
RAB1A
ACTR2
SPRED2
FLJ16124
AK131224
MIR4778
MEIS1-AS3
MEIS1
BC040863
LOC644838
ETAA1
C1D
WDR92
PNO1
PPP3R1
CNRIP1
PLEK
FBXO48
APLF
PROKR1
ARHGAP25
BMP10
GKN2
GKN1
ANTXR1
MIR3126
GFPT1
Y_RNA
NFU1
AAK1
SNORA36C
ANXA4
AK125871
GMCL1
SNRNP27
MXD1
FW340055
ASPRV1
PCBP1-AS1
PCBP1
LOC100133985
C2orf42
TIA1
TRNA
PCYOX1
SNRPG
FAM136A
TRNA_Pseudo
TGFA
Mir_548
ADD2
FIGLA
CLEC4F
CD207
VAX2
ATP6V1B1
ANKRD53
TEX261
OR7E91P
TRNA_Pseudo
NAGK
MCEE
MPHOSPH10
PAIP2B
AF090102
ZNF638
U6
DYSF
CYP26B1
EXOC6B
U2
SNORD78
SPR
EMX1
SFXN5
SFXN5
RAB11FIP5
RAB11FIP5
DQ580250
NOTO
SMYD5
PRADC1
CCT7
FBXO41
AK125051
EGR4
U6
ALMS1
NAT8
ALMS1P
NAT8B
TPRKB
DUSP11
C2orf78
STAMBP
ACTG2
DGUOK
Mir_598
5S_rRNA
TET3
BOLA3
BOLA3-AS1
MOB1A
MTHFD2
SLC4A5
DCTN1
DCTN1-AS1
C2orf81
DQ588163
WDR54
RTKN
INO80B
WBP1
MOGS
MRPL53
CCDC142
TTC31
LBX2
LBX2-AS1
PCGF1
TLX2
TLX2
DQX1
AUP1
HTRA2
LOXL3
DOK1
M1AP
SEMA4F
HK2
TRNA_Glu
AK125960
POLE4
TACR1
MIR5000
EVA1A
MRPL19
GCFC2
LRRTM4
5S_rRNA
TRNA_Pseudo
SNAR-H
BC030125
BC024248
REG3G
REG1B
REG1A
REG1P
REG3A
CTNNA2
U6
MIR4264
LRRTM1
5S_rRNA
LOC1720
2p16.1
2p15
2p14
2p13.3
2p13.2
2p13.1
2p12
p11.2
Endogenous ZFX
25 -
wt ZFX FLAG
18 -
ZFX ZF 9-13 FLAG 28 -
ZFX ZF 1-8 FLAG
25 -
Scale
chr2:
UCSC RefSeq
CpG Islands
Basic
10 Mb
hg19
65,000,000 70,000,000 75,000,000 80,000,000
Mir_548
5S_rRNA
JB153659
JB153659
MIR4432
BCL11A
AL833181
Mir_562
PAPOLG
Metazoa_SRP
FLJ16341
REL
PUS10
5S_rRNA
PEX13
KIAA1841
LOC339803
C2orf74
AHSA2
USP34
SNORA70B
XPO1
FAM161A
CCT4
COMMD1
BC071802
B3GNT2
MIR5192
Metazoa_SRP
TMEM17
BC038779
EHBP1
Y_RNA
LOC100132215
OTX1
DBIL5P2
WDPCP
MDH1
UGP2
VPS54
PELI1
LINC00309
AL355732
LGALSL
AFTPH
MIR4434
LOC339807
SERTAD2
AK097952
LOC400958
SLC1A4
CEP68
RAB1A
ACTR2
SPRED2
FLJ16124
AK131224
MIR4778
MEIS1-AS3
MEIS1
BC040863
LOC644838
ETAA1
C1D
WDR92
PNO1
PPP3R1
CNRIP1
PLEK
FBXO48
APLF
PROKR1
ARHGAP25
BMP10
GKN2
GKN1
ANTXR1
MIR3126
GFPT1
Y_RNA
NFU1
AAK1
SNORA36C
ANXA4
AK125871
GMCL1
SNRNP27
MXD1
FW340055
ASPRV1
PCBP1-AS1
PCBP1
LOC100133985
C2orf42
TIA1
TRNA
PCYOX1
SNRPG
FAM136A
TRNA_Pseudo
TGFA
Mir_548
ADD2
FIGLA
CLEC4F
CD207
VAX2
ATP6V1B1
ANKRD53
TEX261
OR7E91P
TRNA_Pseudo
NAGK
MCEE
MPHOSPH10
PAIP2B
AF090102
ZNF638
U6
DYSF
CYP26B1
EXOC6B
U2
SNORD78
SPR
EMX1
SFXN5
SFXN5
RAB11FIP5
RAB11FIP5
DQ580250
NOTO
SMYD5
PRADC1
CCT7
FBXO41
AK125051
EGR4
U6
ALMS1
NAT8
ALMS1P
NAT8B
TPRKB
DUSP11
C2orf78
STAMBP
ACTG2
DGUOK
Mir_598
5S_rRNA
TET3
BOLA3
BOLA3-AS1
MOB1A
MTHFD2
SLC4A5
DCTN1
DCTN1-AS1
C2orf81
DQ588163
WDR54
RTKN
INO80B
WBP1
MOGS
MRPL53
CCDC142
TTC31
LBX2
LBX2-AS1
PCGF1
TLX2
TLX2
DQX1
AUP1
HTRA2
LOXL3
DOK1
M1AP
SEMA4F
HK2
TRNA_Glu
AK125960
POLE4
TACR1
MIR5000
EVA1A
MRPL19
GCFC2
LRRTM4
5S_rRNA
TRNA_Pseudo
SNAR-H
BC030125
BC024248
REG3G
REG1B
REG1A
REG1P
REG3A
CTNNA2
U6
MIR4264
LRRTM1
5S_rRNA
LOC1720
2p16.1
2p15
2p14
2p13.3
2p13.2
2p13.1
2p12
p11.2
ZFX ChIP-seq in HEK293T rep1
25.3 -
0 _
ZNF711 ChIP-seq in HEK293T rep1
21.78 -
0 _
ZNF711 ChIP-seq in HEK293T rep2
28.07 -
0 _
ZFX ChIP-seq in 22Rv1 rep1
10.94 -
0 _
ZFX ChIP-seq in 22Rv1 rep2
39.46 -
0 _
ZNF711 ChIP-seq in 22Rv1 rep1
7.34 -
0 _
ZNF711 ChIP-seq in 22Rv1 rep2
19.81 -
0 _
ZFY ChIP-seq in 22Rv1 rep1
8.15 -
0 _
ZFY ChIP-seq in 22Rv1 rep2
21.97 -
0 _
ZFX ChIP-seq in HEK293T rep2
25 -
0 _
wt ZFX FLAG ChIP-seq
18 -
0 _
ZFX ZF 9-13 FLAG ChIP-seq
28 -
0 _
ZFX ZF 1-8 FLAG ChIP-seq
25 -
0 _
ZNF711 ChIP-exo rep1
32.9 -
0 _
ZNF711 ChIP-exo rep2
18.52 -
0 _
ChIP-seq input in HEK293T rep1
20 -
0 _
ChIP-seq input in HEK293T rep2
20 -
0 _
EGV control RNAseq
1203.61 -
0 _
ZFX KO clone1 RNAseq
1124.74 -
0 _
ZFX KO clone 2 RNAseq
1109.04 -
0 _
ZNF711 KO clone1 RNAseq
1076.28 -
0 _
ZNF711 KO clone2 RNAseq
951.33 -
0 _
DKO clone1 RNA-seq
1105.83 -
0 _
DKO clone2 RNA-seq
884.87 -
0 _
DKO clone3 RNA-seq
708.93 -
0 _
ZNF711 ChIP-seq in 22Rv1 rep2
19.81 -
0 _
8.15 -
0 _
21.97 -
0 _
ZFX ChIP-seq in HEK293T rep2
25 -
0 _
18 -
0 _
28 -
0 _
25 -
0 _
32.9 -
0 _
18.52 -
0 _
ChIP-seq input in HEK293T rep1
20 -
Endogenous ZFX peaks
Distance to TSS
wt ZFX FLAG ZFX ZF9-13 FLAG
ZFX ZF9-13 FLAG
ZFX ZF1-8 FLAG
0.5
2.5
4.5
6.5
0
3
6
9
12
-2000 -1000 0 1000 2000
ChIP-seq Fragment Depth
(per bp per peak)
ChIP-seq Fragment Depth
(per bp per peak)
Distance from TSS (bp)
Endogenous ZFX
ZFX ZF9-13 FLAG
C
Ni. Figure S4
Figure S4. Sequences and protein levels of ZFX zinc finger (ZF) mutant constructs. A) Peptide sequences used in the zinc
finger mutants. Only the zinc finger-containing portion of ZFX from amino acids 411 to 805 (UniProtKB P17010 designation) is
shown for clarity. The positions of the 13 ZFs are indicated by numbered boxes. Colored underlines indicate the extent of the
regions present in the expression constructs: green, ZF1-8; blue, ZF9-13; red, ZF9-11; purple, ZF11-13. The black triangle
indicates the position of the last ZFX amino acid present in the no zinc finger mutant and the red triangle shows the junction
point between the N-terminal half of the protein and the ZF regions in mutants ZF9-13, ZF9-11, and ZF11-13. B) Shown is an
anti-ZFX Western blot demonstrating the protein expression levels of endogenous ZFX in wt HEK293T cells and ZFX ZF mutant
constructs in DKO cells; p62 was used as a loading control. The doublets in the FLAG ZFX lanes could be due to
conformational differences, protein modifications, or degradation products. We note that in this particular transfection,
ZFX ZF9-11 is expressed at lower levels than the other mutants; however, in other transfections it was expressed at levels as high as
full-length ZFX. C) ChIP-qPCR analysis of ZF mutant enrichment at promoter regions. wt HEK293T cells were transfected with
FLAG-tagged ZFX mutant constructs and ChIPs were performed using a FLAG antibody. Shown are qPCR results using ChIP
DNA demonstrating that ZFX ZF11-13 FLAG can bind to target promoters. LRRC41 and STRADB promoters are bound by
endogenous ZFX in wt HEK293T cells; ZNF180 was used as a negative control.
[Figure S4 image. Panel A: peptide sequence of the ZFX zinc finger region (labeled AA 420 and AA 805 in the image), with ZF1 through ZF13 marked:
TAIIIGPDGH PLTVYPCMIC GKKFKSRGFL KRHMKNHPEH LAKKKYRCTD CDYTTNKKIS LHNHLESHKL TSKAEKAIEC
DECGKHFSHA GALFTHKMVH KEKGANKMHK CKFCEYETAE QGLLNRHLLA VHSKNFPHIC VECGKGFRHP SELKKHMRIH
TGEKPYQCQY CEYRSADSSN LKTHVKTKHS KEMPFKCDIC LLTFSDTKEV QQHALIHQES KTHQCLHCDH KSSNSSDLKR
HIISVHTKDY PHKCDMCDKG FHRPSELKKH VAAHKGKKMH QCRHCDFKIA DPFVLSRHIL SVHTKDLPFR CKRCRKGFRQ
QSELKKHMKT HSGRKVYQCE YCEYSTTDAS GFKRHVISIH TKDYPHRCEY CKKGFRRPSE KNQHIMRHHK EVGLP*
Panel B: anti-ZFX and anti-p62 Western blots (size markers at 150, 100, and 75 kDa); lanes: no transfection, wt ZFX FLAG, ZFX ZF1-8 FLAG, ZFX ZF9-13 FLAG, ZFX ZF9-11 FLAG, ZFX ZF11-13 FLAG, ZFX no ZF FLAG.
Panel C: ChIP-qPCR at LRRC41 and STRADB (y-axis: Relative Normalized Expression, 0 to 15); labels: Input, ZFX ZF11-13 FLAG, ZFX ZF9-11 FLAG, wt HEK293T, DKO clone1.]
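The numbered boxes that mark the 13 zinc fingers in Figure S4 do not survive in text form. As a rough illustration of how approximate finger positions could be recovered, the sketch below scans the printed sequence with the textbook C2H2 consensus spacing (C-X2-4-C-X12-H-X3-5-H). The regular expression, the function name `find_c2h2_fingers`, and the use of 0-based offsets within the printed region are assumptions made for illustration; this is not the annotation procedure used to draw the figure.

```python
import re

# Sequence as printed in Figure S4, panel A; spaces and the terminal "*" are stripped below.
RAW_REGION = """
TAIIIGPDGH PLTVYPCMIC GKKFKSRGFL KRHMKNHPEH LAKKKYRCTD CDYTTNKKIS LHNHLESHKL TSKAEKAIEC
DECGKHFSHA GALFTHKMVH KEKGANKMHK CKFCEYETAE QGLLNRHLLA VHSKNFPHIC VECGKGFRHP SELKKHMRIH
TGEKPYQCQY CEYRSADSSN LKTHVKTKHS KEMPFKCDIC LLTFSDTKEV QQHALIHQES KTHQCLHCDH KSSNSSDLKR
HIISVHTKDY PHKCDMCDKG FHRPSELKKH VAAHKGKKMH QCRHCDFKIA DPFVLSRHIL SVHTKDLPFR CKRCRKGFRQ
QSELKKHMKT HSGRKVYQCE YCEYSTTDAS GFKRHVISIH TKDYPHRCEY CKKGFRRPSE KNQHIMRHHK EVGLP*
"""
ZFX_ZF_REGION = "".join(RAW_REGION.split()).rstrip("*")

# Assumed consensus: Cys, 2-4 residues, Cys, 12 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_fingers(region):
    """Return (offset, matched_sequence) for each non-overlapping C2H2-like match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(region)]


fingers = find_c2h2_fingers(ZFX_ZF_REGION)
for number, (offset, finger) in enumerate(fingers, start=1):
    # Offsets are relative to the printed region, not UniProt coordinates.
    print(f"ZF{number}\toffset {offset}\t{finger}")
```

On this sequence the pattern yields 13 candidate fingers, in line with the 13 ZFs stated in the caption. A consensus regular expression of this kind can miss atypical fingers or report spurious ones in other proteins, so it is only a quick cross-check.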
Figure 3.17 Motif predictions for ZFX zinc fingers
Different subsets of ZFX zinc fingers (ZF1-8, ZF9-11, ZF9-13, and ZF11-13) were analyzed using the web tool DNA-binding Specificities
of Cys2His2 Zinc Finger Proteins (http://zf.princeton.edu/); see also NAR, 42(3): 1497-1508; NAR, 43. Epub
2015 Jan 15. Each subset of zinc fingers was analyzed using 3 different methods (expanded linear SVM, RF regression on B1H, and polynomial SVM). The ZFX protein
sequence used was P17010 (www.uniprot.org/). For comparison, the motif identified using ZFX and
ZNF711 ChIP-exo datasets is GGCCT. Shannon Schreiner performed the motif analyses.
C2H2 ZFs have also been implicated in protein-protein interactions (Brayer and Segal 2008),
suggesting that perhaps some of the ZFs not involved in genomic recruitment may be involved in
transcriptional activity. To examine this possibility, we tested the ZFX constructs using a transient
transfection reporter assay. ZFX expression constructs were transfected into DKO cells and the
expression of endogenous genes was monitored by RT-qPCR after 24 hours, using triplicate
transfections for each data point. We examined expression of two genes (LONRF2 and CAPN2)
whose promoters are bound by both ZFX and ZNF711 in wt HEK293T cells and which show a
reduction in gene expression in all 3 DKO clones, of one gene (FOS) that is upregulated in the
DKO cells (a putative indirect target gene), and of one gene (HOXC4) which shows no expression
changes in the DKO cells. As shown in Figure 3.18, a subset of the transfected ZFX constructs strongly
upregulated only the two genes that are bound by ZFX in wt HEK293T cells and that show a
reduction in RNA levels upon loss of ZFX family members (the putative direct target genes).
This increased expression was observed in multiple, independent
experiments using two independently derived DKO clones. The indirect target gene (FOS) and
the control gene (HOXC4) were not affected upon transfection of the ZFX constructs. Notably,
the ability of the ZFX constructs to bind to the genome was correlated with increased expression
levels of the target genes. Because the FLAG-tagged ZFX ZF11-13 could increase expression of
endogenous target genes to the same extent as the FLAG-tagged wt ZFX construct, this suggests
that the first 10 C2H2 ZFs may be dispensable for genomic DNA binding and transcriptional
activity in HEK293T cells.
Figure 3.18 Functional analysis of the ZFX protein
Expression levels following transfection with different ZFX constructs (as analyzed by RT-qPCR) of two
genes (LONRF2 and CAPN2) whose promoters are bound by both ZFX and ZNF711 in wt HEK293T cells
and which show a reduction in gene expression in all 3 DKO clones, of one gene (FOS) that is upregulated
in all 3 DKO clones (a putative indirect target gene), and of one gene (HOXC4) that shows no expression
changes in the DKO cells. Expression data were normalized to the control (cells transfected with an
unrelated plasmid). Three independent experiments were performed using two different clonal populations
of DKO cells; data points represent results from triplicate wells and duplicate PCR readings. Error bars
indicate the pooled standard deviations of the means for the constructs and for the normalizing control.
Charles Nicolet and Weiya Ni performed the transfection experiments. Charles Nicolet performed the data
analysis.
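The caption describes normalizing expression to the control and combining the variability of the construct and of the normalizing control into the error bars, but it does not spell out the formula. The sketch below shows one common way to do this, assuming the fold change is the ratio of replicate means and that the relative standard deviations are combined in quadrature. The function name and the input values are hypothetical, and the exact "pooled standard deviation" calculation used for Figure 3.18 may differ.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change_with_error(sample, control):
    """Return (fold change, propagated SD) for sample vs. control replicates.

    Assumption: relative SDs of the two means are combined in quadrature,
    the usual first-order error propagation for a ratio.
    """
    mean_sample, mean_control = mean(sample), mean(control)
    fold = mean_sample / mean_control
    rel_sample = stdev(sample) / mean_sample
    rel_control = stdev(control) / mean_control
    return fold, fold * sqrt(rel_sample**2 + rel_control**2)


# Hypothetical relative expression values (arbitrary units), not data from the thesis.
control_wells = [1.0, 1.1, 0.9]
construct_wells = [9.0, 10.0, 11.0]

fold, sd = fold_change_with_error(construct_wells, control_wells)
print(f"fold change = {fold:.2f} +/- {sd:.2f}")
```

With this convention the control itself has a fold change of 1, and a construct that leaves a gene unaffected would be expected to sit near 1 within its error bar.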
To further examine this possibility, I transfected FLAG-tagged wt ZFX, FLAG-tagged ZFX ZF11-
13, or a control plasmid not expressing ZFX into DKO cells and compared global expression by
RNA-seq (Figure 3.19). Volcano plots of DEGs in DKO cells transfected with wt ZFX or ZFX
ZF11-13, as compared to the control cells, are shown in Figure 3.19A; see Table 3.3 (which is
Supplementary Table S3L & S3M in Ni et al. 2020) for all DEGs. I identified thousands of genes
that responded to the reintroduction of wt ZFX into the DKO cells. To further compare the cellular
response to a 24 hr exposure to wt ZFX vs. ZFX ZF11-13, I created a volcano plot comparing
these two datasets. I found that very few genes show differential responses to
wt ZFX (containing all 13 ZFs) vs. ZFX ZF11-13 (containing only the final 3 ZFs). To identify the
direct target genes in DKO cells that are responsive to the reintroduction of ZFX, I compared the
846 genes that are bound by ZFX and ZNF711 in wt HEK293T cells and show a decrease in
mRNA levels in all three DKO clones with the 2275 genes that show increased levels in DKO cells
transfected with either FLAG-tagged wt ZFX or ZFX ZF11-13 (Figure 3.19B). I identified 277
responding promoters (see Table 3.4, which is Supplementary Table S3N in Ni et al. 2020). The
binding patterns of transfected FLAG-tagged wt ZFX and FLAG-tagged ZFX ZF9-13 at the
responding promoters (identified in Figure 3.19B) recapitulate the endogenous ZFX binding
pattern, which has a peak 240 bp downstream of the TSS (Figure 3.19C). I found that 274 of the
277 responding promoters have the known ZFX family motif (Figure 3.19D).
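The overlap described above is a plain set intersection, and the reported counts can be checked for internal consistency. The sketch below rebuilds the two gene sets from placeholder identifiers (not real gene names) sized to match the numbers in the text and in the Figure 3.19B labels (277 shared, 569 and 1998 unique). The helper name `venn_counts` is an assumption for illustration.

```python
def venn_counts(set_a, set_b):
    """Return (only_a, shared, only_b) sizes for two collections of gene IDs."""
    a, b = set(set_a), set(set_b)
    return len(a - b), len(a & b), len(b - a)


# Placeholder identifiers only, sized to reproduce the reported counts.
shared_ids = {f"shared_{i}" for i in range(277)}
bound_and_down = shared_ids | {f"down_only_{i}" for i in range(569)}
up_after_rescue = shared_ids | {f"up_only_{i}" for i in range(1998)}

only_down, both, only_up = venn_counts(bound_and_down, up_after_rescue)
print(len(bound_and_down), len(up_after_rescue))  # 846 2275
print(only_down, both, only_up)                   # 569 277 1998

# Fraction of responding promoters reported to carry the ZFX family motif.
motif_fraction = 274 / 277
print(f"{motif_fraction:.1%}")                    # 98.9%
```

With real data, the two inputs would be the gene lists from Supplementary Tables S3L to S3N rather than generated identifiers.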
Figure 3.19 ZFX ZF11-13 has transcriptional activities very similar to those of wt ZFX
A) Volcano plots showing the DEGs identified via RNA-seq in comparisons of DKO cells 24 hr after
transfection with FLAG-tagged wt ZFX vs. a control plasmid, FLAG-tagged ZFX ZF11-13 vs. a control
plasmid, and FLAG-tagged wt ZFX vs. FLAG-tagged ZFX ZF11-13. B) Shown is a Venn diagram comparing
the 846 genes that are bound by ZFX and ZNF711 in wt HEK293T cells and show a decrease in mRNA
levels in all three DKO clones and the 2275 genes that show increased levels in DKO cells transfected with
either FLAG-tagged wt ZFX or ZFX ZF11-13. C) Shown is a tag density plot of ChIP-seq data for
endogenous ZFX, FLAG-tagged wt ZFX, ZFX ZF9-13, or ZFX ZF1-8 at the set of 277 responding promoters
identified in panel B. D) Motif coverage analysis of the 277 responding promoters. Weiya Ni performed the
data analysis.
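A motif coverage analysis of the kind summarized in panel D amounts to asking how many promoter sequences contain the motif on either strand. The sketch below is a simplified illustration that uses exact string matching with the AGGCCTAG motif quoted in Chapter 4. The function names and the three toy sequences are hypothetical, and the actual analysis may have used a position weight matrix or a dedicated motif tool instead.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(seq, motif):
    """True if the motif occurs on either strand (exact match only)."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq


def motif_coverage(promoters, motif="AGGCCTAG"):
    """Return (number of promoters with the motif, total promoters)."""
    with_motif = sum(has_motif(seq, motif) for seq in promoters.values())
    return with_motif, len(promoters)


# Toy sequences for illustration only, not real promoter sequences.
toy_promoters = {
    "promoter_fwd": "TTGCAGGCCTAGCAT",   # motif on the forward strand
    "promoter_rev": "ACCTAGGCCTTGA",     # motif on the reverse strand
    "promoter_none": "ATATCGCGATAT",     # no motif
}

covered, total = motif_coverage(toy_promoters)
print(f"{covered} of {total} toy promoters contain the motif")
```

Exact matching is stricter than a weight-matrix scan, so on real promoters it would tend to give a lower coverage than a tool that tolerates degenerate positions.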
3.4 Discussion
In this Chapter, I reported that the ZFX family members are ubiquitously expressed and have an
unusual binding location at essentially the same CpG island promoters in various cell lines.
CRISPR knockout of the ZFX family, followed by cell proliferation assays and RNA-seq, showed
that the ZFX family members are critical regulators of cell growth and affect expression of
thousands of genes in HEK293T cells. ZFX mutant functional assays demonstrated that the
combination of the ZFX N-terminal region and zinc fingers 11-13 compose a DNA-binding
transactivator.
However, there is a striking discrepancy between the number of ZFX family-bound genes that are
deregulated in DKO cells and the total number of genes with promoter regions bound by the
ZFX family in wt cells. There are two possible reasons for this difference: 1) there are
compensatory effects in the DKO clones that allow the cells to survive upon the loss of critical cell
proliferation regulators (further discussed in Chapter 5), and 2) there are unknown characteristics,
in addition to DNA binding, that define what constitutes a direct target promoter of the ZFX family
(further investigated in Chapter 4).
Chapter 4
Investigation of the mechanisms by which the ZFX family members regulate the
transcriptome
Most of the work described in this chapter has been published in Ni, Perez, Schreiner, Nicolet,
and Farnham. 2020. "Characterization of the ZFX family of transcription factors that bind
downstream of the start site of CpG island promoters." Nucleic Acids Res. doi:
10.1093/nar/gkaa384. IP-MS results are unpublished data.
4.1 Abstract
The aberrant expression of ZFX is associated with the tumorigenesis of many human cancers.
However, the published cancer-related studies mostly focused on phenotypic measurements of
cell growth, comparing tumors or cells having different levels of ZFX mRNA. The underlying
molecular mechanisms by which the ZFX family controls proliferation in the genome have not yet
been defined. One mechanism by which a TF can influence transcription is by maintaining a
region of hypomethylated chromatin in the core promoter. Another mechanism by which
site-specific DNA-binding transcription factors mediate transcription is by interacting with co-activators.
In this chapter, I found that changes in DNA methylation are not directly correlated with the
genes that respond to the loss of ZFX and ZNF711 in DKO cells. In addition, initial IP-MS studies
have identified Nuclear Receptor Binding SET Domain Protein 1 (NSD1) as the only interacting
partner of ZFX in HEK293T cells.
4.2 Introduction
In mammalian cells, CpG islands are generally unmethylated and provide a region of open
chromatin surrounding the TSS of housekeeping genes (Antequera 2003). The methylation of
cytosines in CpG island promoters can limit the binding of TFs if their binding motif contains a
CpG and can also recruit methyl-binding proteins and repressive protein complexes, leading to
gene silencing (Deaton and Bird 2011, Jones 2012). Thus, unmethylated CpG island promoters
are associated with expressed genes and methylated CpG island promoters are associated with
repressed genes. In Chapter 3, I reported that 72% of the active CpG island promoters in 22Rv1
cells are bound by a ZFX family member. Therefore, one possible mechanism by which the ZFX
family may regulate transcription is through keeping the CpG islands hypomethylated. If so, then
differences in DNA methylation at ZFX target promoters in wt HEK293T vs DKO cells might
provide insights into the mechanisms by which the ZFX family regulates transcription. In this
chapter, I have analyzed the patterns of DNA methylation of all promoters in wt and DKO cells.
Another mechanism that may distinguish responsive vs non-responsive ZFX-bound promoters is
the set of co-regulators bound to a promoter; ZFX family members may require cooperation with
another protein to activate transcription. For example, previous studies in our lab have shown that
TCF7L2 mediates expression of a different set of genes in different tissues through interaction
with tissue-specific TFs. Namely, TCF7L2 cooperates with HNF4alpha and FOXA2 in HepG2
cells and with GATA3 in MCF7 cells (Frietze et al. 2012). Therefore, IP-MS experiments were
performed in HEK293T cells to identify proteins that interact with ZFX.
4.3 Results
4.3.1 Does binding of ZFX and ZNF711 affect the DNA methylation level at target promoters?
As noted above, ZFX binds to CpG island promoters. In general, CpG island promoters tend to
have large hypomethylated regions of open chromatin in most cell types (and hence are active
and classified as housekeeping promoters). Changes in the levels of DNA methylation can have
major effects on promoter activity, with increased methylation leading to gene silencing (Jones
2012, Miranda and Jones 2007). It is possible that ZFX family members help create a region of
low DNA methylation in wt cells, in which case loss of these TFs would be expected to increase
DNA methylation at the bound promoters. Although the identified ZFX
DNA binding motif (AGGCCTAG) does not contain a methylatable CpG dinucleotide, there are
many CpGs within each bound promoter. Therefore, if, for example, ZFX recruited a DNA
demethylase, the demethylase could work in conjunction with ZFX to keep the CpGs
unmethylated and therefore provide a good environment for high levels of transcription. To
address the question as to whether binding of ZFX and ZNF711 affects the DNA methylation level
at target promoters, I examined DNA methylation levels using Illumina EPIC arrays in wt
HEK293T cells and the three DKO cell lines. As shown in Figure 4.1, I found that the loss of ZFX
and ZNF711 results in a slight hypomethylation at many promoters in all three DKO cell lines,
when compared to wt HEK293T cells. This observation is quite surprising because if ZFX functions
as a transcriptional activator by maintaining the open, unmethylated chromatin state at
promoter regions, I would have expected to see an increase in methylation of cytosines at CpG
island promoters. Thus, although there were modest changes in DNA methylation in the DKO
cells, the overall extent of promoter hypomethylation could not be specifically associated with the
binding of ZFX and/or ZNF711 or ZFX- or ZNF711-mediated gene regulation.
Figure 4.1 DNA methylation analysis of DKO cells
A) Shown is the DNA methylation level at all promoter probes on the EPIC array for each of the 3 DKO
clones. To more clearly visualize any hypermethylated or hypomethylated promoter probes, probes with
beta value differences in DKO vs. wt HEK293T < 0.2 were removed for panels B, C, and D. B) Shown is
the DNA methylation level at all promoters on the array that are bound vs. not bound by ZFX and ZNF711
in wt HEK293T. C) Shown is the DNA methylation level of promoters that are downregulated or upregulated
in DKO cells (binding by ZFX and ZNF711 was not taken into consideration). D) Shown is the DNA
methylation level at promoters of genes that are bound by ZFX and ZNF711 and downregulated in all 3
DKO clones vs. bound by ZFX and ZNF711 but show no change in expression in DKO cells. Note that
Panels B, C, and D show results using data from DKO clone1, but very similar patterns were seen for all
3 DKO clones. In summary, all 3 of the DKO clones show a modest hypomethylation at many promoters,
independent of whether the promoters are direct targets of ZFX or ZNF711. The USC Norris Molecular
Genomics Core Facility performed the DNA methylation EPIC array. Weiya Ni performed the data analysis.
[Figure panels: each plot compares DKO clone and wt HEK293T methylation levels, with both axes running from 0.0 to 1.0; regions of the plots are labeled hypomethylated and hypermethylated. Panel A (All genes): DKO clone1, DKO clone2, and DKO clone3 vs. wt HEK293T. Panel B: genes bound by ZFX and ZNF711; genes not bound by ZFX or ZNF711. Panel C: downregulated genes in all 3 DKO clones (n=1166); upregulated genes in all 3 DKO clones (n=2124). Panel D: downregulated genes bound by ZFX and ZNF711 in all 3 DKO clones (n=846); random genes bound by ZFX and ZNF711 with no change in expression in any of the DKO clones (n=850).]
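The probe filter described in the legend (removing probes whose beta value difference between DKO and wt HEK293T is below 0.2) can be sketched as follows. This is a minimal illustration in Python, not the analysis code used for this figure; the function name, the probe IDs, and the beta values are hypothetical, and treating the cutoff as an absolute difference is an assumption.

```python
def classify_probes(beta_wt, beta_dko, min_delta=0.2):
    """Keep probes whose |beta_DKO - beta_wt| is at least min_delta.

    beta_wt and beta_dko map probe IDs to beta values between 0 and 1.
    Returns a dict mapping each retained probe to its direction of change.
    """
    kept = {}
    for probe, wt_value in beta_wt.items():
        if probe not in beta_dko:
            continue  # probe not measured in the DKO sample
        delta = beta_dko[probe] - wt_value
        if abs(delta) < min_delta:
            continue  # small difference: removed before plotting
        kept[probe] = "hypermethylated" if delta > 0 else "hypomethylated"
    return kept


# Toy beta values, made up for illustration (not data from this study).
beta_wt = {"probe_a": 0.10, "probe_b": 0.80, "probe_c": 0.50}
beta_dko = {"probe_a": 0.15, "probe_b": 0.40, "probe_c": 0.75}
result = classify_probes(beta_wt, beta_dko)
```

In this toy input, probe_a changes by only 0.05 and is dropped, while probe_b and probe_c pass the cutoff and are labeled by the sign of the change.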
4.3.1 Identifying protein interaction partners of ZFX.
To identify interacting proteins of ZFX, two independent IP-MS experiments using a ZFX antibody
targeting endogenous ZFX proteins in wt HEK293T cells, three IP-MS experiments using a FLAG
antibody targeting FLAG-tagged wt ZFX, ZFX ZF1-8 and ZFX ZF9-13 proteins (in wt HEK293T
cells transfected with FLAG-tagged ZFX constructs), and two independent IgG controls were
performed. After quantitative analysis of the total spectra, enriched proteins are shown in the
scatterplot (Figure 4.2); combining all the experiments and control samples, NSD1 (the
highlighted green triangle) is the only interacting partner of ZFX that was consistently identified by
IP-MS.
Figure 4.2 Scatterplot of enriched ZFX-interacting proteins identified by IP-MS in HEK293T
Shown is the scatterplot (Fisher's Exact Test, p value < 0.05) of the ZFX IP-MS experiments in HEK293T
analyzed by Scaffold. Because no spectrum counts were mapped to NSD1 peptides in the control
samples, the enrichment of NSD1 protein in the MS cannot be calculated as a finite value (noted as a
significant outlier). Shannon Schreiner performed IP-MS experiments and data analysis.
[Figure panel: Volcano Plot (Fisher's Exact Test, p < 0.05, No Correction); x-axis: Log2(Fold Change (Target/Control)), -11 to 11; y-axis: -Log10 p-value, 0 to 21; legend: Significant, Nonsignificant, Significant Outlier, Nonsignificant Outlier, Significance Threshold, Zero Fold Change; NSD1 is labeled.]
NSD1 is a histone methyltransferase that dimethylates Lys-36 of histone H3, creating H3K36me2.
De novo NSD1 mutations have been shown to cause overgrowth syndromes with intellectual
disability, such as Sotos Syndrome and Weaver Syndrome (Kurotaki et al. 2002, Douglas et al.
2003). As noted in Chapter 2, the N terminus plus fingers 11-13 of ZFX constitutes a protein that
is sufficient for DNA binding and transactivation activity. One might hypothesize that NSD1
functions as a co-factor of ZFX by interacting with a domain that remains in the mutant ZFX ZF11-13
to mediate transactivation. However, the IP-MS experiments showed that NSD1 interacts with
endogenous ZFX, FLAG-tagged wt ZFX, and FLAG-tagged ZFX ZF1-8, but not with FLAG-tagged ZFX ZF9-13
(Figure 4.3). These results do not allow a straightforward conclusion. The wt ZFX and all ZFX
mutant proteins contain the N-terminal domain, suggesting that NSD1 might interact with the
acidic activation domain. However, ZFX ZF9-13 did not pull down NSD1. This suggests two
interpretations: 1) NSD1 interacts with fingers 1-8 or 2) NSD1 normally interacts with the N-terminal
domain but ZFX ZF9-13 has taken on a slightly different protein structure due to deletion
of the first 8 fingers and cannot interact with NSD1 even though it has the N-terminal domain.
However, because ZFX ZF9-13 can activate transcription upon transfection into DKO cells, these
results suggest that the interaction with NSD1 might not be essential in HEK293T cells. Clearly,
additional experiments are required to validate these initial IP-MS results.
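To illustrate why a protein with no spectra in the control samples has no finite fold change, the sketch below computes a one-sided Fisher's exact test and a log2 fold change from spectrum counts. It is a simplified illustration and not Scaffold's implementation; the function names, the one-sided form of the test, and the example counts are assumptions.

```python
import math


def fisher_exact_greater(target_hits, target_other, control_hits, control_other):
    """One-sided Fisher's exact test for enrichment in the target sample.

    The 2x2 table is [[target_hits, target_other], [control_hits, control_other]],
    where "hits" are spectra assigned to the protein of interest and "other"
    are all remaining spectra in that sample.
    """
    row_target = target_hits + target_other
    row_control = control_hits + control_other
    col_hits = target_hits + control_hits
    total = row_target + row_control
    denominator = math.comb(total, col_hits)
    upper = min(row_target, col_hits)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    numerator = sum(
        math.comb(row_target, k) * math.comb(row_control, col_hits - k)
        for k in range(target_hits, upper + 1)
    )
    return numerator / denominator


def log2_fold_change(target_hits, target_total, control_hits, control_total):
    """log2 of the ratio of spectrum fractions (target / control)."""
    if target_hits == 0 and control_hits == 0:
        return math.nan
    if control_hits == 0:
        return math.inf  # no control spectra: the ratio is not finite
    if target_hits == 0:
        return -math.inf
    return math.log2((target_hits * control_total) / (control_hits * target_total))


# Hypothetical counts: 30 protein spectra out of 5000 in the IP, 0 out of 5000 in IgG.
p_value = fisher_exact_greater(30, 4970, 0, 5000)
fold = log2_fold_change(30, 5000, 0, 5000)  # math.inf
```

With zero control spectra the test can still return a small p-value, but the fold change is infinite, which is consistent with such a protein being flagged as an outlier rather than plotted at a finite position.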
Figure 4.3 NSD1 protein interacts with ZFX protein
Total spectrum counts are used to compare protein levels in different samples in a semi-quantitative manner.
Shown are the total spectrum counts of NSD1 in five independent IP-MS experiments (targeting endogenous
ZFX rep1, endogenous ZFX rep2, FLAG-tagged wt ZFX, ZFX ZF1-8, and ZFX ZF9-13) and two IgG control
samples. Shannon Schreiner performed the IP-MS experiments and the data analysis.
[Figure panel: plot of Total Spectra (y-axis, 0.0 to 37.5) per BioSample (x-axis) for Histone-lysine N-methyltransferase, H3 lysine-36 and H4 lysine-20 specific (OS=Homo sapiens, OX=9606, GN=NSD1, PE=1, SV=1). Samples, grouped as Control and Target: Ctrl 1, Ctrl 2, ZFX ZF9-13, ZFX 1, ZFX 2, ZFX FLAG, ZFX ZF1-8.]
4.3.2 Are ZFX and ZNF711 involved in transcriptional elongation?
NSD1 has been shown to be a dimethylase for H3K36 and this modification serves as a substrate
for trimethylation of histone 3 lysine 36 (H3K36me3) (Lucio-Eterovic et al. 2010). H3K36me3 has
been linked to transcription elongation, and its occupancy level throughout gene bodies is
positively correlated with gene activity (Wen et al. 2014). I reported in Chapter 3 that ZFX and
ZNF711 preferentially bind downstream of the TSS in transcribed regions in all tested cell lines.
This suggests that the ZFX family members might be involved in transcriptional elongation (Kwak
and Lis 2013). To test whether ZFX family members influence H3K36me3 levels at target genes,
I performed H3K36me3 ChIP-seq experiments in wt HEK293T cells and in the 3 DKO clones
(Figure 4.4). Strikingly, I found that genes bound by ZFX and ZNF711 have much higher levels
of H3K36me3 in wt HEK293T cells than do genes not bound by these TFs (Panel A; note the
different scale of the Y axis in the left vs right panels). Interestingly, I found that the levels of
H3K36me3 are reduced in the DKO cells at all genes, not just at those bound by ZFX and ZNF711.
Also, in the DKO cells, reduction of H3K36me3 occurs not only at genes that are downregulated
(as compared to their expression in wt cells) but also at genes that show no changes in expression
in wt vs DKO cells. Therefore, it seems that ZFX family members may be important in recruiting
an H3K36me3 histone methyltransferase to transcribed regions of most active genes in HEK293T
cells, but changes in the levels of this mark are not correlated with changes in gene expression.
Figure 4.4 H3K36me3 analyses in wt HEK293T and DKO clones
A) Metagene plots showing the tag density of H3K36me3 across genes that are bound by ZFX and ZNF711
in wt HEK293T (left) and random genes that are not bound by either ZFX or ZNF711 (right) in wt HEK293T
and DKO clone1. B) Metagene plots in wt HEK293T and DKO clone1, DKO clone2, or DKO clone3 showing
the tag density of H3K36me3 across genes whose promoters are bound by ZFX and ZNF711 and
downregulated in all 3 DKO clones (left) and random genes whose promoters are bound by ZFX and ZNF711
but not downregulated in any of the DKO clones (right). Weiya Ni performed ChIP-seq experiments and
data analysis.
[Figure panels: x-axis runs from -2 Kb through TSS and TES to +2 Kb; y-axis: ChIP-seq Fragment Depth (per bp per peak); each panel overlays wt HEK293T and one DKO clone. Panel A: all promoters bound by ZFX and ZNF711 in wt HEK293T (n=11450); random promoters not bound by ZFX or ZNF711 in wt HEK293T (n=11450). Panel B: promoters bound by ZFX and ZNF711 and downregulated in all 3 DKO clones (n=846); random promoters bound by ZFX and ZNF711 but not downregulated in any of the DKO clones (n=850).]
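The idea behind a metagene plot, in which genes of different lengths are rescaled to a common TSS-to-TES axis before averaging, can be sketched as below. This is a simplified illustration that handles only the gene body (the unscaled 2 Kb flanks are omitted) and is not the tool or code used to produce Figure 4.4; the function names, bin count, and coverage values are hypothetical.

```python
def rescale_to_bins(coverage, n_bins):
    """Average a per-base coverage list into n_bins equal-width bins."""
    length = len(coverage)
    binned = []
    for i in range(n_bins):
        start = i * length // n_bins
        # Guarantee at least one position per bin for very short genes.
        end = max(start + 1, (i + 1) * length // n_bins)
        window = coverage[start:end]
        binned.append(sum(window) / len(window))
    return binned


def metagene_profile(gene_coverages, n_bins=100):
    """Mean binned coverage across genes, from TSS (bin 0) to TES (last bin)."""
    binned = [rescale_to_bins(cov, n_bins) for cov in gene_coverages]
    return [sum(column) / len(binned) for column in zip(*binned)]


# Two toy genes of different lengths (coverage values are made up).
short_gene = [1, 1, 3, 3]
long_gene = [2, 2, 2, 2, 4, 4, 4, 4]
profile = metagene_profile([short_gene, long_gene], n_bins=2)
```

Computing one such profile for wt cells and one for each DKO clone over the same gene set gives curves that can be overlaid, which is the comparison the legend describes.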
4.4 Discussion
In this chapter, I have investigated two mechanisms by which the ZFX family could regulate
transcription. First, I examined whether loss of ZFX family members resulted in an increase in
DNA methylation at target promoters (which would create a region of closed chromatin and reduce
promoter activity). Second, I investigated whether levels of a histone methylation mark linked to
transcriptional elongation are altered in response to loss of ZFX family members. I did not identify
evidence suggesting that loss of ZFX family members resulted in hypermethylation of promoters.
However, I did find evidence suggesting a link between ZFX and methylation of histone H3 at
lysine 36.
Initial IP-MS experiments identified NSD1 as the only interacting protein of either endogenous
ZFX or overexpressed FLAG-tagged ZFX proteins. NSD1 dimethylates H3K36, which is required
for production of H3K36me3 by SETD2. I found that in DKO cells, the overall H3K36me3 level is
reduced at all genes regardless of the ZFX/ZNF711 binding status in wt cells or expression level
changes in DKO cells. These results suggest that the general H3K36 methyltransferase
mechanisms might be disrupted in DKO cells. NSD1, NSD2, NSD3, SETD2, and ASH1L have all
been implicated in the methylation of H3K36 (Chen et al. 2020, Zhuang et al. 2018, Han et al.
2018, De and Müller 2019). However, there is no change in expression of these
methyltransferases in DKO vs wt HEK293T at the mRNA level. Thus, the reduction in H3K36me3
in DKO cells is likely not due to changes in the mRNA levels of these methyltransferases. It is
possible that loss of ZFX has resulted in loss of NSD1 binding at all promoters. We have
attempted to test this possibility by NSD1 ChIP-seq but have not yet found an NSD1 antibody that
works in ChIP. We are currently creating a FLAG-tagged NSD1 protein for use in ChIP-seq.
Interestingly, analyses in which I plotted levels of H3K36me3 over gene bodies in wt and DKO
cells show that DKO cells are lacking the “dip” at the 3’ end of genes, as compared to wt HEK293T
cells (Figure 4.4). Although there are numerous examples in the literature of the 3’ end dip in
H3K36me3 levels (Meers et al. 2017, Lu et al. 2015, Ebmeier et al. 2017, Vandenbon et al. 2018),
the mechanisms responsible for this dip are not known. The dip in H3K36me3 at the 3’ end of
genes could be due to the presence of a nucleosome-free region, or the nucleosome could be
present but simply not trimethylated on H3K36. However, using published MNase-seq, NOMe-seq,
and ATAC-seq datasets, we cannot find strong evidence for a nucleosome-free region at the
3’ end of genes (Suhn Rhie, unpublished analyses). Alternatively, the dip could be due to the presence
of an unmodified K36, or another modification at or near K36 may prevent trimethylation of this amino
acid. It is difficult to know a priori what other mark may take the place of H3K36me3 at the 3’ end
of genes.
In summary, although ZFX family members bind downstream of the TSS (suggesting an
involvement in elongation) and NSD1, a protein involved in elongation, was identified as an
interacting protein, it is not yet clear how this histone methylase contributes to ZFX-mediated gene
regulation. Our initial IP-MS experiments did not identify NSD1 when FLAG-ZFX-ZF9-13 was
used for IP, even though this construct has transcriptional activation ability in a transfection
experiment. Also, all genes, regardless of whether they have ZFX bound to their promoter,
showed a defect in H3K36me3 in DKO cells. Although this is an intriguing result, it is hard to
understand why promoters not bound by ZFX showed this same effect.
It is possible that we did not identify the key interacting protein in our IP-MS experiments. There
are several possible reasons for this: 1) antibody quality, 2) IP-MS protocol conditions, 3) low
endogenous protein levels of ZFX family members, or 4) weak protein-protein interactions. We
have validated the specificity of the ZFX antibody according to ENCODE standards and tested
the antibody affinity via successful ChIP-seq and Western blot assays (Rhie et al. 2018). We have
also optimized the IP-MS protocol multiple times for these experiments and the same protocol
has been used in other projects in the lab. The fact that the overexpressed FLAG-tagged wt ZFX
and endogenous ZFX IP-MS experiments have identical results suggests that the endogenous
ZFX level should not be an issue in pulling down interacting partners via IP. Therefore, perhaps
a standard IP cannot capture the weak protein-protein interactions between ZFX and its
interacting partners because the interactions cannot be preserved during the cell lysis and
purification steps. Future investigation of ZFX-interacting partners is discussed in Chapter 6.
Chapter 5
Exploration of differentiating traits between responsive target genes and non-responsive
target genes of ZFX
5.1 Abstract
∼10,000 CpG island promoters are bound by ZFX family members in a given cell type. However,
less than half of the bound promoters show responsiveness to loss of ZFX and ZNF711 in the
knockout HEK293T cells or after knockdown of all three family members in 22Rv1 prostate cancer
cells. Thus, identifying traits that distinguish a responsive from a non-responsive target gene is
critical for understanding the molecular mechanisms underlying the functions of the ZFX family. In
this chapter, I investigate whether promoter structure, promoter activity level, or epigenetic
modifications can distinguish promoters that are bound by ZFX family members in wt cells and
show decreased expression in the DKO cells from those that are bound by ZFX family members
in wt cells and do not show decreased expression in the DKO cells.
5.2 Introduction
In Chapter 3, I reported that the ZFX family members bind to ~10,000 promoter regions,
preferentially to CGI promoters, in all tested cell lines. This observation suggests that these TFs
might play essential roles in regulation of the transcriptome, especially in the transcription of
housekeeping genes (Moran, Arribas, and Esteller 2016). In the ZFX and/or ZNF711 knockout
cells, I have demonstrated that the loss of these TFs potently inhibits cell proliferation, disrupts
cell cycle pathways, and causes considerable transcriptome changes. Although thousands of
genes changed expression levels in DKO cells, only ~30% of the genes with promoter regions
bound by the ZFX family in wt cells responded to the deletion of ZFX family members. The
discrepancy between the numbers of promoters bound by these TFs and the number of promoters
having changed activities upon the loss of the TFs suggests that there might be post-DNA binding
mechanisms that distinguish true target promoters (bound and regulated) from non-target
promoters (bound but not regulated). To attempt to define a true ZFX target promoter, I compared
the characteristics of two distinct groups of genes with promoters bound by the TFs: 1) genes that
are bound by ZFX and downregulated in all three DKO clones and 2) genes that are bound by
ZFX but showed no expression changes in any of the DKO clones. I investigated epigenetic marks,
TF binding, ChIP-seq and ChIP-exo peak size or height, and expression levels in wt HEK293T.
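The two gene groups defined above can be expressed as set operations, as in the following sketch. It is an illustration only and not the code used in this work; the function name and the gene identifiers are hypothetical, and it assumes that lists of bound genes and of differentially expressed genes per DKO clone are already available.

```python
def classify_bound_genes(bound, down_by_clone, changed_by_clone):
    """Split bound genes into responsive and non-responsive groups.

    bound: set of genes whose promoters are bound by the TFs in wt cells.
    down_by_clone: list of sets, genes downregulated in each DKO clone.
    changed_by_clone: list of sets, genes with any expression change in each clone.
    """
    down_in_all = set.intersection(*down_by_clone)
    changed_in_any = set.union(*changed_by_clone)
    responsive = bound & down_in_all         # bound and down in every clone
    non_responsive = bound - changed_in_any  # bound, unchanged in every clone
    return responsive, non_responsive


# Toy example with made-up gene names.
bound = {"geneA", "geneB", "geneC", "geneD"}
down = [{"geneA", "geneB"}, {"geneA", "geneE"}, {"geneA", "geneB"}]
changed = [{"geneA", "geneB", "geneF"}, {"geneA", "geneE"}, {"geneA", "geneB"}]
responsive, non_responsive = classify_bound_genes(bound, down, changed)
```

Genes such as geneB in this toy input, which change in some clones but not all, fall into neither group, which keeps the two comparison sets cleanly separated.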
5.3 Results
5.3.1 Do responsive promoters have different binding patterns of ZFX family members than
non-responsive promoters?
To attempt to distinguish promoters that are bound and responsive to ZFX family members from
promoters that are bound but not responsive, I first looked for differences in ChIP-seq peak
heights. I used the set of 846 bound and responsive promoters and 850 randomly chosen bound
and non-responsive promoters. As shown in Figure 5.1, the ZFX and ZNF711 peak heights in
ChIP-seq and ChIP-exo are slightly higher and sharper in promoter regions of bound and
downregulated genes than in bound but not deregulated genes. However, the peaks at this second
set of promoters are also quite robust and believable (Figure 5.2). As described in Chapter 3,
ChIP-exo experiments were performed for ZNF711. ZNF711 ChIP-seq peaks are fractured into
the same number of ChIP-exo peaks in the two groups of promoters. Thus, peak height or peak
number cannot distinguish responsive from non-responsive promoters (Figure 5.3).
Figure 5.1 Characterization of putative direct targets in wt HEK293T
Comparison of tag density plots of A) ZFX and ZNF711 ChIP-seq, B) ZNF711 ChIP-exo, and C) H3K4me3
and H3K27ac ChIP-seq in wt HEK293T in promoter regions of genes bound by ZFX and ZNF711 with
reduced expression levels (n=846) and with no expression change (n=850) in DKO cells. Lijun Yao
performed ZFX ChIP-seq, Zhifei Luo performed ZNF711 ChIP-seq, Peconic performed ZNF711 ChIP-exo,
Weiya Ni performed H3K4me3 and H3K27ac ChIP-seq. Weiya Ni performed the data analysis.
[Figure panels: x-axis: Distance from TSS (bp), -2000 to 2000; y-axis: ChIP-seq Fragment Depth (per bp per peak). Each panel compares promoters that are bound and downregulated in all 3 DKO clones (n=846) with promoters that are bound but whose expression is not changed in any of the DKO clones (n=850). Panel A: peak height of ZFX and ZNF711 ChIP-seq in wt HEK293T. Panel B: ChExMix peaks in ZNF711 ChIP-exo under ZNF711 ChIP-seq peaks. Panel C: H3K4me3 and H3K27ac ChIP-seq peaks in wt HEK293T.]
Figure 5.2 ZFX and ZNF711 peaks at promoters of genes with no expression change in DKO
Shown are browser tracks of the two replicates of ZFX and ZNF711 ChIP-seq of three example genes with
robust and believable TF binding but no expression change in any of the DKO clones. Weiya Ni performed
the data analysis.
Figure 5.3 ZNF711 ChIP-seq and ChIP-exo peaks in bound responsive vs non-responsive promoters
Shown are browser tracks of the two replicates of ZNF711 ChIP-seq and ChIP-exo of six example genes
from 846 bound and responsive promoters and 850 randomly chosen bound and non-responsive promoters.
Weiya Ni performed the data analysis.
5.3.2 Can promoter-associated histone modifications identify ZFX-responsive promoters?
I next performed ChIP-seq using antibodies to promoter-related histone modifications in wt
HEK293T cells. No distinguishable difference was observed in the active promoter marks H3K4me3
and histone 3 lysine 27 acetylation (H3K27ac) in wt HEK293T between the genes that are bound by ZFX
and ZNF711 in wt cells and have reduced expression levels in DKO cells and those that are bound
in wt cells but have no expression change in DKO cells (Figure 5.4). Although I could not detect any
significant differences in promoter histone modifications at the “bound and downregulated” vs
“bound but not deregulated” promoters in wt cells, it was possible that differences would be observed
in the DKO cells. As shown in Figure 5.4, I found that “bound and downregulated” direct targets
have lower H3K4me3 and H3K27ac ChIP-seq peak heights in promoter regions in DKO cells,
as compared to wt HEK293T cells. However, the same differences in these marks were observed
in promoters of “bound but not deregulated” genes. Another possible differentiating trait could be
different expression levels of these two groups of genes in wt HEK293T cells. However, there is
no statistically significant difference between the two groups; the mean expression levels,
Log2(counts per million reads mapped), of the two sets of genes are 5.27 and 5.47, respectively.
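For readers unfamiliar with the expression unit used here, the sketch below shows how a Log2(counts per million reads mapped) value and a group mean could be computed from raw read counts. It is a minimal illustration, not the pipeline used in this work; the function names and the example counts are hypothetical, and packages that compute this unit often add a small prior count, which is omitted here.

```python
import math


def log2_cpm(count, library_size):
    """log2 of counts per million mapped reads for one gene in one sample."""
    if count <= 0:
        # Zero counts need a pseudocount, which this sketch does not apply.
        raise ValueError("count must be positive")
    return math.log2(count * 1_000_000 / library_size)


def mean_log2_cpm(counts, library_size):
    """Mean log2 CPM over a group of genes from the same sample."""
    values = [log2_cpm(c, library_size) for c in counts]
    return sum(values) / len(values)


# Hypothetical read counts for a small gene group in a library of 20 million reads.
group_counts = [640, 1280, 320]
group_mean = mean_log2_cpm(group_counts, 20_000_000)
```

Comparing such group means, together with a statistical test on the per-gene values, is one way to ask whether two gene sets differ in baseline expression.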
Figure 5.4 H3K4me3 and H3K27ac marks in DKO cells compared to wt HEK293T
Comparison of density plots of A) H3K4me3 and B) H3K27ac ChIP-seq in wt HEK293T and DKO cells
(clone1) in promoters of downregulated genes bound by ZFX and ZNF711 in DKO cells (n=846) and
random genes bound by ZFX and ZNF711 but with no change in expression levels (n=850). Similar results
are observed in the other two DKO clones. Weiya Ni performed ChIP-seq experiments and data analysis.
[Figure panels: x-axis: Distance from TSS (bp), -2000 to 2000; y-axis: ChIP-seq Fragment Depth (per bp per peak), 0 to 40; each panel overlays wt HEK293T and DKO clone1. Panel A: H3K4me3 at promoters that are bound and downregulated in all 3 DKO clones (n=846) and at promoters that are bound but whose expression is not changed in any of the DKO clones (n=850). Panel B: H3K27ac at the same two promoter sets.]
5.3.3 Do cellular compensatory changes obscure the identification of direct target genes
in the DKO cells?
Through the cell proliferation assays, cell cycle analysis, and RNA-seq analysis in DKO cells, I
reported that the ZFX family plays an essential role in regulating critical cell functions required for
cell growth. However, these DKO clones were very difficult to obtain, as compared to the single
KO clones, suggesting that a compensatory mechanism might have been activated
that allowed expression of ZFX and ZNF711 target genes even when these TFs were deleted.
One way to test this is to examine the effects of a transient loss of these TFs. To compare the
permanent loss of ZFX and ZNF711 in DKO cells with a transient reduction of the two TFs in a
knockdown system, we first attempted to knock down ZFX and/or ZNF711 using siRNAs.
However, we found that 35-47% of the mRNA of each TF remained in the single knockdowns and
44-58% remained in the double knockdown (see Table 5.1).
To accomplish a more robust transient reduction in ZFX and ZNF711, our lab developed a new
knockdown approach combining siRNA with a CRISPR/dCas9 (the deactivated form of Cas9)
toggle switch (siTOG). The CRISPR/dCas9 toggle switch system mediates the expression of
gene(s) by directing a dCas9-effector domain fusion to the target promoter using guide RNAs.
The activating or repressing functions of the fused effector domains control the expression of
target genes at the epigenetic level. For instance, the KRAB domain in a dCas9-KRAB fusion
causes transcriptional repression of target promoters by recruiting KAP1 via protein-protein
interactions, which further interacts with other TFs and chromatin-modifying enzymes, such as
SET domain, bifurcated 1 (SETDB1), a histone 3 lysine 9 trimethylation (H3K9me3)-specific
histone methyltransferase (Iyengar and Farnham 2011, Lupo et al. 2013). Due to the loss of
nuclease activity in dCas9, the siTOG system alters gene activities at the epigenetic level
without editing the genome. As shown in Table 5.1, by using a combination of siRNAs and dCas9-
KRAB, we could more effectively reduce the levels of both ZFX and ZNF711.
Table 5.1 Levels of remaining ZFX and/or ZNF711 mRNAs in knockdown experiments
Knockdown experiments were conducted in triplicate. mRNA levels were measured via RNA-seq. Wei Zhu
performed the experiments.
Remaining mRNA    Single TF knockdown    Double TF knockdown
                  ZFX      ZNF711        ZFX      ZNF711
siRNA             35%      47%           44%      58%
siTOG             13%      6%            22%      12%
As shown in the volcano plots in Figure 5.5, there are very few DEGs in siTOG ZNF711
compared to siTOG ZFX and siTOG of both TFs, suggesting that ZFX might be playing a more
essential role than ZNF711 in HEK293T cells. When the ZNF711 level is reduced in the cells,
perhaps ZFX compensates for this loss of function. Although thousands of DEGs were identified in
siTOG for both TFs, twice as many responsive promoters were identified in the DKO cells (Figure
3.3). Therefore, it is possible that even the low levels of ZFX and ZNF711 mRNAs remaining
in the siTOG cells can maintain some of the cellular functions. If so, then this would not allow us
to correctly identify the direct target promoters using this transient knockdown approach.
Figure 5.5 Volcano plots of ZFX and/or ZNF711 siTOG experiments
Volcano plots showing DEGs identified via RNA-seq in comparisons of ZFX and/or ZNF711 siTOG vs.
controls. Weiya Ni performed the data analysis.
[Figure panels: x-axis: log2(Fold Change), -5.0 to 5.0; y-axis: -log10(Adj-pVal), 0 to 100; points are marked by whether they pass the threshold (TRUE or FALSE). Panel labels as extracted: siTOG ZFX (N=1973, N=1158); siTOG ZNF711 (N=27, N=14); siTOG ZFX and ZNF711 (N=2199, N=1502).]
5.3.3 Are protein-protein interactions responsible for specifying responsive vs non-responsive promoters?
The ZFX family of transcription factors binds to ~13,000 essentially identical promoters in each cell
type we have tested (kidney HEK293T, breast MCF7, prostate 22Rv1 and C42B, leukemia HAP1,
and colon HCT116) (Figure 5.6). However, as described above, only a fraction of these promoters
responded to loss of ZFX family members in HEK293T cells. This suggests that perhaps ZFX
needs to interact with another protein to activate transcription. If so, then different interacting
partners could specify different subsets of responsive promoters in different cell types. This
possibility can be addressed in two ways: 1) determining if different sets of genes show reduced
expression when the ZFX family members are knocked down in different cell types (discussed
here) and 2) identifying different interaction partners in different cell types (discussed in Chapter
6).
Figure 5.6 The ZFX family ChIP-seq in various cell lines browser track screenshot
[Browser tracks, region chr1:159,542,602-243,144,256: ZFX in HEK293T; ZNF711 in HEK293T; ZFX in
22Rv1; ZNF711 in 22Rv1; ZFY in 22Rv1; ZFX in C42B; ZFX in MCF7; ZFX in HCT116; ZFX in HAP1;
ZNF711 in HAP1.]
All ZFX family ChIPs were performed using the same ZFX antibody (Cell Signaling Technology # 5419S).
ZNF711 ChIP-seq in HEK293T was performed using an antibody from Dr. Kristian Helin and the
remaining ZNF711 ChIP-seq experiments were performed using Thermo antibody # PA5-31652. ChIP-seq
experiments were performed by Shannon Schreiner, Andrew Perez, Lijing Yao, Zhifei Luo, and Heather
Witt. Weiya Ni performed data analysis.
To determine whether the ZFX family members regulate the same or a different set of target
genes in different cell types, siRNA knockdown of ZFX was performed followed by RNA-seq in
three different cell lines (HEK293T, MCF7, and C42B). As shown in the volcano plots in Figure 5.7A,
in each cell line, thousands of genes changed expression levels in response to the knockdown of
ZFX, with more genes being downregulated than upregulated. I generated lists of
responsive target genes in each cell line by a) comparing all up- and down-regulated genes in
each cell line (Figure 5.7B) and b) intersecting ZFX binding profiles with the lists of
deregulated genes. In all 3 cell lines I found that only a limited number of the downregulated genes
are bound by ZFX in the promoter regions. This is not surprising because I have shown that
reduction of ZFX family members has severe effects on proliferation; thus, there will be a large
number of signaling pathways affected in the knockdown cells and consequently a large number
of indirectly responsive genes. However, I am most interested in the direct target genes (bound
and downregulated). The Venn diagram (Figure 5.7C) shows that the cell lines share only a very
small portion of common responsive direct targets, suggesting that ZFX has distinct targets in
different cell types. I also found that although some genes are only downregulated in certain cell
lines, there are robust ZFX peaks at these promoter regions across all three cell lines (Figure 5.8).
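The definition of direct targets used here (bound and downregulated) and the comparison across cell lines amount to set intersections. The following is a minimal sketch of that logic; the gene identifiers are hypothetical placeholders rather than data from this study, and the function names are assumptions introduced for illustration.

```python
def direct_targets(downregulated, bound):
    """Genes that are both downregulated upon knockdown and bound at the promoter."""
    return set(downregulated) & set(bound)


def shared_and_specific(targets_by_line):
    """Split direct targets into those shared by all cell lines and those unique to one."""
    shared = set.intersection(*targets_by_line.values())
    specific = {}
    for line, targets in targets_by_line.items():
        others = set().union(
            *(t for name, t in targets_by_line.items() if name != line)
        )
        specific[line] = targets - others
    return shared, specific


# Hypothetical placeholder gene IDs (G1..G7); binding is modeled as identical
# in all three cell lines, mirroring the near-identical ChIP-seq profiles.
downregulated = {
    "HEK293T": {"G1", "G2", "G3", "G7"},
    "MCF7": {"G1", "G4", "G5"},
    "C42B": {"G1", "G5", "G6"},
}
bound_everywhere = {"G1", "G2", "G4", "G5", "G6"}

targets = {
    line: direct_targets(genes, bound_everywhere)
    for line, genes in downregulated.items()
}
shared, specific = shared_and_specific(targets)

if __name__ == "__main__":
    print("shared direct targets:", sorted(shared))
    for line, genes in specific.items():
        print(line, "specific direct targets:", sorted(genes))
```

In this toy input every direct target is bound in all three cell lines, yet most are downregulated in only one, which is the pattern described for Figures 5.7C and 5.8.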
[Figure panel labels: ZFX knockdown via siRNA (HEK293T, MCF7, C42B); overlap of downregulated genes
upon siZFX; overlap of upregulated genes upon siZFX; downregulated genes bound by ZFX; panels A, B, C.]
Figure 5.7 siRNA knockdown of ZFX in three different cell lines
A) Volcano plots of RNA-seq data in HEK293T, MCF7, and C42B show deregulated genes after knocking
down ZFX by siRNA transfection (fold change cut-off is 1.5 and FDR cut-off is 0.05; NC=no change). B)
Venn diagrams show up and downregulated genes in three cell lines. C) Venn diagrams show overlapped
up and downregulated genes bound by ZFX in three cell lines. Wei Zhu performed ZFX knockdown via
siRNA experiments in HEK293T and Lijun Yao performed siRNA experiments in MCF7 and C42B. Weiya
Ni performed the data analysis.
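The cut-offs stated in the Figure 5.7 legend (fold change of 1.5 and FDR of 0.05) can be applied as in the minimal sketch below. It assumes the differential expression results are available as log2 fold changes with adjusted p-values, and that the fold-change cut-off is inclusive while the FDR cut-off is strict; neither detail is stated in the text, and the function names are illustrative.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5  # from the Figure 5.7 legend
FDR_CUTOFF = 0.05         # from the Figure 5.7 legend
LOG2_FC_CUTOFF = math.log2(FOLD_CHANGE_CUTOFF)  # about 0.585


def classify_gene(log2_fold_change, fdr):
    """Label a gene 'up', 'down', or 'NC' (no change) using the legend cut-offs."""
    if fdr < FDR_CUTOFF:
        if log2_fold_change >= LOG2_FC_CUTOFF:
            return "up"
        if log2_fold_change <= -LOG2_FC_CUTOFF:
            return "down"
    return "NC"


def count_classes(rows):
    """Count labels over an iterable of (log2 fold change, FDR) pairs."""
    counts = {"up": 0, "down": 0, "NC": 0}
    for log2_fold_change, fdr in rows:
        counts[classify_gene(log2_fold_change, fdr)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical values chosen to exercise each branch.
    example = [(1.0, 0.01), (-1.0, 0.01), (0.3, 0.01), (2.0, 0.2)]
    print(count_classes(example))
```

Working in log2 space keeps the up and down thresholds symmetric: a 1.5-fold decrease corresponds to a log2 fold change of about −0.585.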
Figure 5.8 ZFX binding at cell-type specific downregulated gene promoters
ZFX ChIP-seq browser track screenshots across HEK293T, C42B, and MCF7 cells at example promoters
of genes which are cell-type specific downregulated genes in response to the ZFX knockdown. Weiya Ni
performed the data analysis.
5.4 Discussion
In this Chapter, I have discussed two possible mechanisms that might explain why only a subset
of promoters bound by ZFX family members show reduced activity in the DKO cells. I have
concluded that there is no strong evidence indicating that ZFX/ZNF711 binding profiles or
H3K4me3 or H3K27ac modifications distinguish a responsive from a non-responsive target gene
in either wt or DKO cells. I note that these histone marks are all activating marks. There have
been studies showing that the overall H3K27me3 mark increases upon the loss of H3K36me2 in
embryonic stem cells (Streubel et al. 2018). Because of the link between NSD1 and H3K36me2,
it was possible that H3K27me3 levels changed in DKO cells. Therefore, I performed ChIP-seq for
H3K27me3 in wt and DKO cells. However, due to the very low overall level of H3K27me3 in
HEK293T cells, I was not able to identify a statistically significant change in this mark between DKO
and wt HEK293T cells (data not shown). In contrast, my studies showing that different subsets of bound
promoters are responsive to knockdown of ZFX family members in different cell types suggest
that perhaps as yet unidentified protein partners help to define ZFX target genes.
Although the ZFX family members bind to almost identical sites in various cell lines, the genes
deregulated upon the loss of these TFs are cell-type specific. This suggests
that the ZFX family regulates the transcriptome via post-DNA-binding, cell-type specific
mechanisms. It has been shown that the same TF can regulate different sets of target genes in
different cell types, such as the cell-type specific functions of TCF7L2 in HepG2 and MCF7 cells,
and ESR1 in breast and endometrial cell lines (Gertz et al. 2012, Frietze et al. 2012). Future
experiments need to be done to identify ZFX-interacting partners in different cell types (discussed
in Chapter 6).
Chapter 6
Discussion and future directions
6.1 Discussion
The three members of the ZFX family, ZFX, ZFY, and ZNF711, have highly similar gene structures
and genomic binding profiles in various cell types. Their unique binding preference downstream
of the TSS at CpG island promoter regions suggests that the ZFX family might be involved in
regulating the expression of housekeeping genes and thus mediate essential molecular and
cellular functions. I reported in earlier chapters that thousands of genes responded to the loss of
ZFX family members in DKO cells. However, the responsive genes correspond to only a fraction
of the genes whose promoters are bound by ZFX family members. To date, I have not been able
to identify differences in promoter structure, activity level, or epigenetic modifications that
distinguish promoters that are bound by both ZFX and ZNF711 in wt cells and show decreased
expression in the DKO cells from those that are bound in wt cells and do not show decreased
expression in DKO cells. I describe below three major unanswered questions that arise from my
thesis work.
6.1.1 Why don’t the majority of the ZFX-bound promoters respond to the loss of ZFX family
members?
In HEK293T cells, ~13,000 promoter regions are bound by ZFX; however only ~3,400 promoters
bound by ZFX in wt cells are downregulated in DKO cells. Thus, ~74% of the active CpG island
promoters bound by ZFX are not affected by the loss of ZFX and ZNF711. There might be three
possible reasons for this discrepancy.
1) The difficulties I encountered during the identification of DKO clones and their slow
growth phenotype suggest that the transcriptome I observed in DKO cells might be a result of cell
survival selection. When cells are under the extreme stress of losing two critical cell growth
regulators for thousands of housekeeping genes, the DKO cells might have adopted new
compensating strategies to rescue the deregulated cell programs and survive in the cell culture
system. Therefore, the long-term effects of knocking out ZFX family members in DKO cells are
likely a combination of both direct and indirect cell responses. This could make it difficult to study
the functions of the ZFX family in DKO cells. To overcome the caveats of using permanent
genome editing systems to study gene functions, transient knockdown or inducible systems can
be utilized to monitor the acute cell responses upon the reduced level or loss of the gene(s) of
interest. However, thousands more DEGs were observed in DKO cells compared to siTOG
knockdown samples, although both ZFX and ZNF711 are knocked down to low levels in siTOG
experiments. It is possible that even low remaining levels of the ZFX family can compensate for
the reduced level of other family members. This hypothesis can be tested by performing ChIP-seq
of ZFX family members in the knockdown samples to see if there are still binding events
happening at the original binding sites. Another method of achieving transient gene silencing is
through degrading proteins using auxin in the auxin-inducible degron (AID) technology
(Nishimura et al. 2009, Li et al. 2019). Auxin represents a family of plant hormones, such as
indole-3-acetic acid (IAA; a natural auxin) and 1-naphthaleneacetic acid (NAA; a synthetic auxin),
that are able to influence gene expression during cell development and proliferation (Nishimura
et al. 2009). In the AID system, an auxin-inducible domain is fused to the gene of interest (GOI)
and the F-box protein transport inhibitor response 1 (TIR1), which forms SCF (Skp1, Cullin and
F-box) complexes, is stably expressed in the cells. Upon the controllable introduction of auxin into
the system, auxin binds to TIR1, thus forming an SCF/TIR1/AID/GOI complex. This complex functions
as an E3 ubiquitin ligase which recruits an E2 ubiquitin-conjugating enzyme to ubiquitinate the GOI.
The polyubiquitinated GOI will be rapidly recognized and degraded by the endogenous proteasome.
The protein degradation process is short-term and only takes place upon the introduction of auxin.
Thus, one advantage of the AID system is allowing gene silencing on demand for instant
investigation of cell responses upon the
loss of GOI, by avoiding the introduction of long-term survival selection effects into the system.
This system may provide a better method for identifying ZFX family target promoters.
2) The ZFX family may regulate the transcriptome in a post DNA-binding manner using
cell type-specific protein partners. If so, then ZFX family member binding would not be sufficient
to identify the responsive promoters. Identifying ZFX-interacting partners via protein-protein
interaction in various cell lines may provide cell-type specific insights into the direct targets of the
ZFX family and the mechanisms through which these TFs mediate the transcriptome.
3) Perhaps cell culture studies cannot fully capture the functions of the ZFX family in vivo.
Studies have shown that increased ZFX expression levels in patient samples are
correlated with poor patient survival rates in multiple cancers (Fang et al. 2012, Li et al. 2013,
Weng et al. 2015, Nikpour et al. 2012, Fang, Huang, et al. 2014, Fang, Fu, et al. 2014, Zhou et
al. 2011, Jiang, Wang, et al. 2012, Jiang, Xu, et al. 2012). It has also been shown that the
reduction of ZFX levels potently inhibits glioblastoma (GBM) stem cell-bearing tumor growth in
mouse models (Fang, Huang, et al. 2014). In order to better understand the oncogenic traits and
molecular mechanisms of the ZFX family, future in vivo studies might be necessary.
6.1.2 Does NSD1 contribute to the function of the ZFX family?
Strikingly, NSD1 is the only identified ZFX-interacting partner in repeated IP-MS experiments
performed in HEK293T cells. However, so far, no correlation has been observed between the
H3K36 methylation levels and activities of the ZFX responsive promoters. Understanding whether
NSD1 and ZFX co-localize in the genome could provide valuable insights into the roles of NSD1
in ZFX functions. However, the roadblock to performing NSD1 ChIP-seq has been the lack of a
high-quality ChIP-grade NSD1 antibody. ChIP-seq using a FLAG-tagged NSD1 is the next step
in this approach to link NSD1 and ZFX. Another potential future experiment could be performing
proximity ligation assays (PLA) to detect if two proteins of interest (POIs) are in close proximity
(<40 nm). PLA plus and minus probes linked with unique short DNA strands are attached
separately to the secondary antibodies raised in different species. When the POIs are in close
proximity, the DNA strands will be ligated and amplified via rolling-circle
amplification (RCA) (Poulard et al. 2014). Complementary DNA oligos along with fluorochromes
will be hybridized to repeating sequences in the amplicons. The amplified fluorescence can be
visualized by microscopy and quantified as PLA signals indicating the localization of the
interaction between POIs (Alam 2018).
As noted above, NSD1 was the only interactor identified in the IP-MS experiments. Because of the
limitations of IP-MS experiments, it is possible that transient and weak protein-protein interactions
were not captured. To overcome this shortcoming of IP-MS, additional experiments could be
performed, such as proximity-dependent biotinylation (PDB)-based MS using BioID or
TurboID (Samavarchi-Tehrani, Samson, and Gingras 2020, May et al. 2020). By attaching an
enzyme to ZFX that adds biotin labels through covalent bonds to the proteins that are in close
proximity, additional ZFX partners may be identified. Biotin-based MS is useful for capturing weak
and transient interactions. Although one caveat of the biotinylation approach is that a large
number of proteins are usually identified, the most critical interactors could be prioritized by
creation of additional ZFX mutants. For example, ZFX constructs having mutations in the
conserved phenylalanines in the N-terminus (which are good candidates for interaction with
co-activators) can be used in the biotin-based MS experiments (Yoder and Kumar 2006). The
characterization of protein-protein interacting domains and the identification of co-factors of ZFX
in different cell types would also shed light on the cell-type specific roles of ZFX.
6.2 Future directions
6.2.1 ZFX and GBM
As described in previous Chapters, there is a strong correlation between high expression of ZFX
and poor patient survival in a variety of human cancers. Unfortunately, the previous studies did
not also include an analysis of ZNF711 or ZFY. Therefore, it is not known if ZFX is more important
in driving cancer development than are the other 2 family members. Although I have shown that
knocking out either ZFX or ZNF711 has an effect on proliferation in HEK293T cells (with a double
knockout having an even greater effect), this is not a very cancer-relevant model system. In order
to better understand ZFX as an oncogenic driver in cancer development, expanding future studies
into more cancer-relevant models is crucial. Previous studies have demonstrated a correlation
between high ZFX expression and poor GBM patient survival rates and have shown that
knockdown of ZFX almost eliminates tumor growth in GBM patient-derived xenograft models
(Fang, Huang, et al. 2014). It is interesting that such a dramatic effect was seen when only one
ZFX family member was knocked down. This suggests that ZFX may be more important than
ZNF711 or ZFY for the development of glioblastoma. However, as with the other cases, the
investigators did not compare knockdown of ZFX with knockdown of the other family members.
Future experiments that compare the ability of the 3 family members to control growth of GBM
tumors would be very interesting. As a first step, it would be important to perform genomic profiling
of all 3 TFs in glioblastoma cells to determine if they have identical binding patterns. Then,
knocking them all out and adding the family members back one at a time may reveal family
member-specific in vivo functions.
References
Alam, M. S. 2018. "Proximity Ligation Assay (PLA)." Curr Protoc Immunol 123 (1):e58. doi:
10.1002/cpim.58.
Antequera, F. 2003. "Structure, function and evolution of CpG island promoters." Cell Mol Life
Sci 60 (8):1647-58. doi: 10.1007/s00018-003-3088-6.
Augello, M. A., T. E. Hickey, and K. E. Knudsen. 2011. "FOXA1: master of steroid receptor
function in cancer." EMBO J 30 (19):3885-94. doi: 10.1038/emboj.2011.340.
Barrera, L. A., A. Vedenko, J. V. Kurland, J. M. Rogers, S. S. Gisselbrecht, E. J. Rossin, J.
Woodard, L. Mariani, K. H. Kock, S. Inukai, T. Siggers, L. Shokri, R. Gordân, N. Sahni, C.
Cotsapas, T. Hao, S. Yi, M. Kellis, M. J. Daly, M. Vidal, D. E. Hill, and M. L. Bulyk. 2016.
"Survey of variation in human transcription factors reveals prevalent DNA binding
changes." Science 351 (6280):1450-1454. doi: 10.1126/science.aad2257.
Beroukhim, R., C. H. Mermel, D. Porter, G. Wei, S. Raychaudhuri, J. Donovan, J. Barretina, J. S.
Boehm, J. Dobson, M. Urashima, K. T. Mc Henry, R. M. Pinchback, A. H. Ligon, Y. J. Cho,
L. Haery, H. Greulich, M. Reich, W. Winckler, M. S. Lawrence, B. A. Weir, K. E. Tanaka,
D. Y. Chiang, A. J. Bass, A. Loo, C. Hoffman, J. Prensner, T. Liefeld, Q. Gao, D. Yecies,
S. Signoretti, E. Maher, F. J. Kaye, H. Sasaki, J. E. Tepper, J. A. Fletcher, J. Tabernero,
J. Baselga, M. S. Tsao, F. Demichelis, M. A. Rubin, P. A. Janne, M. J. Daly, C. Nucera,
R. L. Levine, B. L. Ebert, S. Gabriel, A. K. Rustgi, C. R. Antonescu, M. Ladanyi, A. Letai,
L. A. Garraway, M. Loda, D. G. Beer, L. D. True, A. Okamoto, S. L. Pomeroy, S. Singer,
T. R. Golub, E. S. Lander, G. Getz, W. R. Sellers, and M. Meyerson. 2010. "The landscape
of somatic copy-number alteration across human cancers." Nature 463 (7283):899-905.
doi: 10.1038/nature08822.
Birney, E., J. A. Stamatoyannopoulos, A. Dutta, R. Guigó, T. R. Gingeras, E. H. Margulies, Z.
Weng, M. Snyder, E. T. Dermitzakis, R. E. Thurman, M. S. Kuehn, C. M. Taylor, S. Neph,
C. M. Koch, S. Asthana, A. Malhotra, I. Adzhubei, J. A. Greenbaum, R. M. Andrews, P.
Flicek, P. J. Boyle, H. Cao, N. P. Carter, G. K. Clelland, S. Davis, N. Day, P. Dhami, S. C.
Dillon, M. O. Dorschner, H. Fiegler, P. G. Giresi, J. Goldy, M. Hawrylycz, A. Haydock, R.
Humbert, K. D. James, B. E. Johnson, E. M. Johnson, T. T. Frum, E. R. Rosenzweig, N.
Karnani, K. Lee, G. C. Lefebvre, P. A. Navas, F. Neri, S. C. Parker, P. J. Sabo, R.
Sandstrom, A. Shafer, D. Vetrie, M. Weaver, S. Wilcox, M. Yu, F. S. Collins, J. Dekker, J.
D. Lieb, T. D. Tullius, G. E. Crawford, S. Sunyaev, W. S. Noble, I. Dunham, F. Denoeud,
A. Reymond, P. Kapranov, J. Rozowsky, D. Zheng, R. Castelo, A. Frankish, J. Harrow, S.
Ghosh, A. Sandelin, I. L. Hofacker, R. Baertsch, D. Keefe, S. Dike, J. Cheng, H. A. Hirsch,
E. A. Sekinger, J. Lagarde, J. F. Abril, A. Shahab, C. Flamm, C. Fried, J. Hackermüller, J.
Hertel, M. Lindemeyer, K. Missal, A. Tanzer, S. Washietl, J. Korbel, O. Emanuelsson, J.
S. Pedersen, N. Holroyd, R. Taylor, D. Swarbreck, N. Matthews, M. C. Dickson, D. J.
Thomas, M. T. Weirauch, J. Gilbert, J. Drenkow, I. Bell, X. Zhao, K. G. Srinivasan, W. K.
Sung, H. S. Ooi, K. P. Chiu, S. Foissac, T. Alioto, M. Brent, L. Pachter, M. L. Tress, A.
Valencia, S. W. Choo, C. Y. Choo, C. Ucla, C. Manzano, C. Wyss, E. Cheung, T. G. Clark,
J. B. Brown, M. Ganesh, S. Patel, H. Tammana, J. Chrast, C. N. Henrichsen, C. Kai, J.
Kawai, U. Nagalakshmi, J. Wu, Z. Lian, J. Lian, P. Newburger, X. Zhang, P. Bickel, J. S.
Mattick, P. Carninci, Y. Hayashizaki, S. Weissman, T. Hubbard, R. M. Myers, J. Rogers,
P. F. Stadler, T. M. Lowe, C. L. Wei, Y. Ruan, K. Struhl, M. Gerstein, S. E. Antonarakis,
Y. Fu, E. D. Green, U. Karaöz, A. Siepel, J. Taylor, L. A. Liefer, K. A. Wetterstrand, P. J.
Good, E. A. Feingold, M. S. Guyer, G. M. Cooper, G. Asimenos, C. N. Dewey, M. Hou, S.
Nikolaev, J. I. Montoya-Burgos, A. Löytynoja, S. Whelan, F. Pardi, T. Massingham, H.
Huang, N. R. Zhang, I. Holmes, J. C. Mullikin, A. Ureta-Vidal, B. Paten, M. Seringhaus, D.
Church, K. Rosenbloom, W. J. Kent, E. A. Stone, S. Batzoglou, N. Goldman, R. C.
Hardison, D. Haussler, W. Miller, A. Sidow, N. D. Trinklein, Z. D. Zhang, L. Barrera, R.
Stuart, D. C. King, A. Ameur, S. Enroth, M. C. Bieda, J. Kim, A. A. Bhinge, N. Jiang, J. Liu,
F. Yao, V. B. Vega, C. W. Lee, P. Ng, A. Yang, Z. Moqtaderi, Z. Zhu, X. Xu, S. Squazzo,
M. J. Oberley, D. Inman, M. A. Singer, T. A. Richmond, K. J. Munn, A. Rada-Iglesias, O.
Wallerman, J. Komorowski, J. C. Fowler, P. Couttet, A. W. Bruce, O. M. Dovey, P. D. Ellis,
C. F. Langford, D. A. Nix, G. Euskirchen, S. Hartman, A. E. Urban, P. Kraus, S. Van Calcar,
N. Heintzman, T. H. Kim, K. Wang, C. Qu, G. Hon, R. Luna, C. K. Glass, M. G. Rosenfeld,
S. F. Aldred, S. J. Cooper, A. Halees, J. M. Lin, H. P. Shulha, M. Xu, J. N. Haidar, Y. Yu,
V. R. Iyer, R. D. Green, C. Wadelius, P. J. Farnham, B. Ren, R. A. Harte, A. S. Hinrichs,
H. Trumbower, H. Clawson, J. Hillman-Jackson, A. S. Zweig, K. Smith, A. Thakkapallayil,
G. Barber, R. M. Kuhn, D. Karolchik, L. Armengol, C. P. Bird, P. I. de Bakker, A. D. Kern,
N. Lopez-Bigas, J. D. Martin, B. E. Stranger, A. Woodroffe, E. Davydov, A. Dimas, E.
Eyras, I. B. Hallgrímsdóttir, J. Huppert, M. C. Zody, G. R. Abecasis, X. Estivill, G. G.
Bouffard, X. Guan, N. F. Hansen, J. R. Idol, V. V. Maduro, B. Maskeri, J. C. McDowell, M.
Park, P. J. Thomas, A. C. Young, R. W. Blakesley, D. M. Muzny, E. Sodergren, D. A.
Wheeler, K. C. Worley, H. Jiang, G. M. Weinstock, R. A. Gibbs, T. Graves, R. Fulton, E.
R. Mardis, R. K. Wilson, M. Clamp, J. Cuff, S. Gnerre, D. B. Jaffe, J. L. Chang, K. Lindblad-
Toh, E. S. Lander, M. Koriabine, M. Nefedov, K. Osoegawa, Y. Yoshinaga, B. Zhu, P. J.
de Jong, ENCODE Project Consortium, NISC Comparative Sequencing Program, Baylor
College of Medicine Human Genome Sequencing Center, Washington University Genome
Sequencing Center, Broad Institute, and Children's Hospital Oakland Research Institute.
2007. "Identification and analysis of functional elements in 1% of the human genome by
the ENCODE pilot project." Nature 447 (7146):799-816. doi: 10.1038/nature05874.
Brayer, K. J., S. Kulshreshtha, and D. J. Segal. 2008. "The protein-binding potential of C2H2 zinc
finger domains." Cell Biochem Biophys 51 (1):9-19. doi: 10.1007/s12013-008-9007-6.
Brayer, K. J., and D. J. Segal. 2008. "Keep your fingers off my DNA: protein-protein interactions
mediated by C2H2 zinc finger domains." Cell Biochem Biophys 50 (3):111-31. doi:
10.1007/s12013-008-9008-5.
Bürglin, T. R. 2011. "Homeodomain subtypes and functional diversity." Subcell Biochem 52:95-
122. doi: 10.1007/978-90-481-9069-0_5.
Charoensawan, V., D. Wilson, and S. A. Teichmann. 2010. "Lineage-specific expansion of DNA-
binding transcription factor families." Trends Genet 26 (9):388-93. doi:
10.1016/j.tig.2010.06.004.
Chen, A., and A. N. Koehler. 2020. "Transcription Factor Inhibition: Lessons Learned and
Emerging Targets." Trends Mol Med 26 (5):508-518. doi: 10.1016/j.molmed.2020.01.004.
Chen, R., W. Q. Zhao, C. Fang, X. Yang, and M. Ji. 2020. "Histone methyltransferase SETD2: a
potential tumor suppressor in solid cancers." J Cancer 11 (11):3349-3356. doi:
10.7150/jca.38391.
Chen, X., H. Xu, P. Yuan, F. Fang, M. Huss, V. B. Vega, E. Wong, Y. L. Orlov, W. Zhang, J. Jiang,
Y. H. Loh, H. C. Yeo, Z. X. Yeo, V. Narang, K. R. Govindarajan, B. Leong, A. Shahab, Y.
Ruan, G. Bourque, W. K. Sung, N. D. Clarke, C. L. Wei, and H. H. Ng. 2008. "Integration
of external signaling pathways with the core transcriptional network in embryonic stem
cells." Cell 133 (6):1106-17.
Chen, Y., P. Chi, S. Rockowitz, P. J. Iaquinta, T. Shamu, S. Shukla, D. Gao, I. Sirota, B. S. Carver,
J. Wongvipat, H. I. Scher, D. Zheng, and C. L. Sawyers. 2013. "ETS factors reprogram
the androgen receptor cistrome and prime prostate tumorigenesis in response to PTEN
loss." Nat Med 19 (8):1023-9. doi: 10.1038/nm.3216.
Chou, J., S. Provot, and Z. Werb. 2010. "GATA3 in development and cancer differentiation: cells
GATA have it!" J Cell Physiol 222 (1):42-9. doi: 10.1002/jcp.21943.
Clark, J. P., and C. S. Cooper. 2009. "ETS gene fusions in prostate cancer." Nat Rev Urol 6
(8):429-39. doi: 10.1038/nrurol.2009.127.
Consortium, ENCODE Project. 2012. "An integrated encyclopedia of DNA elements in the human
genome." Nature 489 (7414):57-74. doi: 10.1038/nature11247.
Core, L., and K. Adelman. 2019. "Promoter-proximal pausing of RNA polymerase II: a nexus of
gene regulation." Genes Dev 33 (15-16):960-982. doi: 10.1101/gad.325142.119.
Dalgin, G. S., D. T. Holloway, L. S. Liou, and C. DeLisi. 2007. "Identification and characterization
of renal cell carcinoma gene markers." Cancer Inform 3:65-92.
De, I., and C. W. Müller. 2019. "Unleashing the Power of ASH1L Methyltransferase." Structure
27 (5):727-728. doi: 10.1016/j.str.2019.04.012.
Deaton, A. M., and A. Bird. 2011. "CpG islands and the regulation of transcription." Genes Dev
25 (10):1010-22. doi: 10.1101/gad.2037511.
Douglas, J., S. Hanks, I. K. Temple, S. Davies, A. Murray, M. Upadhyaya, S. Tomkins, H. E.
Hughes, T. R. Cole, and N. Rahman. 2003. "NSD1 mutations are the major cause of Sotos
syndrome and occur in some cases of Weaver syndrome but are rare in other overgrowth
phenotypes." Am J Hum Genet 72 (1):132-43. doi: 10.1086/345647.
Ebmeier, C. C., B. Erickson, B. L. Allen, M. A. Allen, H. Kim, N. Fong, J. R. Jacobsen, K. Liang,
A. Shilatifard, R. D. Dowell, W. M. Old, D. L. Bentley, and D. J. Taatjes. 2017. "Human
TFIIH Kinase CDK7 Regulates Transcription-Associated Chromatin Modifications." Cell
Rep 20 (5):1173-1186. doi: 10.1016/j.celrep.2017.07.021.
Ecco, G., M. Imbeault, and D. Trono. 2017. "KRAB zinc finger proteins." Development 144
(15):2719-2729. doi: 10.1242/dev.132605.
Fang, J., Z. Yu, M. Lian, H. Ma, J. Tai, L. Zhang, and D. Han. 2012. "Knockdown of zinc finger
protein, X-linked (ZFX) inhibits cell proliferation and induces apoptosis in human laryngeal
squamous cell carcinoma." Mol Cell Biochem 360 (1-2):301-7. doi: 10.1007/s11010-011-
1069-x.
Fang, Q., W. H. Fu, J. Yang, X. Li, Z. S. Zhou, Z. W. Chen, and J. H. Pan. 2014. "Knockdown of
ZFX suppresses renal carcinoma cell growth and induces apoptosis." Cancer Genet 207
(10-12):461-6. doi: 10.1016/j.cancergen.2014.08.007.
Fang, X., Z. Huang, W. Zhou, Q. Wu, A. E. Sloan, G. Ouyang, R. E. McLendon, J. S. Yu, J. N.
Rich, and S. Bao. 2014. "The zinc finger transcription factor ZFX is required for maintaining
the tumorigenic potential of glioblastoma stem cells." Stem Cells 32 (8):2033-47. doi:
10.1002/stem.1730.
Farnham, P. J. 2009. "Insights from genomic profiling of transcription factors." Nat Rev Genet 10
(9):605-16. doi: 10.1038/nrg2636.
Frietze, S., and P. J. Farnham. 2011. "Transcription factor effector domains." Subcell Biochem
52:261-77. doi: 10.1007/978-90-481-9069-0_12.
Frietze, S., R. Wang, L. Yao, Y. G. Tak, Z. Ye, M. Gaddis, H. Witt, P. J. Farnham, and V. X. Jin.
2012. "Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the
genome by association with GATA3." Genome Biol 13 (9):R52. doi: 10.1186/gb-2012-13-
9-r52.
Fuda, N. J., M. B. Ardehali, and J. T. Lis. 2009. "Defining mechanisms that regulate RNA
polymerase II transcription in vivo." Nature 461 (7261):186-92. doi: 10.1038/nature08449.
Garza, A. S., N. Ahmad, and R. Kumar. 2009. "Role of intrinsically disordered protein
regions/domains in transcriptional regulation." Life Sci 84 (7-8):189-93. doi:
10.1016/j.lfs.2008.12.002.
Gertz, J., T. E. Reddy, K. E. Varley, M. J. Garabedian, and R. M. Myers. 2012. "Genistein and
bisphenol A exposure cause estrogen receptor 1 to bind thousands of sites in a cell type-
specific manner." Genome Res 22 (11):2153-62. doi: 10.1101/gr.135681.111.
Goodman, F. R. 2002. "Limb malformations and the human HOX genes." Am J Med Genet 112
(3):256-65. doi: 10.1002/ajmg.10776.
Grants, J., E. Flanagan, A. Yee, and P. J. Romaniuk. 2010. "Characterization of the DNA binding
activity of the ZFY zinc finger domain." Biochemistry 49 (4):679-86. doi:
10.1021/bi9018626.
Guo, Y. A., M. M. Chang, W. Huang, W. F. Ooi, M. Xing, P. Tan, and A. J. Skanderup. 2018.
"Mutation hotspots at CTCF binding sites coupled to chromosomal instability in
gastrointestinal cancers." Nat Commun 9 (1):1520. doi: 10.1038/s41467-018-03828-2.
Haberle, V., and A. Stark. 2018. "Eukaryotic core promoters and the functional basis of
transcription initiation." Nat Rev Mol Cell Biol 19 (10):621-637. doi: 10.1038/s41580-018-
0028-8.
Han, X., L. Piao, Q. Zhuang, X. Yuan, Z. Liu, and X. He. 2018. "The role of histone lysine
methyltransferase NSD3 in cancer." Onco Targets Ther 11:3847-3852. doi:
10.2147/OTT.S166006.
Holve, S., B. Friedman, H. E. Hoyme, T. J. Tarby, S. J. Johnstone, R. P. Erickson, C. L. Clericuzio,
and C. Cunniff. 2003. "Athabascan brainstem dysgenesis syndrome." Am J Med Genet
A 120A (2):169-73. doi: 10.1002/ajmg.a.20087.
Huntley, S., D. M. Baggott, A. T. Hamilton, M. Tran-Gyamfi, S. Yang, J. Kim, L. Gordon, E.
Branscomb, and L. Stubbs. 2006. "A comprehensive catalog of human KRAB-associated
zinc finger genes: insights into the evolutionary history of a large family of transcriptional
repressors." Genome Res 16 (5):669-77. doi: 10.1101/gr.4842106.
Iuchi, S. 2000. "Three classes of C2H2 zinc finger proteins." CMLS, Cell. Mol. Life Sci. 58:625-
635.
Iyengar, S., and P. J. Farnham. 2011. "KAP1 protein: an enigmatic master regulator of the
genome." J Biol Chem 286 (30):26267-76. doi: 10.1074/jbc.R111.252569.
Jané-Valbuena, J., H. R. Widlund, S. Perner, L. A. Johnson, A. C. Dibner, W. M. Lin, A. C. Baker,
R. M. Nazarian, K. G. Vijayendran, W. R. Sellers, W. C. Hahn, L. M. Duncan, M. A. Rubin,
D. E. Fisher, and L. A. Garraway. 2010. "An oncogenic role for ETV1 in melanoma."
Cancer Res 70 (5):2075-84. doi: 10.1158/0008-5472.CAN-09-3092.
Jiang, M., S. Xu, W. Yue, X. Zhao, L. Zhang, C. Zhang, and Y. Wang. 2012. "The role of ZFX in
non-small cell lung cancer development." Oncol Res 20 (4):171-8. doi:
10.3727/096504012x13548165987493.
Jiang, R., J. C. Wang, M. Sun, X. Y. Zhang, and H. Wu. 2012. "Zinc finger X-chromosomal protein
(ZFX) promotes solid agar colony growth of osteosarcoma cells." Oncol Res 20 (12):565-
70. doi: 10.3727/096504013X13775486749290.
Jones, P. A. 2012. "Functions of DNA methylation: islands, start sites, gene bodies and beyond."
Nat Rev Genet 13 (7):484-92. doi: 10.1038/nrg3230.
Khan, K., M. Zech, A. T. Morgan, D. J. Amor, M. Skorvanek, T. N. Khan, M. S. Hildebrand, V. E.
Jackson, T. S. Scerri, M. Coleman, K. A. Rigbye, I. E. Scheffer, M. Bahlo, M. Wagner, D.
D. Lam, R. Berutti, P. Havránková, A. Fečíková, T. M. Strom, V. Han, P. Dosekova, Z.
Gdovinova, F. Laccone, M. Jameel, M. R. Mooney, S. M. Baig, R. Jech, E. E. Davis, N.
Katsanis, and J. Winkelmann. 2019. "Recessive variants in ZNF142 cause a complex
neurodevelopmental disorder with intellectual disability, speech impairment, seizures, and
dystonia." Genet Med 21 (11):2532-2542. doi: 10.1038/s41436-019-0523-0.
Kleine-Kohlbrecher, D., J. Christensen, J. Vandamme, I. Abarrategui, M. Bak, N. Tommerup, X.
Shi, O. Gozani, J. Rappsilber, A. E. Salcini, and K. Helin. 2010. "A functional link between
the histone demethylase PHF8 and the transcription factor ZNF711 in X-linked mental
retardation." Mol Cell 38 (2):165-78. doi: 10.1016/j.molcel.2010.03.002.
Kurotaki, N., K. Imaizumi, N. Harada, M. Masuno, T. Kondoh, T. Nagai, H. Ohashi, K. Naritomi,
M. Tsukahara, Y. Makita, T. Sugimoto, T. Sonoda, T. Hasegawa, Y. Chinen, H. A. Tomita
Ha, A. Kinoshita, T. Mizuguchi, K. Yoshiura Ki, T. Ohta, T. Kishino, Y. Fukushima, N.
Niikawa, and N. Matsumoto. 2002. "Haploinsufficiency of NSD1 causes Sotos syndrome."
Nat Genet 30 (4):365-6. doi: 10.1038/ng863.
Kwak, H., and J. T. Lis. 2013. "Control of transcriptional elongation." Annu Rev Genet 47:483-
508. doi: 10.1146/annurev-genet-110711-155440.
Lambert, S. A., A. Jolma, L. F. Campitelli, P. K. Das, Y. Yin, M. Albu, X. Chen, J. Taipale, T. R.
Hughes, and M. T. Weirauch. 2018. "The Human Transcription Factors." Cell 172 (4):650-
665. doi: 10.1016/j.cell.2018.01.029.
Li, K., Z. C. Zhu, Y. J. Liu, J. W. Liu, H. T. Wang, Z. Q. Xiong, X. Shen, Z. L. Hu, and J. Zheng.
2013. "ZFX knockdown inhibits growth and migration of non-small cell lung carcinoma cell
line H1299." Int J Clin Exp Pathol 6 (11):2460-7.
Li, S., X. Prasanna, V. T. Salo, I. Vattulainen, and E. Ikonen. 2019. "An efficient auxin-inducible
degron system with low basal degradation in human cells." Nat Methods 16 (9):866-869.
doi: 10.1038/s41592-019-0512-x.
Lo, F. Y., J. W. Chang, I. S. Chang, Y. J. Chen, H. S. Hsu, S. F. Huang, F. Y. Tsai, S. S. Jiang,
R. Kanteti, S. Nandi, R. Salgia, and Y. C. Wang. 2012. "The database of chromosome
imbalance regions and genes resided in lung cancer from Asian and Caucasian identified
by array-comparative genomic hybridization." BMC Cancer 12:235. doi: 10.1186/1471-
2407-12-235.
Lu, L., X. Chen, D. Sanders, S. Qian, and X. Zhong. 2015. "High-resolution mapping of H4K16
and H3K23 acetylation reveals conserved and unique distribution patterns in Arabidopsis
and rice." Epigenetics 10 (11):1044-53. doi: 10.1080/15592294.2015.1104446.
Lucio-Eterovic, A. K., M. M. Singh, J. E. Gardner, C. S. Veerappan, J. C. Rice, and P. B. Carpenter.
2010. "Role for the nuclear receptor-binding SET domain protein 1 (NSD1)
methyltransferase in coordinating lysine 36 methylation at histone 3 with RNA polymerase
II function." Proc Natl Acad Sci U S A 107 (39):16952-7. doi: 10.1073/pnas.1002653107.
Lugtenberg, D., H. G. Yntema, M. J. Banning, A. R. Oudakker, H. V. Firth, L. Willatt, M. Raynaud,
T. Kleefstra, J. P. Fryns, H. H. Ropers, J. Chelly, C. Moraine, J. Gecz, J. van Reeuwijk, S.
B. Nabuurs, B. B. de Vries, B. C. Hamel, A. P. de Brouwer, and H. van Bokhoven. 2006.
"ZNF674: a new kruppel-associated box-containing zinc-finger gene involved in
nonsyndromic X-linked mental retardation." Am J Hum Genet 78 (2):265-78. doi:
10.1086/500306.
Luo, Z., S. K. Rhie, and P. J. Farnham. 2019. "The Enigmatic HOX Genes: Can We Crack Their
Code?" Cancers (Basel) 11 (3). doi: 10.3390/cancers11030323.
Lupo, A., E. Cesaro, G. Montano, D. Zurlo, P. Izzo, and P. Costanzo. 2013. "KRAB-Zinc Finger
Proteins: A Repressor Family Displaying Multiple Biological Functions." Curr Genomics
14 (4):268-78. doi: 10.2174/13892029113149990002.
Ma, H., F. Yang, M. Lian, R. Wang, H. Wang, L. Feng, Q. Shi, and J. Fang. 2015. "Dysregulation
of zinc finger protein, X-linked (ZFX) impairs cell proliferation and induces apoptosis in
human oral squamous cell carcinoma." Tumour Biol 36 (8):6103-12. doi:
10.1007/s13277-015-3292-7.
Mackeh, R., A. K. Marr, A. Fadda, and T. Kino. 2018. "C2H2-Type Zinc Finger Proteins:
Evolutionarily Old and New Partners of the Nuclear Hormone Receptors." Nucl Recept
Signal 15:1550762918801071. doi: 10.1177/1550762918801071.
Mandel, J. L., and J. Chelly. 2004. "Monogenic X-linked mental retardation: is it as frequent as
currently estimated? The paradox of the ARX (Aristaless X) mutations." Eur J Hum Genet
12 (9):689-93. doi: 10.1038/sj.ejhg.5201247.
May, D. G., K. L. Scott, A. R. Campos, and K. J. Roux. 2020. "Comparative Application of BioID
and TurboID for Protein-Proximity Biotinylation." Cells 9 (5). doi: 10.3390/cells9051070.
McConkey, G. A., and D. F. Bogenhagen. 1988. "TFIIIA binds with equal affinity to somatic and
major oocyte 5S RNA genes." Genes Dev 2 (2):205-14. doi: 10.1101/gad.2.2.205.
Meers, M. P., T. Henriques, C. A. Lavender, D. J. McKay, B. D. Strahl, R. J. Duronio, K. Adelman,
and A. G. Matera. 2017. "Histone gene replacement reveals a post-transcriptional role for
H3K36 in maintaining metazoan transcriptome fidelity." Elife 6. doi: 10.7554/eLife.23249.
Miranda, T. B., and P. A. Jones. 2007. "DNA methylation: the nuts and bolts of repression." J
Cell Physiol 213 (2):384-90. doi: 10.1002/jcp.21224.
Moore, J. E., M. J. Purcaro, H. E. Pratt, C. B. Epstein, N. Shoresh, J. Adrian, T. Kawli, C. A. Davis,
A. Dobin, R. Kaul, J. Halow, E. L. Van Nostrand, P. Freese, D. U. Gorkin, Y. Shen, Y. He,
M. Mackiewicz, F. Pauli-Behn, B. A. Williams, A. Mortazavi, C. A. Keller, X. O. Zhang, S.
I. Elhajjajy, J. Huey, D. E. Dickel, V. Snetkova, X. Wei, X. Wang, J. C. Rivera-Mulia, J.
Rozowsky, J. Zhang, S. B. Chhetri, A. Victorsen, K. P. White, A. Visel, G. W. Yeo, C. B.
Burge, E. Lécuyer, D. M. Gilbert, J. Dekker, J. Rinn, E. M. Mendenhall, J. R. Ecker, M.
Kellis, R. J. Klein, W. S. Noble, A. Kundaje, R. Guigó, P. J. Farnham, J. M. Cherry, R. M.
Myers, B. Ren, B. R. Graveley, M. B. Gerstein, L. A. Pennacchio, M. P. Snyder, B. E.
Bernstein, B. Wold, R. C. Hardison, T. R. Gingeras, J. A. Stamatoyannopoulos, Z. Weng,
and ENCODE Project Consortium. 2020. "Expanded encyclopaedias of DNA elements in
the human and mouse genomes." Nature 583 (7818):699-710. doi: 10.1038/s41586-020-
2493-4.
Moran, S., C. Arribas, and M. Esteller. 2016. "Validation of a DNA methylation microarray for
850,000 CpG sites of the human genome enriched in enhancer sequences." Epigenomics
8 (3):389-99. doi: 10.2217/epi.15.114.
Nakagawa, T., M. Yoneda, M. Higashi, Y. Ohkuma, and T. Ito. 2018. "Enhancer function regulated
by combinations of transcription factors and cofactors." Genes Cells 23 (10):808-821. doi:
10.1111/gtc.12634.
Ni, W., A. A. Perez, S. Schreiner, C. M. Nicolet, and P. J. Farnham. 2020. "Characterization of
the ZFX family of transcription factors that bind downstream of the start site of CpG island
promoters." Nucleic Acids Res. doi: 10.1093/nar/gkaa384.
Nikpour, P., M. Emadi-Baygi, F. Mohammad-Hashem, M. R. Maracy, and S. Haghjooy-
Javanmard. 2012. "Differential expression of ZFX gene in gastric cancer." J Biosci 37
(1):85-90. doi: 10.1007/s12038-011-9174-2.
Nishimura, K., T. Fukagawa, H. Takisawa, T. Kakimoto, and M. Kanemaki. 2009. "An auxin-based
degron system for the rapid depletion of proteins in nonplant cells." Nat Methods 6
(12):917-22. doi: 10.1038/nmeth.1401.
Nitta, K. R., A. Jolma, Y. Yin, E. Morgunova, T. Kivioja, J. Akhtar, K. Hens, J. Toivonen, B.
Deplancke, E. E. Furlong, and J. Taipale. 2015. "Conservation of transcription factor
binding specificities across 600 million years of bilateria evolution." Elife 4. doi:
10.7554/eLife.04837.
Nolte, R. T., R. M. Conlin, S. C. Harrison, and R. S. Brown. 1998. "Differing roles for zinc fingers
in DNA recognition: structure of a six-finger transcription factor IIIA complex." Proc Natl
Acad Sci U S A 95 (6):2938-43. doi: 10.1073/pnas.95.6.2938.
North, M., C. Sargent, J. O'Brien, K. Taylor, J. Wolfe, N. A. Affara, and M. A. Ferguson-Smith.
1991. "Comparison of ZFY and ZFX gene structure and analysis of alternative 3'
untranslated regions of ZFY." Nucleic Acids Res 19 (10):2579-86. doi:
10.1093/nar/19.10.2579.
Nowick, K., A. T. Hamilton, H. Zhang, and L. Stubbs. 2010. "Rapid sequence and expression
divergence suggest selection for novel function in primate-specific KRAB-ZNF genes."
Mol Biol Evol 27 (11):2606-17. doi: 10.1093/molbev/msq157.
Poulard, C., J. Rambaud, M. Le Romancer, and L. Corbo. 2014. "Proximity ligation assay to detect
and localize the interactions of ERα with PI3-K and Src in breast cancer cells and tumor
samples." Methods Mol Biol 1204:135-43. doi: 10.1007/978-1-4939-1346-6_12.
Raymond, F. L., and P. Tarpey. 2006. "The genetics of mental retardation." Hum Mol Genet 15
Spec No 2:R110-6. doi: 10.1093/hmg/ddl189.
Reiter, F., S. Wienerroither, and A. Stark. 2017. "Combinatorial function of transcription factors
and cofactors." Curr Opin Genet Dev 43:73-81. doi: 10.1016/j.gde.2016.12.007.
Rhie, S. K., L. Yao, Z. Luo, H. Witt, S. Schreiner, Y. Guo, A. A. Perez, and P. J. Farnham. 2018.
"ZFX acts as a transcriptional activator in multiple types of human tumors by binding
downstream of transcription start sites at the majority of CpG island promoters." Genome
Res. doi: 10.1101/gr.228809.117.
Rossi, M. J., W. K. M. Lai, and B. F. Pugh. 2018. "Simplified ChIP-exo assays." Nat Commun 9
(1):2842. doi: 10.1038/s41467-018-05265-7.
Ryan, R. F., and M. K. Darby. 1998. "The role of zinc finger linkers in p43 and TFIIIA binding to
5S rRNA and DNA." Nucleic Acids Res 26 (3):703-9. doi: 10.1093/nar/26.3.703.
Samavarchi-Tehrani, P., R. Samson, and A. C. Gingras. 2020. "Proximity Dependent Biotinylation:
Key Enzymes and Adaptation to Proteomics Approaches." Mol Cell Proteomics 19
(5):757-773. doi: 10.1074/mcp.R120.001941.
Saxonov, S., P. Berg, and D. L. Brutlag. 2006. "A genome-wide analysis of CpG dinucleotides in
the human genome distinguishes two distinct classes of promoters." Proc Natl Acad Sci
U S A 103 (5):1412-7. doi: 10.1073/pnas.0510310103.
Serra, R. W., M. Fang, S. M. Park, L. Hutchinson, and M. R. Green. 2014. "A KRAS-directed
transcriptional silencing pathway that mediates the CpG island methylator phenotype."
Elife 3:e02313. doi: 10.7554/eLife.02313.
Shou, Y., M. L. Martelli, A. Gabrea, Y. Qi, L. A. Brents, A. Roschke, G. Dewald, I. R. Kirsch, P. L.
Bergsagel, and W. M. Kuehl. 2000. "Diverse karyotypic abnormalities of the c-myc locus
associated with c-myc dysregulation and tumor progression in multiple myeloma." Proc
Natl Acad Sci U S A 97 (1):228-33. doi: 10.1073/pnas.97.1.228.
Snyder, M. P., T. R. Gingeras, J. E. Moore, Z. Weng, M. B. Gerstein, B. Ren, R. C. Hardison, J.
A. Stamatoyannopoulos, B. R. Graveley, E. A. Feingold, M. J. Pazin, M. Pagan, D. A.
Gilchrist, B. C. Hitz, J. M. Cherry, B. E. Bernstein, E. M. Mendenhall, D. R. Zerbino, A.
Frankish, P. Flicek, R. M. Myers, and ENCODE Project Consortium. 2020. "Perspectives
on ENCODE." Nature 583 (7818):693-698. doi: 10.1038/s41586-020-2449-8.
Stine, Z. E., Z. E. Walton, B. J. Altman, A. L. Hsieh, and C. V. Dang. 2015. "MYC, Metabolism,
and Cancer." Cancer Discov 5 (10):1024-39. doi: 10.1158/2159-8290.CD-15-0507.
Swamynathan, S. K. 2010. "Kruppel-like factors: three fingers in control." Hum Genomics 4
(4):263-70. doi: 10.1186/1479-7364-4-4-263.
Tadepally, H. D., G. Burger, and M. Aubry. 2008. "Evolution of C2H2-zinc finger genes and
subfamilies in mammals: species-specific duplication and loss of clusters, genes and
effector domains." BMC Evol Biol 8:176. doi: 10.1186/1471-2148-8-176.
Taylor-Harris, P., S. Swift, and A. Ashworth. 1995. "Zfy1 encodes a nuclear sequence-specific
DNA binding protein." FEBS Lett 360 (3):315-9. doi: 10.1016/0014-5793(95)00141-u.
Tischfield, M. A., T. M. Bosley, M. A. Salih, I. A. Alorainy, E. C. Sener, M. J. Nester, D. T. Oystreck,
W. M. Chan, C. Andrews, R. P. Erickson, and E. C. Engle. 2005. "Homozygous HOXA1
mutations disrupt human brainstem, inner ear, cardiovascular and cognitive
development." Nat Genet 37 (10):1035-7. doi: 10.1038/ng1636.
Tomlins, S. A., A. Bjartell, A. M. Chinnaiyan, G. Jenster, R. K. Nam, M. A. Rubin, and J. A.
Schalken. 2009. "ETS gene fusions in prostate cancer: from discovery to daily clinical
practice." Eur Urol 56 (2):275-86. doi: 10.1016/j.eururo.2009.04.036.
Tsukahara, T., Y. Nabeta, S. Kawaguchi, H. Ikeda, Y. Sato, K. Shimozawa, K. Ida, H. Asanuma,
Y. Hirohashi, T. Torigoe, H. Hiraga, S. Nagoya, T. Wada, T. Yamashita, and N. Sato. 2004.
"Identification of human autologous cytotoxic T-lymphocyte-defined osteosarcoma gene
that encodes a transcriptional regulator, papillomavirus binding factor." Cancer Res 64
(15):5442-8. doi: 10.1158/0008-5472.CAN-04-0522.
Tupler, R., G. Perini, and M. R. Green. 2001. "Expressing the human genome." Nature 409
(6822):832-3. doi: 10.1038/35057011.
Ullmark, T., G. Montano, and U. Gullberg. 2018. "DNA and RNA binding by the Wilms' tumour
gene 1 (WT1) protein +KTS and -KTS isoforms-From initial observations to recent global
genomic analyses." Eur J Haematol 100 (3):229-240. doi: 10.1111/ejh.13010.
van der Werf, I. M., A. Van Dijck, E. Reyniers, C. Helsmoortel, A. A. Kumar, V. M. Kalscheuer, A.
P. de Brouwer, T. Kleefstra, H. van Bokhoven, G. Mortier, S. Janssens, G. Vandeweyer,
and R. F. Kooy. 2017. "Mutations in two large pedigrees highlight the role of ZNF711 in X-
linked intellectual disability." Gene 605:92-98. doi: 10.1016/j.gene.2016.12.013.
Vandenbon, A., Y. Kumagai, M. Lin, Y. Suzuki, and K. Nakai. 2018. "Waves of chromatin
modifications in mouse dendritic cells in response to LPS stimulation." Genome Biol 19
(1):138. doi: 10.1186/s13059-018-1524-z.
Vaquerizas, J. M., S. K. Kummerfeld, S. A. Teichmann, and N. M. Luscombe. 2009. "A census of
human transcription factors: function, expression and evolution." Nat Rev Genet 10
(4):252-63. doi: 10.1038/nrg2538.
Vicente, C., A. Conchillo, M. A. García-Sánchez, and M. D. Odero. 2012. "The role of the GATA2
transcription factor in normal and malignant hematopoiesis." Crit Rev Oncol Hematol 82
(1):1-17. doi: 10.1016/j.critrevonc.2011.04.007.
Vihervaara, A., F. M. Duarte, and J. T. Lis. 2018. "Molecular mechanisms driving transcriptional
stress responses." Nat Rev Genet 19 (6):385-397. doi: 10.1038/s41576-018-0001-6.
Weedon, M. N. 2007. "The importance of TCF7L2." Diabet Med 24 (10):1062-6. doi:
10.1111/j.1464-5491.2007.02258.x.
Weirauch, M. T., A. Yang, M. Albu, A. G. Cote, A. Montenegro-Montero, P. Drewe, H. S.
Najafabadi, S. A. Lambert, I. Mann, K. Cook, H. Zheng, A. Goity, H. van Bakel, J. C.
Lozano, M. Galli, M. G. Lewsey, E. Huang, T. Mukherjee, X. Chen, J. S. Reece-Hoyes, S.
Govindarajan, G. Shaulsky, A. J. M. Walhout, F. Y. Bouget, G. Ratsch, L. F. Larrondo, J.
R. Ecker, and T. R. Hughes. 2014. "Determination and inference of eukaryotic
transcription factor sequence specificity." Cell 158 (6):1431-1443. doi:
10.1016/j.cell.2014.08.009.
Wen, H., Y. Li, Y. Xi, S. Jiang, S. Stratton, D. Peng, K. Tanaka, Y. Ren, Z. Xia, J. Wu, B. Li, M. C.
Barton, W. Li, H. Li, and X. Shi. 2014. "ZMYND11 links histone H3.3K36me3 to
transcription elongation and tumour suppression." Nature 508 (7495):263-8. doi:
10.1038/nature13045.
Weng, H., X. Wang, M. Li, X. Wu, Z. Wang, W. Wu, Z. Zhang, Y. Zhang, S. Zhao, S. Liu, J. Mu,
Y. Cao, Y. Shu, R. Bao, J. Zhou, J. Lu, P. Dong, J. Gu, and Y. Liu. 2015. "Zinc finger X-
chromosomal protein (ZFX) is a significant prognostic indicator and promotes cellular
malignant potential in gallbladder cancer." Cancer Biol Ther 16 (10):1462-70. doi:
10.1080/15384047.2015.1070994.
Williams, A. J., S. C. Blacklow, and T. Collins. 1999. "The zinc finger-associated SCAN box is a
conserved oligomerization domain." Mol Cell Biol 19 (12):8526-35. doi:
10.1128/mcb.19.12.8526.
Wu, S., X. Y. Lao, T. T. Sun, L. L. Ren, X. Kong, J. L. Wang, Y. C. Wang, W. Du, Y. N. Yu, Y. R.
Weng, J. Hong, and J. Y. Fang. 2013. "Knockdown of ZFX inhibits gastric cancer cell
growth in vitro and in vivo via downregulating the ERK-MAPK pathway." Cancer Lett 337
(2):293-300. doi: 10.1016/j.canlet.2013.04.003.
Yabe, H., T. Tsukahara, S. Kawaguchi, T. Wada, N. Sato, and H. Morioka. 2008. "Overexpression
of papillomavirus binding factor in Ewing's sarcoma family of tumors conferring poor
prognosis." Oncol Rep 19 (1):129-34.
Yoder, N. C., and K. Kumar. 2006. "Selective protein-protein interactions driven by a
phenylalanine interface." J Am Chem Soc 128 (1):188-91. doi: 10.1021/ja055494k.
Zabidi, M. A., and A. Stark. 2016. "Regulatory Enhancer-Core-Promoter Communication via
Transcription Factors and Cofactors." Trends Genet 32 (12):801-814. doi:
10.1016/j.tig.2016.10.003.
Zandarashvili, L., A. Esadze, D. Vuzman, C. A. Kemme, Y. Levy, and J. Iwahara. 2015. "Balancing
between affinity and speed in target DNA search by zinc-finger proteins via modulation of
dynamic conformational ensemble." Proc Natl Acad Sci U S A 112 (37):E5142-9. doi:
10.1073/pnas.1507726112.
Zheng, R., and G. A. Blobel. 2010. "GATA Transcription Factors and Cancer." Genes Cancer 1
(12):1178-88. doi: 10.1177/1947601911404223.
Zhou, Y., Z. Su, Y. Huang, T. Sun, S. Chen, T. Wu, G. Chen, X. Xie, B. Li, and Z. Du. 2011. "The
Zfx gene is expressed in human gliomas and is important in the proliferation and apoptosis
of the human malignant glioma cell line U251." J Exp Clin Cancer Res 30:114. doi:
10.1186/1756-9966-30-114.
Zhuang, L., Y. Jang, Y. K. Park, J. E. Lee, S. Jain, E. Froimchuk, A. Broun, C. Liu, O. Gavrilova,
and K. Ge. 2018. "Depletion of Nsd2-mediated histone H3K36 methylation impairs
adipose tissue development and function." Nat Commun 9 (1):1796. doi: 10.1038/s41467-
018-04127-6.
Abstract
C2H2 zinc finger proteins (ZNFs) constitute the largest transcription factor (TF) family in humans, yet they are the least studied because of the nature of their protein structures, the lack of high-quality antibodies, and their low levels of expression. My research has focused on the zinc finger protein, X-linked (ZFX) family of C2H2 transcription factors, which consists of three members: ZFX, zinc finger protein, Y-linked (ZFY), and zinc finger protein 711 (ZNF711). Although their protein structures suggest that ZFX, ZFY, and ZNF711 are transcriptional regulators, the mechanisms by which they influence transcription have not yet been elucidated. In this study, I mapped and compared the genome-wide DNA binding sites of the ZFX family in different cell types. I created CRISPR-mediated ZFX and/or ZNF711 knockouts in female HEK293T cells (which naturally lack ZFY) and found that these TFs function as transcriptional activators and critical regulators of cell proliferation. I also identified the zinc fingers responsible for DNA binding and transcriptional activation, identified interacting partners of ZFX, and explored the epigenetic mechanisms of the ZFX family. These findings provide important insights into transcriptional regulation in human cells by members of the large but under-studied family of C2H2 ZNFs.
Asset Metadata
Creator
Ni, Weiya Stephanie (author)
Core Title
Characterization of the ZFX family of transcription factors that bind downstream of the start site of CpG island promoters
School
Keck School of Medicine
Degree
Doctor of Philosophy
Degree Program
Cancer Biology and Genomics
Publication Date
12/07/2020
Defense Date
10/21/2020
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
C2H2 zinc finger proteins,CpG island promoters,CRISPR knockout,OAI-PMH Harvest,transcription factors,ZFX
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Farnham, Peggy J. (committee chair)
Creator Email
stephanie.ni.w@gmail.com,swni67@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-413150
Unique identifier
UC11667453
Identifier
etd-NiWeiyaSte-9184.pdf (filename),usctheses-c89-413150 (legacy record id)
Legacy Identifier
etd-NiWeiyaSte-9184.pdf
Dmrecord
413150
Document Type
Dissertation
Rights
Ni, Weiya Stephanie
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA