MSI2 Level Alters Histone Transcription Rate in HepG2 Cells
by
Ahmed Farooq
A Thesis Presented to the
FACULTY OF THE USC KECK SCHOOL OF MEDICINE
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
(MOLECULAR MICROBIOLOGY AND IMMUNOLOGY)
December 2021
Copyright 2021 Ahmed Farooq
Acknowledgements:
Thank you to Dr. Keigo Machida for allowing me to join his lab in 2020, and guiding me
throughout 2020 and 2021. Thank you to my committee members, Dr. Ou and Dr. Zandi, for
graciously joining my thesis committee. Thank you to the members of Dr. Machida’s lab, former and
current, who helped with and gave advice regarding the completion of my thesis project. Dr.
Da-Wei Yeh, Dr. Hye-Yeon Choi, and Juan Carlos Hernandez, in particular, were the most
experienced members of the Machida Lab who helped all other members towards completing
their goals in the lab. Thank you to UCSD collaborator Dr. Soohwan Oh who, along with Dr. Da-
Wei Yeh, started the experiments in 2019 that would serve as the core of my thesis project. Also,
thank you to the USC Senior Bioinformatics Specialist, Meng Li, for meeting with our lab in
January 2021 to help us analyze the bioinformatics data for this project. I would like to thank the
USC MMI department, as well as my fellow MMI class of 2021 cohort, for making my time at
USC a memorable one. Finally, and most importantly, I would like to thank my family for
everything. They are the most important people in my life, and I would not have survived
graduate school without them.
Table of Contents
Acknowledgements
List of Tables
List of Figures
Abstract
1. Introduction
2. Materials & Methods
2.1: Cell Lines
2.2: Statistical Analysis
2.3: PRO-seq Experimental Protocol
2.4: Bioinformatics
3. Results
3.1: There Were No Statistically Significant Results in Hep3B and Huh7 Cells
3.2: There Were Statistically Significant Results in Both HepG2 Data Sets
3.3: Majority of Significantly Altered Histone Genes Were Histone Variants
3.4: Less than Half of 36 Histone Genes Were Bound by P53 in Non-HepG2 Cells
3.5: Limited Binding of Histone Genes by P53 and MYC in HepG2 Data Sets
4. Discussion
References
List of Tables:
Table 1: Statistically Significant Results in Rate of Transcription Between Control vs. MSI2
Over-Expression in HepG2 Cells
Table 2: Top 50 Most Statistically Significant Results in Rate of Transcription Between Control
(sh-Scrambled) vs. MSI2 Knock-Down (shMSI2) in HepG2 Cells
Table 3: 36 Statistically Significant Histone Genes (q-value ≤ 0.05) in Rate of Transcription
Between Control (sh-Scrambled) vs. MSI2 Knock-Down (shMSI2) in HepG2 Cells
Table 4: Information about 36 Histone Genes with Statistically Significant Rate of Transcription
Between sh-Scrambled vs. shMSI2 HepG2 Cells
List of Figures:
Figure 1: P53 Baer Hub ChIP-seq Results in Non-HepG2 Cells for P53 Binding of 36 Histone
Genes’ Promoter Regions
Figure 2: MYC ChIP-seq Results in HepG2 Cells for MYC Binding of 36 Histone Genes’
Promoter Regions
Figure 3: P53 ChIP-seq Results in HepG2 Cells for P53 Binding of 36 Histone Genes’ Promoter
Regions
Abstract:
Hepatocellular carcinoma (HCC) is the most common form of primary liver cancer around the world.
Musashi RNA-Binding Protein 2, or MSI2, has been shown to contribute to HCC tumorigenesis
both in vitro and in vivo. Utilizing Homer to analyze PRO-seq data, we were able to see the
changes in rate of transcription when MSI2 was over-expressed and knocked down in three HCC
cell lines. In HepG2 cells, the transcription rates of thirty-six histone genes were significantly
altered when MSI2 was knocked down. Twenty-eight of these histone genes were histone variants. We
hypothesized that p53 and MYC were competing for histone promoter regions, affecting
transcription rate. To determine if this was the case, p53 and MYC ChIP-seq binding data were
analyzed. However, p53 and MYC were not shown to bind to histone promoter regions in HepG2
cells. While the mechanism was not determined, the connection of histone variants to MSI2 and HCC
is worth further research.
1: Introduction
Hepatocellular carcinoma is one of the major causes of cancer deaths worldwide. It is the
most common form of primary liver cancer and is often associated with hepatitis B or
hepatitis C infection in conjunction with lifestyle diseases such as obesity and
alcohol-damaged liver [1]. Liver cancer as a whole is more fatal worldwide than in the
United States [2]. In 2021, an estimated 42,230 new cases of primary liver cancer and
intrahepatic bile duct cancer are expected to be diagnosed in the United States, and 30,230
patients are expected to die [2]. Worldwide, 800,000 new cases of liver cancer will be
diagnosed, with 700,000 deaths [2]. This makes liver cancer a leading cause of cancer death
worldwide, and it is more common in Southeast Asia and Sub-Saharan Africa [2]. Regarding
primary liver cancer, hepatocellular carcinoma accounted for 90 percent of cases worldwide as of
2019 [1].
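As a rough check on the comparison between the United States and the world, the ratio of projected deaths to projected new cases can be computed from the figures quoted from [2]. The short Python sketch below is an illustration only. The variable and function names are hypothetical, and a deaths-to-new-cases ratio within a single year is only a crude proxy for fatality, not a true case-fatality or survival rate. The United States figures also include intrahepatic bile duct cancer.

```python
# Projected 2021 figures as quoted in the text from reference [2].
us_new_cases = 42_230
us_deaths = 30_230
world_new_cases = 800_000
world_deaths = 700_000


def deaths_per_new_case(deaths: int, new_cases: int) -> float:
    """Crude proxy for fatality: projected deaths divided by projected new cases."""
    return deaths / new_cases


us_ratio = deaths_per_new_case(us_deaths, us_new_cases)
world_ratio = deaths_per_new_case(world_deaths, world_new_cases)

print(f"United States: {us_ratio:.1%}")
print(f"Worldwide: {world_ratio:.1%}")
```

Under these assumptions the ratio is roughly 71.6 percent for the United States and 87.5 percent worldwide, which is consistent with liver cancer being more fatal worldwide than in the United States.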
People living in Sub-Saharan Africa and Southeast Asia are more likely to suffer from
liver cancer due to being exposed to known causes of liver cancer, such as the hepatitis C virus
[1]. However, the rate of HCC incidence in Eastern and Western Sub-Saharan Africa has
decreased by around 20% due to increased vaccination against hepatitis B virus
[1]. Elsewhere, in 2016, 30,000 people in Japan died of hepatocellular carcinoma [3].
The cause of hepatocellular carcinoma incidence in Japan is mostly attributed to hepatitis C virus
infection [3]. Excessive alcohol consumption is another factor that, when acting together with
hepatitis B or hepatitis C infection, can induce hepatocellular carcinoma [1]. This suggests
that countries with higher rates of hepatitis B and C infection would suffer a greater burden of
hepatocellular carcinoma from lifestyle factors such as excessive alcohol consumption and obesity
than countries that do not have elevated hepatitis B and C infection but still consume
alcohol and have high-fat diets [1]. Alcohol consumption and liver cancer are associated because
the liver is the main organ involved in alcohol metabolism [4].
The liver metabolizes alcohol and acetaldehyde via alcohol dehydrogenase and aldehyde
dehydrogenase, respectively [4]. Alcohol dehydrogenase first converts ethanol into acetaldehyde
[4]. Then, acetaldehyde is converted into acetate via aldehyde dehydrogenase [4]. Acetaldehyde
is a known cancer-causing agent, or carcinogen, and the increased consumption of alcohol would
increase the amount of alcohol that is processed into acetaldehyde, as well as increase the
presence of reactive oxygen species [4]. Along with the elevated level of acetaldehyde and
reactive oxygen species, excessive alcohol consumption can cause livers to suffer from scarring
referred to as cirrhosis, a form of fibrosis [5]. Alcohol-induced cirrhosis of the liver is one of the
most prevalent risk factors for developing hepatocellular carcinoma, along with hepatitis B and C
infection, and steatohepatitis [1]. Hepatocellular carcinoma, however, can develop without
cirrhosis [6].
Steatohepatitis occurs when the liver is inflamed, with fat accumulation in the liver due to
a high-fat diet [6]. Hepatocellular carcinoma can occur with hepatitis C infection and
steatohepatitis, without alcohol-induced cirrhosis, making obesity and diabetes important risk
factors for the development of hepatocellular carcinoma [6]. In non-cirrhotic hepatocellular
carcinoma, alcohol is involved in 21.77 percent, hepatitis B viral infection in 30.60 percent, and
hepatitis C viral infection in 14.36 percent of cancers [7]. In cirrhotic hepatocellular carcinoma,
alcohol is involved in 30 percent, hepatitis B infection in 41.65 percent, and hepatitis C infection
in 44.18 percent of cancers [7]. Non-alcoholic steatohepatitis (NASH) falls under the broader
term non-alcoholic fatty liver disease (NAFLD), which is leading to an increase in
non-alcoholic hepatocellular carcinoma incidence [6]. In 2012, the annual rate of hepatocellular
carcinoma incidence in the United States due to NASH-related cirrhosis was 2.6 to 12.8 percent
[6]. However, NASH, NAFLD, hepatitis B infection, hepatitis C infection, germ-line
mutations, inherited diseases, and exposure to genotoxic substances like aflatoxin B1 can also
lead to non-cirrhotic hepatocellular carcinoma [7]. Overall, around 20 percent of hepatocellular
carcinoma cases develop in non-cirrhotic livers [7].
Hepatocellular carcinoma is a primary liver cancer, meaning that the origin of this cancer
is inside of the liver [8]. Secondary cancers, however, originate in a primary location and then
metastasize to a secondary location; an example would be primary cancer in the liver
metastasizing to the lymph nodes, the secondary cancer location [8]. In order for metastasis to
occur, cells must first escape from the origin of the tumor in order to reach circulation through
the vascular or lymphatic systems via the process of intravasation [9]. In hepatocellular
carcinoma’s case, this site of origin would be the liver. In order to metastasize to a distant site,
tumor cells must break through the basement membrane and extracellular matrix of the tumor
origin [9]. Then, entering circulation, the cancer cells travel to a distant site, such as the lymph
nodes, and leave circulation via extravasation [9]. Invasion of the distant site then occurs, and the
process of metastasis is complete [9]. The avoidance of metastases or micro-metastases is
paramount in order to increase cancer patient survival [9]. This is why the understanding of
metastases and other cancer processes is important.
Although a source of controversy, hepatocellular carcinoma is thought to originate from
hepatocytes, which are liver epithelial cells [10]. Epithelial-mesenchymal transition plays an
important role in metastasis of cancers [10]. Epithelial cells like hepatocytes are normally
structured with other epithelial cells via tight junctions, adherens junctions, and desmosomes
[10]. During embryonic morphogenesis, epithelial cells can undergo the reversible process
known as the epithelial-mesenchymal transition program [10]. When this program occurs,
epithelial cells gain the ability to migrate, losing the junctions that rigidly attach epithelial cells
together [10]. Epithelial-mesenchymal transition more importantly occurs during cancer
progression and is vital for the occurrence of invasion and metastasis [10]. Extracellular signals
that are sent to epithelial cells from the tumor microenvironment are thought to produce
carcinoma cells, fully able to invade and metastasize [10]. In order for invasion to occur, the
basement membrane surrounding the tumor and the vasculature required for intravasation and
extravasation must be broken through [10]. E-cadherin is thought to be the gatekeeper that
maintains the epithelial state of epithelial cells [10]. The mRNA and protein levels of
molecules such as e-cadherin, alpha-catenin, and gamma-catenin are down-regulated during the
process of epithelial-mesenchymal transition [10]. Several different signaling pathways such as
Wnt, transforming growth factor beta (TGF-β), and Notch are implicated in epithelial-to-
mesenchymal transition [10]. TGF-β, in particular, is thought to be the main signaling pathway
inducing epithelial-mesenchymal transition through transcription factors such as Zeb1/2, Twist1,
and Snail1/2 [10]. The three aforementioned transcription factors are capable of down-regulating
e-cadherin through various processes such as binding to the promoter of e-cadherin, repressing
transcription, or through a double-negative feedback loop [10].
Hepatocellular carcinoma cells over-expressing NANOG are shown to be undergoing
active epithelial-mesenchymal transition [11]. NANOG is a transcription factor that is involved
in maintaining stemness and forming pluripotent cells via the reprogramming of differentiated
cells [11]. NANOG levels were shown to be lowest in non-cancerous liver tissue, higher in
non-invasive hepatocellular carcinoma tissue, and highest in invasive and metastatic
hepatocellular carcinoma tissue [11]. Increased NANOG expression in hepatocellular carcinoma
cells also induced epithelial-mesenchymal transition, decreasing epithelial markers and
increasing mesenchymal markers [11]. NANOG was also found to regulate epithelial-
mesenchymal transition through the Nodal/SMAD3 signaling pathway [11]. NANOG first
activates Nodal, via two NANOG binding motifs, which then allows Nodal to promote SMAD3
phosphorylation as well as Snail, or SNAI1, expression [11]. Overall, NANOG increased
invasion, drug resistance, sphere formation, and soft agar colonization in hepatocellular
carcinoma cells [11].
NANOG is involved in maintaining stemness and is mainly expressed in embryonic
pluripotent cells [11]. There are some hepatocellular carcinoma tumors that contain tumorigenic
cancer stem cells [12]. These cancer stem cells tend to resemble normal stem
cells in both cellular heterogeneity and the expression of genes associated with embryonic
stem cells, such as NANOG [12]. Cancer stem cells are capable of differentiation as well
as self-renewal [13]. Cancer stem cells also carry stem cell surface markers such as CD133,
CD44, and CD90 [13]. These cancer stem cells can be isolated in vitro for experimentation by
selecting for the aforementioned surface markers [13]. Cancer stem cells are also less susceptible
to drugs and treatments targeting differentiated tumor cells [13]. While differentiated tumor cells
are eliminated with drug treatment, cancer stem cells can continue to self-renew, or divide into
other cancer stem cells [13]. These cancer stem cells can then differentiate into tumor cells, such
as hepatocellular carcinoma cells, thus resisting the effects of cancer treatments and drugs [13].
Tumor progression and metastasis can thus continue unabated [13].
As previously mentioned, NANOG is expressed in a small sub-population of cancer stem
cells in some hepatocellular carcinoma tumors [12]. These NANOG-positive cancer stem cells
are associated with poor clinical outcome in hepatocellular carcinoma patients [12]. NANOG-
positive cancer stem cells exhibited resistance to known hepatocellular carcinoma treatments
such as sorafenib and cisplatin [12]. Knocking down NANOG in these cancer stem cells was also
shown to decrease self-renewal and the expression of stem cell-associated genes, as well as to
increase the expression of genes associated with mature hepatocytes [12]. If even a minuscule
number of tumor cells survive chemotherapy or some other cancer therapy and become quiescent,
the cancer can return after remission due to the presence of therapy-resistant cancer stem cells.
Because cancers like hepatocellular carcinoma are so difficult to treat, transcription factors
like NANOG are worthwhile research targets.
While transcription factors like NANOG are important research targets for easing the
difficulty of treating tumors that contain cancer stem cells, the protein Musashi RNA
Binding Protein 2 (MSI2) is also implicated in several different cancers, including hepatocellular
carcinoma. MSI2 binds to the RNA of gene targets such as NUMB, PTEN, MYC, and TGF-β
[14]. All of the aforementioned targets are involved in oncogenic signaling pathways [14]. The
effects of MSI2 on these gene targets are all post-transcriptional [14]. MSI2 was found to be
over-expressed in different cancers such as lung cancer, pancreatic cancer, and leukemia [14].
Elevated level of Musashi protein expression was associated with low differentiation status,
metastasis, lymph node invasion, as well as the presence of stem cell markers such as CD133
[12, 14]. An increased expression of MSI2 was also found to be involved in hepatocellular
carcinoma due to its binding to MYC mRNA [15]. RNA-binding proteins are seen as being
difficult to target; however, initial efforts to develop MSI1 and MSI2 inhibitors are promising
[14].
MSI1 and MSI2 are around 75 percent identical in terms of amino acid sequence
[14]. After the role of the msi, or Musashi, gene was identified in Drosophila, it was found that
human Musashi proteins were involved in maintenance of hematopoietic stem cells [14].
Thereafter, it came as no surprise that aberrant MSI1 and MSI2 function was linked with cancer
progression [14]. This is because genes associated with stem and progenitor cell capacity are
historically linked with cancer progression due to differentiation, self-renewal, and cell division
[14]. This was previously mentioned when discussing NANOG, due to NANOG’s role in
maintaining stemness in the cancer stem cell subpopulation of tumor cells [12].
Research done to ascertain the role of Musashi proteins in Chronic Myeloid Leukemia
(CML) found that during the event of blast crisis in mouse models of CML, in which 10-19
percent of cells in the blood or bone marrow were abnormal white blood cells, MSI2 levels were
also elevated [14, 16, 17]. It was also shown that expression of fusion protein NUP98-HOXA9
during blast crisis induced MSI2 expression [14]. MSI2 expression was then conducive to
NUMB suppression as well as the cellular de-differentiation required for blast crisis to occur
[14]. Along with CML, MSI2 was also shown to be a prognostic biomarker for acute myeloid
leukemia, bladder cancer, hepatocellular carcinoma, pancreatic cancer, and myelodysplastic
syndrome through a variety of experimental methods utilizing patient and tissue samples [14].
MSI2 was found to be over-expressed in liver cancer stem cells, which were identified by the
presence of the noted liver cancer stem cell markers CD133 and OV6 [18]. When CD133+ and OV6+
populations of liver cancer stem cells were isolated, MSI2 was over-expressed in these
populations [18]. Reverse transcription polymerase chain reaction (RT-PCR) showed
elevated expression of stemness genes like NANOG, OCT-4, and SOX2 in MSI2-expressing cells
relative to control cells [18]. This
would indicate that MSI2 expression works in conjunction with an increase in stemness [18].
MSI2 also increased sphere formation in the MSI2-expressing cells relative to the control cells
[18]. The researchers treated the MSI2 over-expression cells with sorafenib and found that
the cells had increased survival, increased colony formation, and decreased caspase-3 expression
[18]. Increased expression of MSI2 could indicate an increase in cancer stem cell features such as
self-renewal ability and chemoresistance in hepatocellular carcinoma, hence the increased
survival of MSI2 over-expression cells treated with sorafenib [18].
In terms of HCC specimen analysis, MSI2 expression was found to be elevated in 85% of
the 66 human hepatocellular carcinoma specimen tissues relative to non-tumor tissue [18]. MSI2
over-expression in vivo was also investigated utilizing xenograft nude mice with control or MSI2
over-expression cells [18]. Tumors developed in these xenograft mice; MSI2 over-expression
significantly increased tumor growth based on tumor growth curves
and weights [18]. Knockdown of MSI2 also decreased tumor burden in these mice [18]. Utilizing
immunohistochemical staining, it was found that MSI2 over-expression enhanced the expression
of the stemness marker OV6 as well as the proliferation marker KI-67 in xenograft tumors [18].
Knocking down MSI2 decreased the expression of OV6 and KI-67 in the xenograft tumors [18].
As was the case in vitro, it was found in vivo that the expression of the pluripotency
transcription factors NANOG, OCT-4, and SOX2 increased in MSI2 over-expression cells
relative to control cells [18]. Knocking down MSI2 in vivo decreased these pluripotency
transcription factors [18]. These results indicate MSI2’s involvement in liver cancer stem cell
tumor initiation as well as tumor growth enhancement [18]. Finally, it was found that LIN28A,
one of the two paralogs of the oncofetal RNA-binding protein LIN28, was highly expressed in
liver cancer stem cells along with MSI2 [18]. LIN28 is involved in stem cell maintenance,
tumorigenesis, and tissue development [19]. LIN28 paralogs LIN28A and LIN28B were shown
to inhibit let-7 family microRNAs (miRNAs), which relieves the inhibition of let-7 targets
including Ras, PI3K/AKT, and MYC [20, 21]. MSI2 over-expression increased LIN28A
expression and knockdown of MSI2 decreased LIN28A expression [18]. When LIN28A was
knocked-down in HCC cells, over-expression of MSI2 did not increase resistance to sorafenib
nor did it decrease caspase-3-dependent apoptosis [18]. Over-expression of MSI2 in LIN28A-
silenced HCC cells also did not increase stemness factors of NANOG, OCT-4, and SOX2 [18].
These results indicate that chemoresistance and self-renewal of cancer stem cells are driven by
MSI2, upstream of LIN28A [18].
As previously mentioned, it was found that MSI2 binds to MYC mRNA [15]. The MYC
family includes L-MYC, N-MYC, v-MYC, and c-MYC; c-MYC is also known simply as MYC, and the
two names refer to the same oncogene [22]. The c-MYC oncogene is known as cellular MYC, and the
v-MYC oncogene derives from avian myelocytomatosis virus strain 29 [22]. N-MYC expression is
tissue restricted, whereas c-MYC is located in the cell [22]. In
hepatocellular carcinoma cells, c-MYC is routinely over-expressed and is overall one of the most
over-expressed genes in human cancers [23]. Increased expression of c-MYC in hepatocellular
carcinoma leads to a more aggressive phenotype of cancer [23]. It was also shown via genetic
analyses that c-MYC over-expression is present in 70 percent of viral and alcohol-related
hepatocellular carcinoma [23, 24].
The proto-oncogene MYC is highly regulated through a variety of means, and
participates in a variety of growth-promoting signal transduction pathways [22]. WNT and TGF-
β are two of the known pathways that involve MYC eliciting cell growth and proliferation [22].
The WNT pathway, for example, is involved in the regulation of beta-catenin; when beta-catenin
is no longer inhibited by APC, due to the loss of APC, beta-catenin is able to bind and
transactivate MYC [22]. This turns MYC into a constitutively active, or always active, form [22].
When MYC becomes
dysregulated through various processes, such as loss of upstream regulators like APC,
constitutively active MYC is able to induce cell growth and proliferation via binding to target
genes’ DNA sequences [22]. MYC can also be inhibited through checkpoint regulation via p53
and Arf [22].
MYC, being a proto-oncogene, can become mutated into an oncogene. When this occurs,
the full tumorigenic effect of MYC is present. When MYC is able to bypass the various cellular
checkpoints halting its progression from a proto-oncogene into an oncogene, numerous events
will occur [25]. Fortunately, when MYC is over-expressed in normal cells, apoptosis or
proliferative arrest (senescence) can occur [25]. However, if a secondary event occurs after MYC
over-expression, such as the loss of a tumor suppressor like p53, tumorigenesis can occur [25].
Tumorigenesis then transforms normal cells into tumor cells [25]. Over-expression of MYC can
have different effects based on the type of cell in which the over-expression occurs. If MYC is
over-expressed in an embryonic liver, cellular proliferation occurs [25]. If MYC is instead
over-expressed in an adult liver, polyploidy occurs due to cellular growth
without mitosis [25]. While cellular growth in the adult liver leads to polyploidy,
circumstances such as a partial hepatectomy or exposure to a liver toxin can induce increased
expression of MYC in the adult hepatocytes [25]. Loss of the tumor suppressor and
participant in cell cycle regulation p53 can further induce cellular proliferation and tumorigenesis
in adult hepatocytes when MYC is over-expressed [25]. Overall, MYC-induced cellular
proliferation and tumorigenesis depends on the type of cell involved as well as whether certain
genes, like p53, are expressed or inhibited [25].
It was found that the RNA-binding protein MSI2 binds to the RNA of MYC in
hepatocellular carcinoma cells [15]. Specifically, MSI2 binds to
the internal ribosome entry site (IRES) in the 5’ end of the MYC RNA [15]. This was found to
occur in Huh7 HCC cells [15]. An IRES allows translation to occur in a cap-independent manner [15].
This form of cap-independent translation through the IRES is most commonly seen in viruses [26].
Binding of MSI2 to the MYC RNA had the effect of increasing the amount of translation of
MYC RNA into MYC protein [15]. As previously mentioned, the over-expression of MYC is the
start of the possible transformation of normal cells into cancer cells depending on loss of certain
genes like p53 [25]. MSI2 binding to MYC can therefore lead to tumorigenesis in hepatocellular
carcinoma in vitro as well as in vivo [15].
The IRES is located in the 5’ end of mRNA [15]. In normal mRNA translation, a 5’ m7G
cap is added during RNA synthesis and is required for cap-dependent translation to occur [26].
This cap is recognized by eukaryotic initiation factor 4E (eIF4E) [26]. This factor is part of the
eIF4F complex, which also contains eukaryotic initiation factors eIF4G and eIF4A [26].
The eIF4F complex then recruits the pre-assembled 43S pre-initiation complex, which includes
the small ribosomal subunit 40S along with further eukaryotic initiation factors
[26]. This complex also contains an eIF2/Met-tRNAi/GTP ternary complex [26]. The whole
complex then scans the 5’ untranslated region of the mRNA from 5’ to 3’ until it detects the
start codon, AUG, and recruits the 60S large ribosomal subunit to form the 80S ribosome so that
translation can occur [26]. This concludes the initiation of cap-dependent translation [26].
Cap-independent translation was first discovered to occur in viruses [26]. The IRES is
found in the 5’ region of mRNA, and the structure of IRES region can be further analyzed when
looking at the mRNA secondary structure [26]. In IRES cap-independent translation, one does
not require the full 80S ribosome for translation to occur; the IRES interacts, instead, with the
40S small ribosomal subunit [26]. In viruses, the IRES can directly or indirectly bind to the 40S subunit,
with or without the help of IRES-transactivating factors (ITAFs) [26].
In eukaryotes, there are two types of cellular IRES [26]. In order for the 60S
subunit to be recruited, the IRES must first recruit the 40S subunit [26]. In type 1 cellular IRES,
the IRES binds to the 40S subunit through ITAFs, or via a bridge consisting of a eukaryotic
initiation factor (eIF) [26]. This binding of the 40S subunit to the IRES can then allow
recruitment of the 60S subunit.
In type 2 cellular IRES, the 18S ribosomal RNA (rRNA) binds to cis elements on the IRES; the
40S subunit can then bind directly to the IRES [26]. After the binding of the 40S subunit, the 60S
subunit can then bind to the IRES [26]. Ultimately, the binding of the 60S subunit to the IRES
can initiate translation [26]. MSI2 is involved in MYC translation through this
cap-independent mechanism, by binding to the MYC IRES [26]. Along with MYC,
there were other MSI2 binding partners such as GRP137b and miR22hg [15]. MSI2, via its
binding, increased the amount of MYC protein and decreased miR-22 levels [15]. This increased
level of MSI2 increased colony formation, sphere formation, and tumorigenesis [15].
It was found that MSI2 levels increased during stage three and stage four, or end-stage,
HCC [15]. This makes MSI2 a viable target, as end-stage or terminal-stage HCC has a median
survival of 3-4 months [27]. While those with third and fourth stage liver cancer should not
be considered for clinical trials, targeting MSI2 in patients in conjunction with chemotherapy or
other cancer therapies during earlier stages of hepatocellular carcinoma may benefit patients
[27]. Those with third and fourth stage hepatocellular carcinoma undergo palliative care, which
involves easing the pain of those suffering from debilitating disease, as well as their families,
through monitoring and assessing physical pain, as well as other types of issues whether they be
mental or psychosocial [27]. Targeting MSI2 during the early stages of hepatocellular carcinoma
could be a form of pre-palliative care, as it would ease symptoms that need to be treated with
palliative care such as abdominal pain from enlarged tumor mass [27]. This is because, as it was
previously mentioned, targeting MSI2 in vivo was seen to decrease tumor size in mice [15, 18].
In terms of current methods for treating hepatocellular carcinoma, there are different
methods based on severity of the disease. The most well known therapies for cancer are radiation
therapy and chemotherapy. Surgical resection is done before radiation therapy or chemotherapy
[28]. Surgical resection involves removing as much of the tumor from the liver as
possible [28]. Another method is liver transplantation, which involves replacing the diseased liver
with a healthy liver from a donor [28]. The Milan criteria are used to select candidates for liver
transplantation [28]. Among 300 patients who satisfied the Milan criteria, the 10-year survival rate
was 70 percent [28]. These methods are quite successful in the early stages of hepatocellular
carcinoma, but in end-stage hepatocellular carcinoma, the liver is not functioning properly due to
liver cirrhosis in cirrhotic HCC [28]. These treatments are therefore useful for hepatocellular
carcinoma that has been diagnosed early [28]. The issue with early diagnosis of hepatocellular
carcinoma is a lack of visible symptoms; most who get tested for early-stage liver cancer are
already high risk [28].
Along with the aforementioned treatments, sorafenib was approved by the Food and Drug
Administration (FDA) in 2007 [28]. For over 10 years, sorafenib was the first and only treatment
for advanced HCC [28]. Sorafenib is a small-molecule multi-tyrosine kinase inhibitor [28]. It
blocks the actions of Raf kinase, as well as receptors such as vascular endothelial
growth factor (VEGF) receptor and platelet-derived growth factor (PDGF) receptor [28]. One
benefit of sorafenib, for example, is its inhibition of the VEGF receptor, which is involved in
the angiogenic process of HCC [28]. This is important as HCC is one of the most vascular solid
tumors [29]. Blocking VEGF receptor can thus suppress the carcinogenesis and angiogenesis in
HCC [29]. Sorafenib, however, does not target angiogenic factors that work independently of the
VEGF receptor [29].
Beyond sorafenib, clinical trials are under way to test the efficacy of a variety of
treatments. In HCC patients with unresectable cancer, in which the tumor mass cannot be
removed, first-line treatment with atezolizumab combined with bevacizumab resulted in better
survival outcomes than sorafenib [30]. A first-line treatment is the first treatment given, and
is used with standard treatments such as chemotherapy and radiation therapy. The FDA approved
this combination of drugs in 2020 for HCC patients who have not received other systemic
therapy. Atezolizumab is a monoclonal antibody used to treat a variety of cancers such as
triple-negative breast cancer, small cell lung cancer, and hepatocellular carcinoma [30].
Bevacizumab, also known as Avastin, is also used to treat diseases such as proliferative
diabetic retinopathy and diabetic macular edema via intravitreal injection [31]. Avastin acts
as a VEGF inhibitor, as VEGF is a major driving force behind the development of proliferative
diabetic retinopathy and diabetic macular edema [31]. As HCC is one of the most vascular solid
tumors, inhibiting the processes behind this vascularity is important for managing HCC, hence
the targeting of VEGF signaling by sorafenib and Avastin [29].
Another treatment for advanced HCC that has received accelerated approval from the FDA is
pembrolizumab, or Keytruda [32]. Along with the anti-angiogenic agents lenvatinib, regorafenib,
and cabozantinib, the PD-1 inhibitor Keytruda is undergoing phase III trials [32]. These agents
are to be utilized after sorafenib in a second-line setting [32]. Keytruda may also meet the
need of patients who require a first-line treatment but cannot use atezolizumab and Avastin
[32]. Keytruda acts as an inhibitor of PD-1, an immune checkpoint receptor that is expressed on
T cells [32]. When PD-1 binds to PD-L1 on a target cell, such as a cancer cell, this signals to
the T cell that the target expressing PD-L1 should not be attacked [32]. This allows tumor
cells expressing PD-L1 to evade destruction by T cells that express PD-1 [32]. Therefore,
inhibiting PD-1, and thereby blocking PD-1 binding to PD-L1, allows T cells to target and
destroy cancer cells [32].
For patients with advanced HCC who have already used sorafenib and cannot use the
combination of atezolizumab and bevacizumab, Keytruda may be a viable treatment [32].
Keynote-240 was a randomized, double-blind phase III trial of pembrolizumab in patients with
advanced HCC [32]. Between May 31, 2016 and November 23, 2017, 413 patients were randomly
assigned in this trial [32]. Keynote-240 did not reach its dual end points of improving
progression-free survival and overall survival for patients with advanced HCC [32]. At the
start of the study on May 31, 2016, there were no drugs approved to treat HCC that had
progressed on sorafenib [32]. However, during the course of the study, several treatments such
as regorafenib and nivolumab were approved [32]. The use of these drugs may have affected the
results of the trial [32]. While pembrolizumab did not reach the pre-determined statistical
significance, the results of Keynote-240 were consistent with those of a prior trial,
Keynote-224, whose patients were previously treated with sorafenib [32]. The results of
Keynote-240 most likely supported the accelerated approval of pembrolizumab for advanced HCC
by the FDA [32].
The treatments for HCC mentioned above, including sorafenib, Avastin, and pembrolizumab, are
first-line and second-line treatments. Liver transplantation is the best option for those with
HCC; if it cannot be done, other treatments such as resection or systemic therapies that
utilize sorafenib or pembrolizumab can be used [33]. These treatments have limitations. For
liver transplantation, there is a lack of viable livers. Systemic therapies can be useful, but
are not guaranteed to be viable, as certain criteria must be met before they can be used. For
example, as mentioned previously, pembrolizumab can be used for patients who have already used
sorafenib but cannot use atezolizumab and bevacizumab [32]. In addition, treatments that were
not available at the beginning of the Keynote-240 trial for pembrolizumab were approved during
the trial [32].
New treatments focusing on different aspects of diseases like HCC can always be of use.
Therefore, studying cellular pathways and genes of interest, and developing therapies for
these targets, is important in the fight against HCC. This is why studying a gene of interest
like MSI2, prevalent in cancers such as acute myeloid leukemia and HCC, is an important
endeavor. The MSI2 RNA-binding protein is abundant in those with advanced HCC; if MSI2 is also
present in HCC patients at earlier stages, MSI2 could be a marker for diagnosing early HCC. The
earlier a cancer is detected, the earlier treatment can be started, leading to better overall
survival and progression-free survival. Developing a systemic therapy targeting MSI2, and
therefore MYC, could be important for the fight against HCC.
To understand the effect of MSI2 expression on genome-wide transcription rate, we
conducted Precision Run-On Sequencing (PRO-seq) and then applied Global Run-On Sequencing
(GRO-seq) analysis to the PRO-seq FASTQ files. Both PRO-seq and GRO-seq allow measurement of
transcription rate utilizing nascent RNA; PRO-seq, however, has single-base-pair resolution,
whereas GRO-seq is less precise. First, PRO-seq was done in wet lab experiments to yield a
library to be sequenced and analyzed. The PRO-seq libraries were then sequenced to produce
FASTQ files. Finally, the FASTQ files were processed and analyzed utilizing the GRO-seq
analysis method of Homer, via homer.ucsd.edu. This analysis was done on HepG2, Hep3B, and
Huh7 HCC cells. In terms of cell type differences, HepG2 expresses wild-type p53, whereas
Hep3B has non-functional p53 expression and Huh7 has mutated p53. Each of these three cell
types had an MSI2 knock-down (shMSI2), an MSI2 over-expression, and a control with wild-type
expression of MSI2. In Hep3B and Huh7 cells, there were no significant results, but in HepG2
cells, interesting results were found, especially when MSI2 was knocked down. The PRO-seq
results from HepG2 cells in both control vs. shMSI2 and control vs. MSI2 over-expression were
statistically significant in terms of p-value and q-value. The most interesting result occurred
in shMSI2 HepG2 cells; in this case, the transcription rate was significantly increased for all
36 histone genes included in this analysis. Moreover, 13 of the 36 histone genes were among the
top 50 genes with statistically significant changes in rate of transcription in shMSI2 HepG2
cells.
Histones are proteins that are required for the process of DNA replication, as well as
replenishing duplicated chromatin [34]. In order for DNA to fit in the nucleus, DNA must be
condensed into chromatin [34]. The unit of chromatin is the nucleosome [34]. The nucleosome
consists of DNA wrapped around a core of 8 histone proteins, which make up a histone octamer
[34]. These core histone proteins are histones H2A, H2B, H3, and H4; there are two copies of
each histone in the histone octamer, with around 147 base pairs of DNA wrapped around the
histone proteins in around 1.75 turns [34, 35]. In eukaryotes, there are also several different
variants of histone H2 and H3 with unique functions [34]. The core histone proteins, as well as
linker histone H1, are required for DNA compaction into chromatin [34]. During S phase of the
cell cycle, when DNA is replicated, histone expression is high [34]. When DNA replication is
complete, histone expression decreases [34]. DNA replicated in S phase must be assembled into
chromatin via histones [34]. When DNA is doubled during S phase, the amount of histones must
also increase to maintain the proper DNA-to-histone ratio [34]. Therefore, histone production
must be tightly regulated to occur during S phase, when histones are required [34]. After DNA
synthesis ends or the end of S phase is reached, DNA replication-dependent histone
transcription is turned off [34].
Histone variants, mostly consisting of histone H2A variants, can be canonical
replication-dependent or replication-independent [34]. Cajal bodies, nuclear organelles present
in all eukaryotes, are thought to be associated with histone locus bodies [36]. This is because
coilin, the marker for Cajal bodies, is also present in high concentrations in some histone
locus bodies [36]. Another reason is that Cajal bodies and histone locus bodies physically
interact [36]. Histone locus bodies contain factors that are required to process histone
pre-mRNAs [36]. Histone locus bodies are also associated with the genes that code for histone
proteins, which would indicate that histone locus bodies are the site of histone production
[36].
Histone variants have a variety of different functions and are altered in multiple cancers
[37]. There are variants of histone H2A, H2B, and H3; histone H4, however, is one of the most
slowly evolving histones, and has only a minuscule number of variants in different species
[37]. One of these histone H4 variants is H4G, which plays a role in breast cancer by
regulating ribosomal DNA transcription [38]. Histone variant functions range from histone H1.2
inducing apoptosis to mH2A2 inactivating the X chromosome [37]. H2A1, H2A/p, H2A.1, H2A2A,
H2A.2, and H2A.o are all altered in hepatocellular carcinoma [37]. Along with having different
functions, histone variants can also affect nucleosome stability [39]. Histone variant H2A.B
can decrease nucleosome stability, whereas histone variants H2A.Z, macroH2A, and CENP-A can
increase nucleosome stability [39]. This alteration of nucleosome stability can change
chromatin function and organization [39]. Including H2A.B in the nucleosome destabilizes
DNA-histone and histone-histone interactions, causing chromatin to open up, transcription
factors to bind, and transcription to occur [39]. Histone variants can be
replication-dependent, if they are located within a histone gene cluster, or
replication-independent [34]. Overall, histone variants affect post-translational modification,
chromatin remodeling complexes, and epigenetic regulation complexity [39].
Based on these prior observations, my hypothesis is as follows: MSI2 influences histone
transcription rate via p53 and MYC. The basis of this hypothesis is that p53 is present at
normal levels in HepG2 cells, which yielded the statistically significant results; in Hep3B
cells, however, p53 is not present, and in Huh7 cells, p53 is destabilized due to a point
mutation. This would indicate that MSI2 and p53 may be influencing histone transcription in
HepG2 cells. MYC, through MSI2 interaction, may also be involved in histone gene transcription
by repressing the histone genes. Both MYC and p53 can act as transcription factors, so their
binding to histone gene promoters may affect histone transcription in HepG2 cells. p53 may be
increasing histone transcription and MYC may be decreasing it by acting as an activator and a
repressor, respectively. The first aim of this project was to determine the changes in
transcription when MSI2 expression was altered, via PRO-seq and the GRO-seq analysis tool
Homer. The second aim was to determine whether the MYC and p53 transcription factors were bound
to the 36 histone gene promoter regions in HCC cell lines by utilizing Chromatin
Immunoprecipitation sequencing (CHIP-seq) data sets, which would help determine whether p53 and
MYC bind histone genes through their promoter regions or enhancer regions.
In terms of methodology, the first step was the production of MSI2 over-expression and
shMSI2 HepG2, Hep3B, and Huh7 cells. PRO-seq was then conducted through a collaboration between
the Machida Lab and a University of California, San Diego (UCSD) collaborator, Dr. Soohwan Oh.
Next, the PRO-seq FASTQ data files were analyzed with Homer and visualized via the University
of California, Santa Cruz (UCSC) genome browser. Afterwards, to determine whether p53 binds to
histone gene promoters, we looked at the p53 Baer Hub, a collection of 41 different non-HepG2
CHIP-seq data sets across different cell lines. To further determine whether p53 and MYC bind
to the 36 histone gene promoter regions, we obtained publicly available p53 and MYC HepG2
CHIP-seq data sets via the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO). These CHIP-seq data sets were visualized and analyzed utilizing the
UCSC genome browser. The order of this thesis is as follows: introduction, materials & methods,
results, discussion, and references.
2: Materials and Methods
2.1: Cell Lines
HepG2, Huh7, and Hep3B cells were utilized in this project. A former post-doctoral
member of the Machida Lab, Dr. Da-Wei Yeh, handled the cell lines for this project. Dr. Yeh was
also in charge of knocking down and over-expressing MSI2 in the HepG2, Huh7, and Hep3B cell
lines, as well as any other experiments involved in validating MSI2 level. Huh7, Hep3B, and
HepG2 cells were maintained in Dulbecco’s Modified Eagle Medium (DMEM) with 10% fetal
bovine serum (FBS).
In terms of cells utilized in the PRO-seq experiment, for the Huh7 cell line:
SH-G443 - sh-Scrambled, SH-G444 - sh-Scrambled, SH-G445 - shMSI2, SH-G446 - shMSI2
SH-G578 - vector control, SH-G579 - vector control, SH-G580 - MSI2 over-expression, SH-G581 - MSI2 over-expression
For the HepG2 cell line:
SH-G414 - sh-Scrambled, SH-G422 - sh-Scrambled, SH-G415 - shMSI2, SH-G423 - shMSI2
SH-G416 - vector control, SH-G424 - vector control, SH-G417 - MSI2 over-expression, SH-G425 - MSI2 over-expression
For the Hep3B cell line:
SH-G418 - sh-Scrambled, SH-G426 - sh-Scrambled, SH-G419 - shMSI2, SH-G427 - shMSI2
SH-G420 - vector control, SH-G428 - vector control, SH-G421 - MSI2 over-expression, SH-G429 - MSI2 over-expression
2.2: Statistical Analysis
For analysis of the PRO-seq experiments using Homer, statistical analysis was required to
calculate log2 fold change, p-value, and q-value/adjusted p-value. This statistical analysis
was done by the UCSD collaborator, Dr. Soohwan Oh. To calculate the log2 fold change, the
sh-Scrambled or vector control values and the shMSI2 or MSI2 over-expression values were each
averaged. For example, the sh-Scrambled control values were averaged, the shMSI2 values were
averaged, and the fold change was then found by dividing the averaged shMSI2 transcription rate
by the averaged sh-Scrambled transcription rate. The log2 of this fold change could then be
taken.
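As an illustration only, the averaging and log2 fold-change calculation described above can be sketched in Python. This is a minimal sketch and not the code used for this analysis: the function name and the replicate values are hypothetical, and the log2 fold changes reported by Homer need not match this simple calculation exactly (for example, if additional normalization is applied).

```python
import math


def log2_fold_change(control_rates, treated_rates):
    """Average each group's replicate transcription rates, then return
    log2(mean treated rate / mean control rate)."""
    control_mean = sum(control_rates) / len(control_rates)
    treated_mean = sum(treated_rates) / len(treated_rates)
    if control_mean <= 0 or treated_mean <= 0:
        raise ValueError("log2 fold change is undefined for non-positive mean rates")
    return math.log2(treated_mean / control_mean)


# Hypothetical replicate values (not taken from the data set):
# two control replicates and two knock-down replicates.
example_log2fc = log2_fold_change([100.0, 120.0], [210.0, 230.0])
print(example_log2fc)
```

A positive value indicates a higher averaged transcription rate in the altered cells than in the control, and a negative value indicates a lower one.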
To obtain p-values, a two-tailed t-test was conducted; with a two-tailed test, significance
can fall on either side of the bell curve. Transcription rate values either higher or lower
than the controls for both shMSI2 and MSI2 over-expression could therefore be significant. Once
the t-test was completed, the p-value could be calculated.
After the p-values were calculated, q-values/adjusted p-values could be calculated. To
calculate the q-value/adjusted p-value from the p-value, the Bonferroni method was utilized in
R. First, the p-values were stored in “P_value” with the following code: “P_value <- c(P-value
1, P-value 2, …, P-value N)”, where each entry stands for one p-value. Then, the code
“p.adjust(P_value, method = "bonferroni")” was used to convert the p-values into
q-values/adjusted p-values.
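To make the adjustment step concrete, the following is a minimal Python sketch of two common p-value corrections: the Bonferroni method named above, and the Benjamini-Hochberg (BH) method, which is included only for comparison because q-values are described in this thesis in terms of the false-discovery rate. The function names and example p-values are hypothetical, and this is not the R code used for the analysis.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


def benjamini_hochberg(p_values):
    """Rank-based FDR adjustment: p * n / rank, made monotone from the
    largest p-value downwards and capped at 1."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four tests
example_p = [0.01, 0.04, 0.03, 0.5]
print(bonferroni(example_p))
print(benjamini_hochberg(example_p))
```

As a side observation, for the first three rows of Table 1, p-value × 27,945 / rank gives approximately 1.41E-17, 2.19E-13, and 3.28E-13, which agrees with the listed q-values. This suggests, but does not establish, that the tabulated q-values follow a rank-based correction of this kind; it is an observation about the tabulated numbers, not a statement from the analysis itself.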
2.3: PRO-seq Experimental Protocol
The PRO-seq experimental protocol followed the protocol published in Nature Protocols [40].
This protocol was utilized by Dr. Da-Wei Yeh starting in mid-2019 while collaborating with Dr.
Soohwan Oh of UCSD. The cell types utilized were HepG2, Huh7, and Hep3B control, MSI2
over-expression, and shMSI2 cells. Because the experimental protocol has 150 steps, it is
summarized here. Materials were listed for nuclei isolation, nuclear run-on, reverse
transcription, and finally library preparation. The first step was to prepare the cell culture,
10^7 cells at 80% confluency [40]. The samples were then prepared in a cold room, after which
nuclei isolation could occur [40]. For nuclei isolation, cells were centrifuged, resuspended
in cold PBS, centrifuged again, resuspended in ice-cold douncing buffer, centrifuged and
washed several more times, and finally frozen at -80 degrees Celsius [40]. The next step was
nuclear run-on, using a nuclear run-on master mix [40]. The master mix was composed of
Tris-Cl, MgCl2, DTT, KCl, and DEPC-H2O [40]. Biotin run-on was conducted utilizing a 2x
reaction mix, which was made by combining the nuclear run-on master mix with biotin nucleotides
[40]. The nuclei of the HCC cells and the 2x reaction mix were then heated and combined [40].
RNA extraction was then conducted [40]. Trizol was mixed with the sample by vortexing, and
the sample was centrifuged until RNA was extracted [40]. After RNA extraction, the RNA pellet
was fragmented via base hydrolysis [40]. Then, several steps followed: RNA enrichment,
attachment of adaptors to the 3’ end of the RNA, enzymatic modification of the RNA, ligation of
an adaptor to the 5’ end of the RNA, further biotin enrichment, and reverse transcription [40].
Polymerase chain reaction and gel analysis were conducted, and library size selection was
performed [40]. Finally, sequencing and analysis of the data were performed [40].
2.4: Bioinformatics
2.4.1: Installing Homer and Various Programs
In order for the PRO-seq data to be analyzed, Homer was first installed on a MacBook Air
(M1, 2020). The instructions for installing Homer on homer.ucsd.edu were followed [41]. Xcode
was installed via the App Store; then, the terminal was opened and “xcode-select --install” was
entered to install the command line developer tools [41]. Afterwards, Python3 was downloaded
via python.org/downloads. Anaconda was downloaded and installed via
anaconda.com/products/individual#macos. Then, the terminal was opened and “conda install wget”
was entered to install wget via Anaconda [41]. The R statistical analysis tool was downloaded
from R-Project.org and installed. Bioconductor was then installed by entering the following
into R [41]:
    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install(version = "3.13")
Afterwards, to install core packages, the following was entered into R:
    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install()
The intended installation package was placed inside the parentheses of
“BiocManager::install()” [41]. For the packages required by Homer, the following code was
entered into R: “BiocManager::install(c("DESeq2", "edgeR"))”, which installed DESeq2 and edgeR
[41]. Once these packages were installed, Samtools was installed via the instructions on
samtools.github.io/bcftools/howtos/install.html. Perl should already have been installed on the
computer; to check which version of Perl was installed, “perl -V” was entered into the
terminal; the installed version was Perl 5, Version 28, Subversion 2 (v5.28.2). G++ and GCC
were already installed via the installation of Xcode; Make was also already installed [41].
Next, Homer was downloaded from homer.ucsd.edu/Homer/Download.html [41]. The file,
configurehomer.pl, was placed in a folder designated as “Homer”; the path to this file was as
follows: “/Users/AhmedFarooq/Desktop/Misc./Homer/configurehomer.pl” [41]. To install Homer,
“perl /Users/AhmedFarooq/Desktop/Misc./Homer/configurehomer.pl -install” was entered into the
terminal [41]. Next, the line “open .bash_profile” was entered into the terminal, which opened
the bash profile as a text file; “PATH=$PATH:/Users/ahmedfarooq/desktop/Misc./Homer//bin/” was
then copied and pasted into this file, and the file was saved [41]. The next step was to
download various genome packages such as mouse mm8, human hg18, and human hg19 [41]. To
download these genome packages, the following was entered into the terminal:
“perl /Users/AhmedFarooq/Desktop/Misc./Homer/configurehomer.pl -install ____”, where the blank
was replaced with mm8, hg18, or hg19 to install the respective package [41].
Once installation was complete and all packages were installed, GRO-seq practice was
conducted utilizing two fibroblast practice data files from the NCBI website: GSM340901 and
GSM340902 [41]. Once these files were downloaded, a tag directory was made by entering the
following code into the terminal:
“/Users/ahmedfarooq/Desktop/Misc./Homer/cpp/makeTagDirectory IMR90/ -genome hg18
-checkGC /users/ahmedfarooq/desktop/Misc./GSM340901_lib1_aligned.bed
/users/ahmedfarooq/desktop/Misc./GSM340902_lib2_aligned.bed”
This made a directory, or folder, called IMR90 from the practice files [41]. Before conducting
GRO-seq analysis, the files had to be checked for a variety of values, such as the GC%
content and GC-distribution via “tagGCcontent.txt” and “genomeGCcontent.txt”, and nucleotide
frequencies via “tagFreqUniq.txt” [41]. The final practice step was to visualize the data in
the UCSC genome browser. To do this,
“/users/ahmedfarooq/desktop/misc./Homer/cpp/makeUCSCfile IMR90 -o auto -strand separate” was
entered into the terminal [41]. This turned the directory, IMR90, into a track for UCSC genome
browser analysis [41]. The information visualized utilizing the genome browser could then be
downloaded into an Excel file for further analysis.
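To illustrate what the GC% content and nucleotide frequency checks measure, a minimal Python sketch is given below. It is not how Homer computes these values internally; the function names and the example sequence are hypothetical, and bases other than A, C, G, and T (such as N) are simply ignored here as a simplifying assumption.

```python
from collections import Counter


def nucleotide_frequencies(sequence):
    """Return the fraction of A, C, G, and T among the called bases."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return {base: counts[base] / total for base in "ACGT"}


def gc_fraction(sequence):
    """Return the fraction of called bases that are G or C."""
    frequencies = nucleotide_frequencies(sequence)
    return frequencies["G"] + frequencies["C"]


# Hypothetical read sequence
example_gc = gc_fraction("GGCCAATT")
print(example_gc)
```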
2.4.2: GRO-seq Bioinformatics Analysis
Once practice was complete, the actual GRO-seq analysis could be done. The processing and
alignment of the PRO-seq FASTQ files to a genome, as well as the visualization of the data,
were done by Dr. Soohwan Oh. The PRO-seq experiment produced FASTQ files. In order for the
FASTQ files to be analyzed, sequence quality had to be checked, the files had to be trimmed,
and the sequences had to be manipulated [41]. After manipulation, the FASTQ files were mapped
to the hg19 reference genome [41]. With the files all aligned to the hg19 genome, the PRO-seq
files were analyzed for properties such as GC-distribution and nucleotide frequencies via Homer
[41]. A tag directory was made for each file [41]. The files were finally visualized utilizing
the UCSC genome browser; each file was made into a UCSC genome browser track. Data from the
genome browser was then downloaded and analyzed in Microsoft Excel [41].
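As a rough illustration of the quality-check and trimming steps mentioned above, the following Python sketch parses FASTQ records and trims low-quality bases from the 3' end of a read. It is a simplified stand-in and not the tool or settings used for this analysis; the function names, the Phred+33 encoding, the quality threshold of 20, and the example read are all assumptions.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality) tuples from an iterable of FASTQ lines,
    assuming the common four-lines-per-record layout."""
    record = []
    for raw_line in lines:
        line = raw_line.rstrip("\n")
        if not line and not record:
            continue  # skip blank lines between records
        record.append(line)
        if len(record) == 4:
            header, sequence, separator, quality = record
            if not header.startswith("@") or not separator.startswith("+"):
                raise ValueError("malformed FASTQ record")
            if len(sequence) != len(quality):
                raise ValueError("sequence and quality lengths differ")
            yield header[1:], sequence, quality
            record = []


def trim_3prime(sequence, quality, min_quality=20, offset=33):
    """Remove bases from the 3' end until one meets the quality threshold."""
    end = len(quality)
    while end > 0 and ord(quality[end - 1]) - offset < min_quality:
        end -= 1
    return sequence[:end], quality[:end]


# Hypothetical single-read FASTQ record; 'I' encodes Phred 40, '#' Phred 2, '!' Phred 0.
example_lines = ["@read1", "ACGTACGT", "+", "IIIII##!"]
records = list(parse_fastq(example_lines))
trimmed = trim_3prime(records[0][1], records[0][2])
print(trimmed)
```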
Afterwards, we analyzed the Excel files based on the q-value/adjusted p-value; statistical
significance was defined as a q-value/adjusted p-value at or below 0.05 for HepG2, Huh7, and
Hep3B cells. This was done for sh-Scrambled vs. shMSI2 and vector control vs. MSI2
over-expression for each of the three cell types.
2.4.3: CHIP-seq Data Analysis
In order to discern whether p53 and MYC were involved in the histone transcription
differences observed when MSI2 was knocked down in HepG2 cells via short hairpin, CHIP-seq data
sets were analyzed. Utilizing publicly available CHIP-seq data sets, p53 and MYC transcription
factor binding to the histone genes’ promoter regions, as well as any enhancer regions, was
analyzed. To see whether p53 was bound to the 36 histone gene promoter regions in various
non-HepG2 data sets, the p53 Baer Hub was utilized [42]. The p53 Baer Hub is a collection of 41
CHIP-seq data sets and is available through the UCSC genome browser [42]. Each of the 36
histone genes that had highly up-regulated transcription rates when MSI2 was knocked down in
HepG2 cells was examined for p53 binding at its promoter region and any enhancer regions [42].
The location of the gene, on the plus or minus strand, was also important to note.
Afterwards, to see p53 and MYC binding in HepG2 cells’ CHIP-seq data sets, publicly
available data sets from the NCBI GEO Accession Viewer were utilized. To see p53 histone gene
binding, the CHIP-seq data set from “Global re-wiring of p53 transcription regulation by the
hepatitis B virus X (HBX) protein” was utilized and visualized through the UCSC genome
browser [43]. The HepG2 control data set showing p53 binding sites, GSM1581946, was used
[43]. This data set did not have a threshold cutoff. To see whether MYC was bound to the 36
histone genes as well as any enhancer regions, another publicly available CHIP-seq data set was
used. This data set, GSM2797581, was a HepG2 MYC binding data set from the NCBI GEO Accession
Viewer with a 0.02 Irreproducible Discovery Rate (IDR) [44]. IDR was included because the
researchers utilized replicates of the MYC files. For all CHIP-seq experiments visualized via
the UCSC genome browser, the genes of interest were first located. Then, 1000 base pairs
upstream and downstream of the gene location were investigated for p53 or MYC binding. The
location of the gene on the plus or minus strand was also taken into account when looking for
promoter binding.
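The window-based inspection described above can be sketched in Python as follows. This is a minimal illustration of the logic rather than the procedure used in the genome browser; the function names, the half-open coordinate convention, and the example coordinates are hypothetical.

```python
def search_window(gene_start, gene_end, flank=1000):
    """Region inspected for binding: the gene plus `flank` bp on each side."""
    return max(0, gene_start - flank), gene_end + flank


def upstream_region(gene_start, gene_end, strand, flank=1000):
    """Region immediately upstream of the transcription start site.
    On the plus strand this lies before gene_start; on the minus strand
    it lies after gene_end."""
    if strand == "+":
        return max(0, gene_start - flank), gene_start
    if strand == "-":
        return gene_end, gene_end + flank
    raise ValueError("strand must be '+' or '-'")


def overlaps(region, peak):
    """True if two half-open (start, end) intervals share at least one base."""
    return region[0] < peak[1] and peak[0] < region[1]


# Hypothetical gene on the minus strand and a hypothetical binding peak
gene_start, gene_end, strand = 5000, 6000, "-"
peak = (6200, 6400)
promoter_hit = overlaps(upstream_region(gene_start, gene_end, strand), peak)
print(promoter_hit)
```

For a gene on the minus strand, a peak located after the annotated gene end falls on the promoter side, which is why the strand has to be noted before interpreting a binding site as promoter binding.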
3: Results
3.1: There Were No Statistically Significant Results in Hep3B and Huh7 Cells
The PRO-seq experiment produced six data sets, two for each cell line, and these data were
analyzed using Homer. There was one data set for sh-Scrambled vs. shMSI2 and one for vector
control vs. MSI2 over-expression for each of HepG2, Huh7, and Hep3B cells. Overall, every data
set contained 27,945 transcripts. Each entry corresponded to a different gene, with the gene
name or names, chromosome location of the gene, and the type of gene listed. For example,
whether the gene was protein-coding or non-coding RNA (ncRNA) was listed. While the results of
Hep3B and Huh7 cells are not shown in this thesis due to redundancy, the method for determining
significance is important to mention. Each data set contains the transcription rates in the
control vs. shMSI2 or MSI2 over-expression, as well as log2 fold change, p-value, and q-value.
Results are based on q-value since q-value, or adjusted p-value, is more reliable than p-value
as it is adjusted for the false-discovery rate (FDR). For all 6 data sets, a q-value at or
below 0.05 was the chosen threshold for a significant change in transcription rate. A q-value
of 0.05 is a 5% FDR, which indicates that of the significant results, 5% are expected to be
false positives. The q-value is more conservative, as there will be fewer false positives at a
given q-value than at the corresponding p-value. For the Hep3B and Huh7 data sets, the q-values
for all transcripts were equal to 1; this indicates that all differences in transcription rate
between the control and the altered MSI2 conditions would be false positives, and the null
hypothesis is not rejected. In other words, there were no significant changes in transcription
rate when MSI2 was knocked down or over-expressed in Huh7 and Hep3B cells.
Gene          Vector Control  Vector Control  MSI2 OE         MSI2 OE         Control vs.   Control vs.  Control vs.
              Transcription   Transcription   Transcription   Transcription   MSI2 OE log2  MSI2 OE      MSI2 OE
              Rate            Rate            Rate            Rate            fold change   p-value      q-value
IFI44L        137.91          124.91          439.33          401.8           1.68057821    5.03E-22     1.41E-17
OAS2          9.65            6.41            82.47           66.67           3.14807532    1.57E-17     2.19E-13
IFIT1         274.43          227.39          618.86          518.92          1.17588467    3.52E-17     3.28E-13
IFIT3         50.34           57.65           212.37          200             1.96936453    3.02E-16     2.11E-12
PARP9         230.99          160.14          494.06          430.63          1.22100743    2.19E-15     1.23E-11
EPSTI1        116.53          118.5           362.7           275.68          1.45957832    4.20E-14     1.96E-10
OASL          41.37           36.83           145.96          115.32          1.75158922    4.11E-12     1.64E-08
IFI44         25.51           33.63           123.33          127.93          2.14478373    5.41E-12     1.89E-08
DDX58         568.17          456.39          1111.46         1061.26         1.08295063    6.39E-12     1.98E-08
RSAD2         12.41           6.41            67.14           46.85           2.50602239    2.96E-11     8.28E-08
IFIH1         74.47           59.25           201.42          145.95          1.37688262    3.31E-10     8.40E-07
SP110         42.06           48.04           141.58          126.13          1.60935659    1.08E-09     2.51E-06
IFIT2         32.41           24.02           94.87           109.91          1.81266297    1.95E-09     4.19E-06
HELZ2         346.83          192.16          606.45          464.86          0.97580534    8.59E-09     1.71E-05
STAT1         928.8           709.4           1596.04         1295.49         0.81432082    1.30E-08     2.42E-05
DDX60         112.39          113.7           289.72          214.41          1.17325586    2.12E-08     3.51E-05
ETV5          202.03          180.95          94.87           77.48           -1.15083702   2.14E-08     3.51E-05
LGALS3BP      76.54           38.43           159.82          129.73          1.27424160    1.20E-07     0.0001866
IFI6          113.08          96.08           256.15          169.37          1.03430519    5.47E-07     0.0008050
DNAH17-AS1    38.61           12.81           108.74          63.06           1.64993858    7.90E-07     0.0011039
FAM53C        454.4           342.69          666.29          558.56          0.60993261    2.64E-06     0.0035173
OAS1          20.69           9.61            64.22           34.23           1.64022129    9.68E-06     0.0123019
XAF1          2.76            3.2             18.24           23.42           2.75847497    1.12E-05     0.0135603
HERC5         204.1           161.74          323.29          284.68          0.72102003    1.49E-05     0.0173188
ACO1          669.54          538.05          1212.9          774.77          0.70414698    2.84E-05     0.0305092
DNAH17        447.51          254.61          661.18          482.88          0.69496673    2.81E-05     0.0305092
Table 1. Statistically significant results (q-value ≤ 0.05) in rate of transcription between vector control vs. MSI2
over-expression in HepG2 cells. Information includes gene, transcription rate of vector control cells (SH-G416, SH-G424),
transcription rate of MSI2 over-expression (OE) cells (SH-G417, SH-G425), log2 fold change, p-value, and q-value. Results
from PRO-seq. Analysis via Homer.
3.2: There Were Statistically Significant Results in Both HepG2 Data Sets
In HepG2 shMSI2 and MSI2 over-expression cells, there were statistically significant
results based on the 5% FDR q-value threshold of 0.05. In the MSI2 over-expression HepG2
results in Table 1, there were 26 significant results. In shMSI2 cells, there were 461
significant results under the 0.05 q-value threshold, as seen in Table 2. For the MSI2
over-expression data set, SH-G416 and SH-G424 were both vector controls; SH-G417 and SH-G425
were MSI2 over-expression samples. To obtain the results, the controls’ transcription rates
were averaged, and then the MSI2 over-expression transcription rates were averaged. Log2 fold
change, p-value, and q-value could then be calculated. The HepG2 MSI2 over-expression data set
likely did not yield a large number of significant results because over-expression is a milder
change in MSI2 expression than knocking down MSI2 in shMSI2 HepG2 cells. For the shMSI2 HepG2
data set, SH-G414 and SH-G422 were the sh-Scrambled cells; SH-G415 and SH-G423 were shMSI2
cells. Of the top 50 genes with the most significantly changed rates of transcription in shMSI2
HepG2 cells, 13 were histone genes. This means 26%, or a little over 1/4, of the top 50 genes
in shMSI2 cells were histone genes, which was quite interesting. Canonical histone proteins are
required for DNA replication. Histone variants also have a multitude of unique functions. Among
the 461 significant changes in transcription for HepG2 cells, 36 histone genes overall had an
increased transcription rate when MSI2 was knocked down via short hairpin. In shMSI2 HepG2
cells, HIST1H1C, the gene that codes for the linker histone H1.2 variant, had the second most
significantly altered rate of transcription out of the 461 genes. Overall, for all 36 histone
genes in shMSI2 HepG2 cells, the transcription rate increased significantly based on q-value.
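The proportion of histone genes among the top-ranked genes can be tallied with a short Python sketch such as the one below. It is an illustration only: the function name is hypothetical, and recognizing histone genes by the prefix of the gene symbol (HIST1..., HIST2..., H1F...) is an assumption rather than the criterion used in this thesis. The example list contains the first ten gene symbols of the shMSI2 HepG2 results in Table 2.

```python
import re

# Assumption: histone genes are recognized by gene-symbol prefix only.
HISTONE_SYMBOL = re.compile(r"^(HIST\d|H1F)")


def histone_fraction(ranked_genes, top_n):
    """Return the histone genes found among the first top_n ranked genes
    and the fraction of those top_n genes that they make up."""
    top = ranked_genes[:top_n]
    if not top:
        raise ValueError("no genes to evaluate")
    hits = [gene for gene in top if HISTONE_SYMBOL.match(gene)]
    return hits, len(hits) / len(top)


# First ten gene symbols from the shMSI2 HepG2 results, ranked by q-value
top10 = ["LINC01512", "HIST1H1C", "SLC9A9", "INSIG1", "HIST2H2AB",
         "FTH1", "CHRM3", "MIR196A2", "HIST2H2AC", "CHAC1"]
hits, fraction = histone_fraction(top10, 10)
print(hits, fraction)
```

Applied to the full top 50, the same tally corresponds to the 13/50, or 26%, stated above.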
Gene          sh-Scrambled  sh-Scrambled  shMSI2        shMSI2        log2 fold    p-value     q-value
              transcription transcription transcription transcription change
              rate          rate          rate          rate
LINC01512     153.71        153.88        429.8         386.96        1.50041496   3.58E-22    1.00E-17
HIST1H1C      418.25        307.76        756.48        790.83        1.19968309   1.20E-16    1.68E-12
SLC9A9        467.39        404.99        876.16        708.18        0.96546059   6.96E-16    6.49E-12
INSIG1        567.77        549.56        240.12        238.56        -1.142318    1.58E-15    1.11E-11
HIST2H2AB     445.44        443.88        907.02        882.88        1.10514256   4.97E-15    2.78E-11
FTH1          613.78        469.24        1112.51       1040.67       1.09984041   1.02E-14    4.76E-11
CHRM3         402.57        383           837.02        606.74        0.98139817   1.78E-14    7.11E-11
MIR196A2      6.27          1.69          45.16         50.72         3.68789544   3.28E-14    1.02E-10
HIST2H2AC     327.28        300.99        637.55        794.59        1.29038773   3.07E-14    1.02E-10
CHAC1         401.52        389.77        703.04        644.31        0.86155263   7.28E-13    2.03E-09
ID4           1706.46       1809.33       878.42        775.8         -0.9928761   1.09E-12    2.75E-09
HIST3H2BB     189.26        173.32        372.59        383.21        1.14981008   1.18E-12    2.75E-09
HIST1H2AC     279.18        251.95        516.36        623.65        1.19968871   1.53E-12    3.28E-09
DHRS2         13.59         7.61          58.71         67.62         2.65926011   6.48E-12    1.21E-08
H1F0          588.69        413.44        1001.86       901.66        1.03714236   6.26E-12    1.21E-08
NFKBIA        282.32        233.35        481.74        477.13        0.99278218   7.51E-12    1.31E-08
RIMS2         612.74        571.55        1166.71       903.54        0.90493576   1.79E-11    2.94E-08
RPL10A        712.07        683.15        1248.75       1170.28       0.8929987    2.29E-11    3.55E-08
HIST3H2A      284.41        213.91        487.76        420.77        0.97553509   3.18E-11    4.68E-08
HNRNPA1P10    1506.74       1287.67       2444.07       2472.05       0.92384235   6.61E-11    9.24E-08
ACTG1         1159.6        978.22        1850.17       1857.79       0.90246362   1.20E-10    1.59E-07
NDRG1         636.79        485.31        310.87        223.54        -0.9476073   1.66E-10    2.11E-07
AKAP6         391.06        292.54        606.69        537.24        0.85082197   2.83E-10    3.44E-07
HIST1H2BC     169.39        185.16        349.26        439.56        1.24125401   6.20E-10    7.22E-07
LOC101930114  87.83         63.41         188.18        167.18        1.34120874   7.81E-10    8.73E-07
HIST1H4B      186.12        161.49        340.23        298.67        0.97862932   8.95E-10    9.62E-07
HNRNPA1       1896.76       1640.23       2973.98       2943.54       0.85054321   1.01E-09    1.04E-06
HIST1H2AB     192.39        145.42        333.45        309.95        1.03398474   1.34E-09    1.34E-06
ANKRD31       283.36        283.24        486.25        524.09        0.92303101   1.43E-09    1.38E-06
LGR6          279.18        273.09        471.2         456.47        0.83793971   2.42E-09    2.25E-06
CDC20         322.05        304.37        516.36        529.73        0.83038471   3.21E-09    2.90E-06
NUPR1         12.55         14.37         57.96         48.84         2.06487842   4.30E-09    3.75E-06
HIST1H3J      52.28         32.13         117.42        120.22        1.59920631   5.46E-09    4.62E-06
TMPO-AS1      230.04        186.85        368.08        364.42        0.91047513   7.50E-09    6.16E-06
ASNSP1        991.25        919.88        1552.1        1483.98       0.76958516   8.13E-09    6.40E-06
HIST2H2BF     778.99        670.47        1166.71       1247.3        0.84241677   8.24E-09    6.40E-06
LINC00662     168.35        150.5         301.84        278.01        0.95828451   1.01E-08    7.59E-06
FOSL2         239.45        197           103.87        75.14         -1.1558914   1.06E-08    7.80E-06
LOC101927830  177.76        176.71        336.46        289.28        0.91492199   1.14E-08    7.99E-06
TMLHE-AS      177.76        176.71        336.46        289.28        0.91492164   1.14E-08    7.99E-06
POU2F3        166.25        239.27        426.04        409.5         1.1217453    1.19E-08    8.13E-06
HSPA8         496.67        494.61        790.35        728.84        0.71011865   1.41E-08    9.39E-06
HIST2H2BA     261.41        236.73        420.77        524.09        1.019757     1.51E-08    9.83E-06
DIRC3         554.18        472.62        1016.17       695.03        0.83939758   1.83E-08    1.16E-05
RGMB-AS1      219.58        247.73        407.97        441.44        0.94564246   2.39E-08    1.45E-05
HIST1H1E      509.22        309.45        744.44        683.76        0.92010797   2.33E-08    1.45E-05
RNVU1-20      203.9         199.53        348.51        411.38        1.00110735   2.82E-08    1.67E-05
HSPA5         909.69        928.34        1473.06       1485.86       0.78661379   2.99E-08    1.74E-05
LOC101928978  970.34        951.17        1542.31       1442.66       0.73465533   3.80E-08    2.17E-05
RPL4          1367.68       1281.75       2118.89       1976.14       0.73002856   4.03E-08    2.25E-05
Table 2. Top 50 most statistically significant results (q-value ≤ 0.05) in rate of transcription
between control (sh-Scrambled) vs. MSI2 knock-down (shMSI2) in HepG2 cells. Information includes
gene, transcription rate of sh-Scrambled cells (SH-G414, SH-G422), shMSI2 cells (SH-G415,
SH-G423), log2 fold change, p-value, and q-value. Results from PRO-seq. Analysis via Homer.
3.3: Majority of Significantly Altered Histone Genes Were Histone Variants
Of the 36 histone genes whose transcription rate changed significantly in the shMSI2
HepG2 cells, 28 were histone variant genes. The rate of transcription for all 36 histone genes
increased significantly when MSI2 was knocked down via short-hairpin in HepG2 cells. In Table 3,
genes in green are histone variants and genes in blue are non-variant histones. The histone
variants range from linker histone H1 to histone H3, with an abundance of histone H2A, H2B, and
H3 variants. Histone H4 genes are also present; there are only a limited number of histone H4
variants, as histone H4 proteins are among the most slowly evolving histones [37]. In terms of
significance of the transcription rate difference between sh-Scrambled and shMSI2, most histone
H4 genes are at the bottom of the list. Also included in the non-variant list are HIST2H2BA and
HIST2H3PS2, which are histone pseudogenes; pseudogenes are non-functional genes that resemble
fully functional genes. Histone variants, as mentioned in the introduction, have interesting
functions and can make up the histone octamer of the nucleosome, increasing or decreasing
nucleosome stability [39]. Histone variants, especially histone H2A variants, can be quite similar
in structure [45].
Because of this structural similarity, using antibodies to measure histone variant protein
levels can have limited accuracy; some histone variants differ by only 3 amino acids [45].
Another difficulty in measuring protein level is the number of post-translational modifications
(PTMs) that histones undergo [45]. There are separate antibodies for the unmodified histones and
for those same histones carrying a PTM. Given the dubious quality and limited functions of
histone variant antibodies, as well as a lack of references, the planned western blot measurement
of histone variant protein levels was not carried out. Instead, the final step of this project
was to determine whether transcription factor binding causes the increase in histone gene
transcription rate when MSI2 is knocked down in HepG2 cells. This was done using ChIP-seq data
sets available from NCBI, visualized with the UCSC Genome Browser.
Gene          sh-Scrambled  sh-Scrambled  shMSI2        shMSI2        log2 fold    p-value     q-value
              transcription transcription transcription transcription change
              rate          rate          rate          rate
HIST1H1C      418.25        307.76        756.48        790.83        1.19968309   1.20E-16    1.68E-12
HIST2H2AB     445.44        443.88        907.02        882.88        1.10514256   4.97E-15    2.78E-11
HIST2H2AC     327.28        300.99        637.55        794.59        1.29038773   3.07E-14    1.02E-10
HIST3H2BB     189.26        173.32        372.59        383.21        1.14981008   1.18E-12    2.75E-09
HIST1H2AC     279.18        251.95        516.36        623.65        1.19968871   1.53E-12    3.28E-09
HIST3H2A      284.41        213.91        487.76        420.77        0.97553509   3.18E-11    4.68E-08
HIST1H2BC     169.39        185.16        349.26        439.56        1.24125401   6.20E-10    7.22E-07
HIST1H4B      186.12        161.49        340.23        298.67        0.97862932   8.95E-10    9.62E-07
HIST1H2AB     192.39        145.42        333.45        309.95        1.03398474   1.34E-09    1.34E-06
HIST1H3J      52.28         32.13         117.42        120.22        1.59920631   5.46E-09    4.62E-06
HIST2H2BF     778.99        670.47        1166.71       1247.3        0.84241677   8.24E-09    6.40E-06
HIST2H2BA     261.41        236.73        420.77        524.09        1.019757     1.51E-08    9.83E-06
HIST1H1E      509.22        309.45        744.44        683.76        0.92010797   2.33E-08    1.45E-05
HIST1H4E      316.82        282.39        474.96        469.61        0.74988751   6.76E-08    3.50E-05
HIST1H3H      378.52        299.3         538.94        509.06        0.73074296   6.98E-08    3.55E-05
HIST1H1D      51.24         42.27         133.98        92.04         1.39700024   7.43E-08    3.71E-05
HIST1H2BF     37.64         24.52         82.8          103.32        1.66875521   1.14E-07    5.28E-05
HIST2H2BE     2579.56       1970.82       3459.48       3462          0.7187709    1.37E-06    0.0004357
HIST1H3B      915.97        692.45        1196.82       1243.54       0.71307444   2.24E-06    0.0006384
HIST2H3PS2    541.63        390.61        690.99        633.04        0.61488423   4.64E-06    0.0011254
HIST2H2AA3    1849.71       1518.48       2407.18       3071.28       0.82405701   4.85E-06    0.0011315
HIST2H2AA4    1849.71       1516.79       2405.68       3071.28       0.82445996   4.90E-06    0.0011315
HIST1H2AE     78.42         75.25         150.54        135.25        0.98650993   5.35E-06    0.0012147
HIST1H2BN     70.06         64.26         133.23        114.59        0.98163317   1.59E-05    0.0027406
HIST1H2BD     1341.54       1265.69       1786.19       2034.37       0.66042918   2.22E-05    0.00348
HIST1H2AG     805.13        668.78        1039.5        1010.61       0.58079897   2.30E-05    0.0035868
HIST1H2AK     214.35        197           302.59        347.51        0.74747651   2.42E-05    0.0036975
HIST1H2BJ     396.29        343.27        499.8         608.62        0.68225854   2.88E-05    0.0042776
HIST1H3D      1339.44       1052.62       1685.33       1519.67       0.5293885    0.00011068  0.0120352
HIST1H2BK     2152.94       1745.92       2619.45       2738.79       0.57137673   0.00011151  0.012078
HIST1H4C      159.98        168.25        246.14        242.32        0.65676759   0.00026376  0.0229617
HIST2H3D      653.52        502.22        801.64        715.69        0.49977963   0.00038174  0.0297148
HIST1H1B      21.96         23.67         56.45         39.45         1.18698517   0.00045895  0.0335739
HIST1H4D      155.8         137.81        228.83        193.48        0.62682841   0.00050185  0.0355045
HIST2H4A      1181.56       880.15        1412.09       1256.69       0.48129804   0.00069053  0.0445657
HIST2H4B      1183.65       880.99        1412.85       1256.69       0.47967226   0.00072417  0.0462031
Table 3. 36 statistically significant histone genes (q-value ≤ 0.05) in rate of transcription
between control (sh-Scrambled) vs. MSI2 knock-down (shMSI2) in HepG2 cells. Information includes
gene, transcription rate of sh-Scrambled cells (SH-G414, SH-G422), shMSI2 cells (SH-G415,
SH-G423), log2 fold change, p-value, and q-value. Results from PRO-seq. Analysis via Homer.
Results taken from the 461 statistically significant sh-Scrambled vs. shMSI2 results in HepG2
cells. Green = histone variant; blue = histone non-variant.
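Tables 2 and 3 filter on q-value ≤ 0.05. The thesis does not state how the q-values were derived from the p-values; one common approach is the Benjamini-Hochberg adjustment, sketched below as an assumption on made-up p-values. The function name and the example inputs are hypothetical and are not taken from the PRO-seq data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotonic
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


example_p = [0.001, 0.008, 0.039, 0.041, 0.60]  # illustrative values only
example_q = benjamini_hochberg(example_p)
significant = [q <= 0.05 for q in example_q]
print(significant)
```

In this toy input the third and fourth tests have p-values below 0.05 but adjusted values just above it, which shows why a q-value cutoff is stricter than the same cutoff applied to raw p-values.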
Histone Gene Type                          Name
HIST1H1C     H1 Variant                    Linker Histone H1.2
HIST2H2AB    H2A Variant                   Histone H2A type 2-B
HIST2H2AC    H2A Variant                   Histone H2A type 2-C
HIST3H2BB    H2B Variant                   Histone H2B type 3-B
HIST1H2AC    H2A Variant                   Histone H2A type 1-C
HIST3H2A     H2A Variant                   Histone H2A type 3
HIST1H2BC    H2B Variant                   Histone H2B type 1-C
HIST1H4B     H4 Non-Variant                Histone cluster 1 H4 family B
HIST1H2AB    H2A Variant                   Histone H2A type 1-B/E
HIST1H3J     H3 Variant                    Histone H3.1
HIST2H2BF    H2B Variant                   Histone H2B type 2-F
HIST2H2BA    H2B Pseudogene (Non-Variant)  Histone H2B Family Member A
HIST1H1E     H1 Variant                    Linker Histone H1.4
HIST1H4E     H4 Non-Variant                Histone H4
HIST1H3H     H3 Variant                    Histone H3.1
HIST1H1D     H1 Variant                    Linker Histone H1.3
HIST1H2BF    H2B Variant                   Histone H2B type 1-C/E/F/G/I
HIST2H2BE    H2B Variant                   Histone H2B type 2-E
HIST1H3B     H3 Variant                    Histone H3.1
HIST2H3PS2   H3 Pseudogene (Non-Variant)   Histone Cluster 2 H3 pseudogene 2
HIST2H2AA3   H2A Variant                   Histone H2A type 2-A
HIST2H2AA4   H2A Variant                   Histone H2A type 2-A
HIST1H2AE    H2A Variant                   Histone H2A type 1-B/E
HIST1H2BN    H2B Variant                   Histone H2B type 1-N
HIST1H2BD    H2B Variant                   Histone H2B type 1-D
HIST1H2AG    H2A Variant                   Histone H2A type 1
HIST1H2AK    H2A Variant                   Histone H2A type 1
HIST1H2BJ    H2B Variant                   Histone H2B type 1-J
HIST1H3D     H3 Variant                    Histone H3.1
HIST1H2BK    H2B Variant                   Histone H2B type 1-K
HIST1H4C     H4 Non-Variant                Histone H4
HIST2H3D     H3 Variant                    Histone H3.2
HIST1H1B     H1 Variant                    Linker Histone H1.5
HIST1H4D     H4 Non-Variant                Histone H4
HIST2H4A     H4 Non-Variant                Histone H4
HIST2H4B     H4 Non-Variant                Histone H4
Table 4. Information about 36 histone genes with statistically significant rate of transcription
between control (sh-Scrambled) vs. MSI2 knock-down (shMSI2) HepG2 cells. Information listed as
gene name and type of histone. Green = histone variant. Blue = histone non-variant.
3.4: Less than Half of 36 Histone Genes Were Bound by P53 in Non-HepG2 Cells
The p53 Baer Hub, publicly available via the UCSC Genome Browser, allows p53 binding to be
visualized across 41 ChIP-seq data sets [42]. The data sets include non-HepG2 cells, such as
SAOS2 osteosarcoma cells [42]. While the significant results of the PRO-seq experiment were
derived from HepG2 cells, examining binding across several different types of cells, both
cancerous and non-cancerous, can give an important understanding of general p53 binding
patterns. Via the p53 Baer Hub, it was shown that 16/36 histone genes had p53 binding in their
promoter region. In Figure 1A, an example of this type of binding is shown for HIST1H1C, the gene
that codes for the linker histone H1 variant H1.2. Of the 36 histone genes, HIST1H1C is one of
the genes most heavily bound by p53 in its promoter region. In Figure 1A, the blue bar shows the
location of the HIST1H1C gene, which runs from 3’ to 5’ as displayed because it is on the minus
strand. Each black and green bar shows binding of p53 in different data sets. The bars themselves
show peak calls, i.e., p53 protein binding to DNA. For HIST1H1C, there were 7 data sets with
binding in the promoter region: 6876, peak8443, peak13765, peak5138, peak4083, peak5108, and
peak6715. The black bar indicates a p53 peak found in 2 or more independent data sets; the green
bar indicates SISSRs peak calls. Figure 1B shows that 16/36 histone genes are bound by p53 in
non-HepG2 cells.
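The promoter-binding calls above were made by visual inspection in the UCSC genome browser. A programmatic version of the same question, whether a ChIP-seq peak overlaps a gene's promoter, might look like the sketch below. The promoter window (1,000 bp upstream and 100 bp downstream of the transcription start site), the coordinates, and all names are hypothetical assumptions for illustration; they are not taken from the thesis or from the p53 Baer Hub.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # leftmost coordinate on the reference
    end: int     # rightmost coordinate on the reference
    strand: str  # "+" or "-"


def promoter_window(gene, upstream=1000, downstream=100):
    """Return (start, end) of an assumed promoter window, oriented by strand."""
    if gene.strand == "+":
        tss = gene.start
        return tss - upstream, tss + downstream
    # On the minus strand the TSS is the rightmost coordinate,
    # and "upstream" means higher reference coordinates.
    tss = gene.end
    return tss - downstream, tss + upstream


def peak_in_promoter(gene, peak_start, peak_end):
    """True if the closed interval [peak_start, peak_end] overlaps the promoter window."""
    p_start, p_end = promoter_window(gene)
    return peak_start <= p_end and peak_end >= p_start


# Hypothetical minus-strand gene and peaks (made-up coordinates)
example_gene = Gene("EXAMPLE_HISTONE", start=50_000, end=50_700, strand="-")
upstream_peak = peak_in_promoter(example_gene, 50_900, 51_100)  # just upstream of the TSS
distal_peak = peak_in_promoter(example_gene, 48_000, 48_300)    # beyond the 3' end
print(upstream_peak, distal_peak)
```

Handling the strand explicitly matters for genes such as HIST1H1C, which the text notes lies on the minus strand, because the promoter then sits at higher coordinates than the gene body.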
Figure 1. Less than half of histone genes have p53 bound to the promoter region in non-HepG2
cells. (A) Example of ChIP-seq binding: HIST1H1C (linker histone variant H1.2) bound by p53 in
multiple different non-HepG2 data sets. Each bar is from a single data set and shows p53 binding.
Green bars show SISSRs peak calls. Black bars show p53 peaks in 2+ independent data sets.
Visualized via the UCSC Genome Browser. ChIP-seq data sets from the p53 Baer Hub. (B) Overall
number of histone genes with p53 bound to the promoter region in non-HepG2 cells.
3.5: Limited Binding of Histone Genes by P53 and MYC in HepG2 Data Sets
Both p53 and MYC HepG2 ChIP-seq binding data were visualized using the UCSC Genome Browser.
Both HepG2 data sets showed limited binding of p53 and MYC. Overall, only 1/36 histone genes had
some level of binding in the vicinity of the gene for each factor: HIST1H2AK for MYC and
HIST1H2AE for p53. For MYC, there was binding near, but not in, the promoter region of
HIST1H2AK. In Figure 2A, the grey bar under “User Supplied Track” shows MYC binding. Above it,
the blue bar is HIST1H2AK on the minus strand. The promoter region of HIST1H2AK is marked by the
dip in the layered H3K27Ac track at the bottom of the figure. There was MYC binding near
HIST1H2AK, but no MYC binding in the promoter region of any of the 36 histone genes.
Figure 2. No MYC binding to histone promoter regions in HepG2 cells. (A) HIST1H2AK is the only
histone gene with some MYC binding near the gene. MYC does not bind to the HIST1H2AK promoter.
Grey bar under “User Supplied Track” shows MYC binding. ChIP-seq files from NCBI GEO Accession
Viewer. ChIP-seq data visualized via the UCSC Genome Browser. (B) Overall number of histone genes
with MYC bound to the promoter region in HepG2 cells. Overall, no binding of MYC to the promoter
regions of the 36 statistically significant histone genes.
The same binding pattern is seen in the p53 HepG2 ChIP-seq data set, also visualized with the
UCSC Genome Browser. Under “User Supplied Track”, the black bar labeled “MACS_peak_10463” shows
p53 binding. This binding is close to HIST1H2AE, but not in the promoter region of the histone
gene, which is marked by the dip in the layered H3K27Ac track at the bottom of Figure 3A. As
shown in Figure 3B, p53 was bound to 0/36 histone gene promoter regions in HepG2 cells. The MYC
ChIP-seq data set in Figure 2A has an IDR cutoff of 0.02 because the researchers used two
replicates to call peaks, i.e., MYC binding, in HepG2 cells; IDR (irreproducible discovery rate)
was used to retain peaks that were consistent between the two HepG2 replicates. The p53 data set
from Figure 3A, however, did not have this cutoff. These ChIP-seq data showed no real binding of
MYC or p53 to the promoter regions of the 36 histone genes in HepG2 cells. While there is some
binding of p53 in non-HepG2 cells, as seen via the p53 Baer Hub, overall the hypothesis was not
supported by the ChIP-seq data. The significant results from the shMSI2 HepG2 cells in the
PRO-seq experiment were most likely not related to p53 and MYC binding to the promoter regions of
histone genes. The null hypothesis was therefore accepted due to a lack of binding of MYC and p53
to the histone genes’ promoter regions in HepG2 cells.
4. Discussion:
In this project, hepatocellular carcinoma (HCC) cell lines were used to determine changes
in the rate of transcription when MSI2 levels were altered. The results showed that HepG2 cells,
as opposed to Hep3B and Huh7 cells, had genes with significantly altered rates of transcription
when MSI2 was knocked down with a short-hairpin targeting MSI2. There was an abundance of
significant results in the shMSI2 HepG2 cells (461 genes), whereas the MSI2 over-expression HepG2
cells produced only 26 genes with significantly altered rates of transcription. Among the top 50
most significantly altered genes in shMSI2 HepG2 cells, 13 (26%) were histone genes. Overall, of
the 461 significantly changed genes in HepG2 shMSI2 cells, 36 were histone genes. Of these
histone genes, 28/36 were histone variants.
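The proportions quoted above can be checked with a short tally. The sketch below uses the non-variant classifications listed in Table 4 and the counts stated in the text; the variable names are hypothetical.

```python
# Non-variant histone genes as classified in Table 4 (includes two pseudogenes)
non_variant = {
    "HIST1H4B", "HIST2H2BA", "HIST1H4E", "HIST2H3PS2",
    "HIST1H4C", "HIST1H4D", "HIST2H4A", "HIST2H4B",
}
total_histone_genes = 36       # significant histone genes in shMSI2 HepG2 cells
total_significant_genes = 461  # all significant genes in shMSI2 HepG2 cells
histone_in_top_50 = 13         # histone genes among the top 50 (Table 2)

variant_count = total_histone_genes - len(non_variant)
share_of_top_50 = f"{histone_in_top_50 / 50:.0%}"
share_of_significant = f"{total_histone_genes / total_significant_genes:.1%}"
share_variants = f"{variant_count / total_histone_genes:.1%}"

print(variant_count)         # 28
print(share_of_top_50)       # 26%
print(share_of_significant)  # 7.8%
print(share_variants)        # 77.8%
```

Histone genes therefore make up about a quarter of the top 50 results but under 8% of all 461 significant genes, so they are concentrated among the most significant changes.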
Histone variants differ from canonical (non-variant) histone proteins in that not all histone
variants are replication-dependent; some are replication-independent. Histone variants also have
functions that are not tied specifically to the nucleosome or to replication during the S phase
of the cell cycle. Histone variants are altered in various types of cancer, which makes them an
interesting research focus. While the transcription rate of histone variants increased in HepG2
shMSI2 cells, there was little to no difference in transcription rate in Huh7 and Hep3B cells
when MSI2 was knocked down. In addition to previously mentioned functions, such as histone H1.2
inducing apoptosis, histone variants are involved in fundamental chromatin differentiation
[37, 46]. Histone variants may also be involved in epigenetic inheritance [46]. In general,
histone variants are more varied than their replication-dependent non-variant counterparts,
because their functions are not tied only to DNA replication or nucleosome stability. Through
post-translational modifications, histone variants can recruit proteins to the chromatin that
interact specifically with the modified variants [39]. Through these variant-interacting
proteins, regions of chromatin containing histone variants can acquire unique characteristics due
to the interacting scaffolding proteins [39].
These significant results were not present in Huh7 and Hep3B cells, which prompted
research to ascertain why. The biggest difference among the HCC cell lines used was that the p53
gene was fully functional in HepG2 cells, whereas in Huh7 and Hep3B cells it was affected by
mutations. p53 protein was not present in Hep3B cells due to a null mutation; in Huh7 cells, p53
carried the Y220C point mutation, which destabilized the p53 protein. Additional research showed
that MSI2 and p53 interact indirectly through NUMB. MSI2 was shown to down-regulate NUMB; NUMB
interacts with p53 as well as MDM2 [47]. NUMB was shown to form a tri-complex with p53 and the
E3 ubiquitin ligase MDM2 [47]. In this complex, NUMB prevents the ubiquitination and breakdown
of p53 by MDM2, which stabilizes p53 protein levels.
With MSI2 knocked down, NUMB and p53 protein levels would be expected to be higher in
shMSI2 cells than in MSI2 over-expression HepG2 cells. MSI2 was also shown to increase levels of
MYC in Huh7 HCC cells via IRES-mediated cap-independent translation of MYC [15]. Ultimately, MYC
and p53 were hypothesized to be two competing regulators of histone transcription rate,
depending on whether MSI2 was wild-type or knocked down. Normal levels of MSI2 would allow MYC,
rather than p53, to bind to the promoters of histone genes because of p53 down-regulation via
MDM2. When MSI2 was knocked down, p53 would then be able to bind to the histone genes’ promoter
regions. To see if this was the case, ChIP-seq data sets available through NCBI were visualized
via the UCSC Genome Browser. As p53 and MYC proteins can act as transcription factors, this could
have explained the disparity in results between HepG2 cells and Huh7 and Hep3B cells in the
PRO-seq experiment. This ‘competition’ between MYC and p53, however, did not appear to be the
cause of the differences in transcription rate in HepG2 cells when MSI2 was knocked down. While
the ChIP-seq data for non-HepG2 cells from the p53 Baer Hub showed some binding of p53 to histone
genes’ promoter regions, this was not the case in HepG2 cells. In HepG2 cells, 0/36 histone
genes’ promoter regions were bound by p53 or MYC based on the ChIP-seq data. p53 did not act as a
transcriptional activator, and MYC did not act as a transcriptional repressor, at histone gene
promoter regions in HepG2 cells. Therefore, the hypothesis was not supported and the null
hypothesis was accepted.
There are a few possible explanations for why there were significant results in HepG2
cells but not in Huh7 and Hep3B cells. One is that, when nuclei were isolated from the various
cells for the PRO-seq wet lab protocol, higher quality nuclei were extracted from HepG2 cells
than from Huh7 or Hep3B cells. This could have affected the PRO-seq analysis and produced the
disparity in results between HepG2 cells and Huh7 and Hep3B cells. Another is that MSI2, not p53
and MYC, was the sole regulator, and protein levels of MYC were possibly higher in HepG2 cells
than in Huh7 and Hep3B cells. This would explain why knocking down MSI2 had such a wide range of
results among the HCC cells. It is known that MSI2 contributes to HCC progression in vitro and in
vivo by binding to MYC RNA and increasing its translation, but how this contribution differs
between cell types has not been established. MSI2 downstream effects may be the cause of these
results. Finally, the model may not involve MYC and may depend only on MSI2 and p53. This could
explain why no p53 binding was seen in HepG2 cells in the ChIP-seq data set: MSI2 was wild-type
in the GEO ChIP-seq data sets, so p53 binding may not have appeared. Binding of p53 to the
promoter regions of histone genes may occur only when MSI2 is knocked down; this model would also
be independent of MYC.
While the mechanism behind the results of the PRO-seq experiment was not ascertained,
the results of this study merit further research and experimentation. Histone variants, as
mentioned throughout this thesis, have several different functions, such as inducing apoptosis
[37]. Histone variants can also affect chromatin differentiation, and possibly epigenetic
inheritance [46]. However, histone variants are hard to study because of issues with antibody
quality and a lack of references for those antibodies. Histone variant protein levels are also
difficult to detect because of the structural similarity among histone variants, especially
histone H2A variants [45]. For detecting histone variant protein levels, mass spectrometry is
thought to be superior to western blot [45]. Mass spectrometry would be a good first step in
quantifying histone variants in HepG2, Huh7, and Hep3B cells in future experiments. Another
potential limitation is determining which histone variants to focus on: should one choose the
histone variants most abundant at the protein level, or the histone variants that are mutated in
HCC? After quantifying histone variant protein levels, choosing histone variants to focus on
would be the next logical step.
After choosing histone variants to focus on, altering the expression of these histone
variants and measuring tumorigenicity in vitro in HCC cells and in vivo in HCC mice could be the
final step. Hepatocellular carcinoma is the most common form of liver cancer in the world, and
MSI2 has been shown to contribute to HCC progression. MSI2 knock-down in HepG2 HCC cells was
shown to significantly alter the transcription rate of 461 genes, 36 of which were histone genes.
Of these histone genes, 28 were histone variants, which have several different functions and are
mutated in various cancers. Histone variants have also been shown to affect chromatin structure
depending on which variants make up the nucleosome. Histone variants are also thought to be
involved in epigenetic inheritance, which can have important effects in offspring. In terms of
innovation, histones, and more specifically histone variants, have not previously been connected
to MSI2. The role of MSI2 in HCC has been shown in various research articles; the effect of MSI2
on histone gene transcription, however, is a unique topic. It would be interesting to see whether
histone variants with known functions still function the same way in HCC. Some histone variants’
functions are also unknown; using MSI2 knock-down and MSI2 over-expression HCC cells or mice to
discover these functions would also be innovative. Histone variants are a worthwhile research
topic, especially in the context of studying MSI2 and HCC. Hopefully, further examination of
these histone variants will strengthen MSI2 as a therapeutic target in the fight against
hepatocellular carcinoma.
References:
1. Villanueva, A. (2019). Hepatocellular carcinoma. New England Journal of Medicine, 380(15),
1450–1462. https://doi.org/10.1056/nejmra1713263
2. Key statistics about liver cancer. American Cancer Society. (n.d.).
https://www.cancer.org/cancer/liver-cancer/about/what-is-key-statistics.html.
3. Tanaka, J., Akita, T., Ko, K., Miura, Y., & Satake, M. (2019). Countermeasures against viral
hepatitis B and C in Japan: An epidemiological point of view. Hepatology Research, 49(9),
990–1002. https://doi.org/10.1111/hepr.13417
4. Seitz, H. K., & Becker, P. (2007). Alcohol metabolism and cancer risk. Alcohol Research &
Health: The Journal of the National Institute on Alcohol Abuse and Alcoholism.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3860434/.
5. Tarao, K., Nozaki, A., Ikeda, T., Sato, A., Komatsu, H., Komatsu, T., Taguri, M., & Tanaka, K.
(2019). Real impact of liver cirrhosis on the development of hepatocellular carcinoma in various
liver diseases—meta-analytic assessment. Cancer Medicine, 8(3), 1054–1065.
https://doi.org/10.1002/cam4.1998
6. Cholankeril, G., Patel, R., Khurana, S., & Satapathy, S. K. (2017). Hepatocellular carcinoma
in non-alcoholic steatohepatitis: Current knowledge and implications for management. World
Journal of Hepatology, 9(11), 533. https://doi.org/10.4254/wjh.v9.i11.533
7. Desai, A., Sandhu, S., Lai, J.-P., & Sandhu, D. S. (2019). Hepatocellular carcinoma in
non-cirrhotic liver: A comprehensive review. World Journal of Hepatology, 11(1), 1–18.
https://doi.org/10.4254/wjh.v11.i1.1
8. Seyfried, T. N., & Huysentruyt, L. C. (2013). On the origin of cancer metastasis. Critical
Reviews in Oncogenesis, 18(1–2), 43–73. https://doi.org/10.1615/critrevoncog.v18.i1-2.40
9. Martin, T. A. (1970, January 1). Cancer Invasion and Metastasis: Molecular and Cellular
Perspective. Madame Curie Bioscience Database [Internet].
https://www.ncbi.nlm.nih.gov/books/NBK164700/.
10. Tsai, J. H., & Yang, J. (2013). Epithelial-mesenchymal plasticity in carcinoma metastasis.
Genes & Development, 27(20), 2192–2206. https://doi.org/10.1101/gad.225334.113
11. Sun, C., Sun, L., Jiang, K., Gao, D.-M., Kang, X.-N., Wang, C., Zhang, S., Huang, S., Qin,
X., Li, Y., & Liu, Y.-K. (2013). NANOG promotes liver cancer cell invasion by inducing
epithelial–mesenchymal transition through Nodal/Smad3 signaling pathway. The International
Journal of Biochemistry & Cell Biology, 45(6), 1099–1108.
https://doi.org/10.1016/j.biocel.2013.02.017
12. Shan, J., Shen, J., Liu, L., Xia, F., Xu, C., Duan, G., Xu, Y., Ma, Q., Yang, Z., Zhang, Q.,
Ma, L., Liu, J., Xu, S., Yan, X., Bie, P., Cui, Y., Bian, X.-wu, & Qian, C. (2012). Nanog
regulates self-renewal of cancer stem cells through the insulin-like growth factor pathway in
human hepatocellular carcinoma. Hepatology, 56(3), 1004–1014. https://doi.org/10.1002/hep.25745
13. Ayob, A. Z., & Ramasamy, T. S. (2018). Cancer stem cells as key drivers of tumour
progression. Journal of Biomedical Science, 25(1). https://doi.org/10.1186/s12929-018-0426-4
14. Kudinov, A. E., Karanicolas, J., Golemis, E. A., & Boumber, Y. (2017). Musashi RNA-Binding
Proteins as Cancer Drivers and Novel Therapeutic Targets. Clinical Cancer Research, 23(9),
2143–2153. https://doi.org/10.1158/1078-0432.ccr-16-2728
15. Yeh, D.W., Siddique, H.R., Zheng, M., Choi, H.Y., Machida, T., Narayanan, P., Kou, Y., Punt,
V., Tahara, S.M., Feldman, D.E., Chen, L., Machida, K. (2021). MSI2 Promotes MYC and Viral
Translation to Induce Self-Renewal of Tumor-Initiating Cells. [Manuscript not yet published]
16. Leukemia - chronic myeloid - cml - phases. Cancer.Net. (2018, July 17).
https://www.cancer.net/cancer-types/leukemia-chronic-myeloid-cml/phases.
17. Ito, T., Kwon, H. Y., Zimdahl, B., Congdon, K. L., Blum, J., Lento, W. E., Zhao, C., Lagoo,
A., Gerrard, G., Foroni, L., Goldman, J., Goh, H., Kim, S.-H., Kim, D.-W., Chuah, C., Oehler, V.
G., Radich, J. P., Jordan, C. T., & Reya, T. (2010). Regulation of Myeloid Leukemia by the
Cell-Fate Determinant Musashi. Nature, 466(7307), 765–768. https://doi.org/10.1038/nature09171
18. Fang, T., Lv, H., Wu, F., Wang, C., Li, T., Lv, G., Tang, L., Guo, L., Tang, S., Cao, D., Wu,
M., Yang, W., & Wang, H. (2017). Musashi 2 contributes to the stemness and chemoresistance of
liver cancer stem cells via Lin28a activation. Cancer Letters, 384, 50–59.
https://doi.org/10.1016/j.canlet.2016.10.007
19. Shyh-Chang, N., & Daley, G. Q. (2013). Lin28: Primal regulator of growth and metabolism
in stem cells. Cell Stem Cell, 12(4), 395–406. https://doi.org/10.1016/j.stem.2013.03.005
20. Wang, H., Zhao, Q., Deng, K., Guo, X., & Xia, J. (2016). Lin28: An emerging important
oncogene connecting several aspects of cancer. Tumor Biology, 37(3), 2841–2848.
https://doi.org/10.1007/s13277-015-4759-2
21. Wang, T., Wang, G., Hao, D., Liu, X., Wang, D., Ning, N., & Li, X. (2015). Aberrant
regulation of the LIN28A/LIN28B and let-7 loop in human malignant tumors and its effects on
the hallmarks of cancer. Molecular Cancer, 14(1). https://doi.org/10.1186/s12943-015-0402-5
22. Dang, C. V. (2012). MYC on the path to cancer. Cell, 149(1), 22–35.
https://doi.org/10.1016/j.cell.2012.03.003
23. Lin, C.-P., Liu, C.-R., Lee, C.-N., Chan, T.-S., & Liu, H. E. (2010). Targeting c-myc as a
novel approach for hepatocellular carcinoma. World Journal of Hepatology, 2(1), 16–20.
https://doi.org/10.4254/wjh.v2.i1.16
24. Schlaeger, C., Longerich, T., Schiller, C., Bewerunge, P., Mehrabi, A., Toedt, G., Kleeff, J.,
Ehemann, V ., Eils, R., Lichter, P., Schirmacher, P., & Radlwimmer, B. (2007). Etiology-
dependent molecular mechanisms in human hepatocarcinogenesis. Hepatology, 47(2), 511–520.
https://doi.org/10.1002/hep.22033
25. Gabay, M., Li, Y., & Felsher, D. W. (2014). Myc activation is a hallmark of cancer initiation
and maintenance. Cold Spring Harbor Perspectives in Medicine, 4(6).
https://doi.org/10.1101/cshperspect.a014241
26. Yang, Y., & Wang, Z. (2019). IRES-mediated cap-independent translation, a path leading to
hidden proteome. Journal of Molecular Cell Biology, 11(10), 911–919.
https://doi.org/10.1093/jmcb/mjz091
27. Kumar, M., & Panda, D. (2014). Role of supportive care for terminal stage hepatocellular
carcinoma. Journal of Clinical and Experimental Hepatology, 4, S130–S139.
https://doi.org/10.1016/j.jceh.2014.03.049
28. Medavaram, S., & Zhang, Y. (2018). Emerging therapies in advanced hepatocellular
carcinoma. Experimental Hematology & Oncology, 7(1).
https://doi.org/10.1186/s40164-018-0109-6
29. Yang, Z. F., & Poon, R. T. P. (2008). Vascular changes in hepatocellular carcinoma. The
Anatomical Record: Advances in Integrative Anatomy and Evolutionary Biology, 291(6), 721–
734. https://doi.org/10.1002/ar.20668
30. Finn, R. S., Qin, S., Ikeda, M., Galle, P. R., Ducreux, M., Kim, T.-Y., Kudo, M., Breder, V.,
Merle, P., Kaseb, A. O., Li, D., Verret, W., Xu, D.-Z., Hernandez, S., Liu, J., Huang, C., Mulla,
S., Wang, Y., Lim, H. Y., … Cheng, A.-L. (2020). Atezolizumab Plus Bevacizumab in
Unresectable Hepatocellular Carcinoma. New England Journal of Medicine, 382(20), 1894–1905.
https://doi.org/10.1056/nejmoa1915745
31. Intravitreal bevacizumab in diabetic retinopathy. Recommendations from the Pan-American
Collaborative Retina Study Group (PACORES): The 2016 Knobloch Lecture.
(2018). Asia-Pacific Journal of Ophthalmology, 7(1), 36–39.
https://doi.org/10.22608/apo.2017466
32. Finn, R. S., Ryoo, B.-Y., Merle, P., Kudo, M., Bouattour, M., Lim, H. Y., Breder, V., Edeline,
J., Chao, Y., Ogasawara, S., Yau, T., Garrido, M., Chan, S. L., Knox, J., Daniele, B., Ebbinghaus,
S. W., Chen, E., Siegel, A. B., Zhu, A. X., & Cheng, A.-L. (2020). Pembrolizumab as second-line
therapy in patients with advanced hepatocellular carcinoma in KEYNOTE-240: A
randomized, double-blind, phase III trial. Journal of Clinical Oncology, 38(3), 193–202.
https://doi.org/10.1200/jco.19.01307
33. Cicalese, L. (2021, June 14). Hepatocellular carcinoma (HCC): Practice essentials,
anatomy, pathophysiology. https://emedicine.medscape.com/article/197319-overview.
34. Mei, Q., Huang, J., Chen, W., Tang, J., Xu, C., Yu, Q., Cheng, Y., Ma, L., Yu, X., & Li, S.
(2017). Regulation of DNA replication-coupled histone gene expression. Oncotarget, 8(55),
95005–95022. https://doi.org/10.18632/oncotarget.21887
35. Luger, K., Mäder, A. W., Richmond, R. K., Sargent, D. F., & Richmond, T. J. (1997). Crystal
structure of the nucleosome core particle at 2.8 Å resolution. Nature, 389(6648), 251–260.
https://doi.org/10.1038/38444
36. Nizami, Z., Deryusheva, S., & Gall, J. G. (2010). The Cajal body and histone locus
body. Cold Spring Harbor Perspectives in Biology, 2(7).
https://doi.org/10.1101/cshperspect.a000653
37. Monteiro, F. L., Baptista, T., Amado, F., Vitorino, R., Jerónimo, C., & Helguero, L. A.
(2014). Expression and functionality of histone H2A variants in cancer. Oncotarget, 5(11), 3428–
3443. https://doi.org/10.18632/oncotarget.2007
38. Long, M. (2019). A novel histone H4 variant H4G regulates rDNA transcription in breast
cancer. Nucleic Acids Research, 47(16), 8399–8409.
https://doi.org/10.14711/thesis-991012730058603412
39. Martire, S., & Banaszynski, L. A. (2020). The roles of histone variants in fine-tuning
chromatin organization and function. Nature Reviews Molecular Cell Biology, 21(9), 522–541.
https://doi.org/10.1038/s41580-020-0262-8
40. Mahat, D. B., Kwak, H., Booth, G. T., Jonkers, I. H., Danko, C. G., Patel, R. K., Waters, C.
T., Munson, K., Core, L. J., & Lis, J. T. (2016). Base-pair-resolution genome-wide mapping of
active RNA polymerases using precision nuclear run-on (PRO-seq). Nature Protocols, 11(8),
1455–1476. https://doi.org/10.1038/nprot.2016.086
41. HOMER (v4.11, 10-24-2019). Homer Software and Data Download. (n.d.).
http://homer.ucsd.edu/homer/.
42. Nguyen, T.-A., Grimm, S. A., Bushel, P. R., Li, J., Li, Y., Bennett, B. D., Lavender, C. A.,
Ward, J. M., Fargo, D. C., Anderson, C. W., Li, L., Resnick, M. A., & Menendez, D. (2018).
Revealing a human p53 universe. Nucleic Acids Research.
43. Chan, C., Thurnherr, T., Wang, J., Gallart-Palau, X., et al. (2016). Global re-wiring of p53
transcription regulation by the hepatitis B virus X protein. Molecular Oncology, 10(8),
1183–1195. PMID: 27302019
44. Partridge, E. C., Chhetri, S. B., Myers, R. M., & Mendenhall, E. M. (n.d.). GEO accession
viewer. National Center for Biotechnology Information.
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM2797581.
45. El Kennani, S., Adrait, A., Permiakova, O., Hesse, A.-M., Ialy-Radio, C., Ferro, M., Brun, V.,
Cocquet, J., Govin, J., & Pflieger, D. (2018). Systematic quantitative analysis of H2A and H2B
variants by targeted proteomics. Epigenetics & Chromatin, 11(1).
https://doi.org/10.1186/s13072-017-0172-y
46. Henikoff, S., & Smith, M. M. (2015). Histone variants and epigenetics. Cold Spring Harbor
Perspectives in Biology, 7(1). https://doi.org/10.1101/cshperspect.a019364
47. Colaluca, I. N., Tosoni, D., Nuciforo, P., Senic-Matuglia, F., Galimberti, V., Viale, G., Pece,
S., & Di Fiore, P. P. (2008). Numb controls p53 tumour suppressor activity. Nature, 451(7174),
76–80. https://doi.org/10.1038/nature06412
Abstract
Hepatocellular carcinoma (HCC) is the most common form of primary liver cancer worldwide. Musashi RNA-Binding Protein 2 (MSI2) has been shown to contribute to HCC tumorigenesis both in vitro and in vivo. Using HOMER to analyze PRO-seq data, we measured changes in transcription rate when MSI2 was overexpressed and knocked down in three HCC cell lines. In HepG2 cells, the transcription rates of thirty-six histone genes were significantly altered when MSI2 was knocked down. Twenty-eight of these were histone variant genes. We hypothesized that p53 and MYC compete for histone promoter regions, thereby affecting transcription rate. To test this, we analyzed p53 and MYC ChIP-seq binding data. However, neither p53 nor MYC was shown to bind histone promoter regions in HepG2 cells. Although the mechanism was not determined, the connection between histone variants, MSI2, and HCC warrants further research.
Asset Metadata
Creator
Farooq, Ahmed (author)
Core Title
MSI2 level alters histone transcription rate in HepG2 cells
School
Keck School of Medicine
Degree
Master of Science
Degree Program
Molecular Microbiology and Immunology
Degree Conferral Date
2021-12
Publication Date
11/03/2021
Defense Date
09/15/2021
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Alcohol, cancer, ChIP-seq, Cirrhosis, Hep3B, hepatocellular carcinoma, HepG2, histone, Homer, Huh7, liver cancer, Metastasis, OAI-PMH Harvest, obesity, PRO-seq, transcription rate
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Machida, Keigo (committee chair), Ou, James (committee member), Zandi, Ebrahim (committee member)
Creator Email
afarooq@usc.edu,farooqa1@uci.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC16659542
Unique identifier
UC16659542
Legacy Identifier
etd-FarooqAhme-10204
Document Type
Thesis
Rights
Farooq, Ahmed
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu