Structural and Biochemical Analyses on Substrate Specificity and HIV-1 Vif Mediated Inhibition
of Human APOBEC3 Cytidine Deaminases
by
Fumiaki Ito
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy
(MOLECULAR BIOLOGY)
May 2019
Abstract
APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) proteins belong to a
family of polynucleotide cytidine deaminases that play diverse biological roles by converting
cytosine (C) into uracil (U) on single-stranded DNA (ssDNA) or RNA (ssRNA). APOBEC3
(A3) subfamily members (A3A-H) are a hallmark of intrinsic immunity: they restrict viral
infection and maintain genomic integrity by triggering lethal hypermutation of viral genomes.
Whereas cytidine deamination of foreign DNA by A3 proteins is an essential part of innate
immunity, several A3 members inadvertently mutate the host genome, which may lead to
carcinogenesis.
In addition to deaminating the canonical cytosine base, several APOBEC members have been
implicated in deaminating 5-methylcytosine (mC), an epigenetically modified form of cytosine in
genomic DNA. Deamination of mC produces a different type of mutation because its product
is thymine (T). Conversion of mC into T was previously proposed to be part of the
demethylation process of mC in the genome for epigenetic regulation. However, the involvement of
the APOBEC family as a whole in deaminating mC in the genome and the underlying molecular
basis for recognizing mC have not been clearly delineated.
The potent antiviral activity of A3 proteins has led lentiviruses to evolve a unique gene called viral
infectivity factor (Vif). Vif is a highly conserved lentiviral accessory gene and is essential for
successful infection and propagation of the viruses. Vif-deficient HIV-1 (HIV-1 ΔVif) is unable to
replicate in vivo because its genome is susceptible to lethal hypermutation caused by A3s. HIV-1
Vif targets a set of human A3s for proteasomal degradation to escape their antiviral activity.
In this process, Vif hijacks the Cul5-EloB-EloC-Rbx2 E3 ubiquitin ligase complex to carry out the
ubiquitination and subsequent proteasomal degradation of A3s, and additionally recruits the cellular
transcription factor CBF-β to form a stable complex suitable for binding to A3s. Despite extensive
efforts, the molecular mechanisms of Vif-mediated antagonism of A3 proteins remain elusive.
In this thesis, we address two open questions regarding the APOBEC
family. (1) What is the potential of each APOBEC protein to deaminate mC in addition to
conventional C for the proposed epigenetic alteration, and what is the molecular basis for
differentiating mC from C? (2) How are APOBEC proteins targeted and antagonized by HIV-1 Vif
during HIV infection at the molecular level?
In chapter 1, the current overview and the recent progress in APOBEC family research are
reviewed.
In chapter 2, we comprehensively investigated the potential of different APOBEC proteins to
deaminate canonical C and mC by using purified recombinant proteins of 11 human APOBECs.
The results showed that different APOBEC members had drastically different degrees of
overall deaminase activity and selectivity for mC. A3A and A3H showed distinctively high
deaminase activity for both C and mC with relatively high selectivity for mC, whereas six other
APOBEC members (A1, A3B, A3C, A3D, A3F, and A3G) showed moderate to low deaminase
activity and relatively low selectivity for mC. Furthermore, we identified and dissected the
structural/sequence elements that contribute to the efficient deamination of C and mC in A3A by
structure-guided functional mutagenesis. These findings expand our understanding of APOBEC
family proteins, which have diverse cellular functions and a possible link to epigenetic regulation.
In chapter 3, we analyzed the structure and biochemical properties of A3H, a potent restriction
factor against HIV-1 ΔVif. The 2.49 Å crystal structure of monomeric A3H revealed several unique
features, including an unusually long C-terminal helix (α6) and a long flexible loop around the
zinc-coordinating active center. A3H showed an extensive positively charged surface around the
active site pocket and substrate binding loops. Multiple positively charged residues within this
patch were important for the subcellular localization and the deaminase activity of A3H.
Interestingly, A3H formed an enzymatically inactive high molecular weight ribonucleoprotein
complex in vivo, which could be dissociated into highly active low molecular weight species by
treatment with RNase A. These results shed light on the molecular basis for the complex nucleic acid
binding, molecular assembly, and catalytic regulation of A3H.
In chapter 4, we analyzed the molecular interaction between A3H and HIV-1 Vif by an in vitro
binding assay, an in vivo Vif-mediated degradation assay, and cryo-electron microscopy (cryo-EM)
single particle analysis. The Vif-E3 ubiquitin ligase complex comprising the Vif/CBF-β/EloB/EloC/Cul5
heteropentamer was purified to homogeneity. The Vif complex bound to both monomeric and dimeric
forms of A3H in vitro. The Vif-mediated degradation assay revealed that α-helices 3 and 4 in A3H form a
critical interface for Vif binding and that several clustered lysine residues around loop 3 are likely
to be part of the Vif-mediated ubiquitination site in A3H. Our preliminary cryo-EM 3D
reconstruction allowed us to fit reported atomic structures of the Vif complex and A3H. These results
should facilitate the understanding of HIV Vif-mediated counteraction of human A3 proteins and
provide valuable insight for future drug development.
Table of Contents
Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Publication
Chapter 1. Introduction to APOBEC Family Proteins
Chapter 2. Family-Wide Comparative Analysis of Cytidine and Methylcytidine Deamination by Eleven Human APOBEC Proteins
Chapter 3. Understanding the Structure, Multimerization, Subcellular Localization and mC Selectivity of a Genomic Mutator and Anti-HIV Factor APOBEC3H
Chapter 4. Structural and Biochemical Analysis of HIV-1 Vif-A3H Interaction
References
List of Tables
Table 2.1 Deaminase activity for C and mC of all APOBEC proteins tested at the indicated optimal pH with the preferred DNA substrate motifs.
Table 2.2 Deaminase activity of A3A and A3BCD2 mutants for C and mC.
Table 3.1 Crystallographic data collection and refinement statistics.
Table 3.2 ssDNA and ssRNA binding properties of A3H dimeric and monomeric mutants.
Table 3.3 The deaminase activity for C and mC, and mC selectivity factor of A3H mutants.
List of Figures
Figure 1.1 Eleven members of human APOBEC family and their functional roles
Figure 1.2 Sequence alignment of human APOBEC3 proteins
Figure 1.3 Lentiviral accessory proteins and their target restriction factors
Figure 2.1 Proposed demethylation pathway of mC
Figure 2.2 SDS-PAGE of the 11 purified APOBECs
Figure 2.3 Time course deaminase activity of A3A for normal C and mC
Figure 2.4 Dose dependent deaminase activity of 9 APOBEC proteins for C and mC
Figure 2.5 Identification of structural elements important for deaminase activity and mC selectivity
Figure 2.6 Comparison of the substrate binding affinity and substrate motif specificity of WT and R1 mutant of A3A
Figure 2.7 Y130 on loop 7 of A3A modulates the selectivity for mC deamination
Figure 3.1 Protein purification and the overall structure of A3H
Figure 3.2 The positively charged surface and the nucleic acid binding property of A3H
Figure 3.3 Multimerization of A3H in HEK293T cells and RNA-dependent inhibition of deaminase activity
Figure 3.4 Positively charged patches are important for subcellular localization and deaminase activity of A3H
Figure 3.5 Comparison of the charged surface around the Zn-active site center of A3H, A3A, and A3BCD2
Figure 3.6 The deaminase activity for C and mC of A3H mutants
Supplementary Figure S3.1 Protein purification of wild-type A3H hap II
Supplementary Figure S3.2 Multiple sequence alignment of APOBEC proteins
Supplementary Figure S3.3 Structural features of A3H
Supplementary Figure S3.4 The charged surface feature surrounding the zinc-coordinating active center of the active APOBEC domains
Supplementary Figure S3.5 Comparison of the reported A3H structures
Figure 4.1 Constructs, purification, and negative-stain EM image of VCBCC complex
Figure 4.2 Binding of VCBCC complex to A3H
Figure 4.3 Binding of VCBCC complex to A3H dimer and monomer
Figure 4.4 Vif binding site in A3H
Figure 4.5 Identification of target lysine residues in A3H for Vif-mediated ubiquitination
Figure 4.6 Cryo-EM analysis of VCBCC/MBP-A3H dimer complex
Figure 4.7 Cryo-EM analysis of VCBCC/A3H monomer complex
Figure 4.8 Schematic of interaction between VCBCC complex and A3H
Acknowledgements
It has been six years since I left Japan and came to the US to pursue a PhD. I am grateful to
everyone I met during my time at USC.
I thank my advisor Dr. Xiaojiang Chen for all the support and mentorship he provided
during my PhD. He gave me opportunities to learn various scientific skills and pursue fundamental
scientific questions with great independence. He has always been available to give me advice and
suggestions whenever I faced problems. This environment enabled me to learn how to approach
scientific problems both independently and cooperatively with other scientists. Thanks to him, I
have been able to study something that I am truly interested in.
I thank my dissertation committee members, Dr. Lin Chen, Dr. Vadim Cherezov, and Dr.
Matthew Pratt, for their time and suggestions.
I thank Dr. Hanjing Yang for continuous on-site technical assistance and intellectual advice for
my research for the past six years.
I thank Dr. Hong Zhou at the University of California, Los Angeles (UCLA) for giving me the
opportunity to work on transmission electron microscopes at the California NanoSystems Institute.
I thank Dr. Ana Lucia Alvarez Cabrera and Dr. Kevin Huynh for teaching me electron
microscopy techniques and assisting with data collection and processing for single particle analysis.
I thank all my former and present lab members who have accompanied me during my PhD:
Aaron Wolfe, Brett Zirkle, Jiemin Zhao, Kyu Min Kim, Xiao Xiao, Yang Fu, Lyon Chen, Jiang
Gu, Damian Wang, Yao Fang, Guan Wang, Gary Molano, Thomas Colin, Cathy Marsura, Vagan
Arutiunian, Dan Ma, Guochang Lyu, and Renee Zhang.
I thank my undergraduate fellows, Jhennis Megan R. Lacsamana, Cherie Morimoto, Emi Hirsh,
Kenzie Cohen, Sasha Park, and Shen-Chi Andrew Kao, for assisting with my work and routine lab tasks.
I thank the Nakajima Foundation for offering me a five-year graduate fellowship.
Lastly, I thank my parents for their unconditional support from Japan for the past six years. They
have respected every decision I made and let me pursue my career in science.
Publication
[1] Ito F, Yang H, Xiao X, Li SX, Wolfe A, Zirkle B, Arutiunian V, and Chen XS, Understanding
the Structure, Multimerization, Subcellular Localization and mC Selectivity of a Genomic Mutator
and Anti-HIV Factor APOBEC3H. Sci Rep 8 (2018), 3763.
[2] Ito F, Fu Y, Kao SCA, Yang H, and Chen XS, Family-Wide Comparative Analysis of Cytidine
and Methylcytidine Deamination by Eleven Human APOBEC Proteins. J Mol Biol 429 (2017),
1787-1799.
[3] Gu J, Chen Q, Xiao X, Ito F, Wolfe A, and Chen XS, Biochemical Characterization of
APOBEC3H Variants: Implications for Their HIV-1 Restriction Activity and mC Modification. J
Mol Biol 428 (2016), 4626-4638.
[4] Fu Y, Ito F, Zhang G, Fernandez B, Yang H, and Chen XS, DNA cytosine and methylcytosine
deamination by APOBEC3B: enhancing methylcytosine deamination by engineering APOBEC3B.
Biochem J 471 (2015), 25-35.
Chapter 1
Introduction to APOBEC Family Proteins
General description of human APOBEC family
APOBEC (apolipoprotein B mRNA editing catalytic polypeptide-like) proteins belong to a
family of polynucleotide cytidine deaminases, which edit single-stranded DNA (ssDNA) or RNA
(ssRNA) by converting cytosine (C) into uracil (U). Humans have 11 members in this family:
APOBEC1 (A1), APOBEC2 (A2), APOBEC3 (A3: A3A, A3B, A3C, A3D, A3F, A3G, A3H),
APOBEC4 (A4) and activation-induced cytidine deaminase (AID). The seven A3 subfamily members
form a gene cluster on chromosome 22, which is the result of gene duplication events. Among the A3
members, A3A, A3C, and A3H have a single zinc-coordinating cytidine deaminase domain, while
A3B, A3D, A3F, and A3G have two cytidine deaminase domains (1) (Figure 1.1).
The A3 zinc-coordinating domain is further classified into three distinct subclasses, Z1-Z3 (1-3).
Z1 and Z2 domains share a SW(S/T)PCX2-4C motif (where X is any amino acid and the number after
X gives how many residues it spans), whereas the Z3 domain has a TWSPCX2C motif (4). The single
domain A3 proteins A3A, A3C, and A3H have Z1, Z2, and Z3 type domains, respectively. The double
domain A3 proteins A3B, A3D, A3F, and A3G have Z2-Z1, Z2-Z2, Z2-Z2, and Z2-Z1 arrangements,
respectively. A3H has the sole copy of the Z3 type domain (2, 5). All
cytidine deaminase domains contain the conserved zinc-coordinating motif HXEX25-31PCX2-4C
(Figure 1.2). The histidine (H) and the two cysteines (C) coordinate a zinc atom and form the
catalytic pocket. The cytosine base first fits into the active site pocket, and then the active site
glutamate (E) acts as a proton donor and activates the water molecule that is coordinated by the
zinc atom. The activated water deaminates the amino group on C4 of the bound cytidine through
a zinc-hydroxide-mediated nucleophilic attack (6). Although all A3 domains have the conserved
zinc-coordinating motif, the N-terminal domains of the double domain enzymes (A3B, A3D, A3F, and A3G)
appear to be catalytically inactive (2, 7, 8).
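As an illustration of the motif notation above, the conserved pattern HXEX25-31PCX2-4C can be written as a regular expression and scanned against a protein sequence. This is only a minimal sketch: the function name `find_zinc_motifs` and the toy sequence are assumptions made for this example, and the toy sequence is synthetic, not a real APOBEC sequence.

```python
import re

# H-X-E-X(25-31)-P-C-X(2-4)-C, where "." stands for any amino acid (X).
ZINC_MOTIF = re.compile(r"H.E.{25,31}PC.{2,4}C")


def find_zinc_motifs(sequence):
    """Return (0-based start, matched text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in ZINC_MOTIF.finditer(sequence.upper())]


# Synthetic toy sequence (hypothetical, for illustration only):
# 2 leading residues, then H-A-E, 27 spacer residues, P-C, 2 residues, C.
toy = "MK" + "HAE" + "G" * 27 + "PC" + "AA" + "C" + "LL"
hits = find_zinc_motifs(toy)
print(hits)
```

A match only shows that the sequence pattern is present. As the text notes, the N-terminal domains of the double domain enzymes carry the motif yet appear to be catalytically inactive, so a pattern hit does not imply deaminase activity.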
All APOBEC proteins are expected to have a conserved cytidine deaminase fold, which
comprises a five-β-stranded core surrounded by six α-helices in the order α1-β1-β2-α2-β3-
α3-β4-α4-β5-α5-α6 (Figure 1.2).
Physiological roles of A1, A2, A4, and AID
A1 is involved in lipid metabolism by editing the mRNA encoding ApoB to introduce an early
stop codon and produce a truncated ApoB for lipid transport (9). A2 and A4 appear to exhibit no
deaminase activity or mutagenic activity (10-12). While A2 has been implicated in muscle
differentiation, precise biological roles of these genes await further investigation (13). AID induces
somatic hypermutations in the immunoglobulin locus of the maturing B cells and triggers antibody
class-switch recombination, a critical process for antibody diversification and maturation for
humoral immunity (14-18). Genetic defects in AID cause type-2 hyper-IgM syndrome, a disease
related to immunodeficiency (19).
Antiviral activity of A3s
A3 subfamily members play pivotal roles in restricting exogenous and endogenous harmful
DNA, including viral pathogens and retrotransposons (20-26). A3G is known as a key restriction
factor of Vif-deficient human immunodeficiency virus-1 (HIV-1 ΔVif) (27-29). A3D, A3F, and
A3H (haplotypes II, V, VII) additionally restrict HIV ΔVif (30-32). However, the anti-HIV activity
of these A3s is inhibited during HIV infection when Vif is present (see the section on Vif).
A3C is a potent inhibitor of simian immunodeficiency virus (SIV) (33). A3A restricts human
papilloma virus (HPV) (34), hepatitis B virus (HBV) (35), adeno-associated virus (AAV) (36),
and the retrotransposons LINE-1 and Alu (37, 38). Whereas both A3G and A3A are induced by
interferon-α, A3G is primarily expressed in CD4-positive T-cells and A3A is mainly expressed in
myeloid lineages such as monocytes and macrophages (24, 36-44).
HBV restriction by A3A and A3B
Hepatitis B virus (HBV) is a major cause of chronic liver diseases. HBV possesses an irregular
form of genomic DNA that consists of both dsDNA and ssDNA regions. Upon infection of target cells,
the complementary strand of the ssDNA region is synthesized to form covalently closed circular
DNA (cccDNA). A3A and A3B were identified as key players in HBV restriction in liver cells (35).
Interferon-α and lymphotoxin-β receptor activation can trigger the degradation of HBV cccDNA
through hypermutation caused by A3A and A3B, followed by excision of the resulting uracil bases by
uracil DNA glycosylase (UDG). A3A and A3B specifically target cccDNA by directly binding to
the HBV core protein (HBc) that is associated with cccDNA.
HIV accessory proteins
HIV is responsible for acquired immunodeficiency syndrome (AIDS). In 2017, 36.9
million people were estimated to be living with HIV worldwide, and 21.7 million people were
accessing antiretroviral therapy (Global HIV and AIDS Statistics, 2018 fact sheet). When the
pandemic HIV-1 was first isolated and its genome was cloned and sequenced, several unknown
open reading frames were identified in addition to the common retroviral gag, pol, and env genes
for viral replication and propagation. It turned out that all lentiviruses, including HIV, have
developed this unique group of genes to overcome the innate immunity of the host cells. Their
protein products are called accessory proteins, since they are not directly involved in viral
genomic replication but drastically enhance the infectivity of the viruses in the host cells.
The accessory proteins encoded by HIV include Vif, Vpx, Vpr, Vpu, and Nef, all of which have
corresponding target restriction factors in the host cells (Figure 1.3). These accessory proteins
generally downregulate the steady-state level of their target factors through either proteasomal or
lysosomal degradation by hijacking the host ubiquitin ligases or clathrin adapters, respectively.
Each accessory protein acts on multiple targets, having developed distinctive surfaces for direct
binding to different host targets.
HIV accessory protein Vif
Vif had long been known as an essential factor for pathogenic infection by
lentiviruses and seemed to be required during the late stages of virus production (45, 46). In 2002,
A3G (originally termed CEM-15) was identified as a target host factor of Vif-mediated
antagonism, as transient or stable expression of A3G inhibited the replication of HIV-1 ΔVif in
permissive CEM-SS cells, which usually do not express A3G (27). Subsequent studies
showed that Vif degrades A3G in a proteasome-dependent manner (47-49) and that the Vif-deficient
HIV-1 genome is subject to A3G-mediated hypermutation (28, 50, 51). Vif hijacks the Cul5 E3
ubiquitin ligase complex containing Cul5, EloB, EloC and a RING-box protein to degrade A3G
through the ubiquitin-proteasome system. In this process, Vif additionally requires the transcription
factor CBF-β to form a stable and functional complex (52, 53). Multiple studies have shown that
the other A3 members that have anti-HIV activity, including A3D, A3F, and A3H (haplotype II), are
also targeted by Vif for proteasomal degradation (Figure 1.3) (30-32). Recently, a crystal structure
of a complex comprising Vif, CBF-β, EloB, EloC, and an N-terminal fragment of Cul5 was reported
(54). This hetero-pentameric structure shows that Vif simultaneously contacts CBF-β, EloC, and
Cul5, promoting the assembly of the E3 ubiquitin ligase. Currently, the structural basis of how Vif
targets A3 proteins for degradation is unclear. On the other hand, human apoptosis signal-regulating
kinase 1 (ASK1) was identified as a natural inhibitor of HIV-1 Vif (55). ASK1 disrupts the Vif-
mediated degradation of A3G by directly binding to the EloBC binding region of Vif, thereby
inhibiting the formation of the E3 ubiquitin ligase. These interactions are likely to be part of an
ongoing molecular arms race between humans and viruses.
Other accessory proteins related to APOBECs, Vpx and Vpr
Vpx is another important accessory protein, encoded by HIV-2 and the simian immunodeficiency
virus SIVsm/SIVmac lineage. Vpx is an indispensable factor when these lentiviruses infect host
monocytes. Although HIV-1, which does not encode Vpx, is not able to infect human
monocytes, introducing Vpx into monocytes allows HIV-1 to infect them efficiently (56). The
known target proteins of Vpx include human A3A, SAMHD1, and the HUSH complex, all of which
are restriction factors in myeloid lineage cells (57-61). A3A was first shown to interact with Vpx
from HIV-2/SIVsm by co-immunoprecipitation of overexpressed proteins (57, 62). These studies
showed that the cellular A3A level was lowered in the presence of Vpx, indicating that Vpx may
employ the ubiquitin-proteasome system to degrade A3A. SAMHD1 is a deoxynucleoside
triphosphate triphosphohydrolase that lowers the intracellular levels of dNTPs (the materials for DNA
replication), thereby inhibiting viral reverse transcription mainly in non-proliferating
cells such as myeloid cells and resting T-cells (63). Vpx recruits the Cul4A E3 ubiquitin ligase complex to
ubiquitinate and degrade SAMHD1. A crystal structure of Vpx from SIVsm in complex with the C-terminal
region of human SAMHD1 and human DCAF1, a component of the Cul4A ubiquitin ligase complex, was
reported (64). In this trimeric structure, the interaction interfaces between SAMHD1 and Vpx, and
between Vpx and DCAF1, were clearly identified. Although this complex did not include other components of
the ubiquitin ligase, such as Cul4A, the modeled structure of the whole Cul4A ubiquitin ligase
complexed with Vpx and SAMHD1 showed that SAMHD1 was located close enough to accept
ubiquitin from the RING domain of Rbx1, which directly associates with the E2 ubiquitin-conjugating
protein. The HUSH (Human Silencing Hub) complex is a recently identified target of Vpx (60, 61).
The HUSH complex is involved in position-effect variegation, a process of silencing a normally active
gene as a result of its positioning into heterochromatin (65). Like that of SAMHD1, the steady-state
level of the HUSH complex is decreased in a DCAF1-CUL4A ubiquitin proteasome-dependent manner.
Vpr is a homolog of Vpx and utilizes the same DCAF1-CUL4A ubiquitin ligase machinery as
Vpx. However, Vpr has a completely different set of target factors. The host proteins that are
downregulated by Vpr include UNG2 (66, 67), the SLX4 complex (68), and MCM10 (69). The nuclear
isoform UNG2 was the first to be identified as a target of Vpr. This enzyme takes part in base excision
repair by recognizing and cleaving uracil bases in DNA. The resulting abasic sites are repaired
by the downstream base excision repair machinery. Viruses target and degrade cellular UNG2
either to perturb the DNA repair system that maintains host genome integrity or to prevent the repair of
uracil bases generated on the reverse transcripts by APOBEC-mediated deamination.
It is expected that there are more unidentified host factors targeted by the lentiviral accessory
proteins and further structural study is also needed to understand the molecular basis for these
virus-host interactions.
Implication of APOBEC in cancer development
There is accumulating evidence that the deamination activity of APOBECs is involved
in cancer formation. APOBEC-driven carcinogenesis occurs when APOBECs localize to the
nucleus and inadvertently deaminate cytidines in the genome while genomic DNA becomes
transiently single-stranded during transcription, replication, or repair. A3B was shown
to be associated with breast cancer based on upregulated A3B mRNA levels and the local
sequence context around the cancer mutations (70-72). Silencing of endogenous A3B in
breast cancer cell lines abolished all measurable cytidine deaminase activity in the cell extracts
(70). Other studies showed that the APOBEC-mediated mutation signature is widely observed in various
kinds of cancers (73-76). Other carcinogenic APOBEC members include A3H haplotype I, which
contributes to breast and lung cancers (77), and A3A, which edits genomic and mitochondrial DNA
in multiple cell lines (73, 74, 78). These findings on APOBEC-mediated cancer formation raise
a new “double-edged sword” paradigm for the APOBEC protein family.
Figure 1.1. Eleven members of human APOBEC family and their functional roles. a, Each
member has one or two conserved cytidine deaminase domains with zinc-coordinating motifs.
Catalytically active domains and HIV-1 Vif binding domains are indicated by red stars and
yellow diamonds, respectively. b, The cytidine deamination reaction catalyzed by APOBECs.
Figure 1.2. Sequence alignment of human APOBEC3 proteins. Each domain of double domain
proteins (A3B, A3D, A3F, and A3G) is aligned separately. For the A3H sequence, the representative
haplotype I, splicing variant 182 was used. Secondary structure was adopted from the crystal
structure of A3H (PDB ID: 5W45). 100% conserved residues are shown with a red background, and
99-70% conserved similar residues are shown in red characters. The sequence alignment was
performed by Clustal W multiple sequence alignment (79) and displayed by ESPript (80). The
essential amino acid residues for the zinc-coordinating cytidine deaminase motif
(HXEX25-31PCX2-4C) are indicated by stars.
Figure 1.3. Lentiviral accessory proteins and their target restriction factors. Lentiviral
accessory proteins have distinctive targets in their host cells for downregulation. a, Vif targets A3C,
A3D, A3F, A3G, and A3H for proteasomal degradation through Cul5 E3 ubiquitin ligase. In this
process, cellular CBF-β is additionally recruited by Vif. b, Vpx targets SAMHD1, A3A, and HUSH
complex for proteasomal degradation through Cul4A E3 ubiquitin ligase. c, Vpr targets UNG2,
SLX4 complex, and MCM10 for proteasomal degradation through Cul4A E3 ubiquitin ligase. d,
Vpu targets Tetherin for lysosomal degradation through clathrin adapter AP-1. e, Nef targets
Tetherin, SERINC3, and SERINC5 for lysosomal degradation through clathrin adapter AP-1.
Chapter 2
Family-Wide Comparative Analysis of Cytidine and Methylcytidine Deamination by
Eleven Human APOBEC Proteins
Authors: Fumiaki Ito, Yang Fu, Shen-Chi Andrew Kao, Hanjing Yang, and Xiaojiang S. Chen
Contributions: F.I., Y.F. and X.S.C. designed the experiments. F.I. and Y.F. purified the proteins
and performed the enzyme assay. SC.A.K. assisted with molecular cloning. F.I., Y.F., H.Y. and
X.S.C. wrote the manuscript. F.I. and Y.F. contributed equally to this work.
INTRODUCTION
In addition to the deamination of canonical cytosine, multiple studies have indicated that
activation-induced cytidine deaminase (AID) has the capacity to deaminate 5-methylcytosine
(mC) to produce thymine (T) (81-83). Deamination of mC by AID was proposed as a potential
mechanism for demethylation of mC, as it produces a T:G mismatch at the mC site, which may
promote demethylation through an active DNA repair process (84, 85) (Figure 2.1). Since
methylation of C is one of the major epigenetic modifications that regulate gene expression in
many cases (86-89), the involvement of AID in a demethylation pathway may be an important part
of epigenetic regulation in early embryogenesis or induced pluripotent stem cells (81, 90-92).
Currently, the only known demethylation process consists of the stepwise oxidation of the methyl group
by TET (ten-eleven translocation) enzymes followed by base excision repair (85) (Figure
2.1).
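The two deamination outcomes described above (C to U, and mC to T) can be made concrete with a small string model. This is a sketch for illustration only: the single-letter code "M" for 5-methylcytosine and the function name `deaminate` are conventions assumed here, not notation from the thesis.

```python
# Deamination products as described in the text: C -> U, mC -> T.
# "M" is an assumed one-letter code for 5-methylcytosine (mC).
DEAMINATION_PRODUCT = {"C": "U", "M": "T"}


def deaminate(strand, position):
    """Return a copy of `strand` with the base at `position` (0-based) deaminated."""
    base = strand[position]
    if base not in DEAMINATION_PRODUCT:
        raise ValueError(f"base {base!r} at position {position} is not C or mC")
    return strand[:position] + DEAMINATION_PRODUCT[base] + strand[position + 1:]


# Hypothetical 5-nt single-stranded substrates differing only at the target base.
print(deaminate("ATCGA", 2))  # canonical C becomes U
print(deaminate("ATMGA", 2))  # mC becomes T, which would pair against G as a T:G mismatch
```

The model makes the difference between the two products visible: uracil is foreign to DNA, whereas thymine is a normal DNA base, so the mC product appears as a T:G mismatch rather than as a uracil lesion.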
Subsequent studies have shown that another cytidine deaminase, APOBEC3A (A3A), can deaminate mC
more efficiently, either in an in vitro assay with purified proteins or in an in vivo genetic assay
in E. coli or monocyte cells with plasmid DNA substrates (93-95). In contrast, APOBEC3G (A3G)
has undetectable mC deamination activity (93, 94). It is currently not well defined whether the rest
of the APOBEC members have any mC deamination activity. While the involvement of AID in
mC demethylation in genomic DNA remains controversial (96, 97), deamination of mC by
AID/APOBEC proteins may serve as an additional pathway for demethylation (Figure 2.1).
In an attempt to identify the elements responsible for the efficient mC deamination by A3A,
Carpenter et al. (93) focused on two sequence/structural differences around the catalytic center
between A3A and the A3G catalytic domain (A3GCD2) based on their reported structures (98-100):
a twelve amino acid extension of the N-terminus and a two amino acid insertion (W104/G105)
between the two zinc-coordinating cysteines, both of which are unique to A3A. Removing
either of these two elements from A3A produced no obvious change in specificity for mC. Therefore,
the structural elements that contribute to efficient mC deamination remain unidentified.
To clearly define the involvement of APOBEC proteins in mC deamination and provide a
comparative view of the C and mC deaminase activity across the entire APOBEC family,
we examined the deaminase activity of 11 known APOBEC members for both normal C and mC
by using purified proteins. We found that nine APOBECs are active in deaminating C, and eight
of the nine active APOBECs showed detectable mC deaminase activity in vitro. A3A and A3H
showed distinctively high activity in both C and mC deamination, more than 3–4 orders of
magnitude higher than the rest of the APOBEC members under similar assay conditions. When the
relative mC deaminase activity (or mC selectivity factor) was considered, both A3A and A3H
showed a higher mC selectivity factor than the rest of the members. Through mutational analysis of
A3A and the A3B catalytic domain (A3BCD2), we identified the flexible loop 1 region near the catalytic
pocket as a critical determinant of overall deaminase activity and selectivity for mC. In addition,
a highly conserved tyrosine on loop 7 (Y130 in A3A) also plays a role in determining not only
deaminase activity but also mC selectivity, which may be influenced by the conformation of loop
1 on one side and the target C substrate in the active site center on the other side. This family-wide
study of 11 APOBEC members provides the first side-by-side comparison of deaminase activities
for C and mC. The results should be valuable for a future comprehensive understanding of the broad
biological functions of the APOBEC family.
RESULTS
APOBEC protein expression and purification
While A3A was highly soluble when expressed with a short His-tag, the other APOBECs were
less soluble or mostly insoluble with the His-tag. These insoluble APOBECs were more soluble
when expressed as fusion proteins with either maltose-binding protein (MBP) or glutathione S-
transferase (GST), and the resulting fusion proteins were purified through the corresponding affinity
column chromatography. A3H has seven different haplotypes (hap I-VII) (101-103), and the
representative hap II (referred to as A3H throughout the chapter) was used in this study. For all
APOBEC members, inactive mutant proteins with the catalytic center glutamate replaced
with alanine (E to A) were prepared as negative controls for the deamination assay. The purified wild-
type (WT) and inactive mutant proteins contained few detectable contaminating proteins (Figure 2.2).
These purified recombinant APOBEC proteins were used for the activity assays described below.
The relative deamination activities for C and mC of different APOBECs
First, time-lapse deaminase activity of A3A towards normal C and mC were tested. 10 nM
A3A was incubated with 600 nM 30 nt ssDNA containing target normal C and mC showed a
specific activity of 189 and 29 pmol/min/µg towards C and mC, respectively, suggesting that A3A
can readily deaminate both C and mC with higher preference for C (Figure 2.3). The catalytically
inactive mutant of A3A showed no detectable deaminase activity, confirming that the purified
recombinant proteins were essentially free of host contaminants.
For a quantitative comparison of the deaminase activities for C and mC of different APOBEC
members, the deaminase activity was determined as the product formation over enzyme
concentration (nM product/μM enzyme) in a range where the product formation is linearly
dependent on enzyme concentration (96). The results showed that A3A and A3H had particularly
high activity for both C and mC, compared to the other APOBEC members (Table 2.1 and Figure
2.4). Intermediate to low deaminase activity for C and mC was detected for six other APOBEC
members (i.e., AID, A1, A3B, A3C, A3F, A3G) (Table 2.1 and Figure 2.4). A3D was the only
member that did not show consistently measurable activity for mC under our experimental
conditions. A2 and A4 had no detectable deaminase activity for either C or mC.
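The activity measure described above, product formation per unit enzyme in the linear range, amounts to the slope of a dose-response plot. A minimal sketch of that calculation in Python follows. The data points and the function name `initial_slope` are hypothetical and serve only to illustrate the arithmetic; the original quantification was not necessarily done this way.

```python
# Hypothetical dose-response data within the linear range.
# These numbers are illustrative only, not values from this study.
enzyme_uM = [0.0, 0.001, 0.002, 0.004, 0.008]   # enzyme concentration (μM)
product_nM = [0.0, 25.0, 50.0, 100.0, 200.0]    # deamination product (nM)


def initial_slope(x, y):
    """Ordinary least-squares slope of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Deaminase activity in nM product/μM enzyme.
activity = initial_slope(enzyme_uM, product_nM)
print(f"{activity:.0f} nM product/μM enzyme")
```

Only points where product formation is still proportional to enzyme concentration should be passed to such a fit; points approaching substrate depletion would bias the slope downward.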
To quantitatively assess the relative deaminase activity for mC over normal C, we calculated
the ratio of the mC/C (initial activity) ×100, which represents the number of mC deaminations per
every one hundred C deaminations under the same conditions, and is defined as the selectivity
factor for mC (8, 104). The mC selectivity factor normalizes the overall deaminase activity, and
thus, can be used to compare the mC selectivity among different A3 members that display
drastically different overall deaminase activity. The second most active A3H showed the highest
mC selectivity factor of 53.0, which is over three-fold higher than A3A that showed the highest
activity for C and mC (Table 2.1). AID, which has been previously linked to epigenetic regulation
through mC demethylation, had a mC selectivity factor of 10.9. The mC selectivity factors for the
rest of the APOBEC members with detectable mC deamination activity were all less than 10 (Table
2.1). Interestingly, A3C had the lowest mC selectivity factor of 1.7 despite having the third highest
C deamination activity (Table 2.1). These results suggest that the active APOBEC proteins differ widely
in both overall deaminase activity and specificity for mC over normal C.
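The mC selectivity factor defined above is a simple ratio, and it can be recomputed from the activities listed in Table 2.1. The short Python sketch below does this for four of the enzymes. The function name is an illustrative choice, and because the inputs are the rounded table values, the last digit of a recomputed factor may differ slightly from the tabulated one.

```python
# Activities (nM product/μM enzyme) for C and mC, copied from Table 2.1.
activities = {
    "AID": (48.8, 5.3),
    "A3A": (256_000, 40_300),
    "A3C": (1_830, 31.4),
    "A3H": (34_600, 18_300),
}


def mc_selectivity_factor(activity_c, activity_mc):
    """Number of mC deaminations per one hundred C deaminations."""
    return activity_mc / activity_c * 100


for name, (act_c, act_mc) in activities.items():
    print(f"{name}: {mc_selectivity_factor(act_c, act_mc):.1f}")
```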
Structural elements contributing to the efficient C/mC deamination in A3A
We next attempted to identify structural/sequence elements that affect the overall deaminase
activity and the mC selectivity in A3 proteins. We focused on the differences between the two
highly homologous domains, A3A and A3B catalytic domain (A3BCD2), that share 89% amino
acid sequence identity but differ in the magnitude of their deaminase activity and mC selectivity.
A sequence alignment of A3A and A3BCD2 showed that the 11% non-identical amino acid
residues are clustered in three regions: α1-loop 1 (region 1), β2-loop 3 (region 2), and α5-α6
(region 3) (Figure 2.5a, b). To decipher the role of these three non-identical regions, the amino
acid residues of A3A in each of these three regions were replaced with their corresponding residues
of A3BCD2, and the deaminase activity of the chimeric mutants was tested.
Among the three region mutants (R1-R3, Figure 2.5a), region 1 mutant (R1) had the largest
decrease in deaminase activity as well as mC selectivity. The reduction of C and mC deaminase
activity was about 85-fold and 200-fold, respectively, lowering the mC selectivity factor to 4.8
(Table 2.2 and Figure 2.5c). On the other hand, R2 showed relatively minor effects, with about 10-
fold reduction for both C and mC deamination activity, and thus, maintaining a similar mC
selectivity factor (Table 2.2 and Figure 2.5d). R3 had no obvious effects on the deaminase activity
and the mC selectivity (Table 2.2 and Figure 2.5e). These results clearly indicate that among the
three regions showing sequence variability between A3A and A3BCD2, region 1 of A3A (α1/loop
1) is critical for its high deaminase activity and high mC selectivity. We further sought to find
which residues in region 1 are critical for the high deaminase activity and mC selectivity of A3A
by mutating a subset of residues in region 1 to generate mutant R1-1, R1-2, and R1-3 (Figure 2.5a).
Mutant R1-1 contains three point mutations (H16D/I17T/S20F) on α1, mutant R1-2 contains
mutations in the middle of loop 1, where the sequence -GIG- of A3A is replaced by -DPLVLR-
from A3BCD2, and R1-3 contains two point mutations (H29R/K30Q) near the C-terminal end of
loop 1 (Figure 2.5a). The activity assay of these mutants showed that R1-1 had a similar level of
the activity for both C and mC as the WT, resulting in a similar mC selectivity factor of 12.0 (Table
2.2 and Figure 2.5f). Both R1-2 and R1-3 showed significantly reduced deaminase activity. R1-2
showed a ~42-fold reduction for C deamination, and a more pronounced 87-fold reduction for mC
deamination, resulting in a lowered mC selectivity of 6.1. R1-3 displayed a 28-fold reduction for
C deamination, and more pronounced 76-fold reduction for mC deamination, yielding a reduced
mC selectivity of 4.7 (Table 2.2 and Figure 2.5g, h). A similar trend was observed in the mutational
analysis on A3BCD2. Transferring the elements in region 1, specifically residues on loop 1, from
A3A into A3BCD2 indeed enhanced both the overall deaminase activity and the mC selectivity of
A3BCD2, confirming the importance of loop 1 (R1 A3BCD2, Table 2.2) (8). Thus, we conclude
that the residues on loop 1 (mutated in both R1-2 and R1-3 A3A) play major roles in determining
both the overall deaminase activity and the mC selectivity.
ssDNA binding and motif specificity of the R1 mutant of A3A
To address the question of whether the decreased deaminase activity and mC selectivity of A3A
loop 1 mutant R1 could be caused by altered ssDNA substrate binding, the ssDNA binding affinity
was measured by rotational anisotropy with 30 nt FAM-labeled ssDNA containing 5’-TCA or 5’-
TmCA motifs. The changes in rotational anisotropy with increasing concentrations of proteins
were fitted to a simple one-site specific binding model. The WT A3A showed a similar affinity for
both C and mC ssDNA substrates with Kd of 0.525 μM and 0.392 μM, respectively (Figure 2.6a,
b). R1 mutant showed about 3- to 4-fold lower affinity for C and mC substrates with Kd of 1.59
μM and 1.71 μM, respectively (Figure 2.6c, d). These results suggest that such reduction in DNA
binding affinity may provide partial explanation for the drastic decrease of the overall deaminase
activity (85-fold decrease in C deamination and the 200-fold decrease in mC deamination). Other
factors, such as the orientation of target C and mC at the active center pocket, may also affect the
deaminase activity and the mC selectivity of the R1 mutant.
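The one-site specific binding model used for these fits has the hyperbolic form Y = Bmax × [P] / (Kd + [P]). As a rough illustration, the Python sketch below evaluates the fractional saturation predicted by this model for the Kd values reported above for the C substrate. It assumes that free protein is approximately equal to total protein; the function name and the 1 μM example concentration are hypothetical choices, not part of the original analysis.

```python
def fraction_bound(protein_uM, kd_uM):
    """Fractional saturation of ssDNA for a one-site specific binding model."""
    return protein_uM / (kd_uM + protein_uM)


# Kd values (μM) reported in the text for the 5'-TCA substrate.
kd_values = {"WT A3A": 0.525, "R1 mutant": 1.59}

for name, kd in kd_values.items():
    print(f"{name}: {fraction_bound(1.0, kd):.2f} bound at 1 μM protein")
```

Under this model the difference in occupancy between WT and R1 is modest at any protein concentration, in line with the conclusion that weaker binding explains only part of the loss of activity.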
It is known that the best DNA trinucleotide sequence motif for A3A is 5’-TCA, in which the T
at the -1 position can be either pyrimidine (T/C) and the A at the +1 position can be either purine
(A/G) (34, 36). On the other hand, A3G prefers cytosine that is preceded by two additional
cytosines (5’-CCC) (29, 105, 106). Because loop 1 and loop 7 are involved in ssDNA binding
(107-109), we asked whether the reduced selectivity for mC observed in region 1 mutants
is the result of altered sequence motif specificity. To test this possibility, the substrate specificity
of WT A3A and mutant R1 were compared using a set of 30 nt ssDNA substrates containing each
of the four different nucleotides at the -1 or +1 position of the recognition sequence, i.e. 5'-NCA
and 5'-TCN, as well as 5'-NmCA and 5'-TmCN, where N is any of the four nucleotides. A3A
showed clear preference for a substrate sequence motif with pyrimidine at -1 position and purine
at +1 position (Figure 2.6e, f), consistent with previous studies (34, 36). In addition, trinucleotide
sequence motif preference of A3A for mC containing substrates was similar to that for normal C
containing substrates (Figure 2.6g, h) with the exception of 5'-CmC substrate (5'-CmC was a
poorer substrate when compared with the corresponding normal 5’-CC substrate). Overall, A3A
R1 mutant showed similar sequence motif preference as the WT A3A for both normal C and mC
containing substrates (Figure 2.6i-l). Therefore, the lower overall deaminase activity and the
reduced mC selectivity factor observed in loop 1 A3A mutants are not due to the altered sequence
motif preference.
Role of conserved Y130 on loop 7 for mC selectivity
To understand the mechanism as to how loop 1 affects the observed changes in the deaminase
activity and mC selectivity, we modeled the C and mC into the active site pocket of A3A by using
program Glide (Schrödinger, LLC.) (110) (Figure 2.7a). Structural superposition of A3A with the
mouse free cytosine deaminase bound to cytidine (PDBID: 2FR6)(111) was conducted in PyMOL
(Schrödinger, LLC.), and the resulting orientation/positioning of the C was used as a guide to
select the C/mC poses from the Glide docking output. Based on the modeled structure, we noticed
that loop 1 does not directly interact with the methyl group at the C5 position of the target mC.
The mC bound at the active site pocket has a preferred binding pose in a way that its methyl group
at the C5 position is on the side facing loop 7 near the residue Y130 (Figure 2.7a). Furthermore,
comparison of the NMR and crystal structures of A3A (99, 112) with the other known structures
of A3 proteins (98, 113-116) showed that Y130 in A3A clearly can adopt different positions that
are not permitted by the corresponding tyrosine residues of the other A3 proteins (Figure 2.7a).
The Y130 in A3A can point towards the outer surface and loop 1 because of available space
generated by A3A loop 1 conformation, whereas the equivalent tyrosine in A3C, A3F catalytic
domain 2 (A3FCD2) and A3GCD2 are excluded by their respective loop 1 from the position
occupied by Y130 in A3A, resulting in a location right next to the methyl group of the target mC.
Even though the conformation of this conserved Y130 may change upon binding of ssDNA, Y130
may have a possible role in bridging loop 1 and the active site pocket to affect the deaminase
activity and mC selectivity. We investigated whether the amino acid side chain at the Y130 position
could affect the mC selectivity by substituting with a set of different amino acid residues. The
results showed that substitution of Y130 with non-aromatic residues (including A, V, S, T, D, I)
completely abolished the deaminase activity (Figure 2.7b), indicating that neither simple
hydrophobicity nor polarity is sufficient to retain the deaminase activity. Substitution of Y130 with
other aromatic residues (Y130F and Y130W) lowered but did not abolish the deaminase activity (Table
2.2 and Figure 2.7b). Interestingly, Y130F had ~17-fold reduction of activity in C deamination and
~6-fold reduction in mC deamination, yielding an increased mC selectivity factor of 36.1 (Table
2.2 and Figure 2.7c). Y130W, on the other hand, had ~70-fold reduction of activity in C
deamination, and 125-fold reduction in mC deamination, yielding a reduced mC selectivity factor
of 7.1 (Table 2.2 and Figure 2.7d). The similar trend for increased mC selectivity factor was also
observed in R1-Y130F when compared to R1 (Table 2.2). The R1-Y130W mutant had no detectable
activity, indicating that the combination of Y130W with the loop 1 mutations in R1 inactivates the
enzyme. These results suggest that the tyrosine at position 130 may be optimized in A3A to be
positioned between loop 1 and the active center to achieve high deaminase activity for both C and
mC, and that replacement with a slightly shorter (Phe) or bulkier (Trp) aromatic side chain reduces
the overall deaminase activity but has a positive (Phe) or negative (Trp) effect on the selectivity
for mC.
DISCUSSION
APOBEC cytidine deaminases play important roles in innate and adaptive immunity and other
biological processes including potential genomic mC modification for epigenetic regulation. Their
aberrant deamination can also generate malignant mutations leading to cancer. Even though the 11
known APOBEC members are evolutionarily conserved and share the same core structure, the
properties of their deamination activity vary depending on their physiological roles. In this
chapter, we performed family-wide comparative analysis on C and mC deamination by using
purified APOBEC proteins. Our results showed that nine APOBECs (except for A2 and A4) had
strong to readily detectable deaminase activity on C. Notably, A3A and A3H had robust deaminase
activity on C and mC, and high selectivity for mC. Moderate to low deaminase activity for C and
mC was detected for six other APOBEC members (AID, A1, A3B, A3C, A3F, A3G).
By introducing point-mutations in different regions of A3A, we show that several residues on
loop 1 and Y130 on loop 7 next to the active site center contribute critically to the observed high
deaminase activity and high mC selectivity factor. At the structural level, the conformation of loop
1 and Y130 on loop 7 are likely to be correlated during deamination reaction, i.e. loop 1
conformation affects the positioning and conformation of Y130, linking the observed roles of both
loop 1 region and loop 7 Y130 in mC selectivity. Therefore, these structural features for the loop
1 as well as the conserved tyrosine residue on the loop 7 can help to rationalize the high activity
and high selectivity for mC deamination observed in A3A. They also rationalize the observation
that the Y130F A3A gained higher selectivity for mC, probably because the slightly shorter Phe
residue lacking a hydroxyl group can provide better accommodation for the methyl group of the
mC in the active site pocket for deamination to yield higher mC selectivity factor. Similarly, the
bulkier tryptophan residue at this position may generate steric hindrance for mC to yield lower mC
selectivity factor.
In summary, this study provides the first direct comparison of the deaminase activity for C and
mC of 11 known APOBEC family members. A caveat is that the enzymatic activity data are
obtained from in vitro studies using purified proteins from E. coli. While the activity may reflect
the intrinsic properties of the APOBECs in vitro, the in vivo situation may vary due to potential
post-translational modification, co-factor binding, and subcellular localization. This study on the
family-wide comparison of the activity and mC selectivity would be valuable for understanding
the function of APOBEC family proteins in various biological processes and possible link to
epigenetic regulation.
MATERIALS AND METHODS
Plasmids
A3A, A3BCD2, and their mutants were cloned into the pET28a vector with His6-tag at their C-
terminus. AID, A1, A3B, A3C, A3D, A3F, A3G, A3H, A4 were cloned into pMAL-c5X vector to
express fusion proteins with MBP at N-terminus. A2 was cloned into pGEX-6P-1 vector to express
fusion proteins with GST at N-terminus. Cloning and mutagenesis were performed with In-Fusion
cloning and PrimeSTAR mutagenesis (Clontech) following the manufacturer’s instructions. The
sequences of the constructs were verified by DNA sequencing (Genewiz).
Protein expression and purification
The expression plasmids were transformed into the E. coli BL21(DE3) and the cells were grown
in LB medium at 37°C. Expression of the recombinant proteins was induced with 0.1 mM isopropyl β-D-1-
thiogalactopyranoside (IPTG) when the OD600 reached 0.6. The cells were transferred to 18-21°C
and further grown for 18 hours.
For A3A and A3BCD2, the cell pellets were resuspended with lysis buffer (20 mM Tris-HCl,
pH 8.0, and 250 mM NaCl). The cells were lysed by French Press and cellular debris was removed
by centrifugation. The supernatant containing His6-fused proteins was loaded onto a Ni-NTA
agarose column (QIAGEN). The nickel column was extensively washed with wash buffer (20 mM
Tris-HCl, pH 8.0, 50 mM imidazole, and 250 mM NaCl) and the protein was eluted with elution
buffer (20 mM Tris-HCl pH 8.0, 500 mM imidazole, and 250 mM NaCl). The fractions that
contained the recombinant proteins were pooled and concentrated with an Amicon 10 K
concentrator (EMD Millipore) to 5-30 mg/ml, switched from elution buffer to protein storage
buffer (20 mM Tris-HCl, pH 8.0, 250 mM NaCl, 1 mM EDTA, and 1 mM DTT) and stored at -
80°C.
For AID, A1, A3B, A3C, A3D, A3F, A3G, A3H, and A4, the cell pellets were resuspended with
the lysis buffer (20 mM Tris-HCl, pH 8.0, and 250 mM NaCl). The cells were lysed by French
Press and cellular debris was removed by centrifugation. The supernatant containing MBP-fused
proteins was loaded onto an amylose column (New England Biolabs). The amylose column was
extensively washed with wash buffer (20 mM Tris-HCl, pH 8.0, and 1 M NaCl) and the protein
was eluted with the elution buffer (20 mM Tris-HCl, pH 8.0, 250 mM NaCl, and 20 mM D-
maltose). Eluted fractions were concentrated and stored at -80°C.
For A2, the cell pellets were resuspended with the lysis buffer (20 mM Tris-HCl, pH 8.0, 250
mM NaCl, and 2 mM DTT). The cells were lysed by French Press and cellular debris was removed
by centrifugation. The supernatant containing GST-A2 was loaded onto glutathione sepharose
column (GE Healthcare). The column was extensively washed with the lysis buffer and the GST-
A2 was eluted by lysis buffer with 10 mM reduced glutathione. Eluted fractions were concentrated
with an Amicon 10 K concentrator (EMD Millipore) and further purified using Superose 6 size-
exclusion chromatography (GE Healthcare). The fractions containing GST-A2 were collected,
concentrated, and stored at -80°C. SDS-PAGE was used to assess the purity of the protein and to
calibrate the concentrations of all purified proteins.
Deaminase assay
The purified recombinant protein was incubated with 600 nM 5'-6-FAM-labeled 30 nt ssDNA
substrates containing a target C or mC in deamination buffer (25 mM buffer at the indicated pH,
100 mM NaCl, 1 mM DTT, 0.1% Triton X-100, and 0.1 μg/ml RNase A) at 37°C for 2 h. The
deaminase reaction was terminated by heat inactivation at 95°C for 10 min. The bases of the
deamination products U or T were subsequently cleaved by incubating with uracil DNA
glycosylase (2 units, New England Biolabs) at 37°C for 1 h or with thymine DNA glycosylase (2
units) and three-fold excess amount of the complementary ssDNA at 42°C for 12 h, respectively.
The abasic sites were hydrolyzed in the presence of 0.1 M NaOH at 90°C for 10 min. The
deamination products were separated on 20% urea denaturing gels, visualized by Molecular
Imager FX (Bio-Rad), and quantified by Quantity One 1-D Analysis Software (Bio-Rad). Error
bars were generated based on standard errors of three independent data sets.
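For readers less familiar with the distinction between the standard deviation and the standard error of triplicate measurements, the following Python sketch shows how both are derived from three independent data sets. The replicate values are hypothetical and are not taken from this study.

```python
import math
import statistics

# Hypothetical product yields (nM) from three independent experiments.
replicates = [48.0, 50.0, 52.0]

mean = statistics.mean(replicates)
sd = statistics.stdev(replicates)          # sample standard deviation (n - 1)
sem = sd / math.sqrt(len(replicates))      # standard error of the mean

print(f"mean = {mean:.1f}, S.D. = {sd:.2f}, S.E.M. = {sem:.2f}")
```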
Steady-state rotational anisotropy DNA binding assay
5'-6-FAM-labeled 30 nt ssDNA containing -TCA- was used as a substrate for A3A binding
assay monitored by change in steady-state fluorescence depolarization (rotational anisotropy).
Various concentrations of A3A were incubated with 50 nM ssDNA for 1 min at room temperature
in 130 μl binding buffer containing 25 mM HEPES-NaOH, pH 6.5, and 25 mM NaCl. The
rotational anisotropy was measured using a QuantaMaster QM-1 fluorometer (Photon Technology
International) with a single emission channel. Samples were excited with vertically polarized light
at 495 nm, and both vertical and horizontal emissions were monitored at 520 nm (8-nm bandwidth).
The dissociation constant was obtained by iterative curve fitting to a one-site specific binding model
using Prism 6 (GraphPad).
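Steady-state anisotropy is conventionally computed from the vertically and horizontally polarized emission intensities as r = (I_VV - G × I_VH) / (I_VV + 2 × G × I_VH). The Python sketch below implements this standard definition. The intensities are illustrative, and the instrument correction factor G is assumed to be 1 here, since the text does not state how it was handled.

```python
def anisotropy(i_vv, i_vh, g_factor=1.0):
    """Steady-state anisotropy from polarized emission intensities.

    i_vv: emission through a vertical polarizer (vertical excitation)
    i_vh: emission through a horizontal polarizer (vertical excitation)
    g_factor: instrument correction factor (assumed to be 1 by default)
    """
    return (i_vv - g_factor * i_vh) / (i_vv + 2.0 * g_factor * i_vh)


# Illustrative intensities in arbitrary units, not measured values.
r = anisotropy(i_vv=300.0, i_vh=200.0)
print(f"r = {r:.3f}")
```

Protein binding slows the tumbling of the FAM-labeled ssDNA, which raises r; plotting the change in r against protein concentration gives the binding curve that is fitted to obtain Kd.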
Structural modeling
Docking of the C and mC onto A3A was performed by the program Glide (110) and visualized
by PyMOL (Schrödinger, LLC). A3A was superimposed to the complex structure of mouse free
cytosine deaminase bound to free cytidine (PDB ID: 2FR6)(111), and the resulting
orientation/position of the C obtained from the superposition was used as a guide to select the
C/mC poses from the Glide docking output.
Table 2.1. Deaminase activity for C and mC of all APOBEC proteins tested at the indicated
optimal pH with the preferred DNA substrate motifs.
Enzyme  Activity for C [a]       Activity for mC [a]      mC selectivity  pH   Substrate
        (nM product/μM enzyme)   (nM product/μM enzyme)   factor [b]           motifs
                                                          (mC/C)*100
AID     48.8 ± 2.1               5.3 ± 0.3                10.9            7.5  GCA/GmCA
A1      53.5 ± 2.3               5.2 ± 1.7                9.7             6.5  TCA/TmCA
A2      –                        –                        –               –    –
A3A     256,000 ± 41,000         40,300 ± 4,500           15.7            6.5  TCA/TmCA
A3B     1,200 ± 50               78.1 ± 6.6               6.5             5.5  TCA/TmCA
A3C     1,830 ± 120              31.4 ± 3.0               1.7             5.5  TCA/TmCA
A3D     53.1 ± 4.5               ND                       –               6.5  ACA/AmCA
A3F     59.1 ± 1.7               2.6 ± 0.8                4.4             6.5  TCA/TmCA
A3G     102 ± 4                  10.3 ± 0.8               9.9             6.5  CCC/CCmC
A3H     34,600 ± 1,700           18,300 ± 1,200           53.0            6.5  TCA/TmCA
A4      –                        –                        –               –    –
Note: [a] Deaminase activity for C and mC was calculated from the initial linear range of dose
dependent product yield for each APOBEC (Figure 2.4). S.D. was estimated from three
independent deaminase assay experiments. [b] The mC selectivity factor was calculated as mC/C
activity ×100. ND indicates not determined.
Table 2.2. Deaminase activity of A3A and A3BCD2 mutants for C and mC.
Enzyme      Activity for C [a]       Activity for mC [a]      mC selectivity
            (nM product/μM enzyme)   (nM product/μM enzyme)   factor [b]
                                                              (mC/C)*100
R1          3,180 ± 150              153 ± 3                  4.8
R2          25,500 ± 600             3,440 ± 70               13.5
R3          322,000 ± 79,000         42,300 ± 1,100           13.1
R1-1        352,000 ± 14,000         42,200 ± 800             12.0
R1-2        6,400 ± 480              391 ± 19                 6.1
R1-3        9,430 ± 350              444 ± 7                  4.7
Y130F       16,000 ± 300             5,780 ± 110              36.1
Y130W       3,780 ± 80               271 ± 5                  7.1
R1+Y130F    653 ± 24                 59.1 ± 2.3               9.1
R1+Y130W    ND                       ND                       –
A3BCD2      283 ± 5                  9.65 ± 0.26              3.4
R1 A3BCD2   6,940 ± 130              1,310 ± 70               18.9
Note: [a] Deaminase activity for C and mC was calculated from the initial linear range of dose
dependent product yield for each mutant (Figure 2.5c-h and Figure 2.7c, d). S.D. was estimated
from three independent deaminase assay experiments. [b] The mC selectivity factor was calculated
as mC/C activity ×100. ND indicates not determined.
Figure 2.1. Proposed demethylation pathway of mC. Deamination of mC by APOBEC proteins
may be a part of DNA demethylation process of cytosine in genome. mC (or 5hmC) could be
deaminated into thymidine (or 5hmU) by APOBECs and base excision repair machinery
recognizes the generated mismatch to replace the deaminated bases with unmodified cytosine.
Currently known DNA demethylation pathway comprises stepwise oxidation of methyl group of
mC by TET oxidase and the following base excision repair of the oxidized (formylated or
carboxylated) cytosines.
Figure 2.2. SDS-PAGE of the 11 purified APOBECs. SDS-PAGE of the purified proteins for the
11 (a) WT and (b) catalytically inactive E-to-A active site mutant APOBECs with affinity tags.
A3A is fused to His6-tag, AID, A1, A3B, A3C, A3D, A3F, A3G, A3H, A4 are fused to MBP-tag,
and A2 is fused to GST-tag.
Figure 2.3. Time course deaminase activity of A3A for normal C and mC. a, Time course
deaminase activity of A3A for C and mC containing ssDNA substrates. The reaction was
performed with 10 nM A3A and 600 nM 30 nt ssDNA containing single target C or mC. The
reaction products appear as a 16 nt ssDNA fragment after the coupled reaction with uracil DNA
glycosylase and subsequent hydrolysis at the abasic sites. b, Quantification of the deamination product
versus incubation time.
Figure 2.4. Dose dependent deaminase activity of 9 APOBEC proteins for C and mC. a-i,
Deamination products were quantified against increasing concentration of enzymes with 600 nM
ssDNA substrates containing a target C (red) or mC (blue). The initial slope of each plot in the
linear range is calculated as deaminase activity (nM product/μM enzyme). S.D. was estimated from
data collected in three independent experiments.
Figure 2.5. Identification of structural elements important for deaminase activity and mC
selectivity. a, Sequence alignment of A3A and A3BCD2. The three regions (Region 1-3) of A3A
that show sequence differences from A3BCD2 are replaced with the equivalent residues of
A3BCD2 to generate mutants R1, R2, and R3. R1 is further divided into R1-1, R1-2, and R1-3. b,
The three mutated regions are mapped onto A3A structure (PDB ID: 5SWW), in which the active
center Zn and its coordinating histidine and two cysteines are shown as a sphere and sticks,
respectively. Region 1 (loop 1 region) is located next to the Zn-active center. c-h, Dose dependent
activity assay for C and mC deamination for WT A3A and the six A3A mutants (R1, R2, R3, R1-
1, R1-2, and R1-3) shown in a. Deamination products were quantified against increasing
concentration of enzymes with 600 nM ssDNA substrates containing a target C (red) or mC (blue).
S.D. was estimated from data collected in three independent experiments.
Figure 2.6. Comparison of the substrate binding affinity and substrate motif specificity of
WT and R1 mutant of A3A. a-d, Substrate ssDNA binding of WT and R1 mutant. Binding of
proteins to FAM-labeled 30 nt ssDNA containing 5’-TCA- or 5’-TmCA- motif was measured by
rotational anisotropy. Binding mixtures contained 50 nM ssDNA and various concentrations of
proteins. The plots of changes in anisotropy were fitted by one-site specific binding model. e-h,
Deaminase activity of WT A3A for C (e, f) and mC (g, h) on different DNA sequence motifs.
Deamination reaction mixture contains 10 nM A3A and 600 nM substrate ssDNA that contain the
listed tri-nucleotide motifs. i-l, Deaminase activity of R1 A3A for C (i, j) and mC (k, l) on
different DNA sequence motifs. Deamination reaction mixture contains 100 nM R1 mutant and
600 nM substrate ssDNA that contain the listed tri-nucleotide motifs. These results suggest that
the substrate motif specificity of R1 mutant is similar to that of WT A3A. S.D. was estimated
from data collected in three independent experiments.
Figure 2.7. Y130 on loop 7 of A3A modulates the selectivity for mC deamination. a,
Superimposition of A3A (green, PDB ID: 2M65)(99), A3GCD2 (magenta, PDB ID: 3IQS)(98),
A3C (red, PDB ID: 3VOW)(114), and A3FCD2 (orange, PDB ID: 4J4J)(115). The A3A loop 1 (in green)
has a conformation different from those of other APOBECs. The conserved tyrosine residue on
loop 7 (shown in sticks) in APOBECs (except for A3A) is right next to the active site pocket where
the mC would bind. The mC modeled into the active site has its methyl group pointing to the
conserved tyrosine residue. The orientation of the conserved tyrosine residue of A3A (Y130) can
adopt a position distinct from its equivalent tyrosine residues in the other APOBECs, mainly
because the unique A3A loop 1 conformation allows its Y130 to point away from the methyl group
of the mC. b, Deaminase activity of Y130 mutants of A3A and A3A R1.
An aromatic residue at this position is necessary and sufficient for the enzymatic activity of WT A3A.
Deamination reaction was performed by incubating 100 nM A3A mutants and 600 nM 30 nt
ssDNA containing target C at 37°C for 1 hour. ASM indicates active site mutant. c and d, Dose
dependent deaminase activity of Y130F and Y130W. Substitution of Y130 with other aromatic
residues retained the deaminase activity with altered selectivity for mC. The quantified deaminase
activity is shown in Table 2.2.
Chapter 3
Understanding the Structure, Multimerization, Subcellular Localization and mC
Selectivity of a Genomic Mutator and Anti-HIV Factor APOBEC3H
Authors: Fumiaki Ito, Hanjing Yang, Xiao Xiao, Shu-Xing Li, Aaron Wolfe, Brett Zirkle,
Vagan Arutiunian, and Xiaojiang S. Chen
Contributions: F.I., H.Y., and X.S.C. designed the experiments. H.Y., X.X., and S.L. performed
crystallization and structural determination. F.I., H.Y., A.W., B.Z., and V.A. performed the mutational
and biochemical studies. F.I., H.Y., and X.S.C. wrote the manuscript. F.I. and H.Y. contributed
equally to this work.
INTRODUCTION
APOBEC3H (A3H) is a member of the APOBEC cytidine deaminase family that plays important
roles in innate immunity including restricting endogenous retroelements and infectious
retroviruses (26, 101, 117-121). As the most divergent member of the APOBEC3 (A3) family, A3H
belongs to the Z3-type zinc-coordinating cytidine deaminase domain that is phylogenetically
distinct from the Z1- and Z2-type domains of other A3 proteins (4, 5). A3H is the most
polymorphic member of the APOBEC family, as its mRNA can undergo alternative splicing to
generate four splicing variants, and there are seven distinct haplotypes in human population (hap
I-VII) containing various combinations of five single nucleotide polymorphisms (2, 101-103).
Among the seven A3H haplotypes, only hap II, V, and VII effectively restrict Vif-deficient HIV-1
(HIV-1 ΔVif) (101-103, 122), whereas hap I was also shown to have anti-HIV activity when
overexpressed in cell culture (102, 103, 123, 124). The literature suggests that the anti-HIV activity of
A3H can be exerted through both deaminase-dependent and deaminase-independent mechanisms (103, 125). The
differential anti-HIV activity of the A3H variants has been attributed to a combination of several
factors, such as differences in RNA binding and virion packaging capacity (124, 126), protein
stability (101, 127-129), deaminase activity (104), and subcellular localization (123, 124). Nucleic
acid binding is a key feature of all APOBEC proteins, and nucleic acids can often have multiple
functional roles. A3H has been found in different oligomeric forms, and it oligomerizes both in
cells and during recombinant protein purification (104, 128, 130, 131). Evidence so far suggests
that binding to RNA is largely responsible for the multimerization of A3H and several other
APOBEC members (77, 82, 106, 125, 130, 132). RNA binding is also an important step for the
recruitment and encapsidation of APOBEC proteins to the HIV virion, which is necessary to exert
their anti-HIV activity (103, 124, 126, 133-135). Additionally, ssDNA binding is critical for
deaminase activity. RNA binding appears to be inhibitory for deaminase activity of A3H as RNase
A treatment activates or enhances the deamination on ssDNA for A3H and several APOBEC
members (8, 82, 104, 125, 132, 136), which suggests overlapping binding sites for RNA and
ssDNA substrates. Multiple studies have suggested that APOBEC proteins can utilize diverse
modes of binding to nucleic acids, all of which are required for specific functions and regulations.
As reported in chapter 2, A3H, together with A3A, shows about three orders of magnitude higher cytosine
(C) and methylcytosine (mC) deaminase activity compared to other APOBEC members in vitro.
Moreover, the selectivity for mC deamination of A3H is several times higher than that of A3A and
other APOBECs. While a detailed mechanism for the high activity and mC selectivity for A3H is
not yet well understood, loop 1 and loop 7 have been shown to play major roles in regulating
activity and mC selectivity in A3B and A3A (see chapter 2 in this thesis). Inadvertent deamination
of genomic DNA by A3H has been associated with mutations in breast and lung cancers (77).
In this chapter, we performed structural and extensive biochemical analyses on human A3H.
Our 2.49 Å crystal structure of an RNA-free monomeric A3H showed a uniquely long C-terminal
helix (α6) and a disrupted beta strand in the canonical five-stranded β-sheet core. Point mutations
on H114 or W115/C116 on loop 7 near the catalytic pocket completely disrupted the RNA-
mediated dimerization of A3H, yielding an RNA-free monomer that still shows binding to nucleic
acids and deaminase activity. A3H formed enzymatically inactive high molecular weight (HMW)
complex in mammalian cells. The deaminase activity was inhibited by RNA binding as RNase A
treatment dissociated the HMW complex into enzymatically active low molecular weight (LMW)
species. A3H showed a highly positively charged surface around the zinc-coordinating active
center and substrate binding loops. Multiple positively charged residues within this charged
surface were important for the subcellular localization and deaminase activity of A3H. Finally, we
have identified multiple residues that contribute to high overall deaminase activity as well as
selectivity for mC. Taken together, these findings provide valuable insights into the molecular basis
for the complex nucleic acid binding and catalytic regulation of A3H.
RESULTS
Monomeric and dimeric forms of human A3H
Human A3H haplotype II (referred to as A3H hereafter) with a maltose-binding protein (MBP)
tag at its N-terminus produced a dimeric form associated with RNA that could be dissociated to
monomer and free RNA through a high salt treatment (Supplementary Figure S3.1). Extensive
screening was conducted to search for a construct that would produce a stable monodispersed A3H
for structural study. Subsequently, three mutants, namely m1, m1+H114A, and
m1+W115A/C116S, were identified as good constructs for the in vitro study and purified for the
structural and biochemical analysis. A3H m1 carries a set of 7 mutations
(E56A/W90S/C127S/G128Q/S129E/Q130G/L155A) to increase the protein stability and reduce
toxicity to the expression host (Supplementary Figure S3.2). While m1 produced a stable dimer
form, adding either H114A or W115A/C116S point mutations produced stable monomers (Figure
3.1a, b). With these stable dimeric and monomeric forms of A3H, we performed crystallization
trials either with the MBP-tag or the cleaved forms, with or without nucleic acids, and obtained a
high-quality crystal of the m1+W115A/C116S monomer without MBP-tag and nucleic acids that
diffracted X-rays to 2.49 Å resolution (Table 3.1 and Figure 3.1c).
General structural features of human A3H monomer
The A3H m1+W115A/C116S monomeric structure was determined to 2.49Å resolution and
refined to the statistics shown in Table 3.1. Each asymmetric unit (asu) contains two A3H
molecules that have nearly identical structure of the core fold and the loops. As expected from the
divergent A3H sequence among APOBEC family members, this A3H crystal structure was indeed
divergent from other known APOBEC structures so far (98, 99, 112-116, 137-141). Notably, helix
6 (α6) of A3H extended six amino acid residues (1.7 turns) at its N-terminal side, making it the
longest α6 among known APOBEC structures (Figure 3.1d). The canonical short beta strand 5 (β5)
of APOBEC proteins was not a typical strand in this A3H monomeric structure. The disruption of
β5 appears to result from the proline residue (P132) that forms an outward-facing bulge and moves
away from β4, disrupting the already short β5 (Supplementary Figure S3.3a, b). In addition, A3H
monomer contains a uniquely long loop 1 around the Zn-center (Figure 3.1d). Other secondary
structural features of the APOBECs are well preserved in A3H. The monomeric structure also
shows that A3H is highly positively charged on one end around the Zn-active center, and neutral
and negatively charged on the other end (Figure 3.2a and Supplementary Figure S3.3c). A3H
shows the most extensive positively charged surface among the active APOBEC domain structures
(Supplementary Figure S3.4). Other highly positively charged APOBEC proteins and domains
include AID and A3A. During the preparation of the manuscript for this chapter, two independent
groups reported the structures of dimeric A3H with RNA bound at the dimer interface (142, 143).
The overall structural features of our RNA-free monomeric A3H described here (PDB ID: 5W45)
are very similar to those of the reported dimeric A3H structures. Our A3H monomer mutant
structure superimposes well with A3H from pig-tailed macaque (pgtA3H, PDB ID: 5W3V) with
an r.m.s.d. of 0.598 Å for all atoms (Supplementary Figure S3.5). The differences between the RNA-free
hA3H and the RNA-bound pgtA3H mostly reside in loops 1, 3, and 7. However, the
superimposition of our A3H monomer with the RNA-bound human A3H dimer (PDB ID: 6B0B)
yielded a larger r.m.s.d. of 1.327 Å, indicating a greater structural difference (Supplementary Figure
S3.5).
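For reference, the r.m.s.d. values quoted above are root-mean-square deviations over paired atoms after superposition. The following is a minimal sketch of that calculation, assuming the two coordinate sets have already been superimposed and matched atom-for-atom. The function name and the toy coordinates are illustrative only and are not taken from the deposited structures.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. angstroms)
    between two equal-length lists of (x, y, z) tuples that are already
    superimposed and paired atom-for-atom."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example (hypothetical coordinates): every atom displaced by 1 unit along x.
a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
b = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]
print(rmsd(a, b))
```

Because the value is averaged over all paired atoms, a few flexible loops that differ strongly can raise the overall r.m.s.d. even when the core fold is nearly identical.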
Nucleic acid binding of A3H
The fact that high salt treatment of the dimer A3H yielded monomer and free RNA suggests
that RNA binding mediates dimer formation. In addition, the stable monomeric form obtained simply
by mutating H114 or W115/C116 on loop 7 suggests that these residues participate in nucleic acid
binding. H114 and W115 are located around the center of a highly positively charged surface
(Figure 3.2a). To compare the binding affinity of the dimeric and monomeric forms of A3H to
nucleic acids, we employed electrophoretic mobility shift assay (EMSA) and examined the
contribution of H114 and W115 to nucleic acid binding. We focused our binding assay on various
single stranded oligonucleotides as our initial investigation revealed no detectable binding to
dsDNA or dsRNA for the dimeric and monomeric forms of A3H (data not shown). We tested the
binding of a 6-FAM labeled 50 nucleotide (nt) ssRNA and ssDNA (Figure 3.2b, c), containing a
mixed sequence with no predicted secondary structure. To our surprise, the results revealed
relatively strong binding to both 50 nt RNA and DNA for all the A3H constructs tested, with Kd
values between 8–34 nM for RNA and 22–68 nM for DNA (Table 3.2). Only small differences in
binding affinity for the 50 nt RNA or DNA were observed between the dimeric construct (m1) and
the monomeric constructs (m1+H114A or +W115A/C116S) (Table 3.2 and Figure 3.2b, c). When
comparing the affinity of different monomeric forms to the 50 nt RNA substrate, the m1 monomer
form (converted from the dimer form by high salt) showed a Kd of ~8 nM, which is slightly tighter
binding than the 12 nM for the W115A/C116S mutant, and the 34 nM for the H114A mutant (Table
3.2). When comparing between the dimeric and monomeric forms of the same m1 construct, the
monomeric form showed stronger binding than the dimeric form for both 50 nt RNA and DNA
(Table 3.2). We then tested shorter ssDNA, 13 nt and 8 nt, to compare the binding affinity of
dimeric forms and monomeric forms (Figure 3.2d, e). The results showed that the
m1+W115A/C116S monomer mutant bound more weakly than the m1 dimer and monomer,
and showed a different shift pattern, as the oligonucleotide became shorter. With the 13 nt ssDNA, the binding
affinity of W115A/C116S mutant had a Kd of 497 nM, whereas m1 monomer was 272 nM (Table
3.2). With the 8 nt ssDNA, the Kd of W115A/C116S monomer mutant was 2.96 μM, about a 3-
fold drop in binding affinity compared to 983 nM for m1 monomeric construct. Again, if compared
between the dimeric form and monomeric form of the same m1 dimer construct, the monomeric
form showed stronger binding than the dimeric form for shorter oligomers (Table 3.2 and Figure
3.2d, e). Because of the similarity of binding between 50 nt ssRNA and ssDNA, the phenomenon
observed for the shorter ssDNA is likely to be similar for the shorter ssRNA. Additionally, all A3H
constructs showed some level of cooperativity in their binding to the ssDNA/RNA of different
lengths (Table 3.2).
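Cooperative binding of this kind is commonly described with a Hill-type model, in which the fraction of oligonucleotide bound is θ = [P]^n / (Kd^n + [P]^n). The sketch below assumes this model; the exact equation behind the fits in Table 3.2 is not stated here. The Kd values are the 8 nt ssDNA values quoted in the text, whereas the Hill coefficient n = 2 is a hypothetical value chosen only for illustration.

```python
def fraction_bound(protein_nM, kd_nM, hill_n=1.0):
    """Hill-type binding isotherm: fraction of labeled oligonucleotide bound
    at a given protein concentration (protein assumed in large excess)."""
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be non-negative and Kd positive")
    p_n = protein_nM ** hill_n
    return p_n / (kd_nM ** hill_n + p_n)

# Kd values for the 8 nt ssDNA from the text; hill_n = 2 below is an assumed value.
KD_M1_MONOMER = 983.0      # nM
KD_W115A_C116S = 2960.0    # nM

fold_change = KD_W115A_C116S / KD_M1_MONOMER
print(f"Kd ratio (mutant / m1 monomer): {fold_change:.2f}")

for protein in (250.0, 983.0, 2960.0, 8000.0):
    m1 = fraction_bound(protein, KD_M1_MONOMER, hill_n=2.0)
    mut = fraction_bound(protein, KD_W115A_C116S, hill_n=2.0)
    print(f"{protein:7.0f} nM protein: m1 {m1:.2f}, W115A/C116S {mut:.2f}")
```

By construction, the fraction bound is 0.5 when the protein concentration equals Kd, whatever the Hill coefficient; a coefficient above 1 only steepens the transition around that point.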
Formation of A3H HMW species in HEK293T cells
Non-substrate nucleic acid binding by APOBEC proteins, especially through binding of RNA
that is freely available inside cells, may be a general factor for the multimerization of these
enzymes and RNA-mediated inhibition of deaminase activity (82, 133, 134). Here we examined
the multimerization status of an N-terminal FLAG-tagged A3H (FLAG-A3H) expressed in
HEK293T cells using a cell fractionation assay and tested the effect of RNase A treatment of the
cell lysates on both the oligomeric status and deaminase activity. The cell lysates of 293T cells
overexpressing FLAG-A3H either with or without RNase A treatment were fractionated by
Superdex 200 size-exclusion chromatography (SEC) column, and fractions across the elution
range were analyzed by Western blot to detect the multimerization status of A3H (Figure 3.3a, b).
Each fraction from SEC was also tested for deaminase activity. The results showed that, without
RNase A treatment, A3H in the cell lysates eluted mostly in high molecular weight (HMW)
fractions (>~800 kDa), and very little deaminase activity was detected across these fractions, being
barely above background levels (Figure 3.3a). However, in the RNase A-treated cell lysates, the
HMW species dissociated to the LMW species, and deaminase activity was detected across the
fractions, with high activity associated with the LMW fractions (Figure 3.3b). These results
indicate that RNA binding of A3H is involved in multimerization and inhibition of deaminase
activity of A3H in vivo. These results are consistent with the strong binding affinity to nucleic
acids and the RNA-mediated dimerization of A3H observed previously.
Role of positively charged residues in subcellular localization of A3H
In the highly positively charged surface of A3H, there are a total of 13 arginine and lysine
residues (Figure 3.4a), which can be grouped into three patches (patch 1: K16/R17/R18/R20/R21,
patch 2: K27/K50/K51/K52, and patch 3: K168/R171/R175/R179) based on their location. To test
the contribution of these positively charged residues to RNA-mediated HMW species formation
in cells, we generated the corresponding patch mutants by replacing all the residues within
each of these three groups with alanine and then tested multimerization status in HEK293T cell lysates. When
the cell lysates of these patch mutants (patch 1-3) for molecular mass analysis were prepared, we
noticed that the mutants mostly showed reduced protein level in the soluble fractions. We
subsequently hypothesized that the reduced patch mutant proteins in the soluble fraction could be
due to relocalization from cytoplasm to nucleus as nuclear fractions are removed during the cell
lysate preparation for molecular mass analysis. Accordingly, the subcellular distribution of WT
A3H hap I, hap II, and the three patch mutants was tested. Hap I was included in this analysis because
previous studies have shown that A3H hap I predominantly localizes in nucleus, whereas hap II
mainly in cytoplasm (123, 124). The subcellular fractionation analysis showed that A3H hap I was
mostly localized in the nucleus (~71%), consistent with previous reports. Compared to hap I, A3H
hap II had a reduced level in nucleus (~50%) (Figure 3.4b). Interestingly, two of the three patch
mutants, patch 1 and 3, showed higher distribution in the nucleus than WT A3H, with ~80% for
both. These results suggest that the positively charged residues in patch 1 and patch 3 play major
roles in mediating subcellular localization. In addition, A3H carrying W115A mutation
(m1+W115A) also showed predominant nuclear localization (Figure 3.4b). We next tested if there
is any difference in RNA-mediated inhibition of deaminase activity for the patch mutants, as an
alternative way of assessing RNA binding. The results showed that, while WT A3H hap II
displayed high activity only after RNase A treatment, hap I, and three patch mutants (patch 1–3),
all had significantly reduced deaminase activity with or without RNase A treatment (Figure 3.4c),
suggesting that these three patch mutants may have impaired binding to substrate ssDNA for
deamination. Of note, patch 1 mutant showed comparable levels of activity regardless of RNase A
treatment, even though overall activity decreased compared to WT, suggesting that inhibition of
activity by RNA may be attenuated in this mutant. Taken together, these positively charged patches
on the A3H surface contribute to subcellular localization and deaminase activity, possibly through
binding to nucleic acids (either ssDNA or RNA).
Importance of loop 1 for deaminase activity and mC selectivity of A3H
As we reported in chapter 2, A3H and A3A have about three orders of magnitude higher
deaminase activity than other APOBECs in in vitro assays using purified recombinant proteins.
Comparing the surface charge distribution around the Zn-active center of all catalytically active
APOBEC domains, A3H and A3A are the only two APOBECs having their Zn-center surrounded
by a predominantly positively charged surface area (Figure 3.5 and Supplementary Figure S3.4).
The highly positively charged environment around the Zn-center in A3H and A3A may help attract
any ssDNA substrates for efficient deamination. As a comparison, other active APOBEC domains
have positively charged surfaces located some distance away from the Zn-center (Figure 3.5 and
Supplementary Figure S3.4), suggesting that substrate DNA may initially bind to the positively
charged area away from the Zn-center and then extend the target C to the active site for
deamination, as in the case reported for A3FCD2 in complex with ssDNA (144). Thus, there may
be a relationship between the charge distribution around the Zn-center and deaminase activity
levels observed for different members of the APOBEC family. Both A3H and A3A have been
shown to similarly have strong mC deaminase activity. However, the mC selectivity factor (defined
as mC/C specific activity ×100) of A3H is around 50, the highest among all APOBECs (chapter
2). Here with the structure of A3H at hand, we tried to understand structural elements important
for the high deaminase activity and high mC selectivity observed for A3H. In chapter 2, we showed
that loop 1 is important in determining the deaminase activity and mC selectivity for A3A and
A3BCD2. Here we have examined the role of A3H loop 1 with regards to deaminase activity and
mC selectivity using cell lysates from mammalian cells expressing A3H. A3H loop 1 is highly
charged, with five arginine residues (Figure 3.5b) located in the previously mutated patch 1 area.
We first generated R to D point mutations for each of these five residues, and the results showed
that mutants R17D, R21D, and R26D essentially lost the deaminase activity (Figure 3.6a),
demonstrating that negatively charged residues at these positions of loop 1 abolished the activity.
The other two mutants, R18D and R20D, decreased the activity by ~20–40% in C deamination,
and by ~40–50% in mC deamination (Figure 3.6a). The mC selectivity factor of R18D decreased
to about half that of WT (Table 3.3), whereas that of R20D decreased only moderately. We
further tested the effect of alanine substitution on a few selected positions on loop 1, including
R21A, R26A and R18A/L19A. In contrast to R21D and R26D mutants that showed complete loss
of activity, R21A and R26A both retained the deaminase activity and showed decreased selectivity
for mC (Table 3.3). R18A/L19A showed about 2-fold higher C deaminase activity than WT with
slightly increased selectivity factor for mC (Table 3.3). These data suggest that not only the
positions but also the residue types on loop 1 can affect deaminase activity as well as mC
selectivity. A sequence alignment of the APOBEC proteins also revealed a unique residue, A28,
on the C-terminal end of A3H loop 1, that is either T or S in all other active APOBEC domains
(Supplementary Figure S3.2). In the 3D structure, A28 occupies the same position as T that packs
on the back side of the target cytosine base at the Zn-center pocket (Figure 3.5c), and presumably
could help to stabilize the C base inside the pocket for deamination. To test whether this unique
A28 of A3H would provide more room for the larger mC inside the pocket and allow for higher
mC selectivity, we made an A28T mutant of A3H, and the results of the subsequent activity assay
showed that A28T had higher deaminase activity on both C and mC than WT A3H (Table 3.3 and
Figure 3.6a), presumably holding the base tighter for more efficient deamination. However, the
A28T mutant showed no significant change in mC selectivity, indicating that A28 is not one of the
factors accounting for the high mC selectivity in A3H.
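The mC selectivity factor used throughout this section is defined in the text as the ratio of mC to C specific activity multiplied by 100. The following is a minimal sketch of that calculation; the function name is illustrative, and the activity values are hypothetical placeholders rather than values from Table 3.3.

```python
def mc_selectivity_factor(mc_activity, c_activity):
    """mC selectivity factor = (mC specific activity / C specific activity) x 100.
    Both activities must be expressed in the same units."""
    if c_activity <= 0:
        raise ValueError("C specific activity must be positive")
    return mc_activity / c_activity * 100.0

# Hypothetical specific activities in arbitrary but matching units.
wild_type = mc_selectivity_factor(mc_activity=5.0, c_activity=10.0)
mutant = mc_selectivity_factor(mc_activity=2.0, c_activity=8.0)
print(wild_type, mutant)
```

Because the factor is a ratio, a mutant can gain activity on both C and mC (as reported for A28T) while its selectivity factor stays essentially unchanged.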
Effect of loop swapping of loop 1 and loop 7
Because the mutational studies in chapter 2 showed that loop 1 and loop 7 around the Zn-active
center play important roles in determining the deaminase activity and mC selectivity, we tested the
function of loop 1 and loop 7 of A3H by swapping them with the equivalent loops from A3A and
A3GCD2, which have relatively high and low deaminase activity, respectively. The results showed
that swapping loop 1 of A3H with either A3A or A3GCD2 resulted in a complete loss of activity
(Figure 3.6b), indicating that loop 1 from A3A and A3GCD2 are not compatible with the active
center configuration of A3H for deaminase activity, likely disrupting the correct substrate binding
mode. However, swapping loop 7 of A3H with A3A showed deaminase activity comparable to that
of WT A3H, except that the mC selectivity factor was reduced (Table 3.3 and Figure 3.6b). Swapping
loop 7 with that of A3GCD2, in contrast, resulted in a loss of activity on ssDNA with a TCA
motif. Since loop 7 of A3GCD2 has been shown to favor the TCC motif for deamination (145, 146), we
also tested substrate ssDNA containing a TCC motif, which showed low activity that was
barely above background (Figure 3.6b). The loop 7 swapping results indicate that loop 7 of A3H
can be replaced with that of A3A, but not of A3GCD2. Taken together, loop 1 and loop 7 of A3H
are both important for the robust deaminase activity and mC selectivity, but loop 1 is more sensitive
to changes than loop 7, which is consistent with the results shown earlier, where several point
mutations on loop 1 abolished A3H activity.
DISCUSSION
In chapter 3, we report the crystal structure of an RNA-free human A3H monomer mutant and
the structure-guided biochemical studies. The overall structure of this A3H monomeric form is
conserved with those of other known structures of APOBEC proteins. While we were preparing
the manuscript for this chapter, crystal structures of dimeric A3H from pig-tailed macaque
(pgtA3H) and human (hA3H) were reported by two independent groups (142, 143), and a similar
crystal structure of the dimeric A3H from chimpanzee (cpzA3H) was reported later (147). Our
apo-form A3H crystal structure showed overall features consistent with the three reported RNA-bound
dimeric A3H structures. The A3H structure has some unique features in that it has an extended
α-helix 6 (α6) and a shortened β-strand 5 (β5). In addition, A3H has the longest loop 1 and shortest
loop 3 around the zinc-coordinating active center among the known APOBEC structures.
Furthermore, A3H has the most extensive positively charged surface around the active center
among all catalytically active APOBEC domains reported so far (Supplementary Figure S3.4). To
obtain A3H crystals, WT and mutant A3H proteins were purified as dimeric form with RNA bound
or monomeric form. With high salt treatment, dimeric A3H was dissociated into monomers and
free RNA, indicating that RNA binding to A3H is primarily through hydrophilic interaction. Stable
monomeric A3H was obtained by adding H114A or W115A/C116S mutations, which presumably
broke the dimer interaction by disrupting binding to RNA. The three available A3H dimer
structures all show similar RNA mediated dimerization (142, 143, 147). Our results showing the
conversion of dimeric A3H into monomeric form by dissociating RNA in high salt condition or by
H114A or W115A/C116S mutations are consistent with the reported RNA-mediated A3H dimer
structures. RNase A treatment of mammalian cell lysates expressing A3H not only activated A3H
deaminase activity, but also converted the HMW species into LMW species. These results suggest
that RNA binding of A3H plays a role in HMW ribonucleoprotein complex formation in vivo. A
similar propensity is seen in A3G, but not in A3B, where the HMW complex of A3B is insensitive
to RNase A treatment, even though RNase A treatment can greatly enhance its deaminase activity
(139). The positively charged surface of A3H has 13 R/K residues that can be grouped into three
patches. Interestingly, alanine substitution of these three patches, patch 1, 2, and 3, changed
the subcellular distribution pattern of A3H, with patch 2 showing a modest increase in nuclear
localization to 65%, and patch 1 and 3 showing an increase to 80% nuclear localization from the WT
level (~50%). These patch mutants had greatly reduced deaminase activity, indicating a loss of
substrate binding. In addition, the patch 1 mutant did not show a significant difference in its
deaminase activity with or without RNase A treatment, which suggests that RNA-mediated
inhibition is no longer present in this mutant. The extensively positively charged surface areas
around the Zn-center of A3H and A3A suggest that nucleic acids should be able to bind directly
to the active site pocket, which may explain why A3H and A3A are the two APOBECs with higher
activity than other members of the APOBEC family (see chapter 2). Interestingly, the monomeric
and dimeric mutants of A3H showed comparable Kd values for both 50 nt ssRNA and ssDNA.
However, the monomeric mutant (m1+W115A/C116S) showed attenuated binding to 13 nt and 8
nt ssDNA compared to the dimeric form. The observation that the monomeric mutant containing
W115A mutation showed lowered binding to shorter nucleic acids may be in part because the
hydrophobic interactions between the nucleotide base and the W115 side chain have a major
contribution to the binding of shorter nucleic acids, in addition to the charge-charge interactions
through the multiple R/K residues. When the nucleic acids are sufficiently long, they can then bind
to multiple sites across the positively charged surface, reducing the contribution of W115 to the
overall binding affinity.
A3H has been shown to be highly catalytically active in C deamination and has the highest mC
selectivity (see chapter 2). Among the loops around the active center (i.e. loop 1, 3, 5 and 7) of
APOBECs, loop 5 is highly conserved, while loops 1, 3 and 7 are variable. The loop 3 of A3H has
only four residues, making it the shortest among all active APOBECs. It is likely that such a short
loop adopts a consistently open configuration, as seen in the non-substrate binding structure. Loop
1 of A3H is the longest and most divergent in sequence among all active APOBECs. By
comparison, the other highly active APOBEC, A3A, has the shortest loop 1. There are also some
unique sequence features on loop 1 and loop 7 of A3H: a total of seven R/K residues are present
on loop 1. Point mutations on five of the loop 1 R/K revealed that they all play a role in deaminase
activity and mC selectivity, even though their relative contributions vary to some extent. Loop 1
and loop 7 have very different conformations in the apo-A3H structure reported here and the
reported RNA-bound A3H structures, suggesting both loops are flexible and can adopt different
conformations for RNA or ssDNA substrate binding. While the detailed mechanisms for ssDNA
binding likely require further co-crystal structures, the results shown here suggest that multiple
residues on loop 1 and loop 7 contribute to the high activity and high mC selectivity in deamination,
which is consistent with the results on A3A and A3BCD2 in chapter 2.
In summary, we report the structure and biochemical properties of human A3H. Our findings
provide valuable insights into the molecular basis for the complex nucleic acid binding for molecular
assembly, subcellular distribution, and catalytic regulation of A3H.
MATERIALS AND METHODS
Plasmids
The coding sequence of the human A3H hap II (GenBank accession: ACK77776) was codon-
optimized for the expression hosts (Escherichia coli and human HEK293T) and synthesized
(GeneArt Gene Synthesis, Thermo Fisher Scientific). The coding sequence of A3H hap I (GenBank
accession: NP_001159474) was derived from that of A3H hap II through site-directed mutagenesis.
A3H hap II constructs for crystallization trials and in vitro assay were cloned into pMAL-c5X
vector with an N-terminal MBP-tag with or without PreScission Site. A3H hap I and hap II
constructs for human cell-based study were cloned in pcDNA3.1(+) mammalian expression vector
with an N-terminal FLAG tag. Cloning and mutagenesis were performed by In-Fusion cloning and
PrimeSTAR mutagenesis (Clontech) following the manufacturer's instructions. The sequences of
the constructs were verified by DNA sequencing (Genewiz).
Protein expression and purification
E. coli cells harboring the A3H expression vectors were grown at 37°C to an OD600 of about 0.2-0.3,
and growth was then continued at 14-16°C. Isopropyl β-D-1-thiogalactopyranoside (IPTG; 0.1 mM)
was added when the OD600 reached 0.6-0.8, and protein expression was induced overnight. To
purify dimer and monomer of MBP-A3H hap II, E. coli cells were harvested and lysed in buffer A
(25 mM HEPES, pH 7.5, 500 mM NaCl, 20 mM MgCl2, and 1 mM DTT) supplemented with 1
mg RNase A (Qiagen) per liter cells. The clear soluble fraction obtained after centrifugation was
passed through amylose resin, washed with buffer B (50 mM HEPES, pH 7.5, 500 mM NaCl, and
0.5 mM TCEP) in 0.5 M, 1 M, and 0.5 M NaCl gradient supplemented with 10 μg/ml RNase A,
and eluted with buffer B supplemented with 40 mM maltose. The elution fractions were
concentrated and treated with 1 mg/ml RNase A at 4°C overnight, and separated by Hiload 16/60
Superdex 200 gel filtration chromatography (GE Healthcare) in buffer B. The dimer fractions were
collected and concentrated. To obtain the monomer, A3H hap II dimer was subjected to RNase A
(0.5 mg/ml) treatment and Hiload 16/60 Superdex 75 gel filtration chromatography (GE
Healthcare) in the presence of buffer B plus 1.5 M NaCl or 2 M NaCl, which resulted in monomeric
species and released free RNAs from the dimers. MBP-fused A3H m1 mutant dimer and monomer
were purified with a protocol similar to that described above with modifications. The concentrated
amylose elution fractions were subject to two rounds of 1 mg/ml RNase A treatment at 4°C (RNase
T1 at final concentration of 1 U/μl was also used for some batches) and Superdex 200 gel filtration
chromatography in buffer B. In each round, the dimer fractions were collected and concentrated.
The A260/280 of the final dimer was between 0.92-1.0. To obtain MBP-fused A3H m1 monomer,
the NaCl concentration of the dimer sample was adjusted to 1.5 M and the monomer fractions were
collected after Superdex 75 gel filtration chromatography in buffer B with 1.5 M NaCl. The
A260/280 of the concentrated monomer was between 0.63-0.71. For cleaved A3H
m1+W115A/C116S monomer for crystallization trials, the concentrated amylose elution fractions
were subject to Superdex 75 gel filtration chromatography in buffer B. Then the MBP tag was
cleaved with PreScission protease in buffer C (50 mM HEPES, pH 7.5, 250 mM NaCl, 0.5 M
arginine, 0.5 mM TCEP). The fractions containing the cleaved A3H monomer were obtained after
Superdex 75 gel filtration chromatography in buffer C, collected, and concentrated for
crystallization trials. The A260/280 of the cleaved m1 W115A/C116S monomer was between 0.53-
0.57. MBP-fused A3H monomer mutants (m1+H114A and m1+W115A/C116S) used in EMSA
were purified with the same method described above without PreScission cleavage.
Protein crystallization, data collection, structure determination and refinement
The cleaved monomeric A3H m1+W115A/C116S mutant protein was concentrated to about 4-
7 mg/ml for crystallization screening. Initially crystals were obtained at 4°C by sitting drop vapor-
diffusion method in many conditions containing PEG (PEGs Suite, Qiagen). After optimization,
crystals used for data collection were grown in 0.2 M Na thiocyanate and 4% PEG 20K at 4°C.
Diffraction data were collected at Advanced Photon Source beamline 23-ID-D. Data sets were indexed,
integrated, and scaled using the HKL2000 program package. The structure of A3H was determined by
molecular replacement with MOLREP (CCP4 suite) using the core structure of rhesus macaque
A3GCD1 (PDB ID: 5K81), with the loops removed, as the search model. The initial map was
improved by NCS averaging, and the model for the removed loops was built based on the improved
map. The final structure was refined with PHENIX and manually checked in COOT. The statistics
for diffraction data and structural determination/refinement are shown in Table 3.1.
Electrophoretic mobility shift assay (EMSA)
Each 6-FAM labeled oligonucleotide at a specified concentration (1 nM 50 nt ssRNA, 10 nM
50 nt/13 nt/8 nt ssDNA) was titrated by MBP-fused A3H m1 dimer/monomer, m1+H114A
monomer, and m1+W115A/C116S monomer up to 8 μM in 10 μl reaction volume containing 50
mM HEPES, pH 7.5, 250 mM NaCl, 1 mM DTT, 2.5 mM EDTA, and 10% glycerol. The reaction
mixture was incubated on ice for 10 min and analyzed by 8% native PAGE. Typhoon RGB
Biomolecular Imager (GE Healthcare) was used to visualize the images, ImageQuant TL (GE
Healthcare) was used for band quantification, and GraphPad Prism was used for curve fitting.
Three independent experiments were performed.
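The curve fitting itself was done in GraphPad Prism, and the model is not spelled out here. As a rough illustration only, the sketch below estimates Kd and a Hill coefficient from quantified band intensities by linear regression on a Hill plot, log10(θ/(1−θ)) = n·log10[P] − n·log10(Kd), where θ is the fraction of probe shifted. The data points are synthetic, the function name is hypothetical, and this linearization is a stand-in for, not a reproduction of, the fit used for Table 3.2.

```python
import math

def hill_plot_fit(protein_nM, fraction_bound):
    """Least-squares line through log10(theta/(1-theta)) vs log10([P]).
    Returns (Kd in nM, Hill coefficient). Points with theta <= 0 or >= 1
    are skipped because the logit is undefined there."""
    xs, ys = [], []
    for p, theta in zip(protein_nM, fraction_bound):
        if p > 0 and 0.0 < theta < 1.0:
            xs.append(math.log10(p))
            ys.append(math.log10(theta / (1.0 - theta)))
    if len(xs) < 2:
        raise ValueError("need at least two usable points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # Hill coefficient n
    intercept = mean_y - slope * mean_x    # equals -n * log10(Kd)
    kd = 10 ** (-intercept / slope)
    return kd, slope

# Synthetic titration generated from Kd = 500 nM, n = 1.5 (hypothetical values).
concs = [62.5, 125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0]
thetas = [c ** 1.5 / (500.0 ** 1.5 + c ** 1.5) for c in concs]
kd_est, n_est = hill_plot_fit(concs, thetas)
print(f"Kd ~ {kd_est:.0f} nM, Hill coefficient ~ {n_est:.2f}")
```

With real gel data, points near θ = 0 or θ = 1 are dominated by quantification noise, which is one reason a direct nonlinear fit is usually preferred over this linearized form.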
Cell culture and transfection for human cell-based assays
For studying the multimerization and subcellular localization of various A3H constructs in
HEK293T cells (ATCC), A3H mutants generated in pcDNA3.1(+) vector with an N-terminal
FLAG tag were transfected into HEK293T cells. HEK293T cells were maintained in DMEM
medium (Corning), supplemented with 10% FBS, 100 U/ml penicillin and 100 μg/ml streptomycin.
Transfections of HEK293T cells were done by using X-tremeGENE 9 DNA Transfection Reagent
(Roche) following the manufacturer's recommendations. To detect the expression of various A3H
constructs by Western blot, cell lysate samples were separated by SDS-PAGE, transferred onto
PVDF membrane (EMD Millipore), and blotted with anti-FLAG M2 mAb (Sigma, 1:3000).
Cell lysate fractionation analysis of A3H
Analysis of multimerization or HMW/LMW complex formation of A3H constructs in
HEK293T cells was performed by fractionating cell lysates by FPLC. At 72 h post-transfection,
A3H-transfected HEK293T cells in 150 mm² dishes were harvested, washed with PBS, and lysed
in lysis buffer (50 mM HEPES, pH 7.5, 125 mM NaCl, 0.6% NP-40 alternative, 0.5 mM TCEP,
10% glycerol; the final total volume was 1 ml after mixing with cells) with 1× Halt protease and
phosphatase inhibitor (Thermo Fisher Scientific) for 15 min. After centrifugation and removing
the surface lipid fraction, the clear supernatant fraction was loaded onto Superdex 200 10/300 GL
column (GE Healthcare) equilibrated with running buffer (50 mM HEPES pH 7.5, 125 mM NaCl,
0.1% NP-40 alternative, 0.5 mM TCEP, and 10% glycerol). Fractions were subjected to Western
blot and deaminase assay. For RNase A treatment, the clear supernatant after lysis was incubated
with 100 μg/ml RNase A on ice for 2 h before loading onto Superdex 200 10/300 GL column.
Analysis of subcellular distribution of various A3H mutants
The subcellular distribution of various A3H mutants in HEK293T cells was analyzed by cell
fractionation to separate the cytoplasmic and nuclear fractions, followed by SDS-PAGE and
Western blot. At 48 h post-transfection, A3H-transfected HEK293T cells cultured in 6-well plates
were harvested and washed with PBS. Cells were fractionated into whole cell, cytoplasmic and
nuclear fractions by Nuclei EZ Prep kit (Sigma). The fractions were further treated with 2% SDS
and benzonase (Sigma) to degrade chromosomal DNA. Subcellular fractions were analyzed by
SDS-PAGE and Western blot using anti-FLAG M2 mAb to detect the various FLAG-A3H
constructs. The FLAG-A3H band density was quantified with ImageJ to determine the ratio of
subcellular localization.
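A minimal sketch of how a localization percentage can be derived from the quantified band densities, assuming it is expressed as the nuclear share of the summed nuclear and cytoplasmic signal. The exact normalization is not stated here, the function name is illustrative, and the densities below are hypothetical.

```python
def percent_nuclear(nuclear_density, cytoplasmic_density):
    """Percentage of the FLAG-A3H signal found in the nuclear fraction,
    computed from background-subtracted band densities."""
    total = nuclear_density + cytoplasmic_density
    if nuclear_density < 0 or cytoplasmic_density < 0 or total == 0:
        raise ValueError("densities must be non-negative and not both zero")
    return 100.0 * nuclear_density / total

# Hypothetical band densities (arbitrary units from the image quantification).
print(percent_nuclear(8000.0, 2000.0))
```

Both bands must come from equivalent loading of the two fractions for this ratio to be meaningful.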
Deaminase assay
At 48 h post-transfection, A3H-transfected 293T cells in 6-well plates were harvested, washed
with PBS, and the whole cell lysates were prepared using M-PER protein extraction reagent
(Thermo Fisher Scientific) with 1× Halt protease and phosphatase inhibitor. After centrifugation,
the clear supernatant fraction was separated for deaminase assay. Prior to deaminase reaction, the
total protein concentration was quantified by BCA protein assay kit (Pierce), the expression level
of each A3H construct was quantified by Western blot and normalized with the whole cell lysate
transfected with the empty pcDNA3.1(+) vector. Various concentrations of A3H-transfected 293T
cell lysates were incubated with 300 nM 5'-6-FAM-labeled 30 nt ssDNA substrates containing a
target C or mC (5'-ATTTATATTATTTATT(m)CATATTTATATTTA-3') in a final volume of 20 μl
deaminase reaction mixture (25 mM HEPES, pH 7.0, 50 mM NaCl, 1 mM DTT, 0.1% Triton X-
100, 0.1 mg/ml RNase A). The deaminase reaction was performed at 37°C for 1 h, and then
terminated by heat inactivation at 90°C for 5 min. The bases of the deamination products U or T
were subsequently cleaved by UDG (2.5 units, NEB) or TDG (0.5 μg, 3-fold excess amount of the
complementary ssDNA was also added). The UDG reaction was performed at 37°C for 1 h, and
the TDG reaction was performed at 42°C overnight. The resulting abasic sites were hydrolyzed
by 0.1 M NaOH at 90°C for 10 min. The deamination products were separated on 20% urea
denaturing gels, visualized by Molecular Imager FX (Bio-Rad) or Typhoon RGB Biomolecular
Imager, and quantified by Quantity One 1-D Analysis Software (Bio-Rad) or ImageQuant TL.
Deaminase activity (nM product/μg cell lysate) was determined as the product formation over
enzyme concentration in an initial range where the product formation is linearly dependent on cell
lysate concentration. Error bars were generated based on standard deviation of three independent
data sets.
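The activity calculation described above can be sketched as a slope over the initial linear range, with the error bar taken from three replicate slopes. The fitting procedure is not specified in this section, so the least-squares line through the origin used here is an assumption, and all numbers are hypothetical placeholders rather than measured data.

```python
from statistics import mean, stdev


def initial_slope(lysate_ug, product_nM):
    """Least-squares slope (nM product per ug lysate) of a line through the origin.

    Only points from the initial linear range should be passed in.
    Forcing the fit through the origin is an assumption of this sketch.
    """
    numerator = sum(x * y for x, y in zip(lysate_ug, product_nM))
    denominator = sum(x * x for x in lysate_ug)
    if denominator == 0:
        raise ValueError("need at least one non-zero lysate amount")
    return numerator / denominator


# Hypothetical lysate amounts (ug) and product readings (nM) for three
# independent data sets, restricted to the linear range.
lysate_ug = [0.0, 0.375, 0.75, 1.5]
replicates = [
    [0.0, 30.0, 60.0, 120.0],
    [0.0, 33.0, 66.0, 132.0],
    [0.0, 27.0, 54.0, 108.0],
]

slopes = [initial_slope(lysate_ug, r) for r in replicates]
activity_mean = mean(slopes)   # reported activity
activity_sd = stdev(slopes)    # error bar: sample standard deviation, n = 3
print(f"{activity_mean:.1f} ± {activity_sd:.1f} nM product/ug lysate")
```

Restricting the fit to the linear range keeps substrate depletion at high lysate amounts from lowering the apparent activity.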
Table 3.1. Crystallographic data collection and refinement statistics.

                            A3H (PDB ID: 5W45)
Data collection
  Space group               P2₁
  Cell dimensions
    a, b, c (Å)             46.749, 65.006, 65.540
    α, β, γ (°)             90.00, 90.08, 90.00
  Resolution (Å)            50-2.49 (2.58-2.49)*
  Rsym or Rmerge            8.7 (49.1)
  I / σI                    19.6 (2.5)
  Completeness (%)          98.8 (90.9)
  Redundancy                6.2 (4.4)
  Molecules per ASU         2
Refinement
  Resolution range (Å)      50-2.49
  No. reflections           13746
  Rwork / Rfree             21.12 / 23.48
  No. atoms                 3009
    Protein                 2976
    Ligand/ion              2
    Water                   31
  B-factors                 48.0
    Protein                 47.95
    Ligand/ion              45.68
    Water                   52.72
  R.m.s. deviations
    Bond lengths (Å)        0.002
    Bond angles (°)         0.551

Structure was determined from a single crystal. *Highest-resolution shell is shown in parentheses.
Table 3.2. ssDNA and ssRNA binding properties of A3H dimeric and monomeric mutants.

Kd (nM)
                     A3H constructs¹ (oligomer state)
Nucleotides          m1              m1               m1+W115A/C116S   m1+H114A
                     (dimer form)    (monomer form)   (monomer only)   (monomer only)
ssRNA (FAM-50 nt)    14.3 ± 1.0      8.4 ± 0.4        12.6 ± 1.7       34.9 ± 1.8
ssDNA (FAM-50 nt)    68.5 ± 3.3      35.0 ± 1.0       39.8 ± 2.2       22.2 ± 1.3
ssDNA (FAM-13 nt)    1,357 ± 37      272 ± 8          497 ± 11
ssDNA (FAM-8 nt)     2,846 ± 134²    983 ± 23²        2,960 ± 571²

Hill coefficient (cooperativity)
                     A3H constructs¹ (oligomer state)
Nucleotides          m1              m1               m1+W115A/C116S   m1+H114A
                     (dimer form)    (monomer form)   (monomer only)   (monomer only)
ssRNA (FAM-50 nt)    1.9 ± 0.2       1.9 ± 0.2        2.8 ± 0.9        2.4 ± 0.2
ssDNA (FAM-50 nt)    2.6 ± 0.3       4.6 ± 0.7        3.8 ± 0.6        2.2 ± 0.2
ssDNA (FAM-13 nt)    3.7 ± 0.3       4.3 ± 0.6        3.5 ± 0.2
ssDNA (FAM-8 nt)     2.7 ± 0.2       3.7 ± 0.3        –³

Note: The Kd values were obtained based on EMSA results, which should only be considered as
approximate estimates of the binding affinity, especially for the shorter 13 nt and 8 nt oligomers.
¹ A3H mutants m1 and m1+W115A/C116S contain the catalytic residue E to A mutation.
² The EMSA gel shift bands are smears or less defined bands for the 8 nt oligomer, and the
quantification of binding is an approximate estimate.
³ No obvious cooperativity was observed.
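For orientation, a Kd and a Hill coefficient of the kind listed in Table 3.2 are commonly obtained by fitting the bound fraction from EMSA to the Hill equation. This section does not state the fitting model, so the form below is the standard expression given as an assumption. The symbols are illustrative: θ is the fraction of oligonucleotide shifted, [P] the protein concentration, and n the Hill coefficient.

```latex
\theta = \frac{[P]^{n}}{K_d^{\,n} + [P]^{n}}
```

In this form, θ = 0.5 when [P] = Kd regardless of n, and n > 1 indicates positive cooperativity.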
Table 3.3. The deaminase activity for C and mC, and mC selectivity factor of A3H mutants.

                       TCA                 TmCA                Selectivity for mC
A3H constructs         (nM product/        (nM product/        (mC/C)*100
                       μg cell lysate)     μg cell lysate)
WT                     171 ± 8             82.3 ± 4.8          48.3
Loop 1 mutants
  R17D                 ND                  ND                  -
  R18D                 141 ± 48            34.5 ± 2.1          24.5
  R20D                 105 ± 20            41.1 ± 8.3          39.0
  R21D                 8.7                 ND                  -
  R26D                 3.2                 ND                  -
  R18A/L19A            387 ± 70            240 ± 32            62.0
  R21A                 490 ± 21            126 ± 4             25.7
  R26A                 110 ± 30            33.5 ± 19.1         30.5
  A28T                 702 ± 125           388 ± 69            55.3
Loop-swap mutants
  lp1_A3A              ND                  ND                  -
  lp1_A3GCD2           ND                  ND                  -
  lp7_A3A              168                 49                  29.2
  lp7_A3GCD2           ND                  ND                  -
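The selectivity column follows from the two activity columns as (mC/C)*100. The sketch below recomputes it for three rows of Table 3.3; the function name is hypothetical, and because it uses the rounded mean activities printed in the table, the results may differ in the last digit from values derived from unrounded data.

```python
def mc_selectivity(tca_activity, tmca_activity):
    """Return (mC/C)*100, or None when either activity was not detected (ND)."""
    if tca_activity is None or tmca_activity is None:
        return None
    return 100.0 * tmca_activity / tca_activity


# Mean activities (nM product/ug cell lysate) copied from Table 3.3.
# None stands for ND.
activities = {
    "R18A/L19A": (387, 240),
    "R21A": (490, 126),
    "A28T": (702, 388),
    "R17D": (None, None),
}

for construct, (tca, tmca) in activities.items():
    value = mc_selectivity(tca, tmca)
    print(construct, "-" if value is None else round(value, 1))
```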
Figure 3.1. Protein purification and the overall structure of A3H. a, Superdex 200 SEC elution
profile of MBP-A3H dimeric and monomeric mutants. A3H m1 produced a stable dimer after
extensive RNase A treatment (red). The purified m1 dimer can dissociate to monomer and free
RNA after RNase A treatment followed by 1.5 M or higher salt buffer (blue). The RNA-bound m1
dimer was disrupted by two sets of mutations on loop 7: H114A (m1+H114A) or W115A/C116S
(m1+W115A/C116S), and RNA-free monomers were purified from m1+H114A (light blue) or
m1+W115A/C116S (green). b, Multiangle light scattering (MALS) of MBP-fused
m1+W115A/C116S mutant, showing a clean monomeric form. The expected molecular mass of a
monomer A3H is 63.2 kDa. c, Two views of the crystal structure of the A3H m1+W115A/C116S monomer
mutant colored by secondary structure elements. The zinc atom is shown as a gray sphere. d,
Superimposition of the A3H (green) with A3A (PDB ID: 4XXO, yellow), A3BCD2 (PDB ID:
5CQI, salmon), and AID (PDB ID: 5W0R, purple).
Figure 3.2. The positively charged surface and the nucleic acid binding property of A3H. a,
The positively charged surface around the Zn-active center of A3H, covering loop 1, 3, 5, 7, and
α-helix 1 and 6 (α1 and α6). The other end of the molecule is mostly neutral or negatively charged
(Supplementary Figure S3.3c). The surface electrostatic potential is colored according to the calculated
electrostatic potential of the accessible surface area from −5 kT/e (red) to 5 kT/e (blue). The side chain
of W115 was modeled based on the monomer structure. b-e, Representative gel images of EMSA
showing A3H mutants binding to 50 nt ssRNA (1 nM, b), 50 nt ssDNA (10 nM, c), 13 nt ssDNA
(10 nM, d), and 8 nt ssDNA (10 nM, e). Quantification and estimate of binding affinity for each
oligonucleotide are shown in Table 3.2.
Figure 3.3. Multimerization of A3H in HEK293T cells and RNA-dependent inhibition of deaminase
activity. A3H formed enzymatically inactive high molecular weight (HMW) ribonucleoprotein
complex in vivo. Lysates of HEK293T cells expressing A3H, untreated or treated with RNase A,
were fractionated by Superdex 200 size-exclusion chromatography and subsequently analyzed by
Western blot and deaminase activity assay. a, A3H predominantly appeared as HMW complexes
and showed negligible deaminase activity without RNase A treatment. b, After RNase A treatment,
the HMW complexes were converted into enzymatically active low molecular weight (LMW)
species. α-tubulin is an endogenous control.
Figure 3.4. Positively charged patches are important for subcellular localization and
deaminase activity of A3H. a, A3H structure showing the positively charged residues mutated in
three patch mutants (patch 1–3). b, Cell fractionation analysis of A3H and patch mutants, showing
the distribution between nucleus and cytoplasm in HEK293T cells. Transfected 293T cells
expressing wild-type A3H hap I, hap II, and hap II mutants were fractionated into whole cell,
cytoplasm, and nucleus. FLAG-A3H proteins in each fraction were analyzed by Western blot. c,
Deaminase activity of A3H patch mutants using the cell lysates of transfected HEK293T cells with
or without RNase A treatment. The deaminase reaction was performed with cell lysate range of 0–
6 μg (total protein amount, 2-fold dilutions from 6 μg) and 300 nM ssDNA. S and P indicate
substrates and products, respectively.
Figure 3.5. Comparison of the charged surface around the Zn-active site center of A3H, A3A,
and A3BCD2, and the unique loop 1 residues in A3H. a, The charge distribution around the Zn-
active site, showing that the Zn-center of A3H and A3A (PDB ID: 4XXO), but not of A3BCD2
(PDB ID: 5CQI), is surrounded by a positively charged surface. The surface electrostatic potential
is colored according to the calculated electrostatic potential of the accessible surface area from −5 kT/e (red)
to 5 kT/e (blue). b, Five arginine residues on loop 1 of A3H that affect deaminase activity and mC
selectivity. c, Position of A28 (blue dotted sphere) at the Zn-active center pocket, located on the
side of the modeled target C in the pocket. The space that would be occupied by a larger T/S at the
same position in other APOBECs is represented by red dotted sphere that may pack tighter with
the target C.
Figure 3.6. The deaminase activity for C and mC of A3H mutants. Representative urea-
denaturing gel images of the deaminase assay of the A3H mutants, including (a) point mutations
on loop 1 and (b) loop swapped mutants using cell lysates of transfected HEK293T cells. The
deaminase reaction was performed with cell lysate range of 0–6 μg (total protein amount, 2-fold
dilutions from 6 μg) and 300 nM ssDNA containing reactive C. c, The quantification of the
deaminase activity assay from a and b. The activity was calculated from the initial range where
the product formation is linearly dependent on protein concentration.
Supplementary Figure S3.1. Protein purification of wild-type A3H hap II. a, Superdex 200
elution profiles of MBP-fusion of wild-type A3H dimeric and monomeric forms. MBP-A3H forms
a stable dimer after extensive RNase A treatment (blue). The purified dimer can dissociate to
monomer and free RNA after RNase A treatment followed by 2.0 M or higher salt buffer (orange).
The MBP-fusion of the monomeric mutant m1+W115A/C116S is used as monomeric size marker
(grey). b, The molecular weight standard curve for the SEC run in a. c, The SDS-PAGE gel
(protein) and denaturing urea gel (nucleic acid) of the peaks of the WT A3H treated with 2.0 M
NaCl. Peak 1 (pk1) had an A260/280 ratio of 2.13, close to that of typical RNA.
Supplementary Figure S3.2. Multiple sequence alignment of APOBEC proteins. a, Sequence
alignment of A3H haplotype II (A3H-II) with three other A3 members. The nomenclature of
secondary structures assigned on the top of the alignment is adopted based on those of APOBEC2
(10), and the changes of secondary structures for A3H are indicated below the alignment. Point
mutations in the A3H m1 construct are shown below the alignment. b, Multiple sequence
alignment of the active APOBEC domains with known structures around loop 1 sequences,
showing that the A28 residue unique to A3H is a T/S residue at the equivalent position in other
APOBECs.
Supplementary Figure S3.3. Structural features of A3H. a, The electron density map of beta
strand 4 (β4) and 5 (β5) of the A3H monomer structure, showing P132 forming an outward-facing
bulge away from β4, breaking the already short β5. The β5 strand in the A3H dimer structure (PDB
ID: 5W3V) is shown in blue line. b, The cartoon representation of A3H monomer structure (top),
and the superimposition of A3H (green) with A3A (PDB ID: 4XXO, yellow), A3BCD2 (PDB ID:
5CQI, salmon) and AID (PDB ID: 5W0R, purple), showing the bulge at residue P132 in the A3H
monomer structure that breaks the short β5 strand. c, The two opposite ends of A3H showing different
charged features. The top panel is viewed from the Zn-center direction, and the bottom panel is
from the opposite end. The surface electrostatic potential is colored according to the calculated
electrostatic potential of the accessible surface area from −5 kT/e (red) to 5 kT/e (blue).
Supplementary Figure S3.4. The charged surface feature surrounding the zinc-coordinating
active center of the active APOBEC domains. The Zn-centers of A3H and A3A (PDB ID: 4XXO) are
surrounded by positively charged areas. The Zn-centers of the other APOBECs shown here
are surrounded by either neutral areas, in A3BCD2 (PDB ID: 5CQI), AID (PDB ID: 5W0R), and A3FCD2
(PDB ID: 3WUS), or even negatively charged areas, in A3GCD2 (PDB ID: 3IQS) and A3C (PDB
ID: 3VOW). The ribbon diagram on the right shows the general orientation of views for the surface
figures. Yellow spheres indicate zinc atom.
Supplementary Figure S3.5. Comparison of the reported A3H structures. Monomer human
A3H in this report (a, hA3H monomer, PDB ID: 5W45), protomer of dimer A3H from pig-tailed
macaque (b, pgtA3H, PDB ID: 5W3V)(142) and protomer of dimer human A3H (c, hA3H, PDB
ID: 6B0B)(143). Superimposition of the hA3H monomer with (d) a protomer of the pgtA3H dimer
and (e) a protomer of the hA3H dimer. For clarity, the loops are deleted in the overlapping
structures in d and e.
Chapter 4
Structural and Biochemical Analysis of HIV-1 Vif - A3H Interaction
Authors: Fumiaki Ito, Ana Lucia Alvarez Cabrera, Kevin Huynh, Aaron Wolfe, Xiao Xiao,
Z. Hong Zhou, and Xiaojiang S. Chen
Contributions: F.I., A.W., X.X. and X.S.C. conceived the project and designed the experiments.
F.I. performed the protein purification. F.I., A.L.A.C. and K.H. performed the initial EM screening,
cryo-EM data collection and data processing. F.I. performed the binding assay and Vif-degradation
assay. Z.H.Z. offered the access to the transmission electron microscopes. F.I. and X.S.C. wrote
the manuscript.
INTRODUCTION
The lentiviruses have evolved a set of accessory genes to evade intrinsic immune response from
the host cells and subvert their antiviral restriction factors (148, 149). Viral infectivity factor (Vif)
is encoded by all lentiviruses including HIV-1. Vif knock-out HIV-1 (HIVΔVif) is unable to
replicate in non-permissive human T-cells because its viral genome is susceptible to lethal
hypermutation caused by APOBEC3 (A3) cytidine deaminases (27-29). A3 proteins are potent
antiviral restriction factors and responsible for clearing foreign DNA by converting cytosine (C)
into uracil (U) on target single stranded DNA (ssDNA). Among seven human A3 proteins, A3D,
A3F, A3G, and A3H are known to inhibit HIV-1 ΔVif replication (119, 150). These A3 members
have the capacity to be incorporated into HIV virions and execute cytidine deamination on the ssDNA
reverse transcription product of the viral genome in susceptible T-cells. In addition to deamination-
dependent viral restriction activity, A3 proteins have additional mechanisms to restrict HIV
replication in a deamination-independent manner (151-154). During HIV infection, Vif
antagonizes the antiviral activity of A3s by compromising their steady-state level through the
ubiquitin-proteasome system. Vif hijacks the Cul5-EloB-EloC-Rbx2 E3 ubiquitin ligase from the host
cells to ubiquitinate the target A3s for subsequent proteasomal degradation (48). In this
process, Vif additionally recruits the cellular transcription factor CBF-β to form a stable complex for
A3 binding (52, 53). HIV-1 Vif is known to degrade a series of human A3 proteins, including A3C,
A3D, A3F, A3G, and A3H. Vif-A3 interaction is sorted into three types, namely A3C/D/F-type,
A3G-type, and A3H-type as they have clearly different Vif-binding surfaces. Vif is also expected
to have at least three distinctive areas for A3 binding (119, 150).
Interestingly, A3H shows varying levels of sensitivity to Vif from different HIV strains. A3H
haplotype II (hap II) is more sensitive to HIV-1 LAI Vif than NL4-3 Vif, and thus it is less
restrictive against HIV-1 LAI than against NL4-3 (101, 102, 155). The sensitivity of A3H to Vif
was attributed to a single Vif amino acid polymorphism at position 48 where LAI has histidine and
NL4-3 has asparagine (122). On the other hand, a recent study showed that the lysine residue at position 97
in A3H plays a central role in determining the sensitivity to Vif. Substituting this lysine in human
A3H with glutamine, the equivalent residue in chimpanzee A3H (which is more sensitive to
HIV Vif-mediated degradation than human A3H), increases the sensitivity to Vif (156).
The Vif-A3 protein complex has been refractory to structural studies, owing to the propensity of Vif
and A3s to form high molecular weight oligomers in cells and to associate with cellular nucleic
acids. Nevertheless, Guo et al. determined a crystal structure of a complex comprising Vif, CBF-β, and the Cul5
E3 ligase (N-terminal domain of Cul5, EloB, and EloC) (54). Currently, detailed structural
information on the Vif-A3 complex is lacking despite its importance.
In this chapter, we analyzed the molecular interaction between HIV-1 Vif and human A3H by
in vitro binding assay, in vivo Vif-mediated degradation assay, and cryo-electron microscopy (cryo-EM)
single particle analysis. The Vif-E3 ubiquitin ligase complex comprising Vif/CBF-β/EloB/EloC/Cul5
(referred to as the VCBCC complex), a hetero-pentamer, was purified to homogeneity.
VCBCC formed a stable complex with both monomeric and dimeric forms of A3H in vitro. A cell-based
Vif-mediated degradation assay showed that α-helix 3 (α3) and 4 (α4) of A3H were
primarily responsible for Vif binding, with α3 being a central interface. Furthermore, clustered
lysine residues around loop 3 were likely to be a part of ubiquitination sites for Vif-mediated
degradation in A3H. Our preliminary cryo-EM 3D reconstruction of VCBCC/A3H complex
allowed us to fit atomic structures of VCBCC and A3H to the density. Taken together, these results
provide valuable insights into the crucial virus-host interaction for future development of
therapeutics.
RESULTS
Purification of VCBCC complex
Vif is an intrinsically disordered protein and notoriously insoluble by itself. Co-expression of
Vif from HIV-1 NL4-3, human CBF-β, EloB, and EloC yielded a highly soluble complex (VCBC
complex), presumably because CBF-β functions as a chaperone of Vif. The N-terminal domain of Cul5
(nCul5) was purified independently and mixed with VCBC complex to form VCBCC complex.
Disordered regions were excluded from the constructs based on the VCBCC crystal structure
(Figure 4.1a) (54). VCBCC complex was purified as RNA-bound species (A260/280 = 0.9-1.1)
and RNA-free species (A260/280 = 0.56) (Figure 4.1b, c). RNA-free VCBCC complex was
separated for the complex formation with A3H. RNA-free VCBCC complex was highly
monodispersed and homogeneous in negative-stain electron microscopy (NS-EM). NS-EM images
allowed us to identify a population of characteristic U-shaped particles with a typical
particle diameter of 8-12 nm, which is consistent with VCBCC complex crystal structure (Figure
4.1d).
Binding of VCBCC complex to A3H
In vitro binding of the VCBCC complex to A3H was first tested using purified proteins. For
better binding to A3H, a point mutation N48H was introduced to Vif from HIV NL4-3 strain as
this mutant is reported to degrade A3H more efficiently than Vif with N48 residue (122). For A3H,
the effect of a single point mutation K97Q, which was recently reported to enhance the sensitivity
to Vif, was examined. When mixed with VCBCC complex containing N48H Vif, K97Q MBP-
A3H dimer showed prominent peak shift in size-exclusion chromatography (SEC) while WT
MBP-A3H dimer only showed minor shift (Figure 4.2a, b). Vif-mediated degradation assay
confirmed that N48H Vif degraded both WT and K97Q A3H more efficiently than WT Vif and
that K97Q A3H was more sensitive to N48H Vif than WT A3H (Figure 4.2c). To understand which
oligomeric state of A3H is targeted by Vif for degradation, we isolated dimer and monomer
fractions of A3H during the purification, both of which are associated with RNA, and tested for
VCBCC binding. By using N48H Vif and K97Q A3H, both dimeric and monomeric forms of A3H
showed prominent peak shift when combined with VCBCC complex (Figure 4.3). These results
suggest that previously identified polymorphisms in both Vif and A3H that showed enhanced
degradation of A3H directly contributed to higher binding affinity between the two proteins in
vitro, and that Vif can target both dimeric and monomeric forms of A3H as long as they are bound to RNA.
Vif binding site in A3H
In addition to K97 in the middle of α-helix 3 (α3) in human A3H, previous studies demonstrated
that a polymorphic residue at position 121 in α-helix 4 (α4), where hap II has aspartate and hap I
has lysine, was also crucial to the sensitivity to Vif-mediated degradation. D121K on A3H hap II
renders the protein more resistant to Vif (32, 122). Therefore, the A3H surface around α3 and α4
is likely to be responsible for binding to Vif. To gain more detailed insight into the Vif-A3H interface,
we tested the impact of A3H surface amino acid residues on sensitivity to Vif. Accordingly, S86,
W90, D94, and D100 on α3, and D121, L125, and S129 on α4 were examined (Figure 4.4a). These
residues were replaced with alanine or either of the charged residues (aspartate or arginine) on
the K97Q enhanced Vif-sensitivity construct. The FLAG-tagged A3H mutants were co-transfected
with Vif (N48H) into 293T cells and their protein levels were examined at 48 hours post-transfection.
The results showed that the α3 mutants W90A and D94A became highly resistant to Vif. Other
mutants on S86, L125, and S129 showed partial resistance when they had alanine or either of the
charged residue substitutions (Figure 4.4b). In contrast to previous reports, D121K did not show a noticeable
difference in the sensitivity to Vif under our conditions, possibly because the backbone construct used
here has K97Q, which enhances the sensitivity to Vif compared to WT A3H. These results confirm
that α3 and α4 of A3H are primarily responsible for Vif binding and that residues W90 and D94 on α3,
in addition to K97, play key roles in Vif binding.
Ubiquitination site in A3H
When substrate proteins for proteasomal degradation are ubiquitinated, they commonly accept
ubiquitin molecules through amide bond between ε-amino group of one or multiple lysine residues
on the surface of the substrate proteins and C-terminal carboxyl group of ubiquitin. A3H, as well
as some other A3s such as A3G, undergoes ubiquitination when Vif is present in cells (131). While
the major ubiquitination site for A3G has been attributed to four clustered lysines in C-terminal
domain (A3GCD2) (157), the ubiquitination site for A3H is currently unknown. Identifying the
surface ubiquitination site in A3H would help us to correctly orient Cul5 ubiquitin ligase complex
in A3H-Vif-E3 ligase complex, as ubiquitination site of the substrate protein needs to be within a
range where E2 ubiquitin conjugating enzyme can transfer ubiquitin. Human A3H has 12 lysine
residues, all of which are exposed to the protein surface. Among 12 lysines, K97 is the only one
that is located within the identified Vif-binding interface. As enhanced Vif-sensitivity construct
K97Q A3H was used as default, all the other 11 lysines were examined. These 11 lysines were
sorted into three groups according to their location on the A3H crystal structure, namely, region 1
(R1): K16/K153/K161/K168, region 2 (R2): K27/K50/K51/K52/K64, and region 3 (R3):
K117/K174/K181. To identify lysines that are ubiquitinated by Vif, three lysine-deficient region
mutants were prepared by substituting all the lysines in the same region with arginines, which
would maintain the positive charge, but block ubiquitination (Figure 4.5a). Lysine-free mutant
A3H, in which all 11 lysines were substituted with arginines, was also prepared (11KR). As
controls, A3G and its Vif-mediated ubiquitination-deficient mutant
(K297R/K301R/K303R/K334R) were also included. The lysine deficient region mutants were
fused with HA-tag at their N-terminus, as HA-tag has no lysine in its sequence. The A3H mutants
were co-transfected with increasing amounts of N48H Vif into 293T cells and their protein levels
were tested at 48 hours post-transfection. The results showed that, among the three region mutants,
only R2 mutant, in which lysines around flexible loop 3 were mutated, showed partially increased
resistance to Vif-mediated degradation. The other two region mutants did not show noticeable
change in the sensitivity to Vif (Figure 4.5b). Of note, the protein of lysine-free mutant (11KR)
had consistently low steady-state level, possibly due to lowered protein stability by mutations
introduced in this construct. Taken together, while lysines within region 2 appeared to partially
contribute to Vif-mediated ubiquitination, we were not able to see complete resistance to Vif-
mediated degradation in any of the region mutants. We noticed that the expression level of Vif tends
to be higher when it is co-transfected with low expression level A3H constructs (i.e., 11KR),
suggesting that Vif itself may also be partially degraded when it degrades A3H. These observations
suggest that the ubiquitination sites may be dispersed across the A3H surface, or that an additional
mechanism for Vif-mediated downregulation of A3H may exist.
Cryo-EM analysis of VCBCC/A3H complex
To understand the structural basis for Vif-A3H interaction, we approached VCBCC/A3H
protein complex by X-ray crystallography and cryo-EM single particle analysis. Despite extensive
effort, all the crystallization trials for this protein complex were unsuccessful. This is presumably
because Vif-A3H interaction is primarily driven by salt-sensitive polar interaction, which may not
be stable in most of the crystal screening conditions. On the other hand, cryo-EM analysis allowed
us to directly use the solution samples from the size-exclusion chromatography fractions to observe
the near-native state structure. First, VCBCC/MBP-A3H dimer complex was analyzed by cryo-
EM. The particles of VCBCC/MBP-A3H dimer complex were mono-dispersed, but with certain
degrees of heterogeneity in size in cryo-EM micrographs, possibly due to partial dissociation of
the complex during the vitrification of the specimen (Figure 4.6a). 2D classification of ~300,000
particles from 3205 cryo-EM micrographs showed assemblies of several globular and rod-shaped
densities with average dimensions of 12-16 nm by 6-12 nm (Figure 4.6b). Notably, many classes
showed smearing fuzzy densities in the outside edges, indicating that there are several intrinsically
flexible regions in the complex that are less well aligned during the 2D class averaging. With these
2D classes, reference-free ab initio 3D reconstruction was performed and the initial 3D map was
built to ~15 Å resolution. The reconstructed initial 3D model could coarsely accommodate one
molecule of VCBCC complex and one molecule of A3H dimer crystal structures and one protomer
of A3H can be located next to Vif (Figure 4.6c). During the 3D reconstruction, there was an
obvious loss of density from the 2D classes. The lost density is likely to be attributed to the
smearing density in the 2D classes and therefore, the flexible regions were not well reflected in
the 3D density and prevented us from reaching high resolution reconstruction. The two MBP-tags
attached to the dimeric A3H are the probable sources of the flexibility, as flexibly tethered domains
are generally difficult to reconstruct (158, 159). Indeed, MBP-less complex of VCBCC and A3H
monomer showed more rigid densities in their 2D average classes (Figure 4.7a, b). The preliminary
structural information obtained by cryo-EM provided the overall architecture of the entire Vif-
A3H complex, yet higher resolution 3D model is still pending.
DISCUSSION
In this chapter, we analyzed the molecular interaction between HIV-1 Vif and A3H by in vitro
binding assay, in vivo Vif-mediated degradation assay, and cryo-electron microscopy (cryo-EM)
single particle analysis. Co-expression of Vif, CBF-β, EloB, and EloC produced a highly soluble
hetero-tetrameric complex and adding N-terminal domain of Cul5 yielded both RNA-bound and
RNA-free forms of hetero-pentameric complex (VCBCC). NS-EM analysis of RNA-free VCBCC
complex showed a characteristic U-shaped architecture, which is consistent with the reported
crystal structure (54).
Both Vif and A3H continue to evolve in the host-virus “arms race” during HIV infection. Among
seven human A3H haplotypes, only hap II, V, and VII are stably expressed and have antiviral
activity against HIV-1 (101, 103, 127). Furthermore, comparison of A3H from human and
chimpanzee regarding the sensitivity to HIV Vif has led to the identification of an amino acid
residue at position 97 as a critical determinant for the sensitivity to Vif (156). The stable human
A3H haplotypes are effectively degraded by Vif from LAI isolate, but not by Vif from NL4-3. This
difference was attributed to the polymorphism at position 48 in Vif (122, 155). By introducing
these reported polymorphic mutations to both Vif and A3H that contribute to enhanced degradation
of A3H in vivo, VCBCC complex containing N48H NL4-3 Vif predominantly formed a complex
with K97Q human A3H. Therefore, there is a positive correlation between the capacity of Vif to
degrade A3H and binding affinity between the two proteins. Our in vitro binding assay further
revealed that Vif can target both dimeric and monomeric forms of A3H. Yet, in vivo situation is
expected to be more complicated as A3H stays as high molecular weight ribonucleoprotein
complex that is enzymatically inactive as reported in chapter 3. How oligomerization state and
catalytic activity of A3H are regulated during the HIV restriction is currently unknown.
In search for the Vif-binding surface in A3H, multiple residues in α3 and α4 were identified.
Among the identified Vif-binding residues, W90 and D94, in addition to K97, on α3 are playing
the central roles in Vif binding. W90 is unusually protruding towards the outside surface (Figure
4.4a) and thus Vif might have evolved to take advantage of this recognizable feature in A3H. Other
A3H residues S86, L125, and S129 were also participating in Vif-binding. These results were
mostly consistent with the recently reported Vif binding site in A3H with minor discrepancy (156,
160).
In addition to the Vif binding interface, we explored the Vif-mediated ubiquitination site in
A3H. Identification of the ubiquitination site, together with the Vif binding site, would allow us to
speculate the relative spatial positions of A3H in A3H-Vif-Cul5 ubiquitin ligase complex since the
ubiquitination site would face the E2 ubiquitin conjugating enzyme for ubiquitin transfer. Our
investigation on surface lysine residues showed that, while disruption of lysines within region 2
around flexible loop 3 partially interfered with Vif-mediated degradation, none of our lysine
deficient region mutants obtained complete resistance to Vif. Note that the disruption of previously
identified Vif-mediated ubiquitination site in A3G (A3G 4KR) also resulted in the partial resistance
to Vif in our condition (Figure 4.5b). These observations suggest that the ubiquitination sites may be
dispersed across the A3H surface rather than clustered in a specific area, or that an additional mechanism
for Vif-mediated downregulation may also exist.
Despite our extensive efforts, all the crystallization trials for the VCBCC/A3H protein complex have
been unsuccessful. The probable reason is that the protein complex may become unstable in most
of the crystal screening conditions and dissociate into sub-complexes as Vif-A3H interaction is
expected to be driven by salt-sensitive polar interaction. The disassembled complex may result in
lower homogeneity of the sample and thus the intact protein complex would have less chance to
crystallize. By cryo-EM single particle analysis, we were able to observe intact protein complexes
in vitreous specimen and obtained about 15 Å 3D reconstruction of VCBCC/A3H dimer complex.
The 3D model allowed us to fit atomic structures of VCBCC complex and A3H dimer, and one
protomer of A3H can be placed next to Vif. Although our sample had N-terminal MBP-tag on A3H,
we were not able to fit MBP into the density, which may explain the loss of density during 3D
reconstruction from the 2D classes. 3D model of the VCBCC/A3H dimer suggests that only one
protomer in A3H dimer was bound to VCBCC, even though both protomers in A3H dimer could
be available for VCBCC binding (Figure 4.8). There may be a steric hindrance that did not allow
two VCBCC molecules to bind to each protomer of A3H dimer.
In summary, we obtained biochemical and structural insights into the interaction between HIV-
1 Vif and human A3H. Although a higher-resolution 3D model is still pending, these results
facilitate the understanding of Vif-mediated counteraction of A3 proteins and provide valuable
insights for future development of therapeutics.
MATERIALS AND METHODS
Plasmids
Vif from HIV-1 pNL4-3 (residues 1-176) with His6-tag at N-terminus, human EloB (residues
1-102), human EloC (residues 17-112), and human CBF-β (residues 1-156) were cloned into each
90
of the four polycistronic cassettes in pST39 co-expression vector (161). N-terminal domain of
human Cul5 (nCul5, residues 12-320 or 12-386) was cloned into pGEX-6P-1 to express a fusion
protein with GST at N-terminus. Human A3H haplotype II was cloned into pMAL-c5X to express
a fusion protein with MBP at N-terminus. For mammalian cell expression, human A3H haplotype
II and A3G were cloned into pcDNA3.1(+) mammalian expression vector with FLAG-tag or HA-
tag at N-terminus. Codon-optimized Vif from HIV-1 pNL4-3 was also cloned into pcDNA3.1(+)
without tag. Cloning and mutagenesis were performed with In-Fusion cloning and PrimeSTAR
mutagenesis (Clontech) by following manufacturer’s instruction. The sequences of all the
constructs were verified by DNA sequencing (Genewiz).
Protein expression and purification
The His6-Vif/EloB/EloC/CBF-β-pST39, nCul5-pGEX-6P-1, and A3H-pMAL-c5X expression
vectors were transformed into the E. coli strains BL21(DE3), XA90, and C43(DE3)pLysS,
respectively. The E. coli cells harboring the expression vectors were grown in LB medium at 37°C
until the OD600 reached 0.6. Expression of the recombinant proteins was induced with 0.3 mM
isopropyl β-D-1-thiogalactopyranoside (IPTG) at 16°C for 18 hours.
For the VCBC complex, the cell pellets were resuspended in buffer A (20 mM Tris-HCl, pH 8.0,
250 mM NaCl, and 0.5 mM TCEP) containing RNase A (0.1 mg/ml). The cells were lysed by
sonication and cellular debris was removed by centrifugation. The supernatant containing the
His6-VCBC complex was loaded onto a Ni-NTA agarose column (QIAGEN). The nickel column was
extensively washed with wash buffer (20 mM Tris-HCl, pH 8.0, 50 mM imidazole, 250 mM NaCl,
and 0.5 mM TCEP) and the protein was eluted with elution buffer (20 mM Tris-HCl, pH 8.0, 500
mM imidazole, 250 mM NaCl, and 0.5 mM TCEP). The His6-tag was cleaved by incubating with
PreScission protease overnight. The VCBC complex was then subjected to size-exclusion
chromatography (SEC, Superdex 200, GE Healthcare) equilibrated with buffer A. The peak
fractions were collected and concentrated for VCBCC complex preparation.
For nCul5, the cell pellets were resuspended in buffer B (20 mM Tris-HCl, pH 8.0, 500 mM
NaCl, and 0.5 mM TCEP) and lysed by sonication, and cellular debris was removed by centrifugation.
The supernatant containing GST-nCul5 was loaded onto a glutathione sepharose column (GE
Healthcare). The glutathione column was extensively washed with buffer B and the GST-tag was cleaved
on the column by incubating with PreScission protease overnight. nCul5 was eluted from the
glutathione column and subjected to SEC (Superdex 75, GE Healthcare) equilibrated with buffer
B. The peak fractions were collected and concentrated for VCBCC complex preparation.
For MBP-A3H, the cell pellets were resuspended in buffer C (25 mM HEPES-NaOH, pH
7.5, 500 mM NaCl, and 0.5 mM TCEP) containing RNase A (0.1 mg/ml) and lysed by sonication, and
cellular debris was removed by centrifugation. The supernatant containing MBP-A3H was loaded
onto an amylose column (New England Biolabs). The amylose column was extensively washed with
wash buffer (25 mM HEPES-NaOH, pH 7.5, 1 M NaCl, and 0.5 mM TCEP) and the protein was
eluted with elution buffer (25 mM HEPES-NaOH, pH 7.5, 500 mM NaCl, 40 mM D-maltose,
and 0.5 mM TCEP). Eluted fractions were concentrated and subjected to SEC (Superdex 200)
equilibrated with buffer C. Peak fractions of dimeric and monomeric MBP-A3H were separated
and concentrated.
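As an illustration of how buffers A, B, and C could be prepared from concentrated stocks, the following minimal sketch applies the dilution relation C1·V1 = C2·V2 to the final concentrations given above. The stock concentrations and the function name `stock_volumes_ml` are hypothetical assumptions chosen for the example; they are not taken from the protocol, and pH adjustment is not modeled.

```python
# Final concentrations (mM) as stated for buffers A, B, and C in the text.
BUFFERS_MM = {
    "A": {"Tris-HCl pH 8.0": 20.0, "NaCl": 250.0, "TCEP": 0.5},
    "B": {"Tris-HCl pH 8.0": 20.0, "NaCl": 500.0, "TCEP": 0.5},
    "C": {"HEPES-NaOH pH 7.5": 25.0, "NaCl": 500.0, "TCEP": 0.5},
}

# Hypothetical stock concentrations (mM). These are assumptions for
# illustration only and do not come from the protocol.
STOCKS_MM = {
    "Tris-HCl pH 8.0": 1000.0,
    "HEPES-NaOH pH 7.5": 1000.0,
    "NaCl": 5000.0,
    "TCEP": 500.0,
}


def stock_volumes_ml(buffer_name, final_volume_ml):
    """Return the volume (ml) of each stock and of water, using C1*V1 = C2*V2."""
    recipe = {
        component: final_volume_ml * final_mm / STOCKS_MM[component]
        for component, final_mm in BUFFERS_MM[buffer_name].items()
    }
    # Water makes up the remainder of the final volume.
    recipe["water"] = final_volume_ml - sum(recipe.values())
    return recipe


if __name__ == "__main__":
    for name in BUFFERS_MM:
        print(name, stock_volumes_ml(name, 1000.0))
```

With these assumed stocks, one liter of buffer A would take 20 ml of Tris-HCl stock, 50 ml of NaCl stock, and 1 ml of TCEP stock, with water making up the rest.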
The VCBCC complex was formed by mixing the VCBC complex and nCul5 at a 1:1 molar ratio and
incubating on ice for 30 min. The protein mixture was subjected to SEC (Superdex 200) and the
complex fractions were separated.
The VCBCC/MBP-A3H complex was formed by mixing the VCBCC complex and MBP-A3H at a 1:1
molar ratio and incubating on ice for 30 min. The protein mixture was subjected to SEC (Superdex
200) and the complex fractions were separated.
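The 1:1 molar mixing step can be sketched as a small calculation that converts mass concentrations to molar concentrations and returns the volume of the second component needed to match the first. This is only an illustrative sketch: the function names are hypothetical, and the concentrations and molecular weights in the example call are placeholder values, not measurements from this work.

```python
def molar_conc_um(conc_mg_per_ml, mw_kda):
    """Convert a mass concentration (mg/ml) to micromolar.

    mg/ml divided by kDa gives mM, so multiply by 1000 for micromolar.
    """
    return conc_mg_per_ml / mw_kda * 1000.0


def equimolar_volume_ul(vol_a_ul, conc_a_mg_per_ml, mw_a_kda,
                        conc_b_mg_per_ml, mw_b_kda):
    """Volume (µl) of component B containing the same moles as vol_a_ul of A."""
    pmol_a = molar_conc_um(conc_a_mg_per_ml, mw_a_kda) * vol_a_ul  # µM * µl = pmol
    return pmol_a / molar_conc_um(conc_b_mg_per_ml, mw_b_kda)


if __name__ == "__main__":
    # Placeholder inputs (assumptions): 100 µl of A at 2 mg/ml and 50 kDa,
    # mixed with B at 1 mg/ml and 100 kDa.
    print(equimolar_volume_ul(100.0, 2.0, 50.0, 1.0, 100.0))
```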
Cryo-EM data acquisition
3 µl aliquots of purified VCBCC/A3H complex at 0.1-0.15 mg/ml were applied to glow-discharged
Quantifoil 1.2/1.3 holey carbon copper 300-mesh grids. Grids were then blotted with
filter paper and vitrified in liquid ethane using a Vitrobot Mark IV (FEI). Vitrified EM grids were
screened in a TF20 transmission electron microscope (FEI) to optimize the freezing conditions.
Higher-resolution cryo-EM images were collected on a Titan Krios (FEI) operated at 300 kV and a
nominal magnification of 130,000× with a K2 Summit direct electron detector (Gatan Inc.) in
counting mode, equipped with an energy filter (Gatan Inc.). Sixty frames were recorded for each
movie at a pixel size of 1.07 Å, with 200 ms exposure time. 3205 movies were recorded for the
VCBCC/MBP-A3H dimer complex, and 1935 movies were recorded for the VCBCC/A3H monomer
complex.
Cryo-EM data processing
Frames in each movie were aligned and motion-corrected using MotionCor2 (162). The contrast
transfer function (CTF) was estimated for each motion-corrected micrograph using CTFFIND4
(163). Initially, particles were manually picked to generate templates for automated particle
picking in RELION2.0. In total, ~300,000 particles were extracted for the VCBCC/MBP-A3H dimer
complex, and ~230,000 particles were extracted for the VCBCC/A3H monomer complex.
Reference-free two-dimensional (2D) classification was performed in RELION2.0. For the VCBCC/MBP-A3H
dimer complex, the generated 2D classes were then used for ab initio 3D reconstruction in CryoSPARC
(164), yielding an initial 3D model at ~15 Å.
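A few quantities follow directly from the acquisition and processing numbers above and can be checked with the short sketch below: the Nyquist limit set by the pixel size (twice the pixel size) and the average number of extracted particles per movie. The particle counts are the approximate values quoted in the text. The total exposure per movie is computed under the assumption that the stated 200 ms applies to each frame, which the text does not state explicitly. All function names are hypothetical.

```python
PIXEL_SIZE_A = 1.07      # Å per pixel, from the text
FRAMES_PER_MOVIE = 60    # from the text
EXPOSURE_MS = 200        # assumption: interpreted here as exposure per frame

# Movie and (approximate) particle counts from the text.
DATASETS = {
    "VCBCC/MBP-A3H dimer": {"movies": 3205, "particles": 300_000},
    "VCBCC/A3H monomer": {"movies": 1935, "particles": 230_000},
}


def nyquist_limit_a(pixel_size_a):
    """Best resolution (Å) representable at a given pixel size."""
    return 2.0 * pixel_size_a


def particles_per_movie(movies, particles):
    """Average number of extracted particles per movie."""
    return particles / movies


def total_exposure_s(frames, exposure_ms_per_frame):
    """Total exposure per movie (s), if the exposure time is per frame."""
    return frames * exposure_ms_per_frame / 1000.0


if __name__ == "__main__":
    print(f"Nyquist limit: {nyquist_limit_a(PIXEL_SIZE_A):.2f} Å")
    print(f"Exposure per movie: {total_exposure_s(FRAMES_PER_MOVIE, EXPOSURE_MS):.1f} s")
    for name, d in DATASETS.items():
        avg = particles_per_movie(d["movies"], d["particles"])
        print(f"{name}: {avg:.0f} particles per movie")
```

The Nyquist limit of 2.14 Å is far finer than the ~15 Å reconstruction, so the pixel size was not the factor limiting resolution; the dimer and monomer datasets average roughly 94 and 119 particles per movie, respectively.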
Vif degradation assay of A3H
pcDNA-FLAG-A3H mutants were co-transfected with either pcDNA-Vif or pcDNA3.1(+)
empty vector into HEK293T cells (ATCC) in 12-well plates using X-tremeGENE 9 DNA
Transfection Reagent (Roche). At 48 hours post-transfection, the cells were washed once with PBS
and lysed in RIPA buffer (Sigma) with 1× complete protease inhibitors (Roche). The lysates were
then subjected to Western blot with mouse anti-FLAG M2 mAb (Sigma, 1:3,000), mouse
anti-tubulin mAb (GeneTex, 1:5,000), and mouse anti-Vif mAb (NIH AIDS Reagent
Program #319, 1:2,000) as primary antibodies. Cy3-labelled goat anti-mouse mAb (GE
Healthcare) was used to detect the signal. A Typhoon RGB Biomolecular Imager (GE Healthcare)
was used to visualize the images.
Figure 4.1. Constructs, purification, and negative-stain EM image of VCBCC complex. a,
Constructs of VCBCC complex. Unstructured regions were deleted from the constructs based on
the VCBCC crystal structure (54). Cul5 has a molecular hinge between the N-terminal domain and the
C-terminal domain, and only the N-terminal domain was used to minimize conformational flexibility.
For the N-terminal domain of Cul5, two versions were prepared, nCul5(12-386) (long) and
nCul5(12-320) (short), because Guo et al. (54) prepared a long Cul5 that was cleaved around residue 320
by elastase treatment during their purification. b, Superdex 200 SEC profile of VCBCC complex containing
nCul5(12-386). VCBCC complex was purified as an RNA-bound form (peak 1) and an RNA-free form (peak
2). c, SDS-PAGE and urea denaturing RNA gel of the two VCBCC complex peaks in b. d,
Negative-stain EM image of VCBCC complex containing nCul5(12-320). Particles were stained with
uranyl acetate. The characteristic U-shaped particles with a diameter of 8-12 nm were observed.
The scale bar is 80 nm.
Figure 4.2. Binding of VCBCC complex to A3H. a, WT A3H binds to VCBCC inefficiently.
VCBCC (N48H Vif) complex and MBP-A3H_WT were mixed and the relative molecular weight
was analyzed on a Superdex 200 SEC column. The protein mixture did not show a prominent peak
shift compared to the individual proteins. SDS-PAGE of the SEC fractions of VCBCC+MBP-A3H,
VCBCC alone, and MBP-A3H alone is shown on the right. b, K97Q A3H binds to VCBCC
efficiently. VCBCC (N48H Vif) complex and MBP-A3H_K97Q were mixed and the relative
molecular weight was analyzed on a Superdex 200 SEC column. The protein mixture showed a
prominent peak shift. SDS-PAGE of the SEC fractions of VCBCC+MBP-A3H_K97Q and
MBP-A3H_K97Q alone is shown on the right. c, Vif-mediated degradation assay of A3H. Vif and A3H
were co-transfected into 293T cells and the A3H protein level was tested. N48H Vif degraded both
WT and K97Q A3H more efficiently than WT Vif. K97Q A3H was more sensitive to N48H Vif
than WT A3H.
Figure 4.3. Binding of VCBCC complex to A3H dimer and monomer. VCBCC (N48H Vif)
complex was mixed with either MBP-A3H_K97Q dimer or monomer, and the relative molecular
weight was analyzed on a Superdex 200 SEC column. The mixtures containing dimer and monomer
both showed a prominent peak shift compared to the individual proteins. SDS-PAGE of the SEC fractions
of VCBCC+MBP-A3H dimer, VCBCC alone, MBP-A3H dimer alone, VCBCC+MBP-A3H
monomer, and MBP-A3H monomer alone is shown on the right.
Figure 4.4. Vif binding site in A3H. a, Vif binding interface mapped in ribbon and surface
representations of A3H (PDB ID: 5W45). Vif-binding residues are mostly in α3 and α4. W90S and
S129E in the crystal structure were modelled back to wild-type W90 and S129, respectively. The zinc
atom is shown as a sphere. b, Vif-mediated degradation assay of A3H mutants. A3H surface residues
on α3 and α4 were mutated to alanine or to residues with charged side chains (aspartate or arginine).
Vif (N48H) and A3H mutants were co-transfected into 293T cells and the A3H protein level was
tested. Mutations at W90 and D94 rendered A3H highly resistant to Vif. Mutations at S86, L125,
and S129 rendered A3H partially resistant to Vif.
Figure 4.5. Identification of target lysine residues in A3H for Vif-mediated ubiquitination. a,
Surface lysines are mapped in A3H (PDB ID: 5W45). 11 surface lysines were sorted into three
groups based on their location: region 1 (R1): K16/K153/K161/K168, region 2 (R2):
K27/K50/K51/K52/K64, and region 3 (R3): K117/K174/K181. All the lysines in each region were
replaced with arginines to test their roles in Vif-mediated ubiquitination. b, Vif-mediated
degradation assay of lysine-deficient region mutants. Increasing amounts of Vif (N48H) and A3H
mutants were co-transfected into 293T cells and the A3H protein level was tested. Only R2
(K27R/K50R/K51R/K52R/K64R) showed partial resistance to Vif. R1 and R3 showed no obvious
change. A3G and A3G 4KR (K297R/K301R/K303R/K334R) were included as controls. A3G 4KR
also showed only partial resistance to Vif.
Figure 4.6. Cryo-EM analysis of VCBCC/MBP-A3H dimer complex. a, Representative
micrograph of VCBCC/MBP-A3H dimer complex. The scale bar is 100 nm. b, Representative 2D
classes from ~300,000 particles of VCBCC/MBP-A3H dimer complex. c, 3D reconstruction of
VCBCC/MBP-A3H dimer complex at ~15 Å and coarse fitting of the crystal structures of the VCBCC
complex (PDB ID: 4N9F) (54) and the A3H dimer (PDB ID: 5W3V) (142). Some densities were lost
when the 3D volume was reconstructed from the 2D classes, indicating that there are intrinsically flexible
regions in the complex.
Figure 4.7. Cryo-EM analysis of VCBCC/A3H monomer complex. a, Representative
micrograph of VCBCC/A3H monomer complex. The scale bar is 80 nm. b, Representative 2D
classes from ~230,000 particles of VCBCC/A3H monomer complex.
Figure 4.8. Schematic of interaction between VCBCC complex and A3H. Both RNA-bound
monomeric and dimeric forms of A3H bound to VCBCC complex in vitro. The 3D reconstruction of
the VCBCC/MBP-A3H dimer suggests that the dominant species in the complex population has only
one VCBCC molecule, bound to either protomer of the A3H dimer.
References
1. Jarmuz A, et al. (2002) An anthropoid-specific locus of orphan C to U RNA-editing
enzymes on chromosome 22. Genomics 79(3):285-296.
2. LaRue RS, et al. (2009) Guidelines for naming nonprimate APOBEC3 genes and proteins.
J Virol 83(2):494-497.
3. Conticello SG, Thomas CJ, Petersen-Mahrt SK, & Neuberger MS (2005) Evolution of the
AID/APOBEC family of polynucleotide (deoxy)cytidine deaminases. Mol Biol Evol
22(2):367-377.
4. LaRue RS, et al. (2008) The artiodactyl APOBEC3 innate immune repertoire shows
evidence for a multi-functional domain organization that existed in the ancestor of placental
mammals. BMC Mol Biol 9:104.
5. OhAinle M, Kerns JA, Malik HS, & Emerman M (2006) Adaptive evolution and antiviral
activity of the conserved mammalian cytidine deaminase APOBEC3H. J Virol 80(8):3853-
3862.
6. Conticello SG (2008) The AID/APOBEC family of nucleic acid mutators. Genome Biol
9(6):229.
7. Navarro F, et al. (2005) Complementary function of the two catalytic domains of
APOBEC3G. Virology 333(2):374-386.
8. Fu Y, et al. (2015) DNA cytosine and methylcytosine deamination by APOBEC3B:
enhancing methylcytosine deamination by engineering APOBEC3B. Biochem J 471(1):25-
35.
9. Navaratnam N, et al. (1993) The p27 catalytic subunit of the apolipoprotein B mRNA
editing enzyme is a cytidine deaminase. J Biol Chem 268(28):20709-20712.
10. Prochnow C, Bransteitter R, Klein MG, Goodman MF, & Chen XS (2007) The APOBEC-
2 crystal structure and functional implications for the deaminase AID. Nature
445(7126):447-451.
11. Lada AG, et al. (2011) Mutator effects and mutation signatures of editing deaminases
produced in bacteria and yeast. Biochemistry (Mosc) 76(1):131-146.
12. Harris RS, Petersen-Mahrt SK, & Neuberger MS (2002) RNA editing enzyme APOBEC1
and some of its homologs can act as DNA mutators. Mol Cell 10(5):1247-1253.
13. Sato Y, et al. (2010) Deficiency in APOBEC2 leads to a shift in muscle fiber type,
diminished body mass, and myopathy. J Biol Chem 285(10):7111-7118.
14. Honjo T, Muramatsu M, & Fagarasan S (2004) AID: how does it aid antibody diversity?
Immunity 20(6):659-668.
15. Bransteitter R, Pham P, Calabrese P, & Goodman MF (2004) Biochemical analysis of
hypermutational targeting by wild type and mutant activation-induced cytidine deaminase.
J Biol Chem 279(49):51612-51621.
16. Chaudhuri J, et al. (2003) Transcription-targeted DNA deamination by the AID antibody
diversification enzyme. Nature 422(6933):726-730.
17. Ta VT, et al. (2003) AID mutant analyses indicate requirement for class-switch-specific
cofactors. Nat Immunol 4(9):843-848.
18. Shinkura R, et al. (2004) Separate domains of AID are required for somatic hypermutation
and class-switch recombination. Nat Immunol 5(7):707-712.
19. Revy P, et al. (2000) Activation-induced cytidine deaminase (AID) deficiency causes the
autosomal recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102(5):565-575.
20. Chiu YL & Greene WC (2008) The APOBEC3 cytidine deaminases: an innate defensive
network opposing exogenous retroviruses and endogenous retroelements. Annu Rev
Immunol 26:317-353.
21. Duggal NK & Emerman M (2012) Evolutionary conflicts between viruses and restriction
factors shape immunity. Nat Rev Immunol 12(10):687-695.
22. Love RP, Xu H, & Chelico L (2012) Biochemical analysis of hypermutation by the
deoxycytidine deaminase APOBEC3A. J Biol Chem 287(36):30812-30822.
23. Suspene R, et al. (2011) Somatic hypermutation of human mitochondrial and nuclear DNA
by APOBEC3 cytidine deaminases, a pathway for DNA catabolism. Proc Natl Acad Sci U
S A 108(12):4858-4863.
24. Stenglein MD, Burns MB, Li M, Lengyel J, & Harris RS (2010) APOBEC3 proteins
mediate the clearance of foreign DNA from human cells. Nat Struct Mol Biol 17(2):222-
229.
25. Harris RS & Liddament MT (2004) Retroviral restriction by APOBEC proteins. Nat Rev
Immunol 4(11):868-877.
26. Refsland EW & Harris RS (2013) The APOBEC3 family of retroelement restriction factors.
Curr Top Microbiol Immunol 371:1-27.
27. Sheehy AM, Gaddis NC, Choi JD, & Malim MH (2002) Isolation of a human gene that
inhibits HIV-1 infection and is suppressed by the viral Vif protein. Nature 418(6898):646-
650.
28. Mangeat B, et al. (2003) Broad antiretroviral defence by human APOBEC3G through
lethal editing of nascent reverse transcripts. Nature 424(6944):99-103.
29. Harris RS, et al. (2003) DNA deamination mediates innate immunity to retroviral infection.
Cell 113(6):803-809.
30. Russell RA, Smith J, Barr R, Bhattacharyya D, & Pathak VK (2009) Distinct domains
within APOBEC3G and APOBEC3F interact with separate regions of human
immunodeficiency virus type 1 Vif. J Virol 83(4):1992-2003.
31. Smith JL & Pathak VK (2010) Identification of specific determinants of human
APOBEC3F, APOBEC3C, and APOBEC3DE and African green monkey APOBEC3F that
interact with HIV-1 Vif. J Virol 84(24):12599-12608.
32. Zhen A, Wang T, Zhao K, Xiong Y, & Yu XF (2010) A single amino acid difference in
human APOBEC3H variants determines HIV-1 Vif sensitivity. J Virol 84(4):1902-1911.
33. Yu Q, et al. (2004) APOBEC3B and APOBEC3C are potent inhibitors of simian
immunodeficiency virus replication. J Biol Chem 279(51):53379-53386.
34. Vartanian JP, Guetard D, Henry M, & Wain-Hobson S (2008) Evidence for editing of
human papillomavirus DNA by APOBEC3 in benign and precancerous lesions. Science
320(5873):230-233.
35. Lucifora J, et al. (2014) Specific and nonhepatotoxic degradation of nuclear hepatitis B
virus cccDNA. Science 343(6176):1221-1228.
36. Chen H, et al. (2006) APOBEC3A is a potent inhibitor of adeno-associated virus and
retrotransposons. Curr Biol 16(5):480-485.
37. Bogerd HP, et al. (2006) Cellular inhibitors of long interspersed element 1 and Alu
retrotransposition. Proc Natl Acad Sci U S A 103(23):8780-8785.
38. Muckenfuss H, et al. (2006) APOBEC3 proteins inhibit human LINE-1 retrotransposition.
J Biol Chem 281(31):22161-22172.
39. Peng G, et al. (2007) Myeloid differentiation and susceptibility to HIV-1 are linked to
APOBEC3 expression. Blood 110(1):393-400.
40. Landry S, Narvaiza I, Linfesty DC, & Weitzman MD (2011) APOBEC3A can activate the
DNA damage response and cause cell-cycle arrest. EMBO Rep 12(5):444-450.
41. Koning FA, Goujon C, Bauby H, & Malim MH (2011) Target cell-mediated editing of HIV-
1 cDNA by APOBEC3 proteins in human macrophages. J Virol 85(24):13448-13452.
42. Pham P, Landolph A, Mendez C, Li N, & Goodman MF (2013) A biochemical analysis
linking APOBEC3A to disparate HIV-1 restriction and skin cancer. J Biol Chem
288(41):29294-29304.
43. Narvaiza I, et al. (2009) Deaminase-independent inhibition of parvoviruses by the
APOBEC3A cytidine deaminase. PLoS Pathog 5(5):e1000439.
44. Chen K, et al. (2006) Alpha interferon potently enhances the anti-human
immunodeficiency virus type 1 activity of APOBEC3G in resting primary CD4 T cells. J
Virol 80(15):7645-7657.
45. Gabuzda DH, et al. (1992) Role of vif in replication of human immunodeficiency virus
type 1 in CD4+ T lymphocytes. J Virol 66(11):6489-6495.
46. von Schwedler U, Song J, Aiken C, & Trono D (1993) Vif is crucial for human
immunodeficiency virus type 1 proviral DNA synthesis in infected cells. J Virol
67(8):4945-4955.
47. Marin M, Rose KM, Kozak SL, & Kabat D (2003) HIV-1 Vif protein binds the editing
enzyme APOBEC3G and induces its degradation. Nat Med 9(11):1398-1403.
48. Yu X, et al. (2003) Induction of APOBEC3G ubiquitination and degradation by an HIV-1
Vif-Cul5-SCF complex. Science 302(5647):1056-1060.
49. Sheehy AM, Gaddis NC, & Malim MH (2003) The antiretroviral enzyme APOBEC3G is
degraded by the proteasome in response to HIV-1 Vif. Nat Med 9(11):1404-1407.
50. Lecossier D, Bouchonnet F, Clavel F, & Hance AJ (2003) Hypermutation of HIV-1 DNA
in the absence of the Vif protein. Science 300(5622):1112.
51. Zhang H, et al. (2003) The cytidine deaminase CEM15 induces hypermutation in newly
synthesized HIV-1 DNA. Nature 424(6944):94-98.
52. Jager S, et al. (2012) Vif hijacks CBF-beta to degrade APOBEC3G and promote HIV-1
infection. Nature 481(7381):371-375.
53. Zhang W, Du J, Evans SL, Yu Y, & Yu XF (2012) T-cell differentiation factor CBF-beta
regulates HIV-1 Vif-mediated evasion of host restriction. Nature 481(7381):376-379.
54. Guo Y, et al. (2014) Structural basis for hijacking CBF-beta and CUL5 E3 ligase complex
by HIV-1 Vif. Nature 505(7482):229-233.
55. Miyakawa K, et al. (2015) ASK1 restores the antiviral activity of APOBEC3G by
disrupting HIV-1 Vif-mediated counteraction. Nat Commun 6:6945.
56. Schule S, et al. (2009) Restriction of HIV-1 replication in monocytes is abolished by Vpx
of SIVsmmPBj. PLoS One 4(9):e7098.
57. Berger A, et al. (2010) Interaction of Vpx and apolipoprotein B mRNA-editing catalytic
polypeptide 3 family member A (APOBEC3A) correlates with efficient lentivirus infection
of monocytes. J Biol Chem 285(16):12248-12254.
58. Laguette N, et al. (2011) SAMHD1 is the dendritic- and myeloid-cell-specific HIV-1
restriction factor counteracted by Vpx. Nature 474(7353):654-657.
59. Hrecka K, et al. (2011) Vpx relieves inhibition of HIV-1 infection of macrophages
mediated by the SAMHD1 protein. Nature 474(7353):658-661.
60. Chougui G, et al. (2018) HIV-2/SIV viral protein X counteracts HUSH repressor complex.
Nat Microbiol 3(8):891-897.
61. Yurkovetskiy L, et al. (2018) Primate immunodeficiency virus proteins Vpx and Vpr
counteract transcriptional repression of proviruses by the HUSH complex. Nat Microbiol
3(12):1354-1361.
62. Berger G, et al. (2011) APOBEC3A is a specific inhibitor of the early phases of HIV-1
infection in myeloid cells. PLoS Pathog 7(9):e1002221.
63. Goldstone DC, et al. (2011) HIV-1 restriction factor SAMHD1 is a deoxynucleoside
triphosphate triphosphohydrolase. Nature 480(7377):379-382.
64. Schwefel D, et al. (2014) Structural basis of lentiviral subversion of a cellular protein
degradation pathway. Nature 505(7482):234-238.
65. Elgin SC & Reuter G (2013) Position-effect variegation, heterochromatin formation, and
gene silencing in Drosophila. Cold Spring Harb Perspect Biol 5(8):a017780.
66. Bouhamdan M, et al. (1996) Human immunodeficiency virus type 1 Vpr protein binds to
the uracil DNA glycosylase DNA repair enzyme. J Virol 70(2):697-704.
67. Ahn J, et al. (2010) HIV-1 Vpr loads uracil DNA glycosylase-2 onto DCAF1, a substrate
recognition subunit of a cullin 4A-ring E3 ubiquitin ligase for proteasome-dependent
degradation. J Biol Chem 285(48):37333-37341.
68. Laguette N, et al. (2014) Premature activation of the SLX4 complex by Vpr promotes
G2/M arrest and escape from innate immune sensing. Cell 156(1-2):134-145.
69. Romani B, Shaykh Baygloo N, Aghasadeghi MR, & Allahbakhshi E (2015) HIV-1 Vpr
Protein Enhances Proteasomal Degradation of MCM10 DNA Replication Factor through
the Cul4-DDB1[VprBP] E3 Ubiquitin Ligase to Induce G2/M Cell Cycle Arrest. J Biol
Chem 290(28):17380-17389.
70. Burns MB, et al. (2013) APOBEC3B is an enzymatic source of mutation in breast cancer.
Nature 494(7437):366-370.
71. Burns MB, Temiz NA, & Harris RS (2013) Evidence for APOBEC3B mutagenesis in
multiple human cancers. Nat Genet 45(9):977-983.
72. Kuong KJ & Loeb LA (2013) APOBEC3B mutagenesis in cancer. Nat Genet 45(9):964-
965.
73. Roberts SA, et al. (2013) An APOBEC cytidine deaminase mutagenesis pattern is
widespread in human cancers. Nat Genet 45(9):970-976.
74. Alexandrov LB, et al. (2013) Signatures of mutational processes in human cancer. Nature
500(7463):415-421.
75. Gwak M, Choi YJ, Yoo NJ, & Lee S (2014) Expression of DNA cytosine deaminase
APOBEC3 proteins, a potential source for producing mutations, in gastric, colorectal and
prostate cancers. Tumori 100(4):112e-117e.
76. Swanton C, McGranahan N, Starrett GJ, & Harris RS (2015) APOBEC Enzymes:
Mutagenic Fuel for Cancer Evolution and Heterogeneity. Cancer Discov 5(7):704-712.
77. Starrett GJ, et al. (2016) The DNA cytosine deaminase APOBEC3H haplotype I likely
contributes to breast and lung cancer mutagenesis. Nat Commun 7:12918.
78. Nik-Zainal S, et al. (2014) Association of a germline copy number polymorphism of
APOBEC3A and APOBEC3B with burden of putative APOBEC-dependent mutations in
breast cancer. Nat Genet 46(5):487-491.
79. Sievers F, et al. (2011) Fast, scalable generation of high-quality protein multiple sequence
alignments using Clustal Omega. Mol Syst Biol 7:539.
80. Robert X & Gouet P (2014) Deciphering key features in protein structures with the new
ENDscript server. Nucleic Acids Res 42(Web Server issue):W320-324.
81. Morgan HD, Dean W, Coker HA, Reik W, & Petersen-Mahrt SK (2004) Activation-induced
cytidine deaminase deaminates 5-methylcytosine in DNA and is expressed in pluripotent
tissues: implications for epigenetic reprogramming. J Biol Chem 279(50):52353-52360.
82. Bransteitter R, Pham P, Scharff MD, & Goodman MF (2003) Activation-induced cytidine
deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of
RNase. Proc Natl Acad Sci U S A 100(7):4102-4107.
83. Larijani M, et al. (2005) Methylation protects cytidines from AID-mediated deamination.
Mol Immunol 42(5):599-604.
84. Franchini DM, Schmitz KM, & Petersen-Mahrt SK (2012) 5-Methylcytosine DNA
demethylation: more than losing a methyl group. Annu Rev Genet 46:419-441.
85. Kohli RM & Zhang Y (2013) TET enzymes, TDG and the dynamics of DNA demethylation.
Nature 502(7472):472-479.
86. Reik W, Dean W, & Walter J (2001) Epigenetic reprogramming in mammalian
development. Science 293(5532):1089-1093.
87. Jaenisch R & Bird A (2003) Epigenetic regulation of gene expression: how the genome
integrates intrinsic and environmental signals. Nat Genet 33 Suppl:245-254.
88. Li E (2002) Chromatin modification and epigenetic reprogramming in mammalian
development. Nat Rev Genet 3(9):662-673.
89. Surani MA (2001) Reprogramming of genome function through epigenetic inheritance.
Nature 414(6859):122-128.
90. Popp C, et al. (2010) Genome-wide erasure of DNA methylation in mouse primordial germ
cells is affected by AID deficiency. Nature 463(7284):1101-1105.
91. Bhutani N, et al. (2010) Reprogramming towards pluripotency requires AID-dependent
DNA demethylation. Nature 463(7284):1042-1047.
92. Kumar R, et al. (2013) AID stabilizes stem-cell phenotype by removing epigenetic memory
of pluripotency genes. Nature 500(7460):89-92.
93. Carpenter MA, et al. (2012) Methylcytosine and normal cytosine deamination by the
foreign DNA restriction enzyme APOBEC3A. J Biol Chem 287(41):34801-34808.
94. Wijesinghe P & Bhagwat AS (2012) Efficient deamination of 5-methylcytosines in DNA
by human APOBEC3A, but not by AID or APOBEC3G. Nucleic Acids Res 40(18):9206-
9217.
95. Suspene R, Aynaud MM, Vartanian JP, & Wain-Hobson S (2013) Efficient deamination of
5-methylcytidine and 5-substituted cytidine residues in DNA by human APOBEC3A
cytidine deaminase. PLoS One 8(6):e63461.
96. Nabel CS, et al. (2012) AID/APOBEC deaminases disfavor modified cytosines implicated
in DNA demethylation. Nat Chem Biol 8(9):751-758.
97. Fritz EL, et al. (2013) A comprehensive analysis of the effects of the deaminase AID on
the transcriptome and methylome of activated B cells. Nat Immunol 14(7):749-755.
98. Holden LG, et al. (2008) Crystal structure of the anti-viral APOBEC3G catalytic domain
and functional implications. Nature 456(7218):121-124.
99. Byeon IJ, et al. (2013) NMR structure of human restriction factor APOBEC3A reveals
substrate binding and enzyme specificity. Nat Commun 4:1890.
100. Chen KM, et al. (2008) Structure of the DNA deaminase domain of the HIV-1 restriction
factor APOBEC3G. Nature 452(7183):116-119.
101. OhAinle M, Kerns JA, Li MM, Malik HS, & Emerman M (2008) Antiretroelement activity
of APOBEC3H was lost twice in recent human evolution. Cell Host Microbe 4(3):249-259.
102. Harari A, Ooms M, Mulder LC, & Simon V (2009) Polymorphisms and splice variants
influence the antiretroviral activity of human APOBEC3H. J Virol 83(1):295-303.
103. Wang X, et al. (2011) Analysis of human APOBEC3H haplotypes and anti-human
immunodeficiency virus type 1 activity. J Virol 85(7):3142-3152.
104. Gu J, et al. (2016) Biochemical Characterization of APOBEC3H Variants: Implications for
Their HIV-1 Restriction Activity and mC Modification. J Mol Biol 428(23):4626-4638.
105. Yu Q, et al. (2004) Single-strand specificity of APOBEC3G accounts for minus-strand
deamination of the HIV genome. Nat Struct Mol Biol 11(5):435-442.
106. Chelico L, Pham P, Calabrese P, & Goodman MF (2006) APOBEC3G DNA deaminase
acts processively 3' --> 5' on single-stranded DNA. Nat Struct Mol Biol 13(5):392-399.
107. Bulliard Y, et al. (2011) Structure-function analyses point to a polynucleotide-
accommodating groove essential for APOBEC3A restriction activities. J Virol 85(4):1765-
1776.
108. Mitra M, et al. (2014) Structural determinants of human APOBEC3A enzymatic and
nucleic acid binding properties. Nucleic Acids Res 42(2):1095-1110.
109. Harjes S, et al. (2013) Impact of H216 on the DNA binding and catalytic activities of the
HIV restriction factor APOBEC3G. J Virol 87(12):7008-7014.
110. Yoo J & Medina-Franco JL (2011) Homology modeling, docking and structure-based
pharmacophore of inhibitors of DNA methyltransferase. J Comput Aided Mol Des
25(6):555-567.
111. Teh AH, et al. (2006) The 1.48 A resolution crystal structure of the homotetrameric cytidine
deaminase from mouse. Biochemistry 45(25):7825-7833.
112. Bohn MF, et al. (2015) The ssDNA Mutator APOBEC3A Is Regulated by Cooperative
Dimerization. Structure 23(5):903-911.
113. Shandilya SM, et al. (2010) Crystal structure of the APOBEC3G catalytic domain reveals
potential oligomerization interfaces. Structure 18(1):28-38.
114. Kitamura S, et al. (2012) The APOBEC3C crystal structure and the interface for HIV-1 Vif
binding. Nat Struct Mol Biol 19(10):1005-1010.
115. Siu KK, Sultana A, Azimi FC, & Lee JE (2013) Structural determinants of HIV-1 Vif
susceptibility and DNA binding in APOBEC3F. Nat Commun 4:2593.
116. Bohn MF, et al. (2013) Crystal structure of the DNA cytosine deaminase APOBEC3F: the
catalytically active and HIV-1 Vif-binding domain. Structure 21(6):1042-1050.
117. Harris RS & Dudley JP (2015) APOBECs and virus restriction. Virology 479-480:131-145.
118. Tan L, Sarkis PT, Wang T, Tian C, & Yu XF (2009) Sole copy of Z2-type human cytidine
deaminase APOBEC3H has inhibitory activity against retrotransposons and HIV-1. FASEB
J 23(1):279-287.
119. Desimmie BA, et al. (2014) Multiple APOBEC3 restriction factors for HIV-1 and one Vif
to rule them all. J Mol Biol 426(6):1220-1245.
120. Malim MH & Bieniasz PD (2012) HIV Restriction Factors and Mechanisms of Evasion.
Cold Spring Harb Perspect Med 2(5):a006940.
121. Feng Y, Goubran MH, Follack TB, & Chelico L (2017) Deamination-independent
restriction of LINE-1 retrotransposition by APOBEC3H. Sci Rep 7(1):10881.
122. Ooms M, Letko M, Binka M, & Simon V (2013) The resistance of human APOBEC3H to
HIV-1 NL4-3 molecular clone is determined by a single amino acid in Vif. PLoS One
8(2):e57744.
123. Li MM & Emerman M (2011) Polymorphism in human APOBEC3H affects a phenotype
dominant for subcellular localization and antiviral activity. J Virol 85(16):8197-8207.
124. Zhen A, Du J, Zhou X, Xiong Y, & Yu XF (2012) Reduced APOBEC3H variant anti-viral
activities are associated with altered RNA binding activities. PLoS One 7(7):e38771.
125. Mitra M, et al. (2015) Sequence and structural determinants of human APOBEC3H
deaminase and anti-HIV-1 activities. Retrovirology 12:3.
126. Ooms M, Majdak S, Seibert CW, Harari A, & Simon V (2010) The localization of
APOBEC3H variants in HIV-1 virions determines their antiviral activity. J Virol
84(16):7961-7969.
127. Refsland EW, et al. (2014) Natural polymorphisms in human APOBEC3H and HIV-1 Vif
combine in primary T lymphocytes to affect viral G-to-A mutation levels and infectivity.
PLoS Genet 10(11):e1004761.
128. Feng Y, et al. (2015) Natural Polymorphisms and Oligomerization of Human APOBEC3H
Contribute to Single-stranded DNA Scanning Ability. J Biol Chem 290(45):27188-27203.
129. Dang Y, et al. (2008) Human cytidine deaminase APOBEC3H restricts HIV-1 replication.
J Biol Chem 283(17):11606-11614.
130. Li J, et al. (2014) APOBEC3 multimerization correlates with HIV-1 packaging and
restriction activity in living cells. J Mol Biol 426(6):1296-1307.
131. Baig TT, Feng Y, & Chelico L (2014) Determinants of efficient degradation of APOBEC3
restriction factors by HIV-1 Vif. J Virol 88(24):14380-14395.
132. Chiu YL & Greene WC (2006) APOBEC3 cytidine deaminases: distinct antiviral actions
along the retroviral life cycle. J Biol Chem 281(13):8309-8312.
133. Apolonia L, et al. (2015) Promiscuous RNA binding ensures effective encapsidation of
APOBEC3 proteins by HIV-1. PLoS Pathog 11(1):e1004609.
134. York A, Kutluay SB, Errando M, & Bieniasz PD (2016) The RNA Binding Specificity of
Human APOBEC3 Proteins Resembles That of HIV-1 Nucleocapsid. PLoS Pathog
12(8):e1005833.
135. Wang T, Tian C, Zhang W, Sarkis PT, & Yu XF (2008) Interaction with 7SL RNA but not
with HIV-1 genomic RNA or P bodies is required for APOBEC3F virion packaging. J Mol
Biol 375(4):1098-1112.
136. Chen Q, Xiao X, Wolfe A, & Chen XS (2016) The in vitro Biochemical Characterization
of an HIV-1 Restriction Factor APOBEC3F: Importance of Loop 7 on Both CD1 and CD2
for DNA Binding and Deamination. J Mol Biol 428(13):2661-2670.
137. Qiao Q, et al. (2017) AID Recognizes Structured DNA for Class Switch Recombination.
Mol Cell 67(3):361-373 e364.
138. Xiao X, Li SX, Yang H, & Chen XS (2016) Crystal structures of APOBEC3G N-domain
alone and its complex with DNA. Nat Commun 7:12193.
139. Xiao X, et al. (2017) Structural determinants of APOBEC3B non-catalytic domain for
molecular assembly and catalytic regulation. Nucleic Acids Res 45(12):7540.
140. Shi K, Carpenter MA, Kurahashi K, Harris RS, & Aihara H (2015) Crystal Structure of the
DNA Deaminase APOBEC3B Catalytic Domain. J Biol Chem 290(47):28120-28130.
141. Lee S, et al. (2017) Hydrogen bonds are a primary driving force for de novo protein folding.
Acta Crystallogr D Struct Biol 73(Pt 12):955-969.
142. Bohn JA, et al. (2017) APOBEC3H structure reveals an unusual mechanism of interaction
with duplex RNA. Nat Commun 8(1):1021.
143. Shaban NM, et al. (2018) The Antiviral and Cancer Genomic DNA Deaminase
APOBEC3H Is Regulated by an RNA-Mediated Dimerization Mechanism. Mol Cell
69(1):75-86 e79.
144. Fang Y, Xiao X, Li SX, Wolfe A, & Chen XS (2018) Molecular Interactions of a DNA
Modifying Enzyme APOBEC3F Catalytic Domain with a Single-Stranded DNA. J Mol
Biol 430(1):87-101.
145. Carpenter MA, Rajagurubandara E, Wijesinghe P, & Bhagwat AS (2010) Determinants of
sequence-specificity within human AID and APOBEC3G. DNA Repair (Amst) 9(5):579-587.
146. Rathore A, et al. (2013) The local dinucleotide preference of APOBEC3G can be altered
from 5'-CC to 5'-TC by a single amino acid substitution. J Mol Biol 425(22):4442-4454.
147. Matsuoka T, et al. (2018) Structural basis of chimpanzee APOBEC3H dimerization
stabilized by double-stranded RNA. Nucleic Acids Res 46(19):10368-10379.
148. Malim MH & Emerman M (2008) HIV-1 accessory proteins--ensuring viral survival in a
hostile environment. Cell Host Microbe 3(6):388-398.
149. Strebel K (2013) HIV accessory proteins versus host restriction factors. Curr Opin Virol
3(6):692-699.
150. Aydin H, Taylor MW, & Lee JE (2014) Structure-guided analysis of the human APOBEC3-HIV
restrictome. Structure 22(5):668-684.
151. Iwatani Y, et al. (2007) Deaminase-independent inhibition of HIV-1 reverse transcription
by APOBEC3G. Nucleic Acids Res 35(21):7096-7108.
152. Belanger K, Savoie M, Rosales Gerpe MC, Couture JF, & Langlois MA (2013) Binding of
RNA by APOBEC3G controls deamination-independent restriction of retroviruses. Nucleic
Acids Res 41(15):7438-7452.
153. Morse M, et al. (2017) Dimerization regulates both deaminase-dependent and
deaminase-independent HIV-1 restriction by APOBEC3G. Nat Commun 8(1):597.
154. Pollpeter D, et al. (2018) Deep sequencing of HIV-1 reverse transcripts reveals the
multifaceted antiviral functions of APOBEC3G. Nat Microbiol 3(2):220-233.
155. Li MM, Wu LI, & Emerman M (2010) The range of human APOBEC3H sensitivity to
lentiviral Vif proteins. J Virol 84(1):88-95.
156. Nakashima M, et al. (2017) Mapping Region of Human Restriction Factor APOBEC3H
Critical for Interaction with HIV-1 Vif. J Mol Biol 429(8):1262-1276.
157. Iwatani Y, et al. (2009) HIV-1 Vif-mediated ubiquitination/degradation of APOBEC3G
involves four critical lysine residues in its C-terminal domain. Proc Natl Acad Sci U S A
106(46):19539-19544.
158. Chittori S, et al. (2018) Structural mechanisms of centromeric nucleosome recognition by
the kinetochore protein CENP-N. Science 359(6373):339-343.
159. Alvarez FJD, et al. (2017) CryoEM structure of MxB reveals a novel oligomerization
interface critical for HIV restriction. Sci Adv 3(9):e1701264.
160. Ooms M, Letko M, & Simon V (2017) The Structural Interface between HIV-1 Vif and
Human APOBEC3H. J Virol 91(5).
161. Tan S, Kern RC, & Selleck W (2005) The pST44 polycistronic expression system for
producing protein complexes in Escherichia coli. Protein Expr Purif 40(2):385-395.
162. Zheng SQ, et al. (2017) MotionCor2: anisotropic correction of beam-induced motion for
improved cryo-electron microscopy. Nat Methods 14(4):331-332.
163. Rohou A & Grigorieff N (2015) CTFFIND4: Fast and accurate defocus estimation from
electron micrographs. J Struct Biol 192(2):216-221.
164. Punjani A, Rubinstein JL, Fleet DJ, & Brubaker MA (2017) cryoSPARC: algorithms for
rapid unsupervised cryo-EM structure determination. Nat Methods 14(3):290-296.
Asset Metadata
Creator
Ito, Fumiaki (author)
Core Title
Structural and biochemical analyses on substrate specificity and HIV-1 Vif mediated inhibition of human APOBEC3 cytidine deaminases
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Molecular Biology
Publication Date
04/24/2021
Defense Date
03/18/2019
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
APOBEC,DNA editing,HIV,OAI-PMH Harvest,structural biology
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Chen, Xiaojiang (committee chair), Chen, Lin (committee member), Cherezov, Vadim (committee member), Pratt, Matthew (committee member)
Creator Email
fumi.ito.0630@gmail.com,fumiakii@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-145881
Unique identifier
UC11660442
Identifier
etd-ItoFumiaki-7252.pdf (filename),usctheses-c89-145881 (legacy record id)
Legacy Identifier
etd-ItoFumiaki-7252.pdf
Dmrecord
145881
Document Type
Dissertation
Rights
Ito, Fumiaki
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA