AID scanning & catalysis and the generation of high-affinity antibodies

AID SCANNING & CATALYSIS AND THE GENERATION OF
HIGH-AFFINITY ANTIBODIES
by
Hongyu Zhang
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MOLECULAR BIOLOGY)
May 2024
Copyright 2024 Hongyu Zhang



Acknowledgments
I thank Dr. Myron Goodman, for giving me the opportunity to work with him. The
Goodman lab has provided me with great tools and guidelines for pursuing my scientific
research. With Dr. Goodman’s advice and mentorship, I transformed from a naïve college
biology graduate to a true scientist who can work independently and think critically.
I thank Dr. Phuong Pham for his guidelines and helpful contribution to my projects. Both
of my projects were established on Dr. Pham’s previous research. His previous study builds a
strong scientific foundation for my projects. Dr. Pham has always been a great teacher to me,
showing me how to do different experiments and handle different equipment in the lab.
I thank my previous lab members, Dr. Soo Lim Jeong and Dr. Adhirath Sikand. Dr. Jeong
and I collaborated effectively, ensuring the success of the project. Dr. Sikand was always a great
friend to me in the lab and after his graduation, providing me with excellent advice and help on
different career opportunities.
I thank other members of Goodman Lab: Dr. Malgorzata Jaszczur, Dr. Debika Ojha,
Megan Cherry-Rockward, and Runtian Jiang, for creating such a peaceful and supportive
working environment.
I thank Dr. Chi Mak for his contribution to the AID scanning and catalysis project. His
commitment to the project gave me belief and hope whenever I wanted to give up. Dr. Mak was
so smart and patient that he would explain super-complex physical and mathematical ideas to me
in layman’s terms. I hope Dr. Mak can recover soon and stay healthy.
I thank my committee members: Dr. David McKemy, Dr. Lin Chen, and Dr. Michael
Lieber for their time, feedback, and guidance on my projects and dissertation.



I thank my mother, Ping Luo, father, Peng Zhang, and my brothers and sister, Xinyu
Zhang, Shiyu Zhang, Xiaoliang Zhang, and Xiaole Zhang for their unconditional love and
support.
Lastly, I thank my wife, Dan Ma, who has provided me tremendous support in my life
and study. The journey for pursuing a Ph.D. is hard. Dan provides me with a home where I can
recover and get back together when facing any challenges.



Table of Contents
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1. Introduction
1.1 AID in AID/APOBEC family
1.2 Role of AID in antibody generation
1.3 Biochemical properties of AID
1.4 Application of AID
1.5 Conclusions
Chapter 2. Adjacent Nucleotides Modulate AID's Deamination Activity on ssDNA
2.1 Introduction
2.2 Results
2.2.1 AID shows similar activity on USER enzyme (Uracil DNA glycosylase and Endonuclease VIII)-treated ssDNA and gapped-DNA
2.2.2 AID scanning and catalysis are decoupled on homogeneous motifs
2.2.3 AID’s activity depends on neighboring nucleotides
2.2.4 The neighboring nucleotides show differing effects on different trinucleotide motifs
2.3 Discussion
2.4 Materials and methods
ssDNA and AID preparation
Deamination reaction
NGS library preparation for Maximum-depth sequencing
NGS with Miniseq machine
NGS data analysis
Chapter 3. In Vitro Affinity Maturation of Camelid Nanobodies Targeting FAAH
3.1 Introduction
3.2 Results
3.2.1 AID and Pol η preserved their activities and favored mutation in CDRs
3.2.2 Affinity-matured VHH library showed more avid binders to FAAH
3.2.3 Single clone affinity improvement via AID and Pol η affinity maturation
3.2.4 Affinity improvement with two rounds of affinity maturation
3.2.5 Affinity-matured VHH showed an inhibitory effect in the mouse model
3.3 Discussion
3.4 Materials and methods
Materials
Human FAAH purification
Generation of naïve VHH phage library
Phage library purification
Affinity maturation via gapped-DNA method
Affinity maturation using the ssDNA method
Phage production with helper phage
Nanobody selection via bio-panning through immunotubes
Nanobody selection via bio-panning through Biacore Biosensor
Nanobody screening using phage ELISA
Expression and purification of VHH nanobodies
Affinity measurement via Surface plasmon resonance
Cold plantar assay on mice
Computational analysis
Chapter 4. Conclusions
References
Supplemental Figures and Tables
Appendix



List of Tables
Table 3.1 Kinetic and equilibrium dissociation constants of purified VHHs targeting FAAH
Table 3.2 Kinetic and equilibrium dissociation constants of purified C6, F4, and 69 targeting FAAH
Table S1.1 Primers and ssDNA ordered for AID reaction and NGS library preparation



List of Figures
Figure 1 AID's role in antibody diversification.
Figure 2.1 Overview of MDS strategy.
Figure 2.2 Comparison of background noise among different DNA constructs.
Figure 2.3 Mutation pattern comparison between gapped and USER-treated AGC homogenous motif construct.
Figure 2.4 Two-point correlation analysis of all selected homogenous motifs.
Figure 2.5 Comparison between clones with more than one deamination.
Figure 2.6 Mutation pattern of AGCTTT.
Figure 2.7 The deamination rate with different 15 nt motifs at 5 min AID incubation.
Figure 3.1 Structure of antibody and antibody fragments.
Figure 3.2 Overview of our in vitro nanobody generation process.
Figure 3.3 Naïve and affinity-matured VHH phage recovery rate.
Figure 3.4 Phage screening for a naïve VHH library.
Figure 3.5 Phage screening for A3 affinity-matured VHH library.
Figure 3.6 Clones with unique sequences after A3 affinity-matured VHH library selection and screening.
Figure 3.7 Clones with continuous affinity maturation.
Figure S3.1 AID and Pol η mutation spectrum on IGHV3-23*01 region.
Figure S3.2 The DNA sequence of 5 selected clones is different from A3.
Figure S3.3 The DNA sequence of C6, F4 and 69.
Figure S3.4 Inflammatory cold allodynia is inhibited by both LM52-VHH and Mab 1085.
Figure S3.5 Docking scores between VHHs and FAAH.



Abstract
Activation-induced deoxycytidine deaminase (AID) plays a crucial role in the human
immune system by initiating somatic hypermutation (SHM) and class switch recombination
(CSR). In 2003, the Goodman lab showed that the substrate for AID catalysis is single-stranded
(ss)DNA, and that AID scans ssDNA processively. The lab has since studied the coupling between
AID’s scanning and catalysis, focusing on sequence context effects. Using AID’s deamination
footprints, we described AID’s activity through a mathematical model validated with
experimental data. This model implies that AID’s scanning and catalysis are coupled, except on
sequences with homogenous motifs. In this study, combining ssDNA substrates with
Next Generation Sequencing (NGS), we confirmed the model’s prediction of
catalysis-decoupled scanning by AID on ssDNA with homogenous motifs. We also demonstrated how
neighboring sequences influence AID’s motif preferences.
AID’s involvement in the immune system is essential for generating high-affinity
antibodies (Abs). AID and error-prone DNA polymerase eta (Pol η) are upregulated upon
antigen (Ag) exposure, enabling the genetic diversification required for high-affinity Ab
production. We used AID and Pol η to synthesize high-affinity Abs biochemically in a test tube.
By treating native llama nanobody (VHH) genes with purified AID and Pol η, we generated
diversified VHH libraries. Via phage display, we selected VHHs with high affinity for fatty
acid amide hydrolase (FAAH) and characterized their binding affinity. Our successful in vitro
antibody diversification process highlights the potential of AID and paves a new way for
antibody generation.



Chapter 1. Introduction
1.1 AID in AID/APOBEC family
AID is a member of the AID/APOBEC protein family, whose members can generate mutations in
DNA and RNA through their ability to deaminate cytosine (C) to uracil (U). Because of their
mutagenic properties, proteins in the AID/APOBEC family play a critical role in both
adaptive and innate immunity in vertebrates [1]. Deamination of both DNA and RNA
molecules has been associated with the intrinsic response to viral infection, macrophage
diversification through transcription, and antibody generation [2-4]. The first protein identified
in the AID/APOBEC family was APOBEC1, which is involved in editing
apolipoprotein B (ApoB) mRNA [5]. By introducing a stop codon through its deamination activity,
APOBEC1 can lower the expression of ApoB [6]. Since then, additional members of the family
have been discovered and characterized. APOBEC2 plays a critical role in transcriptional
regulation during myoblast differentiation by interacting with promoter regions of DNA, and it
has been implicated in gene expression related to muscle differentiation [7]. The APOBEC3
protein family is known for its role in the intrinsic response to viral infection. Numerous
studies document the role of APOBEC3 proteins in immunity to parvoviruses,
herpesviruses, and hepatitis B virus [8-10]. There is also increasing evidence of the
antiretroviral activity of APOBEC3 proteins against HIV-1 [11]. In humans, there are seven
APOBEC3 proteins encoded in a gene cluster arranged in tandem [12]. Upon viral infection,
APOBEC3 proteins can induce deaminations on the viral genome, disrupting the virus’s ability to
replicate effectively [13]. APOBEC4 is a newly identified member of the AID/APOBEC family and
has not yet been fully characterized. One recent study indicates an antiviral role for APOBEC4
in chicken, showing inhibitory effects on Newcastle disease virus [14]. AID is highly expressed
in activated B cells responsible for the adaptive immune system in vertebrates [15]. By
deaminating immunoglobulin (Ig) genes, AID initiates the processes of antibody diversification
and isotype switching [16].
There are a total of 11 AID/APOBEC genes present in the human genome. AID and
APOBEC1 are located on chromosome 12. APOBEC2 is located on chromosome 6. The 7
APOBEC3 genes are located on chromosome 22. APOBEC4 is located on chromosome 1 [17].
Although located at different sites of the genome and exerting different cellular functions, each
member of this family shares at least one zinc-dependent deaminase (ZDD) domain containing a
consensus sequence of H-X-E-X(23-28)-P-C-X(2-4)-C (X = any amino acid) [18]. The zinc-dependent
deamination is initiated when the histidine and cysteine residues bind the zinc divalent
cation at the active site to form a catalytic pocket. The C bound to this pocket is then
deaminated through nucleophilic attack, with proton shuttling controlled by the glutamic acid
residue [19]. Another shared feature among AID/APOBEC proteins is their ability to bind
ssDNA/RNA. The shallow grooves on the surface of the proteins contain clusters of positively
charged (basic) and aromatic (hydrophobic) residues. These regions interact with the negatively
charged nucleic acid backbone, facilitating alignment and stacking with the bases of the nucleic
acid [20].
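The ZDD consensus sequence can be written as a regular expression, which gives a convenient way to scan a protein sequence for a candidate motif. The following is a minimal sketch, assuming one-letter amino acid codes. The function name and the toy sequence are hypothetical and are not taken from any real AID/APOBEC protein.

```python
import re

# H-X-E-X(23-28)-P-C-X(2-4)-C, with X = any amino acid (one-letter code).
ZDD_PATTERN = re.compile(r"H.E.{23,28}PC.{2,4}C")

def find_zdd_motif(protein):
    """Return (start, end) of the first ZDD consensus match, or None."""
    match = ZDD_PATTERN.search(protein.upper())
    return (match.start(), match.end()) if match else None

# Hypothetical toy sequence built only to satisfy the consensus.
toy = "MK" + "HAE" + "G" * 25 + "PC" + "AAA" + "C" + "LL"
print(find_zdd_motif(toy))
```

A match only indicates that the spacing of the H, E, P, and C residues fits the consensus. It says nothing about whether the protein folds into an active deaminase.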
During a deamination reaction, AID/APOBEC proteins show distinct activity and
preference for different ssDNA/RNA substrates. The catalytic efficiency of various
AID/APOBEC proteins is determined by their ability to form both homogenous and
heterogeneous complexes [17]. APOBEC1’s catalytic activity is associated with RNA-binding
cofactors including A1CF and RBM47 [21]. AID mutants exhibit a dominant negative phenotype,
suggesting that AID is a dimer [22]. Structurally, AID has been observed both as a monomer and
as an oligomer [23,24]. APOBEC2 is found as a monomer in solution [25]. APOBEC3B, APOBEC3D,
APOBEC3F, APOBEC3G, and APOBEC3H show a preference for multimerization; however,
APOBEC3A and APOBEC3C exist as monomers [26]. Besides differing in their molecular form,
AID/APOBEC proteins show different motif preferences on ssDNA/RNA. For example,
APOBEC1 shows the highest activity on ssDNA within a TC motif [27]. AID preferentially acts at
WRC motifs (W = A/T; R = A/G) both in vivo and in vitro [19,23]. APOBEC3A prefers YYC
(Y = C/T) motifs and APOBEC3G prefers the CCC motif [28,29].
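The motif preferences listed above can be illustrated by locating candidate target cytosines in a sequence. The sketch below is a minimal example, assuming the motifs are written 5' to 3' with the target C last and that only the degenerate codes W, R, and Y are needed. The function name and the test substrate are hypothetical.

```python
import re

# IUPAC degenerate codes used in the text: W = A/T, R = A/G, Y = C/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "Y": "[CT]"}

def target_c_positions(seq, motif):
    """Return 0-based indices of the C at the 3' end of each motif match."""
    motif = motif.upper()
    if not motif.endswith("C"):
        raise ValueError("motif must end with the target C")
    pattern = "".join(IUPAC[base] for base in motif)
    # A lookahead lets overlapping motifs (e.g. CCC within CCCC) all be reported.
    return [m.start() + len(motif) - 1
            for m in re.finditer(f"(?={pattern})", seq.upper())]

seq = "AGCTTTAACGGC"  # hypothetical ssDNA substrate
print(target_c_positions(seq, "WRC"))  # candidate AID sites
print(target_c_positions(seq, "TC"))   # candidate APOBEC1 sites
```

This only marks where a preferred motif occurs. It does not model relative deamination rates or the influence of nucleotides outside the motif.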
1.2 Role of AID in antibody generation
Antibodies (Abs), which are generated by B lymphocytes and are capable of recognizing foreign
antigens (Ags), are a crucial tool of vertebrate immune systems. Naïve B cells undergo
processes known as somatic hypermutation (SHM) and class-switch recombination (CSR) to
enhance the ability of Abs to identify Ags and to change their functional capabilities [30].
In SHM, upon Ag exposure, the cell introduces point mutations into the variable region of the
Ig genes to produce variants with increased affinity [31]. In CSR, the expressed Ig
heavy-chain constant-region (CH) gene is switched from Cμ to another downstream CH gene.
SHM and CSR result in the production of high-affinity Abs along with secondary isotypes for
different effector functions [32].
AID was discovered in 1999, when T. Honjo and colleagues identified a
gene specifically expressed in the B lymphocyte cell line CH12F3, which can undergo CSR at high
frequency after stimulation [33]. A subsequent study showed that AID-deficient mice are unable
to undergo SHM and CSR, indicating the crucial role AID plays in these processes [34]. Later,
AID deficiency was identified as the cause of an autosomal recessive form of the Hyper-IgM
syndrome (HIGM2), a disease that causes loss of CSR and SHM, confirming the requirement for
AID in SHM and CSR [35].



AID is upregulated upon antigen exposure to induce SHM and CSR. SHM primarily
occurs in the variable region genes of the heavy chain (IgH) and light chain (IgL), which encode
the antibody binding site. Before SHM, transcription bubbles form on the
variable region genes during transcription, exposing ssDNA. AID deaminates cytosine on
ssDNA, creating a U:G mismatch [36]. The cell then uses error-prone replication, base excision
repair (BER), and mismatch repair (MMR) to process the U:G mismatch, creating widely diverse
genes which are then translated into antibodies with different affinities (Figure 1) [2,37]. DNA
replication leads to C-to-T mutations on the AID-deaminated strand and G-to-A mutations on
the complementary strand [38]. During BER, the U is removed by uracil-DNA
glycosylase (UNG) and a single-strand break (SSB) is created by AP-endonuclease (APE). An
error-prone polymerase then randomly inserts any of the four nucleotides at the SSB, creating
mutations [39]. In MMR, the U:G mismatch is targeted by MSH2 (MutS Homolog 2)-MSH3 or
MSH2-MSH6 heterodimers and MutLα (MLH1/PMS2 heterodimer), which create an SSB, and by
exonuclease 1, which removes a stretch of nucleotides near the mismatch site. This gap in the
DNA is then filled in by an error-prone polymerase, creating multiple base mutations near the
mismatch site [40]. During SHM, point mutations occur at a very high rate (10^-3 to 10^-4 per
base per division) within the 1.5 kb downstream of the transcription start site (TSS) [41].
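The replication outcome described above (C-to-T on the deaminated strand, G-to-A on its complement) can be traced with a small simulation. This is a simplified sketch that assumes the U is left unrepaired and is read as T by the replicative polymerase. It ignores BER and MMR, and all names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def complement(strand):
    """Base-by-base complement, kept in the same index order (not reversed)."""
    return "".join(COMPLEMENT[base] for base in strand)

def deaminate_and_replicate(top, position):
    """Deaminate the C at `position` on the top strand, then copy both strands.

    Returns two daughter duplexes as (top, bottom) pairs written in the same
    index order, so that positions line up between strands.
    """
    if top[position] != "C":
        raise ValueError("AID deaminates cytosine only")
    bottom = complement(top)  # original partner strand, carries the G
    deaminated = top[:position] + "U" + top[position + 1:]
    # Copying the deaminated strand: U templates an A on the new strand.
    new_bottom = complement(deaminated)
    # In later rounds the A templates a T, fixing the C-to-T change.
    fixed_top = deaminated.replace("U", "T")
    # Copying the undamaged partner strand regenerates the original sequence.
    unmutated = (complement(bottom), bottom)
    return (fixed_top, new_bottom), unmutated

mutated, unmutated = deaminate_and_replicate("AGCT", 2)
print(mutated)    # top strand carries T, bottom strand carries A
print(unmutated)  # original C:G pair is retained
```

Only one of the two daughter duplexes carries the mutation, which is why a single deamination event yields a mixed population after one round of replication.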
For CSR, AID catalyzes deaminations that induce double-strand breaks in the S region,
which contains highly repetitive sequences rich in AID-favored motifs [42]. Besides containing
favored motifs, the S region also has a structure that favors AID. Transcription of the S
region tends to generate G-quadruplex structures and form long stretches of R-loops [43-45]. In
vitro studies have shown that these DNA structures are preferred AID targets [24]. The
transcription initiated by the I promoter located 5’ to the S region causes the S region DNA to
unwind so that it can be targeted by AID, giving AID access to the entire 2-12 kb S region
sequence [34]. Upon AID-catalyzed deamination, UNG and APE are recruited to make single-strand
breaks (SSBs). When the SSBs on opposite strands are sufficiently close to each other, MMR can
convert the SSBs to a double-strand break (DSB) [46]. These breaks then act as substrates for
non-homologous end-joining (NHEJ), transforming IgM-producing cells into B cells capable of
producing IgG, IgA, and IgE isotypes [47] (Figure 1).
Figure 1.1 AID's role in antibody diversification. In SHM, C->U deamination created by AID induces error-prone replication,
base excision repair (BER), and mismatch repair (MMR). In CSR, AID deamination causes double-strand breaks repaired with
non-homologous end-joining to produce different Ig isotypes [48].



While AID is important for antibody generation, its nature as a mutator poses a
potential threat to genetic integrity. As a result, the activity of AID is strictly regulated at
the transcriptional, post-transcriptional, and post-translational levels [49].
The transcription of AID is induced by the E-protein, NFκB, PAX5, STAT6, and IRF8
transcription factors and is negatively regulated by IRF4, BLIMP1, ID3, and ID2 [50]. For
example, upon stimulation, NFκB and STAT6 bind the enhancer region upstream of the transcription
start site to enhance the expression of AID. The activators PAX5 and E2A are recruited to the
repressive region, effectively de-repressing AID expression [50].
The stability of AID mRNA is regulated by the microRNAs miR-155 and miR-181b. The
expression levels of miR-155 and miR-181b are influenced by the activity of CSR. Because
the AID 3’ untranslated region contains multiple binding sites for these miRNAs, they can
bind the AID mRNA and inhibit its expression [51,52]. AID expression is further regulated by
alternative splicing, which creates different AID variants; the alternative AID variants show
defective CSR and SHM caused by loss of structural support for the catalytic site [53,54].
Post-translationally, AID is regulated through phosphorylation of threonine 27, threonine
140, and serine 38 [55]. Mutations of these amino acids have been shown to interfere with SHM
and CSR in vivo [56]. It is also reported that the cellular concentration of AID is regulated
through proteasomal degradation and is dynamically regulated with Hsp90 [57,58]. Overexpression
of AID or up-regulated AID activity is related to chronic inflammation-associated
oncogenesis [59,60].
AID’s activity is limited by its transcription-targeted mechanism in the cell. AID shows a
strong binding affinity towards ssDNA over dsDNA in vitro [61]. AID acts on ssDNA and has no
activity on dsDNA [62]. In vivo, AID acts during DNA transcription on the non-transcribed
strand [63].



In SHM, mutations in the VDJ gene start from the leader intron of the V gene (150 bp
downstream of the Ig gene promoter) and end within 1.5-2 kb downstream of the sequence
without any preference for the J gene segment, indicating a strong correlation between AID’s
activity and DNA transcription [64,65]. When a second transcriptional promoter was introduced
upstream of the constant region, mutations were induced in the C region at a level similar to
that previously observed in the V region [66]. When the promoter of the V gene was moved 750 bp
upstream, a corresponding shift in the mutation pattern was observed as well [67]. The mutation
frequency is also correlated with the level of transcription. When the transcription level was
adjusted using histone deacetylase, a 2-fold increase in mutation rate was observed in a
hypermutating cell line containing AID [68]. Active transcription is also required for
initiating CSR. The elimination of germline transcription by deleting the promoter leads to
defective CSR [69].
1.3 Biochemical properties of AID
To obtain a deeper understanding of AID’s biochemical activity, a structure-based
analysis can elucidate its catalytic specificity and processivity. AID is difficult to obtain
in high yield when expressed in and purified from B cells or baculovirus-infected insect
cells [70]. Escherichia coli (E. coli) can be used to obtain AID in high yield; however, the
specific activity of AID is reduced about 50-fold compared to insect cell-expressed AID [71].
After expression, it is difficult to purify AID because of the highly localized positive charge
(+11) in the N-terminal region, which causes AID to adhere tightly to RNAs and interferes with
the purification process [23,70]. It has been suggested that this large positive region in the
N-terminal region is responsible for AID’s scanning activity along the negatively charged DNA
backbone. Substituting acidic residues for the basic amino acid residues in this region
diminishes AID’s processivity, causing a notable decrease in the average number of deamination
events on ssDNA in vitro [72]. The C-terminal region has also been shown to be responsible for
AID’s specificity. Mutations including insertions, replacements, or truncations in the
C-terminal region of AID have been shown to cause a loss of CSR activity without interfering
with SHM activity. This indicates the essential role of AID’s C-terminal domain during
interaction with the S region in the Ig gene [73,74]. This phenomenon is also seen in patients
with Hyper-IgM syndrome type 2 (HIGM2), a primary immunodeficiency disease characterized by
normal or high serum IgM levels with little or no IgG, IgA, and IgE isotypes. A genetic study of
a few patients with normal SHM activity but defective CSR identified mutations in the C-terminal
region of the AID gene [73,75]. A recent in vitro study shows that the C-terminal region is
responsible for AID oligomerization and interaction with the AID-favored substrates during
CSR [24].
To overcome the purification difficulty and investigate the catalytic specificity of AID,
AIDv(Δ15) was generated. AIDv(Δ15) has 12 mutations and three deletions in the amino acid
sequence at the N-terminus and a 15 amino acid deletion at the C-terminus. The N-terminal
mutations create an N-terminus identical to that of APOBEC3A and reduce the charge of AID from
11.9 to 3.4 at neutral pH. The C-terminal truncation facilitates crystallization [23].
Structural analysis shows that the major difference between AID and other APOBEC proteins is at
loop 7, which is larger and extends away from the active site [23]. In loop 7, AID contains a
distinct amino acid sequence from Leu113 to Pro123; this loop has been demonstrated to govern
the specificity of APOBEC proteins [76,77]. Changing the residues in the loop alters the
mutation specificity, producing different mutation spectra [78]. The expanded loop
of AID also allows it to accept two purines on the 5’ side adjacent to the targeted C (WRC)
without strictly discriminating against pyrimidines [23].
Our biochemical studies show that AID catalysis is processive on ssDNA. The initial
evidence of AID’s processive action was obtained through an in vitro assay in which each ssDNA
is deaminated by a single AID molecule. After reacting with AID, many deaminated ssDNA
molecules contain multiple C-to-T mutations [79]. Later experiments assessing AID’s activity
show a linear relationship between AID’s catalytic activity and reaction time [80]. The
processivity of AID is critical during SHM and CSR, ensuring that AID binds strictly and tightly
to the target ssDNA during transcription. It has been shown that AID can translocate along ssDNA
within a transcription bubble while retaining its catalytic specificity in vitro [72,81]. The
processive activity of AID is also seen in vivo, in a UNG-/- MMR-/- mouse model showing large
numbers of multiple deaminations in Ig genes [82].
1.4 Application of AID
Since the invention of the CRISPR-Cas9 system, the gene-editing field has advanced rapidly
with different CRISPR-associated tools that enable targeted gene disruption, recovery, and
regulation [83]. However, conventional CRISPR-based tools encounter inherent problems arising
from the DSB response that follows the cleavage activity of CRISPR: the p53-mediated DNA damage
response can cause unexpected chromosomal deletions or rearrangements [84,85]; NHEJ can induce
small insertions and deletions at the DSB site; and homology-directed repair (HDR) remains
inefficient and only occurs at specific cell cycle phases [86-88]. To solve this fundamental
issue of CRISPR, cytosine base editors and adenine base editors have been used, enabling
precise, efficient, and reversible single-nucleotide conversion without causing DSBs or
requiring donor DNA [83,89].
Because of AID’s precise activity in converting C to U, it attracted the attention of
scientists in the CRISPR-Cas9 gene-editing field. The first cytosine base editor using AID
(Target-AID) was invented in 2016 by Dr. Nishida from Kobe University. Nuclease-inactive
dCas9 is fused through a peptide linker to an AID ortholog from sea lamprey. The dCas9 can
unwind the DNA near the protospacer adjacent motif (PAM), exposing a complementary region
to which the single guide RNA (sgRNA) can bind, forming an R-loop that facilitates AID’s
activity [90]. AID can initiate base editing by C-to-U deamination without causing DSBs. DNA
replication of the deaminated sequence then replaces the U with T, causing a C-to-T conversion
in the DNA [91]. In yeast and mammalian systems, Target-AID shows C-to-T conversion with
efficiencies of 80% and 10%, respectively. The mutation spectrum of Target-AID’s activity in
yeast indicates a short deamination window (a few bases) 18 bases upstream of the PAM [90].
A similar strategy, Targeted AID-mediated mutagenesis (TAM), was published in late
2016. TAM used a human AID variant, AIDx (AID-P186X), which shows stronger
deamination activity than full-length AID. Together with a UDG inhibitor to prevent the
repair of U-G mismatches, dCas9-AIDx shows a five-fold increase in mutation rate and
is established as an effective tool for genetic diversification in mammalian cells, enabling
protein evolution that was not feasible before92.
Since the invention of AID-associated base editors, improvements have been made to
enhance the mutation rate and range. For example, dCas9 has been replaced by nCas9 (D10A),
which nicks the non-edited DNA strand that is complementary to the sgRNA93. The cleavage
activity of nCas9 induces MMR, leading to enhanced editing efficiency91. In another case
(CRISPR-X), MS2 protein-fused AID is recruited by an sgRNA with MS2 binding sites, allowing a
catalytic window of +/-50 bp from the PAM sequence94. AID-associated base editors are
powerful tools for precise and effective genetic modification. Advances in these tools offer
new opportunities for disease modeling and disease treatment87.
1.5 Conclusions
Being a significant member of the AID/APOBEC family, AID has been extensively
studied because of its crucial role in the adaptive immune system. Dysfunction of AID
expression has been associated with immune system impairments. Biochemical investigations
have provided insights into AID’s characteristics and enabled AID to be applied as a tool for
gene editing. In this study, we continue to explore the potential applications of AID inspired by
AID’s pivotal role in the immune system and AID’s catalysis and scanning activities in vitro.
Biochemical studies have revealed that AID functions as a processive protein on ssDNA
with specific favored catalytic motifs. Our previous studies have described AID’s
scanning-coupled catalytic activity using an analytical mathematical model95. To get a deeper
understanding of AID’s activity on ssDNA in vitro, we plan to adopt Next Generation
Sequencing to obtain more detailed information on AID’s deamination footprint. By expanding
our access to AID’s deamination footprint data, we aim to enhance our previous model so that
a mathematical calculation can be conducted through the deamination activity of AID, bringing
new insights into the direction of protein study. (Chapter 2)
To replicate the SHM process of the B cell, we have used AID and an SHM-upregulated
polymerase, Pol η, to introduce diversification into native nanobodies sourced from llamas.
Through the adaptation of phage display techniques, we have generated nanobodies with a
heightened affinity for Fatty Acid Amide Hydrolase (FAAH). Nanobodies with high affinities
towards FAAH hold significant promise for addressing the therapeutic needs of patients
suffering from pain and depression. (Chapter 3)
Chapter 2. Adjacent Nucleotides Modulate AID's Deamination
Activity on ssDNA
2.1 Introduction
AID is required during antibody generation; however, the biochemical mechanisms that
allow it to identify and act on target motifs are not well understood. It has been shown that AID
acts processively on ssDNA and catalyzes multiple deaminations on the same DNA molecule in a
single binding event62,79. Using purified AID, it has been shown in vitro that AID deaminates C
preferentially in WRC (W=A/T, R=A/G) hot motifs while acting much less efficiently on SYC
(S=G/C, Y=T/C) cold motifs79. In previous studies, a one-dimensional, bidirectional random
walk model was used to analyze AID’s scanning and catalysis process on ssDNA with
homogeneous or alternating trinucleotide motifs95,96. Using ssDNA containing two hot motifs
and one cold motif, we concluded that AID deaminates at an efficiency of about 8% for the WRC
hot motifs and 0.7% for the SYC cold motifs, with an approximate motif binding time of 3 s and
an ssDNA binding time of 5 min81,96.
While our previous ssDNA substrate design produced data that allowed us to derive
analytical mathematical models describing AID’s scanning and deamination dynamics, there
were several technical obstacles that hindered us from getting a more comprehensive
understanding of AID. In previous studies, to ensure that AID acted on stretched ssDNA, a
gapped-DNA substrate was constructed using an M13 phage. After transformation into
uracil-DNA-glycosylase-deficient (ung-) E. coli, deamination events were identified as mutant
M13 phage progeny forming white (colorless) plaques, whereas wild-type plaques (having
no C deaminations in M13 phage DNA) exhibit a dark blue color. However, this color-indicating
system relies on the generation of one or more premature stop codon(s) in the reporter of the
lacZα reading frame by AID deamination. Because AID must create stop codons that prevent
expression of the lacZα gene, only a limited number of deamination motifs
(AGC/AAC/GTC) can be chosen in this system80. The ability of inserted repeated
motifs to propagate in E. coli limited the motif selection as well. Certain homogeneous
repeated motifs are not tolerated by E. coli and can cause growth defects,
resulting in no cell growth or in a truncated M13 genome after purification97-99. Another
difficulty encountered in the previous experimental setup was the sequencing system. For each
clone selected on the blue-white screening plate, DNA was individually purified and sequenced
by Sanger sequencing. The whole process was both time-consuming and labor-intensive.
Next-generation sequencing (NGS) was invented in the early 21st century and has been
rapidly adopted because of its ability to provide highly efficient, rapid, low-cost DNA
sequencing100. However, NGS generally has an error rate between 0.1% and 1% depending on the
sequencing platform and GC content of the DNA library101. Although barcoding strategies have
been proposed to lower the error rate, such strategies can have low yields and do not remove
errors introduced by PCR during the library preparation process102,103. Maximum-depth
sequencing (MDS) was invented to enhance the accuracy of NGS. The method shows high
sensitivity in detecting rare mutations while keeping the error rate as low as
5 × 10^-8 per nucleotide.
In this study, we adapted the MDS strategy with a few modifications to make the
method compatible with DNA sequences containing uracil (Figure 2.1). Utilizing IDT’s
OligoAnalyzer tool, we designed ssDNA sequences with disfavored secondary structure, as judged
by energy minimization using nearest-neighbor energy parameters104,105. We first examined
whether AID’s activity on ssDNA is preserved compared to the gapped-DNA structure. Then we
collected and analyzed
the footprints of AID on ssDNA with homogeneous motifs, which were not possible to make with
the gapped-DNA method. We also investigated the effect of neighboring nucleotides (DNA
context) on AID-catalyzed deamination efficiencies in various target motifs and found that AID
behaves differently on the same motifs with different neighboring nucleotides. With these data,
we worked on developing a mathematical model that can describe the scanning dynamics and
deamination activity of AID. AID scanning and catalysis are tightly coupled processes95. We
believe this study will allow us to understand AID’s scanning mechanism. Previously, we
proposed a model to predict AID’s coupled scanning and catalysis activity and confirmed the
results using experimental data. Furthermore, we were able to describe AID’s scanning-coupled
catalysis with a Hamiltonian equation. In this project, we plan to use the Hamiltonian equation as
a base, use the experimental data to calculate the coefficients of the equation, and try to solve a
factorization problem through AID’s deamination activities.
Figure 2.1 Overview of MDS strategy. Each ssDNA is attached to a unique barcode of 25 random nucleotides. Linear
amplification attaches the index and P7 fragment while replicating only from the original template, avoiding accumulation of
PCR errors. Exponential amplification is then performed to attach the P5 fragment and build the NGS library. During analysis,
reads with the same barcode are grouped and a consensus sequence is used to identify true deaminations.
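The grouping step described in the caption can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in this study: the function names, the `min_reads` threshold, the strict-majority rule, and the assumption that a C to U deamination is read as T on the sequenced strand are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Group (barcode, sequence) pairs by barcode and build one consensus each.

    min_reads is an illustrative threshold, not a value taken from the text.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_reads:
            continue  # too few reads to call a consensus
        if len({len(s) for s in seqs}) != 1:
            continue  # simplification: skip families with length differences
        bases = []
        for column in zip(*seqs):
            counts = Counter(b for b in column if b != "N")
            if not counts:
                bases.append("N")
                continue
            base, n = counts.most_common(1)[0]
            # Keep a base only if a strict majority of called reads agree.
            bases.append(base if 2 * n > sum(counts.values()) else "N")
        consensus[barcode] = "".join(bases)
    return consensus


def deaminated_positions(reference, consensus_seq):
    """Positions where the reference has C and the consensus reads T."""
    return [i for i, (r, c) in enumerate(zip(reference, consensus_seq))
            if r == "C" and c == "T"]
```

Because an error introduced in a single read rarely reaches a majority within a barcode family, the consensus call suppresses isolated sequencing errors while keeping changes shared by the whole family.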
2.2 Results
2.2.1 AID shows similar activity on USER enzyme (Uracil DNA glycosylase and
Endonuclease VIII) -treated ssDNA and gapped-DNA
To investigate the feasibility of using ssDNA instead of gapped-DNA as a model for this
study, we first tested the activity of AID on both ssDNA and gapped-DNA. The ssDNA was
treated with USER enzyme to remove molecules carrying uracil from spontaneous deamination
during the oligo production process. USER enzyme is a mixture of Uracil DNA glycosylase (UDG)
and the DNA glycosylase-lyase Endonuclease VIII. UDG excises uracil bases, forming
an abasic (apyrimidinic) site while preserving phosphodiester backbone
integrity. Endonuclease VIII exhibits lyase activity, cleaving the phosphodiester backbone at both
the 3´ and 5´ sides of the abasic site. The USER enzyme therefore cuts any uracil-containing
ssDNA supplied by IDT, which otherwise could lead to higher noise levels in the final data
analysis. USER enzyme-treated ssDNA had a background level similar to that of gapped-DNA,
about 4-fold lower than that of untreated ssDNA (Figure 2.2). Both USER-treated ssDNA and
gapped-DNA containing ‘AGC’ repeats were treated with AID and analyzed. AID shows identical
activity in these two systems with respect to mutation frequencies and two-point correlations
(Figure 2.3).
The number of deaminations AID can make is time-dependent. At early time points, AID
has acted on the ssDNA for only a short period, so the majority of deaminated ssDNA molecules
carry only a few deaminations. As the incubation time increases, AID acts on the DNA for a longer
period, and more ssDNA molecules contain multiple deaminations. This was
observed in the experimental results (Figure 2.3A). For both gapped-DNA and USER-treated
ssDNA constructs, at early time points (30 s, 45 s), 50-70% of the deaminated DNA had only one
deamination and less than 20% of the DNA had more than 4 deaminations. However, at 5 min,
the amount of DNA having one deamination decreased to 30-40% and the amount of DNA
having more than 4 deaminations increased to 30-40%. This indicates that AID is acting, as
expected, similarly on both DNA constructs.
A two-point correlation analysis measures the pairwise distance between all deaminations
on each deaminated ssDNA. The x-axis of the two-point correlation plot is the number of
motifs (trinucleotides) between two deaminations, and the y-axis is the frequency of the
observation. We have shown that two-point correlations can be used
to study the displacement dynamics of AID; the catalysis activity of AID is not explicitly
contained in the two-point correlation analysis96. As a result of AID’s high processivity, as
incubation time increases the proportion of ssDNA with a single deamination decreases while the
proportion of ssDNA with multiple deaminations increases. The two-point correlation is
therefore expected to be sharper and narrower at short times and to become flatter at longer
times. This expectation matches the experimental results (Figure 2.3B). Both DNA constructs
show the same two-point correlation trend as reaction time increases.
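As an illustration of the two-point correlation described above, the following minimal Python sketch counts pairwise distances between deaminated motifs. The function name and the input layout (one list of deaminated motif indices per clone) are hypothetical, and distance is taken here as the difference between motif indices; the normalization used for the figures in this study may differ.

```python
from collections import Counter
from itertools import combinations


def two_point_correlation(clones):
    """Return {distance_in_motifs: frequency} over all deamination pairs.

    clones: iterable of lists, each holding the indices of the deaminated
    motifs on one ssDNA molecule (hypothetical input layout).
    """
    counts = Counter()
    for sites in clones:
        # Every unordered pair of deaminations on the same molecule
        # contributes one observation at its separation distance.
        for a, b in combinations(sorted(set(sites)), 2):
            counts[b - a] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: counts[d] / total for d in sorted(counts)}
```

Clones with a single deamination contribute no pairs, so only molecules with two or more deaminations shape the curve.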
Figure 2.2 Comparison of background noise among different DNA constructs. Three types of DNA construct were used for
making MDS NGS libraries without any AID treatment. Untreated IDT ssDNA shows the highest error rate, and the error rates of
the gapped-DNA construct and USER-treated ssDNA were comparable.
Figure 2.3 Mutation pattern comparison between gapped and USER-treated AGC homogeneous motif constructs. A. Comparison
of the distribution of deamination counts. The clones showing AID-induced deamination were collected and subgrouped based
on the number of deaminations they contained. AID scans with high processivity on all ssDNA substrates. Both DNA constructs
show the same trend of a decrease in the number of clones with a single deamination and an increase in the number of clones
with more than five deaminations. B. Comparison of two-point correlations. Two-point correlations describe the frequency of the
distance between two deaminations. Both DNA constructs showed a similar increase in the frequency of larger correlation
distances at longer times, suggesting that the activity of AID is similar when acting on the two DNA constructs.
2.2.2 AID scanning and catalysis are decoupled on homogeneous motifs
With evidence showing that ssDNA can potentially replace gapped-DNA, we constructed DNA
with homogeneous motifs that were not compatible with the gapped-DNA system.
To avoid interference with the activity of AID,
IDT’s OligoAnalyzer tool was used to design ssDNA with non-cold homogeneous motifs and
to eliminate the formation of secondary structure. SYC cold motifs were also avoided because
AID’s low catalytic activity on them leads to rare mutation events.
The homogeneous motifs were used here to decouple AID’s scanning and catalysis
activity. Since deamination is the result of both scanning and catalysis, by having only one type
of motif on the ssDNA, AID has the same deamination efficiency on each motif. AID catalysis
contributes to the number of deaminations but does not affect the scanning trajectories. By doing
the two-point correlation analysis, the catalysis effect of AID can be factored out and the results
reflect the scanning properties of AID.
For example, in ssDNA constructs with AID-favored homogeneous motifs, AID catalyzes
large numbers of deaminations and generates a substantial amount of deamination data.
However, because the two-point correlation analysis is frequency-based, the large dataset
resulting from AID’s favored catalysis would only enhance the accuracy of the analysis and
would not change the shapes of the curves at the different time points. If AID scans the same
way on a DNA construct containing disfavored motifs, less deamination data will be obtained.
This would affect only the accuracy of the curve, because of possibly higher statistical error,
but not the shape of the curve.
Because of the capability of eliminating the catalysis effect, the result of the two-point
correlation can reflect how AID scans on homogeneous sequences with different motifs. ssDNA
constructs containing different repetitive motifs were treated with AID from 30s to 5min. At each
time point, a fraction of the reaction would be taken out for sequencing. The sequencing result
was used to describe AID’s activity at each time point. By analyzing data obtained from short to
long reaction times, AID’s scanning properties can be indirectly revealed.
Although different deamination percentages were observed on ssDNA with different
homogeneous motifs, the two-point correlation results look similar to each other (Figure 2.4).
Because of AID’s catalysis preference, a large amount of data can be generated from the
AAC/AGC/TAC (WRC) motifs, yielding smooth curves. However, for the non-WRC
motifs TTC/ATC, the smaller amount of data generates more noise,
resulting in curves with greater scatter. Overall, the trend and frequency of the two-point
correlation analysis agree across the different homogeneous motifs.
This result indicates that AID’s scanning is similar and that its scanning and catalysis can be
decoupled on homogeneous sequences with different motifs.
Figure 2.4 Two-point correlation analysis of all selected homogeneous motifs. The motifs were chosen because they have a
neutral or favored deamination tendency by AID, contain only one cytosine, and do not favor secondary structure formation. The
two-point correlation analysis shows similar AID activity on these motifs.
2.2.3 AID’s activity depends on neighboring nucleotides
It has been shown that AID deamination activity is strongly influenced by surrounding
nucleotides in vitro and in vivo. AID shows higher deamination activity
when surrounded by flexible nucleotides (poly dT) and reduced deamination activity in the
vicinity of rigid nucleotides (poly dA)106. However, the mechanism responsible for this sequence
context dependence is unclear. We designed ssDNA in which homogeneous AGC motifs are separated
by poly dT trinucleotide motifs and reacted it with AID. While the number of deaminations
increased with AID reaction time, AID was less reactive on the poly-dT separated ssDNA than
on the homogeneous motif ssDNA. To reduce background noise, and to ensure that AID was
scanning over the entire ssDNA substrate, ssDNA clones with more than one deamination were
grouped and analyzed. The number of clones having more than one deamination was comparable
between the two DNA constructs at early time points; however, it differed by up to 20% at the
5-minute time point. This indicated that AID behaved differently when acting on these two
different DNA constructs (Figure 2.5). The mutation frequency distribution for the poly-dT
separated ssDNA showed an accumulation of clones
having one deamination and fewer clones having multiple deaminations (Figure 2.6A). To
further investigate the cause of the difference, a two-point correlation analysis was carried out
and showed distinctive scanning properties on poly-dT separated ssDNA (Figure 2.6B).
Two-point correlation curves for different incubation times aligned with each other, indicating
that AID was not scanning as had been predicted. Together with the mutation distribution data,
we suggest that the poly-T could potentially inhibit AID’s scanning and cause the trapping of
AID. As a result, for sequences with “TTT” inserts, where AID scanning appears to be inhibited,
AID has a lower chance of making multiple deaminations and looks less active.
Figure 2.5 Comparison of clones with more than one deamination. The AGCTTT and AGCAGC constructs were treated
with AID. The poly-dT separated ssDNA shows up to 20% fewer clones with multiple deaminations compared to the ssDNA
without poly-dT inserts.
Figure 2.6 Mutation pattern of AGCTTT. A. Clonal deamination distribution. No dramatic change was observed for clones with
five or more deaminations. Over half of the clones had only one or two deaminations. B. Two-point correlation analysis of all
AGCTTT homogeneous motifs. The two-point correlation curves at different time points align with each other, suggesting that
AID was not scanning properly.
2.2.4 The neighboring nucleotides show differing effects on different trinucleotide motifs
With the understanding that AID may deaminate differently in the presence of different
surrounding nucleotides, we designed sequences in which different motifs are separated by
different spacers to examine the effects of neighboring nucleotides on deamination in individual
motifs. A 15-nucleotide spacer was designed, and five different trinucleotide motifs
showing different deamination tendencies were selected. The 5-motif DNA constructs were then
treated with AID for 5 min. Across the three spacers tested here, the deamination efficiency
differed on each of the motifs. Surprisingly, the rate of deamination of AAC and ATC exhibits a
two-fold difference depending on the choice of the spacer. With spacer
“GTTATGTAGAGTGTT”, AID was less active on AAC, an AID-favored motif, compared to the
neutral ATC motif (Figure 2.7).
Figure 2.7 The deamination rate with different 15 nt spacers at 5 min AID incubation. With different spacers chosen, AID’s
activity changes for the different motifs. Especially for motifs AAC, ATC, and TAC, AID’s deamination preference changes
depending on the spacer chosen.
2.3 Discussion
Proteins that scan double-stranded DNA (dsDNA) processively to detect mismatched nucleotides
as part of mismatch repair and base excision repair systems have been studied
extensively107-110. During the past 20 years, proteins that scan single-stranded DNA, e.g.,
replication protein A (RPA), have been shown to function in genome maintenance111-113. AID
plays an important role in our immune system by initiating the antibody diversification process.
A better understanding of the scanning and catalysis activity of AID could provide a deeper
insight into antibody diversity.
While we cannot directly track the activity of AID on ssDNA in our experimental setup,
our experiment does provide single-molecule-level information by showing the distribution of
AID-catalyzed C to U deamination events along ssDNA. We have shown that the pattern of AID
deaminations can be used to deduce how AID scanning and catalysis are coupled; in other words,
the pattern of deaminations is dependent on the scanning properties of AID. By analyzing ssDNA
with NGS in the experimental workflow, we were able to investigate AID’s activity on a variety of
homogeneous motifs. One of the main concerns was the quality of IDT’s synthetic ssDNA. In the
early stage of the experiment, while cloning the ssDNA directly into the M13
phage backbone, we noticed that there were ssDNA molecules with defects such as nucleotide
insertions or deletions and single-nucleotide misincorporations. To solve this problem, USER
enzyme was used to cut ssDNA carrying spontaneous deaminations that arose during synthesis, and
highly restrictive analysis conditions were used to count only those differences from the
reference sequence that occur at C sites. With both experimental
and analytical adjustments, we were able to reach the same fidelity level as the gapped-DNA
setup while not sacrificing the freedom of design of the ssDNA sequence.
Using homogeneous sequences with different motifs, AID behaves in a manner consistent
with our previous observations, exhibiting an elevated C deamination
efficiency in WRC motifs. To explore the scanning dynamics of AID, a two-point correlation
analysis was used. In our experiment, we selected clones containing at least two C to U
deaminations, and used a two-point correlation analysis to determine the probability of observing
two U bases at the various target sites, i.e., a measure of the correlation between two
deamination events. Since the ssDNA contains a repetition of only one motif, the result of the
two-point correlation analysis should depend only on the scanning properties of AID. The
two-point correlation data show that AID scanning and catalysis are uncoupled, as predicted by the
model calculation, which is valid only for homogeneous motifs95.
Scanning and catalysis are coupled for DNA containing inhomogeneous motifs95. This
was first observed by inserting a poly-T trinucleotide into AAC and AGC repetitive motifs.
While the deamination rate increased with reaction time, the two-point correlation results
were the same for different reaction times. The reason behind this observation is unknown, and
we suggest that AID was trapped locally on the ssDNA molecules. As a result, AID’s scanning
activity was very different between the homogeneous motif sequences and the poly dT inserted
sequences.
To better understand how neighboring nucleotides could contribute to the deamination
efficiency of AID, we designed sequences with a 15nt spacer separating different motifs. While
retaining single-hit conditions, in which AID can act on a ssDNA substrate only once (with a
probability of > 90% based on a Poisson analysis of mutated clones), we observed that AID’s
deamination rate changes dramatically for different spacers. We observed that the spacer has a
higher impact on the hot motifs (up to 20% difference) compared to the neutral ones (up to 10%
difference). By further shuffling the position of the hottest motif, we noticed the deamination rate
of the hottest motif decreased when it was next to a hot motif. Currently, we do not have a good
explanation for this observation. More experiments need to be executed to understand this.
There is much to learn about AID’s activity on ssDNA. One of the experiments that can
facilitate this study is to track the scanning activity of AID on ssDNA in real time. We purified
AID and ligated it to an AZD-647 fluorophore through a sortase-mediated ligation
process. This ligation process allows cleavage of certain affinity tags and yields high-purity
labeled AID without any free fluorophores. Together with ATTO-565 labeled ssDNA, we could
monitor the scanning activity of AID on ssDNA in real time by FRET analysis at the
single-molecule level.
The scanning properties of AID alter the FRET signal depending on the distance between
AID and the labeled nucleotide on the ssDNA tethered to the cover slip. The FRET experiment
could immediately provide insight into why AID behaved so differently with different
neighboring nucleotides. For example, we have performed the two-point correlation analysis
on sequences with homogeneous motifs and on motifs separated by tri-T nucleotides. The
two-point correlation analysis indirectly showed that AID scanned differently on these two
constructs. The FRET trajectories could show the difference directly.
Besides investigating the nucleotide neighboring effects of AID scanning, AID interests
us the most through its quantum isomorphism properties. We have previously shown that
coupled scanning and catalysis by AID can be described using a Hamiltonian equation114. While
the Hamiltonian equation structure has been determined from our previous study, the coefficient
of each variable/parameter of this Hamiltonian equation depends on different sequence
arrangements. We intend to calculate these coefficients using the NGS data and use this
Hamiltonian equation to perform mathematical calculations. We intend to use ssDNA sequences
with five different motifs separated by 15 nt spacers. With 5 different motifs on the ssDNA,
each either deaminated or not, there would be 2^5 = 32 different combinations of deamination
outcomes. Using the frequency of each of these 32 deamination outcomes, Dr. Mak can calculate
the coefficients to incorporate into
the Hamiltonian. We will react AID with two 5-motif sequences sharing the same spacer but
having different motif arrangements. The NGS data from one arrangement will be used to derive
the Hamiltonian coefficients and the NGS data from the other arrangement will be used to
determine whether the coefficient calculation is correct. Meanwhile, we also want to calculate
the transition rate between different motifs on this 5-motif sequence. Ideally, the transition
rate should be the same as, or show the same trend as, what we observed from the two-point
correlation analysis. However, the transition rate from the 5-motif sequences did not align
with our findings with homogeneous sequences. The calculation assumes that AID scans the ssDNA
uniformly, regardless of sequence. We will need to
do more tests or change the way the coefficients are calculated to solve this problem.
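The 32 outcomes mentioned above can be enumerated explicitly. The sketch below is a hypothetical helper, not the procedure used to derive the Hamiltonian coefficients: it tabulates the frequency of every deamination pattern from per-clone data in which each motif is marked 1 (deaminated) or 0 (unchanged). The function name and input layout are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def outcome_frequencies(clones, n_motifs=5):
    """Return the observed frequency of each of the 2**n_motifs patterns.

    clones: iterable of 0/1 tuples, one flag per motif on the ssDNA
    (hypothetical input layout).
    """
    counts = Counter()
    for clone in clones:
        pattern = tuple(clone)
        if len(pattern) != n_motifs or any(flag not in (0, 1) for flag in pattern):
            raise ValueError(f"invalid deamination pattern: {pattern!r}")
        counts[pattern] += 1
    total = sum(counts.values())
    # Patterns never observed are reported with frequency 0.0 so that
    # all 2**n_motifs outcomes are always present in the result.
    return {pattern: (counts[pattern] / total if total else 0.0)
            for pattern in product((0, 1), repeat=n_motifs)}
```

Frequencies obtained from one motif arrangement could then be compared with those from the second arrangement, pattern by pattern, as a check on any fitted coefficients.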
2.4 Materials and methods
ssDNA and AID preparation
The ssDNA was designed with the OligoAnalyzer Tool (Integrated DNA
Technologies, Inc.) to have a low predicted likelihood of forming hairpin and self-dimerized
structures, and was then synthesized by Integrated DNA Technologies, Inc. The ssDNA sequences
can be found in Table S1.1. To remove products of spontaneous deamination or contamination
arising during synthesis, 650 ug of oligonucleotides were further treated with 3 units of USER
enzyme (New England Biolabs) at 37℃ for 1 hr. The reaction was terminated by two
phenol:chloroform:isoamyl alcohol (25:24:1) extractions followed by one P-6 gel column (Bio-Rad)
for buffer exchange and small molecule elimination. The purified ssDNA was quantified with UV
Spectrometry (Beckman Coulter) to measure deamination levels. The protocol for making
gapped-DNA has been described in our previous paper80. In brief, the DNA was first cloned into
an M13mp2 phage backbone at the beginning of the lacZα gene using the EcoRI restriction enzyme
and transformed into E. coli. Using blue-white selection, clones containing the insertion,
which produce white or light blue plaques, were collected. To ensure the fidelity of the DNA
constructs, each clone was sequenced individually, and clones containing the correct DNA
constructs were saved and used for phage production. ssDNA was then purified from the phages.
Lastly, the gapped DNA was created by annealing the ssDNA with denatured M13mp2 DNA.
Wild-type GST-tagged AID protein was expressed in baculovirus-infected Sf9 insect
cells. 1 L of Sf9 cells at 2 × 10^6 cells/ml was infected with the baculovirus. The infected
cells were grown in a suspension culture flask at 27℃ for 3 days. The cells were collected and
washed with PBS three times. The cell pellet was then suspended in lysis buffer (20 mM Hepes
pH7.5, 250 mM NaCl, 1 mM DTT, 1 mM EDTA, 10 mM NaF, 10 mM NaHPO4 pH7.5, 10 mM Na4P2O7,
10% Glycerol, 1 tablet of Roche cOmplete protease inhibitor and 1% Triton X-100). The mixture
was stirred until homogeneous and then sonicated on ice with duty cycle 50 and output cycle 5
for 5 min. The supernatant containing the protein was collected by spinning the sonicated
sample at 11,000 rpm for 30 min. The GST-AID was captured with equilibrated Glutathione
Sepharose 4B resin (GE Healthcare) on a rocker overnight in the cold room. The resin was washed
with wash buffer (100 mM Tris pH8 and 500 mM NaCl) and eluted with elution buffer (50 mM reduced
glutathione, 100 mM Tris pH8, and 500 mM NaCl). The eluted GST-AID was dialyzed in buffer
containing 20 mM Tris pH7.5, 50 mM NaCl, 1 mM EDTA, 1 mM DTT, and 10% glycerol and
stored at -80℃.
Deamination reaction
15 ng of GST-AID and 15 ng of RNase A (Sigma) were preactivated in 10 ul of reaction
buffer (20 mM Tris pH8, 1 mM DTT and 1 mM EDTA) for 2 min at 37℃. 30 ng of ssDNA was
suspended in 10 ul of reaction buffer and incubated at 37℃ for 2 min. The deamination reaction
was initiated by mixing the enzyme solution with the ssDNA solution and incubating from 0 s to
5 min. The reaction was terminated by double extraction with phenol:chloroform:isoamyl
alcohol (25:24:1) and one P-6 gel column buffer exchange. For gapped DNA, another 1-hour
incubation with EcoRI at 37℃ was performed to release the ssDNA for NGS library preparation.
To ensure that each mutated ssDNA was acted on by only a single AID molecule, AID and ssDNA
concentrations were chosen so that 5% or less of the ssDNA would be deaminated
at 5 min (AID’s binding time is 5 min)62,79,81,115.
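The reasoning behind this single-hit condition can be checked with a short calculation. The sketch below assumes that the number of AID encounters per ssDNA molecule is Poisson distributed and that every encounter leaves at least one detectable deamination; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
import math


def single_hit_probability(fraction_deaminated):
    """P(exactly one AID encounter | at least one encounter).

    Assumes Poisson-distributed encounters, so the deaminated fraction
    equals 1 - exp(-lam), where lam is the mean encounters per ssDNA.
    """
    if not 0.0 < fraction_deaminated < 1.0:
        raise ValueError("fraction_deaminated must lie strictly between 0 and 1")
    lam = -math.log(1.0 - fraction_deaminated)
    p_exactly_one = lam * math.exp(-lam)
    return p_exactly_one / fraction_deaminated


if __name__ == "__main__":
    print(round(single_hit_probability(0.05), 3))
```

Under these assumptions, a deaminated fraction of 5% corresponds to a probability of roughly 0.97 that a mutated ssDNA was acted on by a single AID molecule, and the probability rises as the deaminated fraction falls.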
NGS library preparation for Maximum-depth sequencing
The method was adapted from Dr. Nudler’s group and modified to fit our purposes of
capturing the deamination footprint103. 1ul of AID treated ssDNA was reacted with 4nM Ind-25-
NTSC or Reverse-Ind25 for 1 cycle PCR (94℃ 1min; 50℃ 1min, 72℃ 2min, and 4℃ hold)
with premixed Taq polymerase (Promega Corporation) for a total reaction volume of 25ul. The
PCR mixture was incubated with 0.85 ratio AMPure XP beads (Beckman Coulter Life Science)
and eluted in 20ul TE buffer. 4ul of purified DNA was linearly amplified with 1ul of 1 uM NG or
P7 primers for 12 cycles with 0.3ul of Q5U Hot Start High Fidelity DNA Polymerase (New
England Biolabs) and 200uM dNTP in a total reaction volume of 30ul (98℃ 30s, 98℃ 10s,
60℃ 20s, 68℃ 1min, go to step 2 11X, and 68℃ 5min). When the linear amplification was
finished, the exponential amplification was started by adding 1ul of 1uM New-Rev or Rev NG
primers and amplifying for 12 cycles (98℃ 30s, 98℃ 10s, 55℃ 20s, 72℃ 1min, go to step 2 11X, 72℃ 5min



30
and 4℃ hold). The amplified library was cleaned with 2 times beads purification with a 0.85
ratio and eluted with 15ul TE buffer.
NGS with Miniseq machine
The purified NGS library was quantified by qPCR (Quanta bio) or Qubit (Thermo
Fisher Scientific), and its size was checked with a DNA Analyzer (Agilent). The High-output
reagent kit was thawed in a water bath and then kept in the cold room until use. The NGS library
was denatured by mixing 5 µl of 1 nM library pool with 0.1 N NaOH for 5 min at room temperature.
5 µl of 200 mM Tris-HCl was added to neutralize the reaction. The denatured library was diluted
with prechilled Hybridization buffer to make a 2 pM loading library. The 2 pM PhiX loading
stock was made the same way from PhiX control V3 (Illumina, Inc). The diluted library and
PhiX were mixed to reach a total volume of 500 µl with 5% PhiX as the final loading stock. The
final loading stock was added to the sequencing cartridge, and sequencing was started on the
Miniseq machine. For larger data acquisition, the NGS library samples were shipped to Amera
Health Inc. for Hiseq or Novaseq sequencing.
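As a worked example of the final mixing step, the sketch below computes the volumes needed for a 500 µl loading stock containing 5% PhiX. It assumes that the library and the PhiX stock are both at 2 pM, so that 5% by volume equals 5% by molarity; the function names are hypothetical, and a general C1V1 = C2V2 helper is included for illustration only.

```python
def loading_mix(total_ul: float = 500.0, phix_fraction: float = 0.05) -> dict:
    """Volumes of library and PhiX (both assumed to be at the same
    concentration) needed for the final loading stock."""
    phix_ul = total_ul * phix_fraction
    return {"library_ul": total_ul - phix_ul, "phix_ul": phix_ul}


def dilution_volumes(stock_conc: float, target_conc: float, final_ul: float):
    """C1*V1 = C2*V2: return (stock volume, diluent volume) in µl.
    Both concentrations must be given in the same unit."""
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_conc * final_ul / stock_conc
    return stock_ul, final_ul - stock_ul


if __name__ == "__main__":
    print(loading_mix())  # library and PhiX volumes in µl
```

For the default values this gives 475 µl of diluted library and 25 µl of PhiX.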
NGS data analysis
The source code for NGS data analysis (Appendix) has been uploaded to GitHub
(https://github.com/hongyuzh0212/MDS). Depending on the sequencing quality, quality
control.py was used to convert any nucleotide to “N” if its quality score was lower than 32
(ASCII-33 scale). The converted read1 and read2 files were processed by
barcodecreat+universal.py to merge the read1 and read2 files and extract the unique barcodes
appearing more than 2 times in the reads. During the merge process, a verification was usually
done to ensure that the 125th to 130th nucleotides of the read1 sequence perfectly matched the
read2 sequence. The final merged sequence has the nucleotides with the highest quality score for
each sequence. The barcodecreat+universal.py script produces sequence-over.txt, corresponding
to the merged reads, and barcode-over.txt, corresponding to the selected unique barcodes. The
finalversionseq+universal.py script uses these two files to give the mutation analysis of
individual sequences for the sample. To ensure the accuracy of the reading, a consensus step was
performed. For ssDNA with a barcode appearing fewer than 5 times, the nucleotides must match
each other perfectly; for ssDNA with a barcode appearing 5 or more times, the nucleotides must
match each other with an error rate of no more than 20% at each position. To identify
mutations caused by AID, the consensus sequences were aligned with the original sequences. For
further mathematical analysis, mutation sites were identified by motif (trinucleotides) and
exported as “.” for no mutation and “T” for mutated. The results were grouped by the mutations
contained in each motif, allowing us to determine the number of mutations and the mutation
probability for each motif. The two-point correlations between deaminated sites and the
transition rate between motifs were analyzed by Dr. Mak.
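The quality masking, barcode consensus, and mutation export steps described above can be sketched as follows. This is a minimal illustration and not the code in the GitHub repository: all function names are hypothetical, reads sharing a barcode are assumed to be already merged and of equal length, positions that fail the consensus rule are written as “N” (an assumption), “N” is treated like any other symbol during voting, and the export is shown per position rather than per trinucleotide motif.

```python
from collections import Counter
from typing import Sequence

PHRED_OFFSET = 33  # "ASCII-33 scale" in the text
MIN_QUALITY = 32   # threshold stated in the text


def mask_low_quality(seq: str, qual: str, min_q: int = MIN_QUALITY) -> str:
    """Replace every base whose Phred score is below min_q with 'N'."""
    return "".join(
        base if ord(q) - PHRED_OFFSET >= min_q else "N"
        for base, q in zip(seq, qual)
    )


def consensus(reads: Sequence[str], strict_below: int = 5,
              max_error: float = 0.2) -> str:
    """Consensus of equal-length reads sharing one barcode.

    Fewer than `strict_below` reads: every read must agree at a position.
    Otherwise: at most `max_error` of the reads may disagree at a position.
    Positions failing the rule become 'N' (assumption for this sketch).
    """
    n = len(reads)
    allowed = 0.0 if n < strict_below else max_error
    out = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if (n - count) / n <= allowed else "N")
    return "".join(out)


def call_deaminations(consensus_seq: str, reference: str) -> str:
    """Write 'T' where a reference C reads as T, and '.' everywhere else."""
    return "".join(
        "T" if ref == "C" and obs == "T" else "."
        for ref, obs in zip(reference, consensus_seq)
    )


if __name__ == "__main__":
    reads = ["ATGC", "ATGC", "ATGC"]
    print(call_deaminations(consensus(reads), "ACGC"))
```

Keeping the three steps as separate functions makes it straightforward to change one rule, for example the 20% tolerance, without touching the others.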



Chapter 3. In Vitro Affinity Maturation of Camelid Nanobodies Targeting FAAH
3.1 Introduction
For the successful development of Abs with high affinity and specificity for Ags,
vertebrates have evolved their immune system to produce and diversify Abs116. The process that
generates high-affinity Abs is SHM, which occurs in activated B cells and acts
through mutations in the variable region of immunoglobulin genes117,118. SHM is
initiated by AID34, which deaminates cytosine (C to U) preferentially at WRC (W = A/T, R = A/G)
motifs62,79. AID deaminates C to U at a rate of about 10^-3 to 10^-4 per base pair in the V gene,
which is about a million times higher than normal mutation frequencies in somatic cells119,120.
The deaminated C can be processed through several different pathways that end in mutations in
the V gene. The U can be read as T, so that the cell incorporates an A opposite it, resulting
in a C to T transition mutation117. The G•U mismatch can also induce DNA repair in cells by
either long post-replication mismatch repair or base excision repair121. Both processes are
error-prone because a low-fidelity DNA polymerase is recruited to fill in the repair gap,
producing both transition and transversion mutations122. Pol η, which is 80-fold less accurate
for base substitution errors and 140-fold less accurate for indel errors than normal replicative
polymerases, is reported to be highly active in the SHM process123,124. In correspondence with
the enzymes active in the SHM process, the CDR regions of the V genes have evolved to be
enriched in WRC hot motifs125. Studies have shown that AID’s preferential activity on WRC
motifs is preserved on the V gene both in vitro and in vivo126-129.
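To make the WRC definition concrete, the following sketch locates WRC motifs on one strand of a DNA sequence and reports the 0-based position of the target C. The function name and the example sequence are hypothetical and serve only as an illustration; motifs on the complementary strand are not considered here.

```python
import re

# W = A/T, R = A/G, followed by the target C.
# The lookahead allows overlapping motifs to be found.
WRC_PATTERN = re.compile(r"(?=([AT][AG]C))")


def find_wrc_sites(seq: str):
    """Return (0-based index of the target C, motif) for each WRC motif."""
    seq = seq.upper()
    return [(m.start() + 2, m.group(1)) for m in WRC_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical 9-nt sequence containing the motifs AGC and TAC
    print(find_wrc_sites("AGCTACGGC"))
```

Counting such sites separately within CDR and FW coordinates is one simple way to compare the hot-motif density of the two regions.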
Phage display is a technique that allows the expression of exogenous peptides on the
surface of phage particles. The exogenous DNA sequences are inserted into a specific
phage-coating gene. Upon production of phage particles, the peptide encoded by the insert is
exposed on the surface of the phage130. The physical linkage between the inserted gene and the
peptide displayed on the surface has made phage display a powerful tool for Ab selection131. To
date, over 14 clinically approved monoclonal antibodies (mAbs) have been selected and developed
through this technology132. Because the technique is robust, easy to perform, and
cost-effective, different targets are currently being studied with different displayed proteins,
including single-chain variable fragments (scFvs), antigen-binding fragments (Fabs), and
nanobodies (VHHs)133.
Due to the advancement in recombinant antibody engineering techniques, research has
mainly focused on Fabs and scFvs. Recently, a growing number of studies have indicated that
VHHs could be ideal for certain applications, including diagnostics, treatment, and drug
delivery and targeting134. The discovery of the VHH can be traced back to the finding of
heavy-chain-only antibodies (HcAbs) in camelids and sharks. Unlike conventional mAbs,
HcAbs contain only two heavy chains, with a single variable domain, VHH, as the Ag-binding
region135. Although it lacks the light chain region, the VHH compensates by having
a longer CDR3 region, which is considered the most significant region for Ab-Ag interaction134.
The structure of VHHs provides them with many benefits over conventional scFvs and Fabs.
Because it lacks the light chain, a VHH is normally 15 kDa, which is much smaller than scFvs
(30 kDa) and Fabs (50 kDa) (Figure 3.1). This small size gives VHHs unique
access to small epitopes and better penetration abilities136. Because the highly
conserved hydrophobic amino acids between heavy and light chains are replaced with hydrophilic
amino acids, VHHs have higher solubility137. The refolding capacity conferred by their structure
also allows VHHs to tolerate high temperatures, elevated pressure, and non-physiological
pHs134,138.



Figure 3.1 Structure of antibody and antibody fragments. The size of human mAb is normally 150 kDa. The Fab is the variable
domain of the mAb and has a size of around 50kDa. scFv containing only the variable regions of the mAb and linked through a
peptide linker is about 30kDa and VHH derived from the variable domain of heavy-chain-only antibodies found in camelids is
about 15kDa.
Although it is easy to display an antibody on the surface of phages, an effective method
to enhance the binding of displayed Abs is needed for making high-affinity binders. At present,
most phage display libraries are affinity matured using a mutator E. coli strain or an
error-prone PCR process139-141. These approaches often produce an excess of Abs that have low
solubility or that may not be tolerated by the immune system.
Here, we propose to generate high-affinity VHHs by mimicking the B cell SHM process
(Figure 3.2). After performing affinity maturation on naïve llama VHH genes with AID and Pol η
and selecting against the target protein with a phage display system, we expect that VHHs with
higher specificity will be generated. This study serves as a proof of concept for transferring
the in vivo SHM process to an in vitro setting with biochemically purified AID and Pol η. Its
success would indicate a new way to increase Ab affinities by using a repetitive series of
affinity maturation steps in a purified biochemical system.



Figure 3.2 Overview of our in vitro nanobody generation process. Naïve nanobody genes were obtained from non-immunized llamas.
Affinity maturation was performed on the nanobody genes with AID and error-prone polymerase η. The genes were cloned into
phage vectors for panning and screening. After three rounds of panning with increasing stringency, clones showing
high ELISA signals compared to the original clone were identified for nanobody production. Finally, the Ab-Ag affinities
(KD values) of the purified nanobodies were measured by SPR.
To assess the efficacy of the affinity maturation process, human fatty acid amide
hydrolase (FAAH) was used as the target Ag in this study. FAAH is an active enzyme that
degrades anandamide (AEA) and 2-arachidonoylglycerol (2-AG) in the endocannabinoid
signaling pathway142. AEA and 2-AG accumulation activates or enhances the activity of
cannabinoid receptors, leading to both pain relief and inflammation reduction143. Inhibition of
FAAH has shown pain-relieving, anti-inflammatory, and antidepressant effects without
undesirable side effects, indicating FAAH as a potential therapeutic target144. It has also been
suggested that inhibiting the rapid catabolism by FAAH could have beneficial effects on
neurodegenerative conditions including Alzheimer’s and Parkinson’s diseases145. However, the
development of FAAH therapeutics has been hindered by off-target issues146. One recent phase I
study was suspended because of the death of one patient and mild-to-severe neurological
symptoms in four other patients147. Given how attractive FAAH inhibitors could be, one way to
avoid the safety issue is to repurpose drugs that have good safety and tolerability profiles as
well as good affinity to FAAH148. However, to solve the off-target issue at its root, a drug
with high affinity is needed. By making VHHs with better affinities to FAAH, we could not only
establish the effectiveness of our affinity maturation system but also accelerate the
development of FAAH therapeutics.



3.2 Results
3.2.1 AID and Pol η preserved their activities and favored mutation in CDRs.
To examine the affinity maturation activity of AID and Pol η biochemically, we treated
human IGHV3-23*01, a gene often used as an in vivo SHM reporter and chronic lymphocytic
leukemia predictor149,150, with AID and Pol η in a gapped-DNA system. IGHV3-23*01 was
cloned into an M13mp2 phage genome and treated with AID for C to U conversion, resulting in
mutations at C•G sites, and with Pol η for filling the gap while catalyzing misincorporation at
A•T sites, favoring GA motifs. The resulting mutation frequencies are about 1%151,152. The
reacted phage genome was transfected into E. coli, and individual clones were sequenced for
mutation identification. The mutation spectrum was created from data for 150 mutated phage
clones carrying 862 mutations (Figure S3.1). More C to T mutations occurred in the CDR
regions, which are responsible for binding to Ags, because more WRC motifs are located there.
The data aligned with previous findings that AID favors CDR regions and is responsible
for the affinity improvements of Abs126,153. On the other hand, fewer mutations were observed in
the framework (FW) regions. This was expected, as FW regions are often more conserved since
they are responsible for overall structural stabilization16. The mutation spectrum indicated
that our in vitro system favored mutating CDR regions over FW regions while preserving
the in vivo activities of these enzymes. By targeting CDR regions as the favored mutation
region, our system is expected to yield higher-affinity antibodies after several rounds of
affinity maturation than random mutagenesis.



3.2.2 Affinity-matured VHH library showed more avid binders to FAAH
To examine the efficacy of affinity maturation, phage biopannings were performed with a
naïve VHH library and an affinity-matured VHH library. To prevent nonspecific binding, in the
first round of biopanning, two sets of immunotubes were prepared, one coated with 10 µg
of maltose-binding protein (MBP) and the other coated with 10 µg of MBP-FAAH. 10^12
transducing units (TU) of phage from each library were incubated in the MBP-coated
tube for 1 h and then transferred into the MBP-FAAH tube. Phages that have a high affinity to
the immunotube surface or to MBP would remain in the MBP tube. Following washing and elution,
the eluted phages were incubated with K91BK cells for phage titering and amplification. The
same procedure was performed for a total of three biopannings. To make each round more
stringent, the phage input was decreased (from 10^12 to 10^10 TU), the amount of immobilized
MBP-FAAH was decreased (from 10 to 1 µg), the washing buffer pH was lowered (7 to 5), the
MBP-FAAH binding time was shortened (1 h to 30 min), and the MPBS concentration was increased
(2 to 5%). The recovery rate was calculated by dividing the amount of elution output by the
phage input (Figure 3.3). Higher phage recovery was observed for the affinity-matured VHH
library at each biopanning round. Furthermore, after the third round of biopanning,
significantly higher phage recovery (~8-fold) was observed in the affinity-matured VHH library.
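The recovery-rate calculation can be written out as a short sketch. The titers used below are hypothetical placeholder values chosen only to illustrate the arithmetic; they are not data from this study, and the function names are likewise illustrative.

```python
def recovery_rate(output_tu: float, input_tu: float) -> float:
    """Recovery rate = eluted phage (TU) divided by input phage (TU)."""
    if input_tu <= 0:
        raise ValueError("input_tu must be positive")
    return output_tu / input_tu


def fold_difference(matured_rate: float, naive_rate: float) -> float:
    """How many times higher the affinity-matured recovery is than the naïve one."""
    return matured_rate / naive_rate


if __name__ == "__main__":
    # Hypothetical titers for one biopanning round (not measured values)
    naive = recovery_rate(output_tu=1e4, input_tu=1e10)
    matured = recovery_rate(output_tu=8e4, input_tu=1e10)
    print(f"{fold_difference(matured, naive):.1f}-fold")
```

Because the phage input changes between rounds, comparing recovery rates rather than raw output titers keeps the rounds comparable.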



Figure 3.3 Naïve and affinity-matured VHH phage recovery rate. The recovery rate was calculated by dividing the amount of
eluted phage by the amount of input phage. A slightly higher recovery rate was observed with the affinity-matured library for the
1st and 2nd biopannings. Around an 8-fold recovery difference was observed with the 3rd biopanning.
3.2.3 Single clone affinity improvement via AID and Pol η affinity maturation
To better understand how our in vitro affinity maturation system works, 48 phage clones
eluted from the 3rd naïve VHH biopanning were cultured individually and separately incubated
with MBP-coated or MBP-FAAH-coated plates for phage ELISA (Figure 3.4). The ratio between the
MBP-FAAH and MBP signals was calculated to select the clone with the highest affinity towards
FAAH. A3, a clone with high affinity towards MBP-FAAH and low affinity towards MBP, was
isolated from the VHH phage library for AID and Pol η affinity maturation. The affinity-matured
A3 gene was transformed into MC1061 E. coli for phage library production. The newly
produced phage library was subjected to three rounds of biopanning of increasing stringency
against an off-tag FAAH. A phage ELISA was performed for 96 individual clones eluted from the
3rd biopanning (Figure 3.5). The 20 clones with the highest OD450 signal were selected for
sequencing. 5 of them showed amino acid sequences different from the original clone (Figure
3.6A, Figure S3.2).
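The clone selection by ELISA signal ratio can be sketched as below. The OD450 readings are hypothetical placeholders rather than values from Figure 3.4, and the function name is illustrative. The sketch ranks clones by the ratio of the MBP-FAAH signal to the MBP signal.

```python
def rank_by_specificity(od450: dict) -> list:
    """Rank clones by the MBP-FAAH / MBP signal ratio, highest first.

    `od450` maps a clone name to a (MBP signal, MBP-FAAH signal) pair.
    Clones with a non-positive MBP signal are skipped to avoid dividing by zero.
    """
    ratios = {
        clone: faah / mbp
        for clone, (mbp, faah) in od450.items()
        if mbp > 0
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical readings: (MBP, MBP-FAAH)
    readings = {"A1": (0.5, 0.6), "A3": (0.1, 1.2), "B2": (0.2, 0.4)}
    for clone, ratio in rank_by_specificity(readings):
        print(clone, round(ratio, 2))
```

A high ratio combines the two criteria used in the text: a strong MBP-FAAH signal and a weak MBP signal.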
Figure 3.4 Phage screening for a naïve VHH library. 48 phage clones with VHH inserts were cultured individually and bound to
MBP and MBP-FAAH in separate wells. The amount of phage binding was detected by M13 phage Ab with signals that can be
observed at OD450. The x-axis indicates the 48 phage clones, and the left y-axis indicates the OD450 reading with gray bars
representing the MBP binding and black bars representing the MBP-FAAH binding. The ratio between MBP and MBP-FAAH
signals was calculated and represented by the red bar shown on the right y-axis. A phage clone with a high affinity towards
FAAH has a low gray bar, a high black bar, and a high red bar.
Figure 3.5 Phage screening for the A3 affinity-matured VHH library. 96 phage clones with VHH inserts were cultured individually
and bound against empty FAAH wells. All phage clones showed high specificity towards FAAH with little or no non-specific
binding.
Along with the original clone, A3, these 5 VHH peptides were purified for KD
measurements by surface plasmon resonance (SPR). Two VHH clones, F9 and OD1, were
measured to have higher affinity than the parental clone (Figure 3.6B, Table 3.1). Clone
F9 showed slightly increased affinity with one mutation in the FW1 region, and clone OD1
showed about 2.4-fold higher affinity to FAAH with a mutation in the FW3 region.
Figure 3.6 Clones with unique sequences after A3 affinity-matured VHH library selection and screening. A. The amino acid
sequences of the 5 selected clones that differ from A3. The CDR regions are highlighted in gray. Clones B5 and F12 had
mutations in CDR2; F9, E12, and OD1 had mutations in the FW region. B. SPR results of A3, F9, OD1, and F12. With the
same amount of FAAH immobilized, OD1 and F9 showed higher RU than A3, while F12 showed lower RU.
Table 3.1 Kinetic and equilibrium dissociation constants of purified VHHs targeting FAAH.



3.2.4 Affinity improvement with two rounds of affinity maturation
Given the possibility that the affinity limit of A3’s backbone had been reached, another
screening was done with the 3rd biopanning elution from the naïve VHH library, and a weak
binder, C6, was selected to check whether we could obtain continuous affinity improvements with
more rounds of affinity maturation. C6 was first affinity matured using the gapped-DNA method
and then selected and screened. F4, with a mutation on the FW3 and CDR3 region, was selected as
it dominated the 3rd biopanning elution population. The F4 gene was extracted and affinity
matured using the ssDNA method to create more mutations. To ensure that the binding selection
was based on affinity instead of avidity, a monovalent display system was used to display only
one VHH at a time on the phage surface. The biopanning was done with SPR for real-time
binding monitoring, and screening was done for binder selection. One clone, 69, was identified
with one mutation in the CDR2 region (Figure 3.7A, Figure S3.3).
Figure 3.7 Clones with continuous affinity maturation. A. The amino acid sequences of C6, F4, and 69. The CDR regions are
highlighted in gray. F4 had two amino acid changes in the CDR3 region. 69 had an additional amino acid change in CDR2. B. Predicted
protein structures. The structure of FAAH was predicted with Alphafold2. The structures of VHHs C6, F4, and 69 were predicted with
NanoNet. For VHHs, the CDR1 region is colored yellow, the CDR2 region is colored cyan, and the CDR3 region is colored green.
The mutation sites are colored red. C6:F4 represents the alignment between C6 and F4, in which a conformational change can be
observed (black arrow). C6:69 represents the alignment between C6 and 69, in which an additional conformational change can be
observed (black arrow).



Before measuring the KD by SPR, structural predictions were performed using
Alphafold2 and NanoNet. Alphafold2 was used for predicting the structure of FAAH without the
transmembrane domain154. NanoNet was used for predicting the structures of the VHH
nanobodies155 (Figure 3.7B). The predicted structures showed that the amino acid changes would
lead to conformational changes in the CDR regions for both clones F4 and 69. To evaluate the
function of the mutations in the CDR regions, SPR KD measurements were performed (Table 3.2).
Each round of affinity maturation increased the affinity (lowered the KD), ending with a 4-fold
binding enhancement.
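For readers less familiar with SPR output, the relation between the kinetic constants and KD, and the way a fold enhancement is derived from two KD values, can be sketched as follows. The rate constants in the example are hypothetical and are not taken from Table 3.2; the function names are illustrative.

```python
def equilibrium_kd(ka_per_molar_s: float, kd_per_s: float) -> float:
    """KD (M) = dissociation rate constant kd (1/s) divided by
    association rate constant ka (1/(M*s))."""
    if ka_per_molar_s <= 0:
        raise ValueError("ka must be positive")
    return kd_per_s / ka_per_molar_s


def fold_enhancement(kd_parent: float, kd_variant: float) -> float:
    """Affinity enhancement of a variant over its parent.
    A lower KD means tighter binding, so the parent KD is the numerator."""
    return kd_parent / kd_variant


if __name__ == "__main__":
    # Hypothetical constants for a parent and a matured clone
    parent = equilibrium_kd(ka_per_molar_s=1e5, kd_per_s=4e-2)
    variant = equilibrium_kd(ka_per_molar_s=1e5, kd_per_s=1e-2)
    print(f"{fold_enhancement(parent, variant):.1f}-fold")
```

Note that an improvement in affinity corresponds to a decrease in KD, which is why the parental value appears in the numerator.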
Table 3.2 Kinetic and equilibrium dissociation constants of purified C6, F4, and 69 targeting FAAH.
3.2.5 Affinity-matured VHH showed an inhibitory effect in the mouse model
Another VHH target studied is Artemin, a neurotrophic factor involved in cold sensitivity.
Soo Lim Jeong, a previous lab member, isolated one affinity-matured VHH clone, LM52, that had
a KD of ~88 nM against Artemin. We compared the efficacy of LM52 with one of the commercial
mAbs, MAB1085 (R&D Systems), through collaboration with the McKemy Lab (USC
Neuroscience). Both MAB1085 and LM52 were injected at the nape (10 mg/kg body weight) into mice
with CFA-induced cold allodynia. Cold sensitivity was measured over time after injection
by observing withdrawal latencies. An increased latency time implied an attenuation of cold
allodynia and reduced pain. Both MAB1085 and LM52 restored the cold response to
the pre-CFA level (Figure S3.4). Although LM52’s inhibitory effect is not as strong as that of
MAB1085 (shorter effective time), this mouse study revealed that the affinity-matured VHH is
not only specific to its target but also inhibits the target’s enzymatic function.
3.3 Discussion
This is the first reported experiment using combinations of AID and Pol η to affinity
mature VHH genes. Based on our positive results, we have shown that this method can mimic
in vivo affinity maturation without the “hassle” of animal immunization. Along with
the phage display system, we believe this method could have a substantial impact by improving
the efficiency and accelerating the development of Abs for both research and pharmaceutical use.
We first investigated the combined mutational effects of AID and Pol η on the human
IGHV3-23*01 gene. It has been reported that AID preferentially deaminates WRC motifs and
Pol η preferentially mutates WA motifs79,151. The mutation spectrum for IGHV3-23*01 meets
these expectations and matches the characteristics of V-gene SHM. Moreover, because the CDRs
naturally contain more WRC motifs, more mutations accumulate in
the CDRs after in vitro affinity maturation. This observation indicates that using AID and Pol η
to introduce mutations could be naturally suited to V genes, since CDRs are considered the main
contributors to Ab-Ag binding. This can also be seen in the phage recovery rates before and
after affinity maturation. The recovery rate of the affinity-matured library is always higher
than that of the naïve library and is about 8-fold higher following three rounds of
biopanning, which illustrates the effectiveness of the affinity maturation method. Furthermore,
it suggests that the affinity maturation method has a minimal deleterious effect on the phage
library, because the slightly higher recovery rate in the 1st round of biopanning means that
few non-productive or insoluble mutations were introduced.
In our first experiment with a relatively high KD clone, A3, five clones with various
mutations were selected and analyzed after affinity maturation, biopanning, and screening.
Two of them showed enhanced affinity over A3. OD1, with only one Pol η-generated
amino acid change in the FW region, shows a 2.5-fold affinity enhancement. It has been reported
that even though the FW does not directly contribute to Ag binding, it may influence the
dimerization or thermodynamic stability of Abs156,157. We believe that the short CDR3 region
of the parental clone A3 also contributed to the effective mutation occurring in the FW. The
adaptive immune enzymes co-evolved with the immunoglobulin genes to give the highest affinity
enhancement. In a situation where there is little to change in the CDR, the FW region may be
used to increase the Ab affinity. The OD1 mutation was TA to TT, which is a Pol η hot motif.
Again, this shows how our affinity maturation works with V genes.
The adaptive immune system usually uses several rounds of affinity maturation for the
selection of the best Abs. We further examined whether additional rounds of our affinity
maturation method could continue to increase an Ab’s affinity. One of the advantages of VHH is
the long CDR3; however, A3, with a very short CDR3, might not be suitable for this purpose. We
hypothesized that a clone with a longer CDR3 region could evolve into Abs with higher
affinities. The naïve VHH library was screened again, and a weak binder, C6, was selected.
We did one round of affinity maturation followed by the standard protocol of panning and
screening. Clone F4 stood out as the dominant clone in the population and was isolated for the
second round of affinity maturation. To introduce more mutations in VHH genes, the ssDNA
method was adopted, since the gapped-DNA synthesis process creates ssDNA leftovers that
would trap AID and inhibit the deamination process. Instead of a phage vector, a phagemid
vector with a helper phage was used so as to have only one VHH displayed on each phage
particle. This monovalent phage display system could greatly reduce the avidity effect and
enhance the stringency during the selection and screening process. Lastly, real-time analysis
with a Biacore Biosensor was used to ensure optimal washing stringency throughout the
biopanning process. After biopanning and screening, clone 69 was isolated based on its
high ELISA signal.
Computational tools were used for predicting the protein-protein interactions. The
structure of FAAH was predicted with Alphafold2 with high confidence. The structures of
the C6, F4, and 69 nanobodies were predicted with NanoNet, which is designed specifically
for Nbs. In a structural alignment, a conformational change in CDR3 can be seen between C6
and F4. After the second round of affinity maturation, another conformational change in CDR2
can be seen between F4 and 69. The structure predictions indicate possible binding affinity
changes from the affinity maturation process. These predictions were further confirmed by SPR
measurements showing that clone 69 has a 4-fold affinity increase compared to the original
clone, C6. Also, as predicted, F4 showed a slight affinity enhancement compared to C6.
Further experiments with mouse models through collaboration with the McKemy Lab
showed the efficacy of our affinity-matured VHHs. The significant reduction in cold sensitivity
in mice injected with our affinity-matured VHH revealed that these VHHs can not only bind to the
target specifically but also inhibit the target’s enzymatic function.
The positive results from the experiment provide a proof of principle that demonstrates
the efficacy of our novel affinity maturation method. However, only a limited amount of affinity
enhancement has been obtained. Two factors could account for this observation. First, to monitor
the affinity maturation process, we intentionally chose to perform affinity maturation on single
clones so that the mutation sites could be traced easily. However, this limits the potential
to obtain exceptionally high-affinity binders. There might be only a small degree of
improvement in binding affinity because of the way in which the parental clone was selected.
This is likely to be unfavorable for creating highly diversified libraries. Second, only one or
two rounds of affinity maturation were used to diversify the libraries. In the germinal center,
multiple rounds of affinity maturation occur with selection and screening. In the future, we
expect to obtain substantially higher affinity enhancements by performing additional rounds of
selection and affinity maturation. Additionally, one of the obstacles of the current
experimental setup is efficiency. As shown in the data (Table 3.1), only 2 of the 5 selected
VHHs show enhanced affinity in the SPR measurement, owing to the variability associated with
phage ELISA. We have tested several docking programs for predicting the binding between VHHs
and FAAH. Three of them (pyDock, ClusPro2, and HDOCK), which allow different degrees of
flexibility during the docking process, gave affinity predictions that aligned with our
experimental data (Figure S3.5). In future experiments, we can take advantage of these in silico
tools to predict the VHHs with the highest probability of binding and use SPR to confirm the
predictions.



3.4 Materials and methods
Materials
M13mp2 phage, pMALx vector, and the Escherichia coli CSH50 and MC1061 ung- strains are
from the lab stock. AIDv and Pol η were purified by previous lab members as described23,158. We
purchased the pADL20c phagemid vector, CM13 helper phage, and TG1 E. coli strain from
Antibody Design Labs (San Diego, CA). The phage vector f3TR1 and the K91BK E. coli strain were
provided by George P. Smith (University of Missouri, Columbia, MO)159. We purchased the
naïve VHH library in the pADL20c vector, derived from 24 non-immunized llamas, from Abcore Inc
(Ramona, CA).
Human FAAH purification
The DNA sequence of human fatty acid amide hydrolase (FAAH) was obtained from
GenBank (AH007340.2). The FAAH amino acid sequence 30-579 (excluding the N-terminal
transmembrane domain) was cloned into a pMALx expression vector and transformed into the
CSH50 strain for protein purification. The E. coli cells were induced at OD 0.5 with 1 mM IPTG
and shaken at 18℃ overnight. The cells were washed with 10 mM Tris pH 8.5 and 1 M NaCl. The
cell pellet was resuspended in lysis buffer containing 20 mM Tris pH 8.5, 0.1% Triton X-100,
5 mM DTT, 1 M NaCl, and 1 mM PMSF, followed by 8 min of sonication. After centrifuging at
11,000 rpm for 30 min, the supernatant was incubated with Amylose resin (New England Biolabs).
Depending on the need for an MBP tag, Factor Xa Protease was added with 20 mM Tris pH 8.5,
100 mM NaCl, and 2 mM CaCl2 and incubated in the cold room for 3 h. The FAAH or MBP-tagged
FAAH was further cleaned by gel filtration in 20 mM Tris pH 8.5, 1 M NaCl, and 5%
glycerol and stored at -80℃.



Generation of naïve VHH phage library
The llama VHH region was amplified with a forward primer
(TATTACTCGCGGCCCACGCGGCCATGGCT) and a reverse primer
(GGTGATGGTGTTGGCCCAGGGGCTGAGGAGACGGTGAC), each of which contains one Bgl I
cutting site. The amplified product was cut with Bgl I (New England Biolabs) overnight.
Meanwhile, the f3TR1 vector was also cut with Bgl I, followed by dephosphorylation with
Shrimp Alkaline Phosphatase (New England Biolabs). The cut VHH region was ligated into the
f3TR1 vector, followed by transformation via electroporation (0.2 cm gap cuvette, 2.5 kV, 25 µF,
and 400 ohm) into the MC1061 strain. The phage library was generated by overnight culture in
2xYT medium in the presence of tetracycline (20 µg/ml).
Phage library purification
The cell culture containing phage was shaken gently with 15% volume of PEG/NaCl
solution (16.7% PEG 8000 MW, Sigma, 3.3M NaCl) for 4 hours in the cold room. The phage
was isolated by centrifugation at 10,000 rpm for 10 min. Precipitated phages were dissolved in
5ml TBS. The insoluble material was eliminated through centrifuging at 10,000 rpm for 10 min.
A second phage purification was performed with PEG/NaCl precipitation, and the phage pellet
was dissolved in TBS buffer with 50% glycerol and stored at -20℃.
Affinity maturation via gapped-DNA method
A purified phage solution was mixed with TE buffer at a 1:1 volume ratio. The phage
ssDNA was extracted twice with phenol:chloroform:isoamyl alcohol (25:24:1). The
supernatant was mixed with 1/10 volume of P3 and 2 volumes of 200 proof ethanol
and incubated at -20℃ for 30 min. The DNA was pelleted by centrifuging at 16,000 rpm
for 10 min at 4℃. The pellet was washed with 70% ethanol, centrifuged for 10 min, and vacuum
dried for 15 min. The phage ssDNA was obtained by dissolving the pellet in TE buffer.
Meanwhile, a normal f3TR1 vector was cut with Bgl I restriction enzyme and purified. To
generate the gapped DNA, 1 µg of cut f3TR1 vector was heated for 4 min at 70℃ in 50 µl water.
0.5 µg of phage ssDNA was added and incubated with the cut f3TR1 for 1 min at 70℃. The
mixture was then placed on ice for 5 min, and SSC was added to a final 2x concentration
(300 mM NaCl, 30 mM sodium citrate). The mixture was then incubated at 60℃ for 5 min and
incubated on ice for 30 min. The gapped DNA was then purified with 1/10 TE buffer and an Amicon
Ultra-0.5 10 kDa centrifugal filter unit (Millipore). The presence of gapped DNA was then
validated using agarose gel electrophoresis.
To perform the affinity maturation experiment, the gapped DNA (500 ng) was treated
with 1 µM AIDv for 30 s, 1 min, 2 min, 5 min, 10 min, 20 min, and 1 h at 37℃. The reaction was
quenched by phenol:chloroform:isoamyl alcohol extraction followed by a DNA cleanup column (New
England Biolabs). The mutated gapped DNA was further treated with 0.65 µM Pol η (in reaction
buffer: 40 mM Tris pH 8, 50 mM NaCl, 2.5% glycerol, 10 mM dithiothreitol, 2.5 mM MgCl2, 100 µM
dNTPs) at 37℃ for 2 h to fill in the gap. The gap-filling reaction was then quenched by
phenol:chloroform:isoamyl alcohol extraction, and the DNA was desalted before transformation
into the MC1061 strain for phage production. Normally, 10^11 TU/ml phage can be obtained from
3.5 µg of ssDNA input.
Affinity maturation using the ssDNA method
Phage ssDNA was extracted as described above. 0.5 ng of phage ssDNA was treated with
AID and purified in the same way as described in the gapped-DNA workflow; 50-60% of the ssDNA
was retained after the column purification. The AID-treated ssDNA was annealed to the LpAtof3-R
primer (GGTGATGGTGTTGGCCCCAGGGGCTGAGGAGACGGTGAC) by heating at 95℃
for 2 min and cooling on ice for 5 min. 0.35 µM Pol η was added along with the reaction
buffer to reach a total volume of 100 µl. The reaction was run at 37℃ for 4 hr. The VHH region was
further amplified with the LpAtof3-R primer and the LpAtof3-F primer
(TATTACTCGCGGCCCACGCGGCCATGGCT) using premixed Taq polymerase (Promega
Corporation) for 25 cycles. The amplified product was purified by
phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation. Both the purified VHH
gene and the pADL20c vector were cut with Bgl I as described above, and the pADL20c vector was
dephosphorylated to prevent self-ligation. The ligation was done with T4 ligase, and the product
was transformed into TG1 cells for phage production. With 7 µg of insert, 10^11 TU/ml phage was
obtained, 50% of which were mutated.
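As a quick consistency check on the cloning design, the two primer sequences quoted above can be
scanned for the Bgl I recognition sequence (GCCNNNNNGGC, where N is any base). The following is
a minimal Python sketch; the function name find_bgli_sites is hypothetical and introduced only
for illustration, and it is not part of the workflow used in this study.

```python
import re

# Bgl I recognition sequence: GCCNNNNNGGC (N = any base). The pattern reads the
# same on both strands, so scanning one strand is sufficient. The lookahead lets
# overlapping sites be reported as well.
BGLI_SITE = re.compile(r"(?=(GCC[ACGT]{5}GGC))")

# Primer sequences exactly as quoted in the text.
PRIMERS = {
    "LpAtof3-F": "TATTACTCGCGGCCCACGCGGCCATGGCT",
    "LpAtof3-R": "GGTGATGGTGTTGGCCCCAGGGGCTGAGGAGACGGTGAC",
}


def find_bgli_sites(seq):
    """Return (0-based start, matched site) for every Bgl I site in seq."""
    return [(m.start(), m.group(1)) for m in BGLI_SITE.finditer(seq.upper())]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", find_bgli_sites(seq))
```

Each primer carries one Bgl I site, which is consistent with cutting the amplified VHH gene with
Bgl I before ligation into the vector.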
Phage production with helper phage
The transformed TG1 cells were grown on ampicillin plates at 37℃ overnight to select
for cells containing the phagemid. The colonies were collected in fresh 2xYT medium
with 0.1% w/v glucose to prevent catabolite repression. The freshly collected cultures were
diluted to an OD600 of around 0.2 and grown to an OD600 between 0.4 and 0.5 for
the helper phage transduction. 1 µl of CM13 helper phage was added per 1 ml of culture to reach a
multiplicity of infection (MOI) of about 14 (that is, about 14 helper phage particles per cell). The
culture was then incubated at 37℃ and 250 rpm for 1 hr to allow helper phage attachment and
infection to take place. Ampicillin (100 µg/ml) and kanamycin (50 µg/ml) were then added to
select for cells carrying both the phagemid and the helper phage. The temperature was reduced to
30℃ and the culture was incubated overnight for viral replication and assembly. The phage
accumulates in the culture supernatant and can be purified as described above.
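The MOI quoted above is the ratio of helper phage particles added to cells present. The short
Python sketch below shows the arithmetic. The helper phage titer and the conversion from OD600
to cell density are assumed values chosen only for illustration; neither is stated in the text.

```python
def multiplicity_of_infection(helper_titer_per_ml, helper_volume_ml,
                              cell_density_per_ml, culture_volume_ml):
    """MOI = helper phage particles added / cells in the culture."""
    phage_added = helper_titer_per_ml * helper_volume_ml
    cells_present = cell_density_per_ml * culture_volume_ml
    if cells_present <= 0:
        raise ValueError("culture must contain cells")
    return phage_added / cells_present


# Assumed, illustrative numbers (not taken from the text):
CELLS_PER_ML_PER_OD600 = 8e8   # rough rule of thumb for E. coli
HELPER_TITER_PER_ML = 5e12     # hypothetical helper phage stock titer

od600 = 0.45                   # within the 0.4 to 0.5 window used above
moi = multiplicity_of_infection(
    helper_titer_per_ml=HELPER_TITER_PER_ML,
    helper_volume_ml=0.001,    # 1 µl of helper phage ...
    cell_density_per_ml=od600 * CELLS_PER_ML_PER_OD600,
    culture_volume_ml=1.0,     # ... per 1 ml of culture
)
print(f"MOI is about {moi:.1f}")
```

Under these assumed values the MOI comes out close to 14; with a different stock titer the
helper phage volume would need to be scaled accordingly.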
Nanobody selection via bio-panning through immunotubes
For the f3TR1 phage, MaxiSorp Nunc-immunotubes (Thermo Scientific) were used
for the phage biopanning. The immunotubes were washed with phosphate-buffered saline (PBS,
pH 7.4) before FAAH attachment. 1 ml of FAAH (1-10 µg) in PBS was added to the
immunotubes, and the hydrophilic interaction was allowed to take place overnight at 4℃. On the
next day, the coated immunotubes were washed with TPBS (PBS with 0.1% Tween) and PBS to
eliminate nonspecific binding. Both the phage and the coated immunotubes were blocked with MPBS
(2-5% w/v nonfat dry milk in PBS) at room temperature for 1-2 hr. The blocking solution in the
coated immunotubes was removed and replaced with the pre-blocked phages for binding at room
temperature for 1-2 hours with gentle shaking. Unbound phage and weak-binding phage were
eliminated by washing 9 times with TPBS and 2 times with PBS at pH ranging from 5 to 7.
Elution was performed by incubating with 1 ml of 1 µg/ml trypsin in PBS for 30 min at
room temperature. The supernatant was recovered for both phage titering and amplification.
Serial dilutions of the eluted phage were used to infect mid-log phase K91BK cells. The
infection was done by mixing the eluted phage with the cells and incubating at 37℃ for
30 min. The cells were plated on tetracycline plates and grown overnight at 37℃. The number of
eluted phages can be calculated by counting the colonies on the plate and multiplying by
the dilution factor. The eluted phages were amplified by infecting 25 ml of mid-log phase K91BK
cells with 500 µl of eluted phage at 37℃ for 30 min. 75 ml of 2xYT medium was then added and
the tetracycline concentration was adjusted to 20 µg/ml for overnight amplification. The
amplified phages were harvested and purified as described above and used for the next round of
biopanning. Three rounds of biopanning were performed with decreasing FAAH, decreasing phage
library input, decreasing binding time, decreasing washing buffer pH, and increasing nonfat milk
concentration to create more stringent conditions that select for nanobodies with high affinity
to FAAH.
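The titer calculation described above (colony count multiplied by the dilution factor) can be
written as a short Python sketch. The function names and the example colony count, dilution, and
plated volume are hypothetical; only the 1 ml elution volume comes from the protocol.

```python
import math


def phage_titer(colonies, dilution_factor, plated_volume_ml):
    """Transducing units (TU) per ml of the undiluted eluate.

    colonies          colonies counted on the tetracycline plate
    dilution_factor   fold dilution of the eluate (e.g. 1e4 for a 10^-4 dilution)
    plated_volume_ml  volume of diluted phage used for the infection that was plated
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def total_eluted_phage(titer_per_ml, eluate_volume_ml):
    """Total TU recovered in the whole eluate."""
    return titer_per_ml * eluate_volume_ml


# Hypothetical example: 42 colonies from 0.1 ml of a 10^4-fold dilution.
titer = phage_titer(colonies=42, dilution_factor=1e4, plated_volume_ml=0.1)
total = total_eluted_phage(titer, eluate_volume_ml=1.0)  # 1 ml trypsin elution
print(f"titer: {titer:.2e} TU/ml, total eluted: {total:.2e} TU")
```

Dividing the total eluted phage by the input phage of the same round gives a recovery ratio that
can be compared across the three rounds of biopanning.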
Nanobody selection via bio-panning through Biacore Biosensor
This protocol is adapted to monitor the binding and dissociation events between FAAH
and phage in real time160. The C1 chip (Cytiva) was activated with a 1:1 mixture of
1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) and N-hydroxysuccinimide (NHS) (Cytiva).
FAAH was diluted in 10 mM sodium acetate buffer pH 4 and immobilized on the chip until
reaching 300 RU. The surface was deactivated by adding 1 M ethanolamine and
washed with at least 50 µl PBS until the baseline stabilized. The pADL20c phage library was
added at 5 µl/min. To eliminate unbound and weak-binding phage, buffer washes were performed as
described above. The wash solution was collected when the RU level decreased to 100 RU
above the initial baseline. The final selection was done with 100 mM triethylamine buffer, and
the eluate was neutralized 1:1 with 1 M Tris-HCl pH 7.4. 500 µl of mid-log phase TG1 cells were
then infected with the eluted phages and plated on ampicillin plates overnight for phage amount
estimation and amplification with helper phages.
Nanobody screening using phage ELISA
Individual colonies were picked from the antibiotic selection plate after biopanning and
transferred into deep-well 96-well plates containing 1.5 ml 2xYT medium with antibiotic for
overnight phage production with shaking at 37℃ and 200 rpm. Meanwhile, a 96-well MaxiSorp plate
was washed 2 times with PBS and coated with FAAH at around 1 µg protein per well overnight at
4℃. The washing was performed with an AquaMax 2000 plate washer (Molecular Devices, LLC).
On the next day, the FAAH-coated plates were washed 2 times with TPBS and PBS before blocking.
The plates were blocked with 2% MPBS for 2 hr at room temperature. Meanwhile, the 96-well
culture plate was centrifuged at 3000 rpm for 20 minutes. 50 µl of each phage culture supernatant
was blocked with 4% MPBS at a 1:1 ratio on a new 96-well plate for 2 hr at room temperature.
The MPBS on the FAAH-coated plates was poured off, and the blocked phages were added to
the coated plate and incubated for 1.5 hours. The plates were then washed three times with TPBS
and PBS. 100 µl of 1:4000 diluted anti-M13-HRP (Sino Biological US, Inc) in 2% MPBS was
added to each well for 1 hr. The plate was then washed 4 times with TPBS and PBS
before adding the TMB solution (Thermo Fisher Scientific) for 5 min of color development. The
reaction was stopped with 0.2 M sulfuric acid, and the amount of bound phage was measured as the
absorbance at 450 nm (OD450) on a SpectraMax iD5 microplate reader (Molecular Devices, LLC).
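One simple way to turn the OD450 readings into a list of candidate binders is to compare each
well against background wells. The Python sketch below assumes a signal-to-background cutoff of
3-fold and a plate layout with dedicated blank wells; the cutoff, the layout, and the example
readings are illustrative assumptions and are not specified in this protocol.

```python
def call_elisa_hits(od450_by_well, blank_wells, fold_threshold=3.0):
    """Return {well: signal/background} for wells at or above the fold threshold.

    od450_by_well   mapping of well name to OD450 reading
    blank_wells     wells that received no phage and define the background
    fold_threshold  assumed cutoff for calling a positive clone
    """
    blanks = [od450_by_well[well] for well in blank_wells]
    if not blanks:
        raise ValueError("at least one blank well is required")
    background = sum(blanks) / len(blanks)

    hits = {}
    for well, reading in od450_by_well.items():
        if well in blank_wells:
            continue
        ratio = reading / background
        if ratio >= fold_threshold:
            hits[well] = ratio
    return hits


# Hypothetical readings: A1 and A2 are blanks, B1 to B3 are phage clones.
readings = {"A1": 0.08, "A2": 0.12, "B1": 1.50, "B2": 0.20, "B3": 0.45}
hits = call_elisa_hits(readings, blank_wells={"A1", "A2"})
print(sorted(hits))
```

Clones passing the cutoff would then be sequenced and carried forward to expression and
purification.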
Expression and purification of VHH nanobodies
To express the VHH nanobodies, the VHH genes were subcloned into the pADL20c
vector using Bgl I restriction digestion and T4 ligase. The constructs were transformed
into E. coli CSH50 cells and grown at 37℃ in LB with 0.2% glucose and 100 µg/ml ampicillin.
The culture was induced at an OD600 of around 0.7 with 1 mM IPTG. The nanobody expression
was done at 30℃ overnight. The cells were harvested by centrifugation at 3000 g for 20 min at
4℃. The cell pellets were gently resuspended in 1 ml TSE buffer (200 mM Tris-HCl pH 8, 500
mM sucrose, and 1 mM EDTA) with a complete Protease Inhibitor Cocktail (Roche). The
mixture was incubated on ice for 30 min and centrifuged at 16000 g for 30 min at 4℃. The
supernatant was collected and passed through 1 ml Ni-NTA resin (Qiagen). The resin was then
washed with PBS and eluted with PBS containing 300 mM imidazole. The VHH was further purified
by gel filtration using a Superdex 75 column and stored in PBS. For long-term storage, 10%
glycerol was added and the nanobodies were stored at -80℃.
Affinity measurement via Surface plasmon resonance
The equilibrium dissociation constant (KD) was measured using a Biacore T200. The
FAAH protein was diluted in 10 mM sodium acetate pH 4.5 to make it positively charged. The
positively charged protein was flowed over and immobilized on the CM5 sensor chip via
amine coupling as described before to reach a level between 150 and 250 RU. The VHH nanobodies
were diluted to concentrations ranging from 10 to 500 nM in flow buffer (PBS with 0.005%
Tween) and injected onto the chip at 30 µl/min for 120 s. The chip was regenerated with 7 mM
NaOH between injections of the different nanobodies and concentrations. The KD was calculated by
the Biacore software by fitting the sensorgrams with a 1:1 binding model.
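For orientation, the 1:1 (Langmuir) binding model relates the sensorgram to an association rate
constant, a dissociation rate constant, and their ratio KD. The Python sketch below simulates the
model; it is not the Biacore software's fitting routine, and the rate constants and Rmax are
assumed values chosen only to illustrate the equations. The 10 to 500 nM concentration range and
the 120 s injection come from the protocol above.

```python
import math


def dissociation_constant(k_on, k_off):
    """KD (M) = k_off (1/s) / k_on (1/(M*s)) for a 1:1 interaction."""
    return k_off / k_on


def equilibrium_response(conc_molar, kd_molar, rmax_ru):
    """Steady-state response: Req = Rmax * C / (C + KD)."""
    return rmax_ru * conc_molar / (conc_molar + kd_molar)


def association_response(t_s, conc_molar, k_on, k_off, rmax_ru):
    """Response during injection, starting from R(0) = 0.

    Solves dR/dt = k_on * C * (Rmax - R) - k_off * R.
    """
    k_obs = k_on * conc_molar + k_off
    r_eq = rmax_ru * k_on * conc_molar / k_obs
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


def dissociation_response(t_s, r_start_ru, k_off):
    """Response after the injection ends: R(t) = R0 * exp(-k_off * t)."""
    return r_start_ru * math.exp(-k_off * t_s)


# Assumed, illustrative parameters (not measured values from this study).
K_ON = 1e5      # 1/(M*s)
K_OFF = 1e-3    # 1/s
RMAX = 100.0    # RU
KD = dissociation_constant(K_ON, K_OFF)  # 1e-8 M, i.e. 10 nM

for conc_nm in (10, 50, 100, 500):
    conc = conc_nm * 1e-9
    r_120 = association_response(120.0, conc, K_ON, K_OFF, RMAX)
    print(f"{conc_nm:>4} nM: R(120 s) = {r_120:6.2f} RU, "
          f"Req = {equilibrium_response(conc, KD, RMAX):6.2f} RU")
```

At a concentration equal to KD the steady-state response is half of Rmax, which is why the
injected concentrations should bracket the expected KD.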
Cold plantar assay on mice
The cold plantar assay was performed by Shanni Yamaki and Chenyu Yang in the
McKemy Lab (Department of Biological Sciences, Neuroscience, University of Southern
California, CA). Inflammatory cold allodynia was induced in mice by unilateral intraplantar hind
paw injection of Complete Freund’s adjuvant (CFA; 20 µl; Sigma). A behavior test was performed
two days post-injection using a previously described protocol161,162. Mice were given a 2-hour
acclimation period in a Plexiglas chamber before undergoing cold plantar testing. This testing
was conducted on a glass surface maintained at 30°C. A compressed powdered dry ice pellet was
placed under the glass surface beneath the hind paw being tested, and withdrawal latencies were
recorded. Each time point included three trials per paw.
Computational analysis
All DNA and amino acid sequence analyses were done using Benchling (Kabat
database for CDR recognition). The FAAH structure was predicted by AlphaFold2 using
ChimeraX163. The structures of the VHH nanobodies were predicted by NanoNet on its Colab website.
Docking and affinity prediction were performed using ClusPro2 through Boston University
and Stony Brook University; pyDock using the Barcelona Supercomputing Center; and HDOCK
through the School of Physics, Huazhong University of Science and Technology.
Chapter 4. Conclusions
AID was identified over 20 years ago. Since then, the scientific community has made
significant progress in understanding this protein. Through extensive research, AID’s critical role
in antibody generation and its biochemical activity have been revealed. Building on this
understanding of AID’s catalytic properties, a novel technology using AID as a central enzyme
has emerged and advanced. Leveraging AID’s intrinsic deamination activity, cytosine deaminase
editors have been developed and refined. These gene editing tools offer a highly
efficient and precise way to modify genomes across diverse organisms, promising transformative
advances not only in biological research but also in real-world applications. For
example, Target-AID was employed by a group of researchers from the University of
Tsukuba to introduce multiple-gene substitutions in tomatoes to improve yield, shelf-life,
nutrient content, and disease resistance. Using AID as a core enzyme, the team was able to
introduce allelic variations in multiple targets in a single line and a single generation164. The
success of Target-AID not only demonstrates that AID is an effective genetic editing tool but also
offers a new way to study AID as an applied research tool.
In this study, building on the understanding of AID’s essential role in SHM, we developed a
novel nanobody generation method that enables nanobody diversification in a test tube. By
mimicking the antibody diversification process in the B cell using AID and Pol η, we
successfully enhanced the affinity of a nanobody against its target antigen in an in vitro setting.
We have shown that our in vitro affinity maturation process can continuously improve the
binding of the nanobodies. In its current state, the system is not perfect and many
improvements can be made. The efficiency of both the nanobody gene diversification and the
selection processes needs to be improved. After AID-Pol η affinity maturation, a large
proportion of nanobody genes remains unedited. This large unedited population can compete with
the affinity-matured nanobodies and interfere with the selection process. To enhance the
deamination efficiency, a more catalytically active AID can be used. Currently, the system
includes only AID and Pol η as mutators; however, in SHM, Pol ζ, Pol ι, Pol θ, and Pol κ are
upregulated as well165. By including other error-prone polymerases, more mutation patterns and
events can be obtained. With the advancement of computational biology, computational tools can
be adapted as well to provide in silico predictions of binding and enhance the efficiency of
nanobody development. Additional characterization of the isolated nanobodies is needed as well.
In our method, the specificity of the nanobody is not assessed. Western blotting can be adapted to
assess the specificity of the nanobodies166. This is the first developed assay based on AID’s
essential role in the immune system. We believe the establishment of this assay can benefit the
development of antibodies in both research and clinical fields.
We also aimed to find an application of AID based on its innate biochemical activity. From
the previous study, we were able to describe AID’s activity on ssDNA using a mathematical
model. We proposed to use this model to solve a mathematical problem. For a
mathematical calculation, both the model and the activity of AID must be accurate and consistent.
Currently, we are still optimizing our model and experimental process to achieve consistent
AID reactions on ssDNA. Even though the original goal of this project has not been achieved,
a new NGS-based assay for accurately detecting AID’s deamination was developed during the
work. Meanwhile, additional insights into how neighboring nucleotides affect
AID’s activity have been revealed. This finding agrees with others’ findings that AID’s
deamination preferences can be influenced by a broader range of surrounding sequences beyond
the two upstream nucleotides106. At present, the reason behind this observation remains unclear.
More experiments from different angles could help answer this question.
For example, a smFRET assay can indicate whether the change in deamination activity
arises from AID’s catalysis or its scanning activity. With a more comprehensive understanding of
AID’s activity in vitro, a better mathematical model might be formulated to describe and predict
AID’s activity. Success in using AID as a computational tool could bring new
insights into the direction and application of protein study.
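To make the notion of sequence context concrete, the Python sketch below lists every cytosine in
an ssDNA sequence together with its flanking nucleotides and flags whether the two upstream
bases form the canonical WRC hotspot motif (W = A or T, R = A or G). The function name, the
flank size, and the example sequence are hypothetical; this is an illustration only and is not
the NGS analysis pipeline or the mathematical model developed in this work.

```python
W_BASES = {"A", "T"}  # W: weak bases
R_BASES = {"A", "G"}  # R: purines


def cytosine_contexts(seq, flank=3):
    """List (position, context, is_wrc_hotspot) for each C with two upstream bases.

    position        0-based index of the cytosine
    context         the cytosine with up to `flank` bases on each side
    is_wrc_hotspot  True when the two upstream bases match W then R
    """
    seq = seq.upper()
    records = []
    for i, base in enumerate(seq):
        if base != "C" or i < 2:
            continue  # cytosines without two upstream bases cannot be classified
        is_hotspot = seq[i - 2] in W_BASES and seq[i - 1] in R_BASES
        context = seq[max(0, i - flank): i + flank + 1]
        records.append((i, context, is_hotspot))
    return records


# Hypothetical example sequence.
for position, context, is_hotspot in cytosine_contexts("AGCTACGGC"):
    label = "WRC hotspot" if is_hotspot else "non-hotspot"
    print(position, context, label)
```

Widening the flank parameter is one simple way to tabulate deamination frequencies against a
broader sequence context than the two upstream nucleotides.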
References
1 Liu, M. C. et al. AID/APOBEC-like cytidine deaminases are ancient innate immune mediators in
invertebrates. Nat Commun 9, 1948 (2018). https://doi.org/10.1038/s41467-018-04273-x
2 Maul, R. W. & Gearhart, P. J. AID and somatic hypermutation. Adv Immunol 105, 159-191 (2010).
https://doi.org/10.1016/s0065-2776(10)05006-6
3 Stavrou, S. & Ross, S. R. APOBEC3 Proteins in Viral Immunity. J Immunol 195, 4565-4570 (2015).
https://doi.org/10.4049/jimmunol.1501504
4 Harjanto, D. et al. RNA editing generates cellular subsets with diverse sequence within
populations. Nat Commun 7, 12145 (2016). https://doi.org/10.1038/ncomms12145
5 Navaratnam, N. et al. The p27 catalytic subunit of the apolipoprotein B mRNA editing enzyme is
a cytidine deaminase. J Biol Chem 268, 20709-20712 (1993).
6 Wong, L., Vizeacoumar, F. S., Vizeacoumar, F. J. & Chelico, L. APOBEC1 cytosine deaminase
activity on single-stranded DNA is suppressed by replication protein A. Nucleic Acids Res 49, 322-
339 (2021). https://doi.org/10.1093/nar/gkaa1201
7 Lorenzo, J. P. et al. APOBEC2 safeguards skeletal muscle cell fate through binding chromatin and
regulating transcription of non-muscle genes during myoblast differentiation. Proc Natl Acad Sci
U S A 121, e2312330121 (2024). https://doi.org/10.1073/pnas.2312330121
8 Narvaiza, I. et al. Deaminase-independent inhibition of parvoviruses by the APOBEC3A cytidine
deaminase. PLoS Pathog 5, e1000439 (2009). https://doi.org/10.1371/journal.ppat.1000439
9 Zhang, Y., Chen, X., Cao, Y. & Yang, Z. Roles of APOBEC3 in hepatitis B virus (HBV) infection and
hepatocarcinogenesis. Bioengineered 12, 2074-2086 (2021).
https://doi.org/10.1080/21655979.2021.1931640
10 Cheng, A. Z. et al. APOBECs and Herpesviruses. Viruses 13 (2021).
https://doi.org/10.3390/v13030390
11 Malim, M. H. APOBEC proteins and intrinsic resistance to HIV-1 infection. Philos Trans R Soc Lond
B Biol Sci 364, 675-687 (2009). https://doi.org/10.1098/rstb.2008.0185
12 Jarmuz, A. et al. An anthropoid-specific locus of orphan C to U RNA-editing enzymes on
chromosome 22. Genomics 79, 285-296 (2002). https://doi.org/10.1006/geno.2002.6718
13 Harris, R. S. & Liddament, M. T. Retroviral restriction by APOBEC proteins. Nat Rev Immunol 4,
868-877 (2004). https://doi.org/10.1038/nri1489
14 Shi, M. et al. Characterization and functional analysis of chicken APOBEC4. Dev Comp Immunol
106, 103631 (2020). https://doi.org/10.1016/j.dci.2020.103631
15 Lossos, I. S., Levy, R. & Alizadeh, A. A. AID is expressed in germinal center B-cell-like and
activated B-cell-like diffuse large-cell lymphomas and is not correlated with intraclonal
heterogeneity. Leukemia 18, 1775-1779 (2004). https://doi.org/10.1038/sj.leu.2403488
16 Goodman, M. F., Scharff, M. D. & Romesberg, F. E. AID-initiated purposeful mutations in
immunoglobulin genes. Adv Immunol 94, 127-155 (2007).
https://doi.org/10.1016/s0065-2776(06)94005-x
17 Riva, G. et al. HPV Meets APOBEC: New Players in Head and Neck Cancer. Int J Mol Sci 22 (2021).
https://doi.org/10.3390/ijms22031402
18 Navaratnam, N. & Sarwar, R. An overview of cytidine deaminases. Int J Hematol 83, 195-200
(2006). https://doi.org/10.1532/ijh97.06032
19 Conticello, S. G. The AID/APOBEC family of nucleic acid mutators. Genome Biol 9, 229 (2008).
https://doi.org/10.1186/gb-2008-9-6-229
20 Salter, J. D., Bennett, R. P. & Smith, H. C. The APOBEC Protein Family: United by Structure,
Divergent in Function. Trends Biochem Sci 41, 578-594 (2016).
https://doi.org/10.1016/j.tibs.2016.05.001
21 Blanc, V. et al. Apobec1 complementation factor (A1CF) and RBM47 interact in tissue-specific
regulation of C to U RNA editing in mouse intestine and liver. Rna 25, 70-81 (2019).
https://doi.org/10.1261/rna.068395.118
22 Mondal, S., Begum, N. A., Hu, W. & Honjo, T. Functional requirements of AID's higher order
structures and their interaction with RNA-binding proteins. Proc Natl Acad Sci U S A 113, E1545-
1554 (2016). https://doi.org/10.1073/pnas.1601678113
23 Pham, P. et al. Structural analysis of the activation-induced deoxycytidine deaminase required in
immunoglobulin diversification. DNA Repair (Amst) 43, 48-56 (2016).
https://doi.org/10.1016/j.dnarep.2016.05.029
24 Qiao, Q. et al. AID Recognizes Structured DNA for Class Switch Recombination. Mol Cell 67, 361-
373.e364 (2017). https://doi.org/10.1016/j.molcel.2017.06.034
25 Krzysiak, T. C., Jung, J., Thompson, J., Baker, D. & Gronenborn, A. M. APOBEC2 is a monomer in
solution: implications for APOBEC3G models. Biochemistry 51, 2008-2017 (2012).
https://doi.org/10.1021/bi300021s
26 Li, J. et al. APOBEC3 multimerization correlates with HIV-1 packaging and restriction activity in
living cells. J Mol Biol 426, 1296-1307 (2014). https://doi.org/10.1016/j.jmb.2013.12.014
27 Harris, R. S., Petersen-Mahrt, S. K. & Neuberger, M. S. RNA editing enzyme APOBEC1 and some
of its homologs can act as DNA mutators. Mol Cell 10, 1247-1253 (2002).
https://doi.org/10.1016/s1097-2765(02)00742-6
28 Chelico, L., Prochnow, C., Erie, D. A., Chen, X. S. & Goodman, M. F. Structural model for
deoxycytidine deamination mechanisms of the HIV-1 inactivation enzyme APOBEC3G. J Biol
Chem 285, 16195-16205 (2010). https://doi.org/10.1074/jbc.M110.107987
29 Pham, P., Landolph, A., Mendez, C., Li, N. & Goodman, M. F. A biochemical analysis linking
APOBEC3A to disparate HIV-1 restriction and skin cancer. J Biol Chem 288, 29294-29304 (2013).
https://doi.org/10.1074/jbc.M113.504175
30 Chi, X., Li, Y. & Qiu, X. V(D)J recombination, somatic hypermutation and class switch
recombination of immunoglobulins: mechanism and regulation. Immunology 160, 233-247
(2020). https://doi.org/10.1111/imm.13176
31 Teng, G. & Papavasiliou, F. N. Immunoglobulin somatic hypermutation. Annu Rev Genet 41, 107-
120 (2007). https://doi.org/10.1146/annurev.genet.41.110306.130340
32 Chaudhuri, J. & Alt, F. W. Class-switch recombination: interplay of transcription, DNA
deamination and DNA repair. Nat Rev Immunol 4, 541-552 (2004).
https://doi.org/10.1038/nri1395
33 Muramatsu, M. et al. Specific expression of activation-induced cytidine deaminase (AID), a novel
member of the RNA-editing deaminase family in germinal center B cells. J Biol Chem 274, 18470-
18476 (1999). https://doi.org/10.1074/jbc.274.26.18470
34 Muramatsu, M. et al. Class switch recombination and hypermutation require activation-induced
cytidine deaminase (AID), a potential RNA editing enzyme. Cell 102, 553-563 (2000).
https://doi.org/10.1016/s0092-8674(00)00078-7
35 Revy, P. et al. Activation-induced cytidine deaminase (AID) deficiency causes the autosomal
recessive form of the Hyper-IgM syndrome (HIGM2). Cell 102, 565-575 (2000).
https://doi.org/10.1016/s0092-8674(00)00079-9
36 Jiao, J., Lv, Z., Wang, Y., Fan, L. & Yang, A. The off-target effects of AID in carcinogenesis. Front
Immunol 14, 1221528 (2023). https://doi.org/10.3389/fimmu.2023.1221528
37 Yu, K. AID function in somatic hypermutation and class switch recombination. Acta Biochim
Biophys Sin (Shanghai) 54, 759-766 (2022). https://doi.org/10.3724/abbs.2022070
38 Heltzel, J. M. H. & Gearhart, P. J. What Targets Somatic Hypermutation to the Immunoglobulin
Loci? Viral Immunol 33, 277-281 (2020). https://doi.org/10.1089/vim.2019.0149
39 Odegard, V. H. & Schatz, D. G. Targeting of somatic hypermutation. Nat Rev Immunol 6, 573-583
(2006). https://doi.org/10.1038/nri1896
40 Olave, M. C. & Graham, R. P. Mismatch repair deficiency: The what, how and why it is important.
Genes Chromosomes Cancer 61, 314-321 (2022). https://doi.org/10.1002/gcc.23015
41 Martin, A. & Scharff, M. D. Somatic hypermutation of the AID transgene in B and non-B cells.
Proc Natl Acad Sci U S A 99, 12304-12308 (2002). https://doi.org/10.1073/pnas.192442899
42 Schrader, C. E., Linehan, E. K., Mochegova, S. N., Woodland, R. T. & Stavnezer, J. Inducible DNA
breaks in Ig S regions are dependent on AID and UNG. J Exp Med 202, 561-568 (2005).
https://doi.org/10.1084/jem.20050872
43 Duquette, M. L., Handa, P., Vincent, J. A., Taylor, A. F. & Maizels, N. Intracellular transcription of
G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev 18,
1618-1629 (2004). https://doi.org/10.1101/gad.1200804
44 Zhang, Z. Z. et al. The strength of an Ig switch region is determined by its ability to drive R loop
formation and its number of WGCW sites. Cell Rep 8, 557-569 (2014).
https://doi.org/10.1016/j.celrep.2014.06.021
45 Yu, K., Chedin, F., Hsieh, C. L., Wilson, T. E. & Lieber, M. R. R-loops at immunoglobulin class switch
regions in the chromosomes of stimulated B cells. Nat Immunol 4, 442-451 (2003).
https://doi.org/10.1038/ni919
46 Stavnezer, J., Guikema, J. E. & Schrader, C. E. Mechanism and regulation of class switch
recombination. Annu Rev Immunol 26, 261-292 (2008).
https://doi.org/10.1146/annurev.immunol.26.021607.090248
47 Okazaki, I. M., Kinoshita, K., Muramatsu, M., Yoshikawa, K. & Honjo, T. The AID enzyme induces
class switch recombination in fibroblasts. Nature 416, 340-345 (2002).
https://doi.org/10.1038/nature727
48 Pérez-Durán, P., de Yebenes, V. G. & Ramiro, A. R. Oncogenic events triggered by AID, the adverse
effect of antibody diversification. Carcinogenesis 28, 2427-2433 (2007).
https://doi.org/10.1093/carcin/bgm201
49 Çakan, E. & Gunaydin, G. Activation induced cytidine deaminase: An old friend with new faces.
Front Immunol 13, 965312 (2022). https://doi.org/10.3389/fimmu.2022.965312
50 Xu, Z. et al. Regulation of aicda expression and AID activity: relevance to somatic hypermutation
and class switch DNA recombination. Crit Rev Immunol 27, 367-397 (2007).
https://doi.org/10.1615/critrevimmunol.v27.i4.60
51 de Yébenes, V. G. et al. miR-181b negatively regulates activation-induced cytidine deaminase in B
cells. J Exp Med 205, 2199-2206 (2008). https://doi.org/10.1084/jem.20080579
52 Teng, G. et al. MicroRNA-155 is a negative regulator of activation-induced cytidine deaminase.
Immunity 28, 621-629 (2008). https://doi.org/10.1016/j.immuni.2008.03.015
53 Wu, X., Darce, J. R., Chang, S. K., Nowakowski, G. S. & Jelinek, D. F. Alternative splicing regulates
activation-induced cytidine deaminase (AID): implications for suppression of AID mutagenic
activity in normal and malignant B cells. Blood 112, 4675-4682 (2008).
https://doi.org/10.1182/blood-2008-03-145995
54 van Maldegem, F., Jibodh, R. A., van Dijk, R., Bende, R. J. & van Noesel, C. J. Activation-induced
cytidine deaminase splice variants are defective because of the lack of structural support for the
catalytic site. J Immunol 184, 2487-2491 (2010). https://doi.org/10.4049/jimmunol.0903102
55 Mechtcheriakova, D., Svoboda, M., Meshcheryakova, A. & Jensen-Jarolim, E. Activation-induced
cytidine deaminase (AID) linking immunity, chronic inflammation, and cancer. Cancer Immunol
Immunother 61, 1591-1598 (2012). https://doi.org/10.1007/s00262-012-1255-z
56 McBride, K. M. et al. Regulation of class switch recombination and somatic mutation by AID
phosphorylation. J Exp Med 205, 2585-2594 (2008). https://doi.org/10.1084/jem.20081319
57 Aoufouchi, S. et al. Proteasomal degradation restricts the nuclear lifespan of AID. J Exp Med 205,
1357-1368 (2008). https://doi.org/10.1084/jem.20070950
58 Orthwein, A. et al. Regulation of activation-induced deaminase stability and antibody gene
diversification by Hsp90. J Exp Med 207, 2751-2765 (2010).
https://doi.org/10.1084/jem.20101321
59 Gion, Y. et al. Up-regulation of activation-induced cytidine deaminase and its strong expression
in extra-germinal centres in IgG4-related disease. Sci Rep 9, 761 (2019).
https://doi.org/10.1038/s41598-018-37404-x
60 Zhang, J., Shi, Y., Zhao, M., Hu, H. & Huang, H. Activation-induced cytidine deaminase
overexpression in double-hit lymphoma: potential target for novel anticancer therapy. Sci Rep
10, 14164 (2020). https://doi.org/10.1038/s41598-020-71058-y
61 Dickerson, S. K., Market, E., Besmer, E. & Papavasiliou, F. N. AID mediates hypermutation by
deaminating single stranded DNA. J Exp Med 197, 1291-1296 (2003).
https://doi.org/10.1084/jem.20030481
62 Bransteitter, R., Pham, P., Scharff, M. D. & Goodman, M. F. Activation-induced cytidine
deaminase deaminates deoxycytidine on single-stranded DNA but requires the action of RNase.
Proc Natl Acad Sci U S A 100, 4102-4107 (2003). https://doi.org/10.1073/pnas.0730835100
63 Ramiro, A. R., Stavropoulos, P., Jankovic, M. & Nussenzweig, M. C. Transcription enhances
AID-mediated cytidine deamination by exposing single-stranded DNA on the nontemplate strand.
Nat Immunol 4, 452-456 (2003). https://doi.org/10.1038/ni920
64 Rada, C., González-Fernández, A., Jarvis, J. M. & Milstein, C. The 5' boundary of somatic
hypermutation in a V kappa gene is in the leader intron. Eur J Immunol 24, 1453-1457 (1994).
https://doi.org/10.1002/eji.1830240632
65 Lebecque, S. G. & Gearhart, P. J. Boundaries of somatic mutation in rearranged immunoglobulin
genes: 5' boundary is near the promoter, and 3' boundary is approximately 1 kb from V(D)J gene.
J Exp Med 172, 1717-1727 (1990). https://doi.org/10.1084/jem.172.6.1717
66 Peters, A. & Storb, U. Somatic hypermutation of immunoglobulin genes is linked to transcription
initiation. Immunity 4, 57-65 (1996). https://doi.org/10.1016/s1074-7613(00)80298-8
67 Tumas-Brundage, K. & Manser, T. The transcriptional promoter regulates hypermutation of the
antibody heavy chain locus. J Exp Med 185, 239-250 (1997).
https://doi.org/10.1084/jem.185.2.239
68 Bachl, J., Carlson, C., Gray-Schopfer, V., Dessing, M. & Olsson, C. Increased transcription levels
induce higher mutation rates in a hypermutating cell line. J Immunol 166, 5051-5057 (2001).
https://doi.org/10.4049/jimmunol.166.8.5051
69 Zhang, J., Bottaro, A., Li, S., Stewart, V. & Alt, F. W. A selective defect in IgG2b switching as a
result of targeted mutation of the I gamma 2b promoter and exon. Embo j 12, 3529-3537 (1993).
https://doi.org/10.1002/j.1460-2075.1993.tb06027.x
70 Jaszczur, M., Bertram, J. G., Pham, P., Scharff, M. D. & Goodman, M. F. AID and Apobec3G
haphazard deamination and mutational diversity. Cell Mol Life Sci 70, 3089-3108 (2013).
https://doi.org/10.1007/s00018-012-1212-1
71 Chelico, L., Pham, P. & Goodman, M. F. Stochastic properties of processive cytidine DNA
deaminases AID and APOBEC3G. Philos Trans R Soc Lond B Biol Sci 364, 583-593 (2009).
https://doi.org/10.1098/rstb.2008.0195
72 Bransteitter, R., Pham, P., Calabrese, P. & Goodman, M. F. Biochemical analysis of
hypermutational targeting by wild type and mutant activation-induced cytidine deaminase. J Biol
Chem 279, 51612-51621 (2004). https://doi.org/10.1074/jbc.M408135200
73 Ta, V. T. et al. AID mutant analyses indicate requirement for class-switch-specific cofactors. Nat
Immunol 4, 843-848 (2003). https://doi.org/10.1038/ni964
74 Barreto, V., Reina-San-Martin, B., Ramiro, A. R., McBride, K. M. & Nussenzweig, M. C. C-terminal
deletion of AID uncouples class switch recombination from somatic hypermutation and gene
conversion. Mol Cell 12, 501-508 (2003). https://doi.org/10.1016/s1097-2765(03)00309-5
75 Imai, K. et al. Analysis of class switch recombination and somatic hypermutation in patients
affected with autosomal dominant hyper-IgM syndrome type 2. Clin Immunol 115, 277-285
(2005). https://doi.org/10.1016/j.clim.2005.02.003
76 Kohli, R. M. et al. A portable hot spot recognition loop transfers sequence preferences from
APOBEC family members to activation-induced cytidine deaminase. J Biol Chem 284, 22898-
22904 (2009). https://doi.org/10.1074/jbc.M109.025536
77 Carpenter, M. A., Rajagurubandara, E., Wijesinghe, P. & Bhagwat, A. S. Determinants of
sequence-specificity within human AID and APOBEC3G. DNA Repair (Amst) 9, 579-587 (2010).
https://doi.org/10.1016/j.dnarep.2010.02.010
78 Wang, M., Rada, C. & Neuberger, M. S. Altering the spectrum of immunoglobulin V gene somatic
hypermutation by modifying the active site of AID. J Exp Med 207, 141-153 (2010).
https://doi.org/10.1084/jem.20092238
79 Pham, P., Bransteitter, R., Petruska, J. & Goodman, M. F. Processive AID-catalysed cytosine
deamination on single-stranded DNA simulates somatic hypermutation. Nature 424, 103-107
(2003). https://doi.org/10.1038/nature01760
80 Pham, P., Calabrese, P., Park, S. J. & Goodman, M. F. Analysis of a single-stranded DNA-scanning
process in which activation-induced deoxycytidine deaminase (AID) deaminates C to U
haphazardly and inefficiently to ensure mutational diversity. J Biol Chem 286, 24931-24942
(2011). https://doi.org/10.1074/jbc.M111.241208
81 Senavirathne, G. et al. Activation-induced deoxycytidine deaminase (AID) co-transcriptional
scanning at single-molecule resolution. Nat Commun 6, 10209 (2015).
https://doi.org/10.1038/ncomms10209
82 Storb, U., Shen, H. M. & Nicolae, D. Somatic hypermutation: processivity of the cytosine
deaminase AID and error-free repair of the resulting uracils. Cell Cycle 8, 3097-3101 (2009).
https://doi.org/10.4161/cc.8.19.9658
83 Jeong, Y. K., Song, B. & Bae, S. Current Status and Challenges of DNA Base Editing Tools. Mol Ther
28, 1938-1952 (2020). https://doi.org/10.1016/j.ymthe.2020.07.021
84 Kosicki, M., Tomberg, K. & Bradley, A. Repair of double-strand breaks induced by CRISPR-Cas9
leads to large deletions and complex rearrangements. Nat Biotechnol 36, 765-771 (2018).
https://doi.org/10.1038/nbt.4192
85 Haapaniemi, E., Botla, S., Persson, J., Schmierer, B. & Taipale, J. CRISPR-Cas9 genome editing
induces a p53-mediated DNA damage response. Nat Med 24, 927-930 (2018).
https://doi.org/10.1038/s41591-018-0049-z
86 Zhang, J. P. et al. Efficient precise knockin with a double cut HDR donor after CRISPR/Cas9-
mediated double-stranded DNA cleavage. Genome Biol 18, 35 (2017).
https://doi.org/10.1186/s13059-017-1164-8
87 Wang, F., Zeng, Y., Wang, Y. & Niu, Y. The Development and Application of a Base Editor in
Biomedicine. Biomed Res Int 2020, 2907623 (2020). https://doi.org/10.1155/2020/2907623
88 Rodgers, K. & McVey, M. Error-Prone Repair of DNA Double-Strand Breaks. J Cell Physiol 231, 15-
24 (2016). https://doi.org/10.1002/jcp.25053
89 Kim, J. S. Precision genome engineering through adenine and cytosine base editing. Nat Plants 4,
148-151 (2018). https://doi.org/10.1038/s41477-018-0115-z
90 Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive
immune systems. Science 353 (2016). https://doi.org/10.1126/science.aaf8729
91 Liu, H., Zhu, Y., Li, M. & Gu, Z. Precise genome editing with base editors. Med Rev (2021) 3, 75-84
(2023). https://doi.org/10.1515/mr-2022-0044
92 Ma, Y. et al. Targeted AID-mediated mutagenesis (TAM) enables efficient genomic diversification
in mammalian cells. Nat Methods 13, 1029-1035 (2016). https://doi.org/10.1038/nmeth.4027
93 McCaffrey, J. et al. CRISPR-CAS9 D10A nickase target-specific fluorescent labeling of double
strand DNA for whole genome mapping and structural variation analysis. Nucleic Acids Res 44,
e11 (2016). https://doi.org/10.1093/nar/gkv878
94 Hess, G. T. et al. Directed evolution using dCas9-targeted somatic hypermutation in mammalian
cells. Nat Methods 13, 1036-1042 (2016). https://doi.org/10.1038/nmeth.4038
95 Mak, C. H., Pham, P., Afif, S. A. & Goodman, M. F. Random-walk enzymes. Phys Rev E Stat Nonlin
Soft Matter Phys 92, 032717 (2015). https://doi.org/10.1103/PhysRevE.92.032717
96 Mak, C. H., Pham, P., Afif, S. A. & Goodman, M. F. A mathematical model for scanning and
catalysis on single-stranded DNA, illustrated with activation-induced deoxycytidine deaminase. J
Biol Chem 288, 29786-29795 (2013). https://doi.org/10.1074/jbc.M113.506550
97 Blackwood, J. K., Okely, E. A., Zahra, R., Eykelenboom, J. K. & Leach, D. R. DNA tandem repeat
instability in the Escherichia coli chromosome is stimulated by mismatch repair at an adjacent
CAG·CTG trinucleotide repeat. Proc Natl Acad Sci U S A 107, 22582-22586 (2010).
https://doi.org/10.1073/pnas.1012906108
98 Hashem, V. I., Klysik, E. A., Rosche, W. A. & Sinden, R. R. Instability of repeated DNAs during
transformation in Escherichia coli. Mutat Res 502, 39-46 (2002).
https://doi.org/10.1016/s0027-5107(02)00027-1
99 Schumacher, S., Pinet, I. & Bichara, M. Modulation of transcription reveals a new mechanism of
triplet repeat instability in Escherichia coli. J Mol Biol 307, 39-49 (2001).
https://doi.org/10.1006/jmbi.2000.4489
100 Akintunde, O., Tucker, T. & Carabetta, V. J. The evolution of next-generation sequencing
technologies. ArXiv (2023).
101 Petrackova, A. et al. Standardization of Sequencing Coverage Depth in NGS: Recommendation for
Detection of Clonal and Subclonal Mutations in Cancer Diagnostics. Front Oncol 9, 851 (2019).
https://doi.org/10.3389/fonc.2019.00851
102 Lou, D. I. et al. High-throughput DNA sequencing errors are reduced by orders of magnitude
using circle sequencing. Proc Natl Acad Sci U S A 110, 19872-19877 (2013).
https://doi.org/10.1073/pnas.1319590110
103 Jee, J. et al. Rates and mechanisms of bacterial mutagenesis from maximum-depth sequencing.
Nature 534, 693-696 (2016). https://doi.org/10.1038/nature18313
104 SantaLucia, J., Jr. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor
thermodynamics. Proc Natl Acad Sci U S A 95, 1460-1465 (1998).
https://doi.org/10.1073/pnas.95.4.1460
105 SantaLucia, J., Jr., Allawi, H. T. & Seneviratne, P. A. Improved nearest-neighbor parameters for
predicting DNA duplex stability. Biochemistry 35, 3555-3562 (1996).
https://doi.org/10.1021/bi951907q
106 Wang, Y. et al. Mesoscale DNA feature in antibody-coding sequence facilitates somatic
hypermutation. Cell 186, 2193-2207.e2119 (2023). https://doi.org/10.1016/j.cell.2023.03.030
107 Gorman, J. et al. Dynamic basis for one-dimensional DNA scanning by the mismatch repair
complex Msh2-Msh6. Mol Cell 28, 359-370 (2007). https://doi.org/10.1016/j.molcel.2007.09.008
108 Wright, D. J., Jack, W. E. & Modrich, P. The kinetic mechanism of EcoRI endonuclease. J Biol Chem
274, 31896-31902 (1999). https://doi.org/10.1074/jbc.274.45.31896
109 Esadze, A., Kemme, C. A., Kolomeisky, A. B. & Iwahara, J. Positive and negative impacts of
nonspecific sites during target location by a sequence-specific DNA-binding protein: origin of the
optimal search at physiological ionic strength. Nucleic Acids Res 42, 7039-7046 (2014).
https://doi.org/10.1093/nar/gku418
110 Bennett, S. E., Sanderson, R. J. & Mosbaugh, D. W. Processivity of Escherichia coli and rat liver
mitochondrial uracil-DNA glycosylase is affected by NaCl concentration. Biochemistry 34, 6109-
6119 (1995). https://doi.org/10.1021/bi00018a014
111 Mersch, K. N., Sokoloski, J. E., Nguyen, B., Galletto, R. & Lohman, T. M. "Helicase" Activity
promoted through dynamic interactions between a ssDNA translocase and a diffusing SSB
protein. Proc Natl Acad Sci U S A 120, e2216777120 (2023).
https://doi.org/10.1073/pnas.2216777120
112 Chen, J., Le, S., Basu, A., Chazin, W. J. & Yan, J. Mechanochemical regulations of RPA's binding to
ssDNA. Sci Rep 5, 9296 (2015). https://doi.org/10.1038/srep09296
113 Pangeni, S. et al. Rapid long-distance migration of RPA on single stranded DNA occurs through
intersegmental transfer utilizing multivalent interactions. bioRxiv, 2023.2012.2009.570606
(2023). https://doi.org/10.1101/2023.12.09.570606
114 Mak, C. H., Pham, P. & Goodman, M. F. Random Walk Enzymes: Information Theory, Quantum
Isomorphism, and Entropy Dispersion. J Phys Chem A 123, 3030-3037 (2019).
https://doi.org/10.1021/acs.jpca.9b00910
115 Creighton, S. & Goodman, M. F. Gel kinetic analysis of DNA polymerase fidelity in the presence of
proofreading using bacteriophage T4 DNA polymerase. J Biol Chem 270, 4759-4774 (1995).
https://doi.org/10.1074/jbc.270.9.4759
116 Riera Romo, M., Perez-Martinez, D. & Castillo Ferrer, C. Innate immunity in vertebrates: an
overview. Immunology 148, 125-139 (2016). https://doi.org/10.1111/imm.12597
117 Peled, J. U. et al. The biochemistry of somatic hypermutation. Annu Rev Immunol 26, 481-511 (2008).
118 Hwang, J. K., Alt, F. W. & Yeap, L. S. Related Mechanisms of Antibody Somatic Hypermutation and
Class Switch Recombination. Microbiol Spectr 3, MDNA3-0037-2014 (2015).
https://doi.org/10.1128/microbiolspec.MDNA3-0037-2014
119 McKean, D. et al. Generation of antibody diversity in the immune response of BALB/c mice to
influenza virus hemagglutinin. Proc Natl Acad Sci U S A 81, 3180-3184 (1984).
https://doi.org/10.1073/pnas.81.10.3180
120 Rajewsky, K. Clonal selection and learning in the antibody system. Nature 381, 751-758 (1996).
https://doi.org/10.1038/381751a0
121 Liu, M. & Schatz, D. G. Balancing AID and DNA repair during somatic hypermutation. Trends
Immunol 30, 173-181 (2009). https://doi.org/10.1016/j.it.2009.01.007
122 Pilzecker, B. & Jacobs, H. Mutating for Good: DNA Damage Responses During Somatic
Hypermutation. Front Immunol 10, 438 (2019). https://doi.org/10.3389/fimmu.2019.00438
123 Zeng, X. et al. DNA polymerase η is an A-T mutator in somatic hypermutation of immunoglobulin
variable genes. Nat Immunol 2, 537-541 (2001). https://doi.org/10.1038/88740
124 Eckert, K. A. Nontraditional Roles of DNA Polymerase Eta Support Genome Duplication and
Stability. Genes (Basel) 14 (2023). https://doi.org/10.3390/genes14010175
125 Kunik, V., Ashkenazi, S. & Ofran, Y. Paratome: an online tool for systematic identification of
antigen-binding regions in antibodies based on sequence or structure. Nucleic Acids Res 40,
W521-524 (2012). https://doi.org/10.1093/nar/gks480
126 Pham, P. et al. AID-RNA polymerase II transcription-dependent deamination of IgV DNA. Nucleic
Acids Res 47, 10815-10829 (2019). https://doi.org/10.1093/nar/gkz821
127 Xu, J. et al. Nanobodies from camelid mice and llamas neutralize SARS-CoV-2 variants. Nature
595, 278-282 (2021). https://doi.org/10.1038/s41586-021-03676-z
128 Ohm-Laursen, L. & Barington, T. Analysis of 6912 unselected somatic hypermutations in human
VDJ rearrangements reveals lack of strand specificity and correlation between phase II
substitution rates and distance to the nearest 3' activation-induced cytidine deaminase target. J
Immunol 178, 4322-4334 (2007). https://doi.org/10.4049/jimmunol.178.7.4322
129 Wei, L. et al. Overlapping hotspots in CDRs are critical sites for V region diversification. Proc Natl
Acad Sci U S A 112, E728-737 (2015). https://doi.org/10.1073/pnas.1500788112
130 Anand, T. et al. Phage Display Technique as a Tool for Diagnosis and Antibody Selection for
Coronaviruses. Curr Microbiol 78, 1124-1134 (2021).
https://doi.org/10.1007/s00284-021-02398-9
131 Almagro, J. C., Pedraza-Escalona, M., Arrieta, H. I. & Perez-Tapia, S. M. Phage Display Libraries for
Antibody Therapeutic Discovery and Development. Antibodies (Basel) 8 (2019).
https://doi.org/10.3390/antib8030044
132 Alfaleh, M. A. et al. Phage Display Derived Monoclonal Antibodies: From Bench to Bedside. Front
Immunol 11, 1986 (2020). https://doi.org/10.3389/fimmu.2020.01986
133 Ledsgaard, L., Kilstrup, M., Karatt-Vellatt, A., McCafferty, J. & Laustsen, A. H. Basics of Antibody
Phage Display Technology. Toxins (Basel) 10 (2018). https://doi.org/10.3390/toxins10060236
134 Yang, E. Y. & Shah, K. Nanobodies: Next Generation of Cancer Diagnostics and Therapeutics.
Front Oncol 10, 1182 (2020). https://doi.org/10.3389/fonc.2020.01182
135 Muyldermans, S. A guide to: generation and design of nanobodies. FEBS J 288, 2084-2102
(2021). https://doi.org/10.1111/febs.15515
136 Jin, B. K., Odongo, S., Radwanska, M. & Magez, S. Nanobodies: A Review of Generation,
Diagnostics and Therapeutics. Int J Mol Sci 24 (2023). https://doi.org/10.3390/ijms24065994
137 Bannas, P., Hambach, J. & Koch-Nolte, F. Nanobodies and Nanobody-Based Human Heavy Chain
Antibodies As Antitumor Therapeutics. Front Immunol 8, 1603 (2017).
https://doi.org/10.3389/fimmu.2017.01603
138 Muyldermans, S. Nanobodies: Natural Single-Domain Antibodies. Annu Rev Biochem 82, 775-797
(2013).
139 Low, N. M., Holliger, P. H. & Winter, G. Mimicking somatic hypermutation: affinity maturation of
antibodies displayed on bacteriophage using a bacterial mutator strain. J Mol Biol 260, 359-368
(1996). https://doi.org/10.1006/jmbi.1996.0406
140 Rahbarnia, L. et al. Evolution of phage display technology: from discovery to application. J Drug
Target 25, 216-224 (2017). https://doi.org/10.1080/1061186x.2016.1258570
141 Martineau, P. Error-prone polymerase chain reaction for modification of scFvs. Methods Mol Biol
178, 287-294 (2002). https://doi.org/10.1385/1-59259-240-6:287
142 Di Marzo, V. & Maccarrone, M. FAAH and anandamide: is 2-AG really the odd one out? Trends
Pharmacol Sci 29, 229-233 (2008). https://doi.org/10.1016/j.tips.2008.03.001
143 Schlosburg, J. E., Kinsey, S. G. & Lichtman, A. H. Targeting fatty acid amide hydrolase (FAAH) to
treat pain and inflammation. AAPS J 11, 39-44 (2009).
https://doi.org/10.1208/s12248-008-9075-y
144 Ahn, K., Johnson, D. S. & Cravatt, B. F. Fatty acid amide hydrolase as a potential therapeutic
target for the treatment of pain and CNS disorders. Expert Opin Drug Discov 4, 763-784 (2009).
https://doi.org/10.1517/17460440903018857
145 Mikaeili, H. et al. Molecular basis of FAAH-OUT-associated human pain insensitivity. Brain 146,
3851-3865 (2023). https://doi.org/10.1093/brain/awad098
146 Dider, S., Ji, J., Zhao, Z. & Xie, L. Molecular mechanisms involved in the side effects of fatty acid
amide hydrolase inhibitors: a structural phenomics approach to proteome-wide cellular off-target
deconvolution and disease association. NPJ Syst Biol Appl 2, 16023 (2016).
https://doi.org/10.1038/npjsba.2016.23
147 van Esbroeck, A. C. M. et al. Activity-based protein profiling reveals off-target proteins of the
FAAH inhibitor BIA 10-2474. Science 356, 1084-1087 (2017).
https://doi.org/10.1126/science.aaf7497
148 Zanfirescu, A., Nitulescu, G., Mihai, D. P. & Nitulescu, G. M. Identifying FAAH Inhibitors as New
Therapeutic Options for the Treatment of Chronic Pain through Drug Repurposing.
Pharmaceuticals (Basel) 15 (2021). https://doi.org/10.3390/ph15010038
149 Dal-Bo, M. et al. B-cell receptor, clinical course and prognosis in chronic lymphocytic leukaemia:
the growing saga of the IGHV3 subgroup gene usage. Br J Haematol 153, 3-14 (2011).
https://doi.org/10.1111/j.1365-2141.2010.08440.x
150 Brezinschek, H. P., Brezinschek, R. I. & Lipsky, P. E. Analysis of the heavy chain repertoire of
human peripheral B cells using single-cell polymerase chain reaction. J Immunol 155, 190-202
(1995).
151 Rogozin, I. B., Pavlov, Y. I., Bebenek, K., Matsuda, T. & Kunkel, T. A. Somatic mutation hotspots
correlate with DNA polymerase eta error spectrum. Nat Immunol 2, 530-536 (2001).
https://doi.org/10.1038/88732
152 Pavlov, Y. I. et al. Correlation of somatic hypermutation specificity and A-T base pair substitution
errors by DNA polymerase eta during copying of a mouse immunoglobulin kappa light chain
transgene. Proc Natl Acad Sci U S A 99, 9954-9959 (2002).
https://doi.org/10.1073/pnas.152126799
153 Tang, C., Bagnara, D., Chiorazzi, N., Scharff, M. D. & MacCarthy, T. AID Overlapping and Polη
Hotspots Are Key Features of Evolutionary Variation Within the Human Antibody Heavy Chain
(IGHV) Genes. Front Immunol 11, 788 (2020). https://doi.org/10.3389/fimmu.2020.00788
154 Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583-
589 (2021). https://doi.org/10.1038/s41586-021-03819-2
155 Cohen, T., Halfon, M. & Schneidman-Duhovny, D. NanoNet: Rapid and accurate end-to-end
nanobody modeling by deep learning. Front Immunol 13, 958584 (2022).
https://doi.org/10.3389/fimmu.2022.958584
156 Kiguchi, Y. et al. The V(H) framework region 1 as a target of efficient mutagenesis for generating
a variety of affinity-matured scFv mutants. Sci Rep 11, 8201 (2021).
https://doi.org/10.1038/s41598-021-87501-7
157 Jung, S. et al. The importance of framework residues H6, H7 and H10 in antibody heavy chains:
experimental evidence for a new structural subclassification of antibody V(H) domains. J Mol Biol
309, 701-716 (2001). https://doi.org/10.1006/jmbi.2001.4665
158 Jeong, S. L. et al. Immunoglobulin somatic hypermutation in a defined biochemical system
recapitulates affinity maturation and permits antibody optimization. Nucleic Acids Res 50, 11738-
11754 (2022). https://doi.org/10.1093/nar/gkac995
159 Smith, G. P. Filamentous fusion phage: novel expression vectors that display cloned antigens on
the virion surface. Science 228, 1315-1317 (1985). https://doi.org/10.1126/science.4001944
160 Yuan, B., Liu, R. & Sierks, M. Improved affinity selection using phage display technology and
off-rate based selection. Electronic Journal of Biotechnology 9 (2006).
https://doi.org/10.4067/S0717-34582006000200011
161 Lippoldt, E. K., Elmes, R. R., McCoy, D. D., Knowlton, W. M. & McKemy, D. D. Artemin, a glial cell
line-derived neurotrophic factor family member, induces TRPM8-dependent cold pain. J Neurosci
33, 12543-12552 (2013). https://doi.org/10.1523/jneurosci.5765-12.2013
162 Lippoldt, E. K., Ongun, S., Kusaka, G. K. & McKemy, D. D. Inflammatory and neuropathic cold
allodynia are selectively mediated by the neurotrophic factor receptor GFRα3. Proc Natl Acad Sci
U S A 113, 4506-4511 (2016). https://doi.org/10.1073/pnas.1603294113
163 Meng, E. C. et al. UCSF ChimeraX: Tools for structure building and analysis.
164 Kashojiya, S. et al. Modification of tomato breeding traits and plant hormone signaling by
Target-AID, the genome-editing system inducing efficient nucleotide substitution. Hortic Res 9 (2022).
https://doi.org/10.1093/hr/uhab004
165 Seki, M., Gearhart, P. J. & Wood, R. D. DNA polymerases and somatic hypermutation of
immunoglobulin genes. EMBO Rep 6, 1143-1148 (2005).
https://doi.org/10.1038/sj.embor.7400582
166 Pillai-Kastoori, L. et al. Antibody validation for Western blot: By the user, for the user. J Biol Chem
295, 926-939 (2020). https://doi.org/10.1074/jbc.RA119.010472
Supplemental Figures and Tables
Table S1.1 Primers and ssDNA ordered for AID reaction and NGS library preparation
Name | Sequence
Head before inserts | TAAGTGATGTTTGTTGAA
Tail after inserts | TTTGTGTAGGAAGATTGAA
Ind25-NTSC | CTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNNNNNNNNNNNNNNNNNNNN(NNN 1-3 nucleotides can be added)TTCAATCTTCCTACACAAA
ReverseInd25 | ACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNNNNNNNNNNNNTTCAATCTTCCTACACAAA
NevRev | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT(NNNN 1-4 nucleotides can be added)TAAGTGATGTTTGTTGAA
NG | CAAGCAGAAGACGGCATACGAGATCGTGAT(6nt index, change based on needs)GTGACTGGAGTTCAGACGT
Reverse NG | CAAGCAGAAGACGGCATACGAGATCGTGAT(6nt index, change based on needs)GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTAAGTGATGTTTGTTGAA
ReverseP7 | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
AGC AGC | AGCAGC…AGCAGC, 54 repeats
AAC AAC | AACAAC…AACAAC, 54 repeats
ATC ATC | ATCATC…ATCATC, 54 repeats
TAC TAC | TACTAC…TACTAC, 54 repeats
TTC TTC | TTCTTC…TTCTTC, 54 repeats
AGC TTT | AGCTTTAGCTTT…AGCTTTAGCTTT, 27 repeats
Seq3 | GAGAGATGTGTATGTAACGAGAGATGTGTATGTAGCGAGAGATGTGTATGTTACGAGAGATGTGTATGTTTCGAGAGATGTGTATGTATCGAGAGATGTGTATGT
Seq4 | GAGAGATGTGTATGTAACGAGAGATGTGTATGTATCGAGAGATGTGTATGTTACGAGAGATGTGTATGTTTCGAGAGATGTGTATGTAGCGAGAGATGTGTATGT
Seq7 | GTTATGTAGAGTGTTAACGTTATGTAGAGTGTTAGCGTTATGTAGAGTGTTTACGTTATGTAGAGTGTTTTCGTTATGTAGAGTGTTATCGTTATGTAGAGTGTT
Seq8 | GTTATGTAGAGTGTTAACGTTATGTAGAGTGTTATCGTTATGTAGAGTGTTTACGTTATGTAGAGTGTTTTCGTTATGTAGAGTGTTAGCGTTATGTAGAGTGTT
Seq11 | AGTGATATGTTAAGGAAC AGTGATATGTTAAGGAGC AGTGATATGTTAAGGTAC AGTGATATGTTAAGGTTC AGTGATATGTTAAGGATC AGTGATATGTTAAGG
Seq12 | AGTGATATGTTAAGGAAC AGTGATATGTTAAGGATC AGTGATATGTTAAGGTAC AGTGATATGTTAAGGTTC AGTGATATGTTAAGGAGC AGTGATATGTTAAGG
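As a quick consistency check on Table S1.1, the sketch below assembles one substrate under the assumption (not stated in the table itself) that each ordered ssDNA is the head, then the insert, then the tail. With the Seq7 insert this gives 142 nt, which agrees with the ORDER = 142 setting and the 18:123 slice used in the appendix scripts. The names HEAD, TAIL, SEQ7 and build_substrate are illustrative only.

```python
# Illustrative sketch: assumes substrate = head + insert + tail (an assumption
# inferred from the slicing in the appendix scripts, not stated in the table).
HEAD = "TAAGTGATGTTTGTTGAA"    # "Head before inserts" (Table S1.1)
TAIL = "TTTGTGTAGGAAGATTGAA"   # "Tail after inserts" (Table S1.1)
SEQ7 = (
    "GTTATGTAGAGTGTTAACGTTATGTAGAGTGTTAGCGTTATGTAGAGTGTTTA"
    "CGTTATGTAGAGTGTTTTCGTTATGTAGAGTGTTATCGTTATGTAGAGTGTT"
)


def build_substrate(insert):
    """Return head + insert + tail as a single string."""
    return HEAD + insert + TAIL


substrate = build_substrate(SEQ7)
# Lengths of head, insert, tail and the assembled substrate.
print(len(HEAD), len(SEQ7), len(TAIL), len(substrate))
```

Under the same assumption, the 54-repeat substrates would be longer than 142 nt, so ORDER would need to be changed for them.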
Figure S3.1 AID and Pol η mutation spectrum on the IGHV3-23*01 region. Gapped M13mp2 phage DNA carrying the IGHV3-23*01
region was treated with AID and Pol η. A total of 150 clones were picked, containing 862 mutation sites. Bases above the template
sequence represent mutations caused by AID and Pol η. The template sequence is underlined in green for FW regions
and in red for CDR regions. AID's preferred motifs are shown in red font. Pol η's preferred motifs are highlighted in yellow.
(Experiments performed by Soo Lim Jeong in the Goodman Lab, USC Molecular Biology)
Figure S3.2 DNA sequences of 5 selected clones, showing their differences from A3. The mutated nucleotides are highlighted in red.
Figure S3.3 DNA sequences of C6, F4 and 69. The mutated nucleotides are highlighted in red.
Figure S3.4 Inflammatory cold allodynia is inhibited by both LM52-VHH and MAB 1085. CFA was injected into 9 mice to
induce cold allodynia. Both groups of mice showed a significant increase in cold-evoked hind paw withdrawal latencies after
treatment with anti-artemin MAB 1085 (A, n=5) or LM52-VHH (B, n=9). Contra is the uninjected, control hind paw; ipsi is the
injected hind paw in the MAB 1085 or LM52-VHH experimental groups. Withdrawal latency measures the time between the cold
stimulation and the mouse removing its hind paw. A longer latency means the mouse is less sensitive to cold. For the MAB 1085
group, 5 mice were treated; for the LM52-VHH group, 9 mice were treated. ns p>0.05, ***p<0.001, **p<0.01, *p<0.05.
(Experiments performed by Soo Lim Jeong in the Goodman Lab, USC Molecular Biology, Shanni Yamaki and Chenyu Yang in
the McKemy Lab, USC Neuroscience)
Figure S3.5 Docking scores between VHHs and FAAH. In all three software calculations, a more favorable interaction between
two proteins gives a lower docking score. VHH 69 had the lowest score in all three calculations and was therefore predicted to
have the most favorable binding interaction with FAAH.
Appendix
Python code used for MDS data analysis:
Barcodecreat+universal.py:
### Need input
READ1 = '22095FL-08-02-01_S2_L006_R1_001.fastq'
READ2 = '22095FL-08-02-01_S2_L006_R2_001.fastq'
ORDER = 142
Primer1 = 25
Primer2 = 0
#######
Primer1EXTEND = Primer1 - 25
Primer2EXTEND = Primer2 - 0
FullLENGTH = ORDER + 25  #default: 142 designed sequences with priming site and 25N barcode
R2ADD = FullLENGTH - 130 + Primer1EXTEND + Primer2EXTEND
R2MATCH1 = FullLENGTH - 125 + Primer1EXTEND + Primer2EXTEND
R2MATCH2 = FullLENGTH - 135 + Primer1EXTEND + Primer2EXTEND
def compbase(base):
    #Returns complement of base
    if base == "A":
        compbase = "T"
    elif base == "C":
        compbase = "G"
    elif base == "G":
        compbase = "C"
    elif base == "T":
        compbase = "A"
    else:
        compbase = base
        # if there's an N in the sequence, for example, keep the N
    return compbase
def compseq(seq):
    #Returns complement of seq
    comp = [] # empty list
    for base in seq:
        nuc = compbase(base)
        comp.append(nuc)
    compstr = "".join(comp)
    return compstr
def covert (read):
    #Returns the sequence lines of a FASTQ file (2nd line of every 4-line record)
    out = []
    with open(read) as f:
        for idx, line in enumerate(f.read().splitlines()):
            if idx %4==1 :
                out.append(line)
    return out
def rev(seq):
    #Returns the reverse of seq
    return seq[::-1]
def revcomp(seq):
    #Returns the reverse complement of seq
    rv = rev(seq)
    rvcp = compseq(rv)
    return rvcp
def seqrevcomp (seq):
    #Returns the reverse complement of every sequence in the list
    out=[]
    for i in range(len(seq)):
        temp = revcomp(seq[i])
        out.append(temp)
    return out
def merge(read1,read2,r1headkeep,r2tailkeep): #merge the R1 and R2 sequence together
    out=[]
    for i in range(len(read1)):
        #keep the pair only if the 10-base overlap window agrees in R1 and R2
        if read1[i][125:135] == read2[i][-R2MATCH1:-R2MATCH2]:
            temp= "".join([read1[i][0:r1headkeep],read2[i][-r2tailkeep:]])
            out.append(temp)
    return out
def final25(seq): #get the last 25 bases of the read. The last 25 bases are the barcodes
    MyList=[]
    for line in seq:
        if line[-(35+Primer2):-(25+Primer2)] == 'GAAGATTGAA':
            MyList.append(line[-25:])
    return MyList
def filterbarcode(barcode): #get rid of the barcode error reads
    #error barcodes are 25-base runs of a single character (C, N, A, T or G)
    errorbarcodes = [base*25 for base in 'CNATG']
    out =[]
    for element in barcode:
        if element not in errorbarcodes:
            out.append(element)
    return out
from collections import Counter
r1=covert(READ1)
b=covert(READ2)
r2 =seqrevcomp(b)
del b
mergerdseq0 = merge(r1,r2,130,R2ADD) #
del r2
del r1
MyFile=open('sequence-over.txt','w')
for element in mergerdseq0:
    MyFile.write(element)
    MyFile.write('\n')
MyFile.close()
counts = Counter(final25(mergerdseq0))
del mergerdseq0
#keep barcodes that were read more than twice
fivetimesbarcode = [value for value, count in counts.items() if count >2]
#print(output)
print(len(fivetimesbarcode))
del counts
fivetimesbarcode=filterbarcode(fivetimesbarcode)
#create the barcode file
MyFile=open('barcode-over.txt','w')
MyFile.write('\n'.join(fivetimesbarcode))
MyFile.close()
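The barcode-counting step of the script above can be tried on toy data without FASTQ files. The sketch below is a simplified stand-in and not the script itself: it uses a hypothetical 4-nt barcode in place of the 25-nt one, and the helper name frequent_barcodes and its parameters are assumptions. It keeps barcodes seen more than twice, as the count > 2 filter does.

```python
from collections import Counter


def frequent_barcodes(reads, barcode_len, min_count=3):
    """Return the barcodes (read suffixes) seen at least min_count times."""
    counts = Counter(read[-barcode_len:] for read in reads)
    return sorted(bc for bc, n in counts.items() if n >= min_count)


# Toy reads: the last 4 bases play the role of the barcode.
toy_reads = [
    "AAGTACGT", "AAGTACGT", "ATGTACGT",  # barcode ACGT, seen 3 times
    "AAGTTTTT",                          # barcode TTTT, seen once
    "AAGTGGCC", "AAGTGGCC",              # barcode GGCC, seen twice
]
kept = frequent_barcodes(toy_reads, barcode_len=4)
print(kept)
```

Only the barcode read three times survives; reads carrying the other barcodes would not contribute to a consensus sequence.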
finalversionseq+universal.py:
import statistics
import re
from multiprocessing import Process, Queue
import time
##### Need input
name = 'seq7'
Primer1= 25
Primer2= 0
ORDER= 142
Seqinput= ('GTTATGTAGAGTGTTAACGTTATGTAGAGTGTTAGCGTTATGTAGAGTGTTTACGTTATGTAGAGTGTTTTCGTTA'
           'TGTAGAGTGTTATCGTTATGTAGAGTGTT')
Processors=60
######
filename = name+'.txt'
print(filename)
def replace_c(input_string):
    #Allow every C in the reference to match either C or T (deaminated C reads as T)
    modified_string = input_string.replace('C', '[CT]')
    return modified_string
MUTseq = replace_c(Seqinput)
Primer1EXTEND=Primer1-25
Primer2EXTEND=Primer2-0
FullLENGTH= ORDER+25 #default: 142 designed sequences with priming site and 25N barcode
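def adjustseq(seqlist):
    #Build one consensus sequence from all full-length reads that share a barcode
    reads = [line for line in seqlist if len(line) == FullLENGTH]
    if not reads:
        #no usable read: return all N, which the reference filter rejects later
        return 'N'*FullLENGTH
    out=[]
    for i in range(FullLENGTH):
        temp=[line[i] for line in reads]
        most=statistics.mode(temp)
        #keep the base only if more than 80% of the reads agree
        if temp.count(most) <= 0.8*len(temp):
            most = 'N'
        out.append(most)
    seq = "".join(out)
    return seq
def finalseq(dubplicate,original,seq_dict,q):
    #Worker: build the consensus for each barcode and put the results on the queue
    out=[]
    for line in dubplicate:
        filter_object = [original[i] for i in seq_dict.get(line, [])]
        if len(filter_object) !=0:
            out.append(adjustseq(filter_object))
    q.put(out)
'''
if __name__=="__main__":
    pool =Pool()
    result = pool.map(finalseq,barcode,seq)
    pool.close()
    pool.join()
    print(result)
'''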
if __name__=="__main__":
    t1 = time.time()
    my_file = open("barcode-over.txt", "r")
    content = my_file.read()
    testbarcode = content.split("\n")
    my_file.close()
    barcode = testbarcode
    my_file = open("sequence-over.txt", "r")
    content = my_file.read()
    seq = content.split("\n")
    my_file.close()
    #index the reads by their 25-base barcode
    seq_dict = dict()
    for i, line in enumerate(seq):
        if line[-25:] in seq_dict:
            seq_dict[line[-25:]].append(i)
        else:
            seq_dict[line[-25:]] = [i]
    t2=time.time()
    q= Queue()
    processes=[]
    processnumber=Processors
    #chunk size per process; at least 1 so that the range step is never 0
    seperation=max(1,int(len(barcode)/processnumber))
    fivecode=[barcode[i:i+seperation] for i in range(0,len(barcode),seperation)]
    for element in fivecode:
        process=Process(target= finalseq,args=(element,seq,seq_dict,q))
        process.start()
        processes.append(process)
    final=[]
    #collect the results before joining so that the queue does not block the workers
    for process in processes:
        obj = q.get()
        #print(obj)
        final=final+obj
    print ("finish")
    for process in processes:
        process.join()
    #print(final)
    #print(len(final))
    print("took",time.time()-t1)
    #The analysis below uses `final`, so it stays inside the __main__ block
    def motifnoly(seq):
        #Cut out the designed insert (105 bases after the 18-base head)
        out = []
        for line in seq:
            out.append(line[18+Primer2:123+Primer2])
        return out
    motifs = motifnoly(final)
    refmoti = MUTseq
    def correctseq(seq, refmoti):
        #Keep sequences that match the reference, allowing C or T at every C
        out = []
        for line in seq:
            if re.search(refmoti, line) is not None:
                out.append(line)
        return out
    correct = correctseq(motifs, refmoti)
    print(len(correct))
    right = Seqinput
    def alignment(seq, ref):
        #Count the positions at which each sequence differs from the reference
        out = []
        Mut = []
        for line in seq:
            temp = 0
            for i in range(len(ref)):
                if line[i] != ref[i]:
                    temp = temp + 1
                    #print(line[i])
                    #print(i)
                    Mut.append(line[i])
            out.append(temp)
        #print(Mut)
        #print(len(Mut))
        return out
    alig = alignment(correct, right)
    #print(alig)
    def counterror(seq):
        #Number of sequences with at least one mutation
        score = 0
        for element in seq:
            if element != 0:
                score = score + 1
                # print(element)
        return score
    print(counterror(alig))
    def counterrorbiger1(seq):
        #Number of sequences with more than one mutation
        score = 0
        for element in seq:
            if element > 1:
                score = score + 1
                # print(element)
        return score
    print(counterrorbiger1(alig))
    def creatdot(seq, ref):
        #Encode each sequence as 35 triplets: "." = unchanged, "T" = mutated
        out = []
        for line in seq:
            temp = []
            n = 0
            for i in range(35):
                if line[i * 3:i * 3 + 3] == ref[i * 3:i * 3 + 3]:
                    temp.append(".")
                else:
                    temp.append("T")
                    n = n + 1
            tempjoin = "".join(temp)
            if n > 0:
                out.append(tempjoin)
        return out
    def onlymut(seq):
        #Drop lines made only of dots (creatdot already removes them; kept as a safeguard)
        out = []
        for line in seq:
            if line.count('.') != len(line):
                out.append(line)
        return out
    allmut = creatdot(correct, right)
    print(onlymut(allmut))
    print(len(onlymut(allmut)))
    MyFile = open(filename, 'w')
    for element in onlymut(allmut):
        MyFile.write(element)
        MyFile.write('\n')
    MyFile.close()
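The consensus rule in adjustseq (keep a base only when more than 80% of the reads sharing a barcode agree, otherwise write N) can be illustrated on short toy reads. This is a minimal sketch that assumes all reads in a family already have the same length; consensus and its threshold parameter are illustrative names and are not part of the script.

```python
from collections import Counter


def consensus(reads, threshold=0.8):
    """Column-wise majority vote; columns without > threshold agreement become 'N'."""
    out = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count > threshold * len(column) else "N")
    return "".join(out)


# Six reads with one discordant base at position 2 (5/6 agree, which is > 80%).
family = ["ACGTA", "ACGTA", "ACGTA", "ACGTA", "ACTTA", "ACGTA"]
print(consensus(family))

# Three reads with one discordant base (2/3 agree, which is not > 80%).
small_family = ["ACG", "ACG", "ATG"]
print(consensus(small_family))
```

With the 80% threshold, a family of three reads tolerates no disagreement at a position, so a single sequencing error there yields N and not a false mutation call.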
mutation distribute 3motif.py:
import matplotlib.pyplot as plt
list_ = open("seq7.txt").read().split()
#print(list_)
out=[0]*5
#5+5+5+5+5+5 motif combination
for line in list_:
    if line[5] == 'T':
        out[0]=out[0]+1
    if line [11] == 'T':
        out[1]=out[1]+1
    if line [17] == 'T':
        out[2]=out[2]+1
    if line [23] == 'T':
        out[3]=out[3]+1
    if line [29] == 'T':
        out[4]=out[4]+1
'''
#5+3+7+7+3+5 motif combination
for line in list_:
    if line[5] == 'T':
        out[0]=out[0]+1
    if line [9] == 'T':
        out[1]=out[1]+1
    if line [17] == 'T':
        out[2]=out[2]+1
    if line [25] == 'T':
        out[3]=out[3]+1
    if line [29] == 'T':
        out[4]=out[4]+1
'''
print(out)
#share of the mutations found at each of the five motif positions
total = 0
for element in out:
    total = total + element
percentage= []
for element in out:
    percentage.append(element/total*100 if total else 0)
print(percentage)
position = list(range(1,6))
print(position)
plt.plot(position,percentage)
plt.xlabel('position')
plt.ylabel('percentage')
plt.show()
'''
#disabled: group the reads by their number of mutated motifs
out2=[]
out3=[]
out4=[]
out5or_more=[]
n=[0,0,0,0,0]
for line in list_:
    a= line.count('T')
    if a == 2:
        out2.append(line)
        n[1]=n[1]+1
    elif a== 3:
        out3.append(line)
        n[2] = n[2] + 1
    elif a ==4:
        out4.append(line)
        n[3] = n[3] + 1
    else:
        out5or_more.append(line)
        n[4] = n[4] + 1
'''
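The per-motif tally in this script can be written more compactly. The sketch below assumes, as in the "5+5+5+5+5+5" layout, that each line is a 35-character dot string and that the five motif triplets sit at indices 5, 11, 17, 23 and 29. The helpers motif_percentages and make_line are hypothetical, and the input lines are made up for illustration.

```python
MOTIF_INDEX = (5, 11, 17, 23, 29)  # triplet positions of the five motifs


def motif_percentages(lines, positions=MOTIF_INDEX):
    """Count 'T' marks at each motif position and convert the counts to percentages."""
    counts = [sum(1 for line in lines if line[p] == "T") for p in positions]
    total = sum(counts)
    if total == 0:
        return counts, [0.0] * len(positions)
    return counts, [100.0 * c / total for c in counts]


def make_line(mutated, length=35):
    """Build a toy dot string with 'T' at the given triplet indices."""
    chars = ["."] * length
    for p in mutated:
        chars[p] = "T"
    return "".join(chars)


toy = [make_line([5]), make_line([5, 17]), make_line([29]), make_line([17])]
counts, pct = motif_percentages(toy)
print(counts)
print(pct)
```

Changing MOTIF_INDEX to (5, 9, 17, 25, 29) would give the "5+3+7+7+3+5" layout that the script keeps as a disabled alternative.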
quality control.py:
def covert (Read):
    #Returns every line of the file with line endings removed
    outseq=[]
    infile1 = open(Read)
    for line in infile1:
        out = line.replace("\n", "")
        out = out.replace("\r", "")
        outseq.append(out)
    infile1.close()
    return outseq
def seqqulity(seq):
    #Within the first 130 positions of each read, replace any base whose
    #quality character is not in goodquality with 'N'
    goodquality= ['A','B','C','D','E','F','G','H','I','J','K']
    print(len(seq))
    length = len(seq)/4
    #print(length)
    length = int(length)
    print(length)
    for i in range(0,length):
        if len(seq[i * 4 + 3]) > 129:
            mylist=list(seq[i*4+1])
            for j in range(0,130):
                if seq[i*4+3][j] not in goodquality:
                    mylist[j]='N'
            seq[i*4+1] = ''.join(mylist)
    return seq
read = covert('Undetermined_S0_L001_R2_001.fastq')
Read = seqqulity(read)
MyFile=open('seqR2.txt','w')
for element in Read:
    MyFile.write(element)
    MyFile.write('\n')
MyFile.close()
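The quality filter above keeps bases whose FASTQ quality character is one of 'A' to 'K'. Assuming the common Phred+33 encoding (an assumption; the script does not state the encoding), those characters correspond to quality scores 32 to 42, so the same mask can be expressed numerically. The names mask_low_quality, min_q, max_q and offset are illustrative. Unlike the script, this sketch masks the whole read and not only the first 130 positions.

```python
def mask_low_quality(seq, qual, min_q=32, max_q=42, offset=33):
    """Replace bases whose Phred score falls outside [min_q, max_q] with 'N'."""
    return "".join(
        base if min_q <= ord(q) - offset <= max_q else "N"
        for base, q in zip(seq, qual)
    )


# 'I' -> Q40, 'K' -> Q42, '#' -> Q2, 'A' -> Q32 under Phred+33.
masked = mask_low_quality("ACGTAC", "IIK#AI")
print(masked)
```

Masking a base as N, and not discarding the read, lets the remaining positions still contribute to the per-barcode consensus.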
Asset Metadata
Creator Zhang, Hongyu (author) 
Core Title AID scanning & catalysis and the generation of high-affinity antibodies 
Contributor Electronically uploaded by the author (provenance) 
School College of Letters, Arts and Sciences 
Degree Doctor of Philosophy 
Degree Program Molecular Biology 
Degree Conferral Date 2024-05 
Publication Date 05/17/2024 
Defense Date 05/02/2024 
Publisher Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital) 
Tag affinity maturation,AID,antibody,catalysis,OAI-PMH Harvest,scanning 
Format theses (aat) 
Language English
Advisor Goodman, Myron (committee chair), Chen, Lin (committee member), Lieber, Michael (committee member), Mak, Chiho (committee member), McKemy, David (committee member) 
Creator Email hongyuzh@usc.edu 
Permanent Link (DOI) https://doi.org/10.25549/usctheses-oUC113940238 
Unique identifier UC113940238 
Identifier etd-ZhangHongy-12960.pdf (filename) 
Legacy Identifier etd-ZhangHongy-12960 
Document Type Thesis 
Rights Zhang, Hongyu 
Internet Media Type application/pdf 
Type texts
Source 20240517-usctheses-batch-1155 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection) 
Access Conditions The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law.  Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright.  It is the author, as rights holder, who must provide use permission if such use is covered by copyright. 
Repository Name University of Southern California Digital Library
Repository Location USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email uscdl@usc.edu
Abstract Activation-induced deoxycytidine deaminase (AID) plays a crucial role in the human immune system by initiating somatic hypermutation (SHM) and class switch recombination (CSR). In 2003, the Goodman lab showed that the substrate for AID catalysis is single-stranded (ss)DNA and that AID scans ssDNA processively. The lab has studied the coupling between AID’s scanning and catalysis, focusing on sequence context effects. Using AID’s deamination footprints, we described AID’s activity through a mathematical model validated with experimental data. This model implies a coupled relationship between AID’s scanning and catalysis activity except for sequences with homogeneous motifs. In this study, combining ssDNA and Next Generation Sequencing (NGS), we confirmed the model predictions on AID’s catalysis-decoupled scanning on ssDNA with homogeneous motifs. We also demonstrated how neighboring sequences influence AID’s motif preferences.

AID’s involvement in the immune system is essential for generating high-affinity antibodies (Abs). AID and error-prone DNA polymerase eta (Pol η) are upregulated upon antigen (Ag) exposure, enabling the genetic diversification required for high-affinity Ab production. We used AID and Pol η to synthesize high-affinity Abs biochemically in a test tube. By treating native llama nanobody (VHH) genes with purified AID and Pol η, we generated diversified VHH libraries. Via phage display, we selected VHHs with high affinity for Fatty Acid Amide Hydrolase (FAAH) and characterized their binding affinity. Our successful in vitro antibody diversification process highlights the potential of AID, paving a new way for antibody generation.