GENOME ENGINEERING
OF FILAMENTOUS FUNGI FOR EFFICIENT
NOVEL MOLECULE PRODUCTION
Johannes van Dijk
Degree being conferred: Doctor of Philosophy (MOLECULAR PHARMACOLOGY AND TOXICOLOGY)
FACULTY OF THE USC GRADUATE SCHOOL
University of Southern California
Degree conferral date: December 2018
Abstract
The use of fungi by humans dates back more than 5000 years. At first a trial-and-error approach was used.
Later the molecules in extracts were identified and characterized. Now, with more and more species
having their genomes sequenced, efforts to employ fungi focus on the genetic level: identifying new
enzyme functionalities and engineering genomes for increased yield or novel product discovery. In this
thesis, gene engineering and gene discovery are applied to obtain novel enzyme functionality and
molecules. The most important methods are described in Chapter 2, while additional methods are
described in the respective chapters.
Domain swapping in Non-Ribosomal Peptide Synthetase-like enzymes (Chapters 3 and 4)
A group of small non-ribosomal peptide synthetase (NRPS)-like enzymes was recently discovered in
Aspergillus terreus. The products of these enzymes include antitumor metabolites asterriquinones and
butyrolactones, and the phytotoxic phenguignardic acid. Knowledge of the mechanism of these
biosynthetic enzymes will open the way for developing analogues with improved functionality. NRPSs are
typically megadalton-size enzymes that produce peptides by selecting free amino acids, activating and
tethering them, and coupling one to the next via a condensation reaction. Each amino acid is recognized by
a dedicated protein domain, so a particular NRPS makes a specific peptide sequence, unlike the ribosome,
which uses mRNA as a template. In non-ribosomal peptide formation, non-proteinogenic amino acids can be
incorporated, and via additional protein modules the peptide can be modified during chain elongation, for
example by epimerization to generate D-amino acids, N-methylation, or side chain cyclization to generate
oxazolidines. The immunosuppressant cyclosporin is the most famous product of this type of enzyme.
Their modular architecture, with a separate domain for each amino acid and reaction, makes them a
target of engineering efforts. Libraries of NRPs could be generated by recombining the domains that
recognize amino acids. To date much effort has gone into determining the domain boundaries and where
exactly to exchange protein domains while retaining whole protein functionality. The main downside of
the engineered enzymes has been yields as low as 1% of the wildtype enzyme. To combat potential low
yields for this project, a heterologous expression system was used: Aspergillus nidulans and its powerful
native alcA promoter. NRPS-like enzymes are similar in domain architecture to their NRPS cousins. The
main difference is that they are much smaller, typically with only three domains: one for recognizing and
activating the substrate (A domain), one for tethering (T domain), and one for release (TE domain). The
other difference is that instead of an amino acid, an alpha-keto acid is activated, which means the final
product does not contain any peptide bonds. Five enzymes with these three domains were recently
identified in A. terreus and their products were confirmed via knockout studies and heterologous
expression in A. nidulans with yields of 50-100 mg/L. The products were all dimeric aromatic alpha-keto
acids, differing only in cyclization pattern. Our hypothesis was that the cyclization pattern is determined
by the TE domain so cyclization and release go hand in hand. To test this, we generated hybrids of the
genes by fusing the TE domain of one gene with the A and T domain of another. The resulting functional
hybrid enzymes had comparable yield and allowed for the structural characterization of their products,
showing that the cyclization pattern of hybrids was determined by which TE domain they contained.
Products of these hybrids were novel secondary metabolites with new combinations of heterocycles and
side chains. This result shows that our system can be used not only to study enzyme mechanisms, but also
to produce new molecules on an industrially viable scale. To generate even more novel
molecules, the tailoring enzymes of the wildtype clusters were coexpressed. Using this approach,
specifically methylated and prenylated molecules were successfully obtained. Part of this work was
published in Organic Letters in 2016. Future work will focus on generating more functional hybrids and
thus more novel molecules. These new molecules will be compared to the wildtype compounds in their
phytotoxic activity (plant assay using leaf discs) and cytotoxic activity (MTT assay) in four cancer cell
lines: NCI-H460 (non-small cell lung), MCF-7 (breast), SF-268 (CNS glioma), and MIA PaCa-2 (pancreatic).
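The domain-swapping design described above can be sketched in a few lines of Python. This is only an illustration of the combinatorics, not the cloning procedure used in the thesis: the enzyme names are hypothetical placeholders (the abstract does not name the five parent enzymes), and the prediction function simply restates the hypothesis that the TE domain determines the cyclization pattern.

```python
from itertools import permutations

# Hypothetical placeholder names: the abstract does not name the five enzymes.
ENZYMES = ["enzyme_1", "enzyme_2", "enzyme_3", "enzyme_4", "enzyme_5"]


def hybrid_designs(enzymes):
    """Return every A-T/TE pairing in which the TE domain comes from a different parent."""
    return [
        {"A_T_donor": at_donor, "TE_donor": te_donor}
        for at_donor, te_donor in permutations(enzymes, 2)
    ]


def predicted_cyclization(design):
    """Under the hypothesis in the text, the TE donor sets the cyclization pattern."""
    return f"cyclization pattern of {design['TE_donor']}"


designs = hybrid_designs(ENZYMES)
print(len(designs))  # 5 parents x 4 foreign TE domains = 20 possible hybrids
print(designs[0], "->", predicted_cyclization(designs[0]))
```

Five parents give at most 20 hybrids in this scheme. How many of them are functional is an experimental question that the sketch does not address.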
Heterologous expression of Ribosomally synthesized and Post-translationally modified Peptides
(RiPPs) (Chapter 5)
Ustiloxin B is a cyclic tetrapeptide that exhibits antimitotic properties via microtubule inhibition,
making it a cancer drug lead. It is produced by a number of crop-infesting fungi such as Aspergillus flavus
and Ustilaginoidea virens, and until recently it was thought to originate from an NRPS. However, the NRPS
was never identified. Instead, genome analysis revealed a gene containing 16 repeats of the four amino acids
corresponding to the ustiloxin B sequence, and this gene was shown to encode a ribosomally synthesized
precursor of ustiloxin B. Knockout studies determined the borders of the gene cluster, which contains 16
genes. The generation of ustiloxin B analogues could lead to stronger antitumor activity and
generally better drug properties.
NRP analogues can be created through A domain replacements in NRPSs, but in the case of ustiloxin B a
simple codon change in the precursor gene can yield analogues. To achieve this, we first expressed the
gene cluster in our heterologous host A. nidulans, which has a well-established genetic system with three
usable selection markers, so that modifications to the ustiloxin B cluster can be made easily. Previously,
biosynthetic gene clusters were expressed one gene after the other, starting with the core gene and
adding tailoring enzymes later, recycling a marker with each gene. However, for a 16-gene cluster this
would take 16 rounds of transformation. Therefore, a new method was developed for transformation of
the complete gene cluster of A. flavus into the A. nidulans host in a single step, reducing the time needed
from months to weeks. Culturing of the mutants in standard glucose minimal medium (GMM) did not
yield ustiloxin B. However, it is known that ustiloxin B is naturally produced by A. flavus when growing on
and contaminating crops. Therefore, the mutants were grown in V8 juice medium, in which ustiloxin B
was detected. Subsequent overexpression of the transcription factor of the cluster, using the
alcA promoter, increased ustiloxin B production tenfold. Future work will involve the expression of
modified precursor peptides to generate ustiloxin B analogues. These will be tested for their antimitotic
properties using an initial microtubule assembly assay, followed by an MTT assay to test cytotoxicity on the
MKN-1, MKN-7, MKN-74, RERF-LC-MA, SBC-5, MCF-7, WiDr, SW-480, and KU-2 cancer cell lines.
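The idea that a single codon change in the precursor gene yields an analogue can be illustrated with a toy model. The sketch below rests on assumptions that are not stated in the text: the core motif is taken to be YAIG (Tyr-Ala-Ile-Gly), the codon choices are arbitrary, and the 16 repeats are concatenated without any spacer residues, which a real precursor gene would contain.

```python
# Toy model of a ribosomally synthesized precursor with 16 core repeats.
# Assumptions (not taken from the text): the core motif YAIG, the codon
# choices below, and the absence of spacer residues between repeats.
CODON_TABLE = {"TAT": "Y", "GCT": "A", "GTT": "V", "ATT": "I", "GGT": "G"}
CORE_CODONS = ["TAT", "GCT", "ATT", "GGT"]  # Tyr-Ala-Ile-Gly
N_REPEATS = 16


def translate(dna):
    """Translate a DNA string codon by codon using the minimal table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def swap_codon(dna, position, new_codon, repeat_len=4):
    """Replace the codon at `position` (0-based) within every core repeat."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    for idx in range(position, len(codons), repeat_len):
        codons[idx] = new_codon
    return "".join(codons)


wild_type_gene = "".join(CORE_CODONS) * N_REPEATS
analogue_gene = swap_codon(wild_type_gene, position=1, new_codon="GTT")  # Ala -> Val

print(translate(wild_type_gene).count("YAIG"))  # 16
print(translate(analogue_gene)[:8])             # YVIGYVIG
```

In this toy gene, changing one codon per repeat converts every core peptide at once. Whether the tailoring enzymes of the cluster accept the altered core is a separate experimental question.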
In this work it is shown that besides the enzymes Nature provides us, genome engineering can open up a
chemical space currently not occupied, with seemingly endless possibilities. As engineering efforts
increase, guided by structural data and high-throughput screening, one day a new natural enzyme will be
discovered that has the same functionality as an existing engineered one. Until that day efforts on
engineering and natural discovery both have value.
Acknowledgements
First and foremost, I thank my amazing wife Dr. Marta Kubała for her relentless support during my whole
PhD. Besides your moral support, I hugely benefitted from your vast experience in microbiology and I
credit you for many breakthroughs. You are my biggest trump card in life and I am thrilled to continue our
adventure together.
I have been very lucky to land in the laboratory of my supervisor Professor Clay Wang and I thank him for
taking me into his lab after my first rotation even though he had already taken two other students. Your
unique view on projects and ability to come up with new ones to secure funding are things that inspired my
work. I was happy to serve as your lab manager, which taught me many things regular bench work does
not.
Special thanks to C.J. Guo, who mentored me in my first months in the lab and whose project ideas started
my thesis work. Your impressive Krown Fellowship seminar in one of my first weeks at USC inspired me
to aim high.
I thank all my current and former lab mates for the good atmosphere. I will always remember the trips to
Las Vegas and the many inspiring debates during lunch. CJ, Junko, Shore, Hsu Hwa, Mike, Jim, Weiwen, Yi-
Ming, Jillian, Kevin, Steven, Michelle, Ada, Yien, Chris, Patrick and Eva. You have all either moved on
further in your careers or are still going strong in the lab, and I hope we stay in touch.
Eva, I am passing my torch/projects on to you. I am sure you are going to be very successful in this lab and
I hope you will publish many papers in the coming year.
I have greatly enjoyed supervising temporary students. Stephanie, you were my first undergrad and your
diligence and independence are greatly appreciated. Iarlaith, you are one of the most curious students I
have had the pleasure to work with. I hope your studies in Ireland, or wherever you are visiting now, are
going well. Austen, we have worked together for the longest time and we did some exciting things with
domain linkers.
I would like to thank all the staff at the School of Pharmacy for facilitating the many aspects of grad school,
from booking rooms to ordering supplies to getting paid.
Lastly, I would like to thank USC and the School of Pharmacy for admitting me into the program and
providing me with various scholarships that made it possible for me to obtain my PhD. Admission into US
universities as a foreign student is not easy, but USC made it possible for me. Fight on!
Table of Contents
Chapter 1 Historic overview of fungal research relevant to this thesis
  The premolecular age
  The pregenomic age
  The post-genomic age
  References
Chapter 2 Heterologous expression of fungal secondary metabolite pathways in the Aspergillus nidulans host system
  Introduction
  Identification of secondary metabolite genes in fungal genomes
  Design primers for fusion PCR
  Obtain genomic DNA for PCR template
    Materials
    DNA extraction from hyphae
    Materials
    DNA extraction from spores
  Fusion PCR construction
    Materials
    Genomic PCR
    One Pot Fusion Reaction
    Two Pot Fusion Reaction
  Transformation
    Materials
    Protoplasting
  Diagnostic PCR
  Liquid culturing of mutant strains
    Materials
    Culturing strains in glucose minimal media (GMM)
  Recipes
  Conclusions
  References
Chapter 3 Engineering fungal nonribosomal peptide synthetase-like enzymes by heterologous expression and domain swapping
  References
  Supporting information
Chapter 4 Expanding the chemical space of Nonribosomal Peptide Synthetase-like Enzymes by domain and tailoring enzyme recombination
  References
  Supporting information
Chapter 5 One step heterologous expression of the ustiloxin B gene cluster
  References
  Supporting information
Chapter 6 Discussion
Chapter 1
Historic overview of fungal research relevant to this thesis
Mankind’s ever-curious nature has always driven it to touch new things, but also to put new things in its
mouth. Some things gave a pleasurable experience, like fruit; others had an awful yet harmless taste, like
Brussels sprouts. Some were lethal, and the surviving members of a group learned from those who tried
new things what was good to eat and what was not, apart from naturally repulsive things like rotten
items. Mushrooms were among the items that went through this human selection process: some were
deemed edible, others toxic or even lethal, like Amanita phalloides, appropriately known as the death
cap. Other mushrooms turned out to have a special effect upon ingestion, namely hallucinations. These
effects were not understood at the time, so these mushrooms gained a mythical status. Rock paintings
that could date back to 9000 BC depict shamans holding mushrooms (Figure 1), and in later native
Mesoamerican cultures mushrooms played a big role in religious rituals.
Figure 1. A cave drawing from Tassili n’Ajjer, a sandstone rock formation in southeastern Algeria, showing
what appears to be a bee-faced shaman covered in mushrooms. Drawn by Kat Harrison McKenna from a
photograph by Lajoux (1961).
https://radicalmycology.files.wordpress.com/2013/11/untitled-1.png
It was not until the 1950s that examples of molecules responsible for these experiences were extracted
and identified, namely psilocybin and psilocin. Now we know that they bind to serotonin receptors in our
brain to exert their effect, but that was obviously not imaginable to the Aztecs, let alone to the producers,
the mushrooms. Despite enormous advances in technology and scientific knowledge over the last
thousands of years, not much has changed in the application of hallucinogenic mushrooms: they are
simply eaten. This is not the case for many other fungi. The study of fungi can roughly be divided into three
periods: the premolecular, the pregenomic, and the post-genomic age.
The premolecular age
In the premolecular age the exact molecules found in fungi or fungal extracts were not known. Instead,
their effects were established through trial and error, recorded, passed on, and expanded upon generation
after generation. Much of present-day Chinese medicine was constructed that way, although the molecular
bases of these traditional healing methods are increasingly being unraveled. An early example is the use of
Fomitopsis betulina as an anti-parasitic more than 5000 years ago, as evidenced by the discovery of the
fungus among the possessions of a mummified ice man with parasite eggs in his rectum [1] (Figure 2). Also,
Agaricus campestris, a field mushroom closely related to the regular supermarket mushroom, has
traditionally been used to treat throat cancer, boils, and abscesses when stewed in milk [2]. More culinary
than medicinal has been the use of various Aspergillus species, namely in the production of soy sauce in
Asia from around 2200 years ago. The millennia-old use of yeast for making beer and bread also deserves a
mention here as part of the premolecular age.
Figure 2. Fomitopsis betulina as found on the mummified ice man. This find indicates one of the earliest medicinal uses of fungi.
The pregenomic age
While it is hard to define the exact start of the post-molecular or pregenomic age, it can be characterized
by a growing understanding of which molecules in fungal extracts are the active components. Once it was
understood that particular compounds in fungal extracts can be useful to humans, the mining of fungi for
potential molecules really caught on. Of course, the real explosion of fungal natural product discovery was
sparked by Fleming’s discovery of penicillin in 1928 [3]. Penicillin is a great example of the transition
from the premolecular to the post-molecular age. Fleming initially coined the term “penicillin” as an
abbreviation of “mould broth filtrate”, and his entire 1929 publication does not involve purified
compound. Later, scale-up of production and purification by others led to the implementation of penicillin
as a drug against infections that had been deadly until then. Interestingly, the first clinical trial was
successful, but there was not enough penicillin to complete the treatment and the patient still died of his
infection. With the Second World War in full swing, the British and American Allies threw their full force
behind penicillin scale-up efforts to cure their affected soldiers, hoping it would give them an edge over
the Nazis. Initially, brute force was applied, with some labs culturing 500 L per week of the initial
P. chrysogenum strain. It must be noted that in these times of war, government and several large
pharmaceutical companies joined forces to produce enough penicillin. Improvements in yield were made by
optimizing the culture medium, though further scale-up was hampered by the inability to produce penicillin
in submerged cultures. A worldwide search for a strain that produces penicillin under submerged conditions
resulted in a strain from an infected cantaloupe at a local farmers’ market close to a US government
research facility, which could produce twice as much as the most optimized strain so far. Further
optimization through radiation-induced mutagenesis made penicillin accessible worldwide, also to civilians.
Now, industrial P. chrysogenum strains reach titers as high as 30 mg/ml, an incredible 12,500-fold increase
from the 2.4 μg/ml starting point. After the
penicillin breakthrough, efforts to find new antibiotics from fungi and other microorganisms increased
sharply in the 1940s, 1950s, and 1960s. Two of the biggest classes of compounds found in
this time were polyketides and polypeptides. The origin of these compounds was often unknown and
hypotheses on the biosynthetic origin were investigated [4]. Based on the structure of polyketides, the
hypothesis was formed that they are polymers of a limited pool of precursors. Radiolabeling of acetic acid
showed it is used to form fatty acids and the same was hypothesized for polyketides [5]. In the 1950s
radiolabeling was used to determine which starter units are incorporated into polyketide natural
products, (methyl)malonyl and acetyl being the major ones. Laboratory chemical synthesis was used to
explain the biosynthesis of polyketides from the building blocks. The need for catalysts in the subsequent
condensation of carboxylic acids to form polyketides meant there must be appropriate but different
catalysts intracellularly. The role of Coenzyme A as a means to activate carboxylic acids as thioesters via
its sulfur atom was soon discovered [6]. Further enzymology was done by generating cell-free extracts of
microorganisms producing a certain polyketide and then applying various purification techniques. By
analogy with fatty acid synthase (FAS) typology, polyketide synthases (PKSs) in bacteria were found to be of
type II (iterative), where several enzymes together produce a fatty acid or polyketide and each enzyme can
be studied separately, and in eukaryotes like yeast or mammalian species of type I (non-iterative), where
fatty acids or polyketides are formed by a single enzyme with repeated covalent modules.
Figure 3. Adapted from [7]. (A) Type I PKS/FAS have covalently linked domains where the building blocks move from domain to
domain and each domain acts once per molecule. (B) Type II PKS/FAS consist of separate proteins that aggregate where the T
domain moves the building blocks from domain to domain repetitively. For domain descriptions, see below.
Among the first fungal cell-free extracts to be functionally tested was that of Penicillium patulum, which
produces patulin, a mycotoxin and clinically failed antibiotic. It was shown that crude cell-free extract
(after patulin depletion) could still produce patulin when labeled acetyl-CoA, but not labeled acetate
itself, was added. This showed that the enzymes required to produce acetyl-CoA were absent from the
extract, but the enzymes that make patulin from acetyl-CoA were present. Bassett and Tanenbaum [8]
explored the involvement of 6-methylsalicylate (6-MSA) in patulin biosynthesis, but it was only ten years
later that the individual PKS was isolated from P. patulum and shown to generate 6-MSA from acetyl- and
malonyl-CoA [9].
The identification of specific PKS genes had to wait for recombinant DNA technology to be developed in
the 1970s and 1980s. One of the key breakthroughs came from the Hopwood lab, when they identified
the PKS gene involved in actinorhodin biosynthesis in Streptomyces coelicolor [10]. Through cosynthesis
experiments with other mutants, a group of genes responsible for actinorhodin production was identified.
The mutants were shown to cosynthesize unidirectionally, which allowed the first polyketide biosynthetic
pathway to be postulated. They also determined that these genes were all close to each other in the
genome, giving rise to the idea of biosynthetic gene clusters. One important aspect of this research is that
the compound being investigated was a visible dye, allowing for easy identification of impaired mutants.
Visible pigments are still used today in filamentous fungi for developing novel genetic techniques like
CRISPR-Cas9. One of the earliest fungal PKS genes to be sequenced was 6-MSA synthase (6-MSAS) [11].
Since the PKS had been purified previously, an antibody could be generated and an expression library in
Escherichia coli could be screened for 6-MSAS presence.
The sequence of the fungal 6-MSAS was found to be similar to type I FAS.
Figure 4. Molecular structures of various building blocks, intermediates, and final natural products.
Research on polypeptide natural product biosynthesis started off slightly differently from that on
polyketides. Where polyketides were analyzed from a chemical synthesis approach, the peptide natural
product biosynthesis question was asked in the light of the emerging understanding of ribosomal
biosynthesis [12]. Remarkably, when antibiotics that inhibit the ribosome were applied, a certain peptide,
tyrocidine, was still produced [13].
This indicated a ribosome-independent route for peptide formation, thus the enzymes involved were
dubbed non-ribosomal peptide synthetases (NRPSs). A synthetase requires ATP for its bond formation,
whereas a synthase does not. Initially it was postulated by Lipmann that NRPSs might be evolutionary
progenitors of the ribosome, but that turned out not to be true [14, 15]. However, Lipmann did contribute
to the overall understanding of the enzymology of NRPSs, most notably the modularity of the enzymatic
mechanism, which will be described later in this chapter. This initial work was mostly done in bacteria
since protein extraction is easier than in fungi. The first fungal NRPSs were characterized in the 1980s,
namely those involved in penicillin and cyclosporin production [16-18]. The rise of DNA sequencing in the
late 1980s led to the identification of the NRPS genes for these products. One of the first NRPS genes
to be sequenced, in 1988, was tycA from Bacillus brevis, which encodes tyrocidine synthetase I. This work
was done by Marahiel and coworkers [19, 20]; Marahiel was one of the pioneers in sequencing biosynthetic
gene clusters for NRPSs. The first fungal NRPS gene sequenced was pcbAB from Penicillium chrysogenum
in 1990 [21]. The earlier proposed NRPS modularity was crucial in interpreting the sequence data, where the
number of module repeats could be correlated to the number of amino acid residues in the NRP.
The tools to study PKS and NRPS enzymology in the pre-genomic age allowed for the description of the
basic reaction mechanisms of both enzymes. Enzymatic elucidation of PKSs, NRPSs, and other biosynthetic
enzymes continues well into the post-genomic age and novel reaction mechanisms are still being reported
to date. Nevertheless, the basic elements or protein domains for each enzyme have been described, along
with common variable domains [7]. PKSs have three basic domains for elongation of their building
blocks. The thiolation (T) domain, which is also present in all NRPSs, serves as a placeholder for growing
polymer chains via an active thioester. Early research hypothesized a cysteine was involved, but fatty acid
biosynthesis research established early on the role of 4’-phosphopantetheinyl (4’-PPT) in generating a
thioester. A specific conserved serine in all T domains is post-translationally modified by
phosphopantetheinyl transferase, most likely to generate space between the enzyme and the growing
polymer (Figure 5C). The two other domains in PKSs, the ketosynthase (KS) and acyltransferase (AT)
domains, have catalytic activity. AT domains select malonyl-CoA or methylmalonyl-CoA (in the case of
fungal PKSs only malonyl-CoA) and transfer the acyl group from the CoA thioester to the thiol of 4’-PPT at
the downstream T domain. The KS domain then takes the acyl thioester from the upstream T domain and
transfers it to its own cysteine. It then catalyzes the bond-forming reaction between its own
cysteine-bound acyl thioester and the downstream 4’-PPT-bound acyl thioester, resulting in decarboxylation
of the downstream substrate and leaving the extended unit bound to the downstream T domain, ready for
another round with further downstream domains (Figure 5A). In type I PKSs, each domain is used once, so for
each extension an additional group of domains is found in the PKS. Type II PKSs have only one set that is
reused repeatedly until the final-length polyketide is formed. The regulation of the number of iterations
is still a subject of study, and lack of understanding impedes engineering efforts of this type of PKS. Three
additional, optional domains are found in PKSs: ketoreductase (KR), dehydratase (DH), and
enoylreductase (ER) domains, which sequentially reduce a ketone, dehydrate a hydroxyl group, and reduce
a double bond. Each intermediate can be found in final polyketides depending on the presence of one or
more of these domains in the PKS. Initiation of the PKS occurs either through a partially inactive KS domain,
so that only decarboxylation takes place, or through an initiating AT domain that selects an acyl-CoA
instead of a (methyl)malonyl-CoA. Termination of chain formation is catalyzed by the final thioesterase (TE)
domain, which hydrolyzes a fully grown polyketide, yielding a free acid, or facilitates intramolecular
cyclization, releasing a cyclic polyketide.
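The effect of the optional reductive domains can be summarized in a small Python sketch. It is a simplified bookkeeping model, not a simulation of the chemistry: the state labels are informal, and the rule that a later domain does nothing when an earlier one is missing is an assumption made here for illustration.

```python
# Optional reductive domains act in a fixed order: KR, then DH, then ER.
REDUCTIVE_ORDER = ["KR", "DH", "ER"]
STATE_AFTER = {"KR": "hydroxyl", "DH": "double bond", "ER": "saturated"}


def beta_carbon_state(optional_domains):
    """Return the beta-carbon state left by one module's optional domains."""
    state = "ketone"  # product of the KS-catalyzed condensation
    for domain in REDUCTIVE_ORDER:
        if domain not in optional_domains:
            break  # assumption: later domains need the product of the earlier one
        state = STATE_AFTER[domain]
    return state


def type_i_chain(modules):
    """Each type I module acts once, so one state is produced per module."""
    return [beta_carbon_state(m) for m in modules]


modules = [set(), {"KR"}, {"KR", "DH"}, {"KR", "DH", "ER"}]
print(type_i_chain(modules))
# ['ketone', 'hydroxyl', 'double bond', 'saturated']
```

For a type I PKS as described above, the length of the list equals the number of extension modules. An iterative enzyme would instead reuse a single module, which this sketch does not attempt to model.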
Figure 5. The basic domains of (A) PKSs and (B) NRPSs with the basic reaction mechanism of each. (C) Structure of the 4’-PPT
that is post-translationally tethered to the T domain.
A typical core module of an NRPS also consists of three domains (Figure 5B). As in PKSs, a T domain can
be found with the same tethering function for growing polymers. Additionally, the module contains an
adenylation (A) domain and a condensation (C) domain. The A domain recognizes amino acids and activates
them using ATP, hence the name synthetase, and transfers the activated amino acid to the 4’-PPT arm of the
adjacent T domain. A catalytic base in the C domain facilitates peptide bond formation between two
molecules on adjacent T domains. Initiation can be achieved through a starting A domain and downstream
T domain for the first amino acid to be attached to. Termination occurs through a process similar to that of
PKSs, where a TE domain catalyzes hydrolysis or cyclization to release the product from the enzyme. Notable
optional domains are epimerization (E) domains, which change an amino acid’s configuration from L to D, and
condensation domains that cyclize the side chains of amino acids like cysteine and serine to form cyclic
adducts during assembly.
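The modular logic of an NRPS, with one module per residue, an A-T initiation module, optional E domains, and release by the TE domain, can be sketched as a simple data model. The class and function names below are hypothetical, and the amino acids in the example are chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """One NRPS module; the amino acid names used below are illustrative only."""
    amino_acid: str             # substrate selected and activated by the A domain
    has_c_domain: bool = True   # the initiation module (A-T only) lacks a C domain
    has_e_domain: bool = False  # optional epimerization domain (L -> D)


def assemble(modules, release="hydrolysis"):
    """Walk the assembly line and describe the product, one residue per module."""
    if not modules or modules[0].has_c_domain:
        raise ValueError("the first module must be an initiation module (A-T only)")
    if release not in ("hydrolysis", "cyclization"):
        raise ValueError("the TE domain releases by hydrolysis or cyclization")
    residues = []
    for module in modules:
        configuration = "D" if module.has_e_domain else "L"
        residues.append(f"{configuration}-{module.amino_acid}")
    return {"residues": residues, "release": release}


line = [
    Module("Phe", has_c_domain=False, has_e_domain=True),
    Module("Pro"),
    Module("Val"),
]
product = assemble(line, release="cyclization")
print(product["residues"])                    # ['D-Phe', 'L-Pro', 'L-Val']
print(len(product["residues"]) == len(line))  # True
```

The final check mirrors the observation made earlier in this chapter that the number of module repeats correlates with the number of residues in the NRP.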
As shown above, the pre-genomic age produced a wealth of knowledge on (fungal) natural product
biosynthesis, and many compounds were discovered during the boom in the 1950s and 1960s, when many
large pharmaceutical companies attempted to mine the wealth of natural products by screening
thousands of microorganisms. However, this initial well dried up, and the rise of synthetic libraries and
high-throughput screening contributed to a loss of interest in natural products, though many synthetic
drug leads are still derived from or inspired by natural products [22].
The post-genomic age
Much of the knowledge on the modular function of biosynthetic core enzymes is derived from their DNA
sequence. The identification and individual sequencing of these genes was a time-consuming effort, but
the automation of gene sequencing in the 1990s made whole genome sequencing possible. Smaller
genomes were sequenced first, but as size constraints loosened, priority was given to the most relevant
species, such as humans and model organisms for research and industry like
Saccharomyces cerevisiae (baker’s yeast). The first genome of a free-living organism to be sequenced, in 1995, was that of
Haemophilus influenzae, which at 1.8 million base pairs is relatively small. One year later the first eukaryote, S.
cerevisiae, was sequenced at 12 million base pairs. Other significant species sequenced during that time
were Drosophila melanogaster (fruit fly, 139.5 million base pairs in 2000), Homo sapiens (3.2 billion base
pairs in 2001), and Mus musculus (laboratory mouse, 2.6 billion base pairs in 2002). The genome sequence
of the first filamentous fungus was released in 2003, when the Neurospora crassa genome (40 million base
pairs) was published in Nature on the 50th anniversary of the publication of the structure of DNA [23]. More
filamentous fungi followed in the years after as sequencing time and cost decreased, most notably
A. fumigatus, A. nidulans, A. terreus, and A. oryzae from the Aspergillus genus [24-26]. A. fumigatus is a notorious human
pathogen, A. terreus produces lovastatin (a cholesterol lowering drug), A. oryzae is used in the production
of soy sauce and sake, while A. nidulans is a model organism widely used in academia. Aspergillus nidulans
genetics were pioneered by Pontecorvo, who chose the fungus as a model for the genetic analysis of
microorganisms [27]. Before modern genetic tools were developed, A. nidulans genetics were studied
using irradiation-induced mutations and the organism’s ability to form crossovers. A. nidulans can form heterokaryons
with either itself or genetically distinct strains, resulting in cells that contain two different nuclei [28]. Two genetic mutations
that impair conidial color production were identified as complementary, and this was used to select
for successful crossover between two strains: the heterokaryon would form a diploid and restore conidial
color. The diploid spontaneously breaks down to haploid cells, and this whole process happens asexually,
allowing for genetic analysis of other species whose sexual stages were unknown. The advent of PCR
decades later allowed for DNA-mediated transformation where, after enzymatic removal of the cell wall,
foreign DNA carrying a selectable marker could be introduced either into the fungal genome by
spontaneous integration or on self-replicating vectors. Selectable markers either complement critical
mutations, often nutritional (Figure 8), or impart resistance to a certain toxin added to the selective
growth media (Figure 7). Nutritional complementation gives fewer false positives than toxin
resistance, since the fungus can either make the nutrient or not, whereas toxin resistance is always
concentration and media dependent. Fungal protoplasts can be transformed by adding linear, PCR-constructed
DNA that integrates heterologously (at non-homologous sites) into the host’s genome. This is often undesired as it may
disrupt crucial genes at random.
Instead, homologous stretches of DNA are added to the 5’ and 3’ ends of the transforming DNA fragments
to facilitate homologous recombination at a specific locus (Figure 6). This way genes or promoters can be
inserted at specific places, or specific genes can be knocked out by a selectable marker. In most eukaryotes,
there are two major pathways of DNA double strand break repair: homologous recombination and
non-homologous end joining. Non-homologous end joining is missing in most bacteria but is an important repair
mechanism in eukaryotes, especially for haploids, which have no homologous template during the G1 phase
of the cell cycle. Homologous recombination uses an intact homologous DNA strand to repair the break,
often resulting in crossover between alleles. However, in Aspergillus nidulans the rate of homologous
recombination of foreign DNA is naturally low: 0-40%, depending on the length of the homologous DNA.
This means that many transformants must be genetically analyzed to find one with the desired single
insertion or deletion.
Figure 6. Mechanism of double strand DNA break repair via homologous recombination with an exogenous DNA fragment. After
the double strand DNA break a complex of proteins is recruited and facilitates DNA resection around the break. A nucleoprotein
filament then binds to the single stranded 3’ overhang and finds DNA sequences similar to it and invades the recipient DNA duplex
(in this case an exogenous construct). DNA polymerase then extends the invading strand on the homologous recipient strand.
The other 3’ overhang also binds to the recipient DNA and after more extension, nicking endonucleases cut single strand DNA in
the crossing sections. These cuts can result in either crossover or non-crossover recombinant sequences. If the exogenous DNA
construct is designed correctly, the crossover and non-crossover results are identical.
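A short calculation shows what a homologous integration rate in the 0-40% range means for screening effort. The sketch assumes, as a simplification not made in the text, that each transformant is an independent trial with the same probability p of carrying the correct integration, so that the chance of finding at least one correct transformant among n is 1 - (1 - p)^n. The function name and the 95% confidence target are illustrative choices.

```python
import math


def transformants_needed(p_correct, confidence=0.95):
    """Smallest n with P(at least one correct transformant) >= confidence.

    Assumes independent transformants with identical success probability,
    so that P(at least one correct) = 1 - (1 - p_correct) ** n.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if p_correct == 0.0:
        return math.inf   # no number of colonies is enough
    if p_correct == 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_correct))


if __name__ == "__main__":
    for p in (0.05, 0.1, 0.4, 1.0):
        print(f"p = {p:.2f}: screen {transformants_needed(p)} transformants")
```

Under these assumptions, a 10% targeting rate requires screening 29 transformants for a 95% chance of finding a correct one, whereas a 40% rate requires 6.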
The sequencing of strains from the Aspergillus genus allowed for a systematic approach both to assigning
biosynthetic genes to their respective products and to discovering new ones. The advances that
whole genome sequencing brought to secondary metabolism studies in filamentous fungi can be
illustrated by Aspergillus nidulans, though parallel advances have been made in other species such as
Aspergillus niger, A. oryzae, and A. fumigatus, to name a few. Shortly before the Aspergillus nidulans genome
sequence was published in 2005 [24], a method was developed to drastically increase the gene targeting
efficiency in Neurospora crassa [29]. Gene targeting in Saccharomyces cerevisiae has always been very
efficient since the species preferentially uses homologous recombination to repair its DNA, making yeast
a widely used model eukaryote. Many other organisms, however, have a robust non-homologous end
joining mechanism to repair DNA double strand breaks. This mechanism involves the Ku protein, which
binds to broken DNA and aligns the two broken ends so that the DNA repair machinery can rejoin them.
The Ku protein was initially discovered in humans and was described as a heterodimer consisting of Ku70
and Ku80 [30]. Homologs of Ku can be found in all kingdoms of life, and Ninomiya et al. hypothesized that if the
Ku homolog in N. crassa was disrupted, the homologous recombination rate would increase, making gene
targeting more efficient. Indeed, the percentage of transformants with correct integration went up from
10-30% to 100%. The combination of a fully sequenced genome and an efficient gene targeting system
prompted the scientific community to find and disrupt Ku homologs in other species and to carry out
genome-wide gene targeting studies. Soon after the Aspergillus nidulans genome sequence with 13-fold coverage
was published, Nayak et al. reported that deleting the Ku70 homolog, designated nkuA,
improves homologous recombination rates in Aspergillus nidulans [31]. Around the same time, ku deletion
strains were developed for A. fumigatus and A. niger [32, 33].
Thanks to the pioneering work of Pontecorvo the genetics of A. nidulans were thoroughly described, which
allowed for the development of a unique platform. There were many auxotrophic mutants available,
which allowed for selection of multiple nutritional markers. The argB- mutation, requiring arginine
supplementation, was used to knock out nkuA in A. nidulans, using the native argB gene from an argB+
strain. Three more auxotrophic mutations are present in the platform: riboB2, pyroA4, and pyrG89. RiboB
is required for the biosynthesis of the essential vitamin riboflavin, while similarly the pyroA mutant needs
to be supplemented with pyridoxine, another essential vitamin. PyrG encodes orotidine 5'-phosphate
(OMP) decarboxylase, an enzyme for pyrimidine biosynthesis. Supplementation with uracil (and
sometimes uridine) is needed for growth of the pyrG89 mutant. Additionally, selection for non-functional
pyrG is possible using 5-fluoroorotic acid (5-FOA), which is converted to the toxic metabolite
5-fluorouracil by OMP decarboxylase. This allows for back-and-forth selection, in which DNA
fragments can successively be inserted into or deleted from the genome. This system has been applied
mainly to three areas: targeted gene deletions, overexpression of native genes, and heterologous expression
of foreign genes. These three aspects will be illustrated using examples from secondary metabolism
research, though they have been applied to study primary metabolism and other cell functions in A.
nidulans extensively.
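The selection logic described above, nutritional complementation plus counter-selection against functional pyrG, can be written down as a small predicate. This is a toy model under stated assumptions: the names `REQUIRED_SUPPLEMENT` and `grows` are hypothetical, each marker is treated as simply functional or non-functional, uridine is ignored, and 5-FOA is assumed to be fully lethal to any strain with functional pyrG.

```python
# Supplement a strain needs when the corresponding marker gene is non-functional.
REQUIRED_SUPPLEMENT = {
    "argB": "arginine",
    "riboB": "riboflavin",
    "pyroA": "pyridoxine",
    "pyrG": "uracil",
}


def grows(functional_genes, supplements, with_5foa=False):
    """Predict growth of a strain on a selective plate (toy model).

    functional_genes: marker genes that are functional in the strain.
    supplements:      nutrients added to the medium.
    with_5foa:        True if the plate contains 5-fluoroorotic acid.
    """
    functional = set(functional_genes)
    provided = set(supplements)
    unknown = functional - set(REQUIRED_SUPPLEMENT)
    if unknown:
        raise ValueError(f"unknown marker genes: {sorted(unknown)}")

    for gene, supplement in REQUIRED_SUPPLEMENT.items():
        if gene not in functional and supplement not in provided:
            return False   # auxotrophy neither complemented nor supplemented
    if with_5foa and "pyrG" in functional:
        return False       # functional pyrG makes 5-FOA toxic
    return True


if __name__ == "__main__":
    recipient = {"argB", "riboB", "pyroA"}      # pyrG non-functional
    transformant = recipient | {"pyrG"}         # pyrG restored by the inserted fragment
    print("recipient, no uracil:      ", grows(recipient, []))
    print("transformant, no uracil:   ", grows(transformant, []))
    print("recipient, uracil + 5-FOA: ", grows(recipient, ["uracil"], with_5foa=True))
    print("transformant, uracil + 5-FOA:", grows(transformant, ["uracil"], with_5foa=True))
```

Omitting uracil selects for strains that gained pyrG, while uracil plus 5-FOA selects for strains that lost it, which is what makes repeated rounds of insertion and deletion possible with a single marker.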
Combining the detailed sequence data with an efficient gene targeting method, Chiang et al. attempted
to link biosynthetic genes coding for NRPSs to their respective products by specifically deleting six NRPS
genes and then looking for changes in the secondary metabolite profile [34]. Only one deletion
mutant showed a change in HPLC profile, which suggests that the other five genes are silent under the culturing
conditions used in this experiment. The deletion of AN2545.3 resulted in the disappearance of five
compounds from the secondary metabolic profile (three depicted in Figure 9), though production in the
wild type strain was low: 10 liters of media had to be fermented to yield 1.2-2.8 mg of each
compound. When 2-D NMR was used to elucidate the structure of each, one proved to have been previously described
in the literature as emericellamide A. The structures of the other four compounds were elucidated and, because of
their similarity, they were named emericellamides C-F. The biosynthetic pathway was further probed by deleting
genes neighboring AN2545.3, in accordance with the gene clustering theory. Three more gene knockouts ablated
emericellamide production; the deleted genes code for a highly-reducing type I PKS, an acyl-CoA ligase, and an
acyltransferase. The proposed pathway starts at the PKS, after which its product is activated by the
acyl-CoA ligase and transferred to the NRPS by the acyltransferase. Instead of knocking out NRPSs, Nielsen et
al. systematically knocked out 32 PKS genes in an effort to link them to produced compounds, using a
variety of culturing conditions in the hope of activating otherwise silent gene clusters [35]. Several genes
previously linked to known products were linked to secondary products, illustrating that biosynthetic
pathways are rarely linear and that intermediates can be utilized by more than one pathway. One gene
that had not been linked to any product was identified when the knockout of AN8383 failed to produce the
meroterpenoids austinol and dehydroaustinol. Overexpression of AN8383 yielded an intermediate in
agreement with previously proposed pathways for this class of meroterpenoids based on earlier
radiolabeling studies.
Figure 7. Four selective antibiotics used in this dissertation. Each has a distinct toxicity and resistance mechanism. An advantage
of antibiotics compared to using nutritional selection is that auxotrophic strains are not needed and so these can be used in wild
type strains. A disadvantage is that the effective antibiotic concentration is dependent on the culturing conditions so the chance
of false positives and negatives is higher and the antibiotic has to be tested for each new strain and condition.
Figure 8. Four nutritional supplements for their respective auxotrophic mutants. The genes can be either disrupted by mutation
or completely knocked out. Complementation can be achieved through reintroduction of the functional enzyme from the same
or a compatible species. Selection for successful reintroduction is done by omitting the respective supplement. Any additional
genes fused to the complementing enzyme will also have been introduced into the strain.
Knockout studies can sometimes generate intermediates, especially when tailoring enzymes are knocked
out and stable intermediates can be isolated. These intermediates can give clues to the function of
the enzyme that was removed from the pathway, though many intermediates are undetectable, unstable,
or reactive, which can give a distorted picture of the biosynthesis of the final molecule. Another way to
obtain intermediates is by overexpressing key enzymes, which has the added advantage that otherwise
silent genes are turned on and can be linked to their intermediate product, if not the final product. Whereas
knockout strains are compared with the wild type to look for the disappearance of metabolites, in
overexpression strains one looks for the appearance of compounds.
Figure 9. Extracted Ion Currents (EICs) of three emericellamides (A, C, D). Knockout of AN2545.3 results in the disappearance of
the peaks corresponding to the emericellamides.
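The two comparisons, disappearance of metabolites in a knockout and appearance of metabolites in an overexpression strain, amount to set differences between metabolite profiles. The sketch below is a minimal illustration; the function name `compare_profiles` is hypothetical, and the peak lists are made-up examples that borrow compound names from the text and do not represent measured data.

```python
def compare_profiles(reference, mutant):
    """Return (lost, gained) between two metabolite profiles.

    lost:   compounds detected in the reference strain but not in the mutant.
    gained: compounds detected in the mutant but not in the reference strain.
    """
    ref, mut = set(reference), set(mutant)
    return sorted(ref - mut), sorted(mut - ref)


if __name__ == "__main__":
    # Made-up peak lists for illustration only.
    wild_type = ["emericellamide A", "emericellamide C",
                 "emericellamide D", "sterigmatocystin"]
    knockout = ["sterigmatocystin"]
    lost, gained = compare_profiles(wild_type, knockout)
    print("knockout lost:", lost)

    background = ["sterigmatocystin"]
    overexpression = ["sterigmatocystin", "microperfuranone"]
    lost, gained = compare_profiles(background, overexpression)
    print("overexpression gained:", gained)
```

In practice the comparison is made on chromatographic peaks rather than named compounds, so matching retention times and masses between runs is the hard part that this sketch leaves out.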
Using its excellent genetic system, Aspergillus nidulans has been genome mined extensively by
promoter exchange. Eight non-reducing polyketide synthases (NR-PKSs) had their native promoter
replaced by the alcA promoter to link these genes to their natural product (or an intermediate) [36]. Five of them
produced at least one major compound upon induction with alcohol in sufficient amounts to scale up,
isolate, and characterize. One of the genes did not contain a releasing domain, but additional
overexpression of an adjacent thioesterase increased the yield. The other two overexpressed NR-PKSs did
not produce any compound until a neighboring highly reducing polyketide synthase (HR-PKS) or fatty acid
synthase (FAS) was also overexpressed, presumably providing specialized starter units for these NR-PKSs.
In another study, thirteen putative NRPS-like enzymes identified in the A. nidulans genome were
overexpressed in the same way [37]. The only A. nidulans secondary metabolite known at that point to
originate from an NRPS-like enzyme was terrequinone A. The NRPS-like enzyme, TdiA, dimerizes two
indole-pyruvic acid molecules derived from tryptophan. Only one of the thirteen strains produced a new
metabolite, microperfuranone, which seems to be a dimerized form of phenylpyruvic acid, an
intermediate of phenylalanine metabolism. The products of the other NRPS-like enzymes were not
detected. This could mean either that the products were not extracted or detected because of their chemical nature,
or that the NRPS-like enzymes need additional enzymes to be overexpressed to yield their products.
While successful for NR-PKSs, the core enzyme overexpression approach was less efficient for NRPS-like
enzymes. Since this is not a one-size-fits-all approach, other methods to activate silent genes and gene clusters
have been and are being explored. One alternative method is to overexpress putative transcription factors
(TFs). The TF can then activate the cluster it controls. The advantage is that the expression of only
one gene must be modified to express multiple genes in concert. The relative expression levels of the
genes in the cluster are evolutionarily optimized and are not changed in this approach. Fine-tuning the
expression levels of each gene in individual (heterologous) overexpression is sometimes necessary for
sufficient product formation and is often hard to accomplish [38]. Overexpression of TFs in A. nidulans has
led to the discovery of asperfuranone, aspyridone A and B, and several emodin analogs as secondary
metabolites of this species [39-41]. However, overexpression of TFs in fourteen putative biosynthetic clusters
in A. nidulans yielded either no new compounds or insufficient amounts of them. Again, this is not a
one-size-fits-all approach, but it is a useful one nonetheless.
Figure 10. (a) HPLC analysis of the background and AN3396.4 overexpression A. nidulans strains. After induction of the alcA
promoter with cyclopentanone the production of microperfuranone can be observed as confirmed (b) by UV-vis, mass, and NMR
data (not shown).
Apart from transcription factors that regulate the expression of individual gene clusters, global regulators were also
found in the genome of A. nidulans. Through mutagenesis studies, the laeA gene was identified to have a
positive impact on the expression of the biosynthesis clusters of sterigmatocystin, penicillin, and
terrequinone A in A. nidulans [42]. The deletion of laeA reduced or eliminated transcription of these gene
clusters. More recently, a negative regulator, McrA, was discovered [43]. McrA, when overexpressed,
reduces expression of secondary metabolite clusters, for example the one involved in nidulanin
production, and when it is knocked out, production is increased. The mechanism by which both global
regulators work is related to chromatin remodeling, directly (LaeA) or indirectly (McrA), but the exact
network involved in secondary metabolite regulation has yet to be elucidated. Nevertheless, their
overexpression or deletion can facilitate discovery of cryptic gene clusters in A. nidulans and other species
with homologs.
A third way to study secondary metabolism is via heterologous expression of biosynthetic genes or entire
clusters. Since A. nidulans is very easy to use and a well-studied organism, heterologous expression of its
genes is often only performed as a control. For example, A. niger can express functional A. nidulans
enzymes in a non-native environment and was used to express micA outside of its host to show
independent functionality [37]. A. nidulans itself, however, is becoming an important host for the expression
of enzymes from other fungal species [44, 45]. Historically, Escherichia coli and S. cerevisiae have been used
more often, especially for heterologous protein expression, though for secondary metabolite enzymes
they are less suitable. Firstly, both are unable to splice introns from foreign genes, forcing the use of
cDNA, while A. nidulans can often express genes amplified directly from genomic DNA. Secondly, E. coli
and S. cerevisiae are not evolutionarily optimized to produce (fungal) secondary metabolites, so they often
lack the proper precursors or the post-translational modifications necessary for proper enzyme function. Two
ways to heterologously express genes in A. nidulans are via a replicating plasmid or through homologous
recombination into the host’s genome. The advantage of using plasmids is that the host’s genome stays intact.
Homologous insertion into the host’s genome does not require the use of plasmids, cutting out a laborious
cloning step. However, the gene must be inserted somewhere in the genome, sometimes with unexpected
and undesirable consequences due to random gene disruptions. The previously described increase in gene
targeting efficiency in A. nidulans also means heterologous constructs can be specifically inserted at a
locus of choice, essentially eliminating unexpected disruptions. This also means only one copy of the gene
construct is present and expression levels are more comparable between strains than among
transformants in which the plasmid copy number is not always constant.
The A. nidulans system has been developed further using the three nutritional auxotrophic mutations, which
can be complemented with A. fumigatus homologs to avoid the unwanted recombination at the original
marker site that can occur when the A. nidulans versions are used. Additionally, gene clusters for the major natural
products have been completely removed to create a cleaner background against which the products of the
exogenous genes can be detected, while at the same time freeing up the precursor pool for use by the
heterologous enzymes. A. nidulans expression platforms exist with the complete sterigmatocystin cluster
removed (ΔAN7804-7825), or even with the emericellamide (ΔAN2545-2549), asperfuranone (ΔAN1039-1029),
monodictyphenone (ΔAN10023-10021), terrequinone (ΔAN8512-8520), austinol (ΔAN8379-8384,
ΔAN9246-9259), F9775 (ΔAN7906-7915), and asperthecin (ΔAN6000-6002) clusters removed as well [46]. These
platforms have successfully been used to express genes and gene clusters from A. terreus and A. fumigatus [47, 48].
In this dissertation, the tools described here are used and further developed to ask biosynthetic
questions. Alternative approaches to studying secondary metabolism are also explored. Chapter 2
provides detailed materials and methods for heterologous expression in Aspergillus nidulans. Chapters
3 and 4 show the application of that method to expressing engineered hybrids of NRPS-like genes
originally from A. terreus. In Chapter 5, a novel type of fungal biosynthetic gene cluster from A. flavus is
expressed in A. nidulans using a novel method.
References
1. Capasso, L., Lancet 1998, 352 (9143), 1864-1864.
2. Wani, B. A.; Bodha, R. H.; Wani, A. H., Journal of Medicinal Plants Research 2010, 4 (24), 2598-
2604.
3. Fleming, A., British Journal of Experimental Pathology 1929, 10 (3), 226-236.
4. Bentley, R.; Bennett, J. W., Annual Review of Microbiology 1999, 53, 411-446.
5. Birch, A. J.; Donovan, F. W., Australian Journal of Chemistry 1953, 6 (4), 360-368.
6. Bassett, E. W.; Tanenbaum, S. W., Biochimica Et Biophysica Acta 1960, 40 (3), 535-537.
7. Fischbach, M. A.; Walsh, C. T., Chemical Reviews 2006, 106 (8), 3468-3496.
8. Bassett, E. W.; Tanenbaum, S. W., Federation Proceedings 1959, 18 (1), 187-187.
9. Seyffert, R.; Dimroth, P.; Lynen, F., Hoppe-Seylers Zeitschrift Fur Physiologische Chemie 1969, 350
(10), 1161-&.
10. Rudd, B. A. M.; Hopwood, D. A., Journal of General Microbiology 1979, 114 (SEP), 35-43.
11. Beck, J.; Ripka, S.; Siegner, A.; Schiltz, E.; Schweizer, E., European Journal of Biochemistry 1990,
192 (2), 487-498.
12. Felnagle, E. A.; Jackson, E. E.; Chan, Y. A.; Podevels, A. M.; Berti, A. D.; McMahon, M. D.; Thomas,
M. G., Molecular Pharmaceutics 2008, 5 (2), 191-211.
13. Mach, B.; Reich, E.; Tatum, E. L., Proceedings of the National Academy of Sciences of the United
States of America 1963, 50 (1), 175-&.
14. Gevers, W.; Kleinkauf, H.; Lipmann, F., Proceedings of the National Academy of Sciences of the
United States of America 1968, 60 (1), 269-+.
15. Gevers, W.; Kleinkauf, H.; Lipmann, F., Proceedings of the National Academy of Sciences of the
United States of America 1969, 63 (4), 1335-+.
16. Castro, J. M.; Liras, P.; Laiz, L.; Cortes, J.; Martin, J. F., Journal of General Microbiology 1988, 134,
133-141.
17. Zocher, R.; Nihira, T.; Paul, E.; Madry, N.; Peeters, H.; Kleinkauf, H.; Keller, U., Biochemistry 1986,
25 (3), 550-553.
18. Nuesch, J.; Heim, J.; Treichler, H. J., Annual Review of Microbiology 1987, 41, 51-75.
19. Weckermann, R.; Furbass, R.; Marahiel, M. A., Nucleic Acids Research 1988, 16 (24), 11841-11841.
20. Mittenhuber, G.; Weckermann, R.; Marahiel, M. A., Journal of Bacteriology 1989, 171 (9), 4881-
4887.
21. Diez, B.; Gutierrez, S.; Barredo, J. L.; Vansolingen, P.; Vandervoort, L. H. M.; Martin, J. F., Journal
of Biological Chemistry 1990, 265 (27), 16358-16365.
22. Newman, D. J.; Cragg, G. M., Journal of Natural Products 2016, 79 (3), 629-661.
23. Galagan, J. E.; Calvo, S. E.; Borkovich, K. A.; Selker, E. U.; Read, N. D.; Jaffe, D.; FitzHugh, W.; Ma,
L. J.; Smirnov, S.; Purcell, S.; Rehman, B.; Elkins, T.; Engels, R.; Wang, S. G.; Nielsen, C. B.; Butler, J.; Endrizzi,
M.; Qui, D. Y.; Ianakiev, P.; Pedersen, D. B.; Nelson, M. A.; Werner-Washburne, M.; Selitrennikoff, C. P.;
Kinsey, J. A.; Braun, E. L.; Zelter, A.; Schulte, U.; Kothe, G. O.; Jedd, G.; Mewes, W.; Staben, C.; Marcotte,
E.; Greenberg, D.; Roy, A.; Foley, K.; Naylor, J.; Stabge-Thomann, N.; Barrett, R.; Gnerre, S.; Kamal, M.;
Kamvysselis, M.; Mauceli, E.; Bielke, C.; Rudd, S.; Frishman, D.; Krystofova, S.; Rasmussen, C.; Metzenberg,
R. L.; Perkins, D. D.; Kroken, S.; Cogoni, C.; Macino, G.; Catcheside, D.; Li, W. X.; Pratt, R. J.; Osmani, S. A.;
DeSouza, C. P. C.; Glass, L.; Orbach, M. J.; Berglund, J. A.; Voelker, R.; Yarden, O.; Plamann, M.; Seiler, S.;
Dunlap, J.; Radford, A.; Aramayo, R.; Natvig, D. O.; Alex, L. A.; Mannhaupt, G.; Ebbole, D. J.; Freitag, M.;
Paulsen, I.; Sachs, M. S.; Lander, E. S.; Nusbaum, C.; Birren, B., Nature 2003, 422 (6934), 859-868.
24. Galagan, J. E.; Calvo, S. E.; Cuomo, C.; Ma, L. J.; Wortman, J. R.; Batzoglou, S.; Lee, S. I.; Basturkmen,
M.; Spevak, C. C.; Clutterbuck, J.; Kapitonov, V.; Jurka, J.; Scazzocchio, C.; Farman, M.; Butler, J.; Purcell,
S.; Harris, S.; Braus, G. H.; Draht, O.; Busch, S.; D'Enfert, C.; Bouchier, C.; Goldman, G. H.; Bell-Pedersen,
D.; Griffiths-Jones, S.; Doonan, J. H.; Yu, J.; Vienken, K.; Pain, A.; Freitag, M.; Selker, E. U.; Archer, D. B.;
Penalva, M. A.; Oakley, B. R.; Momany, M.; Tanaka, T.; Kumagai, T.; Asai, K.; Machida, M.; Nierman, W. C.;
Denning, D. W.; Caddick, M.; Hynes, M.; Paoletti, M.; Fischer, R.; Miller, B.; Dyer, P.; Sachs, M. S.; Osmani,
S. A.; Birren, B. W., Nature 2005, 438 (7071), 1105-1115.
25. Machida, M.; Asai, K.; Sano, M.; Tanaka, T.; Kumagai, T.; Terai, G.; Kusumoto, K. I.; Arima, T.; Akita,
O.; Kashiwagi, Y.; Abe, K.; Gomi, K.; Horiuchi, H.; Kitamoto, K.; Kobayashi, T.; Takeuchi, M.; Denning, D. W.;
Galagan, J. E.; Nierman, W. C.; Yu, J. J.; Archer, D. B.; Bennett, J. W.; Bhatnagar, D.; Cleveland, T. E.;
Fedorova, N. D.; Gotoh, O.; Horikawa, H.; Hosoyama, A.; Ichinomiya, M.; Igarashi, R.; Iwashita, K.; Juvvadi,
P. R.; Kato, M.; Kato, Y.; Kin, T.; Kokubun, A.; Maeda, H.; Maeyama, N.; Maruyama, J.; Nagasaki, H.;
Nakajima, T.; Oda, K.; Okada, K.; Paulsen, I.; Sakamoto, K.; Sawano, T.; Takahashi, M.; Takase, K.;
Terabayashi, Y.; Wortman, J. R.; Yamada, O.; Yamagata, Y.; Anazawa, H.; Hata, Y.; Koide, Y.; Komori, T.;
Koyama, Y.; Minetoki, T.; Suharnan, S.; Tanaka, A.; Isono, K.; Kuhara, S.; Ogasawara, N.; Kikuchi, H., Nature
2005, 438 (7071), 1157-1161.
26. Nierman, W. C.; Pain, A.; Anderson, M. J.; Wortman, J. R.; Kim, H. S.; Arroyo, J.; Berriman, M.; Abe,
K.; Archer, D. B.; Bermejo, C.; Bennett, J.; Bowyer, P.; Chen, D.; Collins, M.; Coulsen, R.; Davies, R.; Dyer,
P. S.; Farman, M.; Fedorova, N.; Feldblyum, T. V.; Fischer, R.; Fosker, N.; Fraser, A.; Garcia, J. L.; Garcia, M.
J.; Goble, A.; Goldman, G. H.; Gomi, K.; Griffith-Jones, S.; Gwilliam, R.; Haas, B.; Haas, H.; Harris, D.;
Horiuchi, H.; Huang, J.; Humphray, S.; Jimenez, J.; Keller, N.; Khouri, H.; Kitamoto, K.; Kobayashi, T.;
Konzack, S.; Kulkarni, R.; Kumagai, T.; Lafton, A.; Latge, J. P.; Li, W. X.; Lord, A.; Majoros, W. H.; May, G. S.;
Miller, B. L.; Mohamoud, Y.; Molina, M.; Monod, M.; Mouyna, I.; Mulligan, S.; Murphy, L.; O'Neil, S.;
Paulsen, I.; Penalva, M. A.; Pertea, M.; Price, C.; Pritchard, B. L.; Quail, M. A.; Rabbinowitsch, E.; Rawlins,
N.; Rajandream, M. A.; Reichard, U.; Renauld, H.; Robson, G. D.; de Cordoba, S. R.; Rodriguez-Pena, J. M.;
Ronning, C. M.; Rutter, S.; Salzberg, S. L.; Sanchez, M.; Sanchez-Ferrero, J. C.; Saunders, D.; Seeger, K.;
Squares, R.; Squares, S.; Takeuchi, M.; Tekaia, F.; Turner, G.; de Aldana, C. R. V.; Weidman, J.; White, O.;
Woodward, J.; Yu, J. H.; Fraser, C.; Galagan, J. E.; Asai, K.; Machida, M.; Hall, N.; Barrell, B.; Denning, D. W.,
Nature 2005, 438 (7071), 1151-1156.
27. Pontecorvo, G.; Roper, J. A.; Hemmons, L. M.; Macdonald, K. D.; Bufton, A. W. J., Advances in
Genetics Incorporating Molecular Genetic Medicine 1953, 5, 141-238.
28. Oakley, B. R., Reference Module in Life Sciences 2017.
29. Ninomiya, Y.; Suzuki, K.; Ishii, C.; Inoue, H., Proceedings of the National Academy of Sciences of
the United States of America 2004, 101 (33), 12248-12253.
30. Walker, J. R.; Corpina, R. A.; Goldberg, J., Nature 2001, 412 (6847), 607-614.
31. Nayak, T.; Szewczyk, E.; Oakley, C. E.; Osmani, A.; Ukil, L.; Murray, S. L.; Hynes, M. J.; Osmani, S.
A.; Oakley, B. R., Genetics 2006, 172 (3), 1557-1566.
32. Krappmann, S.; Sasse, C.; Braus, G. H., Eukaryotic Cell 2006, 5 (1), 212-215.
33. Meyer, V.; Arentshorst, M.; El-Ghezal, A.; Drews, A. C.; Kooistra, R.; van den Hondel, C.; Ram, A.
F. J., Journal of Biotechnology 2007, 128 (4), 770-775.
34. Chiang, Y. M.; Szewczyk, E.; Nayak, T.; Davidson, A. D.; Sanchez, J. F.; Lo, H. C.; Ho, W. Y.; Simityan,
H.; Kuo, E.; Praseuth, A.; Watanabe, K.; Oakley, B. R.; Wang, C. C. C., Chemistry & Biology 2008, 15 (6),
527-532.
35. Nielsen, M. L.; Nielsen, J. B.; Rank, C.; Klejnstrup, M. L.; Holm, D. K.; Brogaard, K. H.; Hansen, B. G.;
Frisvad, J. C.; Larsen, T. O.; Mortensen, U. H., Fems Microbiology Letters 2011, 321 (2), 157-166.
36. Ahuja, M.; Chiang, Y.-M.; Chang, S.-L.; Praseuth, M. B.; Entwistle, R.; Sanchez, J. F.; Lo, H.-C.; Yeh,
H.-H.; Oakley, B. R.; Wang, C. C. C., Journal of the American Chemical Society 2012, 134 (19), 8212-8221.
37. Yeh, H. H.; Chiang, Y. M.; Entwistle, R.; Ahuja, M.; Lee, K. H.; Bruno, K. S.; Wu, T. K.; Oakley, B. R.;
Wang, C. C. C., Applied Microbiology and Biotechnology 2012, 96 (3), 739-748.
38. Ajikumar, P. K.; Xiao, W.-H.; Tyo, K. E. J.; Wang, Y.; Simeon, F.; Leonard, E.; Mucha, O.; Phon, T. H.;
Pfeifer, B.; Stephanopoulos, G., Science 2010, 330 (6000), 70-74.
39. Chiang, Y. M.; Szewczyk, E.; Davidson, A. D.; Keller, N.; Oakley, B. R.; Wang, C. C. C., Journal of the
American Chemical Society 2009, 131 (8), 2965-2970.
40. Bergmann, S.; Schumann, J.; Scherlach, K.; Lange, C.; Brakhage, A. A.; Hertweck, C., Nature
Chemical Biology 2007, 3 (4), 213-217.
41. Bok, J. W.; Chiang, Y. M.; Szewczyk, E.; Reyes-Dominguez, Y.; Davidson, A. D.; Sanchez, J. F.; Lo, H.
C.; Watanabe, K.; Strauss, J.; Oakley, B. R.; Wang, C. C. C.; Keller, N. P., Nature Chemical Biology 2009, 5
(9), 696-696.
42. Bok, J. W.; Keller, N. P., Eukaryotic Cell 2004, 3 (2), 527-535.
43. Oakley, C. E.; Ahuja, M.; Sun, W. W.; Entwistle, R.; Akashi, T.; Yaegashi, J.; Guo, C. J.; Cerqueira, G.
C.; Wortman, J. R.; Wang, C. C. C.; Chiang, Y. M.; Oakley, B. R., Molecular Microbiology 2017, 103 (2), 347-
365.
44. Yaegashi, J.; Oakley, B. R.; Wang, C. C. C., Journal of Industrial Microbiology & Biotechnology 2014,
41 (2), 433-442.
45. Lubertozzi, D.; Keasling, J. D., Biotechnology Advances 2009, 27 (1), 53-75.
46. Chiang, Y. M.; Oakley, C. E.; Ahuja, M.; Entwistle, R.; Schultz, A.; Chang, S. L.; Sung, C. T.; Wang, C.
C. C.; Oakley, B. R., Journal of the American Chemical Society 2013, 135 (20), 7720-7731.
47. Guo, C. J.; Sun, W. W.; Bruno, K. S.; Oakley, B. R.; Keller, N. P.; Wang, C. C. C., Chemical Science
2015, 6 (10), 5913-5921.
48. Ryan, K. L.; Moore, C. T.; Panaccione, D. G., Toxins 2013, 5 (2), 445-455.
Chapter 2
Heterologous expression of fungal secondary metabolite
pathways in the Aspergillus nidulans host system
1. Introduction
Fungal species have been widely used to study fundamental biology of eukaryotes since they grow faster
and are easier to grow compared to higher order mammalian cells. Several species such as Aspergillus
fumigatus and Aspergillus flavus are of interest because of their pathogenicity due to both invasiveness
in immunocompromised patients and the toxic compounds they produce, for example gliotoxin and
aflatoxin. These compounds are often the product of the secondary metabolite (SM) pathways in these
fungi. SM pathways offer a myriad of organic molecules, some of which are medicinally useful,
most famously penicillin G, cyclosporine A, and lovastatin. Since many of these biosynthetic
pathways present in the genomes of fungal species have not been linked to their product, this provides
an opportunity for discovery as more (fungal) genomes are sequenced.
Fungal secondary metabolite biosynthesis pathways involve a core set of enzymes that is responsible
for the backbone of the compound. These core enzymes can be polyketide synthases (PKS),
non-ribosomal peptide synthetases (NRPS), hybrid PKS-NRPSs, or terpene cyclases (TC). Discovery of new secondary
metabolite genes is often accomplished by homology searches using known genes of these core enzymes [1-3].
Conveniently, in many cases, the other genes (coding for tailoring enzymes, resistance proteins,
transcription factors, or transporters) in the biosynthetic pathway are clustered in the genome so they
can be identified easily once the core enzymes have been identified. There are several ways to link the
identified secondary metabolite pathways with their products. One way is to create knockouts in the
producing organism
4
. In this approach the disappearance of one or more compounds in the secondary
metabolite profile in the mutant strain compared to a control strain would indicate that the deleted gene
in the mutant is necessary for product formation. This method is quite straightforward but would require
25
that a culture condition can be identified under which the gene product is produced. In many cases
secondary metabolite clusters identified from a sequenced genome are either silent or produce very small
amounts, so that a condition to turn on the pathway cannot be identified easily. Alternatively, our lab and
others have used a heterologous expression approach to link secondary metabolism genes with their
respective compounds
5, 6
. The heterologous expression approach is useful particularly when the
secondary pathways are silent in the producing organism or when no genetic system is available.
The three host systems commonly used in the lab are Saccharomyces cerevisiae (yeast) [7], Aspergillus
oryzae [8], and Aspergillus nidulans. Gene targeting is very efficient in yeast, but wild-type yeast is not rich
in SMs and thus relevant building blocks may be absent. Introns also differ between yeast and filamentous
fungi, so cDNA is necessary for heterologous expression in yeast. Both A. nidulans and A. oryzae are better
suited in these respects, since intron splicing is not an issue. In our lab we have focused on using A. nidulans
as a host. A. nidulans has the advantage that its genetic system is well developed, and host strains with multiple
nutritional markers are available. In collaboration with Berl Oakley’s group at the University of Kansas, an A.
nidulans strain, LO8030, with four mutations (pyroA4, riboB2, pyrG89, and nkuA::argB) and eight gene
clusters knocked out (sterigmatocystin AN7804-7825, emericellamide AN2545-2549, asperfuranone
AN1039-1029, monodictyphenone AN10023-10021, terrequinone AN8512-8520, austinol
AN8379-8384/AN9246-9259, F9775 AN7906-7915, and asperthecin AN6000-6002) has been created [6]. Strain
LO8030 wastes fewer resources on known secondary metabolites and has a clean background that is ideal for
detecting new compounds resulting from heterologous expression. This chapter explains step by step
how to rapidly express SM genes heterologously in this highly optimized host without the need for
cosmids, plasmids, or vectors. Parts of this protocol have been described in detail elsewhere [9], but
are included here to enable readers to perform the experiments using this chapter alone.
2. Identification of secondary metabolite genes in fungal genomes
The first step is to identify the gene(s) you want to express. The Joint Genome Institute (JGI) is a great
resource for (newly) sequenced genomes, but NCBI and AspGD also have genomes available. Genes of
interest are usually found by BLAST homology analysis with known biosynthetic genes. The predicted
function of the hits and their surrounding genes can be used as an indication of validity. Once a gene is
identified, the genomic sequence can be downloaded with introns included, since the A. nidulans system
can recognize them.
Note: Since JGI provides genome sequences prior to publication, please consult the JGI website for their
most recent policy regarding data usage and acknowledgements.
3. Design primers for fusion PCR
For genes smaller than 4000 base pairs, one forward and one reverse primer can be designed. The forward
primer typically starts at the start codon and has a 5’ sticky end of ~20 base pairs for fusion PCR to an
upstream promoter. The powerful alcohol-inducible promoter alcA [10] is most often used in our lab, but
other promoters, such as the inducible Tet-on system [11] or the constitutive gpdA promoter [12], are also
possibilities. The reverse primer is designed roughly 300 base pairs after the stop codon to allow proper
transcription termination. It adds a sticky end of ~20 base pairs to the 3’ end of the fragment for fusion
to a selection marker. The gene pyrG from A. fumigatus (AfpyrG) is most often
used to complement the A. nidulans pyrG89 background strain. The AfriboB and AfpyroA markers can be used
for subsequent transformations. More genes can be inserted by marker recycling, in which the marker of the
first construct is replaced by a second construct with a different marker. Loss of AfpyrG can also be
selected for with 5-fluoroorotic acid. Genes larger than 4000 base pairs require two or more DNA
fragments for transformation because of PCR limits for large (8+ kb) constructs. The 5’ and 3’ primers are
designed in the same way, and additional primers are designed to generate fragments with a 1000 base
pair overlap. Additionally, a second marker is fused to the 5’ fragment before the promoter (see Figure
1B).
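The primer rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 22-nucleotide annealing length, and the choice to split a large amplicon into exactly two halves are hypothetical and not part of the protocol; only the ~20 base pair tails, the ~300 base pair downstream region, the 4000 base pair threshold, and the 1000 base pair overlap come from the text.

```python
# Illustrative sketch only: names and default lengths other than those quoted
# in the text (20 bp tails, 300 bp downstream, 4000 bp limit, 1000 bp overlap)
# are assumptions made for this example.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_fusion_primers(locus, cds_start, cds_end, promoter, marker,
                          tail_len=20, anneal_len=22, downstream=300):
    """Return (forward, reverse) primers, both written 5'->3'.

    locus      genomic sequence of the coding strand
    cds_start  0-based index of the first base of the start codon
    cds_end    0-based index just past the stop codon
    promoter   promoter sequence to be fused upstream (e.g. alcA)
    marker     selection marker sequence to be fused downstream (e.g. AfpyrG)
    """
    # Forward primer: promoter-matching tail followed by the start of the gene.
    fwd = promoter[-tail_len:] + locus[cds_start:cds_start + anneal_len]
    # Reverse primer: anneals ~300 bp after the stop codon and carries a tail
    # matching the start of the marker (reverse complemented as a whole).
    amplicon_end = min(cds_end + downstream, len(locus))
    anneal = locus[amplicon_end - anneal_len:amplicon_end]
    rev = reverse_complement(anneal + marker[:tail_len])
    return fwd, rev


def split_with_overlap(amplicon_len, max_single=4000, overlap=1000):
    """Return (start, end) coordinates of the fragments to amplify.

    Small amplicons stay in one piece; larger ones are split into two pieces
    that share `overlap` base pairs (more pieces are not handled here).
    """
    if amplicon_len <= max_single:
        return [(0, amplicon_len)]
    mid = amplicon_len // 2
    return [(0, mid + overlap // 2), (mid - overlap // 2, amplicon_len)]
```

For a 6000 base pair amplicon, `split_with_overlap(6000)` gives two fragments, 0-3500 and 2500-6000, which share the 1000 base pair overlap needed for recombination in the host.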
Figure 1: Fusion PCR overview for genes smaller (A) and larger (B) than 4000 base pairs. (A) Individual fragments are amplified
from genomic DNA: the flanking fragments and the alcA promoter from the A. nidulans host, AfpyrG from A. fumigatus genomic
DNA, and the target gene from its respective source. The “N” sticky ends of the alcA_Rev and pyrG_FW primers on the first line
are an arbitrary number of nucleotides added so that the fusion PCRs on the middle line have nested primers on both sides. (B)
The target gene is amplified in two separate pieces with a 1000 base pair overlap. In addition to the promoter, an additional
marker is added to the 5’ fragment. The two fragments are mixed and added to the protoplasts for transformation.
4. Obtain genomic DNA for PCR template
The ability of A. nidulans to recognize introns allows the use of unmodified genomic DNA for gene
amplification. Genomic DNA is preferably extracted from hyphae, since they presumably contain fewer
metabolites that may inhibit the polymerase, although genomic DNA from spores usually works as well. For
hazardous species, it is possible to purchase only the genomic DNA from commercial sources such as
ATCC. Since many fungal species are either plant or human pathogens, obtaining the necessary
permits can take several months, and purchasing only genomic DNA can then greatly speed up the project.
Lastly, with the significant improvements in commercial gene synthesis, it is also possible to obtain
secondary metabolite genes via commercial DNA synthesis.
4.1 Materials
Note: In this section we provide the supplier and the item number we use in the Wang lab. Equivalent
items can be used instead unless specifically noted below.
1. Miracloth (EMD Millipore, #475855)
2. Mortar and pestle
3. Sterile toothpick (Forster)
4. Lysis buffer hyphae (see recipe)
5. Phenol/chloroform/isoamyl alcohol (25:24:1) adjusted to pH 8 (EMD Millipore, #516726)
6. 3 M NaOAc, pH 5 (AMRESCO, #0602)
7. Isopropanol (J.T. Baker, #9084-01)
8. Ethanol, biotechnology grade (IBI scientific, # IB15720)
9. TE buffer (see recipe)
4.2 DNA extraction from hyphae
1. Harvest the hyphae by filtration through Miracloth.
2. Transfer to an Eppendorf tube, lyophilize, and grind the dry mycelia with a sterile toothpick, or
transfer to a mortar, freeze with liquid nitrogen, and grind to a powder.
3. Add 700 μl lysis buffer and vortex for 30 seconds.
4. Incubate at 65 °C for 1 hour (use a tube with a cap-lock).
5. Add 700 μl (equal to the volume of lysis buffer added in step 3) of phenol/chloroform/isoamyl alcohol
(25:24:1) adjusted to pH 8 and vortex for 1 minute.
6. Microcentrifuge at max speed (or at least 10,000 x g) for 15 min at RT.
7. Transfer aqueous phase (~400 μl in 100 μl steps to avoid disturbing separation) to a new
Eppendorf tube.
8. Add 13.3 μl of 3 M NaOAc pH5 followed by 223 μl (0.54 x V) of isopropanol and invert gently.
DNA clots may be visible.
9. Microcentrifuge at max speed (or at least 10,000 x g) for 3 min at RT.
10. Remove supernatant with a pipette and rinse the pellet twice with 70% ethanol.
11. Remove supernatant with a pipette and place the tube in a dry bath at 50 °C for ~10 min.
12. Resuspend in 100 μl TE buffer. If the DNA does not dissolve, place in dry bath at 65 °C for 15
min.
13. Take 0.5 μl for a 50 μl PCR reaction.
14. Genomic DNA can be stored at -20 °C for months or at -80 °C for longer.
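When the recovered aqueous phase in step 7 differs from ~400 μl, the reagent volumes in step 8 can be scaled proportionally. The sketch below assumes that the 0.54 x V factor refers to the volume after NaOAc addition, which is the reading that reproduces the 223 μl quoted in step 8; the function name is hypothetical.

```python
# Minimal sketch, assuming V in "0.54 x V" is the aqueous phase plus NaOAc.
# The ratios are taken from step 8 (13.3 ul NaOAc per 400 ul aqueous phase).

NAOAC_UL_PER_UL_AQUEOUS = 13.3 / 400.0
ISOPROPANOL_FACTOR = 0.54


def precipitation_volumes(aqueous_ul: float) -> dict:
    """Return the NaOAc and isopropanol volumes (ul) for a given aqueous phase."""
    naoac_ul = aqueous_ul * NAOAC_UL_PER_UL_AQUEOUS
    isopropanol_ul = ISOPROPANOL_FACTOR * (aqueous_ul + naoac_ul)
    return {
        "naoac_ul": round(naoac_ul, 1),
        "isopropanol_ul": round(isopropanol_ul, 1),
    }


if __name__ == "__main__":
    # The protocol's own numbers (400 ul) and a smaller recovery (300 ul).
    print(precipitation_volumes(400))
    print(precipitation_volumes(300))
```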
4.3 Materials
1. Miracloth (EMD Millipore, #475855)
2. Lysis buffer spores (see recipe)
3. Glass beads, acid washed (Sigma, #G8772)
4. Phenol/chloroform/isoamyl alcohol (25:24:1) adjusted to pH 8 (EMD Millipore, #516726)
5. 3 M NaOAc, pH 5 (AMRESCO, #0602)
6. Isopropanol (J.T. Baker, #9084-01)
7. Ethanol, biotechnology grade (IBI scientific, # IB15720)
8. TE buffer (see recipe)
4.4 DNA extraction from spores
1. Take 500 μl of spore suspension in an Eppendorf tube and centrifuge at 8000 g for 5 min.
2. Remove supernatant and resuspend in 100 μl lysis buffer, then add 150 mg/100 μl of 0.45-0.5
mm glass beads.
3. Vortex for 30 seconds.
4. Incubate for 30 min at 65 °C, vortexing for 30 seconds every 10 min (use a tube with a cap-lock).
5. Add 100 μl of phenol/chloroform/isoamyl alcohol (25:24:1).
6. Vortex for 5 min.
7. Microcentrifuge at max speed (or at least 10,000 x g) for 5 min at RT.
8. Collect 50-60 μl from the upper phase and transfer to a new tube.
9. Add 5 μl of 3 M NaOAc pH5, mix, add 100 μl ethanol, mix. Put in -20 °C for 20 min.
10. Microcentrifuge at max speed (or at least 10,000 x g) for 30 min at RT.
11. Discard supernatant and wash with 100 μl 70% ethanol.
12. Microcentrifuge at max speed (or at least 10,000 x g) for 15 min at RT.
13. Remove supernatant and place the tube in a dry bath at 50 °C for ~5 min.
14. Dissolve in 30 μl TE buffer.
15. Take 0.5 μl for a 50 μl PCR reaction.
5. Fusion PCR construction
Individual fragments are amplified using the designed primers in 50 μl PCR reactions. Note: Accuprime™
high fidelity Taq is the commercial polymerase we use because of its accuracy, high yield, and ability to
generate large DNA fragments. Although it is more expensive than other commercially available
polymerases, in our experience those have performed significantly worse in terms of both success rate and yield.
Therefore, we highly recommend using the same polymerase we use in the Wang lab for the fusion PCR
construction. Shorter fragments can be amplified using the Pfx polymerase, which generates blunt ends;
blunt ends are preferred for overlapping fragments in subsequent fusion PCR.
5.1 Materials
1. Techne Prime thermocycler
2. Accuprime™ Taq high fidelity with buffer I and II (Thermo Fisher Scientific, #12346094)
3. Accuprime™ Pfx DNA polymerase with Pfx reaction mix (Thermo Fisher Scientific, #12344024)
4. Agarose (IBI scientific, #IB70071)
5. Ethidium Bromide (IBI scientific, #40075)
6. QIAquick Gel Extraction Kit (Qiagen, #28704)
7. Zymo-Spin I columns (Zymo Research, #C1003)
5.2 Genomic PCR
PCR reactions are analyzed on a 1% agarose gel with 0.01% ethidium bromide, and correct bands are cut
out, purified using a gel extraction kit (Qiagen), and eluted in 30 μl elution buffer. The concentration
of eluted DNA is determined by NanoDrop. For fusion PCR, the fragments are mixed at a fixed molar ratio
(1:1, 1:2:1 or 1:2:2:1). A fusion PCR with more than four fragments should be split into two separate
fusions, the products of which can be fused again. The amount of each fragment needed depends on its
size, but a minimum of 30 ng of each fragment is recommended. Fusion PCR can be done in one or two
pots. In the one pot reaction the primers are added together with the separate fragments, which is the
quickest way of obtaining the fusion construct. In the two pot reaction the fragments are first fused
without primers and then an aliquot is amplified in a second round with primers. This takes more time
than the one pot reaction, but makes it easier to generate more fusion construct, since the amplification
step can be scaled up easily.
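Because the ratio is molar, the mass of each fragment scales with its length times its share in the ratio. The following sketch works this out for a four-fragment fusion; the fragment sizes and the concentration in the example are hypothetical, and the scaling rule (the fragment needing the least mass is set to 30 ng) is one possible reading of the 30 ng minimum.

```python
# Sketch under stated assumptions: fragment lengths below are made-up examples.
# Mass is proportional to (length x molar share) because all fragments are
# double-stranded DNA with the same average mass per base pair.


def fragment_masses_ng(lengths_bp, molar_ratio, min_ng=30.0):
    """Return the mass (ng) of each fragment for the requested molar ratio.

    The amounts are scaled so that the fragment needing the least mass
    receives `min_ng`; every other fragment then receives more.
    """
    if len(lengths_bp) != len(molar_ratio):
        raise ValueError("one ratio entry is needed per fragment")
    relative = [length * ratio for length, ratio in zip(lengths_bp, molar_ratio)]
    scale = min_ng / min(relative)
    return [round(value * scale, 1) for value in relative]


def volume_ul(mass_ng, conc_ng_per_ul):
    """Volume to pipette for a given mass, from the measured concentration."""
    return mass_ng / conc_ng_per_ul


if __name__ == "__main__":
    # Hypothetical 1:2:2:1 fusion: 5' flank, promoter-gene, marker, 3' flank.
    lengths = [1000, 2000, 3000, 1000]
    masses = fragment_masses_ng(lengths, [1, 2, 2, 1])
    for length, mass in zip(lengths, masses):
        print(f"{length} bp fragment: {mass} ng")
```

With these example sizes the two flanks need 30 ng each, while the 2000 and 3000 base pair fragments need 120 ng and 180 ng.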
Figure 2. Genomic PCR conditions.
5.3 One Pot Fusion Reaction
Figure 3. One pot fusion PCR conditions.
5.4 Two Pot Fusion Reaction
Figure 4. Two pot fusion PCR conditions.
The fusion PCR is again analyzed on a 1% agarose gel, and the correct product is cut out and purified using a gel
extraction kit. The fusion PCR construct is eluted in 10-15 μl elution buffer or ddH2O from Zymo-Spin I columns,
which allow elution with volumes as low as 6 μl, and can be used directly for transformation or stored at
-20 °C. The fusion PCR product is not quantified, but the amount of DNA is correlated with transformation
success: more DNA gives more transformed colonies. Larger constructs and transformations with
more than one fragment also require higher DNA amounts to be successful.
6. Transformation
Protoplasts of the A. nidulans background strain are always freshly prepared for transformation [13].
6.1 Materials
1. Supplemented yeast glucose (YG) media (see recipes)
2. Gyratory shaker
3. 2X Protoplasting solution (see recipes)
4. Miracloth (EMD Millipore, #475855)
5. 1.2 M sucrose (sterile)
6. 0.6 M KCl (sterile)
7. 0.6 M KCl, 50 mM CaCl2 (sterile)
8. PEG solution (see recipe)
9. Glucose minimal media agar plates (see recipe)
10. Sterile cotton swab (Puritan, #25-806 1WC)
11. S/T buffer (see recipes)
6.2 Protoplasting
1. Fresh spores (1e8, less than one week old) are inoculated in 20 ml YG media with supplements
pyridoxine, riboflavin, uridine and uracil.
2. Spores are shaken at 30 °C, 135 rpm for 14 hours.
3. The 2X protoplasting solution is prepared 20 minutes before use.
4. The hyphae are filtered with sterile Miracloth.
5. Hyphae are washed with 2 ml of YG media and transferred with a sterile spatula to a sterile 50
ml conical flask with 8 ml YG media.
6. 8 ml of 2X protoplasting solution is filtered into the flask.
7. Protoplasting is carried out at 30 °C, 135 rpm for 2-2.5 hours; beware of overdigestion.
8. 4 ml of the protoplasting mixture is carefully layered on 8 ml of 1.2 M sucrose in a 15 ml tube and
centrifuged with the brake off at 1800 g for 10 minutes.
9. Protoplasts are collected from slightly above the resulting interface with a sterile dropper. Avoid
disturbing the interface or collecting from below the interface. Collect 1-1.5 ml from each tube
in a new 15 ml tube.
10. Add 2X volume of 0.6 M KCl and centrifuge with brake at 1800 g for 10 minutes.
11. Remove supernatant with a serological pipette and resuspend in 2 x 1 ml of 0.6 M KCl.
Resuspend by pipetting the first 300 μl slowly up and down a few times, then add the rest and
transfer to a sterile Eppendorf tube (1 ml in each tube).
12. Centrifuge at 2400 g for 3 minutes. Then wash each tube with 1 ml 0.6 M KCl two more times.
Material that does not resuspend after pipetting up and down a few times should be
removed by holding the suspension sideways in the tip for 20 seconds; unsuspended clumps will
stick to the tip wall.
13. All protoplasts are resuspended in 1 ml 0.6 M KCl, 50 mM CaCl2 and centrifuged at 2400 g for 3
minutes.
14. The supernatant is discarded and the protoplasts are resuspended in 100 μl 0.6 M KCl, 50 mM CaCl2 per
transformation. Up to 10 transformations can be done with one batch.
15. Protoplast solution is added to the DNA solution (10-15 μl) and vortexed once for 1 second.
16. 50 μl of freshly filtered PEG solution is added and vortexed once for 1 second.
17. Put on ice for 25 min.
18. Add 100 μl PEG solution and mix by pipetting up and down a minimum of 10 times.
19. Keep at RT for 25 min.
20. Plate each transformation mixture gently with a sterile spreader on two 10 cm GMM plates with
0.6 M KCl, riboflavin, and pyridoxine (roughly 130 μl per plate).
21. Incubate face up at 30 °C overnight.
22. Flip plates and transfer to 37 °C.
23. Colonies should be visible 2 days after transfer.
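The figure of roughly 130 μl per plate in step 20 follows from the volumes combined in steps 14 to 18. A small sketch of that arithmetic is given below, assuming no volume is lost during mixing; the function name and the batch helper are hypothetical.

```python
# Sketch of the volume bookkeeping for one transformation (steps 14-20),
# assuming the listed volumes simply add up.


def volume_per_plate_ul(dna_ul, protoplast_ul=100.0, peg_first_ul=50.0,
                        peg_second_ul=100.0, plates=2):
    """Return the volume (ul) of transformation mixture spread on each plate."""
    total = protoplast_ul + dna_ul + peg_first_ul + peg_second_ul
    return total / plates


def resuspension_volume_ul(n_transformations, per_transformation_ul=100.0):
    """Volume of 0.6 M KCl, 50 mM CaCl2 needed to resuspend one batch (step 14)."""
    if not 1 <= n_transformations <= 10:
        raise ValueError("one batch supports up to 10 transformations")
    return n_transformations * per_transformation_ul


if __name__ == "__main__":
    for dna in (10, 15):
        print(f"{dna} ul DNA -> {volume_per_plate_ul(dna)} ul per plate")
```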
By choosing the flanking DNA of the fusion construct (~1000 base pairs on each side) to target the yA or the
wA locus, colonies with a correctly targeted insertion change color from green to yellow or
white, respectively. When picking colonies for diagnostic PCR, the green ones, if any, can be disregarded.
Three colonies per transformation are picked and streaked on selective plates. After two days a single
colony is picked with a sterile cotton swab and replated on selective plates to ensure genetic homogeneity.
Spores are harvested by covering the plate with 7 ml S/T buffer, rubbing a sterile cotton swab over the
surface, and collecting the suspension.
7. Diagnostic PCR
Genomic DNA of the transformants is extracted as described above. Diagnostic PCR (same as genomic PCR,
except in 10 μl reactions) is performed on both the background and the new strains using primers that
anneal outside of the insert. An increase or decrease in size consistent with the inserted fragment
indicates correct transformation. For large, multiple-fragment insertions the whole insertion is too large
for a single PCR. Therefore, shorter fragments both within the insertion and across its borders are
amplified and compared with their absence in the background strain.
Diagnostic PCR usually shows success rates near 90%. Southern blots can be performed if additional
confirmation is needed. In addition, (part of) the genes can be sequenced to confirm insertion and identify
potential mutations not detected by PCR.
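The expected band sizes for a diagnostic PCR with primers outside the insert can be worked out before running the gel. The sketch below is only an illustration: the function name and all sizes in the example are hypothetical, and it assumes a clean replacement of the targeted region by the construct.

```python
# Illustrative sketch: expected diagnostic PCR product sizes for primers that
# anneal outside the targeted region. All numbers in the example are made up.


def expected_band_sizes(primer_span_bp, replaced_bp, insert_bp):
    """Return (background, transformant) amplicon sizes in base pairs.

    primer_span_bp  distance between the two primers in the background strain
    replaced_bp     length of host sequence removed by the integration
    insert_bp       length of the construct between the flanking regions
    """
    background = primer_span_bp
    transformant = primer_span_bp - replaced_bp + insert_bp
    return background, transformant


if __name__ == "__main__":
    background, transformant = expected_band_sizes(5000, 2000, 6500)
    shift = transformant - background
    print(f"background: {background} bp, transformant: {transformant} bp "
          f"(shift of {shift:+d} bp)")
```

A shift that matches the calculated difference supports correct integration; an unchanged band points to an untransformed or ectopically transformed colony.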
8. Liquid culturing of mutant strains
8.1 Materials
1. Glucose minimal media (see recipes)
2. Methyl ethyl ketone (or 2-butanone) (Sigma-Aldrich, #34861)
3. Ethyl Acetate (EMD Millipore, #EX0240-5)
4. 6 M HCl (J.T. Baker, #5619-03)
5. Dimethyl sulfoxide (DMSO) (J.T. Baker, #9224-06)
6. Methanol (EMD Millipore, #MX0488-1)
7. 0.2 μm PTFE filter (VWR, #28145-491)
8.2 Culturing strains in glucose minimal media (GMM)
Genetically correct strains are cultured alongside the background strain in 30 or 50 ml of liquid GMM in
125 ml conical flasks at 37 °C, 180 rpm for 42 hours. This ensures that most if not all glucose is used up and
will not inhibit subsequent alcA induction. The temperature is then lowered to 30 °C and the alcA promoter is
induced by adding 134.4 or 224 μl methyl ethyl ketone (MEK), respectively. Cyclopentanone can also be used,
but is more toxic to the fungus. After three days of induction, the cultures are filtered with filter paper and
the filtrate is extracted with an equal volume of ethyl acetate. The aqueous phase is acidified to pH 2 with
6 M HCl and extracted again with an equal volume of ethyl acetate. The ethyl acetate extracts are dried, weighed,
and redissolved in 20% DMSO in methanol at a concentration of roughly 1 mg/ml or as desired. After filtration
through a 0.2 μm PTFE filter, 10 μl is injected for LC-MS analysis. New peaks compared to the background are
identified. Hit strains are scaled up to liter scale. Extracts are first purified by flash chromatography
followed by semi-preparative HPLC. Purified compounds are characterized by high-resolution MS and NMR.
Detailed protocols for secondary metabolite analysis can be found in our earlier Methods in Enzymology
review [9].
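The two MEK volumes quoted for 30 and 50 ml cultures share one ratio, about 4.48 μl of MEK per ml of culture, assuming the volumes are in microliters. The sketch below scales the inducer to other culture volumes and estimates the resulting concentration. The density and molar mass of 2-butanone used here are approximate literature values and are assumptions of this example, as are the function names.

```python
# Sketch under stated assumptions: MEK volumes are taken to be microliters,
# and density / molar mass are approximate literature values for 2-butanone.

MEK_UL_PER_ML_CULTURE = 134.4 / 30.0   # same ratio as 224 ul per 50 ml
MEK_DENSITY_G_PER_ML = 0.805           # assumed value
MEK_MOLAR_MASS_G_PER_MOL = 72.11       # assumed value


def mek_volume_ul(culture_ml):
    """MEK volume (ul) to add for a given culture volume (ml)."""
    return round(culture_ml * MEK_UL_PER_ML_CULTURE, 1)


def mek_concentration_mm(culture_ml):
    """Approximate MEK concentration (mM), ignoring the added volume."""
    grams = mek_volume_ul(culture_ml) / 1000.0 * MEK_DENSITY_G_PER_ML
    moles = grams / MEK_MOLAR_MASS_G_PER_MOL
    return moles / (culture_ml / 1000.0) * 1000.0


def redissolve_volume_ml(extract_mg, target_mg_per_ml=1.0):
    """Volume of 20% DMSO in methanol needed to reach the target concentration."""
    return extract_mg / target_mg_per_ml


if __name__ == "__main__":
    for volume in (30, 50, 1000):
        print(f"{volume} ml culture: {mek_volume_ul(volume)} ul MEK, "
              f"about {mek_concentration_mm(volume):.0f} mM")
```

Under these assumptions the quoted volumes correspond to roughly 50 mM MEK, which gives a convenient check when scaling hit strains up to liter cultures.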
9. Recipes
Lysis buffer hyphae (200 ml)
Add 1.211 g Tris, 2.922 g EDTA, and 6 g SDS to a final volume of 200 ml ddH2O. Adjust pH to 7.2 with NaOH.
Add 1% 2-mercaptoethanol right before use.
Lysis buffer spores (200 ml)
Add 4 ml Triton X-100, 2 g SDS, 1.1688 g NaCl, 0.0584 g EDTA, and 0.3152 g Tris-HCl to a final volume of
200 ml ddH2O. Adjust to pH 8.0 with NaOH.
TE buffer (200 ml)
Add 0.3152 g Tris-HCl and 5.84 mg EDTA to a final volume of 200 ml ddH2O. Adjust to pH 8.0 with NaOH.
Autoclave.
Yeast Glucose (YG) rich media (300 ml)
Add 1.5 g yeast extract, 6 g dextrose, and 300 μl of Hutner’s trace element solution to a final volume of 300 ml.
Pyridoxine dependent strains can grow on this medium. Riboflavin dependent strains require the addition
of 750 μg riboflavin (from stock solution). Uridine* and uracil dependent strains require 0.7326 g and 0.3
g respectively. Autoclave.
Glucose minimal media (500 ml)
Add 5 g dextrose, 500 μl KOH (5.5 M), 500 μl Hutner’s trace element solution and 25 ml of 20X salt solution
(see below) to a final volume of 500 ml. Optional supplements: 0.25 mg pyridoxine, 1.25 mg riboflavin,
500 mg uracil and 1.221 g uridine*. For agar plates, add 7.5 g agar. For 0.6 M KCl plates, add 22.36 g KCl.
Autoclave. Agar media is kept at 55 °C in a water bath before pouring 25 ml per 10 cm plate.
*Add filter sterilized uridine stock solution (1 M) after autoclaving.
20X salt solution (2000 ml)
Add 240 g NaNO3, 20.8 g KCl, 20.8 g MgSO4·7H2O, and 60.8 g KH2PO4 to a final volume of 2000 ml.
2X protoplasting solution (10 ml)
Dissolve 8.2 g KCl and 2.1 g citric acid monohydrate in 50 ml ddH2O. Adjust pH to 5.8 with 1.1 M KOH
solution. Make the volume up to 100 ml. Autoclave. Take 10 ml of KCl/citric acid and add 2 g VinoTaste®
Pro. Vortex and let it dissolve for 20 minutes. Filter sterilize into the flask for immediate use (see step 6 of
protoplasting).
PEG solution (100 ml)
Add 4.47 g KCl and 0.74 g CaCl2·2H2O to about 20 ml ddH2O, followed by 0.802 ml 1 M Tris-HCl and 0.196 ml 1 M
Tris. Add 25 g PEG (average molecular weight 3,350). Add ddH2O up to approximately 90 ml. Use a
magnetic stirrer to dissolve the PEG. Remove the stirrer and make up to 100 ml. Autoclave. Before use, filter
the necessary amount through a 0.2 μm filter to remove detrimental PEG precipitates.
S/T buffer (1000 ml)
Add 8.5 g NaCl and 1 ml Tween 80 to 1000 ml ddH2O. Autoclave.
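When a recipe has to be made at a different volume, the masses can be recomputed from the target molarity. The sketch below checks two of the recipes above this way; the molar masses are standard values entered by hand, the function names are hypothetical, and reading the PEG solution as 0.6 M KCl with 50 mM CaCl2 is an inference from the quantities given, not a statement in the recipe.

```python
# Sketch for scaling salt quantities; molar masses (g/mol) are standard values
# entered by hand and should be checked against the reagent label.

MOLAR_MASS = {
    "KCl": 74.55,
    "CaCl2.2H2O": 147.01,
    "NaCl": 58.44,
}


def grams_needed(compound, molar, volume_ml):
    """Mass (g) of `compound` for a solution of `molar` mol/l in `volume_ml`."""
    return molar * (volume_ml / 1000.0) * MOLAR_MASS[compound]


def molarity(compound, grams, volume_ml):
    """Molarity (mol/l) obtained by dissolving `grams` in `volume_ml`."""
    return grams / MOLAR_MASS[compound] / (volume_ml / 1000.0)


if __name__ == "__main__":
    # 0.6 M KCl in 500 ml of GMM agar (recipe: 22.36 g)
    print(f"KCl for 500 ml plates: {grams_needed('KCl', 0.6, 500):.2f} g")
    # KCl and CaCl2 in 100 ml PEG solution (recipe: 4.47 g and 0.74 g)
    print(f"KCl in PEG solution: {molarity('KCl', 4.47, 100):.2f} M")
    print(f"CaCl2 in PEG solution: {molarity('CaCl2.2H2O', 0.74, 100) * 1000:.0f} mM")
```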
10. Conclusions
Gene transfer using this method is facile, and mutant strains with correct gene inserts are easily
obtained. The functionality of the heterologously expressed genes is hard to predict and often requires
additional experiments, such as the addition of genes that only work in tandem. However, the ability to
generate ten or more different mutants per transformation and the use of several orthogonal markers
make it easy to screen many gene mutants and combinations. The use of the powerful alcA promoter in
this system allows for the detection of secondary metabolites from less than optimal enzymes and even
protein purification for in vitro studies. The artificial recreation and subsequent manipulation of
biosynthetic pathways that this system enables will greatly contribute to synthetic biology and drug
discovery in the next decade.
11. References
1. Inglis, D. O.; Binkley, J.; Skrzypek, M. S.; Arnaud, M. B.; Cerqueira, G. C.; Shah, P.; Wymore, F.;
Wortman, J. R.; Sherlock, G., Comprehensive annotation of secondary metabolite biosynthetic genes and
gene clusters of Aspergillus nidulans, A. fumigatus, A. niger and A. oryzae. BMC Microbiology 2013, 13, 23.
2. Takeda, I.; Umemura, M.; Koike, H.; Asai, K.; Machida, M., Motif-Independent Prediction of a
Secondary Metabolism Gene Cluster Using Comparative Genomics: Application to Sequenced Genomes
of Aspergillus and Ten Other Filamentous Fungal Species. DNA Research 2014, 21 (4), 447-457.
3. Khaldi, N.; Seifuddin, F. T.; Turner, G.; Haft, D.; Nierman, W. C.; Wolfe, K. H.; Fedorova, N. D.,
SMURF: Genomic mapping of fungal secondary metabolite clusters. Fungal Genetics and Biology 2010, 47
(9), 736-741.
4. Chiang, Y. M.; Szewczyk, E.; Nayak, T.; Davidson, A. D.; Sanchez, J. F.; Lo, H. C.; Ho, W. Y.; Simityan,
H.; Kuo, E.; Praseuth, A.; Watanabe, K.; Oakley, B. R.; Wang, C. C. C., Molecular genetic mining of the
Aspergillus secondary metabolome: Discovery of the emericellamide biosynthetic pathway. Chemistry &
Biology 2008, 15 (6), 527-532.
5. Chiang, Y. M.; Oakley, C. E.; Ahuja, M.; Entwistle, R.; Schultz, A.; Chang, S. L.; Sung, C. T.; Wang, C.
C. C.; Oakley, B. R., An Efficient System for Heterologous Expression of Secondary Metabolite Genes in
Aspergillus nidulans. Journal of the American Chemical Society 2013, 135 (20), 7720-7731.
6. Chiang, Y.-M.; Ahuja, M.; Oakley, C. E.; Entwistle, R.; Asokan, A.; Zutz, C.; Wang, C. C. C.; Oakley,
B. R., Development of Genetic Dereplication Strains in Aspergillus nidulans Results in the Discovery of
Aspercryptin. Angewandte Chemie 2015, 127, 1-5.
7. Tsunematsu, Y.; Ishiuchi, K.; Hotta, K.; Watanabe, K., Yeast-based genome mining, production and
mechanistic studies of the biosynthesis of fungal polyketide and peptide natural products. Natural Product
Reports 2013, 30 (8), 1139-1149.
8. Sakai, K.; Kinoshita, H.; Nihira, T., Heterologous expression system in Aspergillus oryzae for fungal
biosynthetic gene clusters of secondary metabolites. Applied Microbiology and Biotechnology 2012, 93
(5), 2011-2022.
9. Lim, F. Y.; Sanchez, J. F.; Wang, C. C. C.; Keller, N. P., Toward Awakening Cryptic Secondary
Metabolite Gene Clusters in Filamentous Fungi. In Methods in Enzymology, 2012; Vol. 517, pp 303-324.
10. Waring, R. B.; May, G. S.; Morris, N. R., Characterization of an inducible expression
system in Aspergillus nidulans using alcA and tubulin-coding genes. Gene 1989, 79 (1), 119-130.
11. Meyer, V.; Wanka, F.; van Gent, J.; Arentshorst, M.; van den Hondel, C. A. M. J. J.; Ram, A. F. J.,
Fungal Gene Expression on Demand: an Inducible, Tunable, and Metabolism-Independent Expression
System for Aspergillus niger. Applied and Environmental Microbiology 2011, 77 (9), 2975-2983.
12. Punt, P. J.; Kramer, C.; Kuyvenhoven, A.; Pouwels, P. H.; Vandenhondel, C., An upstream
activating sequence from the Aspergillus nidulans gpdA gene. Gene 1992, 120 (1), 67-73.
13. Szewczyk, E.; Nayak, T.; Oakley, C. E.; Edgerton, H.; Xiong, Y.; Taheri-Talesh, N.; Osmani, S. A.;
Oakley, B. R., Fusion PCR and gene targeting in Aspergillus nidulans. Nature Protocols 2006, 1 (6), 3111-
3120.
Chapter 3
Engineering fungal nonribosomal peptide synthetase-like
enzymes by heterologous expression and domain swapping
Genome sequencing projects have demonstrated that filamentous fungi contain far more secondary
metabolite gene clusters than the number of compounds identified from these organisms so far. Aspergillus
terreus, the industrial source of the cholesterol-lowering drug lovastatin, contains 28 polyketide synthase
(PKS) genes, 22 non-ribosomal peptide synthetase (NRPS) genes, one hybrid PKS/NRPS gene, two PKS-like genes
and 15 NRPS-like genes. NRPS-like enzymes contain an adenylation domain (A), a thiolation domain (T), and a
thioesterase (TE) or reductase domain (R), but do not contain the peptide bond forming condensation
domain (C) found in canonical NRPS enzymes [1-5]. In A. terreus, five single-module NRPS-like enzymes with
the A-T-TE domain structure have been identified: atqA, apvA, btyA, atmelA, and pgnA. In NRPSs, the
function of the A domain is to recognize and activate specific amino acids. In these NRPS-like
enzymes, an alpha-keto acid is recognized and activated instead of an amino acid. It is then transferred
to the 4'-phosphopantetheine arm on the T domain, and from there to the TE domain. When
a second residue is activated and transferred to the now vacated T domain, the two monomers dimerize
to form a cyclic core and the resulting molecule is released from the enzyme [1]. As part of our larger effort
to characterize fungal secondary metabolite genes in A. terreus, our group previously examined the five
NRPS-like genes that contain the A-T-TE domain structure using knockout or overexpression A. terreus
strains or heterologous expression (HE) A. nidulans strains.
Figure 1. Three NRPS-like enzymes from Aspergillus terreus and their products.
The biosynthesis of didemethylasterriquinone D by atqA was studied in detail in 2007 through its homologue
tdiA in A. nidulans [1]. In 2015, the spatial regulation of apvA and atmelA was elucidated and it was
shown that each gene can be heterologously expressed in A. nidulans to produce aspulvinone E (1). More
recently, the product of pgnA was determined by Tet-on overexpression, and it was shown that phenguignardic
acid (3) could also be produced via HE of pgnA in A. nidulans. Although btyA was confirmed to play a role in
butyrolactone biosynthesis through knockout studies, the product of btyA was not confirmed through HE.
Therefore, we first developed an HE strain with btyA in the yA locus of A. nidulans, the same locus in which
pgnA was expressed. ApvA was previously expressed in the wA locus, so to generate a comparable set, apvA
was also expressed in the yA locus. For this study we used an HE approach with the well-characterized and
heavily engineered LO4389 strain of A. nidulans as a host system [6]. In this strain the 25 genes involved
in production of the major secondary metabolite sterigmatocystin have been eliminated. One of the advantages
of using this A. nidulans strain is that we can easily identify and isolate the products of the heterologously
expressed genes from A. terreus strain NIH2624. Another advantage is the strength of the alcA promoter,
which should allow secondary metabolite detection even when engineered enzymes display lowered
activity. The alpha-keto acid forms of phenylalanine and tyrosine are part of the shikimate pathway, which
is highly conserved in fungi, plants and bacteria [7]. Therefore, it is assumed that they are readily available
as starting material for the enzymes. The A. nidulans NRPS-like enzyme, micA, is thought to use
phenylpyruvic acid as well [5]. To create both constructs, the genes were amplified from Aspergillus terreus
genomic DNA. Genes were fused via fusion PCR with an alcA promoter on the 5’ end, an AfpyrG marker
on the 3’ end, and flanking fragments for homologous recombination in the yellow locus (yA) of
Aspergillus nidulans (Figure S1A). As a control we fused only the AfpyrG marker to the flanking fragments.
This way the control strain can be cultured in the same media as the other mutant strains. The purified
fusion PCR products were used in a previously described transformation protocol [8], and the generation of
gene construct insertions was confirmed by diagnostic PCR (Figure S2). Transformed strains were cultured
in Glucose Minimal Media for 42 hours and induced with 2-butanone for an additional three days,
followed by ethyl acetate extraction. The extract of the apvA in yA strain was analyzed by Liquid
Chromatography-Mass Spectrometry (LC-MS) and the data were identical to those reported for apvA in
wA [9]. For confirmation of the product of the btyA in yA HE strain, we scaled up to a one-liter culture. The
main product was purified using column chromatography followed by preparative High Performance
Liquid Chromatography (HPLC), and its identity was confirmed by Nuclear Magnetic Resonance (NMR)
spectroscopy to be butyrolactone IIa (2) (Figure S6). In Figure 2, HPLC traces of extracts of the two new
strains (apvA, panel B and btyA, panel C) and the previously generated strain (pgnA, panel D) show three
clearly different products from enzymes with a similar domain structure. It was previously proposed that
apvA and btyA both utilize two 4-hydroxyphenylpyruvic acid (HPPA) building blocks to produce 1 and 2,
respectively [10]. Along the same lines, the biosynthesis of 3 was proposed to proceed through dimerization of
two phenylpyruvic acid (PPA) building blocks instead of HPPA [11]. Figure 1 shows that, despite the
similarity in the building blocks used and in the domain structure of the enzymes, the products have different
cyclization patterns. In this study we were interested in identifying which of the three domains in NRPS-like
enzymes is responsible for the different cyclization. Secondly, we were interested in testing whether
we could rationally swap domains in NRPS-like enzymes to change the structure of the final products, as
has been done before with a nonreducing polyketide synthase [12]. The modular architecture of the NRPS-like
enzymes should in theory allow for the recombination of modules to generate novel, artificial
nanomachines that produce complex molecules. First, we asked whether protein domains of these three
enzymes could be swapped while retaining activity. Since it is known that the function of the A domain in
fungal NRPS and NRPS-like enzymes is to activate specific amino acids and alpha-keto acids, we started
the engineering effort by substituting the A domain. T domains have been exchanged before to generate
functional mutants in the single-module NRPS IndC, but without changing the product that was formed [13].
Figure 2. HPLC traces of ethyl acetate extracts of the strains with different heterologous constructs expressed at total spectrum
scan by DAD detector. (A) Control strain with the AfpyrG marker shows only minor products, most likely siderophores produced
by A. nidulans. (B) Strain with apvA under the alcA promoter shows aspulvinone E (1) production. (C) Strain with btyA under the
alcA promoter shows butyrolactone IIa (2) production. (D) Strain with pgnA under the alcA promoter shows phenguignardic acid
(3) production. * denotes a side product that was previously undetected. The production of * varies per culture iteration and
condition, and it has no identifiable m/z peaks in either positive or negative mode in the range 100-1500. UV absorption of * can
be found in Figure S5.
We created two different hybrid constructs with the A domains replaced. In one construct we replaced
the A domain of BtyA with the A domain of ApvA since both enzymes utilize the same HPPA building blocks
and differ only in their cyclization pattern. We expected that the novel hybrid enzyme with the A domain
from ApvA and the T and TE domains from BtyA would still produce butyrolactone IIa (2). We expected
the second hybrid enzyme, with the A domain from PgnA and the T and TE domains from BtyA, to produce
a novel compound: butyrolactone IIa constructed with PPA instead of HPPA, phenylbutyrolactone IIa (4).
To create the hybrid enzymes, we first analyzed the NRPS-like genes and determined the domain borders
using the Conserved Domain Database (or Pfam). Exact positions of the substitutions can be found in the
alignment data (Figure S3). Hybrids were generated using genomic DNA of the heterologous expression
strains. The A domains were amplified together with the alcA promoter and the 5’ yA flanking region. The
T and TE domains were amplified together with the AfpyrG and the 3’ yA flanking region (Figure S1B). The
two fragments were fused to generate hybrid NRPS-like genes and the fusion constructs were used for
transformation. Diagnostic PCR confirmed insertion of the fragment at the correct locus and sequencing
confirmed the correct substitution location in the hybrid (Figures S2 and S4). The first hybrid enzyme, the ApvA
A domain fused to the BtyA T and TE domains, produced butyrolactone IIa, as verified by LC/MS (Figure 3A). This
confirmed our hypothesis that A domains can be substituted without completely losing activity. The
second hybrid, the PgnA A domain fused to the BtyA T and TE domains, produced a compound that was neither phenguignardic acid
nor butyrolactone IIa, based on LC/MS analysis (Figure 3C, S5). To determine the chemical structure, we
scaled up the culture of the mutant strain with the hybrid gene to one liter. However, the compound was
not sufficiently stable during the purification process, with significant degradation even in neutral buffers,
despite appearing as a single large peak in the LC/MS analysis. The product seemed to degrade into at least four
different compounds, so no single degradation mechanism is apparent. Keto-enol
tautomerism in the five-membered ring possibly contributes to the instability of the molecule. We therefore
obtained High Resolution MS and NMR data from the crude extract (~90 mg) of the one-liter culture and
showed that the hybrid enzyme produced the expected compound with phenyl side chains and a
heterocyclic core like butyrolactones (4) (Table S4, Figure S7, S8). This result showed that newly
engineered combinations of enzymatic domains from NRPS-like enzymes can produce novel secondary
metabolites in sufficient quantities for characterization.
Figure 3. HPLC traces of ethyl acetate extracts of the strains with different heterologous constructs expressed at total spectrum
scan by DAD detector. (A, B) Both apvA/btyA hybrids produce butyrolactone IIa (2). (C, D) Both pgnA/btyA hybrids produce
phenylbutyrolactone IIa (4).
Having established the exchangeability of A domains in NRPS-like enzymes, we next wanted to extend the
swapping strategy to replace thioesterase (TE) domains in NRPS-like enzymes to elucidate their functions.
The TE domain in NRPS-like enzymes is thought to play a role in the cyclization process, but whether or
not it can perform its function as independently as the A domain was not clear. To answer this question,
we engineered two different hybrid enzymes. In one construct we combined the A and T domains of ApvA
with the TE domain of BtyA. In the second construct we combined the A and T domains of PgnA with the
TE domain of BtyA (Figure 3B and D). Domain borders were again determined using the Conserved Domain
Database (or Pfam) and the substitution position was arbitrarily chosen within the domain linkers (Figure
S3). Hybrids were generated by the same method as described above for the A domain substitutions. Both
hybrid strains were cultured and the extracts were analyzed using LC/MS. The first ApvA/BtyA hybrid
construct produced compound 2 (Figure 3B). ApvA normally produces aspulvinone E (1), but when the
original TE domain of ApvA was replaced by the TE domain of BtyA, butyrolactone IIa was produced instead.
This indicated that the TE domain is the main determining factor for cyclization. The second hybrid, PgnA/BtyA,
again produced compound 4, confirming this hypothesis (Figure 3D).
Figure 4. Proposed biosynthesis pathway for phenylbutyrolactone IIa (4). The cyclization mechanism is the same as proposed for
butyrolactone II but the changed A domain recognizes a different pyruvic acid moiety to generate compound 4. The mechanism
works for T domains from either parent enzyme.
The results presented in this study show that it is possible to engineer new functional NRPS-like enzymes
by recombining domains of natural ones. This effort led to the production of a new natural product,
phenylbutyrolactone IIa (4). The fact that A domains can be exchanged means they can function
independently of their T and TE domains. Engineering A domain substitutions in NRPSs has been proven
successful in the past to generate novel NRPs.
14-16
Classic NRPSs catalyze peptide bond formation between
two monomers from distinct A domains, while these NRPS-like enzymes only have a single A domain and
catalyze a different reaction, so up until now it was not clear whether the same strategy would work here.
The TE domain seems to function as independently as the A domain, since both TE-swapped
hybrids function in the same fashion as their A domain-swapped counterparts. Previous studies
concluded that T and TE domains are dependent on each other and need to be preserved in recombinant
NRPSs. Whether NRPS-like enzymes form an exception to this rule will become clear as more hybrids are
generated. The future directions for this project are two-fold. One direction is to explore further domain
combinations to gain access to more novel structures. For example, the A domain of ApvA combined with
the T and TE domains of PgnA should produce 4-hydroxyphenguignardic acid. The other is to take the
functional PgnA/BtyA hybrid enzyme and coexpress additional enzymes from NRPS-like gene pathways,
such as SAM-dependent methyltransferases or prenyltransferases to further diversify the structures that
can be produced using this approach. This may also stabilize the produced compound, making purification
possible. Using the Aspergillus nidulans heterologous expression system, we are pursuing both avenues
and the results will be reported in due course.
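The working model that emerges from these results (the A domain selects the building block, the TE domain sets the cyclization pattern) can be written down as a small lookup. The following is only an illustrative sketch under that assumption: the dictionary and function names are hypothetical, the cyclization labels are shorthand introduced here, and every entry other than compounds 1-4 is an untested prediction.

```python
from itertools import product

# Building block selected by each A domain, as stated in the text.
SUBSTRATE_BY_A = {"ApvA": "HPPA", "BtyA": "HPPA", "PgnA": "PPA"}

# Cyclization pattern attributed to each TE domain (labels are illustrative shorthand).
CYCLIZATION_BY_TE = {
    "ApvA": "aspulvinone-type",
    "BtyA": "butyrolactone-type",
    "PgnA": "phenguignardic acid-type",
}

# Products named in the text; the last entry is a prediction, not an observation.
NAMED_PRODUCTS = {
    ("HPPA", "aspulvinone-type"): "aspulvinone E (1)",
    ("HPPA", "butyrolactone-type"): "butyrolactone IIa (2)",
    ("PPA", "phenguignardic acid-type"): "phenguignardic acid (3)",
    ("PPA", "butyrolactone-type"): "phenylbutyrolactone IIa (4)",
    ("HPPA", "phenguignardic acid-type"): "4-hydroxyphenguignardic acid (predicted)",
}


def predict_product(a_source, te_source):
    """Predict the product of a hybrid from the parent enzymes of its A and TE domains."""
    key = (SUBSTRATE_BY_A[a_source], CYCLIZATION_BY_TE[te_source])
    return NAMED_PRODUCTS.get(key, f"unnamed {key[0]} dimer with {key[1]} cyclization")


# Enumerate all nine A/TE combinations; the T domain is ignored in this simple model.
for a_source, te_source in product(SUBSTRATE_BY_A, CYCLIZATION_BY_TE):
    print(f"A: {a_source}  TE: {te_source}  ->  {predict_product(a_source, te_source)}")
```

The T domain is left out deliberately, since both the A-swapped and the TE-swapped hybrids gave the same products regardless of which parent supplied the T domain; a fuller model would need to revisit that simplification as more hybrids are tested.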
References
1 Balibar, C. J.; Howard-Jones, A. R.; Walsh, C. T., Nature Chemical Biology 2007, 3 (9), 584-592.
2 Wackler, B.; Lackner, G.; Chooi, Y. H.; Hoffmeister, D., Chembiochem 2012, 13 (12), 1798-1804.
3 Zhu, J.; Chen, W.; Li, Y. Y.; Deng, J. J.; Zhu, D. Y.; Duan, J.; Liu, Y.; Shi, G. Y.; Xie, C.; Wang, H. X.;
Shen, Y. M., Gene 2014, 546 (2), 352-358.
4 Schneider, P.; Bouhired, S.; Hoffmeister, D., Fungal Genetics and Biology 2008, 45 (11), 1487-
1496.
5 Yeh, H. H.; Chiang, Y. M.; Entwistle, R.; Ahuja, M.; Lee, K. H.; Bruno, K. S.; Wu, T. K.; Oakley, B. R.;
Wang, C. C. C., Applied Microbiology and Biotechnology 2012, 96 (3), 739-748.
6 Guo, C.-J.; Knox, B. P.; Sanchez, J. F.; Chiang, Y.-M.; Bruno, K. S.; Wang, C. C. C., Organic Letters
2013, 15 (14), 3562-3565.
7 Sun, W. W.; Guo, C. J.; Wang, C. C. C., Fungal Genetics and Biology 2016, 89, 84-88.
8 Chiang, Y. M.; Oakley, C. E.; Ahuja, M.; Entwistle, R.; Schultz, A.; Chang, S. L.; Sung, C. T.; Wang, C.
C. C.; Oakley, B. R., Journal of the American Chemical Society 2013, 135 (20), 7720-7731.
9 Tohge, T.; Watanabe, M.; Hoefgen, R.; Fernie, A. R., Frontiers in Plant Science 2013, 4, 13.
10 Szewczyk, E.; Nayak, T.; Oakley, C. E.; Edgerton, H.; Xiong, Y.; Taheri-Talesh, N.; Osmani, S. A.;
Oakley, B. R., Nature Protocols 2006, 1 (6), 3111-3120.
11 Guo, C. J.; Sun, W. W.; Bruno, K. S.; Oakley, B. R.; Keller, N. P.; Wang, C. C. C., Chemical Science
2015, 6 (10), 5913-5921.
12 Yeh, H. H.; Chang, S. L.; Chiang, Y. M.; Bruno, K. S.; Oakley, B. R.; Wu, T. K.; Wang, C. C. C., Organic
Letters 2013, 15 (4), 756-759.
13 Beer, R.; Herbst, K.; Ignatiadis, N.; Kats, I.; Adlung, L.; Meyer, H.; Niopek, D.; Christiansen, T.;
Georgi, F.; Kurzawa, N.; Meichsner, J.; Rabe, S.; Riedel, A.; Sachs, J.; Schessner, J.; Schmidt, F.; Walch, P.;
Niopek, K.; Heinemann, T.; Eils, R.; Di Ventura, B., Molecular Biosystems 2014, 10 (7), 1709-1718.
14 Calcott, M. J.; Ackerley, D. F., Biotechnology Letters 2014, 36 (12), 2407-2416.
15 Duerfahrt, T.; Doekel, S.; Sonke, T.; Quaedflieg, P.; Marahiel, M. A., European Journal of
Biochemistry 2003, 270 (22), 4555-4563.
16 Fischbach, M. A.; Lai, J. R.; Roche, E. D.; Walsh, C. T.; Liu, D. R., Proceedings of the National
Academy of Sciences of the United States of America 2007, 104 (29), 11951-11956.
Supporting Information
Engineering fungal nonribosomal peptide synthetase-like enzymes by
heterologous expression and domain swapping
Table of Contents
Isolation of compound 2
Table S1. Fungal strains used in this study
Table S2. Primers used in this study
Table S3. NMR data for compound 2
Table S4. NMR data for compound 4
Figure S1. Schematics for fusion PCR and transformation
Figure S2. Diagnostic PCR
Figure S3. Alignment of NRPS-like protein sequences
Figure S4. Sequencing data of hybrids
Figure S5. Spectral data compounds 1-4
Figure S6. NMR spectra of compound 2
Figure S7. High Resolution Mass Spectrometry of compound 4
Figure S8. NMR spectra of compound 4
Supplemental references
Isolation of compound 2
The gradient system was MeCN (solvent B) and 5% MeCN/H2O (solvent A), both containing 0.05% TFA. The
gradient condition for semi-preparative HPLC of the crude extract of the alcA_btyA strain was 0-2 min
100%-80% A, 2-17 min 80%-60% A, 17-19 min 60%-0% A, 19-21 min 0%-100% A, 21-23 min 100% A.
Compound 2 eluted together with impurities in a single fraction at 9.15 min, which was further purified
using a different gradient (0-2 min 100% A, 2-4 min 100%-72% A, 4-15 min 72% A, 15-17 min
72%-0% A, 17-19 min 0%-100% A, 19-21 min 100% A) to yield pure compound 2 (84.79 mg/L).
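To make the first gradient program easier to read, the sketch below interpolates the programmed solvent composition at a given time. It assumes that each segment is a linear ramp between the stated endpoints and it ignores instrument dwell volume, so the value is the composition programmed at the pump, not necessarily the one at the column. The names FIRST_GRADIENT, percent_a and percent_mecn are introduced here for illustration only.

```python
from bisect import bisect_right

# (time in min, % solvent A) breakpoints transcribed from the first gradient above.
FIRST_GRADIENT = [(0, 100), (2, 80), (17, 60), (19, 0), (21, 100), (23, 100)]


def percent_a(gradient, t):
    """Linearly interpolate the programmed % solvent A at time t (min)."""
    times = [point[0] for point in gradient]
    if not times[0] <= t <= times[-1]:
        raise ValueError("time outside the programmed gradient")
    i = bisect_right(times, t) - 1
    if i == len(gradient) - 1:  # exactly at the final breakpoint
        return float(gradient[-1][1])
    (t0, a0), (t1, a1) = gradient[i], gradient[i + 1]
    return a0 + (a1 - a0) * (t - t0) / (t1 - t0)


def percent_mecn(gradient, t):
    """Total % MeCN, given that A is 5% MeCN in water and B is pure MeCN."""
    a = percent_a(gradient, t)
    return 0.05 * a + (100 - a)


print(round(percent_a(FIRST_GRADIENT, 9.15), 1), round(percent_mecn(FIRST_GRADIENT, 9.15), 1))
```

Under these assumptions the program is at roughly 70% A (about 33% MeCN in total) at 9.15 min, which is close to the 72% A isocratic hold used in the second purification step.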
Table S1. Fungal strains used in this study.
Fungal strain or transformants | Gene mutation(s) | Genotype
LO4389 + AfpyrG (A. nidulans) | stcJΔ, AfpyrG | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::AfpyrG
CW8501 | stcJΔ, alcA(p)-apvA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-apvA-AfpyrG
CW8502 | stcJΔ, alcA(p)-btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-btyA-AfpyrG
CW6053.1 (ref. 11) | stcJΔ, alcA(p)-pgnA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-pgnA-AfpyrG
CW8503 | stcJΔ, alcA(p)-apvA/btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-apvA(A and T)-btyA(TE)-AfpyrG
CW8504 | stcJΔ, alcA(p)-apvA/btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-apvA(A)-btyA(T and TE)-AfpyrG
CW8505 | stcJΔ, alcA(p)-pgnA/btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-pgnA(A and T)-btyA(TE)-AfpyrG
CW8506 | stcJΔ, alcA(p)-pgnA/btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W; yA::alcA(p)-pgnA(A)-btyA(T and TE)-AfpyrG
LO4389 | stcJΔ | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W
Table S2. Primers used in this study.
primer sequence (5' → 3')
Primers used in heterologous expression of complete genes from Aspergillus terreus
ATEG2004.1HEF CCA ATC CTA TCA CCT CGC CTC AAA ATG ACT TTG AAC AAC CTA CA
ATEG2004.1HER CGA AGA GGG TGA AGA GCA TTG CGC TTG ACT TTC AAT AGA CG
ATEG2815.1HEF CCA ATC CTA TCA CCT CGC CTC AAA ATG ACC AAA ATT GAT TTG AT
ATEG2815.1HER CGA AGA GGG TGA AGA GCA TTG GCG TTC TAT TTG CTT TGA A
ATEG8899.1HEF (ref. 3) CCA ATC CTA TCA CCT CGC CTC AAA ATG AAT AAG AAG CTC AAG CT
ATEG8899.1HER (ref. 3) CGA AGA GGG TGA AGA GCA TTG ATG GGA CAG GCG ATA GAT AA
Primers used in heterologous expression of hybrid genes from Aspergillus terreus
ATEG2004.1ATR GTA CTC GCG CGG GCC CTG
ATEG2815.1TEF CAG GGC CCG CGC GAG TAC AAC CCC GTC GTG ACC CTG
ATEG8899.1ATR GTA TAT GCT GCA GTC TCT TTG
ATEG2815.1TEF CAA AGA GAC TGC AGC ATA TAC AAC CCC GTC GTG ACC CTG
ATEG2004.1AR CAG CTC GTT ATT GGC GTT CT
ATEG2815.1TTEF AGA ACG CCA ATA ACG AGC TGC TCA AGC TCC GCC AGT CC
ATEG8899.1AR CAT CTG TTC GTT CGC TTC TTG
ATEG2815.1TTEF CAA GAA GCG AAC GAA CAG ATG CTC AAG CTC CGC CAG TCC
Primers for sequencing hybrid strains
2004 seq FW T-TE GGA GCC GCA GAA CGA CCT
2004 seq FW A-T CCG AGC TAC AGC TGT TGC
8899 seq FW A-T-TE GCG GTC CGG TGG TTC TCA
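The hybrid-gene primers in Table S2 follow the usual fusion PCR pattern: the forward primer for the 3' fragment carries a 5' tail that is the reverse complement of the reverse primer used for the 5' fragment, so the two amplicons can anneal to each other. The short check below, a sketch with helper names introduced here, confirms that pattern for the four primer pairs as transcribed from the table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer):
    """Drop the triplet spacing used in Table S2 and upper-case the sequence."""
    return "".join(primer.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


# (reverse primer of the 5' fragment, forward fusion primer of the 3' fragment),
# copied from the hybrid-gene section of Table S2.
PRIMER_PAIRS = {
    "ATEG2004.1ATR / ATEG2815.1TEF": (
        "GTA CTC GCG CGG GCC CTG",
        "CAG GGC CCG CGC GAG TAC AAC CCC GTC GTG ACC CTG",
    ),
    "ATEG8899.1ATR / ATEG2815.1TEF": (
        "GTA TAT GCT GCA GTC TCT TTG",
        "CAA AGA GAC TGC AGC ATA TAC AAC CCC GTC GTG ACC CTG",
    ),
    "ATEG2004.1AR / ATEG2815.1TTEF": (
        "CAG CTC GTT ATT GGC GTT CT",
        "AGA ACG CCA ATA ACG AGC TGC TCA AGC TCC GCC AGT CC",
    ),
    "ATEG8899.1AR / ATEG2815.1TTEF": (
        "CAT CTG TTC GTT CGC TTC TTG",
        "CAA GAA GCG AAC GAA CAG ATG CTC AAG CTC CGC CAG TCC",
    ),
}


def overlap_length(reverse_primer, fusion_primer):
    """Length of the fusion primer's 5' tail that pairs with the reverse primer (0 if none)."""
    tail = reverse_complement(reverse_primer)
    return len(tail) if clean(fusion_primer).startswith(tail) else 0


for name, (rev, fwd) in PRIMER_PAIRS.items():
    print(name, overlap_length(rev, fwd))
```

In every pair the full reverse primer is mirrored in the fusion primer, and the remaining 18 nucleotides of each fusion primer anneal to the ATEG_2815 (btyA) template.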
Table S3. NMR data for compound 2 (400 and 100 MHz in DMSO-d6)
Position δH (J in Hz) δC HMBC (a) COSY NOESY
1 157.9, C
2 6.88, d (8.0) 115.9, CH 1, 3, 4, 5, 6 H-3 H-3
3 7.61, d (8.0) 129.0, CH 1, 2, 5, 6, 7 H-2 H-2, H2-12
4 121.5, C
5 7.61, d (8.0) 129.0, CH 1, 2, 3, 6, 7 H-6 H-6
6 6.88, d (8.0) 115.9, CH 1, 2, 3, 4, 5 H-5 H-5
7 128.0, C
8 138.2, C
9 168.5, C
10 85.2, C
11 171.0, C
12 3.38, s 38.1, CH2 7, 10, 11, 13, 14, 16 H-3, H-18
13 124.0, C
14 6.61, d (8.0) 131.2, CH 12, 15, 16, 17, 18, H-15 H-15
15 6.51, d (8.0) 114.6, CH 13, 14, 16, 17, 18 H-14 H-14
16 — 156.3, C
17 6.51, d (8.0) 114.6, CH 13, 14,15, 16, 17 H-18 H-18
18 6.61, d (8.0) 131.2, CH 12, 14, 15, 16, 17 H-17 H2-12, H-17
11-OH 10.5, br s
(a) HMBC correlations are from proton(s) to the indicated carbon.
Table S4. NMR data for compound 4 (400 and 100 MHz in DMSO-d6).
Position δH (J in Hz) δC HMBC (a) COSY NOESY
1 7.39, t (8.0) 130.8, CH 2,3,5,6 H-2, H-6 H-2, H-6
2 7.49, t (8.0) 129.3, CH 1,3,5,6 H-1, H-3 H-1, H-3
3 7.72, d (8.0) 127.5, CH 4,5,7 H-2 H-2, H-12
4 129.0, C
5 7.72, d (8.0) 127.5, CH 3,4,7 H-6 H-6, H-12
6 7.49, t (8.0) 129.3, CH 1,2,3,5 H-1, H-5 H-1, H-5
7 127.2, C
8 140.7, C
9 168.4, C
10 85.6, C
11 170.9, C
12 3.50, s 39.0, CH2 7,10,11,13,14,18 H-3, H-5, H-14, H-18
13 134.3, C
14 6.78, d (8.0) 130.6, CH 16,18 H-15 H-12, H-15
15 7.12, t 128.2, CH 13 H-14 H-14
16 7.12, t 127.3, CH 13 H-15, H-17
17 7.12, t 128.2, CH 13 H-18 H-18
18 6.78, d (8.0) 130.6, CH 14,16 H-17 H-17, H-12
8-OH 10.9, br s
(a) HMBC correlations are from proton(s) to the indicated carbon.
The impurities visible in the NMR spectra could be excluded from the assignments using the predicted
molecular formula derived from the High Resolution Mass Spectrum, which meant that there was a finite
number of carbon and proton chemical shifts to be assigned for the molecule. The carbon shift at 168.9
ppm was assigned to an impurity. It is in the range of carbonyls and shows HMBC coupling with a proton at
3.22 ppm, which does not correspond to any part of compound 4. The carbon shift at 29.5 ppm was
also assigned to an impurity, with only an HMBC coupling to a proton at 1.21 ppm, which does not belong to
the predicted compound 4 either.
Figure S1. Schematics for fusion PCR and transformation. (A) A gene of interest is fused via fusion PCR to a promoter at the 5’
end, a nutritional marker at the 3’ end, and flanking homologous fragments for recombination with the yellow locus in Aspergillus
nidulans. (B) A 5’ and a 3’ fragment from two different heterologous expression constructs are amplified and fused together using
fusion PCR. This fragment is used for transformation in the same fashion as for the native genes. Experimental details for fusion
PCR and transformation can be found in references 8 and 17.
Figure S2. Diagnostic PCR.
The yellow locus of each strain was amplified using yA P1 and yA P6 primers. The unmodified yellow locus
of the background strain has an expected size of 3940 bp, whereas the mutants after insertion of the
construct have an expected size of ~7400 bp. All strains have their expected sizes as can be seen on the
gel image.
Figure S3. Alignment of NRPS-like protein sequences with (A) Pfam domain predictions and hybrid locations highlighted (B)
secondary structure predictions and hybrid locations highlighted.
A.
CLUSTAL O(1.2.3) multiple sequence alignment
_ A domain
_ T domain
_ TE domain
_ hybrid locations
ATEG_2815 --------------------MTKIDLINLLDHAADTVAASGILIYSPGNVESPHRLTYAE
ATEG_2004 --------------------MTLNNLQALLRRVAAREDSGHVVVYGMGNTKAFKSYSYQD
ATEG_8899 MNKKLKLFSMPGAQTSQIVIMLFQSLLHLLEAIASREPTRYIITYSIGNTHTPEIFSYSD
* .* ** * : :: *. **..: . :* :
ATEG_2815 LRDAAQQNARRLGCMEGFAPGSLVLLHLDGYRDNMIWLWSLIYAGCIPVMSTPFAHHEEH
ATEG_2004 LLRVAIKASVALRKTSDLHPGSVILLHFDNHWDNIVWFWAASFAGCLPAISASFSNDASQ
ATEG_8899 LLQSARKAAGALRFKYHVVPGSVVLLHFNDHWNSMLWFWATLIADCIPAMSTPFSNNPET
* * : : * . ***::***::.: :.::*:*: *.*:*.:*: *::. .
ATEG_2815 RRSHLLHLQSLLRDPICLTRQGLEAQFPPDVGFRLCNIESIS-GGSNPFTSPRLGQPPGQ
ATEG_2004 RTAHIERLSTTLMHPLCLTNERIMADFAGQDAVQPLAVETLVLNGDVSFEALP---QEHP
ATEG_8899 RLRHLKHLSTTLRSPKCLTTASLAAEFAGQEYITPICVQSLDYENLVHL----------P
* *: :*.: * * *** : *:* : . :::: . :
ATEG_2815 DDNVDDIALLMLTSGSTGHAKAVPLTHSQLLSALAGKERFLQLRQHGPSLNWVAFDHIAS
ATEG_2004 EPSLSDDALLLFTSGSSGNSKGVCLSHGQILASISGKYAVRPLPDNTSFLNWVGLDHVAA
ATEG_8899 IKEGGDIAVLMFTSGSSGHCKVVPLTHEQILASLSGKAWTFPLPDNTAQLNWVGMNHVAS
. .* *:*::****:*:.* * *:* *:*::::** * :: ****.::*:*:
ATEG_2815 LAEMHFHPIFACIDQVHVAAADVITDPLILLELIHRHRVGITFAPNFLLAKLLDSLEREP
ATEG_2004 IVEIHLQAMYALKTQVHVPAADILSSPATFLQLLEKHRVSRTFAPNFFLAKLRDLLQEND
ATEG_8899 LVEVHLFSIYTHSDQVHIPTVEVLSHVTLFLDLIHRHRVSRTFAPNFFLAKLRAALSADD
:.*:*: ::: ***: :.:::: :*:*:.:***. ******:**** *. :
ATEG_2815 -SPSSRPWDLSCLMHLLSGGEANVVDTCARLARRLTQDYGVPSTCIKPAFGMTETCAGCS
ATEG_2004 SLPEPRRWDLRSLEYVASGGEANVTKTCDRLSEYL-VAFGAPKDVIVPGFGMTETCAGSI
ATEG_8899 TLAKY-TGSLSNLRYIVSGGEANVTQTINDLAQML-KKCGAVSNVIVPAFGMTETCAGAI
. .* * :: *******..* *:. * *. . * *.*********.
ATEG_2815 FNDRFPTYETVHMLDFASLGRGVKGVQMRVTSLSTGQPVDDHSEVGNLELSGPSVFRGYY
ATEG_2004 FNTRCPEYDKSRSAEFASVGTCMPGISMRVTDLSN--NALPSGEIGHLQLTGPVVFKRYF
ATEG_8899 YNTSFPQYDVEHGLPFASVGSCMPGIQVRIVQLNGNGNSVPPGTVGNLEICGPVVLKGYF
:* * *: : ***:* : *:.:*:..*. . :*:*:: ** *:: *:
ATEG_2815 NNSQATRDSFTPDGWFRTGDLAMIDAGGQLVLRGRSKELICINGAKYLPHEVESAIEDAK
ATEG_2004 NNTSATQEAFTPDGWFKTGDMGCIDENGCLTLTGRAKENMIINGVNHSPHEIETALD--K
ATEG_8899 NNPAATKSTFTNDNWFKTGDLAFVDDNGMLVLAGREKDSIIVNGANYSPHDIESAIDEAN
** **:.:** *.**:***:. :* .* *.* ** *: : :**.:: **::*:*:: :
ATEG_2815 VRGVTPGFTICFGYRPAKAQTESLAVVYLPAYEEADVESRSQAQNAIIRVGLIMTGTRPY
ATEG_2004 IPGLTPSYSCCFSFFPSGGETEEICVVYLPTYSPDDLAARAQTADAISKTVLMSTGSRPH
ATEG_8899 IPGLISGFTCCFSTFPPSADTEEVIIVYLPNYTPADTVRRSETAAAIRKVAMMSVGVRAT
: *: .:: **. * .:**.: :**** * * *::: ** :. :: .* *
ATEG_2815 VLPLDAHTLVKSSLGKISRNKIKTGLESGAFQAFEETNNRLLKLRQSTPVVPAGNETETL
ATEG_2004 VLPLEREALPKSSLGKLSRAKIKAAYEKGEYATYQNANNELMRRYRESTRAEPQNDLEKT
ATEG_8899 VLPLDRTMLEKSTLGKLARGKIKAAYERGDYKSYQEANEQMMALHHKVSHHQPRSGLEQS
****: * **:***::* ***:. * * : :::::*:.:: :. . *
ATEG_2815 LLAA-ALHVFRVTADEFGVETPMFAFGITSLDMIAWKRQAETILGH---EIPMLAIITSP
ATEG_2004 LLEVFTRSLSI-TDDAFDVKTPIFDVGINSVELIRLKRDIEDHLGMAASAIPMIMLMTHS
ATEG_8899 LLGVFTRTIPENLTEDFDVLTSIFDLGITSIELLKLKRGIEDLIGHG--QIPLITLMTNP
** . : : : *.* * :* .**.*:::: ** * :* **:: ::*
ATEG_2815 TIRVLARQLQD--GHHGPGEYNPVVTLQPHGSKTPLWLIHPIGGEVLVFVSLAGLFADDR
ATEG_2004 TVRDLATALEKLQG---PREYDP----------NPLWLVHPGAGEVLVFINLAQ-YIVDR
ATEG_8899 TIRTLSDALKQHAQQRDCSIYNPVVVLQSQGKKPPIWLVHPVGGEVMIFMNLAK-FIIDR
*:* *: *:. : :* . *:**:** .***::*:.** : **
ATEG_2815 PVHALRARGLNRGEPPFGSIHEAADAYYQAIKRVQPHGPYAVAGYSYGSLVAFEVAKRLD
ATEG_2004 PVYALRARGFNDGEQPFETIEEATASYYNGIRSRQPHGPYALAGYCYGSMLAFEVAKMLE
ATEG_8899 PVYGLRARGFNDGEDPFHTFEEIVSTYHASIKEKQPSGPYAIAGYSYGAKVAFDIAKALE
**:.*****:* ** ** ::.* . :*: .*: ** ****:***.**: :**::** *:
ATEG_2815 QHGKDEVPFFGSLDLPPFHA-QIISKSDWTESLLHLASSLSLIAEEEINTLGA--DLRGL
ATEG_2004 SHG-EEVRFLGSFNLPPHIK-MRMRELDWKECLLHLAYFLDLVSQERSREMSV--ELAGL
ATEG_8899 HNG-DEVRFLGLLDLPPSLNGTQMRAVAWKEMLLHICRMVGVIREEGIKKIYPRLEPENI
:* :** *:* ::*** : *.* ***:. :.:: :* . : : .:
ATEG_2815 PQPRAIQKILARAPPRRIRELDLSPDGLMRWTKLTSAMAQATRGYVPVGQTRSVDVFYTE
ATEG_2004 SHDEILDSVIQNANMERYAELSLNRPLLVRWADVAYELHRMAFDYDPAGCVAGMDVFFSI
ATEG_8899 SPRHAIETVMGEADVTRLAELGLTASALERWANLTHALQRCIVDHKTNGSVAGADAFYCD
. ::.:: .* * **.*. * **:.:: : : .: * . . *.*:
ATEG_2815 PSGALATTRDEWLDRH-REWRQFGRLETQFHPLEGLHYRLMDEDNVHKVYRVLSRAMDAR
ATEG_2004 PLAIAAASKTEWRNVHLSQWEDFTRTVPKFHDVAGEHYSMIGPDHVFSFQKTLRKALDER
ATEG_8899 PMASMAISNEQWACDYIGKWSDHTRSPPRFHHIAGTHYTILDAENIFSFQKTFLRALNDR
* . * :. :* : :* :. * :** : * ** ::. :::... :.: :*:: *
ATEG_2815 GTLCLSLSL
ATEG_2004 GM-------
ATEG_8899 GI-------
*
B.
NPS @ Network Protein Sequence Alignment
_ predicted alpha helix
_ predicted extended strand (beta sheet)
_ random coil
ATEG_2815 --------------------MTKIDLINLLDHAADTVAASGILIYSPGNVESPHRLTYAE
ATEG_2004 --------------------MTLNNLQALLRRVAAREDSGHVVVYGMGNTKAFKSYSYQD
ATEG_8899 MNKKLKLFSMPGAQTSQIVIMLFQSLLHLLEAIASREPTRYIITYSIGNTHTPEIFSYSD
ATEG_2815 LRDAAQQNARRLGCMEGFAPGSLVLLHLDGYRDNMIWLWSLIYAGCIPVMSTPFAHHEEH
ATEG_2004 LLRVAIKASVALRKTSDLHPGSVILLHFDNHWDNIVWFWAASFAGCLPAISASFSNDASQ
ATEG_8899 LLQSARKAAGALRFKYHVVPGSVVLLHFNDHWNSMLWFWATLIADCIPAMSTPFSNNPET
ATEG_2815 RRSHLLHLQSLLRDPICLTRQGLEAQFPPDVGFRLCNIESIS-GGSNPFTSPRLGQPPGQ
ATEG_2004 RTAHIERLSTTLMHPLCLTNERIMADFAGQDAVQPLAVETLVLNGDVSFEALP---QEHP
ATEG_8899 RLRHLKHLSTTLRSPKCLTTASLAAEFAGQEYITPICVQSLDYENLVHL----------P
ATEG_2815 DDNVDDIALLMLTSGSTGHAKAVPLTHSQLLSALAGKERFLQLRQHGPSLNWVAFDHIAS
ATEG_2004 EPSLSDDALLLFTSGSSGNSKGVCLSHGQILASISGKYAVRPLPDNTSFLNWVGLDHVAA
ATEG_8899 IKEGGDIAVLMFTSGSSGHCKVVPLTHEQILASLSGKAWTFPLPDNTAQLNWVGMNHVAS
ATEG_2815 LAEMHFHPIFACIDQVHVAAADVITDPLILLELIHRHRVGITFAPNFLLAKLLDSLEREP
ATEG_2004 IVEIHLQAMYALKTQVHVPAADILSSPATFLQLLEKHRVSRTFAPNFFLAKLRDLLQEND
ATEG_8899 LVEVHLFSIYTHSDQVHIPTVEVLSHVTLFLDLIHRHRVSRTFAPNFFLAKLRAALSADD
ATEG_2815 -SPSSRPWDLSCLMHLLSGGEANVVDTCARLARRLTQDYGVPSTCIKPAFGMTETCAGCS
ATEG_2004 SLPEPRRWDLRSLEYVASGGEANVTKTCDRLSEYL-VAFGAPKDVIVPGFGMTETCAGSI
ATEG_8899 TLAKY-TGSLSNLRYIVSGGEANVTQTINDLAQML-KKCGAVSNVIVPAFGMTETCAGAI
ATEG_2815 FNDRFPTYETVHMLDFASLGRGVKGVQMRVTSLSTGQPVDDHSEVGNLELSGPSVFRGYY
ATEG_2004 FNTRCPEYDKSRSAEFASVGTCMPGISMRVTDLSN--NALPSGEIGHLQLTGPVVFKRYF
ATEG_8899 YNTSFPQYDVEHGLPFASVGSCMPGIQVRIVQLNGNGNSVPPGTVGNLEICGPVVLKGYF
ATEG_2815 NNSQATRDSFTPDGWFRTGDLAMIDAGGQLVLRGRSKELICINGAKYLPHEVESAIEDAK
ATEG_2004 NNTSATQEAFTPDGWFKTGDMGCIDENGCLTLTGRAKENMIINGVNHSPHEIETALD--K
ATEG_8899 NNPAATKSTFTNDNWFKTGDLAFVDDNGMLVLAGREKDSIIVNGANYSPHDIESAIDEAN
ATEG_2815 VRGVTPGFTICFGYRPAKAQTESLAVVYLPAYEEADVESRSQAQNAIIRVGLIMTGTRPY
ATEG_2004 IPGLTPSYSCCFSFFPSGGETEEICVVYLPTYSPDDLAARAQTADAISKTVLMSTGSRPH
ATEG_8899 IPGLISGFTCCFSTFPPSADTEEVIIVYLPNYTPADTVRRSETAAAIRKVAMMSVGVRAT
ATEG_2815 VLPLDAHTLVKSSLGKISRNKIKTGLESGAFQAFEETNNRLLKLRQSTPVVPAGNETETL
ATEG_2004 VLPLEREALPKSSLGKLSRAKIKAAYEKGEYATYQNANNELMRRYRESTRAEPQNDLEKT
ATEG_8899 VLPLDRTMLEKSTLGKLARGKIKAAYERGDYKSYQEANEQMMALHHKVSHHQPRSGLEQS
ATEG_2815 LLAA-ALHVFRVTADEFGVETPMFAFGITSLDMIAWKRQAETILGH---EIPMLAIITSP
ATEG_2004 LLEVFTRSLSI-TDDAFDVKTPIFDVGINSVELIRLKRDIEDHLGMAASAIPMIMLMTHS
ATEG_8899 LLGVFTRTIPENLTEDFDVLTSIFDLGITSIELLKLKRGIEDLIGHG--QIPLITLMTNP
ATEG_2815 TIRVLARQLQD--GHHGPGEYNPVVTLQPHGSKTPLWLIHPIGGEVLVFVSLAGLFADDR
ATEG_2004 TVRDLATALEKLQG---PREYDP----------NPLWLVHPGAGEVLVFINLAQ-YIVDR
ATEG_8899 TIRTLSDALKQHAQQRDCSIYNPVVVLQSQGKKPPIWLVHPVGGEVMIFMNLAK-FIIDR
ATEG_2815 PVHALRARGLNRGEPPFGSIHEAADAYYQAIKRVQPHGPYAVAGYSYGSLVAFEVAKRLD
ATEG_2004 PVYALRARGFNDGEQPFETIEEATASYYNGIRSRQPHGPYALAGYCYGSMLAFEVAKMLE
ATEG_8899 PVYGLRARGFNDGEDPFHTFEEIVSTYHASIKEKQPSGPYAIAGYSYGAKVAFDIAKALE
ATEG_2815 QHGKDEVPFFGSLDLPPFHA-QIISKSDWTESLLHLASSLSLIAEEEINTLGA--DLRGL
ATEG_2004 SHG-EEVRFLGSFNLPPHIK-MRMRELDWKECLLHLAYFLDLVSQERSREMSV--ELAGL
ATEG_8899 HNG-DEVRFLGLLDLPPSLNGTQMRAVAWKEMLLHICRMVGVIREEGIKKIYPRLEPENI
ATEG_2815 PQPRAIQKILARAPPRRIRELDLSPDGLMRWTKLTSAMAQATRGYVPVGQTRSVDVFYTE
ATEG_2004 SHDEILDSVIQNANMERYAELSLNRPLLVRWADVAYELHRMAFDYDPAGCVAGMDVFFSI
ATEG_8899 SPRHAIETVMGEADVTRLAELGLTASALERWANLTHALQRCIVDHKTNGSVAGADAFYCD
ATEG_2815 PSGALATTRDEWLDRH-REWRQFGRLETQFHPLEGLHYRLMDEDNVHKVYRVLSRAMDAR
ATEG_2004 PLAIAAASKTEWRNVHLSQWEDFTRTVPKFHDVAGEHYSMIGPDHVFSFQKTLRKALDER
ATEG_8899 PMASMAISNEQWACDYIGKWSDHTRSPPRFHHIAGTHYTILDAENIFSFQKTFLRALNDR
ATEG_2815 GTLCLSLSL
ATEG_2004 GM-------
ATEG_8899 GI-------
Figure S4. Sequencing data of hybrids. For each hybrid strain, the region where the domain from one gene ends and the
domain from the other gene begins was amplified and sequenced using the primers in Table S2. This confirms that the predicted
hybrid sequence was correctly inserted into the A. nidulans genomic DNA.
2004 A, T/2815 TE
NNNNNNNNNNNNANNCTNNCGCGCTCGCTGAGCATCACAGACGACGCATTCGATGTGAAAACCCCTATTT
TCGATGTGGGAATCAACTCGGTCGAGTTGATCCGTCTGAAAAGAGACATCGAAGACCATCTCGGCATGGC
TGCGTCAGCAATCCCCATGATCATGCTGATGACGCACAGCACCGTCAGAGATCTAGCCACTGCTTTGGAA
AAACTGCAGGGCCCGCGCGAGTACAACCCCGTCGTGACCCTGCAACCACACGGTTCTAAGACTCCCCTGT
GGCTCATCCACCCCATCGGTGGCGAAGTGCTCGTGTTCGTCTCCCTGGCCGGGCTCTTCGCGGACGACCG
CCCGGTGCACGCACTTCGCGCACGCGGCCTCAACCGGGGCGAACCACCTTTCGGCAGCATCCATGAAGCC
GCCGACGCCTATTACCAGGCCATCAAGCGTGTCCAGCCCCATGGGCCGTATGCCGTCGCCGGATACTCGT
ACGGCTCGCTGGTCGCTTTCGAGGTCGCCAAGCGCCTCGACCAGCACGGTAAGGACGAAGTCCCGTTCTT
TGGNTCCCTANATCTACCTCCGCTCCACGCCCAAATCATCANCANNAGCGACTGNACCGANAGTCCTACT
CCNCCTAGNNN
2004 A/2815 T, TE
NNNNNNNNNNNNNNNNNNNNACCGAGGAGATCTGTGTGGTGTATCTACCCACATACAGCCCTGACGACCT
GGCTGCACGAGCCCAGACGGCCGATGCAATTTCAAAGACAGTCCTCATGAGTACGGGTTCGCGGCCGCAT
GTACTGCCGCTCGAAAGGGAGGCCCTTCCCAAATCCTCGTTGGGAAAGCTGTCGCGAGCCAAAATCAAGG
CGGCGTACGAGAAGGGGGAGTATGCGACCTACCAGAACGCCAATAACGAGCTGCTCAAGCTCCGCCAGTC
CACACCAGTCGTGCCTGCCGGGAATGAGACGGAGGCTCTGCTCCTCGCTGCGGCGTTGCATGTCTTCCGC
GTCACCGCGGACGAGTTCGGCGTCGAAACGCCCATGTTCGCCTTCGGGATCACTTCCTTGGACATGATCG
CGTGGAAGAGACAAGCAGAGACCATCCTCGGGCACGAGATCCCCATGCTGGCCATCATAACCAGCCCGAC
CATCAGGGTCCTCGCTCGCCAACTGCAGGATGGGCACCATGGCCCGGGCGAATACAACCCCGTCGTGACC
CTGCAACCACACGGTTCTAAGACTCCCCTGTGGCTCATCCACCCCATCGGTGGCGAAGTGCTCGTGTTCG
TCTCCCTGGCCGGGCTCTTCGCGGACGACCGCCCGGTGCACGCACTTCGCGCACGCGGCCTCAACCGGGG
CGAACCACCTTTCGGCAGCATCCATGAAGCCGCCGACGCCTATTACCAGGCCATCAAGCGTGTCCAGCCC
CATGGGCCGTATGCCGTCGCCGGATACTCGTACGGCTCGCTGGTCGCTTTCGAGGTCGCCAAGCGCCTCG
ACCAGCACGGTAAGGACGAAAGTCCCGTTCTTTGGGTCCCTAGATCTACCNNCGTTCCACGCCCAAATCA
TCAGCAAGAGCGACTGGACCGANAGTCTACTCCACCTAGCGAGCTCCCTCTCCCTCATTGCCGAGGANGA
GATCAANNCCCTTGGCGCTGACCTCNNANGCCTACCNCAGCCCAGAGCAATCCAANAANNANCCTCGCAC
NGGNCACNCCCGCGGNNNCATCCNCCAAGCTGGACCTNNGNCCCNAATGGACTGATGCNGTNNNANNAAA
CTGAACGNNNNNNNNNGCANNNNNNCNNCNNGGGNNNNNNNNNCCNGTNCGANANNNAANNNNTANNNNN
NNN
8899 A, T/2815 TE
NNNNNNNNNNNNCTGCTANNNGTCGACATTCACGAACGACAATTGGTTCAAAACCGGAGATTTAGCTTTC
GTTGACGATAACGGAATGCTGGTACTTGCTGGACGTGAAAAGGATAGCATCATTGTGAATGGGGCCAACT
ACAGTCCACACGATATCGAGTCCGCCATCGACGAAGCAAACATCCCCGGCCTTATCTCTGGTTTCACTTG
TTGTTTCTCCACGTTCCCGCCCAGCGCAGACACAGAGGAGGTCATCATTGTTTATCTCCCAAATTACCCA
CCAGCGGACACAGTTCGACGATCTGAAACTGCAGCCGCGATCAGAAAGGTCGCCATGATGTCAGTCGGCG
TGCGTGCCACAGTTCTCCCGCTCGACCGGACAATGCTGGAGAAATCGACTCTGGGCAAGCTTGCCCGCGG
CAAGATCAAGGCTGCTTATGAAAGGGGAGACTATAAAAGTTATCAAGAAGCGAACGAACAGCTGATGGCT
CTACACCACAAAGTGTCGCATCATCAGCCGCGGTCTGGTCTCGAACAGAGTCTACTCGGCGTCTTCACCC
GCACTATACCCGAGAACTTGACGGAGGACTTCGACGTGTTGACGTCAATATTTGATCTGGGAATCACATC
CATCGAGCTCCTCAAGCTCAAGAGAGGTATCGAAGATCTGATAGGTCACGGACAGATTCCTCTCATCACC
CTGATGACAAACCCCACTATCCGGACATTATCAGACGCGCTGAAGCAGCACGCTCAGCAAAGAGACTGCA
GCATATACAACCCCGTCGTGACCCTGCAACCACACGGTTCTAAGACTCCCCTGTGGCTCATCCACCCCAT
CGGTGGCGAAGTGCTCGTGTTCGTCTCCCTGGCCGGGCTCTTCGCGGACGACCGCCCGGTGCACGCACTT
CGCGCACGCGGCCTCAACCGGGGCGAACCACCTTTCGGCAGCATCCATGAAGCCGCCGACGCCTATTACC
NNNCATCAAGCGTGTCCAGCCCCATGGGCCGTATGCCGTCGCCGGATACTCGTACGGCTCGCTGGNCGCT
TTCGAGGTCGCCNNCNCNCGACNGCACGGNNNGACGANNNCCNNNNNTTGGGTCCNANATCTACCTNCNN
TTCCNNNCCCNAANCATCAGCNAANNN
8899 A/ 2815 T, TE
NNNNNNNGCNGCTNCAAGTCGACATTCNCGAACGACAATTGGTTCAAAACCGGAGATTTAGCTTTCGTTG
GCGATAACGGAATGCTGGTACTTGCTGGACGTGAAAAGGATAGCATCATTGTGAATGGGGCCAACTACAG
TCCACACGATATCGAGTCCGCCATCGACGAAGCAAACATCCCCGGCCTTATCTCTGGTTTCACTTGTTGT
TTCTCCACGTTCCCGCCCAGCGCAGACACAGAGGAGGTCATCATTGTTTATCTCCCAAATTACCCACCAG
CGGACACAGTTCGACGATCTGAAACTGCAGCCGCGATCAGAAAGGTCGCCATGATGTCAGTCGGCGTGCG
TGCCACAGTTCTCCCGCTCGACCGGACAATGCTGGAGAAATCGACTCTGGGCAAGCTTGCCCGCGGCAAG
ATCAAGGCTGCTTATGAAAGGGGAGACTATAAAAGTTATCAAGAAGCGAACGAACAGATGCTCAAGCTCC
GCCAGTCCACACCAGTCGTGCCTGCCGGGAATGAGACGGAGGCTCTGCTCCTCGCTGCGGCGTTGCATGT
CTTCCGCGTCACCGCGGACGAGTTCGGCGTCGAAACGCCCATGTTCGCCTTCGGGATCACTTCCTTGGAC
ATGATCGCGTGGAAGAGACAAGCAGAGACCATCCTCGGGCACGAGATCCCCATGCTGGCCATCATAACCA
GCCCGACCATCAGGGTCCTCGCTCGCCAACTGCAGGATGGGCACCATGGCCCGGGCGAATACAACCCCGT
CGTGACCCTGCAACCACACGGTTCTAAGACTCCCCTGTGGCTCATCCACCCCATCGGTGGCGAAGTGNTC
GTGTTCGTCTCCCTGGNNGGGCTCTTCNCGGACGACCGCCCGGTGCACGCACTTCGCGCACGCGGNNCAA
CCGGGGCGANCACCTTTCGCAGCATCCATGANCGCNACGCCTATACNGNCATCAGCGTGTCAGCCCATGG
NNNNANNCNNCNNNNNACTCGTANNNTCGCTGNNGCTTTCNAGTCGCANCGCNNGACNNNANGNNAGNAC
NAANTCCNNTNNTNNNCNNNANNNACNTNCNTNNCNNNNNNCNTNNNCNNNNNNCNGNNNNNNANNNNNN
NCNNNNNNNNCNNNNNNNNNNNNNNANCNNNCNNNNNNNNACNNNNNNGC
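One way to read these junctions is to translate the stretch that spans the fusion point and compare it with the alignment in Figure S3. The sketch below does this for the 36-nucleotide junction found in the first read (2004 A, T/2815 TE), which is identical to primer ATEG2815.1TEF. It uses the standard genetic code and assumes that the reading frame starts at the first base of that stretch, an assumption made here because that frame reproduces the aligned protein sequences; the helper names and the two alignment rows copied from Figure S3 are included for illustration.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame 1; codons containing N (or other symbols) become X."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Junction from the "2004 A, T/2815 TE" read.
JUNCTION = "CAGGGCCCGCGCGAGTACAACCCCGTCGTGACCCTG"

# Alignment rows around the hybrid location, copied from Figure S3 (gaps as '-').
ATEG_2004_ROW = "TVRDLATALEKLQG---PREYDP----------NPLWLVHPGAGEVLVFINLAQ-YIVDR"
ATEG_2815_ROW = "TIRVLARQLQD--GHHGPGEYNPVVTLQPHGSKTPLWLIHPIGGEVLVFVSLAGLFADDR"

peptide = translate(JUNCTION)
upstream_ok = peptide[:6] in ATEG_2004_ROW.replace("-", "")
downstream_ok = peptide[4:] in ATEG_2815_ROW.replace("-", "")
print(peptide, upstream_ok, downstream_ok)
```

The first six residues match ATEG_2004 (ApvA) and the last eight match ATEG_2815 (BtyA), with the shared EY motif marking the crossover, which is consistent with an in-frame fusion at the intended location.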
Figure S5. Spectral data compounds 1-4.
Spectral data of compounds 1 and 3 were reported previously.
9, 11
Spectral data of compound 2 are
reported here for the first time and the m/z corresponds to butyrolactone IIa. Spectral data of
compound 4 indicate that it is different from compounds 1-3. LC-MS conditions can be found in a previous
report.
18
Figure S6. NMR spectra of compound 2.
Figure S7. High Resolution Mass Spectrometry.
The high-resolution mass spectrometry total ion current and mass spectrum of the major fraction of the crude
extract of the cultured 8899/2815 hybrid show an m/z of 309.0770 in negative mode, which corresponds to a
predicted molecular formula of C18H14O5. This molecular formula aided in the NMR data analysis.
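The agreement between the observed m/z and the proposed formula can be checked with a few lines of arithmetic. The sketch below assumes that the negative-mode ion is the deprotonated molecule, [M-H]-, and uses standard monoisotopic masses; the function names are introduced here for illustration.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858

FORMULA = {"C": 18, "H": 14, "O": 5}  # C18H14O5, as proposed for compound 4
OBSERVED_MZ = 309.0770                # negative mode, from Figure S7


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass of an element-count dictionary."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_deprotonated(formula):
    """m/z of [M-H]-: remove one hydrogen atom, add back one electron."""
    return monoisotopic_mass(formula) - MONOISOTOPIC["H"] + ELECTRON


def ppm_error(observed, theoretical):
    """Mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def double_bond_equivalents(formula):
    """Rings plus double bonds for a C/H/O-only formula."""
    return formula["C"] - formula["H"] / 2 + 1


theoretical = mz_deprotonated(FORMULA)
print(round(theoretical, 4), round(ppm_error(OBSERVED_MZ, theoretical), 2),
      double_bond_equivalents(FORMULA))
```

Under the [M-H]- assumption the calculated m/z is about 309.0768, so the observed value lies within roughly 0.5 ppm, and the formula implies 12 rings plus double bonds.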
Figure S8. NMR spectra of compound 4.
Supplemental references
1. Balibar, C. J.; Howard-Jones, A. R.; Walsh, C. T., Terrequinone A biosynthesis through L-tryptophan
oxidation, dimerization and bisprenylation. Nature Chemical Biology 2007, 3 (9), 584-592.
2. Wackler, B.; Lackner, G.; Chooi, Y. H.; Hoffmeister, D., Characterization of the Suillus grevillei
Quinone Synthetase GreA Supports a Nonribosomal Code for Aromatic alpha-Keto Acids. Chembiochem
2012, 13 (12), 1798-1804.
3. Zhu, J.; Chen, W.; Li, Y. Y.; Deng, J. J.; Zhu, D. Y.; Duan, J.; Liu, Y.; Shi, G. Y.; Xie, C.; Wang, H. X.;
Shen, Y. M., Identification and catalytic characterization of a nonribosomal peptide synthetase-like (NRPS-
like) enzyme involved in the biosynthesis of echosides from Streptomyces sp LZ35. Gene 2014, 546 (2),
352-358.
4. Schneider, P.; Bouhired, S.; Hoffmeister, D., Characterization of the atromentin biosynthesis genes
and enzymes in the homobasidiomycete Tapinella panuoides. Fungal Genetics and Biology 2008, 45 (11),
1487-1496.
5. Yeh, H. H.; Chiang, Y. M.; Entwistle, R.; Ahuja, M.; Lee, K. H.; Bruno, K. S.; Wu, T. K.; Oakley, B. R.;
Wang, C. C. C., Molecular genetic analysis reveals that a nonribosomal peptide synthetase-like (NRPS-like)
gene in Aspergillus nidulans is responsible for microperfuranone biosynthesis. Applied Microbiology and
Biotechnology 2012, 96 (3), 739-748.
6. Chiang, Y. M.; Oakley, C. E.; Ahuja, M.; Entwistle, R.; Schultz, A.; Chang, S. L.; Sung, C. T.; Wang, C.
C. C.; Oakley, B. R., An Efficient System for Heterologous Expression of Secondary Metabolite Genes in
Aspergillus nidulans. Journal of the American Chemical Society 2013, 135 (20), 7720-7731.
7. Tohge, T.; Watanabe, M.; Hoefgen, R.; Fernie, A. R., Shikimate and phenylalanine biosynthesis in
the green lineage. Frontiers in Plant Science 2013, 4, 13.
8. Szewczyk, E.; Nayak, T.; Oakley, C. E.; Edgerton, H.; Xiong, Y.; Taheri-Talesh, N.; Osmani, S. A.;
Oakley, B. R., Fusion PCR and gene targeting in Aspergillus nidulans. Nature Protocols 2006, 1 (6), 3111-
3120.
9. Guo, C. J.; Sun, W. W.; Bruno, K. S.; Oakley, B. R.; Keller, N. P.; Wang, C. C. C., Spatial regulation of
a common precursor from two distinct genes generates metabolite diversity. Chemical Science 2015, 6
(10), 5913-5921.
10. Guo, C.-J.; Knox, B. P.; Sanchez, J. F.; Chiang, Y.-M.; Bruno, K. S.; Wang, C. C. C., Application of an
Efficient Gene Targeting System Linking Secondary Metabolites to their Biosynthetic Genes in Aspergillus
terreus. Organic Letters 2013, 15 (14), 3562-3565.
11. Sun, W. W.; Guo, C. J.; Wang, C. C. C., Characterization of the product of a nonribosomal peptide
synthetase-like (NRPS-like) gene using the doxycycline dependent Tet-on system in Aspergillus terreus.
Fungal Genetics and Biology 2016, 89, 84-88.
12. Yeh, H. H.; Chang, S. L.; Chiang, Y. M.; Bruno, K. S.; Oakley, B. R.; Wu, T. K.; Wang, C. C. C.,
Engineering Fungal Nonreducing Polyketide Synthase by Heterologous Expression and Domain Swapping.
Organic Letters 2013, 15 (4), 756-759.
13. Beer, R.; Herbst, K.; Ignatiadis, N.; Kats, I.; Adlung, L.; Meyer, H.; Niopek, D.; Christiansen, T.;
Georgi, F.; Kurzawa, N.; Meichsner, J.; Rabe, S.; Riedel, A.; Sachs, J.; Schessner, J.; Schmidt, F.; Walch, P.;
Niopek, K.; Heinemann, T.; Eils, R.; Di Ventura, B., Creating functional engineered variants of the single-
module non-ribosomal peptide synthetase IndC by T domain exchange. Molecular Biosystems 2014, 10
(7), 1709-1718.
14. Calcott, M. J.; Ackerley, D. F., Genetic manipulation of non-ribosomal peptide synthetases to
generate novel bioactive peptide products. Biotechnology Letters 2014, 36 (12), 2407-2416.
15. Duerfahrt, T.; Doekel, S.; Sonke, T.; Quaedflieg, P.; Marahiel, M. A., Construction of hybrid peptide
synthetases for the production of alpha-L-aspartyl-L-phenylalanine, a precursor for the high-intensity
sweetener aspartame. European Journal of Biochemistry 2003, 270 (22), 4555-4563.
16. Fischbach, M. A.; Lai, J. R.; Roche, E. D.; Walsh, C. T.; Liu, D. R., Directed evolution can rapidly
improve the activity of chimeric assembly-line enzymes. Proceedings of the National Academy of Sciences
of the United States of America 2007, 104 (29), 11951-11956.
17. Yu, J. H.; Hamari, Z.; Han, K. H.; Seo, J. A.; Reyes-Dominguez, Y.; Scazzocchio, C., Double-joint PCR:
a PCR-based molecular tool for gene manipulations in filamentous fungi. Fungal Genetics and Biology
2004, 41 (11), 973-981.
18. Bok, J. W.; Chiang, Y. M.; Szewczyk, E.; Reyes-Dominguez, Y.; Davidson, A. D.; Sanchez, J. F.; Lo, H.
C.; Watanabe, K.; Strauss, J.; Oakley, B. R.; Wang, C. C. C.; Keller, N. P., Chromatin-level regulation of
biosynthetic gene clusters (vol 5, pg 462, 2009). Nature Chemical Biology 2009, 5 (9), 696-696.
Chapter 4
Expanding the chemical space of Nonribosomal Peptide
Synthetase-like Enzymes by domain and tailoring enzyme
recombination
Biosynthetic gene clusters contain a myriad of tailoring enzymes in addition to the usual one or two core
backbone enzymes. Among the most frequently found and well-characterized are methyltransferases,
prenyltransferases, oxygenases, glycosyltransferases and aminotransferases. Several of these types of
enzymes are also found in primary metabolism. The cytochrome P450 family of monooxygenases plays an
important role in drug metabolism, DNA methyltransferases regulate transcription and prenyltransferases
post-translationally modify proteins to confer membrane-binding functionality. In general, tailoring of
small molecule natural products can be viewed as analogous to post-translational modification of
proteins. Elucidating the biosynthetic pathway of natural products often revolves around solving the
concerted action of tailoring enzymes, rather than just identifying the one or two core enzymes. The
increasing understanding of all the different types of modifying enzymes has the potential to increase the
diversity and complexity of small molecules that could be generated by metabolic pathway engineering.
Ideally, a conceptual, novel natural product could be biosynthesized by combining the core enzyme that
yields the backbone of interest with the correct set of modifying enzymes. At the same time, a small library
of natural products can be generated by making a set of enzyme combinations.
In our previous research, we showed that non-ribosomal peptide synthetase (NRPS)-like enzymes from
Aspergillus terreus can be easily expressed in our Aspergillus nidulans host and produce their products¹.
We also demonstrated that the individual domains can be exchanged to make hybrid enzymes with
predictable functionality. Here, we aim to expand the complexity of the products of these enzymes by
adding modifying enzymes. To identify potential modifying enzymes for our small molecules we looked
near three NRPS-like genes that were previously expressed heterologously. apvA (ATEG_02004.1)
produces aspulvinone E which is used in melanin biosynthesis or for further aspulvinone derivatives.
Phenguignardic acid is the product of pgnA (ATEG_08899.1), which appears to be a stand-alone enzyme,
since the final product is a known secondary metabolite in other species. btyA (ATEG_02815.1) produces
butyrolactone IIa, which is a precursor for other butyrolactones found in A. terreus². We chose tailoring
enzymes from the butyrolactone pathway for recombination with NRPS-like enzyme (hybrids) since there
is a neighboring S-adenosyl methionine (SAM)-methyltransferase (ATEG_02816.1/btyB) and the
subsequent prenyltransferase was recently identified as well (ATEG_01730.1/abpB), though not near
btyA³. BtyB methylates butyrolactone IIa at the carboxylic acid to yield butyrolactone II. A single
subsequent trans-prenylation by AbpB yields butyrolactone I, which can be extracted from wild-type A.
terreus. Butyrolactone III, an epoxidated form of butyrolactone I, can also be found in the extract of A.
terreus, though the gene responsible for that is not known at this point². After reconstituting this pathway
in our A. nidulans host, we recombined these tailoring enzymes with functional hybrid NRPS-like enzymes
to increase their complexity and gain knowledge on the specificity of these modifying enzymes.
Engineering of biosynthesis is usually accompanied by loss in efficiency and thus yield. A robust
heterologous host with the capacity to produce g/L titers is therefore employed here for pathway
engineering⁴. As a control and proof of concept, the butyrolactone I pathway from Aspergillus terreus was
reconstituted in our Aspergillus nidulans host. The butyrolactone IIa producing mutant strain CW8502
previously developed in our lab underwent two more cycles of transformation. First, the
methyltransferase ATEG_02816.1 was amplified from A. terreus genomic DNA. This gene was merged with
an alcA promoter, an AfpyroA marker and flanking regions to facilitate homologous recombination right
after the btyA gene, thereby recycling the 3’ AfpyrG marker. Colonies were analyzed by diagnostic PCR and
for their inability to grow without uridine to confirm the AfpyrG marker was recycled. A correct colony
was restreaked and cultured in liquid glucose minimal media (GMM) as described before and
overexpression of btyA and the methyltransferase btyB was induced by addition of methylethylketone
(MEK). Extract analysis by LCMS indicated the production of butyrolactone II based on retention time and
m/z value. This was confirmed by 2D NMR (Figure S#). The presence of butyrolactone IIa shows that the
efficiency of the methyltransferase is not optimal under these conditions, though sufficient methylated
product was formed for structural characterization. This successful mutant strain was used for subsequent
transformation to add the prenyltransferase ATEG_01730.1. As for the methyltransferase, this
gene was amplified from A. terreus genomic DNA and fused to the alcA promoter, flanking regions, and
now an AfpyrG marker. Colonies were again analyzed by diagnostic PCR, restreaked and grown in liquid
GMM with MEK induction. LCMS analysis of the extract showed the presence of butyrolactone I, based on
m/z value and retention time. This was confirmed by 2-D NMR as well (Figure S#).
Figure 1. HPLC traces of ethyl acetate extracts of mutant strains with genes from the butyrolactone pathway heterologously
expressed at a total spectrum scan by DAD detector. (A) Overexpressed btyA shows butyrolactone IIa (*) production with minor,
unidentified, side products. (B) Addition of the methyltransferase results in the production of butyrolactone II (#) as the major
product, though * is still present as well. The shift in retention time corresponds to methylation of the carboxylic acid. (C) Further
addition of the trans-prenyltransferase leads to the appearance of mono prenylated butyrolactone I (Δ), though roughly equal
amounts of # are observed. Mass spectra, UV absorption and 1-H and 13-C NMR of each major compound can be found in the
supplemental.
Besides butyrolactone I, an equal amount of II can be seen in the LCMS trace, which means the
prenyltransferase does not fully convert under these conditions. This could be due to suboptimal relative
expression levels, cellular localization, or insufficient culture time. Butyrolactone IIa, however, is
completely absent in this strain, suggesting an equilibrium shift towards the final product. It must be noted
that ATEG_01730.1 was found to also be responsible for aspulvinone E bisprenylation to aspulvinone H,
therefore suggesting a more complex regulatory mechanism.
Now that the methyl- and prenyltransferase were shown to be functional in a reconstituted butyrolactone
pathway, a previously generated hybrid NRPS-like enzyme was combined with these tailoring enzymes in
the same two-step transformation process. CW8504 is a heterologous expression strain with the A domain
of pgnA and the T and TE domain of btyA, previously shown to produce a novel compound. The LCMS
traces show the appearance of a new peak with an m/z value and retention time that corresponds to
methylation. Scale up, purification and NMR analysis of the new product confirmed methylation of
phenylbutyrolactone IIa. The previously reported instability of phenylbutyrolactone IIa is not observed in
the methylated molecule. However, addition of the prenyltransferase did not result in any prenylated
product, which could be expected since the hydroxyl group on the phenyl side chain is absent and is most
likely necessary for aromatic ring prenylation. This recombination approach was applied to a different
hybrid NRPS-like enzyme. A bishydroxylated form of phenguignardic acid was generated by combining the
A domain of apvA, which was shown to be substitutable with the btyA A domain since they activate the
same HPPA unit, with the T and TE domain of pgnA. In this case the methyltransferase did not methylate
the carboxylic acid side chain of the heterocyclic core of the molecule. This indicates a high specificity of
the methyltransferase. Prenyl group transfer to this hybrid molecule was achieved, though at a much
lower efficiency under these conditions than for the native butyrolactone pathway.
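The m/z values that "correspond to" methylation or prenylation follow from fixed mass increments: methylation replaces H with CH3 (a net gain of CH2) and prenylation replaces H with C5H9 (a net gain of C5H8). The sketch below is illustrative only. It assumes that the [M-H]- value of 309.0770 reported for the 8899/2815 hybrid product is that of phenylbutyrolactone IIa, and the function name and parameters are hypothetical.

```python
# Monoisotopic mass increments (u), computed from C = 12.000000 and H = 1.00782503.
CH2 = 12.000000 + 2 * 1.00782503        # net gain on methylation (H -> CH3)
C5H8 = 5 * 12.000000 + 8 * 1.00782503   # net gain on prenylation (H -> C5H9)


def expected_mz(parent_mz, n_methyl=0, n_prenyl=0):
    """Expected m/z of a singly charged ion after the given number of tailoring steps."""
    return parent_mz + n_methyl * CH2 + n_prenyl * C5H8


# Assumption: observed [M-H]- of the pgnA/btyA hybrid product (phenylbutyrolactone IIa).
parent = 309.0770

for label, n_methyl, n_prenyl in [
    ("untailored", 0, 0),
    ("methylated", 1, 0),
    ("methylated + prenylated", 1, 1),
]:
    print(f"{label:<25s} {expected_mz(parent, n_methyl, n_prenyl):.4f}")
```

A new LCMS peak whose m/z differs from the parent by one of these increments, together with a retention time shift, is the kind of evidence used above before confirmation by NMR.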
Figure 2. HPLC traces of ethyl acetate extracts of mutant strains with a hybrid NRPS-like gene combined with the butyrolactone
pathway heterologously expressed at a total spectrum scan by DAD detector. (A) Expression of pgnA/btyA hybrid gene shows
phenylbutyrolactone IIa production. (B) The added methyltransferase results in phenylbutyrolactone II production which shows
a similar retention shift as butyrolactone II. However, no prenylation was observed on the phenyl side chain (C). Mass spectra,
UV absorption and 1-H and 13-C NMR of each major compound can be found in the supplemental.
Figure 3. HPLC traces of ethyl acetate extracts of mutant strains with a novel hybrid NRPS-like gene combined with the butyrolactone
pathway heterologously expressed at a total spectrum scan by DAD detector. (A) The apvA/pgnA hybrid expressed in the A.
nidulans host yields hydroxyphenguignardic acid (*) as its major product. (B) The methyltransferase addition did not result in the
expected product. (C) The prenyltransferase seems only moderately functional in this context, with only minor prenylated
product (Δ) detectable. Mass spectra, UV absorption and 1-H and 13-C NMR of each indicated compound can be found in the
supplemental.
In conclusion, we showed that the butyrolactone pathway can be partially reconstituted in Aspergillus
nidulans. It seems the tailoring enzymes work specifically on their target functional groups. The carboxylic
acid on the butyrolactone core in phenylbutyrolactone IIa was methylated, despite the absence of the
hydroxyl groups on the aromatic side chains. However, when the core was slightly changed but the side
chains were intact in hydroxyphenguignardic acid, no methylation was observed. This may be because the
methyltransferase recognizes a specific molecule, because it has some protein-protein interaction with the
C-terminal end of the NRPS-like enzyme, or because of some other form of colocalization that does not occur
with a slightly different NRPS-like enzyme or in a heterologous host. The prenyltransferase was already
shown to act on multiple targets with hydroxyphenyl side chains, since both butyrolactones and
aspulvinones are targets in A. terreus. Hydroxyphenguignardic acid can be added to that group, though
the full scope of the targets of this prenyltransferase remains to be explored. For example, can indole side
chains be prenylated by ATEG_01730.1, or can prenyltransferases from the terrequinone A pathway⁵
prenylate products of novel hybrid NRPS-like enzymes? What this research does show is the capability
(and some limitations) of enzyme recombination to expand the chemical space of engineered natural
products. Previously, these types of tailoring enzymes have been recombined with native, small NRPS. For
example, ftmPS from Neosartorya fischeri produces a cyclic dipeptide brevianamide, which can be
prenylated in three different ways by coexpression of three prenyltransferases from A. nidulans⁶. In this
example prenylation was also not complete, with non-prenylated dipeptide still being the main product,
like what was found in this study. The use of hybrid NRPS-like enzymes in this type of pathway engineering
has not been reported before though. A wider, more systematic and automated screening of domain and
tailoring enzyme combinations together with more structural data should lead to a more diverse library
of these types of compounds that can be used for bioactivity screening.
References
1. van Dijk, J. W. A.; Guo, C. J.; Wang, C. C. C., Engineering Fungal Nonribosomal Peptide Synthetase-
like Enzymes by Heterologous Expression and Domain Swapping. Organic Letters 2016, 18 (24), 6236-
6239.
2. Guo, C.-J.; Knox, B. P.; Sanchez, J. F.; Chiang, Y.-M.; Bruno, K. S.; Wang, C. C. C., Application of an
Efficient Gene Targeting System Linking Secondary Metabolites to their Biosynthetic Genes in Aspergillus
terreus. Organic Letters 2013, 15 (14), 3562-3565.
3. Guo, C. J.; Sun, W. W.; Bruno, K. S.; Oakley, B. R.; Keller, N. P.; Wang, C. C. C., Spatial regulation of
a common precursor from two distinct genes generates metabolite diversity. Chemical Science 2015, 6
(10), 5913-5921.
4. Chiang, Y. M.; Oakley, C. E.; Ahuja, M.; Entwistle, R.; Schultz, A.; Chang, S. L.; Sung, C. T.; Wang, C.
C. C.; Oakley, B. R., An Efficient System for Heterologous Expression of Secondary Metabolite Genes in
Aspergillus nidulans. Journal of the American Chemical Society 2013, 135 (20), 7720-7731.
5. Balibar, C. J.; Howard-Jones, A. R.; Walsh, C. T., Terrequinone A biosynthesis through L-tryptophan
oxidation, dimerization and bisprenylation. Nature Chemical Biology 2007, 3 (9), 584-592.
6. Wunsch, C.; Mundt, K.; Li, S. M., Targeted production of secondary metabolites by coexpression
of non-ribosomal peptide synthetase and prenyltransferase genes in Aspergillus. Applied Microbiology
and Biotechnology 2015, 99 (10), 4213-4223.
Supporting Information
Expanding the chemical space of Nonribosomal Peptide Synthetase-like Enzymes
by domain and tailoring enzyme recombination
Table of Contents
Culturing of Mutant Strains and Isolation of Compounds
Table S1. Fungal strains used in this study
Table S2. Primers used in this study
Table S3. NMR data for butyrolactone IIa
Table S4. NMR data for butyrolactone II
Table S5. NMR data for butyrolactone I
Table S6. NMR data for phenylbutyrolactone IIa
Table S7. NMR data for phenylbutyrolactone II
Table S8. NMR data for hydroxyphenguignardic acid
Figure S1. Schematics for fusion PCR and transformation
Figure S2. Schematics for marker recycling
Figure S3. Diagnostic PCR
Figure S4. Sequencing data of hybrid
Figure S5. ATEG_02816.1 and ATEG_01730.1 sequences
Figure S6. Spectral data compounds
Figure S7. NMR spectra of butyrolactone II
Figure S8. NMR spectra of butyrolactone I
Figure S9. NMR spectra of phenylbutyrolactone II
Figure S10. NMR spectra of hydroxyphenguignardic acid
Culturing of Mutant Strains and Isolation of Compounds
Mutant strains were inoculated at 1 million spores per mL in 4 x 240 mL GMM + supplements and cultured with
shaking at 180 rpm for 42 hours at 37 °C before induction with 50 mM MEK. After induction, the cultures were
shaken at 180 rpm for three more days at 30 °C. Dried ethyl acetate extracts of the media yielded
40-100 mg of crude extract, varying per strain and iteration. The solvent system used to purify the crude
extract was 5% MeCN/H2O (solvent A) and MeCN (solvent B), both containing 0.05% TFA. The gradient for
semi-preparative HPLC of the CW8602b crude was 0-5 min 0% B, 5-20 min 40-100% B, 20-22 min 100% B,
22-23 min 100-0% B, 23-27 min 0% B, with butyrolactone II eluting at 9.47 min and butyrolactone I at
12.20 min. For phenylbutyrolactone II (CW8603a) the gradient was 0-5 min 0% B, 5-6 min 0-40% B, 6-18
min 40-100% B, 18-20 min 100% B, 20-21 min 100-0% B, 21-25 min 0% B, with phenylbutyrolactone II eluting
at 14.44 min. CW8607a yielded hydroxyphenguignardic acid and its prenylated form, purified using a
gradient with 0-5 min 0% B, 5-15 min 40-75% B, 15-17 min 75-100% B, 17-18 min 100-0% B, 18-21 min 0%
B, with elution at 9.47 min and 12.09 min, respectively.
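A gradient program of this kind is a piecewise-linear function of time, so the programmed solvent composition at a given retention time can be read off by interpolation. The sketch below encodes the gradient stated above for the CW8603a purification. It is an illustration only: the names are hypothetical, and it returns the programmed %B at the pump, ignoring system dwell volume and column delay, which are not given in the text.

```python
# Each segment is (start_min, end_min, start_percent_B, end_percent_B),
# transcribed from the gradient described for phenylbutyrolactone II (CW8603a).
GRADIENT_CW8603A = [
    (0, 5, 0, 0),
    (5, 6, 0, 40),
    (6, 18, 40, 100),
    (18, 20, 100, 100),
    (20, 21, 100, 0),
    (21, 25, 0, 0),
]


def percent_b(t, gradient=GRADIENT_CW8603A):
    """Programmed %B at time t (min), by linear interpolation within a segment."""
    for t0, t1, b0, b1 in gradient:
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    raise ValueError(f"time {t} min is outside the gradient program")


# Programmed composition at the reported elution time of 14.44 min.
print(f"%B at 14.44 min: {percent_b(14.44):.1f}")
```

Writing the program as explicit segments also makes discontinuities easy to spot, such as a step change in %B between the end of one segment and the start of the next.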
Table S1. Fungal strains used in this study.
Fungal strain or transformants | Gene mutation(s) | Genotype
CW8502 | stcJΔ, alcA(p)-btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-btyA-AfpyrG
CW8601 | stcJΔ, alcA(p)-btyA, alcA(p)-btyB | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-btyA-alcA(p)-btyB-AfpyroA
CW8602b | stcJΔ, alcA(p)-btyA, alcA(p)-btyB, alcA(p)-abpB | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-btyA-alcA(p)-btyB-alcA(p)-abpB-AfpyrG
CW8506 | stcJΔ, alcA(p)-pgnA/btyA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-pgnA(A)-btyA(T and TE)-AfpyrG
CW8603a | stcJΔ, alcA(p)-pgnA/btyA, alcA(p)-btyB | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-pgnA(A)-btyA(T and TE)-alcA(p)-btyB-AfpyroA
CW8604a | stcJΔ, alcA(p)-pgnA/btyA, alcA(p)-btyB, alcA(p)-abpB | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-pgnA(A)-btyA(T and TE)-alcA(p)-btyB-alcA(p)-abpB-AfpyrG
CW8605b | stcJΔ, alcA(p)-apvA/pgnA | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-apvA(A)-pgnA(T and TE)-AfpyrG
CW8606e | stcJΔ, alcA(p)-apvA/pgnA, alcA(p)-btyB | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-apvA(A)-pgnA(T and TE)-alcA(p)-btyB-AfpyroA
CW8607a | stcJΔ, alcA(p)-apvA/pgnA, alcA(p)-btyB, alcA(p)-abpB | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W yA::alcA(p)-apvA(A)-pgnA(T and TE)-alcA(p)-btyB-alcA(p)-abpB-AfpyrG
LO4389 | stcJΔ | pyrG89; pyroA4; nkuA::argB; riboB2; stcA-W
Table S2. Primers used in this study.
primer | sequence (5' → 3')
Primers used in heterologous expression of NRPS-like (hybrid) enzymes from Aspergillus terreus
ATEG2815.1HEF | CCA ATC CTA TCA CCT CGC CTC AAA ATG ACC AAA ATT GAT TTG AT
ATEG2815.1HER | CGA AGA GGG TGA AGA GCA TTG GCG TTC TAT TTG CTT TGA A
ATEG8899.1HEF³ | CCA ATC CTA TCA CCT CGC CTC AAA ATG AAT AAG AAG CTC AAG CT
ATEG8899.1AR | CAT CTG TTC GTT CGC TTC TTG
ATEG2815.1TTEF | CAA GAA GCG AAC GAA CAG ATG CTC AAG CTC CGC CAG TCC
ATEG2004.1HEF | CCA ATC CTA TCA CCT CGC CTC AAA ATG ACT TTG AAC AAC CTA CA
ATEG2004.1AR | CAG CTC GTT ATT GGC GTT CT
ATEG8899.1TTEF | CTA CCA GAA CGC CAA TAA CGA GCT GAT GGC TCT ACA CCA CAA AGT G
ATEG8899.1HER³ | CGA AGA GGG TGA AGA GCA TTG ATG GGA CAG GCG ATA GAT AA
Primers used in heterologous expression of tailoring enzymes from Aspergillus terreus
ATEG2816.1HEF | CCA ATC CTA TCA CCT CGC CTC AAA ATG ACA CAA TCG GAA GTC AGT
ATEG2816.1HER | CGA AGA GGG TGA AGA GCA TTG GGG CTG TAG AGT AGA GAC C
2816_5’flanking(2815)FW | CCA CAC GTC TAA GAC TC
2816_5’flanking(2815)FW_nested | CCT GTG GCT CAT CCA CC
2816_5’flanking(2815)Rev | CTA TCA CAA TCA GCT TTT CAG GCG TTC TAT TTG CTT TGA A
2816_5’flanking(8899)FW | CCA CCC ATC TGG CTT GTC C
2816_5’flanking(8899)FW_nested | CCA GTC GGC GGA GAA GTC
2816_5’flanking(8899)Rev | CTA TCA CAA TCA GCT TTT CAG CTC ACG ATG AAA TTT CGT TTA GC
ATEG1730.1HEF | CCA ATC CTA TCA CCT CGC CTC AAA CTT CCA TAC CAA CAG ACA TCG
ATEG1730.1HER | CGA AGA GGG TGA AGA GCA TTG_CAA GGT GTT CAA CGC CAG AC
1730_5’flanking_FW | ATG ACA CAA TCG GAA GTC AGT
1730_5’flanking_FW_nested | GTC AGT TTC GGC ATC GAC AC
1730_5’flanking_Rev | GAA CTA TCA CAA TCA GCT TTT CAG_GGG CTG TAG AGT AGA GAC C
Primers for sequencing hybrid strain
2004 seq FW A-T | CCG AGC TAC AGC TGT TGC
Table S3. NMR data for butyrolactone IIa (400 and 100 MHz in DMSO-d6)
Position δH (J in Hz) δC HMBCᵃ COSY NOESY
1 157.9, C
2 6.88, d (8.0) 115.9, CH 1, 3, 4, 5, 6 H-3 H-3
3 7.61, d (8.0) 129.0, CH 1, 2, 5, 6, 7 H-2 H-2, H2-12
4 121.5, C
5 7.61, d (8.0) 129.0, CH 1, 2, 3, 6, 7 H-6 H-6
6 6.88, d (8.0) 115.9, CH 1, 2, 3, 4, 5 H-5 H-5
7 128.0, C
8 138.2, C
9 168.5, C
10 85.2, C
11 171.0, C
12 3.38, s 38.1, CH2 7, 10, 11, 13, 14, 16 H-3, H-18
13 124.0 C
14 6.61, d (8.0) 131.2, CH 12, 15, 16, 17, 18, H-15 H-15
15 6.51, d (8.0) 114.6, CH 13, 14, 16, 17, 18 H-14 H-14
16 156.3, C
17 6.51, d (8.0) 114.6, CH 13, 14,15, 16, 17 H-18 H-18
18 6.61, d (8.0) 131.2, CH 12, 14, 15, 16, 17 H-17 H2-12, H17
11-OH 10.5, br s
ᵃ HMBC correlations are from proton(s) to the indicated carbon.
Table S4. NMR data for butyrolactone II (400 and 100 MHz in DMSO-d6)
Position δH (J in Hz) δC HMBCᵃ COSY HMQC
1 158.3, C
2 6.86, d 116.3, CH 1, 4, 6 3
3 7.50, d 129.2, CH 1, 2, 5, 7 2
4 121.4, C
5 7.50, d 129.2, CH 1, 3, 6, 7 6
6 6.86, d 116.3, CH 1, 2, 4 5
7 127.8, C
8 138.5, C
9 168.3, C
10 85.1, C
11 170.2, C
12 3.38, d 38.4, CH2 7, 10,11, 13, 14, 18
13 123.6, C
14 6.56, d 131.6, CH 12, 16, 18 12, 15
15 6.48, d 115.0, CH 13,16, 17 12, 14
16 156.7, C
17 6.48, d 115.0, CH 13, 16, 15 12, 18
18 6.56, d 131.6, CH 12, 14, 16 12, 17
11-OCH3 3.73, s 53.9, CH3 11, 12
1-OH 9.96 1, 2, 6
8-OH 10.58 7, 8, 9
16-OH 9.23 15, 16, 17
ᵃ HMBC correlations are from proton(s) to the indicated carbon.
Table S5. NMR data for butyrolactone I (400 and 100 MHz in DMSO-d6)
Position δH (J in Hz) δC HMBCᵃ COSY HMQC
1 158.3, C
2 6.86, d 116.2, CH 1, 4, 6 H-3
3 7.47, d 129.2, CH 1, 2, 5, 6, 7 H-2
4 121.4, C
5 7.47, d 129.2, CH 1, 2, 3, 6, 7 H-6
6 6.86, d 116.2, CH 1, 2, 4 H-5
7 127.9, C
8 138.5, C
9 168.4, C
10 85.1, C
11 170.2, C
12 3.34, s 38.5, CH2 3, 5, 7, 10, 11, 13, 14, 18
13 123.5, C
14 6.45, d 131.8, CH 18
15 6.51, d 114.5, CH 13, 17
16 154.2, C
17 126.9, C
18 6.34, s 131.3, CH 16
11-OCH3 3.73, d 53.9, CH3 11, 12
1’ 2.98, d 27.9, CH2 2’, 14, 17, 18 H2’, H-4’, H5’
2’ 4.98, t 122.7, CH H-1’, H-4’, H-5’
3’ 131.8, C
4’ 1.51, s 17.9, CH3 2’, 5’, 14 H-1’, H2’
5’ 1.60, s 25.9, CH3 2’, 4’, 14, 17 H-1’, H2’
1-OH 9.93, s 1, 2, 6
8-OH 10.52, s 7, 8, 9
16-OH 9.13 15, 16, 17
ᵃ HMBC correlations are from proton(s) to the indicated carbon.
Table S6. NMR data for phenylbutyrolactone IIa (400 and 100 MHz in DMSO-d6).
Position δH (J in Hz) δC HMBCᵃ COSY NOESY
1 7.39, t (8.0) 130.8, CH 2,3,5,6 H-2, H-6 H-2, H-6
2 7.49, t (8.0) 129.3, CH 1,3,5,6 H-1, H-3 H-1, H-3
3 7.72, d (8.0) 127.5, CH 4,5,7 H-2 H-2, H-12
4 129.0, C
5 7.72, d (8.0) 127.5, CH 3,4,7 H-6 H-6, H-12
6 7.49, t (8.0) 129.3, CH 1,2,3,5 H-1, H-5 H-1, H-5
7 127.2, C
8 140.7, C
9 168.4, C
10 85.6, C
11 170.9, C
12 3.50, s 39.0, CH2 7,10,11,13,14,18 H-3, H5, H14, H18
13 134.3, C
14 6.78, d (8.0) 130.6, CH 16,18 H-15 H-12, H-15
15 7.12, t 128.2, CH 13 H-14 H-14
16 7.12, t 127.3, CH 13 H-15, H-17
17 7.12, t 128.2, CH 13 H-18 H-18
18 6.78, d (8.0) 130.6, CH 14,16 H-17 H-17, H-12
8-OH 10.9, br s
ᵃ HMBC correlations are from proton(s) to the indicated carbon.
Table S7. NMR data for phenylbutyrolactone II (400 and 100 MHz in DMSO-d6).
Position δH (J in Hz) δC HMBCᵃ COSY HMQC
1 7.40, t 130.8, CH 3, 5 H-2, H-6
2 7.49, t 129.4, CH 6, 14, 18 H-1, H-3
3 7.63, d 127.5, CH 4, 5 H-1, H-2
4 129.1, C
5 7.63, d 127.5, CH 3, 4 H-1, H-6
6 7.49, t 129.4, CH 2, 14, 18 H-1, H-5
7 126.8, C
8
9 158.1, C
10 85.2, C
11 169.9, C
12 3.54, dd 39.0, CH2 7, 10, 13, 14, 18
13 133.7, C
14 6.78, d 130.6, CH 16, 18 H-15
15 7.14, t 128.2, CH 13 H-14
16 7.14, t 127.5, CH 13 H-15, H-17
17 7.14, t 128.2, CH 13 H-18
18 6.78, d 130.6, CH 14, 16 H-17
8-OH 11.13
11-OCH3 3.75 54.1, CH3 11
ᵃ HMBC correlations are from proton(s) to the indicated carbon.
Table S8. NMR data for hydroxyphenguignardic acid (400 and 100 MHz in DMSO-d6).
Position δH (J in Hz) δC HMBCᵃ COSY HSQC
1 156.5, C
2 6.54, d 115.1, CH 4 H-3
3 6.99, d 132.1, CH 1 H-2
4 124.5, C
5 6.99, d 132.1, CH 1 H-6
6 6.54, d 115.1, CH 4 H-5
7 3.25, s 39.7, CH2 3, 4, 5, 8
8 108.1, C
9 167.1, C
10 164.9, C
11 136.7, C
12 5.94, s 102.0, CH 11, 14, 18
13 124.6, C
14 7.46, d 131.1, CH 16 H-15
15 6.76, d 116.0, CH 13 H-14
16 158.0, C
17 6.76, d 116.0, CH 13 H-18
18 7.46, d 131.1, CH 16 H-17
1-OH 9.19, s 2, 6
16-OH 9.84, s
ᵃ HMBC correlations are from proton(s) to the indicated carbon.
Figure S1. Schematics for fusion PCR and transformation. (A) A gene of interest is fused via fusion PCR to a promoter at the 5’
end, a nutritional marker at the 3’ end, and flanking homologous fragments for recombination with the yellow locus in Aspergillus
nidulans. (B) A 5’ and a 3’ fragment from two different heterologous expression constructs are amplified and fused together using
fusion PCR. This fragment is used for transformation in the same fashion as for the native genes. Experimental details for fusion
PCR and transformation can be found in these references.¹, ²
Figure S2. Schematics for additional rounds of transformation adding the methyltransferase and the prenyltransferase recycling
a nutritional marker every round.
Figure S3. Diagnostic PCR.
The yellow locus of each strain was amplified using yA P1 and yA P6 primers. The unmodified yellow locus
of the background strain has an expected size of 3940 bp, whereas the mutants after insertion of the
NRPS-like (hybrid) gene construct have an expected size of ~7400 bp. Despite picking single colonies from
two restreaks, colony CW8605a seems to have both the insertion and the intact yA gene. The other two
colonies seem to only have the insertion.
Two subsequent rounds of transformation for the btyA strain. Left: Primers flanking the methyltransferase
construct (see Figure S2) were used to see whether it was inserted. The observed increase in size of
CW8601 suggests correct insertion. Right: Primers flanking the prenyltransferase construct (see Figure S2)
were used to see whether it was inserted. All three colonies show a size increase suggesting correct
insertion.
Two subsequent rounds of transformation for the pgnA/btyA strain. Left: Primers flanking the
methyltransferase construct (see Figure S2) were used to see whether it was inserted. The observed
increase in size of CW8603a suggests correct insertion. Right: Primers flanking the prenyltransferase
construct (see Figure S2) were used to see whether it was inserted. The observed increase in size of
CW8604a suggests correct insertion.
Two subsequent rounds of transformation for the apvA/pgnA strain. Left: Primers flanking the
methyltransferase construct (see Figure S2) were used to see whether it was inserted. The observed
increase in size of CW8606e suggests correct insertion. Four other colonies show bands of both sizes,
suggesting that both the inserted and the original gene constructs are present, which is undesirable for the next
round of transformation. Middle: Primers flanking the prenyltransferase construct (see Figure S2) were
used to see whether it was inserted. There is no observed increase, though a faint band can be seen above
the main one in CW8607a and b. Right: An additional diagnostic PCR using primers targeting the
prenyltransferase and the AfpyrG marker shows specific presence of these two genes next to each other
in all 5 colonies.
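The interpretation of the yellow-locus diagnostic PCR described for this figure rests on comparing observed band sizes with two expected products: 3940 bp for the unmodified locus and roughly 7400 bp after insertion. The sketch below restates that reasoning as a small classifier. It is illustrative only: the function name, the labels, and the 10% size tolerance are assumptions, not values from the experiments.

```python
UNMODIFIED_BP = 3940  # expected product from the unmodified yellow (yA) locus
INSERTION_BP = 7400   # approximate expected product after construct insertion


def classify_colony(band_sizes_bp, tolerance=0.10):
    """Classify a colony from its estimated gel band sizes (bp).

    tolerance is the assumed relative error of sizing a band on a gel.
    """
    def near(size, target):
        return abs(size - target) <= tolerance * target

    has_unmodified = any(near(size, UNMODIFIED_BP) for size in band_sizes_bp)
    has_insertion = any(near(size, INSERTION_BP) for size in band_sizes_bp)

    if has_unmodified and has_insertion:
        return "mixed"         # both loci present; not suitable for the next round
    if has_insertion:
        return "insertion"
    if has_unmodified:
        return "unmodified"
    return "inconclusive"


print(classify_colony([7400]))        # single band at the insertion size
print(classify_colony([3900, 7500]))  # both bands, as described for CW8605a
```

A colony classified as mixed corresponds to the case noted above in which both the insertion and the intact locus are detected despite repeated restreaking.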
Figure S4. Sequencing data of new hybrid. The region where one domain of one gene ends and the other
domain of the other gene begins was amplified and sequenced using the primer in table S2. This confirms
that the predicted hybrid sequence was correctly inserted into the A. nidulans genomic DNA.
2004 A/8899 T, TE
NNNNNNNNNNNNNATGCATTTCAAGACAGTCCTCATGAGTACGGGTTCGCGGCCGCATGTACTGCCGCTC
GAAAGGGAGGCCCTTCCCAAATCCTCGTTGGGAAAGCTGTCGCGAGCCAAAATCAAGGCGGCGTACGAGA
AGGGGGAGTATGCGACCTACCAGAACGCCAATAACGAGCTGATGGCTCTACACCACAAAGTGTCGCATCA
TCAGCCGCGGTCTGGTCTCGAACAGAGTCTACTCGGCGTCTTCACCCGCACTATACCCGAGAACTTGACG
GAGGACTTCGACGTGTTGACGTCAATATTTGATCTGGGAATCACATCCATCGAGCTCCTCAAGCTCAAGA
GAGGTATCGAAGATCTGATAGGTCACGGACAGATTCCTCTCATCACCCTGATGACAAACCCCACTATCCG
GACATTATCAGACGCGCTGAAGCAGCACGCTCAGCAANNNAGACTGCA
Figure S5. Sequences of the Aspergillus terreus genomic DNA regions containing the methyltransferase
and prenyltransferase. Start and stop codons are highlighted in green, primers used to amplify are
highlighted in yellow.
ATEG_02816.1 Methyltransferase
TCGGAATCCTACGTCTGGAGGGAACTGCGCCTCCAGGCCTTGCCGAGTCAGGCAGATGGG
GTCCCGTAGGAGAGACTGGAGGTGCAGCAAGTGGGAGCGGCGATGCTCTTCGTGATGTGC
AAATGGCGTAGACATCACCGGGATACATCCGGCATAGATCAGCGACCATAGCCAGATCAT
GTTATCCCGGTAGCCGTCTAAATGCAGCAGAACCAGCGAGCCGGGGGCAAAGCCCTCCAT
GCATCCCAAGCGCCTGGCATTCTGCTGAGCAGCATCCCGAAGTTCAGCATAGGTTAATCG
GTGTGGTGACTCGACATTTCCAGGAGAGTAGATCAGGATTCCGCTGGCAGCAACCGTGTC
GGCTGCGTGGTCCAGGAGGTTAATCAAATCAATTTTGGTCATGAGGCCTTATCGCTAGTG
AGCGGAAGACTTGCAATGCTATGTCGAACGATCTGGGGGAGGGCGTGGTCAAAGAAGACC
ACGCATGTTTAAGGAGCTGCCATCGAAGTGGTACAAGGATGAATACGCCTAGACTGCAAG
GGAGACGCTTGGCTACTACCTAGGTATCACCCCATTTCCACGCCAAGGACATTATTGACT
GGGTTTCTTCTCCTCCTTACCGCCTCAAGGGATAGAATGGAGGGAGAAGCAGCCGGAATG
CAGTCAGCGTAGCATGCATGTTGGGTCGATTTACAACCCAACGGATGTCCTTTCGAATCC
CCCCCCCTCAACCAGGAATAAATGAAGAACAGTCCGCGTTCAAGCTCAGCACATCCACAC
ACAACATGACACAATCGGAAGTCAGTTTCGGCATCGACACCGAAGAGCTGGCCCGTCTCT
ACGAGACGGTGTGTGAGCACCAATATCAGACAGGGCTCGTCCTTCTCGACCAGCTGAAGC
TCGCGCCGGGGGGAGAGTGTTCTCGATGTCGGGTCTGGAACGGGCAAGCTGGCCACCTAC
GCCGCCGGGATGGTGGGCGAAAGCGGGCGAGTCGTTGGTATCGATCCCCTTGGCGCTCGC
GTCAGCATTGCCAACGAGTCTGCCCGGGCAAATCTCTCGTTTGCCGTCGGAGATGCACAC
GATCTCACCCGCTTTGAACCGGCTAGCTTCGATGTCGTCTACCTCAATGCCGTCTTCCAC
TGGCTATCCGACAAACCCGAAGCCCTGCGGCAGTTTGCCCGCGTGCTGAAACCCAACGGC
CGCCTCGGCATCACCACCGGATCAGGCGACCATCGGTTCCCGCACGAGACCGTTCGGGAC
CGTGTGCTTGCCCGGGATGCTTATCGCGCCTATCAGGACGGCGGTGTGAAGGGCCAGGCG
CAACTGGTGACGAGGAAGGAGCTGGAGAGATACCTGACCGACGCCGGATTTCACGCAGGC
CCGGTTACCGAGGTGCCCAATGAGCTATGCACCAGAGACGCCGACACCATGATCGATTTT
GTCGAGTCGAGCTCGTTCGGGAATTATCTCGGCCATCTGCCCGAAGACGTGCGGGCCAGG
GTCCGGCAGGAAATCGCGCAGGAATATGAGGAGTTTCGCACAGCGAATGGGATACGTGCC
AGCGCAATAGACTATCTGGTGGTGGCCACAGTGCCCTCACAAATACCTACTGCATAGTGC
CCGAGTCATGGCCGGTCTGGATGTCAATAGCCGAATGTCAGTCATCATCGTTAGGGGAAG
AATTCACGTCTCATTGGAATACCGCTGTTTCTTGGCTTTATGTAGCATTCCTGCAGCCTG
TTCCTGCTCCGTAGTGGATTCTTCTGTAGTGTCTCTCTACCTTATAGTACCTAGATAGGT
TATTTTCTGTGACAGGTCTCTACTCTACAGCCCGTCCCTACTCCATGGTGGGTAATCTTC
TATTTGTCAACCCTGGGATATCATATGCAACTCCGTTCCCAATGCATGCCATGAAGTTCA
GCTCCATGATAATCAACAGTGGACATGGGTCATAAATCCAATTTCACAGGATGTCTAGTA
ACGGCGCTGTAGGGATTGAGTCCTTTCACATGAAAGGATGTTTGCTATCGGAGTAGTCTT
CGTCTAATAGTCGTTAAGATATCATAGAGGCTATGATTAGTTGGACTTGCCCTCTGAGAT
ATGAAGCCACGATATGTCTCGCGCGGCCAATTAACGTAATGTTAGATAGATACTCTCGAT
CATACGACTTGTATACCACGAAATTGTTTTATTAATATTCCGCGTGCGCAATATATAGCG
TAATAATTATATCATCATGGCAGGGATAACAAGATTAGTTGGACAAGATAATATCGCTAT
CCACTAATACTAGGCATAATAGTGATCCGTAAGAGATATTCCTTATCTATTAGTACTTCT
AACCCTACTATGATGCCTTCAAGTCCCTGTATCCGGCCATCCCGTTGCGCAGCAGACCTT
GGAATAGTCTCCACGCTCACCGTGATCGATAACAACACATCAAAGACATTAAGCAAACGA
AGCACTCCTCCTTCCGCCCC
ATEG_01730.1 Prenyltransferase (reverse complement)
TAATCCGCCGTTCAAATAGCAATCCACATCCATGCCGATGGATGTCTATCATACAGATGA
CACGTCAGTGCTCTCAGCCGCTTTTCGACACGAGAAACCCTGCAAGGAGGTGCCATCTTG
GCCAATATAAGCAGATAAGCAGGTATGATTTGGGAGGATAGAGAGATTATAAATTGTCCT
GGGAGCAGCATCAGAATCGTTTCTTCCATACCAACAGACATCGAAGCATCGCATAGTGTA
TCTTGCAATCTTATAAATAGTTCCTTTTCACCATGACAAAAAGTATCATCGCTTCTGAGC
CTCCCAGCGCCGAATCGTCCGGCAGGCTACCCTGGAAGATATTGGGACAAACGACCGGAT
TCCCAAACCAAGACCAGGAGCTCTGGTGGCTAAACACAGCCCCTCTGCTCAACGAATTTC
TGGCCGAGTGCCAATATGACGTCCACTTGCAGTACCAATACCTCACGTTCTTCCGCCACC
ATGTCATTCCTGTTCTAGGGCCCTTCTTTGCCCCAGGGACGACTCCAAACTTCGCCAGCA
GACTCAGCAAGCACGGCCACCCTCTTGATTTCAGCGTCAATTTCCAGGAGTCCGGTGCAA
CAGTCCGAATGAGCCTGGGGGCCATCGGTAGCTTTGCTGGCTTGCAGCAGGATCCGTTGA
ACCAGTTCAGGGCGAGAGAAGTTCTCGACAAGCTAGCCATCTTGTACCCAACCGTGGACT
TGCAGTTGTTCAAACACTTCGAGTCTGAGTTTGGCATCAACCACGCCGACGCGCTGAAGG
TAGCAGCCAAGCTCCCTAAACTAGACCGGGCAACGAAAATGATCGCGATTGATATGTTAA
AGAACGGGTCCATGACATTCAAAGTCTACTACATGGTTCGATCTAAGGCGGCTGCAACTG
GACTCCCTGTGCATACCGTCCTCTTCAACGCCGTTCAACGTCTGGGATCAGCATTTGAGC
CTGGATTGTCTCTGCTGAAACAGTTCCTCTCTCCGCTATGTGATGCTGGGGAAACGGATC
TGGGTCTGCTGAGTTTCGATTGCGTTCCTACCGAATCCTCACGCATCAAGCTCTACGCTA
TCAAGCAGGTCGGGTCCCTGGACGCCATCCGGAATCTCTGGACCCTCGGGGGCACTATGG
ATGATCCGACGACTATGAAGGGTCTCGCAGTACTGGAACATGTTTGCGAGCTGCTCCAGT
TCGGCTGGTCAGGCGATTCCCGTGTCCAGCCCATATTGTTCAACTATGAAATTAAAAAGG
GATCGACCCCGAAGCCACAGATATACATCCCTCTAGCCGACAGATATGACGAGTTCGACG
CTGCCAAACTGAAGGCAGTTTTCCAAGATCTGGACTGGAAGCGAGTCCCATTCTATCAAG
ATACTGGGAAAGACTTGGCTTCAGTTTTGTAAGTGCTCCTTCATCACGTCTATGGATTAA
CTGGCTGACAATTTTCTGAGCCCAACTGTTGATCTGGGTTCTACCACAAACGTTCATCGT
TGGCTGTCATTCTCTTACACAGAAGCCAGAGGGCCTTATATGAGTGTCTACTACGATGTA
GTCCATCCTGAGTGTGGTCCATCCCTAGACACCGTCCGTTTTGAGCCTCTGAAGAAGTCT
ATCGAATGACGGTTTTCGTTCATGTCTGGCGTTGAACACCTTGAAGAGGACTAACCAGAA
AAGCCACAATCTAAATATCTTGTTGATCTGTCCTGCAACTTCCAAGTCCCGCATAGCCGT
GGGAGTATCCTCGGACCTGCTGTCCCACCCGAGCATCTAGTCTATGAACTACACCAGCAG
GACGACTGCCTCGTAAGGCTGGAGCAGCCATTTGCTCCCCGAAAACCTCTGCTTTGCGGA
CTCTGCTGGCTCATAAGTGTTCAGCAAGACGTCTCGTGCGCCCTTCACCCCGTTGGCCGC
AGCATCCCACTCAAGACTATTCTCAGTCCAGTTCGCCAGAACCAAAGCTTTCTGATCCCC
ATATTGACGGCTGTAGGCAAACACCGCCTGGCTATCCCGATCAACGAGCTCGTAGTTCCC
GTAGACAAAGATGTCCAGGTATTTCTTCCGGAGGCCAAGCACAGCGGCCCAATAGTGATA
Figure S6. Spectral data of compounds in this project.
Spectral data of butyrolactone IIa, II and I and phenylbutyrolactone IIa were shown before. Spectral
data of phenylbutyrolactone IIa and both hydroxyphenguignardic acids are reported here for the first
time. LC-MS conditions can be found in a previous report [5].
Figure S7. NMR spectra of butyrolactone II
Figure S8. NMR spectra of butyrolactone I.
Figure S9. NMR spectra of phenylbutyrolactone II.
Figure S10. NMR spectra of hydroxyphenguignardic acid.
Chapter 5
One step heterologous expression of the ustiloxin B gene cluster
A class of small molecules of interest that can be classified as secondary metabolites has the same building
blocks as NRPs (amino acids), but has a distinct biosynthesis. A 2013 review of these types of compounds
dubbed them Ribosomally synthesized and Post-translationally modified Peptides (RiPPs), and the sheer
diversity of chemical structures accessible through the ribosomal pathway makes them an interesting focus
for drug development [1]. RiPPs are initially synthesized as linear peptides, often with four segments. At the
N-terminus a signal peptide directs the prepeptide to the right cellular location. It is followed by a leader
peptide, which guides the post-translational processing enzymes. In the middle is the core peptide, on which
the modifications are made and whose sequence is found in the final molecule. At the C-terminus a
recognition sequence plays an important role in excision and cyclization. Notable RiPPs are lanthipeptides
cyclized with one or more thioether bonds, azol(in)e-containing peptides in which the cysteine or
serine/threonine side chains are cyclized, head-to-tail cyclized peptides, disulfide-rich peptides, and lasso
peptides, where the N-terminus cyclizes with a carboxylic acid side chain and the tail sticks through that
peptide ring and is stabilized with one or more disulfide bonds. All these examples highlight the chemical
complexity of RiPPs and the many tailoring enzymes with unique but poorly understood mechanisms.
In studying complex RiPP biosynthesis, filamentous fungi and bacteria are superior to other species
because the genes responsible for production of their secondary metabolites are often grouped together
in gene clusters. This means that for a particular natural product, the gene for the enzyme that produces
the backbone is in the vicinity of the genes that modify it further (e.g. cyclization, reduction, oxidation,
methylation, etc.). This allows for elucidation of biosynthetic pathways and potentially engineering them
to make new drug leads. Until recently, only two RiPPs had been identified in fungi, α-amanitin and
phallacidin, both in basidiomycota [2]. Both peptides are characterized by a cyclized backbone, a bond
between a cysteine and a tryptophan side chain and bis-hydroxylation of a leucine/isoleucine side chain
(Figure 1).
Figure 1. Structures of α-amanitin (A) and phallacidin (B). The peptide sequence of α-amanitin is PIWGIGCN with all L-amino acids.
β-amanitin has an aspartic acid instead of the asparagine. The phallacidin sequence is PAWLVDC with all L-amino acids except the
aspartic acid in D-configuration. The D-aspartic acid is also hydroxylated at the β-carbon. Phalloidin has a D-threonine instead of
the aspartic acid.
In 2014, the first RiPP in filamentous fungi, ustiloxin B, a strong inhibitor of microtubule assembly (Figure
2), was identified in Aspergillus flavus using a novel genome mining method [3, 4]. Earlier methods relied on
the analysis of sequenced genomes using the sequence homology between secondary metabolite core
enzymes to find new biosynthetic gene clusters (BGCs). PKSs, NRPSs, terpene cyclases and dimethylallyl
tryptophan synthases all share similar reactive cores and those conserved sequences or motifs can be
used to identify new core enzymes in genome sequences. Other biosynthetic genes involved with the
identified core enzyme can often be found in the genomic vicinity. This method, however, does not take
transcriptional activity into consideration, which leads to the identification of silent or cryptic gene clusters.
Additionally, BGCs that do not have a designated core enzyme, or that contain a completely novel one, will
not be detected. The motif-independent de novo detection algorithm for secondary metabolite biosynthetic
gene clusters (MIDDAS-M) developed by Umemura et al. compares the transcriptome under secondary
metabolite producing conditions with that under non-producing conditions. Virtual gene clusters of one to
thirty genes are compared and scored. First the whole genome is scanned gene by gene, then two neighboring
genes at a time, and so on. This way the highest score will be obtained when the gene cluster is "caught"
during a scan of its size. The kojic acid gene cluster, which does not contain a core enzyme, was correctly
identified from Aspergillus oryzae sequence data.
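The scan over virtual cluster sizes can be illustrated with a simplified sketch. This is not the published MIDDAS-M implementation: the scoring rule (sum of per-gene z-scores divided by the square root of the window size), the function name scan_virtual_clusters and the toy data are assumptions chosen only to show why a window that matches the true cluster size scores highest.

```python
import math
import statistics


def scan_virtual_clusters(log_ratios, max_size=30):
    """Return (score, start, size) of the best-scoring window of neighboring genes.

    log_ratios: per-gene log2 expression ratios (producing vs. non-producing
    condition), ordered by position on the chromosome.
    The scoring rule is an assumption made for illustration.
    """
    mean = statistics.fmean(log_ratios)
    sd = statistics.pstdev(log_ratios)
    if sd == 0:
        raise ValueError("all genes have the same induction ratio")
    z = [(x - mean) / sd for x in log_ratios]

    best = None
    # Scan gene by gene, then two neighboring genes at a time, and so on.
    for size in range(1, min(max_size, len(z)) + 1):
        for start in range(len(z) - size + 1):
            score = sum(z[start:start + size]) / math.sqrt(size)
            if best is None or score > best[0]:
                best = (score, start, size)
    return best


# Toy data (hypothetical): 20 genes, of which genes 5-8 are co-induced.
toy = [0.0] * 20
toy[5:9] = [3.0] * 4
score, start, size = scan_virtual_clusters(toy)
print(f"best window: genes {start}-{start + size - 1}, size {size}")
```

With this toy input the best window covers exactly the four induced genes. Smaller windows inside the cluster collect less signal, and larger windows are diluted by uninduced neighbors.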
Figure 2. Structure of ustiloxin B (MW 645.679) with the post-translational modifications highlighted in green. The tyrosine,
alanine, isoleucine and glycine (YAIG) are encoded 16 fold in the precursor gene. The norvaline is added post-translationally. Most
notably the cyclization is formed by a stereoselective ether bond between the side chains of tyrosine and isoleucine.
Figure 3. Overview of the ustiloxin B biosynthetic cluster of 16 genes. Most notable are the precursor peptide (ustA) which forms
the backbone of the molecule, the transcription factor (ustR) and the methyltransferase (ustM). The other genes encode other
tailoring enzymes.
Using gene knockouts, the ustiloxin B biosynthetic cluster boundaries were confirmed and putative gene
functions were assigned based on sequence homology (Figure 3). Most interestingly, the gene cluster
contains a precursor peptide gene but, unlike in well-characterized RiPPs, the precursor encodes 16 copies of
the core peptide instead of just one. Peptide repeats were reported before in filamentous fungi, for example
in poi-2, but poi-2 generates simple linear signaling peptides as opposed to a cyclized and modified
secondary metabolite like ustiloxin B [5]. Umemura et al. did not report any identified ustiloxin B
intermediates in their initial publication. More recently, however, the knockout strains were analyzed for
intermediates, which confirmed the role of several genes and elucidated much of the ustiloxin B
biosynthetic pathway [6]. The understanding of this cluster opens up possibilities of generating cyclic
peptide analogs with improved antimitotic activity and better pharmaceutical properties.
Figure 4. The UstA peptide which contains a signal peptide and 16 repeats of the YAIG peptide which is the backbone of ustiloxin
B flanked by a leader peptide and recognition sequence. Ustilaginoidea virens contains a similar gene, with the notable difference
that it contains both YAIG and YVIG, the latter resulting in ustiloxin A. The C-terminal side of the KR motif is a cleavage site for
Kex2, a conserved enzyme in fungal species.
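How Kex2 processing would release the repeated core peptides can be sketched as follows. The precursor string is a made-up toy sequence, not the real UstA sequence, and kex2_fragments and core_counts are hypothetical helper names. The only rules taken from the caption are cleavage on the C-terminal side of each KR motif and the YAIG and YVIG cores.

```python
import re
from collections import Counter

CORES = ("YAIG", "YVIG")  # core peptides leading to ustiloxin B and ustiloxin A


def kex2_fragments(precursor):
    """Split a peptide on the C-terminal side of every KR motif."""
    return [frag for frag in re.split(r"(?<=KR)", precursor) if frag]


def core_counts(precursor):
    """Count how many released fragments carry each core peptide."""
    counts = Counter()
    for frag in kex2_fragments(precursor):
        for core in CORES:
            if core in frag:
                counts[core] += 1
    return counts


# Toy precursor (hypothetical): a signal peptide followed by three repeats.
toy_precursor = "MSIGNAL" + "LEADYAIGRECKR" * 2 + "LEADYVIGRECKR"
print(kex2_fragments(toy_precursor))
print(core_counts(toy_precursor))
```

Swapping a YAIG repeat for YVIG in the input string changes the counts accordingly, which mirrors the idea of editing the precursor to shift the product from ustiloxin B to ustiloxin A.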
To study this gene cluster without culturing Aspergillus flavus, a human pathogen, we decided to
heterologously express the entire ustiloxin B cluster in Aspergillus nidulans, a host with which our lab has
extensive experience. To date, whole gene clusters have been removed from Aspergillus nidulans, and gene
clusters have been added to it gene by gene [7]. In this approach each gene receives its own copy of the
promoter, so the expression levels of all genes are driven by the same promoter. To do this for the ustiloxin B
cluster of 16 genes would take 16 iterative transformations. Additionally, the transcriptional regulation of
the gene cluster as a whole would be lost. Instead, we aimed to transfer the whole cluster of 26,671 base
pairs in one transformation by combining four large DNA fragments with a 1000 bp overlap and a different
selection marker fused to the 5' fragment and the 3' fragment (Figure 5).
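The logic of joining overlapping fragments can be sketched at the string level. In the experiment the joining is done by homologous recombination inside the host, so the code below is only an analogy: merge_by_overlap, assemble and expected_length are hypothetical helper names, and the toy fragments use 4-base overlaps in place of the ~1000 bp overlaps.

```python
def merge_by_overlap(left, right, min_overlap):
    """Join two sequences that share a terminal overlap of at least min_overlap bases."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    # Try the longest possible overlap first, then shorter ones.
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("fragments do not share the required overlap")


def assemble(fragments, min_overlap):
    """Join ordered fragments end to end through their shared overlaps."""
    assembled = fragments[0]
    for frag in fragments[1:]:
        assembled = merge_by_overlap(assembled, frag, min_overlap)
    return assembled


def expected_length(fragment_lengths, overlap):
    """Length of the joined product: total length minus the shared overlaps."""
    return sum(fragment_lengths) - overlap * (len(fragment_lengths) - 1)


# Toy fragments (hypothetical) with 4-base overlaps.
toy = ["ATGCCGTA", "CGTATTGACC", "GACCAGGT"]
print(assemble(toy, min_overlap=4))
```

As a rough consistency check, the rounded fragment sizes listed in Figure S1 (4.5, 8.3, 10.6 and 6.1 kb) with three 1000 bp overlaps give expected_length([4500, 8300, 10600, 6100], 1000) = 26500, which is in line with the stated cluster size of 26,671 bp.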
Figure 5. (A) Transfer of the ustiloxin B gene cluster into Aspergillus nidulans. A 5' gene cluster fragment is fused to the Aspergillus
fumigatus derived riboB nutritional marker and a ~1000 bp flanking sequence to the yA locus. A 3' gene cluster fragment is fused
to the Aspergillus fumigatus derived pyrG nutritional marker and a ~1000 bp flanking sequence. The fusion products together with
the two central fragments of the gene cluster are cotransformed into an Aspergillus nidulans strain that has green conidia and
carries nkuAΔ, a deletion of the ST cluster and three selectable markers [pyrG89 (pyrimidine auxotrophy), riboB2 (riboflavin
auxotrophy), and pyroA4 (pyridoxine auxotrophy)]. Upon transformation the two flanking fragments are homologously
recombined with the yA flanking sequences and with the two central fragments. Transformants are selected on plates lacking
riboflavin and pyrimidines, and correct transformants have yellow conidia. Diagnostic PCR is done at five different places to confirm
complete gene cluster transformation. (B) Insertion of the alcA promoter before the C6 transcription factor in the ustiloxin B
cluster. ~1000 bp flanking sequences are fused to the Aspergillus fumigatus derived pyroA nutritional marker and the alcA
promoter by fusion PCR. The linear DNA construct is used for transformation.
The primers used for PCR and fusion PCR of each fragment and construct can be found in Table S1; the
agarose gels for each fragment and fusion construct are in the Supporting Information as well. Transformation
of the Aspergillus nidulans host was performed as described in Chapter 2 with the following alterations. The
four fragments were premixed in a 1:2:2:1 molar ratio, where the total amount of each DNA fragment
ranged from 500 to 1000 ng. Since the total DNA volume exceeded 20 µl, the volume of protoplasts used
was increased to 200 µl instead of the usual 100 µl.
The transformation yielded dozens of colonies, of which five were picked, isolated and, after DNA
extraction, subjected to diagnostic PCR analysis. Normally, with smaller construct insertions, primers just
outside of the construct in the genomic DNA of the host are used for diagnostic PCR, and the size of the
amplified fragment changes if incorporation occurred. In this case the inserted fragment is larger than 25
kb and, despite what sales people say, any polymerase will be hard pressed to amplify DNA longer than
that. Instead, five different regions were probed both in and outside the inserted fragments. The 5' and 3'
ends were probed with one primer outside the construct in the yA locus and the other inside the
respective nutritional marker (see Figure 5A). Furthermore, the overlapping regions within the construct
were probed so that any missing fragment would show. All five colonies appear to have the complete
cluster incorporated, as can be seen from the diagnostic PCR gels in Figure 6.
Figure 6. Diagnostic PCRs of ustiloxin B cluster in A. nidulans. Five separate primer sets used to diagnose the ~26 kb cluster
insertion: 5’ region with marker using a 5’ primer outside of inserted fragment, three overlapping regions of each fragment, and
3’ region with marker using a 3’ primer outside inserted fragment. All five colonies have positive diagnostic PCR except #55 for
the 5’ region.
All five strains were cultured in glucose minimal media (GMM) for five days together with the host strain
control. Aqueous extracts (extraction method described in the Supporting Information) were analyzed using
LC-MS and positive ions with an m/z value of 646.1 ± 0.1, corresponding to ustiloxin B, were extracted. No
difference was observed between the empty background strain and any ustiloxin B cluster mutant (Figure 8A
and B).
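Extracting ions in an m/z window amounts to summing, for every scan, the intensity of all peaks that fall inside the window. The sketch below shows that step under simple assumptions: the data layout (a list of scans, each with a retention time and a list of m/z and intensity pairs), the function name extracted_ion_chromatogram and the toy values are all hypothetical and do not reflect the instrument software used.

```python
def extracted_ion_chromatogram(scans, target_mz, tol):
    """Sum, per scan, the intensity of all peaks within target_mz ± tol.

    scans: iterable of (retention_time_min, [(mz, intensity), ...]).
    Returns a list of (retention_time_min, summed_intensity).
    """
    eic = []
    for rt, peaks in scans:
        total = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        eic.append((rt, total))
    return eic


# Toy scans (hypothetical values, not measured data).
toy_scans = [
    (10.0, [(300.2, 50.0), (646.1, 0.0)]),
    (10.5, [(646.05, 120.0), (646.4, 80.0)]),
    (11.0, [(646.15, 30.0), (512.3, 10.0)]),
]
print(extracted_ion_chromatogram(toy_scans, target_mz=646.1, tol=0.1))
```

Peaks outside the window, such as m/z 646.4 in the second toy scan, do not contribute to the trace.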
To turn on expression of the cluster, the powerful alcA promoter was inserted in front of the transcription
factor gene ustR. Two sets of primers were used to amplify flanking regions for the AfpyroA-alcA construct,
which can be found in Table S1. The PCR product of the successful fusion (Figure S4) was used in a
transformation with the mutant carrying the ustiloxin B cluster as background. Diagnostic PCR using the
forward primer of the 5' flanking region and the reverse primer of the 3' flanking region showed a shift
upwards from 2 to 4.2 kb, corresponding to the size of the AfpyroA-alcA insert, despite significant
non-specific amplification by this primer pair (Figure 7).
Figure 7. Diagnostic PCR for alcA insertion before transcription factor in ustiloxin B cluster. CW8056 is the background strain.
CW8109-CW8111 are transformants. Forward primer is the 5’ forward primer for the flanking fragment, reverse primer is the 3’
reverse primer for the flanking fragment. Background expected size: ~2 kb. AfpyroA-alcA insert expected size: ~4.2 kb.
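The reading of such a diagnostic gel can be written down as a small decision rule. This is a sketch only: classify_colony is a hypothetical helper, the 0.3 kb tolerance is an arbitrary assumption about how precisely band sizes can be read from a gel, and bands that match neither expected size (for example non-specific products) are ignored.

```python
def classify_colony(bands_kb, background_kb, insert_kb, tol_kb=0.3):
    """Interpret the diagnostic PCR bands observed for one colony.

    background_kb: expected product size without insertion.
    insert_kb: size of the inserted construct.
    tol_kb: assumed reading tolerance for band sizes.
    """
    expected_insert = background_kb + insert_kb
    has_background = any(abs(b - background_kb) <= tol_kb for b in bands_kb)
    has_insert = any(abs(b - expected_insert) <= tol_kb for b in bands_kb)
    if has_insert and has_background:
        return "mixed"  # both the inserted and the original locus are present
    if has_insert:
        return "insertion"
    if has_background:
        return "no insertion"
    return "inconclusive"


# Sizes from the caption: ~2 kb background, ~2.2 kb AfpyroA-alcA insert.
print(classify_colony([4.2, 1.1], background_kb=2.0, insert_kb=2.2))
print(classify_colony([2.0], background_kb=2.0, insert_kb=2.2))
```

A "mixed" result corresponds to a colony showing bands of both sizes, which calls for another round of restreaking before the strain is used further.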
Culturing of mutant strains with the ustiloxin B cluster and an upregulated transcription factor in GMM
also did not yield any ustiloxin (Figure 8C).
Aspergillus flavus is a plant pathogen and can be found on contaminated crops where it produces the
deadly aflatoxin. Interestingly, ustiloxin B was found to be produced in fruit and vegetable based media,
specifically cracked maize media and V8 juice media. Culturing of the A. nidulans strain containing the
ustiloxin B cluster in V8 juice media did produce detectable amounts of ustiloxin B, as shown by the
extracted ion chromatogram in Figure 8E. Induction of the transcription factor increased production roughly
tenfold (Figure 8F), showing that both the culturing condition and transcription factor expression contribute
to ustiloxin B production.
However, upon closer inspection of the gene cluster sequence it turned out that the SAM-dependent
methyltransferase at the 3' end of the cluster, UstM, was unintentionally omitted from the transforming
fragments. To obtain the full-length gene cluster, the existing truncated gene cluster was extended in an
additional transformation using primers shown in Table S1. The individual fragments and the fusion
construct are shown on agarose gel in the Supporting Information (Figure S5). The diagnostic PCR shows
extension at the 3' end of the cluster (Figure 9).
Figure 8. Extracted Ion Chromatograms of m/z 646.1 ± 0.1 [M+H]+ (corresponding to ustiloxin B) of water extracts of GMM and
V8 juice media cultures. (A) A. nidulans empty background strain in GMM. No ustiloxin B observed. (B) Mutant strain with the
ustiloxin B cluster inserted in GMM. No ustiloxin B observed, nor when the transcription factor UstR is upregulated with alcA
induction (C). (D, E, F) Same three strains but cultured in V8 juice media. Mutant strain shows ustiloxin B production and alcA
induced ustR overexpression leads to a tenfold increase.
Figure 9. Diagnostic PCR of extension of the ustiloxin B cluster. The forward primer is located upstream of the extended part, the
reverse primer is in the yA region. Extension by ~1000 bp confirmed by increase in band size from ~4 to 5 kb, despite non-specific
amplification.
Culturing the strain with the full-length ustiloxin B cluster and upregulated transcription factor in V8 juice
media produced the compound at similar levels as the truncated strain (Figure 10D). However, culturing the
same strain in GMM also produces a detectable amount of ustiloxin B (Figure 10B), comparable to uninduced
production in V8 juice media (Figure 8E). This indicates that the overexpressed transcription factor
increases the yield by the same factor as optimizing the media. It also indicates that the missing
methyltransferase can be compensated for by A. nidulans, but only when grown in V8 juice media, since
previous research showed that an ustM deletion mutant of A. flavus did not produce ustiloxin B. BLAST
analysis of the ustM sequence against the A. nidulans genome sequence did not give any hits, though
transcriptome data from GMM and V8 juice media cultures could be compared to identify candidate genes.
Figure 10. Extracted Ion Chromatograms of m/z 646.1 ± 0.1 [M+H]+ (corresponding to ustiloxin B) of water extracts of GMM and
V8 juice media cultures. (A, B) Full length ustiloxin B cluster expressed in A. nidulans produces ustiloxin B in GMM. (C, D) Repeating
the same experiment in V8 juice media results in a tenfold increase in ustiloxin B production.
Future directions
These results show that it is possible to transfer a complete gene cluster from one species to another in
one step and trigger expression to yield its product either by optimizing the media, inducing the TF, or
both. This working cluster opens up the possibility to generate ustiloxin B analogs by changing the coding
sequence of the precursor. In fact, this is already seen in Ustilaginoidea virens, which produces ustiloxin A
and B from one precursor gene containing both YAIG and YVIG repeats [8]. The first modification to the
developed strain will be to exchange YAIG repeats for YVIG and see whether it can also produce ustiloxin
A. From there on, other mutations can be made to generate analogs which can be assayed for microtubule
inhibiting potency. Furthermore, putative RiPP clusters from other fungi that do not grow well under
laboratory conditions, like basidiomycota, can be rapidly transferred and expressed in Aspergillus nidulans,
opening up great genome mining possibilities for new drug lead discovery.
References
1. Arnison, P. G.; Bibb, M. J.; Bierbaum, G.; Bowers, A. A.; Bugni, T. S.; Bulaj, G.; Camarero, J. A.;
Campopiano, D. J.; Challis, G. L.; Clardy, J.; Cotter, P. D.; Craik, D. J.; Dawson, M.; Dittmann, E.; Donadio,
S.; Dorrestein, P. C.; Entian, K. D.; Fischbach, M. A.; Garavelli, J. S.; Goransson, U.; Gruber, C. W.; Haft, D.
H.; Hemscheidt, T. K.; Hertweck, C.; Hill, C.; Horswill, A. R.; Jaspars, M.; Kelly, W. L.; Klinman, J. P.; Kuipers,
O. P.; Link, A. J.; Liu, W.; Marahiel, M. A.; Mitchell, D. A.; Moll, G. N.; Moore, B. S.; Muller, R.; Nair, S. K.;
Nes, I. F.; Norris, G. E.; Olivera, B. M.; Onaka, H.; Patchett, M. L.; Piel, J.; Reaney, M. J. T.; Rebuffat, S.; Ross,
R. P.; Sahl, H. G.; Schmidt, E. W.; Selsted, M. E.; Severinov, K.; Shen, B.; Sivonen, K.; Smith, L.; Stein, T.;
Sussmuth, R. D.; Tagg, J. R.; Tang, G. L.; Truman, A. W.; Vederas, J. C.; Walsh, C. T.; Walton, J. D.; Wenzel,
S. C.; Willey, J. M.; van der Donk, W. A., Ribosomally synthesized and post-translationally modified peptide
natural products: overview and recommendations for a universal nomenclature. Natural Product Reports
2013, 30 (1), 108-160.
2. Hallen, H. E.; Luo, H.; Scott-Craig, J. S.; Walton, J. D., Gene family encoding the major toxins of
lethal Amanita mushrooms. Proceedings of the National Academy of Sciences of the United States of
America 2007, 104 (48), 19097-19101.
3. Umemura, M.; Koike, H.; Nagano, N.; Ishii, T.; Kawano, J.; Yamane, N.; Kozone, I.; Horimoto, K.;
Shin-ya, K.; Asai, K.; Yu, J. J.; Bennett, J. W.; Machida, M., MIDDAS-M: Motif-Independent De Novo
Detection of Secondary Metabolite Gene Clusters through the Integration of Genome Sequencing and
Transcriptome Data. Plos One 2013, 8 (12), 10.
4. Umemura, M.; Nagano, N.; Koike, H.; Kawano, J.; Ishii, T.; Miyamura, Y.; Kikuchi, M.; Tamano, K.;
Yu, J.; Shin-ya, K.; Machida, M., Characterization of the biosynthetic gene cluster for the ribosomally
synthesized cyclic peptide ustiloxin B in Aspergillus flavus. Fungal Genetics and Biology 2014, 68, 23-30.
5. Kim, H.; Nelson, M. A., Molecular and functional analyses of poi-2, a novel gene highly expressed
in sexual and perithecial tissues of Neurospora crassa. Eukaryotic Cell 2005, 4 (5), 900-910.
6. Ye, Y.; Minami, A.; Igarashi, Y.; Izumikawa, M.; Umemura, M.; Nagano, N.; Machida, M.; Kawahara,
T.; Shin-ya, K.; Gomi, K.; Oikawa, H., Unveiling the Biosynthetic Pathway of the Ribosomally Synthesized
and Post-translationally Modified Peptide Ustiloxin B in Filamentous Fungi. Angewandte Chemie-
International Edition 2016, 55 (28), 8072-8075.
7. Chiang, Y. M.; Oakley, C. E.; Ahuja, M.; Entwistle, R.; Schultz, A.; Chang, S. L.; Sung, C. T.; Wang, C.
C. C.; Oakley, B. R., An Efficient System for Heterologous Expression of Secondary Metabolite Genes in
Aspergillus nidulans. Journal of the American Chemical Society 2013, 135 (20), 7720-7731.
8. Tsukui, T.; Nagano, N.; Umemura, M.; Kumagai, T.; Terai, G.; Machida, M.; Asai, K., Ustiloxins,
fungal cyclic peptides, are ribosomally synthesized in Ustilaginoidea virens. Bioinformatics 2015, 31 (7),
981-985.
Supporting Information
One step heterologous expression of the ustiloxin B gene cluster
Table of Contents
Culturing conditions and sample analysis
Table S1. Primers used in this research
Figure S1. Preparative agarose gels of the four fragments for ustiloxin B cluster transformation
Figure S2. Two analytical agarose gels of fusion PCR reactions
Figure S3. Spectral data of ustiloxin B
Figure S4. Agarose gels of fragments for promoter exchange
Figure S5. Agarose gels for cluster extension
Ustiloxin B biosynthetic gene cluster sequence
Culturing conditions and sample analysis
GMM or V8 juice media (20% V8 juice, 0.3% CaCO3) cultures were shaken at 170 rpm for 42 hours at 30
°C, induced with MEK and shaken for three more days. After one hour of incubation with acetone,
hyphae were filtered off and acetone was removed under vacuum. The aqueous phase was extracted with
ethyl acetate, and an aliquot from the aqueous phase was filtered; 10 μl was injected into the LC-MS,
diverting the first two minutes to waste so that salts from the media did not enter the MS.
Table S1. Primers used in this research. The table lists the primers used to amplify large fragments of the ustiloxin B
gene cluster, to extend the cluster to full length, and to introduce the alcA promoter before the C6 transcription factor ustR.
primer sequence (5' --> 3')
Primers used for heterologous expression of ustiloxin B cluster
Afl_ustB_FW1 CTG AAA AGC TGA TTG TGA TAG GCT ACT TGG AGC CCT GCT
Afl_ustB_Rev1 AAC AGA GGA TCC AGC TAG C
Afl_ustB_Rev1_nested GCA GGA ACA ACT CAC ATG G
Afl_ustB_FW2 AAA TAC GGT ACG ACA ATG AGA
Afl_ustB_Rev2 AAT ACC CTC TTT CAC GTA GTC
Afl_ustB_FW3 CAG GAT GCC ATC AAA ATC CTT
Afl_ustB_Rev3 AAT CAG GTG GGC TAG TCA TA
Afl_ustB_FW4 GGA GAA CCA TAC AGC GCG
Afl_ustB_FW4_nested GTC TTT TGG CTG GTA GGG
Afl_ustB_Rev4 CGA AGA GGG TGA AGA GCA TTG GTC AAG TTA TCG TGG AAG TCT
Primers used for cluster extension to include last gene
Afl_ustB_ext_FW CTC CAG CAC AGC CCA CCA
Afl_ustB_ext_Rev CGA AGA GGG TGA AGA GCA TTG CGC TGC GGT GGA AAT CGA TC
Primers used to introduce the alcA promoter before the transcription factor
TF_5'flank_FW AGA TGC ACA TCC GGT CCT C
TF_5'flank_FW_nested TGG CCA GTC GGG GAG ATC
TF_5'flank_Rev CGA AGA GGG TGA AGA GCA TTG TGA CCG AAC CTC CAG GGA
TF_3'flank_FW CCA ATC CTA TCA CCT CGC CTC AAA ATG TCA GGG GCC CAG CGG
TF_3'flank_Rev TGG AAT CAA GGG CCG AGG
TF_3'flank_Rev_nested TGT TGT CTA GGA GTT GTG CC
The first four pairs of primers were used to amplify large fragments of the ustiloxin B gene cluster from
Aspergillus flavus genomic DNA, kindly provided by Professor Nancy Keller,
University of Wisconsin-Madison. The preparative agarose gels of these PCR reactions are shown in the figure below.
Figure S1. Preparative agarose gels of the four fragments for ustiloxin B cluster transformation. ustB fragment 1 expected size:
4.5 kb, observed: 4-5 kb. ustB fragment 2 expected size: 8.3 kb, observed 8-10 kb. ustB fragment 3 expected size: 10.6 kb,
observed 10+ kb. ustB fragment 4 expected 6.1 kb, observed 6-8 kb.
Fragments 2 and 3 are ready to use in transformation, but fragments 1 and 4 need to be fused to their
respective marker and yA flanking region. Fusion PCR was achieved by reacting equimolar amounts of
each fragment using nested primers. Agarose gels of both fusions are shown below.
Figure S2. Two analytical agarose gels for, respectively, 4 and 2 different fusion PCR reactions (varying amounts of template) for
ustB fragments 1 and 4 with their 5' and 3' flanking fragments. 5' yA AfriboB + ustB fragment 1 expected size: 7.5 kb, observed:
8-10 kb. ustB fragment 4 + AfpyrG 3' yA expected size: 9.1 kb, observed: 8-10 kb. The successful fusion PCR reactions were purified
via gel extraction of the expected band.
Figure S3. UV-VIS and ESI-MS spectra in positive and negative mode of ustiloxin B. UV-VIS shows a non-discriminative spectrum
for peptides and proteins around 280 nm. This is due to poor separation and many peptidic compounds eluting at the same
time.
Figure S4. Agarose gels with the fragments for introducing the alcA promoter before the transcription factor. Fusion of these
two fragments with an AfpyroA-alcA fragment (not shown) of roughly 2.2 kb yields a band of roughly 4 kb, as expected. The
fusion PCR product was directly used in a transformation after gel extraction purification.
Figure S5. Two fragments for extension of the ustiloxin B cluster in A. nidulans and the resulting fusion of the two which was used
directly in a transformation after gel extraction purification.
Below is the sequence of the ustiloxin B biosynthetic cluster from Aspergillus flavus, with each gene
prediction highlighted in green. Most notable are ustA, which contains the precursor peptide for the
backbone of ustiloxin B; ustR, which was predicted as two separate genes but is a single C6 transcription
factor for the whole cluster; and ustM, which is a methyltransferase critical for product formation.
>Aspfl1 A_flavus_1041045517010:1600929-1627600 5'pad=0 3'pad=0 strand=+
GCTACTTGGAGCCCTGCTGCTCGTTTAATGGCCTGTCAAGATAAGTTAGTAGTTGTCATT
CTTGTAATGCAGAGCAGAGTACTTATTAGCAATTCTGCCCGCTTGTATTGTCATTATCTA
ATGACACTCGAACCTTGGAGATTTATAGCAACAGTATCGACCATAGATTAGTCTACTAAA
GCCCTAATCTACAGATTACTCTACCATTGGTACAGCACATATCCTACCCACTGTGCGACT
ACCATGACTAAAAACATAAGCGCCATTCCGGTTTTGATTATATCAATTGAAAGTGATAG
TAACTGGCCCAGTTTGTCACGTGAGCCCACCATATAAACTCGCCCTTGTCCCTCATTGCC
GAGTGAAGACCGGGCAGTCCATCAAGGCTCCTACTGTTCCATTGCATCTGGTATTCAGGC
AGTTCCCCTCGATTAATTGTGTATTCAGCAAATTTGCCCAGCCTCGGGCGGTTTGTGAAA
CACGGCTTTGGTTCATTTGTTTCTGGATCGACAGTGAGGTCGAGCAGTGGGACTCCAAAG
CCACAGGATATTTGAACCTGCGAGAGGGTCAGTCCTTGTAGATTAGTGATGATAAACAAG
CACAATAAAAACAAAACCAACCTTGAAGATATCTAGAATAATAACTGCCCTTGCTCCGAC
TAGAGACTTCACGCCCATTCGCTTGACGTATCCAGCATACCTCGGATCATTCCATTCAAT
GACCGAACCCGTGCAGAAAAGTCGCATGATTCGCGGAGTAGCATCGAATGAGCAGAACAT
TACGGTTGCCCGTCCATTCTCGCGCAGGTGGCAGATAGTCTCACATCCGGATCCGGTTGA
GTCGACATATGCCACTTTGCTGGGGCTCAGAACGGCAAAGGATGAGTCGGGGAGGCCTTT
GGGGGATACATTAATATGACGACCTCGGTAGGGGGCAGATGAGACGAAGAAGAGAGGTTG
GCGGAGGGCCCAGTCACGGAGGTTGTCGGAGAGCGATTCGTAGAAAATTGGCATTGCTTT
TTTGAATTTTTTTTCTTTTGTTTTCTTGGTGTATAAAGCTCCAAGGATAGCAATCAATTT
TGCTACGTTGCGTAGAGGAATGAGATTCCCCACGTGACAATCAAGGGATGCAACGAAGCG
CCTATTTTCTTGGGACGAGGCGCCACGAAGTAATGTCGGGGATTGTAAACACCGTCACAA
GTGTGAGCTGTATAATGTAGAACTTGATAATCATACAGCGGTTACATATCGCGAATCGCG
AAGCGCTATCTAAGCGCGAATGATAATGGTCCCAAAATCGAAATATAGGAACAACTACTC
CAACTGTCATGGACTGCAACAGCAGGGGTAGAAGCGCAATGCAATGACTGTTTCTCAGGT
TCGCCGAGTGGCCGTGATCGGCGCTGGAATCAGCGGAGTTGTTTCAACAGCCCATCTTGT
AGCTGCTGGATTCGAAGTCACCGTGTTCGAAAGAAACCAGCAAACAGGGGGGATTTGGTA
CGTATACAATGTGCCAGAATGACATACTCTCGGTTGACCTCGATAGGCTGTACGATGAAC
AGACACCATTGGAATGTTCTTTTCCTTCTCCTGGTCCTTCTTTAGCAGATAAGGTAGAGA
AAAACGCCAGATTCGATAGAGAGAAGCTCCGACTACAGCATGCTCCTCCAGGGTAAGTGT
TTCCTCTTGACCTTGGAAATGCTTTGATGGCGTGAAAAGAAACAAAGAAAGAAAGAGAAA
GGGAACACAAAGGGAAACGACTGAGAAGGAGCTAATGAGAGTATTAACAGACCATGCTAC
AAAAATCTCACAACAAATGTGTCCACCCCTCTCATGCGAATAAAACTTCGCGCCTGGCCA
GAAAACACACCCGATTTTGTTCACCACAGTGTCGTGAATGAATATATCAGGGATATAGCA
CTCAGTACTGGAGTCGACGAGCGAACAATATACGGCGCTCGAGTGGAGCATGTCTATAAA
GATGGTGGGAAGTGGCATGTCAATTGGTCAGTACTAGACGACAATGGCAGTATTGACGGT
CTAGAAGAGAGGCGGCTGATCTCTGTAAGCCGCTACCGTGCAAAATAGGTTGCAAGCGCT
CTGATTAGAGGTTTAACCTCTTTTTTAGACCTTTGACGCGGTTGTAGTCGCGTCTGGCCA
TTATCATTCACCTCACATTCCGGACATACCTGGGTTATCCGAAGTAAAGAAAAGATGGCC
TTCAAGGGTTATACATTCCAAGCGATATAGAACACCGGAAGTCTACAGGGACGAAGTAAG
TGTTTGATATTCCGTGACCCATTAGATTGATTGCTATGAAATATGCTCACAGATCATCCA
GAATGTTTTGATGATTGGCGGAGGAGTATCATCGATGGATATATCGCGGGACCTTGGGCC
AFLA_094940
ustO
AFLA_094950
ustF1
ATTTGCCAAAATGATATTCCAGAGCACACGAAATGGCGACGCAGATCCACCTGCTCTCAT
GCTTCCAGATAATGCAGTAAGAATTGGCGAGATTGATCATCTCGAACTGCTATCTGGGAC
TGGTGACACGCTACCAGAGGGTGATCCCTTGCCATTAATACTCTGTCTGAAGTCATCTCA
ACGATTATGTAAAATTCACAAGATCATTGTTTGCACAGGGTATCAAATTGTATTCCCGTT
CCTTCCAGACTATCACGACGATTCAATGCCGCTTCAGGATGCTAATGATACTATTCTTGT
TACCAACGGGACACAAGTGCATAACATTCATCGAGACATCTTTTATATCCCAGACCCGAC
TCTGGCTTTTGTCGGGATACCTTACTTCAATACAACATTTACCCTCTTTGAATTCCAGGC
TATTGCAGTGACAGCCGTCTGGTCCCGAACTGCATGTCTACCATCAACCACCGAGATGAG
ACGGGAGTACCTAGTAAAGCAAAAGCAGACCGGTGGCGGGCGCAAATTCCACTCGTTGAA
GGATAAAGAAAAGGAATACGTCCGCGACCTTATGGCGTGGATAAATGATGGTAGAAACGC
ACATGGGCTTGTTCCTATTGAAGGTCATACGGCGGCTTGGTTTGAAGCTATGGATAAACT
GTGGGACGAGGCGAGAGCGGCGATGAAAGAAAGAAAGGAGCAACAAGAGAAGATTATCAA
GCGCATTCCATTTTCGGCTGATTGTACTTCGGTCGAACTGCCACATCTTAATTAGGGGAT
ATTATCTCGAATAGTCCACGGTGTTATGCAGTTAGAGGTGTTATTATATTGACCAGTAAG
AGGAGTAGCGCAACGCAACCTTGTAGGCGCGCTTGTACCGTTTAGTTTTGATTTAAAAAG
AACGCCTTGCCCGTAAGTCGAGAGATAATCATACTCCCTCCACTCAACAAAAGTCAATAC
ACACGGAAGCCCCGAGCAGGAATGTCACCGTTCATCTTTGCTGTCACATTGACCTTCGCC
ATCTTAGCTCTTGGTATCCTCCGACGGCGCTATTTCCACCCCCTCTCCAGGTTTCCAGGC
CCTTTCCTCGGGTCTGTAACGAGTCTCTATCAGACGTACTGGCATGTCCATCCCAACAAG
ACGCTTCATGATACAGAGCTCCATAGAAAATACGGTACGACAATGAGATCGAGTGGAAAT
CATGGAGAGTTCCACTGATATGGAACCCCAGGTCCGATTGTGAGATATAGCCCAAACGGC
CTCATTGTTAATGACCCAGCCCTGCTTCCGGTCATTTATAATCGTCGCGCCAATAAGACA
GATTTCTACGCTCCTGTCTTTGATACGCATTCGACATTTACCAGGAAGGACTATAGGGAA
CATGTCGCGTCAAGAAAGGCAATTAGCCATGCGGTAGGTTTTCTTGCTTTTTTTCGTACC
GAATATTCTTACGGGTTAGTCTTGCTTATGATAATCTTCAAACAGTATTCGGTGACGAAT
ACTCGCTTATTCGAGCCCCAAGTTGACGGTATCCTGTCTGAGTTGATCTCCCTGTTGAGT
GAATCGGCCACCGAAAAGCGATTGGTTGATATTATGGAGTATGGGAGGTGAGCCTGAACG
CGTTTTACTTCTCTTGGCTTTTGGCTTTCAATGTCAAACTCCAGATCTAACGCTAATGCA
CCCTATGGCTAGTTGGTTTACCTACGATGTCACTAGTCTGTTTGTATGTGGTAAACCCTT
TGGGTTTGTGGAGAAGCGTACAGATGTCAAAGGCCTTATTCAGAACAAGAACAAAGTTCT
TTTCATAGTCTTCATTATGACCATCCAGGAGAACCTGTCTTGGATTGTCCGCAACACGCG
TTTGGGGCGGCGGTATCTCATGCCTCATCCGACAGATCAGTCGGGTCTTGGTGTTGTCAT
GGCGGAGAGAGACCGCATCGTGGATGCGGTCATTGACAGCGATGGCAAGGTTAAACGCCA
CCTGTTGGTAAAGGGGAGTCTTTTGAGCAGCCTCATGGAGATTCTCGGTACGGAAGGCTG
TCCACTCAGCTTGGTGGATGTCAAGGCTGAGATTTTCTTTGCCATGTGAGTTGTTCCTGC
TCTGTCCGGACCAAATAAGACAATAACGATAAGTGAAACAGGCTAGCTGGATCCTCTGTT
ACCCCGAGCCAGCTCGCCCGAGTCATCTTCCATATATCGCGTAACTTCAAGGTCCAGGAG
AAGCTATACGAAGAGCTTGTCGCAGCAGAACAGGACGGTCGAATCCCACCTCTATCCGCC
ATAATCTCGGATGAGCAGGCTCACAGACTCCCCTTTCTTTCTGCTTGTATCAGAGAGGCT
CAACGATACGCTCCTACCATGTCCCAACTGCCTCGTTATGCGCCCGAGGGGACTGGGCTA
GAGTTACACGAGCAGTACGTTCCCCCAGGGACAAGCGTATCTACCAGCCCATGGATCATT
GGCCGAAATAAAGACCTGTACGGTGAGGACGCAAACTCGTTCCGGCCAGAACGATGGTTA
GAAGCATCTCCCGAGGAAGAAAGACGATGGGACCACTTTTCCTTTCATTTTGGATATGGA
GCTCGGAAATGTCTTGCGAATAATTTCGGCCTGATGCAGTTGTACAAAGTTGCTGCTGAG
GTGTGTGCATATCCTATTCGTTGCTTGTTGCGTTATCGGGCTCATCATGAGTGCAGGTAT
TTCGTCGTTTTGAAGTTAAAGTTGAAGGGTCGAACGAGGATACAGTCAGTGGAGGGCCGC
CTGCGAGTGCCAGGTTTCGCTTTGATCGCAGAGCAAGATCCTGGTCATAAGCATAATGGG
TATATAAATATGGATCTTTTCAGGGCATGATGGATTCCAAGGGCTGATGTATTAACTCAG
CCGATTGATATTCATGATGTGTAAGAAGCTAGCCATCTATCAAGTAAAGGCTATAGAGTA
TCATTATATATCAAAACCCGAATGCTAGGTTCTAGATCGCCTCAAGTAAATATAATTGGA
ATGTAATCTTTTCTCTCATGCGACGACGGACCGGCCCTAACCACCCAGCCAGATACGATC
TAGCTCCAACCAAACAGATGAACAACGAACTTTTGCTCCATGCATGATTCTCTCTCACCA
AFLA_094960
ustC
AFLA_094970
ustU
TGATTCTTTTTATTTCCTCACAGGACGGACCGTATCATAGCCTCGCACCAAGTCTGAGAA
GATGCAGAGTTGAATTGAGCGCCCAAAAGCCGACCCTCCATTAACAACTCGGCGCAAAGC
AGGAGGCAGCAGCTATCTTCCCACACCGGCGCCAAATTGATCTTCAGCTCGCATGTCACC
CTAGGCCAATGAGAACAGATAACGACACTGCTTGTCATCGCCAACCTCCGCAGCGCCAGC
GTCAAAGGTCCATCCCTCCAAACGGTGGCGCCTCTTGTCTTGAATTCCATCCTACCCCGC
CATCTCATGGCTATTATCCAGCTATATATGAATGGTTACTTCGTTCGTTTGTGACTACAC
CTCTTTGCAACATCGTATCTACATCCGTTTGTTTGTTGAAAACCCCACCGGAGCTCTTTG
ACAGCAGCCAACATGAAGCTTATTCTTACTCTACTCGTATCGGGCCTCTGTGCCCTGGCT
TGCCCCTGCGGCTAAGgtcagtaaatgctcattgctgcttactgttccattcatcaagct
tatctgattcagCGTGATGGCGTCGAGGATTACGCATCGGCATTGATAAGCGTAACTCAG
TTGAAGACTACGCTATCGGCATTGATAAGCGTAACTCAGTTGAAGACTACGCTATTGGCA
TCGACAAGCGCAATTCTGTCGAGGACTACGCTATCGGAATTGACAAGCGTAACTCAGTTG
AAGACTACGCTATCGGAATTGATAAGCGGAACACGGTTGAGGATTATGCTATTGGCATCG
ACAAGCGTAACTCAGTTGAGGACTACGCTATTGGTATCGACAAGCGGAACACGGTTGAGG
ATTATGCTATTGGCATTGACAAGCGTAACTCCGTTGAGGACTACGCGATTGGTATTGATA
AGCGCAACTCCGTTGAAGATTATGCTATTGGCATTGATAAGCGTGGTGGCTCAGTTGAAG
ACTACGCTATCGGCATCGACAAGCGCAATTCTGTGGAAGACTACGCAATTGGTATCGACA
AGCGCAACTCTGTTGAGGACTACGCCATTGGTATTGACAAGCGTGGTTCAGTTGAAGACT
ACGCTATCGGCATTGATAAGAAGCGCGGTACTGTTGAAGACTACGCTATTGGCATCGACA
AGCGTGGAGGATCAGTTGAGGATTATGCCATTGGTATCGACAAGCGCCATGGAGGGCATT
AAGCTCGCTCAGTCTTCCGAGTATGATGCCGACATATTTGCACATACAGAACTGTTTAGC
ACTTGAACAGCTTTATGCAAGTGTTTTTGAAATGTAGTGTTCGTATCAAGCTCTTTAGCA
CTACAGTTTTGCATGATAATGATGATGATGTTTCATTGAATGCCGTATCCATTACTCATC
TGATTGTCTTTTTAACCGGTCGAGGTAAGGCAGCAGCACCAGTAGAGAGTATTATGTTAG
CAAGTATGCCCTAGAATGGAACATTGAAACTGAGCCACGTGTTAGAGACGATGGTACGTT
CCAATATAAGAAATAGCTATGTAGCGCTGATCTATATATTTGGGATTATCCAACTCAAGT
GAAACACAGAAGCAAATGATATATTGCTATCATGGACATTTAACCTTTGTTCTATGTACC
ACTGACATACACCAATCGTTCCCTCATTAATGAATCCCATACGTCGTCTTCAACCTATTG
TCCTCGGCCCATCGAGCAATAGCATCGAAATCACCACACGTATGCTCAACGCCCCATCCC
GTACTTCCCTTGTCATTCGGAGGCGCAGGAATCCATTCTAGAGTCGTGTCCGCGTGACAC
ATGATAGCCTGTCTCAGGTAGTCCCAGCAATGCATGAGATGCGCTGCATTAACTTGGTCG
AGGTTCCCTTCTCGAGCGGCGTAGTAGCCCTCTCGGGTCATATACTGAAAGCAATTAGCG
TAAGCGTTCTTTCTTGACATTGTTGTTATAGACTAATGTGCTTACAATGCAGTGGAGCTG
GTGGAACACGGAAATCATTGCACGTTGGTGGGGAAGAGATTGATCCAGACCAGGCTGGTC
GGGAAGTGCTGTGTCGTTGTTGATATTCACGAAGCCTCGTCCAACTGGTGGCGTATTATT
ATATAATTCCATGATCGTCCTAGACATGAAAGAAACTCACTCGGCATCAATTCGTCCCAA
GCCTTTTCCGCCTCAGGATTGAGAGGTTCTCCATACAACGTCTGAAAAACGAAAAGTTTT
CGAGTTGGGGCTGACAGGGAACATTATCAATTTGATTGCATTGATTTGGATAAGAAGAAG
GGAGCTCGAACCTGTCTTCGGGGGTAACCAAGGAACATCCTTTTCCTTGTGATGCGTCTT
CCGGAAGTAATGGATGAGACCTCCCAACAGCCCGATATTCGACAATAAGAGCAACGCAAT
GAGAAACCAGACAGCTTTCGAACGAGATCGCTTCCTGTCTCGTCTCGAGTACGACCGAGC
TTCTAATAATGTGTCTTTTTCTTCTTCGGCGATAGTTGATTCTTCGGATTGGCGCACCGG
TACTTCTTTGTACCCATTAGATGAGCGCTCTGCCATTTTTATTTCTCTGGACCGAAAGAA
ATATGAAAACATGAATAGATATCGATTAAGCTACAAATAACCTCTTTACAGGCCTTCAAG
TCTGGTAGTCTTTTTTGTACATGCCTGGGATTTATGATGGATGTAGGCGCGGAATCAATT
GGCTGTGGAAGCGGCGCCTGCGCCAAAGGACCCACTCGCCGTTATCCATCAATATGGAGT
TCAGTCAGTCAACATAGCGTGTGCATTGGAGCAGCTGGCTGTCAACTCTCCTGGAATATC
TGTAAAAAGGAGGTGTGGTGGTAAAGATGGGTTTTAGCTGGTATGGCGTACTCCTTTTCG
TCCAGCTTATTTCATCCACCATTGTTTACGCCTCGGATCCATGCGCCCAGATCGATCACT
ATGTAGCATGGGGGAAAAAACAAGGTACATATTCATGAATGCAATATGACTGATTTTTGT
AAGTGATGATTGACAAATTATAGGGAGGAACAAAATCTCCGGGATTCCAGGCCATCTGGC
GTATGATGTCTCGTCGATGCCATTTCGCTCAGATCTGGCCGTGAAGTTGTGACGACTATG
AFLA_094980
ustA
AFLA_094990
ustYa
AFLA_095000
ustP1
CCAAATACTACAGTTCCATGCCACGGTATCTATGCTGAAAGGTAGCTCAGCTCCGCCAGG
TAATCCCCTTGTAGAACCAAGCTAATGATGACGTGAGTTAGACCCTCCCAGCGGCTATAT
CTCAACCGGGGTCGATTTGTGGGGAGGATTGCAGAGAATCCGACAAAAGGCCAGCGACAA
TGTATATTCGAGCCAGTACGATTTTGATTCTGATCTCAAGTATCTCACCTCACGGGCCAA
CGATGGCCATCTCAGTGTTGGGCTGTGTTCTCTGGAAATCATGCACTTTGAGCACGATAT
GCCTTTGGTGTCGATTTCACTGGATGGGGTTCAGCTTCCCCAGATATATACGTACTGTAA
GCTTTTTGGGTGTTCCAGTTGTGACTTGATAGAAGGGTGGATGGCTCAGTCAGAGGAATA
TGCTATAGATGACGCCGAGATGAAGCTTCGCGGGACCGAGGCCGCGATCTCATCGGTTTA
TTGTATTGAAGACATGGATCCGGTTTACTATTTGCAGGCTAATATCGGTGTAACCATTGG
ACTTCAAGATCCTGATGCTCGGTATACCCGATGGGCCAATAATTGAAAGCATAATGGCTA
ATGTGCAATCCAGATATAACCACCTGTTTCCCTCACCTGCTGCAGCATTTTCAGGAATGT
ACACAGGAGGGCTGTGGACCAACAATCTGGGTTCCTGGCCTGGGAAGGCCAACCAAACCG
TGGAATTCAGCAATGGAACCAAAATGACAGTGGAAACTACTGCATCAGTGATGTTGGATC
GTGGATTGGATTTCTCCAGTGGGGAATCACTCTTTCAGACTGCTTGCATGCCGAATAAGA
AGAGTCGCCCTCCCGATCCCCGCCCATCACTCGCGGTAGGGAAGCCTCCATACTCGATTC
CACTAGGTGGACCTTCGATGTATCCGGACCCGATTATACATCATAAGAAAGACTTTGTAA
GAGGGTATTATCTCCATGAAGAAAGACTTGAAGATGTTGCCGTACTGCAGCTCCCAACGT
TCAGACTCATCGGGGAGAGTCCGGTATCGTTGGCGCGAGTTGCAGTTCAGTTCCTTGAAC
GCGCAAGAAAGGATGGAAAAGAGAAGCTCATCATTGATCTATCGAACAATATGGGGGGCG
ACATCAACCTGGGATTTAATCTGTTCCGGATCCTCTTCCCAGACAAGCCAATCTACACGG
CCACTCGGTTCCCCTCCACAGAGTTGATTGGCCTGATGGGTCGTGTCTTCTCGACTTCCC
AGGGTAATGAAGCAGTTGAGCATGACAATACGCTGGACCTGCCTCTGGTGTTCCAGAATG
CGGTTACACCAGACCATCGACACTCGTTCGGCAGCTGGGAGAAGCTGTTCGGTCCCGTCG
AAATCGCAGGTCAGAACATGTCCCATCTACATGCGACGTATAATTTCACCACGGCATCGA
CCGAGGACAATCCGATCAGTGGGTATGGGGGCATTGAGTTTGGACCTTCGACTCAGCTAT
TCCACGCCGAGAATATAATTATCGTATGTCTTCCATTAAATATGCATAGCACCGACACTA
ATACCAACAGATGACAAACGGCATCTGTGCATCCACATGTACAATTCTAGCAAGATTACT
CAAGCAACAAGGCGTAAGAAGCATAGTTTTCGGAGGGCGTCCTAGAGCAGCCCCGATGCA
GCTACTCGGGGGCAGCAAAGGCGGCCAATATTGGTCGTTAGTTACAATAAGCCACTACAT
TAAAAAGGCACGCGAGATTGCAGTGAACGCAAGTGGAGCCGGTTCTCCCATTCTCTCTGA
AGATGAATTGGCTCGGTTTCTAGAGTTGGCTCCTCCACCACTGACAGGATTCCCAATTCG
AATTGATAGCCGTGGAGGGAGTGGGGTGAACTTCCGCAATGAATACGACGAGAAGGACCC
AACCACCCCGTTGCAGTTTGTCTACGAAGCGGCGGACTGTCGGTTGTTTTGGACGGCGGA
GAACTATGTCTTTCCCGAGAGTTCGTGGGTTGCAGCAGCAGATGCGATGTTTGGAGATGC
TTCGTGCGTGGAGGAGTCAGATGGACATCATATTACTCCTTAACCTTTTCTTTCCCGTAA
TGAAAATATATATTTACTTCATGTATTACTACTGAATCCTAAGCATCACACGCCCCTTGT
CTAAAGTGTCCATTATTCCCACACTAATTATGCTTAGCATCCACAAGCCGCCGACTATCC
GCCCAGGCCAGGATGCCATCAAAATCCTTACAGATATGTACTGCACCTGTACCGTCCGTG
CCGGGGTTGTCTGTCCGTGGATCCTGCCCTTCCAAGGCTGTATCTCCACAGCAGAGCAGC
GACTGGCGCAGATATTGAAAACAATGGTCGACGTGGTCATGGGAGTGGACATGGACCTGT
TCATGAGGGTGCTCGTTGGAATGTGTGTCGTCTCGCGAATGGTGCGCGTTTAAATCTGCT
GCTGATTTGGCTGCTGCCAGGTCGTCATAGACAGACATGATTGCGTACTGAATTGTAGAT
TAGGAACTAGGAAGCTCATATCAAGTGATCTCCTTTTGGACAGAACATGACATACCAAGC
AGTGTAACTGGTGGAATACTGCAATGGAATAGGTATCCTGACCTAGCTGCTTGATCGGGG
GAGGCAGGGTGTATCTCTCCGTCTGGTTGACTGCGATAAAGCCATTTCCTCCTGTCGACC
TTGTTAGATGGTTTCCATCGCTAGGCATATATTGCAAAACCCACGAGGCATATACGACAA
CCAATTGTTCATTGTCGCATTTCTGGATTCCTCTGTTTTGTGATCTGATGCAGCCAATGG
ATCAGAGCGAAATATGACTCTCCGAGCAGGAACTATATCCATCCCCATTGGCGGTTAATC
CTAAATTCCATGGACAGATACATAGATATATGTTTATCAACTCACAGTTCCCAACCAGCC
CATTTAGCTCACCCAACAACCGCTCCGGAGTCTTCCTAGTAGCCTGGGCGAAGAAAATAC
CGAAGAACAAGATTTCGATGAATCCAACAAAGGCAAGAGAGGTGTAAATGATCGGACGCA
GGCAATCCCATCTTTTTGGTCGGTTCTTCTTCTCGGCAAAGTGTTCGGGTTTGAGGAGGT
AFLA_095010
ustP2
AFLA_095020
ustYb
ATTCGTCGGATTCTCGATGTGCGTAGAGATCCGACATGGTGGCTAGGTCTGGCTAATAGA
ATTAGGAAGAATTGACTACGTGAAAGAGGGTATTATTCTATTCTATGTCCGATTTACAGA
AATGTAAAGCTATCACGGTGATTTCTGTTATTTGGCGCTGAGCTGCGCCAAACTGTGCGC
CAAATATGCTGCATTTGTTCGTATTGCACAAACAAGAAATGGATTAGCGCCTAGTTTCGA
GGACTCTCGTCGATTAGAGTAAAAAAAAAATGTGTGTCTTACCATCCAAGATCCACTACC
TTATGTGAGCAGTCCAAAAGTCCGTAAGATCGTCAGTTACCCCTCTTTGGTTTGCCAGCG
AATCAGCTCTCGGCTGGAATGCAGCTTCACCTCCTGCATCTTTCTTCCTGCCTTTCTTCG
TCGCATTCCATTGCTGTACAGTGTTAGCCTGACTTCTCATACAGGGTCCCTCGTCACTCA
TGGCCTCAAAATGGATAGAAGAGCAGCCCCTCGTCCATCGTAGAGACATCCGGATCTCCA
GTAAATCCCGCATTGCAGCTGGTTTGCTCGTTCTCCTTGTACTCTGGCGGTATGGTCTCC
CTTCTTCGATTCACTTTGGTTTCTCGTCAGAGGAGCCCAAGCAGTTAGGTGCCGTTGCTA
GTGAACATGCGCTGTGCAGTCGGTATGGAGCAGACATGCTGGAACGAGGTGGGAATGCGG
CTGATGCTGTAAGCCTTTTTCCTAATCTTCTCCTCTTCGTGCGCGTTAATGCTTTTTCAT
AGATGGTCGCTACAATGTTTTGTATTGGCGTCGTAGGTATGATGCGCATTGCCTGTGGTG
AACGAAGAAAGATCAGCTCATGAACAAGCCCAGGCATGTATCATAGTGGAATCGGGGGAG
GCGGTTTTATGCTTATCAAATCCCCTGATGGCGATTTTGAATTTGTGGATTTCCGCGAAA
CAGCCCCTGCTGCAATTGTCGCATTGGGAAAGAATACATCCGCCGGTCTGAGGAGGTAGG
AGACTCCTCCGCTATCCCACTTCATGCGATAGCTCACCATTAATCCACTCATAGTGGTGT
TCCTGGAGAAGTGCGCGGTCTTGAATATCTACATCGAAAATATGGCGTCCTCCCCTGGTC
GGTGGTCCTGGAACCGGCTATCCGCACTGCACGAGATGGGTTCCTTGTGCAGGAAGACCT
GGTCAACTATATCGACATGGCGGTCGAAGAGACAGGTGAAGATTTCTTGTCCAAGCATCC
CTCTTGGGCGGTTGACTTTAGCCCTTCCGGGTCCCGAGTCCGGCTGGGGGATACGATGAC
TCGTCGACGACTAGCCGCGACTCTGGAAAGGATTTCCGTTGATGGTCCAGATGCGTTCTA
TTCCGGCCCCATCGCGGAGGATATGGTTGCTTCGTTGCGGAACGTGGGGGGAATCATGAC
TCTGGAGGATCTGGCTAATTATACGGTTGTTACTCGCGATACCTCGCACATTGACTACCG
GGGGTATCAGATTACCAGTACAACTGCACCATCGAGCGGGACTATCGCGATGAACATCCT
CAAGGTGTTGGACACATATGACGAGTTCTTCACTCCTGGAACCACCGAGCTCAGCACCCA
TCGCATGATAGAAGCTATGAAGTTTGCGTTTGGCCTGGTATGCATCTTACCCTGGACTAC
CGACAGTACTTCAACCTGCCCCGACTAACAGCAGCACCAGAGAACACGTCTCGGGGATCC
GTCGTTCGTGCATGGTATGGAAGAATATGAAAATCACATCCTGAGTGCAGAAATGATCGA
CCATATTCGCCAGAGTATATCCGACTCGCACACGCAGGATACATCAGCGTATAACCCTGA
CGGGCTGGAAGTCGTCAACAGCACAGGGACAGCGCATATCGCAACAGTCGACCACCAAGG
CTTGGCCATCTCCGCGACGACCACTATCAACCGCTTGTTCGGCAATCAGATCATGTGTGA
CCGGACCGGCATTATCATGAATAATGAGATGGATGGTAGGCTCCACCGGCTCTACTGCAG
CTCAACGATCAATGTTAACATGACCTGGAAGACTTCTCCGTTCCCACCTCATCGCCGCCA
ACATTCGGCCATACTCCATCTTCGACCAATTTCGCTGAGCCCGGAAAACGTCCCCTCTCG
GCCATTTCGCCAGCCATCATTCTCCATCCCGACGGATCACTATTTCTGATTGCCGGCTCG
GCCGGCAGTAACTGGATCACCACCACAACCGTGCAGAATATCATCTCTGGTATTGACCAG
AACTTGGCAGCGCAGGAGATCCTGGCGACGCCGCGTGTCCATCATCAGTTGATACCCAAT
CATGCCATTTTCGAGACCACATACGATAATGGGACGGTGGACTTCCTCTCCCAGCTAGGG
CATGAAGTGACATGGTACCCGCCTGCCGCGAGCATGGCGCATTTGATTCGCGTGAACGCA
GACGGGGGATTTGATCCTGCTGGAGATCCCCGGCTGAAGAACTCCGGGGGTGTAGTGGCC
CTGCAGCGTCGTAAGTTCTGGTAGTCGGTCTCATGCTTATGGTGAAATTAAACACATACC
AACATTCATACGTACAAAACTACATATATCTTAATTGAATTGCCCTTGAGTAGAGGAAGC
GAGGAGTAGTTCTAGGTATCCCGCGTGACAATCTCATCCAACTCACTGCAAAACGCCCTA
ACTTCCTCGACAGTATTGTAATGGACAAAACTAACCCTAACCAAACCATCGCTACTCTTC
GGCTTCAAAACATCCCACGTCGGGCGCGGCGCCAAACATATACCCGAGGTAATTCTAAAC
CGATTCCTCGTATTCACCCGCATAGCCACATCCCCAGATGATCGGCCCACGACCTCAAAC
GTCACAATCGCAACCCTTTGACTCGGGTCACTATTCCGGCGCCCAAAGACCCGGTAAACA
CTCGGTTTGCTCAGGAGATACTCAAGCAAAATAGTAACCAGGACAGTCTCCTGACGGACA
ATCCGGTCCCAACCAACTGTATCTTGGAGATAGGAAACGATCGGCGAACACATGAGCTGG
AGCTCGAAGCTGGGCATCCCCAGAGCCAATTTCCCATCGAGAGACGACGAGGAAACGAAA
AFLA_095030
ustH
AFLA_095040
ustD
TAGTGATTGATGCTAGTCATGTATCGGTCCTGTGCTTTGCGACTCGCATAGAGCGTGCCC
AAATGGGGGCCGAACAGCTTGTACCAGCTGAAACAGTAGAAGTCGACGTCGAGCTCCTTC
ACGTCCACGGGACGATGGGGCACGCAGGCTACGCCGTCGACGATGAGCATGCAGCCTGGG
ATAGTGTGCACTACGTCTGCGATCTCGCGGATGGGATGGATGGTTCCGACGACGTTGGAG
ACGTGGTTGCAGGTGACGAGTCGGGTTTTGGGAGAGAGGAGTGGTTTGAGGCTGTCTGTG
GTGAGGACCGGGTCGTCGGGGCTGTTCGGGGTGGTGGTTGGGGACCACCATTTAATGGTG
ATTCCTAGTTCTCTGGAAAGATGGATCCAGGCACTGGCTGCTGCCTCGTGGCAGAGGGTT
GAGCAGACGATTTCACAATCGTTGTTTAGCATGGGTTTGAGAGATAATCCTAGCAGGCGA
AATAGGCATGTGGTTGATTGGCCGAAGGCTGGATTTTACAGTACGTTTGCCCTGGATTAG
CATGCACCATAAAGACAAAAGACATGATGGATTTACTTATTTCGTCTGGCAACGCATTGA
TAAACGCGGCCACCTTCCCCTTGTTCCCCGTATAGGCTGTGATTGCCTCCATGCTCTTCG
CATCGACGCCCGGGGGGAAAGGGAAGGAATACATGAAGTTGCTTGTACTGTGGATCCCAT
TGGCGTCTTGTACTCCTTCATATATAAAGAATGCGATGGGGAGTATAAGTACCTTTCAAT
CGCTTCCTTCAGAACAACCGTCCCAGACGCATTGTTGAAAGCAGCTGTCTCTCCTCCTAG
AACTGGAAAGTGGGATCGGACACTTTCGACATCTATGACATTTTCAAGAGGTGTTTCTGC
CTGTGCAGTGCCATTTATGCTAGAGCCCAGTGGGACCGAATCTTTGTCGACATCATCAAG
TGATGAGGTTGCTACCGATTTCATTGTCGATGATCCACTTTACTGTGTTATTAGACTTTC
ATGGATTATCCATATCGAAGGAATTATATTCGTTGATTGCAAGTAATGAAGGAGAAATGC
AAGTGCAGCTATTGATGCATCAGGCGCCTATCATCCATCTTAGCGCAGAGCCACCAAAAA
ACGGCGCCAAATAGCAGTCGACCAGCGCAAATCATATTTTATAGGCTTAGGTGAACATAA
GAATAAAACCAGAATACCTAATACCGATACTGAAACACCGCAGAGTAAATGGCCAATCCT
CAAACCACCCGTGTAGCCGTGGTTGGTGCTGGCATTAGCGGTGTTCTGGCCGCCGGACAT
CTCCTCGCGACAGGTCTTGAAGTGACCGTTTTCGAACGCAATGCGGCCCCCGGTGGAGTC
TGGTATGCTATTCCCTTTTCTGGTCTACTTGCTACTCGAGAAGCTGATGCCTGGGCAAGG
TTATACGATGAACGAACACCGATCGAGCCATCGTATCCGGCCATGAAGCCCTCAAAGGCC
GACCCGCCAGCAACAAATGAACAAGAAACGAGCAGGTTCATGCTACAGCATGCCCCGCCT
GGGTGAGTGTATCTCTCTCTATCTCTCCCTTACTTTGGTAACACGGAAAATCGGTTTAAT
GTGAATCAGGCCGTGCTATTATAACCTCCAGAACAATGTCCCCACGCCACTACTGGAGGT
GTCGCTTAAGCCATGGCCAGACGGAACGCCAGATACTGTGAGACATGATGTAATCCAACG
GTTTATTCAGGATATGTCGATTGAGGCCAAGGTGCATGATGTGACACGGTATGAAGCGCG
AGTAAAAAAGGTTGTGAAAGATGGCGCAGAATGGAAGATTACCTGGTCTACTCCGCAGGT
GGGGTTGCAATCTGAGACTTCGGAGTTCGAGCAAGTATCTGTAAGCCACTGTAGGGATCT
TACCGGCTCAAGAGCTAACCATCGATAGCCTTTTGATGTGGTGATCGTGGCGTCTGGACA
CTACCATGCGCCCCGTGTTCCAGATATTCCGGGCTTATCAGACACGAAGAGAAAGTATGG
GTCTCGAATCTTGCATTCCAAGGAGTATCGACGGCCGGAAAACTTCAGGAACAAGGTTCG
TGGTCTTGCCCGGGAATCCATTTAGACTCATGCCAACACACACAGAATATTCTTATGATC
GGTGGAGGAGTCTCGTCGATCGACATCGCCAACGATATCAGTCCATTCGCCAACACCATT
TACCAGAGCACGAGGAACAGTAAATTCGATCTCGTTGAGAGTATGCTCCCCGAGAACGGG
GTCCGAGTCCACGAGATATCCCATTTCGAAATACAAAGCCACAGCGACGAGCCATTATCA
GATGACGAGCCACTACCGTTGACAATCCATTTCGAGTCCGGCCAGAATCTCCACGGGATT
CATATGATCATGCTCTGTACGGGCTATCATATCACATTTCCCTATCTAGAAGAATACCAC
AGTGATGAAACAACATTGCAAGACGCAGACGAAAACATCCTCATCACGGACGGCACGCAG
GTGCATAACTTATACCAGGATATATTCTACATCCCCGACCCCACGCTTGTCTTCGTAGGA
CTGCCTTACTACACATTCACATTTTCCATCTTCGACTTCCAGGCCATTGTTGTTGCCCAG
GTACTCTCCGGTACCGTCCAGCTACCAACTGAGACTGAAATGAGATCGGAATACAACGCC
AAGGTGGAACGAGTCGGTCTTGGAAAGGTGTTCCATTCGATTCTGGGGACAGAAGAAAAC
TACGTGCATGATTTGTTAACGTGGGTGAATACTAGCAGGGCAGCGCAGGAGCTTGTCGCG
ATTAAAGGGTTTAGCCCGAGATGGTACGAGGCAAAGGAGGCACTGAGACAGAAATACAGG
GCTCAAGTTAACAAGTAGTTCTGTAGCCGACGAGAATCGATATAGAATATGCTCTTCTAG
ATATTAATTTTCATGCGGCGCCTGCTTCGGCGCCAAAGTCCTTCCCCGCGGCAGTTTCTC
CACTAGGGTTCGTTAAAGTAACGAATTGCCCACGGCACCTCTTTGCTTATAAAGTACACA
AACCACATAGGAATAGTCATATGTCAAAATAAAATAATACATAGATCTCCATATGGCCGT
AFLA_095050
ustF2
GGAATATTTCCAGGAAAAACTAAACAAGTGGCGGTATTCGCCGGTGGCAGGCTCACCGGA
CGAGGAGGGGGGAAATGCCACTGTCAAGCCCAAGTACGAATATTCCAACCATTTCGTATA
ACATGGATCCATGGCTAATCTAACCCAACAGAGCTAGCTTTGTGCCAGTATATGCAGGAC
TTACCATCATAAGTCTCATTACAGTGACCGTCTCTCTTGTCCACCTCCTCTCTGGAACAG
GAACGACGACAACATTTCCGCCCTGCAAGAACCCTGCCGTCCGTCGTGAATGGAGATCTC
TTACCAGTAGCGAAAAGCAAAACTTCACACAGGCAGTCATCTGTCTGGCAAGTATTCCGT
CGACGTGGCAGCCGAATGGAACGATATATGACGACTTTGCGATTTTGCATGGGGGGATTG
GATCGTGGTGTTCGTGCCATCTAGCTTTATGGTCACTCTGCATAGAAGGTAGAATATAGC
TGACTGATAGAAATAGGCCATCGTTCGGCCTCCTTTCTACCCTGGCATCGATACACGCTA
GTCGTGTTTGAGAAGGCCCTCCGAGAGCACTGCGGCTTCACCGGCCAGGTTCCGTATGTT
GAATTCCAATCCAACTCGCAGCACACTTGGTGCCCATATCTTTATGATTACACGTCTAAC
GAGATAGATACTGGGACTGGACACTGGACTGGATGAACTTAGCCAACTCCTCTATCTTCA
ACAGCGTGGACGGATTCGGCGGCGACGGAGACCGCACCGGACAGGAGGTCGTAGGTGGTG
GACGCTGCGTGATCGATGGCCCCTTTGCTGGGCTGCAACCCATATTATATAACCATACCT
ATGTGCGACATTGCATTGCGCGCGGCTTTCGCGACGGTGACCAAGCCGGTCGTATATCAG
GGGAGTACTACCGGCCCGAGTCGATTGGTGGTATCCTCCGCAAACAATCGTATGTGGAAC
TGGTGAGGGAGGTCGAGATTTATTTGCATAATCCGCTGCATCAGGGCGTGAATGGGGACT
TTTTGGCCATGACGGCTGCTAATGGTATTCTCTCTCTCTCTCTCCCTCTTtgtgtgtgtg
tgtgtgtgtgtgtgtTTGGACAAGGGCTAACAGATCAGATCCCCTTTTCTATGTCCACCA
TGCACAACTGGACCGACTTTGGTGGCGTTGGCAGCAGGAAAGCCCGGACCTCAGACTCAA
GGAATATCACGGAAAGCACATGTACAACTCGACCGGGAACGCGACGCTGGATGACATACT
CATGTATGGTGGGTTTGCGGAGGATATCCCGGTATCTCGCGTGATGGACACCAAAGGGGG
GTTTTTATGTTATACGTATTAGGGGTCCCGAGTCCTGATAAATATAATGCTATGAATAAA
TCGTATCTAAACGTCTAATGTGTCGTGACTTCGTAAATCTAAGATATCCAACATGCGATA
GGAAGACGAGGAGAGTTGATATATATCAATACGACCTTGCTGCAGAGATGGCACCCAGCG
CTAGCGTGAACAATACAGCAGCCAGCAGAAAAGGCAATCCCACCCAGATTCCACCTAGCT
GTAATCCCCACTGCAACGTCTTGGCCAGCATAGGACTACCAATTACCAGACCTCCGTACA
ACATCGTCGTGACTCCGGTGAAGACAGTTCCAATGTGCATTGGGTCCACCATGCCCGTGA
GGAAACTCCTGGCCGTGACGGAAAAGGCAAATCCCAGGGCGAAGACCGTCTGCCCGAAGA
CCAGAGTGCCGGGTGAAGCAGCGAGGAACATCACAAAAGATCCGATGATGAGGCAGACGC
CGTTGATCTGGGTGATTCGTTTGTCTTTCACCACGTCGTTGAGCTTGAAGCGTTTCACCA
GGATGTACGAAAGGGCGGGAATGATGGCCGCTAGAACGAATAGATTGACTCCGGCCCGCA
GAGAAACTAGTAAGGATGCCTTGAATCAGATGGTCAGCATTGTTGACTACTTTATTCCCT
GATCGACGTAATCTGACCTTGTCAAATTTCCAGTGAAACTTGGCAGCAGCATACTGCAGG
GTAATACCAGAAATCATTCTCCCGAGTTGACAGACGAAAAAGGAGGCCATGATCAGCAGA
ACGTTCACATCTCGGATCCAAGAGGAATCCTTCCTGAACTCGTCCACAATCTTGCGCCAA
CGGTGGTGGATCGACATGAGCCATGTGGGATGCGACTCCTGCGCAGAGGATAAGAAGTCA
CCATCGCTGCCTCCTTCGCGCTTGGAACCCGCTGGCCGGACATCTGGCACCACCACGTAG
GCGAACAAGATGCCTAGGACCATGAAAATCGCAGCCCCGAAGACGGGGATCCAGGGGTTG
AAGTTTGCTAGAGCCGCACCGGCGGGGACGGACACTAACTCGGCCACGAGAACGGCGGCA
TGGACTTGGGAGAATGCCGTAGTTCTAACAGTATGTCAAGTGAGAAATAGAAATCAAAGG
GTCATATCGAAGATACAATATACCCTCTTACCTCAAATCTGCAGGGCATGAATCGGCTAT
CATTGCAAATGCCATAGAGGATATGGAAGCTCCTCCGCCACCGATCAGTTGCCAGATACC
GGAGAACCATACAGCGCGGAGAGGGAAAGTATCGGGGAACCAGGCTTTGTTCTCTTAGTC
TTTTGGCTGGTAGGGAATCAGGCTAAGAGATTAAATTTAACCATCAACTTACTCACGACC
CCGACCCAGATGTCGCTCAGAAGACAACCGACCATGGCAATTAAGAGGACCTTTTTACGC
CCAAACCGATCTGCGACGACTCCATAGGGTATAGAAACTGCGATGGCTATGGAGGCAAAT
GAGCCAGGCTACAGGAATACCCACCCAGAGTATAGGAAGTACCTGGAATATTATCAAACA
TCTCCCGCCAACCATTGATCAGAGCCAGTTCACTCTGCACTGGCTCAACCTTGCAATCTC
CCATTCCTGTGCCTGCAGCGCCCGAAACCTGGGTATAATACGCCTTACAGATGATGTGTT
CGAAAATCGCCAGGCGGGGTGCGACGATGATTTGGCCGGCTATATCAGTAATCAGGAAGG
TCAAGCTGGCGACTACCGCGATATAGAGCGGATTGTAGTGCCTGTTGATTTTGTGTTGAT
AFLA_095070
ustT
AFLA_095060
ustQ
ATAGTCCGTCCGGGGTGGATGACCGTCGATGCTCCCGATCGGCGCCGATTAACGATACAT
CTTCGCTGCAGTCGCTATAATCGTGGTCCACCATGGTGATATGGTCTTTCGGTGTGGGTG
GAGAGGTGCCCGACATGACGGTGGGAACTTTCGGGCGACCGTCGAAAACCCGGGTTCGGG
GTGCTGTTTATAAATAAATGTTTACAGTCGTTGAGAGAGGGAAGCGCAGCTGATATATTG
ACAGAAATGACCAAGCGGTCAGACTCCCTACTAGTGGATCATGCTTCAAATTATGCCCTA
GTCAGCTGCGCCCGCCTGACCAGTCAGTCAACCCTAACTTGAAACCAACGCCATGAAGAA
TCTCCGATATACACCTATCTATCCCTTCCATACGTAAGGAAGATCGCCTTAATCGGATAA
TATACCTGAAAGGGTGATACCAGTGATCATATTGAAAGGCTTAAGTTCCTATCCCAATCT
TCTTATGAGGGGTAGGAGTGACTCAGATGGAGGCCAGTGATATGACTAGCCCACCTGATT
AAATTGCGGATCGTAGATTAAGTGGGGTAAAAACCCACAGGAACAGACTACGATGGAGTA
CGGAGGAAGTAATGCTCATAGCCATAGGCCGCCACACCCGCTACGAGTGATACCGAGGTA
GGGCGCCAACAGATCGTATTATAGCAGCTCGATCTTTTCCCACTCTACCACGGTAGCATT
CTTTTAGTTGACGATCTTATCCCACCTCGGCAAAGTAGCGTTCCGATCGTGTTCGGCCGA
ATTCCGGCGCCGAGTTCAGCGAAACGCGACCTGGCGTCTATAGGAAGATGTATTTAGGCG
CAGGTTGATGATCTGTTATCAGCAAGCTCAGACCCTCGAGATGAGGAATTATACTGACTA
AGGGCAATGGGCGGCGATGGAGTGTTTAAATCAAACTAATATCAAGTTAGGATTTTGACT
CTATTTTGATTGAGCTCTACTAAGAAACACGTTTGGTCAGCTCTTAGGCGCTACTGCTAG
GAGCGGTGTTACTGACTAAGCAATTCACAAGCTTAGTTATGCAGTGGGAGCGCCTGAGCT
CGCTGAGGTAAATGATCTATGAGGTACAAAAGTACAATGTCCCAAAGTGTAGTAGGGCTA
GCATTGACAGCACCCTGGGAGACCGAACGAATATAATAACACAGGCAAGTGGGCCATCCC
GAAGGTCTCAACGAAAATAACACACCGGCTAGTGAGTCGTGTACATCATATCACCCATGT
CAGATACAGTCCAACACCTGAAGAGGAGATTTGAGGATGGCTTCCTATTGGATCTAATCC
AGATGCACATCCGGTCCTCTGGCCAGTCGGGGAGATCAAGGGTCGAGATGGTCATCAAGC
GTTATCCCCCACATGAGAACCCTCGACGTGCATGTGGGTAGCTATATTTCTTGATGTATC
TCGTAAATGACTGGTCAAATTCCCATCCACGTCGCTAGGCAATCTGCAAGGCTACCCACC
CACACAAGCACACGGACTCTCTCTTTCCCCTTTTCTTGCTCTCTTGCTCGCGGGCCGACC
AGAAGCTGCAACCTGCCGATTCAGCCCGCCACCACCCCGTTATCTTGTTCTTCAGGCATA
AACAATGGTCTAGACTTTACCTTCGGACCAACCCCGCCCGTAGTCCTGCAGGATACAGAT
ATCTGGTGTATGCATTCATTCATTCAGTTATTTATTTATTTCAATCACAGTGTAGCTGTC
CATGTGATTCAATAGCTATCAAACCACATAGCTCTACGACAGGGGCAATCGTCTAATTTT
TGTCCACTAACAACAACCGCCAGAAGCAAGACTTCCGTAGTTGCACCGTACCAACCAAAT
CTGAAGGGTACTTCCGTGCAAAAGATTACGTACCAGATAAAAGGCGAGTATAGAGGGAAT
CTAATCTTGCCGATCATCTTGCTGTCAATGATTGTCAACCATCCATCCTTCATGCCCCGC
CATGCAAGAAAGAATCAAGATAGGCCCGTGCACCTGCTCCCCCACTCGCCCACCACAAGC
TGCTGTTTGTATATATCGCTGCACAGAATCTCGGATTGCACCATCTGCACTGCGGATGCT
CTCAGAGTGTGTGTTCCCTTGATCCTATTCTGTCGTTAGTTCTCCCCTGAAACCTTGCCT
GCGGATGGGCTTGGATGCGCGCTTAGGTCTCCCCACTTTCATTCATCCCCTCCCACACCC
AACCGGGAGGGCCCGCCTCGTTCCTTCATCTGTAGAGTTTGGACTATATAGCTCGGCTGG
GAGAATCATCCCATATCCCGTGAGGGTCTTCCCTGGAGGTTCGGTCAATGTCAGGGGCCC
AGCGGCGTAAGACGGGATGCTGGACATGCCGGCTCCGGCGGAAGAAGTGTAATGAAGACG
GACAGCCGTGCTCGAACTGTGAGGCGCGCGGGGTTTTCTGCCATGGCTATGGCCCAAAGC
CATCGTGGAAGGATCGTGGGGAGAGGGAGCGGGAGGAAGCTCAGCGGTTGCAATTGATTT
CTAGAGCGCGAACTCGGCGTGCGCGCGCGACCCCTACAAACTCCATAAATGGTGAGCCCC
GACGGCCCTCGATCGACATGAACACGTCTGATATGGGTTCGCCCATTCAAATGGGGCTCG
GGTCCTCGGACTCGTCCATCTTCGAGACCTCCAGCCTCGACCTGCTTGATATCCCGGAGC
TGGGATCCTCGCTCTGGGAGCCGACACTGGATATCATTCAGGTACAACCTCCCCCCTCGG
ATGCCCCCAGTGCTCTGCAGTTTCCGTCCATTCTCGGTGATGGGCTGGAAGAGAAGGAGA
TCGACTTGATCATGTACTATGTCGTAGAAGTGTTTCCGAGGCAGCACTCTTCGTACCAGG
GAACCTCGGTGATGGAGCGGTCGTGGCTGCTTTGCGTGCTCAAGCGGTCCTCGAGTTTCT
ATTATACAAGTCTGAGCCTGAGTGCACACTACCGACTCATGAGTATGCCGGAAGGCGGCC
AAGGGCGGACTGCCTTACTACAGGAATATGAACGATACAAAACGTGCAGTCTGTTTCGCT
TCCAAGAGCTGGTGAGCTCCGCCACACGGCCACAGTCACCTATATCTACGGGGGTAGTGG
AFLA_095080
ustR
GGGAGTCTGTGATCTGCGGGGTTCAGATTGCCATGCTGGAGGTAAGAATGATCATCATCG
TCTTTTCCTTCTGGCACATATTTGGTAACAGAAATAGCTAGGCGGTAAGTAAGAACATGC
AGTCCTCCTACCTGCACCTCAGTTCCGCAGCGCTATCACTGGCACAACTCCTAGACAACA
CCTCGGCCCTTGATTCCAGCTCTATATCCACAAATACCACCCCACCGTCCTCTGTACTAA
CCACAGACCTATACCATGCGGGTTCCATGGAATACAAGGCTCTCCGATGCTTCAGCATCA
TCTTAATATGGAATGATATCCTCCACTGCTCCGCGCAGCAAACAATCCCAGCCGCGGCCA
AAACCTATCAAAAGCTCCTAGCGGATGAAAGCTTCATTCCGCTCTTCGCTGACATCGTGG
GATGCGAGGGCTGGGTGCTCGTCCCCATTCTCGAGGCAATGCTGCTGGCCAGCTGGAAGA
AGGACCAGGAAGCTAAAGGCCAATTGAGCATCCGCGAACTCCTCACCAAAGTCGACCACA
TCGAATCCATCCTCGGCGACCGAATGAAGCGACTAGCCCCGGCCGTGCTCCGCCAGAAGG
AGGCCGGGATCTCGTCGCAGAGCCCCGAACAGATCAGGCTGGTCCACACATACATCTTTG
CTCACGCAAGTTTCGTTCATCTCCACACCGTTGTCTCTGGCGCGCAGCCCAGCGTGCCCG
AGATCCGACAGAGTATTGACAAATCCTTAGCTGCATGGCAGCTCCTGCCTCCCTCTTTGA
TCAACTTCAAAACCATGGCGTGGCCATTCTGCGTGTGCGGTAGTATGGCCGTGGGTTCTC
AGCGAGAGCTGTTCCAGAAAATCATGTCCGAGAACTTTCAAAATCAATCGACGTCGAGCA
ATCTCCATTGTCTTAAGTCTGTAGTAGAAGAGTGCTGGAAGAATTTCGATAAGCGTGTGC
CAGAACAGAGTCCGTCGTCTTATAATTGGAAGGTCGTGATGGAAAAGCTAAATTTAAGTA
TCCTGTTCATATGAACTCGCACCACTTTCTTCTGCAAAGTTAGATATCTTGCAAGGTATG
TTCTTCGACAGCACCCCACAGTACGGTGTTTGGTTTGGAGAACAACCTCTCCCTCCTTTC
TTTTTCCTTCTCACTCCCTCTTTCTCTGTGTATGTTCATTCCTTCAGCATGACATAAACG
ACACAAGTAGATCGAATGAATCTACCTAGTAGAATCACACGCTCAAGCACTATAATATAA
ACAGATATAAAACACCCCCCTATTCTCCCAGACACCAATTCAATCACACCAAAGCCACAA
CCAACCAGACACAAGTAGCATCAAGCGCCCAGCTCCAGCACAGCCCACCAAATATCCAAA
TTCCTCACTGTCTCATGCCCCGCCCGCTTCCATTCCTCATGGACCTCCCGCAGACGCGCC
AGAGAAGCGAGGATCTCACTCTCCTGCGCCTCCGTAAGCGCCTCCTCCTTGACTCGGCGC
ATGAACTTCTCCCCCTCCACATACTGCACCTCAAAGTGGCCCTCGAGAAAGCTAGGGTCA
GGCGTGACTTTCCCATCCCTGACCACCCTATATCCAGCCCCCCGGGCCGCCTCCAAAATC
GACGCCTGGTCAAGTCCCGCACGGACATTCTGCTCTAGGTCCCCAGGCGCACGAGGCCGT
CGATACCGATAGAACAGCCTCTGCGTCTGAGCCGCTAGAGCATGCACATTCTGCTCCTCC
AGCGAGATCTCATACGAGTACTCGCACAGATACACCCTTGGCGCCTGGGCATCGACCGCC
AACGTCCGGAAAAGATCATACACGCTCTGGCTATTCGGAAAATACCAGAGACTATGGAAG
AGCGCGACGGAGTCGAAGACCTCGCGCGCGGGCCGGTGCAAATCGCGCAGGAAAGACGGC
GTGTCGGAGCGAAGGAATGTAATCCGTGATCCTAGCTCGGATTTCCTCACGTAGTCGTGC
GACTCTTTCACCGTGTAGGGACTGCCGTATTCGGGCTGCGCGATGTCGATCCCGGTTATG
TGGCCGGTGCGCCCGACTAGGTGGGCTAATACGAGGCATGATTCTCCCTGACCGCAGCCG
ATGTCGAGGATCCTTTGCCCTTGGGGTGATTCCCCAGGCGTGGGCGATTAGGAGACGGTG
CGTGAAGCGGGGGATATAGAATTTGGTTTTGAGATTGGGGTCGTAGAGGCAGCTGGATAG
AATTGTTTCCATGGTTGTATTTGGAATTAAGGTGGTTTATTGCTTGGGTTGTTCTGGTAG
ATGGGAAGTGTGGCTGCTGACTATATATATATATTGGTTGCGTGGTTGGCGCTGTGGGAC
CGACAAGTCCGCGCCTAACATACTTTTTTATATACGCGCTTTTTTCCGTGCAGTTTATGA
ATGTCAGGGAGAGACTTCCACGATAACTTGACGAAAGCAAAAGAAAACGAAGTACATTTC
ATTTTCCAAGAACATAATACGGTAGCAGCAAGTCACACGCGATGACAAAAACAGAAGAAA
GCCCCCTTCACTTTTTCGATATCTTCTCTACCCTACCAGGTACGGATAGTCATATAAAAA
TATAATGTTGGCCTCAATACTAACCCAAGCAAAGGCACATCCAAATCATGGTCCTCAAAC
GTCTTGAAAATCCGCATGGTCCTCAACTACAAAGGCATCCCCTACACGCAAAGCTTCCAT
TCCTATCCGGACATCGCGCCTCTCCTTCAGAGTCTCTCGGTACCTCCCCACAAACAAGGC
CGATTTAAATACACTCTACCAGCGATATGCCACCCCTCCTCGGTCAAGTCGAGCCCCTCT
GGTGCCATGATGGATTCTCTCCCGATTGCATGTCATCTCGATGAGACCTATCCGGATCCT
CCGCTCTTCCCGTCTGGGGAGGCTAGCTATGCCCTTGCTCTGGCTATCGGCAAGCTCATG
GTTCCCGCTGCACTAAAAACGTGCGATCTGCTTCTACCCAAAGCGGAGGAGGTGTTGGAC
GATCGGGGCAAGGAGTACTTCGTGCGGACTCGGACGGAGATCTTCGGGAAGCCGCTGTCG
GAGCTGAGACCCAAGACGGAGGAAGGGGTGCGGGCGATCGTCGATGGGATGAAGGCGGAT
AFLA_095110
ustS
AFLA_095090
ustR
AFLA_095100
ustM
ATGGAGGTTTTCATTTCCATGCTTCGGGGGAGAGGCGAGGGGAAGAAGAGTGGGCCATTT
CTGGAGGGAGAGAAGCCGGGATATGCGGATTTCATTTTGGTTACTTTTCTGTCGTGGAGT
CATCGGTTTGATATGGAGCTCTGGAGGGAGATTATGGATATGGGCAATGGGGAATTTAGG
GCGCTTTGGCATGCTTCTGTGCAGTGGTTGGAAGGGCAAGGGGAAGAGAAGGAATGGGCT
GTCCCTCAGTTAAGTACTGTAGATTAACGCGATTATGAATAGAGCTCAGGCGCTATATGT
AGTTCACTTCTGATTATAATTAATCCTTGATATGAATACATCTTTCCAAAATTCAAGCAG
TGGGGGACAAAGATAACTACTCCTGATCAACATCCAATATACCCGAGATAATAAAACCAC
ACTCCCTGCATCATGTTCCTGTTCTCTCTTTGCCTTTTATTCCTCCACCGTCTTTTCGCG
GATCGATTTCCACCGCAGCGAAGCCGACCCAACCAGGGAAAAAGCGCACATCACCACAGC
AACATGAAACGTGTCTGTAATGGCGCGGCTAAACGCCCCAAGGACATCCGCTAACATCTC
CGGCGGGATAGTATCGCGAAGCATCGTGACGCCCTTCTCGACCATAGACGGGTCGATAGA
AGGAGCACTAACACGCAGATTCTCGACGAGTCGGTTGTGAAAGACACTTTGCCCAATAAC
TAGCGAAACAGAGCTGGAGAGAGCTTGCACAAATGTCAGGCTCGAGATGGCGATCACGAC
ATCTTCTTTTTGCACAGCGACCTGCGGGACGAGAGTCGAGTTTTGCGATCCCAGTCCGAT
CCCGAGGCTCAGCAGTACCTGGAATATCAGGGATGATATCAGATTGGAGGAGGGATGTAA
AGTGCTGAGTAGAGCGGCTCCCAGAGTAGAAATCACGGACGAGGCGATTAGGAAGGGGGT
GTAGTATCCCACAAGAGAGACGAATATTCCTCCAGCGAGCGAACAGAACACCAAGCCCAA
CTGCGTAGGCAGAGTCATGAGACCGGATTTGGTTGCGGAAAAGCCCTCGACGGCTTGATA
CCAAATCGGGAGCTAGATAGACGTACAGGGTCAGCCATGTACAGATATGACGTTGTCACA
TGATGGGGTCGTAGGATATCCATACGTAGTAAATAAAGACGATCATAGCACCGCCGTTGA
AGATTCCCCACAAGGTGATGCTCAGCATATCTCGATTGCACAGGATACGAAGTGGAATCG
TGGCTTTGTCCCCCGCCCAGACCTGGACACCGGCAAACGCGGCCAGCATGAT
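The listing above interleaves sequence lines with gene labels (for example AFLA_094940 and ustO), so it cannot be used as a FASTA record without cleaning. A minimal Python sketch of such a cleanup is given here. It assumes that the coordinates in the header are inclusive and that any line made up only of nucleotide letters is sequence; the function names region_length and clean_sequence are illustrative and not part of the original work.

```python
import re

# Matches the "start-end" coordinates that follow a colon in the header line.
COORD_RE = re.compile(r":(\d+)-(\d+)")
# A sequence line is assumed to contain nucleotide letters only (either case).
SEQ_LINE_RE = re.compile(r"[ACGTNacgtn]+")


def region_length(header: str) -> int:
    """Return the region length implied by the header, assuming inclusive coordinates."""
    match = COORD_RE.search(header)
    if match is None:
        raise ValueError("no start-end coordinates found in header")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


def clean_sequence(lines) -> str:
    """Join sequence lines, skipping gene labels, header parts and other non-sequence lines."""
    kept = []
    for line in lines:
        text = line.strip()
        if text and SEQ_LINE_RE.fullmatch(text):
            kept.append(text.upper())  # intron lines are lowercase in the listing
    return "".join(kept)


if __name__ == "__main__":
    header = ">Aspfl1 A_flavus_1041045517010:1600929-1627600 5'pad=0 3'pad=0"
    excerpt = [
        "GAATGTTTTGATGATTGGCGGAGGAGTATCATCGATGGATATATCGCGGGACCTTGGGCC",
        "AFLA_094940",
        "ustO",
        "ATTTGCCAAAATGATATTCCAGAGCACACGAAATGGCGACGCAGATCCACCTGCTCTCAT",
    ]
    print(region_length(header))          # length implied by the header coordinates
    print(len(clean_sequence(excerpt)))   # number of bases kept from the excerpt
```

Comparing the length of the cleaned sequence with the length implied by the header is a simple sanity check that no sequence lines were lost or altered during extraction.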
Chapter 6
Discussion
From the earliest, primitive uses of fungi to modern applications involving synthetic biology and precise
genome editing, fungal research has made huge strides, and the future promises even more as improved
genome sequencing is coupled with new bioinformatic tools and automated high-throughput screening.
In this thesis, I have described a method for heterologous expression of fungal genes in Aspergillus
nidulans. The strengths of this system formed the basis for the work in Chapters 3, 4 and 5. The
gene-to-product time is only a few weeks, allowing for fast screening of genetic constructs. A clean background
and relatively high yields also made product identification straightforward. The most time-consuming
step is often fusion PCR. Large constructs (>7 kb), where multiple fragments must be fused, have low
yields and take a long time to run, making the necessary optimization arduous. Other approaches, such as
Gibson assembly, are currently being explored as an alternative, though low yields can still be an
issue there. In this thesis, yield optimization was not an objective and was not necessary for product
characterization. However, if any of these compounds or future derivatives finds an application that requires
large-scale production, yield optimization will involve media composition, growth and post-induction
temperature, culture duration, shaking speed, media/overhead ratio and pH, to name a few.
More sophisticated optimization could involve different promoters to tune relative expression levels,
as well as directing the cellular localization and protein-protein interactions of the individual enzymes
involved in product formation. Additionally, structural studies of the hybrid enzymes described in Chapters
3 and 4 could lead to further improvement at the amino acid level, which would also make future hybrid
enzymes easier to assemble.
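To illustrate how quickly the number of cultures grows when several of the yield parameters mentioned above are varied together, the following Python sketch enumerates a full factorial screen. The factor names and levels are hypothetical values chosen only for illustration; none of them were tested in this thesis.

```python
from itertools import product

# Hypothetical factor levels, for illustration only (not taken from this work).
FACTORS = {
    "temperature_c": [25, 30, 37],
    "culture_days": [2, 3, 4],
    "shaking_rpm": [150, 200],
    "ph": [5.5, 6.5],
}


def full_factorial(factors):
    """Return one dict per combination of factor levels (full factorial design)."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


if __name__ == "__main__":
    conditions = full_factorial(FACTORS)
    print(len(conditions))   # 3 * 3 * 2 * 2 combinations
    print(conditions[0])     # first condition in the screen
```

Even this small grid requires 36 cultures per construct, which is why a fractional or sequential design may be preferable to a full factorial screen in practice.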
In Chapter 5 it was shown that natural products from a pathogenic fungus could be obtained safely by
transferring the entire biosynthetic gene cluster to a heterologous host. However, further genetic
manipulation in the host proved problematic for unknown reasons. In theory, manipulating RiPP
pathways could yield a wealth of novel, highly modified peptides. Such pathways are ideally suited to
high-throughput applications in which a synthetic DNA library encodes many precursors that feed into the pathway.
Despite this promise, most efforts involving fungal RiPPs focus on identifying new ones, since only a
handful are currently known and many more potentially remain to be discovered. Hopefully, when that well runs
dry, efforts will shift towards synthetic biology approaches to fungal RiPPs.
Synthetic biology in filamentous fungi will see tremendous progress in the years to come. The enormous
variety of species and products is becoming easier and faster to manipulate. The contributions of fungi to
green chemistry will impact the world, not only because fungi can achieve unique chemistry but also because
they avoid the toxic chemicals used in organic chemistry. The ultimate goal of fungal synthetic biology,
fully manipulable filamentous fungi that can make any desired organic molecule, gets closer every
day.
Asset Metadata
Creator
van Dijk, Johannes (author)
Core Title
Genome engineering of filamentous fungi for efficient novel molecule production
School
School of Pharmacy
Degree
Doctor of Philosophy
Degree Program
Molecular Pharmacology and Toxicology
Publication Date
10/18/2018
Defense Date
05/22/2018
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Aspergillus, Fungi, genome engineering, heterologous expression, secondary metabolism
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Wang, Clay (committee chair), Haworth, Ian (committee member), Xie, Jianming (committee member)
Creator Email
janwillemvandijk1984@gmail.com,vandijk@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-82928
Unique identifier
UC11675435
Identifier
etd-vanDijkJoh-6876.pdf (filename),usctheses-c89-82928 (legacy record id)
Dmrecord
82928
Document Type
Dissertation
Rights
van Dijk, Johannes
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA