ECOLOGICALLY RESPONSIBLE DOMESTICATION OF KELP
FACILITATED BY GENOMIC TOOLS
by
Kelly Jacqueline DeWeese
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MOLECULAR BIOLOGY)
December 2024
Copyright 2024 Kelly Jacqueline DeWeese
Dedication
To my moms.
Acknowledgements
I would like to thank University of Connecticut researchers Simona Augyte, Yaoguang Li, Schery
Umanzor, and Michael Marty-Rivera for establishing and culturing the Saccharina latissima
germplasm for sequencing. I would also like to thank University of Southern California researchers
Gary Molano, Melisa Osborne, José Diesel, Maxim Kovalev, Jordan Chancellor, Maddelyn
Harden, Makayla Morton, and Dylan Wallis for their contributions to figure design and validation.
I would be remiss not to mention the other students who mentored and influenced my PhD career
in USC’s Department of Molecular and Computational Biology: Sara Keeble, Katie Orban, Celja
Uebel, Rachel Schell, and Caleb Ghione. And of course, all my remaining gratitude goes to the
advisors who have played pivotal roles in my research, including Scott Lindell of the Woods Hole
Oceanographic Institution and Professor Charles Yarish of the University of Connecticut, who
graciously invited me to analyze a treasure trove of Saccharina latissima data; Professor Matt
Dean of USC, who introduced me to the love of my life and bane of my existence, RStudio; and
last but certainly not least, my committee chair, Professor Sergey Nuzhdin, who reminds me every
day that we do science not only because we love it, but because it matters.
TABLE OF CONTENTS
Dedication...................................................................................................................................... ii
Acknowledgements.......................................................................................................................iii
List of Tables................................................................................................................................ vii
List of Figures.............................................................................................................................viii
Abstract ......................................................................................................................................... xi
Introduction.................................................................................................................................... 1
Chapter 1: Understanding the metabolome and metagenome as extended phenotypes: The next
frontier in macroalgae domestication and improvement ............................................................... 5
1.1 Abstract....................................................................................................................... 5
1.2 Introduction ................................................................................................................ 6
1.2.1 Brief review of marine macroalgae aquaculture ............................................... 6
1.2.2 Modern farming in the U.S. and targeted breeding........................................... 8
1.3 Application of metabolomics to macroalga domestication ...................................... 11
1.3.1 A need to adapt: Global warming and resilient organisms.............................. 12
1.3.2 The metabolome of marine macroalgae .......................................................... 14
1.3.3 Metabolomic analysis through a structural equation-modeling framework.... 14
1.3.4 The macroalgae holobiont: combining metabolomics and metagenomics to
probe host-symbiont interactions............................................................................ 16
1.4 Application of metagenomics to macroalga domestication...................................... 18
1.4.1 Understanding the microbiome ....................................................................... 18
1.4.2 Using metagenomic data to understand community assembly patterns of the
microbiome ............................................................................................................. 22
1.4.3 Identifying beneficial bacteria using metagenomics and physiological crop
traits......................................................................................................................... 23
1.4.4 Characterizing impact of host genotype on microbial community ................. 24
1.4.5 Technical challenges of analyzing host genome impact on microbiome ........ 25
1.5 Conclusion................................................................................................................ 28
1.6 Author contributions................................................................................................. 30
Chapter 2: Scaffolded and annotated nuclear and organelle genomes of the North American
brown alga Saccharina latissima................................................................................................. 31
2.1 Abstract..................................................................................................................... 31
2.2 Introduction .............................................................................................................. 32
2.3 Materials and Methods............................................................................................. 35
2.3.1 Sample collection ............................................................................................ 35
2.3.2 DNA extraction and sequencing...................................................................... 35
2.3.3 Nuclear genome assembly and decontamination ............................................ 36
2.3.4 Hi-C scaffolding and polishing ....................................................................... 37
2.3.5 Gene annotation............................................................................................... 38
2.3.6 Comparative genomic analyses....................................................................... 39
2.3.7 NCBI decontamination.................................................................................... 41
2.3.8 Organelle genome assemblies......................................................................... 41
2.4 Results...................................................................................................................... 42
2.4.1 Nuclear genome assembly............................................................................... 42
2.4.2 General statistics ............................................................................................. 42
2.4.3 Contiguity and chromosome number.............................................................. 45
2.4.4 Gene content.................................................................................................... 46
2.4.5 Synteny analysis.............................................................................................. 47
2.4.5.1 Interspecies whole genome alignment ................................................ 47
2.4.5.2 Reference-based scaffold ordering...................................................... 49
2.4.6 Organelle genomes.......................................................................................... 50
2.5 Discussion ................................................................................................................ 51
2.5.1 Genomic tools in breeding and agriculture ..................................................... 51
2.5.2 Enhanced sugar kelp breeding with genomic tools......................................... 51
2.5.3 Identification of homologous chromosomes in multiple species undergoing
domestication .......................................................................................................... 52
2.5.4 Improved organelle genome resources............................................................ 53
2.5.5 Future Directions............................................................................................. 54
2.6 Funding..................................................................................................................... 54
2.7 Author contributions................................................................................................. 55
Chapter 3: Genetic variant annotation toward the detection of naturally non-reproductive kelp
genotypes in germplasm .............................................................................................................. 56
3.1 Abstract..................................................................................................................... 56
3.2 Introduction .............................................................................................................. 57
3.3 Methods.................................................................................................................... 62
3.3.1 Sampling, culture, and DNA sequencing ........................................................ 62
3.3.2 RNA sequencing and differential expression analysis.................................... 63
3.3.3 Detection of conserved reproductive genes with protein homology............... 63
3.3.4 Variant calling.................................................................................................. 64
3.3.5 Functional variant annotation.......................................................................... 65
3.4 Results...................................................................................................................... 65
3.4.1 Curation of reproductive gene set ................................................................... 65
3.4.1.1 Differential expression analysis between tissue types ........................ 65
3.4.1.2 Homology and annotation search........................................................ 66
3.4.1.3 Defining putatively reproductive genes.............................................. 67
3.4.2 Candidate genetic variants .............................................................................. 67
3.4.3 Test crosses homozygous for candidate variants............................................. 68
3.4.3.1 Variant and genotype scoring and ranking .......................................... 68
3.4.3.2 Preliminary phenotypic results............................................................ 69
3.5 Discussion ................................................................................................................ 71
3.6 Funding..................................................................................................................... 73
3.7 Author contributions................................................................................................. 73
Conclusion ................................................................................................................................... 74
References.................................................................................................................................... 75
Appendix A: Supplementary Figures......................................................................................... 107
Appendix B: Supplementary Tables........................................................................................... 114
List of Tables
Table 2.1. Comparison of Saccharina latissima nuclear genome assembly statistics to those
of related brown macroalgal species. *Excludes artificial chromosomes. ................................... 44
Table 2.2. Comparison of Saccharina latissima organelle genome versions on assembly and
gene annotation statistics. ............................................................................................................. 50
Supplementary Table 1. Genomic libraries included in the Saccharina latissima nuclear
genome assembly and their respective assembled sequence coverage levels in the final
release. *Average read length of PacBio reads. ........................................................................... 114
Supplementary Table 2. PacBio library statistics for the libraries included in the Saccharina
latissima nuclear genome assembly and their respective assembled sequence coverage
levels. .......................................................................................................................................... 115
Supplementary Table 3. Summary statistics of the initial output of the RACON polished
HiFiAsm+HIC Saccharina latissima nuclear genome assembly. The table shows total
contigs and total assembled basepairs for each set of scaffolds greater than the size listed in
the left-hand column. .................................................................................................................. 116
Supplementary Table 4. Final summary assembly statistics for the version 1.0 Saccharina
latissima nuclear genome assembly............................................................................................ 117
Supplementary Table 5. Summary statistics of S. latissima whole genome alignments per
homologous chromosome in S. japonica, M. pyrifera, U. pinnatifida, and Ectocarpus sp.
Ec32. ........................................................................................................................................... 118
List of Figures
Figure 0.1. Haplodiplontic life history observed in species of brown macroalgae (kelp, order
Laminariales) highlighting stages at which meiosis, differentiation, and fertilization occur.
Red arrows indicate the stage at which gametophytes can be arrested before gamete
production in clonally propagated cultures under red wavelength light. Figure modified from
North (1987).................................................................................................................................... 3
Figure 1.1. Domestication of marine macroalgae species. Many wild marine macroalgae
populations settle on rocky substrates, providing important habitats for hundreds of marine
species. There is much variation in rate of growth and chemical composition amongst wild
macroalgae populations. Similar to agricultural crops, this natural phenotypic variation is
reduced in domesticated macroalgae cultivars as the phenotype is optimized............................. 10
Figure 1.2. Use of omics techniques to investigate temperate kelp adaptation to climate
change. (A) Kelps grown in cool water (blue, high [nitrate]) are expected to perform better
overall than kelps grown in warmer water (red, low [nitrate]). The blue star denotes the
hypothetical fittest cool-water individuals, while the red star is fittest in warm-water. (B-D)
Analyses of individual kelps in setup of A: (B) Population genetics analysis of SNPs
segregating between fittest kelps in cool vs. warm water. (C) Differential expression
analysis between fittest individuals in cool vs. warm water. (D) Identification of pathways
involved in kelp adaptation to warming oceans with genetic, transcriptomic and
metabolomic data.......................................................................................................................... 13
Figure 1.3. Overview of omics techniques currently used and proposed for use in
macroalgae domestication and improvement. Proposed schema for the application of DNA
and RNA sequencing, metagenomics, and metabolomics to efficiently improve and domesticate macroalgae,
emphasizing how each analysis can complement and build upon others. The information
gained through performing metagenomic and metabolomic analyses on top of conventional
sequencing analyses—including the identification of potential microbial inoculants for
macroalgae cultivars, essential nutrients for macroalgae growth, and the mechanism by
which genetic variants of macroalgae species produce desirable phenotypes—makes a case
for the increased collection and utilization of these important datasets and analyses.................. 29
Figure 2.1. Log10-scaled scaffold size distribution for five brown macroalgal species:
Ectocarpus sp. Ec32 (red), Undaria pinnatifida (yellow), Macrocystis pyrifera (green),
Saccharina japonica (blue), and Saccharina latissima (purple). ................................................. 43
Figure 2.2. The number of Ns (unknown bases) per 100 kbp (gray bars) is compared for
each genome assembly. BUSCO scoring using the Stramenopiles ortholog database
(ODB10, n=100) shows relative counts of complete (blue), fragmented (yellow), and
missing (red) orthologs in each assembly. Phylogenetic tree showing relatedness of
compared species pruned from Starko et al. (2019). .................................................................... 46
Figure 2.3. (A) Venn diagram highlights overlaps of unique Saccharina latissima homologs
mapping to genomes of four related species. (B) Heatmap shows the maximal syntenic
match of 1,110 S. latissima scaffolds and contigs to chromosomes of related species. ............... 48
Figure 2.4. (A) Size distribution (log10 bp scale) of our v1.0 Saccharina latissima assembly
before and after synteny-based rescaffolding shows 267 scaffolds (light blue) incorporated
into 38 pseudochromosomes (dark blue). (B) Gene orthology between 32 S. latissima
pseudochromosomes (dark green) and 38 Undaria pinnatifida chromosomes and contigs
(light green) was mapped using 3 Mbp windows containing 10 single-copy orthologs, with
band colors denoting highest synteny (gray), ortholog rearrangement (red) and chromosomal
splitting or fusion (purple). ........................................................................................................... 49
Figure 3.1. The schematic demonstrates how the haplodiplontic kelp life history (A) can be
linearized (B) by selecting and crossing haploid gametophytes possessing deleterious alleles
that cause a non-reproductive phenotype in the resulting sporophyte.......................................... 61
Figure 3.2. Annotated volcano plot of differential gene expression between sorus tissue
versus pooled non-reproductive tissues of a Saccharina latissima sporophyte, showing log2
fold change (LFC) on the x-axis and –log10 p-value on the y-axis. Dotted lines intersect the
x-axis at LFC = ±2 and the y-axis at p = 1 × 10⁻⁵. ........................................................................ 66
Figure 3.3. Distributions of counts of high impact alternate alleles in 278 Saccharina
latissima genotypes (A) across all genes (n_alleles = 26,099) and (B) in genic regions of
putatively reproductive genes (n_alleles = 144). (C) Distribution of counts of female and male
genotype pairs that share x high impact alleles. The red line intersects the x-axis at the mean
of 288 shared alleles...................................................................................................................... 68
Figure 3.4. Parent genotypes of Saccharina latissima sporophyte crosses expressing non-reproductive
phenotype in farm setting. Genotypes highlighted in cyan were predicted to
produce non-reproductive sporophytes and proposed in the test cross plan................................. 70
Figure 3.5. (A) Bars represent log2 fold change (LFC) in gene expression between
reproductive versus non-reproductive tissues for each putatively reproductive gene with a
variant in the test-crossed genotypes, and are colored by tissue expression bias (i.e.,
reproductive bias = LFC > 0). Boxes to the left of gene IDs on the y-axis are gene
annotations or top BLASTp hit of a protein translated from the gene. (B) For each class of
predicted variant effect, a boxplot shows the distribution in the proportion of crosses
possessing a given variant that have been phenotyped as non-reproductive................................ 71
Supplementary Figure 1. The number of Ns (unknown bases) per 100 kbp (gray bars) is
compared for each genome assembly. BUSCO scoring using the Eukaryota ortholog
database (ODB10, n=255) shows relative counts of complete (blue), fragmented (yellow),
and missing (red) orthologs in each assembly. Phylogenetic tree showing relatedness of
compared species pruned from Starko et al. (2019). .................................................................. 107
Supplementary Figure 2. Regression fits of total lengths (A) and alignment lengths (B) of
Saccharina latissima scaffolds and contigs versus homologs in Saccharina japonica (red),
Macrocystis pyrifera (green), Undaria pinnatifida (blue), and Ectocarpus sp. Ec32 (purple).
Excludes artificial chromosomes. (C) Distributions of n S. latissima scaffolds aligned to
each reference chromosome. Outliers are marked with triangles............................................... 108
Supplementary Figure 3. Arrangement of v1.0 assembly contigs (light blue “input” bars)
onto Ragout pseudochromosomes. ............................................................................................. 109
Supplementary Figure 4. Rescaffolding results on S. latissima assembly v1.0 with varying
parameters to Ragout: (A) no size filtering, chimeric scaffolds (red) allowed, (B) no size
filtering, unbroken input scaffolds, (C) no size filtering, unbroken input scaffolds, forced
HAL instead of MAF, (D) short contigs filtered out of comparison genomes, chimeric
scaffolds (red) allowed, (E) short contigs filtered out of comparison genomes, unbroken
input scaffolds............................................................................................................................. 110
Supplementary Figure 5. Gene annotation of (A) this Saccharina latissima chloroplast
genome assembly and (B) the previously published version...................................................... 111
Supplementary Figure 6. Gene annotation of (A) this Saccharina latissima mitochondrial
genome assembly and (B) the previously published version...................................................... 112
Supplementary Figure 7. Dotplots of (A) mitochondrion and (B) chloroplast genome
assemblies aligned to previous versions. Segments that align in the forward direction (+) are
blue, while segments that align along the complementary strand (-) are orange........................ 113
Abstract
As the climate changes and the human population grows, development of new sustainable fuel and
food sources is imperative. Globally, seaweed aquaculture directly supplies biomass to vital
industries including food, pharmaceuticals, and biofuels. The large, fast-growing brown
seaweeds known as kelps provide ecologically rich habitats and are attractive potential biomass
feedstocks. Genomics-assisted breeding can expedite biomass production and bring kelp
aquaculture to the level of modern crop agriculture. Rapid improvements in sequencing and
genomic tools have allowed breeding programs to adopt selection strategies informed by
genomic data and to apply them in non-model organisms. These tools can be further harnessed to
analyze the metabolism and microbiome of kelps, allowing for selection of optimal metabolite
composition. A review of metabolomic and metagenomic studies to improve agricultural crops
emphasizes the potential importance of this research in kelp breeding programs. The future of
genomic selection in kelp species depends on the availability of robust genomic resources, an effort
that is directly supported by the scaffolded and annotated North American sugar kelp (Saccharina
latissima) genome assembly reported here. We employed this crucial genomic resource to
functionally analyze genetic variants in sequenced kelp germplasm and investigate traits of
interest. In support of efforts to mitigate ecological concerns regarding gene flow from
aquaculture, we predicted putatively non-reproductive kelp genotypes. Developing kelp strains
that will not perturb marine ecosystems is critical to sustainable kelp aquaculture and can be
efficiently achieved with genomics-informed breeding programs sustained by this type of research.
Introduction
Selective breeding of plants and animals in agriculture over the last ten millennia facilitated the
establishment of modern human societies and has led to the development of an agricultural system
that today supports a world population of 8 billion (Purugganan and Fuller, 2009; UN DESA,
2024). In the 20th century, escalating food and resource demands led to the proliferation of
technologies that dramatically increased the efficiency and yield of agricultural crops in a period
known as the Green Revolution (Khush, 2001; Ortiz et al., 2007). However, current rates of yield
improvement in the most essential plant crops (e.g., cereal grains) are not predicted to adequately
support the human population by 2050 (Ray et al., 2013; Hickey et al., 2019). Diversifying our
production of food and other resources is crucial for sustaining humanity in the future and essential
for strengthening global industries against the impacts of climate change (Spillias et al., 2023;
FAO, 2024; Fricke et al., 2024).
Seaweeds have attracted worldwide attention as healthy, sustainable (Duarte et al., 2021)
sources of nutrition that offer an alternative to conventional agriculture and can expand overall
food security (Kim et al., 2017). Marine macroalgae, collectively termed seaweeds, are widely cultivated in
offshore ocean aquaculture farms. World seaweed production has more than tripled since 2000—
an increase of 244.5% by weight—and in 2022, the industry was worth an estimated US$17 billion
(FAO, 2024). As a direct food source, seaweeds provide nutritional benefits that include low-calorie
fiber and high concentrations of rare vitamins and minerals (Cherry et al., 2019; FAO,
2024). Each year, approximately 35 million tons of seaweed is commercially cultivated, of which
30–38% is produced for human consumption (FAO, 2024), while the remainder is utilized in
extractive applications or as marine feedstock (McHugh, 2003; Cai et al., 2021).
Large brown seaweeds known as kelps represent some of the major species cultivated
worldwide, including kombu (Saccharina japonica) and wakame (Undaria pinnatifida), which
make up 35.4% and 7.4% of all produced seaweed, respectively (Yamanaka and Akiyama, 1993;
Lindsey Zemke-White and Ohno, 1999; Cai et al., 2021; Cottier-Cook et al., 2023). Kelps (order
Laminariales) are fast-growing, habitat-forming seaweeds with broad ecological as well as
economic significance (Wernberg et al., 2019). Over 25% of the world’s coastline hosts kelp forests
that support taxonomically rich ecosystems across temperate and Arctic climates (Dayton, 1985;
Krumhansl et al., 2016). Despite their morphological similarity to plants, kelps belong to a eukaryotic
lineage that diverged from plants over 1,500 MYA and evolved multicellularity independently
(Yoon et al., 2004; Cock et al., 2010; Bringloe et al., 2020). A notable distinction between kelps
and plants that is significant to cultivation arises in their life histories. Many kelp species such as
U. pinnatifida and those of the genus Saccharina have heteromorphic life histories that feature a
free-living, microscopic haploid stage called a gametophyte that can both propagate vegetatively
in clonal culture and produce gametes (North, 1987; Liu et al., 2017) (Figure 0.1). Haplodiplontic
kelps are uniquely suited to efficient, controlled breeding programs that take advantage of the
gametophyte stage to isolate and select strains to establish a germplasm of kelp improved for
cultivation (Kim et al., 2017).
Figure 0.1. Haplodiplontic life history observed in species of brown macroalgae (kelp, order Laminariales)
highlighting stages at which meiosis, differentiation, and fertilization occur. Red arrows indicate the stage at which
gametophytes can be arrested before gamete production in clonally propagated cultures under red wavelength light.
Figure modified from North (1987).
The shift from wild harvest to aquaculture in seaweed production occurred within just the
last 100 years (Hwang et al., 2019). Consequently, seaweed aquaculture lacks the traditionally
improved strains that exist for almost all major terrestrial agricultural crops (Robinson et al., 2013;
Peteiro et al., 2016). Additionally, whereas modern terrestrial farms extensively modify and control
the biotic and abiotic factors that plants experience through irrigation, fertilization, and
pesticides, ocean ecosystems are generally prohibitively difficult to modify for crop yield
optimization (Alston and Tobutt, 1989; Camus et al., 2018). Following the initiation of commercial
seaweed mariculture in Asia in the 1940s, research and development to increase seaweed
production have focused on selective breeding to enhance growth and cultivate traits of interest in
farmed species (Kim et al., 2017; Camus et al., 2018; Goecke et al., 2020). To strengthen breeding
programs, genetic resources for several brown algal species have been generated during the last
two decades, including genomes for the model system Ectocarpus sp. (Cock et al., 2010; Cormier
et al., 2017) and the kelps Saccharina japonica (Ye et al., 2015; Fan et al., 2020a), Undaria
pinnatifida (Shan et al., 2020; Graf et al., 2021), and Macrocystis pyrifera (Diesel et al., 2023).
The work reported here builds upon these foundational genetic resources and delves into
three pivotal areas aimed at advancing the field of kelp domestication. First, we review the
potential to enhance kelps for aquaculture by optimizing metabolite profiles and cultivating
beneficial associated microbes. Second, in support of the goal to expand genetic resources for kelp
species, we provide a high-quality, annotated reference genome assembly for Saccharina latissima
(sugar kelp), an edible kelp found in North America that is a close relative of the most widely
cultivated kelp species, S. japonica. Our analysis highlights shared genomic architecture between
several kelp species relevant to aquaculture. Third, we employed this new genome and gene
annotation information to efficiently detect naturally non-reproductive kelp genotypes within an
S. latissima germplasm collection by predicting genetic variants that putatively cause sterility, an
important milestone in the establishment of ecologically responsible kelp cultivars. Taken together,
these studies outline critical research for breeding programs to support sustainable kelp aquaculture
worldwide.
Chapter 1: Understanding the metabolome and metagenome as
extended phenotypes: The next frontier in macroalgae
domestication and improvement
Kelly J. DeWeese¹, Melisa G. Osborne¹
¹ Department of Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, United States
1.1 Abstract
“Omics” techniques (including genomics, transcriptomics, metabolomics, proteomics, and
metagenomics) have been employed with huge success in the improvement of agricultural crops.
As marine aquaculture of macroalgae expands globally, biologists are working to domesticate
species of macroalgae by applying these techniques tested in agriculture to wild macroalgae
species. Metabolomics has revealed metabolites and pathways that influence agriculturally
relevant traits in crops, allowing for informed crop crossing schemes and genomic improvement
strategies; the same approach could be pivotal in guiding selection during macroalgae domestication. Advances in
metagenomics have improved understanding of host-symbiont interactions and the potential for
microbial organisms to improve crop outcomes. There is much room in the field of macroalgal
biology for further research toward improvement of macroalgae cultivars in aquaculture using
metabolomic and metagenomic analyses. To this end, this review discusses the application and
necessary expansion of the omics tool kit for macroalgae domestication as we move to enhance
seaweed farming worldwide.
1.2 Introduction
The farming of marine macroalgae (seaweed) contributes to the international food, cosmetic,
science and pharmaceutical industries (Buschmann et al., 2017). Aquaculture of macroalgae has
historically been almost exclusively an Asian economic enterprise—over 99% of annual
production is based in Asia (Ferdouse et al., 2018). With increased globalization and the search for
ecofriendly alternatives in the food and biofuel industries, aquaculture has grown outside of Asia
as well. Recently, Europe and the Americas have seen an emergence of aquaculture farms,
companies and research facilities specializing in algae (Kim et al., 2017). Over the last two
decades, world production of marine macroalgae has more than tripled, up from 10.6 million tons
in 2000 to 32.4 million tons in 2018 (FAO, 2020). Though there are hundreds of known marine
macroalgae species, 81% of algae production is comprised of only a handful of brown and red
macroalgae species (FAO, 2018). Aquaculture, particularly that of marine macroalgae, has much
potential to expand globally, onto different coastlines and harnessing new species. Furthermore,
there are large economic and environmental gaps to fill, specifically in the realms of biofuel and
food production (Pereira and Yarish, 2008; Duarte et al., 2017; Guzinski et al., 2018; Sudhakar et
al., 2018). Macroalgae are especially attractive as food and biofuel crops because they do
not compete with agriculture for land or freshwater resources (Kim et al., 2019b). Given the
growing pressure that climate change places on agriculture, aquaculture has quickly gained global
attention, and many species of macroalgae are positioned to become significant marine crops.
1.2.1 Brief review of marine macroalgae aquaculture
Seaweed farming, and its impact on aquaculture, has evolved over hundreds of years (Pereira and
Yarish, 2008). While it began as a practice of coastal harvesting, the increasing use and value of
seaweed in human food products has led to more direct farming of seaweed as a crop. In 2018,
farmed seaweeds represented 97.1 percent by volume of the total of 32.4 million tons of wild-collected and cultivated aquatic algae combined (FAO, 2020). Although global production of
farmed seaweed has experienced relatively slow growth in recent years, a renewed interest in
sustainable foods, feeds, phycocolloids, and biofuel production has ignited groups in academia and
industry to focus efforts on developing several species of macroalgae as commercial crops (Kim
et al., 2019b). In fact, over the last 30 years several U.S. federal agencies, including the Department
of Energy (DOE), Department of Agriculture (USDA) and Department of Commerce’s National
Oceanic and Atmospheric Administration (NOAA) have invested nearly one billion dollars into
developing the U.S. as a leader in aquaculture (Love et al., 2017; Kim et al., 2019b).
Seaweeds are extractive crops, meaning that they benefit the environment by removing
waste materials such as nitrogen, phosphorus, and carbon (Kim et al., 2015; FAO, 2018). High
levels of these materials can have negative consequences on coastal ecosystems by triggering, for
example, harmful microalgae blooms. However, these materials can be absorbed by several species
of seaweed and are in fact important nutrients to support proper growth and development (Rose et
al., 2015; Buck et al., 2017). In addition to being an environmentally friendly crop, seaweeds are
increasingly being recognized for their abundance of nutrients including, but not limited to, iodine,
iron, and vitamin A (Tanna and Mishra, 2019; FAO, 2020). Seaweeds contain micronutrient
minerals (e.g., iron, calcium, iodine, potassium and selenium) and vitamins (particularly A, C and
B-12) and are the only non-fish sources of natural omega-3 long-chain fatty acids (FAO, 2020).
Beyond their applications in the human food industry, seaweeds have a diverse range of
commercial applications such as: additives in feeds, fertilizers, and cosmetics; production of
alginate, agar, and carrageenan; pharmaceuticals; and biofuels (Pereira and Yarish, 2008; Kim and
Venkatesan, 2015; Buschmann et al., 2017; Kim et al., 2017).
1.2.2 Modern farming in the U.S. and targeted breeding
In the fifty years between 1950 and 2000, the human population more than doubled, increasing in
an unprecedented manner from 2.5 billion to 6 billion (Bongaarts, 2009). At the head of the
population boom, the 1960s were fraught with concern about the ratio of food to population
(Khush, 2001). This sentiment contrasts starkly with our reality today, where we struggle to
minimize food waste in developed countries (Walia and Sanders, 2019). Agriculture was able to
keep pace with the human population with its own productivity boom now called the “green
revolution,” a period in which advanced crop breeding techniques were developed and crops were
first genetically modified to increase yields (Khush, 2001). Since the green revolution changed the
face of agriculture, entire fields of science have developed to further investigate the genes, variants,
expression profiles, and metabolites involved in creating high-yield, superior crops.
The past two decades have seen genomics and other omics disciplines begin to dominate
the study of agriculture. Many significant agricultural crops have had their genomes sequenced,
allowing scientists to gain an understanding of the gene networks and pathways that determine
crop success for targeted breeding approaches (Van Emon, 2016). Modern genomics techniques
for improving agricultural crops include quantitative trait loci (QTL) mapping and genome-wide
association studies (GWASs) (Huang and Han, 2014). QTL mapping identifies genetic loci that
cause or contribute to specific phenotypes in crops, while GWAS identifies single-nucleotide
polymorphisms (SNPs) associated with traits of interest (Murray et al., 2008b; Huang et al., 2015).
QTL mapping has been applied to crop improvement in agricultural breeding programs through
marker-assisted selection (MAS), which utilizes polymorphic regions linked to specific alleles to
detect and select for desirable alleles and traits in crop cultivars (Collard and Mackill, 2008).
However, MAS is best able to detect high-effect genes or monogenic traits (Xu and Crouch, 2008;
Heffner et al., 2009). The limitations of MAS can be overcome in part using advanced genetic
methods that incorporate GWAS models into breeding, in a process known as genomic selection
(Goecke et al., 2020). In genomic selection, SNPs are used as markers across the entire genome to
estimate breeding values, a technique that has far higher resolution than MAS, can help to detangle
tightly linked traits, and is starting to model polygenic traits (Heffner et al., 2009). Microarray and
RNA sequencing data, used to build transcriptomes and expression profiles as well as to conduct
differential expression analyses, have also been instrumental in identifying significant genes and
variants in agricultural crops (Ashikari et al., 2005; Van Emon, 2016). To review the use of
molecular markers and genomic breeding programs in seaweed aquaculture, we refer the reader to
Yong et al. (2016) and Goecke et al. (2020). For a more recent review of genomic approaches for
improving agriculture more broadly, we also point the reader to the review by Bohra et al. (2020).
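To make the marker-trait association idea concrete, the following is a minimal Python sketch of a single-marker scan, the simplest form of the association testing that underlies GWAS. It regresses a trait on allele dosage (0, 1, or 2 copies of the alternate allele) for each SNP and ranks SNPs by the fraction of trait variance explained. The SNP names, dosages, and blade lengths are invented for illustration, and the sketch omits the corrections for population structure and relatedness that real analyses require.

```python
from statistics import mean


def marker_association(dosages, phenotypes):
    """Simple linear regression of a trait on allele dosage.

    Returns (slope, r_squared). A monomorphic marker or a constant
    trait carries no information and returns (0.0, 0.0).
    """
    d_mean, p_mean = mean(dosages), mean(phenotypes)
    sxx = sum((d - d_mean) ** 2 for d in dosages)
    syy = sum((p - p_mean) ** 2 for p in phenotypes)
    sxy = sum((d - d_mean) * (p - p_mean) for d, p in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0
    return sxy / sxx, (sxy ** 2) / (sxx * syy)


def scan_markers(snp_table, phenotypes):
    """Test every SNP and rank by variance explained (largest first)."""
    results = {snp: marker_association(d, phenotypes) for snp, d in snp_table.items()}
    return sorted(results.items(), key=lambda item: item[1][1], reverse=True)


# Hypothetical data: six individuals, three SNPs, one trait.
toy_snps = {
    "snp_a": [0, 1, 2, 0, 1, 2],  # dosage tracks the trait
    "snp_b": [0, 0, 0, 1, 1, 1],  # unrelated to the trait
    "snp_c": [0, 0, 0, 0, 0, 0],  # monomorphic, uninformative
}
toy_blade_length_cm = [10.0, 12.0, 14.0, 10.0, 12.0, 14.0]

ranked = scan_markers(toy_snps, toy_blade_length_cm)
for snp, (slope, r_squared) in ranked:
    print(snp, round(slope, 3), round(r_squared, 3))
```

Genomic selection builds on the same genotype and phenotype inputs but fits all markers jointly rather than one at a time, which is why it is better suited to polygenic traits than a per-marker scan like this one.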
Even more recently, metabolomics and metagenomics data have been incorporated into
analyses for crop improvement (Dixon et al., 2006; Van Emon, 2016; Walker et al., 2016; Lin et
al., 2018; Gastauer et al., 2019; O’Donnell et al., 2020). Metabolomics provides insight into
pathways involved in growth and other desirable phenotypes, while metagenomics reveals
associated microbial species and putative functions that they could be performing for their crop
hosts such as nitrogen fixation.
Although humans have utilized seaweed as a food source for hundreds of years, its presence
in Western diets is concentrated in scattered coastal areas including Maine, Hawaii, and Alaska
(Hunter, 1975; Mouritsen et al., 2013; Kim et al., 2019b). While the vast majority of global
seaweed production is grown in Asia (FAO, 2018), aquaculture is one of the fastest growing
maritime industries in the U.S., particularly around New England (Kim et al., 2019b). With the
growing accessibility of and technology for multi-omics applications (O’Donnell et al., 2020), this
presents a timely opportunity for domestication and improvement of relevant seaweed species
through omics.
Figure 1.1. Domestication of marine macroalgae species. Many wild marine macroalgae populations settle on rocky
substrates, providing important habitats for hundreds of marine species. There is much variation in rate of growth
and chemical composition amongst wild macroalgae populations. Similar to agricultural crops, this natural
phenotypic variation is reduced in domesticated macroalgae cultivars as the phenotype is optimized.
Despite recent growth and investment in the U.S. aquaculture industry, several challenges
still exist. Mainly, securing permits for coastal domestication is a significant bottleneck for those
wishing to enter the industry (Kim et al., 2019b). Offshore farming is considered a more suitable
route for aquaculture as it presents fewer permitting conflicts (Cicin-Sain et al., 2001; Buck et al.,
2004; Tiller et al., 2013; Kim et al., 2019b). However, offshore farms are recognized as more
stressful environments for macroalgae due to stronger currents, limited nutrients, and cooler
temperatures (Solan and Whiteley, 2016; Charrier et al., 2018; Roleda and Hurd, 2019).
Breeding programs can and have been developed to improve the productivity of seaweeds
in certain environments or tailor crops to enhance specific characteristics of interest (Figure 1.1)
(Zhang et al., 2016, 2018; Walker, 2018; Hwang et al., 2019). Species of seaweeds targeted for this
type of work include, but are not limited to: Saccharina, Laminaria, Porphyra, Pyropia, and
Macrocystis (Kim et al., 2019b). For example, in Saccharina japonica, QTL mapping has been
used to identify QTLs responsible for higher yield by increasing blade size (Wang et al., 2018),
and in Saccharina latissima QTLs for stipe length were similarly identified (Mao et al., 2020).
Macroalgal targeted breeding programs can be enhanced by more extensive incorporation of recent
advances in multi-omics technology. This review makes the case for the inclusion of metabolomic
and metagenomic data and analyses in continued efforts to domesticate and improve marine
macroalgae for aquaculture.
1.3 Application of metabolomics to macroalga domestication
Implicitly or explicitly, the metabolome—the suite of biological molecules present in a cell, tissue
or organism—plays a large role in the domestication and improvement of crops (Dixon et al., 2006;
Burgess et al., 2014). Selecting for certain desired phenotypes is a necessary step in domestication
to increase yield and cultivate optimal morphological characteristics. Investigating the genetics of
cultivated crops has allowed the field of agriculture to create and employ improved crop varieties
that have higher yield (Murray et al., 2008a; Wang et al., 2018) and better nutrient profiles or flavor
(Gao et al., 2014), and to determine alleles involved in heterosis, allowing for the creation of
crossing schemes in which offspring far outperform their parents (Huang et
al., 2015). When combined with genomics and transcriptomics, metabolomics has further
benefited agricultural plant domestication and improvement research in many areas including:
pesticide resistance (Aliferis and Chrysayi-Tokousbalides, 2011); microbial associations (Han and
Micallef, 2016); crop flavor and nutrition profiles (Hall et al., 2008; Fernie and Schauer, 2009);
stress and heat tolerance (Sung et al., 2015; Che-Othman et al., 2020); and growth and yield (Obata
et al., 2015).
1.3.1 A need to adapt: Global warming and resilient organisms
The projected global demand for food, livestock feeds and bio-energy by 2050 will force the
increased farming of low-carbon and carbon-sequestering marine resources, such as kelp and
shellfish—especially in the face of large-scale environmental changes in climate and land use
(Waite et al., 2014; Froehlich et al., 2018; Kim et al., 2019b). The application of the suite of modern
omics techniques, specifically genomics, transcriptomics and metabolomics, to study and improve
macroalgae is not only relevant to aquaculture seaweed production for human use, however; native
macroalgae species are often habitat-forming and are integral species in coastal ecosystems (Teagle
et al., 2017). Wild temperate kelps have been increasingly threatened by the effects of global
warming. While globally marine macroalgae populations have been steadier than expected, when
essential, habitat-forming macroalgae are displaced or lost in an environment, it can be extremely
detrimental to the marine ecosystem (Krumhansl et al., 2016); the loss of M. pyrifera beds on
Australian coastlines is one example (Mabin et al., 2019).
Figure 1.2. Use of omics techniques to investigate temperate kelp adaptation to climate change. (A) Kelps grown in
cool water (blue, high [nitrate]) are expected to perform better overall than kelps grown in warmer water (red, low
[nitrate]). The blue star denotes the hypothetical fittest cool-water individuals, while the red star denotes the fittest warm-water individuals. (B-D) Analyses of individual kelps in setup of A: (B) Population genetics analysis of SNPs segregating
between fittest kelps in cool vs. warm water. (C) Differential expression analysis between fittest individuals in cool
vs. warm water. (D) Identification of pathways involved in kelp adaptation to warming oceans with genetic,
transcriptomic and metabolomic data.
Several metabolomic studies to identify climate change-related responses have been
conducted in plant species (Ahuja et al., 2010; Arbona et al., 2013; Peñuelas et al., 2013; Park et
al., 2014). Recently, the use of metabolomics alongside other functional genomics techniques to
identify metabolic responses in climate change-tolerant species and to construct conservation plans has been
frequently suggested (Ouborg et al., 2010; De Ollas et al., 2019; Rilov et al., 2019). Climate
change-resilient marine macroalgae (e.g., heat-, acidification-, low nutrient-tolerant, etc.) can be
identified with analogous metabolomic studies to those proposed for terrestrial plants (Figure 1.2).
The future of wild marine macroalgae populations as well as domesticated macroalgae cultivars
will depend upon biologically informed conservation and improvement efforts, which can be made
more effective with metabolomic analyses.
1.3.2 The metabolome of marine macroalgae
Metabolite profiling and analysis has a long history in marine macroalgae (Gupta et al., 2014), but
has focused on metabolites that are bioactive, pharmaceutically relevant compounds (Davis and
Vasanthi, 2011; Greff et al., 2017), seasonal variation (Surget et al., 2017), delineating biochemical
differences of green, red and brown algae (Belghit et al., 2017), profiling during reproductive
fragmentation (He et al., 2019) and stress, defense and environmental responses (La Barre et al.,
2004; Ritter et al., 2014; Gaubert et al., 2019, 2020). Domestication- and improvement-based
metabolomic studies in macroalgae are sparse, even for model organisms.
A promising step forward in metabolomics of marine macroalgae was the development of
a metabolomic “atlas” for brown macroalgae, called EctoGEM, generated through a genome-scale
analysis of the metabolome of the model organism Ectocarpus siliculosus (Prigent et al., 2014).
The state of omics research in marine macroalgae is advancing rapidly (Cock et al., 2010; Michel
et al., 2010; Collén et al., 2013; Ritter et al., 2014; Liu et al., 2019), and is rising to meet more
advanced functional genomics techniques to improve and optimize macroalgal species for
aquaculture applications.
1.3.3 Metabolomic analysis through a structural equation-modeling framework
Metabolomic datasets are constructed using mass spectrometry (MS) and nuclear magnetic
resonance (NMR) on tissues of interest (Zacharias et al., 2018; Emwas et al., 2019). The specific
classes of primary and secondary metabolites captured by varying MS and NMR analyses have
been reviewed extensively elsewhere (Gupta et al., 2014; Emwas, 2015; Kumar et al., 2016;
Bingol, 2018; Emwas et al., 2019; Nemkov et al., 2019). Metabolites are the intermediate and final
molecules modified and consumed by protein activity. The type and abundance of metabolites
present are intimately tied to gene expression (Burgess et al., 2014; Nelson and Cox, 2017).
In an effort to investigate this relationship between metabolic flow and gene expression,
metabolomic datasets are often complemented with RNA sequencing and phenotypic data (Fernie
and Schauer, 2009). To model metabolic pathways with metabolomic and gene expression data,
structural equation modeling (SEM), also known as confirmatory factor analysis, is a powerful
tool. Modeling the metabolism provides a deeper understanding of the underlying pathways
determining phenotypes of interest, and how specific genetic variants perturb these metabolic
pathways (Karns et al., 2013). SEM is a supervised approach based on geneticist Sewall Wright’s
path analysis, where the order and direction of the relationships between genes and metabolites are
intrinsic parts of the structural model (Wright, 1918, 1921; Igolkina and Samsonova, 2018).
Previously, SEM has been used to model gene regulatory networks (GRNs) in both animals
(Fear et al., 2015) and plants (Igolkina et al., 2019). Recent applications of SEM include
annotations of underlying pathways for biomass development in rice (Momen et al., 2019), grain
yield in wheat (Vargas et al., 2007), and BMI in humans (Kaakinen et al., 2010). SEM has also
been used to construct local GRNs based on patterns of differential gene expression (Aburatani,
2012; Tarone et al., 2012). The SEM framework can validate results using a variance-covariance
structure of metabolites and transcripts across genotypes. This allows for the expansion of
metabolic and transcriptomic networks. Using existing knowledge about a given GRN as a
baseline model, geneticists can systematically scan macroalgae genomes for additional
components and improve their understanding of existing networks. To validate constructed
metabolic networks, flux balance analysis (FBA), a network reconstruction-based approach, is
often used to model the flow of small molecules through known reactions (Orth et al., 2010).
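The role of the variance-covariance structure in a path model can be illustrated with a small Python sketch. For an acyclic model in which transcript abundance affects a metabolite, which in turn affects a trait, the direct effects and residual variances fully determine the covariance matrix that the model predicts. Fitting an SEM then amounts to choosing parameters so that this implied matrix matches the observed one. The variable ordering, path coefficients, and residual variances below are hypothetical values chosen for illustration, not estimates from any dataset.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def implied_covariance(direct_effects, residual_variances):
    """Covariance matrix implied by an acyclic (recursive) path model.

    direct_effects[i][j] is the direct effect of variable j on variable i.
    Total effects equal I + B + B^2 + ..., a sum that terminates after
    n - 1 terms when the model has no cycles.
    """
    n = len(direct_effects)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]
    for _ in range(n - 1):
        power = matmul(power, direct_effects)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    # Residual (unexplained) variances are assumed to be uncorrelated.
    psi = [[residual_variances[i] if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    total_transposed = [list(col) for col in zip(*total)]
    return matmul(matmul(total, psi), total_transposed)


# Hypothetical chain, in the order: transcript, metabolite, trait.
paths = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],  # transcript -> metabolite
    [0.0, 2.0, 0.0],  # metabolite -> trait
]
sigma = implied_covariance(paths, [1.0, 1.0, 1.0])
for row in sigma:
    print(row)
```

In this toy model the transcript and the trait covary only through the metabolite, so their implied covariance is the product of the two path coefficients times the transcript variance.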
Many macroalgal systems, such as the model brown macroalga E. siliculosus, model red
alga Chondrus crispus, or the commercial kelp Saccharina japonica, have been characterized with
genomic and transcriptomic methods (Cock et al., 2010; Collén et al., 2013; Liu et al., 2019).
Expanding upon this data to develop full metabolomic models for organisms can reveal which
genetic variants affect phenotype, which can be probed to maximize beneficial (e.g. growth and
yield) phenotypes and would particularly benefit macroalgae crops in aquaculture.
As more studies harnessing modern omics methods are conducted in macroalgae,
domestication and improvement of macroalgae species will become more efficient and be able to
accomplish phenotypic changes with informed crossing schemes and genetic manipulation, which
have taken thousands of years for traditional crop domestication in agriculture (Zsögön et al.,
2018).
1.3.4 The macroalgae holobiont: combining metabolomics and metagenomics to probe host-symbiont interactions
The term “holobiont” emerged in the context of holistic biology as early as the 1940s with Dr.
Adolf Meyer-Abich (Baedke et al., 2020). Modern usage of the term to describe the relationship
between organisms and their closely associated microbiota has been traced back to work by Dr.
Lynn Margulis in 1991. Initially described as the relationship between a host and a single symbiont,
the term has since evolved to describe a host organism and its associated biome, specifically
including symbionts without which the holobiont would be unable to perform its functions (Margulis,
1991). To learn more about the evolution of this term, we refer the reader to a recent review by
Baedke et al. (2020). The holobiont concept has transformed the way scientists think about host
organisms and their associated microbiota (collectively termed the microbiome). Specifically, it
emphasizes the symbiotic relationship between host and microbiome by treating the microbiome
as an extended phenotype of the host and recognizing how each entity is influenced and shaped
by the other. This frame of thought has been applied to microbial studies of many eukaryotes,
including plants and mammals.
Applying this concept to seaweeds is a particularly strong example of holobionts in action,
as microorganisms (specifically bacteria) are recognized to play an essential role in the health and
fitness of aquatic plants and algae. As previously reviewed in Egan et al. (2013), many species of
macroalgae rely on their associated microbes to provide essential nutrients (such as CO2 and fixed
nitrogen) required for proper growth and development. To that end, several models have been
proposed on how macroalgae may go so far as to “recruit” beneficial microbes by creating a
desirable habitat by, for example, metabolite or chemical mediation (Saha and Weinberger, 2019).
Given these tight associations, it is reasonable that macroalgal species and their microbial
communities may have co-evolved to rely on each other’s biological mechanisms and lose
redundant gene pathways. Indeed, there exists an exciting opportunity to leverage existing
knowledge of the macroalgal holobiont and apply tools from metabolomics, metagenomics, and
transcriptomics to understand what genes are present and being expressed in the host and microbial
metagenome (Egan et al., 2008). Exploiting the fruits of these discoveries—which may include
identifying microbe species that serve essential functions for the macroalgae holobiont,
determining growth-promoting nutrients at specific macroalgae life stages, and creating beneficial
or protective microbial inoculants—will be essential for future genomic strategies for
domestication and improvement of economically relevant macroalgae species.
1.4 Application of metagenomics to macroalga domestication
High throughput sequencing, such as shotgun metagenomics, is a powerful resource for
understanding microbial communities (Sieber et al., 2018; Olson et al., 2019; Wilkins et al., 2019).
Advances in this field have enabled the discovery of novel genes and pathways that contribute to
overall microbiome function. By combining metagenomics data from the native microbiome with
physiological and genomic data from the host, we can investigate how the microbiome can be
utilized to optimize host growth. There are several review papers on the use of metagenomic data
to understand taxonomic composition and functional profile of the microbiome (Gilbert and
Dupont, 2011; Quince et al., 2017; Alves et al., 2018; Chiu and Miller, 2019), and how these
methods can benefit aquaculture (Martínez-Córdova et al., 2015; Tello et al., 2019). Specifically,
functions of the microbiome can be exploited to improve crop productivity by providing essential
nutrients or increasing disease resistance (Ezemonye et al., 2009; Caruso, 2013; Martínez-Córdova
et al., 2015).
Application of these methods to macroalgae domestication is a unique opportunity and a
powerful tool for research in agriculture and microbiomes because macroalgae are fast-growing
organisms whose overall fitness is tightly intertwined with the native microbiome (Laycock, 1974;
Egan et al., 2013; Busetti et al., 2017; Florez et al., 2017; Tello et al., 2019).
1.4.1 Understanding the microbiome
One of the challenges around developing macroalgae as a commercial crop is that they are highly
adapted to their local environment and do not demonstrate resilience or consistent physiological
traits when farmed in new locations (Minich et al., 2018; Grebe et al., 2019; Qiu et al., 2019;
Zacher et al., 2019). An example of this is giant kelp (Macrocystis pyrifera), a brown macroalga
that thrives in coastal environments and is a promising resource for future domestic biofuel
production. Although giant kelp is one of the fastest growing organisms, farms along the U.S. coast
would not produce enough harvest to support giant kelp as a viable biofuel feedstock (Kim et al.,
2019b). Domestication of giant kelp in offshore farms would provide ample space for necessary
production rates, but current farming practices remain uncompetitive with wild-harvested
macroalgae. Similar challenges arise with the domestication of other species of macroalgae that are
native to coastal environments. Offshore farms present a significant challenge in that they
experience lower nutrient concentrations and are a more stressful environment for macroalgae
compared to coastlines. Engineering solutions to overcome issues of containment, nutrient
availability, and protection will be critical for offshore farming (Roesijadi et al., 2008; Duarte et
al., 2017). Some individuals may be successful in this environment by chance, yet the
unpredictable growth of seaweeds in offshore farms is a critical barrier to large-scale cultivation
and commercial applications. While breeding programs can be developed to optimize the gene
content of macroalgae and promote genotypes that are resilient in low nutrient conditions, there
remain significant challenges. However, limitations of breeding, such as trade-offs in yield and
stress resistance, can be overcome by microbial treatments that improve crop fitness and
predictable growth (Tack et al., 2016). Development of these treatments requires better
understanding of seaweed-bacterial associations, which are necessary for proper growth,
recruitment, and development of seaweeds (Morris et al., 2016; Fitzpatrick et al., 2018).
The surface microbiome of seaweeds is composed of a diverse community of
microorganisms that contribute to host health and form a biofilm across kelp blades. Macroalgal
hosts rely on resident surface-associated microbes for proper growth and development. In addition
to providing growth-benefitting compounds, epiphytic microbes provide a number of additional,
beneficial services such as nutrient acquisition and protection from pathogens (Dubilier et al.,
2008; Egan et al., 2008, 2013; Crawford and Clardy, 2011; Wahl et al., 2012; Naragund et al.,
2020). For example, certain seaweed-associated bacterial isolates are capable of fixing
atmospheric nitrogen, which is often a limiting nutrient for kelp growth and development (Singh
and Reddy, 2014).
Given that many species of macroalgae are not natively found offshore, this provides a
unique challenge for open-ocean farming. The open-ocean environment constitutes a low nutrient
and stressful environment for macroalgae (Hawkes and Connor, 2017). For field applications,
microbial inoculations face an additional challenge as native members of the kelp microbiome may
outcompete introduced microbes or their interactions may prove to be detrimental to kelp growth
(Gause, 1934; Hutchinson, 1957; van Veen et al., 1997). By investigating community assembly
patterns, we can navigate ecological niches to reduce competition and build a predictive
understanding of how microbial treatments will impact the microbial community and overall
growth of seaweed in aquaculture. Understanding functional traits across the community is also
vital to developing successful microbial treatments for aquaculture. Although it remains a
challenge to characterize function across the community, metagenomic sequencing summed across
taxa provides sufficient representation of overall function compared to species-specific
characteristics (Fierer et al., 2014; Jiang et al., 2016). Furthermore, host genotype also plays a key
role in recruiting and sustaining a beneficial microbial community (Horton et al., 2014). By
treating the microbiome as an extended phenotype of the host, one can perform a genome-wide
association study (GWAS) to identify genotypes or specific genes that are better suited to
supporting a beneficial microbial community, which can guide future breeding design to
compound the growth benefit of fast-growing genotypes and beneficial microbes (Awany et al.,
2019).
Few studies have attempted to rigorously interrogate macroalgal host-microbe
associations, assess the impact of host genotype on microbial community recruitment, or utilize
these interactions to support macroalgae as a commercial crop. Amplicon sequencing, such as that
of 16S rRNA, is a great resource for understanding species diversity in microbial communities.
However, the added benefit of using metagenomic shotgun sequencing for studying the
microbiome is that the greater genomic coverage and data output gives insight into overall
functional diversity, in addition to identifying unique and novel members of the community.
Utilizing this type of data would enable us to not only characterize community assembly patterns
and identify beneficial species of the microbiome, but also the impact of genetic diversity on
seaweed-associated microbial communities to find host genes promoting or restricting recruitment
of different symbionts. While several studies have investigated the microbial community of several
macroalgal species (Morris et al., 2016; Hawkes and Connor, 2017; Minich et al., 2018; Smith et
al., 2018), this field would benefit from a large-scale study in a single well-controlled farm
environment to understand microbiome structure and functional diversity in the context of a wide
range of macroalgae genotypes. Understanding the mechanisms and functions of the microbial
community, and the interaction between macroalgae and bacteria in offshore farms, is essential for
ensuring success of the rising macroalgae aquaculture industry.
1.4.2 Using metagenomic data to understand community assembly patterns of the microbiome
It is well recognized that targeted metagenomic sequencing, such as 16S rRNA sequencing, is
frequently insufficient to characterize taxonomic and functional variation in microbial
communities. Shotgun sequence metagenomic data can be used to fully annotate the species and
functional diversity of microbial communities across a wide range of macroalgal species.
Taxonomic identification and analysis of the microbial community with shotgun data requires the
construction of metagenome-assembled genomes (MAGs). An example of a program to assist
with this step is Anvi’o, an advanced analysis and visualization platform for omics data (Eren et
al., 2015). To create consensus taxonomy for each unique genome, Anvi’o uses both single-copy
core genes (SCGs) and the taxonomy determined by The Genome Taxonomy Database (Parks et
al., 2018, 2020). Established species abundance tables then enable one to investigate community
assembly patterns and develop a predictive understanding of how introduction of novel species
will interact with the native community. A challenge often associated with analysis of
metagenomic data is the presence of low-coverage organisms and closely related taxa. To address this, there
exist programs such as BinSanity, which uses an algorithm that clusters assemblies using coverage
with composition-based refinement to optimize bins containing multiple source organisms
(Graham et al., 2017).
Previous studies have analyzed microbial and benthic macroinvertebrate communities using
community ecology tools such as beta (β) and zeta (ζ) diversity (Doane et al., 2017; Simons et al.,
2019). These tools, which are established methods for measuring compositional change across
ecological communities, can be used to analyze community assembly patterns in the context of
macroalgal surface microbiomes (Hui and McGeoch, 2014; McGeoch et al., 2019). These tools
traditionally require that microbiome samples be collected from separate sites and use distance
between sites to measure assembly patterns across a larger area. However, in macroalgal farm
settings, by treating each individual as an independent community, and having information on the
genome of the host, genetic distance between individuals can be used to establish the foundation
for community assembly patterns. Moving from individual phenotypes (microbial loads) to
functional groups, exploratory and confirmatory factor analyses, such as structural equation
models described earlier, combined with QTL mapping can be used to determine how kelp genes
affect community structures. These computational tools are innovative and have recently been
applied to analyze effects of human genetic variation on the metabolome (Igolkina et al., 2018),
but have not yet been considered for microbial communities.
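As a rough illustration of the two diversity measures named above, the following sketch treats each kelp individual as its own community. It assumes presence/absence data, uses Jaccard dissimilarity as one of several possible beta diversity indices, and takes zeta diversity of order i as the mean number of taxa shared by every combination of i communities. The taxon labels and sample names are hypothetical, and this is not the pipeline used in the cited studies.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Pairwise beta diversity: 1 - |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def zeta_diversity(communities, order):
    """Mean number of taxa shared by every combination of `order` communities."""
    shared = [
        len(set.intersection(*(set(c) for c in combo)))
        for combo in combinations(communities, order)
    ]
    return sum(shared) / len(shared)


# Hypothetical surface microbiomes of three kelp individuals.
microbiomes = {
    "kelp_A": {"t1", "t2", "t3", "t4"},
    "kelp_B": {"t2", "t3", "t4", "t5"},
    "kelp_C": {"t3", "t4", "t6"},
}
samples = list(microbiomes.values())

beta_ab = jaccard_dissimilarity(microbiomes["kelp_A"], microbiomes["kelp_B"])
zeta_1 = zeta_diversity(samples, 1)  # mean richness per individual
zeta_2 = zeta_diversity(samples, 2)  # mean taxa shared by pairs
zeta_3 = zeta_diversity(samples, 3)  # taxa shared by all three
```

In a farm setting, the pairwise values could then be compared against genetic distance between hosts rather than geographic distance between sites.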
1.4.3 Identifying beneficial bacteria using metagenomics and physiological crop traits
Metagenomic analysis in macroalgal farm settings provides a powerful tool when combined with
physiological trait data of crops. By focusing on desired physiological traits, such as blade weight
or nutrient composition, one can establish a metric to assess “successful” growth and consequently
identify beneficial microbial species or groups of species. Co-occurrence networks can be used to
determine microbial species most often associated with successful growth. These species can be
considered “beneficial” and used to tailor future aspects of the analysis pipeline, such as analysis
of functional traits. Understanding functional traits across the community is vital to developing
successful microbial treatments for aquaculture. One trait of particular interest is nitrogen fixation.
Nitrogen is an important resource for macroalgae growth (Hanisak, 1983; Zehr et al., 1996;
Harrison and Hurd, 2001). Metagenomic sequence data would enable one to characterize
symbiotic relationships between seaweed and nitrogen fixing bacteria that have not yet been
described. Preliminary studies have demonstrated a pattern of strong, and potentially vertical, co-transmission of Mesorhizobium spp. and Sinorhizobium spp. with giant kelp (Minich et al., 2018).
This points to an opportunity to enhance nitrogen fixation in brown macroalgae and optimize
growth, since more than half of nitrogen fixation in Sargassum is thought to be derived from
associated microbes (Phlips and Zeman, 1990; Raut et al., 2018) and nitrogen is known to be a limiting
factor in Macrocystis pyrifera (Minich et al., 2018; Raut et al., 2018).
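The co-occurrence idea described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: presence/absence data, a single trait (blade weight) with an arbitrary cutoff for "successful" growth, and an edge kept whenever two taxa appear together in at least `min_count` successful individuals. The data, cutoff, and function name are hypothetical; real analyses typically use correlation-based or model-based network inference.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(samples, min_count=2):
    """Count taxon pairs found together; keep pairs seen in >= min_count samples."""
    pair_counts = Counter()
    for taxa in samples:
        for pair in combinations(sorted(taxa), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical data: (blade weight in g, taxa detected on that individual).
individuals = [
    (120.0, {"taxon_a", "taxon_b", "taxon_c"}),
    (110.0, {"taxon_a", "taxon_b"}),
    (45.0, {"taxon_c", "taxon_d"}),
    (130.0, {"taxon_a", "taxon_b", "taxon_d"}),
]

weight_cutoff = 100.0  # arbitrary illustrative threshold for "successful" growth
successful = [taxa for weight, taxa in individuals if weight >= weight_cutoff]
edges = cooccurrence_edges(successful, min_count=2)
```

Here only the pair (taxon_a, taxon_b) recurs across the successful individuals, so it would be the candidate "beneficial" association to examine further.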
For this type of analysis one can consider each genotype an independent experimental
observation with a unique effect on its associated microbiome. Specifically, metagenomic shotgun
and 16S rRNA sequencing of the microbiome can be used in a quantitative model to infer kelp
genes affecting recruitment and diversity of the native microbial communities. Community
profiling, including species abundance and functional niches, can be assessed using established
methods for taxonomy assignment and gene annotation (Eren et al., 2015; Vollmers et al., 2017;
Sieber et al., 2018; Ortiz-Estrada et al., 2019; Woloszynek et al., 2019).
1.4.4 Characterizing impact of host genotype on microbial community
Similar to work done in other organisms, structural equation models and QTL mapping can be
used to determine how kelp genes affect overall community structure of the native microbiome
(Horton et al., 2014; Chen et al., 2018; Jones et al., 2019). There have been a number of studies
investigating the microbiome of several species of seaweeds such as Porphyra and Pyropia
(Miranda et al., 2013; Quigley et al., 2018; Aydlett, 2019; Yan et al., 2019). The value of these
studies, and overall impact on the macroalgae aquaculture industry, can be enhanced by further
considering the impact of the host (macroalga) genotype on the recruitment of the native
microbiome community structure.
For every group of kelp genotypes with significant microbiome differences, one can generate
co-occurrence networks to illustrate the likeliest associations between kelp genes of interest and
particular members of their microbiome. In farm settings, unique kelp genotypes can be considered
as independent experimental observations, each with a unique effect on its associated microbes.
Similar work has been applied in humans (Igolkina et al., 2018), agriculture (Shin et al., 2019),
and other aquaculture crops (Simons et al., 2018). Metagenomic work for large numbers of
samples can be cost prohibitive. While a restricted number of metagenomic samples can be
limiting for genome-wide association studies (GWAS), this can be overcome by identifying
strongly diverged populations of macroalgae and focusing on genotypes collected from hybrid zones.
By having a sufficient number of samples come from hybrid zone lineages and focusing on QTL
mapping, it becomes more feasible to have sufficient statistical power and replication to determine
correlations between kelp genotypes and their microbiomes. To further focus this type of analysis,
genotypes represented in this work, and microbiome samples to be analyzed, can be chosen based
on whole genome sequencing (WGS) data from macroalgal individuals.
1.4.5 Technical challenges of analyzing host genome impact on microbiome
To accomplish GWAS and QTL analysis, the microbial community should be treated as a multi-trait extended phenotype, with groups of community members or functions considered as different
traits. For sparse datasets, pairwise distance matrices using beta-diversity can be used as the
quantified host phenotype. Several approaches for multi-trait models have been proposed, but
analysis can be challenging with correlated traits such as species abundance (Yang and Wang,
2012; Hackinger and Zeggini, 2017). One way to cope with correlated traits is to model the inter-trait covariance with a random effect in linear mixed effects models (Laird and Ware, 1982). Until
recently, this model could use only a pair of correlated traits at a time due to the computational
intensity (Korte et al., 2012). To reduce this load, variable reduction techniques have been
suggested to replace several phenotypic traits with new independent constructs. These constructs
play the role of new traits and can be obtained with a standard principal component analysis of
traits (PCA), various principal components of heritability (PCH) (Baltimore, 1986; Lange et al.,
2004; Wang et al., 2007) or pseudo-principal components (Gao et al., 2014). Another challenge in
association studies is to develop a powerful multi-locus model. Testing SNP by SNP, single-locus
models require correction for multiple testing afterward, which can eliminate important
quantitative trait variants. To avoid this problem, multi-locus models, that consider all markers
simultaneously, can be applied. Due to the ‘large p (number of SNPs), small n (sample size)’
problem, many multi-locus models are based on regularization/penalized techniques including:
LASSO (Wu et al., 2009), Elastic Net (Cho et al., 2009), Bayesian LASSO (Yi and Xu, 2008), and
adaptive mixed LASSO (Wang et al., 2011). Other multi-locus methods (incorporated in the
mrMLM package) involve two-step algorithms, which first select candidate variants in a single-locus design and then examine them together in a multi-locus manner (Wen et al., 2017).
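To make the multiple-testing concern concrete, the sketch below applies a Bonferroni correction, which is one common (and conservative) choice rather than the only option. The SNP names, p-values, and number of tests are hypothetical.

```python
def bonferroni_hits(p_values, n_tests, alpha=0.05):
    """Return the per-test threshold and the SNPs that pass it."""
    threshold = alpha / n_tests
    hits = [snp for snp, p in p_values.items() if p < threshold]
    return threshold, hits


# Hypothetical single-locus p-values for a handful of SNPs out of one million tested.
p_values = {"snp_1": 1e-9, "snp_2": 3e-6, "snp_3": 0.02, "snp_4": 0.4}
threshold, hits = bonferroni_hits(p_values, n_tests=1_000_000)
```

With one million tests the threshold falls to 5e-8, so a SNP with p = 3e-6 is discarded even though it may reflect a real effect. This is the loss of power that multi-locus models aim to avoid.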
Despite the broad spectrum of multi-trait and multi-locus models in GWAS and trait
prediction studies, only a few of them simultaneously incorporate correlated traits and several
associated variants (Lippert et al., 2014; Liu et al., 2016; Zhan et al., 2017; Dutta et al., 2018;
Weighill et al., 2019). In principle, multi-trait and multi-locus models have a potential to reveal
complex and important types of associations, for instance, a single variant might have a direct
effect on one trait, and an indirect impact on the other trait; a SNP may act on a single trait or its
effect might be pleiotropic affecting several traits. However, none of these traits-variants
associations are explicitly embedded into known models, but they can be directly accounted for
with the previously described method SEM, a multivariate statistical analysis technique first
introduced for path analysis by geneticist Sewall Wright (Wright, 1918, 1921). SEM has been
widely used in the fields of genetics, econometrics, and sociology, and current SEM applications
are gradually shifting to molecular biology (Igolkina et al., 2018). SEM models have also been
applied in association studies in both multi-trait and multi-locus designs. For example, the GW-SEM method was developed to test the association of a SNP with multiple phenotypes through a
latent construct (Verhulst et al., 2017). It was demonstrated that in comparison with the existing
multi-trait single-locus GWAS software package GEMMA (Zhou and Stephens, 2012), GW-SEM
provides more accurate estimates of associations; however, GEMMA was almost three times
faster than GW-SEM. Another SEM-based model which can be used in association studies was
proposed for multi-trait QTL mapping (Mi et al., 2010). This method proposes that phenotypes are
causally related, forming a core structure without latent constructs, and that QTLs play the roles
of exogenous variables to the structure. This approach allows the model to decompose QTL effects
into direct, indirect, and total effects.
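The decomposition into direct, indirect, and total effects can be illustrated with a small path model. The sketch assumes a linear model with a single mediating trait, in which the indirect effect is the product of the path coefficients along the mediated route and the total effect is the sum of direct and indirect effects. The coefficient values and names are hypothetical.

```python
def decompose_qtl_effect(direct, qtl_to_mediator, mediator_to_trait):
    """Split a QTL effect on a trait into direct, indirect, and total parts.

    Assumes a linear path model: QTL -> mediator trait -> focal trait,
    plus a direct QTL -> focal trait path.
    """
    indirect = qtl_to_mediator * mediator_to_trait
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Hypothetical coefficients: a kelp QTL affects the abundance of one microbial
# group (mediator), which in turn affects a second group (focal trait).
effects = decompose_qtl_effect(direct=0.2, qtl_to_mediator=0.5, mediator_to_trait=0.6)
```

In this example the QTL contributes 0.2 directly and 0.3 through the mediating trait, which a single-trait test on the focal trait alone could not separate.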
Addressing these challenges is critical to understanding and properly analyzing the impact
of the host genome on the recruitment of the native microbiome. Applications described here have
not been applied to the seaweed microbiome and should be explored in more detail to have an
impact on the aquaculture industry.
1.5 Conclusion
Advances in modern omics technology, including in sequencing and metabolite profiling, have
created powerful tools and analysis pipelines for understanding the metagenome and metabolome
of living organisms. Established pipelines from functional genomics research in agriculture are
beginning to be applied to seaweed systems but must be expanded to improve the development of
seaweed aquaculture. Metabolomic analysis has been used to fine-tune domestication and crop
improvement strategies in agriculture by revealing pathways involved in crop nutrition, flavor, and
stress response. Applying metabolomics analyses—alongside other functional genomics
analyses—more broadly in marine macroalgae toward improvement of cultivars for aquaculture
will allow scientists to identify significant pathways for macroalgae domestication more
efficiently.
The microbiome plays an important role in the overall health of several macroalgal species
and it is imperative to further understand microbial communities in the context of host genotypes
to ensure the success of the rising macroalgae industry in the U.S. Development and understanding
of host-microbe associations in aquaculture can lead to microbial inoculations to boost crop yields
and future breeding programs should focus on the compounded benefit of seaweed genotypes and
optimized microbial communities. Further investigation of the microbiome and metabolome in
seaweeds has the potential to greatly improve current methods of U.S. macroalgae domestication.
Figure 1.3. Overview of omics techniques currently used and proposed for use in macroalgae domestication and
improvement. Proposed schema for the application of DNA and RNA sequencing, metagenomics, and metabolomics to efficiently
improve and domesticate macroalgae, emphasizing how each analysis can complement and build upon others. The
information gained through performing metagenomic and metabolomic analyses on top of conventional sequencing
analyses—including the identification of potential microbial inoculants for macroalgae cultivars, essential nutrients
for macroalgae growth, and the mechanism by which genetic variants of macroalgae species produce desirable
phenotypes—makes a case for the increased collection and utilization of these important datasets and analyses.
By treating the metabolome and microbiome as extended phenotypes, pathways for
domestication and optimized breeding go beyond the traditional desired physiological traits in
crops, such as size and nutrient load. Specifically, functional traits of the microbiome, such as its
ability to fix nitrogen or prevent disease, can be exploited to improve overall crop health and
fitness. Application of these tools in macroalgae is a unique opportunity because macroalgae are fast-growing organisms that rely heavily on an optimal, environment-adapted metabolic profile and
beneficial microbial interactions.
With the rise of marine macroalgae in the aquaculture industry, there exists a unique
opportunity to utilize modern omics tools to domesticate macroalgae species rapidly,
including by optimizing their metabolomic and metagenomic profiles (Figure 1.3). As future
breeding programs are developed to establish the U.S. as a world leader in aquaculture, we must
incorporate the powerful tools and recent advances in modern omics techniques to guide our
breeding protocol and turn to functional genomics to greatly improve upon traditional breeding
programs. Metagenomics and metabolomics analyses provide important contributions that
deserve future consideration as scientists continue to explore omics in aquaculture.
1.6 Author contributions
This review is a collaborative work written by Kelly J. DeWeese and Melisa G. Osborne, PhD.
The abstract, introduction, and conclusion sections were contributed to equally. The remaining
sections were divided as follows: KJD authored “1.2 Modern farming in the United States and
targeted breeding”, “2 Application of metabolomics to macroalga domestication” and all its
subsections (2.1-2.4). KJD created all figures. MGO authored “1.1 Brief review of marine
macroalgae aquaculture”, “3 Application of metagenomics to macroalga domestication” and all its
subsections (3.1-3.5). Sections by MGO should not be evaluated as part of this dissertation.
Chapter 2: Scaffolded and annotated nuclear and organelle genomes
of the North American brown alga Saccharina latissima
Kelly J. DeWeese1, Gary Molano1, Sara Calhoun2, Anna Lipzen2, Jerry Jenkins3, Melissa Williams3, Christopher Plott3, Jayson Talag4, Jane Grimwood3, Jean-Luc Jannink5,6, Igor V. Grigoriev2,7, Jeremy Schmutz3, Charles Yarish8,9, Sergey Nuzhdin1, Scott Lindell9
1 Department of Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, United States
2 US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
3 HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, United States
4 Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721, United States
5 US Department of Agriculture, Agricultural Research Service (USDA-ARS), Ithaca, NY, United States
6 Section On Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University, Ithaca, NY 14853, United States
7 Department of Plant and Microbial Biology, University of California Berkeley, CA 94720, United States
8 Department of Ecology and Evolutionary Biology, University of Connecticut, Stamford, CT 06901-2315, United States
9 Applied Ocean Physics and Engineering Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543, United States
2.1 Abstract
Increasing the genomic resources of emerging aquaculture crop targets can expedite breeding
processes as seen in molecular breeding advances in agriculture. High quality annotated reference
genomes are essential to implement this relatively new molecular breeding scheme and benefit
research areas such as population genetics, gene discovery, and gene mechanics by providing a
tool for standard comparison. The brown macroalga Saccharina latissima (sugar kelp) is an
ecologically and economically important kelp that is found in both the northern Pacific and
Atlantic Oceans. Cultivation of Saccharina latissima for human consumption has increased
significantly this century in both North America and Europe, and its single blade morphology
allows for dense seeding practices used in the cultivation of its Asian sister species, Saccharina
japonica. While Saccharina latissima has potential as a human food crop, insufficient information
from genetic resources has so far limited molecular breeding adoption in kelp aquaculture. We
present scaffolded and annotated Saccharina latissima nuclear and organelle genomes from an
individual gametophyte collected from Black Ledge, Groton, Connecticut. This Saccharina
latissima genome compares well with other published kelp genomes and contains 218 scaffolds
with a scaffold N50 of 1.35 Mb, a GC content of 49.84%, and 25,012 predicted genes. We also
validated this genome by comparing the synteny and completeness of this Saccharina latissima
genome to other kelp genomes. Our team has successfully performed initial genomic selection
trials with sugar kelp using a draft version of this genome. This Saccharina latissima genome
expands the genetic toolkit for the economically and ecologically important sugar kelp and will be
a fundamental resource for future foundational science, breeding, and conservation efforts.
2.2 Introduction
The demand for sustainable farming resources is increasing due to the combination of rising global
temperatures, increasing population levels, and decreasing amount of arable land (Pimentel, 1991;
Thaler et al., 2021; Abbass et al., 2022). Increasing global food production can help raise the
carrying capacity of Earth despite these environmental challenges escalating in the 21st century
(Hopfenberg, 2003). One potential sustainable resource solution is to increase the amount of
biomass produced in the ocean by deploying open ocean kelp farms, as kelp grows quickly and
requires no arable land, freshwater, herbicides, or fertilizers (ARPA-E, 2017; Kim et al., 2019b).
Kelp are haplodiplontic brown macroalgae that provide vital ecosystem services, including habitat
creation and primary production, that sustain some of the ocean’s most diverse communities
(Dayton, 1985; Steneck et al., 2002; Wernberg et al., 2018; Eger et al., 2023). Kelps began to
diversify 30 million years ago, with the initial kelps appearing in the northern Pacific Ocean and
radiating southward over time (Starko et al., 2019).
While kelps and other seaweeds have been consumed by humans since the Mesolithic era,
kelp aquaculture development lagged compared to terrestrial agriculture until the 20th century
(Hwang et al., 2019; Buckley et al., 2023). Large scale kelp farming initially started in Japan,
Korea, and China in the 1950s to 1970s, and Asia currently accounts for 97% of the $6 billion
global kelp market (Tseng and Fei, 1987). The predominant kelp species farmed are Saccharina
japonica ($4.6 billion) and Undaria pinnatifida ($1.9 billion), with kelp aquaculture directly
supporting a range of industries, from food to pharmaceuticals (J. K. Kim et al., 2017; Cai et al.,
2021; FAO, 2022). As the industry expanded, kelp breeding programs were formed to address
low-quality seed and to improve biomass, disease resistance, and trait consistency using phenotypic
selection (Hwang et al., 2019; Hu et al., 2023). In agriculture, emerging genetic resources, such as
reference genomes and sequence information for breeding panels of plants, have accelerated
genomics-guided breeding programs to develop more productive and resilient cultivars (Song et
al., 2023). Genomics can also accelerate kelp breeding programs, if the proper genetic resources,
such as reference genomes and breeding populations, are developed.
The haplodiplontic life cycle of kelp is ideal for genomics-based breeding, as haploid
gametophytes can be vegetatively propagated in culture (Dring and Lüning, 1975; Huang et al.,
2022, 2023). These gametophyte cultures can then serve as kelp breeding germplasm, with scalable
production of monoclonal gametophyte cultures producing ample material for crossing or
sequencing experiments (Wade et al., 2020). Sequencing data can then be aligned to reference
genomes, producing variant information that can be used with phenotypic data to produce genomic
selection models (Meuwissen et al., 2001; Huang et al., 2022). These models produce genome
estimated breeding values (GEBVs) which can then be used to predict the phenotypes, such as
biomass, of potential crosses in the sequenced germplasm (Budhlakoti et al., 2022; Huang et al.,
2022). Sequenced germplasm collections, along with the vegetative propagation of haploid
gametophytes, compose an incredibly powerful tool for breeding programs for kelp (Diehl et al.,
2023; Huang et al., 2023; Hu et al., 2023).
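The way GEBVs can rank potential crosses may be sketched as below. This is a deliberately simplified illustration, not the model used in the cited studies: marker effects are taken as already estimated, haploid gametophyte genotypes are coded 0/1, the model is purely additive, and the predicted value of a cross is assumed to be the sum of the two parental GEBVs. All names and numbers are hypothetical.

```python
def gebv(genotype, marker_effects):
    """Genome estimated breeding value: sum of marker effects weighted by allele code."""
    return sum(g * e for g, e in zip(genotype, marker_effects))


# Hypothetical marker effects on biomass, as if estimated from a training population.
marker_effects = [0.5, -0.2, 0.3]

# Hypothetical haploid gametophyte genotypes (0/1 allele codes per marker).
females = {"F1": [1, 0, 1], "F2": [0, 1, 0]}
males = {"M1": [1, 1, 0], "M2": [0, 1, 1]}

# Simplifying assumption: predicted cross value = female GEBV + male GEBV.
crosses = {
    (f, m): gebv(fg, marker_effects) + gebv(mg, marker_effects)
    for f, fg in females.items()
    for m, mg in males.items()
}
best_cross = max(crosses, key=crosses.get)
```

Because gametophytes can be propagated vegetatively, a ranking like this could in principle be used to choose which cultures to cross before any sporophytes are grown.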
The brown macroalga Saccharina latissima (sugar kelp) is an ecologically and
economically significant species of the order Laminariales (kelps). S. latissima last diverged from
other kelp species about 7 million years ago and can be found in both the northern Pacific and
Atlantic Oceans (Starko et al., 2019). S. latissima is a sister species to the economically important
Japanese sugar kelp Saccharina japonica, with a similar uni-blade morphology that is ideal for
kelp farming (Starko et al., 2019). S. latissima is farmed in both northern Europe and the US, and
is the most farmed kelp in the US, accounting for a predominant share of the current >$300 million
US kelp industry (Kim et al., 2019b; Heidkamp et al., 2022; Stekoll et al., 2024). The US kelp
industry is growing rapidly, and emerging genetic resources will help breeding
programs increase the productivity of kelp farms, as well as aid ecological and conservation efforts
(Li et al., 2022; Brayden and Coleman, 2023). While population genetic studies of sugar kelp in
both the US and Europe have started providing some of the resources necessary for breeding
programs, a scaffolded and annotated genome is needed to further refine breeding models as well
as to assist in future population genetic and conservation studies for this species (Breton et al.,
2018; Mao et al., 2020; Huang et al., 2022, 2023). Early results from a S. latissima genomic
selection breeding program based on the reference genome described here produced cultivars that
doubled biomass yield compared with non-selected kelps (Huang et al., 2023). We present the
scaffolded and annotated S. latissima nuclear genome alongside organelle genomes of the same
individual, providing an essential tool for subsequent scientific studies and breeding programs.
2.3 Materials and Methods
2.3.1 Sample collection
Reproductive sporophyll tissue from a wild population of Saccharina latissima sporophytes was
sampled from Black Ledge, Groton, Connecticut, US (41°31'N, 72°07'W, June 26, 2014).
Following induced sporulation, individual gametophytes were isolated to establish monoclonal
gametophyte cultures in a laboratory setting, as outlined in Redmond et al. (2014) and Alsuwaiyan
et al. (2019). One female S. latissima gametophyte (var. SL-CT1-FG3) was selected for long-read
sequencing for reference genome assembly and cultured in Erlenmeyer flasks under red light with
a 12:12 h light:dark photoperiod at 10 °C to inhibit reproduction and promote growth and mitotic
division for genomic DNA extraction at the Marine Biotechnology Laboratory at the University of
Connecticut (Redmond et al., 2014; Augyte et al., 2018).
2.3.2 DNA extraction and sequencing
Augyte et al. (2018) extracted DNA from a 24 mg (fresh) gametophyte culture of Saccharina
latissima using a modified protocol of the Macherey-Nagel NucleoSpin Plant II Maxi Kit
(Macherey-Nagel, Düren, Germany). Gametophyte biomass was spun down in 1.5-mL tubes in an
Eppendorf 5424 microcentrifuge (21,000 rcf, 2 min). Sealed tubes of gametophyte material were
frozen in liquid nitrogen for 20 seconds before being ground with a plastic pestle for 30 seconds.
Ground samples were extracted with CTAB buffer using repeated wash steps (Doyle and Doyle,
1990; Augyte et al., 2018). Extracted gametophyte DNA was then sent to the HudsonAlpha
Institute in Huntsville, Alabama for whole genome sequencing.
The Saccharina latissima individual (var. SL-CT1-FG3) was sequenced using a whole
genome shotgun sequencing strategy and standard sequencing protocols. Sequencing reads were
collected by HudsonAlpha using Illumina and PacBio platforms. Illumina reads were sequenced
using the Illumina NovaSeq 6000 platform, and the PacBio reads were sequenced using the Sequel
II platform. One 400 bp insert 2x250 Illumina fragment library (66.03x) was sequenced along with
one 2x150 Dovetail Hi-C library (145.33x) (Table S1). Prior to use, the Illumina fragment reads
were screened for PhiX contamination. Reads composed of >95% simple sequence were removed.
Illumina reads <50 bp after trimming for adapter and quality (q<20) were removed. The final read
set consisted of 282,564,006 reads for a total of 66.03x of high-quality Illumina bases. PacBio
sequencing produced a total raw sequence yield of 111.14 Gb, for a total coverage of 180.56x per
haplotype (Table S2).
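Fold coverage is the total number of sequenced bases divided by the genome size, so the PacBio yield and coverage reported above imply a genome size estimate. The sketch below only rearranges that relationship; the genome size it returns is an inference from the two reported figures, not a value stated in the text, and the function names are illustrative.

```python
def fold_coverage(total_bases, genome_size_bp):
    """Coverage = sequenced bases / genome size."""
    return total_bases / genome_size_bp


def implied_genome_size(total_bases, coverage):
    """Rearranged: genome size = sequenced bases / coverage."""
    return total_bases / coverage


pacbio_bases = 111.14e9   # 111.14 Gb raw PacBio yield (from the text)
pacbio_coverage = 180.56  # reported coverage per haplotype (from the text)

size_bp = implied_genome_size(pacbio_bases, pacbio_coverage)
size_mb = size_bp / 1e6  # about 615.5 Mb
```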
2.3.3 Nuclear genome assembly and decontamination
The assembly version 0.0 was generated by assembling 11,430,834 PacBio HiFi reads using the
HiFiAsm+HIC assembler (Cheng et al., 2021) and subsequently polished using RACON (Vaser et
al., 2017). This produced an initial assembly consisting of 4,854 contigs, with a contig N50 of
863.2 Kb, and a total assembly size of 925.8 Mb (Table S3).
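For readers unfamiliar with the metric, contig N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The following is a minimal sketch of that definition with made-up contig lengths; assembly tools such as QUAST report it directly.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("n50 requires at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (arbitrary units).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

In this toy set the total is 54, and the three longest contigs (10, 9, 8) reach 27, so the N50 is 8.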
Hi-C sequencing yielded 626,664,456 2x150 Hi-C Illumina reads, an estimated 145.33x
coverage. The reads were aligned to the assembly version 0.0 using BWA-MEM (Li, 2013). Paired-end reads were mapped independently (as single ends) due to the nature of the Hi-C pair, which
captures conformation via proximity-ligated fragments. A small fraction of single end mapped
reads will contain a ligation junction, an indicator that they are chimeric because they do not
originate from a contiguous piece of DNA. In these cases, only the 5’-side was retained, as the
3’-end generally originates from the same contiguous DNA as the 5’-side of the mated read. The
resulting single-end alignments were combined into a BAM file containing the paired, chimera-filtered Hi-C read alignments. The 3D-DNA (Dudchenko et al., 2017) suite of internal tools was
used to generate a contact map using the resultant BAM file, and the contact map was visualized
using Juicebox (Durand et al., 2016). Chromosome-scale scaffolding was attempted using 3D-DNA, but high levels of contamination and the presence of alternative haplotypes prevented any
meaningful scaffolding.
The assembled contigs were screened against bacterial proteins, organelle sequences, and
the NCBI non-redundant protein (NR) sequence database and removed if found to be a
contaminant. Contigs were classified into bins depending on sequence content. Contamination was
identified using BLASTn against the NCBI non-redundant nucleotide database and BLASTx using
a set of known microbial proteins (Boratyn et al., 2013). Additional contigs were classified in the
version 1.0 release as contaminants (2,863 contigs, 283.9 Mb), chloroplast (181 contigs, 11.0 Mb),
prokaryote (14 contigs, 10.6 Mb), redundant (>95% masked with 24mers that occur more than 2
times in all contigs) (233 contigs, 6.2 Mb), repetitive (>95% masked with 24mers that occur more
than 4 times in contigs greater than the contig N50) (24 contigs, 1.8 Mb), and unanchored rDNA
(3 contigs, 165.1 Kb).
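The "redundant" and "repetitive" classes above depend on what fraction of a contig is covered by k-mers that recur across the assembly. A minimal sketch of that idea follows. It uses k = 4 instead of 24 so the toy sequences stay readable, ignores reverse complements, and uses invented sequences and function names, so it should be read as an illustration of the masking criterion rather than the screening code actually used.

```python
from collections import Counter


def count_kmers(contigs, k):
    """Count every k-mer across all contigs (forward strand only)."""
    counts = Counter()
    for seq in contigs.values():
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def masked_fraction(seq, counts, k, max_occurrences):
    """Fraction of bases covered by k-mers seen more than max_occurrences times."""
    if not seq:
        return 0.0
    masked = [False] * len(seq)
    for i in range(len(seq) - k + 1):
        if counts[seq[i:i + k]] > max_occurrences:
            for j in range(i, i + k):
                masked[j] = True
    return sum(masked) / len(seq)


# Toy contigs: one tandem repeat, one unique sequence.
contigs = {"contig_1": "ACGTACGTACGT", "contig_2": "TTGCAAGGCTCA"}
k = 4  # the text uses 24-mers; 4 keeps the example small
counts = count_kmers(contigs, k)
fractions = {
    name: masked_fraction(seq, counts, k, max_occurrences=2)
    for name, seq in contigs.items()
}
flagged = [name for name, frac in fractions.items() if frac > 0.95]
```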
2.3.4 Hi-C scaffolding and polishing
With contaminant contigs removed, we attempted to scaffold contigs together using the contact
information from 3D-DNA. First, a graph was formed between all contigs and graph edge weights
between any two contigs (u,v) were computed by dividing the number of counts N(u,v) by the total
number of cut sites in both u and v, w(u,v) = N(u,v)/(C(u)+C(v)). We computed a normalized Best
Buddy Weight, BBW(u, v), as the weight, w(u, v), divided by the maximal weight of any edge
incident upon contigs u or v, excluding the (u, v) edge itself (Ghurye et al., 2019). All BBW(u,v)
values > 1 were retained, and a reverse Dijkstra’s algorithm (highest weight graph) was then
utilized to generate the contig order and orientation for the join file. Using this algorithm, a total
of 333 joins were made to form an additional 218 scaffolded contig sets. Each join was padded
with an unsized gap of 10,000 Ns.
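The edge weight and Best Buddy Weight defined above can be computed as in the sketch below. It covers only the weighting and the BBW > 1 filter, not the subsequent path-finding step that orders and orients contigs. Contig names, contact counts, and cut-site counts are hypothetical, and returning infinity when an edge has no competing edge is a convention chosen here for illustration.

```python
def edge_weights(contacts, cut_sites):
    """w(u, v) = N(u, v) / (C(u) + C(v)) for each contig pair with Hi-C contacts."""
    return {
        frozenset((u, v)): n / (cut_sites[u] + cut_sites[v])
        for (u, v), n in contacts.items()
    }


def best_buddy_weight(weights, u, v):
    """w(u, v) divided by the largest weight of any other edge touching u or v."""
    edge = frozenset((u, v))
    rivals = [w for e, w in weights.items() if e != edge and (u in e or v in e)]
    strongest_rival = max(rivals, default=0.0)
    if strongest_rival == 0.0:
        return float("inf")  # no competing edge: treated as an unambiguous join
    return weights[edge] / strongest_rival


# Hypothetical Hi-C contact counts N(u, v) and restriction cut-site counts C(u).
cut_sites = {"contig_a": 10, "contig_b": 10, "contig_c": 20}
contacts = {
    ("contig_a", "contig_b"): 100,
    ("contig_b", "contig_c"): 30,
    ("contig_a", "contig_c"): 15,
}

weights = edge_weights(contacts, cut_sites)
retained = sorted(
    tuple(sorted(e)) for e in weights if best_buddy_weight(weights, *sorted(e)) > 1
)
```

Only the contig_a/contig_b edge is kept here, because its weight is five times that of the strongest competing edge.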
Homozygous SNP and INDEL errors in the consensus were corrected with 282,564,006
Illumina fragment 2x250 reads (66.03x coverage) by aligning the reads using BWA-MEM (Li,
2013) and identifying homozygous SNPs and INDELs with the GATK’s UnifiedGenotyper tool
(McKenna et al., 2010). A total of 502 homozygous SNPs and 20,723 homozygous INDELs were
corrected in the release. The final version 1.0 release contained 612.2.1 Mb of sequence, consisting
of 218 scaffolds and 1,513 contigs, with a contig N50 of 971.7 Kb (Table S4).
2.3.5 Gene annotation
The JGI Annotation Pipeline was used to annotate the Saccharina latissima nuclear genome
assembly (Grigoriev et al., 2014; Kuo et al., 2014). The pipeline predicts, filters, and functionally
annotates gene models as described by Grigoriev et al. (2014). Briefly: repeats in the genome
assembly were masked with RepeatMasker, RepBase, and RepeatScout (Smit et al., 2004; Jurka
et al., 2005; Price et al., 2005). The masked assembly was then used to predict gene models with
ab initio, homology, and transcriptomic modeling methods. Ab initio protein-coding gene
predictions were performed with Fgenesh and GeneMark (Salamov and Solovyev, 2000; Ter-Hovhannisyan et al., 2008). Homology was assessed by BLASTx of the assembly against NCBI
NR (Boratyn et al., 2013). Resulting alignments were used to seed Fgenesh+ and Genewise for
homology-based gene prediction (Salamov and Solovyev, 2000; Birney et al., 2004). A
transcriptome assembly was generated for S. latissima with Illumina RNAseq reads using the
Trinity v2.11.0 transcriptome assembler, and RNA reads were mapped back to the genome
assembly with HISAT2 (Grabherr et al., 2011; Kim et al., 2019a). With these inputs, the
transcriptome-based programs Fgenesh and combest were used to generate gene models (Salamov
and Solovyev, 2000; Zhou et al., 2015; Hoff et al., 2016). Automated filtering of the gene set based
on evidence from sequence homology and transcriptomics was performed to select for a single
representative gene model at each locus, and models with sequence similarity to transposable
elements were eliminated from the annotations. Functional annotation of predicted protein
sequences was then performed to classify signal sequences (SignalP v3), transmembrane and
protein domains (TMHMM, InterproScan), and homologs (BLASTp against NCBI NR, SwissProt, KEGG and KOG) (Nielsen et al., 1997; Melén et al., 2003; Koonin et al., 2004; Quevillon et
al., 2005; Kanehisa, 2006; Apweiler et al., 2012; Boratyn et al., 2013). The version 1.0 S. latissima
genome assembly is hosted on the JGI PhycoCosm comparative algal genome portal
(https://phycocosm.jgi.doe.gov/SlaSLCT1FG3_1) (Grigoriev et al., 2021).
2.3.6 Comparative genomic analyses
Gene content was scored for the analyzed brown macroalgal genomes using BUSCO v5.7.1
(mode: genome; gene predictor: Augustus v3.5.0) against both the Eukaryota and Stramenopiles
ortholog databases (eukaryota_odb10 and stramenopiles_odb10) (Manni et al., 2021). QUAST-LG v5.2.0 was used to compute relevant assembly quality metrics for each genome (Mikheenko
et al., 2018). Synteny of our v1.0 Saccharina latissima assembly to related species was
investigated by aligning to published genomes of Saccharina japonica, Macrocystis pyrifera,
Undaria pinnatifida, and Ectocarpus sp. using the Progressive Cactus v2.6.7 pipeline (Armstrong
et al., 2020). The phylogenetic tree used to seed the Cactus alignment was pruned from the Starko
et al. (2019) kelp phylogeny constructed from plastid, mitochondrial and ribosomal genes.
Blocks of synteny were extracted from the five-species hierarchical whole genome
alignment (HAL) into the BLAT-defined PSL format using halSynteny (Kent, 2002; Hickey et al.,
2013; Krasheninnikova et al., 2020). We applied a method put forward by Nosil et al. (2023) to
establish one-to-one homology between chromosomes of related species using synteny blocks
derived from pairwise Cactus alignments. To appropriately map our Saccharina latissima scaffolds
onto longer reference chromosomes, we modified the method to a many-to-one approach, allowing
multiple S. latissima scaffolds to align to a single chromosome. For each alignment of a S. latissima
scaffold to a reference species chromosome, lengths of syntenic blocks (i.e., “matches” in PSL
format) were summed, and best scaffold-chromosome pairs were identified with respect to each S.
latissima scaffold. The five genome assemblies and the HAL whole genome alignment (converted
to MAF using hal2maf) were also given as input to Ragout v2.3, a reference-assisted scaffolding
tool used to improve assembly contiguity (Hickey et al., 2013; Kolmogorov et al., 2018).
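The many-to-one assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in the project repository: it assumes the S. latissima scaffold is the PSL query (field 10) and the reference chromosome is the PSL target (field 14), and the helper `_psl_record` and all sequence names are hypothetical.

```python
from collections import defaultdict

# 0-based positions of the fields used here in a 21-column, tab-separated PSL record.
PSL_MATCHES, PSL_QNAME, PSL_TNAME = 0, 9, 13


def best_chromosome_per_scaffold(psl_lines):
    """Sum PSL 'matches' per (scaffold, chromosome) pair, then keep the
    chromosome with the largest summed match length for each scaffold.

    Assumption: scaffold = PSL query, reference chromosome = PSL target.
    Several scaffolds may map to the same chromosome (many-to-one).
    """
    totals = defaultdict(int)
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21 or not fields[PSL_MATCHES].isdigit():
            continue  # skip header lines, blank lines, and malformed records
        totals[(fields[PSL_QNAME], fields[PSL_TNAME])] += int(fields[PSL_MATCHES])

    best = {}
    for (scaffold, chrom), matched in totals.items():
        # Ties keep the pair that was encountered first.
        if scaffold not in best or matched > best[scaffold][1]:
            best[scaffold] = (chrom, matched)
    return best


def _psl_record(matches, query, target):
    """Hypothetical helper: build a minimal PSL line for the toy example."""
    fields = ["0"] * 21
    fields[PSL_MATCHES], fields[PSL_QNAME], fields[PSL_TNAME] = str(matches), query, target
    return "\t".join(fields)


example = [
    _psl_record(500, "scaffold_1", "chr01"),
    _psl_record(300, "scaffold_1", "chr01"),
    _psl_record(700, "scaffold_1", "chr02"),
    _psl_record(400, "scaffold_2", "chr01"),
]
print(best_chromosome_per_scaffold(example))
```

In the toy input, both scaffolds are assigned to the same reference chromosome, which is the behavior the many-to-one modification is meant to allow.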
Heatmaps representing synteny between reference chromosomes of each species versus
Saccharina latissima scaffolds were generated using R v4.4.0 (R Core Team, 2024). Ordering of
our v1.0 assembly scaffolds onto synteny-constructed pseudochromosomes was rendered into
genetic map representation using R. Scripts used to perform these analyses and generate figures
are hosted in a public GitHub repository (https://github.com/kdews/s-latissima-genome).
2.3.7 NCBI decontamination
The version 1.0 assembly was screened for vector contamination against the NCBI UniVec database
(https://www.ncbi.nlm.nih.gov/tools/vecscreen/univec/) with the standard command blastall -p
blastn -d Saccharina_latissima.mainGenome.fasta -i UniVec -q -5 -G 3 -E 3 -F "m D" -e 700 -Y
1.75e12 -m 8. A total of 43 vector hits with >=95% identity and >=31 bp alignment length were identified.
Contaminants were identified using the contaminant screener FCS-GX (Astashyn et al., 2024). A
total of 90 contaminated regions were identified (63 TRIM, 21 FIX, 6 EXCLUDE). All identified
vector and contaminant regions were removed from the sequence. If a vector/contaminant region
fell within a contig, then the region was replaced with the same number of N’s. If the
vector/contaminant region fell on the front/end of a contig, the bases were eliminated, and the
annotation GFF file was translated appropriately. These changes resulted in a net loss of 48 contigs,
4,498,393 bp, and a total of 209 annotated genes.
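The hit thresholds and the trim-or-mask rule described above can be illustrated with a short Python sketch. It is a simplified stand-in rather than the procedure actually run for the submission: function names are hypothetical, regions are assumed to be 0-based half-open intervals, and the returned front offset is what an annotation file would need to be shifted by.

```python
def is_reportable_vector_hit(pident, aln_length, min_identity=95.0, min_length=31):
    """Apply the reported thresholds (>=95% identity, >=31 bp) to one BLAST hit."""
    return pident >= min_identity and aln_length >= min_length


def remove_flagged_regions(seq, regions):
    """Trim flagged regions that touch either end of a contig and replace
    internal flagged regions with the same number of N's.

    `regions` holds 0-based, half-open (start, end) intervals on `seq`.
    Returns (cleaned_sequence, bases_trimmed_from_front).
    """
    flagged = [False] * len(seq)
    for start, end in regions:
        for i in range(max(start, 0), min(end, len(seq))):
            flagged[i] = True

    # Walk inward from both ends past any flagged bases; these are trimmed.
    front = 0
    while front < len(seq) and flagged[front]:
        front += 1
    back = len(seq)
    while back > front and flagged[back - 1]:
        back -= 1

    # Everything flagged that remains is internal and is masked with N.
    cleaned = "".join("N" if flagged[i] else seq[i] for i in range(front, back))
    return cleaned, front


cleaned, offset = remove_flagged_regions("ACGTACGTAC", [(0, 2), (4, 6), (8, 10)])
print(cleaned, offset)
```

Masking internal regions with N's keeps downstream coordinates stable, so only end trimming requires shifting annotation coordinates.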
2.3.8 Organelle genome assemblies
Organelle genomes from the same female Saccharina latissima gametophyte (SL-CT1-FG3) were
also assembled using the PacBio reads generated for the nuclear genome assembly. Reference
organelle genomes for S. latissima have already been published, but these genomes do not have
corresponding nuclear genomes from the same individual (Wang et al., 2016; Fan et al., 2020b).
Gene transfers between organelle and nuclear genomes are more easily identified when using
genomes sourced from the same individual (Cui et al., 2021). To identify organelle reads, raw
PacBio reads were aligned to the S. latissima mitochondrial and chloroplast genomes using
minimap2 standard settings (Wang et al., 2016; Li, 2018; Fan et al., 2020b). Read IDs that aligned
to the organelle genomes were extracted using samtools, and then corresponding PacBio reads that
aligned to the respective organelle genomes were subset using seqtk (Li et al., 2009; Li, 2012).
Organelle genomes were then assembled using the Flye version 2.9.2-b1786 assembler, with the
--pacbio-raw flag and genome size estimates based on previously published sugar kelp chloroplast
and mitochondrial genomes (Wang et al., 2016; Kolmogorov et al., 2019; Fan et al., 2020b). We
used GeSeq2 to annotate and compare our sugar kelp genomes versus available published
reference S. latissima organelle genomes, using genome annotations of Undaria pinnatifida
mitochondria and chloroplast as outgroups (Fan, Xie, et al., 2020; Tillich et al., 2017; Wang et al.,
2016; Li et al., 2015; Zhang et al., 2016). A collection of scripts used in this analysis has been
placed in a public GitHub repository (https://github.com/kdews/s-latissima-organelles).
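The read-selection step of this workflow (keep only reads that aligned to a reference organelle genome) can be illustrated in pure Python. This is a stand-in for the samtools and seqtk steps, not the commands actually used: it assumes the alignments are available as SAM text and the reads as four-line FASTQ records, and the function names are hypothetical.

```python
def mapped_read_ids(sam_lines):
    """Collect names of reads with at least one mapped alignment record.

    A record counts as mapped when SAM flag bit 0x4 (segment unmapped) is unset.
    """
    ids = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if int(fields[1]) & 0x4 == 0:
            ids.add(fields[0])
    return ids


def subset_fastq(fastq_lines, keep_ids):
    """Yield (header, sequence, plus, quality) for reads whose ID is in keep_ids.

    Assumes strict four-line FASTQ records; the read ID is the header text
    after '@' up to the first whitespace.
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if header[1:].split()[0] in keep_ids:
            yield header, seq, plus, qual


sam = [
    "@HD\tVN:1.6",
    "read1\t0\tmito\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGGG\tIIII",
    "read3\t16\tmito\t5\t60\t4M\t*\t0\t0\tTTTT\tIIII",
]
fastq = [
    "@read1 extra", "ACGT", "+", "IIII",
    "@read2", "GGGG", "+", "IIII",
    "@read3", "TTTT", "+", "IIII",
]
organelle_reads = list(subset_fastq(fastq, mapped_read_ids(sam)))
print(organelle_reads)
```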
2.4 Results
2.4.1 Nuclear genome assembly
We evaluated the nuclear genome content of our assembled Saccharina latissima by assessing the
presence of conserved orthologs, genome contiguity, and whole genome synteny.
2.4.2 General statistics
Long-read sequencing of an individual Saccharina latissima gametophyte (var. SL-CT1-FG3)
yielded 111 Gb of PacBio HiFi reads, estimated to represent ~180x genomic coverage. Our initial
de novo assembly (v0.0) was 925.8 Mb and consisted of 4,854 contigs (Table S3). Following
contaminant filtering, a total of 3,341 contigs (283.9 Mb) were removed. Hi-C scaffolding joined
333 of the v0.0 contigs into 218 scaffolds. The final version 1.0 S. latissima genome contains a
total of 218 scaffolds and 1,513 contigs, with a genome size of 615.5 Mb (0.5% gaps) and scaffold
N50 of 1.35 Mb (Figure 2.1, Table 2.1).
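For readers unfamiliar with the N50 statistic quoted above, the following Python sketch shows the standard definition: the length L such that sequences of length L or longer contain at least half of all assembled bases. The function name and toy lengths are illustrative; the reported values were computed with QUAST-LG, not with this code.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Toy example: total is 30, so the running sum first reaches 15 at the second sequence.
print(n50([10, 8, 6, 4, 2]))
```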
Figure 2.1. Log10-scaled scaffold size distribution for five brown macroalgal species: Ectocarpus sp. Ec32 (red),
Undaria pinnatifida (yellow), Macrocystis pyrifera (green), Saccharina japonica (blue), and Saccharina latissima
(purple).
Statistic | Ectocarpus sp. Ec32 | Undaria pinnatifida | Macrocystis pyrifera | Saccharina japonica | Saccharina latissima
Genome size (Mb) | 196.8 | 511.3 | 537.5 | 548.54 | 615.5
Chromosomes* | 28 | 30 | 35 | 31 | est. 32
Scaffolds* | 28 | 114 | 35 | 31 | 218
Contigs | 12,767 | 618 | 921 | 37,788 | 1,513
Largest scaffold (Mb)* | 10.32 | 32.30 | 26.51 | 19.97 | 11.32
Scaffold N50 (Mb)* | 6.53 | 16.51 | 13.67 | 12.42 | 1.35
Contig N50 (Kb) | 32 | 1,800 | 1,000 | 44 | 971
Percent gaps | 2.602% | 0.049% | 0.013% | 1.733% | 0.541%
GC content | 53.59% | 50.14% | 50.37% | 49.66% | 49.84%
Genes | 18,369 | 12,499 | 25,919 | 50,098 | 25,012
Complete BUSCOs, Stramenopiles | 95.0% | 92.0% | 94.0% | 87.0% | 86.0%
Complete BUSCOs, Eukaryota | 69.0% | 70.6% | 69.8% | 57.7% | 59.2%
Genome coverage | 121x | 120x | 100x | 178x | 185x
Citation | Cormier et al. (2017) | Shan et al. (2020) | Diesel et al. (2023) | Fan et al. (2020a) | This publication
JGI / ORCAE ID | EctsiV2 | Undpi1_1 | Macpyr2 | Sacja | SlaSLCT1FG3_1
Table 2.1. Comparison of Saccharina latissima nuclear genome assembly statistics to those of related brown
macroalgal species. *Excludes artificial chromosomes.
To evaluate the quality and completeness of the Saccharina latissima genome assembly
reported here, the genomes of four related brown algae were compared: the model brown alga
Ectocarpus sp. (Cormier et al., 2017), giant kelp Macrocystis pyrifera (Diesel et al., 2023), and the
widely cultivated kelps wakame Undaria pinnatifida (Shan et al., 2020) and Japanese sugar kelp
Saccharina japonica (Fan et al., 2020a). Three early genome size predictions for S. latissima used
standard methods of staining and flow cytometry to estimate a genome size of 588–720 Mb
(Phillips et al., 2011). These genome size estimates agree with our v1.0 assembly size of 615.5
Mb. Of all compared reference genomes, the GC content (49.84%) and size of our S. latissima
assembly are closest to those of S. japonica (49.66% GC, 548.54 Mb), the expected result given that
the two species share a genus and are the least diverged of the five assessed species (Table 2.1).
2.4.3 Contiguity and chromosome number
Despite high genomic sequencing coverage (185x) with long PacBio reads, as well as Hi-C
sequencing to 145x coverage, the genome assembly could not be scaffolded to the level of
chromosomes (Table 2.1). Our reported scaffold N50 of 1.35 Mb is an order of magnitude smaller than
those of similarly sized brown macroalgal genomes. Chromosome number estimates in brown macroalgae
are complicated due to sex-specific polyteny and population-specific ploidy variation in sugar kelp
meristem sporophyte tissue (Müller et al., 2016; Goecke et al., 2022). Chromosome number
predictions in Saccharina species vary depending upon which method is used: flow cytometry has
estimated a genome size of 588–720 Mb with 62 chromosomes (Phillips et al., 2011), while
microscopy has estimated 31 chromosomes (Liu et al., 2012b, 2022). Recent brown macroalgal
genome assemblies have predicted 28–34 chromosomes (Cormier et al., 2017; Fan et al., 2020a;
Shan et al., 2020; Diesel et al., 2023). We estimated S. latissima chromosome number from our
assembly using synteny-based re-scaffolding of the 155 longest scaffolds and contigs, which
yielded 32 pseudochromosomes (Kolmogorov et al., 2018) (Supplementary Figure 4D).
Figure 2.2. The number of Ns (unknown bases) per 100 kbp (gray bars) are compared for each genome assembly.
BUSCO scoring using the Stramenopiles ortholog database (ODB10, n=100) shows relative counts of complete
(blue), fragmented (yellow), and missing (red) orthologs in each assembly. Phylogenetic tree showing relatedness of
compared species pruned from Starko et al. (2019).
2.4.4 Gene content
Gene content was evaluated with BUSCO, which evaluates a given assembly against the set of
single-copy, highly conserved orthologous genes predicted to be present in a specific clade (Manni
et al., 2021). We benchmarked the five compared genomes against two relevant lineages: Eukaryota,
which provides a general metric for comparison across all conserved eukaryotic genes, and
Stramenopiles, the monophyletic clade containing brown algae. The percentage of complete
BUSCOs (comprising single copy and duplicate orthologs) detected in an assembly can be used
as a proxy for genome completeness and to detect artificial duplications resulting from de novo
assembly. Benchmarked against Eukaryota, our S. latissima assembly (60% complete BUSCOs)
scores similarly to S. japonica (64.3% complete BUSCOs) (Supplementary Figure 1, Table 2.1).
Of the 100 BUSCOs in Stramenopiles, our genome contains 86 complete BUSCOs (86%), with
85 single copy and 1 duplicated, and the remainder fragmented (3) and missing (11) (Figure 2.2,
Table 2.1). Incomplete BUSCOs in the S. latissima genome could be attributed to lower contiguity,
which would be further exacerbated by the Laminariales gene structure that often features long
intronic regions. Long genes split between unassembled regions could be fragmented beyond the
threshold of local alignment detection.
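The Stramenopiles counts reported above can be checked with a short Python sketch of how BUSCO categories combine, where complete orthologs are the sum of single-copy and duplicated ones. The function name and dictionary keys are illustrative and are not BUSCO output fields.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Combine BUSCO category counts into totals and percentages."""
    total = single + duplicated + fragmented + missing
    complete = single + duplicated  # complete = single-copy + duplicated

    def pct(count):
        return round(100 * count / total, 1)

    return {
        "total": total,
        "complete": complete,
        "complete_pct": pct(complete),
        "duplicated_pct": pct(duplicated),
        "fragmented_pct": pct(fragmented),
        "missing_pct": pct(missing),
    }


# Counts reported for the S. latissima assembly against the Stramenopiles set.
print(busco_summary(single=85, duplicated=1, fragmented=3, missing=11))
```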
2.4.5 Synteny analysis
2.4.5.1 Interspecies whole genome alignment
To capture canonical genomic rearrangements between related species, we performed whole multi-genome alignments (Armstrong et al., 2020) of our Saccharina latissima assembly against the
genomes of four related brown macroalgal species, which formed the basis of our homology
analysis. Homology to at least one of the compared genomes was detected in 99% of our assembly
across 1,123 scaffolds and contigs (613.38 Mb) (Figure 2.3B). On average, exact matches spanned
15% of each scaffold in S. latissima. A core set of 858 scaffolds and contigs (525.43 Mb) in the
Saccharina latissima genome aligned to all four of the compared brown algal species (Figure
2.3A).
Figure 2.3. (A) Venn diagram highlights overlaps of unique Saccharina latissima homologs mapping to genomes of
four related species. (B) Heatmap shows the maximal syntenic match of 1,110 S. latissima scaffolds and contigs to
chromosomes of related species.
As expected, S. latissima had the highest total exact matches to S. japonica of all species,
both genome-wide (181.89 Mb) and averaged per chromosome (5.68 Mb) (Supplementary Table
5). Overall, total exact matches between alignments increase with species relatedness to S.
latissima, a trend that holds both genome-wide and per chromosome (Supplementary Figure 2B,
Supplementary Table 5). Linear regression of S. latissima scaffold lengths versus their respective
homologous reference chromosome lengths yielded slopes that closely correspond to genome size
ratios, with the smallest reference, Ectocarpus sp. Ec32, reflecting a 3x shorter genome with a
slope of 3.06 (Supplementary Figure 2A).
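As a rough check on this comparison, the genome size ratios implied by Table 2.1 can be computed directly. The Python sketch below uses only the assembly sizes listed in that table; the variable and function names are illustrative.

```python
# Assembly sizes in Mb, taken from Table 2.1.
GENOME_SIZE_MB = {
    "Ectocarpus sp. Ec32": 196.8,
    "Undaria pinnatifida": 511.3,
    "Macrocystis pyrifera": 537.5,
    "Saccharina japonica": 548.54,
}
S_LATISSIMA_MB = 615.5


def size_ratio(reference):
    """Ratio of the S. latissima assembly size to a reference assembly size."""
    return S_LATISSIMA_MB / GENOME_SIZE_MB[reference]


for species in GENOME_SIZE_MB:
    print(f"{species}: {size_ratio(species):.2f}")
```

For Ectocarpus sp. Ec32 the ratio is about 3.13, close to the regression slope of 3.06 reported above.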
2.4.5.2 Reference-based scaffold ordering
We further leveraged syntenic information to order our Saccharina latissima scaffolds into
pseudochromosomes. The S. latissima v1.0 assembly was rescaffolded repeatedly under varying
parameters into 32–40 pseudochromosomes that incorporated 92–464 scaffolds and contigs
(Supplementary Figure 4). One pseudochromosomal S. latissima assembly (Supplementary Figure
4D) was used to map gene orthology against Undaria pinnatifida (Figure 2.4B).
Figure 2.4. (A) Size distribution (log10 bp scale) of our v1.0 Saccharina latissima assembly before and after synteny-based rescaffolding shows 267 scaffolds (light blue) incorporated into 38 pseudochromosomes (dark blue). (B) Gene
orthology between 32 S. latissima pseudochromosomes (dark green) and 38 Undaria pinnatifida chromosomes and
contigs (light green) was mapped using 3 Mbp windows containing 10 single-copy orthologs, with band colors
denoting highest synteny (gray), ortholog rearrangement (red) and chromosomal splitting or fusion (purple).
2.4.6 Organelle genomes
Our assembled Saccharina latissima chloroplast genome (130,613 bp) is almost the same length
as the published chloroplast genome (130,619 bp) (Fan et al., 2020b) (Table 2.2, Supplementary
Figure 5, Supplementary Figure 7B). Our assembled S. latissima mitochondrial genome is 37,510
bp and contains 39 genes. It is slightly smaller than the previously published mitochondrial genome
(Wang et al., 2016) (37,659 bp), but contains an additional tRNA annotation (Table 2.2,
Supplementary Figure 6, Supplementary Figure 7A). For each of the new organelle genomes
reported here, a consensus sequence was generated through multiple separate assemblies with
Flye, a long-read assembler especially robust to sequencing errors and specialized to resolve
repetitive regions (Kolmogorov et al., 2019). Long reads allowed for resolution of inverted repeat
(IR) regions in the chloroplast genome that typically span ~5.5 kb in brown macroalgae (Rana et
al., 2019). The average sequencing read length (8,389 bp) also aided with assembly, as each read
covers ~22% and ~6% of mitochondria and chloroplast genome lengths, respectively.
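The per-read coverage fractions quoted above follow directly from the average read length and the two assembled organelle genome sizes, as this small Python sketch shows (names are illustrative).

```python
AVERAGE_READ_LENGTH_BP = 8_389
ORGANELLE_GENOME_BP = {"mitochondria": 37_510, "chloroplast": 130_613}


def read_fraction(read_length, genome_length):
    """Fraction of a genome spanned by a single read of the given length."""
    return read_length / genome_length


for organelle, size in ORGANELLE_GENOME_BP.items():
    print(f"{organelle}: {100 * read_fraction(AVERAGE_READ_LENGTH_BP, size):.1f}%")
```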
Saccharina latissima organelle genome (version) | Sequencer | Average read length (bp) | Size (bp) | Genes | tRNAs | rRNAs
Mitochondria (this genome) | PacBio Sequel II | 8,389 | 37,510 | 39 | 19 | 3
Mitochondria (Wang et al., 2016) | Illumina HiSeq 2000 | 200 | 37,659 | 38 | 18 | 3
Chloroplast (this publication) | PacBio Sequel II | 8,389 | 130,613 | 138 | 27 | 3
Chloroplast (Fan et al., 2020b) | Illumina HiSeq 2000 | 200 | 130,619 | 138 | 27 | 3
Table 2.2. Comparison of Saccharina latissima organelle genome versions on assembly and gene annotation
statistics.
2.5 Discussion
2.5.1 Genomic tools in breeding and agriculture
Many tools used in breeding rely on the availability of high-quality plant genomes and properly
assigned synteny and gene annotation. For example, the use of modern genomic engineering
strategies requires accounting for potential off target genome editing (Guo et al., 2023).
Additionally, to calculate a polygenic score (PGS) that accounts for genomic estimated breeding
value (EBV), variation must be tabulated across the whole genome (Meuwissen et al., 2001; Calus
and Veerkamp, 2007; Heffner et al., 2009; Ding et al., 2023). And to predict performance of genetic
crosses, chromosome rearrangements between them must be known (Nuzhdin et al., 2012). Here,
we have assembled a genome of sufficient quality to support
intraspecific breeding programs (Li et al., 2022), as well as possible crossbreeding with closely
related species such as Saccharina japonica.
2.5.2 Enhanced sugar kelp breeding with genomic tools
The decrease in sequencing costs has led to an increase in genomic resources for non-model
species such as brown algae, which until recently were severely understudied genetically. Nuclear
genomes are now available for some Phaeophyta species, e.g., Ectocarpus sp. (Cock et al., 2010;
Cormier et al., 2017), Saccharina japonica (Ye et al., 2015; Liu et al., 2019; Fan et al., 2020a),
Undaria pinnatifida (Shan et al., 2020; Graf et al., 2021) and published plastid and mitochondria
genome assemblies are also mounting (Secq et al., 2006; Chen et al., 2019; Rana et al., 2021). For
Saccharina latissima, short-read organelle genome assemblies have become available over the past
decade (Wang et al., 2016; Fan et al., 2020b; Rana et al., 2021). Recently a nuclear assembly of
Northern European S. latissima was made available on the Phaeoexplorer brown algal genome
database. The European S. latissima genome is 531.3 Mb, contains 17,672 predicted genes (12,549
functionally annotated), and has not been scaffolded (4,592 contigs, N50 = 247 kb).
This annotated and scaffolded sugar kelp genome represents a major step forward as it can
now serve as a reference for future research, including population genetics, gene expression,
conservation, breeding, and domestication. This sugar kelp genome has served as the backbone of
recent applied genomic advances of a sugar kelp selective breeding project including a genomic
selection model targeting higher yield (Huang et al., 2023; Y. Li et al., 2022b; Umanzor et al.,
2021). New domestication strategies, such as genomic selection, require annotated reference
genomes to expedite the selection cycles compared to traditional phenotypic domestication
methods. The importance of genomics to kelp domestication projects has been highlighted by other
industry leaders (Goecke et al., 2022; Hu et al., 2023). Since sugar kelp is the most farmed kelp in
the US and Europe, publishing an annotated reference genome will open breeding opportunities
for traits such as heat-tolerance and lower iodine content for the kelp farming industry.
2.5.3 Identification of homologous chromosomes in multiple species undergoing domestication
In general, genome quality and completeness can be assessed through gene annotation and
evaluation of sequence contiguity. Detection of conserved gene orthologs in de novo assemblies
can help infer genome completeness and score relative gene content. Particularly when comparing
amongst species within a monophyletic clade, quantification of BUSCOs (Manni et al., 2021)
leverages the strong retention and sequence conservation experienced by single-copy orthologs
(Waterhouse et al., 2011). From a structural perspective, genome contiguity informs the degree to
which assembly and scaffolding methods have succeeded in reconstructing chromosomes from
whole-genome shotgun sequencing. It is typically calculated with statistics including contig N50,
scaffold N50, and N (unknown base) content.
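The N-content metric mentioned above can be computed from the assembled sequences alone. The Python sketch below is illustrative (hypothetical function name, toy sequences) and is not the QUAST-LG implementation; it reports unknown bases per 100 kbp and the equivalent percentage of gaps.

```python
def n_content_stats(sequences):
    """Return N's per 100 kbp and percent gaps for a set of assembled sequences."""
    total_bases = sum(len(seq) for seq in sequences)
    if total_bases == 0:
        raise ValueError("n_content_stats() needs at least one non-empty sequence")
    n_bases = sum(seq.upper().count("N") for seq in sequences)
    return {
        "n_per_100_kbp": n_bases * 100_000 / total_bases,
        "percent_gaps": 100 * n_bases / total_bases,
    }


# Toy assembly: 2 unknown bases in 20 total bases.
print(n_content_stats(["ACGTNNACGT", "ACGTACGTAC"]))
```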
In brown macroalgae, Undaria pinnatifida (Shan et al., 2020) and Macrocystis pyrifera
(Diesel et al., 2023) represent the most highly scaffolded genome assemblies, with 98.4% and 92%
of sequence contained in 30 and 35 chromosomes, respectively. In contrast, although the
Saccharina japonica assembly contains 31 chromosomes, over 35% of the genome is unscaffolded
(Fan et al., 2020a). In addition, the metric of N (i.e., unknown bases) per 100 kbp is far higher in
the assemblies of both Saccharina species compared here (Figure 2.2). Difficulty reconstructing
chromosomes from sequencing in Saccharina japonica and Saccharina latissima, despite
comparable long-read, high-coverage, Hi-C assisted assembly methods employed by all genomes
compared here (Table 2.1), may speak to a structural quality present in the Saccharina genus that
impedes efforts to scaffold the genome, such as repeat regions. Comparative genomic analyses like
those presented here help to correlate chromosomes between species for more robust predictions
in genomic breeding technologies that leverage the increasing amount of quality, annotated brown
macroalgal assemblies.
2.5.4 Improved organelle genome resources
While genomics has greatly increased breeding efficiency in plants, animals, and kelp, most of
these breeding programs rely strictly on nuclear genomes for establishing markers for selective
breeding. However, integrating organelle genomes into breeding models may increase breeding
efficiency (Kersten et al., 2016). By optimizing cytonuclear interactions, breeding efficiency for
sugar kelp in the future can be accelerated (Colombo, 2019). As nuclear and organelle genomes
can differ between individuals in the same species, pairing nuclear and cytoplasmic genomes from
a single genotype can provide a reference for future breeding experiments. Our use of long-read
sequencing increases our confidence in both the nuclear and organelle genomes to serve as a
foundation for future sugar kelp breeding.
2.5.5 Future Directions
This scaffolded and annotated sugar kelp genome is a fundamental resource for subsequent
Saccharina latissima basic science research, breeding, and conservation experiments. Although
we have worked with this genome for several years, it can and should still be improved through further
long-read sequencing and assembly, with the goal of a better-scaffolded genome. A
sugar kelp pan-genome that includes individuals from important regions such as Alaska, Gulf of
Maine, and Northern Europe should also be generated. This genome can also be used with other
high-quality brown macroalgae genomes for comparative genomics studies to examine potential
evolutionary differences between brown algae.
2.6 Funding
This project was supported by the U.S. Department of Energy (DOE) Advanced Research Projects
Agency–Energy (ARPA-E) program Macroalgae Research Inspiring Novel Energy Resources
(MARINER) awards DE-AR0000914 and DE-AR0000915. A portion of these data were produced
by the U.S. Department of Energy Joint Genome Institute in collaboration with the user
community. The work conducted by the U.S. Department of Energy Joint Genome Institute
(https://ror.org/04xm1d337), a DOE Office of Science User Facility, is supported by the Office of
Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
2.7 Author contributions
KD performed data analysis of nuclear and organelle genomes, generated figures, and led
manuscript preparation and writing. GM performed organelle genome assembly and annotation,
consulted on nuclear genome analysis, and contributed to manuscript writing. SL, CY and JLJ
conceptualized the project and experimental design and acquired project funding. JT was
responsible for biomolecule extraction. MW customized libraries and performed sequencing. CP
and JJ generated the nuclear genome assembly, and JJ was responsible for integration. JG
performed data collection. JS was the project and computational lead. AL generated the
transcriptome used for nuclear genome annotation. SC and IVG contributed to genome annotation
and comparative analysis. GM, SN, SL, CY, and JLJ supervised the project. KD, GM, SN, SL, CY,
IVG, JS, and JLJ edited the manuscript.
Chapter 3: Genetic variant annotation toward the detection of
naturally non-reproductive kelp genotypes in germplasm
Kelly J. DeWeese¹, Gary Molano¹, Sergey Nuzhdin¹, Scott Lindell²
¹ Department of Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, United States
² Applied Ocean Physics and Engineering Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543, United States
3.1 Abstract
Kelp aquaculture has rapidly expanded over the past decade, driven by its sustainability, restorative
potential, and nutritional value. The United States native Saccharina latissima (sugar kelp) is a
promising species for cultivation efforts to support growing food demands amid environmental
pressures on conventional food sources. The integration of genetic tools into breeding programs
has been demonstrated to accelerate crop improvement and allow for more precise trait selection,
and the recent release of an annotated genome assembly for North American S. latissima provides
a valuable resource for genomics-assisted breeding in kelp. A critical goal for the future of kelp
cultivation is the development of technologies to control kelp reproduction to address
environmental concerns of gene flow. Utilizing this genome, we analyzed whole-genome
sequencing (WGS) data from 278 S. latissima haploid gametophytes to predict potentially non-reproductive crosses from genetic variants. We pre-screened candidate loci linked to reproduction
for variants with a predicted high degree of impact on protein products (e.g., nonsense and
missense mutations). We predicted putative non-reproductive S. latissima test crosses from the set
of genotyped gametophytes over two farming seasons and have seen promising preliminary results
from phenotyped outplants. This analysis demonstrates the possible gains when comprehensive
genomic resources are leveraged to enhance trait selection efficiency in the creation of a novel
domesticate. The predictive approach outlined here can significantly reduce the need for costly
and resource-intensive phenotyping and high-power genome-wide association studies (GWAS) or
genomic selection (GS) models. Our findings demonstrate the utility of genomic resources in kelp
trait prediction, paving the way for more economical and targeted breeding initiatives in marine
aquaculture.
3.2 Introduction
Until very recently, human agronomic history was dominated by classic artificial selection and
crossing, whereby cultivars are established through generations of human-mediated selection of
preferred crop traits and hybridization is employed to combine multiple advantageous phenotypes
(Khush, 2001; Meyer and Purugganan, 2013). Plant domestication efforts initiated by our ancestors
generated crop cultivars with enhanced harvest size, consistency, flavor, and nutrition well before
the field of genetics was conceived of (Kleinhofs, 1983). As reviewed extensively by Khush
(2001), food crop production experienced exponential gains in the latter half of the 20th century
during the Green Revolution when breeders began to enhance their crossing schemes using genetic
markers (McMillin, 1983; Dekkers and Hospital, 2002). In the modern era, expanded availability
of next-generation and third-generation sequencing has enabled further achievements in both
disease mitigation and quality enhancement in crop cultivars developed through genomic breeding,
which includes the methods of marker-assisted selection (MAS), association mapping, and
genomic selection (GS) (Varshney et al., 2005, 2021; Desta and Ortiz, 2014; Phan and Sim, 2017).
Modern genomic breeding techniques developed for terrestrial agriculture have a
demonstrated capacity to improve our existing plant and animal domesticates; notably, these
techniques also present the opportunity for rapid de novo domestication of species that have not
undergone millennia of phenotypic selection by humans (Zsögön et al., 2018; Jian et al., 2022).
Genomics-informed de novo domestication methods have the inherent benefits of decreasing the
generations required for trait selection (estimated >20 generations under phenotypic artificial
selection), preserving wild genetic diversity, and detangling linked advantageous and deleterious
alleles (Fernie and Yan, 2019).
Among potential candidate species for de novo domestication in the modern era, several
brown macroalgal (kelp) species have emerged as particularly attractive due to their ecological
significance, rapid growth rates, and potential as a sustainable food resource. In addition to direct
human consumption, kelps also have established food additive, feed, and pharmaceutical
applications (McHugh, 2003). As the industry of kelp aquaculture has grown at an incredible speed
in the 21st century (FAO, 2024), concerns have arisen about the environmental impacts of farming
improved and non-native kelp species (Grebe et al., 2019). In marine environments, human activity
including shipping and aquaculture has historically led to the introduction of species (Molnar et
al., 2008; Tan et al., 2023) that can become invasive and detrimentally affect sensitive ocean
ecosystems through competitive interactions with native species (Mooney and Cleland, 2001).
Already, mariculture of the northeast Asian kelp Undaria pinnatifida has driven widespread
invasions into coasts from Europe to the Americas (James and Shears, 2016; Epstein and Smale,
2017). Traditional strategies to manage invasive species, such as physical removal or biocides, are
often impractical or unscalable in the open environment of the ocean (Thresher and Kuris, 2004).
Regulation of the emerging kelp aquaculture industry in the US has thus focused on controlling
crop-to-wild gene flow from farms into local coastal ecosystems (Grebe et al., 2019). Coastal US
kelp aquaculture will require breeding and management strategies that inherently reduce the risks
of negative environmental impacts from new farming efforts.
Though commercial kelp aquaculture is over a century old, kelp production has historically
relied upon wild harvest or collection of reproductive tissues from wild kelp populations (Robinson
et al., 2013; Peteiro et al., 2016). The consequence of this history of cultivation is that there are
not traditionally improved domesticates analogous to those present in terrestrial agriculture for
farmed kelp species (Kim et al., 2017); work to generate cultivars of the kelps U. pinnatifida and
Saccharina japonica has had a fraction of the time to reap the benefits of phenotypic selection as
compared to agricultural crops (Yamanaka and Akiyama, 1993). And counter to terrestrial farms,
the establishment of ocean farms for kelp species has a far more limited—or even nonexistent—
ability to remodel or control the farm environment (Camus et al., 2018). Given present limitations,
cultivar development for kelp species is obtained by trait optimization in selective breeding
programs. To accelerate the pace of cultivar selection and improvement in kelp species and jump-start de novo domestication efforts, research has increasingly turned to genetic tools for breeding
(Hwang et al., 2018; Graf et al., 2021). Globally, the development of genomic resources for several
brown algal species (Cormier et al., 2017; Fan et al., 2020a; Shan et al., 2020; Diesel et al., 2023;
DeWeese et al., 2024) has already expedited and enhanced research and development of kelp
cultivars for farmed settings (Liu et al., 2012a; Gao et al., 2013; Peteiro et al., 2016; Hwang et al.,
2017; Sato et al., 2020; Li et al., 2022; Huang et al., 2023; Wang et al., 2023).
In a US regulatory context, the development of kelp cultivars with genetic research is
currently limited by low public endorsement for genetic modification in aquatic organisms
(Abdelrahman et al., 2017). Here we address both regulatory desires to (1) limit the reproductive
capacity of farmed kelps and (2) develop kelp cultivars utilizing natural genetic variation present
in wild populations to detect potentially non-reproductive crosses. We propose to leverage the
haplodiplontic life history (Figure 3.1A) of the US native kelp Saccharina latissima to block
reproduction specifically at the sporophyte stage (Figure 3.1B) by employing genomics-informed
selective breeding without genetic modification. Several kelp species have heteromorphic haploid-diploid life stage alternation that importantly features a free-living microscopic haploid stage
called a gametophyte (Liu et al., 2017), which has the capacity to both produce gametes and
clonally propagate, potentially infinitely under correct laboratory conditions (Redmond et al.,
2014). These gametophytes can be preserved as immortal genotypes in a germplasm bank where
they can be used to continuously seed crosses to produce macroscopic diploid sporophyte plants
for farms (Kim et al., 2017). Essentially, we plan to linearize the S. latissima life cycle starting
from cultures of the gametophyte stage that possess specific alleles that, when crossed to produce
progeny homozygous for these alleles, generate a non-reproductive sporophyte stage (Figure
3.1B). Development of non-reproductive S. latissima cultivars for aquaculture supports
ecologically responsible breeding goals and allows for phenotypic improvement of economically
important kelp species while minimizing the environmental impact of kelp mariculture.
Figure 3.1. The schematic demonstrates how the haplodiplontic kelp life history (A) can be linearized (B) by
selecting and crossing haploid gametophytes possessing deleterious alleles that cause a non-reproductive phenotype
in the resulting sporophyte.
Here we report our analysis to detect putative non-reproductive variants in S. latissima
genotypes by sequencing an extant gametophyte germplasm collection sampled from wild
populations in the US (Mao et al., 2020). Using a high-quality, annotated S. latissima reference
genome (DeWeese et al., 2024), we scanned whole genome sequencing (WGS) data from 278
haploid gametophyte genotypes for highly deleterious mutations in putatively reproductive genes.
Examining existing genetic variation from allelically rich wild S. latissima populations allowed us
to find promising rare genetic variants that could lead to aberrant reproductive development in S.
latissima sporophytes. We analyzed phenotypic data from several seasons of S. latissima test farms
to evaluate the observed sporophyte morphology of crosses we predicted bioinformatically to be
non-reproductive and have seen promising early results. We expect that the approach posed in this
work can be applied to genomics-informed breeding programs in the future to streamline the
preliminary phases of trait mapping. Additionally, we hope that the generation of non-reproductive
kelp domesticates from this and related works establishes a breeding strategy aligned with
ecological sustainability for the next age of de novo crop domestication.
3.3 Methods
3.3.1 Sampling, culture, and DNA sequencing
Wild Saccharina latissima sporophytes sampled from 15 locations in New England were analyzed
to identify two wild subpopulations and used to establish a germplasm bank (Mao et al., 2020).
From these founder populations, a collection of male and female S. latissima gametophytes was
established and has been maintained in a laboratory setting according to conventional algal culture
protocols (Redmond et al., 2014; Alsuwaiyan et al., 2019). 278 gametophytes from this collection
representing the variation present in the founder populations were grown up for whole genome
sequencing (WGS), as described by Huang et al. (2023). Briefly, a version of the Macherey-Nagel
NucleoSpin Plant II Maxi Kit (Macherey-Nagel, Düren, Germany) protocol modified for
macroalgal tissue was used to extract DNA. From each gametophyte sample, 24 mg of fresh tissue
was spun down, frozen in liquid nitrogen, and ground using a plastic pestle. DNA extraction was
performed with CTAB buffer (Doyle and Doyle, 1990; Augyte et al., 2018). At HudsonAlpha
(Huntsville, Alabama, USA), gametophyte DNA was cleaned using a DNeasy PowerClean Pro
Cleanup Kit (Qiagen), and amplified Illumina libraries were generated in 96-well format with an
Illumina TruSeq Nano HT library kit following standard protocols. DNA sequencing was performed on
an Illumina NovaSeq 6000 instrument at 2x150 base pair read length.
3.3.2 RNA sequencing and differential expression analysis
From 6 freshly collected Saccharina latissima sporophytes, 4 tissue types were excised: holdfast,
meristematic blade, distal blade, and sorus. Fresh tissues were immediately flash frozen in liquid
nitrogen. HudsonAlpha (Huntsville, Alabama, USA) performed RNA extraction and prepared
2x150bp libraries from 12 samples of sufficient quality (RIN ≥ 3.5). RNA libraries were sequenced
on an Illumina NovaSeq 6000. Quality and adapter trimming was performed on RNA reads using
Trim Galore v0.6.7 (Krueger et al., 2021). Trimmed reads were aligned to the nuclear S. latissima
reference genome (DeWeese et al., 2024) using RSEM v1.3.3 (Li and Dewey, 2011). Differential
expression analysis was performed with DESeq2 v1.30.0 (Love et al., 2014) and plots were
rendered in R with the package EnhancedVolcano v1.8 (Blighe et al., 2021). A public repository
of code used in the differential expression analysis is hosted at https://github.com/kdews/slatissima-de.
3.3.3 Detection of conserved reproductive genes with protein homology
We conducted a literature review of conserved eukaryotic, and specifically algal (Liu et
al., 2017), proteins involved in reproduction to curate the following: Spo11 (Engebrecht et al.,
1991; Li et al., 2006; Borde, 2007; Keeney, 2008; Chen et al., 2010), Mre11 (Li et al., 2006; Borde,
2007; Keeney, 2008), Rad50 (Engebrecht et al., 1991; Li et al., 2006; Borde, 2007; Keeney, 2008;
Chen et al., 2010), and Rck (Chen et al., 2010). Search terms for these known conserved
reproductive proteins, as well as general terms for meiotic and reproductive function, were used
to probe annotated S. latissima genes (DeWeese et al., 2024) through the US Department of Energy
Joint Genome Institute’s PhycoCosm database (phycocosm.jgi.doe.gov/SlaSLCT1FG3_1) to
identify any genes already functionally implicated in reproduction. To procure additional
candidates, sequences of these conserved proteins annotated in the model brown alga Ectocarpus
sp. (Cormier et al., 2017) were used as queries for BLASTp (Boratyn et al., 2013) searches against
a reference of all annotated S. latissima protein sequences. We further used these Ectocarpus
sequences to query translated open reading frames (ORFs) from a pre-release version of the
Macrocystis pyrifera genome (Diesel et al., 2023) to increase the probability of finding targets in
sequences of a more closely related kelp species. Resulting homologous M. pyrifera proteins were
then also used to query the S. latissima protein sequence BLASTp reference.
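As an illustration of the homology step, the sketch below keeps the best-scoring S. latissima hit for each query protein from BLASTp tabular output. It assumes the standard 12-column tabular format (`-outfmt 6`). The E-value cutoff of 1e-5, the function name, and all sequence names are hypothetical and are not taken from this study.

```python
def best_hits(blast_lines, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} keeping the top bit score.

    Assumes BLAST tabular columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    The max_evalue default is a hypothetical cutoff, not the study's setting.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.rstrip("\n").split("\t")
        query, subject = cols[0], cols[1]
        evalue, bitscore = float(cols[10]), float(cols[11])
        if evalue > max_evalue:
            continue  # hit too weak to count as a candidate homolog
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits: two for one Ectocarpus query, one weak hit for another.
example_lines = [
    "\t".join(["Ec_query1", "Sl_prot1", "50.0", "100", "50", "0",
               "1", "100", "1", "100", "1e-30", "200"]),
    "\t".join(["Ec_query1", "Sl_prot2", "60.0", "100", "40", "0",
               "1", "100", "1", "100", "1e-40", "250"]),
    "\t".join(["Ec_query2", "Sl_prot3", "25.0", "40", "30", "0",
               "1", "40", "1", "40", "0.5", "30"]),
]
print(best_hits(example_lines))
```

Keeping only the top hit per query is one simple design choice; retaining all hits under the cutoff would give a more inclusive candidate list.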
3.3.4 Variant calling
Whole genome sequencing (WGS) reads from 380 genomic libraries were trimmed with Trim
Galore v0.6.7 (Krueger et al., 2021). Trimmed reads were aligned to the Saccharina latissima
reference genome (DeWeese et al., 2024) using HISAT2 v2.2.1 (Kim et al., 2019a). Alignments
were sorted and collapsed into 278 per-individual alignments for variant calling using
SAMtools/BCFtools v1.13 (Danecek et al., 2021), and duplicate alignments were marked using
GATK v4.2.2.0 (Van der Auwera and O’Connor, 2020). Variant calling was done following the
best practices pipeline for germline variant discovery from GATK4. Briefly, per-sample genomic
VCF (gVCF) files were generated from all 278 per-individual alignments with GATK4’s
HaplotypeCaller using the parameters "-ploidy 1 -ERC GVCF". Each gVCF was split into equally
sized genomic intervals ("SplitIntervals --scatter-count 100") and converted into GenomicsDB
datastores for genotyping into VCFs with GenotypeGVCFs. Per-interval VCFs were sorted and
compiled using SortVcf/MergeVcfs into a single VCFv4.2 file containing all 278 individuals. Variants were quality
filtered with VCFtools v0.1.16 (Danecek et al., 2011). Scripts used to perform variant calling are
publicly available at https://github.com/kdews/s-latissima-popgen.
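The haploid variant calling steps above can be sketched as a series of GATK4 command lines. The following minimal Python example only builds the argument lists and does not run them. The "-ploidy 1 -ERC GVCF" and "--scatter-count 100" settings come from the text; all file names are hypothetical, and the use of GenomicsDBImport to create the GenomicsDB datastore is an assumption, since the text names the datastore but not the tool. The authors' actual scripts are in the repository cited above.

```python
def haplotype_caller_cmd(reference, bam, gvcf):
    """Per-sample gVCF for a haploid genotype (parameters quoted in the text)."""
    return ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam,
            "-O", gvcf, "-ploidy", "1", "-ERC", "GVCF"]


def split_intervals_cmd(reference, out_dir, scatter_count=100):
    """Split the reference into roughly equal intervals for parallel genotyping."""
    return ["gatk", "SplitIntervals", "-R", reference,
            "--scatter-count", str(scatter_count), "-O", out_dir]


def genomicsdb_import_cmd(gvcfs, interval_list, workspace):
    """Assumed step: load all gVCFs for one interval into a GenomicsDB datastore."""
    cmd = ["gatk", "GenomicsDBImport",
           "--genomicsdb-workspace-path", workspace, "-L", interval_list]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd


def genotype_gvcfs_cmd(reference, workspace, out_vcf):
    """Joint genotyping of one interval from its GenomicsDB datastore."""
    return ["gatk", "GenotypeGVCFs", "-R", reference,
            "-V", f"gendb://{workspace}", "-O", out_vcf]


# Hypothetical file names, for illustration only.
print(" ".join(haplotype_caller_cmd("ref.fa", "sample1.bam", "sample1.g.vcf.gz")))
print(" ".join(split_intervals_cmd("ref.fa", "intervals")))
```

Each list could be passed to `subprocess.run` on a system where GATK4 is installed.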
3.3.5 Functional variant annotation
The filtered VCF file was decomposed and normalized using vt (Tan et al., 2015) for compatibility
with downstream analyses. A SnpEff v4.3 (Cingolani et al., 2012) database was built for the
Saccharina latissima genome using a sequence feature file, coding sequences (CDSs), and protein
sequences of annotated genes (DeWeese et al., 2024). Variants scored as high impact by SnpEff
were subset from genic regions in and around the identified genes of interest. A detailed description
of the pipeline used to prepare files, build a custom SnpEff database, and filter and parse functional
variant annotation data can be found at https://github.com/kdews/s-latissima-mutation-annotation.
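To illustrate how high impact annotations can be pulled out of a SnpEff-annotated VCF, the sketch below parses the `ANN` INFO field, in which the third pipe-delimited subfield is the impact category and the fifth is the gene ID. It selects by gene ID, which is a simplifying assumption; the analysis described above subset variants by genic region. The record, gene IDs, and function name are hypothetical.

```python
def high_impact_annotations(vcf_line, genes_of_interest):
    """Return (chrom, pos, allele, effect, gene_id) for HIGH impact annotations.

    Assumes SnpEff's ANN format: Allele|Annotation|Impact|Gene_Name|Gene_ID|...
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, info = fields[0], int(fields[1]), fields[7]
    hits = []
    for entry in info.split(";"):
        if not entry.startswith("ANN="):
            continue
        # One variant can carry several comma-separated annotations.
        for annotation in entry[len("ANN="):].split(","):
            parts = annotation.split("|")
            if len(parts) < 5:
                continue  # malformed annotation, skip it
            allele, effect, impact, _gene_name, gene_id = parts[:5]
            if impact == "HIGH" and gene_id in genes_of_interest:
                hits.append((chrom, pos, allele, effect, gene_id))
    return hits


# Hypothetical record with one HIGH and one MODIFIER annotation.
example_line = "\t".join([
    "scaffold_1", "1042", ".", "G", "GA", "60", "PASS",
    "DP=9;ANN=GA|frameshift_variant|HIGH|gene_X|gene_X|transcript|mRNA_X|"
    "protein_coding|2/5|c.10dupA|p.Val4fs|10/900|10/900|4/299||,"
    "GA|upstream_gene_variant|MODIFIER|gene_Y|gene_Y|transcript|mRNA_Y|"
    "protein_coding||c.-50dupA|||||50|",
])
print(high_impact_annotations(example_line, {"gene_X", "gene_Y"}))
```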
3.4 Results
3.4.1 Curation of reproductive gene set
3.4.1.1 Differential expression analysis between tissue types
After quality filtering extracted RNA, 12 unique samples from 5 Saccharina latissima sporophytes
remained, made up of the following tissues: 2 distal blade, 4 meristematic blade, 3 holdfast, and 4
sorus. Sequencing of 12 transcriptomic libraries yielded a total of 335 Gb with an average of 92
million reads per library, 137 bp average read length, and a GC content of 50.83%. Trimmed RNA
reads were aligned to the S. latissima reference, which has 25,012 high-quality annotated gene
models with transcriptomic support (DeWeese et al., 2024). Reads aligned with an average overall
mapping rate of 18.08%. To detect sorus-expressed genes, the differential expression analysis
calculated log2 fold change (LFC) in gene expression in sorus tissue (4 samples) versus a pool of
the three non-reproductive tissue types (9 samples). The differential expression analysis revealed
a set of 7 genes that were highly differentially expressed (LFC > 2) and statistically significant (p-value < 1 × 10⁻⁵) (Figure 3.2, red genes in top right quadrant).
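The selection of sorus-biased genes can be expressed as a simple threshold filter on the differential expression results. The sketch below is a minimal illustration in Python, not the DESeq2 workflow itself: it assumes the results have already been exported as (gene ID, LFC, p-value) rows, uses the LFC > 2 and p < 1 × 10⁻⁵ thresholds stated above, and runs on hypothetical values.

```python
import math


def select_sorus_genes(results, lfc_min=2.0, p_max=1e-5):
    """Return gene IDs with LFC above lfc_min and p-value below p_max."""
    selected = []
    for gene_id, lfc, pvalue in results:
        if lfc is None or pvalue is None:
            continue  # no estimate available for this gene
        if math.isnan(lfc) or math.isnan(pvalue):
            continue  # missing values exported as NaN
        if lfc > lfc_min and pvalue < p_max:
            selected.append(gene_id)
    return selected


# Hypothetical rows: (gene ID, log2 fold change sorus vs. pooled, p-value)
example = [
    ("gene_A", 3.1, 2e-8),           # passes both thresholds
    ("gene_B", 2.5, 1e-3),           # large effect, not significant enough
    ("gene_C", -4.0, 1e-9),          # significant, but depleted in sorus
    ("gene_D", float("nan"), 1e-9),  # missing estimate
]
print(select_sorus_genes(example))
```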
Figure 3.2. Annotated volcano plot of differential gene expression between sorus tissue and pooled non-reproductive tissues of a Saccharina latissima sporophyte, showing log2 fold change (LFC) on the x-axis and -log10
p-value on the y-axis. Dotted lines intersect the x-axis at LFC = ±2 and the y-axis at p = 1 × 10⁻⁵.
3.4.1.2 Homology and annotation search
Reproductive genes known to be highly conserved across eukaryota (see Methods) and in
algal species (Liu et al., 2017) were used to seed searches for homologous genes in Saccharina
latissima genome annotations, yielding 87 candidates.
3.4.1.3 Defining putatively reproductive genes
Genes were considered putatively reproductive if they met at least one of the following criteria:
(1) had an annotated meiotic or other reproductive gene function in the Saccharina latissima genome
(DeWeese et al., 2024), (2) exhibited homology via protein BLAST (Boratyn et al., 2013) to
reproductive proteins of the related species Ectocarpus sp. (Cormier et al., 2017) and Macrocystis
pyrifera (Diesel et al., 2023), or (3) were highly significantly differentially expressed between
germline and somatic Saccharina latissima tissues. In total, 94 proteins translated from 94 unique
genes satisfied these criteria, hereafter referred to as putatively reproductive genes. Regions
containing putatively reproductive genes were used to select variants for further investigation.
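Because a gene qualifies by meeting at least one criterion, the putatively reproductive set is the union of the three evidence sets. The short sketch below records which criteria each gene satisfied; the labels, function name, and gene IDs are hypothetical.

```python
def classify_genes(annotated, homologs, sorus_biased):
    """Map each gene to the list of criteria it satisfied (at least one)."""
    evidence = {}
    for label, genes in (("annotation", annotated),
                         ("homology", homologs),
                         ("expression", sorus_biased)):
        for gene in sorted(genes):
            evidence.setdefault(gene, []).append(label)
    return evidence


# Hypothetical gene IDs for illustration.
evidence = classify_genes({"g1", "g2"}, {"g2", "g3"}, {"g4"})
print(len(evidence), evidence["g2"])
```

Tracking the supporting criteria alongside the union makes it easy to check later whether genes with multiple lines of evidence behave differently from genes with only one.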
3.4.2 Candidate genetic variants
Sequenced DNA from 278 S. latissima gametophytes yielded 3,342 Gb and 22 billion reads in
total, with 51.18% GC content and 7x average genomic coverage. In total, 10,666,929 variants
were called, including 1,908,929 insertions and deletions (indels) and 8,985,193 single nucleotide
polymorphisms (SNPs). Functional variant annotation was performed with SnpEff, which assigns
predicted variant effect(s) on coding regions, splice sites, and non-coding regions (Cingolani et al.,
2012). We focused specifically on 26,099 variants (0.18%) that were annotated as "HIGH" impact
by SnpEff (Figure 3.3A). High impact variants, such as premature stop codons and frameshift
mutations, may result in loss of gene function or deleterious phenotypic outcomes. After filtering
for only variants in regions of putatively reproductive genes, 144 high impact variants remained
in the analysis (Figure 3.3B). Genome-wide, pairs of female and male genotypes shared on average 288
high impact variants (Figure 3.3C).
Figure 3.3. Distributions of counts of high impact alternate alleles in 278 Saccharina latissima genotypes (A) across
all genes (n_alleles = 26,099) and (B) in genic regions of putatively reproductive genes (n_alleles = 144). (C) Distribution
of counts of female and male genotype pairs that share x high impact alleles. The red line intersects the x-axis at the
mean of 288 shared alleles.
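The pairwise sharing summarized in panel C can be sketched as a set intersection over every female-by-male pair. Because gametophytes are haploid, each genotype is represented here simply as the set of high impact alternate alleles it carries. This is an illustrative reconstruction, not the authors' code, and all genotype and variant identifiers are hypothetical.

```python
from itertools import product
from statistics import mean


def shared_high_impact(females, males):
    """Count high impact alternate alleles carried by both parents of each pair."""
    return {
        (f_id, m_id): len(f_alleles & m_alleles)
        for (f_id, f_alleles), (m_id, m_alleles)
        in product(females.items(), males.items())
    }


# Hypothetical genotypes: sets of variant IDs carrying the alternate allele.
females = {"F1": {"v1", "v2", "v3"}, "F2": {"v2"}}
males = {"M1": {"v2", "v3", "v4"}, "M2": {"v5"}}

counts = shared_high_impact(females, males)
print(counts)
print(mean(counts.values()))
```

A sporophyte from a pair sharing a variant would be homozygous for it, which is why shared alleles are the quantity of interest when planning crosses.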
3.4.3 Test crosses homozygous for candidate variants
3.4.3.1 Variant and genotype scoring and ranking
To plan Saccharina latissima test crosses that would generate sporophytes homozygous for our
candidate variants, we selected pairs of male and female Saccharina latissima gametophytes that
were genotyped to share one (or more) of the 144 high impact annotated variants in putatively
reproductive genes. In total, we found 241 candidate genotypes (142 male and 99 female). We
ranked crosses to maximize the following criteria in candidate variants: (1) predicted
deleteriousness of the variant (prioritizing frameshifts and nonsense mutations), (2) previous
involvement of parent genotype in cross(es) lacking reproductive development in sporophyte
stage, (3) sorus-biased expression (LFC > 0) of the impacted gene, and (4) read depth at the site
of the candidate alternative allele. With these rankings, we selected the top 49 crosses representing
19 male and 15 female genotypes to test for disrupted or absent reproductive development in F1
sporophytes. Each of these test crosses involved at least one of the 19 high impact candidate
variants in 13 putatively reproductive genes.
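One way to combine the four ranking criteria is a lexicographic sort, sketched below. The text does not state how the criteria were weighted, so the strict priority order used here (variant severity, then prior parent involvement, then sorus-biased expression, then read depth) is an assumption, as are the class name, field names, and example values.

```python
from dataclasses import dataclass

# SnpEff effect terms treated as most severe (frameshift and nonsense).
SEVERE_EFFECTS = {"frameshift_variant", "stop_gained"}


@dataclass(frozen=True)
class CandidateCross:
    female: str
    male: str
    effect: str                    # predicted effect of the shared variant
    parent_in_prior_cross: bool    # parent seen in an earlier non-reproductive cross
    gene_lfc: float                # sorus vs. non-reproductive LFC of the gene
    alt_depth: int                 # read depth at the alternate allele site


def rank_key(cross):
    """Tuple compared left to right; larger values rank higher."""
    return (
        cross.effect in SEVERE_EFFECTS,
        cross.parent_in_prior_cross,
        cross.gene_lfc > 0,
        cross.alt_depth,
    )


def rank_crosses(crosses, top_n=None):
    """Sort crosses from highest to lowest priority, optionally truncating."""
    ranked = sorted(crosses, key=rank_key, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical candidate crosses.
candidates = [
    CandidateCross("F1", "M1", "splice_donor_variant", True, 1.2, 30),
    CandidateCross("F2", "M2", "frameshift_variant", False, -0.5, 8),
    CandidateCross("F3", "M3", "stop_gained", False, 0.7, 12),
]
print([(c.female, c.male) for c in rank_crosses(candidates)])
```

A weighted score would be an equally reasonable alternative if the criteria should trade off against each other rather than act as strict tie-breakers.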
3.4.3.2 Preliminary phenotypic results
In the 2021–2022 season of a Saccharina latissima ocean test farm (ARPA-E MARINER, Woods
Hole Oceanographic Institution, PI: S. Lindell), the 49 gametophyte crosses we proposed to screen for
aberrant reproductive phenotypes were outplanted. The full harvest in 2022 yielded 20 crosses that
had abnormal reproductive development, which included the observation of no sorus tissue growth
in the farm setting, or either an unproductive spore release or non-motile spores from sorus in a
laboratory setting. Six of the parent genotypes of these crosses possessed more than one of our
initial 144 candidate genetic variants (Figure 3.4). Most variant effects observed in these genotypes
are frameshift mutations, and three of the six genotypes have the same predicted stop
gained/frameshift mutation in gene_18640 (Figure 3.4).
Figure 3.4. Parent genotypes of Saccharina latissima sporophyte crosses expressing non-reproductive phenotype in
farm setting. Genotypes highlighted in cyan were predicted to produce non-reproductive sporophytes and proposed
in the test cross plan.
In the 2022–2023 farm season, a further 43 crosses were identified as having disrupted or
absent reproductive capacity. Pooling, across all farm seasons, the candidate genetic variants
present in parent genotypes that have been phenotyped as non-reproductive, we looked for trends
in tissue-biased gene expression (Figure 3.5A) and strength of variant effect (Figure 3.5B). We
find that within the set of genes impacted by these variants, gene expression is slightly stronger
and biased toward reproductive tissue (Figure 3.5A). From the perspective of mutational class,
there is a strong but unfortunately extremely under-sampled signal showing variants annotated as
stop gained mutations may have higher heritability than frameshift or splice site mutations (Figure
3.5B).
Figure 3.5. (A) Bars represent log2 fold change (LFC) in gene expression between reproductive and non-reproductive tissues for each putatively reproductive gene with a variant in the test-crossed genotypes, and are
colored by tissue expression bias (i.e., reproductive bias = LFC > 0). Boxes to the left of gene IDs on the y-axis are
gene annotations or top BLASTp hit of a protein translated from the gene. (B) For each class of predicted variant
effect, a boxplot shows the distribution in the proportion of crosses possessing a given variant that have been
phenotyped as non-reproductive.
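The quantity summarized in panel B can be computed as follows: for each variant, take the fraction of crosses carrying it that were phenotyped as non-reproductive, then group those fractions by predicted effect class. The sketch below is a minimal illustration with hypothetical observations; the function name and input layout are assumptions.

```python
from collections import defaultdict


def proportions_by_effect(observations):
    """Group per-variant non-reproductive proportions by predicted effect.

    observations: iterable of (variant_id, effect, is_non_reproductive),
    one row per cross carrying that variant.
    """
    tallies = defaultdict(lambda: [0, 0])  # variant -> [non-reproductive, total]
    effect_of = {}
    for variant_id, effect, is_non_reproductive in observations:
        tallies[variant_id][0] += int(is_non_reproductive)
        tallies[variant_id][1] += 1
        effect_of[variant_id] = effect
    grouped = defaultdict(list)
    for variant_id, (n_non_reproductive, n_total) in tallies.items():
        grouped[effect_of[variant_id]].append(n_non_reproductive / n_total)
    return {effect: sorted(values) for effect, values in grouped.items()}


# Hypothetical crosses: (variant, predicted effect, phenotyped non-reproductive?)
observations = [
    ("v1", "stop_gained", True),
    ("v1", "stop_gained", True),
    ("v2", "frameshift_variant", True),
    ("v2", "frameshift_variant", False),
    ("v3", "frameshift_variant", False),
]
print(proportions_by_effect(observations))
```

With few variants per class, as noted above, each boxplot rests on a handful of proportions, so differences between classes should be read cautiously.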
3.5 Discussion
With this research, we demonstrate the capacity to expedite breeding programs for kelp species by
leveraging robust genetic resources. Specifically, we computationally modeled phenotypes not
found in wild populations by generating in silico crosses and ranking by the predicted degree of
effect from deleterious mutation(s). The methods described here exploit the wealth of data
generated for genomic selection (GS) models used in modern breeding and invert the conventional
approach by first investigating genetic variation at the pathway level.
Our objective was to efficiently discover haploid gametophyte genotypes of the kelp
Saccharina latissima that are capable of gamete production but breed non-reproductive diploid
sporophytes. Our approach was not mutagenetic, but instead focused on detecting naturally
occurring genetic variants that may disrupt reproductive pathways. Wild populations of S.
latissima sporophytes found on the northern Atlantic coast of the US were used as founders for the
germplasm bank of gametophytes analyzed here (Huang et al., 2023). Founder genotypes
displayed distinct population structure (Mao et al., 2020) and had far greater allelic richness
compared to what would be found in established cultivars (Griffiths et al., 2015). High levels of
standing genetic variation present in wild S. latissima populations were an essential source of rare
variants for functional annotation analysis.
Further research is needed to investigate the impacts of the identified candidate non-reproductive variants using molecular biology tools to probe perturbations in gene and protein
expression associated with these specific S. latissima genotypes in both gametophyte and
sporophyte stages. Ongoing work is aimed at transcriptomic sequencing of all non-reproductive
F1s to detect patterns in gene expression that may signal which proteins and pathways are affected
in ways that disrupt reproduction in the sporophyte stage. Additionally, directed mutation of
gametophyte and sporophyte genotypes in controlled laboratory settings would be beneficial to
more deeply investigate the role of the genes.
The methods outlined in this work can serve as a model for advancing aquaculture species
where both genetic improvement and environmental considerations are paramount. Before any
generations are grown for phenotypic selection, our pipeline strives to identify and rank genotypes
of interest to streamline selection. Today, genomic data exists for many industry-relevant kelp
species, and databases are constantly being updated—see PhycoCosm (Grigoriev et al., 2021),
Phaeoexplorer (Phaeoexplorer, 2016), and ORCAE (Sterck et al., 2012). Future analyses that
search for phenotypes of interest in functionally annotated genetic data before crossing and
phenotyping have the potential to streamline and reduce costs associated with high-power genomic
selection models. We anticipate these crucial steps toward understanding and manipulating
reproductive pathways in kelp species will contribute to the development of sustainable marine
aquaculture practices to support this rapidly growing industry.
3.6 Funding
This project was supported by the U.S. Department of Energy (DOE) Advanced Research Projects
Agency–Energy (ARPA-E) program Macroalgae Research Inspiring Novel Energy Resources
(MARINER) awards DE-AR0000914 and DE-AR0000915.
3.7 Author contributions
KD performed all statistical analysis and data visualization, and drafted, wrote, and revised the
manuscript. KD processed sequencing data, conducted differential expression analyses, identified
conserved reproductive genes, performed functional annotation, designed and executed a pipeline
for variant detection and ranking, and planned test crosses to combine variants. SN and GM
conceptualized the project and its focus on non-reproductive genotypes and kelp germplasm. SL led
experimental design, sample collection and preparation, culture maintenance, and nucleotide
sequencing, and conducted and phenotyped the planned test crosses. SN, SL, and GM also provided
supervision and secured funding for the project.
Conclusion
Genomics-assisted breeding holds transformative potential for kelp aquaculture, promising
accelerated gains in cultivating traits of interest in kelps, such as increased biomass, optimal sugar
and lipid composition, and the generation of non-reproductive outplants. The genetic resources
developed here, in particular the scaffolded and annotated reference genome of the US native
Saccharina latissima (sugar kelp), provide a strong foundation for trait selection and genetic
variant analysis in kelp germplasm that can begin to align kelp cultivation with the technology and
efficiency of modern terrestrial agriculture. This thesis highlights the promise of genomics not
only for enhancing economically valuable traits, but also for advancing ecologically responsible
aquaculture through the identification of putatively non-reproductive strains. As climate and
population pressures intensify, harnessing these genomic insights to propel algae aquaculture
forward can position kelp as a key sustainable resource for a variety of applications, while ensuring
that future developments in kelp aquaculture are both economically viable and ecologically sound.
References
Abbass, K., Qasim, M. Z., Song, H., Murshed, M., Mahmood, H., and Younis, I. (2022). A
review of the global climate change impacts, adaptation, and sustainable mitigation
measures. Environmental Science and Pollution Research 29, 42539–42559. doi:
10.1007/s11356-022-19718-6
Abdelrahman, H., ElHady, M., Alcivar-Warren, A., Allen, S., Al-Tobasei, R., Bao, L., et al.
(2017). Aquaculture genomics, genetics and breeding in the United States: current status,
challenges, and priorities for future research. BMC Genomics 18, 191. doi: 10.1186/s12864-017-3557-1
Aburatani, S. (2012). Network Inference of pal-1 Lineage-Specific Regulation in the C. elegans
Embryo by Structural Equation Modeling. Bioinformation 8, 652–657. doi:
10.6026/97320630008652
Ahuja, I., de Vos, R. C. H., Bones, A. M., and Hall, R. D. (2010). Plant molecular stress
responses face climate change. Trends Plant Sci 15, 664–674. doi:
10.1016/j.tplants.2010.08.002
Aliferis, K. A., and Chrysayi-Tokousbalides, M. (2011). Metabolomics in pesticide research and
development: Review and future perspectives. Metabolomics 7, 35–53. doi:
10.1007/s11306-010-0231-x
Alston, F. H., and Tobutt, K. R. (1989). “Breeding and Selection for Reliable Cropping in Apples
and Pears,” in Manipulation of Fruiting, (Elsevier), 329–339. doi: 10.1016/B978-0-408-02608-6.50027-8
Alsuwaiyan, N. A., Mohring, M. B., Cambridge, M., Coleman, M. A., Kendrick, G. A., and
Wernberg, T. (2019). A review of protocols for the experimental release of kelp
(Laminariales) zoospores. Ecol Evol 9, 8387–8398. doi: 10.1002/ece3.5389
Alves, L. D. F., Westmann, C. A., Lovate, G. L., De Siqueira, G. M. V., Borelli, T. C., and
Guazzaroni, M. E. (2018). Metagenomic Approaches for Understanding New Concepts in
Microbial Science. Int J Genomics. doi: 10.1155/2018/2312987
Apweiler, R., Martin, M. J., O’Donovan, C., Magrane, M., Alam-Faruque, Y., Alpi, E., et al.
(2012). Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic
Acids Res 41, D43–D47. doi: 10.1093/nar/gks1068
Arbona, V., Manzi, M., de Ollas, C., and Gómez-Cadenas, A. (2013). Metabolomics as a tool to
investigate abiotic stress tolerance in plants. Int J Mol Sci 14, 4885–4911. doi:
10.3390/ijms14034885
Armstrong, J., Hickey, G., Diekhans, M., Fiddes, I. T., Novak, A. M., Deran, A., et al. (2020).
Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587,
246–251. doi: 10.1038/s41586-020-2871-y
Ashikari, M., Sakakibara, H., Lin, S., Yamamoto, T., Takashi, T., Nishimura, A., et al. (2005).
Cytokinin Oxidase Regulates Rice Grain Production. Science 309, 741–745. doi:
10.1126/science.1113373
Astashyn, A., Tvedte, E. S., Sweeney, D., Sapojnikov, V., Bouk, N., Joukov, V., et al. (2024).
Rapid and sensitive detection of genome contamination at scale with FCS-GX. Genome
Biol 25, 60. doi: 10.1186/s13059-024-03198-7
Augyte, S., Lewis, L., Lin, S., Neefus, C. D., and Yarish, C. (2018). Speciation in the exposed
intertidal zone: the case of Saccharina angustissima comb. nov. & stat. nov. (Laminariales,
Phaeophyceae). Phycologia 57, 100–112. doi: 10.2216/17-40.1
Awany, D., Allali, I., Dalvie, S., Hemmings, S., Mwaikono, K. S., Thomford, N. E., et al. (2019).
Host and microbiome genome-wide association studies: Current state and challenges. Front
Genet 10, 637. doi: 10.3389/fgene.2018.00637
Aydlett, M. (2019). Examining the Microbiome of Porphyra umbilicalis in the North Atlantic.
Honors College, University of Maine. Available at:
https://digitalcommons.library.umaine.edu/honors/570 (Accessed August 1, 2020).
Baedke, J., Fábregas-Tejeda, A., and Nieves Delgado, A. (2020). The holobiont concept before
Margulis. J Exp Zool B Mol Dev Evol 334, 149–155. doi: 10.1002/jez.b.22931
Baltimore, J. Ott. (1986). Analysis of Human Genetic Linkage. Ann Hum Genet 50, 101–102.
doi: 10.1111/j.1469-1809.1986.tb01944.x
Belghit, I., Rasinger, J. D., Heesch, S., Biancarosa, I., Liland, N., Torstensen, B., et al. (2017).
In-depth metabolic profiling of marine macroalgae confirms strong biochemical differences
between brown, red and green algae. Algal Res 26, 240–249. doi:
10.1016/j.algal.2017.08.001
Bingol, K. (2018). Recent Advances in Targeted and Untargeted Metabolomics by NMR and
MS/NMR Methods. High Throughput 7, 9. doi: 10.3390/ht7020009
Birney, E., Clamp, M., and Durbin, R. (2004). GeneWise and Genomewise. Genome Res 14,
988–995. doi: 10.1101/gr.1865504
Blighe, K., Rana, S., Turkes, E., Ostendorf, B., Grioni, A., and Lewis, M. (2021).
EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling.
doi: 10.18129/B9.bioc.EnhancedVolcano
Bohra, A., Chand Jha, U., Godwin, I. D., and Kumar Varshney, R. (2020). Genomic interventions
for sustainable agriculture. Plant Biotechnol J 18, 2388–2405. doi: 10.1111/pbi.13472
Bongaarts, J. (2009). Human population growth and the demographic transition. Philosophical
Transactions of the Royal Society B: Biological Sciences 364, 2985–2990. doi:
10.1098/rstb.2009.0137
Boratyn, G. M., Camacho, C., Cooper, P. S., Coulouris, G., Fong, A., Ma, N., et al. (2013).
BLAST: a more efficient report with usability improvements. Nucleic Acids Res 41, W29–
W33. doi: 10.1093/nar/gkt282
Borde, V. (2007). The multiple roles of the Mre11 complex for meiotic recombination.
Chromosome Research 15, 551–563. doi: 10.1007/s10577-007-1147-9
Brayden, C., and Coleman, S. (2023). Maine Seaweed Benchmarking Report. Available at:
https://maineaqua.org/wp-content/uploads/2023/08/Maine-Seaweed-BenchmarkingReport.pdf (Accessed January 21, 2024).
Breton, T. S., Nettleton, J. C., O’Connell, B., and Bertocci, M. (2018). Fine-scale population
genetic structure of sugar kelp, Saccharina latissima (Laminariales, Phaeophyceae), in
eastern Maine, USA. Phycologia 57, 32–40. doi: 10.2216/17-72.1
Bringloe, T. T., Starko, S., Wade, R. M., Vieira, C., Kawai, H., De Clerck, O., et al. (2020).
Phylogeny and Evolution of the Brown Algae. CRC Crit Rev Plant Sci 39, 281–321. doi:
10.1080/07352689.2020.1787679
Buck, B. H., Krause, G., and Rosenthal, H. (2004). Extensive open ocean aquaculture
development within wind farms in Germany: The prospect of offshore co-management and
legal constraints. Ocean Coast Manag 47, 95–122. doi: 10.1016/j.ocecoaman.2004.04.002
Buck, B. H., Nevejan, N., Wille, M., Chambers, M. D., and Chopin, T. (2017). “Offshore and
Multi-Use Aquaculture with Extractive Species: Seaweeds and Bivalves,” in Aquaculture
Perspective of Multi-Use Sites in the Open Ocean: The Untapped Potential for Marine
Resources in the Anthropocene, eds. B. H. Buck and R. Langan (Cham: Springer
International Publishing), 23–69. doi: 10.1007/978-3-319-51159-7_2
Buckley, S., Hardy, K., Hallgren, F., Kubiak-Martens, L., Miliauskienė, Ž., Sheridan, A., et al.
(2023). Human consumption of seaweed and freshwater aquatic plants in ancient Europe.
Nat Commun 14, 6192. doi: 10.1038/s41467-023-41671-2
Budhlakoti, N., Kushwaha, A. K., Rai, A., Chaturvedi, K. K., Kumar, A., Pradhan, A. K., et al.
(2022). Genomic Selection: A Tool for Accelerating the Efficiency of Molecular Breeding
for Development of Climate-Resilient Crops. Front Genet 13, 832153. doi:
10.3389/fgene.2022.832153
Burgess, K., Rankin, N., and Weidt, S. (2014). “Metabolomics,” in Handbook of
Pharmacogenomics and Stratified Medicine, ed. S. Padmanabhan (London: Elsevier), 181–
205. doi: 10.1016/B978-0-12-386882-4.00010-4
Buschmann, A. H., Camus, C., Infante, J., Neori, A., Israel, Á., Hernández-González, M. C., et
al. (2017). Seaweed production: overview of the global state of exploitation, farming and
emerging research activity. Eur J Phycol 52, 391–406. doi:
10.1080/09670262.2017.1365175
Busetti, A., Maggs, C. A., and Gilmore, B. F. (2017). Marine macroalgae and their associated
microbiomes as a source of antimicrobial chemical diversity. Eur J Phycol 52, 452–465.
doi: 10.1080/09670262.2017.1376709
Cai, J., Lovatelli, A., Aguilar-Manjarrez, J., Cornish, L., Dabbadie, L., Desrochers, A., et al.
(2021). Seaweeds and microalgae: an overview for unlocking their potential in global
aquaculture development. Rome: FAO Fisheries and Aquaculture Circular, No. 1229. doi:
10.4060/cb5670en
Calus, M. P. L., and Veerkamp, R. F. (2007). Accuracy of breeding values when using and
ignoring the polygenic effect in genomic breeding value estimation with a marker density of
one SNP per cM. Journal of Animal Breeding and Genetics 124, 362–368. doi:
10.1111/J.1439-0388.2007.00691.X
Camus, C., Faugeron, S., and Buschmann, A. H. (2018). Assessment of genetic and phenotypic
diversity of the giant kelp, Macrocystis pyrifera, to support breeding programs. Algal Res
30, 101–112. doi: 10.1016/j.algal.2018.01.004
Caruso, G. (2013). Microbes and their use as Indicators of Pollution. Journal of Pollution Effects
& Control 1, 1000e102. doi: 10.4172/2375-4397.1000e102
Charrier, B., Wichard, T., and Reddy, C. R. K. eds. (2018). Protocols for Macroalgae Research.
Boca Raton: CRC Press. doi: 10.1201/b21460
Chen, C., Farmer, A. D., Langley, R. J., Mudge, J., Crow, J. A., May, G. D., et al. (2010).
Meiosis-specific gene discovery in plants: RNA-Seq applied to isolated Arabidopsis male
meiocytes. BMC Plant Biol 10, 280. doi: 10.1186/1471-2229-10-280
Chen, C., Huang, X., Fang, S., Yang, H., He, M., Zhao, Y., et al. (2018). Contribution of Host
Genetics to the Variation of Microbial Composition of Cecum Lumen and Feces in Pigs.
Front Microbiol 9, 2626. doi: 10.3389/fmicb.2018.02626
Chen, J., Zang, Y., Shang, S., and Tang, X. (2019). The complete mitochondrial genome of the
brown alga Macrocystis integrifolia (Laminariales, Phaeophyceae). Mitochondrial DNA B
Resour 4, 635–636. doi: 10.1080/23802359.2018.1495114
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H., and Li, H. (2021). Haplotype-resolved de
novo assembly using phased assembly graphs with hifiasm. Nat Methods 18, 170–175. doi:
10.1038/s41592-020-01056-5
Che-Othman, M. H., Jacoby, R. P., Millar, A. H., and Taylor, N. L. (2020). Wheat mitochondrial
respiration shifts from the tricarboxylic acid cycle to the GABA shunt under salt stress. New
Phytologist 225, 1166–1180. doi: 10.1111/nph.15713
Cherry, P., O’Hara, C., Magee, P. J., McSorley, E. M., and Allsopp, P. J. (2019). Risks and
benefits of consuming edible seaweeds. Nutr Rev 77, 307–329. doi: 10.1093/nutrit/nuy066
Chiu, C. Y., and Miller, S. A. (2019). Clinical metagenomics. Nat Rev Genet 20, 341–355. doi:
10.1038/s41576-019-0113-7
Cho, S., Kim, H., Oh, S., Kim, K., and Park, T. (2009). Elastic-net regularization approaches for
genome-wide association studies of rheumatoid arthritis. BMC Proc 3, S25. doi:
10.1186/1753-6561-3-s7-s25
Cicin-Sain, B., Bunsick, S. M., DeVoe, R., Eichenberg, T., Ewart, J., Halvorson, H., et al. (2001).
Development of a Policy Framework for Offshore Marine Aquaculture in the 3-200 Mile
U.S. Ocean Zone. Available at: http://udspace.udel.edu/handle/19716/2504 (Accessed July
19, 2020).
Cingolani, P., Platts, A., Wang, L. L., Coon, M., Nguyen, T., Wang, L., et al. (2012). A program
for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs
in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92.
doi: 10.4161/fly.19695
Cock, J. M., Sterck, L., Rouzé, P., Scornet, D., Allen, A. E., Amoutzias, G., et al. (2010). The
Ectocarpus genome and the independent evolution of multicellularity in brown algae.
Nature 465, 617–621. doi: 10.1038/nature09016
Collard, B. C. Y., and Mackill, D. J. (2008). Marker-assisted selection: an approach for precision
plant breeding in the twenty-first century. Philosophical Transactions of the Royal Society
B: Biological Sciences 363, 557–572. doi: 10.1098/rstb.2007.2170
Collén, J., Porcel, B., Carré, W., Ball, S. G., Chaparro, C., Tonon, T., et al. (2013). Genome
structure and metabolic features in the red seaweed Chondrus crispus shed light on
evolution of the Archaeplastida. Proceedings of the National Academy of Sciences 110,
5247–5252. doi: 10.1073/pnas.1221259110
Colombo, N. (2019). Taking Advantage of Organelle Genomes in Plant Breeding: An Integrated
Approach. Journal of Basic and Applied Genetics 30, 35–51. doi:
10.35407/bag.2019.XXX.01.05
Cormier, A., Avia, K., Sterck, L., Derrien, T., Wucher, V., Andres, G., et al. (2017).
Re‐annotation, improved large‐scale assembly and establishment of a catalogue of noncoding
loci for the genome of the model brown alga Ectocarpus. New Phytologist 214, 219–232.
doi: 10.1111/nph.14321
Cottier-Cook, E. J., Lim, P.-E., Mallinson, S., Yahya, N., Poong, S.-W., Wilbraham, J., et al.
(2023). Striking a Balance: Wild Stock Protection and the Future of Our Seaweed
Industries. Bruges: United Nations University Institute on Comparative Regional
Integration Studies Policy Brief, No. 6. Available at:
https://cris.unu.edu/thefutureofourseaweedindustries (Accessed October 17, 2024).
Crawford, J. M., and Clardy, J. (2011). Bacterial symbionts and natural products. Chemical
Communications 47, 7559–7566. doi: 10.1039/c1cc11574j
Cui, H., Ding, Z., Zhu, Q., Wu, Y., Qiu, B., and Gao, P. (2021). Comparative analysis of nuclear,
chloroplast, and mitochondrial genomes of watermelon and melon provides evidence of
gene transfer. Sci Rep 11, 1595. doi: 10.1038/s41598-020-80149-9
Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., et al. (2011). The
variant call format and VCFtools. Bioinformatics 27, 2156–2158. doi:
10.1093/bioinformatics/btr330
Danecek, P., Bonfield, J. K., Liddle, J., Marshall, J., Ohan, V., Pollard, M. O., et al. (2021).
Twelve years of SAMtools and BCFtools. Gigascience 10. doi:
10.1093/gigascience/giab008
Davis, G. D. J., and Vasanthi, A. H. R. (2011). Seaweed metabolite database (SWMD): A
database of natural compounds from marine algae. Bioinformation 5, 361–364. doi:
10.6026/97320630005361
Dayton, P. K. (1985). Ecology of Kelp Communities. Annu Rev Ecol Syst 16, 215–245. doi:
10.1146/annurev.es.16.110185.001243
De Ollas, C., Morillón, R., Fotopoulos, V., Puértolas, J., Ollitrault, P., Gómez-Cadenas, A., et al.
(2019). Facing climate change: Biotechnology of iconic mediterranean woody crops. Front
Plant Sci 10, 427. doi: 10.3389/fpls.2019.00427
Dekkers, J. C. M., and Hospital, F. (2002). The use of molecular genetics in the improvement of
agricultural populations. Nat Rev Genet 3, 22–32. doi: 10.1038/nrg701
Desta, Z. A., and Ortiz, R. (2014). Genomic selection: genome-wide prediction in plant
improvement. Trends Plant Sci 19, 592–601. doi: 10.1016/j.tplants.2014.05.006
DeWeese, K., Molano, G., Calhoun, S., Lipzen, A., Jenkins, J., Williams, M., et al. (2024).
Scaffolded and annotated nuclear and organelle genomes of the North American brown alga
Saccharina latissima. Frontiers in Genetics (in review).
Diehl, N., Li, H., Scheschonk, L., Burgunter-Delamare, B., Niedzwiedz, S., Forbord, S., et al.
(2023). The sugar kelp Saccharina latissima I: recent advances in a changing climate. Ann
Bot, 1–29. doi: 10.1093/aob/mcad173
Diesel, J., Molano, G., Montecinos, G. J., DeWeese, K., Calhoun, S., Kuo, A., et al. (2023). A
scaffolded and annotated reference genome of giant kelp (Macrocystis pyrifera). BMC
Genomics 24, 543. doi: 10.1186/s12864-023-09658-x
Ding, Y., Hou, K., Xu, Z., Pimplaskar, A., Petter, E., Boulier, K., et al. (2023). Polygenic scoring
accuracy varies across the genetic ancestry continuum. Nature 618, 774–781. doi:
10.1038/s41586-023-06079-4
Dixon, R. A., Gang, D. R., Charlton, A. J., Fiehn, O., Kuiper, H. A., Reynolds, T. L., et al.
(2006). Applications of Metabolomics in Agriculture. J Agric Food Chem 54, 8984–8994.
doi: 10.1021/jf061218t
Doane, M. P., Haggerty, J. M., Kacev, D., Papudeshi, B., and Dinsdale, E. A. (2017). The skin
microbiome of the common thresher shark (Alopias vulpinus) has low taxonomic and gene
function β-diversity. Environ Microbiol Rep 9, 357–373. doi: 10.1111/1758-2229.12537
Doyle, J. J., and Doyle, J. L. (1990). Isolation of Plant DNA from Fresh Tissue. Focus (Madison)
12, 13–15.
Dring, M. J., and Lüning, K. (1975). A photoperiodic response mediated by blue light in the
brown alga Scytosiphon lomentaria. Planta 125, 25–32. doi: 10.1007/BF00388870
Duarte, C. M., Bruhn, A., and Krause-Jensen, D. (2021). A seaweed aquaculture imperative to
meet global sustainability targets. Nature Sustainability 2021, 1–9. doi: 10.1038/s41893-
021-00773-9
Duarte, C. M., Wu, J., Xiao, X., Bruhn, A., and Krause-Jensen, D. (2017). Can seaweed farming
play a role in climate change mitigation and adaptation? Front Mar Sci 4, 100. doi:
10.3389/fmars.2017.00100
Dubilier, N., Bergin, C., and Lott, C. (2008). Symbiotic diversity in marine animals: The art of
harnessing chemosynthesis. Nat Rev Microbiol 6, 725–740. doi: 10.1038/nrmicro1992
Dudchenko, O., Batra, S. S., Omer, A. D., Nyquist, S. K., Hoeger, M., Durand, N. C., et al.
(2017). De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length
scaffolds. Science (1979) 356, 92–95. doi: 10.1126/science.aal3327
Durand, N. C., Shamim, M. S., Machol, I., Rao, S. S. P., Huntley, M. H., Lander, E. S., et al.
(2016). Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C
Experiments. Cell Syst 3, 95–98. doi: 10.1016/j.cels.2016.07.002
Dutta, S. S., Tyagi, W., Pale, G., Pohlong, J., Aochen, C., Pandey, A., et al. (2018). Marker–trait
association for low-light intensity tolerance in rice genotypes from Eastern India. Molecular
Genetics and Genomics 293, 1493–1506. doi: 10.1007/s00438-018-1478-6
Egan, S., Harder, T., Burke, C., Steinberg, P., Kjelleberg, S., and Thomas, T. (2013). The seaweed
holobiont: Understanding seaweed-bacteria interactions. FEMS Microbiol Rev 37, 462–476.
doi: 10.1111/1574-6976.12011
Egan, S., Thomas, T., and Kjelleberg, S. (2008). Unlocking the diversity and biotechnological
potential of marine surface associated microbial communities. Curr Opin Microbiol 11,
219–225. doi: 10.1016/j.mib.2008.04.001
Eger, A. M., Marzinelli, E. M., Beas-Luna, R., Blain, C. O., Blamey, L. K., Byrnes, J. E. K., et al.
(2023). The value of ecosystem services in global marine kelp forests. Nat Commun 14,
1894. doi: 10.1038/s41467-023-37385-0
Emwas, A.-H. M. (2015). “The Strengths and Weaknesses of NMR Spectroscopy and Mass
Spectrometry with Particular Focus on Metabolomics Research,” in Methods in Molecular
Biology, (Humana Press Inc.), 161–193. doi: 10.1007/978-1-4939-2377-9_13
Emwas, A.-H., Roy, R., McKay, R. T., Tenori, L., Saccenti, E., Gowda, G. A. N., et al. (2019).
NMR Spectroscopy for Metabolomics Research. Metabolites 9, 123. doi:
10.3390/metabo9070123
Engebrecht, J., Voelkel-Meiman, K., and Roeder, G. S. (1991). Meiosis-Specific RNA Splicing
in Yeast. Cell 66, 1257–1266.
Epstein, G., and Smale, D. A. (2017). Undaria pinnatifida: A case study to highlight challenges
in marine invasion ecology and management. Ecol Evol 7, 8624–8642. doi:
10.1002/ece3.3430
Eren, A. M., Esen, O. C., Quince, C., Vineis, J. H., Morrison, H. G., Sogin, M. L., et al. (2015).
Anvi’o: An advanced analysis and visualization platform for ’omics data. PeerJ 2015,
e1319. doi: 10.7717/peerj.1319
Ezemonye, L. I. N., Ogeleka, D. F., and Okieimen, F. E. (2009). Lethal toxicity of industrial
detergent on bottom dwelling sentinels. International Journal of Sediment Research 24,
479–483. doi: 10.1016/S1001-6279(10)60019-4
Fan, X., Han, W., Teng, L., Jiang, P., Zhang, X., Xu, D., et al. (2020a). Single‐base methylome
profiling of the giant kelp Saccharina japonica reveals significant differences in DNA
methylation to microalgae and plants. New Phytologist 225, 234–249. doi:
10.1111/nph.16125
Fan, X., Xie, W., Wang, Y., Xu, D., Zhang, X., and Ye, N. (2020b). The complete chloroplast
genome of Saccharina latissima. Mitochondrial DNA Part B 5, 3481–3482. doi:
10.1080/23802359.2020.1825999
FAO (2018). The State of World Fisheries and Aquaculture 2018 – Meeting the sustainable
development goals. Rome. Available at:
https://openknowledge.fao.org/handle/20.500.14283/ca9229en (Accessed June 14, 2020).
FAO (2020). The State of World Fisheries and Aquaculture 2020. Sustainability in action. Rome.
doi: 10.4060/ca9229en
FAO (2022). The State of World Fisheries and Aquaculture 2022. Towards Blue Transformation.
Rome: FAO. doi: 10.4060/cc0461en
FAO (2024). The State of World Fisheries and Aquaculture 2024 – Blue Transformation in
action. Rome. doi: 10.4060/cd0683en
Fear, J. M., Arbeitman, M. N., Salomon, M. P., Dalton, J. E., Tower, J., Nuzhdin, S. V., et al.
(2015). The Wright Stuff: Reimagining path analysis reveals novel components of the sex
determination hierarchy in Drosophila melanogaster. BMC Syst Biol 9, 53. doi:
10.1186/s12918-015-0200-0
Ferdouse, F., Løvstad Holdt, S., Smith, R., Murúa, P., and Yang, Z. (2018). The global status of
seaweed production, trade and utilization. FAO Globefish Research Programme 124, 120.
Available at: http://www.fao.org/in-action/globefish/publications/details-publication/en/c/1154074/ (Accessed November 9, 2020).
Fernie, A. R., and Schauer, N. (2009). Metabolomics-assisted breeding: a viable option for crop
improvement? Trends in Genetics 25, 39–48. doi: 10.1016/j.tig.2008.10.010
Fernie, A. R., and Yan, J. (2019). De Novo Domestication: An Alternative Route toward New
Crops for the Future. Mol Plant 12, 615–631. doi: 10.1016/j.molp.2019.03.016
Fierer, N., Barberán, A., and Laughlin, D. C. (2014). Seeing the forest for the genes: Using
metagenomics to infer the aggregated traits of microbial communities. Front Microbiol 5,
614. doi: 10.3389/fmicb.2014.00614
Fitzpatrick, C. R., Copeland, J., Wang, P. W., Guttman, D. S., Kotanen, P. M., and Johnson, M. T.
J. (2018). Assembly and ecological function of the root microbiome across angiosperm
plant species. Proc Natl Acad Sci U S A 115, E1157–E1165. doi: 10.1073/pnas.1717617115
Florez, J. Z., Camus, C., Hengst, M. B., and Buschmann, A. H. (2017). A Functional Perspective
Analysis of Macroalgae and Epiphytic Bacterial Community Interaction. Front Microbiol 8,
2561. doi: 10.3389/fmicb.2017.02561
Fricke, A., Capuzzo, E., Bermejo, R., Hofmann, L., Hernández, I., Pereira, R., et al. (2024).
Ecosystem Services Provided by Seaweed Cultivation: State of the Art, Knowledge Gaps,
Constraints and Future Needs for Achieving Maximum Potential in Europe. Reviews in
Fisheries Science & Aquaculture, 1–19. doi: 10.1080/23308249.2024.2399355
Froehlich, H. E., Runge, C. A., Gentry, R. R., Gaines, S. D., and Halpern, B. S. (2018).
Comparative terrestrial feed and land use of an aquaculture-dominant world. Proceedings of
the National Academy of Sciences 115, 5295–5300. doi: 10.1073/pnas.1801692115
Gao, H., Zhang, T., Wu, Y., Wu, Y., Jiang, L., Zhan, J., et al. (2014). Multiple-trait genome-wide
association study based on principal component analysis for residual covariance matrix.
Heredity (Edinb) 113, 526–532. doi: 10.1038/hdy.2014.57
Gao, X., Endo, H., Yamana, M., Taniguchi, K., and Agatsuma, Y. (2013). Compensation of the
brown alga Undaria pinnatifida (Laminariales; Phaeophyta) after thallus excision under
cultivation in Matsushima Bay, northern Japan. J Appl Phycol 25, 1171–1178. doi:
10.1007/s10811-012-9925-y
Gastauer, M., Vera, M. P. O., de Souza, K. P., Pires, E. S., Alves, R., Caldeira, C. F., et al. (2019).
Data descriptor: A metagenomic survey of soil microbial communities along a rehabilitation
chronosequence after iron ore mining. Sci Data 6. doi: 10.1038/sdata.2019.8
Gaubert, J., Payri, C. E., Vieira, C., Solanki, H., and Thomas, O. P. (2019). High metabolic
variation for seaweeds in response to environmental changes: a case study of the brown
algae Lobophora in coral reefs. Sci Rep 9, 993. doi: 10.1038/s41598-018-38177-z
Gaubert, J., Rodolfo-Metalpa, R., Greff, S., Thomas, O. P., and Payri, C. E. (2020). Impact of
ocean acidification on the metabolome of the brown macroalgae Lobophora rosacea from
New Caledonia. Algal Res 46, 101783. doi: 10.1016/j.algal.2019.101783
Gause, G. F. (1934). Experimental analysis of Vito Volterra’s mathematical theory of the struggle
for existence. Science (1979) 79, 16–17. doi: 10.1126/science.79.2036.16-a
Ghurye, J., Rhie, A., Walenz, B. P., Schmitt, A., Selvaraj, S., Pop, M., et al. (2019). Integrating
Hi-C links with assembly graphs for chromosome-scale assembly. PLoS Comput Biol 15,
e1007273. doi: 10.1371/journal.pcbi.1007273
Gilbert, J. A., and Dupont, C. L. (2011). Microbial metagenomics: Beyond the genome. Ann Rev
Mar Sci 3, 347–371. doi: 10.1146/annurev-marine-120709-142811
Goecke, F., Gómez Garreta, A., Martín–Martín, R., Rull Lluch, J., Skjermo, J., and Ergon, Å.
(2022). Nuclear DNA Content Variation in Different Life Cycle Stages of Sugar Kelp,
Saccharina latissima. Marine Biotechnology 24, 706–721. doi: 10.1007/s10126-022-10137-
9
Goecke, F., Klemetsdal, G., and Ergon, Å. (2020). Cultivar Development of Kelps for
Commercial Cultivation—Past Lessons and Future Prospects. Front Mar Sci 8, 1–17. doi:
10.3389/fmars.2020.00110
Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., et al. (2011).
Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat
Biotechnol 29, 644–652. doi: 10.1038/nbt.1883
Graf, L., Shin, Y., Yang, J. H., Choi, J. W., Hwang, I. K., Nelson, W., et al. (2021). A
genome-wide investigation of the effect of farming and human-mediated introduction on the
ubiquitous seaweed Undaria pinnatifida. Nat Ecol Evol 5, 360–368. doi: 10.1038/s41559-
020-01378-9
Graham, E. D., Heidelberg, J. F., and Tully, B. J. (2017). Binsanity: Unsupervised clustering of
environmental microbial assemblies using coverage and affinity propagation. PeerJ 2017,
e3035. doi: 10.7717/peerj.3035
Grebe, G. S., Byron, C. J., Gelais, A. St., Kotowicz, D. M., and Olson, T. K. (2019). An
ecosystem approach to kelp aquaculture in the Americas and Europe. Aquac Rep 15,
100215. doi: 10.1016/j.aqrep.2019.100215
Greff, S., Zubia, M., Payri, C., Thomas, O. P., and Perez, T. (2017). Chemogeography of the red
macroalgae Asparagopsis: metabolomics, bioactivity, and relation to invasiveness.
Metabolomics 13, 33. doi: 10.1007/s11306-017-1169-z
Griffiths, A. J. F., Wessler, S. R., Lewontin, R. C., and Carroll, S. B. (2015). Introduction to
Genetic Analysis., 11th Edn. New York: W. H. Freeman & Company.
Grigoriev, I. V, Hayes, R. D., Calhoun, S., Kamel, B., Wang, A., Ahrendt, S., et al. (2021).
PhycoCosm, a comparative algal genomics resource. Nucleic Acids Res 49, D1004–D1011.
doi: 10.1093/nar/gkaa898
Grigoriev, I. V., Nikitin, R., Haridas, S., Kuo, A., Ohm, R., Otillar, R., et al. (2014). MycoCosm
portal: gearing up for 1000 fungal genomes. Nucleic Acids Res 42, D699–D704. doi:
10.1093/NAR/GKT1183
Guo, C., Ma, X., Gao, F., and Guo, Y. (2023). Off-target effects in CRISPR/Cas9 gene editing.
Front Bioeng Biotechnol 11, 1143157. doi: 10.3389/fbioe.2023.1143157
Gupta, V., Thakur, R. S., Baghel, R. S., Reddy, C. R. K., and Jha, B. (2014). “Seaweed
Metabolomics,” in Advances in Botanical Research, eds. J.-P. Jacquot, P. Gadal, and N.
Bourgougnon (Academic Press), 31–52. doi: 10.1016/B978-0-12-408062-1.00002-0
Guzinski, J., Ballenghien, M., Daguin‐Thiébaut, C., Lévêque, L., and Viard, F. (2018).
Population genomics of the introduced and cultivated Pacific kelp Undaria pinnatifida:
Marinas—not farms—drive regional connectivity and establishment in natural rocky reefs.
Evol Appl 11, 1582–1597. doi: 10.1111/eva.12647
Hackinger, S., and Zeggini, E. (2017). Statistical methods to detect pleiotropy in human complex
traits. Open Biol 7, 170125. doi: 10.1098/rsob.170125
Hall, R. D., Brouwer, I. D., and Fitzgerald, M. A. (2008). Plant metabolomics and its potential
application for human nutrition. Physiol Plant 132, 162–175. doi: 10.1111/j.1399-
3054.2007.00989.x
Han, S., and Micallef, S. A. (2016). Environmental metabolomics of the tomato plant surface
provides insights on Salmonella enterica colonization. Appl Environ Microbiol 82, 3131–
3142. doi: 10.1128/AEM.00435-16
Hanisak, M. D. (1983). “The Nitrogen Relationships of Marine Macroalgae,” in Nitrogen in the
Marine Environment, eds. E. J. Carpenter and D. G. Capone (New York: Academic Press,
Inc.), 699–730. doi: 10.1016/B978-0-12-160280-2.50027-4
Harrison, P. J., and Hurd, C. L. (2001). Nutrient physiology of seaweeds: Application of concepts
to aquaculture. Cah Biol Mar 42, 71–82. Available at:
https://www.researchgate.net/publication/235938878_Nutrient_physiology_of_seaweeds_A
pplication_of_concepts_to_aquaculture (Accessed August 1, 2020).
Hawkes, C. V., and Connor, E. W. (2017). Translating Phytobiomes from Theory to Practice:
Ecological and Evolutionary Considerations. Phytobiomes J 1, 57–69. doi:
10.1094/PBIOMES-05-17-0019-RVW
He, Y., Wang, Y., Hu, C., Sun, X., Li, Y., and Xu, N. (2019). Dynamic metabolic profiles of the
marine macroalga Ulva prolifera during fragmentation-induced proliferation. PLoS One 14,
e0214491. doi: 10.1371/journal.pone.0214491
Heffner, E. L., Sorrells, M. E., and Jannink, J. (2009). Genomic Selection for Crop Improvement.
Crop Sci 49, 1–12. doi: 10.2135/cropsci2008.08.0512
Heidkamp, C. P., Krak, L. V., Kelly, M. M. R., and Yarish, C. (2022). Geographical
considerations for capturing value in the U.S. sugar kelp (Saccharina latissima) industry.
Mar Policy 144, 105221. doi: 10.1016/j.marpol.2022.105221
Hickey, G., Paten, B., Earl, D., Zerbino, D., and Haussler, D. (2013). HAL: a hierarchical format
for storing and analyzing multiple genome alignments. Bioinformatics 29, 1341–1342. doi:
10.1093/bioinformatics/btt128
Hickey, L. T., Hafeez, A. N., Robinson, H., Jackson, S. A., Leal-Bertioli, S. C. M., Tester, M., et
al. (2019). Breeding crops to feed 10 billion. Nat Biotechnol 37, 744–754. doi:
10.1038/s41587-019-0152-9
Hoff, K. J., Lange, S., Lomsadze, A., Borodovsky, M., and Stanke, M. (2016). BRAKER1:
Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.
Bioinformatics 32, 767–769. doi: 10.1093/bioinformatics/btv661
Hopfenberg, R. (2003). Human Carrying Capacity Is Determined by Food Availability. Popul
Environ 25, 109–117. doi: 10.1023/B:POEN.0000015560.69479.c1
Horton, M. W., Bodenhausen, N., Beilsmith, K., Meng, D., Muegge, B. D., Subramanian, S., et
al. (2014). Genome-wide association study of Arabidopsis thaliana leaf microbial
community. Nat Commun 5. doi: 10.1038/ncomms6320
Hu, Z., Shan, T., Zhang, Q., Liu, F., Jueterbock, A., Wang, G., et al. (2023). Kelp breeding in
China: Challenges and opportunities for solutions. Rev Aquac. doi: 10.1111/raq.12871
Huang, M., Robbins, K. R., Li, Y., Umanzor, S., Marty-Rivera, M., Bailey, D., et al. (2022).
Simulation of sugar kelp (Saccharina latissima) breeding guided by practices to accelerate
genetic gains. G3 Genes|Genomes|Genetics 12. doi: 10.1093/g3journal/jkac003
Huang, M., Robbins, K. R., Li, Y., Umanzor, S., Marty-Rivera, M., Bailey, D., et al. (2023).
Genomic selection in algae with biphasic lifecycles: A Saccharina latissima (sugar kelp)
case study. Front Mar Sci 10. doi: 10.3389/fmars.2023.1040979
Huang, X., and Han, B. (2014). Natural Variations and Genome-Wide Association Studies in
Crop Plants. Annu Rev Plant Biol 65, 531–551. doi: 10.1146/annurev-arplant-050213-
035715
Huang, X., Yang, S., Gong, J., Zhao, Y., Feng, Q., Gong, H., et al. (2015). Genomic analysis of
hybrid rice varieties reveals numerous superior alleles that contribute to heterosis. Nat
Commun 6, 6258. doi: 10.1038/ncomms7258
Hui, C., and McGeoch, M. A. (2014). Zeta diversity as a concept and metric that unifies
incidence-based biodiversity patterns. American Naturalist 184, 684–694. doi:
10.1086/678125
Hunter, C. J. (1975). Edible Seaweeds - A Survey of the Industry and Prospects for Farming the
Pacific Northwest. Marine Fisheries Review 37, 19–26.
Hutchinson, G. E. (1957). Concluding Remarks. Cold Spring Harb Symp Quant Biol 22, 415–
427. doi: 10.1101/sqb.1957.022.01.039
Hwang, E. K., Ha, D. S., and Park, C. S. (2017). Strain selection and initiation timing influence
the cultivation period of Saccharina japonica and their impact on the abalone feed industry
in Korea. J Appl Phycol 29, 2297–2305. doi: 10.1007/s10811-017-1179-2
Hwang, E. K., Liu, F., Lee, K. H., Ha, D. S., and Park, C. S. (2018). Comparison of the
cultivation performance between Korean (Sugwawon No. 301) and Chinese strains
(Huangguan No. 1) of kelp Saccharina japonica in an aquaculture farm in Korea. ALGAE
33, 101–108. doi: 10.4490/algae.2018.33.2.4
Hwang, E. K., Yotsukura, N., Pang, S. J., Su, L., and Shan, T. F. (2019). Seaweed breeding
programs and progress in eastern Asian countries. Phycologia 58, 484–495. doi:
10.1080/00318884.2019.1639436
Igolkina, A. A., Armoskus, C., Newman, J. R. B., Evgrafov, O. V., McIntyre, L. M., Nuzhdin, S.
V., et al. (2018). Analysis of Gene Expression Variance in Schizophrenia Using Structural
Equation Modeling. Front Mol Neurosci 11, 1–12. doi: 10.3389/fnmol.2018.00192
Igolkina, A. A., Bazykin, G. A., Chizhevskaya, E. P., Provorov, N. A., and Andronov, E. E.
(2019). Matching population diversity of rhizobial nod A and legume NFR5 genes in plant–
microbe symbiosis. Ecol Evol 9, 10377–10386. doi: 10.1002/ece3.5556
Igolkina, A. A., and Samsonova, M. G. (2018). SEM: Structural Equation Modeling in Molecular
Biology. Biophysics (Oxf) 63, 139–148. doi: 10.1134/S0006350918020100
James, K., and Shears, N. T. (2016). Proliferation of the invasive kelp Undaria pinnatifida at
aquaculture sites promotes spread to coastal reefs. Mar Biol 163, 34. doi: 10.1007/s00227-
015-2811-9
Jian, L., Yan, J., and Liu, J. (2022). De Novo Domestication in the Multi-Omics Era. Plant Cell
Physiol 63, 1592–1606. doi: 10.1093/pcp/pcac077
Jiang, Y., Xiong, X., Danska, J., and Parkinson, J. (2016). Metatranscriptomic analysis of diverse
microbial communities reveals core metabolic pathways and microbiome-specific
functionality. Microbiome 4. doi: 10.1186/s40168-015-0146-x
Jones, P., Garcia, B. J., Furches, A., Tuskan, G. A., and Jacobson, D. (2019). Plant
host-associated mechanisms for microbial selection. Front Plant Sci 10, 862. doi:
10.3389/fpls.2019.00862
Jurka, J., Kapitonov, V. V., Pavlicek, A., Klonowski, P., Kohany, O., and Walichiewicz, J. (2005).
Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res 110,
462–467. doi: 10.1159/000084979
Kaakinen, M., Läärä, E., Pouta, A., Hartikainen, A. L., Laitinen, J., Tammelin, T. H., et al.
(2010). Life-course analysis of a fat mass and obesity-associated (FTO) gene variant and
body mass index in the northern Finland birth cohort 1966 using structural equation
modeling. Am J Epidemiol 172, 653–665. doi: 10.1093/aje/kwq178
Kanehisa, M. (2006). From genomics to chemical genomics: new developments in KEGG.
Nucleic Acids Res 34, D354–D357. doi: 10.1093/nar/gkj102
Karns, R., Succop, P., Zhang, G., Sun, G., Indugula, S. R., Havas-Augustin, D., et al. (2013).
Modeling metabolic syndrome through structural equations of metabolic traits, comorbid
diseases, and GWAS variants. Obesity 21, E745–E754. doi: 10.1002/oby.20445
Keeney, S. (2008). “Spo11 and the Formation of DNA Double-Strand Breaks in Meiosis,” in
Recombination and Meiosis, (Berlin, Heidelberg: Springer Berlin Heidelberg), 81–123. doi:
10.1007/7050_2007_026
Kent, W. J. (2002). BLAT—The BLAST-Like Alignment Tool. Genome Res 12, 656–664. doi:
10.1101/GR.229202
Kersten, B., Faivre Rampant, P., Mader, M., Le Paslier, M.-C., Bounon, R., Berard, A., et al.
(2016). Genome Sequences of Populus tremula Chloroplast and Mitochondrion:
Implications for Holistic Poplar Breeding. PLoS One 11, e0147209. doi:
10.1371/journal.pone.0147209
Khush, G. S. (2001). Green revolution: the way forward. Nat Rev Genet 2, 815–822. doi:
10.1038/35093585
Kim, D., Paggi, J. M., Park, C., Bennett, C., and Salzberg, S. L. (2019a). Graph-based genome
alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915.
doi: 10.1038/s41587-019-0201-4
Kim, J. K., Yarish, C., Hwang, E. K., Park, M., and Kim, Y. (2017). Seaweed aquaculture:
cultivation technologies, challenges and its ecosystem services. Algae 32, 1–13. doi:
10.4490/algae.2017.32.3.3
Kim, J., Kraemer, G., and Yarish, C. (2015). Use of sugar kelp aquaculture in Long Island Sound
and the Bronx River Estuary for nutrient extraction. Mar Ecol Prog Ser 531, 155–166. doi:
10.3354/meps11331
Kim, J., Stekoll, M., and Yarish, C. (2019b). Opportunities, challenges and future directions of
open-water seaweed aquaculture in the United States. Phycologia 58, 446–461. doi:
10.1080/00318884.2019.1625611
Kim, S.-K., and Venkatesan, J. (2015). Springer Handbook of Marine Biotechnology., 1st Edn,
ed. S.-K. Kim. Berlin, Heidelberg: Springer Berlin Heidelberg. doi: 10.1007/978-3-642-
53971-8
Kleinhofs, A. (1983). “New Horizons in Plant Genetics,” in Isozymes in Plant Genetics and
Breeding, eds. S. D. Tanksley and T. J. Orton (Amsterdam: Elsevier Science Publishing
Company Inc.), 15–23. doi: 10.1016/B978-0-444-42226-2.50007-4
Kolmogorov, M., Armstrong, J., Raney, B. J., Streeter, I., Dunn, M., Yang, F., et al. (2018).
Chromosome assembly of large and complex genomes using multiple references. Genome
Res 28, 1720–1732. doi: 10.1101/gr.236273.118
Kolmogorov, M., Yuan, J., Lin, Y., and Pevzner, P. A. (2019). Assembly of long, error-prone
reads using repeat graphs. Nat Biotechnol 37, 540–546. doi: 10.1038/s41587-019-0072-8
Koonin, E. V, Fedorova, N. D., Jackson, J. D., Jacobs, A. R., Krylov, D. M., Makarova, K. S., et
al. (2004). A comprehensive evolutionary classification of proteins encoded in complete
eukaryotic genomes. Genome Biol 5, R7. doi: 10.1186/gb-2004-5-2-r7
Korte, A., Vilhjálmsson, B. J., Segura, V., Platt, A., Long, Q., and Nordborg, M. (2012). A
mixed-model approach for genome-wide association studies of correlated traits in structured
populations. Nat Genet 44, 1066–1071. doi: 10.1038/ng.2376
Krasheninnikova, K., Diekhans, M., Armstrong, J., Dievskii, A., Paten, B., and O’Brien, S.
(2020). halSynteny: a fast, easy-to-use conserved synteny block construction method for
multiple whole-genome alignments. Gigascience 9. doi: 10.1093/gigascience/giaa047
Krueger, F., James, F., Ewels, P., Ebrahim, A., Weinstein, M., Schuster-Boeckler, B., et al.
(2021). FelixKrueger/TrimGalore. doi: 10.5281/zenodo.5127899
Krumhansl, K. A., Okamoto, D. K., Rassweiler, A., Novak, M., Bolton, J. J., Cavanaugh, K. C.,
et al. (2016). Global patterns of kelp forest change over the past half-century. Proceedings
of the National Academy of Sciences 113, 13785–13790. doi: 10.1073/pnas.1606102113
Kumar, M., Kuzhiumparambil, U., Pernice, M., Jiang, Z., and Ralph, P. J. (2016). Metabolomics:
an emerging frontier of systems biology in marine macrophytes. Algal Res 16, 76–92. doi:
10.1016/j.algal.2016.02.033
Kuo, A., Bushnell, B., and Grigoriev, I. V. (2014). “Fungal Genomics: Sequencing and
Annotation,” in Advances in Botanical Research, ed. F. M. Martin (London: Academic
Press), 1–52. doi: 10.1016/B978-0-12-397940-7.00001-X
La Barre, S. L., Weinberger, F., Kervarec, N., and Potin, P. (2004). Monitoring defensive
responses in macroalgae - Limitations and perspectives., in Phytochemistry Reviews, 371–
379. doi: 10.1007/s11101-005-1459-3
Laird, N. M., and Ware, J. H. (1982). Random-Effects Models for Longitudinal Data. Biometrics
38, 963. doi: 10.2307/2529876
Lange, C., Van Steen, K., Andrew, T., Lyon, H., DeMeo, D. L., Raby, B., et al. (2004). A
family-based association test for repeatedly measured quantitative traits adjusting for unknown
environmental and/or polygenic effects. Stat Appl Genet Mol Biol 3, 17. doi: 10.2202/1544-
6115.1067
Laycock, R. A. (1974). The detrital food chain based on seaweeds. I. Bacteria associated with the
surface of Laminaria fronds. Mar Biol 25, 223–231. doi: 10.1007/BF00394968
Li, B., and Dewey, C. N. (2011). RSEM: Accurate transcript quantification from RNA-Seq data
with or without a reference genome. BMC Bioinformatics 12, 323. doi: 10.1186/1471-2105-
12-323
Li, H. (2012). seqtk, Toolkit for processing sequences in FASTA/Q formats. Available at:
https://github.com/lh3/seqtk (Accessed March 7, 2024).
Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
ArXiv. Available at: http://arxiv.org/abs/1303.3997 (Accessed January 21, 2024).
Li, H. (2018). Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–
3100. doi: 10.1093/bioinformatics/bty191
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., et al. (2009). The Sequence
Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079. doi:
10.1093/bioinformatics/btp352
Li, J., Hooker, G. W., and Roeder, G. S. (2006). Saccharomyces cerevisiae Mer2, Mei4 and
Rec114 Form a Complex Required for Meiotic Double-Strand Break Formation. Genetics
173, 1969–1981. doi: 10.1534/genetics.106.058768
Li, Y., Umanzor, S., Ng, C., Huang, M., Marty-Rivera, M., Bailey, D., et al. (2022). Skinny kelp
(Saccharina angustissima) provides valuable genetics for the biomass improvement of
farmed sugar kelp (Saccharina latissima). J Appl Phycol, 1–13. doi: 10.1007/s10811-022-
02811-1
Lin, J. D., Lemay, M. A., and Parfrey, L. W. (2018). Diverse Bacteria Utilize Alginate Within the
Microbiome of the Giant Kelp Macrocystis pyrifera. Front Microbiol 9, 1–16. doi:
10.3389/fmicb.2018.01914
Lindsey Zemke-White, W., and Ohno, M. (1999). World seaweed utilisation: An end-of-century
summary. J Appl Phycol 11, 369–376. doi: 10.1023/A:1008197610793
Lippert, C., Casale, F. P., Rakitsch, B., and Stegle, O. (2014). LIMIX: genetic analysis of
multiple traits. bioRxiv. doi: 10.1101/003905
Liu, F., Yao, J., Wang, X., Repnikova, A., Galanin, D. A., and Duan, D. (2012a). Genetic
diversity and structure within and between wild and cultivated Saccharina japonica
(Laminariales, Phaeophyta) revealed by SSR markers. Aquaculture 358–359, 139–145. doi:
10.1016/j.aquaculture.2012.06.022
Liu, J., Yang, C., Shi, X., Li, C., Huang, J., Zhao, H., et al. (2016). Analyzing Association
Mapping in Pedigree‐Based GWAS Using a Penalized Multitrait Mixed Model. Genet
Epidemiol 40, 382–393. doi: 10.1002/gepi.21975
Liu, T., Wang, X., Wang, G., Jia, S., Liu, G., Shan, G., et al. (2019). Evolution of Complex
Thallus Alga: Genome Sequencing of Saccharina japonica. Front Genet 10. doi:
10.3389/fgene.2019.00378
Liu, X., Bogaert, K., Engelen, A. H., Leliaert, F., Roleda, M. Y., and De Clerck, O. (2017).
Seaweed reproductive biology: environmental and genetic controls. Botanica Marina 60,
89–108. doi: 10.1515/bot-2016-0091
Liu, Y., Bi, Y., Gu, J., Li, L., and Zhou, Z. (2012b). Localization of a Female-Specific Marker on
the Chromosomes of the Brown Seaweed Saccharina japonica Using Fluorescence In Situ
Hybridization. PLoS One 7, e48784. doi: 10.1371/journal.pone.0048784
Liu, Y., Bi, Y.-H., and Zhou, Z.-G. (2022). Dual-color fluorescence in situ hybridization with
combinatorial labeling probes enables a detailed karyotype analysis of Saccharina japonica.
Algal Res 62, 102636. doi: 10.1016/j.algal.2022.102636
Love, D. C., Gorski, I., and Fry, J. P. (2017). An Analysis of Nearly One Billion Dollars of
Aquaculture Grants Made by the US Federal Government from 1990 to 2015. J World
Aquac Soc 48, 689–710. doi: 10.1111/jwas.12425
Love, M. I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and
dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550. doi: 10.1186/s13059-014-
0550-8
Mabin, C. J. T., Johnson, C. R., and Wright, J. T. (2019). Physiological response to temperature,
light, and nitrates in the giant kelp Macrocystis pyrifera from Tasmania, Australia. Mar Ecol
Prog Ser 614, 1–19. doi: 10.3354/meps12900
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A., and Zdobnov, E. M. (2021). BUSCO
Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic
Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol Biol Evol 38,
4647–4654. doi: 10.1093/molbev/msab199
Mao, X., Augyte, S., Huang, M., Hare, M. P., Bailey, D., Umanzor, S., et al. (2020). Population
Genetics of Sugar Kelp Throughout the Northeastern United States Using Genome-Wide
Markers. Front Mar Sci 7, 1–13. doi: 10.3389/fmars.2020.00694
Margulis, L. (1991). “Symbiogenesis and symbionticism,” in Symbiosis As a Source of
Evolutionary Innovation: Speciation and Morphogenesis, eds. L. Margulis and R. Fester
(Cambridge (Massachusetts): The MIT Press), 1–14.
Martínez-Córdova, L. R., Emerenciano, M., Miranda-Baeza, A., and Martínez-Porchas, M.
(2015). Microbial-based systems for aquaculture of fish and shrimp: An updated review.
Rev Aquac 7, 131–148. doi: 10.1111/raq.12058
McGeoch, M. A., Latombe, G., Andrew, N. R., Nakagawa, S., Nipperess, D. A., Roigé, M., et al.
(2019). Measuring continuous compositional change using decline and decay in zeta
diversity. Ecology 100. doi: 10.1002/ecy.2832
McHugh, D. J. (2003). A guide to the seaweed industry. Rome. Available at:
http://www.fao.org/3/a-y4765e.pdf (Accessed November 15, 2020).
McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., et al. (2010).
The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation
DNA sequencing data. Genome Res 20, 1297–1303. doi: 10.1101/gr.107524.110
McMillin, D. E. (1983). “Plant Isozymes: A Historical Perspective,” in Isozymes in Plant
Genetics and Breeding, eds. S. D. Tanksley and T. J. Orton (Amsterdam: Elsevier Science
Publishing Company Inc.), 3–13. doi: 10.1016/B978-0-444-42226-2.50006-2
Melén, K., Krogh, A., and von Heijne, G. (2003). Reliability Measures for Membrane Protein
Topology Prediction Algorithms. J Mol Biol 327, 735–744. doi: 10.1016/S0022-
2836(03)00182-7
Meuwissen, T. H. E., Hayes, B. J., and Goddard, M. E. (2001). Prediction of Total Genetic Value
Using Genome-Wide Dense Marker Maps. Genetics 157, 1819–1829. doi:
10.1093/genetics/157.4.1819
Meyer, R. S., and Purugganan, M. D. (2013). Evolution of crop species: genetics of
domestication and diversification. Nat Rev Genet 14, 840–852. doi: 10.1038/nrg3605
Mi, X., Eskridge, K., Wang, D., Baenziger, P. S., Campbell, B. T., Gill, K. S., et al. (2010).
Regression-based multi-trait QTL mapping using a structural equation model. Stat Appl
Genet Mol Biol 9. doi: 10.2202/1544-6115.1552
Michel, G., Tonon, T., Scornet, D., Cock, J. M., and Kloareg, B. (2010). Central and storage
carbon metabolism of the brown alga Ectocarpus siliculosus: insights into the origin and
evolution of storage carbohydrates in Eukaryotes. New Phytologist 188, 67–81. doi:
10.1111/j.1469-8137.2010.03345.x
Mikheenko, A., Prjibelski, A., Saveliev, V., Antipov, D., and Gurevich, A. (2018). Versatile
genome assembly evaluation with QUAST-LG. Bioinformatics 34, i142–i150. doi:
10.1093/bioinformatics/bty266
Minich, J. J., Morris, M. M., Brown, M., Doane, M., Edwards, M. S., Michael, T. P., et al.
(2018). Elevated temperature drives kelp microbiome dysbiosis, while elevated carbon
dioxide induces water microbiome disruption. PLoS One 13, e0192772. doi:
10.1371/journal.pone.0192772
Miranda, L. N., Hutchison, K., Grossman, A. R., and Brawley, S. H. (2013). Diversity and
Abundance of the Bacterial Community of the Red Macroalga Porphyra umbilicalis: Did
Bacterial Farmers Produce Macroalgae? PLoS One 8, e58269. doi:
10.1371/journal.pone.0058269
Molnar, J. L., Gamboa, R. L., Revenga, C., and Spalding, M. D. (2008). Assessing the global
threat of invasive species to marine biodiversity. Front Ecol Environ 6, 485–492. doi:
10.1890/070064
Momen, M., Campbell, M. T., Walia, H., and Morota, G. (2019). Utilizing trait networks and
structural equation models as tools to interpret multi-trait genome-wide association studies.
Plant Methods 15, 107. doi: 10.1186/s13007-019-0493-x
Mooney, H. A., and Cleland, E. E. (2001). The evolutionary impact of invasive species.
Proceedings of the National Academy of Sciences 98, 5446–5451. doi:
10.1073/pnas.091093398
Morris, M. M., Haggerty, J. M., Papudeshi, B. N., Vega, A. A., Edwards, M. S., and Dinsdale, E.
A. (2016). Nearshore pelagic microbial community abundance affects recruitment success
of giant kelp, Macrocystis pyrifera. Front Microbiol 7, 1800. doi:
10.3389/fmicb.2016.01800
Mouritsen, O. G., Dawczynski, C., Duelund, L., Jahreis, G., Vetter, W., and Schröder, M. (2013).
On the human consumption of the red seaweed dulse (Palmaria palmata (L.) Weber &
Mohr). J Appl Phycol 25, 1777–1791. doi: 10.1007/s10811-013-0014-7
Müller, D. G., Maier, I., Marie, D., and Westermeier, R. (2016). Nuclear DNA level and life
cycle of kelps: Evidence for sex‐specific polyteny in Macrocystis (Laminariales,
Phaeophyceae). J Phycol 52, 157–160. doi: 10.1111/jpy.12380
Murray, S. C., Rooney, W. L., Mitchell, S. E., Sharma, A., Klein, P. E., Mullet, J. E., et al.
(2008a). Genetic Improvement of Sorghum as a Biofuel Feedstock: II. QTL for Stem and
Leaf Structural Carbohydrates. Crop Sci 48, 2180–2193. doi: 10.2135/cropsci2008.01.0068
Murray, S. C., Sharma, A., Rooney, W. L., Klein, P. E., Mullet, J. E., Mitchell, S. E., et al.
(2008b). Genetic Improvement of Sorghum as a Biofuel Feedstock: I. QTL for Stem Sugar
and Grain Nonstructural Carbohydrates. Crop Sci 48, 2165–2179. doi:
10.2135/cropsci2008.01.0016
Naragund, R., Singh, Y. V., Bana, R. S., Choudhary, A. K., Jaiswal, P., Kadam, P., et al. (2020).
Influence of crop establishment practices and microbial inoculants on plant growth and
nutrient uptake of summer green gram (Vigna radiata). Annals of Plant and Soil Research
22, 55–59. Available at:
https://www.researchgate.net/publication/339527524_Influence_of_crop_establishment_pra
ctices_and_microbial_inoculants_on_plant_growth_and_nutrient_uptake_of_summer_green
_gram_Vigna_radiata (Accessed August 1, 2020).
Nelson, D. L., and Cox, M. M. (2017). Lehninger Principles of Biochemistry, 7th Edn. New
York: W. H. Freeman.
Nemkov, T., Reisz, J. A., Gehrke, S., Hansen, K. C., and D’Alessandro, A. (2019). “High-throughput metabolomics: Isocratic and gradient mass spectrometry-based methods,” in
Methods in Molecular Biology, (Humana Press Inc.), 13–26. doi: 10.1007/978-1-4939-
9236-2_2
Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997). Identification of prokaryotic
and eukaryotic signal peptides and prediction of their cleavage sites. Protein Engineering
Design and Selection 10, 1–6. doi: 10.1093/protein/10.1.1
North, W. J. (1987). “Biology of the Macrocystis resource in North America,” in Case study of
seven commercial seaweed resources, eds. M. Doty, J. Caddy, and B. Santelices (Rome:
FAO Fisheries Technical Paper), 265–311. Available at:
https://www.fao.org/4/x5819e/x5819e00.htm (Accessed October 13, 2024).
Nosil, P., Soria-Carrasco, V., Villoutreix, R., De-la-Mora, M., de Carvalho, C. F., Parchman, T.,
et al. (2023). Complex evolutionary processes maintain an ancient chromosomal inversion.
Proceedings of the National Academy of Sciences 120. doi: 10.1073/pnas.2300673120
Nuzhdin, S. V., Friesen, M. L., and McIntyre, L. M. (2012). Genotype–phenotype mapping in a
post-GWAS world. Trends in Genetics 28, 421–426. doi: 10.1016/j.tig.2012.06.003
Obata, T., Witt, S., Lisec, J., Palacios-Rojas, N., Florez-Sarasa, I., Araus, J. L., et al. (2015).
Metabolite profiles of maize leaves in drought, heat and combined stress field trials reveal
the relationship between metabolism and grain yield. Plant Physiol 169, pp.01164.2015.
doi: 10.1104/pp.15.01164
O’Donnell, S. T., Ross, R. P., and Stanton, C. (2020). The Progress of Multi-Omics
Technologies: Determining Function in Lactic Acid Bacteria Using a Systems Level
Approach. Front Microbiol 10, 3084. doi: 10.3389/fmicb.2019.03084
Olson, N. D., Treangen, T. J., Hill, C. M., Cepeda-Espinoza, V., Ghurye, J., Koren, S., et al.
(2019). Metagenomic assembly through the lens of validation: Recent advances in assessing
and improving the quality of genomes assembled from metagenomes. Brief Bioinform 20,
1140–1150. doi: 10.1093/bib/bbx098
Orth, J. D., Thiele, I., and Palsson, B. Ø. (2010). What is flux balance analysis? Nat Biotechnol
28, 245–248. doi: 10.1038/nbt.1614
Ortiz, R., Trethowan, R., Ferrara, G. O., Iwanaga, M., Dodds, J. H., Crouch, J. H., et al. (2007).
High yield potential, shuttle breeding, genetic diversity, and a new international wheat
improvement strategy. Euphytica 157, 365–384. doi: 10.1007/s10681-007-9375-9
Ortiz-Estrada, Á. M., Gollas-Galván, T., Martínez-Córdova, L. R., and Martínez-Porchas, M.
(2019). Predictive functional profiles using metagenomic 16S rRNA data: a novel approach
to understanding the microbial ecology of aquaculture systems. Rev Aquac 11, 234–245.
doi: 10.1111/raq.12237
Ouborg, N. J., Pertoldi, C., Loeschcke, V., Bijlsma, R. K., and Hedrick, P. W. (2010).
Conservation genetics in transition to conservation genomics. Trends in Genetics 26, 177–
187. doi: 10.1016/j.tig.2010.01.001
Park, S., Seo, Y. S., and Hegeman, A. D. (2014). Plant metabolomics for plant chemical
responses to belowground community change by climate change. Journal of Plant Biology
57, 137–149. doi: 10.1007/s12374-014-0110-5
Parks, D. H., Chuvochina, M., Chaumeil, P. A., Rinke, C., Mussig, A. J., and Hugenholtz, P.
(2020). A complete domain-to-species taxonomy for Bacteria and Archaea. Nat Biotechnol,
1–8. doi: 10.1038/s41587-020-0501-8
Parks, D. H., Chuvochina, M., Waite, D. W., Rinke, C., Skarshewski, A., Chaumeil, P. A., et al.
(2018). A standardized bacterial taxonomy based on genome phylogeny substantially
revises the tree of life. Nat Biotechnol 36, 996. doi: 10.1038/nbt.4229
Peñuelas, J., Sardans, J., Estiarte, M., Ogaya, R., Carnicer, J., Coll, M., et al. (2013). Evidence of
current impact of climate change on life: A walk from genes to the biosphere. Glob Chang
Biol 19, 2303–2338. doi: 10.1111/gcb.12143
Pereira, R., and Yarish, C. (2008). “Mass Production of Marine Macroalgae,” in Encyclopedia of
Ecology, eds. S. E. Jørgensen and B. D. Fath (Oxford: Elsevier), 2236–2247. doi:
10.1016/B978-008045405-4.00066-5
Peteiro, C., Sánchez, N., and Martínez, B. (2016). Mariculture of the Asian kelp Undaria
pinnatifida and the native kelp Saccharina latissima along the Atlantic coast of Southern
Europe: An overview. Algal Res 15, 9–23. doi: 10.1016/j.algal.2016.01.012
Phaeoexplorer (2016). Phaeoexplorer: Genomic and transcriptomic data for marine microalgae.
Available at: https://phaeoexplorer.sb-roscoff.fr/ (Accessed July 30, 2024).
Phan, N. T., and Sim, S.-C. (2017). Genomic Tools and Their Implications for Vegetable
Breeding. Horticultural Science and Technology 35, 149–164. doi:
10.12972/kjhst.20170018
Phillips, N., Kapraun, D. F., Gómez Garreta, A., Ribera Siguan, M. A., Rull, J. L., Salvador
Soler, N., et al. (2011). Estimates of nuclear DNA content in 98 species of brown algae
(Phaeophyta). AoB Plants 2011. doi: 10.1093/aobpla/plr001
Phlips, E. J., and Zeman, C. (1990). Photosynthesis, growth and nitrogen fixation by epiphytic
forms of filamentous cyanobacteria from pelagic Sargassum. Bull Mar Sci 47, 613–621.
Available at:
https://www.researchgate.net/publication/233516583_Photosynthesis_Growth_and_Nitroge
n_Fixation_by_Epiphytic_Forms_of_Filamentous_Cyanobacteria_from_Pelagic_Sargassu
m (Accessed August 1, 2020).
Pimentel, D. (1991). Global warming, population growth, and natural resources for food
production. Soc Nat Resour 4, 347–363. doi: 10.1080/08941929109380766
Price, A. L., Jones, N. C., and Pevzner, P. A. (2005). De novo identification of repeat families in
large genomes. Bioinformatics 21, i351–i358. doi: 10.1093/bioinformatics/bti1018
Prigent, S., Collet, G., Dittami, S. M., Delage, L., Ethis de Corny, F., Dameron, O., et al. (2014).
The genome-scale metabolic network of Ectocarpus siliculosus (EctoGEM): a resource to
study brown algal physiology and beyond. The Plant Journal 80, 367–381. doi:
10.1111/tpj.12627
Purugganan, M. D., and Fuller, D. Q. (2009). The nature of selection during plant domestication.
Nature 457, 843–848. doi: 10.1038/nature07895
Qiu, Z., Coleman, M. A., Provost, E., Campbell, A. H., Kelaher, B. P., Dalton, S. J., et al. (2019).
Future climate change is predicted to affect the microbiome and condition of habitat-forming kelp. Proceedings of the Royal Society B: Biological Sciences 286, 20181887. doi:
10.1098/rspb.2018.1887
Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R., et al. (2005).
InterProScan: protein domains identifier. Nucleic Acids Res 33, W116–W120. doi:
10.1093/nar/gki442
Quigley, C. T. C., Morrison, H. G., Mendonça, I. R., and Brawley, S. H. (2018). A common
garden experiment with Porphyra umbilicalis (Rhodophyta) evaluates methods to study
spatial differences in the macroalgal microbiome. J Phycol 54, 653–664. doi:
10.1111/jpy.12763
Quince, C., Walker, A. W., Simpson, J. T., Loman, N. J., and Segata, N. (2017). Shotgun
metagenomics, from sampling to analysis. Nat Biotechnol 35, 833–844. doi:
10.1038/nbt.3935
R Core Team (2024). R: A Language and Environment for Statistical Computing. Available at:
https://www.R-project.org (Accessed May 21, 2024).
Rana, S., Valentin, K., Bartsch, I., and Glöckner, G. (2019). Loss of a chloroplast encoded
function could influence species range in kelp. Ecol Evol 9, 8759–8770. doi:
10.1002/ece3.5428
Rana, S., Valentin, K., Riehl, J., Blanfuné, A., Reynes, L., Thibaut, T., et al. (2021). Analysis of
organellar genomes in brown algae reveals an independent introduction of similar foreign
sequences into the mitochondrial genome. Genomics 113, 646–654. doi:
10.1016/j.ygeno.2021.01.003
Raut, Y., Morando, M., and Capone, D. G. (2018). Diazotrophic Macroalgal Associations With
Living and Decomposing Sargassum. Front Microbiol 9, 3127. doi:
10.3389/fmicb.2018.03127
Ray, D. K., Mueller, N. D., West, P. C., and Foley, J. A. (2013). Yield Trends Are Insufficient to
Double Global Crop Production by 2050. PLoS One 8, e66428. doi:
10.1371/journal.pone.0066428
Redmond, S., Green, L., Yarish, C., Kim, J., and Neefus, C. (2014). New England Seaweed
Culture Handbook. Connecticut Sea Grant. Available at:
https://digitalcommons.lib.uconn.edu/seagrant_weedcult/1 (Accessed March 13, 2024).
Rilov, G., Mazaris, A. D., Stelzenmüller, V., Helmuth, B., Wahl, M., Guy-Haim, T., et al. (2019).
Adaptive marine conservation planning in the face of climate change: What can we learn
from physiological, ecological and genetic studies? Glob Ecol Conserv 17, e00566. doi:
10.1016/j.gecco.2019.e00566
Ritter, A., Dittami, S. M., Goulitquer, S., Correa, J. A., Boyen, C., Potin, P., et al. (2014).
Transcriptomic and metabolomic analysis of copper stress acclimation in Ectocarpus
siliculosus highlights signaling and tolerance mechanisms in brown algae. BMC Plant Biol
14. doi: 10.1186/1471-2229-14-116
Robinson, N., Winberg, P., and Kirkendale, L. (2013). Genetic improvement of macroalgae:
status to date and needs for the future. J Appl Phycol 25, 703–716. doi: 10.1007/s10811-
012-9950-x
Roesijadi, G., Copping, A. E. E., Huesemann, M. H. H., Forster, J., Benemann, J. R., and Thom,
R. M. (2008). Techno-Economic Feasibility Analysis of Offshore Seaweed Farming for
Bioenergy and Biobased Products - Independent Research and Development Report - IR
Number : PNWD-3931 - Battelle Pacific Northwest Division. Available at:
www.surialink.com (Accessed August 1, 2020).
Roleda, M. Y., and Hurd, C. L. (2019). Seaweed nutrient physiology: application of concepts to
aquaculture and bioremediation. Phycologia 58, 552–562. doi:
10.1080/00318884.2019.1622920
Rose, J. M., Bricker, S. B., Deonarine, S., Ferreira, J. G., Getchis, T., Grant, J., et al. (2015).
“Nutrient Bioextraction,” in Encyclopedia of Sustainability Science and Technology, ed. R.
Meyers (New York, NY: Springer New York), 1–33. doi: 10.1007/978-1-4939-2493-6_944-
1
Saha, M., and Weinberger, F. (2019). Microbial “gardening” by a seaweed holobiont: Surface
metabolites attract protective and deter pathogenic epibacterial settlement. Journal of
Ecology 107, 2255–2265. doi: 10.1111/1365-2745.13193
Salamov, A. A., and Solovyev, V. V. (2000). Ab initio Gene Finding in Drosophila Genomic
DNA. Genome Res 10, 516–522. doi: 10.1101/gr.10.4.516
Sato, Y., Endo, H., Oikawa, H., Kanematsu, K., Naka, H., Mogamiya, M., et al. (2020). Sexual
Difference in the Optimum Environmental Conditions for Growth and Maturation of the
Brown Alga Undaria pinnatifida in the Gametophyte Stage. Genes (Basel) 11, 944. doi:
10.3390/genes11080944
Secq, M.-P. O.-L., Goër, S. L., Stam, W. T., and Olsen, J. L. (2006). Complete mitochondrial
genomes of the three brown algae (Heterokonta: Phaeophyceae) Dictyota dichotoma, Fucus
vesiculosus and Desmarestia viridis. Curr Genet 49, 47–58. doi: 10.1007/s00294-005-0031-
4
Shan, T., Yuan, J., Su, L., Li, J., Leng, X., Zhang, Y., et al. (2020). First Genome of the Brown
Alga Undaria pinnatifida: Chromosome-Level Assembly Using PacBio and Hi-C
Technologies. Front Genet 11. doi: 10.3389/fgene.2020.00140
Shin, M. G., Bulyntsev, S. V., Chang, P. L., Korbu, L. B., Carrasquila-Garcia, N., Vishnyakova,
M. A., et al. (2019). Multi-trait analysis of domestication genes in Cicer arietinum – Cicer
reticulatum hybrids with a multidimensional approach: Modeling wide crosses for crop
improvement. Plant Science 285, 122–131. doi: 10.1016/j.plantsci.2019.04.018
Sieber, C. M. K., Probst, A. J., Sharrar, A., Thomas, B. C., Hess, M., Tringe, S. G., et al. (2018).
Recovery of genomes from metagenomes via a dereplication, aggregation and scoring
strategy. Nat Microbiol 3, 836–843. doi: 10.1038/s41564-018-0171-1
Simons, A. L., Churches, N., and Nuzhdin, S. (2018). High turnover of faecal microbiome from
algal feedstock experimental manipulations in the Pacific oyster (Crassostrea gigas). Microb
Biotechnol 11, 848–858. doi: 10.1111/1751-7915.13277
Simons, A. L., Mazor, R., Stein, E. D., and Nuzhdin, S. (2019). Using alpha, beta, and zeta
diversity in describing the health of stream-based benthic macroinvertebrate communities.
Ecological Applications 29, e01896. doi: 10.1002/eap.1896
Singh, R. P., and Reddy, C. R. K. (2014). Seaweed-microbial interactions: Key functions of
seaweed-associated bacteria. FEMS Microbiol Ecol 88, 213–230. doi: 10.1111/1574-
6941.12297
Smit, A., Hubley, R., and Green, P. (2004). RepeatMasker. Open-3.0. Available at:
https://www.repeatmasker.org/ (Accessed November 1, 2023).
Smith, J. M., Brzezinski, M. A., Melack, J. M., Miller, R. J., and Reed, D. C. (2018). Urea as a
source of nitrogen to giant kelp (Macrocystis pyrifera). Limnol Oceanogr Lett 3, 365–373.
doi: 10.1002/lol2.10088
Solan, M., and Whiteley, N. eds. (2016). Stressors in the Marine Environment. Oxford: Oxford
University Press. doi: 10.1093/acprof:oso/9780198718826.001.0001
Song, H., Dong, T., Yan, X., Wang, W., Tian, Z., Sun, A., et al. (2023). Genomic selection and its
research progress in aquaculture breeding. Rev Aquac 15, 274–291. doi: 10.1111/raq.12716
Spillias, S., Valin, H., Batka, M., Sperling, F., Havlík, P., Leclère, D., et al. (2023). Reducing
global land-use pressures with seaweed farming. Nat Sustain 6, 380–390. doi:
10.1038/s41893-022-01043-y
Starko, S., Soto Gomez, M., Darby, H., Demes, K. W., Kawai, H., Yotsukura, N., et al. (2019). A
comprehensive kelp phylogeny sheds light on the evolution of an ecosystem. Mol
Phylogenet Evol 136, 138–150. doi: 10.1016/j.ympev.2019.04.012
Stekoll, M., Pryor, A., Meyer, A., Kite-Powell, H. L., Bailey, D., Barbery, K., et al. (2024).
Optimizing seaweed biomass production - a two kelp solution. J Appl Phycol, 1–11. doi:
10.1007/s10811-024-03296-w
Steneck, R. S., Graham, M. H., Bourque, B. J., Corbett, D., Erlandson, J. M., Estes, J. A., et al.
(2002). Kelp forest ecosystems: biodiversity, stability, resilience and future. Environ
Conserv 29, 436–459. doi: 10.1017/S0376892902000322
Sterck, L., Billiau, K., Abeel, T., Rouzé, P., and Van de Peer, Y. (2012). ORCAE: online resource
for community annotation of eukaryotes. Nat Methods 9, 1041–1041. doi:
10.1038/nmeth.2242
Sudhakar, K., Mamat, R., Samykano, M., Azmi, W. H., Ishak, W. F. W., and Yusaf, T. (2018). An
overview of marine macroalgae as bioresource. Renewable and Sustainable Energy Reviews
91, 165–179. doi: 10.1016/j.rser.2018.03.100
Sung, J., Lee, S., Lee, Y., Ha, S., Song, B., Kim, T., et al. (2015). Metabolomic profiling from
leaves and roots of tomato (Solanum lycopersicum L.) plants grown under nitrogen,
phosphorus or potassium-deficient condition. Plant Science 241, 55–64. doi:
10.1016/j.plantsci.2015.09.027
Surget, G., Le Lann, K., Delebecq, G., Kervarec, N., Donval, A., Poullaouec, M.-A., et al.
(2017). Seasonal phenology and metabolomics of the introduced red macroalga Gracilaria
vermiculophylla, monitored in the Bay of Brest (France). J Appl Phycol 29, 2651–2666.
doi: 10.1007/s10811-017-1060-3
Tack, J., Barkley, A., Rife, T. W., Poland, J. A., and Nalley, L. L. (2016). Quantifying variety-specific heat resistance and the potential for adaptation to climate change. Glob Chang Biol
22, 2904–2912. doi: 10.1111/gcb.13163
Tan, A., Abecasis, G. R., and Kang, H. M. (2015). Unified representation of genetic variants.
Bioinformatics 31, 2202–2204. doi: 10.1093/bioinformatics/btv112
Tan, K., Ya, P., Tan, K., Cheong, K.-L., and Fazhan, H. (2023). Ecological impact of invasive
species and pathogens introduced through bivalve aquaculture. Estuar Coast Shelf Sci 294,
108541. doi: 10.1016/j.ecss.2023.108541
Tanna, B., and Mishra, A. (2019). Nutraceutical Potential of Seaweed Polysaccharides: Structure,
Bioactivity, Safety, and Toxicity. Compr Rev Food Sci Food Saf 18, 817–831. doi:
10.1111/1541-4337.12441
Tarone, A. M., McIntyre, L. M., Harshman, L. G., and Nuzhdin, S. V. (2012). Genetic variation in
the Yolk protein expression network of Drosophila melanogaster: sex-biased negative
correlations with longevity. Heredity (Edinb) 109, 226–234. doi: 10.1038/hdy.2012.34
Teagle, H., Hawkins, S. J., Moore, P. J., and Smale, D. A. (2017). The role of kelp species as
biogenic habitat formers in coastal marine ecosystems. J Exp Mar Biol Ecol 492, 81–98.
doi: 10.1016/j.jembe.2017.01.017
Tello, D., Gil, J., Loaiza, C. D., Riascos, J. J., Cardozo, N., and Duitama, J. (2019). NGSEP3:
accurate variant calling across species and sequencing protocols. Bioinformatics 35, 4716–
4723. doi: 10.1093/bioinformatics/btz275
Ter-Hovhannisyan, V., Lomsadze, A., Chernoff, Y. O., and Borodovsky, M. (2008). Gene
prediction in novel fungal genomes using an ab initio algorithm with unsupervised training.
Genome Res 18, 1979–1990. doi: 10.1101/gr.081612.108
Thaler, E. A., Larsen, I. J., and Yu, Q. (2021). The extent of soil loss across the US Corn Belt.
Proceedings of the National Academy of Sciences 118, e1922375118. doi:
10.1073/pnas.1922375118
Thresher, R. E., and Kuris, A. M. (2004). Options for Managing Invasive Marine Species. Biol
Invasions 6, 295–300. doi: 10.1023/B:BINV.0000034598.28718.2e
Tiller, R., Gentry, R., and Richards, R. (2013). Stakeholder driven future scenarios as an element
of interdisciplinary management tools; the case of future offshore aquaculture development
and the potential effects on fishermen in Santa Barbara, California. Ocean Coast Manag 73,
127–135. doi: 10.1016/j.ocecoaman.2012.12.011
Tillich, M., Lehwark, P., Pellizzer, T., Ulbricht-Jones, E. S., Fischer, A., Bock, R., et al. (2017).
GeSeq – versatile and accurate annotation of organelle genomes. Nucleic Acids Res 45,
W6–W11. doi: 10.1093/nar/gkx391
Tseng, C. K., and Fei, X. G. (1987). “Macroalgal commercialization in the Orient,” in Twelfth
International Seaweed Symposium, (Dordrecht: Springer Netherlands), 167–172. doi:
10.1007/978-94-009-4057-4_23
UN DESA (2024). World Population Prospects 2024: Summary of Results. Available at:
https://desapublications.un.org/file/20622/download (Accessed October 20, 2024).
Van der Auwera, G. A., and O’Connor, B. D. (2020). Genomics in the Cloud: Using Docker,
GATK, and WDL in Terra. Sebastopol: O’Reilly Media, Inc.
Van Emon, J. M. (2016). The Omics Revolution in Agricultural Research. J Agric Food Chem
64, 36–44. doi: 10.1021/acs.jafc.5b04515
van Veen, J. A., van Overbeek, L. S., and van Elsas, J. D. (1997). Fate and activity of
microorganisms introduced into soil. Microbiol Mol Biol Rev 61, 121–135. doi:
10.1128/.61.2.121-135.1997
Vargas, M., Crossa, J., Reynolds, M. P., Dhungana, P., and Eskridge, K. M. (2007). Paper
Presented at International Workshop on Increasing Wheat Yield Potential, CIMMYT,
Obregon, Mexico, 20-24 March 2006: Structural equation modelling for studying genotype
x environment interactions of physiological traits affecting yield in wheat., in Journal of
Agricultural Science, 151–161. doi: 10.1017/S0021859607006806
Varshney, R., Graner, A., and Sorrells, M. (2005). Genomics-assisted breeding for crop
improvement. Trends Plant Sci 10, 621–630. doi: 10.1016/j.tplants.2005.10.004
Varshney, R. K., Bohra, A., Yu, J., Graner, A., Zhang, Q., and Sorrells, M. E. (2021). Designing
Future Crops: Genomics-Assisted Breeding Comes of Age. Trends Plant Sci 26, 631–649.
doi: 10.1016/j.tplants.2021.03.010
Vaser, R., Sović, I., Nagarajan, N., and Šikić, M. (2017). Fast and accurate de novo genome
assembly from long uncorrected reads. Genome Res 27, 737–746. doi:
10.1101/gr.214270.116
Verhulst, B., Maes, H. H., and Neale, M. C. (2017). GW-SEM: A Statistical Package to Conduct
Genome-Wide Structural Equation Modeling. Behav Genet 47, 345–359. doi:
10.1007/s10519-017-9842-6
Vollmers, J., Wiegand, S., and Kaster, A. K. (2017). Comparing and evaluating metagenome
assembly tools from a microbiologist’s perspective - Not only size matters! PLoS One 12,
e0169662. doi: 10.1371/journal.pone.0169662
Wade, R., Augyte, S., Harden, M., Nuzhdin, S., Yarish, C., and Alberto, F. (2020). Macroalgal
germplasm banking for conservation, food security, and industry. PLoS Biol 18, e3000641.
doi: 10.1371/journal.pbio.3000641
Wahl, M., Goecke, F., Labes, A., Dobretsov, S., and Weinberger, F. (2012). The second skin:
Ecological role of epibiotic biofilms on marine organisms. Front Microbiol 3, 292. doi:
10.3389/fmicb.2012.00292
Waite, R., Beveridge, M., Brummett, R., Castine, S., Chaiyawannakarn, N., Kaushik, S., et al.
(2014). Improving productivity and environmental performance of aquaculture.
Washington, DC. Available at: https://www.wri.org/publication/improving-aquaculture
(Accessed July 19, 2020).
Walia, B., and Sanders, S. (2019). Curbing food waste: A review of recent policy and action in
the USA. Renewable Agriculture and Food Systems 34, 169–177. doi:
10.1017/S1742170517000400
Walker, L. R., Hoyt, D. W., Walker, S. M., Ward, J. K., Nicora, C. D., and Bingol, K. (2016).
Unambiguous metabolite identification in high‐throughput metabolomics by hybrid 1D 1H
NMR/ESI MS1 approach. Magnetic Resonance in Chemistry 54, 998–1003. doi:
10.1002/mrc.4503
Walker, T. (2018). Selective breeding the next step for kelp culture. Hatchery International 19,
20–21. Available at:
http://magazine.hatcheryinternational.com/publication/?m=53689&i=486811&p=20
(Accessed July 19, 2020).
Wang, D., Eskridge, K. M., and Crossa, J. (2011). Identifying QTLs and Epistasis in Structured
Plant Populations Using Adaptive Mixed LASSO. J Agric Biol Environ Stat 16, 170–184.
doi: 10.1007/s13253-010-0046-2
Wang, S., Fan, X., Guan, Z., Xu, D., Zhang, X., Wang, D., et al. (2016). Sequencing of complete
mitochondrial genome of Saccharina latissima ye-C14. Mitochondrial DNA Part A 27,
4037–4038. doi: 10.3109/19401736.2014.1003832
Wang, X., Chen, Z., Li, Q., Zhang, J., Liu, S., and Duan, D. (2018). High-density SNP-based
QTL mapping and candidate gene screening for yield-related blade length and width in
Saccharina japonica (Laminariales, Phaeophyta). Sci Rep 8, 13591. doi: 10.1038/s41598-
018-32015-y
Wang, X., Yang, X., Yao, J., Li, Q., Lu, C., and Duan, D. (2023). Genetic linkage map
construction and QTL mapping of blade length and width in Saccharina japonica using
SSR and SNP markers. Front Mar Sci 10. doi: 10.3389/fmars.2023.1116412
Wang, Y., Fang, Y., and Jin, M. (2007). A ridge penalized principal-components approach based
on heritability for high-dimensional data. Hum Hered 64, 182–191. doi: 10.1159/000102991
Waterhouse, R. M., Zdobnov, E. M., and Kriventseva, E. V. (2011). Correlating Traits of Gene
Retention, Sequence Divergence, Duplicability and Essentiality in Vertebrates, Arthropods,
and Fungi. Genome Biol Evol 3, 75–86. doi: 10.1093/gbe/evq083
Weighill, D., Jones, P., Bleker, C., Ranjan, P., Shah, M., Zhao, N., et al. (2019). Multi-Phenotype
Association Decomposition: Unraveling Complex Gene-Phenotype Relationships. Front
Genet 10, 417. doi: 10.3389/fgene.2019.00417
Wen, Y. J., Zhang, H., Ni, Y. L., Huang, B., Zhang, J., Feng, J. Y., et al. (2017). Methodological
implementation of mixed linear models in multi-locus genome-wide association studies.
Brief Bioinform 19, 700–712. doi: 10.1093/bib/bbw145
Wernberg, T., Coleman, M. A., Bennett, S., Thomsen, M. S., Tuya, F., and Kelaher, B. P. (2018).
Genetic diversity and kelp forest vulnerability to climatic stress. Sci Rep 8, 1851. doi:
10.1038/s41598-018-20009-9
Wernberg, T., Krumhansl, K., Filbee-Dexter, K., and Pedersen, M. F. (2019). “Status and Trends
for the World’s Kelp Forests,” in World Seas: An Environmental Evaluation, (Elsevier), 57–
78. doi: 10.1016/B978-0-12-805052-1.00003-6
Wilkins, L. G. E., Ettinger, C. L., Jospin, G., and Eisen, J. A. (2019). Metagenome-assembled
genomes provide new insight into the microbial diversity of two thermal pools in
Kamchatka, Russia. Sci Rep 9, 3059. doi: 10.1038/s41598-019-39576-6
Woloszynek, S., Mell, J. C., Zhao, Z., Simpson, G., O’Connor, M. P., and Rosen, G. L. (2019).
Exploring thematic structure and predicted functionality of 16S rRNA amplicon data. PLoS
One 14, e0219235. doi: 10.1371/journal.pone.0219235
Wright, S. (1918). On the Nature of Size Factors. Genetics 3, 367–74. doi:
10.1093/genetics/3.4.367
Wright, S. (1921). Correlation and causation. J. Agric. Res. 20, 557–585.
Wu, T. T., Chen, Y. F., Hastie, T., Sobel, E., and Lange, K. (2009). Genome-wide association
analysis by lasso penalized logistic regression. Bioinformatics 25, 714–721. doi:
10.1093/bioinformatics/btp041
Xu, Y., and Crouch, J. H. (2008). Marker‐Assisted Selection in Plant Breeding: From
Publications to Practice. Crop Sci 48, 391–407. doi: 10.2135/cropsci2007.04.0191
Yamanaka, R., and Akiyama, K. (1993). Cultivation and utilization of Undaria pinnatifida
(wakame) as food. J Appl Phycol 5, 249–253. doi: 10.1007/BF00004026
Yan, Y.-W., Yang, H.-C., Tang, L., Li, J., Mao, Y.-X., and Mo, Z.-L. (2019). Compositional Shifts
of Bacterial Communities Associated With Pyropia yezoensis and Surrounding Seawater
Co-occurring With Red Rot Disease. Front Microbiol 10, 1666. doi:
10.3389/fmicb.2019.01666
Yang, Q., and Wang, Y. (2012). Methods for Analyzing Multivariate Phenotypes in Genetic
Association Studies. J Probab Stat 2012, 1–13. doi: 10.1155/2012/652569
Ye, N., Zhang, X., Miao, M., Fan, X., Zheng, Y., Xu, D., et al. (2015). Saccharina genomes
provide novel insight into kelp biology. Nat Commun 6, 6986. doi: 10.1038/ncomms7986
Yi, N., and Xu, S. (2008). Bayesian LASSO for quantitative trait loci mapping. Genetics 179,
1045–1055. doi: 10.1534/genetics.107.085589
Yong, W. T. L., Chin, G. J. W. L., and Rodrigues, K. F. (2016). “Genetic Identification and Mass
Propagation of Economically Important Seaweeds,” in Algae - Organisms for Imminent
Biotechnology, eds. N. Thajuddin and D. Dhanasekaran (InTech), 277–305. doi:
10.5772/62802
Yoon, H. S., Hackett, J. D., Ciniglia, C., Pinto, G., and Bhattacharya, D. (2004). A Molecular
Timeline for the Origin of Photosynthetic Eukaryotes. Mol Biol Evol 21, 809–818. doi:
10.1093/molbev/msh075
Zacharias, H. U., Altenbuchinger, M., and Gronwald, W. (2018). Statistical Analysis of NMR
Metabolic Fingerprints: Established Methods and Recent Advances. Metabolites 8, 47. doi:
10.3390/metabo8030047
Zacher, K., Bernard, M., Daniel Moreno, A., and Bartsch, I. (2019). Temperature mediates the
outcome of species interactions in early life-history stages of two sympatric kelp species.
Mar Biol 166, 161. doi: 10.1007/s00227-019-3600-7
Zehr, J. P., Braun, S., Chen, Y., and Mellon, M. (1996). Nitrogen fixation in the marine
environment: Relating genetic potential to nitrogenase activity. J Exp Mar Biol Ecol 203,
61–73. doi: 10.1016/0022-0981(96)02570-1
Zhan, X., Zhao, N., Plantinga, A., Thornton, T. A., Conneely, K. N., Epstein, M. P., et al. (2017).
Powerful Genetic Association Analysis for Common or Rare Variants with High-Dimensional
Structured Traits. Genetics 206, 1779–1790. doi: 10.1534/genetics.116.199646
Zhang, J., Liu, T., Bian, D., Zhang, L., Li, X., Liu, D., et al. (2016). Breeding and genetic
stability evaluation of the new Saccharina variety “Ailunwan” with high yield. J Appl
Phycol 28, 3413–3421. doi: 10.1007/s10811-016-0810-y
Zhang, J., Liu, T., Feng, R., Liu, C., Jin, Y., Jin, Z., et al. (2018). Breeding of the new Saccharina
variety “Sanhai” with high-yield. Aquaculture 485, 59–65. doi:
10.1016/j.aquaculture.2017.11.015
Zhou, K., Salamov, A., Kuo, A., Aerts, A. L., Kong, X., and Grigoriev, I. V. (2015). Alternative
splicing acting as a bridge in evolution. Stem Cell Investig 2, 19. doi:
10.3978/j.issn.2306-9759.2015.10.01
Zhou, X., and Stephens, M. (2012). Genome-wide efficient mixed-model analysis for association
studies. Nat Genet 44, 821–824. doi: 10.1038/ng.2310
Zsögön, A., Čermák, T., Naves, E. R., Notini, M. M., Edel, K. H., Weinl, S., et al. (2018). De
novo domestication of wild tomato using genome editing. Nat Biotechnol 36, 1211–1216.
doi: 10.1038/nbt.4272
Appendix A: Supplementary Figures
Supplementary Figure 1. The number of Ns (unknown bases) per 100 kbp (gray bars) is compared across genome
assemblies. BUSCO scoring using the Eukaryota ortholog database (ODB10, n=255) shows relative counts of
complete (blue), fragmented (yellow), and missing (red) orthologs in each assembly. The phylogenetic tree showing
relatedness of the compared species was pruned from Starko et al. (2019).
Supplementary Figure 2. Regression fits of total lengths (A) and alignment lengths (B) of Saccharina latissima
scaffolds and contigs versus homologs in Saccharina japonica (red), Macrocystis pyrifera (green), Undaria
pinnatifida (blue), and Ectocarpus sp. Ec32 (purple). Excludes artificial chromosomes. (C) Distributions of n S.
latissima scaffolds aligned to each reference chromosome. Outliers are marked with triangles.
Supplementary Figure 3. Arrangement of v1.0 assembly contigs (light blue “input” bars) onto Ragout
pseudochromosomes.
Supplementary Figure 4. Rescaffolding results on S. latissima assembly v1.0 with varying parameters to Ragout: (A)
no size filtering, chimeric scaffolds (red) allowed, (B) no size filtering, unbroken input scaffolds, (C) no size
filtering, unbroken input scaffolds, forced HAL instead of MAF, (D) short contigs filtered out of comparison
genomes, chimeric scaffolds (red) allowed, (E) short contigs filtered out of comparison genomes, unbroken input
scaffolds.
Supplementary Figure 5. Gene annotation of (A) this Saccharina latissima chloroplast genome assembly and (B) the
previously published version.
Supplementary Figure 6. Gene annotation of (A) this Saccharina latissima mitochondrial genome assembly and (B)
the previously published version.
Supplementary Figure 7. Dotplots of (A) mitochondrion and (B) chloroplast genome assemblies aligned to previous
versions. Segments that align in the forward direction (+) are blue, while segments that align along the
complementary strand (-) are orange.
Appendix B: Supplementary Tables
Library | Sequencing Platform | Average Read/Insert Size | Read Number | Assembled Sequence Coverage (x)
IYHH | Illumina (2x150) | 400 | 282,564,006 | 66.03
JAZK | Illumina-HiC (2x150) | N/A | 626,664,456 | 145.33
 | PacBio | 8,389* | 11,430,834 | 180.56
Total | | N/A | 920,659,296 | 391.92
Supplementary Table 1. Genomic libraries included in the Saccharina latissima nuclear genome assembly and their
respective assembled sequence coverage levels in the final release. *Average read length of PacBio reads.
Cutoff | Number of Reads | Basepairs | Average Read Length | Coverage
0 | 11,430,834 | 111,139,900,311 | 8,389 | 180.56x
1,000 | 11,426,757 | 111,137,691,066 | 8,391 | 180.56x
2,000 | 11,406,850 | 111,106,697,298 | 8,400 | 180.50x
3,000 | 11,362,643 | 110,992,506,950 | 8,420 | 180.31x
4,000 | 11,248,297 | 110,583,256,664 | 8,474 | 179.65x
5,000 | 10,857,495 | 108,778,137,020 | 8,662 | 176.71x
6,000 | 9,274,040 | 99,971,100,832 | 9,494 | 162.41x
7,000 | 7,501,551 | 88,486,830,709 | 10,554 | 143.75x
8,000 | 6,153,382 | 78,406,619,727 | 11,486 | 127.37x
9,000 | 5,095,863 | 69,436,380,004 | 12,353 | 112.80x
10,000 | 4,199,440 | 60,931,006,064 | 13,236 | 98.98x
11,000 | 3,414,452 | 52,698,473,044 | 14,177 | 85.61x
12,000 | 2,751,358 | 45,083,558,817 | 15,163 | 73.24x
13,000 | 2,211,845 | 38,349,212,951 | 16,160 | 62.30x
14,000 | 1,775,287 | 32,463,759,860 | 17,168 | 52.73x
15,000 | 1,425,810 | 27,403,071,208 | 18,160 | 44.51x
16,000 | 1,145,593 | 23,065,013,041 | 19,137 | 37.47x
17,000 | 921,034 | 19,364,018,470 | 20,091 | 31.45x
18,000 | 738,743 | 16,177,069,267 | 21,042 | 26.28x
19,000 | 590,689 | 13,440,647,354 | 21,979 | 21.83x
Supplementary Table 2. PacBio library statistics for the libraries included in the Saccharina latissima nuclear
genome assembly and their respective assembled sequence coverage levels.
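The coverage values in Supplementary Table 2 are consistent with dividing the retained basepairs by the assembled genome size. The following is a minimal Python sketch under that assumption. The denominator of 615.5 Mb is borrowed from the scaffold sequence total in Supplementary Table 4; the thesis does not state which genome size was used for this table, and the function name `coverage` is hypothetical. Because 615.5 Mb is itself a rounded figure, the results may differ from the tabulated values in the second decimal place.

```python
# Assumed genome size in basepairs (scaffold sequence total, 615.5 Mb).
# This denominator is an assumption, not a value stated for this table.
GENOME_SIZE_BP = 615.5e6


def coverage(basepairs: int, genome_size_bp: float = GENOME_SIZE_BP) -> float:
    """Return fold coverage: sequenced basepairs divided by genome size."""
    return basepairs / genome_size_bp


# (read-length cutoff, basepairs retained) copied from three rows of the table.
rows = [(0, 111_139_900_311), (10_000, 60_931_006_064), (19_000, 13_440_647_354)]

for cutoff, bp in rows:
    print(f"cutoff {cutoff:>6,}: {coverage(bp):.2f}x")
```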
Minimum Scaffold Length | Number of Contigs | Scaffold Size | Basepairs | % Non-gap Basepairs
5 Mb | 14 | 95,381,417 | 95,381,417 | 100.00%
2.5 Mb | 47 | 219,364,104 | 219,364,104 | 100.00%
1 Mb | 183 | 426,743,239 | 426,743,239 | 100.00%
500 Kb | 387 | 572,290,709 | 572,290,709 | 100.00%
250 Kb | 713 | 688,982,581 | 688,982,581 | 100.00%
100 Kb | 1,368 | 793,335,385 | 793,335,385 | 100.00%
50 Kb | 1,971 | 835,227,866 | 835,227,866 | 100.00%
25 Kb | 4,090 | 910,555,146 | 910,555,146 | 100.00%
10 Kb | 4,849 | 925,745,371 | 925,745,371 | 100.00%
5 Kb | 4,853 | 925,781,601 | 925,781,601 | 100.00%
2.5 Kb | 4,853 | 925,781,601 | 925,781,601 | 100.00%
1 Kb | 4,854 | 925,783,470 | 925,783,470 | 100.00%
0 bp | 4,854 | 925,783,470 | 925,783,470 | 100.00%
Supplementary Table 3. Summary statistics of the initial output of the RACON polished HiFiAsm+HIC Saccharina
latissima nuclear genome assembly. The table shows total contigs and total assembled basepairs for each set of
scaffolds greater than the size listed in the left-hand column.
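A summary of the kind shown in Supplementary Table 3 can be sketched by counting, for each length threshold, the sequences that pass it and summing their lengths. The example below uses hypothetical lengths, not the thesis data. The function name `summarize_by_min_length` is illustrative, and the inclusive comparison (`>=`) is an assumption, since the source does not say whether the threshold was strict or inclusive.

```python
def summarize_by_min_length(lengths, cutoffs):
    """For each cutoff, return (cutoff, count, total bp) of sequences >= cutoff."""
    summary = []
    for cutoff in cutoffs:
        kept = [n for n in lengths if n >= cutoff]
        summary.append((cutoff, len(kept), sum(kept)))
    return summary


# Hypothetical sequence lengths in basepairs, for illustration only.
example_lengths = [6_000_000, 3_000_000, 1_200_000, 400_000, 90_000]

for cutoff, count, total in summarize_by_min_length(
    example_lengths, [5_000_000, 1_000_000, 0]
):
    print(f"{cutoff:>9,} bp: {count} sequences, {total:,} bp")
```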
Scaffold total 218
Contig total 1,513
Scaffold sequence total 615.5 Mb
Contig sequence total 612.2 Mb (0.5% gap)
Scaffold L50 / N50 109 / 1.4 Mb
Contig L50 / N50 165 / 971.7 Kb
Supplementary Table 4. Final summary assembly statistics for the version 1.0 Saccharina latissima nuclear genome
assembly.
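For readers unfamiliar with the L50 and N50 statistics reported in Supplementary Table 4, the standard definitions can be sketched as follows: sort the sequences from longest to shortest and accumulate lengths until at least half of the total is reached. N50 is the length of the sequence at which that happens, and L50 is the number of sequences accumulated. This is a generic illustration with hypothetical lengths and a hypothetical function name `n50_l50`, not the tool used to produce the table.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running sum to half of the
        # total defines N50 (its length) and L50 (how many were needed).
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical lengths, for illustration only.
example = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(example)
print(f"N50 = {n50}, L50 = {l50}")
```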
Statistic | Ectocarpus sp. Ec32 | U. pinnatifida | M. pyrifera | S. japonica | All
Average n homologous scaffolds mapped per chromosome | 31.03 ± 16.71 | 35.09 ± 60.59 | 18.18 ± 15.5 | 31.57 ± 11.38 | 26.94 ± 31.23
Maximum n homologous scaffolds mapped per chromosome | 60 | 362 | 57 | 74 | 362
Minimum n homologous scaffolds mapped per chromosome | 1 | 6 | 1 | 16 | 1
Total n homologous scaffolds mapped | 1024 | 1123 | 1091 | 884 | 4122
Average length of homologous scaffolds (Mb) per chromosome | 17.77 ± 8.62 | 19.17 ± 20.77 | 10.11 ± 8.34 | 19 ± 6.95 | 15.28 ± 12.52
Maximum length of homologous scaffolds (Mb) per chromosome | 34.66 | 126.18 | 28.4 | 35.04 | 126.18
Minimum length of homologous scaffolds (Mb) per chromosome | 0.03 | 4.21 | 0.03 | 7.07 | 0.03
Total length of homologous scaffolds (Mb) | 586.3 | 613.38 | 606.34 | 531.98 | 2338
Average exact matches (Mb) per chromosome | 1.07 ± 0.74 | 5.68 ± 5.7 | 1.52 ± 1.39 | 0.62 ± 0.38 | 2.13 ± 3.31
Average exact matches (%) per chromosome | 6.76 ± 3.58 | 41.14 ± 9.14 | 18.92 ± 9.56 | 9.32 ± 4.02 | 19.19 ± 14.52
Maximum exact matches (Mb) per chromosome | 2.52 | 35.36 | 5.07 | 1.77 | 35.36
Minimum exact matches (Mb) per chromosome | 0 | 1.68 | 0.01 | 0.21 | 0
Total exact matches (Mb) | 35.36 | 181.89 | 90.93 | 17.24 | 325.43
Supplementary Table 5. Summary statistics of S. latissima whole genome alignments per homologous chromosome
in S. japonica, M. pyrifera, U. pinnatifida, and Ectocarpus sp. Ec32.
Abstract
As the climate changes and the human population grows, developing new sustainable fuel and food sources is imperative. Globally, seaweed aquaculture directly supplies biomass to vital industries including food, pharmaceuticals, and biofuels. Kelps, a group of large, fast-growing brown seaweeds, provide ecologically rich habitats and are attractive potential biomass feedstocks. Genomics-assisted breeding can expedite biomass production and bring kelp aquaculture to the level of modern crop agriculture. Rapid improvements in sequencing and genomic tools have allowed breeding programs to adopt selection strategies informed by genomic data and to apply them in non-model organisms. These tools can be further harnessed to analyze the metabolism and microbiome of kelps, allowing for selection of optimal metabolite composition. A review of metabolomic and metagenomic studies aimed at improving agricultural crops emphasizes the potential importance of this research in kelp breeding programs. The future of genomic selection in kelp species depends on the availability of robust genomic resources, an effort that is directly supported by the scaffolded and annotated North American sugar kelp (Saccharina latissima) genome assembly reported here. We employed this crucial genomic resource to functionally analyze genetic variants in sequenced kelp germplasm and investigate traits of interest. In support of efforts to mitigate ecological concerns regarding gene flow from aquaculture, we predicted putatively non-reproductive kelp genotypes. Developing kelp strains that will not perturb marine ecosystems is critical to sustainable kelp aquaculture and can be efficiently achieved with genomics-informed breeding programs sustained by this type of research.
Asset Metadata
Creator: DeWeese, Kelly Jacqueline (author)
Core Title: Ecologically responsible domestication of kelp facilitated by genomic tools
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Molecular Biology
Degree Conferral Date: 2024-12
Publication Date: 12/03/2024
Defense Date: 11/15/2024
Publisher: Los Angeles, California (original); University of Southern California (original); University of Southern California. Libraries (digital)
Tag: brown macroalgae, comparative genomics, genomic breeding, Kelp, OAI-PMH Harvest, seaweed, variant annotation
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Nuzhdin, Sergey (committee chair); Dean, Matthew (committee member); Mancuso, Nicholas (committee member); Tower, John (committee member)
Creator Email: kdeweese@mac.com, kdeweese@usc.edu
Unique identifier: UC11399E8J0
Identifier: etd-DeWeeseKel-13669.pdf (filename)
Legacy Identifier: etd-DeWeeseKel-13669
Document Type: Dissertation
Rights: DeWeese, Kelly Jacqueline
Internet Media Type: application/pdf
Type: texts
Source: 20241205-usctheses-batch-1226 (batch); University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu