Molecular ecology of marine cyanobacteria: Microbial assemblages as units of natural selection

Michael D. Lee

A dissertation submitted to the faculty of the University of Southern California Graduate School in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Biology)

Department of Biological Sciences
Marine Environmental Biology/Marine Biology and Biological Oceanography
University of Southern California
May 2018

Committee members:
Eric A. Webb (co-chair)
David A. Hutchins (co-chair)
Jed A. Fuhrman
Moh Y. El-Naggar

“The problem that is presented by the phytoplankton is essentially how it is possible for a number of species to coexist in a relatively isotropic or unstructured environment all competing for the same sorts of materials.”
Hutchinson 1961

Acknowledgements
Introduction
    Cyanobacteria – harbingers of doom and Life
    Carbon, nitrogen, and the global climate
    Trichodesmium
    Synechococcus
    Coevolution and the paradox of the plankton
Chapter 1 – The Trichodesmium consortium: conserved heterotrophic co-occurrence and genomic signatures of potential interactions
Chapter 2 – Transcriptional activities of the microbial consortium living with the marine nitrogen-fixing cyanobacterium Trichodesmium reveal potential roles in community-level nitrogen cycling
Chapter 3 – Genomics and pangenomics of Synechococcus in the global ocean
Conclusion
References
Supplemental material
Acknowledgements

I am grateful to many people and many opportunities that have motivated, enabled, and helped me: Elaine Schardien, my former professor and guidance counselor at Ocean County College in New Jersey who met with me and helped me plot out how to return to school when I was 25; Matthew Mongelli, my former professor and research advisor at Kean University who helped me through a difficult time; Michele Birmele, my research advisor who accepted me into my first NASA internship at the Kennedy Space Center; Brad Bebout, my research advisor at the Ames Research Center; Katrina Edwards, who drew my attention to USC and accepted me into her lab group; Jason Sylvan, who interviewed me and helped guide me through my first years at USC; Beth Orcutt, who made it possible for me to get my first paper published; Nate Walworth, Dave Hutchins, and Eric Webb, who brought me from the deep biosphere up to the surface ocean and provided me with an environment in which I could grow in many directions; external collaborators and friends such as Meren, Tom O. Delmont, and Chris Dupont; and my USC friends and colleagues Jean-Paul Baquiran, Arkadiy Garber, Gustavo Ramirez, Joshua Kling, Elaina Graham, Ella Sieradzki, and Ben Tully.

And I would like to say a very special thank you to Dr. Dale Vitale. I first met Dr. Vitale when he was the professor of a class I took at Kean University, “Spectral Identification of Organic Compounds”. It was a difficult class with many complicated topics, and Dr. Vitale presented them all with the kind of enthusiasm and wonder that made you want to understand everything. We quickly became close friends.
I appreciated the way that he constantly engaged the class to make sure everyone was able to put each piece together as we moved along, and he appreciated the way I would deduce the correct answers to his questions based on why he might be asking us something and where he might be going, rather than by assessing the question based on its content alone. Following that semester, I began taking part in organic chemistry research with him looking at spontaneous generation of optical activity – which let my thoughts and our conversations dance around origin of life topics in an environment I had never been able to enjoy before. I hadn’t really considered graduate school before becoming friends with Dr. Vitale, as it just wasn’t a path I had been exposed to or knew anything about, and I had certainly never considered that I could be an actual scientist and get to spend my time asking fundamental questions about how things work. I remember a conversation with him when he had asked if I planned on applying for graduate school, and I said I hadn’t really thought about it. He responded with: “That’s crazy, Mike! You’re built for research!” Dr. Vitale unexpectedly passed away while I was working with him in 2012. When I met his wonderful family following this tragedy, I found out he was as much a hero to them as he was to me, and that it wasn’t one of those situations where you only see one of many faces of someone. I also found out how much his family had heard about me from him. It gave me an incredible sense of self-worth and pride that someone I respected so much had thought so highly of me. The pride and motivation he instilled in me is still with me today. Thank you, Dr. Vitale, for catapulting me onto this extraordinary trajectory. 
Introduction

Cyanobacteria – harbingers of doom and Life

Cyanobacteria, in the sense most often used as a clade designation, are broadly defined as prokaryotic, oxygenic phototrophs, although by some definitions the phylum Cyanobacteria also includes a subset of aphotic cyanobacteria (Di Rienzi et al. 2013). The prevailing view of the scientific community is that multicellular eukaryotic (metazoan) life as it exists today would not have been possible without cyanobacteria. A little over 2 billion years ago, ancestors of these lineages began producing molecular oxygen (O2) as a waste product at a rate exceeding its sinks, which eventually raised the atmospheric concentration of O2 from infinitesimal to about 5% during what is now referred to as the Great Oxidation Event (Lyons et al. 2014). Though there is some evidence of early volatility, this level was roughly maintained for some 2 billion years. The Cambrian explosion, a period ~550 million years ago during which an incredible diversity of metazoan life forms evolved, seems to have been choreographed to wait backstage until cyanobacteria and plastids like chloroplasts (former cyanobacteria that had moved into eukaryotes) enabled the atmospheric O2 concentration to reach ~20% (Lyons et al. 2014). This is still the level today, and about what we humans need to be able to function properly. One of the reasons oxygen enabled Life to experiment in the fashion it has is ultimately couched in physics, with the general gist being simply that oxygen offers one of the best exchange rates around in terms of energy for your electrons. This provided new avenues for shuttling electrons around more efficiently, enabling greater energy to be generated and spent. This fundamental change in the energy infrastructure of the Earth ~2 billion years ago did not come without consequences, however. At the time, oxygen was toxic to almost all life on the planet.
Cyanobacteria therefore were polluting their world with this waste product, and ultimately caused what was likely the first global mass extinction. Additionally, this change in atmospheric O2 concentration had a tremendous impact on the planet’s climate. The shift from a largely reducing atmosphere to an oxidizing one altered the equilibrium between methane and carbon dioxide (CO2). Because methane is much more efficient than carbon dioxide at trapping heat radiated from the Earth, this shift away from methane in turn led to a global ice age, perhaps the first “Snowball Earth” event, further exacerbating the ongoing extinction event. Even if considering only their function in oxygenating the world, one would be hard-pressed to overstate the role cyanobacteria have played in the coevolution of the Earth and its biosphere. But their impact and influence extend far beyond just oxygen production.

Carbon, nitrogen, and the global climate

Today cyanobacteria and their descendant chloroplasts are still sustaining virtually the entirety of the world’s oxygen demand, but equally important, they supply much of the world’s fixed carbon as well. Marine phytoplankton alone (including cyanobacteria and unicellular eukaryotic photoautotrophs) are estimated to be responsible for about half of global net primary production (Field et al. 1998; Flombaum et al. 2013). Without substantive, immediate changes to current greenhouse gas emissions, the latest Intergovernmental Panel on Climate Change (IPCC) report predicts that atmospheric CO2 concentration will be greater than 900 ppm by the end of this century (Collins et al. 2014). This CO2 influx, along with rising temperatures, is resulting in various changes in the marine biome such as reduced pH, increased stratification, limited nutrient mixing, and increased light availability (e.g. Rost et al. 2008; Gao et al. 2012).
In the face of the current anthropogenically induced global changes, rates of CO2 drawdown and sequestration are of key interest. The nitrogen (N) cycle on Earth is mediated by microbial activity and has also played a profound role regarding both the atmosphere and biosphere over our planet’s history. Driven by the large redox fluctuations caused by oxygenic photosynthesis, there is believed to have been a shifting balance between biological N fixation drawing down N2 and denitrification replenishing it (Canfield, Glazer, and Falkowski 2010). The history of our N cycle is described as a series of transitions starting with the anoxic and high-NH4+ conditions present during the evolution of the earliest biosphere and ending with the conditions of today’s oceans (oxic and low NH4+), with intervening periods characterized by anoxic conditions with low NH4+ (as the early biosphere used up the available NH4+) and by a sub-oxic euxinic ocean containing high concentrations of NH4+ and some NO3- due to the activity of ammonium-oxidizing bacteria utilizing photosynthetically produced oxygen to oxidize NH4+ (Canfield, Glazer, and Falkowski 2010). Biological N cycling has also played a role in modulating our planet’s climate over the past 4 billion years, at times protecting from “snowball-Earth” events, and at other times perhaps facilitating them. At times when the global redox state allowed N fixation to reign over denitrification, the drawdown of N2 is believed to have caused a drop in atmospheric pressure, thereby weakening the warming effects of greenhouse gases. Conversely, under anoxic conditions, when denitrification dominated this balance, N2 would be returned to the atmosphere, increasing its pressure and overall greenhouse effect. Beyond this, there is good evidence that bioavailability of N is a limiting factor in the production of biomass on the Earth today, and has been throughout its history (e.g. Canfield, Glazer, and Falkowski 2010; Falkowski and Godfrey 2008).
Understanding the biological machinery responsible for N cycling is prudent not only for its role in enabling the proliferation of all global biological productivity, but also because it may be an integral knob involved in the tuning of a world’s climate. This work focuses on two keystone lineages of marine cyanobacteria that continue to successfully propagate via very different lifestyles: Trichodesmium, a filamentous nitrogen fixer (diazotroph) that often forms macroscopic colonies, experiences characteristic bloom events in between bouts of low abundance, and seemingly lives in perpetual association with a consortium of other organisms; and Synechococcus, a coccoid, more traditionally “free-living” cyanobacterium that is consistently present throughout most of the global surface ocean as co-occurring “subpopulations” (i.e. different genomic lineages). In both cases, whether considering the community of different taxa that stably coexist in the Trichodesmium consortium, or the stable co-occurrence of different subpopulations of Synechococcus, investigating their respective microbial assemblages as a whole has provided new insights into their diversity, ecology, and evolutionary histories.

Trichodesmium

Trichodesmium spp. have been observed in the open ocean for centuries, and their biogeochemical significance – due primarily to their contribution of new N and subsequent modulation of primary production in the tropics and subtropics – has become increasingly recognized over the past several decades (e.g. Capone and Carpenter 1982; Bergman and Carpenter 1991; Capone et al. 1997; Capone et al. 2005; Bergman et al. 2013). Accordingly, many experiments have been carried out exposing these filamentous, diazotrophic cyanobacteria to varying levels of CO2 alone (e.g. Barcelos e Ramos et al. 2007; Levitan et al. 2007; Kranz et al. 2009; Hutchins et al. 2013; Gradoville, White, and Letelier 2014; Hutchins et al.
2015) as well as in tandem with fluctuations in other relevant variables such as nutrient availability, light, and temperature (e.g. Hutchins et al. 2007; Fu et al. 2008; Kranz et al. 2010; Law et al. 2012). While the majority of these studies have consistently found increases in N fixation, carbon fixation, and growth rates under elevated CO2, some responses have varied. One study that looked at four strains of Trichodesmium spp. from geographically distinct locations found significant differences between responses to higher CO2 levels – indicating that intragenus, and even in some cases intraspecies, variants may enjoy different selective advantages in the changing ocean (Hutchins et al. 2013). Further highlighting the complexity of the Trichodesmium genus, a long-term CO2 manipulation experiment (~7 years) recently observed not only sustained increases in growth rate and N fixation in high-CO2 adapted lines, but also that these high-CO2 adapted physiologies were maintained (for at least 2 years) after being switched back into the ancestral environment (ambient CO2; Hutchins et al. 2015). Additionally, initially somewhat counterintuitive responses to co-limitation have been observed recently, wherein Trichodesmium spp. have been found to exhibit increased fitness in co-limitation treatments relative to corresponding single-limitation scenarios (Garcia et al. 2015; Walworth et al. 2016). Results such as these highlight the difficult task of trying to understand and predict this keystone organism’s responses to the changing global biome, and how those responses will echo throughout marine microbial systems. Further confounding these efforts is the fact that there are no axenic cultures of Trichodesmium. In the environment, Trichodesmium often forms large colonies consisting of tens to hundreds of filaments, which in turn are each composed of tens to hundreds of cells (Capone et al. 1997).
These colonies, like many cyanobacteria and eukaryotic algae, serve as relatively nutrient-rich substrates that many other organisms colonize (Paerl, Bebout, and Prufert 1989; Siddiqui et al. 1992; Sheridan et al. 2002; Hewson et al. 2009; Hmelo, Van Mooy, and Mincer 2012; Lee et al. 2017a; Lee et al. 2017b). The lack of any axenic cultures of Trichodesmium means that all ongoing experiments may be more aptly described as being carried out on a consortium of organisms that is supported by the diazotrophic host. As such, any interpretation of experimental observations should take into consideration that the results may actually be an agglomeration, a net effect, of many interwoven physiologies, and when possible studies should attempt to incorporate some characterization of the associated community, as has been suggested (e.g. Borstad 1978; Hmelo et al. 2002). Overall, the extent to which these consortial interactions modulate Trichodesmium physiology and N fixation remains largely unstudied, despite having long been recognized as significant (e.g. Borstad 1978; O’Neil and Roman 1992).

Synechococcus

In the marine environment, Synechococcus is a free-living unicellular cyanobacterium that can be highly abundant throughout the global surface ocean. It is present in detectable amounts virtually everywhere other than high latitudes, but is often in much greater abundance in nutrient-enriched locations like coastal and upwelling areas (e.g. Flombaum et al. 2013; Sohm et al. 2015; Farrant et al. 2016). It is estimated that Synechococcus and its closest relative Prochlorococcus together account for up to 25% of net marine primary productivity (Flombaum et al. 2013). Recent developments in cultivation-independent technologies capable of interrogating in situ natural populations have opened up a complex and intricate world regarding stably coexisting “subpopulation” variability within these organisms (i.e. different, but closely related, genomic lineages; e.g.
Marston et al. 2012; Sun and Blanchard 2014; Kashtan et al. 2014). For instance, a study employing single-cell genomics reported hundreds of co-occurring subpopulations of Prochlorococcus in natural samples (Kashtan et al. 2014). With regard to Synechococcus, high-throughput marker-gene studies have observed multiple coexisting clades (e.g. Sohm et al. 2015; Farrant et al. 2016). The recent drastic increase in sequenced bacterial and archaeal genomes has enabled pangenomic analyses (i.e. considering the entire genetic complement of an ad hoc set of genomes selected based on an arbitrarily defined relatedness suited to the question of the researchers), which have illuminated an expansive genomic architecture that underlies the available genetic landscape of most known prokaryotes (Rodriguez-Valera and Ussery 2012). For instance, a pangenomic analysis of 61 Escherichia coli genomes identified ~16,000 gene families in the pangenome, while each individual genome harbors ~5,000 genes (Lukjancenko, Wassenaar, and Ussery 2010). This type of pangenomic openness has recently been characterized in 31 Prochlorococcus genomes as well (Delmont and Eren 2018), and also in Synechococcus (Lee et al. in prep, see Ch. 3). These pangenomic studies identify a “core” set of genes, those present in all included genomes, and an “accessory” set of genes, those not present in all. The generation and sustained nature of multiple subpopulations in environmental samples, such as occurs with Prochlorococcus and Synechococcus, is expected to be due to variation in the accessory set of genes, which are often found spatially constrained in what are called flexible regions, or hypervariable genomic islands (e.g. Rodriguez-Valera and Ussery 2012; Delmont and Eren 2018).
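The core/accessory distinction described above can be illustrated with a minimal sketch. It assumes genes have already been clustered into gene families (the clustering step itself is not shown), and the function name, genome names, and gene-family identifiers below are hypothetical placeholders rather than anything taken from this work.

```python
def partition_pangenome(gene_families_by_genome):
    """Split a set of genomes into pangenome, core, and accessory gene families.

    gene_families_by_genome maps a genome name to the collection of
    gene-family identifiers detected in that genome (hypothetical input;
    assumes gene-family clustering was done beforehand).
    """
    genomes = [set(families) for families in gene_families_by_genome.values()]
    if not genomes:
        return set(), set(), set()
    pangenome = set().union(*genomes)    # every family seen in any genome
    core = set.intersection(*genomes)    # families present in all genomes
    accessory = pangenome - core         # families missing from at least one genome
    return pangenome, core, accessory


# Toy example with three made-up genomes.
toy = {
    "genome_A": {"gf1", "gf2", "gf3"},
    "genome_B": {"gf1", "gf2", "gf4"},
    "genome_C": {"gf1", "gf2", "gf5", "gf6"},
}
pangenome, core, accessory = partition_pangenome(toy)
print(len(pangenome), sorted(core), sorted(accessory))
```

In this toy case each genome carries three or four families while the pangenome holds six, a small-scale version of the "openness" described for E. coli, where the pangenome is several times larger than any single genome.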
Coevolution and the paradox of the plankton

As mentioned above, many primary producers are associated with a heterotrophic community, often through physical attachment and direct colonization, though not necessarily (Fisher, Wilcox, and Graham 1998; Sapp et al. 2007; Nausch 1996; Hmelo, Van Mooy, and Mincer 2012; Lee et al. 2017a; Lee et al. 2017b). A common feature of these associations is the passive transfer from photoautotroph to heterotroph of organic carbon, and in the case of diazotrophs like Trichodesmium, fixed N as well (Mulholland et al. 2006). These assemblages form a complex interorganismal network where, despite some likely competitive interactions (Amin, Parker, and Armbrust 2012), many mutualistic relationships have been found. For example, heterotrophs can assist their associated photoautotroph in several ways, such as reduction of local concentrations of hazardous compounds, assisting in the acquisition of trace metals, and the secretion of anti-biofouling agents (e.g. Paerl and Pinckney 1996; Morris et al. 2008; Bertrand and Allen 2012; Beliaev et al. 2014; Bertrand et al. 2015). Such mutualistic relationships are believed to have developed through co-evolutionary histories (Sison-Mangus et al. 2014; Stevenson and Waterbury 2006; Amin, Parker, and Armbrust 2012). One model that has been proposed as a mechanism for the development of such relationships is known as the Black Queen Hypothesis (BQH; Morris, Lenski, and Zinser 2012). The BQH rests on two main suppositions: 1) some bacterial functions are “leaky”, meaning they can affect or be utilized by other nearby organisms, and are as such considered “public goods”; and 2) organisms benefiting from these public goods can then experience a selective pressure resulting in the loss of their own costly pathways involved in producing or affecting that public good. This model predicts the development of various interorganismal dependencies such as those that may be occurring within the Trichodesmium consortium.
Throughout its history in the literature, given the initial difficulty of cultivating it and the fact that no axenic cultures exist, various researchers have suggested that obligate relationships may exist between Trichodesmium and its associated community (Paerl, Bebout, and Prufert 1989; Zehr 1995; Waterbury 2006; Hmelo, Van Mooy, and Mincer 2012; Lee et al. 2017a; Lee et al. 2017b). Indeed, one of the most important facets of Trichodesmium physiology, N fixation, may be tied to rates of respiration and exopolysaccharide production by its associated consortium. The co-evolutionary forces at work here ultimately create a supraorganismal structure, wherein should the host suffer a loss of fitness for some reason, so too shall the surrounding epibionts. This establishes the assemblage as a whole as a selective unit, as selective forces do not act in isolation on any individual organismal member due to the self-regulating feedback systems intrinsic to the relationship. Even before microbial ecologists had the technology to grant them the level of resolution required to delineate hundreds of subpopulations of coexisting cyanobacteria (Kashtan et al. 2014), questions were posed about why the apparent scenario of similar organisms competing for similar resources did not lead to the out-competition and extinction of many or most of them, and ultimately to less diversity, in a sufficiently mixed environment such as the global ocean.
This was termed the “paradox of the plankton”, and Hutchinson worded the concept as follows in his seminal paper over half a century ago: “The problem that is presented by the phytoplankton is essentially how it is possible for a number of species to coexist in a relatively isotropic or unstructured environment all competing for the same sorts of materials.” (Hutchinson 1961). The reason this was considered a paradox was that the sustained presence of such a diversity of organisms competing for similar materials seemed to be at odds with what has become known as the competitive exclusion principle, a formulation of the concept that “complete competitors cannot coexist” (Hardin 1960). Even from its inception, though, this paradox was more an acknowledgement of ignorance than anything else, meaning it was anticipated that there were likely more factors at play beyond the level of resolution then attainable. Indeed, 20 years before Hutchinson penned his paradox of the plankton article, solidifying that as its moniker, he had already expressed at least one reason that the diversity of phytoplankton that existed in natural populations should very well be expected, that being that the constantly changing surrounding environmental factors precluded the system’s ever reaching a state of equilibrium (Hutchinson 1941). Hutchinson also provided a few more, non-mutually exclusive, hypotheses in his 1961 work that circumvented the competitive exclusion principle without violating it, including vertical gradients, predation, and the idea of symbioses exactly like those described in the Black Queen Hypothesis: “Since some phytoplankters require vitamins and others do not, a more generally efficient species, requiring vitamins produced in excess by an otherwise less efficient species not requiring such compounds, can produce a mixed equilibrium population.” (Hutchinson 1961).
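The tension between competitive exclusion and coexistence can be made concrete with the textbook two-species Lotka-Volterra competition model. The sketch below is only an illustration under stated assumptions: the model, the parameter values, and the function name are not taken from Hardin, Hutchinson, or this work, and the integration is a plain forward-Euler scheme chosen for brevity.

```python
def simulate_competition(alpha_12, alpha_21, k1=1.0, k2=0.8,
                         r1=1.0, r2=1.0, n1=0.1, n2=0.1,
                         dt=0.01, steps=20000):
    """Integrate two-species Lotka-Volterra competition with forward Euler.

    alpha_12 is the per-capita effect of species 2 on species 1 (and
    alpha_21 the reverse); k1 and k2 are carrying capacities. All values
    are illustrative assumptions. Returns the final abundances (n1, n2).
    """
    for _ in range(steps):
        dn1 = r1 * n1 * (1.0 - (n1 + alpha_12 * n2) / k1)
        dn2 = r2 * n2 * (1.0 - (n2 + alpha_21 * n1) / k2)
        n1 += dt * dn1
        n2 += dt * dn2
    return n1, n2


# "Complete competitors": each species affects the other exactly as much
# as it affects itself, so the one with the lower carrying capacity is lost.
excluded = simulate_competition(alpha_12=1.0, alpha_21=1.0)

# Partial niche differentiation: interspecific effects are weaker than
# intraspecific ones, and both species persist.
coexisting = simulate_competition(alpha_12=0.5, alpha_21=0.5)

print(excluded, coexisting)
```

With these assumed parameters the second case settles near the analytical equilibrium (0.8, 0.4), whereas in the first case species 2 declines toward zero. Any unmeasured niche difference that lowers the competition coefficients is enough to move a system from the first regime to the second, which is the sense in which stable coexistence does not falsify the principle.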
It was clear the competitive exclusion principle was not meant to be taken literally for a few reasons. First, Hardin nicely presented why this theory could not be tested empirically as it was essentially non-falsifiable (Hardin 1960). He illustrates this by assuming an experimental setup in which two organisms compete: for the scenario in which one competes the other to exclusion, the theory is confirmed; whereas for the scenario wherein they stably coexist it is not falsified, but rather it is presumed that there were unknown variations in their ecological niches (Hardin 1960). Hardin also clearly made the case for why the competitive exclusion principle’s utility is not found in being stated as an axiom, but rather in its being one contributing concept of a larger framework (Hardin 1960). And from Hutchinson’s perspective, it basically served as the null hypothesis (Hutchinson 1961). Today, technological advances in nucleic acid sequencing have given us greater resolution into these systems, allowing us to peer deeper into the ignorance that those like Hardin and Hutchinson acknowledged over 50 years ago, and to decipher a little further why the null hypothesis of the competitive exclusion principle is indeed not being violated. This body of work explores the community associated with Trichodesmium and co-occurring subpopulations of Synechococcus as whole entities in order to gain insight into their diversity, ecology, and evolution in the marine environment, and attempts to emphasize the value in viewing these assemblages (as a whole) as a substantive unit upon which natural selection acts.
ORIGINAL ARTICLE

The Trichodesmium consortium: conserved heterotrophic co-occurrence and genomic signatures of potential interactions

Michael D Lee (1), Nathan G Walworth (1), Erin L McParland (1), Fei-Xue Fu (1), Tracy J Mincer (2), Naomi M Levine (1), David A Hutchins (1) and Eric A Webb (1)

(1) Marine Environmental Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA and (2) Woods Hole Oceanographic Institute, Department of Marine Chemistry and Geochemistry, Woods Hole, MA, USA

The nitrogen (N)-fixing cyanobacterium Trichodesmium is globally distributed in warm, oligotrophic oceans, where it contributes a substantial proportion of new N and fuels primary production. These photoautotrophs form macroscopic colonies that serve as relatively nutrient-rich substrates that are colonized by many other organisms. The nature of these associations may modulate ocean N and carbon (C) cycling, and can offer insights into marine co-evolutionary mechanisms. Here we integrate multiple omics-based and experimental approaches to investigate Trichodesmium-associated bacterial consortia in both laboratory cultures and natural environmental samples. These efforts have identified the conserved presence of a species of Gammaproteobacteria (Alteromonas macleodii), and enabled the assembly of a near-complete, representative genome. Interorganismal comparative genomics between A. macleodii and Trichodesmium reveal potential interactions that may contribute to the maintenance of this association involving iron and phosphorus acquisition, vitamin B12 exchange, small C compound catabolism, and detoxification of reactive oxygen species. These results identify what may be a keystone organism within Trichodesmium consortia and support the idea that functional selection has a major role in structuring associated microbial communities.
These interactions, along with likely many others, may facilitate Trichodesmium’s unique open-ocean lifestyle, and could have broad implications for oligotrophic ecosystems and elemental cycling.

The ISME Journal (2017) 11, 1813–1824; doi:10.1038/ismej.2017.49; published online 25 April 2017

Introduction

Most primary producers are associated with a heterotrophic community, often through physical attachment and direct colonization (Nausch, 1996; Fisher et al., 1998; Sapp et al., 2007; Hmelo et al., 2012). A common feature of these associations is the passive transfer from host to epibiont of organic carbon (C), and in the case of diazotrophs such as Trichodesmium, reduced nitrogen (N) (Mulholland et al., 2006). These assemblages provide a dynamic interorganismal interface wherein, despite some likely competitive interactions (Amin et al., 2012), many relationships have been found to be mutualistic. For example, epibionts can assist their host through the acquisition of trace metals or other nutrients, reduction of local concentrations of hazardous compounds, and secretion of anti-biofouling agents (Paerl and Pinckney, 1996; Morris et al., 2008; Bertrand and Allen, 2012; Beliaev et al., 2014; Bertrand et al., 2015). Elucidating the nature of these interactions and their consequent modulation of host/colony elemental fluxes is integral to developing a complete understanding of marine biogeochemical cycling. At a broad taxonomic level, Alphaproteobacteria, Gammaproteobacteria and Bacteroidetes have been consistently found in association with many photoautotrophs (Fisher et al., 1998; Sapp et al., 2007; Amin et al., 2012; Bertrand et al., 2015), including Trichodesmium (Hewson et al., 2009; Hmelo et al., 2012; Rouco et al., 2016), likely because of general lifestyle characteristics such as attachment and opportunism (Hmelo et al., 2012).
There has also been evidence supporting host-specific associations at finer taxonomic resolutions within these clades (Stevenson and Waterbury, 2006; Lachnit et al., 2009; Guannel et al., 2011; Sison-mangus et al., 2014). For instance, heterotrophs co-occurring with the diatom Pseudo-nitzschia have been shown to vary as a function of host species and toxicity (Guannel et al., 2011), with some epibionts being mutualistic with regard to their native host, yet parasitic to others (Sison-mangus et al., 2014).

Correspondence: EA Webb, Marine Environmental Biology Section, Department of Biological Sciences, University of Southern California, 3616 Trousdale Parkway, AHF 331a, Los Angeles, CA 90089, USA. E-mail: eawebb@usc.edu
Received 4 November 2016; revised 31 December 2016; accepted 16 February 2017; published online 25 April 2017
© 2017 International Society for Microbial Ecology. All rights reserved. 1751-7362/17. www.nature.com/ismej

Such organismal-specific mutualistic relationships are believed to have developed through co-evolutionary histories (Stevenson and Waterbury, 2006; Amin et al., 2012; Sison-mangus et al., 2014). The recently proposed Black Queen Hypothesis (BQH) (Morris et al., 2012) offers a mechanistic framework that could drive such interweavings of evolutionary paths. The BQH is built upon two main suppositions: (1) certain bacterial functions are ‘leaky’, meaning they can affect or be used by nearby organisms, and are therefore considered ‘public goods’ (for example, the exudation of fixed C by photoautotrophs); and (2) organisms making use of these public goods may then experience a positive selective pressure resulting in the loss of their own costly pathways responsible for those particular goods (Morris et al., 2012; Sachs and Hollowell, 2012).
This process is anticipated to leave in its wake various interorganismal dependencies that can guide community structure and ultimately lead to highly specific associations (Sachs and Hollowell, 2012; Morris, 2015).

Trichodesmium spp. are notoriously difficult to maintain in culture (Paerl et al., 1989; Waterbury, 2006). It has been suggested that this is perhaps in part because of the existence of obligate dependencies between host and epibiont (Paerl et al., 1989; Zehr, 1995; Waterbury, 2006; Hmelo et al., 2012). However, the extent to which the interactions between Trichodesmium and their consortia modulate Trichodesmium physiology and N2 fixation remains largely unstudied, despite having long been recognized as significant (Borstad, 1978; O’Neil and Roman, 1992).

Here we investigate microbial communities associated with Trichodesmium spp. in laboratory enrichments and natural environmental samples, and present a highly conserved association found to be present in both laboratory samples and in situ. We then use an interorganismal comparative genomics approach to generate genetic-based hypotheses as to what interactions may be contributing to the maintenance of these organisms’ co-occurrence.

Materials and methods

Nucleic acid extractions

DNA extractions were performed for this study for tag analysis only. All other data sets were obtained from prior studies (detailed in Supplementary Table S1). Extractions utilized the FastDNA Spin Kit for Soil (MP Biomedicals, Santa Ana, CA, USA) following the manufacturer’s protocol. For enrichments currently maintained, ~75 ml were filtered onto 5 μm polycarbonate Nucleopore filters (Whatman, Pittsburgh, PA, USA). Filters were placed directly into the lysis tubes of the extraction kit. Protocol blanks (nothing added to lysis tubes) were performed to track potential kit-introduced contamination.
Tag data sequencing and analysis

DNA from the 11 samples extracted for this study (plus two blanks) was sent for sequencing by a commercial vendor (Molecular Research LP, MR DNA, Shallowater, TX, USA). Illumina (San Diego, CA, USA) MiSeq paired-end (2 × 300 bp) sequencing was performed with primers targeting the V4V5 region of the 16S ribosomal RNA (rRNA) gene (515f: 5′-GTGCCAGCMGCCGCGGTAA-3′; 926r: 5′-CCGYCAATTYMTTTRAGTTT-3′; Parada et al., 2015). Library preparation and sequencing were carried out at the facility following Illumina library preparation protocols.

Tag data curation and initial merging of paired reads were performed within mothur v.1.36.1 (Schloss et al., 2009) following the mothur Illumina MiSeq Standard Operating Procedure (Kozich et al., 2013). These merged and quality-filtered sequences were demultiplexed, primers trimmed, and contigs clustered using Minimum Entropy Decomposition (MED) (Eren et al., 2014). MED is an unsupervised, non-alignment-based algorithm that enables single-nucleotide resolution when segregating amplicon sequences. This has been shown to often yield biologically relevant representative sequences (referred to herein as ‘oligotypes’) that differ by less than traditional clustering of operational taxonomic units can resolve, even at 99% ID similarity (Eren et al., 2013; Mclellan et al., 2013). Six ‘non-Trichodesmium’, particle-size fraction samples previously sequenced with the same primers (Parada et al., 2015) were added to our data set before clustering with MED in order to address and negate the possibility of the ubiquitous presence of any oligotypes. Extraction blanks (no sample or DNA added to DNA lysis tubes) were used to identify and remove potential contaminants resulting from the lab or extraction kit as described previously (Lee et al., 2015).
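As an illustration of the primer-trimming step above, the following is a minimal Python sketch, not the mothur or MED code used in this study. It expands the degenerate 515f and 926r primers into regular expressions using the standard IUPAC nucleotide codes and returns the region lying between them in a merged read. The function names are hypothetical, and exact (mismatch-free) primer matching is an assumption made for brevity.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement of each IUPAC code (degenerate codes map to degenerate codes).
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}

FWD_515F = "GTGCCAGCMGCCGCGGTAA"   # primer sequence quoted in the text
REV_926R = "CCGYCAATTYMTTTRAGTTT"  # primer sequence quoted in the text


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching all its variants."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def reverse_complement(seq):
    """Reverse-complement a sequence that may contain IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def trim_primers(contig, fwd_primer=FWD_515F, rev_primer=REV_926R):
    """Return the sequence between the primers, or None if either is missing.

    Assumes the contig is in forward orientation, so the reverse primer
    appears as its reverse complement downstream of the forward primer.
    """
    fwd = primer_to_regex(fwd_primer).search(contig)
    if fwd is None:
        return None
    rev_pattern = primer_to_regex(reverse_complement(rev_primer))
    rev = rev_pattern.search(contig, fwd.end())
    if rev is None:
        return None
    return contig[fwd.end():rev.start()]
```

In practice, tools such as mothur tolerate a configurable number of primer mismatches; the exact-match rule here is only meant to show how the degenerate positions (M, Y, R) widen what each primer can bind.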
Other 16S rRNA sequences used in this study included heterotrophs isolated from Trichodesmium cultures (see below), a clone library study of associated epibionts of natural colonies (Hmelo et al., 2012), 16S rRNA sequences derived from an environmental Trichodesmium metagenomic sample from the Sargasso Sea (Walworth et al., 2015) and a recent marker-gene study (Rouco et al., 2016). These sequences were trimmed to cover only the V4V5 region with mothur v.1.36.1 (Schloss et al., 2009) by aligning to the mothur-recreated Silva SEED database v119 and then trimming to the targeted region using database positions 11895:28464. These and the recovered oligotypes from this study were then aligned in Geneious v.9.0.5 (Kearse et al., 2012), using Muscle (Edgar, 2004) with default settings, and a maximum likelihood phylogenetic tree was constructed with the PhyML plug-in v.2.2.0 (default settings, 100 bootstraps). In silico analysis of the primers utilized in the Hmelo et al. (2012) clone library study was done with the web-based ‘PCR Products’ tool (www.bioinformatics.org/sms2/pcr_products.html; Stothard, 2000).

The Trichodesmium consortium MD Lee et al 1814 The ISME Journal 12

Metagenomic and metatranscriptomic sequencing and analysis

Various data sets from multiple sources were used in this study and are described in Supplementary Table S1, with further information presented in Supplementary Table S2. Five metagenomic data sets from a long-term Trichodesmium CO2 manipulation experiment, presented elsewhere (Hutchins et al., 2015), were generated through sequencing performed at USC’s Epigenome Center (Los Angeles, CA, USA). DNA from these samples was previously extracted with the MoBio DNA PowerPlant kit (MoBio, Carlsbad, CA, USA), and sequenced with Illumina’s HiSeq (2 × 50 bp) with a 300-bp insert. Raw data were quality-filtered (minimum quality 20, minimum length 35) with the FASTX-toolkit (hannonlab.cshl.edu/fastx_toolkit/).
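The quality-filtering thresholds quoted above (minimum quality 20, minimum length 35) can be illustrated with a short Python sketch. The text does not state which FASTX-toolkit program or options were used, so the logic here is an assumption: low-quality bases are trimmed from the 3′ end, reads that then fall below the length cutoff are discarded, and qualities are taken to be Phred+33 encoded. The function names are hypothetical.

```python
def phred33(quality_string):
    """Decode a Phred+33 FASTQ quality string into integer scores."""
    return [ord(char) - 33 for char in quality_string]


def trim_and_filter(seq, quals, min_quality=20, min_length=35):
    """Trim 3' bases below min_quality; drop reads shorter than min_length.

    Returns (trimmed_seq, trimmed_quals), or None if the read is discarded.
    This mirrors the stated thresholds only; it is not the FASTX-toolkit
    implementation.
    """
    end = len(seq)
    # Walk back from the 3' end while bases are below the quality cutoff.
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    if end < min_length:
        return None
    return seq[:end], quals[:end]
```

With 50-bp reads, as in the HiSeq data described above, a 35-bp minimum means a read survives only if no more than 15 bases are trimmed from its 3′ end under this scheme.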
The quality-filtered forward and reverse reads from the five metagenomic data sets from the long-term Trichodesmium CO2 experiment (Hutchins et al., 2015) were concatenated, interleaved, and co-assembled with IDBA-UD v.1.1.1 (Peng et al., 2012) (Supplementary Table S3 contains co-assembly statistics). Coverage profiles were generated for each of the five original metagenomes by mapping them to the co-assembly with Bowtie2 (Langmead and Salzberg, 2012). Metagenomic binning (clustering contigs into representative genomes) was carried out using the analysis and visualization platform Anvi’o (Eren et al., 2015). Anvi’o utilizes CONCOCT (Alneberg et al., 2014), based on coverage and tetranucleotide composition, to perform an initial unsupervised clustering of contigs, and then allows for human-guided curation. Available environmental Trichodesmium metagenomic (Walworth et al., 2015) and metatranscriptomic (Hewson et al., 2009) data sets were subsequently mapped to these bins to investigate their environmental significance.

Refinement of Alteromonas macleodii bin

To improve our representative genome of A. macleodii, we mapped all five of the metagenomes used in the co-assembly to our A. macleodii bin recovered from Anvi’o. Only reads that successfully mapped were then assembled with the SPAdes genome assembler v.3.8.1 (Bankevich et al., 2012). Comparing these two bins with QUAST (v.4.1) (Gurevich et al., 2013) revealed this process yielded a higher-quality assembly (Supplementary Table S4), and this refined A. macleodii bin was utilized in subsequent analyses.

Identification of genes/gene clusters within A. macleodii bin

The genetic potential for siderophore and acyl-homoserine lactone production was identified by way of the web-based tool AntiSmash (Weber et al., 2015); this uses well-curated Hidden Markov Models to identify gene clusters involved in secondary metabolite biosynthesis and transport.
All other genes were identified through BLAST (Altschul et al., 1990) or Hidden Markov Model searches (HMMER v.3.1b2) of profiles built from reference sequences as presented in Supplementary Table S5.

Isolation of heterotrophs from Trichodesmium

Epibionts were isolated from Trichodesmium cultures using a soft agar RMP medium with the following modifications: 5 g of agarose was added to 250 ml MilliQ water before autoclaving, and methanol was added aseptically to a final concentration of 0.3% after autoclaving, once the medium had cooled to 50 °C and before plates were poured. Trichodesmium cultivars of A. macleodii, A020, A021 and A029, were isolated from T. erythraeum K-11#131 by inoculating colonies onto the solid RMP-methanol medium using a sterile loop and incubating at 27 °C under a 14:10 light:dark regime (100 μM m−2 s−1). Trichodesmium filaments were motile in the soft agar medium, leaving trails of epibionts after 7–10 days. Sections of enriched trails were excised using a sterile scalpel and inoculated onto fresh RMP-methanol agar. Individual colonies were isolated and grown in liquid RMP-methanol medium containing 0.1% tryptone and cryopreserved in 10% dimethylsulfoxide at −80 °C.

Siderophore production assay

Chrome azurol S plates were used to screen for siderophore production as described previously (Schwyn and Neilands, 1987).

DMSP analysis

The production of dimethylsulfoniopropionate (DMSP) by T. erythraeum IMS101 was measured in replete, phosphorus (P)-limited, and iron (Fe)-limited conditions. Enrichments were grown in 1 l of YBC-II media (Chen et al., 1996) in 2 l polycarbonate baffled flasks at 26 °C under a 14:10 light:dark cycle (150 μM m−2 s−1) with fluorescent light. Enrichments were continuously shaken to avoid cell sedimentation. Before analysis of DMSP production, nutrient-limited cultures were semi-continuously acclimated to nutrient conditions (P-limited was 25× lower than replete and Fe-limited had no Fe added). Limitation was confirmed via growth rates.
DMSP was sampled in all cultures on day 10. DMSP was measured as dimethylsulfide (DMS) on a custom Shimadzu GC-2014 equipped with a flame photometric detector and a Chromosil 330 packed column. Briefly, DMSP was cleaved to DMS via alkaline hydrolysis using 5 M NaOH and pre-concentrated using a liquid-N purge-and-trap method following a modified protocol (Kiene and Service, 1991). Samples for chlorophyll-a were filtered on GF/F filters, extracted in 90% acetone and measured on a Turner Trilogy (San Jose, CA, USA) fluorometer. See supplemental text for further discussion.

Accession information

Accession numbers for previously deposited data sets are presented in Supplementary Table S1. The tag data generated herein have been uploaded to NCBI’s Sequence Read Archive under accession number SRP078329. The five metagenomes from the Trichodesmium long-term CO2 manipulation experiment (Hutchins et al., 2015) have been deposited in NCBI’s Sequence Read Archive under SRP078343. The draft genome of A. macleodii has been deposited in NCBI’s Whole Genome Shotgun database under accession MBSN00000000. Clone library sequences of isolates from Trichodesmium are available through NCBI’s GenBank, accession numbers KX519544–KX519550.

Results and Discussion

The microbial composition of Trichodesmium consortia

Trichodesmium-associated communities were initially investigated via 16S rRNA gene sequencing performed on 11 samples spanning two species and multiple strains. These included several distinct laboratory-maintained cultures and an environmental sample of handpicked and washed individual Trichodesmium colonies (Webb et al., 2007) (Supplementary Table S1).
Consistent with prior studies (Hewson et al., 2009; Hmelo et al., 2012), at a broad taxonomic level this revealed a predominance of sequences sourced from Bacteroidetes, Gammaproteobacteria, and Alphaproteobacteria (Supplementary Data Set S1; Supplementary Table S6). By clustering sequences with single-nucleotide resolution (Eren et al., 2014) into groups, herein referred to as ‘oligotypes’, we identified several identical sequences (100% similarity over the ~370-bp region sequenced; highlighted in Supplementary Table S6) as present in all Trichodesmium samples (n=11), yet absent from ‘non-Trichodesmium’ samples (n=6; consisting of particle-size fraction environmental data sets previously generated utilizing the same primers; Parada et al., 2015). Subsequently aligning these conserved oligotypes (those present in all of our samples) to sequences previously recovered from various other environmental Trichodesmium samples revealed highly similar sequences detected in all sources, and clusters of identical ones within the Gammaproteobacteria (Figure 1). These identical sequences, originating from A. macleodii, were acquired from samples collected independently by different researchers at different times and locations, including an environmental metagenome, and were generated by way of three different sequencing technologies (Figure 1 legend). In light of these results, and genome-level evidence of co-occurrence in natural environmental samples presented in the following section, herein we focus on the A. macleodii association with Trichodesmium.

Figure 1 Maximum likelihood phylogenetic tree of 16S rRNA genes depicting only closely related sequences recovered from Trichodesmium-associated communities of various samples.
Colored leaf labels represent the sequence source: blue=tag sequences from this study, included are only those identical sequences that were found to be present in all 11 Trichodesmium samples analyzed*; orange=Sanger sequences from organisms isolated directly from individual Trichodesmium colonies; green=Sanger sequences from a Trichodesmium-associated community clone library study (Hmelo et al., 2012; while this study did not detect A. macleodii, in silico analysis of the primers utilized revealed they would not have amplified these sequences); and red=16S rRNA sequences mined from a metagenomic assembly of an environmental Trichodesmium sample (Walworth et al., 2015). *‘Oligotype 1923’ was present in 10/11 of the samples analyzed.

In this vein, we leveraged a recently published study investigating environmental Trichodesmium epibionts in both the Atlantic and Pacific Ocean using tag sequencing from handpicked and washed colonies (Rouco et al., 2016). Processing that data set with single-nucleotide resolution clustering (Eren et al., 2014) reveals our A. macleodii oligotypes are present in 25 of the 27 samples surveyed (100% identical across the shared ~280 bp between the data sets), comprising <1–5% of reads recovered, including Trichodesmium; these data further demonstrate the broadly conserved presence of these A. macleodii sequences in situ. The two samples lacking A. macleodii were among 10 from the same sampling site in the North Pacific, whereas the other 8 from that site contained them. It is therefore possible they were simply below detection and/or underrepresented during the initial PCR. Furthermore, Alteromonas has recently been observed to disproportionally contribute to global transcripts, with gene expression being ~10-fold higher than genetic abundance (Dupont et al., 2015); that is, low relative abundance does not necessarily suggest low activity.
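The selection of conserved oligotypes described above (present in every Trichodesmium sample, absent from every ‘non-Trichodesmium’ sample) amounts to a presence/absence filter over a count table. A minimal Python sketch follows; the table layout, function names and detection threshold are hypothetical and are not taken from the study’s data or from the MED software.

```python
def is_detected(count, min_reads=1):
    """Treat an oligotype as present when it has at least min_reads reads."""
    return count >= min_reads


def prevalence(per_sample_counts, samples):
    """Number of the given samples in which the oligotype is detected."""
    return sum(1 for s in samples if is_detected(per_sample_counts.get(s, 0)))


def conserved_oligotypes(counts, target_samples, control_samples):
    """Oligotypes found in all target samples and in no control sample.

    counts maps oligotype name -> {sample name -> read count}; samples
    missing from an oligotype's inner dict are treated as zero reads.
    """
    conserved = []
    for oligotype, per_sample in counts.items():
        in_all_targets = prevalence(per_sample, target_samples) == len(target_samples)
        in_no_controls = prevalence(per_sample, control_samples) == 0
        if in_all_targets and in_no_controls:
            conserved.append(oligotype)
    return sorted(conserved)
```

The same `prevalence` helper expresses statements such as "present in 25 of the 27 samples surveyed"; because detection depends on sequencing depth, an absence under this filter cannot distinguish a truly missing organism from one below the detection limit, as the text notes.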
It is worth noting that all of the 16 ‘finished’ genomes of Alteromonas available through integrated microbial genomes (IMG) (Markowitz et al., 2009) possess five 16S rRNA gene copies, which are often not identical (although typically >99% similar). It is thus possible that the distinct sequences presented in Figure 1 within the A. macleodii clade (which vary by only 1–2 bases over the ~370 bp sequenced) may actually be sourced from a single genome. Regardless, this marker-gene analysis revealed the consistent presence of identical sequences within associated consortia of many distinct laboratory-maintained and environmental Trichodesmium samples, suggesting a conserved association with A. macleodii.

Beyond a marker-gene: genome-level evidence of the co-occurrence of Trichodesmium and A. macleodii in situ

The striking ubiquity of these identical sequences across both environmental and laboratory-maintained samples suggested that Trichodesmium epibionts occurring in culture may be environmentally relevant, as opposed to being composed solely of laboratory-derived contaminants. To investigate this beyond a single marker-gene, we leveraged available enrichment-derived metagenomic and metatranscriptomic (Walworth et al., 2015) data sets from the aforementioned long-term CO2 experiment (Hutchins et al., 2015; Supplementary Table S2). A co-assembly of five metagenomic data sets was performed in an effort to better access those organisms in low abundance (as Trichodesmium contributed ~96–99% of total reads), and the resulting contigs were clustered to identify representative genomes (Supplementary Data Set S2 for co-assembly fasta; Supplementary Table S3 for summary). Beyond Trichodesmium, this process enabled the recovery of three near-complete representative genomes (‘bins’) sourced from members of A. macleodii, Lewinella sp., and Synechococcus sp., all estimated at >97% complete and <5% contamination (Supplementary Table S4).
To assess if these bins were truly representative of organisms present within in situ Trichodesmium assemblages, we examined two publicly available Trichodesmium-focused environmental data sets including the above-mentioned Sargasso Sea metagenome (Walworth et al., 2015) and a southwest Pacific Ocean metatranscriptome (Hewson et al., 2009). Mapping these data sets to our bins revealed extensive recruitment across our A. macleodii genome (Figure 2). As mapping is a highly stringent alignment process, this demonstrated the presence of a closely related A. macleodii, at the genomic level, in both of these Trichodesmium environmental samples, yet absent from particle-size fraction ‘non-Trichodesmium’ samples (Figure 2; Supplementary Figure S1 for further discussion). This is particularly remarkable for the Sargasso Sea metagenomic sample as individual Trichodesmium colonies were handpicked and washed several times to remove any organisms not tightly associated, yet this closely related A. macleodii remained. It is worth emphasizing that these metagenomic bins were derived from a laboratory-maintained

Figure 2 Visualization of laboratory enrichment and environmental metagenomic (DNA) and metatranscriptomic (RNA) reads mapped to our assembled representative genomic bins revealing the presence of a closely related A. macleodii in both environmental Trichodesmium samples (labeled ‘Tricho’), but absent from samples lacking Trichodesmium (‘non-Tricho’). A hierarchical clustering of contigs from the co-assembly is shown at the top segregating contigs into representative genomic bins as depicted by the four large colored columns and labeled at the bottom. Thin, vertical peaks display log transformed normalized coverage of reads (across that sample/row) mapped to that contig (column/leaf).
‘High’ and ‘low’ CO2 DNA and RNA samples mapping from the long-term experiment to these newly recovered bins revealed their presence and activity were maintained in both CO2 treatments, even after several hundred generations (Hutchins et al., 2015). SCB, Southern California Bight. See Supplementary Table S1 for detailed sample information.

Trichodesmium erythraeum enrichment, strain IMS101, which was originally isolated ~25 years ago (Prufert-Bebout et al., 1993). Regardless of whether this cohabitation with A. macleodii has persisted over the entire time IMS101 has been in culture, or if the organism was introduced as a ‘contaminant’ at some point in the strain’s long cultivation history (~3500 generations) and subsequently maintained, its co-occurrence in these laboratory enrichments and in the environment in 28/30 available samples (evidenced by marker-gene analysis and at the genome level) intimates a robust stability of this association. It is also important to note that while this relationship may not be exclusive, as A. macleodii is commonly observed in association with other photoautotrophs (Morris et al., 2008; Biller et al., 2015), collectively these data suggest Trichodesmium colonies may consistently harbor A. macleodii, a finding with significant implications regarding our understanding of Trichodesmium’s physiology. As phytoplankton-associated communities are understood to be structured by functional and characteristic properties of host and epibiont (Stevenson and Waterbury, 2006; Guannel et al., 2011; Lachnit et al., 2011; Sison-mangus et al., 2014), this conserved association of Trichodesmium with A. macleodii may in part be maintained by interorganismal interactions.

Genomic signatures of potential interactions

If this co-occurrence of Trichodesmium and A.
macleodii is being maintained because of specific interactions, then, as predicted by the BQH, there should exist corresponding genetic signatures underlying them. In considering the implications of the BQH, Sachs and Hollowell (2012) recently noted: ‘This new paradigm suggests that bacteria may often form interdependent cooperative interactions in communities and moreover that bacterial cooperation should leave a clear genomic signature via complementary loss of shared diffusible functions’. Owing to its tight association with its epibiotic community, Trichodesmium is a prime candidate for evolutionary mechanisms such as those described by the BQH. Moreover, Trichodesmium has long been considered enigmatic because of its ability to carry out N2 fixation (with the O2-sensitive enzyme nitrogenase) while contemporaneously performing O2-evolving photosynthesis with no immediately apparent mechanism for keeping these two processes segregated (Carpenter and Price, 1976; Paerl et al., 1989; Zehr, 1995). It has been suggested that this may in part be because of interactions with its associated community. For example, host exudation of organic C supports associated heterotrophic growth and fuels respiration, which in turn can generate microenvironments of low O2 concentrations thereby reducing oxic inhibition of host N2 fixation (Paerl and Kellar, 1978; Paerl and Bebout, 1988; Paerl et al., 1989; Fay, 1992; Zehr, 1995; Paerl and Pinckney, 1996). This cascade of events benefits the consortium as a whole, and the development of such complex interorganismal networks has been argued to be a natural consequence of the BQH: while the underlying mechanism is certainly not driven by any such mutualisms, they will tend to emergently arise (Sachs and Hollowell, 2012; Morris, 2015).

As a first step toward assessing if any such mutualistic interactions may exist between Trichodesmium and A.
macleodii, we applied an interorganismal comparative genomics approach in order to identify any potential interactions through their corresponding genomic footprints. By identifying these gene loss/retention patterns of shared diffusible functions, we provide here genetic-based evidence for potential interactions related to acquisition of Fe and P, vitamin B12 exchange, small C compound cycling, and reactive oxygen species (ROS) detoxification. Such genetic complementation in and of itself cannot provide firm evidence of mutualistic interdependencies, but does serve to generate hypotheses about possible host/epibiont interactive feedbacks and identifies promising avenues for future definitive studies of putative BQH relationships.

Fe acquisition

Fe is an essential micronutrient required for photosynthesis, respiration, and N2 fixation that is often limiting in marine environments (Vraspir and Butler, 2009; Chappell et al., 2012). One mechanism by which marine microbes overcome Fe limitation is through the production of extracellular, high-affinity Fe-binding molecules known as siderophores that scavenge Fe3+ from the surrounding environment (Amin et al., 2009). Although many siderophore/Fe complexes are highly stable and organismal specific (requiring selective outer membrane receptors and transporters for uptake), there is a subset produced in the marine environment that exhibit low organismal specificity and are highly photoreactive when bound to Fe3+ (Amin et al., 2009; Vraspir and Butler, 2009).
These complexes become oxidized through ultraviolet photocatalysis whereby chelated Fe3+ is reduced and released as Fe2+, providing a substantial source of bioavailable Fe in the photic zone (Barbeau et al., 2001). The seemingly surprising observation that many organisms produce siderophores that so readily serve as ‘public goods’ has led to the suggestion that they may have been evolutionarily selected for as a result of bacterial–phytoplankton associations (Amin et al., 2009). Indeed, it has been shown that this process increases the Fe uptake of phytoplankton (Barbeau et al., 2001), and Synechococcus Fe limitation response transcripts have been shown to significantly decrease when co-cultured with a siderophore-producing strain of Shewanella (Beliaev et al., 2014).

Unlike several other cyanobacteria, Trichodesmium has not been shown to produce siderophores, and lacks any known genetic potential to do so (Chappell and Webb, 2010; Kranzler et al., 2011). However, in addition to mechanisms of ferric and ferrous inorganic Fe acquisition (Roe and Barbeau, 2014), Trichodesmium does appear to have the ability to take up Fe-siderophore complexes via several Ton-B components, tonB-exbB-exbD (Tery_1418, 4448 and 4449; Chappell and Webb, 2010; Kranzler et al., 2011). Consistent with BQH predictions of compensatory gene loss/retention within mutualistic relationships, our A. macleodii genomic bin possesses a gene cluster predicted to be responsible for the biosynthesis of aerobactin (Supplementary Table S5), a well-studied, low-affinity, highly photoreactive siderophore (Kupper et al., 2006).
Transcripts for these genes were detected in the aforementioned long-term metatranscriptomes (Supplementary Table S5), and to further validate this genetic potential, we experimentally confirmed the production of siderophores by our A. macleodii isolates (A020, A021, A029 in Figure 1) via the chrome azurol S plate assay (Supplementary Figure S2). Furthermore, the exbB-exbD transporter components of the potential Trichodesmium uptake complex share homology with those found in the freshwater cyanobacterium Synechocystis sp. PCC6803 (slr0677 and slr0678, at 67% similarity/85% positives and 54/71%, respectively; see Supplementary Figure S3 for alignments). In Synechocystis PCC6803, these genes have been shown to be essential for the reduction of Fe3+ bound to aerobactin just before the subsequent uptake of Fe2+ into the cell (Kranzler et al., 2011).

Although this evidence suggests that Trichodesmium may have the ability to obtain Fe bound to the aerobactin produced by A. macleodii, this hypothesis requires experimental investigation. Regardless, however, the photoreactivity of these siderophores and their low stability when ligated to Fe3+ clearly result in more available Fe as a public good for any nearby organisms (Amin et al., 2009).

Vitamin B12 exchange

B12 exchange has become a key example of interdomain dependence within photoautotroph-heterotroph associations (Croft et al., 2005). Most of this research has focused on eukaryotes (which cannot synthesize B12 de novo; Bertrand and Allen, 2012), whereby biosynthesis of B12 by associated heterotrophs is followed by uptake by their algal host (Croft et al., 2005; Bertrand et al., 2015). In contrast to this commonly seen scenario, in our prokaryote/prokaryote association Trichodesmium can produce B12 de novo (Sañudo-Wilhelmy et al., 2014), whereas A. macleodii cannot.
Despite lacking this capability, there is evidence our A. macleodii may be facultative with regard to the vitamin as it contains at least two B12-dependent enzymes: an epoxyqueuosine reductase involved in tRNA modification; and a methionine synthetase, in addition to also containing a B12-independent version (Supplementary Table S5). It has been shown that bacteria capable of both methods of methionine production are often facultative, with the B12-dependent pathway being more efficient (Augustus and Spicer, 2011). As some marine cyanobacteria have been shown to exude B12 (Bonnet et al., 2010), and our A. macleodii bin also possesses genes identified as outer membrane receptors and active transporters of the vitamin (all found to be expressed, Supplementary Table S5), it is possible this epibiont may benefit from the exudation of B12 by Trichodesmium. This hypothesized direction of B12 exchange (from photoautotrophic host to associated epibionts) is opposite to that commonly seen within eukaryotic algal-heterotroph associations (Croft et al., 2005; Bertrand et al., 2015). A general trend is therefore possible wherein heterotrophs that cannot synthesize B12 may tend to more often associate with prokaryotic photoautotrophs rather than their eukaryotic counterparts. To the best of our knowledge, such a relationship has yet to be investigated.

P acquisition

As an N fixer, Trichodesmium is believed to commonly be P limited, and has been shown to partially fulfill its P quota through the expression of alkaline phosphatases (Webb et al., 2007), enzymes that cleave phosphate groups from organic P. In addition, it has been suggested that there may be selective pressure within Trichodesmium colonies for organisms possessing greater capabilities of P uptake (Hewson et al., 2009).
The expression of alkaline phosphatases by members of Trichodesmium consortia has been previously observed, and these organisms are expected to ultimately be aiding their host with P acquisition (Hynes et al., 2009). Our A. macleodii does indeed possess the genetic potential to produce alkaline phosphatases, but moreover, it also possesses a secondary metabolite biosynthesis cluster predicted to produce acyl-homoserine lactones (AHLs; Supplementary Table S5). AHLs are a class of molecules known to be involved in quorum sensing (Case et al., 2008), and have specifically been shown to double the activity of alkaline phosphatases upon addition to natural Trichodesmium colonies (Van Mooy et al., 2012). This genetic evidence suggests A. macleodii may be one of the organisms within Trichodesmium consortia helping to orchestrate the colonial acquisition of P, a major limiting nutrient to the host in situ. As AHLs can modulate many population-level cellular responses (Amin et al., 2012), it is likely these molecules are also coordinating much more than solely alkaline phosphatase (APase) activity within these consortia.

Consortial C catabolism

For many phytoplankton, photosynthesis is believed to be C-limited in situ (Hein and Sand-Jensen, 1997). Consequently, they use C-concentrating mechanisms that, although energetically costly, are capable of generating intracellular CO2 concentrations up to 1000-fold higher than external levels (Burnap et al., 2015). It has been argued that the Trichodesmium colony as a whole would benefit from a tight nutrient coupling system wherein host-exuded organics may be catabolized relatively rapidly by a metabolically diverse consortium, thereby in part relieving C-limitation in addition to reducing local O2 concentrations/inhibition of N2 fixation (Paerl et al., 1989; O’Neil and Roman, 1992; Paerl and Pinckney, 1996). Here we present two examples where A.
macleodii possesses, and was found to be expressing, the genetic machinery to degrade specific C compounds produced by Trichodesmium.

Methanol is an important small C compound in the global ocean primarily because of its role in atmospheric ozone production (Heikes, 2002). Recently, a wide phylogenetic array of phytoplankton has been shown to produce methanol in significant amounts as a by-product (up to μM levels for Trichodesmium; Mincer and Aicher, 2016). Correspondingly, there is genetic evidence that A. macleodii may be involved in methanol catabolism as our representative genome possesses two distinct genes identified as pyrroloquinoline quinone-dependent alcohol dehydrogenases (enzymes that catalyze methanol/ethanol oxidation, and are often indicative of facultative methylotrophy; Chistoserdova, 2011), as well as the pyrroloquinoline quinone biosynthesis pathway (all found to be actively transcribed, Supplementary Table S5). In keeping with these genomic observations, the three closely related A. macleodii isolates presented in Figure 1 were isolated in media containing methanol as the sole C source. Given A. macleodii’s consistent association with Trichodesmium, and its ability to catabolize methanol, it is possible this is one mechanism by which it is aiding its host through rapid C cycling and O2 drawdown.

DMSP is an organosulfur compound produced and exuded by many phytoplankton and is a major intercellular metabolite (Stefels et al., 2007; Durham et al., 2015) that has been hypothesized to serve as an inter-trophic level signaling molecule. It is known to be both a strong chemoattractant (Seymour, 2010) and to induce the upregulation of AHLs (Johnson et al., 2016), although the full scope of its cellular function is not yet understood (Yoch, 2002). This compound is rapidly cycled by heterotrophic bacterioplankton, supplying up to 13% of bacterial C and up to 100% of bacterial sulfur demand (Kiene and Linn, 2000).
Heterotrophs degrade DMSP via two enzymatically mediated pathways: a cleavage pathway (dddP) that provides a labile C source and produces DMS, an environmentally relevant trace gas (Simo, 2001), as a by-product; and a demethylation pathway (dmdA) that provides both C and reduced sulfur, forgoing DMS production (Curson et al., 2011). To date, Trichodesmium is the only open-ocean cyanobacterium that has been shown to produce significant intracellular concentrations of DMSP (Bucciarelli et al., 2013), a function that we confirmed in both the long-term Trichodesmium CO2 manipulation experiment and in T. erythraeum IMS101 grown under replete conditions and under P and Fe limitation (Supplementary Figure S4). Our A. macleodii bin has the genetic potential to both demethylate DMSP (beginning with dmdA; Supplementary Table S5) and cleave DMSP (dddP; Supplementary Table S5), with both found to be actively transcribed. It has also previously been demonstrated that some strains of A. macleodii (with 16S sequences 100% identical to those from the current study) can grow with DMSP as a sole C source (Raina et al., 2009). Although further investigation is needed to follow up on this genetic potential between these co-occurring organisms, this is the first indication that DMSP may be actively cycled between a cyanobacterial host and its associated community. This suggests that in addition to associated community members potentially aiding their host through rapid C cycling within the colony, the use of DMSP as a signaling molecule for heterotrophs to alter their metabolic function toward a ‘cooperative lifestyle’ (Johnson et al., 2016) may also be occurring in our system.

Detoxification of ROS

When first presenting the BQH, Morris et al. postulated a mutualistic relationship between the picocyanobacterium Prochlorococcus and ‘helper’ bacteria.
This mutualism was based on the seemingly paradoxical lack of a catalase-peroxidase gene (katG), or any heme-based catalases or peroxidases, to aid the O2-evolving cyanobacterium in detoxifying ROS (Morris et al., 2008, 2012). IMS101, the only finished Trichodesmium genome, is annotated in IMG (Markowitz et al., 2009) as possessing a catalase peroxidase. However, upon closer inspection this gene was identified as a pseudogene that has undergone substantial gene decay (Fawal et al., 2013). As annotated by IMG, it remains as only three gene fragments (Tery_1759, 1760 and 1761) interspersed with start and stop codons, and totals <700 bp relative to the ~2100-bp functional gene. Furthermore, katG genes have been frequently transferred between organisms (Bernroitner et al., 2009), and it appears as though Trichodesmium’s former katG was horizontally acquired before its pseudogenization, as phylogenetic analysis of the PeroxiBase pseudogene sequence places it deep within the Bacteroidetes clade, distinct from other Cyanobacteria (Supplementary Figure S5). In accordance with this apparent history, a survey found that horizontally transferred genes were twice as likely as those vertically transmitted to become pseudogenized (Liu et al., 2004).

Consistent with the BQH ‘helper’ role, our A. macleodii possesses a katG found to be transcriptionally active (Supplementary Table S5), and the species is known to produce catalase peroxidase (Morris et al., 2011). In fact, a strain of A. macleodii was the first ‘helper’ bacterium described to assist the growth of Prochlorococcus (Morris et al., 2008) as a result of this activity. This suggests that A. macleodii may be aiding in the detoxification of ROS within Trichodesmium consortia in a similar manner.
ROS toxicity is particularly problematic for photoautotrophs exposed to intense irradiance, and as Trichodesmium frequently forms large surface blooms, ROS detoxification could thus be an important service provided by A. macleodii for its cyanobacterial host.

The Alteromonas genus: functional distribution and association with photoautotrophs

There are currently 16 Alteromonas ‘finished’ genomes available through IMG (Markowitz et al., 2009) (accessed July 2016), many with differences regarding the aforementioned functions identified in our Trichodesmium-derived A. macleodii bin. For instance, only 2 of the 16 also contain a gene cluster for siderophore production, only 3 possess an mdh2 gene (implicated in methylotrophy; Keltjens et al., 2014), and 7 are predicted to produce AHLs (Supplementary Table S7). In contrast, similarities can be seen across the genus in the possession of genes related to DMSP degradation, APase activity, and the apparent facultative usage of vitamin B12 (Supplementary Table S7). Recently, the genome of an A. macleodii isolated from a Prochlorococcus culture has been constructed (Biller et al., 2015). A search of this genome for the above-discussed genetic capabilities showed that this strain contains all of those found within our bin except, interestingly, the potential to produce AHLs (Supplementary Table S7). This may reflect that the density of organisms associated with the unicellular Prochlorococcus is much lower than in the relatively enriched Trichodesmium colony environment, which may render quorum sensing ineffective. Whether there are properties of some A. macleodii that make them more prone to being associated with photoautotrophs, as opposed to particles, warrants deeper investigation.

Conclusion

Here we have presented molecular evidence that Trichodesmium may be consistently associated with A. macleodii in both laboratory-maintained enrichments and in situ (at the marker-gene and genome level).
These data do not, however, allow us to address the exclusivity or promiscuity of this association with regard to A. macleodii. Ultimately any organismal associations are going to be functionally and characteristically driven, rather than taxonomically or even phylogenomically. As such, it is possible similar heterotrophic strains may be commonly found co-occurring with various photoautotrophs (as they tend to exhibit comparable qualities), whereas Trichodesmium spp. may only exist in association with A. macleodii; importantly, this would not lessen the potential import of A. macleodii’s impact on Trichodesmium’s physiology. The unique lifestyle of Trichodesmium and its biogeochemical significance make its colonial infrastructure an ideal and necessary regime to be viewed in the light of co-evolutionary processes such as those laid out by the BQH. In addition to showing that Trichodesmium may consistently harbor A. macleodii, we have further presented genetic and experimental evidence that generates several hypotheses regarding potential interactions between these organisms that may ultimately be contributing to the maintenance of this association (Figure 3). Although direct experimental observation of these putative mutualistic interactions is not a trivial task (as there are no axenic cultures of Trichodesmium), this work opens the door to further investigations of this relationship.

Figure 3 Schematic of the discussed potential interactions between Trichodesmium and A. macleodii. Beige box, Trichodesmium; purple boxes, A. macleodii cells; AA, amino acids; APases, alkaline phosphatases; DMS, dimethylsulfide; KatG, catalase-peroxidase; red ellipses, siderophores. (Colony image courtesy of WHOI).

Photoautotroph–heterotroph associations represent complex systems of interorganismal interactions and A. macleodii is only one of many organisms thriving within Trichodesmium colonies.
While this work provides a foundation and path toward targeted investigations of what may be a keystone organism within these consortia, many interactions likely exist between this host and its other epibionts, as well as solely between the various associated organisms. Ultimately these intricate networks of organismal interconnectivity have direct impacts on the physiology of Trichodesmium, and therefore on fluxes of biologically and climatically relevant elements such as C, N, P, Fe and S in the open ocean. Unraveling these networks is integral not only to our understanding of Trichodesmium’s productivity and evolution, but also to our emerging picture of ocean biogeochemistry.

Conflict of Interest

The authors declare no conflict of interest.

Acknowledgements

We thank Gustavo Ramirez for insightful discussion and assistance in editing the manuscript through various draft stages. Funding was provided by US National Science Foundation grants BO-OCE 1260490 to DAH, EAW and F-XF, a Rose Hills Foundation grant to NML and EM, and CHE-OCE 1131415 to TJM.

References

Alneberg J, Bjarnason BSS, de Bruijn I, Schirmer M, Quick J, Ijaz UZ et al. (2014). Binning metagenomic contigs by coverage and composition. Nat Meth 11: 1144–1146.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. (1990). Basic local alignment search tool. J Mol Biol 215: 403–410.
Amin SA, Green DH, Hart MC, Küpper FC, Sunda WG, Carrano CJ. (2009). Photolysis of iron-siderophore chelates promotes bacterial-algal mutualism. Proc Natl Acad Sci USA 106: 17071–17076.
Amin SA, Parker MSM, Armbrust EV. (2012). Interactions between diatoms and bacteria. Microbio Mol Bio R 76: 667–684.
Augustus AM, Spicer LD. (2011). The MetJ regulon in Gammaproteobacteria determined by comparative genomics methods. BMC Gen 12: 558.
Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS et al. (2012). SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comp Biol 19: 455–477.
Barbeau K, Rue EL, Bruland KW, Butler A. (2001). Photochemical cycling of iron in the surface ocean mediated by microbial iron(III)-binding ligands. Nature 413: 409–413.
Beliaev AS, Romine MF, Serres M, Bernstein HC, Linggi BE, Markillie L et al. (2014). Inference of interactions in cyanobacterial–heterotrophic co-cultures via transcriptome sequencing. ISME J 8: 2243–2255.
Bernroitner M, Zamocky M, Furtmüller PG, Peschek GA, Obinger C. (2009). Occurrence, phylogeny, structure, and function of catalases and peroxidases in cyanobacteria. J Exp Bot 60: 423–440.
Bertrand EM, Allen AE. (2012). Influence of vitamin B auxotrophy on nitrogen metabolism in eukaryotic phytoplankton. Front Microbio 3: 1–16.
Bertrand EM, McCrow JP, Moustafa A, Zheng H, McQuaid JB, Delmont TO et al. (2015). Phytoplankton-bacterial interactions mediate micronutrient colimitation at the coastal Antarctic sea ice edge. Proc Natl Acad Sci USA 112: 9938–9943.
Biller SJ, Coe A, Chisholm SW, Foun- BM. (2015). Draft genome sequence of Alteromonas macleodii strain MIT1002, isolated from an enrichment culture of a marine cyanobacterium. Genome Announc 3: 3–4.
Bonnet S, Webb EA, Panzeca C, Karl DM, Capone DG, Sañudo-Wilhelmy SA. (2010). Vitamin B12 excretion by cultures of the marine cyanobacteria Crocosphaera and Synechococcus. Limnol Oceanogr 55: 1959–1964.
Borstad GA. (1978). Some Aspects of the Occurrence and Biology of Trichodesmium Near Barbados. McGill University: Montreal, Quebec, Canada.
Bucciarelli E, Ridame C, Sunda WG, Dimier-Hugueney C, Cheize M, Belviso S et al. (2013). Increased intracellular concentrations of DMSP and DMSO in iron-limited oceanic phytoplankton Thalassiosira oceanica and Trichodesmium erythraeum. Limnol Oceanogr 58: 1667–1679.
Burnap R, Hagemann M, Kaplan A. (2015). Regulation of CO2 concentrating mechanism in Cyanobacteria. Life 5: 348–371.
Carpenter EJ, Price CC. (1976). Marine oscillatoria (Trichodesmium): explanation for aerobic nitrogen fixation without heterocysts.
Science 191: 1278–1280.
Case RJ, Labbate M, Kjelleberg S. (2008). AHL-driven quorum-sensing circuits: their frequency and function among the Proteobacteria. ISME J 2: 345–349.
Chappell PD, Moffett JW, Hynes AM, Webb EA. (2012). Molecular evidence of iron limitation and availability in the global diazotroph Trichodesmium. ISME J 6: 1728–1739.
Chappell PD, Webb EA. (2010). A molecular assessment of the iron stress response in the two phylogenetic clades of Trichodesmium. Environ Microbiol 12: 13–27.
Chen YB, Zehr JP, Mellon M. (1996). Growth and nitrogen fixation of the diazotrophic filamentous nonheterocystous cyanobacterium Trichodesmium sp. IMS 101 in defined media: evidence for a circadian rhythm. J Phycol 32: 916–923.
Chistoserdova L. (2011). Modularity of methylotrophy, revisited. Environ Microbiol 13: 2603–2622.
Croft MT, Lawrence AD, Raux-Deery E, Warren MJ, Smith AG. (2005). Algae acquire vitamin B12 through a symbiotic relationship with bacteria. Nature 438: 90–93.
Curson ARJ, Todd JD, Sullivan MJ, Johnston AWB. (2011). Catabolism of dimethylsulphoniopropionate: microorganisms, enzymes and genes. Nat Rev Microbiol 9: 849–859.
Dupont CL, McCrow JP, Valas R, Moustafa A, Walworth N, Goodenough U et al. (2015). Genomes and gene expression across light and productivity gradients in eastern subtropical Pacific microbial communities. ISME J 9: 1076–1092.
Durham BP, Sharma S, Luo H, Smith CB, Amin SA, Bender SJ et al. (2015). Cryptic carbon and sulfur cycling between surface ocean plankton. Proc Natl Acad Sci USA 112: 453–457.
Edgar RC. (2004). MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinf 5: 113.
Eren AM, Morrison HG, Lescault PJ, Reveillaud J, Vineis JH, Sogin ML. (2014). Minimum entropy decomposition: unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences. ISME J 9: 968–979.
Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML et al. (2015). Anvi’o: an advanced analysis and visualization platform for ‘omics data. PeerJ 3: e1319.
Eren AM, Maignien L, Sul WJ, Murphy LG, Grim SL, Morrison HG et al. (2013). Oligotyping: differentiating between closely related microbial taxa using 16S rRNA gene data. Methods Ecol Evol 4: 1111–1119.
Fawal N, Li Q, Savelli B, Brette M, Passaia G, Fabre M et al. (2013). PeroxiBase: a database for large-scale evolutionary analysis of peroxidases. Nucleic Acids Res 41: 441–444.
Fay P. (1992). Oxygen relations of nitrogen fixation in cyanobacteria. Microbiol Rev 56: 340–373.
Fisher MM, Wilcox LW, Graham LE. (1998). Molecular characterization of epiphytic bacterial communities on charophycean green algae. Microb Ecol 64: 4384–4389.
Guannel ML, Horner-Devine MC, Rocap G. (2011). Bacterial community composition differs with species and toxigenicity of the diatom Pseudo-nitzschia. Aquat Microb Ecol 64: 117–133.
Gurevich A, Saveliev V, Vyahhi N, Tesler G. (2013). QUAST: quality assessment tool for genome assemblies. BMC Bioinform (Oxford, England) 29: 1072–1075.
Heikes BG. (2002). Atmospheric methanol budget and ocean implication. Global Biogeochem Cycles 16: 1–13.
Hein M, Sand-Jensen K. (1997). CO2 increases oceanic primary production. Nature 388: 526–527.
Hewson I, Poretsky RS, Dyhrman ST, Zielinski B, White AE, Tripp HJ et al. (2009). Microbial community gene expression within colonies of the diazotroph, Trichodesmium, from the Southwest Pacific Ocean. ISME J 3: 1286–1300.
Hmelo LR, Van Mooy BAS, Mincer TJ. (2012). Characterization of bacterial epibionts on the cyanobacterium Trichodesmium. Aquat Microb Ecol 67: 1–14.
Hutchins DA, Walworth NG, Webb EA, Saito MA, Moran D, McIlvin MR et al. (2015). Irreversibly increased nitrogen fixation in Trichodesmium experimentally adapted to elevated carbon dioxide. Nat Comm 6: 8155.
Hynes AM, Chappell PD, Dyhrman ST, Doney SC, Webb EA. (2009).
Cross-basin comparison of phosphorus stress and nitrogen fixation in Trichodesmium. Limnol Oceanogr 54: 1438–1448.
Johnson WM, Soule MCK, Kujawinski EB. (2016). Evidence for quorum sensing and differential metabolite production by a marine bacterium in response to DMSP. ISME J 10: 1–13.
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S et al. (2012). Geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. BMC Bioinform 28: 1647–1649.
Keltjens JT, Pol A, Reimann J, Op Den Camp HJM. (2014). PQQ-dependent methanol dehydrogenases: rare-earth elements make a difference. Appl Microbiol Biotechnol 98: 6163–6183.
Kiene RP, Linn LJ. (2000). Distribution and turnover of dissolved DMSP and its relationship with bacterial production and dimethylsulfide in the Gulf of Mexico. Limnol Oceanogr 45: 849–861.
Kiene RP, Service SK. (1991). Decomposition of dissolved DMSP and DMS in estuarine waters: dependence on temperature and substrate concentration. Mar Ecol Prog Ser 76: 1–11.
Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. (2013). Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl Environ Microbiol 79: 5112–5120.
Kranzler C, Lis H, Shaked Y, Keren N. (2011). The role of reduction in iron uptake processes in a unicellular, planktonic cyanobacterium. Environ Microbiol 13: 2990–2999.
Küpper FC, Carrano CJ, Kuhn JU, Butler A. (2006). Photoreactivity of iron(III)-aerobactin: photoproduct structure and iron(III) coordination. Inorganic Chem 45: 6028–6033.
Lachnit T, Blumel M, Imhoff JF, Wahl M. (2009). Specific epibacterial communities on macroalgae: phylogeny matters more than habitat. Aquat Biol 5: 181–186.
Lachnit T, Meske D, Wahl M, Harder T, Schmitz R. (2011). Epibacterial community patterns on marine macroalgae are host-specific but temporally variable. Environ Microbiol 13: 655–665.
Langmead B, Salzberg SL. (2012). Fast gapped-read alignment with Bowtie 2. Nat Methods 9: 357–359.
Lee MD, Walworth NG, Sylvan JB, Edwards KJ, Orcutt BN. (2015). Microbial communities on seafloor basalts at Dorado Outcrop reflect level of alteration and highlight global lithic clades. Front Microbiol 6: 1–20.
Liu Y, Harrison PM, Kunin V, Gerstein M. (2004). Comprehensive analysis of pseudogenes in prokaryotes: widespread gene decay and failure of putative horizontally transferred genes. Genome Biol 5: R64.
Markowitz VM, Mavromatis K, Ivanova NN, Chen IMA, Chu K, Kyrpides NC. (2009). IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 25: 2271–2278.
McLellan SL, Newton RJ, Vandewalle JL, Shanks OC, Huse SM, Eren AM et al. (2013). Sewage reflects the distribution of human faecal Lachnospiraceae. Environ Microbiol 15: 2213–2227.
Mincer TJ, Aicher AC. (2016). Methanol production by a broad phylogenetic array of marine phytoplankton. PLoS One 11: 1–17.
Morris JJ. (2015). Black Queen evolution: the role of leakiness in structuring microbial communities. Trends Genet 31: 475–482.
Morris JJ, Johnson ZI, Szul MJ, Keller M, Zinser ER. (2011). Dependence of the cyanobacterium Prochlorococcus on hydrogen peroxide scavenging microbes for growth at the ocean’s surface. PLoS One 6: e16805.
Morris JJ, Kirkegaard R, Szul MJ, Johnson ZI, Zinser ER. (2008). Facilitation of robust growth of Prochlorococcus colonies and dilute liquid cultures by 'helper' heterotrophic bacteria. Appl Environ Microbiol 74: 4530–4534.
Morris J, Lenski RE, Zinser ER. (2012). The Black Queen hypothesis: evolution of dependencies through adaptive gene loss. Mol Bio 3: e00036–12.
Mulholland MR, Bernhardt PW, Heil CA, Bronk DA, O’Neil JM. (2006). Nitrogen fixation and release of fixed nitrogen by Trichodesmium spp. in the Gulf of Mexico. Limnol Oceanogr 51: 1762–1776.
Nausch M. (1996). Microbial activities on Trichodesmium colonies.
Mar Ecol Prog Series 141: 173–181.
O’Neil JM, Roman MR. (1992). Grazers and associated organisms of Trichodesmium. Marine Pelagic Cyanobacteria Trichodesmium Other Diazotrophs 362: 61–73.
Paerl H, Bebout B, Prufert L. (1989). Bacterial associations with marine Oscillatoria sp. (Trichodesmium sp.) populations: ecophysiological implications. J Phyc 25: 773–784.
Paerl HW, Bebout BM. (1988). Direct measurement of O2-depleted microzones in marine Oscillatoria: relation to N2-fixation. Science Wash 241: 442–445.
Paerl HW, Kellar PE. (1978). Significance of bacterial-Anabaena associations with respect to N2 fixation in freshwater. J Phyc 14: 254–260.
Paerl HW, Pinckney JL. (1996). A mini-review of microbial consortia: their roles in aquatic production and biogeochemical cycling. Microb Ecol 31: 225–247.
Parada A, Needham DM, Fuhrman JA. (2015). Every base matters: assessing small subunit rRNA primers for marine microbiomes with mock communities, time-series and global field samples. Environ Microbiol 18: 1403–1414.
Peng Y, Leung HCM, Yiu SM, Chin FYL. (2012). IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28: 1420–1428.
Prufert-Bebout L, Paerl HW, Lassen C. (1993). Growth, nitrogen fixation, and spectral attenuation in cultivated Trichodesmium species. Appl Envir Microbiol 59: 1367–1375.
Raina JB, Tapiolas D, Willis BL, Bourne DG. (2009). Coral-associated bacteria and their role in the biogeochemical cycling of sulfur. Appl Envir Microbiol 75: 3492–3501.
Roe KL, Barbeau KA. (2014). Uptake mechanisms for inorganic iron and ferric citrate in Trichodesmium erythraeum IMS101. Metallomics 6: 2042–2051.
Rouco M, Haley ST, Dyhrman ST. (2016). Microbial diversity within the Trichodesmium holobiont. Environ Microbiol 18: 5151–5160.
Sachs JL, Hollowell AC. (2012). The origins of cooperative bacterial communities. Mol Bio 3: e00099–12.
Sañudo-Wilhelmy SA, Gómez-Consarnau L, Suffridge C, Webb EA. (2014).
The role of B vitamins in marine biogeochemistry. Ann Rev Mar Sci 6: 339–367.
Sapp M, Schwaderer AS, Wiltshire KH, Hoppe HG, Gerdts G, Wichels A. (2007). Species-specific bacterial communities in the phycosphere of microalgae? Microbial Ecol 53: 683–699.
Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB et al. (2009). Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. App Environ Microbiol 75: 7537–7541.
Schwyn B, Neilands JB. (1987). Universal chemical assay for the detection and determination of siderophores. Anal Biochem 160: 47–56.
Seymour JR. (2010). Chemoattraction to dimethylsulfoniopropionate throughout the marine microbial food web. Science 329: 342–346.
Simo R. (2001). Production of atmospheric sulfur by oceanic plankton: biogeochemical, ecological and evolutionary links. Trends Ecol Evol 16: 287–294.
Sison-Mangus MP, Jiang S, Tran KN, Kudela RM. (2014). Host-specific adaptation governs the interaction of the marine diatom, Pseudo-nitzschia and their microbiota. ISME J 8: 63–76.
Stefels J, Steinke M, Turner S, Malin G, Belviso S. (2007). Environmental constraints on the production and removal of the climatically active gas dimethylsulphide (DMS) and implications for ecosystem modelling. Biogeochem 83: 245–275.
Stevenson BS, Waterbury JB. (2006). Isolation and identification of an epibiotic bacterium associated with heterocystous Anabaena cells. Biol Bull 210: 73–77.
Stothard P. (2000). The sequence manipulation suite: JavaScript programs for analyzing and formatting protein and DNA sequences. Biotechniques 6: 1102–1104.
Van Mooy BAS, Hmelo LR, Sofen LE, Campagna SR, May AL, Dyhrman ST et al. (2012). Quorum sensing control of phosphorus acquisition in Trichodesmium consortia. ISME J 6: 422–429.
Vraspir JM, Butler A. (2009). Chemistry of marine ligands and siderophores. Ann Rev Mar Sci 1: 43–63.
Walworth N, Pfreundt U, Nelson WC, Mincer T, Heidelberg JF, Fu F et al. (2015). Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle. Proc Natl Acad Sci USA 112: 4251–4256.
Waterbury JB. (2006). The Cyanobacteria - isolation, purification and identification. Prok 4: 1053–1073.
Webb EA, Jakuba RW, Moffett JW, Dyhrman ST. (2007). Molecular assessment of phosphorus and iron physiology in Trichodesmium populations from the western Central and western South Atlantic. Limnol Oceanogr 52: 2221–2232.
Weber T, Blin K, Duddela S, Krug D, Kim HU, Bruccoleri R et al. (2015). AntiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Res 43: W237–W243.
Yoch DC. (2002). Dimethylsulfoniopropionate: its sources, role in the marine food web, and biological degradation to dimethylsulfide. Appl Environ Microbiol 68: 5804–5815.
Zehr JP. (1995). Nitrogen fixation in the sea: why only Trichodesmium? Molecular Ecol Aquat Micr G38: 335–364.

Supplementary Information accompanies this paper on The ISME Journal website (http://www.nature.com/ismej)

Transcriptional Activities of the Microbial Consortium Living with the Marine Nitrogen-Fixing Cyanobacterium Trichodesmium Reveal Potential Roles in Community-Level Nitrogen Cycling

Michael D. Lee,a Eric A. Webb,a Nathan G. Walworth,a Fei-Xue Fu,a Noelle A. Held,b Mak A. Saito,b David A.
Hutchins a

a Marine Environmental Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
b Marine Chemistry and Geochemistry Department, Woods Hole Oceanographic Institute, Woods Hole, Massachusetts, USA

ABSTRACT Trichodesmium is a globally distributed cyanobacterium whose nitrogen-fixing capability fuels primary production in warm oligotrophic oceans. Like many photoautotrophs, Trichodesmium serves as a host to various other microorganisms, yet little is known about how this associated community modulates fluxes of environmentally relevant chemical species into and out of the supraorganismal structure. Here, we utilized metatranscriptomics to examine gene expression activities of microbial communities associated with Trichodesmium erythraeum (strain IMS101) using laboratory-maintained enrichment cultures that have previously been shown to harbor microbial communities similar to those of natural populations.
In enrichments maintained under two distinct CO2 concentrations for 8 years, the community transcriptional profiles were found to be specific to the treatment, demonstrating a restructuring of overall gene expression had occurred. Some of this restructuring involved significant increases in community respiration-related transcripts under elevated CO2, potentially facilitating the corresponding measured increases in host nitrogen fixation rates. Particularly of note, in both treatments, community transcripts involved in the reduction of nitrate, nitrite, and nitrous oxide were detected, suggesting the associated organisms may play a role in colony-level nitrogen cycling. Lastly, a taxon-specific analysis revealed distinct ecological niches of consistently cooccurring major taxa that may enable, or even encourage, the stable cohabitation of a diverse community within Trichodesmium consortia.

IMPORTANCE Trichodesmium is a genus of globally distributed, nitrogen-fixing marine cyanobacteria.
As a source of new nitrogen in otherwise nitrogen-deficient systems, these organisms help fuel carbon fixation carried out by other more abundant photoautotrophs and thereby have significant roles in global nitrogen and carbon cycling. Members of the Trichodesmium genus tend to form large macroscopic colonies that appear to perpetually host an association of diverse interacting microbes distinct from the surrounding seawater, potentially making the entire assemblage a unique miniature ecosystem. Since its first successful cultivation in the early 1990s, there have been questions about the potential interdependencies between Trichodesmium and its associated microbial community and whether the host’s seemingly enigmatic nitrogen fixation schema somehow involved or benefited from its epibionts. Here, we revisit these old questions with new technology and investigate gene expression activities of microbial communities living in association with Trichodesmium.
KEYWORDS Trichodesmium, bacterial consortium, gene expression, high CO2 adapted, metatranscriptome, nitrogen fixation, proteome

Received 14 September 2017
Accepted 15 October 2017
Accepted manuscript posted online 20 October 2017
Citation Lee MD, Webb EA, Walworth NG, Fu F-X, Held NA, Saito MA, Hutchins DA. 2018. Transcriptional activities of the microbial consortium living with the marine nitrogen-fixing cyanobacterium Trichodesmium reveal potential roles in community-level nitrogen cycling. Appl Environ Microbiol 84:e02026-17. https://doi.org/10.1128/AEM.02026-17.
Editor Harold L. Drake, University of Bayreuth
Copyright © 2017 American Society for Microbiology. All Rights Reserved.
Address correspondence to David A. Hutchins, dahutch@usc.edu.
ENVIRONMENTAL MICROBIOLOGY
January 2018 Volume 84 Issue 1 e02026-17, Applied and Environmental Microbiology (aem.asm.org)

Trichodesmium is a globally distributed nitrogen-fixing genus of cyanobacteria common in warm oligotrophic oceans. They provide a substantial proportion of new nitrogen (N) to N-limited systems, thereby helping to fuel primary productivity (1–3).
As a keystone organism in major marine elemental cycles, Trichodesmium has been the focus of studies probing its responses to ongoing rapid changes in the global oceans, such as changes in temperature, pH, and CO2 (e.g., see references 4–9). One common response of T. erythraeum strain IMS101 to elevated CO2 has been increased N fixation rates (8, 10, 11), yet curiously, these do not coincide with increased transcripts or proteins of the nitrogenase complex responsible for N fixation (8, 12). Understanding the controls on one of the greatest sources of new marine N is essential for the modeling of global biogeochemical cycles and the assimilation of CO2 in the present and future oceans.

Trichodesmium often forms large macroscopic colonies comprising tens to hundreds of filaments, each composed of tens to hundreds of cells (1). Like many other aggregating or relatively large primary producers (particularly algae), these colonies act as nutrient-rich substrates that harbor a diverse microbial community (13–17).
While the interactions occurring between Trichodesmium and its associated epibionts have long been recognized as likely being important for the host, epibionts, or both (18–20), the extent to which they modulate host physiology and N fixation remains largely unknown. Moreover, Trichodesmium is difficult to maintain in culture (19, 21), and it has been suggested this may be due to the existence of obligate dependencies of the host on its associated members (16, 19, 22, 23). Attempts to establish stable axenic cultures of Trichodesmium have been unsuccessful, perhaps because such a stable cohabitation of organisms has led to complex, interdependent cooperative interactions (24, 25). One possible example of such an interaction within Trichodesmium consortia involves the host’s critical role of N fixation. The nitrogenase complex is inhibited by molecular oxygen (O2), and Trichodesmium has long been considered enigmatic as it carries out N fixation while contemporaneously performing O2-evolving photosynthesis (1, 19, 22, 26).
Some studies have implicated temporal and spatial segregations (27 and reviewed in reference 28), but another proposed mechanism for this capability involves a cascade of interwoven interactions between Trichodesmium and its associated community. This hypothesis suggests that host exudation of organic carbon supports the growth of associated heterotrophs and fuels respiration, resulting in microenvironments of decreased O2 concentrations. These suboxic microenvironments then serve as havens of host N fixation, ultimately benefiting the colony as a whole (19, 22, 29–32). In the context of solely photosynthetic carbon fixation, aerobic bacteria living in association with algae have indeed been shown to stimulate host growth via O2 consumption, creating conditions more conducive to photosynthesis (33). In Trichodesmium, microelectrode measurements of primarily “raft”-type colonies revealed decreased intracolony O2 levels correlating with increased N fixation under steady light (30), while another recent study working with “puff”-type colonies found decreased O2 concentrations within colony cores only in darkness (11).
Though colony size seems to be one factor responsible for the generation of O2-depleted microenvironments, it remains unknown if colony morphology is also a component, and direct evidence supporting the role of the associated community is lacking.
Characterizations of the associated microbial communities of Trichodesmium organisms and other photoautotrophs have revealed consistently observed groups of major taxa, and notably, communities distinct from those in the surrounding seawater (16, 34–36). At a broad taxonomic level, commonly associated organisms include members of Alphaproteobacteria, Gammaproteobacteria, and Bacteroidetes taxa (14–17, 34, 36–40). While there is evidence supporting host-specific associations at finer taxonomic resolutions within these large clades (23, 41–43), there are also likely underlying general lifestyle and functional characteristics responsible for these broad trends in photoautotroph-heterotroph associations (e.g., host carbon and/or nitrogen fixation, epibiont traits for particle association, copiotrophic/opportunistic lifestyles, etc.) (37, 44). Deciphering the activities occurring within these interconnected communities is essential to understanding the net biogeochemical contributions from any host that exists in perpetual association with other organisms (45).
Here, we present detected transcriptional and proteomic data of associated communities of T. erythraeum strain IMS101 following 8 years of selection under two distinct CO2 concentrations. The enrichment cultures in this study harbor microbial communities similar to those found in association with natural Trichodesmium populations (17), suggesting that they can serve as a window into the complex network of interactions occurring between the cyanobacterium and its epibionts. Accordingly, we examined 3 primary questions. (i) What role might the community play in N cycling? (ii) Could increased community respiration be facilitating the corresponding increased rates of host N fixation under elevated CO2? (iii) Can transcriptional and proteomic profiles help define the distinct ecological niches of the major taxa commonly associated with Trichodesmium?
RESULTS AND DISCUSSION
The cultures utilized in this study were previously split from one IMS101 cell line and maintained under two CO2 concentrations (low [ambient 380 to 400 μatm] and high [800 μatm]) for 8 years (1,200 to 1,700 generations [8]). Cultured strain IMS101 usually grows as individual filaments (macroscopic millimeter-long chains of hundreds of cells) rather than as aggregated colonies. In the open ocean, it has been observed that most (80%) of the total Trichodesmium biomass is present as filaments rather than colonies (46–49). This filamentous lifestyle may sustain a broad baseline distribution of Trichodesmium between the more episodic (but more frequently sampled) bloom events. Both filamentous and colonial morphologies of Trichodesmium host similar dominant microbial communities (17), albeit in differing organismal relative abundance, but little is known about how the ecophysiology of the entire system (host and epibionts) changes with and/or drives these structural differences.
As has been reported in numerous studies (8–12) and confirmed here, IMS101 demonstrates significantly increased N fixation rates under elevated CO2 (Fig. 1).
The data presented here follow 8 years of selection, but the same phenotype has been observed after just weeks (8, 10). Curiously, these increased rates were not accompanied by increases in relevant transcripts or proteins (8, 9), an important reminder that gene expression levels, and even levels of protein products, tell us about their respective levels only and may or may not be reflected in actual physiological and biogeochemical responses. Thus, transcriptomics and proteomics are useful tools for hypothesis generation rather than hypothesis confirmation.
FIG 1 Host (IMS101) demonstrates significantly increased nitrogen fixation rates under elevated CO2 concentrations (n = 3; star, significant at the 0.05 level by t test).
Deep metatranscriptomic sequencing of these enrichment cultures (95 million reads/sample) (see Table S1 in the supplemental material) enabled an investigation of the transcriptional activity of IMS101’s associated microbial community.
As enrichments of strain IMS101, the total RNA pools recovered were dominated by Trichodesmium, which was the source of 75% of total reads (see Table S1 in the supplemental material). However, the depth of sequencing achieved recovered an average of 15 million reads per sample originating solely from the associated community (Table S1). While the transcriptional activities and physiology of IMS101 from these experiments are integrated here in a relevant context, we focus primarily on Trichodesmium’s associated community.
De novo reference library construction and taxonomic composition. A de novo coassembly of multiple metatranscriptomes recovered from these cultures was performed, and the subsequently identified coding sequences (CDSs) were utilized as our reference library for recruiting metatranscriptomic reads from each individual sample and for proteomic analysis. Of 45,000 CDSs derived solely from the IMS101-associated community, taxonomy was assigned to 33,000 (73%) (see Table S2). The vast majority of CDSs were identified as bacterial in origin (93%) (see Fig. S1 and Table S2).
Consistent with (i) the only available environmental Trichodesmium metatranscriptome (36), (ii) a clone-library study of open-ocean Trichodesmium colonies (16), and (iii) analysis of these specific cultures and other laboratory-maintained and environmental samples (17), our de novo reference library was dominated by the major bacterial taxa Bacteroidetes, Cyanobacteria, Alphaproteobacteria, and Gammaproteobacteria (Fig. S1 and S2). As has been reported previously (16, 36), within these broad taxonomic clades were populations distinct from typical planktonic microbial communities, including the conspicuous absence of major taxa such as SAR11 and any Archaea. In our reference library, Bacteroidetes were predominantly composed of members of the Saprospiraceae family, Phaeodactylibacter xiamenensis, and Lewinella cohaerens, whereas Cyanobacteria CDSs were sourced almost entirely from Synechococcus spp., Alphaproteobacteria predominantly included members of the orders Rhodobacterales and Rhizobiales, and Gammaproteobacteria were dominated by the order Alteromonadales.
The relatively few eukaryotic CDSs recovered were predominantly fungi and algae (Viridiplantae), consistent with observations of Trichodesmium in the open ocean (36). Importantly, the associated communities from these specific cultures have been shown to be environmentally relevant (17).
Utilizing this custom-built reference library, we employed two distinct approaches in the characterization of gene expression within these IMS101-associated communities. First, we functionally analyzed the global metatranscriptome around Trichodesmium as a collective unit irrespective of taxonomy (i.e., without consideration of “who” was doing “what”). This enabled an overall assessment of which predominant metabolic activities may be ultimately influencing the host’s microenvironment. We then focused on the transcriptional activity of each major taxonomic group to investigate the potential for unique ecological niches of these consistently cooccurring and stable associations (16, 17, 36). We additionally reanalyzed recently published proteomic data from these same samples using our de novo community reference library (12).
The depth of metatranscriptomic sequencing employed enabled a detailed analysis of the associated community’s transcriptional activities despite the host’s dominant relative abundance (Table S1), but the primary focus of the proteomics work was to characterize solely the host’s proteome (12). As such, the resultant proteomic coverage precluded a comprehensive profiling of the associated community’s global proteome. For this reason, we do not attempt to interpret the community-related proteomics data quantitatively. However, integrating these data sets did enable the detection of many protein products of the discussed transcripts (presumably those protein products in greatest relative abundance). Accordingly, this serves as the foundation for a much-needed custom proteomics database for use with environmental samples of Trichodesmium.
Distinct global community transcriptional profiles according to CO2 concentration. Hierarchical clustering and principal coordinates analysis of sample read recruitment to our CDSs both revealed distinct clusters of global community transcriptional activity corresponding to each treatment (see Fig. S3). Thus, the community exhibited unique transcriptional profiles with respect to CO2 concentration. To probe these community-level transcriptional differences, CDSs were functionally annotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologs (KOs [50]). This resulted in 14,000 CDSs assigned to KOs (30% of the total) (Table S2), which were then collapsed into KEGG modules (KEGG “modules” represent organized functional units of genes) (Table S3). To examine the associated community’s activities in the context of its host, IMS101’s genes were retrieved from Integrated Microbial Genomes (IMG [51]) and processed in the same manner. This perspective revealed an overall trend wherein under low CO2, the associated community more evenly distributed its transcriptional pool across a broad range of modules, while IMS101’s transcriptional allocation was primarily invested in nitrogen metabolism and photosynthesis (Fig. 2).
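The collapse from CDS-level read counts to module-level relative expression described above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the pipeline used in this study: the function name, the identifiers, and the read counts are all hypothetical, and a KO that belongs to several modules is assumed to contribute its reads to each of them.

```python
from collections import defaultdict


def module_relative_expression(cds_counts, cds_to_ko, ko_to_modules):
    """Collapse per-CDS read counts into per-module fractions.

    cds_counts:    dict mapping CDS id -> recruited read count
    cds_to_ko:     dict mapping CDS id -> KO id (unannotated CDSs are absent)
    ko_to_modules: dict mapping KO id -> list of module ids

    Returns a dict mapping module id -> fraction of all module-assigned reads.
    """
    module_counts = defaultdict(float)
    for cds, count in cds_counts.items():
        ko = cds_to_ko.get(cds)
        if ko is None:
            continue  # CDSs without a KO assignment are left out
        for module in ko_to_modules.get(ko, ()):
            module_counts[module] += count  # assumption: count once per module

    total = sum(module_counts.values())
    if total == 0:
        return {}
    return {module: count / total for module, count in module_counts.items()}


# Hypothetical toy data for one sample (not values from this study).
toy_counts = {"cds_1": 30, "cds_2": 10, "cds_3": 60, "cds_4": 5}
toy_cds_to_ko = {"cds_1": "KO_a", "cds_2": "KO_b", "cds_3": "KO_c"}
toy_ko_to_modules = {
    "KO_a": ["module_photosynthesis"],
    "KO_b": ["module_nitrogen_metabolism"],
    "KO_c": ["module_oxidative_phosphorylation"],
}

if __name__ == "__main__":
    fractions = module_relative_expression(
        toy_counts, toy_cds_to_ko, toy_ko_to_modules
    )
    for module, fraction in sorted(fractions.items()):
        print(f"{module}\t{fraction:.3f}")
```

Running the same function separately on the community-derived and host-derived CDSs of each sample would give per-module fractions that can be compared across treatments, which is the kind of view summarized in Fig. 2.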
FIG 2 Relative expression levels for KEGG modules in Trichodesmium and its associated community under replicate low and high CO2 treatments. Numbers in parentheses following module name represent the no. of CDSs identified in the community/no. identified in Trichodesmium.
Contrasting with this, under high CO2, the community as a whole allotted relatively more transcripts to fewer processes, such as photosynthesis, ATP synthesis, and oxidative phosphorylation (potentially suggesting increased growth rates and/or overall community activity), whereas Trichodesmium’s transcript pool shifted to being more evenly distributed across a broad range of modules and less concentrated on photosynthesis and nitrogen metabolism (Fig. 2). This shift in nitrogen metabolism was directly driven by significant decreases in host nitrogenase transcripts under elevated CO2 (nifHDK, P 0.04, 2-tailed t test) (see Table S4), which actually coincides with significant increases in measured nitrogen fixation rates as noted above (Fig. 1).
As these treatments had diverged under their respective CO2 concentrations for 8 years prior to sampling, the global transcriptional clustering patterns (Fig. S3) and the presence of trends even at this coarsely resolved level of KEGG modules (Fig. 2) speak to the functional robustness of the systems.
Community nitrogen cycling. Given Trichodesmium’s significance as a source of fixed N to otherwise nitrogen-starved systems and the dearth of information regarding the role of its perpetually present associated community in this process, we specifically investigated the known primary genes involved in nitrogen cycling in both our host and the associated community. Though other nitrogen-fixing microbes have been seen with Trichodesmium in the open ocean (19, 52), the only detected N fixation genes in our enrichments were sourced from IMS101 (Fig. 3), indicating the measured N fixation rates were solely the result of the host. Interestingly, transcripts for dissimilatory nitrate/nitrite reduction were detected (narG-nirB), as well as the final step of denitrification (nosZ), which catalyzes the conversion of nitrous oxide (N2O) into N2 (Fig. 3); the expression of such genes is expected to be induced only under conditions where oxygen is depleted. Such anaerobic N transformations have recently been demonstrated to occur within the associated communities of another marine cyanobacterial diazotroph, Nodularia spumigena (53), and they have been proposed to occur within Trichodesmium colonies as well (54). Klawonn et al. additionally identified anoxic interior cores of the millimeter-sized colonies extending to 5% of their total size, even when suspended in 100% air-saturated water (53), similar to observed decreases in O2 concentration near the core of Trichodesmium colonies (30). Facultative anaerobes are thought to have an advantage when it comes to particle-associated lifestyles due to such microenvironments (55), and moreover, O2 concentrations undergo rapid changes (within minutes) from supersaturated to subsaturated within Trichodesmium colonies transitioned from light to darkness (30).
FIG 3 Nitrogen cycling transcript expression levels in Trichodesmium (orange and brown bars) and in its associated community considered as a whole (blue bars) under low (ambient) and high (800 μatm) CO2. Stars indicate significance at a 0.05 threshold by 2-tailed t tests. nifHDK, nitrogenase; amtB, ammonium transport; NitT/TauT, nitrate/nitrite/taurine transport; urtA-E, urea transport; nasA-narB, assimilatory nitrate reductase; nirA, assimilatory nitrite reductase; narG, dissimilatory nitrate reductase; nirB, dissimilatory nitrite reductase; nosZ, nitrous oxide reductase; glnA, glutamine synthesis.
As noted above, though, IMS101 tends to grow as filaments rather than colonies in culture. The generation of anaerobic microenvironments conducive to such N transformations may also occur within exopolysaccharide matrices (32), the production of which appears to be an active process in our system (discussed below), but it is unclear if these detected community N-reducing expression levels are simply due to their constitutive expression. Regardless, due to the similarity of the microbial communities between these cultures and those in natural colonies, and given the recent insights into these processes in N. spumigena (53), the potential implications remain the same. It is possible for aggregating cyanobacterial diazotrophs to harbor denitrifying facultative anaerobes, which raises interesting questions regarding N cycling within Trichodesmium colonies.
With respect to CO2 treatment, as a whole, there was a restructuring of the associated community’s global N-related transcriptional profiles (Fig. 3), coinciding with the increased host N fixation rate under elevated CO2 (Fig. 1). Ammonium transport transcripts were significantly increased in the associated community but were less abundant in Trichodesmium under high CO2 (Fig. 3, amtB). Also significantly enriched in the associated community but decreased in the host in the high CO2 treatment were transcripts for glutamine biosynthesis (glnA), one of the primary pathways that incorporate new N into biomass. It is possible these community shifts reflect corresponding increases in growth rates or overall cellular metabolisms in response to the increased availability of fixed N supplies.
Increased community respiration under elevated CO2 concentrations.
As discussed above, associated microbial respiration could benefit the Trichodesmium host by aiding in the generation of microenvironments of lower O2 concentrations, thereby relieving oxic inhibition of N fixation (19, 22, 29–32). In the context of our experiment, this is one possible explanation for the increased rates of host N fixation under elevated CO2 (Fig. 1) despite the decreases in nitrogenase transcripts (Fig. 3) and proteins (8, 9). To investigate this possibility through the lens of community transcriptional expression, we examined the cytochrome c oxidase and ATPase complexes involved in respiration and oxidative phosphorylation.
Trichodesmium’s associated community as a whole was indeed found to be dedicating a significantly increased proportion of its transcripts toward these processes under elevated CO2 (Fig. 4). As these cell lines had diverged for 8 years under these distinct CO2 concentrations, these results also suggest this to be a stable state. It is possible that increased community respiration led to an increase in microenvironments of lower O2 concentrations, thereby lessening the effective O2 inhibition of the nitrogenase enzyme and ultimately facilitating the increased IMS101 N fixation rates consistently observed under elevated CO2 (8, 10, 56).
FIG 4 Expression levels of the cytochrome c and ATPase complexes involved in oxidative phosphorylation. The transcripts were enriched in the community under elevated CO2. Stars indicate significance at a 0.05 threshold by 2-tailed t tests.
To our knowledge, these are the first data specifically supporting a decades-old hypothesis of what may be a fundamental mechanism (the associated community’s influence) involved with Trichodesmium’s seemingly enigmatic N fixation strategy.
One potential correlating trait of higher respiration rates is increased growth rates. While there is no way to directly quantify growth rates of the taxa associated with our IMS101 enrichments via the available metatranscriptomic data, the transcriptional investment in ribosomal proteins (i.e., the proportion of transcripts of the total transcript pool) has been shown to correlate with growth (57) and has been used as a quantitative metric of activity from metatranscriptomic data (58). An analysis of our major associated bacterial clades did reveal a greater proportion of ribosomal transcripts under elevated CO2 (Fig. 5). This finding, along with an increase in glutamine synthesis transcripts (Fig. 3), suggests that increased growth rates may correlate with the enrichment of respiration transcripts under increased CO2 (Fig. 4). To note, eukaryotes were not considered in this ribosomal protein analysis due to the overabundance of confounding ribosomal transcripts originating from chloroplasts and mitochondria (see Fig. S4).
Transcriptional support for distinct ecological niches within Trichodesmium consortia. Members of Bacteroidetes, Cyanobacteria, Alphaproteobacteria, and Gammaproteobacteria taxa commonly cooccur in natural Trichodesmium populations as well as in stable associations in all laboratory enrichments examined thus far (16, 17, 36).
This sustained cohabitation spanning years in laboratory enrichments without added fixed carbon or nitrogen (17) and the fact that there are no axenic cultures of Trichodesmium support the idea that there is a persistent network for nutrient cycling occurring between the host and the community. Given this and the conserved association of these major taxa with Trichodesmium and other photoautotrophs, we examined the transcriptional profiles of these groups to identify potentially distinct ecological niches. While investigating the relative contributions of various phylogenetic groups to the degradation of the multifarious milieus of dissolved organic matter (DOM) in aquatic environments, Cottrell and Kirchman (59) noted that while no individual major taxon can metabolize all forms of DOM, an assemblage comprising representatives from just three groups (Bacteroidetes, Alphaproteobacteria, and Gammaproteobacteria) likely could. This is because even at this relatively broad level of taxonomic resolution, there are still clear distinguishing characteristics of these clades (59), and it is likely these differences are what ultimately underlie the distinct ecological roles that enable and encourage the cooccurrence of these major taxa not only within Trichodesmium consortia but also adhered to particles (60) and algae (37). We contrasted these groups, along with Cyanobacteria, by collapsing individual taxon CDSs into KOs and normalizing to that taxon's total number of functionally annotated transcripts. Hierarchical clustering and ordination both revealed clear groupings by taxa (see Fig. S5), and many of the primary driving forces behind this separation became apparent when CDSs were grouped into KEGG modules (Fig. 6) and upon comparing the most highly expressed genes from each taxon (Table 1; see also Table S6); for clarity, only the ambient CO2 replicates are presented in Fig. 6, but the trends were similar irrespective of CO2 concentration (see Fig. S6). These analyses provide a foundation to begin elucidating some of the distinct roles of these major taxa within the Trichodesmium consortium. While the current work is focused on Trichodesmium, the distinctions described here for these major taxa are potentially relevant to other photoautotroph-heterotroph interactions as well, due to the functional redundancies at the level of resolution being considered.

FIG 5 Taxon allocation of ribosomal protein transcripts relative to their respective transcript pools indicates possible increased growth rates under elevated CO2. Stars indicate significance at a 0.05 threshold by 2-tailed t tests.

Lee et al. Applied and Environmental Microbiology, January 2018, Volume 84, Issue 1, e02026-17, aem.asm.org

Bacteroidetes. Marine Bacteroidetes are mostly known for their particle-associated lifestyles (61), high-molecular-weight-compound degradation capabilities (62), and greater-than-typical abundance of genes related to exopolysaccharide production and adhesion (63). Specifically, they appear to possess 2 to 3 times more glycosyltransferase genes per million base pairs than Alphaproteobacteria and Gammaproteobacteria.
Glycosyltransferases are typically outer membrane enzymes involved in the generation of exopolysaccharides for attachment (63). As in other environments, these traits likely help define their unique role in the Trichodesmium consortium. Our Bacteroidetes clade was almost entirely dominated by members of the Saprospiraceae family, namely, Phaeodactylibacter xiamenensis, which is closely related to the more commonly known Lewinella cohaerens. Aptly, P. xiamenensis was isolated from the diatom Phaeodactylum tricornutum (64, 65), supporting the notion that, due to underlying similar properties, the phycospheres of both algae and cyanobacteria select for similar associated organisms, at least at this level of resolution. In this analysis, Bacteroidetes was the only major taxon that expressed transcripts found in KEGG's aromatics degradation module, highlighting the clade's distinguishing affinity for complex carbon-compound degradation (e.g., K10217), and in the glycan metabolism module, resulting almost entirely from transcripts involved in glycosyltransferase (K12666) (Fig. 6; see also Fig. S6). The enzymes encoded by these transcripts are known to be integral to exopolysaccharide production and are believed to be largely responsible for the taxon's typical particle-associated lifestyle (66, 67).

FIG 6 Relative expression of select KEGG modules for each of the 4 major taxa commonly seen in association with Trichodesmium under ambient CO2. Values are percentages of total transcripts mapped to KO-annotated CDSs. Colored pairings denote statistically different expression levels between two taxa: green, Cyanobacteria (predominantly Synechococcus); purple, Alphaproteobacteria (Rhodobacterales and Rhizobiales); red, Bacteroidetes (Phaeodactylibacter xiamenensis); blue, Gammaproteobacteria (Alteromonadales). See Fig. S6 in the supplemental material for all treatments.
This suggests members of Bacteroidetes may play a key role in Trichodesmium consortia by contributing to the extracellular matrix to the benefit of the rest of the associated community and the host.

TABLE 1 Summary of the 20 most highly expressed genes in each taxonomic group under ambient CO2 (footnote a)

Most highly expressed gene(s), by category (taxon columns: Bacteroidetes, Alphaproteobacteria, Gammaproteobacteria, Cyanobacteria):

Transcription: rpoE, rpoD, rpoB, rpoC, carD, rpoS, cspA, rpoA, rpoH
Translation: tuf, fusA, tuf, fusA
Ribosomes: rpmE, rpsU, rpsP, rplN, rpsL, rplB, rpsU, rmf, rplM
Chaperones: clpC, dnaK, pspA, osmC
RNA methyltransferase: trmD
RNA phosphotransferase: kptA
Glutamine/glutamate biosynthesis: gltB, glnA, glnA
Fatty acid biosynthesis: acpP, acpP
ATP biosynthesis: atpA, atpD
Nucleotide biosynthesis: ndk, nrdG
Heme biosynthesis: hemD
Glyoxylate/dicarboxylate metabolism: aceA, icd, atoB
Oxidative phosphorylation: coxA, nuoC, petB, nuoL
Photosynthesis: cpeB, apcB, cpeA, petD, petF
Photosystem I: psaA, psaB, psaL, psaD
Photosystem II: psbA, psbC, psbB, psbO
Two-component regulation: lytT
Nitrogen regulation: glnK
General transport: exbB, tonB
Amino acid transport: aapJ-bztA
Peptide transport: ABC.PE.S
Urea transport: urtA
Ammonium transport: amtB
Chemotaxis: motB, fliC
Glycosyltransferase: glgA
Sulfur reduction: sat-met3
Xenobiotics degradation: E3.8.1.2, dhaA
Sphingolipid degradation: aslA
Pyruvate dehydrogenase: pdhB
Nitronate monooxygenase: npd
Methionine degradation: ahcY
Peroxidase: ahpC

a Protein products for genes in boldface font were also detected. More detailed tables with TPM-normalized transcript counts and KO identifiers (Table S6) and protein spectral counts (Table S7) are presented in the supplemental material. rpoDESH, RNA polymerase sigma factors; rpoABC, RNA polymerase subunits; carD, transcriptional regulator, cold shock protein; tuf-fusA, elongation factors; rpmE-rpsLPU-rplBMN, ribosomal proteins; rmf, ribosome modulation factor; clpC, ATP-dependent protease; dnaK, RNA degradation; pspA, phage shock protein A; osmC, osmotically inducible protein; trmD, tRNA methyltransferase; kptA, putative RNA phosphotransferase; gltB, glutamate synthase; glnA, glutamine synthetase; acpP, acyl carrier protein; atpAD, F-type H+-transporting ATPase subunits; ndk, nucleoside-diphosphate kinase; nrdG, anaerobic ribonucleoside-triphosphate reductase; hemD, uroporphyrinogen-III synthase; aceA, isocitrate lyase; icd, isocitrate dehydrogenase; atoB, acetyl-coenzyme A C-acetyltransferase; coxA, cytochrome c oxidase subunit I; nuoCL, NADH-quinone oxidoreductase subunits; petB, ubiquinol-cytochrome c reductase; cpeAB, phycoerythrin alpha/beta chains; apcB, allophycocyanin beta subunit; petD, cytochrome b6-f complex subunit; petF, ferredoxin; psaAB, photosystem (PS) I chlorophyll a apoproteins; psaDL, PS I subunits; psbA, PS II reaction center protein; psbBC, PS II chlorophyll apoproteins; psbO, PS II oxygen-evolving enhancer protein; lytT, two-component response regulator; glnK, nitrogen regulatory protein; aapJ-bztA, amino acid transport; ABC.PE.S, peptide/nickel transport; urtA, urea transport; amtB, ammonium transport; motB-fliC, flagellar assembly proteins; sat-met3, sulfate adenylyltransferase; E3.8.1.2, 2-haloacid dehalogenase; dhaA, haloalkane dehalogenase; aslA, arylsulfatase; pdhB, pyruvate dehydrogenase; npd, nitronate monooxygenase; ahcY, adenosylhomocysteinase; ahpC, peroxiredoxin.
This possible benefit again involves the generation and maintenance of low-oxygen microenvironments, as the exopolysaccharide matrix restricts oxygen diffusion (32). Additionally, Bacteroidetes' 5th and 6th most highly expressed genes (also detected in the proteome) were components of a TonB-dependent transporter system (exbB [K03561] and tonB [K03832]) (Table 1; see also Table S6). These are typically involved in the transport of complex compounds such as siderophores, vitamin B12, and carbohydrates (68). While it is unclear what these transporters are acting on in this instance, Bacteroidetes were expending large proportions of their transcriptional energy on these genes (2.5% of their total transcript pool), and their corresponding protein products were detected (Tables S6 and S7).

Alphaproteobacteria. The class Alphaproteobacteria is often the dominant major taxon responsible for the consumption of the amino acid component of marine DOM (59). In our system, Alphaproteobacteria (dominated by members of the orders Rhodobacterales and Rhizobiales) did indeed demonstrate significantly higher relative gene expression for amino acid transport than the other major taxa (Fig. 6); notably, Bacteroidetes, proposed to be more involved in higher-molecular-weight DOM decomposition, had no detectable amino acid transport expression. This enriched Alphaproteobacteria expression was largely driven by L-amino acid ABC transport (K09969), which comprised the taxon's 6th most highly expressed transcripts and for which protein products were detected (Table 1; see also Table S6). Alphaproteobacteria contributed a significantly larger proportion of their transcriptional pool to KEGG's "peptide/nickel transport" module (Fig. 6) as well, due to the expression of genes involved in di- and tripeptide transport (K02035), for which proteins were also detected (Table 1; see also Table S7). Also uniquely highly expressed were transcripts involved in chlorinated cyclic and acyclic hydrocarbon degradation (E3.8.1.2 [K01560]) (Table 1; see also Table S6). These results support the notion that Alphaproteobacteria may hold an environmental niche space within Trichodesmium consortia based on their utilization of amino acids and chlorinated hydrocarbons. The group additionally exhibited relatively greater expression of sulfur metabolism genes (Fig. 6) due primarily to transcripts for sulfur oxidation (K17222 and K17227) and sulfate adenylyltransferase (met3 [K00958]) (Table 1), which utilizes ATP to activate sulfate, yielding adenylyl sulfate. The high relative expression of these was mostly driven by members of the order Rhodobacterales, which have been proposed to be a key group for sulfur cycling within algal bloom events (69); as such, they may play a role in sulfur biogeochemistry within Trichodesmium consortia as well.

Gammaproteobacteria. Gammaproteobacteria CDSs mostly originated from the order Alteromonadales (Table S2). This clade is known for its ability to utilize varied and often cyclic carbon compounds such as N-acetylglucosamine (59, 70). It should be noted that rpoS and pspA (2 of the 20 most highly expressed genes within the Gammaproteobacteria) (Table 1) are both associated with stress responses to nutrient limitation and overall suboptimal conditions (71, 72), and as such it is possible members of this clade were stressed at the time of sampling.
Nonetheless, this group was significantly enriched in KEGG's "other carbohydrate metabolism" module compared to the other 3 major taxa, with this module comprising just over 10% of the taxon's total recovered transcript pool (Fig. 6). This enrichment was driven largely by transcripts encoding enzymes for isocitrate lyase (aceA [K01637]) and malate dehydrogenase (icd [K01638]) (Fig. 6; see also Table S6). These are both involved in the glyoxylate cycle and have been noted as highly enriched from Alteromonadales in marine microcosm experiments upon the addition of naturally occurring DOM (73). Additionally, among the 20 most highly expressed genes for Gammaproteobacteria were those for haloalkane dehalogenase (dhaA) (Table 1; see also Table S6), which falls within KEGG's xenobiotic degradation module. Taken together, this provides a potential unique ecological niche for Gammaproteobacteria with regard to preferred organic substrates compared to the other major taxa, as has been observed beyond the Trichodesmium consortium (59).

Cyanobacteria. The transcriptional profile of Cyanobacteria, primarily dominated by Synechococcus, was least similar among the 4 major taxa (Fig. S5). This is certainly due in large part to the substantial proportion of their transcripts involved in photosynthesis (25% of the taxon's total) (Fig. 6). Likely related to this large transcriptional investment in photosynthesis, the taxon also had a much smaller proportion of ribosomal protein transcripts than those of the other major taxa (2%) (Fig. 5 and 6), similar to observations of Synechococcus marine samples (58). Cyanobacteria have been commonly found within natural Trichodesmium consortia (16, 17, 35, 36, 74), though in the current study of enrichment cultures with a relatively low abundance of associated community members, it is difficult to speculate on a potential unique role. However, the Cyanobacteria did allocate more of their transcriptional pool toward ammonium (amtB) and urea (urtA) transport (Table 1), with protein products also detected for the latter, and they have been noted in the environment to have higher expression of urease transcripts than other cooccurring plankton (58).
It is likely Cyanobacteria in part further support the community as a whole via the provision of additional fixed carbon. This could be particularly beneficial during times of nutrient stress (e.g., with phosphorus or iron limitation), when Trichodesmium's ability to fix nitrogen may be inhibited, as the entire consortium would then not be dependent solely upon the host for both carbon and nitrogen fixation.

Conclusions. The marine microbial loop largely drives surface ocean global biogeochemical cycling and supports most marine food webs. An ongoing shift in research focus has led to a growing interest in understanding microbial assemblages as a whole, as opposed to individual species in isolation (45, 75). Trichodesmium is currently only known to exist in nature and in culture in close association with other microbes. In our efforts to understand the significance of this widespread cyanobacterium for global nitrogen and carbon cycles, it is prudent that we consider the entire Trichodesmium host-epibiont assemblage as a whole. This metatranscriptomic analysis of Trichodesmium's associated community sheds light on the potential for the community's involvement in system-wide nitrogen cycling (Fig. 7), including identifying transcripts involved in denitrification, a process recently reported to be occurring within the consortium of another marine cyanobacterial diazotroph (53). Additionally, we report significant increases in the relative abundance of community respiration-related transcripts corresponding to increases in host nitrogen fixation rates under elevated CO2, a finding in alignment with a decades-old hypothesis of interorganismal interactions that, alongside fine-scale spatial and temporal segregation mechanisms (27), may be integral to Trichodesmium's seemingly enigmatic nitrogen fixation strategy. Finally, given the consistent association of just a few major taxonomic clades within Trichodesmium consortia, as well as with other photoautotrophs, we can begin to delineate the unique ecological niche space they may be occupying by coupling their transcriptional profiles and detected proteins with their known distinguishing characteristics from the literature (Fig. 7).
It is likely not a coincidence that the same few major taxonomic groups that can together degrade the majority of DOM in the marine environment (59) are also the dominant groups consistently found living in association with photoautotrophs such as Trichodesmium. A more comprehensive understanding of the interactions occurring between photoautotrophic hosts and their consistently associated consortia will improve our understanding of environmentally relevant elemental fluxes as well as offer insights into coevolution in the marine environment.

MATERIALS AND METHODS

Culturing conditions, physiology, and sampling. Detailed culturing conditions and sampling methods are presented elsewhere (8, 9) and in supplemental material. Cultures were continuously bubbled with prepared air-CO2 mixtures (Praxair) to maintain stable CO2 concentrations of ambient (380 to 400 μatm) or 800 μatm for 8 years prior to sampling. Trichodesmium growth rates were calculated from microscopic cell counts, and N fixation rates were measured using acetylene reduction as described previously (8). Growth and N fixation rates were measured during the middle of the photoperiod (76), and 200 ml of each sample was filtered (5-μm polycarbonate; Whatman) at the same time and then immediately flash-frozen and stored in liquid nitrogen until RNA extraction.

RNA extraction and sequencing. RNA was extracted from 2 randomly chosen samples of triplicate biological replicates, with steps performed to remove DNA and rRNA. Illumina Hi-Seq sequencing was performed yielding single-end 50-bp reads. Detailed information is presented in the supplemental material.

Proteomics. Protein extraction, tryptic digestion, and global proteome analysis were performed as previously described (8) and specified in the supplemental material. The search database contained translated CDSs from this study’s coassembly (see Table S2 in the supplemental material) and T. erythraeum IMS101’s genes retrieved from IMG (51). We identified 1,988 IMS101 proteins and 357 proteins from the associated community (see Table S7) based on 1,014 identified unique tryptic peptides (see Table S8).

Bioinformatics. Detailed information is presented in the supplemental material. Briefly, quality filtered reads were recruited to IMG’s T. erythraeum IMS101 reference genome, and reads not mapped were considered to be derived from the associated community and processed further (Table S1). Following additional rRNA removal in silico, a coassembly of all samples was performed with Trinity v. 2.4.0 (77), and Prodigal v. 2.6.2 (78) was used to identify CDSs from the resultant assembled transcripts. These CDSs were utilized as our reference library for recruiting the reads from individual samples, and read counts were converted to transcripts per million (TPM) by first normalizing by CDS length and then by the size of the sample library corresponding to either the entire community or to individual taxonomic groups as noted. CDSs that failed to recruit greater than 1 TPM from any sample (as normalized to the entire community) were filtered out, leaving 45,000 CDSs in all downstream analyses (Table S2). Amino acid sequences of CDSs were functionally annotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologs ([KOs] 50) and taxonomies were assigned via a two-step process comprising stand-alone BLASTp v. 2.2.30 (79) run against NCBI’s RefSeq protein database (80) and parsing of the outputs with MEGAN (81) v. 6.7.6 using their default lowest common ancestor (LCA) algorithm. All data visualizations were generated with RStudio v. 1.0.136 (82) and R v. 3.4.1. Individual tests for statistical significance (i.e., those noted in Fig. 3, 4, and 5) were performed with standard two-tailed Student’s t tests using a P value cutoff of 0.05, while tests for significance at the KEGG module level (Fig. 6) were performed with analyses of variance (ANOVAs) followed by Tukey’s honestly significant difference tests (HSDs) with an adjusted P value cutoff of 0.05.

FIG 7 Summary schematic of nitrogen cycling-related transcripts and unique transcriptional activities detected in the major taxa commonly associated with Trichodesmium highlighting potential niche spaces. Gene names are as presented in the Fig. 3 legend; HMW, high molecular weight; EPS, exopolysaccharide; AA, amino acid.

Accession number(s). Raw RNA-seq fastq files are in NCBI’s Gene Expression Omnibus (83), accessible through GEO Series accession number GSE94951.
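As an illustration of the read-count normalization described under Bioinformatics, the following minimal Python sketch converts per-CDS read counts to TPM and applies a greater-than-1-TPM filter. It is not the authors' pipeline: the function names are hypothetical, and it assumes the conventional TPM definition (reads divided by CDS length, then scaled so that each sample's length-normalized values sum to one million). Normalizing to an individual taxonomic group rather than the entire community would, under this assumption, amount to passing only that group's CDSs.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to transcripts per million.

    counts     -- reads recruited to each CDS (hypothetical input)
    lengths_bp -- length of each CDS in base pairs, same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: normalize by CDS length (reads per base).
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(counts)
    # Step 2: normalize by library size so the sample sums to one million.
    return [r / total * 1e6 for r in rates]


def cds_passing_filter(tpm_by_sample, threshold=1.0):
    """Return indices of CDSs exceeding `threshold` TPM in at least one sample.

    tpm_by_sample -- list of per-sample TPM lists, all in the same CDS order
    """
    n_cds = len(tpm_by_sample[0])
    return [
        i for i in range(n_cds)
        if any(sample[i] > threshold for sample in tpm_by_sample)
    ]
```

Calling `counts_to_tpm` once per sample and passing the resulting lists to `cds_passing_filter` yields the indices of CDSs that would be retained for downstream analyses under these assumptions.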
The sample accession numbers corresponding to the low and high CO2 samples from this work are GSM2492342, GSM2492343, GSM2492344, and GSM2492345.

SUPPLEMENTAL MATERIAL

Supplemental material for this article may be found at https://doi.org/10.1128/AEM.02026-17.
SUPPLEMENTAL FILE 1, PDF file, 0.3 MB.
SUPPLEMENTAL FILE 2, XLSX file, 8.3 MB.

ACKNOWLEDGMENTS

M.D.L. thanks Chris Dupont for his advice and guidance. Funding was provided by U.S. National Science Foundation grants BO-OCE 1260490 and OCE 1657757 to D.A.H., E.A.W., and F.-X.F., and NSF-OCE 1657766 and a Gordon and Betty Moore Foundation grant 3782 to M.A.S. Funding sources had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

REFERENCES

1. Capone DG, Zehr JP, Paerl HW, Bergman B, Carpenter EJ. 1997. Trichodesmium, a globally significant marine cyanobacterium. Science 276:1221–1229. https://doi.org/10.1126/science.276.5316.1221.
2. Karl D, Michaels A, Bergman B, Capone D, Carpenter E, Letelier R, Lipschultz F, Paerl H, Sigman D, Stal L. 2002. Dinitrogen fixation in the world’s oceans. Biogeochemistry 57:47–98. https://doi.org/10.1023/A:1015798105851.
3. Sohm JA, Webb EA, Capone DG. 2011. Emerging patterns of marine nitrogen fixation. Nat Rev Microbiol 9:499–508. https://doi.org/10.1038/nrmicro2594.
4. Sañudo-Wilhelmy SA, Kustka AB, Gobler CJ, Hutchins DA, Yang M, Lwiza K, Burns J, Capone DG, Raven JA, Carpenter EJ. 2001. Phosphorus limitation of nitrogen fixation by Trichodesmium in the central Atlantic Ocean. Nature 411:66–69. https://doi.org/10.1038/35075041.
5. Kustka AB, Sañudo-Wilhelmy SA, Carpenter EJ, Capone D, Burns J, Sunda WG. 2003. Iron requirements for dinitrogen- and ammonium-supported growth in cultures of Trichodesmium (IMS 101): comparison with nitrogen fixation rates and iron:carbon ratios of field populations. Limnol Oceanogr 48:1869–1884. https://doi.org/10.4319/lo.2003.48.5.1869.
6. Fu F-X, Zhang Y, Leblanc K, Sañudo-Wilhelmy SA, Hutchins DA. 2005. The biological and biogeochemical consequences of phosphate scavenging onto phytoplankton cell surfaces. Limnol Oceanogr 50:1459–1472. https://doi.org/10.4319/lo.2005.50.5.1459.
7. Webb EA, Jakuba RW, Moffett JW, Dyhrman ST. 2007. Molecular assessment of phosphorus and iron physiology in Trichodesmium populations from the western Central and western South Atlantic. Limnol Oceanogr 52:2221–2232. https://doi.org/10.4319/lo.2007.52.5.2221.
8. Hutchins DA, Walworth NG, Webb EA, Saito MA, Moran D, McIlvin MR, Gale J, Fu F. 2015. Irreversibly increased nitrogen fixation in Trichodesmium experimentally adapted to elevated carbon dioxide. Nat Commun 6:8155. https://doi.org/10.1038/ncomms9155.
9. Walworth NG, Lee MD, Fu F, Hutchins DA, Webb EA. 2016. Molecular and physiological evidence of genetic assimilation to high CO2 in the marine nitrogen fixer Trichodesmium. Proc Natl Acad Sci U S A 113:E7367–E7374. https://doi.org/10.1073/pnas.1605202113.
10. Hutchins DA, Fu F-X, Zhang Y, Warner ME, Feng Y, Portune K, Bernhardt PW, Mulholland MR. 2007. CO2 control of Trichodesmium N2 fixation, photosynthesis, growth rates, and elemental ratios: implications for past, present, and future ocean biogeochemistry. Limnol Oceanogr 52:1293–1304. https://doi.org/10.4319/lo.2007.52.4.1293.
11. Eichner MJ, Klawonn I, Wilson ST, Littmann S, Whitehouse MJ, Church MJ, Kuypers MM, Karl DM, Ploug H. 2017. Chemical microenvironments and single-cell carbon and nitrogen uptake in field-collected colonies of Trichodesmium under different pCO2. ISME J 11:1305–1317. https://doi.org/10.1038/ismej.2017.15.
12. Walworth NG, Fu F-X, Webb EA, Saito MA, Moran D, McIlvin MR, Lee MD, Hutchins DA. 2016. Mechanisms of increased Trichodesmium fitness under iron and phosphorus co-limitation in the present and future ocean. Nat Commun 7:12081. https://doi.org/10.1038/ncomms12081.
13. Nausch M. 1996. Microbial activities on Trichodesmium colonies. Mar Ecol Prog Ser 141:173–181. https://doi.org/10.3354/meps141173.
14. Fisher MM, Wilcox LW, Graham LE. 1998. Molecular characterization of epiphytic bacterial communities on charophycean green algae. Appl Environ Microbiol 64:4384–4389.
15. Sapp M, Schwaderer AS, Wiltshire KH, Hoppe HG, Gerdts G, Wichels A. 2007. Species-specific bacterial communities in the phycosphere of microalgae? Microb Ecol 53:683–699. https://doi.org/10.1007/s00248-006-9162-5.
16. Hmelo LR, Van Mooy BAS, Mincer TJ. 2012. Characterization of bacterial epibionts on the cyanobacterium Trichodesmium. Aquat Microb Ecol 67:1–14. https://doi.org/10.3354/ame01571.
17. Lee MD, Walworth NG, McParland EL, Fu F-X, Mincer TJ, Levine NM, Hutchins DA, Webb EA. 2017. The Trichodesmium consortium: conserved heterotrophic co-occurrence and genomic signatures of potential interactions. ISME J 11:1813–1824. https://doi.org/10.1038/ismej.2017.49.
18. Borstad GA. 1978. Some aspects of the occurrence and biology of Trichodesmium near Barbados. PhD dissertation. McGill University, Montreal, Quebec, Canada.
19. Paerl H, Bebout B, Prufert L. 1989. Bacterial associations with marine Oscillatoria sp. (Trichodesmium sp.) populations: ecophysiological implications. J Phycol 25:773–784. https://doi.org/10.1111/j.0022-3646.1989.00773.x.
20. O’Neil JM, Roman MR. 1992. Grazers and associated organisms of Trichodesmium, p 61–73. In Carpenter EJ, Capone DG, Rueter JG (ed), Marine pelagic Cyanobacteria: Trichodesmium and other diazotrophs, vol 362. Springer, Dordrecht, Netherlands.
21. Waterbury JB. 2006. The Cyanobacteria—isolation, purification and identification, p 1053–1073. In Dworkin M, Falkow S, Rosenberg E, Schleifer K-H, Stackebrandt E. The prokaryotes, 3rd ed. Springer-Verlag, New York, NY.
22. Zehr JP. 1995. Nitrogen fixation in the sea: why only Trichodesmium? p 333–364. In Joint I (ed), Molecular ecology of aquatic microbes, vol 38. Springer, Berlin, Germany.
23. Stevenson BS, Waterbury JB. 2006. Isolation and identification of an epibiotic bacterium associated with heterocystous Anabaena cells. Biol Bull 210:73–77. https://doi.org/10.2307/4134596.
24. Sachs JL, Hollowell AC. 2012. The origins of cooperative bacterial communities. mBio 3:e00099-12. https://doi.org/10.1128/mBio.00099-12.
25. Morris JJ, Lenski RE, Zinser ER. 2012. The black queen hypothesis: evolution of dependencies through adaptive gene loss. mBio 3:e00036-12. https://doi.org/10.1128/mBio.00036-12.
26. Carpenter EJ, Price CC. 1976. Marine Oscillatoria (Trichodesmium): explanation for aerobic nitrogen fixation without heterocysts. Science 191:1278–1280. https://doi.org/10.1126/science.1257749.
27. Berman-Frank I, Lundgren P, Chen YB, Kolber Z, Bergman B, Falkowski P, Küpper H. 2001. Segregation of nitrogen fixation and oxygenic photosynthesis in the marine cyanobacterium Trichodesmium. Science 294:1534–1537. https://doi.org/10.1126/science.1064082.
28. Bergman B, Sandh G, Lin S, Larsson J, Carpenter EJ. 2013. Trichodesmium – a widespread marine cyanobacterium with unusual nitrogen fixation properties. FEMS Microbiol Rev 37:286–302. https://doi.org/10.1111/j.1574-6976.2012.00352.x.
29. Paerl HW, Kellar PE. 1978. Significance of bacterial-Anabaena associations with respect to N2 fixation in freshwater. J Phycol 14:254–260. https://doi.org/10.1111/j.1529-8817.1978.tb00295.x.
30. Paerl HW, Bebout BM. 1988. Direct measurement of O2-depleted microzones in marine Oscillatoria: relation to N2 fixation. Science 241:442–445. https://doi.org/10.1126/science.241.4864.442.
31. Fay P. 1992. Oxygen relations of nitrogen fixation in cyanobacteria. Microbiol Rev 56:340–373.
32. Paerl HW, Pinckney JL. 1996. A mini-review of microbial consortia: their roles in aquatic production and biogeochemical cycling. Microb Ecol 31:225–247. https://doi.org/10.1007/BF00171569.
33. Mouget JL, Dakhama A, Lavoie MC, de la Noue J. 1995. Algal growth enhancement by bacteria: is consumption of photosynthetic oxygen involved? FEMS Microbiol Ecol 18:35–43. https://doi.org/10.1111/j.1574-6941.1995.tb00159.x.
34. Fandino LB, Riemann L, Steward GF, Long RA, Azam F. 2001. Variations in bacterial community structure during a dinoflagellate bloom analyzed by DGGE and 16S rDNA sequencing. Aquat Microb Ecol 23:119–130. https://doi.org/10.3354/ame023119.
35. Sheridan CC, Steinberg DK, Kling GW. 2002. The microbial and metazoan community associated with colonies of Trichodesmium spp.: a quantitative survey. J Plankton Res 24:913–922. https://doi.org/10.1093/plankt/24.9.913.
36. Hewson I, Poretsky RS, Dyhrman ST, Zielinski B, White AE, Tripp HJ, Montoya JP, Zehr JP. 2009. Microbial community gene expression within colonies of the diazotroph, Trichodesmium, from the Southwest Pacific Ocean. ISME J 3:1286–1300. https://doi.org/10.1038/ismej.2009.75.
37. Amin SA, Parker MS, Armbrust EV. 2012. Interactions between diatoms and bacteria. Microbiol Mol Biol Rev 76:667–684. https://doi.org/10.1128/MMBR.00007-12.
38. Bertrand EM, McCrow JP, Moustafa A, Zheng H, McQuaid JB, Delmont TO, Post AF, Sipler RE, Spackeen JL, Xu K, Bronk DA, Hutchins DA, Allen AE. 2015. Phytoplankton-bacterial interactions mediate micronutrient colimitation at the coastal Antarctic sea ice edge. Proc Natl Acad Sci U S A 112:9938–9943. https://doi.org/10.1073/pnas.1501615112.
39. Rouco M, Haley ST, Dyhrman ST. 2016. Microbial diversity within the Trichodesmium holobiont. Environ Microbiol 18:5151–5160. https://doi.org/10.1111/1462-2920.13513.
40. Frischkorn KR, Rouco M, Van Mooy BAS, Dyhrman ST. 2017. Epibionts dominate metabolic functional potential of Trichodesmium colonies from the oligotrophic ocean. ISME J 11:2090–2101. https://doi.org/10.1038/ismej.2017.74.
41. Lachnit T, Blumel M, Imhoff JF, Wahl M. 2009. Specific epibacterial communities on macroalgae: phylogeny matters more than habitat. Aquat Biol 5:181–186. https://doi.org/10.3354/ab00149.
42. Guannel ML, Horner-Devine MC, Rocap G. 2011. Bacterial community composition differs with species and toxigenicity of the diatom Pseudo-nitzschia. Aquat Microb Ecol 64:117–133. https://doi.org/10.3354/ame01513.
43. Sison-Mangus MP, Jiang S, Tran KN, Kudela RM. 2014. Host-specific adaptation governs the interaction of the marine diatom, Pseudo-nitzschia and their microbiota. ISME J 8:63–76. https://doi.org/10.1038/ismej.2013.138.
44. Van Mooy BAS, Hmelo LR, Sofen LE, Campagna SR, May AL, Dyhrman ST, Heithoff A, Webb EA, Momper L, Mincer TJ. 2012. Quorum sensing control of phosphorus acquisition in Trichodesmium consortia. ISME J 6:422–429. https://doi.org/10.1038/ismej.2011.115.
45. Wang H, Hill RT, Zheng T, Hu X, Wang B. 2016. Effects of bacterial communities on biofuel-producing microalgae: stimulation, inhibition and harvesting. Crit Rev Biotechnol 36:341–352. https://doi.org/10.3109/07388551.2014.961402.
46. Marumo R, Asaoka O. 1974. Trichodesmium in the East China Sea. Distribution of Trichodesmium theibautii during 1961-1967. J Mar Sci Technol 30:48–53.
47. Marumo R, Nagasawa S. 1976. Seasonal variation of the standing crop of the pelagic blue-green alga, Trichodesmium in the Kuroshio water. Bull Plankton Soc Japan 23:19–25. (In Japanese)
48. Letelier RM, Karl DM. 1996. Role of Trichodesmium spp. in the productivity of the subtropical North Pacific Ocean. Mar Ecol Prog Ser 133:263–273. https://doi.org/10.3354/meps133263.
49. Orcutt KM, Lipschultz F, Gundersen K, Arimoto R, Michaels AF, Knap AH, Gallon JR. 2001. A seasonal study of the significance of N2 fixation by Trichodesmium spp. at the Bermuda Atlantic time-series study (BATS) site. Deep Res Part 2 Top Stud Oceanogr 48:1583–1608. https://doi.org/10.1016/S0967-0645(00)00157-0.
50. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. 2016. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res 44:D457–D462. https://doi.org/10.1093/nar/gkv1070.
51. Markowitz VM, Mavromatis K, Ivanova NN, Chen IM, Chu K, Kyrpides NC. 2009. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 25:2271–2278. https://doi.org/10.1093/bioinformatics/btp393.
52. Momper LM, Reese BK, Carvalho G, Lee P, Webb EA. 2015. A novel cohabitation between two diazotrophic cyanobacteria in the oligotrophic ocean. ISME J 9:882–893. https://doi.org/10.1038/ismej.2014.186.
53. Klawonn I, Bonaglia S, Brüchert V, Ploug H. 2015. Aerobic and anaerobic nitrogen transformation processes in N2-fixing cyanobacterial aggregates. ISME J 9:1456–1466. https://doi.org/10.1038/ismej.2014.232.
54. Wyman M, Hodgson S, Bird C. 2013. Denitrifying Alphaproteobacteria from the Arabian Sea that express nosZ, the gene encoding nitrous oxide reductase, in oxic and suboxic waters. Appl Environ Microbiol 79:2670–2681. https://doi.org/10.1128/AEM.03705-12.
55. Ganesh S, Parris DJ, DeLong EF, Stewart FJ. 2014. Metagenomic analysis of size-fractionated picoplankton in a marine oxygen minimum zone. ISME J 8:187–211. https://doi.org/10.1038/ismej.2013.144.
56. Barcelos e Ramos J, Biswas H, Schulz KG, LaRoche J, Riebesell U. 2007. Effect of rising atmospheric carbon dioxide on the marine nitrogen fixer Trichodesmium. Global Biogeochem Cycles 21:1–6. https://doi.org/10.1029/2006GB002898.
57. Wei Y. 2001. High-density microarray-mediated gene expression profiling of Escherichia coli. J Bacteriol 183:545–556. https://doi.org/10.1128/JB.183.2.545-556.2001.
58. Gifford SM, Sharma S, Booth M, Moran MA. 2013. Expression patterns reveal niche diversification in a marine microbial assemblage. ISME J 7:281–298. https://doi.org/10.1038/ismej.2012.96.
59. Cottrell MT, Kirchman DL. 2000. Natural assemblages of marine proteobacteria and members of the Cytophaga-Flavobacter cluster consuming low- and high-molecular-weight dissolved organic matter. Appl Environ Microbiol 66:1692–1697. https://doi.org/10.1128/AEM.66.4.1692-1697.2000.
60. Pedrotti ML, Beauvais S, Kerros ME, Iversen K, Peters F. 2009. Bacterial colonization of transparent exopolymeric particles in mesocosms under different turbulence intensities and nutrient conditions. Aquat Microb Ecol 55:301–312. https://doi.org/10.3354/ame01308.
61. DeLong EF, Franks DG, Alldredge AL. 1993. Phylogenetic diversity of aggregate-attached vs. free-living marine bacterial assemblages. Limnol Oceanogr 38:924–934. https://doi.org/10.4319/lo.1993.38.5.0924.
62. Thomas F, Hehemann JH, Rebuffet E, Czjzek M, Michel G. 2011. Environmental and gut Bacteroidetes: the food connection. Front Microbiol 2:93. https://doi.org/10.3389/fmicb.2011.00093.
63. Fernandez-Gomez B, Richter M, Schuler M, Pinhassi J, Acinas SG, Gonzalez JM, Pedros-Alio C. 2013. Ecology of marine Bacteroidetes: a comparative genomics approach. ISME J 7:1026–1037. https://doi.org/10.1038/ismej.2012.169.
64. Chen Z, Lei X, Li Y, Zhang J, Zhang H, Yang L, Zheng W, Xu H, Zheng T. 2014. Whole-genome sequence of marine bacterium Phaeodactylibacter xiamenensis strain KD52, isolated from the phycosphere of microalga Phacodactylum tricornutum. Genome Announc 2:e01289-14. https://doi.org/10.1128/genomeA.01289-14.
65. Chen Z, Lei X, Lai Q, Li Y, Zhang B, Zhang J, Zhang H, Yang L, Zheng W, Tian Y, Yu Z, Xu H, Zheng T. 2014. Phaeodactylibacter xiamenensis gen. nov., sp. nov., a member of the family Saprospiraceae isolated from the marine alga Phaeodactylum tricornutum. Int J Syst Evol Microbiol 64:3496–3502. https://doi.org/10.1099/ijs.0.063909-0.
66. Hall-Stoodley L, Costerton JW, Stoodley P. 2004. Bacterial biofilms: from the natural environment to infectious diseases. Nat Rev Microbiol 2:95–108. https://doi.org/10.1038/nrmicro821.
67. Bauer M, Kube M, Teeling H, Richter M, Lombardot T, Allers E, Würdemann CA, Quast C, Kuhl H, Knaust F, Woebken D, Bischof K, Mussmann M, Choudhuri JV, Meyer F, Reinhardt R, Amann RI, Glöckner FO. 2006. Whole-genome analysis of the marine Bacteroidetes ‘Gramella forsetii’ reveals adaptations to degradation of polymeric organic matter. Environ Microbiol 8:2201–2213. https://doi.org/10.1111/j.1462-2920.2006.01152.x.
68. Noinaj N, Guillier M, Barnard TJ, Buchanan SK. 2010. TonB-dependent transporters: regulation, structure, and function. Annu Rev Microbiol 64:43–60. https://doi.org/10.1146/annurev.micro.112408.134247.
69. Zubkov MV, Fuchs BM, Archer SD, Kiene RP, Amann R, Burkill PH. 2001. Linking the composition of bacterioplankton to rapid turnover of dissolved dimethylsulphoniopropionate in an algal bloom in the North Sea. Environ Microbiol 3:304–311. https://doi.org/10.1046/j.1462-2920.2001.00196.x.
70. Nikrad MP, Cottrell MT, Kirchman DL. 2014. Uptake of dissolved organic carbon by gammaproteobacterial subgroups in coastal waters of the west Antarctic peninsula. Appl Environ Microbiol 80:3362–3368. https://doi.org/10.1128/AEM.00121-14.
71. Brissette JL, Weiner L, Ripmaster TL, Model P. 1991. Characterization and sequence of the Escherichia coli stress-induced psp operon. J Mol Biol 220:35–48. https://doi.org/10.1016/0022-2836(91)90379-K.
72. Battesti A, Majdalani N, Gottesman S. 2015. Stress sigma factor RpoS degradation and translation are sensitive to the state of central metabolism. Proc Natl Acad Sci U S A 112:5159–5164. https://doi.org/10.1073/pnas.1504639112.
73. McCarren J, Becker JW, Repeta DJ, Shi Y, Young CR, Malmstrom RR, Chisholm SW, DeLong EF. 2010. Microbial community transcriptomes reveal microbes and metabolic pathways associated with dissolved organic matter turnover in the sea. Proc Natl Acad Sci U S A 107:16420–16427. https://doi.org/10.1073/pnas.1010732107.
74. Siddiqui PJA, Bergman B, Carpenter EJ. 1992. Filamentous cyanobacterial associates of the marine planktonic cyanobacterium Trichodesmium. Phycologia 31:326–337. https://doi.org/10.2216/i0031-8884-31-3-4-326.1.
75. Amin SA, Hmelo LR, van Tol HM, Durham BP, Carlson LT, Heal KR, Morales RL, Berthiaume CT, Parker MS, Djunaedi B, Ingalls AE, Parsek MR, Moran MA, Armbrust EV. 2015. Interaction and signalling between a cosmopolitan phytoplankton and associated bacteria. Nature 522:98–101. https://doi.org/10.1038/nature14488.
76. Walworth N, Pfreundt U, Nelson WC, Mincer T, Heidelberg JF, Fu F, Waterbury JB, Glavina del Rio T, Goodwin L, Kyrpides NC, Land ML, Woyke T, Hutchins DA, Hess WR, Webb EA. 2015. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle. Proc Natl Acad Sci U S A 112:4251–4256. https://doi.org/10.1073/pnas.1422332112.
77. Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, Macmanes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, Leduc RD, Friedman N, Regev A. 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8:1494–1512. https://doi.org/10.1038/nprot.2013.084.
78. Hyatt D, Locascio PF, Hauser LJ, Uberbacher EC. 2012. Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 28:2223–2230. https://doi.org/10.1093/bioinformatics/bts429.
79. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403–410. https://doi.org/10.1016/S0022-2836(05)80360-2.
80. O’Leary. 2016. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res 44:D733–D745. https://doi.org/10.1093/nar/gkv1189.
81. Huson DH, Beier S, Flade I, Górska A, El-Hadidi M, Mitra S, Ruscheweyh HJ, Tappu R. 2016. MEGAN community edition - interactive exploration and analysis of large-scale microbiome sequencing data. PLoS Comput Biol 12:e1004957. https://doi.org/10.1371/journal.pcbi.1004957.
82. Racine J. 2012. RSTUDIO: a platform-independent IDE for R and Sweave. J Appl Econ 27:167–172. https://doi.org/10.1002/jae.1278.
83. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL, Serova N, Davis S, Soboleva A. 2013. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res 41:D991–D995. https://doi.
In preparation for PNAS Plus
Major and minor categories: Biological Sciences – Environmental Sciences

Genomics and pangenomics of Synechococcus in the global ocean

Michael D. Lee 1, Nathan G. Walworth 1, Joshua D. Kling 1, Nathan A. Ahlgren 2, Gabrielle Rocap 3, Mak A. Saito 4, David A. Hutchins 1, and Eric A. Webb 1*

Author affiliations:
1: Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.
2: Department of Biology, Clark University, Worcester, MA, USA.
3: School of Oceanography, University of Washington, Seattle, WA, USA.
4: Woods Hole Oceanographic Institution, Woods Hole, MA, USA.
*: Correspondence: Dr. Eric A. Webb, University of Southern California, 3616 Trousdale Pkwy, AHF 107, Los Angeles, CA 90089, USA; eawebb@usc.edu

Keywords: Synechococcus | TARA Oceans | metagenomics | pangenomics

Synechococcus is a globally abundant photoautotroph and a major driver of the Earth’s carbon cycle. Here we present 12 new, high-quality draft genomes of marine Synechococcus isolates spanning several clades. We integrate them phylogenomically and pangenomically with previously available reference genomes and use ~100 environmental metagenomes, largely sourced from the TARA Oceans project, to assess the global distributions of these currently accessible genomic lineages. This provided heretofore-unattained resolution into the genus’s variability in the marine environment, both within and across samples. We find that the newly provided clade II isolates are by far the most representative of the recovered in situ populations. These genomic lineages possess the smallest genomes yet recovered from the genus (2.14 ± 0.05 Mbps; mean ± 1 SD), while concurrently hosting some of the highest GC contents (60.67 ± 0.16%).
This is in direct opposition to what Synechococcus’s nearest relative, Prochlorococcus, demonstrates – wherein decreasing genome size via genomic streamlining coincides with a strong decrease in GC content – suggesting that this sub-clade of Synechococcus has convergently undergone genomic streamlining, but along a fundamentally different evolutionary trajectory. Regarding Synechococcus ecology, we find that selection across the marine environment plays a larger role than dispersal alone, and we demonstrate the value of using the entire genetic complement identified in the current pangenome (~85,000 genes) to functionally characterize in situ populations. Finally, an analysis of hypervariable genes, identified based on their frequency of detection in the environment, reveals an enrichment of integrases, multidrug exporters, and glycosyltransferase genes involved in outer membrane modification, providing insight into mechanisms that may be contributing to the generation and stability of within-sample fine-scale diversity. Altogether, this work provides several novel insights into the genomics, ecology, and evolution of marine Synechococcus.

Introduction

Synechococcus is a unicellular cyanobacterium that is highly abundant throughout the global surface ocean. Among photoautotrophs, it is second in abundance only to its nearest relative, Prochlorococcus, but Synechococcus is able to thrive in a wider range of environments (e.g. Flombaum et al. 2013; Sohm et al. 2015; Farrant et al. 2016). Broadly speaking, Prochlorococcus populations extend to deeper depths and are found in greater abundance in warm, oligotrophic waters, but are more latitudinally restricted than Synechococcus, whose range includes polar and nutrient-rich habitats (e.g. Flombaum et al. 2013).
The successful proliferation of both of these genera across the world’s oceans and within a multitude of distinct environmental niches is owed to a plethora of varied genomic lineages that have evolved since the divergence of Synechococcus and Prochlorococcus, estimated to have occurred between ~0.25 and 1 billion years ago (Dvořák et al. 2014). Recent studies leveraging current sequencing technologies have begun to shed light on the expansiveness of the genomic architecture underlying these organisms (e.g. Marston et al. 2012; Sun and Blanchard 2014; Kashtan et al. 2014). Moreover, it has been revealed that the level of “fine-scale diversity” of a population within a single sample (here defined as genetic variation that exists within a traditionally defined operational taxonomic unit; Chase and Martiny 2018) is often greater than what may have been anticipated. For example, hundreds of subpopulations of Prochlorococcus have been reported to be stably coexisting in natural samples (Kashtan et al. 2014). In the same vein, multiple “ecotypes” of Synechococcus have also been observed coexisting based on marker-gene studies (Sohm et al. 2015; Farrant et al. 2016). Due to their profusion, ubiquity, and carbon-fixing nature – all of which are anticipated to expand under current global change trends (Flombaum et al. 2013) – understanding the diversity and ecology of these marine picoplankton is vital to our understanding of Earth’s elemental cycling.

Members of Synechococcus have been grouped into various clades based on both physiological characteristics and various phylogenetic markers (e.g. Dufresne et al. 2008; Ahlgren and Rocap 2012). Their global distributions have also been studied using marker genes (e.g.
the 16S rRNA gene, 16S–23S internal transcribed spacer, rpoC1, ntcA, and petB), and this has revealed some trends in biogeographic distributions corresponding to clade designations (Ahlgren and Rocap 2012; Toledo and Palenik 1997; Penno, Lindell, and Post 2006; Mazard et al. 2012; Sohm et al. 2015; Farrant et al. 2016). For instance, clades I and IV are commonly reported at higher latitudes with cooler waters, while clade II is more frequently detected at lower latitudes (e.g. Sohm et al. 2015; Farrant et al. 2016). However, individual marker genes are not indicative of entire genomic content and, due to their inherently broad-level resolution, can mask fine-scale diversity with biogeographical and ecological implications (Chase and Martiny 2018).

Here we introduce 12 new, high-quality draft genomes of marine Synechococcus isolates from several different clades and analyze them phylogenomically and pangenomically along with the genus’s previously available reference genomes. With this reference library of 31 non-redundant, marine Synechococcus isolate genomes, we then used ~100 environmental metagenomic samples, largely sourced from the TARA Oceans sampling project (Sunagawa et al. 2015), to assess their global distributions; while reference genomes are not necessarily perfect representatives of in situ populations, they do serve as windows into the genomic lineages they represent. In this work, we discuss the novel insights into marine Synechococcus ecology, genomics, and evolution that these new isolates and the integration of these data provide.

Results

Phylogenomics, general genome characteristics, and environmental abundance of Synechococcus isolate genomes

Nineteen Synechococcus isolate genomes sourced from various locations were sequenced, assembled, and manually curated, resulting in 12 new, distinct (<98% average nucleotide identity), high-quality draft genomes (estimated at ~>99% completeness and ~<1% redundancy; Table S1).
Phylogenomic analysis with previously available marine Synechococcus reference genomes placed these within several clades including I, II, XV, XVI, and CRD1 (Fig. 1A; new genomes are bolded and underlined). To assess the environmental relevance of the currently accessible marine Synechococcus genomic lineages, we created a reference library from these 31 non-redundant reference genomes and recruited metagenomic reads from 97 environmental samples (of size fraction 0.2–3 µm) spanning all major oceans and the Mediterranean and Red Seas (Table S2; Fig. S2). Overall, our Synechococcus reference library recruited roughly 1.13% of a total of ~32 billion quality-filtered reads, with ~1.21% mapping from 65 surface samples and ~0.99% mapping from 32 deep-chlorophyll maximum samples (Table S2).

To mitigate artifacts due to non-specific mapping, we employed a “detection” criterion for our reference genomes, requiring that at least 50% of the reference base pairs (bps) recruit reads in order for that genome to be considered representative of the in situ population (see Fig. S3 and Supp. Note 1 for further discussion of “detection”). (We use the term “population” throughout this paper to refer to the Synechococcus genomic lineages of a sample that our reference genomes grant us access to via read recruitment.) Boxplots of each reference genome’s levels of detection (Table S3), along with the number of samples in which it passed this detection threshold, are presented in Figure 1C. Combined with the relative abundances (here defined as the proportion of recruited reads; Fig. 1D), these demonstrate which currently available Synechococcus isolates best represent the in situ populations of the recovered samples. For instance, reference genome RS9917’s maximum detection in any sample was ~13% (meaning at most only 13% of its bps recruited any reads from any sample; Fig.
1C), and therefore it was not deemed representative of any of the in situ Synechococcus populations recovered from the incorporated 97 metagenomic samples; RS9917 has previously been found to be associated with hypersaline environments (Dufresne et al. 2008). In contrast, isolate genome N32 in clade II recruited reads to more than 50% of its genome in 38 of the 97 total samples (Fig. 1C).

Figure 1 | Phylogenomic tree of the 31 analyzed Synechococcus genomes (A) and genome information (B–D). The phylogenomic tree (A) is maximum-likelihood (100 bootstraps) of 1,002 orthologs present in single copy in each of the 31 incorporated Synechococcus reference genomes, forming an alignment of 363,980 amino acids. The root depicted is inferred from a similar tree built incorporating Gloeobacter violaceus (Fig. S1). Underlined and bolded genomes are newly provided from the current study, and colors correspond to clades. Size and GC content are plotted in panel B; a star indicates the genome possesses at least one giant open-reading frame (see main text). Panel C depicts the number of samples (of the total 97) for which a reference genome was deemed detected; the maximum genome detection is plotted as the main bar, and within it are boxplots of the reference genome’s detection. Panel D plots each reference’s overall relative abundance, defined here as the proportion of reads recruited to each genome out of the total number of reads recruited to the entire library of the 31 reference genomes (i.e. the “overall relative abundance” of all reference genomes sums to 100%).

By far the most abundant genomic lineages were represented by the 5 newly contributed genomes within clade II (N32, N5, UW86, N26, and N19; Fig. 1). These together were responsible for ~58% of the total recruited reads (Fig. 1D).
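The “detection” criterion and the overall relative abundance described above can be illustrated with a minimal Python sketch. It assumes that per-base coverage depths are already available for each reference genome in a given sample; the function names, the toy coverage vectors, and the read counts are hypothetical and are not taken from the study’s actual pipeline.

```python
from typing import Dict, Sequence

# From the text: at least 50% of reference bps must recruit reads.
DETECTION_THRESHOLD = 0.5


def detection(per_base_coverage: Sequence[int]) -> float:
    """Fraction of reference positions covered by at least one recruited read."""
    if not per_base_coverage:
        return 0.0
    covered = sum(1 for depth in per_base_coverage if depth > 0)
    return covered / len(per_base_coverage)


def is_detected(per_base_coverage: Sequence[int],
                threshold: float = DETECTION_THRESHOLD) -> bool:
    """True if the genome passes the detection criterion in this sample."""
    return detection(per_base_coverage) >= threshold


def overall_relative_abundance(reads_per_genome: Dict[str, int]) -> Dict[str, float]:
    """Percent of all library-recruited reads assigned to each genome.

    The values sum to 100 across the reference library (when any reads mapped).
    """
    total = sum(reads_per_genome.values())
    if total == 0:
        return {genome: 0.0 for genome in reads_per_genome}
    return {genome: 100.0 * n / total for genome, n in reads_per_genome.items()}


# Toy example (invented numbers): a genome with 13% of positions covered
# fails the criterion, mirroring the RS9917 case described in the text.
sparse_genome = [0] * 87 + [5] * 13
well_covered_genome = [0] * 40 + [3] * 60
print(detection(sparse_genome), is_detected(sparse_genome))
print(detection(well_covered_genome), is_detected(well_covered_genome))
print(overall_relative_abundance({"genome_a": 580, "genome_b": 420}))
```

Depth is ignored when computing detection in this sketch: a position counts as covered whether it recruited one read or thousands, which is what separates detection from relative abundance.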
In the context of the other reference genomes, these five have significantly smaller genomes (~2.14 ± 0.05 Mbps; mean ± 1 SD) than the remaining 26 (~2.55 ± 0.23 Mbps; p=5e-9; Welch’s two-tailed t-test), and significantly higher GC contents (~60.67 ± 0.16% as compared to ~57.36 ± 2.93%; p=5e-6; Fig. 1B). They also contain significantly fewer genes (2,469 ± 45 vs 2,786 ± 245; p=1e-6) and have a moderately higher coding density (~0.908 ± 0.005 vs ~0.898 ± 0.021; p=0.04; Table S1). Prior to the addition of these newly sequenced clade II isolates, the available clade II reference genomes included only CC9605 and WH8109, which have markedly different genome sizes (~2.51 Mbps and ~2.11 Mbps; Fig. 1B). While WH8109 has a comparable genome size to the newly provided clade II members (~2.12 Mbps), its overall relative abundance (Fig. 1D) and distribution in the environment (Fig. S4) were notably different from those of the 5 new clade II members, and better reflect those of CC9605 (Fig. S4).

The next most representative reference genomes of the incorporated samples were GEYO and MIT9508 from clade CRD1 (together responsible for ~10% overall recruitment), WH8102 of clade III (~5%), BL107 and CC9902 of clade IV (~6.5% combined), RCC307 of clade X (~3%), and KORDI49 of WPC1 (~1.5%; Fig. 1D). None of the other reference genomes were responsible for greater than 1% of the total reads that successfully recruited to our reference library (Fig. 1D; Table S1).

Biogeography of Synechococcus genomic lineages

Figure 2 depicts the distributions and relative abundances of Synechococcus clades in the 65 included surface ocean samples (deeper samples showed similar trends but in lower relative abundance; Fig. S5; Table S2).
The locations with the greatest recovered abundances of Synechococcus were site 141 just north of the Panama Canal, where ~9% of total reads mapped to our reference genomes, site 33 in the Red Sea with ~7%, and site 57 in the southern Indian Ocean with ~6%, all of which were mostly dominated by clade II (Fig. 2; Table S2). In fact, the top 13 samples in terms of relative abundance of Synechococcus based on read recruitment were all dominated by clade II, though as mentioned above, recruitment to this clade was not evenly distributed across all of its representative genomes. Each of the 5 newly contributed genomes recruited more than twice as many reads as CC9605 or WH8109 (Fig. 1D; Table S4), and recruited in different proportions in different samples (Fig. S4; Table S4). Overall, consistent with previous literature using marker genes (Sohm et al. 2015; Farrant et al. 2016), clade II abundance was found to be significantly positively correlated with iron (Fe) and temperature (Fig. 3; Fe estimates were sourced from Farrant et al. 2016).

Following the samples dominated by clade II when ranked by the relative abundance of recovered Synechococcus, the next most abundant was site 25 in the Mediterranean Sea, with ~3% of total reads recruiting to our reference library (Fig. 2). Though the population was mixed, its most representative reference genome was WH8102 of clade III (Fig. 2; Table S4). This sole reference genome of clade III was found to be representative of a portion of the populations recovered in a few other samples, such as site 142 in the Gulf of Mexico and site 141 north of the Panama Canal, but mostly dominated the Mediterranean Sea (Fig. 2). WH8102 abundance was significantly negatively correlated with phosphate (PO₄³⁻) and silicon (Si), and positively correlated with salinity and Fe (Fig. 3).

Clade CRD1 was most dominant in the Costa Rica Upwelling Dome (CRUD) of the mid-eastern Pacific (green in Fig.
2), with by far the majority of recruited reads mapping to isolate genomes GEYO and MIT9508 (Fig. 1D; Table S4), but was also representative of portions of the populations recovered in the southeast Pacific and southeast Atlantic (e.g. sites 93 and 68; Fig. 2). This clade was the only one significantly positively correlated with PO₄³⁻, nitrite (NO₂⁻), and chlorophyll (Fig. 3), and although not found to be significant, was the only clade with a strong negative correlation to Fe (adj. p=0.07). Genomes BL107 and CC9902 of clade IV, and to a much lesser extent, CC9311 and WH8020 of clade I, were mostly representative of Synechococcus populations in cooler waters/higher latitudes, as has been observed before based on marker-gene analyses (Sohm et al. 2015; Farrant et al. 2016), and were found to be significantly negatively correlated with temperature, and correspondingly positively correlated with oxygen (Fig. 3).

Figure 2 | Distributions of dominant clades from 65 surface ocean samples. Pie sizes are scaled to represent the percent of total sample reads that were recruited to the 31 reference genomes, serving roughly as a metric of how abundant the Synechococcus population was at each sample (though this is constrained by the reference genomes). Pies are colored by the proportion of reads recruited to each of the dominant clades in each sample. “Not detected” refers to sites where no genome’s proportion of recruitment (“detection” – see main text) surpassed the defined 50% threshold. Select samples are labeled with numbers corresponding to their TARA designations. Supp. Fig. S2 is the same map with all identifiers shown, which are also presented in Supp. Table S2 with further sample data.

Figure 3 | Heatmap of Spearman correlations between clade recruitment and environmental data. Input data are presented in Supp. Tables SX/SY. Stars indicate adjusted p-value < 0.05.

A Synechococcus pangenome

Based on the open-reading frame finding software utilized here (Prodigal; Hyatt et al.
2010), we identified a total of 84,784 “genes” (Table S5) from the 31 incorporated reference genomes (2,735 ± 254 genes per genome; mean ± 1 SD; Table S1), representing the Synechococcus pangenome of the current study. These were functionally annotated with NCBI’s Clusters of Orthologous Groups (COGs; Galperin et al. 2015) and Kyoto Encyclopedia of Genes and Genomes (KEGG; Kanehisa et al. 2016) Orthologs (KOs), resulting in ~63% of genes being annotated with COGs and ~44% being annotated with KOs (Table S5). Clustering all identified genes using the Markov Cluster (MCL) algorithm (van Dongen and Abreu-Goodger 2012) within the analysis and visualization platform anvi’o (Eren et al. 2015) yielded a total of 14,036 orthologous gene clusters (GCs; center of Fig. 4; Table S5). These were grouped into pangenomically defined “core”, “accessory”, and “unique” GCs (“pangenomically defined” as used here is in contrast to their environmentally defined counterparts presented below).

This Synechococcus pangenome revealed a “core” set of 1,106 GCs of which each genome contained at least one copy (marked as “Core” in Fig. 4 and in Table S5), totaling 35,140 genes. Within this core set of GCs, 1,002 held genes that were present solely in single copy in all 31 reference genomes (these were the 1:1 orthologs used in the phylogenomic tree of Figure 1A; marked as “SCG” in Table S5). A set of 4,943 GCs contained 41,026 “accessory” genes, defined here as those with identified orthologs shared between at least 2 of the reference genomes but fewer than all 31 (marked as “Accessory” in Table S5). Finally, 7,987 “unique” GCs were detected, representing GCs holding genes detected in only individual reference genomes (marked as “Unique” in Fig. 4 and in Table S5). While a unique GC could potentially hold multiple gene copies from an individual genome, this was not commonly the case, as there were only a total of 8,181 genes spread amongst the 7,987 unique GCs (Fig. 4; Table S5).
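The core/accessory/unique definitions above amount to a simple rule on how many genomes contribute genes to each GC. The following Python sketch shows that rule under stated assumptions: the input layout (a mapping from GC identifier to the list of source genomes, one entry per gene), the function name, and the toy clusters are hypothetical, and this is not the anvi’o implementation.

```python
from collections import Counter
from typing import Dict, List, Tuple


def classify_gene_clusters(
    clusters: Dict[str, List[str]], n_genomes: int
) -> Dict[str, Tuple[str, bool]]:
    """Label each gene cluster as core, accessory, or unique.

    `clusters` maps a GC identifier to the genome names of its member genes,
    with one list entry per gene (so a genome appears twice if it holds two
    copies). Returns GC id -> (category, is_single_copy_core).
    """
    labels = {}
    for gc_id, genomes in clusters.items():
        copies = Counter(genomes)      # genes contributed per genome
        n_present = len(copies)        # number of genomes represented
        if n_present == n_genomes:
            # Present in every genome; single-copy if exactly one gene each.
            single_copy = all(count == 1 for count in copies.values())
            labels[gc_id] = ("core", single_copy)
        elif n_present >= 2:
            labels[gc_id] = ("accessory", False)
        else:
            # Genes from only one genome, possibly in multiple copies.
            labels[gc_id] = ("unique", False)
    return labels


# Toy pangenome of three genomes (invented data).
toy_clusters = {
    "GC_1": ["A", "B", "C"],        # single-copy core
    "GC_2": ["A", "A", "B", "C"],   # core, but two copies in genome A
    "GC_3": ["A", "B"],             # accessory
    "GC_4": ["C", "C"],             # unique, two copies in one genome
}
for gc_id, label in classify_gene_clusters(toy_clusters, n_genomes=3).items():
    print(gc_id, label)
```

The "GC_4" case corresponds to the situation noted in the text, where a unique GC holds more than one gene copy from the same genome.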
We used COG categories to summarize gene annotations across the pangenome (Table S6). While ~63% of all genes were annotated with COGs overall, the proportions of genes successfully annotated were not evenly distributed across the pangenomically defined core, accessory, and unique genes. Roughly 87% of the 35,140 core genes, ~51% of the 41,026 accessory genes, and ~25% of the 8,181 unique genes were ascribed annotations (Fig. 5; Table S6). This trend likely reflects some combination of more highly conserved genes (identified here as the “core”) being more likely to be present in reference databases, and the divergence of unique genes leading to them not clustering together into GCs based on the employed thresholds. Because of the latter, examining which types of functions are disproportionately annotated in the accessory and unique groupings of genes should provide insight into areas of higher divergence within the genus. In comparing the proportions of genes that were assigned putative functions from each pangenome-defined group (Fig. 5; Table S6), the core’s most frequently annotated functional categories included “Translation, ribosomal structure, and biogenesis” (~13%), “Coenzyme transport and metabolism” (~9%), and “Amino acid transport and metabolism” (~9%). This is in contrast to those identified as accessory and unique genes (Fig. 5): the top 3 categories for accessory genes were “General function prediction only” (~13%), “Cell wall/membrane/envelope biogenesis” (~11%), and “Function unknown” (~7%), while the top 3 categories for the identified unique genes were “Cell wall/membrane/envelope biogenesis” (~14%), “General function prediction only” (~13%), and “Secondary metabolites biosynthesis, transport, and catabolism” (~7%; Fig. 5).

Figure 4 | Synechococcus pangenome based on the 31 included reference genomes. Starting from the center of the circle, each layer radiating out represents a different genome, with each spanning 14,036 gene clusters (GCs) representing the pangenome of the 31 reference genomes. Colored-in sections indicate that the genome possesses at least 1 gene within that GC. The right edge of the circle plot (starting at 3 o'clock) highlights 1,106 GCs representing coding sequences all 31 genomes possess (Supp. Table S4/S5). The section highlighted 'unique' represents GCs only found in an individual genome (Supp. Table S6). After the genome layers there are 7 additional layers of information for each GC. The first of these layers, environmental core genes / environmental accessory genes (ECG/EAG), depicts the ratio of genes within a GC that were detected in the environment when that reference genome was detected: green = the gene was present in the environment when the reference was; red = the gene was not detected in the environment when the reference was (see main text). The additional 6 layers display log-transformed normalized coverage values for each GC for the named samples. The plots in the top right extending upward from the circle include: a Spearman correlation heatmap of genome relative abundance to environmental data (with stars denoting significance at <0.05); the presence/absence of select genes, with copy number displayed if > 1; the relative distributions of each genome across select surface samples (normalized by row – darker blue indicates that genome recruited a greater proportion of reads from that sample); at the top, the overall relative abundance of that reference genome in the entire 97 metagenomes; and lastly, the horizontal barplot at the top right, which depicts the percent of reads recruited of total reads for the corresponding row, as a metric of relative abundance of the Synechococcus population recovered by the 31 reference genomes for each specified sample. ANE = Atlantic northeast; ANW = Atlantic northwest; ASE = Atlantic southeast; ION = Indian Ocean north; IOS = Indian Ocean south; MED = Mediterranean Sea; PON = Pacific Ocean north; PSE = Pacific southeast; PSW = Pacific southwest; RED = Red Sea.

Based on the currently available references, some Synechococcus members still maintain a catalase peroxidase (katG), though none of the genomic lineages found to be abundant in the current study do, and the gene’s presence or absence in a genome does not track with previously designated clades (Fig. 4; see “Gene Presence/Absence” on right side). There were, however, some clade-specific GCs identified (Fig. 4). For instance, a cluster of 147 GCs detected only in all 4 representatives of clade CRD1 included genes putatively involved in gluconate and lactate dehydrogenase activity (kduD and dld; Fig. 4). Within that clade, there were 62 clusters of distribution-specific GCs found only in MIT9508 and GEYO (the two references which recruited the vast majority of reads to CRD1; Figs. 1 and 4). These were largely unannotated, but several were ascribed as uncharacterized membrane proteins (e.g. ygaE, DUF939; Table S5).

The only representative of clade III, WH8102, which was dominant in the Mediterranean Sea (Fig. 2), possessed 323 unique GCs. These included a nitrate/nitrite transporter operon (ntrABC; gene IDs 64720, 64719, and 64718) not detected in any of the other reference genomes (though several reference genomes in the large grouping 5.1A, including WH8102, as well as clade X’s isolate RCC307, possess an annotated nitrate/taurine transporter operon; Table S5). Additionally, all reference genomes contain a single copy of the nitrate/nitrite transporter narK (Table S5), while clade III’s WH8102 contains 2 copies, though one is frame-shifted by an insertion that introduces a stop codon midway through the gene and interrupts environmental read recruitment (Fig.
S7; gene IDs 64695 and 64696). Clade III’s WH8102 is also the only abundantly detected isolate to possess an annotated phoD-type alkaline phosphatase (Fig. 4; Table S5; gene ID 62255; clade V’s WH7803 also possesses a copy), though all 31 genomes (including WH8102) possess at least one copy of an annotated broad-specificity phosphatase (phoE), and several (including WH8102) also possess an annotated secreted phosphatase (phoX; Table S5). The presence of this phoD gene being unique to clade III’s WH8102 and clade V’s WH7803 has been noted before (Scanlan et al. 2009), and still holds true given a 3-fold increase in the number of Synechococcus isolate genomes.

Figure 5 | Top COG category annotations of pangenome-defined gene sets. Percentages listed beneath gene sets in the legend are percent annotated out of: 84,787 total genes called; 35,140 core genes; 41,026 accessory genes; and 8,181 unique genes.

As noted above, WH8102 was found to be significantly negatively correlated with PO₄³⁻, in congruence with, or due to, its dominance in the Mediterranean Sea (Figs. 2 and 4), which is known to have unusually high N to P ratios (Powley et al. 2017). In soil communities, pH was found to be the strongest determining factor of community phoD diversity (Ragot, Kertesz, and Bunemann 2015). As the Mediterranean is more acidic than the global ocean (Flecha et al. 2015), it is possible that pH is playing or has played a role in the evolution and maintenance of WH8102’s phoD. Contrasting the recruitment profiles of this gene from the Mediterranean (sites 04 and 25) with those of the western Atlantic (sites 141 and 142, in which WH8102 was also detected; see Fig. 2) reveals clear differences between the in situ Synechococcus populations of each (Fig. 6).
The N-terminus of phoD and the calcium-binding domain (COG2931) at the C-terminus of the gene both recruit reads at all locations, but the active site of phoD and the C-terminus of the phoD domain both drop to 0 recruitment in the western Atlantic (Fig. 6).

Figure 6 | Read recruitment to a putative phoD-type alkaline phosphatase unique to isolate WH8102. Coverage patterns reveal variability of the gene possessed by in situ populations found in the western Atlantic (blue) as compared to those in and near the Mediterranean Sea (red) in the active site and C-terminus of PhoD. PhoD (pfam09423) = alkaline phosphatase; PhoD_N (pfam16655) = N-terminus of PhoD; COG2931 = calcium-binding protein.

Looking at correlation patterns of individual genomic lineage abundances with environmental parameters reveals further inter- and intra-clade variation. For instance, within clade II, all were significantly positively correlated with Fe, but CC9605 was the only one significantly negatively correlated with PO₄³⁻, and only the 5 new members of clade II were significantly positively correlated with temperature (Fig. 4; Spearman correlations at the base of the top-right stack). The only isolate genomes significantly positively correlated with chlorophyll were GEYO and MIT9508 of clade CRD1, while most others, like all members of clade II, tended to be slightly negatively correlated with chlorophyll (though not significantly; Fig. 4). Additionally, only members of clades I and IV were significantly negatively correlated with temperature (as has been seen based on marker-gene studies; Sohm et al. 2015; Farrant et al. 2016), though two reference genomes of clade I were not significantly correlated, likely due to their low levels of recruitment (Fig. 4; shaded-blue heatmap at top-right of stack depicts relative abundances of each genome in each surface sample).
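The Spearman correlations referred to above are rank-based: each variable is converted to ranks (ties sharing their average rank) and a Pearson correlation is computed on the ranks. The Python sketch below shows only the coefficient itself, using the standard library. The function names and the toy values are hypothetical, and the significance testing and p-value adjustment behind the reported stars are not reproduced here, since the adjustment method is not stated in this section.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values assigned the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy example (invented values): relative abundance of one lineage across
# five samples against the temperature measured at each sample.
temperature_c = [12.0, 18.5, 22.0, 26.5, 29.0]
relative_abundance = [0.1, 0.4, 0.3, 1.9, 2.5]
print(round(spearman_rho(temperature_c, relative_abundance), 3))
```

Because only ranks enter the calculation, any monotonic relationship (linear or not) between abundance and an environmental parameter yields a coefficient of ±1.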
Giant open-reading frames (gORFs), with coding sequences reaching sizes of ~32,000 bps in WH8102, and up to 85,000 bps in RS9917, have been noted and previously studied within Synechococcus (McCarren and Brahamsha 2007; Scanlan et al. 2009). These have been shown to be involved in a rare form of motility in WH8102 (McCarren and Brahamsha 2007), and are suspected to also likely be involved in some sort of defense mechanism (Scanlan et al. 2009). We scanned the incorporated reference genomes of the current study for gORFs, arbitrarily defining them here as coding sequences greater in length than 10,000 bps. Twelve of the 31 incorporated genomes contained at least one identified open-reading frame that surpassed this threshold encompassing members in both of the larger groups of 5.1A and 5.1B (Fig. 1B; stars indicate identification of at least one gORF; Fig. S7 depicts top 10 largest genes of each genome). Of those reference genomes most representative of the in situ populations recovered (Fig. 1C–D), WH8102 of clade III and both members of clade IV possessed gORFs, while clade II and the two reference genomes responsible for the majority of clade CRD1 recruitment, GEYO and MIT9508, did not (Fig. 1). For the reference genomes that do possess gORFs and were detected in the environment, we looked at the read recruitment profiles for these genes to see if they were being maintained in situ (at least to a degree of similarity that would successfully recruit to our references sequences). Most genomes with gORFs did recruit reads across the genes at a similar level of coverage to the rest of the genome (Table 1). 
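The gORF scan described above amounts to a length filter over called coding sequences. The following is a minimal sketch of that filter, not the pipeline used in this study: the `Gene` record, the `find_gorfs` helper, and the example coordinates are hypothetical, and only the 10,000 bp threshold is taken from the text.

```python
from typing import NamedTuple

# Threshold stated in the text: coding sequences greater than 10,000 bp.
GORF_MIN_LENGTH_BP = 10_000


class Gene(NamedTuple):
    """Hypothetical gene-call record (not an anvi'o or Prodigal data type)."""
    gene_id: str
    genome: str
    start: int  # 0-based, inclusive
    stop: int   # 0-based, exclusive


def find_gorfs(genes, min_length=GORF_MIN_LENGTH_BP):
    """Return {genome: [(gene_id, length), ...]} for genes strictly longer
    than min_length, sorted from largest to smallest within each genome."""
    hits = {}
    for gene in genes:
        length = gene.stop - gene.start
        if length > min_length:
            hits.setdefault(gene.genome, []).append((gene.gene_id, length))
    for genome in hits:
        hits[genome].sort(key=lambda item: item[1], reverse=True)
    return hits


# Hypothetical coordinates, for illustration only.
example_genes = [
    Gene("g1", "genomeA", 0, 32_000),
    Gene("g2", "genomeA", 40_000, 41_200),
    Gene("g3", "genomeB", 500, 10_500),    # exactly 10,000 bp: not a gORF
    Gene("g4", "genomeB", 20_000, 31_500),
]
gorfs = find_gorfs(example_genes)
```

Because the threshold is arbitrary, keeping it as a named parameter makes it easy to check how sensitive the count of gORF-bearing genomes is to that choice.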
Exceptions among those that were abundant include clade III’s WH8102 and both clade IV genomes (CC9902 and BL107). In these cases, the gORFs did recruit reads across all or nearly all of the gene, but at coverage levels drastically lower than the coverage of that reference genome in that sample. This could be either because the in situ populations maintain varied sequences which do not all successfully recruit, or, because these gORF sequences are highly repetitive (Scanlan et al. 2009), because the in situ communities hold sequences that are much shorter. We also checked to see if any of these identified gORFs were recruiting reads in samples in which their source genome did not pass the detection criterion. This was rare overall, but there were some instances in which this occurred. Two of the gORFs from RS9916 recruited reads from the Costa Rica Upwelling Dome (CRUD) and from the southern Indian Ocean, and both gORFs from UW179A recruited reads consistently from samples in the Pacific Ocean in which UW179A was not deemed present (Table 1).
Table 1 | Genomes with gORFs and their detection in the environment
Genome (clade) | # gORFs | Gene IDs (~size in kbps) (bold if coverage matched genome’s) | Gene detection without genome
KORDI49 (WPC1) | 8 | 16030 (34); 13780 (26); 16451 (14); 15392 (14); 16181 (13); 14601 (11); 14861 (11); 15247 (11) | –
WH7805 (VI) | 5 | 61333 (24); 60086 (14); 59970 (11); 61336 (11); 60316 (11) | –
RS9916 (IX) | 4 | 34124 (28); 33786 (23); 34138 (22); 32181 (14) | 33768: CRUD and northern Pacific; 34124: southern Indian Ocean
UW140 (XVI) | 2 | 45129 (22); 45256 (18) | –
RS9917 (VIII) | 2 | 34755 (85); 35366 (27) | –
UW179A (CRD1) | 2 | 47455 (16); 47857 (14) | Both in Pacific Ocean
UW105 (XVI) | 1 | 39404 (20) | –
CC9616 (UC-A) | 1 | 70517 (12) | –
KORDI100 (UC-A) | 1 | 12510 (10) | –
WH8102 (III) | 1 | 63052 (32)* | –
CC9902 (IV) | 1 | 6493 (11)* | –
BL107 (IV) | 1 | 69282 (11)* | –
*These recruited reads across the gene, but with low coverage values compared to the source genome.
It should be noted however that because gORF sequences are highly repetitive (Scanlan et al. 2009), sequencing and assembly methods employed for any individual reference genome likely play a part in the ability of them to be reconstructed and subsequently identified. As such, the presence of a gORF in a reference genome is likely to be more informative than the absence of one. With that said, the newly contributed isolates of this study were sequenced on the Illumina HiSeq platform (2x150 bp), and we were still able to assemble and recover gORFs from several isolates (Fig. 1B). Environmentally defined core and accessory genes Above we identified core, accessory, and unique genes based on the pangenome – meaning based on the presence or absence of genes in the 31 incorporated reference genomes. This approach is useful in the characterization of the reference genomes, but by itself may not be informative with regard to in situ populations. By integrating pangenomics with environmental metagenomics, one can assess the presence or “absence” of individual genes in environmental samples in which their source genome is detected. “Absence” is quoted in the previous sentence in order to emphasize that the lack of read recruitment in a sample to a gene does not necessarily indicate that gene’s absence, but rather that gene may either be absent or sufficiently diverged in that sample such that it does not pass the specified read recruitment thresholds. In either of those cases, however, a lack of recruitment is indicative of a difference in the genetic composition of that sample as compared to the reference. As recently demonstrated in a study on Prochlorococcus, (O. 
Delmont and Eren 2018), this approach can be used to delineate “Environmental Core Genes” (ECGs; those genes consistently detected in environmental samples in which their source genome is detected) and “Environmental Accessory Genes” (EAGs; those genes that are not consistently detected in environmental samples in which their source genome is detected). It should be stressed that environmentally identified core genes are not the same as pangenomically defined core genes, as environmentally identified core genes are defined by the conserved presence of that gene in the environment along with its source genome, and does not consider possession of genes across genomes as does the pangenomic approach. In delineating ECGs and EAGs across the 97 incorporated metagenomes, a total of 71,952 genes were identified as ECGs (i.e. consistently detected in samples in which their source genomes were detected), of which ~66% were annotated with COGs, while 10,091 genes were identified as EAGs (i.e. not consistently detected in samples in which their source genomes were detected), of which ~37% were annotated with COGs. The distributions of EAGs were not randomly distributed throughout their source genomes, but rather were often co-localized and syntenic (Fig. S8). A recently published similar 50 analysis with Prochlorococcus revealed the same trend and demonstrated the utility in this approach as a means of identifying these as hypervariable genomic islands (O. Delmont and Eren 2018). Summarizing the annotated proportions of ECGs and EAGs revealed that functional annotations were enriched in different COG categories for each grouping of genes (Table 2; Fig. S9). For instance, the COG categories “Nucleotide transport and metabolism” and “Translation, ribosomal structure and biogenesis” had an ~2.7-fold and ~2.5-fold greater proportion of annotated genes identified as ECGs (Table 2; Fig. S9). 
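The ECG/EAG delineation can be sketched in a few lines using the thresholds given in Materials and Methods (50% detection for genes and genomes, and 25% of the genome's median summed gene coverage). This is an illustrative sketch under those stated thresholds, not the code used in this study: the function name `classify_genes`, the dictionary layout, and all example values are hypothetical.

```python
from statistics import median


def classify_genes(gene_cov, gene_det, gene_to_genome, genome_det,
                   det_min=0.5, eag_fraction=0.25):
    """Label each gene 'ECG' or 'EAG'.

    gene_cov[gene][sample]     -> mean coverage of the gene in the sample
    gene_det[gene][sample]     -> fraction of the gene covered by reads
    genome_det[genome][sample] -> fraction of the genome covered by reads
    """
    summed = {}
    for gene, cov_by_sample in gene_cov.items():
        genome = gene_to_genome[gene]
        total = 0.0
        for sample, cov in cov_by_sample.items():
            # Only samples in which the source genome passes detection count.
            if genome_det[genome].get(sample, 0.0) < det_min:
                continue
            # Equivalent to zeroing coverage when gene detection is < 50%.
            if gene_det[gene].get(sample, 0.0) < det_min:
                continue
            total += cov
        summed[gene] = total

    # Median summed coverage over all genes of each genome.
    per_genome = {}
    for gene, total in summed.items():
        per_genome.setdefault(gene_to_genome[gene], []).append(total)
    medians = {genome: median(vals) for genome, vals in per_genome.items()}

    return {
        gene: "EAG" if total < eag_fraction * medians[gene_to_genome[gene]]
        else "ECG"
        for gene, total in summed.items()
    }


# Hypothetical toy data: one genome, three genes, three samples.
gene_to_genome = {"a": "G", "b": "G", "c": "G"}
genome_det = {"G": {"s1": 0.9, "s2": 0.8, "s3": 0.2}}  # s3: genome not detected
gene_cov = {
    "a": {"s1": 10.0, "s2": 12.0, "s3": 50.0},
    "b": {"s1": 9.0, "s2": 11.0, "s3": 0.0},
    "c": {"s1": 3.0, "s2": 4.0, "s3": 0.0},
}
gene_det = {
    "a": {"s1": 1.0, "s2": 1.0, "s3": 1.0},
    "b": {"s1": 0.9, "s2": 0.9, "s3": 0.0},
    "c": {"s1": 0.6, "s2": 0.3, "s3": 0.0},  # s2 falls below 50% detection
}
labels = classify_genes(gene_cov, gene_det, gene_to_genome, genome_det)
```

In this toy case gene `c` sums to 3 against a genome median of 20, which is below the 25% cutoff, so it is labeled an EAG. Treating every gene that is not an EAG as an ECG is an assumption of the sketch.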
At the other end of the spectrum, the COG categories for “Extracellular structures”, “Mobilome: prophages, transposons”, and “Cell motility” were all enriched among EAGs, with ~5.2-, ~4.4-, and ~2.7-fold greater proportions of EAGs over ECGs (Table 2; Fig. S9).
Table 2 | COG categories with greatest enrichment of annotations from EAGs
COG category | EAGs/ECGs | Dominant annotations
Extracellular structures | 5.2 | Pilus assembly
Mobilome: prophages, transposons | 4.4 | Phage integrases; Virulence-associated proteins
Cell motility | 2.7 | Pilus assembly; Secretion pathway proteins; Twitching motility
Defense mechanisms | 2.3 | Multidrug efflux; Bacteriocin/lantibiotic exporters
Cell wall/membrane/envelope biogenesis | 2.2 | Glycosyltransferase; Nucleoside-diphosphate-sugar epimerase
Secondary metabolites biosynthesis, transport, and catabolism | 2.2 | Ca2+-binding protein, RTX toxin-related; Short-chain alcohol dehydrogenase family
Discussion
New clade II genomic lineages dominant in mid-latitude coastal areas
Among the current study’s newly contributed reference genomes were several representatives from clade II (Fig. 1), a clade known from previous marker-gene studies to be globally abundant (Sohm et al. 2015; Farrant et al. 2016). We found this to hold true at the genome level (Fig. 2); however, recruitment to the 7 reference genomes of this clade was not uniform. Reference genomes N32, N5, UW86, N26, and N19 were responsible for an average of 11.7 ± 1.1% of total recruited reads each (mean ± 1 SD), while the previously available WH8109 and CC9605 recruited 4.8 ± 0.4% of total reads each (Fig. 1D). These previously available clade II genomic lineages also demonstrated clearly different distributions beyond just relative abundances (Fig. 4, top-right blue-shaded distribution plot; and Fig. S4 depicts a map of only clade II). 
One of the primary utilities of clade designations is to delineate environmentally relevant, physiologically distinct units. Given the strongly divergent environmental distributions among clade II genomic lineages (Fig. 1D; Fig. 4; Fig. S4), it may therefore be useful to designate the 5 new clade II isolates (N32, N5, UW86, N26, and N19) as a sub-clade of clade II. The numerical dominance of these particular genomic lineages seems to be tied to their reduced genome size, rather than their possession of any niche-defining genetic capabilities such as clade CRD1’s unique dehydrogenases or WH8102’s unique phoD-type alkaline phosphatase (Fig. 4; Table S5). Indeed, these 5 reference genomes possess unique genomic characteristics with regard to size and GC content.
Genomic streamlining and GC content – a unique evolutionary path
Following their divergence from Synechococcus, most known genomic lineages of Prochlorococcus, excluding MIT9303 and MIT9313, have proceeded down an evolutionary path of genomic streamlining coinciding with a substantial reduction in GC content (e.g. Dufresne et al. 2008; Fig. 7A–C). The general thinking regarding Synechococcus is that it has not undergone similar genomic streamlining. However, the 5 new clade II reference genomes contributed here (N32, N5, UW86, N26, and N19; Fig. 1) – along with WH8109, though it recruited much less overall and demonstrated a different global distribution (Fig. 1D; Fig. 4; Fig. S4) – appear to have undergone significant genome reduction relative to the rest of the available marine Synechococcus reference genomes (~2.14 ± 0.05 Mbps compared to ~2.55 ± 0.23 Mbps; p=5e-9; Welch 2-tail test; Fig. 7B). Yet, they also possess some of the highest GC contents of the genus (~60.67 ± 0.16% as compared to ~57.36 ± 2.93%; p=5e-6; Fig. 7C). 
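For readers unfamiliar with the Welch test used for these comparisons, the sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics. It is illustrative only: the group sizes (6 streamlined genomes, i.e. the 5 new isolates plus WH8109, versus the remaining 25 of the 31 genomes) are an assumption inferred from the text, the reported ± values are assumed to be sample standard deviations, and the rounded means will not exactly reproduce a test run on the raw genome sizes.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's unequal-variance t statistic and approximate degrees of
    freedom (Welch-Satterthwaite), computed from group summaries."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, dof


# Genome sizes in Mbps as reported in the text; n1 and n2 are assumptions.
t_stat, dof = welch_t_from_summary(2.14, 0.05, 6, 2.55, 0.23, 25)
```

A two-tailed p-value then follows from the t distribution with `dof` degrees of freedom. When the per-genome values are available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test directly on the raw data.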
This suggests these genomic lineages of this sub-clade of clade II have taken an evolutionary path convergent to the majority of known Prochlorococcus with regard to reduction in genome size via genomic streamlining, but distinct with regard to GC content (Fig. 7B–C). Making the issue more interesting, phylogenomic analysis indicates that Synechococcus strain RCC307 (black in Figs. 7A–C) diverged before the Synechococcus speciation event that led to Prochlorococcus (Fig. 7A). RCC307 currently possesses a relatively smaller genome size compared to the rest of Synechococcus (Fig. 7B) and a relatively higher GC content (Fig. 7C) – characteristics more similar to sub-clade of clade II (blue in Figs. 7A– C), than to the rest of the Synechococcus isolates (Figs. 7B–C). This leaves the question of whether: a) the ancestral Synechococcus population that eventually diverged into RCC307 and the current clades of Synechococcus and Prochlorococcus possessed a larger genome, and RCC307 and this sub-clade of clade II subsequently reduced in genome size while the majority of Synechococcus did not (Fig. 7A–B); or b) the ancestral population possessed a smaller genome, like that of the current RCC307 and the new sub- clade of clade II, and the majority of other Synechococcus expanded their genome sizes (Fig. 7A–B). Regardless of their origins, these new Synechococcus genomic lineages represented by isolates N32, N5, UW86, N26, and N19 are highly abundant in the surface ocean that the TARA dataset covers, and they demonstrate an evolutionary history of genomic streamlining without a concurrent decrease in GC content (Fig. 7A–C). One hypothesis to explain the reduced GC content in Prochlorococcus is couched in the observation that all streamlined Prochlorococcus are missing one or both of ada and mutY, which are genes involved in the repair of G:C to A:T transversions (Giovannoni et al. 2014). 
All currently available Synechococcus reference genomes, including those newly contributed with this work, possess copies of both of these genes (Table S5). It is therefore possible these repair genes contribute to their maintained higher GC content. It is also possible that nitrogen (N) is a factor in the GC content of Prochlorococcus as compared to Synechococcus. In considering the same 93 TARA Oceans samples recruited to 31 Prochlorococcus isolate genomes (data from O. Delmont and Eren 2018), Prochlorococcus abundance is significantly negatively correlated with N, while Synechococcus abundance is not (Fig. 7D). In terms of atoms of N, an A–T base pair requires 7, while a G–C base pair requires 8; therefore an organism with a lower GC content in general would be better suited for a lower-N-quota lifestyle. Taking N usage a step further, looking at amino acid usage between Synechococcus and Prochlorococcus reveals that of the 6 amino acids that contain additional N atoms in their side chains, Synechococcus encodes for proportionately more per genome than Prochlorococcus in all except for lysine and asparagine (Fig. 7E). When considering the characteristics of these amino acids, it is also fitting that Prochlorococcus would use proportionately more lysine than Synechococcus. The three amino acids with positively charged side chains are arginine (with 4 N atoms), histidine (3 N atoms), and lysine (2 N atoms). As Prochlorococcus would still need amino acids with positively charged side chains, it might be anticipated it would favor lysine at a cost of fewer N atoms (Fig. 7E; frequencies of all amino acids presented in Fig. S10).
Figure 7 | Genome size and GC content of marine Synechococcus and Prochlorococcus. (A) Maximum likelihood phylogenomic tree of 57 shared single-copy ribosomal proteins comprising an alignment of 5,866 amino acid positions (100 bootstraps). Panels (B) and (C) represent genome sizes and GC content with points colored as described in the bottom legend; both genera include 31 reference isolate genomes. (D) Spearman correlations of normalized read recruitment to the 31 reference genomes of each genus to environmental parameters from the 93 incorporated TARA samples. (E) Percentage of nitrogen-rich amino acid usage of groups described by colors in legend (same as in panels B and C).
This leads us to suggest that N availability may have been a contributing factor in the evolutionary divergence of Synechococcus and Prochlorococcus. It is possible that some members of the ancestral population stochastically spent more time in environments with greater bioavailable N, while others spent more time in environments with less bioavailable N (simply due to external driving factors). These varied distributions of the ancestral population would enable selection to start acting on these subpopulations, creating positive feedback loops that may have ultimately contributed to the divergence of Synechococcus and Prochlorococcus. There are certainly many factors that led to the speciation of Prochlorococcus and the two genera’s subsequent further divergence, but these clear differences in genomic architecture, and Prochlorococcus’s greater abundance in more N-depleted waters, together indicate that N availability may have been (and continues to be) one of those factors.
Population-level selection supersedes dispersal
A recent study of Synechococcus biogeography utilizing the cytochrome b6 gene (petB) observed stark changes in marker-gene-defined ecological units over short geographical distances in some locations (Farrant et al. 2016). We have recapitulated those findings here at the genomic level, as can be seen in locations such as the Marquesas Islands in the Pacific Ocean and at the southern tip of Africa where the southern Atlantic and Indian Oceans meet (Fig. 2). 
These distributions are likely tied to physical oceanography phenomena. For instance at the southern tip of Africa the dominance of clades IV and I in the southeastern Atlantic may be in part attributed to the Coriolis force in the southern hemisphere bringing cooler waters from the south up along Africa’s west coast (Fig. 2). As can also been seen in Figure 2, in situ populations most similar to clade II reference genomes dominate in noncontiguous sampling locations, often bridged by differing populations of Synechococcus – e.g., site 33 in the Red Sea and site 141 just north of the Panama Canal (Fig. 2). Similarly, WH8102 of clade III is most prominent in the Mediterranean Sea and also detected in the western Atlantic, but not in between (though date of collection could also play a factor in this case; Fig. 2). As mentioned above, we utilized a 50% detection criterion to delineate whether a particular reference genome was representative of an in situ population or not. This is useful for characterizing the distributions of the reference genomes, but it is not ideal for characterizing and comparing in situ populations at fine-scale levels of resolution. To better achieve this goal, we combined metagenomics and pangenomics by utilizing the entire genetic complement of ~84,000 genes as identified herein to examine within- and across-sample fine-scale diversity based on normalized metagenomic read recruitment to the GCs identified by the pangenomic analysis (see Methods for details). We selected 6 sites for this targeted analysis that met the following criteria: 1) spanned various locations; 2) were largely dominated by clade II; and 3) whose genes had a median coverage near or greater than 100X. These sites included (labeled in Fig. 2): 141 just north of the Panama Canal; 140 just south of the Panama Canal; 57 in the southern Indian Ocean; 38 in the northern Indian Ocean; 33 in the Red Sea; and 124 in the Pacific (Fig. 2). 
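The normalized gene-cluster (GC) coverage used for this targeted analysis can be sketched as three steps taken from Materials and Methods: zero any gene coverage whose detection in a sample is below 50%, sum gene coverages by GC, and scale each sample's total GC coverage to 1 million. The sketch below is illustrative rather than the code used here: the function name `normalized_gc_coverage`, the nested-dictionary layout, and the example values are hypothetical.

```python
def normalized_gc_coverage(gene_cov, gene_det, gene_to_gc,
                           det_min=0.5, scale=1_000_000):
    """Return {sample: {gc: normalized coverage}}.

    gene_cov[gene][sample] -> coverage of the gene in the sample
    gene_det[gene][sample] -> fraction of the gene covered by reads
    gene_to_gc[gene]       -> gene-cluster identifier
    """
    per_sample = {}
    for gene, cov_by_sample in gene_cov.items():
        gc = gene_to_gc[gene]
        for sample, cov in cov_by_sample.items():
            # Coverage counts as zero when the gene fails the detection cutoff.
            if gene_det[gene].get(sample, 0.0) < det_min:
                cov = 0.0
            sample_table = per_sample.setdefault(sample, {})
            sample_table[gc] = sample_table.get(gc, 0.0) + cov

    normalized = {}
    for sample, table in per_sample.items():
        total = sum(table.values())
        if total == 0:
            normalized[sample] = dict(table)  # nothing recruited: keep zeros
        else:
            normalized[sample] = {gc: cov * scale / total
                                  for gc, cov in table.items()}
    return normalized


# Hypothetical toy data: three genes in two gene clusters, one sample.
gene_to_gc = {"g1": "GC_1", "g2": "GC_1", "g3": "GC_2"}
gene_cov = {"g1": {"s1": 30.0}, "g2": {"s1": 10.0}, "g3": {"s1": 70.0}}
gene_det = {"g1": {"s1": 1.0}, "g2": {"s1": 0.4}, "g3": {"s1": 0.9}}
gc_table = normalized_gc_coverage(gene_cov, gene_det, gene_to_gc)
```

Scaling every sample to the same total makes GC values comparable across samples with very different amounts of recruited Synechococcus reads, which is what allows the samples to be clustered against one another.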
Hierarchical clustering of normalized GC coverage values for these 6 sites revealed some sample populations were functionally and genetically more similar to distant samples than they were to geographically closer ones (Fig. 4; the outer 6 layers (rings) 54 show normalized coverage values for each GC for the 6 targeted samples arranged by their hierarchical clustering depicted at top-center). For instance, site 33 of the Red Sea (see Fig. 2) was found to be functionally more similar to site 140, just south of the Panama Canal, than it was to the spatially closer site 38 of the northern Indian Ocean (Figs. 2 and 4). Likewise, the southern Indian Ocean site 57 was found to be more similar to site 141, just north of the Panama Canal, than to the northern Indian Ocean site 38 (Figs. 2 and 4). To investigate what was causing these relationships, we identified some of the GCs that had the largest normalized-coverage standard deviations across these sites (Fig. 8). For reference, the normalized coverage values of petB and recA, both highly conserved single-copy genes, are also visualized (Fig. 8). These reference points, along with the median normalized coverage of the pangenome-defined core genes at each end, show what would be expected of a gene within the Synechococcus population of a sample that our reference library gives us access to – were that gene to exist in single copy in the entire population (see “Pan-Core median”, petB, and recA bars in Fig. 8). Further, values above this line suggest multiple copies may be present, and values below this line suggest that either not all members of the population possess a copy, and/or some possess a copy that is genetically too diverged to successfully recruit reads to the reference (Fig. 8). Overall, when considering all GCs, this provides a high-resolution genetic profile of the in situ Synechococcus populations (i.e. outer 6 layers of Fig. 4), and with this view (Fig. 
8) we can see some of what is causing spatially distant samples to cluster closer together. For example, the two most similar sites are the Red Sea’s site 33 and the eastern Pacific’s site 140 (Fig. 8, clustering on left), despite the Red Sea being spatially closest to the northern Indian Ocean’s site 38 (Fig. 2). One contributing factor to this is a fructose-bisphosphate aldolase (FBP-ALDO), as the Synechococcus population in site 38 of the Indian Ocean demonstrates recruitment characteristics that are in line with what would be expected if each member of the population contained a single copy (Fig. 8), whereas the Red Sea and Pacific sites demonstrate either the absence of the gene in a large fraction of the population, or genetic divergence in a large fraction of the population (Fig. 8). Some fraction of the population at site 38 also possesses a thiD detectable by our current reference library, while nothing recruits to thiD from the populations at sites 140 or 33 (Fig. 8).
Figure 8 | Select gene-cluster normalized coverages for 6 surface samples with relatively abundant Synechococcus populations mostly dominated by clade II. The “Pan-Core median” is the median normalized coverage of the 1,002 single-copy core genes as defined by the pangenome; “Pan-Core less 1SD” marks one standard deviation less than that median coverage value. Each column represents a gene cluster; a single functional annotation in some cases spans more than one gene cluster (e.g. ferredoxin). Hierarchical clustering on left is based on clustering of all GCs’ normalized coverages. Highlighted boxes are discussed in the main text.
Focusing on site 141 in the western Atlantic, which recruited the greatest proportion of reads out of all samples (Fig. 2), we can see some stark differences in the population there as compared to the other sites (Fig. 8). Boxes 1, 2, and 3 in Figure 8 highlight the unique Fe-related genetic potential at this site. 
The population there seems to be devoid of any flavodoxin genes (Fig. 8, box 3), which serve to help shuttle electrons in place of ferredoxin in times when Fe may be limiting (Karlusich and Carrillo 2017). This suggests the population that exists at site 141 may have evolved (and is maintained) under less Fe- limiting conditions than the other 5 sites. In alignment with this, there are virtually no ferritin genes detected at site 141 (Fig. 8; box 2) – genes involved with iron-storage. Box 4 in Figure 8 highlights an interesting example regarding phosphorus, as site 141 is the only one wherein the secreted phosphatase phoX is found, and there is a complete absence of recruitment to the broad-specificity phosphatase phoE (Fig. 8, box 4). This also serves as an example that particularly demonstrates the value of this approach; each of the 31 isolate genomes possesses at least 2 copies of the phoE gene (a total of 70 gene sequences from the pangenome; Table S5), yet there is no recruitment to any of these from site 141 (Fig. 8, box 4). To reiterate, this implies that the population at that site either does not possess the gene, or that it has diverged beyond the recruitment criteria. While it is impossible to resolve the difference between those two possibilities with the current data, both scenarios would be due to divergence within the genomic lineages comprising the in situ populations. Hypervariable regions as self-regulating mechanisms to maintain fine-scale diversity There has been a steadily increasing volume of work identifying glycosyltransferases (particularly those involved with cell-wall modification), transposases, and bacteriophage-related genes as the genetic elements that are often responsible for the variability that exists between closely related organisms. 
It has been shown that up to 65% of differences in genetic content within species designations is related to transposases and phage-related genes, suggesting they may help drive speciation (Konstantinidis and Tiedje 2005). A pangenomic analysis of 11 Synechococcus isolate genomes that were available at the time suggested that lateral gene transfer of genomic islands facilitated the niche diversification of closely related lineages, and that these islands were enriched in glycosyltransferases and other genes involved in cell wall modification and biosynthesis (Dufresne et al. 2008). Moreover, a recent single-cell genomics study that identified hundreds of co-occurring subpopulations of Prochlorococcus reported that much of the gene content variation between subpopulations consisted of genes in the glycosyltransferase family that could be involved in phage attachment (Kashtan et al. 2014). In support of this, an experiment that tracked the coevolution of a Synechococcus isolate with a virus identified key mutations in glycosyltransferases and the O-antigen of lipopolysaccharides that increased viral resistance in as few as 170 generations (Marston et al. 2012). Marston et al. also found that this rapid antagonistic coevolution generated multiple, stably coexisting phenotypes of Synechococcus and viral strains, and recognized this as one mechanism that may be contributing to the generation and maintenance of subpopulation diversity (Marston et al. 2012). In this same vein, the genomic islands containing these diverse components that can act as phage recognition targets are often accompanied by genes that contribute to niche specialization (Rodriguez-Valera and Ussery 2012). Given the self-regulating nature between virus and host that emerges from this system, and their frequency of lateral exchange, these hypervariable genomic islands have been considered evolutionary units of their own (Avrani et al. 2012). 
Furthermore, it has been argued that for closely related, stably coexisting subpopulations of prokaryotes, their pangenome and the pangenome of their consistently interacting phages together form an evolutionary unit that selection acts upon as a whole (Rodriguez-Valera and Ussery 2012). Our EAG analysis identified hypervariable genes based on their frequency of detection in environmental samples in which their source genome was detected. These were not randomly distributed throughout the genome, but rather were more often co-localized within genomic islands (Fig. S8), as has also been recently seen in a similar analysis of Prochlorococcus (O. Delmont and Eren 2018). Compared to the environmentally conserved genes, functional annotations of these hypervariable genomic islands were enriched in phage integrases, virulence-associated proteins, multidrug exporters, and glycosyltransferases (Table 2; Fig. S9; Table S5, marked EAGs). These findings support the above-discussed notions of these hypervariable genomic regions likely playing a role in the generation and maintenance of fine-scale diversity across the global ocean. Conclusion With this work we have introduced 12 Synechococcus isolate genomes, expanding the number of available marine Synechococcus isolate genomes by ~1/3. Five of these new genomes form a new sub-clade of clade II that are the most representative of highly abundant in situ populations in mid-latitude coastal areas. These isolates are being maintained and are openly available to the community for experimentation. Integrating pangenomics with large-scale metagenomics enables formulating and pursuing pangenomics-based questions that are guided by known environmental relevance. Here we identified a potentially unique evolutionary path of an environmentally abundant lineage of Synechococcus that demonstrates genomic streamlining without a corresponding reduction in GC content. 
This is in direct contrast to Synechococcus’s nearest relative Prochlorococcus, in which genomic streamlining is consistently associated with a reduction in GC content. Assessing nucleotide and amino acid usage in both genera as a whole, and contrasting metagenomic read recruitment from across the TARA Oceans transect, in the context of their phylogenomic relationships provides evidence that N bioavailability may have been a strong factor driving the speciation of Prochlorococcus from Synechococcus. We further were able to verify the recruitment of environmental DNA to many giant open-reading frames, previously shown to be involved in Synechococcus motility, revealing that these large coding sequences are maintained in their environment. We additionally demonstrated the utility of using a pangenomic gene-centric approach to characterize in situ populations, and were able to confirm that hypervariable genomic islands in situ are enriched in genes involved with cell wall modification, extracellular structures, and other cellular components susceptible to phage interaction – supporting theories that these mechanisms may play a key role in the generation and stability of multiple coexisting subpopulations in the global ocean. Altogether this work summarizes the characteristics and distributions of the currently available Synechococcus isolate genomes, revealing several insights regarding their ecology and evolution, and opens the door for more targeted experimentation with these globally distributed, biogeochemically important cyanobacteria.
Materials and Methods
Isolate source and genome sequencing
Details of the newly sequenced isolates from the current study, including isolation source and maintenance, are presented in Supplemental Table S1. Paired-end, 2x150 bp sequencing was performed by the Joint Genome Institute (JGI) on the Illumina HiSeq platform. 
Genomic sequencing processing, assembly, and bin refinement
All programs were run with default settings unless otherwise noted. Starting from the quality-filtered and decontaminated reads provided by JGI, reads for each genome were kmer-depth normalized with bbnorm (B. Bushnell; https://sourceforge.net/projects/bbmap/files/) and assembled with SPAdes (v3.11.1; Bankevich et al. 2012) with the --meta flag specified (as all cultures were enrichments), and with error correction and --careful mode turned on. For manual curation and elimination of contaminating contigs, assemblies and read coverages (generated via bowtie2; Langmead and Salzberg 2012) were input into anvi’o (Eren et al. 2015). Coding sequences were identified by Prodigal v.2.6.2 (Hyatt et al. 2010), and the program centrifuge (Kim et al. 2016) was used to taxonomically classify them. Percent completeness and redundancy were estimated based on gene copies identified by hidden Markov models of conserved single-copy genes (Rinke et al. 2013; Campbell et al. 2013). The interactive framework provided by anvi’o (Eren et al. 2015), including clustering of contigs based on tetranucleotide frequency and coverage, made it straightforward to identify and assess contigs assembled from the target cultivar.
Incorporated reference genomes
All currently available marine Synechococcus genomes were downloaded from NCBI (NCBI Resource Coordinators 2018) in November of 2017. To reduce cross-mapping between references, we collapsed redundant genomes at the arbitrary threshold of 98% average nucleotide identity (ANI) or greater and retained solely the larger genome. Therefore all incorporated genomes share less than 98% ANI. ANI was calculated with pyani (Pritchard et al. 2016). All included genomes and relevant information are presented in Table S1.
Functional annotation, pangenomics, and phylogenomics of all incorporated genomes
All incorporated genomes were processed through anvi’o (Eren et al. 
2015) as described above. All open reading frames were functionally annotated with NCBI’s Cluster of Orthologous Groups (COGs; Galperin et al. 2015) and Kyoto Encyclopedia of Genes and Genomes (KEGG; Kanehisa et al. 2016) Orthologs (KOs). All identified genes were clustered using the MCL algorithm (van Dongen and Abreu-Goodger 2012) within anvi’o to generate gene clusters (GCs) with the –mcl-inflation parameter set to 4. These GCs were used to identify the pangenomically defined core, accessory, and unique gene sets within anvi’o, as well as the 1,002 1:1 orthologs utilized in the phylogenomic tree. Amino acid sequences of each individual gene of these 1:1 orthologs were aligned with muscle (Edgar 2004), then the alignments were concatenated together and the maximum- likelihood tree was built with RAxML (Stamatakis 2014) set to 100 bootstraps. Construction of reference library and recruitment of environmental metagenomic data Nucleotide fasta files for the 31 incorporated reference genomes in total (Table S1) were used to generate a reference library with bowtie2 (Langmead and Salzberg 2012). Metagenomic short reads from 93 samples from the TARA Oceans project and 4 samples from the Costa Rica Upwelling Dome were downloaded, quality filtered with the iu- filter-quality-minoche program within the illumina-utils package (Eren et al. 2013), and recruited to the reference library with bowtie2 default settings (Langmead and Salzberg 2012). Table S2 contains all environmental information and accession information for all 97 of the samples. Identification of environmentally defined core and accessory genes Similar to the 50% detection requirement we employed for genomes (see main text, Fig. S3, and Supp. Note 1), we also employed a 50% detection criterion for genes. First, gene coverages and detections were exported from anvi’o, which parses bam files and internal gene-coordinate to create these. 
Then we generated a modified coverage table by setting the coverage of any gene in any sample to zero if the detection of that gene in that sample was less than 50%. We then summed the coverage for each gene across all samples in which that gene’s source genome was deemed representative (i.e., the source genome passed the 50% detection criterion), and then recovered the median summed coverage of all genes for each genome (leaving 31 median summed coverages, one for each genome). Genes were then deemed environmental-accessory genes (EAGs) if their summed coverage was less than 25% of the median summed coverage for all genes of their source genome; otherwise they were considered environmental-core genes (ECGs). The ECG/EAG layer in the pangenome figure (Fig. 4) shows the ratio of ECGs to EAGs among all the genes in each GC (column).

Gene-level analysis and GC normalized coverage

For individual genes in individual samples (like those presented in Fig. S8), we simply used the 50% detection criterion at the gene level to determine whether or not a gene was detected along with its corresponding genome. In order not to eliminate genes from consideration in the functional pangenomic approach, independent of source genome, we used a modified gene coverage table created as described above (any gene coverage value in any sample was set to 0 if that gene’s detection in that sample was less than 50%). We then collapsed (summed) gene coverages by the GCs identified by the pangenomic workflow described above, and then normalized each sample’s total GC coverage to 1 million. This normalized GC-coverage matrix was used for the outer 6 GC-coverage layers of Fig. 4, and for Fig. 9.

Data availability and reproducibility

Draft genomes of the 12 newly sequenced Synechococcus isolates have been deposited in NCBI’s Whole Genome Shotgun database under accession numbers: MBSNxxxxxxxx (pending).

Acknowledgements

MDL would like to thank Tom O. Delmont, Chris Dupont, and A. Murat Eren for their insight and expertise on many aspects of this work.
We also thank the anvi’o team for providing such a rich platform to the world and the TARA Oceans project for providing such a rich dataset to the world. References Ahlgren, Nathan A., and Gabrielle Rocap. 2012. “Diversity and Distribution of Marine Synechococcus: Multiple Gene Phylogenies for Consensus Classification and Development of qPCR Assays for Sensitive Measurement of Clades in the Ocean.” Frontiers in Microbiology 3 (JUN): 1–24. doi:10.3389/fmicb.2012.00213. Avrani, Sarit, Daniel Schwartz, and Debbie Lindell. 2012. “Virus-Host Swinging Party in the Oceans – Incorporating Biological Complexity into Paradigms of Antagonistic Coexistence.” Mobile Genetic Elements 2 (2): 88–95. doi:10.4161/mge.20031. Bankevich, Anton, Sergey Nurk, Dmitry Antipov, Alexey a. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, et al. 2012. “SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing.” Journal of Computational Biology 19 (5): 455–77. doi:10.1089/cmb.2012.0021. Campbell, James, Patrick O’Donoghue, Alisha Campbell, Patrick Schwientek, Alexander Sczyrba, Tanja Woyke, Dieter Soll, and Mircea Podar. 2013. “UGA Is an Additional Glycine Codon in Uncultured SR1 Bacteria from the Human Microbiota.” PNAS 110 (14): 5540–45. doi:10.1073/pnas.1303090110. Chase, Alexander, and Jennifer B H Martiny. 2018. “The Importance of Resolving Biogeographic Patterns of Microbial Microdiversity.” Microbiology Australia. doi:10.1071/MA18003. Coordinators, NCBI Resource. 2018. “Database Resources of the National Center for Biotechnology Information.” Nucleic Acids Res 46: D8–13. doi:10.1093/nar/gkx1095. Dufresne, Alexis, Martin Ostrowski, David J Scanlan, Laurence Garczarek, Sophie Mazard, Brian P Palenik, Ian T Paulsen, et al. 2008. “Unravelling the Genomic Mosaic of a Ubiquitous Genus of Marine Cyanobacteria.” Genome Biology 9 (5): R90. doi:10.1186/gb-2008-9-5-r90. Dvořák, Petr, Dale A. 
Casamatta, Aloisie Poulíčková, Petr Hašler, Vladan Ondřej, and Remo Sanges. 2014. “Synechococcus: 3 Billion Years of Global Dominance.” Molecular Ecology 23 (22): 5538–51. doi:10.1111/mec.12948. Edgar, Robert C. 2004. “MUSCLE: A Multiple Sequence Alignment Method with Reduced Time and Space Complexity.” BMC Bioinformatics 5: 113. doi:10.1186/1471-2105-5-113. Eren, A. Murat, Özcan C. Esen, Christopher Quince, Joseph H. Vineis, Hilary G. Morrison, Mitchell L. Sogin, and Tom O. Delmont. 2015. “Anvi’o: An Advanced Analysis and Visualization Platform for ‘omics Data.” PeerJ 3:e1319. Eren, A. Murat, Joseph H. Vineis, Hilary G. Morrison, and Mitchell L. Sogin. 2013. “A Filtering Method to Generate High Quality Short Reads Using Illumina Paired-End Technology.” PLOS One 8 (6). doi:10.1371/journal.pone.0066643. Farrant, Gregory K., Hugo Doré, Francisco M. Cornejo-Castillo, Frédéric Partensky, Morgane Ratin, Martin Ostrowski, Frances D. Pitt, et al. 2016. “Delineating Ecologically Significant Taxonomic Units from Global Patterns of Marine Picocyanobacteria.” Proceedings of the National Academy of Sciences 113 (24): E3365–74. doi:10.1073/pnas.1524865113. Flecha, Susana, Fiz Perez, Jesus Garcia-Lafuente, Simone Sammartino, Aida Rios, and Emma Huertas. 2015. “Trends of pH Decrease in the Mediterranean Sea through High Frequency Observational Data: Indication of Ocean Acidification in the Basin.” Scientific Reports 5:16770. Flombaum, P., J. Gallegos, R. Gordillo, J. Ricon, L. Zabala, N. Jiao, D. Karl, et al. 2013. “Present and Future Global Distributions of the Marine Cyanobacteria Prochlorococcus and Synechococcus.” PNAS 110 (24): 9824–29. Galperin, Michael Y., Kira S. Makarova, Yuri I. Wolf, and Eugene V. Koonin. 2015. “Expanded Microbial Genome Coverage and Improved Protein Family Annotation in the COG Database.” Nucleic Acids Research 43 (D1): D261–69. doi:10.1093/nar/gku1223. Giovannoni, Stephen J. 2017.
“SAR11 Bacteria: The Most Abundant Plankton in the Oceans.” Annual Review of Marine Science 9: 231–55. Giovannoni, Stephen J, J Cameron Thrash, and Ben Temperton. 2014. “Implications of Streamlining Theory for Microbial Ecology.” The ISME Journal 8 (8). Nature Publishing Group: 1–13. doi:10.1038/ismej.2014.60. Hyatt, Doug, Philip F. Locascio, Loren J. Hauser, and Edward C. Uberbacher. 2010. “Gene and Translation Initiation Site Prediction in Metagenomic Sequences.” Bioinformatics 28 (17): 2223–30. doi:10.1093/bioinformatics/bts429. Kanehisa, Minoru, Yoko Sato, Masayuki Kawashima, Miho Furumichi, and Mao Tanabe. 2016. “KEGG as a Reference Resource for Gene and Protein Annotation.” Nucleic Acids Research 44 (D1): D457–62. doi:10.1093/nar/gkv1070. Karlusich, Juan Jose Pierella, and Nestor Carrillo. 2017. “Evolution of the Acceptor Side of Photosystem I: Ferredoxin, Flavodoxin, and Ferredoxin-NADP+ Oxidoreductase.” Photosynth Res 134: 235–50. doi:10.1007/s11120-017-0338-2. Kashtan, Nadav, Sara E Roggensack, Sébastien Rodrigue, Jessie W Thompson, Steven J Biller, Allison Coe, Huiming Ding, et al. 2014. “Single-Cell Genomics Reveals Hundreds of Coexisting Subpopulations in Wild Prochlorococcus.” Science (New York, N.Y.) 344 (6182): 416–20. doi:10.1126/science.1248575. Kim, Daehwan, Li Song, Florian Breitwieser, and Steven L. Salzberg. 2016. “Centrifuge: Rapid and Sensitive Classification of Metagenomic Sequences.” Genome Research 26: 1721–29. doi:10.1101/gr.210641.116. Konstantinidis, Konstantinos T, and James M. Tiedje. 2005. “Genomic Insights That Advance the Species Definition for Prokaryotes.” PNAS 102 (7): 2567–72. doi:10.1073/pnas.0409727102. Langmead, Ben, and Steven L Salzberg. 2012. “Fast Gapped-Read Alignment with Bowtie 2.” Nat Methods 9 (4): 357–59. doi:10.1038/nmeth.1923. Marston, M., F. Pierciey, A. Shepard, G. Gearin, J. Qi, C. Yandava, S. Schuster, M. Henn, and J. Martiny. 2012.
“Rapid Diversification of Coevolving Marine Synechococcus and a Virus.” PNAS 109 (12): 4544–49. Mazard, Sophie, Martin Ostrowski, Frédéric Partensky, and David J. Scanlan. 2012. “Multi-Locus Sequence Analysis, Taxonomic Resolution and Biogeography of Marine Synechococcus.” Environmental Microbiology. doi:10.1111/j.1462-2920.2011.02514.x. McCarren, Jay, and B. Brahamsha. 2007. “SwmB, a 1.12-Megadalton Protein That Is Required for Nonflagellar Swimming Motility in Synechococcus.” Journal of Bacteriology, 1158–62. Delmont, Tom O., and A. Murat Eren. 2018. “Linking Pangenomes and Metagenomes: The Prochlorococcus Metapangenome.” PeerJ 6:e4320. Penno, Sigrid, Debbie Lindell, and Anton F. Post. 2006. “Diversity of Synechococcus and Prochlorococcus Populations Determined from DNA Sequences of the N-Regulatory Gene ntcA.” Environmental Microbiology. doi:10.1111/j.1462-2920.2006.01010.x. Powley, Helen, Michael Krom, and Philippe Van Cappellen. 2017. “Understanding the Unique Biogeochemistry of the Mediterranean Sea: Insights from a Coupled Phosphorus and Nitrogen Model.” Global Biogeochemical Cycles 31: 1010–31. Pritchard, Leighton, Rachel Glover, Sonia Humphris, John Elphinstone, and Ian Toth. 2016. “Genomics and Taxonomy in Diagnostics for Food Security: Soft-Rotting Enterobacterial Plant Pathogens.” Analytical Methods 8 (12). doi:10.1039/c5ay02550h. Ragot, Sabine, Michael Kertesz, and Else Bunemann. 2015. “phoD Alkaline Phosphatase Gene Diversity in Soil.” Appl Environ Microbiol 81: 7281–89. Rinke, Christian, Patrick Schwientek, Alexander Sczyrba, Natalia N Ivanova, Iain J Anderson, Jan-Fang Cheng, Aaron Darling, et al. 2013. “Insights into the Phylogeny and Coding Potential of Microbial Dark Matter.” Nature 499 (7459): 431–37. doi:10.1038/nature12352. Rodriguez-Valera, Francisco, and David Ussery. 2012. “Is the Pan-Genome Also a Pan-Selectome?” F1000 Research. doi:10.12688/f1000research.1-16.v1.
Scanlan, D J, M Ostrowski, S Mazard, A Dufresne, L Garczarek, W R Hess, A F Post, M Hagemann, I Paulsen, and F Partensky. 2009. “Ecological Genomics of Marine Picocyanobacteria.” Microbiology and Molecular Biology Reviews 73 (2): 249–99. doi:10.1128/MMBR.00035-08. Sohm, Jill A, Nathan A Ahlgren, Zachary J Thomson, Cheryl Williams, James W Moffett, Mak A Saito, Eric A Webb, and Gabrielle Rocap. 2015. “Co-Occurring Synechococcus Ecotypes Occupy Four Major Oceanic Regimes Defined by Temperature, Macronutrients and Iron.” The ISME Journal. Nature Publishing Group, 1–13. doi:10.1038/ismej.2015.115. Stamatakis, Alexandros. 2014. “RAxML Version 8: A Tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies.” Bioinformatics 30 (9): 1312–13. doi:10.1093/bioinformatics/btu033. Sun, Zhiyi, and Jeffrey L. Blanchard. 2014. “Strong Genome-Wide Selection Early in the Evolution of Prochlorococcus Resulted in a Reduced Genome through the Loss of a Large Number of Small Effect Genes.” PLoS ONE 9 (3). doi:10.1371/journal.pone.0088837. Sunagawa, Shinichi, Luis Pedro Coelho, Samuel Chaffron, Jens Roat Kultima, Karine Labadie, Guillem Salazar, Bardya Djahanschiri, et al. 2015. “Structure and Function of the Global Ocean Microbiome.” Science 348 (6237): 1261359. doi:10.1126/science.1261359. Toledo, Gerardo, and Brian Palenik. 1997. “Synechococcus Diversity in the California Current as Seen by RNA Polymerase (rpoC1) Gene Sequences of Isolated Strains.” Applied and Environmental Microbiology 63 (11): 4298–4303. van Dongen, S., and C. Abreu-Goodger. 2012. “Using MCL to Extract Clusters from Networks.” Methods Mol Biol 804: 281–95. doi:10.1007/978-1-61779-361-5_15.
Conclusion

Over the course of our planet’s tenure, cyanobacteria and their descendant plastids have caused mass extinctions, inverted the global redox state, and enabled Life to explore an explosion in metazoan diversity; one would be hard-pressed to overstate the role they have played in the coevolution of the Earth and its biosphere. Today, globally distributed, extant lineages of these organisms are still integral components in the cycling of essential elements such as oxygen, carbon, and nitrogen. As such, our continuing efforts to build a more nuanced and accurate understanding of our planet’s elemental cycling, and its subsequent climate modulation, require a more nuanced and accurate understanding of the diversity, ecology, and evolution of the biomolecular machines driving these processes. This work focused on two globally distributed, ecologically significant cyanobacteria, Synechococcus and the nitrogen-fixing Trichodesmium, and attempted to gain insight into their ecology by considering their entire assemblages as a whole. With regard to Trichodesmium, characterizing its associated microbial consortium across multiple laboratory-maintained, experimental, and environmental samples has revealed a highly conserved heterotrophic genomic lineage present among all samples (at a marker-gene and genomic level, and via DNA and RNA), genomic signatures of potential co-dependencies between host and epibionts, and distinct ecological niches of consistently co-occurring major taxa based on RNA expression – including potential roles in colony-level nitrogen cycling. With regard to Synechococcus, we introduced 12 new, marine isolate reference genomes (increasing the currently available total by ~1/3), integrated them phylogenomically and pangenomically with previously available references, and characterized their global distributions across ~100 environmental metagenomes largely sourced from the TARA Oceans project.
In addition to supporting the observed stable co-occurrence of multiple subpopulations at the genomic level, an analysis of hypervariable genes, identified based on their frequency of detection in the environment, revealed a disproportionate number of syntenic genomic islands enriched in integrases, multidrug exporters, and glycosyltransferases involved in outer membrane modification – providing much-needed environmental support to a steadily increasing volume of work identifying these genetic elements as likely playing a key role in the generation and maintenance of stably coexisting subpopulation diversity (largely intertwined with viral interactions). In both of these microbial spheres, their success in the environment is intimately tied to the interactions, redundancies, and variability contributed by their mixed populations – be it a mixed community at commonly used taxonomic levels such as with Trichodesmium’s associated consortium, or at finer levels of resolution as with co-occurring Synechococcus subpopulations. Though operating at different levels of resolution, these two systems overlap in that their assemblages as a whole form self-regulating, molecular networks that can demonstrate emergent properties and experience selective pressures.

References

Amin, S. A., M. S. Parker, and E. V. Armbrust. 2012. “Interactions between Diatoms and Bacteria.” Microbiology and Molecular Biology Reviews 76 (3): 667–84. doi:10.1128/MMBR.00007-12. Barcelos e Ramos, J., H. Biswas, K. G. Schulz, J. LaRoche, and U. Riebesell. 2007. “Effect of Rising Atmospheric Carbon Dioxide on the Marine Nitrogen Fixer Trichodesmium.” Global Biogeochemical Cycles 21 (April): 1–6. doi:10.1029/2006GB002898. Beliaev, Alexander S, Margie F Romine, Margrethe Serres, Hans C Bernstein, Bryan E Linggi, Lye M Markillie, Nancy G Isern, et al. 2014. “Inference of Interactions in Cyanobacterial–heterotrophic Co-Cultures via Transcriptome Sequencing.” The ISME Journal 8 (11).
Nature Publishing Group: 2243–55. doi:10.1038/ismej.2014.69. Bergman, B., Carpenter, E.J. 1991. “Nitrogenase Confined to Randomly Distributed Trichomes in the Marine Cyanobacterium Trichodesmium Thiebautii.” J. Phyc 27: 158–65. Bergman, Birgitta, Gustaf Sandh, Senjie Lin, John Larsson, and Edward J. Carpenter. 2013. “Trichodesmium - a Widespread Marine Cyanobacterium with Unusual Nitrogen Fixation Properties.” FEMS Microbiology Reviews 37 (3): 286–302. doi:10.1111/j.1574-6976.2012.00352.x. Bertrand, Erin M., and Andrew E. Allen. 2012. “Influence of Vitamin B Auxotrophy on Nitrogen Metabolism in Eukaryotic Phytoplankton.” Frontiers in Microbiology 3 (OCT): 1–16. doi:10.3389/fmicb.2012.00375. Bertrand, Erin M, John P McCrow, Ahmed Moustafa, Hong Zheng, Jeffrey B McQuaid, Tom O Delmont, Anton F Post, et al. 2015. “Phytoplankton-Bacterial Interactions Mediate Micronutrient Colimitation at the Coastal Antarctic Sea Ice Edge.” Proceedings of the National Academy of Sciences of the United States of America 112 (32): 9938–43. doi:10.1073/pnas.1501615112. Borstad, G.A. 1978. “Some Aspects of the Occurrence and Biology of Trichodesmium near Barbados.” McGill University. Canfield, Donald E, Alexander N Glazer, and Paul G Falkowski. 2010. “The Evolution and Future of Earth’s Nitrogen Cycle.” Science 330: 192–96. doi:10.1126/science.1186120. Capone, D G, and E J Carpenter. 1982. “Nitrogen Fixation in the Marine Environment.” Science (New York, N.Y.) 217 (4565). United States: 1140–42. doi:10.1126/science.217.4565.1140. Capone, Douglas G., James A. Burns, Joseph P. Montoya, Ajit Subramaniam, Claire Mahaffey, Troy Gunderson, Anthony F. Michaels, and Edward J. Carpenter. 2005. “Nitrogen Fixation by Trichodesmium Spp.: An Important Source of New Nitrogen to the Tropical and Subtropical North Atlantic Ocean.” Global Biogeochemical Cycles 19 (2): 1–17. doi:10.1029/2004GB002331. Capone, Douglas G., Jonathan P Zehr, Hans W Paerl, Birgitta Bergman, and Edward J Carpenter. 1997. 
“Trichodesmium, a Globally Significant Marine Cyanobacterium.” Science 276 (5316): 1221–29. doi:10.1126/science.276.5316.1221. 65 Collins, Matthew, Reto Knutti, J.-L. Dufresne, Thierry Fichefet, Pierre Friedlingstein, Xuejie Gao, Willim J. Gutowski, et al. 2014. “Long-Term Climate Change: Projections, Commitments and Irreversibility.” Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Di Rienzi, Sara C., Itai Sharon, Kelly C. Wrighton, Omry Koren, Laura A. Hug, Brian C. Thomas, Julia K. Goodrich, et al. 2013. “The Human Gut and Groundwater Harbor Non-Photosynthetic Bacteria Belonging to a New Candidate Phylum Sibling to Cyanobacteria.” eLife. doi:10.7554/eLife.01102.001. Falkowski, Paul G, and Linda V Godfrey. 2008. “Electrons, Life and the Evolution of Earth’s Oxygen Cycle.” Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 363 (1504): 2705–16. doi:10.1098/rstb.2008.0054. Farrant, Gregory K., Hugo Doré, Francisco M. Cornejo-Castillo, Frédéric Partensky, Morgane Ratin, Martin Ostrowski, Frances D. Pitt, et al. 2016. “Delineating Ecologically Significant Taxonomic Units from Global Patterns of Marine Picocyanobacteria.” Proceedings of the National Academy of Sciences 113 (24): E3365–74. doi:10.1073/pnas.1524865113. Fisher, Madeline M, Lee W Wilcox, and Linda E Graham. 1998. “Molecular Characterization of Epiphytic Bacterial Communities on Charophycean Green Algae.” Microbial Ecology 64 (11): 4384–89. Field, C.B., Behrenfeld, M.J., Randerson, J.T., and Falkowski, P. 1998. "Primary production of the biosphere: Integrating terrestrial and oceanic components." Science. 281: 237–240. doi: 10.1126/science.281.5374.237. Flombaum, P., J. Gallegos, R. Gordillo, J. Ricon, L. Zabala, N. Jiao, D. Karl, et al. 2013. 
“Present and Future Global Distributions of the Marine Cyanobacteria Prochlorococcus and Synechococcus.” PNAS 110 (24): 9824–29. Fu, Fei-Xue, Margaret R. Mulholland, Nathan S. Garcia, Aaron Beck, Peter W. Bernhardt, Mark E. Warner, Sergio A. Sañudo-Wilhelmy, and David A. Hutchins. 2008. “Interactions between Changing pCO2, N2 Fixation, and Fe Limitation in the Marine Unicellular Cyanobacterium Crocosphaera.” Limnology and Oceanography 53 (6): 2472–84. doi:10.4319/lo.2008.53.6.2472. Gao, Kunshan, Juntian Xu, Guang Gao, Yahe Li, David A. Hutchins, Bangqin Huang, Lei Wang, et al. 2012. “Rising CO2 and Increased Light Exposure Synergistically Reduce Marine Primary Productivity.” Nature Climate Change 2 (7). Nature Publishing Group: 519–23. doi:10.1038/nclimate1507. Garcia, Nathan S, Feixue Fu, Peter N Sedwick, and David A Hutchins. 2015. “Iron Deficiency Increases Growth and Nitrogen-Fixation Rates of Phosphorus- Deficient Marine Cyanobacteria.” The ISME Journal 9 (1). Nature Publishing Group: 238–45. doi:10.1038/ismej.2014.104. Gradoville, Mary R., Angelicque E. White, and Ricardo M. Letelier. 2014. “Physiological Response of Crocosphaera Watsonii to Enhanced and Fluctuating Carbon Dioxide Conditions.” PLoS ONE 9 (10): e110660. doi:10.1371/journal.pone.0110660. Hardin, G. 1960. “The Competitive Exclusion Principle.” Science. doi:10.1126/science.131.3409.1292. Hewson, Ian, Rachel S Poretsky, Roxanne A Beinart, Angelicque E White, Tuo Shi, 66 Shellie R Bench, Pia H Moisander, et al. 2009. “In Situ Transcriptomic Analysis of the Globally Important Keystone N2-Fixing Taxon Crocosphaera Watsonii.” The ISME Journal 3 (5). Nature Publishing Group: 618–31. doi:10.1038/ismej.2009.8. Hmelo, Laura R. 2002. “Microbial Interactions Associated with Biofilms Attached to Trichodesmium Spp. and Detrital Particles in the Ocean.” Thesis. Hmelo, Lr R., Bas A S Van Mooy, and Tj J. Mincer. 2012. 
“Characterization of Bacterial Epibionts on the Cyanobacterium Trichodesmium.” Aquatic Microbial Ecology 67 (1): 1–14. doi:10.3354/ame01571. Hutchins, D. A., F.-X. Fu, Y. Zhang, M. E. Warner, Y. Feng, K. Portune, P. W. Bernhardt, and M. R. Mulholland. 2007. “CO2 Control of Trichodesmium N2 Fixation, Photosynthesis, Growth Rates, and Elemental Ratios: Implications for Past, Present, and Future Ocean Biogeochemistry.” Limnology and Oceanography 52 (4): 1293–1304. doi:10.4319/lo.2007.52.4.1293. Hutchins, David A., Fei-Xue Fu, Eric A. Webb, Nathan Walworth, and Alessandro Tagliabue. 2013. “Taxon-Specific Response of Marine Nitrogen Fixers to Elevated Carbon Dioxide Concentrations.” Nature Geoscience 6 (9). Nature Publishing Group: 790–95. doi:10.1038/ngeo1858. Hutchins, David A, Nathan G Walworth, Eric A Webb, Mak A Saito, Dawn Moran, Matthew R. McIlvin, Jasmine Gale, and Fei-xue Fu. 2015. “Irreversibly Increased Nitrogen Fixation in Trichodesmium Experimentally Adapted to Elevated Carbon Dioxide.” Nature Communications 6. Nature Publishing Group: 1–7. doi:10.1038/ncomms9155. Hutchinson, G.E. 1941. “Ecological Aspects of Succession in Natural Populations.” The American Naturalist 75: 406–418. ———. 1961. “The Paradox of the Plankton.” The American Naturalist 95 (882). Kashtan, Nadav, Sara E Roggensack, Sébastien Rodrigue, Jessie W Thompson, Steven J Biller, Allison Coe, Huiming Ding, et al. 2014. “Single-Cell Genomics Reveals Hundreds of Coexisting Subpopulations in Wild Prochlorococcus.” Science (New York, N.Y.) 344 (6182): 416–20. doi:10.1126/science.1248575. Kranz, Sven A., Dieter Sültemeyer, Klaus-Uwe Richter, and Björn Rost. 2009. “Carbon Acquisition by Trichodesmium: The Effect of pCO2 and Diurnal Changes.” Limnology and Oceanography 54 (2): 548–59. doi:10.4319/lo.2009.54.2.0548. Kranz, Sven A, Orly Levitan, Klaus-Uwe Richter, Ondrej Prásil, Ilana Berman-Frank, and Björn Rost. 2010. 
“Combined Effects of CO2 and Light on the N2-Fixing Cyanobacterium Trichodesmium IMS101: Physiological Responses.” Plant Physiology 154 (September): 334–45. doi:10.1104/pp.110.159145. Law, Cliff S., Eike Breitbarth, Linn J. Hoffmann, Christina M. McGraw, Rebecca J. Langlois, Julie LaRoche, Andrew Marriner, and Karl A. Safi. 2012. “No Stimulation of Nitrogen Fixation by Non-Filamentous Diazotrophs under Elevated CO2 in the South Pacific.” Global Change Biology 18: 3004–14. doi:10.1111/j.1365-2486.2012.02777.x. Lee, Michael D., Nathan G. Walworth, Erin L. McParland, Fei Xue Fu, Tracy J. Mincer, Naomi M. Levine, David A. Hutchins, and Eric A. Webb. 2017. “The Trichodesmium Consortium: Conserved Heterotrophic Co-Occurrence and Genomic Signatures of Potential Interactions.” ISME Journal 11 (8). Nature 67 Publishing Group: 1813–24. doi:10.1038/ismej.2017.49. Lee, Michael D., Eric A. Webb, Nathan G. Walworth, Fei-Xue Fu, Noelle A. Held, Mak A. Saito, and David A. Hutchins. 2017. “Transcriptional Activities of the Microbial Consortium Living with the Marine Nitrogen-Fixing Cyanobacterium Trichodesmium Reveal Potential Roles in Community-Level Nitrogen Cycling.” Applied and Environmental Microbiology 84 (October): AEM.02026-17. doi:10.1128/AEM.02026-17. Levitan, O., G. Rosenberg, I. Setlik, E. Setlikova, J. Grigel, J. Klepetar, O. Prasil, and Ilana Berman-Frank. 2007. “Elevated CO2 Enhances Nitrogen Fixation and Growth in the Marine Cyanobacterium Trichodesmium.” Global Change Biology 13: 531–38. doi:10.1111/j.1365-2486.2006.01314.x. Lukjancenko, Oksana, Trudy M. Wassenaar, and David W. Ussery. 2010. “Comparison of 61 Sequenced Escherichia Coli Genomes.” Microbial Ecology 60 (4): 708–20. doi:10.1007/s00248-010-9717-3. Lyons, Timothy W., Christopher T. Reinhard, and Noah J. Planavsky. 2014. “The Rise of Oxygen in Earth’s Early Ocean and Atmosphere.” Nature. doi:10.1038/nature13068. Marston, M., F. Pierciey, A. Shepard, G. Gearin, J. Qi, C. Yandava, S. Schuster, M. 
Henn, and J. Martiny. 2012. “Rapid Diversification of Coevolving Marine Synechococcus and a Virus.” PNAS 109 (12): 4544–49. Morris, J. Jeffrey, Robin Kirkegaard, Martin J. Szul, Zackary I. Johnson, and Erik R. Zinser. 2008. “Facilitation of Robust Growth of Prochlorococcus Colonies and Dilute Liquid Cultures By ‘helper’ heterotrophic Bacteria.” Applied and Environmental Microbiology 74 (14): 4530–34. doi:10.1128/AEM.02479-07. Morris, Jeffrey J., Richard E. Lenski, and Erik R. Zinser. 2012. “The Black Queen Hypothesis : Evolution of Dependencies through Adaptative Gene Loss.” Mbio. doi:10.1128/mBio.00036-12.Copyright. Mulholland, Margaret R., Peter W. Bernhardt, Cynthia a. Heil, Deborah a. Bronk, and Judith M. O’ Neil. 2006. “Nitrogen Fixation and Release of Fixed Nitrogen by Trichodesmium Spp. in the Gulf of Mexico.” Limnology and Oceanography 51 (4): 1762–76. doi:10.4319/lo.2006.51.4.1762. Nausch, M. 1996. “Microbial Activities on Trichodesmium Colonies.” Marine Ecology Progress Series 141: 173–81. doi:10.3354/meps141173. O. Delmont, Tom, and A. Murat Eren. 2018. “Linking Pangenomes and Metagenomes: The Prochlorococcus Metapangenome.” PeerJ 6:e4320. O’Neil, J M, and M R Roman. 1992. “Grazers and Associated Organisms of Trichodesmium.” Marine Pelagic Cyanobacteria: Trichodesmium and Other Diazotrophs 362: 61–73. doi:10.1007/978-94-015-7977-3_5. Paerl, H.W., and J.L. Pinckney. 1996. “A Mini-Review of Microbial Consortia: Their Roles in Aquatic Production and Biogeochemical Cycling.” Microbial Ecology 31 (3): 225–47. doi:10.1007/BF00171569. Paerl, HW, BM Bebout, and LE Prufert. 1989. “Bacterial Associations with Marine Oscillatoria Sp. (Trichodesmium Sp.) Populations: Ecophysiological Implications.” Journal of Phycology. Rodriguez-Valera, Francisco, and David Ussery. 2012. “Is the Pan-Genome Also a Pan-Selectome?” F1000 Research. doi:10.12688/f1000research.1-16.v1. 68 Rost, Björn, Ingrid Zondervan, and Dieter Wolf-Gladrow. 2008. 
“Sensitivity of Phytoplankton to Future Changes in Ocean Carbonate Chemistry: Current Knowledge, Contradictions and Research Directions.” Marine Ecology Progress Series 373: 227–37. doi:10.3354/meps07776. Sapp, Melanie, Anne S. Schwaderer, Karen H. Wiltshire, Hans Georg Hoppe, Gunnar Gerdts, and Antje Wichels. 2007. “Species-Specific Bacterial Communities in the Phycosphere of Microalgae?” Microbial Ecology 53 (4): 683–99. doi:10.1007/s00248-006-9162-5. Sheridan, C. C., D. K. Steinberg, and G. W. Kling. 2002. “The Microbial and Metazoan Community Associated with Colonies of Trichodesmium Spp.: A Quantitative Survey.” Journal of Plankton Research 24 (9): 913–22. doi:10.1093/plankt/24.9.913. Siddiqui, P. J. A., B. Bergman, and E. J. Carpenter. 1992. “Filamentous Cyanobacterial Associates of the Marine Planktonic Cyanobacterium Trichodesmium.” Phycologia 31: 326–37. Sison-Mangus, Marilou P, Sunny Jiang, Kevin N Tran, and Raphael M Kudela. 2014. “Host-Specific Adaptation Governs the Interaction of the Marine Diatom, Pseudo-Nitzschia and Their Microbiota.” The ISME Journal 8 (1). Nature Publishing Group: 63–76. doi:10.1038/ismej.2013.138. Sohm, Jill A, Nathan A Ahlgren, Zachary J Thomson, Cheryl Williams, James W Moffett, Mak A Saito, Eric A Webb, and Gabrielle Rocap. 2015. “Co-Occurring Synechococcus Ecotypes Occupy Four Major Oceanic Regimes Defined by Temperature, Macronutrients and Iron.” The ISME Journal. Nature Publishing Group, 1–13. doi:10.1038/ismej.2015.115. Stevenson, Bradley S., and John B. Waterbury. 2006. “Isolation and Identification of an Epibiotic Bacterium Associated with Heterocystous Anabaena Cells.” Biological Bulletin 210 (2): 73–77. Sun, Zhiyi, and Jeffrey L. Blanchard. 2014. “Strong Genome-Wide Selection Early in the Evolution of Prochlorococcus Resulted in a Reduced Genome through the Loss of a Large Number of Small Effect Genes.” PLoS ONE 9 (3).
doi:10.1371/journal.pone.0088837. Walworth, Nathan G, Fei-Xue Fu, Eric A Webb, Mak A Saito, Dawn Moran, Matthew R McIlvin, Michael D Lee, and David A Hutchins. 2016. “Mechanisms of Increased Trichodesmium Fitness under Iron and Phosphorus Co-Limitation in the Present and Future Ocean.” Nat Commun 7 (May). Nature Publishing Group: 1–11. doi:10.1038/ncomms12081. Waterbury, John B. 2006. “The Cyanobacteria - Isolation, Purification and Identification.” Prokaryotes 4: 1053–73. doi:10.1007/0-387-30744-3_38. Zehr, J P. 1995. “Nitrogen Fixation in the Sea: Why Only Trichodesmium?” Molecular Ecology of Aquatic Microbes, 335–64.

Supplemental material

Figure S1 | Phylogenomic tree of the 31 analyzed Synechococcus genomes and Gloeobacter violaceus. The tree is maximum-likelihood (100 bootstraps) based on amino acid sequence alignments of 343 orthologs present in single copy in each of the 31 incorporated Synechococcus reference genomes and G. violaceus.

Figure S2 | Locations of all samples included in the current study. Color indicates the time of year the sample was collected. All locations had surface samples; those with underlines also had deep-chlorophyll max samples.
[Figure S2 map: axes are longitude (long) and latitude (lat); points are labeled with sample station identifiers and colored by time of year sampled (Dec/Jan/Feb, Mar/Apr/May, Jun/Jul/Aug, Sep/Oct/Nov); an underline indicates a bottom water sample was also collected.]

Supplemental Note 1

Mapping short reads to a reference

Mapping millions of metagenomic short reads to a reference will inevitably yield some degree of non-specific mapping, such as recruitment of reads to highly conserved regions even when the source genome of the recruited reads is otherwise very distinct from the reference they are recruiting to. Because of this, coverage by itself is not sufficient for determining whether a reference is representative of an in situ community (whether the reference is an entire genome, a gene, etc.; see Fig. S3 for a visual example). As such, we utilized an arbitrary “detection” threshold of 50% or greater in order to declare whether a reference genome or gene should be considered “representative of the in situ Synechococcus community” of a given sample, meaning a reference genome would need to recruit reads to at least 50% of its base pairs in order to be considered representative of the in situ Synechococcus community for a given sample. Utilizing this threshold provides a much more robust metric of how representative a particular reference genome is of the Synechococcus members that exist in these metagenomic datasets than coverage alone. For instance, genome RS9917 at most recruited reads to only 13% of its genome, and was therefore not considered representative of the Synechococcus community in any of the 97 metagenomes (Fig. 1C), whereas genome N32 in clade II had more than 50% of its genome detected in 38 of the samples (Fig.
1C; all detection values for each genome and sample are presented in Table S3) and was the most abundant of all 31 incorporated reference genomes (as defined in the main text, “abundance” is used here in terms of relative reads recruited; Fig. 1D; all relative abundance values are presented in Table S4).

Figure S3 | Example showing why coverage alone can be misleading when mapping short reads to a reference sequence, and how requiring a “detection” criterion helps. The x-axis represents the span of ~900 base pairs of Gene X. The y-axis represents the coverage from Sample Y and Sample Z. Here, the “coverage” of Gene X in both samples may be about 75. However, visually inspecting the recruitments shows that only one conserved area of Gene X is actually recruiting any reads from Sample Z, while the rest of the gene is not recruiting anything. In this case, Gene X would be deemed present in Sample Y, but not in Sample Z.

[Figure S4 maps: axes are longitude (long) and latitude (lat); legends are % recruited of total reads (0.01, 3, 5, 9) and Genome (N32, UW86, N26, N19, N5, CC9605, WH8109).]

Figure S4 | Scaled plots of only clade II isolate genome recruitment. Recruitment to clade II reference genomes was not evenly distributed across the 7 isolate genomes. The newly contributed N32, UW86, N26, N19, and N5 had clearly different distributions than CC9605 and WH8109.

Figure S5 | Distributions of dominant clades from 32 deep chlorophyll maximum ocean samples. Pie sizes are scaled to represent the percent of total sample reads that were recruited to the 31 reference genomes, serving roughly as a metric of how abundant the Synechococcus population was in each sample (though this is constrained by the reference genomes). Pies are colored by the proportion of reads recruited to each of the dominant clades in each sample. Supp. Fig. S2 is the same map with corresponding identifiers shown, which are also presented in Supp. Table S2 with further sample data.
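The detection criterion described in Supplemental Note 1 and illustrated in Figure S3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the per-base depth lists, and the synthetic values for Gene X are hypothetical and were chosen only to mimic the figure. Only the definition of detection (proportion of the reference covered at least 1X, as in Table S3) and the 50% threshold come from the text.

```python
from typing import Sequence


def mean_coverage(depths: Sequence[int]) -> float:
    """Average read depth across every position of the reference."""
    return sum(depths) / len(depths)


def detection(depths: Sequence[int], min_depth: int = 1) -> float:
    """Proportion of reference positions covered at least `min_depth` times."""
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


def is_representative(depths: Sequence[int], threshold: float = 0.5) -> bool:
    """Apply the detection threshold (50% or greater, as stated in the note)."""
    return detection(depths) >= threshold


# Hypothetical per-base depths for a ~900 bp "Gene X" (values are invented).
GENE_LENGTH = 900
# Sample Y: reads recruited evenly along the whole gene.
sample_y = [75] * GENE_LENGTH
# Sample Z: reads pile up on one conserved 100 bp stretch, nothing elsewhere.
sample_z = [0] * 400 + [675] * 100 + [0] * 400

for name, depths in (("Sample Y", sample_y), ("Sample Z", sample_z)):
    print(
        f"{name}: mean coverage = {mean_coverage(depths):.1f}, "
        f"detection = {detection(depths):.2f}, "
        f"representative = {is_representative(depths)}"
    )
```

Both synthetic samples have the same mean coverage, yet only Sample Y passes the detection threshold, which is the situation the figure describes. In practice the per-base depths would come from a read-mapping tool rather than hand-built lists.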
Figure S6 | Visualized read recruitment to WH8102’s narK gene, which is frame-shifted by a nonsense insertion that causes a stop codon midway through. Coverage in 3 samples with recruitment to WH8102’s reference genome shows a clear drop in recruitment at the location of the split gene.

Figure S7 | Plot of the top 10 longest open-reading frames identified in each isolate genome.

Figure S8 | Gene presence/absence of isolate genomes in 6 samples in which they were representative of a relatively large portion of the recovered Synechococcus population. The x-axis holds a single column for each of the identified genes in each genome. For the top 6 labeled sample rows of each, a dark blue fill indicates that the gene was detected in that sample, and light blue indicates it was not detected in that sample. The ECG/EAG row is the ratio of each across all samples in which the genome was detected, with green indicating ECG and red indicating EAG. The contig row specifies where there are breaks in the contigs of the reference genome. The gene-level GC scatter plots each individual gene’s divergence from the genome’s median GC (Table S1).

Figure S9 | Functional summaries of “core” and “accessory” genes identified based on their detection in the environment.

Figure S10 | Percentages of encoded amino acids based on 31 Prochlorococcus and 31 Synechococcus isolate genomes.

Supplemental Tables Index
Table S1 Genome info
Table S2 Environmental sample information
Table S3 Genome detections
Table S4 Genome relative abundances
Table S5 Genes table
Table S6 Pangenome COG category summaries

Table S1 - Genome info
*percent completion and redundancy estimates based on 139 single-copy genes from Campbell et al.
2013
**number of genes as identified by Prodigal v2.6.2
*** as defined by proportion of recruited reads (after filtering by detection - see main text)

Genome    Clade  Total length (bp)  # contigs/scaffolds  N50      % GC   % Complete*  % Redundancy*  # genes**  Coding fraction
KORDI52   WPC2   2572069            1                    2572069  59.07  99.28        0.72           2771       0.883
CC9605    II     2510659            1                    2510659  59.20  100.00       0.72           2889       0.885
WH8109    II     2111515            1                    2111515  60.07  100.00       0.72           2403       0.895
N32       II     2210892            17                   552645   60.79  99.28        0.72           2530       0.911
N5        II     2089511            14                   531783   60.80  100.00       0.72           2440       0.906
UW86      II     2091288            18                   548326   60.79  99.28        1.44           2423       0.915
N26       II     2126898            13                   362552   60.49  99.28        0.72           2450       0.905
N19       II     2163106            18                   189897   60.51  99.28        0.72           2500       0.902
UW106     XV     2479535            15                   318215   57.86  100.00       0.72           2715       0.888
UW69      XV     2372121            15                   637127   58.41  100.00       0.72           2595       0.897
BL107     IV     2285034            1                    2285034  54.23  99.28        1.44           2522       0.907
CC9902    IV     2234828            1                    2234828  54.17  100.00       1.44           2451       0.912
WH8102    III    2434428            1                    2434428  59.53  99.28        2.16           2705       0.909
KORDI100  UC-A   2789000            1                    2789000  57.51  100.00       0.72           2940       0.882
CC9616    UC-A   2645910            1                    2645910  56.54  99.28        1.44           2809       0.889
KORDI49   WPC1   2585813            1                    2585813  61.46  98.56        0.72           2681       0.932
CC9311    I      2606748            1                    2606748  52.42  99.28        1.44           2857       0.879
WH8020    I      2661166            1                    2661166  53.11  99.28        1.44           2901       0.884
UW179B    I      2526282            26                   167461   54.07  97.84        2.16           2726       0.904
WH8016    I      2693328            14                   347852   53.77  97.84        1.44           3000       0.897
WH7803    V      2366980            1                    2366980  60.25  99.28        1.44           2553       0.936
WH7805    VI     2627046            3                    2621166  57.42  99.28        0.72           2709       0.907
GEYO      CRD1   2343605            13                   252868   56.48  97.12        0.72           2656       0.899
MIT9508   CRD1   2502434            23                   167605   56.00  97.84        0.72           2837       0.890
UW179A    CRD1   3054081            33                   283750   54.43  98.56        0.72           3377       0.867
MIT9509   CRD1   3087293            34                   136058   55.39  97.84        2.16           3495       0.851
RS9917    VIII   2584918            1                    2584918  64.43  100.00       0.72           2741       0.943
RS9916    IX     2664873            1                    2664873  59.77  100.00       0.72           2794       0.902
UW105     XVI    2659417            31                   151891   57.43  100.00       0.72           2872       0.882
UW140     XVI    2704142            35                   116698   57.58  99.28        0.72           2892       0.889
RCC307    X      2224914            1                    2224914  60.84  100.00       1.44           2550       0.935
Bolded/underlined indicates new genome contributed from the current study

Table S1 - Genome info (continued; rows in the same genome order as above)
Genome    Longest ORF gene_ID (bps)  overall relative abundance***  isolate source                              Accession
KORDI52   17660 (7140)               0.40                           Pacific, East China Sea                     GCF_000737595
CC9605    3181 (4602)                5.19                           Pacific, California coast                   GCF_000012625
WH8109    66310 (6990)               4.35                           Sargasso Sea                                GCF_000161795
N32       25925 (4716)               13.27                          Pacific, Costa Rica upwelling dome (CRUD)
N5        27772 (4602)               10.41                          Pacific, Costa Rica upwelling dome (CRUD)
UW86      54794 (4602)               12.59
N26       24148 (4602)               11.60                          Pacific, Costa Rica upwelling dome (CRUD)
N19       20322 (4602)               10.67                          Pacific, Costa Rica upwelling dome (CRUD)
UW106     41062 (4971)               0.57
UW69      53807 (6732)               0.96
BL107     69282 (10575)              3.36                           Mediterranean Sea, Blanes Bay               GCF_000153805
CC9902    6493 (10515)               3.14                           Pacific, California coast                   GCF_000012505
WH8102    63052 (32376)              5.30                           Atlantic, tropical                          GCF_000195975
KORDI100  12510 (10386)              0.20                           Pacific, tropical                           GCF_000737535
CC9616    70517 (11619)              0.39                           Pacific, California coast                   GCF_000515235
KORDI49   16030 (33801)              1.50                           Pacific, East China Sea                     GCF_000737575
CC9311    155 (6537)                 0.58                           California current, Pacific coastal         GCF_000014585
WH8020    82052 (5955)               0.56                           Sargasso Sea                                GCF_001040845
UW179B    50499 (6957)               0.07
WH8016    80277 (5961)               0.08                           Atlantic, coastal                           GCF_000230675
WH7803    59213 (6627)               0.10                           Sargasso Sea                                GCF_000063505
WH7805    61333 (24390)              0.09                           Sargasso Sea                                GCF_000153285
GEYO      8341 (4599)                5.53
MIT9508   73086 (6528)               4.40                           Pacific, eastern tropical, CRUD             GCF_001632165
UW179A    47455 (16305)              0.30
MIT9509   76453 (9324)               0.67                           Pacific, eastern tropical, CRUD             GCF_001631935
RS9917    34755 (84537)              0.00                           Red Sea, Gulf of Aqaba                      GCF_000153065
RS9916    34142 (28353)              0.15                           Red Sea, Gulf of Aqaba                      GCF_000153825
UW105     39404 (19716)              0.43
UW140     45129 (21648)              0.09
RCC307    30054 (5748)               3.03                           Mediterranean Sea                           GCF_000063525

Table S2 Sample information
Sample_ID map_id (fig.
2) Mean_latitude Mean_longitude BioSample SRA_sample ANE_004_05M 4 36.55 -6.57 SAMEA2619376 ERS487899 ANE_004_40M 4 36.57 -6.54 SAMEA2619399 ERS487936 MED_018_05M 18 35.76 14.26 SAMEA2619667 ERS488330 MED_018_60M 18 35.76 14.25 SAMEA2619678 ERS488346 MED_023_05M 23 42.21 17.71 SAMEA2591084 ERS477979 MED_025_05M 25 39.39 19.39 SAMEA2619766 ERS488486 MED_025_50M 25 39.41 19.38 SAMEA2619782 ERS488509 MED_030_05M 30 33.92 32.89 SAMEA2591108 ERS478017 MED_030_70M 30 33.93 32.77 SAMEA2591122 ERS478040 RED_031_05M 31 27.16 34.83 SAMEA2619802 ERS488545 RED_032_05M 32 23.36 37.22 SAMEA2619818 ERS488569 RED_032_80M 32 23.38 37.29 SAMEA2619840 ERS488599 RED_033_05M 33 21.98 38.24 SAMEA2619857 ERS488621 RED_034_05M 34 18.4 39.88 SAMEA2619879 ERS488649 RED_034_60M 34 18.4 39.86 SAMEA2619907 ERS488685 ION_036_05M 36 20.82 63.5 SAMEA2619927 ERS488714 ION_036_17M 36 20.82 63.51 SAMEA2619952 ERS488747 ION_038_05M 38 19.04 64.49 SAMEA2620000 ERS488799 ION_038_25M 38 19.03 64.62 SAMEA2620021 ERS488830 ION_039_25M 39 18.57 66.49 SAMEA2620081 ERS488916 ION_041_05M 41 14.6 69.98 SAMEA2620194 ERS489043 ION_041_60M 41 14.56 70.01 SAMEA2620217 ERS489074 ION_042_05M 42 6.03 73.89 SAMEA2620230 ERS489087 ION_042_80M 42 6 73.9 SAMEA2620259 ERS489134 ION_045_05M 45 0 71.64 SAMEA2620339 ERS489236 84 ION_048_05M 48 -9.39 66.42 SAMEA2620404 ERS489315 IOS_052_05M 52 -16.96 53.99 SAMEA2620542 ERS489529 IOS_052_75M 52 -16.96 54.01 SAMEA2620570 ERS489585 IOS_056_05M 56 -15.34 43.29 SAMEA2620651 ERS489712 IOS_057_05M 57 -17.02 42.74 SAMEA2620672 ERS489733 IOS_058_66M 58 -17.32 42.29 SAMEA2620734 ERS489846 IOS_062_05M 62 -21.38 39.56 SAMEA2620756 ERS489877 IOS_064_05M 64 -29.5 37.99 SAMEA2620786 ERS489917 IOS_064_65M 64 -29.54 37.94 SAMEA2620828 ERS490002 IOS_065_05M 65 -35.19 26.29 SAMEA2620855 ERS490029 IOS_065_30M 65 -35.25 26.31 SAMEA2620890 ERS490085 ASE_066_05M 66 -34.94 17.94 SAMEA2620929 ERS490124 ASE_066_30M 66 -34.89 18.07 SAMEA2620950 ERS490163 ASE_067_05M 67 -32.23 17.71 SAMEA2620970 
ERS490183 ASE_068_05M 68 -31.03 4.69 SAMEA2621013 ERS490265 ASE_068_50M 68 -31.03 4.69 SAMEA2621037 ERS490296 ASE_070_05M 70 -20.44 -3.19 SAMEA2621066 ERS490327 ASW_072_05M 72 -8.78 -17.91 SAMEA2621132 ERS490433 ASW_072_100M 72 -8.7 -17.94 SAMEA2621155 ERS490476 ASW_076_05M 76 -20.94 -35.2 SAMEA2621198 ERS490542 ASW_078_05M 78 -30.14 -43.29 SAMEA2621254 ERS490659 ASW_082_05M 82 -47.18 -58.3 SAMEA2621401 ERS490885 ASW_082_40M 82 -47.21 -58.06 SAMEA2621423 ERS490928 SOC_084_05M 84 -60.23 -60.65 SAMEA2621487 ERS491001 SOC_085_05M 85 -62.03 -49.54 SAMEA2621509 ERS491044 SOC_085_90M 85 -62.24 -49.18 SAMEA2621536 ERS491095 PSE_093_05M 93 -34.05 -73.08 SAMEA2621779 ERS491421 PSE_093_35M 93 -33.88 -73.05 SAMEA2621812 ERS491463 PSE_094_05M 94 -32.78 -87.09 SAMEA2621839 ERS491492 PSE_096_05M 96 -29.72 -101.16 SAMEA2621859 ERS491525 85 PSE_098_05M 98 -25.83 -111.78 SAMEA2621990 ERS491699 PSE_099_05M 99 -21.15 -104.79 SAMEA2622074 ERS491804 PSE_100_05M 100 -12.99 -95.99 SAMEA2622097 ERS491836 PSE_100_50M 100 -12.99 -95.99 SAMEA2622119 ERS491874 PSE_102_05M 102 -5.25 -85.16 SAMEA2622173 ERS491938 PSE_102_40M 102 -5.27 -85.23 SAMEA2622219 ERS492012 PON_109_05M 109 1.99 -84.58 SAMEA2622316 ERS492145 PON_109_30M 109 2.08 -84.52 SAMEA2622336 ERS492177 PSE_110_05M 110 -2.01 -84.59 SAMEA2622376 ERS492228 PSE_110_50M 110 -1.84 -84.63 SAMEA2622402 ERS492264 PSE_111_05M 111 -16.96 -100.63 SAMEA2622452 ERS492321 PSE_111_90M 111 -16.96 -100.7 SAMEA2622478 ERS492357 PSW_112_05M 112 -23.28 -129.4 SAMEA2622518 ERS492408 PSW_122_05M 122 -8.99 -139.21 SAMEA2622652 ERS492642 PSW_123_05M 123 -8.91 -140.28 SAMEA2622710 ERS492733 PSW_124_05M 124 -9.11 -140.58 SAMEA2622759 ERS492814 PSW_125_05M 125 -8.91 -142.56 SAMEA2622817 ERS492888 PSW_128_05M 128 0 -153.68 SAMEA2622901 ERS493044 PSW_128_40M 128 0 -153.68 SAMEA2622923 ERS493098 PON_132_05M 132 31.52 -159 SAMEA2623059 ERS493300 PON_133_05M 133 35.41 -127.74 SAMEA2623116 ERS493390 PON_133_45M 133 35.42 -127.69 SAMEA2623135 ERS493431 PON_137_05M 
137 14.2 -116.63 SAMEA2623275 ERS493636 PON_137_40M 137 14.17 -116.71 SAMEA2623295 ERS493670 PON_138_05M 138 6.33 -102.94 SAMEA2623350 ERS493752 PON_138_60M 138 6.34 -102.98 SAMEA2623370 ERS493788 PON_140_05M 140 7.41 -79.3 SAMEA2623426 ERS493877 ANW_141_05M 141 9.85 -80.04 SAMEA2623446 ERS493914 ANW_142_05M 142 25.51 -88.38 SAMEA2623463 ERS493938 ANW_145_05M 145 39.23 -70.03 SAMEA2623627 ERS494170 86 ANW_146_05M 146 34.68 -71.3 SAMEA2623673 ERS494236 ANW_148_05M 148 31.7 -64.25 SAMEA2623734 ERS494332 ANW_149_05M 149 34.1 -49.89 SAMEA2623774 ERS494394 ANE_150_05M 150 35.91 -37.26 SAMEA2623808 ERS494445 ANE_150_40M 150 35.75 -37.05 SAMEA2623826 ERS494488 ANE_151_05M 151 36.16 -29.01 SAMEA2623850 ERS494518 ANE_151_80M 151 36.19 -28.88 SAMEA2623868 ERS494559 ANE_152_05M 152 43.69 -16.85 SAMEA2623886 ERS494579 PON_CRUD_St13 St13 9.5025 -92.3286 SAMN07632551 SRS2671626 PON_CRUD_St17 St17 2.0416 -97.0511 SAMN07632552 SRS2671628 PON_CRUD_St18 St18 0.8736 -97.0511 SAMN07632553 SRS2671720 PON_CRUD_St8 St8 8.7051 -86.4906 SAMN07632550 SRS2671625 87 SRA_exp SRA_runs Date_collected Size_fraction ERX555964,ERX556024 ERR598955,ERR599003 9/15/09 0.22_1.6 ERX555913,ERX556067 ERR599095,ERR598950 9/15/09 0.22_1.6 ERX555914,ERX556002 ERR598993,ERR599140 11/2/09 0.22_1.6 ERX556039,ERX555962 ERR599073,ERR599092 11/2/09 0.22_1.6 ERX289007,ERX289009 ERR315858,ERR315861 11/18/09 0.22_1.6 ERX555952,ERX556082 ERR599043,ERR598951 11/23/09 0.22_1.6 ERX556032,ERX556124 ERR599094,ERR599153 11/23/09 0.22_1.6 ERX289010,ERX289011 ERR315862,ERR315863 12/15/09 0.22_1.6 ERX291766,ERX291768,ERX291769,ERX291767 ERR318618,ERR318619,ERR318620,ERR318621 12/15/09 0.22_1.6 ERX556018,ERX555912 ERR598969,ERR599106 1/9/10 0.22_1.6 ERX556066,ERX556022,ERX556080 ERR599155,ERR599116,ERR599041 1/11/10 0.22_1.6 ERX556055,ERX555931 ERR599061,ERR599097 1/11/10 0.22_1.6 ERX555946,ERX556035 ERR599134,ERR599049 1/13/10 0.22_1.6 ERX555978,ERX555909 ERR598991,ERR598959 1/20/10 0.22_1.6 ERX555956,ERX556025 
ERR598975,ERR599111 1/20/10 0.22_1.6 ERX555943,ERX556073 ERR599143,ERR598966 3/12/10 0.22_1.6 ERX556135,ERX556019 ERR598974,ERR599028 3/12/10 0.22_1.6 ERX556006,ERX556116 ERR599102,ERR599158 3/15/10 0.22_1.6 ERX556034,ERX556061 ERR599082,ERR598949 3/15/10 0.22_1.6 ERX556015 ERR599145 3/18/10 0.22_1.6 ERX555983,ERX555906 ERR599011,ERR599074 3/30/10 0.22_1.6 ERX556125,ERX555976 ERR598977,ERR599053 3/30/10 0.22_1.6 ERX556052,ERX556118 ERR599075,ERR599141 4/4/10 0.22_1.6 ERX556136,ERX556098 ERR599013,ERR599130 4/4/10 0.22_1.6 ERX556105,ERX556132 ERR599054,ERR599045 4/13/10 0.22_1.6 88 ERX556122,ERX556056 ERR599019,ERR599138 4/19/10 0.22_1.6 ERX556014,ERX555985 ERR599098,ERR599139 5/17/10 0.22_1.6 ERX556063,ERX555923 ERR599002,ERR599016 5/17/10 0.22_1.6 ERX555971 ERR599057 6/26/10 0.22_3 ERX555945 ERR599058 6/27/10 0.22_3 ERX556068 ERR599026 6/29/10 0.22_3 ERX556129 ERR599012 7/3/10 0.22_3 ERX556127,ERX556012,ERX556096 ERR599150,ERR599088,ERR598970 7/7/10 0.22_3 ERX555948,ERX556119,ERX556087 ERR598972,ERR599023,ERR599025 7/8/10 0.22_3 ERX556091,ERX555970 ERR599146,ERR598979 7/12/10 0.22_3 ERX556072,ERX556023,ERX555996 ERR598990,ERR599110,ERR599018 7/12/10 0.22_3 ERX555958,ERX556004,ERX556107 ERR599173,ERR599068,ERR598973 7/15/10 0.22_3 ERX556046,ERX556113 ERR598982,ERR599107 7/15/10 0.22_3 ERX556084,ERX556109 ERR599144,ERR598994 9/7/10 0.22_3 ERX556071,ERX555986,ERX555975 ERR599171,ERR599129,ERR599174 9/14/10 0.22_3 ERX555939,ERX556041,ERX556038 ERR599017,ERR599103,ERR599056 9/14/10 0.22_3 ERX555918,ERX556111 ERR599165,ERR599135 9/21/10 0.22_3 ERX556009,ERX555934 ERR598984,ERR599105 10/5/10 0.22_3 ERX555987,ERX556079 ERR599133,ERR599137 10/5/10 0.22_3 ERX555995,ERX555999 ERR599010,ERR599126 10/16/10 0.22_3 ERX555968,ERX556088 ERR599022,ERR599006 11/4/10 0.22_3 ERX556030,ERX555925 ERR599035,ERR599009 12/6/10 0.22_3 ERX555915,ERX555998 ERR599122,ERR599027 12/6/10 0.22_3 ERX555982,ERX556058 ERR598945,ERR599059 1/3/11 0.22_3 ERX556133,ERX556110 ERR599090,ERR599176 1/6/11 
0.22_3 ERX556064,ERX555997 ERR599104,ERR599121 1/6/11 0.22_3 ERX555965 ERR599064 3/12/11 0.22_3 ERX556045 ERR598965 3/12/11 0.22_3 ERX555988 ERR599050 3/18/11 0.22_3 ERX556130 ERR598967 3/24/11 0.22_3 89 ERX555979,ERX556049 ERR599120,ERR599093 4/3/11 0.22_3 ERX555926 ERR599024 4/9/11 0.22_3 ERX555916,ERX556074,ERX556092 ERR599163,ERR599169,ERR599063 4/15/11 0.22_3 ERX555930,ERX555917 ERR599113,ERR599081 4/15/11 0.22_3 ERX555954,ERX556029 ERR598978,ERR598943 4/21/11 0.22_3 ERX556048,ERX556001,ERX556005 ERR598962,ERR599168,ERR599007 4/22/11 0.22_3 ERX555963,ERX556086 ERR599118,ERR598997 5/12/11 0.22_3 ERX556128,ERX556007,ERX556026 ERR598952,ERR599065,ERR599108 5/12/11 0.22_3 ERX556137 ERR599039 5/21/11 0.22_3 ERX555941 ERR599014 5/21/11 0.22_3 ERX555933 ERR599077 5/31/11 0.22_3 ERX556060 ERR598961 5/31/11 0.22_3 ERX555919 ERR598954 6/14/11 0.22_3 ERX555947 ERR598992 7/26/11 0.22_3 ERX556126 ERR599160 7/31/11 0.22_3 ERX556117,ERX555929,ERX556076,ERX556057 ERR599036,ERR599080,ERR599151,ERR599069 8/4/11 0.22_3 ERX556044,ERX555938,ERX555960,ERX555908 ERR599114,ERR599091,ERR599119,ERR599066 8/8/11 0.22_3 ERX555990 ERR599038 9/4/11 0.22_3 ERX556134 ERR599032 9/5/11 0.22_3 ERX556094 ERR599142 10/4/11 0.22_3 ERX555967 ERR599052 10/18/11 0.22_3 ERX555907 ERR598942 10/19/11 0.22_3 ERX556031 ERR598989 12/2/11 0.22_3 ERX556104,ERX555920,ERX556097,ERX556115 ERR598987,ERR599099,ERR599147,ERR599070 12/2/11 0.22_3 ERX556106 ERR599030 12/10/11 0.22_3 ERX555961 ERR599087 12/11/11 0.22_3 ERX556121 ERR599162 12/21/11 0.22_3 ERX556062 ERR599029 12/30/11 0.22_3 ERX556028 ERR599136 1/9/12 0.22_3 ERX556101 ERR598983 2/2/12 0.22_3 90 ERX556059 ERR598968 2/15/12 0.22_3 ERX555984 ERR599123 2/24/12 0.22_3 ERX556003 ERR598963 3/1/12 0.22_3 ERX556037 ERR599170 3/5/12 0.22_3 ERX555927 ERR598996 3/5/12 0.22_3 ERX555957 ERR598976 3/9/12 0.22_3 ERX556040 ERR598986 3/9/12 0.22_3 ERX556054 ERR599078 3/19/12 0.22_3 SRX3375198 SRR6269030 7/29/05 0.22_3 SRX3375200 SRR6269032 7/29/05 0.22_3 SRX3375295 
SRR6269135 7/29/05 0.22_3 SRX3375197 SRR6269029 7/29/05 0.22_3 91 Depth_ID PANGAEA_Sample_ID Mean_Depth_m Mean_temperature_C Mean_salinity SRF TARA_Y200000002 10 20.5 36.6 DCM TARA_X000000368 38.7 16.2 36.6 SRF TARA_A100000164 5.4 21.4 37.9 DCM TARA_S200000501 61.6 18.4 37.9 SRF TARA_E500000075 5.5 17.6 38.2 SRF TARA_E500000178 5.5 18.3 38.2 DCM TARA_E500000331 51.7 15.2 38.5 SRF TARA_A100001015 5.4 20.5 39.4 DCM TARA_A100001011 69.1 19 39.3 SRF TARA_A100001388 5.4 25.1 40 SRF TARA_A100001035 5.4 25.8 39.7 DCM TARA_A100001037 85.5 26.1 40.2 SRF TARA_A100001234 5.4 27.3 38.9 SRF TARA_B100000003 5.4 27.6 38.6 DCM TARA_B100000029 60.1 27.6 38.9 SRF TARA_Y100000022 5.5 25.6 36.5 DCM TARA_B100000035 17.4 25.4 36.5 SRF TARA_Y100000287 5.5 26.2 36.6 DCM TARA_B100000073 25.3 25.5 <NA> DCM TARA_B100000085 25.3 26.8 36.3 SRF TARA_B100000282 5.5 29.1 36 DCM TARA_B100000287 58.2 27.1 36.5 SRF TARA_B100000123 5.4 30 34.6 DCM TARA_B100000131 79 27.7 35.1 SRF TARA_B100000161 5.5 30.5 35.1 92 SRF TARA_B100000242 5.4 29.8 34.2 SRF TARA_B100000212 5.5 27.9 34.6 DCM TARA_B100000214 79 24.9 34.9 SRF TARA_B000000609 5.5 27.3 35 SRF TARA_B000000565 5.6 27 35.1 DCM TARA_B000000557 67.1 25.3 35.2 SRF TARA_B000000532 5.4 25.1 35.3 SRF TARA_B100000401 5.5 22.2 35.3 DCM TARA_B100000405 63.6 22.3 35.3 SRF TARA_B000000437 5.9 21.8 35.4 DCM TARA_B000000441 28.7 21.8 <NA> SRF TARA_B000000475 5.4 15 35.3 DCM TARA_B000000477 28.6 15 35.3 SRF TARA_B100000497 5.5 12.8 34.8 SRF TARA_B100000475 5.4 16.8 35.7 DCM TARA_B100000482 40.3 16.8 35.7 SRF TARA_B100000459 5.4 19.8 36.4 SRF TARA_B100000424 5.8 25 36.4 DCM TARA_B100000427 95.4 24.1 36.6 SRF TARA_B100000513 5.5 23.3 37.1 SRF TARA_B100000524 5.6 19.9 36.3 SRF TARA_B100000768 5.5 7.3 34 DCM TARA_B100000767 41.8 7 34.1 SRF TARA_B100000780 5.9 1.8 33.7 SRF TARA_B100000787 5.9 0.7 34.4 DCM TARA_B100000795 87.4 -0.8 34.3 SRF TARA_B100001063 5.3 18 34.3 DCM TARA_B100001059 33.8 16.4 34.3 SRF TARA_B100001057 5.4 21.1 34.7 SRF TARA_B100000989 5.5 23.8 35.8 
93 SRF TARA_B100001027 5.6 25.1 36.4 SRF TARA_B100000886 5.4 23.8 36.1 SRF TARA_B100000963 5.5 25.3 35.8 DCM TARA_B100000965 57.6 20.6 35.5 SRF TARA_B100000900 5.5 24.9 34.7 DCM TARA_B100000902 45.7 19.6 34.9 SRF TARA_B100000925 5.4 27.6 33.4 DCM TARA_B100000927 29.8 26.5 34.3 SRF TARA_B100001109 5.5 23.9 35 DCM TARA_B100001113 48.7 21.8 35.1 SRF TARA_B100000575 5.9 22.8 36 DCM TARA_B100000579 89 19.9 35.7 SRF TARA_B100000941 5.4 24.2 36.5 SRF TARA_B100001115 5.9 26.5 35.4 SRF TARA_B100000683 5.5 26.6 35.4 SRF TARA_B100000674 9.9 26.5 35.4 SRF TARA_B100001121 5.5 26.8 35.4 SRF TARA_B100000609 5.4 26.1 35.1 DCM TARA_B100000614 41.7 26.1 35.1 SRF TARA_B100001248 5.5 25.2 35.2 SRF TARA_B100001093 5.5 19.2 33.1 DCM TARA_B100001094 47.2 13.2 33.2 SRF TARA_B100001287 5.4 26.4 33.9 DCM TARA_B100001964 44.2 18.6 34.4 SRF TARA_B100001989 5.4 26.6 33.4 DCM TARA_B100001996 58.1 19 34.6 SRF TARA_B100002019 5.4 26.6 28.9 SRF TARA_B100001939 5.4 27.1 34.3 SRF TARA_B100002051 5.4 25 36.2 SRF TARA_B100001142 5.5 14.1 35.2 94 SRF TARA_B100001540 11.7 19.1 36.5 SRF TARA_B100001741 6.1 20.4 36.6 SRF TARA_B100001758 5.5 18.7 36.4 SRF TARA_B100001769 5.5 17.6 36.3 DCM TARA_B100001778 39.8 17.7 36.2 SRF TARA_B100001564 5.4 17.3 36.2 DCM TARA_B100001559 77.6 16.8 36.2 SRF TARA_B100001173 5.4 14.3 36 SRF <NA> 5 <NA> <NA> SRF <NA> 5 <NA> <NA> SRF <NA> 5 <NA> <NA> SRF <NA> 5 <NA> <NA> 95 Mean_Oxygen_umol_per_kg Mean_nitrates_umol_per_L NO2_umol_per_L NO2NO3_umol_per_L <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 207.8 <NA> 0.02 0.12 237.9 <NA> 0.03 0.02 220 <NA> 0.01 0.04 218 <NA> 0 0.03 229.5 <NA> 0.07 0.15 207.6 -0.7 0 0.05 218.1 0.3 0.01 0.02 188.8 -0.1 0.01 0.03 188.4 -0.5 0.01 0 180 1.2 0.01 0.01 182.8 0 0 0 184.1 -0.9 0.02 0.03 175.4 0.1 0.27 0.64 210.9 <NA> 0.05 0.13 211.6 0.4 0.51 2.08 199.9 0.9 0.01 0.06 154.7 1.2 0.7 1.86 193 -0.6 0.02 0.09 187.4 -1.3 0 0.09 148.3 -0.2 0.07 0.49 189.3 -1.5 0 0.03 132.8 0.6 0.15 1.39 185.2 2.6 0.02 28.64 96 187.3 -2.2 0.01 0.06 191.7 <NA> 0 0 192.6 0.2 
0.03 0.47 193.1 <NA> <NA> <NA> 190.1 <NA> 0.02 0 185 0.8 0.03 0.06 199.9 -0.2 <NA> <NA> 210 <NA> 0 0 207.4 -0.1 0 0.02 207 <NA> <NA> <NA> 206.4 <NA> <NA> <NA> 238.9 2.5 0.3 3.34 240.4 <NA> 0.26 3.23 249.4 0.4 0.17 7.09 231.9 <NA> 0.25 1.3 231.7 <NA> 0.29 1.08 215.7 0.2 0.05 0.99 199.1 0.3 0 0.02 194.4 0.2 0.01 0.04 206.2 0 0 0 221.5 -0.4 0 0.02 305 17.1 0.15 18.15 306.2 17.1 0.14 19.07 338.3 24.4 0.27 25.3 343.4 27.5 0.11 28.5 325.4 33.1 0.06 31 244.9 -1.5 0.01 0.02 237.2 0.7 0.37 7.55 219.2 -0.4 0 0.03 204.1 -0.4 0 0.02 97 200.5 -1.6 0 0.05 204 -1.7 0 0.04 200.2 4.6 0.14 6.2 216.8 1.5 0.16 5.7 206 11.7 0.32 12.6 103.9 20.3 1.2 24.4 198.6 1.2 0.04 0.9 203.1 3.9 0.11 4.6 190.8 6.7 0.31 8.2 144.6 10.9 0.41 10.7 208.9 2.6 0.04 2.68 211.5 1.4 0.2 0.93 202.2 -0.9 0.01 0.03 186.2 4 0.12 5.46 189.8 3.9 0.14 4.9 190.7 5.1 0.16 6.18 187.3 5 0.19 3.73 179.9 2.5 0.27 5.08 177 4.4 0.28 5.26 197.7 -1 0 0.01 224.4 -0.7 0 0.02 227.7 4.8 0.08 1.3 195.1 3.2 0.07 2.44 95.4 11 0.31 6.05 196.9 -1.2 0 0 77 22.1 0.83 12.9 205.3 -3.8 0 0 195.5 -1.7 0 0 194.3 -2.2 0.02 0.05 233.9 3.5 0.11 4.38 98 214.4 0.9 0.22 0.74 212.6 <NA> 0.09 0.15 220.2 -1.2 0.25 0.68 228.4 -0.3 0.04 0.17 230.1 1.2 0.04 0.17 232.1 0.3 0.02 0.02 228.5 1.6 0.01 0.05 243.1 3 0.31 2.16 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> 99 * the 4 PON_CRUD samples were 2x150bps PO4_umol_per_L SI_umol_per_L Chlorophyll_mg_Chla_per_m3 total reads (2x100bps)* QCd reads <NA> <NA> 0.078 404336892 369070354 <NA> <NA> 0.090621 476293096 423703874 0.03 0.56 0.048898 414976308 370103744 0.01 0.62 0.132494 408021182 362086112 0.01 1.38 0.0419 149566010 130876938 0.01 1.05 0.121845 457560422 408995570 0.07 1.82 0.201016 386627816 357584824 0 0.82 0.01585 478785582 445072382 0 0.67 0.042495 315646408 300230682 0.02 0.82 0.048898 401751524 359774790 0.02 0.87 0 352979512 339092972 0.03 0.85 0.224444 394022740 363018048 0.05 1.15 0.163228 255070344 231848988 0.19 3.91 0.186697 241308424 222472590 0.09 1.35 
0.181406 449416158 410834234 0.37 1.2 0.129697 251165916 231670922 0.51 1.79 0.353707 402101650 371466534 0.32 1.18 0.161808 261840264 235358560 0.42 1.63 0.604752 419686666 347745850 0.26 1.28 0.17638 338044892 318825684 0.14 1.47 0.020173 440504066 408606976 0.34 1.38 0.47554 463399290 417237142 0.08 2.21 0.006724 401723064 361354620 0.34 3.2 0.388348 430039794 384624166 1.94 15.14 0.026898 391038482 370602874 100 0.08 2.12 0 499292418 452161618 0.12 2.67 0.176565 404567984 374583112 0.18 3.22 0.394226 388947166 360259294 <NA> <NA> 0.04282 324775688 310306972 0 2.81 0.085544 336385740 319776472 0 2.51 0.032278 337711862 315508628 <NA> <NA> 0.089746 291429494 281299030 0.08 1.77 0.162993 460647798 431438098 0.08 1.75 0.214573 410378996 373824970 <NA> <NA> 0.215292 290200094 238639672 <NA> <NA> 0.272648 433566456 399609040 0.34 2.74 0.250054 320731360 257533088 0.37 2.47 0.424363 149855818 138676666 1.02 13.88 1.551493 157314750 148584718 0.23 2.6 0.201517 294061050 247707850 0.23 2.44 0.42458 262743724 221419218 0.36 1.36 0.324315 262754648 215176492 0.1 0.87 0.050608 420077668 385809816 0.14 1.32 0.263939 327621278 301545300 0.06 0.81 0.033622 461725288 396735186 0 0.48 0.053173 474643436 420837752 1.3 1.82 0.310844 268894564 252054616 1.42 3.08 1.01772 444220572 408090866 1.72 16.55 0.109555 442330696 409392428 2.11 79.52 0.065273 401701090 339283264 2.31 81.82 0.540091 436978564 393899978 0.52 0.09 0.387495 274983484 266365620 1.07 0.59 1.104727 338611726 312448428 0.28 0.09 0.147822 460018862 432477730 0.16 0.53 0.044093 377820080 355916782 101 0.2 0.38 0.013449 253142740 239050364 0.29 0.55 0.037733 338549582 279542414 0.68 1.2 0.287892 373134256 357613416 0.78 1.9 0.366887 341083836 320404292 1 5 0.238224 337892910 319549476 1.86 10.5 0.728939 409176104 388979268 0.27 1.7 0.28167 372118964 347812950 0.5 3.2 0.737378 340008658 323801396 0.75 3.9 0.317793 321797088 304813382 0.93 6 0.460366 423500782 383480904 0.5 1 0.203347 380365534 352918548 0.44 0.9 
0.378549 374127914 347584958 0.14 0.8 0.04895 399008374 367650168 0.57 2.19 0.166512 278380318 266284158 0.53 2.22 0.319103 342284306 320895730 0.63 2.52 0.286662 368419764 353530344 0.56 1.8 0.233673 328505148 314898368 0.54 2.67 0.288205 306384522 250555094 0.55 2.74 0.451256 297815870 285103462 0.01 2.43 0.057131 333427578 314678592 0.29 1.91 0.038221 539113764 482541062 0.46 2.84 0.436597 359284260 332874030 0.46 2.54 0.290611 371142378 344858074 0.88 4.66 <NA> 385071042 367966468 0.16 1.19 0.013449 342278366 322298694 1.11 8.07 0.334076 391822972 361276302 0.08 0.54 0.040695 370334582 345919492 0.01 2.93 0.126428 342398030 322218196 0 1.22 0.163338 314283624 297863244 0.35 2.77 0.296307 352030928 329775746 102 0.02 0.94 0.12799 338943134 319581346 0 0.73 0.166512 346238396 318465884 0.06 0.8 0.214174 357891876 327666974 0.01 0.84 0.196921 387769558 348349270 0 0.84 0.339591 364054734 333632862 0.01 0.65 0.040347 396286730 355439850 0.01 0.63 0.247378 369538288 334513286 0.16 1.24 0.238225 329240054 308331856 <NA> <NA> <NA> 262970914 238557206 <NA> <NA> <NA> 238760682 215040466 <NA> <NA> <NA> 437291634 401751672 <NA> <NA> <NA> 216955172 203018622 34873724222 31954710020 103 reads recruited % reads recruited reads recruited after filtering by genome detection % reads recruited after filtering by genome detection 3530718 0.96 3291851 0.89 2592233 0.61 2424523 0.57 482598 0.13 201870 0.05 526476 0.15 276300 0.08 1086116 0.83 944379 0.72 12902196 3.15 12272031 3.00 1782979 0.50 1195653 0.33 1209386 0.27 1086987 0.24 3390748 1.13 3149464 1.05 4142276 1.15 3574992 0.99 1620104 0.48 1320961 0.39 655597 0.18 405877 0.11 16080076 6.94 15677666 6.76 9099439 4.09 8864007 3.98 197972 0.05 0 0.00 8467950 3.66 8252789 3.56 15612879 4.20 15252311 4.11 10160144 4.32 9662619 4.11 21533786 6.19 20860619 6.00 12934545 4.06 12294879 3.86 2362446 0.58 1973534 0.48 207222 0.05 0 0.00 535265 0.15 351605 0.10 124054 0.03 0 0.00 1553031 0.42 1136568 0.31 104 378427 0.08 45940 0.01 
647923 0.17 254949 0.07 473815 0.13 283400 0.08 740951 0.24 470835 0.15 19654210 6.15 19311198 6.04 2147565 0.68 1784085 0.57 3160652 1.12 2794671 0.99 6841517 1.59 6443179 1.49 12438513 3.33 11981737 3.21 3583219 1.50 3382006 1.42 13776128 3.45 13539002 3.39 479272 0.19 327443 0.13 500946 0.36 381079 0.27 749451 0.50 718758 0.48 6756168 2.73 6268627 2.53 5154085 2.33 4715324 2.13 735837 0.34 621742 0.29 257886 0.07 0 0.00 71504 0.02 0 0.00 961450 0.24 441691 0.11 3486283 0.83 3066327 0.73 63786 0.03 0 0.00 81211 0.02 0 0.00 65205 0.02 0 0.00 53995 0.02 0 0.00 57251 0.01 0 0.00 7322909 2.75 6627203 2.49 2742800 0.88 2361665 0.76 1929126 0.45 1466568 0.34 89270 0.03 0 0.00 105 119198 0.05 0 0.00 130510 0.05 0 0.00 192953 0.05 0 0.00 1017649 0.32 779942 0.24 1581062 0.49 1141113 0.36 4021481 1.03 3485493 0.90 1344703 0.39 597491 0.17 603080 0.19 302349 0.09 1352125 0.44 1004998 0.33 904925 0.24 625394 0.16 216531 0.06 61632 0.02 378359 0.11 208556 0.06 291131 0.08 0 0.00 210732 0.08 38719 0.01 7905258 2.46 7339208 2.29 12938584 3.66 11836157 3.35 3112798 0.99 2309023 0.73 359712 0.14 75486 0.03 314592 0.11 0 0.00 286563 0.09 39664 0.01 966129 0.20 456920 0.09 1284193 0.39 1028799 0.31 853707 0.25 395431 0.11 98942 0.03 0 0.00 231173 0.07 0 0.00 97418 0.03 0 0.00 12001112 3.47 11612332 3.36 28531838 8.85 28117874 8.73 5047612 1.69 4743480 1.59 3049171 0.92 2883073 0.87 106 4084854 1.28 3940017 1.23 8836927 2.77 8567472 2.69 5874645 1.79 5617923 1.71 3118112 0.90 2905059 0.83 2747163 0.82 2612333 0.78 1704248 0.48 1404661 0.40 660254 0.20 478236 0.14 3992361 1.29 3597604 1.17 5954028 2.50 4742350 1.99 1470789 0.68 1224970 0.57 4702945 1.17 4324921 1.08 1017587 0.50 462015 0.23 361798745 1.13 330719609 1.03 107 Table S3 Genome detections* * detection is defined as the proportion of the reference genome that is covered at least 1X in a sample ANE_004_05M ANE_004_40M ANE_150_05M ANE_150_40M ANE_151_05M ANE_151_80M ANE_152_05M ANW_141_05M ANW_142_05M BL107 0.89025867 
0.9306893 0.95771708 0.94787553 0.95259768 0.93753319 0.96913547 0.0366925 0.10967188 CC9311 0.15009322 0.5071714 0.35224776 0.28508638 0.47750642 0.3136278 0.67345526 0.01673706 0.01709209 CC9605 0.93892727 0.91364465 0.82754528 0.88398356 0.74650247 0.35157584 0.09401525 0.87621897 0.80515039 CC9616 0.92985155 0.79091317 0.75638705 0.7131814 0.27002342 0.15300289 0.35882925 0.23226264 0.18406417 CC9902 0.65777655 0.74203005 0.91821004 0.89691228 0.84919478 0.71712216 0.93517229 0.03515227 0.04260919 GEYO 0.01451192 0.02335042 0.86422239 0.80632401 0.11332708 0.10108042 0.91365804 0.05850596 0.03465763 KORDI100 0.10015331 0.05924178 0.06982568 0.06338297 0.02985773 0.01605341 0.04663771 0.43429014 0.85168928 KORDI49 0.5351388 0.28675163 0.85677796 0.85725065 0.42479773 0.13570072 0.45249959 0.94818155 0.90639133 KORDI52 0.34330943 0.30332999 0.33424139 0.36676939 0.29144722 0.14873458 0.07396779 0.56714001 0.31837827 MIT9508 0.01286636 0.0267479 0.81406857 0.74129686 0.09770315 0.08540829 0.88826478 0.0506089 0.02825992 MIT9509 0.00819474 0.01823279 0.20236861 0.14246972 0.02687822 0.01476811 0.47407581 0.04355328 0.02391418 N19 0.61455494 0.50397019 0.55412025 0.60915398 0.4269665 0.20157486 0.07767509 0.95562949 0.84370273 N26 0.6415668 0.52683608 0.59405557 0.63846364 0.45758888 0.22066983 0.08989479 0.96201555 0.85842318 N32 0.72162685 0.61466115 0.65622153 0.70745445 0.53835505 0.2778273 0.11194611 0.9550771 0.87936294 N5 0.63685675 0.52586182 0.57802519 0.62857954 0.44384611 0.20845424 0.08827995 0.95379057 0.83836168 RCC307 0.51664691 0.86116363 0.33105426 0.55693367 0.0307522 0.01183668 0.01743542 0.91538201 0.8677589 RS9916 0.02233528 0.04322542 0.05028233 0.06380459 0.02155956 0.01379843 0.03810383 0.87771152 0.23176817 RS9917 0.01211048 0.01560972 0.02721522 0.02246863 0.00934967 0.00750466 0.03191469 0.1317347 0.06872123 UW105 0.31517109 0.93732776 0.84011353 0.90636524 0.22720658 0.2222637 0.09840741 0.06848297 0.04358702 UW106 0.68524498 0.76329214 
0.80726053 0.84589613 0.76430203 0.6573908 0.15221199 0.36364688 0.17961573 UW140 0.06276742 0.57916426 0.38346036 0.56295847 0.04413537 0.03441373 0.04841666 0.07507181 0.04021869 UW179A 0.02037807 0.03164401 0.25672959 0.18570031 0.04759455 0.03060059 0.50774082 0.04466486 0.03509834 UW179B 0.07054547 0.27562303 0.11564451 0.08627551 0.21456837 0.1195923 0.43758003 0.02843288 0.01826127 UW69 0.82213019 0.89116443 0.90564403 0.91338495 0.88187716 0.83449895 0.20816779 0.42375053 0.21615053 UW86 0.68914731 0.58151838 0.66698226 0.72437164 0.50473063 0.25219512 0.09737859 0.98570754 0.89469656 WH7803 0.02266402 0.02152902 0.03271397 0.02724237 0.01644559 0.01024293 0.03291746 0.89537737 0.06608104 WH7805 0.02523864 0.02449617 0.03256128 0.02744014 0.02141529 0.01297044 0.03179459 0.26859646 0.05437506 108 WH8016 0.06928036 0.26306164 0.12490381 0.0955455 0.20395821 0.11113629 0.41737218 0.03817172 0.02763119 WH8020 0.32762977 0.92292152 0.86873525 0.9168061 0.76758843 0.59591641 0.66927269 0.02634078 0.02712436 WH8102 0.96248654 0.84460698 0.58195638 0.59971901 0.54525023 0.11552788 0.09590947 0.96963855 0.9600336 WH8109 0.74832006 0.65181164 0.72003636 0.76606574 0.56269878 0.29160062 0.10666457 0.89097687 0.77313335 109 * detection is defined as the proportion of the reference genome that is covered at least 1X in a sample ANW_145_05M ANW_146_05M ANW_148_05M ANW_149_05M ASE_66_05MASE_66_30MASE_67_05MASE_68_05MASE_68_50MASE_70_05M 0.960628 0.03271907 0.1218345 0.95352611 0.92425295 0.93053455 0.89794658 0.96008143 0.95253668 0.02238217 0.62401694 0.02745587 0.08901321 0.36613262 0.33183412 0.27898709 0.66651405 0.1909446 0.13150167 0.02697928 0.18674239 0.83648671 0.88523137 0.85056392 0.15855351 0.14168137 0.00777831 0.3484424 0.28375649 0.03858732 0.01178981 0.20707315 0.60067129 0.14347697 0.02384617 0.03454583 0.00406011 0.15416364 0.11459608 0.03356036 0.96153863 0.03512639 0.16814901 0.92071784 0.81836204 0.83306751 0.9240593 0.92769639 0.91729651 0.01914453 
0.01207234 0.02233805 0.02482596 0.02897646 0.42948563 0.51822994 0.00821048 0.94311285 0.9377147 0.89292461 0.01598246 0.04980942 0.08411067 0.02972081 0.00801064 0.0089676 0.00430494 0.04064714 0.03280547 0.03229766 0.08231393 0.39572174 0.88917485 0.36807023 0.07884044 0.08036248 0.00888238 0.42288988 0.31104399 0.05648259 0.07788529 0.35391686 0.46821343 0.44188062 0.06243292 0.06031415 0.01103568 0.21690997 0.17724948 0.0461532 0.01905648 0.01612371 0.02136344 0.0272633 0.36339772 0.4456233 0.01273621 0.92522353 0.91760776 0.85007784 0.00975191 0.00951036 0.01370693 0.01357522 0.03666005 0.04842257 0.00748113 0.25649862 0.23529563 0.60635882 0.23538452 0.88465706 0.92466411 0.87174988 0.11232843 0.09134664 0.00352245 0.2779263 0.21840077 0.03045541 0.25306854 0.91395858 0.940491 0.89991993 0.129606 0.10189184 0.00839293 0.29977709 0.23883872 0.0436404 0.28302036 0.90953852 0.93787377 0.90050602 0.15022322 0.12516555 0.00795642 0.34447428 0.27975538 0.0526192 0.23548241 0.89394518 0.93473533 0.88700909 0.11811039 0.0944827 0.00605514 0.28876481 0.22825073 0.03733656 0.84774182 0.94222818 0.95274071 0.92357989 0.06909979 0.05073011 0.00453572 0.01838785 0.01487264 0.0111221 0.01726837 0.14051501 0.22526801 0.09487652 0.01073752 0.01110349 0.00891573 0.04676945 0.03905536 0.02966429 0.01069995 0.01790773 0.03263066 0.01630284 0.00688436 0.00652575 0.00734717 0.03181545 0.02569497 0.02600272 0.46815322 0.91509149 0.94323545 0.95016822 0.04715329 0.05258494 0.01364543 0.04940521 0.04644162 0.02753404 0.22537685 0.67802798 0.85182658 0.8689779 0.20849929 0.23415836 0.00846792 0.5627025 0.47536804 0.04184237 0.0879539 0.51606639 0.72617183 0.65923655 0.01896412 0.01779621 0.01256435 0.04330241 0.04081161 0.02955016 0.01377471 0.01681423 0.02549716 0.02403965 0.04873573 0.05656912 0.01444628 0.26925868 0.24796469 0.60804496 0.73395982 0.01867195 0.04613412 0.19440902 0.1270837 0.10577866 0.54208395 0.06995799 0.03784577 0.00981403 0.39629358 0.85047855 0.93248196 
0.92500075 0.30927722 0.33591623 0.01350888 0.73633467 0.63115532 0.04582901 0.3630661 0.95965935 0.97181976 0.95163109 0.14319191 0.12498681 0.00791634 0.33801097 0.26947509 0.0434977 0.01202298 0.02803977 0.04185015 0.01979159 0.0088952 0.00901202 0.0072135 0.03388511 0.02902042 0.02611267 0.01553039 0.02235552 0.04193771 0.02549167 0.00958393 0.00937644 0.00975414 0.02848415 0.02454927 0.02004408 110 0.79542925 0.02499461 0.05786735 0.20741036 0.12425707 0.10026489 0.52217379 0.07222867 0.04098529 0.01316338 0.61797844 0.07754096 0.25411234 0.81874024 0.65028977 0.61943324 0.58721103 0.55062965 0.49666723 0.02424438 0.11514694 0.66633993 0.95818883 0.23797821 0.02342019 0.02218797 0.00853891 0.09554308 0.07973973 0.06837414 0.22094584 0.89814708 0.94203852 0.9184548 0.14130782 0.12785646 0.00694411 0.36885309 0.30546951 0.03895126 111 ASW_72_05M ASW_72_100M ASW_76_05M ASW_78_05M ASW_82_05M ASW_82_40M ION_36_05MION_36_17MION_38_05MION_38_25M 0.0136352 0.00709625 0.0115357 0.14809098 0.04988793 0.04401775 0.01581018 0.02056154 0.0309069 0.04859766 0.00730297 0.01126532 0.00539184 0.09416053 0.09565903 0.0970838 0.00885488 0.01018286 0.01353994 0.02505783 0.23511333 0.00519678 0.91608885 0.94334245 0.00219254 0.00236735 0.67463543 0.73885813 0.73530024 0.79523388 0.25859846 0.01036104 0.03487298 0.28760852 0.00107369 0.00166633 0.02699753 0.0383009 0.08351324 0.11229213 0.01523312 0.00706107 0.01069955 0.09763167 0.07292985 0.05165657 0.01531297 0.01990868 0.02989911 0.04579585 0.17528602 0.03863886 0.01444286 0.08733358 0.00348093 0.00453698 0.73012708 0.71607355 0.9135759 0.95298417 0.18145279 0.00513071 0.08499408 0.07082452 0.00122017 0.00125512 0.02269869 0.03009927 0.06520044 0.08937275 0.06077398 0.00518203 0.05250007 0.69167867 0.00252771 0.00243698 0.16138886 0.26555035 0.49454933 0.65831637 0.0914943 0.00647638 0.32737496 0.4715038 0.00382795 0.00367548 0.27058177 0.31791697 0.30994099 0.40340938 0.14323761 0.04085294 0.01097465 0.06470487 0.00333801 
0.0035737 0.63313551 0.6307118 0.85320291 0.93148577 0.12451464 0.02079321 0.00672406 0.01935115 0.00130133 0.00140995 0.2622318 0.28928224 0.4112791 0.57096235 0.2031734 0.00237654 0.42783524 0.77999999 0.00066233 0.00128029 0.94860857 0.95800473 0.95023857 0.96042503 0.22281862 0.00809108 0.45841305 0.80708536 0.00332147 0.00429605 0.94982316 0.95813497 0.94971624 0.95906667 0.26492884 0.01129052 0.53328393 0.86107999 0.00551469 0.00533138 0.94137795 0.94989188 0.9496241 0.95399905 0.21716207 0.00253269 0.46001648 0.80458896 0.00052797 0.00077012 0.96039491 0.97118042 0.96375072 0.96863482 0.00961903 0.00331613 0.02226937 0.91118413 0.0057989 0.00521488 0.01209109 0.01621177 0.02394225 0.04075259 0.02668788 0.00447665 0.0171739 0.07004945 0.00164323 0.00207472 0.04298841 0.0530125 0.07304452 0.11172775 0.02656697 0.00512222 0.01188932 0.0388658 0.00149353 0.00199569 0.02541325 0.0294344 0.05226755 0.08456533 0.0181416 0.00755227 0.01222889 0.07480749 0.0040689 0.0044125 0.0201859 0.02852036 0.04135878 0.07299341 0.04819514 0.00135264 0.18135896 0.64978903 0.00048956 0.00045469 0.10968122 0.13927364 0.15539761 0.23828018 0.01977195 0.00900796 0.01388914 0.04022447 0.00470786 0.00422806 0.02332312 0.02954034 0.0475527 0.07609455 0.14647769 0.04803617 0.01497709 0.04343585 0.00429384 0.00683197 0.21310986 0.24150752 0.35831173 0.53343645 0.00561885 0.00199741 0.00420367 0.0455253 0.05371129 0.0542201 0.01140307 0.0138969 0.01547874 0.02954633 0.05918591 0.00807227 0.2142726 0.77970213 0.00350078 0.00602079 0.12744929 0.16545422 0.17515725 0.26453867 0.23924669 0.00786777 0.49652515 0.85651277 0.0037722 0.00464312 0.95397986 0.96711226 0.95688458 0.96741769 0.03083846 0.00518214 0.0139501 0.04275774 0.00238572 0.0025477 0.0306617 0.03675273 0.05813342 0.09537902 0.02220623 0.00526987 0.01662224 0.04163061 0.00232279 0.00223012 0.02873331 0.03357515 0.0492598 0.07890582 112 0.00807575 0.00361581 0.00883036 0.05153061 0.04406167 0.05043629 0.01771256 0.02082388 
0.02525186 0.04037252 0.00786738 0.00887745 0.00441322 0.34878671 0.06870713 0.07098591 0.01318701 0.01554067 0.0194731 0.03601859 0.09534845 0.0055772 0.20773787 0.85395187 0.00189878 0.00208915 0.06217883 0.07620676 0.1263874 0.16604062 0.27257745 0.00639427 0.52112661 0.83639465 0.00242007 0.00323374 0.88791795 0.91466943 0.89326259 0.94157233 113 ION_39_25MION_41_05MION_41_60MION_42_05MION_42_80MION_45_05MION_48_05MIOS_52_05MIOS_52_75MIOS_56_05M 0.02540571 0.01018821 0.00575428 0.00461063 0.00526106 0.00594574 0.00366788 0.0146637 0.01428272 0.00711372 0.0130417 0.00744814 0.00616702 0.00479444 0.0057625 0.00526661 0.00477247 0.01125331 0.01189862 0.00667104 0.87388879 0.93423818 0.14244438 0.81477348 0.08209412 0.90007886 0.15536575 0.04549737 0.0400448 0.79907928 0.23133532 0.07243102 0.00597343 0.04822274 0.00675471 0.14619318 0.08773565 0.0429867 0.03797421 0.0580967 0.02474186 0.01039307 0.00655911 0.00594983 0.00596607 0.00683651 0.00512376 0.01665075 0.01480986 0.00863395 0.53567568 0.03193705 0.05538161 0.01395763 0.0508902 0.0189907 0.01214181 0.58237054 0.66510391 0.11581632 0.3319496 0.58470978 0.00815328 0.54893841 0.02161558 0.80149602 0.64546177 0.04574985 0.03768492 0.57556024 0.7240818 0.41240956 0.02910146 0.05188087 0.03596408 0.06198916 0.01186878 0.06589337 0.05517715 0.162767 0.44589844 0.34169197 0.05729918 0.16941814 0.03021027 0.30403114 0.05222877 0.05375151 0.04639895 0.24370947 0.47842778 0.02647938 0.05360081 0.01201594 0.04901892 0.01719813 0.01168552 0.55693764 0.71316563 0.11003505 0.41643745 0.0229294 0.06712092 0.00857421 0.05673625 0.01543038 0.01138781 0.57769513 0.79481316 0.14638169 0.95042469 0.84745266 0.42189184 0.52368223 0.11744147 0.71599131 0.19165834 0.04249808 0.0383007 0.65168119 0.95503084 0.86670952 0.42006596 0.55481871 0.12131305 0.74011319 0.21210931 0.0544319 0.04564245 0.68931968 0.96766361 0.91985173 0.48107961 0.66541373 0.16165495 0.83720858 0.26180865 0.06970493 0.06338737 0.76038067 0.97306794 
0.87158559 0.41572849 0.55402378 0.11771001 0.74575434 0.20444896 0.0489318 0.04253792 0.67803131 0.12815399 0.03875855 0.00713486 0.01016707 0.01233104 0.00640995 0.00366717 0.01523269 0.0133624 0.2944994 0.12315291 0.04842143 0.01192425 0.01839754 0.0093178 0.02110307 0.00834064 0.05369836 0.04283405 0.06408638 0.05216794 0.0335305 0.00707069 0.01268011 0.00752201 0.01462845 0.00683369 0.05067528 0.0401058 0.02236897 0.04241854 0.01917547 0.01031572 0.01039872 0.01186527 0.01156086 0.00773483 0.03830099 0.03161419 0.01812244 0.29002984 0.15335632 0.01725901 0.04881213 0.00982108 0.13169101 0.01230633 0.04310086 0.03504493 0.10874747 0.04653424 0.02385429 0.00933254 0.01360478 0.00988403 0.01487135 0.00798517 0.04339005 0.04066403 0.01975575 0.36927795 0.03999566 0.06307222 0.02299994 0.0554272 0.02990649 0.02603286 0.3665661 0.60050752 0.08890295 0.01561781 0.0069864 0.00619678 0.00182625 0.00226312 0.00322763 0.00158281 0.01272977 0.01305948 0.00455411 0.32245739 0.18213573 0.02625725 0.06563154 0.01632446 0.15942371 0.01592281 0.04824572 0.04284364 0.13825322 0.96709151 0.88034893 0.46073451 0.5954383 0.14038001 0.77088544 0.24598558 0.05734333 0.04953404 0.71404499 0.06356837 0.05159218 0.00925917 0.01759606 0.00799018 0.01645122 0.00979107 0.06233264 0.04737932 0.03436256 0.05414667 0.03675162 0.00938198 0.01383472 0.00749614 0.01750699 0.0080838 0.04268521 0.03329568 0.02543189 114 0.02691809 0.01533676 0.00475654 0.0051591 0.00326374 0.01093924 0.00418525 0.01277424 0.01113601 0.00858457 0.01884367 0.00778548 0.0047261 0.00330732 0.00529975 0.00461281 0.00344633 0.01222421 0.01402187 0.00720611 0.24791653 0.46219666 0.0103641 0.21775288 0.01005157 0.26769178 0.05012072 0.06530424 0.05393143 0.19638804 0.91557386 0.78910926 0.312595 0.55619033 0.10708061 0.72691824 0.3239361 0.04736395 0.04008546 0.56496557 115 IOS_57_05MIOS_58_66MIOS_62_05MIOS_64_05MIOS_64_65MIOS_65_05MIOS_65_30MMED_18_05M MED_18_60M MED_23_05M 0.03119646 0.01278499 0.01641823 0.02401491 
0.03081438 0.01062193 0.02689866 0.26872257 0.17660346 0.07973884 0.01641027 0.0081181 0.01069212 0.0294329 0.07090581 0.01625498 0.04445404 0.10332129 0.09873282 0.14203538 0.95498435 0.89136043 0.92395285 0.95422733 0.95907267 0.89199315 0.94657254 0.2707368 0.22925295 0.06309509 0.14844727 0.12349894 0.10038971 0.12625947 0.17564542 0.03898804 0.21486717 0.81362302 0.65200535 0.48595679 0.03148658 0.01492144 0.02187915 0.02279552 0.03028538 0.00986041 0.02690462 0.05994552 0.04878341 0.19006532 0.08416149 0.19325 0.04724112 0.04135755 0.06907993 0.02172121 0.07951165 0.00873006 0.01093527 0.00942916 0.79363754 0.65032472 0.74930986 0.25137787 0.28786885 0.10031909 0.48165671 0.03295607 0.02566734 0.03021244 0.56406826 0.48524329 0.45029002 0.57148411 0.67077726 0.25883707 0.57269708 0.38578819 0.612284 0.91905428 0.60080744 0.39251798 0.44217429 0.53989618 0.64393676 0.38624862 0.59299203 0.0939921 0.08507124 0.05307994 0.07693027 0.17469952 0.04172216 0.04129599 0.06476056 0.02095568 0.07257385 0.00906026 0.00836819 0.00851002 0.08088598 0.15825056 0.04532958 0.04025217 0.06398849 0.02060452 0.09181902 0.0043868 0.00379612 0.00673954 0.94786333 0.86078873 0.89062543 0.92211796 0.93080988 0.9099976 0.9440314 0.24445855 0.21252372 0.04402565 0.95951135 0.88851555 0.91464924 0.93983709 0.94687531 0.93405079 0.95313672 0.27134758 0.22799035 0.05813931 0.9626059 0.91419407 0.92994887 0.93879623 0.94688742 0.91599975 0.94867366 0.33124996 0.28627712 0.0748168 0.96070785 0.8773471 0.90071048 0.93335564 0.9427974 0.91845438 0.95006145 0.25432993 0.21669389 0.05580599 0.82825515 0.24002025 0.71408415 0.66820118 0.71454152 0.49895295 0.78940138 0.75579133 0.82206841 0.95317554 0.85809075 0.22620554 0.41076734 0.31287019 0.44516481 0.21748626 0.67809648 0.01244385 0.01219487 0.02338591 0.08804949 0.04035911 0.04743824 0.03737588 0.05385475 0.01862556 0.05460657 0.00836379 0.00939463 0.02588505 0.06142907 0.05091353 0.04751996 0.06481217 0.10768292 0.04614425 0.18885769 
0.02032498 0.0189604 0.02174302 0.50093162 0.24917761 0.31196819 0.48372428 0.64709608 0.28433517 0.54413616 0.05563896 0.05162166 0.0406086 0.06011328 0.03502693 0.03809813 0.04541641 0.06082882 0.02764469 0.08028387 0.01954094 0.0216308 0.02362475 0.08784591 0.10844265 0.05314696 0.05713235 0.08411146 0.02922804 0.09129804 0.0104537 0.01028843 0.01111324 0.02092519 0.00765621 0.01112748 0.01966697 0.04158948 0.00883058 0.0291035 0.05684336 0.05730284 0.08877832 0.54309107 0.28786812 0.35323909 0.54997271 0.71351058 0.32505499 0.59665298 0.0792048 0.06529192 0.0509638 0.96988798 0.89902142 0.9220699 0.9507949 0.95673565 0.93699515 0.9664688 0.30774306 0.262928 0.06272424 0.10126323 0.05428954 0.06457144 0.04507388 0.06040942 0.02386332 0.06546795 0.03109085 0.04227688 0.32873969 0.08568538 0.0420488 0.05101053 0.04267071 0.05561768 0.02639634 0.05861034 0.02819652 0.04667995 0.40039736 116 0.03315821 0.01681757 0.01968328 0.02769941 0.04991246 0.01611246 0.04010727 0.05746069 0.05523493 0.08332555 0.0233318 0.01331079 0.01366765 0.05917014 0.16696918 0.05648825 0.23348312 0.08124114 0.07822595 0.11153262 0.48410328 0.39672984 0.43838076 0.37243627 0.46201608 0.13878333 0.30746289 0.85290807 0.92282779 0.95615888 0.92206338 0.77522284 0.81889636 0.9096958 0.9330092 0.81755826 0.92021482 0.36586349 0.32980599 0.05226129 117 MED_25_05M MED_25_50M MED_30_05M MED_30_70M PON_CRUD_St13 PON_CRUD_St17 PON_CRUD_St18 PON_CRUD_St8 PON_132_05M PON_133_05M 0.33728011 0.40556636 0.01047456 0.02183344 0.03370255 0.02250536 0.04482808 0.02242176 0.00370519 0.55942753 0.34744674 0.51055617 0.00743711 0.0597565 0.02577714 0.03361056 0.1938367 0.01222491 0.004298 0.32390817 0.17216782 0.05324006 0.06624403 0.10365435 0.15038219 0.10460057 0.3882318 0.10169948 0.73998641 0.06611193 0.90245724 0.6154819 0.69142758 0.92060379 0.26956579 0.06004569 0.06025383 0.1088422 0.10191564 0.04408047 0.59507531 0.7084975 0.01025968 0.02363326 0.03104972 0.02235676 0.04593236 0.02087101 0.00486997 
0.7293155 0.03187075 0.01915626 0.01166953 0.01482293 0.98669269 0.96296368 0.97941962 0.94223862 0.00711846 0.706206 0.11384625 0.03052504 0.04090209 0.0900927 0.1389985 0.05268402 0.0565117 0.06550113 0.02116132 0.03893147 0.96187206 0.91447707 0.893727 0.92326095 0.12312912 0.07998063 0.09407709 0.08956257 0.01171883 0.07947013 0.15467823 0.05510695 0.05867939 0.09535881 0.12087545 0.08920671 0.20885666 0.08676901 0.08411613 0.06860431 0.02761428 0.01762373 0.01023726 0.01493224 0.93806788 0.94374634 0.96517379 0.86929414 0.00684652 0.70789748 0.02555167 0.01203232 0.0076539 0.01112563 0.63467935 0.69807372 0.58216198 0.47695816 0.00252261 0.39062431 0.13892653 0.03815746 0.04970378 0.07498346 0.24536477 0.25060267 0.81343033 0.14701473 0.17602541 0.05974442 0.15722055 0.04716661 0.06307343 0.08860347 0.2499972 0.28090426 0.83872579 0.1531495 0.19663507 0.0706165 0.18100236 0.06655232 0.07632544 0.11144123 0.31580245 0.2526287 0.77642891 0.17779063 0.24727513 0.08143161 0.15873875 0.04490453 0.05856396 0.0883715 0.24467583 0.25466576 0.8357691 0.14426419 0.18158327 0.06642828 0.97053042 0.92359656 0.9004886 0.90155598 0.03089533 0.02013953 0.02273071 0.02774711 0.00355776 0.01745859 0.07475647 0.02439304 0.02391484 0.03280437 0.08546623 0.05751215 0.06420736 0.05813803 0.00543347 0.04128928 0.09427099 0.02220772 0.02731328 0.03509353 0.06941908 0.04510532 0.04882406 0.04712891 0.00460915 0.03867235 0.0637206 0.23486214 0.01566903 0.02007285 0.05889783 0.04393449 0.04734701 0.0372494 0.00700991 0.03487053 0.1803201 0.11925445 0.0495565 0.18010579 0.11258658 0.0798894 0.19007555 0.07884622 0.02272727 0.0668949 0.08982957 0.28862683 0.01469363 0.02103551 0.06204749 0.04427072 0.04524081 0.03877203 0.00747413 0.03369246 0.03095189 0.01716265 0.01082744 0.01824596 0.5503929 0.45871058 0.43412619 0.38436914 0.01546703 0.37806866 0.24876237 0.4028845 0.00710408 0.03503537 0.01996361 0.01067798 0.02148777 0.00933432 0.00134807 0.15453269 0.21140844 0.16179573 0.06010003 
0.22173295 0.1170193 0.08887436 0.24585679 0.08466484 0.03120891 0.07141458 0.17186153 0.05458069 0.06478685 0.09810203 0.28346049 0.29576424 0.85244796 0.17117445 0.21595909 0.0698009 0.64539303 0.07826792 0.22961665 0.11100227 0.07898094 0.05266564 0.05413611 0.05344926 0.00718172 0.03874473 0.90433569 0.20553582 0.23960618 0.12622959 0.05779875 0.03821061 0.04356279 0.04082548 0.00658544 0.0366625 118 0.24059132 0.37951675 0.01225047 0.04630359 0.03442356 0.01738846 0.02984154 0.01934543 0.0034142 0.16187713 0.29541587 0.46028827 0.00755127 0.04758349 0.04088362 0.03038022 0.05925089 0.02198635 0.00266865 0.2487093 0.98577022 0.95300735 0.96468311 0.98490095 0.14568525 0.10200365 0.12756114 0.11324341 0.16108115 0.1010012 0.14665893 0.04227095 0.05386556 0.08019972 0.16086765 0.13712191 0.49774799 0.11016619 0.2666096 0.06495806 119 PON_133_45M PON_137_05M PON_137_40M PON_138_05M PON_138_60M PON_140_05M PSE_100_05M PSE_100_50M PSE_102_05M PSE_102_40M 0.71709741 0.02685354 0.0110769 0.00910435 0.00493046 0.02370965 0.00571951 0.0196082 0.02998801 0.19575017 0.66552106 0.01586035 0.02487004 0.00721836 0.00864151 0.0120873 0.00721479 0.15555456 0.04028947 0.6030449 0.03883644 0.07906109 0.01126038 0.02544576 0.00512825 0.87456316 0.01237746 0.0324172 0.08377499 0.08515279 0.02562609 0.05098324 0.01392605 0.01836307 0.00418458 0.1235832 0.01049011 0.02794347 0.05174262 0.06080603 0.95656329 0.02750785 0.01145473 0.01010426 0.00619836 0.02152408 0.00695453 0.01655476 0.03004853 0.35138448 0.53848287 0.90756529 0.18847937 0.12980423 0.08670332 0.14683997 0.43471797 0.88241285 0.94646125 0.95932429 0.02387448 0.05605921 0.01275007 0.02111089 0.00381526 0.32161492 0.01084969 0.02404689 0.05395083 0.0548548 0.04401803 0.09433107 0.01562744 0.03654546 0.00630101 0.60933857 0.02063395 0.04786477 0.09493836 0.09707736 0.04217614 0.0820696 0.01484052 0.03026313 0.00778028 0.45902548 0.0173972 0.0406604 0.08697996 0.08911098 0.53302934 0.84434361 0.1720497 0.12226335 
0.07871899 0.11433198 0.43554286 0.8765465 0.88942094 0.92861222 0.34913832 0.432674 0.10977652 0.14851957 0.07089627 0.07196812 0.24599202 0.81146078 0.35296307 0.41516005 0.02992044 0.08609164 0.0125337 0.02079988 0.00129693 0.95442057 0.00936655 0.02774971 0.09129845 0.08296223 0.04278193 0.09702888 0.02000101 0.02733076 0.00755256 0.96268135 0.01436098 0.03228143 0.10063149 0.09546243 0.04461112 0.10621917 0.02349671 0.03844735 0.0137578 0.95879564 0.02531047 0.04415333 0.10904267 0.10615133 0.03644439 0.09436972 0.0167451 0.02352139 0.00261024 0.96486673 0.00868036 0.02950115 0.09551353 0.08940266 0.01191838 0.02355947 0.01172759 0.00909339 0.00331619 0.85700605 0.00526639 0.01012243 0.02288065 0.02671464 0.02706048 0.05716055 0.01728085 0.02459448 0.00515791 0.43065851 0.01659081 0.03261285 0.05385695 0.06214532 0.02263373 0.05503905 0.01830692 0.02587714 0.00734607 0.06705796 0.01508857 0.02434616 0.04875598 0.04973923 0.02488009 0.04087835 0.01989841 0.01923922 0.00861235 0.04666349 0.01288752 0.0298563 0.03901216 0.05694468 0.03843369 0.07631661 0.01002507 0.02257937 0.00396955 0.2201543 0.01217249 0.03634832 0.08181929 0.08881225 0.02769052 0.0415925 0.01917422 0.01716637 0.00807061 0.04079615 0.01565707 0.02980218 0.04093056 0.04969902 0.32589762 0.29339322 0.11622672 0.29528188 0.04824418 0.06024492 0.16624188 0.32368618 0.2833495 0.36287249 0.32733013 0.0132273 0.01704561 0.0060329 0.00599724 0.01716225 0.00265142 0.03452913 0.01458965 0.02361268 0.04472767 0.08322978 0.0186073 0.03726936 0.01522387 0.25700873 0.01602441 0.04254056 0.08893361 0.09696071 0.03927213 0.10233171 0.01950408 0.02747152 0.00839343 0.97275838 0.01602079 0.03527323 0.10966653 0.10768914 0.0258486 0.06072215 0.01610676 0.03249174 0.00623605 0.07297414 0.01803646 0.026287 0.05051704 0.05075028 0.02359483 0.04412542 0.01453828 0.02132841 0.00646969 0.06286468 0.01411419 0.02202648 0.03982885 0.04302877 120 0.31852184 0.01741424 0.0218871 0.00564692 0.00519986 0.02639647 0.00560433 
0.03892615 0.02044468 0.02969905 0.50340532 0.01950092 0.0247413 0.00615656 0.00588979 0.01750638 0.00630273 0.11553442 0.02490019 0.05830854 0.05969684 0.11683403 0.01681409 0.04421393 0.00598219 0.61389389 0.01792991 0.05411964 0.11913036 0.1281148 0.03833971 0.08936781 0.01280431 0.0267602 0.0058066 0.90337042 0.01414528 0.03410833 0.0929502 0.08735419 121 PSE_109_05M PSE_109_30M PSE_110_05M PSE_110_50M PSE_111_05M PSE_111_90M PSE_93_05MPSE_93_35MPSE_94_05MPSE_96_05M 0.03398166 0.02153308 0.02633882 0.02759097 0.00951768 0.00936532 0.95922401 0.92964265 0.03825046 0.0037015 0.01960465 0.01831933 0.06718376 0.06762815 0.00964942 0.02354815 0.77650974 0.82466992 0.17208679 0.00676409 0.10287007 0.06374751 0.07411134 0.07901627 0.0177425 0.0141921 0.06337565 0.02474279 0.0727818 0.01505178 0.10310171 0.04053603 0.04493086 0.03578805 0.01665724 0.01376616 0.04231505 0.01696145 0.05977807 0.00349659 0.03389516 0.02170738 0.02657824 0.02815541 0.00969059 0.00992053 0.97477627 0.96967832 0.03725893 0.0062468 0.91235756 0.89254359 0.94051872 0.92367289 0.50780973 0.71821036 0.93741672 0.87613139 0.84402027 0.01030081 0.08271209 0.04367993 0.04432313 0.03343658 0.01664553 0.01091114 0.03783244 0.0159978 0.05522671 0.00296221 0.11201224 0.07685861 0.0772552 0.06475944 0.03033937 0.02294576 0.06814202 0.02892865 0.09050387 0.00541161 0.10210228 0.07072832 0.07366309 0.06529279 0.02279956 0.02026285 0.06626126 0.02922631 0.0777568 0.01253277 0.83866302 0.81734201 0.91161287 0.88494252 0.52460791 0.75623988 0.91920281 0.83458651 0.86235809 0.00949482 0.55591648 0.39807527 0.48371997 0.46250211 0.3974339 0.63739887 0.41957578 0.31388097 0.36812803 0.00792894 0.12281883 0.07239431 0.14462417 0.19519518 0.01216504 0.01029766 0.0600063 0.02085104 0.06474555 0.00555224 0.12585122 0.07773453 0.1558136 0.20742168 0.01866391 0.01816364 0.06711357 0.03040703 0.07556826 0.01217712 0.13966263 0.09107952 0.16361176 0.21139596 0.02831861 0.0230202 0.07799461 0.0438839 0.08211386 
0.01332265 0.1253943 0.07605749 0.14336532 0.18881437 0.01524009 0.0121763 0.06841789 0.02710366 0.0690851 0.00794155 0.02848994 0.01766352 0.01925122 0.01393292 0.00683925 0.00523851 0.01904261 0.01052742 0.01957637 0.00264692 0.07081189 0.04490661 0.0512393 0.04216545 0.01866676 0.0156592 0.04531743 0.02233654 0.04632284 0.00429246 0.06708447 0.04147195 0.04546845 0.03613921 0.01611216 0.01465576 0.03701361 0.01760972 0.04350629 0.00438838 0.0475481 0.02899135 0.03748596 0.03627796 0.01842733 0.01509107 0.04198995 0.02274131 0.0433251 0.00685224 0.0941256 0.06244603 0.06849316 0.07175363 0.01847763 0.01763059 0.06805612 0.03130927 0.07532126 0.00443824 0.04916475 0.03253902 0.03965574 0.03681071 0.01873689 0.016089 0.03664167 0.02096178 0.0410332 0.00726545 0.39296143 0.26263901 0.33575009 0.29462028 0.16789218 0.19226963 0.4095667 0.29765314 0.26602285 0.0225654 0.01962736 0.01403065 0.02260933 0.01878983 0.00415295 0.00652764 0.4202907 0.38724033 0.05135413 0.00228537 0.09954963 0.07188022 0.07501842 0.08633366 0.02292266 0.02063982 0.07554653 0.03611973 0.08313496 0.01025792 0.13339262 0.08430778 0.1710676 0.21630865 0.01843535 0.01616541 0.0700832 0.02851354 0.07699582 0.01279925 0.07185049 0.04672491 0.04910817 0.04092714 0.01884963 0.01422893 0.04124405 0.01999335 0.04165458 0.00633873 0.05462684 0.03383985 0.03698985 0.03019706 0.0154767 0.01168098 0.04045221 0.02771159 0.03657388 0.00487008 122 0.02497331 0.0146792 0.02292726 0.02206315 0.00735869 0.00819463 0.4097592 0.38663989 0.05072067 0.0031358 0.02796755 0.01996677 0.05680084 0.05146219 0.00857057 0.01631936 0.60982306 0.60735464 0.12037736 0.003341 0.13422845 0.09620209 0.09470405 0.07652201 0.02863305 0.0235921 0.09087924 0.03594871 0.11031266 0.00566052 0.11557565 0.07197149 0.09785198 0.11152081 0.01849548 0.01476315 0.06482049 0.02787695 0.067703 0.01211101 123 PSE_98_05MPSE_99_05MPSW_112_05M PSW_122_05M PSW_123_05M PSW_124_05M PSW_125_05M PSW_128_05M PSW_128_40M RED_31_05M 0.00357957 
0.00433429 0.00434676 0.00454494 0.00854503 0.01601524 0.01403172 0.00972254 0.00923503 0.01844487 0.00430773 0.00401617 0.00548623 0.0084816 0.00736957 0.01193761 0.01230129 0.01123506 0.01084183 0.01301792 0.17069446 0.00622861 0.27097314 0.01074376 0.43941443 0.47839625 0.33202886 0.05836181 0.02492714 0.9407692 0.01187895 0.00447533 0.02103662 0.01048611 0.01647308 0.03383414 0.03015754 0.02155495 0.02131271 0.08703895 0.00465367 0.00548625 0.00622385 0.00634807 0.00927444 0.01535499 0.01479306 0.01095355 0.0112065 0.01907013 0.01259854 0.01214273 0.01902261 0.2422125 0.24784117 0.42138225 0.43768421 0.37318156 0.36224175 0.029353 0.00686196 0.00437774 0.0221508 0.01120066 0.01978623 0.03716484 0.03467985 0.0279296 0.02422955 0.54359072 0.00835026 0.00771023 0.01994645 0.02028013 0.03728022 0.06638165 0.0601467 0.04403659 0.0424226 0.31887776 0.11432219 0.00766268 0.06936793 0.01644789 0.17037973 0.19653426 0.13327833 0.04045715 0.03363287 0.41392329 0.01169641 0.01041484 0.0188579 0.2331036 0.2448587 0.41580587 0.43819581 0.36747714 0.36243669 0.0227299 0.01027662 0.0088168 0.02774195 0.50010843 0.39539833 0.67344826 0.52115007 0.50155303 0.45347958 0.01677069 0.09348317 0.00274364 0.37002105 0.00596114 0.90120836 0.91612988 0.83197512 0.06228347 0.02161668 0.88741075 0.11315481 0.00882295 0.39494122 0.01380378 0.88891821 0.8986481 0.81396394 0.07322655 0.02944947 0.91137086 0.13684541 0.01213233 0.43563716 0.02207408 0.92157449 0.92863482 0.88049293 0.09159541 0.03906937 0.92634843 0.10513453 0.00377967 0.38886598 0.00863943 0.92880072 0.93304452 0.8462321 0.06876862 0.02441574 0.89917164 0.00268124 0.0028925 0.06654663 0.00571932 0.00948154 0.01779884 0.01829736 0.02005856 0.01130965 0.80849145 0.00587203 0.00475945 0.03064891 0.01932815 0.20013323 0.09800333 0.05839122 0.04020665 0.03720449 0.4932366 0.00514324 0.00489236 0.01049471 0.01728639 0.02776517 0.04912893 0.05358795 0.0340962 0.03297868 0.04284139 0.00753603 0.00825377 0.00929378 0.01704751 
0.0200241 0.03677271 0.04174018 0.02800876 0.02856731 0.04000218 0.05041627 0.00210564 0.02556566 0.01007505 0.07789045 0.10264368 0.06888552 0.02756227 0.02546165 0.21416533 0.00714654 0.00763642 0.01065661 0.0175894 0.02140581 0.03650382 0.04066612 0.02912801 0.0265596 0.03602494 0.02325646 0.02023156 0.04156377 0.19172622 0.20033639 0.33270612 0.35918122 0.28929627 0.28396116 0.02901916 0.00180164 0.00165651 0.00377208 0.00534399 0.00691459 0.01295103 0.01402147 0.00781474 0.00720742 0.0113571 0.06843304 0.00805003 0.03803908 0.01223804 0.09491466 0.11904975 0.08269271 0.03251078 0.03396415 0.2390385 0.11923316 0.00826642 0.42348456 0.01368962 0.91140682 0.92318754 0.85082805 0.07806315 0.03063993 0.92366925 0.00680041 0.0064451 0.01550575 0.02116681 0.0312217 0.05623037 0.06431779 0.04060537 0.0404895 0.04475321 0.00641225 0.00550185 0.0110067 0.01474346 0.02671731 0.04445587 0.04765845 0.02756055 0.02761356 0.04280144 124 0.00271046 0.00226012 0.00475446 0.00637317 0.01401755 0.02097449 0.01978156 0.01022887 0.01077307 0.02210679 0.00227286 0.00260625 0.00342577 0.00770466 0.00850536 0.01642511 0.0144404 0.01016796 0.00954883 0.01608436 0.01753548 0.0857265 0.03268312 0.01716499 0.05030376 0.07627431 0.06091558 0.04596118 0.04169244 0.63451764 0.13070356 0.00660514 0.16710433 0.01089372 0.59359555 0.63534718 0.46425909 0.04671993 0.02592883 0.83877035 125 RED_32_05MRED_32_80MRED_33_05MRED_34_05MRED_34_60MSOC_84_05MSOC_85_05MSOC_85_90M 0.00845801 0.01157421 0.01668833 0.01240168 0.01170226 0.00163096 0.00140135 0.00137581 0.96913547 0.00618632 0.01233544 0.00790535 0.00651618 0.01273441 0.00214264 0.001789 0.00222175 0.82466992 0.89492778 0.75261898 0.87752313 0.79966761 0.10607867 0.00204834 0.00157972 0.00161611 0.95907267 0.0298968 0.02042862 0.04695254 0.0281126 0.01413248 0.00102149 0.00090231 0.00105223 0.92985155 0.00858211 0.01186372 0.0145339 0.01159085 0.01213163 0.0026078 0.00207446 0.00263824 0.97477627 0.01453413 0.0161179 0.0309449 0.0164885 
0.02135022 0.00306681 0.00317538 0.00320103 0.98669269 0.1801785 0.08969043 0.16440448 0.06549311 0.03385121 0.00111568 0.00101594 0.00097242 0.85168928 0.16188294 0.08484653 0.43864084 0.43588811 0.03583616 0.00216331 0.0022446 0.00197994 0.96187206 0.27034909 0.14988699 0.421793 0.31230051 0.05417285 0.00285237 0.00263491 0.00262406 0.64393676 0.01367746 0.01490803 0.02452137 0.01611371 0.01926202 0.00286202 0.00369351 0.00281925 0.96517379 0.00822798 0.01030343 0.01977913 0.01030349 0.01296625 0.00107198 0.00100267 0.00115381 0.81146078 0.82972179 0.59471239 0.95386626 0.94534292 0.31564987 0.00063017 8.99E-05 0.00080147 0.96042503 0.86560578 0.62115141 0.95548016 0.95020225 0.32667197 0.00313922 0.00313467 0.0035268 0.96268135 0.87328979 0.66691176 0.95518956 0.94754813 0.35875331 0.00384381 0.00408661 0.00437755 0.96766361 0.85406516 0.60124938 0.96868129 0.96560989 0.32342877 0.00062078 0.00044367 0.00081633 0.97306794 0.7616037 0.5659768 0.8277755 0.57005906 0.45024537 0.00203395 0.00171619 0.00205659 0.97053042 0.12077533 0.27043553 0.81780004 0.53824287 0.07363339 0.00153092 0.00140228 0.00129891 0.87771152 0.0202214 0.01775815 0.04573433 0.02828043 0.01947495 0.00138082 0.00161394 0.00155778 0.1317347 0.01504344 0.0304262 0.03027746 0.01794559 0.0304048 0.00328757 0.00341087 0.00318165 0.95016822 0.11099011 0.05473779 0.26718526 0.16899137 0.02565825 0.00015111 0.00021696 0.00024796 0.8689779 0.01522951 0.02176387 0.03266637 0.018338 0.02022152 0.00347314 0.00348383 0.00370756 0.72617183 0.01850195 0.02500067 0.02954777 0.02178603 0.02941714 0.0053247 0.00400175 0.00262561 0.60804496 0.00594072 0.01016802 0.01026537 0.00666179 0.01146461 0.0006487 0.00017807 0.00044283 0.73395982 0.12944263 0.06646264 0.30150847 0.19688224 0.03508419 0.00421178 0.00286122 0.00291207 0.93248196 0.87169207 0.6802383 0.97641137 0.96313144 0.34607043 0.00398678 0.00367633 0.00387987 0.98570754 0.02242291 0.01606186 0.04625531 0.02786202 0.01525449 0.00205687 0.00210117 
0.00215924 0.89537737
0.02216541 0.01706023 0.03971051 0.02812166 0.01727104 0.00196672 0.00183904 0.00188984 0.90433569
0.01120586 0.01234969 0.02136603 0.01554198 0.01411196 0.00112131 0.00102073 0.0012587 0.79542925
0.00555154 0.01163821 0.01295643 0.01011508 0.01515607 0.00135088 0.00096789 0.00101011 0.92292152
0.38641311 0.08538509 0.37247571 0.27734436 0.02259363 0.00154117 0.0016869 0.00157223 0.98577022
0.7086945 0.43237426 0.87691381 0.81505881 0.11170578 0.00224206 0.00216811 0.00204282 0.94203852

Table S4 Genome relative abundances*, **
* abundance is operationally defined here as the proportion of total reads that recruited to the genome
** modified – meaning genomes with detection values less than 50 had their total number of reads recruited set to 0 before getting percentages of total reads

ANE_004_05M ANE_004_40M ANE_150_05M ANE_150_40M ANE_151_05M ANE_151_80M ANE_152_05M ANW_141_05M
BL107 0.034753816 0.039034126 0.19842402 0.1209668 0.15426613 0.06235292 0.47217359 0
CC9311 0 0.011888475 0 0 0 0 0.05356431 0
CC9605 0.389305114 0.070483561 0.03131792 0.04590541 0.02131868 0 0 0.21371053
CC9616 0.069954664 0.017981532 0.01707579 0.01419351 0 0 0 0
CC9902 0.011784199 0.014086922 0.15935269 0.07849699 0.03709654 0.01576293 0.30328519 0
GEYO 0 0 0.05132743 0.02896114 0 0 0.13654494 0
KORDI100 0 0 0 0 0 0 0 0
KORDI49 0.009294655 0 0.04070754 0.04300143 0 0 0 0.19657728
KORDI52 0 0 0 0 0 0 0 0.08181339
MIT9508 0 0 0.04539668 0.02601767 0 0 0.13532082 0
MIT9509 0 0 0 0 0 0 0 0
N19 0.024354727 0.009817353 0.01203886 0.01573869 0 0 0 1.19792357
N26 0.029887434 0.012073425 0.01541359 0.01907516 0 0 0 1.4107759
N32 0.045598908 0.01809875 0.02109991 0.02647305 0.01303703 0 0 1.30792457
N5 0.025308535 0.009952018 0.01251349 0.01612663 0 0 0 1.15482392
RCC307 0.008631267 0.029587616 0 0.00920815 0 0 0 0.64497666
RS9916 0 0 0 0 0 0 0 0.05354705
RS9917 0 0 0 0 0 0 0 0
UW105 0 0.111034959 0.02626674 0.03841133 0 0 0 0
UW106 0.02465115 0.037320028 0.04769035 0.07186511 0.04371297 0.0197723 0 0
UW140 0 0.018765748 0 0.01341379 0 0 0 0
UW179A 0 0 0 0 0 0 0.0253314 0
UW179B 0 0 0 0 0 0 0 0
UW69 0.038021388 0.064819705 0.07666042 0.11472463 0.07619636 0.0350981 0 0
UW86 0.033318909 0.013673855 0.01784894 0.02285628 0.01044743 0 0 1.51884385
WH7803 0 0 0 0 0 0 0 0.04332687
WH7805 0 0 0 0 0 0 0 0
WH8016 0 0 0 0 0 0 0 0
WH8020 0 0.057461617 0.02693387 0.03771625 0.0187173 0.00997859 0.04057589 0
WH8102 0.104835935 0.018842926 0.01066979 0.01059412 0.00778927 0 0 0.65004414
WH8109 0.042229699 0.017298539 0.0232121 0.02925023 0.01260785 0 0 0.25205795

ANW_142_05M ANW_145_05M ANW_146_05M ANW_148_05M ANW_149_05M ASE_66_05M ASE_66_30M ASE_67_05M ASE_68_05M ASE_68_50M
0 0.2196601 0 0 0.15802068 0.07667463 0.15981197 0.07816865 0.94464849 0.69338891
0 0.07306321 0 0 0 0 0 0.12174337 0 0
0.08434272 0 0.06225273 0.1441683 0.09374583 0 0 0 0 0
0 0 0 0.01140573 0 0 0 0 0 0
0 0.40677184 0 0 0.219446 0.03421489 0.0704342 0.08127308 0.40592296 0.30089488
0 0 0 0 0 0 0.01811626 0 0.63395457 0.62883307
0.03891261 0 0 0 0 0 0 0 0 0
0.10257345 0 0 0.04337844 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0.48128137 0.48034956
0 0 0 0 0 0 0 0 0 0
0.08887006 0 0.11832339 0.24345337 0.10166451 0 0 0 0 0
0.0992666 0 0.13498904 0.27668485 0.11870391 0 0 0 0 0
0.12484044 0 0.15547835 0.32222427 0.14019843 0 0 0 0 0
0.08473548 0 0.1135906 0.2412883 0.1022933 0 0 0 0 0
0.14957402 0.01982665 0.20526922 0.26532891 0.08522862 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0.0443197 0.08494471 0.10116096 0 0 0 0 0
0 0 0.02074084 0.07092287 0.08287936 0 0 0 0.01789585 0
0 0 0.01113761 0.02091169 0.01959083 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0.04469729 0 0 0 0 0 0.05731915 0 0
0 0 0.03447238 0.12437869 0.14351202 0 0 0 0.03356625 0.02612478
0.12558025 0 0.18134944 0.37087003 0.14868101 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
130 0 0.05166686 0 0 0 0 0 0.05989956 0 0 0 0.05856674 0 0 0.02416922 0.01625645 0.02643445 0.08533206 0.01338404 0 0.6353821 0 0.01291278 0.16257891 0 0 0 0 0 0 0.05842491 0 0.13803233 0.30769312 0.17522768 0 0 0 0 0 131 ** modified – meaning genomes with detection values less than 50 had their total number of reads recruited set to 0 before getting percentages of total reads ASE_70_05MASW_72_05M ASW_72_100M ASW_76_05M ASW_78_05M ASW_82_05M ASW_82_40M ION_36_05MION_36_17MION_38_05M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0715487 0.18804144 0 0 0.08988169 0.10246666 0.11164505 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.11719012 0 0 0 0 0 0 0.0271069 0.01773247 0.15982106 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02033037 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.12035398 0 0 0 0 0 0 0.02233549 0.01485032 0.12688344 0.02482237 0 0 0 0 0 0 0 0 0 0 0 0 0 0.04442085 0 0 0.64156717 0.74539618 0.67832481 0 0 0 0 0.05300815 0 0 0.63739291 0.74079648 0.68235802 0 0 0 0.02139812 0.07403335 0 0 0.68562429 0.79496902 0.75918589 0 0 0 0 0.04544195 0 0 0.58505787 0.68814355 0.63248783 0 0 0 0 0.08024162 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02262778 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02657853 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.03330687 0 0 0 0 0 0 0 0 0 0.0598051 0 0 0.68550102 0.79732805 0.7326079 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 132 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.04317942 0 0 0 0 0 0 0 0 0.01838463 0.06418755 0 0 0.18782239 0.20428906 0.22217455 133 ION_38_25MION_39_25MION_41_05MION_41_60MION_42_05MION_42_80MION_45_05MION_48_05MIOS_52_05MIOS_52_75M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.16056289 0.13107349 0.11538998 0 0.02314169 0 0.06594856 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.43601573 0.02141916 0 0 0 0 0 0 0.02068602 0.01674456 0 0 0.01195551 0 0.01020943 0 0.02467876 0.01016014 0 0 0.03964282 0.06577089 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.36003278 0 0 0 0 0 0 0 0.01919955 0.01854294 0.149198 0 
0 0 0 0 0 0 0.02817645 0.02583534 0.84751373 0.63326112 0.05312008 0 0.00816475 0 0.02922041 0 0 0 0.83543586 0.67366857 0.06058289 0 0.01036316 0 0.03502085 0 0 0 0.96169976 0.75394162 0.08302973 0 0.01527577 0 0.05212112 0 0 0 0.78886061 0.64120256 0.05435891 0 0.00836299 0 0.03005158 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.1181147 0 0 0 0 0 0 0 0 0.01754273 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.91137521 0.7267263 0.06602942 0 0.0111933 0 0.03815573 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 134 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.39035959 0.20923765 0.03852438 0 0.01059072 0 0.0314838 0 0 0 135 IOS_56_05MIOS_57_05MIOS_58_66MIOS_62_05MIOS_64_05MIOS_64_65MIOS_65_05MIOS_65_30MMED_18_05M MED_18_60M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02590711 0.55460789 0.06802386 0.12941901 0.25945666 0.5595547 0.10676716 0.31105437 0 0 0 0 0 0 0 0 0 0 0.01890558 0.01009116 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.01376955 0.03032248 0.01622793 0.02338893 0 0 0 0 0 0 0 0.020762 0 0 0.01527687 0.03368021 0 0.01773877 0 0.00983754 0 0.10660893 0 0 0.03695559 0.07990488 0 0.06393591 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.01593599 0.84998518 0.07581806 0.12830737 0.1796968 0.38380976 0.22333802 0.48532571 0 0 0.01942558 0.96101487 0.08611853 0.14637709 0.20404273 0.42807153 0.25799198 0.55005656 0 0 0.02652379 1.12995561 0.11106143 0.18468865 0.24556258 0.52120492 0.27492148 0.59888383 0 0 0.01609769 0.84422054 0.07409049 0.12542926 0.17258749 0.36960494 0.2101618 0.46710196 0 0 0 0.11339396 0 0.0342067 0.01832008 0.03051745 0 0.05729817 0.01278387 0.01545449 0 0.04816557 0 0 0 0 0 0.01707042 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.03126793 0 0 0 0.02887046 0 0.0240803 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.04268022 0 0 0.01613652 0.0389699 0 0.03161833 0 0 0.02063664 1.01169857 0.09234882 0.15613595 0.21858585 
0.46750956 0.26077118 0.56417919 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 136 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02285473 0.04092475 0.01343572 0.29428395 0.04177406 0.06553448 0.12679802 0.26347476 0.08325017 0.19971843 0 0 137 MED_23_05M MED_25_05M MED_25_50M MED_30_05M MED_30_70M PON_CRUD_St13 PON_CRUD_St17 PON_CRUD_St18 PON_CRUD_St8 PON_132_05M 0 0 0 0 0 0 0 0 0 0 0 0 0.04525108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.01260469 0 0.04710303 0.0100692 0.01192366 0.13823152 0 0 0 0 0 0 0.00989266 0.01448347 0 0 0 0 0 0 0 0 0 0 0 0 0.94346258 0.30316486 0.56478619 0.13354737 0 0 0 0 0 0 0 0 0 0 0 0.11164702 0.40016569 0.05319047 0.06986017 0.14409186 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.64626113 0.22352132 0.39709114 0.09402533 0 0 0 0 0 0 0.23125989 0.0429602 0.02245449 0 0 0 0 0 0 0 0 0 0.01641394 0 0 0 0 0 0 0 0 0 0.01945812 0 0 0 0 0 0 0 0 0 0.02041533 0 0 0 0 0 0 0 0 0 0.01584583 0 0 0.15824394 0.50773066 0.03731674 0.02543302 0.03833865 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.16694612 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02005084 0 0 0 0.0470435 0 0 0 0 0 0 0 0 0 0.07644622 0 0 0 0 0 0 0 0 138 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.45168653 1.91214735 0.17405809 0.13701024 0.72835257 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 139 PON_133_05M PON_133_45M PON_137_05M PON_137_40M PON_138_05M PON_138_60M PON_140_05M PSE_100_05M PSE_100_50M PSE_102_05M 0.01786413 0.08169565 0 0 0 0 0 0 0 0 0 0.02960407 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.14858779 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02496946 0.14289726 0 0 0 0 0 0 0 0 0.02560627 0.01886592 0.06405103 0 0 0 0 0 0.09700219 0.20840621 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.01897247 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02625045 0.01801792 0.05061384 0 0 0 0 0 0.1137706 0.14869422 0 0 0 0 0 0 0 0 0.0326514 0 0 0 0 0 0 0 0.50506767 0 0 0 0 0 0 0 0 0 0.59973582 0 0 0 0 0 0 0 0 0 0.56270562 0 0 0 0 0 0 0 0 0 0.5030203 0 0 0 0 0 
0 0 0 0 0.23823579 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.58672292 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 140 0 0 0 0 0 0 0 0 0 0 0 0.0179848 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02281422 0 0 0 0 0 0 0 0 0 0.17108369 0 0 0 141 PSE_102_40M PSE_109_05M PSE_109_30M PSE_110_05M PSE_110_50M PSE_111_05M PSE_111_90M PSE_93_05MPSE_93_35MPSE_94_05M 0 0 0 0 0 0 0 0.42729904 0.11576697 0 0.00953875 0 0 0 0 0 0 0.18634774 0.15950669 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1.06073319 0.25265988 0 0.51966784 0.07867133 0.05147867 0.18156322 0.08894271 0.00860774 0.0214425 0.35603149 0.07016613 0.15319437 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.36685471 0.06079542 0.04189625 0.14814619 0.07414086 0.00885583 0.02529484 0.34781932 0.06554892 0.18591392 0 0.03231841 0 0 0 0 0.01326417 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 142 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.10977899 0.09220901 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 143 PSE_96_05MPSE_98_05MPSE_99_05MPSW_112_05M PSW_122_05M PSW_123_05M PSW_124_05M PSW_125_05M PSW_128_05M PSW_128_40M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.01454043 0 0.03573969 0.06708748 0.03012742 0 0 0 0 0 0 0.38288455 0.5593702 0.1178863 0 0 0 0 0 0 0 0.38175479 0.55233408 0.11564566 0 0 0 0 0 0 0 0.59739394 0.86757223 0.18417507 0 0 0 0 0 0 0 0.38795149 0.55066622 0.11343493 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.44194113 0.63917432 0.1350303 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 144 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.09517475 0.14313224 0 0 0 145 RED_31_05MRED_32_05MRED_32_80MRED_33_05MRED_34_05MRED_34_60MSOC_84_05MSOC_85_05MSOC_85_90M 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.21715982 0.06658256 0.02221829 0.19967236 0.10735451 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.01649294 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.10565948 0.04630795 0.01281098 1.11046701 0.67553948 0 0 0 0 0.12015527 0.05361297 0.01516055 1.24364783 0.750106 0 0 0 0 0.15315739 0.06476772 0.01947935 1.30908877 0.7701634 0 0 0 0 0.10327477 0.04664506 0.01229232 1.14977979 0.7105108 0 0 0 0 0.03934014 0.02744145 0.0125032 0.15045218 0.0182985 0 0 0 0 0 0 0 0.03465394 0.0160638 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.13243838 0.05709724 0.01734147 1.30560394 0.78417363 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 146 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02688938 0 0 0 0 0 0 0 0 0.07910749 0.0271022 0 0.25865021 0.15210413 0 0 0 0 147 Pangenome COG category summaries symbol Functional group % of all genes (84,784 total) (53,431 annotated) % of core genes (35,140 total) (30,413 annotated) J Translation, ribosomal structure, and biogenesis 9.69 13.01 R General function prediction only 9.83 7.62 M Cell wall/membrane/envelope biogenesis 8.50 6.60 E Amino acid transport and metabolism 7.81 8.90 H Coenzyme transport and metabolism 7.79 9.10 C Energy production and conversion 5.88 6.96 O Posttranslational modification, protein turnover, chaperones 5.81 6.27 G Carbohydrate transport and metabolism 5.63 6.23 S Function unknown 5.33 3.73 P Inorganic ion transport and metabolism 
4.79 4.06
L  Replication, recombination and repair  4.43  5.61
T  Signal transduction mechanisms  4.48  3.12
K  Transcription  3.88  3.73
I  Lipid transport and metabolism  3.21  3.72
F  Nucleotide transport and metabolism  3.12  3.88
Q  Secondary metabolites biosynthesis, transport and catabolism  2.54  1.83
V  Defense mechanisms  2.28  2.02
D  Cell cycle control, cell division, chromosome partitioning  1.33  1.43
U  Intracellular trafficking, secretion, and vesicular transport  1.35  0.99
N  Cell motility  1.06  0.65
X  Mobilome: prophages, transposons  0.79  0.36
W  Extracellular structures  0.48  0.20

Pangenome-defined
Total genes  84,784  35,140
Annotated genes  53,431 (63.0%)  30,413 (86.5%)
Not annotated genes  31,353 (37.0%)  4,727 (13.5%)

Pangenome COG category summaries (continued; rows in the same category order as above)
symbol  % of accessory genes (41,026 total) (20,954 annotated)  % of unique genes (8,181 total) (2,064 annotated)  % of partial genes (427 total) (155 annotated)
J  5.72  3.03  3.91
R  12.58  13.18  8.38
M  10.59  13.78  11.73
E  6.72  3.71  1.12
H  6.29  4.64  2.79
C  4.79  1.98  1.12
O  5.29  4.83  7.26
G  4.92  4.53  2.23
S  7.46  6.44  12.29
P  5.93  3.74  2.23
L  2.83  3.89  3.35
T  6.39  4.49  0.56
K  4.02  4.53  13.41
I  2.65  1.83  1.12
F  2.31  0.64  1.12
Q  3.00  7.49  9.50
V  2.40  4.61  0.56
D  1.19  1.35  2.23
U  1.75  2.43  8.94
N  1.46  2.70  0.00
X  1.06  3.82  2.79
W  0.66  2.36  3.35

Pangenome-defined
Total genes  8,181  41,463
Annotated genes  1,909 (23.3%)  21,109 (50.9%)
Not annotated genes  6,272 (76.7%)  20,354 (49.1%)
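The footnotes to Table S4 define abundance operationally as the share of total reads that recruited to a genome, with genomes whose detection value is below 50 set to zero reads before percentages are taken. A minimal sketch of that calculation follows. It is an illustration rather than the author's pipeline, and it rests on several assumptions: that detection is expressed as a percentage, that the denominator is the total read count of the sample, and that values are reported as percentages. The function name, parameter names, and example numbers are all hypothetical.

```python
def relative_abundances(reads_recruited, detection, total_reads, min_detection=50.0):
    """Return percent of total reads recruited per genome.

    Assumptions (not confirmed by the source): detection is a percentage,
    total_reads is the sample's total read count, and genomes missing from
    the detection mapping are treated as undetected.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")

    result = {}
    for genome, reads in reads_recruited.items():
        # Genomes below the detection cutoff have their recruited reads set to 0
        # before the percentage is computed, as the Table S4 footnote describes.
        if detection.get(genome, 0.0) < min_detection:
            reads = 0
        result[genome] = 100.0 * reads / total_reads
    return result


# Hypothetical sample: genome_B recruits reads but falls below the cutoff.
example = relative_abundances(
    reads_recruited={"genome_A": 5000, "genome_B": 1200},
    detection={"genome_A": 92.0, "genome_B": 31.0},
    total_reads=1_000_000,
)
```

Zeroing the reads before dividing, rather than filtering genomes out afterwards, keeps every genome in the output, which matches the many explicit 0 entries in the table.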
Abstract
This work focuses on two keystone lineages of marine cyanobacteria that continue to successfully propagate via very different lifestyles: Trichodesmium, a filamentous nitrogen fixer that often forms macroscopic colonies, experiences characteristic bloom events in between bouts of low abundance, and seemingly lives in perpetual association with a consortium of other organisms
Asset Metadata
Creator
Lee, Michael Douglas (author)
Core Title
Molecular ecology of marine cyanobacteria: microbial assemblages as units of natural selection
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Biology (Marine Biology and Biological Oceanography)
Publication Date
10/10/2018
Defense Date
03/21/2018
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
community interactions, OAI-PMH Harvest, photoautotroph-heterotroph interactions, Synechococcus, Trichodesmium
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Hutchins, David A. (committee chair), Webb, Eric A. (committee chair), El-Naggar, Moh Y. (committee member), Fuhrman, Jed A. (committee member)
Creator Email
leemd@usc.edu,michael.lee0517@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-4470
Unique identifier
UC11672282
Identifier
etd-LeeMichael-6220.pdf (filename),usctheses-c89-4470 (legacy record id)
Legacy Identifier
etd-LeeMichael-6220.pdf
Dmrecord
4470
Document Type
Dissertation
Rights
Lee, Michael Douglas
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA