Understanding the genetic architecture of
complex traits
by
Takeshi Matsui
A Dissertation presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MOLECULAR BIOLOGY)
DECEMBER 2018
Copyright 2018 Takeshi Matsui
Acknowledgements
I would like to thank Ian and my lab members for all their support throughout my PhD. I
would also like to thank my committee members: Susan Forsburg, Matt Dean, and James
Boedicker.
Table of Contents
Acknowledgements ii
List of Figures vii
List of Tables x
List of Supplemental Notes xi
Abstract 1
Chapter 1: Introduction 2
1.1 Challenges in understanding how genotype specifies phenotype 2
1.2 Genetic heterogeneity 2
1.3 Gene by environment interactions 3
1.4 Epistasis and background effects 3
1.5 Goals of this dissertation 4
1.6 Summary of the chapters 5
Chapter 2: Regulatory rewiring in a cross causes extensive genetic heterogeneity 7
2.1 Abstract 7
2.2 Introduction 8
2.3 Results 10
2.3.1 Many BYxYJM segregants show invasion that is independent of FLO8 10
2.3.2 Initial effort to identify loci underlying FLO8-independent invasion 11
2.3.3 FLO8-independent invasion in glucose-only individuals depends on the MAPK
cascade 11
2.3.4 Multiple architectures of FLO8-independent invasion in ethanol-only individuals 13
2.3.5 Testing for effects of mating type and non-genetic factors on FLO8-independent
invasion 15
2.3.6 Segregants that invade in a FLO8-independent manner require different
transcription factors and cell surface proteins 16
2.4 Conclusion 18
2.5 Materials and Methods 20
2.5.1 Generation of initial mapping population. 20
2.5.2 Phenotyping for invasive growth 20
2.5.3 Genotyping by sequencing 20
2.5.4 Detection of loci influencing ability to invade 21
2.5.5 Genetic engineering 21
2.5.6 Generation of backcross segregants 21
2.5.7 Screening for mating type and non-genetic effects. 21
2.5.8 Amplification of the FLO11 coding region 22
2.6 Supplementary Materials 23
Chapter 3: Gene-environment interactions in stress response contribute additively to
a genotype-environment interaction 28
3.1 Abstract 28
3.2 Introduction 29
3.3 Results and Discussion 32
3.3.1 Genetic mapping of poor growth in E37 by recurrent backcrossing and selection 32
3.3.2 Many introgressed loci have biological effects 33
3.3.3 Loci involved in poor growth in E37 mainly act in an additive manner 34
3.3.4 Loci show a negative relationship between average growth level and additive
effect size 37
3.3.5 Causal genes play roles in stress response 38
3.4 Conclusion 40
3.5 Materials and Methods 42
3.5.1 Generation of initial mapping population. 42
3.5.2 Examination of growth among F2 segregants 42
3.5.3 Generation of BY and YJM NILs 42
3.5.4 Genotyping of BY and YJM NILs 42
3.5.5 Generation of YJM NIL 3 F2B7 segregants 43
3.5.6 Genotyping of the F2B7 population 43
3.5.7 Phenotyping of the F2B7 population 44
3.5.8 Quantitative analysis of the effect of the causal loci on growth 44
3.5.9 Modeling of growth as a function of the number of YJM alleles an individual
carries 45
3.5.10 Genetic engineering 45
3.5.11 Population, phylogenetic, and functional analysis of the causal polymorphism in
YGR250C 46
3.6 Supplementary Materials 47
Chapter 4: The complex underpinnings of genetic background effects 59
4.1 Abstract 59
4.2 Introduction 60
4.3 Results 61
4.3.1 Preliminary screen 61
4.3.2 Linkage mapping of mutation-independent and mutation-responsive effects 62
4.3.3 Most mutation-responsive effects involve higher-order epistasis 65
4.3.4 Environment plays a strong role in background effects 66
4.3.5 Interactions between segregating loci and different knockouts 68
4.3.6 Mutation-responsive effects correlate with mutation-induced changes in
phenotypic variance 69
4.4 Discussion 71
4.5 Materials and Methods 73
4.5.1 Generation of different knockout backgrounds of the BYx3S cross 73
4.5.2 Genotyping of segregants 73
4.5.3 Phenotyping of segregants 74
4.5.4 Scans for one-locus effects 75
4.5.5 Scans for two-locus effects 77
4.5.6 Scans for three-locus genetic effects 77
4.5.7 Assignment of mutation-responsive effects to specific knockouts 78
4.5.8 Statistical power analysis 79
4.5.9 Contributions of individual loci to mutation-responsive two- and three-locus
effects 79
4.5.10 Analysis of mutation-responsive effects across environments 80
4.5.11 Phenotypic variance explained by mutation-responsive effects in wild type and
knockout segregants 80
4.5.12 Checking potential consequences of allele frequency bias 81
4.6 Supplementary Materials 82
Chapter 5: Concluding remarks 109
5.1 Impact of my work 109
5.2 Future directions 112
References 114
Appendix A: Higher-order genetic interactions and complex trait variation 125
A.1 Abstract 125
A.2 Keywords 125
A.3 Key Concepts 125
A.4 Introduction 126
A.5 Main Text 126
A.5.1 Evidence for HGIs 126
A.5.2 Challenges in detecting HGIs 128
A.5.3 How do HGIs arise at the molecular level 130
A.5.4 Phenotypic and potential evolutionary consequences of HGIs 131
A.6 Conclusion 133
A.7 Further Reading List 134
A.8 Glossary 134
Appendix B: Genetic suppression: Extending our knowledge from lab experiments
to natural populations 135
B.1 Abstract 135
B.2 Introduction 136
B.3 Main text 137
B.3.1 High-throughput techniques for identifying suppressor mutations in lab
experiments 137
B.3.2 Functional mechanisms that cause genetic suppression 141
B.3.3 Examples of genetic suppression in natural populations 142
B.3.4 The genetic and molecular basis of naturally occurring genetic suppression 144
B.4 Conclusion and outlook 147
List of Figures
Figure 2.1. Effects of FLO8 on ability to invade 9
Figure 2.2 Genetic dissection of FLO8-independent glucose-only invasion 12
Figure 2.3 Genetic dissection of ethanol-only invasion by backcrossing Segregant 2
to BY and YJM 13
Figure 2.4 Genetic dissection of FLO11-independent, ethanol-only invasion by
backcrossing of Segregant 3 to BY 15
Figure 2.5 Deletion screen of known FLO11 activators 17
Figure S2.1 Initial results from selective genotyping of segregants that show FLO8-
independent invasion 23
Figure S2.2. Construction of allele replacements 24
Figure S2.3. Differences in FLO11 coding region length between BY and YJM 25
Figure S2.4. Replacement of the FLO11 coding region in segregant 2 with the BY
allele causes loss of invasion 26
Figure 3.1 Representative images of BY, YJM, control segregants, and poor
growing segregants under four conditions: glucose at 30°C, glucose at 37°C,
ethanol at 30°C, and ethanol at 37°C 30
Figure 3.2. Crossing scheme to generate BY and YJM F2B6 NILs 31
Figure 3.3 Introgressed genomic regions detected in the NILs 33
Figure 3.4 Phenotypic effects of the Chromosome I, VII, X_1, X_2, and XVI loci in
the four conditions 36
Figure 3.5 Analysis of growth among 192 random F2B7s across the four conditions. 38
Figure 3.6. Identification of the causal genes underlying the Chromosome VII, X_1,
and X_2 loci 40
Figure S3.1 Whole genome sequencing reveals two YJM NILs are aneuploid 47
Figure S3.2 YJM NIL 2 and another YJM NIL show similar introgressed genomic
regions 48
Figure S3.3 Four of the five introgressed genomic regions in YJM NIL 3 contribute
to poor growth in E37 49
Figure S3.4 YGR250C^YJM contains an amino acid change in a highly conserved site 50
Figure S3.5. Sample image of how growth in the F2B7 population was measured
using the Plate Analysis plugin for ImageJ. 51
Figure 4.1 Examples of mutation-responsive genetic effects. 63
Figure 4.2 Most mutation-responsive genetic effects involve multiple loci 65
Figure 4.3 Higher-order epistasis among knockouts and multiple loci is an
important contributor to background effects 66
Figure 4.4 Analysis of mutation-responsive effects across environments 67
Figure 4.5 Analysis of mutation-responsive effects across knockout backgrounds 69
Figure 4.6 Mutation-responsive effects underlie differences in phenotypic variance
between knockout and wild type backgrounds across environments 70
Figure S4.1 Generation of BYx3S knockout segregants 82
Figure S4.2 Certain genes exhibit significant background effects when perturbed 83
Figure S4.3 Allele frequency plot 84
Figure S4.4 Growth of all 1,411 segregants across the 10 environments 86
Figure S4.5 Individual and joint contributions of loci to background effects across
different significance thresholds. 87
Figure S4.6 Analysis of how mutation-responsive effects interact with different
knockouts at multiple significance thresholds 89
Figure S4.7 Statistical power analysis for one, two, and three-locus interactions 91
Figure S4.8 Extent to which mutation-responsive effects interact with different
knockouts 92
Figure S4.9 Absence of a relationship between identified mutation-responsive
effects and mean phenotypic differences between knockout and wild type
backgrounds 93
Figure S4.10 All seven knockout populations show nominally significant
correlations between changes in phenotypic variance and detected mutation-
responsive effects 94
Figure S4.11 Most mutation-responsive effects show small differences in
phenotypic variance explained (‘PVE’) in mutants relative to wild type segregants 95
Figure A.1 Comparison of three additive loci to three loci involved in an HGI 127
Figure A.2 Factors that limit the statistical power of tests for HGIs 129
Figure A.3 Role of gene regulatory networks in HGIs 131
Figure A.4 Involvement of HGIs in phenotypic capacitance 132
Figure B.1 Genetic basis of suppression in lab versus natural environments 137
Figure B.2 Identifying intragenic interactions across an entire gene 138
Figure B.3 Techniques for identifying intergenic suppressors 139
Figure B.4 Combinations of genetic variants may cause suppression in natural
populations 147
List of Tables
Table S2.1. Analysis of dissected tetrads from homozygous diploid derivatives of
specific segregants 27
Table 3.1 Full factorial ANOVA for E37 condition 35
Table S3.1 Genomic intervals that were introgressed in at least 2 NILs 52
Table S3.2 Full factorial ANOVA for G30 condition 53
Table S3.3 Full factorial ANOVA for E30 condition 54
Table S3.4 Full factorial ANOVA for G37 condition 55
Table S3.5 Genetic intervals identified in the 45 F2B7s segregants with poor growth
in E37 56
Table S3.6 PCR primers and restriction enzymes used for genotyping F2B7s 57
Table S4.1 List of 47 genes that were screened for background effects 97
Table S4.2 Screen summary statistics 99
Table S4.3 Mapping population breakdown 100
Table S4.4 Genomic regions with allele frequency bias. 101
Table S4.5 Fixed regions in BY/3S hemizygous diploids 102
Table S4.6 Phenotyping environments 103
Table S4.7 Number of mutation-independent and mutation-responsive genetic
effects across different significance thresholds 104
Table S4.8 Chi-squared test results for mutation-responsive effects 105
Table S4.9 Number of mutation-responsive effects that show biased individual and
multi-locus allele frequencies 106
List of Supplemental Notes
Note S3.1 Attempt to clone causal gene underlying the Chromosome I locus 58
Note S4.1 Conditional essentiality of esa1Δ segregants 107
Note S4.2 All major results are robust to different significance thresholds 107
Note S4.3 All major results are also robust to bias in allele combinations 107
Abstract
Understanding how standing genetic variants contribute to heritable phenotypic
variation is one of the central goals of contemporary genetics. However, for many
heritable phenotypes of interest, genome-wide association and linkage studies have only
identified a small fraction of the traits’ genetic basis. This is because most heritable
phenotypes are genetically complex; they involve a large number of loci that can interact
with each other, the environment, and even various mutations.
In this thesis, I provide detailed genetic characterization of multiple complex
traits, which has allowed me to gain a better understanding of how genotype specifies
phenotype. In chapter two, I show that multiple distinct regulatory architectures can
underlie a trait. In chapter three, I describe how combinations of multiple
gene-environment interactions collectively give rise to a major genotype-by-environment
interaction. In chapter four, I demonstrate that standing genetic variants contribute to
background effects mostly through higher-order interactions that involve not only a
mutation, but also other variants and the environment.
Chapter 1: Introduction
1.1 Challenges in understanding how genotype specifies phenotype
One of the central challenges in contemporary genetics is to determine how
genotype specifies phenotype. Recent advances in DNA sequencing and genetic mapping
technologies have propelled major progress in this area by making it possible to identify
thousands of loci that contribute to traits in humans and model organisms (TIMPSON et al.
2018). Although this work has provided valuable insights into the genetic basis of
complex traits (VISSCHER et al. 2017, BOYLE et al. 2017), for most phenotypes, detected
loci fail to explain much of the heritability and have not enabled accurate trait predictions
(TIMPSON et al. 2017). To comprehensively elucidate the mechanisms governing the
connection between genotype and phenotype, it is, therefore, of vital importance to
understand what causes this ‘missing heritability’ problem.
1.2 Genetic heterogeneity
One possible contributor to the missing heritability problem is genetic
heterogeneity, a phenomenon that occurs when genetically distinct individuals exhibit
similar phenotypes as a result of different underlying mechanisms (RISCH 2000;
MCCLELLAN AND KING 2010; WRAY AND MAIER 2014). Genetic heterogeneity can mask
the effects of genetic variants and reduce the statistical power of mapping studies (MANCHIA
et al. 2013, WRAY AND MAIER 2014). This presents an experimental challenge for
studying traits with high levels of heterogeneity, as large sample sizes will be needed to
achieve sufficient statistical power. There are two types of genetic heterogeneity: allelic
heterogeneity, which involves different alleles in the same gene, and non-allelic
heterogeneity, which involves mutations in different genes (RISCH 2000). While allelic
heterogeneity is pervasive in biological systems (MCCLELLAN AND KING 2010;
EHRENREICH et al. 2012; LONG et al. 2014), not much is known about non-allelic
heterogeneity. Studies from genetic diseases, such as breast cancer (WALSH AND KING
2007) and schizophrenia (WALSH et al. 2008), have implied that mutations in genes that
encode proteins in related pathways may cause non-allelic heterogeneity. A better
understanding of the prominence and underlying causes of non-allelic heterogeneity may
aid our efforts to explain heritability and predict phenotype from genomic data.
1.3 Gene by environment interactions
Another possible contributor to the missing heritability problem is gene-
environment interactions, which can modify the effects of genetic variants depending on
the environmental context (GRISHKEVICH AND YANAI 2013). Gene-environment
interactions have long been of interest to quantitative geneticists, as they are critical for
understanding how organisms adapt to their local environment (VIA et al. 1985,
FOURNIER-LEVEL et al. 2011) and for maximizing agricultural yield. In addition,
environmental perturbations have been shown to uncover sets of deleterious cryptic
genetic variants that result in conditional disease phenotypes (GIBSON et al. 2009).
Despite their influence on many traits of agricultural, ecological, evolutionary, and medical
significance (LYNCH AND WALSH 1998, KAMMENGA et al. 2007, GUTTELING et al. 2007),
the role gene-environment interactions play in trait variation is not well understood. One
reason for this is that detecting gene-environment interactions is far more statistically
challenging than detecting genetic and environmental factors in isolation. Because the
number of tests required increases dramatically when numerous genetic variants and
environments are considered, multiple testing can incur severe penalties (CORDELL 2009,
SHAM AND PURCELL 2014). As a consequence, only gene-environment interactions with
the largest effects are usually detected. Another reason is that it is often difficult to get
precise and reliable measurements of environmental exposures (CASPI 2006). For
example, the majority of epidemiological studies rely on self-reported information
obtained through interviews and questionnaires (MOFFITT 2005). Indeed, imprecise
phenotyping has been shown to reduce estimated effect sizes of genetic associations in
genome-wide association studies (‘GWAS’) (MANCHIA et al. 2013). More research is
needed to further our understanding of how sets of gene-environment interactions
collectively give rise to phenotype.
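To make the scale of this multiple-testing burden concrete, the following minimal sketch counts the tests in a scan that examines every variant in every environment and reports the corresponding Bonferroni-corrected per-test threshold. The function name and the variant and environment counts are hypothetical and chosen only for illustration, and Bonferroni correction is used as a simple stand-in rather than as the correction applied in the cited studies.

```python
def bonferroni_threshold(n_variants, n_environments, alpha=0.05):
    """Return (number of tests, per-test significance threshold).

    Assumes one interaction test per variant-environment pair
    (an illustrative simplification).
    """
    n_tests = n_variants * n_environments
    return n_tests, alpha / n_tests


# Hypothetical scan sizes: 10,000 variants tested in 1, 4, or 10 environments.
for n_env in (1, 4, 10):
    n_tests, threshold = bonferroni_threshold(10_000, n_env)
    print(f"{n_env:>2} environments: {n_tests:>7} tests, "
          f"per-test threshold = {threshold:.2e}")
```

Because the per-test threshold shrinks in proportion to the number of tests, only interactions with large effects remain detectable at a fixed sample size.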
1.4 Epistasis and background effects
The relative contribution of epistasis to missing heritability is also a subject of
debate (TIMPSON et al. 2017, MACKAY 2014). Epistasis occurs when genetic variants
exhibit different phenotypic effects depending on the genotypic context in which they are
examined (CORDELL 2009). Although individual examples of epistasis have been
identified for a broad range of traits (CARLBORG 2006, GERKE 2009), the prevailing
model is that genetic variants contribute to quantitative traits in a predominantly additive
manner (MACKAY 2014). Consistent with this argument, genomic prediction models that
utilize this purely additive model have shown very high accuracy in domesticated crops
and animals (CROW 2010). Nonetheless, there are compelling pieces of evidence that
epistasis can significantly contribute to trait variation in specific contexts. For example,
in many Mendelian monogenic human diseases, individuals carrying a disease allele
frequently exhibit incomplete penetrance and varied expressivity in the severity of
symptoms and the age of onset (COOPER et al. 2013). Similarly, studies in model
organisms have found that loss-of-function mutations in certain genes can cause a huge
fitness defect in one individual, but exhibit no effect in another (DOWELL et al. 2010).
Research in model organisms has shown that these ‘background effects’ usually result
from one or multiple standing genetic variants interacting with and modifying the effect
of the mutation (DOWELL et al. 2010; JAROSZ AND LINDQUIST 2010; CHANDLER et al.
2017; PAABY et al. 2015; TAYLOR AND EHRENREICH ; LEE et al. 2016; TAYLOR et al.
2016). Thus, these findings imply that epistasis plays a major role in trait variation, but in
a highly contextual manner. However, these insights mostly derive from studies where
the genetic architecture of background effects was inferred rather than directly shown
through the mapping of involved loci. To gain a deeper understanding of background
effects, a systematic identification and genetic mapping of background effects is
necessary.
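As a compact way to state the contrast drawn in this section, the purely additive model, a model with pairwise epistasis, and a model in which a mutation modifies the effects of standing variants can be written as below. The notation is introduced here for illustration only and is an assumption, not the notation of the later chapters: y is the phenotype, x_i in {0, 1} is the genotype at locus i, m in {0, 1} indicates the presence of the mutation, and epsilon is residual error.

```latex
% Purely additive model
y = \mu + \sum_{i} \beta_i x_i + \varepsilon

% Additive model extended with pairwise epistasis
y = \mu + \sum_{i} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon

% A mutation m whose effect depends on standing variants
y = \mu + \alpha m + \sum_{i} \beta_i x_i + \sum_{i} \gamma_i m x_i + \varepsilon
```

In this notation, a background effect corresponds to at least one nonzero gamma_i, and a higher-order interaction would add terms such as m x_i x_j.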
1.5 Goals of this dissertation
Altogether, these factors prevent us from bridging the gap between genotype and
phenotype. My goal as a PhD student has been to perform detailed genetic
characterization of these factors to achieve a better understanding of how standing
genetic variants contribute to trait variation. I was specifically interested in resolving the
following questions about complex traits: 1) How much non-allelic heterogeneity can
underlie a trait? 2) How do sets of gene-environment interactions collectively give rise to
GxE? 3) What is the genetic architecture of background effects? 4) What types of
epistasis contribute to background effects? 5) How much of a role does environment play
in background effects?
To answer these questions, I used the budding yeast Saccharomyces cerevisiae as
the model system to study multiple examples of complex traits. Isolates of budding yeast
are genetically diverse and can harbor genetic complexity that rivals more complicated
organisms. In addition, budding yeast presents many advantages, including a small
genome and transcriptome, fast generation time, and high recombination rate.
Furthermore, yeast can be easily genetically engineered, can be maintained as haploid or
diploid, and can be grown in replicate under tightly controlled environmental conditions.
These properties make budding yeast an ideal system to mechanistically understand how
complex traits are specified at the molecular level, because they facilitate statistically
powerful genetic mapping studies and make it possible to resolve causal loci to specific
genes and polymorphisms.
1.6 Summary of the chapters
In chapter 2, I use heritable variation in the ability of S. cerevisiae strains to
undergo haploid invasive growth as my model to show the extensive non-allelic genetic
heterogeneity that can underlie a trait. I demonstrate that the phenotypic response to a
mutation depends on which signaling cascade(s) or pathway(s) an individual employs to
express a given phenotype in a given environment.
In chapter 3, I focus on characterizing a poor growth phenotype that occurs
specifically when certain yeast segregants are grown on ethanol at 37°C. I determine that
this poor growth is caused by multiple deleterious cryptic variants.
In chapter 4, I study multiple examples of background effects involving conserved
chromatin-associated proteins. I show that introduction of a mutation can significantly
change how standing genetic variants interact with each other and the environment. I also
illustrate that the genetic architecture of a background effect depends on the particular
gene that is mutated.
In chapter 5, I discuss the impact of my work and provide future directions to
further understand the causes of trait variation.
Chapter 2: Regulatory rewiring in a cross causes extensive genetic heterogeneity
This work appears essentially as published in 2015 in Genetics 201: 769-777.
2.1 Abstract
Genetic heterogeneity occurs when individuals express similar phenotypes due to
different underlying mechanisms. Although such heterogeneity is known to be a potential
source of unexplained heritability in genetic mapping studies, its prevalence and
molecular basis are not fully understood. Here, we show that substantial genetic
heterogeneity underlies a model phenotype—the ability to grow invasively—in a cross of
two Saccharomyces cerevisiae strains. The heterogeneous basis of this trait across
genotypes and environments makes it difficult to detect causal loci with standard genetic
mapping techniques. However, using selective genotyping in the original cross, as well as
in targeted backcrosses, we detect four loci that contribute to differences in the ability to
grow invasively. Identification of causal genes at these loci suggests they act by changing
the underlying regulatory architecture of invasion. We verify this point by deleting many
of the known transcriptional activators of invasion, as well as the cell surface protein
FLO11, from five relevant segregants and showing that these individuals differ in the
genes that they require for invasion. Our work illustrates the extensive genetic
heterogeneity that can underlie a trait and suggests that regulatory rewiring is a basic
mechanism that gives rise to this heterogeneity.
2.2 Introduction
Genetic studies in humans and model organisms have reported unexplained
heritability for many traits (MANOLIO et al. 2009). A possible contributor to this
‘missing’ heritability is genetic heterogeneity—individuals exhibiting similar phenotypes
due to different genetic and molecular mechanisms (RISCH 2000; MCCLELLAN AND KING
2010; WRAY AND MAIER 2014). Genetic heterogeneity can reduce the statistical power of
mapping studies (MANCHIA et al. 2013; WRAY AND MAIER 2014), and may involve
multiple variants segregating in the same gene (‘allelic’ heterogeneity) or different genes
(‘non-allelic’ heterogeneity) (RISCH 2000). Work to date has shown that allelic
heterogeneity is widespread (e.g., (MCCLELLAN AND KING 2010; EHRENREICH et al.
2012; LONG et al. 2014)), and often involves two or more null or partial loss-of-function
variants segregating in a single phenotypically important gene (e.g., (NOGEE et al. 2000;
SUTCLIFFE et al. 2005; WILL et al. 2010)). However, the prominence and underlying
mechanisms of non-allelic heterogeneity are less understood.
In this paper, we describe an example of non-allelic heterogeneity, using heritable
variation in the ability of Saccharomyces cerevisiae strains to undergo haploid invasive
growth as our model. Invasive growth is a phenotype that is triggered by low carbon or
nitrogen availability, and is thought to be an adaptive response that allows yeast cells to
adhere to and penetrate surfaces (CULLEN AND SPRAGUE 2000). Invasion typically
requires expression of FLO11, which encodes a cell surface glycoprotein that facilitates
cell-cell and cell-surface adhesion (LO AND DRANGINIS 1998; RUPP et al. 1999). In
addition to FLO11, S. cerevisiae possesses other cell surface proteins that can contribute
to adhesion-related traits (as described in (GUO et al. 2000; HALME et al. 2004) and
elsewhere). In some cases, these cell surface proteins are regulated by multiple signaling
cascades (BRUCKNER AND MOSCH 2012), potentially providing an opportunity for genetic
variants in different pathways to have similar effects on invasion.
Here, we examine the genetic basis of variation in the ability to invade on two
carbon sources—glucose and ethanol—in a cross of the lab strain BY4716 and the
clinical isolate YJM789 (‘BY’ and ‘YJM’, respectively) (LITI et al. 2009). YJM is highly
invasive on both carbon sources (Figure 2.1A). In contrast, BY cannot grow invasively
on either carbon source (Figure 2.1A). This is because BY carries a nonsense allele of
FLO8 (Figure 2.1B; Methods), which encodes a transcriptional activator that is
regulated by the Ras-cAMP-PKA pathway. Flo8 is typically required for invasive growth
in both S. cerevisiae (LIU et al. 1996) and Candida albicans (CAO et al. 2006). Consistent
with the importance of FLO8 for invasion, deletion of this gene from YJM significantly
reduces its invasive growth on both carbon sources (Figure 2.1B; Methods).
Figure 2.1 Effects of FLO8 on ability to invade. (A) BY and YJM were grown for five days on
YPD or YPE plates at 30°C. Colonies were then washed off the plates using water and examined
for invasion. (B) Comparison of BY with a functional allele of FLO8 and YJM flo8Δ. (C)
Fraction of the initial mapping population of 127 F2 BYxYJM segregants that show invasion on
glucose (‘glu’) or ethanol (‘eth’) in each FLO8 genotype class. (D) Fraction of the 97 invasive
FLO8^BY segregants that show invasion on glucose, ethanol, or both carbon sources (‘both’).
While screening BYxYJM segregants for invasion on the two carbon sources, we
found that many individuals exhibit invasion even though they possess the FLO8^BY
nonsense allele, a result that was also recently reported in (SONG et al. 2014). We show
that this FLO8-independent growth has a heterogeneous genetic basis that reflects the
presence of multiple distinct regulatory architectures that enable FLO8-independent
invasion. Most of these regulatory architectures are FLO11-dependent but require
different transcriptional activators; however, we also provide evidence for an architecture
that is FLO11-independent. Our results suggest that regulatory rewiring is an important
source of non-allelic genetic heterogeneity and illustrate how studying the causes of
phenotypic similarities among genetically distinct individuals can advance our
understanding of complex traits.
2.3 Results
2.3.1 Many BYxYJM segregants show invasion that is independent of FLO8
We examined a population of 127 genotyped BYxYJM MATa segregants for
ability to invade on two carbon sources—glucose and ethanol (Methods). Despite the
major role of FLO8 in the invasion phenotypes of BY and YJM (Figures 2.1A and 2.1B),
we unexpectedly found that a large fraction (52%) of segregants with the FLO8^BY
nonsense allele were capable of invading in at least one condition (Figure 2.1C). A
possible explanation for these individuals’ phenotypes is that FLO8^BY is partially
functional in some genetic backgrounds. Flo8 is comprised of a LisH domain (amino
acids 72-105) that is involved in physical interactions with the transcription factor Mss11
and a transcriptional activation domain (amino acids 701-799) that is necessary for DNA
binding (KIM et al. 2014). The nonsense polymorphism in FLO8^BY occurs after the LisH
domain at amino acid 142, suggesting that the truncated Flo8 may retain some
functionality. We tested for partial functionality of FLO8^BY by deleting the entire coding
portion of FLO8 from multiple invasive FLO8^BY segregants, and phenotyping them for
invasive growth on glucose and ethanol (Methods). Complete deletion of FLO8 had no
effect on invasion, suggesting that other mechanisms enable these individuals to grow
invasively.
2.3.2 Initial effort to identify loci underlying FLO8-independent invasion
As a first step in identifying the genetic basis of FLO8-independent invasion, we
screened 384 additional F2 segregants for invasion on glucose and ethanol. We obtained
55 invasive FLO8^BY individuals from this experiment, bringing the total number of
invasive FLO8^BY individuals to 97. Among these 97 individuals, 50% were invasive on
both glucose and ethanol, 37% were invasive only on glucose, and 12% were invasive
only on ethanol (Figure 2.1D). We genotyped the 55 new individuals using low-coverage
genome sequencing and attempted to detect enriched alleles among the larger set of 97
genotyped FLO8^BY strains that were capable of invasion (Methods). Although our past
work suggests that such selective genotyping should have high statistical power
(EHRENREICH et al. 2010), even in the presence of complex non-additive genetic effects
(TAYLOR AND EHRENREICH 2014), we failed to detect any loci using this strategy (Figure
S2.1A).
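The logic of selective genotyping can be sketched as follows. Among haploid segregants selected for a phenotype, a marker that is unlinked to any causal locus is expected to carry each parental allele at a frequency near 0.5, so a marker whose allele counts deviate strongly from this expectation is a candidate locus. The sketch below applies an exact two-sided binomial test with a Bonferroni threshold; this choice of test is an assumption made for illustration and is not the detection procedure described in the Methods. The function names, marker names, and allele counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all
    outcomes that are no more likely than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def enriched_markers(by_allele_counts, n_selected, alpha=0.05):
    """Return markers whose BY-allele count among the selected
    individuals deviates from 0.5 after Bonferroni correction."""
    threshold = alpha / len(by_allele_counts)
    return [
        marker
        for marker, count in by_allele_counts.items()
        if binomial_two_sided_p(count, n_selected) < threshold
    ]


# Hypothetical BY-allele counts at three markers among 97 selected segregants.
counts = {"marker_A": 50, "marker_B": 47, "marker_C": 90}
print(enriched_markers(counts, n_selected=97))
```

A scan of this kind has power only when most selected individuals share the same causal allele, which is the situation that genetic heterogeneity undermines.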
2.3.3 FLO8-independent invasion in glucose-only individuals depends on the MAPK
cascade
We hypothesized that FLO8-independent invasion is genetically heterogeneous in
the BYxYJM cross, reducing the statistical power of our genetic mapping effort. To
mitigate this potential problem, we attempted to identify causal loci by focusing on
different classes of FLO8^BY segregants. We first looked at FLO8^BY individuals that show
invasion on both glucose and ethanol, but this analysis did not identify any loci (Figure
S2.1B). We next examined individuals that invade in only one condition, under the
assumption that different mechanisms might underlie condition-specific invasion. Among
the segregants showing FLO8-independent invasion only on glucose (n = 36), nearly all
of these individuals carried the BY allele of a locus on Chromosome VIII, which we were
able to delimit to 10 genes (Figure 2.2A; Methods).
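The reasoning behind the heterogeneity hypothesis can be illustrated with simple arithmetic. If only a fraction of the selected individuals invade through a mechanism that requires a particular allele, the pooled allele frequency at that locus is diluted toward the 0.5 expected by chance. The sketch below assumes, for illustration only, that the allele is fixed among individuals that depend on the locus and segregates at 0.5 among the rest; the function name and the fractions are hypothetical and are not estimates from this cross.

```python
def pooled_allele_frequency(fraction_dependent,
                            freq_in_dependent=1.0,
                            freq_in_others=0.5):
    """Expected allele frequency at a locus in a pooled sample in which
    only `fraction_dependent` of individuals require the allele."""
    return (fraction_dependent * freq_in_dependent
            + (1 - fraction_dependent) * freq_in_others)


# Hypothetical fractions of the selected pool that depend on the locus.
for fraction in (1.0, 0.5, 0.25):
    freq = pooled_allele_frequency(fraction)
    print(f"fraction dependent = {fraction:.2f} -> pooled frequency = {freq:.3f}")
```

Splitting the selected individuals into phenotypic classes, as done above, raises the fraction of each class that depends on a given locus and so restores the enrichment signal.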
Figure 2.2 Genetic dissection of FLO8-independent glucose-only invasion. (A) Genome-wide
relative allele frequency plot of glucose-only FLO8^BY BYxYJM segregants. FLO8 and the
markers used to generate haploid progeny are highlighted by red vertical bars, while the strongly
enriched locus on Chromosome VIII, which was nearly fixed for the BY allele, is highlighted by
a green vertical bar. The genomic interval underlying the Chromosome VIII peak is also
provided. (B) Comparison of Segregant 1, a glucose-only FLO8^BY individual, and the GPA1^YJM
Segregant 1 supports GPA1 as the causal gene underlying the Chromosome VIII locus.
To determine the causal gene(s) at the Chromosome VIII locus, we replaced the
BY allele of each gene in this interval with the YJM allele in a FLO8^BY segregant that
was invasive only on glucose (‘Segregant 1’; Methods). Each replacement spanned the
promoter, coding region, and part of the downstream region of the tested gene (Figure
S2.2). The only replacement that had an effect was GPA1, which encodes the alpha subunit
of the heterotrimeric G protein that acts in the Mitogen-Activated Protein Kinase (MAPK)
pheromone response cascade (FUJIMURA 1989). Converting Segregant 1’s GPA1 allele to the
YJM version rendered the strain nearly incapable of invading on glucose and had no
effect on ethanol (Figure 2.2B). BY is known to possess a lab-derived amino acid variant
(S469I) in GPA1 that causes a large number of gene expression changes specifically in
glucose (YVERT et al. 2003; SMITH AND KRUGLYAK 2008). This amino acid substitution
may also be the causal variant in our study.
2.3.4 Multiple architectures of FLO8-independent invasion in ethanol-only
individuals
We next studied FLO8^BY individuals that were invasive only on ethanol. Because
our sample size for this group was small (n = 12), we generated backcross populations in
a manner similar to (TAYLOR AND EHRENREICH 2014) and used these populations to
identify loci that influence invasive growth in a single segregant (‘Segregant 2’;
Methods). In the backcross to BY, we screened 192 segregants and found that 16% were
invasive only on ethanol. Among these individuals (n = 30), we identified a single locus
that was nearly fixed for the YJM allele (Figure 2.3A top), which was located on
Chromosome IX and overlapped FLO11. FLO11 is known to harbor extensive functional
variation across yeast isolates in both its coding and noncoding regions (FIDALGO et al.
2006; FIDALGO et al. 2008). To test for functional variation at FLO11 in the BYxYJM
cross, we separately replaced the coding and noncoding regions of FLO11 in Segregant 2
with the BY alleles (Figure S2.2; Methods). We found that replacement of the FLO11
coding region caused a loss of invasion on ethanol (Figure 2.3B), while replacement of
the noncoding region had no effect. A number of amino acid differences, as well as a ~700
base pair length difference, distinguish the BY and YJM alleles of FLO11 (Figure S2.3
and S2.4), making it difficult to determine the causal variant.
Figure 2.3 Genetic dissection of ethanol-only invasion by backcrossing Segregant 2 to BY
and YJM. (A) Genome-wide relative allele frequency plots for the BY and YJM backcrosses are
shown on the top and bottom, respectively. FLO8 and the markers used to generate haploid
progeny are highlighted with red vertical bars, while the strongly enriched intervals on
Chromosome IX and XIV are highlighted with green vertical bars. The genomic intervals
underlying the Chromosome IX and XIV loci are also provided. (B) Comparison of Segregant 2,
an ethanol-only FLO8^BY individual, to FLO11^YJM replacement and SIP3 deletion strains in the
Segregant 2 background supports FLO11 and SIP3 as the causal genes underlying the
Chromosome IX and XIV loci, respectively.
In the backcross of Segregant 2 to YJM, we also screened 192 segregants and
found that 11% were invasive only on ethanol. Among these individuals (n = 22), we
identified a single locus on Chromosome XIV that was fixed for the BY allele. Based on
the genotype data, we delimited this interval to 16 candidate genes (Figure 2.3A bottom;
Methods). We tested every gene in this interval for an effect on Segregant 2’s ability to
invade using gene knockouts and found that only deletion of SIP3 resulted in a loss of
invasion (Figure 2.3B; Methods). Sip3 is a transcription cofactor that interacts with
DNA-bound Snf1p, which is known to regulate FLO11 expression in response to glucose
limitation (HEDBACKER et al. 2008).
Although the FLO11^YJM coding region contributes to invasion on ethanol, not all
of the ethanol-only segregants possessed this allele. Among the 12 individuals that were
invasive only on ethanol in our genotyped F2 population, two carried FLO11^BY. To
determine the mechanism that allows these individuals to invade only on ethanol, we
backcrossed one relevant segregant (‘Segregant 3’) to BY and YJM. The YJM backcross
exhibited very low sporulation; for this reason, we were only able to perform genetic
mapping in the BY backcross. We screened 192 segregants and found 32 individuals
(17%) that grew invasively only on ethanol. We performed genetic mapping to look for
enriched alleles and identified a single locus on Chromosome II, at which individuals
were fixed for the YJM allele (Figure 2.4A). This locus was detected at a resolution of
four genes, of which only AMN1 had an effect when deleted. To verify that the BY and
YJM alleles functionally differ, we replaced Segregant 3’s AMN1^YJM with AMN1^BY and
found that this resulted in a loss of invasion (Figure 2.4B and S2.2; Methods). An amino
acid variant (D368V) in AMN1, which plays a role in daughter cell separation and exit
from mitosis (WANG et al. 2003), has been implicated as a major determinant of FLO11-
independent cell clumping in multiple studies (YVERT et al. 2003; LI et al. 2013), and
may also be the causal variant in our study.
Figure 2.4 Genetic dissection of FLO11-independent, ethanol-only invasion by backcrossing
of Segregant 3 to BY. (A) Genome-wide relative allele frequency plot of ethanol-only invasion
in the backcross of Segregant 3 to BY. The marker used to generate haploid progeny is
highlighted with a red vertical bar, while the enriched locus on Chromosome II is highlighted
with a green vertical bar. The genomic interval underlying the Chromosome II locus is also
provided. (B) Comparison of Segregant 3, a FLO11-independent ethanol-only FLO8^BY
individual, to AMN1^BY replacement strains in the Segregant 3 background supports AMN1 as the
causal gene underlying the Chromosome II locus.
2.3.5 Testing for effects of mating type and non-genetic factors on FLO8-
independent invasion
Non-genetic factors are known to influence the expression of traits in yeast
crosses (e.g., (SIRR et al. 2015)) and might also contribute to FLO8-independent
invasion. Additionally, because our experiments were conducted exclusively in MATa
haploids, some of the FLO8-independent invasion might be mating type-dependent. To
test both of these possibilities, we generated and sporulated homozygous diploid versions
of Segregants 1, 2, and 3 (Methods). From each individual, we obtained 7 to 10 four-
spore tetrads. Only mating type and non-genetic factors should segregate among these
spores (Methods). If we have identified loci that depend on mating type, then invasion
should co-segregate 2:2 with mating type. Alternatively, if non-genetic factors contribute
to FLO8-independent invasion, then less than 100% of the examined spores should show
the same phenotype as their progenitor.
The effects of mating type and non-genetic factors varied among the tested
segregants. For Segregants 2 and 3, which only invade on ethanol, all of the haploid
spores also showed ethanol-only invasion (Table S2.1). This indicates that mating type and
non-genetic factors likely do not influence the phenotypes of these individuals. In
contrast, Segregant 1, which only invades on glucose, provided evidence for both mating
type- and non-genetic effects. Among the 40 tested spores from this individual, 16 out of
20 MATa spores showed glucose-only invasion, while none of the 20 MATα spores
exhibited invasion (Table S2.1). This suggests that Segregant 1’s phenotype is mating
type-dependent and may also have a non-genetic component.
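The spore counts quoted above for Segregant 1 can be recomputed directly from Table S2.1. The following is a minimal sketch, assuming the table rows are transcribed by hand into tuples; the function and variable names are illustrative and are not part of the original analysis.

```python
# Phenotypes for Segregant 1 transcribed from Table S2.1
# (I = invasive, N = non-invasive). Each tuple is one tetrad, ordered as
# (MATa spore 1, MATa spore 2, MATalpha spore 1, MATalpha spore 2).
SEGREGANT_1_TETRADS = [
    ("N", "N", "N", "N"),
    ("I", "N", "N", "N"),
    ("I", "N", "N", "N"),
] + [("I", "I", "N", "N")] * 7


def tally_by_mating_type(tetrads):
    """Return {mating type: [invasive spores, total spores]}."""
    counts = {"MATa": [0, 0], "MATalpha": [0, 0]}
    for tetrad in tetrads:
        for spore_index, phenotype in enumerate(tetrad):
            mating_type = "MATa" if spore_index < 2 else "MATalpha"
            counts[mating_type][1] += 1
            if phenotype == "I":
                counts[mating_type][0] += 1
    return counts


def count_full_cosegregation(tetrads):
    """Count tetrads in which both MATa spores invade and both MATalpha spores do not."""
    return sum(1 for tetrad in tetrads if tetrad == ("I", "I", "N", "N"))


counts = tally_by_mating_type(SEGREGANT_1_TETRADS)
full_cosegregation = count_full_cosegregation(SEGREGANT_1_TETRADS)
```

Under this transcription, invasion is confined to MATa spores, but only a subset of tetrads show the full 2:2 pattern, which is what motivates the suggestion of an additional non-genetic component.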
2.3.6 Segregants that invade in a FLO8-independent manner require different
transcription factors and cell surface proteins
Our results to this point indicate that FLO8-independent invasion has a
heterogeneous basis that is largely genetic. This genetic heterogeneity might arise if
distinct regulatory factors and/or cell surface proteins facilitate invasion in different
segregants and environments. The possibility of such rewiring of invasive growth is
supported by recent work showing that the Σ1278b strain requires the transcription factor
Tec1 to express FLO11, while BY does not (CHIN et al. 2012), as well as by experiments
demonstrating extensive variability in transcription factor binding among progeny from
the BYxYJM cross (ZHENG et al. 2010). Further supporting such a scenario, some of the
genes that we cloned have regulatory functions. For example, GPA1 influences signaling
through the MAPK cascade and the MAPK cascade is known to regulate Ste12, which is
a transcriptional activator required for invasion in many pathogenic fungi (LO AND
DRANGINIS 1998; FELDEN et al. 2014).
To explore whether regulatory rewiring might contribute to the genetic
heterogeneity in our study, we deleted 11 transcription factors that are known to regulate
invasion, as well as FLO11, from Segregants 1, 2, and 3 (Methods). We also performed
these deletions in two additional individuals that showed FLO8-independent invasion on
both glucose and ethanol (hereafter referred to as ‘Segregant 4’ and ‘Segregant 5’).
Although some deletions had quantitative effects on invasion (Figure 2.5), we focused on
cases where deletion of one of the examined genes caused inability to invade. Such
complete losses of the phenotype indicate genes that are required for a particular
segregant to express FLO8-independent invasion.
Figure 2.5 Deletion screen of known FLO11 activators. FLO11 and a number of transcription
factors that are known to regulate invasive growth were knocked out in Segregants 1 through 5.
These deletion strains were then phenotyped for their ability to invade.
The examined segregants differed in their requirements for FLO11 and four
transcription factors: MGA1, MSN1, RME1, and STE12 (Figure 2.5). None of the
deletions caused Segregant 3 to lose its ability to invade, implying that this individual
invades in a FLO11-independent manner that may not require the examined transcription
factors. In contrast, Segregants 1, 2, 4, and 5 showed FLO11-dependent invasion, but
differed in the transcription factors that they require. Segregants 1 and 4 lost the ability to
invade when STE12 was deleted, suggesting that their ability to invade is MAPK-
dependent. Segregants 2 and 5 required MSN1, a transcriptional activator that influences
many traits in yeast. While MSN1 was the only transcription factor that caused loss of
invasion in Segregant 2, Segregant 5 also lost its ability to invade when MGA1 and RME1
were deleted. The finding that individuals differ in the transcription factors and cell
surface proteins that they require for invasion supports regulatory rewiring as a cause of
genetic heterogeneity in our study.
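The requirement patterns described in this section can be summarized as a small lookup table. The sketch below encodes only the complete losses of invasion reported in the text (quantitative effects are ignored) and groups segregants that share the same set of required genes; the data structure and function names are illustrative assumptions.

```python
# Genes whose deletion abolished invasion in each segregant, as stated in the text.
REQUIRED_GENES = {
    "Segregant 1": {"FLO11", "STE12"},
    "Segregant 2": {"FLO11", "MSN1"},
    "Segregant 3": set(),  # no tested deletion caused loss of invasion
    "Segregant 4": {"FLO11", "STE12"},
    "Segregant 5": {"FLO11", "MSN1", "MGA1", "RME1"},
}


def group_by_requirements(required):
    """Group segregants that require exactly the same set of genes."""
    groups = {}
    for segregant, genes in required.items():
        groups.setdefault(frozenset(genes), []).append(segregant)
    return groups


def shared_requirements(required, segregants):
    """Return the genes required by every segregant in the given list."""
    return set.intersection(*(set(required[s]) for s in segregants))


groups = group_by_requirements(REQUIRED_GENES)
flo11_dependent = [s for s, genes in REQUIRED_GENES.items() if "FLO11" in genes]
common = shared_requirements(REQUIRED_GENES, flo11_dependent)
```

Four distinct requirement sets emerge from five segregants, and FLO11 is the only gene shared by all FLO11-dependent individuals.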
2.4 Conclusion
We have shown that a model phenotype in yeast—haploid invasive growth—
exhibits extensive non-allelic genetic heterogeneity. This heterogeneity is caused by
genetic variants that change the regulation of invasive growth and enable FLO8-
independent invasion in specific cross progeny. Our results from genetic mapping and
genetic engineering experiments suggest that multiple distinct regulatory architectures of
FLO8-independent invasion segregate in the BYxYJM cross. Although these regulatory
architectures require different transcription factors and/or cell surface proteins, they lead
to similar abilities to invade.
The present data do not shed light on the specific details of these different
regulatory architectures. However, the finding that most BYxYJM segregants that show
FLO8-independent invasion require FLO11 suggests that FLO11 expression is an
important component of most of the regulatory architectures. This is of note because
FLO11 has one of the largest promoters in the yeast genome, and is thought to be
influenced by at least 8 pathways and 15 transcription factors, as well as linked
noncoding RNAs and chromatin remodeling complexes (BRUCKNER AND MOSCH 2012).
The potential of FLO11 to be regulated by a number of different pathways may facilitate
some of the variability in wiring that we have described.
Our finding that different transcription factors and cell surface proteins are
required for different genetic backgrounds to invade is similar to the recent discovery of
‘conditional essential’ genes in yeast (DOWELL et al. 2010). These conditional essential
genes are necessary for viability in some isolates, but dispensable in others. Our work
suggests that conditional essentiality may arise because genetically distinct individuals
express similar phenotypes due to different underlying regulatory mechanisms. If this is
true, then the essentiality of a gene for a trait will depend on which signaling cascade(s)
or pathway(s) an individual employs to express a given phenotype in a particular
environment.
Given that we have examined a single phenotype in only one pairwise cross and
two conditions, we cannot comment on the broader extent of this heterogeneity across
species, traits, and environments. However, we note that our results are comparable to
recent studies in humans (as summarized in (MCCLELLAN AND KING 2010)) and mice
(SHAO et al. 2008; SPIEZIO et al. 2012), which have shown that many genetic
perturbations can produce comparable phenotypic outcomes. To some degree, our effort
also represents an integration of previous work describing genetic variation in regulatory
pathways (YVERT et al. 2003) and transcription factor activity (ZHENG et al. 2010; CHIN
et al. 2012) across yeast isolates. Importantly, we have extended these past studies by
connecting changes in signaling and transcription factor activity, as identified through
genetic techniques, to phenotypic outcomes.
2.5 Materials and Methods
2.5.1 Generation of initial mapping population. We used the synthetic genetic array
marker system (TONG et al. 2001) to generate recombinant BYxYJM MATa segregants.
The BY parent of our cross was MATα can1∆::STE2pr-SpHIS5 lyp1∆ his3∆, while the
YJM parent was MATa his3∆::natMX ho::kanMX. We mated these BY and YJM
haploids to produce the diploid progenitor of our cross, which was sporulated using
standard techniques (SHERMAN 1991). MATa segregants were obtained using random
spore plating on minimal media containing canavanine, as previously described
(EHRENREICH et al. 2010; TAYLOR AND EHRENREICH 2014).
2.5.2 Phenotyping for invasive growth. Strains were phenotyped for invasive growth on
2% agar plates containing yeast extract and peptone (YP) with either 2% glucose
(dextrose) or 2% ethanol as the carbon source (YPD and YPE, respectively). Prior to
pinning onto the agar plates, strains were grown overnight to stationary phase in liquid
YPD. After this culturing step, strains were then pinned onto agar plates and allowed to
grow for 5 days. Following this incubation period, we screened for invasive growth by
applying water to the agar plates, manually scrubbing colonies, and decanting the mixture
of water and cells. Presence or absence of invasion was scored by eye under a light
microscope. Each segregant was phenotyped three independent times and the median
phenotype was used in analyses.
2.5.3 Genotyping by sequencing. Segregants were genotyped by Illumina sequencing.
Whole genome libraries were constructed using the Illumina Nextera kit. These libraries
were then sequenced in multiplex to at least 5X genomic coverage on either a HiSeq or a
NextSeq with 100 base pair (bp) x 100 bp reads. We also sequenced BY and YJM to
~100X genomic coverage, and used the data to identify 57,402 high confidence SNPs.
Reads for segregants were mapped to the BY genome using Burrows-Wheeler Aligner
(BWA) (LI AND DURBIN 2009) and SAMTOOLS (LI et al. 2009). We called genotypes
for each individual by taking the base calls at the SNPs and employing a Hidden Markov
Model by chromosome, using the HMM() package in R, as described in (TAYLOR AND
EHRENREICH 2014). The sequence data from our experiments are available from the NCBI
Short Read Archive under accession numbers SRR2039809 to SRR2039935, SRR2039936
to SRR2039992, SRR2040045 to SRR2040076, SRR2040023 to SRR2040044, and
SRR2039993 to SRR2040022.
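To illustrate how a Hidden Markov Model can smooth low-coverage base calls into parental haplotype blocks, the following is a minimal two-state Viterbi sketch. It is not the R HMM package used in this work; the error rate, switch probability, and function name are assumed values chosen only for illustration.

```python
import math


def viterbi_genotypes(observations, error_rate=0.01, switch_prob=0.001):
    """Infer the most likely parental state ("BY" or "YJM") at each SNP.

    observations: per-SNP base calls along one chromosome, given as "BY",
    "YJM", or None when no read covers the SNP.
    error_rate and switch_prob are illustrative assumptions.
    """
    if not observations:
        return []
    states = ("BY", "YJM")

    def emission(state, obs):
        if obs is None:
            return 0.0  # log(1): a missing call carries no information
        return math.log(1 - error_rate) if obs == state else math.log(error_rate)

    stay, switch = math.log(1 - switch_prob), math.log(switch_prob)
    score = {s: math.log(0.5) + emission(s, observations[0]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best previous state for arriving in state s at this SNP.
            best_prev = max(states, key=lambda p: score[p] + (stay if p == s else switch))
            transition = stay if best_prev == s else switch
            new_score[s] = score[best_prev] + transition + emission(s, obs)
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


calls = ["BY", "BY", None, "YJM", "BY", "BY"] + ["YJM"] * 5
genotypes = viterbi_genotypes(calls)
```

In this toy input, the isolated "YJM" call inside the BY run is treated as a sequencing error, whereas the run of five consecutive "YJM" calls is treated as a recombination breakpoint.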
2.5.4 Detection of loci influencing ability to invade. Allele frequency analyses were
computed using the genotype data of all individuals from a particular mapping population
that exhibited the same phenotype. To determine the intervals of the identified causal
loci, we identified regions where the alleles were either fixed or at a frequency of 95% or
higher.
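The interval rule described above can be sketched as follows. This is a minimal illustration that assumes genotypes are stored as one list of "BY"/"YJM" calls per individual, ordered by marker position; the function names and the toy data are hypothetical.

```python
def allele_frequencies(genotypes, allele="BY"):
    """Frequency of the given allele at each marker among same-phenotype individuals."""
    n_individuals = len(genotypes)
    n_markers = len(genotypes[0])
    return [
        sum(individual[m] == allele for individual in genotypes) / n_individuals
        for m in range(n_markers)
    ]


def enriched_intervals(frequencies, threshold=0.95):
    """Return inclusive (start, end) marker indices where frequency >= threshold."""
    intervals, start = [], None
    for index, frequency in enumerate(frequencies):
        if frequency >= threshold and start is None:
            start = index
        elif frequency < threshold and start is not None:
            intervals.append((start, index - 1))
            start = None
    if start is not None:
        intervals.append((start, len(frequencies) - 1))
    return intervals


# Toy data: 20 individuals, 4 markers. Marker 1 is fixed for BY and
# marker 2 carries BY in 19 of 20 individuals.
toy_genotypes = [
    [
        "BY" if i % 2 == 0 else "YJM",
        "BY",
        "BY" if i != 0 else "YJM",
        "BY" if i % 2 else "YJM",
    ]
    for i in range(20)
]
toy_frequencies = allele_frequencies(toy_genotypes)
toy_intervals = enriched_intervals(toy_frequencies)
```

To detect enrichment of the YJM allele instead, the same functions can be called with allele="YJM".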
2.5.5 Genetic engineering. Knockouts were generated by PCR amplifying the CORE
cassette with homology-tailed primers and then selecting for transformants on G418
(STORICI et al. 2001). Phusion high-fidelity DNA polymerase was used for PCR under
the recommended reaction conditions with 35 cycles and an extension time of 30 seconds
per kilobase. The entire coding region of target genes was deleted in these strains. Correct
integration of the CORE cassette was checked for each deletion strain using PCR. Allele
replacement strains were constructed using the co-transformation of two partially
overlapping PCR products (Figure S2.2), similar to (ERDENIZ et al. 1997). One product
contained the promoter and coding region of the gene to be replaced, while the other
included (in order) 60 bp of overlap with the 3’ end of the gene PCR product, kanMX or
natMX, and 30 to 50 bp of the genomic region immediately downstream of the
transcribed portion of the gene. Replacement of a gene was verified using Sanger
sequencing.
2.5.6 Generation of backcross segregants. Backcrosses were conducted by mating a
BYxYJM segregant to a MATα his3∆ version of BY or YJM. Sporulation and selection
for MATa backcross segregants was performed as described for the initial mapping
population.
2.5.7 Screening for mating type and non-genetic effects. To induce mating type
switching in our MATa segregants, we first deleted URA3 from these individuals using
the hphMX cassette with homology-tailed primers, as described above. Correct
integration of the cassette was verified using PCR and further checked by plating the
ura3Δ strains onto 5-FOA plates. Next, mating type switching was performed using the
pGAL-HO plasmid, as previously described (HERSKOWITZ AND JENSEN 1991). Otherwise
isogenic MATa and MATα individuals were mated to produce homozygous diploids.
These individuals were sporulated as described above and standard microdissection
techniques were used to obtain spores from the homozygous diploids. Tetrads from
which all four spores were recovered were then grown on glucose and ethanol, and
checked for ability to invade (Table S2.1).
2.5.8 Amplification of the FLO11 coding region. The entire FLO11 coding region was
PCR amplified using 5’-GGAAGAGCGAGTAGCAACCA as the forward primer and 5’-
TTGTAGGCCTCAAAAATCCA as the reverse primer. The sizes of the BY and YJM
alleles were compared on a 2% agarose gel.
2.6 Supplementary Materials
Figure S2.1 Initial results from selective genotyping of segregants that show FLO8-
independent invasion. (A) Comparison of genome-wide relative allele frequency plot among
FLO8^BY invasive progeny to a non-invasive FLO8^BY control population. (B) Genome-wide
relative allele frequency plot among FLO8^BY segregants that invade on both glucose and ethanol.
Figure S2.2. Construction of allele replacements. In the first step, one pair of primers (F1 and
R1) was used to amplify the promoter and the coding sequence of the gene to be replaced with 60
bp overlapping the 5’ end of the resistance marker attached at the 3’ end of the PCR product
(shown in orange). Another pair of primers (F2 and R2) was used to amplify the resistance
marker with 60 bp overlapping the genomic region immediately downstream of the transcribed
portion of the gene using the first primer pair attached at the 3’ end of the PCR product. In the
second step, the two overlapping PCR products were transformed into the strains. Integration into
the genome requires recombination between the PCR products and the target locus.
Figure S2.3. Differences in FLO11 coding region length between BY and YJM. PCR was used
to amplify the FLO11 coding region from the BY and YJM strains. The size of FLO11^BY was
~4.1kb, while FLO11^YJM was ~3.4kb.
Figure S2.4. Replacement of the FLO11 coding region in Segregant 2 with the BY allele
causes loss of invasion. To verify that FLO11^BY was correctly integrated and replaced using our
one-step allele replacement, we PCR amplified the 5’ end of the gene, and Sanger sequenced
multiple invasive and non-invasive transformants. Only the transformants carrying the BY SNPs
(marked in black) toward the 5’ end showed loss of invasion, implying that only individuals with
most of the FLO11 gene replaced exhibited loss of invasion. Flo11 protein is comprised of three
domains, which are reflected in the sequence of the FLO11 gene. The N-terminal portion of the
protein contains a hydrophobic signal sequence, is exposed at the cell surface, and binds to
ligands. The middle domain largely contains variable length tandem repeats that are enriched for
serines and threonines, and is the part of the protein where heavy glycosylation occurs. The C-
terminal portion of the protein is a GPI anchor that localizes Flo11 to the cell wall. The highly
repetitive nature of the middle portion of FLO11 makes it difficult to accurately determine the
length and sequence of the gene using short Illumina reads. In the regions that we were able to
confidently align, we identified 69 SNPs between the BY and the YJM allele, of which 31 were
non-synonymous. In addition, we identified that the YJM allele of FLO11 has a 45bp insertion in
the N-terminal region between amino acid position 123 and 124. We also found that no
sequencing reads from the YJM mapped to 635 base positions in comparison to BY, which is
most likely due to deletions given that the YJM allele of FLO11 was ~700 bases smaller in
comparison to the BY allele (Figure S2.3). In particular, large stretches of the middle domains were
missing from amino acid positions 207 to 315, 359 to 372, 409 to 449, 795 to 808, 824 to 845,
and 881 to 899 in the YJM allele. We have not yet determined how these changes alter the
functionality of Flo11. We note that this portion of the gene is known to be highly variable across
yeast strains, affecting many FLO11-dependent traits, such as biofilm formation, flocculation,
and invasion.
Segregant  Tetrad  MATa spore 1  MATa spore 2  MATalpha spore 1  MATalpha spore 2
1          1       N             N             N                 N
1          2       I             N             N                 N
1          3       I             N             N                 N
1          4       I             I             N                 N
1          5       I             I             N                 N
1          6       I             I             N                 N
1          7       I             I             N                 N
1          8       I             I             N                 N
1          9       I             I             N                 N
1          10      I             I             N                 N
2          1       I             I             I                 I
2          2       I             I             I                 I
2          3       I             I             I                 I
2          4       I             I             I                 I
2          5       I             I             I                 I
2          6       I             I             I                 I
2          7       I             I             I                 I
3          1       I             I             I                 I
3          2       I             I             I                 I
3          3       I             I             I                 I
3          4       I             I             I                 I
3          5       I             I             I                 I
3          6       I             I             I                 I
3          7       I             I             I                 I
I = Invasive, N = Non-invasive
Table S2.1. Analysis of dissected tetrads from homozygous diploid derivatives of specific
segregants.
Chapter 3: Gene-environment interactions in stress response contribute additively
to a genotype-environment interaction
This work appears essentially as published in 2016 in PLOS Genetics 12(7): e1006158.
3.1 Abstract
How combinations of gene-environment interactions collectively give rise to genotype-
environment interactions is not fully understood. To shed light on this problem, we
genetically dissected an environment-specific poor growth phenotype in a cross of two
budding yeast strains. This phenotype is detectable when certain segregants are grown on
ethanol at 37°C (‘E37’), a condition that differs from the standard culturing environment
in both its carbon source (ethanol as opposed to glucose) and temperature (37°C as
opposed to 30°C). Using recurrent backcrossing with phenotypic selection, we identified
16 contributing loci. To examine how these loci interact with each other and the
environment, we focused on a subset of four loci that together can lead to poor growth in
E37. We measured the growth of all 16 haploid combinations of alleles at these loci in all
four possible combinations of carbon source (ethanol or glucose) and temperature (30 or
37°C) in a nearly isogenic population. This revealed that the four loci act in an almost
entirely additive manner in E37. However, we also found that these loci have weaker
effects when only carbon source or temperature is altered, suggesting that their effect
magnitudes depend on the severity of environmental perturbation. Consistent with such a
possibility, cloning of three causal genes identified factors that have unrelated functions
in stress response. Thus, our results indicate that polymorphisms in stress response can
show effects that are intensified by environmental stress, thereby resulting in major
genotype-environment interactions when multiple of these variants co-occur.
3.2 Introduction
Genotype-environment interaction (‘GxE’) occurs when genetically distinct
individuals show different phenotypic responses to the environment (FALCONER AND
MACKAY 1996; LYNCH AND WALSH 1998). Although GxE is known to influence many
agriculturally, evolutionarily, and medically relevant traits (e.g., (ZENG 2005; MACKAY et
al. 2009; BAYE et al. 2011; RAUW AND GOMEZ-RAYA 2015)), our basic knowledge of the
genetic and molecular mechanisms that underlie GxE remains incomplete. Recent work
on this topic in Saccharomyces cerevisiae suggests GxE can arise due to not only
individual loci that show gene-environment interactions, but also sets of loci that show
environment-dependent epistatic interactions (GERKE et al. 2010; BHATIA et al. 2014;
LEE et al. 2016). However, because the underlying genetic basis of GxE has only been
comprehensively dissected in a small number of cases (e.g., (GERKE et al. 2010; BHATIA
et al. 2014; LEE et al. 2016)), the relative contributions of these different types of genetic
effects to GxE is unclear.
Here, we generate an additional, detailed example of the genetic basis of GxE in
the budding yeast Saccharomyces cerevisiae. We focus on characterizing the underlying
genetics of a poor growth phenotype that occurs specifically when certain segregants
from a cross of the BY4716 (‘BY’) lab strain and the YJM789 (‘YJM’) clinical isolate
(LITI et al. 2009) are cultured on ethanol at 37°C (‘E37’; Figure 3.1). Although yeast is
typically grown on glucose as the carbon source and at 30°C as the temperature (‘G30’),
it can tolerate a broad range of environmental conditions, including other carbon sources
and temperatures (KVITEK et al. 2008; LITI et al. 2009). Among the different carbon
sources that yeast can utilize, ethanol can be particularly stressful because it is
metabolized via respiration instead of fermentation, which results in increased oxidative
stress (BROACH 2012). Furthermore, high temperature is known to be a stressor for
budding yeast (GASCH et al. 2000), with some isolates incapable of growing at 37°C or
above (MCCUSKER et al. 1994a; MCCUSKER et al. 1994b; STEINMETZ et al. 2002; SINHA
et al. ; KVITEK et al. 2008; SINHA et al. 2008; GAGNEUR et al. 2013; YANG et al. 2013).
Given that ethanol and high temperature both represent non-preferred conditions for S.
cerevisiae, poor growth in E37 likely occurs because some segregants have low
tolerances for environmental stress.
Figure 3.1 Representative images of BY, YJM, control segregants, and poor growing
segregants under four conditions: glucose at 30°C, glucose at 37°C, ethanol at 30°C, and
ethanol at 37°C. We refer to these conditions throughout the paper as ‘G30’, ‘G37’, ‘E30’, and
‘E37’, respectively.
To determine the genetic basis of poor growth in E37, we use a genetic mapping
strategy involving recurrent backcrossing with phenotypic selection (Figure 3.2).
Through this approach, we identify 16 loci that contribute to poor growth in E37. We
then conduct a more detailed study of four of these loci, which collectively result in poor
growth in E37 when they co-occur in the YJM background. By analyzing the growth of
all 16 haploid multi-locus genotypes involving the loci on all four combinations of two
carbon sources (glucose and ethanol) and two temperatures (30 and 37°C), we find that
the four loci contribute to poor growth in E37 in a primarily additive manner.
Furthermore, we also show that these loci exhibit weaker, negative effects on growth
when only carbon source or temperature is altered relative to standard conditions. These
results indicate that GxE in our system reflects the composite effect of multiple additive
loci that show condition-dependent effect magnitudes. Additionally, by resolving three of
these loci to a component of the vacuolar protein sorting machinery (VPS70), a stress
granule-associated RNA binding protein (YGR250C), and a stress responsive kinase
(IKS1), we implicate genetic variation in stress response as the source of the identified
gene- and genotype-environment interactions.
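The experimental design described above, in which every multi-locus genotype is measured in every environment, can be enumerated explicitly. The sketch below uses placeholder locus names (locus_1 through locus_4), since they are introduced here only for illustration.

```python
from itertools import product

LOCI = ("locus_1", "locus_2", "locus_3", "locus_4")  # placeholder names
ALLELES = ("BY", "YJM")
CARBON_SOURCES = {"G": "glucose", "E": "ethanol"}
TEMPERATURES_C = (30, 37)


def multilocus_genotypes():
    """All haploid allele combinations across the four loci (2**4 = 16)."""
    return [dict(zip(LOCI, combo)) for combo in product(ALLELES, repeat=len(LOCI))]


def environments():
    """Condition labels in the style used in the text: G30, G37, E30, E37."""
    return [f"{carbon}{temp}" for carbon in CARBON_SOURCES for temp in TEMPERATURES_C]


def full_design():
    """Every genotype paired with every environment."""
    return [(genotype, env) for genotype in multilocus_genotypes() for env in environments()]


design = full_design()
```

The full factorial design therefore contains 16 genotypes in each of four environments, or 64 genotype-by-environment combinations.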
Figure 3.2. Crossing scheme to generate BY and YJM F2B6 NILs. First, haploid versions of
BY and YJM were mated, and the resulting F1 diploid was sporulated to generate haploid F2
segregants. These F2s were then screened for growth in E37. A single F2 exhibiting poor growth
in E37 (shown in red) was chosen to serve as the progenitor for backcrossing. This F2 was then
backcrossed to both BY and YJM, and the resulting diploids were sporulated to generate haploid
F2B backcross segregants. Seven BY and seven YJM F2Bs that grow poorly in E37 were selected
to serve as the progenitors for additional backcrossing. Next, these strains were subjected to five
additional rounds of mating to the appropriate parent, sporulation, and selection for the
conditional poor growth phenotype to create 14 independent backcross lineages. Finally, a single,
haploid F2B6 exhibiting poor growth in E37 was chosen from each backcross lineage and
designated as a Nearly Isogenic Line (NIL). These NILs are expected to carry combinations of
alleles from one parent that collectively lead to poor growth in E37 when they co-occur in the
genetic background of the other parent.
3.3 Results and Discussion
3.3.1 Genetic mapping of poor growth in E37 by recurrent backcrossing and
selection
We screened 112 haploid BYxYJM F2s for growth on both glucose and ethanol at
both 30 and 37°C. We found that five of these individuals exhibited noticeably poor
growth specifically in E37 (Figure 3.1). To determine the genetic basis of this phenotype,
we used a recurrent backcrossing with phenotypic selection strategy (Figure 3.2). In
brief, we mated one of the five poorly growing F2s to both BY and YJM, and generated
and phenotyped at least 576 haploid F2B recombinants from each backcross (Methods).
14 F2Bs (seven per backcross) were then used to breed haploid Nearly Isogenic Lines
(NILs) that carry alleles that collectively cause poor growth in E37 (Figure 3.2;
Methods). To identify these alleles, we sequenced the genomes of the NILs to an average
per site coverage of 21X and identified genomic regions that had been introgressed
(Figure 3.3; Methods). Based on these data, we determined that three of the YJM NILs
harbored aneuploidies or appeared to be replicates of other NILs (Figure S3.1 and S3.2).
We excluded these individuals from all subsequent analyses. Among the remaining 11
NILs, we detected 41 introgressed genomic regions (Figure 3.3).
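To give a sense of why introgressed regions stand out in the NILs, the expected fraction of donor genome remaining after recurrent backcrossing can be computed under simple assumptions. This sketch assumes an F2 progenitor that is on average half donor, unlinked regions, and no selection; under those assumptions each backcross halves the expected donor fraction. Regions retained by phenotypic selection are exactly the ones that deviate from this expectation.

```python
from fractions import Fraction


def expected_donor_fraction(n_backcrosses, starting_fraction=Fraction(1, 2)):
    """Expected donor-genome fraction at unselected, unlinked regions.

    starting_fraction is the expected donor fraction in the haploid F2
    progenitor (one half under Mendelian segregation).
    """
    return starting_fraction * Fraction(1, 2) ** n_backcrosses


f2b6_fraction = expected_donor_fraction(6)  # six backcrosses give an F2B6
f2b6_percent = float(f2b6_fraction) * 100
```

Under these assumptions, an F2B6 individual is expected to retain less than 1% donor genome by chance.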
Figure 3.3 Introgressed genomic regions detected in the NILs. (A) Loci from YJM that were
introgressed into the BY genetic background are shown as orange boxes against a blue BY
genetic background. (B) Loci from BY that were introgressed into the YJM genetic background
are shown as blue boxes against an orange YJM genetic background. YJM NIL 3, which served
as the progenitor of the F2B7 population described later, is highlighted in red. (C) The number of
times each region was introgressed is shown. Selection markers used to generate haploid
progeny (MAT and CAN) are also highlighted in red. The Chromosome I, VII, X_1, X_2, and
XVI loci that segregate in the F2B7 population are denoted as ‘L I’, ‘L VII’, ‘L X_1’, ‘L X_2’, and
‘L XVI’, respectively.
3.3.2 Many introgressed loci have biological effects
To verify that the introgressed regions contribute to poor growth in E37, we
generated a population of haploid F2B7s by backcrossing YJM NIL 3 to YJM an
additional time. Ignoring a control marker at CAN1 on Chromosome V, five genomic
regions (Chromosome I, VII, X_1, X_2, and XVI) were polymorphic in the F2B7
population (Figure 3.3B and C). Four of these loci were detected in other YJM NILs
(Chromosome I, VII, X_1, and X_2), while the genomic region on Chromosome XVI
was unique to this NIL (Figure 3.3C). By screening 864 F2B7s, we obtained 45
individuals that grow poorly in E37 (Methods). These individuals, as well as a distinct
population of 192 random F2B7s, were then genotyped by low coverage whole genome
sequencing or restriction enzyme typing (Methods). We tested for allelic enrichment
among the poorly growing individuals relative to the random controls (Methods). Fisher’s
exact tests indicate that the Chromosome I, VII, X_1, and X_2 loci contribute to YJM
NIL 3’s poor growth in E37 (I: p ≤ 3.8 x 10^-8, VII: p ≤ 4 x 10^-20, X_1: p ≤ 8.4 x 10^-7, X_2:
p ≤ 1.6 x 10^-20; Figure S3.3), while the Chromosome XVI locus does not (XVI: p ≤ 0.34;
Figure S3.3). Given that the former loci were detected in two or more NILs and the latter
locus was only identified in a single NIL, these results suggest that loci that were detected
independently at least twice among the NILs have biological effects. Extension of this
finding to the entire set of introgressed genomic regions conservatively implicates at least
16 loci as contributors to poor growth in E37 (Figure 3.3C; Table S3.1).
3.3.3 Loci involved in poor growth in E37 mainly act in an additive manner
We analyzed the phenotypic effects of the Chromosome I, VII, X_1, and X_2 loci
using the population of 192 random F2B7s (Methods). These strains were quantitatively
phenotyped for growth in E37, and the additive and epistatic effects of the four loci were
assessed (Methods). In a full factorial ANOVA that included all possible additive effects
and pairwise or higher-order epistatic interactions (Methods), genetic factors explained
79.9% of the phenotypic variance (Table 3.1). Additive and epistatic effects accounted for 94%
and 6% of this genetic contribution to growth, respectively. Furthermore, 7, 11.1, 24.7,
and 32.4% of the phenotypic variance was explained by the Chromosome X_1, I, X_2,
and VII loci, respectively (Table 3.1). Each of these additive effects was highly
significant (F statistic > 60, d.f. numerator = 1, d.f. residuals = 175, p < 6 x 10^-13; Figure 3.4;
Table 3.1). In contrast, only four epistatic interactions showed significant effects (F
statistic > 5.1, d.f. numerator = 1, d.f. residuals = 175, p < 0.024). These were each pairwise
interactions that explained only between 0.6 and 2% of the phenotypic variance (Table
3.1). Thus, our results indicate that extremely poor growth in E37 has a genetic basis that
is almost entirely additive.
Source          Df   Sum Sq   Mean Sq   F value   Pr(>F)      PVE
I                1   3360.3    3360.3    96.142   <2.2e-16    11.1
VII              1   9836.2    9836.2   281.427   <2.2e-16    32.4
X_1              1   2116.2    2116.2    60.546   5.986e-13    7.0
X_2              1   7492.9    7492.9   214.383   <2.2e-16    24.7
I:VII            1     90.2      90.2     2.579   0.110031     0.3
I:X_1            1     19.1      19.1     0.545   0.460992     0.1
VII:X_1          1    597.4     597.4    17.092   5.516e-05    2.0
I:X_2            1    181.2     181.2     5.185   0.023983     0.6
VII:X_2          1    308       308       8.811   0.003413     1.0
X_1:X_2          1    221.1     221.1     6.326   0.01279      0.7
I:VII:X_1        1      3.1       3.1     0.087   0.767754     0.0
I:VII:X_2        1     28.5      28.5     0.815   0.367612     0.1
I:X_1:X_2        1      8.6       8.6     0.244   0.621402     0
VII:X_1:X_2      1      0.5       0.5     0.013   0.908321     0
I:VII:X_1:X_2    1      0.5       0.5     0.012   0.909587     0
Residuals      175   6116.4      35
Table 3.1 Full factorial ANOVA for E37 condition. PVE, percent of phenotypic variance
explained. Interaction terms are denoted by ‘:’.
Figure 3.4. Phenotypic effects of the Chromosome I, VII, X_1, X_2, and XVI loci in the four
conditions. Box plots showing the phenotypic effects of the four loci among the F2B7s in each
condition. Individuals carrying the BY and YJM alleles at each locus are shown in blue and
orange, respectively. Statistical significance was assessed using factor effect tests obtained from
the full factorial ANOVAs described in the main text. All four loci were found to have
statistically significant phenotypic effects on growth in G37, E30, and E37. ** and *** denote p ≤
0.01 and p ≤ 0.001, respectively.
We also examined the effects of the Chromosome I, VII, X_1, and X_2 loci in
G30, ethanol at 30ºC (‘E30’), and glucose at 37ºC (‘G37’). As a first step, full factorial
ANOVA models were implemented in each of these conditions. In G30, the only
significant effect was a higher-order epistatic interaction involving all four loci, which
explained 3.3% of the phenotypic variance (F statistic = 6.4, d.f. numerator = 1, d.f. residuals =
175, p < 0.013; Table S3.2). In comparison, full factorial models for E30 and G37
revealed that all four loci showed significant additive effects in both conditions (F
statistic > 7.3, d.f. numerator = 1, d.f. residuals = 175, p < 0.008; Figure 3.4; Tables S3.3 and
S3.4). The only other significant genetic effect in E30 or G37 occurred in the former
condition, with a pairwise epistatic interaction detected between the Chromosome VII
and X_1 loci (F statistic = 15.5, d.f. numerator = 1, d.f. residuals = 175, p = 0.0001; Table S3.3).
These results show that the Chromosome I, VII, X_1, and X_2 loci have
detectable additive effects in all three non-standard culturing conditions in our study,
indicating their effects are influenced by both carbon source and temperature (Figure 3.4;
Table 3.1; Tables S3.2 through S3.4).
3.3.4 Loci show a negative relationship between average growth level and additive
effect size
We next assessed how the effects of the Chromosome I, VII, X_1, and X_2 loci
change across conditions. Based on the aforementioned full factorial models, we found
that the average percent phenotypic variance explained by the additive effects of the four
loci was 0.48, 5.4, 9.2, and 18.8% in G30, G37, E30, and E37, respectively. These
changes in average effect size across conditions show an inverse relationship with the
average growth levels seen among F2B7s in the respective conditions, which exhibit the
relationship G30 > G37 > E30 > E37 (Figure 3.5A). These reductions in average growth
levels across conditions likely reflect increased environmental stress and suggest that
higher stress intensifies the effect magnitudes of the loci (Figure 3.4). Additionally,
because each of the loci shows a similar relationship between environmental stress and
effect magnitude, variability in growth within a given non-standard condition remains
predominantly additive in its genetic basis despite the presence of extensive gene- and
genotype-environment interaction across conditions (Figure 3.5B).
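
The averages quoted above can be re-derived from the tables in this chapter. The following is a minimal sketch in Python, used here only for illustration (the analyses in this chapter were run in R). The PVE values are copied from Table 3.1 and Tables S3.2 through S3.4; the variable and function names are hypothetical.

```python
# Additive PVE values (percent) for the Chromosome I, VII, X_1, and X_2 loci,
# copied from Table 3.1 (E37) and Tables S3.2-S3.4 (G30, E30, G37).
additive_pve = {
    "G30": [1.1, 0.2, 0.2, 0.4],
    "G37": [6.4, 6.1, 6.2, 3.1],
    "E30": [13.3, 15.3, 5.6, 2.7],
    "E37": [11.1, 32.4, 7.0, 24.7],
}


def mean_additive_pve(pve_by_condition):
    """Average the per-locus additive PVE within each condition."""
    return {cond: sum(vals) / len(vals) for cond, vals in pve_by_condition.items()}


mean_pve = mean_additive_pve(additive_pve)

# Conditions ordered from smallest to largest average additive effect.
ranked = sorted(mean_pve, key=mean_pve.get)

for cond in ranked:
    print(f"{cond}: mean additive PVE = {mean_pve[cond]:.2f}%")
```

Ordering the conditions by average additive PVE gives the reverse of the growth ordering G30 > G37 > E30 > E37 described in the text.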
Figure 3.5. Analysis of growth among 192 random F2B7s across the four conditions. (A)
Density plots of the median pixel intensities observed among segregants are plotted for each
condition. (B) Within each condition, the relationship between number of YJM alleles carried by
a segregant across the Chromosome I, VII, X_1, X_2, and XVI loci and phenotype is plotted. The
black lines represent equal size, additive effect regression models that were fit to the data for each
condition (Methods). Despite variability in allelic effect sizes among the four loci, these models
were highly significant in G37, E30, and E37 (p ≤ 10^-10), but not G30 (p = 0.67). Also, for each
condition, the fractions of phenotypic variance explained by all genetic factors and only additive
genetic factors are noted by R^2_G and R^2_A, respectively. These values were obtained from the full
factorial ANOVA models for each condition, rather than from the simpler regression models
illustrated in the plots (Methods).
3.3.5 Causal genes play roles in stress response
To help determine the mechanism that relates average growth level and allelic
effect size, we attempted to clone the causal genes underlying the four loci. The F2B7 data
allowed us to resolve the Chromosome I, VII, X_1, and X_2 loci to small intervals
containing on average 5,943 bp (Table S3.5; Note S3.1; Methods). For each candidate
gene in each locus, we performed allele replacements that included the promoter and
coding region (Methods). Specifically, the existing BY allele of each candidate gene was
replaced with the YJM allele in YJM NIL 3 (Methods). Through these experiments, we
were able to resolve the Chromosome VII, X_1, and X_2 loci to YGR250C, IKS1, and
VPS70, respectively (Figure 3.6). YGR250C encodes an RNA-binding protein that
localizes to stress granules (WOUT et al. 2009; CHERRY et al. 2012; MITCHELL et al.
2013). Stress granules are cytoplasmic messenger ribonucleoprotein (mRNP) complexes
that form in response to stress and are thought to aid in the translation of mRNAs by
increasing the local concentration of translation initiation factors (BUCHAN et al. 2008;
BUCHAN AND PARKER 2008; DECKER AND PARKER 2012). We were able to further resolve
the YGR250C locus to a derived, YJM-specific amino acid change in a predicted RNA binding
motif (Figure S3.4; Methods). As for IKS1, this gene encodes an uncharacterized protein
kinase that has been shown to be induced during mild heat stress and to alter the
sensitivity of yeast to a number of different small molecules (CHERRY et al. 2012).
Lastly, VPS70 encodes an uncharacterized protein involved in vacuolar protein sorting,
which is known to mediate cellular response to a wide range of environmental stresses
(LI AND KANE 2009; DUITAMA et al. 2014; VOORDECKERS et al. 2015). These findings
suggest that polymorphisms in different cellular processes involved in stress response
make major contributions to the heritable growth variation in our study.
Figure 3.6. Identification of the causal genes underlying the Chromosome VII, X_1, and
X_2 loci. Comparison of YJM NIL 3 to the YGR250C^YJM, IKS1^YJM, and VPS70^YJM allele
replacement strains supports a causal role for these genes in poor growth in E37.
3.4 Conclusion
We have determined the genetic basis of an example of GxE in which certain
yeast segregants exhibit extremely poor growth in a specific environmental condition.
Our results indicate that this poor growth is caused by a number of environmentally
responsive loci that individually show allelic effect sizes that increase with the severity of
environmental stress and collectively result in very poor growth under stressful
conditions. This finding provides support for the concept of decanalization, which has
been hypothesized to occur when environmental perturbation uncovers sets of deleterious
cryptic genetic variants that result in conditional disease phenotypes or other genotype-
environment interactions (GIBSON 2009). However, our results are also compatible with
recent work illustrating the largely additive genetic basis of quantitative trait variation in
yeast (BLOOM et al. 2013; BLOOM et al. 2015; LINDER et al. 2016). Indeed, our work
suggests that when many loci show similar gene-environment interactions with
environmental stress, decanalization can occur across conditions while trait variation
retains an additive genetic architecture within conditions.
The current study also provides a valuable contrast to previous work from our
group and others showing a substantial epistatic contribution to GxE (GERKE et al. 2010;
BHATIA et al. 2014; LEE et al. 2016). Here, we find that epistasis does not meaningfully
contribute to GxE in growth variation under our assay conditions. Although it could be
that we have somewhat underestimated the contribution of epistasis to our study by
focusing on a particular set of four loci, our results might also reflect a major difference
in the molecular mechanisms that give rise to the focal phenotypes in the present and past
studies. In particular, in previous work on colony morphology (LEE et al. 2016) and
sporulation (GERKE et al. 2010), the examined phenotypes were controlled by specific
gene regulatory networks involving multiple transcription factors. Genetic variability in
such networks is known to be an important source of pairwise and higher-order epistatic
interactions (OMHOLT et al. 2000; GJUVSLAND et al. 2017; TAYLOR AND EHRENREICH ;
TAYLOR AND EHRENREICH 2015a; TAYLOR AND EHRENREICH 2015b). In contrast, our
current effort is focused on growth, which unlike colony morphology or sporulation, is
not a phenotype that arises due to a single predominant gene regulatory network. Thus,
our past (LEE et al. 2016) and current findings suggest that GxE can show a range of
genetic architectures from completely additive to completely epistatic; where the genetic
architecture of GxE in a particular trait lies along this continuum likely depends on the
phenotype’s underlying molecular basis.
3.5 Materials and Methods
3.5.1 Generation of initial mapping population. Using the synthetic genetic array
marker system (TONG et al. 2001), 112 recombinant BYxYJM MATa segregants were
generated. The BY parent of our cross was MATα can1∆::STE2pr-SpHIS5 lyp1∆ his3∆,
while the YJM parent was MATa his3∆::NatMX ho::HphMX. The BY and YJM haploids
were mated to produce a diploid, which was then sporulated using standard techniques
(SHERMAN 1991). MATa segregants were obtained using random spore plating on
minimal media containing canavanine, as previously described (EHRENREICH et al. 2010;
TAYLOR AND EHRENREICH 2014).
3.5.2 Examination of growth among F2 segregants. Strains were phenotyped on 2%
agar plates containing yeast extract and peptone (YP) with either 2% glucose (dextrose)
or 2% ethanol as the carbon source (YPD and YPE, respectively) at 30°C or 37°C. Prior
to pinning onto the agar plates, strains were grown overnight to stationary phase in liquid
YPD. After this culturing step, strains were then pinned onto agar plates and allowed to
grow in the appropriate condition for five days. Individuals were considered poor
growing in E37 based on three replicate phenotyping experiments that were performed
using randomized designs. Qualitatively poor growth was never observed in G30, G37, or
E30.
3.5.3 Generation of BY and YJM NILs. Similar to our past work (TAYLOR AND
EHRENREICH 2014; MATSUI et al. 2015), F2B backcross segregants that grow poorly in
E37 were obtained by screening haploid progeny from backcrosses of a relevant
BYxYJM F2 segregant to MATα ho his3∆ versions of BY and YJM. Seven BY and seven
YJM F2Bs were then subjected to five additional rounds of backcrossing with selection
for maintenance of poor growth in E37. Each round of backcrossing was performed using
MATα his3∆ versions of BY and YJM. Sporulation and selection for MATa segregants
was performed as described for the initial F2 population.
3.5.4 Genotyping of BY and YJM NILs. The NILs were genotyped by Illumina
sequencing. Whole genome libraries were constructed using the Illumina Nextera kit,
with each library tagged with a unique barcode for multiplexing. Each library was
sequenced to an average per site genomic coverage of at least 21X on a NextSeq with 100
base pair (bp) x 100 bp reads. The BY and YJM parent strains were also sequenced to an
average per site genomic coverage of ~100X, and these data were used to identify 57,402
high confidence SNPs. Reads for the NILs were mapped to the S288c genome (version
S288C_reference_sequence_R64-2-1_20150113.fsa from the Saccharomyces Genome
Database; http://www.yeastgenome.org) using Burrows-Wheeler Aligner (BWA) version
0.7.7-r441 (LI AND DURBIN 2009), and mpileup files were generated with SAMTOOLS
(LI et al. 2009) version 0.1.19-44428cd. The default parameters for BWA and
SAMTOOLS were used for mapping Illumina reads to the genome. Genotypes for each
individual were called by taking the fraction of BY allele calls at each of the SNPs and
employing a Hidden Markov Model by chromosome, using the HMM() package version
1.0 in R, as described in (TAYLOR AND EHRENREICH 2014). The parameters used for
transition and emission probabilities were transProbs =
matrix(c(.9999,.0001,.0001,.9999),2) and emissionProbs = matrix(c(.0.5,0.5,0.5,0.5),2),
respectively. We also used the sequencing data to screen the NILs for aneuploidies. If the
average sequence coverage for any individual chromosome was 1.5 times higher or lower
than the average genome-wide sequencing coverage for a given individual, that strain
was classified as aneuploid. Two YJM NILs were found to be aneuploid and thus were
excluded from all analyses described in the paper (Figure S3.1). Additionally, we found
that two YJM NILs possessed nearly identical sets of introgressed regions, suggesting a
technical error on our part during the recurrent backcrossing process. Only one of these
NILs was included in our analyses (Figure S3.2).
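
The aneuploidy screen described above can be sketched as follows. This is an illustrative Python example, not the code used in this study. It assumes that "1.5 times lower" means a coverage ratio at or below 1/1.5 and that the genome-wide average is weighted by chromosome length. The function name, chromosome names, coverages, and lengths are all hypothetical.

```python
def flag_aneuploid_chromosomes(coverage, fold=1.5):
    """Return chromosomes whose coverage deviates from the genome-wide mean.

    coverage maps chromosome name -> (mean per-site coverage, length in bp).
    A chromosome is flagged if its coverage is at least `fold` times higher,
    or at least `fold` times lower, than the length-weighted genome mean.
    """
    total_bases = sum(cov * length for cov, length in coverage.values())
    total_length = sum(length for _, length in coverage.values())
    genome_mean = total_bases / total_length

    flagged = []
    for chrom, (cov, _) in coverage.items():
        ratio = cov / genome_mean
        if ratio >= fold or ratio <= 1 / fold:
            flagged.append(chrom)
    return flagged


# Hypothetical strain with three chromosomes; chr_b has roughly doubled coverage.
example_coverage = {
    "chr_a": (20.0, 200_000),
    "chr_b": (42.0, 300_000),
    "chr_c": (21.0, 500_000),
}
print(flag_aneuploid_chromosomes(example_coverage))
```

Because an extra chromosome copy also raises the genome-wide mean, a stricter implementation might compare each chromosome to the median of the other chromosomes instead; the text does not specify this detail.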
3.5.5 Generation of YJM NIL 3 F2B7 segregants. YJM F2B7 segregants were created by
backcrossing YJM NIL 3 (Figure 3.3B) to a MATα ho∆ his3∆ version of YJM.
Sporulation and selection for MATa segregants was performed as described for the initial
F2 population.
3.5.6 Genotyping of the F2B7 population. 96 YJM F2B7 random segregants and 45
additional F2B7s that grew poorly on E37 were genotyped by sequencing to an average
per site coverage of at least 5X using the same method described for the BY and YJM
NILs. An additional 96 YJM F2B7 random segregants were genotyped at the five loci that
had been introgressed into YJM NIL 3 using PCR and restriction enzyme typing. All
reactions are provided in Table S3.6. Fisher’s exact tests were then performed in R, using
two-by-two matrices in which the first row contained the counts of BY and YJM alleles
among the 45 F2B7s showing poor growth in E37, and the second row contained the
counts of BY and YJM alleles among the 192 YJM F2B7 random population. Allele
counts were measured at a single site for each locus that showed maximal allelic
enrichment among the 45 F2B7s that grow poorly in E37.
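
The structure of the two-by-two test described above can be illustrated with a short sketch. The study used Fisher's exact test in R; the Python version below is a standalone reimplementation for illustration, summing the probabilities of all tables that are no more likely than the observed one. The allele counts in the example are hypothetical and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are as probable as, or less probable than,
    # the observed table (small relative tolerance for floating-point ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Row 1: BY and YJM allele counts among 45 poorly growing F2B7s (hypothetical).
# Row 2: BY and YJM allele counts among 192 random F2B7s (hypothetical).
example_table = [[40, 5], [96, 96]]
p_value = fisher_exact_two_sided(example_table)
print(f"p = {p_value:.3g}")
```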
3.5.7 Phenotyping of the F2B7 population. To further analyze growth in the F2B7
population, we grew each of these individuals on all possible combinations of carbon
sources—glucose and ethanol—and temperatures—30 and 37ºC. Individuals were pinned
onto agar plates and then grown in the appropriate condition for three days. The plates
were then imaged using the BioRAD Gel Doc XR+ Molecular Imager. The dimensions of
all the images were set at 13.4x10 cm (WxL) and imaged under white Epi illumination
with an exposure time of 0.5 seconds. The images were then exported as tiff files with a
publishing resolution of 300dpi. To measure the pixel intensity of each colony, ImageJ
(SCHNEIDER et al.) was used. The total pixel intensity within a circle (spot radius = 50
pixels) surrounding each colony in the image was measured using the Plate Analysis JRU
v1 plugin for ImageJ, which was downloaded from the Stowers Institute ImageJ Plugins
page (http://research.stowers.org/imagejplugins/index.html; Figure S3.5). The Circ
Background option was used to control for background noise. The average pixel intensity
was determined by dividing the total pixel intensity by the area of the circle examined
(7845 pixels^2). Five biological replicate measurements using different randomized
designs were taken for each F2B7 in each condition. The median pixel intensity among
these five replicates was then used in downstream analyses.
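
The per-colony phenotype calculation described above amounts to a division followed by a median. A minimal Python sketch is given below for illustration; the circle area is the value reported in the text, while the function names and the replicate totals are hypothetical.

```python
from statistics import median

# Area of the measured circle as reported in the text (spot radius = 50 pixels).
CIRCLE_AREA_PX2 = 7845


def average_pixel_intensity(total_intensity, area=CIRCLE_AREA_PX2):
    """Convert a total pixel intensity into an average intensity per pixel."""
    return total_intensity / area


def colony_phenotype(replicate_totals):
    """Median of the average pixel intensities across replicate measurements."""
    return median(average_pixel_intensity(t) for t in replicate_totals)


# Five hypothetical replicate totals for one F2B7 segregant in one condition.
example_totals = [784500, 792345, 776655, 800190, 768810]
phenotype = colony_phenotype(example_totals)
print(phenotype)
```

Using the median rather than the mean makes the phenotype less sensitive to a single aberrant replicate, such as a mis-pinned colony.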
3.5.8 Quantitative analysis of the effect of the causal loci on growth. To measure the
additive and epistatic effects of the Chromosome I, VII, X_1, and X_2 loci among the
F2B7s in a particular condition, we implemented full factorial ANOVAs in R.
Specifically, we modeled the median pixel intensity of the F2B7 segregants in each
condition as a function of all possible additive and epistatic effects involving the four
loci. The model was specified using the statement:
lm(median_pixel_intensity_for_each_condition ~ genotype_at_locus_I *
genotype_at_locus_VII * genotype_at_locus_X_1 * genotype_at_locus_X_2). ANOVA
tables were then obtained using the anova() function. In addition to the terms provided by
R, we computed the percent of phenotypic variance explained for each locus by dividing
the sum of squares associated with a particular term by the sum of squares total (Table 3.1
and Tables S3.2, S3.3, and S3.4). Respectively, the fractions of phenotypic variance
explained by all genetic effects (R^2_G) or only additive genetic effects (R^2_A) were
computed by summing the fractions of phenotypic variance explained by all genetic
terms or only additive genetic terms in a given model.
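
As a worked example of this calculation, the sketch below recomputes the percent variance explained for E37 from the sums of squares in Table 3.1. It is written in Python for illustration only (the ANOVA itself was run in R), and the variable names are hypothetical.

```python
# Sums of squares for each genetic term in the E37 model, copied from Table 3.1.
sum_sq = {
    "I": 3360.3, "VII": 9836.2, "X_1": 2116.2, "X_2": 7492.9,
    "I:VII": 90.2, "I:X_1": 19.1, "VII:X_1": 597.4,
    "I:X_2": 181.2, "VII:X_2": 308.0, "X_1:X_2": 221.1,
    "I:VII:X_1": 3.1, "I:VII:X_2": 28.5, "I:X_1:X_2": 8.6,
    "VII:X_1:X_2": 0.5, "I:VII:X_1:X_2": 0.5,
}
residual_ss = 6116.4

# Total sum of squares is the sum over all genetic terms plus the residuals.
total_ss = sum(sum_sq.values()) + residual_ss

# Percent of phenotypic variance explained (PVE) by each term.
pve = {term: 100 * ss / total_ss for term, ss in sum_sq.items()}

# Additive terms are those without ':' (interaction terms contain ':').
additive_terms = [term for term in sum_sq if ":" not in term]
r2_g = sum(pve.values())                       # all genetic terms
r2_a = sum(pve[term] for term in additive_terms)  # additive terms only

print(f"R^2_G = {r2_g:.1f}%, R^2_A = {r2_a:.1f}%")
print(f"additive share of genetic variance = {100 * r2_a / r2_g:.0f}%")
```

Computed from the unrounded sums of squares, R^2_A is about 75.1%, which differs slightly from the 75.2% obtained by adding the rounded PVE values in Table 3.1.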
3.5.9 Modeling of growth as a function of the number of YJM alleles an individual
carries. Within each condition, we modeled the median pixel intensities of the F2B7s as a
function of how many YJM alleles they carried. This model assumes complete additivity
with loci showing equal effect sizes. These linear models were fit in R using the lm()
function with the statement lm(median_pixel_intensity_for_each_condition ~
number_of_YJM_alleles_at_four_loci).
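
The equal-effect additive model above is an ordinary least-squares line with a single predictor. The Python sketch below shows the calculation that lm() performs in this case; it is an illustration rather than the code used in this study, and the function name and data points are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical segregants: number of YJM alleles carried at the four loci
# and the corresponding median pixel intensity in one condition.
n_yjm_alleles = [0, 1, 1, 2, 2, 3, 3, 4]
median_intensity = [60.0, 66.0, 70.0, 74.0, 78.0, 84.0, 86.0, 92.0]

slope, intercept, r_squared = fit_line(n_yjm_alleles, median_intensity)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.3f}")
```

Here the slope is the shared per-allele effect that the model assigns to every locus, which is why this model is simpler than the full factorial ANOVA.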
3.5.10 Genetic engineering. All transformations were conducted using standard PCR-
based techniques (ERDENIZ et al. 1997). Allele replacement strains were constructed
using the co-transformation of two partially overlapping PCR products as described in
(MATSUI et al. 2015). One product contained the promoter and coding region of the gene
to be replaced, while the other included (in order) 60 bp of overlap with the 3’ end of the
gene PCR product, kanMX, and 60 bp of the genomic region downstream of the
transcribed portion of the gene, such that the entire coding and the promoter region of a
given gene was replaced. All genetic engineering was performed in YJM NIL 3 and involved
replacement of the BY allele of a given gene with the YJM allele. Each putative allele
replacement was verified by Sanger sequencing. Controls were also generated to ensure
that inserting kanMX near each gene was not responsible for our findings.
3.5.11 Population, phylogenetic, and functional analysis of the causal polymorphism
in YGR250C. DNA sequences for other S. cerevisiae strains were downloaded from the
Saccharomyces Genome Database (http://www.yeastgenome.org), as well as from
different S. cerevisiae resequencing projects (LITI et al. 2009; STROPE et al. 2015). DNA
sequence alignments were then generated using Geneious v7.0.6 and the amino acid
sequences of these other isolates were determined by translating the DNA sequence
alignment. The amino acid sequences of other closely related fungal species were
obtained using WU-BLAST2 with default settings (http://www.yeastgenome.org/blast-
fungal). The putative RNA binding motifs of YGR250C were then identified from domain
predictions available through InterPro (http://www.ebi.ac.uk/interpro/protein/P53316)
(MITCHELL et al. 2015).
3.7 Supplementary Materials
Figure S3.1 Whole genome sequencing reveals two YJM NILs are aneuploid
Figure S3.2 YJM NIL 2 and another YJM NIL show similar introgressed genomic regions.
One YJM NIL, which is denoted as YJM NIL 2*, was excluded from further study as it appears
to be a replicate of YJM NIL 2.
Figure S3.3 Four of the five introgressed genomic regions in YJM NIL 3 contribute to poor
growth in E37. Frequencies of the BY alleles at each locus in the populations of poorly growing
and control F2B7s are plotted. The Chromosome I, VII, X_1, and X_2 loci show statistically
significant differences in their frequencies between the two populations (Fisher’s exact tests: I: p
≤ 3.84 x 10^-8, VII: p ≤ 3.98 x 10^-20, X_1: p ≤ 8.38 x 10^-7, X_2: p ≤ 1.56 x 10^-20), while the locus on
Chromosome XVI did not (XVI: p ≤ 0.341). The significant loci are denoted with ‘***’.
Figure S3.4 YGR250C^YJM contains an amino acid change in a highly conserved site. (A)
Amino acid differences between BY and YJM are shown with either a black line (non-causal) or
a red line (causal). The three predicted RNA recognition motifs within YGR250C are labeled in
purple. (B) The causal amino acid polymorphism in YJM is highlighted in red and other sites that
differ from S. cerevisiae are highlighted in grey. Based on presently available genomes from
recent resequencing projects (LITI et al. 2009; STROPE et al. 2015) or the Saccharomyces
Genome Database (CHERRY et al.), YJM is the only budding yeast that harbors an amino acid at
position 542 that is not a leucine or an isoleucine.
Figure S3.5. Sample image of how growth in the F2B7 population was measured using the
Plate Analysis plugin for ImageJ.
Contributing parent Chromosome Start Position End Position
BY V-1 97416 208850
BY V-2 361243 371216
BY XI 617954 632869
BY XIII-1 103752 116112
BY XIII-2 409643 434776
BY XIII-3 817457 864535
BY XIV-1 196326 242111
BY XIV-2 349812 356084
BY XV 388886 459980
YJM I 35751 58166
YJM IV 928114 997571
YJM VII 956516 1009525
YJM X-1 237551 363087
YJM X-2 609577 673602
YJM XII 967268 1001139
YJM XV 585644 638733
Table S3.1 Genomic intervals that were introgressed in at least 2 NILs
Source Df Sum Sq Mean Sq F value Pr(>F) PVE
I 1 114.5 114.54 2.0715 0.15186 1.1
VII 1 21.6 21.64 0.3914 0.53236 0.2
X_1 1 25.2 25.17 0.4552 0.50076 0.2
X_2 1 44.6 44.63 0.8072 0.37018 0.4
I:VII 1 4.3 4.33 0.0784 0.77985 0
I:X_1 1 4.3 4.26 0.0771 0.78163 0
VII:X_1 1 171.3 171.35 3.0989 0.08009 1.6
I:X_2 1 23.7 23.74 0.4293 0.5132 0.2
VII:X_2 1 14.7 14.68 0.2656 0.60698 0.1
X_1:X_2 1 0.1 0.13 0.0023 0.96178 0
I:VII:X_1 1 2.9 2.92 0.0527 0.81862 0
I:VII:X_2 1 26.9 26.93 0.487 0.48618 0.3
I:X_1:X_2 1 73.9 73.93 1.3371 0.24912 0.7
VII:X_1:X_2 1 14.3 14.27 0.258 0.61212 0.1
I:VII:X_1:X_2 1 351.8 351.84 6.363 0.01254 3.3
Residuals 175 9676.4 55.29
Table S3.2 Full factorial ANOVA for G30 condition.
Source Df Sum Sq Mean Sq F value Pr(>F) PVE
I 1 633.57 633.57 43.2163 5.435e-10 13.3
VII 1 727.81 727.81 49.644 4.068e-11 15.3
X_1 1 264.69 264.69 18.0548 3.483e-05 5.6
X_2 1 126.27 126.27 8.6129 0.0037859 2.7
I:VII 1 49.21 49.21 3.3569 0.0686245 1.0
I:X_1 1 14.74 14.74 1.0055 0.3173638 0.3
VII:X_1 1 227.52 227.52 15.5191 0.0001178 4.8
I:X_2 1 0.14 0.14 0.0092 0.9235038 0
VII:X_2 1 13.24 13.24 0.903 0.343296 0.3
X_1:X_2 1 0.34 0.34 0.0234 0.8787173 0
I:VII:X_1 1 18.75 18.75 1.2791 0.2596155 0.4
I:VII:X_2 1 53.92 53.92 3.678 0.0567623 1.1
I:X_1:X_2 1 38.06 38.06 2.5961 0.1089287 0.8
VII:X_1:X_2 1 17.01 17.01 1.1606 0.2828289 0.4
I:VII:X_1:X_2 1 0.4 0.4 0.0275 0.8684786 0
Residuals 175 2565.59 14.66
Table S3.3 Full factorial ANOVA for E30 condition.
Source Df Sum Sq Mean Sq F value Pr(>F) PVE
I 1 829.1 829.05 15.0553 0.0001477 6.4
VII 1 797.3 797.29 14.4786 0.0001958 6.1
X_1 1 802.8 802.84 14.5792 0.0001863 6.2
X_2 1 404.2 404.18 7.3398 0.0074147 3.1
I:VII 1 1.2 1.19 0.0216 0.8833552 0
I:X_1 1 49.6 49.57 0.9001 0.3440591 0.4
VII:X_1 1 66.5 66.48 1.2073 0.2733816 0.5
I:X_2 1 46.1 46.1 0.8372 0.3614623 0.4
VII:X_2 1 5.9 5.94 0.1078 0.7430638 0
X_1:X_2 1 125.3 125.31 2.2756 0.1332302 1.0
I:VII:X_1 1 21.8 21.8 0.3958 0.5300677 0.2
I:VII:X_2 1 8.6 8.58 0.1558 0.6935174 0.1
I:X_1:X_2 1 0.1 0.08 0.0015 0.968715 0
VII:X_1:X_2 1 3.3 3.33 0.0605 0.8060051 0
I:VII:X_1:X_2 1 210.6 210.64 3.8252 0.0520784 1.6
Residuals 175 9636.7 55.07
Table S3.4 Full factorial ANOVA for G37 condition.
Chromosome Start position End Position Causal Gene
I 52603 53266 ?
VII 987973 996630 YGR250C
X_1 326422 329730 IKS1
X_2 651274 662418 VPS70
Table S3.5 Genetic intervals identified in the 45 F2B7 segregants with poor growth in E37.
Table S3.6 PCR primers and restriction enzymes used for genotyping F2B7s.
Introgressed locus  Primer    Sequence                   Restriction enzyme used  Allele that is cut
Chr I               F primer  TGATATGTTTGGTTTTGCTTATAGA  HpyCH4III                BY
Chr I               R primer  AAGGTTGGGGTACGAATTGC       HpyCH4III                BY
Chr VII             F primer  AATGTCCCAGATGGTTCTGC       MnlI                     BY
Chr VII             R primer  TGATTGAACATGCGCGTACT       MnlI                     BY
Chr X-1             F primer  CCAAAGTTGTTTTCTTAATCATCGT  KpnI                     BY
Chr X-1             R primer  AAGGAAAGCGTTGAAAAGCA       KpnI                     BY
Chr X-2             F primer  CCAATCTTTGTTGCTCACACC      BanI                     YJM
Chr X-2             R primer  GACACACGAGGAAGTACAACCA     BanI                     YJM
Chr XVI             F primer  GGGGCGCTCTTGTATAAGTAA      BslI                     YJM
Chr XVI             R primer  ACAACTACGGTGGCCATACC       BslI                     YJM
Note S3.1 Attempt to clone causal gene underlying the Chromosome I locus.
Even though we were able to resolve the genomic interval for the Chromosome I
locus down to ~600bp containing one gene, GEM1, the replacement of this gene with the
YJM allele in the YJM NIL 3 genetic background did not restore growth. To identify the
causal gene, we next expanded our search to genomic intervals where the BY allele was
second and third most enriched in the 45 F2B7 segregants with poor growth in E37. This
decreased the resolution to ~19kb, containing 8 genes. We again tried replacing all the
genes in the interval. However, none of the replacements had any effect on growth. This
failure to resolve the causal gene could have several explanations. One possibility is
that we may need to expand our search even further. Another possibility is that there may
be multiple genes with effects on growth in the Chromosome I locus. In this case,
replacement of one gene may not be sufficient to restore growth. A further possibility is
that the marked allele replacement strategy we employed disrupted a functional element
necessary for detecting the effect of the causal polymorphism in the Chromosome I locus,
such as a 3’ UTR or a transcription factor binding site.
Chapter 4: The complex underpinnings of genetic background effects
This work is currently under review at Nature Communications
4.1 Abstract
Spontaneous and induced mutations often show different phenotypic effects across
genetically distinct individuals. Although these background effects are known to result
from epistasis between mutations and standing polymorphisms, their underlying genetic
architecture remains poorly characterized. Here, we genotyped 1,411 wild type and
mutant segregants from the same budding yeast cross, and measured their growth in 10
environments. Using these data, we mapped 1,086 genetic interactions between
segregating loci and seven different gene knockouts. Between 73 and 543 interactions
were identified for each knockout, with 89% of the detected interactions involving
higher-order epistasis between a knockout and multiple loci. Identified loci interacted
with as few as one knockout and as many as all seven knockouts. Loci that interacted
with fewer knockouts tended to show enhanced phenotypic effects in mutants. In
contrast, loci that interacted with more knockouts typically had reduced phenotypic
effects in mutants. Analysis of the identified loci across environments found that most
interactions between the knockouts and segregating loci also depended on the
environment. These results provide detailed insights into the complicated interactions
between mutations, standing polymorphisms, and the environment that ultimately cause
background effects.
4.2 Introduction
Background effects occur when the same spontaneous or induced mutations show
different phenotypic effects across genetically distinct individuals (NADEAU 2001;
DOWELL et al. 2010; CHANDLER et al. 2013; TAYLOR AND EHRENREICH 2015b; CHOW ;
LEE et al. 2016; TAYLOR et al. 2016). Countless examples of background effects have
been described across species and traits (NADEAU 2001; CHANDLER et al. 2013),
collectively suggesting that this phenomenon is common in biological systems and plays
a significant role in many phenotypes. For example, alleles that show background effects
contribute to a wide range of hereditary disorders, including, but not limited to, certain
colorectal cancers, hypertension, and phenylketonuria (COOPER et al. 2013). Background
effects may also impact other disorders that frequently involve de novo mutations, such
as autism (SANDERS et al. 2012), congenital heart disease (JIN et al. 2017), and
schizophrenia (FROMER et al. 2014). Additionally, it has been proposed that background
effects can shape the potential trajectories of evolutionary adaptation (CARLBORG et al.
2016; JERISON et al. 2017), influence the emergence of novel traits (TAYLOR et al. 2016),
and help maintain deleterious genetic variation within populations (HEMANI et al. 2013).
Despite the importance of background effects to biology and medicine,
understanding of their causal genetic mechanisms remains limited. Although, in general terms,
background effects are known to arise from genetic interactions (or ‘epistasis’) between
mutations and standing polymorphisms (MACKAY 2014; TAYLOR AND EHRENREICH
2015a; SACKTON AND HARTL ; CHANDLER et al. 2017; EHRENREICH 2017; MATSUI et al.
2017), only recently have studies begun to provide deeper insights into the architecture of
epistasis underlying background effects. These papers indicate that background effects
often involve multiple polymorphisms that interact not only with a mutation, but also
with each other (RUTHERFORD AND LINDQUIST 1998; QUEITSCH et al. 2002; VAN
SWINDEREN AND GREENSPAN 2005; DOWELL et al. 2010; JAROSZ AND LINDQUIST 2010;
CHANDLER et al. 2013; PAABY et al. 2015; TAYLOR AND EHRENREICH 2014; LEE et al.
2016; TAYLOR et al. 2016) and the environment (LEE et al. 2016). This work also
suggests that background effects are caused by a mixture of loci that show enhanced and
reduced phenotypic effects in mutants relative to wild type individuals (RUTHERFORD
AND LINDQUIST 1998; QUEITSCH et al. 2002; GIBSON AND DWORKIN 2004; JAROSZ AND
LINDQUIST 2010; TIROSH et al. 2010; CHANDLER et al. 2017; RICHARDSON et al. 2013;
PAABY AND ROCKMAN 2014; TAYLOR AND EHRENREICH 2015b; GEILER-SAMEROTTE et
al. 2016; LEE et al. 2016; SCHELL et al. ; TAYLOR et al. 2016; EHRENREICH 2017).
Together, these previous reports imply that the phenotypic effect of a mutation in a given
genetic background can depend on an individual’s genotype at a potentially large number
of loci that interact in complicated, highly contextual ways. However, this point is
difficult to explicitly show because doing so requires systematically mapping the
interactions between mutations, polymorphisms, and environment that give rise to
background effects.
In this paper, our goal was to perform a detailed genetic characterization of a
number of background effects across multiple environments. Previous work in yeast, as
well as other model species, has established that mutations in chromatin regulation and
transcription often show background effects (TIROSH et al. 2010; RICHARDSON et al.
2013; CHANDLER et al. 2014; TAYLOR AND EHRENREICH 2015b; VU et al. 2015; LEE et
al. 2016; TAYLOR et al. 2016). We extended this past work by knocking out seven
different chromatin regulators in a cross of the BY4716 (‘BY’) and 322134S (‘3S’)
strains of Saccharomyces cerevisiae. We generated and genotyped 1,411 wild type and
knockout segregants, measured the growth of these individuals in 10 environments, and
performed linkage mapping with these data. In total, we identified 1,086 interactions
between the knockouts and segregating loci. These interactions allowed us to obtain
novel, detailed insights into the genetic architecture of background effects across
different mutations and environments.
4.3 Results
4.3.1 Preliminary screen
When a mutation that exhibits background effects is introduced into a population,
the phenotypic variance among individuals will often change (RUTHERFORD AND
LINDQUIST 1998; QUEITSCH et al. 2002; GEILER-SAMEROTTE et al. 2016; SCHELL et al.
2016). Here, we attempted to identify mutations that induce such changes in phenotypic
variance. Specifically, we screened 47 complete gene knockouts of histones, histone
modifying enzymes, chromatin remodelers, and other chromatin-associated genes for
impacts on phenotypic variance in segregants from a cross of the BY and 3S strains of
budding yeast (Figures S4.1 and S4.2; Table S4.1; Methods). To do this, we generated
BY/3S diploid hemizygotes, sporulated these hemizygotes to obtain haploid knockout
segregants, and then quantitatively phenotyped these BYx3S knockout segregants for
growth on rich medium containing ethanol, an environment in which we previously
found background effects that influence yeast colony morphology (TAYLOR AND
EHRENREICH 2014; TAYLOR AND EHRENREICH 2015b; LEE et al. 2016; TAYLOR et al.
2016). For each panel of segregants, three biological replicate end-point colony growth
assays were performed and averaged. We then tested whether the knockout segregants
exhibited significantly higher phenotypic variance than wild type segregants using
Levene’s test (Table S4.2). This analysis implicated CTK1 (a kinase that regulates RNA
polymerase II), ESA1 and GCN5 (two histone acetyltransferases), HOS3 and RPD3 (two
histone deacetylases), HTB1 (a copy of histone H2B), and INO80 (a chromatin
remodeler) as knockouts that show strong background effects in the BYx3S cross (Figure
S4.1; Note S4.1).
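The variance comparison described above can be illustrated with a short Python sketch that computes Levene's W statistic for two panels of segregants. This is not the code used in the study: the function name and the growth values are hypothetical, and the mean-centered form of the test is assumed because the text does not state whether group means or medians were used. The p-value, which would come from an F distribution with (k - 1, N - k) degrees of freedom, is not computed here.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W statistic (mean-centered form) for equality of variances."""
    # Absolute deviation of each observation from its own group mean.
    z = []
    for g in groups:
        centre = mean(g)
        z.append([abs(y - centre) for y in g])
    n_total = sum(len(g) for g in z)
    k = len(z)
    z_group_means = [mean(g) for g in z]
    z_grand_mean = sum(sum(g) for g in z) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(g) * (m - z_grand_mean) ** 2
                  for g, m in zip(z, z_group_means))
    within = sum((v - m) ** 2
                 for g, m in zip(z, z_group_means) for v in g)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical normalized growth values, not data from the study.
wild_type = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
knockout = [0.4, 1.6, 0.7, 1.3, 0.2, 1.8]
w = levene_w(wild_type, knockout)
```

A large W indicates that the spread of growth values differs between the panels. Deciding whether the knockout panel specifically has the higher variance still requires comparing the two sample variances directly.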
4.3.2 Linkage mapping of mutation-independent and mutation-responsive effects
To map loci that interact with the seven knockouts identified in the screen, we
genotyped 1,411 segregants in total. This included 164 wild type, 210 ctk1∆, 122 esa1∆,
215 gcn5∆, 220 hos3∆, 177 htb1∆, 141 ino80∆, and 162 rpd3∆ segregants (Figure S4.3;
Tables S4.3 through S4.5; Methods). These genotyped segregants were phenotyped for
growth in 10 diverse environments using replicated end-point colony growth assays
(Figure S4.4; Table S4.6; Methods). We note that, despite causing increased phenotypic
variance in ethanol, the knockouts induced a broad range of phenotypic responses in
other environments (Figure S4.4).
As described in detail in the Methods, genome-wide linkage mapping scans were
conducted within each environment. To maximize statistical power, we analyzed the
1,411 segregants together using a fixed effects linear model that accounted for genetic
background. We identified individual loci, as well as two- and three-way genetic
interactions among loci, that exhibited the same phenotypic effect across the wild type
and knockout backgrounds (hereafter, ‘mutation-independent’ effects). We also
conducted scans for individual loci, as well as two- and three-way genetic interactions
among loci, that exhibited different phenotypic effects in at least one knockout
background relative to the wild type background (hereafter, ‘mutation-responsive’
effects). Post hoc tests were used to associate mutation-responsive effects with specific
knockouts. Mutation-responsive one-, two-, and three-locus effects can alternatively be
viewed as two-, three-, and four-way interactions where one of the involved genetic
factors is a knockout. However, to avoid confusion throughout the paper, we do not count
the knockouts as genetic factors. Instead, we classify each genetic effect as
mutation-independent or mutation-responsive and report how many loci it involves. Representative
examples of mutation-responsive effects are shown in Figure 4.1.
Figure 4.1 Examples of mutation-responsive genetic effects. a shows representative examples
of one-, two-, and three-locus mutation-responsive effects with larger phenotypic effects in wild
type segregants than mutants. In contrast, b shows representative examples of one-, two-, and
three-locus mutation-responsive effects with larger phenotypic effects in mutants than wild type
segregants. Means depicted along the y-axis show residuals from a fixed effects linear model that
includes the mutation-independent effect of each involved locus, as well as any possible lower-
order mutation-independent and mutation-responsive effects. The different genotype classes are
plotted below the x-axis. Blue and orange boxes correspond to the BY and 3S alleles of a locus,
respectively. Error bars represent one standard deviation from the mean.
In total, we detected 1,211 genetic effects across the 10 environments (Figures
S4.5 and S4.6; Tables S4.7 through S4.9; Notes S4.2 and S4.3). 125 (10%) of these genetic
effects were mutation-independent, while 1,086 (90%) were mutation-responsive (Figure
4.2a). On average, we identified 121 genetic effects per environment, 109 of which were
mutation-responsive. However, the number of detected genetic effects varied
significantly across the environments, from 15 in room temperature to 359 in ethanol.
Despite this variability, in every environment, 47% or more of the identified genetic
effects were mutation-responsive. This suggests that, regardless of environment, most
genetic effects in the cross were responsive to the knockouts. Additionally, the seven
knockouts exhibited major differences in their numbers of mutation-responsive effects.
Between 73 and 118 mutation-responsive effects were found for the CTK1, ESA1, GCN5,
HTB1, INO80, and RPD3 knockouts (Figure 4.2b). In contrast, the HOS3 knockout had
543 mutation-responsive effects (Figure 4.2b).
Figure 4.2 Most mutation-responsive genetic effects involve multiple loci. In a, the numbers of
mutation-independent and mutation-responsive genetic effects detected in each environment are
shown. In b, the aggregate numbers of mutation-responsive effects found for each knockout
across the 10 environments are provided.
4.3.3 Most mutation-responsive effects involve higher-order epistasis
While only 29% (36 of 125) of the mutation-independent effects involved
multiple loci, this proportion was more than tripled (89%; 965 of 1,086) among the
mutation-responsive effects (Figure 4.2a). Simulations indicate our statistical power to
detect mutation-responsive loci was appreciably higher for single locus effects than for
multiple locus effects, suggesting that our results may underestimate the importance of
higher-order epistasis to background effects (Figure S4.7). To better assess how loci
involved in the identified higher-order interactions contribute to background effects, we
partitioned the individual and joint contributions of involved loci to mutation-responsive
phenotypic variance (Methods). For mutation-responsive two-locus effects, on average,
78% of the mutation-responsive phenotypic variance was attributed to the higher-order
interaction between the knockout and both loci (Figure 4.3a). Likewise, among
mutation-responsive three-locus effects, on average, 58% of the mutation-responsive
phenotypic variance was explained by the higher-order interaction of the knockout and
the three loci (Figure 4.3b). Thus, most mutation-responsive effects involve multiple loci
that contribute to background effects predominantly through their higher-order
interactions with each other and a mutation, rather than through their individual
interactions with a mutation.
Figure 4.3 Higher-order epistasis among knockouts and multiple loci is an important
contributor to background effects. In a, for each mutation-responsive two-locus effect, we
partitioned the individual and joint contributions of the two loci. ‘L1’ and ‘L2’ refer to the
involved loci, while ‘KO’ denotes the relevant knockout. We determined the relative phenotypic
variance explained (PVE) by interactions between the knockout and each individual locus (i.e., KO x
L1 and KO x L2) and higher-order epistasis involving the knockout and the two loci (i.e., KO x L1 x
L2). Similarly, in b, for each mutation-responsive three-locus effect, we determined the relative
PVE for all possible mutation-responsive one-, two-, and three-locus effects involving the
participating loci. In both a and b, relative PVE values were calculated using sum of squares
obtained from ANOVA tables, as described in the Methods. Mutation-responsive effects that
interact with multiple knockouts are shown multiple times, once for each knockout.
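The relative PVE calculation described in this caption can be sketched in Python as follows. The sketch assumes that each term's sum of squares is divided by the total over the mutation-responsive terms only; the exact normalization is given in the Methods, which may differ. The function name and the sums of squares are hypothetical.

```python
def relative_pve(sum_of_squares):
    """Scale each term's sum of squares by the total over the listed terms."""
    total = sum(sum_of_squares.values())
    return {term: ss / total for term, ss in sum_of_squares.items()}

# Hypothetical ANOVA sums of squares for one mutation-responsive
# two-locus effect; these are not values from the study.
ss = {"KO x L1": 1.0, "KO x L2": 1.5, "KO x L1 x L2": 7.5}
pve = relative_pve(ss)
```

With these illustrative numbers, the higher-order term KO x L1 x L2 accounts for three quarters of the mutation-responsive variance attributed to the pair of loci.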
4.3.4 Environment plays a strong role in background effects
The role of the environment in background effects has yet to be fully
characterized. Although our group previously showed that the genetic architecture of
background effects can significantly change across environments (LEE et al. 2016), this
past work focused on only a modest number of segregating loci and environments. To
more generally assess how the environment influences the genetic architecture of
background effects, we determined whether the 1,086 mutation-responsive effects
impacted phenotype in environments outside the ones in which they were originally
detected. This analysis was performed using statistical thresholds that were more liberal
than those employed in our initial genetic mapping (Methods). 29% (311) of the
mutation-responsive effects were detectable in additional environments, with this
proportion varying between 7% and 65% across the 10 environments (Figure 4.4). Of
these mutation-responsive effects, 64% (200) were identified in only one additional
environment, 28% (85) were found in two additional environments, and just 8% (26)
were detected in three or more additional environments. Given the limited resolution of the data, it
is possible that some of the mutation-responsive effects that were detected in multiple
environments in fact represent distinct, closely linked loci that act in different
environments. Such linkage would lead us to overestimate how often
mutation-responsive effects contribute to background effects in different environments, further
suggesting that most mutation-responsive effects act in a limited number of
environments. These findings support the idea that background effects are caused by
complex interactions between not only mutations and polymorphisms, but also the
environment.
Figure 4.4 Analysis of mutation-responsive effects across environments. The height of each
stacked bar represents the number of mutation-responsive effects that were detected in a given
environment. The bars are color-coded according to the number of additional environments in
which these mutation-responsive effects could be detected when liberal statistical thresholds were
employed (Methods).
4.3.5 Interactions between segregating loci and different knockouts
We next looked at how the same mutation-responsive effects interact with
different knockouts. Based on involvement of the same loci, the 1,086 mutation-
responsive effects were collapsed into 594 distinct mutation-responsive effects that
showed epistasis with at least one knockout (Methods). 65% of these mutation-
responsive effects were found in only one knockout background, while 35% were
identified in two or more knockout backgrounds (Figure S4.8). 97% of the mutation-
responsive effects that interacted with only one knockout were HOS3-responsive and
these effects represented 69% of the total interactions detected in a hos3∆ background
(Figure 4.5a). In contrast, nearly all (between 95% and 100%) of the CTK1-, ESA1-,
GCN5-, HTB1-, INO80-, and RPD3-responsive effects were detected in multiple
backgrounds (Figure 4.5a). Although the mutation-responsive effects exhibited a broad,
continuous range of responses to the knockouts (Figure 4.5b), they could be partitioned
into two qualitative classes—‘enhanced’ and ‘reduced’—based on whether they
explained more or less phenotypic variance in mutants than wild type segregants,
respectively. The distinct mutation-responsive effects exhibited a strong relationship
between their number of interacting knockouts and how they were classified. Mutation-
responsive effects that interacted with fewer than three knockouts predominantly were in
the enhanced class, while mutation-responsive effects that interacted with four or more
knockouts typically were in the reduced class (χ² = 709.37, d.f. = 6, p = 5.81 x 10⁻¹⁵⁰;
Figure 4.5c). These results illustrate how background effects are caused by a mixture of
loci that respond specifically to mutations in particular genes and loci that respond more
generically to mutations in different genes, with the relative contribution of these two
classes of loci varying significantly across mutations. Our findings also suggest that how
loci respond to mutations in a particular gene is related to the degree to which they
interact with mutations in other genes.
Figure 4.5 Analysis of mutation-responsive effects across knockout backgrounds. In a, the
numbers of mutation-responsive effects that interacted with only one knockout (pink) or interacted
with multiple knockouts (blue) are shown for each knockout. In b, the PVE for each mutation-
responsive effect is shown in the relevant knockout segregants, as well as in the wild type
segregants. The PVE for each mutation-responsive effect was determined using fixed-effects
linear models fit within each individual background (Methods). Mutation-responsive effects are
color-coded by the knockout population in which they were identified. In c, the percentage of
mutation-responsive effects that showed larger phenotypic effects in mutants than wild type
segregants (y-axis, left side) and mutation-responsive effects that showed larger phenotypic
effects in wild type segregants than mutants (y-axis, right side) is depicted. These values are
plotted as a function of the number of knockouts that interact with a given mutation-responsive
effect.
4.3.6 Mutation-responsive effects correlate with mutation-induced changes in
phenotypic variance
Lastly, we looked at the extent to which the identified mutation-responsive effects
in aggregate related to differences in phenotypic variance between the knockout and wild
type versions of the BYx3S cross across the 10 environments. Among the 70 different
combinations of the 7 knockouts and 10 environments, we found a highly significant
relationship between differences in the numbers of mutation-responsive effects with
reduced and enhanced phenotypic effects and knockout-induced changes in phenotypic
variance (Figure 4.6; Spearman’s ρ = 0.84, p = 4.33 x 10⁻²⁰). No such relationship was
seen when we looked at the mean phenotypic changes induced by the mutations (Figure
S4.9). To control for potential biases in our analyses that might arise from allele
frequency differences among the backgrounds (Figure S4.3), we performed the same
analysis on each knockout background individually using data from the 10 environments.
When we did this, we found that all seven knockout backgrounds exhibited nominally
significant correlations between observed changes in phenotypic variance and detected
mutation-responsive effects across environments (Spearman’s ρ > 0.71, p < 0.02; Figure
S4.10). Permutations indicate the probability of observing this result by chance is very
low (p < 10⁻⁵). These findings are consistent with the knockout-induced changes in
phenotypic variance resulting from a large number of epistatic loci with small phenotypic
effects (Figure S4.11). In summary, our results not only provide valuable insights into
the genetic architecture of background effects, but also illustrate how interactions
between mutations, segregating loci, and the environment can influence a population’s
phenotypic variance.
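The rank correlation reported in this section can be illustrated with a minimal Python sketch of Spearman's ρ, computed as the Pearson correlation of average ranks. The function names and the five data points are hypothetical, and the direction of the difference on the second axis (enhanced minus reduced) is an assumption made for the example.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the tie
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical knockout-by-environment combinations, not study data.
variance_change = [-0.02, 0.01, 0.05, 0.10, 0.30]   # V_P.Mut minus V_P.WT
enhanced_minus_reduced = [-12, -3, -3, 8, 40]        # difference in effect counts
rho = spearman_rho(variance_change, enhanced_minus_reduced)
```

Average ranks are used because differences in counts of effects are integers and are therefore likely to contain ties.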
Figure 4.6 Mutation-responsive effects underlie differences in phenotypic variance between
knockout and wild type backgrounds across environments. Each point’s position on the x-axis
represents the difference in phenotypic variance between a knockout background of the cross
(‘V_P.Mut’) and the wild type background of the cross (‘V_P.WT’) in a single environment. The y-axis
shows the difference in the number of mutation-responsive effects with ‘enhanced’ and ‘reduced’
phenotypic effects. In this paper, we classified mutation-responsive effects as enhanced or
reduced based on whether they explained more or less phenotypic variance in mutants relative to
wild type segregants, respectively. Spearman’s ρ and its associated p-value are provided on the
plot. Colors denote different knockout backgrounds.
4.4 Discussion
Most prior studies of background effects have described specific examples
without identifying the contributing loci. Here, we used a screen of 47 different
chromatin regulators to identify 7 knockout mutations that exhibit strong background
effects in a yeast cross. We then generated and phenotyped a panel of 1,411 mutant and
wild type segregants. Using these data, we detected 1,086 genetic interactions that
involve the 7 knockouts and loci that segregate in the cross. To better understand the
genetic architecture of background effects, we comprehensively examined how these loci
interact not only with the knockouts, but also with each other and the environment. Our
results confirm important points about the genetic architecture of background effects that
to date have been suggested but not conclusively proven. Namely, background effects can
be highly polygenic, with many, if not most, loci contributing through higher-order
genetic interactions that involve a mutation and multiple loci. These loci can respond to
mutations in different ways, such as by exhibiting enhanced and reduced phenotypic
effects in mutants relative to wild type individuals. Moreover, most of these interactions
between mutations and segregating loci also involve the environment. Altogether, these
findings shed light on the complex genetic and genotype-environment interactions that
give rise to background effects.
Our work also illustrates how the genetic architecture of background effects
varies significantly across different mutated genes. In our study, response to six of the
seven knockouts was mediated almost exclusively by loci that respond to mutations in
different genes and predominantly exhibit reduced effects in mutants relative to wild type
segregants. Given that some of the examined chromatin regulators have counteracting or
unrelated biochemical activities (LI et al. 2007; RANDO AND WINSTON 2012), we propose
that loci detected in multiple knockout backgrounds respond generically to perturbations
of cell state or fitness, rather than to any specific biochemical process. In contrast,
response to the HOS3 knockout was largely mediated by loci that were not detected when the
other genes were compromised. Why so many loci responded specifically to perturbation
of HOS3 is difficult to infer from current understanding of Hos3’s biochemical activities.
Although Hos3 can deacetylate all four of the core histones (CARMEN et al. 1999) and
influence chromatin regulation in certain genomic regions (ROBYR et al. 2002), it also
plays roles in cell cycle (WANG AND COLLINS 2014) and nuclear pore regulation (KUMAR
et al. 2018). Thus, further work is needed to characterize HOS3 and its extensive epistasis
with polymorphisms in the BYx3S cross.
In addition to advancing understanding of background effects, our results may
also have more general implications for the genetic architecture of complex traits. Many
phenotypes, including common disorders like autism (SANDERS et al. 2012) and
schizophrenia (FROMER et al. 2014), are influenced by loss-of-function mutations that
occur de novo or persist within populations at low frequencies. We have shown that these
mutations can significantly change the phenotypic effects of many polymorphisms within
a population by altering how these polymorphisms interact with each other and the
environment. Although these complicated interactions between mutations, standing
polymorphisms, and the environment are often ignored in genetics research, our study
suggests that they in fact play a major role in determining the relationship between
genotype and phenotype.
4.5 Materials and Methods
4.5.1 Generation of different knockout backgrounds of the BYx3S cross
All BYx3S segregants described in this paper were generated using the synthetic genetic
array marker system, which makes it possible to obtain MATa haploids by digesting
tetrads and selecting for spores on minimal medium lacking histidine and containing
canavanine (TONG AND BOONE 2006) (Figure S4.1). We first constructed a BY/3S diploid
by mating a BY MATa can1Δ::STE2pr-SpHIS5 his3∆ strain to a 3S MATα ho::HphMX
his3∆::NatMX strain. This diploid served as the progenitor for the wild type segregants.
Hemizygous complete gene deletions were engineered into this wild type BY/3S diploid
to produce the progenitors of the knockout segregants. Genes were deleted using
transformation with PCR products composed of (in the following order) 60 bp
of genomic sequence immediately upstream of the targeted gene, KanMX, and 60 bp of
genomic sequence immediately downstream of the targeted gene. Lithium acetate
transformation was employed (GIETZ AND WOODS 2002). To obtain a given knockout,
transformants were selected on rich medium containing G418, ClonNAT, and
Hygromycin B, and PCR was then used to check transformants for correct integration of
the KanMX cassette. These PCRs were conducted with primer pairs where one primer
was located within KanMX and the other primer was located adjacent to the expected site
of integration. PCR products were Sanger sequenced. Wild type and hemizygous
knockout diploids were sporulated using standard techniques. Low-density random spore
plating (around 100 colonies per plate) was then used to obtain haploid BYx3S
segregants from each wild type and knockout background of the cross. Wild type
segregants were isolated directly from MATa selection plates, while knockout segregants
were first replica plated from MATa selection plates onto G418 plates, which selected for
the gene deletions.
4.5.2 Genotyping of segregants
Segregants were genotyped using low-coverage whole genome sequencing. A sequencing
library was prepared for each segregant using the Illumina Nextera kit and custom
barcoded adapters. Libraries from different segregants were pooled in equimolar fractions
and these multiplex pools were size selected using the Qiagen gel extraction kit.
Multiplexed samples were sequenced by BGI on an Illumina HiSeq 2500 using 100 bp x
100 bp paired-end reads. For each segregant, reads were mapped against the S288c
genome (version S288C_reference_sequence_R64-2-1_20150113.fsa from
https://www.yeastgenome.org) using BWA version 0.7.7-r44 (LI AND DURBIN 2009).
Pileup files were then produced with SAMTOOLS version 0.1.19-44428cd (LI et al.
2009). BWA and SAMTOOLS were run with their default settings. Base calls and
coverages were obtained from the pileup files for 36,756 previously identified high
confidence SNPs that segregate in the cross (TAYLOR et al. 2016). Individuals who showed
evidence of being aneuploid, diploid, or contaminated based on unusual patterns of
coverage or heterozygosity were excluded from further analysis. We also used the data to
confirm the presence of KanMX at the gene that had been knocked out. Individuals with
an average per site coverage <1.5X were removed from the dataset. A vector containing
the fraction of 3S calls at each SNP was generated and used to make initial genotype calls
with sites above and below 0.5 classified as 3S and BY, respectively. This vector of
initial genotype calls was then corrected with a Hidden Markov Model, implemented
using the HMM package version 1.0 in R (RABINER 1989). We used the following
transition and emission probability matrices: transProbs =
matrix(c(.9999,.0001,.0001,.9999),2) and emissionProbs =
matrix(c(0.25,0.75,0.75,0.25),2). We examined the HMM-corrected genotype calls for
adjacent SNPs that lacked recombination in the segregants. In such instances, a single
SNP was chosen to serve as the representative for the entire set of adjacent SNPs that
lacked recombination. This reduced the number of markers used in subsequent analyses
from 36,756 to 8,311.
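The genotype-correction step can be illustrated with a self-contained Python sketch of Viterbi decoding for a two-state hidden Markov model. It is not the R HMM package used in the study. The transition values (0.9999 and 0.0001) and the emission values (0.75 and 0.25) are taken from the text, but several points are assumptions: that 0.75 is the probability of an observed call matching the true genotype, that both parental states are equally likely at the first SNP, and that a site with a 3S fraction of exactly 0.5 is called BY. All function names are hypothetical.

```python
from math import log

STATES = ("BY", "3S")
STAY, SWITCH = 0.9999, 0.0001    # transition probabilities from the text
MATCH, MISMATCH = 0.75, 0.25     # emission probabilities (orientation assumed)

def initial_calls(fraction_3s):
    """Call 3S when more than half of the reads at a SNP support 3S."""
    return ["3S" if f > 0.5 else "BY" for f in fraction_3s]

def viterbi(calls):
    """Most likely sequence of true genotypes given noisy per-SNP calls."""
    def emit(state, call):
        return log(MATCH if state == call else MISMATCH)

    def move(prev, state):
        return log(STAY if prev == state else SWITCH)

    # Assumption: equal start probabilities for the two parental states.
    score = {s: log(0.5) + emit(s, calls[0]) for s in STATES}
    back = []
    for call in calls[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + move(p, s))
            new_score[s] = score[best_prev] + move(best_prev, s) + emit(s, call)
            pointers[s] = best_prev
        back.append(pointers)
        score = new_score
    # Trace back from the best-scoring final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical calls along one chromosome: an isolated 3S call inside a BY
# block, followed by a genuine switch to a 3S block.
noisy = ["BY"] * 10 + ["3S"] + ["BY"] * 10 + ["3S"] * 12
corrected = viterbi(noisy)
```

Because switching states is heavily penalized, the isolated 3S call is treated as a sequencing or mapping error and relabeled BY, while the run of twelve consecutive 3S calls is retained as a true recombination event.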
4.5.3 Phenotyping of segregants
Prior to phenotyping, segregants were always inoculated from freezer stocks into YPD
broth containing 1% yeast extract (BD 212750), 2% peptone (BD 211677),
and 2% dextrose (BD 15530). After these cultures had reached stationary
phase, they were pinned onto and outgrown on plates containing 2% agar (BD 214050).
Unless specified, these plates were made with YPD and incubated at 30°C for two days.
However, some of the environments required adding a chemical compound to the YPD
plates, or changing the temperature or carbon source. In addition to YPD at 30°C, we
measured growth in the following environments: 21°C, 42°C, 2% ethanol (Koptec
A06141602W), 250 ng/mL 4-Nitroquinoline 1-oxide (‘4NQO’) (TCI N0250), 9 mM
copper sulfate (Sigma 209198), 50 mg/mL fluconazole (TCI F0677), 260 mM hydrogen
peroxide (EMD Millipore HX0640-5), 7 mg/mL neomycin sulfate (Gibco 21810-031),
and 5 mg/mL zeocin (Invivogen ant-zn-1). For 4NQO, copper sulfate, fluconazole,
hydrogen peroxide, neomycin sulfate, and zeocin, the doses used for phenotyping were
chosen based on preliminary experiments across a broader range of concentrations
(Table S4.6). Growth assays were conducted in triplicate using a randomized block
design to account for positional effects on the plates. Four BY controls were included on
each plate. Plates were imaged using the BioRAD Gel Doc XR+ Molecular Imager. Each
image was 11.4 cm x 8.52 cm (width x length) and imaged under white epi-illumination
with an exposure time of 0.5 seconds. Images were exported as Tiff files with a
resolution of 600 dpi. As in MATSUI AND EHRENREICH (2016), image analysis was
conducted in ImageJ software, with pixel intensity for each colony calculated using the
Plate Analysis JRU v1 plugin (http://research.stowers.org/imagejplugins/index.html). The
growth of each segregant on each plate was computed by dividing the segregant’s total
pixel intensity by the mean pixel intensity of the BY controls from the same
plate. The replicates for a segregant within an environment were then averaged and used
as that individual’s phenotype in subsequent analyses.
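The normalization and averaging described above amount to the following arithmetic, shown here as a small Python sketch. The function names and pixel intensities are hypothetical; only the procedure (divide by the mean of the BY controls on the same plate, then average across replicates) comes from the text.

```python
from statistics import mean

def normalized_growth(segregant_intensity, by_control_intensities):
    """Growth of one colony relative to the BY controls on the same plate."""
    return segregant_intensity / mean(by_control_intensities)

def phenotype(replicates):
    """Average the plate-normalized growth across replicate plates.

    replicates: list of (segregant_intensity, [BY control intensities]),
    one entry per replicate plate.
    """
    return mean([normalized_growth(s, c) for s, c in replicates])

# Hypothetical pixel intensities for one segregant on three replicate
# plates, each with four BY controls.
plates = [
    (1200, [1000, 1000, 1000, 1000]),
    (900, [1000, 1100, 900, 1000]),
    (1500, [1500, 1500, 1500, 1500]),
]
growth = phenotype(plates)
```

Normalizing within each plate before averaging means that plate-to-plate differences in imaging or growth conditions affect the segregant and its controls together and largely cancel.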
4.5.4 Scans for one-locus effects
All genetic mapping was conducted within each environment using fixed effects linear
models applied to the complete set of 1,411 wild type and knockout segregants. To
ensure that mean differences in growth among the eight backgrounds were always
controlled for during mapping, we included a background term in our models.
Throughout the paper, we refer to loci or combinations of loci that statistically interact or
do not statistically interact with the background term as mutation-responsive and
mutation-independent, respectively. Genetic mapping was performed in R using the “lm”
function, with the p-values for relevant terms obtained from tables generated using the
“summary” function.
We first identified individual loci that show mutation-independent or mutation-
responsive phenotypic effects using forward regression. To detect mutation-independent
loci, genome-wide scans were conducted with the model phenotype ~ background +
locus + error. Significant loci identified in this first iteration were then used as covariates
in the next iteration, i.e., phenotype ~ background + known_locus1 … known_locusN +
locus + error, where the known_locus terms corresponded to each of the loci that had
already been identified in a given environment. To determine significance, 1,000
permutations were conducted at each iteration of the forward regression, with the
correspondence between genotypes and phenotypes randomly shuffled each time. Among
the minimum p-values obtained in the permutations, the 5th percentile was identified and
used as the threshold for determining significant loci. This process was iterated until no
additional loci could be detected for each environment.
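The permutation procedure used to set genome-wide thresholds can be sketched in Python as follows. This is a simplified stand-in rather than the study's R code: it omits the background term and known-locus covariates, and it uses the maximum absolute genotype-phenotype correlation across markers in place of the minimum p-value. For a single-marker regression without covariates the two rankings agree, so the 95th percentile of the maximum correlation plays the role of the 5th percentile of the minimum p-value. All names and simulated data are hypothetical.

```python
import random

def abs_correlation(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5

def permutation_threshold(genotypes, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide threshold from the null distribution of the best marker.

    genotypes: list of markers, each a list of 0/1 calls per segregant.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype correspondence
        maxima.append(max(abs_correlation(m, shuffled) for m in genotypes))
    maxima.sort()
    return maxima[min(n_perm - 1, int((1 - alpha) * n_perm))]

# Simulated example: 60 segregants, 20 markers, marker 0 is causal.
rng = random.Random(1)
n = 60
markers = [[i % 2 for i in range(n)]]
markers += [[rng.randint(0, 1) for _ in range(n)] for _ in range(19)]
trait = [2.0 * markers[0][i] + rng.gauss(0, 0.1) for i in range(n)]
threshold = permutation_threshold(markers, trait, n_perm=200)
observed = abs_correlation(markers[0], trait)
```

Taking the extreme statistic across all markers within each permutation is what makes the threshold genome-wide: it controls the chance that any marker exceeds it under the null, rather than the chance for one marker at a time.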
To identify mutation-responsive one-locus effects, we employed the same procedure
described in the preceding paragraph, except the model phenotype ~ background + locus
+ background:locus + error was used. Here, the significance of the background:locus
interaction term was tested, again with significance determined by permutations as
described in the preceding paragraph. The locus term was included in the model to ensure
that phenotypic variance explained by mutation-independent effects did not load onto the
mutation-responsive effects. For each locus with significant background:locus terms, we
included both an additive and background interaction term in the subsequent iterations of
the forward regression: i.e., phenotype ~ background + known_locus1 … known_locusN
+ locus + background:known_locus1 … background:known_locusN + background:locus
+ error. The known_locus terms were included in these forward regression models to
ensure that variance due to the mutation-independent effects of previously identified loci
was not inadvertently attributed to the mutation-responsive terms for these loci. This
process was iterated until no additional background:locus terms were discovered for each
environment.
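To make the distinction between the two kinds of one-locus effect concrete, the sketch below computes the effect of a locus separately within each background, as the difference in mean phenotype between the two alleles. A mutation-independent locus has a similar effect in every background, whereas the background:locus term captures a locus whose effect differs in at least one background. This is an illustration in Python, not the R code used in this study. The function name and toy records are hypothetical, and no significance test is performed.

```python
from collections import defaultdict
from statistics import mean


def per_background_effects(records):
    """records: iterable of (background, allele, phenotype) tuples,
    with allele coded as "BY" or "3S".
    Returns {background: mean(3S) - mean(BY)}."""
    groups = defaultdict(lambda: {"BY": [], "3S": []})
    for background, allele, phenotype in records:
        groups[background][allele].append(phenotype)
    return {bg: mean(g["3S"]) - mean(g["BY"]) for bg, g in groups.items()}


# Toy data: the locus has no effect in wild type but an effect in one knockout.
toy = [
    ("WT", "BY", 1.0), ("WT", "3S", 1.0),
    ("WT", "BY", 1.0), ("WT", "3S", 1.0),
    ("ko", "BY", 0.5), ("ko", "3S", 0.75),
    ("ko", "BY", 0.5), ("ko", "3S", 0.75),
]
effects = per_background_effects(toy)
```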
4.5.5 Scans for two-locus effects
We also performed full-genome scans for two-locus effects. Here, every unique pair of
loci was interrogated using fixed effects linear models like those described in the
preceding section. As with the one-locus effects, we employed two models in parallel.
The model phenotype ~ background + locus1 + locus2 + locus1:locus2 + error was used
to identify mutation-independent two-locus effects, whereas the model phenotype ~
background + locus1 + locus2 + background:locus1 + background:locus2 +
locus1:locus2 + background:locus1:locus2 + error was employed to detect mutation-
responsive two-locus effects. Specifically, we tested for significance of the locus1:locus2
and background:locus1:locus2 interaction terms with the former and latter models,
respectively. Simpler terms were included in each model to ensure that variance was not
erroneously attributed to more complex terms. Significance thresholds for these terms
were determined using 1,000 permutations with the correspondence between genotypes
and phenotypes randomly shuffled each time. However, to reduce computational run
time, 10,000 random pairs of loci, rather than all possible pairs of loci, were examined in
each permutation. Significance thresholds were again established based on the 5th
quantile of minimum p-values observed across the permutations. To ensure that our main
findings were robust to threshold, we also generated results at False Discovery Rates
(FDRs) of 0.01, 0.05, and 0.1 by comparing the rate of discoveries at a given p-value in
the permutations to the rate of discoveries at that same p-value in our results (Table S4.7;
Note S4.2).
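The FDR comparison described above can be sketched as a ratio of discovery rates. The version below is a simplified Python illustration rather than the procedure actually implemented (see Note S4.2). Rates are used instead of counts because the permutations examined only a random subset of locus pairs. The function name and the cap at 1 are assumptions made for the example.

```python
def estimated_fdr(real_pvalues, permuted_pvalues, cutoff):
    """Ratio of the discovery rate in the permutations to the discovery
    rate in the real scan at a given p-value cutoff, capped at 1."""
    real_rate = sum(p <= cutoff for p in real_pvalues) / len(real_pvalues)
    null_rate = sum(p <= cutoff for p in permuted_pvalues) / len(permuted_pvalues)
    if real_rate == 0:
        return 1.0  # no discoveries, so the estimate is uninformative
    return min(1.0, null_rate / real_rate)
```

Scanning candidate cutoffs and keeping the largest one whose estimate stays at or below 0.01, 0.05, or 0.1 would then give a threshold for each FDR level.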
4.5.6 Scans for three-locus genetic effects
Due to computational limitations, we were unable to run a comprehensive scan for
mutation-independent and mutation-responsive three-locus effects. Instead, we scanned
for three-locus effects involving two loci that had already been identified in a given
environment (known_locus1 and known_locus2) and a third locus that had yet to be
detected (locus3). The model phenotype ~ background + known_locus1 + known_locus2
+ locus3 + known_locus1:known_locus2 + known_locus1:locus3 +
known_locus2:locus3 + known_locus1:known_locus2:locus3 + error was used to
identify mutation-independent three-locus effects, whereas the model phenotype ~
background + known_locus1 + known_locus2 + locus3 + background:known_locus1 +
background:known_locus2 + background:locus3 + known_locus1:known_locus2 +
known_locus1:locus3 + known_locus2:locus3 +
background:known_locus1:known_locus2 + background:known_locus1:locus3 +
background:known_locus2:locus3 + known_locus1:known_locus2:locus3 +
background:known_locus1:known_locus2:locus3 + error was employed to detect
mutation-responsive three-locus effects. Significance of the
known_locus1:known_locus2:locus3 and
background:known_locus1:known_locus2:locus3 terms in the respective models was
determined using 1,000 permutations with the correspondence between genotypes and
phenotypes randomly shuffled each time. For each permutation, 10,000 trios of sites were
chosen by first randomly picking two loci on different chromosomes and then randomly
selecting an additional 10,000 sites. The minimum p-value across the 10,000 tests was
retained. Significance thresholds were again established based on the 5th quantile of
minimum p-values observed across the permutations. As with the two-locus effect scans,
we also performed our analysis across multiple FDR thresholds to ensure that our
findings were robust (Table S4.7; Note S4.2).
4.5.7 Assignment of mutation-responsive effects to specific knockouts
In the aforementioned linkage scans, genetic effects exhibited statistical interactions with
the background term if they had a different phenotypic effect in at least one of the eight
backgrounds relative to the rest. To determine the specific knockouts that interacted with
each mutation-responsive effect, we used the “contrast” function from the R package
lsmeans. This was applied to the specific effect of interest post-hoc using the same linear
models that were employed for detection. All possible pairwise contrasts between wild
type and knockout segregants were conducted. Mutation-responsive effects were assigned
to specific mutations if the contrast between a mutation and a WT population was
nominally significant. Unless otherwise noted, we counted each assignment of a
mutation-responsive effect to a specific knockout as a separate genetic effect even if they
involved the same set of loci.
4.5.8 Statistical power analysis
To determine the statistical power of our mapping procedures, we simulated phenotypes
for the 1,411 genotyped segregants given their genotypes at randomly chosen loci and
then tried to detect these loci using the approaches described earlier. In each simulation, a
given segregant’s phenotype was determined based on both the mutation it carried (if
any), as well as its genotype at one, two, or three randomly chosen loci. The effects of
mutations were calculated based on the real phenotype data for the glucose environment.
Phenotypic effects of the mutation-responsive locus or loci were also attributed to each
segregant. Specifically, the phenotypes of segregants in only one of the possible genotype
classes were increased by a given increment, which we refer to as the absolute effect size.
For one-, two-, and three-locus effects, this respectively entailed half, one-quarter, and
one-eighth of the individuals having their phenotypes increased by the increment. In the
case of mutation-independent effects, these increments were applied to all eight of the
wild type and knockout backgrounds. In contrast, for mutation-responsive effects,
increments were only applied to one of the eight backgrounds, with the specific
background randomly chosen. Lastly, random environmental noise was added to each
segregant’s phenotype. Using these genotype and phenotype data, we tested whether we
could detect the loci that had been given a phenotypic effect. This was done by fitting the
appropriate fixed-effects linear model, extracting the p-value for the relevant term, and
determining if that p-value fell below a nominal significance threshold of α = 0.05.
Statistical power was calculated as the proportion of tests at a given phenotypic
increment where p ≤ 0.05. The results of this analysis are shown in Figure S4.7.
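The logic of the power simulation can be illustrated with a stripped-down Python sketch. It is not the procedure used for Figure S4.7: it ignores the eight backgrounds and the mutation effects, draws noise from a standard normal distribution (an assumption), and replaces the fixed-effects linear model with a two-group comparison based on a normal approximation. All names are hypothetical.

```python
import math
import random
import statistics


def approx_p(group_a, group_b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    z = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_power(effect, n_loci=1, n=400, noise_sd=1.0, n_sim=200, seed=0):
    """Proportion of simulations in which the single incremented genotype
    class differs from all other segregants at p <= 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        carriers, others = [], []
        for _ in range(n):
            genotype = [rng.randint(0, 1) for _ in range(n_loci)]
            y = rng.gauss(0.0, noise_sd)
            if all(genotype):
                # Only one genotype class receives the increment: on average
                # 1/2, 1/4, or 1/8 of segregants for one, two, or three loci.
                carriers.append(y + effect)
            else:
                others.append(y)
        if len(carriers) < 2 or len(others) < 2:
            continue
        if approx_p(carriers, others) <= 0.05:
            hits += 1
    return hits / n_sim
```

Calling simulate_power over a grid of increments, for example from 0 to 1 in steps of 0.05, yields a power curve of the kind summarized in the figure.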
4.5.9 Contributions of individual loci to mutation-responsive two- and three-locus
effects
For all mutation-responsive two- and three-locus effects, we determined the proportion of
mutation-responsive phenotypic variance explained by each individual locus and the
interactions among these loci. To do this, we generated seven subsets of data, each of
which was comprised of the wild type segregants and one set of knockout segregants.
We then fit the same model that was used to originally identify a given mutation-
responsive effect to the appropriate data subsets. For two-locus effects, we obtained the
sum of squares for the background:locus1, background:locus2, and
background:locus1:locus2 terms. We then divided each of these values by the sum of all
three sums of squares. For three-locus effects, we obtained the total sum of squares
associated with the individual loci (background:locus1, background:locus2,
background:locus3) and with the pairs of loci (background:locus1:locus2,
background:locus1:locus3, background:locus2:locus3), as well as the sum of squares
associated with the trio of loci (background:locus1:locus2:locus3). We then divided the
total sum of squares associated with each class of terms by the total sum of squares across
all mutation-responsive terms. The ternary plots used to show these results were
generated using the R package ggtern.
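The arithmetic behind these relative contributions is simple enough to show directly. The Python sketch below is illustrative only. It assumes the sums of squares have already been extracted from an ANOVA table and are supplied as a dictionary keyed by term name, and the function names are hypothetical.

```python
def relative_contributions(sums_of_squares):
    """Divide each sum of squares by the total across all supplied terms."""
    total = sum(sums_of_squares.values())
    return {term: ss / total for term, ss in sums_of_squares.items()}


def pool_by_order(sums_of_squares):
    """Pool background-interaction terms by the number of loci they involve."""
    labels = {1: "individual loci", 2: "pairs of loci", 3: "trio of loci"}
    pooled = {label: 0.0 for label in labels.values()}
    for term, ss in sums_of_squares.items():
        # "background:locus1:locus3" mentions two loci, and so on.
        pooled[labels[term.count("locus")]] += ss
    return pooled
```

For a two-locus effect, relative_contributions is applied to the three mutation-responsive terms directly. For a three-locus effect, the seven terms are first pooled into the three classes that form the axes of a ternary plot.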
4.5.10 Analysis of mutation-responsive effects across environments
We determined whether each one-, two-, and three-locus mutation-responsive effect
exhibited a phenotypic effect in any environment outside the one in which it was
originally detected. To do this, we used seven subsets of data, each of which was
comprised of the wild type segregants and one set of knockout segregants. We then fit the
same model that was used to originally identify a given mutation-responsive effect to the
appropriate data subsets for each of the nine additional environments. The p-value was
then extracted for the relevant term. Bonferroni corrections were used to account for
multiple testing.
4.5.11 Phenotypic variance explained by mutation-responsive effects in wild type
and knockout segregants
We measured the phenotypic variance explained by each mutation-responsive genetic
effect in the relevant knockout background(s), as well as in the wild type background.
Here, we fit each mutation-responsive genetic effect in both populations without using
any background term. For mutation-responsive one-, two-, and three-locus effects, the
following models were respectively employed: phenotype ~ locus1 + error, phenotype ~
locus1 + locus2 + locus1:locus2 + error, and phenotype ~ locus1 + locus2 + locus3 +
locus1:locus2 + locus1:locus3 + locus2:locus3 + locus1:locus2:locus3 + error. Partial
R² values were obtained in each population by obtaining the sum of squares associated
with the term of interest and dividing it by the total sum of squares. Mutation-responsive
effects were then classified by the number of knockout backgrounds in which they were
detected. For each class, the number of genetic effects with larger partial R² values in the
knockout background than the wild type background (enhanced effects) and the number
of genetic effects with smaller partial R² values in the knockout background than the wild
type background (reduced effects) were determined. The proportion of mutation-
responsive effects that show enhanced and reduced phenotypic effects was calculated for
each class. 95% bootstrap confidence intervals were then generated for the proportions
measured in each class.
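A bootstrap confidence interval for such a proportion can be sketched as follows. The text does not state which bootstrap variant was used, so the percentile method shown here is an assumption, as are the function name and the number of resamples. The example is in Python for illustration.

```python
import random


def bootstrap_ci(flags, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the proportion of True
    values in flags (e.g. True = enhanced effect, False = reduced effect)."""
    rng = random.Random(seed)
    n = len(flags)
    # Resample the effects with replacement and recompute the proportion.
    props = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return props[lo_idx], props[hi_idx]
```

Each class of mutation-responsive effects would be resampled separately, giving one interval per class.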
4.5.12 Checking potential consequences of allele frequency bias
Allele frequency bias may result in the erroneous detection of mutation-responsive
genetic effects due to uneven representation of one-, two-, or three-locus combinations
across the knockout and wild type backgrounds. To account for this, we generated 2x8,
4x8, and 8x8 contingency tables for all one-, two-, or three-locus interactions
respectively, counting each of the possible allele combinations in the wild type and seven
knockout populations. Specifically, for one-locus interactions, we counted the number of
individuals carrying the BY and 3S allele at the significant locus for each population. For
two-locus interactions, the number of individuals carrying the BY/BY, BY/3S, 3S/BY,
and 3S/3S alleles at the two loci were enumerated. For three-locus interactions, the
number of individuals carrying the BY/BY/BY, BY/BY/3S, BY/3S/BY, 3S/BY/BY,
BY/3S/3S, 3S/BY/3S, 3S/3S/BY, and 3S/3S/3S alleles at the three loci were counted. We
then ran chi-square tests to identify individual loci or combinations of loci that show
different frequencies across the eight backgrounds, using Bonferroni corrections to
account for multiple testing. After filtering out genetic effects that involve loci or
combinations of loci with biased frequencies, we repeated our main analyses to ensure
that our results were robust to allele frequency differences (Figures S4.5 and S4.6; Table
S4.8; Note S4.3).
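The chi-square statistic for these contingency tables follows the standard Pearson formula, which the Python sketch below computes for a table of any size. It is illustrative only: it returns the statistic and degrees of freedom but not the p-value, which would require the chi-square distribution from a statistics library, and it assumes that no row or column total is zero. The function name is hypothetical.

```python
def chi_square_statistic(table):
    """table: list of rows of counts, e.g. 2 rows (BY, 3S) by 8 columns
    (wild type plus seven knockout populations).
    Returns (Pearson chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof
```

A 2x8, 4x8, or 8x8 table therefore has 7, 21, or 49 degrees of freedom, respectively.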
4.6 Supplementary Materials
Figure S4.1 Generation of BYx3S knockout segregants. For each knockout or wild type
background in our study, a BY/3S diploid was generated and sporulated. MATa segregants were
obtained using the synthetic genetic array marker system¹ (Methods). Wild type segregants were
collected directly from MATa selection plates, while knockout segregants were replica-plated
from MATa selection plates onto G418 plates prior to their collection to select for segregants with
the gene deletion.
Figure S4.2 Certain genes exhibit significant background effects when perturbed. In a
preliminary screen, we generated and phenotyped segregants from 47 mutant versions of the same
yeast cross, each of which lacked a different chromatin-associated protein (Table S4.1). The
coefficient of variation found in a given knockout background is shown on the x-axis, while the
phenotypic variance is shown on the y-axis. In addition to increased phenotypic variance, we
found that knockout of certain genes, in particular ESA1, caused severe growth reductions in all
but a few outlier segregants, which resulted in a high coefficient of variation. The presence of these
outliers may reflect higher-order interactions among loci, which would lead to a small fraction of
individuals showing unusual growth²,³. Using Levene’s test, we found that seven genes exhibit
significant background effects when deleted: CTK1, ESA1, GCN5, HOS3, HTB1, INO80, and
RPD3 (Table S4.2). The points corresponding to these genes are shown in color, while the point
corresponding to wild type is illustrated in black. All other points are gray.
Figure S4.3 Allele frequency plot. (a) Allele frequency plots are shown for each knockout and
wild type background. 50kb regions surrounding the knockouts, as well as regions that were fixed
due to selection on markers used to generate MATa haploids, are highlighted in red. Regions
where the allele frequency in at least one population is significantly different from other
populations (Table S4.4; Note S4.1; Methods), as well as regions that were fixed due to mitotic
recombination or gene conversion in the progenitor hemizygous diploids, are highlighted in blue
(Table S4.5; Note S4.3; Methods). (b) We observed that all of Chromosome XII was enriched in
the gcn5∆ population. This appears to be due to selection against a recombinant version of the
chromosome. Specifically, individuals who harbored a 3S-BY recombinant haplotype centered on
YCS4 were depleted (Table S4.5). This site of increased recombination on Chromosome XII in
the gcn5∆ population is denoted with an asterisk. No recombinants were observed at this site
among wild type segregants, suggesting that the gcn5∆ knockout resulted in a new recombination
hotspot.
Figure S4.4 Growth of all 1,411 segregants across the 10 environments.
Figure S4.5 Individual and joint contributions of loci to background effects across different
significance thresholds. In a, for each mutation-responsive two-locus effect identified across
different significance thresholds, the relative phenotypic variance explained (PVE) by the
individual involved loci and their interaction is illustrated. The same analysis was also performed
after loci that show biased two-locus allele frequencies were filtered from the set of two-locus
effects identified at the α = 0.05 threshold. In b, for each mutation-responsive three-locus effect
identified across different significance thresholds, the relative PVE for the individual loci, the
pairs of loci, and the trio of loci is provided. Similar to mutation-responsive two-locus effects,
loci that show biased three-locus allele frequencies were filtered from the set of three-locus
effects identified at the α = 0.05 threshold and their relative PVE values were determined.
Relative PVE values were calculated using sum of squares obtained from ANOVA tables, as
described in the Methods. As with the results reported in the paper, which were obtained using
the α = 0.05 threshold, we find that loci involved in most mutation-responsive effects identified at
other thresholds mainly contribute to background effects through higher-order epistasis.
Figure S4.6 Analysis of how mutation-responsive effects interact with different knockouts at
multiple significance thresholds. Our findings are reported across FDRs of 0.01, 0.05, and 0.10,
as well as α = 0.05 after filtering of loci and multi-locus genotype classes with biased frequencies.
In each row, the left plot shows the number of genetic effects found in one (pink) or more (blue)
knockout backgrounds. The middle plot shows the PVE for mutation-responsive effects in the
wild type and relevant knockout segregants. The third plot shows the percentage of genetic
effects with larger PVE in the relevant knockout background than the wild type background
(PVE KO > PVE WT) as a function of the number of knockouts that interact with the effect. We
provide an additional y-axis on the right side of the plot, which indicates the percentage of
genetic effects with smaller PVE in the relevant knockout background than the wild type
background (PVE KO < PVE WT). Error bars represent 95% bootstrap confidence intervals
(Methods).
Figure S4.7 Statistical power analysis for one-, two-, and three-locus interactions. We
determined our statistical power to detect different types of mutation-independent and mutation-
responsive genetic effects. This was done by simulating phenotype data for the 1,411 segregants
and then performing genetic mapping on the simulated phenotype data (Methods). Each set of
simulated phenotypes was determined based on which background of the BYx3S cross a
segregant came from, as well as the segregant’s genotype at one or more randomly chosen loci,
and a knockout that was randomly selected to interact with the loci. We then applied a random
deviate to each segregant’s phenotype, which was intended to represent environmental noise. The
relevant fixed-effects linear model was fit using the genotype and simulated phenotype data, and
the p-value for the appropriate term in the model was obtained. Statistical power for a given
absolute effect size was calculated as the proportion of tests that had a p-value ≤ 0.05. These
simulations are based on our real phenotype data for glucose. In this environment, we detected
average absolute effect sizes of 0.07, 0.15, and 0.3 for mutation-independent one-, two-, and
three-locus effects, and 0.09, 0.13, and 0.26 for mutation-responsive one-, two-, and three-locus effects.
Figure S4.8 Extent to which mutation-responsive effects interact with different knockouts. The
number of knockout backgrounds in which a particular mutation-responsive effect was detected is
shown on the x-axis. The number of mutation-responsive effects in each class is shown on the y-
axis.
Figure S4.9 Absence of a relationship between identified mutation-responsive effects and
mean phenotypic differences between knockout and wild type backgrounds. Each point
represents a different knockout background and environment. A point’s position on the x-axis
indicates the difference in mean between a particular knockout (Mean P.Mut) background and the
wild type (Mean P.WT) background in a single environment. On the y-axis, the difference in the
number of genetic effects with enhanced and reduced phenotypic effect in mutants relative to
wild type segregants is shown. The Spearman’s ρ and its associated p-value are provided on the
plot.
Figure S4.10 All seven knockout populations show nominally significant correlations
between changes in phenotypic variance and detected mutation-responsive effects.
Individual panels show the results for the ctk1∆, esa1∆, gcn5∆, hos3∆, htb1∆, ino80∆, and rpd3∆
backgrounds. Each point’s position on the x-axis represents the difference in phenotypic variance
between wild type and knockout populations. On the y-axis, the difference in the number of
genetic effects with enhanced and reduced phenotypic effect in mutants relative to wild type
segregants is shown. The Spearman’s ρ values and their associated p-values are provided on the
plot.
Figure S4.11 Most mutation-responsive effects show small differences in phenotypic
variance explained (‘PVE’) in mutants relative to wild type segregants. On the y-axis, we
show differences in PVE for each mutation-responsive effect when PVE is computed separately
in mutant and wild type segregants. Mutation-responsive effects above the red line have a higher
PVE in mutants, while mutation-responsive effects below the red line have a higher PVE in wild
type segregants.
Genes in Screen for Background Effects
Standard Name   Systematic Name   Function(s)
ASF1 YJL115W Nucleosome assembly factor; involved in chromatin assembly, disassembly
CHD1 YER164W Chromatin remodeler; regulate chromatin structure and maintain chromatin integrity
CTK1 YKL139W Catalytic (alpha) subunit of C-terminal domain kinase I (CTDK-I)
DOT1 YDR440W Nucleosomal histone H3-Lys79 methylase
EAF3 YPR023C Subunit of Rpd3S deacetylase and NuA4 acetyltransferase complexes
ELP3 YPL086C Subunit of Elongator complex; exhibits histone acetyltransferase activity
ESA1 YOR244W Catalytic subunit of the histone acetyltransferase complex (NuA4)
GCN5 YGR252W Catalytic subunit of ADA and SAGA histone acetyltransferase complexes
GIS1 YDR096W Histone demethylase and transcription factor
HAT1 YPL001W Catalytic subunit of the Hat1p-Hat2p histone acetyltransferase complex
HAT2 YEL056W Subunit of the Hat1p-Hat2p histone acetyltransferase complex
HDA1 YNL021W Putative catalytic subunit of a class II histone deacetylase complex
HHF1 YBR009C Histone H4
HHF2 YNL030W Histone H4
HHO1 YPL127C Histone H1
HHT1 YBR010W Histone H3
HHT2 YNL031C Histone H3
HIR1 YBL008W Subunit of HIR nucleosome assembly complex
HMT1 YBR034C Nuclear SAM-dependent mono- and asymmetric methyltransferase
HOS1 YPR068C Class I histone deacetylase (HDAC) family member
HOS2 YGL194C Histone deacetylase and subunit of Set3 and Rpd3L complexes
HOS3 YPL116W Trichostatin A-insensitive homodimeric histone deacetylase (HDAC)
HST1 YOL068C NAD(+)-dependent histone deacetylase
HTA1 YDR225W Histone H2A
HTA2 YBL003C Histone H2A
HTB1 YDR224C Histone H2B
HTB2 YBL002W Histone H2B
HTZ1 YOL012C Histone variant H2AZ
INO80 YGL150C ATPase and nucleosome spacing factor
ISW1 YBR245C ATPase subunit of imitation-switch (ISWI) class chromatin remodelers
JHD1 YER051W JmjC domain family histone demethylase specific for H3-K36
JHD2 YJR119C JmjC domain family histone demethylase
NAT4 YMR069W N alpha-acetyl-transferase
RLF2 YPR018W Largest subunit (p90) of the Chromatin Assembly Complex
RPD3 YNL330C Histone deacetylase, component of both the Rpd3S and Rpd3L complexes
RPH1 YER169W JmjC domain-containing histone demethylase
RTT109 YLL002W Histone acetyltransferase
SAS2 YMR127C Histone acetyltransferase (HAT) catalytic subunit of the SAS complex
SAS3 YBL052C Histone acetyltransferase catalytic subunit of NuA3 complex
SET1 YHR119W Histone methyltransferase, subunit of the COMPASS (Set1C) complex
SET2 YJL168C Histone methyltransferase with a role in transcriptional elongation
SET3 YKR029C Subunit of the SET3 histone deacetylase complex
SET4 YJL105W Unknown function; paralog of SET3
SNF1 YDR477W AMP-activated S/T protein kinase; regulates H3 acetylation and chromatin remodelling
SNF2 YOR290C Catalytic subunit of the SWI/SNF chromatin remodeling complex
SNF5 YBR289W Subunit of the SWI/SNF chromatin remodeling complex
SWR1 YDR334W Swi2/Snf2-related ATPase; structural component of the SWR1 complex
Table S4.1 List of 47 genes that were screened for background effects. The standard gene
name, the systematic name, and the biological function of each gene are shown.
Knockout Mean Growth Variance Levene's p-value
asf1Δ 0.779 0.020 0.978
chd1Δ 1.008 0.008 0.013
ctk1Δ 0.505 0.047 0.015
dot1Δ 1.000 0.027 0.573
eaf3Δ 0.893 0.030 0.159
elp3Δ 0.939 0.020 0.895
esa1Δ 0.299 0.124 0.010
gcn5Δ 0.727 0.050 0.033
gis1Δ 1.062 0.013 0.134
hat1Δ 1.077 0.027 0.235
hat2Δ 1.052 0.023 0.757
hda1Δ 0.932 0.029 0.509
hhf1Δ 0.991 0.013 0.103
hhf2Δ 0.958 0.013 0.123
hho1Δ 1.014 0.014 0.147
hht1Δ 1.008 0.028 0.489
hht2Δ 1.021 0.020 0.968
hir1Δ 0.904 0.023 0.784
hmt1Δ 0.959 0.022 0.592
hos1Δ 0.947 0.048 0.119
hos2Δ 0.979 0.015 0.275
hos3Δ 0.882 0.073 0.000
hst1Δ 0.860 0.028 0.552
hta1Δ 0.955 0.033 0.214
hta2Δ 1.052 0.043 0.126
htb1Δ 0.454 0.070 0.002
htb2Δ 1.058 0.010 0.036
htz1Δ 0.739 0.016 0.299
ino80Δ 0.368 0.069 0.000
isw1Δ 1.000 0.018 0.389
jhd1Δ 0.985 0.018 0.864
jhd2Δ 1.015 0.029 0.436
nat4Δ 1.008 0.025 0.654
rlf2Δ 0.993 0.040 0.851
rpd3Δ 0.733 0.023 0.658
rph1Δ 0.978 0.023 0.949
rtt109Δ 0.787 0.014 0.210
sas2Δ 1.033 0.016 0.327
sas3Δ 0.990 0.017 0.448
set1Δ 0.964 0.045 0.022
set2Δ 0.918 0.022 0.963
set3Δ 1.011 0.012 0.092
set4Δ 1.042 0.022 0.910
snf1Δ 0.237 0.011 0.081
snf2Δ 0.173 0.019 0.239
snf5Δ 0.316 0.012 0.143
swr1Δ 0.817 0.024 0.810
WT 1.001 0.024 NA
Table S4.2 Screen summary statistics. The mean growth and variance for each of the 48
knockout and wild type segregant backgrounds are listed in this table. The “Levene’s p-value”
column specifies the p-value statistic from the Levene’s test used to assess whether the
populations of knockout segregants exhibited significantly different heritable phenotypic
variation than a wild type BYx3S segregant population on ethanol.
Population   Number of segregants with good data   Diploid   Aneuploid   Low coverage, cross-contamination
WT 164 0 0 76
ctk1Δ 210 0 2 28
esa1Δ 122 0 25 93
gcn5Δ 215 9 15 1
hos3Δ 220 1 3 16
htb1Δ 177 1 2 60
ino80Δ 141 5 5 89
rpd3Δ 162 0 0 78
Total 1411 16 52 441
Table S4.3 Mapping population breakdown. This table shows the number of segregants from
each population that were excluded due to technical issues (i.e., low coverage or
cross-contamination) or biological issues (i.e., aneuploidy or diploidy).
Chromosome Start position End position
4 720061 861257
4 975252 1408452
5 190283 256159
7 84678 170211
7 275701 386593
7 905443 946974
10 642317 684208
11 53931 316100
12 50192 749934
13 75046 130721
14 231034 607186
15 69484 462566
15 843892 959820
16 195879 525073
Table S4.4 Genomic regions with allele frequency bias. Regions on the genome where the
allele frequency in at least one knockout population significantly differed from other populations
were found using chi-square tests. Specifically, 2x8 contingency tables were generated and
chi-square tests were run for all 8,311 markers, counting the number of BY and 3S alleles in the wild
type and 7 knockout populations. Bonferroni correction was used for multiple testing correction.
Hemizygote   Chromosome   Start Position   Stop Position   Range (bp)   Fixed Allele
ctk1∆/CTK1 11 0 182309 182309 BY
esa1∆/ESA1 15 657980 663812 5832 3S
gcn5∆/GCN5 4 969630 985630 11205 3S
gcn5∆/GCN5 4 1123442 1408596 285154 3S
gcn5∆/GCN5 4 1483478 1525043 41565 3S
gcn5∆/GCN5 12 519517 543736 24219 BY
hos3∆/HOS3 13 524221 526917 2696 3S
Table S4.5 Fixed regions in BY/3S hemizygous diploids. This summary table shows regions in
the genome where all segregants within a knockout segregant population carried the same
parental haplotype due to mitotic recombination in our parental diploids.
Condition Stock solution Amount added per liter Final Concentration
4NQO 0.5 mg/ml 500 uL 0.25 ug/ml
Copper 500 mM 18 mL 9 mM
Ethanol 20% 100 mL 2% Ethanol
Fluconazole 10 mg/ml 5 mL 50 ug/ml
Glucose NA NA 2% Glucose
High temperature NA NA 2% Glucose/42°C
Hydrogen peroxide 30% 541 uL 260 uM
Neomycin 50 mg/ml 140 mL 7 mg/ml
Room temperature NA NA 2% Glucose/22°C
Zeocin 100 mg/ml 50 uL 5 ug/ml
Table S4.6 Phenotyping environments. This table provides information on the environments
used for phenotyping assays in our study. Drugs and chemicals were added to 2% glucose plates
with the exception of the ethanol condition, in which 2% ethanol was used as the carbon source
instead of glucose.
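The final concentrations in the table follow from the usual dilution relationship, final concentration = stock concentration × volume added / final volume. The short Python sketch below works through this for rows whose stock and final concentrations share a mass-per-volume unit. It assumes that “per liter” refers to a final medium volume of 1 L, and the function name is hypothetical.

```python
def final_concentration(stock, volume_added_ml, final_volume_ml=1000.0):
    """Concentration after dilution, in the same units as the stock."""
    return stock * volume_added_ml / final_volume_ml


# Fluconazole: 10 mg/ml stock, 5 mL per liter, converted from mg/ml to ug/ml.
fluconazole_ug_per_ml = final_concentration(10.0, 5.0) * 1000.0
# Zeocin: 100 mg/ml stock, 50 uL (0.05 mL) per liter.
zeocin_ug_per_ml = final_concentration(100.0, 0.05) * 1000.0
# Neomycin: 50 mg/ml stock, 140 mL per liter, left in mg/ml.
neomycin_mg_per_ml = final_concentration(50.0, 140.0)
```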
Threshold   # of mutation-independent two-locus effects   # of mutation-independent three-locus effects   # of mutation-responsive two-locus effects   # of mutation-responsive three-locus effects
α 0.05 17 19 295 670
FDR 0.10 17 2 542 2324
FDR 0.05 10 1 250 1311
FDR 0.01 4 0 79 389
Table S4.7 Number of mutation-independent and mutation-responsive genetic effects across
different significance thresholds. One-locus effects are excluded from this analysis as we only
performed scans for these effects at a commonly used significance threshold of α=0.05.
Statistic   α 0.05   FDR 0.01   FDR 0.05   FDR 0.10   α 0.05 Filtered
χ²   709.37   343.92   981.52   1870.9   341.41
p-value   5.81E-150   3.12E-71   8.87E-209   < 2.2E-16 (0)   1.08E-70
Table S4.8 Chi-squared test results for mutation-responsive effects. This table shows p-values
from chi-squared tests used to test if the ratio of mutation-responsive genetic effects with
enhanced and reduced phenotypic effects in mutants changes as a function of the number of
knockouts with which a locus interacts (Methods). Results are reported across significance thresholds, as well
as at the initial α = 0.05 threshold after filtering out effects involving loci with biased individual
or multi-locus allele frequencies.
Effect class   1-locus interaction   2-locus interaction   3-locus interaction
Mutation-responsive without filtering   45   143   406
Mutation-responsive with filtering   14   73   181
Mutation-independent   89   17   19
Table S4.9 Number of mutation-responsive effects that show biased individual and multi-
locus allele frequencies. The numbers of two-locus and three-locus mutation-responsive effects
that show biased individual and multi-locus allele frequencies were determined using chi-square
tests. Specifically, 2x8, 4x8, and 8x8 contingency tables were generated and chi-square tests were
run for all one-, two-, and three-locus interactions, respectively, counting the number of each locus
combination in the wild type and 7 knockout populations. Bonferroni correction was used for
multiple testing correction.
Note S4.1 Conditional essentiality of esa1Δ segregants. Esa1, the catalytic subunit of
the NuA4 HAT complex, is essential in the BY background, with esa1∆ spores only able
to divide four or five times following germination before dying⁴. However, we found
that roughly 1% of the esa1Δ knockout segregants generated from the ESA1 hemizygous
diploids survive in our study. To test for selection on specific genotypes in the
esa1Δ segregants, allele frequency was examined across all SNP markers (Figure S4.3a).
All esa1∆ segregants carried the 3S genotype from positions 467,219 bp to 472,584 bp on
Chromosome XIV, which is centered on END3, an EH-domain containing protein that
functions in endocytosis and actin cytoskeleton formation. Two lines of evidence suggest
that END3 is the causal gene at this locus: end3∆ and a temperature sensitive allele of
ESA1 were previously found to be synthetic lethal in the BY background⁵, and END3 is
a major contributor to trait variation in the BYx3S cross²,³,⁶,⁷. We note that this
chromosome XIV locus was also identified as having a large additive effect in other
knockout and wild type populations. We did not observe any other fixed regions in esa1∆
segregants. This implies that conditional essentiality of ESA1 may depend on the
cumulative effect of the chromosome XIV region and many small-effect variants.
Note S4.2 All major results are robust to different significance thresholds.
Significance thresholds can affect the results and interpretations of genetic mapping
studies, especially those focused on genetic interactions. To address this possibility, we
repeated our analyses across a number of different False Discovery Rate (FDR) thresholds
(Methods). Although choice of threshold impacted the number of genetic effects that
were detected, we found that all of the major results remain the same regardless of
threshold (Figures S4.5 and S4.6). This implies that our main conclusions are robust to
threshold.
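The effect of the threshold on the number of detected effects can be illustrated with a minimal sketch. It uses the Benjamini-Hochberg step-up procedure as a stand-in, which is an assumption: the procedure actually used in this study is the one defined in the Methods. The p-values and the function name `benjamini_hochberg` are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, fdr):
    """Return the indices of tests declared significant at the given FDR.

    Illustrative stand-in only; not necessarily the FDR procedure of the Methods.
    """
    m = len(p_values)
    # Rank tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under fdr * rank / m.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for eight candidate genetic effects.
p_values = [0.0001, 0.004, 0.019, 0.03, 0.04, 0.2, 0.5, 0.9]

for fdr in (0.01, 0.05, 0.10):
    hits = benjamini_hochberg(p_values, fdr)
    print(f"FDR {fdr:.2f}: {len(hits)} effects detected")
```

With these illustrative values, the number of detected effects grows from 1 to 2 to 5 as the threshold is relaxed, which is why the major results were re-examined at each threshold rather than at a single one.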
Note S4.3 All major results are also robust to bias in allele combinations. In the
paper, we show that mutation-independent effects tend to be genetically simpler, while
mutation-responsive effects tend to be more genetically complex. This pattern could be a
technical artifact driven by allele frequency differences between the different knockout
and wild type versions of the BYx3S cross, which have the potential to cause both false
negatives and false positives. To examine this possibility, we excluded loci that show biased
individual or multi-locus allele frequencies from our analyses. After exclusion, we found
that mutation-responsive effects still involve more higher-order epistasis than
mutation-independent effects. This difference was significant for both two-locus
and three-locus mutation-responsive effects using chi-square tests (p-value = 6.731 x 10^-19
and 5 x 10^-30, respectively). Additionally, as in the analyses done with different
significance thresholds, our conclusions remain qualitatively the same whether
we include or exclude loci that exhibit allele frequency bias (Figures S5 and S6). Thus,
our major findings are not the result of technical artifacts.
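As an illustration of the kind of chi-square comparison described above, the following minimal sketch tests whether two classes of effects differ in how often they involve higher-order epistasis. The 2x2 layout and all counts are hypothetical assumptions, not the data or the exact contingency tables used in this study, and the helper name `chi_square_2x2` is introduced here only for illustration. The sketch computes the Pearson statistic without continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns the test statistic and the p-value for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts. Rows: mutation-responsive, mutation-independent effects.
# Columns: effects involving higher-order epistasis, effects that do not.
counts = [[300, 100],
          [50, 150]]

stat, p_value = chi_square_2x2(counts)
print(f"chi-square = {stat:.2f}, p = {p_value:.3g}")
```

The closed-form p-value holds only for a 2x2 table; larger tables would need the general chi-square distribution, for example from a statistics library.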
Chapter 5: Concluding remarks
In this chapter, I discuss the impact of my work and then outline future
directions in which it could be taken.
5.1 Impact of my work
The body of work from my PhD provides novel, detailed insights into
how standing genetic variants can cause heritable phenotypic variation in complex traits.
Through my research, I have addressed several important questions about
complex traits, including:
1) How much non-allelic heterogeneity can underlie a trait?
In chapter two, I found that multiple distinct regulatory architectures of FLO8-
independent invasion segregate in the BYxYJM cross. These regulatory architectures
require distinct regulatory factors and cell surface proteins for invasion, implying that
regulatory rewiring is a basic mechanism that can give rise to non-allelic heterogeneity.
Similarly, in chapter three, I determined that numerous different combinations of loci
could lead to the same poor growth phenotype. Cloning of three causal loci identified
genes that have unrelated functions in stress response. Individuals carrying several of
these deleterious cryptic genetic variants in stress response grow normally in the absence
of stress. However, when they are exposed to environmental stress, they are unable to
grow because their ability to respond to stress is perturbed. Altogether, these results show
that extensive non-allelic heterogeneity can underlie a trait. They also highlight the
importance of studying phenotypic similarities in genetically distinct individuals to
advance our understanding of complex traits.
2) How do sets of gene-environment interactions collectively give rise to genotype by
environment interactions (‘GxE’)?
Findings from chapter three showed that epistasis does not meaningfully
contribute to GxE in growth variation under our assay conditions. In this study, GxE
reflects the composite effect of multiple additive gene-environment interactions that show
condition-dependent effect magnitudes. This is consistent with the prevailing model that
genetic variants contribute to quantitative traits in a predominantly additive manner.
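The additive view of GxE described above can be sketched with a toy model. Each locus has its own effect in each environment, there are no interaction terms between loci, and the GxE of a multi-locus genotype is simply the sum of the single-locus gene-environment interactions. All effect sizes, the environment names, and the 0/1 allele coding are hypothetical assumptions chosen for illustration; they are not estimates from chapter three.

```python
# Hypothetical per-locus additive effects of one parental allele in each environment.
# The alleles are cryptic: they have no effect without stress.
effects = {
    "control": [0.0, 0.0, 0.0],
    "stress": [-1.0, -0.5, -2.0],
}


def predict(genotype, env):
    """Additive model: sum the per-locus effects, with no locus-by-locus terms."""
    return sum(g * b for g, b in zip(genotype, effects[env]))


def gxe(genotype):
    """Change in predicted phenotype between the stress and control environments."""
    return predict(genotype, "stress") - predict(genotype, "control")


# Genotypes are coded 1 if the individual carries the deleterious allele, else 0.
two_locus = gxe([1, 1, 0])
sum_of_singles = gxe([1, 0, 0]) + gxe([0, 1, 0])
print(two_locus, sum_of_singles)
```

In this sketch the two printed values are equal because the model is purely additive; an epistatic contribution to GxE would appear as a departure from that equality.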
3) What is the genetic architecture of background effects?
Work from chapter four showed that background effects are highly polygenic,
with most loci contributing through higher-order interactions that involve a mutation and
multiple loci. Most of these identified loci do not epistatically interact in wild type
segregants, which implies that mutation of certain genes can significantly change how
standing genetic variants interact with each other. I also found that the genetic
architecture of background effects varies significantly depending on which gene is being
mutated. The number of interactions detected for each mutation ranged from 73 to 543.
Furthermore, response to six of the seven mutations was mediated almost exclusively by
loci that respond to mutations in different genes. In contrast, the seventh mutation tended
to interact with loci that were not detected when other genes were mutated.
This finding is in stark contrast with findings from chapter three, where causal
loci act in a mostly additive manner. More research is required to understand why the
epistatic network of standing genetic variants changes so dramatically when mutations
are introduced but not when environmental stress is applied.
Additionally, FLO8-independent invasion described in chapter two can also be
viewed as a background effect in response to the loss-of-function mutation in FLO8.
Although I was not able to directly map most of the involved loci, the highly
heterogeneous genetic basis of this trait suggests that many loci can modify the effect of
this mutation. I also found that perturbation of FLO8 uncovers complex genetic
interactions among standing genetic variants that can cause invasion in the absence of
FLO8.
4) What types of epistasis contribute to background effects?
Work from chapter four showed that individual loci or sets of loci that interact
with a mutation (‘mutation-responsive effects’) exhibit a broad, continuous range of
responses to the mutations. Some mutation-responsive effects gained or lost phenotypic
effect when a mutation was introduced, while the magnitude or the directionality of the
effect changed in others. I also found that mutation-responsive effects that interacted with
fewer mutations predominantly showed larger phenotypic effects in mutants than in wild
type segregants, while mutation-responsive effects that interacted with more mutations
typically exhibited the opposite relationship. Thus, different types of epistasis all play a
major role in background effects. In addition, I determined that most mutation-responsive
effects that involve multiple loci contribute to background effects mainly through their
higher-order interactions with each other and a mutation, rather than through their
individual interactions with a mutation.
5) How much of a role does environment play in background effects?
Work from chapter four confirmed that environment plays a major role in
background effects. I found that the genetic architecture of background effects changes
significantly across environments. The number of interactions detected across
environments ranged from 15 to 359. Moreover, the majority of the mutation-responsive
effects only exhibited phenotypic effects in the environment in which they were
originally detected. Of the few mutation-responsive effects whose phenotypic effect was
detectable in additional environments, most were only identified in one additional
environment.
Other impacts
My findings may also have more general implications for the genetics of complex
traits. In natural populations, heritable variation in complex traits is most likely caused by
a combination of common and rare polymorphisms, including de novo mutations. Most
studies aimed at determining the genetic basis of these traits employ genome-wide
association studies (‘GWAS’), which look for common variants with additive effects. However,
my work shows that this methodology alone is insufficient, because rare alleles and de novo
mutations can alter the phenotypic effects of common variants and even dramatically
change the way common variants interact with each other.
5.2 Future directions
Chapters two and three
Although studies from chapters two and three were highly informative, they are
both limited in scale. This is largely because I employed a genetic mapping strategy
involving backcrosses with phenotypic selection. While it is a powerful method that can
reduce the genetic complexity of traits, it can be experimentally challenging to generate a
large sample size for both backcross directions, especially when multiple progenitor
strains are being backcrossed. Another disadvantage is that researchers are limited in the
genotypic space they can examine when working with backcross segregants. For
example, I may have underestimated the contribution of epistasis in chapter three,
because I focused on a particular set of four YJM alleles introgressed in an otherwise BY
background. In other sets of causal loci, it is possible that epistasis may play a larger role.
To achieve a deeper understanding of non-allelic heterogeneity and GxE, a
higher-throughput method is required to genetically characterize a much larger number of
strains.
Chapter four
To my knowledge, this study is one of the most comprehensive genetic mapping
studies of background effects. However, there are several points that could be expanded
upon. For example, I focused on chromatin-associated genes in this study, but
perturbation of any gene could theoretically cause background effects. It would be
interesting to see how many genes show strong background effects if all the genes in the
yeast genome were systematically knocked out. In addition, the same experiment could
be repeated in multiple crosses with different yeast isolates to test how the results change
with different sets of standing genetic variants. These additional studies would provide a
more general understanding of the causes of background effects. However, performing
these experiments at the scale discussed here would require new genetic mapping
technologies, such as the double barcode system (SCHLECHT 2017).
One limitation of this study is that I generated new segregants every time I
introduced a new mutation. As a result, I could only examine how segregants respond to
the mutations at the population level. This is a problem, because I used a significant
increase in phenotypic variation among the segregants as an indicator of strong background
effects when certain genes are mutated. However, perturbation of a gene could also show
strong background effects without changing the phenotypic variation. One way this could
happen is if there is significant line-crossing epistasis. A better way to do this experiment
is to introduce the mutation into a common panel of segregants. This will allow
researchers to observe how segregants individually respond to the same mutation.
However, this will require massive amounts of labor and time to genetically engineer
each segregant separately.
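The limitation described above can be made concrete with a toy example of line-crossing epistasis. In this minimal sketch, a single locus has an effect whose sign flips between the wild type and mutant backgrounds. Phenotypic variance is identical in the two populations, so a variance-based indicator would miss the background effect, yet every segregant responds to the mutation. The genotypes and phenotype values are hypothetical, and the sketch assumes the same panel of segregants is measured with and without the mutation, as proposed above.

```python
from statistics import pvariance

# Hypothetical allele (0 or 1) at one locus in a common panel of eight segregants.
genotypes = [0, 1] * 4

# In the wild type background, allele 1 increases the phenotype.
wild_type = [1.0 if g == 1 else 0.0 for g in genotypes]
# In the mutant background, the effect of the locus is reversed (line crossing).
mutant = [0.0 if g == 1 else 1.0 for g in genotypes]

# Population-level view: the variance is unchanged by the mutation.
var_wt = pvariance(wild_type)
var_mut = pvariance(mutant)

# Individual-level view: each segregant's response depends on its genotype.
responses = [m - w for w, m in zip(wild_type, mutant)]

print(var_wt, var_mut)
print(responses)
```

Here the population variances match while the individual responses alternate in sign, which is the pattern that only a common panel of segregants can reveal.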
Lastly, the identified loci contained about 10 candidate genes on average. At this
resolution, the causal gene cannot be readily identified. Increasing the sample
size or using strategies to increase the number of recombination breakpoints present in an
individual may improve mapping resolution and aid in determining the molecular causes
of background effects.
References
Albert, F. W., and L. Kruglyak, 2015 The role of regulatory variation in complex traits and
disease. Nat Rev Genet 16: 197-212.
Baryshnikova, A., M. Costanzo, S. Dixon, F. J. Vizeacoumar, C. L. Myers et al., 2010 Synthetic
genetic array (SGA) analysis in Saccharomyces cerevisiae and Schizosaccharomyces
pombe. Methods Enzymol 470: 145-179.
Baye, T. M., T. Abebe and R. A. Wilke, 2011 Genotype-environment interactions and their
translational implications. Per Med 8: 59-70.
Bergman, A., and M. L. Siegal, 2003 Evolutionary capacitance as a general feature of complex
gene networks. Nature 424: 549-552.
Bergstrom, A., J. T. Simpson, F. Salinas, B. Barre, L. Parts et al., 2014 A high-definition view of
functional genetic variation from natural yeast genomes. Mol Biol Evol 31: 872-888.
Bhatia, A., A. Yadav, C. Zhu, J. Gagneur, A. Radhakrishnan et al., 2014 Yeast growth plasticity
is regulated by environment-specific multi-QTL interactions. G3 4: 769-777.
Bloom, J. S., I. M. Ehrenreich, W. T. Loo, T. L. Lite and L. Kruglyak, 2013 Finding the sources
of missing heritability in a yeast cross. Nature 494: 234-237.
Bloom, J. S., I. Kotenko, M. J. Sadhu, S. Treusch, F. W. Albert et al., 2015 Genetic interactions
contribute less than additive effects to quantitative trait variation in yeast. Nat Commun
6: 8712.
Boone, C., H. Bussey and B. J. Andrews, 2007 Exploring genetic interactions and networks with
yeast. Nat Rev Genet 8: 437-449.
Botstein, D., 2015 Decoding the language of genetics. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor Laboratory, NY.
Boyle, E. A., Y. I. Li and J. K. Pritchard, 2017 An Expanded View of Complex Traits: From
Polygenic to Omnigenic. Cell 169: 1177-1186.
Breslow, D. K., D. M. Cameron, S. R. Collins, M. Schuldiner, J. Stewart-Ornstein et al., 2008 A
comprehensive strategy enabling high-resolution functional analysis of the yeast genome.
Nat Methods 5: 711-718.
Broach, J. R., 2012 Nutritional control of growth and development in yeast. Genetics 192: 73-
105.
Bruckner, S., and H. U. Mosch, 2012 Choosing the right lifestyle: adhesion and development in
Saccharomyces cerevisiae. FEMS Microbiol Rev 36: 25-58.
Buchan, J. R., D. Muhlrad and R. Parker, 2008 P bodies promote stress granule assembly in
Saccharomyces cerevisiae. J Cell Biol 183: 441-455.
Buchan, J. R., and R. Parker, 2009 Eukaryotic stress granules: the ins and outs of translation.
Mol Cell 36: 932-941.
Cao, F., S. Lane, P. Raniga, Z. Zhou, K. Ramon et al., 2006 The Flo8 Transcription Factor Is
Essential for Hyphal Development and Virulence in Candida albicans. Mol Biol Cell 17:
295-307.
Carlborg, O., and C. S. Haley, 2004 Epistasis: too often neglected in complex trait studies? Nat
Rev Genet 5: 618-625.
Carlborg, O., L. Jacobsson, P. Ahgren, P. Siegel and L. Andersson, 2006 Epistasis and the
release of genetic variation during long-term selection. Nat Genet 38: 418-420.
Carmen, A. A., P. R. Griffin, J. R. Calaycay, S. E. Rundlett, Y. Suka et al., 1999 Yeast HOS3
forms a novel trichostatin A-insensitive homodimer with intrinsic histone deacetylase
activity. Proc Natl Acad Sci U S A 96: 12356-12361.
Caspi, A., and T. E. Moffitt, 2006 Gene-environment interactions in psychiatry: joining forces
with neuroscience. Nat Rev Neurosci 7: 583-590.
Chandler, C. H., S. Chari and I. Dworkin, 2013 Does your gene need a background check? How
genetic background impacts the analysis of mutations, genes, and evolution. Trends
Genet 29: 358-366.
Chandler, C. H., S. Chari, A. Kowalski, L. Choi, D. Tack et al., 2017 How well do you know
your mutation? Complex effects of genetic background on expressivity,
complementation, and ordering of allelic effects. PLoS Genet 13: e1007075.
Chandler, C. H., S. Chari, D. Tack and I. Dworkin, 2014 Causes and consequences of genetic
background effects illuminated by integrative genomic analysis. Genetics 196: 1321-
1336.
Chari, S., and I. Dworkin, 2013 The conditional nature of genetic interactions: the consequences
of wild-type backgrounds on mutational interactions in a genome-wide modifier screen.
PLoS Genet 9: e1003661.
Chen, R., L. Shi, J. Hakenberg, B. Naughton, P. Sklar et al., 2016 Analysis of 589,306 genomes
identifies individuals resilient to severe Mendelian childhood diseases. Nat Biotechnol
34: 531-538.
Chen, R. E., and J. Thorner, 2007 Function and regulation in MAPK signaling pathways: lessons
learned from the yeast Saccharomyces cerevisiae. Biochim Biophys Acta 1773: 1311-
1340.
Cherry, J. M., E. L. Hong, C. Amundsen, R. Balakrishnan, G. Binkley et al., 2012
Saccharomyces genome database: The genomics resource of budding yeast. Nucleic acids
research 40: D700-705.
Chin, B. L., O. Ryan, F. Lewitter, C. Boone and G. R. Fink, 2012 Genetic variation in
Saccharomyces cerevisiae: circuit diversification in a signal transduction network.
Genetics 192: 1523-1532.
Chow, C. Y., 2016 Bringing genetic background into focus. Nat Rev Genet 17: 63-64.
Churchill, G. A., and R. W. Doerge, 1994 Empirical threshold values for quantitative trait
mapping. Genetics 138: 963-971.
Ciliberti, S., O. C. Martin and A. Wagner, 2007 Robustness can evolve gradually in complex
regulatory gene networks with varying topology. PLoS Comput Biol 3: e15.
Cong, L., F. A. Ran, D. Cox, S. Lin, R. Barretto et al., 2013 Multiplex genome engineering using
CRISPR/Cas systems. Science 339: 819-823.
Cooper, D. N., M. Krawczak, C. Polychronakos, C. Tyler-Smith and H. Kehrer-Sawatzki, 2013
Where genotype is not predictive of phenotype: towards an understanding of the
molecular basis of reduced penetrance in human inherited disease. Hum Genet 132:
1077-1130.
Cordell, H. J., 2009 Detecting gene-gene interactions that underlie human diseases. Nat Rev
Genet 10: 392-404.
Costanzo, M., A. Baryshnikova, J. Bellay, Y. Kim, E. D. Spear et al., 2010 The genetic
landscape of a cell. Science 327: 425-431.
Costanzo, M., B. VanderSluis, E. N. Koch, A. Baryshnikova, C. Pons et al., 2016 A global
genetic interaction network maps a wiring diagram of cellular function. Science 353.
Crow, J. F., 2010 On epistasis: why it is unimportant in polygenic directional selection. Philos
Trans R Soc Lond B Biol Sci 365: 1241-1244.
Cullen, P. J., and G. F. Sprague, Jr., 2000 Glucose depletion causes haploid invasive growth in
yeast. Proc Natl Acad Sci U S A 97: 13619-13624.
Decker, C. J., and R. Parker, 2012 P-bodies and stress granules: possible roles in the control of
translation and mRNA degradation. Cold Spring Harb Perspect Biol 4: a012286.
DiCarlo, J. E., J. E. Norville, P. Mali, X. Rios, J. Aach et al., 2013 Genome engineering in
Saccharomyces cerevisiae using CRISPR-Cas systems. Nucleic Acids Res 41: 4336-
4343.
Dixon, S. J., M. Costanzo, A. Baryshnikova, B. Andrews and C. Boone, 2009 Systematic
mapping of genetic interaction networks. Annu Rev Genet 43: 601-625.
Dominguez, A. A., W. A. Lim and L. S. Qi, 2016 Beyond editing: repurposing CRISPR-Cas9 for
precision genome regulation and interrogation. Nat Rev Mol Cell Biol 17: 5-15.
Dowell, R. D., O. Ryan, A. Jansen, D. Cheung, S. Agarwala et al., 2010 Genotype to phenotype:
a complex problem. Science 328: 469.
Duitama, J., A. Sanchez-Rodriguez, A. Goovaerts, S. Pulido-Tamayo, G. Hubmann et al., 2014
Improved linkage analysis of Quantitative Trait Loci using bulk segregants unveils a
novel determinant of high ethanol tolerance in yeast. BMC Genomics 15: 207.
Ehrenreich, I. M., 2017 Epistasis: searching for interacting genetic variants using crosses.
Genetics 206: 531-535.
Ehrenreich, I. M., J. Bloom, N. Torabi, X. Wang, Y. Jia et al., 2012 Genetic architecture of
highly complex chemical resistance traits across four yeast strains. PLoS Genet 8:
e1002570.
Ehrenreich, I. M., J. P. Gerke and L. Kruglyak, 2009 Genetic dissection of complex traits in
yeast: insights from studies of gene expression and other phenotypes in the BYxRM
cross. Cold Spring Harb Symp Quant Biol 74: 145-153.
Ehrenreich, I. M., and D. W. Pfennig, 2015 Genetic assimilation: a review of its potential
proximate causes and evolutionary consequences. Ann Bot.
Ehrenreich, I. M., N. Torabi, Y. Jia, J. Kent, S. Martis et al., 2010 Dissection of genetically
complex traits with extremely large pools of yeast segregants. Nature 464: 1039-1042.
Erdeniz, N., U. H. Mortensen and R. Rothstein, 1997 Cloning-free PCR-based allele replacement
methods. Genome Res 7: 1174-1183.
Falconer, D. S., and T. F. Mackay, 1996 Introduction to quantitative genetics (4th edition).
Pearson Education Limited, Harlow, England.
Felden, J., S. Weisser, S. Bruckner, P. Lenz and H. U. Mosch, 2014 The Transcription Factors
Tec1 and Ste12 Interact with Coregulators Msa1 and Msa2 To Activate Adhesion and
Multicellular Development. Mol Cell Biol 34: 2283-2293.
Felix, M. A., and A. Wagner, 2008 Robustness and evolution: concepts, insights and challenges
from a developmental model system. Heredity (Edinb) 100: 132-140.
Fidalgo, M., R. R. Barrales, J. I. Ibeas and J. Jimenez, 2006 Adaptive evolution by mutations in
the FLO11 gene. Proc Natl Acad Sci U S A 103: 11228-11233.
Fidalgo, M., R. R. Barrales and J. Jimenez, 2008 Coding repeat instability in the FLO11 gene of
Saccharomyces yeasts. Yeast 25: 879-889.
Forsberg, S. K. G., J. S. Bloom, M. J. Sadhu, L. Kruglyak and O. Carlborg, 2016 Accounting for
genetic interactions is necessary for accurate prediction of extreme phenotypic values of
quantitative traits in yeast. bioRxiv.
Fournier-Level, A., A. Korte, M. D. Cooper, M. Nordborg, J. Schmitt et al., 2011 A map of local
adaptation in Arabidopsis thaliana. Science 334: 86-89.
Fromer, M., A. J. Pocklington, D. H. Kavanagh, H. J. Williams, S. Dwyer et al., 2014 De novo
mutations in schizophrenia implicate synaptic networks. Nature 506: 179-184.
Fujimura, H. A., 1989 The yeast G-protein homolog is involved in the mating pheromone signal
transduction system. Mol Cell Biol 9: 152-158.
Gagneur, J., O. Stegle, C. Zhu, P. Jakob, M. M. Tekkedil et al., 2013 Genotype-environment
interactions reveal causal pathways that mediate genetic effects on phenotype. PLoS
Genet 9: e1003803.
Gasch, A. P., P. T. Spellman, C. M. Kao, O. Carmel-Harel, M. B. Eisen et al., 2000 Genomic
expression programs in the response of yeast cells to environmental changes. Mol Biol
Cell 11: 4241-4257.
Geiler-Samerotte, K. A., Y. O. Zhu, B. E. Goulet, D. W. Hall and M. L. Siegal, 2016 Selection
transforms the landscape of genetic variation interacting with Hsp90. PLoS Biol 14:
e2000465.
1001 Genomes Consortium, 2016 1,135 Genomes Reveal the Global Pattern of Polymorphism
in Arabidopsis thaliana. Cell 166:
481-491.
Gerke, J., K. Lorenz and B. Cohen, 2009 Genetic interactions between transcription factors cause
natural variation in yeast. Science 323: 498-501.
Gerke, J., K. Lorenz, S. Ramnarine and B. Cohen, 2010 Gene-environment interactions at
nucleotide resolution. PLoS Genet 6: e1001144.
Gibson, G., 2009 Decanalization and the origin of complex disease. Nat Rev Genet 10: 134-140.
Gibson, G., and I. Dworkin, 2004 Uncovering cryptic genetic variation. Nat Rev Genet 5: 681-
690.
Gietz, R. D., and R. A. Woods, 2002 Transformation of yeast by lithium acetate/single-stranded
carrier DNA/polyethylene glycol method. Methods Enzymol 350: 87-96.
Gilbert, L. A., M. H. Larson, L. Morsut, Z. Liu, G. A. Brar et al., 2013 CRISPR-mediated
modular RNA-guided regulation of transcription in eukaryotes. Cell 154: 442-451.
Gjuvsland, A. B., B. J. Hayes, S. W. Omholt and O. Carlborg, 2007 Statistical epistasis is a
generic feature of gene regulatory networks. Genetics 175: 411-420.
Grishkevich, V., and I. Yanai, 2013 The genomic determinants of genotype x environment
interactions in gene expression. Trends Genet 29: 479-487.
Guarente, L., 1993 Synthetic enhancement in gene interaction: a genetic tool come of age.
Trends Genet 9: 362-366.
Guo, B., C. A. Styles, Q. Feng and G. R. Fink, 2000 A Saccharomyces gene family involved in
invasive growth, cell-cell adhesion, and mating. Proc Natl Acad Sci U S A 97: 12158-
12163.
Gutteling, E. W., A. Doroszuk, J. A. Riksen, Z. Prokop, J. Reszka et al., 2007 Environmental
influence on the genetic correlations between life-history traits in Caenorhabditis elegans.
Heredity (Edinb) 98: 206-213.
Halme, A., S. Bumgarner, C. Styles and G. R. Fink, 2004 Genetic and epigenetic regulation of
the FLO gene family generates cell-surface variation in yeast. Cell 116: 405-415.
Hartman, J. L., B. Garvik and L. Hartwell, 2001 Principles for the buffering of genetic variation.
Science 291: 1001-1004.
Hemani, G., S. Knott and C. Haley, 2013 An evolutionary perspective on epistasis and the
missing heritability. PLoS Genet 9: e1003295.
Hermisson, J., and G. P. Wagner, 2004 The population genetic theory of hidden variation and
genetic robustness. Genetics 168: 2271-2284.
Herskowitz, I., and R. E. Jensen, 1991 Putting the HO gene to work: practical uses for mating-
type switching. Methods Enzymol 194: 132-146.
Hillenmeyer, M. E., E. Ericson, R. W. Davis, C. Nislow, D. Koller et al., 2010 Systematic
analysis of genome-wide fitness data in yeast reveals novel gene function and drug
action. Genome Biol 11: R30.
Hillenmeyer, M. E., E. Fung, J. Wildenhain, S. E. Pierce, S. Hoon et al., 2008 The chemical
genomic portrait of yeast: uncovering a phenotype for all genes. Science 320: 362-365.
Hodgkin, J., 2005 Genetic suppression. WormBook: 1-13.
Hodgkin, J., K. Kondo and R. H. Waterston, 1987 Suppression in the nematode Caenorhabditis
elegans. Trends Genet 3: 325-329.
Hou, J., A. Friedrich, J. S. Gounot and J. Schacherer, 2015 Comprehensive survey of condition-
specific reproductive isolation reveals genetic incompatibility in yeast. Nat Commun 6:
7214.
Hou, J., and J. Schacherer, 2017 Fitness trade-offs lead to suppressor tolerance in yeast. Mol Biol
Evol 34: 110-118.
Hou, J., A. Sigwalt, T. Fournier, D. Pflieger, J. Peter et al., 2016 The hidden complexity of
Mendelian traits across natural yeast populations. Cell Rep 16: 1106-1114.
Ip, C. L., M. Loose, J. R. Tyson, M. de Cesare, B. L. Brown et al., 2015 MinION analysis and
reference consortium: phase 1 data release and analysis. F1000Res 4: 1075.
Jain, M., I. T. Fiddes, K. H. Miga, H. E. Olsen, B. Paten et al., 2015 Improved data analysis for
the MinION nanopore sequencer. Nat Methods 12: 351-356.
Jarosz, D. F., and S. Lindquist, 2010 Hsp90 and environmental stress transform the adaptive
value of natural genetic variation. Science 330: 1820-1824.
Jerison, E. R., S. Kryazhimskiy, J. Mitchell, J. S. Bloom, L. Kruglyak et al., 2017 Genetic
variation in adaptability and pleiotropy in budding yeast. bioRxiv.
Jin, S. C., J. Homsy, S. Zaidi, Q. Lu, S. Morton et al., 2017 Contribution of rare inherited and de
novo variants in 2,871 congenital heart disease probands. Nat Genet 49: 1593-1601.
Jones, G. M., J. Stalker, S. Humphray, A. West, T. Cox et al., 2008 A systematic library for
comprehensive overexpression screens in Saccharomyces cerevisiae. Nat Methods 5:
239-241.
Jordan, D. M., S. G. Frangakis, C. Golzio, C. A. Cassa, J. Kurtzberg et al., 2015 Identification of
cis-suppression of human disease mutations by comparative genomics. Nature 524: 225-
229.
Julien, P., B. Minana, P. Baeza-Centurion, J. Valcarcel and B. Lehner, 2016 The complete local
genotype-phenotype landscape for the alternative splicing of a human exon. Nat Commun
7: 11558.
Kammenga, J. E., M. A. Herman, N. J. Ouborg, L. Johnson and R. Breitling, 2007 Microarray
challenges in ecology. Trends Ecol Evol 22: 273-279.
Kang, C. M., and Y. W. Jiang, 2005 Genome-wide survey of non-essential genes required for
slowed DNA synthesis-induced filamentous growth in yeast. Yeast 22: 79-90.
Kim, H. Y., S. B. Lee, H. S. Kang, G. T. Oh and T. Kim, 2014 Two distinct domains of Flo8
activator mediates its role in transcriptional activation and the physical interaction with
Mss11. Biochem Biophys Res Commun 449: 202-207.
Konermann, S., M. D. Brigham, A. E. Trevino, J. Joung, O. O. Abudayyeh et al., 2015 Genome-
scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature 517:
583-588.
Kumar, A., P. Sharma, M. Gomar-Alba, Z. Shcheprova, A. Daulny et al., 2018 Daughter-cell-
specific modulation of nuclear pore complexes controls cell cycle entry during
asymmetric division. Nat Cell Biol 20: 432-442.
Kvitek, D. J., J. L. Will and A. P. Gasch, 2008 Variations in stress sensitivity and genomic
expression in diverse S. cerevisiae isolates. PLoS Genet 4: e1000223.
Larson, M. H., L. A. Gilbert, X. Wang, W. A. Lim, J. S. Weissman et al., 2013 CRISPR
interference (CRISPRi) for sequence-specific control of gene expression. Nat Protoc 8:
2180-2196.
Lee, J. T., M. B. Taylor, A. Shen and I. M. Ehrenreich, 2016 Multi-locus genotypes underlying
temperature sensitivity in a mutationally induced trait. PLoS Genet 12: e1005929.
Lehner, B., 2011 Molecular mechanisms of epistasis within and between genes. Trends Genet
27: 323-331.
Lehner, B., C. Crombie, J. Tischler, A. Fortunato and A. G. Fraser, 2006 Systematic mapping of
genetic interactions in Caenorhabditis elegans identifies common modifiers of diverse
signaling pathways. Nat Genet 38: 896-903.
Li, B., M. Carey and J. L. Workman, 2007 The role of chromatin during transcription. Cell 128:
707-719.
Li, C., W. Qian, C. J. Maclean and J. Zhang, 2016 The fitness landscape of a tRNA gene.
Science 352: 837-840.
Li, H., and R. Durbin, 2009 Fast and accurate short read alignment with Burrows-Wheeler
transform. Bioinformatics 25: 1754-1760.
Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan et al., 2009 The Sequence
Alignment/Map format and SAMtools. Bioinformatics 25: 2078-2079.
Li, J., L. Wang, X. Wu, O. Fang, L. Wang et al., 2013 Polygenic molecular architecture
underlying non-sexual cell aggregation in budding yeast. DNA Res 20: 55-66.
Li, S. C., and P. M. Kane, 2009 The yeast lysosome-like vacuole: endpoint and crossroads.
Biochim Biophys Acta 1793: 650-663.
Linder, R. A., F. Seidl, K. Ha and I. M. Ehrenreich, 2016 The complex genetic and molecular
basis of a model quantitative trait. Mol Biol Cell 27: 209-218.
Liti, G., D. M. Carter, A. M. Moses, J. Warringer, L. Parts et al., 2009 Population genomics of
domestic and wild yeasts. Nature 458: 337-341.
Liu, G., M. Y. Yong, M. Yurieva, K. G. Srinivasan, J. Liu et al., 2015 Gene essentiality is a
quantitative property linked to cellular evolvability. Cell 163: 1388-1399.
Liu, H., C. A. Styles and G. R. Fink, 1996 Saccharomyces cerevisiae S288C Has a Mutation in
FLO8, a Gene Required for Filamentous Growth. Genetics 144: 967-978.
Lo, W. S., and A. M. Dranginis, 1998 The cell surface flocculin Flo11 is required for
pseudohyphae formation and invasion by Saccharomyces cerevisiae. Mol Biol Cell 9:
161-171.
Long, A. D., S. J. Macdonald and E. G. King, 2014 Dissecting complex traits using the
Drosophila Synthetic Population Resource. Trends Genet 30: 488-495.
Lynch, M., and B. Walsh, 1998 Genetics and analysis of quantitative traits. Sinauer Associates,
Inc., Sunderland, Massachusetts.
Mackay, T. F., 2014 Epistasis and quantitative traits: using model organisms to study gene-gene
interactions. Nat Rev Genet 15: 22-33.
Mackay, T. F., E. A. Stone and J. F. Ayroles, 2009 The genetics of quantitative traits: challenges
and prospects. Nat Rev Genet 10: 565-577.
Magtanong, L., C. H. Ho, S. L. Barker, W. Jiao, A. Baryshnikova et al., 2011 Dosage
suppression genetic interaction networks enhance functional wiring diagrams of the cell.
Nat Biotechnol 29: 505-511.
Mali, P., L. Yang, K. M. Esvelt, J. Aach, M. Guell et al., 2013 RNA-guided human genome
engineering via Cas9. Science 339: 823-826.
Manchia, M., J. Cullis, G. Turecki, G. A. Rouleau, R. Uher et al., 2013 The impact of phenotypic
and genetic heterogeneity on results of genome wide association studies of complex
diseases. PLoS One 8: e76295.
Mani, R., R. P. St Onge, J. L. Hartman IV, G. Giaever and F. P. Roth, 2008 Defining genetic
interaction. Proc Natl Acad Sci U S A 105: 3461-3466.
Manolio, T. A., F. S. Collins, N. J. Cox, D. B. Goldstein, L. A. Hindorff et al., 2009 Finding the
missing heritability of complex diseases. Nature 461: 747-753.
Masel, J., and M. L. Siegal, 2009 Robustness: mechanisms and consequences. Trends Genet 25:
395-403.
Matsui, T., and I. M. Ehrenreich, 2016 Gene-environment interactions in stress response
contribute additively to a genotype-environment interaction. PLoS Genet 12: e1006158.
Matsui, T., J. T. Lee and I. M. Ehrenreich, 2017 Genetic suppression: Extending our knowledge
from lab experiments to natural populations. Bioessays 39.
Matsui, T., R. Linder, J. Phan, F. Seidl and I. M. Ehrenreich, 2015 Regulatory rewiring in a cross
causes extensive genetic heterogeneity. Genetics 201: 769-777.
McClellan, J., and M. C. King, 2010 Genetic heterogeneity in human disease. Cell 141: 210-217.
McCusker, J. H., K. V. Clemons, D. A. Stevens and R. W. Davis, 1994 Saccharomyces
cerevisiae virulence phenotype as determined with CD-1 mice is associated with the
ability to grow at 42 degrees C and form pseudohyphae. Infect Immun 62: 5447-5455.
McCusker, J. H., K. V. Clemons, D. A. Stevens and R. W. Davis, 1994 Genetic characterization
of pathogenic Saccharomyces cerevisiae isolates. Genetics 136: 1261-1269.
Mitchell, A., H. Y. Chang, L. Daugherty, M. Fraser, S. Hunter et al., 2015 The InterPro protein
families database: the classification resource after 15 years. Nucleic Acids Res 43: D213-
221.
Mitchell, S. F., S. Jain, M. She and R. Parker, 2013 Global analysis of yeast mRNPs. Nat Struct
Mol Biol 20: 127-133.
Moffitt, T. E., A. Caspi and M. Rutter, 2005 Strategy for investigating interactions between
measured genes and measured environments. Arch Gen Psychiatry 62: 473-481.
Mosch, H. U., and G. R. Fink, 1997 Dissection of filamentous growth by transposon mutagenesis
in Saccharomyces cerevisiae. Genetics 145: 671-684.
Nadeau, J. H., 2001 Modifier genes in mice and humans. Nat Rev Genet 2: 165-174.
Narasimhan, V. M., K. A. Hunt, D. Mason, C. L. Baker, K. J. Karczewski et al., 2016 Health and
population effects of rare gene knockouts in adult humans with related parents. Science
352: 474-477.
Nogee, L. M., S. E. Wert, S. A. Proffit, W. M. Hull and J. A. Whitsett, 2000 Allelic
heterogeneity in hereditary surfactant protein B (SP-B) deficiency. Am J Respir Crit Care
Med 161: 973-981.
Nuzhdin, S. V., M. L. Friesen and L. M. McIntyre, 2012 Genotype-phenotype mapping in a post-
GWAS world. Trends Genet 28: 421-426.
Omholt, S. W., E. Plahte, L. Oyehaug and K. Xiang, 2000 Gene regulatory networks generating
the phenomena of additivity, dominance and epistasis. Genetics 155: 969-980.
Paaby, A. B., and M. V. Rockman, 2014 Cryptic genetic variation: evolution's hidden substrate.
Nat Rev Genet 15: 247-258.
Paaby, A. B., A. G. White, D. D. Riccardi, K. C. Gunsalus, F. Piano et al., 2015 Wild worm
embryogenesis harbors ubiquitous polygenic modifier variation. Elife 4.
Perez-Pinera, P., D. D. Kocak, C. M. Vockley, A. F. Adler, A. M. Kabadi et al., 2013 RNA-
guided gene activation by CRISPR-Cas9-based transcription factors. Nat Methods 10:
973-976.
Pettersson, M., F. Besnier, P. B. Siegel and O. Carlborg, 2011 Replication and explorations of
high-order epistasis using a large advanced intercross line pedigree. PLoS Genet 7:
e1002180.
Phillips, P. C., 2008 Epistasis--the essential role of gene interactions in the structure and
evolution of genetic systems. Nat Rev Genet 9: 855-867.
Prelich, G., 1999 Suppression mechanisms: themes from variations. Trends Genet 15: 261-266.
Puchta, O., B. Cseke, H. Czaja, D. Tollervey, G. Sanguinetti et al., 2016 Network of epistatic
interactions within a yeast snoRNA. Science 352: 840-844.
Qi, L. S., M. H. Larson, L. A. Gilbert, J. A. Doudna, J. S. Weissman et al., 2013 Repurposing
CRISPR as an RNA-guided platform for sequence-specific control of gene expression.
Cell 152: 1173-1183.
Queitsch, C., T. A. Sangster and S. Lindquist, 2002 Hsp90 as a capacitor of phenotypic variation.
Nature 417: 618-624.
Rabiner, L. R., 1989 A tutorial on hidden markov models and selected applications in speech
recognition. Proceedings of the IEEE 77: 257-286.
Rando, O. J., and F. Winston, 2012 Chromatin and transcription in yeast. Genetics 190: 351-387.
Rauw, W. M., and L. Gomez-Raya, 2015 Genotype by environment interaction and breeding for
robustness in livestock. Front Genet 6: 310.
Richardson, J. B., L. D. Uppendahl, M. K. Traficante, S. F. Levy and M. L. Siegal, 2013 Histone
variant HTZ1 shows extensive epistasis with, but does not increase robustness to, new
mutations. PLoS Genet 9: e1003733.
Risch, N. J., 2000 Searching for genetic determinants in the new millennium. Nature 405: 847-
856.
Robyr, D., Y. Suka, I. Xenarios, S. K. Kurdistani, A. Wang et al., 2002 Microarray deacetylation
maps determine genome-wide functions for yeast histone deacetylases. Cell 109: 437-
446.
Rupp, S., E. Summers, H. J. Lo, H. Madhani and G. Fink, 1999 MAP kinase and cAMP
filamentation signaling pathways converge on the unusually large promoter of the yeast
FLO11 gene. EMBO J 18: 1257-1269.
Rutherford, S. L., 2000 From genotype to phenotype: buffering mechanisms and the storage of
genetic information. Bioessays 22: 1095-1105.
Rutherford, S. L., 2003 Between genotype and phenotype: protein chaperones and evolvability.
Nat Rev Genet 4: 263-274.
Rutherford, S. L., and S. Lindquist, 1998 Hsp90 as a capacitor for morphological evolution.
Nature 396: 336-342.
Sackton, T. B., and D. L. Hartl, 2016 Genotypic context and epistasis in individuals and
populations. Cell 166: 279-287.
Sagot, I., S. K. Klee and D. Pellman, 2002 Yeast formins regulate cell polarity by controlling the
assembly of actin cables. Nat Cell Biol 4: 42-50.
Sanders, S. J., M. T. Murtha, A. R. Gupta, J. D. Murdoch, M. J. Raubeson et al., 2012 De novo
mutations revealed by whole-exome sequencing are strongly associated with autism.
Nature 485: 237-241.
Sandrock, T. M., J. L. O'Dell and A. E. Adams, 1997 Allele-specific suppression by formation of
new protein-protein interactions in yeast. Genetics 147: 1635-1642.
Sarkisyan, K. S., D. A. Bolotin, M. V. Meer, D. R. Usmanova, A. S. Mishin et al., 2016 Local
fitness landscape of the green fluorescent protein. Nature 533: 397-401.
Schell, R., M. Mullis and I. M. Ehrenreich, 2016 Modifiers of the genotype-phenotype map:
Hsp90 and beyond. PLoS Biol 14: e2001015.
Schlecht, U., Z. Liu, J. R. Blundell, R. P. St Onge and S. F. Levy, 2017 A scalable double-
barcode sequencing platform for characterization of dynamic protein-protein interactions.
Nat Commun 8: 15586.
Schneider, C. A., W. S. Rasband and K. W. Eliceiri, 2012 NIH Image to ImageJ: 25 years of
image analysis. Nat Methods 9: 671-675.
Sham, P. C., and S. M. Purcell, 2014 Statistical power and significance testing in large-scale
genetic studies. Nat Rev Genet 15: 335-346.
Shao, H., L. C. Burrage, D. S. Sinasac, A. E. Hill, S. R. Ernest et al., 2008 Genetic architecture
of complex traits: large phenotypic effects and pervasive epistasis. Proc Natl Acad Sci U
S A 105: 19910-19914.
Sherman, F., 1991 Guide to yeast genetics and molecular biology, pp. 3-21 in Methods in
Enzymology, edited by C. Guthrie and G. R. Fink. Elsevier Academic Press, San Diego,
California.
Siegal, M. L., and J. Y. Leu, 2014 On the nature and evolutionary impact of phenotypic
robustness mechanisms. Annu Rev Ecol Evol Syst 45: 496-517.
Sinha, H., L. David, R. C. Pascon, S. Clauder-Munster, S. Krishnakumar et al., 2008 Sequential
elimination of major-effect contributors identifies additional quantitative trait loci
conditioning high-temperature growth in yeast. Genetics 180: 1661-1670.
Sinha, H., B. P. Nicholson, L. M. Steinmetz and J. H. McCusker, 2006 Complex genetic
interactions in a quantitative trait locus. PLoS Genet 2: e13.
Sirr, A., G. A. Cromie, E. W. Jeffery, T. L. Gilbert, C. L. Ludlow et al., 2015 Allelic variation,
aneuploidy, and nongenetic mechanisms suppress a monogenic trait in yeast. Genetics
199: 247-262.
Smith, E. N., and L. Kruglyak, 2008 Gene-environment interaction in yeast gene expression.
PLoS Biol 6: e83.
Song, Q., C. Johnson, T. E. Wilson and A. Kumar, 2014 Pooled segregant sequencing reveals
genetic determinants of yeast pseudohyphal growth. PLoS Genet 10: e1004570.
Sopko, R., D. Huang, N. Preston, G. Chua, B. Papp et al., 2006 Mapping pathways and
phenotypes by systematic gene overexpression. Mol Cell 21: 319-330.
Spiezio, S. H., T. Takada, T. Shiroishi and J. H. Nadeau, 2012 Genetic divergence and the
genetic architecture of complex traits in chromosome substitution strains of mice. BMC
Genet 13: 38.
St Onge, R. P., R. Mani, J. Oh, M. Proctor, E. Fung et al., 2007 Systematic pathway analysis
using high-resolution fitness profiling of combinatorial gene deletions. Nat Genet 39:
199-206.
Steinmetz, L. M., H. Sinha, D. R. Richards, J. I. Spiegelman, P. J. Oefner et al., 2002 Dissecting
the architecture of a quantitative trait locus in yeast. Nature 416: 326-330.
Storici, F., L. K. Lewis and M. A. Resnick, 2001 In vivo site-directed mutagenesis using
oligonucleotides. Nat Biotechnol 19: 773-776.
Strope, P. K., D. A. Skelly, S. G. Kozmin, G. Mahadevan, E. A. Stone et al., 2015 The 100-
genomes strains, an S. cerevisiae resource that illuminates its natural phenotypic and
genotypic variation and emergence as an opportunistic pathogen. Genome Res 25: 762-
774.
Sutcliffe, J. S., R. J. Delahanty, H. C. Prasad, J. L. McCauley, Q. Han et al., 2005 Allelic
heterogeneity at the serotonin transporter locus (SLC6A4) confers susceptibility to autism
and rigid-compulsive behaviors. Am J Hum Genet 77: 265-279.
Tanenbaum, M. E., L. A. Gilbert, L. S. Qi, J. S. Weissman and R. D. Vale, 2014 A protein-
tagging system for signal amplification in gene expression and fluorescence imaging.
Cell 159: 635-646.
Taylor, M. B., and I. M. Ehrenreich, 2014 Genetic interactions involving five or more genes
contribute to a complex trait in yeast. PLoS Genet 10: e1004324.
Taylor, M. B., and I. M. Ehrenreich, 2015 Higher-order genetic interactions and their
contribution to complex traits. Trends Genet 31: 34-40.
Taylor, M. B., and I. M. Ehrenreich, 2015 Transcriptional derepression uncovers cryptic higher-
order genetic interactions. PLoS Genet 11: e1005606.
Taylor, M. B., J. Phan, J. T. Lee, M. McCadden and I. M. Ehrenreich, 2016 Diverse genetic
architectures lead to the same cryptic phenotype in a yeast cross. Nat Commun 7: 11669.
Timpson, N. J., C. M. T. Greenwood, N. Soranzo, D. J. Lawson and J. B. Richards, 2018 Genetic
architecture: the shape of the genetic contribution to human traits and disease. Nat Rev
Genet 19: 110-124.
Tirosh, I., S. Reikhav, N. Sigal, Y. Assia and N. Barkai, 2010 Chromatin regulators as capacitors
of interspecies variations in gene expression. Mol Syst Biol 6: 435.
Tischler, J., B. Lehner, N. Chen and A. G. Fraser, 2006 Combinatorial RNA interference in
Caenorhabditis elegans reveals that redundancy between gene duplicates can be
maintained for more than 80 million years of evolution. Genome Biol 7: R69.
Tong, A. H., and C. Boone, 2006 Synthetic genetic array analysis in Saccharomyces cerevisiae.
Methods Mol Biol 313: 171-192.
Tong, A. H., M. Evangelista, A. B. Parsons, H. Xu, G. D. Bader et al., 2001 Systematic genetic
analysis with ordered arrays of yeast deletion mutants. Science 294: 2364-2368.
van Leeuwen, J., C. Pons, J. C. Mellor, T. N. Yamaguchi, H. Friesen et al., 2016 Exploring
genetic suppression interactions on a global scale. Science 354.
van Swinderen, B., and R. J. Greenspan, 2005 Flexibility in a gene network affecting a simple
behavior in Drosophila melanogaster. Genetics 169: 2151-2163.
Via, S., and R. Lande, 1985 Genotype-Environment Interaction and the Evolution of Phenotypic
Plasticity. Evolution 39: 505-522.
Visscher, P. M., and J. Bruce Walsh, 2017 Commentary: Fisher 1918: the foundation of the
genetics and analysis of complex traits. Int J Epidemiol.
Voordeckers, K., J. Kominek, A. Das, A. Espinosa-Cantu, D. De Maeyer et al., 2015 Adaptation
to high ethanol reveals complex evolutionary pathways. PLoS Genet 11: e1005635.
Vu, V., A. J. Verster, M. Schertzberg, T. Chuluunbaatar, M. Spensley et al., 2015 Natural
variation in gene expression modulates the severity of mutant phenotypes. Cell 162: 391-
402.
Wagner, A., 2005 Distributed robustness versus redundancy as causes of mutational robustness.
Bioessays 27: 176-188.
Wagner, A., 2012 The role of robustness in phenotypic adaptation and innovation. Proc Biol Sci
279: 1249-1258.
Walsh, T., and M. C. King, 2007 Ten genes for inherited breast cancer. Cancer Cell 11: 103-105.
Walsh, T., J. M. McClellan, S. E. McCarthy, A. M. Addington, S. B. Pierce et al., 2008 Rare
structural variants disrupt multiple genes in neurodevelopmental pathways in
schizophrenia. Science 320: 539-543.
Wang, M., and R. N. Collins, 2014 A lysine deacetylase Hos3 is targeted to the bud neck and
involved in the spindle position checkpoint. Mol Biol Cell 25: 2720-2734.
Wang, Y., T. Shirogane, D. Liu, J. W. Harper and S. J. Elledge, 2003 Exit from exit: resetting the
cell cycle through Amn1 inhibition of G protein signaling. Cell 112: 697-709.
Will, J. L., H. S. Kim, J. Clarke, J. C. Painter, J. C. Fay et al., 2010 Incipient balancing selection
through adaptive loss of aquaporins in natural Saccharomyces cerevisiae populations.
PLoS Genet 6: e1000893.
Wout, P. K., E. Sattlegger, S. M. Sullivan and J. R. Maddock, 2009 Saccharomyces cerevisiae
Rbg1 protein and its binding partner Gir2 interact on Polyribosomes with Gcn1. Eukaryot
Cell 8: 1061-1071.
Wray, N. R., and R. Maier, 2014 Genetic basis of complex genetic disease: The contribution of
disease heterogeneity to missing heritability. Curr Epidemiol Rep 1: 220-227.
Yang, Y., M. R. Foulquie-Moreno, L. Clement, E. Erdei, A. Tanghe et al., 2013 QTL analysis of
high thermotolerance with superior and downgraded parental yeast strains reveals new
minor QTLs and converges on novel causative alleles involved in RNA processing. PLoS
Genet 9: e1003693.
Yvert, G., R. B. Brem, J. Whittle, J. M. Akey, E. Foss et al., 2003 Trans-acting regulatory
variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet 35:
57-64.
Zeng, Z. B., 2005 QTL mapping and the genetic basis of adaptation: recent developments.
Genetica 123: 25-37.
Zheng, W., H. Zhao, E. Mancera, L. M. Steinmetz and M. Snyder, 2010 Genetic analysis of
variation in transcription factor binding in yeast. Nature 464: 1187-1191.
Appendix A: Higher-order genetic interactions and complex trait variation
This work appears essentially as published in 2016 in eLS. 1-5.
A.1 Abstract
Genetic variants that segregate within species can cause individuals to show heritable
phenotypic differences. Some of these polymorphisms act the same regardless of the
other variants with which they co-occur. However, many of these polymorphisms exhibit
genetic (or epistatic) interactions with each other and thus show different effects across
genetic backgrounds. These interactions represent a potentially important source of
heritable trait variation, but are difficult to identify in most genetic mapping studies. For
this reason, researchers typically focus on two-locus interactions, which are the least
complex and easiest to identify form of interaction. Although two-locus interactions are
undoubtedly important, higher-order genetic interactions (HGIs) involving three or more
loci can also occur. In this chapter, we discuss the phenotypic effects, underlying
molecular mechanisms, and potential biological significance of these HGIs.
A.2 Keywords
Genetic interaction; epistasis; higher-order genetic interaction; pairwise genetic
interaction; non-additive genetic effects; complex traits; heritability; gene regulatory
network; cryptic genetic variation; phenotypic capacitance; phenotypic selection.
A.3 Key Concepts
• Higher-order genetic interactions (HGIs) occur when three or more
polymorphisms collectively exhibit unexpected phenotypic effects.
• Detecting HGIs is difficult, especially as the number of involved loci increases.
• Evidence suggests that genetically complex perturbation of gene regulatory
networks might be the major source of HGIs.
• HGIs can cause individuals to show different susceptibilities to environmental
change and mutation.
A.4 Introduction
One of the central challenges in contemporary genetics is to comprehensively
determine the causes of heritable phenotypic variation (MANOLIO et al. 2009). Indeed,
despite advances in DNA sequencing technologies and genetic mapping methods, our
understanding of the genetic basis of most agriculturally, evolutionarily, and clinically
relevant traits remains incomplete (EHRENREICH et al. 2009). This is because heritable
phenotypes are often genetically complex, i.e., they are influenced by a large number of
loci that can interact with each other and the environment (MACKAY et al. 2009). See
also: DOI: 10.1002/9780470015902.a0021448, DOI: 10.1002/9780470015902.a0002295
In this chapter, we focus specifically on the contribution of genetic (or ‘epistatic’)
interactions to complex traits (MACKAY 2014). Due to statistical considerations, most
work to date on genetic interactions has emphasized pairwise genetic interactions.
However, genetic interactions involving three or more loci also occur and can make
important contributions to traits (TAYLOR AND EHRENREICH 2015a). Here, we summarize
evidence for these higher-order genetic interactions (HGIs) and describe methods that can
be used to further identify and study HGIs in model organisms. Additionally, we discuss
molecular mechanisms that theoretical and empirical studies suggest might give rise to
HGIs. To conclude the chapter, we review the role that HGIs may play in potentiating the
phenotypic effects of mutations and environmental change.
A.5 Main Text
A.5.1 Evidence for HGIs
HGIs occur when combinations of alleles at three or more loci show unexpected
phenotypic effects (Figure A.1A and A.1B) (TAYLOR AND EHRENREICH 2015a). An HGI
might be detected as a quantitative change in phenotype, with individuals that carry
particular combinations of alleles exhibiting significantly higher or lower trait values than
expected from the sum of the individual effects of those alleles. Alternatively, HGIs can be
qualitative, such that only individuals that possess the necessary combination of alleles
express a particular trait. Both classes of HGIs have been described in the literature, as
summarized elsewhere (TAYLOR AND EHRENREICH 2015b).
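The quantitative case can be made concrete with a small sketch. The code below assumes a haploid, three-locus trait in which each locus has a known individual effect, and it reports how far each genotype's trait value departs from the sum of those effects. All numbers, names, and the coding of alleles as 0 and 1 are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
from itertools import product

def additive_expectation(genotype, baseline, effects):
    """Trait value expected if each alternate (1) allele simply adds its own effect."""
    return baseline + sum(e for allele, e in zip(genotype, effects) if allele == 1)

def deviation_from_additivity(genotype, observed_value, baseline, effects):
    """Observed trait value minus the additive expectation for that genotype."""
    return observed_value - additive_expectation(genotype, baseline, effects)

# Hypothetical values, chosen only for illustration.
BASELINE = 10.0            # trait value of the genotype carrying no alternate alleles
EFFECTS = (1.0, 1.0, 1.0)  # individual effects at loci A, B, and C

# Start from a purely additive trait, then let the triple carrier depart from it.
OBSERVED = {g: additive_expectation(g, BASELINE, EFFECTS)
            for g in product((0, 1), repeat=3)}
OBSERVED[(1, 1, 1)] = 20.0  # the additive expectation would be 13.0

if __name__ == "__main__":
    for genotype, value in OBSERVED.items():
        print(genotype, deviation_from_additivity(genotype, value, BASELINE, EFFECTS))
```

In this toy data set every single and double carrier matches its additive expectation, so the departure seen only in the triple carrier cannot be attributed to any pairwise interaction.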
Figure A.1 Comparison of three additive loci to three loci involved in an HGI. In A, three
additive loci with equal effects affect a phenotype. In contrast, B shows a situation where three
interacting loci collectively influence a phenotype. For both panels, an individual’s genotype at a
specific locus is shown as either a blue or an orange square, corresponding to the two segregating
alleles of that locus. The same coloring scheme is used throughout this chapter. The different
genotypes across the involved loci are shown on the x-axis, and the quantitative change in trait
value among individuals with a given genotype is shown on the y-axis. For simplicity, we use
haploids in this figure, as well as in all subsequent figures.
Some of the strongest evidence for HGIs comes from model organisms. For
example, in the budding yeast Saccharomyces cerevisiae, HGIs involving five or more
genes have been shown to influence the morphology of colonies of cells grown on agar
plates (TAYLOR AND EHRENREICH 2015a, TAYLOR AND EHRENREICH 2015b). Furthermore,
suggestive evidence has been provided that HGIs might also play a role in determining
which genes are essential for viability in different yeast strain backgrounds (DOWELL et
al. 2010). Additionally, HGIs were shown to make major contributions to body weight in
chickens (PETTERSSON et al. 2011) and to the effects of mutations on wing shape in
Drosophila melanogaster (CHANDLER et al. 2014). These examples show that HGIs
happen in a variety of organisms and suggest that additional HGIs will likely be
identified in the future.
[Figure A.1 panel labels: (A) Three additive loci; (B) Higher-order genetic interactions. In both panels the y-axis is the quantitative change in phenotype and the x-axis shows genotypes at Locus A, Locus B, and Locus C.]
A.5.2 Challenges in detecting HGIs
Genetic mapping studies typically have low statistical power to detect genetic
interactions (MACKAY 2014). This is particularly true for HGIs, for at least
two reasons (TAYLOR AND EHRENREICH 2015a). First, the number of tests required for
genome-wide scans for genetic interactions increases exponentially as increasingly
complex HGIs are considered (Figure A.2A). Few, if any, HGIs may remain statistically
significant after such genome-wide scans are corrected for multiple testing. Second, with
each increase in the number of loci involved in an HGI, the number of genotype classes
under consideration increases by a factor of two in haploids or three in diploids (Figure
A.2B). If the effect of an interaction is only visible in one multi-locus genotype class and
the number of individuals in that class is small, the interaction may go undetected even
though it has a real effect.
The main strategy that has been used to overcome these statistical challenges is
artificial selection for individuals that show unusually high or low trait values, or express
novel phenotypes (EHRENREICH et al. 2010, CARLBORG et al. 2006, TAYLOR AND
EHRENREICH 2014, CHANDLER et al. 2014). Even though individuals that carry higher-
order combinations of interacting alleles will be rare, a sufficient number of these
individuals can be obtained by screening large mapping populations. When these selected
individuals are genotyped, the interacting alleles at loci involved in HGIs may show
enrichment (EHRENREICH et al. 2010, TAYLOR AND EHRENREICH 2014, TAYLOR AND
EHRENREICH 2015b). Researchers have successfully identified loci involved in HGIs
using phenotypic selection-based genetic mapping in a variety of organisms (CARLBORG
et al. 2006, TAYLOR AND EHRENREICH 2014, CHANDLER et al. 2014, TAYLOR AND
EHRENREICH 2015b). See also: DOI: 10.1038/npg.els.0005410
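One way such enrichment might be assessed is sketched below. The cited studies are not being paraphrased here: the sketch simply assumes a haploid cross in which each parental allele is expected at a frequency of 0.5 in the absence of selection, and applies an exact two-sided binomial test to the allele counts among selected individuals. The function name and the example counts are hypothetical, and no correction for testing many loci is included.

```python
from math import comb

def binomial_two_sided_p(successes: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than the
    observed count under the null allele frequency p.
    """
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))

if __name__ == "__main__":
    # Hypothetical counts: 20 selected segregants genotyped at two loci.
    print("candidate locus, 19 of 20 carry one allele:", binomial_two_sided_p(19, 20))
    print("unlinked locus, 11 of 20 carry one allele:", binomial_two_sided_p(11, 20))
```

A locus at which nearly all selected individuals carry the same allele yields a very small p-value, whereas a locus near the expected 0.5 frequency does not.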
Figure A.2 Factors that limit the statistical power of tests for HGIs. In A, the number of
statistical tests involved in genome-wide scans for additive effects or genetic interactions of
varying complexities across 1,000 loci is shown. To enable visualization of the large number of
tests, the values are plotted on a log10 scale on the y-axis. In B, all possible genotype classes that
can exist with an increasing number of interacting alleles are displayed. As more loci are involved,
the number of possible genotype combinations increases. Thus, fewer individuals with a specific
multi-locus genotype class will be present in a mapping population, reducing statistical power to
detect HGIs.
[Figure A.2 panel labels: (A) Number of tests in a genome-wide scan of 1,000 genetic variants (y-axis, 10^1 to 10^25) versus number of involved loci (x-axis, 1 to 10); (B) possible haploid genotypes for 1 to 4 interacting loci (Locus A through Locus D).]
A.5.3 How do HGIs arise at the molecular level
A number of mechanisms have been identified that can result in two-locus
interactions (LEHNER 2011, BOONE et al. 2007, PHILLIPS 2008, TAYLOR AND
EHRENREICH 2015a). These include disruption of functionally redundant genes, pathways,
or protein complexes; perturbation of physical interactions between proteins; changes in
the dosage of key transcripts, proteins, or metabolites beyond normal thresholds; and
complex changes in gene regulation. Although each of these mechanisms could in
principle extend to HGIs, most empirical and theoretical work to date suggests that
genetically complex changes to gene regulatory networks are the primary source of HGIs
(OMHOLT et al. 2000, GJUVSLAND et al. 2007, TAYLOR AND EHRENREICH 2015b).
Gene expression is often regulated by complex networks of signaling cascades
and transcription factors, and a phenotypic change likely requires multiple
genes to be altered in their abundance or localization (NUZHDIN et al. 2012).
See also: DOI: 10.1002/9780470015902.a0002322.pub2 Such changes are possible
because populations harbor a large number of polymorphisms that can alter gene
expression (ALBERT AND KRUGLYAK 2015). These genetic variants can likely
rewire gene regulatory networks (MATSUI et al. 2015), causing individuals to show
significant differences in the levels of phenotypically important transcripts (Figure A.3A,
A.3B, and A.3C).
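A toy model can illustrate how regulatory variants at several loci might jointly produce a threshold-like, higher-order effect. The sketch below is purely hypothetical and is not drawn from the cited work: it assumes that each blue (0) allele encodes an active repressor that scales transcription of a target gene down by a fixed factor, that orange (1) alleles are inactive, and that a phenotype appears only when transcript abundance crosses an arbitrary threshold.

```python
from itertools import product

def transcript_level(genotype, max_level=100.0, leak=0.05):
    """Toy model: each blue (0) allele is an active repressor that scales expression by `leak`."""
    level = max_level
    for allele in genotype:
        if allele == 0:
            level *= leak
    return level

def shows_phenotype(genotype, threshold=50.0):
    """A phenotype appears only if transcript abundance reaches the (arbitrary) threshold."""
    return transcript_level(genotype) >= threshold

if __name__ == "__main__":
    # Enumerate all eight haploid genotypes at three regulatory loci.
    for genotype in product((0, 1), repeat=3):
        print(genotype, transcript_level(genotype), shows_phenotype(genotype))
```

With these assumed parameters, transcript abundance rises with each orange allele, but only the genotype carrying orange alleles at all three loci crosses the threshold, so the phenotype depends on a specific three-locus combination.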
Figure A.3 Role of gene regulatory networks in HGIs. One way HGIs can occur is if genetic
variants are found in genes that participate in a gene regulatory network and collectively
determine the transcript level of a phenotypically important gene. (A) In individuals with blue
alleles at all three interacting loci, the loci collectively suppress expression of the gene. (B) The
gene regulatory network in individuals with one orange allele is altered such that the gene is
transcribed at a low level, but transcript abundance is not high enough to cause a phenotypic
change. (C) The gene regulatory network is further modified in individuals with three orange
alleles. The phenotypically important gene is now highly transcribed, leading to a change in
phenotype.
A.5.4 Phenotypic and potential evolutionary consequences of HGIs
Mounting evidence suggests that HGIs can involve polymorphisms that do not
typically show phenotypic effects (HERMISSON AND WAGNER 2004). Such genetic variants
are referred to as ‘cryptic’ because of their highly conditional nature (GIBSON AND
DWORKIN 2004, PAABY AND ROCKMAN 2014). Populations may be capable of
accumulating large amounts of cryptic variation because these polymorphisms are
typically neutral with respect to fitness (EHRENREICH AND PFENNIG 2015). However,
changes in conditions, such as an environmental shift or a new mutation, can convert
cryptic genetic variants from being silent to having significant phenotypic effects (Figure
A.4) (RUTHERFORD 2000, RUTHERFORD 2003, GIBSON AND DWORKIN 2004, PAABY AND
ROCKMAN 2014). This phenomenon is called ‘phenotypic capacitance’ in reference to the
fact that it involves the release of phenotypic potential that is usually suppressed
(RUTHERFORD AND LINDQUIST 1998, QUEITSCH et al. 2002, BERGMAN AND SIEGAL 2003,
JAROSZ AND LINDQUIST 2010, TAYLOR AND EHRENREICH 2015b).
Figure A.4 Involvement of HGIs in phenotypic capacitance. Individuals can carry cryptic
genetic variants, which may not exhibit any visible phenotypic effects under normal conditions.
However, when individuals carrying the alleles involved in an HGI are exposed to atypical
conditions, such as a spontaneous mutation or sudden change in environment, then their effects
may now be uncovered, causing a change in phenotype. Individuals with different genotype
classes and their phenotypes, depicted as either a gray circle or a green box, are shown here.
Under normal conditions, all genetically distinct individuals exhibit the same phenotype.
However, when they are exposed to an atypical condition, the individual with the three orange
alleles now exhibits a different phenotype.
HGIs among cryptic genetic variants may provide the genetic potential for
phenotypic capacitance to occur (HERMISSON AND WAGNER 2004, CHANDLER et al. 2014,
TAYLOR AND EHRENREICH 2015b). In other words, phenotypic capacitance may only be
possible in specific genetic backgrounds that carry particular combinations of interacting
cryptic genetic variants (TAYLOR AND EHRENREICH 2015b). Consistent with a relationship
between HGIs and phenotypic capacitance, systems-level modeling suggests that, like
HGIs, phenotypic capacitance arises due to complex changes to gene regulatory networks
(BERGMAN AND SIEGAL 2003).
A.6 Conclusion
Sets of three or more genetic variants can interact with each other, as well as with
mutations and the environment, to produce unexpected phenotypic effects. These HGIs
are difficult to detect in genetic studies, but mapping techniques that employ phenotypic
selection can be used to identify HGIs. Although more examples of HGIs are needed,
present evidence suggests that HGIs arise through genetically complex perturbations of
gene regulatory networks. HGIs can cause individuals to differ in their responses to
mutations and environmental change, and thus might facilitate the expression of diseases
and novel phenotypes in specific genetic backgrounds.
A.7 Further Reading List
Bloom J, et al. (2013) Finding the sources of missing heritability in a yeast cross. Nature
494(7436): 234-237.
Cordell HJ (2002) Epistasis: what it means, what it doesn't mean, and statistical methods
to detect it in humans. Hum Mol Genet 11(20):2463-2468.
Carlborg O & Haley CS (2004) Epistasis: too often neglected in complex trait studies?
Nat Rev Genet 5:618-25.
Huang W, et al. (2012) Epistasis dominates the genetic architecture of Drosophila
quantitative traits. Proc Natl Acad Sci USA 109:15553-15559.
A.8 Glossary
Additive genetic variant: A DNA polymorphism that shows a phenotypic effect that is
not affected by genetic variation at other loci.
Pairwise genetic interaction: When certain combinations of alleles at two loci produce
unexpected phenotypic effects.
Higher-order genetic interaction: When certain combinations of alleles at three or more
loci produce unexpected phenotypic effects.
Genetic background effect: When a genetic variant or a mutation has a phenotypic effect
that changes across genetically distinct individuals.
Cryptic variation: Polymorphisms that do not typically show phenotypic effects, but can
be uncovered by genetic or environmental perturbations.
Phenotypic capacitance: When cryptic variants are revealed and they collectively cause a
phenotypic change.
Gene regulatory network: A collection of functionally interacting genes that control the
expression levels of one or more transcripts.
Appendix B: Genetic suppression: Extending our knowledge from lab experiments
to natural populations
This work appears essentially as published in 2017 in Bioessays.
B.1 Abstract
Many mutations have deleterious phenotypic effects that can be alleviated by suppressor
mutations elsewhere in the genome. High-throughput approaches have facilitated the
large-scale identification of these suppressors and have helped shed light on core
functional mechanisms that give rise to suppression. Following reports that suppression
occurs naturally within species, it is important to determine how our understanding of this
phenomenon based on lab experiments extends to genetically diverse natural populations.
Although suppression is typically mediated by individual genetic changes in lab
experiments, recent studies have shown that suppression in natural populations can
involve combinations of genetic variants. This difference in complexity suggests that sets
of variants can exhibit similar functional effects to individual suppressors found in lab
experiments. In this review, we discuss how characterizing the way in which these
variants jointly lead to suppression could provide important insights into the genotype-
phenotype map that are relevant to evolution and health.
B.2 Introduction
Genetic interactions occur when combinations of mutations show phenotypic
effects that differ from expectations based on individual mutations (HARTMAN et al.
2001; CARLBORG AND HALEY 2004; BOONE et al. 2007; MANI et al. 2008; PHILLIPS 2008;
COSTANZO et al. 2010; MACKAY 2014; TAYLOR AND EHRENREICH 2015a; COSTANZO et
al. 2016; FORSBERG et al. 2016; SCHELL et al. 2016). Among the types of genetic
interactions that can occur (ST ONGE et al. 2007; BRESLOW et al. 2008; COSTANZO et al.
2016), suppression represents an extreme case in which one or more genetic changes at
other sites in the genome (i.e., ‘suppressors’) reverse a mutation’s deleterious effects
(GUARENTE 1993; BOTSTEIN 2015). Identifying these suppressors can provide valuable
insights into the functional mechanisms by which mutations jointly affect phenotype
(HODGKIN et al. 1987; PRELICH 1999; BOTSTEIN 2015). As we describe below, high-
throughput sequencing and genomics strategies have led to new approaches for rapidly
identifying mutation-suppressor combinations on a large scale. These methods have
helped produce general insights into the mechanisms that can cause suppression within
and between genes in lab experiments.
Suppression has also become a point of interest for researchers focused on
understanding how naturally occurring genetic differences among individuals alter the
effects of mutations, thereby leading to incomplete penetrance and variable expressivity.
For example, recent studies in yeast (DOWELL et al. 2010; HOU et al. 2015; TAYLOR et al.
2016) and humans (JORDAN et al. 2015; CHEN et al. 2016) have shown that the ability to
suppress particular large effect and Mendelian mutations can segregate within
populations. At present, the extent to which lab studies on suppression relate to these
cases of naturally occurring suppression is unclear. Notably, while suppression in the lab
typically entails one suppressor interacting with a mutation (COSTANZO et al. 2016),
suppression in natural contexts may involve combinations of genetic variants that
collectively revert the effect of a mutation (Figure B.1).
Figure B.1 Genetic basis of suppression in lab versus natural environments. A: In lab
experiments, populations start out as genetically identical and suppressors arise as new single
mutations on the background of the original mutation. B: Large amounts of genetic variation
segregate within natural populations and some of these variants can individually or collectively
act as suppressors in a similar fashion to the large-effect mutations identified in lab experiments.
In both A and B, each chromosome represents a haploid individual. Grey chromosomes exhibit
the mutant phenotype, while black chromosomes do not. Stars represent a mutation of interest,
while vertical bars show other genetic changes in the population. Red vertical bars individually or
collectively act as suppressors, while black vertical bars have no effect on the mutant phenotype.
In this review, we attempt to broadly synthesize work on suppression and leverage
knowledge gained from lab experiments to provide insights into the genetic and
molecular basis of suppression in natural populations.
B.3 Main text
B.3.1 High-throughput techniques for identifying suppressor mutations in lab
experiments
High-throughput approaches have facilitated the comprehensive identification of
suppressors in experimental systems. In the following section, we summarize these
techniques, differentiating between approaches that primarily enable identification of
suppressors that occur in the same gene as the mutation whose effect they alleviate
(‘intragenic’ suppressors) and suppressors that occur in different genes (‘intergenic’ or
‘extragenic’ suppressors):
Comprehensive mutagenesis of individual genes: Researchers attempting to identify
intragenic suppressors can employ site-directed or random mutagenesis to generate a
large subset, if not all, of the possible single and double mutants of a given gene (JULIEN
et al. 2016; LI et al. 2016; PUCHTA et al. 2016; SARKISYAN et al. 2016) (Figure B.2A).
Comparison of single and double mutant phenotypes can be used to identify cases of
intragenic suppression (Figure B.2B). Studies of this type can examine a large fraction of
the possible genetic interactions within a given gene, especially if a gene is small. Such
projects are presently constrained by the read lengths of short read sequencing
technologies and the throughputs and error rates of long read sequencing technologies (IP
et al. 2015; JAIN et al. 2015). However, as long read sequencing technologies increase in
throughput and accuracy, larger genes, and potentially even sets of functionally related
genes, may become amenable to gene-specific mutagenesis techniques.
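To give a sense of why gene size constrains such projects, the following is a minimal sketch that counts the single and double mutants of a gene. It assumes only nucleotide substitutions (no insertions or deletions), and the function name `mutant_library_size` is hypothetical rather than taken from any of the cited studies.

```python
from math import comb

def mutant_library_size(gene_length_nt: int) -> dict:
    """Count single and double nucleotide-substitution mutants of a gene."""
    alt_bases = 3  # each position can change to any of the 3 other nucleotides
    singles = alt_bases * gene_length_nt
    # choose 2 distinct positions, then one of 3 substitutions at each position
    doubles = comb(gene_length_nt, 2) * alt_bases ** 2
    return {"singles": singles, "doubles": doubles}

# The number of double mutants grows roughly with the square of gene length.
for length in (100, 300, 1000):
    sizes = mutant_library_size(length)
    print(length, sizes["singles"], sizes["doubles"])
```

Under these assumptions, a 100-nucleotide gene already has 44,550 possible double mutants, and a 1,000-nucleotide gene has roughly 4.5 million.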
Figure B.2 Identifying intragenic interactions across an entire gene. A: Using directed or
random mutagenesis, a library of mutants can be constructed such that every pairwise (or
potentially higher-order) combination of nucleotides in a gene (YFG – ‘your favorite gene’) is
mutated. B: Each single and double mutant can be examined in order to identify all of the
potential genetic interactions that can occur within a gene, including those that result in
suppression.
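The comparison of single and double mutant phenotypes can be sketched as a simple classification rule. This is an illustration under stated assumptions, not the scoring scheme of any cited study: fitness is expressed relative to wild type (1.0), the expected double-mutant fitness is taken to be the product of the single-mutant fitnesses (a multiplicative null model), and the function name and tolerance `tol` are hypothetical.

```python
def classify_pair(w_a: float, w_b: float, w_ab: float, tol: float = 0.05) -> str:
    """Classify a double mutant from fitness values relative to wild type (1.0).

    Assumption: expected double-mutant fitness is the product of the
    single-mutant fitnesses (multiplicative null model).
    """
    expected = w_a * w_b
    epsilon = w_ab - expected  # interaction score
    worst_single = min(w_a, w_b)
    if w_ab > worst_single + tol:
        # The double mutant is fitter than the sicker single mutant, so the
        # second change reverses at least part of the deleterious effect.
        return "suppression"
    if epsilon > tol:
        return "positive interaction"
    if epsilon < -tol:
        return "negative interaction"
    return "no interaction"

print(classify_pair(0.4, 1.0, 0.9))   # second mutation rescues the first
print(classify_pair(0.8, 0.8, 0.3))   # double mutant is sicker than expected
```

Here suppression is treated as the extreme case of a positive interaction, in which the double mutant exceeds the fitness of the less fit single mutant.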
Mapping induced or spontaneous suppressors: Suppressors can be obtained by
screening for induced or spontaneous mutations that revert the phenotype of the original
mutant (Figure B.3A). However, revertants recovered from these screens typically carry
multiple mutations, and it is not always clear which of them is the suppressor
(BOTSTEIN 2015; COSTANZO et al. 2016). A straightforward strategy for distinguishing a
mutation with a phenotypic effect from its co-occurring ‘passenger’ mutations is to use
crosses in combination with whole genome sequencing (COSTANZO et al. 2016; TAYLOR
et al. 2016). Recently, such an approach was used to identify more than 200 mutation-
suppressor pairs in a single study (COSTANZO et al. 2016).
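The logic of separating a suppressor from passenger mutations with a cross can be sketched as follows. The data are invented for illustration, the variant names `m1` to `m3` are hypothetical, and the sketch assumes haploid segregants that all carry the original mutation, with at least one segregant in each phenotypic class. In such a cross, a true suppressor should be present in the revertant segregants and absent from those showing the mutant phenotype, whereas unlinked passengers should segregate independently of phenotype.

```python
def score_candidates(segregants):
    """Return, for each candidate variant, its frequency among revertant and
    mutant-phenotype segregants. Each segregant is (phenotype, set_of_variants)."""
    groups = {"revertant": [], "mutant": []}
    for phenotype, variants in segregants:
        groups[phenotype].append(variants)
    candidates = set().union(*(variants for _, variants in segregants))
    scores = {}
    for c in sorted(candidates):
        scores[c] = {
            g: sum(c in v for v in members) / len(members)
            for g, members in groups.items()
        }
    return scores

# Toy data (not real): every segregant carries the original mutation.
segregants = [
    ("revertant", {"m1", "m2"}),
    ("revertant", {"m1"}),
    ("revertant", {"m1", "m3"}),
    ("mutant", {"m2"}),
    ("mutant", {"m3"}),
    ("mutant", set()),
    ("mutant", {"m2", "m3"}),
]

scores = score_candidates(segregants)
# The best candidate shows the largest frequency difference between classes.
best = max(scores, key=lambda c: scores[c]["revertant"] - scores[c]["mutant"])
print(best, scores[best])
```

In practice the frequencies would come from whole genome sequencing of individual segregants or phenotypically selected pools, and linkage between nearby mutations would need to be taken into account.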
Figure B.3 Techniques for identifying intergenic suppressors. Multiple types of genetic
screens have been utilized to identify suppressors in lab experiments. A: New mutations
generated either through mutagenesis or spontaneous mutation can result in suppressors (red).
This approach usually results in the identification of multiple mutations, from which the actual
suppressor must be distinguished. B: Genome-wide suppressor screens can be performed by
systematically knocking out each gene in a mutant background, either through
targeted gene deletion or recombination with a collection of knockout strains. C: Dosage
suppression interactions can be screened by altering the regulation of
each gene in the genome through plasmid-based overexpression, RNAi, or
CRISPRa/CRISPRi.
Genome-wide knockout screens: Large-scale screens for genetic interactions have
been performed by systematically deleting each gene in the genome in a mutant
background (Figure B.3B). In yeast for instance, this has been accomplished by crossing
strains that carry a query mutation to a genome-wide collection of gene deletion strains to
generate every possible double deletion mutant (BARYSHNIKOVA et al. 2010; COSTANZO
et al. 2010). With the availability of CRISPR/Cas9 gene editing technologies (CONG et
al. 2013; DICARLO et al. 2013; MALI et al. 2013), similar genome-wide genetic
interaction screens can now be implemented in other species. A caveat here is that some
suppressors are gain-of-function mutations (SOPKO et al. 2006; COSTANZO et al. 2016),
which may not be identifiable in a screen focused on gene knockouts.
Genome-wide screens involving overexpression or silencing: Dosage suppression
occurs when a mutant phenotype is rescued by overexpression of another gene
(MAGTANONG et al. 2011) (Figure B.3C). High-copy plasmid libraries have been used to
successfully identify dosage suppressors (JONES et al. 2008; MAGTANONG et al. 2011),
and similar screens are now possible using CRISPR/Cas9 activation (CRISPRa) (PEREZ-
PINERA et al. ; KONERMANN et al. ; DOMINGUEZ et al. 2016). Much like overexpression
screens, genes that act as suppressors when they are downregulated can also be identified
using approaches that instead repress transcription at interacting genes using RNA
interference (RNAi) (LEHNER et al. 2006; TISCHLER et al. 2006) or CRISPR/Cas9
interference (CRISPRi) (GILBERT et al. ; LARSON et al. 2013; QI et al. ; TANENBAUM et
al. 2014).
In the next section, we discuss some of the general insights that have been gained from
studies of genetic suppression to date.
B.3.2 Functional mechanisms that cause genetic suppression
Past studies, including recent work using the approaches mentioned above, have
described a number of mechanisms that can lead to genetic suppression (PRELICH 1999;
HODGKIN 2005; MAGTANONG et al. 2011; COSTANZO et al. 2016). These mechanisms
include:
Reverting a mutated amino acid: Intragenic suppressors in coding regions may change
a mutated codon so that it specifies an amino acid that is structurally or biochemically
similar to the one that was initially present (PRELICH 1999). Intergenic suppressors in
tRNAs can have a similar effect by changing the codons that specify a particular amino
acid (BOTSTEIN 2015).
Restoring the structural conformation or dosage of an mRNA, protein, protein
complex, or cellular component: Intragenic suppressors can impact transcript or protein
stability, thereby allowing a mutated gene to function at closer to wild type levels
(PRELICH 1999; HODGKIN 2005). Intergenic suppressors might affect the physical
interaction between proteins by altering sites of protein-protein interaction (SANDROCK et
al. 1997), increasing the levels of available binding partners (MAGTANONG et al. 2011),
or even enabling a protein complex or cellular component to function in the absence of
the originally mutated gene’s protein product (LIU et al. 2015).
Changing the dosage of a mutated gene’s cognate mRNA or protein: Both intragenic
and intergenic suppressors may directly alter the levels of a mutated gene’s cognate
mRNA or protein (PRELICH 1999; HODGKIN 2005; MAGTANONG et al. 2011; COSTANZO
et al. 2016). Changing dosage in this way may compensate for the reduced activity of a
gene product due to the destabilizing effect of an initial mutation. Intragenic suppressors
that act in this way may occur in cis regulatory elements, whereas intergenic suppressors
could occur in transcription factors and their regulators (BOTSTEIN 2015).
Modified activity within a pathway: An intergenic suppressor may occur in a gene that
is in the same pathway as the original mutation, thereby restoring wild type activity levels
within that pathway (COSTANZO et al. 2016). For instance, one gene may activate
downstream targets of the pathway, while the other represses these targets. A mutation in
the activator that disrupts the balance between these regulators could inactivate the
pathway. Loss-of-function in the corresponding repressor or gain-of-function in a parallel
activator in the same pathway could then suppress the effect of the first mutation and
restore the pathway’s function.
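The activator and repressor example can be made concrete with a toy model. This is a sketch under simplifying assumptions: pathway output is modeled as total activation minus repression, the pathway counts as functional when output reaches a hypothetical threshold, and all activity values are invented for illustration.

```python
def pathway_output(activator: float, repressor: float,
                   parallel_activator: float = 0.0) -> float:
    """Toy model: output is total activation minus repression, floored at 0."""
    return max(0.0, activator + parallel_activator - repressor)

THRESHOLD = 0.5  # hypothetical output needed for the pathway to function

# Hypothetical activity levels for each genotype.
genotypes = {
    "wild type": dict(activator=1.0, repressor=0.4),
    "activator mutant": dict(activator=0.5, repressor=0.4),
    "+ repressor loss-of-function": dict(activator=0.5, repressor=0.0),
    "+ parallel activator gain-of-function": dict(
        activator=0.5, repressor=0.4, parallel_activator=0.5),
}

results = {name: pathway_output(**levels) >= THRESHOLD
           for name, levels in genotypes.items()}
for name, functional in results.items():
    print(f"{name}: {'functional' if functional else 'inactive'}")
```

In this model the activator mutation alone drops output below the threshold, and either suppressor restores it, which mirrors the two routes to suppression described above.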
Changes in activity between pathways: Intergenic suppressors can also occur in
pathways that perform functions related to that of the pathway containing the original
mutation, which may be able to functionally compensate for altered activity in the initial
mutant. However, research suggests this form of genetic suppression is less prevalent
than some of the other mechanisms described above (DIXON et al. 2009; BARYSHNIKOVA
et al. 2010; COSTANZO et al. 2016). This could in part be due to the negative effects of
network rewiring: a gain-of-function suppressor mutation may disrupt
the original function of the gene that harbors it, resulting in a fitness
cost.
Global changes in transcription, translation, or other cellular processes: Suppressors
may act in a more general manner by influencing overall levels of transcription and
translation in cells (COSTANZO et al. 2016). For example, a mutation that reduces the
expression of a gene might be suppressed by another mutation that decreases protein
degradation.
As described in this section, lab experiments have been used to comprehensively
determine the mechanisms that can give rise to genetic suppression. Information from
these experiments is a valuable research tool for considering how genetic suppression
might occur in other contexts, such as in natural populations.
B.3.3 Examples of genetic suppression in natural populations
Although genetic suppression has historically been a focal point for researchers
interested in dissecting pathways and genetic networks (COSTANZO et al. 2016), it is now
becoming increasingly important to scientists who study heritable phenotypic variation
within natural populations. Multiple examples of suppression have been identified in
different species, particularly yeast (DOWELL et al. 2010; HOU et al. 2015) and humans
(CHEN et al. 2016). Such naturally occurring suppression might play an important role in
evolution and disease.
One of the most striking examples of naturally occurring suppression comes from
genes that are essential in only certain individuals within a species. Essential genes
encode fundamental cellular functions that are required for viability. However, which
genes are essential varies from individual to individual because of differences in their
genetic backgrounds. To demonstrate this point, Dowell et al. knocked out nearly all of
the roughly 5,000 genes in two strains of the budding yeast Saccharomyces cerevisiae
(DOWELL et al. 2010). By doing this, they found 57 genes that were essential in only one
strain or the other. This represents nearly 6% of the genes that were essential in either of
the strains.
In another study focused on yeast, Hou et al. found that naturally occurring
suppressors can have a large effect on the viability of segregants in S. cerevisiae crosses
(HOU et al. 2015). Specifically, they identified a genetic variant in a tyrosine tRNA that
allows read-through of TGA stop codons. This tRNA variant suppresses a nonsense allele
in a mitochondrial cytochrome c-oxidase (COX15), thereby reverting the respiratory
defect shown by individuals with the COX15 nonsense allele.
Large-scale whole genome resequencing studies in humans have found similar
results to the work in yeast. For example, an analysis of ~590,000 genomes identified 13
adults who were healthy even though they possessed disease-associated genotypes at
fully penetrant early onset Mendelian disease loci (CHEN et al. 2016). In another study,
sequencing of exomes from 3,222 highly related individuals identified 1,111 homozygous
variants with predicted loss of function in 781 genes (NARASIMHAN et al. 2016). Despite
the fact that some of the variants had previously been associated with diseases, no
significant correlation was observed between an individual’s genotype and health record
(NARASIMHAN et al. 2016). Resequencing studies have shown that other species also
harbor large amounts of rare loss-of-function variation in phenotypically important genes (e.g.,
(LITI et al. 2009; BERGSTROM et al. 2014; 1001 GENOMES CONSORTIUM 2016)).
Additionally, Jordan et al. used comparative genomics to discover intragenic
suppressors of human disease mutations that were present in other species (JORDAN et al.
2015). They first characterized the extent to which disease alleles have become fixed in 100 non-
human vertebrates. Up to 12% of the queried variants showed fixation in at least one non-
human genome, suggesting that other genetic differences in these outgroups ameliorated
the effects of the disease alleles. Examination of orthologous amino acid sequences using
a computational model facilitated the identification of potential intragenic suppressors,
several of which were experimentally validated.
These findings in yeast, humans, and other species reflect a broader reality that
genetic background often plays a strong role in influencing how large effect mutations
impact phenotype (NADEAU 2001; CHANDLER et al. 2013). In some cases, alleles that
appear to have Mendelian effects in some genetic backgrounds can show more
complicated phenotypic impacts when other backgrounds are considered (HOU et al.
2016). Also, some mutations may affect viability in a more probabilistic manner, with
different genotypes varying in their propensities to survive a particular genetic
perturbation. For example, Paaby et al. found extensive genetic variation in embryonic
lethality among Caenorhabditis elegans strains by using RNAi knockdown of important
developmental regulators (PAABY et al. 2015).
Building on the results in this section, as well as similar studies that were not
discussed, it is important to determine the genetic and molecular mechanisms that cause
individuals to show different responses to the same large effect mutations.
B.3.4 The genetic and molecular basis of naturally occurring genetic suppression
Characterizing the mechanisms that underlie naturally occurring suppression can
provide novel insights into how genetic variation within populations can modify the
responses of individuals to new mutations. As a starting point for considering this
problem, one must appreciate that populations often harbor large amounts of genetic
variation, which can rewire the pathways and networks that give rise to phenotype
(WAGNER 2005; CILIBERTI et al. 2007; FELIX AND WAGNER 2008; MASEL AND SIEGAL
2009; WAGNER 2012; SIEGAL AND LEU 2014; MATSUI et al. 2015; TAYLOR AND
EHRENREICH 2015b). These changes can cause individuals to differ in the genes they
require to be healthy or viable, or to express a given trait.
As with most traits that segregate within species, suppression in natural
populations may have a complex genetic basis that involves multiple variants (Figure
B.1). Supporting this view, crossing experiments in budding yeast indicate that most
conditional essentialities are mediated by two or more genetic variants (DOWELL et al.
2010; HOU AND SCHACHERER 2017). More generally, work on other types of background
effects has shown that response to mutations can be mediated by higher-order sets of
variants that interact not only with a mutation, but also with each other (CHARI AND
DWORKIN 2013; CHANDLER et al. 2014; TAYLOR AND EHRENREICH 2014; TAYLOR AND
EHRENREICH 2015b; LEE et al. 2016; TAYLOR et al. 2016) and potentially even the
environment (LEE et al. 2016). However, only a small number of background effects
have been comprehensively teased apart at the genetic level (CHANDLER et al. 2013),
leaving open the possibility that other genetic architectures, such as those in which a
mutation acts as a hub of pairwise genetic interactions with many different variants
(FORSBERG et al. ; SCHELL et al. 2016), could also be important.
Moving forward, it is necessary to better determine the genetic and molecular
basis of suppression in natural populations. Such work could be difficult in humans, but
is feasible in model systems that facilitate comprehensive genetic dissection of complex
traits (EHRENREICH et al. 2010; BLOOM et al. 2013). Research along these lines can
answer questions about the number and molecular functions of involved genes and
genetic variants, and can clarify the relationship between the functional mechanisms of
suppression in lab experiments and natural populations.
Given the genetic complexity that likely underlies some cases of suppression in
natural populations, intergenic suppressors may be more likely to contribute in nature
than intragenic suppressors because of their significantly larger target space for
accumulating genetic variation. Beyond this distinction, categorizing the functional
mechanisms that are responsible may not be as straightforward as in lab experiments.
Naturally occurring suppression might involve a number of variants with small effects on
molecular function, which may individually alter the structures of mRNAs and peptides,
enzymatic activities, or transcript and protein levels in subtle ways (e.g., (TAYLOR et al.
2016)). Some of these variants may influence individual genes, whereas others may have
more global effects, in some cases influencing the expression of a large fraction of the
genome. Combinations of these variants could then collectively achieve a similar
functional effect to the individual suppressors typically seen in lab experiments (Figure
B.4).
Figure B.4 Combinations of genetic variants may cause suppression in natural populations.
A: A signaling network may regulate the expression of a transcript that is required for viability.
B: Knockdown of a key component of the network (red) can result in loss of a required transcript.
C: As is typically observed in lab experiments, a large effect mutation (blue) elsewhere in the
pathway can suppress the deleterious effects of the initial mutation and restore activity. D: In
natural populations, multiple variants with small functional effects within the pathway (light blue)
can collectively result in suppression if their combined effect on network activity is sufficient to
achieve levels of pathway output required for viability.
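The contrast between panels C and D can be illustrated with a small enumeration. This is a sketch assuming a purely additive model in arbitrary integer units; the threshold, the variant names, and the effect sizes are all hypothetical. It lists the smallest sets of variants whose combined effect restores the pathway output required for viability.

```python
from itertools import combinations

REQUIRED_OUTPUT = 100   # hypothetical pathway output needed for viability
KNOCKDOWN_OUTPUT = 40   # hypothetical output remaining after the initial mutation

# Hypothetical variants and the amount each adds to pathway output.
large_effect = {"S": 70}
small_effects = {"v1": 20, "v2": 25, "v3": 15, "v4": 10}

def is_viable(variants, effects):
    """Additive model: viability requires total output to reach the threshold."""
    return KNOCKDOWN_OUTPUT + sum(effects[v] for v in variants) >= REQUIRED_OUTPUT

def minimal_suppressing_sets(effects):
    """List variant sets that restore viability and contain no smaller
    suppressing set."""
    found = []
    names = sorted(effects)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            already_covered = any(set(f) <= set(combo) for f in found)
            if is_viable(combo, effects) and not already_covered:
                found.append(combo)
    return found

print(minimal_suppressing_sets(large_effect))   # a single large-effect suppressor
print(minimal_suppressing_sets(small_effects))  # a combination of small-effect variants
```

With these invented values, no single small-effect variant or pair is sufficient, and suppression requires three variants acting together. Real variants may also interact non-additively, which this sketch does not capture.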
Characterizing the genetic and functional mechanisms underlying naturally
occurring suppression will provide valuable insights into how combinations of variants
can alter the susceptibility of biological systems to genetic perturbations. This problem
has a fundamental bearing on our understanding of the ways in which new mutations and
pre-existing genetic variation jointly determine the relationship between genotype and
phenotype.
B.4 Conclusion and outlook
Lab experiments enabled by high-throughput genetic and genomic approaches
have helped provide detailed insights into the distinct functional mechanisms that give
rise to genetic suppression. Given that suppression appears to segregate within species,
determining how findings from the lab relate to natural populations is important. Because
of the large amount of genetic variation in these populations, naturally occurring
suppression may in some cases be more complex at the genetic and molecular levels than
suppression studied in the lab. Work that is able to tease apart this complexity may
provide new insights into how genetic variation within species alters the susceptibilities of
individuals to large effect mutations.
Abstract
Understanding how standing genetic variants contribute to heritable phenotypic variation is one of the central goals of contemporary genetics. However, for many heritable phenotypes of interest, genome-wide association and linkage studies have only identified a small fraction of the traits’ genetic basis. This is because most heritable phenotypes are genetically complex
Asset Metadata
Creator: Matsui, Takeshi (author)
Core Title: Understanding the genetic architecture of complex traits
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Molecular Biology
Publication Date: 10/15/2018
Defense Date: 07/24/2018
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: epistasis, genetic background effects, genetic heterogeneity, genetics, OAI-PMH Harvest, QTL mapping
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Ehrenreich, Ian (committee chair), Boedicker, James (committee member), Dean, Matt (committee member), Forsburg, Susan (committee member)
Creator Email: 108westwind@gmail.com, tmatsui2@stanford.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-77084
Unique identifier: UC11671386
Identifier: etd-MatsuiTake-6831.pdf (filename), usctheses-c89-77084 (legacy record id)
Legacy Identifier: etd-MatsuiTake-6831.pdf
Dmrecord: 77084
Document Type: Dissertation
Format: application/pdf (imt)
Rights: Matsui, Takeshi
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA