NATURAL VARIATION IN ARABIDOPSIS THALIANA
by
Chunlao Tang
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(MOLECULAR BIOLOGY)
December 2006
Copyright 2006 Chunlao Tang
Dedication
To the 77 months elapsed.
Acknowledgments
My sincere gratitude goes to my advisor, Dr. Magnus Nordborg, and to all other members of
my academic committee: Drs. Nelson Bickers, Ting Chen, Steven Finkel, Paul Marjoram
and Fengzhu Sun. My study would not have been successful or fruitful without their guidance
and assistance.
I especially thank Linda Bazilian, our departmental graduate student manager, and the late
departmental graduate student coordinator, Bill Trusten. My USC life started with their kind
assistance from the very first day I arrived here.
I am also indebted to my peers and friends, who have offered help, convenience and
encouragement throughout my student life at USC. Every kindness, large or small, is precious to
me. This period has become a special and sweet memory because of the flavors
they added.
Part of my work was collaborative, and I received a lot of assistance from other
lab members. In particular, I thank Dr. Maria Jose Aranzana, Dr. Honggang Zheng and Vijaya
Rao for plant growing and flowering time phenotyping, Dr. Yoko Ishino and Tina Hu for
the pilot project data management, and Keyan Zhao for association mapping. I also thank Dr.
Jenny Hagenblad for the Aly8 sequencing data and Dr. June Nasrallah's lab for the S-locus data.
Last but not least, I deeply appreciate the invaluable support and sacrifice from my whole
family.
Table of Contents
Dedication
Acknowledgments
List of Figures
Abstract
1 GENERAL INTRODUCTION
1.1 History of Arabidopsis thaliana as a model plant
1.2 Natural variation in Arabidopsis thaliana
1.3 Utilizing natural variation
1.3.1 Forward genetics and reverse genetics
1.3.2 Linkage and association mapping
1.3.3 Association mapping in Arabidopsis thaliana
1.4 Overview
2 HAPLOTYPE STRUCTURE AND PHENOTYPIC ASSOCIATION AT THE
FLOWERING TIME LOCUS FLC
2.1 Introduction
2.2 Materials and Methods
2.2.1 Plant samples and phenotyping
2.2.2 Genotyping
2.2.3 Association analysis methods
2.3 Results
2.3.1 Pilot project
2.3.2 Follow-up project
2.4 Discussion
3 NATURAL VARIATION FOR SEED DORMANCY
3.1 Introduction
3.2 Materials and Methods
3.2.1 Plant materials and genotyping
3.2.2 Plant cultivation and flowering time phenotyping
3.2.3 Seed collection and storage
3.2.4 Germination test and dormancy measurement
3.2.5 Climatic data
3.3 Results and Discussion
3.3.1 Seed dormancy measured in DSDS50
3.3.2 Non-random geographical distribution and microgeographic variation
3.3.3 Latitudinal distribution
3.3.4 Relation with flowering time
3.3.5 Relation with temperature and precipitation
3.4 Conclusion
4 DNA POLYMORPHISM AT THE SELF-INCOMPATIBILITY LOCUS
4.1 Introduction
4.1.1 The mechanism of self-incompatibility in Brassica
4.1.2 Self-incompatibility in Arabidopsis lyrata
4.1.3 Self-compatibility and S-locus genes in Arabidopsis thaliana
4.2 Materials and Methods
4.2.1 DNA samples
4.2.2 PCR amplification and DNA sequencing
4.2.3 Genome walk
4.2.4 DNA sequence alignment and analysis
4.3 Results
4.3.1 ARK3 and ΨSRK are both highly polymorphic regions
4.3.2 ΨSCR is also highly polymorphic, maybe even more
4.4 Discussion
Bibliography
List of Figures
1.1 Global distribution of Arabidopsis thaliana. Modified from Jonathan Clarke
(www.arabidopsis.org)
2.1 Flowering time pathways in Arabidopsis thaliana showing the central role
of FLC.
2.2 Origins and flowering time for USC samples in the pilot project. See Figure 2.3.
2.3 Origins and flowering time for the Salk Institute samples in the pilot project.
Notes: DTF: days to flower; fri-Col: fri^Col; fri-Ler: fri^Ler; a: 1.2kb FLC
intron 1 insertion; b: 4.2kb insertion assay detects >5 kb insertion; c: 4.2kb
FLC intron 1 insertion; d: FLC insertion data not available; e: FRI genotype
not available.
2.4 Primers used for PCR amplification. The positions of the target sequences on
chromosome V are also shown.
2.5 Pair-wise marker-trait association around FLC. Association results with all
data (top) and with only non-carriers of the two null FRI alleles (bottom).
2.6 Haplotype-trait association around FLC. Single-fragment haplotype analysis
detects no significant signal (a), while the spatial clustering algorithm does detect
one right at the FLC gene (b). This association is caused by very late-flowering
Swedish accessions.
2.7 Haplotypes and intron 1 insertions in the FLC region.
3.1 The extensive distribution of mean DSDS50 for 83 accessions is shown in
the histogram (a) and the well-controlled deviation within replicates for each
accession is shown in the boxplot (b).
3.2 Examples of germination ratio tests. Generally, seed germination increased
monotonically over the storage time and there was a short logarithmic phase
of rapid change (e.g. C24); an exception was seen with Cvi-0, in which the
change is irregular: after 50 days, the germination ratio was highly variable.
3.3 Accessions and their seed dormancy in DSDS50 (Part 1). See Figure 3.4 for
notes.
3.4 Accessions used and their seed dormancy in DSDS50 (Part 2). Other related
data are also included: latitudes and longitudes of seed origins, average
July precipitation (Precip), annual average air temperature (Temp), and flowering
time (days to flower or DTF) under short-day conditions with vernalization (SD+V)
and long-day conditions without vernalization (LD-V). Note that the interpolated
temperature and precipitation data cannot reflect the microgeographic conditions.
3.5 Geographic distribution of seed dormancy.
3.6 Relationship between seed dormancy in DSDS50 and latitudes of the seed
origins. N.SE: north Sweden, S.SE: south Sweden, C. Asia: Central Asia,
FI: Finland, SP: Spain; flc in: with FLC intron 1 insertion, fri Col: with
fri^Col allele, fri Ler: with fri^Ler allele. This legend also applies to Figures
3.7-3.10.
3.7 Relationship between seed dormancy in DSDS50 and flowering time under
short-day conditions with vernalization.
3.8 Relationship between seed dormancy in DSDS50 and flowering time under
long-day conditions without vernalization.
3.9 Relationship between seed dormancy in DSDS50 and flowering time under
long-day conditions with vernalization.
3.10 Relationship between seed dormancy in DSDS50 and flowering time under
short-day conditions without vernalization.
3.11 Relationship between seed dormancy in DSDS50 and annual average
temperature from 1950 through 1996.
3.12 Relationship between seed dormancy in DSDS50 and average July precipitation
from 1950 through 1996.
4.1 Self-incompatibility response in Brassica (reprinted from Goring and Walker
2004). During self-pollination, the specific recognition and binding between
the SP11/SCR ligand and the SRK receptor trigger a signalling cascade
resulting in self-infertility. Two components of this pathway, MLPK and
ARC1, have been identified, but more remain to be discovered.
4.2 Phylogenetic relationship of Arabidopsis spp. and Brassica. Most taxa
exhibit sporophytic self-incompatibility, but selfing has evolved multiple times
(e.g. A. thaliana and C. bursa-pastoris).
4.3 (a) Resequenced regions (underlined regions) in ARK3, ΨSRK and ΨSCR1
in A. thaliana. The filled boxes are exons (the first exons for ARK3 and
ΨSRK are S-domains); (b) The overall structure of the S-locus in A. thaliana
(Col-0) (reprinted from Shimizu, Cork et al. 2004). The figure also shows the
diversity of each region according to the results of Shimizu et al. (Shimizu,
Cork et al. 2004).
4.4 DNA sequence polymorphism at the ARK3 gene. The nucleotide positions are
counted starting at -99nt upstream of the start codon.
4.5 Neighbor-joining tree constructed for ARK3 S-domain sequences of 93 A.
thaliana and Aly8 sequences of 10 A. lyrata accessions.
4.6 PCR results and DNA sequence polymorphisms of 96 A. thaliana accessions
at the ΨSRK and ΨSCR1 regions along with the ARK haplogroups. The
nucleotide position is counted from the start codon.
Abstract
Naturally occurring variation is an important alternative resource for functional genetics and
genomics research.
In a well-characterized collection of Arabidopsis thaliana accessions, we investigated
variation in flowering time and seed dormancy. We applied association mapping to the
candidate gene FLC and detected a strong association signal at FLC attributable to the
very late flowering Swedish accessions. In addition, we found several geographically
restricted clusters of early flowering accessions. Among them, Central Asian accessions were
previously shown to carry weak FLC alleles. We also identified eight types of large intronic
insertions, including the two previously known transposable element insertions, found
in Ler and Da1-12 respectively, that cause the weak function of FLC in these accessions.
The other intronic insertions likely have the same functional effect. This result indicates that
many alleles of this kind have evolved independently at different geographic origins.
Seed dormancy is a difficult trait to quantify, and its pattern of natural variation is largely
unknown. Using the after-ripening requirement, measured as the days of dry storage
required to reach a 50% germination ratio (DSDS50), we phenotyped seed dormancy in this
population and found clear evidence of both nonrandom global patterning and micro-geographic
variation that relates to local environmental conditions.
The final part of this thesis considers the evolution of selfing in A. thaliana. A. thaliana
is highly selfing, while many of its close relatives, including A. lyrata, are self-incompatible.
One recent study claimed that self-incompatibility was lost in A. thaliana recently through
a selective sweep at one of the S-locus genes, SCR, based on the finding that the region
of ΨSCR1, a pseudogenized ortholog of SCR in A. thaliana, is very low in diversity. We
demonstrate that this region is actually highly polymorphic in A. thaliana. When and how
the transition to selfing happened in A. thaliana remains unclear; this question could be
addressed by investigating both within-species and between-species natural variation.
Chapter 1
GENERAL INTRODUCTION
1.1 History of Arabidopsis thaliana as a model plant
Arabidopsis thaliana, a small mustard commonly known as “mouse-ear cress”
or “thale cress”, is an annual, dicotyledonous, hermaphroditic flowering plant (Meyerowitz
1987). It is named after Johannes Thal, who described the species in the sixteenth
century. A. thaliana belongs to the family Brassicaceae, which also includes important crops
such as rape, cabbage and radish. Unlike most other Brassicaceae members, A. thaliana is
highly selfing (>95%) (Abbott and Gomes 1989). A. thaliana has no agronomic value,
but has many advantages as a model research organism.
The potential of A. thaliana as a model organism for genetics was first reported in the 1940s by
Friedrich Laibach, who noted its many suitable properties: prolificacy, rapid growth,
easy cultivation in limited space, abundant natural variation, fertile hybrids and a small
number of chromosomes. By the late 1980s, A. thaliana was well established as a model plant
for physiology, biochemistry and genetics research, thanks partly to advances in easy and
efficient transformation protocols and to its amenability to detailed molecular analysis due to
its small genome size (reviewed by Sommerville and Koornneef 2000). Because of its very
small genome (∼125 Mb) and relative lack of repeats, A. thaliana became the first
plant species with a sequenced genome in 2000 (Arabidopsis Genome Initiative, 2000). The
short period since then has witnessed further rapid growth in functional genomic studies
with this species (Bevan and Walsh 2005).
The knowledge acquired through studying A. thaliana will benefit us not only in understand-
ing the basic biological questions about plants and other eukaryotes, but also in improving
agronomic traits in crops. Sequence comparison analysis has shown that most Arabidopsis
genes have homologs in crop plants, e.g. about 70% of the Arabidopsis genes have homology
with rice genes and 50% for reverse comparison (Schoof, et al., 2003). Thus, A. thaliana can
function as a platform to conduct early-stage basic research before we can effectively deal
with crops on agriculturally important traits.
1.2 Natural variation in Arabidopsis thaliana
A. thaliana is widely distributed (Figure 1.1), and over 750 natural accessions collected from
around the world are available at two seed stock centers, ABRC at the Ohio State University,
USA and NASC at the University of Nottingham, UK. Accessions inhabiting distinct
environments with biotic or abiotic stresses such as frost, drought and salt would be expected to
have adapted to their local environments. Variation has been described for a variety of traits
in A. thaliana, including developmental traits (e.g. flowering time), physiological traits (e.g.
seed dormancy) and biochemical traits (reviewed by Alonso-Blanco and Koornneef 2000).
Recently, natural variation in the expression of self-incompatibility after introduction of
foreign genes was also described (Nasrallah, Liu et al. 2004). There is tremendous interest in
understanding the molecular genetic basis that underlies this phenotypic variation. Advances
in modern genomics have greatly enhanced our ability to accomplish this objective.
Figure 1.1: Global distribution of Arabidopsis thaliana. Modified from Jonathan Clarke
(www.arabidopsis.org)
1.3 Utilizing natural variation
1.3.1 Forward genetics and reverse genetics
The availability of genome sequences greatly facilitates the study of individual
genes. More than 30,000 genes have been predicted in A. thaliana (TAIR,
http://www.arabidopsis.org). However, describing gene function remains a tough challenge: at
least one-third of predicted genes are unclassifiable, and less than 10% have been characterized
experimentally (Arabidopsis Genome Initiative 2000). Taking advantage of the whole-genome
sequence of A. thaliana, a systematic reverse genetics approach can identify gene functions
by observing the phenotypic changes after point mutation, insertional knockout (especially
genome-wide T-DNA insertion) or RNAi knockdown (e.g. McCallum, Comai et al.
2000; Alonso, Stepanova et al. 2003).
However, the reverse genetics approach has some inherent disadvantages. First, it is very difficult
to profile phenotypic or metabolite changes without any prior knowledge of the candidate
genes, and pleiotropy can further complicate the phenotypic analysis; second, knockout of
essential genes causes lethality, so mutants for these genes cannot be recovered; third,
in specific genetic backgrounds a mutation may produce no identifiable phenotype due to a
null/weak allele, gene redundancy or epistatic interaction. Partly for these reasons, forward genetics,
which starts from phenotypic variants and tracks down the underlying gene(s), continues to play an
important role in functional genomics. A particular version of forward genetics is the
utilization of naturally occurring variation (Alonso-Blanco and Koornneef 2000). Using naturally
occurring variation not only circumvents the problems associated with mutant phenotypic
expression or identification described above, but also provides a diversity of functional alleles with
a wealth of sequence polymorphisms, which could harbor important information about the
molecular basis of gene function and its evolutionary history. Modern genomics techniques
are making it more effective to utilize these natural resources.
1.3.2 Linkage and association mapping
The critical step in forward genetics studies is to map genes to genetic loci delimited by
molecular genetic markers. The resolution of the localized regions largely decides the outcome
of the subsequent positional (map-based) cloning. In most cases, naturally occurring
variation is quantitative, i.e. the phenotype is controlled by multiple quantitative trait loci
(QTLs) and is thus continuously distributed. The fundamental principle of genetic mapping is
that markers physically closer to a functional locus show higher correlation with the
phenotype controlled by this locus, because recombination between two loci gradually breaks down
this correlation over generations. With traditional family-based QTL linkage
mapping (QTL mapping), QTLs can only be mapped to cM-scale regions: the limited
number of recombination events in a mapping population leaves large intact fragments
(identical-by-descent or IBD fragments) around the functional loci, so no significant
difference in phenotypic correlation can be detected among genetic markers across the whole
IBD region. There could be hundreds of genes located within such a wide region, and
the subsequent positional cloning will be tedious if not impossible. Non-pedigree, population-based
association mapping utilizing linkage disequilibrium (LD) may have an advantage in this
respect. A natural population consists of “unrelated” individuals that are actually “distantly
related”, among which numerous historical recombination events have shrunk the IBD
fragments. This makes it possible to narrow a gene of interest down to perhaps a kb-scale fragment,
depending on the LD pattern around the functional gene, and subsequently to reduce or even
eliminate the work of positional cloning.
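The decay of marker-locus correlation by recombination can be illustrated with the textbook
relation D_t = D_0 (1 - r)^t, where D is the LD coefficient, r the recombination fraction and
t the number of generations. The sketch below is only an illustration under stated assumptions
(random mating, no selection, drift or migration); the function name and the numerical values are
hypothetical and are not taken from this thesis. In a highly selfing species such as A. thaliana
the effective recombination rate is lower than under random mating, so decay would be slower.

```python
def expected_ld(d0: float, r: float, generations: int) -> float:
    """Expected LD coefficient D after a number of generations.

    Assumes random mating and no selection, drift or migration, so that
    D_t = D_0 * (1 - r) ** t, where r is the recombination fraction.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return d0 * (1.0 - r) ** generations


if __name__ == "__main__":
    # Hypothetical marker pairs: tightly linked to loosely linked.
    for r in (0.0001, 0.01, 0.1):
        print(f"r={r}: D after 100 generations = {expected_ld(0.25, r, 100):.6f}")
```

A family-based mapping population corresponds to a small t, so even loosely linked markers remain
correlated with the functional locus; a natural population corresponds to a large t, so only
tightly linked markers do.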
Association mapping can be performed either as a candidate gene approach or as a
genome-wide scan. Candidate gene association studies can be used to confirm or fine-map a previously
known gene, while genome-wide scans can additionally identify novel genes of interest. Either
way, many factors confound the association analysis. A small phenotypic effect
size or low penetrance of certain alleles, low allele frequency in the sample, and genetic
heterogeneity (multiple functional genetic loci or multiple allelic variants at one locus contributing to the
phenotypic variation) will reduce the detection power. Furthermore, population structure can
inflate the false positive rate (type I error). Overall, although association mapping is a potentially
powerful tool, its efficiency is determined by the genetic architecture of the trait itself, the
history of the mapping population and the study design (Cardon and Bell 2001; Terwilliger and
Hiekkalinna 2006).
1.3.3 Association mapping in Arabidopsis thaliana
With the whole-genome sequence in hand and rapid advances in techniques for DNA
sequencing and polymorphism detection, typing polymorphisms at high density in large
populations is becoming more affordable, thereby rendering association mapping a financially more
practical option. A. thaliana has been shown to be suitable for association mapping in
terms of the extent of its LD, which was estimated to be 25-250 kb (Nordborg, Borevitz et al. 2002;
Nordborg, Hu et al. 2005). Furthermore, because it is highly selfing, individuals exist
as almost completely inbred lines. A trait of interest can therefore be measured repeatedly under
the same or different conditions for the same set of genotypes, and many traits can be mapped using
the same set of DNA polymorphism data. These properties effectively reduce the genotyping cost
per phenotype and make multiple-trait analysis in the same population more feasible.
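The extent of LD is commonly summarized by the squared correlation r^2 between pairs of markers.
As a minimal sketch, and not the estimator used in the studies cited above, r^2 for two
biallelic markers can be computed as follows, assuming that each accession is a fully inbred line
so that its genotype at a marker can be coded as a single allele (0 or 1). The function name and
the example data are hypothetical.

```python
def r_squared(marker_a, marker_b):
    """Squared allelic correlation between two biallelic markers.

    Each list holds one allele code (0 or 1) per inbred accession.
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("markers must be non-empty and of equal length")
    n = len(marker_a)
    p_a = sum(marker_a) / n  # frequency of allele 1 at marker A
    p_b = sum(marker_b) / n  # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in zip(marker_a, marker_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b     # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical genotypes for six accessions at two markers.
    print(r_squared([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]))
```

Plotting r^2 against the physical distance between marker pairs gives the kind of LD decay curve
from which an extent such as 25-250 kb is read.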
1.4 Overview
My thesis work is based on a global A. thaliana population of 96 accessions (Nordborg, Hu
et al. 2005). We are interested in the geographical distribution and phenotypic variation of
important traits such as flowering time and seed dormancy. Our ultimate goal is to utilize
natural variation to help dissect the underlying genetic basis. Association mapping is rapidly
emerging as an important alternative approach to achieve this goal. However, many
factors affect the effectiveness of this method, so we would like to test its
feasibility in this model species. As one such effort, I have investigated the DNA sequence
polymorphism surrounding the candidate gene FLC and analyzed the phenotypic association
in an attempt to localize this gene.
Seed dormancy is an important physiological trait, but little is known about its global
distribution and pattern of variation, partly because the trait is difficult to phenotype. Therefore,
I have surveyed seed dormancy in our global sample by measuring the after-ripening
requirement (dry storage time) for breaking dormancy. The results will be instrumental
for designing future genetic studies of this trait.
A. thaliana is highly selfing, while its close relatives, such as A. lyrata, are mostly
self-incompatible. Self-incompatibility is known to be controlled by the so-called S-locus genes or
S-genes, which are all pseudogenized in A. thaliana according to the reference Col genome sequence.
S-genes could be the only target of natural selection for selfing, but could also be only one
of several such targets. Thus, one interesting question is when and how self-incompatibility (SI)
was lost in A. thaliana. We examined the polymorphism pattern in the S-locus region to see whether
this evolutionary history has left any footprint.
Chapter 2
HAPLOTYPE STRUCTURE AND PHENOTYPIC
ASSOCIATION AT THE FLOWERING TIME LOCUS FLC
2.1 Introduction
Flowering is a significant event in the life cycle of a flowering plant, and it is critical to
time this event properly so that the external conditions are favorable for pollination and
subsequent seed development. Plants have developed multiple endogenous pathways that coordinate
with environmental cues to quantitatively control this process, which has been
studied extensively in A. thaliana. In this species, several major pathways have been
identified, including two pathways responding to external cues (the photoperiod and
vernalization pathways) and two pathways responding to endogenous cues (the gibberellin and
autonomous pathways) (Figure 2.1).
A. thaliana is a facultative long-day annual flowering plant. Sensing the increase in daylength
in spring and summer through the photoperiod pathway genes, plants start to flower,
accompanied by up-regulated expression of the integrator genes FT and SOC1. Before this can happen,
some accessions, so-called “winter annuals”, require a prolonged cold exposure
(vernalization), while others, so-called “summer annuals”, do not. This vernalization requirement is
conferred by a MADS-box floral repressor gene, FLOWERING LOCUS C (FLC), expression
of which is up-regulated by, among others, the gene FRIGIDA (FRI), but down-regulated by the
autonomous pathway genes and vernalization. In most summer annuals, mutant FRI and/or
FLC alleles are found to cause low expression of FLC (Johanson, West et al. 2000; Sheldon,
Rouse et al. 2000; Michaels and Amasino 2001; Michaels, He et al. 2003). The mechanisms
behind the regulation of FLC expression have been, or are being, unraveled, and often
appear to involve epigenetic transcriptional or post-transcriptional control (Rouse, Sheldon
et al. 2002; Simpson 2004; Quesada, Dean et al. 2005; Sung and Amasino 2005).
The relationship between DNA methylation and flowering time has long been recognized in
A. thaliana (Kakutani, Jeddeloh et al. 1995; Sheldon, Burn et al. 1999), and a recent study
showed that low DNA methylation represses FLC expression, accompanied by reduced
trimethylation of histone H3K4 and reduced acetylation of both histones H3 and H4 around the
promoter and translation start of FLC, changes that were also observed during vernalization. However,
the low-methylation effect was independent of VIN3, a gene required for the vernalization
pathway (Jean Finnegan, Kovac et al. 2005). Thus, the mechanisms of DNA methylation and
vernalization are at least not identical.
In summary, flowering time is quantitatively controlled by numerous genes in different
pathways, with FLC as one of the central genes. Regulation at different levels (genetic,
epigenetic and post-transcriptional) enables plants to stay finely tuned to distinct and ever-changing
environmental conditions. The major pathways and genes are outlined in Figure 2.1 (Jean
Finnegan, Kovac et al. 2005; Quesada, Dean et al. 2005).
Through mutagenesis studies, about 80 genes have been found to regulate flowering time in
A. thaliana (Levy and Dean 1998; Meinke, Cherry et al. 1998). However, these 80 genes do
not contribute equally to the natural variation in flowering time. FRI has been found to be
a major contributor, largely due to two loss-of-function alleles, fri^Col and fri^Ler, which
were first detected in Col and Ler respectively. They cause early flowering through reduced
expression of FLC (Johanson, West et al. 2000). Some less frequent types of null alleles
have also been observed (Gazzani, Gendall et al. 2003; Shindo, Aranzana et al. 2005).
Figure 2.1: Flowering time pathways in Arabidopsis thaliana showing the central role of
FLC.
FLC, a MADS domain transcription factor that suppresses flowering, is a master gene through
which many other genes or regulators exert their effects on flowering time. Vernalization and
autonomous pathway genes promote flowering by inhibiting FLC expression. In
contrast, FRI maintains high expression of FLC. In winter annuals, FRI and FLC synergistically
inhibit the floral transition in a dosage-dependent manner until after vernalization (Michaels
and Amasino 1999; Sheldon, Burn et al. 1999). All of these are epigenetic processes acting
through modification of FLC chromatin or through RNA processing. Trimethylation of histone H3K4 and
histone acetylation are associated with active FLC expression, whereas histone deacetylation
and dimethylation of H3K9, H3K27 or H3K36 are involved in FLC repression (He, Doyle et
al. 2004; He and Amasino 2005; Zhao, Yu et al. 2005; Guyomarch, Benhamed et al. 2006).
Weak alleles of FLC have been identified in two natural accessions, Ler and Da1-12, with
1.2kb and 4.2kb transposon insertions in intron 1 respectively (Michaels, He et al. 2003).
The transposable element suppresses FLC expression in Ler through epigenetic chromatin
modification (H3K9 dimethylation) mediated by siRNAs generated from homologous
transposable elements in the genome, and the mechanism underlying the Da1-12 allele is likely to be the
same (Liu, He et al. 2004). Several Central Asian accessions, Kondara, Kz-9 and Shahdara,
were also shown to have weak FLC alleles (Johanson, West et al. 2000; Gazzani, Gendall et
al. 2003; Werner, Borevitz et al. 2005), but no transposon insertions were identified in
them and the specific genetic changes remain unknown. Other kinds of mutant FLC alleles
also exist (e.g. two alleles in Van-0 and Bur-0 with mutations in the coding region (Werner,
Borevitz et al. 2005)), but appear to be less frequent. Functional mutations at many other genes,
such as FLM (Werner, Borevitz et al. 2005) and PHYC (Balasubramanian, Sureshkumar et
al. 2006), may also contribute to the natural variation in flowering time, but FRI and possibly
FLC are the only major determinants identified to date. Much of the variation remains
unexplained (Shindo et al. 2005; Werner et al. 2005).
It was previously demonstrated that FRI could be localized to a 35kb region by candidate
gene association mapping, and that the association analysis could be severely confounded by
allelic heterogeneity between the fri^Col and fri^Ler alleles of FRI (Hagenblad, Tang et al.
2004). In the present study, we focus on FLC, using denser DNA sequence data to further test
the feasibility of association mapping in A. thaliana and to see whether we can identify novel
functional mutant alleles of FLC.
2.2 Materials and Methods
2.2.1 Plant samples and phenotyping
We first carried out a pilot project. A total of 112 accessions, obtained from the Nottingham
Arabidopsis Stock Center or personally collected, were grown at USC. The other 84 accessions
were grown at the Salk Institute; these were mainly obtained from the
Arabidopsis Biological Resource Center at Ohio State University.
Flowering time was measured in days, using unvernalized plants under long-day conditions
(16 hr light/8 hr dark). Plants at USC were grown at 18 °C in growth chambers with a
mixture of fluorescent and incandescent bulbs. Plants at the Salk Institute were grown at 22 °C in
chambers with a 3:1 mixture of Cool-white and Gro-Lux (Sylvania) fluorescent bulbs. In both
experiments, plants were regularly rotated within and between shelves to minimize effects of
environmental heterogeneity. Plants that did not flower within 200 days were recorded as 200 days.
We combined the two datasets using the linear function y = 2.20 + 0.65x to transform the data
from the Salk Institute. The linear relation was estimated from the accessions common to
both labs (R^2 = 0.88). The final phenotype results are listed in Figure 2.2 and Figure 2.3
for the USC and Salk Institute experiments respectively.
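As a minimal sketch of how the two datasets could be put on a common scale, the linear function
above can be applied to each Salk Institute measurement. The sketch assumes that x is the days to
flower recorded at the Salk Institute and y is the corresponding value on the USC scale; the text
does not say whether the 200-day cap was applied before or after the transformation, so the two
steps are kept separate. The function and constant names are hypothetical.

```python
CENSOR_DAYS = 200.0  # plants that had not flowered by day 200 were recorded as 200


def censor(days_to_flower: float) -> float:
    """Cap a raw days-to-flower value at the 200-day limit."""
    return min(days_to_flower, CENSOR_DAYS)


def rescale_salk(days_salk: float, intercept: float = 2.20, slope: float = 0.65) -> float:
    """Apply y = 2.20 + 0.65x to a Salk Institute measurement (assumed to be x)."""
    return intercept + slope * days_salk


if __name__ == "__main__":
    # Hypothetical Salk measurements in days to flower.
    for days in (30.0, 100.0, 250.0):
        print(days, "->", round(rescale_salk(censor(days)), 2))
```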
Following the pilot project, we increased the density of sequenced fragments and repeated the
association analysis on several candidate genes. My focus was on FLC. A total of 192 accessions
were used for this follow-up project. Of these, 96 accessions were described by
Nordborg et al. (Nordborg, Hu et al. 2004) and the remainder are described in Shindo et al.
(2005). All accessions were phenotyped as described by Aranzana et al. (Aranzana, Kim et
al. 2005).
2.2.2 Genotyping
DNA was extracted using the DNeasy Plant Mini Kit (Qiagen, CA). Nine fragments for the pilot
project and 16 for the follow-up project around the FLC region were PCR amplified with primers
designed from the reference genome sequence (see Figure 2.4). Promega PCR Premix (cat#
M750X) was used for regular PCR reactions. PCR products were purified using a MagneSil
Yellow (cat # A923) based system (Promega, WI) or MultiScreen - PCR MANU03010 plates
(Millipore, MA). Sequencing reactions, clean-up and loading preparation used the CEQ DTCS -
Quick Start Kit (Beckman Coulter, USA). All fragments were sequenced in both directions
on Beckman CEQ sequencers.
The 1.2kb and 4.2kb insertions characteristic of Ler and Da1-12 were scored using primer
pairs Ler-in and Da1-in1 respectively (Figure 2.4). For the Da1-12-type insertion, the
sequence was confirmed (except in the case of No-0) using primer pair Da1-in2, whose
forward primer was designed within the insertion sequence.
Fifteen fragments were sequenced for the second candidate gene project. Since each of the two
known weak FLC alleles carries a transposable element insertion, we screened for large
insertions within intron 1 in the ninety-six 2010 project accessions using the Roche long
template PCR system (Roche Applied Science). The typed insertions were confirmed by
sequencing to identify their positions.
2.2.3 Association analysis methods
In the pilot project, several methods were tried and compared.
1. Single marker ANOVA test:
We tested each marker locus (only diallelic SNP or indel) individually, using the alleles as
factors in a Kruskal-Wallis “non-parametric” ANOVA. Alleles with frequency lower than five
were not used.
2. Haplotype-based ANOVA test:
We used the same test to look for associations between the phenotype and the haplotypes
given by each sequenced fragment. Singleton alleles were ignored when constructing the
haplotypes, and haplotypes with frequency less than five were ignored.
3. Haplotype-based spatial clustering method:
A clustering algorithm described by Molitor et al. (Molitor, Marjoram et al. 2003(a); Molitor,
Marjoram et al. 2003(b)) was tried. Fragment haplotypes were treated as markers. By
assigning a functional mutation location and a prototype haplotype in each iteration, this
algorithm searches for the haplotype clusters that deviate most in flowering time. Hierarchical
clustering of haplotypes, based on how often they co-occur in the same significant cluster,
was used to visualize the result. For the second, follow-up project, several methods described
by Aranzana et al. (Aranzana, Kim et al. 2005) were used.
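The single-marker test in method 1 can be sketched as follows. This is an illustrative pure-Python computation of the Kruskal-Wallis H statistic, not the code used for the thesis. It assumes that "frequency lower than five" means an allele observed in fewer than five accessions, it applies no tie correction, and the function names and example data are hypothetical.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(phenotypes, alleles, min_count=5):
    """Kruskal-Wallis H for one marker, without tie correction.

    Alleles seen fewer than min_count times are dropped (assumed reading of
    the frequency filter). Returns None if fewer than two alleles remain.
    """
    counts = defaultdict(int)
    for a in alleles:
        counts[a] += 1
    kept = [(p, a) for p, a in zip(phenotypes, alleles) if counts[a] >= min_count]
    groups = {a for _, a in kept}
    if len(groups) < 2:
        return None
    ranks = average_ranks([p for p, _ in kept])
    rank_sums = defaultdict(float)
    sizes = defaultdict(int)
    for r, (_, a) in zip(ranks, kept):
        rank_sums[a] += r
        sizes[a] += 1
    n = len(kept)
    h = 12.0 / (n * (n + 1)) * sum(rank_sums[a] ** 2 / sizes[a] for a in groups)
    return h - 3 * (n + 1)


# Hypothetical flowering times (days) for ten accessions at one diallelic SNP.
flowering = [20, 22, 25, 27, 30, 60, 65, 70, 80, 200]
snp = ["A"] * 5 + ["T"] * 5
print(kruskal_wallis_h(flowering, snp))
```

For a diallelic marker, H has one degree of freedom, so under the usual chi-square approximation values above 3.84 correspond to p < 0.05. Because the test works on ranks, the plants counted as 200 days affect the statistic only through their rank.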
2.3 Results
2.3.1 Pilot project
Pairwise marker-trait associations:
Figure 2.5 shows the association between flowering time and individual markers in the FLC
region. There is no signal of association for FLC, whether or not the carriers of null FRI
alleles are removed.
Haplotype-trait associations:
Haplotype-based methods use the information shared between single markers and thus should
have more power to detect the association. However, simple single-fragment haplotype-trait
association analysis does not improve the result (Figure 2.6.a). When we applied the spatial
clustering algorithm using single fragments as markers, a signal was detected precisely
at the FLC gene (Figure 2.6.b). However, the signal is due to late-flowering Swedish accessions.
While it is possible that the extreme late-flowering phenotype of these accessions results
from variation at FLC, the possibility of a false positive cannot be ruled out without
further studies. Although no signal was detected for early flowering, several accessions
tending to be early flowering were successfully clustered. Among them, the Central Asian
accessions were shown to have weak FLC alleles.
2.3.2 Follow-up project
In the follow-up project, we obtained a result similar to that of the pilot project: a
significant signal of phenotypic association was detected at the FLC gene, but it was largely
due to the very late-flowering Swedish accessions.
In the ninety-six 2010 accessions, we detected 8 types of intron 1 insertions (Figure 2.7),
including the two previously known transposable element insertions that were detected in Ler
and Da1-12 respectively (Gazzani, Gendall et al. 2003; Michaels, He et al. 2003). Central
Asian accessions could not be amplified even after trying several primer pairs. All accessions
with the same insertion share an extended haplotype of at least 100kb, and in many the shared
haplotype spans the whole sequenced region (∼300kb). When we clustered the FLC haplotypes based
on similarity, several clusters showed earlier flowering time. Among them are those with
certain intron 1 insertions: US accessions with the 1.2kb insertion, those with the Da1-12
insertion and Czech accessions. It is likely that these insertions result in weak FLC alleles
as in Ler or Da1-12. We have completely sequenced the ∼1.2kb insertion in the US accessions.
The sequence shows two 9-bp direct repeats characteristic of transposition.
2.4 Discussion
Based on haplotypes in the FLC region, we have identified several interesting clusters of
relatively early-flowering accessions: the Central Asian group, which was demonstrated
to harbor weak FLC alleles, and other groups associated with large insertions in FLC
intron 1 (Figure 2.7), including the two previously known transposable element insertions in
accessions Ler and Da1-12 respectively (Gazzani, Gendall et al. 2003; Michaels, He et al.
2003). These alleles are all likely to be weak in FLC function, and thus partly responsible
for early flowering. However, because their effects are relatively small, and the frequency of
each allele is low, it is hard to detect these mutant FLC alleles by association mapping.
We were able to detect a signal at the FLC gene in both populations by association mapping,
but it is largely due to the extremely late Swedish accessions. Thus we cannot exclude
the possibility of a false positive caused by population structure. However, a recent linkage
mapping study in several F2 crosses between Col-0 and other accessions (including Swedish
accessions and USA accessions with an intron 1 insertion) revealed in every cross a strong QTL
signal for flowering time centering on the FLC gene (Shindo, Lister et al. 2006); thus the FLC
alleles identified by our association study might be real functional variants.
Figure 2.2: Origins and flowering time for USC samples in pilot project. See Figure 2.3.
Figure 2.3: Origins and flowering time for the Salk Institute samples in pilot project. Notes:
DTF: days to flower; fri-Col: fri^Col; fri-Ler: fri^Ler; a: 1.2kb FLC intron 1 insertion;
b: 4.2kb insertion assay detects >5 kb insertion; c: 4.2kb FLC intron 1 insertion; d: FLC
insertion data not available; e: FRI genotype not available
Figure 2.4: Primers used for PCR amplification. The positions of the target sequences on
chromosome V are also shown.
Figure 2.5: Pair-wise marker-trait association around FLC. Association results with all data
(top) and with only non-carriers of the two null FRI alleles (bottom).
Figure 2.6: Haplotype-trait association around FLC. Single-fragment haplotype analysis detects
no significant signal (a) while the spatial clustering algorithm does detect one right at the
FLC gene (b). This association is caused by very late-flowering Swedish accessions.
Figure 2.7: Haplotypes and intron 1 insertions at FLC region.
Chapter 3
NATURAL VARIATION FOR SEED DORMANCY
3.1 Introduction
Seed germination, characterized by radicle emergence, is another critical step in the life cycle
of flowering plants. Plants have evolved seed dormancy, a temporary arrest of seed germination,
to control the timing of germination. Seed dormancy and flowering time together
help phase the life cycle of plants in coordination with external cues, ensuring that seedling
establishment, floral transition and seed development take place in favorable seasons.
Seed dormancy promotes the survival of plants by spreading germination both temporally and
spatially. Dormancy can develop before or after seed abscission from the mother plant, termed
primary and secondary dormancy respectively. We will focus on primary dormancy. Seed dormancy
is also a key agricultural trait that has generally been reduced in crops for rapid
germination. Research on seed dormancy has potential applications in improving agronomic
traits in crops and in weed management, such as introducing foreign dormancy genes to prevent
pre-harvest seed sprouting or to allow fall planting, so that seedlings emerge earlier in
spring and avoid the more intense competition from weeds and diseases later in the warmer
season (Foley and Fennimore, 1998; Foley, 2001).
A dormant seed is defined as one that will not germinate under environmental conditions
normally favorable for germination, but that will germinate under those same conditions after
its dormancy is broken (Hilhorst, 1995; Bewley, 1997; Li and Foley, 1997; Baskin and
Baskin, 2004). Seed dormancy is controlled by complex regulatory networks and is determined
by both morphological and physiological properties of the seed (Nikolaeva, 2004).
Accordingly, it has been classified into five classes: physiological (PD), morphological (MD),
morphophysiological (MPD), physical (PY) and combinational (PY+PD), each with further
subdivision into levels and types (Baskin and Baskin, 2004). Dormancy in Arabidopsis seeds,
along with that of the great majority of other seeds, belongs to nondeep PD, i.e. embryos
excised from these seeds will produce normal seedlings. Depending on the species, either
gibberellin treatment, scarification, after-ripening (dry storage) or cold/warm stratification
can break this kind of dormancy (Baskin and Baskin, 2004). Arabidopsis is among the
plants best studied for seed dormancy, especially among dicotyledonous species. Insights into
the mechanisms in this species should be broadly applicable to the many other species that
share a similar seed dormancy type with Arabidopsis.
Many QTLs or genes affecting seed dormancy have been identified through mutagenesis studies
in A. thaliana (Leon-Kloosterziel, van de Bunt et al. 1996; Koornneef, Alonso-Blanco et al.
1998). Two genes, RDO2 and RDO4, have been cloned; interestingly, RDO4 was shown to
be involved in epigenetic chromatin organization (Liu, Geyer et al.). However, many of these
genes or QTLs might affect dormancy by regulating aspects of seed development other
than dormancy establishment (e.g. Baumbusch, Hughes et al. 2004).
Naturally occurring variation is an important alternative resource for genetics research in A.
thaliana (Alonso-Blanco and Koornneef 2000). Some QTLs underlying dormancy variation
have been identified using recombinant inbred lines from crosses of natural variants (van Der
Schaar, Alonso-Blanco et al. 1997; Alonso-Blanco, Bentsink et al. 2003; Clerkx, El-Lithy
et al. 2004). However, because of the coarse localization, it is hard to identify the particular
genes. So far, only the gene DOG1 has been cloned by this approach (Marten Koornneef,
personal communication).
With the advent of affordable methods for genome-wide DNA polymorphism genotyping,
association mapping has become a possible alternative, which is potentially powerful in
fine mapping (e.g. Aranzana, Kim et al. 2005). However, the power of association mapping
strongly depends on the genetic basis of phenotypic variation for the trait of interest in the
samples used. In particular, the phenotype must vary over a sufficiently large range in the
samples, and the phenotyping method should be precise enough to reliably resolve such
variation; otherwise the power of association mapping will be lost. Little is known about the
pattern of variation of seed dormancy in A. thaliana, partly because seed dormancy is a
notoriously difficult trait to quantify (Derkx, Karssen et al. 1993; Alonso-Blanco, Bentsink et
al. 2003) due to the huge environmental effects and rapid changes in germination behavior
during after-ripening. We decided to measure seed dormancy in a well-characterized natural
population for which genome-wide polymorphism data are available (Nordborg, Hu et al. 2005). Our
main objective was to investigate the global pattern of variation for this adaptively important
trait and to look for obvious correlations with climatic variables. A secondary objective was
to investigate the feasibility of measuring seed dormancy on a scale fine enough for full-scale
association studies.
Seed dormancy can only be measured indirectly, as the requirement for dormancy breaking.
In A. thaliana, this can be accomplished by measuring the after-ripening (dry storage)
requirement for half of the seeds to germinate, that is, the number of days of dry storage
required to reach 50% germination (DSDS50). We measured DSDS50 for our global samples, and the
results were reliable, as evidenced by the reasonable repeatability among duplicates and the
strong geographic patterning of the phenotype. The relationship between variation for dormancy
and flowering time is described, and possible explanatory variables, such as latitude and
climatic factors, are discussed.
3.2 Materials and Methods
3.2.1 Plant materials and genotyping
The accessions used in this study were described by Nordborg et al. (Nordborg, Hu et al.
2005). The FRI indel genotyping data were generated by Aranzana et al. (Aranzana, Kim et
al. 2005). The FLC genotyping data of the intron 1 insertion were described in Chapter 2.
3.2.2 Plant cultivation and flowering time phenotyping
All plants were grown in environmentally controlled growth chambers (MTPS72 Conviron,
Canada). Sown seeds were given a cold treatment of 2-3 days at 4 °C to synchronize
germination and were then grown under continuous short-day conditions (8 hrs of light / 16 hrs
of dark) at 18 °C, except that plants were vernalized for 4 weeks at 4 °C after the full
establishment of rosettes. For each accession, there were six replicates to start with. All
plants were rotated periodically to minimize the effect of environmental heterogeneity. The
flowering time was recorded as the number of days to flower, starting at the germination date.
The average value of the available replicates is used for analysis.
3.2.3 Seed collection and storage
Once a plant started to flower, a transparent plastic sleeve was used to cover the plant and
thus to prevent potential outcrossing or seed contamination. The seeds of each plant were
collected only once, by gently tapping the siliques when a significant portion of siliques had
senesced to the point of being totally brown, and the seeds were stored in a cellulose paper
bag kept in a dark incubator with the air temperature and relative humidity controlled at 20 °C
and 40% respectively.
3.2.4 Germination test and dormancy measurement
Seed germination was generally tested every one or two weeks, starting on the day of
collection. As testing progressed, larger time intervals were sometimes used, depending on
seed availability, preceding germination test results and available space. For each test, 30
or more seeds were sown on a filter paper kept saturated with deionized water in a transparent
polystyrene Petri dish. All Petri dishes were then put in the same growth chambers as used for
plant growing, and were rotated daily.
After seven days of incubation, the ratio of germinated seeds, identified by visible radicle
protrusion, was calculated as a percentage. Germination tests continued until nearly
100% germination was reached, as long as seeds were available. In case of fungal contamination,
invaded seeds were excluded from the calculation unless they constituted a significant
proportion (larger than 10%) of non-germinated seeds, in which case the test was discarded.
Incomplete data resulting from a lack of seeds (all seeds were used before reaching 50%
germination) were removed from the analysis because the change in germination could be rapid
and unpredictable. At the end of the experiment, a viability test was performed for those seed
collections that had never reached a 90% germination ratio. We tested seed viability by
measuring germination as described above, but after applying 4 °C cold stratification for 1-4
weeks in deionized distilled water, followed by an extended incubation of 10 days. Seed
collections with less than 90% germination in this test were excluded from the final results.
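The scoring rule for a single germination test can be sketched as below. This is only an illustration under stated assumptions: fungus-invaded seeds are taken to be a subset of the non-germinated seeds, the 10% threshold is applied to that subset, and the function name, parameter names and counts are hypothetical.

```python
def germination_percentage(sown, germinated, contaminated=0,
                           max_contaminated_fraction=0.10):
    """Germination percentage for one test, or None if the test is discarded.

    `contaminated` counts fungus-invaded seeds, assumed here to be a subset
    of the non-germinated seeds.
    """
    if sown <= 0:
        raise ValueError("at least one seed must be sown")
    non_germinated = sown - germinated
    if germinated < 0 or contaminated < 0 or contaminated > non_germinated:
        raise ValueError("inconsistent seed counts")
    if non_germinated > 0 and contaminated / non_germinated > max_contaminated_fraction:
        return None  # too many invaded seeds: the whole test is discarded
    scored = sown - contaminated  # invaded seeds are left out of the calculation
    return 100.0 * germinated / scored


# Hypothetical tests: 30 seeds sown, 12 germinated after seven days.
print(germination_percentage(30, 12, contaminated=1))  # 1 of 18 invaded: kept
print(germination_percentage(30, 12, contaminated=3))  # 3 of 18 invaded: discarded
```

Returning None for a discarded test keeps it distinguishable from a genuine 0% result when the tests are later fed into the dormancy estimate.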
For most germination tests in this study, germination was completed within 4 or 5 days after
sowing. However, for some very dormant seeds, a significant portion of the tested seeds could
continue to germinate even after one week. One reason is that the “residual
dormancy” (Bradford 2002) was still significant for these very dormant seeds. In addition,
seed “vigor” could be partly lost because the experiment lasted for a long time for these
very dormant seeds. Either case results in a slower germination rate. Timson’s index
(see Baskin and Baskin 1998) takes this into account by measuring seed dormancy with
Σn, where n is the cumulative daily germination percentage on each day until 100%
germination. However, this method would prohibitively increase the work for a large-scale survey.
In fact, the germination “rate” and “ratio” are correlated and reflect the seed dormancy level
in the same direction, so it is justifiable to consider only the germination ratio.
Accordingly, we calculated the germination percentage uniformly after one week of incubation
and adopted the days of dry storage to reach 50% germination, or DSDS50 (Alonso-Blanco,
Bentsink et al. 2003), to score dormancy. The DSDS50 value of each collection was estimated by
logit regression on the germination data using the GLM (generalized linear model) facility of
the statistical software R. The average DSDS50 value of all available replicates was taken as
our final estimate of seed dormancy.
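The DSDS50 estimate can be illustrated with a small logit regression. The thesis used GLM in R; the sketch below is a stand-in written in plain Python that fits logit(p) = b0 + b1·day to binomial counts by Newton's method and reports DSDS50 = −b0/b1, the storage time at which the fitted germination probability is 0.5. The function names and the example counts are hypothetical, and weighting each test by its number of seeds is an assumption.

```python
import math


def fit_logit(days, germinated, sown, iterations=50):
    """Fit logit(p) = b0 + b1 * day to binomial counts by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, k, n in zip(days, germinated, sown):
            z = max(-700.0, min(700.0, b0 + b1 * x))  # avoid overflow in exp
            p = 1.0 / (1.0 + math.exp(-z))
            w = n * p * (1.0 - p)
            g0 += k - n * p          # score for the intercept
            g1 += (k - n * p) * x    # score for the slope
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1


def dsds50(days, germinated, sown):
    """Days of dry storage at which the fitted germination probability is 50%."""
    b0, b1 = fit_logit(days, germinated, sown)
    if b1 <= 0:
        return None  # germination does not rise with storage time
    return -b0 / b1


# Hypothetical germination tests: storage days, germinated seeds, seeds sown.
days = [0, 20, 40, 60, 80]
germinated = [1, 5, 15, 25, 29]
sown = [30, 30, 30, 30, 30]
print(round(dsds50(days, germinated, sown), 1))
```

The example counts are symmetric around day 40, so the fitted midpoint falls at 40 days. An accession with irregular germination, such as Cvi-0, would not be described well by a single rising logistic curve.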
3.2.5 Climatic data
The precipitation and air temperature data (average of years 1950-1996) were downloaded
from the website developed primarily by K. Matsuura and C. Willmott at the Center for Climatic
Research, Department of Geography, University of Delaware
(http://climate.geog.udel.edu/~climate/html_pages/download.html). For each seed origin, the
values were estimated from the closest site, or as the average of the closest sites, when
available.
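One way to picture the closest-site estimate is the sketch below. It is not the procedure used for the thesis: the great-circle distance, the treatment of coordinates as decimal degrees, the choice of k and the site values are all assumptions made for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def estimate_from_closest(origin, sites, k=1):
    """Average the values of the k sites closest to origin = (lat, lon).

    Each site is a (lat, lon, value) tuple.
    """
    ranked = sorted(sites, key=lambda s: haversine_km(origin[0], origin[1], s[0], s[1]))
    nearest = ranked[:k]
    return sum(s[2] for s in nearest) / len(nearest)


# Hypothetical climate sites: latitude, longitude, annual precipitation (mm).
sites = [(50.0, 8.0, 600.0), (51.0, 9.0, 700.0), (40.0, -8.0, 1200.0)]
origin = (50.3, 8.0)  # hypothetical seed origin
print(estimate_from_closest(origin, sites, k=1))  # single closest site
print(estimate_from_closest(origin, sites, k=2))  # average of the two closest
```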
3.3 Results and Discussion
3.3.1 Seed dormancy measured in DSDS50
DSDS50 values ranged between 0 and 200 days, and the distribution of mean DSDS50 values
across accessions is heavily skewed towards low dormancy (mean = 43 days, median = 36.5
days) (Figure 3.1). Generally speaking, the standard deviation of DSDS50 for each accession
was small, less than two weeks even for most of those with averages larger than 100
days (Figure 3.3 and Figure 3.4).
[Figure 3.1. Panel a: histogram, axes DSDS50 (Days) and Frequency. Panel b: boxplot, axes
Accessions and Seed Dormancy in DSDS50 (Days).]
Figure 3.1: The extensive distribution of mean DSDS50 for 83 accessions is shown in the
histogram (a) and the well-controlled deviation within replicates for each accession is shown
in the boxplot (b).
In most cases, the ratio of seed germination increased monotonically over the dry storage
time, a pattern illustrated by C24 in Figure 3.2, but exceptions existed, as seen with the
accession Cvi-0 (Figure 3.2), for which seed germination could proceed irregularly and
the ratio could be very different between the replicates. Cvi-0 originates from the Cape
Verde Islands, where the summer is warm and dry with meager and erratic precipitation. The
observed unusually variable dormancy might reflect these special climatic conditions. Our
results were qualitatively in agreement with other published results (van Der Schaar, Alonso-
Blanco et al. 1997; Baumbusch, Hughes et al. 2004; Clerkx, El-Lithy et al. 2004).
In conclusion, the reproducibility of our results indicates that seed dormancy can be reliably
measured using DSDS50 as long as the conditions are strictly controlled. As we shall see
below, the reasonable geographical distribution of our measured seed dormancy also reflects
the reliability of the results.
[Figure 3.2. Panels: C24 (DSDS50 = 104 days) and Cvi-0. Axes: Days of Dry Storage and
Germination Ratio.]
Figure 3.2: Examples of germination ratio tests. Generally, seed germination increased
monotonically over the storage time and there was a short logarithmic phase of rapid change
(e.g. C24); but an exception was seen with Cvi-0, in which the change was irregular: after 50
days, the germination ratio was highly variable.
3.3.2 Non-random geographical distribution and microgeographic variation
Overall, we observed a strongly non-random geographic patterning of seed dormancy (Figure
3.5) that was consistent with local adaptation, because dormancy levels appeared to be
explainable by local climatic and geographic conditions. Accessions with strongly dormant seeds
hail from regions where the summers are hot and dry and/or where the winters are destructively
harsh. Under these kinds of climates, seeds require dormancy to postpone germination until
after these unfavorable seasons, because either the following summer drought or the harsh
winter would destroy the seedlings. Specifically, these accessions include several of the
Mediterranean accessions, including four Spanish (Se-0, Ts-1, Ts-5 and LL-0) and one northern
Italian accession (Mr-0), a very high-altitude accession (Kas-2) and two Kazakhstan accessions
(Kz-1 and Kz-9). At the other extreme, seeds from the subarctic zone, northern Sweden and
Finland, are all non-dormant. In order to avoid seed maturation being curtailed by the early
onset of winter, these seeds have to germinate promptly, earlier in the life cycle. And while
winters in these regions may seem harsh, the deep snow cover is likely to protect seedlings.
Seeds of similar origins tended to display similar levels of seed dormancy, but considerable
intra-group variation was observed as well. Seeds of three of the four southern Swedish
accessions were highly dormant, while seeds of the fourth accession, Ull-2-3, were much less
dormant. However, this accession is known to be genetically distinct from the other southern
Swedish accessions (Aranzana, Kim et al. 2005). Significant microgeographic heterogeneity
is widespread, for example between the two English accessions (NFA-8 and NFA-10) and among the
three Tadjikistan accessions (Kondara, Sorbo and Shahdara). These Tadjikistan accessions are
close in origin and also in genetic background (Nordborg et al., 2005). However, seed dormancy
was significantly weaker in Shahdara. This variation might be explained by their different local
environmental conditions as described by Ratcliffe (1965). Briefly, at the Shahdara habitat,
the autumns are cold and extremely dry, followed by severe winters (below −10 °C). Thus
one possible explanation is that the seeds can enter a quiescent state in fall and winter and
thereby largely dispense with dormancy under these strictly prohibitive conditions, in contrast
to the relatively milder habitats of the other two accessions. We shall see the special
relationship between precipitation and seed dormancy in section 3.3.5.
3.3.3 Latitudinal distribution
Selection is expected to give rise to latitudinal clines of certain traits because environmen-
tal cues such as photoperiod and temperature are correlated with latitude. However, many
factors including microenvironmental heterogeneity and recent dissemination can mask the
clinal patterning even for adaptive traits.
Only a weak negative correlation (R^2 = 0.11, p = 0.0013) was observed between DSDS50 and
latitude of origin (Figure 3.6). Even this weak correlation was largely
attributable to the northern Swedish accessions. Lack of a significant latitudinal cline was
also reported for flowering time in the same Arabidopsis samples as used here (Shindo, Aranzana
et al. 2005). A significant cline for flowering time was detected in another study, though the
relationship was restricted to accessions carrying neither of the two major null FRI
alleles, fri^Col and fri^Ler (Weinig, Stinchcombe et al. 2003). Similarly, the latitudinal cline
for seed dormancy becomes much more significant (R^2 = 0.2, p = 1.2×10^-4) after removing
accessions with the null FRI alleles, and even stronger (R^2 = 0.48, p = 8.1×10^-8) if we also
remove accessions carrying a putative weak FLC allele with a transposon insertion in intron 1.
The dormancy distribution for fri^Ler is broader than that for fri^Col (Figure 3.6), similar to
what was observed for flowering time (Shindo, Aranzana et al. 2005). Compared with accessions
carrying the other mutant alleles at FRI or FLC, fri^Col carriers more evidently tend to have
reduced dormancy. Considering a previous study suggesting that fri^Col was disseminated
more recently than fri^Ler and that selection on the former allele is more evident (Toomajian,
Hu et al. 2006), it seems that the null or weak alleles of FRI or FLC were selected for
accelerated life cycling, reducing both flowering time and seed dormancy via co-selection
and/or pleiotropy of the relevant genes.
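The R^2 values quoted for the latitudinal cline are those of a simple linear regression of DSDS50 on latitude. The sketch below shows the computation on ten rows copied from Figure 3.3 (latitude and mean DSDS50); the rows were picked only for illustration, so the result will not reproduce the R^2 reported for the full sample, and the function name is hypothetical.

```python
def r_squared_and_slope(x, y):
    """Return (R^2, slope) for a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy), sxy / sxx


# Latitude and mean DSDS50 (days) for ten accessions listed in Figure 3.3.
rows = [
    ("Ag-0", 45.0, 65), ("An-1", 51.3, 1), ("Bay-0", 49.0, 0),
    ("Bil-5", 63.19, 1), ("C24", 40.12, 119), ("Col-0", 38.3, 0),
    ("Kas-2", 35.0, 189), ("Kz-1", 49.5, 106), ("LL-0", 41.59, 129),
    ("Mr-0", 44.3, 122),
]
latitude = [r[1] for r in rows]
dormancy = [r[2] for r in rows]
r2, slope = r_squared_and_slope(latitude, dormancy)
print(f"R^2 = {r2:.2f}, slope = {slope:.2f} days per degree")
```

Repeating the call on a filtered list, for example after dropping carriers of the null FRI alleles, mirrors the subsetting described above.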
3.3.4 Relation with flowering time
In the life cycle of an annual plant, most intra-species variation in timing likely exists in
either the period of seed dormancy or the period of vegetative growth. Therefore, we would
expect a negative relationship in duration between these two stages. There also exist
“rapid-cycling” accessions like Col-0, which are able to support multiple generations annually.
Thus, whether or not the negative correlation between dormancy and flowering time is observed
will depend on the samples. Nevertheless, a weak negative correlation is observable in
the whole population between seed dormancy and flowering time under short-day conditions
with vernalization (SD+V) (R^2 = 0.13, p = 7×10^-4) (Figure 3.7). Again, eliminating
putatively rapid-cycling accessions with a mutant allele at FRI or FLC strengthens the
correlation (R^2 = 0.21, p = 1.47×10^-3) (Figure 3.7). If we also remove all Swedish accessions,
which are extreme in flowering time, the correlation becomes much higher (R^2 = 0.39, p =
5.15×10^-5). Similar to what we observed for the latitudinal correlation, the difference between
the two null FRI alleles is also obvious: the correlation is much stronger
for fri^Ler carriers than for fri^Col carriers (fri^Ler: R^2 = 0.33, p = 0.04; fri^Col:
R^2 = 0.0026, p = 0.91) (Figure 3.7).
In any case, no significant correlation with flowering time can be found under the other
conditions: long-day conditions without vernalization (LD-V) (R^2 = 0.01, p = -0.001) (Figure
3.8), long-day conditions with vernalization (LD+V) (R^2 = 0.05, p = 0.046) (Figure 3.9) or
short-day conditions without vernalization (SD-V) (R^2 = 0.04, p = 0.073) (Figure 3.10). One
apparent reason is that we collected the seed dormancy data under the SD+V condition.
Another possible explanation is that SD+V mimicked the natural field conditions better than
LD-V for these samples collectively; these two traits are results of adaptation to field
conditions and will differ in plasticity under laboratory conditions. Without experimental
vernalization, the flowering time of winter annuals will be longer than in the natural
field, depending on the degree of vernalization requirement, but seed dormancy might
not be affected in a similar way.
3.3.5 Relation with temperature and precipitation
Temperature and precipitation are two major climatic cues for plants. To test whether they
explain seed dormancy, we estimated the monthly and annual average temperatures and
precipitation for each origin using the interpolated climatic data (see Materials and Methods).
Obviously, such rough climatological data cannot accurately reflect the micro-environmental
conditions of the plant habitats (e.g. the habitats of the three Tadjikistan accessions
mentioned above (Ratcliffe 1965)). In addition, seeds mature at different times of the season
in different habitats, so comparing a single month or season may underestimate the
significance of the relationship. Nonetheless, these data should be good enough to reveal the
general trend in our samples if the correlation is strong.
Neither the twelve monthly temperature averages nor the annual average shows a significant
correlation with seed dormancy, the annual average being the most significant one (R^2 =
0.014, p = 0.29) (Figure 3.11). This lack of correlation makes sense in that the temperatures
in most regions are not major stress factors for seed germination.
The situation is different for precipitation because drought or flood stress is critical for
germination and seedling establishment. We found that average July precipitation is most
significantly correlated with seed dormancy (R^2 = 0.13, p = 4.16×10^-4) (Figure 3.12).
This correlation thus reveals the importance of precipitation after seed maturation to the
evolution of seed dormancy, but we must note that the correlation is largely caused by a few
accessions with strong seed dormancy, and there is almost no relationship when average July
precipitation is higher than 50mm. In addition, it is clear from the figure that extreme
drought is associated with low dormancy (Figure 3.12).
3.4 Conclusion
We have demonstrated that the requirement of after-ripening for seed germination (DSDS50)
is a reliable index of seed dormancy. Extensive variation has been found in natural
populations, and both a global non-random geographic pattern and microgeographic variation in
seed dormancy are evident and potentially explainable.
Strongly dormant seeds are always associated with hot dry summers and/or cold dry winters,
consistent with the viewpoint that droughts and frost select for dormancy (Allen and Meyer, 1998).
On the other hand, extremely inhibitive conditions like cold or drought could force
seeds into quiescence and thereby relieve the pressure for dormancy. Very late-flowering
annuals with a strong vernalization requirement, such as the northern Swedish accessions, show
no seed dormancy. But for earlier-flowering accessions the variation in seed dormancy is much
broader. In particular, early-flowering accessions carrying fri^Col, believed to have
experienced recent strong selection for rapid flowering, tend to be non-dormant or weakly
dormant, and thus have become real rapid-cycling annuals, capable of more than one generation
annually.
Obviously, seed dormancy is an adaptive trait shaped by local environmental cues. There was
no constant linear relationship between seed dormancy and latitude or temperature or precip-
itation. Presumably, only the local environmental conditions and flowering time together
suffice as the functional predictors of seed dormancy. Therefore, in order to fully explain the
geographical variation in seed dormancy, one needs the specific environmental and climatic
data, which unfortunately are generally unavailable for current stock seeds.
Accession Latitude Longitude DSDS50±sd Precip Temp DTF_SD+V DTF_LD-V
Ag-0 45 1.3 65±11 965.8 12.0 72.5 45.2
An-1 51.3 4.3 1±3 727.1 10.3 48.2 26.3
Bay-0 49 11 0±0 701.3 8.2 79.3 27.5
Bil-5 63.19 18.29 1±1 568.9 3.6 107.5 200.0
Bil-7 63.19 18.29 0±0 568.9 3.6 61.5 200.0
Bor-1 49.12 16.37 34±6 550.3 8.0 42.3 41.8
Bor-4 49.12 16.37 28±6 550.3 8.0 56.8 38.5
Br-0 49 16.3 73±14 534.0 8.0 36.8 92.5
Bur-0 53.3 -8 0±0 912.8 9.6 76.3 32.0
C24 40.12 -8.25 119±14 1245.1 14.8 27.0 28.7
CIBC-17 51.25 0.41 55±7 685.1 9.7 53.7 46.8
CIBC-5 51.25 0.41 52±3 685.1 9.7 62.3 35.5
Col-0 38.3 -92.3 0±0 1007.9 13.2 56.8 27.3
CS22491 61.36 34.15 15±2 637.4 1.8 44.8 85.3
Ct-1 37.3 15 41±11 622.0 17.9 34.5 26.8
Cvi-0 16 -24 na na na 26.3 31.7
Eden-1 62.53 18.11 2±3 594.5 3.3 74.7 200.0
Eden-2 62.53 18.11 na 594.5 na 200.0
Edi-0 56 -3 76±14 745.3 7.6 53.5 140.3
Ei-2 50.3 6.3 13±4 963.9 7.4 55.7 41.2
Est-1 58.3 25.3 9±3 628.8 5.5 68.3 25.5
Fab-2 63.01 18.19 na 568.9 3.5 171.5 200.0
Fab-4 63.01 18.19 na 568.9 3.5 200.0 200.0
Fei-0 40 -8 23±8 1412.2 14.9 59.7 33.7
Ga-0 50.3 8 47±11 627.8 9.1 49.5 34.7
Got-22 51.32 9.55 57±5 723.2 7.2 66.2 157.7
Got-7 51.32 9.55 61±8 723.2 7.2 71.0 162.5
Gu-0 50.3 8 26±7 627.8 9.1 45.0 27.5
Gy-0 49 2 21±3 650.6 10.3 45.8 37.7
HR-10 51.25 0.41 43±2 685.1 9.7 59.8 25.2
HR-5 51.25 0.41 58±11 685.1 9.7 48.3 28.8
Kas-2 35 77 189±16 214.5 -4.0 35.3 44.2
Kin-0 44.46 -85.37 51±15 752.2 6.6 35.8 31.3
Knox-10 41.18 -86.38 33±5 979.1 9.5 72.8 57.4
Knox-18 41.18 -86.38 15±12 979.1 9.5 71.0 38.5
Kondara 38.35 68.48 60±8 403.4 12.9 42.8 45.0
Kz-1 49.5 73.1 106±20 312.8 2.5 33.0 35.7
Kz-9 49.5 73.1 132±12 312.8 2.5 42.2 31.0
Ler-1 10±7 568.5 9.0 61.2 26.8
LL-0 41.59 2.49 129±4 759.3 11.4 37.7 46.0
Lov-1 62.48 18.05 0 594.5 3.3 99.0 200.0
Lov-5 62.48 18.05 0 594.5 3.3 117.2 200.0
Lp2-2 49.22 16.39 13±4 550.3 7.0 60.8 36.8
Lp2-6 49.22 16.39 6±5 550.3 7.0 59.8 36.3
Lz-0 46 3.3 56±8 799.8 10.6 40.5 48.0
Mr-0 44.3 9.3 122±9 1417.8 13.5 58.0 63.4
Mrk-0 49 9.3 17±3 713.9 9.5 51.7 37.3
Ms-0 56 38 43±6 630.9 4.7 36.5 42.0
Mt-0 33 23 7±7 170.8 10.2 68.0 26.7
Figure 3.3: Accessions and their seed dormancy in DSDS50 (Part 1). See Figure 3.4 for
notes.
Accession Latitude Longitude DSDS50±sd Precip Temp DTF_SD+V DTF_LD-V
Mz-0 50.3 8.3 30±10 632.2 9.0 70.0 26.8
Nd-1 51 10 na 690.8 8.4 60.2 25.0
NFA-10 51.25 0.41 5±4 685.1 9.7 51.3 31.7
NFA-8 51.25 0.41 38±6 685.1 9.7 46.3 28.0
Nok-3 52.3 4 36±8 741.0 64.3 40.2
Omo2-1 56.12 15.18 73±14 595.6 7.4 77.3 185.3
Omo2-3 56.12 15.18 70±4 595.6 7.4 52.6 101.8
Oy-0 60.23 6.13 na 2150.2 1.9 64.8 32.8
Pna-10 42.07 -86.27 8 940.7 10.0 74.4 51.5
Pna-17 42.07 -86.27 37±5 940.7 10.0 68.7 63.8
Pro-0 43.15 -6 79±8 1102.0 10.6 37.0 25.0
Pu2-23 42.38 18.07 28±8 2069.3 12.3 47.2 44.0
Pu2-7 42.38 18.07 33±5 2069.3 12.3 50.0 51.2
Ra-0 46 3.3 64±17 799.8 10.6 50.0 27.2
Ren-1 48.5 -1.41 51±17 798.4 10.8 52.3 38.0
Ren-11 48.5 -1.41 41±5 641.3 10.8 38.8 24.7
Rmx-A02 42.07 -86.29 13±0 940.7 10.0 67.2 47.0
Rmx-A180 42.07 -86.29 43 940.7 10.0 78.5 31.5
RRS-10 41.32 -86.26 na 979.1 9.5 na 50.6
RRS-7 41.32 -86.26 43±13 979.1 9.5 69.2 41.8
Se-0 41.3 2.3 123±5 597.8 15.7 31.7 46.5
Shahdara 38.35 68.48 28±8 403.4 12.9 38.8 33.3
Sorbo 38.35 68.48 73±7 403.4 12.9 37.5 50.0
Spr1-2 56.32 14.29 na 660.9 7.4 200.0 60.8
Spr1-6 56.32 14.29 na 660.9 7.4 80.7 200.0
Sq-1 51.25 0.41 na 685.1 9.7 46.8 42.7
Sq-8 51.25 0.41 28±6 685.1 9.7 42.0 26.8
Tamm-2 59.58 23.26 11±1 718.9 6.0 70.8 200.0
Tamm-27 59.58 23.26 6±2 718.9 6.0 77.5 200.0
Ts-1 41.3 3 129±13 694.7 15.7 32.3 43.2
Ts-5 41.3 3 127±15 694.7 15.7 39.8 71.3
Tsu-1 34.43 136.31 7±8 na na 50.5 35.2
Ull2-3 56.09 13.46 15±2 747.7 7.8 59.5 29.6
Ull2-5 56.09 13.46 71 747.7 7.8 161.3 200.0
Uod-1 48.07 14.53 32±5 1117.3 8.5 50.3 31.0
Uod-7 48.07 14.53 46±9 1117.3 8.5 55.3 54.4
Van-0 49.3 -123 46±13 630.9 4.7 64.3 28.3
Var2-1 55.33 14.2 na 660.9 7.4 126.0 200.0
Var2-6 55.33 14.2 na 660.9 7.4 182.7 200.0
Wa-1 52.3 21 19 504.4 8.0 52.0 23.7
Wei-0 47.25 8.26 75±7 1135.7 8.6 47.8 23.3
Ws-0 52.3 30 42±12 615.8 6.7 57.0 99.0
Ws-2 52.3 30 0±0 615.8 6.7 33.2 23.2
Wt-5 52.3 9.3 65±15 724.0 8.6 43.8 28.7
Yo-0 37.45 -119.35 37±3 784.7 4.5 88.5 59.8
Zdr-1 49.12 16.37 20±4 550.3 8.0 44.2 27.8
Zdr-6 49.12 16.37 16±9 550.3 8.0 47.3 36.8
Figure 3.4: Accessions used and their seed dormancy in DSDS50 (Part 2). Other related
data are also included: latitudes and longitudes of seed origins, average July precipitation
(Precip), annual average air temperature (Temp), and flowering time (days to flower, DTF)
under short-day conditions with vernalization (SD+V) and long-day conditions without
vernalization (LD-V). Note that the interpolated temperature and precipitation data cannot
reflect microgeographic conditions.
Figure 3.5: Geographic distribution of seed dormancy.
[Plot: Seed dormancy (DSDS50) vs. Latitude of Seed Origin; legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.6: Relationship between seed dormancy in DSDS50 and latitude of the seed origin.
N.SE: north Sweden; S.SE: south Sweden; C. Asia: Central Asia; FI: Finland; SP: Spain;
flc_in: with the FLC intron 1 insertion; fri_Col: with the fri-Col allele; fri_Ler: with the
fri-Ler allele. This legend also applies to Figures 3.7-3.10.
[Plot: Seed dormancy (DSDS50) vs. Days to Flower (SD+V); legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.7: Relationship between seed dormancy in DSDS50 and flowering time under short-
day condition with vernalization.
[Plot: Seed dormancy (DSDS50) vs. Days to Flower (LD-V); legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.8: Relationship between seed dormancy in DSDS50 and flowering time under long-
day condition without vernalization.
[Plot: Seed dormancy (DSDS50) vs. Days to Flower (LD+V); legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.9: Relationship between seed dormancy in DSDS50 and flowering time under long-
day condition with vernalization.
[Plot: Seed dormancy (DSDS50) vs. Days to Flower (SD-V); legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.10: Relationship between seed dormancy in DSDS50 and flowering time under
short-day condition without vernalization.
[Plot: Seed dormancy (DSDS50) vs. Annual Average Temperature (C); legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.11: Relationship between seed dormancy in DSDS50 and annual average tempera-
ture from 1950 through 1996.
[Plot: Seed dormancy (DSDS50) vs. July Precipitation (mm); legends: N.SE+FI, S.SE, C. Asia, SP, others; flc_in, fri_Col, fri_Ler, others]
Figure 3.12: Relationship between seed dormancy in DSDS50 and average July precipitation
from 1950 through 1996.
Chapter 4
DNA POLYMORPHISM AT THE SELF-INCOMPATIBILITY LOCUS
4.1 Introduction
Self-incompatibility (SI) is a mechanism by which hermaphroditic flowering plants promote
cross-fertilization by avoiding self-fertilization. Despite the well-known advantages of
out-crossing, selfing is common among hermaphrodites in nature: 20% are highly selfing and
33% are intermediate between selfing and out-crossing (Kalisz, Vogler et al. 2004). One
explanation, proposed by Darwin (1876), is reproductive assurance, i.e. self-pollination is
favored because of pollen limitation in peripheral or isolated places where mating partners
and/or pollinators are scarce (Barrett 2002). Another explanation relies on the automatic
selection or transmission advantage first noted by Fisher (1941). Relative to alleles promoting
out-crossing, an allele promoting self-fertilization in self-compatible plants has a 3:2
transmission advantage because it is transmitted through both selfing and out-crossing. Thus
alleles that increase the selfing rate are expected to spread unless their transmission advantage
is outweighed by inbreeding depression and pollen discounting (a reduction in the amount of
pollen contributed to the outcross pollen pool) (Barrett 2002).
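The 3:2 ratio can be reproduced with simple bookkeeping of gene copies. The sketch below is a deliberately simplified model, not taken from the cited papers: it assumes a fully selfing plant that still exports as much pollen as an outcrosser unless pollen discounting is specified, and the function name and parameters are hypothetical.

```python
def copies_transmitted(selfing, inbreeding_depression=0.0, pollen_discounting=0.0):
    """Gene copies passed on per plant, relative to one copy per outcrossed seed.

    Simplified model (an assumption of this sketch): a selfer fertilizes all of
    its own ovules and, without pollen discounting, exports as much pollen as
    an outcrosser does.
    """
    if not selfing:
        # One copy through its own ovules plus one copy through exported pollen.
        return 1.0 + 1.0
    # Each selfed seed carries two copies, reduced by inbreeding depression.
    via_selfed_seed = 2.0 * (1.0 - inbreeding_depression)
    # Exported pollen still sires outcrossed seed unless it is discounted.
    via_exported_pollen = 1.0 * (1.0 - pollen_discounting)
    return via_selfed_seed + via_exported_pollen


outcrosser = copies_transmitted(selfing=False)
selfer = copies_transmitted(selfing=True)
print(selfer, outcrosser)  # 3.0 2.0, i.e. the 3:2 advantage

# Under this model the advantage vanishes when inbreeding depression reaches 0.5.
print(copies_transmitted(selfing=True, inbreeding_depression=0.5))  # 2.0
```

In this simplified accounting, any combination of inbreeding depression and pollen discounting that brings the selfer's total to 2 or below removes the transmission advantage.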
4.1.1 The mechanism of self-incompatibility in Brassica
SI is generally controlled by a single locus called the S-locus (plural: S-loci). Self-pollen
rejection is initiated by specific recognition between pollen and pistil based on their
respective S-locus genotypes. There are two kinds of SI: sporophytic self-incompatibility
(SSI) and gametophytic self-incompatibility (GSI). In SSI, S-haplotype specificity is
controlled by the stigma genotype and the diploid (sporophytic) genotype of the pollen parent,
i.e. pollen rejection takes place if either of the two paternal S-haplotypes matches one of
those of the maternal parent. In GSI, incompatibility is controlled by the haploid S-haplotype
of the pollen itself. GSI is more common, but SSI is better understood, mostly through
research on Brassica.
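The difference between the two systems can be sketched as two matching rules. The example below is illustrative only: the haplotype names are hypothetical, dominance relationships among S-haplotypes are ignored, and the GSI rule (a pollen grain is rejected when its own haplotype matches either pistil haplotype) is the usual textbook formulation rather than something spelled out in the text above.

```python
def ssi_rejects(pollen_parent, pistil):
    """Sporophytic SI: the diploid genotype of the pollen parent decides.

    Pollen is rejected if either paternal S-haplotype matches one of the
    pistil's S-haplotypes (no dominance assumed).
    """
    return bool(set(pollen_parent) & set(pistil))


def gsi_rejects(pollen_grain, pistil):
    """Gametophytic SI: the haploid S-haplotype of the pollen grain decides."""
    return pollen_grain in pistil


pistil = ("S1", "S2")         # hypothetical maternal genotype
pollen_parent = ("S2", "S3")  # hypothetical paternal genotype

for grain in pollen_parent:
    print(
        grain,
        "SSI rejected:", ssi_rejects(pollen_parent, pistil),
        "GSI rejected:", gsi_rejects(grain, pistil),
    )
```

With these genotypes, every pollen grain from the S2S3 parent is rejected under the sporophytic rule, whereas under the gametophytic rule only the S2 grains are rejected.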
In Brassica, the S-locus comprises two closely linked genes (S-genes) encoding two products:
the female determinant (expressed in the stigma), the S-receptor kinase SRK (Stein and
Nasrallah 1993), and the male determinant (expressed in pollen), a small S-locus cysteine-rich
protein of 50-66 amino acids, known as SCR (Schopfer, Nasrallah et al. 1999) or S-locus
protein 11 (SP11) (Suzuki, Kai et al. 1999). SRK is a trans-membrane kinase with an
extracellular S-domain for ligand recognition and an intracellular kinase domain.
Ligand-receptor recognition between these two proteins triggers an SRK-mediated signaling
pathway, currently known to involve an E3 ubiquitin ligase (ARC1) and a non-receptor M
locus protein kinase (MLPK) (Murase, Shiba et al. 2004), that leads to the degradation of
unknown substrates and thereby to self-pollen rejection at the pistil (Figure 4.1) (Goring and
Walker 2004; Murase, Shiba et al. 2004).
As is the case for alleles at the major histocompatibility complex (MHC) genes in vertebrate
species, specific nonself recognition loci are expected to be highly diverse and often
maintain trans-species polymorphisms (Ioerger, Clark et al. 1990). The SRK gene is also
highly polymorphic, especially in the S-domain that controls recognition specificity. The SCR
gene is extremely polymorphic, and only eight of the cysteines involved in intramolecular
disulphide bonds are well conserved among S-haplotypes (Watanabe, Ito et al. 2000;
Takayama, Shiba et al. 2001). These two genes have been subject to balancing selection,
resulting in intermediate allele frequencies and high diversity across nearby neutral sites;
recombination between the two genes is suppressed to maintain recognition specificity
(Kamau and Charlesworth 2005; Bechsgaard, Castric et al. 2006).
Figure 4.1: Self-incompatibility response in Brassica (reprinted from Goring and Walker
2004). During self-pollination, the specific recognition and binding between the SP11/SCR
ligand and the SRK receptor trigger a signalling cascade resulting in self-infertility. Two
components of this pathway, MLPK and ARC1, have been identified, but more remain to be
discovered.
4.1.2 Self-incompatibility in Arabidopsis lyrata
Arabidopsis lyrata is the closest relative of the model plant A. thaliana, with an estimated
divergence time of 5 million years ago (Koch, Haubold et al. 2000; Yogeeswaran, Frary et al.
2005). In contrast to A. thaliana, A. lyrata is self-incompatible, and it is clear that an SI
system of the Brassica type is the ancestral state (Figure 4.2). The S-loci have recently been
studied intensively in A. lyrata, partly to help shed light on how SI was lost in A. thaliana.
Figure 4.2: Phylogenetic relationship of Arabidopsis spp. and Brassica. Most taxa exhibit
sporophytic self-incompatibility, but selfing has evolved multiple times (e.g. A. thaliana and
C. bursa-pastoris).
As in Brassica (Schopfer, Nasrallah et al. 1999), the currently available SRK sequences in
A. lyrata fall into two major classes, A and B; inter-class divergence of SRK gene sequences
is extremely deep (higher than 30% in the S-domain) (Schierup, Mable et al. 2001;
Charlesworth, Bartolome et al. 2003; Mable, Schierup et al. 2003; Schierup, Bechsgaard et
al. 2006). Members of group B are more similar to one another than are members of group A.
Group A can be further divided into subgroups, one of which is more closely related to B than
to the other A members (Mable, Beland et al. 2004; Prigoda, Nassuth et al. 2005). In both
Brassica and A. lyrata, SCR diversity is very high and few alleles have been sequenced.
4.1.3 Self-compatibility and S-locus genes in Arabidopsis thaliana
As is common for weedy plants, A. thaliana is highly selfing. In the A. thaliana reference
genome (derived from the standard laboratory accession Col-0), the S-locus consists of ΨSRK,
the pseudogenized ortholog of SRK in A. lyrata, and three diminished SCR orthologs, the
pseudogenes ΨSCR1, ΨSCR2 and ΨSCR3. The structure of the sequenced haplotype is similar to
that of group A in A. lyrata (Kusaba, Dwyer et al. 2001).
Transformation with functional SCR and SRK genes from A. lyrata can restore the SI
phenotype, but it does so to different degrees in different accessions (Nasrallah, Liu et al.
2002; Nasrallah, Liu et al. 2004). In accession C24, SI was stably restored by the foreign
S-genes, indicating that the SI signaling pathway has remained largely functional and that
self-compatibility (SC) was acquired through loss-of-function mutations restricted to SCR
and/or SRK, at least in this accession. The incomplete SI observed to varying degrees in other
accessions suggests the presence of mutations in downstream SI-modifier genes required for
pollen rejection; in these accessions such mutations could be secondary, or might be the
primary cause of SI breakdown. In other words, self-compatibility could have arisen in
multiple independent ways among different subpopulations of A. thaliana (Nasrallah, Liu
et al. 2004).
Three divergent haplogroups at the ΨSRK region, showing sequence similarity to their
counterparts in A. lyrata, were recently discovered: the major group A includes Col-0, group
B consists of Cvi-0 alone, and group C includes Kas-2 (Shimizu, Cork et al. 2004). This
conservation of diversity suggests a recent loss of function of the ΨSRK gene in A. thaliana.
However, very low diversity was observed at ΨSCR1 and at ΨSCR2/3 in the same samples; a
selective sweep at the ΨSCR1 region was therefore proposed to be responsible for the
transition to selfing, and a relatively recent loss of SI in A. thaliana was further claimed
(Shimizu, Cork et al. 2004). However, this explanation requires frequent recombination at the
S-locus in order to be reconciled with the high diversity observed at the nearby ΨSRK region.
We hoped to gain further insight into this issue by analyzing the pattern of DNA polymorphism
around the S-locus in our A. thaliana samples.
4.2 Materials and Methods
4.2.1 DNA samples
The DNA samples of 96 A. thaliana accessions were the same as those used in Chapter 2.
4.2.2 PCR amplification and DNA sequencing
All protocols for PCR and DNA sequencing were the same as in Chapter 2. The primers were
designed according to the Col-0 genome sequence. The primers used to screen the 96
accessions for Kas-2- or Cvi-0-specific sequences were based on the ΨSRK sequences obtained
by genome walking, which are identical to the respective published sequences. The primers
used by Shimizu et al. to amplify ΨSCR1 and ΨSCR2/3 (Shimizu, Cork et al. 2004) were also
included. The sequenced regions are shown in Figure 4.3.
Figure 4.3: (a) Resequenced regions (underlined) in ARK3, ΨSRK and ΨSCR1 in A. thaliana.
The filled boxes are exons (the first exons of ARK3 and ΨSRK correspond to the S-domains);
(b) The overall structure of the S-locus in A. thaliana (Col-0) (reprinted from Shimizu, Cork
et al. 2004). The figure also shows the diversity of each region as reported by Shimizu et al.
(Shimizu, Cork et al. 2004).
4.2.3 Genome walk
The BD GenomeWalker Universal Kit was used to amplify the DNA fragments downstream of
the SRK genes in accessions Kas-2 and Cvi-0. The genome-specific primers were designed from
the published SRK sequences following the instructions in the kit protocol. The two
genome-specific primers used for the genome walk with each of the Cvi-0 and Kas-2 DNA
samples are as follows:
Cvi-0, 1st: 5’-TCTCTTATGTGTTCAAGAGCGTGCAGAGG-3’
Cvi-0, 2nd: 5’-GTCGTCGGTTGTTTTAATGCTAGGAAGC-3’
Kas-2, 1st: 5’-GGGGGACATAGGGCATAATTTTCTCCACAG-3’
Kas-2, 2nd: 5’-TTCTCCACAGGTCCTTTACTCGCAACAC-3’.
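As a quick sanity check on primers such as these, their length and GC content can be tabulated. The sketch below only reports these two quantities for the four sequences listed above; the dictionary name and the helper `gc_fraction` are hypothetical, and no acceptance thresholds from the kit protocol are assumed.

```python
# Primer sequences copied from the list above (5' to 3').
PRIMERS = {
    "Cvi-0, 1st": "TCTCTTATGTGTTCAAGAGCGTGCAGAGG",
    "Cvi-0, 2nd": "GTCGTCGGTTGTTTTAATGCTAGGAAGC",
    "Kas-2, 1st": "GGGGGACATAGGGCATAATTTTCTCCACAG",
    "Kas-2, 2nd": "TTCTCCACAGGTCCTTTACTCGCAACAC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.2f}")
```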
4.2.4 DNA Sequence alignment and analysis
DNA sequence alignment and editing were conducted as in Chapter 2; DNA sequence
analyses were performed using the DnaSP and MEGA software packages.
4.3 Results
4.3.1 ARK3 and ΨSRK are both highly polymorphic regions
The ARK3 S-domain was sequenced. In total, the 93 available sequences (Figure 4.4) fall into
two major groups, A and B (Figure 4.5). The divergence between groups is high (average
number of nucleotide substitutions per site between populations, Dxy = 0.0234), while the
within-group diversities are very low (nucleotide diversity, i.e. the average number of
pairwise nucleotide differences per site, Pi = 0.0017 and 0.0064 for groups A and B
respectively). The overall estimate of synonymous nucleotide diversity is 0.0345, which is very
high compared with the genome-wide average of 0.005 (Nordborg, Hu et al. 2005) and is
virtually the same as in the study by Shimizu et al. (Shimizu, Cork et al. 2004). The
divergence at ARK3 between A. thaliana and A. lyrata (where the ARK3 ortholog is termed
Aly8) (Hagenblad, personal communication) is very deep (Dxy = 0.7685) (Figure 4.5), but
trans-specific polymorphisms were found in the ARK3 region. This suggests that the
S-haplotypes have been maintained for a long time with extreme suppression of recombination
in this region, and that the loss-of-function mutation occurred recently (Charlesworth,
Kamau et al. 2006).
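The two statistics quoted above can be sketched for a dot-matrix alignment of the kind shown in Figure 4.4, assuming, as is conventional, that a dot denotes a base identical to the reference. The sequences below are toy data, not the ARK3 alignment, and `N_SITES` (the total length of the sequenced region, including invariant sites) is a hypothetical value. Gapped sites are simply skipped when counting differences, which is a simplification and may differ from how DnaSP treats gaps.

```python
from itertools import combinations


def expand(row, reference):
    """Replace each '.' (identical to the reference) with the reference base."""
    return "".join(ref if base == "." else base for base, ref in zip(row, reference))


def n_differences(a, b):
    """Number of aligned sites at which two sequences differ, skipping gaps."""
    return sum(1 for x, y in zip(a, b) if x != y and "-" not in (x, y))


def nucleotide_diversity(seqs, n_sites):
    """Pi: average number of pairwise differences per site within a group."""
    pairs = list(combinations(seqs, 2))
    return sum(n_differences(a, b) for a, b in pairs) / (len(pairs) * n_sites)


def dxy(group_a, group_b, n_sites):
    """Dxy: average number of differences per site between two groups."""
    total = sum(n_differences(a, b) for a in group_a for b in group_b)
    return total / (len(group_a) * len(group_b) * n_sites)


# Toy variable-site matrix (hypothetical data).
reference = "GCGGA"
group_a = [expand(row, reference) for row in [".....", "....T"]]
group_b = [expand(row, reference) for row in ["AT...", "AT..T"]]
N_SITES = 100  # hypothetical length of the whole region, not only variable sites

pi_a = nucleotide_diversity(group_a, N_SITES)
pi_b = nucleotide_diversity(group_b, N_SITES)
d_ab = dxy(group_a, group_b, N_SITES)
print(f"Pi(A) = {pi_a:.4f}, Pi(B) = {pi_b:.4f}, Dxy = {d_ab:.4f}")
```

Dividing by the length of the whole region rather than by the number of variable sites matters here: a matrix of variable sites alone would greatly overstate per-site diversity.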
We then resequenced the ΨSRK region using primers designed from the Col-0 sequence. As
expected given the high level of polymorphism observed by Shimizu et al. (Shimizu, Cork et
al. 2004), PCR repeatedly failed in a substantial fraction of the samples, and the failure rate
increased from the 5’ end toward the 3’ end of ΨSRK. We also used the newly published ΨSRK
sequences of the two other groups, represented by Kas-2 and Cvi-0 (Shimizu, Cork et al.
2004), to screen all samples by PCR amplification at both ends of ΨSRK. Although these two
genotype-specific primer pairs did recover amplification for some samples that had failed with
Col-specific primers, many samples remained unamplifiable. Our results thus agree with those
of Shimizu et al. (Shimizu, Cork et al. 2004) in that there is considerable variation at SRK.
Interestingly, the nonsense mutation in exon 4 of Col-0 is not present in all accessions. A
frame-shift mutation is present in only four accessions, all of which also carry the nonsense
mutation. Thus, for 13 accessions, no functionally drastic mutation was found in the
sequenced SRK regions (Figure 4.6).
4.3.2 ΨSCR is also highly polymorphic, perhaps even more so
Shimizu et al. (2004) reported an almost complete lack of polymorphism at SCR1, and
interpreted this as the result of a selective sweep. However, our attempts to sequence ΨSCR1,
Sites: 199 233 286 319 330 398 404 414 429 438 516 536 539 541 547 550 593 617 654 675
678 696 756 783 792 817 828 831 861 862 865 869 876 907 909 1009 1026 1062 1080 1104
1136 1164 1173 1185 1194 1196 1221 1222 1242 1250 1255 1260 1290 1313 1323 1326 1339 1350 1356 1389
1413 1416 1417 1429 1434 1435 1449 1461 1462 1478 1491 1492 1493 1496 1505 1506 1507 1513 1525 1528
1530 1536 1549 1578 1583 1584 1591
Reference G C G G GCATCAC GCC G GCC GCCTCCT GCACC GCCA G G T G G A G G G G GTACAATC GTTAACTTT--- GCT--T TAA G---TCA G GTTCCC
Ag-0 .............................................................---...--.....---..........
An-1 .............................................................---...--.....---..........
C24 .............................................................---...--.....---..........
CIBC-17 .............................................................---...--.....---..........
Ct-1 .............................................................---...--.....---..........
Fei-0 .............................................................---...--.....---..........
Ga-0 .............................................................---...--.....---..........
Got-22 .............................................................---...--.....---..........
Got-7 .............................................................---...--.....---..........
Gy-0 .............................................................---...--.....---..........
Knox-10 .............................................................---...--.....---..........
Knox-18 .............................................................---...--.....---..........
LL-0 .............................................................---...--.....---..........
Mt-0 .............................................................---...--.....---..........
NFA-10 .............................................................---...--.....---..........
NFA-8 .............................................................---...--.....---..........
Pna-17 .............................................................---...--.....---..........
Ren-1 .-...........................................................---...--.....---..........
Ren-11 .............................................................---...--.....---..........
Rmx-A02 .............................................................---...--.....---..........
Rmx-A180 .............................................................---...--.....---..........
RRS-7 .............................................................---...--.....---..........
RRS-10 .............................................................---...--.....---..........
Se-0 .............................................................---...--.....---..........
Sq-1 .............................................................---...--.....---..........
Ts-5 .............................................................---...--.....---..........
Ull2-3 .............................................................---...--.....---..........
Uod-7 .............................................................---...--.....---..........
Wei-0 .............................................................---...--.....---..........
Bay-0 .TAAA. .......................................................---...--.....---..........
Bil-5 .TAAA. .......................................................---...--.....---..........
Bil-7 .TAAA. .......................................................---...--.....---..........
Bor-1 ATAAA. .......................................................---...--.....---..........
Br-0 .TAAA. .......................................................---...--.....---..........
Bur-0 .TAAA. .....................................A.................---...--.....---..........
Eden-1 .TAAA. .......................................................---...--.....---..........
Edi-0 .TAAA. ......................... G.............................---. ..--.....---..........
Gu-0 .TAAA. .......................................................---...--.....---..........
HR-10 .TAAA. ..........................A............................---...--.....---..........
Kas-2 .TAAA. .......... G........................C...................---...--.....---..........
Kin-0 .TAAA. ..........................A............................---...--.....---..........
Kz-9 .TAAA. .............................A. ........................---...--.....---..........
Lov-1 .TAAA. .......................................................---...--.....---..........
Lov-5 .TAAA. .......................................................---...--.....---A.........
Lz-0 .TAAA. .......................................................---...--.....---..........
Mr-0 .TAAA. .......................................................---...--.....---.....C....
Mrk-0 .TAAA. ......................... G.............................---. ..--.....---..........
Mz-0 .TAAA. ......................... G.............................---. ..--.....---..........
Nd-1 .TAAA. ......................... G.............................---. ..--.....---..........
Nok-3 .TAAA. .......................................................---...--.....---..........
Omo2-1 .TAAA. .......................................................---...--.....---..........
Omo2-3 .TAAA. ......................... G.............................---. ..--.....---..........
Oy-0 .TAAA. ......................... G.............................---. ..--.....---..........
Pu2-23 ATAAA. .......................................................---...--.....---..........
Pu2-7 ATAAA. .......................................................---...--.....---..........
Ra-0 .TAAA. .......................................................---...--.....---..........
Sq-8 .TAAA. ......................... G.............................---. ..--.....---..........
Ts-1 .TAAA. ......................... G.............................---. ..--.....---..........
Tsu-1 .TAAA. ......................... G.............................---. ..--.....---..........
Uod-1 .TAAA. .......................................................---...--.....---..........
Van-0 .TAAA. ...............................A.......................---...--.....---..........
Wa-1 ATAAA. .......................................................---...--.....---..........
Ws-0 ATAAA. .......................................................---...--.....---..........
Wt-5 .TAAA. .......................................................---...--.....---..........
Yo-0 .TAAA. ......................... G.............................---. ..--.....---..........
Zdr-1 .TAAA. ......................... G.............................---. ..--.....---..........
Fab-2 .TAAA. ................................. G..A.CCT G G GAT . CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Spr1-2 .TAAA. ......................................CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Var2-1 .TAAA. ......................................CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Var2-6 .TAAA. ......................................CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Bor-4 .T . . . . GC. G G.......ATACTT G.TT .AA . . TA .C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Col-0 ...... GC. G G.......ATACTT G..TTAA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
CS22491 ...... GC. G G.......ATACTT G..T.AA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA G--- . ATC . A . TA . -A . T
Eden-2 ...... GC. G G...... GATACTT G..T.AA..TA.C.T GC. A .CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Ei-2 ...... GC. G G.......ATACTT G..TTAA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Est-1 ...... GC. G G. . .A . . .ATACTT G..TTAA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
HR-5 ...... GCT G G...... GATACTT G..T.AA..TA.C.T GC. A .CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Kondara ...... GC. G G.......ATACTT G..T.AA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Kz-1 ...... GC. G G.......ATACTT G..TTAA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Ler-1 ...... GC. G G.......ATACTT GA. TTAA . . TA .C. T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Lp2-2 ...... GC. G G.......ATACTT GA. TTAA . . TA .C. T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Lp2-6 ...... GC. G G.......ATACTT GA. TTAA . . TA .C. T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Ms-0 ...... GC. G G.......ATACTT G..T.AA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA G--- . ATC . A . TA . -A . T
Shahdara ...... GC. G G.......ATACTT G..T.AA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA G--- . ATC . A . TA . -A . T
Sorbo ...... GC. G G.......ATACTT G..T.AA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA G--- . ATC . A . TA . -A . T
Spr-1-6 ...... GC. G G...... GATACTT G..T.AA..TA.C.T GC. A .CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Tamm-2 ...... GC. G G.......ATACTT GA. TTAA . . TA .C. T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Tamm-27 ...... GC. G G.......ATACTT GA. TTAA . . TA .C. T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Ull-2-5 ...... GC. G G...... GATACTT G..T.AA..TA.C.T GC. A .CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Ws-2 ...... GC. G G. . .A . . .ATACTT G..TTAA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Zdr-6 .TAAA . GC. G G.......ATACTT G..TTAA..TA.C.T G..A.CCT G G G.T .CA G....CTCAA. .TA.---.ATC.A.TA.-A.T
Pro-0 ...... GC. G G.......ATACTT G..TTAA..TA..........................---.T.--.....---...T......
Cvi-0 .....T GC. G GAAT . T . . A . . C . . G........TA.... G....CC G G G G.TACA GCTAC . --AA . C-- . -- GC--- . ACTA . - . G T
Figure 4.4: DNA sequence polymorphism at the ARK3 gene. The nucleotide positions are
counted starting 99 nt upstream of the start codon.
as well as ΨSCR2/3 (data not shown), using Col-specific primers ran into problems similar
to those reported above. Many accessions would not amplify, suggesting that polymorphism
is high rather than low, contrary to the claim of Shimizu et al. (2004). To investigate
whether the discrepancy was due to differences in the samples used, we attempted a direct
replication of the results of Shimizu et al., using the same primers and two of the accessions
that we suspected of harboring divergent SCR alleles. For these two accessions, Kas-2 and
Cvi-0, we were unable to replicate the published results.
The pattern of PCR failure is consistent throughout the whole S-locus region, in that the
same accessions tend to fail for all primer pairs. The ARK3 haplogroups show no correlation
with this pattern. For the sequences amplified by the two Col-specific primer pairs spanning
the two ends of ΨSCR1, the nucleotide diversities are 0.0006 and 0.0043 respectively. The DNA
sequences of the 3’ end fragments (physically closer to ΨSRK) are more diverse.
PCR failure can result from deletions as well as from high sequence divergence. To gain
further insight into the reasons for the PCR failure, a genome walk from the 3’ end of ΨSRK
towards the putative location of ΨSCR1 was attempted in the divergent accessions Cvi-0
and Kas-2. For the latter, we successfully extended the known sequence by 1567 bp beyond the
ΨSRK stop codon. If the ΨSRK region in Kas-2 is structured similarly to that in Col-0, this
would fully cover the ΨSCR1 region, the 3’ end of which is only 714 bp away from the ΨSRK
stop codon in Col-0 (Kusaba, Dwyer et al. 2001). Alignment of this 1567 bp sequence with the
corresponding genomic region of Col-0 showed a high nucleotide diversity of 0.52. A BLAST
search against the GenBank database likewise revealed no significant similarity to the Col-0
genome sequence.
4.4 Discussion
We have demonstrated that in A. thaliana ΨSRK and the closely linked gene ARK3 are highly
polymorphic, in agreement with the report of Shimizu et al. (Shimizu, Cork et al. 2004).
However, we were unable to reproduce their result for ΨSCR1, even for the same genotypes,
Kas-2 and Cvi-0, for which we obtained exactly the same ΨSRK sequences. All tested primers
failed consistently in PCR amplification for one largely fixed subset of accessions, yet these
DNA samples were amplifiable in other PCR reactions, including those for the ARK3 region.
Experimental error is therefore unlikely to explain the discrepancy. We conclude that the
ΨSCR1 region is highly variable, as is the case in Brassica and A. lyrata. DNA hybridization
tests have also shown that ΨSCR1 is absent or highly diverged in at least C24 and Mt-0
(Nasrallah, personal communication).
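The reasoning in this paragraph (an accession that fails for every S-locus primer pair but amplifies in control reactions points to sequence divergence or deletion rather than poor DNA quality) can be sketched as a simple tabulation. The function name, primer labels, and accession labels below are hypothetical placeholders, and the outcomes are invented for illustration; they are not results from this study.

```python
def consistently_failing_accessions(results, s_locus_primers, control_primers):
    """Return accessions that fail every S-locus primer pair but amplify all controls.

    results maps accession -> {primer pair -> True if a PCR product was obtained}.
    """
    flagged = []
    for accession, outcome in sorted(results.items()):
        fails_all_s_locus = all(not outcome[p] for p in s_locus_primers)
        amplifies_controls = all(outcome[p] for p in control_primers)
        if fails_all_s_locus and amplifies_controls:
            flagged.append(accession)
    return flagged


# Invented toy outcomes: True means the primer pair gave a product.
toy_results = {
    "acc1": {"SCR1_5p": True, "SCR1_3p": True, "ARK3": True},
    "acc2": {"SCR1_5p": False, "SCR1_3p": False, "ARK3": True},
    "acc3": {"SCR1_5p": False, "SCR1_3p": True, "ARK3": True},
    "acc4": {"SCR1_5p": False, "SCR1_3p": False, "ARK3": False},
}
flagged = consistently_failing_accessions(
    toy_results, ["SCR1_5p", "SCR1_3p"], ["ARK3"]
)
print(flagged)
```

In this toy table only acc2 is flagged: acc3 fails for just one primer pair, and acc4 also fails the control reaction, so its failures could reflect DNA quality instead.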
The three divergent ΨSRK haplogroups have counterparts in both A. lyrata and A. halleri
(another self-incompatible relative). Orthologs of group A ΨSRK and of the known ΨSCR1 are in
tight linkage, but no ortholog of this type of ΨSCR1 was identifiable in accessions carrying
orthologs of group B or C ΨSRK (Bechsgaard, Castric et al. 2006). This suggests that S-locus
diversity has been conserved across the entire region in these close relatives, without
intergenic recombination, and that SCR is so diverse that it cannot be amplified with primers
based on the known sequences in these two self-incompatible species (Bechsgaard, Castric
et al. 2006). Indeed, it was already known that the SCR genes are more diverse than SRK in
Brassica (Watanabe, Ito et al. 2000; Sato, Nishio et al. 2002) and in A. lyrata
(Charlesworth, Bartolome et al. 2003).
In our study, about half of the data were missing for the ΨSCR1 amplification with
Col-specific primers. For the sequences that did amplify, however, the ΨSCR1 region is about
two-fold more diverse than the amplified ΨSRK sequences. These diversities are still trivial
relative to the diversity observed at ARK3 or at the S-locus of A. lyrata (Charlesworth,
Bartolome et al. 2003); they are thus likely intragroup estimates, with the more divergent
alleles missing. We would expect a tree-like clustering of S-locus sequences (both ΨSCR1 and
ΨSRK) for the whole sample, similar to that of ARK3 (Figure 4.5) and ΨSRK (Shimizu, Cork et
al. 2004) but with deeper diversity.
It is reasonable to propose that the high diversity at the ΨSRK and ΨSCR regions is a remnant
of the ancient diversity in a self-incompatible ancestor, and that the still high but
relatively lower diversity we observed at ARK3 is due to a hitch-hiking effect of balancing
selection at the S-locus: not enough time has passed for recombination to reduce the
diversity. While we are unable to draw any conclusion about when and how SI was lost in
A. thaliana, our results demonstrate that the ΨSCR1 region is still highly polymorphic in our
sample, or possibly even deleted in some accessions. The ΨSCR1 pseudogenes of Cvi-0 and Kas-2
likely belong to haplogroups very distinct from that of Col-0. The claimed recent selective
sweep, based on uniform ΨSCR1 sequences in accessions with distinct ΨSRK haplotypes (Shimizu,
Cork et al. 2004), is contradicted by our results. Nor could the other S-locus genes have
been under such selection. Given that the genealogical trees at ARK3 from the two studies
match very well, the discrepancy cannot be readily explained by sampling differences.
As mentioned above, SI is controlled by a signaling cascade that includes components other
than the S-locus genes, such as ARC1 and MLPK. Orthologs of both have been identified in
A. thaliana (Mudgil, Shiu et al. 2004; Murase, Shiba et al. 2004). Variation at these
components could also contribute to the loss of SI. Indeed, pseudo-self-fertility, i.e.
self-fertility due to the expression of modifier alleles at non-S-loci, is well known in
numerous species (Levin, 1996). Pseudo-self-fertility is evolutionarily beneficial because it
combines the advantages of out-crossing and selfing. The prevalence of intermediate
out-crossing rates in wind- and animal-pollinated species (Vogler and Kalisz, 2001) is in
line with this view. Thus, loss of SI could have been achieved in different ways in flowering
plants. Some species might have converted to SC through mutation(s) at S-locus genes; others
might have undergone a gradual process of accumulating mutations at one or more modifier
genes independent of the S-locus genes, especially if the S-locus genes are functionally
important in other respects.
In the end, to work out the mechanisms of the transition to SC in A. thaliana, we may need
S-locus sequence data from more accessions of A. thaliana and its out-crossing relatives, and
we may ultimately also need to turn to variation at other genes in the SI pathway. By taking
advantage of natural variation and modern genomics tools, the answer to the “loss of SI”
mystery in A. thaliana should not be long in coming.
Figure 4.5: Neighbor-joining tree constructed for ARK3 S-domain sequences of 93 A.
thaliana and Aly8 sequences of 10 A. lyrata accessions. Clades are labeled A1, A2, A3, B1,
B2 and Aly8.
Figure 4.6: PCR results and DNA sequence polymorphisms of 96 A. thaliana accessions
at the ΨSRK and ΨSCR1 regions, along with the ARK3 haplogroups. Nucleotide positions are
counted from the start codon.
Bibliography
Abbott R. J. and M. F. Gomes (1989) Population genetic structure and outcrossing rate of
Arabidopsis thaliana. Heredity 62(33): 411-8.
Allen P. S., Meyer S. E. (1998). Ecological aspects of seed dormancy loss. Seed Sci Res
8:183-91.
Alonso-Blanco, C., L. Bentsink, et al. (2003). Analysis of natural allelic variation at seed
dormancy loci of Arabidopsis thaliana. Genetics 164(2): 711-29.
Alonso-Blanco, C. and M. Koornneef (2000). Naturally occurring variation in Arabidopsis:
an underexploited resource for plant genetics. Trends Plant Sci 5(1): 22-9.
Alonso, J. M., A. N. Stepanova, et al. (2003). Genome-wide insertional mutagenesis of Ara-
bidopsis thaliana. Science 301(5633): 653-7.
Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering
plant Arabidopsis thaliana. Nature 408(6814): 796-815.
Balasubramanian, S., S. Sureshkumar, et al. (2006). The PHYTOCHROME C photoreceptor
gene mediates natural variation in flowering and growth responses of Arabidopsis thaliana.
Nat Genet 38(6): 711-5.
Barrett, S. C. (2002). The evolution of plant sexual diversity. Nat Rev Genet 3(4): 274-84.
Baskin C. C. and J. M. Baskin, (2004). A classification system for seed dormancy. Seed
Science Research 14: 1-16.
Baumbusch, L. O., D. W. Hughes, et al. (2004). LEC1, FUS3, ABI3 and Em expression
reveals no correlation with dormancy in Arabidopsis. J Exp Bot 55(394): 77-87.
Bechsgaard, J. S., V. Castric, et al. (2006). The transition to self-compatibility in Arabidopsis
thaliana and evolution within S-haplotypes over 10 Myr. Mol Biol Evol 23(9): 1741-50.
Bentsink, L. and M. Koornneef (2002). Seed dormancy and germination. The Arabidopsis Book,
edited by C. R. Somerville and E. M. Meyerowitz. American Society of Plant Biologists,
Rockville, MD. DOI 10.1199/tab.0050, http://www.aspb.org/publications/arabidopsis/.
Bevan, M. and S. Walsh (2005). The Arabidopsis genome: a foundation for plant research.
Genome Res 15(12): 1632-42.
Bradford, K. J. (2002). Applications of hydrothermal time to quantifying and modeling seed
germination and dormancy. Weed Sci 50: 248-60.
Charlesworth, D., C. Bartolome, et al. (2003). Haplotype structure of the stigmatic self-
incompatibility gene in natural populations of Arabidopsis lyrata. Mol Biol Evol 20(11):
1741-53.
Derkx, M. P. M. and C. M. Karssen (1993). Variability in light-gibberellin and nitrate re-
quirement of Arabidopsis thaliana seeds due to harvest time and conditions of dry storage. J.
Plant Physiol. 141:574-82.
Edwards, K. D., P. E. Anderson, et al. (2006). FLOWERING LOCUS C mediates natural
variation in the high-temperature response of the Arabidopsis circadian clock. Plant Cell
18(3): 639-50.
Foley M. E. (2001). Seed dormancy: an update on terminology, physiological genetics, and
quantitative trait loci regulating germinability. Weed Sci. 49:305-317.
Foley, M. E. and Fennimore, S. A. (1998). Genetic basis for seed dormancy, Seed Science
Research 8: 173-82.
Gazzani, S., A. R. Gendall, et al. (2003). Analysis of the molecular basis of flowering time
variation in Arabidopsis accessions. Plant Physiol 132(2): 1107-14.
Goring, D. R. and J. C. Walker (2004). Plant sciences. Self-rejection–a new kinase connec-
tion. Science 303(5663): 1474-5.
Guyomarc’h, S., M. Benhamed, et al. (2006). MGOUN3: evidence for chromatin-mediated
regulation of FLC expression. J Exp Bot 57(9): 2111-9.
Hagenblad, J., C. Tang, et al. (2004). Haplotype structure and phenotypic associations in the
chromosomal regions surrounding two Arabidopsis thaliana flowering time loci. Genetics.
168: 1627-38.
Hagenblad, J., J. Bechsgaard, et al. (2006). Linkage disequilibrium between incompatibility
locus region genes in the plant Arabidopsis lyrata. Genetics 173(2): 1057-73.
He, Y. and R. M. Amasino (2005). Role of chromatin modification in flowering-time control.
Trends Plant Sci 10(1): 30-5.
Hilhorst H. W. M. (1995). A critical update on seed dormancy. I. Primary dormancy. Seed
Sci Res 5:61-73.
Ioerger, T. R., A. G. Clark, et al. (1990). Polymorphism at the self-incompatibility locus in
the Solanaceae predates speciation. Proc. Natl. Acad. Sci. USA 87: 9732-5.
Jean Finnegan, E., K. A. Kovac, et al. (2005). The downregulation of FLOWERING LOCUS
C (FLC) expression in plants with low levels of DNA methylation and by vernalization oc-
curs by distinct mechanisms. Plant J 44(3): 420-32.
Johanson, U., J. West, et al. (2000). Molecular analysis of FRIGIDA, a major determinant of
natural variation in Arabidopsis flowering time. Science 290(5490): 344-7.
Kakutani, T., J. A. Jeddeloh, et al. (1995). Characterization of an Arabidopsis thaliana DNA
hypomethylation mutant. Nucleic Acids Res 23(1): 130-7.
Kalisz, S., D. W. Vogler, et al. (2004). Context-dependent autonomous self-fertilization
yields reproductive assurance and mixed mating. Nature 430(7002): 884-7.
Kamau, E. and D. Charlesworth (2005). Balancing selection and low recombination affect
diversity near the self-incompatibility loci of the plant Arabidopsis lyrata. Curr Biol 15(19):
1773-8.
Koch, M. A., B. Haubold, et al. (2000). Comparative evolutionary analysis of chalcone
synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassi-
caceae). Mol Biol Evol 17(10): 1483-98.
Koornneef, M., L. Bentsink, et al. (2002). Seed dormancy and germination. Curr Opin Plant
Biol 5(1): 33-6.
Kusaba, M., K. Dwyer, et al. (2001). Self-incompatibility in the genus Arabidopsis: charac-
terization of the S locus in the outcrossing A. lyrata and its autogamous relative A. thaliana.
Plant Cell 13(3): 627-43.
Leon-Kloosterziel, K. M., G. A. van de Bunt, et al. (1996). Arabidopsis mutants with a re-
duced seed dormancy. Plant Physiol 110(1): 233-40.
Levin D. A. (1996). The evolutionary significance of pseudo-self-fertility. Am Nat, 148: 321-
32.
Li B. L. and M. E. Foley. (1997). Genetic and molecular control of seed dormancy. Trends
in Plant Science 2: 384-9.
Liu, J., Y. He, et al. (2004). siRNAs targeting an intronic transposon in the regulation of
natural flowering behavior in Arabidopsis. Genes & Dev 18: 2873-8.
Liu, Y., R. Geyer, M. Koornneef and W. Scoppe (2006). The 17th International Conference on
Arabidopsis Research, Madison, Wisconsin.
Mable, B. K., J. Beland, et al. (2004). Inheritance and dominance of self-incompatibility
alleles in polyploid Arabidopsis lyrata. Heredity 93(5): 476-86.
Mable, B. K., M. H. Schierup, et al. (2003). Estimating the number, frequency, and dominance
of S-alleles in a natural population of Arabidopsis lyrata (Brassicaceae) with sporophytic
control of self-incompatibility. Heredity 90(6): 422-31.
McCallum, C. M., L. Comai, et al. (2000). Targeting induced local lesions IN genomes
(TILLING) for plant functional genomics. Plant Physiol 123(2): 439-42.
Meinke, D. W., J. M. Cherry, et al. (1998). Arabidopsis thaliana: a model plant for genome
analysis. Science 282(5389): 662, 679-82.
Michaels, S. D. and R. M. Amasino (1999). FLOWERING LOCUS C encodes a novel MADS
domain protein that acts as a repressor of flowering. Plant Cell 11(5): 949-56.
Michaels, S. D. and R. M. Amasino (2001). Loss of FLOWERING LOCUS C activity elimi-
nates the late-flowering phenotype of FRIGIDA and autonomous pathway mutations but not
responsiveness to vernalization. Plant Cell 13(4): 935-41.
Michaels, S. D., Y. He, et al. (2003). Attenuation of FLOWERING LOCUS C activity as a
mechanism for the evolution of summer-annual flowering behavior in Arabidopsis. Proc Natl
Acad Sci U S A 100(17): 10102-7.
Molitor, J., P. Marjoram, et al. (2003a). Application of Bayesian spatial statistical methods
to analysis of haplotypes effects and gene mapping. Genet Epidemiol 25(2): 95-105.
Molitor, J., P. Marjoram, et al. (2003b). Fine-scale mapping of disease genes with multiple
mutations via spatial clustering techniques. Am J Hum Genet 73(6): 1368-84.
Mudgil, Y., S. H. Shiu, et al. (2004). A large complement of the predicted Arabidopsis ARM
repeat proteins are members of the U-box E3 ubiquitin ligase family. Plant Physiol 134(1):
59-66.
Murase, K., H. Shiba, et al. (2004). A membrane-anchored protein kinase involved in Bras-
sica self-incompatibility signaling. Science 303(5663): 1516-9.
Nasrallah, M. E., P. Liu, et al. (2002). Generation of self-incompatible Arabidopsis thaliana
by transfer of two S locus genes from A. lyrata. Science 297(5579): 247-9.
Nasrallah, M. E., P. Liu, et al. (2004). Natural variation in expression of self-incompatibility
in Arabidopsis thaliana: implications for the evolution of selfing. Proc Natl Acad Sci U S A
101(45): 16070-4.
Nikolaeva M. G. (2004). On criteria to use in studies of seed evolution. Seed Science Re-
search 14: 315-320.
Nordborg, M., J. O. Borevitz, et al. (2002). The extent of linkage disequilibrium in Ara-
bidopsis thaliana. Nat Genet 30: 190-3.
Nordborg, M., T. T. Hu, et al. (2005). The pattern of polymorphism in Arabidopsis thaliana.
PLoS Biol 3(7): e196.
Ostrowski, M. F., J. David, et al. (2006). Evidence for a large-scale population structure
among accessions of Arabidopsis thaliana: possible causes and consequences for the distri-
bution of linkage disequilibrium. Mol Ecol 15(6): 1507-17.
Peeters, A. J., H. Blankestijn-De Vries, et al. (2002). Characterization of mutants with re-
duced seed dormancy at two novel rdo loci and a further characterization of rdo1 and rdo2 in
Arabidopsis. Physiol Plant 115(4): 604-612.
Prigoda, N. L., A. Nassuth, et al. (2005). Phenotypic and genotypic expression of self-
incompatibility haplotypes in Arabidopsis lyrata suggests unique origin of alleles in different
dominance classes. Mol Biol Evol 22(7): 1609-20.
Quesada, V., C. Dean, et al. (2005). Regulated RNA processing in the control of Arabidopsis
flowering. Int J Dev Biol 49(5-6): 773-80.
Ratcliffe D. (1965). The geographical and ecological distribution of Arabidopsis and com-
ments on physiological variation. Arabidopsis Information Service 1 (Supplement).
Rouse, D. T., C. C. Sheldon, et al. (2002). FLC, a repressor of flowering, is regulated by
genes in different inductive pathways. Plant J 29(2): 183-91.
Salathia, N., S. J. Davis, et al. (2006). FLOWERING LOCUS C-dependent and -independent
regulation of the circadian clock by the autonomous and vernalization pathways. BMC Plant
Biol 6: 10.
Sato, K., T. Nishio, et al. (2002). Coevolution of the S-locus genes SRK, SLG and SP11/SCR
in Brassica oleracea and B. rapa. Genetics 162(2): 931-40.
Schoof, H. and W. M. Karlowski (2003). Comparison of rice and Arabidopsis annotation.
Curr Opin Plant Biol 6(2): 106-12.
Schierup, M. H., J. S. Bechsgaard, et al. (2006). Selection at work in self-incompatible Ara-
bidopsis lyrata: mating patterns in a natural population. Genetics 172(1): 477-84.
Schierup, M. H., B. K. Mable, et al. (2001). Identification and characterization of a poly-
morphic receptor kinase gene linked to the self-incompatibility locus of Arabidopsis lyrata.
Genetics 158(1): 387-99.
Schopfer, C. R., M. E. Nasrallah, et al. (1999). The male determinant of self-incompatibility
in Brassica. Science 286(5445): 1697-700.
Sharbel, T. F., B. Haubold, et al. (2000). Genetic isolation by distance in Arabidopsis
thaliana: biogeography and postglacial colonization of Europe. Mol Ecol 9(12): 2109-18.
Sheldon, C. C., J. E. Burn, et al. (1999). The FLF MADS box gene: a repressor of flowering
in Arabidopsis regulated by vernalization and methylation. Plant Cell 11(3): 445-58.
Sheldon, C. C., D. T. Rouse, et al. (2000). The molecular basis of vernalization: the central
role of FLOWERING LOCUS C (FLC). Proc Natl Acad Sci U S A 97(7): 3753-8.
Shimizu, K. K., J. M. Cork, et al. (2004). Darwinian selection on a selfing locus. Science
306(5704): 2081-4.
Shindo, C., M. J. Aranzana, et al. (2005). Role of FRIGIDA and FLOWERING LOCUS C in
determining variation in flowering time of Arabidopsis. Plant Physiol 138(2): 1163-73.
Shindo C., C. Lister, et al. (2006). Variation in the epigenetic silencing of FLC contributes
to natural variation in Arabidopsis vernalization response. Genes & Development, to appear.
Simpson, G. G. (2004). The autonomous pathway: epigenetic and post-transcriptional gene
regulation in the control of Arabidopsis flowering time. Curr Opin Plant Biol 7(5): 570-4.
Somerville, C. and M. Koornneef (2002). A fortunate choice: the history of Arabidopsis as
a model plant. Nature Reviews Genetics 3: 883-889
Stein, J. C. and J. B. Nasrallah (1993). A plant receptor-like gene, the S-locus receptor kinase
of Brassica oleracea L., encodes a functional serine/threonine kinase. Plant Physiol 101(3):
1103-6.
Stein, J. C., B. Howlett, et al. (1991). Molecular cloning of a putative receptor protein kinase
gene encoded at the self-incompatibility locus of Brassica oleracea. Proc Natl Acad Sci U S
A 88(19): 8816-20.
Sung, S. and R. M. Amasino (2005). Remembering winter: toward a molecular understand-
ing of vernalization. Annu Rev Plant Biol 56: 491-508.
Suzuki, G., N. Kai, et al. (1999). Genomic organization of the S locus: Identification and
characterization of genes in SLG/SRK region of S(9) haplotype of Brassica campestris (syn.
rapa). Genetics 153(1): 391-400.
Takayama, S., H. Shimosato, et al. (2001). Direct ligand-receptor complex interaction con-
trols Brassica self-incompatibility. Nature 413(6855): 534-8.
Terwilliger, J. D. and T. Hiekkalinna (2006). An utter refutation of the Fundamental Theorem
of the HapMap. Eur J Hum Genet 14(4): 426-37.
van Der Schaar, W., C. Alonso-Blanco, et al. (1997). QTL analysis of seed dormancy in
Arabidopsis using recombinant inbred lines and MQM mapping. Heredity 79: 190-200.
Toomajian, C., T. T. Hu, et al. (2006). A nonparametric test reveals selection for rapid flow-
ering in the Arabidopsis genome. PLoS Biol 4(5): e137.
Watanabe M., Ito A. et al. (2000). Highly divergent sequences of the pollen self-incompatibility
(S) gene in class-I S haplotypes of Brassica campestris (syn. rapa) L. FEBS Letters 473: 139-
44.
Werner, J. D., J. O. Borevitz, et al. (2005). FRIGIDA-independent variation in flowering time
of natural Arabidopsis thaliana accessions. Genetics 170(3): 1197-207.
Werner, J. D., J. O. Borevitz, et al. (2005). Quantitative trait locus mapping and DNA array
hybridization identify an FLM deletion as a cause for natural flowering-time variation. Proc
Natl Acad Sci U S A 102(7): 2460-5.
Yogeeswaran, K., A. Frary, et al. (2005). Comparative genome analyses of Arabidopsis spp.:
chromosomal rearrangement in evolutionary history. Genome Res 15(4): 505-15.
Vogler, D. W. and S. Kalisz (2001). Sex among the flowers: the distribution of plant mating
systems. Evolution. 55:202-4.
Zhao, Z., Y. Yu, et al. (2005). Prevention of early flowering by expression of FLOWERING
LOCUS C requires methylation of histone H3 K36. Nat Cell Biol 7(12): 1256-60.
Abstract
Naturally occurring variation is an important alternative resource for functional genetics and genomics research.
Asset Metadata
Creator
Tang, Chunlao (author)
Core Title
Natural variation in Arabidopsis thaliana
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Molecular Biology
Publication Date
11/17/2006
Defense Date
10/23/2006
Publisher
University of Southern California
(original),
University of Southern California. Libraries
(digital)
Tag
arabidopsis,dormancy,flowering,OAI-PMH Harvest,self-incompatibility,variation
Language
English
Advisor
Nordborg, Magnus (
committee chair
), Bickers, Nelson (
committee member
), Sun, Fengzhu Z. (
committee member
)
Creator Email
chunlaot@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m166
Unique identifier
UC196481
Identifier
etd-Tang-20061117 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-26964 (legacy record id),usctheses-m166 (legacy record id)
Legacy Identifier
etd-Tang-20061117.pdf
Dmrecord
26964
Document Type
Dissertation
Rights
Tang, Chunlao
Type
texts
Source
University of Southern California
(contributing entity),
University of Southern California Dissertations and Theses
(collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu