AGGREGATIBACTER ACTINOMYCETEMCOMITANS EVOLVES IN VIVO
IN PATIENTS WITH PERIODONTAL DISEASE
by
Ruoxing Sun
A Thesis Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements of the Degree
MASTER OF SCIENCE
(CRANIOFACIAL BIOLOGY)
May 2012
Copyright 2012 Ruoxing Sun
Epigraph
Do not, for one repulse, give up the purpose that you resolved to effect.
(William Shakespeare)
Dedication
This thesis is dedicated to my parents who have inspired me, supported me and
prayed for me all the way through the study. This thesis would not have been possible
without your unconditional love.
Acknowledgements
I would like to express my deepest gratitude to my supervisor, Dr. Casey Chen,
whose encouragement and guidance enabled me to develop an understanding of the
subject. He has made available his support in a number of ways and with incredible
patience in order to make this project valuable.
I am heartily thankful to my program director, Dr. Michael Paine, a real educator
whose guidance has carried me forward through the darkness of my study.
I am grateful to my committee members and my colleagues, especially
Dr. Maggie Zeichner-David, Dr. Weerayuth Kittichotirat, and Dr. Weizhen Chen, for their
precious support and suggestions. This thesis would not have been possible without them.
I would also like to express my sincere appreciation to my friends, who supported
me in every respect during the completion of the project. I am also indebted to many
coworkers in Casey’s laboratory for enlightening my spirit of knowledge.
Table of Contents
Epigraph
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
General Introduction
Materials and Methods
    Bacterial strains and genomic DNA preparation
    Whole genome sequencing (WGS)
    Construction of customized pan-genome microarray of A. actinomycetemcomitans
    Comparative genomic hybridization (CGH) to detect gene differences between the paired strains
    Identification of candidate genes that vary in their presence in the paired strains
    PCR confirmation of gene presence/absence in the paired strains
    Sequence confirmation of a novel plasmid in strain S23A
    Sequence confirmation of the serotype-specific antigen (SSP) cluster of strains S23A and I23C
    Phylogenetic analysis
Results
    WGS
    Phylogenetic analysis
    CGH
    Initial identification of gene differences of the paired strains
    Confirmation of gene differences of the paired strains by PCR and sequencing
    Identification of a plasmid in A. actinomycetemcomitans strain S23A
    Reversion that affected ORF17 and ORF18 in the SSP gene cluster of A. actinomycetemcomitans strain I23C
Discussion
Bibliography
Appendices
    Appendix A: Information of the 14 published A. actinomycetemcomitans strains used in this study
    Appendix B: Cluster ID and functions of the 150 core genes used for phylogenetic analysis of A. actinomycetemcomitans strains
    Appendix C: Present/absent genes identified by WGS while not covered by CGH
List of Tables
Table 1: Four sets of paired strains of A. actinomycetemcomitans recovered from four individuals
Table 2: Newbler Metrics of 454 sequencing of four pairs of A. actinomycetemcomitans strains
Table 3: Categories of presence/absence genes based on CGH and WGS
Table 4: Primer sequences
Table 5: Gene identification by WGS in the paired strains
Table 6: Pair-wise mutation count based on the 150 core genes
Table 7: Summary of cutoff point of each array, and the numbers of present and absent genes in A. actinomycetemcomitans strains
Table 8: Analysis of present and absent genes in the paired strains
Table 9: Confirmation of candidate present/absent genes by PCR and sequencing
List of Figures
Figure 1: Distribution of hybridization signals in CGH of A. actinomycetemcomitans with the pan-genome microarray
Figure 2: Cladogram of 18 A. actinomycetemcomitans strains and Aggregatibacter aphrophilus NJ8700 based on 150 core genes
Figure 3: The process of identifying candidate genes of difference in the paired strains
Figure 4: Confirmation of the candidate gene of difference by PCR in A. actinomycetemcomitans strains S23A and I23C, SCC2302 and AAS4a
Figure 5: The contigs of the novel plasmid pS23A of strain S23A and the homologous regions in the genomes of strains S23A, I23C and ANH9381
Figure 6: Postulated model of the mutation in A. actinomycetemcomitans strain I23C serotype-b specific antigen gene cluster
Abstract
Background: The periodontal pathogen Aggregatibacter actinomycetemcomitans exhibits
marked variation in genomic content among strains. Such variation presumably arises
from the different evolutionary pathways of A. actinomycetemcomitans strains. However, it is
not known whether genomic variation of A. actinomycetemcomitans may occur during
short-term persistent infections in vivo. Hypothesis: A. actinomycetemcomitans may
evolve in vivo in patients with periodontal disease. If true, the genomic variation may be
related to the mechanisms by which the bacteria adapt to the host. The information may
provide insight into the mechanisms of persistent infection of A. actinomycetemcomitans in
humans. Methods: Four pairs of A. actinomycetemcomitans strains (SCC393/A160,
SCC1398/SCC4092, SCC2302/AAS4a, S23A/I23C) (henceforth referred to as “paired
strains”), recovered from four individuals over a period of 0-10 years, were
subjected to (i) whole genome sequencing (WGS), (ii) phylogenetic analysis, (iii)
comparative genomic hybridization (CGH) with an A. actinomycetemcomitans pan-genome
microarray, and (iv) PCR analysis to confirm the mutations in selected genes
between the paired strains. Results: Phylogenetic analysis of 150 core genes confirmed
that each pair of strains derived from a recent common ancestral strain. Indications of
short-term evolution were obtained from two sets of the paired strains. Two genes (encoding
hypothetical proteins) in strain SCC2302 (the first strain) were not detected in strain
AAS4a (isolated three years later). For the pair S23A/I23C (isolated at the same time),
S23A had ten genes that were not detected in I23C. These ten genes were found to be part
of a 24.1 kb plasmid in S23A. An intact serotype-specific gene cluster was found in
S23A, which expresses serotype b antigen. In contrast, nontypeable I23C was found to have a
353-bp reversion in the gene cluster, which apparently has inactivated two ORFs of the
cluster. Conclusion: A. actinomycetemcomitans genomes are largely stable during
short-term persistent infections in humans, but may evolve via gene gains/losses or mutations.
General Introduction
Gram-negative Aggregatibacter actinomycetemcomitans is assumed to be a
primary etiologic agent of aggressive periodontitis (Slots & Ting, 1999). The species has
been very successful in maintaining its association with humans. The bacterium may
initially colonize oral mucosa possibly as a facultative intracellular pathogen (Yue et al.,
2007, Rudney et al., 2005, Rudney et al., 2001), and moves from the initial oral
colonization site to subgingival crevices and competes with other bacteria in the niche.
Persistent colonization of A. actinomycetemcomitans in subgingival crevices may lead to
periodontal destruction and development of periodontitis in susceptible individuals
(Haubek et al., 2004, Haubek et al., 2008, Fine et al., 2007). The transmission of A.
actinomycetemcomitans via saliva as a vehicle to a new host then initiates another round
of infection.
A. actinomycetemcomitans comprises discrete clonal lineages represented by
different serotypes: a to g (Rylev & Kilian, 2008, Takada et al., 2010). The term “clone”
refers to a group of genetically identical or nearly identical bacterial strains derived from
a common ancestor (Ørskov & Ørskov, 1983). By definition, little or no genetic
recombination has occurred between strains of different clones or clonal lineages. In this
scenario, it is likely that different clonal lineages of A. actinomycetemcomitans may
continue to evolve and diverge to possess different biological characteristics. Indeed, we
have demonstrated substantial genomic variation among A. actinomycetemcomitans
strains. For example, the genomic content (annotated genes) of a given strain may differ
by as much as 19.5% from that of another strain (Kittichotirat et al., 2011). The patterns of
genomic variation among strains correlate with the major clonal lineages of A.
actinomycetemcomitans, suggesting that the variation is driven by a long-term
evolutionary process. However, bacterial genomic variation may also occur over a
relatively short period of time. A growing number of studies show that Helicobacter pylori and
Escherichia coli evolve in vivo during short-term infection through bacterial adaptation to
the host environment (Morelli et al., 2010, Kennemann et al., 2011, Chattopadhyay et al.,
2009).
The objective of this study is to examine whether individual A.
actinomycetemcomitans strains may evolve in vivo in patients with periodontal disease. If
true, the observed genomic differences may reflect mechanisms of adaptation of A.
actinomycetemcomitans to the host. Bacterial mutations may occur as gene gain or loss
(insertion/deletion) or point mutations. Here we focus primarily on gene gain or loss in
the short-term evolution of A. actinomycetemcomitans. The full analysis of point
mutations will be included in a future publication.
Materials and Methods
Bacterial strains and genomic DNA preparation. Four pairs of clinical A.
actinomycetemcomitans strains (SCC393/A160, SCC1398/SCC4092, SCC2302/AAS4a, S23A/
I23C) were included in the study. Three pairs of strains were each isolated from the same
subject at two different time points. The strains of the remaining pair were isolated from
one subject at the same time but differed in serotype antigen expression
(serotype b vs. nontypeable by immunodiffusion assay) (see Table 1). Henceforth, the
pair of strains from each individual will be referred to as “paired strains.” All strains were
verified as A. actinomycetemcomitans by a 16S rDNA-based PCR assay (Chen & Slots,
1999). Their serotypes were initially determined by immunodiffusion (Asikainen et al.,
1991) and later confirmed based on the presence of serotype-specific gene clusters in the
genomes.
For genomic DNA preparation, A. actinomycetemcomitans bacteria were grown
on tryptic soy agar plates with 0.6% yeast extract for two days at 37℃ in an atmosphere
supplemented with 5% CO2, and harvested by washing the bacteria off the plates with
PBS buffer. The genomic DNA was then isolated using the QIAGEN DNeasy Blood &
Tissue Kit (Cat. No. 69504, QIAGEN) according to the manufacturer’s manual.
Table 1. Four sets of paired strains of A. actinomycetemcomitans recovered from four
individuals.

| First strain | Follow-up strain | Follow-up (years) | Serotype (#) | Age at the initial sampling, ethnicity, geographic location of patient | Diagnosis (*) |
|---|---|---|---|---|---|
| SCC393 | A160 | 10 | e | 40, Caucasian, Finland | CP, severe |
| SCC1398 | SCC4092 | 3 | b | 25, Caucasian, Finland | LAP |
| SCC2302 | AAS4a | 3 | c | 33, Caucasian, Finland | G |
| S23A | I23C | Same time | b vs. nontypeable | 48, Caucasian, Finland | CP, mild |

(#) Serotype was determined by immunodiffusion assay and supported by the presence of the
serotype-specific gene cluster in the genome.
(*) CP, chronic periodontitis; LAP, localized aggressive periodontitis; G, chronic gingivitis.
Whole genome sequencing (WGS). The genome sequences of 18 A.
actinomycetemcomitans strains and Aggregatibacter aphrophilus strain NJ8700 were
included in this study. The genome sequences of four strains (A160, SCC4092, AAS4a,
and S23A) were determined in this study. Table 2 provides a summary of the Newbler
metrics of the sequencing results.
The sequences of the remaining 14 strains have been released in our previous
publications (Kittichotirat et al., 2011 and other references) (see Appendix A for
GenBank accession numbers). Briefly, genome sequencing was performed using 454
pyrosequencing technology (Margulies et al., 2005) on a Genome Sequencer
FLX Instrument (Software 1.0.53) following the manufacturer’s instructions
(Hoffmann-La Roche Ltd). The protocol for gene prediction and annotation was as described in our
previous publication (Kittichotirat et al., 2011). All protein-coding genes were annotated
by BLAST searching against the GenBank non-redundant protein sequence database
(BLASTP with an E-value cutoff of 1e-6), using the criteria of ≥50% identity and
≥50% coverage to the non-redundant protein sequence. The description of the best
BLAST hit was used as the annotation for that gene. An insertion or deletion may generate a
premature stop codon and split a complete gene into two truncated genes. Such cases
were corrected manually by analyzing the BLAST hits of each protein-coding gene.
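The annotation rule described above can be sketched as a filter over BLASTP hits. This is a minimal illustration and not the pipeline used in the study: the `Hit` record and its field names are hypothetical, coverage is assumed to mean aligned length divided by the length of the database (subject) protein, and the "best" hit is assumed to be the passing hit with the lowest E-value.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hit:
    """One BLASTP hit for a query protein (hypothetical record layout)."""
    description: str
    evalue: float
    percent_identity: float  # 0-100
    aligned_length: int      # number of aligned residues
    subject_length: int      # length of the database protein


def annotate(hits: Iterable[Hit],
             max_evalue: float = 1e-6,
             min_identity: float = 50.0,
             min_coverage: float = 50.0) -> Optional[str]:
    """Return the description of the best passing hit, or None if no hit passes."""
    passing = [
        h for h in hits
        if h.evalue <= max_evalue
        and h.percent_identity >= min_identity
        # Assumption: coverage is measured against the subject protein.
        and 100.0 * h.aligned_length / h.subject_length >= min_coverage
    ]
    if not passing:
        return None
    # Assumption: "best" means lowest E-value among the passing hits.
    return min(passing, key=lambda h: h.evalue).description
```

A gene whose hits all fail the thresholds would simply stay unannotated under this sketch; the manual correction of truncated genes is not modeled.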
Table 2. Newbler Metrics of 454 sequencing of four pairs of A. actinomycetemcomitans
strains.

| Strain | SCC393 | A160 | SCC1398 | SCC4092 | SCC2302 | AAS4a | S23A | I23C |
|---|---|---|---|---|---|---|---|---|
| Coverage | 18X | 16X | 27X | 26X | 31X | 30X | 10X | 17X |
| No. of Large Contigs | 200 | 785 | 137 | 303 | 46 | 61 | 516 | 400 |
| Avg Contig Size (bp) | 11,110 | 2,470 | 15,829 | 6,571 | 44,293 | 33,438 | 3,839 | 5,051 |
| Median Contig Size (bp) | 18,019 | 3,447 | 4,592 | 11,749 | 102,488 | 61,192 | 6,109 | 7,958 |
| Q39 Bases (a) | 21,893 (0.99%) | 31,555 (1.63%) | 7,344 (0.34%) | 15,940 (0.80%) | 11,367 (0.56%) | 4,210 (0.21%) | 48,852 (2.47%) | 33,125 (1.64%) |

(a) Bases with a quality score of less than or equal to 39, which is equivalent to an error rate of
approximately 1 in 10,000 bases.
Construction of customized pan-genome microarray of A. actinomycetemcomitans.
The details of the pan-genome microarray of A. actinomycetemcomitans and the validation
of its performance in gene detection will be published elsewhere. Briefly, a total of
42,668 predicted genes identified in the genomes of the 18 strains of A.
actinomycetemcomitans were grouped based on sequence homology into 3,426
homologous gene clusters. The longest sequence in each gene cluster was selected as the
representative to design microarray probes using the eArray and ArrayOligoSelector tools.
The final microarray consisted of 15,208 probes representing 3,126 gene clusters of A.
actinomycetemcomitans and 536 control probes. Three hundred gene clusters failed to
generate probes and were not included in the analysis. These were mostly (71%)
small genes (<300 bp) with no known function (annotated as hypothetical protein). All
the probes were then spotted to create an 8 × 15K microarray with the randomized feature
layout option selected.
Because of sequence variation within each gene cluster among A.
actinomycetemcomitans strains, the performance of individual probes may not be uniform
across all strains. Therefore, an in silico analysis was performed to filter out
probes that may exhibit strain-to-strain variation in hybridization. A probe was accepted
if it matched all members of the target gene cluster with at least 80% sequence identity
over the whole probe length and did not match any gene belonging to a
different cluster. Gene clusters represented by 2 or fewer probes were excluded. After the
filtering process, a total of 10,934 probes of 2,676 gene clusters remained, including
1,762 core genes that were shared by all strains and 914 accessory genes that were identified
in the genomes of one or more, but not all, strains.
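The in silico probe filter described above can be sketched as follows. This is only an illustration under stated assumptions: the input dictionaries (`probe_cluster`, `gene_cluster`, `identity`) are hypothetical data structures, and an off-target "match" is assumed to use the same 80% identity threshold as an on-target match, which the text does not specify.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def filter_probes(probe_cluster: Dict[str, str],
                  gene_cluster: Dict[str, str],
                  identity: Dict[Tuple[str, str], float],
                  min_identity: float = 80.0,
                  min_probes: int = 3) -> Dict[str, List[str]]:
    """Return accepted probes per gene cluster.

    probe_cluster: probe id -> target cluster id
    gene_cluster:  gene id -> cluster id (all genes of all strains)
    identity:      (probe id, gene id) -> percent identity over the whole
                   probe length; missing pairs are treated as no match
    """
    members = defaultdict(list)
    for gene, cluster in gene_cluster.items():
        members[cluster].append(gene)

    accepted = defaultdict(list)
    for probe, target in probe_cluster.items():
        genes = members.get(target, [])
        # The probe must match every member of its own cluster.
        on_target = bool(genes) and all(
            identity.get((probe, g), 0.0) >= min_identity for g in genes)
        # Assumption: the same threshold defines an off-target match.
        off_target = any(
            identity.get((probe, g), 0.0) >= min_identity
            for g, c in gene_cluster.items() if c != target)
        if on_target and not off_target:
            accepted[target].append(probe)

    # Clusters represented by 2 or fewer probes are excluded.
    return {c: p for c, p in accepted.items() if len(p) >= min_probes}
```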
Comparative genomic hybridization (CGH) to detect gene differences between the
paired strains. The genome contents of the 4 pairs of strains were determined by CGH
with the pan-genome microarray following the Agilent Oligonucleotide Array-Based CGH Analysis
standard protocol. The hybridization was performed in duplicate for strains SCC393,
SCC1398, SCC4092, SCC2302, AAS4a, S23A, I23C, and in triplicate for strain A160.
Briefly, 0.5 µg genomic DNA in a volume of 13 µl was used for hybridization. DNA
denaturation and fragmentation were done in a thermocycler by mixing 2.5 µl Random
Primers (Agilent) and 13 µl genomic DNA at 94℃ for 10 minutes. The mixture was
cooled to 4℃ and 9.5 µl labeling master mix (Agilent) was added to make a total volume
of 25 µl. DNA was labeled with the fluorescent dye cyanine 3 in a thermocycler with
the following steps: 37℃ for 2 hours, 65℃ for 10 minutes, and hold at 4℃. Labeled genomic
DNA was purified with individual Amicon 30 kDa filters (Millipore) following the
manufacturer’s protocol. The yield of labeled genomic DNA was determined by
measuring the absorbance at 260 nm (DNA) and 550 nm (cyanine 3) using a NanoDrop
(ND-1000). A hybridization sample mixture of 40 µl was loaded onto the glass slide and
incubated at 65℃ for 24 hours with rotation. Arrays were washed with the Agilent CGH
wash buffer set according to the manufacturer’s protocol. Microarray scanning was done
using an Agilent scanner 6000C with the manufacturer’s recommended settings for aCGH
arrays (scan area 61 × 21.6 mm, 5 µm resolution, 100% PMT R&G). Signal data were
extracted from the scanner output by Agilent Feature Extractor v10.5 software with the
manufacturer’s recommended settings for oligo aCGH, using protocol CGH_105_Dec08.
The signal data obtained from the scanner were processed in the following
steps. First, the gProcessedSignal value (the background-subtracted signal) for each probe
was extracted. Since the labeling involved incorporating cyanine 3-dUTP into the
genomic sample, it is possible that probes that are rich in A would have higher signals
relative to probes with lower A content and therefore may not accurately reflect the true
abundance of the target sequence. To adjust for this variation, the signal values were
normalized by dividing them by the total number of A nucleotides in the probe
sequence. Second, the signals from probes for the same gene cluster were consolidated
into a single value by averaging, and then the log2 of the signal was taken. Third, a
“cutoff point” for gene calling (presence or absence) was identified from the bimodal
distribution pattern of the log2 signals (see Figure 1 for an example). The
performance of the A. actinomycetemcomitans pan-genome microarray in gene detection was
excellent: testing with the completely sequenced strains HK1651, D7S-1, and D11S-1
showed that sensitivity and specificity for gene detection were > 0.979 and > 0.967,
respectively (unpublished data). Any gene that was detected at least once in any of the
microarray hybridizations was tentatively assigned as being present, and the information
was used to identify genes that may be present in one of the paired strains but not in the
other, as described below.
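The three processing steps can be sketched in a few lines. This is an illustrative sketch, not the analysis code used in the study: the function names and the `(cluster_id, probe_sequence, signal)` input layout are hypothetical, and skipping probes with no A or with a non-positive signal is an assumption, since the text does not say how such probes were handled.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cluster_log2_signals(
        probes: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """probes: (cluster_id, probe_sequence, gProcessedSignal) triples."""
    per_cluster = defaultdict(list)
    for cluster_id, sequence, signal in probes:
        a_count = sequence.upper().count("A")
        # Assumption: probes without A or with non-positive signal are skipped.
        if a_count == 0 or signal <= 0:
            continue
        # Step 1: normalize by the number of A nucleotides in the probe.
        per_cluster[cluster_id].append(signal / a_count)
    # Step 2: average the probes of each cluster, then take log2.
    return {c: math.log2(sum(v) / len(v)) for c, v in per_cluster.items()}


def call_presence(log2_signals: Dict[str, float],
                  cutoff: float) -> Dict[str, bool]:
    """Step 3: a cluster is called present if its log2 signal reaches the cutoff."""
    return {c: s >= cutoff for c, s in log2_signals.items()}


def present_in_any(replicate_calls: List[Dict[str, bool]]) -> Dict[str, bool]:
    """A gene detected in at least one replicate is tentatively present."""
    clusters = set().union(*replicate_calls)
    return {c: any(calls.get(c, False) for calls in replicate_calls)
            for c in clusters}
```

The cutoff itself is array-specific and is taken as an input here.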
Figure 1. Distribution of hybridization signals in CGH of A. actinomycetemcomitans with the
pan-genome microarray. An example of the dataset for CGH of strain A160 is shown. The
signals were averaged for each gene cluster from all probes designed for that cluster,
divided by the number of A nucleotides in the probes, log2-transformed and binned as
shown on the x-axis. The distribution displays a bimodal shape. The mid-point of the
overlapping region of the two peaks was chosen as the “cutoff value” for calling gene
presence/absence.
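The text gives no formula for the mid-point of the overlapping region, so the following is a swapped-in heuristic rather than the procedure used in the study: it bins the log2 signals and returns the center of the deepest valley between two histogram peaks. The function name and the default bin width are hypothetical.

```python
from typing import List, Sequence


def bimodal_cutoff(values: Sequence[float], bin_width: float = 0.25) -> float:
    """Return the center of the deepest histogram valley between two peaks."""
    lo = min(values)
    n_bins = int((max(values) - lo) / bin_width) + 1
    if n_bins < 3:
        raise ValueError("need at least three bins to locate a valley")
    counts: List[int] = [0] * n_bins
    for v in values:
        counts[int((v - lo) / bin_width)] += 1

    best_bin, best_depth = None, 0
    for k in range(1, n_bins - 1):
        # Depth of bin k relative to the lower of the tallest bins on each side.
        depth = min(max(counts[:k]), max(counts[k + 1:])) - counts[k]
        if depth > best_depth:
            best_bin, best_depth = k, depth
    if best_bin is None:
        raise ValueError("distribution does not look bimodal")
    return lo + (best_bin + 0.5) * bin_width
```

In practice the chosen value would still be checked by eye against the histogram, as in Figure 1.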
Identification of candidate genes that vary in their presence in the paired strains.
CGH and WGS data were combined to select candidate genes that may differ between the
paired strains. First, the genes were assigned to four categories based on their presence/
absence in the paired strains by CGH (A, B, C, D) or WGS (E, F, G, H) (Table 3). Among
genes assigned to categories B, C, F, G (present in one paired strain and absent in the other), those
supported by both CGH and WGS were selected as candidate genes, while genes with
mutually exclusive results between WGS and CGH were removed. Genes identified as
differing in presence by one method (e.g., WGS) but not supported by the other method (e.g.,
CGH) were selected if the ratio of the CGH signals for the gene between the
paired strains was ≥ 2. Genes identified by WGS as varying in presence but not covered by
CGH are listed in Appendix C. An illustration of the process and examples are provided
later in the Results section.
Table 3. Categories of presence/absence genes based on CGH and WGS.

| CGH | Strain A + | Strain A - |
|---|---|---|
| Strain B + | A | B |
| Strain B - | C | D |

| WGS | Strain A + | Strain A - |
|---|---|---|
| Strain B + | E | F |
| Strain B - | G | H |
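The category assignment of Table 3 and the selection rule can be sketched as below. The function names are hypothetical, and two points are interpretations rather than statements from the text: "mutually exclusive" is read as CGH and WGS reporting opposite directions of difference, and the ≥ 2 ratio is computed symmetrically on linear (not log2) CGH signals.

```python
from typing import Optional


def category(present_a: bool, present_b: bool, method: str) -> str:
    """Category letter per Table 3 (columns: strain A; rows: strain B)."""
    labels = {"CGH": "ABCD", "WGS": "EFGH"}[method]
    row = 0 if present_b else 1
    col = 0 if present_a else 1
    return labels[2 * row + col]


def is_candidate(cgh_a: bool, cgh_b: bool, wgs_a: bool, wgs_b: bool,
                 signal_a: Optional[float] = None,
                 signal_b: Optional[float] = None,
                 min_ratio: float = 2.0) -> bool:
    """Decide whether a gene is a candidate difference between paired strains."""
    cgh_diff = cgh_a != cgh_b
    wgs_diff = wgs_a != wgs_b
    if cgh_diff and wgs_diff:
        # Both methods see a difference: keep only if they agree on direction.
        return cgh_a == wgs_a
    if cgh_diff or wgs_diff:
        # Only one method sees a difference: require a >= 2-fold CGH ratio.
        if signal_a is None or signal_b is None:
            return False
        high, low = max(signal_a, signal_b), min(signal_a, signal_b)
        return low > 0 and high / low >= min_ratio
    return False
```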
PCR confirmation of gene presence/absence in the paired strains. Candidate genes
identified by the above process were subjected to PCR and sequencing confirmation. A gene
cluster, p-cluster0322, present in all 18 strains was selected as a positive control for
PCR analysis. The primers (Table 4) were designed using the online primer design tool
Primer 3 (v. 0.4.0) and synthesized by IDT (Integrated DNA Technologies). The 25 µl
PCR mixture included 50-100 ng of genomic DNA, 0.3 µM of each primer, 2.5 µl of
2 mM dNTPs, and 1 unit of Taq DNA polymerase in 1× Taq DNA polymerase buffer. The
PCR amplification was performed with the following thermocycling profile: 2 minutes at
94℃ for denaturation followed by 30 cycles of 94℃ for 30 seconds, an annealing step at
52-58℃ for 1 minute, and an extension step at 72℃ for 1 minute, and then a final extension of 8
minutes at 72℃. Long amplification PCR (LongAmp Taq PCR) was also used in some
experiments. The 25 µl LongAmp Taq PCR mixture included 10 ng-1 µl of genomic
DNA, 0.4 µM of each primer, 1.5 µl of 2 mM dNTPs, and 1 unit of LongAmp Taq DNA
polymerase in 1× LongAmp Taq DNA polymerase buffer. The PCR amplification was
performed with the following thermocycling profile: 30 seconds at 94℃ for denaturation
followed by 30 cycles of 94℃ for 30 seconds, an annealing step at 52-58℃ for 1 minute,
and an extension step at 65℃ for 5-8 minutes, and then a final extension of 10 minutes at
65℃. The resultant amplicons were analyzed on a 1% agarose gel and subsequently
purified with QIAquick PCR Purification Kits (Cat. No. 28106, QIAGEN) or QIAquick
Gel Extraction Purification Kits (Cat. No. 28706, QIAGEN), and sequenced (Eton
Bioscience Inc).
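As a quick check on the reaction set-up, the final dNTP concentration follows from the dilution relation C1 × V1 = C2 × V2. The sketch below assumes that "2 mM" is the concentration of the dNTP stock; the function name is hypothetical.

```python
def final_concentration(stock_conc: float, stock_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Standard Taq reaction: 2.5 µl of 2 mM dNTPs in 25 µl.
standard_dntp_mM = final_concentration(2.0, 2.5, 25.0)
# LongAmp Taq reaction: 1.5 µl of 2 mM dNTPs in 25 µl.
longamp_dntp_mM = final_concentration(2.0, 1.5, 25.0)
print(standard_dntp_mM, longamp_dntp_mM)
```

Under that assumption the two mixtures contain 0.2 mM and 0.12 mM dNTPs, respectively.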
Sequence confirmation of a novel plasmid in strain S23A. The contigs from strain
S23A that showed high homology to a previously described A. actinomycetemcomitans
plasmid, pS57 (GenBank accession no. NC_014629), were aligned using pS57 as a
scaffold, and the contig gaps were closed by PCR primer walking. Primers were designed
and synthesized as described above (Table 4). The PCR amplicons for gaps >700 bp in
size were sequenced twice from both ends when required, and additional primers were
designed for subsequent sequencing when gaps were >1400 bp in size (Table 4). For
amplicons showing ambiguous sequencing results, long amplification PCR was employed
to amplify large fragments spanning several gaps. The PCR mixture and thermocycling
profile were as described above. The amplicons were visualized on a 1% agarose gel,
purified and sequenced (Eton Bioscience Inc) using gap-specific primers.
Sequence confirmation of the serotype-specific antigen (SSP) cluster of strains S23A
and I23C. The WGS analysis suggested that there was a reversion of ~400 bp within the
SSP gene cluster in strain I23C in comparison to the published serotype b-specific
gene cluster (GenBank accession no. AB002668) (Yoshida et al., 1998). Therefore, the
complete sequences of the SSP gene cluster of both S23A and I23C were determined by
PCR primer walking. For I23C, each gap closing was performed twice with different pairs of
primers (see Table 4). The reversion in the SSP gene cluster of strain I23C was confirmed by 4 sets of
PCR analysis experiments spanning the reversed region.
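The kind of rearrangement checked here can be illustrated with a small sketch. It assumes that "reversion" denotes a segment present in reverse-complement orientation (the heading of Table 4, part iv, refers to an "inverted region"), and that the two sequences are otherwise identical and of equal length; the function names and toy sequences are hypothetical and are not data from the study.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_inversion(reference: str, sample: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) if the differing span of `sample` is the reverse
    complement of the same span in `reference`; otherwise return None."""
    reference, sample = reference.upper(), sample.upper()
    if len(reference) != len(sample):
        return None
    diffs = [i for i, (r, s) in enumerate(zip(reference, sample)) if r != s]
    if not diffs:
        return None
    start, end = diffs[0], diffs[-1] + 1
    if sample[start:end] == reverse_complement(reference[start:end]):
        return start, end
    return None


# Toy example: the middle four bases are inverted in the sample.
toy_reference = "TTTT" + "AACG" + "GGGG"
toy_sample = "TTTT" + reverse_complement("AACG") + "GGGG"
print(find_inversion(toy_reference, toy_sample))
```

Real draft assemblies would need an alignment step first; this sketch only shows what the PCR experiments spanning the region are meant to confirm.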
Table 4. Primer sequences.

(i) Purpose: PCR/Sequence confirmation of gene presence/absence in paired strains

| Cluster ID | Gene ID | Forward Primer | Reverse Primer |
|---|---|---|---|
| p-cluster13483 | SCC393_1702 | CAGCGCGGTGATCTTCAG | CGATCCTCACCGCGAAAC |
| p-cluster02391 | A160_2074 | ACAAAAGAAGAGGTGACGGTTGCC | AACCTCGGGCGGGTACTCATT |
| p-cluster01799 | SCC1398_N/A § | CAAATCAAAAAGGGCTTTCA | TTTTCCCTTTCCTTGTCCTTC |
| p-cluster12007 | SCC2302_1123 | ACCTAAAACGACTGGCAAGC | GAACCGGCTTGAATTTTTGA |
| p-cluster12011 | SCC2302_1127 | GTGCTTGCTGAATCTATACCAGA | ATAGAGGAGCGCCCCAAC |
| p-cluster12012 | SCC2302_1128 | TTGAATGTCTCATCTACTGCTCCT | GCGCCTAAAAATGGTTGAAG |
| p-cluster03035 | AAS4a_N/A § | TGCCAAGGGATGTCCTTTTGGAAC | GACCCCGCCTTTCCATCAAATCCA |
| p-cluster02269 | S23A_0872 | ATGTGCGCCCCGCTGGTTCA | TCGTTGACACTGCCGCTACGTT |
| p-cluster02280 | S23A_0877 | TTATGCCCGTGTATCGACAA | AGAAACCGGCTTGATTGAGA |
| p-cluster02561 ₷ | S23A_0874 | TCATGCTTCATCAAAACCAAA | CGATATTCGCGTTTAACCTCA |
| p-cluster02561 ₷ | S23A_0875 | TGGCGATACGGATAATCCTC | TTTCAACAGCTCGGCTAGGT |
| p-cluster02561 ₷ | S23A_0876 | GAATATCAACGTGCGAGACG | CTTGTGCAGACGGCAATTTA |
| p-cluster02578 | S23A_N/A § | AACGTGGTTTTTCCTGCAAC | AACGCACTCGATTTTTACCG |
| p-cluster02790 | S23A_0936 | AAAATCGAAAAGTTAAGAGAGCAG | AGGATTACTTTCGCATTGTTCC |
| p-cluster03521 ₷ | S23A_0939 | CGTAGTGCGCATTTTGGATA | AACAAGACCATGCCCACAGT |
| p-cluster03521 ₷ | S23A_0940 | ATCAAGCCACACGGAAAGTT | TCAACTGCTCACGCTTTTTG |
| p-cluster03622 | S23A_0941 | ATGCCGGTCAGAAAACACTC | AGCTGCCGTTAAATCGGATA |
| p-cluster03948 | S23A_0937 | CAAGGCGAATTTTGAGTGGT | TCGCCAAACTCATCATCAAG |
| p-cluster15527 | S23A_0942 | AGCTAAGCAGCACCCGAAA | ATTTCCGCCCGTGATAATTT |
| p-cluster02319 ⋈ | S23A_0938 | CAAGGCGAATTTTGAGTGGT | CGTAGTGCGCATTTTGGATA |

(ii) Purpose: sequencing of a novel plasmid in S23A

| Gap | Forward Primer | Reverse Primer |
|---|---|---|
| Contig64-Contig141 | ATCAGCGTTTTCAAGCCATC | ATGTGCGCCCCGCTGGTTCA |
| Contig141-Contig143 | CGTTTTGGTCTTGCCATTTC | AGAAACCGGCTTGATTGAGA |
| Contig143-Contig158 | CGATATTCGCGTTTAACCTCA | AGGATTACTTTCGCATTGTTC |
| Contig143-Contig158 | CGATATTCGCGTTTAACCTCA | GGAGGGCTTGACCTTGATA |
| Contig158-Contig504 | AGCTAAGCAGCACCCGAAA | TGTTTTGTTGCCAACGCT |
| Contig158-Contig504 | AGCTAAGCAGCACCCGAAA | AGTCACGACTCAACACATCA |
| Contig504-Contig142 | TGGCAACAAAACAGAAAGG | CAAGCCGCTAAAGAACTACGA |
| Contig142-Contig457 | TTCCTCTCGCCTTAAATCTCA | AAAATGGGATTACGGCCAAG |
| Contig457-Contig595 | CGCATATTTCTTCGCTGCTT | GCTCAAAGCGGGTTTAGTC |
| Contig595-Contig91 | CGACCGCCTATAATGACGA | ACTTATTTAGCGGCGGATGACAT |
| Contig91-Contig90 | TCGCCTTCAAGCAATTCAC | CGTAACGGTCAGATCCAA |
| Contig90-Contig59 | TTCTTCGCCTTGTTGGATCT | AACCACAACCAATGCCTACC |
| Contig59-Contig98 | AATCGGTTTCCGTTTTGTC | GCGGTGAAAAAGTTGTAGG |
| Contig98-Contig69 | TTTTCGCCCTCACTGCTTTG | TGGTTGAAGGGTTCCGTATG |
| Contig69-Contig78 | CACGAATCAGAAATGGCAC | CAATACCAATCCAAGCTAAGCC |
| Contig78-Contig437 | ACTACCACCTTTGCCATTTAGAC | GGGCAAAAGAAGCAAAACAA |
| Contig437-Contig66 | CTGATTCCGAAACCATCCAC | GGCGGAAAATACATGCTGA |
| Contig66-Contig64 | CGTTTTTCGCCTCTGTCATC | GAGGAATTTAAGCGTTTTTGCTG |

(iii) Purpose: gap closing of the SSP gene cluster in S23A and I23C

| Strain | Gap | Forward Primer | Reverse Primer |
|---|---|---|---|
| S23A | Contig462-Contig52 | GCGAAGCCATCCTTATTGAA | GATGTTGCGGGTAAAGATCG |
| S23A | Contig52-Contig473 | CAATGCGCCAAATGAAAAAT | TCTATGGAGCATTTTAAATCCATTA |
| S23A | Contig473-Contig537 | TCAACAGAACCATCTTTAGAACCA | TGGGGAGTTTTCTTAACAATGTG |
| S23A | Contig537-Contig558 | AACATCCCATCCTGCAAAAA | TGGTAAGCATCTGCCATATGAA |
| S23A | Contig558-contig430 | TCCCAAGCTACATTTTGTGC | AGCACATCTTGGGAAAATGA |
| S23A | Contig430-Contig503 | ATGTGCCATCAGTTGAACCA | GATGGGGTCAACAAATCCAG |
| S23A | Contig503-Contig309 | GCAAGGCTTTGGGTAACAAG | AGTTGCCACACGCTAATTCC |
| I23C | Contig31-Contig372 | CAACTATGGGTGCCGGAAG | CAGCAGACAATTTTGCCACA |
| I23C | Contig31-Contig372 | AATTAGCAATGCGCCAAATG | TGTGTTGACAGATGAAGCTGAA |
| I23C | Contig372-Contig349 | TATCGCCAATAATCCGAACG | CCACTTTCTAACATTTCGGTCA |
| I23C | Contig372-Contig349 | CCTGATCGACAGAGTGAGATACA | CAGAAATAATTAACGTTGAAAATGC |
| I23C | Contig349-Contig261 | TCTAAGAAAGAAAAATTGTATTGGTCA | TGCTTTTCGTGCATACAAGG |
| I23C | Contig349-Contig261 | CATTGGTATTTGCAGATTCATCA | TCTAAGAAAGAAAAATTGTATTGGTCA |
| I23C | Contig261-Contig141 | TCCCAAGCTACATTTTGTGC | AAACTGTCCGCACGGTTATG |
| I23C | Contig261-Contig141 | GGCTGGAAACCACAATATCAA | GGGATATGCCAAGCGTGAT |

(iv) Purpose: verification of the inverted region in the SSP gene cluster in I23C

| Forward Primer | Reverse Primer |
|---|---|
| ATGTGCCATCAGTTGAACCA | TCTGTAGGCGGTATATCAGCTTT |
| TCGGACCTTGAACAATTTCAT | GGACGCTATTTTATTCCGATCA |
| AACTGAGAATGCAACGGTTT | TGGACAACATATGCCATCAGA |
| AAAACGGCGATTCAAAATAGA | TCTGATGGCATATGTTGTCCA |
| GGTTTTAATACCCATTAACAATGC | AAAATTAAAGATGGTGTTGATATTGC |
| TGTATAAAACGGCGATTCAAAA | TTCAAACTGGAATGATGTACGC |
| GGGTGCATACGATGACAGAA | GGAGCTAACGTTATTGATCTTCC |
| AAAACGGCGATTCAAAATAGA | TCTGGTGGTGTTCAAACTGG |

§ N/A means the gene ID was not available based on the annotation.
₷ Some truncated gene fragments were located within one homologous gene cluster, for which
primers were designed separately for each fragment.
Phylogenetic analysis. Phylogenetic analysis of the 18 A. actinomycetemcomitans
genomes was performed using concatenated sequences of 150 core genes (total alignment
length of 127,857 bp). The 150 core genes were high-quality predicted genes that were
found in all 18 strains and in the A. aphrophilus strain (used as an outgroup). These genes
were found not to have frameshifts, fragmentation or other problems that may confound
gene detection and annotation. The maximum likelihood method was used to build a
cladogram indicating the relatedness among the strains.
Results
WGS. A summary of the characteristics of the genomes determined by WGS is shown in
Table 5. Briefly, the GC content is 44.40%-45.50%, within the normal range found in
other A. actinomycetemcomitans strains. The numbers of protein coding genes range from
2,057 to 2,669, and show some variation in the paired strains. As shown later, not all the
differences were due to genetic variations of the paired strains. Most of the variation
could be attributed to the incomplete sequence information of the draft genomes,
sequencing errors, or limitation of the gene detection algorithms.
Table 5. Gene identification by WGS in the paired strains.

| Strain | SCC393 | A160 | SCC1398 | SCC4092 | SCC2302 | AAS4a | S23A | I23C |
|---|---|---|---|---|---|---|---|---|
| Genome size (bp) (b) | 2,222,051 | 1,988,704 | 2,167,288 | 2,001,915 | 2,037,488 | 2,033,542 | 1,994,684 | 2,020,423 |
| % G+C | 44.40% | 45.50% | 44.50% | 44.80% | 44.40% | 44.50% | 44.90% | 44.90% |
| No. of protein-coding gene clusters | 2,669 | 2,303 | 2,437 | 2,118 | 2,442 | 2,057 | 2,417 | 2,473 |
| No. of tRNA genes | 35 | 24 | 39 | 39 | 44 | 41 | 28 | 26 |

(b) Estimated based on the total number of bases in the large contigs.
Phylogenetic analysis. We had to rule out the possibility that the paired strains from the
same subjects were genetically distinct strains that coinfected the individuals.
Phylogenetic analysis based on the 150 core genes was performed for the 18 A.
actinomycetemcomitans strains and A. aphrophilus NJ8700. Three major groups were
identified: (i) serotype a, d, e (excluding SC1083) and f strains, (ii) serotype b and c
strains, and (iii) serotype e strain SC1083, consistent with our earlier phylogenetic
analysis based on 25 housekeeping genes (Kittichotirat et al., 2011). The strains of each of the
four pairs were phylogenetically closer to each other than to any strain from a different
individual. Table 6 shows the total numbers of sequence differences in the 150 genes
among all strains. The differences found between strains from different subjects range
from 21 to 7,833 nucleotides, with most of the differences > 500 nucleotides. In sharp
contrast, no nucleotide differences were detected between the paired strains. The result
suggested that the paired strains were either identical or had likely derived from a
parental strain in the recent past.
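A pair-wise count of sequence differences of the kind summarized in Table 6 can be sketched as follows. The sketch assumes that all strains have been placed in one concatenated alignment of equal length; skipping positions with a gap or an ambiguous base is an assumption made here for illustration, since the handling of such positions is not described in this thesis.

```python
from itertools import combinations


def count_differences(seq_a, seq_b, ignore="-N"):
    """Count aligned positions where two sequences carry different bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_matrix(aligned):
    """Build a symmetric strain-by-strain matrix of difference counts."""
    names = sorted(aligned)
    matrix = {name: {name: 0} for name in names}
    for a, b in combinations(names, 2):
        diff = count_differences(aligned[a], aligned[b])
        matrix[a][b] = diff
        matrix[b][a] = diff
    return matrix


if __name__ == "__main__":
    toy = {"strain1": "ACGTACGT", "strain2": "ACGTACGT", "strain3": "ACGAAC-T"}
    for name, row in pairwise_matrix(toy).items():
        print(name, row)
```

In such a matrix, a count of zero between two isolates from the same subject corresponds to the result reported for the paired strains.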
[Figure 2 image: cladogram. Tip labels in the order shown: serotype c D17P2, D11S1, SCC2302,
AAS4A; serotype b SCC4092, SCC1398, HK1651, S23A, I23C, ANH9381; serotype d I63B; serotype a
D17P3, H5P1, D7S-1; serotype e SCC393, A160; serotype f D18P1; serotype e SC1083; NJ8700.]
Figure 2. Cladogram of 18 A. actinomycetemcomitans strains and Aggregatibacter
aphrophilus NJ8700 based on 150 core genes. In the cladogram, the strains of each pair were more
closely related to each other than to any other strain, which indicated that they derived from the
same ancestral strain and diverged recently.
Table 6. Pair-wise mutation count based on the 150 core genes.
D17P3 D7S-1 H5P1 I63B SC1083 SCC393 A160 D18P1 HK1651 ANH9381 SCC1398 SCC4092 I23C S23A D11S-1 D17P2 SCC2302 AAS4A NT05HA
D17P3 0 78 80 94 5447 455 455 435 1933 1933 1925 1925 1927 1927 1836 1836 1843 1843 15043
D7S-1 78 0 20 34 5443 436 436 377 1980 1978 1972 1972 1972 1972 1887 1887 1892 1892 15038
H5P1 80 20 0 36 5445 440 440 379 1990 1990 1982 1982 1984 1984 1897 1897 1902 1902 15047
I63B 94 34 36 0 5437 430 430 371 1980 1980 1972 1972 1974 1974 1887 1887 1892 1892 15042
SC1083 5447 5443 5445 5437 0 5421 5421 5414 5398 5377 5388 5388 5371 5371 5387 5387 5393 5393 15142
SCC393 455 436 440 430 5421 0 0 294 1918 1920 1910 1910 1916 1916 1823 1823 1828 1828 15047
A160 455 436 440 430 5421 0 0 294 1918 1920 1910 1910 1916 1916 1823 1823 1828 1828 15047
D18P1 435 377 379 371 5414 294 294 0 1879 1883 1871 1871 1879 1879 1810 1810 1815 1815 15031
HK1651 1933 1980 1990 1980 5398 1918 1918 1879 0 129 80 80 125 125 718 718 722 722 15068
ANH9381 1933 1978 1990 1980 5377 1920 1920 1883 129 0 121 121 20 20 726 726 730 730 15063
SCC1398 1925 1972 1982 1972 5388 1910 1910 1871 80 121 0 0 117 117 708 708 712 712 15069
SCC4092 1925 1972 1982 1972 5388 1910 1910 1871 80 121 0 0 117 117 708 708 712 712 15069
I23C 1927 1972 1984 1974 5371 1916 1916 1879 125 20 117 117 0 0 722 722 726 726 15060
S23A 1927 1972 1984 1974 5371 1916 1916 1879 125 20 117 117 0 0 722 722 726 726 15060
D11S-1 1836 1887 1897 1887 5387 1823 1823 1810 718 726 708 708 722 722 0 22 42 42 15067
D17P2 1836 1887 1897 1887 5387 1823 1823 1810 718 726 708 708 722 722 22 0 42 42 15069
SCC2302 1843 1892 1902 1892 5393 1828 1828 1815 722 730 712 712 726 726 42 42 0 0 15068
AAS4A 1843 1892 1902 1892 5393 1828 1828 1815 722 730 712 712 726 726 42 42 0 0 15068
NT05HA 15043 15038 15047 15042 15142 15047 15047 15031 15068 15063 15069 15069 15060 15060 15067 15069 15068 15068 0
CGH. Table 7 provides a summary of the CGH gene detection results for the 4 pairs of strains.
There were slight variations in the numbers of genes detected among replicates
of CGH and also between the paired strains. This was not unexpected, since CGH
analysis is subject to variations associated with the array design, hybridization
conditions, and the signal processing protocol. However, as will be shown later, some of
the variations between the paired strains were confirmed.
Table 7. Summary of cutoff point of each array, and the numbers of present and absent
genes in A. actinomycetemcomitans strains.
Strain       Cutoff value for present gene   Total present genes   Total absent genes
SCC393-1     6                               2151                  509
SCC393-2     2                               2158                  502
A160-1       3                               2156                  504
A160-2       3                               2158                  502
A160-3       3                               2158                  502
SCC1398-1    5                               1978                  682
SCC1398-2    3                               1979                  681
SCC4092-1    6                               1952                  708
SCC4092-2    3                               1974                  686
SCC2302-1    5                               1962                  698
SCC2302-2    4                               1964                  696
AAS4a-1      5                               1953                  707
AAS4a-2      4                               1970                  690
S23A-1       5                               2028                  632
S23A-2       3                               2045                  615
I23C-1       5                               2033                  627
I23C-2       3                               2034                  626
Initial identification of gene differences of the paired strains. Gene detection is not
100% reliable based on either WGS or CGH. Therefore, the two sets of data were
combined to select candidate genes that may differ between the paired strains. An
example of the process for strains SCC2302/AAS4a is shown in Figure 3. First, the genes
were assigned to presence/absence categories in the paired strains by either WGS or CGH. Among
the 8 genes in Category B by CGH, one gene was supported by WGS (Category F)
and was selected as a candidate gene. Of the remaining 7 genes in Category B, 3 were in
Category E (genes present in both AAS4a and SCC2302) and 4 were in Category H
(genes absent in both strains). These 7 genes were further evaluated based on additional
criteria applied to the CGH data. Genes were selected if the CGH signals demonstrated a
ratio of ≥ 2 between the paired strains and the signals for gene presence were >5000. The same
process was applied three more times: C versus E, H, G; F versus A, D, B; G versus A, D,
C (details shown in the Figure 3 flow chart). In total, 20 candidate genes were identified by
this process (2 in SCC393/A160, 4 in SCC2302/AAS4a, 1 in SCC1398/SCC4092, and 13
in S23A/I23C; see Table 8).
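The selection logic described above can be summarized in a short sketch. It is an illustration under stated assumptions rather than the procedure as implemented: the record layout and all names are hypothetical, and the signal criterion is interpreted here as requiring the higher of the two CGH signals to exceed 5000 and to be at least twice the lower signal.

```python
from dataclasses import dataclass


@dataclass
class GeneCalls:
    gene_id: str
    wgs: tuple         # (present in strain 1, present in strain 2) according to WGS
    cgh: tuple         # (present in strain 1, present in strain 2) according to CGH
    cgh_signal: tuple  # raw CGH signals for strain 1 and strain 2


def passes_additional_criteria(signals, min_ratio=2.0, min_signal=5000.0):
    """Assumed reading of the criteria: strong signal in one strain, >= 2-fold ratio."""
    high, low = max(signals), min(signals)
    if high <= min_signal:
        return False
    return low == 0 or high / low >= min_ratio


def is_candidate(gene):
    wgs_differs = gene.wgs[0] != gene.wgs[1]
    cgh_differs = gene.cgh[0] != gene.cgh[1]
    if wgs_differs and cgh_differs:
        # Both methods call a difference; accept when they agree on the direction.
        return gene.wgs == gene.cgh
    if wgs_differs or cgh_differs:
        # Only one method calls a difference; fall back on the CGH signal criteria.
        return passes_additional_criteria(gene.cgh_signal)
    return False


if __name__ == "__main__":
    genes = [
        GeneCalls("g1", (True, False), (True, False), (9000.0, 800.0)),
        GeneCalls("g2", (True, True), (True, False), (12000.0, 3000.0)),
        GeneCalls("g3", (True, True), (True, False), (6000.0, 4000.0)),
    ]
    print([g.gene_id for g in genes if is_candidate(g)])
```

Candidates selected in this way remain provisional until they are confirmed by PCR and sequencing.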
Figure 3. The process of identifying candidate genes that differ between the paired
strains. The final results of candidate gene selection in all four pairs of strains are summarized in
Table 8.
Table 8. Analysis of present and absent genes in the paired strains.

SCC393/A160 (n=2)
Category (no. of genes)                      B (0)            C (2)            F (7)            G (130)
Compared category                            E    H    F      E    H    G      A    D    B      A    D    C
No. of genes                                 0    0    0      0    1    1      0    6    1      96   34   0
No. of genes surviving additional criteria   -    -    -      -    0    1      -    0    1      0    0    -

SCC1398/SCC4092 (n=1)
Category (no. of genes)                      B (1)            C (6)            F (3)            G (19)
Compared category                            E    H    F      E    H    G      A    D    B      A    D    C
No. of genes                                 0    1    0      3    1    1      1    2    0      18   1    0
No. of genes surviving additional criteria   -    0    -      0    0    1      0    0    -      0    0    -

SCC2302/AAS4a (n=4)
Category (no. of genes)                      B (8)            C (11)           F (1)            G (7)
Compared category                            E    H    F      E    H    G      A    D    B      A    D    C
No. of genes                                 3    4    1      5    6    0      0    0    1      0    7    0
No. of genes surviving additional criteria   0    0    1      0    0    -      -    -    1      -    3    -

S23A/I23C (n=13)
Category (no. of genes)                      B (12)           C (12)           F (43)           G (20)
Compared category                            E    H    F      E    H    G      A    D    B      A    D    C
No. of genes                                 3    7    2      0    3    9      32   11   0      7    4    9
No. of genes surviving additional criteria   0    0    0      -    2    9      0    0    -      0    0    9

† The total number of candidate genes in each pair (n) was not necessarily equal to the sum of the
numbers of surviving genes in the corresponding table, because a gene could be selected
more than once.
Confirmation of gene differences of the paired strains by PCR and sequencing. Table
9 presents a summary of the results of the confirmation of gene differences of the paired
strains by PCR and sequencing. Among the candidate genes, 10 genes were confirmed to
be present in S23A and absent in I23C. Also, during the confirmation of these 10 genes,
an additional gene was identified and confirmed to be present in S23A and absent in I23C
(Figure 4, lanes 1-13). The genes confirmed to differ between S23A and I23C were
located in four contigs; more details of these results are presented in the next section.
Two genes were confirmed to be present in SCC2302 and absent in AAS4a. The
remaining candidate genes were rejected.
Table 9. Confirmation of candidate presence/absence genes by PCR and sequencing.
Cluster ID         Gene Description                                    Length of    Presence/Absence       Final Results
                                                                       Gene (bp)    in Paired Strains
                                                                                    SCC393     A160
p-cluster13483     Hypothetical protein                                189          +          -           Rejected
p-cluster02391     Hypothetical protein pVT745_p18                     174          -          +           Rejected
                                                                                    SCC1398    SCC4092
p-cluster01799     Cell filamentation protein Fic-related protein      292          +          -           Rejected
                                                                                    SCC2302    AAS4a
p-cluster12007     Hypothetical protein                                147          +          -           Rejected
p-cluster12011     Hypothetical protein                                117          +          -           Confirmed
p-cluster12012     Hypothetical protein                                120          +          -           Confirmed
p-cluster03035     Bacteriophage Mu GP27-like protein                  259          -          +           Rejected
                                                                                    S23A       I23C
p-cluster02269     Lytic transglycosylase catalytic                    474          +          -           Confirmed
p-cluster02280     Site-specific recombinase                           576          +          -           Confirmed
p-cluster02561     Nickase                                             1119         +          -           Confirmed
p-cluster02578     Replication protein                                 428          +          -           Confirmed
p-cluster02790     Hypothetical protein                                282          +          -           Confirmed
p-cluster03521     Hypothetical protein pVT745_p07                     711          +          -           Confirmed
p-cluster03622     Hypothetical protein NTHI0157                       615          +          -           Confirmed
p-cluster03948     Hypothetical protein GCWU000324_01050               387          +          -           Confirmed
p-cluster08381     Predicted protein                                   N/A          +          -           Not tested
p-cluster11592     Hypothetical protein                                N/A          +          -           Not tested
p-cluster15520     Hypothetical protein pVT745_p11                     N/A          +          -           Not tested
p-cluster15527     Hypothetical protein D11S_2161                      195          +          -           Confirmed
p-cluster15532     Hypothetical protein                                N/A          +          -           Not tested
p-cluster02319 ⋈   Hypothetical protein                                388          +          -           Confirmed
⋈ p-cluster02319 was found because it was flanked by two genes found to be present in S23A and
absent in I23C, and was also confirmed.
[Figure 4 image: PCR gels for (a) S23A and I23C, lanes 1-17, and (b) SCC2302 and AAS4a, each
with a 1 kb ladder.]
Figure 4. Confirmation of the candidate genes of difference by PCR in A.
actinomycetemcomitans strains S23A and I23C, SCC2302 and AAS4a. (a) Ten genes were
confirmed to be present in S23A but absent in I23C based on the detection of the relevant PCR
amplicons. Lanes 1-13: detection for p-cluster02561 (S23A_0874, S23A_0875, S23A_0876),
p-cluster02280 (S23A_0877), p-cluster02790 (S23A_0936), p-cluster03948 (S23A_0937),
p-cluster03521 (S23A_0939, S23A_0940), p-cluster03622 (S23A_0941), p-cluster15527
(S23A_0942), p-cluster02269 (S23A_0872), p-cluster02578, p-cluster02319 (S23A_0938),
respectively; Lane 14: detection of the connection between genes S23A_0877 and S23A_0936;
Lane 15: detection of the connection between S23A_0874 and S23A_0877; Lane 16: detection of
the connection between genes S23A_0937 and S23A_0939; Lane 17: positive control.
Identification of a plasmid in A. actinomycetemcomitans strain S23A. The ten genes
present in S23A (located in 4 contigs) but missing in I23C demonstrated homology to a
known 24.0 kb plasmid named pS57 in A. actinomycetemcomitans strain D11S-1. The
sequence of plasmid pS57 was used to identify 16 homologous contigs (cumulative length
~23 kb) from the WGS of S23A. Subsequently, pS57 was used as the scaffold to align
the contigs, and the contig gaps were then closed by PCR primer walk (see Figure 5 for a
summary of the sequencing strategy). The result suggested the existence of a circular
plasmid of 24.1 kb in S23A, which was named pS23A (Figure 5). The sequence of
pS23A was 97% identical to pS57, and 82% identical to another A.
actinomycetemcomitans plasmid, pVT745 (GenBank accession no. NC_002579), in strain
VT745.
During the sequencing effort of the plasmid pS23A, we noted that two contigs in
the draft genome of S23A contained both plasmid and non-plasmid sequences (see Figure
5, Contig457 and Contig64 of S23A). This could be interpreted as either sequencing
errors or the existence of plasmid-related sequences in the genome of S23A. PCR
analysis using primers spanning the regions (see Figure 5) showed that 14.1 kb of
the 24.1 kb plasmid pS23A was indeed integrated in the genome of S23A (shown in
Figure 5). A region homologous to the same 14.1 kb was also found in the genome of
I23C and the previously sequenced ANH9381. The results suggested that part of
pS23A was integrated into the genome of I23C and that the presence of pS23A-homologous
regions could be common in the genomes of A. actinomycetemcomitans strains.
Figure 5. The contigs of the novel plasmid pS23A of strain S23A and the homologous regions in the genomes of strains S23A, I23C
and ANH9381. Each arrow represents a contig. Above the map of pS23A, the lines indicate the regions of PCR analysis, and the boxes show
the regions that were sequenced. The region of pS23A not found in the genomes of S23A or I23C is shown in orange. The homologous 14.1
kb regions in pS23A and in strains S23A, I23C and the previously sequenced ANH9381 are colored in green. The unrelated contigs of the
genomes of S23A and I23C are colored in light blue.
[Figure 5 image. Map of pS23A (scale 1 to 22,500 bp). Contigs along pS23A: Contig143, Contig141,
Contig158, Contig504, Contig142, Contig457 (partial), Contig595, Contig91, Contig90, Contig59,
Contig98, Contig69, Contig79, Contig437, Contig66, Contig64 (partial).
Gaps: Gap 1 (139 bp), Gap 2 (2 bp), Gap 3 (858 bp), Gap 4 (1077 bp), Gap 5 (155 bp), Gap 6 (535 bp),
Gap 7 (0 bp), Gap 8 (765 bp), Gap 9 (0 bp), Gap 10 (0 bp), Gap 11 (0 bp), Gap 12 (118 bp),
Gap 13 (226 bp), Gap 14 (307 bp), Gap 15 (331 bp), Gap 16 (0 bp).
Labels in the genome rows, in the order extracted: S23A: Contig457, Contig64, Contig595, Contig91;
I23C: Contig297, Contig55, Contig52, Contig267; Contig90, Contig59, Contig98, Contig69, Contig79,
Contig437, Contig66; Contig200, Contig151; ANH9381: 2,136,808, 2,131,369, 2,130,903, 2,126,680, 1,
3,234, 1,295, 2,116,386; Contig362, Contig232.
Legend: homologous region across pS23A, S23A, I23C, ANH9381.]
Inversion that affected ORF17 and ORF18 in the SSP gene cluster of A.
actinomycetemcomitans strain I23C. Strain S23A was serotype b, while strain I23C was
nontypeable. Initial WGS results suggested that strain S23A has a full complement of the
SSP gene cluster, while the genome of I23C contained an inversion of ~400 bp
within the gene cluster. Therefore, the full sequences of the SSP gene clusters in S23A and
I23C were determined and compared. The A. actinomycetemcomitans serotype-b SSP gene
cluster (Yoshida et al., 1998) was used as a scaffold to identify the contigs that were
homologous to the SSP gene cluster in S23A and I23C. The contig gaps were then closed by
PCR primer walk. A 353-bp inversion was confirmed that affected the last 278 bases of
ORF17 and the first 76 bases of ORF18 (see Figure 6). In Y4 and S23A, the
region corresponding to the 353-bp fragment is flanked by 5’-GGCTTAC-3’ and
5’-GTCAGCC-3’. Interestingly, the 353-bp fragment in I23C is flanked by a perfect
inverted repeat of 5’-GGCTGAC-3’ and 5’-GTCAGCC-3’. It was noted that the upstream
5’-GGCTGAC-3’ in I23C differs from the upstream 5’-GGCTTAC-3’ in S23A and Y4 by
a transversion of T to G (see the postulated model of the mutation in Figure 6). No such
inversion or base variations were found in the comparable region in other serotype-b
strains (SCC1398, SCC4092, ANH9381, and HK1651).
Due to the 353-bp inversion, the last 93 amino acids of ORF17 are predicted to
be altered. Furthermore, the start codon of ORF18 was lost due to
the inversion. Several potential start codons have been identified in the downstream
sequence; all are followed shortly by a stop codon.
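The relationship between the 7-base flanking sequences can be checked directly: two sequences form a perfect inverted repeat when one is the reverse complement of the other. The short sketch below, with hypothetical helper names, applies this check to the flanking sequences reported above and locates the single base that distinguishes the upstream sequence of S23A and Y4 from that of I23C.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_perfect_inverted_repeat(upstream, downstream):
    """True when the downstream sequence is the reverse complement of the upstream one."""
    return reverse_complement(upstream) == downstream.upper()


def mismatches(seq_a, seq_b):
    """List (0-based position, base in seq_a, base in seq_b) for each differing position."""
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    downstream = "GTCAGCC"
    print(is_perfect_inverted_repeat("GGCTGAC", downstream))  # I23C flanks
    print(is_perfect_inverted_repeat("GGCTTAC", downstream))  # S23A and Y4 flanks
    print(mismatches("GGCTTAC", "GGCTGAC"))
```

For the I23C flanks the check returns True, for the S23A and Y4 flanks it returns False, and the only mismatch is the T to G change at the fifth base of the upstream sequence.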
[Figure 6 image. Model panel: S23A (serotype-b specific antigen expressing), with ORF5, ORF17,
ORF18 and ORF22 marked and the flanking sequences GGCTTAC and GTCAGCC; a hypothetical
intermediate strain with GGCTGAC and GTCAGCC; I23C (serotype-b specific antigen non-expressing),
arising by inversion via recombination of a 7-bp inverted repeat.
Map panel (scale 0 to 22,500 bp): ORF5 to ORF22 of the serotype-b specific antigen gene cluster in
Y4, S23A and I23C. Contig labels, in the order extracted: Contig462, Contig52, Contig473,
Contig578, Contig537, Contig548, Contig558, Contig430, Contig557, Contig503, Contig309, Contig31,
Contig372, Contig349, Contig457, Contig428, Contig141, Contig261.
Gap labels, in the order extracted: Gap 1 (109 bp), Gap 2 (0 bp), Gap 3 (143 bp, 432 bp),
Gap 4 (155 bp) (476 bp), Gap 5 (716 bp), Gap 6 (535 bp) (1,259 bp), Gap 7 (124 bp); Gap 1 (306 bp),
Gap 2 (333 bp), Gap 4 (131 bp); Reconfirm Area.]
Figure 6. Postulated model of the mutation in A. actinomycetemcomitans strain I23C
serotype-b specific antigen gene cluster. Genetic map of the SSP gene clusters of S23A and
I23C and the 353-bp inversion that affects ORF17 and ORF18 in strain I23C. In the model, a
mutation in the wildtype SSP gene cluster converts the 7-base sequence from 5’-GGCTTAC-3’ to
5’-GGCTGAC-3’. The mutated 7-base sequence and the downstream 5’-GTCAGCC-3’ form a pair of
perfect inverted repeats, which mediate the inversion of the fragment, thereby inactivating the
function of ORF17 and ORF18.
Discussion
This study examined the hypothesis that A. actinomycetemcomitans may evolve
in vivo during short-term infection in the same individuals. Both WGS and CGH were
used to assess the genome content of strains recovered from the same individuals. The
detected gene deletions/insertions were then verified by PCR analysis. Phylogenetic analysis
was performed to rule out coinfection by two genetically distinct strains in the same
individuals. The results showed that while two sets of paired strains of A.
actinomycetemcomitans were identical in their genomes, the other two sets of paired
strains demonstrated some differences in genome content. The results suggested that the A.
actinomycetemcomitans genome is largely stable, but may change over time by gene
gain/loss, plasmid gain/loss, or point mutations. However, more rigorous testing of this
hypothesis is necessary due to several inherent difficulties associated with such studies. It
is possible that A. actinomycetemcomitans within an individual at any one time comprises
a population of bacteria that exhibit slight genomic variation, some of which may or may
not persist over time. Therefore, the minor genomic variation revealed between the paired
strains may not be the consequence of short-term evolution during the infection cycle.
Conversely, it is also possible that A. actinomycetemcomitans evolves and mutates in
vivo to a greater extent than demonstrated by this study, but the mutated strains were not
collected for the study. To test these possibilities, it will be necessary to examine
multiple A. actinomycetemcomitans strains within individuals at each time
point.
There are a number of studies in the literature that compared the clonality of A.
actinomycetemcomitans strains recovered from family members or from the same
individuals over time (Alaluusua et al., 1996, Alaluusua et al., 1991, Alaluusua et al.,
1993, Asikainen & Chen, 1999, Asikainen et al., 1997). The methods for clonal
identification were limited to the assessment of one or a few phenotypes, such as serotype
antigen expression by immunodiffusion, or to relatively crude DNA fingerprinting
methods such as arbitrarily primed PCR. The strains from the same individuals over time
or from family members often exhibited identical genotypes or phenotypes. The results
were taken as evidence for the transmission of the bacteria among family members
and for persistent infection in individuals over time. However, it is not always possible to
ascertain whether the clonally identical strains as demonstrated by these techniques were
truly the same strains. Based on the results of this study, there is little doubt that A.
actinomycetemcomitans strains may persist in vivo for a short duration.
A number of studies have reported the detection of nontypeable clinical isolates
of A. actinomycetemcomitans (Paju et al., 1998, Kanasi et al.). Further analysis of
typeable and nontypeable strains from the same individuals revealed that they were most
likely isogenic strains. The mechanism of serotype non-expression and its potential
benefit to the bacteria have not been examined in detail. Among the 24 possible ORFs in
the serotype-b SSP gene cluster, ORF9-ORF21 were indispensable for the expression of the
antigen (Yoshida et al., 1998). In this study, we noted an inversion of a 353-bp fragment that
involved ORF17 and ORF18 within the SSP gene cluster. We also noted the possibility that
a single base mutation in a 7-base sequence in the wildtype S23A created inverted repeats
that could mediate the inversion in strain I23C. The mutation would have deleterious
effects on ORF17 and ORF18, leading to nonexpression of the serotype-specific antigen.
The serotype-specific antigen in A. actinomycetemcomitans is the O-antigen at the end of the
LPS. It could serve as an adhesin for A. actinomycetemcomitans (Fujise et al., 2008).
However, it may also provide a target for the host immune response (Gu et al., 1998, Page et
al., 1991). It seems possible that a mutation in the SSP gene cluster allows A.
actinomycetemcomitans to avoid the host immune response in an environment that no
longer requires the antigen to function as an adhesin. More studies are needed to examine
this hypothesis.
The number of base mutations between bacterial strains may be used to estimate
the time of divergence of the strains (Elena et al., 2005, Lenski et al., 2003). In this study,
no base mutations were detected between the paired strains in the 150 core
genes used for phylogenetic analysis. Here we estimate the divergence time of the paired
strains using the following assumptions: (i) the number of base mutations between the
paired strains is <1, (ii) a mutation rate of 10^-9, (iii) the number of sites at risk for
synonymous mutations is 23.7% of the 150 genes used for analysis, and (iv) a
generation time of one generation per day for A. actinomycetemcomitans. The resulting
divergence time was less than 65 years, which is within one order of magnitude of the time
intervals between the culture identifications of the paired strains. More rigorous analysis of
the divergence time will require mutation analysis of the whole genomes of the paired strains
after the mutations are confirmed by PCR and sequencing analysis.
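The arithmetic behind an estimate of this kind can be sketched as follows. The exact formula used for the figure reported above is not given in the text, so this is one plausible reading rather than a reproduction of it: the mutation rate is assumed to be per site per generation, the 127,857-bp alignment length is taken from the phylogenetic analysis, and the choice between counting mutations along one lineage or along both diverging lineages is left as a parameter. All function and parameter names are hypothetical.

```python
def divergence_time_years(
    alignment_length_bp,
    fraction_sites_at_risk,
    mutation_rate,
    generations_per_year,
    max_mutations=1.0,
    lineages=2,
):
    """Upper bound on divergence time when fewer than max_mutations were observed.

    mutation_rate is assumed to be per site per generation (assumption, see lead-in).
    lineages=2 counts mutations accumulating independently in both strains.
    """
    sites_at_risk = alignment_length_bp * fraction_sites_at_risk
    expected_per_generation = mutation_rate * sites_at_risk * lineages
    generations = max_mutations / expected_per_generation
    return generations / generations_per_year


if __name__ == "__main__":
    for lineages in (1, 2):
        years = divergence_time_years(127_857, 0.237, 1e-9, 365, lineages=lineages)
        print(f"lineages={lineages}: fewer than about {years:.0f} years")
```

Under either reading the bound is on the order of tens of years, which is the same order of magnitude as the estimate given above; the precise value depends on details that the text does not specify.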
Bibliography
Alaluusua, S., S. Asikainen & C. H. Lai, (1991) Intrafamilial transmission of
Actinobacillus actinomycetemcomitans. J Periodontol 62: 207-210.
Alaluusua, S., J. Matteo, L. Gronroos, S. Innila, H. Torkko, S. Asikainen, H. Jousimies
Somer & M. Saarela, (1996) Oral colonization by more than one clonal type of
mutans streptococcus in children with nursing-bottle dental caries. Arch Oral Biol
41: 167-173.
Alaluusua, S., M. Saarela, H. Jousimies Somer & S. Asikainen, (1993) Ribotyping shows
intrafamilial similarity in Actinobacillus actinomycetemcomitans isolates. Oral
Microbiol Immunol 8: 225-229.
Asikainen, S. & C. Chen, (1999) Oral ecology and person-to-person transmission of
Actinobacillus actinomycetemcomitans and Porphyromonas gingivalis.
Periodontol 2000 20: 65-81.
Asikainen, S., C. Chen, A. Alaluusua & J. Slots, (1997) Can one acquire periodontopathic
bacteria and periodontitis from a spouse or a parent? JADA 128: 1263-1271.
Asikainen, S., C. H. Lai, S. Alaluusua & J. Slots, (1991) Distribution of Actinobacillus
actinomycetemcomitans serotypes in periodontal health and disease. Oral
Microbiol Immunol 6: 115-118.
Chattopadhyay, S., S. J. Weissman, V. N. Minin, T. A. Russo, D. E. Dykhuizen & E. V.
Sokurenko, (2009) High frequency of hotspot mutations in core genes of
Escherichia coli due to short-term positive selection. Proc Natl Acad Sci U S A
106: 12412-12417.
Chen, C. & J. Slots, (1999) Microbiological tests for Actinobacillus
actinomycetemcomitans and Porphyromonas gingivalis. [Review] [40 refs].
Periodontology 2000 20: 53-64.
Elena, S. F., T. S. Whittam, C. L. Winkworth, M. A. Riley & R. E. Lenski, (2005)
Genomic divergence of Escherichia coli strains: evidence for horizontal transfer
and variation in mutation rates. Int Microbiol 8: 271-278.
Fine, D. H., K. Markowitz, D. Furgang, K. Fairlie, J. Ferrandiz, C. Nasri, M. McKiernan
& J. Gunsolley, (2007) Aggregatibacter actinomycetemcomitans and its
relationship to initiation of localized aggressive periodontitis: longitudinal cohort
study of initially healthy adolescents. J Clin Microbiol 45: 3859-3869.
Fujise, O., Y. Wang, W. Chen & C. Chen, (2008) Adherence of Aggregatibacter
actinomycetemcomitans via serotype-specific polysaccharide antigens in
lipopolysaccharides. Oral Microbiol Immunol 23: 226-233.
Gu, K., B. Bainbridge, R. P. Darveau & R. C. Page, (1998) Antigenic components of
Actinobacillus actinomycetemcomitans lipopolysaccharide recognized by sera
from patients with localized juvenile periodontitis. Oral Microbiol Immunol 13:
150-157.
Haubek, D., O. K. Ennibi, K. Poulsen, N. Benzarti & V. Baelum, (2004) The highly
leukotoxic JP2 clone of Actinobacillus actinomycetemcomitans and progression
of periodontal attachment loss. J Dent Res 83: 767-770.
Haubek, D., O. K. Ennibi, K. Poulsen, M. Vaeth, S. Poulsen & M. Kilian, (2008) Risk of
aggressive periodontitis in adolescent carriers of the JP2 clone of Aggregatibacter
(Actinobacillus) actinomycetemcomitans in Morocco: a prospective longitudinal
cohort study. Lancet 371: 237-242.
Kanasi, E., B. Dogan, M. Karched, B. Thay, J. Oscarsson & S. Asikainen, Lack of
serotype antigen in A. actinomycetemcomitans. J Dent Res 89: 292-296.
Kennemann, L., X. Didelot, T. Aebischer, S. Kuhn, B. Drescher, M. Droege, R.
Reinhardt, P. Correa, T. F. Meyer, C. Josenhans, D. Falush & S. Suerbaum, (2011)
Helicobacter pylori genome evolution during human infection. Proc Natl Acad Sci
U S A 108: 5033-5038.
Kittichotirat, W., R. E. Bumgarner, S. Asikainen & C. Chen, (2011) Identification of the
pangenome and its components in 14 distinct Aggregatibacter
actinomycetemcomitans strains by comparative genomic analysis. PLoS One 6:
e22420.
Lenski, R. E., C. L. Winkworth & M. A. Riley, (2003) Rates of DNA sequence evolution
in experimental populations of Escherichia coli during 20,000 generations. J Mol
Evol 56: 498-508.
Margulies, M., M. Egholm, W. E. Altman, S. Attiya, J. S. Bader, L. A. Bemben, J. Berka,
M. S. Braverman, Y. J. Chen, Z. Chen, S. B. Dewell, L. Du, J. M. Fierro, X. V.
Gomes, B. C. Godwin, W. He, S. Helgesen, C. H. Ho, G. P. Irzyk, S. C. Jando, M.
L. Alenquer, T. P. Jarvie, K. B. Jirage, J. B. Kim, J. R. Knight, J. R. Lanza, J. H.
Leamon, S. M. Lefkowitz, M. Lei, J. Li, K. L. Lohman, H. Lu, V. B. Makhijani,
K. E. McDade, M. P. McKenna, E. W. Myers, E. Nickerson, J. R. Nobile, R.
Plant, B. P. Puc, M. T. Ronan, G. T. Roth, G. J. Sarkis, J. F. Simons, J. W.
Simpson, M. Srinivasan, K. R. Tartaro, A. Tomasz, K. A. Vogt, G. A. Volkmer, S.
H. Wang, Y. Wang, M. P. Weiner, P. Yu, R. F. Begley & J. M. Rothberg, (2005)
Genome sequencing in microfabricated high-density picolitre reactors. Nature
437: 376-380.
Morelli, G., X. Didelot, B. Kusecek, S. Schwarz, C. Bahlawane, D. Falush, S. Suerbaum
& M. Achtman, (2010) Microevolution of Helicobacter pylori during prolonged
infection of single hosts and within families. PLoS Genet 6: e1001036.
Ørskov, F. & I. Ørskov, (1983) Summary of a workshop on the clone concept in the
epidemiology, taxonomy, and evolution of the Enterobacteriaceae and other
bacteria. J Infect Dis 148: 346-357.
Page, R. C., T. J. Sims, L. D. Engel, B. J. Moncla, B. Bainbridge, J. Stray & R. P.
Darveau, (1991) The immunodominant outer membrane antigen of Actinobacillus
actinomycetemcomitans is located in the serotype-specific high-molecular-mass
carbohydrate moiety of lipopolysaccharide. Infect Immun 59: 3451-3462.
Paju, S., M. Saarela, S. Alaluusua, P. Fives-Taylor & S. Asikainen, (1998)
Characterization of serologically nontypeable Actinobacillus
actinomycetemcomitans isolates. J Clin Microbiol 36: 2019-2022.
Rudney, J. D., R. Chen & G. J. Sedgewick, (2001) Intracellular Actinobacillus
actinomycetemcomitans and Porphyromonas gingivalis in buccal epithelial cells
collected from human subjects. Infect Immun 69: 2700-2707.
Rudney, J. D., R. Chen & G. J. Sedgewick, (2005) Actinobacillus
actinomycetemcomitans, Porphyromonas gingivalis, and Tannerella forsythensis
are components of a polymicrobial intracellular flora within human buccal cells. J
Dent Res 84: 59-63.
Rylev, M. & M. Kilian, (2008) Prevalence and distribution of principal periodontal
pathogens worldwide. J Clin Periodontol 35: 346-361.
Slots, J. & M. Ting, (1999) Actinobacillus actinomycetemcomitans and Porphyromonas
gingivalis in human periodontal disease: occurrence and treatment. Periodontol
2000 20: 82-121.
Takada, K., M. Saito, O. Tsuzukibashi, Y. Kawashima, S. Ishida & M. Hirasawa, (2010)
Characterization of a new serotype g isolate of Aggregatibacter
actinomycetemcomitans. Mol Oral Microbiol 25: 200-206.
Yoshida, Y., Y. Nakano, Y. Yamashita & T. Koga, (1998) Identification of a genetic locus
essential for serotype b-specific antigen synthesis in Actinobacillus
actinomycetemcomitans. Infect Immun 66: 107-114.
Yue, G., J. B. Kaplan, D. Furgang, K. G. Mansfield & D. H. Fine, (2007) A second
Aggregatibacter actinomycetemcomitans autotransporter adhesin exhibits
specificity for buccal epithelial cells in humans and Old World primates. Infect
Immun 75: 4440-4448.
Appendix A: Information of the 14 published
A. actinomycetemcomitans strains used in this study.
Strain Locus Tag Genbank Accession no.
Serotype-a_D17P3 D17P3 ADOA00000000
Serotype-a_D7S D7S ADCF00000000
Serotype-a/x_H5P1 H5P1 AEJK00000000
Serotype-b_SCC1398 SCC1398 AEJP00000000
Serotype-b_ANH9381 ANH9381 CP003099
Serotype-b_I23C I23C AEJQ00000000
Serotype-c_SCC2302 SCC2302 AEJR00000000
Serotype-c_D11S1 D11S CP001733.1
Serotype-c_D17P2 D17P2 ADOB00000000
Serotype-d_I63B I63B AEJL00000000
Serotype-e_SCC393 SCC393 AEJN00000000
Serotype-e_SC1083 SC1083 AEJM00000000
Serotype-f_D18P1 D18P1 AEJO00000000
Serotype-b_HK1651 HK1651 http://www.genome.ou.edu/act.html
Appendix B: Cluster ID and functions of the 150 core genes
used for phylogenetic analysis of A. actinomycetemcomitans.
ClusterID Product Description Expected Length ClusterID Product Description Expected Length
p-cluster00040 ponB, mrcB 2394 p-cluster00599 rpL2, rplB 822
p-cluster00057 ycbY 2163 p-cluster00602 truA, hisT 819
p-cluster00112 msbA, ywjA 1749 p-cluster00619 DeoR family 798
p-cluster00122 fruA 1698 p-cluster00628 comL 795
p-cluster00138 cstA 1599 p-cluster00626 atpB 789
p-cluster00162 emrB 1518 p-cluster00627 lpxA 789
p-cluster00169 murC 1494 p-cluster00686 surE 786
p-cluster00196 pntB 1485 p-cluster00643 COG3022 780
p-cluster00187 hyfD 1446 p-cluster00645 vacJ 777
p-cluster00236 folC 1392 p-cluster00646 hycG, hevG 777
p-cluster00235 Na+/H+ antiporter NhaC 1386 p-cluster00658 ydfG 768
p-cluster00220 atpD 1374 p-cluster00664 rph 762
p-cluster00225 hemN 1368 p-cluster00668 udp, mtaP, deoD 759
p-cluster00241 mrsA 1338 p-cluster00673 23S rRNA pseudouridine synthase D 753
p-cluster00246 srmB, deaD, rhlB 1332 p-cluster00676 trmD 753
p-cluster00261 murD 1317 p-cluster00685 act 741
p-cluster00258 eno 1311 p-cluster00697 fabG, envM 729
p-cluster00265 pepP 1302 p-cluster00702 yhbG 726
p-cluster00269 lamB 1302 p-cluster00707 putative M22 peptidase-like protein YeaZ 723
p-cluster00272 kdtA, waaA 1284 p-cluster00715 rpS3, rpsC 708
p-cluster00279 hemY 1278 p-cluster00725 ribonuclease T2 family 696
p-cluster00280 thrC 1275 p-cluster00727 pfs, mtn 693
p-cluster00330 galK 1248 p-cluster00730 rpl1, rplA 690
p-cluster00292 nqrF 1236 p-cluster00774 yrbC 690
p-cluster00324 sucC 1224 p-cluster00749 ATP-binding protein 684
p-cluster00299 fabB 1221 p-cluster00744 metI 678
p-cluster00317 lpxB 1185 p-cluster00748 dsbA 675
p-cluster00336 integral membrane protein 1140 p-cluster00758 minC 675
p-cluster00342 anhydro-N-acetylmuramic acid kinase 1131 p-cluster00756 ccmB 666
p-cluster00346 nagZ, hexA 1128 p-cluster00775 hyfE 639
p-cluster00363 ychF, obg 1128 p-cluster00813 slyD 639
p-cluster00347 ribD 1125 p-cluster00781 tmk 633
p-cluster00349 rodA, mrdB 1122 p-cluster00785 nqrD 630
p-cluster00353 asd 1113 p-cluster00789 pyrR, upp 627
p-cluster00356 gpcE, ispG 1104 p-cluster00792 rhtC 624
p-cluster00412 purM 1095 p-cluster00809 protein YcfC 612
p-cluster00367 apbE 1089 p-cluster00826 putative glycerol-3-phosphate acyltransferase PlsY 606
p-cluster00368 aroB 1089 p-cluster00818 rpL4, rplD 603
p-cluster03251 nrfF 1086 p-cluster00819 recR 603
p-cluster00378 aroC 1074 p-cluster00822 hyfA 600
p-cluster00381 conserved permease 1068 p-cluster00827 D-fructose-6-phosphate amidotransferase 594
p-cluster00383 murG 1065 p-cluster00829 nudE 591
p-cluster00428 bioB 1059 p-cluster00836 dcd 588
p-cluster00415 lpxD, firA 1038 p-cluster00832 mob 585
p-cluster00431 ydeZ 1011 p-cluster00847 ruvC 573
p-cluster00430 trpS 1005 p-cluster00856 protein in HemN 3'region 561
p-cluster00451 thiL 996 p-cluster00857 integral membrane protein 561
p-cluster00456 rluD, sfhB 996 p-cluster00860 yeaY, slp, rnd 558
p-cluster00442 pheS 990 p-cluster00867 comB 558
p-cluster00443 rpoA 990 p-cluster00866 ccmG, dsbE 552
p-cluster00446 trpX, miaA 987 p-cluster00876 rpL6, rplF 534
p-cluster00452 oppD 984 p-cluster00878 hslV, clpQ 531
p-cluster00458 cysB, cbl, gltC 972 p-cluster00879 ppa, ipyR 531
p-cluster00461 ydeW 966 p-cluster00885 actF, fldA 525
p-cluster00462 pfkA 966 p-cluster00887 periplasmic protein 522
p-cluster00465 fatD, yclN, ceuB 966 p-cluster00911 YfbU family protein 507
p-cluster00479 yfcB 951 p-cluster00903 tpx 501
p-cluster00491 fabD 948 p-cluster00907 rps5, rpsE 501
p-cluster00484 tkt 945 p-cluster00908 pgpA 501
p-cluster00487 fruk, lacC 942 p-cluster00912 kdtB, coaD 495
p-cluster00493 ftsX 936 p-cluster06280 COG1238 489
p-cluster00500 rfaD 927 p-cluster00926 ribH 474
p-cluster00502 HflC 927 p-cluster00927 YccF 474
p-cluster00549 hslO, hsp33 909 p-cluster00928 IscR 474
p-cluster00517 mepA 903 p-cluster00932 COG1607 471
p-cluster00569 menB 897 p-cluster00968 atpC 456
p-cluster00540 sapC 888 p-cluster00955 holD 453
p-cluster00541 D-3-phosphoglycerate dehydrogenase 888 p-cluster00975 excinuclease ATPase subunit 447
p-cluster00542 nagC, glcK 888 p-cluster00962 thioester dehydrase family 444
p-cluster00555 sucD 879 p-cluster00981 integral membrane protein 441
p-cluster00550 citE, cilB 876 p-cluster00966 organic solvent tolerance protein 435
p-cluster00556 N-acetylmannosamine kinase p-cluster00976 COG2050 432
p-cluster00562 ksgA 864 p-cluster00982 hslR 417
p-cluster00577 AfeD, yfeD 849 p-cluster00984 smpA 414
p-cluster00597 icc 825 p-cluster00988 excinuclease ABC subunit A 414
Appendix C: Present/absent genes identified by WGS but not covered by CGH.
Cluster ID Expected Length (bp) Gene Description Present (+) / Absent (-) in each of the paired strains
SCC393 A160
p-cluster00755 117 glutamate racemase + -
p-cluster01053 303 COG3668:Plasmid stabilization system + -
p-cluster01138 123 Hypothetical Protein + -
p-cluster01300 126 Hypothetical Protein + -
p-cluster01480 219 Sucrose-6-phosphate hydrolase (Sucro + -
p-cluster01506 1668 sulfatase + -
p-cluster01552 636 tfoX,sxy: DNA transformation protein + -
p-cluster01556 630 tadF:tight adherence protein F + -
p-cluster01651 144 IS1016 transposase + -
p-cluster01751 153 Hypothetical Protein + -
p-cluster01753 153 Hypothetical Protein pVT745_p22 + -
p-cluster01777 156 hypothetical protein D11S_2004 + -
p-cluster01876 168 Hypothetical protein + -
p-cluster01894 174 Hypothetical protein + -
p-cluster01967 189 Hypothetical protein + -
p-cluster02004 678 hypothetical protein NTO5HA_0294 + -
p-cluster02030 642 COG0614: ABC-type Fe3+-hydroxamate + -
p-cluster02061 243 hypothetical protein CT1927 + -
p-cluster02062 243 protein HipA + -
p-cluster02110 234 Hypothetical protein + -
p-cluster02134 150 Hypothetical protein + -
p-cluster02140 123 hypothetical protein + -
p-cluster02455 138 Hypothetical protein + -
p-cluster02841 231 Hypothetical protein + -
p-cluster02923 162 Hypothetical protein + -
p-cluster04441 204 hypothetical protein pVT745_p21 + -
p-cluster04801 147 Hypothetical protein + -
p-cluster04804 243 Hypothetical protein + -
p-cluster12497 156 hypothetical protein D11S_1881 + -
p-cluster12703 270 hypothetical protein HS_0521 + -
p-cluster12864 123 hypothetical protein + -
p-cluster13482 132 Hypothetical protein + -
p-cluster13483 192 Hypothetical protein + -
p-cluster13511 114 Hypothetical protein + -
p-cluster13977 201 Hypothetical protein + -
p-cluster14115 237 Hypothetical protein + -
p-cluster01065 120 Hypothetical protein - +
p-cluster01474 315 17 kDa Surface antigen - +
p-cluster02142 255 Hypothetical protein - +
p-cluster02688 510 transcriptional regulator - +
p-cluster15256 114 Hypothetical protein - +
p-cluster15257 141 Hypothetical protein - +
SCC1398 SCC4092
p-cluster01550 138 Hypothetical Protein + -
p-cluster00457 114 Hypothetical Protein - +
S23A I23C
p-cluster01907 177 hypothetical protein + -
p-cluster01957 186 membrane protein + -
p-cluster02319 390 hypothetical protein NTHI0155 + -
p-cluster02459 543 hypothetical protein pVT745_p04 + -
p-cluster01172 276 hypothetical protein - +
p-cluster01320 678 hypothetical protein - +
p-cluster01668 147 Hypothetical protein - +
p-cluster01699 150 Hypothetical protein - +
p-cluster01967 189 hypothetical protein D11S_1721 - +
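The presence/absence rows of Appendix C can be tallied per strain pair with a few lines of code. The following is a minimal Python sketch, not the method used in the thesis. It assumes each row is whitespace-separated, with the cluster ID first, the expected length second, and the last two tokens marking presence (+) or absence (-) in the first and second strain of the pair. The names `parse_row` and `tally` are hypothetical; the sample rows are copied from the S23A/I23C block above.

```python
from collections import Counter

# Rows copied verbatim from the S23A/I23C block of Appendix C.
S23A_I23C_ROWS = """\
p-cluster01907 177 hypothetical protein + -
p-cluster01957 186 membrane protein + -
p-cluster02319 390 hypothetical protein NTHI0155 + -
p-cluster02459 543 hypothetical protein pVT745_p04 + -
p-cluster01172 276 hypothetical protein - +
p-cluster01320 678 hypothetical protein - +
p-cluster01668 147 Hypothetical protein - +
p-cluster01699 150 Hypothetical protein - +
p-cluster01967 189 hypothetical protein D11S_1721 - +
"""


def parse_row(line):
    """Split one table row into (cluster_id, length_bp, description, in_first, in_second)."""
    tokens = line.split()
    cluster_id = tokens[0]
    length_bp = int(tokens[1])
    # The last two tokens are the +/- flags; everything in between is the description.
    description = " ".join(tokens[2:-2])
    return cluster_id, length_bp, description, tokens[-2] == "+", tokens[-1] == "+"


def tally(rows_text):
    """Count genes found only in the first strain and only in the second strain."""
    counts = Counter()
    for line in rows_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        _cid, _length, _desc, in_first, in_second = parse_row(line)
        if in_first and not in_second:
            counts["first_only"] += 1
        elif in_second and not in_first:
            counts["second_only"] += 1
    return counts


if __name__ == "__main__":
    result = tally(S23A_I23C_ROWS)
    print("S23A only:", result["first_only"])
    print("I23C only:", result["second_only"])
```

Taking the flags from the end of each row keeps the parser independent of how many words the gene description contains.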
Abstract
Background: The periodontal pathogen Aggregatibacter actinomycetemcomitans exhibits marked variation in genomic content among strains. Such variation presumably arises from the different evolutionary pathways of A. actinomycetemcomitans strains. However, it is not known whether genomic variation of A. actinomycetemcomitans can occur during short-term persistent infections in vivo.
Hypothesis: A. actinomycetemcomitans may evolve in vivo in patients with periodontal disease. If so, the genomic variation may be related to the mechanisms by which the bacteria adapt to the host, and may provide insight into the mechanisms of persistent A. actinomycetemcomitans infection in humans.
Methods: Four pairs of A. actinomycetemcomitans strains (SCC393/A160, SCC1398/SCC4092, SCC2302/AAS4a, S23A/I23C), henceforth referred to as "paired strains", were recovered from four individuals, one pair per individual, over a period of 0 to 10 years. The paired strains were subjected to (i) whole genome sequencing (WGS), (ii) phylogenetic analysis, (iii) comparative genomic hybridization (CGH) with an A. actinomycetemcomitans pan-genome microarray, and (iv) PCR analysis to confirm the mutations in selected genes between the paired strains.
Results: The strains of each pair were confirmed by phylogenetic analysis of 150 core genes to derive from a recent common ancestral strain. Indications of short-term evolution were obtained from two of the four pairs. Two genes (encoding hypothetical proteins) in strain SCC2302 (the first isolate) were not detected in strain AAS4a (isolated three years later). In the S23A/I23C pair (isolated at the same time), S23A had ten genes that were not detected in I23C. These ten genes were found to be part of a 24.1-kb plasmid in S23A. An intact serotype-specific gene cluster was found in S23A, which expresses the serotype b antigen. In contrast, nontypeable I23C was found to have a 353-bp reversion in the gene cluster, which has apparently inactivated two ORFs of the cluster.
Conclusion: A. actinomycetemcomitans genomes are largely stable during short-term persistent infections in humans, but may evolve via gene gains/losses or mutations.
Asset Metadata
Creator
Sun, Ruoxing (author)
Core Title
Aggregatibacter actinomycetemcomitans evolves in vivo in patients with periodontal disease
School
School of Dentistry
Degree
Master of Science
Degree Program
Craniofacial Biology
Publication Date
05/04/2012
Defense Date
05/04/2012
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
Actinobacillus actinomycetemcomitans,aggressive periodontitis,comparative genomic hybridization,genomic stability,indel mutation,microarray analysis,OAI-PMH Harvest
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Chen, Casey (committee chair), Paine, Michael L. (committee member), Zeichner-David, Maggie (committee member)
Creator Email
ruoxings@usc.edu,ruoxingsun@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c3-30305
Unique identifier
UC11289106
Identifier
usctheses-c3-30305 (legacy record id)
Legacy Identifier
etd-SunRuoxing-766-0.pdf
Dmrecord
30305
Document Type
Thesis
Rights
Sun, Ruoxing
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA