CLUSTER ANALYSIS OF p53 MUTATIONAL SPECTRA
Copyright 1998
by
Mary Ju Fang Lo
A Thesis Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
MASTER OF SCIENCE
August 1998
Mary Ju Fang Lo
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007
This thesis, written by
Mary Ju Fang Lo
under the direction of her Thesis Committee,
and approved by all its members, has been
presented to and accepted by the Dean of The
Graduate School, in partial fulfillment of the
requirements for the degree of
Master of Science
7W J u l 7 6_ > 1 9 9 8
THESIS COMMITTEE
TABLE OF CONTENTS
Abstract
Introduction
Background
Methods
Results
Discussion
Conclusion
References
LIST OF TABLES AND FIGURES
Table 1. Number of mutations of cancers selected from the p53 database
Table 2. Percentage of G to T transversions in smoking-related cancers
Figure 1. Mutational spectra of smoking-related cancers
Figure 2. Mutational spectra of cancers with high proportions of CpG mutations
Figure 3. Mutational spectra of cancers associated with UV light
Figure 4. Mutational spectra of liver, nasopharynx, and prostate cancers
Figure 5. Euclidean distance clustering of cancer mutational spectra with lung cancer smoking status
Figure 6. Euclidean distance clustering of cancer mutational spectra with lung cancer histology
Figure 7. Log-likelihood ratio distance clustering of cancer mutational spectra with lung cancer smoking status
Figure 8. Log-likelihood ratio distance clustering of cancer mutational spectra with lung histology
ABSTRACT
p53 is a tumor suppressor gene that is mutated in approximately half
of all cancers. A mutational spectra analysis program that calculates Euclidean
and log-likelihood ratio distances between two cancers based on mutation type,
strand specificity, and hotspot parameters was tested on mutations from a p53
database. Additional smoking exposure and histology information for lung cancer
mutations were compiled from the literature. Cancer distances were then grouped
by cluster analysis. Cancers with significant proportions of endogenous mutations
clustered early while cancers with known hotspots (lung and liver) or unique
mutational spectra (skin) caused by exogenous sources clustered late. Clustering
of log-likelihood ratio distances is more easily interpreted. This method is useful
in generating exposure hypotheses and comparing the mutational spectra of two or
more cancers in relation to other cancers.
INTRODUCTION
p53 is a tumor suppressor gene located on chromosome 17p13 with an
important role in regulating the cell cycle. p53 mutations are found in a majority
of cancers (Hollstein et al., 1991), typically as a somatic point mutation in one
allele and loss of heterozygosity in the other. The p53 gene has 11 exons and
encodes a 53 kilodalton nuclear phosphoprotein composed of 393 amino acids.
The p53 gene is conserved in evolution. Conserved regions II-V
correspond to the center of the protein and approximately to exons 5-8. The
center of the protein is a DNA binding domain where mutations are most likely to
occur at sites called hotspots. X-ray crystallography methods have confirmed that
these mutations affect conformation of the protein preventing the protein from
binding to DNA (Cho et al., 1994). Mutations select for p53 proteins that fail to
bind to DNA in a sequence-specific fashion.
In its normal state, a cell has low concentrations of p53 protein. Levels of
p53 protein increase under stressful conditions. Such stress includes DNA
double-strand breaks produced by γ irradiation and the presence of DNA repair
intermediates after ultraviolet irradiation or chemical damage to DNA. Hypoxia
is another condition that activates the p53 protein. Rapidly growing tumor cells
require a constant blood supply to sustain growth. Hypoxia triggers an increase in
p53 levels resulting in cell death. Mutant cells with dysfunctional p53 have a
survival advantage under hypoxic conditions.
The p53 gene regulates the activity of multiple genes involved in DNA
repair and cell cycle control in a positive or negative manner (Levine et al., 1997).
An increase in p53 protein levels induces cell cycle arrest late in the G1 phase.
This gives the cell an opportunity to repair its DNA before it divides. Additional
cell cycle regulation occurs at the G2/M checkpoint. An increase in p53 can also
activate apoptosis, or programmed cell death. p53, in conjunction with other
genes, induces apoptosis. Apoptosis is a cell's natural defense against cancer. By
inducing their own death, mutated cells are prevented from dividing and becoming
cancerous.
The p53 gene is an appropriate gene for mutational spectra analysis.
There is a broad range of p53 mutations by type and position. p53 mutations are
common in many human cancers, allowing for comparison of different cancers.
Germline mutational spectra of the p53 gene can be constructed. Carriers of a
germline p53 mutation often have Li-Fraumeni syndrome in which 50% get
multiple cancers before the age of 30. Sarcomas and multiple primary cancers
including stomach, colon, and early-onset breast cancer are characteristic of Li-
Fraumeni syndrome.
BACKGROUND
A mutational spectrum is defined as the pattern of mutations in relation to
type, location, and frequency. Analysis of mutational spectra usually involves
comparing proportions of mutations classified by type. The most common
mutation of the p53 gene is the point mutation or base substitution. Substitution
mutations can be classified as transitions and transversions. A transition is a
mutation from a purine to another purine or a pyrimidine to another pyrimidine.
A transversion is a mutation from purine to pyrimidine or pyrimidine to purine.
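The transition/transversion distinction can be expressed as a short rule. The following Python sketch is only an illustration of the definitions above; the function name and the single-letter base encoding are assumptions made for this example and are not part of any program described in this thesis.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion.

    ref is the wild-type base and alt is the mutant base (A, C, G, or T).
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Purine to purine or pyrimidine to pyrimidine is a transition;
    # any change between the two chemical classes is a transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("G", "A"), ("C", "T"), ("G", "T"), ("G", "C")]:
        print(f"{ref} to {alt}: {classify_substitution(ref, alt)}")
```

Under these definitions G to A and C to T are transitions, while G to T and G to C are transversions.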
Mutations can be caused by exogenous agents such as UV radiation and
diet or by endogenous sources that arise from biological reactions within the cell.
Endogenous sources may be DNA polymerase errors during DNA replication and
CpG transitions. One of the ways the cell proofreads its DNA is by adding a
methyl group to cytosine at CpG sites, places on the DNA strand where cytosine
is adjacent to a guanine. The spontaneous deamination of 5-methylcytosine
converts cytosine to thymine, producing a C to T transition. Proportions of CpG
mutations serve as a background rate of mutations. An increase in the proportion of
exogenous mutations compared to CpG mutations signifies that an external
mutagen may be responsible for causing mutations.
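To make the notion of a CpG transition concrete, the sketch below checks whether a substituted base lies within a CpG dinucleotide and whether the change is the kind expected from deamination of 5-methylcytosine. It is a hedged illustration only: the function names, the 0-based indexing, and the short test sequence are assumptions for this example, and a G to A change at the G of a CpG is counted because it corresponds to a C to T change on the opposite strand.

```python
def in_cpg(sequence: str, index: int) -> bool:
    """Return True if the base at 0-based index is the C or the G of a 5'-CG-3' pair."""
    seq = sequence.upper()
    base = seq[index]
    if base == "C":
        return index + 1 < len(seq) and seq[index + 1] == "G"
    if base == "G":
        return index > 0 and seq[index - 1] == "C"
    return False


def is_cpg_transition(sequence: str, index: int, alt: str) -> bool:
    """Return True for C to T at the C of a CpG, or G to A at its G.

    The G to A case is the same deamination event read from the opposite strand.
    """
    base = sequence[index].upper()
    return in_cpg(sequence, index) and (base, alt.upper()) in {("C", "T"), ("G", "A")}


if __name__ == "__main__":
    example = "ACGTCA"  # hypothetical sequence, not taken from p53
    print(is_cpg_transition(example, 1, "T"))  # C of the CpG mutated to T
    print(is_cpg_transition(example, 4, "T"))  # C not followed by G
```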
Strand specificity occurs when a mutation type is found on one strand of
DNA more frequently than the other and is due to a mutagen preferentially
attacking one strand or strand-specific repair. The non-coding, or template strand
is repaired faster than the coding, non-template strand. If the coding strand does
not get repaired before the cell divides, the mutation is passed on to the daughter
cells. Cancers with the highest percentage of strand specificity were found to be
tobacco-related cancers such as lung, liver, and pancreas (Greenblatt et al., 1994)
followed by cervix, esophagus, and head and neck. Cancers not associated with a
well-known carcinogen were not found to have high percentages of strand
specificity. This is in agreement with the hypothesis that a nontranscribed strand
bias signifies exposure to an exogenous carcinogen.
Some organs have been found to have distinct p53 mutational spectra
caused by a specific exposure. Lung cancer cells have a majority of G to T
transversions of the p53 gene related to tobacco smoke exposure. G to T
transversions are found in highest proportions in lung cancer (Greenblatt et al.,
1994). There is a dose-response association between G to T transversions in lung
cancer and duration of smoking (Takeshima et al., 1993). In particular, a
polyaromatic hydrocarbon compound in smoke, benzo[a]pyrene diol epoxide
(BPDE), forms adducts with DNA. Benzo[a]pyrene has long been suspected as
the carcinogen responsible for G to T transversions in lung cancer. This was
supported by an in vitro study that provided a direct link between benzo[a]pyrene
and G to T transversions in lung cells (Denissenko et al., 1997). Benzo[a]pyrene
adducts were found in BPDE-treated HeLa cells and bronchial epithelial cells at
guanine residues in codons 157, 248, and 273. These codons are mutational
hotspots in lung cancer.
Tobacco and alcohol are known risk factors for oral cancers (Blot et al.,
1988). Providing molecular evidence of an external risk factor for head and neck
cancer, more than half of the p53 mutations in head and neck cancer were
substitutions at base pairs with nontranscribed strand bias while CpG transitions
were uncommon (Greenblatt et al., 1994). Among patients with squamous cell
carcinoma of the head and neck, an increased frequency of p53 mutations was
observed with cigarette smoking and alcohol consumption compared to patients
exposed to neither. Among four patients with p53 mutations who did not smoke
or drink, only endogenous mutations were found (Brennan et al., 1995). Different
common mutations of upper respiratory tract cancers (G to A transitions) and
lower respiratory tract cancers (G to T transversions) (Law et al., 1995) may be
due to different exposures between the two sites. The upper aerodigestive tract
may be exposed to more risk factors such as alcohol, chewing tobacco, and other
carcinogens in tobacco smoke.
The analysis of mutational spectra can provide evidence of a causal link between
a carcinogen exposure and cancer. Liver cancer has a unique mutational hotspot of
G to T transversions at codon 249 due to dietary aflatoxin in preserved foods. The
mutational hotspot was observed in regions of high liver cancer incidence such as
Qidong, China; Mozambique; and Monterrey, Mexico, where aflatoxin levels were
also high (Hsu et al., 1991; Ozturk et al., 1991; Soini et al., 1996). In regions
where rates of hepatitis B virus (HBV) exposure were equivalent, the incidence
of codon 249 mutations varied with the risk of aflatoxin exposure,
showing that HBV was not directly responsible for this hotspot.
A unique mutational spectrum is referred to as a fingerprint of exposure
(Semenza and Weasel, 1997). A pattern of mutations known to be associated with
a carcinogen found in a group of patients gives evidence that these patients were
exposed to the carcinogen. In skin cancer cells, mutations found at pyrimidine
dimers, particularly at adjacent cytosines, provide evidence of ultraviolet
radiation exposure. Brash et al. found a majority of invasive skin squamous cell
carcinomas (58%) had p53 mutations and showed that CC to TT mutations were
induced by UV light. p53 mutations are rare in melanomas. Melanomas are
hypothesized to have an alternative mechanism for the inactivation of cell
proliferation control.
Xeroderma pigmentosum (XP) is a rare genetic disease in which patients
have a nucleotide excision repair defect. XP patients are extremely photosensitive
and are at increased risk of developing sunlight-induced cancers. The UV
component of sunlight causes genetic damage. Most mutations of the p53 gene in
skin tumors in normal subjects and XP patients were tandem CC to TT transitions,
a signature of UV exposure (Dumaz et al., 1994). The UV-specific pattern was
not found in internal cancers. Mutations at dipyrimidine sequences in non-XP
skin tumors and XP skin tumors were significantly different from internal tumors
(p<0.01). Demonstrating preferential repair in humans, almost all mutations in
XP tumors were on the non-transcribed strand (95%) whereas no strand bias was
found for internal or non-XP skin tumors.
The analysis of mutational spectra can help determine whether a cancer is
caused by exogenous agents. A role for environmental factors in the formation of
breast cancer was implied by comparing the mutational spectra of breast cancer to
those of lung and colon cancers (Biggs et al., 1993). The colon cancer spectrum was
similar to the germline spectra of the p53 gene and the Factor IX gene. Breast
cancer differed from colon cancer in that proportions of G to A and C to T
transitions at CpG dinucleotides were significantly reduced and G to T
transversions were significantly increased (p<0.0005). Similar to lung cancer,
breast cancer had an excess of G to T transversions at codon 157 and more G to T
transversions on the coding strand. The differences between breast and colon
mutational spectra may be due to differences in repair mechanisms. Breast tissue
may not be able to repair mutated DNA as well as colon tissue, which is constantly
exposed to carcinogenic substances.
To implicate a carcinogen as a cause of mutations, the gene must be
mutated during exposure to the carcinogen in question. Knowing when gene
mutations occur is important in understanding the tumor forming process and
varies by cancer. The p53 mutation in colon cancer occurs between the transition
of large villous adenoma to invasive carcinoma (Fearon and Vogelstein, 1990).
Mutations occur early in the tumor forming process in skin cancer (Ziegler et al.,
1994). Cancers of the lung, esophagus, head and neck, breast, cervix, and
stomach have alterations in the p53 pathway as an early event and cancers of the
brain, thyroid, liver, and ovary have p53 alterations as a late event (Greenblatt et
al., 1994).
Few programs exist that analyze mutational spectra. A mutational analysis
program written by Neal Cariello gives a p value as a measure of how different
two mutational spectra are (Cariello et al., 1994). For each base position, the
nature of the mutation out of three types of possible mutations is noted.
Information is retained only for mutated bases. Two mutational spectra are
compared using the Adams-Skopek algorithm to determine the probability that a
distribution of single-base substitutions in two spectra exists by chance based on
random spectra generated from the hypergeometric distribution (Adams and
Skopek, 1987).
A mutational spectra analysis program written by Jonathan Buckley
compares mutational spectra of two or more cancers by proportion of mutation
type. The program calculates Euclidean and log-likelihood ratio distances
between two cancers allowing for strand specificity and hotspot parameters. This
program was tested with mutations from a p53 database and the published
literature. The objective of this analysis is to do a global comparison of the
mutational spectra of multiple cancers using this new program and a clustering
procedure.
METHODS
The p53 database maintained by Thierry Soussi was downloaded from the
Internet in October 1997 (Cariello et al., 1996). At the time the database
contained 6053 mutations. The database included information on cancer type, cell
origin, loss of heterozygosity, base position, nature of mutation, amino acid
position, wild-type and mutant amino acid sequence, local sequence around a
mutation, and literature citations. The mutations were sorted into individual
cancer datasets with base change and base position information. For the purpose
of including smoking and histology information in the analysis, the lung cancer
mutations from the database were excluded and replaced by data of p53 mutations
in lung cancer found in the literature. Only articles with additional information on
smoking exposure or histology were recorded. Lung cancer mutations were
sorted into datasets either by smoking status (yes/no) or histology as listed in the
reference article.
The Mutational Spectra Analysis program is written in Fortran by
Jonathan Buckley. The MSA program compares the mutation dataset of selected
cancers against a normal p53 sequence and calculates percentages of insertions,
deletions, and point mutations for each cancer. Point mutations are separated into
CpG mutations and non-CpG point mutations. The MSA program calculates
probabilities for six different types of mutations, A:T to G:C, A:T to C:G, A:T to
T:A, G:C to A:T (non-CpG), G:C to T:A, and G:C to C:G. Optionally, the MSA
program gives hotspot and strand specificity percentages. Strand specificity
percentages are given for coding versus non-coding strand. Hotspot mutations are
calculated for benzo[a]pyrene hotspots at base positions 469, 818, and 743
(codons 157, 273, and 248), the aflatoxin hotspot at base position 747 (codon
249), as well as other CpG transition hotspots.
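The six substitution classes and the hotspot positions above can be illustrated with a brief Python sketch. This is not the Fortran MSA program; the function names are hypothetical, and the codon formula assumes base positions are numbered from the first base of the coding sequence, an assumption that is consistent with the position and codon pairs quoted above (469/157, 818/273, 743/248, 747/249).

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_pair_class(ref: str, alt: str) -> str:
    """Collapse a substitution and its opposite-strand equivalent into one of six classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Describe every change from the purine (A or G) of the wild-type base pair,
    # so that, for example, C to A and G to T both map to "G:C to T:A".
    if ref in ("T", "C"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]} to {alt}:{COMPLEMENT[alt]}"


def codon_of(base_position: int) -> int:
    """Map a 1-based coding-sequence base position to its codon number (assumed numbering)."""
    return (base_position - 1) // 3 + 1


def class_percentages(substitutions):
    """Return the percentage of each base-pair class among (ref, alt) pairs."""
    counts = Counter(base_pair_class(ref, alt) for ref, alt in substitutions)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    print([codon_of(p) for p in (469, 818, 743, 747)])
    # Hypothetical mutations, for illustration only.
    print(class_percentages([("G", "T"), ("C", "A"), ("G", "A"), ("A", "G")]))
```

Reporting classes by base pair rather than by single base is what lets strand specificity be treated as a separate parameter.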
The MSA program calculates Euclidean and log-likelihood ratio distances
between two cancers based on differences in their mutational spectra parameters.
Mutation percentages are used as parameters in distance calculations. The
Euclidean distance for n parameters, $p_1, \ldots, p_n$, where cancers A and B represent
any two cancers, is $\sqrt{(p_{1A} - p_{1B})^2 + (p_{2A} - p_{2B})^2 + \cdots + (p_{nA} - p_{nB})^2}$.
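As a minimal sketch of the Euclidean distance just defined (not the Fortran MSA code itself), the Python function below treats each cancer's spectrum as a list of mutation percentages. The function name is an assumption, and the two-parameter example is for illustration only: it reuses the % G to T and % benzo[a]pyrene hotspot values for lung (smoker) and lung (nonsmoker) from Table 2, whereas the analysis in this thesis used the full set of parameters.

```python
import math


def euclidean_distance(spectrum_a, spectrum_b):
    """Euclidean distance between two spectra given as equal-length lists of percentages."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("both spectra must have the same number of parameters")
    return math.sqrt(sum((pa - pb) ** 2 for pa, pb in zip(spectrum_a, spectrum_b)))


if __name__ == "__main__":
    # Two parameters only (% G to T, % benzo[a]pyrene hotspots), taken from Table 2.
    lung_smoker = [55, 18]
    lung_nonsmoker = [14, 25]
    print(round(euclidean_distance(lung_smoker, lung_nonsmoker), 2))
```

Because every parameter enters with equal weight, a parameter with a wide range of percentages can dominate the distance.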
The log-likelihood ratio between two cancers is given by the equation
$\log(LR) = [LL_{A+B} - (LL_A + LL_B)]/(n_A + n_B)$, where $LL_{A+B}$ is the maximum
log-likelihood of the combined data of cancers A and B, $LL_A$ is the maximum
log-likelihood of cancer A data, $LL_B$ is the maximum log-likelihood of cancer B
data, and $n_A + n_B$ is the total sample size of cancers A and B. The sum of the
maximum log-likelihoods of the individual cancers is subtracted from the maximum
log-likelihood of the combined dataset of both cancers. The log-likelihood ratio is
standardized by the total sample size of the two cancers to give a log-likelihood
ratio independent of sample size.
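The likelihood model used inside the MSA program is not spelled out here, so the sketch below assumes the simplest one that fits the description: each mutation falls independently into one of several categories, and the maximum log-likelihood is evaluated at the observed proportions. The function names and example counts are hypothetical. With this definition log(LR) is zero when two cancers have identical proportions and becomes more negative as they diverge, so its magnitude would serve as the distance.

```python
import math


def max_log_likelihood(counts):
    """Maximum log-likelihood of category counts under an assumed categorical model.

    The maximum is reached at the observed proportions c / n; empty categories add zero.
    """
    n = sum(counts)
    return sum(c * math.log(c / n) for c in counts if c > 0)


def log_likelihood_ratio(counts_a, counts_b):
    """log(LR) = [LL_{A+B} - (LL_A + LL_B)] / (n_A + n_B) for two cancers' category counts."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both cancers must use the same mutation categories")
    combined = [a + b for a, b in zip(counts_a, counts_b)]
    n_total = sum(counts_a) + sum(counts_b)
    ll_separate = max_log_likelihood(counts_a) + max_log_likelihood(counts_b)
    return (max_log_likelihood(combined) - ll_separate) / n_total


if __name__ == "__main__":
    # Hypothetical counts in two categories, for illustration only.
    print(log_likelihood_ratio([10, 20], [20, 40]))  # same proportions
    print(log_likelihood_ratio([10, 0], [0, 10]))    # completely different spectra
```

Dividing by the total sample size keeps a pair of large datasets from appearing more distant than a pair of small datasets with the same proportions.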
The MSA program produces matrices of the log-likelihood ratio and
Euclidean cancer distances. The distance matrices were input into SAS for cluster
analysis using the average-linkage method (SAS Institute, Cary, N.C.). In the
average-linkage method, the distance between two clusters is the average distance
between pairs of observations, one in each cluster, and is defined as
$D_{KL} = \sum_{i \in C_K} \sum_{j \in C_L} d(x_i, x_j)/(N_K N_L)$, where $D_{KL}$ is the distance between clusters $C_K$
and $C_L$, $d(x, y)$ is the distance between vectors x and y, and $N_K$ is the number of
observations in $C_K$. Each cancer starts out as a cluster of one observation. SAS
combines the two cancers closest together to form a cluster of two observations.
The next two closest clusters are then combined and the process continues
until all cancers are combined into one cluster. The combinatorial formula of the
average-linkage method is $D_{JM} = (N_K D_{JK} + N_L D_{JL})/N_M$, where $C_M$ is the merged
cluster of $C_K$ and $C_L$, and $D_{JM}$ is the distance between $C_M$ and any other cluster $C_J$.
This method tends to join clusters with small variances with a bias toward
producing clusters with the same variance.
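The clustering in this thesis was run in SAS. Purely as an illustration of the average-linkage procedure and its combinatorial update, the following Python sketch clusters a small distance matrix; the function name, the return format, and the four-item matrix are assumptions made for this example and do not reproduce any SAS output.

```python
def average_linkage(names, dist):
    """Agglomerative average-linkage clustering of a symmetric distance matrix.

    names: list of labels; dist: list of lists with dist[i][j] the distance between
    items i and j. Returns merges as (step, members_k, members_l, distance) tuples.
    """
    clusters = {i: (names[i],) for i in range(len(names))}
    sizes = {i: 1 for i in clusters}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in clusters for j in clusters if i < j}
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        (k, l), d_kl = min(d.items(), key=lambda item: item[1])
        n_k, n_l = sizes[k], sizes[l]
        n_m = n_k + n_l
        for j in clusters:
            if j in (k, l):
                continue
            d_jk = d[(min(j, k), max(j, k))]
            d_jl = d[(min(j, l), max(j, l))]
            # Combinatorial formula: D_JM = (N_K * D_JK + N_L * D_JL) / N_M
            d[(j, next_id)] = (n_k * d_jk + n_l * d_jl) / n_m
        # Drop every distance that involved one of the two merged clusters.
        d = {pair: v for pair, v in d.items() if k not in pair and l not in pair}
        merges.append((len(merges) + 1, clusters[k], clusters[l], d_kl))
        clusters[next_id] = clusters[k] + clusters[l]
        sizes[next_id] = n_m
        del clusters[k], clusters[l], sizes[k], sizes[l]
        next_id += 1
    return merges


if __name__ == "__main__":
    labels = ["A", "B", "C", "D"]  # hypothetical items, not cancers from Table 1
    matrix = [[0, 1, 4, 6], [1, 0, 2, 8], [4, 2, 0, 5], [6, 8, 5, 0]]
    for step, left, right, distance in average_linkage(labels, matrix):
        print(step, left, right, round(distance, 3))
```

In this example the last merge joins D to the cluster of A, B, and C at the mean of the three original distances from D, which is what the combinatorial formula is designed to reproduce without revisiting individual pairs.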
All distance calculations included hotspot and strand specificity
parameters. Clustering was done for each distance type. Because mutations with
both smoking and histology information could represent the same tumor in two
datasets, each lung covariate was clustered separately with the remaining cancers.
RESULTS
Table 1 shows the number of mutations for the 21 cancers selected from
the database and lung cancer datasets from p53 mutations found in the literature.
Esophagus and head and neck cancers were categorized as carcinoma or
squamous cell carcinoma. The database also included p53 mutations related to
Li-Fraumeni syndrome and xeroderma pigmentosum. Most published lung cancer
p53 mutations are listed with histologies as adenocarcinoma, squamous cell, large
cell, small cell, or non-small cell.
Table 2 shows the proportion of G to T transversions of all mutations
analyzed and strand specificity percentage for smoking-related cancers. In lung
cancer, 54% of mutations were G to T transversions associated with smoking and
13% were G to T transversions associated with nonsmoking. High proportions of
G to T transversions were found in large cell (60%), squamous (47%), non-small
cell (47%), and small cell (42%) lung cancers. Lung adenocarcinomas had a lower
proportion of G to T transversions (26%); however, 47% of these G to T transversions
were found at benzo[a]pyrene-associated hotspots. Except for nonsmoking
related lung cancers, all lung cancers had very high percentages of G to T
mutations that were strand specific. Considerable percentages of G to T
transversions were found on benzo[a]pyrene hotspots for head and neck,
esophagus, larynx, and bladder cancers. No benzo[a]pyrene hotspots were found
in pancreas cancer.
Figure 1 shows the mutational spectra of smoking related cancers. Most
of the mutations in lung cancer were point mutations and a small portion of these
were CpG mutations. Head and neck, esophagus, and larynx cancers did not have
Table 1. Number of mutations of cancers selected from the p53 database.

Cancer                       N
Basal cell                   76
Bladder                      253
Brain                        282
Breast                       717
Colon                        494
Esophagus                    126
Esophagus (squamous)         96
Head and neck                182
Head and neck (squamous)     246
Larynx (squamous)            19
Li-Fraumeni syndrome         37
Liver                        348
Lung (smoker)¹               150
Lung (nonsmoker)¹            30
Lung (adenocarcinoma)¹       146
Lung (squamous)¹             137
Lung (large cell)¹           46
Lung (small cell)¹           86
Lung (non-small cell)¹       70
Nasopharynx                  26
Ovary                        327
Pancreas                     137
Prostate                     115
Skin (squamous)              48
Stomach                      264
Thyroid                      45
Uterus                       60
Xeroderma pigmentosum        35

¹Data from Bodner et al., 1992; Chiba et al., 1990; Chung et al., 1995; D'Amico et al., 1992;
Gorgoulis et al., 1998; Guinee et al., 1995; Harty et al., 1996; Hollstein et al., 1997; Isobe et al.,
1994; Kashii et al., 1994; Kishimoto et al., 1992; Kondo et al., 1992; Lee et al., 1994; Lehman et
al., 1991; Li et al., 1994; Lohmann et al., 1993; Lung et al., 1996; Mao et al., 1994; Miller et al.,
1992; Mitsudomi et al., 1992; Nigro et al., 1989; Noguchi et al., 1993; Reichel et al., 1994;
Ryberg et al., 1994; Sameshima et al., 1992; Schlegel et al., 1992; Sozzi et al., 1995; Suzuki et
al., 1992; Takagi et al., 1995; Takahashi et al., 1991; Takeshima et al., 1993; Takeshima et al.,
1994; Taylor et al., 1994; Top et al., 1995; Vahakangas et al., 1992; Wang et al., 1995.
Table 2. Percentage of G to T transversions in smoking-related cancers. Cancers not
smoking-related are given for comparison.

Cancer                     N     % G to T (% coding strand bias)   % Benzo[a]pyrene hotspots¹
Lung (smoker)              150   55 (93)                           18
Lung (nonsmoker)           30    14 (75)                           25
Lung (Adenocarcinoma)      146   26 (92)                           47
Lung (Squamous)            137   47 (96)                           11
Lung (Small cell)          86    43 (92)                           33
Lung (Large cell)          46    60 (96)                           22
Lung (Non-small cell)      70    48 (97)                           31
Head & Neck                182   26 (86)                           23
Head & Neck (Squamous)     246   17 (80)                           20
Esophagus                  126   18 (95)                           14
Esophagus (Squamous)       96    19 (50)                           25
Larynx                     19    22 (100)                          20
Bladder                    253   11 (85)                           22
Pancreas                   137   8 (90)                            0
Colon                      494   11 (78)                           10
Breast                     717   12 (86)                           9
Brain                      282   11 (74)                           19

¹Percent G to T transversions at codons 157, 248, and 273.
Figure 1. Mutational spectra of smoking-related cancers.
[Stacked horizontal bar chart; x-axis: percentage (0% to 100%); one bar each for Lung (smoker), Lung (nonsmoker), Lung (Ad), Lung (Sq), Lung (Large), Lung (Small), Lung (NSC), Head & Neck, Head & Neck (Sq), Esophagus, Esophagus (Sq), Larynx, Bladder, and Pancreas; legend: Insertion, Deletion, CpG, AT to GC, AT to CG, AT to TA, GC to AT, GC to CG, GC to TA.]
such a high proportion of G to T transversion mutations as lung. Head and neck
squamous cell, esophagus, and pancreatic cancers had greater proportions of CpG
mutations than G to T transversions. Bladder and pancreas had very few G to T
transversions. Proportions of G to A transitions were higher than G to T
transversions in head and neck squamous (21%) and bladder (28%) cancers. In
larynx cancer, 27% of point mutations were G to C transversions.
Figure 2 shows the mutational spectra for cancers with a high proportion
of endogenous mutations as represented by CpG transitions. Li-Fraumeni cancers
show the highest proportion of CpG mutations followed by colon and uterine
cancers. Internal organs such as the brain and thyroid have high proportions of
endogenous mutations. Breast and ovarian cancers were found to have greater
proportions of endogenous mutations than exogenous mutations. Most of these
cancers also have high G to A and A to G transitions.
Figure 3 shows mutational spectra for the UV-related cancers, squamous
and basal cell skin cancers and XP-associated cancers. Prominent features of
these cancers are significant proportions of G:C to A:T and CpG transitions. In
basal and XP-associated cancers, the majority of the mutations were G to A
transitions and CpG transitions in equivalent proportions. In skin squamous cell
carcinoma, nearly half of the mutations were G to A transitions and CpG
transitions were in smaller proportions. All CpG mutations in XP-associated
cancers occurred on the coding strand compared to 48% for skin and 33% for
basal cell. G to A transitions were more likely to occur on the noncoding strand
Figure 2. Mutational spectra of cancers with high proportions of CpG mutations.
[Bar chart; x-axis: cancer (N): Li-Fraumeni (37), Colon (494), Uterine, Brain (282), Thyroid (45), Stomach (264), Ovary (327), Breast (717); legend: Insertion, Deletion, CpG, AT to GC, AT to CG, AT to TA, GC to AT, GC to TA, GC to CG.]
in XP-associated cancers (7%) and equally on both strands for skin (66%) and
basal cell cancers (64%).
The mutational spectra of liver, nasopharynx, and prostate cancers are
presented in Figure 4. Characteristic of liver cancer mutational spectra, a high
proportion of G to T transversions (76%) was found in codon 249.
Nasopharyngeal cancer also had an unusual mutational spectrum with high
percentages of CpG (33%) and G:C to C:G (43%) mutation types. Most of the G
to C transversions (6/7) occurred at codon 280 and frequently on the coding strand
(78%). Prostate cancer did not have any particular mutation type that was notably
in greater proportions. A variety of mutations in roughly similar proportions were
found in prostate cancer, of which the most common was the CpG mutation (23%).
Figure 5 is a diagram of the clustering of cancer mutational spectra
calculated by Euclidean distance. Lung cancer mutations by smoking status are
included in this cluster. In step 1, breast cancer combined with ovarian cancer to
form a cluster. In step 2, colon cancer combined with the cluster formed in step 1.
In step 4, the two categories of head and neck cancer formed a cluster that joined
with a larger cluster of cancers in step 8. In step 9, the endocrine organs prostate
and thyroid combined. Li-Fraumeni cancers joined a cluster of the digestive
system organs stomach and pancreas in step 10. The last few cancers to cluster
beyond step 15 were mostly related to external exposures. Smoking-related lung
cancer was next to cluster. Basal and XP-associated cancers were clustered in
Figure 3. Mutational spectra of cancers associated with UV light.
[Three pie charts: Skin (N=48), Basal (N=76), and XP-associated (N=35); slices: Insertion, Deletion, CpG, AT to GC, AT to CG, AT to TA, GC to AT, GC to TA, GC to CG.]
Figure 4. Mutational spectra of liver, nasopharynx, and prostate cancers.
[Bar chart; y-axis ticks from 10 to 60; x-axis: cancer (N): Liver (348), Nasopharynx (26), Prostate (115); legend: Insertion, Deletion, CpG, AT to GC, AT to CG, AT to TA, GC to AT, GC to TA, GC to CG.]
Figure 5. Euclidean distance clustering of cancer mutational
spectra with lung cancer smoking status.
[Dendrogram with numbered merge steps. Leaves, top to bottom: Breast, Ovary, Colon, Brain, Bladder, Esophagus, Head & Neck (Sq), Head & Neck, Prostate, Thyroid, Uterus, Esophagus (Sq), Stomach, Pancreas, Li-Fraumeni, Lung (Smoker), Liver, Nasopharynx, Basal, XP-assoc, Larynx, Lung (Nonsmoker), Skin.]
step 16. This was followed by liver and nasopharynx. Skin cancer was last to be
clustered.
Figure 6 shows the clustering of cancer mutational spectra in Euclidean
distance with lung cancer histology replacing smoking status. Clustering
sequences were mainly unchanged. Lung adenocarcinomas joined with head and
neck carcinomas in step 8. The cluster of remaining lung cancer histologies
combined with the large cancer cluster in step 18 before the cluster of stomach,
pancreas, and Li-Fraumeni cancers.
In Figure 7, log-likelihood ratio distances of cancer mutational spectra
were clustered with lung cancer smoking status. Breast and head and neck
squamous cell cancers were clustered in the first step. Colon cancer combined
with Li-Fraumeni cancers in a separate cluster in step 2. Smoking and
nonsmoking-related lung cancers combined in step 18. Basal, XP-associated
cancers, and skin clustered in steps 16 and 21. The last step joined the skin cancer
cluster with the rest of the cancers. As seen in Euclidean distance clustering, the
skin cancers were clustered late.
In Figure 8, the log-likelihood ratio distances of cancer mutational spectra
were clustered replacing smoking exposure distances with lung histology
distances. Compared to Figure 7, the clustering of cancers was similar except for a
few early changes in the clustering order. Breast cancer combined with
esophageal cancer to form a cluster in step 1. Ovarian cancer is added to this
Figure 6. Euclidean distance clustering of cancer mutational
spectra with lung cancer histology.
[Dendrogram with numbered merge steps. Leaves, top to bottom: Breast, Ovary, Colon, Brain, Bladder, Esophagus, Head & Neck (Sq), Head & Neck, Lung (Ad), Prostate, Thyroid, Uterus, Esophagus (Sq), Lung (Small), Lung (NSC), Lung (Sq), Lung (Large), Stomach, Pancreas, Li-Fraumeni, Liver, Nasopharynx, Basal, XP-assoc, Larynx, Skin.]
Figure 7. Log-likelihood ratio distance clustering of cancer
mutational spectra with lung cancer smoking status.
[Dendrogram with numbered merge steps. Leaves, top to bottom: Breast, Head & Neck (Sq), Esophagus, Head & Neck, Colon, Li-Fraumeni, Stomach, Ovary, Pancreas, Brain, Uterus, Thyroid, Prostate, Larynx, Bladder, Esophagus (Sq), Lung (Smoker), Lung (Nonsmoker), Liver, Nasopharynx, Basal, XP-assoc, Skin.]
Figure 8. Log-likelihood ratio distance clustering of cancer
mutational spectra with lung histology.
[Dendrogram with numbered merge steps. Leaves, top to bottom: Breast, Esophagus, Ovary, Head & Neck (Sq), Pancreas, Colon, Li-Fraumeni, Stomach, Brain, Uterus, Thyroid, Prostate, Esophagus (Sq), Larynx, Bladder, Lung (Ad), Head & Neck, Lung (Small), Lung (Sq), Lung (Large), Lung (NSC), Liver, Nasopharynx, Basal, XP-assoc, Skin.]
cluster in step 3. Lung adenocarcinomas clustered with head and neck
carcinomas in step 12. Lung cancer histologies clustered relatively closely
together in the middle of the process.
DISCUSSION
The p53 database contains a wealth of information in a readily accessible
format; however, there are some drawbacks to using this database. The main
limitation is lack of exposure and histology data which would enable a more
detailed analysis of p53 mutational spectra. The database also does not have
patient information or geographic factors which would be relevant in a molecular
epidemiology study. There is an underdetection of p53 mutations. Because most
p53 mutations are discovered by screening exons 5-8, there is a bias against
detecting mutations outside of exons 5-8.
The database gives two categories for esophagus and head and neck
cancers, carcinoma and squamous cell. Upon checking the references in these two
categories, it was discovered that a majority of cancers classified as carcinoma
were of the squamous cell type. For this analysis, these two datasets should have
been combined into one dataset. In addition, there is some misclassification of
cancer mutations. One reference article under the category of head and neck
carcinoma consisted of mutations from skin cancers of the head and neck.
Clustering order is largely influenced by how the distance formula uses
mutational spectra parameters in its calculation. To compare the mutational
spectra of two cancers, the Euclidean formula squares differences in parameters,
giving equal weight to all parameters. However, in reality, some parameters are
more important than others, e.g. the proportion of CpG versus strand specificity
for A:T to G:C mutation.
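As a minimal sketch of this point, the Python fragment below computes an unweighted Euclidean distance between two spectra and a weighted variant in which some parameters count more than others. The function names, the parameter labels, the weights, and all numeric values are hypothetical and serve only as illustration; the formula actually used by the analysis program is the one given in the Methods.

```python
import math


def euclidean_distance(spectrum_a, spectrum_b):
    """Unweighted Euclidean distance between two spectra.

    Each spectrum is a dict mapping a parameter name to a proportion.
    Every parameter contributes its squared difference with equal weight.
    """
    names = sorted(set(spectrum_a) | set(spectrum_b))
    return math.sqrt(
        sum((spectrum_a.get(n, 0.0) - spectrum_b.get(n, 0.0)) ** 2 for n in names)
    )


def weighted_euclidean_distance(spectrum_a, spectrum_b, weights):
    """Same distance, but each squared difference is scaled by a weight.

    Parameters missing from `weights` default to a weight of 1.0.
    """
    names = sorted(set(spectrum_a) | set(spectrum_b))
    return math.sqrt(
        sum(
            weights.get(n, 1.0)
            * (spectrum_a.get(n, 0.0) - spectrum_b.get(n, 0.0)) ** 2
            for n in names
        )
    )


# Hypothetical proportions for two cancers (illustration only).
cancer_a = {"CpG": 0.30, "GC_to_TA": 0.10}
cancer_b = {"CpG": 0.10, "GC_to_TA": 0.40}

print(euclidean_distance(cancer_a, cancer_b))
print(weighted_euclidean_distance(cancer_a, cancer_b, {"CpG": 4.0}))
```

With equal weights, the larger raw difference (here the hypothetical G:C to T:A proportion) dominates the distance; raising the weight on CpG shifts the balance toward that parameter.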
In general, the clustering diagrams of LR distances are easier to interpret.
When a cancer is added to a cluster of two cancers, a new distance is calculated
representing an average of the three cancer distances. This distance is used in the
combinatorial formula to determine further clustering with other cancers. In the
clustering of Euclidean distances, single cancers adding on to a cluster of cancers
were common. In LR distance clustering, single cancers tended to combine
together before combining with clusters of more than one cancer. The LR
distance clustering has the advantage of preserving the original distance between
two cancers more often than the Euclidean distance clustering and offers a better
qualitative measurement of the similarity between two cancers in mutational
spectra against those of other cancers.
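The stepwise procedure described above can be sketched as a small agglomerative clustering routine. The sketch assumes a group-average update rule, in which the distance from an outside item to a merged cluster is the size-weighted mean of its distances to the two parts; whether this matches the combinatorial formula of the clustering software used here is an assumption. The function name, item labels, and distances are hypothetical.

```python
from itertools import combinations


def average_linkage(distances):
    """Agglomerative clustering with group-average updates.

    `distances` maps frozenset({a, b}) to the distance between items a and b.
    Returns a list of (step, cluster_a, cluster_b, merge_distance) tuples,
    where each cluster is a tuple of item names.
    """
    items = sorted({name for pair in distances for name in pair})
    clusters = [(name,) for name in items]  # start from singletons

    # Re-key the input so that clusters (tuples) index the distance table.
    d = {}
    for pair, value in distances.items():
        a, b = tuple(pair)
        d[frozenset({(a,), (b,)})] = value

    steps = []
    while len(clusters) > 1:
        # Join the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        steps.append((len(steps) + 1, a, b, d[frozenset((a, b))]))
        merged = a + b
        clusters = [c for c in clusters if c not in (a, b)]
        # Size-weighted mean of the distances to the two merged parts.
        for c in clusters:
            d[frozenset((c, merged))] = (
                len(a) * d[frozenset((c, a))] + len(b) * d[frozenset((c, b))]
            ) / (len(a) + len(b))
        clusters.append(merged)
    return steps


# Hypothetical distances among three items (illustration only).
example = {
    frozenset({"A", "B"}): 2.0,
    frozenset({"A", "C"}): 6.0,
    frozenset({"B", "C"}): 8.0,
}
for step in average_linkage(example):
    print(step)
```

In this toy input, A and B join first; C then joins the pair at the mean of its distances to A and B rather than at either original distance, which is the sense in which later merges no longer preserve the original pairwise distances.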
Smoking is a major cause of lung, esophagus, mouth and pharynx,
pancreas, larynx, and bladder cancers (Parkin et al., 1994). The mutational
spectra of these cancers were analyzed with respect to characteristics of the
tobacco smoke compound benzo[a]pyrene. Lung had the highest percentage of G
to T mutations compared to other smoking-related cancers whose mutational
spectra may be due to carcinogens in tobacco smoke besides benzo[a]pyrene.
Benzo[a]pyrene is a carcinogen reactive in the lungs while another smoking
carcinogen or metabolite affects the bladder. These organs may also be exposed
to carcinogens not smoking related. In addition, organs exposed to the same
carcinogen may show different mutational spectra depending on how the
carcinogen is metabolized. Mutational spectra can vary by DNA repair processes
and biological selection of mutations (Biggs et al., 1993).
Cancers with similar mutational spectra cluster close together early in the
process. Smoking-related cancers were dispersed throughout the clustering
diagrams, and as reflected in the clustering, the mutational spectra of pancreas and
other smoking-related cancers were not similar to lung mutational spectra. Minor
groupings of cancers were seen but differed by distance type. Esophagus and
head and neck cancers tended to cluster early with other cancers in Euclidean and
LR distance clustering. In Figure 7, larynx and bladder formed a cluster in step 7
that clustered with esophagus in step 15. Pancreas combined with cancers high in
proportions of endogenous mutations such as stomach in Euclidean distance
clustering (Fig. 6, step 9) and ovary in LR clustering (Fig. 7, step 3).
Lung cancer smokers had more than three times as many G to T
transversions compared to lung cancer nonsmokers showing a positive association
between smoking and G to T transversions on the p53 gene. However, more G to
T transversions on benzo[a]pyrene-related hotspots were found for nonsmokers
than smokers. An explanation for this could be that nonsmokers were exposed to
the benzo[a]pyrene carcinogen through passive smoking. This could not be
confirmed as the original published articles with p53 mutations do not provide
additional information on passive smoking exposure.
The mutational spectra of lung cancer smokers and nonsmokers clustered
differently depending on distance type. In LR clustering, smoking and
nonsmoking-related lung cancers formed a cluster before combining with other
cancers (Fig. 7, step 18), reflecting similarities between their spectra. However, in
Euclidean distance clustering, lung cancer smokers were separated from
nonsmokers, the differences in spectra driving them apart.
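As a rough illustration of a likelihood-ratio comparison of two spectra, the sketch below computes the standard G statistic for homogeneity of two vectors of mutation counts. This statistic is a stand-in chosen for illustration; it is an assumption and may differ from the LR distance defined in the Methods. The function name, category labels, and counts are hypothetical.

```python
import math


def g_statistic(counts_a, counts_b):
    """Log-likelihood ratio (G) statistic for two count vectors.

    Each argument is a dict mapping a mutation category to an observed count.
    G = 2 * sum(O * ln(O / E)), where E is the count expected if both
    samples shared the same underlying proportions. Cells with O == 0
    contribute nothing to the sum.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    grand_total = total_a + total_b
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one mutation")

    g = 0.0
    for category in sorted(set(counts_a) | set(counts_b)):
        obs_a = counts_a.get(category, 0)
        obs_b = counts_b.get(category, 0)
        column_total = obs_a + obs_b
        for observed, row_total in ((obs_a, total_a), (obs_b, total_b)):
            if observed > 0:
                expected = row_total * column_total / grand_total
                g += observed * math.log(observed / expected)
    return 2.0 * g


# Hypothetical counts for two samples (illustration only).
sample_a = {"GC_to_TA": 30, "other": 70}
sample_b = {"GC_to_TA": 10, "other": 90}
print(g_statistic(sample_a, sample_b))
```

Unlike a Euclidean distance on proportions, this statistic grows with the number of mutations observed, so two small samples with the same proportional difference yield a smaller value than two large ones. That property belongs to the G statistic as sketched and is not a claim about the program's own LR distance.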
The clustering of lung with head and neck cancers depended on whether
lung cancer mutations were sorted by smoking exposure or histology. In both
distance types, lung cancers sorted by smoking exposure did not cluster closely
with head and neck cancers. This could be evidence of head and neck cancers
arising either from exposure to a nonsmoking carcinogen or from a smoking or
tobacco carcinogen different from the one that affects the lung. In contrast, lung
adenocarcinomas combined with head and neck cancers in both distance types
(Fig. 6, step 8 and Fig. 8, step 12). This could imply a common etiologic factor
between lung adenocarcinomas and head and neck cancers. Smoking and
nonsmoking information for major smoking related cancers would be helpful in
this type of analysis.
In both Euclidean and LR distance clustering, lung histologies tended to
cluster in their own branch, most likely driven by the high G to T transversion
proportions and high strand specificity that the lung cancers have in common. In LR
distance clustering, non-small cell lung cancer mutations combine with the other
specific non-small cell lung cancers, squamous and large cell, making biological
sense. In Euclidean distance clustering, similar mutational spectra between
non-small cell and small cell cancers can account for non-small cell lung cancers
combining with small cell lung cancers before grouping with other lung
histologies.
Histogenesis of lung cell types is still unclear (Lee et al., 1998).
Squamous metaplasia is frequently seen in the bronchial epithelium of chronic
smokers and lung cancer patients as a reaction to injury from irritants or
carcinogens. Squamous cell carcinoma is hypothesized to arise from squamous
metaplasia, which originates from basal cell hyperplasia. Adenocarcinomas are
typically found in the peripheral regions of the lung distant from bronchial
epithelium where metaplastic changes are observed and are not likely to be
associated with squamous metaplasia.
In the first step of the four clustering diagrams presented, breast combined
with another organ. Breast combined with ovary in Euclidean distance clustering,
followed by colon. These organs have high proportions of CpG and G to A
transitions. Breast and ovarian cancers are both associated with BRCA1
mutations. Clustering of these three organs could represent common endogenous
influence producing similar p53 mutational spectra. The alternative explanation
of an exogenous agent responsible for breast cancer is offered in the clustering of
log-likelihood ratio distances where the mutational spectra of breast cancer
appear more similar to the smoking-related cancers of head and neck and
esophagus.
In the Euclidean and LR clustering diagrams, colon cancer clustered with
breast cancer relatively early compared to lung cancer which clustered later. This
implies that the mutational spectrum of breast cancer is more similar to that of
colon cancer than to that of lung cancer, contrasting with the observation by
Biggs et al. (1993) that the mutational spectrum of breast cancer was more similar
to that of lung than that of colon. Diet has been considered a risk factor for both
colon and breast cancers
(Willett, 1989) and breast cancer rates vary geographically implying an external
risk factor. Comparing p53 mutational spectra of a large number of subjects from
different geographic regions taking into account diet and other risk factors would
clarify this issue.
The second step of LR clustering combined colon cancer and Li-Fraumeni
cancers, both associated with high proportions of CpG mutations. This supports
the observation made by Biggs et al. (1993) that the colon mutational spectrum
resembles that of the germline. Stomach cancers, which are part of the Li-Fraumeni syndrome
cancers, cluster closely with Li-Fraumeni cancers in both Euclidean and LR
clustering. These cancer clusters suggest similar endogenous mechanisms behind
their mutational spectra.
In agreement with the literature, most of the mutations in skin and
XP-associated cancers were G:C to A:T transitions, with high strand specificity for
CpG and C to T transitions in XP patients. The skin and XP-associated cancers
grouped together before combining with the rest of the cancers, a pattern common
to both Euclidean and log-likelihood ratio distance clustering. C to T transitions
caused by UV light in equivalent or greater proportions than CpG transitions are
most likely the reason for skin cancers to cluster. A method to further test the
hypothesis that the skin cancers are clustering because of a common exposure to
UV radiation is to add lip cancer mutations into the clustering pool. Lip cancer is
also caused by UV radiation exposure (Kwa et al., 1992) and would be expected
to combine with the skin cancers in the analysis if large proportions of C to T
transitions were found in the p53 gene.
The G to T hotspot in liver cancer is most likely due to aflatoxin exposure.
The G:C to C:G hotspot along with no G to T transversions in nasopharyngeal
cancer is an unusual mutational spectrum. The same G to C hotspot at codon 280
was observed in bladder cancer (Spruck et al., 1993). A carcinogen not yet
identified may be causing this hotspot. Cancer of the nasopharynx has been
associated with ingestion of salted fish and other preserved foods (Yu, 1991) and
nitrosamines in fermented foods are potential carcinogens for nasopharyngeal
cancer. It would be interesting to see what may be responsible for this hotspot
and its relationship to both bladder and nasopharyngeal cancers.
In contrast to liver and nasopharyngeal cancer, prostate cancer did not
have a very distinguishable mutational spectrum. Prostate is usually found in the
middle of the cluster combining with thyroid in Euclidean clustering and
following thyroid in LR clustering. Other genes besides the p53 gene are mutated
in prostate cancer; however, no single gene is mutated in the majority of prostate
cancers (Issacs and Bova, 1998).
Endogenous mutations are found in all cancers. Cancers with common
mutational spectra and high in CpG mutations tend to be clustered in the first half
of the clustering process while cancers associated with carcinogen exposure were
more likely to be clustered later. The unique pattern of CC to TT transitions
found in UV-related cancers and not in cancers of internal organs may explain
why the skin cancers clustered late. Like skin cancer, liver and nasopharyngeal
carcinoma have unusual mutational spectra and were two of the last organs to
cluster. An uncommon mutation occurring in a hotspot results in greater distances
between the cancer and all other cancers. The trend from endogenous to
exogenous sources of mutations is more apparent in the log-likelihood ratio
clustering (Fig. 7).
CONCLUSION
A new mutational spectra analysis program was tested using mutations
from a p53 database and the published literature. The p53 gene is a good
candidate gene for mutational spectra analysis because it is mutated in a majority
of cancers with varied types of mutations. Distances between two cancer
mutational spectra based on parameters of mutation type, strand specificity, and
selected hotspots were calculated using the program and then clustered in a
separate analysis.
Order and closeness of clustering can reveal how similar cancer mutational
spectra are. The log-likelihood ratio distance seems to cluster by similarities in
mutational spectra and provides easier comparison between two cancers in
relation to other cancers. The clustering of Euclidean distances emphasizes
uniqueness of individual mutational spectra. In distinguishing common exposures
to carcinogens, organs may be exposed to multiple cancer-causing agents, and
additional information such as smoking and histology can provide more sensitive
analysis in this regard. Cancers with higher proportions of endogenous mutations
appear to cluster earlier. Cancers with increased proportions of exogenous
mutations over endogenous mutations, either from hotspots or known patterns
produced by carcinogens, tend to cluster last. The mutational spectra analysis
program is useful for generating hypotheses about exposures and allows
qualitative comparisons to be made between two or more mutational spectra.
REFERENCES
Adams WT, Skopek TR. Statistical test for the comparison of samples from
mutational spectra. J Mol Biol 1987; 194: 391-396.
Biggs PJ, Warren W, Venitt S, Stratton MR. Does a genotoxic carcinogen
contribute to human breast cancer? The value of mutational spectra in unravelling
the etiology of cancer. Mutagenesis 1993; 8: 275-283.
Blot WJ, McLaughlin JK, Winn DM, Austin DF, Greenberg RS, Preston-Martin
S, Bernstein L, Schoenberg JB, Stemhagen A, Fraumeni JF Jr. Smoking and
drinking in relation to oral and pharyngeal cancer. Cancer Res 1988; 48: 3282-
3287.
Bodner SM, Minna JD, Jensen SM, D’Amico D, Carbone D, Mitsudomi T,
Fedorko J, Buchhagen DL, Nau MM, Gazdar AF, Linnoila RI. Expression of
mutant p53 proteins in lung cancer correlates with the class of p53 gene mutation.
Oncogene 1992; 7: 743-749.
Brash DE, Rudolph JA, Simon JA, Lin A, McKenna GJ, Baden HP, Halperin AJ,
Ponten J. A role for sunlight in skin cancer: UV-induced p53 mutations in
squamous cell carcinoma. Proc Natl Acad Sci USA 1991; 88: 10124-10128.
Brennan JA, Boyle JO, Koch WM, Goodman SN, Hruban RH, Eby YJ, Couch MJ,
Forastiere AA, Sidransky D. Association between cigarette smoking and
mutation of the p53 gene in squamous-cell carcinoma of the head and neck. N
Engl J Med 1995; 332: 712-717.
Cariello NF, Cui L, Beroud C, Soussi T. Database and software for the analysis
of mutations in the human p53 gene. Cancer Res 1994; 54: 4454-4460.
Cariello NF, Douglas GR, Soussi T. Databases and software for the analysis of
mutations in the human p53 gene, the human hprt gene and the lacZ gene in
transgenic rodents. Nucleic Acids Res, 1996; 24: 119-120.
Chiba I, Takahashi T, Nau MM, D’Amico D, Curiel DT, Mitsudomi T,
Buchhagen DL, Carbone D, Piantadosi S, Koga H, Reissman PT, Slamon DJ,
Holmes EC, Minna JD. Mutations in the p53 gene are frequent in primary,
resected non-small cell lung cancer. Oncogene 1990; 5: 1603-1610.
Cho Y, Gorina S, Jeffrey PD, Pavletich NP. Crystal structure of a p53 tumor
suppressor-DNA complex: understanding tumorigenic mutations. Science 1994;
265: 346-355.
Chung GT, Sundaresan V, Hasleton P, Rudd R, Taylor R, Rabbitts PH.
Sequential molecular genetic changes in lung cancer development. Oncogene
1995; 11: 2591-2598.
D'Amico D, Carbone D, Mitsudomi T, Nau M, Fedorko J, Russel E, Johnson B,
Buchhagen D, Bodner S, Phelps R, Gazdar A, Minna JD. High frequency of
somatically acquired p53 mutations in small-cell lung cancer cell lines and
tumors. Oncogene 1992; 7: 339-346.
Denissenko MF, Pao A, Tang MS, Pfeifer GP. Preferential formation of
benzo[a]pyrene adducts at lung cancer mutational hotspots in p53. Science 1996;
274: 430-432.
Dumaz N, Stary A, Soussi T, Daya-Grosjean L, Sarasin A. Can we predict solar
ultraviolet radiation as the causal event in human tumors by analyzing the
mutation spectra of the p53 gene? Mutat Res 1994; 307: 375-386.
Fearon ER, Vogelstein B. A genetic model for colorectal tumorigenesis. Cell
1990; 61: 759-767.
Gorgoulis VG, Zacharatos PV, Manolis E, Ikonomopoulos JA, Damalas A,
Lamprinoplos C, Rassidakis GZ, Zoumpourlis V, Kotsinas A, Rassidakis AN,
Halazonetis TD, Kittas C. Effects of p53 mutants derived from lung carcinomas
on the p53-responsive element (p53RE) of the MDM2 gene. British J Cancer
1998; 77: 374-384.
Greenblatt MS, Bennett WP, Hollstein M, Harris CC. Mutations in the p53 tumor
suppressor gene: clues to cancer etiology and molecular pathogenesis. Cancer
Res 1994; 54: 4855-4878.
Guinee DG Jr, Travis WD, Trivers GE, De Benedetti VMG, Cawley H, Welsh JA,
Bennett WP, Jett J, Colby TV, Tazelaar H, Abbondanzo SL, Pairolero P, Trastek
V, Caporaso NE, Liotta LA, Harris CC. Gender comparisons in human lung
cancer: analysis of p53 mutations, anti-p53 serum antibodies and c-erbB-2
expression. Carcinogenesis 1995; 16: 993-1002.
Harty LC, Guinee DG, Travis WD, Bennett WP, Jett J, Colby TV, Tazelaar H,
Trastek V, Pairolero P, Liotta LA, Harris CC, Caporaso NE. p53 mutations and
occupational exposures in a surgical series of lung cancers. Cancer Epidemiol
Biomarkers Prev 1996; 5: 997-1003.
Hollstein M, Bartsch H, Wesch H, Kure EH, Mustonen R, Muhlbauer KR,
Spiethoff A, Wegener K, Wiethege T, Muller KM. p53 gene mutation analysis in
tumors of patients exposed to alpha-particles. Carcinogenesis 1997; 18: 511-516.
Hollstein M, Sidransky D, Vogelstein B, Harris CC. p53 mutations in human
cancers. Science 1991; 253: 49-53.
Hsu IC, Metcalf RA, Sun T, Welsh JA, Wang NJ, Harris CC. Mutational hotspot
in the p53 gene in human hepatocellular carcinomas. Nature 1991; 350: 427-428.
Isobe T, Hiyama K, Yoshida Y, Fujiwara Y, Yamakido M. Prognostic
significance of p53 and ras gene abnormalities in lung adenocarcinoma patients
with stage I disease after curative resection. Jpn J Cancer Res 1994; 85: 1240-
1246.
Issacs WB, Bova S. Prostate cancer. In: Vogelstein B, Kinzler KW, eds. The
genetic basis of human cancer. New York: McGraw-Hill, 1998: 653-660.
Kashii T, Mizushima Y, Monno S, Nakagawa K, Kobayashi M. Gene analysis of
K-, H-ras, p53, and retinoblastoma susceptibility genes in human lung cancer cell
lines by the polymerase chain reaction/single-strand conformation polymorphism
method. J Cancer Res Clin Oncol 1994; 120: 143-148.
Kishimoto Y, Murakami Y, Shiraishi M, Hayashi K, Sekiya T. Aberrations of the
p53 tumor suppressor gene in human non-small cell carcinomas of the lung.
Cancer Res 1992; 52: 4799-4804.
Kondo K, Umemoto A, Akimoto S, Uyama T, Hayashi K, Ohnishi Y, Monden Y.
Mutations in the p53 tumour suppressor gene in primary lung cancer in Japan.
Biochem Biophys Res Commun 1992; 183: 1139-1146.
Kwa RE, Campana K, Moy RL. Biology of cutaneous squamous cell carcinoma.
J Am Acad Dermatol 1992; 26: 1-26.
Law JC, Whiteside TL, Collin SM, Weissfeld J, El-Ashmawy L, Srivastava S,
Landreneau RJ, Johnson JT, Ferrell RF. Variation of p53 mutational spectra
between carcinoma of the upper and lower respiratory tract. Clin Cancer Res
1995; 1: 763-768.
Lee LN, Shew JY, Sheu JC, Lee YC, Lee WC, Fang MT, Chang HF, Yu CJ,
Yang PC, Luh KT. Exon 8 mutation of p53 gene associated with nodal metastasis
in non-small-cell lung cancer. Am J Respir Crit Care Med 1994; 150: 1667-1671.
Lee JS, Mao L, Hong WK. Biology of preneoplastic lesions. In: Roth JA, Cox
JD, Hong WK, eds. Lung Cancer, 2nd ed. Malden: Blackwell Science, Inc.,
1998: 25-55.
Lehman TA, Bennett WP, Metcalf RA, Welsh JA, Ecker J, Modali RV, Ullrich S,
Romano JW, Appella E, Testa JR, Gerwin BI, Harris CC. p53 mutations, ras
mutations, and p53-heat shock 70 protein complexes in human lung carcinoma
cell lines. Cancer Res 1991; 51: 4090-4096.
Li ZH, Zheng J, Weiss LM, Shibata D. c-K-ras and p53 mutations occur very
early in adenocarcinoma of the lung. Am J Pathol 1994; 144: 303-309.
Lohmann D, Putz B, Reich U, Bohm J, Prauer H, Hofler H. Mutational spectrum
of the p53 gene in human small-cell lung cancer and relationship to
clinicopathological data. Am J Pathol 1993; 143: 907-915.
Lung ML, Wong MP, Skaanild MT, Fok CL, Lam WK, Yew WW. p53
mutations in non-small cell lung carcinomas in Hong Kong. Chest 1996; 109:
718-726.
Mao L, Hruban RH, Boyle JO, Tockman M, Sidransky D. Detection of oncogene
mutations in sputum precedes diagnosis of lung cancer. Cancer Res 1994; 54:
1634-1637.
Miller CW, Simon K, Aslo A, Kik K, Yokota J, Buys CH, Terada M, Koeffler
HP. p53 mutations in human lung tumors. Cancer Res 1992; 52: 1695-1698.
Mitsudomi T, Steinberg SM, Nau MM, Carbone D, D’Amico D, Bodner S, Oie
HK, Linnoila RI, Mulshine JL, Minna JD, Gazdar AF. p53 gene mutations in
non-small-cell lung cancer cell lines and their correlation with the presence of ras
mutations and clinical features. Oncogene 1992; 7: 171-180.
Nigro JM, Baker SJ, Preisinger AC, Jessup JM, Hostetler R, Cleary K, Bigner SH,
Davidson N, Baylin S, Devilee P, Glover T, Collins FS, Weston A, Modali R,
Harris CC, Vogelstein B. Mutations in the p53 gene occur in diverse human
tumor types. Nature 1989; 342: 705-708.
Noguchi M, Maezawa N, Nakanishi Y, Matsuno Y, Shimosato Y, Hirohashi S.
Application of the p53 gene mutation pattern for differential diagnosis of primary
versus metastatic lung carcinomas. Diagn Mol Pathol 1993; 2: 29-35.
Ozturk M, Bressac B, Puisieux A & 26 other collaborators. p53 mutation in
hepatocellular carcinoma after aflatoxin exposure. Lancet 1991; 338: 1356-1359.
Parkin DM, Pisani P, Lopez AD, Masuyer E. At least one in seven cases of
cancer is caused by smoking. Global estimates for 1985. Int J Cancer 1994; 59:
494-504.
Reichel MB, Ohgaki H, Petersen I, Kleihues P. p53 mutations in primary human
lung tumors and their metastases. Mol Carcinog 1994; 9: 105-109.
Ryberg D, Kure E, Lystad S, Skaug V, Stangeland L, Mercy I, Borresen AL,
Haugen A. p53 mutations in lung tumors: relationship to putative susceptibility
markers for cancer. Cancer Res 1994; 54: 1551-1555.
Sameshima Y, Matsuno Y, Hirohashi S, Shimosato Y, Mizoguchi H, Sugimura T,
Terada M, Yokota J. Alterations of the p53 gene are common and critical events
for the maintenance of malignant phenotypes in small-cell lung carcinoma.
Oncogene 1992; 7: 451-457.
Schlegel L, Rosenfeld MR, Volkenandt M, Rosenblum M, Dalmau J, Furneaux H.
p53 gene mutations in primary lung tumors are conserved in brain metastases. J
Neuro-Oncology 1992; 14: 93-100.
Semenza JC, Weasel LH. Molecular epidemiology in environmental health: the
potential of tumor suppressor gene p53 as a biomarker. Environ Health Perspect
1997; 105 (Suppl 1): 155-163.
Soini Y, Chia SC, Bennett WP, Groopman JD, Wang JS, DeBenedetti VM,
Cawley H, Welsh JA, Hansen C, Bergasa NV, Jones EA, DiBisceglie AM, Trivers
GE, Sandoval CA, Calderon IE, Munoz Espinosa LE, Harris CC. An aflatoxin-
associated mutational hotspot at codon 249 in the p53 tumor suppressor gene
occurs in hepatocellular carcinomas from Mexico. Carcinogenesis 1996; 17:
1007-1012.
Sozzi G, Miozzo M, Pastorino U, Pilotti S, Donghi R, Giarola M, De Gregorio L,
Manenti G, Radice P, Minoletti F, Della Porta G, Pierotti MA. Genetic evidence
for an independent origin of multiple preneoplastic and neoplastic lung lesions.
Cancer Res 1995; 55: 135-140.
Spruck CH, Rideout WM, Olumi AF, Ohneseit PF, Yang AS, Tsai YC, Nichols
PW, Horn T, Hermann GG, Steven K, Ross RK, Yu MC, Jones PA. Distinct
pattern of p53 mutations in bladder cancer: relationship to tobacco usage. Cancer
Res 1993; 53: 1162-1166.
Suzuki H, Takahashi T, Kuroishi T, Suyama M, Ariyoshi Y, Takahashi T, Ueda
R. p53 mutations in non-small cell lung cancer in Japan: association between
mutations and smoking. Cancer Res 1992; 52: 734-736.
Takahashi T, Takahashi T, Suzuki H, Hida T, Sekido Y, Ariyoshi Y, Ueda R.
The p53 gene is very frequently mutated in small-cell lung cancer with a distinct
nucleotide substitution pattern. Oncogene 1991; 6: 1775-1778.
Takagi Y, Koo LC, Osada H, Ueda R, Kyaw K, Ma CC, Suyama M, Saji S,
Takahashi T, Tominaga S, Takahashi T. Distinct mutational spectrum of the p53
gene in lung cancers from Chinese women in Hong Kong. Cancer Res 1995; 55:
5354-5357.
Takeshima Y, Inai K, Bennett WP, Metcalf RA, Welsh JA, Yonehara S, Hayashi
Y, Fujihara M, Yamakido M, Akiyama M, Tokuoka S, Land CE, Harris CC. p53
mutations in lung cancers from Japanese mustard gas workers. Carcinogenesis
1994; 15: 2075-2079.
Takeshima Y, Seyama T, Bennett WP, Akiyama M, Tokuoka S, Inai K, Mabuchi
K, Land CE, Harris CC. p53 mutations in lung cancers from non-smoking
atomic-bomb survivors. Lancet 1993; 342: 1520-1521.
Taylor JA, Watson MA, Devereux TR, Michels RY, Saccomanno G, Anderson
M. p53 mutation hotspot in radon-associated lung cancer. Lancet 1994; 343: 86-
87.
Top B, Mooi WJ, Klaver SG, Boerrigter L, Wisman P, Elbers HR, Visser S,
Rodenhuis S. Comparative analysis of p53 gene mutations and protein
accumulation in human non-small-cell lung cancer. Int J Cancer 1995; 64: 83-91.
Vogelstein B, Kinzler KW, eds. The Genetic Basis of Human Cancer. New
York: McGraw-Hill, 1998.
Wang X, Christiani DC, Wiencke JK, Fischbein M, Xu X, Cheng TJ, Mark E,
Wain JC, Kelsey KT. Mutations in the p53 gene in lung cancer are associated
with cigarette smoking and asbestos exposure. Cancer Epidemiol Biomarkers
Prev 1995; 4: 543-548.
Willett W. The search for the causes of breast and colon cancer. Nature 1989;
338: 389-394.
Yu MC. Nasopharyngeal carcinoma: epidemiology and dietary factors. In:
O’Neill IK, Chen J, Bartsch H, eds. Relevance to human cancer of N-Nitroso
compounds, tobacco smoke and mycotoxins. Lyon, International Agency for
Research on Cancer, 1991.
Ziegler A, Jonason AS, Leffell DJ, Simon JA, Sharma HW, Kimmelman J,
Remington L, Jacks T, Brash DE. Sunburn and p53 in the onset of skin cancer.
Nature 1994; 372: 773-776.
NOTE TO USERS
The original manuscript received by UMI contains pages with
light print. Pages were microfilmed as received.
This reproduction is the best copy available
UMI Number: 1393177
UMI Microform 1393177
Copyright 1999, by UMI Company. All rights reserved.
This microform edition is protected against unauthorized
copying under Title 17, United States Code.
UMI
300 North Zeeb Road
Ann Arbor, MI 48103
Asset Metadata
Creator: Lo, Mary Ju Fang (author)
Core Title: Cluster analysis of p53 mutational spectra
School: Graduate School
Degree: Master of Science
Degree Program: Applied Biometry/Epidemiology
Degree Conferral Date: 1998-08
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: biology, biostatistics, health sciences, oncology, OAI-PMH Harvest
Language: English
Contributor: Digitized by ProQuest (provenance)
Advisor: Buckley, Jonathan D. (committee chair), [illegible] (committee member)
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c16-25237
Unique identifier: UC11341189
Identifier: 1393177.pdf (filename), usctheses-c16-25237 (legacy record id)
Legacy Identifier: 1393177.pdf
Dmrecord: 25237
Document Type: Thesis
Rights: Lo, Mary Ju Fang
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA