A large-scale genetic association study of prostate cancer in a multi-ethnic population
INFORMATION TO USERS
This manuscript has been reproduced from the microfilm master. UMI films
the text directly from the original or copy submitted. Thus, some thesis and
dissertation copies are in typewriter face, while others may be from any type of
computer printer.
The quality of this reproduction is dependent upon the quality of the
copy submitted. Broken or indistinct print, colored or poor quality illustrations
and photographs, print bleedthrough, substandard margins, and improper
alignment can adversely affect reproduction.
In the unlikely event that the author did not send UMI a complete manuscript
and there are missing pages, these will be noted. Also, if unauthorized
copyright material had to be removed, a note will indicate the deletion.
Oversize materials (e.g., maps, drawings, charts) are reproduced by
sectioning the original, beginning at the upper left-hand corner and continuing
from left to right in equal sections with small overlaps.
ProQuest Information and Learning
300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA
800-521-0600
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A LARGE-SCALE GENETIC ASSOCIATION STUDY OF PROSTATE
CANCER IN A MULTI-ETHNIC POPULATION
by
Celeste Leigh Pearce
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirement of the Degree
DOCTOR OF PHILOSOPHY
(EPIDEMIOLOGY)
May 2002
Copyright 2002 Celeste Leigh Pearce
UMI Number: 3073834
Copyright 2002 by
Pearce, Celeste Leigh
All rights reserved.
UMI Microform 3073834
Copyright 2003 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346
UNIVERSITY OF SOUTHERN CALIFORNIA
THE GRADUATE SCHOOL
UNIVERSITY PARK
LOS ANGELES, CALIFORNIA 90007
This dissertation, written by
under the direction of her Dissertation
Committee, and approved by all its members,
has been presented to and accepted by The
Graduate School, in partial fulfillment of
requirements for the degree of
DOCTOR OF PHILOSOPHY
Dean of Graduate Studies
DISSERTATION COMMITTEE
Chairperson
DEDICATION
This dissertation is dedicated to the memory of Sally Pearce.
Thank you for your comfort, company, and unconditional love.
ACKNOWLEDGMENTS
I would first like to thank Brian E. Henderson, MD, my committee
chairperson, for without his guidance and unwavering support this dissertation would
not have been written. He was never bothered by my lack of formal training in
genetics, but rather provided a stimulating environment in which I could learn.
I would like to recognize the other members of my committee, David
Altshuler, M.D., Ph.D., Gerhard Coetzee, Ph.D., Sue Ingles, Dr.P.H., Ronald Ross,
M.D., and Giske Ursin, M.D., Ph.D., for their help in focusing my work, for pushing
me to do my best, and for giving up their precious time for me.
I want to thank Bisher Akil, M.D. who has provided me with encouragement,
love and support throughout this entire process, and even continues to love me. I
also want to thank my close friends, Scott Pearce, Henry Chu, and Barbara Kennedy,
who believed in me and accepted my moodiness and busyness without complaint.
And thank you to Roberta McKean-Cowdin, Ph.D. who kept me strong during the
tears and doubts.
I would like to thank Alexandra Levine, M.D. who continues to be an
inspiration to me as a scientist and a woman. And thank you to Joel Hirschhorn,
M.D., Ph.D., Noel Burtt, and Matthew Freedman, M.D. for their assistance.
Last, but certainly not least, I want to express my gratitude to Malcolm Pike,
Ph.D. for spending many hours working with me on this dissertation and for
choosing me as his post-doctoral fellow. He makes me feel smart, respected, and
challenged. His expectations of me are high and for that I am grateful.
TABLE OF CONTENTS
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
INTRODUCTION
I. REVIEW OF THE LITERATURE
A. Prospective Studies of Serum Hormones
B. Hormone Levels and Racial/Ethnic Variation in Risk
C. Genes and Cancer
1. SRD5A2
2. AR
3. HSD3B1
II. DATA ANALYSIS
A. Association of Missense Substitution in SRD5A2 Gene with Prostate Cancer
in African-American and Hispanic Men in Los Angeles, USA
1. Introduction
2. Methods
a. Epidemiology
b. Molecular Biology
c. Biochemistry
3. Results
4. Discussion
B. SRD5A2 V89L Substitution is Not Associated with Risk of Prostate Cancer
in a Multi-Ethnic Population Study
1. Introduction
2. Methods
3. Results
4. Conclusions
C. Association Between the Steroid 5-Alpha Reductase Type II Gene A49T
Missense Variant and Prostate Cancer Risk in Whites and Japanese-Americans
1. Introduction
2. Methods
3. Results
4. Discussion
D. Association Between Hydroxysteroid Dehydrogenase 3-Beta Type I Gene
F286L Missense Variant and Risk of Prostate Cancer in a Multiethnic
Population Study
1. Introduction
2. Methods
a. HSD3B1 Variant
b. Genotyping
c. Quality Control
d. Severity of Disease
e. Statistical Analysis
3. Results
4. Discussion
E. Association Between Single Nucleotide Polymorphisms in the Androgen
Receptor and Risk of Prostate Cancer in a Multiethnic Population
1. Introduction
2. Methods
a. AR SNPs
b. Genotyping
c. Severity of Disease
d. Statistical Analysis
3. Results
4. Discussion
III. METHODOLOGICAL DEVELOPMENT: STRATEGY FOR CASE-COHORT
ANALYSIS FOR STUDIES OF GENETIC SUSCEPTIBILITY IN
THE MULTIETHNIC COHORT STUDY
A. Introduction
B. MEC ‘Effective’ Sub-Cohort and ‘Design’ Sub-Cohort
C. Participation Rate
D. Time Scales
F. Statistical Analysis
IV. GRANT PROPOSAL: GENETIC SUSCEPTIBILITY TO PROSTATE
CANCER IN MINORITY POPULATION
A. Introduction
B. Specific Aims
C. Background and Significance
1. Genetic Nature of Prostate Cancer
2. The Androgen Biosynthetic Pathway
3. Adrenal Androgens and Prostate Cancer
4. Prostate Cancer Genetic Association Studies
5. Sequencing and Analysis
D. Preliminary Results
1. Hawaii-Los Angeles Multiethnic Cohort Study of Cancer and Diet
a. Cases Identified and Specimens Collected
b. SRD5A2
2. Whitehead Institute/MIT Center for Genome Research
3. Characterization of Genotyping Methods
4. USC/Norris Cancer Center and Whitehead Institute Pilot Study
E. Research Design and Methods
1. Description of the Multiethnic Cohort
2. Identification of Incident Cancer Cases
3. Obtaining Biological Samples (2001-2006)
a. Collection of Samples from Prostate Cancer Cases and Controls
b. Processing and Storage of Blood Specimens
c. DNA Extraction and Storage
d. Data Management
4. Selection of Genes for Study
5. Search for Functional SNPs in Coding and Regulatory Regions
6. Selection of SNPs from Public Databases
7. Single Marker and Linkage Disequilibrium Analysis
8. Laboratory Methods
9. Statistical Issues
a. Data Analysis
b. Power
10. Criteria for Interpretation of Gene-Cancer Associations
11. Intra-ethnic Stratification
F. Translational Focus
REFERENCES
LIST OF TABLES
Table 1. Description of study populations for prospective studies of hormone level-
prostate cancer risk
Table 2. Relative risks for type of hormone and risk of prostate cancer for all
prospective hormone level-prostate cancer risk studies 1988-1999
Table 3. Description of T and AAG results in published cross-sectional studies of
circulating hormone levels
Table 4. Age and educational attainment of cases and controls
Table 5. Association of the A49T missense substitution in SRD5A2 gene with risk
of prostate cancer
Table 6. In-vitro kinetic properties of mutant and wild-type SRD5A2 enzyme
Table 7. Odds ratios and 95% confidence limits for the association between V89L
genotype and risk of prostate cancer by racial/ethnic group
Table 8. Descriptive characteristics of the study population by racial/ethnic group
Table 9. Association between A49T genotype and risk of prostate cancer by
racial/ethnic group
Table 10. Genotype and allele frequencies for F286L by racial/ethnic group and
sub-cohort membership
Table 11. Relative risks and 95% confidence intervals for F286L genotype by ...
Table 12. Descriptive comparison between the sub-cohort and the entire MEC
cohort by racial/ethnic group
Table 13. Descriptive characteristics of study population by sub-cohort membership
(n and %)
Table 14. Minor allele frequencies and locations for AR SNPs by racial/ethnic group
for sub-cohort members and cases
Table 15. Relative risk and 95% CIs for the minor allele AR SNPs by racial/ethnic
group
Table 16. Haplotype structure of the AR gene using 11 SNPs by racial/ethnic group
Table 17. Relative risk and 95% CIs for AR Haplotypes for African-Americans and
all groups combined
Table 18. Number of Invasive Prostate Cancer Cases and Age-Adjusted Incidence
Rates from the Multiethnic Cohort in African-American and Latino
participants through 12/31/97
Table 19. Sample Collection from Incident Prostate Cancer Cases and Cohort
Controls in Los Angeles, as of March 1, 2000
Table 20. Relevant Genes and Discovered SNPs Screened by the Whitehead
Institute
Table 21. Odds Ratios and 95% Confidence Intervals for Genotype and Prostate
Cancer Risk by Racial/Ethnic Group
Table 22. Distribution of the Cohort by Age, Sex, and Ethnicity, Hawaii and Los
Angeles, 1993-1996
Table 23. Projected Blood Sample Collection from Invasive Prostate Cancer Cases
Los Angeles (1993-2006)
Table 24. Lower and Upper Bounds of Prevalence of Any Factor that Increases Risk
in a Population that Can Be Detected with 80% Power and a 2-sided Type
I Error of 5% with the Number of Cases Available Per Racial/Ethnic
Group in this Study
Table 25. Detectable Interaction Relative Risks (RR_I) with 80% Power and a 2-sided
Type I Error of 5%; Marginal ‘Environmental’ Factor Relative Risk = 1.5
with Population Prevalence of 50%; Marginal Genotype Relative Risk =
1.5
LIST OF FIGURES
Figure 1. The androgen metabolism pathway
Figure 2. In-vitro kinetics of normal (wild-type) and mutant (A49T) SRD5A2
enzymes
Figure 3. SNP map of the androgen receptor gene (map is not to scale)
Figure 4. Pairwise D’ (top value), LOD score (middle value), and R² (bottom value)
for the 11 AR markers for African-Americans
Figure 5. Members of the ‘Design’ and ‘Effective’ Sub-Cohort
Figure 6. Members of the ‘Design’ Sub-Cohort, but not the ‘Effective’ Sub-Cohort
Figure 7. Categorization of Sub-Cohort Membership
Figure 8. Risk Set Categorization for Sub-Cohort Members
Figure 9. The pathways of steroid hormone synthesis in the testes and adrenals in
humans
Figure 10. Example of targeted resequencing and SNP selection in a candidate gene
ABSTRACT
Prostate cancer is a serious public health problem in the United States and
abroad. Despite decades of research, the etiology remains elusive; however, several
characteristics of the disease are well known. There is distinct racial/ethnic
variation in risk of prostate cancer, risk increases with age, androgen ablation
therapy is an effective treatment for advanced prostate cancer, and rats develop
prostate cancer after administration of testosterone.
Given the apparent role of androgens and the interesting racial/ethnic
variation in risk, the role of single nucleotide polymorphisms (SNPs) in genes
involved in androgen metabolism in the prostate was studied in relation to prostate
cancer risk in African-American, Japanese-American, Latino, and White men. The
three genes studied in this dissertation were SRD5A2, HSD3B1, and AR.
There was an apparent increased risk of advanced prostate cancer associated
with the T allele of the A49T SNP in the SRD5A2 gene in African-American and
Latino men, but not in Japanese-American and White men. There was no association
between the V89L SNP in the SRD5A2 gene and risk of prostate cancer in any of the
four ethnic groups studied.
There was an increased risk of advanced prostate cancer associated with the
LL genotype of the F286L SNP of HSD3B1 in African-American men. The LL
genotype was not present at an appreciable frequency in the other racial/ethnic
groups.
A total of eleven SNPs in the AR gene were studied using both a single SNP
analysis and a haplotype analysis. Although there did not appear to be a strong
association with prostate cancer risk, the haplotype diversity was interesting.
Although a total of 2,048 haplotypes were possible (2^11), only nineteen were observed,
and only ten at a frequency of one percent or greater. The Japanese-American group
had only one haplotype whereas the African-Americans showed the greatest
diversity.
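The haplotype count quoted above follows from simple combinatorics: with n biallelic SNPs, each haplotype carries one of two alleles at every site, giving 2^n possible combinations. The sketch below is an illustration only and is not code from this dissertation; the function names and the 0/1 allele coding are assumptions made for the example.

```python
from itertools import product


def count_possible_haplotypes(n_snps: int) -> int:
    """Number of distinct haplotypes for n biallelic SNPs (two alleles per site)."""
    return 2 ** n_snps


def enumerate_haplotypes(n_snps: int):
    """Return an iterator over every possible haplotype as a tuple of 0/1 codes.

    The 0/1 coding (0 = major allele, 1 = minor allele) is a hypothetical
    convention chosen for this example.
    """
    return product((0, 1), repeat=n_snps)


if __name__ == "__main__":
    n = 11          # number of AR SNPs reported in the abstract
    observed = 19   # number of haplotypes observed, as reported in the abstract
    possible = count_possible_haplotypes(n)
    print(possible)
    print(f"{observed / possible:.2%} of the possible haplotypes were observed")
```

Explicit enumeration is only practical for small n, since the number of combinations doubles with each added SNP; the closed-form count is enough to reproduce the figure in the text.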
Several candidate SNPs in genes involved in androgen metabolism in the
prostate have been identified as potentially associated with prostate cancer risk in the
multi-ethnic population studied. Additional studies are recommended to confirm
these findings.
The methodology needed to conduct a case-cohort analysis using the genetic
data within this study population was also developed as part of this dissertation.
These methods were implemented in this dissertation and will continue to be used in
future studies.
Future research directions were proposed as part of a grant proposal that is
also included in this dissertation. This grant proposes to investigate the role of genes
involved in adrenal androgen biosynthesis and metabolism and represents the
independent development of a research hypothesis.
INTRODUCTION
This dissertation is written in accordance with the requirements of the Doctor
of Philosophy (Ph.D.) degree in Epidemiology in the Department of Preventive
Medicine at the University of Southern California Keck School of Medicine. This
dissertation addresses androgen metabolism and genetic susceptibility to prostate
cancer and falls into four parts: (I) a literature review of the epidemiological data
relevant to the research topic; (II) a series of five manuscripts on this topic; (III) a
detailed epidemiologic methods section relevant to the study design utilized in this
dissertation; and (IV) a grant proposal for future research on a topic related to the
dissertation research.
Prostate cancer is a serious public health problem in the United States (U.S.).
Prostate cancer is the most common cancer in men and the second leading cause of
cancer-related death in men (American Cancer Society, 2001). In 2001, it was
estimated that there were 198,100 new cases of prostate cancer among U.S. men and
31,500 deaths (American Cancer Society, 2001). Little is known about the etiology
of the disease.
Incidence rates of prostate cancer in the U.S. show a distinct racial/ethnic
pattern. African-Americans are approximately 1.5 times as likely as Whites to
develop prostate cancer whereas Asian-Americans are 0.6 times as likely (American
Cancer Society, 2001). The mechanism through which race/ethnicity plays a role in
prostate cancer risk is not well understood. It has been suggested that racial/ethnic
variation in disease incidence may be due to environmental factors, particularly
because of the reported low incidence rates of prostate cancer in blacks outside the
U.S. compared to the high rate in African-American men.
The role of environment is, however, unclear; recent literature has suggested
that the reported rates of prostate cancer in Africa and other predominantly black
countries are underestimated. If the rates in these less developed countries are in
fact much higher than previously believed, it is likely that the high rates in
African-American men are not due to the adoption of a Western lifestyle, but rather
are indicative of an underlying predisposition to prostate cancer.
For example, the 1973-1977 estimated age-adjusted incidence rate (adjusted
to the U.S. 1970 population) of prostate cancer for men in Kingston, Jamaica, of
whom 91% are of African descent (Glover et al., 1998), was ~41/100,000 men (this
is the most recent time period available) compared to ~117/100,000 for U.S. Blacks
(Waterhouse, Muir, Shanmugaratnam, & Powell, 1982). However, Glover and
colleagues recently conducted an extensive review of hospital, physician office,
government pathology laboratory, and the Jamaican Cancer Registry records to
identify all prostate cancer cases diagnosed in Kingston, Jamaica from 1989 to 1994
and found an incidence rate higher than that of American blacks (Glover et al.,
1998). These investigators paid careful attention to ensuring that only Kingston
residents were included and that individuals were uniquely identified so as to avoid
double counting cases. Their reported 1992 rate of prostate cancer for Kingston was
324.3/100,000 compared to 249.1/100,000 for U.S. Blacks (Glover et al., 1998).
The authors speculated that the disparity between the rates for 1973-1977
(40.8/100,000) and 1992 (324.3/100,000) is likely due to underreporting of the
disease to the cancer registry. There is little impetus to send information to the
cancer registry because prostate cancer is not reportable. The investigators noted
that only 30% of cases diagnosed by transrectal biopsy in urology offices were
reported (Glover et al., 1998). It is not clear whether this high rate of prostate cancer
can be generalized to the whole of Jamaica. Diagnosis is likely to be a serious
problem in areas outside of Kingston as only one of the six urologists in Jamaica at
the time of the Glover study practiced outside of Kingston (Glover et al., 1998).
Glover et al. may have found higher rates in Jamaica compared to the U.S. because
of errors in counting cases in Jamaica, but it may also be a reflection of a less
admixed population in Jamaican blacks compared to U.S. blacks. This, of course,
would suggest that the rates among blacks in Africa should be as high, if not higher,
than those in Kingston.
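The size of the differences discussed above can be made concrete as simple rate ratios. The sketch below is an illustration only, not an analysis from this dissertation: the function and constant names are hypothetical, and it assumes the quoted rates are expressed per 100,000 men and standardized comparably, which the source does not fully confirm.

```python
def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two incidence rates expressed per the same population base."""
    if rate_b <= 0:
        raise ValueError("reference rate must be positive")
    return rate_a / rate_b


# Rates per 100,000 men, as quoted in the text (Glover et al., 1998).
KINGSTON_1992 = 324.3
US_BLACKS_1992 = 249.1
KINGSTON_1973_1977 = 40.8

if __name__ == "__main__":
    # Kingston versus U.S. Blacks in 1992
    print(round(rate_ratio(KINGSTON_1992, US_BLACKS_1992), 2))
    # Kingston 1992 versus the earlier registry-based Kingston estimate
    print(round(rate_ratio(KINGSTON_1992, KINGSTON_1973_1977), 2))
```

Under these assumptions the 1992 Kingston rate is roughly 1.3 times the U.S. Black rate and nearly 8 times the 1973-1977 registry-based estimate, which is consistent with the authors' explanation of underreporting.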
The reported prostate cancer incidence rates in African countries are quite
low (on the order of ~20/100,000; age-adjusted to the 1970 U.S. population). This
most likely reflects under-reporting and under-diagnosis of the disease; a special
study from Nigeria by Osegbe suggested that the rate in Nigeria is 127/100,000
men (the population used for standardization is not clear) (Osegbe, 1997). However, the
quality of this article is in question and therefore the true rate is unknown.
Further, the only environmental factor that has been consistently associated
with prostate cancer risk is dietary fat intake (Shibata & Whittemore, 1997).
However, Whittemore and colleagues have estimated that differences in dietary fat
intake would likely only explain 10% of the black-white difference and 15% of the
white-Asian difference in U.S. prostate cancer incidence rates (Whittemore et al.,
1995). Although it is possible that unidentified environmental factors could explain
the difference in disease risk among racial/ethnic groups, this appears unlikely given
the extent to which this disease has been studied.
An alternative to the environmental explanation for racial/ethnic variation in
disease risk is that differences may be due to different underlying genetic
susceptibility to the disease. There is limited evidence that circulating metabolite
levels of the most potent androgen in the prostate, dihydrotestosterone (DHT), are
lower in Asian groups compared to groups at higher risk of prostate cancer
(Lookingbill et al., 1991; Ross et al., 1992). These circulating androgen levels are
likely under genetic control (see below).
The prostate is an androgen-dependent organ and it has long been
hypothesized that androgens are involved in the development of prostate cancer.
Huggins and Hodges established the role of androgen ablative therapy as an effective
treatment modality for metastatic prostate cancer more than 50 years ago (Huggins &
Hodges, 1941). Androgen ablative therapy remains the treatment of choice for
metastatic prostate cancer today. Experimental evidence also supports a role for
androgens in the etiology of prostate cancer. Noble’s groundbreaking work showed
an increase in prostate cancer incidence among rats following administration of
testosterone (T) (Noble, 1977). He demonstrated an increase in the incidence rate of
prostate cancer from 0.5% to 20% in the Nb rat treated with T propionate (adding the
ester propionate slows the release of T) (Noble, 1977).
Androgen levels (circulating or intra-prostatic) may influence risk of prostate
cancer and genetic variation may affect androgen levels and therefore prostate cancer
risk. Genetic susceptibility to prostate cancer may be mediated through variability in
androgen metabolism and it has been proposed that this variability may also explain
the racial/ethnic distribution of disease. The focus of this dissertation is on variation
in genes which control androgen metabolism in the prostate.
Part I of this dissertation consists of a literature review of the androgen-
prostate cancer hypothesis in order to provide a framework for the dissertation work.
The topics covered include (a) prospective studies of circulating androgen levels and
risk of prostate cancer; (b) cross-sectional studies of circulating androgen levels
across racial/ethnic groups; and (c) androgen metabolism genes and prostate cancer.
T and androstenedione (A-dione) biosynthesis in the testes and
dehydroepiandrosterone (DHEA) and A-dione biosynthesis in the adrenals are
controlled through complex hypothalamic-pituitary feedback loops. Circulating T is
almost exclusively derived from the testes. Approximately half of circulating A-
dione and nearly all of circulating DHEA are derived from the adrenals. These three
hormones circulate in the blood and then enter the prostate. Approximately 60% of
DHT in the prostate is converted from circulating T. The remaining 40% of DHT in
the prostate is converted in the prostate from circulating DHEA and A-dione (Labrie,
Dupont, & Belanger, 1985). Intra-prostatic DHT, and to a lesser extent T, bind to
the androgen receptor (AR); and the AR-DHT and AR-T complexes transactivate
target genes resulting in prostate cell proliferation.
If androgens are a major factor influencing prostate cancer risk, then
individuals with high circulating androgen levels should be more likely to develop
prostate cancer than men with low circulating levels, and racial/ethnic groups at
highest risk of prostate cancer should also have the highest circulating androgen
levels unless androgen metabolism in the prostate is markedly different in different
racial/ethnic groups.
Part II of this dissertation consists of an independent research project
presented as a series of five papers, either published or in the process of being
published, on several aspects of the role of intra-prostatic androgens in genetic
susceptibility to prostate cancer risk.
The five papers of Part II of this dissertation rely entirely on the study
population of the Multiethnic Cohort Study of Diet and Cancer (MEC) (Kolonel et
al., 2000). The MEC is a collaborative study between the University of Hawaii and
the University of Southern California (Principal Investigators: Drs. Larry Kolonel
and Brian Henderson). The MEC was initiated in 1993 and recruitment was
completed in 1996. The total size of the cohort is 215,251 men and women. The
MEC is now in its eighth year of follow-up. The cohort is followed for incident
cancers through linkage with the Los Angeles County Cancer Surveillance Program
(the Surveillance, Epidemiology, and End Results [SEER] registry covering Los
Angeles County), the California State Cancer Registry, and the Hawaii SEER
program. Annual linkage to the Hawaii and California death files is also completed
to determine vital status of cohort members. MEC members who develop prostate
cancer (cases) are contacted and asked to provide a blood sample. Also, a random
sample of MEC members was selected to be a sub-cohort from whom blood samples
would be requested (controls). A sample of the cases and sub-cohort members who
provided blood samples serves as the study population for this dissertation. This
sample consists of men from the four major racial/ethnic groups in the MEC:
African-Americans, Japanese-Americans, Latinos, and Whites. These four groups
will permit us to test the hypothesis that racial/ethnic variation in prostate cancer risk
can be explained by genetic variation in androgen metabolism in these groups.
The first, second, and third papers in Part II of this dissertation address the
hypothesis that two missense (variants that alter the encoded amino acid) single
nucleotide polymorphisms (SNPs) in the steroid 5-alpha reductase type II (SRD5A2)
gene are associated with risk of prostate cancer. This key gene is responsible for the
conversion of T to DHT in the prostate. These two SNPs were identified by
sequencing of the gene which was done in the laboratory of Dr. Juergen K.V.
Reichardt at the University of Southern California.
I conducted the epidemiological analysis for the first paper as well as writing
the epidemiological methods and results sections of the paper. I also contributed to
the introduction and discussion sections. The first paper was published in The
Lancet. I conducted the analysis and wrote the second and third papers completely,
with the thoughtful and appreciated input of my co-authors. The second paper was
published in Cancer Epidemiology, Biomarkers and Prevention. The third paper has
been accepted by the British Journal of Cancer.
The fourth paper in Part II of this dissertation considers the role of an
additional gene involved in androgen metabolism in prostate cancer risk,
hydroxysteroid dehydrogenase 3-beta type 1 (HSD3B1), which encodes the enzyme
that metabolizes DHEA to A-dione in the prostate. The single SNP was identified
through sequencing conducted at the Whitehead Institute in Cambridge,
Massachusetts. I did the majority of the genotyping described in the paper,
conducted the statistical analysis, and wrote this paper. This paper will be submitted
to Cancer Research.
The fifth paper in Part II of this dissertation explores the association between
prostate cancer and 11 SNPs in the AR gene utilizing a haplotype approach. A
haplotype is a segment of a chromosome which is inherited as a unit. This approach
is used rather than a single SNP analysis to allow for the study of the entire gene,
including SNPs which have not yet been identified or genotyped. These SNPs were
identified from the National Center for Biotechnology Information database
(dbSNP). I participated in the genotyping of these 11 SNPs, conducted the statistical
analysis, and wrote this preliminary analysis.
Part III of this dissertation represents the epidemiological methods
contribution of this dissertation. This section details the special methods needed for
a case-cohort analysis with the MEC genetic data.
The final chapter of this dissertation, Part IV, is a grant proposal to study the
role of genes involved in adrenal androgen biosynthesis in risk of prostate cancer.
Although the majority of T (and DHT) exposure in the prostate comes from the
testes, nearly 40% of T (and DHT) in the prostate is derived from adrenal precursors
(Labrie et al., 1985). I propose in this grant proposal to study six genes that are
involved in adrenal androgen biosynthesis. I suggest a haplotype-based approach
utilizing SNPs discovered through targeted gene sequencing and those available in
the public database.
Ultimately, it is likely that understanding the role of androgens in the etiology
of prostate cancer will be dependent on building a polygenic model of prostate
cancer risk that considers genes involved in both biosynthesis and metabolism of
androgens. As our understanding of the human genome increases and this
information is readily available to the research community we will be able to apply
this knowledge to unlocking the mystery of prostate cancer risk.
I. REVIEW OF THE LITERATURE
The biologic hypothesis that androgens are involved in risk of prostate cancer
emerged from evidence that androgen ablation therapy was effective in the treatment
of prostate cancer and that administration of androgens to rats could induce the
disease. Epidemiologists have sought to further characterize the role of androgens in
prostate cancer risk primarily using three approaches. First, prospective studies have
been conducted to determine if circulating androgen levels are higher in men who
subsequently develop prostate cancer compared to men who do not develop disease.
Second, the correlation between serum levels of sex steroid hormones and
race/ethnicity has been examined cross-sectionally to determine if the levels vary by
racial/ethnic group in the same way that prostate cancer incidence rates vary. Third,
the relationship between genes involved in androgen metabolism and prostate cancer
has been studied. The literature related to these subjects is reviewed below.
A. Prospective Studies of Serum Hormones
If circulating hormones are associated with risk of prostate cancer,
individuals who develop disease should have different levels on average than
individuals who do not develop prostate cancer. Although this hypothesis has been
tested by a number of investigators (Barrett-Connor, Garland, McPhillips, Khaw, &
Wingard, 1990; Carter et al., 1995; Dorgan et al., 1998; Gann, Hennekens, Ma,
Longcope, & Stampfer, 1996; Guess et al., 1997; Heikkila et al., 1999; Hsing &
Comstock, 1993; Nomura, Heilbrun, Stemmermann, & Judd, 1988; Nomura,
Stemmermann, Chyou, Henderson, & Stanczyk, 1996; Vatten et al., 1997), the
picture remains unclear.
The association between circulating T levels and subsequent risk of prostate
cancer is tenuous. Three investigators have provided evidence that high T levels are
associated with risk of prostate cancer (Gann et al., 1996; Heikkila et al., 1999;
Hsing & Comstock, 1993), whereas four have found a decreased risk associated with
higher levels (table 1) (Dorgan et al., 1998; Guess et al., 1997; Nomura et al., 1996;
Vatten et al., 1997). None of these results is statistically significant, with the
exception of that of Gann and colleagues (see table 2) (Gann et al., 1996). These
investigators report a more than two-fold increased risk of disease for individuals in
the highest quartile of serum T level after adjusting for sex hormone binding globulin
(SHBG) and estradiol levels in the analysis. SHBG may play a role in prostate
cancer risk in that higher SHBG levels would lower the amount of bio-available
(unbound) T available to enter the prostate and reduced levels of SHBG were found
in the prostate cancer cases. The role of estradiol in prostate cancer risk is less clear
and it is not obvious from their analysis that estradiol was a strong confounder in the
T-prostate cancer association. Most previous studies, with the exceptions of
Heikkila and colleagues (Heikkila et al., 1999) and Dorgan et al. (Dorgan et al.,
1998), did not adjust for SHBG or estradiol. Heikkila and colleagues (Heikkila et
al., 1999) noted a similar finding to Gann (Gann et al., 1996) in an analysis which
excluded the first eight years of follow-up, after adjusting for SHBG, but this result
was not statistically significant. Dorgan and co-workers (Dorgan et al., 1998) did
Table 1. Description of study populations for prospective studies of hormone level-prostate cancer risk.

| First Author (year published) | Population | Period (Entry to end of follow-up) | Age at entry | Mean years to diagnosis (range) | Cases | Controls | Matching Variables |
|---|---|---|---|---|---|---|---|
| Nomura (1996) | 6860 participants of the Honolulu Heart Program, Oahu, HI - all Japanese-American | 1971-1993 | 45-68 | 7 (<1-14) | 141 | 141 | Age; Date of blood draw; Time of day of blood draw |
| Barrett-Connor (1990) | 1008 participants in the Rancho Bernardo Study, CA - all White | 1972-1986 | 40-79 | 8 (1-14) | 57 | 951 | Time of day of blood draw |
| Hsing (1993) | 25,260 residents of Washington County, MD - all White | 1974-1986 | 35-94 | Not given (1-12) | 98 | 98 | Age |
| Carter (1995) | 1,459 participants in the Baltimore Longitudinal Study of Aging, Baltimore, MD | 1958-1990 | 55-90 | Not given (7-25) | 20 | 16 | None |
| Gann (1996) | 22,071 participants of the Physician’s Health Study - primarily White | 1982-1992 | 40-84 | 6 (not given) | 222 | 390 | Age; Smoking |
| Guess (1997) | >125,000 participants in the Kaiser Permanente Medical Care Program (KPMCP) - all White | 1964-1987 | Not given | 14 (5-23) [Median] | 106 | 106 | Age; Date of blood draw; Clinic site |
| Vatten (1997) | 28,000 blood donors, Oslo, Norway - all White | 1973-1994 | 42-66 | 10 (1-19) | 59 | 180 | Age; Date of blood draw |
| Dorgan (1998) | 29,133 participants of the Alpha-Tocopherol, Beta-Carotene Study (ATBC), Finland - all White smokers | 1985-1993 | 50-69 | 4 (<1-7.2) [Median] | 116 | 231 | Age; Date of blood draw; Clinic site; Intervention group |
| Heikkila (1999) | 16,481 participants from the Mobile Clinic Health Examination Survey, Finland - all White | 1966-1991 | 18-78 | Not given | 166 | 300 | Age; Municipality; Length of time for serum sample storage |
Table 2. Relative risks for type of hormone and risk of prostate cancer for all prospective
hormone level-prostate cancer risk studies 1988-1999.
Study T (total) DHT T/DHT Estradiol SHBG Androstenedione T (free) AAG
Nomura (96) 1.0 1.0 1.0 1.0 1.0
Quartiles 0.77 0.87 1.21 0.91 0.52
0.73 0.74 1.38 1.43 0.56
1.03 0.82 1.24 1.09 0.85
Barrett- 1.0 (/6nM) 1.10 (/0.04nM) 1.04 (/18nM) 1.26 (/1.1nM)
Connor (90) 1.0 (tertiles)
/nM 1.34
/tertile 1.98
Hsing (93) 1.0 1.0 1.0 1.0
Quartiles 1.7 0.7 1.4 0.9
2.0 0.8 1.4 1.1
1.5 1.0 1.7 1.0
Gann (96) 1.0 1.0 1.0 1.0 1.0 1.0
Unadjusted 1.26 1.12 1.68 0.59* 1.05 1.35
Quartiles 1.27 0.81 1.37 0.50* 0.71 1.54
1.30 0.83 2.35* 0.75 0.69 1.47
Gann (96) 1.0 1.0 1.0 1.0 1.0
Adjusted 1.41 1.02 0.53* 0.93 1.44
Quartiles 1.98 0.78 0.40* 0.61 1.58
2.60* 0.71 0.56* 0.46 1.60
Guess (97) 0.95 (/quartile) 1.08 (/quartile) 1.11 (/quartile)
Vatten (97) 1.0 1.0 1.0 1.0
Quartiles 0.75 0.59 1.10 1.52
0.79 0.87 0.79 1.40
0.83 0.83 1.31 1.10
Dorgan (98) 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
Adjusted 1.0 0.90 1.10 1.30 0.40* 0.70 0.80 0.70
Quartiles 0.70 0.80 1.30 1.0 0.60 0.80 0.60 1.30
0.80 0.70 1.70 1.10 0.80 1.0 1.10 1.20
Heikkila (99) 1.0 1.0 1.0
Adjusted 1.33 1.39 0.70
Quintiles 1.23 0.98 1.20
1.07 1.10 0.91
1.23 1.12 0.92
* Indicates statistically significant at the p<0.05 level
not find an increased risk associated with circulating T levels in their study after
adjusting for SHBG and estradiol. Additional studies are needed to better understand
the role of circulating T levels and risk of prostate cancer, as well as the role of
estradiol and SHBG.
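As an illustration of how a relative risk comparing the highest to the lowest quartile of a hormone is estimated, the following is a minimal sketch of an unmatched odds ratio with a Woolf (log-based) 95% confidence interval. The counts are hypothetical and are not taken from any of the studies cited here; the published estimates came from matched and adjusted analyses, which this simplified calculation does not reproduce.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_controls, ref_cases, ref_controls, z=1.96):
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    "Exposed" here means the highest hormone quartile and "ref" the lowest
    quartile; z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exposed_cases * ref_controls) / (exposed_controls * ref_cases)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_controls + 1 / ref_cases + 1 / ref_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (assumption, for illustration only).
or_top_vs_bottom, ci_lower, ci_upper = odds_ratio_ci(60, 40, 40, 60)
print(f"OR = {or_top_vs_bottom:.2f} (95% CI {ci_lower:.2f}-{ci_upper:.2f})")
```

An interval that excludes 1.0 corresponds to a result marked as statistically significant at the p<0.05 level in Table 2.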
Circulating DHT, the most potent androgen and most bioactive in the
prostate, has been studied extensively. Although no statistically significant findings
have been reported between circulating DHT levels and risk of prostate cancer, all of
the five studies which have examined this hormone have found a decreased risk
associated with higher circulating levels (Dorgan et al., 1998; Gann et al., 1996;
Hsing & Comstock, 1993; Nomura et al., 1996; Vatten et al., 1997). This finding
would be consistent with an hypothesis that DHT is not leaving the prostate quickly
so that individuals who subsequently develop prostate cancer have lower circulating
levels, but higher levels in the organ. However, there are no data to either support or
refute this hypothesis.
These prospective studies have provided some evidence of a role for
circulating androgen levels as a marker for subsequent risk of prostate cancer, but the
relationship is not clear. It is not obvious that circulating androgen levels actually
reflect intra-prostatic exposure. Studies are needed which correlate circulating levels
of androgens to prostate tissue levels.
B. Hormone Levels and Racial/Ethnic Variation in Risk
African-American men are approximately 1.5 times as likely to develop
prostate cancer as their White counterparts, and risk among Asian groups is lower
than in Whites. As a result of this observation and the proposed role of androgens in
risk of disease, investigators hypothesized that healthy African-American men would
have higher circulating T levels, leading to a higher lifetime exposure to this
hormone (see Table 3). The high exposure over a lifetime could account for the
increased risk of prostate cancer for African-Americans.
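Table 3. Description of T and AAG results in published cross-sectional studies
of circulating hormone levels.

| Study | Measure | African-American | White | Asian |
|---|---|---|---|---|
| Ross | n | 49 | 47 | 54 |
| Ross | Total T (ng/dl) | 640 | 575 | 602 |
| Ross | Free T (pg/ml) | 166 | 137 | n/a |
| Ross | AAG (ng/ml) | 6.59 | 6.91 | 5.28 |
| Lookingbill | n | n/a | 53 | 57 |
| Lookingbill | Total T (umol/L) | n/a | ~20 | ~20 |
| Lookingbill | Free T | n/a | n/a | n/a |
| Lookingbill | AAG (umol/L) | n/a | n/a | n/a |
| Ellis | n | 525 | 3654 | 34 |
| Ellis | Total T (ng/dl) | 659 | 637 | 647 |
| Ellis | Free T | n/a | n/a | n/a |
| Ellis | AAG | n/a | n/a | n/a |
| Wu | n | 315 | 411 | 275 |
| Wu | Total T (ng/dl) | 495.3 | 470.9 | 520.5 |
| Wu | Free T (pg/ml) | 101.3 | 97.4 | 103.9 |
| Wu | AAG | n/a | n/a | n/a |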
This hypothesis was borne out in the study of Ross and colleagues (Ross et
al., 1986) which found that T levels were higher in African-American men compared
to the White men. In their study of college-age men, these investigators reported that
the African-American subjects had 15% higher circulating T levels compared to the
White men. This finding was also supported by the studies of Ellis and colleagues
(Ellis & Nyborg, 1992) and Wu, et al. (Wu et al., 1995) who also demonstrated that
African-American men had higher T levels compared to Whites.
The hypothesis predicted that Asian groups, who are at lowest risk of prostate
cancer, would have the lowest circulating T levels when compared to the higher risk
groups of African-Americans and Whites. However, this assumption was not
supported by the research findings. Both Ross, et al. (Ross et al., 1992) and Ellis, et
al. (Ellis & Nyborg, 1992) found that their Asian groups had T levels intermediate
between African-Americans and Whites. Further, Wu and coworkers (Wu et al.,
1995) found that Asians in their study had T levels higher than both African-
Americans and Whites. A fourth group, Lookingbill and colleagues (Lookingbill et
al., 1991), found no difference in T levels between Whites and Asians in their study.
These inconsistencies led to the study of a T metabolite, 3-alpha
androstanediol glucuronide (AAG), blood levels of which are considered to be a
measure of intra-prostatic DHT exposure. It was believed that low levels of AAG in
the blood would indicate a low exposure to DHT in the prostate. The hypothesis
followed that African-Americans would have high AAG levels compared to White
and Asian men.
This prediction was supported by the findings of Lookingbill and colleagues
(Lookingbill et al., 1991) who found that the Asian men in their study had lower
AAG levels than White men. Similarly, Ross and colleagues (Ross et al., 1992)
found that Japanese men in their study had AAG levels lower than the African-
American and White men. However, although Ross et al. (Ross et al., 1992)
reported low AAG levels in their Asian men, African-Americans did not have the
highest levels as would be expected; Whites in their study had the highest levels.
This DHT metabolite has not been studied by any other groups.
Overall, neither T nor AAG follows the expected distribution pattern based on
the incidence rates in the different racial/ethnic groups. If circulating T is the key
measurement of subsequent risk of prostate cancer, African-Americans should have
the highest and Asian populations the lowest levels, but this is not the case. If AAG
is the critical marker, African-Americans should have the highest levels and Asians
should have the lowest, which again is not the case.
The meaning of these data is unclear given that a single circulating
measurement is being correlated with a lifetime exposure of the target organ. Also,
the directionality of these hypotheses is questionable. For example, one could argue
that low circulating AAG levels, not high levels, would increase risk of prostate
cancer because DHT is not being efficiently eliminated from the prostate,
allowing for continued transactivation of target genes and increased cell
proliferation.
Overall, these cross-sectional studies comparing circulating hormone levels
by racial/ethnic group do not provide great insight into the role of androgens in risk
of prostate cancer.
The weak evidence relating circulating hormones to risk of prostate cancer
suggests that an alternative route of investigation into the role of androgens and risk
of prostate cancer is warranted. A more direct approach, which may better estimate
the lifetime exposure of the prostate to androgens, can be taken by studying variation
in genes which control hormone metabolism in the organ itself.
C. Genes and Cancer
There are two primary models for genetic susceptibility to cancer (Risch &
Merikangas, 1996). The first suggests that single major genes, which follow strict
Mendelian inheritance patterns, will cause the development of cancer. These major
loci are uncommon in the population and therefore do not explain the majority of
cancers, even though they may be highly penetrant. Several major prostate cancer
genes have been proposed, such as the candidate region located on 1q(24-25) (Smith
et al., 1996); however, no major prostate cancer susceptibility gene has been
identified.
Much more likely are minor susceptibility genes associated with modestly
increased risk, but high population attributable risk, because the risk allele may be
quite common in the population. These types of genes, such as those involved in
androgen metabolism as discussed below, may act alone or in combination with
other common variants of other genes to increase the risk of prostate cancer.
Ultimately common variants in several such genes in an androgen pathway could
give rise to a ‘polygenic’ etiology of cancer, thereby allowing the identification of
individuals with high or low risk polygenic profiles.
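The contrast between a rare, highly penetrant allele and a common allele of modest effect can be made concrete with Levin's formula for the population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of the population carrying the risk variant and RR is the associated relative risk. The sketch below is illustrative only; the carrier frequencies and relative risks are hypothetical values chosen for the comparison, not estimates from this dissertation.

```python
def population_attributable_fraction(carrier_prevalence, relative_risk):
    """Levin's formula: fraction of disease attributable to the risk variant."""
    excess = carrier_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs (assumptions, for illustration only).
# A common variant with a modest effect:
common_modest = population_attributable_fraction(0.40, 1.5)
# A rare variant with a strong effect:
rare_strong = population_attributable_fraction(0.001, 10.0)

print(f"common, modest RR: PAF = {common_modest:.3f}")
print(f"rare, strong RR:   PAF = {rare_strong:.4f}")
```

Under these assumed values the common variant accounts for a far larger share of disease in the population than the rare one, even though its relative risk is much smaller.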
Extraordinary success has been achieved using linkage studies to identify the
highly penetrant, major disease-causing genes. Linkage studies rely on the use of
families with many affected members with the underlying assumption that the
affected members will all carry the risk allele. For minor susceptibility alleles,
linkage studies are not the best study design choice. A better approach is that of
association studies which seek to directly test for differences in allele or genotype
frequency of a candidate locus (Risch & Merikangas, 1996). This approach requires,
however, the determination of candidate genes. This dissertation is based on the
strong biologic hypothesis that androgens are involved in the etiology of prostate
cancer, and genes involved in androgen metabolism in the prostate are excellent
candidates. I have restricted the search for candidate genes to those active in the
prostate for two reasons: (1) the lack of strong evidence that circulating hormones
are associated with prostate cancer risk, and (2) the higher probability that the
relevant genes are those that are actually functioning in the organ. The three genes
under study in this dissertation are the steroid 5-alpha reductase type II gene
(SRD5A2), the androgen receptor gene (AR), and the hydroxysteroid dehydrogenase
3-beta type I gene (HSD3B1) (Figure 1). The existing epidemiologic literature on the
three genes is described below.
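To make the idea of an association study concrete, the following is a minimal sketch of a test for a difference in allele frequency between cases and controls at a single biallelic locus. It uses a Pearson chi-square test on a 2x2 table of allele counts. The genotype counts are hypothetical, the function names are my own for illustration, and the sketch ignores matching, covariates, and the case-cohort design used in this dissertation.

```python
import math


def allele_counts(genotype_counts):
    """Convert genotype counts {'AA', 'Aa', 'aa'} to (major, minor) allele counts."""
    major = 2 * genotype_counts["AA"] + genotype_counts["Aa"]
    minor = 2 * genotype_counts["aa"] + genotype_counts["Aa"]
    return major, minor


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (assumption, for illustration only).
cases = {"AA": 120, "Aa": 60, "aa": 20}
controls = {"AA": 140, "Aa": 50, "aa": 10}

case_major, case_minor = allele_counts(cases)
control_major, control_minor = allele_counts(controls)
chi2, p_value = chi_square_2x2(case_minor, case_major, control_minor, control_major)

print(f"minor allele frequency: cases {case_minor / (case_major + case_minor):.3f}, "
      f"controls {control_minor / (control_major + control_minor):.3f}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```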
1. SRD5A2
Two steroid 5-alpha reductase enzymes, type I and type II, have been identified and
are encoded by SRD5A1 and SRD5A2, respectively. The homology between the
amino acid sequences of the two enzymes is approximately 50%. The type I enzyme is
primarily expressed in the skin, whereas the type II enzyme is expressed in the
prostate. The type II enzyme is responsible for the conversion of T to DHT in the
prostate. Certain mutations in SRD5A2 cause male pseudohermaphroditism, a
severe phenotype characterized by ambiguous genitalia at birth but
subsequent virilization at puberty. These individuals are generally reared as girls
until puberty when a male-gender identity is adopted.
Figure 1. The role of SRD5A2, the AR and HSD3B1 in androgen metabolism in the prostate.
[Pathway diagram: DHEA -> (HSD3B1) -> androstenedione; testosterone -> (SRD5A2) -> DHT;
testosterone and DHT -> AR -> transactivation of androgen response elements ->
prostate cell proliferation]
The SRD5A2 gene was sequenced in the laboratory of Dr. Juergen K.V.
Reichardt at the University of Southern California using single strand conformation
polymorphism (SSCP) followed by sequencing of any aberrant SSCP patterns
identified. They identified 13 variants, seven of which were missense mutations
(altering the encoded amino acid), and six of which were either silent or intronic.
One of the missense variants identified by Dr. Reichardt, the valine substitution to
leucine at codon 89 (V89L), has been studied by two groups. Investigators from the
Physician’s Health Study (Febbo et al., 1999) have shown a slight, non-significant,
protective effect of the leucine homozygous genotype (LL), whereas Lunn and
colleagues (Lunn, Bell, Mohler, & Taylor, 1999) reported an increased risk
associated with a leucine genotype.
The V89L variant and a second missense variant identified by Dr. Reichardt,
an alanine to threonine substitution at codon 49 (A49T), are studied in this
dissertation.
2. The AR Gene
The AR gene encodes the receptor for DHT and T. After the AR binds with its ligand, this
complex transactivates target genes resulting in prostate cell proliferation. This gene
is located on the X chromosome and is mutated in Kennedy’s spinal and bulbar
muscular atrophy. Exon 1 contains a glutamine repeat (CAGn) which has been the
focus of research efforts because an expansion of this repeat to more than double the
usual size causes Kennedy’s disease. Kennedy’s disease is characterized by
infertility and low virilization.
Because long repeat lengths are associated with low AR activity, Ross and
Coetzee (Coetzee & Ross, 1994) hypothesized that short repeat lengths may be
associated with prostate cancer. Extensive research has been conducted on this
hypothesis. Many (Giovannucci et al., 1997; Hakimi, Schoenberg, Rondinelli,
Piantadosi, & Barrack, 1997; Ingles et al., 1997; Irvine, Yu, Ross, & Coetzee, 1995;
Stanford et al., 1997), but not all (Bratt, Borg, Kristoffersson, Zhang, & Olsson,
1999; Edwards et al., 1999), studies have provided evidence of an association
between shorter repeat lengths and increased risk of prostate cancer. Irvine et al.
(Irvine et al., 1995) described a 25% increased risk associated with less than 22
repeats and this finding was confirmed by Stanford and colleagues (Stanford et al.,
1997) and investigators from the Physician’s Health Study (Giovannucci et al.,
1997). Two additional studies failed to find an association, however, and the
association remains unclear (Bratt et al., 1999; Edwards et al., 1999).
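For readers unfamiliar with how a repeat length is expressed, the following is a minimal sketch that counts the longest uninterrupted run of CAG codons in a DNA string and classifies it against the threshold of fewer than 22 repeats described by Irvine et al. The sequence is a made-up fragment, not the actual AR exon 1 sequence, and the function name is hypothetical; in the cited studies repeat lengths were determined in the laboratory, not by this kind of string search.

```python
import re


def longest_cag_run(sequence):
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Made-up fragment (assumption): 21 CAG repeats flanked by arbitrary bases.
example_sequence = "ATG" + "CAG" * 21 + "CAA" + "GGC"

repeat_count = longest_cag_run(example_sequence)
is_short_allele = repeat_count < 22  # threshold taken from the text above

print(f"CAG repeats: {repeat_count}, short allele (<22): {is_short_allele}")
```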
The association between single nucleotide polymorphisms (SNPs) in the AR
gene and risk of prostate cancer is studied in this dissertation. These SNPs were
identified from the National Center for Biotechnology Information database (dbSNP)
(http://www.ncbi.nlm.nih.gov/SNP/).
3. HSD3B1
HSD3B1 and HSD3B2 encode highly homologous enzymes; however, they are
expressed in different tissues. HSD3B2 is primarily expressed in the testes and the
adrenals whereas HSD3B1 is expressed in the prostate. The HSD3B1 gene plays a
pivotal role in the prostate by converting DHEA to A-dione. This dissertation
represents the first effort to explore the association between variation in HSD3B1
and risk of prostate cancer. The single SNP, F286L, studied in this dissertation was
identified through sequencing conducted at the Whitehead Institute/MIT Center for
Genome Research in Cambridge, Massachusetts. This variant and its flanking
sequence can now be found in the National Center for Biotechnology Information
dbSNP public database under the identification number rs6205
(http://www.ncbi.nlm.nih.gov/SNP/).
The choice of genes involved in androgen metabolism in the prostate is
biologically sound. If a role for any or a combination of these genes is found, it is
possible that a prostate cancer “risk profile” can be developed. The elucidation of
such a risk profile for prostate cancer will allow for more careful screening and
possibly less invasive treatment for this disease. In the ensuing chapter I discuss the
research findings related to the role of SRD5A2, AR, and HSD3B1 in risk of prostate
cancer.
II. DATA ANALYSIS
This chapter contains five manuscripts, each of which addresses the question
of whether variation in genes involved in intra-prostatic androgen metabolism is
associated with risk of prostate cancer. Each manuscript stands alone as a complete
paper with an introduction, methods, results, and discussion. The first two
manuscripts have already been published; the first was published in The Lancet in
1999 and the second in Cancer Epidemiology, Biomarkers and Prevention in 2002.
The third manuscript has been accepted by the British Journal of Cancer. The fourth
manuscript represents work recently completed and as a result it has not yet been
submitted for publication. The fifth manuscript represents preliminary data. This
project has been expanded significantly by our colleagues at the Whitehead
Institute/MIT Center for Genome Research.
A. Association of Missense Substitution in SRD5A2 Gene with Prostate
Cancer in African-American and Hispanic Men in Los Angeles, USA
1. Introduction
Prostate cancer is a very common disease in more-developed countries: more
than 39,200 men died of the disease in the USA in 1998 (Landis, Murray, Bolden, &
Wingo, 1998), and 50,122 died in the European Union in 1990 (Black, Bray, Ferlay,
& Parkin, 1997). Prostate cancer is androgen dependent (Henderson, Ross, Pike, &
Casagrande, 1982; Huggins & Hodges, 1941), and we have previously proposed that
variations in androgen metabolism may affect a man’s risk of this disease
(Henderson et al., 1982; Ross et al., 1992; Ross et al., 1998). We have provided
evidence that increased intra-prostatic androgen metabolism, particularly through the
enzyme steroid 5α-reductase, may have an important role in predisposition to
prostate cancer. This enzyme catalyses the conversion of testosterone (T) to
dihydrotestosterone (DHT), the most potent androgen in the prostate. Thus, genetic
variants encoded by the steroid 5α-reductase gene (SRD5A2) may have an effect on
predisposition to prostate cancer. We report our epidemiological and biochemical
findings on the relation between prostate cancer and a constitutional (germline)
missense substitution in SRD5A2, which results in the replacement of an alanine
residue at codon 49 with threonine (A49T) (Makridakis et al., 1997). This
substitution is associated with a significantly increased risk of prostate cancer
(particularly of an advanced nature), probably through increased metabolic activation
of T to DHT.
2. Methods
a. Epidemiology
This case-control study was part of the prospective Hawaii-Los Angeles
Multiethnic Cohort Study of Diet and Cancer, which has been described in detail
elsewhere (Kolonel et al., 2000). About 200,000 African-American, Japanese-
American, Hispanic, and white individuals between the ages of 45 and 75 years are
being followed up for incident cancer diagnoses, primarily through linkage with the
population-based Surveillance, Epidemiology and End Results cancer registries for
Hawaii and Los Angeles County, USA. Data on demography, lifestyle, diet, and
medical history were collected on all cohort members through a self-completed
questionnaire. This cohort is similar to the general population of Los Angeles and
Hawaii in terms of educational and marital status (Kolonel et al., 2000).
A nested case-control design was used. We attempted to collect blood
samples from all patients with incident prostate cancer (cases) and from a similar
number of randomly selected controls. No attempt was made to match on age or any
other demographic variable except race. The cases and controls were not
individually matched nor frequency matched. Seventy-seven percent of incident
cases and 73% of eligible controls consented to take part. The observed numbers of
prostate-cancer cases in the four main racial/ethnic groups in the cohort were in line
with the expected numbers (adjusted for age and ethnic group) for Los Angeles and
Hawaii.
The men who agreed to participate were asked to provide blood and urine
samples after giving informed consent. Blood components were separated and
stored in 0.5 mL volumes at -80°C (Kolonel et al., 2000). DNA was purified from
lymphocytes from peripheral blood samples for all cases and controls by a rapid
DNA preparation method. The blood samples were processed within 4 hours of
collection. Individuals included in this study had their blood samples taken between
June, 1994, and October, 1998.
Incident cases and controls who reported a history of prostate surgery were
excluded from this analysis. Individuals who were selected as controls, but were
subsequently found to have prostate cancer, were classified as controls in the
analysis. SRD5A2 genotype was dichotomized into AA or AT/TT, and age was
categorized as younger than 60 years, or 60 years and older.
Prostate cancers were classified according to disease severity by means of a
combination of the TNM (tumor, nodes, metastases) staging system and Gleason
grade (Andriole & Catalona, 1991; Montie, 1993; Montie, Pienta, & Pontes, 1996).
Cancers of TNM stage 1 and Gleason grade 6 or less were categorized as “clinically
non-significant or localized”; those of TNM stage 2 or greater (irrespective of grade)
or of TNM stage 1 and Gleason grade higher than 6 were classified as “clinically
significant or advanced.”
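The severity rule above can be written as a small function. This is only a sketch: the integer encodings of TNM stage and Gleason grade, and the function name, are assumptions made for illustration, since the text does not describe how these variables were stored.

```python
def classify_severity(tnm_stage: int, gleason: int) -> str:
    """Apply the severity rule described in the text.

    tnm_stage and gleason are hypothetical integer encodings; the
    source does not specify a data format.
    """
    if tnm_stage >= 2:
        # TNM stage 2 or greater, irrespective of grade
        return "advanced"
    if gleason > 6:
        # TNM stage 1 with Gleason grade higher than 6
        return "advanced"
    # TNM stage 1 with Gleason grade 6 or less
    return "localized"


if __name__ == "__main__":
    for stage, grade in [(1, 6), (1, 7), (2, 4)]:
        print(stage, grade, classify_severity(stage, grade))
```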
Two-sample t tests and chi-square tests were used to compare the
demographic characteristics of cases and controls (SAS version 6.12). Odds ratios
and 95% confidence intervals were estimated by use of exact distributions with a
mid-p correction as implemented by the program StatXact (version 3.0.2).
Population attributable risk was calculated by the method of Miettinen (Miettinen,
1974). All significance levels quoted are two-sided.
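For orientation, two textbook forms of the population attributable risk are sketched below: Levin's formula, which uses the prevalence of exposure in the source population (approximated here by controls), and the case-based form usually attributed to Miettinen, which uses the prevalence of exposure among cases. The function names are illustrative, and the inputs in the usage lines are taken loosely from figures quoted in this chapter. The sketch does not claim to reproduce the age-adjusted values reported in the tables.

```python
def par_levin(p_exposed_population: float, rr: float) -> float:
    """Levin's formula: p(RR - 1) / (1 + p(RR - 1))."""
    excess = p_exposed_population * (rr - 1.0)
    return excess / (1.0 + excess)


def par_case_based(p_exposed_cases: float, rr: float) -> float:
    """Case-based form: p_c (RR - 1) / RR."""
    return p_exposed_cases * (rr - 1.0) / rr


if __name__ == "__main__":
    # Illustrative inputs: 3.5% carriers in controls, relative risk 3.6
    print(f"Levin PAR: {par_levin(0.035, 3.6):.3f}")
    # Hypothetical inputs: 10% carriers among cases, relative risk 5.0
    print(f"Case-based PAR: {par_case_based(0.10, 5.0):.3f}")
```

The two forms agree when the relative risk and exposure prevalences are internally consistent; in practice they differ slightly when estimates are adjusted for covariates.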
b. Molecular Biology
Genomic DNA was isolated from the lymphocytes of 216 African-American
and 172 Hispanic men with prostate cancer, and from 261 African-American and 200
Hispanic controls.
We screened for the A49T mutation by PCR amplification of radiolabelled
exon 1 of the SRD5A2 gene (with the primers oNM16 GCAGCGGCCACCGGCG;
and oNM32 GTGGAAGTAATGTACGCAGAA), followed by single-stranded
conformational polymorphism analysis (Ausubel, Brent, & Kingston, 1995). All
samples were submitted in coded format for genotyping and included 5% masked
repeats.
c. Biochemistry
The A49T mutation was reconstructed in SRD5A2 cDNA (obtained from D
Russell, Dallas, TX, USA) by site-directed mutagenesis with custom
oligonucleotides (Life Technologies, Gaithersburg, MD, USA) and the QuikChange
kit (Stratagene, San Diego, CA, USA). A total of 3×10⁸ log-phase COS cells were
electroporated with no DNA (“mock”), or with 15 μg of normal or mutant SRD5A2
constructs (Wigley et al., 1994), along with 5 μg of a cotransfected β-galactosidase
control plasmid (pCMVβ). Cell extracts were prepared 48 hours after transfection
by means of sonication (Stoner, 1996). Total protein was quantified with a BioRad
(Hercules, CA, USA) assay, β-galactosidase activity (Stoner, 1996) was measured,
and SRD5A2 activity was assessed by measurement of testosterone-to-
dihydrotestosterone conversion. For this assay, normalized extracts were incubated
at 37°C with carbon-14-labelled testosterone (New England Nuclear, Boston, MA,
USA) and NADPH (Sigma, St. Louis, MO, USA). Reactions were stopped by the
addition of methylene chloride; dried and redissolved steroids in ethanol were
applied to K6 silica thin-layer chromatography plates (Whatman, Clifton, NJ, USA),
which were developed in methylene chloride and acetone in a ratio of 12.3/1 (by
volume) (Wigley et al., 1994). Dried plates were exposed to autoradiographic film
(Kodak Biomax, Rochester, NY, USA) or directly quantified on a Storm
phosphorimager (Molecular Dynamics, Mountain View, CA, USA). Data were
plotted and analyzed with Cricket Graph 1.3 (Cricket Software, Malvern, PA, USA).
Finasteride was obtained from Merck (Rahway, NJ), and steady-state protein
concentrations were measured by western blotting (Stoner, 1996) with polyclonal
antibody (a gift from Dr. David Russell) (Wigley et al., 1994) and an ECF
chemifluorescent detection kit (Amersham, Arlington Heights, IL, USA). All assays
were done in triplicate, and then averaged.
3. Results
The mean age of the controls was 63.6 years, compared with 68.9 years for
the cases. The cases and controls did not differ with respect to education level. The
racial/ethnic distribution of the study population is presented in table 4.
The A49T missense substitution was uncommon in healthy African-American
and Hispanic men in Los Angeles; we found an allele frequency of 1.0% (five of
522) in African-American controls and 2.3% (nine of 400) in Hispanic controls
(table 5).
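The allele frequencies quoted here follow directly from the genotype counts in table 5: each man contributes two alleles, a heterozygote (AT) carries one T allele and a homozygote (TT) carries two. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def t_allele_frequency(n_aa: int, n_at: int, n_tt: int) -> float:
    """Frequency of the T allele from genotype counts."""
    total_alleles = 2 * (n_aa + n_at + n_tt)  # two alleles per man
    t_alleles = n_at + 2 * n_tt               # AT carries one T, TT carries two
    return t_alleles / total_alleles


if __name__ == "__main__":
    # Genotype counts (AA, AT, TT) for controls as listed in table 5
    african_american_controls = (257, 3, 1)   # 5 T alleles among 522
    hispanic_controls = (193, 5, 2)           # 9 T alleles among 400
    print(f"{t_allele_frequency(*african_american_controls):.2%}")
    print(f"{t_allele_frequency(*hispanic_controls):.2%}")
```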
In African-American men with prostate cancer, the A49T allele frequency
was 4.0% (17 of 432); 2.1% (six of 276) in cases with clinically non-significant
(localized) disease, and 7% (11 of 156) in cases with clinically significant
(advanced) disease (table 5). Among Hispanic prostate cancer patients, the allele
frequency of the missense substitution was 4.1% (14 of 344); 3% (six of 202) in
cases with clinically non-significant disease, and 5.6% (eight of 142) in cases with
Table 4. Age and educational attainment of cases and controls.

                       African-American men            Hispanic men
                       Controls       Cases            Controls       Cases
                       (n=261)        (n=216)          (n=200)        (n=172)
Age (years)
  Mean (SE)*           64.5 (0.52)    67.6 (0.43)      63.9 (0.53)    68.4 (0.46)
  Number <60 years     82 (31.4%)     30 (13.9%)       60 (30.0%)     14 (8.1%)
  Number ≥60 years     179 (68.6%)    186 (86.1%)      140 (70.0%)    158 (91.9%)
Education (years)**
  0-10                 35 (13.6%)     39 (18.4%)       65 (33.3%)     61 (36.3%)
  11-12                64 (24.9%)     50 (23.6%)       44 (22.6%)     41 (24.4%)
  Vocational           15 (5.8%)      9 (4.3%)         20 (10.3%)     10 (6.0%)
  Some college         77 (30.0%)     55 (25.9%)       36 (18.5%)     31 (18.5%)
  College graduate     66 (25.7%)     59 (27.8%)       30 (15.4%)     25 (14.9%)

*Two-sample t-test p=0.0001 for difference between cases and controls
**Chi-square p=0.52 for African-American men and p=0.68 for Hispanic men
Table 5. Association of the A49T missense substitution in SRD5A2 gene with risk of prostate cancer.

              African-American men                              Hispanic men
              AA    AT/TT  Relative risk (95% CI)  P            AA    AT/TT  Relative risk (95% CI)  P
Controls      257   3/1    1.0                                  193   5/2    1.0                     ---
Cases
  All         203   9/4    3.28 (1.09-11.87)       0.03         160   10/2   2.50 (0.90-7.40)        0.08
  Localized   134   2/2    1.47 (0.33-6.63)        0.60         96    4/1    1.71 (0.46-6.12)        0.41
  Advanced    69    7/2    7.22 (2.17-27.91)       0.001        64    6/1    3.60 (1.09-12.27)       0.04
PAR%                       8.6%                                              8.3%

AA=normal (wildtype) homozygotes, AT=heterozygotes, and TT=mutant (A49T) homozygotes;
PAR%=population attributable risk (in %);
Data were age-adjusted.
clinically significant disease (table 5). Age-adjusted relative risk estimates for
possession of any A49T allele and the population attributable risks for clinically
significant disease are given in table 5.
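As a point of reference for the age-adjusted estimates, an unadjusted (crude) odds ratio can be computed from the genotype counts in table 5. The sketch below uses the large-sample logit (Woolf) interval rather than the exact mid-p method named in the Methods, and it makes no age adjustment, so its output is not expected to match the tabulated relative risks. The function name is illustrative.

```python
import math


def crude_odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Crude odds ratio with a Woolf (logit) confidence interval.

    Returns (odds_ratio, lower, upper). No covariate adjustment is made.
    """
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    se_log = math.sqrt(1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # African-American men, all cases vs controls, carriers (AT/TT) vs AA:
    # cases 13 carriers / 203 AA; controls 4 carriers / 257 AA (table 5)
    or_, lo, hi = crude_odds_ratio(13, 203, 4, 257)
    print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The gap between a crude estimate of this kind and the tabulated value reflects the age adjustment, since cases were older than controls.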
We reconstructed the A49T missense substitution in SRD5A2 cDNA, and
overexpressed it in a mammalian cos cell system to examine the biochemical
characteristics of the mutant enzyme. Kinetic data for the normal (wildtype) enzyme
were in line with earlier in vitro characterizations, and followed standard Michaelis-
Menten kinetics (figure 2, table 6) (Wigley et al., 1994). There were substantial
differences between the mutant and wildtype enzymes; the Vmax was about five-fold
higher for the mutant enzyme than that for the wildtype enzyme. For the mutant
enzyme, the Km for the substrate (testosterone) was slightly higher than that of the
wildtype enzyme, whereas the Km for the cofactor (NADPH) was about the same for
both enzymes. The optimum pH of the mutant enzyme was also similar to that of the
wildtype. The steady-state concentrations of the mutant and normal proteins were
identical. The Ki for the competitive inhibitor finasteride (Stoner, 1996) was more
than ten-fold higher for the mutant enzyme than for the wildtype (table 6).
Table 6. In-vitro kinetic properties of mutant and wild-type SRD5A2 enzyme.

Property                        Wildtype enzyme   A49T mutant enzyme
Vmax (nmol min⁻¹ mg⁻¹)          1.9               9.9
Km for testosterone (μmol/L)    0.9               2.7
Km for NADPH (μmol/L)           8                 7
Optimum pH                      6.0               6.0
Ki for finasteride (nmol/L)     23                270
Protein (%)*                    100               98

* Steady-state concentration
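To make the kinetic comparison concrete, the sketch below evaluates the standard Michaelis-Menten rate law, v = Vmax·S / (Km + S), with the constants from table 6, and the usual competitive-inhibition variant in which Km is scaled by (1 + I/Ki). It assumes the Km values are in μmol/L and the Ki values in nmol/L; the substrate and inhibitor concentrations in the usage lines are hypothetical and were not part of the reported experiments.

```python
def mm_rate(vmax, km, substrate, inhibitor=0.0, ki=None):
    """Michaelis-Menten rate with optional competitive inhibition.

    vmax in nmol/min/mg, km and substrate in umol/L,
    inhibitor and ki in nmol/L (assumed units).
    """
    km_apparent = km if ki is None else km * (1.0 + inhibitor / ki)
    return vmax * substrate / (km_apparent + substrate)


# Constants as listed in table 6
WILDTYPE = {"vmax": 1.9, "km": 0.9, "ki": 23.0}
MUTANT = {"vmax": 9.9, "km": 2.7, "ki": 270.0}

if __name__ == "__main__":
    s = 1.0    # hypothetical testosterone concentration, umol/L
    i = 50.0   # hypothetical finasteride concentration, nmol/L
    for name, e in (("wildtype", WILDTYPE), ("A49T", MUTANT)):
        free = mm_rate(e["vmax"], e["km"], s)
        inhibited = mm_rate(e["vmax"], e["km"], s, inhibitor=i, ki=e["ki"])
        print(f"{name}: v = {free:.2f}, with finasteride v = {inhibited:.2f}")
```

At substrate concentrations well above Km the ratio of the two rates approaches the ratio of the Vmax values, which is the roughly five-fold difference described in the text.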
Figure 2. In-vitro kinetics of normal (wild-type) and mutant (A49T) SRD5A2 enzymes.
4. Discussion
We have investigated the relation between three polymorphisms in the
SRD5A2 gene and risk of prostate cancer in the cohort described. The first, a (TA)n
dinucleotide repeat in the 3’ untranslated region (Reichardt et al., 1995), was
originally described by Davis and Russell (Davis & Russell, 1993). We found
differences in the distribution of (TA)n alleles in different racial/ethnic control groups
(Reichardt et al., 1995), and preliminary case-control study results suggested a weak
association between long repeats and risk of prostate cancer. However, we have not
confirmed the case-control association with risk of prostate cancer with a larger
sample of cases, and other investigators have also reported no association between
repeat length and risk of prostate cancer (Kantoff et al., 1997).
The second polymorphism we identified and studied was a missense
substitution, V89L, which substitutes leucine for valine at codon 89 (Makridakis et
al., 1997). This polymorphism was interesting because it is very common among
low-risk Chinese and Japanese patients in whom it is associated with low serum
concentrations of 3α-androstanediol glucuronide, a major metabolite of
dihydrotestosterone, and an in vivo index of steroid 5α-reductase activity
(Makridakis et al., 1997). However, to date, no consistent differences in the
frequency of the V89L SRD5A2 polymorphism between cases and controls have
been observed. Since the association between concentrations of 3α-androstanediol
glucuronide and the V89L missense substitution is modest, a large sample size may
be necessary to assess this polymorphism fully.
The third polymorphism in the SRD5A2 gene, the A49T missense
substitution (Makridakis et al., 1997), differs substantially from the two previously
reported polymorphisms. The mutant allele is rare in controls, but there is a
significant association with risk of advanced prostate cancer in African-American
and Hispanic men (table 5). The population attributable risk is about 8% in both
populations. We are in the process of extending our study to include white and
Japanese cases and controls through the multiethnic cohort. The A49T allele
frequency in controls from these two groups is similar to that of African-American
and Hispanic controls (Ross et al., 1992). The Japanese-American group is most
interesting since these men have a particularly low risk of the disease.
The case-control design within the prospective multiethnic cohort reported
here, combined with a high success rate in collecting blood samples, ensures
comparability of cases and controls (Kolonel et al., 2000). The participation rate in
this study was quite high for a study in which blood samples are collected. In other
case-control studies with African-American individuals, the participation rate is
reported to be only 45%. The higher participation rate in this study is
probably due to the individuals’ voluntary enrollment in a larger cohort study.
Although the participants and non-participants did differ with respect to educational
attainment, this factor was not a confounder in this analysis and is unlikely to bias
the results.
For the African-American group, the study had 80% power to detect a
relative risk of 5.0 (with a two-sided type I error of 5% and a proportion of AT/TT of
1.5% in controls). For the 78 advanced cases, the relative risk under the same
conditions was 7.2. For Hispanic men, these two relative risks were 3.6 and 4.8
(with a proportion of AT/TT of 3.5%). This study, therefore, only had sufficient
power to detect quite high relative risks.
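The order of magnitude of these power figures can be checked with a normal-approximation comparison of two proportions. The sketch below is not the calculation used for the figures above, whose method is not stated; it treats the relative risk as an odds ratio, derives the expected carrier proportion in cases from the proportion in controls, and applies a two-sided test. Function names are illustrative.

```python
import math
from statistics import NormalDist


def carrier_fraction_in_cases(p_controls: float, odds_ratio: float) -> float:
    """Carrier proportion among cases implied by an odds ratio."""
    return odds_ratio * p_controls / (1.0 + p_controls * (odds_ratio - 1.0))


def power_two_proportions(p_controls, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing carrier proportions."""
    nd = NormalDist()
    p_cases = carrier_fraction_in_cases(p_controls, odds_ratio)
    pooled = (n_cases * p_cases + n_controls * p_controls) / (n_cases + n_controls)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = math.sqrt(p_cases * (1 - p_cases) / n_cases
                       + p_controls * (1 - p_controls) / n_controls)
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf((abs(p_cases - p_controls) - z_alpha * se_null) / se_alt)


if __name__ == "__main__":
    # African-American group: 216 cases, 261 controls, 1.5% carriers in controls
    print(f"power at OR 5.0: {power_two_proportions(0.015, 5.0, 216, 261):.2f}")
    print(f"power at OR 2.0: {power_two_proportions(0.015, 2.0, 216, 261):.2f}")
```

With a carrier proportion this low, power falls off quickly as the odds ratio decreases, which is the limitation noted in the text.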
A direct association seems to be the most reasonable explanation for the
relation between the A49T mutation and risk of prostate cancer in African-American
and Hispanic men. We note that the chromosomal location of the human SRD5A2
gene (chromosome band 2p23) does not coincide with any of the three proposed loci
for hereditary prostate-cancer susceptibility investigated by others by means of
linkage analyses in high-risk families (Berthon et al., 1998; Carter, Beaty, Steinberg,
Childs, & Walsh, 1992; McIndoe et al., 1997; Smith et al., 1996; Xu et al., 1998).
This discrepancy is not surprising, however, since we are investigating sporadic
prostate cancer (Carter et al., 1992), the most common form of the disease, whereas
others have focused on the rarer familial form of prostate cancer (Berthon et al.,
1998; McIndoe et al., 1997; Smith et al., 1996; Xu et al., 1998).
We found that the A49T missense substitution in the SRD5A2 gene results in
increased enzymatic activity in vitro (figure 2, table 6). This effect seems to be an
inherent gain-of-function encoded by the amino acid substitution, since the steady-
state concentrations of the normal and mutant enzymes were identical (table 6). We
are collecting data on in vivo concentrations of 3α-androstanediol glucuronide in
normal controls with the A49T mutation to check on this further, but this work is
progressing slowly owing to the rarity of the allele.
The identification of genetic variants in genes that control androgen
biosynthesis or metabolism (Henderson et al., 1982; Ross et al., 1998), which can
also affect risk of prostate cancer, has important implications for understanding the
biology of prostate cancer, for identification of at-risk men before symptoms arise,
and for development of chemopreventive strategies. The finding of a much lower in
vitro effect of finasteride with the A49T allele suggests that men with an A49T
genotype may be unaffected by this drug. If confirmed, this finding needs to be
taken into account when this drug is prescribed for the treatment of benign prostatic
hyperplasia or for prevention of prostate cancer (Ausubel et al., 1995).
B. SRD5A2 V89L Substitution is Not Associated with Risk of Prostate
Cancer in a Multi-Ethnic Population Study
1. Introduction
While the genetic causes of prostate cancer are poorly understood, variation
in androgen biosynthesis and metabolism genes has been hypothesized to alter
prostate cancer risk (Ross et al., 1998). Steroid 5-alpha reductase type II (SRD5A2)
which encodes the enzyme responsible for converting testosterone to
dihydrotestosterone in the prostate has been studied as a candidate gene (Febbo et
al., 1999; Lunn et al., 1999; Nam et al., 2001). A missense single nucleotide
polymorphism (SNP) at codon 89 resulting in a valine to leucine change (V89L) was
tested in relation to prostate cancer risk in three previous case-control studies. A
large nested case-control study within the Physicians’ Health Study found a small
protective effect of the LL genotype that was not statistically significant (Febbo et
al., 1999). Similarly, Lunn and co-workers found a 10% reduction in risk associated
with the LL genotype, but again this result did not achieve statistical significance
(Lunn et al., 1999). However, a recent case-control study suggested a much stronger
effect of the V89L SNP; this study found a 64% decrease in risk associated with the
LL compared to the VV genotype (Nam et al., 2001).
We report here the results of a large multi-ethnic case-control study designed
to test the association between the V89L variant and prostate cancer risk.
2. Methods
We completed a case-control study nested in the Hawaii-Los Angeles
Multiethnic Cohort (MEC), including 921 incident cases and 1295 male controls
from the four major racial/ethnic groups enrolled in the cohort (African-Americans,
Japanese-Americans, Latinos, and Whites). Details of the MEC have been published
previously (Kolonel et al., 2000). Incident case ascertainment was completed by
computer linkage of the cohort with the Surveillance, Epidemiology and End Results
(SEER) cancer registries in Hawaii and Los Angeles, as well as with the California
State Cancer Registry. Both incident prostate cancer cases and a random sample of
male controls in the MEC were contacted by phone and asked to provide a blood
specimen. The overall participation rate for blood collection was 72% for cases and
69% for controls.
The men who agreed to participate in the blood collection provided written
informed consent following study approval by both the University of Hawaii and the
University of Southern California Institutional Review Boards.
DNA was purified from lymphocytes of peripheral blood samples for all
cases and controls using either a rapid DNA preparation or the Gentra PureGene kit.
Genotyping was carried out as described previously (Makridakis et al., 1997). All
samples were submitted in coded format for genotyping and included 5% masked
repeats. PSA levels were determined for all controls in the study.
Logistic regression was used to model the association between risk of
prostate cancer and V89L genotype. No mode of inheritance was assumed. All
analyses were adjusted for age at entry into the MEC and for ethnicity in any
analysis that combined the four ethnic groups.
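The mechanics of the logistic model can be illustrated with a single binary predictor fitted by Newton-Raphson on grouped counts. This is a deliberately reduced sketch: it omits the age and ethnicity adjustments described above, so the exponentiated coefficient equals the crude odds ratio rather than the adjusted estimates reported in the Results. The counts in the usage lines are the LL and VV genotype counts for the White group in table 7, and the function name is illustrative.

```python
import math


def fit_logistic_binary(case_exp, ctrl_exp, case_unexp, ctrl_unexp, iterations=25):
    """Fit logit P(case) = b0 + b1*x for binary x by Newton-Raphson.

    Returns (b0, b1); exp(b1) is the unadjusted odds ratio.
    """
    # Each group: (x, number of cases, group size)
    groups = [
        (0.0, case_unexp, case_unexp + ctrl_unexp),
        (1.0, case_exp, case_exp + ctrl_exp),
    ]
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix X'WX
        for x, cases, n in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = cases - n * p
            weight = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += weight
            h01 += weight * x
            h11 += weight * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


if __name__ == "__main__":
    # White group, LL vs VV (table 7): LL 15 cases / 23 controls, VV 77 / 110
    b0, b1 = fit_logistic_binary(15, 23, 77, 110)
    print(f"unadjusted OR for LL vs VV: {math.exp(b1):.2f}")
```

Adding age as a covariate requires individual-level data, which is why the adjusted odds ratios cannot be recovered from the table alone.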
3. Results
We found no significant association between the V89L variant and prostate
cancer risk in any individual ethnic group, nor in all groups combined (table 7). We
found a small, non-significant protective association between the LL genotype and
risk of prostate cancer in the Latino and White groups; however, risk was slightly
increased among African-American and Japanese men (table 7), indicating no strong
or consistent pattern across racial/ethnic groups. The results were unchanged when
restricted to cases with advanced disease.
4. Conclusions
The results presented here agree with two of the three previously published
studies suggesting no substantial association between the V89L variant and risk of
prostate cancer. The findings from the third study may be in disagreement due to
Table 7. Odds ratios and 95% confidence limits for the association between V89L genotype and risk of prostate cancer by racial/ethnic group.

Group               Genotype   Ca/Co      OR (95% CI)
African-American*   VV         178/231    1.0
                    VL         134/159    1.05 (0.78-1.43)
                    LL         25/21      1.52 (0.82-2.83)
Japanese*           VV         56/85      1.0
                    VL         71/156     0.68 (0.42-1.10)
                    LL         35/43      1.27 (0.69-2.36)
Latino*             VV         112/151    1.0
                    VL         111/156    1.03 (0.71-1.50)
                    LL         36/53      0.91 (0.55-1.53)
White*              VV         77/110     1.0
                    VL         71/107     1.05 (0.66-1.66)
                    LL         15/23      0.91 (0.42-1.98)
All Groups**        VV         423/577    1.0
                    VL         387/578    0.95 (0.79-1.15)
                    LL         111/139    1.15 (0.85-1.54)

Ca/Co=number of cases/number of controls
*ORs adjusted for age at entry into the cohort
**ORs adjusted for age at entry into the cohort and racial/ethnic group
chance. The participants in our study were not screened for prostate cancer as part of
follow-up and therefore it is possible that some controls could have undetected
disease. The low PSA levels in our controls indicate that misclassification of
participants due to undetected disease is not likely to explain our lack of significant
findings, and our results are unchanged when excluding controls with PSA values of
4 or higher.
Our study had 80% power at the a=0.05 level to detect an odds ratio of 0.64,
the risk observed in the previously published “positive” study. A formal
meta-analysis of the four studies (including only the White group from our study) revealed
a 20% decreased risk for carriers of the LL genotype compared to the VV genotype
(95% CI 0.59-1.09), but this result was not statistically significant. While we cannot
rule out a small effect of this missense substitution on prostate cancer risk, the
sample size required to detect an odds ratio of 0.80 is quite large (n=8,796). The
usefulness of conducting a study powered to detect such a small effect is unclear as
this variant would not contribute significantly to the public health burden of prostate
cancer in terms of either screening or prevention. Efforts should be focused
elsewhere to further our understanding of the role of genetic variation in risk of
prostate cancer.
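The scale of study needed for a small effect can be illustrated with the usual two-proportion sample-size formula. The sketch assumes equal numbers of cases and controls and a hypothetical LL genotype frequency of 10% in controls; neither the method nor the inputs behind the figure quoted above are stated in the text, so the output is indicative only and is not expected to match it. The function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(p_controls, odds_ratio, alpha=0.05, power=0.80):
    """Cases (and, equally, controls) needed to detect a given odds ratio.

    Normal approximation for comparing two proportions, two-sided test.
    """
    nd = NormalDist()
    p_cases = odds_ratio * p_controls / (1.0 + p_controls * (odds_ratio - 1.0))
    p_bar = (p_controls + p_cases) / 2.0
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_beta = nd.inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_controls * (1 - p_controls) + p_cases * (1 - p_cases))
    ) ** 2
    return math.ceil(numerator / (p_cases - p_controls) ** 2)


if __name__ == "__main__":
    p_ll = 0.10  # hypothetical LL genotype frequency in controls
    for target in (0.64, 0.80):
        n = n_per_group(p_ll, target)
        print(f"OR {target}: about {n} cases and {n} controls")
```

Because the required sample size grows with the inverse square of the difference in proportions, moving the target odds ratio from 0.64 to 0.80 multiplies the required study size several-fold.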
C. Association Between the Steroid 5-Alpha Reductase Type II Gene A49T
Missense Variant and Prostate Cancer Risk in Whites and Japanese-
Americans
1. Introduction
It is well accepted that the prostate is androgen dependent and it has been
hypothesized that variation in genes involved in androgen biosynthesis and
metabolism might be risk factors for prostate cancer. A key gene in this pathway
is the steroid 5-alpha reductase type II (SRD5A2) gene which encodes the
enzyme that converts testosterone to its more potent form, dihydrotestosterone, in
the prostate. We previously reported that a single missense variant in exon 1,
A49T (which replaces alanine at codon 49 with threonine), was associated with
increased risk of prostate cancer in African-Americans and Latinos in the
Hawaii-Los Angeles Multiethnic Cohort (MEC) study (Makridakis et al., 1999).
We also showed five-times higher enzymatic activity in the presence of this
missense variant in in vitro studies (Makridakis, di Salle, & Reichardt, 2000;
Makridakis et al., 1999). In a study of white prostate cancer cases in
Pennsylvania, the mutant T allele was associated with risk of extracapsular
extension of disease (Jaffe et al., 2000). However, in a recent case-control study
of Finnish whites no association was found between the T allele and risk of
prostate cancer (Mononen et al., 2001). We report here on the association
between the A49T variant and prostate cancer risk in the white and Japanese-
American participants of the MEC study.
2. Methods
We completed a case-control study nested within the Hawaii-Los Angeles
MEC study, including 553 incident cases and 724 male controls from the White and
Japanese-American subjects enrolled in the cohort. Details of the MEC study have
been published previously (Kolonel et al., 2000). Briefly, over 200,000 men and
women between the ages of 45 and 75 and residing in Hawaii and California
completed a questionnaire which included data on demographic, lifestyle, and health
characteristics as well as a comprehensive dietary survey. This cohort is broadly
similar to the general populations of those racial-ethnic groups in Los Angeles and
Hawaii, in terms of education and marital status (Kolonel et al., 2000).
Participants in the MEC are followed for incident cancers. Incident case
ascertainment is completed by computer linkage of the cohort with the Surveillance,
Epidemiology and End Results (SEER) cancer registries in Hawaii and Los Angeles,
as well as with the California Cancer Registry. Both incident prostate cancer cases
and a random sample of male controls in the MEC were contacted by telephone and
asked to provide a blood specimen. The overall participation rate for blood
collection was 72% for cases and 69% for controls.
The men who agreed to participate in the blood collection provided written
informed consent following study approval by both the University of Hawaii and the
University of Southern California Institutional Review Boards.
Blood components were separated and stored in 0.5 mL volumes at -80°C.
The blood samples were processed within 4 hours of collection. Individuals included
in this study had their blood samples taken between June 1994 and October 2000.
DNA was purified from lymphocytes of peripheral blood samples for all cases and
controls using either a rapid DNA preparation or the Gentra PureGene kit.
Genotyping by single strand conformation polymorphism (SSCP) was carried out as
described previously (Makridakis et al., 1999). All samples were provided to the
laboratory for genotyping under a unique identifier, with laboratory staff blinded to
case or control status and to ethnicity.
Prostate tumors were classified according to disease severity using a
combination of stage and Gleason grade. Local disease was defined as tumors
localized to the prostate with a Gleason grade of less than 8. Advanced disease was
defined as tumors that had regional extension or distant metastases, regardless of
grade, and tumors that were localized to the prostate, but had a Gleason grade of 8 or
higher.
Logistic regression was used to model the association between risk of
prostate cancer and A49T genotype. Individuals with either an AT or TT genotype
were grouped together due to the rarity of the TT genotype. All analyses were
adjusted for age at entry into the cohort.
3. Results
The demographic characteristics of the study participants are shown in table
8. As expected from the method of choosing controls, cases were older than controls
Table 8. Descriptive characteristics of the study population by racial/ethnic group.

                              Whites(a)                     Japanese(a)
Characteristic                Cases         Controls        Cases         Controls
                              (n=277)       (n=362)         (n=276)       (n=362)
Age, mean (sd)                65.0 (6.9)    58.7 (8.4)      66.5 (6.3)    60.0 (9.0)
Education, # (%)
  0-10 years                  19 (6.9)      14 (3.9)        29 (10.6)     17 (4.7)
  11-12 years                 42 (15.2)     48 (13.3)       87 (31.6)     102 (28.3)
  Vocational                  13 (4.7)      11 (3.1)        32 (11.6)     39 (10.8)
  Some college                81 (29.4)     89 (24.7)       45 (16.4)     75 (20.8)
  College graduate            121 (43.8)    199 (55.1)      82 (29.8)     127 (35.3)
Marital status, # (%)
  Married                     203 (73.8)    285 (79.2)      241 (88.0)    313 (86.5)
  Not married                 53 (19.3)     49 (13.6)       19 (6.9)      31 (8.6)
  Never married               19 (6.9)      26 (7.2)        14 (5.1)      18 (5.1)
Vasectomy, # (%)
  No                          213 (76.9)    266 (73.5)      242 (87.7)    318 (87.9)
  Yes                         64 (23.1)     96 (26.5)       34 (12.3)     44 (12.2)
Smoking, # (%)
  Current                     29 (10.6)     49 (13.5)       31 (11.3)     33 (9.2)
  Past                        160 (58.4)    182 (50.3)      175 (63.6)    205 (56.9)
  Never                       85 (31.0)     131 (36.2)      69 (25.1)     122 (33.9)
Family history of PrCa, # (%)
  1+ 1st-degree relative      36 (13.7)     36 (10.2)       28 (10.9)     13 (3.8)
  No 1st-degree relative      226 (86.3)    318 (89.8)      230 (89.2)    326 (96.2)

(a) Numbers in tables do not always sum to total numbers in the study because of missing data.
in both ethnic groups. Controls were more educated than cases whereas cases were
more likely to have a first degree relative with prostate cancer.
No carriers of the T allele were identified in the 275 Japanese-American
cases genotyped whereas 0.8% of the Japanese-American controls were carriers of
the T allele. The frequency of the T allele in the white controls was 3.9% (28/724),
compared to 3.4% (19/554) in white cases.
There was no association between AT or TT genotype and risk of prostate
cancer in whites when considering either all cases combined (OR=1.04, 95% CI
0.49-2.19) or limiting analyses to advanced cases only (OR=1.00, 95% CI 0.33-3.06)
(table 9).
4. Discussion
Unlike our previously reported results in African-Americans and Latinos
from the MEC (table 9) (Makridakis et al., 1999), we found no association between
the AT or TT genotype and risk of prostate cancer in either white or Japanese-
American participants in the MEC. These findings agree with the case-control study
results in Finnish whites as reported by Mononen and colleagues (Mononen et al.,
2001).
The reason for the inconsistent association between the A49T variant of SRD5A2 and
prostate cancer risk across racial-ethnic groups is not readily apparent. Possible
explanations include variable linkage disequilibrium with the true causal
polymorphism across ethnic groups and allele-allele, gene-gene or gene-environment
Table 9. Association between A49T genotype and risk of prostate cancer by racial/ethnic group.
Race/Ethnicity  Genotype  Controls  All Cases  OR* (95% CI)      Adv Cases  OR* (95% CI)
Whites          AA        341       264        1.0               84         1.0
                AT/TT     21        13         1.04 (0.49-2.19)  4          1.00 (0.33-3.06)
Japanese        AA        357       276                          93
                AT/TT     5         0                            0
African-Am.     AA        626       434        1.0               114        1.0
                AT/TT     13        14         1.45 (0.67-3.12)  9          3.77 (1.57-9.04)
Latino          AA        448       340        1.0               113        1.0
                AT/TT     10        23         4.03 (1.82-8.93)  10         4.93 (1.92-12.66)
*Adjusted for age at entry into the Cohort
interactions that vary across racial-ethnic groups. Variable linkage disequilibrium is
an unlikely reason given our previous in vitro data suggesting that A49T is a
functional polymorphism with five-fold increased activity (Makridakis et al., 2000;
Makridakis et al., 1999). We believe that gene-gene or gene-environment
interactions are the most likely explanation for the disparate results across
racial/ethnic group. However, given the low frequency of the A49T variant and the
paucity of additional gene or environmental risk factors thus far identified for
prostate cancer this hypothesis must remain untested for now. We are planning a
comprehensive haplotype analysis of SRD5A2 in order to determine if allele-allele
interactions can explain these inconsistencies.
This study underscores the complexity of studying and interpreting genetic
association study results. In the same multi-ethnic population-based study we find
conflicting results across racial-ethnic groups even in the face of strong supportive
functional data. These observations argue strongly for the need for confirming
status, whether positive or negative with strong underlying biologic rationale, as well
as for more detailed understanding of the interplay of genetic variants with each
other and with the environment in defining disease risk.
D. Association Between Hydroxysteroid Dehydrogenase 3-Beta Type I
Gene F286L Missense Variant and Risk of Prostate Cancer in a
Multiethnic Population Study
1. Introduction
The prostate is an androgen-dependent organ and it has long been
hypothesized that androgens are etiologically involved in the development of
prostate cancer. Testosterone (T) and dihydrotestosterone (DHT), through their
interaction with the androgen receptor (AR), are responsible for cell proliferation in
the prostate. Efforts to link circulating androgen levels with prostate cancer risk
have met with limited success. Only 60% of T in the prostate is derived from the
testes (Labrie et al., 1985), although nearly 100% of circulating T comes from this
source, suggesting that circulating levels might not adequately measure tissue
exposure. The remaining T in the prostate is derived from precursor hormones
(dehydroepiandrosterone [DHEA] and androstenedione [A-dione]) biosynthesized in
the adrenals and then converted to T and then DHT in the prostate (Labrie et al.,
1985). We have been correlating variation in the genes controlling intra-prostatic
androgen ‘exposure’ with risk of the disease.
The AR is one such gene and an association with the length of a
polyglutamine (CAG) tract in exon one and prostate cancer risk has been described
(Giovannucci et al., 1997; Hakimi et al., 1997; Hsing et al., 2000; Ingles et al., 1997;
Irvine et al., 1995; Stanford et al., 1997). Also, variation in the steroid 5-alpha
reductase type II (SRD5A2) gene, which is responsible for reducing T to the more
potent androgen DHT in the prostate, has been associated with risk of disease
A sub-cohort of the MEC was identified for studies of genetic susceptibility
to cancer and efforts have been made to collect blood samples on all sub-cohort
members. The sub-cohort consists of random samples of the entire MEC plus all
African-American men aged 65 or older. All incident prostate cancer cases are also
contacted and a blood sample requested. Participation rates for the blood collection
are 72% for cases and 69% for sub-cohort members. A total of 1,422 sub-cohort
members and 880 incident cases are included in this analysis. Sub-cohort members
and all cases diagnosed with prostate cancer prior to their entry into the MEC were
excluded from the analysis. Initially, for each of the four major racial/ethnic groups
(African-American, Japanese-American, Latino, and White) all ‘advanced’ prostate
cancer cases, a random sample of 50 ‘local’ prostate cancer cases, and an equal
number of sub-cohort members were selected for this study. The variant under
study was nearly monomorphic in three of the groups (Japanese-Americans, Latinos,
and Whites) and as a result no more genotyping was done in these groups. Because
the variant was highly polymorphic in African-Americans, we selected an extended
sample of sub-cohort members and cases for further investigation.
a. HSD3B1 Variant
HSD3B1 is located on chromosome 1p13.1, spans 7.8 kb, and encodes a 373
amino acid protein. The gene consists of four exons and the translation start site is
located in exon 2. The missense single nucleotide polymorphism studied, F286L, is
located in exon 4 and was identified through a separate sequencing project conducted
by the Whitehead Institute/MIT Center for Genome Research (Cargill et al., 1999).
This variant and its flanking sequence can now be found in the National Center for
Biotechnology Information dbSNP public database under the identification number
rs6205 (http://www.ncbi.nlm.nih.gov/SNP/).
b. Genotyping
DNA was purified from lymphocytes of peripheral blood samples for all
cases and sub-cohort members using the Gentra PureGene kit. Genotyping was
performed using two approaches: (1) single-base extension with fluorescence
polarization (SBE-FP) and (2) Sequenom mass spectrometry.
The SBE-FP protocol consists of touchdown PCR, followed by a shrimp
alkaline phosphatase/exonuclease (SAP/Exo) clean-up, a single-base extension
reaction, and the reading of the plate on a fluorometer to determine the genotype.
The 10 µl PCR mix consisted of 5 µl (~10 ng) of genomic DNA, 3.15 µl of water,
1 µl of 10x buffer, 0.6 µl of 25 mM MgCl2, 0.025 µl of 10 mM dNTPs, 0.1 µl of Taq
polymerase, and 0.125 µl of 10 µM primers. The amplification was performed at
92°C for 10 minutes, followed by twelve cycles at 92°C for 15 seconds, 55°C for 40
seconds and 72°C for 30 seconds, followed by 35 cycles at 92°C for 15 seconds,
49°C for 45 seconds and 72°C for 30 seconds, and a final 10 minute cycle at 72°C.
The 5 µl SAP/Exo clean-up mix was added to the 10 µl PCR product in order to
eliminate the remaining dNTPs and primers. The SAP/Exo mix consisted of 2.5 µl
of water, 1.56 µl of 10x buffer, 1.04 µl of SAP, and 0.104 µl of Exo. The
PCR/SAP/Exo mix was incubated for 45 minutes at 37°C and then 15 minutes at
96°C. The 5 µl SBE reaction consists of 2.9 µl of water, 2.0 µl 10x buffer, 0.016 µl
of thermosequenase, 0.01 each of 100 µM ROX and TAMRA ddNTP, and 0.08 of
100 µM SBE probe which is added directly to the PCR product after the SAP/Exo
protocol has been completed. The SBE protocol includes incubation for two minutes
at 92°C, followed by 50 cycles at 94°C for 10 seconds and at 49°C for 30 seconds.
The SBE reactions were read on an Analyst Fluorescence Plate-reader (LJL
Biosystems, Sunnyvale, CA) to determine which base was incorporated.
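
As a quick sanity check on a reaction recipe such as the one above, the component volumes can be summed and compared with the nominal reaction volume. The sketch below does this for the 10 µl SBE-FP PCR mix; the dictionary keys, the helper names, and the 0.01 µl tolerance are illustrative assumptions rather than part of the published protocol.

```python
import math

# Component volumes in microlitres, as listed in the text for the 10 µl PCR mix.
sbe_fp_pcr_mix = {
    "genomic DNA": 5.0,
    "water": 3.15,
    "10x buffer": 1.0,
    "25 mM MgCl2": 0.6,
    "10 mM dNTPs": 0.025,
    "Taq polymerase": 0.1,
    "10 µM primers": 0.125,
}


def total_volume(mix: dict) -> float:
    """Sum the component volumes (µl) using an accurate floating-point sum."""
    return math.fsum(mix.values())


def matches_nominal(mix: dict, nominal_ul: float, tol_ul: float = 0.01) -> bool:
    """True if the components add up to the nominal volume within tol_ul."""
    return abs(total_volume(mix) - nominal_ul) <= tol_ul


print(f"total = {total_volume(sbe_fp_pcr_mix):.3f} µl")
```

The same check can be applied to the other mixes described in this section by swapping in their component lists.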
The Sequenom protocol consists of a PCR step, followed by SAP clean-up,
mass extension, and a final resin clean-up step. The 6 µl PCR mix consisted of 2 µl
(~5 ng) of genomic DNA, 3.11 µl of water, 0.5 µl of 10x buffer, 0.2 µl of 25 mM
MgCl2, 0.1 µl of 10 mM dNTPs, 0.04 µl of Taq polymerase, and 0.05 µl of 5 µM
primers. The amplification was performed for 15 minutes at 92°C, followed by 45
cycles of 20 seconds at 94°C, 30 seconds at 56°C and one minute at 72°C, and a final
three minute cycle at 72°C. The 2 µl SAP mix consisted of 1.7 µl of 5x TS buffer and
0.3 µl of SAP. The PCR/SAP mix was incubated for 20 minutes at 34°C and then 5
minutes at 85°C. The 2 µl hME reaction consists of 1.24 µl of water, 0.018 µl of
thermosequenase, 0.2 µl ddNTP, and 0.54 µl of 10 µM probe which is added directly
to the PCR product after the SAP protocol has been completed. The hME protocol
includes incubation for two minutes at 94°C, followed by 55 cycles of 5 seconds at
94°C, 5 seconds at 52°C, and 5 seconds at 72°C. As a final step 16 µl of a
resin-water mix is added.
c. Quality Control
All samples were genotyped twice. The concordance between the two
genotyping runs was >99%. Replicate samples were also included within a
genotyping run and the concordance was also >99%. The discordant samples were
removed from the analysis.
d. Severity of Disease
Cases were categorized into either ‘local’ or ‘advanced’ disease. ‘Local’ was
local disease with a well or moderately differentiated grade and ‘advanced’ was local
disease with a poorly differentiated grade or regional or distant disease.
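
The two-level severity definition above amounts to a small decision rule on stage and grade. The following is a minimal sketch of that rule; the string labels (`"local"`, `"regional"`, `"distant"`, `"well"`, `"moderate"`, `"poor"`) and the function name are hypothetical encodings chosen for illustration, not the coding scheme of the registry data.

```python
def classify_severity(stage: str, grade: str) -> str:
    """Map tumour stage and grade to the 'local' / 'advanced' categories.

    'advanced' = regional or distant disease, or local disease that is
    poorly differentiated; 'local' = local disease that is well or
    moderately differentiated.
    """
    stage = stage.strip().lower()
    grade = grade.strip().lower()
    if stage in {"regional", "distant"}:
        return "advanced"
    if stage == "local":
        if grade == "poor":
            return "advanced"
        if grade in {"well", "moderate"}:
            return "local"
    # Anything else (e.g. missing grade) is left for manual review.
    raise ValueError(f"cannot classify stage={stage!r}, grade={grade!r}")


print(classify_severity("local", "moderate"))
print(classify_severity("local", "poor"))
```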
e. Statistical Analysis
Hardy-Weinberg equilibrium (HWE) was verified among sub-cohort
members within each racial/ethnic group using standard methods.
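
The text does not specify which "standard methods" were used. One common choice is a one-degree-of-freedom chi-square goodness-of-fit test comparing observed genotype counts with those expected under HWE. The sketch below assumes that test (the function name is illustrative) and applies it to the African-American sub-cohort genotype counts reported in table 10.

```python
import math


def hwe_chi_square(n_ff: int, n_fl: int, n_ll: int):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic locus.

    Returns (frequency of the L allele, chi-square statistic, p-value).
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_ff + n_fl + n_ll
    q = (2 * n_ll + n_fl) / (2 * n)          # L allele frequency
    p = 1.0 - q                              # F allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ff, n_fl, n_ll)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value


# African-American sub-cohort counts from table 10: FF=389, FL=436, LL=148.
q, chi2, p_value = hwe_chi_square(389, 436, 148)
print(f"L allele = {100 * q:.1f}%, chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

With these counts the sketch gives an L allele frequency and p-value that round to the 37.6% and 0.16 shown in table 10, which suggests, but does not prove, that a test of this kind was used.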
The association between F286L and risk of prostate cancer was analyzed
using a standard case-cohort approach. Sub-cohort members who developed prostate
cancer during the study period contributed control time up to the point they
developed disease. Because sub-cohort members could also be cases, the numbers of
sub-cohort members and cases in table 10 do not sum to the total. The time-scale
used in the case-cohort analysis was age. For sub-cohort members, although samples
were collected over time after recruitment, entry into the sub-cohort was taken as the
age at the time the original MEC questionnaire was completed, and not age at blood
draw. This approach was adopted because germline genotype does not change over
time and the true entry into the sub-cohort was the date the questionnaire was
Table 10. Genotype and allele frequencies for F286L by racial/ethnic group and sub-cohort membership.
F286L genotype African-American Japanese-American Latino White
n (%) n(%) n (%) n (%)
Sub-Cohort Members
FF 389(40.0) 116(97.5) 133 (88.1) 102 (96.2)
FL 436(44.8) 3 (2.5) 17(11.3) 4 (3.8)
LL 148(15.2) 0 1 (0.6) 0
L% 37.6 1.3 6.3 1.9
HWE p-value 0.16 0.89 0.58 0.84
Cases
FF 189(42.1) 123 (96.9) 126 (85.1) 104(96.3)
FL 187(41.6) 4(3.1) 22(14.9) 3 (2.8)
LL 73(16.3) 0 0 1 (0.9)
L% 37.1 1.6 7.4 2.3
completed. This approach assumes that genotype is not related to death or refusal to
provide a blood sample. We attempted to determine if genotype is related to death
by exploring whether genotype frequencies change over time and we did not observe
any difference in genotype frequencies over calendar time. Although the potential
for bias cannot be excluded, we do not believe this is a significant source of bias.
Exit from the study was age at December 31, 1999, age at death, or age at prostate
cancer diagnosis, whichever came first. Potential confounders were sought based on
a priori hypotheses and univariate analysis of study factors. No confounders were
identified. Descriptive analysis was conducted using SAS (SAS Institute Inc., Cary,
NC) and the case-cohort analysis was conducted using EPICURE (HiroSoft
International Corporation, Seattle, WA).
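
The entry and exit rules described above, with age as the time-scale, can be sketched as follows. This is only an illustration of how each participant's follow-up interval is defined; the `Participant` fields and the `follow_up` helper are hypothetical names, and the actual analysis was run in EPICURE, not with this code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Participant:
    age_at_questionnaire: float              # entry: age when the MEC questionnaire was completed
    age_at_study_end: float                  # age on the administrative end-of-follow-up date
    age_at_death: Optional[float] = None
    age_at_diagnosis: Optional[float] = None


def follow_up(p: Participant) -> Tuple[float, float, bool]:
    """Return (entry age, exit age, is_case) for one participant.

    Exit is the earliest of diagnosis, death, or end of follow-up.
    Diagnosis is listed first so that a tie is counted as an event.
    """
    candidates = []
    if p.age_at_diagnosis is not None:
        candidates.append((p.age_at_diagnosis, True))
    if p.age_at_death is not None:
        candidates.append((p.age_at_death, False))
    candidates.append((p.age_at_study_end, False))
    exit_age, is_case = min(candidates, key=lambda c: c[0])
    return p.age_at_questionnaire, exit_age, is_case


# A sub-cohort member diagnosed at 68.2 contributes control time from 61.5 to 68.2.
print(follow_up(Participant(61.5, 70.0, age_at_diagnosis=68.2)))
```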
3. Results
F286L was in HWE in the sub-cohort members of each of the ethnic groups.
The allele frequencies by ethnic group are given in table 10. F286L was common in
African-Americans (37.6%), but rare in the other three ethnic groups.
There was no overall association between F286L genotype and risk of
prostate cancer. However, there was an increased risk of advanced prostate cancer
associated with the LL genotype in African-American men in this study, although
this finding was not statistically significant (table 11). Men who carried two copies
of this variant allele were 1.47 times as likely to have advanced prostate cancer
compared to sub-cohort members with the FF genotype (95% CI 0.87-2.47). There
was no association of risk among African-American local prostate cancer cases with
genotype (table 11).
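
To see why a relative risk of 1.47 with a 95% confidence interval of 0.87-2.47 is not statistically significant, an approximate z statistic and two-sided p-value can be back-calculated from the interval. The sketch below assumes a Wald-type interval that is symmetric on the log scale; this is an assumption about how the interval was constructed, and the function name is illustrative.

```python
import math


def p_from_ci(rr: float, lower: float, upper: float, z_crit: float = 1.959964):
    """Approximate z statistic and two-sided p-value from a ratio and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)  # SE of log(RR)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))                     # two-sided normal p-value
    return z, p


# LL versus FF genotype, advanced cases in African-Americans (table 11).
z, p = p_from_ci(1.47, 0.87, 2.47)
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under this approximation the p-value for the LL genotype in advanced cases is well above 0.05, consistent with a confidence interval that includes 1.0.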
Table 11. Relative risks and 95% confidence intervals for F286L genotype by
stage of disease in African-Americans.
Genotype N Relative Risk 95% CI
Sub-Cohort Members
FF 389
FL 436
LL 148
All Cases
FF 189 1.0
FL 187 0.87 0.69-1.09
LL 73 1.03 0.75-1.40
Local Cases
FF 112 1.0
FL 94 0.74 0.55-0.99
LL 39 0.91 0.62-1.35
Advanced Cases
FF 46 1.0
FL 52 1.02 0.69-1.54
LL 25 1.47 0.87-2.47
4. Discussion
We found an increased risk of advanced prostate cancer among African-
American men with the LL genotype of the F286L missense SNP in the HSD3B1
gene. This gene is of interest because of its pivotal role of converting DHEA to A-
dione in the prostate. Variation in this gene might alter the hormonal milieu within
the prostate sufficiently to alter risk of cancer. The relevance of F286L as the causal
variant is unknown with the exception that it does change the encoded amino acid
and as a result may be functionally important.
Labrie and colleagues have termed intra-prostatic androgen metabolism
“intracrinology” (Labrie, 1993). These investigators have shown that 40% of DHT
in the prostate is converted from DHEA and A-dione (Labrie et al., 1985). This
finding might account for the difficulty in correlating circulating T levels with
subsequent prostate cancer risk. Nearly one-half of T exposure in the prostate would
not be accounted for in a serum measurement.
Prostate cancer is highly heritable (Lichtenstein et al., 2000); however, it is
likely a complex disease given that the majority of disease does not follow a
Mendelian inheritance pattern and no major gene has been identified that would
account for an appreciable proportion of disease. F286L is a common variant with a
minor allele frequency of 37.6% in African-Americans. This variant is associated
with a modestly increased risk of approximately 50% in advanced cases. The
common variant-common disease hypothesis suggests that multiple susceptibility
alleles contribute to a modestly increased risk of disease (Risch & Merikangas,
1996). Common disease alleles have been implicated in Alzheimer’s disease
(Strittmatter & Roses, 1996), type 2 diabetes (Altshuler et al., 2000a), and deep vein
thrombosis (Dahlback, 1997).
One explanation for this finding with F286L could be population
stratification. Because this variant is more common in African-Americans
and prostate cancer is also more common in this group, it is possible that residual
ethnic confounding might be present. Following the methods proposed by Pritchard
and Rosenberg (Pritchard & Rosenberg, 1999), the MEC was tested for evidence of
population stratification using randomly selected markers throughout the genome
(~100 markers). Our colleagues at the Whitehead Institute have found that these
random markers were not associated with prostate cancer risk at any level greater
than what would be expected by chance, demonstrating that there was no appreciable
population stratification that would confound these results (personal
communication).
This result is not statistically significant and could be solely due to chance.
Also, we have studied many potential susceptibility alleles in other genes in this
population which further calls into question this result. However, the initial African-
American sample genotyped and the additional African-American cases and sub-cohort
members subsequently genotyped both gave similar relative risks (data not
shown). Although these two groups are not from two different populations, we were
able to replicate our findings within an extended sample in this study. Confirmation
will require further data.
The increased risk associated with LL genotype of F286L was only present in
advanced cases which greatly limited our power. We have chosen to focus our
prostate cancer studies on advanced cases in an effort to ensure the exclusion of
clinically insignificant disease and also as a means of studying progression. The
wide-spread use of the prostate specific antigen screening test has resulted in the
detection of a substantial amount of latent disease (disease that is clinically
insignificant). These cases likely make up a large proportion of the local disease
category which is now a mixture of disease that is clinically relevant and irrelevant.
By focusing on advanced cases we hope to identify markers which help to separate
the disease that is likely to spread, thus necessitating intervention, from disease
which will lie dormant in the prostate until the patient’s death from other causes.
Given the significant sequelae (impotence and incontinence) associated with
prostatectomy we believe that identifying the men who actually require this surgery
would be of great help to clinicians and patients.
We only had 123 advanced cases; however, we have recently established a
collaboration in which an additional 200 advanced African-American cases are
available for study. This follow-up study will allow us to better determine if F286L
in HSD3B1 is associated with increased risk of advanced disease or if we have
simply observed a statistical fluctuation. If the association is replicated in the new
set of advanced African-American cases this variant may serve to identify
individuals with clinically significant prostate cancer.
E. Association Between Single Nucleotide Polymorphisms in the Androgen
Receptor and Risk of Prostate Cancer in a Multiethnic Population
1. Introduction
Androgens have long been hypothesized to play a role in the development of
prostate cancer by serving as prostate cell mitogens. This action is largely
controlled by the androgen receptor (AR). The AR, with its ligand (either
dihydrotestosterone [DHT] or testosterone [T]), modulates the expression of genes
containing androgen response elements (AREs). Certain of these genes in turn have
the major role in control of cell proliferation in the prostate. Given the pivotal role
of the AR gene in controlling cell proliferation in the prostate it is possible that
variation in this gene may influence risk of prostate cancer.
A polymorphic CAG microsatellite repeat has been identified in exon 1 of the
AR. In vitro data suggest that the length of the CAG repeat affects transactivation,
with shorter repeats leading to higher activity. The number of these repeats varies by
racial/ethnic group (Coetzee & Ross, 1994). African-Americans have the shortest
repeat lengths (on average) and Asian-Americans have the longest repeat lengths,
leading Coetzee and Ross (Coetzee & Ross, 1994) to hypothesize that CAG repeat
length may be associated with prostate cancer risk.
The hypothesis has been studied by a number of investigators. Many
investigators (Giovannucci et al., 1997; Hsing et al., 2000; Ingles et al., 1997; Irvine
et al., 1995; Stanford et al., 1997), although not all (Bratt et al., 1999; Edwards et al.,
1999; Lange et al., 2000), have found an association with repeat length and risk of
prostate cancer. Irvine, et al. reported a 25% increased risk among men with less
than 22 repeats in their study of Whites in Los Angeles County (Irvine et al., 1995).
Similarly, Stanford and colleagues reported a 23% increased risk for men with less
than 22 repeats in their King County, Washington study (Stanford et al., 1997).
Also, Nam and co-workers (Nam et al., 2000) showed that among men at low risk of
recurrence following a radical prostatectomy, men with fewer CAG repeats were
more likely to progress compared to men with more CAG repeats.
Among members in our multiethnic cohort we observed an increased risk of
prostate cancer among men with fewer repeats in our White, Japanese, and African-
American men, but in our Latino men shorter repeat length was protective. This
observation, coupled with the inconsistencies in the literature, caused us to
hypothesize that the CAG repeat may be a marker for disease risk that is in variable
linkage disequilibrium with a causal allele depending on the population studied. For
this reason as well as to look for other genetic associations, we are examining the
role of single nucleotide polymorphisms (SNPs) in the androgen receptor with risk of
prostate cancer. In this paper we report our preliminary results on 11 polymorphic
SNPs in the National Center for Biotechnology Information SNP database (dbSNP).
We studied the role of these 11 SNPs individually and using a haplotype-based
analysis in relationship to prostate cancer risk in participants in our multiethnic
cohort study.
2. Methods
The Multiethnic Cohort Study of Diet and Cancer (MEC) is a collaborative
study between the University of Hawaii and the University of Southern California
(Kolonel et al., 2000). The MEC was initiated in 1993 and recruitment was
completed in 1996. The cohort is now in its ninth year of follow-up. Five primary
racial/ethnic groups, African-American, Japanese-American, Latinos, Whites, and
Native Hawaiians, were targeted for membership in the cohort. The total size of the
MEC is over 215,000 men and women for whom questionnaire data, including
demographic, diet, lifestyle, and medical information were collected. The cohort is
followed for incident cancers through linkage with the Los Angeles Cancer
Surveillance Program (the Los Angeles County Surveillance, Epidemiology, End
Results [SEER] program), the California Cancer Registry, and the Hawaii SEER
program.
A sub-cohort of the MEC was identified for studies of genetic susceptibility
to cancer and efforts have been made to collect blood samples on all sub-cohort
members. The sub-cohort consists of random samples of the entire MEC plus all
African-American men aged 65 or older. All incident prostate cancer cases are also
contacted and a blood sample requested. Participation rates for the blood collection
are 72% for cases and 69% for sub-cohort members. A total of 581 sub-cohort
members and 538 incident cases are included in this analysis (total sample
size=1075). The sub-cohort members and incident cases do not sum to 1075 because
44 sub-cohort members developed prostate cancer during the follow-up period. Sub-
cohort members diagnosed with prostate cancer prior to their entry into the MEC
were excluded from the analysis. The sub-cohort is representative of the MEC (table
12); the variables which were different between the entire MEC and the sub-cohort
(age, family history of prostate cancer, and education level) were considered as
potential confounders.
a. AR SNPs
The AR is located on the X chromosome and spans more than 200 kb. The
gene consists of eight exons. The 3’ UTR of the gene is not well defined and it is not
clear whether SNP #11 is located in the gene’s 3’ UTR or outside of the gene. SNP
#1 is located upstream of exon 1. The 11 SNPs span a 275 kb region. The 11 SNPs
and their flanking sequences can be found in the National Center for Biotechnology
Information dbSNP public database (http://www.ncbi.nlm.nih.gov/SNP/) under the
identification numbers rs962458, rs6152, rs1204041, rs1204039, rs1572500,
rs1337075, rs1572501, rs1361038, rs1337076, rs10060, and rs1337077.
b. Genotyping
DNA was purified from lymphocytes of peripheral blood samples for all
cases and sub-cohort members using the Gentra PureGene kit. Genotyping was
performed using Sequenom.
The Sequenom platform uses matrix-assisted laser desorption/ionization time-of-flight
(MALDI-TOF) mass spectrometry. The Sequenom protocol consists of a PCR step,
followed by SAP clean-up, mass extension, and a final resin clean-up step. The 6 µl
Table 12. Descriptive comparison between the sub-cohort and the entire MEC cohort by racial/ethnic group. Values are n (%).
                    African-American              Japanese                      Latino                        White
Characteristic      MEC            Sub-Cohort     MEC            Sub-Cohort     MEC            Sub-Cohort     MEC            Sub-Cohort
Age Group
  <50               1552 (12.99)   14 (7.61)      4229 (16.04)   28 (22.05)     2440 (10.93)   16 (10.13)     4611 (20.65)   21 (18.75)
  50-54             1502 (12.57)   17 (9.24)      3530 (13.39)   19 (14.96)     2950 (13.22)   23 (14.56)     3878 (17.37)   22 (19.64)
  55-59             1749 (14.64)   16 (8.70)      3286 (12.46)   22 (17.32)     4803 (21.52)   36 (22.78)     3279 (14.68)   18 (16.07)
  60-64             1650 (13.80)   54 (29.35)     4189 (15.89)   16 (12.60)     5343 (23.93)   42 (26.58)     3466 (15.52)   19 (16.96)
  65+               5492 (45.98)   83 (45.11)     11134 (42.23)  42 (33.07)     6787 (30.40)   41 (25.95)     7095 (31.77)   32 (28.57)
  p-value                          <0.01                         0.07                          0.75                          0.88
Number of Children
  None              1803 (15.8)    28 (15.9)      4429 (17.0)    22 (17.7)      1802 (8.3)     13 (8.4)       4818 (21.9)    23 (20.7)
  1-2               3893 (34.1)    61 (34.7)      11704 (45.0)   56 (45.2)      4915 (22.6)    34 (22.1)      8579 (38.99)   44 (39.6)
  3+                5719 (50.1)    87 (49.4)      9882 (38.0)    46 (37.1)      15007 (69.1)   107 (69.5)     8608 (39.12)   44 (39.6)
  p-value                          0.98                          0.97                          0.99                          0.96
1st degree Family History of PrCa
  None              9374 (90.0)    136 (84.0)     23066 (93.5)   114 (94.2)     18305 (93.5)   139 (94.6)     19253 (91.6)   95 (89.6)
  1+                1042 (10.0)    26 (16.1)      1593 (6.5)     7 (5.8)        1282 (6.6)     8 (5.4)        1760 (8.4)     11 (10.4)
  p-value                          0.01                          0.76                          0.59                          0.46
Marital Status
  Married           7315 (62.1)    125 (68.3)     21482 (81.8)   112 (88.2)     17745 (80.3)   125 (79.6)     16268 (73.3)   89 (79.5)
  Not married       3583 (30.4)    45 (24.6)      2619 (10.0)    6 (4.7)        3420 (15.5)    22 (14.0)      4013 (18.1)    17 (15.2)
  Never married     879 (7.5)      13 (7.1)       2170 (8.3)     9 (7.1)        945 (4.3)      10 (6.4)       1917 (8.6)     6 (5.4)
  p-value                          0.21                          0.11                          0.40                          0.29
Education Level
  Grades 0-10       1990 (16.9)    27 (14.8)      2188 (8.4)     5 (4.0)        9561 (43.6)    50 (32.1)      1523 (6.9)     4 (3.6)
  Grades 11-12      3010 (25.6)    52 (28.6)      7642 (29.2)    30 (23.8)      4608 (21.0)    33 (21.2)      3571 (16.1)    13 (11.8)
  Vocational        651 (5.5)      8 (4.4)        3317 (12.7)    16 (12.7)      1443 (6.6)     14 (9.0)       938 (4.2)      7 (6.4)
  Some college      3555 (30.2)    45 (24.7)      4546 (17.4)    28 (22.2)      3673 (16.8)    28 (18.0)      5565 (25.1)    24 (21.8)
  College+          2563 (21.8)    50 (27.5)      8499 (32.5)    47 (37.3)      2634 (12.0)    31 (19.9)      10571 (47.7)   62 (56.4)
  p-value                          0.20                          0.15                          0.01                          0.18
PCR mix consisted of 2 µl (~5 ng) of genomic DNA, 3.11 µl of water, 0.5 µl of 10x
buffer, 0.2 µl of 25 mM MgCl2, 0.1 µl of 10 mM dNTPs, 0.04 µl of Taq polymerase,
and 0.05 µl of 5 µM primers. The amplification was performed for 15 minutes at
92°C, followed by 45 cycles of 20 seconds at 94°C, 30 seconds at 56°C and one
minute at 72°C, and a final three minute cycle at 72°C. The 2 µl SAP mix consisted
of 1.7 µl of 5x TS buffer and 0.3 µl of SAP. The PCR/SAP mix was incubated for 20
minutes at 34°C and then 5 minutes at 85°C. The 2 µl hME reaction consists of 1.24
µl of water, 0.018 µl of thermosequenase, 0.2 µl ddNTP, and 0.54 µl of 10 µM probe
which is added directly to the PCR product after the SAP protocol has been
completed. The hME protocol includes incubation for two minutes at 94°C,
followed by 55 cycles of 5 seconds at 94°C, 5 seconds at 52°C, and 5 seconds at
72°C. As a final step 16 µl of a resin-water mix is added.
c. Severity of Disease
Cases were categorized into either ‘local’ or ‘advanced’ disease. ‘Local’ is
defined as local disease with a well or moderately differentiated grade, and
‘advanced’ is local disease with a poorly differentiated grade or regional or distant
disease.
d. Statistical Analysis
Pairwise linkage disequilibrium (LD) was assessed using the D’ statistic with
the LOD score as a measure of statistical significance. The correlation between the
markers was assessed using R².
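
For two biallelic SNPs, D’ and the squared correlation can both be computed from the four haplotype counts. The following is a minimal sketch using the standard textbook definitions; the function and argument names are illustrative assumptions, the example counts are made up, and the LOD score calculation is not shown.

```python
def pairwise_ld(n_ab: int, n_aB: int, n_Ab: int, n_AB: int):
    """Return (D, D', r^2) from counts of the four two-locus haplotypes.

    n_ab: haplotypes carrying allele a at SNP 1 and allele b at SNP 2, etc.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_a = (n_ab + n_aB) / n                  # frequency of allele a at SNP 1
    p_b = (n_ab + n_Ab) / n                  # frequency of allele b at SNP 2
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("LD is undefined when either SNP is monomorphic")
    d = n_ab / n - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical counts: two SNPs in moderate LD.
print(pairwise_ld(40, 10, 10, 40))
```

Because the AR lies on the X chromosome, each man contributes exactly one haplotype, so the counts can be tallied directly from male genotypes without statistical phasing.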
The association between the 11 SNPs and risk of prostate cancer was
analyzed using a standard case-cohort approach for single variant analysis. This was
also possible for haplotype-based analysis because the AR is located on the X
chromosome so that haplotypes can be determined unambiguously.
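
Because each man carries a single X chromosome, the alleles observed at the 11 SNPs form his haplotype directly, and haplotype frequencies can be obtained by simple counting. The sketch below illustrates this with made-up genotype strings; the data, the allele letters, and the function name are hypothetical.

```python
from collections import Counter

N_SNPS = 11
POSSIBLE_HAPLOTYPES = 2 ** N_SNPS            # 2048 for 11 biallelic SNPs


def count_haplotypes(male_genotypes):
    """Tally haplotypes from hemizygous (male) X-chromosome genotypes.

    Each genotype is a sequence of N_SNPS single-letter alleles, one per SNP.
    """
    counts = Counter()
    for alleles in male_genotypes:
        if len(alleles) != N_SNPS:
            raise ValueError("expected one allele per SNP")
        counts["".join(alleles)] += 1
    return counts


# Hypothetical genotypes for four men (not study data).
example = ["GGTCACCATGG", "GGTCACCATGG", "CATCTCCATGA", "GGTCACCATGG"]
counts = count_haplotypes(example)
for haplotype, n in counts.most_common():
    print(haplotype, n, f"{100 * n / len(example):.0f}%")
```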
Sub-cohort members who developed prostate cancer during the study period
contributed control time up to the point they developed disease. Because sub-cohort
members could also be cases, the numbers of sub-cohort members and cases in table
13 do not sum to the total. The time-scale used in the case-cohort analysis was age.
For sub-cohort members, although samples were collected over time, entry into the
sub-cohort was the age at the time the original MEC questionnaire was completed,
and not age at blood draw. This approach was adopted because germline genotype
does not change over time and the true entry into the sub-cohort was the date the
questionnaire was completed. Exit from the study was age at September 1, 1999, age
at death, or age at prostate cancer diagnosis, whichever came first. Potential
confounders were sought based on a priori hypotheses and univariate analysis of
study factors. No confounders were identified. Descriptive analysis was conducted
using SAS (SAS Institute Inc., Cary, NC) and the case-cohort analysis was
conducted using EPICURE (HiroSoft International Corporation, Seattle, WA).
3. Results
A total of 581 sub-cohort members and 538 cases were included in this study
(44 sub-cohort members developed prostate cancer during the study period).
African-Americans make up the majority of sub-cohort members who developed
disease (n=32) because older African-Americans were oversampled for the sub-cohort.
The descriptive characteristics of the study participants are given in table 13.
There are 290 African-American, 254 Japanese-American, 310 Latino, and 221
White participants.
Table 13. Descriptive characteristics of study population by sub-cohort
membership (n and %).
Variable Sub-Cohort (n=581) Cases (n=538)
Geographic Location
Los Angeles 405 (69.7) 396 (73.6)
Hawaii 176(30.3) 142(26.4)
Race
African-American 184(31.7) 138(25.7)
Japanese 127(21.9) 130(24.1)
Latino 158(27.2) 157(29.2)
White 112(19.3) 113(21.0)
Age at Exit
Mean (SD) 66.1 (8.0) 67.3 (6.4)
Range 50-81 46-78
Marital Status
Married 451 (77.9) 417(77.8)
Not married 90(15.5) 96 (17.9)
Never married 38 (6.6) 23 (4.3)
Education Level
0-10th grade 86 (15.0) 96 (18.0)
11-12th grade 128 (22.3) 137 (25.8)
Vocational school 45 (7.8) 33 (6.2)
Some college 125(21.8) 110 (20.7)
College graduate + 190 (33.1) 156(29.3)
Number of Children
None 86 (15.2) 52 (9.9)
1-2 195 (34.5) 183 (34.9)
3+ 284 (50.3) 290 (55.2)
Smoking Status
Never smoker 167 (29.0) 147 (27.7)
Former smoker 318(55.3) 304 (57.2)
Current smoker 90 (15.7) 80(15.1)
Family History of PrCa
No history or unk 484 (90.3) 414(86.0)
1st degree relative 52 (9.7) 69 (14.0)
The location of each SNP within the gene is shown in figure 3. The allele
frequencies for the 11 SNPs are shown in table 14. The Japanese-Americans showed
no variation at any of the 11 loci studied, whereas the African-Americans were
extremely polymorphic. The Whites showed variation at only three of the SNPs
studied. Although Latinos were polymorphic at all sites studied, most of the variants
were rare.
In examining the association of individual SNPs with prostate cancer risk, the
C allele of SNP 1 was nominally significantly associated with increased risk among
African-Americans (RR=1.73, 95% CI=1.07-2.80), but was protective among
Latinos (RR=0.32, 95% CI=0.11-0.99) (see table 15). There was no association
observed for Whites. The A allele of SNP 2 was also associated with an increased
risk among African-Americans, but was protective among Latinos. The T allele of
SNP 5, which was only present at an appreciable frequency in African-Americans,
was associated with a nominally significant protective effect, as was the A allele of
SNP 11 (see table 15).
The linkage disequilibrium across this 275 kb region was strong (see figure
4). A total of 2048 (2^11) haplotypes were possible from these 11 SNPs; however, we
observed 19 in total and only ten at a frequency of greater than 1% in any of the four
ethnic groups (see table 16). There was one haplotype (Haplotype #1) that was
clearly the most common across all racial/ethnic groups. Haplotype #2 was just as
common as Haplotype #1 in African-Americans (see table 16). All ten haplotypes were observed
among the African-American population whereas only one haplotype was observed
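Figure 3. SNP map of the androgen receptor gene (map is not to scale).
[Map not reproduced. Position ranges labeled on the map: 1-2731, 99225-99376, 118554-118670, 143946-144233, 150022-150166, 154377-154507, 155371-155528, and 212600-? (end position illegible). SNP locations are listed in table 14.]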
Table 14. Minor allele frequencies and locations for AR SNPs by racial/ethnic group for sub-cohort
members and cases.
                                  African-American    Japanese            Latino              White
Map Location  Allele*  Location   Sub-Cohort  Cases   Sub-Cohort  Cases   Sub-Cohort  Cases   Sub-Cohort  Cases
1             2        -17911     54.3%       66.3%   0%          0%      7.6%        3.6%    4.0%        6.9%
2             1        1754       54.9%       65.6%   0%          0%      12.8%       9.7%    16.0%       19.8%
3             1        21816      38.2%       32.2%   0%          0%      3.2%        1.7%    0%          0%
4             4        22143      84.8%       79.8%   0%          0%      14.4%       10.5%   16.0%       19.2%
5             4        116816     28.1%       14.1%   0%          0%      1.4%        0.7%    0%          0%
6             4        122128     23.1%       14.1%   0%          0%      1.4%        0.7%    0%          0%
7             2        129969     3.8%        4.2%    0%          0%      0%          0.7%    0%          0%
8             3        142388     13.6%       19.8%   0%          0%      1.4%        0.7%    0%          0%
9             4        154299     29.1%       30.5%   0%          0%      1.4%        0.7%    0%          0%
10            3        218923     47.7%       44.2%   0%          0%      3.5%        1.5%    0%          0%
11            1        257505     28.5%       14.0%   0%          0%      1.3%        0.7%    0%          0%
* Represents the base present with 1=A, 2=C, 3=G, 4=T
Figure 4. Pairwise D', LOD score, and R2 for the 11 AR markers for
African-Americans (triangular matrix, listed here by marker pair).
Markers      D'     LOD    R2
1 and 2      1.00   65.0   0.97
1 and 3      0.39   5.79   0.12
2 and 3      0.41   6.55   0.14
1 and 4      1.00   15.4   0.23
2 and 4      1.00   15.7   0.23
3 and 4      1.00   9.02   0.13
1 and 5      0.94   22.9   0.37
2 and 5      0.94   24.5   0.39
3 and 5      0.84   21.6   0.42
4 and 5      1.00   7.15   0.08
1 and 6      0.94   19.8   0.32
2 and 6      0.94   20.3   0.33
3 and 6      0.94   23.4   0.44
4 and 6      1.00   6.01   0.07
5 and 6      1.00   50.8   0.82
1 and 7      1.00   2.29   0.03
2 and 7      1.00   2.24   0.03
3 and 7      1.00   1.63   0.03
4 and 7      1.00   0.67   0.01
5 and 7      1.00   1.15   0.01
6 and 7      1.00   1.00   0.01
1 and 8      1.00   10.8   0.16
2 and 8      1.00   10.5   0.15
3 and 8      1.00   11.1   0.19
4 and 8      1.00   2.97   0.03
5 and 8      1.00   5.40   0.06
6 and 8      1.00   4.67   0.05
7 and 8      1.00   7.74   0.24
1 and 9      0.81   13.3   0.23
2 and 9      0.87   15.3   0.25
3 and 9      0.24   1.42   0.03
4 and 9      1.00   6.89   0.08
5 and 9      0.80   6.35   0.10
6 and 9      0.77   5.02   0.08
7 and 9      1.00   4.79   0.09
8 and 9      1.00   22.2   0.42
1 and 10     0.67   17.4   0.31
2 and 10     0.68   17.8   0.32
3 and 10     0.07   0.18   0.00
4 and 10     1.00   13.2   0.00
5 and 10     0.60   6.16   0.11
6 and 10     0.78   9.75   0.16
7 and 10     1.00   2.87   0.04
8 and 10     1.00   13.4   0.20
9 and 10     1.00   31.9   0.48
1 and 11     1.00   29.3   0.44
2 and 11     1.00   31.1   0.46
3 and 11     0.90   26.9   0.50
4 and 11     1.00   7.3    0.08
5 and 11     0.96   52.7   0.86
6 and 11     0.95   40.3   0.69
7 and 11     1.00   1.20   0.01
8 and 11     1.00   5.66   0.07
9 and 11     0.71   5.23   0.08
10 and 11    0.49   4.47   0.08
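As a rough illustration of how the D' and R2 values reported in figure 4 relate to haplotype frequencies, the following Python sketch computes both statistics for a pair of biallelic markers. The function name `ld_stats` and the example frequencies are hypothetical and are not taken from the study data; the LOD score is omitted.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # At least one locus is monomorphic, so LD is undefined.
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    # D is scaled by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d_prime, r2


# Hypothetical example: a rare allele (10%) that always travels with a
# common allele (50%) gives complete D' but a low r^2.
d_prime, r2 = ld_stats(p_ab=0.10, p_a=0.10, p_b=0.50)
```

Under these assumed frequencies D' is complete while r^2 stays low, which is one way a marker pair can show D' = 1.00 alongside a small R2 when the allele frequencies differ.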
Table 15. Relative risk and 95% CIs for the minor allele of AR SNPs by racial/ethnic group.
Map Location  Minor Allele*  Black             Latino            White             All Groups**
1             2              1.73 (1.07-2.80)  0.32 (0.11-0.99)  1.08 (0.29-4.01)  1.18 (0.78-1.77)
2             1              1.57 (0.98-2.52)  0.52 (0.24-1.12)  1.31 (0.60-2.89)  1.14 (0.80-1.62)
3             1              0.70 (0.42-1.17)  0.69 (0.41-1.15)                    0.69 (0.41-1.15)
4             4              0.88 (0.49-1.59)  0.56 (0.28-1.15)  1.23 (0.56-2.74)  0.88 (0.59-1.30)
5             4              0.53 (0.30-0.92)                                      0.57 (0.32-1.00)
6             4              0.56 (0.31-1.02)                                      0.58 (0.32-1.06)
7             2              1.41 (0.45-4.49)                                      1.39 (0.43-4.56)
8             3              1.46 (0.78-2.76)                                      1.14 (0.60-2.17)
9             4              1.37 (0.83-2.26)                                      1.05 (0.63-1.76)
10            3              1.05 (0.67-1.67)                                      0.94 (0.59-1.48)
11            1              0.51 (0.29-0.88)                                      0.56 (0.32-0.99)
* Represents the base present with 1=A, 2=C, 3=G, 4=T
** Adjusted for race/ethnicity
Table 16. Haplotype structure of the AR gene using 11 SNPs by racial/ethnic group.
Haplotype #  Haplotype*              African-American  Japanese-American  Latino      White
                                     (n=214)**         (n=224)**          (n=269)**   (n=195)**
1            4-3-3-2-3-2-4-1-3-1-3   18.7%             100%               88.8%       83.6%
2            4-3-1-4-4-4-4-1-3-1-1   18.7%                                0.7%
3            2-1-3-4-3-2-4-1-3-1-3   17.3%                                2.6%        5.6%
4            4-1-3-4-3-2-4-1-3-1-3   0.9%                                 5.2%        10.8%
5            2-1-3-4-3-2-4-1-4-3-3   11.2%
6            2-1-3-4-3-2-4-1-3-3-3   10.7%                                0.7%
7            2-1-1-4-3-2-4-3-4-3-3   11.2%                                0.7%
8            2-1-1-4-3-2-2-3-4-3-3   4.2%                                 0.4%
9            4-3-3-4-4-2-4-1-3-3-1   1.9%
10           4-3-1-4-4-4-4-1-4-3-1   1.4%
* Represents the base present with 1=A, 2=C, 3=G, 4=T (the first SNP has C and T as the two possible alleles)
** Numbers do not sum to total N in this study due to missing data at one or more loci; percentages do not sum
to 100% because haplotypes present in <1% of people in an ethnic group are not included in the table.
among Japanese-Americans. Three haplotypes were observed in the White
population and although seven haplotypes were observed in the Latino group, four
were only present in one or two people. Further analysis of the Latinos with rare
haplotypes was done by examining the reported maternal race and place of birth, but
this did not explain why haplotypes that were otherwise exclusively African-
American were observed in the Latino group. Also, the second most common
haplotype observed in the Whites and Latinos was observed in only 2 African-
Americans. Further analysis of these two people based on self-reported maternal
race and place of birth did not explain this observation.
The risks associated with each haplotype compared to Haplotype #1, the most
common haplotype, were determined for African-Americans and the entire group
(table 17). Haplotype #2, a common haplotype in African-Americans, was
associated with a nominally statistically significant decrease in risk (RR=0.44, 95%
CI=0.20-0.97). None of the other haplotypes were associated with risk of prostate
cancer (see table 17).
4. Discussion
The AR is a strong prostate cancer candidate gene. There is evidence of an
association between the CAG microsatellite repeat in exon 1 and the risk and prognosis
of prostate cancer (Giovannucci et al., 1997; Ingles et al., 1997; Irvine et al., 1995;
Stanford et al., 1997). Furthermore, it is the AR, with its ligand, that controls cell
proliferation in the prostate. We studied 11 SNPs in and around the AR spanning
more than 275 kb.
Table 17. Relative risk and 95% CIs for AR Haplotypes for African-Americans and all groups combined.
                        African-American     All Groups**
Haplotype*              RR     95% CI        RR     95% CI
4-3-3-2-3-2-4-1-3-1-3   Reference            Reference
4-3-1-4-4-4-4-1-3-1-1   0.48   0.20-1.14     0.44   0.20-0.97
2-1-3-4-3-2-4-1-3-1-3   1.21   0.53-2.78     0.98   0.52-1.82
4-1-3-4-3-2-4-1-3-1-3   -      -             1.27   0.62-2.59
2-1-3-4-3-2-4-1-4-3-3   1.03   0.41-2.61     0.73   0.30-1.82
2-1-3-4-3-2-4-1-3-3-3   1.04   0.40-2.67     0.72   0.30-1.73
2-1-1-4-3-2-4-3-4-3-3   1.16   0.44-3.07     0.85   0.36-2.06
2-1-1-4-3-2-2-3-4-3-3   1.06   0.26-4.26     0.92   0.24-3.61
* 1=A, 2=C, 3=G, 4=T
** Adjusted for race/ethnicity
In the single SNP analysis, SNPs 5, 6, and 11 were all associated with
decreased risk of prostate cancer in African-Americans. These three protective
alleles fall on the same haplotype; however, the protective effect of the haplotype is
only marginally greater than that of any one of the single SNPs. This occurred because
these three SNPs are highly correlated (see figure 4). SNPs 5 and 6 are both
intronic, making them unlikely candidates as actual causal alleles. It is possible that
SNP 11, which is either in the 3’ UTR or downstream of the 3’ UTR, may serve a
functional role, but this is speculative; it is far more likely that any actual causal allele
remains to be studied and is simply marked by these three SNPs. The observed
protective effect of this haplotype, however, is consistent with chance given the
multiple hypotheses that were tested. Much discussion has taken place about how to
adequately adjust for multiple hypotheses. One method, a simple Bonferroni-type
correction, has been described as “unfair” because equal weight is given to each
hypothesis tested, whereas some hypotheses probably have a higher prior probability
based on the biology or functional relevance of any given candidate marker. For
example, SNP 11 may deserve a higher prior probability given that it is located in a
possible regulatory region.
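The point about multiple testing can be made concrete with a small sketch. The code below recovers an approximate two-sided p-value from a reported relative risk and its 95% CI, assuming a Wald-type interval that is symmetric on the log scale, and compares it with a Bonferroni threshold. The choice of 11 tests (one per SNP) is an assumption made only for illustration; the number of hypotheses actually tested is not stated here, and the function names are hypothetical.

```python
import math


def p_value_from_ci(rr, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value for a relative risk from its 95% CI,
    assuming the interval is symmetric on the log scale (Wald-type)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    # erfc(|z| / sqrt(2)) equals the two-sided standard normal tail area.
    return math.erfc(abs(z) / math.sqrt(2))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level giving equal weight to every hypothesis."""
    return alpha / n_tests


# Haplotype #2 estimate reported in the results (RR=0.44, 95% CI=0.20-0.97).
p = p_value_from_ci(0.44, 0.20, 0.97)
# Assumption for illustration only: 11 tests, one per SNP.
threshold = bonferroni_threshold(0.05, 11)
significant_after_correction = p < threshold
```

With these inputs the nominal p-value falls just under 0.05 but well above the corrected threshold, in line with the statement that the observed effect is consistent with chance.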
Although it does not appear that variation at these 11 loci in the AR explains
any racial/ethnic variation in disease risk, the general pattern of diversity is
interesting. There are 2048 possible haplotypes based on using 11 markers; however,
only nineteen were observed and only ten at an appreciable frequency, reflecting the
strong linkage disequilibrium (LD) across this region. All of the Japanese
participants had one haplotype whereas all ten haplotypes were present in the
African-American population. Generally, Black populations have more haplotype
diversity, reflecting this population’s greater age and therefore the greater time
available for LD to decay (see figure 4). The two most common haplotypes are quite
different, sharing only six of the 11 alleles. Seven of the ten haplotypes are almost
exclusively African-American, suggesting these haplotypes may be markers for race/ethnicity.
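The counts quoted in this discussion can be checked with a few lines of Python. The sketch below uses the allele codings of Haplotypes #1 and #2 as given in table 17; the variable names are illustrative only.

```python
# Allele codes (1=A, 2=C, 3=G, 4=T) for the two most common haplotypes.
hap1 = "4-3-3-2-3-2-4-1-3-1-3".split("-")
hap2 = "4-3-1-4-4-4-4-1-3-1-1".split("-")

# With two alleles at each of 11 SNPs, 2**11 haplotypes are possible.
n_possible = 2 ** len(hap1)

# Count positions at which the two haplotypes carry the same allele.
n_shared = sum(a == b for a, b in zip(hap1, hap2))

# Fraction of possible haplotypes actually observed (19 observed in total).
fraction_observed = 19 / n_possible

print(n_possible, n_shared)  # 2048 6
```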
It is possible that our selection of 11 SNPs across a 275 kb region was not
sufficient to describe the full diversity across the AR. Also, we did not have CAG
repeat length on a substantial number of subjects in this study. We are now
undertaking an extensive genotyping project to increase the number of SNPs studied
and to determine the CAG repeat length for all of the individuals in the complete
MEC case-cohort sample. We hope to determine if the complete haplotypic diversity
across this region, including the CAG repeat length, is associated with risk of
prostate cancer.
III. METHODOLOGICAL DEVELOPMENT: STRATEGY FOR CASE-COHORT
ANALYSIS FOR STUDIES OF GENETIC SUSCEPTIBILITY IN
THE MULTIETHNIC COHORT STUDY
A. Introduction
A case-cohort study is one in which a random sample (a sub-cohort) of a
larger cohort is selected at entry to the cohort to serve as controls for all disease
studies. The sub-cohort is usually only a small percentage of the cohort, so
cases primarily occur outside of the sub-cohort, although some cases do
occur within it. Incident cases are collected and sub-cohort members are
followed until the end of the study, which is known as the censoring date. The
censoring date (i.e., study cut-off date) is the point after which participants are no
longer followed, and it will vary for any individual analysis. For example, in an
ongoing cohort study, if case ascertainment is complete through the end of 1999, the
censoring date would be December 31, 1999. In the future, when case ascertainment
is complete through 2000, the censoring date would be December 31, 2000.
Participants in a case-cohort study are either incident cases of the disease of
interest or members of the sub-cohort. (Note: Sub-cohort members with the disease
of interest at entry to the cohort are not considered in a case-cohort study of that
particular disease.) Participants can be classified as
1) Sub-cohort members without incident disease, or
2) Sub-cohort members with incident disease, or
3) Non sub-cohort members with incident disease
The planned design for studies of genetic susceptibility within the
Multiethnic Cohort Study of Diet and Cancer (MEC) was the case-cohort approach.
This dissertation represents the first attempt to utilize a case-cohort analysis with the
MEC.
The MEC case-cohort study differs slightly from traditional case-cohort
designs. Usually, exposure information (in this case, a blood sample) is collected on
the sub-cohort members at the start of the study. In the MEC, although the
sub-cohort was identified at the start of the study, collection of the blood specimens is
ongoing and not all members of the sub-cohort have thus far been contacted to
provide a sample.
B. MEC ‘Effective’ Sub-Cohort and ‘Design’ Sub-Cohort
A random sample of the MEC was taken to serve as the sub-cohort for any
disease of interest (the ‘design’ sub-cohort). The date of entry into the MEC
sub-cohort (the ‘design’ sub-cohort) is the date of enrollment into the MEC (i.e.,
questionnaire completion date). Members of the sub-cohort are then assigned a
sequence number for blood draw and as blood samples are collected these sub-cohort
members become part of the ‘effective’ sub-cohort. The ‘effective’ sub-cohort represents
sub-cohort members for whom a blood sample was collected or would have been
collected by the end of this study (the censoring date, December 31, 1999), whereas
the ‘design’ sub-cohort consists of the entire random sample of sub-cohort
members. Logistically it was difficult to determine if a sub-cohort member who
developed disease would have been approached to have blood drawn by the
censoring date because members of the sub-cohort are not assigned a date of blood
draw, but rather a sequence in which to be contacted. This means that all
subjects were randomly assigned a number, starting at one, and sub-cohort members
were approached to provide a blood sample in this order. Sub-cohort members who
had not yet had their blood drawn but developed disease (and then immediately had a
blood sample drawn) were taken out of the sequence database (but their sequence
number was kept). In order to determine the date the sub-cohort member’s blood
would have been drawn, it was necessary to compare the blood draw dates of
individuals with nearby sequence numbers. If these dates were before the censoring
date then it was considered that the sub-cohort case would have had his blood drawn
before the censoring date and he was included as a sub-cohort member.
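The neighbour-based rule described above might be sketched as follows. This is only an illustration under stated assumptions: the window of five sequence numbers, the requirement that every neighbour was drawn on or before the censoring date, and the function name `would_have_been_drawn` are all hypothetical, since the text does not specify how many neighbours were compared or how boundary cases were handled.

```python
from datetime import date


def would_have_been_drawn(seq_no, drawn, censoring_date, window=5):
    """Decide whether a sub-cohort case would have had blood drawn in time.

    seq_no:         sequence number of the sub-cohort member who became a case
    drawn:          dict mapping sequence number -> actual blood draw date
    censoring_date: study cut-off date
    window:         hypothetical number of sequence positions treated as "nearby"
    """
    neighbours = [
        d for s, d in drawn.items()
        if s != seq_no and abs(s - seq_no) <= window
    ]
    if not neighbours:
        # No nearby sequence number has been reached yet.
        return False
    # Assumed rule: every nearby draw happened on or before the censoring date.
    return all(d <= censoring_date for d in neighbours)


# Hypothetical sequence numbers and draw dates.
draw_dates = {
    98: date(1999, 5, 1),
    99: date(1999, 6, 1),
    101: date(1999, 7, 15),
}
eligible = would_have_been_drawn(100, draw_dates, date(1999, 12, 31))
```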
Figure 5 depicts the members of the ‘effective’ sub-cohort and figure 6
shows the individuals who were in the ‘design’ sub-cohort, but not the ‘effective’
sub-cohort. In figure 5, subject 1 is sampled, has his blood drawn and is followed to
the censoring date. He contributes control time for the entire study. The date his
blood was actually drawn is not utilized. Subjects 2 and 3 are similar except that
they contribute control time only until their time of death or development of disease.
Subject 4 is different in that he developed prostate cancer prior to his planned blood
draw date (PBD). Because subject 4’s PBD falls before the censoring date this
participant contributes control time until he develops disease.
Figure 5. Members of the ‘Design’ and ‘Effective’ Sub-Cohort.
[Timeline figure not reproduced. The X-axis is calendar time of follow-up, '93 through '98, ending at the censoring date. Events are listed in the order shown for each subject.]
Subject 1 (blood drawn, alive at end of study): Questionnaire; PBD and ABD; censored
Subject 2 (blood drawn, dead before end of study): Questionnaire; PBD and ABD; Death
Subject 3 (blood drawn, developed prostate cancer): Questionnaire; PBD and ABD; PrCa
Subject 4 (develops prostate cancer, blood drawn): Questionnaire; PrCa; ABD; PBD
PBD = Planned Blood Draw; ABD = Actual Blood Draw; PrCa = Prostate Cancer
Figure 5 describes the ‘effective’ sub-cohort; however, it is critical to
remember that not all intended sub-cohort members enroll. Figure 6 describes the
three situations in which an intended sub-cohort member does not become part of the
‘effective’ sub-cohort. First, as shown by subject 1, a person might refuse a blood
draw. Second, a person might die prior to being contacted for a blood draw (subject
2 in figure 6). Lastly, a person’s sequence number might not have been reached by
the censoring date of the study (subject 3 in figure 6). The underlying assumption
for this study is that none of these events is related to genotype.
C. Participation Rate
Although the majority of cases occur outside of the sub-cohort, some will occur
among sub-cohort members as shown in figure 5. This fact raises an important issue
for this particular study design. As exemplified by subject 4, some cases within the
sub-cohort occur before the participant’s blood was actually drawn. However, once
the case was diagnosed blood was drawn immediately. Even though in this situation
a sub-cohort member’s blood was drawn as a case, he still serves as a member of the
sub-cohort if his blood would have been drawn as a sub-cohort member by the
censoring date. The issue of concern is that the rate of participation among cases is
approximately 67% and among sub-cohort members is approximately 44%. The
sub-cohort members collected as cases participated at the 67% rate rather than the
control rate of 44%. It is necessary to correct the participation rate to the proper
44%. There were a total of 53 subjects in this dissertation dataset who were drawn
as cases, but also were eligible to serve as sub-cohort members. After
Figure 6. Members of the ‘Design’ Sub-Cohort, but not the ‘Effective’ Sub-Cohort.
[Timeline figure not reproduced. The X-axis is calendar time of follow-up, '93 through '98, ending at the censoring date. Events are listed in the order shown for each type of sub-cohort member.]
Subject 1 (refused blood draw): Questionnaire; PBD, but refused
Subject 2 (died before blood draw): Questionnaire; Death; PBD
Subject 3 (blood not yet drawn): Questionnaire; PBD
PBD = Planned Blood Draw
applying the proper participation rate a total of 35 subjects were appropriate to
remain ((0.44/0.67)*53 = 34.8, rounded to 35). A random selection procedure was run
to select 35 of the 53 subjects. The 18 subjects who were removed were not included
in the category of non-sub-cohort cases.
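The participation-rate correction can be sketched in a few lines of Python. The subject identifiers, the random seed, and the function name are hypothetical; the actual random selection procedure used for the study is not described beyond the numbers above.

```python
import random


def n_to_retain(n_drawn_as_cases, control_rate, case_rate):
    """Number of sub-cohort cases to keep so that their effective
    participation matches the sub-cohort (control) rate."""
    return round(n_drawn_as_cases * control_rate / case_rate)


n_keep = n_to_retain(53, 0.44, 0.67)   # 34.8 rounds to 35

# Hypothetical identifiers for the 53 sub-cohort members drawn as cases.
subject_ids = list(range(1, 54))
rng = random.Random(2002)              # arbitrary seed, for reproducibility only
retained = rng.sample(subject_ids, n_keep)
removed = sorted(set(subject_ids) - set(retained))
```

Fixing the seed is a design choice for the sketch, so that the same 35 subjects are selected each time the analysis is rerun.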
D. Time Scales
In any type of cohort study the critical ‘time variables’ must be determined.
The critical time variables are those which are relevant to the disease being studied.
In the situation of prostate cancer the most important time variable is age. The risk
of prostate cancer increases exponentially with increasing age. Calendar time is a
second potentially important time variable. Calendar time plays a crucial role when
screening or diagnostic practices change over time. For example, the introduction of
the prostate specific antigen (PSA) screening test in the late 1980s would make
calendar time a critical variable for any study that spanned the pre- and post-PSA
era. Because the MEC began after the introduction of PSA and no other relevant
calendar time events have occurred it is not necessary to consider calendar time as a
critical time variable in this analysis. A third potentially relevant time variable is
time since enrollment into the cohort, but because prostate cancer is not a prodromal
disease there is no reason to expect that people destined to get prostate cancer were
more likely to be early enrollers in the cohort. Age at exit from the study for
sub-cohort members was the age at the censoring date, death, or incident disease,
whichever came first. The specific definitions of entry and exit ages for both
sub-cohort members and cases are
shown in figure 7. As shown in figure 8, age serves as the time scale on the X-axis
Figure 7. Categorization of Sub-Cohort Membership.
Sub-Cohort Members:
  Did Not Develop Incident Prostate Cancer Before Censoring Date
    Entry time is the age at questionnaire
    Exit time is the age at the censoring date or age at death
  Developed Incident Prostate Cancer Before Censoring Date
    Entry time is the age at questionnaire
    Exit time is age at diagnosis
Cases:
  Definition: Identified as a prostate cancer case outside the sub-cohort as of the study date. Entry time is the age just
  prior to diagnosis and exit time is the age at diagnosis.
Figure 8. Risk Set Categorization for Sub-Cohort Members.
[Figure not reproduced. Sub-cohort members are plotted against age, which serves as the time scale on the X-axis.]
and entry into the cohort is marked by age rather than calendar time. Risk sets are
then assembled based on sub-cohort members at risk of disease at the age the case
developed disease. In figure 8, for case D1, sub-cohort members 1, 4, and 8 are part
of his risk set. Sub-cohort member #9 is part of the risk set for D2 and for sub-cohort
member #3 (D2) who developed disease at age 63. Sub-cohort member #6 is in the
risk set of both D3 and D4. Sub-cohort members #5 and #10 are not in any risk set.
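A minimal sketch of risk-set assembly on the age scale is shown below. The entry and exit ages are hypothetical and do not reproduce the subjects of figure 8, and the convention that a member is at risk when entry age < case age <= exit age is a common one assumed here rather than taken from the text.

```python
def risk_set(members, case_age):
    """Return the IDs of sub-cohort members at risk at the age a case was diagnosed.

    members:  dict mapping member ID -> (entry_age, exit_age)
    case_age: age at which the case developed disease
    """
    return sorted(
        member_id
        for member_id, (entry_age, exit_age) in members.items()
        if entry_age < case_age <= exit_age
    )


# Hypothetical sub-cohort members: ID -> (entry age, exit age).
members = {
    1: (55.0, 70.0),
    2: (60.0, 62.0),
    3: (58.0, 63.0),   # exits at 63, e.g. by developing disease
    4: (66.0, 75.0),
}

at_risk_at_63 = risk_set(members, 63.0)   # members 1 and 3
at_risk_at_68 = risk_set(members, 68.0)   # members 1 and 4
```

A member can therefore appear in several risk sets or in none, depending only on whether a case's age at diagnosis falls inside that member's entry-to-exit interval.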
E. Statistical Analysis
The EPICURE (HiroSoft International Corporation, Seattle, WA) statistical
software package has a case-cohort procedure. This procedure allows for the
necessary designation of sub-cohort members without disease (coded as 0),
sub-cohort members with incident prostate cancer (coded as 1), and incident cases
outside of the sub-cohort (coded as 2). Entry into the cohort for those coded as 0 or
1 was specified as the age at questionnaire completion. Entry into the cohort for
those coded as 2 was the instant just before the age at prostate cancer diagnosis (this
is required by the program). Exit from the cohort was defined as the age at the
censoring date for those coded as 0, unless death had already occurred, at which point
exit was the age at death. Exit from the cohort was defined as the age at prostate
cancer diagnosis for those coded as 1 or 2.
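The coding rules above can be summarized in a short Python sketch. This is not EPICURE syntax; it only illustrates how the 0/1/2 codes and the entry and exit ages could be derived before the data are passed to a case-cohort procedure. The class and field names, and the size of the offset used for "the instant just before" diagnosis, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

EPSILON = 0.001  # hypothetical offset (in years) for "the instant just before"


@dataclass
class Subject:
    in_subcohort: bool
    age_at_questionnaire: float
    age_at_censoring: float
    age_at_death: Optional[float] = None
    age_at_diagnosis: Optional[float] = None  # incident diagnosis before censoring


def analysis_record(s):
    """Return (code, entry_age, exit_age) following the rules in the text."""
    if s.age_at_diagnosis is not None:
        if s.in_subcohort:
            # Code 1: sub-cohort member with incident prostate cancer.
            return 1, s.age_at_questionnaire, s.age_at_diagnosis
        # Code 2: incident case outside the sub-cohort.
        return 2, s.age_at_diagnosis - EPSILON, s.age_at_diagnosis
    if not s.in_subcohort:
        raise ValueError("non-cases outside the sub-cohort are not analyzed")
    # Code 0: sub-cohort member without disease; exit at death if it came first.
    exit_age = s.age_at_censoring
    if s.age_at_death is not None and s.age_at_death < exit_age:
        exit_age = s.age_at_death
    return 0, s.age_at_questionnaire, exit_age


# Hypothetical subjects.
alive = analysis_record(Subject(True, 60.0, 66.5))
died = analysis_record(Subject(True, 60.0, 66.5, age_at_death=64.0))
sub_case = analysis_record(Subject(True, 60.0, 66.5, age_at_diagnosis=63.0))
outside_case = analysis_record(Subject(False, 61.0, 67.5, age_at_diagnosis=66.0))
```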
The case-cohort analysis will be used for all studies of genetic susceptibility
to cancer within the MEC whenever possible.
IV. GRANT PROPOSAL: GENETIC SUSCEPTIBILITY TO PROSTATE
CANCER IN MINORITY POPULATION
A. Introduction
Cancer risk is influenced by multiple genetic and environmental factors.
However, these factors are not constant across all populations. For nearly a half-
century, epidemiologists have been intrigued by the international and racial-ethnic
variation in cancer risk. Cancer of the prostate exhibits some of the more interesting
aspects of this variability. For example, African-American men have about a
three-fold increased risk when compared to Latino men. While there is strong evidence
that this cancer has an important genetic component to its etiology, geographic
variation in risk among members of the same racial-ethnic group also supports an
important environmental component. In fact, the etiology of this cancer is most
likely due to a complex interplay between these genetic and environmental
components in risk. In 1993 we began a prospective study including 215,251
participants from four racial-ethnic groups in Los Angeles and Hawaii - primarily
Japanese and Whites in Hawaii and African-Americans and Latinos in Los Angeles
(the Hawaii/Los Angeles Multiethnic Cohort Study or MEC) - to evaluate the
dietary and other environmental contributions to this racial-ethnic variability in
cancer risk with a major focus on prostate cancer. In 1996, we began collecting
biological samples with the intent to study the genetic susceptibility to prostate
cancer in this multi-ethnic setting using a case-cohort study design. Our approach to
understanding genetic causes of prostate cancer using the MEC is that of candidate
genes, in which genes are selected for evaluation based on a priori biologic
hypotheses that a particular gene or family of genes is involved in a disease pathway.
It will clearly be necessary to comprehensively screen genes in many intricately
related pathways to find the variants that influence prostate cancer risk. This
challenge is increased by incomplete knowledge of the metabolic pathways
implicated in disease risk and the lack of a framework to determine which
polymorphisms are most likely to be functional. However, by using existing
epidemiologic and biochemical data to carefully select candidate genes and
accessing the increasing availability of genomic sequence data, dense single
nucleotide polymorphism (SNP) maps, and tools for high-throughput genome
analysis, these limitations may be overcome. The critical challenge for genetic
epidemiology is to merge these genomic resources with large patient collections as
typified by the MEC. To this end, we have established a collaboration with Joel
Hirschhorn and Eric Lander of the Whitehead Institute/MIT Center for Genome
Research. Our large population-based cohort and the extensive genomic and
technologic resources of the Whitehead Institute make possible this study of prostate
cancer susceptibility genes in largely understudied populations with highly variable
disease risk. By studying different populations, we have the opportunity to discover
risk factors that are common to multiple populations as well as risk factors that are
unique to an individual population. In this project, we focus on two groups that have
a high risk of prostate cancer but are usually left out of similar studies: African-
Americans and Latinos. In the MEC, African-Americans have the highest risk of
prostate cancer while Latinos have a three-fold lower risk, but a risk which is
comparable to non-Latino whites.
B. Specific Aims
We will collaborate with the Whitehead Institute to use novel technology to
search for promising variants in candidate genes in the androgen signaling pathway.
Specifically, we will examine genes involved in the biosynthesis of adrenal
androgens (androstenedione [A-dione], dehydroepiandrosterone [DHEA], DHEA-
Sulfate [DHEA-S]) and those responsible for the conversion of these weak
androgens into the more potent androgen, testosterone (T). We will comprehensively
search for variants in these genes that contribute to risk using the following
approach:
1) conduct high volume sequencing of the regulatory and coding regions
for polymorphic variation (SNPs) of six candidate genes in African-
American and Latino men presenting with advanced prostate cancer.
2) conduct high throughput genotyping of the sequence variants
discussed above as well as a selection of SNPs from the public
database (The SNP Consortium) in 2,000 African-American and
Latino prostate cancer cases and controls.
3) conduct case-cohort analysis of single variants and linkage
disequilibrium analysis using reconstructed ancestral haplotypes for
the SNPs genotyped in 2) in the African-American and Latino
prostate cancer cases and controls.
4) conduct genotyping of the most promising SNPs or haplotypes
identified in 3) above in a replication sample consisting of an
additional 2,000 African-American and Latino cases and controls.
5) create a data set that can be used to explore gene-gene and gene-
environment interactions in the African-American and Latino prostate
cancer cases and controls using the SNPs genotyped in 2) and
epidemiologic covariate data available from a comprehensive
questionnaire on lifestyle and dietary exposures through the MEC.
C. Background and Significance
Prostate cancer is the most common cancer and the second leading cause of
cancer-related deaths among men in the United States (U.S.). It is estimated that
there will be 180,400 new cases of prostate cancer and an accompanying 31,900
deaths in 2000 in the U.S. (American Cancer Society, 2000). The incidence of
prostate cancer in the U.S. increased dramatically between 1989 and 1992 as a result
of intensified screening efforts, but the rates subsequently declined and are
stabilizing now. The increase in rates coincided with the introduction of the prostate-
specific antigen (PSA) screening test that is widely employed in the U.S., thereby
resulting in earlier diagnosis for many men and possibly a decrease in mortality rates
(American Cancer Society, 2000).
There is evidence for both environmental and genetic components to prostate
cancer, but in general the etiology is poorly understood. Age and race/ethnicity are
two of the strongest risk factors for the disease. There is distinct variation in risk by
racial/ethnic group with African-Americans at highest risk of the disease. In the
MEC, African-Americans have the highest rates, followed by Latinos (table 18).
There is also clear geographic variation in risk and individuals who migrate to the
United States experience an increase in incidence that cannot be entirely attributed to
increased detection efforts. The mechanism through which race/ethnicity exerts its
effect is unclear, but it is likely that both genetics and the environment play a role.
Table 18. Number of Invasive Prostate Cancer Cases and Age-Adjusted
Incidence Rates from the Multiethnic Cohort in African-American and Latino
participants through 12/31/97.
Race/Ethnicity       Number of Cases   Age-Adjusted Incidence Rate*
African-Americans    622               1163.1
Latinos              428               450.4
* Incidence rates are truncated to ages 50-74 per 100,000 and age-adjusted to the
1970 standard population.
1. Genetic Nature of Prostate Cancer
The field of genetic epidemiology is still in its infancy, but genetic linkage
and association studies, combined with improved sequencing and genotyping
technology, have enabled us to move forward in studying complex diseases such as
prostate cancer. Genetic epidemiology provides ample evidence that many common
diseases are influenced by inherited factors. Prominent among these examples are
common cancers, such as prostate cancer, which has recently been demonstrated to
have a genetic component to risk (Lichtenstein et al., 2000). However, in the case of
a complex disease like prostate cancer, the pattern of inheritance does not follow
classical Mendelian ratios, indicating multifactorial causality, with multiple genetic
and environmental factors playing a role (Risch, 2000).
Genome-wide linkage scans have been performed for prostate cancer, and
only weak evidence for locus-specific linkage has been observed. Given the
documented power of linkage studies to detect single genes of high effect, these
weak or negative results strengthen the view that the genetic component of prostate
cancer risk is likely due to a combination of multiple weak genetic effects. Alleles
with modest effects on individual risk can nevertheless contribute significantly to the
population risk of disease, if they are common in the population. For example, the
ApoE4 variant causes only a three-fold increased risk of Alzheimer’s disease;
however, due to its population frequency of 17%, this single variant is thought to
explain up to half of the variation in Alzheimer’s disease risk in the general
population (Strittmatter & Roses, 1996). Interestingly, ApoE4 contributes much
more heavily to the population burden of disease than do the more powerful, but
rare, alleles identified through linkage studies. Thus, common alleles with weak
genetic effects may contribute significantly to the variation in disease risk across a
population.
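The contrast between common, weak alleles and rare, strong alleles can be illustrated with Levin's population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)). The sketch below is illustrative only. It treats the 17% figure as the proportion of the population exposed, and the rare-allele frequency and relative risk are hypothetical values chosen for contrast. It is not the calculation behind the Strittmatter & Roses estimate, which rests on different assumptions.

```python
def attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's population attributable fraction for a single binary exposure."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Common allele with a modest effect (frequency and risk taken from the text above,
# with the 17% figure treated as exposure prevalence, which is an assumption).
common = attributable_fraction(prevalence=0.17, relative_risk=3.0)

# Hypothetical rare allele with a strong effect (values assumed for illustration).
rare = attributable_fraction(prevalence=0.001, relative_risk=20.0)

print(f"common, weak allele: {common:.1%}")
print(f"rare, strong allele: {rare:.1%}")
```

Under these assumptions the common allele accounts for a far larger share of disease in the population than the rare one, even though its effect on individual risk is much smaller.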
Common alleles also represent the bulk of all human variation, due to the
limited nucleotide diversity in human populations. Existing human populations are
largely derived from a small founder population that rapidly expanded over the last
10,000-100,000 years. Accordingly, the bulk of gene diversity is expected to reside
in common variants that date to the original founder population; this expectation was
confirmed recently in the largest survey to date of human gene diversity (Cargill et
al., 1999). Specifically, these theoretical and experimental data indicate that the vast
majority of all allelic variation in the human population is attributable to a small
number of common alleles. Thus, common polymorphisms not only have a large
potential effect on prostate cancer risk in the general population but also account for
the vast majority of all gene diversity. Based on these lines of investigation, a key
challenge for human genetics is to develop tools and methodology that can reliably
detect common alleles that contribute to disease risk (Risch, 2000).
Association studies offer the greatest statistical power for detecting common
alleles that contribute to prostate cancer (Risch, 2000). Candidate gene studies,
those that examine genetic variation at genes of relevant biological function,
provide a valuable avenue for investigation. However, candidate gene methods have
traditionally been limited by slow and insensitive methods for identifying and typing
gene variants. Progress towards human gene identification - with the completion this
year of a draft copy of the Human Genome - plus the impending identification of
1,000,000 SNPs removes the first obstacle. Furthermore, high throughput genotyping
technologies make possible the use of these SNPs in disease studies. Since a role for
androgens in prostate cancer risk has long been hypothesized (Huggins & Hodges,
1941; Nobel, 1977), we suggest that genes involved in androgen signaling or
biosynthesis may be excellent candidates for this disease.
2. The Androgen Biosynthetic Pathway
The primary androgen in the circulation is T (Norman & Litwack, 1997). T is
produced in the testes in a complex feedback loop with the hypothalamic-pituitary
axis. Luteinizing hormone-releasing hormone (LHRH) from the hypothalamus binds
to receptors on the pituitary and stimulates the production and release of luteinizing
hormone (LH). LH then binds to its receptor on Leydig cells of the testes and
initiates the conversion of cholesterol to T. This pathway is shown in figure 9. T
then circulates in the blood, both freely and bound to sex-hormone binding globulin
(SHBG).
Although T is the most abundant and potent circulating androgen, it is not the
only one found in the blood. A-dione, DHEA, and DHEA-S, all weak androgens and
precursors to T, also circulate in substantial quantities. A-dione, DHEA, and DHEA-
S are synthesized in the adrenal and are secreted by the zona reticularis of the adrenal
cortex (see figure 9) (Norman & Litwack, 1997).
A-dione, DHEA, and DHEA-S enter the prostate where they are all converted
to T and then to dihydrotestosterone (DHT) by the enzyme 5α-reductase (encoded by
the SRD5A2 gene). DHT, and to a lesser extent T, bind to the androgen receptor
(AR). This ligand complex translocates to the cell nucleus where it transactivates
androgen-responsive genes, including those that stimulate cell division. It has been
estimated that 40% of intraprostatic T is derived from the adrenal precursors (Labrie,
1993; Labrie et al., 1985).
Figure 9. The pathways of steroid hormone synthesis in the testes and adrenals in humans.
[Pathway diagram; each step is labeled with a tissue code and an enzyme number.]
Cholesterol -> Pregnenolone -> Progesterone -> DOC -> Corticosterone -> 18OH-Corticosterone -> Aldosterone
17OH-Pregnenolone -> 17OH-Progesterone -> 11-Deoxycortisol -> Cortisol
DHEA -> Androstenedione -> Testosterone
Tissue Code: A=adrenals, T=testes
Enzymes: 1) P450scc, 2) 3β-HSD, 3) P450c21, 4-6) P450c11, 7-8) P450c17, 9) 17β-HSD
3. Adrenal Androgens and Prostate Cancer
The prostate is clearly an androgen dependent organ. The involvement of
androgens in prostate cancer progression was established more than 50 years ago,
when androgen ablative therapy was shown to be an effective treatment for
metastatic prostate cancer (Huggins & Hodges, 1941). In experimental studies, rats
develop prostate cancer after administration of large doses of T (Nobel, 1977). It is
generally thought that androgens exert their effects through increasing prostate-cell
division.
A primary focus of epidemiologic studies has been the relationship between
circulating androgen levels and prostate cancer risk. A number of cohort studies have
examined serum T levels. None of the studies found any significant
relationship, but Gann and colleagues (Gann et al., 1996) noted that T levels were
quite strongly correlated with sex hormone binding globulin (SHBG) levels and they
provided strong evidence that non-SHBG bound T (‘bioavailable’ T) may be related
to prostate cancer risk in a large prospective serologic study (2.3-fold increase in risk
between the upper and lower quartile).
Adrenal androgens have also been studied in relation to prostate cancer risk.
Lookingbill and colleagues, in a cross-sectional study of white American and
Chinese subjects, found that circulating DHEA-S and A-dione levels differed
significantly between the two groups (Lookingbill et al., 1991). The levels were
much lower in the Chinese subjects compared to the white subjects, which follows
the racial/ethnic variation in prostate cancer risk both in the United States and
internationally. In a more direct measurement of prostate cancer risk associated with
adrenal androgens, Barrett-Connor and co-workers found that in a cohort of upper
middle class white men followed for 14 years, A-dione levels were significantly
associated with risk of disease (Barrett-Connor et al., 1990). For each 1.17 nM
increase in androstenedione, the risk of prostate cancer increased 1.26-fold (95%
CI 1.04-1.54). Nomura and colleagues, in a cohort study of Japanese-American men,
also reported a slightly elevated, although non-significant, risk associated with
higher androstenedione levels, but there was no trend with increasing levels (Nomura
et al., 1996).
Although the adrenal androgens A-dione, DHEA, and DHEA-S are not as
potent as T, these three weak androgens are known to enter the prostate where the
enzymes necessary to convert them to the more potent androgen T are readily
available. This process has been described by Labrie and colleagues as an
“intracrine” function and offers a possible biologic mechanism as to how adrenal
androgens could increase prostate cancer risk (Labrie, 1993). The potential
importance of these three adrenal androgens is underscored by the finding that T
levels in the prostate are only reduced by approximately 60% following castration
(Labrie, 1993; Labrie et al., 1985). The source for T in the prostates of castrated men
is believed to be solely from the adrenal androgens that are locally converted. T can
then be converted into the most potent androgen DHT by the steroid 5-alpha
reductase type II enzyme with resultant binding to the AR.
4. Prostate Cancer Genetic Association Studies
The strong biologic hypothesis has led investigators including our research
team here at the USC/Norris Cancer Center to study candidate genes involved in the
androgen bioactivation pathway. This multifaceted pathway provides many excellent
candidate genes, including those involved in the biosynthesis of testicular androgens,
those involved in the production of adrenal androgens, and those involved in
androgen metabolism, binding, and catabolism in the prostate. In separate,
nonoverlapping studies, we are exploring the role of genes involved in synthesis of
testicular androgens, androgen binding and androgen catabolism in the prostate. The
focus of this project is on adrenal androgen biosynthesis and intra-prostatic
conversion of these adrenal steroid hormones to T.
Our approach to identify variants in relevant androgen-related candidate
genes has been highly successful to date. In our other studies, we have made a major
effort to identify polymorphisms in the steroid 5-alpha reductase type II (SRD5A2)
gene, which encodes the enzyme responsible for the conversion of T to DHT in the
prostate, based on the hypothesis that prostatic DHT levels are important and that
SRD5A2 activity is an important determinant of intraprostatic DHT levels. Our
laboratories have identified seven novel variants in SRD5A2 by conducting focused
sequencing in prostate cancer cases. We found that both African-American and
Latino men in the MEC who carried the Ala49Thr variant of this gene had a
significantly increased risk of advanced prostate cancer (OR=7.2 and 3.5,
respectively with attributable risks of about 10% in both African-Americans and
Latinos) (Makridakis et al., 1999). We also found that this variant was functionally
related to increased 5α-reductase activity (Makridakis et al., 1999), and most
strikingly that 30% of all prostate cancers carry Ala49Thr somatic mutations
(personal communication, Dr. J.K.V. Reichardt). Although we have not yet been able
to confirm the increased risk for Ala49Thr in Whites and Japanese in the MEC, two
case-control studies have reported finding T alleles only in cases and not in controls
in White and Chinese populations [personal communication JKV Reichardt].
Further, Jaffe and colleagues (Jaffe et al., 2000) found White prostate cancer cases
with the Ala49Thr germline variant to experience a poor prognosis compared to
cases without the variant. These findings show the potential importance of SNPs and
the benefit of sequencing prostate cancer cases to identify disease-relevant variants.
Genes that are particularly crucial to adrenal steroid biosynthesis can be
readily identified because severe mutations in these genes result in congenital
adrenal hyperplasia (CAH), usually as a result of decreased cortisol production.
Depending on which gene is affected, CAH leads to drastically increased or
decreased secretion of androgens, and boys with CAH can exhibit either precocious
puberty or undervirilization of external genitalia.
CAH with altered androgen levels most often results from autosomal
recessive inheritance of variants in the CYP21, CYP11B1, CYP17 and HSD3B2
genes (Norman & Litwack, 1997). CAH caused by a deficiency in enzymes encoded
by CYP21, CYP11B1, CYP17 and HSD3B2 leads to altered androstenedione and
DHEA circulating levels (Norman & Litwack, 1997). Individuals who are
heterozygous for some of the variants that lead to adrenal hyperplasia in the
homozygous state have also been documented to have increased levels of circulating
androgens (Knochenhauer et al., 1997). Thus, these four genes clearly play an
important role in androgen biosynthesis in the adrenals. It is possible that known
variants or as yet undiscovered variants in these four genes may result in increased
(or decreased) risk of prostate cancer through an alteration in adrenal biosynthesis of
androstenedione, DHEA, and/or DHEA-S.
At least two other genes are of importance in this pathway. As discussed
above, adrenal androgens enter the prostate where local conversion to T occurs. An
enzyme critical in this process, type 5 17β-hydroxysteroid dehydrogenase, encoded by
the gene HSD17B5, has been localized to the prostate and converts A-dione to T
(Dufort, Rheault, Huang, Soucy, & Luu-The, 1999; El-Alfy et al., 1999; Rheault,
Dufort, Soucy, & Luu-The, 1999).
The conversion of DHEA/DHEA-S to A-dione can occur not only within the adrenal
and gonad but also in other tissues, through the product of the HSD3B genes (El-Alfy
et al., 1999). Both of these genes are therefore of interest in exploring the role of
adrenal androgens in prostate cancer risk.
The CYP11B1, CYP17, CYP21, HSD3B1, and HSD3B2 genes were
included in the initial phase of a large-scale effort to characterize single-nucleotide
polymorphisms in the coding regions of human genes (Cargill et al., 1999). CYP21
was found to be the most polymorphic gene studied, although all five genes were
polymorphic in the populations screened. It remains to be determined if there are
variants in these genes associated with an increased or decreased risk of prostate
cancer.
CYP17 is a key gene in testicular and adrenal androgen biosynthesis. Two
investigators have published results on the association between a silent SNP in
CYP17 and prostate cancer risk. In one study, individuals who were TT for the T27C
variant were 1.61 times as likely to have prostate cancer (95% CI 1.02-2.53)
(Wadelius, Andersson, Johansson, Wadelius, & Rane, 1999). Lunn and colleagues,
however, reported that TC/CC individuals were at increased risk of prostate cancer
(Lunn et al., 1999). We have also explored this association and the results are
inconsistent across racial/ethnic groups (see preliminary results). One possibility is
that in some populations (but not in others), this silent polymorphism is in linkage
disequilibrium with a nearby causal variant (see below). Comprehensive evaluation
of this gene will be required to resolve this issue. These inconsistent findings thus
underscore the need to comprehensively sequence genes in this pathway and to
employ novel methods for analyzing the genotyping results.
5. Sequencing and Analysis
To be sure of detecting the effect of a variant in a gene, one can identify and
test the variant directly. Alternatively, one can identify the chromosomal context
surrounding the variant by constructing a haplotype using nearby markers in linkage
disequilibrium with the variant; testing this haplotype is a proxy for testing the
variant itself. With accumulating polymorphism data, it is increasingly clear that
haplotype structures are complex, with highly variable extent of linkage
disequilibrium (Clark et al., 1998; Nickerson et al., 1998). Furthermore,
investigation of gene regulation has demonstrated that regulatory sequences can be
found at significant distances from the start of transcription, and in introns of genes.
These combined observations have significant implication for association studies.
First, where haplotype association is strong, a positive association with a coding
region variant may, in fact, be reflecting linkage disequilibrium with a causal
regulatory variant some distance away. Conversely, where linkage disequilibrium is
weak, a very dense set of markers will be required. Second, a search for causal
variants cannot be considered comprehensive if it is limited to coding and near
upstream regions.
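The strength of linkage disequilibrium between a marker and a candidate variant is commonly summarized by D' and r². The following is a minimal sketch of how both can be computed from two-locus haplotype counts. The function name and the counts are hypothetical and are not data from this study.

```python
def ld_stats(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D', r^2) for two biallelic loci from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2
    d = n_AB / total - p_A * p_B  # deviation from independence

    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical counts in which one of the four possible haplotypes is never observed.
d, d_prime, r2 = ld_stats(n_AB=60, n_Ab=10, n_aB=0, n_ab=30)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

With these counts D' is at its maximum while r² is well below 1, so a marker can be in complete disequilibrium with a variant and still be an imperfect proxy for it. This is one reason a dense marker set is needed where disequilibrium is weak.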
To address these concerns, two types of approaches are suggested: either
identify the functionally relevant regions (and then screen them directly for
polymorphisms), or comprehensively identify haplotypes using a sufficiently dense
marker map. In the near future, both approaches should become feasible for the first
time on a large scale. To identify functional regulatory motifs, there is expanding
evidence that cross-species conservation can be used reliably. That is, non-coding
regions of high homology across species have a significant rate of being functional
regulatory motifs (Loots et al., 2000). With the impending final sequence of the
human and mouse genomes, this approach can be implemented during the span of
this grant. For haplotype-based approaches, a dense marker (SNP) map is required.
The SNP Consortium, a public-private partnership, will produce and release such a
map. At present, the map contains approximately 300,000 SNPs (one every 10,000
bp), and we project that there will be over 1,000,000 SNPs in the public domain
before this SPORE grant would begin. Thus, a map of arbitrary density (1 SNP per
3kb) should be available for nearly any gene of interest, and the process of
identifying the causal variant will be revolutionized in the span of this grant.
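The marker densities quoted above follow from simple arithmetic. The sketch below assumes a haploid genome of roughly 3 x 10^9 bp, which is an approximation introduced here and not a figure given in this section. The function name is hypothetical.

```python
GENOME_BP = 3.0e9  # assumed approximate haploid genome length in base pairs


def mean_spacing_bp(n_snps: int, genome_bp: float = GENOME_BP) -> float:
    """Average distance between SNPs if they were spread evenly over the genome."""
    return genome_bp / n_snps


for n_snps in (300_000, 1_000_000):
    print(f"{n_snps:>9,} SNPs: one every {mean_spacing_bp(n_snps):,.0f} bp")
```

Real SNP maps are not evenly spaced, so the density around any particular gene may be higher or lower than this average.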
To comprehensively evaluate allelic variation at relevant candidate genes will
require typing numerous polymorphisms. Furthermore, there are many models of
haplotype, gene-gene and gene-environment interactions that could potentially be
used to analyze the data. To avoid being misled by statistical fluctuations, it is
necessary to analyze the data in light of these multiple comparisons and arrive at
meaningful thresholds for evaluating any given single association and hypothesis for
further study. Also, ethnic stratification may be a cause of false-positive associations,
although the extent to which this occurs is debated. To be certain of the associations
we observe, it is desirable to control for this confounder. Using novel methods
pioneered by Pritchard and colleagues (Pritchard & Rosenberg, 1999), it is possible
to directly measure and adjust for stratification. We describe our approaches to these
issues below.
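One simple and conservative way to set a threshold in light of multiple comparisons is the Bonferroni correction, which divides the overall significance level by the number of tests. It is shown here only as an illustration of the problem and is not necessarily the approach adopted in this study. The p-values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def flag_significant(p_values, alpha: float = 0.05):
    """Mark each p-value as significant or not after Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values from three single-variant association tests.
p_values = [0.0004, 0.03, 0.2]
print(bonferroni_threshold(0.05, len(p_values)))
print(flag_significant(p_values))
```

The second p-value would pass an uncorrected 0.05 threshold but not the corrected one, which is the kind of statistical fluctuation the text warns against.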
In summary, the use of strong biologic hypotheses to select candidate genes
and the comprehensive evaluation of these genes is critical to successfully
identifying genetic variation associated with risk of prostate cancer. Further, large
numbers of cases and controls are needed to detect small effects and to study gene-
gene and gene-environment interactions. Our proposed study addresses both of these
issues.
D. Preliminary Results
1. Hawaii-Los Angeles Multiethnic Cohort Study of Cancer and Diet
The Hawaii-Los Angeles Multi-ethnic Cohort Study of Diet and Cancer
(MEC) has successfully recruited and followed more than 200,000 men and women
in Hawaii and Los Angeles since 1993. A description of the cohort and how it was
established can be found under “Research Design and Methods” below.
a. Cases Identified and Specimens Collected
Collection of blood and urine samples from incident cases of prostate cancer,
and a random sample of cohort members to serve as controls, began in Los Angeles
in July 1995 through funding obtained for U01 CA063464. We commenced
collection of samples from cohort members residing outside Los Angeles County but
within the state of California in July 1997 following our first linkage with the CCR.
Although sample collection in other areas of California commenced 1-2 years after
collection commenced in Los Angeles County, incident cancer cases occurring since
January 1, 1995 were contacted and samples collected. Additionally, all cancer cases
who subsequently move out-of-state after their cancer diagnosis are contacted and
asked to provide a blood specimen through their doctor’s office or nearby clinic
using our “mail-in” kit (supplies, instructions, and a pre-paid overnight delivery
mailer).
As of March 1, 2000 we had identified 1,187 individuals with cancer of the
prostate among African-American and Latino participants. We have collected blood
and urine samples, by personal home visit, from 798 (67%) of these patients.
We have also selected a random sample of participating cohort members to
serve as cohort controls. Race as reported on the questionnaire was used to determine
ethnicity and all ages were included. The first sample included 600 males from each
group, Latinos and African-Americans, for a pool of 1,200 controls. Every selected cohort member
was then randomly assigned an ‘order number’ that determines the order in which we
contact subjects for blood collection. As of March 1, 2000 we had contacted 3,247
African-American and Latino male cohort controls to obtain blood samples (table
19).
Table 19. Sample Collection from Incident Prostate Cancer Cases and Cohort
Controls in Los Angeles, as of March 1, 2000.
                                 AA*      LA*      Total
Prostate Cancer Cases
  Eligible for collection        667      520      1,187
  Sample collected               440      358      798
  Refused                        142      76       218
  Unable to reach or pending     85       86       171
  Participation rate - %         66       69       67
Cohort Controls
  Total contacted or attempted   2,376    871      3,247
  Sample collected               1,657    525      2,182
  Refused                        539      170      709
  Unable to reach or pending     180      176      356
  Participation rate - %         70       60       67
*AA=African-American and LA=Latino
(These numbers include our ongoing efforts to collect samples from a broader
sample of African-American males aged 65+ who reside in Los Angeles funded by a
two-year NIH supplement to this U01 grant; thus the number of samples collected
from African-American males is considerably greater than the number of samples for
Latinos.) We have collected samples from 2,182 cohort controls representing an
overall participation rate of 70% among African-American men and 60% among
Latinos.
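The participation rates in table 19 can be reproduced from the counts, taking the rate as samples collected divided by all subjects eligible or contacted. The sketch below transcribes the table. The dictionary layout and function names are illustrative choices, not part of the study's data management.

```python
# Counts transcribed from table 19 (AA = African-American, LA = Latino).
TABLE_19 = {
    "cases": {
        "AA": {"collected": 440, "refused": 142, "unreached_or_pending": 85},
        "LA": {"collected": 358, "refused": 76, "unreached_or_pending": 86},
    },
    "controls": {
        "AA": {"collected": 1657, "refused": 539, "unreached_or_pending": 180},
        "LA": {"collected": 525, "refused": 170, "unreached_or_pending": 176},
    },
}


def participation_rate(counts):
    """Samples collected divided by all subjects eligible or contacted."""
    return counts["collected"] / sum(counts.values())


def pooled(groups):
    """Sum each outcome count across the racial/ethnic groups."""
    keys = ("collected", "refused", "unreached_or_pending")
    return {key: sum(group[key] for group in groups.values()) for key in keys}


for series, groups in TABLE_19.items():
    for label, counts in {**groups, "Total": pooled(groups)}.items():
        print(f"{series:8s} {label:5s} {participation_rate(counts):.0%}")
```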
b. SRD5A2
As discussed above, we initiated focused sequencing of the SRD5A2 gene by
using prostate cancer cases from multiple racial/ethnic groups. The goal of this
approach was to increase the power to detect variants less common in the general
population, but more common in affected individuals. This approach was successful
and we identified seven novel variants. We have characterized several of these
SRD5A2 enzyme variants biochemically in vitro (Makridakis et al., 1997). Major
progress has been made in the epidemiologic and biochemical characterization of
both the Val89Leu and the Ala49Thr variants. The Val89Leu variant is associated with an
alteration in the mean circulating level of 3α-androstanediol glucuronide in
Latinos, Japanese, and Whites. There is no such effect in African-Americans,
although the number of LL individuals is small (~5%). However, we have found no
consistent effect of Val89Leu genotype on prostate cancer risk in any of the four
racial-ethnic groups (data not shown).
In contrast, the Ala49Thr genetic variant has demonstrated a substantial
relationship between genotype and prostate cancer risk among African-Americans
and Latino men; individuals with a T variant have a significantly increased risk of
prostate cancer (Makridakis et al., 1999). This genetic variant significantly increases
the risk of prostate cancer about 8-fold in African-American men and about 4-fold in
Latino men. The population-attributable risk for this variant is about 9% in both
populations. The Ala49Thr variant increases the in vitro Vmax for the mutant enzyme
about 5-fold. The Ki for the 5-alpha reductase inhibitor and proposed
chemopreventive agent finasteride (“Proscar”) is increased about 13-fold in the
Ala49Thr variant enzyme (Makridakis et al., 1999). So it is likely that finasteride
would have to be administered at a much higher dose in men with this variant to
substantially affect DHT levels.
2. Whitehead Institute/MIT Center for Genome Research
The Whitehead Institute/MIT Center for Genome Research has long been a
leading contributor to the analysis of human genome structure and variation, as well
as their application to genetic studies of simple and complex traits. The Whitehead
has contributed to the development of a wide range of SNP resources. First, an initial
SNP map of the human genome was generated by screening 21,322 STSs by
sequencing and Variant Detector Array (VDA) (Wang et al., 1998). This screen
represented over 2.5 Mb of DNA, and yielded 3,467 SNPs. Second, a large-scale
survey of coding region variation involving 106 genes in over 100 chromosomes,
resulted in the discovery of 565 cSNPs (Cargill et al., 1999). This survey provided
the first broad overview of human coding region variation, revealing the small
number (approximately four) of coding region changes that exist at high frequency
(>1%) in the typical coding region. This effort has continued, with hundreds of genes
screened and thousands of polymorphisms identified. Finally, the Reduced
Representation Shotgun (RRS) method employed by the SNP Consortium was
described by the Co-Investigator of the Whitehead Institute component of this
project (Altshuler et al., 2000b), and has been used at the Whitehead Institute to
discover over 47,000 human SNPs. In total, over 150,000 SNPs have been reported
(using RRS) by The SNP Consortium, and our current expectation is that
approximately 1,000,000 SNPs (one SNP every 3kb) will be placed in public
databases by The SNP Consortium within the year. This effort makes possible the
haplotype-based association studies described in section E. We have already
sequenced five of the six genes proposed for this study as described in table 20. This
sequencing was not, however, focused on prostate cancer cases and therefore it was
not adequately powered to detect variants common only in affected individuals or
only in individual racial/ethnic groups.
Table 20. Relevant Genes and Discovered SNPs Screened by the Whitehead
Institute.
Gene       No. Synonymous    No. Non-synonymous    Total
           Polymorphisms     Polymorphisms
CYP11B1    7                 7                     14
CYP17      3                 0                     3
CYP21      7                 7                     14
HSD3B1     3                 2                     5
HSD3B2     1                 1                     2
3. Characterization of Genotyping Methods
To explore the potential impact of this wealth of human polymorphism data,
new methods are required for high-throughput SNP genotyping. Multiple methods
have been tested at the Whitehead, with criteria including ease of assay development,
accuracy, cost, and throughput. We assessed a number of complementary methods,
including intensive evaluations of Taqman, allele-specific PCR amplification (ASA),
mass-spectroscopy, and single base extension (SBE). After evaluating these
methods, we felt that SBE offered the best combination of flexibility, robustness and
cost-effectiveness for high-throughput genotyping.
SBE involves template dependent, primer directed incorporation of labeled
dideoxynucleotides. In this proposal, the template is PCR-amplified material from
patient DNA. The SBE primer ends one base before the polymorphic position, and
since dideoxynucleotides cannot be extended beyond that one base, only the
complement of the polymorphic position is incorporated. Each dideoxynucleotide is
uniquely labeled with a fluorescent dye or other label, and thus genotype
determination is reduced to detecting the type of label incorporated into the SBE
primer. Previously described or proposed detection methods include fluorescence
resonance energy transfer (SBE-FRET) (Chen, Zehnbauer, Gnirke, & Kwok, 1997),
fluorescence polarization (SBE-FP) (Chen, Levine, & Kwok, 1999), array
hybridization (SBE-TAGS) (Hirschhorn et al., 2000), and length-multiplexing
(LM-SBE) (Lindblad-Toh et al., 2000). SBE-TAGS and LM-SBE were developed at the
Whitehead Institute.
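The logic of genotype determination by SBE can be sketched in a few lines: the labeled dideoxynucleotide incorporated at the end of the primer is the complement of the template base at the polymorphic position, so the set of labels observed identifies the genotype. The function names and the representation of a sample as a pair of template bases are illustrative assumptions, and the sketch ignores signal intensity and noise.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def incorporated_labels(template_alleles):
    """Labeled ddNTPs incorporated opposite the polymorphic base.

    template_alleles holds the base at the polymorphic position on each of the
    two template copies (one per chromosome) present in the PCR product.
    """
    return {COMPLEMENT[base] for base in template_alleles}


def call_genotype(template_alleles):
    """One label detected means homozygous; two labels mean heterozygous."""
    labels = sorted(incorporated_labels(template_alleles))
    kind = "homozygous" if len(labels) == 1 else "heterozygous"
    return kind, "/".join(labels)


print(call_genotype(("G", "G")))
print(call_genotype(("G", "T")))
```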
We evaluated these methods and determined that LM-SBE offers significant
advantages in cost, speed and laboratory integration, and has thus become the
method of choice for our high-throughput projects. LM-SBE introduces a crucial
element of parallelism by simultaneously amplifying and genotyping 12-18 SNPs in
each reaction; the individual genotypes are then de-multiplexed and detected using
length separation and simultaneous fluorescence detection on capillary sequencers.
Specifically, SBE primers are designed at a range of lengths, varying from 18-50
bases in four base-pair intervals. Furthermore, each length can be used twice, since
two SNPs can be selected with non-overlapping base composition (for example, G/T
and C/A). In this way, pools of compatible loci are constructed. PCR primers are
designed for each locus, and amplification performed in a reaction containing each of
the pool's loci. After a clean-up step to remove unincorporated primers and
remaining nucleotides, a mix of the SBE primers is added and 12-18 simultaneous
SBE reactions performed. Finally, these products are loaded on automated
sequencers and the fluorescence at each size position is recorded.
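The pooling rule described above (primer lengths from 18 to 50 bases in steps of four, each length shared by at most two SNPs whose allele pairs do not overlap) can be sketched as a simple greedy assignment. This is an illustration of the constraint, not the design software actually used. The SNP names are hypothetical.

```python
PRIMER_LENGTHS = list(range(18, 51, 4))  # 18, 22, ..., 50 bases


def build_pool(snps):
    """Assign SNPs to primer-length slots.

    snps is a list of (name, alleles) pairs, where alleles is a two-letter
    string such as "GT". A slot accepts a second SNP only if its alleles do
    not overlap those of the SNP already assigned to that length.
    """
    slots = {length: [] for length in PRIMER_LENGTHS}
    unplaced = []
    for name, alleles in snps:
        for occupants in slots.values():
            compatible = all(set(alleles).isdisjoint(other) for _, other in occupants)
            if len(occupants) < 2 and compatible:
                occupants.append((name, alleles))
                break
        else:
            unplaced.append(name)  # no slot left for this SNP in this pool
    return slots, unplaced


# Hypothetical SNPs for illustration.
slots, unplaced = build_pool([("snp1", "GT"), ("snp2", "CA"), ("snp3", "GT")])
print(len(PRIMER_LENGTHS) * 2, "SNPs at most per reaction")
print(slots[18], slots[22], unplaced)
```

Nine lengths with two SNPs each gives a ceiling of 18 loci per reaction, which matches the upper end of the 12-18 range quoted for LM-SBE.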
Using LM-SBE in pilot testing, over 75,000 genotypes were performed on
parent-offspring trios, with a genotype success rate of over 90%. Accuracy was
determined by systematic search for segregation errors: the calculated error rate was
<1%. Thus, LM-SBE is a reliable and accurate genotyping method.
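A search for segregation errors in parent-offspring trios amounts to checking whether each child genotype could have been formed from one allele of each parent. The sketch below shows the check for a single SNP. The function names and trio genotypes are hypothetical, and the check detects only those genotyping errors that produce a Mendelian inconsistency, so the observed rate is a lower bound on the raw error rate.

```python
from itertools import product


def is_mendelian(father: str, mother: str, child: str) -> bool:
    """True if the child genotype can be formed from one allele of each parent."""
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def segregation_error_rate(trios) -> float:
    """Fraction of (father, mother, child) genotype triples that violate segregation."""
    errors = sum(not is_mendelian(*trio) for trio in trios)
    return errors / len(trios)


# Hypothetical trio genotypes at one SNP; the last triple is inconsistent.
trios = [("AA", "AG", "AG"), ("AG", "AG", "GG"), ("AA", "GG", "GA"), ("AA", "AA", "AG")]
print(segregation_error_rate(trios))
```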
4. USC/Norris Cancer Center and Whitehead Institute Pilot Study
In order to ensure that a cross-country collaboration involving biologic
specimens was feasible on an ongoing basis we initiated a pilot study. During the
past year and one-half we have evaluated the quality of DNA available from the
MEC, established a shipping protocol for specimens, conducted genotyping on a
small scale, and submitted proposals for studies of genetic susceptibility to cancer.
The investigators on this project have worked together closely and formed a good
and productive working relationship. Funding has been sought and received to study
breast cancer and ovarian cancer. We have also submitted a proposal to explore
genes involved in testicular androgen biosynthesis as well as genes responsible for
the metabolism of T in the prostate and for DHT binding and catabolism. The current
proposal offers an additional hypothesis for a role of adrenal androgens by exploring
genes active in the adrenals and the prostate. In our pilot study we examined the
CYP17 gene, which is relevant for both adrenal and testicular androgen biosynthesis.
During the Whitehead Institute gene characterization study they identified two novel,
common variants (and one rare variant) in this gene (see table 20). We genotyped
MEC prostate cancer cases and controls for these two novel, common variants, as
well as the variant previously reported in the literature (see table 21).
Table 21. Odds Ratios and 95% Confidence Intervals for Genotype and
Prostate Cancer Risk by Racial/Ethnic Group.
                      African-American        Latino
Gene/Variant          OR      95% CI          OR      95% CI
CYP17/T27C
  Wild type           1.00                    1.00
  Heterozygote        1.26    0.74-2.14       1.30    0.75-2.26
  Mutant              0.90    0.41-2.00       0.78    0.38-1.61
CYP17/u1
  Wild type           1.00                    1.00
  Heterozygote        1.23    0.72-2.08       1.47    0.85-2.55
  Mutant              0.78    0.35-1.78       0.87    0.43-1.76
CYP17/u2
  Wild type           1.00                    1.00
  Heterozygote        1.10    0.63-1.94       1.42    0.81-2.49
  Mutant              0.71    0.30-1.68       1.15    0.56-2.39
These three variants are in tight linkage disequilibrium and are all silent
variants. The increased risk among heterozygotes is consistent across the
racial/ethnic groups, but the meaning of these results is unclear. Further, the
functional relevance of such silent polymorphisms is unknown, but this work
provides “proof of principle” that this type of approach can be utilized effectively.
The Whitehead Institute sequencing project did not find any missense SNPs;
however, we believe that by sequencing targeted coding and
regulatory regions in advanced prostate cancer cases in each racial/ethnic group we
are likely to find polymorphisms of probable functional relevance not only in this
gene, but in the five other genes as well. Further, the introduction of SNPs from The
SNP Consortium will make possible the development of SNP maps for the six genes
under study in this project.
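For readers who want to see how an odds ratio and 95% confidence interval of the kind reported in Table 21 can be derived from genotype counts, the following Python sketch uses the unadjusted log (Woolf) method. The counts are hypothetical, because the underlying counts are not given in the text, and the estimates in Table 21 may well come from an adjusted regression model rather than from this formula.

```python
import math

def odds_ratio_ci(case_variant, control_variant, case_ref, control_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    The reference category (here, wild type) has an odds ratio of 1.00 by definition.
    """
    odds_ratio = (case_variant * control_ref) / (control_variant * case_ref)
    se_log = math.sqrt(1 / case_variant + 1 / control_variant
                       + 1 / case_ref + 1 / control_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: heterozygotes versus wild type in cases and controls.
or_het, lo_het, hi_het = odds_ratio_ci(case_variant=60, control_variant=50,
                                       case_ref=40, control_ref=40)
print(f"OR = {or_het:.2f}, 95% CI {lo_het:.2f}-{hi_het:.2f}")
```

With counts of this size the interval spans 1.0, which mirrors the pattern in Table 21: a modest point estimate whose interval does not exclude the null.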
E. Research Design and Methods
We are proposing a case-cohort study to explore the association between
variation in genes involved in androgen signaling via adrenal androgen production
and risk of prostate cancer in a large sample of African-American and Latino men.
This study has two major components: collection of biologic samples and genetic
analysis of prostate cancer candidate genes. Biologic specimens will be collected
from participants in the MEC. These samples will be added to the already existing pool
of samples from the same cohort study. We will generate a comprehensive SNP map
for each of the six prostate cancer candidate genes under study by sequencing coding
and regulatory regions in advanced cases, as well as utilizing SNP information from
public sources. We will then carry out genetic association studies using both single
markers and linkage disequilibrium analysis with haplotype reconstruction in
a case-cohort analysis of 2,000 participants. We will then conduct follow-up studies
of the promising results using an additional 2,000 African-American and Latino
prostate cancer cases and controls for whom samples have been collected as part of
this project. We will also be able to evaluate gene-gene and gene-environment
interactions, as we will have a total of 4,000 African-American and
Latino cases and controls for analysis. This study population of African-Americans
and Latinos will represent the largest pool of samples collected to date for these two
groups, who are underrepresented in research despite their disproportionately high risk
of prostate cancer.
1. Description of the Multiethnic Cohort
The Hawaii-Los Angeles Multiethnic Cohort Study of Diet and Cancer
(MEC) (R01 CA54281-06: Laurence N. Kolonel, Principal Investigator) is in its
seventh year of follow-up. The design and implementation of this large cohort study
in the populations of Hawaii and Los Angeles has been described in detail elsewhere
(Kolonel et al., 2000; Stram et al., 2000). Briefly, participants entered the cohort
from 1993 to 1996 by completing a 26-page, self-administered mail questionnaire
that asked about the normal intake of approximately 180 foods and beverages. The
dietary component comprises the major portion of the questionnaire and covers all
the major sources of nutrients for each of the ethnic populations. In addition, the
questionnaire included information on demographic factors (including ethnicity,
education, and migrant status), personal behaviors (smoking, solar exposure, and
physical activity), history of prior medical conditions (e.g., high blood pressure,
heart attack or angina, diabetes, and stroke), use of medications, family history of
common cancers, and, for women, reproductive history and the use of oral
contraceptives and hormone replacement therapy.
The primary goal of the study was to establish a broadly representative
multiethnic cohort comprised of African-Americans, Latinos, Japanese and Whites
for long-term study of the relationship of dietary and other environmental risk factors
for cancer and other chronic diseases. We used the Department of Motor Vehicles
drivers’ license file as the primary source of potential subjects. An additional source
of African-Americans, added after the study’s inception, was the Health Care
Financing Administration (HCFA). The HCFA database contains a race designation,
which made it possible to recruit African-Americans (aged 65+) from other California
counties. We enrolled approximately 23% of our African-American study
sample from the southern California counties of San Diego, Riverside, San
Bernardino, and Orange and the northern California counties of Alameda, Contra
Costa, San Mateo, and San Francisco. Hereinafter, reference to the Los Angeles
Cohort includes those subjects who resided throughout California at time of entry
into the study. In 1996, we successfully accomplished our primary goal with the
establishment of a cohort of 215,251 men and women who are being followed for
cancer and other disease outcomes. The distribution of the cohort, by age and sex, is
shown below (table 22). Hawaiians in Hawaii make up the majority of the ‘Other’
category.
Table 22. Distribution of the Cohort by Age, Sex, and Ethnicity, Hawaii and
Los Angeles, 1993-1996.
                                    Age Group (Years)*
                          45-54             55-64             65-75
Ethnicity                No.      %        No.      %        No.      %       Total
Males
  African-American      2,899   22.6      3,503   27.3      6,449   50.2     12,851
  Latino                6,376   27.9      9,453   41.4      6,989   30.6     22,818
  Japanese              8,065   29.9      7,667   28.4     11,232   41.7     26,964
  White                 8,718   38.1      6,672   29.2      7,467   32.7     22,857
  Other                 4,523   40.0      3,770   33.3      3,027   26.7     11,320
  Total                30,581            31,065            35,164            96,810
Females
  African-American      6,136   27.6      6,315   28.4      9,805   44.1     22,256
  Latino                7,767   31.5     10,481   42.6      6,372   25.9     24,620
  Japanese              8,809   29.4      9,299   31.0     11,849   39.6     29,957
  White                10,216   38.5      7,937   29.9      8,349   31.5     26,502
  Other                 6,702   44.4      4,921   32.6      3,483   23.1     15,106
  Total                39,630            38,953            39,858           118,441
*Age at baseline (completion of the quantitative food frequency questionnaire in
1993-1996).
A goal of the current funding cycle is to begin data analyses of the
diet/cancer hypotheses for the common cancers in the cohort (i.e., prostate, breast,
colorectal and lung). We recently commenced active follow-up of the cohort by the
administration of a brief questionnaire to obtain information about recent illnesses,
current use of vitamin supplements, and family history of cancer and other disease
outcomes. Passive follow-up is being accomplished by computer linkage with the
population-based SEER cancer registries in Los Angeles and Hawaii. In addition, we
link annually to the statewide California Tumor Registry, to state death files, to the
Social Security Death Index, and to the State of California Office of Statewide Health
Planning and Development hospital discharge database. This latter linkage provides
hospital discharge abstract records (all diagnoses and procedures for each hospital
admission since 1991) for all California hospitals except City of Hope and VA
hospitals. We are well positioned to follow this cohort for all disease outcomes as
well as overall mortality because we have the social security number, a critical identifier, for
98.7% of our entire cohort. Further, we estimate that fewer than 3% of Los Angeles
cohort members have migrated from the state of California since the study’s
inception in 1993. The proposed study will include only African-American and
Latino men from the Los Angeles portion of the cohort. These two groups represent
largely understudied populations who are at varying, but high risk of prostate cancer.
Only the Los Angeles component of this cohort as it pertains to prostate cancer will
be discussed throughout the remaining text of this proposal.
2. Identification of Incident Cancer Cases
The Los Angeles County Cancer Surveillance Program (CSP), administered
by the USC/Norris Comprehensive Cancer Center, ascertains all cancer diagnoses
among residents of Los Angeles County. It is a member of the statewide population-
based cancer surveillance system, the California Cancer Registry (CCR). In 1992,
the CSP became a National Cancer Institute-funded Surveillance, Epidemiology, and
End Results (SEER) registry. Four times per year, we match the Los Angeles
component of the MEC with the CSP master data file using a software program
which conducts a probabilistic data linkage using the social security number, last
name, first name, date of birth, and gender.
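The linkage program itself is not named in the text. As an illustration only, the sketch below scores a pair of records with Fellegi-Sunter style agreement weights, one common formulation of probabilistic linkage. The m- and u-probabilities (the chance that a field agrees among true matches and among non-matches), the field names, and the records are all invented for the example.

```python
import math

# Hypothetical (m, u) probabilities: m = P(field agrees | same person),
# u = P(field agrees | different people). Values are illustrative only.
FIELD_PROBS = {
    "ssn": (0.95, 0.0001),
    "last_name": (0.90, 0.01),
    "first_name": (0.85, 0.02),
    "birth_date": (0.95, 0.001),
    "gender": (0.99, 0.5),
}

def match_weight(rec_a, rec_b):
    """Sum of log2 agreement/disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

cohort = {"ssn": "123-45-6789", "last_name": "GARCIA", "first_name": "JOSE",
          "birth_date": "1935-04-12", "gender": "M"}
registry_same = dict(cohort, first_name="JOE")  # same person, first name recorded differently
registry_other = {"ssn": "987-65-4321", "last_name": "GARCIA", "first_name": "LUIS",
                  "birth_date": "1941-09-30", "gender": "M"}

print(match_weight(cohort, registry_same) > match_weight(cohort, registry_other))  # True
```

In practice a threshold on the total weight separates links, possible links needing manual review, and non-links; submitting alternate identifiers for the same person simply adds more candidate pairs to score.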
Once per year we link our cohort with the CCR. We submit multiple records
per individual if there are alternate social security numbers, alternate dates of birth,
or alias names in our tracking database. Our most recent linkage was completed in
March 2000. We have ascertained 100% of the 1993-1997 expected cases based on
local race- and age-specific rates.
3. Obtaining Biological Samples (2001-2006)
The quarterly CSP linkages and the annual CCR linkage identify incident
cancer cases within the Los Angeles component of the MEC. The system has proven
to be effective with respect to prostate cancer as the survival rates of this cancer are
high. In this proposal, we plan to collect additional biologic specimens
from African-Americans and Latinos. The number of projected samples to be
identified and collected from cases of invasive prostate cancer during this time
period is shown in table 23. These projections are based upon incidence rates
calculated within the cohort using approximately 5 years of follow-up data. We are
proposing to collect samples from 777 cases and an equal number of cohort controls (475 of whom
will be Latino) through SPORE funding for a total of 1,554 samples (777 x 2). This
will result in a total study population of 4,000 cases and controls (1,223 case samples
available starting 7/1/01 with an additional 777 cases to be collected over the study
period and an equal number of controls for a total study population of 4,000
individuals). The exact methods for the collection and processing of these samples
are described below.
Table 23. Projected Blood Sample Collection from Invasive Prostate Cancer
Cases, Los Angeles (1993-2006).
                    Projected No.       Cases to be Collected
                    Cases Collected     in the Current Proposal    Total No. Cases
Race/Ethnicity      to 6/30/01          (7/1/01-6/30/06)           (1993-2006)
African-American      711                 289                      1,000
Latino                512                 488                      1,000
Total               1,223                 777                      2,000
a. Collection of Samples from Prostate Cancer Cases and Controls
We have used a home visit approach for the collection of blood samples in
Los Angeles and based on our current success rates we believe this is the best
method for our study population. We cite the following reasons: (1) Los Angeles
County encompasses a very large geographic area and our study population resides
throughout the entire area; (2) the Latino and African-American populations are not
easy populations to recruit into a study and we want to make blood collection as
simple and convenient as possible to obtain the highest participation rates; (3) many
of our Latino cohort members are first-generation immigrants, Spanish-speaking, not
well educated, and likely to have no ongoing motivation for participating in a health
study; (4) many of our African-American cohort members are elderly and may not be
driving or be able to utilize public transportation (25% will be age 75+ in the year
2001); and (5) we have not been able to identify a suitable and reliable alternative
approach, such as the centralized laboratory system currently in use for the MEC in
Hawaii.
Blood collection will be conducted using a team approach. The team is
comprised of a recruiter, a phlebotomist, a project assistant and a data programmer.
This approach has proven to be efficient, largely error free and requires little
supervisory oversight or logistical coordination between personnel. The phlebotomist
maintains his own appointment schedule, obtains the necessary collection forms
from the files, collects blood and urine from the participant, and fills out a short
questionnaire with the participant. He processes the samples in the laboratory
following stringent guidelines, stores them in liquid nitrogen, and records processing
times and storage location on the collection forms. The phlebotomist will also collect
samples from the cohort members who reside in the San Francisco/Oakland area and
in San Diego.
The recruiter will call back and attempt to schedule blood collection
appointments for subjects who were unable to schedule an appointment, or were uncertain
about participating, when first contacted by the phlebotomist.
This person will describe the study, answer all participant questions, and must be
able to persuade reluctant cohort members to participate without any coercive tactics.
Our experience has been that this position requires extensive time on the telephone
speaking with cohort members. She will schedule an appointment for the
phlebotomist. We have found this system to yield higher response rates than if the
same phlebotomist makes a follow-up call. The project assistant will provide all
administrative support for the phlebotomist and recruiter. In brief, she is responsible
for sending out informational letters, printing all collection forms, tracking down
addresses and phone numbers, answering incoming phone calls, entering data from
the collection form into the computer database, and printing summary reports. The
programmer will enter the data collected by the phlebotomist as well as provide
overall data management as described below.
b. Processing and Storage of Blood Specimens
A detailed protocol for proper blood specimen collection, processing, and
storage has been developed. Forty ml of blood will be collected by venipuncture
(three 10 ml green top (heparinized) vacutainers and one 10 ml red top (clot)
vacutainer) from overnight fasting participants. Blood is collected first in the red top
and then the green top vacutainers. The tubes are protected from light by wrapping
with aluminum foil and then placed immediately in a styrofoam container with
frozen ice packs. The time the subject last had anything to eat or drink, other than
water, is recorded on the specimen collection form. All samples collected in Los
Angeles are processed within 6 hours of collection. This time restriction does not
apply to specimens received through the mail. This will be possible because travel
time from collection site to the laboratory at the Norris Cancer Center is not more
than 114 hours in any direction. To assure adherence to this schedule, the time of
specimen collection and the time that processing ends will be recorded for each
sample and the interval will be monitored.
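The interval monitoring described above amounts to a simple check on two recorded times. The Python sketch below shows one way it could be done; the record layout, the sample labels, and the function name are assumptions made for illustration and do not describe the study's actual tracking system.

```python
from datetime import datetime, timedelta

MAX_INTERVAL = timedelta(hours=6)  # processing limit stated in the protocol

def late_samples(records):
    """Return IDs of samples whose collection-to-processing interval exceeds the limit.

    records: iterable of (sample_id, collected_at, processing_ended_at).
    """
    return [sample_id for sample_id, collected, ended in records
            if ended - collected > MAX_INTERVAL]

# Hypothetical records.
records = [
    ("LA0100001", datetime(2001, 7, 2, 8, 30), datetime(2001, 7, 2, 12, 45)),  # 4 h 15 min
    ("LA0100002", datetime(2001, 7, 2, 9, 0), datetime(2001, 7, 2, 15, 30)),   # 6 h 30 min
]
print(late_samples(records))  # ['LA0100002']
```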
For processing, we utilize an automated specimen component dispensing
machine (the Cryo-Bio System) developed in collaboration with the International
Agency for Research on Cancer for use in the multicenter European (EPIC) cohort
study. Four blood components (serum, plasma, red blood cells, and white blood
cells) are aliquoted into “straws” of 0.5 ml volume. Color-coded PVC jackets, fitted
tightly around the straws, identify different blood components. Each of the jackets is
pre-labeled with a code indicating the study site (LA01 for Los Angeles) followed by
a sequential 5-digit number. After centrifugation at 2,500 rpm, the blood fractions
are transferred manually to glass test tubes. The buffy coat of white blood cells is
diluted 1:8 with PBS without added magnesium and calcium. Each component is
then automatically dispensed into straws by aspiration. A set of 42 straws is filled
for each subject (8 serum, 14 plasma, 6 RBC, and 14 WBC). The filled straws are
sorted into three sets of 14, pre-frozen to -80°C for 30 minutes, and then stored in
liquid nitrogen until needed.
Samples will be shipped to our colleagues at the Whitehead Institute on dry
ice by airfreight. We have extensive experience in shipping biological samples by
air. The services of a reliable airfreight company are engaged to guarantee
appropriate handling to and from Los Angeles International Airport and onboard
departing flights.
c. DNA Extraction and Storage
We plan to extract DNA from one (1) straw for every MEC participant from
whom we collect a blood sample. DNA extraction can be efficiently accomplished
on an ongoing basis using the Qiagen 96 DNA Blood Biorobotic Kit (or equivalent).
This approach will maximize the availability of the cohort “control pool.” The
purified DNA will be adjusted into a pair of 1.5 ml Eppendorf tubes after being
suspended in 300 µl of hydration solution. The extracted DNA will be stored at
-20°C. The use of the biorobot will simplify the subsequent genotyping as small
aliquots of the extracted DNA can be dispensed directly into 96-well plates. We will
ship 2,000 samples at the beginning of the study and the additional 2,000 samples
during year 5.
d. Data Management
We currently have a comprehensive and well-tested system of computerized
data management in place to track specimen collection, storage, and distribution. In
addition to the unique specimen ID printed on every straw, extracted DNA from an
individual is assigned a second specimen ID, a six character code beginning with M
(for MEC) followed by a 5-digit number. The purpose of the second code is to blind
a laboratory to the specimen’s origin when indicated (Los Angeles or Hawaii) and
possibly the ethnicity of the sample and to provide a way to include repeats in every
batch of specimens. Only the programmer has the necessary knowledge to link these
specimen IDs to each other or to the actual participant.
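The blinding scheme can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's software: the function name is hypothetical, the straw labels are made up, and random assignment of the 5-digit number is one possible choice that the text does not specify.

```python
import random

def assign_blinded_ids(source_ids, seed=None):
    """Give every DNA aliquot a unique blinded code of the form M + 5 digits.

    source_ids may list the same straw more than once; each occurrence gets
    its own code, which is how a hidden repeat can be placed in a batch.
    Returns the linking key as a list of (blinded_id, source_id) pairs.
    """
    rng = random.Random(seed)
    numbers = rng.sample(range(100000), len(source_ids))  # sampled without replacement
    return [(f"M{n:05d}", source_id) for n, source_id in zip(numbers, source_ids)]

# Hypothetical straw labels; the third entry is a deliberate repeat of the first.
key = assign_blinded_ids(["LA0100001", "LA0100002", "LA0100001"], seed=1)
lab_manifest = [blinded for blinded, _ in key]  # all that the laboratory would see
```

Only the holder of `key` can tell that two of the three codes refer to the same individual, which matches the role the text gives to the programmer.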
Our main MEC tracking database contains essential demographic information
used for locating cohort members and printing collection forms. Other databases
contain cancer diagnoses, death information, or hospital discharge data. The
specimen inventory database contains the variables from the collection forms, e.g.,
HSD3B1 - Mapped to the short arm of chromosome 1 (1p13.1). This gene
encodes an enzyme responsible for oxidizing 3 beta-OH to an oxo group. 3-beta-
hydroxysteroid dehydrogenase converts DHEA into androstenedione in the prostate.
It is not clear whether it is HSD3B1 or HSD3B2 which is responsible for this
reaction in the prostate.
HSD3B2 - Mapped to the short arm of chromosome 1 (1p13.1). This gene
also encodes an enzyme responsible for oxidizing 3 beta-OH to an oxo group. It has
been implicated in increasing the secretion of androgens from the adrenals in
individuals with CAH. Also, 3-beta-hydroxysteroid dehydrogenase converts DHEA
into androstenedione in the prostate. It is not clear whether it is HSD3B1 or HSD3B2
which is responsible for this reaction in the prostate.
CYP17 - Mapped to chromosome 10 (10p15-14). This gene encodes a
prostate-specific enzyme which reduces 17-oxo to a 17 beta-hydroxyl. This is the
enzyme that converts androstenedione into testosterone in the prostate.
5. Search for Functional SNPs in Coding and Regulatory Regions
We are taking a novel, two-pronged approach to identify functionally
relevant variants in both coding and regulatory regions of the six candidate genes
discussed above. First, it is increasingly clear that regions of non-coding
conservation between the mouse and human have a substantial probability of being
functional regulatory sites, as recently described by Rubin and colleagues (Loots et
al., 2000). Over the next five years, it is expected that nearly all of the mouse
genome will be available in public databases. Thus, by systematic comparison of
each human gene sequence (provided by the Human Genome Project) and that of the
mouse, it should be possible to identify a substantial proportion of non-coding
regulatory motifs. Second, we know that the coding region (exons) of any given
candidate gene may harbor relevant variants. We will therefore sequence the
identified mouse-human homologous regions, as well as the coding region of each
candidate gene in 23 affected individuals from each of the two ethnic groups. This
sample size gives us 90% power for detecting a SNP present in a single group at 5%
among cases, and >99% power for alleles of over 5% frequency. Thus, this approach
will identify not only polymorphisms present in all populations but also variants that
are common only in affected individuals or only in one of the ethnic groups. We
expect, based on our prior experience (see table 20), that there will be six SNPs on
average in the coding region, and we estimate that there will be two SNPs in the
human-mouse homologous regions, resulting in a total of eight potentially functional
SNPs per candidate gene. We are estimating the eight SNPs per gene as an average
for all six genes in this project, although the number for each individual gene may be
higher or lower.
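The power figures quoted above can be reproduced with a short calculation, shown here as a Python sketch. It assumes that "detecting" a SNP means observing the minor allele at least once among the sequenced chromosomes, that chromosomes are sampled independently, and that sequencing never misses an allele that is present. Reading the >99% figure as the case where an allele is at 5% in both groups (46 individuals, 92 chromosomes) is an interpretation, not something the text states.

```python
def detection_power(allele_freq, n_individuals):
    """Probability that an allele is seen at least once among 2n chromosomes."""
    n_chromosomes = 2 * n_individuals
    return 1 - (1 - allele_freq) ** n_chromosomes

# 23 individuals from one group: 46 chromosomes.
print(round(detection_power(0.05, 23), 3))  # about 0.906

# 46 individuals across both groups: 92 chromosomes.
print(round(detection_power(0.05, 46), 3))  # about 0.991
```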
To discover the variants, sequencing of PCR products will be performed,
using primers tailed with M13 forward and reverse sequences (to allow generic
sequencing primers to be used). Standard dye terminator sequencing reactions will
be performed, and sequences detected using ABI 3700 capillary sequencers. Data will
be analyzed using in-house pipelines and SNPs detected using PolyPhred (Nickerson
et al., 1998) and novel SNP finder algorithms (Altshuler et al., 2000b).
6. Selection of SNPs from Public Databases
Although a targeted search for functional SNPs (as above) will have a good
chance to find etiologic SNPs, it remains the case that some relevant mutations may
fall outside these identified regions. We will select SNPs from the public database to
augment the potentially functional variants discovered during the resequencing
project (above) in order to develop a comprehensive SNP map for each of the above
listed candidate genes. We propose to select a dense set of SNPs spanning each
locus. Assuming an average gene size (genomic) of 50,000 bp, and a SNP every 6
kb, we estimate 8 SNPs will be available across each gene. Figure 10 is a cartoon of
a hypothesized typical candidate gene.
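The marker-count estimate is simple arithmetic, spelled out in the Python sketch below. The constants come from the text (a 50,000 bp gene, one public SNP every 6 kb, and about eight putative functional SNPs per gene); the variable names are illustrative.

```python
GENE_SIZE_BP = 50_000           # assumed average genomic size of a candidate gene
PUBLIC_SNP_SPACING_BP = 6_000   # assumed spacing of SNPs in public databases
FUNCTIONAL_SNPS_PER_GENE = 8    # expected from targeted resequencing

public_snps = GENE_SIZE_BP // PUBLIC_SNP_SPACING_BP      # whole SNPs that fit in the gene
total_snps = public_snps + FUNCTIONAL_SNPS_PER_GENE      # combined marker set
mean_spacing_kb = GENE_SIZE_BP / total_snps / 1000       # average distance between markers

print(public_snps, total_snps, mean_spacing_kb)  # 8 16 3.125
```

Combining both sources therefore gives roughly one marker every 3 kb under these assumptions.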
7. Single Marker and Linkage Disequilibrium Analysis
All SNPs in the putative functional regions (approximately 8 in a given gene)
and those from the public database (approximately 8 per gene) will be genotyped in a
sample of 2,000 total cases and controls (1,000 African-Americans and 1,000
Latinos) during years 2, 3, and 4. We picked these two groups because they have
been underrepresented in previous studies and because they have different overall
risks of prostate cancer. Thus, these groups may potentially provide new insights into
the pathogenesis of prostate cancer that might not be possible by studying only one
ethnic group.
Figure 10. Example of targeted resequencing and SNP selection in a candidate gene.
[Figure: schematic of a hypothesized candidate gene. Legend: exons (white boxes);
regions of mouse-human homology (black boxes); regions targeted for resequencing
(arrows); SNPs discovered by resequencing (diamonds); SNPs drawn from public
databases (circles).]
Putative functional regions (white boxes) will be resequenced (arrows) as described in the text. Non-coding regions
conserved between mouse and human genome sequences (black boxes) are hypothesized to be regulatory, and will be
identified as mouse sequence becomes available. We will genotype each SNP discovered in putative functional
regions (diamonds), as well as SNPs from The SNP Consortium and other public databases (circles). Based on current
projections, we assume there will be 500,000 - 1,000,000 SNPs (one per 3-5 kb) in public databases by the start date of
the grant.
We will first test single marker associations using standard case-
cohort analysis (see Statistical Issues section below). A second, powerful approach
will utilize the dense SNP map we will have for each candidate gene to identify
haplotypes within the gene. This analysis avoids the need to discover the causal
allele by identifying instead ancestral haplotypes that are in linkage disequilibrium
with the allele and therefore show association with disease. If the disease-influencing
polymorphism arose once and retains linkage disequilibrium with its founding
ancestral haplotype, then this approach offers significant power. The exact density of
marker SNPs required to detect ancestral haplotypes is unknown (Collins, Lonjou, &
Morton, 1999; Kruglyak, 1999); however, we will have approximately 16 SNPs per
gene and therefore it will be possible to examine any candidate gene by haplotype
analysis at a density > 1 SNP per 3-6 kb, which satisfies the most conservative
estimates of required density (Kruglyak, 1999). We will conduct haplotype
reconstruction from diploid genotypes using the expectation maximization algorithm
(Slatkin & Excoffier, 1996) in the same 2,000 cases and controls (1,000 from each
racial/ethnic group).
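To make the expectation-maximization step concrete, the Python sketch below estimates haplotype frequencies for just two biallelic SNPs, where the only phase-ambiguous genotype is the double heterozygote. It is a deliberately reduced illustration: the analysis proposed here involves many more markers per gene, and the function name, genotype coding (0, 1, or 2 copies of the variant allele at each SNP), and data are hypothetical.

```python
def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), the variant-allele counts (0, 1, 2) at each SNP.
    Returns a dict mapping haplotype (a1, a2) to its estimated frequency.
    """
    haplotypes = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haplotypes}  # uniform starting values
    n_chromosomes = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haplotypes}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the double heterozygote between its two phasings.
                p_cis = freq[(0, 0)] * freq[(1, 1)]
                p_trans = freq[(0, 1)] * freq[(1, 0)]
                total = p_cis + p_trans
                w = p_cis / total if total > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1 - w
                counts[(1, 0)] += 1 - w
            else:
                # Phase is unambiguous: read both haplotypes directly.
                first = (1 if g1 == 2 else 0, 1 if g2 == 2 else 0)
                second = (1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)
                counts[first] += 1
                counts[second] += 1
        # M-step: new frequencies are the expected haplotype proportions.
        freq = {h: c / n_chromosomes for h, c in counts.items()}
    return freq

# Hypothetical sample: four (0,0), four (2,2), and two double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimates = em_haplotype_freqs(sample)
```

In this toy sample the unambiguous individuals carry only the 0-0 and 1-1 haplotypes, so the algorithm assigns the double heterozygotes almost entirely to that phasing and the two estimates converge toward 0.5 each.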
To ensure that associations we observe are robust, the most promising
associations identified during the initial case-control analysis of 2,000 participants
(1,000 from each racial/ethnic group) will then be followed up in the second sample
of 2,000 participants to be collected as part of this project. We will genotype the
most promising 25% of SNPs or haplotypes in this replication sample during year
five. By requiring replication of an association, we will dramatically lessen the
probability of reporting a false-positive association. The guidelines for determining
the most promising associations are described in the Statistical Issues section below.
Genotyping in this second sample also allows a larger data set for the eventual
exploration of gene-gene and gene-environment interactions.
In summary, we have established a comprehensive approach to identify
variation in the six genes under study. We are sequencing the regulatory regions of
these six genes in advanced prostate cancer cases and selecting random SNPs to
create a dense map. We will follow up our initial promising findings in a second
sample of 2,000 prostate cancer cases and controls. We will thus effectively survey
all variation in these genes resulting in a determination of their role, or lack thereof,
in prostate cancer risk as well as provide an internal replication for any associations
with prostate cancer.
8. Laboratory Methods
The genotyping method we will use is the LM-SBE method described in the
Preliminary Results and in Lindblad-Toh et al. (2000). In this
method, SNPs will be grouped into sets and multiplex PCR pools formed using a
novel algorithm developed at the Whitehead Institute. Genotyping primers for SBE will be
designed to detect each SNP in the pool, and a length ladder designed by addition of
the necessary number of bases to each SBE primer; SBE primers will vary from 18 to
50 bp in length in 4 bp increments. After multiplex PCR of the SNPs from each DNA sample,
the length-multiplexed SBE primers will be added, SBE reactions performed, and the
products separated on ABI 3700 capillary sequencers. Data extraction and SNP calls
will be determined using a novel genotyping pipeline under development at the
Whitehead Institute.
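The length ladder can be illustrated with a small Python sketch. The pooling algorithm developed at the Whitehead Institute is not described in the text, so the greedy assignment below is a stand-in chosen for simplicity, and the function names and SNP names are hypothetical. It follows the rule stated in the Preliminary Results that one primer length can serve two SNPs when their alleles do not overlap (for example, G/T and C/A).

```python
def sbe_length_ladder(min_len=18, max_len=50, step=4):
    """Primer lengths available for length multiplexing, in bp."""
    return list(range(min_len, max_len + 1, step))

def assign_slots(snps):
    """Greedily place SNPs on the ladder.

    snps: list of (name, alleles), with alleles as a two-letter string such as "GT".
    Each length holds at most two SNPs, and only if their allele sets are disjoint.
    Returns (slots, unassigned), where slots maps length -> list of placed SNPs.
    """
    slots = {length: [] for length in sbe_length_ladder()}
    unassigned = []
    for name, alleles in snps:
        for members in slots.values():
            empty = not members
            shareable = len(members) == 1 and not set(alleles) & set(members[0][1])
            if empty or shareable:
                members.append((name, alleles))
                break
        else:
            unassigned.append(name)  # no compatible position left in this pool
    return slots, unassigned

ladder = sbe_length_ladder()  # [18, 22, 26, 30, 34, 38, 42, 46, 50]
slots, unassigned = assign_slots([("snp1", "GT"), ("snp2", "CA"), ("snp3", "GT")])
```

Nine lengths used twice give 18 positions, which agrees with the upper end of the 12-18 simultaneous SBE reactions mentioned in the Preliminary Results.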
To maintain confidentiality, samples will be delivered to the laboratory
identified only with a unique specimen ID number. All assays will be run with
known standards and include a “water blank” that will indicate if there is any
contamination in the assay. In the rare event that contamination is present, the batch
will be rerun after appropriate decontamination procedures have been carried out and
new reagents have been prepared. One in every 20 specimens (5%) is repeated (with
the laboratory personnel blinded to whether it is a repeat) to confirm the
reproducibility of our genotyping results.
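The 5% repeat scheme and the resulting reproducibility check can be sketched as follows in Python. Drawing the repeats at random from the whole specimen list is an assumption (the text says only that one in every 20 is repeated), and the function names and genotype calls are hypothetical.

```python
import random

def choose_repeats(specimen_ids, every=20, seed=None):
    """Pick one specimen per `every` specimens (5% when every=20) to be re-run."""
    rng = random.Random(seed)
    n_repeats = len(specimen_ids) // every
    return rng.sample(list(specimen_ids), n_repeats)

def concordance(call_pairs):
    """Fraction of (original, repeat) genotype calls that agree."""
    agree = sum(first == second for first, second in call_pairs)
    return agree / len(call_pairs)

specimens = [f"M{i:05d}" for i in range(2000)]
repeats = choose_repeats(specimens, seed=0)
print(len(repeats))  # 100

# Hypothetical original and repeat calls for four specimens.
print(concordance([("GT", "GT"), ("GG", "GG"), ("TT", "GT"), ("GT", "GT")]))  # 0.75
```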
9. Statistical Issues
a. Data Analysis
The MEC is carefully monitored for occurrence of cancer through the Los
Angeles SEER cancer registry and the CCR. The subjects in this study who have
biological samples available are cases of prostate cancer and randomly selected
members of the MEC. (These randomly selected cohort members had their
probability of selection weighted by age and ethnicity, but not by disease status. If a
MEC member has his/her blood collected because of some other non-random reason,
such as disease status, then he/she only becomes a member of the random sub-cohort
if he/she is chosen to be such by the random procedure and only enters the
sub-cohort from the date his/her blood draw would have taken place under that
circumstance.) This study is thus a standard case-cohort study (Prentice, 1986).
Underlying the statistical analysis of this study is a relative risk approach, in
which the probability that a subject is diagnosed with disease, D, at a given age, t,
conditional on having survived to age t without disease, depends upon the variables
of interest (genes, environment, nutrition) according to the strength of the regression
parameters which are to be estimated in the analysis. The estimation of parameters in
the relative risk function involves the comparison of the covariates for cases to those
for all other subjects who constitute the ‘risk set’, R(t), surrounding the case. This
risk set is defined as consisting of all subjects who: (1) were being monitored for
disease at the age, t, of the case (i.e., had entered the MEC study before that age, and
were still being followed at age t); (2) had reached age t without having been
diagnosed with disease D; and (3) were members of the random sub-cohort at the
date on which the case was diagnosed. The comparison is then made between the
case and a random sample from R(t). We will have 1 control for each case.
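The three conditions defining the risk set can be sketched in code. This is a minimal illustration only, not the EPICURE procedure: the `Subject` record, its field names, and the simplification that sub-cohort membership is tracked on the age scale (rather than by calendar date) are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subject:
    subject_id: int
    entry_age: float                      # age at entry into follow-up
    exit_age: float                       # age at diagnosis or end of follow-up
    diagnosed: bool                       # True if exit_age is an age at diagnosis
    subcohort_entry_age: Optional[float]  # None if never in the random sub-cohort


def risk_set(case: Subject, subjects: List[Subject]) -> List[Subject]:
    """Return the sub-cohort members at risk at the case's age at diagnosis."""
    t = case.exit_age
    eligible = []
    for s in subjects:
        if s.subject_id == case.subject_id:
            continue
        followed_at_t = s.entry_age < t <= s.exit_age          # condition (1)
        disease_free_at_t = not s.diagnosed or s.exit_age > t  # condition (2)
        in_subcohort = (s.subcohort_entry_age is not None
                        and s.subcohort_entry_age <= t)        # condition (3)
        if followed_at_t and disease_free_at_t and in_subcohort:
            eligible.append(s)
    return eligible


def sample_control(case: Subject, subjects: List[Subject],
                   rng: random.Random) -> Optional[Subject]:
    """Draw one control at random from the case's risk set (1:1 design)."""
    candidates = risk_set(case, subjects)
    return rng.choice(candidates) if candidates else None
```

Because the same sub-cohort member can fall in the risk sets of several cases, a subject may be drawn as a control more than once, which is the repeated use that the case-cohort analysis has to allow for.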
Appropriate methods of statistical analysis for case-cohort designs such as this
were first proposed by Prentice (1986). The methods are essentially standard
nested case-control methods with allowance for the repeated use of the sub-cohort for
control selection. The statistical package EPICURE (HiroSoft International
Corporation, Seattle, WA) contains procedures for these case-cohort designs. The
logistic regression model (Breslow & Day, 1980) is emphasized in these packages and
is generally our model of choice. Such methods will be used in the statistical analysis
of case-control comparisons of genotype and case-control comparisons of gene-
environment interactions and gene-gene interactions. Regression models will be used
to examine the independent effects of genotype and other variables while adjusting for
potential confounders, to look for interactions, and to assess dose-response
relationships. The estimate of effect in the two racial/ethnic groups will be examined to
look for consistency of effect by a standard log-likelihood approach.
The ‘best fitting’ model together with the frequency of ‘exposure’ in controls
in the two racial/ethnic groups will be used to estimate the extent to which such a
model can explain the differences in cancer rates between the groups. Special attention
will be paid to correlations between the factors to ensure appropriateness of any such
procedure. The statistical method for doing this is straightforward and effectively uses
the control frequencies in the best fitting model, and has been described in detail by
our colleagues at USC/Norris (Navidi, Thomas, Stram, & Peters, 1994). The estimate
of genotype frequencies for the different racial/ethnic populations will be based on the
sub-cohort controls from the appropriate population. The binomial standard errors (se)
of estimates based on such numbers are quite small, e.g., for a true proportion (p)=0.4,
the se is approximately 0.022 even with only 500 controls. Only for very small p will this error
contribute substantially to our estimate of the extent to which the genotype can explain
ethnic differences in cancer rates. Similarly with these sample sizes, the standard errors
of other factors should not contribute significantly to our estimate of the extent to
which the factor (in particular, genotype frequencies and adduct levels) can explain
ethnic differences.
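The binomial standard error quoted above follows from the usual formula se = sqrt(p(1 - p)/n). The short sketch below (the function name is illustrative) evaluates it; with p = 0.4 it gives about 0.0219 for n = 500 and about 0.0245 for n = 400.

```python
import math


def binomial_se(p: float, n: int) -> float:
    """Standard error of a sample proportion with true proportion p and n subjects."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    for n in (400, 500):
        print(n, round(binomial_se(0.4, n), 4))
```

For a very small p the standard error is large relative to p itself, which is why the text singles out rare genotypes as the case where sampling error in the control frequencies matters.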
Measurement “error” of environmental risk factors, such as diet, has to be
taken into account when applying any “best fitting” model in this way. Suppose, for
example, that the relationship between a dietary measurement, X, and disease, D, is
such that
$$\Pr(D) = \frac{\exp(\alpha_k + \gamma X)}{1 + \exp(\alpha_k + \gamma X)}$$
i.e., there is a logistic regression of risk of disease on X, where X has been suitably
transformed to be approximately normally distributed with variance $\sigma^2$. Here k
indexes the matched case-control sets, while $\alpha_k$ defines the background rate of
disease in subjects with the same matching criteria. The parameter of interest is $\gamma$,
that defines the strength of the relationship between X and D. If X for an individual,
i, varies from day to day, such that
$$X_{ij} = Z_i + e_{ij}$$
where $X_{ij}$ is the measurement of X on day j for subject i, $Z_i$ is the true long-run
average of the $X_{ij}$ and $e_{ij}$ is the day-to-day “error term”, then the intra-class correlation
coefficient, $\rho$, is defined to be
$$\rho = \frac{\mathrm{Var}(Z_i)}{\mathrm{Var}(Z_i) + \mathrm{Var}(e_{ij})}.$$
The estimated $\gamma$ from an analysis of the relationship between disease and X is
corrected for measurement error by dividing the estimate of $\gamma$ by the estimate of $\rho$
(Rosner, Spiegelman, & Willett, 1990). The standard error of the corrected parameter
depends on the standard error of $\gamma$ and on the standard error of our estimate of $\rho$
(Rosner et al., 1990). We will be obtaining estimates of $\rho$ for the variables of interest
as part of the main MEC grant (dietary variables). These corrected estimates of effect
are essential to any analysis in which we measure the extent to which the variable
can explain differences between racial-ethnic groups. Multivariate extensions of
these methods will be used as appropriate (Rosner et al., 1990).
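The correction described above can be sketched for a single covariate. This is an illustration under stated assumptions, not the authors' implementation: the intra-class correlation is estimated by a one-way analysis of variance on balanced repeat measurements, and the standard error of the corrected coefficient uses a first-order (delta-method) approximation that treats the two estimates as independent. All function names are hypothetical.

```python
import math
from statistics import mean
from typing import List, Tuple


def intraclass_correlation(repeats: List[List[float]]) -> float:
    """One-way ANOVA estimate of Var(Z) / (Var(Z) + Var(e)).

    `repeats[i]` holds the repeat measurements for subject i; every subject
    must have the same number (>= 2) of measurements, and the data must not
    be completely constant.
    """
    n, k = len(repeats), len(repeats[0])
    subject_means = [mean(r) for r in repeats]
    grand_mean = mean(subject_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for r, m in zip(repeats, subject_means)
                    for x in r) / (n * (k - 1))
    var_between = max((ms_between - ms_within) / k, 0.0)
    return var_between / (var_between + ms_within)


def correct_for_measurement_error(gamma_hat: float, se_gamma: float,
                                  rho_hat: float, se_rho: float) -> Tuple[float, float]:
    """Return the corrected coefficient gamma_hat / rho_hat and its approximate se."""
    corrected = gamma_hat / rho_hat
    variance = (se_gamma ** 2) / rho_hat ** 2 \
        + (gamma_hat ** 2) * (se_rho ** 2) / rho_hat ** 4
    return corrected, math.sqrt(variance)
```

Since the intra-class correlation is at most 1, the corrected coefficient is never closer to zero than the uncorrected one, and its standard error is inflated by the same division.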
Problems of interpretation may arise with multiple comparisons being made,
especially when candidate genes are found to be highly polymorphic. There is no
completely satisfactory method of dealing with this; however, we describe our
approach below. We also discuss our efforts to explore the issue of intra-ethnic
stratification below.
b. Power
We have decided on a 1:1 (case:control) ratio for this study of candidate gene
polymorphisms. Table 24 shows that this design with our expected case sample sizes
is sufficient to give us ample power to detect quite small relative risks for many
hypotheses to be tested, and is also sufficient to allow us to have a quite accurate
idea of the frequency distribution of different factors of interest, including genotype
frequencies, in the various populations (necessary to allow calculation of the
predicted (relative) risk in the two populations based on their differing frequency of
various factors). It is possible to argue that a smaller size could accomplish many of
the aims of this study. A smaller sample size would, however, seriously reduce the
power of investigating interactions of genotype and other risk factors, such as diet,
available for analysis in this study, since such investigations require large sample
sizes to have any chance of detecting statistically significant interactions (Smith &
Day, 1984).
Table 24. Lower and Upper Bounds of Prevalence of Any Factor that Increases
Risk in a Population that Can Be Detected with 80% Power and a 2-sided Type
I Error of 5% with the Number of Cases Available Per Racial/Ethnic Group in
this Study.

                                     Relative Risk
Cases (n)    1.25    1.50         2.00           3.00           5.00
500          —       .24 - .68    .057 - .893    .019 - .946    .007 - .966
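Prevalence bounds of the kind tabulated in Table 24 can be approximated with a standard two-proportion power calculation. The sketch below is an assumption-laden illustration and not necessarily the method used to produce the table: it ignores matching, treats the relative risk as an odds ratio, and uses a normal approximation. The function names are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def case_control_power(n_cases: int, prevalence: float, relative_risk: float,
                       controls_per_case: int = 1,
                       z_alpha: float = 1.959964) -> float:
    """Approximate power of an unmatched case-control comparison.

    `prevalence` is the exposure frequency in controls; `z_alpha` is the
    critical value for a 2-sided type I error of 5%.
    """
    p0 = prevalence
    # Exposure frequency expected in cases when the odds ratio equals relative_risk.
    p1 = p0 * relative_risk / (1.0 + p0 * (relative_risk - 1.0))
    n1, n0 = n_cases, n_cases * controls_per_case
    p_bar = (n1 * p1 + n0 * p0) / (n1 + n0)
    se_null = math.sqrt(p_bar * (1.0 - p_bar) * (1.0 / n1 + 1.0 / n0))
    se_alt = math.sqrt(p1 * (1.0 - p1) / n1 + p0 * (1.0 - p0) / n0)
    return normal_cdf((abs(p1 - p0) - z_alpha * se_null) / se_alt)


if __name__ == "__main__":
    for prevalence in (0.10, 0.24, 0.50, 0.68, 0.90):
        print(prevalence, round(case_control_power(500, prevalence, 1.5), 3))
```

Scanning the prevalence from 0 to 1 and keeping the range where the returned power is at least 0.80 gives lower and upper bounds of the kind shown in the table; under this approximation a relative risk of 1.25 does not reach 80% power with 500 cases at a prevalence of 50%.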
We will collect enough samples to create a dataset that will eventually allow us
to look specifically for possible interactions between genotype and environmental
variables (e.g., dietary fat). By interaction we mean a non-independent, i.e., non-multiplicative,
effect. For example, for 2 factors each at 2 levels (a, A) and (b, B) there
are 3 relative risks (RRs) relative to the baseline $R_{ab} = 1$: $R_{Ab}$, $R_{aB}$ and $R_{AB}$. The interaction RR, $R_I$, is
defined by expressing these as $R_{ab} = 1$, $R_{Ab} = R_A$, $R_{aB} = R_B$, $R_{AB} = R_A R_B R_I$. Such
interactions are difficult to detect (low power) unless the interaction is pronounced or
there is a large sample size. A genetic interaction may be pronounced and, if this is the
situation, we will be able to detect it for both racial/ethnic groups. Our large total
sample size also permits us to detect much smaller interaction effects in the follow-up
aspect of the study where genotypes will be available for 1,632 cases. The power
(Breslow, Lubin, & Marek, 1983) to detect an interaction for a range of total cases is
given in table 25. In this table, the marginal relative risk (RR) for the gene is assumed
to be 1.5. The ‘environmental factor’ is assumed to affect 50% of the population with a
marginal RR of 1.5 (e.g., possible dietary influences, dividing the population at the
median consumption). The study proposed has adequate power to address moderate
interaction effects.
Table 25. Detectable Interaction Relative Risks (RR_I) with 80% Power and a 2-sided
Type I Error of 5%; Marginal ‘Environmental’ Factor Relative Risk = 1.5
with Population Prevalence of 50%; Marginal Genotype Relative Risk = 1.5.

Cases (n)   Population %   Lower RR_I   Upper RR_I     Cases (n)   Population %   Lower RR_I   Upper RR_I
1000        10%            0.52         2.00           2000        10%            0.63         1.62
            20%            0.60         1.69                       20%            0.70         1.44
            50%            0.64         1.56                       50%            0.73         1.37
            80%            0.53         1.82                       80%            0.65         1.53
            90%            0.41         2.26                       90%            0.54         1.78
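The interaction relative risk defined in the text preceding Table 25 is the factor by which the joint relative risk departs from the product of the two single-factor relative risks. A minimal sketch of that definition follows; the function and argument names are illustrative only.

```python
def interaction_rr(rr_a: float, rr_b: float, rr_ab: float) -> float:
    """Interaction RR: joint RR divided by the product of the single-factor RRs."""
    return rr_ab / (rr_a * rr_b)


def joint_rr(rr_a: float, rr_b: float, rr_interaction: float) -> float:
    """Joint RR for subjects exposed to both factors under the multiplicative model."""
    return rr_a * rr_b * rr_interaction


if __name__ == "__main__":
    # Two factors with single-factor RRs of 1.5 each and no interaction.
    print(joint_rr(1.5, 1.5, 1.0))
    # The same factors with an interaction RR of 2.0.
    print(joint_rr(1.5, 1.5, 2.0))
```

An interaction RR of 1 corresponds to a purely multiplicative joint effect; values above or below 1 correspond to the "Upper" and "Lower" detectable bounds reported in the table.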
10. Criteria for Interpretation of Gene-Cancer Associations
In this proposal we plan to genotype variants from six genes based
on an a priori biologic hypothesis. We recognize that when testing this many variants
we may identify associations that are false positives. In order to avoid reporting
false-positive associations we plan to select the ‘most significant’ associations
discovered from the initial study of 2,000 prostate cancer cases and controls for
additional follow-up using the following criteria: (1) observation of a consistent
effect across the two racial/ethnic groups, (2) observation of a high (or protective)
relative risk for rare alleles or a large attributable risk for common alleles, and (3)
observation of high statistical significance of the observed association. A promising
association would meet at least one of these criteria. We will conduct additional
follow-up in the remaining 2,000 prostate cancer cases and controls. Using this
follow-up analysis, the larger sample size should have the effect of reducing the
overall type I error to less than 1% per variant tested. We believe this approach will
minimize the possibility of reporting false-positive associations and provide us with
ample information to determine which variants/genes should be studied further.
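The effect of a confirmatory second stage on the false-positive rate can be illustrated with a simple calculation. The sketch assumes, purely for illustration, that the two stages use independent samples and that a null variant is reported only if it passes a nominal significance threshold in both; the thresholds and names are hypothetical and are not taken from the study protocol.

```python
def two_stage_type1_error(alpha_stage1: float, alpha_stage2: float) -> float:
    """Chance that a null variant passes both independent stages."""
    return alpha_stage1 * alpha_stage2


def expected_false_positives(n_null_variants: int, alpha_stage1: float,
                             alpha_stage2: float) -> float:
    """Expected number of null variants reported after both stages."""
    return n_null_variants * two_stage_type1_error(alpha_stage1, alpha_stage2)


if __name__ == "__main__":
    print(two_stage_type1_error(0.05, 0.05))
    print(expected_false_positives(100, 0.05, 0.05))
```

Under these assumptions, two nominal 5% tests give a per-variant error of 0.25%, which is below the 1% level mentioned in the text.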
11. Intra-ethnic Stratification
There has been much debate as to whether ethnic confounding
(“stratification”) is a significant source of false-positive association results in genetic
studies (Caporaso, Rothman, & Wacholder, 1999). Clearly, there are documented
examples of stratification causing false associations (Gelernter, Goldman, & Risch,
1993; Knowler, Williams, Pettitt, & Steinberg, 1988), but whether these are rare
exceptions or the rule is unknown. We are in the process of conducting a direct test
for population stratification using SNP analysis. Recently, investigators at the
Whitehead Institute (Reich & Goldstein, 2001) and elsewhere (Pritchard &
Rosenberg, 1999) have noted that stratification can be straightforwardly assessed by
first testing a case-control population for association to a large number (n=100) of
randomly chosen SNPs. We are using this procedure for a sample of the prostate
cancer cases obtained in the MEC. Specifically, we have randomly selected 100
SNPs from over 148,000 SNPs in The SNP Consortium database. These will be
genotyped in 300 MEC cases and controls. A chi-square distribution for association
of SNPs with disease will be constructed for each case-control comparison, and the
distribution compared to that expected under the null hypothesis of no association.
As we assume that none of these 100 random SNPs will be associated with disease,
an excess of chi-square values beyond the null distribution will be taken as evidence
for stratification. If stratification is found for one or more samples, we will match
each case to the control most similar in SNP allele content. This project is
independent of the proposed study.
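The stratification check described above can be sketched as follows. This is a simplified stand-in rather than the published procedure of Pritchard and Rosenberg (1999) or Reich and Goldstein (2001): each SNP contributes a 1-degree-of-freedom chi-square statistic from a 2x2 table of allele counts, and an excess is judged by counting how many statistics exceed the nominal 5% critical value and comparing that count with its binomial expectation. The function names and the input layout are assumptions.

```python
import math
from typing import List, Tuple

CHI2_1DF_CRITICAL_05 = 3.841458820694124  # upper 5% point of chi-square, 1 df


def allele_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> float:
    """Pearson chi-square (1 df) for a 2x2 table of allele counts."""
    n = case_a + case_b + ctrl_a + ctrl_b
    margins = ((case_a + case_b) * (ctrl_a + ctrl_b)
               * (case_a + ctrl_a) * (case_b + ctrl_b))
    if margins == 0:
        return 0.0  # monomorphic SNP or empty group: no information
    return n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / margins


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


def stratification_check(tables: List[Tuple[int, int, int, int]]) -> Tuple[int, float]:
    """Return (number of SNPs nominally significant at 5%, p-value for the excess)."""
    statistics = [allele_chi_square(*table) for table in tables]
    n_exceed = sum(1 for s in statistics if s > CHI2_1DF_CRITICAL_05)
    return n_exceed, binomial_upper_tail(n_exceed, len(statistics), 0.05)
```

With 100 unlinked null SNPs about 5 are expected to exceed the critical value by chance, so a count well above that (a small returned p-value) would be read as evidence of stratification.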
F. Translational Focus
We expect to contribute significantly to the understanding of the role of
adrenal androgens in prostate cancer and to definitively test the hypothesis that
variation within major genes in the adrenal androgen biosynthesis pathway affects
risk of prostate cancer. This comprehensive survey of six genes will potentially lead
to the identification of a “profile” of individuals at risk of prostate cancer. Such a
profile would be useful not only for risk estimation but also for the development and
application of early intervention strategies. We have focused on two racial/ethnic
groups, African-Americans and Latinos, who are clearly underrepresented in the
literature despite their high risk of disease. A total pool of 4,000 cases and controls
will be available and can be used by other investigators in the future to explore other
genes. Furthermore, the dataset of genotypes can be combined with our existing
dataset of environmental factors to explore gene-environment interactions that
contribute to prostate cancer risk and prevention.
REFERENCES
Altshuler, D., Hirschhorn, J. N., Klannemark, M., Lindgren, C. M., Vohl, M. C.,
Nemesh, J., Lane, C. R., Schaffner, S. F., Bolk, S., Brewer, C., Tuomi, T.,
Gaudet, D., Hudson, T. J., Daly, M., Groop, L., & Lander, E. S. (2000a). The
common PPARgamma Pro12Ala polymorphism is associated with decreased
risk of type 2 diabetes. Nat Genet, 26(1), 76-80.
Altshuler, D., Pollara, V. J., Cowles, C. R., Van Etten, W. J., Baldwin, J., Linton, L.,
& Lander, E. S. (2000b). An SNP map of the human genome generated by
reduced representation shotgun sequencing. Nature, 407(6803), 513-6.
American Cancer Society. (2000). Cancer Facts and Figures. Atlanta, GA.
American Cancer Society. (2001). Cancer Facts and Figures. Atlanta, GA.
Andriole, G. L., & Catalona, W. J. (1991). The diagnosis and treatment of prostate
cancer. Annu Rev Med, 42, 9-15.
Ausubel, F., Brent, R., & Kingston, R. (Eds.). (1995). Current protocols in
molecular biology. New York: Wiley & Sons.
Barrett-Connor, E., Garland, C., McPhillips, J. B., Khaw, K. T., & Wingard, D. L.
(1990). A prospective, population-based study of androstenedione, estrogens,
and prostate cancer. Cancer Res, 50, 169-173.
Berthon, P., Valeri, A., Cohen-Akenine, A., Drelon, E., Paiss, T., Wohr, G., Latil,
A., Millasseau, P., Mellah, I., Cohen, N., Blanche, H., Bellane-Chantelot, C.,
Demenais, F., Teillac, P., Le Duc, A., de Petriconi, R., Hautmann, R.,
Chumakov, I., Bachner, L., Maitland, N., Lidereau, R., Vogel, W., Goumier,
G., Mangin, P., Cohen, D., & Cussenot, O. (1998). Predisposing gene for
early-onset prostate cancer, localized on chromosome 1q42.2-43. Am. J.
Hum. Genet., 62, 1416-1424.
Black, R. J., Bray, F., Ferlay, J., & Parkin, D. M. (1997). Cancer incidence and
mortality in the European Union: cancer registry data and estimates of
national incidence for 1990. Eur J Cancer, 33(7), 1075-107.
Bratt, O., Borg, A., Kristoffersson, U., Zhang, Q.-X., & Olsson, H. (1999). CAG
repeat length in the androgen receptor gene is related to age at diagnosis of
prostate cancer and response to endocrine therapy, but not to prostate cancer
risk. British Journal of Cancer, 81(4), 672-676.
Breslow, N. E., & Day, N. E. (1980). Statistical methods in cancer research. Vol. 1.
The Analysis of Case-Control Studies (Vol. 1). International Agency for
Research on Cancer, Lyon: IARC Scientific Publications.
Breslow, N. E., Lubin, J. H., & Marek, P. (1983). Multiplicative models and cohort
analysis. J American Statist Association, 78, 1-12.
Caporaso, N., Rothman, N., & Wacholder, S. (1999). Case-control studies of
common alleles and environmental factors. Journal of the National Cancer
Institute Monographs, 26, 25-30.
Cargill, M., Altshuler, D., Ireland, J., Sklar, P., Ardlie, K., Patil, N., Lane, C., Lim,
E., Kalyanaraman, N., Nemesh, J., Ziaugra, L., Friedland, L., Rolfe, A.,
Warrington, J., Lipshutz, R., Daley, G., & Lander, E. (1999).
Characterization of single-nucleotide polymorphisms in coding regions of
human genes. Nature Genetics, 22, 231-238.
Carter, B., Beaty, T., Steinberg, G., Childs, B., & Walsh, P. (1992). Mendelian
inheritance of familial prostate cancer. Proc. Natl. Acad. Sci., 89, 3367-3371.
Carter, H. B., Pearson, J. D., Metter, E. J., Chan, D. W., Andres, R., Fozard, J. L.,
Rosner, W., & Walsh, P. C. (1995). Longitudinal evaluation of serum
androgen levels in men with and without prostate cancer. Prostate, 27, 25-31.
Chen, X., Levine, L., & Kwok, P. (1999). Fluorescence polarization in homogeneous
nucleic acid analysis. Genome Research, 9, 492-8.
Chen, X., Zehnbauer, B., Gnirke, A., & Kwok, P. (1997). Fluorescence energy
transfer detection as a homogeneous DNA diagnostic method. Proceedings
of the National Academy of Sciences of the United States of America, 94,
10756-61.
Clark, A., Weiss, K., Nickerson, D., Taylor, S., Buchanan, A., Stengard, J., Salomaa,
V., Vartiainen, E., Perola, M., Boerwinkle, E., & Sing, C. (1998). Haplotype
structure and population genetic inferences from nucleotide-sequence
variation in human lipoprotein lipase. American Journal of Human Genetics,
63, 595-612.
Coetzee, G. A., & Ross, R. K. (1994). Prostate cancer and the androgen receptor
[Letter]. Journal of the National Cancer Institute, 86(11), 872-873.
Collins, A., Lonjou, C., & Morton, N. E. (1999). Genetic epidemiology of single
nucleotide polymorphisms [see comments]. Proceedings of the National
Academy of Sciences of the United States of America, 96(26), 15173-7.
Dahlback, B. (1997). Resistance to activated protein C caused by the factor V R506Q
mutation is a common risk factor for venous thrombosis. Thromb Haemost,
78(1), 483-8.
Davis, D. L., & Russell, D. W. (1993). Unusual length polymorphism in human
steroid 5 alpha-reductase type 2 gene (SRD5A2). Hum Mol Genet, 2(6), 820.
Dorgan, J., Albanes, D., Virtamo, J., Heinonen, O. P., Chandler, D. W., Galmarini,
M., McShane, L. M., Barrett, M. J., Tangrea, J., & Taylor, P. R. (1998).
Relationships of serum androgens and estrogens to prostate cancer risk:
results from a prospective study in Finland. Cancer Epidemiol Biomarkers
Prev, 7, 1069-1074.
Dufort, I., Rheault, P., Huang, X., Soucy, P., & Luu-The, V. (1999). Characteristics
of a highly labile human type 5 17beta-hydroxysteroid dehydrogenase.
Endocrinology, 140, 568-74.
Edwards, S., Badzioch, M., Minter, R., Hamoudi, R., Collins, N., Ardern-Jones, A.,
Dowe, A., Osborne, S., Kelly, J., Shearer, R., Easton, D., Saunders, G.,
Dearnaley, D., & Eeles, R. (1999). Androgen receptor polymorphisms:
associations with prostate cancer risk, relapse and overall survival. Int. J.
Cancer, 84, 458-465.
El-Alfy, M., Luu-The, V., Huang, X., Berger, L., Labrie, F., & Pelletier, G. (1999).
Localization of type 5 17 beta-hydroxysteroid dehydrogenase, 3 beta-
hydroxysteroid dehydrogenase, and androgen receptor in the human prostate
by in Situ hybridization and immunocytochemistry. Endocrinology, 140,
1481-91.
Ellis, L., & Nyborg, H. (1992). Racial/ethnic variations in male testosterone levels: a
probable contributor to group differences in health. Steroids, 57, 72-75.
Febbo, P. G., Kantoff, P. W., Platz, E. A., Casey, D., Batter, S., Giovannucci, E.,
Hennekens, C. H., & Stampfer, M. J. (1999). The V89L polymorphism in the
5-alpha reductase type 2 gene and risk of prostate cancer. Cancer Res, 59,
5878-5881.
Gann, P., Hennekens, C. H., Ma, J., Longcope, C., & Stampfer, M. J. (1996).
Prospective study of sex hormone levels and risk of prostate cancer. J Natl
Cancer Inst, 88, 1118-1126.
Gelernter, J., Goldman, D., & Risch, N. (1993). The A1 allele at the D2 dopamine
receptor gene and alcoholism. A reappraisal. JAMA, 269(13), 1673-7.
Giovannucci, E., Stampfer, M. J., Krithivas, K., Brown, M., Dahl, D., Brufsky, A.,
Talcott, J., Hennekens, C. H., & Kantoff, P. W. (1997). The CAG repeat
within the androgen receptor gene and its relationship to prostate cancer
[published erratum appears in Proc Natl Acad Sci U S A 1997 Jul
22;94(15):8272]. Proceedings of the National Academy of Sciences of the
United States of America, 94(7), 3320-3.
Glover, F. E., Jr., Coffey, D. S., Douglas, L. L., Cadogan, M., Russell, H., Tulloch,
T., Baker, T. D., Wan, R. L., & Walsh, P. C. (1998). The epidemiology of
prostate cancer in Jamaica. J Urol, 159(6), 1984-6; discussion 1986-7.
Guess, H., Friedman, G., Sadler, M., Stanczyk, F., Vogelman, J., Imperato-
McGinley, J., Lobo, R., & Orentreich, N. (1997). 5-alpha reductase activity
and prostate cancer: A case-control study using stored sera. Cancer
Epidemiology, Biomarkers & Prevention, 6, 21-24.
Hakimi, J., Schoenberg, M., Rondinelli, R., Piantadosi, S., & Barrack, E. (1997).
Androgen receptor variants with short glutamine or glycine repeats may
identify unique subpopulations of men with prostate cancer. Clinical Cancer
Research, 3, 1599-1608.
Heikkila, R., Aho, K., Heliovaara, M., Hakama, M., Marniemi, J., Reunanen, A., &
Knekt, P. (1999). Serum testosterone and sex hormone-binding globulin
concentrations and the risk of prostate carcinoma. Cancer, 86(2), 312-315.
Henderson, B. E., Ross, R. K., Pike, M. C., & Casagrande, J. T. (1982). Endogenous
hormones as a major factor in human cancer. Cancer Res, 42(8), 3232-9.
Hirschhorn, J. N., Sklar, P., Lindblad-Toh, K., Lim, Y. M., Ruiz-Gutierrez, M., Bolk,
S., Langhorst, B., Schaffner, S., Winchester, E., & Lander, E. S. (2000).
SBE-TAGS: an array-based method for efficient single-nucleotide
polymorphism genotyping. Proc Natl Acad Sci USA, 97(22), 12164-9.
Hsing, A., Gao, Y., Wu, G., Wang, X., Deng, J., Chen, Y., Sesterhenn, I., Mostofi,
F., Benichou, J., & Chang, C. (2000). Polymorphic CAG and GGN repeat
lengths in the androgen receptor gene and prostate cancer risk: a population-
based case-control study in China. Cancer Research, 60, 5111-5116.
Hsing, A. W., & Comstock, G. W. (1993). Serological precursors of cancer: serum
hormones and risk of subsequent prostate cancer. Cancer Epidemiol
Biomarkers Prev, 2, 27-32.
Huggins, C., & Hodges, C. (1941). Studies on prostatic cancer. I. The effect of
castration, of estrogen and of androgen injection on the serum phosphatases
in metastatic carcinoma of the prostate. Cancer Res, 1, 293-297.
Ingles, S. A., Ross, R. K., Yu, M. C., Irvine, R. A., La Pera, G., Haile, R. W., &
Coetzee, G. A. (1997). Association of prostate cancer risk with genetic
polymorphisms in vitamin D receptor and androgen receptor [see comments].
Journal of the National Cancer Institute, 89(2), 166-70.
Irvine, R. A., Yu, M. C., Ross, R. K., & Coetzee, G. A. (1995). The CAG and GGC
microsatellites of the androgen receptor gene are in linkage disequilibrium in
men with prostate cancer. Cancer Research, 55, 1937.
Jaffe, J. M., Malkowicz, S. B., Walker, A. H., MacBride, S., Peschel, R.,
Tomaszewski, J., Van Arsdalen, K., Wein, A. J., & Rebbeck, T. R. (2000).
Association of SRD5A2 genotype and pathological characteristics of prostate
tumors. Cancer Res, 60(6), 1626-1630.
Kantoff, P. W., Febbo, P. G., Giovannucci, E., Krithivas, K., Dahl, D. M., Chang, G.,
Hennekens, C. H., Brown, M., & Stampfer, M. J. (1997). A polymorphism of
the 5 alpha-reductase gene and its association with prostate cancer: a case-
control analysis. Cancer Epidemiol Biomarkers Prev, 6(3), 189-92.
Knochenhauer, E. S., Cortet-Rudelli, C., Cunnigham, R. D., Conway-Myers, B. A.,
Dewailly, D., & Azziz, R. (1997). Carriers of 21-hydroxylase deficiency are
not at increased risk for hyperandrogenism. J Clin Endocrinol Metab, 82(2),
479-85.
Knowler, W. C., Williams, R. C., Pettitt, D. J., & Steinberg, A. G. (1988).
Gm3;5,13,14 and type 2 diabetes mellitus: an association in American
Indians with genetic admixture. Am J Hum Genet, 43, 520-526.
Kolonel, L., Henderson, B., Hankin, J., Nomura, A., Wilkens, L., Pike, M., Stram,
D., Monroe, K., Earle, M. E., & Nagamine, F. (2000). A Multiethnic cohort
in Hawaii and Los Angeles: Baseline Characteristics. American Journal of
Epidemiology, 151(4), 346-357.
Kruglyak, L. (1999). Prospects for whole-genome linkage disequilibrium mapping of
common disease genes. Nature Genetics, 22(2), 139-44.
Labrie, F. (1993). Intracrinology: its impact on prostate cancer. Curr Opin Urol, 3,
381-387.
Labrie, F., Dupont, A., & Belanger, A. (Eds.). (1985). Complete androgen blockade
for the treatment of prostate cancer. Philadelphia: J. B. Lippincott.
Landis, S. H., Murray, T., Bolden, S., & Wingo, P. A. (1998). Cancer statistics,
1998. CA Cancer J Clin, 48(1), 6-29.
Lange, E., Chen, H., Brierley, K., Livermore, H., Wojno, K., Langefeld, C., Lange,
K., & Cooney, K. (2000). The polymorphic exon 1 androgen receptor CAG
repeat in men with a potential inherited predisposition to prostate cancer.
Cancer Epidemiology, Biomarkers & Prevention, 9, 439-442.
Lichtenstein, P., Holm, N., Verkasalo, P., Iliadou, A., Kaprio, J., Koskenvuo, M.,
Pukkala, E., Skytthe, A., & Hemminki, K. (2000). Environmental and
heritable factors in the causation of cancer-analyses of cohorts of twins from
Sweden, Denmark, and Finland. N Engl J Med, 343, 78-85.
Lindblad-Toh, K., Winchester, E., Daly, M. J., Wang, D. G., Hirschhorn, J. N.,
Laviolette, J. P., Ardlie, K., Reich, D. E., Robinson, E., Sklar, P., Shah, N.,
Thomas, D., Fan, J. B., Gingeras, T., Warrington, J., Patil, N., Hudson, T. J.,
& Lander, E. S. (2000). Large-scale discovery and genotyping of single
nucleotide polymorphisms in the mouse. Nature Genetics, 24(4), 381-6.
Lookingbill, D., Demers, L., Wang, C., Leung, A., Rittmaster, R., & Santen, R.
(1991). Clinical and biochemical parameters of androgen action in normal
healthy Caucasian versus Chinese subjects. Journal of Clinical
Endocrinology & Metabolism, 72(6), 1242-8.
Loots, G. G., Locksley, R. M., Blankespoor, C. M., Wang, Z. E., Miller, W., Rubin,
E. M., & Frazer, K. A. (2000). Identification of a coordinate regulator of
interleukins 4, 13, and 5 by cross-species sequence comparisons. Science,
288(5463), 136-40.
Lunn, R., Bell, D., Mohler, J., & Taylor, J. (1999). Prostate cancer risk and
polymorphism in 17 hydroxylase (CYP17) and steroid reductase (SRD5A2).
Carcinogenesis, 20, 1727-31.
Makridakis, N., di Salle, E., & Reichardt, J. (2000). Biochemical and
pharmacogenetic dissection of human steroid 5α-reductase type II.
Pharmacogenetics, 10, 407-413.
Makridakis, N., Ross, R. K., Pike, M. C., Chang, L., Stanczyk, F. Z., Kolonel, L. N.,
Shi, C. Y., Yu, M. C., Henderson, B. E., & Reichardt, J. K. V. (1997). A
prevalent missense substitution that modulates activity of prostatic steroid
5α-reductase. Cancer Res, 57, 1020-1022.
Makridakis, N. M., Ross, R. K., Pike, M. C., Crocitto, L. E., Kolonel, L. N., Pearce,
C. L., Henderson, B. E., & Reichardt, J. K. (1999). Association of mis-sense
substitution in SRD5A2 gene with prostate cancer in African-American and
Hispanic men in Los Angeles, USA. Lancet, 354, 975-978.
McIndoe, R., Stanford, J., Gibbs, M., Jarvik, G., Brandzel, S., Neal, C., Li, S.,
Gammack, J., Gay, A., Goode, E., Hood, L., & Ostrander, E. (1997). Linkage
analysis of 49 high-risk families does not support a common familial prostate
cancer-susceptibility gene at 1q24-25. Am. J. Hum. Genet., 61, 347-353.
Miettinen, O. S. (1974). Proportion of disease caused or prevented by a given
exposure, trait or intervention. Am J Epidemiol, 99(5), 325-32.
Mononen, N., Ikonen, T., Syrjakoski, K., Matikainen, M., Schleutker, J., Tammela,
T. L. J., Koivisto, P. A., & Kallioniemi, O. P. (2001). A missense substitution
A49T in the steroid 5-alpha-reductase gene (SRD5A2) is not associated with
prostate cancer in Finland. Br J Cancer, 84, 1344-1347.
Montie, J. (1993). 1992 staging system for prostate cancer. Semin Urol, 11, 10-13.
Montie, J., Pienta, K., & Pontes, P. (1996). Staging systems and prognostic factors
for prostate cancer. In N. Vogelzang (Ed.), Comprehensive textbook of
genitourinary oncology (pp. 712-22). Baltimore: Williams and Wilkins.
Nam, R., Elhai, Y., Krahn, M., Hakimi, J., Ho, M., Chu, W., Sweet, J., Trachtenberg,
J., Jewett, M. A. S., & Narod, S. (2000). Significance of the CAG repeat
polymorphism of the androgen receptor gene in prostate cancer progression.
The Journal of Urology, 164, 567-572.
Nam, R., Toi, A., Vesprini, D., Ho, M., Chu, W., Harvie, S., Sweet, J., Trachtenberg,
J., Jewett, M., & Narod, S. (2001). V89L polymorphism of type-2,5-alpha
reductase enzyme gene predicts prostate cancer presence and progression.
Urology, 57, 199-205.
Navidi, W., Thomas, D., Stram, D., & Peters, J. (1994). Design and analysis of
multilevel analytic studies with applications to a study of air pollution.
Environ Health Perspect, 102 Suppl 8, 25-32.
Nickerson, D., Taylor, S., Weiss, K., Clark, A., Hutchinson, R., Stengard, J.,
Salomaa, V., Vartiainen, E., Boerwinkle, E., & Sing, C. (1998). DNA
sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene
[see comments]. Nature Genetics, 19, 233-40.
Noble, R. (1977). The development of prostatic adenocarcinoma in Nb rats following
prolonged sex hormone administration. Cancer Res, 37, 1929-1933.
Nomura, A., Heilbrun, L., Stemmermann, G., & Judd, H. (1988). Prediagnostic
Serum Hormones and the Risk of Prostate Cancer. Cancer Research, 48,
3515-3517.
Nomura, A. M., Stemmermann, G. N., Chyou, P. H., Henderson, B. E., & Stanczyk,
F. Z. (1996). Serum androgens and prostate cancer. Cancer Epidemiol
Biomarkers Prev, 5, 621-624.
Norman, A., & Litwack, G. (1997). Hormones (2nd ed.). San Diego: Academic
Press.
Osegbe, D. (1997). Prostate cancer in Nigerians: facts and nonfacts. The Journal of
Urology, 157, 1340-1343.
Prentice, R. L. (1986). A case-cohort design for epidemiologic cohort studies and
disease prevention studies. Biometrika, 73, 1-11.
Pritchard, J. K., & Rosenberg, N. A. (1999). Use of Unlinked Genetic Markers to
Detect Population Stratification in Association Studies. American Journal of
Human Genetics, 65, 220-228.
Reich, D. E., & Goldstein, D. B. (2001). Detecting association in a case-control
study while correcting for population stratification. Genet Epidemiol, 20(1),
4-16.
Reichardt, J. K. V., Makridakis, N., Henderson, B. E., Yu, M. C., Pike, M. C., &
Ross, R. K. (1995). Genetic variability of the human SRD5A2 gene:
implications for prostate cancer risk. Cancer Research, 55, 3973-3975.
Rheault, P., Dufort, I., Soucy, P., & Luu-The, V. (1999). Assignment of HSD17B5
encoding type 5 17 beta-hydroxysteroid dehydrogenase to human
chromosome bands 10p15→p14 and mouse chromosome 13 region A2 by in
situ hybridization: identification of a new syntenic relationship.
Cytogenetics & Cell Genetics, 84, 241-2.
Risch, N. (2000). Searching for genetic determinants in the new millennium. Nature,
405, 847-856.
Risch, N., & Merikangas, K. (1996). The future of genetic studies of complex human
diseases. Science, 273(5281), 1516-7.
Rosner, B., Spiegelman, D., & Willett, W. C. (1990). Correction of logistic
regression relative risk estimates and confidence intervals for measurement
error: the case of multiple covariates measured with error. American Journal
of Epidemiology, 132(4), 734-45.
Ross, R., Bernstein, L., Judd, H., Hanisch, R., Pike, M., & Henderson, B. (1986).
Serum testosterone levels in healthy young black and white men. J Natl
Cancer Inst, 76, 45-48.
Ross, R. K., Bernstein, L., Lobo, R. A., Shimizu, H., Stanczyk, F. Z., & Pike, M. C.
(1992). 5-alpha reductase activity and risk of prostate cancer among Japanese
and US white and black males. Lancet, 339, 887-890.
Ross, R. K., Pike, M. C., Coetzee, G. A., Reichardt, J. K., Yu, M. C., Feigelson, H.,
Stanczyk, F. Z., Kolonel, L. N., & Henderson, B. E. (1998). Androgen
metabolism and prostate cancer: establishing a model of genetic
susceptibility. Cancer Res, 58, 4497-4504.
Shibata, A., & Whittemore, A. (1997). Genetic predisposition to prostate cancer:
possible explanation for ethnic differences in risk. The Prostate, 32, 65-72.
Slatkin, M., & Excoffier, L. (1996). Testing for linkage disequilibrium in genotypic
data using the Expectation-Maximization algorithm. Heredity, 76, 377-383.
Smith, J., Freije, D., Carpten, J., Gronberg, H., Xu, J., Isaacs, S., Brownstein, M.,
Bova, G., Guo, H., Bujnovszky, P., Nusskern, D., Damber, J., Bergh, A.,
Emanuelsson, M., Kallioniemi, O., Walker-Daniels, J., Bailey-Wilson, J.,
Beaty, T., Meyers, D., Walsh, P., Collins, F., Trent, J., & Isaacs, W. (1996).
Major susceptibility locus for prostate cancer on chromosome 1 suggested by
a genome-wide search. Science, 274, 1371-1374.
Smith, P. G., & Day, N. E. (1984). The design of case-control studies: the influence
of confounding and interaction effects. Int J Epidemiology, 13, 356-365.
Stanford, J. L., Just, J. J., Gibbs, M., Wicklund, K. G., Neal, C. L., Blumenstein, B.
A., & Ostrander, E. A. (1997). Polymorphic repeats in the androgen receptor
gene: molecular markers of prostate cancer risk [see comments]. Cancer
Research, 57(6), 1194-8.
Stoner, E. (1996). 5alpha-reductase inhibitors/finasteride. Prostate Suppl, 6, 82-7.
Stram, D. O., Hankin, J. H., Wilkens, L. R., Pike, M. C., Monroe, K. R., Park, S.,
Henderson, B. E., Nomura, A. M. Y., Earle, M. E., Nagamine, F. S., &
Kolonel, L. N. (2000). Calibration of the dietary questionnaire for a
multiethnic cohort in Hawaii and Los Angeles. American Journal of
Epidemiology, 151(4), 358-370.
Strittmatter, W. J., & Roses, A. D. (1996). Apolipoprotein E and Alzheimer's
disease. Annu Rev Neurosci, 19, 53-77.
Vatten, L., Ursin, G., Ross, R., Stanczyk, F., Lobo, R., Harvei, S., & Jellum, E.
(1997). Androgens in serum and the risk of prostate cancer: A nested
case-control study from the Janus Serum Bank in Norway. Cancer Epidemiology,
Biomarkers & Prevention, 6, 967-969.
Wadelius, M., Andersson, A., Johansson, J., Wadelius, C., & Rane, E. (1999).
Prostate cancer associated with CYP17 genotype. Pharmacogenetics, 9, 635-9.
Wang, D., Fan, J., Siao, C., Berno, A., Young, P., Sapolsky, R., Ghandour, G.,
Perkins, N., Winchester, E., Spencer, J., Kruglyak, L., Stein, L., Hsie, L.,
Topaloglou, T., Hubbell, E., Robinson, E., Mittmann, M., Morris, M., Shen,
N., Kilburn, D., Rioux, J., Nusbaum, C., Rozen, S., Hudson, T., Lander, E. S.,
et al. (1998). Large-scale identification, mapping, and genotyping of
single-nucleotide polymorphisms in the human genome. Science, 280, 1077-82.
Waterhouse, J., Muir, C., Shanmugaratnam, K., & Powell, J. (Eds.). (1982). Cancer
Incidence in Five Continents (Vol. IV). Lyon: International Agency for
Research on Cancer.
Whittemore, A. S., Kolonel, L. N., Wu, A. H., John, E. M., Gallagher, R. P., Howe,
G. R., Burch, J. D., Hankin, J., Dreon, D. M., West, D. W., et al. (1995).
Prostate cancer in relation to diet, physical activity, and body size in blacks,
whites, and Asians in the United States and Canada. J Natl Cancer Inst,
87(9), 652-61.
Wigley, W. C., Prihoda, J. S., Mowszowicz, I., Mendonca, B. B., New, M. I.,
Wilson, J. D., & Russell, D. W. (1994). Natural mutagenesis study of the
human steroid 5 alpha-reductase 2 isozyme. Biochemistry, 33(5), 1265-70.
Wu, A. H., Whittemore, A. S., Kolonel, L. N., John, E. M., Gallagher, R. P., West,
D. W., Hankin, J., Teh, C. Z., Dreon, D. M., & Paffenbarger Jr., R. S. (1995).
Serum androgens and sex hormone-binding globulins in relation to lifestyle
factors in older African-American, White, and Asian men in the United States
and Canada. CEBP, 4, 735-741.
Xu, J., Isaacs, W., Schleutker, J., Kallioniemi, O., Berry, R., Thibodeau, S.,
Gronberg, H., Jonsson, B., Smith, J., & Trent, J. (1998). Evidence for a
prostate cancer susceptibility locus on the X chromosome. Nature Genetics,
20, 175-179.
Asset Metadata
Creator
Pearce, Celeste Leigh (author)
Core Title
A large-scale genetic association study of prostate cancer in a multi-ethnic population
School
Graduate School
Degree
Doctor of Philosophy
Degree Program
Epidemiology
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
health sciences, oncology; health sciences, public health; OAI-PMH Harvest
Language
English
Contributor
Digitized by ProQuest (provenance)
Advisor
Henderson, Brian (committee chair), Altshuler, David (committee member), Coetzee, Gerhard (committee member), Ingles, Sue Ann (committee member), Ross, Ronald (committee member), Ursin, Giske (committee member)
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c16-228797
Unique identifier
UC11339082
Identifier
3073834.pdf (filename),usctheses-c16-228797 (legacy record id)
Legacy Identifier
3073834.pdf
Dmrecord
228797
Document Type
Dissertation
Rights
Pearce, Celeste Leigh
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA