Molecular classification, diagnosis and prognosis of pediatric rhabdomyosarcoma by oligonucleotide microarray analyses
MOLECULAR CLASSIFICATION, DIAGNOSIS AND PROGNOSIS OF
PEDIATRIC RHABDOMYOSARCOMA BY OLIGONUCLEOTIDE
MICROARRAY ANALYSES
by
Elai Davicioni
____________________________________________________________________
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(PATHOBIOLOGY)
May 2006
Copyright 2006 Elai Davicioni
UMI Number: 3237123
UMI Microform 3237123
Copyright 2006 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, MI 48106-1346
Dedication
To my parents Gino Davide Joshue and Vivienne Jill Penelope,
To my siblings Lisa Erez and Jesse Davicioni,
In loving memory of my grandfathers Israel Davicion Levi and Julius Efraim
Gurwitz and
To my grandmothers Rosa Eshua Kemalova and Helma Rahmiel Lazarus.
אחרית וראשית שלו
“The one whom the end and beginning are his”
Acknowledgements
It is my pleasure now to have the opportunity to express my gratitude to all those
involved in this thesis work. I would like to gratefully acknowledge the enthusiastic
support of my committee chair, Prof. Timothy J. Triche who gave me the
opportunity to work in the Tumor Biology group at the Saban Research Institute of
Children’s Hospital Los Angeles. Dr. Triche provided all aspects of the means to this
end: the procurement of financial support, the assembly of a multi-disciplinary team and,
most importantly, the conception, vision and inspiration for this thesis work. I would
also like to thank Margaret Triche for her warm hospitality during my stay in Los
Angeles. I am also indebted to my outside committee member, Prof. Jonathan D.
Buckley for his expertise and guidance in all matters concerning biostatistics and
bioinformatics and to his team at Epicenter Software for the development of our
primary software program, the Genetrix Suite for microarray analysis. I am also
obliged to my committee member, Prof. Michael J. Anderson, who provided all
manner of guidance, including professional advice and instruction in molecular biology
and paper writing, but most importantly taught me how to think like a scientist. I
gained a lot of experience from our collaborations on numerous projects especially
the work presented in Chapter 2, which was done in cooperation with post-doc Dr.
Friedrich Finckenstein who prepared the ectopic PAX-FKHR expression model and
co-authored the paper.
Special thanks to Dr. Deborah E. Schofield director of the Microarray CORE and
from the Department of Pathology, Betty Schaub, Sitara Widayantre, Xuan Chen,
Morgan Wu, Su-Ann Phung, and Dr. Samuel Wu who provided tremendous support
for the duration. I am grateful also to other members of the Triche Group including
Xian Feng Liu, Violette Shahbazian and Daniel H. Wai. Thank you very much to the
members of the Tumor Biology Study Group, which was most helpful over the last
few years: Prof. Elisabeth Lawlor, Prof. Ling-Tao Wu, Prof. William May, Dr.
Hyung-Goo Kang, Dr. George McNamarra, Dr. Siuwen Hu-Lieskovan, Dr. Sahab
Asgarzadeh, Dr. Srinivas Somianchi, Dr. Michele Wing, Keegan Warner, Long
Hung, and many others.
More thanks for generous support from the Department of Pathology to Lourdres
Cruz, Lisa Doumak, and Prof. Cheng-Ming Chuong, Prof. Louis Dubeau, Prof.
Robert Maxson and Prof. Clive Taylor. From the PIBBS Program I would like to
thank all those concerned that recruited me to USC and gave me a very fortunate
experience for my first year of graduate school, Prof. Michael Stallcup, Prof. Debbie
Johnston, Prof. Baruch Frenkel and Marisella Zuniga. Also thank you Prof. David
Warburton, Prof. David Hinton, Prof. Glenn Merlino, Prof. C. Pat Reynolds, Prof.
Robert Seeger, Prof. Norbert Brendt, Dr. Nino Keshleva and Jerry Barnhardt.
Finally, I would like to express my gratitude to my LA family, Ofer Tsruya and
David Abergel, and an honorable mention to the XXs.
Table of Contents
Dedication
Acknowledgements
List of Tables
List of Figures
Abstract
Introduction
Chapter 1: Molecular Diagnosis and Classification
    Introduction
    Results
    Discussion
Chapter 2: PAX-FKHR Expression Signature
    Introduction
    Results
    Discussion
Chapter 3: Molecular Staging
    Introduction
    Results
    Discussion
Chapter 4: Future Directions
Chapter 5: Materials and Methods
Bibliography
Appendix A: Supplementary Tables and Figures
Appendix B: List of publications and manuscripts
List of Tables
Table 1: Clinical characteristics of molecular classification data set
Table 2: Nearest Shrunken Centroids class predictors
Table 3: Multivariate analysis of RMS molecular classes
Table 4: ‘PAX-FKHR expression signature’
Table 5: Multivariate analysis of PAX-FKHR metagene
Table 6: TNM-based RMS staging
Table 7: RMS Clinical Groups
Table 8: IRS-V Risk Groups
Table 9: Prognosis genes for RMS metagene
Table 10: Multivariate analysis of RMS metagene
Supplementary Table 1: Clinical covariates of tumor samples
Supplementary Table 2: Meta-clustering gene list for molecular classes
Supplementary Table 3: EASE Overrepresentation analysis for molecular classes
Supplementary Table 4: Clinical characteristics of LOH data set
Supplementary Table 5: Proportion of LOH in RMS tumors
Supplementary Table 6: Clinical characteristics of RMS tumors used for ‘PAX-FKHR expression signature’ analysis
Supplementary Table 7: Meta-clustering gene list for mARMS vs mERMS
Supplementary Table 8: In vitro PAX-FKHR expression profile
Supplementary Table 9: Differentially expressed genes in RMS cell lines
Supplementary Table 10: EASE Overrepresentation analysis for PAX-FKHR expression profile
Supplementary Table 11: Ingenuity Networks
Supplementary Table 12: PAX-FKHR Expression Signature Network Genes
Supplementary Table 13: EASE Overrepresentation analysis Network 1
Supplementary Table 14: EASE Overrepresentation analysis Network 2
Supplementary Table 15: EASE Overrepresentation analysis Network 3
Supplementary Table 16: EASE Overrepresentation analysis Network 4
Supplementary Table 17: Clinical characteristics of RMS prognosis data set
Supplementary Table 18: Genes correlated to outcome in RMS
Supplementary Table 19: EASE Overrepresentation analysis of prognosis genes
List of Figures
Figure 1. Supervised and semi-supervised clustering analysis
Figure 2. Histology re-evaluation of ICR diagnosis
Figure 3. LOH analysis of tumors
Figure 4. Nearest Shrunken Centroids class predictions
Figure 5. Tissue microarray immunohistochemistry
Figure 6. Kaplan-Meier analysis of RMS molecular classes
Figure 7. Meta-clustering and PAX-FKHR expression in RMS
Figure 8. In vitro PAX-FKHR expression model
Figure 9. Principal Components analysis of RMS cell lines and PAX-FKHR expression model
Figure 10. ‘PAX-FKHR expression signature’ clustering of primary tumors
Figure 11. Kaplan-Meier analysis of PAX-FKHR metagene
Figure 12. Kaplan-Meier analysis of the best single gene outcome predictor
Figure 13. Hierarchical clustering of RMS tumors using prognosis genes
Figure 14. Kaplan-Meier analysis of RMS metagene
Figure 15. Correlation between QRT-PCR and U133A microarray expression levels for PAX3 and PAX7
Figure 16. HuEx1.0 and QRT-PCR comparison for PAX7
Figure 17. HuEx1.0 analysis of PAX3 and FKHR
Figure 18. Alternative splicing detected in ASS1
Figure 19. QRT-PCR analysis of PAX-FKHR expression in ARMS
Figure 20. Cox regression evaluation of metagenes
Supplementary Figure 1. K-means centroids used for Molecular Classes
Supplementary Figure 2. Microarray expression levels for RMS molecular class markers
Supplementary Figure 3. Microarray expression levels for TFAP2β and HMGA2
Supplementary Figure 4. Ingenuity Pathway Analysis networks
ABSTRACT
Pediatric rhabdomyosarcomas (RMS) are a heterogeneous group of tumors defined by
their histological resemblance to developing skeletal muscle cells. RMS occurs in
two major histological subtypes, embryonal (ERMS) and alveolar (ARMS). Most
ARMS express PAX3- or PAX7-FKHR (PAX-FKHR) fusion genes but up to 30% of
ARMS are fusion-negative. ERMS is associated with 11p15.5 loss-of-heterozygosity
(LOH) but is often confused with non-myogenic non-rhabdomyosarcoma soft tissue
sarcomas (NRSTS). Due to these confounding factors, it is unclear whether RMS
represent a single disease or multiple clinical and biological entities with a common
phenotype. Treatment of RMS is risk-adapted and recognizes the diverse outcomes
associated with RMS subtypes but because of the diagnostic uncertainty associated
with RMS, diagnosis based solely on microscopic appearance may lead to
inappropriate therapy. Here we show that an objective genomic classification derived
from microarray gene expression profiling and LOH analysis of 160 cases of RMS is
possible for this complex disease. We found that ARMS tumors expressing either
PAX-FKHR gene share a common expression profile distinct from fusion-negative
ARMS. Using an in vitro PAX-FKHR expression model we identified a gene
expression signature regulated by PAX-FKHR that is specific to PAX-FKHR positive
ARMS tumors and observed that a minimum expression level of PAX-FKHR is
necessary for the detection of this expression profile. The gene expression profile
and pattern of LOH of fusion-negative ARMS are indistinguishable from those of
conventional ERMS, and myogenic gene expression distinguishes all true RMS from
non-myogenic NRSTS. Implementation of the Nearest Shrunken Centroids algorithm
identified a minimal set of 10 genes capable of making a differential diagnosis of
ARMS, ERMS and NRSTS with an estimated error rate of only 2%. Furthermore,
we used immunohistochemistry and quantitative RT-PCR to validate two novel
markers, TFAP2β and HMGA2, that are useful for differential diagnosis in
paraffin-embedded clinical material. We also developed a 34-metagene continuous predictor
of outcome using Cox regression modeling that segregated RMS patients into three
risk groups independent of clinical risk factors. Our results demonstrate that
molecular classes based solely on genomic analysis at diagnosis are objectively
derived, reproducible and highly predictive of outcome but are also at variance with
conventional histopathologic criteria. Adoption of these genomic classifiers would
likely lead to better diagnosis, patient management and improved therapeutic
outcome.
Introduction
Historically, human malignancies are described on the basis of the presumed
normal tissue thought to have undergone malignant transformation (Triche et al.,
2001). In this context, nearly all adult solid neoplasia are carcinomas of epithelial
derivation and as such are named primarily according to the organ system (e.g.,
prostate adenocarcinoma for tumors of the prostate gland). In contrast, solid tumors
of childhood are almost exclusively ascribed to malignant tumors of mesenchymal
cell origin or sarcomas. The ‘small blue round cell tumors’ (SBRCT) of childhood, a
designation that includes rhabdomyosarcomas, non-Hodgkin’s lymphomas, Ewing’s
family tumors and neuroblastomas based on their similar morphological appearance,
are notoriously difficult to distinguish by light microscopy alone. These tumors show
a broad range of phenotypes but often have very little evidence of differentiation on
the morphological level and appear primitive or embryonal in appearance. Currently,
in the clinic several methodologies (e.g., immunohistochemistry, cytogenetics and
RT-PCR) are employed to develop a definitive diagnosis, which is critical due to the
variable prognoses associated with these tumors (Schofield and Triche, 2002).
SBRCTs are therefore a prime example where the rapidly emerging field of gene
expression profiling can be used to categorize cancers of unknown origin (Tothill et
al., 2005) or those cancers with a difficult differential diagnosis into clinically
relevant subgroups (Khan et al., 2001).
The nomenclature systems developed by modern pathologists, although
convenient, do not inherently reflect the molecular basis of these pathologies but
rather the phenotypic consequences of tumorigenesis. Given what is known about the
etiology and epidemiology of childhood cancer, it is likely that aberrations of normal
developmental processes play a central role in the tumorigenic process especially
those regulating cell fate and differentiation (Harris, 2005). It is clear that single-gene
analyses (i.e., searching for Holy Grails: ‘tumor suppressors’ and ‘oncogenes’) are
relics of the past. In this genomic age, studies of systems biology (networks of genes
and coordinated transcriptional programs), advanced by computer scientists and
bioinformaticians and adopted by developmental and cancer biologists, are coming
to the forefront (Segal et al., 2005). In the light of recent observations, cancer increasingly
appears as a disease of stem-cell differentiation (Bjerkvig et al., 2005; Dean, 1998)
whether the consequence of de-differentiation (da Costa, 2001), trans-differentiation
(Zhang et al., 2001) or undifferentiation (Capp, 2005). Cancer as a disease of
development is not a new idea; in fact, recognition by pathologists of shared
phenotypes between pediatric sarcomas and developing normal tissue initially formed
the basis of diagnosis for most of these pathologies.
In a sense, whole-genome analyses of cancer have much in common with
classical histopathology studies; essentially, these methods are the science (and in
many cases, the art) of pattern recognition. However, advances in technologies and
methodologies are opening up new fields of research such as whole-genome studies
of molecular epidemiology (Chen and Hunter, 2005), germline polymorphisms
(Hunter, 2004), proteomics (Danna and Nolan, 2006; Petricoin et al., 2005) and
epigenetics (Schumacher et al., 2006) beyond the confines of classical tumor
pathobiology. Several whole-genome analyses of SBRCTs (Baird et al., 2005;
Henderson et al., 2005; Khan et al., 2001) and rhabdomyosarcoma (Bortoluzzi et al.,
2005; De Pitta et al., 2005; Khan et al., 1998; Lu et al., 2001; Schaaf et al., 2005;
Wachtel et al., 2004) have been reported. Due to the rarity of rhabdomyosarcoma,
only a small number of cases have been analyzed thus far, limiting the scope of the
conclusions presented in these works. In this thesis, I present work towards
developing genomic-based classifiers (i.e., mRNA expression and DNA
polymorphism profiles) from the largest cohort of rhabdomyosarcoma primary tumors
assembled to date, supplemented with analysis of in vitro models and RMS cell lines.
Rhabdomyosarcoma (RMS), from the Greek words rhabdo, meaning rod
shape, and myo, meaning muscle, was first described in 1854 by Weber, but a clear
histologic definition recognizing the distinct morphology of rhabdomyoblasts
appearing in ‘round’, ‘strap’, ‘racquet’ and ‘spider’ forms was only described by
Stout in 1946 (Stout, 1946). It is commonly believed that RMS arises from
mesenchymal progenitors that have undergone some degree of skeletal muscle
differentiation, although in fact the cell(s)-of-origin of RMS remains subject to
speculation (Pappo et al., 1999b). Intriguingly, RMS tumors are found at diverse
anatomical sites where striated skeletal muscle tissue is normally absent (e.g.,
bladder, which is surrounded by smooth muscle) and rhabdomyosarcoma cells
devoid of any overt myogenic phenotype often undergo chemotherapy-induced
myogenic differentiation both in vitro and in vivo (Klunder et al., 2003; Leuschner et
al., 2002; Smith et al., 2002). Regardless, the recognition early on that these tumors
show evidence of myogenic differentiation formed the basis for the diagnosis and the
collectivization of these tumors into a clinically relevant diagnostic group (Tsokos,
1986).
The annual incidence of RMS in children and adolescents is approximately 7
cases per million, with about 250 new cases diagnosed each year in the United States.
RMS accounts for up to 8% of all pediatric cancers and is the third most common
extracranial solid tumor of childhood after neuroblastoma and Wilms’ tumor (Punyko
et al., 2005). The relative rarity of the disease is a major obstacle to progress in
understanding the biology and investigating treatment of RMS (Newton et al., 1999).
Large multi-institutional cooperative group studies such as the Intergroup
Rhabdomyosarcoma Study Group (IRSG) in North America and the International
Society of Pediatric Oncology Malignant Mesenchymal Tumor (SIOP, MMT) study
group in Europe have proven
themselves as an efficient and effective means to study RMS and related pathologies
(Crist et al., 1990; Donaldson and Anderson, 2005). These cooperative group studies
have produced invaluable information and resulted in dramatic improvements in
patient survival over the years with some of this progress stemming from the
identification of numerous clinicopathological risk factors which have resulted in the
(ongoing) development of several RMS risk grouping systems.
Similar to other pediatric tumors, such as acute myeloid leukemia, this
diverse group of tumors has several histologic and genetic subtypes that are
clinically associated with diverse patient outcomes (Qualman and Morotti, 2002).
The main histological subtypes, recognized for their prognostic significance and
reproducibility of diagnosis, include embryonal rhabdomyosarcoma (ERMS, ~60%
of patients), the botryoid and spindle cell variants of ERMS (~5% of patients),
alveolar rhabdomyosarcoma (ARMS, ~20% of patients) and undifferentiated
sarcoma (UDS, ~20% of patients) (Qualman et al., 1998). Classical ERMS histology
resembles skeletal muscle cell histology during embryogenesis and fetal
development, consisting of spindle-shaped cells with eosinophilic cytoplasm and an
abundant stromal compartment. This pattern, however, is highly variable, ranging
from poorly differentiated tumors with no morphological evidence of skeletal
muscle (which are especially difficult to diagnose) to well-differentiated tumors with a striking
resemblance to fetal skeletal muscle. Histological preparations of ARMS have a
pattern with open spaces resembling pulmonary alveoli and appear as round or
polygonal cells adhering to fibrous stromal septae (Enterline and Horn, 1958;
Riopelle and Thériault, 1956). Solid forms of ARMS (i.e., no open spaces) are also
recognized, although not as a distinct clinical entity, and present an additional
confounding factor for the (histological) differential diagnosis (Tsokos, 1994). Two
age peaks tend to be associated with different anatomical locations and tumor
histologies. Most of the incidence of RMS occurs in the first decade of life and these
patients tend to have head and neck or genito-urinary tract primary tumors with
embryonal histology. In contrast, adolescents more often present with
extremity lesions with alveolar histology. The botryoid variant of ERMS arises only
in mucosal cavities, such as the bladder, vagina and nasopharynx, and the spindle cell
variant is rarely found in sites other than the paratesticular region (Qualman et al.,
1998). As mentioned above, RMS can arise anywhere in the body except in
bone, and the most common sites for distant metastases are lung, bone marrow,
lymph nodes and bone (Raney et al., 2001). Treatment response
and prognosis vary widely depending on location and histology, so when clinical
stage and other variables are taken into account, survival rates range from <20% for
patients with metastatic alveolar tumors to over 95% for some localized embryonal
disease (Crist et al., 2001; Pappo et al., 1999a).
As with most tumors of childhood, the cause of RMS is unknown. The
majority of RMS appears to be sporadic in nature although some genetic factors have
been implicated such as the occurrence of the disease in siblings and in patients with
multiple neoplasms (Birch et al., 1990; Hartley et al., 1993). Hereditary diseases
such as Li-Fraumeni syndrome (TP53 germline mutations) (Li and Fraumeni, 1969),
neurofibromatosis type I (NF1 germline mutations) (Matsui et al., 1993) and a fetal
overgrowth disorder called Beckwith-Wiedemann syndrome (deregulation of
imprinted genes within the 11p15 region) (Cohen, 2005) have been linked to an
increased incidence of RMS tumors (Merlino and Helman, 1999). Much more is
currently known about recurrent somatic alterations in RMS primary tumors.
Chromosomal studies have identified nonrandom translocations exclusively in
ARMS (Turc-Carel et al., 1986). The predominant translocation t(2;13)(q35-37;q14)
and a variant t(1;13)(p36;q14) (Douglass et al., 1991) were identified and later
molecular studies revealed that these resulted in the rearrangement and fusion of the
PAX3 and PAX7 genes, respectively, to FKHR (forkhead in rhabdomyosarcoma,
FOXO1A) (Barr, 2001). ERMS do not demonstrate recurrent chromosomal
translocations. Instead, they show greater genomic instability (manifested as highly
variable karyotypes) and recurring allelic imbalances such as loss-of-heterozygosity
(LOH) at 11p15.5 (Besnard-Guerin et al., 1996; Visser et al., 1997) and trisomies of
chromosome 8 (Anderson et al., 1999b; Lee et al., 1993).
Genomic analyses have made an impact on the field of pathology, redefining
tumor classes based on molecular features (Greer and Khan, 2004; Khan et al., 2001)
and identifying novel tumor subclasses unrecognized by conventional
methodologies such as histology and cytogenetics (Ebert and Golub, 2004). Further
strides have been made towards molecular staging and prediction of disease outcome
(Mano, 2004; Simon, 2003; van de Vijver et al., 2002). In this thesis, I will focus on
three immediate challenges of interest to both basic and clinical researchers of
rhabdomyosarcoma, outlined below and described in detail in the following three
chapters:
1. Correctly diagnose and classify rhabdomyosarcoma subtypes using gene
expression and DNA polymorphism profiling.
2. Characterize the role of PAX-FKHR fusion genes and identify a PAX-FKHR
‘expression signature’ relevant to the diagnostic classification and clinical
behavior of the most aggressive subtype, alveolar rhabdomyosarcoma.
3. Identify the genes most predictive of prognosis and develop multivariate
gene expression-based models as novel prognostic classifiers for all
rhabdomyosarcoma patients.
Chapter 1: Microarray Analyses of RMS- Molecular Diagnosis
and Classification
Introduction
Rhabdomyosarcoma (RMS) defines a group of histologically and genetically
heterogeneous sarcomas. Although RMS was first characterized in the 19th century,
it took another 100 years before distinct subtypes of rhabdomyosarcoma were
recognized (Horn and Enterline, 1958; Riopelle and Thériault, 1956) and the
‘conventional’ classification scheme was adopted (Enterline and Horn, 1958). Prior to the widespread use of
multi-agent chemotherapy in the 1970s, the prognosis of patients with
rhabdomyosarcoma was dismal with few survivors (Sutow et al., 1970). The
Intergroup Rhabdomyosarcoma Study Group (IRSG) was formed as a cooperative
effort to investigate the biology and treatment of RMS and undifferentiated sarcoma
(UDS). This led to a dramatic improvement in patient survival, from 25% before the
first protocol (IRS-I) to approximately 71% on IRS-IV (Crist et al., 1995; Crist et al.,
2001; Crist et al., 1990). The substantial progress made in patient outcome led to the
observation that subclassification of RMS (i.e., the ‘conventional’ scheme) was of
potential prognostic value, although only in univariate analysis (Parham, 2001).
Further refinements and alternate classification schemes have been offered
over the years. Retrospective analysis of IRS-I and IRS-II material yielded the
cytohistological classification (Palmer, 1981) based solely on nuclear morphology
rather than cytoplasmic differentiation and found increased risk of death for patients
with tumors composed of ‘monomorphous round cell rhabdomyosarcoma’. The
International Society of Pediatric Oncology (SIOP), a multi-institutional European
effort, published another classification system that emphasized cytologic
differentiation (i.e., myogenesis) and cellular density (Caillaud et al., 1989). Superior
outcomes were observed in highly myogenic RMS such as ‘loose botryoid’ and
‘well-differentiated embryonal’ RMS. The National Cancer Institute (NCI)
classification revised the conventional classification by assigning solid alveolar
tumors lacking fibrous septae to the alveolar category (Tsokos et al., 1992). In
retrospective analysis, the NCI classification was the first scheme shown to be an
independent prognostic factor in multivariate analysis (Tsokos, 1994). Finally, an
international panel of experts was assembled (including representatives of each
classification scheme) to test the overall level of agreement within and among
pathologists for each of the four major classification schemes (Asmar et al., 1994).
The result of this effort was yet another classification system, a consensus
scheme called the International Classification of Rhabdomyosarcoma
(ICR), which has become the mainstay of clinical trials conducted by the IRSG
(presently the Children’s Oncology Group Soft Tissue Sarcoma Committee, COG
STS) and has not only proven prognostic power but also increased the reproducibility
and reliability of the pathological diagnosis (Newton et al., 1995). The ICR recognizes
the conventional scheme (i.e., alveolar and embryonal forms) but also includes the
superior-prognosis leiomyomatous or spindle cell (Cavazzana et al., 1992) and botryoid
embryonal variants and the poor-prognosis undifferentiated sarcoma category.
Rhabdomyosarcoma and sarcoma, not otherwise specified (NOS) categories are also
retained in the ICR for cases in which a definitive diagnosis cannot be made because
of suboptimal biopsy material.
Consistent chromosomal translocations (Barr, 1997; Barr, 2001) that result in
the expression of chimeric transcription factors, PAX3-FKHR or PAX7-FKHR
(PAX-FKHR) are detected by molecular genetic techniques exclusively in ARMS.
Initially, this was thought to provide an objective basis for distinguishing the two
major forms of the disease (Barr, 1997). However, analyses of large series of cases
consistently fail to show a one-to-one association of the translocation with alveolar
histology; approximately 25-30% of cases possess classic alveolar histology but lack
a translocation (Barr et al., 2002; Sorensen et al., 2002). ERMS do not demonstrate
recurrent chromosomal translocations. Instead, they show greater genomic instability
(manifested as highly variable karyotypes) and recurring allelic imbalances such as
loss-of-heterozygosity (LOH) at 11p15.5 (Besnard-Guerin et al., 1996; Visser et al.,
1997).
As the name implies, RMS is presumed to show at least some evidence of
rhabdomyogenesis or skeletal muscle differentiation. The spectrum of histological
differentiation of this tumor varies and typically, well differentiated RMS has cross-
striations or rhabdomyoblasts that allow for a confident morphologic diagnosis
without adjunct studies. However, in virtually all cases, morphologic evidence of this
is limited to a small percent of the tumor cells, while in many cases there is no
morphologic evidence of such differentiation and in at least 20% of RMS cases
immunohistochemistry (IHC) is required for a definitive diagnosis. The use of
antibodies for myogenesis-associated proteins such as desmin and MyoD has aided
the detection of myogenesis in such cases (Cessna et al., 2001; Tsokos, 1986). This
issue is of some clinical relevance, especially since cases lacking any evidence of
myogenesis, the undifferentiated or non-rhabdomyosarcoma soft-tissue sarcomas
(UDS or NRSTS) show poor outcome (Cessna et al., 2001; Raney et al., 2001;
Schmidt et al., 1986; Tsokos, 1986). These histologically featureless tumors are thus
a diagnosis of exclusion but have been included in RMS treatment protocols as they
respond to therapy in a similar manner to ARMS (Qualman et al., 1998).
As described above, the ICR criteria resulted in remarkable improvements in
the reproducibility of diagnosis among patients as well as providing a platform for
survival models that are predictive of patient outcome (Asmar et al., 1994; Newton et
al., 1995). However, despite these exhaustive studies aimed at establishing
reproducible diagnostic criteria, as many as a third of patients are incorrectly
managed on treatment protocols due to inconsistency and uncertainty in diagnosis
(Qualman et al., 1998). The classical approach to tumor diagnosis, morphological
and cytological examination of hematoxylin-eosin stained histology sections, is a
proven methodology but not without limitations and, more often than not, the
pathologist requires additional costly and time-consuming ancillary assays. In this
chapter, we examine whether a new molecular-based classification derived from
analysis of gene expression profiles and whole-genome patterns of LOH is better
suited to define a heterogeneous disease such as RMS. Specifically, we were
interested in comparing the histological diagnosis to molecular-based classes,
identifying any novel classes of RMS tumors and testing the reproducibility of
diagnosis using microarray technology. Here we report our results from the analysis
of the initial diagnostic biopsy from 160 cases of RMS drawn largely from the IRS-IV
study conducted by the COG (Breneman et al., 2003; Crist et al., 2001; Raney et
al., 2001). Based on our genomic analysis of the clinical samples using 22,000-gene
expression arrays and 10,000-SNP arrays, corroborated by RT-PCR and
immunohistochemistry of primary tumor material, we have identified homogeneous
classes of this tumor that differ markedly from current convention. These
molecularly defined classes appear to be superior to current histopathologic methods
of diagnosis, are highly reproducible, objective and impart prognostically relevant
information that may be useful in optimizing risk-adapted therapy.
Results
Supervised and Semi-supervised Cluster Analysis
To define the specific gene expression patterns associated with distinct histological
RMS variants we generated gene-expression profiles for a cohort of 160 primary
diagnostic biopsies of RMS and UDS/NRSTS (Table 1 and Appendix,
Supplementary Table 1). The proportion of RMS histological variants in this data set
is representative of their incidence as reported in IRSG clinical trials, although the
number of cases with alveolar histology was increased in order to include more
translocation-negative alveolar tumors for analysis (Qualman et al., 1998). We
initially hypothesized that each of the histological RMS variants would be associated
with distinct expression profiles, reflecting biological and clinicopathological
differences. To test this, we initially performed supervised class comparison utilizing
analysis of variance (ANOVA) which identified 707 genes (data not shown),
differentially expressed between tumors with alveolar, embryonal, spindle/botryoid
(amalgamated due to limited numbers of botryoid samples in the cohort) and
undifferentiated sarcomas/NRSTS.
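The per-gene class comparison described above can be illustrated with a minimal sketch of the one-way ANOVA F-statistic. This is not the software used in this study: the function name `anova_f`, the gene labels and the expression values are hypothetical, and only the textbook F-ratio (between-group over within-group mean squares) is shown, without p-values or multiple-testing control.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F-statistic for a single gene.

    `groups` is a list of lists: expression values for that gene,
    one inner list per histological group.
    """
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of samples
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf")                  # degenerate case: no within-group spread
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical toy data: two genes measured in three groups of three tumors.
expression = {
    "GENE_A": [[9.0, 10.0, 11.0], [1.0, 2.0, 3.0], [5.0, 6.0, 7.0]],
    "GENE_B": [[4.0, 5.0, 6.0], [5.0, 4.0, 6.0], [6.0, 5.0, 4.0]],
}

# Rank genes from most to least differentially expressed between groups.
ranked = sorted(expression, key=lambda g: anova_f(expression[g]), reverse=True)
print(ranked)
```

Genes whose group means differ strongly relative to their within-group scatter rank first; a real analysis would convert F to a p-value and correct for the number of genes tested.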
Table 1. Clinical Characteristics of 160 RMS/NRSTS Patients
                                               No.    %
Histology        alveolar                       66   41
                 mixed alveolar/embryonal        4    3
                 embryonal                      69   43
                 botryoid                        3    2
                 spindle                         6    4
                 undifferentiated or NRSTS *    12    8
Clinical Groups  IA & IB                        32   28
                 IIA & IIB                      13   11
                 III                            44   38
                 IV                             26   23
Alive/Dead       Alive                         105   66
                 Dead                           53   34
Gender           Male                           98   65
                 Female                         52   35
Primary Site     Orbit                           5    3
                 Head/Neck                      12    8
                 Parameningeal                  22   14
                 Bladder/prostate               13    8
                 Genitourinary Other †          28   18
                 Extremity                      39   25
                 Other                          34   22
Age Groups       <1                              7    5
                 1-4                            50   33
                 5-9                            59   39
                 10-14                          22   15
                 >15                            13    9
Note: Some categories do not total correctly because of incomplete clinical
covariate data
* Review diagnosis of undifferentiated sarcoma or non-rhabdomyosarcoma
soft-tissue sarcoma
† non-bladder/prostate genitourinary tumors
Hierarchical clustering analysis of the 160 tumors yields a dendrogram split into
two main branches: most of the tumors with alveolar histology cluster together on
one branch, and the remaining embryonal and spindle/botryoid variants comprise the
second branch (Figure 1A). A small clade off the main branch of embryonal tumors
contained most of the UDS/NRSTS tumors. In addition, we observed 15 tumors with
alveolar histology that clustered among the embryonal variants.
The results from this supervised analysis therefore revealed discrepancies
between the ICR-based histopathological classification and the tumor expression
profiles. In order to further investigate and resolve these apparent differences, we
turned to a semi-supervised learning approach utilizing a reiterative strategy called
meta-clustering. This approach uses ANOVA again for supervised gene selection but
implements an unsupervised k-means method to refine sample clustering over
numerous iterations. Sample membership in k-means cluster centroids was
determined by the cumulative similarity of their gene expression profiles controlled
for centroid false-discovery under sample cross-validation (Simon et al., 2003). After
1,000 rounds of meta-clustering, 692 genes (data not shown) were identified and
used to plot the multi-dimensional scaling analysis of the cross-validated ‘test’ set
(Figure 1B). Tumors were assigned to one of three cluster centroids and, similar to
the results of the supervised hierarchical clustering analysis in Figure 1A, most tumors
with alveolar histology (n=55) clustered together in a dense group. A second centroid
consisted of tumors with embryonal, spindle/botryoid and alveolar histology forming
a cluster on the z-axis (n=71). The third cluster along the y-axis consisted of
UDS/NRSTS and embryonal histology tumors (n=33). In agreement with the
findings from supervised analysis, a significant number of alveolar tumors and one
tumor with mixed alveolar/embryonal histology (n=15), representing 21% of all
alveolar histology tumors, clustered with the embryonal tumors on the z-axis.
Accordingly, diagnostic RT-PCR assays for the expression of PAX-FKHR fusion
gene mRNA revealed that all of the alveolar tumors that form the main cluster of
alveolar tumors express fusion gene transcripts. A more detailed analysis of PAX-
FKHR expression using quantitative RT-PCR (QRT-PCR) is presented in Chapter 2.
In contrast, those 15 tumors with alveolar histology that clustered with the
embryonal and spindle/botryoid RMS tumors did not express PAX-FKHR fusion
gene (Figure 1B, purple dots).
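The unsupervised step inside meta-clustering, assigning each sample to its nearest k-means centroid and then updating the centroids, can be sketched as follows. This is only plain Lloyd-style k-means under stated assumptions: the ANOVA gene reselection and the cross-validation control described above are not reproduced, and the function name, toy coordinates and starting centroids are hypothetical.

```python
def kmeans(samples, centroids, rounds=10):
    """Plain k-means: assign samples to the nearest centroid, then
    recompute each centroid as the mean of its members."""
    def dist2(a, b):
        # Squared Euclidean distance between two expression profiles.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = []
    for _ in range(rounds):
        labels = [
            min(range(len(centroids)), key=lambda c: dist2(s, centroids[c]))
            for s in samples
        ]
        new_centroids = []
        for c in range(len(centroids)):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break                                   # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

# Hypothetical two-gene profiles for six tumors and three starting centroids.
samples = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [0.1, 5.0], [0.0, 5.2]]
labels, centroids = kmeans(samples, [[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
print(labels)
```

In a semi-supervised scheme of the kind described here, the resulting labels would then serve as the class variable for the next round of ANOVA gene selection.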
Next, we repeated the meta-clustering algorithm using the three k-means cluster
centroids identified in Figure 1B to refine gene selection using ANOVA.
After this second round of meta-clustering, we found that gene expression profiles of
the tumors were consistent with further subdivisions of the tumors yielding five k-
means cluster centroids (Appendix, Supplementary Figure 1). The results from this
analysis can be appreciated by inspection of the expression matrix shown in Figure
1C. This expression matrix is derived from two-way hierarchical clustering of
samples and the 530 genes identified from the second round of meta-clustering
analysis (Appendix, Supplementary Table 2); the colored bars above the samples
indicate the k-means cluster centroid determinations.
Figure 1 Supervised and semi-supervised cluster analysis. A: Supervised testing with
ANOVA was used to identify 707 differentially expressed genes between ICR histological
groups. The dendrogram derived from hierarchical clustering of 160 tumor samples
depicts two main branches, which appear to cluster most tumors with alveolar histology on
the right branch. B: The results of the first round of semi-supervised clustering of tumors
visualized in a multi-dimensional scaling plot of the cumulative simulated ‘test’ set
similarity matrix. Samples are color coded by histology review diagnosis as in A, except
fusion-negative ARMS tumors (purple dots). Colored circles indicate three cluster
centroids identified in this analysis. C: Expression matrix depicting the expression patterns
of 530 genes among 160 tumors as identified by a second round of semi-supervised
clustering (Appendix, Supplementary Figure 1). Sample clusters and gene groups were
derived from hierarchical clustering. Clustered gene groups are labeled to the left of the
expression matrix. The expression of each gene in each sample was normalized in the
pseudo colored heatmap by the number of standard deviations above (red) and below
(blue) the median expression value (white) across all samples.
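The heatmap scaling described in the caption for panel C, in which each gene is expressed as the number of standard deviations above or below its median across samples, can be written as a short sketch. The use of the sample standard deviation and the toy values are assumptions; the caption does not state which estimator was used.

```python
from statistics import median, stdev

def median_centered_scores(values):
    """Express each value as the number of standard deviations
    above (+) or below (-) the gene's median across all samples."""
    med = median(values)
    sd = stdev(values)  # sample standard deviation (an assumption)
    return [(v - med) / sd for v in values]

# Hypothetical expression values for one gene in three tumors.
scores = median_centered_scores([2.0, 4.0, 6.0])
print(scores)
```

Positive scores would be drawn in red, negative scores in blue and scores near zero in white.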
PAX-FKHR ARMS tumors were split into two subgroups labeled A1 (n=28) and A2
(n=27, red and orange bars, respectively). The majority of ERMS were split between
two groups, designated E1 (n=29) and E2 (n=62), which also included the
fusion-negative alveolar and spindle/botryoid tumors as well as four tumors diagnosed with
UDS/NRSTS histology (blue and green bars, respectively). The subgroup designated
as N (n=14) contained most of the UDS/NRSTS tumors but also six tumors with
embryonal histology (brown bar). The five tumor groups are defined by four clearly
identifiable gene expression patterns (labeled to the left of the expression matrix,
Groups I-IV, see Supplementary Table 2). Group I (n=67) genes were expressed in
most of the RMS tumors but not in the UDS/NRSTS and the embryonal tumors that
clustered with them (subclass N, brown bar). Group II (n=131) genes appeared to be
expressed in the A1 subclass of ARMS and the E1 subclass of ERMS
tumors (red and blue bars, respectively) but not in other RMS subclasses or the
UDS/NRSTS subclass. Groups III (n=94) and IV (n=238) genes were highlighted by
their low or high expression, respectively, in both ARMS subclasses (A1 and A2).
Note that we did not find an expression signature specific to the UDS/NRSTS N class;
rather, these tumors are characterized by their lack of expression of Group I and II genes.
Biological Themes of Expression Profiles
In order to better understand the functional significance of these distinct tumor
expression patterns, we used overrepresentation analysis of Gene Ontology (GO)
annotations for the identification of “biological themes” and the functional
significance of the four gene groups (Appendix, Supplementary Table 3). This
technique statistically evaluates the fraction of genes in a given Gene Ontology
category found among a given list of genes queried. The gene groups identified by
meta-clustering were analyzed by the Expression Analysis Systematic Explorer
(EASE) program (Hosack et al., 2003). Group I genes, expressed in most RMS but
not the UDS/NRSTS or the embryonal RMS that co-clustered with them (subclass
N), are enriched for GO categories such as ‘muscle development’, ‘muscle
contraction’ and ‘myogenesis’. Genes such as the muscle-specific transcription
factors MYOD and MYOG, known for their specificity in the diagnosis of RMS by
immunohistochemistry (Cessna et al., 2001) and novel markers (i.e., FGFR4 and
CDH15) are characteristic of this group (Appendix, Supplementary Figure 2A).
Group II genes, which differentiate the subclasses within ARMS and ERMS tumors
(expressed in A1 and E1 only), were overwhelmingly enriched for structural/functional
components of muscle such as the sarcomere and muscle contraction apparatus. Over
30% of the genes in this group are associated with more advanced stages of
terminally differentiated striated muscle (i.e., myosin light and heavy chain
isoforms). These genes are lowly expressed in the ‘poorly differentiated’ A2 and E2
RMS subclasses and not expressed at all in subclass N. Group III genes, which are not
expressed in either of the mARMS subclasses but are expressed in all other tumors, were
enriched for genes localized to chromosome 8, which is often detected as a
trisomy in embryonal but not alveolar RMS (Anderson et al., 1999a; Gordon et al.,
2001). Finally, Group IV genes, which account for nearly half of all the differentially
expressed genes identified, were predominantly expressed in mARMS tumors and
enriched for the GO Biological Process categories of neurogenesis and central nervous
system development. Genes such as transcription factor AP-2 β (TFAP2 β) and
neurogenic differentiation 2 (NEUROD2) are representative of this group.
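The overrepresentation test underlying this analysis can be sketched as a one-sided hypergeometric (Fisher exact) probability: given how many genes on the array carry a GO annotation, how likely is it that a gene list of a given size contains at least the observed number of annotated genes by chance? The sketch below shows only this unmodified calculation. The EASE program reports its own, more conservative variant of such a score, which is not reproduced here, and all counts other than the Group I list size (n=67) are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(list_hits, list_total, pop_hits, pop_total):
    """P(X >= list_hits) when list_total genes are drawn without replacement
    from pop_total genes, of which pop_hits carry the annotation."""
    upper = min(list_total, pop_hits)
    denom = comb(pop_total, list_total)
    numer = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, upper + 1)
    )
    return numer / denom

# Hypothetical counts: 20 of the 67 Group I genes carry a muscle-related
# annotation that is attached to 200 of 22,000 genes on the array.
p = hypergeom_upper_tail(list_hits=20, list_total=67, pop_hits=200, pop_total=22000)
print(p)
```

A small probability indicates that the category occurs in the gene list more often than its frequency on the array would predict.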
The expression of genes associated with more advanced muscle
differentiation defines subclasses within the mARMS and mERMS tumors. Markers
characteristic of advanced or terminal muscle differentiation such as the
structural/functional muscle myosin isoforms are differentially expressed between
RMS molecular classes (Appendix, Supplementary Figure 2B). Previous RMS
classification schemes (e.g., SIOP, NCI) have accounted for the degree of
rhabdomyoblastic differentiation in their classifications of RMS (Newton et al.,
1995). Indeed, these microarray data support the original observations made by
cytohistology, immunochemistry or electron microscopy that RMS tumors display
varying degrees of rhabdomyoblastic differentiation. For the most part, the ICR
based histology review diagnosis correlated well with the molecularly defined
classes. However, there were discrepancies between these two classification schemes
for 25 (15%) tumors (i.e., 15 fusion negative alveolar, 6 embryonal and 4
UDS/NRSTS). These differences were not apparently due to misdiagnosis of tumor
histology on the part of the pathology review center (see below). Rather, these
discrepancies mostly represent conceptual differences between the two methods of
classification. Reflecting these differences, we henceforth use the following
nomenclature for RMS tumors. All alveolar tumors that express PAX-FKHR fusion
genes are classified as molecular ARMS (mARMS), whereas all fusion-negative
RMS tumors (including alveolar, embryonal and spindle/botryoid) are classified as
molecular ERMS (mERMS). UDS/NRSTS and the embryonal RMS tumors that co-
cluster with them, we reclassify as molecular non-rhabdomyosarcoma soft-tissue
sarcomas (mNRSTS).
Histology Re-Evaluation
We managed to obtain H&E stained slides for a number of tumors whose
histological diagnosis did not seem to correlate with the gene expression-based molecular
diagnosis. We confirm the original histology diagnosis for four fusion-negative
alveolar tumors (two representative tumor sections are depicted) whose molecular
profile was consistent with the mERMS class (Figure 2A and B). These tumors
showed characteristic alveolar architecture and cell morphology. In addition, we
obtained four tumors diagnosed with embryonal histology. Figure 2C and D depict
two of these tumors and although they show minimal cytoplasmic differentiation
characteristic of myogenic differentiation and could be considered atypical for
ERMS, we confirm the ICR review diagnosis for these tumors. However, our gene
expression based classification found these tumors to cluster with the mNRSTS
molecular class. In contrast, a highly myogenic tumor originally diagnosed as a
NRSTS, a pleuropulmonary blastoma (Figure 2E) shows characteristic myogenic
features and hence clustered with the mERMS tumors. Another tumor diagnosed as
an NRSTS, a diffuse, featureless tumor does not show any overt signs of myogenesis
by histology (Figure 2F) but by expression analysis clustered with the mERMS
tumors. Therefore, we can confirm that the original diagnoses based on histology and
other clinical features are accurate and in accordance with the ICR guidelines
although they are discordant with the classification identified by expression
profiling. From this, we conclude that morphology/histology alone is an unreliable
means of identifying myogenesis in these soft tissue sarcomas, and that ERMS can
be molecularly defined as a rhabdomyogenic sarcoma that lacks a PAX-FKHR
fusion-gene, while bona fide UDS/NRSTS may appear similar morphologically, but
expresses no myogenesis related genes.
Figure 2 Histology re-evaluation of H&E stained tumor sections confirms the original
ICR review diagnosis. A, B: Representative PAX-FKHR fusion-negative alveolar RMS
that clustered with the mERMS class tumors show distinct alveolar architecture and cell
morphology. Expression profiling, however, reveals that they share gene expression
patterns consistent with other fusion-negative RMS tumors and not PAX-FKHR
expressing alveolar tumors. C, D: Tumors classified as embryonal RMS display atypical
ERMS histology but do show some cytoplasmic differentiation characteristic of
myogenic tumors. Expression profiling of these tumors indicates they do not express
rhabdomyogenic genes and hence cluster with the mNRSTS molecular class. E, F:
With a review diagnosis of ‘other’, this pleuropulmonary blastoma (E) shows highly
eosinophilic rhabdomyogenic staining suggestive of an RMS phenotype. In contrast, a
tumor with a review diagnosis of NRSTS (F) displays a rather featureless histology,
without overt evidence of rhabdomyogenesis. Both of these tumors, however, clustered
with the mERMS tumors. Original magnifications 200X except C, 400X.
LOH Analysis
The results obtained from gene expression profiling were consistent with a three-
class distinction between mARMS, mERMS and mNRSTS, but independent validation
was needed to confirm this conclusion. Accordingly, we subjected a representative
subset (73 tumors, Supplementary Table 4) of these three groups to whole-genome
single nucleotide polymorphism (SNP) analysis that detected approximately 11,555
SNPs across the entire genome. High density SNP arrays greatly facilitate the
identification of allelic imbalance through analysis of the distribution of polymorphic
sites (Hoque et al., 2003; Mei et al., 2000). Chromosomes were divided into
15 Mb regions (for a total of 184 regions excluding sex chromosomes) and the
probability of LOH was calculated for each sample in each region. Nearly all tumors
(97%) showed some allelic imbalance but the overall degree of LOH, represented by
the fractional allelic loss (FAL) differed greatly between the molecular classes of
RMS tumors (range, 0-54%, Figure 3A). In general, mARMS (n=25) and mNRSTS
(n=5) tumors had low levels of FAL (mean, 0.06 and 0.05, respectively). In contrast,
the mERMS class tumors, including embryonal (n=33), spindle/botryoid
(n=6) and alveolar fusion-negative (n=4) tumors, had significantly higher levels of
FAL (mean 0.16, p<0.00015). Furthermore, patterns of LOH appear to be specific
for tumor subtype as can be seen in the LOH map of 143 genome-wide, 15 Mb
regions that showed allelic variations among the histological subtypes (Figure 3B).
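As an illustration of how a fractional allelic loss value can be derived from per-region LOH probabilities, the following minimal sketch counts the regions called as LOH and divides by the number of evaluable regions. The calling threshold of 0.5, the use of `None` for regions without informative SNPs and the toy probabilities are all assumptions; the exact calling rule used in this study is not restated here.

```python
def fractional_allelic_loss(loh_probs, threshold=0.5):
    """Fraction of evaluable regions called as LOH.

    `loh_probs` holds one LOH probability per 15 Mb region for a tumor;
    None marks a region that could not be evaluated (an assumption).
    """
    informative = [p for p in loh_probs if p is not None]
    if not informative:
        return 0.0
    called = sum(p >= threshold for p in informative)
    return called / len(informative)

# Hypothetical tumor with five regions, one of them not evaluable.
fal = fractional_allelic_loss([0.9, 0.1, None, 0.6, 0.2])
print(fal)  # 2 of 4 evaluable regions are called as LOH
```

Averaging this value over the tumors in each molecular class gives the class means compared in Figure 3A.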
Figure 3 Loss-of-heterozygosity patterns in RMS tumors derived from single
nucleotide polymorphism microarray analysis. A: Mean fractional allelic loss for each
of the three main molecular classes (ANOVA, p<0.016). The mERMS class is
subdivided by histological variants. B: Genome-wide (chromosomes 1-19) patterns of
LOH, as determined across 15Mb windows in 25 mARMS, 43 mERMS and 5 mNRSTS
tumors. C: Proportion of LOH for G-bands along chromosomes 10 (top) and 11
(bottom) for mARMS, mERMS and mNRSTS tumor classes as indicated by colored
legend.
Similar to the results obtained by others (Visser et al., 1997), the predominant
regions of LOH were found along chromosome 11, especially 11p15.4 (no SNP
probe sets were available for 11p15.5), 11p13, 11q22 and 11q25. Most mERMS
tumors (81%), including four fusion-negative alveolar tumors, showed evidence of
LOH at 11p and/or 11q (Appendix, Supplementary Table 5), in contrast to mARMS
tumors (only 20% showed any chromosome 11 LOH). In addition, none of the 5
mNRSTS tumors analyzed had evidence of LOH at chromosome 11p and only 1 of
these tumors had LOH at 11q22. Other chromosome arms that showed frequent
allelic imbalances primarily in embryonal and spindle/botryoid tumors with high
FAL, included 10q, 10p, 9q, 9p, 4p, 4q, 8q and 12q. Differences in proportion of
LOH along chromosomes 10 and 11 (regions with the highest levels of LOH) for
mARMS, mERMS and mNRSTS tumors are highlighted in Figure 3C. The SNP LOH
data support our conclusion from expression analysis that the mERMS tumors
(including fusion-negative alveolar tumors) are a distinct group from mNRSTS
tumors. However, due to the limited number of NRSTS cases available, this is still a
diagnosis of exclusion, as we do not have specific LOH patterns or gene expression
profiles that are unique to this group of tumors, as has also been noted by other
classification schemes (Newton et al., 1995). In addition, fusion-negative alveolar
RMS appears to share with fusion-negative ERMS levels and patterns of
allelic imbalance that are not characteristic of fusion-positive mARMS tumors. This
is in accordance with findings from comparative genomic hybridization analysis of
fusion-positive and fusion-negative ARMS: fusion-negative
ARMS were shown to share chromosomal gains with fusion-negative
ERMS tumors but not with fusion-positive ARMS tumors (Bridge et al., 2002).
Nearest Shrunken Centroids Class Prediction
In order to assess the reproducibility of the molecular classification scheme (i.e.,
mARMS, mERMS and mNRSTS), the Nearest Shrunken Centroids algorithm
(Tibshirani et al., 2002) was used to identify the smallest set of genes
predictive of molecular class with an associated minimal error rate. Using the 530
genes selected by meta-clustering analyses as inputs, ten genes were identified that
predict molecular class with an estimated error rate of 1.9% (Figure 4 and Table 2).
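The gene-selection effect of the Nearest Shrunken Centroids method comes from soft-thresholding: each gene's standardized difference between a class centroid and the overall centroid is moved toward zero by a threshold δ, and genes whose differences reach zero in every class drop out of the classifier. The sketch below shows only this shrinkage step. The scores assigned to TFAP2B and to the placeholder gene are hypothetical and are not values from this study; the standardization and the cross-validated choice of δ are omitted.

```python
def soft_threshold(d, delta):
    """Shrink a standardized centroid difference toward zero by delta."""
    magnitude = max(abs(d) - delta, 0.0)
    return magnitude if d >= 0 else -magnitude

def surviving_genes(class_scores, delta):
    """Keep only genes with a non-zero shrunken difference in at least one class.

    `class_scores` maps gene -> list of standardized differences, one per class.
    """
    kept = {}
    for gene, scores in class_scores.items():
        shrunk = [soft_threshold(d, delta) for d in scores]
        if any(s != 0.0 for s in shrunk):
            kept[gene] = shrunk
    return kept

# Hypothetical standardized differences for three classes
# (mARMS, mERMS, mNRSTS); not values from this study.
class_scores = {
    "TFAP2B": [9.1, -4.0, -3.2],
    "GENE_X": [1.2, -0.8, 0.3],
}
kept = surviving_genes(class_scores, delta=5.4)
print(list(kept))
```

Raising δ removes more genes, so the smallest predictive gene set corresponds to the largest δ that still keeps the cross-validated error low.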
Figure 4. Nearest Shrunken Centroids class probability predictions. Plotted are the
actual tumor class defined by us as the molecular classes, mARMS (red), mERMS
(green) and mNRSTS (brown) against the class probabilities as determined by
approximation to cluster centroids shrunken by a delta value of δ=5.4. Arrows indicate
three mERMS tumors with probability prediction scores consistent with the mNRSTS
class. The overall cross-validated error rate on the randomly generated ‘test’ subsets of
samples was estimated at 1.9%.
None of the genes identified is singularly correlated with the mERMS class; instead,
rhabdomyogenic genes such as CDH15, CHRNA1, FGFR4 and TPM2 distinguish all
RMS tumors from mNRSTS. The RMS subtype distinction is derived from genes
highly expressed in mARMS, such as transcription factor AP-2 β (TFAP2B),
cannabinoid receptor (CNR1) and P-cadherin (CDH3), which are not expressed in
mERMS tumors or mNRSTS tumors (Table 2). From leave-n-out cross-validation
analysis, three tumors from the mERMS class were identified with high probabilities
or prediction scores of belonging to the mNRSTS class (arrows, Figure 4). As
expected from previous analysis, fusion-negative ARMS tumors had high
probabilities of membership in the mERMS class as they similarly did not express
these mARMS ‘signature’ genes (data not shown). Therefore, this analysis clearly
demonstrates the utility of a small subset of genes for making accurate and
reproducible differential diagnoses among RMS subtypes and between RMS and
NRSTS tumors.
Table 2. Genes used by Nearest Shrunken Centroids for Cross-validated Class Prediction
                                                                       Tumor  ANOVA   Mean Affymetrix Difference Intensity
Symbol  Gene Name                                                      Class  F-Stat  ARMS    ERMS    NRSTS
PIPOX   Pipecolic acid oxidase                                         ARMS   532     928     96      80
TFAP2B  Transcription factor AP-2 beta                                 ARMS   528     495     2       2
CNR1    Cannabinoid receptor 1 (brain)                                 ARMS   312     674     13      12
ASS     Argininosuccinate synthetase                                   ARMS   245     2247    116     72
CDH15   Cadherin 15, M-cadherin (myotubule)                            RMS    220     278     133     1
NHLH1   Nescient helix loop helix 1                                    ARMS   215     87      1       1
CDH3    Cadherin 3, type 1, P-cadherin (placental)                     ARMS   199     162     1       2
CHRNA1  Cholinergic receptor, nicotinic, alpha polypeptide 1 (muscle)  RMS    146     828     495     31
FGFR4   Fibroblast growth factor receptor 4                            RMS    138     441     150     2
TPM2    Tropomyosin 2 (beta)                                           RMS    132     955     421     4
Note: all p values <0.00001
Tissue Microarray Immunohistochemistry
Immunohistochemistry is the most common molecular technique used by clinical
pathologists to aid in the differential diagnosis of RMS from NRSTS and other small
round blue cell tumors of childhood, primarily using antibodies to detect myogenic
markers such as MyoD and desmin (Qualman et al., 1998). However, there are
currently no reliable markers of the RMS subtype distinction aside from PAX-FKHR
fusion gene detection by other techniques such as FISH and RT-PCR. Rank-ordered
gene lists were evaluated for candidate markers that showed high levels of
expression and high specificity by the confidence of their statistical association with
the molecular classes of RMS (Appendix, Supplementary Table 2). As observed in
the nearest shrunken centroids analysis, TFAP2 β was highly expressed by mARMS
tumors, with mean expression levels of this gene more than 200-fold higher than in all
other RMS tumors (Appendix, Supplementary Figure 3). Similarly, for the mERMS
tumors an architectural transcription factor, HMGA2, was identified. Although
HMGA2 expression partially overlapped between mARMS and mERMS, it
nonetheless strongly correlated with mERMS tumors, which had mean expression
levels more than 8-fold higher than mARMS (Appendix, Supplementary Figure 3). A
set of RMS tissue microarrays provided an ideal platform for high-throughput
validation on the protein level for RMS tumor markers identified by microarray
(upper panels, Figure 5). TMAs contained 209 tumor and normal sections from 64
unique alveolar and embryonal (32 each) tumors from cases that were not subjected
to microarray analysis. HMGA2 antibodies positively stain ERMS and
fusion-negative ARMS (mERMS), but react poorly with PAX3-FKHR or PAX7-FKHR
ARMS (mARMS), which react strongly with TFAP2 β antibodies. The
immunochemistry results with these two biomarkers validate the gene expression
data, with 86% of mARMS tumors reacting with TFAP2 β whereas no ERMS and
only one fusion-negative ARMS reacted with antibodies to TFAP2 β.
Figure 5 Immunohistochemical analysis of an independent set of RMS tissue microarrays
validates oligonucleotide microarray results and demonstrates the utility of TFAP2 β and
HMGA2 in RMS diagnosis on formalin-fixed tissue. ERMS TMA with antibodies detecting
HMGA2 (left) and ARMS TMA (right) with antibodies detecting TFAP2 β are shown on
the top row. Representative serial tumor sections are shown staining for HMGA2 (left
column, embryonal and alveolar fusion-negative) and TFAP2 β (right column, PAX3-FKHR
alveolar). Insets show number of positively staining tumors out of total. ERMS and ARMS
TMAs were counterstained with hematoxylin and methyl green, respectively. Original
magnification 200X.
Approximately 80% of the ERMS and fusion-negative ARMS cases but only 20% of
the fusion-positive mARMS cases reacted with HMGA2 antibodies, paralleling the
gene expression data. Between these two novel markers, virtually all cases of
mERMS and mARMS can be differentially diagnosed from clinical material (e.g.,
paraffin-embedded tumor sections).
Kaplan-Meier and Cox Regression Analysis
The importance of the differential diagnosis to risk-adapted therapy is clearly
established. As can be seen by Kaplan-Meier analysis for overall and failure-free
survival, tumor histology is a significant prognostic factor in our data set, in
accordance with the overall trends observed in the IRS-IV studies (Breneman et al.,
2003; Crist et al., 2001) (Figure 6A, C). However, our microarray analysis additionally
reveals that the molecular subclasses of RMS tumors as depicted in Figure 1C (and
Supplementary Figure 1) possess improved prognostic power in comparison to ICR
histology classification alone. We found that subclasses of mARMS and mERMS
identified by expression profiling show different overall and failure-free survival
outcomes in Kaplan-Meier analysis (Figure 6B, D).
Overall survival at five years for the mERMS E1 and E2 classes was 87% and 74%,
respectively, although this difference was not statistically significant (p<0.20 by
log-rank test). However, both mERMS subclasses exceeded the survival of any of the
mARMS tumors (p<0.004). For mARMS subclasses, the difference in estimated
five-year survival between A1 and A2 (68% and 36%, respectively) was only marginally
significant (p<0.08). The mNRSTS group had poor overall survival of about 51%,
similar to mARMS overall, and markedly worse than any form of mERMS (p<0.008).
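The survival estimates quoted above are Kaplan-Meier (product-limit) estimates. A minimal sketch of the estimator follows: at each time with at least one observed death, the running survival probability is multiplied by the fraction of at-risk patients who survive that time, while censored patients simply leave the risk set. The follow-up times and event indicators are hypothetical, and the log-rank and Cox regression tests are not shown.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort of five patients (times in years).
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
print(curve)
```

Reading the curve at five years for each molecular subclass gives the kind of five-year survival estimate reported in the text.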
Figure 6 Kaplan-Meier overall survival estimates identify novel prognostic subclasses
of RMS tumors. A, C: Patient overall and failure-free survival by ICR histology-based
classification. B, D: Patient overall and failure-free survival for molecular subclasses as
identified in Figure 1C. Cox regression χ² tests for homogeneity p values (i.e., the
likelihood that the distribution of survival curves occurs by chance) are indicated below
the curves.
Next, multivariate Cox Regression analysis was employed to test for the
dependence of these molecular-based classes of tumors on previously characterized
clinical prognostic variables. ‘Molecular Class’ showed independence from all
clinical variables with the exception of patient age and tumor histology (Table 3).
Table 3. Multivariate analysis of RMS molecular classes determined by expression profiling
                                           Five-Year Overall       p-values*
Covariate            Parameter             Survival Estimates (%)  Univariate  Adjusted MC
Molecular Classes†   A1, A2, E1, E2, N     68, 36, 87, 74, 51      0.0003      -
IRS Risk Group‡      Low, Inter., High     93, 67, 13              <0.00001    <0.00001
Clinical Group**     I, II, III, IV        94, 74, 72, 18          <0.00001    <0.00001
Hist GP              S/B, EMB, ALV, NRSTS  100, 77, 52, 52         0.0004      0.06
Nodal Involvement    N0 vs. N1             78 vs. 38               0.0005      0.003
Translocation***     P7F, NEG, P3F         93, 59, 35              0.003       0.004
Local Invasiveness   T1 vs. T2             84 vs. 56               0.007       0.01
Tumor Size           ≤ 5 cm vs. > 5 cm     81 vs. 60               0.01        0.01
Age                  <10 vs. ≥10           73 vs. 47               0.05        0.1
* Univariate: Cox regression χ² p-values for individual clinical risk factors. Adjusted p-value:
Multivariate Cox regression χ² p-values for the Molecular Class (MC) model, adjusted for each of
the clinical risk factors or covariates.
† Molecular Classes as determined by expression profiling, depicted in Figure 1C.
‡ IRS Risk Groups see Table 8.
** Tumor histology: S/B = spindle and botryoid, EMB = embryonal, ALV = alveolar, NRSTS =
undifferentiated or non-rhabdomyosarcoma soft-tissue sarcoma
*** Translocation as determined by RT-PCR for tumors with alveolar histology only. P7F = PAX7-FKHR,
NEG = fusion-negative, P3F = PAX3-FKHR.
In addition, we observed an additive effect when adjusting for the current IRSG
scheme for patient assignment into therapeutic arms of IRS-V protocol (Raney et al.,
2001) and the post-operative clinical groups assigned to the individual patients. The
IRSG scheme for risk-adapted therapy assesses tumor histology, stage, patient age,
anatomical site, the degree of nodal involvement and local invasiveness (see Chapter
3 for further details). This multivariate analysis shows that the ‘well differentiated’
subclasses of mARMS and mERMS tumors identified (A1 and E1, respectively), which
display increased expression of myogenic differentiation genes when compared to
their respective ‘poorly differentiated’ counterparts (A2 and E2, respectively), impart
additional prognostic information that could help to further refine the current risk
assignment protocols. Indeed, the degree of (rhabdomyoblastic) differentiation in
these tumors has long been recognized as a prognostic factor. Here, we report that
the expression patterns of a defined set of genes can readily identify the degree of
differentiation in these tumors. Hitherto, such assessments have failed to yield
reproducible criteria for such determinations because they have relied on the
availability of highly specialized and experienced pathologists employing
cytohistology and other ancillary techniques not accessible to most clinicians (Carter
et al., 1990; Herrera-Gayol et al., 1995; Molenaar et al., 1985; Schmidt et al., 1986;
Tsokos, 1986; Wijnaendts et al., 1994).
Discussion
Several important conclusions emerge from this analysis of rhabdomyosarcoma, by
far the largest microarray study to date for this disease (Khan et al., 2001; Wachtel et
al., 2004). In aggregate, our findings strongly suggest that a molecular classification
of this disease is both more accurate and biologically relevant than current
morphologic or histopathological methods. We find that biologically homogeneous
and reproducible subtypes of RMS cannot be reliably identified by histologic or
cytologic features alone. Recent advances such as molecular genetic assays for the
detection of PAX-FKHR fusion gene expression in alveolar tumors have certainly
contributed to the distinction between alveolar and embryonal subtypes but are
unable to distinguish RMS from UDS/NRSTS. Immunohistochemistry can often
distinguish myogenic (e.g., RMS) from non-myogenic (e.g., NRSTS) sarcomas, but not in
all cases, and usually requires at least two antibody assays. The International
Classification of Rhabdomyosarcoma, a multi-investigator, multi-institutional study
of RMS based on the review of hundreds of cases, made a substantial impact on the
reproducibility of diagnosis between pathologists but is still associated with
significant inter- and intra-observer error rates, especially for the distinction
between ERMS and NRSTS. These misclassification rates, ranging from 10% to
over 35% (Asmar et al., 1994; Newton et al., 1995), have a detrimental impact on
patient management. We believe that the classification system proposed herein will
improve the diagnosis, and therefore clinical management, of RMS patients.
Molecular classification of these soft tissue sarcomas is possible, based on a
few guiding principles. We show that RMS can be effectively divided into two
molecular classes, similar to the findings from studies of other soft-tissue sarcomas
(Nielsen et al., 2002; Segal et al., 2003). In an effort to both retain historical
nomenclature and precisely define molecular genetic alterations, we define mARMS
(i.e., molecular ARMS) tumors as those with a specific translocation (e.g., PAX3-FKHR
or PAX7-FKHR), which is associated with a homogeneous, tightly clustered
‘strong’ gene expression profile. These tumors also show little genomic loss or
allelic imbalance. These features distinguish mARMS from mERMS (molecular
ERMS) tumors, which show ‘complex’ karyotypes, displaying higher levels of
allelic imbalance and more heterogeneous gene expression profiles. Our observations
on primary human tumor material are corroborated by experimental studies as well.
In the next chapter, we present our work towards characterizing a ‘PAX-FKHR
expression signature’ from analysis of an in vitro PAX-FKHR model system and
find that it is tightly correlated to primary mARMS gene expression profiles. The
notion that PAX-FKHR expression serves to define the alveolar RMS subtype has
been described by other authors (Anderson et al., 2001b; Barr, 2001), but at least
one group (Wachtel et al., 2004) has reported expression profiles unique to both
fusion-positive and -negative ARMS and distinct from any form of ERMS. Here,
with a larger cohort of RMS cases, we find no such molecular class (i.e.,
fusion-negative ARMS) distinct from ERMS, and therefore classify all fusion-negative
RMS tumors together irrespective of their histology and conclude that PAX-FKHR
expression effectively defines ‘true’ alveolar RMS. In fact, closer inspection of the
data presented in Wachtel et al. reveals that there were no statistically significant
differences between fusion-negative alveolar and embryonal RMS gene expression
profiles in their data set, further supporting our conclusions. Fusion-negative alveolar
and embryonal RMS (mERMS) share common, albeit heterogeneous gene expression
profiles and levels of allelic imbalance 3-fold greater than that observed in mARMS
or mNRSTS. A distinct gene expression profile for the superior prognosis embryonal
spindle cell and botryoid variants is not evident, and they display levels of
allelic imbalance similar to those of classical ERMS. Consequently, these variants are classified with
the other fusion-negative mERMS tumors.
The diagnosis of UDS/NRSTS remains a diagnosis of exclusion, as reported
in the ICR classification study. These tumors do not express muscle lineage markers,
which appears to be a common trait for all ‘true’ RMS tumors whether fusion-
positive or -negative. Therefore, the difficult task of deciding when a tumor is no
longer a rhabdomyosarcoma, and is therefore a non-myogenic NRSTS (with
generally a far worse clinical behavior) can be addressed by assessing expression of
a panel of myogenic genes as defined herein and can be substantiated by independent
evidence of genomic LOH differences between mERMS and mNRSTS. A combined
gene expression profile augmented by LOH analysis may be a useful, objective and
accurate means of distinguishing mARMS, mERMS and mNRSTS. Whole-genome
array-based technologies are not generally economically viable approaches for
differential diagnosis, though this will likely change over the next few years. At
present, immunohistochemistry is the “gold standard” for tumor marker analysis and
is the de facto standard for RMS diagnosis. We find that the two antibodies studied
here, TFAP2β and HMGA2, can be used for routine diagnosis on clinical material
and will likely provide a molecular class distinction in the majority of cases. For
TFAP2β, this conclusion is substantiated by recent work reported on an independent
RMS TMA (Wachtel et al., 2006). When combined with histopathology, they will
likely clarify a significant number of ambiguous diagnoses.
A microarray-based molecular classification approach seems to be both
feasible and specific, as it reliably discriminates ARMS from ERMS. A 10-gene
signature identified by Nearest Shrunken Centroids analysis resulted in a small
(<2%) error rate. Additional markers that could be useful for the differential
diagnosis of RMS from NRSTS, chiefly, muscle-specific isoforms of genes such as
CHRNA1, CDH15, FGFR4 and TPM2 warrant further investigation for their utility
on clinical material. Alternatively, custom microarray Gene Chips or multiplex PCR
assays may be a viable means of making accurate diagnoses of the three main
molecular classes identified by our whole-genome gene expression and LOH
analyses.
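To make the Nearest Shrunken Centroids idea concrete, the following is a minimal sketch, not the analysis performed in this study. It assumes equal class priors, uses the median pooled standard deviation as the offset s0, and runs on invented toy values; the function names, the threshold `delta` and the data are all hypothetical. Class centroids are shrunken toward the overall centroid, so that genes whose standardized class difference falls below `delta` no longer contribute to the classification.

```python
import math
from statistics import median


def fit_shrunken_centroids(samples, labels, delta):
    """Fit class centroids shrunken toward the overall centroid by `delta`."""
    n, n_genes = len(samples), len(samples[0])
    classes = sorted(set(labels))
    overall = [sum(s[g] for s in samples) / n for g in range(n_genes)]

    centroids, sizes = {}, {}
    for k in classes:
        members = [s for s, lab in zip(samples, labels) if lab == k]
        sizes[k] = len(members)
        centroids[k] = [sum(m[g] for m in members) / len(members)
                        for g in range(n_genes)]

    # Pooled within-class standard deviation for each gene.
    pooled = []
    for g in range(n_genes):
        ss = sum((s[g] - centroids[lab][g]) ** 2
                 for s, lab in zip(samples, labels))
        pooled.append(math.sqrt(ss / (n - len(classes))))
    s0 = median(pooled)  # assumed offset that damps very low-variance genes

    shrunken = {}
    for k in classes:
        m_k = math.sqrt(1.0 / sizes[k] - 1.0 / n)
        row = []
        for g in range(n_genes):
            scale = m_k * (pooled[g] + s0)
            d = (centroids[k][g] - overall[g]) / scale
            # Soft-threshold: move d toward zero by delta, stopping at zero.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            row.append(overall[g] + scale * d_shrunk)
        shrunken[k] = row
    return shrunken, pooled, s0


def predict(x, shrunken, pooled, s0):
    """Assign x to the class with the nearest shrunken centroid (equal priors)."""
    def distance(centroid):
        return sum(((xg - cg) / (sg + s0)) ** 2
                   for xg, cg, sg in zip(x, centroid, pooled))
    return min(shrunken, key=lambda k: distance(shrunken[k]))


# Toy data (hypothetical): gene 0 separates the classes, genes 1 and 2 do not.
samples = [[10.0, 5.1, 1.0], [11.0, 5.3, 1.2], [9.0, 4.9, 0.8],
           [2.0, 5.0, 1.1], [3.0, 4.8, 0.9], [1.0, 4.9, 1.0]]
labels = ["mARMS"] * 3 + ["mERMS"] * 3
shrunken, pooled, s0 = fit_shrunken_centroids(samples, labels, delta=2.0)

# Genes whose shrunken centroids still differ between the classes.
informative = [g for g in range(3)
               if shrunken["mARMS"][g] != shrunken["mERMS"][g]]
```

In this toy example only gene 0 survives shrinkage, which illustrates how a short discriminating gene list can emerge from a larger profile as the threshold is raised.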
We also find heterogeneity within molecular class in RMS related to degree
of myogenesis. Expression profiles of both mARMS and mERMS showed
heterogeneity within the class, revealing subclasses of RMS tumors that differ
primarily in their expression of muscle differentiation markers. It should be noted
that the well differentiated ‘A1’ mARMS subclass of tumors exclusively expresses
myosin light chain isoform 1 (MYL1), in contrast to the poorly differentiated ‘A2’
mARMS subclass. In the mERMS class, well differentiated ‘E1’ subclass tumors
express myosin light chain isoform 4 (MYL4), the ‘atrial, embryonic’ isoform, and
myosin heavy chain 8 (MYH8), the ‘perinatal’ isoform, whereas the ‘E2’ subclass
tumors generally do not. These differences in myosin chain usage could reflect the origins of
ARMS and ERMS from different pools of progenitors (Carter et al., 1990; Tonin et
al., 1991; Wijnaendts et al., 1994). A recent study (Tiffin et al., 2003) proposed that
ERMS is derived from muscle ‘satellite’ cells, based on the expression patterns of
PAX7, which post-natally is expressed exclusively in this cell lineage. Our findings
suggest a more complex relationship between cell of origin and gene expression
pattern. We observe heterogeneous mERMS expression profiles, which may parallel
the sequential differentiation of embryonal mesenchyme into skeletal muscle during
different stages of development but do not necessarily imply origins from a fixed cell
lineage.
The gene expression profile unique to mARMS is dominated by
neurogenesis-associated genes, which was not expected given the historical
recognition of RMS as a skeletal muscle tumor. This is consistent with the
hypothesis that ARMS results from the de-differentiation of muscle cells (Keller and
Capecchi, 2005; Keller et al., 2004). PAX-FKHR expression in ARMS progenitor
cells could drive this dedifferentiation, resulting in tumor cells with an inherent
muscle phenotype but accompanied by the expression of genes characteristic of other
(i.e., neurogenic) cell lineages where wild type PAX3 and PAX7 are also
transcriptionally active during development (Mansouri, 1998).
In this scheme, mERMS fundamentally differs from mARMS in that ERMS
resembles fetal mesenchyme differentiating into muscle, while ARMS resembles de-
differentiation of a committed muscle precursor cell, such as a satellite cell. The
observed incidence and anatomic origins of these two classes of RMS would support
this thesis. Indeed, the embryonal subtypes predominate in young children while the
alveolar subtypes occur primarily in adolescence (Qualman et al., 1998) and
embryonal tumors tend to occur in non-striated muscle bearing areas, while alveolar
tumors tend to occur in extremities or other strap-muscle bearing sites. These clinical
observations, along with the current molecular evidence, suggest that these RMS
subtypes arise from different cell lineages.
We also found that subclasses of mARMS and mERMS tumors can be
recognized by distinct patterns of myogenic differentiation gene expression. These
subclasses are associated with distinct patient outcomes and multivariate analysis
reveals that these subclass determinations were independent of all clinicopathological
covariates except histology and patient age at diagnosis. In addition, for the mARMS
and mERMS subclasses identified, an additive effect was observed when we adjusted
for the IRS risk group assigned to the patients. The IRS risk group is an amalgam of
clinical variables such as histology, anatomical site, patient age at diagnosis, tumor
size, degree of nodal involvement and local invasiveness. These data therefore
support previous studies (Carter et al., 1990; Herrera-Gayol et al., 1995) that have
tried to link the degree of myogenic differentiation in RMS to patient prognosis.
Microarray-based detection of a ‘myogenic expression signature’ is likely to be a
more reproducible method for making such determinations and could serve as an
additional prognostic factor for RMS staging.
The above distinctions are of some clinical importance, as the current standard of
care dictates risk-based treatment, on the assumption that patients with tumors with
an intrinsically better prognosis require less therapy, and vice versa (Meyer and
Spunt, 2004). However, the known error rates in the diagnosis of tumor subtypes,
compounded by the inappropriate inclusion of non-myogenic soft tissue sarcomas
that are intended to be treated on other protocols, undermine this effort (Qualman et
al., 1998). It is hoped that a molecular definition of RMS and NRSTS classes as
outlined here might allow clear, unambiguous identification of these tumors. These
molecular classes appear to be more predictive of outcome and are linked to underlying biology, such
as chimeric gene expression and allelic imbalances as determined by patterns of
LOH, than historically defined classes. It will be important to test these new findings
in a prospective analysis of uniformly treated RMS patients, as is planned in future
COG studies of this disease.
Chapter 2: Identification and Characterization of a PAX-FKHR
Expression Signature in Alveolar RMS: A Major Determinant
of Molecular Class and Clinical Behavior
Introduction
Cytogenetic analysis of ARMS tumors has identified two distinct chromosomal
translocations that can be used for the differential diagnosis from ERMS, as well as
the other childhood solid tumors. The t(2;13) translocation, which generates a
chimeric gene fusing PAX3 on chromosome 2 to FKHR on chromosome 13, is found
in nearly 55% of the tumors. A variant t(1;13) translocation leads to the fusion of
PAX7 on chromosome 1 to FKHR, and is found in 22% of these tumors (Sorensen et
al., 2002). Despite exhibiting the classical alveolar histology, however,
approximately 25-30% of ARMS tumors lack either translocation. In fact, it appears
that these fusion-negative ARMS tumors are more similar to ERMS tumors with
respect to other covariates, such as age at diagnosis and patient survival (Barr et al.,
2002). Improved outcome has been reported for some ARMS patients with the less
common PAX7-FKHR variant (Anderson et al., 2001a; Sorensen et al., 2002)
although this has not yet been implemented in the clinic as a prognostic factor.
Lastly, there are a small number of RMS tumors displaying mixed
alveolar/embryonal histology. The PAX3-FKHR or PAX7-FKHR (PAX-FKHR)
chromosomal translocations are observed in only some of these mixed histology
tumors (Chiles et al., 2004; Kilpatrick et al., 1994). Presently, tumors with any
evidence of alveolar histology are considered to be one pathological entity, although
it is clear that they are genetically and clinically heterogeneous.
The PAX-FKHR fusion proteins contain the N-terminal region including the
intact paired domain and homeodomain DNA-binding elements of PAX3 or PAX7,
and the potent C-terminal transactivation domain of FKHR (Galili et al., 1993). The
fork head DNA-binding domain of FKHR is truncated in the fusion protein and does
not appear to influence target gene recognition (Sublett et al., 1995). Transcriptional
activity attributed to PAX-FKHR fusion proteins is therefore likely due to
deregulation of both wild type PAX and FKHR function (Bennicelli et al., 1995; del
Peso et al., 1999). Several studies have made efforts towards identifying PAX-FKHR
target genes using heterologous cell lines, including NIH3T3 fibroblasts (Khan et al.,
1999), RD embryonal RMS (Barber et al., 2002) and SAOS-2 osteosarcoma (Begum
et al., 2005) cells. These studies, and others, have shown numerous consequences of
PAX-FKHR expression including activation of a myogenic transcription program
(Khan et al., 1999), mesenchymal-to-epithelial transition (Begum et al., 2005) and
transformation and growth suppression (Xia and Barr, 2004). However, few targets
have been identified and demonstrated to have relevance to ARMS biology or
clinical value in predicting patient outcome.
In this chapter we describe work focused on the role of the PAX-FKHR
fusion genes in the diagnosis, tumor behavior and prognosis of ARMS. First, we
addressed the question of the proper diagnostic classification of ARMS tumors. As
already alluded to by the analyses presented in Chapter 1, we found that this
subgroup of tumors should be grouped by their genotype as determined by the
mRNA expression of the PAX-FKHR genes, rather than by their morphological
phenotype. Second, in order to identify the molecular targets of the fusion
transcription factors, we characterized a PAX-FKHR expression signature detectable
both in tumors and in an in vitro model of ERMS cells expressing PAX-FKHR. To
describe the biological role of PAX-FKHR, we mined its expression signature for
over-represented functional annotations and biological networks, finding indications
for a role in regulation of proliferation and differentiation through suppression of
apoptosis and muscle cell differentiation. Finally, we showed that the expression
patterns of a subset of the PAX-FKHR signature genes correlate with patient
outcome. Thus, we not only identified novel prognostic markers that are independent
of conventional criteria, but we also demonstrated the importance of PAX-FKHR
transcriptional control for the behavior of ARMS.
Results
PAX-FKHR Fusion Genes Dictate the Expression Profile of Alveolar RMS
As discussed in Chapter 1, we found that the two main histological variants of RMS
were associated with specific gene expression patterns. We repeated the analysis of RMS
expression profiles using the meta-clustering algorithm, this time excluding the
mNRSTS tumors, which were determined in Chapter 1 not to be ‘true’ RMS tumors
(Appendix, Supplementary Table 6). Again, using meta-clustering analysis, we
generated gene-expression profiles for 139 RMS tumors, identifying 534 genes
represented by 650 probe sets that participated in k-means centroid clustering at a
false-discovery rate of 0.1% (Appendix, Supplementary Table 7). Two main clusters
are apparent (Figure 7A). Most of the tumors with alveolar histology, including three
of the tumors with mixed alveolar/embryonal histology, clustered together into a
dense group away from the origin whereas tumors with embryonal or
spindle/botryoid histology clustered as a more heterogeneous spread. Again, we
observed 15 tumors with alveolar or mixed alveolar/embryonal histology,
representing 21% of all the tumors with alveolar histology, which clustered with the
non-alveolar RMS variants.
Previous studies have shown that the PAX-FKHR genes are expressed in most
RMS tumors with alveolar histology, although approximately 25% do not express
either gene (Barr et al., 2002; Sorensen et al., 2002). To determine whether PAX-
FKHR expression differences explain why the 15 samples did not cluster among the
other ARMS tumors, for this analysis we used a more sensitive technique than
conventional RT-PCR and measured PAX-FKHR mRNA expression by quantitative
RT-PCR (QRT-PCR) in 59 of the 70 tumors with alveolar or mixed
alveolar/embryonal histology (see Materials and Methods, Figure 19). Figure 7B
shows the QRT-PCR data superimposed on the multi-dimensional scaling plot. PAX-
FKHR expressing alveolar tumors (orange-to-pink dots) clustered together, while
alveolar tumors in which PAX-FKHR expression was not detected (black dots)
clustered instead with the other fusion-negative RMS tumors. Two tumors (blue
dots) with very low but detectable PAX-FKHR expression (at levels less than 0.1%
of the median PAX-FKHR expression level for all tumors assayed) also clustered
with other fusion-negative tumors. Five additional alveolar tumors for which
material was not available for QRT-PCR, but which were previously determined by
conventional RT-PCR analysis to be fusion-negative, similarly clustered with the
other fusion-negative RMS tumors. These data suggest that despite their common
histology, PAX-FKHR fusion-positive and fusion-negative alveolar RMS tumors
have distinct expression profiles that are dependent on the expression of the PAX-
FKHR genes. Furthermore, it appears that a minimum expression level of PAX-
FKHR is required in order for a tumor with alveolar histology to cluster among the
PAX-FKHR expressing ARMS tumors.
Figure 7 PAX-FKHR expression in alveolar RMS is associated with a unique gene
expression profile that is independent of tumor histology. A: Multidimensional scaling
analysis of 139 primary RMS tumors based on semi-supervised analysis reveals tight
clustering of most alveolar tumors, in contrast to the heterogeneous distribution of
embryonal and spindle/botryoid tumors. Included in the main alveolar cluster are 3 mixed
histology alveolar/embryonal tumors. The legend indicates the histological diagnoses. B:
Replot of panel A based on normalized QRT-PCR expression levels of PAX3-FKHR and
PAX7-FKHR for 59 tumors with alveolar or mixed alveolar/embryonal histology.
Relative PAX-FKHR mRNA levels are depicted by colored dots as indicated in the color
scale. Note that the two tumors with low but detectable PAX-FKHR expression (blue
dots) had median PAX-FKHR expression levels 1000-fold less than alveolar tumors
expressing high levels of PAX-FKHR (orange to pink dots). RMS samples for which no
QRT-PCR data were available are also indicated (gray dots).
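The observation that tumors expressing PAX-FKHR at less than 0.1% of the median level behaved as fusion-negative can be sketched as a simple calling rule. Treating 0.1% of the median as a hard cutoff is an assumption made here for illustration only; the function name, the category labels and the expression values below are hypothetical and are not data from the study.

```python
from statistics import median


def call_fusion_status(levels, min_fraction=0.001):
    """Label each tumor from its normalized PAX-FKHR QRT-PCR level.

    levels maps tumor id -> level, or None when no QRT-PCR data exist.
    min_fraction=0.001 encodes the 0.1%-of-median observation as a cutoff
    (an assumption; the text reports an observation, not a decision rule).
    """
    assayed = [v for v in levels.values() if v is not None]
    cutoff = min_fraction * median(assayed)
    calls = {}
    for tumor, level in levels.items():
        if level is None:
            calls[tumor] = "not assayed"
        elif level == 0:
            calls[tumor] = "fusion-negative"
        elif level < cutoff:
            calls[tumor] = "low expression"
        else:
            calls[tumor] = "fusion-positive"
    return calls


# Hypothetical values in arbitrary normalized units.
levels = {"T1": 120.0, "T2": 95.0, "T3": 0.0,
          "T4": 0.05, "T5": None, "T6": 80.0}
calls = call_fusion_status(levels)
```

Under this rule, tumors in the "low expression" category would be expected to cluster with the fusion-negative tumors, as the two low-expressing tumors did here.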
In Vitro PAX-FKHR Expression Model
If PAX-FKHR is in fact a major determinant of the expression profiles of fusion-
positive ARMS tumors, we reasoned that ectopic expression of PAX-FKHR in an
embryonal RMS cell line would significantly alter its expression profile such that
this non-ARMS cell line would now cluster with PAX-FKHR expressing ARMS cell
lines. To test this, we stably expressed PAX3-FKHR and PAX7-FKHR in the
embryonal RD cell line using retroviral transduction. Retroviral constructs contained
HA-tagged PAX3-FKHR or PAX7-FKHR cDNAs cloned upstream of IRES-GFP
sequences, facilitating the rapid screening of transduced cells by FACS sorting
(Figure 8A). Four independent polyclonal populations infected with PAX3-FKHR,
PAX7-FKHR or vector control virus were isolated. Expression of PAX-FKHR was
detected by Western blot in all of the PAX-FKHR polyclones (Figure 8B and C).
Transactivation of the pTK-PRS9 reporter construct (Chalepakis et al., 1991), which
contains six tandem copies of a consensus PAX and PAX-FKHR binding sequence,
in representative PAX3-FKHR and PAX7-FKHR transduced polyclones confirmed
the functionality of the ectopic proteins (Figure 8D). CAT activity in the PAX3-
FKHR and PAX7-FKHR expressing cells was 4-fold higher than in vector
transduced RD cells, which showed basal CAT activity due to the presence of
endogenous PAX3 and PAX7 (Tiffin et al., 2003).
Figure 8 In vitro PAX-FKHR expression model. A: Diagram of the retroviral
constructs used to express PAX-FKHR. Three tandem HA-epitope tags are
represented by the black boxes at the 3’ end of the PAX-FKHR cDNAs. B, C:
Western blot analysis of FACS sorted polyclones transduced with vector, PAX3-
FKHR or PAX7-FKHR retrovirus. PAX-FKHR expression was determined using an
anti-HA antibody; β-actin was used as a loading control. D: Transactivation of the
pTK-PRS9 reporter construct in representative vector, PAX3-FKHR and PAX7-FKHR
polyclones confirmed the functionality of the ectopic proteins. The activity in the
vector polyclone was arbitrarily set to 1, with all other activities related to this value.
Error bars represent standard deviations calculated from multiple experiments.
We next analyzed the microarray expression profiles of the PAX3-FKHR,
PAX7-FKHR and vector transduced RD populations. A simple t-test identified 334
genes (represented by 389 probe sets) differentially expressed between the PAX-
FKHR and the vector populations (Appendix, Supplementary Table 8).
Approximately 80% of the 334 genes identified in this screen were defined as ‘up-
regulated’, showing increased expression in the PAX-FKHR expressing polyclones
relative to vector polyclones, and 68 genes were defined as ‘down-regulated’. Using
this in vitro derived PAX-FKHR expression profile, we next determined how the
ectopic expression of PAX-FKHR relates to expression patterns seen in
RMS-derived cell lines. To do this, we analyzed a panel of established RMS cell lines
derived from primary tumors, including four PAX3-FKHR expressing alveolar and
five fusion-negative cell lines. Within the 334 differentially expressed genes that
comprised the in vitro PAX-FKHR expression profile, we identified a subset of 85
genes (represented by 106 probe sets) that were differentially expressed between the
PAX-FKHR expressing and fusion-negative RMS cell lines (Appendix,
Supplementary Table 9). Principal component analysis based on the expression of
these 85 genes separated the cell lines into two clusters (Figure 9). PAX3-FKHR and
PAX7-FKHR transduced RD clustered with the alveolar fusion-positive cell lines.
As expected, vector transduced populations clustered with the other fusion-negative
RMS cell lines. This subset of 85 genes, differentially expressed between PAX-
FKHR expressing and fusion-negative RMS cell lines, represents ~12% of the
variation between these RMS cell line subtypes and therefore reflects only some of
the inherent differences between PAX-FKHR expressing and fusion-negative RMS
cell lines (data not shown). However, this analysis highlights the fact that the
expression profile of an embryonal cell line can be shifted towards that of a fusion-
positive alveolar cell line by ectopic expression of PAX-FKHR, resulting from the
activation of a PAX-FKHR transcriptional program in the RD cell background.
Together, our analyses of primary tumors and model cell lines suggest the existence
of only two molecular classes of RMS. Reflecting these findings, we continue the
nomenclature first described in Chapter 1 and refer to PAX-FKHR expressing
tumors as mARMS (molecular ARMS) and all fusion-negative RMS tumors,
independent of tumor histology, as mERMS (molecular ERMS).
Figure 9 Principal components (PC) analysis of RMS cell lines and transduced RD
polyclones using 85 genes differentially expressed between both the PAX-FKHR and
vector transduced RD polyclones and between fusion-positive and -negative RMS cell
lines. PAX3-FKHR and PAX7-FKHR transduced samples co-cluster with cell lines
derived from fusion-positive alveolar tumors, in contrast to vector transduced samples
that cluster with the fusion-negative RMS cell lines.
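The two-step screen described above (a t-test between PAX-FKHR and vector transduced polyclones, followed by a second comparison between fusion-positive and fusion-negative cell lines) can be sketched as the intersection of two gene lists. This is an illustrative sketch only: the text does not specify the t-test variant or the significance threshold, so Welch's statistic and a fixed cutoff on |t| are assumptions, and the gene names and expression values are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two groups of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def screen(group_a, group_b, t_cutoff):
    """Return {gene: t} for genes with |t| at or above t_cutoff."""
    hits = {}
    for gene in sorted(group_a.keys() & group_b.keys()):
        t = welch_t(group_a[gene], group_b[gene])
        if abs(t) >= t_cutoff:
            hits[gene] = t
    return hits


# Hypothetical log-scale expression values (four polyclones per group).
pax_fkhr_rd = {"GENE_A": [8.0, 8.2, 7.9, 8.1],
               "GENE_B": [6.0, 6.1, 5.9, 6.0],
               "GENE_C": [3.0, 3.1, 2.9, 3.2]}
vector_rd = {"GENE_A": [5.0, 5.1, 4.9, 5.2],
             "GENE_B": [6.0, 5.9, 6.1, 6.05],
             "GENE_C": [6.0, 6.2, 5.9, 6.1]}

# Hypothetical values for four fusion-positive and five fusion-negative lines.
fusion_pos_lines = {"GENE_A": [9.0, 8.8, 9.1, 9.2],
                    "GENE_B": [6.0, 6.1, 5.9, 6.0],
                    "GENE_C": [4.0, 4.1, 3.9, 4.0]}
fusion_neg_lines = {"GENE_A": [5.5, 5.4, 5.6, 5.3, 5.5],
                    "GENE_B": [6.0, 6.1, 5.9, 6.0, 6.0],
                    "GENE_C": [4.0, 4.05, 3.95, 4.1, 3.9]}

in_vitro = screen(pax_fkhr_rd, vector_rd, t_cutoff=4.0)
cell_lines = screen(fusion_pos_lines, fusion_neg_lines, t_cutoff=4.0)

# Genes that pass both screens, analogous to the 85-gene subset.
shared = sorted(set(in_vitro) & set(cell_lines))
```

In the toy data, GENE_C responds to ectopic PAX-FKHR in vitro but does not differ between the cell line groups, so it is dropped from the shared list.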
PAX-FKHR ‘Expression Signature’ in primary alveolar RMS tumors
Based on the above, we sought to determine whether the in vitro PAX-FKHR
expression profile is present in primary ARMS tumors as well. By screening this
profile for differential expression between the mARMS and mERMS tumor classes,
we sought to identify genes relevant to the pathobiology of fusion-positive ARMS
tumors. As before, we applied the meta-clustering algorithm to identify genes that
were differentially expressed: genes were selected using a t-test comparison of the
mARMS (n=55) and mERMS (n=84, including fusion-negative ARMS) cases in a
series of 1000 leave-n-out analyses. The p-value threshold was set to a value that
provided an estimated false discovery rate of 0.1%. Genes selected in at least 50% of
the randomized iterations under sample cross-validation were chosen, resulting in a
total of 109 genes (~33% of the in vitro expression profile). Of these 109 genes, 28
were expressed incongruently between the transduced polyclones and the primary
tumors (i.e., up-regulated by PAX-FKHR expression in RD cells but expressed at
decreased levels in PAX-FKHR primary tumors relative to fusion-negative RMS
tumors, or vice versa). These genes were removed from our analysis leaving a PAX-
FKHR ‘expression signature’ consisting of 61 up-regulated and 20 down-regulated
genes (Table 4), represented by 102 probe sets. As a measure of the significance of
these PAX-FKHR signature genes to the expression profiles of mARMS tumors, we
compared this 81 gene PAX-FKHR expression signature to the 81 top-ranked
differentially expressed genes between mARMS and mERMS tumors (ranked by t-test
statistic) and found 16 genes (20%) in common between the two gene lists
(Appendix, Supplementary Table 7). Notably absent from the PAX-FKHR
expression signature are some of the top-ranked discriminating genes such as
TFAP2β, CDH3 and CNR1. The PAX-FKHR expression signature, depicted in a
two-way hierarchical clustering dendrogram and corresponding expression matrix,
shows that all of the mARMS tumors cluster on a separate branch of the dendrogram
from the fusion-negative mERMS tumors (Figure 10A and B, respectively). As
expected, fusion-negative ARMS tumors cluster with the ERMS tumors.
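The selection procedure above combines two filters: genes must be selected in at least 50% of the leave-n-out iterations, and their direction of change in tumors must agree with that seen in the transduced polyclones (109 selected genes minus 28 incongruent genes leaves 81, i.e., 61 up-regulated plus 20 down-regulated). A minimal sketch of both filters follows. It assumes a fixed cutoff on Welch's t statistic in place of the FDR-derived p-value threshold, leaves n tumors out of each class per iteration, and uses invented gene names and values; none of these details are taken from the study.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two groups of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def selection_frequency(class_a, class_b, n_out, n_iter, t_cutoff, seed=0):
    """Fraction of leave-n-out iterations in which each gene passes the cutoff."""
    rng = random.Random(seed)
    genes = sorted(class_a)
    n_a = len(class_a[genes[0]])
    n_b = len(class_b[genes[0]])
    counts = dict.fromkeys(genes, 0)
    for _ in range(n_iter):
        # Randomly drop n_out tumors from each class for this iteration.
        keep_a = rng.sample(range(n_a), n_a - n_out)
        keep_b = rng.sample(range(n_b), n_b - n_out)
        for gene in genes:
            a = [class_a[gene][i] for i in keep_a]
            b = [class_b[gene][i] for i in keep_b]
            if abs(welch_t(a, b)) >= t_cutoff:
                counts[gene] += 1
    return {gene: counts[gene] / n_iter for gene in genes}


def is_congruent(gene, in_vitro_direction, class_a, class_b):
    """True when the tumor direction of change matches the in vitro direction."""
    tumor = "up" if mean(class_a[gene]) > mean(class_b[gene]) else "down"
    return tumor == in_vitro_direction[gene]


# Hypothetical expression values for six tumors per class.
marms = {"G_UP": [9.0, 9.2, 8.8, 9.1, 8.9, 9.0],
         "G_FLAT": [5.0, 5.1, 4.9, 5.0, 5.1, 4.9],
         "G_INCONGRUENT": [3.0, 3.2, 2.8, 3.1, 2.9, 3.0]}
merms = {"G_UP": [5.0, 5.2, 4.8, 5.1, 4.9, 5.0],
         "G_FLAT": [5.0, 4.9, 5.1, 5.0, 4.9, 5.1],
         "G_INCONGRUENT": [7.0, 7.2, 6.8, 7.1, 6.9, 7.0]}
# Direction of change under ectopic PAX-FKHR expression (hypothetical).
in_vitro_direction = {"G_UP": "up", "G_FLAT": "up", "G_INCONGRUENT": "up"}

freq = selection_frequency(marms, merms, n_out=1, n_iter=200, t_cutoff=4.0)
signature = [g for g in sorted(freq)
             if freq[g] >= 0.5 and is_congruent(g, in_vitro_direction, marms, merms)]
```

Here G_INCONGRUENT is selected in every iteration but is removed because it is up-regulated in vitro and decreased in the toy mARMS tumors, mirroring the removal of the 28 incongruent genes.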
Table 4. PAX-FKHR 'Expression Signature'
Affy ID Symbol Gene Name PAX-FKHR Regulation*
209460_at ABAT 4-aminobutyrate aminotransferase up
214895_s_at ADAM10 A disintegrin and metalloproteinase domain 10 up
208212_s_at ALK Anaplastic lymphoma kinase (Ki-1) up
202920_at ANK2 Ankyrin 2, neuronal up
202207_at ARL7 ADP-ribosylation factor-like 7 up
207076_s_at ASS Argininosuccinate synthetase up
205444_at ATP2A1 ATPase, Ca++ transporting, cardiac muscle down
205431_s_at BMP5 Bone morphogenetic protein 5 up
216598_s_at CCL2 Chemokine (C-C motif) ligand 2 down
201005_at CD9 CD9 antigen (p24) up
212977_at CMKOR1 Chemokine orphan receptor 1 up
209082_s_at COL18A1 Collagen, type XVIII, alpha 1 up
204850_s_at DCX Doublecortex; lissencephaly, X-linked up
201581_at DJ971N18.2 Hypothetical protein DJ971N18.2 up
222154_s_at DNAPTP6 DNA polymerase-transactivated protein 6 up
204014_at DUSP4 Dual specificity phosphatase 4 down
211237_s_at FGFR4 Fibroblast growth factor receptor 4 up
219147_s_at FLJ20559 Chromosome 9 open reading frame 95 up
203689_s_at FMR1 Fragile X mental retardation 1 up
203725_at GADD45A Growth arrest and DNA-damage-inducible, alpha up
205848_at GAS2 Growth arrest-specific 2 down
207145_at GDF8 Growth differentiation factor 8 down
209168_at GPM6B Glycoprotein M6B up
202455_at HDAC5 Histone deacetylase 5 up
205163_at HUMMLC2B Myosin light chain 2 down
210095_s_at IGFBP3 Insulin-like growth factor binding protein 3 down
203233_at IL4R Interleukin 4 receptor up
202794_at INPP1 Inositol polyphosphate-1-phosphatase up
209185_s_at IRS2 Insulin receptor substrate 2 up
205902_at KCNN3 Potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3 up
205968_at KCNS3 Potassium voltage-gated channel, delayed-rectifier, subfamily S, member 3 up
205888_s_at KIAA0555 Jak and microtubule interacting protein 2 up
205151_s_at KIAA0644 KIAA0644 gene product down
212956_at KIAA0882 KIAA0882 protein up
210102_at LOH11CR2A Loss of heterozygosity, 11, chromosomal region 2, gene A up
214110_s_at LSP1 Similar to lymphocyte-specific protein 1 up
208786_s_at MAP1LC3B Microtubule-associated protein 1 light chain 3 beta up
213256_at MARCH3 Membrane-associated RING-CH protein III up
211042_x_at MCAM Melanoma cell adhesion molecule up
210794_s_at MEG3 Maternally expressed 3 up
203510_at MET Met proto-oncogene up
219038_at MORC4 MORC family CW-type zinc finger 4 up
209708_at MOXD1 Monooxygenase, DBH-like 1 down
209757_s_at MYCN V-myc myelocytomatosis viral related oncogene up
206657_s_at MYOD1 Myogenic factor 3 up
203962_s_at NEBL Nebulette up
205113_at NEF3 Neurofilament 3 (150kDa medium) down
206089_at NELL1 NEL-like 1 (chicken) up
204105_s_at NRCAM Neuronal cell adhesion molecule up
218162_at OLFM3 Olfactomedin 3 down
221969_at PAX5 Paired box gene 5 up
219148_at PBK PDZ binding kinase up
217996_at PHLDA1 Pleckstrin homology-like domain, family A, member 1 down
201939_at PLK2 polo-like kinase 2 (Drosophila) down
210830_s_at PON2 Paraoxonase 2 up
211341_at POU4F1 POU domain, class 4, transcription factor 1 up
212680_x_at PPP1R14B Protein phosphatase 1, regulatory (inhibitor) subunit 14B up
203680_at PRKAR2B Protein kinase, cAMP-dependent, regulatory, type II, beta up
213093_at PRKCA Protein kinase C, alpha up
211373_s_at PSEN2 Presenilin 2 (Alzheimer disease 4) up
202388_at RGS2 Regulator of G-protein signalling 2, 24kDa down
206850_at RRP22 RAS-related on chromosome 22 up
201739_at SGK Serum/glucocorticoid regulated kinase up
203625_x_at SKP2 S-phase kinase-associated protein 2 (p45) up
221489_s_at SPRY4 Sprouty homolog 4 (Drosophila) down
212353_at SULF1 Sulfatase 1 up
212382_at TCF4 Transcription factor 4 down
216511_s_at TCF7L2 Transcription factor 7-like 2 up
217853_at TENS1 Tensin-like SH2 domain-containing 1 up
202039_at TIAF1 TGFB1-induced anti-apoptotic factor 1 up
209656_s_at TM4SF10 Transmembrane 4 superfamily member 10 up
205123_s_at TMEFF1 Transmembrane protein with EGF-like and two follistatin-like domains up
202643_s_at TNFAIP3 Tumor necrosis factor, alpha-induced protein 3 up
205388_at TNNC2 Troponin C2, fast down
202369_s_at TRAM2 Translocation associated membrane protein 2 up
202478_at TRIB2 Tribbles homolog 2 down
221861_at unknown Homo sapiens mRNA; cDNA DKFZp762M127 up
201760_s_at WSB2 WD repeat and SOCS box containing protein 2 up
201368_at ZFP36L2 Zinc finger protein 36, C3H type-like 2 down
Note: Genes in bold were used to create the prognostic 33 probe set metagene (Figure 14).
* Gene expression up-regulated or down-regulated by ectopic PAX-FKHR expression.
Biological Themes of the PAX-FKHR expression signature
To investigate the functional consequences of the PAX-FKHR expression signature,
in terms of ‘biological themes’, we performed overrepresentation analysis using the
EASE program (Hosack et al., 2003). This type of analysis was aimed at identifying
functional categories from the Gene Ontology (GO) and public gene annotation
databases that are present in the PAX-FKHR expression signature more frequently
than expected by chance alone (p<0.05). GO categories over-represented in both the
up- and down-regulated components of the expression signature included
‘development’, ‘morphogenesis’ and ‘organogenesis’ (Figure 10C and Appendix,
Supplementary Table 10). However, the up-regulated component differed from the
down-regulated component with overrepresentation of GO terms ‘cell adhesion’,
‘cell proliferation’, ‘programmed cell death’, ‘cell differentiation’, ‘apoptosis’ and
‘neurogenesis’. In contrast, the down-regulated component showed
overrepresentation of GO categories such as ‘muscle development’ and ‘regulation
of muscle contraction’.
Figure 10 Microarray analysis and functional annotation of a PAX-FKHR
expression signature active in primary RMS tumors. A, B: Dendrogram (A) and
expression matrix (B) derived from two-way hierarchical clustering analysis depict
sample clustering of 139 primary RMS tumors by 81 genes determined to be
differentially expressed both between PAX-FKHR and vector transduced RD
polyclones, and between PAX-FKHR ARMS tumors and fusion-negative RMS
tumors. The legend indicates the fusion gene status and histology for each tumor.
This subset of genes accurately segregates all PAX-FKHR ARMS on the left branch
of the dendrogram (also purple bar in B) and all other fusion-negative ARMS and
ERMS on the right branch (also blue bar in B). In B, the expression of each gene in
each sample was normalized in the pseudo-colored expression matrix based on the
number of standard deviations above (red) and below (blue) the median expression
value (black) across all samples. C: Overrepresentation analysis of Gene Ontology
annotations shows functional categories enriched in the up- and down-regulated
components of the PAX-FKHR expression signature (red and blue bars,
respectively) plotted by the negative log of the EASE score p-value.
Although the EASE analysis identified functional categories
that were regulated by PAX-FKHR, it was not informative with respect to how the
individual genes interact within signaling pathways. To identify gene networks
regulated by PAX-FKHR, we utilized the Ingenuity Pathways Analysis (IPA) tool.
This proprietary database devises networks based on gene associations derived from
peer-reviewed publications (Zeng and Schultz, 2005). We selected our PAX-FKHR
expression signature to represent Focus Genes, which serve as nodes to derive
biological networks that contain 35 genes. Of the 81 genes that comprise the PAX-
FKHR expression signature, 49 mapped to four genetic networks that were generated
according to statistical criteria using the IPA tool (Appendix, Supplementary Tables
11-16 and Figure 4). Ten to thirteen expression signature genes were used to build
each of the four networks, representing approximately a third of the genes within a
given network. Two of the networks confirmed our observations from the EASE
analysis. Network 1 included 13 PAX-FKHR expression signature genes in a 35
gene network that featured regulators of apoptosis (Appendix, Supplementary Figure
4A). Functional annotation of this network revealed overrepresentation of GO
categories such as ‘cell proliferation’ and ‘apoptosis’ (Appendix, Supplementary
Table 13). Network 4 featured MyoD as a central node and included 10 PAX-FKHR
expression signature genes (Appendix, Supplementary Figure 4D). The GO
categories of ‘muscle development’ and ‘regulation of cell-cycle’ were
overrepresented in this network (Appendix, Supplementary Table 16). In summary,
the EASE and IPA network data mining analyses indicate that the functional
consequences of PAX-FKHR expression include repression of myogenic
differentiation and maintenance of a proliferative state through regulation of
apoptosis.
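The overrepresentation statistic used in the EASE analysis can be illustrated with a short sketch. It assumes the commonly described form of the EASE score, a one-tailed Fisher exact (hypergeometric) probability recomputed after removing one category member from the gene list. The exact bookkeeping in the EASE software may differ, and the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from a
    population of N genes, K of which belong to the category."""
    k = max(k, 0)
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

def ease_score(list_hits, list_size, pop_hits, pop_size):
    """Jackknifed one-tailed Fisher exact probability (assumed form):
    one category member is removed from the list before testing."""
    if list_hits <= 0:
        return 1.0
    return hypergeom_upper_tail(list_hits - 1, list_size - 1, pop_hits, pop_size)

# Hypothetical counts: 5 of 20 list genes fall in a category that
# contains 50 of the 1000 genes in the population.
print(hypergeom_upper_tail(5, 20, 50, 1000), ease_score(5, 20, 50, 1000))
```

Because one hit is removed, the score is never smaller than the plain Fisher exact probability, which penalizes categories supported by only a few genes.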
PAX-FKHR ‘Expression Signature’ Predicts Patient Outcome
Establishing a correlation between the PAX-FKHR signature genes and patient
survival would not only indicate their value for prognostic classification of ARMS
but also demonstrate their relevance for tumor behavior. We observed by QRT-PCR
analysis that although PAX-FKHR was expressed in all mARMS tumors, its
expression level varied over an 80-fold range (Materials and Methods, Figure 19B),
and the tumors were also heterogeneous with respect to patient survival (range,
0.9-11.47 years). However, PAX-FKHR expression itself failed to correlate
significantly with patient survival in a univariate model. Therefore, we hypothesized that downstream
target genes (i.e., the PAX-FKHR expression signature) might be better suited to
analysis of patient survival in multivariate models. To test this, we used Cox
regression modeling to identify genes from our PAX-FKHR expression signature
whose expression was correlated with patient outcome in 50 mARMS tumor patients
(those with available survival data). A Cox regression proportional-hazards model
was applied to the PAX-FKHR expression signature using leave-n-out samples
cross-validation. For each iteration, a randomly generated ‘training’ subset of 25
patients was used to identify the best single gene predictor of outcome, which was
subsequently evaluated on the remaining ‘test’ subset. This process was repeated to
generate 2500 cross-validated univariate models. A total of 57 probe sets (56% of the
102 probe set PAX-FKHR expression signature) were used in at least one of the
cross-validated models. The best single gene predictor, TNNC2 (a muscle-specific
troponin isoform), was utilized in about a third of all models. Five-year overall
survival estimates (OAS) from Kaplan-Meier analysis of the groups generated from
splitting the samples about the median TNNC2 expression level were 30% and 70%
(log-rank test, p<0.02) (Figure 11A and B). When patients with evidence of
metastatic disease at presentation were omitted from the analysis, TNNC2 expression
levels were still informative with respect to outcome (Figure 11C). However, we
found that combining PAX-FKHR signature genes into multi-gene data vectors or
‘metagenes’, greatly improved the statistical power of the resulting multivariate
model compared to any single gene univariate model. Cox regression χ² test statistics
were determined for each multivariate model and showed that a 33-metagene model
(representing 25 genes, Table 4) had the highest χ² test statistic. Metagenes generated
after sample permutations show that these results were not the result of chance alone
(Materials and Methods, Figure 20A). Further addition of PAX-FKHR expression
signature genes to the model actually decreased the performance of the metagene,
due to the increased noise associated with addition of genes that were not
significantly correlated to outcome (see Materials and Methods).
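The repeated random-split selection of a best single-gene predictor described above can be sketched as follows. This is a simplified illustration, not the procedure actually used: it ranks genes by the Cox score-test statistic at β = 0 (a dependency-free stand-in for fitting each univariate Cox model), it omits the evaluation on the held-out test subset, and all names and data are hypothetical.

```python
import random

def cox_score_chi2(times, events, x):
    """Score-test chi-square for one covariate in a Cox model at beta = 0.
    Ties are handled one event at a time (Breslow-style risk sets)."""
    u = 0.0  # score: sum over events of (x_i - mean of x in the risk set)
    v = 0.0  # information: sum over events of the risk-set variance of x
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored patients contribute only through risk sets
        risk = [x[j] for j, t_j in enumerate(times) if t_j >= t_i]
        mean = sum(risk) / len(risk)
        u += x[i] - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    return u * u / v if v > 0 else 0.0

def best_gene(times, events, expr, idx):
    """Gene with the largest score statistic among the patients in idx."""
    t = [times[i] for i in idx]
    e = [events[i] for i in idx]
    return max(expr, key=lambda g: cox_score_chi2(t, e, [expr[g][i] for i in idx]))

def selection_frequency(times, events, expr, train_size, n_iter, seed=0):
    """Count how often each gene is the top predictor over random training subsets."""
    rng = random.Random(seed)
    counts = {g: 0 for g in expr}
    patients = list(range(len(times)))
    for _ in range(n_iter):
        train = rng.sample(patients, train_size)
        counts[best_gene(times, events, expr, train)] += 1
    return counts

# Hypothetical data: gene "A" tracks survival time, gene "B" is uninformative.
times = list(range(1, 11))
events = [1] * 10
expr = {"A": [float(10 - i) for i in range(10)], "B": [1.0] * 10}
print(selection_frequency(times, events, expr, train_size=5, n_iter=20, seed=1))
```

The returned counts correspond to the "used in about a third of all models" style of summary reported for TNNC2.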
In order to evaluate the predictive power of the 33-metagene, mARMS
patients were split into three groups by their metagene scores (see Supplementary
Information) (Figure 11D). Kaplan-Meier analysis revealed that patients in the 1st
tertile (n=17, red curve) had a five year OAS of 7% (Figure 11E). In contrast,
patients in the 2nd (n=16, green curve) and 3rd (n=17, blue curve) tertiles had a five
year OAS of 48% and 93%, respectively (log-rank test, p<0.00001). The highest risk
group consisted entirely of patients with PAX3-FKHR tumors (Fisher exact,
p<0.001) and included six of fourteen patients with metastatic disease at presentation,
although this was not statistically significant (Fisher exact, p<0.48). Interestingly, the
1st tertile had the lowest levels of PAX-FKHR expression, a mean 6-fold lower than
the mean of the remaining tumors (p<0.02). The lowest risk group contained nearly
the same number of patients with PAX3-FKHR (n=8) and PAX7-FKHR (n=9)
tumors, but overall this group accounted for 56% of all PAX7-FKHR tumors (Fisher
exact, p<0.02). Similarly, event-free survival (EFS) at five years showed significant
differences between the 1st, 2nd and 3rd tertiles with EFS of 9%, 37% and 86%,
respectively (log-rank test, p<0.0001, data not shown). Again, the metagene was still
predictive after omitting patients with evidence of metastatic disease at diagnosis
from the analysis (Figure 11F). Patients with non-metastatic disease in the 1st, 2nd
and 3rd tertiles had an OAS at five years of 17%, 50% and 100% (log-rank test,
p<0.001).
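The grouping and survival estimation used in this analysis can be sketched as below. This is a minimal illustration, assuming a rank-based split into three groups (the actual tertile cut-points are defined in the Supplementary Information and may give different group sizes) and the standard product-limit Kaplan-Meier estimator; function names and data are hypothetical.

```python
def split_into_tertiles(scores):
    """Assign each patient to group 1, 2 or 3 by rank of score (lowest scores -> 1)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        groups[i] = 1 + (3 * rank) // n
    return groups

def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, survival probability)."""
    surv = 1.0
    curve = []
    for t in sorted({t for t, d in zip(times, events) if d}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if d and u == t)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

def survival_at(curve, t):
    """Estimated probability of surviving beyond time t (step function)."""
    s = 1.0
    for u, value in curve:
        if u > t:
            break
        s = value
    return s

# Hypothetical follow-up (years) with event flags (1 = death, 0 = censored).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve, survival_at(curve, 2))
print(split_into_tertiles([5, 1, 9, 3, 7, 2]))
```

Applying `kaplan_meier` separately to the patients in each group and reading off `survival_at(curve, 5)` gives five-year estimates of the kind quoted above; the log-rank comparison between groups is not shown.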
Figure 11 The PAX-FKHR expression signature determines clinical outcome. A, D:
Histogram showing the binned distribution of the best single gene predictor (A) or 33-
metagene predictor scores (D) for 50 patients with mARMS tumors. The vertical green
lines highlight the median TNNC2 expression level (A) or the tertile cut-points for the
33-metagene scores (D). B, C, E, F: Kaplan-Meier survival analysis for all mARMS
patients (B, E) and non-metastatic mARMS patients only (C, F) grouped by TNNC2
expression level median cut-point (B, C) or 33-metagene predictor tertiles (E, F).
Numbers below the curves indicate the number of patients at risk in each group about
the median (B, C; below median, red curve and above median, blue curve) or tertile
(E, F; 1st red curve, 2nd green curve, 3rd blue curve). p-values are from statistical
analysis of Kaplan-Meier survival curves by log-rank test.
Adjusting for previously characterized prognostic factors such as PAX-FKHR
translocation variant, tumor size, patient age, anatomic site and staging in a
multivariate analysis did not affect the performance of the metagene significantly
(Table 5). Therefore, the prognostic metagene identifies differences in patient
outcome independent of other known prognostic variables. These results add to our
understanding of ARMS tumor biology, since they link malignant tumor behavior
(i.e., growth characteristics of ARMS cells) to a group of genes suggested to be
regulated by PAX-FKHR in our expression array analyses. Therefore, the
demonstrated prognostic value of the PAX-FKHR signature ultimately proves its
functional relevance for ARMS tumors.
Table 5. Metagene survival predictor is independent of clinical risk factors
Covariate            Parameter                          Five-year overall     Univariate   Adjusted 33-Metagene
                                                        survival estimates    p-value*     p-value*
33-Metagene Score    1st, 2nd, 3rd tertile              7%, 48%, 93%          <0.0001      -
TNNC2**              low vs. high expression            30% vs. 70%           0.02         0.0002
Translocation        PAX3-FKHR vs. PAX7-FKHR            34% vs. 92%           0.0007       <0.0001
Metastasis           M0 vs. M1                          63% vs. 12%           0.006        <0.0001
IRS Risk Group†      Intermediate vs. High risk group   65% vs. 14%           0.004        <0.0001
Stage                1, 2, 3, 4                         0%, 100%, 52%, 14%    0.001        0.0002
Local Invasiveness   T1 vs. T2                          72% vs. 30%           0.02         0.0005
Nodal Involvement    N0 vs. N1                          69% vs. 29%           0.03         <0.0001
Tumor Size           ≤5 cm vs. >5 cm                    67% vs. 38%           0.04         0.0002
Anatomical Site‡     Favorable vs. Unfavorable          33% vs. 53%           0.34         <0.0001
Age                  ≤10 years vs. >10 years            57% vs. 40%           0.72         0.8
* Univariate: Cox Regression p-values for individual clinical risk factors. Adjusted p-value:
Multivariate Cox regression p-values for the 33-Metagene, adjusted for each of the clinical risk
factors or covariates.
** Best single gene predictor of outcome as determined by cross-validated Cox proportional-
hazards modeling. Low vs. high expression groups were split by the median expression level.
† IRS risk groups as determined by IRS-V study guide published by the Children’s Oncology
Group.
‡ Favorable = orbit/eye lid, head and neck (excluding parameningeal), genito-urinary (not
bladder/prostate). Unfavorable = bladder, prostate, extremity, parameningeal, other (trunk,
retroperitoneal, etc).
Discussion
In this chapter we used oligonucleotide microarray analysis to identify classes of
ARMS tumors based on their expression profiles, and to discover transcriptional
targets of the PAX-FKHR fusion genes. We believe our results will be useful in
elucidating the genetic components of this disease. As first presented in Chapter 1,
these studies further demonstrate that fusion-positive ARMS are better defined by an
expression profile that distinguishes them from all fusion-negative RMS, rather than
by their morphological appearance. A significant portion of this expression profile is
dictated by PAX-FKHR, as established by our in vitro model. We also show that a
PAX-FKHR expression signature can be used to predict tumor behavior as assessed
by clinical outcome, confirming the previously proposed correlation between PAX-
FKHR expression and increased aggressiveness of ARMS (Anderson et al., 2001a;
Barr, 2001; Sorensen et al., 2002).
Using microarray analysis Wachtel et al. reported that RMS tumors can be
divided into three molecular classes: embryonal, alveolar fusion-negative and
alveolar fusion-positive (Wachtel et al., 2004). Their study utilized a small sample
set (n=29), and no cross-validation was performed. In contrast, our findings are
based on a larger data set (n=139), facilitating a high degree of reproducibility owing
to the utilization of sample cross-validation techniques. The data from our
microarray clustering and QRT-PCR analyses show that histologically isomorphic
ARMS tumors form two molecular classes of tumors, those that express PAX-FKHR
and those that do not. Furthermore, in contrast to Wachtel et al., we find that fusion-
negative alveolar tumors share a common expression profile with the other fusion-
negative RMS variants. These findings are further supported by whole genome loss-
of-heterozygosity (LOH) and immunohistochemical analyses (presented in Chapter
1). The International Classification of Rhabdomyosarcoma (ICR) was originally
developed by a panel of expert pathologists to establish reproducible criteria for
histological diagnosis that are significantly correlated to patient survival rates
(Newton et al., 1995). Collectively, our data support the reevaluation of the current
histology-based ICR classification scheme to include findings from genome-wide
expression analyses of RMS tumors.
Although previous genome-wide studies aimed at identifying PAX-FKHR
target genes have been reported (Barber et al., 2002; Begum et al., 2005; Khan et al.,
1999), the relevance of these genes to the expression profiles of primary ARMS
tumors was not confirmed. In our study, the differentially expressed PAX-FKHR
target genes identified in vitro were screened against those genes differentially
expressed between fusion-positive ARMS and fusion-negative RMS tumors. Our
approach is analogous to the one taken by Hu-Lieskovan et al., wherein they
identified EWS-FLI1 target genes using an in vitro model and primary Ewing’s
tumor samples (Hu-Lieskovan et al., 2005). Taking this approach, we identified a
PAX-FKHR expression signature with a significant overlap with the top-ranked
discriminators of molecular class (i.e., between the mARMS and mERMS tumors).
This overlap suggests that a significant portion of the variation in gene expression
found between the two molecular classes of RMS can be directly attributed to a
PAX-FKHR transcriptional response. However, many of the genes differentially
expressed between fusion-positive and fusion-negative RMS tumors, but absent from
the PAX-FKHR signature, can conceivably be attributed to other sources of genetic
variation at the transcriptional level, such as the genetic backgrounds of the yet
unidentified RMS progenitor cells (Keller et al., 2004; Linardic et al., 2005; Tiffin et
al., 2003).
Overrepresentation analysis provided us with a statistical means to
comprehend the biological themes associated with our PAX-FKHR expression
signature. Genes down-regulated by PAX-FKHR were over-represented by ‘muscle
development’ and ‘muscle contraction’. This is in contrast to Khan et al., who found
induction of a myogenic transcription program in NIH3T3 transduced with PAX3-
FKHR (Khan et al., 1999). This myogenic transcription program included genes such
as ACTC, MYL1, MYOG, SNAI2 and TNNC2. We found all of these genes to be
down-regulated in our model system. One gene, PRRX1, found by Khan et al. to be
repressed was actually up-regulated in our model system. This suggests that in the
myogenic RD background, in contrast to the non-myogenic NIH3T3 background,
PAX-FKHR expression represses myogenic differentiation as has been shown
previously for PAX3 and PAX3-FKHR in C2C12 myoblasts (Epstein et al., 1995).
Despite the down regulation of a large number of myogenesis related genes,
MyoD was expressed 2-fold higher in PAX-FKHR transduced RD compared to
vector control. This confirms our recent finding that MyoD is a PAX-FKHR target
gene (Graf Finckenstein et al., manuscript in preparation). This apparent discrepancy
can be explained by findings from a recent report that showed that MyoD regulates
distinct subsets of genes in proliferating versus differentiating myoblasts (Blais et al.,
2005). The authors also used EASE analysis and found that in proliferating
myoblasts the GO categories ‘synaptic transmission’ and ‘transmission of nerve
impulse’ were over-represented within MyoD target genes. In contrast, in terminally
differentiating myotubes the GO categories ‘muscle development’, ‘muscle
contraction’ and ‘regulation of striated muscle contraction’ were over-represented
within MyoD target genes. These results are remarkably similar to our findings. We
speculate that following MyoD induction by PAX-FKHR only a subset of MyoD
target genes is activated, similar to those found in proliferating myoblasts where
‘neural phenotype’ GO categories were overrepresented in the MyoD expression
signature. PAX-FKHR, possibly through activation of MyoD, may promote a limited
degree of myogenic determination and at the same time actively repress myogenic
differentiation (as assessed by the expression of markers of muscle terminal
differentiation such as MYL2 and TNNC2) (Bergstrom et al., 2002; Tonin et al.,
1991). MyoD was in fact a central node in one of the IPA networks that included the
cell-cycle G1/S regulator RB1, as well as MYOG, HDAC5, TWIST1, ID2, GDF8 and
IGF1. These genes are all involved in regulating the transition between proliferating
and differentiating myoblasts (Berkes and Tapscott, 2005; Blais et al., 2005; Merlino
and Helman, 1999). HDAC5, for example, is a crucial regulator of myogenic
differentiation, acting as a transcriptional co-repressor for MEF2C at MyoD target
gene promoters (McKinsey et al., 2001). Other previously described MyoD
downstream target genes such as ASS, PKIA and TNNC2 (Bergstrom et al., 2002)
were also part of this network and of the PAX-FKHR expression signature.
Overrepresentation analysis identified the GO categories ‘programmed cell
death’, ‘apoptosis’ and ‘cell proliferation’. Confirming this, pathway analysis yielded
a network that integrated genes involved in suppressing apoptosis (e.g., up-regulation
of MYCN (Slack et al., 2005), down-regulation of IGFBP3 (Butt and Williams,
2001)) and promoting cell survival (e.g., down-regulation of DUSP4 (Chen et al.,
2001) and SPRY4 (Sasaki et al., 2003), up-regulation of FGFR4 (Hart et al., 2000)).
In addition, some of these genes such as FGFR4 and PRKCA appear to play dual
roles in myogenic cells, serving to both repress differentiation and promote cell
survival (Li et al., 1992; Shaoul et al., 1995). Repression of myogenic differentiation
and activation of cell survival pathways, therefore, appear to be hallmarks of the
PAX-FKHR mediated transcriptional program.
Previous work with a similar in vitro model system showed that ectopic
PAX3-FKHR expression in embryonal RMS cell lines (including RD) leads to
increased tumorigenicity and aggressiveness in mouse tumor models (Anderson et
al., 2001b). We could not, however, correlate PAX-FKHR (QRT-PCR) expression
levels to mARMS patient survival. Instead, we found that a subset of the PAX-FKHR
expression signature (i.e., the 33-metagene), likely downstream target genes of PAX-
FKHR, was highly correlated to mARMS patient survival. Surprisingly, we found
that mARMS patients in the high risk metagene group (1st tertile) had the lowest
levels of PAX-FKHR expression in their tumors. A recent in vitro study has shown a
cell transformation phenotype in cells transduced with low levels of PAX-FKHR
while higher levels of PAX-FKHR were associated with a growth suppression
phenotype in immortalized murine cell lines (Xia and Barr, 2004). In addition, there
are several known alternatively spliced forms of PAX3, PAX7, PAX3-FKHR and
PAX7-FKHR with differing transcriptional activities (Du et al., 2005). Therefore, the
relationship between PAX-FKHR expression, ARMS cell biology and ARMS
clinical behavior is likely more complex than previously assumed (Sorensen et al.,
2002). Regardless, our data demonstrate that a PAX-FKHR transcriptional program
plays an important role in determining not only the molecular phenotype, but also the
clinical behavior of ARMS tumors.
Studies involving other malignancies have successfully utilized expression
profiling, including techniques such as Cox regression modeling, for patient risk
stratification based on the expression of a small number of genes (Dave et al., 2004;
Pittman et al., 2004). The 33-metagene developed in this study clearly discriminates
ARMS patients into risk groups, the lowest-risk group with a five year OAS of 93%
and the highest-risk group with a five year OAS of only 7%. In addition, multivariate
analysis showed that the 33-metagene is predictive of patient outcome independent
of known clinical risk factors. When we excluded metastatic disease patients, we still
found significant differences in outcome among the non-metastatic patients,
suggesting that the PAX-FKHR transcriptional program is a major determinant of
outcome independent of the clinical manifestation of metastases. Our analysis also
indicates that the expression pattern of just a single gene such as the skeletal muscle
specific troponin isoform, TNNC2, can provide prognostic information, which could
be amenable to a routine clinical assay (i.e., QRT-PCR). In conclusion, it appears
that the PAX-FKHR expression signature genes can be used for patient risk
assessment, yet their clinical usefulness must ultimately be proven by prospective
evaluation. Finally, new targeted therapies are needed to treat those tumors that do
not respond to traditional modalities, especially within the subgroup of highly
metastatic alveolar RMS tumors. The identification of a defined group of genes
regulated in a tumor-type specific manner, as described by our study, may provide
promising candidates for such gene-specific therapies.
Chapter 3: Gene Expression Profiling for Survival Prediction
in Rhabdomyosarcomas
Introduction
The cure rate of rhabdomyosarcoma has improved significantly since the
introduction of multi-modal therapy and refinements to the applications of surgery,
chemotherapy and radiotherapy (Breitfeld and Meyer, 2005). Rhabdomyosarcoma is
a chemosensitive tumor, and more than 70% of patients with non-metastatic disease
can be cured (McDowell, 2003). Nearly all patients are assumed to present with
micro-metastatic disease and as a result, multi-modal therapy is necessary in
rhabdomyosarcoma for nearly all patients (Meyer and Spunt, 2004). However, recent
evidence suggests that there exist subgroups of patients whose disease can be
managed by surgery alone (Arndt et al., 2001; Libera et al., 1999) and increasingly
there are patients where effective treatment can be delivered without the use of
radiotherapy (Schuck et al., 2004) and only low-level use of chemotherapy (Crist et
al., 2001).
Radical surgery played the central role in the treatment of patients with RMS
prior to the adoption of multi-agent chemotherapy and frequently resulted in severe
disability for patients. In addition, most patients with apparently localized disease
relapsed due to the dissemination of embolized tumor cells from the surgical
procedure, residual microscopic disease or the existence of undetectable micro-
metastatic disease at diagnosis. The use of vincristine and dactinomycin (VA), or VA
plus cyclophosphamide (VAC), became the standard chemotherapy regimen that
followed surgery and resulted in improved overall survival from 25% in 1970 to 71%
in 1990 through successive IRS clinical trials (Pappo et al., 1995). Other three-agent
combinations evaluated over the years include VA plus ifosfamide (VAI) and
vincristine, ifosfamide and etoposide (VIE). Subsequent efforts to increase the
therapeutic response of high-risk patients with metastatic disease such as dose
escalation of cyclophosphamide resulted in substantially greater toxicity and no
apparent therapeutic benefit (Spunt et al., 2004). Recently completed high-risk trials
demonstrated response rates for irinotecan (Pappo, 2005) and topotecan
(Walterhouse et al., 2004); however, new treatment strategies are clearly needed,
particularly for newly diagnosed metastatic patients and for those that develop
recurrent tumors where little improvement in outcome has been achieved (Breitfeld
and Meyer, 2005). Use of radiation therapy for tumors that remain unresectable after
chemotherapy, or that have been incompletely resected by second surgery after
initial chemotherapy has also improved survival of many RMS patients (Breneman
and Wiener, 2000). However, the short-term toxic effects of radiotherapy are
unacceptable for some patients (e.g., <1 year old) and many of the substantial late
effects of RMS treatment can be directly attributed to radiotherapy (Stevens, 2005).
Although significant cure rates have been achieved through adjustments and
refinements to the therapeutic protocols there remain a number of clinically
identifiable subgroups of RMS patients where the progress has not been satisfactory
as will be discussed in detail below.
Perhaps the most important advances made by the IRS and similar clinical
trials elsewhere (e.g., SIOP) aside from the major technical innovations in the
application of therapy is the development of protocols for assigning patients into risk
groups. Rhabdomyosarcoma cancer staging for assignment into patient risk
categories has evolved into a complex multi-faceted protocol that requires the
cooperation of pathologists and clinicians from multiple disciplines (Qualman and
Morotti, 2002). Risk categories have been continuously refined over the past several
decades. The primary prognostic variables that determine patient outcome relate to
pretreatment biologic factors of the tumor (e.g., site, stage, size, histology, nuclear
proliferative index and cellular differentiation), patient characteristics (e.g., age,
disease history) and therapy administered (e.g., extent of surgical resection, type of
chemotherapy/radiotherapy, dose and duration) (Qualman et al., 1998). Diagnostic
methods including magnetic resonance imaging, computerized tomography, bone
scans and assessment of metastatic spread by imaging of regional lymph nodes and
bone marrow aspirates are routinely used to guide RMS staging. RMS staging
consists of a pre-treatment Tumor Node Metastasis (TNM) system modified by
recognition of favorable and unfavorable primary tumor sites (Lawrence et al., 1997)
and a surgicopathological clinical groups classification, which is determined by the
extent of disease after initial surgical resection (Tables 6 and 7) (Maurer et al., 1988).
In general, patients with group I and group II embryonal histology have the best
outcomes whereas patients with advanced group and metastatic alveolar histology
tumors have the poorest outcomes. This has led to the widely accepted conclusion
that achieving complete excision (with the exception of alveolar histology and
extremity tumors) is an extremely favorable prognostic factor (Crist et al., 1995).
These observations have led to significant improvements in the well being of RMS
patients with group I disease as radiation therapy (and its associated side effects) is
omitted from their treatment protocol (Maurer et al., 1988).
Table 6. Presurgical TNM-based RMS staging
Stage  Sites              T Invasiveness  T size  Regional Nodes  Metastasis
1      Orbit              T1 or T2        a or b  N0, N1 or NX    M0
       Head/Neck*         T1 or T2        a or b  N0, N1 or NX    M0
       Genitourinary**    T1 or T2        a or b  N0, N1 or NX    M0
2      Bladder/prostate   T1 or T2        a       N0 or NX        M0
       Extremity          T1 or T2        a       N0 or NX        M0
       Parameningeal      T1 or T2        a       N0 or NX        M0
       Others***          T1 or T2        a       N0 or NX        M0
3      Bladder/prostate   T1 or T2        a       N1              M0
       Extremity          T1 or T2        b       N0, N1 or NX    M0
       Parameningeal      T1 or T2        b       N0, N1 or NX    M0
       Others***          T1 or T2        b       N0, N1 or NX    M0
4      Any                T1 or T2        a or b  N0 or N1        M1
Legend
* Excluding parameningeal
** Nonbladder/nonprostate
*** Includes trunk, retroperitoneum, etc.
Abbreviations: M0, no distant metastasis; M1, metastasis present; N0, regional nodes not clinically
involved; N1, regional nodes clinically involved by neoplasm; NX, clinical status of regional nodes
unknown (especially sites that preclude lymph node evaluation); Size a, ≤5 cm in diameter; Size b, >5
cm in diameter; T1, confined to anatomic site of origin; T2, extension and/or fixation to surrounding
tissue.
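The staging rules in Table 6 can be expressed as a small lookup function. This is a sketch under stated assumptions: the site labels are hypothetical identifiers, and the "size a with N1" rule listed for bladder/prostate is assumed to apply to every unfavorable site.

```python
# Sites that Table 6 places in stage 1 whenever there is no distant metastasis.
FAVORABLE_SITES = {"orbit", "head/neck", "genitourinary"}

def tnm_stage(site, size, nodes, metastasis):
    """Return the pretreatment stage (1-4) implied by Table 6.

    site: lower-case site label (hypothetical identifiers);
    size: "a" or "b"; nodes: "N0", "N1" or "NX"; metastasis: "M0" or "M1".
    """
    if size not in ("a", "b") or nodes not in ("N0", "N1", "NX"):
        raise ValueError("unrecognized size or nodal status")
    if metastasis not in ("M0", "M1"):
        raise ValueError("unrecognized metastasis status")
    if metastasis == "M1":
        return 4  # distant metastasis overrides every other variable
    if site in FAVORABLE_SITES:
        return 1  # favorable site, any invasiveness, size or nodal status
    if size == "a" and nodes in ("N0", "NX"):
        return 2  # small unfavorable-site tumor without involved nodes
    return 3      # unfavorable site with size b, or size a with N1 (assumed for all such sites)

print(tnm_stage("extremity", "b", "NX", "M0"))
```

Invasiveness (T1 or T2) does not appear as an argument because, in the table as shown, it does not change the stage.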
Table 7. Post-surgical RMS Staging
Clinical Group  Extent of Disease/Surgical Result
I    A          Localized tumor, confined to site of origin, completely resected
     B          Localized tumor, infiltrating beyond site of origin, completely resected
II   A          Localized tumor, gross total resection, but with microscopic residual disease
     B          Locally extensive tumor (spread to regional lymph nodes), completely resected
     C          Locally extensive tumor (spread to regional lymph nodes), gross total resection, but microscopic residual disease
III  A          Localized or locally extensive tumor, gross residual disease after biopsy only
     B          Localized or locally extensive tumor, gross residual disease after major resection (greater than 50% debulking)
IV              Any size primary tumor, with or without regional lymph node involvement, with distant metastases, without respect to surgical approach to primary tumor
Despite the many improvements realized for patients with nonmetastatic disease, the
overall survival of the high-risk metastatic group, which has been subjected to
numerous randomized evaluations of experimental therapeutics (Atra and Pinkerton,
2002; Breneman et al., 2003; Cassady, 1995; Pappo et al., 1999a) has not
significantly improved since the IRS-III trial (Breitfeld and Meyer, 2005).
For the current clinical trials, the COG STS committee has further refined
RMS staging to a three tier risk-group system (Table 8).
Table 8. Risk Group Assignment for IRS-V Clinical Trials.
Risk Stage Group Histology Site Age
Low 1 I Embryonal favorable < 21
1 II Embryonal favorable < 21
1 III Embryonal orbit only < 21
2 I Embryonal unfavorable < 21
1 II Embryonal favorable < 21
1 III Embryonal orbit only < 21
1 III Embryonal favorable (excluding orbit) < 21
2 II Embryonal unfavorable < 21
3 I or II Embryonal unfavorable < 21
3 I or II Embryonal unfavorable < 21
Intermediate 2 III Embryonal unfavorable < 21
3 III Embryonal unfavorable < 21
3 III Embryonal unfavorable < 21
1 or 2 or 3 I or II or III Alveolar favorable or unfavorable < 21
4 IV Embryonal favorable or unfavorable < 10
High 4 IV Embryonal favorable or unfavorable ≥ 10
4 IV Alveolar favorable or unfavorable < 21
For anatomic site favorable or unfavorable definitions see Table 5
This classification system incorporates both of the earlier clinical group and TNM
stage schemes as well as additional prognostic variables such as patient age and
tumor histology (Raney et al., 2001). This three tier risk-group system devised from
post-hoc analysis of IRS I-IV and IRS-V pilot studies appears to be the best method for
assigning patient risk devised to date (Breitfeld and Meyer, 2005). Quality of life
after treatment is becoming an issue of importance and improvements need to be
made in shifting the balance between maximizing cure rates and minimizing the side
effects of treatment. Until genuine tumor-specific targeted therapies become
commonplace, it is hoped that improvements to RMS staging and patient risk
assignment will be of greatest benefit to patients, particularly in
terms of quality of life for young cancer survivors (Stevens, 2005).
In all types of cancer, the main purpose of staging is to classify tumors into
categories from which treatment can be planned and prognosis predicted. It is also an
essential way of evaluating outcome of different treatment regimens. Unfortunately,
modern-day clinicopathological based staging systems do not appear to identify
many of the fundamental differences in underlying tumor biology (Bair and
Tibshirani, 2004). As microarray technology migrates into the clinic and is more
routinely used by pathologists, it is hoped that new molecular staging systems will be
developed. Most of the early progress in application of microarray technology came
in the field of diagnosis (Tibshirani et al., 2002) but has also led to proposals for new
classification systems (Bhattacharjee et al., 2001; Liu, 2003) and even the
identification of previously unrecognized tumor subclasses (Yeoh et al., 2002). More
recently, the challenge of outcome prediction and risk assignment based on gene
expression patterns is being addressed in a number of cancers (Simon et al., 2003;
van 't Veer et al., 2002; Wang et al., 2005). Molecular staging has promise in
predicting the long-term patient outcome by analysis of the tumor gene expression
profile at diagnosis (Simon et al., 2003). The inherent assumption to this approach,
supported by recent analyses (Ramaswamy et al., 2003; Weigelt et al., 2005) is the
hypothesis that every tumor contains informative gene expression signatures that, at
the time of diagnosis, can direct the biologic behavior of the tumor over time
(Simon, 2005).
Several methods have been used to identify clinically relevant gene
expression profiles that predict patient survival such as hierarchical clustering
analysis of tumor subtypes or between cancer survivors and non-survivors (Alizadeh
et al., 2000; Beer et al., 2002; Bhattacharjee et al., 2001). Others have used clinical
data to identify ‘high’ and ‘low’-risk patients and used more supervised methods to
identify prognostically relevant gene expression profiles and survival predictors
(Shipp et al., 2002; van 't Veer et al., 2002; van de Vijver et al., 2002). More
recently, improved methods have been introduced using various forms of Cox
regression modeling in semi-supervised approaches aimed at correlating gene
expression levels to patient survival (Bair and Tibshirani, 2004; Li and Gui, 2004;
Pawitan et al., 2004; van Houwelingen et al., 2005). These methods have several
advantages over unsupervised and supervised techniques because they permit the
generation of true continuous predictors of survival and are generally independent of
clinicopathological variables other than survival rates (Bair and Tibshirani, 2004).
In this chapter, we describe the use of Cox regression modeling (Cox, 1972;
Kirkwood et al., 2003) in the development of multi-variate gene expression based
outcome models. Cox regression proportional-hazards modeling is one of the most
commonly used methods for the analysis of treatment effects and prognostic factors
on outcome in cancer patients (Sposto, 2002). When using these methods, the data
must fit the proportional-hazards assumption in that there is an implicit survival
function (or hazard function) that underlies the data and that most of the important
effects of treatment or prognostic factors can be measured through shifts in the times
at which events (i.e., deaths) occur (Katz, 2003). Here we utilize the approach
described in Chapter 2, except instead of focusing solely on the utility of this method
to show the relevance of the PAX-FKHR expression signature as a determinant of
the clinical behavior of the aggressive mARMS subtype, we use these methods on all
RMS patients utilizing all available gene expression data. As will be discussed
below, this approach assumes that regardless of their histological appearance, age of
incidence, or anatomical site, RMS tumors share common patterns of expression that
can be related to patient survival. Historically, these heterogeneous tumors have been
treated in the clinic as a group because of their common phenotype (i.e., myogenic)
and similar response to treatment regimens (i.e., chemosensitivity profile). We
believe this is due, to a significant degree, to the underlying biology that determines
the rhabdomyogenic phenotype.
Results
Single Gene or Conventional Clustering Analyses have Limited Prognostic
Value
We used Cox regression modeling of survival data to identify genes most
correlated with censored survival time in a subset of our RMS cohort (n=120, see
Appendix, Supplementary Table 17 for patient characteristics). Inclusion in this
regression analysis required known outcome for at least 2 years of follow-up after
diagnosis (except for patients who died of disease, who were included whenever death occurred).
The median survival of this cohort was 6.6 years (mean, 6.14 years). Most of the 39
deaths (n=26, 67%) occurred within the first two years of diagnosis and the
cause of death was attributed to the tumor in all cases except for two (one from
infection on regimen and one from toxicity unrelated to the chemotherapy regimen).
Graphical inspection of the data set using clinical ‘explanatory’ variables such as
histology, age at diagnosis and stage justified the assumptions of the
proportional-hazards model and supported the use of the Cox regression method in this data set
(Ng'andu, 1997). Of the patients that died, 24 (62%) had alveolar histology, 13
(33%) had embryonal histology and two (5%) had mixed alveolar/embryonal
histology. No reported deaths occurred in patients with spindle cell or botryoid
histology tumors.
Using proportional-hazards modeling for log transformed gene expression
data (see Materials and Methods) we identified a total of 578 genes with significant
Cox χ² scores (p<0.01) (Appendix, Supplementary Table 18). The best single gene
predictor of outcome (as determined by the significance level using the χ² test),
hypothetical protein FLJ10726, was evaluated using Kaplan-Meier analysis (Figure
12), splitting patients into two groups about the median expression level in tumors
(data not shown). Five year overall survival (OAS) was 50% and 86% for expression
below and above the median expression level, respectively.
Figure 12 Kaplan-Meier analysis of the best single gene predictor of outcome for
this data set, as determined by the magnitude of Cox regression score. Patient
groupings were objectively derived by splitting patients with tumor gene expression
above (blue curve) or below (red curve) the median FLJ10726 expression level.
Differences in patient OAS between the two groups at five years are 86% and 50%,
respectively (log-rank, p<0.00002).
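The median-split comparison summarized in Figure 12 can be illustrated with a minimal product-limit (Kaplan-Meier) sketch in Python. This is not the software used for the analyses in this thesis. The function names (kaplan_meier, survival_at, median_split) are hypothetical, events are coded 1 for death and 0 for censoring, and patients whose expression equals the median are assigned to the lower group by assumption.

```python
from statistics import median


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct death time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all patients (deaths and censored) sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Read the step function: estimated survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def median_split(expression, times, events):
    """Split patients about the median expression level; one curve per group."""
    cut = median(expression)
    groups = {"below": ([], []), "above": ([], [])}
    for x, t, e in zip(expression, times, events):
        key = "above" if x > cut else "below"  # ties go to "below" (assumption)
        groups[key][0].append(t)
        groups[key][1].append(e)
    return {k: kaplan_meier(ts, es) for k, (ts, es) in groups.items()}
```

Calling survival_at(curves["below"], 5) on the result of median_split would give the estimated five year survival for the low-expression group when times are expressed in years.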
Another method commonly used to identify patient prognostic subgroups is
to cluster tumors based on gene expression profiles of outcome correlated genes. In
our data set, the results from two-way hierarchical clustering of tumor samples
(based on the gene expression patterns of the 578 genes identified by Cox regression
modeling) depict the segregation of tumors onto one of two main branches of the
dendrogram (Figure 13A). Cluster 1 (n=51) is comprised almost exclusively of
tumors with alveolar histology (n=42) and only 9 tumors with embryonal histology.
Cluster 2 (n=69) in contrast, is comprised of mostly tumors with embryonal or
spindle/botryoid variant histology and only 9 tumors with alveolar histology.
Kaplan-Meier analysis of these two tumor clusters (Figure 13C) shows five year OAS
of 50% and 80% for Cluster 1 and Cluster 2, respectively.
However, neither of these methods (i.e., single gene regression analysis and
hierarchical clustering of tumor samples) was prognostic independently of tumor
histology (p<0.5, by multivariate analysis). This is an expected result given the fact
that both of these methods essentially segregated the expression profiles of tumors
with alveolar histology (more precisely, the mARMS class) from other RMS tumors.
Furthermore, it appears that there is insufficient prognostic information in any single
gene for this complex disease and dichotomous splitting of tumors by clustering
methods based on the expression of numerous genes is similarly insensitive to the
inherent heterogeneity of this disease with respect to prognostic gene expression
profiles. Therefore, what is needed are continuous predictors of outcome, such as those
described below. These methods have the advantage of modeling survival data
using regression methods with multiple genes (i.e., multivariate) and do not suffer
from the ‘clumsiness’ of discriminant type cluster analyses (Bair and Tibshirani,
2004).
Figure 13 Two-way hierarchical clustering of RMS tumors and genes correlated to
outcome by Cox regression. A: Dendrogram divides RMS tumors into two main
clusters, Cluster 1 (purple bar in B) is comprised almost entirely of alveolar histology
tumors (red dots), whereas Cluster 2 (green bar in B) is comprised mostly of
embryonal (green dots) or spindle/botryoid (blue dots) histology tumors. B: Heatmap
depicting genes correlated to good outcome (vertical blue bar) and poor outcome
(vertical red bar) as determined by Cox regression scores in RMS patients. The
expression of each gene in each sample was normalized in the pseudo-colored
expression matrix based on the number of standard deviations above (red) and below
(blue) the median expression value (black) across all samples. C: Kaplan-Meier
analysis of two clusters derived from hierarchical clustering reveals that Cluster 1
patients (purple curve) had five year OAS of 50% in contrast to Cluster 2 patients
with an OAS of 80% (log-rank test, p<0.004). Note that this method does not yield
prognostic information that is independent of tumor histology (χ², p<0.5) and does not
perform any better than a single gene predictor of outcome (see Figure 12).
Metagene or multi-gene data vectors show increased prognostic power over
known clinical variables
In order to build multi-gene continuous predictor models we first wished to
identify genes most reproducibly predictive of outcome. Due to limited sample size
the most feasible method for acquiring measures of reproducibility is to use sample
cross-validation (Simon, 2005). Therefore, we used Cox regression modeling under
50% sample cross-validation to acquire sampling statistics for each gene evaluated.
For each iteration of the model, 50% of the samples (n=60) were randomly left out
and Cox regression was used to evaluate the genes most predictive of outcome
(p<0.05) on this ‘training’ set. The Cox regression model was then applied to the
remaining samples, the ‘test’ set and only those genes most predictive of outcome
(p<0.05) on the ‘test’ set were scored. Over 2500 rounds of cross-validated
iterations, sampling statistics for each gene were accumulated based on the number of
times a gene was predictive in both ‘training’ and ‘test’ sets and genes were ranked
accordingly (Appendix, Supplementary Table 18). Multi-gene data vectors or
metagenes were then assembled from this ranked list (i.e., the top 5 genes formed a
5-metagene, the top 10 genes a 10-metagene, etc.). The χ² test statistics were
recorded for each ranked gene from Cox regression modeling of the entire data set.
The signed square root of the χ² test statistic for each gene was used as its
weighting factor, positive and negative signs indicating high expression associated
with poor and good patient outcome, respectively. The absolute magnitude of the test
statistic determined the weight assigned to the gene expression value for each
sample, so that genes more predictive of outcome (i.e., by magnitude of their Cox
scores) had greater influence on the model. In this manner, each patient’s metagene
score was derived as a cumulative total of the products of gene weights and
expression values (for complete details see Materials and Methods).
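As an illustration of the weighting scheme described above, the following Python sketch computes a metagene score as the sum of the products of gene weights and expression values, with each weight taken as the signed square root of that gene's Cox χ² statistic. The function names and the example values are hypothetical, and any centering or scaling of the expression values is omitted; the procedure actually used is the one given in Materials and Methods.

```python
import math


def gene_weight(chi2, high_expression_is_poor):
    """Signed square root of a gene's Cox chi-square statistic.

    Positive sign: high expression associated with poor outcome.
    Negative sign: high expression associated with good outcome.
    """
    sign = 1.0 if high_expression_is_poor else -1.0
    return sign * math.sqrt(chi2)


def metagene_score(expression, weights):
    """Sum of weight * expression over the genes that make up the metagene.

    `expression` maps gene -> (log transformed) expression for one patient;
    genes absent from `weights` are ignored.
    """
    return sum(weights[g] * expression[g] for g in weights)


# Hypothetical two-gene metagene and one hypothetical patient.
weights = {
    "GENE_POOR": gene_weight(9.0, True),    # weight +3.0
    "GENE_GOOD": gene_weight(4.0, False),   # weight -2.0
}
patient = {"GENE_POOR": 2.0, "GENE_GOOD": 1.0, "UNUSED_GENE": 5.0}
score = metagene_score(patient, weights)
```

Under this sign convention a higher score corresponds to a higher predicted risk, which is consistent with the 3rd tertile being the high-risk group.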
In order to determine the most predictive metagene model, metagenes were
generated in a step-wise fashion from the ranked list of the best single gene
predictors. The maximum likelihood estimates of the χ² test statistics were determined
for each multivariate model and showed that a 34 probe set (34-metagene, Table 9)
model had the highest test statistic (see Materials and Methods, Figure 20B). Next,
the data set was permuted by sample shuffling and the cross-validated Cox
regression modeling was repeated, generating metagenes from the ranked list of the
best single gene predictors in the permuted data set. The χ² test statistics from
analysis of the metagenes generated on the permuted data set indicate that these
results are not likely due to chance alone (blue curve, Materials and Methods Figure
20B).
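The split-sample ranking and the permutation check described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the procedure of Materials and Methods: significance in each half is judged independently with a univariate Cox score test (χ² > 3.841, i.e., p<0.05 with one degree of freedom, with tied event times sharing one risk set), the direction of the effect is not compared between halves, and all function names are hypothetical.

```python
import random

CHI2_P05_1DF = 3.841  # chi-square critical value, p < 0.05, 1 degree of freedom


def cox_score_chi2(x, times, events):
    """Univariate Cox score test chi-square for a continuous covariate x."""
    u = v = 0.0
    for t_i, e_i, x_i in zip(times, events, x):
        if not e_i:
            continue  # censored patients contribute only through risk sets
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        mean = sum(risk) / len(risk)
        u += x_i - mean                                          # score
        v += sum((x_j - mean) ** 2 for x_j in risk) / len(risk)  # information
    return u * u / v if v > 0 else 0.0


def split_sample_counts(genes, times, events, rounds=100, seed=0):
    """Per gene, count random 50/50 splits where it is significant in both halves."""
    rng = random.Random(seed)
    n = len(times)
    counts = {g: 0 for g in genes}
    for _ in range(rounds):
        idx = list(range(n))
        rng.shuffle(idx)
        halves = (idx[: n // 2], idx[n // 2:])
        for g, x in genes.items():
            if all(
                cox_score_chi2([x[i] for i in h],
                               [times[i] for i in h],
                               [events[i] for i in h]) > CHI2_P05_1DF
                for h in halves
            ):
                counts[g] += 1
    return counts


def permute_outcomes(times, events, seed=0):
    """Shuffle (time, event) pairs across patients to break gene-outcome links."""
    rng = random.Random(seed)
    pairs = list(zip(times, events))
    rng.shuffle(pairs)
    return [t for t, _ in pairs], [e for _, e in pairs]
```

Genes would then be ranked by their counts, and repeating split_sample_counts on the output of permute_outcomes gives a chance-level reference against which the counts from the real data can be compared.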
Table 9. Genes used to create the 34-metagene continuous predictor of
outcome
Affy ID Gene Name Symbol
Genes correlated to good patient outcome
214643_x_at Bridging integrator 1 BIN1
219953_s_at Chromosome 11 open reading frame 17 C11orf17
222250_s_at Chromosome 1 open reading frame 73 C1orf73
204643_s_at Cytosolic ovarian carcinoma antigen 1 COVA1
201905_s_at Carboxy-terminal domain, RNA polymerase II, polypeptide A small phosphatase-like CTDSPL
218695_at Exosome component 4 EXOSC4
218314_s_at Hypothetical protein FLJ10726 FLJ10726
207688_s_at Inhibin, beta C INHBC
202788_at Mitogen-activated protein kinase-activated protein kinase 3 MAPKAPK3
213946_s_at Obscurin-like 1 OBSL1
35156_at R3H domain and coiled-coil containing 1 R3HCC1
218392_x_at Sideroflexin 1 SFXN1
207069_s_at SMAD, mothers against DPP homolog 6 (Drosophila) SMAD6
214662_at WD repeat domain 43 WDR43
219548_at Zinc finger protein 16 (KOX 9) ZNF16
Genes correlated to poor patient outcome
221588_x_at Aldehyde dehydrogenase 6 family, member A1 ALDH6A1
211248_s_at Chordin CHRD
210656_at Embryonic ectoderm development EED
213434_at Epimorphin EPIM
218394_at Leucine zipper domain protein FLJ22386
204075_s_at Glycine-, glutamate-, thienylcyclohexylpiperidine-binding protein GlyBP
209525_at Hepatoma-derived growth factor, related protein 3 HDGFRP3
220447_at Histamine receptor H3 HRH3
209184_s_at Insulin receptor substrate 2 IRS2
212546_s_at KIAA0826 KIAA0826
204584_at L1 cell adhesion molecule L1CAM
213437_at RUN and FYVE domain-containing 2; Run- and FYVE-domain containing protein LOC441022
213672_at Methionine-tRNA synthetase MARS
215921_at Nuclear pore complex interacting protein NPIP
209791_at Peptidyl arginine deiminase, type II PADI2
205632_s_at Phosphatidylinositol-4-phosphate 5-kinase, type I, beta PIP5K1B
211974_x_at Recombining binding protein suppressor of hairless (Drosophila) RBPSUH
219196_at Secretogranin III SCG3
202342_s_at Tripartite motif-containing 2 TRIM2
In order to evaluate the predictive power of the 34-metagene, RMS patients
were split into three approximately equally sized groups (tertiles, an approximation
determined by the histogram bar groupings) (Figure 14A). Kaplan-Meier analysis revealed that
patients in the 1st (n=39, blue curve) and 2nd (n=41, green curve) tertiles had five year
OAS of 98% and 75%, respectively (Figure 14B). In contrast, patients in the 3rd
tertile (n=40, red curve) had a five year OAS of only 29% (log-rank test, p<0.00001).
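A rank-based tertile assignment of metagene scores can be sketched in Python as shown here. The function name tertile_groups is hypothetical, and this sketch yields groups of exactly equal size when the number of patients is divisible by three, whereas the groups reported above (n=39, 41 and 40) were approximated from the histogram bar groupings.

```python
def tertile_groups(scores):
    """Assign each score to tertile 1, 2 or 3 by rank (1 = lowest scores)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        # Ranks 0..n-1 are mapped onto three consecutive, near-equal blocks.
        groups[i] = 1 + (3 * rank) // n
    return groups
```

With this convention tertile 1 holds the lowest metagene scores and tertile 3 the highest.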
Figure 14 Metagene predictor scores determine outcome in RMS patients. A: Histogram
showing the binned distribution of the 34-metagene predictor scores for 120 patients with
RMS tumors. The vertical purple lines highlight the tertile cut-points for the 34-metagene
scores. B, C: Kaplan-Meier survival analysis for all RMS patients (B) and non-metastatic
RMS patients only (C) by 34-metagene predictor tertile groups. Numbers below the curves
indicate the number of patients at risk in each tertile group (1st blue curve, 2nd green curve,
3rd red curve). p-values are from analysis of Kaplan-Meier survival curves by log-rank test.
The 1st tertile or low-risk metagene group had excellent outcome with only
one death in a patient with relapsed metastatic embryonal disease. Four additional
patients with metastatic disease were identified in this group but none of these
patients relapsed or died from their disease. Forty-eight percent of favorable outcome
Group I (n=13) patients (completely resected tumor with no evidence of microscopic
disease) were found in the 1st metagene tertile but equal numbers of Group III (n=13)
patients (incompletely resected tumor with gross residual disease) were also apparent
(χ² p<0.07). Only three patients in this group had alveolar histology tumors and only
one of these expressed the PAX3-FKHR fusion gene. This 1st tertile was
overwhelmingly comprised of patients with favorable histology embryonal and
embryonal variant tumors (χ² p<0.0001). However, only 60% of the superior
prognosis spindle cell (n=3) or botryoid (n=1) embryonal variants were in this 1st
tertile; the remainder were found in the 2nd tertile. Anatomical sites with favorable
prognosis were enriched in the 1st tertile, including three of four orbital tumors and
44% of genitourinary tract tumors (n=10), but so were 45% of parameningeal tumors
(n=9), which are an unfavorable tumor primary site. In recent years, patients with
localized favorable disease have had the burden of their therapy reduced by
elimination of radiotherapy and alkylating agents (i.e., cyclophosphamide).
However, although we found that the majority of patients that received the low dose
two-drug VA chemotherapy regimen (8 of 13) were in this group, most of
the patients (64%) received a standard three-drug combination (i.e., VAC, VIE, VAI). In
addition, over 64% of patients in the 1st tertile were treated with some form of
radiotherapy, either conventional (n=18) or hyper-fractionated radiotherapy (n=7).
The intermediate metagene risk group had 16 tumors with alveolar histology,
most of these expressed the more favorable prognosis PAX7-FKHR fusion gene
variant (n=8) (i.e., 73% of all PAX7-FKHR tumors) and only 4 PAX3-FKHR and 4
fusion negative alveolar histology tumors (p<0.0005). While the majority of the
tumors in this group had embryonal (n=22) or spindle/botryoid (n=3) histology, this
was not statistically significant (p<0.7). Of the 5 patients in the intermediate group
diagnosed with metastatic disease, 3 of these were tumors with embryonal histology
that relapsed and died of disease. Forty percent of patients in this intermediate group
had Stage 3 (n=16) disease and while there were fewer Stage 2 disease patients
(n=11) they represented 65% of all Stage 2 patients in the data set (p<0.009). Six of
the 2nd tertile patients were treated with VIE and all were cured of their disease. In
contrast, five patients from the high-risk metagene group (3rd tertile) that were
treated with VIE had only 40% five year OAS (p<0.04). Interestingly, patients in the
2nd tertile treated with VAC (n=13) had five year OAS of 77% and fared no better
than patients in the 3rd tertile treated with VAC (n=5) that had an OAS of 80%
(p<0.9), suggesting that there exists a subset of patients that are unresponsive to VAC
therapy.
The highest risk group (i.e., 3rd tertile) was comprised predominantly of
patients with alveolar histology tumors (78%, Fisher exact, p<0.001); twenty-five of
these expressed the PAX3-FKHR fusion gene, only 3 expressed the PAX7-FKHR fusion gene
variant and four were fusion negative (p<0.0007). Of the eight patients with
embryonal histology tumors, five of them died from their disease (one case from
infection while on regimen). Irrespective of tumor histology, most of the relapsed
patients that succumbed to disease (n=28) and twenty-one patients with metastatic
disease (representing 68% of all patients with metastatic disease at diagnosis) were
found in this high-risk group (p<0.0001). However, as described above, there were a
number of patients with metastatic disease at diagnosis that were found in the low
(n=5) and intermediate (n=5) metagene-risk groups. Compared to patients in the high
risk group with an OAS of only 11% at five years, patients with metastatic disease in
the low and intermediate risk groups fared significantly better with OAS of 50% at
five years (χ² p<0.002). Furthermore, the metagene was still predictive when patients
with evidence of metastatic disease at diagnosis were omitted from the Kaplan-Meier
analysis (Figure 14C). Patients with non-metastatic disease in the 1st, 2nd and 3rd
tertile risk groups had an OAS at five years of 100%, 83% and 47%, respectively (log-rank
test, p<0.0001). This is in accordance with suggestions in the literature that there exist
metastatic disease patients with an atypical favorable prognosis (Breneman et al.,
2003). With a median age of 8.46 years the patients of this 3rd tertile group were
slightly older than the median age of 6.27 years for the other two metagene
risk groups (p<0.02 by logistic regression). Most patients in the 3rd tertile had tumors
that showed lymph node involvement (n=17, 53%), were locally invasive (n=24, 75%) and
were of greater size (i.e., >5cm, n=25, 78%). Twenty-nine of these patients had
tumors in prognostically unfavorable anatomic sites, although this was not
statistically significant (p<0.29); 34% of them were extremity lesions and 26% were
lesions arising in the trunk/retroperitoneum (p<0.0001). Only two of fifteen patients
treated with VAC and additional experimental therapies such as irinotecan,
topotecan, ifosfamide/etoposide and high-dose cyclophosphamide and only six of
fourteen patients treated with conventional three-drug combinations (i.e., VAC, VIE,
VAI) survived in the high risk 3rd tertile group.
Finally, adjusting for previously characterized prognostic factors such as tumor size,
patient age, anatomic site, tumor histology, stage, clinical group and PAX-FKHR
translocation variant in multivariate analysis did not significantly affect the
performance of the 34-metagene (Table 10). In addition, the 34-metagene predictor
scores were independent of the newly defined IRS risk groupings
that are currently being used in COG STS clinical trials. Therefore, the prognostic
metagene identifies differences in patient outcome independent of clinically defined
prognostic variables.
Table 10. Metagene survival predictor uni- and multi-variate analysis
Covariate             Parameter                  Five year Overall Survival Estimates  Univariate p-value*  Adjusted MG p-value*
34-Metagene Tertiles  1st, 2nd, 3rd Tertile      98%, 75%, 29%                         0.00001              -
IRS Risk Group†       Low, Inter., High          96%, 69%, 11%                         0.00001              0.00001
Metastasis            M0 vs. M1                  82% vs. 22%                           0.00001              0.00001
Clinical Group        I, II, III, IV             100%, 74%, 76%, 22%                   0.00001              0.00001
Stage                 1, 2, 3, 4                 90%, 100%, 69%, 22%                   0.00001              0.00001
Local Invasiveness    T1 vs. T2                  88% vs. 56%                           0.00079              0.00001
Nodal Involvement     N0 vs. N1                  82% vs. 36%                           0.00007              0.00001
Tumor Size            ≤ 5 cm vs. > 5 cm          86% vs. 60%                           0.0021               0.00001
Anatomical Site‡      Favorable vs. Unfavorable  78% vs. 62%                           0.15                 0.00001
Histology             Embryonal vs. Alveolar     82% vs. 47%                           0.0002               0.00001
Age                   <10 vs. ≥10 years          74% vs. 49%                           0.004                0.00001
* Univariate: Cox Regression p-values for individual clinical risk factors. Adjusted p-value:
Multivariate Cox regression p-values for the 34-Metagene, adjusted for each of the clinical risk
factors or covariates.
† IRS risk groups see Table 8.
‡ See Table 5 legend for favorable and unfavorable anatomic sites definitions.
Biological Themes of Prognosis Correlated Genes
To investigate the functional consequences of the genes whose expression was
correlated to outcome, in terms of ‘biological themes’, we performed
overrepresentation analysis using the EASE program as described in previous
chapters (Hosack et al., 2003). We compared the statistical overrepresentation of
Gene Ontology (GO) categories between the gene lists for genes whose increased
expression was correlated with poor and good outcome (Appendix, Supplementary
Table 18). Among genes whose increased expression was correlated to
poor outcome (n=241) we found overrepresentation of GO terms such as
‘oxidoreductase activity’, ‘tubulin binding’, ‘electron transporter activity’, ‘negative
regulation of transcription’, ‘regulation of synapse’ and ‘invasive growth’. In
contrast, among genes whose increased expression was correlated to
good outcome (n=337), the terms ‘nucleic acid binding’, ‘RNA binding’,
‘damaged DNA binding’, ‘translation factor activity’ and ‘RNA processing’ were
overrepresented (Appendix, Supplementary Table 19). Inspection of the 34-metagene
signature (Table 9) revealed that poor outcome genes such as L1CAM, a
cell adhesion molecule (Allory et al., 2005; Gavert et al., 2005; Thies et al., 2002)
and IRS2 (Nagle et al., 2004) are associated with increased metastatic potential and
invasiveness in several tumor types. Another poor outcome gene, transcription factor
RBPSUH, is involved in repression of differentiation in numerous cancers (Pursglove
and Mackay, 2005) and conversely, a good outcome gene BIN1 is a well
characterized tumor suppressor gene that functions to promote muscle differentiation
(Wechsler-Reya et al., 1998) and differentiation of tumor cells (Sakamuro and
Prendergast, 1999).
Discussion
This is the first study aimed at identifying prognostically relevant gene expression
profiles in rhabdomyosarcoma. In previous chapters, we identified two main
molecular classes of RMS tumors and found that much of the variation in gene
expression can be attributed to the presence or absence of PAX-FKHR expression. In
the present analysis, we queried rather for differences in outcome and found that a
metagene predictor behaves independently of tumor histology or molecular class.
The underlying assumption that one must make with this sort of analysis is that all
RMS tumors, though phenotypically and genetically heterogeneous, share
common transcriptional and regulatory programs. The most likely candidate program
is one regulated by members of the type II PAX transcription factor family. Two
genes, PAX3 and PAX7 make up this subfamily and are expressed in both ARMS
(Barr et al., 1996) and ERMS tumors (Barr et al., 1999; Tiffin et al., 2003), although
in the former, predominantly as fusion genes, the products of chromosomal
translocations. These two transcription factors have overlapping patterns of
expression during development and are, with few exceptions, not expressed
postnatally (Mansouri, 1998). Constitutive re-expression of PAX3 and PAX7 is found in
numerous sarcomas and neural crest tumors (Schafer, 1998). PAX proteins during
development are known to regulate proliferation, differentiation, stem-cell self
renewal, apoptosis resistance, migration and invasion (Robson et al., 2006).
Therefore deregulation of these functions is likely to play an important role in tumor
cells and is thought to underlie the malignancy of RMS tumors (Blake and Ziman,
2003). Indeed, it has been known for some time that downregulation of these genes
induces an apoptotic program in several tumor types (Bernasconi et al., 1996; Scholl
et al., 2001). On the basis of their common expression and utilization of PAX
transcription factors, crucial to cell survival and the process of tumorigenesis, it is
therefore justifiable to assume that they direct common transcriptional programs in
all RMS subtypes.
Given this hypothesis, which attributes such an important role to PAX3/7 or
PAX-FKHR fusion genes in directing malignancy in such a hierarchical manner, the
obvious question arises: why are these genes not an integral part of the
metagene predictor or prognostic gene signatures identified in this analysis?
Unfortunately, technical limitations and design flaws in the U133A platform have
precluded its use for the reliable measurement of PAX3/7 gene expression levels.
In the final chapter of this thesis, we will show some preliminary data from the latest
Affymetrix gene expression array technology, the Human All Exon Arrays which
appear to reliably measure wild type PAX3 and PAX7 and potentially PAX-FKHR
fusion genes. As described in Chapter 2, PAX-FKHR can be described as a gain-of-
function mutant of wild type PAX and it is likely that they share many of the same
target genes (Barr, 2001) and they are known to compete for transcriptional
activation at target gene promoters (Anderson et al., 2001c; Fredericks et al., 1995).
We and others have focused on the differences between fusion positive and fusion
negative RMS tumors in the context of PAX-FKHR expression but future studies
will also need to look at the commonality of all active PAX signaling in RMS. In
Chapter 2, although we could not directly correlate PAX-FKHR
expression levels to outcome, we found that the putative downstream targets of
PAX-FKHR were highly correlated to outcome and defined a 33-metagene
comprised of PAX-FKHR expression signature genes. In an analogous fashion, here
we describe a 34-metagene that is highly correlated to patient outcome and possibly
represents downstream targets common to all PAX signaling (i.e., wild type and
fusion gene) active in all RMS subtypes.
Recently, Wachtel et al. reported protein expression patterns on a large panel
of RMS tissue microarrays that were prognostically significant (Wachtel et al., 2006).
With this immunohistochemical method, they observed differences in overall
survival of RMS patients determined by the expression pattern of four proteins.
Essentially, this type of prognostic stratification of patients divides patients into the
RMS molecular classes that we described in Chapter 1, that is the mARMS and
mERMS patients with no additional prognostic information. Similarly, we found that
by both single gene and multi-gene clustering analyses we were unable to obtain
outcome relevant data beyond the well characterized favorable (i.e., embryonal) vs.
unfavorable (i.e., alveolar) histology dichotomy, known for nearly three decades as
an independently prognostic factor. The 34-metagene described in this chapter
identified patient risk groups independent of tumor histology. Kaplan-Meier analysis
of the objectively derived 34-metagene tertile groups divided patients into three risk
groups. In the low risk group (1st tertile) almost all patients had embryonal histology
tumors except for three patients with alveolar histology tumors. Patients with
superior prognosis spindle cell or botryoid histology tumors were however divided
between the low and intermediate risk groups. Although none of these patients in
our cohort died from their disease, it is clear that they have heterogeneous gene
expression profiles, as was described in Chapter 1: about half of the spindle/botryoid
patients clustered in the highly myogenic E1 mERMS class, the rest in the less
myogenic E2 class. Therefore, the superior prognosis characteristics of these tumors
might be more attributable to the fact that they are relatively easy to manage
surgically and tend to be diagnosed early (i.e., they arise in readily accessible sites
e.g., oral mucosal cavities or paratesticular regions). The intermediate risk group (2nd
tertile) had OAS of 75% at five years, very similar to that of clinical group II and III
disease patients. Although there were more patients with embryonal histology, the
proportion of patients with alveolar histology was similar to the overall proportion of
these patients in this and larger data sets (Crist et al., 2001). There were however a
disproportionate number of patients with PAX7-FKHR ARMS, further support for
reports that this subset of alveolar patients is associated with better outcome (Kelly et
al., 1997; Sorensen et al., 2002). While the high risk group (3rd tertile) was
predominantly comprised of the less favorable PAX3-FKHR ARMS tumors, it also
included eight patients with embryonal disease and two with mixed
alveolar/embryonal disease. These data highlight the capability of the metagene to
identify subsets of patients with varying risk, independent of tumor histology, and
indicate that patients of different histologies share overlapping expression patterns
of prognostic genes.
The most adverse prognostic factor for patients with RMS is the presence of
metastatic disease at diagnosis. While most of the metastatic disease patients were
found in the high risk (3rd tertile) group, nearly a third were found in the other two
risk groups. Significantly, metastatic patients in the high risk group had five year
OAS of only 11% but metastatic disease patients in the other risk groups had
significantly better OAS of 50% at five years. Also, while the five year survival of
the nonmetastatic patients in the high risk group increased to 47% from 29% when
all patients were analyzed, it remained well below that of the survival of the low and
intermediate risk groups which was not significantly different when metastatic
patients were excluded from analysis. This supports observations from clinical trials
that found subsets of patients with metastasis with a favorable prognosis (Breneman
et al., 2003; Pappo et al., 1999a). However, these studies defined favorable prognosis
metastatic patients as those less than ten years old with embryonal disease at diagnosis or
orbital tumors. In contrast, we found that three of ten metastatic disease patients in
low or intermediate metagene risk groups had alveolar disease, none had
orbital tumors and seven had tumors at unfavorable sites (pelvic, extremity and
parameningeal). Based on these results, it seems probable that a genetic method for the
sub-stratification of patients with disseminated disease at diagnosis will greatly improve
the management of metastatic disease patients. Furthermore, it is well known that
previously untreated high risk RMS respond better to chemotherapy with drugs such
as melphalan than do relapsed or metastatic RMS that have already been subjected to
some form of prior chemotherapy (Horowitz et al., 1988). Therefore, a continuous
predictor of outcome such as the 34-metagene presented here could provide the
means to reliably distinguish at diagnosis the patients with the worst prognosis,
enabling clinicians to properly test experimental therapeutic agents on chemo-naïve
high risk patients (Breitfeld and Meyer, 2005).
Over the years reports from IRSG/COG STS committee clinical trials have
refined protocols for patients with the most favorable prognosis to reduce exposure
to cyclophosphamide (i.e., VA only) and radiation therapy (Raney et al., 2001).
Additional radiotherapy to favorable prognosis patients (i.e., Group I, low stage,
embryonal disease) was shown to provide no additional benefit for reduction of
tumor burden (Wolden et al., 1999). The low risk metagene group (1st tertile) had
excellent outcome with 98% OAS at five years and only 1 patient death. Most
patients in this group received both three-agent chemotherapy (i.e., VAC) and
radiotherapy. It is unclear from this analysis whether metagene low risk patients in
fact benefited from additional chemo- and radiotherapy or whether they
represent a subset of patients in whom further reduction in therapy would not
harm outcome but would reduce treatment-associated side-effects and
increase quality of life (Schuck et al., 2004). Another interesting finding
came from comparison of patients treated with VIE or VAC in the intermediate (2nd
tertile) and high (3rd tertile) risk groups. No patients in the intermediate metagene
risk group treated with VIE died but OAS at five years for patients in the high risk
group was only 40%. In contrast, VAC treated patients in both intermediate and high
risk groups had five year OAS of 80%. These data suggest that within the
intermediate risk group, a subset of patients exists that is very responsive to VIE
therapy in contrast to the conclusions drawn from the IRS-IV study that found no
benefit for VIE treated patients in comparison to a VAC regimen (Crist et al., 2001).
Conclusions from such retrospective analyses are hard to make with regard to the
effect of treatment in subsets of patients with different prognosis as defined by
metagene expression signatures. It will therefore be most interesting to test these
further on independent data sets and eventually in prospective analyses.
Analysis of the genes used to create the 34-metagene and those that overall
were correlated to outcome revealed interesting observations. Several genes are
known to be linked to increased metastatic potential in numerous tumors (e.g.,
L1CAM, IRS2) and others as key regulators of tumor and normal cell differentiation
(e.g., BIN1, RBPSUH). In Chapter 1, we found that expression of a subset of
myogenic differentiation genes was associated with better outcome, and these data
from the prognostic metagene further demonstrate the role that errors in
differentiation control may play in tumor behavior (da Costa, 2001). Although none
of the genes in our 34-metagene signature are such structural myogenic genes, it is
likely that they represent differentiation control genes regulating transcription of the
structural genes. However, GO overrepresentation analysis also indicates that there
are other important pathways that may be disrupted in tumors from patients with
poor outcome such as the DNA damage response pathways, amino-acid metabolism
pathways, RNA splicing pathways and protein synthesis pathways.
Current methods for rhabdomyosarcoma staging have evolved to direct risk-
adapted therapy (Breitfeld and Meyer, 2005). This has yielded an unwieldy, complex
amalgam of variables and clinical factors requiring a multi-disciplinary approach to
assign patients into protocol regimens (Qualman et al., 2003). This is important not
only for patient management but also for evaluation of the effects of different
treatment regimens. However, it appears that clinicopathological based staging
systems do not identify many of the fundamental differences in underlying tumor
biology (Bair and Tibshirani, 2004). Molecular staging in contrast, by identifying
biological differences in terms of gene expression signatures or genomic based
predictors has the promise of identifying differences in tumor behavior and biology
at diagnosis (Simon et al., 2003). The 34-metagene derived from analysis of gene
expression profiles at diagnosis generated a continuous predictor of patient outcome
that surpassed the capabilities of the TNM based RMS staging and the post-surgical
RMS clinical groupings. The latest risk assignment scheme put into practice for the
IRS-V clinical trial is derived from post-hoc analysis of previous IRS clinical trials
and is admittedly more adept at identifying patients with the poorest outcome but
these patients represent <15% of all RMS patients. The 34-metagene risk groups, in
contrast, were objectively derived (i.e., tertile cutpoints) so that the high risk 3rd
tertile group includes a third of all patients analyzed. The 34-metagene 1st and 2nd
tertiles were, however, equally adept at segregating patients with superior and
intermediate prognosis, on par with the latest IRS risk group assignment scheme. Future
studies are necessary, however, to validate and expand this analysis with larger
numbers of patients and eventually to use metagene predictors prospectively to prove
their value for widespread clinical use.
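The tertile-based grouping can be illustrated with a minimal sketch. It assumes each patient already has a continuous metagene score; the scores shown and the function name `tertile_groups` are hypothetical, and ties are resolved by rank order, which is an assumption rather than a rule stated in this work.

```python
def tertile_groups(scores):
    """Return a risk group (1, 2 or 3) for each score by splitting the
    ranked scores into three groups of (nearly) equal size."""
    n = len(scores)
    # Indices of the patients ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        groups[i] = rank * 3 // n + 1  # 1 = low, 2 = intermediate, 3 = high
    return groups


# Hypothetical metagene scores for six patients.
scores = [0.2, -1.3, 2.5, 0.9, -0.4, 1.7]
groups = tertile_groups(scores)
print(groups)  # [2, 1, 3, 2, 1, 3]
```

Because the cutpoints depend only on the rank of each score, the high risk group always contains about a third of the patients analyzed.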
Future Directions
In this Chapter, we report in brief preliminary results from what is hoped to be the
next phase of microarray based gene expression profiling in pediatric soft-tissue
sarcomas. All-exon gene expression arrays are now available that make it possible to
interrogate 1.4 million probe set regions (PSRs), which can be used to report gene
expression on an exon level as well as other RNA species such as microRNAs. We
subjected RNA from five fusion negative embryonal and five PAX3-FKHR alveolar
RMS tumors to analysis using the HuEx1.0 arrays. The data was modeled using
ProbeProfiler set to normalize expression values to the median expression level of all
probe sets on all chips. After import into Genetrix, we examined the expression
patterns of a number of genes of interest to RMS biology (Genetrix gene screening
procedures and higher-dimensional clustering algorithms are still in the
developmental phase). In addition to gene expression data from analysis of this panel
of tumors on the U133A platform, a subset of genes was also analyzed by QRT-PCR.
As described in previous chapters, due to limitations in the design and assay
conditions for the Affymetrix Gene Expression arrays (i.e., the 3’ bias of the U133A
platform), gene expression data on PAX3 and PAX7 was not of any use. The design
for the probe sets for these genes on the U133A platform did not include any probe
sets that interrogated expression levels beyond exon 4. As can be seen in Figure 15,
there is no correlation between PAX3 or PAX7 expression levels determined by
U133A microarrays and the QRT-PCR data.
Figure 15 Comparison of quantitative RT-PCR normalized expression data to
U133A microarray normalized expression data for PAX3 and PAX7. Neither PAX3
(red diamonds) nor PAX7 (blue squares) expression levels as detected by U133A
microarray correlate to QRT-PCR derived expression values. Due to the 3’ bias of the
U133A platform (cRNA is oligo-dT primed only) and the fact that probes were
designed to the first 4 exons only (based on incomplete GenBank sequences published
at the time), the U133A platform cannot reliably estimate expression levels for PAX3
or PAX7.
For PAX7, QRT-PCR shows that the mERMS tumors had overall higher expression
than the mARMS tumors (Figure 16A). When we examined PAX7 PSRs using the
HuEx1.0 platform, we could derive expression values for all exons as depicted in
Figure 16B. Furthermore, expression values for a metagene (derived from
normalizing the expression values across all exons) give results highly correlated to
the analysis of PAX7 as determined by QRT-PCR (Figure 16B).
Figure 16 Comparison of QRT-PCR normalized expression and HuEx1.0 relative
expression for PAX7. A: Normalized expression levels as determined by QRT-PCR
for PAX7 depict increased expression in ERMS samples relative to ARMS samples
(p<0.002). B: Relative expression as determined by analysis of HuEx1.0 derived
PAX7 levels. Exonic structure is depicted with lines connecting probe set regions
(PSRs) to heatmap. The expression of each PSR (in some cases there are multiple
PSRs per exon) in each sample was normalized in the pseudo-colored expression
matrix based on the number of standard deviations above (red) and below (blue) the
median expression value (white) across all samples. The larger rectangles adjacent to
the legend indicate ‘metagenes’ derived from averaging expression across all exons to
give relative gene-level expression values. HuEx1.0 metagenes are highly correlated
to the QRT-PCR data (R² = 0.72).
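The exon-level normalization and gene-level 'metagene' described in the Figure 16 legend can be sketched as follows. This is an illustrative sketch only: the PSR intensities and QRT-PCR values are hypothetical, the function names are not taken from Genetrix, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, median, pstdev


def normalize_psr(values):
    """Express one PSR's values as standard deviations above or below
    the median across samples."""
    med = median(values)
    sd = pstdev(values)
    return [(v - med) / sd if sd > 0 else 0.0 for v in values]


def gene_metagene(psr_matrix):
    """Average the normalized PSR values per sample.
    psr_matrix: one row per PSR, one column per sample."""
    normalized = [normalize_psr(row) for row in psr_matrix]
    n_samples = len(psr_matrix[0])
    return [mean(row[j] for row in normalized) for j in range(n_samples)]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical intensities: three PSRs measured in four samples.
psr_matrix = [
    [5.0, 6.0, 9.0, 10.0],
    [4.0, 5.0, 8.0, 9.0],
    [6.0, 6.5, 9.5, 10.0],
]
qrt_pcr = [1.0, 1.2, 2.9, 3.1]  # hypothetical QRT-PCR values
metagene = gene_metagene(psr_matrix)
print(round(r_squared(metagene, qrt_pcr), 2))
```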
In the case of PAX3, analysis of gene expression on an exon level is
complicated by the fact that the alveolar tumors express PAX3-FKHR which is a
product of the fusion of exons 1-7 of PAX3 to exons 2 and 3 of FOXO1A. In
addition, alveolar tumors express wild type PAX3, both of which were detected by
QRT-PCR using specific primers that span the breakpoint (PAX3-FKHR only) or
primers 5’ downstream of the breakpoint (PAX3 only) (Figure 17A). Accordingly,
we noticed upon inspection of exon expression levels that most of the alveolar
tumors had higher expression of PAX3 exons 1-7 relative to the embryonal tumors
(Figure 17B). Beyond the breakpoint, expression of PAX3 exons 8-10 actually
appeared to be higher in embryonal versus alveolar tumors. Inspection of FOXO1A
(i.e., FKHR) showed that expression levels followed a similar trend. Expression
of FOXO1A exon 1 showed relatively increased levels in embryonal tumors but the
alveolar tumors had higher expression across the breakpoint for FKHR exons 2 and 3
(Figure 17C). Through transcriptional activation, PAX3-FKHR alveolar tumors
exhibit amplification in the number of fusion gene transcripts (Barr et al., 1996) and
this probably explains why we observe increased expression of PAX3 (exons 1-7)
and FKHR (exons 2-3) in the mARMS tumors. We are currently working on methods
to combine PSRs from different genes (and different chromosomes) into metagenes.
This is hoped to enable estimation of gene expression levels for fusion gene
transcripts such as PAX3-FKHR and PAX7-FKHR.
Figure 17 Biallelic expression of PAX3, PAX3-FKHR and FKHR. A: Normalized
expression levels as determined by QRT-PCR for PAX3 and PAX3-FKHR depict
expression of PAX3 in some ERMS samples but only PAX3-FKHR in ARMS
samples. B: Relative expression as determined by analysis of HuEx1.0 derived PAX3
levels. Exonic structure is depicted with lines connecting probe set regions (PSRs) to
heatmap. PAX3 expression in ERMS is lower across exons 1-7 compared to ARMS
presumably due to transcriptional amplification of the PAX3-FKHR allele in ARMS
tumors. C: FKHR expression shows relatively higher expression for ERMS until
exons 2 and 3 which are included in the ARMS fusion gene.
Detection of alternative splicing of genes is also possible using the HuEx1.0 platform. As
is depicted in Figure 18, argininosuccinate synthetase (ASS) is a gene that was
determined to be differentially expressed between mARMS and mERMS tumors and a
putative PAX-FKHR target gene (i.e., PAX-FKHR expression signature member)
(Figure 18A). Inspection of the exon level expression patterns reveals that this gene
is highly expressed in mERMS tumors but only from exons 1-8; in contrast, mARMS
tumors display expression of the full length gene (Figure 18B). As was described in
Chapter 3, ‘RNA splicing’ was a Gene Ontology category that was overrepresented
in the ‘good outcome’ prognostic gene list. Therefore, future studies will be required
in order to assess the role of RNA alternative splicing in RMS and perhaps lead to
more sensitive markers of class discrimination and perhaps novel classes of tumors
as well.
Figure 18 Alternative splicing of ASS1 in RMS tumors. A: ASS1 is differentially
expressed between ARMS and ERMS tumors as depicted in the binned histogram of
U133A microarray expression values (p<0.0008). B: On the exon level, ASS1 appears
to be equally expressed in ARMS and ERMS but ERMS only utilize exons 1-8 and do
not express the remaining exons whereas ARMS express the full length gene. Exonic
structure is depicted with lines connecting probe set regions (PSRs) to heatmap.
There are several other areas which are enabled by technological progress in
the field of DNA microarrays. Already, SNP Chips are available that interrogate over
500,000 single nucleotide polymorphisms with 50 times greater density than the
10K SNP Chips used in the LOH analysis presented in Chapter 1. With this platform,
one can certainly obtain a higher resolution picture of somatic changes such as
patterns of LOH and copy number changes of whole chromosomes and chromosomal
regions in RMS tumors. In addition, large case-control studies including RMS
patients and their family members are needed to identify any possible germline
predisposition loci for this disease. It is known from epidemiological studies that the
incidence of RMS is lower in East Asians than in Western Eurasians (Stiller et al.,
1991) and that siblings of RMS patients have increased incidence of cancer
(Friedman et al., 2005).
Finally, Affymetrix in cooperation with the COG Translational Medicine
Agreement has agreed to support and design custom allelic specific exon level
expression arrays (custom tiling arrays). These Gene Chips will provide the unique
opportunity to observe expression patterns in an allele specific manner and can
detect RNA transcripts beyond the known protein-coding genes to enable mapping of
transcription factors and other protein binding domains (Cheng et al., 2005). We
intend to design a custom array for rhabdomyosarcoma genes of interest, focusing on
loci with some previous characterized RMS association such as the 11p15 region
(with genes such as IGF2, MYOD, p57KIP2) which either through loss-of-imprinting
or LOH is deregulated in most ERMS and the loci for PAX3, PAX7 and FKHR
which are disrupted in ARMS. With this type of analysis we hope to gain insights
into how the genomes of cancer cells are regulated and data on what role allelic
specific expression plays in RMS. This new generation of arrays will provide new
challenges that will require new methods for analysis and increase the demand on
software and hardware solutions for this data management. In addition to running
new RMS cases on these new platforms, it will be equally important to validate new
RMS cases on the older platforms. This will be important especially for the
prognosis analysis described in Chapter 3, where validation on an independent data
set is crucial before steps can be made towards their clinical use.
Chapter 5: Materials and Methods
Tumor Specimens and Cell Lines
Frozen tumor samples from patients enrolled in the Intergroup Rhabdomyosarcoma
Study Group (IRS-IV and -V) Children’s Oncology Group clinical trials were
obtained from the Pediatric Cooperative Human Tissue Network tumor bank
(Columbus, OH). Additional tumor samples were obtained from the Children’s
Hospital Los Angeles institutional tumor bank. Frozen tumor samples were sectioned
and representative sections were examined for tumor cell content of at least 80%.
Clinical covariates were obtained from the COG Research Data Center (Arcadia,
CA). For clinical characteristics of the data set see Table 1, Chapter 1 and individual
sample covariate data can be found in Appendix, Supplementary Table 1. The
analysis described in Chapter 1 included 160 patients with RMS or UDS/NRSTS. In
Chapter 2, we excluded patients characterized as mNRSTS from the analyses
described in Chapter 1 leaving 139 RMS tumors for analysis (see Appendix,
Supplementary Table 6). For the analysis described in Chapter 3, 120 patients with
overall survival data were used (see Appendix, Supplementary Table 17). RMS cell
lines evaluated (Chapter 2) included four embryonal cell lines (TTC-442, TTC-516,
Birch and RD), one alveolar fusion-negative cell line (RH18) and four alveolar
PAX3-FKHR fusion-positive cell lines (HR, JR/C, RH28 and RH30). All cell lines
were cultured in RPMI 1640 supplemented with 10% fetal bovine serum and
antibiotics (Invitrogen, Carlsbad, CA).
Histology and Molecular Diagnosis
Centrally reviewed (IRSG/CHTN) histological diagnoses were based on the
International Classification of Rhabdomyosarcoma criteria, in accordance with IRSG
protocols (Newton et al., 1995; Qualman et al., 1998). Mixed alveolar/embryonal
tumors or tumors with any evidence of alveolar histology, classical or solid variant
were classified as the alveolar subtype. Embryonal tumors were all of classical
histology or not-otherwise specified with the exception of the botryoid and spindle
cell variants. Undifferentiated sarcomas or those sarcomas designated as ‘other’ due
to indeterminate/uncertain diagnosis were classified as UDS/NRSTS. RT-PCR using
total RNA extracted from frozen tissue was performed previously on all tumors with
alveolar or mixed alveolar/embryonal histology for detection of specific fusion
transcripts for PAX3-FKHR and PAX7-FKHR (Barr et al., 2002).
RNA and DNA Isolation for Microarray Expression and SNP Profiling
DNA and RNA were extracted from frozen tissues with DNA STAT and RNA
STAT-60, respectively (Tel-Test Inc., Friendswood, TX). Total mRNA from the
transduced polyclones and RMS cell lines was harvested using TRIzol (Invitrogen).
RNA was purified with RNeasy Protect Kit (Qiagen, Valencia, CA) according to the
manufacturer’s instructions. Biotin-labeled cRNA was prepared from total RNA and
hybridized to Affymetrix GeneChip Human U133A Expression Arrays according to
the manufacturer’s protocol (Affymetrix, Santa Clarita, CA). Genomic DNA was
digested with XbaI, PCR amplified and hybridized to Affymetrix GeneChip 10K
mapping arrays according to the manufacturer’s protocol. Complete microarray
protocols can be found at the USC/CHLA Genome Core website at
http://genomecore-chla.usc.edu/GenomeCore/GenomeCore.html.
Quantitative RT-PCR
To determine PAX-FKHR mRNA expression in primary tumors, aliquots of the
microarray in vitro transcription (IVT) reactions were subjected to quantitative PCR
using the QuantiTect SYBR Green PCR kit (Qiagen) on a Smart Cycler (Cepheid,
Sunnyvale, CA). IVTs prepared for microarray analysis were used in place of cDNA
due to insufficient RNA to make additional cDNA preparations for all tumors. This
approach was first validated by QRT-PCR analysis on a subset of samples for which
cDNA was available (Figure 19A). Expression levels were normalized to GAPDH
according to standard curves generated for each primer set and scaled by log
transformation for multi-dimensional visualization (Figure 19B). All reaction
products were verified to have yielded a specific amplification product by melt-curve
analysis using the Smart Cycler v2.0 analysis software. Two samples with low but
detectable PAX-FKHR mRNA (blue arrows, Figure 19B) showed melt-curves
specific for PAX-FKHR product formation but expression levels of <0.1% of the
median PAX-FKHR expression level.
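The normalization to GAPDH by standard curves can be sketched as below. This is a minimal illustration, assuming a linear standard curve of the form Ct = slope × log10(quantity) + intercept for each primer set; the slope, intercept and Ct values are hypothetical, and the use of base-10 logarithms for the final scaling is an assumption.

```python
import math


def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def normalized_log_expression(ct_target, ct_reference, curve_target, curve_reference):
    """Log10 of the target quantity relative to the reference (e.g. GAPDH).
    Each curve is a (slope, intercept) pair for that primer set."""
    q_target = quantity_from_ct(ct_target, *curve_target)
    q_reference = quantity_from_ct(ct_reference, *curve_reference)
    return math.log10(q_target / q_reference)


# Hypothetical standard curves and threshold cycles.
fusion_curve = (-3.32, 35.0)
gapdh_curve = (-3.32, 35.0)
value = normalized_log_expression(25.0, 20.0, fusion_curve, gapdh_curve)
print(round(value, 3))
```

With identical curves for both primer sets, the result reduces to the Ct difference divided by the slope.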
Figure 19 Expression of PAX-FKHR mRNA in primary ARMS tumors. A: relative PAX-
FKHR mRNA expression was compared by QRT-PCR on a subset of cases for which both
cDNA and IVT reactions were available. Expression values relative to the GAPDH
expression value determined for each cDNA and IVT reaction. B: relative PAX-FKHR
mRNA expression for 59 ARMS tumors was determined by QRT-PCR using IVT
reactions. PAX-FKHR expression was normalized to GAPDH and the expression value
shown is relative to the highest expressing sample. Blue arrows indicate two samples with
low but detectable PAX-FKHR expression (<0.1% of the median PAX-FKHR expression
level).
PAX-FKHR primer sequences (Peter et al., 2001) and GAPDH primer sequences (a
generous gift of Dr. D.E. Schofield) are as follows:
PAX3-FKHR forward: TCCAACCCCATGAACCCC
PAX7-FKHR forward: CAACCACATGAACCCGGTC
PAX3/7-FKHR reverse: GCCATTTGGAAAACTGTGATCC
PAX3 forward: TTCCAACCCAGACAGCAG
PAX3 reverse: GGAGAGCGCGTAATCAGT
PAX7 forward: GTTCGGGAAGAAAGAGGAGG
PAX7 reverse: TTCAGTGGGAGGTCAGGTTC
GAPDH forward: TCCTCTGACTTCAACAGCGACA
GAPDH reverse: ATGGTACATGACAAGGTGCGG
Retroviral Transduction and FACS Sorting
HA epitope-tagged PAX3-FKHR and PAX7-FKHR (Anderson et al., 2001c) were
subcloned into the retroviral expression vector MSCV-IRES-GFP (Persons et al.,
1999; Torchia et al., 2003). Supernatants containing virus were made in 293T cells
by co-transfecting expression vector with plasmids pHIT60 (Soneoka et al., 1995) or
pCgp (Han et al., 1997) encoding gag-pol and pHIT456 (Page et al., 1990) encoding
amphotropic env. RD cells were transduced with retrovirus, FACS sorted 48-72
hours later for GFP expression and expanded for RNA and protein isolation.
Immunoblotting
Protein was isolated as described previously (Plattner et al., 1996). Equal amounts of
protein (40 µg) were fractionated by 7.5-12% SDS-PAGE, transferred to Immobilon-
P membrane (Millipore, Billerica, MA), and reacted with antibodies against HA
(HA.11, Covance, Princeton, NJ) and β-actin (AC-15, Sigma, St. Louis, MO).
Immunodetection was done using HRP-conjugated secondary antibodies (Bio-Rad,
Hercules, CA) and the ECL detection system (Amersham, Piscataway, NJ).
Transient Reporter Assays
Transfections were performed using Lipofectamine and Plus Reagent according to
the manufacturer’s protocol (Invitrogen). For the PRS9 reporter assays, transfections
were done using 1 µg of pcDNA3-lacZ and 1 µg of pTK-PRS9 (Anderson et al.,
2001c; Chalepakis et al., 1991). Cells were harvested 72 hours post-transfection, and
CAT assays performed as described previously (Anderson et al., 2001c). CAT
activity was normalized to β-galactosidase activity; experiments were performed
within the linear range of the assays and done three times in duplicate.
Immunohistochemistry
Formalin-fixed, paraffin-embedded tissue microarrays (TMA) for alveolar and
embryonal RMS were obtained from the Cooperative Human Tissue Network
(CHTN, Columbus, OH). The alveolar TMA contained 96 tumor sections
representing 32 individual alveolar tumors, 5 normal skeletal muscle and 5
embryonal tumors. The embryonal TMA contained 113 tumor sections comprised of
material from 32 individual embryonal tumors, 5 normal skeletal muscle and 5
alveolar tumors. TMAs were deparaffinized and rehydrated followed by heat-induced
epitope retrieval in a steamer (Black & Decker) using antigen Target Retrieval
Solution (DAKO, Glostrup, Denmark) (for TFAP2 β) or 0.01% pronase protease-
induced epitope retrieval (HMGA2). After incubation with normal serum, sections
were incubated with rabbit anti-TFAP2 β (dilution 1:400, Santa Cruz, Santa Cruz,
CA) or rabbit anti-HMGA2 (dilution 1:500, Covance, Berkeley, CA), detected using
Vectastain Elite ABC kit and color was developed with DAB (Vector Laboratories,
Burlingame, CA). TMAs were counterstained with Hematoxylin (HMGA2) or
Methyl Green (TFAP2 β). Control antibody vimentin (dilution 1:40, Ventana) was
used to determine the integrity of tumor sections and array elements that were
negative for vimentin were not scored.
Analysis of Gene Expression
All data management and analysis was conducted using the Genetrix suite of tools
for microarray analysis (Epicenter Software, Pasadena, CA,
http://www.epicentersoftware.com). Probe set modeling and data pre-processing
were derived using the Probe-Profiler algorithm (Chapter 1 data set) or the Robust
Multi-Array (RMA) algorithm (Chapters 2 & 3) implemented by ‘Analyzer’ global
microarray pre-processing module (Corimbia, Berkeley, CA) (James et al., 2004). In
Chapter 1, we performed probe set reduction, removing from the full
data set of 22,215 probe sets those with a standard deviation of less than 40
Affymetrix difference intensity units of a normalized data range which was log
transformed, yielding 12,136 probe sets (henceforth, ‘genes’). Data pre-processed
using the RMA algorithm was not subjected to any initial probe set reduction. The
complete tumor microarray data set can be found on the NCI Director’s Challenge
website (http://dc.nci.nih.gov/index.html).
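The probe set reduction step can be illustrated with a short sketch. It assumes expression values are held in a dictionary keyed by probe set identifier; the identifiers and intensities are hypothetical, and the sample standard deviation is used as an assumption about how variability was measured.

```python
from statistics import stdev


def filter_by_sd(expression, min_sd=40.0):
    """Keep only probe sets whose standard deviation across samples
    is at least min_sd intensity units."""
    return {probe: values for probe, values in expression.items()
            if stdev(values) >= min_sd}


# Hypothetical normalized intensities for two probe sets in four samples.
expression = {
    "probe_a": [100.0, 105.0, 98.0, 102.0],   # nearly constant, removed
    "probe_b": [50.0, 400.0, 120.0, 900.0],   # variable, retained
}
kept = filter_by_sd(expression)
print(sorted(kept))  # ['probe_b']
```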
Statistical Analysis
Supervised Gene Selection
Supervised analysis of genes differentially expressed between ICR-based
histological groups (alveolar, embryonal, spindle/botryoid and UDS/NRSTS) was
determined using analysis of variance (ANOVA) and controlling for false-discovery
by capping genes at p < 0.00001 where the false-discovery rate is estimated to be
0.1% (Chapter 1). Gene selection for the in vitro PAX-FKHR model system was
based on the fold change in the average expression (at least 1.5 fold change) and the
associated t-test, with a p-value cut-off of p<0.001 between RD-transduced with
PAX-FKHR and RD-empty vector (Chapter 2).
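The fold-change and t-test criteria for the in vitro model can be sketched as follows. This is an illustration under stated assumptions: expression values are taken to be positive and on a linear scale, a pooled-variance two-sample t-test is used (the exact variant is not specified above), and the p-value is obtained by numerical integration of the t density so that the sketch needs no external library. All names and values are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the density on [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)


def select_gene(transduced, control, min_fold=1.5, max_p=0.001):
    """True if the gene passes both the fold-change and the t-test cut-off."""
    n1, n2 = len(transduced), len(control)
    m1, m2 = mean(transduced), mean(control)
    fold = max(m1 / m2, m2 / m1)  # fold change in either direction
    pooled = ((n1 - 1) * variance(transduced)
              + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return fold >= min_fold and two_sided_p(t, n1 + n2 - 2) < max_p


# Hypothetical intensities for one gene in transduced and empty-vector cells.
passed = select_gene([200.0, 210.0, 190.0, 205.0, 195.0],
                     [100.0, 105.0, 95.0, 102.0, 98.0])
print(passed)
```

In practice a statistics library would supply the p-value; the integration here only keeps the sketch self-contained and is adequate for small sample sizes.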
Semi-Supervised Meta-Clustering
Semi-supervised clustering was performed using a fuzzy k-means algorithm. Since
the k-means method converges to a local minimum, with the final clusters being
dependent on the starting position of the cluster centroids, a meta-clustering
approach was used in which the clustering was repeated 1,000 times and the cluster
membership information from each run was aggregated. For each meta-clustering
run: (1) the genes used for classification were independently selected based on the
significance level for each gene in a one-way analysis of variance test of
homogeneity of the expression mean values; (2) the initial positions of the k
centroid means were randomly selected; (3) a random selection of n (“test”) samples
were separated for cross validation (leave-n-out sampling), where n was set to
approximately 10% of the sample set; (4) the k-means algorithm was applied to the
remaining (“training”) samples; (5) the membership of each of the n out-of-sample
cases was based on the closest training-set centroid; (6) a pairwise similarity matrix
was cumulated across all meta-cluster runs, based on the proportion of all runs in
which both members of the pair were present in the same test set that placed both
members in the same class; and (7) a multidimensional scaling analysis based on the
final cumulative similarity matrix was used to generate a three-dimensional plot
showing the relative proximities of each sample. Gene selection criteria for ANOVA
utilized a false discovery correction with a p value set to <0.00001 to provide a false
discovery rate of 0.1% for genes that were selected in at least 50% of the randomized
iterations for identification of molecular classes based on histological groups (i.e.,
alveolar, spindle/botryoid, embryonal and UDS/NRSTS) for the first round of
analysis presented in Chapter 1. Cluster centroids as identified in Figure 1B (Chapter
1), were then used as variables for a second round of ANOVA based k-means meta-
clustering for a further 1000 iterations. A multidimensional scaling analysis showing
the relative proximities of each sample (Appendix, Supplementary Figure 1) based
on the final cumulative similarity matrix was used to generate the expression matrix
(Chapter 1, Figure 1C). Samples and genes were optimally ordered by hierarchical
clustering using Pearson’s correlation complete-linkage metrics.
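Steps (2) through (6) of the meta-clustering procedure can be sketched as below. This is a simplified illustration, not the Genetrix implementation: it substitutes ordinary (hard) k-means for the fuzzy k-means used here, omits the per-run ANOVA gene selection of step (1) and the multidimensional scaling of step (7), and runs on hypothetical one-feature data. All function names are illustrative.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])))


def kmeans(points, k, rng, iters=20):
    """Plain k-means started from k randomly chosen points."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        centroids = [[sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def meta_cluster(samples, k, runs=1000, holdout=0.1, seed=0):
    """Pairwise similarity: the proportion of runs in which two samples were
    held out together and then assigned to the same training-set centroid."""
    rng = random.Random(seed)
    n = len(samples)
    n_test = max(2, round(holdout * n))
    together = [[0] * n for _ in range(n)]
    tested = [[0] * n for _ in range(n)]
    for _ in range(runs):
        test_idx = rng.sample(range(n), n_test)      # leave-n-out test set
        held = set(test_idx)
        train = [samples[i] for i in range(n) if i not in held]
        centroids = kmeans(train, k, rng)            # random starting centroids
        labels = {i: nearest(samples[i], centroids) for i in test_idx}
        for i in test_idx:
            for j in test_idx:
                tested[i][j] += 1
                together[i][j] += labels[i] == labels[j]
    return [[together[i][j] / tested[i][j] if tested[i][j] else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical data: two well separated groups of six samples, one feature each.
samples = [[x * 0.5] for x in range(6)] + [[20.0 + x * 0.5] for x in range(6)]
similarity = meta_cluster(samples, k=2, holdout=0.25)
print(similarity[0][1], similarity[0][6])
```

The cumulative similarity matrix is the quantity that would then be passed to multidimensional scaling or hierarchical clustering. A larger holdout fraction is used in the example only because the toy data set is very small.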
A similar meta-cluster routine was implemented for analysis of the in vitro
PAX-FKHR expression profile in order to derive the PAX-FKHR expression
signature (Chapter 2). In this case, however, a t-test was used to identify
differentially expressed genes between mARMS and mERMS molecular classes from
the in vitro derived PAX-FKHR expression profile with a p value set to <0.00008 to
provide a false discovery rate of 0.1% for genes that were selected in at least 50% of
the randomized iterations. Two-way hierarchical clustering analysis was optimally
ordered using complete-linkage distance measurements with Pearson’s correlation
distance metric on both tumor samples and PAX-FKHR expression signature genes
(Chapter 2, Figure 10).
Nearest Shrunken Centroids
The nearest shrunken centroids algorithm, a derivative of SAM (Significance
Analysis of Microarrays), developed by Tibshirani et al., and implemented within
Genetrix, was evaluated as a classification tool by calculation of the centroids for
each of the three major tumor classes (mARMS, mERMS, mNRSTS) as determined by
our semi-supervised metaclustering analysis (Tibshirani et al., 2002; Tusher et al.,
2001). Classes were based on a training set of samples and allocation of members of
a test set to the nearest centroid. The training and test sets were created through a
leave-n-out cross validation procedure (with n=16). The initial gene set for
classification was the 530 genes that were selected in the meta-cluster analysis, and
the centroids were shrunken using a delta value of δ=5.4. This analysis identified 10
genes that were used to generate centroids that predicted sample class probabilities
with a cross-validated error rate of 1.9%.
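The shrinkage and classification steps of the nearest shrunken centroids method can be sketched as follows. This is a compact illustration under stated assumptions, not the Genetrix implementation: class priors are taken as equal, the standardizing factor uses sqrt(1/n_k - 1/n), and the data, class labels and threshold are hypothetical.

```python
import math
from statistics import mean, median


def shrunken_centroids(X, y, delta):
    """X: samples x genes, y: class labels. Returns the shrunken class
    centroids, the per-gene scale and the indices of genes that survive."""
    classes = sorted(set(y))
    n, p = len(X), len(X[0])
    overall = [mean(row[g] for row in X) for g in range(p)]
    members = {k: [X[i] for i in range(n) if y[i] == k] for k in classes}
    class_mean = {k: [mean(r[g] for r in rows) for g in range(p)]
                  for k, rows in members.items()}
    # Pooled within-class standard deviation for each gene.
    s = [math.sqrt(sum((r[g] - class_mean[k][g]) ** 2
                       for k, rows in members.items() for r in rows)
                   / (n - len(classes))) for g in range(p)]
    s0 = median(s)
    scale = [sg + s0 for sg in s]
    centroids = {}
    for k in classes:
        m_k = math.sqrt(1 / len(members[k]) - 1 / n)
        row = []
        for g in range(p):
            d = (class_mean[k][g] - overall[g]) / (m_k * scale[g])
            # Soft thresholding: move d toward zero by delta.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            row.append(overall[g] + m_k * scale[g] * d_shrunk)
        centroids[k] = row
    active = [g for g in range(p)
              if any(centroids[k][g] != overall[g] for k in classes)]
    return centroids, scale, active


def classify(x, centroids, scale):
    """Assign x to the class with the nearest shrunken centroid."""
    def dist(k):
        return sum(((xg - cg) / sg) ** 2
                   for xg, cg, sg in zip(x, centroids[k], scale))
    return min(centroids, key=dist)


# Hypothetical data: gene 0 separates the classes, gene 1 does not.
X = [[10.0, 5.0], [11.0, 5.2], [9.0, 4.8], [1.0, 5.1], [2.0, 4.9], [0.0, 5.0]]
y = ["A", "A", "A", "B", "B", "B"]
centroids, scale, active = shrunken_centroids(X, y, delta=2.0)
print(active, classify([8.0, 5.0], centroids, scale))
```

Genes whose standardized class differences fall below the threshold are shrunk to the overall mean and drop out of the classifier, which is how a large initial gene list can be reduced to a small set.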
SNP Analysis
Single nucleotide polymorphism analysis for LOH determination was performed on
Affymetrix 10K Human SNP chips analyzed using the SNP analysis module of the
Genetrix Software package. Regions with LOH were determined by statistical
comparison to a pooled normal DNA SNP reference profile available on the
Affymetrix web-site (http://www.affymetrix.com). Fractional allelic loss was
determined as described previously (Visser et al., 1997).
Survival Analysis
Comparison of survival times was carried out using Kaplan-Meier survival plots and
log-rank tests of significance. Comparisons between molecular groups and tests of
association used Fisher exact or Chi-square tests where appropriate. Multivariate
tests for independence utilized a Cox regression proportional hazards model.
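The Kaplan-Meier estimate and the two-group log-rank test can be sketched as follows. This is a minimal illustration with hypothetical follow-up times; events are coded 1 for death and 0 for censored, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return (time, survival probability) at each distinct death time."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank(times1, events1, times2, events2):
    """Log-rank chi-square statistic and p-value for two groups."""
    observed = expected = var = 0.0
    all_times = times1 + times2
    all_events = events1 + events2
    for t in sorted({t for t, e in zip(all_times, all_events) if e}):
        n1 = sum(1 for x in times1 if x >= t)
        n2 = sum(1 for x in times2 if x >= t)
        d1 = sum(1 for x, e in zip(times1, events1) if x == t and e)
        d2 = sum(1 for x, e in zip(times2, events2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        observed += d1
        expected += d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    chi2 = (observed - expected) ** 2 / var
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p


# Hypothetical follow-up times (years) and event indicators.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
chi2, p = logrank([1, 2], [1, 1], [3, 4], [1, 1])
print(curve, round(chi2, 3))
```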
Metagene Predictor Scores
A Cox proportional-hazards model was employed to determine which genes from the
PAX-FKHR expression signature (i.e., 102 probesets/81 genes) should be included
in the gene-expression based outcome predictor model for mARMS patients (Chapter
2) or for all RMS patients (Chapter 3). Multivariate Cox modeling was fitted with
coefficients for each gene that best correlated with censored overall survival data.
Positive coefficients were assigned to genes whose high expression was correlated to
a low likelihood of survival, whereas negative coefficients were assigned to genes
whose high expression was correlated to a high likelihood of survival. The metagene
score for each patient was calculated as a weighted sum of the gene expression value
with the weights being the signed square root of the Cox chi-square test statistic. The
multivariate model was developed first by identifying the best single gene predictors
of outcome (p<0.05 for overall survival, OAS) by sample cross-validation. Leave-n-
out cross-validation was utilized by randomly excluding 50% of tumors for which
survival data was available (i.e., n=25 for mARMS metagene and n=60 for RMS
metagene) for each iteration of the model, running the Cox regression proportional-
hazards model on this ‘training’ set. The remaining samples in each iteration, the
‘test’ set, were then used to evaluate the results from running the model on the
‘training’ set. This process was reiterated to generate 2500 cross-validated models
(i.e., 5000 iterations) to calculate the number of times each gene was used in a cross-
validated model and genes were ranked by the number of models in which they were
selected.
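The metagene score itself can be illustrated with a short sketch. It assumes the Cox coefficient and chi-square statistic for each gene have already been estimated elsewhere; the gene names and numbers are hypothetical.

```python
import math


def metagene_score(expression, cox_stats):
    """Weighted sum of expression values, where each weight is the square
    root of the gene's Cox chi-square statistic carrying the sign of its
    coefficient (positive = high expression linked to poor survival)."""
    score = 0.0
    for gene, (coefficient, chi2) in cox_stats.items():
        weight = math.copysign(math.sqrt(chi2), coefficient)
        score += weight * expression[gene]
    return score


# Hypothetical per-gene Cox results: (coefficient, chi-square statistic).
cox_stats = {"GENE_A": (0.8, 9.0), "GENE_B": (-0.5, 4.0)}
patient = {"GENE_A": 2.0, "GENE_B": 1.5}
score = metagene_score(patient, cox_stats)
print(score)  # 3.0
```

A higher score therefore indicates a predicted lower likelihood of survival.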
Metagenes, or multi-gene (i.e., multivariate) models were built in a step-wise
procedure from the ranked list of the best single gene predictors. Cox regression χ²
test statistics were determined for each multivariate model and showed that a 33
probe set and 34 probe set (i.e., 33-metagene and 34-metagene) models had the
highest χ² test statistic for mARMS and all RMS patient metagenes, respectively
(Figures 20A and B). Next, the data set was permuted by sample shuffling and the
cross-validated Cox regression modeling was repeated, generating metagenes from
the ranked list of the best single gene predictors in the permuted data set. The Cox
regression χ² statistics from the metagenes generated on the permuted data set
indicate that these results are not likely due to chance alone. Multivariate analyses to
test for independence of the metagene predictor scores were conducted by adjusting
for covariates in Cox proportional-hazards models.
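The step-wise model building and the permutation control can be sketched as follows. This is only an illustrative outline under stated assumptions, not the original implementation: the `statistic` argument is a hypothetical stand-in for fitting a Cox model to the metagene built from the given genes and returning its χ² value, and "sample shuffling" is assumed here to mean shuffling the survival records (time and event status kept together) relative to the expression profiles.

```python
import random


def stepwise_curve(ranked_genes, statistic):
    """Evaluate nested models built from the top-k ranked genes.

    Returns a list of (k, statistic(top-k genes)) for k = 1..len(ranked_genes).
    `statistic` is a caller-supplied function (hypothetical stand-in for the
    Cox regression chi-square of the k-gene metagene).
    """
    return [
        (k, statistic(ranked_genes[:k]))
        for k in range(1, len(ranked_genes) + 1)
    ]


def best_model(curve):
    """Return the (k, value) pair with the highest statistic."""
    return max(curve, key=lambda pair: pair[1])


def shuffle_samples(times, events, seed=0):
    """Permute survival records across samples, keeping (time, event) paired."""
    rng = random.Random(seed)
    order = list(range(len(times)))
    rng.shuffle(order)
    return [times[i] for i in order], [events[i] for i in order]


def peak_exceeds_permuted(observed_curve, permuted_curve):
    """True if the observed peak statistic is above the permuted peak."""
    return best_model(observed_curve)[1] > best_model(permuted_curve)[1]
```

In use, the ranking and the curve would be recomputed from scratch on the shuffled survival data, so that the permuted curve reflects the whole selection procedure rather than only the final fit.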
Figure 20 Evaluation of the best metagene model. Metagenes were generated by building
in a step-wise procedure from the ranked list of the best single gene predictors as
determined by reiterative cross-validation. A, B: χ² values plotted for each metagene
generated (blue curve) were compared to those generated from a permuted data set (red
curve) for the PAX-FKHR expression signature ARMS metagene (A) or the RMS
metagenes (B). Purple arrows indicate the peak χ² statistics of 33-metagene for
PAX-FKHR ARMS metagene and 34-metagene for the RMS metagene. Generation of
metagenes on a permuted data set indicates that the Cox Regression model χ² values are
not likely attributed to chance alone (red curves).
Functional Annotation of Gene Clusters
Functional annotation was performed using the Expression Analysis Systematic
Explorer (EASE, http://david.niaid.nih.gov/david/ease.htm) software package for
overrepresentation analysis of functional gene categories and for multi-database
annotation of the differentially expressed genes (Hosack et al., 2003). Pathway
analysis was performed with the on-line Ingenuity Pathways analysis tool (version
3.0, http://www.ingenuity.com) (Zeng and Schultz, 2005). Literature data-mining
was performed with the on-line PubMatrix tool (http://pubmatrix.grc.nia.nih.gov)
(Becker et al., 2003).
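As an illustration of what overrepresentation analysis of a functional category involves, the following Python sketch computes a one-sided hypergeometric tail probability. It is a generic stand-in and is not claimed to reproduce the score reported by EASE; the function name and arguments are hypothetical.

```python
from math import comb


def overrepresentation_p(list_hits, list_size, pop_hits, pop_size):
    """Upper-tail hypergeometric probability P(X >= list_hits).

    list_hits: genes in the gene list that belong to the category
    list_size: total genes in the gene list
    pop_hits:  genes in the background population that belong to the category
    pop_size:  total genes in the background population
    """
    max_hits = min(list_size, pop_hits)
    total = comb(pop_size, list_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total
```

A small probability suggests that the category occurs in the gene list more often than expected from its frequency in the background population. When many categories are tested, the resulting values would normally be corrected for multiple comparisons.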
Bibliography
Alizadeh, A. A., Eisen, M. B., Davis, R. E., Ma, C., Lossos, I. S., Rosenwald, A.,
Boldrick, J. C., Sabet, H., Tran, T., Yu, X., et al. (2000). Distinct types of diffuse
large B-cell lymphoma identified by gene expression profiling. Nature 403, 503-511.
Allory, Y., Matsuoka, Y., Bazille, C., Christensen, E. I., Ronco, P., and Debiec, H.
(2005). The L1 cell adhesion molecule is induced in renal cancer cells and correlates
with metastasis in clear cell carcinomas. Clin Cancer Res 11, 1190-1197.
Anderson, J., Gordon, A., McManus, A., Shipley, J., and Pritchard-Jones, K.
(1999a). Disruption of imprinted genes at chromosome region 11p15.5 in paediatric
rhabdomyosarcoma. Neoplasia 1, 340-348.
Anderson, J., Gordon, A., Pritchard-Jones, K., and Shipley, J. (1999b). Genes,
chromosomes, and rhabdomyosarcoma. Genes Chromosomes Cancer 26, 275-285.
Anderson, J., Gordon, T., McManus, A., Mapp, T., Gould, S., Kelsey, A.,
McDowell, H., Pinkerton, R., Shipley, J., and Pritchard-Jones, K. (2001a). Detection
of the PAX3-FKHR fusion gene in paediatric rhabdomyosarcoma: a reproducible
predictor of outcome? Br J Cancer 85, 831-835.
Anderson, J., Ramsay, A., Gould, S., and Pritchard-Jones, K. (2001b). PAX3-FKHR
induces morphological change and enhances cellular proliferation and invasion in
rhabdomyosarcoma. Am J Pathol 159, 1089-1096.
Anderson, M. J., Shelton, G. D., Cavenee, W. K., and Arden, K. C. (2001c).
Embryonic expression of the tumor-associated PAX3-FKHR fusion protein
interferes with the developmental functions of Pax3. Proc Natl Acad Sci U S A 98,
1589-1594.
Arndt, C. A., Donaldson, S. S., Anderson, J. R., Andrassy, R. J., Laurie, F., Link, M.
P., Raney, R. B., Maurer, H. M., and Crist, W. M. (2001). What constitutes optimal
therapy for patients with rhabdomyosarcoma of the female genital tract? Cancer 91,
2454-2468.
Asmar, L., Gehan, E. A., Newton, W. A., Webber, B. L., Marsden, H. B., van Unnik,
A. J., Hamoudi, A. B., Shimada, H., Tsokos, M., and Harms, D. (1994). Agreement
among and within groups of pathologists in the classification of rhabdomyosarcoma
and related childhood sarcomas. Report of an international study of four pathology
classifications. Cancer 74, 2579-2588.
Atra, A., and Pinkerton, R. (2002). High-dose chemotherapy in soft tissue sarcoma in
children. Crit Rev Oncol Hematol 41, 191-196.
Bair, E., and Tibshirani, R. (2004). Semi-supervised methods to predict patient
survival from gene expression data. PLoS Biol 2, E108.
Baird, K., Davis, S., Antonescu, C. R., Harper, U. L., Walker, R. L., Chen, Y.,
Glatfelter, A. A., Duray, P. H., and Meltzer, P. S. (2005). Gene expression profiling
of human sarcomas: insights into sarcoma biology. Cancer Res 65, 9226-9235.
Barber, T. D., Barber, M. C., Tomescu, O., Barr, F. G., Ruben, S., and Friedman, T.
B. (2002). Identification of target genes regulated by PAX3 and PAX3-FKHR in
embryogenesis and alveolar rhabdomyosarcoma. Genomics 79, 278-284.
Barr, F. G. (1997). Fusions involving paired box and fork head family transcription
factors in the pediatric cancer alveolar rhabdomyosarcoma. Curr Top Microbiol
Immunol 220, 113-129.
Barr, F. G. (2001). Gene fusions involving PAX and FOX family members in
alveolar rhabdomyosarcoma. Oncogene 20, 5736-5746.
Barr, F. G., Fitzgerald, J. C., Ginsberg, J. P., Vanella, M. L., Davis, R. J., and
Bennicelli, J. L. (1999). Predominant expression of alternative PAX3 and PAX7
forms in myogenic and neural tumor cell lines. Cancer Res 59, 5443-5448.
Barr, F. G., Nauta, L. E., Davis, R. J., Schafer, B. W., Nycum, L. M., and Biegel, J.
A. (1996). In vivo amplification of the PAX3-FKHR and PAX7-FKHR fusion genes
in alveolar rhabdomyosarcoma. Hum Mol Genet 5, 15-21.
Barr, F. G., Qualman, S. J., Macris, M. H., Melnyk, N., Lawlor, E. R., Strzelecki, D.
M., Triche, T. J., Bridge, J. A., and Sorensen, P. H. (2002). Genetic heterogeneity in
the alveolar rhabdomyosarcoma subset without typical gene fusions. Cancer Res 62,
4704-4710.
Becker, K. G., Hosack, D. A., Dennis, G., Jr., Lempicki, R. A., Bright, T. J.,
Cheadle, C., and Engel, J. (2003). PubMatrix: a tool for multiplex literature mining.
BMC Bioinformatics 4, 61.
Beer, D. G., Kardia, S. L., Huang, C. C., Giordano, T. J., Levin, A. M., Misek, D. E.,
Lin, L., Chen, G., Gharib, T. G., Thomas, D. G., et al. (2002). Gene-expression
profiles predict survival of patients with lung adenocarcinoma. Nat Med 8, 816-824.
Begum, S., Emani, N., Cheung, A., Wilkins, O., Der, S., and Hamel, P. A. (2005).
Cell-type-specific regulation of distinct sets of gene targets by Pax3 and
Pax3/FKHR. Oncogene.
Bennicelli, J. L., Fredericks, W. J., Wilson, R. B., Rauscher, F. J., 3rd, and Barr, F.
G. (1995). Wild type PAX3 protein and the PAX3-FKHR fusion protein of alveolar
rhabdomyosarcoma contain potent, structurally distinct transcriptional activation
domains. Oncogene 11, 119-130.
Bergstrom, D. A., Penn, B. H., Strand, A., Perry, R. L., Rudnicki, M. A., and
Tapscott, S. J. (2002). Promoter-specific regulation of MyoD binding and signal
transduction cooperate to pattern gene expression. Mol Cell 9, 587-600.
Berkes, C. A., and Tapscott, S. J. (2005). MyoD and the transcriptional control of
myogenesis. Semin Cell Dev Biol 16, 585-595.
Bernasconi, M., Remppis, A., Fredericks, W. J., Rauscher, F. J., 3rd, and Schafer, B.
W. (1996). Induction of apoptosis in rhabdomyosarcoma cells through down-
regulation of PAX proteins. Proc Natl Acad Sci U S A 93, 13164-13169.
Besnard-Guerin, C., Newsham, I., Winqvist, R., and Cavenee, W. K. (1996). A
common region of loss of heterozygosity in Wilms' tumor and embryonal
rhabdomyosarcoma distal to the D11S988 locus on chromosome 11p15.5. Hum
Genet 97, 163-170.
Bhattacharjee, A., Richards, W. G., Staunton, J., Li, C., Monti, S., Vasa, P., Ladd,
C., Beheshti, J., Bueno, R., Gillette, M., et al. (2001). Classification of human lung
carcinomas by mRNA expression profiling reveals distinct adenocarcinoma
subclasses. Proc Natl Acad Sci U S A 98, 13790-13795.
Birch, J. M., Hartley, A. L., Blair, V., Kelsey, A. M., Harris, M., Teare, M. D., and
Jones, P. H. (1990). Cancer in the families of children with soft tissue sarcoma.
Cancer 66, 2239-2248.
Bjerkvig, R., Tysnes, B. B., Aboody, K. S., Najbauer, J., and Terzis, A. J. (2005).
Opinion: the origin of the cancer stem cell: current controversies and new insights.
Nat Rev Cancer 5, 899-904.
Blais, A., Tsikitis, M., Acosta-Alvear, D., Sharan, R., Kluger, Y., and Dynlacht, B.
D. (2005). An initial blueprint for myogenic differentiation. Genes Dev 19, 553-569.
Blake, J., and Ziman, M. R. (2003). Aberrant PAX3 and PAX7 expression. A link to
the metastatic potential of embryonal rhabdomyosarcoma and cutaneous malignant
melanoma? Histol Histopathol 18, 529-539.
Bortoluzzi, S., Bisognin, A., Romualdi, C., and Danieli, G. A. (2005). Novel genes,
possibly relevant for molecular diagnosis or therapy of human rhabdomyosarcoma,
detected by genomic expression profiling. Gene 348, 65-71.
Breitfeld, P. P., and Meyer, W. H. (2005). Rhabdomyosarcoma: new windows of
opportunity. Oncologist 10, 518-527.
Breneman, J. C., Lyden, E., Pappo, A. S., Link, M. P., Anderson, J. R., Parham, D.
M., Qualman, S. J., Wharam, M. D., Donaldson, S. S., Maurer, H. M., et al. (2003).
Prognostic factors and clinical outcomes in children and adolescents with metastatic
rhabdomyosarcoma--a report from the Intergroup Rhabdomyosarcoma Study IV. J
Clin Oncol 21, 78-84.
Breneman, J. C., and Wiener, E. S. (2000). Issues in the local control of
rhabdomyosarcoma. Med Pediatr Oncol 35, 104-109.
Bridge, J. A., Liu, J., Qualman, S. J., Suijkerbuijk, R., Wenger, G., Zhang, J., Wan,
X., Baker, K. S., Sorensen, P., and Barr, F. G. (2002). Genomic gains and losses are
similar in genetic and histologic subsets of rhabdomyosarcoma, whereas
amplification predominates in embryonal with anaplasia and alveolar subtypes.
Genes Chromosomes Cancer 33, 310-321.
Butt, A. J., and Williams, A. C. (2001). IGFBP-3 and apoptosis--a license to kill?
Apoptosis 6, 199-205.
Caillaud, J. M., Gerard-Marchant, R., Marsden, H. B., van Unnik, A. J., Rodary, C.,
Rey, A., and Flamant, F. (1989). Histopathological classification of childhood
rhabdomyosarcoma: a report from the International Society of Pediatric Oncology
pathology panel. Med Pediatr Oncol 17, 391-400.
Capp, J. P. (2005). Stochastic gene expression, disruption of tissue averaging effects
and cancer as a disease of development. Bioessays 27, 1277-1285.
Carter, R. L., Jameson, C. F., Philp, E. R., and Pinkerton, C. R. (1990). Comparative
phenotypes in rhabdomyosarcomas and developing skeletal muscle. Histopathology
17, 301-309.
Cassady, J. R. (1995). How much is enough? The continuing evolution of therapy in
childhood rhabdomyosarcoma and its refinement. Int J Radiat Oncol Biol Phys 31,
675-676; discussion 681.
Cavazzana, A. O., Schmidt, D., Ninfo, V., Harms, D., Tollot, M., Carli, M., Treuner,
J., Betto, R., and Salviati, G. (1992). Spindle cell rhabdomyosarcoma. A
prognostically favorable variant of rhabdomyosarcoma. Am J Surg Pathol 16, 229-
235.
Cessna, M. H., Zhou, H., Perkins, S. L., Tripp, S. R., Layfield, L., Daines, C., and
Coffin, C. M. (2001). Are myogenin and myoD1 expression specific for
rhabdomyosarcoma? A study of 150 cases, with emphasis on spindle cell mimics.
Am J Surg Pathol 25, 1150-1157.
Chalepakis, G., Fritsch, R., Fickenscher, H., Deutsch, U., Goulding, M., and Gruss,
P. (1991). The molecular basis of the undulated/Pax-1 mutation. Cell 66, 873-884.
Chen, P., Hutter, D., Yang, X., Gorospe, M., Davis, R. J., and Liu, Y. (2001).
Discordance between the binding affinity of mitogen-activated protein kinase
subfamily members for MAP kinase phosphatase-2 and their ability to activate the
phosphatase catalytically. J Biol Chem 276, 29440-29449.
Chen, Y. C., and Hunter, D. J. (2005). Molecular epidemiology of cancer. CA
Cancer J Clin 55, 45-54; quiz 57.
Cheng, J., Kapranov, P., Drenkow, J., Dike, S., Brubaker, S., Patel, S., Long, J.,
Stern, D., Tammana, H., Helt, G., et al. (2005). Transcriptional maps of 10 human
chromosomes at 5-nucleotide resolution. Science 308, 1149-1154.
Chiles, M. C., Parham, D. M., Qualman, S. J., Teot, L. A., Bridge, J. A., Ullrich, F.,
Barr, F. G., and Meyer, W. H. (2004). Sclerosing rhabdomyosarcomas in children
and adolescents: a clinicopathologic review of 13 cases from the Intergroup
Rhabdomyosarcoma Study Group and Children's Oncology Group. Pediatr Dev
Pathol 7, 583-594.
Cohen, M. M., Jr. (2005). Beckwith-Wiedemann syndrome: historical,
clinicopathological, and etiopathogenetic perspectives. Pediatr Dev Pathol 8, 287-
304.
Cox, D. R. (1972). Regression models and life tables (with discussion). Journal of
the Royal Statistical Society, Series B 34, 187-220.
Crist, W., Gehan, E. A., Ragab, A. H., Dickman, P. S., Donaldson, S. S., Fryer, C.,
Hammond, D., Hays, D. M., Herrmann, J., Heyn, R., et al. (1995). The Third
Intergroup Rhabdomyosarcoma Study. J Clin Oncol 13, 610-630.
Crist, W. M., Anderson, J. R., Meza, J. L., Fryer, C., Raney, R. B., Ruymann, F. B.,
Breneman, J., Qualman, S. J., Wiener, E., Wharam, M., et al. (2001). Intergroup
rhabdomyosarcoma study-IV: results for patients with nonmetastatic disease. J Clin
Oncol 19, 3091-3102.
Crist, W. M., Garnsey, L., Beltangady, M. S., Gehan, E., Ruymann, F., Webber, B.,
Hays, D. M., Wharam, M., and Maurer, H. M. (1990). Prognosis in children with
rhabdomyosarcoma: a report of the intergroup rhabdomyosarcoma studies I and II.
Intergroup Rhabdomyosarcoma Committee. J Clin Oncol 8, 443-452.
da Costa, L. F. (2001). Return of de-differentiation: why cancer is a developmental
disease. Curr Opin Oncol 13, 58-62.
Danna, E. A., and Nolan, G. P. (2006). Transcending the biomarker mindset:
deciphering disease mechanisms at the single cell level. Curr Opin Chem Biol.
Dave, S. S., Wright, G., Tan, B., Rosenwald, A., Gascoyne, R. D., Chan, W. C.,
Fisher, R. I., Braziel, R. M., Rimsza, L. M., Grogan, T. M., et al. (2004). Prediction
of survival in follicular lymphoma based on molecular features of tumor-infiltrating
immune cells. N Engl J Med 351, 2159-2169.
De Pitta, C., Tombolan, L., Albiero, G., Sartori, F., Romualdi, C., Jurman, G., Carli,
M., Furlanello, C., Lanfranchi, G., and Rosolen, A. (2005). Gene expression
profiling identifies potential relevant genes in alveolar rhabdomyosarcoma
pathogenesis and discriminates PAX3-FKHR positive and negative tumors. Int J
Cancer.
Dean, M. (1998). Cancer as a complex developmental disorder--nineteenth Cornelius
P. Rhoads Memorial Award Lecture. Cancer Res 58, 5633-5636.
del Peso, L., Gonzalez, V. M., Hernandez, R., Barr, F. G., and Nunez, G. (1999).
Regulation of the forkhead transcription factor FKHR, but not the PAX3-FKHR
fusion protein, by the serine/threonine kinase Akt. Oncogene 18, 7328-7333.
Donaldson, S. S., and Anderson, J. R. (2005). Rhabdomyosarcoma: many
similarities, a few philosophical differences. J Clin Oncol 23, 2586-2587.
Douglass, E. C., Rowe, S. T., Valentine, M., Parham, D. M., Berkow, R., Bowman,
W. P., and Maurer, H. M. (1991). Variant translocations of chromosome 13 in
alveolar rhabdomyosarcoma. Genes Chromosomes Cancer 3, 480-482.
Du, S., Lawrence, E. J., Strzelecki, D., Rajput, P., Xia, S. J., Gottesman, D. M., and
Barr, F. G. (2005). Co-expression of alternatively spliced forms of PAX3, PAX7,
PAX3-FKHR and PAX7-FKHR with distinct DNA binding and transactivation
properties in rhabdomyosarcoma. Int J Cancer 115, 85-92.
Ebert, B. L., and Golub, T. R. (2004). Genomic approaches to hematologic
malignancies. Blood 104, 923-932.
Enterline, H. T., and Horn, R. C., Jr. (1958). Alveolar rhabdomyosarcoma; a
distinctive tumor type. Am J Clin Pathol 29, 356-366.
Epstein, J. A., Lam, P., Jepeal, L., Maas, R. L., and Shapiro, D. N. (1995). Pax3
inhibits myogenic differentiation of cultured myoblast cells. J Biol Chem 270,
11719-11722.
Fredericks, W. J., Galili, N., Mukhopadhyay, S., Rovera, G., Bennicelli, J., Barr, F.
G., and Rauscher, F. J., 3rd (1995). The PAX3-FKHR fusion protein created by the
t(2;13) translocation in alveolar rhabdomyosarcomas is a more potent transcriptional
activator than PAX3. Mol Cell Biol 15, 1522-1535.
Friedman, D. L., Kadan-Lottick, N. S., Whitton, J., Mertens, A. C., Yasui, Y., Liu,
Y., Meadows, A. T., Robison, L. L., and Strong, L. C. (2005). Increased risk of
cancer among siblings of long-term childhood cancer survivors: a report from the
childhood cancer survivor study. Cancer Epidemiol Biomarkers Prev 14, 1922-1927.
Galili, N., Davis, R. J., Fredericks, W. J., Mukhopadhyay, S., Rauscher, F. J., 3rd,
Emanuel, B. S., Rovera, G., and Barr, F. G. (1993). Fusion of a fork head domain
gene to PAX3 in the solid tumour alveolar rhabdomyosarcoma. Nat Genet 5, 230-
235.
Gavert, N., Conacci-Sorrell, M., Gast, D., Schneider, A., Altevogt, P., Brabletz, T.,
and Ben-Ze'ev, A. (2005). L1, a novel target of beta-catenin signaling, transforms
cells and is expressed at the invasive front of colon cancers. J Cell Biol 168, 633-
642.
Gordon, T., McManus, A., Anderson, J., Min, T., Swansbury, J., Pritchard-Jones, K.,
and Shipley, J. (2001). Cytogenetic abnormalities in 42 rhabdomyosarcoma: a United
Kingdom Cancer Cytogenetics Group Study. Med Pediatr Oncol 36, 259-267.
Greer, B. T., and Khan, J. (2004). Diagnostic classification of cancer using DNA
microarrays and artificial intelligence. Ann N Y Acad Sci 1020, 49-66.
Han, J. Y., Cannon, P. M., Lai, K. M., Zhao, Y., Eiden, M. V., and Anderson, W. F.
(1997). Identification of envelope protein residues required for the expanded host
range of 10A1 murine leukemia virus. J Virol 71, 8103-8108.
Harris, H. (2005). A long view of fashions in cancer research. Bioessays 27, 833-
838.
Hart, K. C., Robertson, S. C., Kanemitsu, M. Y., Meyer, A. N., Tynan, J. A., and
Donoghue, D. J. (2000). Transformation and Stat activation by derivatives of
FGFR1, FGFR3, and FGFR4. Oncogene 19, 3309-3320.
Hartley, A. L., Birch, J. M., Blair, V., Kelsey, A. M., Harris, M., and Jones, P. H.
(1993). Patterns of cancer in the families of children with soft tissue sarcoma. Cancer
72, 923-930.
Henderson, S. R., Guiliano, D., Presneau, N., McLean, S., Frow, R., Vujovic, S.,
Anderson, J., Sebire, N., Whelan, J., Athanasou, N., et al. (2005). A molecular map
of mesenchymal tumors. Genome Biol 6, R76.
Herrera-Gayol, A., Royal, A., and Babai, F. (1995). Correlation between cell
differentiation stage, types of invasion, and hematogenous metastasis in experimental
rhabdomyosarcomas. Exp Mol Pathol 63, 1-15.
Hoque, M. O., Lee, C. C., Cairns, P., Schoenberg, M., and Sidransky, D. (2003).
Genome-wide genetic characterization of bladder cancer: a comparison of high-
density single-nucleotide polymorphism arrays and PCR-based microsatellite
analysis. Cancer Res 63, 2216-2222.
Horn, R. C., Jr., and Enterline, H. T. (1958). Rhabdomyosarcoma: a
clinicopathological study and classification of 39 cases. Cancer 11, 181-199.
Horowitz, M. E., Etcubanas, E., Christensen, M. L., Houghton, J. A., George, S. L.,
Green, A. A., and Houghton, P. J. (1988). Phase II testing of melphalan in children
with newly diagnosed rhabdomyosarcoma: a model for anticancer drug development.
J Clin Oncol 6, 308-314.
Hosack, D. A., Dennis, G., Jr., Sherman, B. T., Lane, H. C., and Lempicki, R. A.
(2003). Identifying biological themes within lists of genes with EASE. Genome Biol
4, R70.
Hu-Lieskovan, S., Zhang, J., Wu, L., Shimada, H., Schofield, D. E., and Triche, T. J.
(2005). EWS-FLI1 fusion protein up-regulates critical genes in neural crest
development and is responsible for the observed phenotype of Ewing's family of
tumors. Cancer Res 65, 4633-4644.
Hunter, K. W. (2004). Host genetics and tumour metastasis. Br J Cancer 90, 752-
755.
James, A. C., Veitch, J. G., Zareh, A. R., and Triche, T. (2004). Sensitivity and
specificity of five abundance estimators for high-density oligonucleotide
microarrays. Bioinformatics 20, 1060-1065.
Katz, M. H. (2003). Multivariable analysis: a primer for readers of medical research.
Ann Intern Med 138, 644-650.
Keller, C., and Capecchi, M. R. (2005). New genetic tactics to model alveolar
rhabdomyosarcoma in the mouse. Cancer Res 65, 7530-7532.
Keller, C., Hansen, M. S., Coffin, C. M., and Capecchi, M. R. (2004). Pax3:Fkhr
interferes with embryonic Pax3 and Pax7 function: implications for alveolar
rhabdomyosarcoma cell of origin. Genes Dev 18, 2608-2613.
Kelly, K. M., Womer, R. B., Sorensen, P. H., Xiong, Q. B., and Barr, F. G. (1997).
Common and variant gene fusions predict distinct clinical phenotypes in
rhabdomyosarcoma. J Clin Oncol 15, 1831-1836.
Khan, J., Bittner, M. L., Saal, L. H., Teichmann, U., Azorsa, D. O., Gooden, G. C.,
Pavan, W. J., Trent, J. M., and Meltzer, P. S. (1999). cDNA microarrays detect
activation of a myogenic transcription program by the PAX3-FKHR fusion
oncogene. Proc Natl Acad Sci U S A 96, 13264-13269.
Khan, J., Simon, R., Bittner, M., Chen, Y., Leighton, S. B., Pohida, T., Smith, P. D.,
Jiang, Y., Gooden, G. C., Trent, J. M., and Meltzer, P. S. (1998). Gene expression
profiling of alveolar rhabdomyosarcoma with cDNA microarrays. Cancer Res 58,
5009-5013.
Khan, J., Wei, J. S., Ringner, M., Saal, L. H., Ladanyi, M., Westermann, F.,
Berthold, F., Schwab, M., Antonescu, C. R., Peterson, C., and Meltzer, P. S. (2001).
Classification and diagnostic prediction of cancers using gene expression profiling
and artificial neural networks. Nat Med 7, 673-679.
Kilpatrick, S. E., Teot, L. A., Geisinger, K. R., Martin, P. L., Shumate, D. K.,
Zbieranski, N., Russell, G. B., and Fletcher, C. D. (1994). Relationship of DNA
ploidy to histology and prognosis in rhabdomyosarcoma. Comparison of flow
cytometry and image analysis. Cancer 74, 3227-3233.
Kirkwood, B. R., and Sterne, J. A. C. (2003). Essential medical
statistics, 2nd edn (Malden, Mass.: Blackwell Science).
Klunder, J. W., Komdeur, R., Van Der Graaf, W. T., De Bont, E. J., Hoekstra, H. J.,
Van Den Berg, E., and Molenaar, W. M. (2003). Expression of multidrug resistance-
associated proteins in rhabdomyosarcomas before and after chemotherapy: the
relationship between lung resistance-related protein (LRP) and differentiation. Hum
Pathol 34, 150-155.
Lawrence, W., Jr., Anderson, J. R., Gehan, E. A., and Maurer, H. (1997).
Pretreatment TNM staging of childhood rhabdomyosarcoma: a report of the
Intergroup Rhabdomyosarcoma Study Group. Children's Cancer Study Group.
Pediatric Oncology Group. Cancer 80, 1165-1170.
Lee, W., Han, K., Harris, C. P., and Meisner, L. F. (1993). Detection of aneuploidy
and possible deletion in paraffin-embedded rhabdomyosarcoma cells with FISH.
Cancer Genet Cytogenet 68, 99-103.
Leuschner, I., Heuer, T., and Harms, D. (2002). Induction of drug resistance in
human rhabdomyosarcoma cell lines is associated with increased maturation:
possible explanation for differentiation in recurrences? Pediatr Dev Pathol 5, 276-
282.
Li, F. P., and Fraumeni, J. F., Jr. (1969). Rhabdomyosarcoma in children:
epidemiologic study and identification of a familial cancer syndrome. J Natl Cancer
Inst 43, 1365-1373.
Li, H., and Gui, J. (2004). Partial Cox regression analysis for high-dimensional
microarray gene expression data. Bioinformatics 20 Suppl 1, I208-I215.
Li, L., Zhou, J., James, G., Heller-Harrison, R., Czech, M. P., and Olson, E. N.
(1992). FGF inactivates myogenic helix-loop-helix proteins through phosphorylation
of a conserved protein kinase C site in their DNA-binding domains. Cell 71, 1181-
1194.
Libera, D. D., Falconieri, G., and Zanella, M. (1999). Embryonal "Botryoid"
rhabdomyosarcoma of the larynx: a clinicopathologic and immunohistochemical
study of two cases. Ann Diagn Pathol 3, 341-349.
Linardic, C. M., Downie, D. L., Qualman, S., Bentley, R. C., and Counter, C. M.
(2005). Genetic modeling of human rhabdomyosarcoma. Cancer Res 65, 4490-4495.
Liu, E. T. (2003). Classification of cancers by expression profiling. Curr Opin Genet
Dev 13, 97-103.
Lu, Y. J., Williamson, D., Clark, J., Wang, R., Tiffin, N., Skelton, L., Gordon, T.,
Williams, R., Allan, B., Jackman, A., et al. (2001). Comparative expressed sequence
hybridization to chromosomes for tumor classification and identification of genomic
regions of differential gene expression. Proc Natl Acad Sci U S A 98, 9197-9202.
Mano, H. (2004). Stratification of acute myeloid leukemia based on gene expression
profiles. Int J Hematol 80, 389-394.
Mansouri, A. (1998). The role of Pax3 and Pax7 in development and cancer. Crit
Rev Oncog 9, 141-149.
Matsui, I., Tanimura, M., Kobayashi, N., Sawada, T., Nagahara, N., and Akatsuka, J.
(1993). Neurofibromatosis type 1 and childhood cancer. Cancer 72, 2746-2754.
Maurer, H. M., Beltangady, M., Gehan, E. A., Crist, W., Hammond, D., Hays, D. M.,
Heyn, R., Lawrence, W., Newton, W., Ortega, J., et al. (1988). The Intergroup
Rhabdomyosarcoma Study-I. A final report. Cancer 61, 209-220.
McDowell, H. P. (2003). Update on childhood rhabdomyosarcoma. Arch Dis Child
88, 354-357.
McKinsey, T. A., Zhang, C. L., and Olson, E. N. (2001). Control of muscle
development by dueling HATs and HDACs. Curr Opin Genet Dev 11, 497-504.
Mei, R., Galipeau, P. C., Prass, C., Berno, A., Ghandour, G., Patil, N., Wolff, R. K.,
Chee, M. S., Reid, B. J., and Lockhart, D. J. (2000). Genome-wide detection of
allelic imbalance using human SNPs and high-density DNA arrays. Genome Res 10,
1126-1137.
Merlino, G., and Helman, L. J. (1999). Rhabdomyosarcoma--working out the
pathways. Oncogene 18, 5340-5348.
Meyer, W. H., and Spunt, S. L. (2004). Soft tissue sarcomas of childhood. Cancer
Treat Rev 30, 269-280.
Molenaar, W. M., Oosterhuis, J. W., Oosterhuis, A. M., and Ramaekers, F. C.
(1985). Mesenchymal and muscle-specific intermediate filaments (vimentin and
desmin) in relation to differentiation in childhood rhabdomyosarcomas. Hum Pathol
16, 838-843.
Nagle, J. A., Ma, Z., Byrne, M. A., White, M. F., and Shaw, L. M. (2004).
Involvement of insulin receptor substrate 2 in mammary tumor metastasis. Mol Cell
Biol 24, 9726-9735.
Newton, W. A., Jr., Gehan, E. A., Webber, B. L., Marsden, H. B., van Unnik, A. J.,
Hamoudi, A. B., Tsokos, M. G., Shimada, H., Harms, D., and Schmidt, D. (1995).
Classification of rhabdomyosarcomas and related sarcomas. Pathologic aspects and
proposal for a new classification--an Intergroup Rhabdomyosarcoma Study. Cancer
76, 1073-1085.
Newton, W. A., Jr., Webber, B., Hamoudi, A. B., Gehan, E. A., and Maurer, H. M.
(1999). Early history of pathology studies by the Intergroup Rhabdomyosarcoma
Study Group. Pediatr Dev Pathol 2, 275-285.
Ng'andu, N. H. (1997). An empirical comparison of statistical tests for assessing the
proportional hazards assumption of Cox's model. Stat Med 16, 611-626.
Nielsen, T. O., West, R. B., Linn, S. C., Alter, O., Knowling, M. A., O'Connell, J.
X., Zhu, S., Fero, M., Sherlock, G., Pollack, J. R., et al. (2002). Molecular
characterisation of soft tissue tumours: a gene expression study. Lancet 359, 1301-
1307.
Page, K. A., Landau, N. R., and Littman, D. R. (1990). Construction and use of a
human immunodeficiency virus vector for analysis of virus infectivity. J Virol 64,
5270-5276.
Palmer, N. F., Sachs, N., and Foulkes, M. (1981). Histopathology and prognosis in
rhabdomyosarcoma: a report of the Intergroup Rhabdomyosarcoma Study. Proc
SIOP 13, 113.
Pappo, A., Lyden, E., Breitfeld, P., Donaldson, S., Anderson, J., Qualman, S., Wiener, E.,
Crews, K., Houghton, P., and Meyer, W. H. (2005). Vincristine (V) and Irinotecan (CPT):
A highly active combination in metastatic rhabdomyosarcoma. A Report from the
Soft Tissue Sarcoma Committee of the Children's Oncology Group (STSCOG). Proc
Am Soc Clin Oncol 23, 802s.
Pappo, A. S., Anderson, J. R., Crist, W. M., Wharam, M. D., Breitfeld, P. P.,
Hawkins, D., Raney, R. B., Womer, R. B., Parham, D. M., Qualman, S. J., and Grier,
H. E. (1999a). Survival after relapse in children and adolescents with
rhabdomyosarcoma: A report from the Intergroup Rhabdomyosarcoma Study Group.
J Clin Oncol 17, 3487-3493.
Pappo, A. S., Parham, D. M., Rao, B. N., and Lobe, T. E. (1999b). Soft tissue
sarcomas in children. Semin Surg Oncol 16, 121-143.
Pappo, A. S., Shapiro, D. N., Crist, W. M., and Maurer, H. M. (1995). Biology and
therapy of pediatric rhabdomyosarcoma. J Clin Oncol 13, 2123-2139.
Parham, D. M. (2001). Pathologic classification of rhabdomyosarcomas and
correlations with molecular studies. Mod Pathol 14, 506-514.
Pawitan, Y., Bjohle, J., Wedren, S., Humphreys, K., Skoog, L., Huang, F., Amler, L.,
Shaw, P., Hall, P., and Bergh, J. (2004). Gene expression profiling for prognosis
using Cox regression. Stat Med 23, 1767-1780.
Persons, D. A., Allay, J. A., Allay, E. R., Ashmun, R. A., Orlic, D., Jane, S. M.,
Cunningham, J. M., and Nienhuis, A. W. (1999). Enforced expression of the GATA-
2 transcription factor blocks normal hematopoiesis. Blood 93, 488-499.
Peter, M., Gilbert, E., and Delattre, O. (2001). A multiplex real-time PCR assay for the
detection of gene fusions observed in solid tumors. Lab Invest 81, 905-912.
Petricoin, E. F., 3rd, Bichsel, V. E., Calvert, V. S., Espina, V., Winters, M., Young,
L., Belluco, C., Trock, B. J., Lippman, M., Fishman, D. A., et al. (2005). Mapping
molecular networks using proteomics: a vision for patient-tailored combination
therapy. J Clin Oncol 23, 3614-3621.
Pittman, J., Huang, E., Dressman, H., Horng, C. F., Cheng, S. H., Tsou, M. H., Chen,
C. M., Bild, A., Iversen, E. S., Huang, A. T., et al. (2004). Integrated modeling of
clinical and gene expression information for personalized prediction of disease
outcomes. Proc Natl Acad Sci U S A 101, 8431-8436.
Plattner, R., Anderson, M. J., Sato, K. Y., Fasching, C. L., Der, C. J., and Stanbridge,
E. J. (1996). Loss of oncogenic ras expression does not correlate with loss of
tumorigenicity in human cells. Proc Natl Acad Sci U S A 93, 6665-6670.
Punyko, J. A., Mertens, A. C., Baker, K. S., Ness, K. K., Robison, L. L., and Gurney,
J. G. (2005). Long-term survival probabilities for childhood rhabdomyosarcoma. A
population-based evaluation. Cancer 103, 1475-1483.
Pursglove, S. E., and Mackay, J. P. (2005). CSL: a notch above the rest. Int J
Biochem Cell Biol 37, 2472-2477.
Qualman, S. J., Bowen, J., Parham, D. M., Branton, P. A., and Meyer, W. H. (2003).
Protocol for the examination of specimens from patients (children and young adults)
with rhabdomyosarcoma. Arch Pathol Lab Med 127, 1290-1297.
Qualman, S. J., Coffin, C. M., Newton, W. A., Hojo, H., Triche, T. J., Parham, D.
M., and Crist, W. M. (1998). Intergroup Rhabdomyosarcoma Study: update for
pathologists. Pediatr Dev Pathol 1, 550-561.
Qualman, S. J., and Morotti, R. A. (2002). Risk assignment in pediatric soft-tissue
sarcomas: an evolving molecular classification. Curr Oncol Rep 4, 123-130.
Ramaswamy, S., Ross, K. N., Lander, E. S., and Golub, T. R. (2003). A molecular
signature of metastasis in primary solid tumors. Nat Genet 33, 49-54.
Raney, R. B., Anderson, J. R., Barr, F. G., Donaldson, S. S., Pappo, A. S., Qualman,
S. J., Wiener, E. S., Maurer, H. M., and Crist, W. M. (2001). Rhabdomyosarcoma
and undifferentiated sarcoma in the first two decades of life: a selective review of
intergroup rhabdomyosarcoma study group experience and rationale for Intergroup
Rhabdomyosarcoma Study V. J Pediatr Hematol Oncol 23, 215-220.
Riopelle, J. L., and Thériault, J. P. (1956). Sur une forme méconnue de sarcome des
parties molles: le rhabdomyosarcome alvéolaire. Ann Anat Pathol (Paris) 1, 88-111.
Robson, E. J., He, S. J., and Eccles, M. R. (2006). A PANorama of PAX genes in
cancer and development. Nat Rev Cancer 6, 52-62.
Sakamuro, D., and Prendergast, G. C. (1999). New Myc-interacting proteins: a
second Myc network emerges. Oncogene 18, 2942-2954.
Sasaki, A., Taketomi, T., Kato, R., Saeki, K., Nonami, A., Sasaki, M., Kuriyama, M.,
Saito, N., Shibuya, M., and Yoshimura, A. (2003). Mammalian Sprouty4 suppresses
Ras-independent ERK activation by binding to Raf1. Nat Cell Biol 5, 427-432.
Schaaf, G. J., Ruijter, J. M., van Ruissen, F., Zwijnenburg, D. A., Waaijer, R.,
Valentijn, L. J., Benit-Deekman, J., van Kampen, A. H., Baas, F., and Kool, M.
(2005). Full transcriptome analysis of rhabdomyosarcoma, normal, and fetal skeletal
muscle: statistical comparison of multiple SAGE libraries. Faseb J 19, 404-406.
Schafer, B. W. (1998). Emerging roles for PAX transcription factors in cancer
biology. Gen Physiol Biophys 17, 211-224.
Schmidt, D., Reimann, O., Treuner, J., and Harms, D. (1986). Cellular differentiation
and prognosis in embryonal rhabdomyosarcoma. A report from the Cooperative Soft
Tissue Sarcoma Study 1981 (CWS 81). Virchows Arch A Pathol Anat Histopathol
409, 183-194.
Schofield, D., and Triche, T. J. (2002). cDNA microarray analysis of global gene
expression in sarcomas. Curr Opin Oncol 14, 406-411.
Scholl, F. A., Kamarashev, J., Murmann, O. V., Geertsen, R., Dummer, R., and
Schafer, B. W. (2001). PAX3 is expressed in human melanomas and contributes to
tumor cell survival. Cancer Res 61, 823-826.
Schuck, A., Mattke, A. C., Schmidt, B., Kunz, D. S., Harms, D., Knietig, R.,
Treuner, J., and Koscielniak, E. (2004). Group II rhabdomyosarcoma and
rhabdomyosarcomalike tumors: is radiotherapy necessary? J Clin Oncol 22, 143-149.
Schumacher, A., Kapranov, P., Kaminsky, Z., Flanagan, J., Assadzadeh, A., Yau, P.,
Virtanen, C., Winegarden, N., Cheng, J., Gingeras, T., and Petronis, A. (2006).
Microarray-based DNA methylation profiling: technology and applications. Nucleic
Acids Res 34, 528-542.
Segal, E., Friedman, N., Kaminski, N., Regev, A., and Koller, D. (2005). From
signatures to models: understanding cancer using microarrays. Nat Genet 37 Suppl,
S38-45.
Segal, N. H., Pavlidis, P., Antonescu, C. R., Maki, R. G., Noble, W. S., DeSantis, D.,
Woodruff, J. M., Lewis, J. J., Brennan, M. F., Houghton, A. N., and Cordon-Cardo,
C. (2003). Classification and subtype prediction of adult soft tissue sarcoma by
functional genomics. Am J Pathol 163, 691-700.
Shaoul, E., Reich-Slotky, R., Berman, B., and Ron, D. (1995). Fibroblast growth
factor receptors display both common and distinct signaling pathways. Oncogene 10,
1553-1561.
Shipp, M. A., Ross, K. N., Tamayo, P., Weng, A. P., Kutok, J. L., Aguiar, R. C.,
Gaasenbeek, M., Angelo, M., Reich, M., Pinkus, G. S., et al. (2002). Diffuse large B-
cell lymphoma outcome prediction by gene-expression profiling and supervised
machine learning. Nat Med 8, 68-74.
Simon, R. (2003). Diagnostic and prognostic prediction using gene expression
profiles in high-dimensional microarray data. Br J Cancer 89, 1599-1604.
Simon, R. (2005). Roadmap for developing and validating therapeutically relevant
genomic classifiers. J Clin Oncol 23, 7332-7341.
Simon, R., Radmacher, M. D., Dobbin, K., and McShane, L. M. (2003). Pitfalls in
the use of DNA microarray data for diagnostic and prognostic classification. J Natl
Cancer Inst 95, 14-18.
Slack, A., Chen, Z., Tonelli, R., Pule, M., Hunt, L., Pession, A., and Shohet, J. M.
(2005). The p53 regulatory gene MDM2 is a direct transcriptional target of MYCN
in neuroblastoma. Proc Natl Acad Sci U S A 102, 731-736.
Smith, L. M., Anderson, J. R., and Coffin, C. M. (2002). Cytodifferentiation and
clinical outcome after chemotherapy and radiation therapy for rhabdomyosarcoma
(RMS). Med Pediatr Oncol 38, 398-404.
Soneoka, Y., Cannon, P. M., Ramsdale, E. E., Griffiths, J. C., Romano, G.,
Kingsman, S. M., and Kingsman, A. J. (1995). A transient three-plasmid expression
system for the production of high titer retroviral vectors. Nucleic Acids Res 23, 628-
633.
Sorensen, P. H., Lynch, J. C., Qualman, S. J., Tirabosco, R., Lim, J. F., Maurer, H.
M., Bridge, J. A., Crist, W. M., Triche, T. J., and Barr, F. G. (2002). PAX3-FKHR
and PAX7-FKHR gene fusions are prognostic indicators in alveolar
rhabdomyosarcoma: a report from the children's oncology group. J Clin Oncol 20,
2672-2679.
Sposto, R. (2002). Cure model analysis in cancer: an application to data from the
Children's Cancer Group. Stat Med 21, 293-312.
Spunt, S. L., Smith, L. M., Ruymann, F. B., Qualman, S. J., Donaldson, S. S.,
Rodeberg, D. A., Anderson, J. R., Crist, W. M., and Link, M. P. (2004).
Cyclophosphamide dose intensification during induction therapy for intermediate-
risk pediatric rhabdomyosarcoma is feasible but does not improve outcome: a report
from the soft tissue sarcoma committee of the children's oncology group. Clin
Cancer Res 10, 6072-6079.
Stevens, M. C. (2005). Treatment for childhood rhabdomyosarcoma: the cost of cure.
Lancet Oncol 6, 77-84.
Stiller, C. A., McKinney, P. A., Bunch, K. J., Bailey, C. C., and Lewis, I. J. (1991).
Childhood cancer and ethnic group in Britain: a United Kingdom children's Cancer
Study Group (UKCCSG) study. Br J Cancer 64, 543-548.
Stout, A. (1946). Rhabdomyosarcoma of the skeletal muscles. Ann Surg 123, 447-
472.
Sublett, J. E., Jeon, I. S., and Shapiro, D. N. (1995). The alveolar rhabdomyosarcoma
PAX3/FKHR fusion protein is a transcriptional activator. Oncogene 11, 545-552.
Sutow, W. W., Sullivan, M. P., Ried, H. L., Taylor, H. G., and Griffith, K. M.
(1970). Prognosis in childhood rhabdomyosarcoma. Cancer 25, 1384-1390.
Thies, A., Schachner, M., Moll, I., Berger, J., Schulze, H. J., Brunner, G., and
Schumacher, U. (2002). Overexpression of the cell adhesion molecule L1 is
associated with metastasis in cutaneous malignant melanoma. Eur J Cancer 38, 1708-
1716.
Tibshirani, R., Hastie, T., Narasimhan, B., and Chu, G. (2002). Diagnosis of multiple
cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A
99, 6567-6572.
Tiffin, N., Williams, R. D., Shipley, J., and Pritchard-Jones, K. (2003). PAX7
expression in embryonal rhabdomyosarcoma suggests an origin in muscle satellite
cells. Br J Cancer 89, 327-332.
Tonin, P. N., Scrable, H., Shimada, H., and Cavenee, W. K. (1991). Muscle-specific
gene expression in rhabdomyosarcomas and stages of human fetal skeletal muscle
development. Cancer Res 51, 5100-5106.
Torchia, E. C., Jaishankar, S., and Baker, S. J. (2003). Ewing tumor fusion proteins
block the differentiation of pluripotent marrow stromal cells. Cancer Res 63, 3464-
3468.
Tothill, R. W., Kowalczyk, A., Rischin, D., Bousioutas, A., Haviv, I., van Laar, R.
K., Waring, P. M., Zalcberg, J., Ward, R., Biankin, A. V., et al. (2005). An
expression-based site of origin diagnostic method designed for clinical application to
cancer of unknown origin. Cancer Res 65, 4031-4040.
Triche, T. J., Schofield, D., and Buckley, J. (2001). DNA microarrays in pediatric
cancer. Cancer J 7, 2-15.
Tsokos, M. (1986). The role of immunocytochemistry in the diagnosis of
rhabdomyosarcoma. Arch Pathol Lab Med 110, 776-778.
Tsokos, M. (1994). The diagnosis and classification of childhood
rhabdomyosarcoma. Semin Diagn Pathol 11, 26-38.
Tsokos, M., Webber, B. L., Parham, D. M., Wesley, R. A., Miser, A., Miser, J. S.,
Etcubanas, E., Kinsella, T., Grayson, J., and Glatstein, E. (1992).
Rhabdomyosarcoma. A new classification scheme related to prognosis. Arch Pathol
Lab Med 116, 847-855.
Turc-Carel, C., Lizard-Nacol, S., Justrabo, E., Favrot, M., Philip, T., and Tabone, E.
(1986). Consistent chromosomal translocation in alveolar rhabdomyosarcoma.
Cancer Genet Cytogenet 19, 361-362.
Tusher, V. G., Tibshirani, R., and Chu, G. (2001). Significance analysis of
microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98,
5116-5121.
van 't Veer, L. J., Dai, H., van de Vijver, M. J., He, Y. D., Hart, A. A., Mao, M.,
Peterse, H. L., van der Kooy, K., Marton, M. J., Witteveen, A. T., et al. (2002). Gene
expression profiling predicts clinical outcome of breast cancer. Nature 415, 530-536.
van de Vijver, M. J., He, Y. D., van't Veer, L. J., Dai, H., Hart, A. A., Voskuil, D.
W., Schreiber, G. J., Peterse, J. L., Roberts, C., Marton, M. J., et al. (2002). A gene-
expression signature as a predictor of survival in breast cancer. N Engl J Med 347,
1999-2009.
van Houwelingen, H. C., Bruinsma, T., Hart, A. A., Van't Veer, L. J., and Wessels,
L. F. (2005). Cross-validated Cox regression on microarray gene expression data.
Stat Med.
Visser, M., Sijmons, C., Bras, J., Arceci, R. J., Godfried, M., Valentijn, L. J., Voute,
P. A., and Baas, F. (1997). Allelotype of pediatric rhabdomyosarcoma. Oncogene 15,
1309-1314.
Wachtel, M., Dettling, M., Koscielniak, E., Stegmaier, S., Treuner, J., Simon-
Klingenstein, K., Buhlmann, P., Niggli, F. K., and Schafer, B. W. (2004). Gene
expression signatures identify rhabdomyosarcoma subtypes and detect a novel
t(2;2)(q35;p23) translocation fusing PAX3 to NCOA1. Cancer Res 64, 5539-5545.
Wachtel, M., Runge, T., Leuschner, I., Stegmaier, S., Koscielniak, E., Treuner, J.,
Odermatt, B., Behnke, S., Niggli, F. K., and Schafer, B. W. (2006). Subtype and
Prognostic Classification of Rhabdomyosarcoma by Immunohistochemistry. J Clin
Oncol.
Walterhouse, D. O., Lyden, E. R., Breitfeld, P. P., Qualman, S. J., Wharam, M. D.,
and Meyer, W. H. (2004). Efficacy of topotecan and cyclophosphamide given in a
phase II window trial in children with newly diagnosed metastatic
rhabdomyosarcoma: a Children's Oncology Group study. J Clin Oncol 22, 1398-
1403.
Wang, Y., Klijn, J. G., Zhang, Y., Sieuwerts, A. M., Look, M. P., Yang, F., Talantov,
D., Timmermans, M., Meijer-van Gelder, M. E., Yu, J., et al. (2005). Gene-
expression profiles to predict distant metastasis of lymph-node-negative primary
breast cancer. Lancet 365, 671-679.
Wechsler-Reya, R. J., Elliott, K. J., and Prendergast, G. C. (1998). A role for the
putative tumor suppressor Bin1 in muscle cell differentiation. Mol Cell Biol 18, 566-
575.
Weigelt, B., Hu, Z., He, X., Livasy, C., Carey, L. A., Ewend, M. G., Glas, A. M.,
Perou, C. M., and Van't Veer, L. J. (2005). Molecular portraits and 70-gene
prognosis signature are preserved throughout the metastatic process of breast cancer.
Cancer Res 65, 9155-9158.
Wijnaendts, L. C., van der Linden, J. C., van Unnik, A. J., Delemarre, J. F., Barbet,
J. P., Butler-Browne, G. S., and Meijer, C. J. (1994). Expression of developmentally
regulated muscle proteins in rhabdomyosarcomas. Am J Pathol 145, 895-901.
Wolden, S. L., Anderson, J. R., Crist, W. M., Breneman, J. C., Wharam, M. D., Jr.,
Wiener, E. S., Qualman, S. J., and Donaldson, S. S. (1999). Indications for
radiotherapy and chemotherapy after complete resection in rhabdomyosarcoma: A
report from the Intergroup Rhabdomyosarcoma Studies I to III. J Clin Oncol 17,
3468-3475.
Xia, S. J., and Barr, F. G. (2004). Analysis of the transforming and growth
suppressive activities of the PAX3-FKHR oncoprotein. Oncogene 23, 6864-6871.
Yeoh, E. J., Ross, M. E., Shurtleff, S. A., Williams, W. K., Patel, D., Mahfouz, R.,
Behm, F. G., Raimondi, S. C., Relling, M. V., Patel, A., et al. (2002). Classification,
subtype discovery, and prediction of outcome in pediatric acute lymphoblastic
leukemia by gene expression profiling. Cancer Cell 1, 133-143.
Zeng, F., and Schultz, R. M. (2005). RNA transcript profiling during zygotic gene
activation in the preimplantation mouse embryo. Dev Biol 283, 40-57.
Zhang, Z., Yuan, X. M., Li, L. H., and Xie, F. P. (2001). Transdifferentiation of
neoplastic cells. Med Hypotheses 57, 655-666.
Appendix A
Supplementary Table 1. Tumor sample covariates
Sample ID Histology Transloc. Molecular Class Dead Survival Fail FFS Stage Clin GP
A335 Other ? E2 Alive 0.5 . 0.5 Stage 4 IV
A337 EMB ? E2 Alive 7.1 . 7.1 Stage 3 ?
A338 Other ? N Alive 0.2 Fail 0.2 Stage 3 ?
A339 UDS ? N Alive 12.9 . 12.9 Stage 3 III
A340 UDS ? N Dead 1.1 Fail 1.0 Stage 1 III
A341 Spindle ? E2 Alive 12.8 . 12.8 Stage 1 IA
A342 ALV P7F A1 Alive 12.6 . 12.6 Stage 2 IA
A343 EMB ? E2 Alive 11.1 . 11.1 Stage 3 III
A344 ALV P3F A1 Dead 3.2 Fail 1.8 Stage 3 III
A345 EMB ? E2 Dead 0.4 Fail 0.2 Stage 4 IV
A346 EMB ? E1 Alive 12.3 . 12.3 Stage 2 IA
A347 Spindle ? E2 Alive 9.0 . 9.0 Stage 1 IA
A348 ALV P3F A2 Alive 11.1 . 11.1 Stage 3 III
A349 EMB ? E2 Alive 12.1 . 12.1 Stage 2 IIA
A350 EMB ? E2 Alive 11.9 . 11.9 Stage 2 IA
A352 Other ? E2 Alive 11.2 . 11.2 Stage 3 ?
A354 UDS ? N Dead 1.8 Fail 1.4 Stage 1 IA
A355 ALV P3F A1 Dead 1.6 Fail 1.1 Stage 3 III
A356 ALV P3F A2 Dead 1.8 Fail 0.4 Stage 4 IV
A357 ALV P3F A1 Dead 4.4 Fail 1.9 Stage 4 IV
A358 EMB ? E2 Alive 3.8 . 3.8 Stage 1 IA
A360 EMB ? E2 Alive 11.1 . 11.1 Stage 3 IB
A361 EMB ? E1 Alive 10.3 . 10.3 Stage 4 IV
A362 EMB ? E2 Alive 11.3 . 11.3 Stage 3 III
A363 EMB ? E2 Alive 6.9 . 6.9 Stage 1 III
A366 EMB ? E2 Dead 3.6 Fail 2.5 Stage 4 IV
A367 ALV P7F A1 Alive 5.0 . 5.0 Stage 2 ?
A368 EMB ? E1 Alive 9.5 . 9.5 Stage 3 III
A369 ALV P7F A2 Alive 0.1 . 0.1 Stage 4 IV
A371 ALV P7F A2 Alive 10.5 . 10.5 Stage 2 IA
A372 EMB ? E2 Alive 2.2 . 2.2 Stage 1 IIA
A373 Spindle ? E2 Alive 11.4 . 11.4 Stage 1 IA
A374 EMB ? N Alive 9.7 . 9.7 Stage 2 III
A375 ALV NEG E2 Alive 11.0 . 11.0 Stage 3 III
A376 EMB ? E2 Alive 8.6 . 8.6 Stage 3 III
A377 ALV P3F A1 Dead 7.6 Fail 5.6 Stage 3 IA
A378 ALV P3F A1 Alive 10.9 . 10.9 Stage 2 IA
A379 EMB ? E2 Dead 1.7 Fail 1.2 Stage 1 ?
A381 ALV P3F A2 Dead 1.0 Fail 0.6 Stage 4 IV
A382 Other ? E2 ? ? ? ? ?
A383 EMB ? E2 Alive 5.3 . 5.3 Stage 1 IA
A384 EMB ? E2 Alive 10.4 . 10.4 Stage 4 IV
A385 ALV P7F A2 Dead 2.3 Fail 1.6 Stage 3 III
A387 ALV P7F A2 Alive 9.2 . 9.2 Stage 2 IIA
A388 EMB ? E2 Alive 9.9 . 9.9 Stage 2 IA
A390 ALV P3F A1 Alive 6.3 . 6.3 Stage 2 IB
A391 EMB ? E2 Alive 9.9 . 9.9 Stage 1 IIA
A392 EMB ? E2 Alive 9.2 . 9.2 Stage 3 III
A395 ALV P3F A2 Dead 4.4 Fail 1.6 Stage 3 III
A396 EMB ? E1 Alive 11.1 . 11.1 Stage 2 IA
A398 EMB ? N Alive 7.7 . 7.7 Stage 4 IV
A400 ALV P3F A2 Dead 1.1 Fail 1.0 Stage 3 III
A402 EMB ? E2 Alive 3.8 . 3.8 Stage 3 III
A403 EMB ? E2 Alive 10.9 . 10.9 Stage 3 ?
A404 ALV P7F A1 Alive 5.6 . 5.6 Stage 4 IV
A405 ALV P3F A1 Alive 11.6 . 11.6 Stage 2 IA
A406 Mixed Alv/Emb NEG E1 Dead 0.2 Fail 0.2 Stage 4 IV
A409 EMB ? E2 Dead 0.9 Fail 0.7 Stage 4 IV
A412 EMB ? E2 Alive 1.7 . 1.7 Stage 3 ?
A413 EMB ? E1 Alive 5.4 . 5.4 Stage 3 III
A416 Botryoid ? E1 Alive 3.9 . 3.9 Stage 1 III
A418 Spindle ? E1 Alive 10.9 . 10.9 Stage 1 ?
A419 ALV NEG E1 Alive 11.1 . 11.1 Stage 3 IIB
A420 ALV P3F A1 Alive 6.4 . 6.4 Stage 3 IA
A423 Other ? E2 Alive 8.7 . 8.7
A424 EMB ? E2 Alive 4.9 . 4.9 Stage 3 IIA
A425 Botryoid ? E2 Alive 0.2 . 0.2 Stage 3 ?
A426 Other ? N Dead 0.1 Fail 0.1 ?
A427 UDS ? N Alive 11.5 . 11.5 III
A428 EMB ? E2 Alive 8.4 . 8.4 Stage 3 III
A429 EMB ? E2 Alive 6.0 . 6.0 Stage 1 IA
A431 EMB ? E2 Dead 3.5 Fail 3.1 Stage 3 ?
A433 ALV P7F A1 Alive 2.9 Fail 1.8 Stage 3 ?
A434 ALV NEG E2 Alive 3.0 . 3.0 Stage 4 IV
A437 EMB ? E2 Alive 10.7 . 10.7 Stage 3 III
A439 Botryoid ? E1 Alive 8.7 . 8.7 Stage 3 III
A440 EMB ? E2 Dead 1.5 Fail 1.3 Stage 3 III
A441 EMB ? E2 Dead 2.5 Fail 0.8 Stage 4 IV
A442 ALV NEG E1 Alive 0.8 . 0.8 Stage 2 III
A444 ALV P3F A2 Dead 1.7 Fail 1.2 Stage 3 III
A446 ALV P3F A1 Alive 12.3 . 12.3 Stage 3 III
A449 Other ? N Alive 0.3 . 0.3 ?
A455 EMB ? E2 Alive 2.8 . 2.8 Stage 1 IA
A459 EMB ? E1 Alive 10.7 . 10.7 Stage 3 IA
A460 ALV P3F A2 Dead 4.7 Fail 2.5 Stage 1 IIA
A462 EMB ? N Dead 1.5 Fail 1.3 Stage 4 IV
A464 ALV P7F A1 Alive 6.1 . 6.1 Stage 2 IA
A465 EMB ? N Alive 9.0 . 9.0 Stage 1 IA
A466 EMB ? E2 Alive 0.1 . 0.1 Stage 3 III
A467 EMB ? E1 Dead 2.1 Fail 1.3 Stage 3 ?
A468 ALV P3F A2 Dead 1.4 Fail 1.1 Stage 3 III
A470 EMB ? E2 Alive 8.7 . 8.7 Stage 1 IA
A513 EMB ? N Alive 2.8 ? ? ? ?
A514 ALV P7F A1 Alive 6.8 ? ? ? ?
A518 ALV P3F A2 Dead 0.9 ? ? Stage 4 IV
A519 ALV NEG E2 Dead 1.0 ? ? Stage 4 IV
A520 ALV P3F A1 Dead ? ? ? ? ?
A521 ALV P3F A2 Dead 1.1 ? ? Stage 4 IV
A522 ALV P7F A1 Alive ? ? ? ? ?
A523 ALV P3F A2 Alive 1.9 ? ? ? ?
A524 EMB ? E1 Alive 8.5 ? ? ? ?
A527 EMB ? E2 Alive 8.4 ? ? ? ?
A529 EMB ? E1 Alive 9.2 ? ? ? ?
A530 EMB ? E2 Dead ? ? ? ? ?
A531 ALV P3F A2 Dead ? ? ? ? ?
A535 ALV P3F A2 Dead ? ? ? ? ?
A537 ALV P3F A2 Dead ? ? ? ? ?
A539 Mixed Alv/Emb P7F A1 Alive 9.8 ? ? ? ?
A541 ALV P3F A1 Alive 0.9 Fail ? ? ?
A542 EMB ? E1 Alive 7.1 . ? ? ?
A543 EMB ? E2 Dead 2.7 ? ? Stage 4 IV
A545 EMB ? N Dead 1.3 ? ? ? ?
A547 Mixed Alv/Emb P7F A1 Dead 5.9 ? ? ? ?
A551 Mixed Alv/Emb P3F A1 Dead 1.1 ? ? Stage 4 IV
A613 ALV P7F A1 Alive 3.0 . 3.0 Stage 4 IV
A859 Spindle ? E1 ? ? ? ? Stage 1 ?
A860 EMB ? E2 ? ? ? ? ? ?
A864 EMB ? E1 ? ? ? ? Stage 1 ?
A867 EMB ? E1 ? ? ? ? Stage 2 ?
B635 ALV P3F A1 Dead 0.8 Fail 0.3 Stage 4 IV
B638 EMB ? E2 Alive 11.4 Fail 1.4 Stage 2 III
B642 UDS ? N Dead 2.3 Fail 1.6 Stage 3 ?
B645 ALV P3F A1 Dead 1.8 Fail 0.6 Stage 4 IV
B648 ALV NEG E2 Dead 2.0 Fail 1.2 Stage 4 IV
B649 ALV P3F A2 Alive 0.4 . 0.4 Stage 3 III
B650 ALV P3F A2 Dead 2.2 Fail 1.6 Stage 3 IIB
B655 EMB ? E1 Dead 3.1 Fail 1.0 Stage 3 III
B660 EMB ? E1 Alive 0.7 . 0.7 Stage 1 IIB
B665 ALV P3F A1 Dead 3.1 Fail 1.7 Stage 4 IV
B666 EMB ? E2 Dead 2.1 Fail 1.7 Stage 4 IV
B734 EMB ? E2 Dead 1.8 Fail 1.8 Stage 4 IV
B737 EMB ? E2 Alive 8.7 . 8.7 Stage 1 IA
B738 ALV P3F A2 Dead 2.0 Fail 1.7 Stage 4 IV
B739 ALV P7F A2 Alive 8.0 Fail 3.0 Stage 3 IIA
B740 ALV P7F A1 Alive 1.7 . 1.7 Stage 2 IA
B743 EMB ? E1 Alive 8.6 . 8.6 Stage 1 IA
B746 ALV P3F A2 Alive 8.8 . 8.8 Stage 3 III
B747 Spindle ? E2 Alive 8.6 . 8.6 Stage 2 III
B753 ALV P3F A2 Dead 1.6 Fail 1.0 Stage 4 IV
B755 ALV P3F A2 Alive 8.4 . 8.4 Stage 3 III
B757 EMB ? E1 Alive 8.3 . 8.3 Stage 3 IIA
B759 EMB ? E2 Alive 7.7 Fail 3.9 Stage 3 III
B760 ALV NEG E1 Alive 0.2 . 0.2 Stage 3 III
B763 ALV NEG E1 Alive 7.6 . 7.6 Stage 3 III
B765 EMB ? E1 Alive 7.8 . 7.8 Stage 1 IA
B766 EMB ? E1 Alive 7.6 . 7.6 Stage 1 IA
B767 ALV P3F A2 Alive 1.5 . 1.5 Stage 4 IV
B768 EMB ? E2 Alive 5.4 . 5.4 Stage 2 III
B771 EMB ? E2 Alive 7.5 . 7.5 Stage 3 III
B776 EMB ? E1 Alive 7.7 . 7.7 Stage 1 IA
B777 EMB ? E2 Alive 8.2 . 8.2 Stage 3 III
B780 ALV P3F A1 Dead 2.1 Fail 1.8 Stage 4 IV
C337 ALV NEG E1 Alive 1.9 . 1.9 ? ?
C338 ALV NEG E2 Dead 0.7 Fail 0.5 Stage 3 ?
C339 ALV NEG E2 Dead 0.6 Fail 0.5 Stage 3 ?
C340 ALV NEG E2 Alive 2.8 . 2.8 ? ?
C341 ALV P3F A1 Alive 2.1 . 2.1 Stage 4 IV
C342 ALV NEG E2 Alive 1.0 . 1.0 Stage 4 IV
C343 ALV NEG E2 Alive 1.5 . 1.5 Stage 3 ?
ge 371 ALV P3F A2 Alive 9.7 . 9.7 Stage 3 III
Abbreviations
Histology: EMB = embryonal, ALV = alveolar, UDS = undifferentiated sarcoma, Other = non-rhabdomyosarcoma STS
Transloc: P3F = PAX3-FKHR, P7F = PAX7-FKHR, NEG = fusion negative (RT-PCR)
For Stage and Clinical Group definitions see Tables 6 and 7.
Supplementary Table 1. Continued
Sample ID T N M Size Site IRS-V Risk GP Chemo Radio Age Sex
A335 ? ? M1 ? Lung High ? ? 0 male
A337 ? ? M0 ? Retroperitoneum Intermed VAC conv XRT 10 female
A338 ? ? M0 ? Buttock Intermed ? ? 2 female
A339 T-2 N-0 M0 > 5 cm Bladder Intermed ? ? 2 male
A340 T-1 N-0 M0 ≤ 5 cm Neck Intermed ? ? 0 male
A341 T-1 N-0 M0 > 5 cm Testis-Paratestis Low VA None 3 male
A342 T-1 N-0 M0 ≤ 5 cm Buttock Intermed VAI None 3 female
A343 T-2 N-0 M0 > 5 cm Bladder Intermed ? ? 1 male
A344 T-2 N-1 M0 > 5 cm Pelvis, Site Indeterminate Intermed VIE conv XRT 1 male
A345 T-2 N-0 M1 > 5 cm Parapharyngeal Area Intermed IE conv XRT 7 female
A346 T-1 N-0 M0 ≤ 5 cm Chest Wall Low VAC conv XRT 3 male
A347 T-1 N-0 M0 > 5 cm Testis-Paratestis Low VA None 5 male
A348 T-2 N-1 M0 > 5 cm Perineum Intermed VIE hyper XRT 8 male
A349 T-2 N-0 M0 ≤ 5 cm Orbit & PM Extension Intermed VAI conv XRT 9 male
A350 T-1 N-0 M0 ≤ 5 cm Shoulder Girdle Low VIE None 2 female
A352 ? ? M0 ? Lung Intermed ? ? 2 male
A354 T-1 N-0 M0 ≤ 5 cm Other Head & Neck Intermed ? ? 15 female
A355 T-2 N-1 M0 > 5 cm Prostate Intermed VIE conv XRT 14 male
A356 T-1 N-1 M1 ≤ 5 cm Foot High VM conv XRT 9 male
A357 T-2 N-1 M1 ≤ 5 cm Forearm High IE conv XRT 16 female
A358 T-1 N-0 M0 > 5 cm Other Head & Neck Low VAC None 5 male
A360 T-2 N-0 M0 > 5 cm Bladder Intermed VAC conv XRT 1 male
A361 T-2 N-1 M1 > 5 cm Pelvis, Site Indeterminate Intermed IE conv XRT 7 male
A362 T-2 N-0 M0 > 5 cm Nasopharynx Intermed VIE conv XRT 4 male
A363 T-1 N-0 M0 ≤ 5 cm Cheek Low VIE conv XRT 8 female
A366 T-2 N-0 M1 > 5 cm Bladder Intermed VAI conv XRT 4 male
A367 ? ? M0 ? Leg Intermed VIE None 6 female
A368 T-2 N-0 M0 > 5 cm Nasopharynx Intermed VIE hyper XRT 10 female
A369 T-2 N-1 M1 > 5 cm Neck High VM conv XRT 8 female
A371 T-1 N-0 M0 ≤ 5 cm Thigh Intermed VAC None 8 male
A372 T-1 N-0 M0 ≤ 5 cm Orbit Low VA conv XRT 13 female
A373 T-1 N-0 M0 > 5 cm Testis-Paratestis Low VA None 6 male
A374 T-2 N-0 M0 ≤ 5 cm Paranasal Sinus Intermed VAC hyper XRT 8 male
A375 T-2 N-1 M0 ≤ 5 cm Middle Ear Intermed VAC conv XRT 3 female
A376 T-2 N-0 M0 > 5 cm Bladder Intermed VAC conv XRT 3 male
A377 T-1 N-0 M0 > 5 cm Thigh Intermed VAC conv XRT 1 male
A378 T-1 N-0 M0 ≤ 5 cm Hand Intermed VAI None 8 male
A379 ? ? M0 ? Testis Low VA None 17 male
A381 T-2 N-1 M1 > 5 cm Uterus High VM None 17 female
A382 ? ? ? ? ? ? ? ? ? ?
A383 T-1 N-0 M0 ≤ 5 cm Testis-Paratestis Low VA None 3 male
A384 T-2 N-0 M1 > 5 cm Uterus Intermed VM None 5 female
A385 T-1 N-1 M0 > 5 cm Retroperitoneum Intermed VAC hyper XRT 2 female
A387 T-1 N-0 M0 ≤ 5 cm Chest Wall Intermed VAI conv XRT 6 male
A388 T-1 N-0 M0 ≤ 5 cm Paraspinal Low VAC None 0 female
A390 T-1 N-0 M0 ≤ 5 cm Thigh Intermed VAC None 15 female
A391 T-1 N-0 M0 ≤ 5 cm Orbit Low VA conv XRT 2 male
A392 T-2 N-0 M0 > 5 cm Pelvis, Site Indeterminate Intermed VAI hyper XRT 5 female
A395 T-2 N-1 M0 > 5 cm Foot Intermed VAC conv XRT 5 female
A396 T-1 N-0 M0 ≤ 5 cm Thigh Low VAI None 1 female
A398 ? ? M1 ? Liver Intermed VM conv XRT 9 male
A400 T-2 N-1 M0 > 5 cm Nasopharynx Intermed VAI hyper XRT 11 male
A402 T-2 N-0 M0 > 5 cm Infratemporal Fossa Intermed VAI conv XRT 5 male
A403 ? ? M0 ? Retroperitoneum Intermed VIE hyper XRT 2 female
A404 T-2 N-1 M1 > 5 cm Buttock High IE conv XRT 8 male
A405 T-1 N-0 M0 ≤ 5 cm Leg Intermed VAC None 2 female
A406 T-2 N-1 M1 > 5 cm Retroperitoneum High VM conv XRT 1 male
A409 T-2 N-1 M1 > 5 cm Testis-Paratestis High VM conv XRT 17 male
A412 ? ? M0 ? Prostate Intermed VAC hyper XRT 3 male
A413 T-2 N-0 M0 > 5 cm Nasal Cavity & Sinus Intermed VIE hyper XRT 4 male
A416 T-1 N-0 M0 ≤ 5 cm Cervix Low VAI hyper XRT 2 female
A418 ? ? M0 ? Testis-Paratestis Low VAI conv XRT 13 male
A419 T-2 N-0 M0 > 5 cm Leg Intermed VAI conv XRT 2 male
A420 T-1 N-0 M0 > 5 cm Forearm Intermed VAC conv XRT 9 female
A423 ? ? M0 ? Testis Intermed ? ? 1 male
A424 T-2 N-0 M0 > 5 cm Bladder Intermed VAI conv XRT 4 male
A425 ? ? M0 ? Paranasal Sinuses Intermed VAC hyper XRT 8 male
A426 ? ? ? ? ? ? ? ? 2 male
A427 T-2 N-0 M0 > 5 cm Buttock Intermed ? ? 0 female
A428 T-1 N-1 M0 ≤ 5 cm Nasopharynx Intermed VAC hyper XRT 5 female
A429 T-1 N-0 M0 ≤ 5 cm Oral Cavity Low VAC None 14 male
A431 ? ? M0 ? Pelvis Intermed VAI conv XRT 1 male
A433 ? ? M0 ? Leg Intermed VIE conv XRT 2 female
A434 T-2 N-0 M1 > 5 cm Neck High IE conv XRT 11 male
A437 T-2 N-0 M0 > 5 cm Paranasal Sinus Intermed VIE conv XRT 3 female
A439 T-2 N-0 M0 > 5 cm Gall Bladder & Biliary Tree Intermed VIE conv XRT 5 male
A440 T-2 N-0 M0 > 5 cm Infratemporal Fossa Intermed VAI hyper XRT 2 female
A441 T-2 N-0 M1 > 5 cm Pelvis, Site Indeterminate Intermed IE conv XRT 5 male
A442 T-2 N-1 M0 > 5 cm Pelvis, Site Indeterminate Intermed VAC None 1 female
A444 T-2 N-0 M0 > 5 cm Other H&N & PM Extension Intermed VAI conv XRT 17 male
A446 T-2 N-1 M0 > 5 cm Forearm Intermed VAC hyper XRT 16 female
A449 ? ? M0 ? Vagina Intermed ? ? 0 female
A455 T-1 N-0 M0 ≤ 5 cm Testis-Paratestis Low VA None 5 male
A459 T-1 N-0 M0 > 5 cm Bladder Intermed VAC conv XRT 7 female
A460 T-1 N-0 M0 ≤ 5 cm Cheek Intermed VAI conv XRT 2 male
A462 T-1 N-0 M1 > 5 cm Heart Intermed IE conv XRT 0 male
A464 T-1 N-0 M0 ≤ 5 cm Thigh Intermed VIE None 6 male
A465 T-1 N-0 M0 ≤ 5 cm Uterus Low VAC None 13 female
A466 T-2 N-0 M0 > 5 cm Prostate Intermed VAI hyper XRT 12 male
A467 ? ? M0 ? Bladder Intermed VAC conv XRT 2 male
A468 T-2 N-0 M0 > 5 cm Retroperitoneum Intermed VAC conv XRT 8 male
A470 T-1 N-0 M0 > 5 cm Uterus Low VIE None 20 female
A513 ? ? M0 ? Buttock Intermed ? ? 1 female
A514 ? ? M0 ? Calf Intermed ? ? 3 male
A518 ? ? M1 ? Chest Wall met High ? ? 15 male
A519 ? ? M1 ? Renal High ? ? 11 male
A520 ? ? ? ? Lung ? ? ? ? male
A521 ? ? M1 ? Submand (primary thigh) High ? ? ? male
A522 ? ? ? ? Leg ? ? ? ? male
A523 ? ? M0 ? Axillary Mass Intermed ? ? 5 female
A524 ? ? M0 ? Pharyngeal Intermed ? ? 2 male
A527 ? ? M0 ? Eye ? ? ? 3 female
A529 ? ? M0 ? Abdominal mass ? ? ? 12 male
A530 ? ? ? ? Abdominal mass ? ? ? 0 male
A531 ? ? ? ? Thigh ? ? ? ? male
A535 ? ? ? ? Hand ? ? ? ? ?
A537 ? ? ? ? Ischiorectal fossa ? ? ? ? ?
A539 ? ? ? ? Thigh ? ? ? 1 male
A541 ? ? ? ? Foot ? ? ? 15 female
A542 ? ? ? ? Buttock ? ? ? 5 male
A543 ? ? M1 ? Mediastinum (paratest) Intermed ? ? 4 male
A545 ? ? ? ? Chest Wall ? ? ? 14 male
A547 ? ? ? ? Arm ? ? ? 1 male
A551 ? ? M1 ? Chest Wall High ? ? 15 male
A613 ? N-1 M1 > 5 cm Forearm High IE conv XRT 7 female
A859 T-1 N-0 M0 > 5 cm Testis-Paratestis Low ? ? 7 male
A860 ? ? ? ? ? ? ? ? ? ?
A864 T-2 N-0 M0 > 5 cm Testis-Paratestis Low ? ? 18 male
A867 T-2 N-0 M0 ≤ 5 cm Liver Intermed ? ? 6 male
B635 T-2 N-1 M1 > 5 cm Cheek High VM conv XRT 0 male
B638 T-2 N-0 M0 ≤ 5 cm Bladder Intermed VAC conv XRT 1 male
B642 T-2 N-0 M0 > 5 cm Chest Wall Intermed ? ? 3 male
B645 T-2 N-1 M1 > 5 cm Testis-Paratestis High VM conv XRT 12 male
B648 T-2 N-0 M1 ≤ 5 cm Pterygopalatine High IE conv XRT 3 female
B649 T-2 N-1 M0 > 5 cm Nasopharynx Intermed VAI conv XRT 13 male
B650 T-1 N-1 M0 ≤ 5 cm Leg Intermed VIE conv XRT 8 female
B655 T-2 N-0 M0 > 5 cm Infratemporal Fossa Intermed VIE hyper XRT 13 male
B660 T-1 N-1 M0 > 5 cm Testis-Paratestis Low VAC conv XRT 8 male
B665 T-2 N-0 M1 > 5 cm Forearm High Topo/Cyclo conv XRT 11 male
B666 T-2 N-0 M1 > 5 cm Uterus Intermed HD C conv XRT 6 female
B734 T-2 N-1 M1 > 5 cm Nasopharynx Intermed IE conv XRT 8 male
B737 T-1 N-0 M0 ≤ 5 cm Testis-Paratestis Low VA None 5 male
B738 T-2 N-0 M1 > 5 cm Retroperitoneum High Topo conv XRT 14 female
B739 T-1 N-0 M0 > 5 cm Forearm Intermed VIE conv XRT 6 male
B740 T-1 N-0 M0 ≤ 5 cm Leg Intermed VIE None 14 male
B743 T-1 N-0 M0 ≤ 5 cm Testis-Paratestis Low VA None 2 male
B746 T-2 N-1 M0 > 5 cm Perineum Intermed VIE hyper XRT 17 female
B747 T-1 N-0 M0 ≤ 5 cm Other Head & Neck Intermed VAI conv XRT 11 female
B753 T-2 N-1 M1 > 5 cm Leg High Topo conv XRT 5 male
B755 T-2 N-1 M0 ≤ 5 cm Leg Intermed VAC conv XRT 13 female
B757 T-2 N-0 M0 > 5 cm Other H&N & PM Extension Intermed VAI conv XRT 4 female
B759 T-2 N-0 M0 > 5 cm Parapharyngeal Area Intermed VAC conv XRT 15 male
B760 T-2 N-0 M0 > 5 cm Pelvis, Site Indeterminate Intermed VAC conv XRT 6 female
B763 T-1 ? M0 > 5 cm Thigh Intermed VIE conv XRT 18 male
B765 T-1 ? M0 ≤ 5 cm Testis-Paratestis Low VA None 4 male
B766 T-1 ? M0 ≤ 5 cm Testis-Paratestis Low VA None 3 male
B767 T-2 N-1 M1 > 5 cm Ovary High Topo conv XRT 13 female
B768 T-2 N-0 M0 ≤ 5 cm Nasopharynx Intermed VAI conv XRT 2 male
B771 T-2 ? M0 > 5 cm Nasopharynx Intermed VIE hyper XRT 2 male
B776 T-1 N-0 M0 > 5 cm Testis-Paratestis Low VA None 7 male
B777 T-2 N-1 M0 > 5 cm Parapharyngeal Area Intermed VAC hyper XRT 5 male
B780 T-2 N-1 M1 > 5 cm Chest Wall High Topo/Cyclo conv XRT 8 female
C337 ? ? M0 ? ? Intermed ? ? 3 ?
C338 ? ? M0 ? ? Intermed ? ? 10 ?
C339 ? ? M0 ? ? Intermed HD C conv XRT 3 ?
C340 ? ? M0 ? ? High ? ? 1 ?
C341 ? ? M1 ? ? High CPT-11/VCR conv XRT 5 ?
C342 ? ? M1 ? ? High CPT-11/VCR conv XRT 7 ?
C343 ? ? M0 ? ? Intermed ? ? 5 ?
ge 371 T-2 N-0 M0 > 5 cm Leg Intermed VAC conv XRT 11 male
Abbreviations
For TNM and Clinical Group definitions, see Tables 6 and 7.
For IRS-V Risk Group definitions, see Table 8.
Supplementary Figure 1. Multi-dimensional scaling of the tumors' cumulative
simulated ‘test’ set similarity matrix, derived from a second round of cross-validated
k-means meta-clustering. Colored dots indicate molecular classes, as given in the
legend and depicted in Figure 1C. The genes used to generate the similarity matrix
can be found in Supplementary Table 3.
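The caption does not spell out how the cumulative similarity matrix was accumulated. One common construction, assumed here and not taken from the thesis, scores each pair of tumors by the fraction of cross-validation runs in which both fell in the simulated test set and were assigned to the same cluster. The sketch below illustrates that idea only: the function name, the sample IDs s1 to s4 and the cluster labels are all hypothetical, and cluster labels are compared only within a run.

```python
from itertools import combinations


def co_clustering_similarity(runs, samples):
    """Fraction of runs in which each pair of samples shares a cluster.

    runs    : list of dicts, sample ID -> cluster label for the samples that
              were in that run's (simulated) test set.
    samples : list of all sample IDs.
    Returns a dict keyed by (sample_a, sample_b); the value is None when the
    pair never appeared together in any run.
    """
    pairs = list(combinations(samples, 2))
    together = {pair: 0 for pair in pairs}  # runs where the pair co-clustered
    counted = {pair: 0 for pair in pairs}   # runs where both were tested
    for labels in runs:
        for a, b in pairs:
            if a in labels and b in labels:
                counted[(a, b)] += 1
                if labels[a] == labels[b]:
                    together[(a, b)] += 1
    return {
        pair: (together[pair] / counted[pair] if counted[pair] else None)
        for pair in pairs
    }


# Toy input (hypothetical): three runs, each clustering a subset of samples.
runs = [
    {"s1": 1, "s3": 0, "s4": 0},
    {"s1": 0, "s2": 2, "s3": 1, "s4": 1},
    {"s1": 0, "s2": 1, "s3": 0},
]
samples = ["s1", "s2", "s3", "s4"]
sim = co_clustering_similarity(runs, samples)

# A dissimilarity (1 - similarity) is the usual input to an MDS routine.
dissim = {pair: None if s is None else 1.0 - s for pair, s in sim.items()}
```

Under this assumption, pairs that always co-cluster get a dissimilarity of 0 and plot close together after scaling, while pairs that never co-cluster get a dissimilarity of 1.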
Supplementary Table 2. Meta-clustering gene list for RMS and NRSTS
Affy ID Gene Name Symbol ANOVA F-Stat* Gene Group**
221605_s_at Pipecolic acid oxidase PIPOX 284.22 4
214451_at Transcription factor AP-2 beta (activating enhancer binding protein 2 beta) TFAP2B 260.8 4
213436_at Cannabinoid receptor 1 (brain) CNR1 155.72 4
206328_at Cadherin 15, M-cadherin (myotubule) CDH15 132.42 1
206447_at Elastase 2A ELA2A 124.52 4
212654_at Tropomyosin 2 (beta) TPM2 122.09 1
207076_s_at Argininosuccinate synthetase ASS 120.72 4
208195_at Titin TTN 116.93 1
214628_at Nescient helix loop helix 1 NHLH1 109.21 4
213832_at Clone 24405 mRNA sequence 106.42 4
203256_at Cadherin 3, type 1, P-cadherin (placental) CDH3 100.48 4
205054_at Nebulin NEB 96.28 2
215014_at MRNA; cDNA DKFZp547P042 (from clone DKFZp547P042) 95.89 4
210395_x_at Myosin, light polypeptide 4, alkali; atrial, embryonic MYL4 94.02 1
216054_x_at Myosin, light polypeptide 4, alkali; atrial, embryonic MYL4 92.22 1
203872_at Actin, alpha 1, skeletal muscle ACTA1 91.37 2
206327_s_at Cadherin 15, M-cadherin (myotubule) CDH15 90.81 1
206446_s_at Elastase 2A ELA2A 88.18 4
219829_at Integrin beta 1 binding protein (melusin) 2 ITGB1BP2 85.25 2
206633_at Cholinergic receptor, nicotinic, alpha polypeptide 1 (muscle) CHRNA1 85.12 1
219106_s_at Kelch repeat and BTB (POZ) domain containing 10 KBTBD10 84.94 2
210088_x_at Myosin, light polypeptide 4, alkali; atrial, embryonic MYL4 84.47 1
204810_s_at Creatine kinase, muscle CKM 80.69 2
205132_at Actin, alpha, cardiac muscle ACTC 78.58 1
209904_at Troponin C, slow TNNC1 78.52 2
214087_s_at Myosin binding protein C, slow type MYBPC1 76 2
204483_at Enolase 3 (beta, muscle) ENO3 75.62 1
205824_at Heat shock 27kDa protein 2 HSPB2 75.25 1
203862_s_at Actinin, alpha 2 ACTN2 75.12 2
205163_at Fast skeletal myosin light chain 2 MYLPF 74.51 2
34471_at Myosin, heavy polypeptide 8, skeletal muscle, perinatal MYH8 72.35 2
207077_at Elastase 2B ELA2B 70.38 4
215389_s_at Troponin T2, cardiac TNNT2 70.01 1
205940_at Myosin, heavy polypeptide 3, skeletal muscle, embryonic MYH3 69.24 2
206128_at Adrenergic, alpha-2C-, receptor ADRA2C 68.55 4
217274_x_at Myosin, light polypeptide 4, alkali; atrial, embryonic MYL4 68.34 1
204579_at Fibroblast growth factor receptor 4 FGFR4 68.23 4
205889_s_at Jak and microtubule interacting protein 2 KIAA0555 66.82 4
206657_s_at Myogenic factor 3 MYOD1 66.3 1
201810_s_at SH3-domain binding protein 5 (BTK-associated) SH3BP5 66.24 3
219728_at Titin immunoglobulin domain protein (myotilin) TTID 65.71 2
219772_s_at Small muscle protein, X-linked SMPX 63.86 2
211373_s_at Presenilin 2 (Alzheimer disease 4) PSEN2 62.62 1
202222_s_at Desmin DES 61.32 2
204850_s_at Doublecortex; lissencephaly, X-linked (doublecortin) DCX 60.68 4
205553_s_at Cysteine and glycine-rich protein 3 (cardiac LIM protein) CSRP3 60.01 2
205693_at Troponin T3, skeletal, fast TNNT3 59.46 2
207317_s_at Calsequestrin 2 (cardiac muscle) CASQ2 59.25 2
209888_s_at Myosin, light polypeptide 1, alkali; skeletal, fast MYL1 58.02 2
206393_at Troponin I, skeletal, fast TNNI2 57.14 2
221523_s_at Ras-related GTP binding D RRAGD 56.72 2
213782_s_at Myozenin 2 MYOZ2 56.55 2
204239_s_at Neuronatin NNAT 56.34 3
205935_at Forkhead box F1 FOXF1 56.28 4
202039_at TGFB1-induced anti-apoptotic factor 1 TIAF1 56.27 4
205485_at Ryanodine receptor 1 (skeletal) RYR1 56.19 1
208212_s_at Anaplastic lymphoma kinase (Ki-1) ALK 56.17 4
203638_s_at
Fibroblast growth factor
receptor 2
FGFR2 56.12 4
216887_s_at LIM domain binding 3 LDB3 55.44 2
203863_at Actinin, alpha 2 ACTN2 55.36 2
214027_x_at Desmin DES 54.96 2
205388_at Troponin C2, fast TNNC2 54.93 2
219438_at
Hypothetical protein
FLJ12650
FLJ12650 54.87 4
208204_s_at Caveolin 3 CAV3 53.35 2
205374_at Sarcolipin SLN 53.19 2
206375_s_at Heat shock 27kDa protein 3 HSPB3 53.15 2
212314_at KIAA0746 protein KIAA0746 53 4
210683_at Neurturin NRTN 52.96 4
57588_at
Solute carrier family 24
(sodium/potassium/calcium
exchanger), member 3
SLC24A3 52.05 4
205109_s_at
Rho guanine nucleotide
exchange factor (GEF) 4
ARHGEF4 52.04 4
207148_x_at Myozenin 2 MYOZ2 51.87 2
218625_at Neuritin 1 NRN1 51.69 4
205888_s_at
Jak and microtubule
interacting protein 2
KIAA0555 51.37 4
205431_s_at Bone morphogenetic protein 5 BMP5 51.27 4
203861_s_at Actinin, alpha 2 ACTN2 50.99 2
206717_at
Myosin, heavy polypeptide 8,
skeletal muscle, perinatal
MYH8 50.81 2
214365_at Tropomyosin 3 TPM3 50.49 2
213280_at
GTPase activating
Rap/RanGAP domain-like 4
GARNL4 50.45 4
206013_s_at Actin-like 6B ACTL6B 50.34 1
219144_at
Dual specificity phosphatase
26 (putative)
DUSP26 50.23 1
211570_s_at
Receptor-associated protein of
the synapse, 43kD
RAPSN 50.18 1
219632_s_at
Transient receptor potential
cation channel, subfamily V,
member 1
TRPV1 50.16 4
205872_x_at
Phosphodiesterase 4D
interacting protein
(myomegalin)
PDE4DIP 49.6 2
205430_at Bone morphogenetic protein 5 BMP5 49.54 4
212989_at Transmembrane protein 23 TMEM23 49.52 4
213712_at
Elongation of very long chain
fatty acids (FEN1/Elo2,
SUR4/Elo3, yeast)-like 2
ELOVL2 48.34 4
206089_at NEL-like 1 (chicken) NELL1 48.18 4
49111_at
MRNA; cDNA
DKFZp762M127 (from clone
DKFZp762M127)
48.13 4
208399_s_at Endothelin 3 EDN3 48 4
210047_at
Solute carrier family 11
(proton-coupled divalent metal
ion transporters), member 2
SLC11A2 47.79 4
210247_at Synapsin II SYN2 47.78 4
213155_at KIAA0523 protein KIAA0523 47.77 4
215367_at KIAA1614 protein KIAA1614 47.69 4
207282_s_at Myogenin (myogenic factor 4) MYOG 47.12 1
213825_at
Oligodendrocyte lineage
transcription factor 2
OLIG2 46.62 4
218469_at
Gremlin 1 homolog, cysteine
knot superfamily (Xenopus
laevis)
GREM1 46.39 4
204851_s_at
Doublecortex; lissencephaly,
X-linked (doublecortin)
DCX 46.12 4
218237_s_at
Solute carrier family 38,
member 1
SLC38A1 45.86 4
214774_x_at
Trinucleotide repeat
containing 9
TNRC9 45.3 4
201069_at
Matrix metalloproteinase 2
(gelatinase A, 72kDa
gelatinase, 72kDa type IV
collagenase)
MMP2 45.21 3
202920_at Ankyrin 2, neuronal ANK2 45.04 4
214439_x_at Bridging integrator 1 BIN1 44.85 2
212686_at
Protein phosphatase 1H (PP2C
domain containing)
PPM1H 44.81 4
202931_x_at Bridging integrator 1 BIN1 44.78 2
203566_s_at
Amylo-1, 6-glucosidase, 4-
alpha-glucanotransferase
(glycogen debranching
enzyme, glycogen storage
disease type III)
AGL 44.54 2
202966_at Calpain 6 CAPN6 44.48 4
206304_at Myosin binding protein H MYBPH 44.21 2
209090_s_at
SH3-domain GRB2-like
endophilin B1
SH3GLB1 44.14 4
203184_at
Fibrillin 2 (congenital
contractural arachnodactyly)
FBN2 44.1 3
216623_x_at
Trinucleotide repeat
containing 9
TNRC9 43.71 4
205902_at
Potassium intermediate/small
conductance calcium-activated
channel, subfamily N, member 3
KCNN3 43.49 4
215108_x_at
Trinucleotide repeat
containing 9
TNRC9 43.36 4
209123_at
Quinoid dihydropteridine
reductase
QDPR 43.29 4
203423_at
Retinol binding protein 1,
cellular
RBP1 42.7 1
214978_s_at
Protein tyrosine phosphatase,
receptor type, f polypeptide
(PTPRF), interacting protein
(liprin), alpha 4
PPFIA4 42.62 4
204623_at Trefoil factor 3 (intestinal) TFF3 42.41 4
43511_s_at
MRNA; cDNA
DKFZp762M127 (from clone
DKFZp762M127)
42.21 4
219090_at
Solute carrier family 24
(sodium/potassium/calcium
exchanger), member 3
SLC24A3 42.19 4
209465_x_at
Pleiotrophin (heparin binding
growth factor 8, neurite
growth-promoting factor 1)
PTN 41.96 3
222278_at Hypothetical LOC389393 LOC389393 41.72 4
206117_at Tropomyosin 1 (alpha) TPM1 40.94 2
220327_at Vestigial-like 3 VGL-3 40.82 1
206353_at
Cytochrome c oxidase subunit
VIa polypeptide 2
COX6A2 40.61 2
218974_at
Hypothetical protein
FLJ10159
FLJ10159 40.61 3
205610_at
Myomesin 1 (skelemin)
185kDa
MYOM1 40.45 2
204262_s_at
Presenilin 2 (Alzheimer
disease 4)
PSEN2 40.42 1
220273_at Interleukin 17B IL17B 40.4 2
218184_at Tubby like protein 4 TULP4 40.38 4
209283_at Crystallin, alpha B CRYAB 40.06 2
209460_at
4-aminobutyrate
aminotransferase
ABAT 39.89 4
218468_s_at
Gremlin 1 homolog, cysteine
knot superfamily (Xenopus
laevis)
GREM1 39.83 4
205736_at
Phosphoglycerate mutase 2
(muscle)
PGAM2 39.72 2
206160_at
Apolipoprotein B mRNA
editing enzyme, catalytic
polypeptide-like 2
APOBEC2 39.44 2
219316_s_at
Chromosome 14 open reading
frame 58
C14orf58 39.34 4
204979_s_at
SH3 domain binding glutamic
acid-rich protein
SH3BGR 39.34 2
219370_at
Reprimo, TP53 dependent G2
arrest mediator candidate
RPRM 38.74 3
209656_s_at Transmembrane protein 47 TMEM47 38.49 4
221755_at
EH domain binding protein 1-
like 1
EHBP1L1 38.44 2
212390_at
Phosphodiesterase 4D
interacting protein
(myomegalin)
PDE4DIP 38.15 2
209242_at Paternally expressed 3 PEG3 37.84 4
218959_at Homeo box C10 HOXC10 37.73 3
209742_s_at
Myosin, light polypeptide 2,
regulatory, cardiac, slow
MYL2 37.67 2
206228_at Paired box gene 2 PAX2 37.55 4
207424_at Myogenic factor 5 MYF5 37.39 3
213371_at LIM domain binding 3 LDB3 37.38 2
211804_s_at Cyclin-dependent kinase 2 CDK2 37.26 3
219779_at Zinc finger homeodomain 4 ZFHX4 37.23 3
219147_s_at
Chromosome 9 open reading
frame 95
C9orf95 37.18 4
214357_at LOC92346 LOC92346 37.11 1
212774_at Zinc finger protein 238 ZNF238 37.03 4
201636_at 201636_at 36.73 2
205177_at Troponin I, skeletal, slow TNNI1 36.52 2
219521_at
Beta-1,3-glucuronyltransferase
1 (glucuronosyltransferase P)
B3GAT1 36.08 4
38149_at
Rho GTPase activating protein
25
ARHGAP25 35.98 4
202587_s_at Adenylate kinase 1 AK1 35.73 2
221994_at PDZ and LIM domain 5 PDLIM5 35.52 2
213157_s_at KIAA0523 protein KIAA0523 35.06 4
202454_s_at
V-erb-b2 erythroblastic
leukemia viral oncogene
homolog 3 (avian)
ERBB3 34.92 1
212850_s_at
Low density lipoprotein
receptor-related protein 4
LRP4 34.9 3
210464_at DKFZP434F122 protein DKFZP434F122 34.73 4
212736_at
Chromosome 16 open reading
frame 45
C16orf45 34.72 4
209757_s_at
V-myc myelocytomatosis viral
related oncogene,
neuroblastoma derived (avian)
MYCN 34.38 4
213963_s_at
Sin3-associated polypeptide,
30kDa
SAP30 34.32 1
209883_at
Glycosyltransferase 25 domain
containing 2
GLT25D2 33.48 3
219225_at
PiggyBac transposable
element derived 5
PGBD5 33.05 4
44783_s_at
Hairy/enhancer-of-split related
with YRPW motif 1
HEY1 33 3
207876_s_at
Filamin C, gamma (actin
binding protein 280)
FLNC 32.97 1
217515_s_at
Calcium channel, voltage-
dependent, L type, alpha 1S
subunit
CACNA1S 32.96 2
209286_at
CDC42 effector protein (Rho
GTPase binding) 3
CDC42EP3 32.89 4
204364_s_at
Chromosome 2 open reading
frame 23
C2orf23 32.84 2
213486_at
Hypothetical protein
DKFZp761N09121
DKFZP761N09121 32.83 4
203072_at Myosin IE MYO1E 32.73 4
221612_at HT017 protein HT017 32.71 2
202603_at
A disintegrin and
metalloproteinase domain 10
ADAM10 32.71 4
214078_at
P21 (CDKN1A)-activated
kinase 3
PAK3 32.43 4
212311_at KIAA0746 protein KIAA0746 32.36 4
219732_at Plasticity related gene 3 PRG-3 32.28 4
212573_at KIAA0830 protein KIAA0830 31.95 2
207145_at Growth differentiation factor 8 GDF8 31.92 1
208228_s_at
Fibroblast growth factor
receptor 2 (bacteria-expressed
kinase, keratinocyte growth
factor receptor, craniofacial
dysostosis 1, Crouzon
syndrome, Pfeiffer syndrome,
Jackson-Weiss syndrome)
FGFR2 31.9 4
222287_at Triadin TRDN 31.83 2
207639_at
Frizzled homolog 9
(Drosophila)
FZD9 31.79 2
206850_at
RAS-related on chromosome
22
RRP22 31.64 4
206306_at Ryanodine receptor 3 RYR3 31.63 4
216265_x_at
Myosin, heavy polypeptide 7,
cardiac muscle, beta
MYH7 31.58 2
203864_s_at Actinin, alpha 2 ACTN2 31.49 2
204173_at Myosin light chain 1 slow a MLC1SA 31.4 2
204094_s_at TSC22 domain family 2 TSC22D2 31.36 4
210967_x_at
Calcium channel, voltage-
dependent, beta 1 subunit
CACNB1 31.34 2
204513_s_at
Engulfment and cell motility 1
(ced-12 homolog, C. elegans)
ELMO1 31.31 4
209459_s_at
4-aminobutyrate
aminotransferase
ABAT 31.15 4
202157_s_at
CUG triplet repeat, RNA
binding protein 2
CUGBP2 30.91 1
208025_s_at
High mobility group AT-hook
2
HMGA2 30.83 3
218829_s_at
Chromodomain helicase DNA
binding protein 7
CHD7 30.76 4
211237_s_at
Fibroblast growth factor
receptor 4
FGFR4 30.67 4
49306_at
Ras association (RalGDS/AF-
6) domain family 4
RASSF4 30.64 1
204036_at
Endothelial differentiation,
lysophosphatidic acid G-
protein-coupled receptor, 2
EDG2 30.51 3
205068_s_at
Rho GTPase activating protein
26
ARHGAP26 30.46 4
211737_x_at
Pleiotrophin (heparin binding
growth factor 8, neurite
growth-promoting factor 1)
PTN 30.45 3
206723_s_at
Endothelial differentiation,
lysophosphatidic acid G-
protein-coupled receptor, 4
EDG4 30.43 4
201578_at Podocalyxin-like PODXL 30.29 4
213050_at Cordon-bleu homolog (mouse) COBL 30.13 1
209928_s_at
Musculin (activated B-cell
factor-1)
MSC 30.08 3
221217_s_at Ataxin 2-binding protein 1 A2BP1 30 2
212747_at
Ankyrin repeat and sterile
alpha motif domain containing
1
ANKS1 29.96 4
211704_s_at Spindlin family, member 2 SPIN2 29.91 4
209106_at Nuclear receptor coactivator 1 NCOA1 29.89 4
219736_at Tripartite motif-containing 36 TRIM36 29.89 4
221861_at
MRNA; cDNA
DKFZp762M127 (from clone
DKFZp762M127)
29.76 4
213256_at
Membrane-associated ring
finger (C3HC4) 3
MARCH3 29.72 4
204882_at
Rho GTPase activating protein
25
ARHGAP25 29.6 4
219077_s_at
WW domain containing
oxidoreductase
WWOX 29.56 4
203625_x_at
S-phase kinase-associated
protein 2 (p45)
SKP2 29.53 4
205145_s_at
Myosin, light polypeptide 5,
regulatory
MYL5 29.49 2
205020_s_at ADP-ribosylation factor-like 4A ARL4A 29.49 3
209200_at
MADS box transcription
enhancer factor 2, polypeptide
C (myocyte enhancer factor
2C)
MEF2C 29.38 2
206372_at Myogenic factor 6 (herculin) MYF6 29.33 3
205589_at
Myosin, light polypeptide 3,
alkali; ventricular, skeletal,
slow
MYL3 29.23 2
212915_at
PDZ domain containing RING
finger 3
PDZRN3 29.05 4
205848_at Growth arrest-specific 2 GAS2 29.05 3
219963_at
Dual specificity phosphatase
13
DUSP13 29.02 2
212867_at Nuclear receptor coactivator 2 NCOA2 28.86 3
211042_x_at
Melanoma cell adhesion
molecule
MCAM 28.8 4
207066_at
Histidine rich calcium binding
protein
HRC 28.77 2
220359_s_at
Cyclic AMP-regulated
phosphoprotein, 21 kD
ARPP-21 28.7 2
209829_at
Chromosome 6 open reading
frame 32
C6orf32 28.62 4
211341_at
POU domain, class 4,
transcription factor 1
POU4F1 28.6 4
205113_at
Neurofilament 3 (150kDa
medium)
NEF3 28.58 3
205522_at Homeo box D4 HOXD4 28.48 3
205613_at B/K protein LOC51760 28.47 4
203680_at
Protein kinase, cAMP-
dependent, regulatory, type II,
beta
PRKAR2B 28.4 4
210715_s_at
Serine protease inhibitor,
Kunitz type, 2
SPINT2 28.37 4
213478_at KIAA1026 protein KIAA1026 28.27 3
209147_s_at
Phosphatidic acid phosphatase
type 2A
PPAP2A 28.2 4
209621_s_at PDZ and LIM domain 3 PDLIM3 28.01 1
220319_s_at
Myosin regulatory light chain
interacting protein
MYLIP 27.99 4
206216_at Serine/threonine kinase 23 STK23 27.95 2
219926_at Popeye domain containing 3 POPDC3 27.93 4
209869_at
Adrenergic, alpha-2A-,
receptor
ADRA2A 27.9 4
209700_x_at
Phosphodiesterase 4D
interacting protein
(myomegalin)
PDE4DIP 27.86 2
204724_s_at Collagen, type IX, alpha 3 COL9A3 27.58 3
218502_s_at
Trichorhinophalangeal
syndrome I
TRPS1 27.55 3
202674_s_at LIM domain 7 LMO7 27.5 2
221659_s_at
Myosin light chain 2,
precursor lymphocyte-specific
MYLC2PL 27.46 2
209199_s_at
MADS box transcription
enhancer factor 2, polypeptide
C (myocyte enhancer factor
2C)
MEF2C 27.35 2
205372_at Pleiomorphic adenoma gene 1 PLAG1 27.3 3
209568_s_at
Ral guanine nucleotide
dissociation stimulator-like 1
RGL1 27.07 3
201032_at
Bladder cancer associated
protein
BLCAP 26.87 1
204235_s_at
GULP, engulfment adaptor
PTB domain containing 1
GULP1 26.66 3
218839_at
Hairy/enhancer-of-split related
with YRPW motif 1
HEY1 26.6 3
218869_at Malonyl-CoA decarboxylase MLYCD 26.57 4
212254_s_at Dystonin DST 26.43 4
204612_at
Protein kinase (cAMP-
dependent, catalytic) inhibitor
alpha
PKIA 26.34 1
210227_at
Discs, large (Drosophila)
homolog-associated protein 2
DLGAP2 26.13 4
218816_at
Leucine rich repeat containing
1
LRRC1 26.04 3
202156_s_at
CUG triplet repeat, RNA
binding protein 2
CUGBP2 25.8 1
207302_at
Sarcoglycan, gamma (35kDa
dystrophin-associated
glycoprotein)
SGCG 25.75 2
205826_at
Myomesin (M-protein) 2,
165kDa
MYOM2 25.43 2
218876_at Brain specific protein CGI-38 25.37 2
214043_at
Protein tyrosine phosphatase,
receptor type, D
PTPRD 25.37 1
213624_at
Sphingomyelin
phosphodiesterase, acid-like
3A
SMPDL3A 25.36 1
213201_s_at Troponin T1, skeletal, slow TNNT1 25.35 2
204237_at
GULP, engulfment adaptor
PTB domain containing 1
GULP1 25.32 3
219181_at Lipase, endothelial LIPG 25.3 4
206394_at
Myosin binding protein C, fast
type
MYBPC2 25.28 2
204179_at Myoglobin MB 25.26 2
208211_s_at
Anaplastic lymphoma kinase
(Ki-1)
ALK 25.24 4
214769_at Chloride channel 4 CLCN4 25.14 4
221000_s_at
Kazal-type serine protease
inhibitor domain 1
KAZALD1 25.11 3
207558_s_at
Paired-like homeodomain
transcription factor 2
PITX2 25.09 1
205389_s_at Ankyrin 1, erythrocytic ANK1 24.96 2
202746_at Integral membrane protein 2A ITM2A 24.94 3
210202_s_at Bridging integrator 1 BIN1 24.92 2
203482_at
Chromosome 10 open reading
frame 6
C10orf6 24.62 4
219175_s_at
Solute carrier family 41,
member 3
SLC41A3 24.54 2
221035_s_at Testis expressed sequence 14 TEX14 24.48 4
201983_s_at
Epidermal growth factor
receptor (erythroblastic
leukemia viral (v-erb-b)
oncogene homolog, avian)
EGFR 24.46 3
202965_s_at Calpain 6 CAPN6 24.42 4
220615_s_at
Male sterility domain
containing 1
MLSTD1 24.31 4
204365_s_at
Chromosome 2 open reading
frame 23
C2orf23 24.11 2
204824_at Endonuclease G ENDOG 24.03 4
219648_at
Likely ortholog of mouse
dilute suppressor
DSU 23.97 4
204743_at Transgelin 3 TAGLN3 23.97 4
204570_at
Cytochrome c oxidase subunit
VIIa polypeptide 1 (muscle)
COX7A1 23.96 4
216331_at Integrin, alpha 7 ITGA7 23.96 1
209163_at Cytochrome b-561 CYB561 23.94 4
219148_at PDZ binding kinase PBK 23.76 4
219855_at
Nudix (nucleoside diphosphate
linked moiety X)-type motif
11
NUDT11 23.72 4
209243_s_at Paternally expressed 3 PEG3 23.65 4
219714_s_at
Calcium channel, voltage-
dependent, alpha 2/delta 3
subunit
CACNA2D3 23.65 4
203667_at Tubulin-specific chaperone a TBCA 23.65 4
219894_at MAGE-like 2 MAGEL2 23.63 3
206167_s_at
Rho GTPase activating protein
6
ARHGAP6 23.51 4
217066_s_at
Dystrophia myotonica-protein
kinase
DMPK 23.4 2
214846_s_at Alpha-kinase 3 ALPK3 23.4 4
91703_at
EH domain binding protein 1-
like 1
EHBP1L1 23.14 2
213005_s_at Ankyrin repeat domain 15 ANKRD15 23.12 4
221185_s_at IQ motif containing G IQCG 23.06 4
204823_at Neuron navigator 3 NAV3 23.05 3
205405_at
Sema domain, seven
thrombospondin repeats (type
1 and type 1-like),
transmembrane domain (TM)
and short cytoplasmic domain,
(semaphorin) 5A
SEMA5A 23.02 4
202517_at
Collapsin response mediator
protein 1
CRMP1 23.01 4
205295_at
Creatine kinase, mitochondrial
2 (sarcomeric)
CKMT2 22.99 2
206572_x_at
Zinc finger protein 85 (HPF4,
HTF1)
ZNF85 22.82 4
203510_at
Met proto-oncogene
(hepatocyte growth factor
receptor)
MET 22.78 4
214058_at
V-myc myelocytomatosis viral
oncogene homolog 1, lung
carcinoma derived (avian)
MYCL1 22.75 1
206290_s_at
Regulator of G-protein
signalling 7
RGS7 22.72 3
209107_x_at Nuclear receptor coactivator 1 NCOA1 22.69 4
213169_at
Clone TUA8 Cri-du-chat
region mRNA
22.67 4
210249_s_at Nuclear receptor coactivator 1 NCOA1 22.62 4
202158_s_at
CUG triplet repeat, RNA
binding protein 2
CUGBP2 22.57 1
219480_at Snail homolog 1 (Drosophila) SNAI1 22.57 4
47560_at Latrophilin 1 LPHN1 22.5 3
204737_s_at
Myosin, heavy polypeptide 6,
cardiac muscle, alpha
(cardiomyopathy, hypertrophic
1)
MYH6 22.5 2
206940_s_at
POU domain, class 4,
transcription factor 1
POU4F1 22.5 4
212845_at
Sterile alpha motif domain
containing 4
SAMD4 22.48 4
221959_at
Hypothetical protein
MGC39325
MGC39325 22.45 3
221567_at
Nucleolar protein 3 (apoptosis
repressor with CARD domain)
NOL3 22.44 4
222154_s_at
DNA polymerase-
transactivated protein 6
DNAPTP6 22.34 4
202455_at Histone deacetylase 5 HDAC5 22.27 4
204865_at
Carbonic anhydrase III,
muscle specific
CA3 22.19 2
204567_s_at
ATP-binding cassette, sub-
family G (WHITE), member 1
ABCG1 22.13 4
203139_at
Death-associated protein
kinase 1
DAPK1 22.11 4
218245_at
Likely ortholog of chicken
tsukushi
TSK 22.05 3
204379_s_at
Fibroblast growth factor
receptor 3 (achondroplasia,
thanatophoric dwarfism)
FGFR3 21.89 4
212360_at
Adenosine monophosphate
deaminase 2 (isoform L)
AMPD2 21.89 3
209466_x_at
Pleiotrophin (heparin binding
growth factor 8, neurite
growth-promoting factor 1)
PTN 21.84 3
216836_s_at
V-erb-b2 erythroblastic
leukemia viral oncogene
homolog 2,
neuro/glioblastoma derived
oncogene homolog (avian)
ERBB2 21.83 2
206774_at
FERM and PDZ domain
containing 1
FRMPD1 21.8 1
209074_s_at TU3A protein TU3A 21.68 4
219410_at Transmembrane protein 45A TMEM45A 21.44 3
221065_s_at
Carbohydrate (N-
acetylgalactosamine 4-0)
sulfotransferase 8
CHST8 21.43 4
204072_s_at Hypothetical protein CG003 13CDNA73 21.41 4
214110_s_at
Similar to lymphocyte-specific
protein 1
21.38 4
218918_at
Mannosidase, alpha, class 1C,
member 1
MAN1C1 21.34 4
205273_s_at Pitrilysin metalloproteinase 1 PITRM1 21.32 4
210305_at
Phosphodiesterase 4D
interacting protein
(myomegalin)
PDE4DIP 21.32 2
202565_s_at Supervillin SVIL 21.26 2
37996_s_at
Dystrophia myotonica-protein
kinase
DMPK 21.25 2
202345_s_at
Fatty acid binding protein 5
(psoriasis-associated)
FABP5 21.2 3
221322_at
Chromosome 7 open reading
frame 9
C7orf9 21.19 2
203706_s_at
Frizzled homolog 7
(Drosophila)
FZD7 21.1 3
219319_at
Hypoxia inducible factor 3,
alpha subunit
HIF3A 21.06 3
212509_s_at
Transmembrane anchor
protein 1
TMAP1 21.01 4
210632_s_at
Sarcoglycan, alpha (50kDa
dystrophin-associated
glycoprotein)
SGCA 21 1
221713_s_at
Hypothetical protein
FLJ12748
FLJ12748 20.98 1
202973_x_at
Family with sequence
similarity 13, member A1
FAM13A1 20.87 3
201829_at
Neuroepithelial cell
transforming gene 1
NET1 20.84 1
218820_at
Chromosome 14 open reading
frame 132
C14orf132 20.73 4
213609_s_at
Seizure related 6 homolog
(mouse)-like
SEZ6L 20.69 4
219259_at
Sema domain,
immunoglobulin domain (Ig),
transmembrane domain (TM)
and short cytoplasmic domain,
(semaphorin) 4A
SEMA4A 20.69 3
219682_s_at
T-box 3 (ulnar mammary
syndrome)
TBX3 20.66 3
219804_at Synaptopodin 2-like SYNPO2L 20.63 2
203178_at
Glycine amidinotransferase
(L-arginine:glycine
amidinotransferase)
GATM 20.62 1
205262_at
Potassium voltage-gated
channel, subfamily H (eag-
related), member 2
KCNH2 20.59 1
221958_s_at
Putative NFkB activating
protein 373
FLJ23091 20.57 3
211248_s_at Chordin CHRD 20.5 4
219740_at
Hypothetical protein
FLJ12505
FLJ12505 20.48 1
209361_s_at Poly(rC) binding protein 4 PCBP4 20.46 4
202478_at
Tribbles homolog 2
(Drosophila)
TRIB2 20.45 3
209791_at 209791_at 20.4 2
209692_at
Eyes absent homolog 2
(Drosophila)
EYA2 20.4 4
205619_s_at Mesenchyme homeo box 1 MEOX1 20.4 4
213262_at
Spastic ataxia of Charlevoix-
Saguenay (sacsin)
SACS 20.18 1
204301_at KIAA0711 gene product KIAA0711 20.17 3
205730_s_at
Actin binding LIM protein
family, member 3
ABLIM3 19.91 2
219195_at
Peroxisome proliferative
activated receptor, gamma,
coactivator 1, alpha
PPARGC1A 19.89 2
204540_at
Eukaryotic translation
elongation factor 1 alpha 2
EEF1A2 19.87 2
218665_at
Frizzled homolog 4
(Drosophila)
FZD4 19.87 3
215767_at
Chromosome 2 open reading
frame 10
C2orf10 19.85 4
217992_s_at
EF hand domain family,
member D2
EFHD2 19.8 4
219884_at LIM homeobox 6 LHX6 19.8 3
202769_at Cyclin G2 CCNG2 19.77 4
205948_at
Protein tyrosine phosphatase,
receptor type, T
PTPRT 19.69 4
204105_s_at
Neuronal cell adhesion
molecule
NRCAM 19.66 4
211039_at
Cholinergic receptor, nicotinic,
alpha polypeptide 1 (muscle)
CHRNA1 19.63 1
211596_s_at
Leucine-rich repeats and
immunoglobulin-like domains
1
LRIG1 19.61 4
204163_at Elastin microfibril interfacer 1 EMILIN1 19.43 3
218772_x_at Transmembrane protein 38B TMEM38B 19.42 2
206059_at
Zinc finger protein 91 (HPF7,
HTF10)
ZNF91 19.35 4
209435_s_at
Rho/rac guanine nucleotide
exchange factor (GEF) 2
ARHGEF2 19.27 4
41037_at TEA domain family member 4 TEAD4 19.26 1
203222_s_at
Transducin-like enhancer of
split 1 (E(sp1) homolog,
Drosophila)
TLE1 19.23 4
214121_x_at
PDZ and LIM domain 7
(enigma)
PDLIM7 19.21 2
204099_at
SWI/SNF related, matrix
associated, actin dependent
regulator of chromatin,
subfamily d, member 3
SMARCD3 19.21 1
214230_at
Cell division cycle 42 (GTP
binding protein, 25kDa)
CDC42 19.2 4
221854_at
Plakophilin 1 (ectodermal
dysplasia/skin fragility
syndrome)
PKP1 19.19 4
208712_at
Cyclin D1 (PRAD1:
parathyroid adenomatosis 1)
CCND1 19.1 3
204442_x_at
Latent transforming growth
factor beta binding protein 4
LTBP4 19.1 3
210271_at Neurogenic differentiation 2 NEUROD2 19.1 4
210327_s_at
Alanine-glyoxylate
aminotransferase (oxalosis I;
hyperoxaluria I;
glycolicaciduria; serine-
pyruvate aminotransferase)
AGXT 19.03 4
206121_at
Adenosine monophosphate
deaminase 1 (isoform M)
AMPD1 18.96 2
212554_at
CAP, adenylate cyclase-
associated protein, 2 (yeast)
CAP2 18.92 1
201206_s_at
Ribosome binding protein 1
homolog 180kDa (dog)
RRBP1 18.91 3
213547_at TBP-interacting protein TIP120B 18.73 2
202756_s_at Glypican 1 GPC1 18.72 2
218858_at DEP domain containing 6 DEPDC6 18.71 2
205968_at
Potassium voltage-gated
channel, delayed-rectifier,
subfamily S, member 3
KCNS3 18.69 4
209558_s_at
Huntingtin interacting protein-
1-related
HIP1R 18.68 4
213479_at Neuronal pentraxin II NPTX2 18.66 3
213684_s_at PDZ and LIM domain 5 PDLIM5 18.63 2
218824_at
Hypothetical protein
FLJ10781
FLJ10781 18.61 4
218376_s_at
Microtubule associated
monoxygenase, calponin and
LIM domain containing 1
MICAL1 18.59 4
206116_s_at Tropomyosin 1 (alpha) TPM1 18.56 2
219377_at
Family with sequence
similarity 59, member A
FAM59A 18.55 3
204319_s_at
Regulator of G-protein
signalling 10
RGS10 18.54 3
221524_s_at Ras-related GTP binding D RRAGD 18.52 2
201204_s_at
Ribosome binding protein 1
homolog 180kDa (dog)
RRBP1 18.48 3
219509_at Myozenin 1 MYOZ1 18.46 2
201787_at Fibulin 1 FBLN1 18.45 3
200636_s_at
Protein tyrosine phosphatase,
receptor type, F
PTPRF 18.4 4
204639_at Adenosine deaminase ADA 18.4 3
208711_s_at
Cyclin D1 (PRAD1:
parathyroid adenomatosis 1)
CCND1 18.32 3
215311_at
MRNA full length insert
cDNA clone EUROIMAGE
21920
18.31 4
221667_s_at Heat shock 22kDa protein 8 HSPB8 18.29 2
204726_at
Cadherin 13, H-cadherin
(heart)
CDH13 18.26 4
206836_at
Solute carrier family 6
(neurotransmitter transporter,
dopamine), member 3
SLC6A3 18.25 4
205116_at
Laminin, alpha 2 (merosin,
congenital muscular
dystrophy)
LAMA2 18.2 3
202145_at
Lymphocyte antigen 6
complex, locus E
LY6E 18.13 3
204121_at
Growth arrest and DNA-
damage-inducible, gamma
GADD45G 18.12 4
205444_at
ATPase, Ca++ transporting,
cardiac muscle, fast twitch 1
ATP2A1 18.12 2
206030_at
Aspartoacylase (aminoacylase
2, Canavan disease)
ASPA 18.05 4
204042_at
WAS protein family, member
3
WASF3 18.05 3
205380_at PDZ domain containing 1 PDZK1 18 4
209288_s_at
CDC42 effector protein (Rho
GTPase binding) 3
CDC42EP3 17.92 4
205549_at Purkinje cell protein 4 PCP4 17.89 4
206858_s_at Homeo box C6 HOXC6 17.74 3
204811_s_at
Calcium channel, voltage-
dependent, alpha 2/delta
subunit 2
CACNA2D2 17.72 4
40093_at
Lutheran blood group
(Auberger b antigen included)
LU 17.72 4
206502_s_at Insulinoma-associated 1 INSM1 17.71 4
217997_at
Pleckstrin homology-like
domain, family A, member 1
PHLDA1 17.66 3
218613_at
Pleckstrin and Sec7 domain
containing 3
PSD3 17.62 3
215605_at Nuclear receptor coactivator 2 NCOA2 17.55 3
211476_at Myozenin 2 MYOZ2 17.49 2
221019_s_at
Collectin sub-family member
12
COLEC12 17.49 3
210657_s_at Septin 4 SEPT4 17.43 1
204584_at L1 cell adhesion molecule L1CAM 17.36 4
218618_s_at
Fibronectin type III domain
containing 3B
FNDC3B 17.35 3
47550_at
Leucine zipper, putative tumor
suppressor 1
LZTS1 17.35 1
204642_at
Endothelial differentiation,
sphingolipid G-protein-
coupled receptor, 1
EDG1 17.35 4
209708_at Monooxygenase, DBH-like 1 MOXD1 17.3 3
204045_at
Transcription elongation factor
A (SII)-like 1
TCEAL1 17.26 2
205354_at
Guanidinoacetate N-
methyltransferase
GAMT 17.25 2
204995_at
Cyclin-dependent kinase 5,
regulatory subunit 1 (p35)
CDK5R1 17.25 4
202588_at Adenylate kinase 1 AK1 17.23 2
221211_s_at
Chromosome 21 open reading
frame 7
C21orf7 17.14 2
207306_at
Transcription factor 15 (basic
helix-loop-helix)
TCF15 17.14 4
221933_at Neuroligin 4, X-linked NLGN4X 17.13 4
213125_at Olfactomedin-like 2B OLFML2B 17.08 4
213362_at
Protein tyrosine phosphatase,
receptor type, D
PTPRD 17.08 1
206382_s_at
Brain-derived neurotrophic
factor
BDNF 17.06 3
212675_s_at 212675_s_at 17.06 4
204604_at PFTAIRE protein kinase 1 PFTK1 17.06 4
203986_at Genethonin 1 GENX-3414 17.05 2
203643_at Ets2 repressor factor ERF 17.04 3
209234_at Kinesin family member 1B KIF1B 17.03 4
206996_x_at
Calcium channel, voltage-
dependent, beta 1 subunit
CACNB1 17.03 2
202575_at
Cellular retinoic acid binding
protein 2
CRABP2 17.02 3
204217_s_at Reticulon 2 RTN2 17.02 1
213817_at
CDNA FLJ13601 fis, clone
PLACE1010069
16.86 4
212805_at KIAA0367 KIAA0367 16.75 2
203725_at
Growth arrest and DNA-
damage-inducible, alpha
GADD45A 16.73 4
212843_at
Neural cell adhesion molecule
1
NCAM1 16.68 1
205417_s_at
Dystroglycan 1 (dystrophin-
associated glycoprotein 1)
DAG1 16.6 4
91580_at HT017 protein HT017 16.53 2
203765_at
Grancalcin, EF-hand calcium
binding protein
GCA 16.53 4
209274_s_at
HESB like domain containing
2
HBLD2 16.45 4
213342_at
Yes-associated protein 1,
65kDa
YAP1 16.42 3
206100_at Carboxypeptidase M CPM 16.41 3
210567_s_at
S-phase kinase-associated
protein 2 (p45)
SKP2 16.39 4
214475_x_at Calpain 3, (p94) CAPN3 16.38 2
212144_at
Unc-84 homolog B (C.
elegans)
UNC84B 16.37 4
204281_at TEA domain family member 4 TEAD4 16.37 1
205738_s_at
Fatty acid binding protein 3,
muscle and heart (mammary-
derived growth inhibitor)
FABP3 16.35 2
204083_s_at Tropomyosin 2 (beta) TPM2 16.34 1
214607_at
P21 (CDKN1A)-activated
kinase 3
PAK3 16.32 4
203438_at Stanniocalcin 2 STC2 16.29 4
213543_at
Sarcoglycan, delta (35kDa
dystrophin-associated
glycoprotein)
SGCD 16.28 2
219042_at
Leucine zipper, putative tumor
suppressor 1
LZTS1 16.28 1
205150_s_at KIAA0644 gene product KIAA0644 16.24 3
221814_at
G protein-coupled receptor
124
GPR124 16.23 3
201689_s_at Tumor protein D52 TPD52 16.2 4
212813_at
Junctional adhesion molecule
3
JAM3 16.19 1
202947_s_at
Glycophorin C (Gerbich blood
group)
GYPC 16.15 4
201940_at Carboxypeptidase D CPD 16.06 4
211075_s_at
CD47 antigen (Rh-related
antigen, integrin-associated
signal transducer)
CD47 16.03 4
204588_s_at
Solute carrier family 7
(cationic amino acid
transporter, y+ system),
member 7
SLC7A7 15.99 1
203000_at Stathmin-like 2 STMN2 15.97 4
204014_at Dual specificity phosphatase 4 DUSP4 15.97 3
201849_at
BCL2/adenovirus E1B 19kDa
interacting protein 3
BNIP3 15.96 4
212361_s_at
ATPase, Ca++ transporting,
cardiac muscle, slow twitch 2
ATP2A2 15.89 4
206090_s_at Disrupted in schizophrenia 1 DISC1 15.86 4
204037_at 204037_at 15.72 3
218145_at
Tribbles homolog 3
(Drosophila)
TRIB3 15.7 4
205122_at
Transmembrane protein with
EGF-like and two follistatin-
like domains 1
TMEFF1 15.68 4
201272_at
Aldo-keto reductase family 1,
member B1 (aldose reductase)
AKR1B1 15.62 1
202364_at MAX interactor 1 MXI1 15.58 4
217979_at Tetraspanin 13 TSPAN13 15.55 2
218454_at Hypothetical protein FLJ22662 FLJ22662 15.54 3
209663_s_at Integrin, alpha 7 ITGA7 15.52 1
216840_s_at
Laminin, alpha 2 (merosin,
congenital muscular
dystrophy)
LAMA2 15.5 3
210036_s_at
Potassium voltage-gated
channel, subfamily H (eag-
related), member 2
KCNH2 15.42 1
216381_x_at
Aldo-keto reductase family 7,
member A3 (aflatoxin
aldehyde reductase)
AKR7A3 15.42 4
204115_at
Guanine nucleotide binding
protein (G protein), gamma 11
GNG11 15.35 3
212538_at Dedicator of cytokinesis 9 DOCK9 15.31 4
203001_s_at Stathmin-like 2 STMN2 15.26 4
212956_at KIAA0882 protein KIAA0882 15.26 4
221902_at
G protein-coupled receptor
153
GPR153 15.2 4
202724_s_at
Forkhead box O1A
(rhabdomyosarcoma)
FOXO1A 15.11 4
213009_s_at Tripartite motif-containing 37 TRIM37 15.09 4
207915_at 207915_at 15.06 4
204723_at
Sodium channel, voltage-
gated, type III, beta
SCN3B 15.05 2
202747_s_at Integral membrane protein 2A ITM2A 14.97 3
203724_s_at Rap2 interacting protein x RIPX 14.96 4
221969_at
Paired box gene 5 (B-cell
lineage specific activator)
PAX5 14.94 4
204086_at
Preferentially expressed
antigen in melanoma
PRAME 14.88 3
205384_at
FXYD domain containing ion
transport regulator 1
(phospholemman)
FXYD1 14.82 2
201939_at
Polo-like kinase 2
(Drosophila)
PLK2 14.72 3
204286_s_at
Phorbol-12-myristate-13-
acetate-induced protein 1
PMAIP1 14.65 2
209693_at Astrotactin 2 ASTN2 14.58 4
201194_at Selenoprotein W, 1 SEPW1 14.57 2
214895_s_at
A disintegrin and
metalloproteinase domain 10
ADAM10 14.53 4
208427_s_at
ELAV (embryonic lethal,
abnormal vision, Drosophila)-
like 2 (Hu antigen B)
ELAVL2 14.5 4
204653_at
Transcription factor AP-2
alpha (activating enhancer
binding protein 2 alpha)
TFAP2A 14.49 4
205399_at
Doublecortin and CaM kinase-
like 1
DCAMKL1 14.44 4
201906_s_at
CTD (carboxy-terminal
domain, RNA polymerase II,
polypeptide A) small
phosphatase-like
CTDSPL 14.43 3
212680_x_at
Protein phosphatase 1,
regulatory (inhibitor) subunit
14B
PPP1R14B 14.37 1
207869_s_at 207869_s_at 14.18 4
213325_at Poliovirus receptor-related 3 PVRL3 13.99 2
201811_x_at SH3-domain binding protein 5 (BTK-associated) SH3BP5 13.97 3
219213_at Junctional adhesion molecule 2 JAM2 12.99 2
* ANOVA F-statistic for genes differentially expressed between 5 k-means cluster centroids
(i.e., Molecular Classes, Figure 1C).
** Gene Groups identified by two-way gene and sample hierarchical clustering (Figure 1C).
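The F-statistic in the first footnote can be illustrated with a minimal sketch of a one-way ANOVA for a single probe set whose samples are grouped by cluster. The function name `anova_f` and the toy values are hypothetical and are not taken from the thesis data; the text does not state which software produced the tabulated values.

```python
def anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of value groups."""
    k = len(groups)                       # number of groups (five clusters in the thesis)
    n = sum(len(g) for g in groups)       # total number of samples
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Ratio of the two mean squares.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values for one probe set in five clusters.
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7], [1, 2, 3], [5, 6, 7]]
print(anova_f(toy))
```

A larger F indicates that the cluster means differ more, relative to the spread within clusters, which is why the table is ordered by this value.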
Supplementary Table 3. EASE Analysis of Metacluster Gene Groups
Group I n=67 genes All RMS overrepresented categories (not NRSTS)
System Gene Category List Hits * EASE score **
GO BP muscle development 12 1.81E-12
GO BP muscle contraction 8 1.16E-06
GO MF structural constituent of muscle 5 4.12E-05
GO BP organogenesis 13 9.88E-05
GO BP regulation of heart rate 4 1.81E-04
GO BP cell motility 8 1.95E-04
GO BP morphogenesis 13 2.90E-04
GO CC cytoskeleton 12 6.52E-04
GO CC muscle fiber 4 2.14E-03
GO BP myogenesis 3 2.90E-03
GO CC actin cytoskeleton 6 7.27E-03
GO CC cytoplasm 27 7.50E-03
GO BP circulation 4 8.82E-03
GO CC myofibril 3 1.33E-02
GO CC sarcomere 3 1.33E-02
GO MF structural constituent of cytoskeleton 4 1.55E-02
GO BP cellular process 30 2.05E-02
GO BP development 13 2.11E-02
GO MF structural molecule activity 8 2.13E-02
GO CC intracellular 35 2.67E-02
GO MF protein binding 12 3.43E-02
Group II n=131 genes ‘Well Differentiated’ Tumors overrepresented categories (i.e., A1 and E1 molecular classes)
System Gene Category List Hits EASE score
GO BP muscle contraction 39 3.13E-46
GO CC muscle fiber 29 7.39E-42
GO BP muscle development 33 2.36E-37
GO CC myofibril 23 1.20E-34
GO CC sarcomere 23 1.20E-34
GO BP cell motility 39 2.69E-32
GO BP striated muscle contraction 17 7.25E-26
GO MF structural constituent of muscle 19 3.54E-25
GO CC actin cytoskeleton 31 2.00E-23
GO BP regulation of muscle contraction 12 5.78E-16
GO CC cytoskeleton 37 3.32E-15
GO BP morphogenesis 39 1.56E-14
GO BP organogenesis 37 1.63E-14
GO CC striated muscle thick filament 9 1.11E-13
GO CC striated muscle thin filament 9 1.57E-12
GO MF structural molecule activity 29 2.02E-11
GO BP development 45 2.31E-11
GO CC cytoplasm 73 2.72E-11
GO CC myosin 13 6.98E-11
GO CC muscle myosin 9 7.42E-11
GO MF cytoskeletal protein binding 19 1.73E-10
GO CC myosin II 9 1.08E-08
GO MF actin binding 15 1.11E-08
GO CC troponin complex 6 1.45E-08
GO BP regulation of striated muscle contraction 5 2.57E-07
GO MF structural constituent of cytoskeleton 11 3.15E-07
GO MF calcium ion binding 18 7.68E-06
GO MF protein binding 32 9.06E-06
GO BP cellular process 76 1.51E-05
GO CC smooth endoplasmic reticulum 4 4.09E-05
GO CC Z disc 3 8.23E-04
GO CC intracellular 76 3.24E-03
GO MF tropomyosin binding 3 3.70E-03
GO MF metal ion binding 20 5.06E-03
GO BP cytoskeleton organization and biogenesis 9 5.15E-03
GO BP organelle organization and biogenesis 9 1.84E-02
GO MF motor activity 5 1.94E-02
GO MF protein phosphatase 2B binding 2 2.77E-02
GO BP regulation of heart rate 3 3.04E-02
GO CC muscle thin filament tropomyosin 2 3.63E-02
GO MF creatine kinase activity 2 3.68E-02
GO MF phosphatase binding 2 4.58E-02
GO MF protein phosphatase binding 2 4.58E-02
GO BP cytoplasm organization and biogenesis 9 4.59E-02
Group III n=94 genes ‘not mARMS’ overrepresented categories
System Gene Category List Hits EASE score
Chromosome Homo sapiens 8 14 3.38E-06
Chromosome Homo sapiens 8q 9 2.42E-04
GO BP morphogenesis 17 7.13E-04
GO BP development 23 8.46E-04
GO MF transcription factor activity 14 1.42E-03
GO MF signal transducer activity 24 4.90E-03
GO MF carbohydrate binding 5 7.35E-03
GO MF transcription regulator activity 15 9.63E-03
GO CC extracellular matrix 7 1.33E-02
GO MF frizzled receptor activity 2 1.34E-02
GO CC transcription factor complex 10 2.05E-02
GO MF extracellular matrix structural constituent 4 2.11E-02
GO BP regulation of transcription, DNA-dependent 19 2.19E-02
Chromosome Homo sapiens 8p 5 2.51E-02
GO BP regulation of transcription 19 2.54E-02
GO BP organogenesis 12 3.27E-02
GO BP transcription, DNA-dependent 19 3.33E-02
Chromosome Homo sapiens 12q 8 3.54E-02
GO MF sugar binding 4 3.58E-02
GO MF Wnt receptor activity 2 3.96E-02
GO BP regulation of cell cycle 7 4.08E-02
GO BP signal transduction 22 4.22E-02
GO BP transcription 19 4.50E-02
Group IV n=238 genes mARMS overrepresented categories
System Gene Category List Hits EASE score
GO BP neurogenesis 22 1.63E-06
GO BP morphogenesis 37 2.81E-06
GO BP organogenesis 33 1.22E-05
GO BP cellular process 118 3.95E-05
GO BP development 46 2.14E-04
GO BP cell differentiation 11 4.26E-04
GO MF fibroblast growth factor receptor activity 3 1.45E-03
GO BP central nervous system development 8 2.07E-03
GO BP intracellular signaling cascade 24 2.15E-03
GO MF cytoskeletal protein binding 13 2.26E-03
GO CC membrane 87 2.60E-03
GO BP cell communication 61 5.19E-03
GO MF protein binding 37 5.57E-03
GO BP MAPKKK cascade 5 7.04E-03
GO BP apoptosis 14 7.18E-03
GO BP programmed cell death 14 7.33E-03
GO MF calcium channel activity 5 8.38E-03
GO BP protein kinase cascade 8 1.16E-02
GO BP cell death 14 1.19E-02
GO BP death 14 1.29E-02
GO BP neuron differentiation 3 1.50E-02
GO MF transmembrane receptor protein tyrosine kinase activity 5 1.73E-02
GO CC integral to plasma membrane 31 1.76E-02
GO MF protein kinase activity 15 2.18E-02
GO BP actin cytoskeleton organization and biogenesis 5 2.42E-02
GO CC plasma membrane 40 2.67E-02
GO BP actin filament-based process 5 2.78E-02
GO CC cytoskeleton 23 2.88E-02
GO MF transmembrane receptor protein kinase activity 5 3.05E-02
GO MF pancreatic elastase II activity 2 3.13E-02
GO BP phosphate metabolism 18 3.43E-02
GO BP phosphorus metabolism 18 3.43E-02
GO CC integral to membrane 57 3.56E-02
GO BP regulation of transcription from Pol II promoter 9 3.68E-02
GO BP protein amino acid phosphorylation 14 4.11E-02
GO MF microtubule binding 3 4.64E-02
GO MF alpha2-adrenergic receptor activity 2 4.66E-02
Note: Overrepresentation analysis is relative to gene population of 22215 genes and ESTs from
Affymetrix U133A GeneChips
* Number of genes from Gene Group that belong to Gene Category for given Gene Ontology
** EASE score is a modified Fisher exact score (see Hosack et al.)
GO Abbreviations: BP= Biological Process, CC= Cellular Component, MF= Molecular Function
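As a rough sketch of the "modified Fisher exact score" mentioned in the footnote: the EASE score is commonly described as the one-tailed Fisher exact probability recomputed after removing one category gene from the list, which makes it more conservative for categories supported by few genes. The exact contingency-table construction varies between implementations, so the version below (one fewer list hit and one fewer list gene, population unchanged) is an assumption, not the thesis's procedure. The function names, and the population hit count of 150 in the usage line, are hypothetical; the table does not report population hits.

```python
from math import comb


def upper_tail(k, n, K, N):
    """P(X >= k) for n genes drawn without replacement from N, K of them in the category."""
    k = max(k, 0)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return favourable / comb(N, n)


def fisher_p(list_hits, list_size, pop_hits, pop_size):
    """Unmodified one-tailed Fisher exact (hypergeometric) probability."""
    return upper_tail(list_hits, list_size, pop_hits, pop_size)


def ease_score(list_hits, list_size, pop_hits, pop_size):
    """Assumed EASE variant: drop one category gene from the list, then recompute."""
    return upper_tail(list_hits - 1, list_size - 1, pop_hits, pop_size)


# Group I, "muscle development": 12 hits among 67 genes, population of 22215.
# The 150 population hits are a made-up placeholder.
print(ease_score(12, 67, 150, 22215), fisher_p(12, 67, 150, 22215))
```

Because one supporting gene is removed before testing, the penalized score is never smaller than the plain Fisher probability in this formulation.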
Supplementary Figure 2 Microarray analysis of novel markers that differentiate
RMS subclasses. A: All RMS subclasses express muscle cadherin (CDH15) and
fibroblast growth factor receptor 4 (FGFR4) but NRSTS tumors do not. Skeletal
muscle actin (ACTA1), routinely used in the differential diagnosis of RMS from other
small round blue cell tumors of childhood (SRBCT), appears to be expressed
predominantly in the A1 and E1 subclasses of mARMS and mERMS tumors,
respectively. This analysis suggests that this gene product might not be suitable for
the differential diagnosis of RMS from SRBCT but instead might be useful for the
identification of ‘well differentiated’ mARMS and mERMS tumors associated with
more favorable outcome (see Figure 6). B: Myosin light chain 1 (MYL1) is a good
candidate for the differential diagnosis of mARMS and mERMS subclasses associated
with more favorable outcome (A1 and E1). Myosin heavy chain 8 (MYH8), the
perinatal isoform, and myosin light chain 4 (MYL4) have potential for distinguishing
the E1 from the E2 mERMS subclass but do not appear suitable for the differential
diagnosis of A1 and A2 mARMS subclasses.
Supplementary Table 4. Clinical Characteristics of SNP-LOH Microarray Data Set
Category No. %
Histology
alveolar 29 39.7
embryonal 34 46.6
botryoid 2 2.7
spindle 4 5.5
undifferentiated or NRSTS * 4 5.5
Stage
1 14 19.4
2 15 20.8
3 30 41.7
4 13 18.1
Translocation (Alveolar Tumors)
PAX3-FKHR 17 58.6
PAX7-FKHR 8 27.6
Negative 4 13.8
Alive/Dead
Alive 53 72.6
Dead 20 27.4
Gender
Male 46 63.0
Female 27 37.0
Molecular Class **
mARMS 25 34.2
mERMS 43 58.9
mNRSTS 5 6.8
Age at Diagnosis: mean, median, (range) 6.59, 5 (0-20)
* ICR Review diagnosis of undifferentiated sarcoma or non-rhabdomyosarcoma
soft-tissue sarcoma
** Molecular Class as determined by microarray analysis
Supplementary Table 5. Proportion of tumors with LOH for regions with frequent allelic imbalance
% LOH by Molecular Class
G-Band LOH; mARMS N; mARMS %; mERMS N; mERMS %; mNRSTS N; mNRSTS %; Total % (all)
11p15.4 5 20 29 67 0 0 45
11p13 3 12 25 58 0 0 38
11q22.1 3 12 19 44 0 0 30
11q22.3 4 16 15 35 1 20 27
11q25 1 4 17 40 0 0 25
11q14.1 7 28 13 30 0 0 25
11p15.1 2 8 14 33 0 0 22
11p14.3 4 16 11 26 0 0 21
10q21.1 0 0 13 30 1 20 18
10q25.1 0 0 11 26 0 0 15
10p12.1 2 8 9 21 1 20 15
11p12 1 4 9 21 0 0 14
8q21.13 1 4 7 16 1 20 12
11q24.2 0 0 8 19 1 20 11
4q28.1 0 0 8 19 0 0 11
16q23.1 3 12 1 2 1 20 6.8
16q21 0 0 3 7 1 20 5.5
16q12.2 0 0 2 5 2 40 5.5
% LOH by Histology
CHR; ARMS; ALV NEG; EMB; S/B; NRSTS
11p 16 100 63 67 0
11q 20 60 50 50 20
10p 2 20 28 17 0
10q 1 20 34 17 20
16q 12 20 13 0 40
4q 8 20 25 17 0
8q 4 80 13 50 0
Abbreviations: CHR= chromosome arm, ARMS= fusion positive alveolar,
ALV NEG= fusion negative alveolar, EMB= embryonal, S/B= spindle or
botryoid, NRSTS= nonrhabdomyosarcoma STS
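The per-class percentages in the first part of this table appear to be the LOH counts divided by the molecular class sizes listed in Supplementary Table 4 (25 mARMS, 43 mERMS, 5 mNRSTS). The table does not state its denominators, so that is an assumption; the sketch below applies it, and the helper name `percent_loh` is hypothetical. The "Total" column is not reproduced here.

```python
# Class sizes taken from Supplementary Table 4 (assumed to be the denominators).
CLASS_SIZES = {"mARMS": 25, "mERMS": 43, "mNRSTS": 5}


def percent_loh(counts, class_sizes=CLASS_SIZES):
    """Percentage of tumors with LOH in each molecular class, rounded to whole numbers."""
    return {cls: round(100 * counts[cls] / size) for cls, size in class_sizes.items()}


# Counts for region 11p15.4 as printed in the table.
print(percent_loh({"mARMS": 5, "mERMS": 29, "mNRSTS": 0}))
```

For 11p15.4 this gives 20, 67 and 0, matching the printed row.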
Supplementary Figure 3 Microarray analysis of TFAP2β and HMGA2
expression levels. TFAP2β (blue) is expressed only in mARMS tumors and not in
fusion negative alveolar or other mERMS tumors. Mean levels in mARMS tumors
were >200-fold higher and highly significant (ANOVA, p<0.00001). HMGA2 (red) is
expressed at low levels in mARMS tumors but at >8-fold higher levels in mERMS
tumors, including the fusion negative alveolar tumors (ANOVA, p<0.00001).
Supplementary Table 6. Clinical characteristics of the ‘PAX-FKHR Expression
Signature’ data set.
Category Number %
Histology
alveolar 66 47.5
mixed alveolar/embryonal 4 2.9
embryonal 61 43.9
botryoid 2 1.4
spindle 6 4.3
Stage
1 24 20.5
2 19 16.2
3 49 41.9
4 25 21.4
Translocation (Alveolar Histology Only)
PAX3-FKHR 39 55.7
PAX7-FKHR 16 22.9
Negative 15 21.4
Alive/Dead
Alive 91 66.9
Dead 45 33.1
Gender
Male 85 65.9
Female 44 34.1
Primary Site
Orbit 5 3.8
Head/Neck 10 7.6
Parameningeal 19 14.4
GU-bladder/prostate 12 9.1
GU-other 25 18.9
Extremity 36 27.3
Other 25 18.9
Age at Diagnosis: mean, median, (range) 7.01, 5 (0-20)
NOTE: Some of the categories do not total correctly due to incomplete clinical covariate data.
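The note above is consistent with each percentage being computed over only the cases that have data for that covariate, so the denominators differ between covariates (139 cases for histology but 117 for stage). This is an inference from the printed numbers rather than a stated method, and the helper name `category_percentages` is hypothetical.

```python
def category_percentages(counts):
    """Percent of each category among the cases with data for this covariate."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts as printed in the table; each covariate uses its own total.
histology = {"alveolar": 66, "mixed alveolar/embryonal": 4,
             "embryonal": 61, "botryoid": 2, "spindle": 6}
stage = {"1": 24, "2": 19, "3": 49, "4": 25}

print(category_percentages(histology))
print(category_percentages(stage))
```

Both calls reproduce the tabulated percentages, for example 47.5 for alveolar histology and 20.5 for stage 1.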
Supplementary Table 7. Meta-clustering derived gene list from RMS primary
tumor analysis (no UDS/NRSTS).
Affy ID Gene Rank * Gene Name Symbol Mean Fold-Difference †
214451_at 1 transcription factor AP-2 beta (activating enhancer binding protein 2 beta) TFAP2B 24.9
207076_s_at 2 argininosuccinate synthetase ASS 15.8
213436_at 3 cannabinoid receptor 1 (brain) CNR1 23.1
212314_at 4 KIAA0746 protein KIAA0746 5.3
213832_at 5 Clone 24405 mRNA sequence --- 3.8
221605_s_at 6 pipecolic acid oxidase PIPOX 7.5
213155_at 7 KIAA0523 protein KIAA0523 4.0
203256_at 8 cadherin 3, type 1, P-cadherin (placental) CDH3 3.4
212736_at 9 chromosome 16 open reading frame 45 C16orf45 3.1
219147_s_at 10 chromosome 9 open reading frame 95 FLJ20559 3.9
219779_at 11 zinc finger homeodomain 4 ZFHX4 -3.0
203638_s_at 12 fibroblast growth factor receptor 2 (bacteria-expressed kinase, keratinocyte growth factor receptor, craniofacial dysostosis 1, Crouzon syndrome, Pfeiffer syndrome, Jackson-Weiss syndrome) FGFR2 3.3
213280_at 13 GTPase activating Rap/RanGAP domain-like 4 GARNL4 3.6
203072_at 14 myosin IE MYO1E 1.8
218625_at 15 neuritin 1 NRN1 5.4
211737_x_at 16 pleiotrophin (heparin binding growth factor 8, neurite growth-promoting factor 1) /// pleiotrophin (heparin binding growth factor 8, neurite growth-promoting factor 1) PTN -6.9
209459_s_at 17 4-aminobutyrate aminotransferase ABAT 7.3
205889_s_at 18 Jak and microtubule interacting protein 2 KIAA0555 2.9
215108_x_at 19 trinucleotide repeat containing 9 TNRC9 6.3
209123_at 20 quinoid dihydropteridine reductase QDPR 3.2
205935_at 21 forkhead box F1 FOXF1 5.1
218665_at 22 frizzled homolog 4 (Drosophila) FZD4 -2.0
218959_at 23 homeo box C10 HOXC10 -3.2
206089_at 24 NEL-like 1 (chicken) NELL1 4.7
218974_at 25 hypothetical protein FLJ10159 FLJ10159 -3.4
204850_s_at 26 doublecortex; lissencephaly, X-linked (doublecortin) DCX 6.9
49111_at 27 MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) --- 2.1
219225_at 28 piggyBac transposable element derived 5 PGBD5 3.2
206128_at 29 adrenergic, alpha-2C-, receptor ADRA2C 1.9
204094_s_at 30 TSC22 domain family, member 2 TSC22D2 3.4
201939_at 31 polo-like kinase 2 (Drosophila) PLK2 -3.8
203139_at 32 death-associated protein kinase 1 DAPK1 2.7
218839_at 33 hairy/enhancer-of-split related with YRPW motif 1 HEY1 -3.3
43511_s_at 34 MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) --- 2.0
201069_at 35 matrix metalloproteinase 2 (gelatinase A, 72kDa gelatinase, 72kDa type IV collagenase) MMP2 -2.3
213478_at 36 kazrin KIAA1026 -2.8
208399_s_at 37 endothelin 3 EDN3 5.3
219855_at 38 nudix (nucleoside diphosphate linked moiety X)-type motif 11 NUDT11 2.6
206059_at 39 zinc finger protein 91 (HPF7, HTF10) ZNF91 4.0
202966_at 40 calpain 6 CAPN6 4.1
201578_at 41 podocalyxin-like PODXL 3.3
208212_s_at 42 anaplastic lymphoma kinase (Ki-1) ALK 3.3
221861_at 43 MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) --- 2.0
202747_s_at 44 integral membrane protein 2A ITM2A -3.3
205619_s_at 45 mesenchyme homeo box 1 MEOX1 3.1
210247_at 46 synapsin II SYN2 2.3
206858_s_at 47 homeo box C6 HOXC6 -4.2
208025_s_at 48 high mobility group AT-hook 2 /// high mobility group AT-hook 2 HMGA2 -6.2
204036_at 49 endothelial differentiation, lysophosphatidic acid G-protein-coupled receptor, 2 EDG2 -2.6
214628_at 50 nescient helix loop helix 1 NHLH1 1.8
202603_at 51 A disintegrin and metalloproteinase domain 10 ADAM10 2.7
205848_at 52 growth arrest-specific 2 GAS2 -6.7
219480_at 53 snail homolog 1 (Drosophila) SNAI1 1.5
204882_at 54 Rho GTPase activating protein 25 ARHGAP25 2.6
218502_s_at 55 trichorhinophalangeal syndrome I TRPS1 -2.7
205372_at 56 pleiomorphic adenoma gene 1 PLAG1 -2.4
204823_at 57 neuron navigator 3 NAV3 -2.8
201628_s_at 58 Ras-related GTP binding A RRAGA 1.7
203705_s_at 59 frizzled homolog 7 (Drosophila) FZD7 -1.8
221958_s_at 60 chromosome 1 open reading frame 139 C1orf139 -2.3
218468_s_at 61 gremlin 1, cysteine knot superfamily, homolog (Xenopus laevis) GREM1 5.7
201938_at 62 CDK2-associated protein 1 CDK2AP1 2.0
204239_s_at 63 neuronatin NNAT -4.7
205020_s_at 64 ADP-ribosylation factor-like 4 ARL4 -1.9
219090_at 65 solute carrier family 24 (sodium/potassium/calcium exchanger), member 3 SLC24A3 2.5
219894_at 66 MAGE-like 2 MAGEL2 -2.1
219521_at 67 beta-1,3-glucuronyltransferase 1 (glucuronosyltransferase P) B3GAT1 2.6
221959_at 68 hypothetical protein MGC39325 MGC39325 -1.8
205068_s_at 69 Rho GTPase activating protein 26 ARHGAP26 2.3
201787_at 70 fibulin 1 FBLN1 -2.1
214724_at 71 DIX domain containing 1 DIXDC1 2.9
218184_at 72 tubby like protein 4 TULP4 2.0
202039_at 73 TGFB1-induced anti-apoptotic factor 1 TIAF1 2.2
205430_at 74 bone morphogenetic protein 5 BMP5 2.6
204105_s_at 75 neuronal cell adhesion molecule NRCAM 3.1
212867_at 76 Nuclear receptor coactivator 2 /// Nuclear receptor coactivator 2 NCOA2 -2.4
202478_at 77 tribbles homolog 2 (Drosophila) TRIB2 -2.9
213256_at 78 membrane-associated ring finger (C3HC4) 3 MARCH3 2.2
201983_s_at 79 epidermal growth factor receptor (erythroblastic leukemia viral (v-erb-b) oncogene homolog, avian) EGFR -3.0
202641_at 80 ADP-ribosylation factor-like 3 ARL3 2.0
218918_at 81 mannosidase, alpha, class 1C, member 1 MAN1C1 2.3
219438_at 82 hypothetical protein FLJ12650 FLJ12650 1.9
209163_at 83 cytochrome b-561 CYB561 2.4
212915_at 84 PDZ domain containing RING finger 3 PDZRN3 3.0
219632_s_at 85 transient receptor potential cation channel, subfamily V, member 1 TRPV1 2.2
218847_at 86 IGF-II mRNA-binding protein 2 IMP-2 -3.1
202345_s_at 87 fatty acid binding protein 5 (psoriasis-associated) FABP5 -3.8
215014_at 88 MRNA; cDNA DKFZp547P042 (from clone DKFZp547P042) --- 2.4
220319_s_at 89 myosin regulatory light chain interacting protein MYLIP 2.2
222278_at 90 hypothetical LOC389393 LOC389393 2.2
204639_at 91 adenosine deaminase ADA -1.6
218829_s_at 92 chromodomain helicase DNA binding protein 7 CHD7 2.5
209757_s_at 93 v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (avian) MYCN 2.7
204163_at 94 elastin microfibril interfacer 1 EMILIN1 -2.3
213825_at 95 oligodendrocyte lineage transcription factor 2 OLIG2 2.3
221539_at 96 eukaryotic translation initiation factor 4E binding protein 1 EIF4EBP1 -2.0
209693_at 97 astrotactin 2 ASTN2 1.8
218162_at 98 olfactomedin-like 3 OLFML3 -2.5
211704_s_at 99 spindlin family, member 2 /// spindlin family, member 2 /// spindlin-like protein 2 /// spindlin-like protein 2 SPIN2 /// SPIN-2 1.8
204513_s_at 100 engulfment and cell motility 1 (ced-12 homolog, C. elegans) ELMO1 2.0
206446_s_at 101 elastase 2A ELA2A 3.0
203667_at 102 tubulin-specific chaperone a TBCA 2.1
219497_s_at 103 B-cell CLL/lymphoma 11A (zinc finger protein) BCL11A 2.4
206645_s_at 104 nuclear receptor subfamily 0, group B, member 1 NR0B1 1.9
213712_at 105 elongation of very long chain fatty acids (FEN1/Elo2, SUR4/Elo3, yeast)-like 2 ELOVL2 2.8
201135_at 106 enoyl Coenzyme A hydratase, short chain, 1, mitochondrial ECHS1 1.8
205273_s_at 107 pitrilysin metalloproteinase 1 PITRM1 1.9
202455_at 108 histone deacetylase 5 HDAC5 1.8
212686_at 109 protein phosphatase 1H (PP2C domain containing) PPM1H 2.3
219926_at 110 popeye domain containing 3 POPDC3 2.3
219077_s_at 111 WW domain containing oxidoreductase WWOX 1.8
212360_at 112 adenosine monophosphate deaminase 2 (isoform L) AMPD2 -1.6
219148_at 113 PDZ binding kinase PBK 3.4
205109_s_at 114 Rho guanine nucleotide exchange factor (GEF) 4 ARHGEF4 2.1
209568_s_at 115 ral guanine nucleotide dissociation stimulator-like 1 RGL1 -1.9
218913_s_at 116 GEM interacting protein GMIP 1.5
212845_at 117 sterile alpha motif domain containing 4 SAMD4 2.0
221641_s_at 118 acyl-Coenzyme A thioesterase 2, mitochondrial ACATE2 1.8
208928_at 119 P450 (cytochrome) oxidoreductase POR 1.9
222154_s_at 120 DNA polymerase-transactivated protein 6 DNAPTP6 2.1
205946_at 121 vasoactive intestinal peptide receptor 2 VIPR2 1.7
217511_at 122 Kazal-type serine protease inhibitor domain 1 KAZALD1 -1.5
212747_at 123 ankyrin repeat and sterile alpha motif domain containing 1 ANKS1 2.4
204442_x_at 124 latent transforming growth factor beta binding protein 4 LTBP4 -1.8
204237_at 125 GULP, engulfment adaptor PTB domain containing 1 GULP1 -2.1
219511_s_at 126 synuclein, alpha interacting protein (synphilin) SNCAIP -1.8
213005_s_at 127 ankyrin repeat domain 15 ANKRD15 2.4
202947_s_at 128 glycophorin C (Gerbich blood group) GYPC 1.9
213125_at 129 olfactomedin-like 2B OLFML2B 2.7
221019_s_at 130 collectin sub-family member 12 /// collectin sub-family member 12 COLEC12 -2.1
209242_at 131 paternally expressed 3 PEG3 3.2
206745_at 132 homeo box C11 HOXC11 -1.5
218454_at 133 hypothetical protein FLJ22662 FLJ22662 -1.8
220911_s_at 134 KIAA1305 KIAA1305 -1.6
204623_at 135 trefoil factor 3 (intestinal) TFF3 3.0
202679_at 136 Niemann-Pick disease, type C1 NPC1 1.9
204897_at 137 prostaglandin E receptor 4 (subtype EP4) PTGER4 -3.0
206373_at 138 Zic family member 1 (odd-paired homolog, Drosophila) ZIC1 -3.7
212989_at 139 transmembrane protein 23 TMEM23 3.0
212188_at 140 potassium channel tetramerisation domain containing 12 /// potassium channel tetramerisation domain containing 12 KCTD12 -2.8
220262_s_at 141 EGF-like-domain, multiple 9 EGFL9 1.4
219884_at 142 LIM homeobox 6 LHX6 -1.4
205818_at 143 deleted in bladder cancer 1 DBC1 1.7
213362_at 144 Protein tyrosine phosphatase, receptor type, D PTPRD 2.3
205542_at 145 six transmembrane epithelial antigen of the prostate 1 STEAP1 -1.9
213519_s_at 146 laminin, alpha 2 (merosin, congenital muscular dystrophy) LAMA2 -2.3
221065_s_at 147 carbohydrate (N-acetylgalactosamine 4-0) sulfotransferase 8 CHST8 1.7
218777_at 148 chromosome 8 open reading frame 20 C8orf20 -1.3
203680_at 149 protein kinase, cAMP-dependent, regulatory, type II, beta PRKAR2B 3.1
219757_s_at 150 chromosome 14 open reading frame 101 C14orf101 2.1
212230_at 151 phosphatidic acid phosphatase type 2B PPAP2B -2.1
201121_s_at 152 progesterone receptor membrane component 1 PGRMC1 -1.6
219319_at 153 hypoxia inducible factor 3, alpha subunit HIF3A -1.7
219410_at 154 transmembrane protein 45A TMEM45A -2.5
219377_at 155 family with sequence similarity 59, member A FAM59A -1.5
206722_s_at 156 endothelial differentiation, lysophosphatidic acid G-protein-coupled receptor, 4 EDG4 1.4
209869_at 157 adrenergic, alpha-2A-, receptor /// adrenergic, alpha-2A-, receptor ADRA2A 2.0
219181_at 158 lipase, endothelial LIPG 2.6
202534_x_at 159 dihydrofolate reductase DHFR 2.0
211052_s_at 160 tubulin-specific chaperone d /// tubulin-specific chaperone d TBCD 1.8
202788_at 161 mitogen-activated protein kinase-activated protein kinase 3 MAPKAPK3 -1.3
204015_s_at 162 dual specificity phosphatase 4 DUSP4 -1.5
209829_at 163 chromosome 6 open reading frame 32 C6orf32 2.5
212509_s_at 164 matrix-remodelling associated 7 MXRA7 1.8
217992_s_at 165 EF-hand domain family, member D2 EFHD2 2.0
213009_s_at 166 tripartite motif-containing 37 TRIM37 1.7
206228_at 167 paired box gene 2 PAX2 1.9
218237_s_at 168 solute carrier family 38, member 1 SLC38A1 2.4
221578_at 169 Ras association (RalGDS/AF-6) domain family 4 RASSF4 1.9
205151_s_at 170 KIAA0644 gene product KIAA0644 -2.2
206767_at 171 RNA binding motif, single stranded interacting protein RBMS3 -1.4
219806_s_at 172 FN5 protein FN5 -1.6
204824_at 173 endonuclease G ENDOG 1.8
201204_s_at 174 Ribosome binding protein 1 homolog 180kDa (dog) RRBP1 -2.0
207424_at 175 myogenic factor 5 MYF5 -3.1
202517_at 176 collapsin response mediator protein 1 CRMP1 2.4
210715_s_at 177 serine protease inhibitor, Kunitz type, 2 SPINT2 2.5
203417_at 178 microfibrillar-associated protein 2 MFAP2 -2.5
209091_s_at 179 SH3-domain GRB2-like endophilin B1 SH3GLB1 2.1
211373_s_at 180 presenilin 2 (Alzheimer disease 4) PSEN2 2.1
209082_s_at 181 collagen, type XVIII, alpha 1 COL18A1 2.2
206167_s_at 182 Rho GTPase activating protein 6 ARHGAP6 2.4
201995_at 183 exostoses (multiple) 1 EXT1 -1.7
212313_at 184 CHMP family, member 7 CHMP7 -1.6
212563_at 185 block of proliferation 1 BOP1 -1.9
205811_at 186 polymerase (DNA directed), gamma 2, accessory subunit POLG2 1.4
203233_at 187 interleukin 4 receptor IL4R 1.8
207859_s_at 188 cholinergic receptor, nicotinic, beta polypeptide 3 CHRNB3 2.2
202575_at 189 cellular retinoic acid binding protein 2 CRABP2 -1.8
218394_at 190 leucine zipper domain protein FLJ22386 1.3
202920_at 191 ankyrin 2, neuronal ANK2 2.7
221260_s_at 192 chromosome 12 open reading frame 22 /// chromosome 12 open reading frame 22 C12orf22 1.9
210683_at 193 neurturin NRTN 2.8
221447_s_at 194 glycosyltransferase 8 domain containing 2 /// glycosyltransferase 8 domain containing 2 GLT8D2 -1.7
213169_at 195 Clone TUA8 Cri-du-chat region mRNA --- 2.2
209692_at 196 eyes absent homolog 2 (Drosophila) EYA2 2.1
211113_s_at 197 ATP-binding cassette, sub-family G (WHITE), member 1 ABCG1 1.3
204086_at 198 preferentially expressed antigen in melanoma PRAME -3.8
207077_at 199 elastase 2B ELA2B 2.4
212110_at 200 solute carrier family 39 (zinc transporter), member 14 SLC39A14 -1.9
221004_s_at 201 integral membrane protein 2C /// integral membrane protein 2C ITM2C -1.9
204570_at 202 cytochrome c oxidase subunit VIIa polypeptide 1 (muscle) COX7A1 3.4
213147_at 203 homeo box A10 HOXA10 -1.5
204798_at 204 v-myb myeloblastosis viral oncogene homolog (avian) MYB 1.7
210227_at 205 discs, large (Drosophila) homolog-associated protein 2 DLGAP2 1.9
222111_at 206 Family with sequence similarity 63, member B KIAA1164 1.5
220462_at 207 TGF-beta induced apoptosis protein 2 TAIP-2 -1.6
202800_at 208 solute carrier family 1 (glial high affinity glutamate transporter), member 3 SLC1A3 -1.6
210139_s_at 209 peripheral myelin protein 22 PMP22 -2.2
218337_at 210 retinoic acid induced 16 RAI16 -1.4
58780_s_at 211 hypothetical protein FLJ10357 FLJ10357 -1.7
210946_at 212 phosphatidic acid phosphatase type 2A PPAP2A 2.1
202098_s_at 213 HMT1 hnRNP methyltransferase-like 1 (S. cerevisiae) HRMT1L1 1.5
212956_at 214 KIAA0882 protein KIAA0882 2.4
221185_s_at 215 IQ motif containing G IQCG 2.3
219154_at 216 Hypothetical LOC144404 --- 1.5
218376_s_at 217 microtubule associated monoxygenase, calponin and LIM domain containing 1 MICAL1 2.3
212713_at 218 microfibrillar-associated protein 4 MFAP4 -2.0
214110_s_at 219 Similar to lymphocyte-specific protein 1 --- 2.3
205376_at 220 inositol polyphosphate-4-phosphatase, type II, 105kDa INPP4B -1.4
204743_at 221 transgelin 3 TAGLN3 2.0
206090_s_at 222 disrupted in schizophrenia 1 DISC1 1.9
210464_at 223 DKFZP434F122 protein DKFZP434F122 2.1
201126_s_at 224 mannosyl (alpha-1,3-)-glycoprotein beta-1,2-N-acetylglucosaminyltransferase MGAT1 -1.4
204483_at 225 enolase 3 (beta, muscle) ENO3 2.9
209656_s_at 226 transmembrane protein 47 TMEM47 2.7
207144_s_at 227 Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 1 CITED1 -1.7
218364_at 228 leucine rich repeat (in FLII) interacting protein 2 LRRFIP2 1.7
218125_s_at 229 coiled-coil domain containing 25 CCDC25 -1.3
218695_at 230 exosome component 4 EXOSC4 -1.5
219316_s_at 231 chromosome 14 open reading frame 58 C14orf58 2.0
208958_at 232 thioredoxin domain containing 4 (endoplasmic reticulum) TXNDC4 1.2
212864_at 233 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 2 CDS2 1.4
219427_at 234 FAT tumor suppressor homolog 4 (Drosophila) FAT4 -1.4
214078_at 235 P21 (CDKN1A)-activated kinase 3 PAK3 2.6
200972_at 236 tetraspanin 3 TSPAN3 1.7
201455_s_at 237 aminopeptidase puromycin sensitive NPEPPS 1.6
200636_s_at 238 protein tyrosine phosphatase, receptor type, F PTPRF 1.9
219291_at 239 DTW domain containing 1 DTWD1 1.4
204653_at 240 transcription factor AP-2 alpha (activating enhancer binding protein 2 alpha) TFAP2A 2.3
204121_at 241 growth arrest and DNA-damage-inducible, gamma GADD45G 1.6
219038_at 242 MORC family CW-type zinc finger 4 MORC4 2.0
204334_at 243 Kruppel-like factor 7 (ubiquitous) KLF7 1.5
212548_s_at 244 KIAA0826 KIAA0826 1.5
212838_at 245 dynamin binding protein DNMBP 1.5
205417_s_at 246 dystroglycan 1 (dystrophin-associated glycoprotein 1) DAG1 2.1
213752_at 247 hypothetical protein FLJ43806 FLJ43806 -1.3
209234_at 248 kinesin family member 1B KIF1B 1.8
202894_at 249 EPH receptor B4 EPHB4 -1.4
207595_s_at 250 bone morphogenetic protein 1 BMP1 -1.3
213186_at 251 zinc finger DAZ interacting protein 3 DZIP3 1.9
217890_s_at 252 parvin, alpha PARVA -1.5
214874_at 253 --- --- 1.6
213624_at 254 sphingomyelin phosphodiesterase, acid-like 3A SMPDL3A 2.1
213848_at 255 Dual specificity phosphatase 7 DUSP7 -1.6
213113_s_at 256 solute carrier family 43, member 3 SLC43A3 -1.4
214721_x_at 257 CDC42 effector protein (Rho GTPase binding) 4 CDC42EP4 -1.3
209205_s_at 258 LIM domain only 4 LMO4 1.7
203071_at 259 sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B SEMA3B -1.4
218245_at 260 tsukushi TSK -1.2
206657_s_at 261 myogenic factor 3 MYOD1 1.8
208704_x_at 262 amyloid beta (A4) precursor-like protein 2 APLP2 -1.5
219548_at 263 zinc finger protein 16 (KOX 9) ZNF16 -1.3
208779_x_at 264 discoidin domain receptor family, member 1 DDR1 1.4
207071_s_at 265 aconitase 1, soluble ACO1 1.4
210380_s_at 266 calcium channel, voltage-dependent, alpha 1G subunit CACNA1G -1.3
205903_s_at 267 potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3 KCNN3 1.7
221556_at 268 CDC14 cell division cycle 14 homolog B (S. cerevisiae) CDC14B 1.8
219732_at 269 plasticity related gene 3 PRG-3 2.6
201369_s_at 270 zinc finger protein 36, C3H type-like 2 ZFP36L2 -1.7
218346_s_at 271 sestrin 1 SESN1 1.6
203438_at 272 stanniocalcin 2 STC2 1.9
204838_s_at 273 mutL homolog 3 (E. coli) MLH3 1.5
211075_s_at 274 CD47 antigen (Rh-related antigen, integrin-associated signal transducer) /// CD47 antigen (Rh-related antigen, integrin-associated signal transducer) CD47 2.2
219736_at 275 tripartite motif-containing 36 TRIM36 1.8
207121_s_at 276 mitogen-activated protein kinase 6 MAPK6 1.6
209814_at 277 zinc finger protein 330 ZNF330 1.5
219953_s_at 278 chromosome 11 open reading frame 17 C11orf17 -1.3
218820_at 279 chromosome 14 open reading frame 132 C14orf132 2.2
202081_at 280 immediate early response 2 IER2 -1.6
200039_s_at 281 proteasome (prosome, macropain) subunit, beta type, 2 /// proteasome (prosome, macropain) subunit, beta type, 2 PSMB2 1.5
212144_at 282 unc-84 homolog B (C. elegans) UNC84B 1.4
219825_at 283 cytochrome P450, family 26, subfamily B, polypeptide 1 CYP26B1 -1.7
206774_at 284 FERM and PDZ domain containing 1 FRMPD1 1.6
212830_at 285 EGF-like-domain, multiple 5 EGFL5 1.4
203729_at 286 epithelial membrane protein 3 EMP3 -2.1
204418_x_at 287 glutathione S-transferase M2 (muscle) GSTM2 -1.4
213220_at 288 hypothetical protein LOC92482 LOC92482 1.4
212228_s_at 289 chromosome 16 open reading frame 49 C16orf49 1.4
201641_at 290 bone marrow stromal cell antigen 2 BST2 -2.3
203912_s_at 291 deoxyribonuclease I-like 1 DNASE1L1 1.5
207714_s_at 292 serine (or cysteine) proteinase inhibitor, clade H (heat shock protein 47), member 1, (collagen binding protein 1) SERPINH1 -1.7
214719_at 293 hypothetical protein LOC283537 LOC283537 1.6
217955_at 294 BCL2-like 13 (apoptosis facilitator) BCL2L13 1.5
204854_at 295 leprecan-like 2 LEPREL2 -1.3
207643_s_at 296 tumor necrosis factor receptor superfamily, member 1A TNFRSF1A -1.5
203765_at 297 grancalcin, EF-hand calcium binding protein /// grancalcin, EF-hand calcium binding protein GCA 1.9
203354_s_at 298 pleckstrin and Sec7 domain containing 3 PSD3 -1.8
221523_s_at 299 Ras-related GTP binding D RRAGD 2.2
207282_s_at 300 myogenin (myogenic factor 4) MYOG 2.2
209288_s_at 301 CDC42 effector protein (Rho GTPase binding) 3 CDC42EP3 2.2
213456_at 302 sclerostin domain containing 1 SOSTDC1 -1.8
200629_at 303 tryptophanyl-tRNA synthetase WARS 1.8
204078_at 304 synaptonemal complex protein SC65 SC65 -1.4
206306_at 305 ryanodine receptor 3 RYR3 2.9
212254_s_at 306 dystonin DST 2.0
204642_at 307 endothelial differentiation, sphingolipid G-protein-coupled receptor, 1 EDG1 1.3
200814_at 308 proteasome (prosome, macropain) activator subunit 1 (PA28 alpha) PSME1 1.5
203725_at 309 growth arrest and DNA-damage-inducible, alpha GADD45A 1.7
200813_s_at 310 platelet-activating factor acetylhydrolase, isoform Ib, alpha subunit 45kDa PAFAH1B1 1.4
201599_at 311 ornithine aminotransferase (gyrate atrophy) OAT 1.8
208711_s_at 312 cyclin D1 (PRAD1: parathyroid adenomatosis 1) CCND1 -1.5
203678_at 313 KIAA1018 protein KIAA1018 1.5
204589_at 314 NUAK family, SNF1-like kinase, 1 NUAK1 -1.9
217833_at 315 synaptotagmin binding, cytoplasmic RNA interacting protein SYNCRIP 1.6
202910_s_at 316 CD97 antigen CD97 -1.3
209760_at 317 KIAA0922 protein KIAA0922 1.5
212277_at 318 myotubularin related protein 4 MTMR4 1.5
65718_at 319 G protein-coupled receptor 124 GPR124 -1.3
218526_s_at 320 RAN guanine nucleotide release factor RANGNRF 1.4
207414_s_at 321 proprotein convertase subtilisin/kexin type 6 PCSK6 1.6
218070_s_at 322 GDP-mannose pyrophosphorylase A GMPPA -1.3
219416_at 323 scavenger receptor class A, member 3 SCARA3 -1.3
201941_at 324 carboxypeptidase D CPD 1.5
211518_s_at 325 bone morphogenetic protein 4 BMP4 -1.3
213050_at 326 cordon-bleu homolog (mouse) COBL 1.9
209129_at 327 thyroid hormone receptor interactor 6 TRIP6 -1.6
208924_at 328 ring finger protein 11 RNF11 1.8
212632_at 329 Syntaxin 7 STX7 1.4
202734_at 330 thyroid hormone receptor interactor 10 TRIP10 -1.3
206940_s_at 331 POU domain, class 4, transcription factor 1 POU4F1 3.2
210249_s_at 332 nuclear receptor coactivator 1 NCOA1 1.7
218079_s_at 333 zinc finger protein 403 ZNF403 1.4
216835_s_at 334 docking protein 1, 62kDa (downstream of tyrosine kinase 1) DOK1 -1.3
203643_at 335 Ets2 repressor factor ERF -1.3
218761_at 336 ring finger protein 111 RNF111 1.5
212356_at 337 KIAA0323 KIAA0323 -1.2
216044_x_at 338 hypothetical LOC388650 LOC388650 1.3
204214_s_at 339 RAB32, member RAS oncogene family RAB32 -1.3
218618_s_at 340 fibronectin type III domain containing 3B FNDC3B -2.0
200942_s_at 341 heat shock factor binding protein 1 HSBP1 1.5
219648_at 342 dilute suppressor DSU 1.9
208112_x_at 343 EH-domain containing 1 EHD1 -1.4
220334_at 344 regulator of G-protein signalling 17 RGS17 1.5
219215_s_at 345 solute carrier family 39 (zinc transporter), member 4 SLC39A4 -1.3
217902_s_at 346 hect domain and RLD 2 HERC2 1.7
202429_s_at 347 protein phosphatase 3 (formerly 2B), catalytic subunit, alpha isoform (calcineurin A alpha) PPP3CA 1.9
221591_s_at 348 family with sequence similarity 64, member A FAM64A 1.6
204579_at 349 fibroblast growth factor receptor 4 FGFR4 2.1
205260_s_at 350 acylphosphatase 1, erythrocyte (common) type ACYP1 1.6
221035_s_at 351 testis expressed sequence 14 /// testis expressed sequence 14 TEX14 2.1
214394_x_at 352 --- --- -1.4
202761_s_at 353 spectrin repeat containing, nuclear envelope 2 SYNE2 1.8
208786_s_at 354 microtubule-associated protein 1 light chain 3 beta MAP1LC3B 1.5
201380_at 355 cartilage associated protein CRTAP -1.3
200708_at 356 glutamic-oxaloacetic transaminase 2, mitochondrial (aspartate aminotransferase 2) GOT2 1.3
213269_at 357 zinc finger protein 248 ZNF248 1.4
221740_x_at 358 C114 SLIT-like testicular protein /// Hypothetical LOC388397 LOC474170 1.3
204550_x_at 359 glutathione S-transferase M1 GSTM1 -1.3
204060_s_at 360 protein kinase, X-linked /// protein kinase, Y-linked PRKX /// PRKY 1.6
209984_at 361 jumonji domain containing 2C JMJD2C 1.2
209274_s_at 362 HESB like domain containing 2 HBLD2 1.6
219118_at 363 FK506 binding protein 11, 19 kDa FKBP11 -1.5
218902_at 364 Notch homolog 1, translocation-associated (Drosophila) NOTCH1 1.5
209983_s_at 365 neurexin 2 NRXN2 -1.3
204119_s_at 366 adenosine kinase ADK 1.5
212677_s_at 367 KIAA0582 KIAA0582 1.8
205088_at 368 chromosome X open reading frame 6 CXorf6 1.5
219203_at 369 chromosome 14 open reading frame 122 C14orf122 1.3
218167_at 370 hypothetical protein LOC51321 LOC51321 1.6
218628_at 371 CGI-116 protein CGI-116 1.5
216397_s_at 372 block of proliferation 1 BOP1 -1.3
202145_at 373 lymphocyte antigen 6 complex, locus E LY6E -1.6
201286_at 374 syndecan 1 SDC1 -1.5
213327_s_at 375 ubiquitin specific protease 12 USP12 1.4
215056_at 376 Clone 23695 mRNA sequence --- 1.4
52940_at 377 single Ig IL-1R-related molecule SIGIRR -1.3
213435_at 378 SATB family member 2 SATB2 2.2
209356_x_at 379 EGF-containing fibulin-like extracellular matrix protein 2 EFEMP2 -1.6
212870_at 380 Ras association (RalGDS/AF-6) domain family 3 RASSF3 1.5
201905_s_at 381 CTD (carboxy-terminal domain, RNA polymerase II, polypeptide A) small phosphatase-like CTDSPL -1.2
203488_at 382 latrophilin 1 LPHN1 -1.3
218139_s_at 383 chromosome 14 open reading frame 108 C14orf108 1.7
212538_at 384 dedicator of cytokinesis 9 DOCK9 2.2
215717_s_at 385 fibrillin 2 (congenital contractural arachnodactyly) FBN2 -1.2
209004_s_at 386 F-box and leucine-rich repeat protein 5 FBXL5 1.8
205226_at 387 platelet-derived growth factor receptor-like PDGFRL -1.7
216836_s_at 388 v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (avian) ERBB2 -1.4
218151_x_at 389 G protein-coupled receptor 172A GPR172A -1.2
204072_s_at 390 hypothetical protein CG003 13CDNA73 2.1
212068_s_at 391 KIAA0515 KIAA0515 1.4
201810_s_at 392 SH3-domain binding protein 5 (BTK-associated) SH3BP5 -1.5
204316_at 393 regulator of G-protein signalling 10 RGS10 -1.2
201207_at 394 tumor necrosis factor, alpha-induced protein 1 (endothelial) TNFAIP1 1.3
213025_at 395 THUMP domain containing 1 THUMPD1 1.5
203700_s_at 396 deiodinase, iodothyronine, type II DIO2 1.6
217787_s_at 397 UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 2 (GalNAc-T2) GALNT2 -1.3
204075_s_at 398 glycine-, glutamate-, thienylcyclohexylpiperidine-binding protein GlyBP 1.2
206704_at 399 chloride channel 5 (nephrolithiasis 2, X-linked, Dent disease) CLCN5 1.6
212209_at 400 thyroid hormone receptor associated protein 2 THRAP2 1.5
219641_at 401 de-etiolated homolog 1 (Arabidopsis) DET1 1.3
202905_x_at 402 nibrin NBN -1.5
219763_at 403 KIAA1608 KIAA1608 -1.2
204604_at 404 PFTAIRE protein kinase 1 PFTK1 1.5
200746_s_at 405 guanine nucleotide binding protein (G protein), beta polypeptide 1 GNB1 1.4
218683_at 406 polypyrimidine tract binding protein 2 PTBP2 1.7
208002_s_at 407 brain acyl-CoA hydrolase BACH 1.6
211941_s_at 408 prostatic binding protein PBP 1.5
212416_at 409 secretory carrier membrane protein 1 SCAMP1 1.4
215311_at 410 MRNA full length insert cDNA clone EUROIMAGE 21920 --- 2.3
221885_at 411 KIAA1277 KIAA1277 -1.2
220134_x_at 412 chromosome 1 open reading frame 78 C1orf78 -1.4
215690_x_at 413 GPAA1P anchor attachment protein 1 homolog (yeast) GPAA1 -1.3
217863_at 414 protein inhibitor of activated STAT, 1 PIAS1 1.3
209702_at 415 fatso FTO 1.3
209253_at 416 vinexin beta (SH3-containing adaptor molecule-1) SCAM-1 -1.2
203476_at 417 trophoblast glycoprotein TPBG -1.8
202959_at 418 methylmalonyl Coenzyme A mutase MUT 1.4
218900_at 419 cyclin M4 CNNM4 -1.1
203100_s_at 420 chromodomain protein, Y-like CDYL 1.3
201032_at 421 bladder cancer associated protein BLCAP 1.9
202854_at 422 hypoxanthine phosphoribosyltransferase 1 (Lesch-Nyhan syndrome) HPRT1 1.5
219506_at 423 chromosome 1 open reading frame 54 C1orf54 -1.5
202579_x_at 424 high mobility group nucleosomal binding domain 4 HMGN4 1.3
213161_at 425 chromosome 9 open reading frame 97 /// chromosome 9 open reading frame 97 C9orf97 1.3
212655_at 426 zinc finger, CCHC domain containing 14 ZCCHC14 1.5
213224_s_at 427 hypothetical protein LOC92482 LOC92482 1.4
203510_at 428 met proto-oncogene (hepatocyte growth factor receptor) MET 2.1
221969_at 429 Paired box gene 5 (B-cell lineage specific activator) PAX5 1.8
213946_s_at 430 KIAA0657 protein KIAA0657 -1.4
208978_at 431 cysteine-rich protein 2 CRIP2 1.6
218373_at 432 fused toes homolog (mouse) FTS 1.3
213021_at 433 golgi SNAP receptor complex member 1 GOSR1 1.2
212309_at 434 cytoplasmic linker associated protein 2 CLASP2 1.5
209695_at 435 protein tyrosine phosphatase type IVA, member 3 PTP4A3 -1.5
203745_at 436 holocytochrome c synthase (cytochrome c heme-lyase) HCCS 1.5
209657_s_at 437 heat shock transcription factor 2 HSF2 1.5
219372_at 438 carnitine deficiency-associated, expressed in ventricle 1 CDV1 1.3
205042_at 439 glucosamine (UDP-N-acetyl)-2-epimerase/N-acetylmannosamine kinase GNE 1.4
213185_at 440 KIAA0556 protein KIAA0556 1.3
219146_at 441 chromosome 17 open reading frame 42 C17orf42 1.2
204662_at 442 CP110 protein CP110 1.3
208916_at 443 solute carrier family 1 (neutral amino acid transporter), member 5 SLC1A5 -1.2
202783_at 444 nicotinamide nucleotide transhydrogenase NNT 1.5
207785_s_at 445 recombining binding protein suppressor of hairless (Drosophila) RBPSUH 1.7
213251_at 446 Hypothetical LOC 441046 --- 1.6
218528_s_at 447 ring finger protein 38 RNF38 1.4
217967_s_at 448 chromosome 1 open reading frame 24 C1orf24 -1.8
203565_s_at 449 menage a trois 1 (CAK assembly factor) MNAT1 1.4
204726_at 450 cadherin 13, H-cadherin (heart) CDH13 1.5
202724_s_at 451 forkhead box O1A (rhabdomyosarcoma) FOXO1A 1.7
212211_at 452 ankyrin repeat domain 17 ANKRD17 1.4
213365_at 453 similar to RIKEN cDNA 4933424N09 gene MGC16943 1.4
219048_at 454 phosphatidylinositol glycan, class N PIGN 1.2
218099_at 455 uncharacterized hypothalamus protein HT008 HT008 1.2
218857_s_at 456 asparaginase like 1 ASRGL1 1.4
209737_at 457 membrane associated guanylate kinase, WW and PDZ domain containing 2 MAGI2 1.5
221482_s_at 458 cyclic AMP phosphoprotein, 19 kD ARPP-19 1.6
220014_at 459 mesenchymal stem cell protein DSC54 LOC51334 -1.4
201370_s_at 460 cullin 3 CUL3 1.4
203297_s_at 461 Jumonji, AT rich interactive domain 2 JARID2 1.5
206695_x_at 462 zinc finger protein 43 (HTF6) ZNF43 1.3
204093_at 463 cyclin H CCNH 1.3
201644_at 464 tissue specific transplantation antigen P35B TSTA3 -1.3
204600_at 465 EPH receptor B3 EPHB3 -1.3
209863_s_at 466 tumor protein p73-like TP73L -1.3
212710_at 467 calmodulin regulated spectrin-associated protein 1 CAMSAP1 1.4
209084_s_at 468 RAB28, member RAS oncogene family RAB28 1.3
217964_at 469 tetratricopeptide repeat domain 19 TTC19 1.4
215185_at 470 hypothetical gene supported by AK024177 LOC441468 -1.1
213960_at 471 CDNA FLJ37610 fis, clone BRCOC2011398 --- 1.9
204485_s_at 472 target of myb1-like 1 (chicken) TOM1L1 1.7
212547_at 473 FLJ35348 FLJ35348 1.4
209111_at 474 ring finger protein 5 RNF5 1.3
217977_at 475 selenoprotein X, 1 SEPX1 1.3
201218_at 476 C-terminal binding protein 2 CTBP2 1.6
212815_at 477 activating signal cointegrator 1 complex subunit 3 ASCC3 1.4
221933_at 478 neuroligin 4, X-linked NLGN4X 1.7
201566_x_at 479 inhibitor of DNA binding 2, dominant negative helix-loop-helix protein /// inhibitor of DNA binding 2B, dominant negative helix-loop-helix protein ID2 /// ID2B -1.4
212607_at 480 v-akt murine thymoma viral oncogene homolog 3 (protein kinase B, gamma) AKT3 1.4
206737_at 481 wingless-type MMTV integration site family, member 11 WNT11 -1.3
222025_s_at 482 5-oxoprolinase (ATP-hydrolysing) OPLAH -1.2
201829_at 483 neuroepithelial cell transforming gene 1 NET1 1.6
209435_s_at 484 rho/rac guanine nucleotide exchange factor (GEF) 2 ARHGEF2 1.3
221836_s_at 485 IKK2 binding protein T1 -1.2
205665_at 486 tetraspanin 9 TSPAN9 -1.1
206498_at 487 oculocutaneous albinism II (pink-eye dilution homolog, mouse) OCA2 -1.2
208857_s_at 488 protein-L-isoaspartate (D-aspartate) O-methyltransferase PCMT1 1.6
213852_at 489 RNA binding motif protein 8A RBM8A 1.3
217525_at 490 olfactomedin-like 1 OLFML1 -1.4
212774_at 491 zinc finger protein 238 ZNF238 1.7
212928_at 492 TSPY-like 4 TSPYL4 1.5
203660_s_at 493 pericentrin 2 (kendrin) PCNT2 1.4
202777_at 494 soc-2 suppressor of clear homolog (C. elegans) SHOC2 1.4
213689_x_at 495 Hypothetical LOC388650 LOC388650 1.4
213059_at 496 cAMP responsive element binding protein 3-like 1 CREB3L1 -1.3
220512_at 497 deleted in liver cancer 1 DLC1 -1.2
218612_s_at 498 tumor suppressing subtransferable candidate 4 TSSC4 -1.2
212322_at 499 sphingosine-1-phosphate lyase 1 SGPL1 1.4
219025_at 500 CD248 antigen, endosialin CD248 -1.3
204500_s_at 501 ATP/GTP binding protein 1 AGTPBP1 1.3
207191_s_at 502 immunoglobulin superfamily containing leucine-rich repeat ISLR -1.7
204983_s_at 503 glypican 4 GPC4 -1.4
218686_s_at 504 rhomboid family 1 (Drosophila) RHBDF1 -1.3
201018_at 505 eukaryotic translation initiation factor 1A, X-linked EIF1AX 1.6
208051_s_at 506 poly(A) binding protein interacting protein 1 PAIP1 1.6
204201_s_at 507 protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase) PTPN13 -1.2
220694_at 508 HSPC054 protein HSPC054 -1.3
203073_at 509 component of oligomeric golgi complex 2 COG2 1.2
204208_at 510 RNA guanylyltransferase and 5'-phosphatase RNGTT 1.2
205911_at 511 parathyroid hormone receptor 1 PTHR1 -1.2
201735_s_at 512 chloride channel 3 CLCN3 1.4
212498_at 513 Membrane-associated ring finger (C3HC4) 6 MARCH-VI 1.4
212987_at 514 F-box protein 9 FBXO9 1.4
204071_s_at 515 topoisomerase I binding, arginine/serine-rich TOPORS 1.2
201696_at 516 splicing factor, arginine/serine-rich 4 SFRS4 1.2
201010_s_at 517 thioredoxin interacting protein TXNIP 1.5
202614_at 518 solute carrier family 30 (zinc transporter), member 9 SLC30A9 1.4
222236_s_at 519 development and differentiation enhancing factor-like 1 DDEFL1 -1.2
208644_at 520 poly (ADP-ribose) polymerase family, member 1 PARP1 1.4
219932_at 521 solute carrier family 27 (fatty acid transporter), member 6 SLC27A6 1.6
219276_x_at 522 chromosome 9 open reading frame 82 C9orf82 1.3
203062_s_at 523 mediator of DNA damage checkpoint 1 MDC1 1.4
203909_at 524 solute carrier family 9 (sodium/hydrogen exchanger), isoform 6 SLC9A6 1.4
221730_at 525 collagen, type V, alpha 2 COL5A2 -2.0
205510_s_at 526 hypothetical protein FLJ10038 FLJ10038 1.2
205651_x_at 527 Rap guanine nucleotide exchange factor (GEF) 4 RAPGEF4 1.2
212678_at 528 Neurofibromin 1 (neurofibromatosis, von Recklinghausen disease, Watson disease) NF1 1.3
203175_at 529 ras homolog gene family, member G (rho G) RHOG -1.2
201975_at 530 restin (Reed-Steinberg cell-expressed intermediate filament-associated protein) RSN 1.4
212997_s_at 531 tousled-like kinase 2 TLK2 1.2
206100_at 532 carboxypeptidase M CPM -1.5
205336_at 533 parvalbumin PVALB -1.2
212760_at 534 ubiquitin protein ligase E3 component n-recognin 2 UBR2 1.3
* Gene rank as determined by the test statistic. The p-value cut-off (p < 0.000016) was determined using a two-step false discovery rate of 0.1%.
† Geometric mean fold-difference between mARMS and mERMS tumor classes.
Note: PAX-FKHR signature genes are highlighted in bold.
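The † footnote describes the tabulated values as geometric mean fold-differences between two tumor classes. The following is a minimal sketch of how such a value could be computed, not the pipeline used in this work. The function names and the toy intensities are hypothetical, and the signed convention (a ratio below 1 reported as its negative reciprocal) is an assumption; it is consistent with the table containing no values between -1 and 1, but it is not stated in the footnote.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def signed_fold_difference(class_a, class_b):
    """Fold-difference of class_a relative to class_b.

    Assumed convention (hypothetical, see lead-in): a ratio of at least 1
    is reported as-is; a ratio below 1 is reported as the negative
    reciprocal, so a two-fold decrease reads as -2.0 rather than 0.5.
    """
    ratio = geometric_mean(class_a) / geometric_mean(class_b)
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Toy expression intensities for one probe set in two classes (made-up numbers).
example_class_a = [200.0, 800.0]
example_class_b = [100.0, 400.0]
example_fold = signed_fold_difference(example_class_a, example_class_b)
```

Working in log space keeps the geometric mean numerically stable for the wide intensity ranges typical of microarray data, and swapping the two arguments flips the sign of the result under the assumed convention.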
Supplementary Table 8. Genes differentially expressed between PAX-FKHR polyclones and vector controls.
Mean Fold-Difference (relative to vector control): PAX3-FKHR, PAX7-FKHR, PAX-FKHR*
Affy ID Symbol Gene Name PAX3-FKHR PAX7-FKHR PAX-FKHR* p-value†
205389_s_at ANK1 ankyrin 1, erythrocytic /// ankyrin 1, erythrocytic 2.5 2.2 2.4 1.0E-09
207076_s_at ASS argininosuccinate synthetase 2.9 2.5 2.7 1.0E-09
212097_at CAV1 caveolin 1, caveolae protein, 22kDa 10.4 6.9 8.7 1.0E-09
211919_s_at CXCR4 chemokine (C-X-C motif) receptor 4 /// chemokine (C-X-C motif) receptor 4 2.8 3.3 3.0 1.0E-09
204602_at DKK1 dickkopf homolog 1 (Xenopus laevis) 5.3 5.8 5.6 1.0E-09
219908_at DKK2 dickkopf homolog 2 (Xenopus laevis) 3.7 3.3 3.5 1.0E-09
203723_at ITPKB inositol 1,4,5-trisphosphate 3-kinase B 2.3 2.8 2.5 1.0E-09
206835_at STATH statherin 28.5 63.7 46.1 1.0E-09
202644_s_at TNFAIP3 tumor necrosis factor, alpha-induced protein 3 2.4 2.1 2.3 1.0E-09
218084_x_at FXYD5 FXYD domain containing ion transport regulator 5 1.8 2.1 1.9 1.0E-08
201466_s_at JUN v-jun sarcoma virus 17 oncogene homolog (avian) 3.4 4.7 4.1 1.0E-08
210794_s_at MEG3 maternally expressed 3 2.6 2.9 2.8 1.0E-08
201110_s_at THBS1 thrombospondin 1 2.8 3.5 3.1 1.0E-08
218717_s_at LEPREL1 leprecan-like 1 3.2 5.3 4.2 2.0E-08
219894_at MAGEL2 MAGE-like 2 2.0 1.9 2.0 2.0E-08
209365_s_at ECM1 extracellular matrix protein 1 2.0 1.8 1.9 3.0E-08
212190_at SERPINE2 serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 2 4.3 7.6 6.0 3.0E-08
209201_x_at CXCR4 chemokine (C-X-C motif) receptor 4 2.7 3.4 3.1 4.0E-08
201325_s_at EMP1 epithelial membrane protein 1 2.4 2.6 2.5 4.0E-08
204948_s_at FST follistatin 3.2 4.7 3.9 4.0E-08
201464_x_at JUN v-jun sarcoma virus 17 oncogene homolog (avian) 3.3 4.2 3.7 4.0E-08
205479_s_at PLAU plasminogen activator, urokinase 2.5 2.0 2.3 5.0E-08
212956_at KIAA0882 KIAA0882 protein 3.7 6.6 5.2 6.0E-08
211042_x_at MCAM melanoma cell adhesion molecule /// melanoma cell adhesion molecule 1.9 2.3 2.1 7.0E-08
204105_s_at NRCAM neuronal cell adhesion molecule 2.4 3.4 2.9 8.0E-08
210432_s_at SCN3A sodium channel, voltage-gated, type III, alpha 6.4 16.3 11.3 8.0E-08
39248_at AQP3 aquaporin 3 3.7 6.8 5.2 9.0E-08
204014_at DUSP4 dual specificity phosphatase 4 -1.9 -1.6 -1.7 1.0E-07
212651_at RHOBTB1 Rho-related BTB domain containing 1 1.6 1.6 1.6 1.0E-07
216598_s_at CCL2 chemokine (C-C motif) ligand 2 -1.7 -1.6 -1.6 1.1E-07
203434_s_at MME membrane metallo-endopeptidase (neutral endopeptidase, enkephalinase, CALLA, CD10) 3.5 5.4 4.5 1.5E-07
221861_at --- MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) 2.6 3.9 3.3 1.8E-07
201289_at CYR61 cysteine-rich, angiogenic inducer, 61 2.5 2.0 2.3 2.2E-07
202207_at ARL7 ADP-ribosylation factor-like 7 2.7 3.5 3.1 2.6E-07
211341_at POU4F1 POU domain, class 4, transcription factor 1 3.6 6.2 4.9 2.6E-07
221123_x_at ZNF395 zinc finger protein 395 1.7 2.0 1.8 3.0E-07
212012_at PXDN peroxidasin homolog (Drosophila) 2.0 2.6 2.3 3.6E-07
206657_s_at MYOD1 myogenic factor 3 1.9 2.2 2.0 4.3E-07
205128_x_at PTGS1 prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase) 1.6 1.6 1.6 4.4E-07
203510_at MET met proto-oncogene (hepatocyte growth factor receptor) 2.5 3.8 3.1 4.6E-07
218807_at VAV3 vav 3 oncogene 2.0 2.6 2.3 4.6E-07
207571_x_at C1orf38 chromosome 1 open reading frame 38 2.0 2.8 2.4 4.9E-07
213438_at NFASC neurofascin 1.5 1.8 1.6 5.2E-07
207826_s_at ID3 inhibitor of DNA binding 3, dominant negative helix-loop-helix protein 1.6 1.6 1.6 5.5E-07
205433_at BCHE butyrylcholinesterase 2.0 2.6 2.3 5.6E-07
205132_at ACTC actin, alpha, cardiac muscle -1.7 -1.4 -1.6 6.1E-07
217028_at CXCR4 chemokine (C-X-C motif) receptor 4 3.7 6.3 5.0 6.1E-07
203625_x_at SKP2 S-phase kinase-associated protein 2 (p45) 2.3 2.6 2.5 6.2E-07
209888_s_at MYL1 myosin, light polypeptide 1, alkali; skeletal, fast -2.0 -2.2 -2.1 6.3E-07
202998_s_at LOXL2 lysyl oxidase-like 2 2.9 4.8 3.9 6.4E-07
201109_s_at THBS1 thrombospondin 1 1.5 1.7 1.6 6.7E-07
213069_at HEG HEG homolog 1 (zebrafish) 2.0 2.2 2.1 6.8E-07
212614_at ARID5B AT rich interactive domain 5B (MRF1-like) 1.9 2.5 2.2 8.6E-07
49111_at --- MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) 2.3 4.2 3.3 8.7E-07
215813_s_at PTGS1 prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase) 2.0 1.9 1.9 9.7E-07
222162_s_at ADAMTS1 a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 1 1.7 1.7 1.7 1.0E-06
222154_s_at DNAPTP6 DNA polymerase-transactivated protein 6 1.4 1.6 1.5 1.0E-06
201565_s_at ID2 inhibitor of DNA binding 2, dominant negative helix-loop-helix protein 1.6 1.7 1.7 1.0E-06
203233_at IL4R interleukin 4 receptor 1.6 1.8 1.7 1.2E-06
206940_s_at POU4F1 POU domain, class 4, transcription factor 1 2.6 4.1 3.3 1.2E-06
201368_at ZFP36L2 zinc finger protein 36, C3H type-like 2 -1.5 -1.7 -1.6 1.3E-06
43511_s_at --- MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) 2.4 4.7 3.6 1.4E-06
203324_s_at CAV2 caveolin 2 1.4 1.6 1.5 1.4E-06
204364_s_at C2orf23 chromosome 2 open reading frame 23 -2.8 -2.6 -2.7 1.5E-06
204722_at SCN3B sodium channel, voltage-gated, type III, beta 1.7 1.7 1.7 1.6E-06
205151_s_at KIAA0644 KIAA0644 gene product -1.7 -1.5 -1.6 1.8E-06
204454_at LDOC1 leucine zipper, down-regulated in cancer 1 1.8 2.4 2.1 1.8E-06
205534_at PCDH7 BH-protocadherin (brain-heart) 1.8 2.4 2.1 1.8E-06
219304_s_at PDGFD platelet derived growth factor D 2.2 3.2 2.7 1.9E-06
209710_at GATA2 GATA binding protein 2 1.5 1.8 1.6 2.0E-06
212611_at DTX4 deltex 4 homolog (Drosophila) 2.8 3.5 3.1 2.1E-06
210785_s_at C1orf38 chromosome 1 open reading frame 38 2.0 3.0 2.5 2.3E-06
205656_at PCDH17 protocadherin 17 1.7 2.2 1.9 2.4E-06
211320_s_at PTPRU protein tyrosine phosphatase, receptor type, U 1.8 1.7 1.8 2.4E-06
203065_s_at CAV1 caveolin 1, caveolae protein, 22kDa 5.0 2.5 3.8 3.0E-06
203725_at GADD45A growth arrest and DNA-damage-inducible, alpha 1.7 2.3 2.0 3.1E-06
209946_at VEGFC vascular endothelial growth factor C 1.6 1.9 1.7 3.2E-06
219935_at ADAMTS5 a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 5 (aggrecanase-2) 2.2 2.5 2.3 3.3E-06
202728_s_at LTBP1 latent transforming growth factor beta binding protein 1 1.4 1.8 1.6 3.3E-06
208195_at TTN titin -2.6 -4.1 -3.3 3.6E-06
204850_s_at DCX doublecortex; lissencephaly, X-linked (doublecortin) 4.9 18.0 11.5 3.9E-06
213689_x_at LOC388650 Hypothetical LOC388650 1.5 1.5 1.5 4.4E-06
201860_s_at PLAT plasminogen activator, tissue 2.5 5.3 3.9 4.4E-06
211237_s_at FGFR4 fibroblast growth factor receptor 4 1.7 2.3 2.0 4.5E-06
209656_s_at TMEM47 transmembrane protein 47 4.1 5.1 4.6 4.6E-06
205388_at TNNC2 troponin C2, fast -2.0 -2.0 -2.0 4.6E-06
203910_at ARHGAP29 Rho GTPase activating protein 29 2.3 2.9 2.6 4.7E-06
210830_s_at PON2 paraoxonase 2 1.4 1.6 1.5 4.8E-06
212387_at TCF4 Transcription factor 4 -2.1 -3.2 -2.6 4.8E-06
201005_at CD9 CD9 antigen (p24) 2.6 6.0 4.3 4.9E-06
201324_at EMP1 epithelial membrane protein 1 2.4 4.9 3.6 4.9E-06
209121_x_at NR2F2 nuclear receptor subfamily 2, group F, member 2 1.7 1.9 1.8 5.0E-06
200953_s_at CCND2 cyclin D2 1.5 1.7 1.6 5.1E-06
203217_s_at ST3GAL5 ST3 beta-galactoside alpha-2,3-sialyltransferase 5 1.6 2.2 1.9 5.4E-06
201596_x_at KRT18 keratin 18 1.9 3.1 2.5 5.9E-06
204320_at COL11A1 collagen, type XI, alpha 1 -2.1 -4.1 -3.1 6.0E-06
209220_at GPC3 glypican 3 3.1 8.6 5.8 6.0E-06
205431_s_at BMP5 bone morphogenetic protein 5 1.5 1.4 1.5 6.1E-06
205444_at ATP2A1 ATPase, Ca++ transporting, cardiac muscle, fast twitch 1 -1.6 -1.6 -1.6 6.2E-06
212977_at CMKOR1 chemokine orphan receptor 1 4.6 19.9 12.2 6.5E-06
212354_at SULF1 sulfatase 1 1.8 1.9 1.9 6.8E-06
222062_at IL27RA interleukin 27 receptor, alpha 1.6 2.1 1.8 6.9E-06
214608_s_at EYA1 eyes absent homolog 1 (Drosophila) 1.8 2.9 2.3 7.7E-06
213093_at PRKCA protein kinase C, alpha 1.8 1.9 1.9 7.7E-06
204365_s_at C2orf23 chromosome 2 open reading frame 23 -1.7 -2.7 -2.2 8.5E-06
204214_s_at RAB32 RAB32, member RAS oncogene family 1.4 1.7 1.5 9.3E-06
208212_s_at ALK anaplastic lymphoma kinase (Ki-1) 1.5 1.8 1.7 9.4E-06
206850_at RRP22 RAS-related on chromosome 22 1.3 1.7 1.5 9.6E-06
208353_x_at ANK1 ankyrin 1, erythrocytic 2.0 1.5 1.8 1.0E-05
221841_s_at KLF4 Kruppel-like factor 4 (gut) 2.8 7.9 5.4 1.0E-05
211926_s_at MYH9 myosin, heavy polypeptide 9, non-muscle 1.4 1.6 1.5 1.0E-05
202369_s_at TRAM2 translocation associated membrane protein 2 1.4 1.6 1.5 1.0E-05
212099_at RHOB ras homolog gene family, member B 1.4 1.7 1.6 1.1E-05
205150_s_at KIAA0644 KIAA0644 gene product -1.9 -1.9 -1.9 1.2E-05
202598_at S100A13 S100 calcium binding protein A13 1.4 1.8 1.6 1.2E-05
210567_s_at SKP2 S-phase kinase-associated protein 2 (p45) 2.4 5.6 4.0 1.2E-05
222146_s_at TCF4 transcription factor 4 -1.7 -1.7 -1.7 1.2E-05
203476_at TPBG trophoblast glycoprotein 1.8 2.9 2.3 1.2E-05
219471_at C13orf18 chromosome 13 open reading frame 18 -1.5 -1.7 -1.6 1.3E-05
204579_at FGFR4 fibroblast growth factor receptor 4 1.9 3.0 2.5 1.3E-05
212098_at LOC151162 hypothetical protein LOC151162 1.5 1.6 1.5 1.3E-05
212385_at TCF4 Transcription factor 4 -1.6 -2.3 -2.0 1.3E-05
202643_s_at TNFAIP3 tumor necrosis factor, alpha-induced protein 3 1.8 1.6 1.7 1.3E-05
200872_at S100A10 S100 calcium binding protein A10 (annexin II ligand, calpactin I, light polypeptide (p11)) 2.1 4.3 3.2 1.4E-05
202208_s_at ARL7 ADP-ribosylation factor-like 7 1.4 1.7 1.5 1.5E-05
213050_at COBL cordon-bleu homolog (mouse) -1.8 -1.8 -1.8 1.5E-05
205480_s_at UGP2 UDP-glucose pyrophosphorylase 2 1.6 2.1 1.9 1.6E-05
203394_s_at HES1 hairy and enhancer of split 1, (Drosophila) 1.6 2.1 1.8 1.7E-05
202887_s_at DDIT4 DNA-damage-inducible transcript 4 -1.6 -1.8 -1.7 1.8E-05
207345_at FST follistatin 1.7 2.7 2.2 1.8E-05
203435_s_at MME membrane metallo-endopeptidase (neutral endopeptidase, enkephalinase, CALLA, CD10) 1.7 2.3 2.0 2.0E-05
206089_at NELL1 NEL-like 1 (chicken) 2.2 5.1 3.7 2.0E-05
219686_at STK32B serine/threonine kinase 32B 1.4 1.7 1.6 2.1E-05
204451_at FZD1 frizzled homolog 1 (Drosophila) 1.6 2.2 1.9 2.2E-05
201876_at PON2 paraoxonase 2 1.5 1.6 1.5 2.2E-05
212386_at TCF4 Transcription factor 4 -2.3 -3.0 -2.6 2.2E-05
205559_s_at PCSK5 proprotein convertase subtilisin/kexin type 5 1.6 2.0 1.8 2.3E-05
212013_at PXDN peroxidasin homolog (Drosophila) 2.2 1.9 2.0 2.3E-05
209651_at TGFB1I1 transforming growth factor beta 1 induced transcript 1 1.7 2.8 2.3 2.4E-05
209835_x_at CD44 CD44 antigen (homing function and Indian blood group system) 1.4 1.7 1.6 2.5E-05
202008_s_at NID1 nidogen 1 1.4 1.8 1.6 2.6E-05
203626_s_at SKP2 S-phase kinase-associated protein 2 (p45) 2.2 4.6 3.4 2.7E-05
209185_s_at IRS2 insulin receptor substrate 2 1.4 1.6 1.5 2.8E-05
209765_at ADAM19 a disintegrin and metalloproteinase domain 19 (meltrin beta) 1.5 1.7 1.6 2.9E-05
212865_s_at COL14A1 collagen, type XIV, alpha 1 (undulin) 2.2 2.2 2.2 2.9E-05
210511_s_at INHBA inhibin, beta A (activin A, activin AB alpha polypeptide) 1.8 3.3 2.5 3.0E-05
209757_s_at MYCN v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (avian) 1.5 2.6 2.1 3.0E-05
218309_at CAMK2N1 calcium/calmodulin-dependent protein kinase II inhibitor 1 1.4 1.9 1.7 3.1E-05
205462_s_at HPCAL1 hippocalcin-like 1 1.4 2.0 1.7 3.1E-05
202039_at TIAF1 TGFB1-induced anti-apoptotic factor 1 /// myosin XVIIIA 1.8 3.4 2.6 3.1E-05
205902_at KCNN3 potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3 1.5 2.1 1.8 3.4E-05
211518_s_at BMP4 bone morphogenetic protein 4 1.6 2.0 1.8 3.6E-05
205330_at MN1 meningioma (disrupted in balanced translocation) 1 1.8 3.1 2.4 3.6E-05
211423_s_at SC5DL sterol-C5-desaturase (ERG3 delta-5-desaturase homolog, fungal)-like 1.6 2.2 1.9 3.6E-05
212188_at KCTD12 potassium channel tetramerisation domain containing 12 /// potassium channel tetramerisation domain containing 12 -1.5 -1.4 -1.4 3.8E-05
203184_at FBN2 fibrillin 2 (congenital contractural arachnodactyly) 2.4 5.0 3.7 3.9E-05
220227_at CDH4 cadherin 4, type 1, R-cadherin (retinal) 1.4 2.1 1.7 4.0E-05
213241_at PLXNC1 plexin C1 2.0 3.3 2.7 4.0E-05
202794_at INPP1 inositol polyphosphate-1-phosphatase 1.4 2.0 1.7 4.2E-05
212353_at SULF1 sulfatase 1 1.8 2.2 2.0 4.7E-05
200625_s_at CAP1 CAP, adenylate cyclase-associated protein 1 (yeast) 1.4 1.6 1.5 5.0E-05
220794_at GREM2 gremlin 2, cysteine knot superfamily, homolog (Xenopus laevis) -2.7 -3.2 -2.9 5.0E-05
209170_s_at GPM6B glycoprotein M6B 1.8 2.8 2.3 5.1E-05
212489_at COL5A1 collagen, type V, alpha 1 1.4 1.8 1.6 5.2E-05
203961_at NEBL nebulette 2.0 4.9 3.4 5.2E-05
202377_at LEPR leptin receptor /// leptin receptor overlapping transcript 1.4 1.9 1.6 5.4E-05
200661_at PPGB protective protein for beta-galactosidase (galactosialidosis) 1.3 1.8 1.5 5.6E-05
206571_s_at MAP4K4 mitogen-activated protein kinase kinase kinase kinase 4 1.4 1.6 1.5 5.8E-05
202345_s_at FABP5 fatty acid binding protein 5 (psoriasis-associated) 2.1 2.9 2.5 5.9E-05
209683_at FAM49A Family with sequence similarity 49, member A 1.6 2.7 2.1 5.9E-05
205888_s_at KIAA0555 Jak and microtubule interacting protein 2 1.6 1.7 1.7 5.9E-05
37892_at COL11A1 collagen, type XI, alpha 1 -2.5 -7.5 -5.0 6.2E-05
217889_s_at CYBRD1 cytochrome b reductase 1 1.5 1.6 1.6 6.2E-05
207630_s_at CREM cAMP responsive element modulator 1.5 2.0 1.7 6.6E-05
211828_s_at TNIK TRAF2 and NCK interacting kinase -1.4 -1.7 -1.5 6.6E-05
219718_at FLJ10986 hypothetical protein FLJ10986 1.7 3.4 2.5 6.7E-05
207282_s_at MYOG myogenin (myogenic factor 4) -1.6 -1.3 -1.4 6.9E-05
200788_s_at PEA15 phosphoprotein enriched in astrocytes 15 1.3 1.7 1.5 6.9E-05
221586_s_at E2F5 E2F transcription factor 5, p130-binding 1.4 1.7 1.6 7.2E-05
221840_at PTPRE protein tyrosine phosphatase, receptor type, E 1.5 1.6 1.5 7.2E-05
212473_s_at MICAL2 microtubule associated monoxygenase, calponin and LIM domain containing 2 -1.7 -1.8 -1.7 7.3E-05
218806_s_at VAV3 vav 3 oncogene 1.9 3.2 2.6 7.3E-05
202071_at SDC4 syndecan 4 (amphiglycan, ryudocan) 1.5 1.6 1.6 7.4E-05
212344_at SULF1 sulfatase 1 1.5 1.9 1.7 7.5E-05
203753_at TCF4 transcription factor 4 -1.9 -1.5 -1.7 7.6E-05
214632_at NRP2 neuropilin 2 1.3 1.8 1.6 8.1E-05
212880_at WDR7 WD repeat domain 7 -1.5 -2.0 -1.7 8.4E-05
212382_at TCF4 Transcription factor 4 -2.3 -1.9 -2.1 8.7E-05
205113_at NEF3 neurofilament 3 (150kDa medium) -1.4 -2.0 -1.7 8.8E-05
220979_s_at ST6GALNAC5 ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 5 /// ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 5 1.4 1.7 1.5 8.8E-05
211323_s_at ITPR1 inositol 1,4,5-triphosphate receptor, type 1 -1.3 -1.6 -1.5 9.3E-05
201939_at PLK2 polo-like kinase 2 (Drosophila) -1.8 -3.3 -2.6 9.4E-05
202603_at ADAM10 A disintegrin and metalloproteinase domain 10 2.2 2.4 2.3 9.5E-05
209120_at NR2F2
nuclear receptor subfamily 2,
group F, member 2
2.8 9.0 5.9 9.8E-05
211671_s_at NR3C1 nuclear receptor subfamily 3, group C, member 1 (glucocorticoid receptor) 1.6 1.8 1.7 9.8E-05
213381_at C10orf72
Chromosome 10 open reading
frame 72
1.7 3.5 2.6 1.0E-04
203689_s_at FMR1 fragile X mental retardation 1 1.6 2.5 2.0 1.1E-04
212157_at SDC2
syndecan 2 (heparan sulfate
proteoglycan 1, cell surface-
associated, fibroglycan)
1.5 2.1 1.8 1.1E-04
201811_x_at SH3BP5
SH3-domain binding protein 5
(BTK-associated)
1.8 1.8 1.8 1.1E-04
209655_s_at TMEM47 transmembrane protein 47 1.9 2.9 2.4 1.1E-04
202478_at TRIB2 tribbles homolog 2 (Drosophila) -1.4 -2.1 -1.8 1.1E-04
200621_at CSRP1
cysteine and glycine-rich
protein 1
1.4 2.1 1.7 1.2E-04
209031_at IGSF4
Immunoglobulin superfamily,
member 4
1.7 3.5 2.6 1.2E-04
202729_s_at LTBP1
latent transforming growth
factor beta binding protein 1
1.6 1.7 1.7 1.2E-04
211668_s_at PLAU plasminogen activator, urokinase 2.5 1.8 2.2 1.2E-04
205087_at RWDD3 RWD domain containing 3 1.5 1.6 1.6 1.2E-04
201581_at DJ971N18.2
hypothetical protein
DJ971N18.2
1.7 3.4 2.6 1.3E-04
204015_s_at DUSP4 dual specificity phosphatase 4 -1.5 -1.4 -1.4 1.3E-04
203178_at GATM
glycine amidinotransferase (L-
arginine:glycine
amidinotransferase)
1.7 2.4 2.1 1.4E-04
210095_s_at IGFBP3
insulin-like growth factor
binding protein 3
-1.5 -2.1 -1.8 1.4E-04
203837_at MAP3K5
mitogen-activated protein
kinase kinase kinase 5
1.7 1.3 1.5 1.4E-04
204049_s_at PHACTR2
phosphatase and actin regulator
2
1.7 3.0 2.3 1.4E-04
209156_s_at COL6A2 collagen, type VI, alpha 2 1.9 2.2 2.0 1.5E-04
209094_at DDAH1
dimethylarginine
dimethylaminohydrolase 1
1.6 3.0 2.3 1.5E-04
207145_at GDF8 growth differentiation factor 8 -1.8 -2.0 -1.9 1.5E-04
206632_s_at APOBEC3B
apolipoprotein B mRNA editing
enzyme, catalytic polypeptide-
like 3B
1.7 3.6 2.6 1.6E-04
209539_at ARHGEF6
Rac/Cdc42 guanine nucleotide
exchange factor (GEF) 6
-1.5 -1.4 -1.5 1.6E-04
210074_at CTSL2 cathepsin L2 1.3 1.8 1.5 1.6E-04
212888_at DICER1
Dicer1, Dcr-1 homolog
(Drosophila)
1.6 2.2 1.9 1.6E-04
221016_s_at TCF7L1 transcription factor 7-like 1 (T-cell specific, HMG-box) 1.4 2.1 1.7 1.6E-04
210102_at LOH11CR2A
loss of heterozygosity, 11,
chromosomal region 2, gene A
1.6 2.8 2.2 1.7E-04
202007_at NID1 nidogen 1 2.1 3.3 2.7 1.7E-04
203824_at TSPAN8 tetraspanin 8 1.3 2.0 1.6 1.7E-04
202920_at ANK2 ankyrin 2, neuronal 1.8 4.2 3.0 1.8E-04
208092_s_at FAM49A family with sequence similarity 49, member A 1.6 2.1 1.9 1.8E-04
217997_at PHLDA1
pleckstrin homology-like
domain, family A, member 1
-1.6 -1.7 -1.7 1.8E-04
211373_s_at PSEN2
presenilin 2 (Alzheimer disease
4)
1.3 1.8 1.5 1.8E-04
210036_s_at KCNH2
potassium voltage-gated
channel, subfamily H (eag-
related), member 2
-1.2 -1.7 -1.4 1.9E-04
200795_at SPARCL1 SPARC-like 1 (mast9, hevin) -1.6 -3.4 -2.5 1.9E-04
202524_s_at SPOCK2
sparc/osteonectin, cwcv and
kazal-like domains
proteoglycan (testican) 2
-1.5 -1.6 -1.5 1.9E-04
218870_at ARHGAP15
Rho GTPase activating protein
15
-1.7 -2.5 -2.1 2.0E-04
200632_s_at NDRG1
N-myc downstream regulated
gene 1
1.7 2.0 1.8 2.0E-04
218162_at OLFML3 olfactomedin-like 3 -1.5 -1.6 -1.6 2.0E-04
212256_at GALNT10
UDP-N-acetyl-alpha-D-
galactosamine:polypeptide N-
acetylgalactosaminyltransferase
10 (GalNAc-T10)
1.5 2.9 2.2 2.1E-04
200833_s_at RAP1B
RAP1B, member of RAS
oncogene family
1.4 1.8 1.6 2.1E-04
209460_at ABAT
4-aminobutyrate
aminotransferase
1.8 4.2 3.0 2.2E-04
203851_at IGFBP6
insulin-like growth factor
binding protein 6
1.4 2.6 2.0 2.2E-04
209119_x_at NR2F2
nuclear receptor subfamily 2,
group F, member 2
1.3 1.7 1.5 2.2E-04
212599_at AUTS2
autism susceptibility candidate
2
1.3 2.0 1.7 2.3E-04
201487_at CTSC cathepsin C 1.4 2.4 1.9 2.3E-04
206404_at FGF9
fibroblast growth factor 9 (glia-
activating factor)
-1.3 -1.7 -1.5 2.3E-04
212680_x_at PPP1R14B protein phosphatase 1, regulatory (inhibitor) subunit 14B 1.3 1.7 1.5 2.3E-04
205481_at ADORA1 adenosine A1 receptor -1.6 -2.0 -1.8 2.4E-04
221042_s_at CLMN
calmin (calponin-like,
transmembrane)
1.2 1.7 1.5 2.4E-04
216866_s_at COL14A1
collagen, type XIV, alpha 1
(undulin)
1.7 3.8 2.8 2.4E-04
204183_s_at ADRBK2
adrenergic, beta, receptor
kinase 2
1.4 2.4 1.9 2.5E-04
202921_s_at ANK2 ankyrin 2, neuronal 1.4 2.4 1.9 2.5E-04
219355_at FLJ10178 hypothetical protein FLJ10178 -1.9 -2.5 -2.2 2.5E-04
203414_at MMD
monocyte to macrophage
differentiation-associated
1.3 1.8 1.5 2.5E-04
203962_s_at NEBL nebulette 2.2 9.8 6.0 2.5E-04
214734_at SLAC2-B SLAC2-B -1.3 -1.8 -1.6 2.5E-04
219474_at TTMP
TPA-induced transmembrane
protein
-1.6 -1.4 -1.5 2.5E-04
214895_s_at ADAM10
a disintegrin and
metalloproteinase domain 10
1.5 2.2 1.9 2.6E-04
205163_at MYLPF
fast skeletal myosin light chain
2
-3.0 -1.5 -2.3 2.6E-04
204723_at SCN3B
sodium channel, voltage-gated,
type III, beta
1.6 2.3 1.9 2.6E-04
203408_s_at SATB1
special AT-rich sequence
binding protein 1 (binds to
nuclear matrix/scaffold-
associating DNA's)
1.6 4.0 2.8 2.7E-04
216511_s_at TCF7L2
transcription factor 7-like 2 (T-
cell specific, HMG-box)
1.5 2.8 2.2 2.7E-04
218149_s_at ZNF395 zinc finger protein 395 1.6 2.8 2.2 2.7E-04
219148_at PBK PDZ binding kinase 1.4 2.2 1.8 2.8E-04
201648_at JAK1
Janus kinase 1 (a protein
tyrosine kinase)
1.4 2.1 1.7 2.9E-04
209840_s_at LRRN3 leucine rich repeat neuronal 3 -1.5 -2.2 -1.9 2.9E-04
219038_at MORC4
MORC family CW-type zinc
finger 4
1.4 1.7 1.6 2.9E-04
213652_at PCSK5
Proprotein convertase
subtilisin/kexin type 5
1.5 3.0 2.3 2.9E-04
205123_s_at TMEFF1
transmembrane protein with
EGF-like and two follistatin-
like domains 1
1.7 3.4 2.6 2.9E-04
210517_s_at AKAP12
A kinase (PRKA) anchor
protein (gravin) 12
1.6 2.4 2.0 3.0E-04
220495_s_at C5orf14
chromosome 5 open reading
frame 14
1.3 1.6 1.5 3.0E-04
215073_s_at NR2F2
nuclear receptor subfamily 2,
group F, member 2
1.7 3.7 2.7 3.0E-04
210026_s_at CARD10
caspase recruitment domain
family, member 10
1.3 1.7 1.5 3.1E-04
216942_s_at CD58
CD58 antigen, (lymphocyte
function-associated antigen 3)
1.4 2.0 1.7 3.1E-04
220161_s_at EPB41L4B
erythrocyte membrane protein
band 4.1 like 4B
1.4 2.1 1.7 3.1E-04
217853_at TENS1
Tensin-like SH2 domain
containing 1
1.4 2.1 1.7 3.1E-04
218832_x_at ARRB1 arrestin, beta 1 1.4 2.3 1.8 3.2E-04
202723_s_at FOXO1A
forkhead box O1A
(rhabdomyosarcoma)
1.8 6.4 4.1 3.2E-04
200600_at MSN moesin 1.4 1.8 1.6 3.2E-04
212481_s_at TPM4 tropomyosin 4 1.5 1.6 1.6 3.2E-04
201371_s_at CUL3 cullin 3 1.5 2.5 2.0 3.4E-04
202759_s_at
PALM2-
AKAP2
PALM2-AKAP2 protein 1.7 1.4 1.5 3.4E-04
204914_s_at SOX11
SRY (sex determining region
Y)-box 11
-1.5 -1.6 -1.6 3.4E-04
202808_at C10orf26
chromosome 10 open reading
frame 26
1.5 2.8 2.1 3.5E-04
207012_at MMP16
matrix metalloproteinase 16
(membrane-inserted)
1.8 1.7 1.7 3.5E-04
202188_at NUP93 nucleoporin 93kDa 1.4 2.4 1.9 3.5E-04
204526_s_at TBC1D8
TBC1 domain family, member
8 (with GRAM domain)
-1.3 -1.9 -1.6 3.5E-04
210966_x_at LARP1
La ribonucleoprotein domain
family, member 1
-1.2 -1.7 -1.5 3.6E-04
204612_at PKIA
protein kinase (cAMP-
dependent, catalytic) inhibitor
alpha
-1.6 -1.4 -1.5 3.6E-04
213494_s_at YY1 YY1 transcription factor -1.3 -1.7 -1.5 3.6E-04
202120_x_at AP2S1
adaptor-related protein complex
2, sigma 1 subunit
1.3 2.1 1.7 3.7E-04
203895_at PLCB4 phospholipase C, beta 4 -1.8 -2.6 -2.2 3.7E-04
209459_s_at ABAT
4-aminobutyrate
aminotransferase
2.0 6.1 4.0 3.8E-04
201186_at LRPAP1
low density lipoprotein
receptor-related protein
associated protein 1
1.3 2.2 1.7 3.8E-04
205991_s_at PRRX1 paired related homeobox 1 1.4 2.1 1.8 3.8E-04
203313_s_at TGIF
TGFB-induced factor (TALE
family homeobox)
1.4 2.1 1.8 3.8E-04
208399_s_at EDN3 endothelin 3 -2.0 -2.2 -2.1 3.9E-04
217732_s_at ITM2B integral membrane protein 2B 1.4 1.9 1.7 3.9E-04
201666_at TIMP1
tissue inhibitor of
metalloproteinase 1 (erythroid
potentiating activity,
collagenase inhibitor)
1.3 1.8 1.5 3.9E-04
213016_at BBX
Bobby sox homolog
(Drosophila)
-2.0 -3.1 -2.6 4.0E-04
204456_s_at GAS1 growth arrest-specific 1 1.5 2.2 1.9 4.0E-04
203896_s_at PLCB4 phospholipase C, beta 4 -2.1 -4.9 -3.5 4.0E-04
204479_at OSTF1 osteoclast stimulating factor 1 1.3 1.9 1.6 4.1E-04
201760_s_at WSB2
WD repeat and SOCS box-
containing 2
1.3 1.8 1.6 4.3E-04
212792_at DPY19L1 dpy-19-like 1 (C. elegans) 1.4 2.1 1.7 4.4E-04
205560_at PCSK5
proprotein convertase
subtilisin/kexin type 5
1.2 1.8 1.5 4.4E-04
221263_s_at SF3B5 splicing factor 3b, subunit 5, 10kDa 1.4 2.3 1.9 4.4E-04
200900_s_at M6PR
mannose-6-phosphate receptor
(cation dependent)
1.2 1.8 1.5 4.5E-04
213256_at MARCH3
membrane-associated ring
finger (C3HC4) 3
1.3 2.0 1.7 4.5E-04
207551_s_at MSL3L1
male-specific lethal 3-like 1
(Drosophila)
1.4 2.7 2.0 4.5E-04
206070_s_at EPHA3 EPH receptor A3 1.7 2.7 2.2 4.6E-04
216044_x_at LOC388650 hypothetical LOC388650 1.4 2.6 2.0 4.7E-04
209468_at LRP5
low density lipoprotein
receptor-related protein 5
-1.5 -1.8 -1.6 4.7E-04
209082_s_at COL18A1 collagen, type XVIII, alpha 1 1.4 1.7 1.5 5.0E-04
219694_at FLJ11127 hypothetical protein FLJ11127 1.3 1.8 1.6 5.0E-04
205848_at GAS2 growth arrest-specific 2 -1.9 -3.2 -2.6 5.0E-04
205535_s_at PCDH7 BH-protocadherin (brain-heart) 1.8 4.8 3.3 5.1E-04
216733_s_at GATM
glycine amidinotransferase (L-
arginine:glycine
amidinotransferase)
1.7 4.8 3.2 5.2E-04
212960_at KIAA0882 KIAA0882 protein 1.3 1.8 1.5 5.2E-04
209027_s_at ABI1 abl-interactor 1 1.4 2.4 1.9 5.3E-04
202562_s_at C14orf1
chromosome 14 open reading
frame 1
1.3 2.0 1.6 5.3E-04
204103_at CCL4 chemokine (C-C motif) ligand 4 -1.2 -1.7 -1.5 5.4E-04
218718_at PDGFC platelet derived growth factor C 1.5 2.6 2.0 5.4E-04
212415_at SEP6 septin 6 1.3 2.3 1.8 5.4E-04
53991_at KIAA1277 KIAA1277 1.3 2.2 1.8 5.5E-04
208785_s_at MAP1LC3B
microtubule-associated protein
1 light chain 3 beta
1.4 2.6 2.0 5.6E-04
217783_s_at YPEL5 yippee-like 5 (Drosophila) 1.3 2.1 1.7 5.6E-04
203695_s_at DFNA5 deafness, autosomal dominant 5 1.4 1.6 1.5 5.7E-04
221685_s_at FLJ20364 hypothetical protein FLJ20364 1.7 1.3 1.5 5.7E-04
205968_at KCNS3
potassium voltage-gated
channel, delayed-rectifier,
subfamily S, member 3
1.3 2.0 1.6 5.8E-04
205381_at LRRC17
leucine rich repeat containing
17
1.3 2.1 1.7 5.9E-04
209168_at GPM6B glycoprotein M6B 1.4 2.6 2.0 6.1E-04
214110_s_at LASP1
Similar to lymphocyte-specific
protein 1
1.3 2.2 1.7 6.1E-04
218450_at HEBP1 heme binding protein 1 1.3 2.4 1.9 6.2E-04
201526_at ARF5 ADP-ribosylation factor 5 1.2 1.7 1.5 6.5E-04
221885_at KIAA1277 KIAA1277 1.4 2.9 2.2 6.6E-04
218273_s_at PPM2C
protein phosphatase 2C,
magnesium-dependent,
catalytic subunit
-1.2 -1.7 -1.5 6.6E-04
201041_s_at DUSP1 dual specificity phosphatase 1 1.2 1.8 1.5 6.7E-04
206506_s_at SUPT3H
suppressor of Ty 3 homolog (S.
cerevisiae)
1.2 1.9 1.6 6.8E-04
206662_at GLRX glutaredoxin (thioltransferase) -1.3 -1.7 -1.5 6.9E-04
209708_at MOXD1 monooxygenase, DBH-like 1 -1.4 -2.4 -1.9 7.0E-04
202237_at NNMT
nicotinamide N-
methyltransferase
1.5 1.7 1.6 7.0E-04
218319_at PELI1 pellino homolog 1 (Drosophila) 1.3 1.9 1.6 7.0E-04
203680_at PRKAR2B
protein kinase, cAMP-
dependent, regulatory, type II,
beta
1.5 3.1 2.3 7.0E-04
202656_s_at SERTAD2 SERTA domain containing 2 1.3 1.8 1.6 7.0E-04
213002_at MARCKS
Myristoylated alanine-rich
protein kinase C substrate
-1.5 -1.8 -1.7 7.1E-04
210993_s_at SMAD1
SMAD, mothers against DPP
homolog 1 (Drosophila)
1.5 2.6 2.1 7.1E-04
201889_at FAM3C
family with sequence similarity
3, member C
1.5 2.5 2.0 7.2E-04
204491_at PDE4D
Phosphodiesterase 4D, cAMP-
specific (phosphodiesterase E3
dunce homolog, Drosophila)
-1.3 -1.8 -1.6 7.2E-04
201739_at SGK
serum/glucocorticoid regulated
kinase
1.5 1.6 1.6 7.4E-04
219377_at FAM59A
family with sequence similarity
59, member A
1.6 2.4 2.0 7.5E-04
201200_at CREG1
cellular repressor of E1A-
stimulated genes 1
1.5 1.9 1.7 7.7E-04
219249_s_at FKBP10
FK506 binding protein 10, 65
kDa
-1.5 -1.7 -1.6 7.8E-04
36711_at MAFF
v-maf musculoaponeurotic
fibrosarcoma oncogene
homolog F (avian)
1.7 5.0 3.4 7.9E-04
221969_at PAX5
Paired box gene 5 (B-cell
lineage specific activator)
1.6 5.8 3.7 8.0E-04
203860_at PCCA
propionyl Coenzyme A
carboxylase, alpha polypeptide
1.4 1.6 1.5 8.0E-04
219634_at CHST11
carbohydrate (chondroitin 4)
sulfotransferase 11
1.3 2.5 1.9 8.1E-04
221773_at ELK3
ELK3, ETS-domain protein
(SRF accessory protein 2)
1.7 3.1 2.4 8.1E-04
201798_s_at FER1L3
fer-1-like 3, myoferlin (C.
elegans)
1.7 1.7 1.7 8.2E-04
210774_s_at NCOA4 nuclear receptor coactivator 4 1.3 2.2 1.8 8.2E-04
205054_at NEB nebulin -1.5 -1.4 -1.4 8.2E-04
214978_s_at PPFIA4
protein tyrosine phosphatase,
receptor type, f polypeptide
(PTPRF), interacting protein
(liprin), alpha 4
-1.4 -1.9 -1.7 8.2E-04
213139_at SNAI2 snail homolog 2 (Drosophila) 1.5 1.9 1.7 8.2E-04
213061_s_at NTAN1 N-terminal asparagine amidase 1.4 2.8 2.1 8.3E-04
218260_at PCIA1
cross-immune reaction antigen
PCIA1
1.3 2.5 1.9 8.3E-04
202455_at HDAC5 histone deacetylase 5 1.3 2.2 1.7 8.4E-04
202388_at RGS2
regulator of G-protein
signalling 2, 24kDa
-1.3 -2.0 -1.6 8.4E-04
221489_s_at SPRY4 sprouty homolog 4 (Drosophila) -1.3 -2.7 -2.0 8.4E-04
214736_s_at ADD1 adducin 1 (alpha) 1.3 2.2 1.7 8.5E-04
206670_s_at GAD1
glutamate decarboxylase 1
(brain, 67kDa)
1.3 1.8 1.5 8.5E-04
202446_s_at PLSCR1 phospholipid scramblase 1 1.2 1.7 1.5 8.5E-04
219147_s_at C9orf95
chromosome 9 open reading
frame 95
1.3 2.0 1.7 8.6E-04
218384_at CARHSP1
calcium regulated heat stable
protein 1, 24kDa
1.3 2.7 2.0 8.6E-04
208786_s_at MAP1LC3B
microtubule-associated protein
1 light chain 3 beta
1.3 2.4 1.9 8.6E-04
218434_s_at AACS acetoacetyl-CoA synthetase 1.4 2.5 1.9 8.8E-04
204490_s_at CD44
CD44 antigen (homing function
and Indian blood group system)
1.3 1.8 1.5 8.8E-04
202332_at CSNK1E casein kinase 1, epsilon 1.2 1.8 1.5 8.8E-04
202539_s_at HMGCR
3-hydroxy-3-methylglutaryl-
Coenzyme A reductase
1.5 2.8 2.1 8.8E-04
219106_s_at KBTBD10
kelch repeat and BTB (POZ)
domain containing 10
-2.2 -1.9 -2.1 8.8E-04
217996_at PHLDA1
pleckstrin homology-like
domain, family A, member 1
-1.6 -1.7 -1.6 8.9E-04
206108_s_at SFRS6
splicing factor, arginine/serine-
rich 6
-2.4 -2.6 -2.5 9.1E-04
209608_s_at ACAT2
acetyl-Coenzyme A
acetyltransferase 2 (acetoacetyl
Coenzyme A thiolase)
1.4 3.2 2.3 9.2E-04
216235_s_at EDNRA endothelin receptor type A -1.2 -1.7 -1.5 9.2E-04
205173_x_at CD58
CD58 antigen, (lymphocyte
function-associated antigen 3)
1.4 2.4 1.9 9.3E-04
209652_s_at PGF
placental growth factor,
vascular endothelial growth
factor-related protein
1.6 1.6 1.6 9.3E-04
205555_s_at MSX2
msh homeo box homolog 2
(Drosophila)
1.3 2.6 2.0 9.4E-04
203373_at SOCS2
suppressor of cytokine
signaling 2
1.2 1.8 1.5 9.4E-04
213823_at HOXA11 homeo box A11 1.3 2.6 2.0 9.5E-04
201953_at CIB1
calcium and integrin binding 1
(calmyrin)
1.3 2.1 1.7 9.6E-04
213927_at MAP3K9
mitogen-activated protein
kinase kinase kinase 9
-1.3 -1.7 -1.5 9.7E-04
204555_s_at PPP1R3D
protein phosphatase 1,
regulatory subunit 3D
-1.8 -2.4 -2.1 9.8E-04
212136_at ATP2B4
ATPase, Ca++ transporting,
plasma membrane 4
-1.3 -1.8 -1.5 9.9E-04
221598_s_at CRSP8
cofactor required for Sp1
transcriptional activation,
subunit 8, 34kDa
1.4 3.1 2.2 9.9E-04
200820_at PSMD8
proteasome (prosome,
macropain) 26S subunit, non-
ATPase, 8
1.2 1.8 1.5 9.9E-04
* Geometric mean fold-difference between the average signal of both PAX3-FKHR and PAX7-FKHR polyclones
versus vector control.
† p-value from a simple t-test comparing PAX-FKHR versus vector control.
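The fold-difference columns above can be read as signed ratios: positive values for higher expression than the vector control, negative values for lower expression. The following is a minimal sketch of that convention and of a geometric-mean ratio. The sign convention is an assumption inferred from the negative entries in the table, the function names are hypothetical, and the signal values are invented for illustration and are not thesis data.

```python
import math


def signed_fold(ratio):
    """Convert a positive expression ratio to a signed fold-difference.

    Ratios >= 1 are returned unchanged; ratios < 1 are returned as the
    negative reciprocal (0.5 becomes -2.0). This convention is assumed,
    not stated in the table footnotes.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1.0 else -1.0 / ratio


def geometric_mean_ratio(treated, control):
    """Ratio of the geometric means of two lists of positive signals."""
    log_treated = sum(math.log2(x) for x in treated) / len(treated)
    log_control = sum(math.log2(x) for x in control) / len(control)
    return 2.0 ** (log_treated - log_control)


# Hypothetical probe-set signals (illustrative only).
fusion_signals = [400.0, 1600.0]
vector_signals = [200.0, 200.0]
fold = signed_fold(geometric_mean_ratio(fusion_signals, vector_signals))
```

Averaging on the log scale keeps a two-fold increase and a two-fold decrease symmetric, which is the usual reason to prefer a geometric mean for expression ratios.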
Supplementary Table 9. Genes differentially expressed between PAX-FKHR and fusion-negative RMS cell lines
Affy ID Symbol Gene Name Mean Fold-Difference* p-value†
214895_s_at ADAM10 A disintegrin and metalloproteinase domain 10 2.4 1.E-09
219147_s_at C9orf95 Chromosome 9 open reading frame 95 2.8 1.E-09
202455_at HDAC5 Histone deacetylase 5 1.7 1.E-09
203723_at ITPKB Inositol 1,4,5-trisphosphate 3-kinase B 2.7 1.E-09
205888_s_at KIAA0555 Jak and microtubule interacting protein 2 2.8 1.E-09
203414_at MMD Monocyte to macrophage differentiation-associated 2.1 1.E-09
206089_at NELL1 NEL-like 1 (chicken) 6.0 1.E-09
204479_at OSTF1 Osteoclast stimulating factor 1 1.7 1.E-09
205123_s_at TMEFF1 Transmembrane protein with EGF-like and two follistatin-like domains 1 3.4 1.E-09
201368_at ZFP36L2 Zinc finger protein 36, C3H type-like 2 -2.2 1.E-09
209460_at ABAT 4-aminobutyrate aminotransferase 5.3 1.E-08
206940_s_at POU4F1 POU domain, class 4, transcription factor 1 3.3 1.E-08
209459_s_at ABAT 4-aminobutyrate aminotransferase 6.2 3.E-08
218319_at PELI1 Pellino homolog 1 (Drosophila) 1.9 6.E-08
211341_at POU4F1 POU domain, class 4, transcription factor 1 4.9 6.E-08
209757_s_at MYCN V-myc myelocytomatosis viral related oncogene, neuroblastoma derived (avian) 4.7 1.E-07
203626_s_at SKP2 S-phase kinase-associated protein 2 (p45) 2.6 2.E-07
208212_s_at ALK Anaplastic lymphoma kinase (Ki-1) 2.1 3.E-07
210567_s_at SKP2 S-phase kinase-associated protein 2 (p45) 3.2 5.E-07
210794_s_at MEG3 Maternally expressed 3 3.5 8.E-07
219038_at ZCWCC2 Zinc finger, CW type with coiled-coil domain 2 1.7 8.E-07
201005_at CD9 CD9 antigen (p24) 3.4 1.E-06
213002_at 213002_at -1.5 1.E-06
210095_s_at IGFBP3 Insulin-like growth factor binding protein 3 -8.1 1.E-06
201939_at PLK2 Polo-like kinase 2 (Drosophila) -4.2 2.E-06
203625_x_at SKP2 S-phase kinase-associated protein 2 (p45) 2.1 2.E-06
201581_at DJ971N18.2 Hypothetical protein DJ971N18.2 3.7 3.E-06
207076_s_at ASS Argininosuccinate synthetase 8.2 3.E-06
205431_s_at BMP5 Bone morphogenetic protein 5 2.6 5.E-06
212136_at ATP2B4 ATPase, Ca++ transporting, plasma membrane 4 -1.9 5.E-06
206835_at STATH Statherin 17.6 5.E-06
218384_at CARHSP1 Calcium regulated heat stable protein 1, 24kDa 1.7 6.E-06
211042_x_at MCAM Melanoma cell adhesion molecule 2.0 7.E-06
206657_s_at MYOD1 Myogenic factor 3 3.7 7.E-06
209683_at 209683_at 3.2 7.E-06
217028_at CXCR4 Chemokine (C-X-C motif) receptor 4 7.8 9.E-06
208092_s_at FAM49A Family with sequence similarity 49, member A 2.4 9.E-06
202921_s_at ANK2 Ankyrin 2, neuronal 2.0 1.E-05
212956_at KIAA0882 KIAA0882 protein 3.6 1.E-05
211919_s_at CXCR4 Chemokine (C-X-C motif) receptor 4 3.6 1.E-05
218807_at VAV3 Vav 3 oncogene 2.4 1.E-05
37892_at COL11A1 Collagen, type XI, alpha 1 -4.1 1.E-05
211237_s_at FGFR4 Fibroblast growth factor receptor 4 2.5 1.E-05
209201_x_at CXCR4 Chemokine (C-X-C motif) receptor 4 3.6 2.E-05
206404_at FGF9 Fibroblast growth factor 9 (glia-activating factor) -1.7 2.E-05
222146_s_at TCF4 Transcription factor 4 -1.8 2.E-05
204914_s_at SOX11 SRY (sex determining region Y)-box 11 -2.6 2.E-05
203178_at GATM Glycine amidinotransferase (L-arginine:glycine amidinotransferase) 1.9 2.E-05
212385_at TCF4 Transcription factor 4 -1.5 2.E-05
218806_s_at VAV3 Vav 3 oncogene 2.1 3.E-05
203753_at TCF4 Transcription factor 4 -1.8 4.E-05
212680_x_at PPP1R14B Protein phosphatase 1, regulatory (inhibitor) subunit 14B 1.7 4.E-05
210102_at LOH11CR2A Loss of heterozygosity, 11, chromosomal region 2, gene A 2.5 4.E-05
209168_at GPM6B Glycoprotein M6B 2.6 5.E-05
204320_at COL11A1 Collagen, type XI, alpha 1 -2.4 5.E-05
214608_s_at EYA1 Eyes absent homolog 1 (Drosophila) 2.0 5.E-05
207145_at GDF8 Growth differentiation factor 8 -3.0 5.E-05
205087_at RWDD3 RWD domain containing 3 1.6 6.E-05
202643_s_at TNFAIP3 Tumor necrosis factor, alpha-induced protein 3 1.8 6.E-05
202644_s_at TNFAIP3 Tumor necrosis factor, alpha-induced protein 3 2.0 7.E-05
204579_at FGFR4 Fibroblast growth factor receptor 4 2.7 7.E-05
202039_at TIAF1 TGFB1-induced anti-apoptotic factor 1 2.2 8.E-05
201889_at FAM3C Family with sequence similarity 3, member C 1.7 9.E-05
218273_s_at PPM2C Protein phosphatase 2C, magnesium-dependent, catalytic subunit -1.6 9.E-05
218832_x_at ARRB1 Arrestin, beta 1 1.7 1.E-04
212386_at TCF4 Transcription factor 4 -1.9 1.E-04
219718_at FLJ10986 Hypothetical protein FLJ10986 2.3 1.E-04
203510_at MET Met proto-oncogene (hepatocyte growth factor receptor) 2.6 1.E-04
202920_at ANK2 Ankyrin 2, neuronal 2.5 1.E-04
221861_at MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) 2.6 1.E-04
209708_at MOXD1 Monooxygenase, DBH-like 1 -2.0 1.E-04
216733_s_at GATM Glycine amidinotransferase (L-arginine:glycine amidinotransferase) 2.7 1.E-04
218162_at OLFML3 Olfactomedin-like 3 -2.4 1.E-04
220161_s_at EPB41L4B Erythrocyte membrane protein band 4.1 like 4B 1.8 2.E-04
205163_at HUMMLC2B Myosin light chain 2 -3.6 2.E-04
49111_at MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) 2.6 2.E-04
219377_at C18orf11 Chromosome 18 open reading frame 11 2.0 2.E-04
43511_s_at MRNA; cDNA DKFZp762M127 (from clone DKFZp762M127) 2.7 2.E-04
207571_x_at C1orf38 Chromosome 1 open reading frame 38 2.2 2.E-04
209655_s_at TM4SF10 Transmembrane 4 superfamily member 10 2.0 2.E-04
210785_s_at C1orf38 Chromosome 1 open reading frame 38 2.3 3.E-04
209608_s_at ACAT2 Acetyl-Coenzyme A acetyltransferase 2 (acetoacetyl Coenzyme A thiolase) 2.2 3.E-04
212382_at TCF4 Transcription factor 4 -1.8 3.E-04
212977_at CMKOR1 Chemokine orphan receptor 1 4.3 3.E-04
209170_s_at GPM6B Glycoprotein M6B 3.0 4.E-04
212387_at TCF4 Transcription factor 4 -1.8 4.E-04
203725_at GADD45A Growth arrest and DNA-damage-inducible, alpha 1.8 4.E-04
206662_at GLRX Glutaredoxin (thioltransferase) -1.6 4.E-04
202188_at NUP93 Nucleoporin 93kDa 1.6 4.E-04
204105_s_at NRCAM Neuronal cell adhesion molecule 2.8 5.E-04
216511_s_at TCF7L2 Transcription factor 7-like 2 (T-cell specific, HMG-box) 1.8 5.E-04
205330_at MN1 Meningioma (disrupted in balanced translocation) 1 2.0 5.E-04
219304_s_at PDGFD Platelet derived growth factor D 2.1 5.E-04
209121_x_at NR2F2 Nuclear receptor subfamily 2, group F, member 2 1.5 5.E-04
203680_at PRKAR2B Protein kinase, cAMP-dependent, regulatory, type II, beta 2.0 5.E-04
213061_s_at NTAN1 N-terminal asparagine amidase 1.6 6.E-04
209120_at NR2F2 Nuclear receptor subfamily 2, group F, member 2 2.5 7.E-04
216044_x_at LOC388650 Hypothetical LOC388650 1.8 7.E-04
203961_at NEBL Nebulette 2.2 7.E-04
203962_s_at NEBL Nebulette 3.0 8.E-04
204850_s_at DCX Doublecortex; lissencephaly, X-linked (doublecortin) 3.7 9.E-04
205389_s_at ANK1 Ankyrin 1, erythrocytic 1.7 9.E-04
201464_x_at JUN V-jun sarcoma virus 17 oncogene homolog (avian) 2.2 9.E-04
218149_s_at ZNF395 Zinc finger protein 395 1.8 1.E-03
221123_x_at ZNF395 Zinc finger protein 395 1.6 1.E-03
39248_at AQP3 Aquaporin 3 2.4 1.E-03
* Geometric mean fold-difference between PAX-FKHR positive and fusion-negative RMS cell lines.
† p-value from a simple t-test comparing PAX-FKHR positive versus fusion-negative RMS cell lines.
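The footnote does not say which form of t-test was used. As a sketch only, the statistic for a two-sample Student t-test with pooled variance can be computed as below. The pooled-variance form, the function name, and the example log2 signals are all assumptions made for illustration. Converting the statistic to a p-value requires the t distribution, for example via `scipy.stats.ttest_ind`, which is not used here so that the block stays dependency-free.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance across both groups.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical log2 signals for one probe set (illustrative only).
fusion_positive = [10.0, 11.0, 12.0]
fusion_negative = [7.0, 8.0, 9.0]
t_stat, dof = student_t(fusion_positive, fusion_negative)
```

Running the test on log-transformed signals is a common choice for microarray data because it makes the group variances more comparable, but whether the thesis did so is not stated in this table.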
Supplementary Table 10. Gene ontology categories over-represented in PAX-FKHR vs. vector control gene lists as determined by EASE analysis
Columns: System; Gene Category; PAX-FKHR (down-regulated) Number of Genes*; PAX-FKHR (down-regulated) EASE Score†; PAX-FKHR (up-regulated) Number of Genes; PAX-FKHR (up-regulated) EASE Score
GO Biological Process cellular process 31 0.053 162 1.2E-09
GO Biological Process cell communication 16 0.17 98 3.8E-09
GO Molecular Function protein binding 5 0.90 56 1.8E-06
GO Biological Process morphogenesis 12 0.003 42 1.5E-05
GenMAPP pathway Hs_TGF Beta Signaling Pathway 0 - 9 1.9E-05
GO Biological Process development 15 0.007 57 1.1E-04
GO Biological Process signal transduction 12 0.31 68 2.6E-04
GO Biological Process organogenesis 11 0.004 34 7.4E-04
GO Molecular Function glycosaminoglycan binding 0 - 8 0.001
GO Molecular Function heparin binding 0 - 7 0.001
GO Cellular Component extracellular 7 0.35 39 0.001
GO Biological Process cell adhesion 4 0.40 23 0.002
GO Biological Process regulation of cell proliferation 2 0.65 14 0.002
GO Molecular Function growth factor activity 1 1 10 0.002
GO Molecular Function structural molecule activity 8 0.026 26 0.002
GO Biological Process regulation of cellular process 3 0.40 16 0.003
GO Biological Process regulation of biological process 3 0.40 16 0.003
GO Molecular Function transcription cofactor activity 0 - 13 0.003
GO Biological Process regulation of transcription from Pol II promoter 1 1 13 0.004
GO Biological Process cell proliferation 7 0.27 35 0.004
Chromosome Homo sapiens 2p 4 0.14 15 0.004
GO Biological Process death 3 0.52 18 0.004
GO Cellular Component extracellular matrix 3 0.33 14 0.005
GO Molecular Function transcription factor binding 1 1 13 0.006
Chromosome Homo sapiens 2 10 0.005 27 0.007
GO Molecular Function extracellular matrix structural constituent 1 1 7 0.008
GO Biological Process cell differentiation 1 1 10 0.009
GO Biological Process cell death 3 0.51 17 0.009
GO Biological Process enzyme linked receptor protein signaling pathway 1 1 10 0.011
GO Biological Process extracellular matrix organization and biogenesis 2 0.092 4 0.011
GO Biological Process extracellular structure organization and biogenesis 2 0.092 4 0.011
GO Molecular Function signal transducer activity 10 0.49 54 0.012
GO Biological Process regulation of cell cycle 2 0.81 16 0.013
GO Molecular Function cell adhesion molecule activity 1 1 14 0.013
GO Molecular Function receptor signaling protein activity 0 - 10 0.017
GO Molecular Function receptor binding 3 0.63 18 0.017
GO Biological Process negative regulation of cell proliferation 0 - 8 0.019
GO Biological Process apoptosis 3 0.48 15 0.024
GO Biological Process programmed cell death 3 0.48 15 0.025
GO Biological Process mesoderm development 0 - 4 0.027
Chromosome Homo sapiens 9q 0 - 13 0.027
GO Biological Process cell migration 0 - 5 0.027
GO Molecular Function cytoskeletal protein binding 1 1 12 0.029
GO Biological Process phosphate metabolism 7 0.057 22 0.030
GO Biological Process phosphorus metabolism 7 0.057 22 0.030
GO Molecular Function metalloendopeptidase activity 0 - 6 0.030
GO Biological Process bone remodeling 0 - 4 0.032
GO Biological Process ossification 0 - 4 0.032
GO Molecular Function transmembrane receptor protein tyrosine kinase activity 0 - 5 0.035
GO Molecular Function small monomeric GTPase activity 0 - 7 0.036
GO Biological Process neurogenesis 3 0.51 15 0.038
GO Biological Process intracellular signaling cascade 4 0.63 24 0.039
GO Biological Process positive regulation of neuron differentiation 0 - 2 0.041
GO Molecular Function growth factor binding 1 1 4 0.046
Chromosome Homo sapiens 8 3 0.53 15 0.046
GO Biological Process neurotransmitter metabolism 0 - 3 0.046
GO Biological Process transcription from Pol II promoter 1 1 16 0.046
GO Biological Process protein amino acid phosphorylation 6 0.049 15 0.11
GO Molecular Function calcium ion binding 8 0.007 15 0.16
GO Biological Process cell motility 6 0.011 10 0.21
GO Molecular Function metal ion binding 10 0.037 26 0.23
Chromosome Homo sapiens 2q 6 0.045 12 0.34
GO Cellular Component actin cytoskeleton 6 0.007 8 0.38
GO Biological Process muscle development 6 2.3E-04 4 0.52
GO Molecular Function structural constituent of muscle 4 0.001 2 0.61
GO Cellular Component cytoskeleton 9 0.025 16 0.77
GO Biological Process muscle contraction 5 0.003 2 0.95
Chromosome Homo sapiens 20 5 0.049 4 0.96
GO Cellular Component muscle fiber 3 0.027 1 1
GO Biological Process regulation of striated muscle contraction 2 0.029 0 -
GO Biological Process striated muscle contraction 3 0.007 0 -
GO Biological Process regulation of muscle contraction 3 0.006 0 -
* Refers to number of genes from analyzed list found in each GO category.
† EASE score is a conservative adjustment to the Fisher exact probability test.
Note: Categories highlighted in bold are considered statistically significant.
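The "conservative adjustment" in the EASE score footnote is commonly described as a jackknifed Fisher exact probability: one gene is removed from the category hits in the list before the one-tailed probability is computed. The sketch below follows that description. Removing the gene from both the hit count and the list size is an assumption about the exact table construction, the function names are hypothetical, and the example counts are invented. It does reproduce one feature of the table, namely that a category supported by a single gene gets a score of 1.

```python
from math import comb


def fisher_upper_tail(hits, list_size, category_size, population_size):
    """One-tailed Fisher exact probability of at least `hits` category genes
    in a list of `list_size` genes drawn from `population_size` genes, of
    which `category_size` belong to the category (hypergeometric upper tail)."""
    total = comb(population_size, list_size)
    upper = min(list_size, category_size)
    tail = sum(
        comb(category_size, k) * comb(population_size - category_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def ease_score(hits, list_size, category_size, population_size):
    """Jackknifed Fisher probability: drop one category gene from the list."""
    if hits < 1:
        return 1.0
    return fisher_upper_tail(hits - 1, list_size - 1, category_size, population_size)


# Hypothetical counts: 3 of 300 list genes in a 40-gene category, 30000 genes overall.
plain_p = fisher_upper_tail(3, 300, 40, 30000)
ease_p = ease_score(3, 300, 40, 30000)
```

Because the jackknife penalizes categories supported by only a few genes, the EASE score is always at least as large as the plain Fisher probability for the same counts.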
Supplementary Table 11. Ingenuity Pathways Analysis of biological networks
Columns: Network; Genes; IPA Score*; Focus Genes†; EASE Top Functions‡
Network 1
Genes: CAD, DAD1, DIO3, DUSP4, EGF, ELK3, EPHA2, FGF2, FGF6, FGFR4, GADD45A, GAS2, GLG1, HAS2, IGFBP3, MAPK1, MYCN, NEF3, NEFH, P8, PBK, PLAGL1, PLK2, PSEN2, RGS16, RPL11, RPS6KB2, SCG2, SPRY1, SPRY4, TCF7L2, TNFRSF12A, TP53, TTF1, ZFP36L2
IPA Score: 21
Focus Genes: 13
EASE Top Functions: Cell Cycle, Cell Proliferation, Apoptosis, Angiogenesis
Network 2
Genes: ALK, ANK2, ARG2, ARL7, ATP2A1, CCL24, CCR8, CD58, CD1C, CLEC7A, CMKOR1, CYSLTR1, EFS, FYN, IL2, IL4, IL13, IL13RA2, IL4R, INPP1, IRS2, ITPR1, JAK3, LTA4H, MAOA, MAP1LC3B, MCAM, MN1, POU4F1, RAC1, RNF128, SCN3A, SWAP70, SYNGR2, TIAF1
IPA Score: 21
Focus Genes: 13
EASE Top Functions: Cell Communication, Cell Defense, Immune Response, Cell Adhesion
Network 3
Genes: ADAM10, BDKRB1, CCL2, CD9, COL18A1, COL1A1, COL5A3, CXCL16, DSG1, EFNB2, FBLN1, FBLN2, FN1, ITGB6, JUP, LCN2, LTBP2, MET, MMP16, NID, NRCAM, P4HB, PRKCA, RANBP9, RGS2, SDC4, SGK, SKP2, SULF1, TAX1BP1, TNF, TNFAIP3, TNFRSF21, TNIP2, TRAM2
IPA Score: 21
Focus Genes: 13
EASE Top Functions: Extracellular Matrix, Cell Adhesion, Apoptosis, Collagen
Network 4
Genes: ACTA1, AGTR2, AKT2, ASS, BGLAP, CBX5, CDK7, CDK9, CKM, DHFR, ETV3, GDF8, HBP1, HDAC5, HUMMLC2B, ID1, ID2, IFI16, IGF1, KLF6, MEF2A, MKI67, MYOD1, MYOG, PAX5, PHLDA1, PKIA, PRKAR2B, RB1, RBBP6, RPS6KA2, SKIIP, TNNC2, TWIST1, YWHAE
IPA Score: 15
Focus Genes: 10
EASE Top Functions: Muscle Development, Cell Differentiation, Regulation of Transcription, Regulation of Cell Cycle
* The score indicates the likelihood of the focus genes in a given network being found together by chance alone. A
score of 3 indicates that there is a 1 in 1000 chance that the focus genes were assembled randomly into a network.
† PAX-FKHR expression signature genes were defined as Focus Genes, and are indicated in bold.
‡ From EASE analysis (see Supplemental Tables 8-11)
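The footnote on the IPA score (a score of 3 corresponds to a 1 in 1000 chance) is consistent with the score being the negative base-10 logarithm of a probability. A minimal sketch of that conversion follows. Treating the relation as exact for every score is an assumption, and the function names are hypothetical.

```python
import math


def ipa_score_to_probability(score):
    """Chance probability implied by a score, assuming score = -log10(p)."""
    return 10.0 ** (-score)


def probability_to_ipa_score(probability):
    """Score implied by a chance probability, assuming score = -log10(p)."""
    return -math.log10(probability)


# Under this assumption, a network score of 21 implies a probability of about 1e-21.
network_probability = ipa_score_to_probability(21)
```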
Supplementary Table 12. Genes allocated to the PAX-FKHR signature gene networks by the IPA tool
Name Description Fold-Change* p-value Network Protein Family
ACTA1 actin, alpha 1, skeletal muscle 2.8 0 4 other
ADAM10 a disintegrin and metalloproteinase domain 10 2.7 0 3 peptidase
AGTR2 angiotensin II receptor, type 2 -1.3 0 4 G-protein coupled receptor
AKT2 v-akt murine thymoma viral oncogene homolog 2 -1.0 0.77 4 kinase
ALK anaplastic lymphoma kinase (Ki-1) 3.3 0 2 kinase
ANK2 ankyrin 2, neuronal 2.7 0 2 other
ARG2 arginase, type II -1.1 0.01 2 enzyme
ARL7 ADP-ribosylation factor-like 7 1.6 0 2 enzyme
ASS argininosuccinate synthetase 15.8 0 4 enzyme
ATP2A1 ATPase, Ca++ transporting, cardiac muscle, fast twitch 1 -1.6 0.001 2 enzyme
BDKRB1 bradykinin receptor B1 -1.1 0.007 3 G-protein coupled receptor
BGLAP bone gamma-carboxyglutamate (gla) protein (osteocalcin) -1.2 0.001 4 other
CAD carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase -1.3 0 1 enzyme
CBX5 chromobox homolog 5 (HP1 alpha homolog, Drosophila) 1.1 0.11 4 other
CCL2 chemokine (C-C motif) ligand 2 -1.5 0 3 cytokine
CCL24 chemokine (C-C motif) ligand 24 -1.0 0.14 2 cytokine
CCR8 chemokine (C-C motif) receptor 8 -1.0 0.34 2 G-protein coupled receptor
CD9 CD9 antigen (p24) 1.7 0 3 other
CD58 CD58 antigen, (lymphocyte function-associated antigen 3) -1.1 0.18 2 other
CD1C CD1C antigen, c polypeptide -1.0 0.074 2 other
CDK7 cyclin-dependent kinase 7 (MO15 homolog, Xenopus laevis, cdk-activating 1.1 0.24 4 kinase
CDK9
cyclin-dependent kinase 9 (CDC2-
related kinase)
-1.0 0.83 4 kinase
CKM creatine kinase, muscle -1.0 0.96 4 kinase
CLEC7A
C-type lectin domain family 7, member
A
-1.1 0.031 2
transmembrane
receptor
CMKOR1 chemokine orphan receptor 1 2.3 0 2
G-protein coupled
receptor
COL18A1 collagen, type XVIII, alpha 1 2.2 0 3 other
COL1A1 collagen, type I, alpha 1 -1.0 0.34 3 other
COL5A3 collagen, type V, alpha 3 -1.1 0.22 3 other
CXCL16 chemokine (C-X-C motif) ligand 16 3 cytokine
CYSLTR1 cysteinyl leukotriene receptor 1 1.0 0.45 2
G-protein coupled
receptor
DAD1 defender against cell death 1 -1.0 0.84 1 other
DHFR dihydrofolate reductase 2.0 0 4 enzyme
DIO3 deiodinase, iodothyronine, type III -1.1 0.006 1 enzyme
DSG1 desmoglein 1 -1.0 0.11 3 other
DUSP4 dual specificity phosphatase 4 -1.9 0 1 phosphatase
EFNB2 ephrin-B2 2.0 0 3 other
EFS embryonal Fyn-associated substrate -1.7 0 2 other
EGF
epidermal growth factor (beta-
urogastrone)
-1.0 0.099 1 growth factor
ELK3
ELK3, ETS-domain protein (SRF
accessory protein 2)
-1.9 0 1
transcription
regulator
EPHA2 EPH receptor A2 -1.1 0.037 1 kinase
ETV3 ets variant gene 3 -1.0 0.61 4
transcription
regulator
FBLN1 fibulin 1 -2.1 0 3 other
FBLN2 fibulin 2 -1.1 0.052 3 other
FGF2 fibroblast growth factor 2 (basic) -1.1 0 1 growth factor
FGF6 fibroblast growth factor 6 -1.0 0.81 1 growth factor
FGFR4 fibroblast growth factor receptor 4 2.1 0 1 kinase
FN1 fibronectin 1 1.3 0.022 3 enzyme
FYN
FYN oncogene related to SRC, FGR,
YES
-1.2 0.016 2 kinase
GADD45A
growth arrest and DNA-damage-
inducible, alpha
1.7 0 1 other
GAS2 growth arrest-specific 2 -6.7 0 1 other
GDF8 growth differentiation factor 8 -2.2 0 4 growth factor
GLG1 golgi apparatus protein 1 -1.1 0.054 1 other
HAS2 hyaluronan synthase 2 -1.5 0 1 enzyme
HBP1 HMG-box transcription factor 1 -1.1 0.25 4
transcription
regulator
HDAC5 histone deacetylase 5 1.8 0 4
transcription
regulator
HUMMLC2B fast skeletal myosin light chain 2 -3.3 0 4 other
ID1
inhibitor of DNA binding 1, dominant
negative helix-loop-helix protein
-1.5 0.001 4 other
ID2
inhibitor of DNA binding 2, dominant
negative helix-loop-helix protein
-1.9 0 4 other
IFI16 interferon, gamma-inducible protein 16 1.6 0 4
transcription
regulator
IGF1
insulin-like growth factor 1
(somatomedin C)
-1.5 0.008 4 growth factor
IGFBP3
insulin-like growth factor binding
protein 3
-1.8 0 1 other
IL2 interleukin 2 -1.0 0.17 2 cytokine
IL4 interleukin 4 -1.0 0.58 2 cytokine
IL13 interleukin 13 -1.0 0.43 2 cytokine
IL13RA2 interleukin 13 receptor, alpha 2 -1.1 0.27 2
transmembrane
receptor
IL4R interleukin 4 receptor 1.8 0 2
transmembrane
receptor
INPP1 inositol polyphosphate-1-phosphatase 1.6 0 2 phosphatase
IRS2 insulin receptor substrate 2 1.5 0 2 other
ITGB6 integrin, beta 6 -1.1 0.2 3 other
ITPR1
inositol 1,4,5-triphosphate receptor, type
1
-1.1 0.25 2 ion channel
JAK3
Janus kinase 3 (a protein tyrosine kinase,
leukocyte)
1.0 0.42 2 kinase
JUP junction plakoglobin 1.0 0.56 3 other
KLF6 Kruppel-like factor 6 -1.3 0 4
transcription
regulator
LCN2 lipocalin 2 (oncogene 24p3) -1.2 0.022 3 transporter
LTA4H leukotriene A4 hydrolase -1.2 0.026 2 enzyme
LTBP2
latent transforming growth factor beta
binding protein 2
1.3 0.003 3 other
MAOA monoamine oxidase A -1.0 0.63 2 enzyme
MAP1LC3B
microtubule-associated protein 1 light
chain 3 beta
1.5 0 2 other
MAPK1 mitogen-activated protein kinase 1 -1.2 0.006 1 kinase
MCAM melanoma cell adhesion molecule 1.8 0 2 other
MEF2A
MADS box transcription enhancer factor
2, polypeptide A (myocyte enhancer
factor 2A)
-1.1 0.018 4
transcription
regulator
MET
met proto-oncogene (hepatocyte
growth factor receptor)
2.1 0 3 kinase
MKI67
antigen identified by monoclonal
antibody Ki-67
1.0 0.5 4 other
MMP16
matrix metalloproteinase 16 (membrane-
inserted)
-1.4 0 3 peptidase
MN1
meningioma (disrupted in balanced
translocation) 1
2.3 0 2 other
MYCN
v-myc myelocytomatosis viral related
oncogene, neuroblastoma derived
(avian)
2.7 0 1
transcription
regulator
MYOD1 myogenic factor 3 1.8 0 4
transcription
regulator
MYOG myogenin (myogenic factor 4) 2.2 0 4
transcription
regulator
NEF3 neurofilament 3 (150kDa medium) -3.0 0 1 other
NEFH
neurofilament, heavy polypeptide
200kDa
-1.5 0 1 other
NID nidogen 1 -1.9 0 3 other
NRCAM neuronal cell adhesion molecule 3.1 0 3 other
P8 p8 protein (candidate of metastasis 1) -1.1 0.044 1
transcription
regulator
P4HB
procollagen-proline, 2-oxoglutarate 4-
dioxygenase (proline 4-hydroxylase),
beta polypeptide (protein disulfide
isomerase-associated 1)
-1.0 0.87 3 enzyme
PAX5
paired box gene 5 (B-cell lineage
specific activator)
1.8 0 4
transcription
regulator
PBK PDZ binding kinase 3.4 0 1 kinase
PHLDA1
pleckstrin homology-like domain,
family A, member 1
-2.3 0 4 other
PKIA
protein kinase (cAMP-dependent,
catalytic) inhibitor alpha
-1.7 0 4 other
PLAGL1 pleiomorphic adenoma gene-like 1 -1.4 0.022 1
transcription
regulator
PLK2 polo-like kinase 2 (Drosophila) -3.8 0 1 kinase
POU4F1
POU domain, class 4, transcription
factor 1
4.2 0 2
transcription
regulator
PRKAR2B
protein kinase, cAMP-dependent,
regulatory, type II, beta
3.1 0 4 kinase
PRKCA protein kinase C, alpha 1.5 0 3 kinase
PSEN2 presenilin 2 (Alzheimer disease 4) 2.1 0 1 other
RAC1
ras-related C3 botulinum toxin substrate
1 (rho family, small GTP binding protein
Rac1)
-1.2 0.092 2 enzyme
RANBP9 RAN binding protein 9 1.1 0.23 3 other
RB1
retinoblastoma 1 (including
osteosarcoma)
1.1 0.19 4
transcription
regulator
RBBP6 retinoblastoma binding protein 6 1.1 0.43 4 other
RGS2
regulator of G-protein signalling 2,
24kDa
-2.2 0 3 other
RGS16 regulator of G-protein signalling 16 -1.5 0 1 other
RNF128 ring finger protein 128 1.0 0.29 2 enzyme
RPL11 ribosomal protein L11 1.3 0 1 other
RPS6KA2
ribosomal protein S6 kinase, 90kDa,
polypeptide 2
-1.1 0.16 4 kinase
RPS6KB2
ribosomal protein S6 kinase, 70kDa,
polypeptide 2
-1.1 0.004 1 kinase
SCG2 secretogranin II (chromogranin C) 1.2 0.045 1 other
SCN3A
sodium channel, voltage-gated, type III,
alpha
1.3 0.01 2 ion channel
SDC4 syndecan 4 (amphiglycan, ryudocan) -1.1 0.097 3 other
SGK serum/glucocorticoid regulated kinase 2.0 0 3 kinase
SKIIP SKI interacting protein 1.2 0.004 4
transcription
regulator
SKP2
S-phase kinase-associated protein 2
(p45)
2.3 0 3 other
SPRY1
sprouty homolog 1, antagonist of FGF
signaling (Drosophila)
-1.4 0 1 other
SPRY4 sprouty homolog 4 (Drosophila) -1.6 0 1 other
SULF1 sulfatase 1 2.2 0 3 enzyme
SWAP70 SWAP-70 protein 1.2 0 2 other
SYNGR2 synaptogyrin 2 1.2 0.024 2 other
TAX1BP1
Tax1 (human T-cell leukemia virus type
I) binding protein 1
-1.2 0.025 3 other
TCF7L2
transcription factor 7-like 2 (T-cell
specific, HMG-box)
1.9 0 1
transcription
regulator
TIAF1
TGFB1-induced anti-apoptotic factor
1
2.2 0 2 other
TNF
tumor necrosis factor (TNF superfamily,
member 2)
-1.0 0.6 3 cytokine
TNFAIP3
tumor necrosis factor, alpha-induced
protein 3
1.9 0 3 other
TNFRSF21
tumor necrosis factor receptor
superfamily, member 21
-1.2 0 3 other
TNFRSF12A
tumor necrosis factor receptor
superfamily, member 12A
1.0 0.98 1 other
TNIP2 TNFAIP3 interacting protein 2 -1.0 0.28 3 other
TNNC2 troponin C2, fast -2.5 0.001 4 other
TP53
tumor protein p53 (Li-Fraumeni
syndrome)
-1.1 0.097 1
transcription
regulator
TRAM2
translocation associated membrane
protein 2
1.5 0 3 other
TTF1
transcription termination factor, RNA
polymerase I
1.0 0.64 1
transcription
regulator
TWIST1
twist homolog 1 (acrocephalosyndactyly
3; Saethre-Chotzen syndrome)
(Drosophila)
-1.4 0 4
transcription
regulator
YWHAE
tyrosine 3-monooxygenase/tryptophan 5-
monooxygenase activation protein,
epsilon polypeptide
1.7 0 4 other
ZFP36L2 zinc finger protein 36, C3H type-like 2 -2.2 0 1
transcription
regulator
* Mean fold-change between mARMS and mERMS primary tumors (i.e. PAX-FKHR expressing versus non-expressing tumors).
Note: PAX-FKHR expression signature genes (i.e. Focus Genes) are highlighted in bold.
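The negative entries in the Fold-Change column suggest a signed convention, in which a ratio below 1 is reported as the negative of its reciprocal. The table footnote does not spell this out, so the sketch below is an assumption rather than the procedure used in the thesis; `signed_fold_change` and its arguments are hypothetical names.

```python
def signed_fold_change(mean_a: float, mean_b: float) -> float:
    """Signed fold-change of group A over group B.

    ASSUMED convention: ratios >= 1 are reported as-is, and ratios < 1
    are reported as the negative reciprocal (0.5 becomes -2.0).
    Inputs are mean expression values on a linear (not log) scale.
    """
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("mean expression values must be positive")
    ratio = mean_a / mean_b
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical means for PAX-FKHR expressing vs non-expressing tumors.
print(signed_fold_change(280.0, 100.0))  # higher in expressing tumors
print(signed_fold_change(100.0, 200.0))  # lower in expressing tumors
```

Under this convention no value falls strictly between -1 and 1, so entries printed as -1.0 or 1.0 would presumably be ratios close to 1 after rounding.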
Supplementary Figure 4A (see legend below)
Supplementary Figure 4B (see legend below)
Supplementary Figure 4C (see legend below)
Supplementary Figure 4D (see legend below)
Supplementary Figure 4 Ingenuity Pathways Analysis of the PAX-FKHR
expression signature. A: Network 1; B: Network 2; C: Network 3; D: Network 4.
The networks are graphed as nodes (genes) and edges (biological relationships).
Network nodes are color coded as either up- (red) or down-regulated (blue). The
distance between nodes is related to the number of literature reports confirming an
interaction between the nodes (i.e., the shorter the distance, the greater the number of
reports). The edges are labeled indicating the type of interaction (i.e., B, binding; P,
phosphorylation; E, expression). The shape of the node represents the function of
the protein (i.e., circle, transcription factor; see Supplementary Table 12 for the
specific details of each protein in each network).
Supplementary Table 13. Gene ontology categories over-represented in IPA
Network 1 as determined by EASE analysis
System  Gene Category  Number of Genes*  EASE Score†
GO Biological Process cell cycle 14 3.94E-08
GO Biological Process cell proliferation 15 6.23E-07
GO Biological Process apoptosis 9 1.20E-05
GO Biological Process programmed cell death 9 1.23E-05
GO Biological Process regulation of cell cycle 9 1.40E-05
GO Biological Process cell death 9 1.92E-05
GO Biological Process death 9 2.06E-05
organismal role Cell death/Apoptosis 8 3.02E-05
SwissProt keyword Phosphorylation 13 1.58E-04
GO Biological Process cell growth and/or maintenance 21 4.97E-04
PIR keyword phosphoprotein 8 1.53E-03
GO Biological Process cellular process 27 1.55E-03
KEGG pathway Amyotrophic lateral sclerosis (ALS) - Homo sapiens 3 2.53E-03
GO Biological Process regulation of apoptosis 5 2.96E-03
SwissProt keyword Angiogenesis 3 3.21E-03
GO Biological Process
transmembrane receptor protein tyrosine kinase
signaling pathway
4 4.82E-03
GO Biological Process induction of apoptosis 4 5.44E-03
GO Biological Process positive regulation of apoptosis 4 5.44E-03
GO Biological Process regulation of programmed cell death 4 5.44E-03
GO Biological Process positive regulation of programmed cell death 4 5.44E-03
GO Biological Process induction of programmed cell death 4 5.44E-03
GO Biological Process angiogenesis 3 5.83E-03
GO Biological Process blood vessel development 3 6.44E-03
GO Biological Process MAPKKK cascade 3 0.009
GO Biological Process cell cycle arrest 3 0.011
GO Biological Process phosphate metabolism 7 0.013
GO Biological Process phosphorus metabolism 7 0.013
GO Biological Process protein amino acid phosphorylation 6 0.013
GO Molecular
Function
protein kinase activity 6 0.014
KEGG pathway Neurodegenerative Disorders - Homo sapiens 4 0.014
GO Biological Process enzyme linked receptor protein signaling pathway 4 0.016
GO Molecular
Function
protein serine/threonine kinase activity 5 0.019
GO Biological Process phosphorylation 6 0.019
SwissProt keyword Growth arrest 2 0.019
GO Biological Process protein modification 8 0.021
GO Cellular
Component
neurofilament 2 0.021
GO Biological Process signal transduction 13 0.022
GO Molecular
Function
phosphotransferase activity, alcohol group as acceptor 6 0.027
GO Molecular
Function
heavy metal binding 2 0.030
GO Molecular
Function
protein-tyrosine kinase activity 4 0.035
PIRaln fibroblast growth factor 2 0.040
GO Biological Process FGF receptor signaling pathway 2 0.042
Interpro Eukaryotic protein kinase 5 0.044
GO Biological Process activation of MAPK 2 0.047
PIR superfamily fibroblast growth factor 2 0.049
* refers to number of genes from analyzed list found in each GO category
† EASE score is a conservative adjustment to the Fisher exact probability test
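As a rough illustration of the footnote above, an EASE-style score can be sketched as a one-sided Fisher exact (hypergeometric) tail probability computed after removing one gene from the category hits, which makes the result more conservative. The bookkeeping below (list total left unchanged) is a simplifying assumption and may differ from the EASE software; all function and parameter names are illustrative.

```python
from math import comb


def fisher_upper_tail(list_hits: int, list_total: int,
                      pop_hits: int, pop_total: int) -> float:
    """P(X >= list_hits) for a hypergeometric draw of list_total genes
    from a population of pop_total genes, pop_hits of which are in the
    category (one-sided Fisher exact test for over-representation)."""
    max_hits = min(list_total, pop_hits)
    denom = comb(pop_total, list_total)
    tail = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / denom


def ease_like_score(list_hits: int, list_total: int,
                    pop_hits: int, pop_total: int) -> float:
    """Conservative variant: penalize the category by one hit before
    computing the tail probability (ASSUMED form of the adjustment)."""
    penalized = max(list_hits - 1, 0)
    return fisher_upper_tail(penalized, list_total, pop_hits, pop_total)


# Hypothetical counts: 14 of 35 network genes fall in a category that
# contains 500 of 15000 genes in the background population.
print(fisher_upper_tail(14, 35, 500, 15000))
print(ease_like_score(14, 35, 500, 15000))
```

Because one hit is removed, the adjusted score is never smaller than the plain Fisher probability, and a category supported by a single gene always scores 1.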
Supplementary Table 14. Gene ontology categories over-represented in IPA
Network 2 as determined by EASE analysis
System  Gene Category  Number of Genes*  EASE Score†
GO Biological Process cell communication 20 1.40E-05
GO Biological Process defense response 11 1.97E-05
GO Biological Process response to biotic stimulus 11 3.96E-05
GO Biological Process signal transduction 17 5.47E-05
GO Biological Process immune response 10 5.95E-05
GO Biological Process response to pest/pathogen/parasite 8 1.55E-04
GO Molecular
Function
cytokine binding 4 5.06E-04
GO Biological Process humoral immune response 5 1.04E-03
GO Biological Process response to external stimulus 11 1.39E-03
GO Biological Process cellular process 25 1.61E-03
GO Biological Process antimicrobial humoral response (sensu Vertebrata) 4 1.65E-03
GO Biological Process antimicrobial humoral response 4 1.76E-03
organismal role Anti-pathogen response 10 2.77E-03
GO Biological Process taxis 4 3.32E-03
GO Biological Process chemotaxis 4 3.32E-03
GO Molecular
Function
signal transducer activity 13 4.25E-03
GO Biological Process response to stress 8 4.36E-03
GO Biological Process humoral defense mechanism (sensu Vertebrata) 4 4.97E-03
GO Biological Process response to wounding 5 5.45E-03
GO Cellular
Component
integral to membrane 15 7.30E-03
Interpro Interleukins -4 and -13 2 8.17E-03
GO Biological Process inflammatory response 4 1.06E-02
GO Biological Process innate immune response 4 1.20E-02
organismal role Cell migration/motility 5 0.012
GO Biological Process calcium ion transport 3 0.012
GO Biological Process cell adhesion 6 0.015
GO Biological Process intracellular signaling cascade 7 0.016
GO Biological Process response to chemical substance 4 0.019
GO Biological Process di-, tri-valent inorganic cation transport 3 0.026
SwissProt keyword Transmembrane 13 0.027
SwissProt keyword 3D-structure 9 0.029
GO Biological Process metal ion transport 4 0.041
GO Molecular
Function
receptor binding 5 0.043
GO Cellular
Component
integral to plasma membrane 8 0.044
EC number 2.7.1.112 3 0.045
GO Molecular
Function
purine nucleotide binding 8 0.047
SwissProt keyword Tyrosine-protein kinase 3 0.047
GO Molecular
Function
receptor activity 8 0.047
GO Molecular
Function
nucleotide binding 8 0.049
Supplementary Table 15. Gene ontology categories over-represented in IPA
Network 3 as determined by EASE analysis
System  Gene Category  Number of Genes*  EASE Score†
GO Cellular
Component
extracellular matrix 9 6.30E-07
SwissProt keyword Signal 19 7.58E-07
GO Molecular
Function
extracellular matrix structural constituent 6 2.44E-06
SwissProt keyword Glycoprotein 19 5.23E-06
GO Biological Process cell adhesion 10 9.51E-06
SwissProt keyword Extracellular matrix 6 5.33E-05
subcellular localization Extracellular matrix (cuticle and basement membrane) 5 2.92E-04
GO Biological Process apoptosis 7 4.74E-04
GO Biological Process programmed cell death 7 4.80E-04
SwissProt keyword Pyrrolidone carboxylic acid 4 5.94E-04
GO Cellular
Component
extracellular 11 6.54E-04
GO Biological Process cell death 7 6.66E-04
GO Biological Process death 7 7.02E-04
PIR keyword glycoprotein 10 1.03E-03
GO Biological Process cell communication 17 1.05E-03
subcellular localization Extracellular (excluding cell wall) 7 1.28E-03
GO Molecular
Function
cell adhesion molecule activity 6 1.29E-03
SwissProt keyword Cell adhesion 6 1.31E-03
organismal role Extracellular matrix component 5 1.34E-03
GO Biological Process cellular process 25 1.61E-03
GO Cellular
Component
soluble fraction 5 3.28E-03
GO Cellular
Component
collagen 3 3.31E-03
SwissProt keyword Connective tissue 3 5.48E-03
GO Molecular
Function
structural molecule activity 7 0.007
SwissProt keyword Repeat 14 0.007
SwissProt keyword 3D-structure 10 0.007
Interpro Collagen triple helix repeat 3 0.010
GO Molecular
Function
extracellular matrix glycoprotein 2 0.010
SwissProt keyword Collagen 3 0.012
molecular localization Soluble 6 0.013
SwissProt keyword Hydroxylation 3 0.014
GO Cellular
Component
integral to plasma membrane 9 0.015
PIR keyword extracellular matrix 3 0.016
Interpro EGF-like domain 4 0.017
SwissProt keyword Apoptosis 4 0.017
SwissProt keyword Transmembrane 13 0.020
GO Cellular
Component
extracellular space 5 0.020
PIR pcmotif Anaphylatoxin domain signature 2 0.020
Chromosome Homo sapiens 6 6 0.023
GO Cellular Component fibrillar collagen 2 0.024
GO Cellular Component cell surface 2 0.024
GO Biological Process negative regulation of apoptosis 3 0.024
GO Biological Process anti-apoptosis 3 0.024
Interpro Anaphylatoxin/Fibulin 2 0.024
GO Biological Process chitin metabolism 2 0.024
GO Biological Process response to wounding 4 0.034
GO Cellular
Component
plasma membrane 10 0.038
organismal role Connective Tissue Development and Maintenance 3 0.041
GO Cellular
Component
integral to membrane 13 0.047
* refers to number of genes from analyzed list found in each GO category
† EASE score is a conservative adjustment to the Fisher exact probability test
Supplementary Table 16. Gene ontology categories over-represented in IPA
Network 4 as determined by EASE analysis
System  Gene Category  Number of Genes*  EASE Score†
GO Biological Process muscle development 9 4.24E-09
GO Biological Process organogenesis 15 1.11E-07
GO Biological Process morphogenesis 15 4.56E-07
GO Biological Process transcription from Pol II promoter 11 5.51E-07
GO Biological Process cell differentiation 8 6.47E-07
SwissProt keyword Nuclear protein 17 1.72E-05
GO Biological Process development 16 3.92E-05
GO Cellular
Component
nucleus 19 6.40E-05
Interpro Basic helix-loop-helix dimerization domain (bHLH) 5 2.40E-04
GO Cellular
Component
nucleoplasm 10 2.63E-04
GO Biological Process transcription, DNA-dependent 15 2.86E-04
Interpro Myc-type, helix-loop-helix dimerization domain 5 2.97E-04
SwissProt keyword Differentiation 4 3.72E-04
GO Biological Process transcription 15 4.15E-04
GO Molecular Function protein kinase regulator activity 4 5.44E-04
GO Molecular Function kinase regulator activity 4 6.72E-04
GO Biological Process regulation of transcription from Pol II promoter 6 6.82E-04
GO Biological Process regulation of transcription, DNA-dependent 14 6.98E-04
GO Biological Process nucleobase, nucleoside, nucleotide and nucleic acid metabolism 18 7.15E-04
GO Molecular Function transcription regulator activity 11 7.33E-04
GO Biological Process regulation of transcription 14 8.16E-04
GO Biological Process regulation of cell cycle 7 9.03E-04
GO Molecular Function binding 29 9.28E-04
GO Cellular
Component
transcription factor complex 8 0.001
SwissProt keyword Developmental protein 6 0.001
molecular localization DNA-associated (direct or indirect) 9 0.001
PIR keyword cell cycle control 4 0.002
GO Biological Process myogenesis 3 0.002
PIR pcmotif Myc-type, 'helix-loop-helix' dimerization domain signature 4 0.003
biochemical function DNA-binding protein 8 0.003
GO Biological Process cell cycle 8 0.004
SwissProt keyword Transcription regulation 9 0.006
GO Molecular Function DNA binding 13 0.008
GO Biological Process cell proliferation 9 0.010
SwissProt keyword DNA-binding 9 0.011
GO Biological Process hemopoiesis 3 0.012
subcellular localization Nuclear 9 0.013
GO Biological Process protein amino acid phosphorylation 6 0.013
GO Biological Process negative regulation of transcription from Pol II promoter 3 0.016
GO Biological Process phosphorylation 6 0.019
SwissProt keyword Myogenesis 2 0.021
GO Biological Process negative regulation of transcription, DNA-dependent 3 0.023
PIR superfamily human myogenin 2 0.023
Interpro Myogenic Basic domain 2 0.024
GO Molecular Function protein binding 10 0.028
cellular role Pol II transcription 9 0.029
PIR superfamily transcription repressor Id-2 2 0.029
cellular role Chromatin/chromosome structure 4 0.031
GO Cellular
Component
intracellular 25 0.032
organismal role Bone Development and Maintenance 3 0.032
GO Molecular Function nucleic acid binding 14 0.034
GO Molecular Function cyclin-dependent protein kinase activity 2 0.035
GO Molecular Function transcription cofactor activity 4 0.037
GO Biological Process negative regulation of transcription 3 0.037
PIR keyword phosphotransferase 5 0.040
GO Molecular Function transcription factor binding 4 0.044
GO Biological Process phosphorus metabolism 6 0.046
GO Biological Process phosphate metabolism 6 0.046
GO Biological Process B-cell differentiation 2 0.047
KEGG pathway Urea cycle and metabolism of amino groups - Homo sapiens 2 0.050
* refers to number of genes from analyzed list found in each GO category
† EASE score is a conservative adjustment to the Fisher exact probability test
Supplementary Table 17. Clinical characteristics of the prognosis RMS
microarray analysis
Characteristic  Category  Number  %  5-year OAS (%)
Histology
  alveolar 49 40.8 49
  mixed alveolar/embryonal 2 1.7 0
  embryonal 62 51.7 79
  botryoid 2 1.7 100
  spindle 5 4.2 100
Translocation (Alveolar Histology Only)
  PAX3-FKHR 30 58.8 32
  PAX7-FKHR 11 21.6 91
  Negative 10 19.6 50
Pre-operative Stage
  1 21 18.4 90
  2 17 14.9 100
  3 45 39.5 69
  4 31 27.2 22
Post-operative Clinical Group
  IA & IB 27 26.0 100
  IIA & IIB 12 11.5 74
  III 34 32.7 76
  IV 31 29.8 22
IRS Risk Group
  Low 24 20.9 96
  Intermediate 70 60.9 69
  High 21 18.3 11
Alive/Dead
  Alive 81 67.5 -
  Dead 39 32.5 -
Primary Site
  Orbit 4 3.4 100
  Head/Neck 9 7.7 62
  Parameningeal 20 17.1 75
  GU-bladder/prostate 9 7.7 67
  GU-other 23 19.7 78
  Extremity 28 23.9 74
  Other 24 20.5 37
Age at Diagnosis
  mean 6.94, median 5 (range 0-20) -
NOTE: Some of the categories do not total correctly due to incomplete clinical covariate data.
Supplementary Table 18. Genes correlated to RMS patient outcome
Affy ID  Gene Name  Symbol  χ²*  Rank**
209459_s_at 4-aminobutyrate aminotransferase ABAT 3 0
209460_at 4-aminobutyrate aminotransferase ABAT 2 0
209247_s_at
ATP-binding cassette, sub-family F
(GCN20), member 2
ABCF2 -2 0
45288_at Abhydrolase domain containing 6 ABHD6 2 0
218405_at Activator of basal transcription 1 ABT1 -2 218
207071_s_at Aconitase 1, soluble ACO1 2 0
206891_at Actinin, alpha 3 ACTN3 -2 0
213532_at
A disintegrin and metalloproteinase domain
17 (tumor necrosis factor, alpha, converting
enzyme)
ADAM17 -2 343
207175_at
Adiponectin, C1Q and collagen domain
containing
ADIPOQ 2 226
201281_at Adhesion regulating molecule 1 ADRM1 -3 50
202144_s_at Adenylosuccinate lyase ADSL -2 330
220290_at Absent in melanoma 1-like AIM1L -2 0
203180_at
Aldehyde dehydrogenase 1 family, member
A3
ALDH1A3 2 299
203722_at
Aldehyde dehydrogenase 4 family, member
A1
ALDH4A1 2 339
221588_x_at
aldehyde dehydrogenase 6 family, member
A1
ALDH6A1 3 112
209424_s_at Alpha-methylacyl-CoA racemase AMACR -2 0
212289_at Ankyrin repeat domain 12 ANKRD12 2 325
204671_s_at Ankyrin repeat domain 6 ANKRD6 3 138
204672_s_at Ankyrin repeat domain 6 ANKRD6 3 58
211404_s_at Amyloid beta (A4) precursor-like protein 2 APLP2 -3 185
222013_x_at
Amyloid beta (A4) precursor protein
(protease nexin-II, Alzheimer disease)
APP -2 0
218527_at Aprataxin APTX 3 92
203025_at
ARD1 homolog A, N-acetyltransferase (S.
cerevisiae)
ARD1A -3 426
205109_s_at
Rho guanine nucleotide exchange factor
(GEF) 4
ARHGEF4 3 216
213433_at ADP-ribosylation factor-like 3 ARL3 3 0
220597_s_at
ADP-ribosylation-like factor 6 interacting
protein 4
ARL6IP4 -2 0
220359_s_at
Cyclic AMP-regulated phosphoprotein, 21
kD
ARPP-21 -2 326
207076_s_at Argininosuccinate synthetase ASS 2 0
218987_at
Activating transcription factor 7 interacting
protein
ATF7IP -2 318
203926_x_at
ATP synthase, H+ transporting,
mitochondrial F1 complex, delta subunit
ATP5D -2 0
213041_s_at
ATP synthase, H+ transporting,
mitochondrial F1 complex, delta subunit
ATP5D -2 0
207809_s_at
ATPase, H+ transporting, lysosomal
accessory protein 1
ATP6AP1 2 209
214330_at
ATP synthase mitochondrial F1 complex
assembly factor 2
ATPAF2 2 0
214742_at 5-azacytidine induced 1 AZI1 -3 0
53076_at
Xylosylprotein beta 1,4-
galactosyltransferase, polypeptide 7
(galactosyltransferase I)
B4GALT7 -2 0
205638_at Brain-specific angiogenesis inhibitor 3 BAI3 2 446
213882_at Beta-amyloid binding protein precursor BBP 2 0
203140_at
B-cell CLL/lymphoma 6 (zinc finger
protein 51)
BCL6 2 449
201170_s_at
Basic helix-loop-helix domain containing,
class B, 2
BHLHB2 2 510
202931_x_at Bridging integrator 1 BIN1 -3 207
210201_x_at Bridging integrator 1 BIN1 -3 412
210202_s_at Bridging integrator 1 BIN1 -3 62
214439_x_at Bridging integrator 1 BIN1 -3 297
214643_x_at Bridging integrator 1 BIN1 -3 4
222199_s_at Bridging integrator 3 BIN3 -3 0
205430_at Bone morphogenetic protein 5 BMP5 3 43
205431_s_at Bone morphogenetic protein 5 BMP5 3 48
218955_at
BRF2, subunit of RNA polymerase III
transcription initiation factor, BRF1-like
BRF2 -2 0
217809_at Basic leucine zipper and W2 domains 2 BZW2 -3 0
219953_s_at Chromosome 11 open reading frame 17 C11orf17 -4 0
212736_at Chromosome 16 open reading frame 45 C16orf45 2 184
214173_x_at Chromosome 19 open reading frame 2 C19orf2 -3 0
220688_s_at Chromosome 1 open reading frame 33 C1orf33 -3 0
214163_at Chromosome 1 open reading frame 41 C1orf41 2 0
219506_at chromosome 1 open reading frame 54 C1orf54 -2 316
219463_at Chromosome 20 open reading frame 103 C20orf103 3 0
208880_s_at Chromosome 20 open reading frame 14 C20orf14 -2 260
215767_at Chromosome 2 open reading frame 10 C2orf10 2 0
207511_s_at Chromosome 2 open reading frame 24 C2orf24 2 418
219261_at Chromosome 7 open reading frame 26 C7orf26 -2 505
219124_at chromosome 8 open reading frame 41 C8orf41 -2 0
202452_at Chromosome 9 open reading frame 60 C9orf60 2 0
203963_at Carbonic anhydrase XII CA12 2 0
202966_at Calpain 6 CAPN6 2 447
218929_at
Collaborates/cooperates with ARF
(alternate reading frame) protein
CARF 2 294
209790_s_at
Caspase 6, apoptosis-related cysteine
protease
CASP6 -2 104
208056_s_at
Core-binding factor, runt domain, alpha
subunit 2; translocated to, 3
CBFA2T3 3 45
218125_s_at coiled-coil domain containing 25 CCDC25 -3 238
204716_at Coiled-coil domain containing 6 CCDC6 -2 265
200812_at
Chaperonin containing TCP1, subunit 7
(eta)
CCT7 -2 0
207277_at CD209 antigen CD209 -2 0
218529_at CD320 antigen CD320 -3 203
208022_s_at
CDC14 cell division cycle 14 homolog B
(S. cerevisiae)
CDC14B 2 429
210622_x_at Cyclin-dependent kinase (CDC2-like) 10 CDK10 3 150
213183_s_at
Cyclin-dependent kinase inhibitor 1C (p57,
Kip2)
CDKN1C -2 266
36499_at
Cadherin, EGF LAG seven-pass G-type
receptor 2 (flamingo homolog, Drosophila)
CELSR2 2 0
202937_x_at CGI-96 protein CGI-96 -2 0
33307_at CGI-96 protein CGI-96 -2 410
218642_s_at
Coiled-coil-helix-coiled-coil-helix domain
containing 7
CHCHD7 -2 0
201184_s_at
Chromodomain helicase DNA binding
protein 4
CHD4 -3 34
211248_s_at Chordin CHRD 3 60
200810_s_at Cold inducible RNA binding protein CIRBP -2 352
200811_at Cold inducible RNA binding protein CIRBP -2 0
207144_s_at
Cbp/p300-interacting transactivator, with
Glu/Asp-rich carboxy-terminal domain, 1
CITED1 -3 0
202712_s_at
Creatine kinase, mitochondrial 1
(ubiquitous)
CKMT1 2 165
212308_at Cytoplasmic linker associated protein 2 CLASP2 2 300
209143_s_at Chloride channel, nucleotide-sensitive, 1A CLNS1A -3 65
211043_s_at Clathrin, light polypeptide (Lcb) CLTB -2 0
204740_at
Connector enhancer of kinase suppressor of
Ras 1
CNKSR1 2 0
206731_at
Connector enhancer of kinase suppressor of
Ras 2
CNKSR2 2 0
213436_at Cannabinoid receptor 1 (brain) CNR1 3 0
213050_at Cordon-bleu homolog (mouse) COBL 2 0
209082_s_at Collagen, type XVIII, alpha 1 COL18A1 2 0
219997_s_at
COP9 constitutive photomorphogenic
homolog subunit 7B (Arabidopsis)
COPS7B -2 0
204643_s_at Cytosolic ovarian carcinoma antigen 1 COVA1 -3 40
201940_at Carboxypeptidase D CPD 3 0
201943_s_at Carboxypeptidase D CPD 3 107
208146_s_at Carboxypeptidase, vitellogenic-like CPVL -3 57
203368_at Cysteine-rich with EGF-like domains 1 CRELD1 2 450
51176_at
Cofactor required for Sp1 transcriptional
activation, subunit 8, 34kDa
CRSP8 -2 0
201160_s_at Cold shock domain protein A CSDA -3 183
201161_s_at Cold shock domain protein A CSDA -3 0
202190_at
Cleavage stimulation factor, 3' pre-RNA,
subunit 1, 50kDa
CSTF1 -2 0
201905_s_at
CTD (carboxy-terminal domain, RNA
polymerase II, polypeptide A) small
phosphatase-like
CTDSPL -3 0
203917_at Coxsackie virus and adenovirus receptor CXADR 2 258
209163_at Cytochrome b-561 CYB561 3 38
209164_s_at Cytochrome b-561 CYB561 3 68
201066_at Cytochrome c-1 CYC1 -2 0
202435_s_at
Cytochrome P450, family 1, subfamily B,
polypeptide 1
CYP1B1 2 340
212128_s_at
Dystroglycan 1 (dystrophin-associated
glycoprotein 1)
DAG1 2 274
205818_at Deleted in bladder cancer 1 DBC1 2 0
204977_at
DEAD (Asp-Glu-Ala-Asp) box polypeptide
10
DDX10 -2 160
210811_s_at
DEAD (Asp-Glu-Ala-Asp) box polypeptide
49
DDX49 -2 0
31807_at
DEAD (Asp-Glu-Ala-Asp) box polypeptide
49
DDX49 -4 35
217754_at
DEAD (Asp-Glu-Ala-Asp) box polypeptide
56
DDX56 -3 0
202447_at
2,4-dienoyl CoA reductase 1,
mitochondrial
DECR1 -2 0
204610_s_at
Hepatitis delta antigen-interacting protein
A
DIPA -3 98
206090_s_at Disrupted in schizophrenia 1 DISC1 2 53
214724_at DIX domain containing 1 DIXDC1 2 143
222250_s_at DKFZP434B168 protein DKFZP434B168 -3 90
202196_s_at Dickkopf homolog 3 (Xenopus laevis) DKK3 2 411
220511_s_at Deleted in liver cancer 1 DLC1 -2 0
210227_at
Discs, large (Drosophila) homolog-
associated protein 2
DLGAP2 3 46
202572_s_at
Discs, large (Drosophila) homolog-
associated protein 4
DLGAP4 2 214
205963_s_at
DnaJ (Hsp40) homolog, subfamily A,
member 3
DNAJA3 2 455
204720_s_at
DnaJ (Hsp40) homolog, subfamily C,
member 6
DNAJC6 2 0
209509_s_at
Dolichyl-phosphate (UDP-N-
acetylglucosamine) N-
acetylglucosaminephosphotransferase 1
(GlcNAc-1-P transferase)
DPAGT1 -3 122
202116_at D4, zinc and double PHD fingers family 2 DPF2 -2 351
203258_at DR1-associated protein 1 (negative cofactor 2 alpha) DRAP1 -2 0
221681_s_at Dentin sialophosphoprotein DSPP 2 0
219648_at Likely ortholog of mouse dilute suppressor DSU 2 0
203230_at Dishevelled, dsh homolog 1 (Drosophila) DVL1 2 307
221586_s_at E2F transcription factor 5, p130-binding E2F5 -2 0
201323_at EBNA1 binding protein 2 EBNA1BP2 -2 0
204036_at
Endothelial differentiation,
lysophosphatidic acid G-protein-coupled
receptor, 2
EDG2 -2 0
210656_at Embryonic ectoderm development EED 2 55
213087_s_at
Eukaryotic translation elongation factor 1
delta (guanine nucleotide exchange protein)
EEF1D -3 82
214395_x_at
Eukaryotic translation elongation factor 1
delta (guanine nucleotide exchange protein)
EEF1D -2 154
205107_s_at Ephrin-A4 EFNA4 -2 0
220262_s_at EGF-like-domain, multiple 9 EGFL9 2 263
209039_x_at EH-domain containing 1 EHD1 -2 0
222221_x_at EH-domain containing 1 EHD1 -2 301
208289_s_at Etoposide induced 2.4 mRNA EI24 -2 0
215482_s_at
Eukaryotic translation initiation factor 2B,
subunit 4 delta, 67kDa
EIF2B4 -2 0
201144_s_at
Eukaryotic translation initiation factor 2,
subunit 1 alpha, 35kDa
EIF2S1 -2 0
209393_s_at
Eukaryotic translation initiation factor 4E
member 2
EIF4E2 -2 327
213571_s_at
Eukaryotic translation initiation factor 4E
member 2
EIF4E2 -2 170
221539_at
Eukaryotic translation initiation factor 4E
binding protein 1
EIF4EBP1 -2 0
213712_at
Elongation of very long chain fatty acids
(FEN1/Elo2, SUR4/Elo3, yeast)-like 2
ELOVL2 2 0
221094_s_at
Elongation protein 3 homolog (S.
cerevisiae)
ELP3 -2 0
203729_at Epithelial membrane protein 3 EMP3 -3 401
200878_at Endothelial PAS domain protein 1 EPAS1 3 0
212336_at
Erythrocyte membrane protein band 4.1-
like 1
EPB41L1 3 10
212339_at
Erythrocyte membrane protein band 4.1-
like 1
EPB41L1 2 0
203499_at EPH receptor A2 EPHA2 2 135
206114_at EPH receptor A4 EPHA4 -2 459
213434_at Epimorphin EPIM 3 245
203719_at
Excision repair cross-complementing
rodent repair deficiency, complementation
group 1 (includes overlapping antisense
sequence)
ERCC1 -2 187
218695_at Exosome component 4 EXOSC4 -4 22
58696_at Exosome component 4 EXOSC4 -3 0
91684_g_at Exosome component 4 EXOSC4 -3 308
209202_s_at Exostoses (multiple)-like 3 EXTL3 -2 304
203249_at Enhancer of zeste homolog 1 (Drosophila) EZH1 3 406
202345_s_at
Fatty acid binding protein 5 (psoriasis-
associated)
FABP5 -2 0
201863_at
Family with sequence similarity 32,
member A
FAM32A -2 262
222099_s_at
Family with sequence similarity 61,
member A
FAM61A -3 0
218549_s_at
family with sequence similarity 82,
member B
FAM82B -3 477
202159_at
Phenylalanine-tRNA synthetase-like, alpha
subunit
FARSLA -3 80
201787_at Fibulin 1 FBLN1 -2 0
203184_at
Fibrillin 2 (congenital contractural
arachnodactyly)
FBN2 -3 93
220127_s_at F-box and leucine-rich repeat protein 12 FBXL12 -3 71
214436_at F-box and leucine-rich repeat protein 2 FBXL2 2 0
203033_x_at Fumarate hydratase FH -2 0
202041_s_at
Fibroblast growth factor (acidic)
intracellular binding protein
FIBP -2 125
219522_at Four jointed box 1 (Drosophila) FJX1 -3 0
218314_s_at hypothetical protein FLJ10726 FLJ10726 -4 2
218722_s_at hypothetical protein FLJ12436 FLJ12436 -2 0
221777_at hypothetical protein FLJ14827 FLJ14827 -3 137
218483_s_at hypothetical protein FLJ21827 FLJ21827 -2 186
218394_at leucine zipper domain protein FLJ22386 3 6
219176_at hypothetical protein FLJ22555 FLJ22555 -2 0
218454_at hypothetical protein FLJ22662 FLJ22662 -3 246
208614_s_at Filamin B, beta (actin binding protein 278) FLNB 2 267
219250_s_at
Fibronectin leucine rich transmembrane
protein 3
FLRT3 -3 0
211726_s_at Flavin containing monooxygenase 2 FMO2 2 161
204948_s_at Follistatin FST -2 313
214094_at
Far upstream element (FUSE) binding
protein 1
FUBP1 2 0
201945_at
Furin (paired basic amino acid cleaving
enzyme)
FURIN 3 20
204451_at Frizzled homolog 1 (Drosophila) FZD1 -2 570
210220_at Frizzled homolog 2 (Drosophila) FZD2 -2 464
218665_at Frizzled homolog 4 (Drosophila) FZD4 -2 0
203706_s_at Frizzled homolog 7 (Drosophila) FZD7 -2 0
205690_s_at Maternal G10 transcript G10 -3 441
212891_s_at
Growth arrest and DNA-damage-inducible,
gamma interacting protein 1
GADD45GIP1 -2 0
212256_at
UDP-N-acetyl-alpha-D-
galactosamine:polypeptide N-
acetylgalactosaminyltransferase 10
(GalNAc-T10)
GALNT10 -3 8
213280_at
GTPase activating Rap/RanGAP domain-
like 4
GARNL4 3 75
201439_at
Golgi-specific brefeldin A resistance factor
1
GBF1 2 0
219539_at
Gem (nuclear organelle) associated protein
6
GEMIN6 -2 0
208915_s_at
Golgi associated, gamma adaptin ear
containing, ARF binding protein 2
GGA2 2 421
213772_s_at
Golgi associated, gamma adaptin ear
containing, ARF binding protein 2
GGA2 3 405
214006_s_at Gamma-glutamyl carboxylase GGCX -3 0
204074_s_at
glycine-, glutamate-,
thienylcyclohexylpiperidine-binding
protein
GlyBP 2 233
204075_s_at
glycine-, glutamate-,
thienylcyclohexylpiperidine-binding
protein
GlyBP 3 12
218070_s_at GDP-mannose pyrophosphorylase A GMPPA -2 0
214431_at Guanine monophosphate synthetase GMPS -2 0
204248_at
Guanine nucleotide binding protein (G
protein), alpha 11 (Gq class)
GNA11 2 158
200708_at
Glutamic-oxaloacetic transaminase 2,
mitochondrial (aspartate aminotransferase
2)
GOT2 2 479
212951_at G protein-coupled receptor 116 GPR116 3 70
64942_at G protein-coupled receptor 153 GPR153 2 146
218151_x_at G protein-coupled receptor 172A GPR172A -2 0
205240_at
G-protein signalling modulator 2 (AGS3-
like, C. elegans)
GPSM2 -3 21
221922_at
G-protein signalling modulator 2 (AGS3-
like, C. elegans)
GPSM2 -2 114
213170_at Glutathione peroxidase 7 GPX7 -3 0
205814_at Glutamate receptor, metabotropic 3 GRM3 3 212
202554_s_at Glutathione S-transferase M3 (brain) GSTM3 -3 305
200824_at Glutathione S-transferase pi GSTP1 -3 84
205436_s_at H2A histone family, member X H2AFX -3 0
202455_at Histone deacetylase 5 HDAC5 2 344
209525_at
Hepatoma-derived growth factor, related
protein 3
HDGFRP3 3 292
203974_at
Haloacid dehalogenase-like hydrolase
domain containing 1A
HDHD1A -2 0
204370_at ATP/GTP-binding protein HEAB -3 17
52159_at HemK methyltransferase family member 1 HEMK1 -2 0
209558_s_at Huntingtin interacting protein-1-related HIP1R 3 54
208583_x_at Histone 1, H2aj HIST1H2AJ -3 204
208546_x_at Histone 1, H2bh HIST1H2BH -2 415
208496_x_at Histone 1, H3g HIST1H3G -3 0
208580_x_at Histone 1, H4k HIST1H4K -3 202
214463_x_at Histone 1, H4k HIST1H4K -3 404
203040_s_at Hydroxymethylbilane synthase HMBS -2 473
215489_x_at Homer homolog 3 (Drosophila) HOMER3 -2 0
222222_s_at Homer homolog 3 (Drosophila) HOMER3 -3 215
213147_at Homeo box A10 HOXA10 -2 144
206289_at Homeo box A4 HOXA4 -2 440
212552_at Hippocalcin-like 1 HPCAL1 2 487
220447_at Histamine receptor H3 HRH3 3 14
217106_x_at Dimethyladenosine transferase HSA9761 -2 0
208815_x_at Heat shock 70kDa protein 4 HSPA4 -3 249
211016_x_at Heat shock 70kDa protein 4 HSPA4 -2 0
217926_at HSPC023 protein HSPC023 -2 0
214011_s_at Hypothetical protein HSPC111 HSPC111 -2 0
221046_s_at HSPC135 protein HSPC135 -2 442
213230_at Paraneoplastic antigen HUMPPA 3 89
201565_s_at
Inhibitor of DNA binding 2, dominant
negative helix-loop-helix protein
ID2 -2 314
201566_x_at
Inhibitor of DNA binding 2, dominant
negative helix-loop-helix protein
ID2 -2 97
201163_s_at Insulin-like growth factor binding protein 7 IGFBP7 2 259
205707_at Interleukin 17 receptor IL17R -2 0
203233_at Interleukin 4 receptor IL4R 2 37
221548_s_at
Integrin-linked kinase-associated
serine/threonine phosphatase 2C
ILKAP -3 39
212411_at
IMP4, U3 small nucleolar
ribonucleoprotein, homolog (yeast)
IMP4 -2 296
205981_s_at Inhibitor of growth family, member 2 ING2 3 178
207688_s_at Inhibin, beta C INHBC -4 1
210587_at Inhibin, beta E INHBE 2 451
218305_at importin 4 IPO4 -2 139
221185_s_at IQ motif containing G IQCG 3 13
209184_s_at Insulin receptor substrate 2 IRS2 3 293
209185_s_at Insulin receptor substrate 2 IRS2 2 403
210213_s_at Integrin beta 4 binding protein ITGB4BP -3 217
214927_at
Integrin, beta-like 1 (with EGF-like repeat
domains)
ITGBL1 2 427
202223_at Integral membrane protein 1 ITM1 -2 0
202746_at Integral membrane protein 2A ITM2A -2 0
209984_at Jumonji domain containing 2C JMJD2C 3 0
41387_r_at Jumonji domain containing 3 JMJD3 3 413
221000_s_at Kazal-type serine protease inhibitor domain 1 KAZALD1 -3 220
221584_s_at
Potassium large conductance calcium-
activated channel, subfamily M, alpha
member 1
KCNMA1 2 0
205902_at
Potassium intermediate/small conductance
calcium-activated channel, subfamily N,
member 3
KCNN3 2 416
205968_at
Potassium voltage-gated channel, delayed-
rectifier, subfamily S, member 3
KCNS3 3 302
202417_at Kelch-like ECH-associated protein 1 KEAP1 -2 168
212056_at KIAA0182 protein KIAA0182 2 323
212450_at KIAA0256 gene product KIAA0256 3 52
213242_x_at KIAA0284 KIAA0284 2 482
212356_at KIAA0323 KIAA0323 -3 0
213300_at KIAA0404 protein KIAA0404 -2 444
203171_s_at KIAA0409 protein KIAA0409 -2 0
213155_at KIAA0523 protein KIAA0523 3 0
213157_s_at KIAA0523 protein KIAA0523 3 400
205888_s_at Jak and microtubule interacting protein 2 KIAA0555 2 0
212311_at KIAA0746 protein KIAA0746 3 136
212314_at KIAA0746 protein KIAA0746 3 0
212546_s_at KIAA0826 KIAA0826 3 15
212548_s_at KIAA0826 KIAA0826 2 0
209379_s_at KIAA1128 KIAA1128 2 0
212906_at KIAA1201 protein KIAA1201 -3 61
53968_at KIAA1698 protein KIAA1698 -3 63
209234_at Kinesin family member 1B KIF1B 3 208
203543_s_at Kruppel-like factor 9 KLF9 3 156
212101_at Karyopherin alpha 6 (importin alpha 7) KPNA6 2 0
204584_at L1 cell adhesion molecule L1CAM 3 19
210150_s_at Laminin, alpha 5 LAMA5 2 474
219884_at LIM homeobox 6 LHX6 -2 0
219181_at Lipase, endothelial LIPG 3 298
200805_at Lectin, mannose-binding 2 LMAN2 -2 0
204424_s_at LIM domain only 3 (rhombotin-like 2) LMO3 2 469
209205_s_at LIM domain only 4 LMO4 3 181
213589_s_at
hypothetical protein LOC146712 /// UDP-
GlcNAc:betaGal beta-1,3-N-
acetylglucosaminyltransferase-like 1
LOC146712 ///
B3GNTL1
-2 0
215782_at Ras-like GTPase-like LOC286526 2 0
221501_x_at hypothetical protein LOC339047 LOC339047 2 422
216589_at
similar to 60S ribosomal protein L10 (QM
protein) (Tumor suppressor QM) (Laminin
receptor homolog)
LOC390998 -2 201
214035_x_at LOC399491 protein LOC399491 2 303
213437_at
similar to RUN and FYVE domain-
containing 2; Run- and FYVE-domain
containing protein
LOC441022 4 25
201871_s_at unknown protein LOC51035 LOC51035 -3 0
220014_at mesenchymal stem cell protein DSC54 LOC51334 -2 341
203548_s_at Lipoprotein lipase LPL 2 87
213496_at Plasticity related gene 1 LPPR4 2 0
204381_at
Low density lipoprotein receptor-related
protein 3
LRP3 -2 224
205282_at
Low density lipoprotein receptor-related
protein 8, apolipoprotein e receptor
LRP8 2 408
40093_at
Lutheran blood group (Auberger b antigen
included)
LU 2 357
212248_at LYRIC/3D3 LYRIC -2 0
208682_s_at Melanoma antigen family D, 2 MAGED2 -2 457
219894_at MAGE-like 2 MAGEL2 -2 329
218918_at Mannosidase, alpha, class 1C, member 1 MAN1C1 3 151
219003_s_at Mannosidase, endo-alpha MANEA 2 0
202788_at
Mitogen-activated protein kinase-activated
protein kinase 3
MAPKAPK3 -3 177
213672_at Methionine-tRNA synthetase MARS 3 9
202350_s_at Matrilin 2 MATN2 -2 78
214397_at Methyl-CpG binding domain protein 2 MBD2 2 432
207549_x_at
Membrane cofactor protein (CD46,
trophoblast-lymphocyte cross-reactive
antigen)
MCP 3 211
213333_at
Malate dehydrogenase 2, NAD
(mitochondrial)
MDH2 -2 0
219348_at
Uncharacterized hematopoietic
stem/progenitor cells protein MDS032
MDS032 -3 0
221706_s_at
Uncharacterized hematopoietic
stem/progenitor cells protein MDS032
MDS032 -2 0
210104_at
Mediator of RNA polymerase II
transcription, subunit 6 homolog (yeast)
MED6 -2 0
221079_s_at Methyltransferase like 2 METTL2 -2 0
205740_s_at Hypothetical protein MGC10433 MGC10433 -2 0
221637_s_at Hypothetical protein MGC2477 MGC2477 -2 361
219324_at Hypothetical protein MGC3731 MGC3731 -2 0
221959_at Hypothetical protein MGC39325 MGC39325 -2 0
205408_at
Myeloid/lymphoid or mixed-lineage
leukemia (trithorax homolog, Drosophila);
translocated to, 10
MLLT10 3 108
219909_at Matrix metalloproteinase 28 MMP28 2 0
202519_at Mlx interactor MONDOA 2 145
203948_s_at Myeloperoxidase MPO 3 51
203949_at Myeloperoxidase MPO 2 0
219162_s_at Mitochondrial ribosomal protein L11 MRPL11 -3 0
218027_at Mitochondrial ribosomal protein L15 MRPL15 -2 0
217980_s_at Mitochondrial ribosomal protein L16 MRPL16 -3 148
222216_s_at Mitochondrial ribosomal protein L17 MRPL17 -3 0
218281_at Mitochondrial ribosomal protein L48 MRPL48 -2 565
218046_s_at Mitochondrial ribosomal protein S16 MRPS16 -2 0
213132_s_at
Malonyl-CoA:acyl carrier protein
transacylase, mitochondrial
MT -2 0
219363_s_at MTERF domain containing 1 MTERFD1 -2 0
206304_at Myosin binding protein H MYBPH -2 0
201959_s_at MYC binding protein 2 MYCBP2 3 152
207424_at Myogenic factor 5 MYF5 -3 244
206372_at Myogenic factor 6 (herculin) MYF6 -2 0
205163_at Fast skeletal myosin light chain 2 MYLPF -3 83
203072_at Myosin IE MYO1E 2 0
222018_at
Nascent-polypeptide-associated complex
alpha polypeptide
NACA 2 0
218330_s_at Neuron navigator 2 NAV2 2 213
216466_at Neuron navigator 3 NAV3 -2 0
202906_s_at nibrin NBN -2 0
202907_s_at nibrin NBN -2 564
201521_s_at
Nuclear cap binding protein subunit 2,
20kDa
NCBP2 -2 0
205731_s_at Nuclear receptor coactivator 2 NCOA2 -2 407
212867_at Nuclear receptor coactivator 2 NCOA2 -2 0
208969_at
NADH dehydrogenase (ubiquinone) 1
alpha subcomplex, 9, 39kDa
NDUFA9 -2 0
202839_s_at
NADH dehydrogenase (ubiquinone) 1 beta
subcomplex, 7, 18kDa
NDUFB7 -2 0
211752_s_at
NADH dehydrogenase (ubiquinone) Fe-S
protein 7, 20kDa (NADH-coenzyme Q
reductase)
NDUFS7 -2 0
203189_s_at
NADH dehydrogenase (ubiquinone) Fe-S
protein 8, 23kDa (NADH-coenzyme Q
reductase)
NDUFS8 -3 0
221214_s_at Nasal embryonic LHRH factor NELF 2 91
206089_at NEL-like 1 (chicken) NELL1 2 142
213438_at Neurofascin NFASC 2 166
212808_at
Nuclear factor of activated T-cells,
cytoplasmic, calcineurin-dependent 2
interacting protein
NFATC2IP 3 67
210556_at
Nuclear factor of activated T-cells,
cytoplasmic, calcineurin-dependent 3
NFATC3 2 0
206968_s_at
Nuclear factor related to kappaB binding
protein
NFRKB -2 0
204107_at Nuclear transcription factor Y, alpha NFYA -3 253
214628_at Nescient helix loop helix 1 NHLH1 2 0
205893_at Neuroligin 1 NLGN1 3 205
201159_s_at N-myristoyltransferase 1 NMT1 2 0
204239_s_at Neuronatin NNAT -2 0
217950_at Nitric oxide synthase interacting protein NOSIP -3 0
218506_x_at Cytokine-like nuclear factor n-pac N-PAC 3 159
214870_x_at Nuclear pore complex interacting protein NPIP 2 162
215921_at Nuclear pore complex interacting protein NPIP 3 44
204105_s_at Neuronal cell adhesion molecule NRCAM 3 76
210683_at Neurturin NRTN 3 18
203269_at
Neutral sphingomyelinase (N-SMase)
activation associated factor
NSMAF -2 231
203978_at
Nucleotide binding protein 1 (MinD
homolog, E. coli)
NUBP1 -3 0
218227_at
Nucleotide binding protein 2 (MinD
homolog, E. coli)
NUBP2 -2 0
219100_at
Oligonucleotide/oligosaccharide-binding
fold containing 1
OBFC1 -2 445
213946_s_at obscurin-like 1 OBSL1 -3 30
219277_s_at Oxoglutarate dehydrogenase-like OGDHL 2 0
210443_x_at Opioid growth factor receptor OGFR -2 0
207563_s_at
O-linked N-acetylglucosamine (GlcNAc)
transferase (UDP-N-
acetylglucosamine:polypeptide-N-
acetylglucosaminyl transferase)
OGT -2 466
213125_at Olfactomedin-like 2B OLFML2B 2 423
213825_at
Oligodendrocyte lineage transcription
factor 2
OLIG2 3 41
213046_at Poly(A) binding protein, nuclear 1 PABPN1 2 225
209791_at peptidyl arginine deiminase, type II PADI2 3 16
203228_at
Platelet-activating factor acetylhydrolase,
isoform Ib, gamma subunit 29kDa
PAFAH1B3 -3 109
212858_at
Progestin and adipoQ receptor family
member IV
PAQR4 3 0
221280_s_at
Par-3 partitioning defective 3 homolog (C.
elegans)
PARD3 -2 0
205060_at Poly (ADP-ribose) glycohydrolase PARG 2 77
207838_x_at
Pre-B-cell leukemia transcription factor
interacting protein 1
PBXIP1 3 101
210022_at Polycomb group ring finger 1 PCGF1 -2 0
203660_s_at Pericentrin 2 (kendrin) PCNT2 2 190
205825_at
Proprotein convertase subtilisin/kexin type
1
PCSK1 2 0
203803_at Prenylcysteine oxidase 1 PCYOX1 -2 0
202093_s_at Hypothetical protein F23149_1 PD2 -2 0
219275_at Programmed cell death 5 PDCD5 -2 470
213388_at
Phosphodiesterase 4D interacting protein
(myomegalin)
PDE4DIP 3 0
208911_s_at Pyruvate dehydrogenase (lipoamide) beta PDHB -2 0
200787_s_at Phosphoprotein enriched in astrocytes 15 PEA15 2 74
218590_at Progressive external ophthalmoplegia 1 PEO1 -3 0
205160_at Peroxisomal biogenesis factor 11A PEX11A 3 210
218387_s_at 6-phosphogluconolactonase PGLS -3 86
218388_at 6-phosphogluconolactonase PGLS -2 335
206567_s_at PHD finger protein 20 PHF20 2 0
213407_at
PH domain and leucine rich repeat protein
phosphatase-like
PHLPPL 3 179
204571_x_at
Protein (peptidyl-prolyl cis/trans
isomerase) NIMA-interacting, 4 (parvulin)
PIN4 -3 56
205632_s_at
Phosphatidylinositol-4-phosphate 5-kinase,
type I, beta
PIP5K1B 3 69
201080_at
Phosphatidylinositol-4-phosphate 5-kinase,
type II, beta
PIP5K2B 2 430
221605_s_at Pipecolic acid oxidase PIPOX 2 0
221854_at
Plakophilin 1 (ectodermal dysplasia/skin
fragility syndrome)
PKP1 3 140
201929_s_at Plakophilin 4 PKP4 2 0
219024_at
Pleckstrin homology domain containing,
family A (phosphoinositide binding
specific) member 1
PLEKHA1 2 0
201939_at Polo-like kinase 2 (Drosophila) PLK2 -2 0
214756_x_at Postmeiotic segregation increased 2-like 1 PMS2L1 -2 428
202306_at
Polymerase (RNA) II (DNA directed)
polypeptide G
POLR2G -2 0
202868_s_at
Processing of precursor 4, ribonuclease
P/MRP subunit (S. cerevisiae)
POP4 -2 0
209482_at
Processing of precursor 7, ribonuclease P
subunit (S. cerevisiae)
POP7 -2 0
210235_s_at
Protein tyrosine phosphatase, receptor type,
f polypeptide (PTPRF), interacting protein
(liprin), alpha 1
PPFIA1 -2 321
202883_s_at
Protein phosphatase 2 (formerly 2A),
regulatory subunit A (PR 65), beta isoform
PPP2R1B -2 328
207769_s_at Polyglutamine binding protein 1 PQBP1 -3 169
214527_s_at Polyglutamine binding protein 1 PQBP1 -3 0
222106_at Prion protein 2 (dublet) PRND -2 99
210988_s_at Prune homolog (Drosophila) PRUNE 2 0
218613_at Pleckstrin and Sec7 domain containing 3 PSD3 -2 172
216088_s_at
Proteasome (prosome, macropain) subunit,
alpha type, 7
PSMA7 -2 529
204219_s_at
Proteasome (prosome, macropain) 26S
subunit, ATPase, 1
PSMC1 -2 0
201199_s_at
Proteasome (prosome, macropain) 26S
subunit, non-ATPase, 1
PSMD1 -2 0
201232_s_at
Proteasome (prosome, macropain) 26S
subunit, non-ATPase, 13
PSMD13 -2 0
201233_at
Proteasome (prosome, macropain) 26S
subunit, non-ATPase, 13
PSMD13 -3 66
206482_at PTK6 protein tyrosine kinase 6 PTK6 2 0
209466_x_at
Pleiotrophin (heparin binding growth factor
8, neurite growth-promoting factor 1)
PTN -2 0
211737_x_at
Pleiotrophin (heparin binding growth factor
8, neurite growth-promoting factor 1)
PTN -2 0
213362_at
Protein tyrosine phosphatase, receptor type,
D
PTPRD 2 417
200635_s_at
Protein tyrosine phosphatase, receptor type,
F
PTPRF 2 0
203029_s_at
Protein tyrosine phosphatase, receptor type,
N polypeptide 2
PTPRN2 2 0
212866_at R3H domain and coiled-coil containing 1 R3HCC1 -3 113
35156_at R3H domain and coiled-coil containing 1 R3HCC1 -4 24
203883_s_at
RAB11 family interacting protein 2 (class
I)
RAB11FIP2 2 0
203933_at
RAB11 family interacting protein 3 (class
II)
RAB11FIP3 3 0
220964_s_at RAB1B, member RAS oncogene family RAB1B -3 111
221960_s_at RAB2, member RAS oncogene family RAB2 -2 311
204460_s_at RAD1 homolog (S. pombe) RAD1 -2 189
201271_s_at
RNA binding protein (autoantigenic,
hnRNP-associated with lethal yellow)
RALY -2 503
218593_at RNA binding motif protein 28 RBM28 -2 59
211974_x_at
Recombining binding protein suppressor of
hairless (Drosophila)
RBPSUH 3 11
209219_at RD RNA binding protein RDBP -2 0
218599_at REC8-like 1 (yeast) REC8L1 2 0
218194_at
REX2, RNA exonuclease 2 homolog (S.
cerevisiae)
REXO2 -2 272
1053_at Replication factor C (activator 1) 2, 40kDa RFC2 -3 353
204316_at Regulator of G-protein signalling 10 RGS10 -2 0
204319_s_at Regulator of G-protein signalling 10 RGS10 -3 242
213939_s_at Rap2 interacting protein x RIPX 2 0
211753_s_at Relaxin 1 RLN1 -2 0
218497_s_at Ribonuclease H1 RNASEH1 -2 434
217865_at Ring finger protein 130 RNF130 -2 191
221430_s_at Ring finger protein 146 RNF146 2 26
201962_s_at Ring finger protein 41 RNF41 2 251
207438_s_at RNA, U transporter 1 RNUT1 -3 188
213427_at Ribonuclease P 40kDa subunit RPP40 -2 256
217336_at Ribosomal protein S10 RPS10 -3 180
214089_at Ribosomal protein S8 RPS8 -2 0
209567_at
RRS1 ribosome biogenesis regulator
homolog (S. cerevisiae)
RRS1 -3 47
212319_at RUN and TBC1 domain containing 1 RUTBC1 3 0
201459_at RuvB-like 2 (E. coli) RUVBL2 -3 250
201569_s_at
sorting and assembly machinery
component 50 homolog (S. cerevisiae)
SAMM50 -2 0
201570_at
sorting and assembly machinery
component 50 homolog (S. cerevisiae)
SAMM50 -2 221
220367_s_at MSin3A-associated protein 130 SAP130 -2 182
200051_at
Squamous cell carcinoma antigen
recognised by T cells
SART1 -3 28
201819_at Scavenger receptor class B, member 1 SCARB1 2 127
219196_at Secretogranin III SCG3 3 29
206147_x_at Sex comb on midleg-like 2 (Drosophila) SCML2 -3 134
57703_at SUMO1/sentrin specific protease 5 SENP5 -3 0
212414_s_at Septin 6 SEPT6 -2 0
202627_s_at
Serine (or cysteine) proteinase inhibitor,
clade E (nexin, plasminogen activator
inhibitor type 1), member 1
SERPINE1 3 0
202628_s_at
Serine (or cysteine) proteinase inhibitor,
clade E (nexin, plasminogen activator
inhibitor type 1), member 1
SERPINE1 3 32
200688_at Splicing factor 3b, subunit 3, 130kDa SF3B3 -2 0
215004_s_at Splicing factor 4 SF4 -2 0
218392_x_at Sideroflexin 1 SFXN1 -3 402
209899_s_at Fuse-binding protein-interacting repressor SIAHBP1 -2 0
218797_s_at
Sirtuin (silent mating type information
regulation 2 homolog) 7 (S. cerevisiae)
SIRT7 2 0
207069_s_at
SMAD, mothers against DPP homolog 6
(Drosophila)
SMAD6 -2 94
211989_at
SWI/SNF related, matrix associated, actin
dependent regulator of chromatin,
subfamily e, member 1
SMARCE1 -2 0
202508_s_at Synaptosomal-associated protein, 25kDa SNAP25 2 0
201622_at
Staphylococcal nuclease domain containing
1
SND1 -2 95
202567_at
Small nuclear ribonucleoprotein D3
polypeptide 18kDa
SNRPD3 -2 0
214478_at Secreted phosphoprotein 2, 24kDa SPP2 2 0
211056_s_at
Steroid-5-alpha-reductase, alpha
polypeptide 1 (3-oxo-5 alpha-steroid delta
4-dehydrogenase alpha 1)
SRD5A1 2 0
203114_at
Sjogren's syndrome/scleroderma
autoantigen 1
SSSCA1 -3 123
212225_at Putative translation initiation factor SUI1 2 0
213879_at
SMT3 suppressor of mif two 3 homolog 2
(yeast)
SUMO2 -2 0
209198_s_at Synaptotagmin XI SYT11 2 333
205613_at synaptotagmin XVII SYT17 2 261
200055_at
TAF10 RNA polymerase II, TATA box
binding protein (TBP)-associated factor,
30kDa
TAF10 -2 0
221264_s_at TAR DNA binding protein TARDBP 2 257
220947_s_at TBC1 domain family, member 10B TBC1D10B 2 197
209403_at TBC1 domain family, member 3 TBC1D3 2 345
211052_s_at Tubulin-specific chaperone d TBCD 2 0
221035_s_at Testis expressed sequence 14 TEX14 2 116
214451_at
Transcription factor AP-2 beta (activating
enhancer binding protein 2 beta)
TFAP2B 3 110
218996_at
TCF3 (E2A) fusion partner (in childhood
Leukemia)
TFPT -3 439
210215_at Transferrin receptor 2 TFR2 -3 0
216262_s_at
TGFB-induced factor 2 (TALE family
homeobox)
TGIF2 -2 0
212207_at
Thyroid hormone receptor associated
protein 2
THRAP2 2 0
218408_at
Translocase of inner mitochondrial
membrane 10 homolog (yeast)
TIMM10 -4 23
218188_s_at
Translocase of inner mitochondrial
membrane 13 homolog (yeast)
TIMM13 -2 0
203092_at
Translocase of inner mitochondrial
membrane 44 homolog (yeast)
TIMM44 -3 0
218357_s_at
Translocase of inner mitochondrial
membrane 8 homolog B (yeast)
TIMM8B -2 0
203679_at
Transmembrane emp24 domain containing
1
TMED1 -3 0
204488_at Transmembrane protein 15 TMEM15 -2 0
218804_at Transmembrane protein 16A TMEM16A 3 155
218113_at Transmembrane protein 2 TMEM2 2 0
206299_at Transmembrane protein 28 TMEM28 2 0
221951_at transmembrane protein 80 TMEM80 2 0
214955_at Transmembrane protease, serine 6 TMPRSS6 -2 0
202643_s_at
Tumor necrosis factor, alpha-induced
protein 3
TNFAIP3 2 0
202644_s_at
Tumor necrosis factor, alpha-induced
protein 3
TNFAIP3 2 453
210405_x_at
Tumor necrosis factor receptor
superfamily, member 10b
TNFRSF10B -3 319
205388_at Troponin C2, fast TNNC2 -2 102
206393_at Troponin I, skeletal, fast TNNI2 -2 0
214774_x_at Trinucleotide repeat containing 9 TNRC9 2 0
215108_x_at Trinucleotide repeat containing 9 TNRC9 2 124
216623_x_at Trinucleotide repeat containing 9 TNRC9 2 0
218864_at Tensin TNS -2 0
201512_s_at
Translocase of outer mitochondrial
membrane 70 homolog A (yeast)
TOMM70A -2 227
217914_at Two pore segment channel 1 TPCN1 2 115
204140_at Tyrosylprotein sulfotransferase 1 TPST1 -2 0
202368_s_at
Translocation associated membrane protein
2
TRAM2 2 164
217958_at Trafficking protein particle complex 4 TRAPPC4 -3 42
217959_s_at Trafficking protein particle complex 4 TRAPPC4 -3 130
206961_s_at
Trf (TATA binding protein-related factor)-
proximal homolog (Drosophila)
TRFP -2 0
218145_at Tribbles homolog 3 (Drosophila) TRIB3 2 0
202342_s_at Tripartite motif-containing 2 TRIM2 2 85
213009_s_at Tripartite motif-containing 37 TRIM37 2 248
219299_at
tRNA methyltransferase 12 homolog (S.
cerevisiae)
TRMT12 -2 0
218502_s_at Trichorhinophalangeal syndrome I TRPS1 -2 538
209074_s_at TU3A protein TU3A 2 309
216609_at Thioredoxin TXN -2 206
208958_at
Thioredoxin domain containing 4
(endoplasmic reticulum)
TXNDC4 3 0
215577_at
Ubiquitin-conjugating enzyme E2E 1
(UBC4/5 homolog, yeast)
UBE2E1 2 73
218235_s_at
UTP11-like, U3 small nucleolar
ribonucleoprotein, (yeast)
UTP11L -2 286
221514_at
UTP14, U3 small nucleolar
ribonucleoprotein, homolog A (yeast)
UTP14A -3 252
221513_s_at
UTP14, U3 small nucleolar
ribonucleoprotein, homolog C (yeast)
UTP14C -2 0
201556_s_at
Vesicle-associated membrane protein 2
(synaptobrevin 2)
VAMP2 2 0
201557_at
Vesicle-associated membrane protein 2
(synaptobrevin 2)
VAMP2 3 322
214792_x_at
Vesicle-associated membrane protein 2
(synaptobrevin 2)
VAMP2 2 438
214226_at
Vitamin K epoxide reductase complex,
subunit 1
VKORC1 2 141
218731_s_at
Von Willebrand factor A domain-related
protein
WARP 3 33
218512_at WD repeat domain 12 WDR12 -2 0
220917_s_at WD repeat domain 19 WDR19 2 157
218882_s_at WD repeat domain 3 WDR3 -3 153
202249_s_at WD repeat domain 42A WDR42A 3 312
214662_at WD repeat domain 43 WDR43 -3 3
221712_s_at
WD repeat domain 74 /// WD repeat
domain 74
WDR74 -3 96
221113_s_at
Wingless-type MMTV integration site
family, member 16
WNT16 -2 0
221609_s_at
Wingless-type MMTV integration site
family, member 6
WNT6 2 163
219077_s_at WW domain containing oxidoreductase WWOX 3 247
218753_at
X Kell blood group precursor-related
family, member 8
XKR8 -3 5
203655_at
X-ray repair complementing defective
repair in Chinese hamster cells 1
XRCC1 -3 399
221848_at
Zinc finger, CCCH-type with G patch
domain
ZGPAT -2 0
57539_at
Zinc finger, CCCH-type with G patch
domain
ZGPAT -3 0
206373_at
Zic family member 1 (odd-paired homolog,
Drosophila)
ZIC1 -2 448
219548_at Zinc finger protein 16 (KOX 9) ZNF16 -4 398
200054_at Zinc finger protein 259 ZNF259 -3 409
217185_s_at Zinc finger protein 259 ZNF259 -2 425
214714_at Zinc finger protein 394 ZNF394 -2 0
205964_at Zinc finger protein 426 ZNF426 -2 507
205039_s_at
Zinc finger protein, subfamily 1A, 1
(Ikaros)
ZNFN1A1 -2 0
213097_s_at Zuotin related factor 1 ZRF1 -2 0
210733_at --- --- -3 243
212044_s_at Similar to 60S ribosomal protein L27a --- -2 49
215450_at --- --- -2 0
216410_at --- --- -2 0
216421_at --- --- -2 0
217107_at --- --- -3 295
217191_x_at --- --- -3 0
217313_at --- --- -2 0
217623_at
Transcribed locus, moderately similar to
XP_524454.1 PREDICTED: hypothetical
protein XP_524454 [Pan troglodytes]
--- 2 194
221877_at
CDNA FLJ46713 fis, clone
TRACH3016885
--- -2 452
* χ² Cox regression scores. Negative scores indicate genes whose increased expression
is associated with good outcome; positive scores indicate genes whose increased
expression is associated with poor outcome.
** Rank as determined by 50% sample cross-validated Cox regression modeling of outcome
gene expression in RMS patients.
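The sign convention in the footnote above can be applied mechanically once a table row has been joined onto a single line. The following is a minimal sketch, not part of the original analysis: the names `OutcomeGene`, `parse_row`, and `outcome_direction` are hypothetical, and the pattern assumes a probe set ID ending in `_at`, a gene symbol without spaces, a signed integer χ² score, and an integer rank.

```python
import re
from typing import NamedTuple


class OutcomeGene(NamedTuple):
    """One row of the outcome table (hypothetical container)."""
    affy_id: str
    gene_name: str
    symbol: str
    chi2: int   # signed chi-square Cox regression score
    rank: int   # cross-validation rank as printed in the table


# Affy ID, free-text gene name, symbol (no spaces), signed score, rank.
ROW = re.compile(r"^(\S+_at)\s+(.+)\s+(\S+)\s+(-?\d+)\s+(\d+)$")


def parse_row(line: str) -> OutcomeGene:
    """Parse a row that has already been joined onto one line."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not a complete table row: {line!r}")
    affy_id, name, symbol, chi2, rank = match.groups()
    return OutcomeGene(affy_id, name, symbol, int(chi2), int(rank))


def outcome_direction(gene: OutcomeGene) -> str:
    """Negative score: higher expression tracks good outcome; positive: poor."""
    return "good" if gene.chi2 < 0 else "poor"


if __name__ == "__main__":
    for text in (
        "201160_s_at Cold shock domain protein A CSDA -3 183",
        "204584_at L1 cell adhesion molecule L1CAM 3 19",
    ):
        gene = parse_row(text)
        print(gene.symbol, gene.chi2, gene.rank, outcome_direction(gene))
```

Rows whose symbol column contains spaces (for example `LOC146712 /// B3GNTL1`) or whose name or symbol is given as `---` would need separate handling; the sketch makes no assumption about the meaning of a rank of 0, which the footnote does not define.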
Supplementary Table 19. Gene ontology categories over-represented in
'poor' and 'good' outcome gene lists as determined by EASE analysis
Columns: System; Gene Category; Poor Outcome Gene List (Number of Genes*, EASE Score†); Good Outcome Gene List (Number of Genes, EASE Score)
Chromosome Homo sapiens 16 22 1.4E-04 5 1.0
Chromosome Homo sapiens 16p 14 8.5E-04 4 1.0
GO Cellular
Component
cytoplasmic vesicle 9 0.001 2 0.9
GO Biological
Process
negative regulation of transcription 8 0.001 3 0.7
KEGG pathway
Amino Acid Metabolism - Homo
sapiens
11 0.002 1 1.0
GO Cellular
Component
coated vesicle 7 0.002 2 0.9
GO Cellular
Component
clathrin-coated vesicle 6 0.005 2 0.8
GO Biological
Process
regulation of neurotransmitter levels 5 0.006 1 1.0
GO Biological
Process
neurotransmitter secretion 4 0.007 1 1.0
GO Cellular
Component
synaptic vesicle 5 0.007 1 1.0
GO Molecular
Function
cytoskeletal protein binding 12 0.01 5 0.9
GO Molecular
Function
oxidoreductase activity, acting on
the aldehyde or oxo group of donors
4 0.01 1 1.0
GO Molecular
Function
tubulin binding 4 0.01 0 1.0
Chromosome Homo sapiens 17 21 0.01 4 1.0
GO Biological
Process
transmission of nerve impulse 10 0.02 2 1.0
GO Molecular
Function
protein binding 36 0.02 33 0.7
Chromosome Homo sapiens 17q 16 0.02 4 1.0
GO Biological
Process
regulation of synapse 3 0.02 0 1.0
GO Molecular
Function
electron transporter activity 11 0.02 6 0.8
GO Molecular
Function
transcriptional repressor activity 4 0.02 0 1.0
KEGG pathway
Phenylalanine metabolism - Homo
sapiens
3 0.03 0 1.0
Chromosome Homo sapiens 1 30 0.03 16 1.0
KEGG pathway
Arginine and proline metabolism -
Homo sapiens
4 0.03 0 1.0
GO Biological
Process
synaptic transmission 9 0.04 2 1.0
KEGG pathway
Alanine and aspartate metabolism -
Homo sapiens
3 0.04 4 0.9
GO Molecular Function | oxidoreductase activity, acting on the aldehyde or oxo group of donors, NAD or NADP as acceptor | 3 | 0.04 | 1 | 1.0
Chromosome | Homo sapiens 1p | 18 | 0.04 | 10 | 1.0
GO Biological Process | cell-cell signaling | 15 | 0.05 | 9 | 0.9
GO Biological Process | invasive growth | 3 | 0.05 | 0 | 1.0
GO Biological Process | nuclear mRNA splicing, via spliceosome | 1 | 1.0 | 6 | 0.046
GO Biological Process | RNA splicing, via transesterification reactions | 1 | 1.0 | 6 | 0.046
GO Biological Process | RNA splicing, via transesterification reactions with bulged adenosine as nucleophile | 1 | 1.0 | 6 | 0.046
GO Molecular Function | oxidoreductase activity, acting on NADH or NADPH, other acceptor | 1 | 1.0 | 4 | 0.045
GO Molecular Function | frizzled receptor activity | 0 | 1.0 | 2 | 0.045
GO Biological Process | positive regulation of growth | 0 | 1.0 | 3 | 0.042
GO Biological Process | cell growth | 2 | 0.8 | 7 | 0.041
GO Molecular Function | endoribonuclease activity | 0 | 1.0 | 4 | 0.039
GO Molecular Function | NADH dehydrogenase (ubiquinone) activity | 0 | 1.0 | 4 | 0.039
GO Molecular Function | NADH dehydrogenase activity | 0 | 1.0 | 4 | 0.039
Chromosome | Homo sapiens 20q | 6 | 0.3 | 11 | 0.038
GO Molecular Function | primary active transporter activity | 0 | 1.0 | 10 | 0.038
Chromosome | Homo sapiens 7q | 4 | 1.0 | 16 | 0.037
GO Biological Process | biosynthesis | 16 | 0.7 | 34 | 0.035
GO Biological Process | positive regulation of growth rate | 0 | 1.0 | 3 | 0.031
GO Biological Process | regulation of growth rate | 0 | 1.0 | 3 | 0.031
GO Biological Process | mRNA metabolism | 2 | 0.9 | 9 | 0.030
GO Molecular Function | tRNA-specific ribonuclease activity | 0 | 1.0 | 3 | 0.030
GenMAPP pathway | Hs_Translation Factors | 1 | 1.0 | 5 | 0.028
GO Biological Process | macromolecule biosynthesis | 12 | 0.8 | 30 | 0.026
GO Biological Process | Wnt receptor signaling pathway | 4 | 0.1 | 6 | 0.024
KEGG pathway | Proteasome - Homo sapiens | 0 | 1.0 | 4 | 0.023
GO Molecular Function | translation regulator activity | 2 | 0.8 | 7 | 0.023
GO Molecular Function | translation factor activity, nucleic acid binding | 2 | 0.8 | 7 | 0.022
GO Molecular Function | ribonuclease P activity | 0 | 1.0 | 3 | 0.021
GO Biological Process | regulation of growth | 0 | 1.0 | 4 | 0.020
GO Biological Process | protein metabolism | 36 | 0.6 | 67 | 0.020
Chromosome | Homo sapiens 8p | 2 | 0.9 | 10 | 0.018
GO Molecular Function | ribonuclease activity | 0 | 1.0 | 5 | 0.018
GO Biological Process | translation | 5 | 0.4 | 11 | 0.017
GO Biological Process | mRNA processing | 2 | 0.9 | 9 | 0.016
GO Cellular Component | mitochondrial matrix | 2 | 0.6 | 6 | 0.016
Chromosome | Homo sapiens 7 | 5 | 1.0 | 23 | 0.015
GO Biological Process | RNA splicing | 1 | 1.0 | 8 | 0.014
GO Molecular Function | translation initiation factor activity | 1 | 1.0 | 6 | 0.014
GO Biological Process | regulation of protein biosynthesis | 1 | 1.0 | 4 | 0.011
GO Cellular Component | ribosome | 3 | 0.9 | 14 | 0.009
GO Molecular Function | damaged DNA binding | 0 | 1.0 | 5 | 0.008
GO Molecular Function | endoribonuclease activity, producing 5'-phosphomonoesters | 0 | 1.0 | 4 | 0.008
GO Molecular Function | protein translocase activity | 0 | 1.0 | 4 | 0.008
GO Molecular Function | RNA cap binding | 0 | 1.0 | 3 | 0.007
GO Biological Process | 35S primary transcript processing | 0 | 1.0 | 3 | 0.005
GO Cellular Component | cytoplasm | 77 | 0.2 | 120 | 0.005
GO Molecular Function | nuclease activity | 1 | 1.0 | 9 | 0.004
GO Molecular Function | non-G-protein coupled 7TM receptor activity | 0 | 1.0 | 4 | 0.004
GO Cellular Component | cell | 158 | 0.5 | 237 | 0.003
GO Cellular Component | mitochondrial inner membrane presequence translocase complex | 0 | 1.0 | 4 | 0.002
GO Biological Process | protein biosynthesis | 10 | 0.5 | 25 | 0.002
GO Molecular Function | RNA binding | 5 | 1.0 | 23 | 0.001
GO Biological Process | mitochondrial translocation | 0 | 1.0 | 5 | 5.6E-04
Chromosome | Homo sapiens 19 | 3 | 1.0 | 35 | 5.1E-04
GO Cellular Component | nucleolus | 0 | 1.0 | 11 | 1.9E-04
Chromosome | Homo sapiens 19p | 2 | 1.0 | 21 | 1.6E-04
GO Biological Process | metabolism | 102 | 0.4 | 173 | 1.0E-04
GO Cellular Component | mitochondrion | 11 | 0.6 | 32 | 9.0E-05
GO Cellular Component | nucleus | 34 | 1.0 | 92 | 5.1E-05
GO Cellular Component | ribonucleoprotein complex | 3 | 0.9 | 25 | 3.6E-05
GO Biological Process | nucleobase, nucleoside, nucleotide and nucleic acid metabolism | 38 | 0.9 | 93 | 2.6E-05
GO Biological Process | rRNA metabolism | 0 | 1.0 | 8 | 2.5E-05
GO Molecular Function | nucleic acid binding | 37 | 0.9 | 91 | 1.5E-05
GO Biological Process | rRNA processing | 0 | 1.0 | 8 | 1.2E-05
GO Biological Process | ribosome biogenesis and assembly | 0 | 1.0 | 10 | 1.2E-05
GO Biological Process | ribosome biogenesis | 0 | 1.0 | 10 | 8.7E-06
GO Biological Process | RNA metabolism | 4 | 1.0 | 26 | 5.5E-06
GO Biological Process | RNA processing | 4 | 0.9 | 25 | 5.1E-06
Chromosome | Homo sapiens 8q | 0 | 1.0 | 23 | 1.1E-07
Chromosome | Homo sapiens 8 | 2 | 1.0 | 33 | 4.5E-09
Chromosome | Homo sapiens 11 | 6 | 1.0 | 48 | 1.7E-10
GO Cellular Component | intracellular | 111 | 0.4 | 202 | 9.7E-11
Chromosome | Homo sapiens 11q | 2 | 1.0 | 39 | 2.3E-11
* Refers to number of genes from analyzed list found in each GO category.
† EASE score is a conservative adjustment to the Fisher exact probability test.
Note: Categories highlighted in bold are considered statistically significant.
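The footnote on the EASE score can be illustrated with a minimal sketch. It assumes the commonly described form of the adjustment, in which one category gene is removed from the analyzed list before the one-tailed Fisher exact (hypergeometric) probability is computed. The function names and the choice to leave the background counts unchanged are illustrative assumptions, not the implementation used to generate Supplementary Table 19.

```python
from math import comb


def fisher_upper_tail(list_hits, list_total, pop_hits, pop_total):
    """One-tailed Fisher exact probability of observing at least
    `list_hits` category genes in a list of `list_total` genes drawn
    from a background of `pop_total` genes, `pop_hits` of which
    belong to the category."""
    max_hits = min(list_total, pop_hits)
    denom = comb(pop_total, list_total)
    tail = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / denom


def ease_style_score(list_hits, list_total, pop_hits, pop_total):
    """EASE-style score (assumed form): drop one category gene from
    the list, then compute the upper-tail Fisher probability.
    Background counts are left unchanged (an assumption)."""
    if list_hits < 1:
        return 1.0
    return fisher_upper_tail(list_hits - 1, list_total - 1, pop_hits, pop_total)
```

Under this form, a category supported by a single gene always scores 1.0, because removing that gene leaves no evidence of enrichment. The penalty therefore bears most heavily on small categories, which is the sense in which the adjustment is conservative.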
Appendix B
List of Publications and Manuscripts from Thesis and Related Works
Peer Reviewed Publications
1. Yanlin Yu, Elai Davicioni, Timothy J. Triche, and Glenn Merlino. 2006. The
homeoprotein Six1 upregulates multiple pro-tumorigenic genes yet requires
Ezrin to promote metastasis. Cancer Research, 66, 1982-1989.
2. Elai Davicioni*, Friedrich Graf Finckenstein*, Violette Shahbazian, Jonathan D.
Buckley, Timothy J. Triche, and Michael J. Anderson. 2005. Identification of a
PAX-FKHR Gene Expression Signature that Defines Molecular Classes and
Determines the Prognosis of Alveolar Rhabdomyosarcomas. Cancer Research,
manuscript in press.
* Note these authors contributed equally to this work.
Manuscripts in Review
1. Elai Davicioni, Michael J. Anderson, Friedrich Graf Finckenstein, James C.
Lynch, Poul H.B. Sorensen, Stephen J. Qualman, Hiroyuki Shimada, Deborah E.
Schofield, Jonathan D. Buckley, William H. Meyer and Timothy J. Triche. 2005.
Molecular Classification of Rhabdomyosarcoma: Genotypic and Phenotypic
Determinants of Diagnosis and Prognosis. American Journal of Pathology,
manuscript in review.
2. Graf Finckenstein F, Davicioni E, Osborn KG, Arden KC, Cavenee WK,
Anderson MJ. 2005. Transgenic mice expressing the tumor specific
PAX3-FKHR fusion transcription factor have multiple defects in muscle development,
including ectopic skeletal myogenesis in the developing neural tube. Transgenic
Research, manuscript in review.
3. Jian-guang Wang, Hui Peng, Lora W. Barsky, Elai Davicioni, Kenneth I.
Weinberg, Timothy J. Triche, Xiao-kun Zhang, and Lingtao Wu. 2005. Role of
decreased PML/RARα binding to and phosphorylation by CAK in
retinoid-mediated leukemia cell G1 arrest and transition into differentiation. Cancer
Research, manuscript in review.
Manuscripts in Preparation
1. Elai Davicioni, Jonathan D. Buckley, James C. Lynch, William H. Meyer and
Timothy J. Triche. Gene Expression Profiling for Survival Prediction in Pediatric
Rhabdomyosarcomas, manuscript in preparation.
2. Friedrich Graf Finckenstein, Elai Davicioni, Violette Shahbazian and Michael
J. Anderson. The PAX-FKHR fusion proteins transcriptionally activate MyoD
while inhibiting its function, thereby inducing a limited myogenic phenotype
resembling that of alveolar rhabdomyosarcoma, manuscript in preparation.
3. Elai Davicioni*, James Witowsky*, Timothy J. Triche and Erwin F. Wagner.
Gene Expression Profiling of a Trp53/Fos Double Knockout Mouse Model for
Rhabdomyosarcoma, manuscript in preparation.
* Note these authors contributed equally to this work.
4. Sandeep Batra, Elai Davicioni, Jingsong Zhang, Zesheng Wan, Timothy J.
Triche, and Charles P. Reynolds. Expression profiling of drug-resistant Ewing’s
sarcoma cell lines, manuscript in preparation.
5. Nino Keshelava, Zesheng Wan, Elai Davicioni, Xuan Chen, Timothy J. Triche,
and C. Patrick Reynolds. Multi-Drug Resistant Neuroblastoma Cell Lines
Overexpress Histone Deacetylase (HDAC) 1 and HDAC Inhibitors Can
Synergistically Enhance the Response to Cytotoxic Agents, manuscript in
preparation.
Asset Metadata
Creator: Davicioni, Elai (author)
Core Title: Molecular classification, diagnosis and prognosis of pediatric rhabdomyosarcoma by oligonucleotide microarray analyses
Contributor: Digitized by ProQuest (provenance)
Degree: Doctor of Philosophy
Degree Program: Pathobiology
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: biology, biostatistics; biology, molecular; health sciences, oncology; health sciences, pathology; OAI-PMH Harvest
Language: English
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c16-437714
Unique identifier: UC11342277
Identifier: 3237123.pdf (filename), usctheses-c16-437714 (legacy record id)
Legacy Identifier: 3237123.pdf
Dmrecord: 437714
Document Type: Dissertation
Rights: Davicioni, Elai
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the au...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus, Los Angeles, California 90089, USA