DISCOVERY OF COMPLEX PATHWAYS FROM OBSERVATIONAL DATA by James William Baurley A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (STATISTICAL GENETICS AND GENETIC EPIDEMIOLOGY) August 2010 Copyright 2010 James William Baurley
Object Description
Title | Discovery of complex pathways from observational data |
Author | Baurley, James William |
Author email | baurley@usc.edu; jbaurle@alumni.clemson.edu |
Degree | Doctor of Philosophy |
Document type | Dissertation |
Degree program | Biostatistics |
School | Keck School of Medicine |
Date defended/completed | 2010-05-07 |
Date submitted | 2010 |
Restricted until | Unrestricted |
Date published | 2010-06-11 |
Advisor (committee chair) | Thomas, Duncan |
Advisor (committee member) | Gauderman, W. James; Conti, David V.; Gilliland, Frank D.; Zhou, Xianghong Jasmine |
Abstract | The etiology of complex diseases may involve a network of biological interactions, both genetic and environmental. With the availability of high-throughput genotyping platforms, epidemiologists can thoroughly evaluate the genetic component of complex diseases. While seemingly straightforward when the unit of analysis is a single variant, comprehensive analysis of pathways is fundamentally more involved. The objective of pathway-based approaches is ultimately to uncover associations that have a biological pathway context and are often undetectable from a “single variable at a time” perspective. Recently, there has been growing recognition that analysis methods that focus on pathways are needed to improve detection of interactions. I introduce two pathway-based frameworks aimed at discovery of complex pathways from observational data. Both approaches account for pathway uncertainty by basing inference on the posterior distribution of models. They also allow external pathway knowledge to be incorporated as priors on pathway parameters and structure or to enhance algorithm performance. The Algorithm for Learning Pathway Structure (ALPS) discovers plausible pathways from observational data and estimates both the net effect of the pathway and the relationships (interactions) among genetic or environmental risk factors. In this framework, a topology links combinations of observed variables through intermediate nodes (representing interactions) to a disease outcome. Biologic knowledge can be readily applied as a “prior topology” to give preference to more biologically plausible models. I demonstrate that ALPS can correctly identify the true risk factors and interactions across various simulated pathway configurations. As the number of genetic variants increases to the scale of modern candidate gene studies and genome-wide association studies (GWAS), the space of models grows extremely large. The second framework introduced is a Bayesian model selection algorithm (known as PEAK) in which parallel MCMC chains are used to tune the proposal density to better approximate the target density (i.e., the posterior). PEAK organizes the model space into subspaces linked through a graph derived from an ontology or a domain expert. I demonstrate the flexibility and efficiency of the framework by running PEAK on various simulated graph structures (informative, uninformative) and causal models. ALPS and PEAK were applied to real data in a pathway analysis of oxidative stress genes in a GWAS of asthma. By considering multivariate models with interactions, these methods uncovered several associations with strong Bayes factors that were missed by traditional marginal scans. ALPS and PEAK provide a valuable toolkit for pathway-based investigations of complex diseases. |
Keyword | pathways; complex disease; biostatistics; epidemiology; Markov Chain Monte Carlo; Bayesian model selection |
Language | English |
Part of collection | University of Southern California dissertations and theses |
Publisher (of the original version) | University of Southern California |
Place of publication (of the original version) | Los Angeles, California |
Publisher (of the digital version) | University of Southern California. Libraries |
Provenance | Electronically uploaded by the author |
Type | texts |
Legacy record ID | usctheses-m3125 |
Contributing entity | University of Southern California |
Rights | Baurley, James William |
Repository name | Libraries, University of Southern California |
Repository address | Los Angeles, California |
Repository email | cisadmin@lib.usc.edu |
Filename | etd-Baurley-3799 |
Archival file | uscthesesreloadpub_Volume48/etd-Baurley-3799.pdf |