PREDICTION MODELING AND STATISTICAL ANALYSIS FOR AMINO ACID
SUBSTITUTIONS
by
Hua Yang
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
December 2006
Copyright 2006 Hua Yang
Object Description
| Title | Prediction modeling and statistical analysis of amino acid substitutions |
| Author | Yang, Hua |
| Author email | huayang@usc.edu |
| Degree | Doctor of Philosophy |
| Document type | Thesis |
| Degree program | Electrical Engineering |
| School | Viterbi School of Engineering |
| Date defended/completed | 2006-10-09 |
| Date submitted | 2006 |
| Restricted until | Unrestricted |
| Date published | 2006-11-15 |
| Advisor (committee chair) | Kuo, C.-C. Jay |
| Advisor (committee member) | Chen, Ting; Leahy, Richard; Sun, Fengzhu |
| Abstract | Classifying and predicting amino acid substitutions is important in pharmaceutical and pathological research. We proposed a novel feature set derived from the physicochemical properties of amino acids, the evolutionary profiles of proteins, and protein sequence information. A large set of human disease-associated data was collected and processed, together with unbiased experimental amino acid substitutions. Decision trees, support vector machines, Gaussian mixture models, and random forests were used to classify neutral and deleterious substitutions, and a comparison of classification accuracy with published results showed that our feature set is superior to existing ones. We designed a simulated annealing bump hunting method to automatically extract interpretable rules for amino acid substitutions. The extracted rules are consistent with current biological knowledge or provide new insights for understanding substitutions. We also designed a Multiple Selection and Rule Voting (MS-RV) model, which integrates data partition and feature selection to predict and prioritize disease-associated mutations. For mutation data in the SwissProt database, its 10-fold cross-validation accuracy exceeds that of the support vector machine and random forests. We prioritized the substitutions inside thirty 10-Mb chromosomal regions that are related to monogenic diseases, and analyzed the normalized ranks. The overall area under the ROC curve (AUC) score is 86.6%. For polygenic disease-associated amino acid substitutions, we analyzed the mutations that cause Alzheimer's disease. Our method placed the disease-associated substitutions at the top ranks. The results indicate that the MS-RV model effectively prioritizes disease-associated amino acid substitutions. We also studied the unclassified mutations with high prediction scores, and found evidence to support our conclusions. |
| Keyword | machine learning; data processing; classification; prediction model; statistical analysis; simulated annealing bump hunting strategy; Monte-Carlo simulation with variable temperature; predicting |
| Language | English |
| Part of collection | University of Southern California dissertations and theses |
| Publisher (of the original version) | University of Southern California |
| Place of publication (of the original version) | Los Angeles, California |
| Publisher (of the digital version) | University of Southern California. Libraries |
| Type | texts |
| Legacy record ID | usctheses-m136 |
| Rights | Yang, Hua |
| Repository name | Libraries, University of Southern California |
| Repository address | Los Angeles, California |
| Repository email | http://www.usc.edu/isd/libraries/services/ask_a_librarian/email/ |
| Filename | etd-Yang-20061115 |
| Archival file | uscthesesreloadpub_Volume23/etd-Yang-20061115.pdf |