MULTIVARIATE TIME SERIES ANALYSIS BASED ON PRINCIPAL COMPONENT ANALYSIS by Kiyoung Yang A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE) August 2007 Copyright 2007 Kiyoung Yang
Object Description
Title | Multivariate time series analysis based on principal component analysis |
Author | Yang, Kiyoung |
Author email | kiyoungy@usc.edu |
Degree | Doctor of Philosophy |
Document type | Dissertation |
Degree program | Computer Science |
School | Viterbi School of Engineering |
Date defended/completed | 2007-05-08 |
Date submitted | 2007 |
Restricted until | Unrestricted |
Date published | 2007-07-25 |
Advisor (committee chair) | Shahabi, Cyrus |
Advisor (committee member) |
Narayanan, Shrikanth S.; Ortega, Antonio |
Abstract | A time series is a series of observations over time. When there is one observation at each time instance, it is called a univariate time series (UTS); when there is more than one observation, it is called a multivariate time series (MTS). While UTS datasets have been extensively explored, MTS datasets have not been broadly investigated. The techniques for UTS datasets, however, cannot simply be extended to MTS datasets, since a multivariate time series is different from multiple univariate time series. That is, an MTS item may not be broken into multiple univariate time series and analyzed separately, because this would lose the correlation information within the multivariate time series.
In this dissertation, we introduce a set of techniques for multivariate time series analysis based on principal component analysis (PCA). As a similarity measure for MTS datasets, we present Eros (Extended Frobenius norm). Eros computes the similarity between two MTS items by comparing the corresponding principal components and using the variances that the principal components represent as weights.
For efficient retrieval of MTS items using Eros, we introduce an index structure for Eros, termed Muse (Multilevel distance-based index structure for Eros). Given a query item, Muse first utilizes the lower bound of Eros to filter out the MTS items that cannot be in the set of k nearest neighbors. Subsequently, Muse refines the MTS items that are not filtered out by employing Eros in order to exactly identify the k nearest neighbors of the given query item.
Inherently, an MTS item is very high dimensional. Hence it is generally beneficial to reduce the dimensionality of the dataset, eliminating irrelevant and/or redundant data, before applying data mining techniques such as classification and clustering. For Eros, we present a feature subset selection technique, termed Ropes (Recursive Feature Elimination on Common Principal Components for Eros). Ropes utilizes the common principal components and the weights recursively in order to select a subset of features for Eros.
In addition, utilizing the correlation information and Eros, we introduce a set of feature subset selection and feature extraction techniques for multivariate time series datasets: Corona (Correlation as Features), CLeVer (descriptive Common principal component Loading based Variable subset selection) and KEros. Corona is a supervised feature subset selection technique, which first represents an MTS item using the correlation coefficients and then recursively eliminates one feature at a time based on its contribution to the classification decision boundary. CLeVer is an unsupervised feature subset selection technique, which performs the feature subset selection based on the contribution to the common principal components. KEros performs the feature extraction based on the Kernel PCA technique using Eros as the similarity measure between two MTS items.
With the advent of various sensing techniques, there are cases where each data item is represented as an n-way array, where n is greater than 2. One example is functional Magnetic Resonance Imaging (fMRI) data, where each data item is represented as a 3-way array and an fMRI stream is represented as a 4-way array. An n-way array may be flattened into a matrix, to which, for example, Eros can be applied. However, this flattening may result in the loss of the spatial correlation. In order to address this problem, we extend Eros to n-way array datasets; the extension is termed nEros (n-way Eros). Intuitively, for an n-way array, there are n ways of unfolding it into a matrix. For each unfolding, we compute Eros, and sum up the n results into one similarity value.
Our experimental evaluation employing various real-world and synthetic datasets shows that the presented techniques based on the correlation information within the MTS items perform better than traditional approaches that do not utilize the correlation information, e.g., Euclidean distance. |
Keyword | time series; principal component analysis; kernel methods; similarity measure; index structure; feature selection; feature extraction; stationarity; n-way analysis |
Language | English |
Part of collection | University of Southern California dissertations and theses |
Publisher (of the original version) | University of Southern California |
Place of publication (of the original version) | Los Angeles, California |
Publisher (of the digital version) | University of Southern California. Libraries |
Type | texts |
Legacy record ID | usctheses-m664 |
Contributing entity | University of Southern California |
Rights | Yang, Kiyoung |
Repository name | Libraries, University of Southern California |
Repository address | Los Angeles, California |
Repository email | cisadmin@lib.usc.edu |
Filename | etd-Yang-20070725 |
Archival file | uscthesesreloadpub_Volume51/etd-Yang-20070725.pdf |
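Illustrative sketch (not part of the original record). The abstract describes Eros as comparing the corresponding principal components of two MTS items and weighting them by the variances those components represent. The Python code below is a minimal sketch of that idea only. The function names, the use of the covariance matrix, and in particular the weighting scheme (the average of the two items' normalized eigenvalues) are assumptions made for illustration; this record does not state how the dissertation derives its weights.

```python
import numpy as np


def principal_components(item):
    """Return (components, variances) for one MTS item.

    item: array of shape (timesteps, variables), with at least 2 variables.
    components: columns are unit eigenvectors of the covariance matrix,
    ordered by decreasing eigenvalue (variance).
    """
    cov = np.cov(item, rowvar=False)           # variables x variables
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    variances = np.clip(eigvals[order], 0.0, None)  # guard tiny negatives
    return eigvecs[:, order], variances


def eros_like_similarity(item_a, item_b):
    """Weighted sum of |cosine| between corresponding principal components.

    ASSUMPTION: the weights are the average of the two items' variance
    fractions. This is a hypothetical choice for the sketch, not the
    weighting defined in the dissertation.
    """
    comps_a, var_a = principal_components(item_a)
    comps_b, var_b = principal_components(item_b)
    weights = 0.5 * (var_a / var_a.sum() + var_b / var_b.sum())
    # Eigenvector signs are arbitrary, so compare absolute inner products.
    cosines = np.abs(np.sum(comps_a * comps_b, axis=0))
    return float(np.dot(weights, cosines))
```

Because only the variables-by-variables covariance matrices are compared, the two items may have different numbers of time steps as long as they share the same variables. Under the assumed weighting the result lies between 0 and 1, and an item compared with itself scores 1.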