CONTEXTUAL MODELING OF AUDIO SIGNALS
TOWARD INFORMATION RETRIEVAL
by
Samuel Kim
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
December 2010
Copyright 2010 Samuel Kim
Object Description
| Title | Contextual modeling of audio signals toward information retrieval |
| Author | Kim, Samuel |
| Author email | kimsamue@usc.edu; worshipersam@gmail.com |
| Degree | Doctor of Philosophy |
| Document type | Dissertation |
| Degree program | Electrical Engineering |
| School | Viterbi School of Engineering |
| Date submitted | 2010 |
| Restricted until | Unrestricted |
| Date published | 2010-11-22 |
| Advisor (committee chair) | Narayanan, Shrikanth S. |
| Advisor (committee member) | Kuo, C.-C. Jay; Shahabi, Cyrus |
| Abstract | The main focus of this dissertation is on audio modeling and indexing toward audio information retrieval. In this regard, various novel methodologies are proposed for capturing audio context across a wide spectrum of audio content, from well-structured music to unstructured environmental sound. This dissertation consists of two major parts, divided by type of audio content: music information retrieval and general audio information retrieval. In the first part, an efficient context-based music information retrieval method using a music fingerprint is introduced. The music fingerprint is proposed to encapsulate the musical context of a given music recording in a compact representation obtained directly from the audio signal; it provides an efficient handle for music information retrieval in terms of both accuracy and computing requirements. The musically meaningful aspects considered in deriving this representation include harmonic structures and their temporal dynamics (i.e., chord progressions). Empirical results on various music information retrieval tasks, such as opus identification, composer identification, and semantic description annotation, show that the proposed music fingerprint is competitive with state-of-the-art systems in terms of accuracy and computing power requirements. In the second part, a new contextual modeling algorithm for general audio information retrieval is introduced. Assuming that hidden acoustic topics exist and that they represent the context of an audio clip, we propose a latent acoustic topic model that learns, in an unsupervised manner, a probability distribution over a set of hidden topics for a given audio clip. We use the latent Dirichlet allocation (LDA) method to implement the latent acoustic topic model and introduce the notion of acoustic words to support modeling within this framework. The proposed audio information retrieval system also aims to give users flexibility in formulating their retrieval queries, using naive text as well as pre-determined categories or audio examples. To mitigate interoperability issues between the annotation and retrieval processes inherent in text descriptions, we propose an intermediate audio description layer (iADL) spanned by onomatopoeic and semantic labels, in conjunction with context-based text transformation methods that map naive descriptions onto the proposed iADL. |
| Keyword | audio context; audio modeling; content-based audio information retrieval; context-based audio information retrieval; music information retrieval; music fingerprint; cover-song identification; composer identification; environmental sound; unstructured audio; acoustic topic models; acoustic word; text-like audio signal processing; naive sound description modeling; intermediate audio descriptive layer |
| Language | English |
| Part of collection | University of Southern California dissertations and theses |
| Publisher (of the original version) | University of Southern California |
| Place of publication (of the original version) | Los Angeles, California |
| Publisher (of the digital version) | University of Southern California. Libraries |
| Provenance | Electronically uploaded by the author |
| Type | texts |
| Legacy record ID | usctheses-m3546 |
| Rights | Kim, Samuel |
| Repository name | Libraries, University of Southern California |
| Repository address | Los Angeles, California |
| Repository email | http://www.usc.edu/isd/libraries/services/ask_a_librarian/email/ |
| Filename | etd-Kim-4156 |
| Archival file | uscthesesreloadpub_Volume40/etd-Kim-4156.pdf |