MULTIMODAL ANALYSIS OF EXPRESSIVE HUMAN
COMMUNICATION:
SPEECH AND GESTURE INTERPLAY
by
Carlos Busso
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
August 2008
Copyright 2008 Carlos Busso
Object Description
| Title | Multimodal analysis of expressive human communication: speech and gesture interplay |
| Author | Busso, Carlos |
| Author email | busso@usc.edu; carlosbusso@gmail.com |
| Degree | Doctor of Philosophy |
| Document type | Dissertation |
| Degree program | Electrical Engineering (Multimedia & Creative Technology) |
| School | Viterbi School of Engineering |
| Date defended/completed | 2008-05-09 |
| Date submitted | 2008 |
| Restricted until | Unrestricted |
| Date published | 2008-08-04 |
| Advisor (committee chair) | Narayanan, Shrikanth S. |
| Advisor (committee member) | Kuo, C.-C. Jay; Neumann, Ulrich |
| Abstract | The verbal and non-verbal channels of human communication are internally and intricately connected. As a result, gestures and speech present high levels of correlation and coordination. This relationship is greatly affected by the linguistic and emotional content of the message being communicated. The interplay is observed across the different communication channels such as various aspects of speech, facial expressions, and movements of the hands, head and body. For example, facial expressions and prosodic speech tend to have a stronger emotional modulation when the vocal tract is physically constrained by the articulation to convey other linguistic communicative goals. As a result of the analysis, applications in recognition and synthesis of expressive communication are presented. From an emotion recognition perspective, we propose to build acoustically neutral models, which are used to measure the degree of similarity between the input speech and neutral speech. A fitness measure is then used as a feature for classification, achieving better performance than conventional classification schemes in terms of accuracy and robustness. In addition to detecting users' emotions, we analyze how to use such ideas for meta-analysis of user behavior, such as automatically monitoring and tracking the behaviors, strategies and engagement of the participants in multiperson interactions. We describe a case study of an intelligent meeting environment equipped with audio-visual sensors. We accurately estimate in real time not only the flow of the interaction, but also how dominant and engaged each participant was during the discussion. Finally, we show examples of how to synthesize expressive behavior by exploiting the interrelation between speech and gestures. We propose to synthesize natural head motion sequences from acoustic prosodic features by sampling from trained Hidden Markov Models (HMMs). Our comparison experiments show that the synthesized head motions are perceived to be as natural as the captured head motion sequences. |
| Keyword | human communication; emotion; expressive behavior |
| Language | English |
| Part of collection | University of Southern California dissertations and theses |
| Publisher (of the original version) | University of Southern California |
| Place of publication (of the original version) | Los Angeles, California |
| Publisher (of the digital version) | University of Southern California. Libraries |
| Provenance | Electronically uploaded by the author |
| Type | texts |
| Legacy record ID | usctheses-m1533 |
| Rights | Busso, Carlos |
| Repository name | Libraries, University of Southern California |
| Repository address | Los Angeles, California |
| Repository email | http://www.usc.edu/isd/libraries/services/ask_a_librarian/email/ |
| Filename | etd-Busso-2231 |
| Archival file | uscthesesreloadpub_Volume14/etd-Busso-2231-0.pdf |

