EMOTIONAL SPEECH RESYNTHESIS
by
Murtaza Bulut
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(ELECTRICAL ENGINEERING)
May 2008
Copyright 2008 Murtaza Bulut
Object Description
| Title | Emotional speech resynthesis |
| Author | Bulut, Murtaza |
| Author email | murtazabulut@yahoo.com |
| Degree | Doctor of Philosophy |
| Document type | Dissertation |
| Degree program | Electrical Engineering |
| School | Viterbi School of Engineering |
| Date defended/completed | 2007-11-21 |
| Date submitted | 2008 |
| Restricted until | Unrestricted |
| Date published | 2008-02-15 |
| Advisor (committee chair) | Narayanan, Shrikanth S. |
| Advisor (committee member) | Byrd, Dani; Kuo, C. C. Jay |
| Abstract | Emotions play an important role in human life. They are essential for communication, for decision making, and for survival. They pose a challenging research area across diverse disciplines such as psychology, sociology, philosophy, medicine, and engineering. One realm of inquiry relates to emotions expressed in speech. In this study our focus is on angry, happy, sad, and neutral emotions in speech. We investigate the speech acoustic correlates that are important for emotion perception in utterances and propose techniques to synthesize emotional speech that will be correctly recognized by human listeners. The motivation for our research comes from the desire to impart emotion processing capabilities to machines in order to make human-machine interactions more pleasant, effective, and productive. Instead of generating emotional speech from text, our approach starts with a natural neutral utterance and modifies its acoustic features to impart the targeted emotion. As shown by analysis and recognition studies, spectral and prosodic (F0, duration, energy) parameters can be successfully used to describe and recognize emotions. In this study we utilize these acoustic parameters for emotion resynthesis and follow an experimental methodology to investigate how they should be modified in order to produce angry, happy, or sad emotion in human speech. Based on the experimental results, a multi-level emotion-to-emotion transformation (ETET) system is proposed. This is a novel system capable of generating good-quality emotional speech. It consists of three main components that modify speech acoustic parameters at different time scales. First, spectral conversion is applied at the phoneme level; then prosody parameters are statistically estimated and modified at the part-of-speech (POS) tag level; and finally, automatically selected modification factors are applied to voiced and unvoiced regions. The proposed ETET system is robust and can be easily adapted to new emotions and speakers. The field of emotional speech synthesis is a challenging new research area. We believe that the ideas, results, and discussions presented in this study will be beneficial for advancing the rapidly growing research on emotions in speech. |
| Keyword | emotional speech; analysis; synthesis; human evaluation tests; expressive; prosody |
| Language | English |
| Part of collection | University of Southern California dissertations and theses |
| Publisher (of the original version) | University of Southern California |
| Place of publication (of the original version) | Los Angeles, California |
| Publisher (of the digital version) | University of Southern California. Libraries |
| Type | texts |
| Legacy record ID | usctheses-m1017 |
| Rights | Bulut, Murtaza |
| Repository name | Libraries, University of Southern California |
| Repository address | Los Angeles, California |
| Repository email | http://www.usc.edu/isd/libraries/services/ask_a_librarian/email/ |
| Filename | etd-Bulut-20080215 |
| Archival file | uscthesesreloadpub_Volume44/etd-Bulut-20080215.pdf |

