ENRICHING SPOKEN LANGUAGE PROCESSING: REPRESENTATION AND
MODELING OF SUPRASEGMENTAL EVENTS
Vivek Kumar Rangarajan Sridhar
A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
Copyright 2008 Vivek Kumar Rangarajan Sridhar
Although machine processing of speech has advanced significantly, it still falls short in capturing and utilizing rich contextual information, such as prosodic prominence, phrasing, and discourse information, that is conveyed beyond the words. The work presented in this dissertation focuses on the automatic enrichment of spoken language processing through the representation and modeling of suprasegmental events such as prosody and discourse context. First, we demonstrate the suitability of maximum entropy models for the automatic recognition of these events from speech and text. The techniques that we have developed achieve state-of-the-art performance. Second, we introduce a novel framework for enriching speech translation with this rich information. Our approach is motivated by the fact that it is important to capture and convey not only what is being communicated (the words) but also how it is being communicated (the context). We show that promising improvements in translation quality can be obtained by exploiting rich annotations in conventional speech translation approaches.