Towards Social Virtual Listeners: Computational Models of Human Nonverbal Behaviors by Derya Ozkan Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL UNIVERSITY OF SOUTHERN CALIFORNIA In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (COMPUTER SCIENCE) May 2014 Copyright 2014 Derya Ozkan
|Title||Towards social virtual listeners: computational models of human nonverbal behaviors|
|Degree||Doctor of Philosophy|
|Degree program||Computer Science|
|School||Viterbi School of Engineering|
|Advisor (committee chair)||Morency, Louis-Philippe|
|Advisor (committee member)||Narayanan, Shrikanth S.; Medioni, Gérard G.; Marsella, Stacy C.|
|Abstract||Human nonverbal communication is a highly interactive process, in which the participants dynamically send and respond to nonverbal signals. These signals play a significant role in determining the nature of a social exchange. Although humans can naturally recognize, interpret, and produce these nonverbal signals in social contexts, computers are not equipped with such abilities. Therefore, creating computational models for holding fluid interactions with human participants has become an important topic for many research fields, including human‐computer interaction, robotics, artificial intelligence, and cognitive sciences. Central to the problem of modeling social behaviors is the challenge of understanding the dynamics involved in listener backchannel feedback (i.e., the nods and paraverbals such as "uh‐hu" and "mm‐hmm" that listeners produce as someone is speaking). In this thesis, I present a framework for modeling visual backchannels of a listener during a dyadic conversation. I address the four major challenges involved in modeling nonverbal human behaviors, more specifically listener backchannels: (1) High Dimensionality: Human communication is a complicated phenomenon that involves many behaviors (i.e., dimensions) such as smiles, nods, hand movements, and voice pitch. A better understanding and analysis of social behaviors can be obtained by discovering the subset of features relevant to a specific social signal (e.g., backchannel feedback). In this thesis, I present a new feature ranking scheme which exploits the sparsity of probabilistic models when trained on human behavior problems. This technique gives researchers a new tool to analyze individual differences in social nonverbal communication. Furthermore, I present a feature selection approach which first looks at the important behaviors for each individual, called self‐features, before building a consensus. 
(2) Multimodal Processing: This high‐dimensional data comes from different communicative channels (modalities) that contain complementary information essential to the interpretation and understanding of human behaviors. Effective and efficient fusion of these modalities is therefore a challenging task. If integrated carefully, different modalities have the potential to provide complementary information that will improve the model performance. In this thesis, I introduce a new model called Latent Mixture of Discriminative Experts (LMDE), which can automatically learn the temporal relationship between different modalities. Since I train separate experts for each modality, LMDE is capable of improving the prediction performance even with a limited amount of data. (3) Visual Influence: Human communication is dynamic in the sense that people affect each other's nonverbal behaviors (e.g., gesture mirroring). Therefore, while predicting the nonverbal behaviors of a person of interest, the visual gestures from the second interlocutor should also be taken into account. In this thesis, I propose a context‐based prediction framework that models the visual influence of an interlocutor in a dyadic conversation, even if the visual modality from the second interlocutor is absent. (4) Variability in Human Behaviors: It is known that age, gender, and culture affect people's social behaviors. Therefore, there are differences in the way people display and interpret nonverbal behaviors. A good model of human nonverbal behaviors should take these differences into account. Furthermore, gathering labeled data sets is time‐consuming and often expensive in many real‐life scenarios. In this thesis, I use the "wisdom of crowds", which enables parallel acquisition of opinions from multiple annotators/labelers. I propose a new approach for modeling wisdom of crowds called wisdom‐LMDE, which is able to learn the variations and commonalities among different crowd members (i.e., labelers).|
|Keyword||artificial intelligence; machine learning; multimodal processing; virtual agents|
|Part of collection||University of Southern California dissertations and theses|
|Publisher (of the original version)||University of Southern California|
|Place of publication (of the original version)||Los Angeles, California|
|Publisher (of the digital version)||University of Southern California. Libraries|
|Provenance||Electronically uploaded by the author|
|Legacy record ID||usctheses-m|
|Contributing entity||University of Southern California|
|Physical access||The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.|
|Repository name||University of Southern California Digital Library|
|Repository address||USC Digital Library, University of Southern California, University Park Campus MC 7002, 106 University Village, Los Angeles, California 90089-7002, USA|