ASYMMETRICAL DISCOURSE IN A COMPUTER-MEDIATED ENVIRONMENT

by

Michael Dennis Rushforth

A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
(HISPANIC LINGUISTICS)

August 2012

Copyright 2012 Michael Dennis Rushforth

ACKNOWLEDGEMENTS

As I reflect upon my doctoral program, I feel deeply grateful to the many people who have influenced and helped me. I especially want to thank my dissertation chair, Carmen Silva-Corvalán, for her steadfast support and encouragement. Her guidance was invaluable. My USC experience began when she welcomed me with a phone call, letting me know I had been admitted; it was with a sense of poetic closure that I knelt to receive my doctoral hood from her in the graduation celebrations.

I am grateful to Mario Saltarelli for his generosity of time and his confidence in me as my research interests evolved. I fondly recall the many times he let me importune him in his office and at his home to discuss my latest ideas and projects. To David Traum, I am indebted for helping me fulfill a dream by working at the Institute for Creative Technologies. I felt truly blessed to be in the company of great scholars in such diverse areas of expertise working together. To Lanita Jacobs, I am grateful for her straightforwardness in challenging me to focus on something I felt passionate about. I also wish to thank Ed Finegan for introducing me to the study of discourse. It was a privilege to be his teaching assistant, and I see the world differently because of what I learned from him. I am grateful to Dmitri Williams, who helped acquaint me with the formal study of online communities. It was during his class discussions that I learned to connect the ideas of community values and the structure of computer code.

I thank my colleagues in the NLD group at ICT: Sudeep Gandhe, Ron Artstein, Antonio Roque, David Devault, Kenji Sagae, Anton Leuski, Kallirroi Georgila, Priti Aggarwal, Fabrizio Morbini, Angela Nazarian and Jillian Gerten. The work I did with TacQ for this dissertation would not have been possible without their contributions. Warm thanks to the Cougarboard community and, notably, to Steve Meyers for answering my many questions and helping me obtain the data I needed. Many thanks to Joyce Perez, my friend and graduate advisor, for keeping me on track and always knowing the answers to my questions. To my fellow graduate students and Spanish lecturers, I am grateful for your friendship.

Heartfelt thanks to my fairy godmother-in-law Sue, who fed and cared for my family at critical points along the way so I could work. Thanks to my father-in-law Tom, for being so supportive and for accommodating Sue's travels to help me. I thank my father for setting an example and for having confidence that I could earn a PhD. Thanks to my mother for many meals and hours of childcare while I worked, and to both of my parents for being sounding boards for my ideas. Thanks to my brother Craig and his wife Jessica, for their unfailing hospitality and for being the "family glue". I am grateful to my brother and fellow Trojan, Shaun, for joining me in Los Angeles. I thank my children - Ethan, Scott and Elizabeth Axie - for their daily prayers, and I thank my baby, Emily Carmen, for delaying her arrival just long enough for me to finish. Finally, I thank my wife, Erin, who has been there for me every step of the way with this dissertation.
iv TABLE OF CONTENTS Acknowledgements ii List of Tables v List of Figures vii Abstract viii Chapter One: Introduction 1 Chapter Two: Holding the Floor in a Threaded Message Board 6 Literature Review 7 Structural Description of Cougarboard 12 Conventions of Discourse 18 The Conversation Floor 19 Controlling the Floor 26 Discussion and Conclusions 32 Chapter Three: User Preferences for Feedback from a Pedagogical Conversation Agent 35 Review of Literature 36 Conversation Agent Design 42 Methodology 45 Results and Discussion 46 Conclusions and Future Work 55 Chapter Four: Authoring Virtual Humans and Asymmetrical Discourse 58 Background 61 Role playing, Asymmetries and Authoring 62 Building Conversation Networks 74 Comprehension Asymmetry 76 Conclusion 79 Chapter Five: Conclusions and Future Work 81 Bibliography 87 Appendix: Interaction Survey for Conversation Partner 93 v LIST OF TABLES Table 2.1: Quantiles for time from initial message to reply messages from Figure 2.5 13 Table 2.2: Summary of fit for reads per message 29 Table 2.3: Analysis of variance for reads per message 29 Table 2.4: Parameter estimates for reads per message 29 Table 2.5: Summary of fit for length of time in the community 31 Table 2.6: Analysis of variance for length of time in the community 31 Table 2.7: Parameter estimates for length of time in the community 31 Table 3.1: Enjoyableness of chatting with Rosa 46 Table 3.2: Usefulness of chatting with Rosa 47 Table 3.3: Ranking of online activities for enjoyableness 47 Table 3.4: Ranking of online activities for usefulness 48 Table 3.5: Receiving feedback during the chat 48 Table 3.6: Feedback as it relates to usefulness of language activities 49 Table 3.7: Enjoyableness of one agent vs. two agents 49 Table 3.8: Usefulness of one agent vs. two agents 50 Table 3.9: Average rankings of online activities for enjoyableness and usefulness 51 Table 3.10: Average rankings of feedback sources for enjoyableness and usefulness 54 Table 4.1: Sample domain 66 Table 4.2: Dialogue act frames 68 vi Table 4.3: Object or attribute 69 Table 4.4: Single vs multiple true values 70 Table 4.5: Addressing yes/no questions about relationships 72 Table 4.6: Multiple values for a relationship attribute 73 Table 4.7: Representing events 74 Table 4.8: A negotiation network 75 vii LIST OF FIGURES Figure 2.1: A message 13 Figure 2.2: A thread 15 Figure 2.3: The message board 16 Figure 2.4: The front page 17 Figure 2.5: Life span in minutes of 1000 threads 22 Figure 2.6: Reads per message based on linear order 24 Figure 2.7: Reads per message based on depth 25 Figure 3.1: The Rosa chatbot 43 Figure 4.1: End-user interacting with a virtual human 59 viii ABSTRACT Over the course of my studies in linguistics, I became intrigued by the impact of computer code on communication. As physical space can be used to provide communicative advantages to one party over another, so the computer code that structures a virtual communicative channel shapes discourse patterns. This dissertation is organized as a collection of three papers, each of which considers the asymmetries of discourse in a different virtual environment. The first environment is an online message board for sports fans. Its conversations follow a tree-structure format which identifies whether the author of a message is a donor to the website, a social status marker signaled by the underlying computer code. In this chapter, I investigate how the board’s tree-structure influences which messages are read. 
I also consider what quantifiable differences in participation and readership exist between donors and non-donors. The study of the board structure demonstrates that a reply’s proximity to its parent messages affects its readership, with those replies closest to the parent message receiving the highest readership. The study also finds that donors have a higher participation rate in conversations, but on average, messages posted by donors receive slightly less readership per message than those of non-donors. The second environment is in the domain of second language learning and examines students in first semester university Spanish interacting with a virtual conversation partner in Spanish. The conversations followed a format similar to in-class role play activities and were guided by prompted questions from the virtual agent. The ix study shows that students believe that metalinguistic feedback is necessary for a language learning activity to be useful, although there was not a consensus on the pedagogical effects of the feedback. The pilot study also indicates a preference to have feedback delivered by a separate virtual agent, rather than have the role of conversation partner and tutor be executed by the same agent. The third environment is one of virtual agents designed for tactical questioning training. This chapter looks at interactions primarily from the perspective of the authors creating interactive narratives; it examines communicative asymmetries that are inherent to authoring, as well as restrictions imposed by the specific architecture used to develop the agents. As computers become increasingly ubiquitous as a communication tool, it is important to consider how different environments are structured through computer code. These three studies contribute to the understanding of how design decisions regarding computer-mediated conversation environments affect user interactions. 1 CHAPTER ONE: INTRODUCTION Imagine that you walk into a restaurant. Do you want to eat at a table or do you want to eat at the bar? If your sole purpose is to eat, you could probably make your decision with a coin toss because the food will likely taste the same at both locations. Now suppose, for a moment, that you came not only to eat, but to discuss an important project with some colleagues. If you place yourselves at the bar, the physical environment will inhibit your group’s ability to exchange ideas effectively because some individuals may be four to five seats away from each other. Choosing the table is clearly a superior choice to facilitate the meeting. Conversely, let us now imagine that you came alone, but were hoping perhaps to meet someone new. In this case, the bar is obviously the better option because you are more likely to be seated in an environment where you can easily strike up a conversation with a stranger next to you, or, at a minimum, the bartender. Neither physical space is inherently superior to the other except inasmuch as they help facilitate particular communicative goals. Knowing that physical space shapes or, at least, influences social interactions, good interior designers and architects attempt to create physical spaces to promote and enable certain kinds of interactions. Similarly, programmers structure virtual space to achieve communicative goals. For example, many news websites enable readers to post comments below articles. The structure of those comment sections is a matter of conscious programming design decisions. 
Some sites are designed such that people post their comments immediately under the article as a long list. This approach fosters 2 commentary on the article. Other websites choose a threaded commentary section where individuals can respond to one another in addition to commenting directly on the article. This fosters a sense of community. In terms of physical structure influencing communication, these two approaches are somewhat analogous to the bar or table decision mentioned earlier. The examples I have given so far show how structured space, real or virtual, facilitate conversations where the participants are on more or less equal footing. Physical space can also be used to create conversation environments that communicate hierarchical social relationships, such as a throne being placed on a riser overlooking the court. Physical space can also be used to provide advantages to one party over another, such as when a person is being questioned through a one way mirror. The subject of this dissertation lies within the second type of conversations, where there exist asymmetries in discourse. The dissertation is organized as a collection of three papers, followed by some closing commentary on general conclusions and proposals for future work. The first paper looks at the effects of design decisions taken on a particular sports message board (cougarboard.com) which determine the number of times participants have their messages read by other members of the online community. I became interested in this topic after spending time in several online sports communities which achieved varying degrees of success in growing the size of their membership. These communities used different design features in structuring their message boards. I wondered what features contributed to making some boards more successful than others. I chose Cougarboard because it had the largest membership. Since a message board is 3 only as good as the content, I was interested in how conversations were organized to direct attention to the most important content. The role of the computer is one of a silent mediator that arbitrates turn-taking among the participants. My assertion is that the message board structure created by the computer code strongly influences which messages get read, and, by extension, which members of the online community have greater social standing. I also look at the role of social markers within the community in determining which participants have more messages read. The asymmetrical relationship examined among the participants is whether or not the individual is a financial donor to the online community. The hypotheses that I test are that message position within a tree- structured thread influences the number of times the message will get read. Secondly, I predict that members that have made a financial contribution will receive more attention per message on average based on their greater social standing in the community. Whereas the first paper sees the computer as a silent discourse partner that simply guides the attention of a large human audience, the remaining two papers look at conditions where a computer program is seen as a participant or as a proxy in communication. The first of these relates to a pilot study for a conversation agent that serves as a Spanish language tutor and conversation partner. One of the prototypical asymmetrical social relationships is that of teacher and student. 
Instructor and student interactions in a classroom setting are generally built on the presumption of knowledge transfer from the teacher to the pupil. Communicative asymmetries are especially noticeable when the course of study is a foreign language because of the linguistic proficiency difference between the instructor and the students. 4 Some approaches to teaching languages foster a peculiar dynamic in conversation in that in addition to accomplishing the communicative functions of language (i.e. simply get a message across), the teacher is also routinely providing form-focused corrective feedback to address learner errors. While there are a great number of studies that examine various feedback techniques, most of these studies have focused on learning outcomes when the techniques are used among human interlocutors. There are varying opinions about the importance of receiving corrective feedback that is focused on explicit grammatical correction. Critics point out that feedback in the classroom is usually given inconsistently and in a typical language classroom, there are relatively few opportunities for each student to receive this kind of feedback. Notwithstanding, students often request and expect feedback from instructors. One possible way of providing more consistent feedback is to use simulated dialogue agents. This study considers student appraisal of usefulness and enjoyableness with respect to receiving feedback while engaging in structured dialogues with an embodied computer conversation partner. The agent is presented to beginning level Spanish-students as a tutor and conversation partner. These two functions (tutor vs. peer) can be captured in one agent or divided between two agents. One of the central questions answered in that paper is whether students prefer separating the quasi peer-like role of conversation partner from that of the more authoritative role of providing feedback. The third paper relates to virtual agents as well, but from a reverse perspective. Rather than focus on the conversations between end-users and the computer agent, this chapter looks at interactions primarily from the perspective of the authors creating 5 interactive narratives. The TacQ system was developed with the express intent of facilitating the rapid creation of interactive characters by novice developers. Novices learn the technical aspects of authoring characters prior to appreciating the subtleties of the narrative implications of the system limitations. At the heart of the dialogue authoring system is a shallow semantic representation system and policies about how the virtual character should respond to the end-user. This system has been very effective for simulating some storylines, but less so for others. The hypothesis that is pursued is that the underlying narrative in each scenario has semantic properties that can be difficult for the representation The value in this work comes from making explicit some of those properties so that novice users can avoid them and more advanced authors/developers can expand the system’s capabilities. The dissertation culminates with general conclusions and future work associated with each of its subcomponents. 6 CHAPTER TWO: HOLDING THE FLOOR IN A THREADED MESSAGE BOARD Holding the floor is a notion generally associated with turn-taking in face-to-face conversations. 
With interactions taking place in computer-mediated conversation environments, it is worthwhile to explore to what degree the idea of a floor carries over to these new media, particularly on a message board. As with face-to-face conversations, message boards use turns for communication, which can be analyzed quantitatively by length and by frequency. On a message board, however, posts can be simultaneous or separated by large time gaps without disrupting the coherence. Additionally, there is no way to interrupt, no notion of waiting for a turn, and no unfilled pauses since the medium is designed for asynchronous communication. Despite these differences, I argue in this study that the notion of holding the floor is also applicable within the context of a message board. For this study, I will explore how a conversation participant holds the floor within the message board community called Cougarboard. I will also explore how the design of the conversation floor of Cougarboard impacts the allocation of group attention through four features: a chronological thread queue, promoted threads and messages, tree- structured conversation threads, and visible status markers, such as financial contributions and length of community membership. Within Cougarboard, I define holding the floor as the combination of taking a conversation turn by posting a message and then having that message read by another 7 community member. This two-part definition allows us to see how the floor is allocated among the different participants. First, however, we must look at the context from which Cougarboard emerges as well as review the literature related to online communities and discourse. Literature Review Cougarboard, originally known as BYUboard, is an online community that started in 1999 as a fan site dedicated to the Athletics program at Brigham Young University (S. Meyers, personal communication, March 4, 2009). It is a large and active site with over 15,000 registered users and 2000 new messages on a typical day (“Statistics - CougarBoard.com,” n.d.). The Cougarboard archives date back to 2002 and provide a rich corpus of online interaction. Regular contributors form a core group of participants that engage in discussion of sports, as well as other topics, ranging from politics to medical advice. Frequent postings by core members create a community feel to Cougarboard. The emergence of communities like Cougarboard was predicted early on when computers were first being considered as communication devices. Licklider and Taylor (1968) anticipated that the development of communication technology would enable "communities not of common location, but of common interest". (p. 38) While not a full substitute for a tangible world community, virtual communities have been shown to facilitate creating social bonds among their participants (Rheingold, 1993; Wilson & Peterson, 2002). Armstrong and Hagel (2000) divide communities into four classes: communities of transaction, communities of interest, communities of fantasy, and 8 communities of relationship. Communities of transaction include sites like eBay where the sole purpose is to buy and sell goods. Communities of interest center around a particular subject that drives the community discourse (e.g. hobbyists). Communities of fantasy engage in online role-playing, such as Dungeons and Dragons, and communities of relationship are designed specifically for socializing and building strong relationships amongst members. 
As a site dedicated to BYU sports, Cougarboard falls firmly into the category of community of interest, but what implication does this have for the community? A theoretical construct that has been used to answer this question is social capital, which is the idea that the connections that exist between people are valuable. Putnam (2001) shows that, over time, there has been a decrease in social capital within our society. Fewer people socialize one with another in traditional settings such as clubs, political organizations, and unions. Some have hypothesized that with the advent of the Internet, there is a resumption of extending these social ties to online forums. (Galston, 2000; Licklider & Taylor, 1968) Depending on the type of community, there can be different levels and types of social capital. According to Putnam, there are two types of social capital: bridging and bonding. Bridging capital refers to social ties that link an individual to communities of people outside a tight social network. Bonding capital are the strong ties that are reinforced in a tight community. Norris (2002) shows that different types of online communities vary in social capital. For example, her study states that sports-based communities are low in both bridging and bonding capital. 9 The type of community is significant because the focus of discourse varies depending on the kind of community involved. Ren, Kraut, and Kiesler (2007) discuss reasons that people join online communities and how the website design influences the type of community that develops. According to the authors, common identity theory predicts that people join because they have an affinity for the group’s subject matter, whereas common bond theory says that they go to a community because of relationships they have with individuals. Of course, the authors recognize that there is a continuum between communities built around a common identity and those built on social bonding. Understanding where a particular community falls on this continuum has implications for how the community forum should be designed. One way this can be manifested is in the way discourse topics are organized. "Identity-based communities are likely to want to have people talk primarily about the nominal topic of the community" (p. 395). The authors suggest that when members of an identity-based community form common bonds, off-topic discourse can be moved to a separate area away from the main floor. In order to organize the discourse in this way, there must be computer code. Lessig (2006) argues that, as it relates to computer code, "[a]rchitecture is a kind of law: It determines what people can and cannot do." (p. 17) In Cougarboard, the computer code influences the types of discourse structures which are possible by the way it organizes each message within the larger discourse context. Whereas in face-to-face human interactions, tacit social norms govern turn-taking, in online computer-mediated conversation, turn taking is managed in large part by explicit rules articulated in computer code. We will consider these rules later on. 10 The literature related to holding the floor and turn-taking primarily addresses face-to-face (F2F) interactions. In F2F interactions, holding the floor is generally considered something that requires coordination and timing. The most oft cited work in the area of turn-taking is Sacks, Schegloff, & Jefferson (1974), who offer a model with 14 principles of turn-taking. 
Some of these principles apply only to F2F dialogue and its inherent real-time communication, but, others, with some qualifications, also apply to the text and asynchronicity of message board protocol. Perhaps the most notable difference, for the purposes of this paper, comes in principle two. “Overwhelmingly, one party talks at a time” (p. 700). The asynchronous nature of the communication channel allows multiple participants to add messages simultaneously. This difference means that the message board can accommodate much larger groups than a face-to-face conversation. Sacks et al. acknowledge that in F2F conversations "[t]hough the turn-taking system does not restrict the number of parties to a conversation it organizes, still the system favors, by virtue of its design, smaller numbers of participants" ( p. 712). Because of the structure imposed on the conversation in a message board, multiple participants can add messages without interference in the communication channel. The code arbitrates the ordering of messages to make sure that they appear chronologically and thus prevents turns from being interrupted. Some scholars disagree with the idea that posting a message equates taking a turn, as well as the idea that taking a turn is the same as holding the floor. Herring (2001) cautions against equating asynchronous messages with turn-taking, pointing out that asynchronous communication means such as email "effectively convey what would have 11 been communicated through multiple turns in synchronous interaction" (p. 620). Edelsky (1981) posits that more than one participant may hold the floor at a time. She defines two types of floors. The first, F1, is one in which speakers have uninterrupted turns. The second, F2, is a more collaborative floor, where there is substantial speaker overlap and where speakers complete each others' ideas. In the case of F2, Edelsky asserts that multiple participants hold the floor simultaneously. In response to the potential criticism raised by Herring, there are some instances on Cougarboard where mapping one message to a single turn is problematic; however, the similarity between a turn and a message is very strong for the vast majority of the messages on Cougarboard. Since the quantitative analysis in this study involves tens of thousands of messages, if there are multiple turns taken in a single message, they should merely factor into the noise of the data. Having more than one participant hold the floor at once is not a problem because of the architecture of the communication channel. In one sense, one could look at the entire message board as a special kind of F2. Even though any registered member has an equal opportunity to participate in the conversation floor through self-selection, this does not mean that the turns are equally distributed among them. Sacks et al. suggest that "for socially organized activities, the presence of 'turns' suggests an economy, with turns for something being valued -- and with means for allocating them, which affect their relative distribution, as in economies"(p. 696). Online communities often show similarities to other economies inasmuch as they seem to obey the Pareto principle (Croll & Power, 2009), also referred to as the 80:20 principle, the Pareto principle shows that in human 12 economies, 80% of the resources are controlled by 20% of the population. 
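The 80:20 claim is easy to make concrete. The following is a minimal sketch, not taken from the dissertation, of how one might check what share of posts comes from the top 20% of posters; the author names and post counts are hypothetical.

```python
from collections import Counter

# Hypothetical per-author post counts for an online community.
post_counts = Counter({"author_a": 410, "author_b": 250, "author_c": 90,
                       "author_d": 40, "author_e": 25, "author_f": 12,
                       "author_g": 8, "author_h": 5, "author_i": 3, "author_j": 2})

counts = sorted(post_counts.values(), reverse=True)
top_20pct = counts[:max(1, round(len(counts) * 0.20))]

share = sum(top_20pct) / sum(counts)
print(f"Top 20% of posters account for {share:.0%} of all posts")
```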
This pattern also appears in online message boards, both in the amount of time it takes for a message to receive a response in an asynchronous system (Kalman, Ravid, Raban, & Rafaeli, 2006) and a similar long tail distribution in the growth of conversation threads (Gómez, Kaltenbrunner, & López, 2008). What is most interesting here is what features distribute the resources of turns and attention. As Sacks et al. say (1974, p. 696), "[a]n investigator interested in the sociology of a turn-organized activity will want to determine, at least, the shape of the turn-taking organization device, and how it affects the distribution of turns for the activities on which it operates." In the case of a message board, the organization imposed by the computer code is the organization device. In order to understand the distribution of the conversation floor, we must look at the communication architecture that Cougarboard employs. Structural Description of Cougarboard The way a message board organizes messages involves many design choices. The conversation floor in Cougarboard is divided into three primary levels beginning with the most basic and working to up to the largest structured unit of the conversation floor. These three levels are first, a message, second, a thread, and finally, the message board. Arguably, there is a fourth level which is the material on the front page of the website, but since the participants in the message board do not directly control the content of this section, I describe it separately. 13 Messages Similar to e-mail, each message has a subject line, an identification of author, the date and time of message posting, the body of a message and an optional signature line (see Figure 1). Images and video can also be embedded in the message. There are some significant ways that it differs from email, the obvious being that messages are public and not limited to the view of those in the “to, cc, bcc” fields. Secondly, each message must also be explicitly categorized into one of the available topics. Third, the messages link together in a larger structured conversation called a thread. Figure 2.1: A message 14 Threads Each message starts a new thread, or replies to a previous message in an existing thread. This is analogous to starting a new conversation topic or simply joining a previous conversation, respectively. A thread connects the subject lines of related messages using a graphical tree structure that explicitly shows how each message relates to each other. A user views a message by clicking on the subject line from the main board or from select messages posted on the front page. The body of the message is displayed on the screen. Only one message body can be viewed at a time, but one can still see all the subject lines of the other messages in the same thread. Within the tree-structure view of a thread, each message subject line contains 8 unique information tokens (see Figure 2.2). Viewers use these units of information to decide which messages warrant their attention. Each of these information tokens is numbered and described in the following paragraphs. 1. The first is an icon that indicates the topic of the message. This is a mandatory field. The author must select a topic from a list before posting. 2. The second is the visual images of connector lines, which indicate how each message relates to each other. From a structural standpoint, each message may respond to a single message, but each message may have an unbounded number of replies to it. 3. 
The message title is the next information unit. By convention, the title may be a summary of the contents of the message, the first portion of the message that resumes in the body of the message or, for very brief messages, it constitutes the entirety of the message. 15 Figure 2.2: A thread 4. After the title is a length indicator, which is automatically appended to the subject line and put in parentheses. If the subject line comprises the entirety of the message, it will end in “nm”, which is an abbreviation for “no message”. Because reading the message requires clicking on the subject line, the (nm) feature is a courtesy to the readers. Other length indicators are “short” (for messages under 50 characters,) or long (for messages over 2000 characters). Normal length messages are left unmarked. The length for extremely long posts included extra Os to indicate this (e.g. looooong). 5. Registered members who are logged in are able to view a number at the end of the subject line indicating the number of times the body of a message has been viewed. 6. Donors to the site can have an icon associated with their moniker. 7. The moniker uniquely identifies the author of the message. 16 8. The final piece of information is the time stamp. Messages are stored indefinitely, and they are time stamped to indicate when they were first posted. Message board The threads are organized into a larger collection of threads which constitutes the message board. They are displayed in chronological order from top to bottom according to the time of the first post on a thread (see Figure 2.3). The default is to show all messages regardless of topic; however, users can modify this view and restrict it to threads belonging to a particular topic, messages posted only by donors, messages posted by other registered members that are specified as friends, or the top 10 threads on the board. The top 10 messages are selected based on a computer algorithm that looks at participation metrics to select the most active discussion threads. These top threads are also featured on the front page material. Figure 2.3: The message board 17 Front page One of the important features of any community is shared knowledge. The front page of Cougarboard helps highlight what should be considered shared knowledge. There are two sections on the front page (see Figure 2.4). In the center of the page is a list of the top 10 threads, ordered chronologically with the most recent at the top; these are selected by an algorithm, which weighs the number of comments, the number of reads, and the age of the thread. On the right are individual messages from the previous day which readers have nominated for “post of the day”, which they do by clicking a link in the message body. They are ranked in order of votes. Figure 2.4: The front page 18 Conventions of Discourse Having just described the basic architecture of the message board, it is important to consider its conventions of discourse. The interactions on Cougarboard are governed by a combination of rules imposed by the computer code and community social norms. Rules imposed by the computer code The following inviolable rules are imposed by the code. First, anyone can read the message board, but in order to take a turn/post a message, a user must be registered and logged in. Second, turns/messages are categorized by topic with a topic icon. Third, a turn may initiate a new thread or reply to one and only one message. 
Fourth, turns which are categorized as politics or religion are not visible unless a user opts in, either when they set their preferences at the time of creating their account or by modifying their preferences later. Rules enabled through code and enforced by the community Some of the discourse rules are enabled through computer code, but require action on the part of the community for these rules to be applied. First, moderators can ban a user, removing the right to take turns on the message board. Second, moderators can remove individual messages or entire threads. Third, individuals can delete or modify their own turns. Fourth, user-initiated turn deletions are replaced with the subject line “<deleted>” and remain a part of the tree. Fifth, replies to a deleted message are not deleted even if the original is. Sixth, registered users can “ignore” other registered users and, thus, not view their turns. Seventh, frequent violators of community rules can have 19 their turns hidden by default. Users have to opt in to read their messages by adjusting a content filter. Lastly, any user can report offensive postings or categorized messages. Community-enforced rules There are two main discourse conventions which are community enforced: First, contributions should be novel. This is an application of Grice's maxim of quality (Grice, 1975/2002). If messages duplicate information due to overlapping turns (i.e. messages were posted roughly at the same time), the message is marked by a reply from another contributor with the subject line “Hack”. Second, beginning a new thread with information that has already been shared is often flagged within the community by replying to the message with SGC or an acronym representing the content (e.g. "Larry H. Miller died" --> LHMD). SGC stands for "Staley got cut". In 2003, a former BYU football player named Luke Staley was cut from the roster of the Detroit Lions, a professional football team. The news of this event was reported independently by so many Cougarboard members in the weeks afterward that it became an inside joke. The acronym came to stand for any news that had been previously been shared on another thread. The Conversation Floor In my definition of holding the floor, there are two components: first, the ability or right to take a turn, and second, receiving group attention for that turn. On Cougarboard, conversation takes place on two distinct floors. There is the common floor of subject line discourse and there is the message body discourse which requires user action, a click, in order to participate. In this study, we do not have a metric for group 20 attention on the common floor. However, because each click is recorded, we can measure attention to the body of a message. Group attention is a scarce resource. With thousands of messages posted each day, it is impossible for each message to receive the attention of every member of the community. The message board is designed to help its members read messages that they will find relevant. As previously mentioned, this includes a number of features that make certain messages move into an area of greater visual prominence. For instance, the subject lines of select messages are moved to the front page based on three factors: the number of message views, the number of messages in the thread and the number of days since posting. Once a message reaches the front page, the added prominence further reinforces its popularity since these messages can persist on the front page for several days. 
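The selection algorithm behind the top-10 list and front-page promotion is not published; as an illustration only, here is a hedged sketch of how a score weighing the three factors named above (comments, reads, and thread age) might be combined. The weights and field names are made up, not Cougarboard's.

```python
from dataclasses import dataclass

@dataclass
class ThreadStats:
    num_comments: int   # messages posted in the thread
    num_reads: int      # recorded clicks on message bodies
    age_days: float     # days since the initial post

def promotion_score(t: ThreadStats, w_comments=2.0, w_reads=0.1, decay=0.5):
    """Illustrative front-page score: activity boosts the score, age discounts it.
    The real Cougarboard algorithm is not documented; these weights are arbitrary."""
    activity = w_comments * t.num_comments + w_reads * t.num_reads
    return activity / (1.0 + decay * t.age_days)

threads = [ThreadStats(42, 3100, 0.5), ThreadStats(8, 900, 2.0), ThreadStats(120, 9800, 4.0)]
front_page = sorted(threads, key=promotion_score, reverse=True)[:10]
```

Any monotone combination of the same three inputs would produce the feedback loop described above: once a thread is promoted, it gathers more reads, which in turn keeps its score high and prolongs its time on the front page.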
These messages most likely have a different set of properties in terms of how long they are open for discussion and readership.

One factor that contributes to the attention given to each message is its placement within the tree structure of the thread. The simplest way to discuss the structure of these trees is using family terminology (child messages, sibling messages, parent messages, etc.). A parent node is an original post; comments on a parent node are child messages, and sibling messages are comments on the same parent node. Parent nodes, provided they have a message body, receive more views than child nodes. As one might expect, siblings that are closest to the parent nodes get more attention than later siblings (statistics to support this appear later in this chapter). The most likely reason for this is that the lack of physical presence on the message board makes it possible to browse conversations and exit them before the final turn. Unlike face-to-face communication, where dropping out of a conversation is a face-threatening move, a person who clicks on a message, reads it, and loses interest has no motivation to continue the virtual conversation. Another possibility for declining interest in later sibling messages is that the whole thread itself has been displaced by newer threads on the message board, moving it out of the viewable area.

Some messages do not generate replies because they are not of interest to other people, some were never intended to generate discussion (e.g. "Go BYU!"), some are deprecated by passing events (e.g. speculation about the outcome of a future game), and some may have reached a point where interest in the topic has been satiated and there is no new content. A final reason for discontinuing a thread is that it has been displaced by newer threads and is no longer in the visible active area. Even though each thread is stored indefinitely, users sense that the focus of group attention moves on as messages are displaced by newer threads. This is evident in Figure 2.5, which represents data for 1000 threads. The Y axis represents time in minutes. Each vertical line represents a unique thread. The bottom of the line is the time of the first message posted on the thread; the line runs vertically until the time of the last post on the thread. The slope of the line formed by the stock chart indicators represents the volume of new threads being posted. A gentle slope indicates a high volume, whereas a steep line shows a slow-down in the number of messages posted. These slower periods are most evident in the chart as stair-like jumps in the slope that occur at regular intervals during the night-time hours.

[Figure 2.5: Life span in minutes of 1000 threads. Chart title: "Life Span of Threads in Minutes"; vertical axis: 0-10,000 minutes; threads arranged in chronological order by initial post; the median is marked.]

Though there are threads that stay open for several days, these are outliers. A more reliable indicator of when the conversation has moved on is to look at the median post on a thread. By examining the data for those threads with more than one message, we see that the median post is approximately 14 minutes after thread initialization (see Table 2.1 below). In this way, even though a thread remains open from a technical standpoint, in practice, threads end fairly quickly unless they have been promoted to the front page.
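The quantities summarized in Table 2.1 and fitted in Figures 2.6-2.7 below are straightforward to recompute from raw board data. The sketch below is not the author's code; it assumes a pandas DataFrame with hypothetical columns (thread_id, parent_id, posted_at, reads, linear_order, depth, has_body, is_short) and applies the 30-second adjustment for minute-resolution timestamps described after Table 2.1.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical message table: one row per message, timestamps recorded to the minute.
msgs = pd.read_csv("cougarboard_messages.csv", parse_dates=["posted_at"])

# Table 2.1: minutes from a thread's initial post to each reply,
# adding 30 seconds because the board only records times to the minute.
first_post = msgs.groupby("thread_id")["posted_at"].transform("min")
is_reply = msgs["parent_id"].notna()  # replies reference a parent message
reply_offset_min = ((msgs.loc[is_reply, "posted_at"] - first_post[is_reply])
                    .dt.total_seconds() / 60.0) + 0.5

quantiles = reply_offset_min.quantile([0.0, 0.005, 0.025, 0.10, 0.25, 0.50,
                                       0.75, 0.90, 0.975, 0.995, 1.0])
print(quantiles)  # the median reported below is 14.5 minutes

# Figures 2.6-2.7: reads per message as a linear function of position,
# excluding subject-line-only and "short" messages, as described below.
body_msgs = msgs[(msgs["has_body"]) & (~msgs["is_short"])]
fit_order = smf.ols("reads ~ linear_order", data=body_msgs).fit()
fit_depth = smf.ols("reads ~ depth", data=body_msgs).fit()
print(fit_order.params, fit_depth.params)  # slopes near -1.34 and -12.95 reported below
```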
Table 2.1: Quantiles for time from initial message to reply messages from Figure 2.5

  100.0%  maximum    9764.5 min
   99.5%             1913.57 min
   97.5%             844.825 min
   90.0%             120.5 min
   75.0%  quartile   37.5 min
   50.0%  median     14.5 min
   25.0%  quartile   5.5 min
   10.0%             2.5 min
    2.5%             1.5 min
    0.5%             0.5 min
    0.0%  minimum    0.5 min

The times reported in the right column of Table 2.1 are approximate. The message board records the time of each message by minute only, and since many replies were recorded within the same minute as the initial message on each thread, I added 30 seconds to the recorded reply time to represent the separation in time between the initial post and the replies.

It is easy to collect statistics on how many turns are taken on a thread by counting the number of messages that have been posted. For 2/3 of the messages, it is also easy to determine how often they were read: each time a person clicks on the title of a message, the full message content is displayed, and the number of clicks is shown at the end of the title of the message. However, a full 1/3 of the messages do not have body content, which means they can be read in full by looking at the title. Because of this, in charting the reads based on position in the thread (Figures 2.6 and 2.7) I excluded messages which were only a subject line. I also excluded short messages because few people read them. This made it difficult to analyze the decrease in the number of reads from one sibling to the next. Again, sibling messages are comments on the same original post. In lieu of looking at the change in reads from one sibling to the next, I considered the linear order in which messages appear in the thread, since sibling messages are ordered chronologically with respect to each other. Figure 2.6 represents the decrease in the number of reads for a message based on its linear order in the message thread (e.g. 1 is the initial message, 2 is the following message, etc.), and Figure 2.7 represents the decrease in the number of reads based on the depth below the root node (e.g. 1 is a child node, 2 is a grandchild, etc.). The data set included 26,641 messages.

[Figure 2.6: Reads per message based on linear order. Linear fit: reads = 92.455218 - 1.3373694 * linearOrder]

[Figure 2.7: Reads per message based on depth. Linear fit: reads = 116.3541 - 12.952231 * depth]

Both of these graphs show a downward slope, which indicates that the further away a reply message is from the original message, the fewer reads it will receive. For linear order (Figure 2.6) there is a statistically significant drop of 1.33 (Prob>|t| <.0001*) reads per message on average for each subsequent message. In other words, this graph predicts that message 4 would have approximately 1.33 fewer reads on average than message 3. The second graph (Figure 2.7) predicts that moving down one descendant (parent to child), the child node will have 12.95 fewer reads on average (Prob>|t| <.0001*).

Controlling the floor

In spoken dialogue, there are a number of techniques that people use to control the discourse floor. In online discourse, there are mixed findings related to whether men or women control the floor (Panyametheekul & Herring, 2003). In male-dominated online environments, some studies suggest that men attempt to silence women in order to control the discourse (Herring, Johnson, & DiBenedetto, 1995).
This motivates the question of whether this is a characteristic of male discourse only directed at women or if males adopt strategies in controlling the discourse floor with other males. There is evidence within the Cougarboard community that there is a perceived social hierarchy among some of the participants (BYU81, 2012). A number of social factors may play into who becomes dominant within the community. Cougarboard provides demographic data for advertising purposes which shows that the community is remarkably homogeneous: "97% male, 59% of male population is between the ages of 21-34; 83% are college graduates; 85% are married; 74% have children" ("Statistics - CougarBoard.com," n.d.). Though statistics for religious affiliation are not provided, Cougarboard focuses on Brigham Young University, whose student population is 98.5% Latter-day Saint with 13% of the student body belonging to racial minorities ("Demographics," n.d.). One would be inclined to think the population following the sports program would largely come from the same population. Based on this information, gender, age, education and religious affiliation are not generally distinguishing features among members of the community.

The most easily identifiable feature is whether or not an individual who posts is a donor within the community. Donating to Cougarboard allows an individual to place a graphical icon next to their unique moniker, which gets displayed with each message. A second group of features is available through the personal profiles of each registered user. Each message that is posted contains a link to the author's personal profile. Each profile displays the number of days that the user has been a registered community member, as well as the total number of messages that the user has contributed.

Hypotheses

Based on those features, I test the following hypotheses related to the social dominance of donors and members that have been in the community for extended periods of time: First, donors are more likely to initiate threads than non-donors. Second, donors are more likely to take a turn than non-donors, not just posting initializations. Third, donors are socially dominant and therefore receive more attention than non-donors. Fourth, time within the community is a factor that correlates with social dominance and that will translate to more reads per message.

Using publicly available information posted on Cougarboard.com as to donor status and length of time in the community, and the corpus of the message board postings between December 2009 and January 2010, I ran statistical analyses to test the four hypotheses stated above.

Topic initialization

Donors make up 11% of the registered members and seem to generate the greatest number of new threads at a rate of about 4 to 1, or in other words, an 80:20 split. The actual counts are 6182 thread-initial messages for donors and 1413 for non-donors (81%, 19%). This might suggest that donors dominate the floor by setting the discussion topics while others follow their lead and join those conversations which have already begun. In order to follow up on this hypothesis, I looked at the turn counts, regardless of the position within the thread, to see if a different pattern emerged. This is also the subject of the second hypothesis.

Messages posted regardless of position within the thread

As it turns out, posting behavior across the entire message board does not pattern differently from thread initialization behavior with respect to donors and non-donors.
The 49,981 total messages split in the same 4 to 1 ratio between donors and non-donors. Donors accounted for 40,309 messages and non-donors contributed the other 9,272 (81%, 19%). From this, I conclude that donors not only lead the discussion, but they continue to dominate the remainder of the thread turns in the same ratio. The next section examines whether domination of turn-taking actually translates into domination of attention.

Reads per message, donors versus non-donors

For this portion of the analysis, I restricted the data set to a subset of the threads. To get an indication of who gets the majority of the attention in the community, I needed to find data that would be a fair comparison between donors and non-donors. As an indicator, I chose the metric of clicks per post on thread-initial messages about football (the most common topic) whose message has content in the body (not just the subject line). The reason for using this restricted set is that it controls for variation due to the position where the message appears in the thread. I ran a regression analysis on this data set and obtained the following results (see Tables 2.2-2.4).

Table 2.2: Summary of fit for reads per message
  RSquare                      0.00542
  RSquare Adj                  0.005022
  Root Mean Square Error       185.3669
  Mean of Response             182.398
  Observations (or Sum Wgts)   2500

Table 2.3: Analysis of variance for reads per message
  Source     DF     Sum of Squares   Mean Square   F Ratio   Prob > F
  Model      1      467781           467781        13.6137   0.0002*
  Error      2498   85833502         34361
  C. Total   2499   86301283

Table 2.4: Parameter estimates for reads per message
  Term          Estimate    Std Error   t Ratio   Prob>|t|
  Intercept     208.36648   7.954855    26.19     <.0001*
  donorStatus   -33.17384   8.990977    -3.69     0.0002*

The results indicate that donors actually receive 33 fewer reads per message on average than do non-donors (non-donors 208, donors 175.19, average for all messages 182). This result is surprising, but it can be explained. If the purpose of the football section of the message board is to swap information, then the people that post the most often probably have the least amount of new information to share. This is something that social network theory would predict based on the difference between strong and weak ties.

Time within the community

Using the previous data set, which is restricted to the thread-initial posts about football, I tested to see if the length of time in the community was a factor which influenced the number of reads per message posted. There is a great deal of co-variance between length of time in the community and donor status. Those who have been in the community a long time and post messages tend to be donors. I ran a regression analysis to discover if length of time in the community may be an independent factor beyond donor status (see Tables 2.5-2.7). Controlling for donor status, the number of days that the individual has been in the community does not prove to be a significant factor in predicting readership of messages.

Table 2.5: Summary of fit for length of time in the community
  RSquare                      0.00611
  RSquare Adj                  0.005314
  Root Mean Square Error       185.3398
  Mean of Response             182.398
  Observations (or Sum Wgts)   2500

Table 2.6: Analysis of variance for length of time in the community
  Source     DF     Sum of Squares   Mean Square   F Ratio   Prob > F
  Model      2      527273           263637        7.6748    0.0005*
  Error      2497   85774010         34351
  C. Total   2499   86301283

Table 2.7: Parameter estimates for length of time in the community
  Term              Estimate    Std Error   t Ratio   Prob>|t|
  Intercept         213.98673   9.027704    23.70     <.0001*
  donorStatus       -27.18252   10.07671    -2.70     0.0070*
  daysSinceSignUp   -0.005674   0.004312    -1.32     0.1883

Discussion and Conclusions

The analysis of the Cougarboard data showed three outcomes. First, the analysis of the architecture of the system demonstrated that there was a region of active conversation that in many ways resembles an F2 conversation floor where multiple participants contribute to the conversation simultaneously. Despite being asynchronous in nature, threads are displaced relatively quickly by newer messages. This displacement gives the message board conversation some of the properties of synchronous communication inasmuch as topics move out of focus.

Second, because taking a turn and receiving attention for the turn are separated in the architecture, users take advantage of the tree-structure within a thread to organize the messages that they will read. This readership is analogous to speakers receiving group attention when holding the floor. The result is that turns within the same conversation thread do not receive equal attention. There is a dramatic drop-off from parent messages to child messages. The linear order of where the messages appear within a thread also predicts the relative readership, with the earlier messages receiving more attention. The quantitative analysis of this readership that I have conducted constitutes a novel contribution to the notion of holding the floor in an online community where there is a weakened sense of presence amongst the conversation participants.

Aside from the structural components of the message board, this paper also examined the role of two social markers, length of Cougarboard membership and donor/non-donor status, in predicting turn-taking and readership within the community. The analysis of the message board shows that the donors play a dominant role in the community, confirming my first hypothesis that donors are more likely to initiate threads than non-donors, as well as my second hypothesis that donors are more likely to comment on existing posts. Though a small minority among all the registered users, they take many more turns than do non-donors in the community. This was not an unexpected finding, but it was surprising to find that the proportion remained at 4 to 1 (donor to non-donor posts) both for initiating new topics and for replies. In other words, non-donors participate by initiating new conversations at the same rate that they comment on existing conversations. Additionally, messages posted by donors receive lower readership for thread-initial messages. This contradicts the third hypothesis that social dominance would correlate with higher average readership per message. Finally, length of time in the community does not present any statistical differences for number of turns taken or attention received if the variable of donor status is controlled for.

Future work in studying this community will look at the interactions within the community to analyze the speech acts within threads. This analysis will identify patterns of social dominance that may exist within the community that have not shown up in the quantitative measures outlined in this dissertation. As Cougarboard is not the only online sports community for BYU, the findings in this study should be compared with those from similar communities.
Cougarfan.com came into being at roughly the same time as cougarboard.com, but is structured in such a way that the threads from different topics are not on the same common floor. Cougarfan.com also has a policy of using real names instead of monikers or pseudonyms. Another competing site is cougarblue.com, which, at one point, allowed people to post anonymously without having to register. This board became dysfunctional and was taken down, later to be brought back with a different architecture. At present, other communities (e.g. Twitter) also compete as communication channels. Understanding the roles of architecture and community standards in the emergence of these communities may facilitate an understanding of their different levels of success and is worthy of further study.

CHAPTER THREE: USER PREFERENCES FOR FEEDBACK FROM A PEDAGOGICAL CONVERSATION AGENT

As the world has become more computerized, universities have begun to offer asynchronous online classes. The student-to-teacher ratio in this type of course can be very large, making it difficult for the instructor to monitor and assess student performance. One approach to increasing the interactivity of an asynchronous language course is through the use of a conversational agent capable of acting as a dialogue partner for paired exercises and role play scenarios. A computer-based conversation partner could be valuable in several ways. First, it relieves the logistical problems in creating paired student activities since the agent is always available. Second, the agent's utterances are pre-programmed, so learners are exposed only to correct linguistic forms. Third, agents can be programmed to act as proxies for teachers, taking care of monitoring a student's linguistic output and providing feedback.

While there has been some work done using conversational agents as practice partners, considerably less work has been done using conversational agents as tutors for language learning. This study evaluates an early prototype for a conversational agent system that can act as both a conversation partner and tutor. The prototype is evaluated by beginning-level students along the dimensions of perceived enjoyableness and usefulness of the conversation. Additionally, the study considers user preferences for feedback delivery, specifically whether it is preferable for the conversation agent to act as both tutor and conversation partner or to divide those two roles between separate agents. In considering this study, the next section will, first, review literature related to conversational agents and, second, review literature related to feedback.

Review of Literature

Text-based conversational agents

In 1950, in response to the question of whether or not computers could think, Alan Turing proposed a test or, as he put it, an "imitation game" (Turing, 1950). In this test, a human interrogator uses a text-based interface to interact with two participants, one human and the other a machine. In order for the machine to pass the test, it must successfully pass itself off as a human. Turing predicted that within 50 years of his description of this test, machines would be able to successfully pass this test 70% of the time (1950, p. 442). To date, no system has succeeded in passing the test; however, many systems have been developed to act as conversation partners. In 1966, Weizenbaum described a conversational agent he developed called Eliza (Weizenbaum, 1966).
This was a simple “chatbot” which imitated the behavior of a Rogerian-style psychologist by encouraging users to elaborate on the topics they brought up. Eliza selected responses by applying templates to user input. These templates identified keywords within end-user input in order to return reasonably coherent replies. In order to hide the limited knowledge store that the program had, the program would often echo elements of the users input using some simple transformations. (e.g. switching “you” for “I”). While not imitating human behavior flawlessly, Eliza showed that it was 37 possible for a computer to carry on a limited conversation within a particular style of discourse. Eliza was limited to entertainment purposes; however, she was followed by many more task directed conversational agents. I will be focusing specifically on dialogue agents developed for language teaching. Conversational Agents in Teaching and Tutoring Zacharski (2002) describes a prototype for an adventure game designed to teach Spanish where learners interact with conversational agents using a chat environment. The system is more sophisticated than an Eliza chatbot because it incorporates a goal directed approach with a planner. The program also includes multiple channels of communication (instant messaging, email and audio). Additionally, it has a number of characters in the game with whom the learner can interact. The description of the system did not include measures of learning outcomes. Williams & van Compernolle (2009) reviewed a French chatbot for possible application in language teaching. The authors criticized the chatbot on several fronts, including a lack of sensitivity to degrees of formality (tu vs vous) and the inability to handle misspelled words from second language learners. Despite the criticisms, the study concluded that, while the chatbot was not suitable as a peer for conversation, there could be a pedagogical benefit in reviewing the transcripts of the interactions. It should be noted that the assessment only looked at a single chatbot. The nature of the criticisms regarding the chatbot suggests that authors used an existing agent that was not designed for pedagogical purposes. 38 Jia & Chen (2008) conducted a study evaluation of CSIEC, a system that was designed to assist learners of English and incorporated chatting. The results showed high user satisfaction since all users recommended it, and post-test scores improved significantly, especially those who performed poorly on the pre-test placement exam. The approach taken by CSIEC rewarded students for staying within curriculum-based topics of discussion, which may account for some of the success that was not found in the earlier studies. It is intuitive that conversational agents that restrict the coverage to particular topics have a better chance of success in passing as a credible partner. Theodoridou (2009) looks at the effect of having an animated pedagogical agent used for vocabulary acquisition in Spanish. Results from the qualitative analysis suggested that some students benefited from having the agent as a companion in vocabulary acquisition. A minority reported that the agent was a distraction. The discussion section alludes to different learning styles being a possible explanation for the differences. The study found no significant learning outcome differences between the two groups, which, in turn, raises questions about the necessity of feedback in language learning. 
Research on feedback There are mixed opinions regarding the usefulness of feedback in language pedagogy. Traditional approaches to language instruction place a heavy emphasis on accuracy in linguistic output by appealing to grammar rules, while communicative methods deemphasize the role of explicit grammar instruction and feedback in language teaching. Communicative approaches to language teaching appeal to those that believe 39 that we acquire second languages using the same mechanisms that enable first language acquisition. Some research suggests a sequential acquisition that cannot be altered by changing the order in which concepts are presented within a curriculum (Bailey, Madden, & Krashen, 1974; Dulay, 1974). In this view of language acquisition, explicit grammar instruction is unnecessary and potentially harmful because it will cause some learners to believe that they must know how to articulate the set of grammar rules from a meta-linguistic standpoint before attempting to produce utterances. Fear of linguistic error may delay or prevent students from engaging in a process which may have some intermediate steps of grammatical competence before achieving a fully formed grammatical system. There is substantial evidence to support the idea that that much of what a person learns about speaking a foreign language is done at a subconscious level; however, this does not necessarily obviate the usefulness of meta-linguistic feedback entirely (Carroll, Swain, & Roberge, 1992; Kim, 2005). Krashen (2002/1981) makes a distinction between language learning and language acquisition. Language learning consists of the explicit instruction and corrective feedback, while language acquisition is a subconscious process whereby language patterns are internalized, based on exposure to input in the target language. According to his monitor hypothesis, the corrective feedback learners receive may help them perceive differences in their production from the standard forms, but such feedback does not, in and of itself, create the natural intuitions needed in general use language production. By examining case studies, Krashen suggests that meta-linguistic feedback 40 can be useful if used appropriately. Some learners are so preoccupied with producing grammatically correct sentences that feedback impedes their communication. At the other extreme of the spectrum, other learners disregard feedback entirely. Those learners are often able to make themselves understood, but exhibit frequent grammatical mistakes in their communication. Krashen states that the most successful learners fall between those two extremes. Understanding the nature of when it is most appropriate to provide feedback or if different types of feedback produce the best learning outcomes is still an open issue. Based on observations of classroom data, Lyster & Ranta (1997) described a flow chart model for how teachers provide feedback when presented with learner errors. In this model, corrective feedback falls into one of six possible categories: explicit correction , recasting, clarification request, meta-linguistic feedback, elicitation, and repetition. The model for feedback assumes that the communication channel is live spoken discourse, so not every element of the flowchart carries over into other modalities, such as synchronous chat. To this end, Heift (2004) examines feedback in the context of computer assisted language learning (CALL). 
Heift (2004) takes the feedback categories from Lyster & Ranta's (1997) taxonomy and shows that there are analogous elements that could be incorporated in a CALL environment. Correction and recasting are ways of presenting the learner with the correct answer. Clarification asks the learner to try the response again. Finally, elicitation and repetition, which are done orally using voice quality, could be accomplished through highlighting in a text-based medium.
One of the issues with CALL-based work is the difficulty in handling feedback for open-ended activities. Certain types of activities lend themselves to automatic grading with feedback (e.g., multiple-choice activities, matching, and fill-in-the-blank). The risk of this approach to language teaching is that it deprives students of those meaningful communicative opportunities which foster language acquisition by requiring the student to synthesize knowledge derived from study and observation into novel utterances using productive patterns. In recent years, there has been a push to incorporate more sophisticated natural language processing components in CALL systems so they can provide feedback on open-ended activities. These techniques are typically applied to short-answer questions or to essay systems that use parsers to give grammatical feedback (Nagata, 1995, 2009). Few have engaged in tutoring as part of a greater dialogue system.
One group that does develop dialogue agents for language and culture tutoring is Alelo. This company has developed a number of tools for authoring and deploying tutored dialogues embedded in game-like scenarios for social simulation. Their systems have been deployed as spoken dialogue systems in a number of languages (W. Johnson, Ashish, Bodnar, & Sagae, 2010; W. L. Johnson & Valente, 2009). Specifics about the development of their tools are available in the literature (W. Johnson et al., 2010), but development using their tools requires a contract (M. Emonts, personal communication, 2010).

Conversation Agent Design
This study examines the use of a dialogue agent to conduct a prompted dialogue with a partner, wherein both participants answer the same set of questions. The agent/bot is not intended to handle all types of conversations across various domains. Rather, the goal is to create lightweight bots that can accompany a particular student exercise. The feedback that the learner receives will also be tailored to the specific grammar lesson being covered for that particular exercise. To better control the direction of the conversation, the agent takes the initiative in asking all the questions.
Using an agent that directs the conversation through a series of thematically related questions mirrors a typical language-learning classroom exercise. This type of exercise is not unlike many textbook activities. When introducing new vocabulary or grammar, students generally begin with constrained tasks where they must first identify the feature in question from a given context (e.g., choose the correct verb form from a list). This type of activity is followed by another activity where the student must fill in a blank without a list from which to select. Finally, students are given a chance to follow a prompted dialogue or role-play. The elements of these small conversations are given to students, often in list form. The list of questions is assigned to one partner while the other responds. Often students are instructed to alternate. As the agent controls the dispensing of questions, it acts as a typical student partner.
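As a rough illustration of this design, the sketch below shows an agent that keeps the initiative by stepping through a fixed agenda of thematically related questions. It is written in Python purely for illustration; the actual prototype was authored in AIML, and the questions and function names here are hypothetical rather than taken from the deployed system.

```python
# Illustrative sketch only: an agenda-driven conversation partner that keeps
# the initiative by asking a fixed list of thematically related questions.
# The questions and names here are hypothetical, not the deployed system's.

QUESTION_AGENDA = [
    "¿Cómo te llamas tú?",
    "¿De dónde eres tú?",
    "¿Qué te gusta hacer los fines de semana?",
]

def run_prompted_dialogue(get_student_reply):
    """Ask each agenda question in turn and record the exchange."""
    transcript = []
    for question in QUESTION_AGENDA:
        transcript.append(("agent", question))
        reply = get_student_reply(question)  # e.g., read from the chat interface
        transcript.append(("student", reply))
    return transcript

if __name__ == "__main__":
    # Console stand-in for the text-based chat interface described above.
    run_prompted_dialogue(lambda question: input(question + "\n> "))
```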
If the system identifies some feedback or an appropriate follow-up question, that enhances the believability and may be even more interactive than a low performing student. The fact that corrective feedback may not be absolutely thorough, in other words, 43 that the agent will not correct every grammar mistake, is not different than what occurs in the classroom. Teachers are encouraged to focus on targeted feedback related to the grammar principles in question. Generally speaking, only the most gregarious and motivated students engage in L2 conversation beyond the required questions. Given this predictability in student behavior, the most cost effective and time efficient way to deliver a rapid prototype of the agent was the Alicebot system developed by Wallace (2003, 2009). The system uses a mark-up language called AIML (Bush, Wallace, Ringate, Taylor, & Baer, 2001) that was designed similarly to the Eliza chatbot in that it matches patterns in the user input to formulate replies. These same pattern matching techniques are equally well applied to identifying common errors and providing feedback. The prototype agent was hosted at a site that provides animation to accompany the agent (sitepal.com). Figure 3.1: The Rosa chatbot 44 Users communicate with the agent via text interface. The agent responds in a dual modality with a text-based reply accompanied by a synthesized voice stating what appears in the text field. With respect to corrective feedback, the agent delivers direct feedback when specific errors are committed. It does this without deviating from the agenda of questions to ask. In the following sample dialogue, the grammatical error by the student and subsequent correction by the agent are underlined for the sake of clarity. They did not appear underlined during the interaction. For example: Agent: ¿Cómo te llamas tú? Student: Me llamo es Juan. Agent: La forma correcta es “Me llamo Juan”. Mucho gusto Juan. Yo soy de la ciudad de México.¿ De dónde eres tú? Feedback for learners A nice feature of AIML is that the templates are portable, so making a separate agent that keeps track of errors is a simple matter of importing the AIML content into a new agent. The possible advantage of having a separate agent is that the student may segregate the discourse associated with meta-linguistic feedback from their conversation partner. This makes it more likely that students will focus solely on the communicative task with one conversation agent and still feel like they have access to monitoring. The question at hand is how students will react to these different scenarios. If the tutoring and the conversation occur with the same agent, when the agent fails to respond correctly to an utterance (something that is bound to happen at some point), the 45 credibility of the tutoring component could be in jeopardy. A separate question that this study does not address is whether to allow the tutoring agent to respond to questions, and if so, whether the meta-linguistic feedback should occur in the student’s L2 or L1. This may depend on the L2 level of the language student. For beginners, it would likely be more helpful to have the agent speaking the student’s L1 since the students do not have a high enough proficiency in L2 to process the meta-linguistic feedback. One challenge to using an agent for conversation is the maintenance and authoring of a credible character, even for a small domain. 
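The corrective move in the sample dialogue above can be read as a single pattern-template pair. The sketch below is a rough Python approximation of that AIML-style matching; the actual agent was authored in AIML, not Python, and the pattern shown targets only the "Me llamo es ..." error from the example.

```python
import re

# Rough approximation of an AIML-style category for one targeted error.
# The real prototype was authored in AIML; this Python version is only an
# illustration of the pattern-to-feedback mapping described above.
ERROR_PATTERNS = [
    (re.compile(r"^me llamo es (?P<name>\w+)", re.IGNORECASE),
     'La forma correcta es "Me llamo {name}". Mucho gusto {name}.'),
]

def corrective_feedback(utterance):
    """Return targeted feedback if the utterance matches a known error pattern."""
    for pattern, template in ERROR_PATTERNS:
        match = pattern.match(utterance.strip())
        if match:
            return template.format(name=match.group("name"))
    return None  # no targeted error found; the agent simply asks its next question

print(corrective_feedback("Me llamo es Juan"))
# -> La forma correcta es "Me llamo Juan". Mucho gusto Juan.
```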
The requisite technical expertise surpasses what is generally expected from a language instructor, but because there are so many language learners, it might be possible to dedicate the resources necessary for training or to provide for a specialized language instructor. Some conversational/pedagogical agents and systems have been deployed, but we have yet to see the widespread adoption of pedagogical agents in language classrooms as a matter of standard practice. Methodology Students were primarily notified of the opportunity to participate in a computer conversation partner study through a link provided in their course’s online homework management system. Additionally, some instructors followed up via email to alert their students to the study. Participation in the study was optional. As an incentive to participate in the study, each participant had the opportunity to enter a drawing to win $100. 46 The link led students to an online information sheet with instructions for conversing with the virtual agent online. Each student completed one structured dialogue consisting of approximately 10 question and answer pairs, followed by a survey. (See appendix for a sample survey.) Because there was no guarantee that the student/agent interaction would result in corrective feedback in every session, the question related to feedback from one agent or two agents included two accompanying videos which demonstrated simulated interactions under two conditions (one agent vs. two). Results and Discussion A total of 25 students elected to participate in the study. One of the participants failed to complete the survey. The data from the partially completed survey were omitted from the analysis. The results to the survey questions are displayed in the tables below. Table 3.1: Enjoyableness of chatting with Rosa Chatting with Rosa was an enjoyable language learning activity. # of students who selected this response % of students who selected this response Strongly Agree 4 16.67% Agree 11 45.83% Neutral 7 29.17% Disagree 1 4.17% Strongly Disagree 1 4.17% 47 Table 3.2: Usefulness of chatting with Rosa Chatting with Rosa was a useful language learning activity. # of students who selected this response % of students who selected this response Strongly Agree 6 25.00% Agree 13 54.17% Neutral 3 12.50% Disagree 1 4.17% Strongly Agree 1 4.17% Table 3.3: Ranking of online activities for enjoyableness Assuming that that all the activities are conducted online, outside of class and cover similar topics, rank the following options for a question & answer activity from most enjoyable to least enjoyable. Most enjoyable Second most enjoyable Third most enjoyable Least enjoyable Responses Interacting with an instructor. 45.83% 33.33% 4.17% 16.67% 24 Interacting with another student. 25.00% 20.83% 33.33% 20.83% 24 Using an on-line web form to answer questions. 4.17% 16.67% 45.83% 33.33% 24 Interacting with a system like Rosa. 25.00% 29.17% 16.67% 29.17% 24 48 Table 3.4: Ranking of online activities for usefulness Assuming that that all the activities are conducted online, outside of class and cover similar topics, rank the following options for a question & answer activity according to their usefulness. Most useful Second most useful Third most useful Least useful Responses Interacting with an instructor. 29.17% 25.00% 12.50% 33.33% 24 Interacting with another student. 4.17% 29.17% 33.33% 33.33% 24 Using an on line web form to answer questions. 37.50% 20.83% 33.33% 8.33% 24 Interacting with a system like Rosa. 
29.17% 25.00% 20.83% 25.00% 24 Table 3.5: Receiving feedback during the chat During your chat, did you receive any feedback to improve your Spanish? (i.e. Were grammatical errors you may have made brought to your attention?) # of students who selected this response % of students who selected this response Yes 8 33.33% No 16 66.66% 49 Table 3.6: Feedback as it relates to usefulness of language activities Receiving feedback about grammar mistakes is necessary for a language practice activity to be useful. # of students who selected this response % of students who selected this response Strongly Agree 13 54.17% Agree 10 41.67% Neutral 0 0.00% Disagree 1 4.17% Strongly Disagree 0 0.00% Table 3.7: Enjoyableness of one agent vs. two agents Which of the following options is most enjoyable? (one computer character) A computer system where one character acts as both a conversation partner and someone that provides feedback on grammar. 9 37.5% (two computer characters) A computer system where one character is exclusively a conversation partner and a second character is exclusively in charge of giving feedback on grammar. 15 62.5% 50 Table 3.8: Usefulness of one agent vs. two agents Which of the following options is most useful? (one computer character) A computer system where one character acts as both a conversation partner and someone that provides feedback on grammar. 11 45.83% (two computer characters) A computer system where one character is exclusively a conversation partner and a second character is exclusively in charge of giving feedback on grammar. 13 54.17% The results from this study show that the majority of the participants thought that chatting with the character was both enjoyable (62.5%) and useful (79.17%). Those that did not accounted for a small minority (8.34% for both measures). When comparing the rankings of four online activities, talking with the agent came in second place behind communicating with the instructor. The third most enjoyable was working with a web form, and interacting with a fellow student was ranked least enjoyable. This was measured by the average ranking from 1 to 4 for how enjoyable the activity was. Given the small number of participants, significant numbers were not run, but the trends suggest that the students preferred interacting with Rosa over interacting with each other in online settings. Interacting with other students was the least preferred option. The statistics show a great variability in the different preferences for the students, which was also exhibited in the comments. 51 With respect to usefulness, communicating with the teacher was considered the most useful, followed by webforms, which was narrowly ahead of the agent, with fellow students being in last place. Table 3.9: Average rankings of online activities for enjoyableness and usefulness Online Activity communicating with: Average Rank for Enjoyable Average Rank for Useful Instructor 1.67 (1) 2.125 (1) Computer program 1.83 (2) 2.5 (3) Web form 2.04 (3) 2.41 (2) Another student 2.13 (4) 2.96 (4) The survey gives space for participants to justify their rankings. Two participants commented on the unreliability of technology. In reviewing the conversation logs, a few system errors did arise that could be easily prevented in future trials. For instance, two sessions had serious flaws with the agent’s responses which repeated the opening sequence to every response given by the user. 
In the following sample, the student’s name has been replaced with X and the student dialogue has not been corrected for spelling or grammar: For example [name replaced with X]: Student: hola Agent: Hola. Me llamo Rosa. ¿Cómo te llamas tú? Student: me llamo [X] 52 Agent: Hola. Me llamo Rosa. ¿Cómo te llamas tú? Student: me llamo [X] Agent: Hola. Me llamo Rosa. ¿Cómo te llamas tú? The most likely culprit was a failure to control for browsers that did not meet the system requirement that they accept cookies. The study instructions to the participants specified that cookies needed to be enabled to participate in the study, but the website hosting the agent did not contain code to enforce this restriction. Those two cases likely account for the 8.34% that did not find the experience useful or enjoyable. In addition to the failed interactions, there were a few minor cases where the conversational agent parsed multi-sentence user utterances and replied with multiple questions in subsequent turns. For example [name replaced with X]: Student: a [sic] mi me gusta pasar tiempo con mis amigos. Mi [sic] y mis amigas vamos al cine los fines de semana Agent: Divertido. Perdona el cambio de tema pero ¿Qué hora es? ¡Uy! Es muy tarde. Tengo que salir de la biblioteca. Adios [X] In the utterance above, the agent asks for the time and does not wait for a reply before saying she has to leave. In this particular example, it causes relatively low interference to the overall coherence; however, this has the potential to cause serious disruption to the believability of the character. This was a minor problem which was extremely infrequent and can be addressed with a programming tweak. An additional category of problems surfaced from incorporating fragments of user utterances in the agent’s replies. This involves a risk/reward trade off. When it works 53 well, it can simulate an attentive listener and be more engaging. When it does not work well, it dramatically ruins the effect. Unfortunately, it is not possible to anticipate every user statement. For example [name replaced with X]: Student: Me llama[sic] es [X] Agent: Mucho gusto Me llama[sic] es [X]. Yo soy de la ciudad de México. ¿De dónde eres tú? These types of errors are relatively easy to identify and address through better grammar checking and better user modeling. At the time of the study, the system had a very limited set of error patterns it could identify. This was reflected in the self-reported statistics about the grammatical feedback. Only a minority of students reported that they had received feedback during the session (8 of 24 or 33%). An analysis of the log showed that there were only 3 cases where corrective feedback was given, so in some cases, participants perceived that they had received corrective feedback when none was offered. This could mean that that the participants had a different definition than the one intended by the survey. Future modifications to the survey could have an open ended response where the survey participant could write down the feedback that they had received. The overwhelming consensus among students (95%) was that feedback was necessary for an activity to be useful, with only one participant disagreeing. At first glance, this might seem to contradict the earlier finding that the majority of participants found the exercise useful. 
This may be due to perceived capabilities; for example, the students may have believed that the system was capable of giving feedback, but had not 54 found anything wrong with the user statements. In this sense, the lack of commentary about the user’s grammar could be perceived as tacit feedback that the user is producing well-formed sentences. Concerning feedback delivery options, the feedback hierarchy was the same for both how enjoyable and how useful it was. Average rankings are seen below in Table 3.10. Table 3.10: Average rankings of feedback sources for enjoyableness and usefulness Feedback source Average Rank for Enjoyable Average Rank for Useful Instructor 1.58 (1) 1.21 (1) Computer program 1.92 (2) 1.92 (2) Another student 2.75 (3) 2.83 (3) No feedback 3.75 (4) 3.63 (4) In a key question to the study, students tended to prefer two agents, where one had the sole role of providing grammatical feedback and the other was working as a conversation partner. Comments that justified the two character scenario cited face- saving as a consideration: “Feedback from the same character makes them seem condescending and makes me not want to chat with them.” One interesting comment ascribed perceived capabilities to a separate tutoring agent that the individual believed the one agent architecture would not have: “Because the other person will hear the conversation and it will give a different and a wise feedback.” Reasons that were cited 55 for having one tutor were simplicity in design: “Easier to comprehend, less complicated interface.” One other comment highlighted a design feature difference between the agents presented in the videos. The single agent had both text and audio feedback while the two- tutor condition only had audio feedback. There was a preference by that individual to include both modalities. Conclusions and Future Work The results of this study show that the prototype for a conversation agent matched comparable communicative activities that were performed outside the classroom in a way that requires less coordination among individuals. The perceived usefulness of the system was relatively high, despite few students reporting having received feedback. The agent performed well enough to be considered more useful and more enjoyable than working with peers by the majority of users. Some minor technical glitches affected some of the interactions, which may have lowered the score that the agent received for usefulness and how enjoyable the interaction was perceived. Modeling the character after a structured dialogue similar to what the students were familiar with classroom activities was successful, at least for an initial interaction. The measures of usefulness and how enjoyable the character was should be considered a floor rather than a ceiling of performance measures. Overall, this is a promising beginning, but much more system development is necessary. Participation rates in the studies did not reach the level required to build a large corpus of user interactions necessary for a more robust system; however, positive 56 findings from the study justify its inclusion in the homework portion of a course on a limited basis in order to collect sufficient data to improve the user model. There is sufficient support for a two agent feedback model to proceed along that path for future system development. Having a two agent system produces some advantages in terms of dividing the responsibility of creating structured dialogues from tutoring interactions. 
It frees the structured dialogue creator from figuring out how to elegantly work feedback into the text of the dialogue. It also appears to have some face-saving advantages.
Methodologically, it is necessary to expand the use of simulated conversation and feedback as a means to better understand learner preferences about delivering feedback within the context of dialogues. Special attention should be paid to the importance of feedback modality (text versus audio) as well as questions about language use in tutoring (L1 versus L2). It will be critical to know when it is appropriate to provide grammatical feedback and when it is better to ignore errors so as not to interfere with the communicative process. Additional questions relate to how users solicit feedback from the agents. The model that was illustrated in the video was strictly an opt-in policy: errors were highlighted by underlining, and end-users would click to receive feedback from the second agent. Another line of research which should be explored further is to expand the feedback model to incorporate some of the features of the Lyster & Ranta (1997) taxonomy.
From a technical perspective, some of the limitations of the AIML approach for identifying user errors were apparent in this study and may need to be addressed by seeking other authoring environments or simply by expanding the capabilities with ancillary programs that would be able to examine user input. Identifying grammar errors using AIML alone is not the most efficient approach; it would be more worthwhile to explore integrating another existing system for grammatical feedback (e.g., Spanishchecker.com) than to look for ways to proceed with AIML.
This study examined user preferences, but did not account for learning outcomes, which would be of critical importance when considering the true usefulness of the system. Moreover, feedback on the system was limited to student perceptions and did not include the attitudes and opinions of instructors with respect to incorporating virtual agents into the curriculum. In short, this study successfully deployed an agent that could conduct a structured dialogue for the purpose of language learning. There is substantial work to do before the agent could be part of the core curriculum for a course; however, user evaluations justify including it as an exercise on a limited-use basis as the system is further developed.

CHAPTER FOUR: AUTHORING VIRTUAL HUMANS AND ASYMMETRICAL DISCOURSE

Marshall McLuhan (2008/1964) is well known for asserting that the content of a new medium is the content of an older medium. This is to say that with the advance of technology, old genres expand or migrate to new media. Eventually, the original form is adjusted to accommodate the differences between the original and the new medium. For example, the language used for radio drama is arguably a descendant of the stage play. The radio permitted audiences far away, in diverse locations, to share in the event; however, the fact that the audience lacked a shared visual context for the play forced the writers to make certain adjustments to the script. To compensate for the absence of a visual channel, writers employed enhanced audio cues and inserted lines that were specially designed to allow the audience to understand changes in the scene. Consider this excerpt from The Maltese Falcon:
SOUND: (SPADE PUNCHES CAIRO, KNOCKS HIM DOWN)
SPADE: I'll take the gun, Mr. Cairo. Now, get up.
SOUND: (CAIRO RISES) (Wilson, 1946) Had the scene occurred on stage, the events likely would have been communicated visually with little or no dialogue. Spade’s line is inserted in the radio drama to communicate to the audience that a series of actions have taken place. Through inference, the listeners understand that Spade took the gun and Mr. Cairo fell and 59 subsequently got up. Skillful radio drama writers understood the restrictions of the medium and were able to compensate for them in ways that appear as natural as possible. Virtual humans are a new medium for engaging in interactive role-play, which until recently has been restricted exclusively to live human interactions. The new medium allows institutions that use role-play as part of their training programs to use a virtual human in place of trained human actors for conducting exercises. See Figure 4.1. Figure 4.1: End-user interacting with a virtual human While conversations between the virtual human and the human users are generally analyzed as a two-party dialogue (machine-human), another way of looking at the 60 interaction is that the computer software is a medium through which a conversation takes place between various human end-users and the person that authored the script for the virtual human. Through this means, the author can engage an indefinite number of role- play partners in conversation by using a technological proxy. These highly asymmetrical conversations are made possible only through computer code. Throughout this chapter, the term “author” will refer to the person who created the script for the virtual human. By “script”, I do not mean a pre-determined sequence of utterances, such as in a radio or movie script; rather, I am referring to all potential utterances of the virtual human and the policies governing the usage of those utterances. As with the radio example, the computer technology that permits this new form of communication also comes with some inherent restrictions. This chapter addresses some aspects of authoring by making explicit the narrative implications of those restrictions. For this chapter, I constrain my analysis to a virtual human scenario created using the Tactical Questioning system (TacQ), a suite of software components developed at the Institute for Creative Technologies. The development and testing of this software was a collaborative effort from a group of researchers. My contribution to this project included data collection and analysis, dialogue network development and training of novice scenario developers. Having a thorough understanding of the authoring platform from a technical standpoint, and an inside view of the details of scenario development, as well as having spent many hours analyzing user input gave me a unique perspective on the system’s strengths and weaknesses. 61 First, I will look at the implications of using one particular semantic representation system (object, attribute, value triples) on conversations. Second, I will consider the conversational limitations that the default set of dialogue networks have. Third, I will discuss conversational effects of dealing with low speech recognition during the interactions. The findings of this study may apply to other dialogue systems inasmuch as they share architectural similarities. Before beginning the analysis, it is helpful to understand some background for the system and some general discussion about the asymmetries inherent in virtual human role plays. 
Background The primary motivation for creating TacQ virtual humans was to facilitate the training of military personnel in effective and culturally appropriate ways of tactical questioning (Traum et al., 2007). The military expends a considerable amount of resources in hiring actors for role play training. Because of the large number of personnel that need training and limited staffing of actors, the military uses virtual humans to supplement and prepare for live training. Virtual role plays provide an environment within which personnel can learn cultural sensitivity and proper protocols for conducting information gathering interviews. The development of the role-play scenarios rely on content created by subject matter experts (SMEs) that do not generally have any expertise in the technical aspects of computer dialogue systems (Gandhe et al., 2008). Consequently, the tools developed for creating content emphasized ease of authoring as much as possible. 62 As part of the development of the software, several scenarios were created to test the effectiveness of the authoring tools. The system proved successful in letting non- experts create several rapid prototypes of virtual human characters (Gandhe et al., 2011). Role playing, Asymmetries, and Authoring In a live role play of tactical questioning, actors playing the part of informants and people learning how to do tactical questioning share the same physical environment. In terms of negotiating their physical surroundings and linguistic interactions, they are more or less on equal footing. Migrating this type of interaction to a virtual environment creates a dramatic change in the dynamic and a complex, asymmetrical relationship between the linguistic participants. The relationship between the author and the end-user is not easily captured by standard speaker-addressee roles. It is better described using the more sophisticated taxonomy outlined by Levinson (1988). This taxonomy uses four binary features to define the role of the producer of a message (Participant, Transmitter, Motive and Form) which generates 10 different roles or categories of producers. In the case of character authoring, the author has properties most closely resembling the producer category called "ultimate source". The ultimate source is not a participant that is present during a speech event, nor do they transmit the message themselves, but they are responsible for the form (or content) of the message and the motives. Levinson uses a military commander as an example of an ultimate source. A commander who wants to communicate to subordinates uses an intermediary or relayer, in Levinson's terminology. The relayer has the properties of being a participant and a 63 transmitter, delivering the message without motive and without composing the message’s form. Using this taxonomy, the virtual human is a type of relayer. One difficulty in using the relayer category to describe virtual humans is that it fails to capture the complexity of motives. Whereas a military commander is expressly communicating his own will, the author, as an ultimate source, is speaking in the voice of a fictional character, the virtual human. The recipient of a military command via a relayer will likely know that the command comes through, not from, the relayer. However, end-users in a virtual human role play tend to immerse themselves within the fictional scenario and disregard the role of the author. 
In writing about imaginative play, Clark & Van Der Wege (2005) point out that in joint pretense, the action occurs on different layers (Clark, 1999). When people engage in role play, they are mutually suspending disbelief to place themselves in fictitious environments and conversations with assumed goals that are relevant only within the context of the conversation. Who sets these goals? On a meta-level, end-users engaging the TacQ system may have a variety of reasons for choosing to participate in the conversation, such as assigned training, testing the capabilities of the system or mere curiosity. However, within the context of the role play, it is the author who assigns a fictional back story and motives for both the virtual character and the human-end user. While live role plays may also have assigned roles and goals for participants, the assigned roles designated by the author are predetermined in TacQ; consequently, its characters cannot engage in verbal improvisation. Every utterance of the virtual human 64 has been pre-authored for the interaction; to be felicitous, the end-user must remain in character and fulfill his assigned information gathering goals. Creating a computer system that can sustain human interaction is challenging because the computer has some disadvantages to a human, particularly when it comes to understanding input. When an end-user speaks to the system, the speech is transcribed using automatic speech recognition. The system then statistically analyzes the transcription to match it to a simplified meaning representation of the statement. Based on the simplified representation, the program uses a set of rules to select the most appropriate type of reply from a list of pre-authored responses. At each of those stages of processing, errors can occur due to technological limitations or insufficient coverage in the authored material. Fortunately, many of these limitations are manageable through careful authoring. Authoring Authoring in the TacQ environment involves several steps as outlined by Gandhe, Taylor and Traum (2011) and further elaborated below: 1. Story creation 2. Domain organization 3. Assigning surface text to the domain 4. Refining the storyline and the domain. The first stage of authoring a character is decidedly non-technical. It is simply to come up with the motivation and context for the tactical questioning that will take place. The task of tactical questioning is inherently asymmetrical. One party has information 65 that the other party wants. The role of the author is to decide what that information is and then structure the information so that it can be used in the system. It is easy to consider a TacQ scenario as primarily a collection of question and answers, but there is generally an underlying narrative that connects those questions. The narrative can vary in complexity from simply completing an information report using standard military protocols to complex multi-part narratives which include lying and negotiation. Structuring the scenario, also called organizing the domain, requires three steps. The first is to describe and document the scenario; it is a reference guide for the author. The second step is to write the back story that orients the end-user to their role and to the fictional context of the questioning session. The third step is to break down the important elements of the story into a machine readable knowledge structure which is referred to as a domain. 
The process of creating the domain is tightly connected with structuring the dialogue because the abstract semantic representations of what the character can say and understand are created prior to creating the actual text that the character can say. This next section will address how three things define the scope of the interactions: the simple semantic representation system, the closed set of speech-acts, and the pre-defined network that links the speech-acts into discourse patterns.

Structuring Knowledge
Specifics on the architecture for creating domains are found in several papers (Gandhe et al., 2008; Gandhe, Whitman, Traum, & Artstein, 2009; Traum et al., 2008) and are summarized here. Domains are an abstract representation of everything the virtual human "knows" about. When beginning to create a new character, the authoring tools automatically populate the domain with a small set of dialogue acts that are independent of any particular scenario (greetings, closings, etc.). The author has no part in creating the abstract representation of these objects. Domain creation is the hardest component to teach novice authors because it requires them to create an ontology.
A domain is composed of units called objects. Objects represent people, things or events that the author thinks will be relevant during the questioning scenario. Each of these objects has author-defined attributes associated with it. For example, one of the virtual characters created for tactical questioning was Amani. Her fictional background is that she is a 25-year-old schoolteacher who knows about an incident in which a soldier was shot by a sniper at a market. See Table 4.1 below for an example of what this small domain could look like. The value "don't know" is marked false, meaning that if the author sets the system to lie, it will choose the false value. The system default is that a value is true.

Table 4.1: Sample domain
Objects | Attributes | Value
Amani | Name | Amani
Amani | Age | 25
The sniper suspect | Name | Saif; don't know (false)
Shooting Incident | Time | Morning
Shooting Incident | Location | Market

In order to create the character, the author uses software to create a machine-readable text representation of relevant information. As part of the authoring component, every time a new object and attribute are created, three dialogue act frames are automatically generated. A dialogue act frame is a machine-readable representation (i.e., in eXtensible Markup Language) of the illocutionary force with some associated semantic information. One of the automatically generated dialogue act frames represents what the character can assert about the object and its attributes. The other two are questions that the end-user can ask the character: a wh-question and a yes/no question. Associated with every speech-act frame is surface text. That is the text spoken by the character or paraphrases of questions that the author expects will be asked by the end-user. The author is thus writing in both the voice of the virtual human character and that of the human end-user who will be talking to the computer character. See Table 4.2 for an example of dialogue act frames and the associated surface text. In order to provide some variety in case the character has to respond to a repeated question, more than one variant of the surface text is provided. This surface text, reflecting what the author expects the end-user to say, makes up the bulk of the authoring.
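To make the object-attribute-value structure concrete, the sketch below restates the sample domain of Table 4.1 as a small data structure and generates the three frame skeletons (assert, wh-question, yes/no question) for each attribute. It is written in Python purely for illustration; the field names are mine, and the actual TacQ frames are XML, so this is not the system's real format.

```python
# Illustrative only: the object-attribute-value triples of Table 4.1 and the
# three dialogue act frames generated for each attribute. Field names are
# invented for this sketch; the actual TacQ frames are XML, not Python dicts.

domain = {
    "amani": {
        "name": [("Amani", True)],
        "age": [("25", True)],
    },
    "sniper_suspect": {
        # the false value is the one used if the character is set to lie
        "name": [("Saif", True), ("don't know", False)],
    },
    "shooting_incident": {
        "time": [("morning", True)],
        "location": [("market", True)],
    },
}

def generate_frames(domain):
    """Emit assert / wh-question / yes-no question skeletons per attribute."""
    frames = []
    for obj, attributes in domain.items():
        for attr, values in attributes.items():
            true_value = next(v for v, is_true in values if is_true)
            frames.append({"speaker": "character", "act": "assert",
                           "object": obj, "attribute": attr, "value": true_value})
            frames.append({"speaker": "end-user", "act": "whq",
                           "object": obj, "attribute": attr, "value": None})
            frames.append({"speaker": "end-user", "act": "ynq",
                           "object": obj, "attribute": attr, "value": true_value})
    return frames

print(len(generate_frames(domain)))  # 3 frames for each object-attribute pair
```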
During the first round of character creation, prior to pilot testing a new character, the author relies heavily on intuition about what the future questioners may ask. Once a pilot test has been done, the author reviews the data to see what the end-users have asked and adds additional paraphrased surface text.

Table 4.2: Dialogue Act Frames
Speaker | Speech Act | Object | Attribute | Value | Sample Surface Text
Amani | Assert | The shooting incident | number_of_shooters | One | "There was only one sniper involved in the shooting"
End-user | Wh-question | The shooting incident | number_of_shooters | <?> | How many people fired during the incident?
End-user | YN-question | The shooting incident | number_of_shooters | One | Was there only one shooter?

One of the asymmetries between the end-user and the author is that everything the character says must be constructed with the limits of the ontological framework in mind. The author constructs everything that the character can say as well as everything that it can understand; the author must consider what the character will say in addition to the questions that will match those statements. They are simultaneously writing the underlying representation for the questions and the answers, which has implications for how the domain is structured. Below are some of the issues that arise because of the representation system.

Deciding if something should be an object or an attribute
There is a fair degree of subjectivity in deciding whether something ought to be an attribute or an object. For questions like "What kind of clothing was the sniper wearing?", there is more than one option for how to structure the domain. The first option is to treat the answer, "t-shirt", as if it were the value of an attribute of the sniper. The second option is to create a new object called sniper suspect clothing. See Table 4.3 below for examples. A general rule of thumb is that if follow-up questions about a particular item are likely, then it may need to be its own object. If not, it is safe to leave it as an attribute. An author is not always able to make this prediction.

Table 4.3: Object or Attribute
Object | Attribute | Value
Sniper suspect | Clothing type | t-shirt
Sniper suspect clothing | Type | t-shirt

Questions whose answers return multiple true values
Continuing the previous example about clothing, suppose the sniper is wearing a t-shirt and jeans rather than only a t-shirt. One possibility is to treat the value as a single entity (jeans_and_t-shirt); another is to treat them as two separate values, as shown in Table 4.4 below.

Table 4.4: Single vs multiple true values
Object | Attribute | Value
Sniper suspect clothing | Type | jeans_and_t-shirt
Sniper suspect clothing | Type | jeans; t-shirt

The consequence of listing each item separately is that each value generates its own yes/no question: "Was the sniper wearing jeans?" and "Was the sniper wearing a t-shirt?" The dialogue manager treats each yes/no question as a wh-question about the attribute. The system will always respond with the first true value. If an author wishes to release both pieces of information at once, jeans_and_t-shirt should be a single value. If it becomes important to talk about these objects separately and follow-up questions abound, then a separate object can be created for each.

Representing responses to open-ended questions
People learning tactical questioning are taught to ask open-ended questions.
Rather than beginning with specific questions about attributes such as name or location, questioners often begin by broaching subjects broadly and asking questions such as "What can you tell me about the sniper?" Two approaches to handling this type of question are to either treat the question as a wh-question without an attribute or to create an attribute called general description or general info.
Each object has a wh-question dialogue act frame that has no attribute assigned to it. This was originally created to handle questions such as "What else can you tell me about the sniper?" Accordingly, the system response is to return an answer about the object under discussion that has not been given yet. If the user adopts a strategy of repeating "What else can you tell me?", the system will continue to give new answers about the object until it runs out of information. When the computer has exhausted all the answers for a given object, it returns an off-topic response. In the future, the system may be changed to reply with "I have told you all I know about..."
The other option is to create an attribute called general_description or general_info. If the author creates a general_info attribute on objects, the user receives a more complete answer to an open-ended question from the computer. The disadvantage is that general_info is a broad label and not very descriptive as to the content. This approach is inferior in style and in terms of domain organization.

Representing relationships between objects
Representing relationships among objects (especially social relationships) presents some challenges. One tactic is to avoid creating a scenario where questions need to be asked about relationships among characters. If this is not possible, relationships may be represented as attributes in order to answer questions like "Are the sniper and Assad friends?" In the Amani domain, most of the questions about relationships were asked to determine whether a particular kind of relationship existed. For example, some people wanted to know if the shopkeeper Assad and the sniper were friends. This was usually done with a yes/no question. The system treats yes/no questions as if they were wh-questions. In order to respond to a yes/no question, the author could assign friend-of-Assad status as an attribute with a yes value (see Table 4.5 below). Doing so allows yes/no questions to be asked about this particular relationship, but fails to account for broader questions that a person could ask, such as "Who are the sniper's friends?" or "Who are Assad's friends?"

Table 4.5: Addressing yes/no questions about relationships
Object | Attribute | Value
Sniper | friend_of_assad_status | yes
Sniper | Friend | Assad

Another approach, if there is only one value for the "friend" attribute, is to name the friend as a value. This approach has some significant drawbacks. The second example in Table 4.5 above assigns the "friend" attribute to the Sniper object when it could just as easily be an attribute of the Assad object (assuming that the friendship is mutual). Keeping track of relationships across objects is best done with a policy on which object will hold the information. If the author intends to import or export objects from one domain to another, this approach is not advisable since there is no checking mechanism to show whether the author has imported all the objects linked to the friend attribute. More problems arise in the context of relationships if an author assigns multiple values for an attribute.
Because the system responds to yes/no questions with the answer to the corresponding wh-question, the answer will not be "yes" or "no" but whatever the answer is to the question "What/Who is the friend of Sniper?" If there are multiple person objects that share this relationship, the author loses control over the output. The system will most likely select the top value of the list.

Table 4.6: Multiple values for a relationship attribute
Object | Attribute | Value
Sniper | Friend | Assad, Anah, Hakim, Al Leyla, Malik

The text that is the reply to "Are Assad and the Sniper friends?" would also have to include information about all the other objects (e.g., "The sniper is friends with Assad, Anah, Hakim, Leyla, and Malik"). Another approach is to make the attribute more relationship-specific.

Questions to find out if certain events occurred
One of the challenges is to respond to questions about whether certain events have occurred or if someone was a witness, such as the following examples: "Did you hear any of their conversation?" "Is Al Qaeda planting IED's?" "Did you see him pull the trigger?" An author can represent both events that did and did not occur, using the previously mentioned possibility of marking a value "false". See Table 4.7 below for examples.

Table 4.7: Representing events
Object | Attribute | Value
Tea_drinking_conversation | people_that_overheard | Amani
Bomb_Planting | responsible_agent | Nobody; Al-qaeda [false]

Building conversation networks
The author is not responsible for linking the dialogue acts together, but must be aware of how they are connected. Built into the system are a set of networks that make certain assumptions about how the conversation will connect the user's dialogue acts with the corresponding dialogue acts from the system. The simplest network is a wh-question that returns the stored assertion dialogue act that answers the question. This network accounts for a large portion of most conversations with the character; however, questions and answers are not the only dialogue acts that the system has. The author can label certain information as sensitive; the character may make requests prior to disclosing sensitive information. This leads to a more complex network which is easier to see in the context of a dialogue:

Table 4.8: A negotiation network
Speaker | Utterance | Dialogue Act
End-user | Where is this man's house at? | Wh-Question (Object = house, Attribute = location); Sensitive information
Amani | You are asking for a lot of information. If you expect me to tell you that, you must offer me secrecy. | Elicit-Offer Give-secrecy
End-user | This will be kept between me and you. | Offer Give-secrecy
Amani | You have assured me of secrecy. | Acknowledge-offer Give-secrecy
Amani | I believe he hides on the second floor of Assad's shop. I know he is in there. but when i go in the shop i don't see him. And i have no idea where he is now. | Assert (Object = house, Attribute = location, Value = above shop)

The network feature that needs highlighting from the previous dialogue is the fact that sensitive information triggers an elicit-offer instead of an immediate answer. Additionally, it shows some of the assumptions that are built into the network. Whereas the networks consider each end-user turn as a single dialogue act, the character can output several dialogue acts in succession (e.g. an acknowledgment followed by an assertion), as sketched below.
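The following is a simplified sketch of how such a network policy might be expressed. It is written in Python purely for illustration; the move names follow Table 4.8, but the control flow, data structures, and the passing of a pending answer are assumptions of this sketch, not the actual TacQ dialogue manager.

```python
# Simplified sketch of the sensitive-information network from Table 4.8.
# Move names follow the table; everything else is an assumption of this sketch,
# not the actual TacQ dialogue manager.

def handle_user_act(act, frame, offers_received):
    """Return the character's dialogue acts for one user turn."""
    if act == "whq" and frame.get("sensitive") and "give-secrecy" not in offers_received:
        # Sensitive information triggers an elicit-offer instead of an answer.
        return [("elicit-offer", "give-secrecy")]
    if act == "offer":
        offers_received.add(frame["offer"])
        # The character can output several acts in succession: an acknowledgment
        # followed by the assertion that answers the earlier question.
        return [("acknowledge-offer", frame["offer"]),
                ("assert", frame.get("pending_answer"))]
    if act == "whq":
        return [("assert", frame.get("answer"))]
    return [("off-topic", None)]

# Example: the secrecy negotiation from Table 4.8.
offers = set()
print(handle_user_act("whq", {"sensitive": True, "answer": "above shop"}, offers))
print(handle_user_act("offer", {"offer": "give-secrecy",
                                "pending_answer": "above shop"}, offers))
```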
There are system components that allow for segmentation of end-user utterances into multiple dialogue acts (Morbini & Sagae, 2011), but the networks supporting them have not been created. In large part, this is due to the next asymmetry, which will now be discussed, namely the comprehension asymmetry.

Comprehension Asymmetry
The author is far less constrained by the representation system than the end-user is. Authors can embellish utterances in a single turn beyond the object-attribute-value triple and have a reasonable expectation that the end-user will understand what is said. The reverse is not true. Additionally, when the user input gets transcribed, it has a high error rate. To compensate for this, the classifier that matches the input looks for statistical similarities in the words rather than exact matches (Leuski, Kennedy, Patel, & Traum, 2006). If the user utterance cannot be classified to an existing dialogue act, or if it falls below a certain level of confidence, it is classified as unknown. Handling unknown utterances elegantly is partly up to the author. An utterance can be classified as unknown for two reasons:
1. The system has a suitable dialogue act frame, but not sufficient example data in the stored surface text to make the classification. In other words, the system fails to recognize an utterance for which it does have an appropriate response.
2. There is no dialogue act frame created yet for that utterance.
In either case, the character has an obligation to respond. This leaves ample room for misunderstanding. Because of this, the system has tested approaches to grounding information in various scenarios (Roque & Traum, 2009; Traum et al., 2007). These approaches let the end-user know what the computer system has understood or failed to understand.

Topic tracking
In the current architecture, each object is considered a topic. There are speech acts related to topic tracking that can be recognized each time the system thinks the player has changed the subject. When the user changes the subject, the system will issue this speech act and then proceed to answer the question. In the example below, the speech act issued by the system to acknowledge a change of subject is the one labeled (repeat-back):
End-user: What time did you leave? (end of talking about another topic)
Amani: I left at 4 o'clock. (answer)
End-user: What can you tell me about Assad? (change of subject)
Amani: So you want to talk about the shopkeeper. (repeat-back)
Amani: Oh, the shopkeeper his name is Assad. (answer-to-question)
An author must decide if this is a desirable functionality for the scenario. It can be eliminated, in which case the previous dialogue would read as follows:
End-user: What time did you leave?
Amani: I left at 4 o'clock.
End-user: What can you tell me about Assad?
Amani: Oh, the shopkeeper his name is Assad.
The benefit of topic tracking is that if the system misunderstands and shifts topics inappropriately, the end-user can spot the topic of the error. Although helpful for clarifying to the end-user what the character thinks the topic is, topic tracking happens at a frequency that is unnatural and distracting. There is a trade-off between sounding natural and keeping the end-user apprised of what the system understands.

Request repair object and request repair attribute
The ability to track topics can also be used for times when the system does not understand a user utterance. First, the system tries to confirm that the topic of conversation is still the same as the previous utterance.
Request repair object and request repair attribute
The ability to track topics can also be used for times when the system does not understand a user utterance. First, the system tries to confirm that the topic of conversation is still the same as in the previous utterance. This is done with the request-repair-object move. If the user says yes, the system will then ask the participant to be more specific about what they want to know about the topic at hand. A simple dialogue using request-repair-object and request-repair-attribute would look like this:
End-user: What can you tell me about Assad?
Amani: Oh, the shopkeeper his name is Assad.
End-user: <system fails to understand an utterance>
Amani: Are we still talking about the shopkeeper?
End-user: Yes
Amani: What do you want to know about the shopkeeper?
These repairs are useful for eliciting rephrasings of questions, which gives the system a second opportunity to find a suitable match among the dialogue acts. This approach has to be coupled with a cut-off point at which the system gives up and admits that it may not know the information.
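As with the earlier sketches, the following is a hypothetical illustration, not TacQ code, of how request-repair-object and request-repair-attribute moves might be sequenced together with a give-up cut-off; the state names, the cut-off value, and the phrasing are assumptions.

# A hypothetical sketch of the repair sequence: confirm the topic
# (request-repair-object), then prompt for the attribute
# (request-repair-attribute), and give up after too many failures.

MAX_FAILURES = 2  # assumed cut-off before the character admits defeat

def handle_unknown(state):
    """Choose the next repair move after an utterance the classifier
    could not match to any dialogue act."""
    state["failures"] += 1
    if state["failures"] > MAX_FAILURES:
        state["failures"] = 0
        return "I'm sorry, I may not know anything about that."
    if not state["topic_confirmed"]:
        # request-repair-object: check we are still on the same object/topic
        return f"Are we still talking about {state['topic']}?"
    # request-repair-attribute: ask what about that topic the user wants
    return f"What do you want to know about {state['topic']}?"

def handle_yes(state):
    """The user confirmed the topic, so move on to the attribute prompt."""
    state["topic_confirmed"] = True
    return f"What do you want to know about {state['topic']}?"

state = {"topic": "the shopkeeper", "topic_confirmed": False, "failures": 0}
print(handle_unknown(state))  # "Are we still talking about the shopkeeper?"
print(handle_yes(state))      # "What do you want to know about the shopkeeper?"
print(handle_unknown(state))  # second failure: attribute prompt again
print(handle_unknown(state))  # cut-off reached: the character gives up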
Conclusion
Virtual humans are a relatively new medium for conducting interactive role plays, one that lets domain authors deliver valuable practice in tactical questioning without having to hire live actors. The present architecture of the TacQ system allows non-experts to create questioning scenarios relatively quickly because it requires minimal technical experience to create content. The ease of authoring is due in large part to the semantic representation (object-attribute-value triples) and a default set of dialogue act networks. There are many domains where this combination is ideal because it allows for broad coverage with limited technical skills needed to develop robust characters, for example teaching personnel how to conduct a SALUTE report (United States, 1984). Having said that, this chapter looked at some lessons learned from authoring a particularly complex domain which exceeded the capabilities of the system in some respects. Five areas were identified as problematic: multiple true values, open-ended questions, questions involving relationships among objects, sequencing of events, and hypothesis-testing questions. It is important for new authors to understand these limits before they begin constructing a character so that they do not build a domain that exceeds the capability of the system.
Secondly, speech recognition errors and domain incompleteness are major obstacles for communication between end-users and virtual humans. A variety of techniques are available for recovering from failed understanding, but they may come at the expense of natural-sounding dialogue patterns inasmuch as the system tends to use these techniques more frequently than they occur in human-human interactions. This is an ongoing area of research. Obviously, improvements in speech recognition would be the most direct way of improving performance, but this does not preclude simultaneous experimentation with the grounding dialogue networks. One avenue I would like to explore is statistically biasing some of the replies based on the last utterance of the character, not just the topic of conversation.
The limitations of the system can be looked at as temporary. The radio play largely gave way to television programming as technology advanced to allow the broadcast of images. Similarly, dialogue systems can overcome communicative limitations through further technological advances.

CHAPTER FIVE: CONCLUSIONS AND FUTURE WORK
This dissertation identified three distinct domains where computer code circumscribed the boundaries of communicative possibilities in asymmetrical ways.
The chapter regarding the BYU online community showed that the message board format affects how members decide which messages to read. Messages that appeared closer to the thread-initial message received significantly more attention than those later in the same thread. This is a marked difference from face-to-face conversations, since listeners are not able to selectively pay attention to speaking turns in the same fashion. The second part of the message board chapter looked at the role of social status in both participation and readership. The analysis showed a strong correlation between donor status and message board participation. Participants that had made a financial contribution to the website dominated the conversation floor at a 4:1 ratio as compared with non-donors. Though the donors contributed more content to the message board, on average, individual messages by non-donors received slightly more reads per message posted.
Understanding the role of structure and status is an important first step in understanding the social dynamics of the board, yet there is much more to be learned about the way that community culture and computer code are mutually reinforcing. One aspect that has yet to be identified is which social markers correlate with greater attention within the community, since donor status was only a minor predictor. One avenue of continued research is to identify the characteristics of those members of the community that are outliers for having both high numbers of postings and high per-post readership. Some candidate features that show promise include affiliation indicators in monikers that reflect membership in rival institutions (e.g. UteFan), members who present themselves as having familial relationships with current members of athletics programs, and markers of non-LDS religious affiliation. Beyond identifying the characteristic properties of the community members with the most social weight, this line of research may provide further insights into the underlying value system of the community.
Another line of research which deserves more attention is an analysis of the larger BYU sports community across various websites. There is some evidence of social stratification among the different communities, which are divided along lines of willingness to pay subscription fees (Blumagic, 2009). There seems to be differentiation in these communities in conversation topic and discourse style that has not been studied systematically.
A third line of research is one of tracking information flow across these distinct, but related, communities. Among the various communities are individuals that participate across several fora. The role of these members, who act as bridging capital, presents some interesting opportunities to study the direction that rumors flow through the social system. The higher-priced message boards tout insider information and comprise a smaller social network of regular contributors. The free message boards do not advertise insider information, but comprise a much larger community. The directionality of information flow is an interesting angle because a larger social network, in principle, could have more insider information based on sheer size despite the smaller community having more insider access.
A fourth line of research relates to these communities and the discursive creation of identity during periods of reanalysis. In the last year, there has been a profound change across the college football landscape.
Because of market forces related to television contracts, many football conferences have expanded and realigned. In the process, several rival institutions which have shared a conference affiliation for over a century now find themselves in different conferences. Rivalry games that are born out of the zero-sum nature of a conference championship race are now relegated to a lower status in terms of their impact on post-season play. The practical implications are only one dimension of these rivalry games, however. Over the course of a long rivalry, institutions tend to define their identity in terms of one another. This effect is somewhat magnified in the rivalry between Brigham Young University and the University of Utah, where religious and political identities coupled with geographic proximity combine to accentuate these differences. With the elimination of a conference race and the possibility of a discontinuance of future games between the institutions, many have speculated about what will happen to the rivalry. The process of this adjustment of identities will be borne out most clearly in the discourse of online communities, in a way that can be thoroughly documented through text.
The work related to using lightweight dialogue agents for pedagogical purposes is ready to move forward from its initial stages. The pilot study included in chapter three showed that even in the early stages of development, students found the activity enjoyable and useful. There was also a marked preference towards separating the roles of tutor and conversation partner. These findings justify further studies, which may be included as part of the curriculum, thus increasing the number of potential participants. The finding that most participants preferred two agents (a tutor and a conversation partner) allows future work with the agents to proceed along two distinct lines that require separate development.
With respect to the conversation agent portion, several steps need to be taken. First, though the data gathered was a useful first step for creating a user model for one task, it only represents a single structured dialogue. It is not clear whether students will continue to find the agent both enjoyable and useful when the novelty of chatting with an animated character wears off. Tracking enjoyability and usefulness across time over multiple scenarios is a necessary step before building a curriculum around these agents. This line of research also needs to address practical issues of development, namely expanding content and creating a better user model as more data is gathered. Another element that needs to be addressed is returning some initiative to the students in terms of asking questions. The data gathered using agent-initiated questions showed relatively short replies. While these replies were often appropriate for the context, they did not provide much in the way of data to understand the linguistic development stage of the students. Structuring activities with more student initiative may address those concerns, as it would require them to write in more complete sentences.
A second front in the development process is research on the tutoring agent character. Rather than developing the agent's technical capabilities right away, a detailed analysis of user preferences should come first. This first iteration of development benefited from using a video of simulated feedback that demonstrated a single approach to feedback.
This made it possible to gather data on preferences regardless of whether the end-user had experienced feedback during their interaction with the conversation agent. This approach should be expanded to include demonstrations of a variety of approaches to find out how students react to each method. Another line of research to consider is when and how to deliver the feedback. It is neither necessary nor advisable to correct every user error. Grammar principles that relate to the theme of the particular exercise should be given priority, but should the agent ignore all other grammar principles? Designing a system with intelligent application of grammatical feedback will be key to the success of the agent. Because the computer is going to be imperfect as it is developed, there is a danger of providing inappropriate or inaccurate feedback. Reducing feedback to only those areas where the system has a very high degree of confidence that an error exists runs a different risk: if students assume that they are receiving corrective feedback, the lack of corrections may give them a false sense of security.
With respect to the chapter regarding TacQ, the current architecture makes it easy to create content for straightforward domains. This dissertation has shown that more complex domains that focus heavily on relationships among groups of objects, sequencing of events, and hypothesis testing are harder to represent and thus should be avoided by novice authors. The chapter offered practical suggestions for how more advanced authors can work around these limitations. The next stage in development would be to allow more advanced authors to customize the underlying dialogue rules to facilitate more complex domains where end-users perform multiple dialogue acts in a single turn. The TacQ architecture is no longer being actively developed, but its components could be re-used in other projects.
As computers become increasingly ubiquitous as a communication tool, it is valuable to consider how different environments are structured through computer code. These three studies contribute to the understanding of how design decisions regarding computer-mediated conversation environments, and the code supporting the design, affect user interactions.

BIBLIOGRAPHY

Armstrong, A., & Hagel, J. (2000). The real value of online communities. In E. L. Lesser, M. A. Fontaine, & J. A. Slusher (Eds.), Knowledge and communities (pp. 85–95). Oxford: Butterworth-Heinemann.
Bailey, N., Madden, C., & Krashen, S. D. (1974). Is there a "natural sequence" in adult second language learning? Language Learning, 24(2), 235–243.
Blumagic (2009, January 4). Ive had enough of this boards inferiority complex. Cougarboard.com. Retrieved June 27th, 2012, from http://www.cougarboard.com/board/message.html?id=4356296
Bush, N., Wallace, R., Ringate, T., Taylor, A., & Baer, J. (2001). Artificial Intelligence Markup Language (AIML) Version 1.0.1. Retrieved June 28th, 2012, from http://www.alicebot.org/TR/2001/WD-aiml/
BYU81 (2012, May 12). This is the only message board where people who don't pay. Message posted to Cougarboard.com. Retrieved June 27th, 2012, from http://www.cougarboard.com/board/message.html?id=8615821
Carroll, S., Swain, M., & Roberge, Y. (1992). The role of feedback in adult second language acquisition: Error correction and morphological generalizations. Applied Psycholinguistics, 13(02), 173–198.
Clark, H. (1999). How do real people communicate with virtual partners.
Proceedings of 1999 AAAI Fall Symposium. Retrieved from http://www.aaai.org/Papers/Symposia/Fall/1999/FS-99-03/FS99-03-006.pdf
Clark, H. H., & Van Der Wege, M. M. (2005). Imagination in Discourse. In D. Schiffrin, D. Tannen, & H. E. Hamilton (Eds.), The Handbook of Discourse Analysis (pp. 772–786). Malden, Massachusetts, USA: Blackwell Publishers Ltd. Retrieved from http://uq5sd9vt7m.scholar.serialssolutions.com/sid=google&auinit=HH&aulast=Clark&atitle=Imagination+in+discourse&id=doi:10.1002/9780470753460.ch40
Croll, A., & Power, S. (2009). ProQuest Tech Books: Complete Web Monitoring, 1st Edition. Retrieved February 12, 2010, from http://proquest.safaribooksonline.com/9780596804084/what_did_they_say_question_online_commu#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODA1OTY4MDQwODQvdGhlX2FuYXRvbXlfb2ZfYV9jb252ZXJzYXRpb24maW1hZ2VwYWdlPQ==
Demographics. (n.d.). Brigham Young University Official Website. Retrieved May 11, 2010, from http://yfacts.byu.edu/viewarticle.aspx?id=135
Dulay, H. C. (1974). Natural Sequences in Child Second Language Acquisition. Language Learning, 24(1), 37–53. doi:10.1111/j.1467-1770.1974.tb00234.x
Edelsky, C. (1981). Who's Got the Floor? Language in Society, 10(3), 383–421.
Galston, W. A. (2000). Does the Internet Strengthen Community? National Civic Review, 89(3), 193–202. doi:10.1002/ncr.8930
Gandhe, S., DeVault, D., Roque, A., Martinovski, B., Artstein, R., Leuski, A., Gerten, J., & Traum, D. (2008). From Domain Specification to Virtual Humans: An integrated approach to authoring tactical questioning characters. Proceedings of Interspeech-08, Brisbane, Australia. Retrieved from http://www.ict.usc.edu/%7Eleuski/publications/papers/tacq08.pdf
Gandhe, S., Taylor, A., Gerten, J., & Traum, D. (2011). Rapid Development of Advanced Question-Answering Characters by Non-experts. Proceedings of the SIGDIAL 2011 Conference, The 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Portland, Oregon, USA: The Association for Computational Linguistics. Retrieved from http://www-scf.usc.edu/~gandhe/tacq-demo.pdf
Gandhe, S., Rushforth, M., Aggarwal, P., & Traum, D. (2011). Evaluation of an Integrated Authoring Tool for Building Advanced Question-Answering Characters. Proceedings of Interspeech-11, Florence, Italy.
Gómez, V., Kaltenbrunner, A., & López, V. (2008). Statistical analysis of the social network and discussion threads in Slashdot. Proceedings of the 17th International Conference on World Wide Web (pp. 645–654). Beijing, China: ACM. doi:10.1145/1367497.1367585
Grice, H. P. (2002). Logic and Conversation. In D. J. Levitin (Ed.), Foundations of Cognitive Psychology: Core Readings. MIT Press. (Original work published 1975)
Heift, T. (2004). Corrective feedback and learner uptake in CALL. ReCALL, 16(2), 416–431.
Herring, S. C. (2001). Computer-mediated discourse. In D. Schiffrin, D. Tannen, & H. Hamilton (Eds.), The Handbook of Discourse Analysis (pp. 612–634). Oxford: Blackwell Publishers. Retrieved from http://ella.slis.indiana.edu/~herring/cmd.pdf
Herring, S. C., Johnson, D. A., & DiBenedetto, T. (1995). "This discussion is going too far!" Male resistance to female participation on the Internet. In M. Bucholtz & K. Hall (Eds.), Gender Articulated: Language and the Socially Constructed Self, 67–96. New York: Routledge.
Jia, J., & Chen, W. (2008). Motivate the Learners to Practice English through Playing with Chatbot CSIEC. In Z. Pan, X. Zhang, A. El Rhalibi, W. Woo, & Y. Li (Eds.), Technologies for E-Learning and Digital Entertainment (Vol.
5093, pp. 180–191). Nanjing, China: Springer Berlin / Heidelberg. doi:10.1007/978-3-540-69736-7_20
Johnson, W., Ashish, N., Bodnar, S., & Sagae, A. (2010). Expecting the Unexpected: Warehousing and Analyzing Data from ITS Field Use. In V. Aleven, J. Kay, & J. Mostow (Eds.), Intelligent Tutoring Systems, Lecture Notes in Computer Science (Vol. 6095, 352–354). Springer Berlin / Heidelberg. Retrieved from http://www.springerlink.com.libproxy.usc.edu/content/gw0146164m0gp517/abstract/
Johnson, W. L., & Valente, A. (2009). Tactical Language and Culture Training Systems: Using AI to Teach Foreign Languages and Cultures. AI Magazine, 30(2), 72–83. doi:10.1609/aimag.v30i2.2240
Kalman, Y. M., Ravid, G., Raban, D. R., & Rafaeli, S. (2006). Speak* now* or forever hold your peace: power law chronemics of turn-taking and response in asynchronous CMC. Journal of Computer-Mediated Communication, 12, 1–23. Retrieved from http://jcmc.indiana.edu/vol12/issue1/kalman.html
Kim, J. H. (2005). Issues of corrective feedback in second language acquisition. Teachers College, Columbia University Working Papers in TESOL & Applied Linguistics, 4(2).
Krashen, S. (2002). Second Language Acquisition and Second Language Learning. Retrieved from http://sdkrashen.com/SL_Acquisition_and_Learning/SL_Acquisition_and_Learning.pdf (Original work published 1981)
Lessig, L. (2006). Code: Version 2.0. Basic Books, Inc., New York, NY, USA. Retrieved from http://codev2.cc/download+remix/Lessig-Codev2.pdf
Leuski, A., Kennedy, B., Patel, R., & Traum, D. (2006). Asking questions to limited domain virtual characters: how good does speech recognition have to be? 25th Army Science Conference, Orlando, Florida. Retrieved from http://ict.usc.edu.libproxy.usc.edu/~traum/Papers/asc06blackwell.pdf
Levinson, S. C. (1988). Putting linguistics on a proper footing: Explorations in Goffman's concepts of participation. Oxford, England: Polity Press; Boston, MA, US: Northeastern University Press. Retrieved from http://www.mpi.nl/publications/escidoc-66709/@@popup
Licklider, J. C. R., & Taylor, R. W. (1968). The computer as a communication device. Science and Technology, 76, 21–31.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake. Studies in Second Language Acquisition, 19(01), 37–66.
McLuhan, M. (2008). The Medium is the Message. In C. D. Mortensen (Ed.), Communication Theory (2nd ed., pp. 390–402). New Brunswick, N.J.: Transaction Publishers. (Original work published 1964). Retrieved from http://is.gd/VPDn2l
Morbini, F., & Sagae, K. (2011). Joint Identification and Segmentation of Domain-Specific Dialogue Acts for Conversational Dialogue Systems. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon: Association for Computational Linguistics. Retrieved from http://www.aclweb.org/anthology/P/P11/P11-2017.pdf
Nagata, N. (1995). An effective application of natural language processing in second language instruction. CALICO Journal, 13(1), 47–67.
Nagata, N. (2009). Robo-Sensei. CALICO Journal, 26(3), 18.
Norris, P. (2002). The Bridging and Bonding Role of Online Communities. The Harvard International Journal of Press/Politics, 7(3), 3–13. doi:10.1177/1081180X0200700301
Panyametheekul, S., & Herring, S. C. (2003). Gender and turn allocation in a Thai chat room. Journal of Computer-Mediated Communication, 9(1).
Putnam, R. D. (2001). Bowling Alone: The Collapse and Revival of American Community (1st ed.). Simon & Schuster.
Ren, Y., Kraut, R., & Kiesler, S. (2007). Applying common identity and bond theory to design of online communities. Organization Studies, 28(3), 377.
Rheingold, H. (1993). The Virtual Community (1st ed.). Addison-Wesley Pub. Co. Retrieved from http://www.rheingold.com/vc/book/intro.html
Roque, A., & Traum, D. (2009). Improving a virtual human using a model of degrees of grounding. Proceedings of IJCAI-09. Pasadena, CA.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735.
Statistics - CougarBoard.com. (n.d.). Retrieved February 14, 2010, from http://www.cougarboard.com/stats.html
Theodoridou, K. D. (2009). Learning with Laura: Investigating the effects of a pedagogical agent on Spanish lexical acquisition. University of Texas Libraries. Retrieved from http://repositories1.lib.utexas.edu/handle/2152/6612
Traum, D., Leuski, A., Roque, A., Gandhe, S., DeVault, D., Gerten, J., & Martinovski, B. (2008). Natural Language Dialogue Architectures for Tactical Questioning Characters. Presented at the Army Science Conference, Florida. Retrieved from http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA503947
Traum, D., Roque, A., Leuski, A., Georgiou, P., Gerten, J., Martinovski, B., Narayanan, S., et al. (2007). Hassan: A virtual human for tactical questioning. Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue, 71–74.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
United States (1984). Combat skills of the soldier: FM 21-75. Washington, DC: Headquarters, Dept. of the Army.
Wallace, R. (2003). The elements of AIML style. Alice AI Foundation. Retrieved from https://files.ifi.uzh.ch/cl/hess/classes/seminare/chatbots/style.pdf
Wallace, R. S. (2009). The Anatomy of A.L.I.C.E. In R. Epstein, G. Roberts, & G. Beber (Eds.), Parsing the Turing Test (pp. 181–210). Springer Netherlands. Retrieved from http://dx.doi.org/10.1007/978-1-4020-6710-5_13
Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45.
Williams, L., & van Compernolle, R. A. (2009). The chatbot as a peer/tool for learners of French. In L. Lomicka & G. Lord (Eds.), The next generation: Social networking and online collaboration in foreign language learning. Retrieved from www.personal.psu.edu/rav137/preprints/CALICO.pdf
Wilson, F. (1946, July 3). The Maltese Falcon. In Academy Award. CBS network. Retrieved April 11, 2012, from http://emruf.webs.com/acad.htm
Wilson, S. M., & Peterson, L. C. (2002). The anthropology of online communities. Review of Anthropology, 31(1), 449–467.
Zacharski, R. (2002). Conversational agents for language learning. Innovative Applications of Artificial Intelligence Conference. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.12.2330&rep=rep1&type=pdf

APPENDIX: Interaction Survey for Conversation Partner
Abstract
Over the course of my studies in linguistics, I became intrigued by the impact of computer code on communication. As physical space can be used to provide communicative advantages to one party over another, so the computer code that structures a virtual communicative channel shapes discourse patterns. This dissertation is organized as a collection of three papers, each of which considers the asymmetries of discourse in a different virtual environment.
The first environment is an online message board for sports fans. Its conversations follow a tree-structure format which identifies whether the author of a message is a donor to the website, a social status marker signaled by the underlying computer code. In this chapter, I investigate how the board's tree-structure influences which messages are read. I also consider what quantifiable differences in participation and readership exist between donors and non-donors. The study of the board structure demonstrates that a reply's proximity to its parent messages affects its readership, with those replies closest to the parent message receiving the highest readership. The study also finds that donors have a higher participation rate in conversations, but on average, messages posted by donors receive slightly less readership per message than those of non-donors.
The second environment is in the domain of second language learning and examines students in first semester university Spanish interacting with a virtual conversation partner in Spanish. The conversations followed a format similar to in-class role play activities and were guided by prompted questions from the virtual agent. The study shows that students believe that metalinguistic feedback is necessary for a language learning activity to be useful, although there was not a consensus on the pedagogical effects of the feedback. The pilot study also indicates a preference to have feedback delivered by a separate virtual agent, rather than have the role of conversation partner and tutor be executed by the same agent.
The third environment is one of virtual agents designed for tactical questioning training. This chapter looks at interactions primarily from the perspective of the authors creating interactive narratives