CODE-SWITCHING DIALOGUE SYSTEMS: AN INVESTIGATION INTO HOW SYSTEMS CAN
SUPPORT CODE-SWITCHING AND WHEN THEY SHOULD, WITH ANALYSIS OF TWO
CHOCTAW-ENGLISH APPLICATIONS
by
Jacqueline Brixey
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(COMPUTER SCIENCE)
December 2024
Copyright 2025 Jacqueline Brixey
Dedication
I dedicate this dissertation first to my Choctaw community: Theresa, for answering my truly endless questions about Choctaw; my classmates in Los Angeles; Cheyenne, a̱tek; Dora for her beautiful translations and transcriptions; the Department of Chahta Immi in Mississippi; and to the many Choctaw speakers and learners who shared their time and language knowledge with me.
I also dedicate this dissertation to my family. Many thanks to my parents for their unwavering support and mentorship. One million thanks to my husband, CheChé, for always believing in me. I wouldn’t have finished without you in my corner.
Acknowledgments
I would like to express my gratitude to the Choctaw Nation of Oklahoma for their support throughout
this research. Special thanks go to Jason Campbell, Senior Director of Education; Anjanette Williston,
Director of the School of Language; Teri Billy, Executive Director of the School of Language; and James
Parrish, former Director of the School of Language, for their invaluable assistance. I would also like to
thank Amanda Johnson, Executive Director of Education, and Carey Fuller, Administrative Director of the
Choctaw Nation of Oklahoma Institutional Review Board, for their support. I thank the members of the
CNO IRB for their helpful feedback.
During my PhD, I was fortunate to receive support from two fellowships: the GEM fellowship and a
departmental fellowship from USC, which enabled me to make my first trip to Oklahoma for data collection.
I am also grateful for the US Army funding that supported this research.
I would like to acknowledge the many contributions of Dr. David Traum in this work, as well as his
guidance, patience, and advice. I thank Dr. Khalil Iskarous and Dr. Maja Mataric for serving as dissertation
committee members. I am deeply grateful for their support of my dreams of building Choctaw language
technology.
I am indebted to the PhD and PostDocs in LA online writing group, especially Tomoko. The commiserating time was as important as the time spent writing.
Lastly, I thank the many Choctaw speakers who contributed their time and expertise to this work. I
extend a heartfelt thanks to those contributors who died during the COVID-19 pandemic; I will do my best
to be a good steward of your recordings and knowledge.
Table of Contents
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Anumpa Nushkobo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Choctaw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Bilinguals and Bilingualism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Types of bilinguals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.2 Code-switching frameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.3 Bilingualism in conversation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Dialogue Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Thesis statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.7 Publications and applications resulting from this dissertation . . . . . . . . . . . . . . . . . 12
1.7.1 Technology Transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.8 Outline of Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Chapter 2: Prior Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1 Bilingual linguistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Language features of code-switching . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.2 Languages of code-switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.3 Code-switching frameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.4 Purpose of code-switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Bilingual natural language processing (NLP) . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2.1 Computational approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2.2 Future directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3 Brief Overview of Dialogue Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3.1 Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Bilingual dialogue systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.1 Reasons for code-switching interactions . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.1.1 User preference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.1.2 Task required . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.4.1.3 System is an unbalanced bilingual . . . . . . . . . . . . . . . . . . . . . . 36
2.4.1.4 Socio-psychological . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.4.1.5 User proficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4.2 Computational requirements of a bilingual dialogue system . . . . . . . . . . . . . 41
2.4.2.1 Data sets of code-switching . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.2.2 Understanding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.2.3 Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.4.3 Summary of the research space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.5 Overview of Choctaw language and culture . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.5.1 Sounds and orthography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.5.2 Dictionaries and Reference Grammars . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.6 A Brief Overview of Prior work on Indigenous languages . . . . . . . . . . . . . . . . . . . 53
2.6.1 Challenges and motivations for Indigenous language technology . . . . . . . . . . 53
2.6.2 Documenting Indigenous languages . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.6.2.1 Historical practices and issues . . . . . . . . . . . . . . . . . . . . . . . . 55
2.6.2.2 Best practices for working with community . . . . . . . . . . . . . . . . 56
2.6.3 Examples of Indigenous Language Technology . . . . . . . . . . . . . . . . . . . . 57
Chapter 3: Foundational Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.0.1 ChoCo Corpus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.0.1.1 Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.0.1.2 Corpus Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.0.1.3 Corpus Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.0.2 Spoken language data collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.0.3 Choctaw Automatic Speech Recognizer . . . . . . . . . . . . . . . . . . . . . . . . 69
3.0.4 Verb generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.0.5 Ongoing and long term projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.0.5.1 Mississippi Dictionary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.0.5.2 Native Earth Native Sky . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Chapter 4: Bilingual Language Revitalization Systems - Application 1: Masheli . . . . . . . . . . . 73
4.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.2 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.2.1 Literature review of chatbots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.2.1.1 Chatbots as pedagogical agents . . . . . . . . . . . . . . . . . . . . . . . 75
4.2.1.2 Chatbots as language learning companions . . . . . . . . . . . . . . . . . 76
4.2.2 Second language acquisition literature . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.2.2.1 Indigenous language pedagogy and language learning systems . . . . . . 77
4.2.2.2 Emerging Bilingual Conversational Behaviors and Translanguaging . . . 79
4.3 Masheli 1.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.3.1 System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.1.1 Code-switching design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.1.2 Dialogue types and tags . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.3.1.3 Orthographic considerations . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3.2.1 User study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.3.2.2 Language experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.3.2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.4 Masheli 2.0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.4.1 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4.2 System design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.4.2.1 Additional questions in QA corpus . . . . . . . . . . . . . . . . . . . . . 95
4.4.2.2 Generating code-switched QA corpus . . . . . . . . . . . . . . . . . . . . 95
4.4.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4.3.1 Language Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.4.3.2 Survey design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.4.3.3 Experiment session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.4.3.4 Inclusion and exclusion criteria . . . . . . . . . . . . . . . . . . . . . . . 100
4.4.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.4.4.1 Language Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.4.4.2 Responses to survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.4.4.3 Chat logs and annotations . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.4.4.4 Language experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.6 Summary of ndings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Chapter 5: Bilingual Language Documentation Systems - Application 2: DAPEL . . . . . . . . . . . 121
5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.2 Related literature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.3 Overarching design of DAPEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.4 Pilot studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.4.1 Pilot study 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.4.1.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.4.2 System design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.4.2.1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.4.2.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.4.3 Pilot study 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.4.3.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.4.3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.4.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.5 Code-switching DAPEL Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.5.1 Hypotheses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.5.2 Dialogue Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.5.2.1 Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.5.2.2 Prompts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.5.2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.5.2.4 Small talk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.5.2.5 Code-switching design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.5.2.6 Code-switching in user responses . . . . . . . . . . . . . . . . . . . . . . 138
5.5.2.7 System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.3.1 Four versions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.3.2 Survey design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.3.3 Experiment session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.5.3.4 Research context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.5.3.5 Setting and participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.5.4.1 Survey results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.5.4.2 Audio duration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.5.4.3 Measuring diversity in recordings . . . . . . . . . . . . . . . . . . . . . . 150
5.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.7 Summary of Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Chapter 6: Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1.1 Masheli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1.2 DAPEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.3 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.3.1 Masheli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6.3.2 DAPEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6.4 Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.5 Future research directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.5.1 Masheli and language learning companions . . . . . . . . . . . . . . . . . . . . . . 170
6.5.2 DAPEL and systems for recording endangered languages in conversation . . . . . 170
6.5.3 State of the Art . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.5.4 Indigenous Language Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
List of Tables
2.1 Summary of the existing code-switching dialogue systems research space and the
position of this dissertation within it. Chapter 4 of this dissertation covers the
Masheli chatbot for language learning, and Chapter 5 the DAPEL system for language
documentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.1 Overview of corpus structure, with number and type of files within each subsection . . . . 63
3.2 Number of tokens for text data within the corpus . . . . . . . . . . . . . . . . . . . . . . . 64
3.3 Number of hours and minutes of spoken Choctaw within the corpus . . . . . . . . . . . . 65
4.1 Dialogue types in Masheli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.2 Examples of framework-based utterances. English portions are bolded in code-switched
utterances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.3 Non-framework utterances said by the chatbot. English portions are bolded in code-switched utterances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.4 Measurement of change between pre and post-language tests . . . . . . . . . . . . . . . . . 105
4.5 The results of comparing survey responses between the monolingual and code-switching
interactions. p<0.10 results are marked with one asterisk. Standard deviations are given
in parentheses next to the average in the final two columns. . . . . . . . . . . . . . . . 106
4.6 Sentiment analysis of open-ended survey responses. M stands for monolingual and CSW
for code-switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.7 Analysis of chat logs for the number of words. In the first four columns, M stands for
monolingual, and CSW for code-switching. In the final two columns, the participant
with that specific value is given in parentheses; "multiple" indicates that more than one
participant had that value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.8 Analysis of chat logs for the number of stories. Participants may have asked for stories
but been unsuccessful in the attempt, as indicated by the number of attempts needed to
get a story. Alternatively, the chatbot sometimes gives a story without a request, leading
to a mismatch between the number of requests and the number of stories received. . . . 111
4.9 Initiative annotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.10 Pearson correlation test to evaluate the relationship between survey responses and
number of stories requested in an interaction. Strong correlations (r>0.60) are bolded, and
two asterisks indicate p<0.05. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.11 Pearson correlation test to evaluate the relationship between survey responses and
average number of tries to receive a story in an interaction. Strong correlations (r>0.60)
are bolded, and two asterisks indicate p<0.05. . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.1 Sample of questionnaire results. Participants were asked to rate their experience on a 1-5
Likert scale, with 5 being the highest score. . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.2 List of all prompts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.3 Code-switching options in DAPEL prompts and small talk. Examples are given in the
column on the right. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.4 Survey results of all participants comparing those who interacted with the monolingual
system versus the code-switching system. Averages and standard deviations are given for
each group. A single asterisk indicates p<0.10; two indicates p<0.05. . . . . . . . . . . . . . 143
5.5 Survey results of all participants divided into L1 and L2 cohorts comparing those who
interacted with the monolingual system versus the code-switching system. Bolded column
names are T-test results. Averages and standard deviations are given for each group. A
single asterisk indicates p<0.10; two indicates p<0.05. . . . . . . . . . . . . . . . . . . . . . 144
5.6 Survey results comparing L2 to L1 participants who interacted with a given system. Mono
indicates the monolingual system, and CSW indicates the CSW system. A comparison of
all L2 participants against all L1 participants, regardless of system, is given in the three
rightmost columns. Averages and standard deviations for a given group and system are
given for each pairing. A single asterisk indicates p<0.10; two indicates p<0.05. . . . . . . . 145
5.7 Sentiment analysis on first open-ended survey question . . . . . . . . . . . . . . . . . . 146
5.8 Sentiment analysis on final survey question . . . . . . . . . . . . . . . . . . . . . . . . 147
5.9 Audio durations for prompts are given in hours:minutes:seconds. Standard deviations
are given in parentheses. The number of participants is given in the column headers in
square brackets; the groups are not even as L1 and L2 were not explicitly recruited, and
one participant from the CSW group was excluded due to technical recording issues. . . . 148
5.10 Overview of total and total unique word counts for all prompt responses in interaction. . . 152
6.1 All research questions listed in this dissertation. The leftmost column states where in
the chapter it was discussed, and the rightmost column indicates whether the research
question was addressed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.2 The results of the language test. The raw scores are given in columns 2-4, with the
difference in change given in the final two columns on the right . . . . . . . . . . . . . 192
6.3 The results of the language test for the grammar questions. The raw scores are given in
columns 2-4, with the difference in change given in the final two columns on the right . . 193
6.4 Total duration, average duration, and number of responses per condition (code-switching
and monolingual) and group (L1 and L2). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
List of Figures
1.1 The Choctaw tribe was historically based in Mississippi and Alabama (states shown in
red), but today its members are also dispersed across the US, including Louisiana and
California (states shown in green). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1 Summary of reasons to code-switch (References include: Giles 1979; Boztepe 2003;
Gardner-Chloros and Edwards 2004; Grosjean 2013; Poplack 2000) . . . . . . . . . . . . . . 23
2.2 Task-oriented systems (From Hongshen Chen et al. 2017) . . . . . . . . . . . . . . . . . . . 25
2.3 Interactions with a bilingual dialogue system considering the sociolinguistic motivations
and technical components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4 In this example from ChatGPT 3.5, the user directly asks a question using code-switching.
The question asked is “Where is the bathroom?”, to which the system gives a corrected
form in Spanish. This interaction occurred in January 2024. . . . . . . . . . . . . . . . . . . 33
2.5 In this example from ChatGPT 3.5, the system provides a code-switched response on the
second request from the user. This interaction occurred in January 2024. . . . . . . . . . . 34
2.6 Example conversation from the Tauira system (Schagen and Knott 2004) in which the
system learns new vocabulary from the user. H indicates the user and C indicates the
system in the figure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.7 An example of Czech-English code-switching where high-information content words are
switched . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.8 Summary of the broad research areas of computational aspects of bilingual systems . . . . 42
2.9 Choctaw sounds and orthographic variants. Bracketed characters are IPA symbols. . . . . 51
2.10 Example conversation with ChatGPT 4, retrieved October 26, 2024. . . . . . . . . . . . . . 58
3.1 Structure of the corpus, with descriptions for substructures . . . . . . . . . . . . . . . . . . 62
3.2 Screenshot of Excel le for Byington dictionary . . . . . . . . . . . . . . . . . . . . . . . . 66
3.3 The first ten lines to be repeated by participants . . . . . . . . . . . . . . . . . . . . . . 68
3.4 Transcription of one conversation recording between a fluent and a student Choctaw speaker 69
4.1 Translanguaging strategies from (Cenoz and Gorter 2017) . . . . . . . . . . . . . . . . . 83
4.2 Common translanguaging strategies from (Seals and Olsen-Reeder 2020) . . . . . . . . . . 83
4.3 Additional translanguaging strategies from (Dougherty 2021) . . . . . . . . . . . . . . . . . 83
4.4 Example conversation with the chatbot in Masheli 1.0. Translations are in square brackets. 88
4.5 Example of code-switching within one user turn in Masheli 1.0. Translations are in square
brackets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.6 Errors in user input in Choctaw. Translations are in square brackets. . . . . . . . . . . . . 92
4.7 Number of attempted questions for each question on the language test per group. "Pre"
indicates it was before the interaction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.8 Number of correct answers for each question on the language test per group. "Pre"
indicates it was before the interaction, "post" indicates after. . . . . . . . . . . . . . . . . . 104
5.1 DAPEL system interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.2 Interface of the system showing the prompt page . . . . . . . . . . . . . . . . . . . . . . . 159
5.3 Screenshot of the summary page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
5.4 Screenshot of the small talk page . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
5.5 All versions of DAPEL, illustrating ASR components and which language would be spoken
by the system and user at each point in the interaction . . . . . . . . . . . . . . . . . . . . 160
5.6 The Choctaw Nation reservation is located in the southeastern corner of Oklahoma.
Figure from Google Maps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.7 The Choctaw Nation has 12 districts. Durant is located in District 9, Broken Bow in
District 2, and Idabel in District 1. Figure from Wikimedia. . . . . . . . . . . . . . . . . . . 161
5.8 Total recording durations per condition (Mono versus code-switching) and group (L1
versus L2) for prompt responses. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.1 Masheli CNO IRB approval letter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
6.2 DAPEL CNO IRB approval letter page 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.3 Masheli USC IRB approval letter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
6.4 DAPEL CNO IRB approval letter page 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.5 DAPEL CNO IRB approval letter page 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.6 DAPEL USC IRB approval letter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Anumpa Nushkobo
Holisso ilVppVt Miliki anumpa ti̱kba kVnia ya̱ nana achukmalechit anonti imanukla onochi hosh anumpa tuklo anumpuli aiimVlhpesa ya̱ pisa. AkmVt aiimVlhpesa mVt Na Hullo micha Chahta anumpa halVllit i̱sha chi̱ ka̱ yVmmako̱ ti̱kba pisa. Anumpa tuklo ittimanumpuli ya̱ aiyVmohmi aiimoma anumpa lawa yakni a̱sha ya̱ himak aiimVlhpesa ya̱ anumpa inlachi ka̱ anumpa ittimanumpuli yVt ikshot isht ia. Himak aiimVlhpesa yVt anumpa achukmVt abVchi kVt kaniohmihchi hosh anumpuli ya̱ il onachi kiyo.
AiimVlhpesa nana kaniohmi micha imomaka ya̱ achukmalechi hosh kaniohmihchit anumpa inla aiVlhpesa anonti anumpa tuklo anumpula hinla ka̱ ittilauichit apesVchi ho̱ ittimanumpuli ya̱ holisso yVt achukmVt pisa. Na ponaklo anukcheto ya̱ achukmVt pisa li. Code inlachi yohmikma yVmmVt achukma hosh inshalichit ikhVna hinla anonti okla ittimaiikhVna hinla kVt chaha cho ittimanumpuli apesa cha kaniohmi hosh okla falama, keyukmVt atuklant yVmmihchi ka̱ ishit pesa cha anumpa kVnia ya̱ audio nana kia hVsh ikbit hVsh ilatomba ho̱ anumpa yVt kVnia chi̱ kiyo. AkmVt nana achakVlechi li kVt okla ittibai achVVt nana ikhVna anonti anumpa tuklo anumpuli Vhleha pa̱ hosh anumpa tikba anumpuli kVnia isht Vtta Vhleha ya ikbi.
Imanukla onochi yVt anumpa tuklo anumpuli afehna ya̱ il achukmalechi kVt holisso itibafoka ilVppVt ti̱kba aieninchi. Ittatu̱klo kVt anumpa ilatomba micha anumpa isht hlampkochi yVt hina himona atahli micha Chahta anumpa ya̱ holisso aia̱hlichit ibafoki hosh anumpa yVt kVnia hinla ka̱ hlampkochi micha aiimVlhpesa ikhana anonti abVchi ibafoka ya̱ atukkuchi. AiimVlhpesa anumpa achVa ya̱ aiimVlhpesa anumpa tuklo yVt isht Vtta Vhleha ya̱ anukcheto kVt i̱shalichi hosh kanihmi fehna ka̱ imomakachi hosh imanukla onochi ya̱ pisachi. Anumpa ti̱kba kVnia ya̱ achunachit ilatomba cho anumpa ikhVna achukmalecha hinla ka̱ hoyo fehna hosh chahachi kVt anumpa tuklo ittimanumpuli yohma hinla. Atu̱kla ka̱ holisso yVt anumpa tuklo anumpuli ya̱ ibafoka itVnnali kVt nana afehna anoli, nan aiikhVna aiVlhtokowa micha nana ik aiyo mVt anumpa tuklo ittimanumpuli ya̱ ibafoka. TikshVneli, anumpuli ikhVna anonti anumpa ti̱kba anumpuli, Chahta – Na Hullo anumpa lawa hosh himona chi̱ ka̱ nana itahoba ho̱, Chahta ya̱ Vmmona anumpa isht ikhVna ya̱ toksVli ilVppVt polanka Chahta anumpa anukcheto micha anumpa imma technology ikhananchi. Nan aiikhVna aiyVmohmi ya̱ anumpa ti̱kba micha aianukcheto akanlusi inla ya̱ holisso pVt anowa hlampkochi hosh a̱hlichi.
Abstract
This dissertation explores the development and application of bilingual dialogue systems, focusing specifically on systems that support English and Choctaw, an endangered American Indigenous language. Bilingual dialogue systems are critical in facilitating more natural and inclusive interactions for the many bilingual users worldwide, yet current systems often fail to accommodate linguistic features of bilingualism,
such as code-switching.
The dissertation investigates dialogue systems that manage unbalanced bilingualism and appropriate
code-switching, improving user experience and system performance. I explore research questions such as
whether code-switching leads to higher rapport or greater learning gains, or enhances interactions for collecting endangered-language audio data. Additionally, I address the sociocultural and linguistic challenges of
developing conversational agents for endangered Indigenous languages.
The contributions of this dissertation include, first, the development of two major bilingual applications. Both applications meaningfully add to the documentation of the Choctaw language and provide new avenues for language revitalization and preservation. The results of experiments with these applications demonstrate that bilingual dialogue systems serve some user populations better than monolingual systems. The research highlights the potential for bilingual dialogue systems to be used for language learning or preservation efforts for endangered Indigenous languages. Finally, this work introduces several Choctaw language resources and language-based technologies: a multimodal corpus, the first automatic speech recognition system for Choctaw, novel bilingual Choctaw-English corpora of fluent and learning speakers, and dictionaries. This dissertation demonstrates a path for revitalizing other low-resource and
Indigenous languages through the use of computational methods.
Chapter 1
Introduction
This dissertation investigates bilingual dialogue systems, specifically dialogue systems that speak English
and Choctaw, an American Indigenous language. Bilingualism is the ability to speak two languages. Dialogue systems are computer programs that a human user can carry out a conversation with. Bilingual
dialogue systems are thus systems that can speak two or more languages within a single conversation.
Many people can speak more than one language (bi- or multi-linguals). Often this manifests as speaking
and understanding more than one language within the same dialogue (called code-switching). Bilingual
dialogue systems are important to consider and implement because many people in the world are bilingual and use both languages within a single conversation. Programs that could respond to a user in several
languages, and potentially within a single utterance, would allow more users to benefit from and interact with
dialogue systems comfortably. The focus of the dissertation is on the creation and evaluation of dialogue
systems that improve the user experience for different applications.
Dialogue systems rely on natural language processing (NLP) techniques, such as natural language understanding (NLU), natural language generation (NLG), automatic speech recognition (ASR), and text processing. Natural Language Processing is a subfield of Artificial Intelligence (AI). NLP enables computers and digital devices to recognize, understand, and generate text and speech by combining computational linguistics—the rule-based modeling of human language—together with statistical modeling, machine learning (ML), and deep learning (Jurafsky 2000).
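As a concrete, deliberately simplified illustration of the NLU, dialogue policy, and NLG stages just described, the following Python sketch wires the three components together for a bilingual greeting exchange. The rules, names, and canned responses are toy assumptions of mine, not components of any deployed system; halito is the Choctaw greeting.

```python
# Minimal sketch of a dialogue-system pipeline: NLU -> policy -> NLG.
# All rules and responses are hypothetical toy examples.

def nlu(utterance: str) -> dict:
    """Map a user utterance to an intent frame (toy keyword rules)."""
    text = utterance.lower()
    if "halito" in text or "hello" in text:   # "halito": Choctaw greeting
        return {"intent": "greet",
                "language": "cho" if "halito" in text else "eng"}
    return {"intent": "unknown", "language": "eng"}

def policy(frame: dict) -> str:
    """Choose a system action from the NLU frame."""
    return "greet_back" if frame["intent"] == "greet" else "clarify"

def nlg(action: str, language: str) -> str:
    """Render the chosen action in the user's language."""
    responses = {
        ("greet_back", "eng"): "Hello! How can I help?",
        ("greet_back", "cho"): "Halito!",
        ("clarify", "eng"): "Sorry, could you rephrase that?",
        ("clarify", "cho"): "Sorry, could you rephrase that?",
    }
    return responses[(action, language)]

def respond(utterance: str) -> str:
    frame = nlu(utterance)
    return nlg(policy(frame), frame["language"])
```

In this sketch, responding in the language of the user's greeting (e.g. `respond("Halito!")` yields the Choctaw reply) is the simplest possible stand-in for the language-selection decisions that the rest of this dissertation studies.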
The current state of the art includes singular systems that can speak multiple languages, such as Amazon Alexa, which can respond to queries in English or Spanish, among other languages. However, the
state of the art does not fully support the use of multiple languages within a single utterance. This can
lead to unnatural and less enjoyable interactions for bilinguals when interacting with dialogue systems
(Y. J. Choi, M. Lee, and Sangsu Lee 2023). Furthermore, technology that supports Indigenous languages is
limited, with no current state-of-the-art systems for Choctaw.
While there are many challenges to the NLP part of code-switching systems, Sitaram et al. (2019) note, “a code-switching intelligent agent has to be more than just the sum of parts that can handle code-switching. To build effective systems that can code-switch, we will also have to leverage the work done in
sociolinguistics to understand how, when, and why to code-switch.” In sum, novel dialogue systems and
policies must be developed to build systems that accurately reect how, when, and why to switch between
languages.
1.1 Goals
The goals of this dissertation are threefold. First, this dissertation will explore the relatively new field of code-switching bilingual dialogue systems. The findings will meaningfully contribute to research on
dialogue systems and human-computer interaction.
This dissertation will also address Natural Language Processing (NLP) challenges around working with
low-resource, endangered Indigenous languages.
Finally, this dissertation addresses the sociological goals of preserving, documenting, and revitalizing
a low-resource, endangered American Indigenous language through computational techniques.
1.2 Choctaw
The Choctaw language is spoken by the Choctaw tribe. The Choctaw language is considered endangered
(Simons and Fennig 2018a). This makes it important and pressing to document the language as it is losing
speakers over time. I am personally interested in Choctaw as I am an enrolled member of the Choctaw
Nation of Oklahoma, and so Choctaw is my ancestral language.
Choctaws originally resided in the southeast of the United States, in what is today Alabama and Mississippi. In the early 1830s, the Choctaws were forcibly relocated by the US government to Oklahoma in
the migration known as the Trail of Tears (Minges 2001).
Today there are three federally recognized Choctaw tribes¹: Jena Band of Choctaw Indians (in Louisiana),
Mississippi Band of Choctaw Indians, and The Choctaw Nation of Oklahoma. Speakers are concentrated
primarily in Mississippi and southeastern Oklahoma (Ulrich 1993), but also in Alabama, Louisiana, and
California (shown in Figure 1.1).
The Choctaw language belongs to the Western Muskogean language family. This language family encompasses several extant and extinct languages and includes the following still spoken languages:
Choctaw, Chickasaw, Alabama, Koasati, Hitchiti, Mikasuki, and Creek (Haas 1979)². Features of this language family include subject-object-verb sentence order and noun-adjective order in adjectival phrases,
among other characteristics.
¹See the Federal Register: https://www.federalregister.gov/documents/2017/01/17/2017-00912/indian-entities-recognized-and-eligible-to-receive-services-from-the-united-states-bureau-of-indian
²A description of relationships between these languages and loan words is given in (Campbell 1997). Details of how these languages are split into divisions, a history of those divisions, and some linguistic affiliations are described in (Haas 1979).
Figure 1.1: The Choctaw tribe used to be based in Mississippi and Alabama (states shown in red), but today,
they are dispersed across the US, such as in Louisiana and California (states shown in green) as well.
For many speakers in Oklahoma (R. S. Williams 1999), Choctaw is their second language, and revitalization efforts have worked to establish language courses at local schools, universities, and online. Choctaw
is spoken by all ages in Mississippi but is losing speakers over time (Simons and Fennig 2018a).
While sources agree that there are dialect variants within Mississippi (Nicklas 1972; Broadwell 2005;
Broadwell 2006), it is unclear whether and to what extent those variants carried over to places where
Choctaws settled after the Trail of Tears. In this work, I will call the different regional versions of Choctaw “variants”. Broadwell (2006) identifies four present-day regional variants: Mississippi Choctaw,
Oklahoma Choctaw, Louisiana Choctaw, and Mississippi Choctaw of Oklahoma; the latter is spoken by
Choctaws who live in The Chickasaw Nation in Oklahoma and are believed to have been relocated there
from Mississippi in the early 1900s. Overall, the literature notes that regional variation in Choctaw is
fairly minor, with some variation in phonetic detail (Ulrich 1993) and a small number of lexical differences (for example, the word for “onion” is typically hato̱fVlaha in Oklahoma and shatshonna in Mississippi).
Orthography, however, is a large source of variation between the Choctaw variants. Additional description
of the language and some of the challenges of computationally processing it are given in Section 2.5.
1.3 Bilinguals and Bilingualism
The definition of bilingualism that I use in this work is the act of knowing, understanding, or communicating in at least two languages in a written or spoken form. A bilingual is a person who is capable of using
two languages in some capacity (Grosjean 2013).
At the time of this writing, sources indicate there are more than 7,000 languages in the world (Gordon Jr
2005). Although many countries have established an official language, a single country can be the home of
multiple language groups and lingua francas. For example, over 400 languages are spoken in India. People
of different linguistic backgrounds living in the same geographical place might speak multiple languages
when communicating with one another. Other reasons why a person might speak multiple languages
could be due to immigration for employment, education, marriage, or multilingual family settings. To the
best of my knowledge, no studies have been conducted on how many people in the world are bilingual;
however, it is thought that the population of the world that speaks more than one language is larger than
the population of monolinguals (Tucker 2001) and bilingualism exists across all age groups, levels in society,
and countries of residence. In the United States, it is estimated that around 18-20% of the population, or
approximately 55 million Americans, might be bilingual (Grosjean 2013).
1.3.1 Types of bilinguals
A typical way of discussing a speaker’s languages is to refer to the languages by order of acquisition. I
use L1 to refer to the speaker’s first learned language, or “mother tongue”, which is learned in childhood, and L2 to all languages learned in adulthood. The L1 does not necessarily mean that this is the speaker’s dominant language or that the language is the one in which the speaker is most proficient. A speaker can have multiple L1s, as the speaker might have been exposed to multiple languages early in their life. L2 languages are typically differentiated from L1 languages by the age of acquisition as they are learned later
in life (Krashen, Long, and Scarcella 1979).
The stereotyped image of bilingualism is of a person speaking both languages fluently, with perfect knowledge, and without a foreign accent in either language. These speakers do exist and are called balanced bilinguals. However, the reality of many bilinguals is demonstrating unbalanced bilingualism, or having greater knowledge and ability in one of the languages; in other words, having a dominant language. Greater knowledge or ability could be in a general sense, where the bilingual has a larger vocabulary or sounds more like a monolingual speaker across contexts in one language. It could also be in a modality-dependent sense, where an individual is fluent at speaking but shows a different ability level in writing. It could also be domain-specific, such as always speaking German at home but English at school (Grosjean
2013) or always using German to talk about domestic tasks but English to discuss science or math topics.
An additional stereotype of bilingualism is that a bilingual always exhibits a stable level of bilingualism. Literature indicates that a bilingual never fully forgets a language (Brixey 2013), but the level of fluency in a language can wax and wane over a lifetime. Such changes could happen, for example, when a dormant language becomes a language that is used daily.
There are numerous attitudes about how code-switching is perceived. In some cases and across language communities, code-switching has traditionally been stigmatized by the community and possibly
the individual speakers themselves (Low and Lu 2006). However, some communities view code-switching
favorably (Yim and Clément 2021). One indication of how ubiquitous the mixing of languages might be for
a community is in the case of languages that have portmanteaus, such as "Spanglish" for the mixing of
Spanish and English, and "Choclish" for the mixing of Choctaw and English (Kickham 2015). As Y.J. Choi,
M. Lee, and Sangsu Lee (2023) note in interviews with bilinguals, people tend to code-switch with people
they do not expect to be negatively judged by or with whom they feel close. They noted that participants
needed to feel a level of trust that they would not be negatively judged by their interlocutor and that “The
close relationship built upon similar linguistic and cultural backgrounds formed high trust and intimacy.”
1.3.2 Code-switching frameworks
A bilingual’s use of more than one language is called code-switching, and this can be inter-sentential or intra-sentential—switching across sentences or within a sentence, respectively.
Other definitions prefer the term “code-mixing” in contrast to “code-switching”. Code-switching, in these definitions, means to alternate between two or more languages in a single conversation, while code-mixing is the practice of mixing different types of languages in a single utterance, specifically on social media (Thara and Poornachandran 2018). Others use the terms interchangeably (see Bawa, Khadpe, et al. 2020). As the use of these terms and definitions is not standardized within the field, I will use the term “code-switching” throughout this dissertation for all acts of alternating between languages and will use inter-sentential versus intra-sentential, intra-turn versus inter-turn, and intra-word to indicate where the code-switching occurs. As an additional note, it is possible to be bilingual in different dialects of the same language, but this usage will not be considered in this work. I do not make a distinction as to what modality the code-switching occurs in.
Single noun, or noun phrase, switches are the most common code-switching item in reviews of code-switching corpora (Poplack 2000). This is where a single noun word in the second language is inserted into the base form of the first (example 1). Switching can also happen at a major constituent (example 2).
1. From (Poplack 2000): ESTAMOS COMO MARIDO Y woman.
(We are like man and wife.)
2. From (Gingràs 1974): Tell Larry QUE SE CALLE LA BOCA.
(Tell Larry to shut up.)
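The inter- versus intra-sentential distinction above can be made operational with token-level language tags, as produced by a token-level language identifier. The following sketch is my own toy illustration (the tag names and the classification rule are assumptions, not a standard algorithm):

```python
# Toy classifier: given per-token language tags for each utterance,
# label the dialogue excerpt as monolingual, inter-sentential, or
# intra-sentential code-switching.

def switch_type(utterances: list[list[str]]) -> str:
    """Each utterance is a list of language tags, one per token."""
    per_utt = [set(tags) for tags in utterances]
    if any(len(langs) > 1 for langs in per_utt):
        return "intra-sentential"      # mixing inside one utterance
    if len(set().union(*per_utt)) > 1:
        return "inter-sentential"      # switching only at utterance boundaries
    return "monolingual"

# Example 1 above ("ESTAMOS COMO MARIDO Y woman") mixes inside one
# utterance, so its tag sequence classifies as intra-sentential:
print(switch_type([["spa", "spa", "spa", "spa", "eng"]]))
```

A real system would of course need a trained token-level language identifier rather than gold tags, but the decision rule itself stays this simple.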
Bilingualism as a linguistic phenomenon will be discussed in more detail in Section 2.1.
1.3.3 Bilingualism in conversation
There are numerous reasons why bilinguals might demonstrate their bilingualism through code-switching
in a given dialogue. One reason is due to interference from the language not being spoken. This could be
related to proficiency, or switching to the other language could simply reduce the psychological load and effort required for sentence planning. It could also be accidental, which can happen under conditions of
stress or tiredness (Meuter 2009). Another possibility is trying to align with the community that speaks
the language. It could be to demonstrate their identity, to alter the speaker’s status with their interlocutor,
or to enhance an interpersonal relationship (Grosjean 2013).
Additionally, it has been found that women tend to do more intra-sentential switches than men (Poplack
2000). Furthermore, Poplack (2000) also determined that not all bilinguals will intentionally code-switch,
and some may view code-switching negatively, believing that doing so diminishes or dilutes a language
or identity.
1.4 Dialogue Systems
A dialogue system is a computer program that a human user can communicate with and be understood by.
Popular examples of dialogue systems are Siri or Amazon Alexa³. Many dialogue systems are monolingual,
and only a few engage in multiple languages. The majority of the state-of-the-art systems, such as Siri
and Amazon Alexa, code-switch between turns, demonstrating inter-sentential code-switching, and the
system is equally proficient in both languages, demonstrating balanced bilingualism. A chatbot is a type
of dialogue system. It is software that uses human language to converse with a partner (Shawar and Atwell
2007b).
³These systems do speak multiple languages, but it is not known if the different languages share a common language model, dialogue policy, or speech recognizers (ASRs).
However, research indicates that these systems are often unsatisfactory for bilingual users. Y.J. Choi,
M. Lee, and Sangsu Lee (2023) interviewed bilinguals about their experiences interacting with existing
bilingual dialogue systems. Users reported that they often found that the bilingual dialogue systems behaved more like two monolingual systems stitched together and lacked bi-cultural knowledge. Users also
noted that the agents were inconsistent in their personas in the different languages. For example, one user
(P16) remarked, “I often switch my language setting because I work both in Korean and English. But the
agent has a very calm male voice in Korean, whereas a cold female voice in English. Although I’m the
same person, I feel like I’m talking to two different systems separately.”
In the same study, users expressed hope that the systems would code-switch in ways similar to their
own, and said they were constantly analyzing the system for similarities. Some interviewees indicated that
if the agent’s multilingual and multicultural abilities were similar to their own, they would feel accepted
and think their code-switching was “being acknowledged as an effective communication tool in the community rather than evidence of low language proficiency”. However, most users in the study found that
because of constant failures and challenging experiences, they were “convinced they were not designed to
be the primary users of the conversational agents.”
1.5 Thesis statement
Bilingual conversational systems are important to research as they play a significant role in how bilingual users interact with technology; however, current systems often fail to meet their needs. Although state-of-the-art systems demonstrate balanced bilingualism and inter-sentential code-switching, they frequently fall short in cultural integration, in persona consistency across languages, and in support for the unbalanced bilingualism that might relate better to a user. Bilingual users perceive these systems as disconnected,
functioning more like two separate monolingual systems rather than a seamless, unied experience. This
lack of bi-cultural knowledge and failure to recognize code-switching as a valid communication tool leads
to frustration and alienation among bilingual users. Understanding and improving these systems could
help create more inclusive, natural, and culturally aware interactions. Additionally, few conversational
systems have been developed in an endangered American Indigenous language, omitting an additional
population of bilingual users.
Thus, there is a need to develop new systems in order to address these issues. To ensure that bilingual
dialogue systems will be useful, I explore and address the following questions in this dissertation:
• O-1: Could dialogue systems lead to useful applications, such as for language learning and language
preservation? To what degree can bilingual dialogue systems facilitate this process?
• O-2: What code-switching strategies can lead to an increase in learning when interacting with a
bilingual chatbot?
• O-3: What code-switching strategies can lead to an increase in recorded audio when interacting with
a bilingual dialogue system?
• O-4: Will users show a higher preference for, enjoyment of, or rapport with a code-switching system?
• O-5: Will users be comfortable using two languages in a conversation but only communicating in
one with a dialogue system?
My hypothesis is that the appropriate use of code-switching in bilingual systems for endangered Indigenous languages can lead to a better user experience and more productive interactions than in a monolingual system. I explore this hypothesis in two applications. The first is a text-based chatbot named Masheli, to help language learners gain conversational fluency in an endangered Indigenous language. I
hypothesize that code-switching would lead to a greater increase in user enjoyment and learning. The
second is a mobile app named DAPEL that records endangered Indigenous languages through spoken dialogue. I hypothesize that users will speak more with the code-switching chatbot, leading to longer recorded
audio. I also hypothesize that users will prefer the code-switching system over the monolingual one.
1.6 Contributions
This dissertation includes the following contributions.
1. Methods for the creation of other Indigenous language technology, detailed throughout as well as in
Chapter 3.
2. A Choctaw automatic speech recognition (ASR) system to facilitate spoken dialogues, discussed
in Chapter 3. As no ASR systems previously existed for Choctaw, this is a novel and meaningful
contribution for a low-resource American Indigenous language.
3. A corpus of bilingual data of Choctaw-English, created through the experiments in Chapter 4 and
Chapter 5. The corpus includes text from learning Choctaw speakers and recorded audio of intermediate- to fluent-level speakers.
4. The two studies presented in Chapters 4 and 5 focus on the Masheli and DAPEL applications. These
studies assess user experience by comparing a code-switching application with a monolingual counterpart. They also evaluate the effectiveness of bilingual systems across populations.
5. The Masheli chatbot system, discussed in Chapter 4. The chatbot will be useful for Choctaw language
learners if it is deployed publicly. The system could be adapted and deployed for other endangered
and American Indian languages to support language revitalization efforts.
6. The DAPEL system, presented in Chapter 5, contributes to preventing language loss because if there
is language documentation, a language community has some means to revitalize and reclaim its
language. The app will be useful for future documentation efforts for language preservation practitioners and language community members. The app is intended to be designed in such a way that
it could be deployed to collect audio data on any endangered language by an individual with little
Computer Science or Linguistics training.
1.7 Publications and applications resulting from this dissertation
The following is a list of publications and presentations resulting from this dissertation.
1. Jacqueline Brixey, Eli Pincus, and Ron Artstein. "Chahta anumpa: A multimodal corpus of the
Choctaw language". In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC). 2018.
2. Jacqueline Brixey and Ron Artstein. "ChoCo: a multimodal corpus of the Choctaw language" In:
Language Resources and Evaluation. Springer, Volume 55 Issue 1, pages 241-257. 2021.
3. Jacqueline Brixey and Ron Artstein. "ChoCo: a multimodal corpus of the Choctaw language" Poster
presentation at: UNESCO - Language Technologies for All (LT4All). 2019.
4. Jacqueline Brixey, David Sides, Timothy Vizthum, David Traum, and Khalil Iskarous. "Exploring a
Choctaw language corpus with word vectors and minimum distance length". In: Proceedings of the
Twelfth International Conference on Language Resources and Evaluation (LREC). 2020.
5. Jacqueline Brixey and David Traum. “Masheli: A Choctaw-English Bilingual Chatbot”. In: Conversational Dialogue Systems for the Next Decade. Springer, 2021, pp. 41–50. *Best video award
6. Seyed Hossein Alavi, Jacqueline Brixey, and David Traum. “Can we use a spoken dialogue system
to document endangered languages?” In: Dialog for Good (DiGo). 2019.
7. Seyed Hossein Alavi, Jacqueline Brixey, and David Traum. “Can we use a spoken dialogue system
to document endangered languages?” Poster presentation at: UNESCO- Language Technologies for
All (LT4All). 2019.
8. Jacqueline Brixey and David Traum. “Towards an Automatic Speech Recognizer for the Choctaw
language”. In: Speech for Social Good (Interspeech). 2022.
9. Jacqueline Brixey and David Traum. “Developing a spoken dialogue system for the Choctaw language”. In: IWSDS. 2023.
10. Jacqueline Brixey and David Traum. “Why should a dialogue speak more than one language?”. In:
IWSDS. 2024.
Additionally, my work was disseminated in the press. The Masheli chatbot was covered in Viterbi
magazine, an award-winning video by Viterbi magazine, Biskinik (the Choctaw Nation of Oklahoma’s
newspaper), and The Washington Post.
1.7.1 Technology Transitions
At the time of writing, the Choctaw ASR is being deployed on the Mississippi Band of Choctaw Indians’
(MBCI) dictionary website⁴. It is used as an aid for speakers who may not be literate in Choctaw, or for those who are uncertain about the spelling. I discuss the challenges of Choctaw spelling in Section 2.5 and more on the MBCI project in Section 3.0.5.1.
The Masheli chatbot has been retooled to be included in Indigenous STEM curriculum in Oklahoma as
part of the Native Earth Native Sky grant⁵. More on this project is described in Section 3.0.5.2.
⁴https://apps.neh.gov/publicquery/AwardDetail.aspx?gn=PD-271355-20
⁵https://education.okstate.edu/research/centers/native-earth-native-sky/
1.8 Outline of Dissertation
The dissertation will be presented as follows. In Chapter 2, I present a comprehensive literature review
focused on dialogue systems, bilingual dialogue systems, and bilingualism as described in linguistics literature. Following this, I detail in Chapter 3 the foundational work I undertook to prepare for this research,
specifically the development of a corpus that provided the necessary data to build more sophisticated systems. Chapter 4 is about Masheli, a bilingual chatbot designed for language learning, which facilitates
conversation practice for users. Next, I describe DAPEL, a dialogue system I developed for recording endangered languages in conversation in Chapter 5, highlighting its importance for linguistic preservation.
Finally, I conclude in Chapter 6 by summarizing the contributions of this work to the fields of Indigenous
language technology and dialogue systems, as well as indicate areas for future research.
Chapter 2
Prior Work
In this chapter, I review prior work relevant to this interdisciplinary dissertation and indicate open questions for implementing dialogue systems capable of processing code-switched language in a low-resource
Indigenous language. I introduce a review of bilingualism from the linguistics literature (Section 2.1), concepts about the computational processing of bilingual data (Section 2.2), an overview of dialogue systems (Section 2.3), bilingual dialogue systems (Section 2.4), prior external work on the Choctaw language (Section 2.5), and Indigenous language technology (Section 2.6).
2.1 Bilingual linguistics
The question of what counts as a code is not easily answered. As Auer (1995) notes, it is participants’, rather
than linguists’, notions of ’code A’ and ’code B’ that determine what counts as code-switching. Ultimately,
code-switching is a tool for bilinguals to engage fully in conversation and cooperative communication in
multiple languages (Beatty-Martínez, Navarro-Torres, and Dussias 2020).
Research has demonstrated that code-switching is a widespread form of bilingual interaction frequently requiring high levels of competence (Muysken 1995). It is also not a recent phenomenon: one example of written code-switching (also called Macaronic language) appears in the Carmina Burana, from roughly the 11th century, which shows Latin mixed with Medieval German or French (Beatie 1967).
Code-switching in conversation displays regularities that can be both context-independent and context-sensitive, and the literature notes that it is not a question of whether code-switching follows structural
constraints but rather how best to characterize them (Belazi, Rubin, and Toribio 1994). Code-switching
has shed light on fundamental linguistic questions, ranging from Universal Grammar to the development
of self and group identity through verbal behaviors (Auer 1995a).
Today, code-switching is studied primarily through two lenses: structural and sociolinguistic. The former investigates the morphosyntactic patterns that underlie the grammar of switching between languages.
The latter investigates why bilingual speakers make the code-switching decisions that they do. It also attempts to answer how social meaning is created through code-switching and what discourse functions it
serves (Boztepe 2003).
2.1.1 Language features of code-switching
A stereotype of bilinguals is the expectation that they will sound like native speakers in all of their spoken
languages. It can be the case that bilinguals only sound native in one language and that some bilinguals
have a non-native sounding accent in all of their languages (Grosjean 2013).
One question in code-switching research has been whether bilinguals have one mental language processing system, different mental processing systems for each language, or partially overlapping mental
systems for their multiple languages (Kroll and De Groot 2009). Evidence indicates that neither language
is ever completely shut off in a bilingual’s mind, whether by choice or otherwise, and that both languages
will have an impact on the other (Mitchell, Myles, and Marsden 2013). Instead, a bilingual is using enhancement and suppression tactics to manage the two languages mentally (Schwartz and Kroll 2006).
Code-switching can be an accidental intrusion, such as interference due to stress or tiredness (Meuter
2009), and more proficient bilinguals will be more skilled at those enhancement and suppression tactics
(Schwartz and Kroll 2006).
Code-switching can happen at any level of speech. The two main levels of code-switching are inter-sentential switching and intra-sentential switching. Inter-sentential switching is switching between languages at sentence boundaries. Intra-sentential switching is switching between languages within a single sentence. In an
intra-sentential switch, single-word switches are called “insertions”, while multiple-word code switches
are termed “alternations” (Grosjean 2013).
There is debate in the literature about whether borrowed lexical items should count as a code-switch.
For example, “email” is typically preferred in Spanish over “correo electrónico” (literally: electronic mail),
and is usually given a Spanish inection when spoken. An additional example is “spoiler” (such as when
a person shares the ending of a movie with someone who has not seen it yet), which also gains a Spanish
inflection and is preferred over “destripe”. For simplicity, I treat borrowed lexical items as examples of
code-switching in this work.
Code-switching is often perceived as being the result of language incompetency, or in other words,
not knowing one of the languages as well. Certain types of code-switching will indeed happen because of
lower levels of competency; however, it is also true that other types of code-switching will happen because
of advanced competency. When code-switching occurs, some of the literature argues that it will follow
certain rules, some of which are syntactic requirements, and others for sociolinguistic reasons. In the
following example from Poplack (2000), the boundaries marked by slashes in the monolingual sentences
are acceptable code-switching points. English is shown in capitalized letters in the spoken line.
1. Spoken: I TOLD HIM THAT pa’que la trajera ligero.
English: I/told him/that/so that/he/would bring it/fast.
Spanish: (Yo)/le dije/eso/pa’que/(é)l/la trajera/ligero.
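Under the equivalence-constraint view this example illustrates, a switch point is acceptable only where both monolingual sentences have a constituent boundary. A minimal sketch, with boundary positions given as illustrative token indices rather than real parser output:

```python
def valid_switch_points(boundaries_l1, boundaries_l2):
    """Acceptable code-switch points are the boundary positions shared
    by both languages' segmentations (equivalence constraint)."""
    return sorted(set(boundaries_l1) & set(boundaries_l2))

# Hypothetical boundary indices for two segmentations of one sentence:
english = {1, 2, 3, 5, 6}
spanish = {1, 3, 4, 5, 6}
print(valid_switch_points(english, spanish))  # [1, 3, 5, 6]
```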
The literature indicates that balanced bilinguals are more likely to alternate after clauses, such as this
example from Ahn et al. (2020):
2. Spoken: ¿Tienes algún amigo THAT STUDIES LINGUISTICS?
English: Do you have a friend that studies linguistics?
Spanish: ¿Tienes algún amigo que estudie linguistics?
At the other end of the spectrum, unbalanced bilinguals are more likely to do word insertions, and
typically do noun insertions, such as the following example from Ahn et al. (2020):
3. Spoken: ¿Tienes algún friend que estudie LINGUISTICS?
English: Do you have a friend that studies linguistics?
Spanish: ¿Tienes algún amigo que estudie linguistics?
2.1.2 Languages of code-switching
To the best of my knowledge, and as evidenced by the diverse examples in the relevant literature, code-switching can occur with all possible language pairs, even with languages extremely distinct from one
another.
However, some code-switching pairs may be more visible than others. Attitudes towards a language or the community that speaks it, or the level of prestige assigned to a language, may account
for less public use of that language, including in the use of code-switching (Cenoz and Gorter 2017).
Other factors, discussed in Section 2.1.4, may also account for why a language is used in a code-switch,
such as due to the content of the discourse, the function of the interaction, age, socio-economic status,
power relation, formality, and if there is already a preferred language established with the interlocutors
(Grosjean 2013).
2.1.3 Code-switching frameworks
There are numerous frameworks (see Belazi, Rubin, and Toribio 1994 for a brief overview); however, the two
main linguistic frameworks that tend to be used to computationally process and generate code-switched
utterances are (1) matrix-embedded language framework (MLF), and (2) equivalence constraint framework
(ECF). There are strengths and weaknesses to each framework; for example, some switches might be disallowed in certain frameworks but still occur in bilinguals’ speech.
The theory of MLF states that a “matrix” is the frame for a sentence’s language, which governs all or
most of the grammatical morphemes and the word order (Poplack 2000). The code-switching elements are
the “embedded” language in the matrix. A sentence could have the grammatical structure of one language
but use only lexical items from another. This framework argues that there is an asymmetrical relationship
between the dominant matrix language and the subordinate embedded language (Myers-Scotton 1997).
There are varying ideas on how to determine which language is in the matrix role. One idea is “discourse-oriented”, in other words, the main language of the conversation. A statistical approach might be the
language appearing the most frequently. A structural approach suggests it is based on the main verb,
although this is imperfect if the language has a readily available strategy to incorporate outside words, such
as in Swahili or Hindi. In my corpus of Choctaw conversations (see Section 3.0.2), students learning Choctaw would
frequently code-switch noun phrases within the matrix of a Choctaw sentence. For example, “Breakfast
chompa li tok.” (I bought breakfast.) In the following example, the speaker is speaking in Choctaw, but has
the noun phrase structure from English (shown in capitalized letters), as Choctaw places the noun first,
and then the descriptor, while English does the opposite. A grammatical form of this sentence is shown in
the last line in the example.
4. Spoken: Himona kaa sV bVnna. HIMONA CHUKKA sV bVnna.
English: I want a new car. I want a new house.
Choctaw: Kaa himona sV bVnna. Chukka himona sV bVnna.
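The “statistical approach” mentioned above can be sketched as a simple majority count over per-token language tags. The (word, lang) input format is an assumption for illustration, and a discourse- or verb-based criterion could replace this heuristic.

```python
from collections import Counter

def matrix_language(tagged_tokens):
    """Statistical MLF heuristic: the matrix language is the one
    supplying the most tokens in the utterance."""
    counts = Counter(lang for _, lang in tagged_tokens)
    return counts.most_common(1)[0][0]

# "Breakfast chompa li tok." -- three Choctaw tokens frame one English
# noun, so Choctaw is identified as the matrix language.
print(matrix_language([("breakfast", "en"), ("chompa", "cho"),
                       ("li", "cho"), ("tok", "cho")]))  # cho
```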
The theory of ECF states that in a well-formed code-switched sentence, switching can occur at any
point where the grammatical constraints of both languages are satisfied, which will tend to be
at points where the juxtaposition of L1 and L2 elements do not violate a syntactic rule of either language
(Poplack 2000; Boztepe 2003). This could be at a conjunction point, which was observed in a corpus of
Punjabi-English code-switching (Gardner-Chloros and Edwards 2004). The literature claims that violations
tend to be very rare but do occur. An example of a violation is in the following Choctaw example (Example
5) where the switch happens on the object, basketball. The sentence is grammatical in English syntax, since
English is SVO; however, the sentence is not grammatical in Choctaw. The syntax of Choctaw is SOV, but it
is only explicit subjects (such as "the dog") that are placed at the beginning of the sentence, whereas subject
pronouns are always immediately next to verbs. MLF possibly better explains this code-switch example,
since if the matrix of the sentence is treated as English with a Choctaw embedding, then the word order
is acceptable.
5. Spoken: I KNOW BASKETBALL ato towa toli.
English: I know basketball is dribbling the ball.
Choctaw: Basketball ato towa toli ikhana li.
The literature on ECF also notes that it is unusual to code-switch at morpheme boundaries. For example, one study found that no Spanish-English bilinguals said code-switches such as “eat-iendo” (eating
in English, or comiendo in Spanish), a blend combining an English verb base with a Spanish morpheme
(Poplack 2000). However, the following example of French-English code-switching from Gardner-Chloros and Edwards (2004) was observed.
6. Spoken: Je SUNBATH-ais.
English: I was sunbathing.
French: Je prenais un bain de soleil.
A Choctaw example might be “balili-ed” (ran); however, I did not find any examples using verbs in my
corpus (see Section 3.0.1). Rather, a frequent example of Choctaw code-switching observed in my corpus is
switching between a possessive pronoun and noun. In the Mississippi orthography, the possessive pronoun
(a̱) is attached to the beginning of the word, such as “a̱tek” (my sister), while Oklahoma orthography places
a space between the pronoun and noun, “a̱ tek”. I saw numerous examples of “a̱ sister” and “a̱ mother” in
my ChoCo recordings of spontaneous conversational Choctaw speech.
However, other literature argues that while many examples align with the frameworks outlined above,
there are nevertheless code-switching examples that fall outside that theoretical grammar. For example,
Muysken (1995) argues that there does not need to be a base language at all; rather, each part of speech will
have its own matrix structure. Gardner-Chloros and Edwards (2004) state that code-switching frameworks
“can fall into the alternative trap of idealizing – and hence artificially restricting – [code-switching] itself”,
and that following a specific grammar, as the frameworks do, fails to fully explain code-switching
behaviors. They indicate instead that sociolinguistic factors frequently override ‘grammatical’ factors,
and note that speakers will use pauses, interruptions, and reformulations when speaking to help neutralize
possible grammatical awkwardness.
The various frameworks do converge on several factors. First, they all view these bilinguals as rational
actors making linguistic choices that exploit their linguistic repertoires to achieve their goals (Myers-Scotton 1998). Second, there is evidence that the level of typological similarity between the two languages
being switched will lead to different tactics being used by the speaker, which result in different morphosyntactic outcomes (Gardner-Chloros and Edwards 2004).
2.1.4 Purpose of code-switching
A lack of proficiency is the typical reason people think of to explain code-switching. It is certainly true that
if a language is used in a very limited way, such as for only a few domains or with only certain interlocutors,
then the speaker will potentially show less fluency in that language (Poplack 2000). This might be the case
for a heritage speaker, which is a speaker who speaks one language exclusively at home amongst family
members but speaks a second language at school and/or in professional settings. If a bilingual typically
does not discuss a certain domain in a given language, then there is a good chance that the bilingual will
not show the same lexical variety that a speaker who does would demonstrate. The heritage speaker, for
example, might be more proficient at discussing biology in their school-based language, as that is the language in which they learned the topic, but show less proficiency when discussing the same topics in their at-home language.
The heritage speaker might resort to borrowing from the other language to fill in gaps in the language not
typically used to discuss this domain.
However, bilinguals acquire and use their languages to serve different purposes, to be functional in different domains of life, and to carry out conversations with different people. The choice of language can
be influenced by four factors: participants, situation, content of discourse, and function of the interaction
(Grosjean 2013). Other motivating factors could be political arrangements, relations of power, language
ideologies, and/or the interlocutor’s views on their and others’ identities (Pavlenko and Blackledge 2004).
While not all bilinguals will regularly elect to code-switch (Grosjean 2013), those who do might exhibit
usage patterns conforming to community-based norms (Beatty-Martínez, Navarro-Torres, and Dussias
2020). Community-based norms might be established in bilingual communities with differing traditions
of language separation. For example, insertion code-switching might occur more often in communities
“where bilingual proficiency is asymmetric (e.g. colonial or recent migrant settings)” (Gardner-Chloros
and Edwards 2004). Although there may be overarching community norms, there are nevertheless individual code-switching decisions that might depend on the speakers’ age, education, and social background
(Gardner-Chloros and Edwards 2004).
Code-switching might also be a part of speech accommodation, a theory that speakers might adjust
their style of speech as a way to express their attitudes or intentions toward their interlocutors (Giles
1979). Convergence would be the act of shifting towards the interlocutor. In this case, code-switching
might be used to indicate membership in the same group, which could potentially decrease the sense of
social distance between the speakers and convey a sense of solidarity. Divergence would be the act of
Figure 2.1: Summary of reasons to code-switch (References include: Giles 1979; Boztepe 2003; Gardner-Chloros and Edwards 2004; Grosjean 2013; Poplack 2000)
creating social distance or communicating social disapproval. Code-switching in this instance might be
done to demonstrate that the interlocutor does not belong to the same group as the speaker, possibly to
demonstrate having higher status, more education, or more power (Boztepe 2003).
A summary of some of the possible reasons covered in this section for why a bilingual might code-switch in a conversation is shown in Figure 2.1.
2.2 Bilingual natural language processing (NLP)
Winata et al. (2022) systematically reviewed publications on bilingual systems. They found that at the
time of writing, many of the state-of-the-art bilingual systems were paired with English. Hindi-English
systems were the most common (111); other common pairs included Chinese-English (20), Tamil-English (37), and Malayalam-English (23). Spanish-English systems were the second most common, with 78 systems
(Winata, Aji, et al. 2022). They also found that few systems were tri- or multi-lingual.
The majority of code-switching publications work with data sets of social media data, as social media is a platform
for informal interactions where code-switching is more likely to occur. Additionally, the majority of publications on code-switching deal with the task of language identification, followed by sentiment analysis
(Winata, Aji, et al. 2022).
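Token-level language identification, the most studied task in this literature, can be illustrated with a toy lexicon-lookup identifier; the word lists below are tiny invented stand-ins for the trained models used in practice.

```python
# Toy vocabularies standing in for a trained language-ID model.
EN_WORDS = {"i", "want", "a", "new", "car", "the"}
ES_WORDS = {"yo", "quiero", "un", "coche", "nuevo", "la"}

def identify(token):
    """Tag one token as English, Spanish, or unknown by lexicon lookup."""
    t = token.lower()
    if t in EN_WORDS and t not in ES_WORDS:
        return "en"
    if t in ES_WORDS and t not in EN_WORDS:
        return "es"
    return "unk"

print([identify(w) for w in "yo quiero a new car".split()])
# ['es', 'es', 'en', 'en', 'en']
```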
2.2.1 Computational approaches
As discussed in Section 2.1.3, the main theories that systems tend to use to computationally process and
generate code-switched utterances, in order of greatest usage to least, are (1) the matrix-embedded language
framework (MLF) and (2) the equivalence constraint framework (ECF). Other frameworks are tested in the literature,
such as the Functional Head Constraint (Belazi, Rubin, and Toribio 1994), but are more limited in usage
than MLF and ECF.
The literature (Doğruöz et al. 2021; Auer and Muhamedova 2005) notes that although the first framework (the matrix language framework) dominates in computational approaches to code-switching, it is contested on empirical and theoretical grounds, as the consistent identification of a matrix language is not
always possible and the criteria for defining it are ambiguous.
2.2.2 Future directions
Additional details and citations will be given in Chapter 4, but to give a broad overview of the field, the
general trend in more recent publications has been to utilize machine learning techniques to both process
and generate code-switched utterances. These approaches include “classic” machine learning algorithms,
such as Naive Bayes, while more recently, researchers have used neural networks, including for pre-trained embeddings and language models (Winata, Aji, et al. 2022).
Figure 2.2: Task-oriented systems (From Hongshen Chen et al. 2017)
A large area of open work is that the majority of existing code-switching language technologies
do not usually take into account the linguistic and social factors that influence when and why a user would
choose to code-switch (Doğruöz et al. 2021).
2.3 Brief Overview of Dialogue Systems
Dialogue systems are the broad category of artificial intelligence that encompasses automatic human-computer conversation systems. The systems can be divided into two categories: (1) task-oriented systems
and (2) open domain systems (i.e. non-task-oriented systems).
Task-oriented systems are those that help a user to complete a task, such as booking a restaurant. The
traditional approach for building these systems is shown in Figure 2.2. The system will usually contain
natural language understanding (NLU) components, which process the text and extract meaning (Hongshen Chen et al. 2017). Dialogue state tracking is the component of the system that infers the state of the
conversation, such as the user’s goal, given all of the conversation history up to that point in the dialogue
(J. D. Williams, Raux, and Henderson 2016). Finally, the natural language generation (NLG) component
will produce the dialogue system’s response to return to the user (Ni et al. 2023).
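The pipeline just described can be sketched end to end. Every component below is a deliberately toy stand-in (keyword NLU, dictionary state tracking, rule-based policy, template NLG), not the implementation of any cited system.

```python
def nlu(utterance):
    """Toy keyword-based natural language understanding."""
    slots = {}
    if "italian" in utterance.lower():
        slots["cuisine"] = "Italian"
    return {"intent": "book_restaurant", "slots": slots}

def track_state(state, nlu_result):
    """Dialogue state tracking: merge new slots into the running state."""
    state = dict(state)
    state.update(nlu_result["slots"])
    return state

def policy(state):
    """Rule-based policy: request missing slots, otherwise confirm."""
    if "cuisine" not in state:
        return ("request", "cuisine")
    return ("confirm", state["cuisine"])

def nlg(action):
    """Template-based natural language generation."""
    act, value = action
    if act == "request":
        return f"What {value} would you like?"
    return f"Booking an {value} restaurant for you."

state = track_state({}, nlu("Book me an Italian place"))
print(nlg(policy(state)))  # Booking an Italian restaurant for you.
```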
Open domain systems are not focused on a given task and are instead open to discussing any topic
and carrying on a conversation (Hongshen Chen et al. 2017). There are recurring challenge competitions aimed at building better,
more naturalistic open domain systems, such as the Alexa Prize Socialbot Grand Challenge (https://www.amazon.science/alexa-prize/socialbot-grand-challenge).
2.3.1 Components
In earlier work, task-oriented systems were modular, with multiple components to process user input,
determine how to respond, and then produce a response. Recognition and understanding components of
a user’s input in a dialogue system rely on NLP techniques. Generation components give the system the
ability to respond to a user’s input with natural language generation (NLG) techniques (Hongshen Chen
et al. 2017). In nearly every component, a number of increasingly sophisticated neural networks have been
adopted. I refer the reader to Ni et al. (2023) and Chen et al. (2017) for a comprehensive survey of relevant
networks that have been applied.
Dialogue policies dictate how a system should respond, for both task-oriented and open domain dialogue
systems. For example, if users often like to go off-topic with a chatbot, the dialogue policy could include
suggesting on-topic options to a user after several off-topic utterances (Patel, Leuski, and Traum 2006).
In a modular system, a given dialogue policy will be triggered based on a user’s utterance, and will then
inform an NLG module what to respond (Hongshen Chen et al. 2017).
The NLG component is responsible for converting a dialogue action as indicated by a dialogue policy
into a natural language surface utterance to present to the user (Hongshen Chen et al. 2017). Older methods
relied on sentence planning, where rules and/or slots might indicate a response to be created. To provide
an example, if the dialogue policy passes “Inform (name = Wonder Woman; genre = Action; destination
= Golden Village)” to the NLG module, the NLG module will then convert it into a language representation
like, “There is an action movie named Wonder Woman at Golden Village.” (Ni et al. 2023)
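The Inform example above can be realized with a simple fill-in template; the article logic and exact wording below are illustrative, not drawn from any cited system.

```python
def nlg_inform(slots):
    """Template-based sentence planning for an Inform dialogue action."""
    genre = slots["genre"].lower()
    article = "an" if genre[0] in "aeiou" else "a"
    return (f"There is {article} {genre} movie named "
            f"{slots['name']} at {slots['destination']}.")

print(nlg_inform({"name": "Wonder Woman", "genre": "Action",
                  "destination": "Golden Village"}))
# There is an action movie named Wonder Woman at Golden Village.
```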
Open-domain dialogue systems are typically classified into three categories: generative systems, retrieval-based systems, and ensemble systems, which are a mix of the two prior types. Generative systems utilize
sequence-to-sequence models to convert the user message and dialogue history into a response sequence
that may not be in the training data. In contrast, retrieval-based systems aim to choose a pre-existing
response from a specic response set (Ni et al. 2023).
While generative systems can create flexible and contextually relevant responses, they may sometimes
lack coherence and produce uninspired replies. Retrieval-based systems select responses from human-generated sets, which can create better coherence (Sordoni et al. 2015). However, these systems are limited
by the finite nature of the response sets, and the retrieved responses may not always align well with the
dialogue context (Ni et al. 2023).
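A retrieval-based responder can be sketched with a hand-made response set and a bag-of-words overlap score standing in for the learned matching models the surveys describe; the stored message–response pairs are invented.

```python
# Invented (message, response) pairs standing in for a response set.
RESPONSE_SET = [
    ("hello there", "Hi! How can I help you today?"),
    ("what is your name", "I'm a demo chatbot."),
    ("recommend a restaurant", "There's a great Italian place nearby."),
]

def retrieve(user_message):
    """Return the stored response whose message shares the most words
    with the user input (toy bag-of-words matching)."""
    words = set(user_message.lower().replace("?", "").split())
    return max(RESPONSE_SET,
               key=lambda pair: len(words & set(pair[0].split())))[1]

print(retrieve("Can you recommend a good restaurant?"))
# There's a great Italian place nearby.
```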
In contrast to both of these approaches are end-to-end (E2E) systems. These systems can generate
responses without training of individual components (Razumovskaia et al. 2021), making them a popular
approach. Although fully E2E models are a possibility, typical approaches include limited modular training of some components in order to collapse only a portion of a pipeline (Qin et al. 2023). E2E systems
heavily rely on a variety of neural network models and large pre-trained models (Serban et al. 2016; Ni
et al. 2023).
Yet more recent work has seen the development of large language models (LLMs). One popular LLM
is ChatGPT (Brown et al. 2020). In these models, users provide questions or queries in text form for the
system to answer. Models are able to respond to many types of user input, producing text responses such
as essays, short stories, and conversational chat. LLMs are also being used for components of dialogue
systems, such as for state tracking (Feng et al. 2023).
2.4 Bilingual dialogue systems
Most dialogue systems are monolingual, and only a few engage in multiple languages, more often through
inter-sentential than intra-sentential code-switching at present. To communicate in a human language
with a human user, the dialogue system will contain recognition and understanding components, generation components, and policies for what and how the system should respond to a given user input.
It is highly likely that users will spontaneously code-switch and want to code-switch with a dialogue
system. Choi, Lee, et al. (2023) interviewed bilingual Korean users who discussed the difficulties of only
knowing certain vocabulary in one language rather than both, preferring one language for certain expressions, or having an accent in one of the languages that presented challenges with being recognized (one
user discussed having a French accent for some English pronunciations).
To support bilingual users who might engage in code-switching interactions with dialogue systems, we
need bilingual dialogue systems that can do the same. While many current systems can engage in dialogue
in more than one language, the language processing components are distinct for each language, and the
systems cannot maintain a consistent identity and dialogue state across languages, although more recent
research has started to address having a consistent voice across multiple languages (Wilcock 2023). Figure
2.3 gives an overview of the why and how of bilingual interactions with dialogue systems.
A review of how and why people code-switch in dialogue is given in Section 2.1.1 and Section 2.1.4,
respectively. The remainder of this chapter is organized as follows. In Section 2.4.1, I examine why a user would
interact with a bilingual dialogue system and the types of dialogue systems developed to meet those conversational needs. In Section 2.4.2, I consider the computational requirements needed to build a bilingual
dialogue system and review past and current research in each requirement area. The final
subsection summarizes the research space and indicates where this dissertation fits into it.
Figure 2.3: Interactions with a bilingual dialogue system considering the sociolinguistic motivations and
technical components
2.4.1 Reasons for code-switching interactions
In this first characterization of bilingual dialogue systems, I examine why a dialogue system would be
bilingual. The dimensions described in this section discuss the system’s bilingualism, the user’s bilingualism, and how the demands, needs, and expectations of both conversational members shape a bilingual
dialogue.
Some of the reasons why a user might engage in a bilingual interaction include the user’s preference,
a dialogue task that requires code-switching, the system’s unbalanced bilingualism, meeting one or several socio-psychological goals, and accommodating the user’s language proficiency.
There are several existing bilingual dialogue systems. Some popular examples include Amazon Alexa
and Siri, as well as additional systems discussed in this section. Many existing systems respond in the
language of the user’s choosing.
2.4.1.1 User preference
The first reason for a bilingual dialogue system is that the user has sufficient proficiency and chooses
to code-switch from a place of preference. In these systems, users initiated code-switching explicitly or
through language switches. One consideration is that the
user’s proficiency may not be adequate in the second language; they may nevertheless prefer to use the
given language because they are learning it and wish to practice it more. I explore the user’s
proficiency in Section 2.4.1.5.
There are several types of systems that support the user’s preference: question-answer (QA)
systems that switch between languages at the turn level; systems in which the system speaks a different language than the
user; and systems for casual conversation. I also discuss large language models in a separate subsection.
QA systems
An early spoken dialogue system is a bilingual Croatian-Slovene weather forecasting
system (Martincic-Ipsic et al. 2003). The authors note that the task of spoken language identification (see
Section 2.4.2.2 for more on this task) is difficult as both languages in the system are Slavic languages. The main
task of the dialogue system was to compile a corpus with which to build a speech recognizer. The authors
researched two approaches for the ASR: (1) two monolingual ASRs, one of which would be deployed
after the input language was first identified by a language identifier module, and (2) an ASR with a merged language model
of the two languages and shared acoustic units; the second approach was the one eventually used.
The resulting dialogue system was not tested with human evaluators to determine the acceptability of
the forecasts produced in conversation. A key point is that the system responds in the same language
as the user’s last utterance, indicative of supporting the user’s preference and proficiency, and it is not
indicated how or whether the system would respond to or produce code-switching. It is unclear what type of
code-switching the system would produce, but it is evident from the construction of the ASR that the
intention was to support intra-turn, intra-sentential user code-switching.
A second spoken dialogue system answers foreign exchange inquiries in Cantonese and English (Meng,
Steven Lee, and Wai 2000), later expanded to a trilingual system for Cantonese, Putonghua, and
English (Meng, S. F. Chan, et al. 2000). Unlike the Croatian-Slovene system above, this system expects
only inter-turn code-switching, not intra-sentential. The ASR components are separate for Cantonese and
English. To switch between languages, the user must make this request; thus, this system also supports a
user’s preference and proficiency. Responses by the system were generated from real-time data. Human
evaluators assessed the system for the acceptability of its responses. Users spent an average of 2 minutes
interacting with the system; the duration was slightly lower for Cantonese than for English. The system
could interact with the user by following a series of scripted prompts or freely, where the user could ask
any question without constraint or order. The study found that the directed conversations lasted slightly
longer than the free-form ones but only differed by a few seconds.
A proof-of-concept system capable of speaking in Swedish and English about news found on the internet was implemented in Gawronska and House (1998). English and Swedish responses for each query
were generated using language-specific templates and then converted from text to speech. The system was
not tested for user experience. The system did not code-switch within its responses, and it was unclear whether
supporting user code-switching was a planned feature of the system.
A final system is a Japanese-English robot with a knowledge base from Wikipedia (Wilcock, Jokinen,
and Yamamoto 2016). Across these QA systems, users initiated code-switching explicitly or through language switches.
System speaks a language different than the user
In Ali et al. (2018), multiple avatars were recorded, either speaking English, Arabic, or a mix. Users
could then select which avatar they interacted with. If the user selected an avatar that did not speak the
same language as them, subtitles were shown for the user. It was not clear whether code-switched user input
was possible with the system. The agent was intended to be used for conversational purposes and as a
testimony of stories.
In Al-Shawakfa and Evens (1999), the authors proposed creating an agent to connect users to a module
to learn how to use a network operating system. The system was only proposed and not built, so no
user experiences could be described. The description of the system appeared to allow for the system’s
code-switching, as the authors described how English and Arabic differ in written orientation (left to right
versus the reverse) but did not detail intra-sentential code-switching nor how the parser would cope with
this type of user input.
Casual conversation
In Bawa, Choudhury, and Bali (2018), a text-based agent wrote in English and Hindi. Users were
presented with four variants of conversations: no code-switching by either interlocutor, code-switching
used by the agent only, code-switching from the participant only, and code-switching by both interlocutors.
Participants scored the conversations on a 7-point Likert scale. Attitudes about the presence and quality
of code-switching largely depended on the participant’s attitude towards code-switching in general, as
around one-third of their participants had negative attitudes towards code-switching. They concluded
that it is beneficial to determine the participant’s attitude towards code-switching before including it in a
dialogue.
One avenue for future work that the same researchers propose is nudging (Bawa, Khadpe, et al. 2020).
A “nudge” is the inclusion of small phrases from the other language to determine the user’s attitude about code-switching. If the user replies with code-switching, it is a positive reaction to the nudge. If not, it is a
negative reaction, and additional code-switching is not included in later turns. The policy for nudges was
evaluated at every conversation turn, and the level of mixing was based on the user’s last response. If the
user did not mix above a certain threshold, the policy dictated that the system should back off on mixing.
The nudge would then occur again in a few turns. They found that their participants rated the agents that
code-switched with higher levels of conversational ability and human likeness and showed a preference
Figure 2.4: In this example from ChatGPT 3.5, the user directly asks a question using code-switching. The
question asked is “Where is the bathroom?”, to which the system gives a corrected form in Spanish. This
interaction occurred in January 2024.
for code-switching agents over monolingual ones. Additionally, they found that users preferred chatbots
that mimicked the user’s code-switching patterns.
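The back-off behavior described for nudging can be sketched as a small threshold policy. The mixing metric, threshold, and cooldown values below are invented for illustration and are not taken from Bawa, Khadpe, et al. (2020).

```python
MIX_THRESHOLD = 0.1   # assumed fraction of other-language tokens
COOLDOWN_TURNS = 3    # assumed wait before nudging again

def mixing_ratio(tokens, other_lang="es"):
    """Fraction of tokens tagged with the other language; the
    (word, lang) tag format is a hypothetical input representation."""
    return sum(1 for _, lang in tokens if lang == other_lang) / len(tokens)

def nudge_policy(user_tokens, turns_since_backoff):
    """Decide the system's next move from the user's last turn."""
    if mixing_ratio(user_tokens) >= MIX_THRESHOLD:
        return "code-switch"            # positive reaction: keep mixing
    if turns_since_backoff >= COOLDOWN_TURNS:
        return "nudge"                  # try a small phrase again
    return "monolingual"                # back off for now

print(nudge_policy([("hola", "es"), ("friend", "en")], 0))  # code-switch
```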
Large Language Models
Large language models, such as ChatGPT (https://chatgpt.com/), are often trained on multiple
languages and can easily respond to different languages between turns. These models can also respond to
a user’s request for code-switching, as demonstrated in the following examples. In Figure 2.4, the question
the user asks is “Where is the bathroom?”, to which the system gives a corrected form of how to say the
phrase entirely in Spanish and explains how to use the phrase. The system does not respond to the question
and does not code-switch.
The models do not always respond to a user’s request to code-switch, as shown in the first turn in
Figure 2.5. The first code-switched response (the second turn) demonstrates intra-turn, inter-sentential
switching. The second code-switched response (the third turn) switches to English at prepositions in both
sentences after starting in Spanish and switches back to Spanish at the comma.
2.4.1.2 Task required
A second reason a user might expect to interact with a bilingual dialogue system would be the conversational task requiring bilingualism. Dialogue systems have been implemented to support numerous types
Figure 2.5: In this example from ChatGPT 3.5, the system provides a code-switched response on the second
request from the user. This interaction occurred in January 2024.
of conversations and tasks, from casual conversations about daily life and interests to task-oriented conversations on several topics, such as booking a flight or seeking customer support (Motger, Franch, and
Marco 2022).
Most existing code-switching dialogue systems are task-oriented. The tasks in these dialogue systems
include booking use cases (restaurants, transportation, hotels), automation of customer support (e.g., in
domains like banking or telecommunications), or retrieving and providing information (e.g., in health care
or tourism) (Razumovskaia et al. 2021).
Language requirement
For some languages, specific vocabulary is borrowed from another language. In these instances, the
language requires the inclusion of these borrowings. For example, the word “email” is typically the word
used globally for electronic mail, even if a given language has created a specific noun phrase, such as correo
electrónico in Spanish. Hindi is often used as an example language for code-switching and borrowing, with
numerous borrowings from English, as exemplified in the following utterances.
• Original: VERY CUTE, bachpan main bhi FACE pe ATTITUDE hain, VERY VERY CUTE
English: (very cute, even in childhood there is that attitude on his face, very very cute)
From: (Dąbrowska et al. 2013)
• Original: Maim ais UNIVERSITY ka INTERNSHIP kar raha hoon.
English: (I am doing an internship at this university.)
From: (Dey and P. Fung 2014)
Auer (1995) indicates contexts in which a bilingual tends to code-switch in conversation, which include:
• giving reported speech
• change of interlocutor(s) in a group setting
• parentheses or side-comments
• reiterations or repeats
• change of task or activity type
• topic shift
• language play
• interjections, filler
• false start repair
One goal of existing task-oriented dialogue systems is to build a corpus of code-switched human-system data. In one system, the system would initiate code-switching in English, Spanish, or Hindi to encourage the user to also code-switch between those languages; the code-switching was intra-sentential to capture that variety of language use (Ramanarayanan and Suendermann-Oeft 2017). A second system was built in part to create a corpus of English-Spanish code-switching utterances (Ahn et al. 2020).
2.4.1.3 System is an unbalanced bilingual
A system might use code-switching to account for or compensate for its deficiencies in one or both languages.
Dialogue systems are typically built following standards for how humans communicate with other human interlocutors. The usual pipeline for developing a dialogue system is first to gather data of human interlocutors demonstrating a specific behavior, develop the required computational components, and then test the dialogue system with human users. While it is established that the expectations for a computer interlocutor can differ in some ways (Cassell 2000; Brixey 2015), the expectations are often the same. Thus, a data set of human conversations is first needed to build a dialogue system, which can then be used to model the understanding and generation components of the dialogue system.
However, of the 7,000 languages in the world, many lack the computational resources needed to implement the understanding and generation components of a dialogue system. An existing system of this kind is Tauira (Schagen and Knott 2004), in which the system queried the user for the correct usage of an unknown word in a second language. The Tauira system was designed to gain new vocabulary in English and to translate those items into Māori. An example conversation is shown in Figure 2.6.

Figure 2.6: Example conversation from the Tauira system (Schagen and Knott 2004) in which the system learns new vocabulary from the user. H indicates the user and C indicates the system in the figure.
The second application described in this dissertation, DAPEL in Chapter 5, is also a system of this variety, as it code-switches to establish an identity as a speaker of that language but does not have the full resources to carry out a conversation in the second language, Choctaw.
2.4.1.4 Socio-psychological
A speaker might code-switch to achieve another aim with their interlocutor. Code-switching offers bilingual speakers a method to increase the flexibility of their expression. It allows them to "index the nuances of social relationships by exploiting the socio-psychological associations of the languages employed" (Myers-Scotton 1997). Language choice might simply follow the expected social cues of the language community.
Language choice can also communicate aspects about the speaker, such as their identity and membership
in various groups. Language choice might also be used to build a relationship with the interlocutor.
Cues of language community
While code-switching can be speaker dependent (Vu, Adel, and Schultz 2013), Gardner-Chloros and Edwards (2004) suggest that social factors influence language choice, as different generations of speakers from the same community have exhibited very distinct code-switching patterns (Doğruöz et al. 2021).
Figure 2.7: An example of Czech-English code-switching where high-information content words are
switched
Myslín and Levy (2015) found that Czech-English speakers switch to English for high-information
content words in prominent prosodic positions when speaking Czech. Examples are shown in Figure 2.7.
For Hindi-English bilingual users of Twitter, Hindi is preferred for the expression of negative sentiment
(Rudra et al. 2016) and for swearing (Agarwal et al. 2017) in tweets. Other code-switching items that might
be cultural or community-based are words or phrases that frequently occur at the edge of clauses, such as
ojalá in Spanish (Doğruöz et al. 2021).
In Parekh et al. (2020), the system is designed to investigate how the user accommodates the system’s
code-switching choices. The system in Ahn et al. (2020) explicitly asks users to rate if the dialogue system
code-switched as a human member of the language community would. The dialogue system in Ramanarayanan and Suendermann-Oeft (2017) had dialogue utterances that were modeled on natural human
code-switching to encourage participants to produce additional natural code-switching examples to form
a corpus.
As Auer (1995) points out, the speaker can also choose to distance themselves from the language community by not abiding by norms.
Identity establishment
Social identity is the sense of membership within a social group. This group membership could be
defined by ethnicity, gender, social class, and nationality (Mitchell, Myles, and Marsden 2013). Theories
also consider language a salient marker of identity and group membership (Pavlenko and Blackledge 2004).
Speakers may choose to speak in the language they feel would symbolize the rights and obligations
they wish to enforce and their identity (Myers-Scotton 1998). Code-switching could then be an attempt
to negotiate a different balance of rights, obligations, and identity representation within the conversation
(Myers-Scotton 1998; Pavlenko and Blackledge 2004).
Which language is used can be a political choice when dealing with national or ethnic identities (Piller
2001), as French-English code-switching in Quebec, Canada may signal a difference between dominant and
subordinate language groups (Heller 1992). Pavlenko and Blackledge (2004) state that those who speak a
lingua franca often do not demonstrate feelings of identity or membership based on using/knowing that
language.
Some definitions of the concept of identity emphasize the negotiability and flexibility of identity and the role of language practices in the construction of identity (Mitchell, Myles, and Marsden 2013). This means that language learners can transform their identity to include the second language as part of their overall identity as they gain knowledge in that language.
In the systems described in Section 2.4.1.4, the three systems (Parekh et al. 2020; Ahn et al. 2020; Ramanarayanan and Suendermann-Oeft 2017) demonstrated their identities as speakers of both languages through code-switching. All of the systems intended to collect human-system dialogue data, which was achieved by demonstrating knowledge of both languages.
Facework and Rapport
Face is the image one has of oneself that is delineated in terms of approved social attributes, such as one's profession (Goffman 1967). Face and identity are related, as face can be considered the public display of one's identity (Ting-Toomey 2009; Holtgraves 2009). An individual's face is supported or maintained when what they present as an image of themselves is internally consistent and supported by other participants (Goffman 1967).
Face is a collaborative process, as it must be protected and supported by others. Support or threats to the interlocutor's face are simultaneously threats or support to one's own face (Holtgraves 2009; Goffman 1967). Emotional attachment, politeness, and avoiding hostility are all aspects that help explain why a person conducts themselves in a way that protects the face of all conversational participants, including themselves (Goffman 1967).
Because of the collaborative nature of dialogue and facework, face plays a role in language comprehension (Holtgraves 2009). If a speaker's face becomes threatened in conversation, such as encountering a language issue, they "would be likely [to] experience identity-based frustration, emotional vulnerability, shame, hurt, anger" (Ting-Toomey 2009). Face threats can be on the group membership or individual level and may alter one's sense of identity as a language speaker. Conversely, if a person senses their face is being positively recognized, they typically respond with confidence and assurance (Goffman 1967).
The three elements underpinning rapport management are people's face sensitivities, their perceived social rights and obligations, and their goals for the interaction (Spencer-Oatey 2009). If language choice indicates solidarity with or deference to the interlocutor, it can narrow the social distance between them. Because the interlocutor's face has been supported and the social distance has been decreased, there is a higher chance of developing rapport. In the inverse case, wherein the choice of language is used to indicate a difference in power (such as one language having higher status than the other), anger, or resistance, it can serve to increase the social distance between the conversational participants (Pavlenko and Blackledge 2004).
Rapport has been extensively studied for monolingual systems. Examples include for virtual characters
(Cassell, Bickmore, et al. 1999; Gratch et al. 2007) and investigating the impact on and importance of rapport
in negotiations (Zhao, Romero, and Rudnicky 2018), conversational coordination (Cassell, Gill, and Tepper
2007), and interviews (Kobori, Nakano, and Nakamura 2016), among many other examples (N. Wang and Gratch 2009; Lugrin, Pelachaud, and Traum 2022). To the best of my knowledge, no existing dialogue system code-switches to build positive or negative facework or rapport with the user.
2.4.1.5 User proficiency
The user might require bilingualism in conversation with the dialogue system if they are an unbalanced bilingual. Unbalanced proficiency can result from several factors (Schwartz and Kroll 2006). One typical example is using one language at home among family or close friends but the second in a professional setting. This can give the user a highly developed vocabulary for specific professional domains that they may lack in the home language. A second common example is a user who is learning a second language and is simply at the beginning stages.
The systems described in Section 2.4.1.1 support user preference; however, it is unknown whether the user prefers one or the other language at a given point in the conversation to overcome a proficiency challenge. Ahn et al. (2020) conclude that more proficient bilinguals are less likely to use word insertions as a code-switching method and more likely to code-switch at clauses.
To the best of my knowledge, no systems support code-switching in computer-assisted language learning (CALL) chatbots. Instead, CALL chatbots are designed to speak only in the language being learned. This differs from the Masheli system described in this dissertation, which uses code-switching to support language learning.
2.4.2 Computational requirements of a bilingual dialogue system
In all of the NLP tasks considered in this section, it is clear that code-switched data is unique and challenging in ways that cannot be handled by simply stitching together two (or more) monolingual models. Switching languages can change the definition of the word being used, affecting tasks such as language identification and parsing, and challenges extend to intra-word mixing, where a single code-switched word consists of morphemes and lexical bases from two languages. As a result, code-switched data requires uniquely tailored tools and approaches (Çetinoğlu, Schulz, and Vu 2016). Figure 2.8 summarizes the broad areas of research in the computational aspects of bilingual systems.

Figure 2.8: Summary of the broad research areas of computational aspects of bilingual systems
Computational approaches to process and generate code-switched data are relatively recent developments. While some work began in the early 1980s (Joshi 1982), a focus on code-switched data has recently
emerged. There are now workshops dedicated to code-switching, such as the Workshop on Computational
Approaches to Code-switching3 or the Computational Approaches to Linguistic Code-Switching (CALCS)
workshop (Molina et al. 2016).
At present, there are at least two benchmarks for code-switching tasks: the LinCE benchmark4 and the GLUE-CoS benchmark5. The LinCE benchmark, short for Linguistic Code-switching Evaluation, combines ten corpora in pairs from English-Spanish to English-Nepali and several shared tasks (Aguilar, Kar, and Solorio 2020). The GLUE-CoS benchmark contains corpora only for English-Hindi and English-Spanish pairs and several shared tasks (Khanuja et al. 2020).
3 https://code-switching.github.io/2023
4 https://ritual.uh.edu/lince/
5 https://microsoft.github.io/GLUECoS/
In this section, I review existing computational methods for understanding and generating code-switched data for specific NLP tasks, such as speech recognition and language modeling.
2.4.2.1 Data sets of code-switching
One major challenge in developing a bilingual dialogue system is the sparsity of code-switched data. This
is partly because bilinguals tend to code-switch in conversation rather than in written form (Sitaram et al.
2019), resulting in fewer options for creating novel data sets. A related challenge is that code-switching
can depend on the individual, making it difficult to capture generalizations or multiple examples of a single
type of code-switching instance.
A second major challenge is that the literature notes that at least one of the code-switched languages
tends to be low-resourced (Doğruöz et al. 2021). Data sets and annotations are sparse for many of the
world’s languages (Razumovskaia et al. 2021).
Some popular example data sets include a Hindi-English corpus (Dey and P. Fung 2014), a Mandarin-English spoken corpus (Lyu et al. 2010), a corpus of Hindi-English and Spanish-English human-system conversations (Ramanarayanan and Suendermann-Oeft 2017), and Tagalog-English tweets (Herrera, Aich, and Parde 2022). A recent survey captures 23 unique corpora and the NLP tasks supported and explored with the corpora (Jose et al. 2020). An excellent source for locating corpora, many of which are publicly available for use, is the International Conference on Language Resources and Evaluation (LREC)6.
2.4.2.2 Understanding
Most of the literature on code-switching computational requirements has focused on the computational processing of code-switching. For the understanding tasks, a computational system must recognize that a language switch occurred and identify which language is used in order to process the utterance appropriately. Typically, a code-switching system looks at one language pair; it is not yet the case that multiple language pairs are worked on together or that a model built for one language pair can successfully transfer to an altogether different pair. The complexity of this task is often reduced by breaking it down into a series of sub-tasks, with each output given to the next task in the pipeline (Razumovskaia et al. 2021).

6 http://www.lrec-conf.org/
The remainder of this section summarizes some of the challenges and the state of the art for each understanding component.
Language Identification
Language identification is the process of identifying which language is used in a given section of an utterance (Das and Gambäck 2014; Joshi 1982). If the code-switching is at the phrase level (inter-sentential switching) and does not mix within a word, this is the only understanding component task that is considered solved (McNamee 2005). Language identification has reached up to 100% accuracy on benchmark data using approaches such as n-grams (Cavnar, Trenkle, et al. 1994), character encoding detection (Dunning 1994), or stop word lists (Grefenstette 1995). The most commonly used methods in a survey of language identification research were Support Vector Machines and Conditional Random Fields (Hidayatullah et al. 2022).
However, the task is not solved if the code-switching includes word borrowing or intra-word mixing (Das and Gambäck 2014). Part of the challenge is that word borrowing presents difficulties for annotators, who may differ on whether a word is integrated into the language and, as a result, may inconsistently label data (Çetinoğlu, Schulz, and Vu 2016).
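As an illustration of the n-gram family of approaches cited above, the following toy Python sketch tags each word of an intra-sentential English-Spanish utterance with the language whose character n-gram profile it overlaps most. The profiles, scoring function, and sample sentences are my own simplifications for exposition, not taken from the cited systems (Cavnar and Trenkle, for instance, rank n-grams rather than summing overlaps):

```python
from collections import Counter

def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count character n-grams, padding word boundaries with spaces."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def build_profile(samples: list[str], n: int = 3) -> Counter:
    """Aggregate n-gram counts over monolingual training samples."""
    profile = Counter()
    for sample in samples:
        profile.update(char_ngrams(sample, n))
    return profile

def identify(word: str, profiles: dict[str, Counter], n: int = 3) -> str:
    """Label a word with the language whose profile overlaps it most."""
    grams = char_ngrams(word, n)
    scores = {
        lang: sum(min(count, profile[g]) for g, count in grams.items())
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get)

profiles = {
    "en": build_profile(["the meeting is tomorrow", "i want to book a flight"]),
    "es": build_profile(["la reunión es mañana", "quiero reservar un vuelo"]),
}
# Word-level tags for an intra-sentential English-Spanish switch:
tags = [(w, identify(w, profiles)) for w in "quiero book un flight".split()]
```

Note that this word-level scheme cannot handle intra-word mixing, which is exactly where, as discussed above, language identification remains unsolved.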
Text normalization
One NLP preprocessing task is text normalization, which transforms text into one consistent form that can be processed. In English, an example of text normalization might be to expand contractions into full forms, such as can't into cannot. In the case of Choctaw, text normalization could be used to decrease the variation in possible spellings. A simple script can transform instances of v into V; a model could then recognize that bvnna and bVnna (want) are the same word. Text normalization becomes an essential step when working with code-switched text whose languages do not share a similar orthography, such as Korean and English.
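A minimal version of the v-to-V script described above might look like the following Python sketch; the function name is illustrative, and a real normalizer would handle more variant pairs than this single mapping:

```python
import re

def normalize_choctaw(text: str) -> str:
    """Collapse one Choctaw orthographic variant: map lowercase 'v'
    (a typing convenience for the short vowel) to 'V', so that
    spellings like 'bvnna' and 'bVnna' (want) become one form."""
    return re.sub(r"v", "V", text)

# normalize_choctaw("bvnna") -> "bVnna", matching normalize_choctaw("bVnna")
```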
The process of normalizing text is not standardized. To illustrate, approaches cited in the literature include a neural net translation architecture (Q. Zhang, Huan Chen, and X. Huang 2014), grapheme-to-phoneme (G2P) conversion (Parikh and Solorio 2021), conditional random fields (F. Liu et al. 2011; Mave, Maharjan, and Solorio 2018), and word embedding models (Singh, Choudhary, and Shrivastava 2023), among others.
Named Entity Recognition
Named entity recognition (NER) is the task of identifying named people, locations, organizations, or
recognizable products (Thara and Poornachandran 2018). For example, NER could recognize that “Shirley
Temple” is a named entity and that the entity could refer to the actress or the name of a drink.
Gupta, Tripathi, et al. (2016) used Conditional Random Fields to extract entities in English-Hindi and
English-Tamil code-switched tweets on Twitter.
Sentiment Analysis
Sentiment analysis classies a text or an utterance as positive, negative, or neutral in sentiment (Thara
and Poornachandran 2018).
Vilares, Alonso, and Gómez-Rodríguez (2015) used logistic regression to classify tweet sentiment in
English-Spanish pairs. The conference SemEval now hosts a task to classify sentiment in code-switched
tweets (Patwa et al. 2020).
Language modeling
A large portion of the literature on code-switching is dedicated to language modeling. Language modeling cannot exist without data sets, as described in Section 2.4.2.1: a lack of sufficient data makes training statistical language models difficult or impossible. Additionally, normalization, as described in Section 2.4.2.2, is an important step in preparing a data set (Çetinoğlu, Schulz, and Vu 2016). Language modeling is essential to downstream applications, such as Automatic Speech Recognition (ASR) and machine translation.
Many models are possible within language modeling, all of which assign a probability to a given sentence (Çetinoğlu, Schulz, and Vu 2016). Language modeling for code-switched data requires predicting both the next word and the next word's language (Chandu et al. 2018).
If the code-switching is inter-sentential, two monolingual language models can simply be interpolated (Çetinoğlu, Schulz, and Vu 2016). The challenges begin with intra-sentential code-switching and intra-word switching. Early work on code-switched language models proposed frameworks for code-switching (Joshi 1982), much like the linguistics literature has done. One approach to processing bilingual language is to run two parallel monolingual language models: the language being spoken is first identified at the word level, and the words are then processed using one or the other of the two monolingual models (Sitaram et al. 2019).
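The interpolation of two monolingual models can be illustrated with a toy unigram sketch in Python. Real systems interpolate n-gram or neural models and tune the mixing weight; the corpora, weight, and function names here are my own illustrative choices:

```python
from collections import Counter

def unigram_lm(tokens: list[str]) -> dict[str, float]:
    """Maximum-likelihood unigram model from a tokenized corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def interpolated(word: str, lm_a: dict, lm_b: dict, lam: float = 0.5) -> float:
    """Linear interpolation of two monolingual unigram models:
    P(w) = lam * P_a(w) + (1 - lam) * P_b(w)."""
    return lam * lm_a.get(word, 0.0) + (1 - lam) * lm_b.get(word, 0.0)

en = unigram_lm("i want to book a flight".split())
es = unigram_lm("quiero reservar un vuelo".split())
p = interpolated("quiero", en, es)  # nonzero via the Spanish model only
```

The combined model assigns probability mass to words of either language, which is sufficient for inter-sentential switching but, as noted above, breaks down for intra-word mixing, where neither component model has seen the mixed form.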
Later research attempted to predict where code-switching might occur based on these linguistic frameworks. This approach was explored since predicting where a switch might occur would allow the system
to utilize one or the other monolingual language models, thereby reducing processing time (Solorio and Y.
Liu 2008). However, additional data sets, many of which are from the Internet, such as from Twitter, indicated that the earlier approaches that relied heavily on frameworks for code-switching are only sometimes
successful on these newer, noisier examples of code-switching (Chandu et al. 2018).
Other language model approaches have involved mixing the data rather than having two monolingual language models (Sitaram et al. 2019). More recently, massive multilingual transformers (MMTs), such as multilingual BERT, have been proposed to process code-switched data by relying on cross-lingual transfer (Devlin et al. 2018). However, there are caveats: recent work has shown that "the effectiveness of MMTs drastically drops in transfers to distant languages or languages represented with small-sized monolingual corpora (Lauscher et al. 2020)" (Razumovskaia et al. 2021). In other words, these methods are not yet effective for low-resource languages, but rather for languages with larger corpora.
Automatic Speech Recognition (ASR)
An ASR system converts spoken utterances into text. Implementing a successful bilingual speech recognizer depends on developing a well-trained acoustic model capable of recognizing two languages and a well-trained bilingual language model.
One approach is to use a language identification system (discussed in Section 2.4.2.2) to split the code-switched speech into monolingual sections and then use monolingual recognizers on the corresponding segments. While this approach is straightforward, semantic information is often lost between the segments. Additionally, it relies on a language identification system that works at the acoustic frame level but frequently fails if the speech segments are shorter than around three seconds (Çetinoğlu, Schulz, and Vu 2016). A newer approach used meta-transfer learning, which performed better than monolingual recognizers (Winata, Cahyawijaya, et al. 2020).
A second approach is to implement one integrated system, rather than two monolingual systems, comprising a multilingual acoustic model, dictionary, and language model (Vu, Lyu, et al. 2012; Sivasankaran
et al. 2018; Weiner et al. 2012; Biswas et al. 2018). This approach does allow for intra-word code-switching
and does not lose the semantic information between segments.
2.4.2.3 Generation
Most of the literature on computational requirements has focused on processing and understanding code-switching, with less attention given to generating code-switched data.
Since it is challenging to obtain code-switched data, there have been attempts to create intra-sentential code-switched data synthetically (Pratapa et al. 2018). Creating code-switched data has generally followed a linguistic framework, although recent approaches have utilized other methods. In one work, code-switched data was generated using a pre-trained encoder with transfer learning (Gupta, Ekbal, and Bhattacharyya 2020). Another work converted monolingual text into code-switched text using a dependency parser and a machine translator (Gregorius and Okadome 2022).
Other generation tasks have included code-switched text-to-speech (TTS). Some TTS approaches have included techniques such as phone mapping, where the phones of the different languages are substituted with the closest-sounding phones in one language. This approach has produced strongly accented speech (Sitaram et al. 2019).
Within dialogue systems, generation would be necessary for the response generation module of the system (Razumovskaia et al. 2021). An alternative to generating an entirely new response on the fly is to retrieve an acceptable code-switched response from an existing data source, which is then given to a generative model to alter (Razumovskaia et al. 2021).
Until recently, the state-of-the-art approach was to implement each component of understanding and generation as a separate entity, with the features required for one code-switching task becoming the output for another. However, pipeline-style approaches have yet to be successful when the tasks are cyclic, such as for normalization or task identification. Additionally, pipelines can cause error propagation, as an error in one component will flow throughout the rest of the pipeline (Çetinoğlu, Schulz, and Vu 2016). End-to-end (E2E) code-switched systems are a recent development (Sreeram and Sinha 2020). E2E systems typically rely on a neural architecture (Razumovskaia et al. 2021). Some work has found that E2E models can be more successful in open-domain conversations (such as chatbots) than task-oriented dialogues (Razumovskaia et al. 2021).
2.4.3 Summary of the research space
Table 2.1 summarizes the bilingual dialogue systems research space as outlined by the dimensions of the literature review. Notable is the absence of dialogue systems that either produce or process intra-word code-switching. Additionally, few systems address the system becoming bilingual as a reason to code-switch, and few systems use code-switching for socio-psychological reasons.
This dissertation situates itself within the research space outlined by the reviewed literature, covering new ground by examining socio-psychological dimensions such as identity, facework, and rapport. I leave intra-word dialogue systems to future work.
2.5 Overview of Choctaw language and culture
In this section, I describe the Choctaw language, tribe, and culture. I also detail some of the existing dictionaries, reference grammars, and writing systems.
2.5.1 Sounds and orthography
The Choctaw language has 15 consonant sounds and 9 vowel sounds in three series: short, long, and nasalized. The orthography for all variants uses the Latin script but is not fully standardized. Broadwell (2006) describes several writing systems: traditional, Mississippi school, and Mississippi modern, along with a variety used only by linguists. I describe only the traditional and Mississippi writing systems.
The current literature indicates that systematic writing of Choctaw only began with the arrival of American missionaries, although there was significant contact with the French and Spanish prior to the English-speaking missionaries. The first known writing system that has persisted was led by Reverend Cyrus Byington in 1819 (Byington 1870), who devised the traditional orthography to translate and publish religious texts in Choctaw (discussed in detail in the following subsection). For example, the first text
Type of code-switching processed/generated, by reason for code-switching:

User preference
  Interturn: Al-Shawakfa and Evens 1999; Ali et al. 2018; Gawronska and House 1998; Martincic-Ipsic et al. 2003; Meng, S. F. Chan, et al. 2000; Wilcock, Jokinen, and Yamamoto 2016
  Intraturn, intersentential: Bawa, Choudhury, and Bali 2018; Bawa, Khadpe, et al. 2020; Brown et al. 2020
  Intraturn, intrasentential: Bawa, Choudhury, and Bali 2018; Bawa, Khadpe, et al. 2020; Brown et al. 2020
  Intraword: (none)

Task required
  Interturn: Pratapa et al. 2018
  Intraturn, intersentential: Ramanarayanan and Suendermann-Oeft 2017
  Intraturn, intrasentential: Ramanarayanan and Suendermann-Oeft 2017
  Intraword: (none)

System becoming bilingual
  Interturn: Schagen and Knott 2004

User proficiency
  Interturn: Chapter 4.3
  Intraturn, intersentential: Chapter 4.4
  Intraturn, intrasentential: Chapter 4.4

Socio-psychological - language community cue
  Interturn: Chapter 5
  Intraturn, intersentential: Chapter 5
  Intraturn, intrasentential: Chapter 5

Socio-psychological - demonstrate an identity
  Interturn: Chapter 4.3; Chapter 5
  Intraturn, intersentential: Ahn et al. 2020; Ramanarayanan and Suendermann-Oeft 2017; Razumovskaia et al. 2021; Chapter 4.4; Chapter 5
  Intraturn, intrasentential: Ahn et al. 2020; Ramanarayanan and Suendermann-Oeft 2017; Razumovskaia et al. 2021; Chapter 4.4; Chapter 5

Socio-psychological - facework
  Interturn: Chapter 4.3; Chapter 5
  Intraturn, intersentential: Chapter 4; Chapter 5
  Intraturn, intrasentential: Chapter 4; Chapter 5

Socio-psychological - rapport
  Interturn: Chapter 4.3; Chapter 5
  Intraturn, intersentential: Chapter 4.4; Chapter 5
  Intraturn, intrasentential: Chapter 4.4; Chapter 5

Table 2.1: Summary of the existing code-switching dialogue systems research space and the situation of this dissertation within it. Chapter 4 of this dissertation covers the Masheli chatbot for language learning and Chapter 5 is the DAPEL system for language documentation.
Common characters (IPA in brackets when different from spelling):
p b t k f [ɸ] s h m n l w y [j]

Choctaw sounds represented differently in some orthographic systems:

IPA   Traditional     Mississippi School   Modern
tʃ    ch              č                    ch
ʃ     sh              š                    sh
ɬ     hl, lh          ł                    lh
a     V, v, a         a                    a
aː    a               á                    á
ã     a̱, an, am       ą                    a̱
i     i               i                    i
iː    e, i            í                    í
ĩ     i̱, in, im       į                    i̱
o     u, o            o                    o
oː    o               ó                    ó
õ     o̱, u̱, on, om    ǫ                    o̱

Figure 2.9: Choctaw sounds and orthographic variants. Bracketed characters are IPA symbols.
published in Choctaw was the Bible, which remains the longest text published in the language today. The language documented in Byington's work was incomplete, leaving out several of the dialects present at the time (Swanton 1931; Kickham 2015); however, Byington's bilingual dictionary contains the most lexical items of all printed Choctaw dictionaries to the present day.
Distinguishing features of the traditional orthography include the short vowel [a] being represented as V, or, alternatively, as v for ease of typing on QWERTY keyboards. Some challenges in this orthography are the inconsistent representation of vowel length and nasality, and the ambiguity of the sequence hl, which can stand for either the lateral fricative [ɬ] or the consonant cluster [hl]. The traditional orthography is the preferred writing system in Oklahoma. The Choctaw Nation of Oklahoma has attempted some standardization through spelling rules issued by the School of Choctaw Language and the publication of a dictionary (The Choctaw Nation of Oklahoma Dictionary Committee 2016)7.

7 https://dictionary.choctawnation.com/word/
The Mississippi School alphabet was developed by the Bilingual Education for Choctaws in Mississippi
program (BECOM). It was established in 1974 to provide education in Choctaw in grade levels kindergarten through third grade (York and Scott 1976). The Mississippi School alphabet was accepted by the
Tribal Council in 1981 but was not universally adopted by the community. BECOM then developed the
Mississippi Modern alphabet. This orthography restored the digraphs ch, lh, and sh from the traditional
orthography, as well as the representation of nasalized vowels with a macron below. However, it retained
the representation of vowel quality as in English, and vowel length with an acute accent (Choctaw Tribal
Language Program 2005).
2.5.2 Dictionaries and Reference Grammars
Several bilingual lexicons were published in the 19th century (Watkins 1892; Wright 1880). The first comprehensive dictionary was written by Cyrus Byington (Byington 1915). Byington's dictionary was published nearly 50 years after his death and uses unique symbols introduced by the editor: ȧ for the short [a], and superscript letters for the nasalized vowels aⁿ, iⁿ, oⁿ. A second dictionary was published by Allen Wright (Wright 1880). A more recent dictionary in the traditional orthography was released by the Choctaw Nation of Oklahoma (The Choctaw Nation of Oklahoma Dictionary Committee 2016). The Mississippi Band of Choctaw Indians has a dictionary forthcoming as of this writing8.
There are three published reference grammars in English for Choctaw. The first is by the missionary Cyrus Byington (Byington 1870). The second is by Thurston Dale Nicklas (Nicklas 1979). The final reference grammar is by George Aaron Broadwell (Broadwell 2006). It is highly possible the language was documented by the French and Spanish who were present in the Choctaw homelands prior to English-speaking populations; however, discovering those documents is beyond the scope of this dissertation and I leave such work to future scholars.
8 https://dictionary.choctaw.org
2.6 A Brief Overview of Prior Work on Indigenous Languages
At the time of writing this dissertation, we are currently in the UN Decade of Indigenous Languages9. More attention is being paid to the documentation and revitalization of Indigenous languages around the world, such as through fellowships with large funds allocated to efforts like the MIT Solve prize10, or shared tasks at conferences (Ebrahimi et al. 2023).
In this section, I introduce the role of technology in both documenting and revitalizing Indigenous languages, as well as providing definitions for documentation and revitalization.
2.6.1 Challenges and motivations for Indigenous language technology
The United States has over 570 federally recognized tribes,¹¹,¹² representing many unique languages and cultures. Many Indigenous languages in the US are endangered and losing speakers. There are several reasons that communities are losing language speakers, chief among them the US government’s policies of forced assimilation and suppression of Indigenous languages over hundreds of years (Reyhner 1993). Examples of government policies include forced attendance at residential schools and forced removal from ancestral lands.
The vast majority of American Indigenous languages have little representation in language technology and NLP. Indigenous languages are often also low-resource languages; to the best of my knowledge, there are no Indigenous languages that would be termed high-resource. Resource in this context refers to the creation, development, and maintenance of language resources to implement NLP technology. The majority of languages in the world are considered low-resource, and the most well-resourced language is English (European Language Resources Association 2019).
⁹ https://www.un.org/development/desa/Indigenouspeoples/Indigenous-languages.html
¹⁰ https://solve.mit.edu/challenges/2024-indigenous-communities
¹¹ https://www.usa.gov/tribes
¹² See https://www.justice.gov/otj/about-native-americans for more on federal recognition.
It can be a challenging endeavor to develop language technology for an Indigenous language. The first step towards developing any language technology is to create a sizable data set, which can be complicated to achieve as many languages have few and/or dwindling numbers of speakers who may be geographically distant from one another. Additionally, many Indigenous languages have non-standardized writing systems or even none at all. This presents several challenges, one of which is the high learning curve for training transcribers (Coto-Solano, Nicholas, and Wray 2018).
There are numerous reasons to overcome these challenges and support Indigenous language technology. For one, each new language poses interesting technical challenges that call for novel solutions. Additionally, language technology helps preserve the culture of the language community (European Language Resources Association 2019) and adds to our understanding of the human mind’s capacity for language. It also boosts the
perceived prestige level of the language, particularly with younger speakers (Kornai 2013; Coto-Solano,
Nicholas, and Wray 2018). Finally, the act of speaking and maintaining Indigenous languages has been
shown to have positive health outcomes for individual speakers. An individual’s use of their Indigenous
language was correlated with lower rates of alcohol and drug consumption (Whalen, Moss, and Baldwin
2016), reduced suicide rates in youths (Hallett, Chandler, and Lalonde 2007), and a lower prevalence of
diabetes (Oster et al. 2014).
2.6.2 Documenting Indigenous languages
Language documentation is the work of creating a multi-purpose record of examples of phenomena in a
language. The record can show registers and varieties, social or formal use, and/or spoken and written
samples of the language (Himmelmann 2006). The purpose of language documentation is to serve as a
resource for language maintenance support, as a data repository for future scientific inquiries, and as a
means to replicate and verify past research on the language (Himmelmann 2006).
Language preservation is the effort to prevent a language from becoming unknown by creating a record of the language’s structure. Efforts include making word lists, recording speakers in multiple speaking scenarios such as monologues, and analyzing the language’s syntax. A language preservation practitioner can also conduct a dialogue with a language speaker in the form of an interview, moderate a conversation between two speakers, or ask questions about specific aspects of a language. Language preservation is carried out on languages with declining populations of speakers, such as endangered languages (Olko and Sallabank 2021).
Language documentation and preservation is often part of the revitalization and reclamation process (Bird 2020). Language reclamation is the “larger effort by a community to claim its right to speak a language and to set associated goals in response to community needs and perspectives” (Leonard 2012), while revitalization is the term for efforts to reverse language shift by gaining new speakers for languages that have been losing speakers (Pine and Turin 2017).
2.6.2.1 Historical practices and issues
Language documentation and revitalization as practices have changed over time. Some Native American languages were systematically documented by the early twentieth-century anthropologist Franz Boas (White 2006). Boas usually produced a phonological, lexical, and syntactical analysis for each language. The large majority of the languages, at least for English-speaking documentation practitioners, were documented by missionaries. While the resulting dictionaries and translations were incredibly beneficial, it is noted that one intent of this early work by missionaries was to pull Indigenous people towards Christian ideologies and away from Indigenous culture and language (Battiste 1998). This intention can be seen in the words and translations that were documented, which often dealt with Christian topics rather than Indigenous practices and ceremonies. As a result, traditional cultural knowledge was not recorded for later reclamation efforts.
Revitalization practices during the 1900s were often led by linguists, some of whom were not from the community. This presented different challenges. In some cases, community permission was not obtained, or recordings were maintained outside of the community and not readily shared (Shilling 2020).
2.6.2.2 Best practices for working with community
The literature notes that language loss is often the byproduct of public oppression (Nevins 2013) as well
as government policies that target the connection between Indigenous languages and identity (Stebbins,
Eira, and Couzens 2017). With these aspects in mind, care must be given when working with Indigenous language communities. I refer the reader to Bird (2020) for how traditional and older academic and
documentation approaches can replicate colonization and oppressive practices.
Pinhanez et al. (2023) note that working with communities can be summed up as, “Nothing for us without us.” In other words, any language initiative must be done in collaboration with the community and should have the community’s benefit as a core part of the project goals.
In modern approaches to working with communities, the right to self-determination is a priority (Bird 2020). The research or tools that might be developed should first meet the needs of the language community and the community context, rather than, as was historically practiced, putting the researcher’s needs first and treating any possible benefits to the community as secondary (Shilling 2020). A first step towards understanding community needs and contexts is to build relationships with community members and to maintain those relationships and connections throughout the life cycle of the work (Bird 2020).
Each language community will have different protocols for working with researchers, external and even internal to the community, but will usually have a community approval process. In the case of working with the Choctaw Nation of Oklahoma for this dissertation, all work was reviewed by the Choctaw Nation Institutional Review Board.
Additionally, protocols can change and should be revisited as necessary. For example, prior to the construction of the Cultural Center,¹³ I discussed with community members and tribal officials where my resulting recordings and other related materials should be archived, as the tribe did not store such data itself at that time. As other resources have been archived at the Sam Noble Museum in Oklahoma, community members recommended this option for ease of locating and accessing. However, the Choctaw Nation has since built the Cultural Center in Durant, Oklahoma, where electronic files can be archived and maintained by the tribe.
2.6.3 Examples of Indigenous Language Technology
Notable NLP developments have been achieved for some languages, such as Māori (Coto-Solano, Nicholas, and Wray 2018; Knott, Bayard, et al. 2002) and Cherokee (S. Zhang, Frey, and Bansal 2022). Developed tools include Part-of-Speech (POS) taggers, Automatic Speech Recognition (ASR), automatic object recognition (Running Wolf et al. 2021), grammar databases (Nordhoff, Tuttle, and Lovick 2016), and large language models for machine translation (Coleman et al. 2024). Mager et al. (2018) provide a more complete overview of technology for Indigenous languages in the Americas. Relevant venues include the ComputEL workshop, which focuses on language technology for endangered (and often Indigenous) languages,¹⁴ and the Workshop on NLP for Indigenous Languages of the Americas (Mager, Ebrahimi, et al. 2023).
LLMs could be an interesting tool for Indigenous languages, as there is research promise in developing a language model with small amounts of data (Alam et al. 2024). At present, LLMs such as ChatGPT (Brown et al. 2020) resist some requests to produce texts in Indigenous languages, such as the dialogue in Figure 2.10. In the first turn, the system refuses to produce a story in the Choctaw language. However, with additional probing in the second and third turns, the system does respond to inquiries about
¹³ https://choctawculturalcenter.com/
¹⁴ https://computel-workshop.org/
Figure 2.10: Example conversation with ChatGPT 4, retrieved October 26, 2024.
the Choctaw language. Unfortunately, the system produces incorrect translations: “i̱ yuka” means his/her slave (second turn), and “i̱ yuka okla” means his/her slave people. Given these profound errors and the potential to miseducate learning speakers, additional data, fine-tuning of the models, and collaboration with communities are needed before LLMs can be a reliable tool in the effort to revitalize Indigenous languages.
Chapter 3
Foundational Work
In this Chapter I describe foundational language technology work that I have created for Choctaw. This work includes the creation of a corpus, an online representation of a dictionary, a verb generator, and recordings of Choctaw speakers. The corpus is described in Section 3.0.1. Next, I describe a spoken language data collection that I carried out in 2019 (Section 3.0.2). Using the collected data, I then created a Choctaw automatic speech recognizer, detailed in Section 3.0.3. Also in this Chapter, I describe a Choctaw verb generator (Section 3.0.4). Finally, I describe some Choctaw language work that will continue into the future in Section 3.0.5.
The foundational work described in this Chapter was instrumental to the dissertation work as a whole. Data from the corpus was used to create the knowledge base of the Masheli chatbot in Chapter 4. The conversational data collection described in Section 3.0.2 informed the code-switching frameworks of both the Masheli and DAPEL systems. The automatic speech recognizer, detailed in Section 3.0.3, was part of one of the DAPEL systems (see Chapter 5 for more).
3.0.1 ChoCo Corpus
I first developed a data set to begin working on language technology for Choctaw. Current NLP systems all rely on corpora and language data. As such, it is essential to build a sufficient data set in order for downstream tasks, such as dialogue systems, to be feasible. The data set, originally named Chahta Anumpa (Brixey, Pincus, and Artstein 2018), was later renamed ChoCo, short for “Choctaw Corpus” (Brixey and Artstein 2021). The data set is a general-use corpus and contains audio, video, and text resources, with many texts also translated into English. The Oklahoma Choctaw and the Mississippi Choctaw variants of the language are represented in the corpus. The data set provides documentation support for the endangered language and allows researchers and language teachers access to a diverse collection of resources. The corpus is not publicly available at the time of writing but will be submitted to the Mississippi Band of Choctaw Indians’ archives and/or the Choctaw Cultural Center in the future.
The first effort to form the corpus was gathering printed written and oral teaching materials, religious texts, audio clips, and videos. The majority of the materials were gathered by searching different sources on the internet, with some materials being donated by fluent speakers. A description of how materials described in this section are organized in the corpus is given in Section 3.0.1.2.
As this is an ongoing effort, this initial corpus will continue to grow as more material is gathered. There
are numerous recordings that cannot be accessed online as well as undigitized documents in historical
archives, such as those held by the Sam Noble Museum (Norman, OK) and the Mississippi Band of Choctaw
Indians Tribal Archives (Choctaw, MS). Due to the nature of these delicate, old, and irreplaceable items,
in-person visits are required in order to access and view them. As funding and permissions are attained,
it is expected that these remote sources will be digitized and added to this corpus.
3.0.1.1 Sources
Existing published materials in Choctaw were gathered from donations, archives, searching online, and
printed teaching materials.
Donations
One story in the Oklahoma orthography, and several texts in the Mississippi orthography, were donated to the corpus by fluent speakers upon request.
Archives
Eighteen audio files of interviews and oral traditions collected by William D. Davies were downloaded from the American Philosophical Society (APS).¹ These clips range in length from one minute to ten minutes, for a total of 58 minutes. These recordings have some overlap in content with the short stories section:
for example, the traditional story, “How the turtle got the pattern on its back” occurs in both. All of the
recordings are in Choctaw, with recorded translations in English.
Additional resources exist in archives but have not yet been added to the corpus as an in-person visit
is required to view the materials.
Online
A number of materials were gathered from the Internet, such as religious texts, reference grammars,
dictionaries, instructional texts and audio. To the best of my knowledge, no blogs or web pages are written
in only Choctaw.
The Internet Archive² hosts free scanned copies of several grammars, dictionaries, and a number of religious texts. I converted the scanned images into text. Additional religious texts were found at Global Bible³ and Bible.com.⁴
¹ http://www.amphilsoc.org/collections/view?docId=ead/Mss.Rec.120-ead.xml
² http://archive.org
³ https://cho.global.bible/bible/624ab6be442c0806-01/GEN.37
⁴ https://www.bible.com/versions/1927
Figure 3.1: Structure of the corpus, with descriptions for substructures
Both audio and video examples were also gathered from the internet. Several audio clips were gathered from Global Recordings Network.⁵ Fifteen videos were extracted from the Cultural Legacy YouTube channel.⁶ Three videos were extracted from a Jehovah’s Witness website.⁷ Finally, 27 videos were downloaded from the “Sounds of Choctaw” playlist on the YouTube channel for the Choctaw Nation of Oklahoma.⁸
Printed Teaching Materials
Published teaching material was gathered from the Choctaw Nation of Oklahoma and the Mississippi Band of Choctaw Indians. Marcia Haag and Henry Willis’s two books of teaching material, poetry, short stories, and correspondence formed a large portion of the Oklahoma Choctaw portion of the corpus (Haag and Willis 2001; Haag and Willis 2007). Archives from the Los Angeles Unified School District Indian Education Program provided additional teaching materials for the Oklahoma and Mississippi variants.
3.0.1.2 Corpus Organization
The corpus, shown in Figure 3.1, contains primary data, metadata, descriptive data, and original copies for some of the texts in the primary data. The number of files in each section of data is given in Table 3.1. The design of my corpus was inspired by documentation formats for corpora (Himmelmann 2006).
Primary Data
⁵ https://globalrecordings.net/it/program/A04680
⁶ https://www.youtube.com/channel/UCSPJdYRhDAf5e8kxxaTySSg
⁷ https://www.jw.org/cho/
⁸ https://www.youtube.com/playlist?list=PLPZ_FlC5CLYpp5R-W5pOaV-ef2qsJu3XL
Corpus structure          Type of data   Specifics         Number and type of file
Declarative Data          Text           Dictionaries      3 xlsx
                                         Ngrams            3 txt
                                         Inflected verbs   461 txt
                                         Text test beds    2 txt
Primary Data              Text           Bilingual         6 xlsx
                                         Monolingual       1 xlsx, 2 doc
                          Audio          Audio only        39 mp3, 45 wav
                                         With transcript   24 wav, 32 doc, 1373 mp3, 1317 txt
                          Video                            45 mp4
Metadata                  Text                             2 xlsx
                          Audio                            5 txt, 1 xlsx
                          Video                            3 txt
Original Scanned Copies   Image                            1001 pdf

Table 3.1: Overview of corpus structure, with number and type of files within each subsection
The primary data portion of the corpus is first organized into mediums: text, audio, and video. The subdivisions within each medium are described in the following subsections.
1. Text Only
Within the text-only section of the corpus, the data is first split by whether the data is monolingual or has an English translation. The organization of the text primary data and number of tokens per category can be seen in Table 3.2. Both monolingual and bilingual data are then separated by variant, and within each variant, then separated by genre. Transcripts of audio are not included in this section.
In the monolingual portion of text, I have a large collection of religious texts in the traditional orthography. I also have examples of phrases and stories in this section. I have not yet found any monolingual
examples in the MS variants.
In the bilingual portion of text, I have an Excel file for stories, and an Excel file for phrases in the Mississippi variant. Phrases are singular sentences unrelated to other sentences in the file. In the Oklahoma
Corpus structure   Type of text                  Genre                                  Number of tokens
Declarative Data   Dictionaries                                                         46,704
                   Inflected verbs (Generator)                                          102,208
Primary Data       Bilingual text (OK)           Correspondence                         159
                                                 Poetry                                 203
                                                 Phrases                                16,465
                                                 Short stories                          6,234
                   Bilingual text (MS)           Phrases                                331
                                                 Short stories                          1,693
                   Monolingual text (OK)         Phrases                                1,344
                                                 Religious                              80,818
                   Monolingual text (MS)         Religious                              30,010
                                                 Phrases                                41
                   Audio with transcript         Sam Noble Chahta Anumpa Aikhvna        35,346
                                                 Language Awareness Program             254
                                                 OK Dialogue (transcriptions ongoing)   200
                                                 School of Language CNO                 4,864

Table 3.2: Number of tokens for text data within the corpus
variant, I have a unique Excel file for poetry, correspondence, stories, and phrases. I chose this format rather than one with more markup (such as XML) because it is easy to read with minimal processing and it does not require complex indexing to relate a Choctaw phrase with its English translation.
All genre files have an original text version, which is the text from the source. There is one example of the Mississippi Choctaw of Oklahoma variant; it is stored in the MS phrases file with an additional column that denotes its variant.
2. Audio
Audio primary data is first divided by whether the audio clip has a transcript. Once an audio clip is transcribed, it can be moved into the section of audio clips with transcripts. The split is designed with speech processing in mind, as a system will be trained on data with text. The data is then broken down by source of the audio, as all sources provided multiple language samples. The number of audio files can be seen in Table 3.1. The organization and amount of all audio clips can be seen in Table 3.3.
Corpus structure   Type of data            Source                            Amount of Audio
Primary Data       Audio only              APS                               1:13:22
                                           Cultural Legacy                   0:16:54
                                           Words of Life                     1:00:26
                                           JW.org                            0:08:16
                                           Language Awareness Program        0:58:41
                   Audio with transcript   Sam Noble Chahta Anumpa Aikhvna   6:58:20
                                           Language Awareness Program        0:02:40
                                           OK Dialogue                       2:07:50
                                           OK Repetition                     35:25:20
                                           School of Language CNO            3:12:33
                   Video                   Cultural Legacy                   1:31:53
                                           JW.org                            0:08:13
                                           School of Language CNO            0:43:51

Table 3.3: Number of hours and minutes of spoken Choctaw within the corpus
Apart from the audio I collected in Oklahoma (described in Section 3.0.2), audio was almost entirely collected from the internet, with the exception of a set of audio pulled from five cassette tapes. Sources for all audio are saved in the metadata section of the corpus.
3. Video
At this time, none of the video in the corpus has an accompanying transcript. I leave it to future work
to generate transcripts. Videos are currently separated by source; in total there are three sources for a
total of 45 videos. The videos are of varying length and topic. Some videos contain conversations between
multiple Choctaw speakers, while others are individual interviews. The organization and amount of video
data can be seen in Table 3.3.
Metadata
Metadata contains a text file with the source information for each sample of primary data where possible. It is provided for verification purposes, and to contextualize data where possible. For example, recording information for the dialogue and repetition data collection (see Section 3.0.2) describes the recording task and gives some details about the individual being recorded.
Figure 3.2: Screenshot of Excel file for Byington dictionary
Declarative data
Declarative data contains information about the language, and does not consist of examples of natural language use in complete, grammatical sentences. It contains language description data in the forms of trigrams and bigrams, dictionaries, test sets, and word lists from the generator.
This part of the corpus first contains language description data from An Crúbadán.⁹ This includes: trigrams, bigrams, a word list, and a list of website sources. Unfortunately, the sources used to generate the lists included OCR materials that contained substantial errors; however, this is the only currently existing language data resource of its kind for Choctaw.
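N-gram lists like these can be regenerated from any tokenized text. Below is a minimal sketch of n-gram counting; the sample tokens are placeholders for illustration, not Choctaw data.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count contiguous n-token sequences in a list of word tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Placeholder token sequence standing in for a tokenized sentence.
tokens = ["a", "b", "a", "b", "c"]
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)
print(bigrams[("a", "b")])  # the pair ("a", "b") occurs twice
```

The resulting counts could be written out as plain-text lists, mirroring the bigram and trigram txt files stored in this section of the corpus.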
This section also contains three dictionaries: Byington’s (Byington 1915; shown in Figure 3.2), Wright’s (Wright 1880), and the latest dictionary from the Choctaw Nation of Oklahoma (The Choctaw Nation of Oklahoma Dictionary Committee 2016). All dictionaries are stored as Excel files, and all contain data organized as a column with a Choctaw word, a column for the part-of-speech type for the word, and a column containing the English translation. Byington’s dictionary contains 20,117 Choctaw tokens, Wright’s contains 9,990, and the latest dictionary contains 4,438.
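As a sketch of how this three-column layout can be used programmatically, the snippet below builds a headword lookup from rows shaped like the spreadsheet columns. The example rows are illustrative only, not actual dictionary entries.

```python
from collections import defaultdict

# Each tuple mirrors one spreadsheet row:
# (Choctaw word, part of speech, English translation) -- illustrative entries only.
rows = [
    ("ofi", "n.", "dog"),
    ("chukka", "n.", "house"),
    ("ofi", "n.", "a dog"),   # dictionaries may repeat a headword with variant glosses
]

# Index headword -> list of (POS, gloss) pairs, the kind of lookup a
# search tool or chatbot knowledge base could query.
lexicon = defaultdict(list)
for choctaw, pos, english in rows:
    lexicon[choctaw].append((pos, english))

print(lexicon["ofi"])
```

With a spreadsheet-reading library such as openpyxl or pandas, the same loop could consume the rows directly from the Excel files.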
Finally, there is a folder that contains all of the forms of verb bases produced by the generator (detailed in Section 3.0.4). Each verb has a unique file containing all of the inflections; there are 461 files in total.
Original Copies
⁹ http://crubadan.org/languages/cho
Within the original copies folder of the corpus, I have retained PDFs of scanned images in the Mississippi variant, and texts on which I conducted OCR and correction. I kept these files for transparency and verification purposes.
3.0.1.3 Corpus Applications
The proposed use cases for the data set are numerous given the variety in both content and language variant. The primary use cases are for academic research in linguistics, history, and natural language processing (NLP). The corpus is a repository for teaching and learning the language. As the majority of the text entries are bilingual, learners and teachers alike can benefit from the translations.
In terms of potential future NLP use cases, one possibility is machine translation (MT). The small size of the data set would encourage novel approaches for an MT model, as there is not enough data to use many machine learning techniques. With additional resources in the future, this use case could be feasible, as the majority of the data in the corpus is translated into English, creating a well-formed parallel data set. The language presents interesting challenges in this domain, as morphologically rich languages pose problems for MT systems through errors in word alignment and multiple affixes. Current word-level alignment models do not distinguish words and morphemes, and produce low-quality end translations due to misalignment (X. Li et al. 2016).
3.0.2 Spoken language data collection
In 2019, I collected two novel types of audio recordings of Choctaw speakers of varying fluency. These audio recordings were collected to provide a sample of speakers in order to create a Choctaw speech-to-text system (task one), and to provide insights into how Choctaw speakers code-switch in conversation (task two). All recordings were conducted in Oklahoma. Transcriptions in Choctaw are written in the Oklahoma orthography.
Figure 3.3: The first ten lines to be repeated by participants
In the first task, 48 speakers, ranging in skill from beginner to fluent, were recorded repeating 200 prepared phrases. The selected phrases represent a diversity of sounds in Choctaw and were used to train a novel Automatic Speech Recognition (ASR) system for Choctaw (described in Section 3.0.3). Non-fluent speakers were recorded to potentially train a system that will recognize nonstandard Choctaw sounds common in language learners. The first ten lines of the task can be seen in Figure 3.3.
In the second task, I recorded eight sessions of pairs of fluent and learning speakers having a 15-minute conversation. Fifteen people participated in total; fluent speakers were allowed to participate more than once. Participants were encouraged to speak about any topic of their choosing. The Choctaw code-switching examples throughout this dissertation all originated from these recordings. Following the recording session, both participants completed an oral survey reflecting on strategies they used to be understood during the conversation. In the survey responses, many of the participants explained that they code-switched to support the interlocutor’s proficiency level, make use of easier sentence planning, or because there was not an equivalent word. These stated strategies align with the code-switching literature (see Figure 2.1). Part of one conversation’s transcript in the software ELAN is shown in Figure 3.4.
Figure 3.4: Transcription of one conversation recording between a fluent and a student Choctaw speaker
3.0.3 Choctaw Automatic Speech Recognizer
I implemented an ASR using Kaldi (Povey et al. 2011), an open-source ASR development toolkit based on finite-state transducers. Kaldi provides several recipes for developing ASR models from scratch; I followed the WSJ recipe.¹⁰ In this recipe, a monophone model and a triphone model are created. The ASR system is currently implemented with minimal fine-tuning.
The lexicon is the pronunciation dictionary file in which all words are represented as phones. For the lexicon, I used a dictionary produced by the Choctaw Nation of Oklahoma (The Choctaw Nation of Oklahoma Dictionary Committee 2016). All words in the lexicon were standardized to match the orthography found in the dictionary entries. Pronunciations and phones were derived from the same dictionary.
The lexicon currently contains 2,727 entries, a limitation and an important lesson to share with other language communities that may seek to develop ASR systems for their languages. It is challenging to develop the lexicon without expert knowledge. Other Choctaw dictionaries (such as Byington
¹⁰ http://kaldi-asr.org/doc/kaldi_for_dummies.html
1915) contain more lexical entries but do not include pronunciations. I expect the system to encounter many words not listed in the current lexicon as conversational data is added to the model, such as the many possible inflected forms of verbs. It will be necessary to consult with fluent speakers to add entries to the lexicon and develop a more robust future ASR. Developing a large lexicon could be a severe challenge for languages with few or no fluent speakers. However, a lexicon is a valuable and important contribution to language documentation.
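Kaldi expects the pronunciation lexicon as a plain-text file with one word per line followed by its space-separated phone sequence. The sketch below renders that format from a small word-to-phones mapping; the entries and phone symbols are invented placeholders for illustration, not taken from the dictionary used here.

```python
# Placeholder word -> phones mapping; the pronunciations are invented
# for illustration and do not reflect the actual Choctaw lexicon.
entries = {
    "ofi": ["o", "f", "i"],
    "chukka": ["ch", "u", "k", "a"],
}

def format_lexicon(entries):
    """Render word -> phones pairs as Kaldi-style lexicon.txt lines."""
    return "\n".join(
        f"{word} {' '.join(phones)}" for word, phones in sorted(entries.items())
    )

print(format_lexicon(entries))
```

Writing the result to `data/local/dict/lexicon.txt` (the location the WSJ-style recipes conventionally read from) would make it available to the data-preparation scripts.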
As described in Section 3.0.2, I recorded 48 speakers of varying fluency in Oklahoma Choctaw repeating prepared phrases aloud. These phrases were used to train the ASR. Although participants were given the phrases to repeat, variations did occur. For example, one speaker said the shorter form skVlli (“money”) rather than the full form of the word, iskVlli. Other changes occurred through participant errors; for example, one speaker said, AiitVoba mVt nipi bVshli iksho tuk, rather than the intended phrase, AiitVoba ma̱ nipi bVshli yVt iksho (“The store does not have a butcher.”).
3.0.4 Verb generator
I developed a verb generator for Choctaw. The verb generator produces different morphological forms of Choctaw verbs. Morphology is the study of the composition of words from bases and morphemes; morphemes are inflectional units of meaning that cannot be decomposed to a lower level (Jensen 1990). For example, the inflected word “baked” in English is formed with the base “bake” and the past tense morpheme “-ed”.
Choctaw is a morphologically complex language in that it has many morphemes and morpheme combinations that can inflect on one base. A base could be a verb, adjective, or noun. By implementing rules in a generator program, it is possible to generate forms not seen in the gathered material by producing many possible combinations.
The generator, the first for Choctaw, produces inflections on verb bases in the traditional orthography. Morphology can present computational challenges: if the language’s morphology is very complex, it becomes increasingly harder to computationally understand new texts. The generator aims to resolve some of those challenges by generating inflected verb forms in Choctaw and then saving the inflected forms as word lists that can be searched by a system. The generator inflected 400 verbs in the Oklahoma orthography and supports both language learning and computational recognition of verbs.
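The core idea of applying affix rules to a base can be illustrated with the English example from this section (“bake” + “-ed”). The affix slots and spelling rule below are hypothetical stand-ins for the Choctaw rules the real generator implements.

```python
from itertools import product

# Hypothetical affix slots for illustration; the empty string means the slot is unused.
PREFIXES = ["", "re-"]
SUFFIXES = ["", "-ed", "-ing"]

def attach(base, suffix):
    """Join a suffix to a base, dropping a final 'e' before a vowel (bake + -ed -> baked)."""
    suffix = suffix.lstrip("-")
    if suffix and base.endswith("e") and suffix[0] in "aeiou":
        base = base[:-1]
    return base + suffix

def inflect(base):
    """Generate every prefix/suffix combination for one verb base."""
    return [attach(prefix.rstrip("-") + base, suffix)
            for prefix, suffix in product(PREFIXES, SUFFIXES)]

print(inflect("bake"))  # ['bake', 'baked', 'baking', 'rebake', 'rebaked', 'rebaking']
```

The real generator writes each verb’s generated forms to its own word list file, which downstream systems can then search.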
3.0.5 Ongoing and long term projects
3.0.5.1 Mississippi Dictionary
In 2020, I joined the Mississippi Band of Choctaw Indians’ National Endowment for the Humanities grant¹¹ in their work to develop a dictionary for the Mississippi variant of Choctaw. As described in Section 2.5, there are orthographic and lexical differences between the Oklahoma and Mississippi writing systems. Before this work, no dictionary existed for the Mississippi variant of Choctaw. I have developed a website and Android and iOS apps for the dictionary. The website can be viewed at https://dictionary.choctaw.org; all content will be publicly released in 2025.
3.0.5.2 Native Earth Native Sky
In 2023, I joined the Native Earth Native Sky project, which is funded by a NASA Science Activation grant. The purpose of the project is to add Indigenous knowledge, such as cultural stories and vocabulary in an Indigenous language, to the STEM curriculum for middle-school-aged children in Oklahoma public schools. The project is currently working with the Choctaw, Chickasaw, and Cherokee tribes in
¹¹ https://securegrants.neh.gov/publicquery/main.aspx?f=1&gn=PD-271355-20
Oklahoma. I am designing the project’s website and interactive learning games. As mentioned in the Introduction, a version of the Masheli chatbot is also being retooled and incorporated into this project. The website can be viewed at https://www.nativeearthnativesky.org/.
Chapter 4
Bilingual Language Revitalization Systems - Application 1: Masheli
The rst application in which I test hypotheses about bilingual dialogue systems is a language-learning
chatbot named Masheli. The hypothesis that I test is whether code-switching leads to a greater increase
in user enjoyment and learning. In this application, the dialogue system is a conversational partner that
shares cultural Choctaw stories. The chatbot is more bilingual than the user, and the user is an emerging
bilingual. Emerging bilinguals are language learners who are in the process of learning a second language.
I conducted two experiments where I explored research questions about bilingual versus monolingual
interactions in a learning context with this system.
All of the chatbot versions are named “Masheli”, meaning “fair sky” in Choctaw. The first section in this chapter
details the motivation for developing this system. Next, I review the relevant literature. Then, I describe
Masheli 1.0, which was the initial implementation developed in 2016 as a use case for the ChoCo corpus
(Section 3.0.1). I present an experiment with this implementation, the findings, and open questions. I then
discuss the relevant literature that influenced the design of Masheli 2.0, the methods and experiments of
Masheli 2.0, and the results.
4.1 Motivation
Conversational fluency is a goal for many language learners. However, for learners of endangered languages like Choctaw, access to fluent speakers may be limited. This lack of access may be due to geographical factors, such as not living on or near tribal lands, or because there are few remaining fluent
speakers of the language. It is unclear how many Indigenous languages are still spoken in the United
States; one estimate (Moseley 2010) was 256 in 2010, while the 2010 US census estimated 165¹. At the time
of writing, no similar summary could be found for the results of the 2020 census. However, it is anticipated that the number of speakers has declined over time (Simons and Fennig 2018b), particularly after the
devastating effects of the COVID-19 pandemic²,³; thus, support for learning these languages is time critical.
The goal of the Masheli chatbot investigated in this chapter is to serve as a conversational partner in
the absence of a fluent Choctaw-speaking human interlocutor. The goal of the interaction is for the user to
gain conversational fluency in Choctaw by interacting with the system. This conversational fluency might
be indicated by increased vocabulary, knowledge of Choctaw language syntax, or a greater sense of ease
of communication.
However, several questions remain, such as how a chatbot can be effectively used to teach conversational fluency, especially in a low-resource minority language. Will minority language learners want to
use this technology?
4.2 Literature review
Technology developed for learning purposes, especially language learning, is a well-established area of
research⁴. Technology, particularly dialogue systems, has been implemented in this sphere for several
¹ https://www2.census.gov/library/publications/2011/acs/acsbr10-10.pdf
² https://www.nytimes.com/2021/01/12/us/tribal-elders-native-americans-coronavirus.html
³ https://www.kxii.com/2021/01/29/choctaw-nation-members-talk-about-impact-of-losing-native-speakers-to-covid-19/
⁴ For example, proceedings from the following CALL conference give a good overview: https://calico.org/
reasons. While traditional classroom settings may attempt to create conversational opportunities, many
student factors, such as shyness or fear of making errors, can prevent learners from engaging fully in
conversation with a human partner (Shawar and Atwell 2007b). In previous research, learners reported
feeling more comfortable chatting with a dialogue system than with a human interlocutor (L. Fryer and
Carpenter 2006).
Developing dialogue systems for learning endangered languages can also fill an important role. As
access to fluent speakers may be limited or nonexistent for endangered and minority languages, a dialogue
system could thus serve as a surrogate conversational partner. One prior system to learn the Indigenous
language Māori included modules for a dialogue manager, sentence generator, and sentence parser (Knott,
Moorfield, et al. 2003). Other examples of world Indigenous language dialogue systems include one for
North Sámi (Trong, Jokinen, and Hautamäki 2019) and a system for Xhosa and Zulu (Malaza et al. 2005).
4.2.1 Literature review of chatbots
In the chatbot area of second language teaching research, chatbots can play two roles: pedagogical agents
or learning companions.
4.2.1.1 Chatbots as pedagogical agents
In the first role of a chatbot, the chatbot acts as a pedagogical agent by providing corrections and feedback
on errors the learner makes. One example is the Taidhgín system for learning Irish (Chiaráin and Chasaide
2016). A second is the Tactical Language and Culture Training System (TLTS), which trains learners in
foreign languages such as Levantine Arabic, Iraqi Arabic, and Pashto. The emphasis of the TLTS system
is on spoken communication (Valente, Johnson, and Vilhjálmsson 2006).
A learner might expect this kind of conversation with correction in an explicit instructional setting, but
such feedback would not typically be given in a casual conversation. As found in my Choctaw recordings
(see Chapter 2), corrections are limited within casual conversations. The literature indicates that in human-human conversations, even in learning environments (Lightbown and Spada 2013), individuals tend not
to make corrections (unless specifically requested) as it could be interpreted as face-threatening. It would
also interrupt the flow of the conversation (Chou, T.-W. Chan, and Lin 2003).
Pedagogical agents are commonly evaluated on technology acceptance by learners, usability questionnaires, learning gains, increased motivation, or the success of specified technical aspects (Hobert 2019).
4.2.1.2 Chatbots as language learning companions
In the second role, and what is intended for Masheli, the chatbot takes a non-authoritative role by focusing
on carrying on a conversation as an equal conversational partner and without explicit corrective feedback
(Chou, T.-W. Chan, and Lin 2003). A learner might expect to have this kind of casual conversation with
another human speaker of the language outside of a learning context.
There are numerous examples of monolingual chatbots as learning companions in several domains
(W. Huang, Hew, and L. K. Fryer 2022; Y. Chen et al. 2023; Morgado et al. 2024). Recent work has also
incorporated Large Language Models to power the English knowledge base of the chatbot (Y. Li et al.
2024). However, to the best of my knowledge at the time of writing, Vanjani, Posey, and Aiken (2019) describe
the only multilingual chatbot deployed to chat informally as a language learning companion. In this work,
the chatbot was connected to Google Translate and could thus respond to user utterances in any language
supported by Google Translate. The translated utterance was then given to Tutor Mike⁵, a free online
chatbot that responded in English. In several experiments, the English responses from Tutor Mike were
translated again by Google Translate to respond to the user in a non-English language.
Often, chatbots will be evaluated on metrics such as review of the chatbot’s produced content by automatic means or human experts, completion of a goal or task, and user engagement (Maroengsit et al.
2019; Shawar and Atwell 2007a; Finch and J. D. Choi 2020). Other metrics to evaluate a chatbot include
⁵ http://bandore.pandorabots.com/pandora/talk?botid=ad1eeebfae345abc
correctness of grammar, informativeness, and coherence of topics or chatbot persona (Finch and J. D. Choi
2020). Tutor Mike was assessed on the quality of the translations and humanness. As no other examples
exist, no standard has been established for evaluating chatbots as learning companions.
4.2.2 Second language acquisition literature
Second language acquisition is the learning of a language other than the first after the first language
has been acquired (Ortega 2014). Theories, frameworks, and descriptions of second language acquisition
abound (for a more detailed overview, see Ortega 2014; Mitchell, Myles, and Marsden 2013; Lightbown
and Spada 2013). Therefore, I focus this section on literature related to Indigenous language learning and
systems, emerging bilingual conversational behaviors, and pedagogy that supports these behaviors.
4.2.2.1 Indigenous language pedagogy and language learning systems
The relevant literature frequently mentions that learning an Indigenous language may differ from learning
other L2 languages. Crow and Parsons (2015) note that Indigenous language learning and revitalization
“is as much a process of cultural rejuvenation as it is linguistic rejuvenation.” Some challenges are that
there may be fewer fluent speakers, often due to language suppression and forced assimilation. Many of
these speakers may be dispersed geographically rather than concentrated in a particular area. In addition,
there may be fewer published pieces of literature and fewer media publications, including audio. Finally,
English is the dominant and primary language for many American Indigenous tribes, even on reserves and
reservations. As a result, there are often few places and situations to use the Indigenous language within
its language community (White 2006).
One unique aspect of Indigenous language pedagogy is that some Indigenous languages are dormant,
meaning there are no current fluent speakers. As a result, Indigenous language education requires using
archival resources as teaching materials. Additionally, for some languages, it may be the case that not
all teachers are trained in pedagogy but rather know enough of the language or are motivated to teach
others. Also, there may not be enough teachers to teach all possible learners of all ages. Additionally, not
all teachers may be trained to teach both adults and children; thus, some educational material may not benefit
all students equally (Lukaniec and Palakurthy 2022).
Motivation for second language learners is frequently discussed in the second language literature (Ortega 2014; Lightbown and Spada 2013; Mitchell, Myles, and Marsden 2013). The motivation for Indigenous language learners might differ from that of other language learners in that they will rarely learn
the language for financial or tourism reasons. Instead, most Indigenous language learners are learning
to reconnect personally with their ancestry and to connect to their cultures and communities (B. Alexander
2018).
As discussed previously in Chapter 2, there is little representation of non-majority languages in technology⁶. However, Crow and Parsons (2015) state that a critical part of revitalizing languages and cultures
is creating and distributing content through different forms of media, especially technology, such as their
Māori game app. Begay (2013) reviews several Indigenous language learning apps and notes that while
there are several Indigenous language learning apps and programs, none of the ones included in their review provided scenarios in which to use the language in a conversational setting, included a chatbot, or
made use of code-switching.
Most prior Indigenous language technology has revolved around creating tools for L2 learning. The
shift of Indigenous people away from ancestral homelands and reservations to urban locations has been
a continuing challenge for language revitalization efforts, as it results in a lack of access to other speakers
as well as fewer opportunities to use the language regularly (Cassels and Farr 2019). Thus, technology has
offered new language learning opportunities and allows speakers to connect despite geographic distances
(Shilling 2020). Additionally, learning through technology allows learners to make mistakes privately
⁶ See https://lt4all.org/en/index.html
rather than in front of a teacher or an Elder, which can increase a learner’s comfort level and confidence
(Begay 2013). Early iterations of L2 learning programs were on CD-ROMs (Auld 2002), then shifted online
(McHenry 2002; Haag and Coston 2002), and more recently to mobile apps (Begay 2013).
It is noted that technology is not intended to be a singular solution for revitalizing a language, as one
mobile app alone will not produce a perfectly proficient speaker. Rather, L2 learning technology is meant
to be one tool among many and can successfully serve as a supplement for independent and motivated
Indigenous language learners (Cassels and Farr 2019). As Mark Turin, former chair of the First Nations
and Endangered Languages Program at the University of British Columbia, states, “tools and technology
don’t save language — speakers do” (Karstens-Smith 2018).
For emerging Choctaw speakers, the Choctaw Nation of Oklahoma has a long history of using technology to teach the language. Telecourses were first introduced in 2000 (Haag and Coston 2002), and
teaching the language over Zoom became a popular mode of teaching and learning the language during
the pandemic, a trend that has continued to the present day⁷.
4.2.2.2 Emerging Bilingual Conversational Behaviors and Translanguaging
It is typical behavior for emerging bilinguals to combine elements from various resources in some or all of
the languages they know in order to communicate (in other words, to code-switch), and they may do so in ways that may
not be entirely grammatical (Cenoz and Gorter 2017). Nevertheless, they can still co-construct meaning with interlocutors (Canagarajah 2011). It is also typical that speakers attempt to negotiate meaning
through code-switching in casual conversations where the language choice is not fixed and interlocutors
share multiple languages in common (Auer 1995b).
While learning a language will ultimately be an individual endeavor, appropriate supportive pedagogy
can improve and enhance the process. For many years, “immersion” monolingual-style second-language
pedagogy was a standard practice, where learners would interact only in the language to be learned in the
⁷ https://www.choctawnation.com/about/language/classes/
classroom. However, translanguaging—or the intentional and systematic use of multiple languages in a
learning setting—has recently become a more studied and accepted practice in second-language pedagogy
in general and for teaching endangered languages (Cenoz and Gorter 2017).
To differentiate between code-switching and translanguaging, the literature indicates that code-switching
is the shifting between two languages in any conversational setting, while translanguaging is the second
language instruction approach of purposefully encouraging emerging bilinguals to use all their languages
to fit their communicative needs in a learning setting. The instructor gradually removes that support as
learners become more advanced in the second language (Cenoz and Gorter 2017; Makalela 2015).
Translanguaging was first introduced in the bilingual education of English and Welsh. The approach was
based on a series of proposals focusing on how multilingual individuals communicate (Cenoz and Gorter 2017).
The translanguaging approach emphasizes maximizing the ability to interact and participate orally,
even if not entirely in the target learning language (Makalela 2015; García 2009). This is not necessarily
to say that learners can altogether forgo making an effort to learn the target language; instead, they are
given the freedom to use their other language(s) to fill in places in the learning language that they do not
know yet (Dougherty 2021). In contrast, immersion-style teaching often reprimands, discourages, or even
ignores statements in the non-target language.
Benefits of translanguaging
The literature indicates two main benefits of translanguaging: (1) language acquisition and (2) a student’s psychological well-being.
In monolingual educational settings, when emerging bilinguals are confronted with a language feature they don’t understand, they tend not to address the confusion unless necessary, hoping that further
occurrences of the item will provide more context and they will be able to understand later (Canagarajah 2011). This frequently observed tendency is summarized as the “let it pass” principle (Firth 1996). A
negative impact on comprehension has been observed when no additional occurrences arise to support understanding. However, in classrooms where translanguaging was introduced, one study found enhanced learning
outcomes for second-language learners of English at the kindergarten and collegiate levels, as fewer “let it
pass” instances occurred (Champlin 2016). In a real-world example, it was found that Māori literacy levels
increased when students were allowed to use their first language, English, to process Māori texts (Lowman
et al. 2007). Māori is an Indigenous language spoken in New Zealand.
Translanguaging has also been shown to be psychologically beneficial for emerging bilinguals. First,
it legitimizes a student’s relationship with both languages and begins the process of the student self-identifying as a speaker of both languages (Makalela 2015). Being encouraged to speak and freely use
all of their resources rather than suppress specific repertoires can promote students’ self-confidence. As
one student interviewed in Makalela (2015) said about learning Sepedi (L2), a language spoken in South
Africa, and using isiZulu (L1), a Bantu language also spoken in South Africa, for translanguaging in the
learning process:
“The fact that I was allowed to use my own language in this class made me feel at home. Both
sides of me were brought into play. One side I was an expert in my own language and the other
side a novice in the new language. This gave me control over what I was going through—a
balance I don’t usually get when I learn a language or content subject.”
Instructors also noted a difference in the learners’ attitudes when encouraged to use the translanguaging approach. One instructor interviewed in Dougherty (2021) said:
“I tend to pay attention a lot to how they [students] are emotionally. I know if this was purely
English, they would have a lot more difficulties understanding the learning or even making
friends with others. So because everyone is equal with Spanish and English, it gives them
[students] a boost of confidence and self-assurance because they can use both [languages].
I think it mostly helps them feel confident because they can use the other language and it’s
a resource for them. For example, there are moments when a student sees they don’t know
English, but they can go back to Spanish and that elevates them.”
Translanguaging and Minority languages
General challenges when teaching a minority language are discussed in Section 4.2.2.1. The literature identifies potential issues regarding translanguaging for minority languages, such as Indigenous
languages.
With regard to adopting translanguaging in teaching minority languages such as Indigenous languages,
some instructors of American Indigenous minority languages have voiced concern that the translanguaging approach might “dilute the integrity of Native American languages” and that it “violates the elders’
rule of mutually assured separatism” (Canagarajah 2011). Additionally, not all Indigenous communities
will be open to code-switching. The Tewa, an American Indigenous tribe along the Rio Grande River, have
a long history of keeping languages separate, not even adopting Spanish loan words following contact
but instead converting those loan words through Tewa word building. As one article notes, “the tendency to mix
language is met with disdain” (White 2006).
However, for others, translanguaging is an accepted and encouraged practice in some Indigenous learning spaces, for example, in Cherokee language learning classrooms in Oklahoma (Peter et al. 2017), where
it has been successful for emerging Cherokee speakers.
Translanguaging strategies
Like any pedagogical approach, there are strategies to use translanguaging effectively in a learning environment. The most common approaches were first outlined by Cenoz and Gorter (2017), shown in Figure
4.1, then expanded on by Seals and Olsen-Reeder (2020), shown in Figure 4.2. A final set of common strategies
(Dougherty 2021) is given in Figure 4.3. The strategies emphasize linking translanguaging to content in
lessons, especially important vocabulary. Another element is that the instructor should utilize translanguaging and encourage its use by individual students and within groups.
Figure 4.1: Translanguaging strategies from Cenoz and Gorter (2017)
Figure 4.2: Common translanguaging strategies from Seals and Olsen-Reeder (2020)
Figure 4.3: Additional translanguaging strategies from Dougherty (2021)
While translanguaging is frequently considered a verbal act (Canagarajah 2011), the literature supports
translanguaging in text form. The Masheli chatbot fits within translanguaging approaches since including
it in text-based learning activities also provides essential space for using the minority language (Seals
and Olsen-Reeder 2020). Additionally, many texts on the internet are multilingual (B. T. Williams 2009),
indicating real-world applications for emerging bilinguals to comprehend both languages in text.
As the content for Masheli is crafted in advance, guidance for spontaneous translanguaging responses
to the learning user will not be feasible with the current design. I have integrated the first two principles in
Figure 4.2 to guide the code-switching language used in the chatbot Masheli for this dissertation, explained
in detail in Section 4.4.2.2.
4.3 Masheli 1.0
In this version, I implemented a system that modeled inter-turn code-switching and shared stories as a
text-based chatbot (Brixey and Traum 2021). My research questions are as follows.
• M1.0-1: Do language learners enjoy a code-switching chatbot?
• M1.0-2: Do language learners like a chatbot that uses inter-turn code-switching?
I hypothesized that a user would experience an increase in conversational uency not only from the
opportunity to practice using the language but also because of the chatbot connecting to the user’s bilingual
identity through code-switching. This connection to identity could potentially increase the user’s rapport
or enjoyment of the interaction.
In this section, I describe the system design, the testing of the system with several Choctaw learners,
and the conclusions that guided the development of Masheli 2.0.
4.3.1 System Design
As the backend, all chatbot versions employ NPCEditor, a response classifier and dialogue management
system (Leuski and Traum 2011). NPCEditor uses a statistical classifier that is trained on linked questions
and responses. The classifier is trained on a question-answer corpus. For each user input, the classifier
ranks all the available responses. NPCEditor also contains a dialogue manager, which selects an appropriate response from the ranked responses. Previous applications of NPCEditor include interactive
characters in domains such as museum guides (Swartout et al. 2010), entertainment experiences (Hartholt,
Gratch, Weiss, et al. 2009), interviews with Holocaust survivors (Traum et al. 2015), and answering sexual
health questions (Brixey, Hoegen, et al. 2017).
I selected ten stories about animals from ChoCo (Brixey, Pincus, and Artstein 2018), my Choctaw
language corpus (described in Chapter 2), to form the chatbot’s domain knowledge. I selected stories
because pedagogy literature, especially for American Indigenous languages (Cantoni 1999), indicates that
story-based instruction is beneficial in language learning environments (Kickham 2015; Andrews, Hull,
and Donahue 2009). All stories are originally in Choctaw and have English translations. Stories were
entered in the chatbot in their original orthography and variant (see Chapter 2 for more on orthographies
and variants), even though some originated from Mississippi data while others were from Oklahoma.
4.3.1.1 Code-switching design
All QA corpus items were entered in Choctaw and English. There is no explicit module for recognizing
which language the user is communicating in. Instead, the language is detected by determining which
output has the best matching score, given the training data—the questions portion of the QA corpus—that
always matched outputs to the same language as inputs. In other words, English questions were trained
with English responses, and as a result, a novel English question from a user would be matched to an
English response based on the training data. A user typing in Choctaw would not receive an English
response, or vice versa.
I implemented a policy in the dialogue manager that allows users to request to see the opposite language of the previous chatbot utterance through statements like “Say that in Choctaw” or “Repeat in English.” The chatbot then repeats the same utterance or story in the opposite language.
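The language-implicit matching described above can be illustrated with a toy retrieval model. This is a simplified sketch, not NPCEditor's actual statistical classifier: bag-of-words overlap stands in for the trained model, and the question-response pairs are hypothetical examples in the system's style.

```python
# Toy sketch of language-implicit response selection: each training
# question is linked to a response in the same language, so whichever
# question best matches the input determines the response language.
# (Word overlap stands in for NPCEditor's statistical classifier.)

TRAINING = [  # (question, linked response) -- hypothetical pairs
    ("hello friend", "Hello!"),
    ("halito chim achukma", "Halito!"),
    ("tell me a story about a beaver", "[English beaver story]"),
    ("kinta anumpa nan anoli", "[Choctaw beaver story]"),
]

def overlap(user_input, question):
    """Score a training question by word overlap with the user input."""
    u = set(user_input.lower().split())
    q = set(question.lower().split())
    return len(u & q)

def respond(user_input):
    """Return the response linked to the best-matching training question."""
    _, response = max(TRAINING, key=lambda pair: overlap(user_input, pair[0]))
    return response

print(respond("halito"))  # Halito!
```

Because each training question is linked only to a same-language response, a Choctaw input matches Choctaw training questions and so receives a Choctaw response, with no separate language-identification step.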
4.3.1.2 Dialogue types and tags
The dialogue manager functionality within NPCEditor chooses the response from the response portion of the
QA corpus that the classifier ranked highest. The dialogue manager is the same for both Choctaw and
English responses. In Masheli 1.0 and 2.0, the QA corpus includes queries to hear a story, responses about
the chatbot itself, such as its name, and utterances that maintain dialogue flow, such as greetings and
management of off-topic questions. All responses in the QA corpus have a label denoting the dialogue
type of the utterance. Examples and definitions of each dialogue type are given in Table 4.1. There are nine
dialogue types: three dealing with pleasantries (“personal”, “greeting”, “meeting”), four to manage domain
knowledge (“don’t know”, “don’t have story”, “knowledge stories”, “off topic”), “repeat” for handling repeat-type requests, and the story type. Each story is indexed by an animal in it, and if a story contains multiple
animals, it was entered into the answer portion of the QA corpus multiple times. For example, a story
with a rabbit and a fox in it would be entered twice, once with a “rabbit” ID and once with a “fox” ID. The stories
might differ slightly, such as having the word “Ok” or “Ome” at the beginning. The same questions are then linked
to these two IDs. There are ten unique bilingual stories.
The dialogue manager can choose a lower-ranked response to avoid repetition. If the score of the
top-ranked response is below the threshold that was selected during training, the dialogue manager will
instead select a response that indicates non-understanding or that aims to end a conversation topic. For
example, the expression “Mihacha?” (“It really is, isn’t it?”) might be selected as a response when no other
Pleasantries
• Personal: Information about the character of the chatbot. Example: “My name is Masheli”
• Greeting: All greetings to the chatbot. Example: “Halito a̱kana!” (Hello my friend!)
• Meeting: Shared after asking and sharing names with the user. Example: “I am happy to meet you”
Manage domain knowledge
• Don’t know: Occurs when the user asks or states something that is outside the domain knowledge of the chatbot, such as “Will it rain today?” Example: “Ak ikhano.” (I don’t know.)
• Don’t have story: Used for any request for an animal not in one of the ten in the corpus. Example: “I do not have a story about that animal. Why don’t you tell me one that you know.”
• Knowledge stories: Responses to share the story knowledge base with the user; also triggered after multiple off-topic utterances from the user. Example: “I have stories (nan annoa or nan vnnoa) about hvchonchoba, kinta, issi, luksi, shu̱kvni, chulhkvn, shukhvta, shawi, chula, aka̱ka, nashoba, and chuk.”
• Off-topic: Occurs when the user states or asks something outside the domain knowledge of the chatbot, such as “I like cheese.” Example: “I̱ho?” (Really?)
Repeat
• Repeat: Request from the user to repeat its previous utterance in the opposite language
Story
• Story: Ten stories with unique animal identifiers; Choctaw cultural stories about animals
Table 4.1: Dialogue types in Masheli
response scores above the threshold. A counter keeps track of the number of consecutive times the chatbot
has used an “off-topic” response. On the third instance, a “knowledge stories” response suggests asking for
a story about a given animal. The counter restarts after giving a “knowledge stories” response.
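The fallback behavior just described can be sketched as a small policy object. The threshold value and response strings below are illustrative placeholders, not the system's trained values.

```python
# Sketch of the fallback policy: a below-threshold top score yields an
# off-topic response; the third consecutive off-topic turn is replaced
# by a "knowledge stories" prompt, after which the counter restarts.
# Threshold and response text are illustrative, not trained values.

THRESHOLD = 0.5  # placeholder; the real threshold is selected during training
OFF_TOPIC = "Mihacha?"  # "It really is, isn't it?"
KNOWLEDGE_STORIES = "I have stories about these animals: ..."

class FallbackPolicy:
    def __init__(self):
        self.consecutive_off_topic = 0

    def select(self, top_response, top_score):
        """Return the top response, or a fallback when the score is low."""
        if top_score >= THRESHOLD:
            self.consecutive_off_topic = 0
            return top_response
        self.consecutive_off_topic += 1
        if self.consecutive_off_topic == 3:
            self.consecutive_off_topic = 0  # restart after the prompt
            return KNOWLEDGE_STORIES
        return OFF_TOPIC

policy = FallbackPolicy()
replies = [policy.select("ignored", 0.1) for _ in range(3)]
print(replies)  # ['Mihacha?', 'Mihacha?', 'I have stories about these animals: ...']
```

Resetting the counter both on a confident match and after the "knowledge stories" prompt keeps the chatbot from repeating the story-topic suggestion on every subsequent low-confidence turn.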
The dialogue manager tracks the language of its last utterance using a tag at the end of the
dialogue type labels. An English utterance is differentiated from a Choctaw one by a tag “E” at the end
of the dialogue-type label. For example, one greeting is labeled “01greeting” (“Halito!”) for the Choctaw
utterance and “01greetingE” (“Hello!”) for the English. If the user requests the system to repeat its previous
statement in the opposite language, its next action is based on the presence or absence of an “E” tag.
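This “E” tag convention can be sketched as a label transformation. The label and utterance pair come from the example in the text; the helper names are my own.

```python
# Sketch of the "E" tag convention: the Choctaw and English variants of
# an utterance share a label, with "E" appended for English. A repeat
# request flips the tag on the last-used label and looks up the result.

RESPONSES = {
    "01greeting": "Halito!",   # Choctaw variant (from the example in the text)
    "01greetingE": "Hello!",   # English variant
}

def flip_language_tag(label):
    """Strip a trailing 'E' if present; otherwise append one."""
    return label[:-1] if label.endswith("E") else label + "E"

def handle_repeat(last_label):
    """Return the previous utterance rendered in the opposite language."""
    return RESPONSES[flip_language_tag(last_label)]

print(handle_repeat("01greeting"))   # Hello!
print(handle_repeat("01greetingE"))  # Halito!
```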
An example dialogue is in Figure 4.4, demonstrating some greetings, telling a story in Choctaw, and
then repeating the story in English when requested.
4.3.1.3 Orthographic considerations
To support the different possible orthographic standards from users, each training example in the question portion of the QA corpus was written in multiple formats. For example, the sentence “Do you know
a story about a woodpecker?” could be written as:
1. Biskinik am anumpa nan anoli ish i̱shi?
2. Biskinik a anumpa nan anoli ish i̱shi?
3. Biskinik an anumpa nan anoli ish i̱shi?
4. Biskinik a̱ anumpa nan anoli ish i̱shi?
5. Biskinik a̱ anumpa nan anoli ish inshi?
The examples demonstrate how a nasalized character, such as the a̱ that is the second word in the
Choctaw questions, might be represented by omitting the underlining or might be written with an explicit
m or n instead. At the same time, since there is no consistent use of a standardized orthography, an
individual might use a different representation for a nasalized character in another place in the same
sentence.
Figure 4.4: Example conversation with the chatbot in Masheli 1.0. Translations are in square brackets.
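Variant spellings of this kind can be generated mechanically from the underlined form. In the sketch below, the underline is assumed to be encoded as the combining macron below (U+0331); which of m or n a real writer chooses depends on phonological context, so both are generated here purely for matching coverage, not as a claim about valid Choctaw orthography.

```python
# Sketch of expanding a nasalized vowel (vowel + combining macron below,
# U+0331) into variant spellings: underline kept, underline dropped, or
# an explicit "m" or "n" written in its place.

MACRON_BELOW = "\u0331"  # combining macron below, marking nasalization

def nasal_variants(word):
    """Return spelling variants for the first underlined vowel in a word."""
    i = word.find(MACRON_BELOW)
    if i == -1:
        return [word]  # no nasalized vowel: nothing to vary
    before, after = word[:i], word[i + 1:]
    return [word,                  # underline kept
            before + after,        # underline dropped
            before + "m" + after,  # explicit m coda
            before + "n" + after]  # explicit n coda

print(nasal_variants("a" + MACRON_BELOW))
```

Expanding each training question through such a function is one way to cover several learner spellings with a single underlying entry, at the cost of also generating some forms no writer would actually use.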
4.3.2 Evaluation
I evaluated Masheli 1.0 using two methods: demoing it with English-Choctaw bilinguals learning the
Choctaw language and conducting language experiments.
4.3.2.1 User study
Four English-Choctaw speakers of varying fluency in a community Choctaw language class were recruited
to interact with Masheli. I briefly informed the participants of the system’s capabilities. All were encouraged to interact naturally with Masheli and for as long as they chose.
Three users explored the system’s “personal” and “greeting” types of dialogue. All users asked for a
story; two users asked for more than one story. All users interacted with the system for four or more turns;
the maximum number of turns was eleven. The shortest interaction was approximately five minutes, and
the longest was approximately fifteen minutes. In the longest interaction, the user only completed four
turns with the system but then spent most of the interaction time reviewing new vocabulary in the story.
Two participants only interacted with the system in Choctaw. These users asked the system to repeat
the last Choctaw utterance in English once, and they reported that they asked for the translation to see the
function, not necessarily to aid comprehension. The other two users interacted with the chatbot almost
entirely in English. To access the Choctaw stories, these users asked for a story rst in English and then
requested a translation.
All of the speakers who tried the system had learned the Oklahoma orthography. However, no users
found it unusual for the chatbot to use multiple writing systems. Three users did not report issues with
reading stories in the Mississippi orthography; the final speaker did report issues.
All speakers orally reported that it was an enjoyable experience overall, from enjoying the stories to
being able to alternate between languages. One critique was that it was uncomfortable for the system to
give an entire story in one dialogue turn. They reported that it would be more comfortable to read, and
better paced, if the story were broken into multiple turns.
4.3.2.2 Language experiments
As noted above, the chatbot does not have a language detection module. Responses are selected based
on how similar words in the user’s input are to words in the question-answer training pairs. In most
tested cases, the system responded in the language corresponding to the input. However, errors and code-switching within a conversation turn are typical behaviors of language learners. I thus experimented with the
chatbot’s ability to cope with non-standard language utterances.
Figure 4.5 shows that the system could not handle sentences with intra-sentential code-switching. In
the first turn, the user asks for a story about a deer, but the system responds with an off-topic response
in Choctaw, meaning “that’s it”. In the second turn, the system responds in English when the user’s input
is only one-third English. When the sentence is precisely half English and half Choctaw, as in turn five,
Figure 4.5: Example of code-switching within one user turn in Masheli 1.0. Translations are in square
brackets.
the system responds in Choctaw, offering a list of story topics. Using the word deer in Choctaw produces
an off-topic response (turn four), but using it in English (turn nine) results in a deer story response.
A second consideration is that learners’ input may contain errors. In Figure 4.6, the user requests to
hear a story about a beaver. However, in the first turn, the input uses a non-standard spelling for beaver
and places it at the end of the sentence, producing an ungrammatical sentence in Choctaw. In the third
turn, the user makes the same grammatical error and uses a non-standard spelling for possum. In the fifth
turn, the syntax improves, but the word for possum is still a form not in the QA corpus. In the remaining
turns, the system is unable to cope with the different spellings for beaver (“kinta”), deer (“issi”), and wolf
(“noshoba”).
Figure 4.6: Errors in user input in Choctaw. Translations are in square brackets.
4.3.2.3 Summary
This initial study demonstrated user interest in a bilingual Choctaw-English chatbot for Choctaw learners
to practice conversational fluency. The four participants in the pilot study expressed enjoyment with the
dialogue design. Experiments to further test more natural learner expressions indicate the chatbot should
be trained on additional code-switching examples. The chatbot at this stage also does not code-switch
intra-sententially; adding that capability could help learners feel more comfortable with the chatbot and
might offer additional learning opportunities.
4.4 Masheli 2.0
The first goal of the second implementation of Masheli, or Masheli 2.0, was to allow users to use more
intra-sentential code-switching, as well as to incorporate more examples of errors that users might utter into
the question portion of the QA corpus. A second goal was for the system itself to use intra-sentential code-switching. An additional literature review indicated that translanguaging could be an effective means for
the system to code-switch and would also support the learning goals.
In this section, I will detail my research questions and hypotheses related to Masheli 2.0, describe a
human subjects study, and discuss the study results.
4.4.1 Hypotheses
To briefly summarize the prior research (Section 4.2.2.2), translanguaging and using an already known
language can enhance a learner’s learning gains and sense of comfort learning in a classroom setting with
human-human interactions (Butzkamm and Caldwell 2009). The literature also shows that code-switching
can lessen the feeling of distance between conversational human interlocutors.
I was also interested in how users would perceive code-switching that followed linguistic frameworks.
Creating grammatically correct code-switching utterances that switch only at clauses and noun phrases
requires linguistic expertise. If users do not perceive code-switching that ignores a linguistic framework
as problematic, this would decrease the amount of language knowledge needed to create a code-switching
dialogue system.
Masheli 2.0 investigated these findings in a human-chatbot scenario.
My main research questions are:
• M2.0-1: Will translanguaging lead to a better user experience? Will users show a higher preference
for the translanguaging chatbot?
• M2.0-2: Will translanguaging lead to an increase in learning?
• M2.0-3: Will users prefer a chatbot that uses a linguistic framework for code-switching over one
that does not (non-framework)?
My experiments test a monolingual Choctaw chatbot against a code-switching chatbot. The two code-switching chatbots, one following a framework (referred to as “framework”) and one not (“non-framework”), would
also be compared against each other. Based on the literature, my hypotheses for this experiment are as
follows.
1. Code-switching bilingual chatbots that use translanguaging techniques lead to a better learning
experience, possibly through learning gains or a greater sense of rapport, comfort, or enjoyment for
language learning users.
2. Modeling linguistic aspects of bilingualism, such as code-switching that follows linguistic frameworks, will lead to a better learning experience than the monolingual Choctaw chatbot, demonstrated through higher levels of learning gains, rapport, or enjoyment.
3. Users will have a higher level of enjoyment with the framework code-switching chatbot over the
free code-switching chatbot.
4. Users will demonstrate the highest learning gains with a code-switching system.
5. Users will have a lower user experience with the monolingual system than with one of the code-switching bilingual systems.
I developed three systems to test my hypotheses: a monolingual Choctaw chatbot, a code-switching
chatbot that follows a framework, and a code-switching chatbot that does not follow translanguaging or
a linguistic framework.
4.4.2 System design
The design was largely carried over from Masheli 1.0 in that the same dialogue types were used. I removed
the functionality to repeat a phrase in the opposite language so that users would only experience the
monolingual or one of the code-switched systems. I also removed two stories from the answer portion of
the corpus because they were not parables; the remaining eight stories then formed a cohesive genre of
parables. All of the systems have around 60 responses in the answer portion of the chatbot.
4.4.2.1 Additional questions in QA corpus
I wrote a Python script to generate additional questions to add to the question portion of the QA corpus.
The script included several sentences with predominantly English syntax, such as "Can I have a story
about ..." or "Tell me about ..." and the list of animals from the stories in Choctaw to be added at the end of
the sentence. The result produced a sentence like "Can I have a story about shawi?" (Can I have a story
about raccoons?) As animal vocabulary is some of the earliest words taught in the Choctaw curriculum,
I expected users to refrain from typing a largely Choctaw sentence with an English word. As such, I did
not generate Choctaw sentences with English animal words.
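A minimal sketch of such a generation script follows. The template strings and the animal spellings here are illustrative stand-ins, not the exact lists used in the study:

```python
# Sketch of the QA-question generator: English-syntax templates with a
# Choctaw animal word slotted into the final position of the sentence.
# The templates and animal words below are illustrative examples only.
TEMPLATES = [
    "Can I have a story about {animal}?",
    "Tell me about {animal}.",
    "Do you know a story about {animal}?",
]

# Choctaw animal words drawn from the stories (spellings vary by dictionary).
ANIMALS = ["shawi", "kinta", "issi", "nashoba", "chukfi"]

def generate_questions(templates, animals):
    """Produce one code-switched question per template/animal pair."""
    return [t.format(animal=a) for t in templates for a in animals]

questions = generate_questions(TEMPLATES, ANIMALS)
print(len(questions))   # 3 templates x 5 animals = 15 questions
print(questions[0])     # "Can I have a story about shawi?"
```

Each generated sentence was then added to the question portion of the QA corpus, mapped to the corresponding story response.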
The monolingual Choctaw version of Masheli 2.0 was intended to mimic an immersion-style pedagogy,
so I only added a handful of English and code-switched sentences. Most of these were mapped to an off-topic response encouraging the user to speak in Choctaw. This type of response aligns with the immersion-style curriculum, which will ignore or discourage statements made in the non-target language (see Section
4.2.2.2).
4.4.2.2 Generating code-switched QA corpus
I created handcrafted responses for the three chatbots. This section describes the design for each of the
three chatbots in Masheli 2.0.
Monolingual Choctaw
I removed all code-switched and English responses from Masheli 2.0. For the code-switching systems,
conversational items such as greetings, asking how the other is doing, and closures are all exclusively left
in Choctaw. According to the literature and what was demonstrated in my recorded conversation data (see
Section 3.0.2), these items tend not to show code-switching.
Framework code-switching
To incorporate translanguaging strategies, I repeated key vocabulary needed to understand the story in English in parentheses. Repetition is one non-spontaneous strategy for effective translanguaging (Seals and
Olsen-Reeder 2020). The examples in Table 4.2 show how code-switching and translanguaging were incorporated into a story. In the framework-based code-switching (line 4) story, the words for alligator, beaver,
tail, boat, and snake are key vocabulary for understanding the story and animal vocabulary intended to be
gained during the interaction.
English: One day a man riding in a boat came to the end of the water.
Monolingual Choctaw: Mak atoko̱ nittak himona ka̱ hattak mvt oka peni fokka osh ont aivhli ma̱ ona tok.
Insertional-Cho matrix: Mak atoko̱ nittak himona ka̱ hattak mvt a boat fokka osh ont aivhli ma̱ ona tok.
Clausal-Cho matrix: One day, hattak mvt oka peni fokka osh ont aivhli ma̱ ona tok.
Insertional-Eng matrix: One day a man riding in oka peni came to the end of the water.
Clausal-Eng matrix: Mak atoko̱ nittak himona ka̱ a man riding in a boat came to the end of the water.
Repetition: One day a man riding in oka peni (a boat) came to the end of the water.
Table 4.2: Framework-based utterance examples. English portions are bolded in code-switched utterances.
Framework-based code-switching was generated in two options, insertional and switching at clauses,
which follows the linguistic literature on code-switching and the model described in Ahn et al. (2020).
There were two options for the matrix language, either Choctaw or English.
Not every sentence in a story includes code-switching. Instead, I aimed for roughly 75% of a given
Choctaw story to have code-switching.
Non-framework code-switching
To create the non-framework utterances, I first established that any part of speech could be a point
for switching, including question words, nouns, verbs, and morphemes. The utterances still had to be
grammatically correct. Examples of non-framework utterances are in Table 4.3.
English: One day a man riding in a boat came to the end of the water.
Monolingual Choctaw: Mak atoko̱ nittak himona ka̱ hattak mvt oka peni fokka osh ont aivhli ma̱ ona tok.
Non-framework Cho matrix: Mak atoko̱ nittak himona ka̱ man mvt oka peni fokka the end ona tok.
Non-framework Eng matrix: One nittak himona a man riding in a boat came to the end of the water.
Table 4.3: Non-framework utterances said by the chatbot. English portions are bolded in code-switched utterances.
Again, only some sentences in a story contained code-switching. First, I randomly generated a number
between 0 (monolingual English) and 1 (monolingual Choctaw) for a given sentence. I chose to randomize
at the sentence level rather than the story level, as overall code-switching coherence would not be expected
in random code-switching. I also randomly selected whether the sentence matrix would be English or
Choctaw.
To determine which words to code-switch, I used a random number generator. If two words in a 16-word sentence needed to be code-switched, the number generator provided values corresponding to positions in
the sentence’s word order, and I then applied those numbers to the sentence. I made allowances when the
code-switch would be ungrammatical by choosing the next nearest neighbor. If the word was part of a
phrase, I code-switched the whole phrase. For example, suppose the numbers 6 and 12 were selected for the
Choctaw matrix sentence: Mak atoko̱ nittak himona ka̱ hattak mvt oka peni fokka osh ont aivhli ma̱ ona
tok. I could not switch “ont” alone, as it is part of the phrase “ont aivhli ma̱,” which is “the end.”
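The randomized index selection described above can be sketched as follows. The phrase-grouping and grammaticality checks were applied by hand in the study; the `phrases` argument here is a simplified, hypothetical stand-in for that manual step:

```python
import random

def pick_switch_indices(n_words, n_switches, phrases=(), rng=None):
    """Pick 1-based word positions to code-switch in an n_words sentence.

    `phrases` is a sequence of index tuples; if a chosen index falls inside
    a multi-word phrase, the whole phrase is switched (mirroring the manual
    rule of switching "ont aivhli ma" as a unit rather than "ont" alone).
    """
    rng = rng or random.Random()
    chosen = set()
    for idx in rng.sample(range(1, n_words + 1), n_switches):
        for phrase in phrases:
            if idx in phrase:          # expand the switch to the full phrase
                chosen.update(phrase)
                break
        else:
            chosen.add(idx)
    return sorted(chosen)

# 16-word Choctaw sentence; suppose "ont aivhli ma" occupies positions 12-14.
print(pick_switch_indices(16, 2, phrases=[(12, 13, 14)], rng=random.Random(0)))
```

The sentence-level coin flip between monolingual English and monolingual Choctaw, and the random choice of matrix language, would be drawn the same way with `rng.random()` and `rng.choice()`.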
4.4.3 Methods
To measure the experience, I designed a language test to measure the user’s learning gains and a survey
to measure the user’s sense of rapport, comfort, and enjoyment.
4.4.3.1 Language Test
I created a language test to be administered before and after the interaction. The test illustrates whether
learners gained any new vocabulary, as demonstrated by questions 1-12, or any new syntax, as demonstrated by the final three questions.
1. What is the word for deer in Choctaw?
2. What is the word for ant in Choctaw?
3. What is the word for turtle in Choctaw?
4. What is the word for possum in Choctaw?
5. What is the word for alligator in Choctaw?
6. What is the word for fox in Choctaw?
7. What is the word for beaver in Choctaw?
8. What is the word for wolf in Choctaw?
9. What is the word for spider in Choctaw?
10. What is the word for chicken in Choctaw?
11. What is the word for raccoon in Choctaw?
12. What is the word for rabbit in Choctaw?
13. How would you say, "What is your name?" in Choctaw?
14. How would you say, "I am doing well." in Choctaw?
15. How would you say, "Do you know a story about deer?" in Choctaw?
The language test also served to inform all participants about the chatbot’s domain knowledge of
animal stories, a fact given in the instructions read to each participant, so that participants would have
more consistent experiences and not have to spend time discovering which stories the chatbot knows.
4.4.3.2 Survey design
The survey was designed to evaluate the user’s sense of rapport, the naturalness of the code-switching,
and the feeling of connection because of language identity.
Conversational systems are often evaluated based on how efficiently they generate responses, the quality of those responses (such as their appropriateness), and the user’s overall experience, including their
satisfaction or enjoyment of the interaction (Shawar and Atwell 2007a). For the last metric, user ratings
are typically averaged to provide a generalized assessment of the system’s performance. Variations in individual ratings may be influenced by personality traits like extraversion and agreeableness, with more
agreeable users sometimes giving higher scores than less agreeable ones (Papangelis et al. 2022).
The survey consisted of twelve 5-point Likert scale questions, and the answers were scored from 1
strongly disagree to 5 strongly agree. Many questions came from previous research on rapport (Novick
and Gris 2014; Gratch et al. 2007). Questions 7 and 10 are novel and tailored to this experiment. The final
two questions were open-ended questions where participants could write sentences to respond. All survey
questions were optional, and participants could choose to skip any questions.
1. The system understood me.
2. The system seemed unengaged.
3. The system was friendly.
4. The system and I worked towards a common goal.
5. The system and I did not seem to connect.
6. I didn’t understand the system.
7. The system knows the Choctaw language.
8. The interaction was interesting.
9. The interaction felt natural.
10. I felt the system and I were in the same social group.
11. I would be willing to continue the conversation with the system for longer.
12. I would recommend interacting with this system to a friend.
13. Was there anything else that you wanted to talk to the system about? (open-ended)
14. Do you have any other comments to share about your experience? (open-ended)
Questions were selected to determine levels of rapport (1, 2, 4, 5, 6, 9) and engagement and connection
(3, 8, 10, 11, 12). I hypothesized that the code-switching cohort would score the chatbot higher on these
questions. The survey also measured people’s perception of the chatbot’s knowledge of the Choctaw
language (7) to gauge how users perceived the fluency of the chatbot’s code-switching. I hypothesized
participants might score the non-framework chatbot lower than the others.
4.4.3.3 Experiment session
Participants were first given a consent form to read and sign in the experiment session. Following an
oral explanation of the information in the consent form, the experiment began. The first task was for
participants to complete the language test (see list of questions in 4.4.3.1). After this, participants interacted
with the chatbot for 15 minutes. The participants then completed the language test again. Last, participants
completed the post-interaction survey to rate their experience and provide any additional comments.
Participants were encouraged by the instructions to have a dictionary handy prior to the interaction.
If participants did not have a paper copy of the dictionary, I sent URL links to two dictionaries.8, 9
4.4.3.4 Inclusion and exclusion criteria
The inclusion criteria for the study required participants to follow instructions closely, engage meaningfully with the tasks, and provide appropriate, on-topic interactions with the chatbot. Participants were
8 https://www.webonary.org/byington-choctaw/
9 https://dictionary.choctawnation.com/word/
instructed to communicate with me during the interaction portion of the session only if they had questions regarding the instructions or encountered technical difficulties. Each participant was expected to
complete all tasks as outlined, including language proficiency tests, survey questions, and structured interactions with the chatbot, ensuring the integrity and relevance of the data. While participants were not
required to spend the full 15 minutes interacting with the chatbot and would not be excluded for choosing
to finish early, they were encouraged to spend as much time as needed referencing the dictionary. No
specific number of chatbot interactions was expected, but participants were required to engage with the
chatbot for at least one turn to demonstrate participation.
Exclusion criteria included making multiple consecutive off-topic utterances to the chatbot, inappropriate comments directed toward either the chatbot or me, and off-topic responses in the survey. Non-engagement was identified by discussing unrelated topics aloud with me during the chatbot interaction,
except when addressing technical issues or seeking clarification on instructions.
4.4.4 Results
In total, 23 participants completed the experiment. Twelve participants interacted with the monolingual
Choctaw chatbot, while eleven participants interacted with the framework-based code-switching chatbot. One participant from the monolingual chatbot condition met the exclusion criteria, so their chat log, survey
responses, and language test responses were omitted. There was insufficient recruitment to test the non-framework code-switching chatbot condition. For simplicity, I refer to the monolingual Choctaw chatbot
as the monolingual condition and the framework-based code-switching chatbot as the code-switching condition for the remainder of this chapter.
In addition, since some participants expressed interest in interacting with the chatbot again, I invited
them to chat with it again during one weekend within the month in which the experiments were held. The conversations that day were recorded via a log, but no information was noted about who spoke to the chatbot at
any given time.
Two participants requested to finish the chat portion of the experiment session early. The data collected
from these participants was retained as they followed all experiment protocols and engaged in conversation
with the chatbot, albeit for a shorter amount of time. One user interacting with the monolingual chatbot
requested that the session be ended early, after 6 minutes, citing frustration and disinterest. Additionally,
one user interacting with the code-switching chatbot also asked to end the session early, after 13 minutes,
stating they had no additional topics to speak with the chatbot about. Many participants asked the chatbot
for definitions and translations despite having a dictionary available to them; providing definitions and
translations for individual words is a consideration for future work.
In this section, I first analyze the language test results (Section 4.4.4.1), where I determine the change
in the number of attempted and correct responses. Next, I analyze the survey responses (Section 4.4.4.2).
I use statistical tests to determine the significance of differences between the groups. I also evaluate open-ended survey
question responses. Finally, I analyze the logs recorded from each interaction (Section 4.4.4.3). I will detail
the number of stories requested, annotate and describe the types of utterances participants made in the
interactions, and detail the number of total words as well as words used in Choctaw and English.
4.4.4.1 Language Test
All language tests (pre- and post-test) were scored for two factors. The first factor was how many questions
were attempted, regardless of correctness. The second factor was correctness. A correct answer was one
point; thus, a perfect score on the quiz would be 15.
Figure 4.7: Number of attempted questions for each question on the language test per group. "Pre" indicates
it was before the interaction.
For the first 12 questions on the language test, I applied a rubric for grading the questions. Since
Choctaw is not standardized and can require a keyboard with unique characters, I made allowances for
differences in spelling. One participant informed me that a teacher at the School of Language teaches their
students to use capitals for nasalized characters (A in place of a̱). Next, I considered different spellings in
the dictionaries. Alligator is written hVchonchoba in both dictionaries; however, I marked it as acceptable
to spell the word as hvchonchoba or hachonchoba as well (note: the sound represented by the upsilon
character is a short a). I deducted half of a point if an extra syllable was added, such as hvchonchonchoba,
if a vowel was sufficiently incorrect to impact the pronunciation, such as hvchanchoba, or if a consonant
was substantially incorrect, like hachonchola.
I deducted half of a point for the remaining three questions if the words were correct but the ordering
was off. For example, in the first of these questions, “What is your name?”, the proper order is “Chi hochifo yVt
nanta?”, so the response “Nanta chi hochifo yvt?” would lose half of a point. I also removed
half of a point for incorrect pronouns, such as “Sv hochifo yVt nanta?” (What is my name?)
Figure 4.8: Number of correct answers for each question on the language test per group. "Pre" indicates it
was before the interaction, "post" indicates after.
Figure 4.7 and Figure 4.8 show the number of attempted and correct answers on the language test.
The full results for each participant’s language test scores are in the appendices. The first question on
the language test, “What is the word for deer in Choctaw?”, was known by all participants before the exam.
The questions with the greatest number of attempts for both groups were 10 (chicken), 12 (rabbit), and 13
(“How would you say, ‘What is your name?’ in Choctaw?”). The highest numbers of correct answers for
both groups were on 11 (raccoon), 12, and 14 (“How would you say, ‘I am doing well.’ in Choctaw?”). The
lowest attempted and lowest scoring questions were 2 (ant), 7 (beaver), and 15 (“How would you say, ‘Do
you know a story about deer?’ in Choctaw?”).
Next, I evaluated the average change for attempted and correct responses. The question with the greatest change between pre and post-interaction attempts and correct answers was question 4 (possum). The
average change in the number of vocabulary questions attempted was 1.18 for the monolingual group and
1.36 for the code-switching group. This indicates the code-switching group was slightly more inclined to
                                 Mono    CSW
Change in attempted vocabulary   0.84    0.27
Change in correct vocabulary     0.97    0.57
Change in attempted grammar      0.71    0.13
Change in correct grammar        0.97    0.92
Table 4.4: Measurement of change between pre- and post-language tests
try more questions after interacting with the chatbot. The average change in correct answers for vocabulary questions for the monolingual group was 1.5, while the code-switching group was 1.36. This indicates
that all groups benefited from the interaction, with the monolingual group improving slightly more. No
participants had decreased test scores. Several participants in both groups showed no improvement via
the language test. The participant with the greatest improvement was in the code-switching group, with
a gain of 4.5 points between pre- and post-interaction tests, and this participant also showed the greatest change in the number of questions attempted. For the grammar questions of the language test, the
monolingual group showed an average increase of 0.1 in the number of grammar questions attempted.
In contrast, the code-switching group had an average increase between pre and post-grammar questions
attempted of 0.09. The monolingual group improved in correct responses on average by 0.2 points, while
the code-switching group improved by 0.18.
Finally, I used multiple linear regression to determine whether there was a significant amount of change in
the vocabulary learned or the amount of grammar learned or attempted, shown in Table 4.4.
The results indicate no significant changes were observed. There was a slightly larger rate of change in
the number of grammar questions attempted for the code-switching group. Using an a-priori sample size
test with an anticipated effect size of 0.5 to understand what population size I would have needed to see
significant results, more than 100 participants would have been needed to see a significant result between the groups
for the change in attempted and correct answers for vocabulary and grammar.
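The a-priori sample size calculation can be reproduced with the standard two-sample normal approximation, n per group ≈ 2 * ((z(1-alpha/2) + z(power)) / d)^2. A sketch using only the Python standard library follows; alpha = 0.05 and power = 0.80 are the conventional defaults assumed here, since the text states only the effect size of 0.5:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """A-priori sample size per group for a two-sample t-test
    (normal approximation to the power calculation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided rejection criterion
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n = n_per_group(0.5)
print(n, 2 * n)   # about 63 per group, ~126 total: more than 100 participants
```

With a medium effect size of 0.5 this yields roughly 63 participants per condition, consistent with the statement that more than 100 participants in total would have been needed.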
These results indicate that there was a small amount of change in the number of questions attempted
between the groups, but not a significant effect depending on which chatbot a participant was paired with.
Nevertheless, the interaction was beneficial, as some increase was seen for both groups.
The main message from the language test analysis is that I did not observe a significantly higher level
of learning with the monolingual chatbot, which would have been an “immersion” style of learning, over
the translanguaging, code-switching chatbot. An overall positive finding is that language learning is happening with either chatbot. It is possible learning would be more detectable after a longer conversational
interaction or after multiple interactions.
4.4.4.2 Responses to survey
The survey that all participants completed following the interaction consisted of 12 questions scored on
a Likert scale. The results of comparing the two groups’ survey responses using a one-tailed T-Test are
shown in Table 4.5. The table also shows the average score for each group, with the standard deviation
given in parentheses next to the mean value.
Question T-test result Mean Mono (std dev) Mean CSW (std dev)
1 The system understood me. 0.25 3(1) 3.54(1.03)
2 The system seemed unengaged. 0.73 2.54(1.50) 2.36(1.36)
3 The system was friendly. *0.07 3.18 (1.47) 4.36 (0.80)
4 The system and I worked towards a common goal. 0.65 3.36 (1.36) 3.54 (0.82)
5 The system and I did not seem to connect. 0.34 2.90 (1.51) 2.45 (1.12)
6 I didn’t understand the system. 1 2.54 (1.29) 2.54 (1.29)
7 The system knows the Choctaw language. 0.64 3.9 (0.87) 4 (0.89)
8 The interaction was interesting. 0.16 3.63 (1.36) 4.36 (0.92)
9 The interaction felt natural. 0.19 2.81 (1.32) 3.45 (0.82)
10 The system and I were in the same social group. 0.16 2.45 (1.21) 3.18 (1.16)
11 I would be willing to continue the conversation with the system for longer. 0.13 3.63 (1.74) 4.45 (0.52)
12 I would recommend interacting with this system to a friend. *0.06 3.63 (1.62) 4.54 (0.52)
Table 4.5: The results of comparing survey responses between the monolingual and code-switching interactions. p<0.10 results are marked with one asterisk. Standard deviations are given in parentheses next to
the average in the final two columns.
There are two p<0.10 values, one responding to the question, “The system was friendly.” The code-switching group scored their chatbot as friendlier than the monolingual group did. Additionally, the code-switching group reported that they would be more likely to recommend interacting with the system to a
friend. While these are not significant findings, they are suggestive of differences between the groups.
I then analyzed the survey responses by clustering the questions by rapport (1, 2, 4, 5, 6, 9) and engagement and connection (3, 8, 10, 11, 12). By clustering, I summed the scores for the given questions for each
participant. For negatively phrased questions (2, 5, 6), I reversed the polarity. The p-value for the clustered
questions on rapport was 0.24. The p-value for engagement and connection was 0.04, a significant value.
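The clustering step can be sketched as follows. The question numbers and the 5-point reversal rule come from the survey above; the sample participant responses are invented purely for illustration:

```python
# Question clusters from the survey (question numbers as listed above).
RAPPORT = [1, 2, 4, 5, 6, 9]
ENGAGEMENT = [3, 8, 10, 11, 12]
NEGATIVE = {2, 5, 6}   # reverse-scored: on a 5-point scale, x -> 6 - x

def cluster_score(responses, questions):
    """Sum one participant's Likert responses over a question cluster,
    reversing polarity for negatively phrased items.

    `responses` maps question number -> raw 1-5 score.
    """
    total = 0
    for q in questions:
        score = responses[q]
        total += (6 - score) if q in NEGATIVE else score
    return total

# One invented participant: positive overall, so negative items score low.
responses = {1: 4, 2: 1, 3: 5, 4: 4, 5: 2, 6: 1, 8: 5, 9: 4, 10: 3, 11: 5, 12: 5}
print(cluster_score(responses, RAPPORT))     # 4 + 5 + 4 + 4 + 5 + 4 = 26
print(cluster_score(responses, ENGAGEMENT))  # 5 + 5 + 3 + 5 + 5 = 23
```

The per-participant cluster sums for the two conditions would then be compared with a t-test to obtain the p-values reported above.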
There were also two open-ended questions at the end of the survey. First, I looked at the word counts
for these questions. The average response length in the monolingual group was 23 words for the first question and 59
for the second. The average word count in the responses for the code-switching group was 17 for the first
question and 19 for the second.
Next, I used Excel’s prebuilt AI model to conduct sentiment analysis on all open-ended responses.
The results are in Table 4.6. Both groups have more neutral responses to the first question, “Was there
anything else that you wanted to talk to the system about?” Both groups responded more positively to the
final question, “Do you have any other comments to share about your experience?”
Question Group Positive Neutral Mixed Negative No response
13 Was there anything else that you wanted to talk to the system about? M 2 5 1 0 3
CSW 3 6 0 2 0
14 Do you have any other comments to share about your experience? M 5 1 0 1 3
CSW 5 2 0 0 4
Table 4.6: Sentiment analysis of open-ended survey responses. M stands for monolingual and CSW for
code-switching
4.4.4.3 Chat logs and annotations
Each conversation log from the 22 interactions was annotated and analyzed for several metrics, described
briefly in the following list.
• Total words
• Usage of code-switching
• Topics discussed
• Success of all input
• Success for stories
• Dialogue Act annotations
• Initiative annotations
• Correlations between chat interactions and survey responses
• Opposite chatbot pairing
Total words
First, I reviewed the total number of words, unique words, English words, and Choctaw words for
each conversation. This analysis indicates whether a specific group spoke more with their chatbot. The results
of this analysis are shown in Table 4.7. There are several notable results from this analysis. First, no
participants in either condition spoke only in English when interacting with the chatbots. Second, there
was a larger range in the number of words and turns in the interaction for code-switching participants;
this becomes obvious when looking at the final two columns, where various participants from this group
represent multiple high and low values. Finally, it can be summarized that the code-switching group did
use more English in their conversations but used almost as many Choctaw words as the monolingual
Choctaw group. This indicates that the code-switching group talked more in their
interactions overall.
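The per-conversation counts can be sketched as follows. Language attribution here uses a simple wordlist lookup as a stand-in for the manual annotation actually performed; the conversation and lexicon are invented for illustration:

```python
def count_words(turns, choctaw_lexicon):
    """Tally total, unique, Choctaw, and English word counts for one
    conversation. Any word not in the Choctaw wordlist counts as English
    (in the study this attribution was done by hand)."""
    words = [w.strip(".,?!").lower() for turn in turns for w in turn.split()]
    choctaw = [w for w in words if w in choctaw_lexicon]
    return {
        "total": len(words),
        "unique": len(set(words)),
        "choctaw": len(choctaw),
        "english": len(words) - len(choctaw),
    }

# Illustrative conversation and wordlist.
lexicon = {"halito", "shawi", "chukfi", "yakoke"}
turns = ["Halito!", "Tell me a story about shawi.", "Yakoke!"]
print(count_words(turns, lexicon))
```

Turn counts and code-switching turns follow the same pattern at the turn level: a turn containing words from both lists is a code-switching turn.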
Usage of code-switching
                                   M Avg   CSW Avg   M Median   CSW Median   Highest Value   Lowest Value
Number of Words                    44.7    58.1      42         54           106 (11c)       16 (5c)
Number of Unique Words             29.5    35.7      27         37           63 (8c)         12 (7c)
English Words                      7.2     24.5      3          12           86 (1c)         0 (multiple)
Choctaw Words                      37.4    33.6      33         43           49 (8c)         3 (3m)
Number of Turns                    14.4    16.0      13         16           30 (11c)        5 (5c)
Number of turns in English         1.0     3.9       0          3            14 (1c)         0 (multiple)
Number of turns in Choctaw         12.9    10.9      12         12           22 (10c)        2 (4c)
Number of code-switching turns     0.4     2.0       0          0            7 (8c)          0 (multiple)
Percent of interaction in English  19.4    32.6
Percent of interaction in Choctaw  80.5    67.3
Percent of English turns           7.5     23.1
Percent of Choctaw turns           89.3    64.5
Percent of code-switching turns    3.1     12.3
Table 4.7: Analysis of chat logs for the number of words. In the first four columns, M stands for monolingual and CSW for code-switching. In the final two columns, the participant with that specific value is
given in parentheses; “multiple” indicates that more than one participant had that value.
Next, I reviewed the chat logs for participants’ code-switching. This analysis was to determine whether participants adopted the code-switching strategies of the chatbot, and to review whether participants code-switched
in ways similar to the literature. Only seven participants used intra-sentential code-switching in their
interactions: three in the monolingual group and four in the code-switching group. There were 23 utterances
with intra-sentential code-switching present. Eleven of the utterances were requests for a story. All of these
utterances used insertional code-switching for the name of the animal, two in English and the remainder
with the animal word in Choctaw. Twelve code-switching utterances were requests for a translation or
definition of a word that the chatbot had used. I did not program the chatbot to provide definitions or
translations; thus, these utterances were unsuccessful and received an off-topic response.
Topics discussed
I also analyzed the conversations for the number of topics participants discussed with the chatbot.
This would indicate whether participants from one group spoke to the chatbot differently than the other. On
average, the monolingual condition had 8.36 topics in a conversation, while the code-switching condition
had 12.09. The highest value overall was 26 in a code-switching conversation, while the lowest was three
in a code-switching conversation. In comparison to the number of turns, participants frequently changed
topics. Typically, participants from both groups spent about two turns per topic.
Success of all input
I also annotated each utterance in the conversations to determine how successful participants were for
that utterance. Measuring success per utterance evaluates whether one of the chatbots was more
supportive of learner speakers’ varied language use. Success was measured by how on-topic the
response from the chatbot was given the user’s previous utterance. If a user asked for a story, a successful
response would be the requested story, while an unsuccessful response would be an unrelated response.
An unrelated response might be utterances such as, "I know," or "Goodbye!" to the question, "What did you
eat today?". The rate of success was measured as the number of successful utterances over the number of
total utterances. The average rate of success in the monolingual conversations was 0.46, while the rate of
success for the code-switching conversations was 0.47. The highest success rate was 0.84, while the lowest
was 0.25, both from monolingual conversations.
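The success rate described above is a simple ratio of on-topic replies to total user utterances. A minimal sketch follows, with an illustrative annotation list rather than actual study data.

```python
# Minimal sketch of the per-utterance success rate: each user utterance
# is annotated True if the chatbot's reply was on-topic, else False.
def success_rate(on_topic_flags):
    return sum(1 for flag in on_topic_flags if flag) / len(on_topic_flags)

# 6 on-topic replies out of 13 utterances, roughly the monolingual average.
rate = success_rate([True] * 6 + [False] * 7)
```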
Success for stories
Next, I analyzed how many stories participants received in their interactions, summarized in Table
4.8. This analysis provides information on how successful participants were at receiving stories with their
chatbot, and would indicate if one chatbot is a better partner for sharing stories. The monolingual and
code-switching participants asked for an average of three stories in their respective conversations. The
code-switching group was slightly more successful in receiving a story based on their request, receiving
an average of four stories in their interactions. The monolingual group received an average of two stories.
Only one monolingual participant asked for zero stories, while three code-switching participants asked for
zero stories in their interactions. These participants spent their time introducing themselves and asking
questions about the chatbot, such as what its name meant. Some users received stories even when not
requested. Occasionally, users received a story when saying "yakoke" or "thank you". This occurred five
times in the monolingual and code-switching interactions, and participants consistently received a story
about a turtle or a possum. By comparing the first two columns in the table, it can be seen that monolingual
participants were not as successful at receiving stories when they asked for one. This was often due to
syntax or spelling errors not present in the training data, which confounded the system.
Metric | M Average | CSW Average | M Median | CSW Median | Highest Value | Lowest Value
Number of stories requested | 3.5 | 3.9 | 3 | 4 | 10 (1c) | 0 (multiple in CSW only)
Number of attempts to get story | 1.46 | 1.13 | 1.33 | 1.2 | 1.6 (4c) | 0 (multiple in CSW only)
Number of stories received | 2.4 | 4 | 3 | 4 | 10 (1c) | 0 (multiple)
Table 4.8: Analysis of chat logs for the number of stories. Participants may have asked for stories but were
unsuccessful in this request attempt, as indicated by the number of attempts to get a story. Alternatively,
the chatbot sometimes gives a story without a request, leading to a mismatch between the number of
requests and the number of stories received.
Dialogue Act annotation
Next, I annotated the chats using the ISO 24617-2 Standard for dialogue act annotation (Bunt et al.
2017). This analysis provides insight into specic dialogue actions that occurred during the interactions.
Based on the annotations, one can then look for generalizations between and within groups.
Each participant utterance was annotated with a dialogue act dimension and a communicative function label. The most commonly occurring dialogue act dimension was task (51% monolingual, 57% code-switching), followed by social obligations management (30% monolingual, 33% code-switching), auto-feedback (9%, 7%), own communication management (5%, 0%), and allo-feedback (3%, 1%).
Within the dimension "task", most utterances were requests (58%, 41%) or questions (30%, 48%). The
social obligations management dimension was also highly utilized, with many participants greeting, thanking, and making introductions with the chatbot. The auto-feedback dimension was nearly equally used
for positive and negative utterances. A positive utterance would be "A, ome" (Yes, ok), while a negative
utterance would be "what is akostinchi" where the user was repeating a word back that the chatbot had
just used. Allo-feedback utterances were almost entirely negative, where the user repeated their previous
utterance with no changes.
The own communication management type was when the user repeated previous utterances with modifications to try to improve the coherence of the input. A few allo-feedback utterances were positive, where
the user indicated that they understood the chatbot’s previous off-topic response. In the example below,
the participant asks how the weather is outside (as a note, the correct ordering of the sentence is, "Kucha
yvt pisa katiohmi?"). The only occurrence of the word pisa in the training data is matched to the utterance
for saying goodbye. The participant responded to this inappropriate utterance from the chatbot by saying
yes and then moved on to another topic in the following turn.
1. Participant: Kucha yvt katiohmi pisa? (What’s it like outside?)
Chatbot: Chi pisa la chike! (I’ll see you later!)
Participant: A- (Yes)
Initiative annotations
Next, I utilized an initiative annotation framework (Nouri and Traum 2014; Nazarian, Nouri, and Traum
2014) to evaluate the initiative of the participants in the interaction. The framework includes four labels:
R, F, I, and N. R-type (R for related) utterances contribute to the same topic and connect to prior utterances
from the interlocutor, regardless of whether they fulfill an obligation or whether an obligation exists. F-type (F
for fulfilling obligation) utterances satisfy obligations set by previous initiatives from the interlocutor and
align with conversational norms. I-type (I for invoking obligation) utterances create a new discourse obligation for the dialogue partner to respond. N-type (N for new) utterances introduce unsolicited, optional,
or additional material that does not fulfill an obligation. These last two labels often appear together, as in
new questions or proposals requiring a response. Both are unsolicited and impose an obligation, though
each can occur independently.
Table 4.9 gives the totals for participants interacting with each chatbot. The high rates of I- and N-type
utterances align with the chatbot being a QA dialogue system and relying on users to prompt the system
for information. The low levels of F are also in alignment, as few responses from the chatbot were I-type
and imposed an obligation on the user.
Metric Mono Total Mono Ratio CSW Total CSW Ratio
R 73 0.25 73 0.21
F 10 0.03 13 0.04
I 103 0.36 133 0.37
N 103 0.36 137 0.38
Table 4.9: Initiative annotations
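The ratios in Table 4.9 follow directly from the raw label counts. A minimal sketch, using the monolingual column's totals:

```python
# Minimal sketch: converting raw initiative-label counts into the ratios
# reported in Table 4.9, using the monolingual column's totals.
from collections import Counter

def label_ratios(labels):
    counts = Counter(labels)
    total = len(labels)
    return {label: round(counts[label] / total, 2) for label in "RFIN"}

mono_labels = ["R"] * 73 + ["F"] * 10 + ["I"] * 103 + ["N"] * 103
ratios = label_ratios(mono_labels)
```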
I also reviewed how users responded to receiving a story within the initiative annotations. Five participants in the monolingual cohort asked follow-up questions about stories, while only one code-switching
participant asked a follow-up question related to the stories. An example of a follow-up question was, “Is
that why in Choctaw dances there’s often someone with a tail?" in response to a story about a possum’s
tail. Some participants also gave feedback to the story; all comments were positive, such as "Good!". Three
participants from the monolingual group gave feedback, while only one code-switching participant did.
Six participants, three from each cohort, thanked the chatbot after the story. Most of the time, participants
from both groups asked for another story. This aligns with the values in Table 4.9: R-type utterances are
less frequent than N-type, indicating that a story request typically opened a new topic unrelated to a previous utterance.
Correlations between chat interactions and survey responses
Next, I analyzed whether there was a correlation between the number of stories requested and the survey
scores. This analysis provides insight into the relationships between the chatbot’s actions and participants’
survey responses. The results are given in Table 4.10. No strong correlations were seen in the monolingual
cohort. However, there was a moderate negative correlation between "The system knows the Choctaw
language" and the number of stories the participants requested. In the code-switching group, two strong
correlations were observed. The first was in response to "The system and I worked towards a common
goal," and the second to "I felt the system and I were in the same social group." It should be noted that three
participants in the code-switching group did not request any stories in their interactions; all participants
in the monolingual group asked for at least one story. Participants may have felt positively about their
ability to request stories based on the code-switching they saw the chatbot utilize, or for those who asked
Question | Mono Pearson | Mono P-value | CSW Pearson | CSW P-value
1 The system understood me. | -0.24 | 0.47 | 0.42 | 0.18
2 The system seemed unengaged. | 0.05 | 0.87 | -0.20 | 0.53
3 The system was friendly. | -0.29 | 0.37 | 0.21 | 0.53
4 The system and I worked towards a common goal. | -0.14 | 0.66 | 0.60 | 0.04**
5 The system and I did not seem to connect. | 0.24 | 0.47 | -0.03 | 0.91
6 I didn’t understand the system. | 0.02 | 0.93 | -0.03 | 0.92
7 The system knows the Choctaw language. | -0.37 | 0.24 | 0.44 | 0.16
8 The interaction was interesting. | -0.06 | 0.85 | -0.01 | 0.95
9 The interaction felt natural. | -0.25 | 0.45 | 0.44 | 0.17
10 I felt the system and I were in the same social group. | -0.19 | 0.75 | 0.67 | 0.02**
11 I would be willing to continue the conversation with the system for longer. | 0.17 | 0.61 | 0.38 | 0.24
12 I would recommend interacting with this system to a friend. | 0.24 | 0.47 | 0.38 | 0.23
Table 4.10: Pearson correlation test to evaluate the relationship between survey responses and number
of stories requested in an interaction. Strong correlations (|r| > 0.60) are bolded, and two asterisks indicate
p < 0.05.
for no stories, may have felt positively about the chitchat they engaged in with the chatbot. I also analyzed
the survey responses against the number of stories received and found similar correlations, with no strong
relationships seen in the monolingual cohort, and one strong correlation with the question, "I felt the system and
I were in the same social group" (r=0.67, p=0.02).
A final review within this analysis evaluated the average number of tries to get a story against
the survey responses, given in Table 4.11. For the monolingual group, there is a suggestive result for the
question, "I didn’t understand the system", and a significant result for the survey question, "The system
knows the Choctaw language." The average number of tries in the monolingual group to get a story was
1.46. The highly significant p-value and very strong negative correlation indicate participants
evaluated the chatbot negatively when they required multiple tries to ask for a story.
The average number of tries to get a story in the code-switching group was 1.13. The only strong
correlation in this group was in response to the question, "The interaction was interesting." Here, there
was a strong negative correlation. This could be due to the three participants who requested no stories;
a conversation without story requests may have been more interesting to them than one centered on stories.
Question | Mono Pearson | Mono P-value | CSW Pearson | CSW P-value
1 The system understood me. | -0.48 | 0.13 | -0.33 | 0.31
2 The system seemed unengaged. | -0.13 | 0.68 | 0.35 | 0.28
3 The system was friendly. | -0.25 | 0.44 | -0.12 | 0.70
4 The system and I worked towards a common goal. | -0.39 | 0.22 | 0.11 | 0.74
5 The system and I did not seem to connect. | 0.41 | 0.20 | 0.49 | 0.12
6 I didn’t understand the system. | 0.54 | 0.08 | -0.45 | 0.16
7 The system knows the Choctaw language. | -0.80 | 0.003 | -0.44 | 0.17
8 The interaction was interesting. | -0.23 | 0.48 | -0.72 | 0.01
9 The interaction felt natural. | -0.40 | 0.21 | -0.49 | 0.12
10 I felt the system and I were in the same social group. | -0.47 | 0.14 | -0.33 | 0.30
11 I would be willing to continue the conversation with the system for longer. | -0.23 | 0.48 | -0.01 | 0.96
12 I would recommend interacting with this system to a friend. | -0.36 | 0.26 | -0.04 | 0.89
Table 4.11: Pearson correlation test to evaluate the relationship between survey responses and average
number of tries to receive a story in an interaction. Strong correlations (|r| > 0.60) are bolded, and two
asterisks indicate p < 0.05.
Opposite chatbot pairing
An additional analysis used the participants’ utterances with the opposite chatbot (monolingual logs
with the code-switching chatbot, and vice versa) to evaluate whether the conversation would have been substantially different
or more successful had the participant been paired with the other chatbot. The proportion of identical
responses from the opposite chatbot was high in both cases. For the monolingual logs run through
the code-switching chatbot, 51% of responses were the same on average. For the reverse, 45%
of the chatbot responses were the same on average. When paired with the opposite chatbot, the
monolingual group increased to an average of three stories, while the code-switching group decreased
to two. The number of unsuccessful turns, where the user did not receive an on-topic and appropriate
response from the chatbot, increased by an average of two when paired with the monolingual chatbot. There
was not a large increase in successful turns when the monolingual conversations were paired with the
code-switching chatbot.
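The opposite-pairing analysis can be sketched as a replay loop: feed each logged user utterance to the other chatbot and count how often its reply matches the one the user actually received. Both the response function and the log below are hypothetical stand-ins, not the actual system.

```python
# Minimal sketch of the opposite-pairing analysis: replay each logged
# user utterance through the other chatbot and count identical replies.
def overlap_rate(log, respond_other):
    same = sum(1 for user_utt, actual_reply in log
               if respond_other(user_utt) == actual_reply)
    return same / len(log)

def other_bot(utterance):
    # Toy chatbot: tells the turtle story for any "luksi" request.
    return "luksi story" if "luksi" in utterance else "off-topic reply"

demo_log = [
    ("tell me a story about luksi", "luksi story"),
    ("what does your name mean", "a different reply"),
]
rate = overlap_rate(demo_log, other_bot)
```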
4.4.4.4 Language experiments
I verified the output of both chatbots using the same type of entries from the language experiments in
Section 4.3.2.2.
I first reviewed the code-switching chatbot, which was adept at handling real-world examples of intra-sentential switching from the participants, such as "Tell me a story about the issi." Of the eleven utterances
that requested stories, all were originally paired with the code-switching chatbot. Eight of the eleven
received a story. In the hypothetical conversations where the same utterances were tested with the monolingual chatbot, again eight of the eleven received a story. Of the utterances that did not receive a story,
two utterances were Choctaw syntax with an English noun insertion, such as "nan vnnoa possum," which
was not in the training data. One utterance, "Yakoke please tell me a story about luksi", triggered the
response that informs the user of what animal stories the chatbot does know.
I also reviewed how the chatbots handled user errors. Here, the performance was mixed. The ungrammatical sentences "Li bvnna issi?" (Want I deer?) and "issi nan vnnoa" (story deer) received deer story
responses, but "Anumpa nan anoli bvnna li." (Story want I) did not. Future work could incorporate examples
of errors into the QA corpus to improve the chatbot’s ability to understand user errors.
4.5 Discussion
First, I will review the findings for my main research questions.
• M2.0-1: Will translanguaging lead to a better user experience? Will users show a higher preference
for the translanguaging chatbot?
The survey results indicate that users had a better, more satisfying experience with the code-switching,
translanguaging chatbot.
• M2.0-2: Will translanguaging lead to an increase in learning?
The language tests indicate that participants learned new vocabulary while interacting with the
code-switching chatbot. However, they did not learn significantly more than the monolingual group,
indicating that interacting with any chatbot will lead to a learning experience.
• M2.0-3: Will users prefer a chatbot that uses a linguistic framework for code-switching over one
that does not (non-framework)?
Unfortunately, this research question was not answered due to limited recruitment. I leave it to
future work to investigate this topic.
Now, I will review the findings in relation to the hypotheses.
H1: Code-switching bilingual chatbots that use translanguaging techniques lead to a better
learning experience, possibly through learning gains or a greater sense of rapport, comfort, or
enjoyment for language learning users.
The survey results are suggestive that the interaction experience was better and more enjoyable for
participants interacting with the code-switching chatbot. This indicates that a translanguaging, bilingual
chatbot is a more appropriate language partner for Choctaw learners. Participants said they would recommend the code-switching chatbot to others. Additionally, participants paired with the code-switching
chatbot received more stories.
Notably, participants rated the code-switching chatbot as friendlier, even though nothing differed in what
any of the Masheli 2.0 versions said, their register, or their level of politeness. This result aligns
with the literature on face, identity, and the perception of rapport when one’s face and identity are threatened. Participants may have perceived a threat to their face when the chatbot did not understand
their attempts to communicate in Choctaw.
Kickham (2015) states that there are multiple types of motivation when learning a language, and unique
to those learning a language tied to their ethnic identity is ethnolinguistic motivation. This is the motivation a learner experiences when they desire to learn a language to reconnect to a heritage language or
ethnic community to which they claim membership. It is possible participants did not connect to the identity of the chatbot as a monolingual Choctaw speaker, and this disconnect was an affront to their identity.
However, I did not collect data on whether participants identified as Choctaw or as members of the
tribe. It is possible that some recruited language learners are not ethnically Choctaw or enrolled members.
H2: Modeling linguistic aspects of bilingualism, such as code-switching that follows code-switching linguistic frameworks, will lead to a better learning experience than the monolingual
Choctaw, demonstrated through higher levels of learning gains, rapport, or enjoyment.
The code-switching chatbot followed translanguaging principles in its code-switching but also followed linguistic frameworks that produced insertional and clause switches. The survey results are suggestive that modeling code-switching aspects using linguistic frameworks does lead to higher levels of
reported rapport and enjoyment.
It does not lead to higher levels of learning gains. The final question of the survey indicates that the
code-switching cohort would be more likely to interact with the chatbot again; thus, it is possible that
learning gains could be achieved over multiple interactions.
H3: Users will have a higher level of enjoyment with the framework code-switching chatbot
over the non-framework code-switching chatbot.
I was unable to test this hypothesis due to limitations in participant recruitment. I leave it to future
work to investigate.
H4: Users will demonstrate the highest learning gains with a code-switching system.
What can be concluded from the language test results is that large learning gains with immersion
(interacting with the monolingual chatbot) were not observed in terms of the number of correct answers,
nor did immersion produce increases in the number of questions participants attempted after the interaction.
The learning gains of the participants who interacted with the code-switching agent were slightly lower
in terms of number correct, however, only slightly. Participants were more likely to attempt vocabulary
questions after the interaction, perhaps due to feeling more confident or positive (even if that sentiment
did not always translate to newly acquired vocabulary being quite right). Code-switching had nearly the
same outcome for learning gains as the monolingual Masheli 2.0. These results indicate that either chatbot
will produce positive learning gains.
H5: Users will have a lower user experience with the monolingual system than with one of
the code-switching bilingual systems.
Based on the suggestive results for questions 3 and 12 on the user survey, there was a slight preference
for the bilingual system.
The participants paired with the monolingual chatbot received slightly fewer stories in their interactions and had somewhat less successful conversations than those paired with the code-switching chatbot. Based on the results of pairing the
monolingual participants’ utterances with the opposite chatbot, their conversations would have only marginally improved. There was no large difference between the groups in follow-up
questions to stories or in learning gains. It can be concluded that users preferred the
bilingual chatbot because they were responding to its code-switching and bilingual identity.
Literature on face and face-work provides an explanation for why participants preferred the bilingual
chatbot. As previously stated in Section 2.4.1.4, face is the image one has of oneself and emerges during
interactions (Haugh 2009). Face is important in any conversation as humans want to be liked and respected
by others, but face is a key factor in learning scenarios (N. Wang, Johnson, et al. 2008), and particularly
in second language conversations (Piirainen-Marsh 1995; Ahvenainen 2021). Since face requires interaction and thus requires outside assessment, it is agreed that “face is a vulnerable phenomenon, and hence
associated with emotional reactions” (Spencer-Oatey 2007). If a language learner’s face becomes threatened in conversation due to a language comprehension issue (Holtgraves 2009, p. 198), the learner “would
be likely [to] experience identity-based frustration, emotional vulnerability, shame, hurt, anger” (Ting-Toomey 2009, p. 229). The social expectation is that interlocutors are socially obligated to protect each
other’s faces (Holtgraves 2009). The users paired with the monolingual chatbot may have been sensitive
to its rejections (i.e., off-topic responses) of their use of English or non-standard Choctaw.
4.6 Summary of findings
I tested two phases of chatbot design, the latter with multiple versions, with human subjects in two experiments. In the rst phase, the chatbot could code-switch within the dialogue but not within a turn. It
was also incapable of processing code-switched user utterances. I tested this with a 4-person study; participants commented on the enjoyment of receiving the stories. In the second experiment, I compared a
monolingual Choctaw Masheli 2.0 against a code-switching one incorporating translanguaging.
The results of the second experiment indicate that users prefer the code-switching chatbot over the
monolingual one based on the survey results. Both cohorts demonstrated learning gains from the interaction, as measured by a vocabulary and grammar quiz; the monolingual cohort learned slightly more,
but not significantly more. An important finding was that the code-switching chatbot was better at
retrieving stories for both populations.
An analysis of correlations between survey responses and the number of story requests found no strong
correlations in the monolingual group, except for a negative one for the survey question, "The system
knows the Choctaw language." In the code-switching group, strong correlations showed participants felt
they shared a social group with the system when comparing survey responses to the number of story
requests made as well as the number of stories received. More story request attempts negatively impacted
perceptions of the chatbot’s ability to speak Choctaw in the monolingual group.
Finally, it is possible that significant findings in learning gains and survey results could be found with
a larger population size. It is also possible that significant findings could be found with longer interactions,
possibly by interacting for longer than 15 minutes in one session or by having multiple interaction sessions.
Chapter 5
Bilingual Language Documentation Systems - Application 2: DAPEL
This chapter details the second code-switching dialogue system. The goal of the interaction in this application is to preserve an endangered language through dialogue. My proposed application is named DAPEL,
short for Dialogue APp for Endangered Languages.
DAPEL is designed as a tool for language documentation and preservation. It can also serve reclamation efforts because, with sufficient language documentation, a language community will then have
the resources and means to revitalize and reclaim its language. The system is intended to preserve any
language; however, I use Choctaw as the language for documentation.
Some of the key research questions guiding my research in this system are:
• D-1: Will the proposed system be useful for collecting endangered language audio data?
• D-2: Will users be comfortable using two languages in a conversation but only communicating in
one with the system?
• D-3: Will the proposed system be comparable to a human interviewer in recorded audio duration or in the level of enjoyment the user experiences?
• D-4: Will code-switching improve the user’s experience through increased scores on a rapport survey, a higher number of unique words in recordings, or longer recording durations?
I describe the related literature for this application, two pilot studies, and an experiment conducted
with fluent Choctaw speakers.
5.1 Motivation
There are roughly 7,000 languages spoken in the world today. Seifart et al. (2018) found that “around
3,660—that is, more than half of now living languages—are currently threatened, endangered, moribund,
or nearly extinct”.
There are numerous bottlenecks in the process of preserving a language. From my previous experiences, such as recording the audio described in Section 3.0.2, I can attest to the time and labor involved in
recruitment and conducting recordings. A dialogue system for language preservation would fulfill a role
similar to a language preservation practitioner conducting an interview with a speaker, and would reduce
the challenges of time, labor, and cost. Additionally, speakers could use the system when their schedules
permit, making capturing data from additional speakers easier. An added benefit is that users have been
shown to disclose more information to a dialogue system than with a human (Lucas et al. 2014), which
could indicate that people would be willing to speak more and be recorded saying more in the endangered
language than with a human interviewer.
5.2 Related literature
The field of linguistics has a long history of endangered language documentation. In the early tradition in
America, linguists and anthropologists created written and oral records of Indigenous languages, typically
producing three items intended and structured for an academic audience: a dictionary, a grammar, and a
set of texts of the language. The language use documented in this tradition often prioritized literary and
ceremonial domains and rarely documented conversational or everyday language (Rouvier 2017).
In the 1980s, as language communities began to use documentation records to support language revitalization efforts, two key changes in methodologies occurred. First, language documentation became an
important resource for combating language shift (Dobrin, Austin, and Nathan 2009). Second, communities
demanded a more central role in determining documentation practices and resulting documents (Rouvier
2017). Today, the expectation is that language documentation should support community priorities for
language revitalization, and documentation funding opportunities should contribute to these efforts, even
if there are other research goals within the preservation activity (Rouvier 2017). A complete literature
review of documentation of Indigenous languages is given in Section 2.6.2.
Despite the long tradition of language documentation, there is “a minimally adequate quantity of data
for less than 1% of the world’s 7000 languages” (Gauthier et al. 2016). Technology can play an important role
in scaling up documentation efforts, allowing users to be recorded simultaneously and as their schedule
allows while reducing prohibitive cost aspects such as travel. Additionally, technology has fostered greater
audio and video documentation. Finally, technology has encouraged better sharing of primary language
data with researchers and community members (Rouvier 2017). More details about language preservation
and documentation best practices can be found in Chapter 2.
Previous systems that aimed to document a language include AIKUMA (Bird et al. 2014) and LIG-AIKUMA (Gauthier et al. 2016). The AIKUMA app (Bird et al. 2014) was designed to record parallel translation data in English and a second language. The AIKUMA app is deployed on smartphones and shows
the viability of deploying mobile apps to document endangered languages. Members of the speech community can upload written translations. LIG-AIKUMA (Gauthier et al. 2016) added functionality to the
AIKUMA app. Users could again translate speech and record themselves in spontaneous speech alone or
with others, introducing novel texts and reading them aloud or re-speaking a previously recorded audio
clip from another speaker. LIG-AIKUMA was deployed to record three African languages (Adda et al.
2016). Unlike DAPEL, LIG-AIKUMA does not directly engage with the speaker in dialogue or offer spoken
conversational prompts.
5.3 Overarching design of DAPEL
The general format of DAPEL is an interview-type system. One previous interview system interviewed
participants to detect the presence of psychological distress indicators, such as those for PTSD and depression (DeVault et al. 2013). The virtual agent in this system, named Ellie, asked open-ended questions,
such as “What would you say are some of your best qualities?” and, “What are some things that make
you really mad?” A second system (Johnston et al. 2013) created an automated spoken dialogue system
to communicate over the phone that asked questions drawn from government and social science surveys.
Their motivation was to standardize the interview experience across participants by lowering the error
and bias that human interviewers can introduce in survey results data. The experiments in this study
were primarily focused on strategies for confirmation but did find that participants were satisfied with
over-the-phone interviews with the dialogue system. Additionally, previous dialogue systems found that
including small talk can lead to positive user impressions of dialogue systems (Kobori, Nakano, and Nakamura 2016; Cassell, Bickmore, et al. 1999). A final study (Nakamura, Kobori, and Nakano 2019) found that
users strongly preferred the system with small talk to the system without.
As noted in Section 5.2, many languages have not been documented, or have been only minimally documented. With
many languages facing decreasing populations of speakers, it is imperative to document languages while
there are still fluent speakers. Technology could be a means to efficiently and economically document
a language. The DAPEL system is a system to record endangered languages in conversational, informal
settings and would require users to speak two languages while receiving system responses in only one.
While it might be feasible for me to develop a monolingual system in Choctaw because I have the
expertise, community connection, and data to support this, it would be challenging to do the same for other
endangered languages that do not have the same amount of data. Thus, a monolingual English system of
DAPEL would require the least amount of expert knowledge and existing language data to implement. By
using a majority language for the system’s responses, I minimize the need to implement novel technical
components, leveraging off-the-shelf English systems instead.
As the ultimate goal is to capture as much endangered language data as possible, I experiment with
other dialogue acts to increase the duration that a user speaks in the second language. I explore incorporating small talk to increase rapport and comfort during the interaction in Section 5.4.1. In Section 5.4.3 and Section 5.5, I also investigate whether including the language to be recorded, through code-switching, encourages users to speak more. DAPEL demonstrates how a minimally bilingual system could provide a
process for incrementally working towards a fully monolingual system in another endangered language.
The overall interaction of DAPEL is shown in Figure 5.1. Each round in the dialogue contains a recording prompt, summary, and small talk. The prompt is an open-ended question or topic to be responded to
in the endangered language. The next step is a summary, where the user translates into English a brief
summary of what was said in the prompt response. The small talk, also in English, is based on information
shared in the summary.
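The round structure described above can be sketched as a simple loop. This is an illustrative sketch, not DAPEL's actual implementation; all function and parameter names here are hypothetical placeholders:

```python
# Minimal sketch of a DAPEL session: each round cycles through a
# recording prompt (answered in the endangered language), a summary
# (given in English), and small talk (in English, based on the summary).
# All names are illustrative placeholders, not DAPEL's real code.

def run_session(prompts, record, summarize, small_talk):
    transcript = []
    for prompt in prompts:            # up to one round per prompt
        answer = record(prompt)       # user responds in the endangered language
        summary = summarize()         # user gives a brief English summary
        chat = small_talk(summary)    # English small talk keyed off the summary
        transcript.append((prompt, answer, summary, chat))
    return transcript
```

Each round's artifacts are kept together, which mirrors how the recorded prompt response and its English summary must later be paired for documentation purposes.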
5.4 Pilot studies
Before conducting a large-scale experiment, I conducted several pilot studies with small populations to
test the viability of the system and certain design components. I describe two pilot studies, their results,
and their implications.
5.4.1 Pilot study 1
In a pilot study (Alavi, Brixey, and Traum 2019), I tested an initial version of DAPEL with a convenience
population using several non-English languages to determine the viability of the proposed system’s design.
I conducted an experiment with three research questions:
• D1-1: Do people respond with lengthy and meaningful responses when prompted by a computer to
record answers in another language?
• D1-2: Is a summarizing act of what was said previously in the other language natural for the user
and the dialogue ow?
• D1-3: Does small talk help users feel comfortable with the interaction? Are the small talk responses
adequate?
I hypothesized that users would feel as comfortable with the system as with the human interviewer.
I also hypothesized that small talk would help participants feel more comfortable and at ease with the
system, potentially leading to higher levels of rapport and satisfaction and potentially longer recordings.
I hypothesized that the participant would not perceive the summary task as unusual.
5.4.1.1 Methods
To test whether this design would work, I defined a first condition in which a human
interviewer was paired with a participant. In the second and third conditions, participants interacted with
a dialogue system (described in the next section) to evaluate how users would respond to an equivalent interaction when paired with a computer interviewer. All participants were given a questionnaire to discuss
their experience after the interaction.
Questionnaire question Condition 1 Condition 2 Condition 3
1 How natural did you nd the interaction overall? 4.17 3.5 3.83
2 How easy was it to understand what was expected of you? 4 4 3.83
3 How natural was it to provide a summary of what was said during the recording prompt? 3.83 4
4 How comfortable was the small talk? 4 3.83
Table 5.1: Sample of questionnaire results. Participants were asked to rate their experience on a 1-5 Likert
scale, with 5 being the highest score.
I recruited 18 participants, with six participants per condition. Nine females and nine males participated. Participants (number given in parentheses) spoke: Korean (1), Mandarin (4), Bengali (2), Spanish
(2), Persian (2), French (1), Japanese (1), Dutch (1), Hebrew (1), Hindi (1), and Russian (1).
5.4.2 System design
All conditions were given the same script and small talk options. The interviewer or system introduced
the prompt, asked for a summary, and then gave at least two small-talk responses from a prepared script
based on the summary given. For condition two, I used a Wizard of Oz (WOZ) system to play the prompt
recordings and to manage the small talk portion of the dialogue. In the second condition, prerecorded small
talk questions from the same script as condition one were played to the user based on the summaries. In the
third condition, the same WOZ system as condition 2 was used, but the small talk portion of the dialogue
was omitted, and users only heard the prompt portion of the dialogue from the system.
5.4.2.1 Results
I found that participants spoke nearly as much with the computer system as with the human interviewer.
The average duration of speaking times in the non-English language was 418 seconds, 228 seconds, and
382 seconds in conditions one, two, and three, respectively. The median times were 383 seconds, 234.5
seconds, and 359 seconds in each condition, respectively. The results demonstrate that users will interact
with a multilingual system and give recorded responses in their respective languages.
For survey Question #3, Table 5.1 shows that participants in the experiment did not find the summarizing act unnatural, and users are accepting of systems that ask them to perform this action. Although the difference might not be significant, it is interesting that users found summarizing in condition two more natural than in condition one. This might seem counter-intuitive at first since, in condition two, users were interacting with a WOZ system, while in condition one, they interacted with a human interviewer. This may be because in the first part, when they are speaking in their non-English language, they know that the human interviewer does not understand them at all, which might seem unnatural. People look for reactions from other people in their normal conversations. It may be easier to interact with a system without expecting a reaction.
The participants did not find the system’s small talk unnatural, and all reported feeling neutral, positive, or extremely positive. The small talk responses were scripted for both the human interviewer and the WOZ system. However, participants in condition two reported that the WOZ small talk responses sounded formulaic and did not always respond appropriately to the summary. This feedback might also explain why the total speaking duration for condition two was lower than that of the other two conditions, and why the interaction was rated less natural.
The pilot study demonstrated that users are willing to speak two languages with a system and disclose information about their language and culture. I also determined that users are willing to engage in
summaries and small talk with a dialogue system, so long as the small talk responses are engaging and
appropriate.
5.4.2.2 Discussion
The results indicate that DAPEL is a viable system for collecting endangered language responses, even
though the system engages in all dialogue acts in English. The results also indicate that the summary is
not an unnatural dialogue act in the interaction. Based on the durations and survey responses, the small
talk responses were inadequate and needed additional revisions.
However, one unexplored direction was how to incorporate code-switching into the dialogue’s utterances and whether this might improve the interaction by increasing the recording durations or enhancing
the user experience. It was also clear that the small talk component needed additional refinement for later work.
5.4.3 Pilot study 2
Based on the results from pilot study 1, I developed three new research questions.
• D2-1: What can be changed in the system’s dialogue design to improve the naturalness and comfort
that users experience?
• D2-2: What can be changed in the system’s dialogue design to increase the duration that the user
speaks in the endangered language?
• D2-3: How could code-switching be incorporated into the interaction?
If language use is a means of demonstrating an identity, then code-switching could help the system
establish a similar identity to the user. I hypothesized that code-switching could also increase communal
common ground and chances for building rapport.
5.4.3.1 Methods
To test these questions, I recruited two bilingual participants who spoke two unique languages (Vietnamese and Mandarin) and English to interact with a bilingual human interlocutor in a second pilot study.
No system or WOZ was utilized in this pilot study to test the introduction of code-switching; rather, all participants interacted with a human interviewer. In this version of the study, I added translation requests
to the prompts to code-switch in those languages. Then, using the information provided by the user, I
added those translations to the prompts. I did not include translation requests in the small talk portion of
the dialogue. An example conversation would have been as seen in the example below.
• Human interlocutor: How do you say “food” in your language?
Participant: “đồ ăn”
Human interlocutor: Thanks! What is “đồ ăn” that is unique to your culture?
Following the interaction, both participants filled out brief surveys similar to those in pilot one. This survey also included a question to rate the naturalness of the translation request.
5.4.3.2 Results
One participant reported a low survey score on questions about the translation requests and found it unnatural; however, this participant still offered translations without prompting throughout the dialogue.
The other participant liked the code-switching but reported that it did not encourage them to speak more in
the second language. Notably, the two participants gave a higher naturalness score than pilot study 1 participants on the question, “How natural was it to speak two languages during the interaction?” This result indicates that system code-switching, even in a limited, unbalanced bilingual manner, increases the naturalness of the conversation.
However, the duration of recorded speech in the second language did not increase substantially with
the introduction of the translation request.
5.4.3.3 Discussion
The increase in the naturalness score indicates that the addition of code-switching had a positive effect. The recordings may not have increased in length because of the translation requests, which added considerable time to the interaction without improving the user’s experience or recordings. As a result, I omitted
the translation requests in later versions and aimed to test the utilization of pre-written code-switched
utterances.
5.5 Code-switching DAPEL Study
As indicated in the related literature section (5.2), there is limited work in the domain of using technology, and specifically dialogue systems, to document and preserve languages. However, the pilot studies in 5.4 indicated that this is a meaningful and viable approach and suggested how code-switching should be incorporated into the dialogue.
I designed four versions of a single system in this experiment. I compare a monolingual Choctaw system against two code-switching systems and against a monolingual English system. The code-switching
systems include one that uses a linguistic framework for code-switching, and the other code-switching
version that does not use a linguistic framework ("non-framework"). These frameworks are described in
detail in Chapter 2.1.3 and are similar to those tested in Chapter 4. I introduced new research questions in
this experiment:
• DCSW-1: Will code-switching lead to a better user experience?
• DCSW-2: Will code-switching lead to an increase in recorded audio?
• DCSW-3: Will users prefer a system that uses a linguistic framework for code-switching over one
that does not (non-framework)?
• DCSW-4: How do users want to interact with small talk in the system, using buttons or speaking
aloud when ready?
• DCSW-5: Will users like having a recording button in the prompt section? Will users forget to press
the button to start the recording?
This section describes the hypotheses, methods, results, and discussion of this experiment with DAPEL.
5.5.1 Hypotheses
I built off the initial versions of DAPEL to test additional hypotheses. Many of these hypotheses relate to
specic questions in this chapter and the broader research questions of the dissertation overall.
1. (Framework) bilingual systems lead to a better user experience for language preservation interactions than monolingual English systems.
2. (Framework) bilingual systems lead to longer recorded responses in language preservation interactions than monolingual English systems.
3. Modeling framework bilingualism leads to better dialogue in terms of greater and more diverse
production in the language being preserved than monolingual English or random bilingual systems.
4. The monolingual Choctaw system produces longer recorded responses and higher rates of positive user experience than the monolingual English system.
I hypothesize that the non-framework bilingual version of the system decreases the user experience because the user finds it distracting and the CSW breaks social norms. Breaking social norms decreases the user’s enjoyment of the conversation and their sense of its naturalness. CSWing usually follows norms and grammatical rules (see Introduction); hence, the results of the random condition indicate whether humans also expect their computer interlocutor to follow those norms.
5.5.2 Dialogue Design
All of the utterances said by DAPEL were handcrafted and recorded in advance. The dialogue sequence
of DAPEL consists of a prompt, summary, and small talk parts (see Figure 5.1). Prompts are open-ended
topics that the participant responds to in the non-English language and are intended to be broad enough
for a person to be able to speak at length. The summary is responded to in English and is where the user
provides a brief overview of what they said to the prompt. The nal part, small talk, is where the system
can ask additional questions or engage in chit-chat in English with the user based on what was said in the
summary. I describe each component in more detail below.
5.5.2.1 Interface
The system was presented to the user as a webpage. I expected this to be the most intuitive format, as most people have already used a web browser.
Each part of the dialogue was presented on a separate page: the prompt on one page, the summary on
the next, and the small talk on a subsequent page. This design was intended to make the different parts of the dialogue clear to the user.
The prompt and summary recordings had an explicit "begin recording" button. This was so that participants did not feel pressured to begin talking immediately but could instead reflect on the question, make any notes, and look up words in the provided dictionaries.
The small talk portion of the dialogue was recorded during each experiment session on an external
recording device. This device was left running for the duration of the experiment session to serve as a
backup method if audio was not captured on the computer with which the participant interacted.
5.5.2.2 Prompts
First, the prompts were questions that I selected and adapted from https://relearnalanguage.com/language-exchange-topics/. Figure 5.2 is a screenshot of the prompt page in the interface. I designed the interactions to last roughly fifteen minutes. To ensure that the system has sufficient content for fifteen minutes, I selected twenty-two prompts; this means a maximum of twenty-two iterations through the three parts of a dialogue sequence. The full list of the prompts is given in Table 5.2.
1 How did you learn to speak Choctaw, at home or at school? Tell me where and with whom you
speak your language nowadays.
2 Can you share a good story your grandparents or other family members told you about their lives?
3 Who are you closest with in your family? Why is that?
4 If you could go back and talk to one ancestor, who would it be and why?
5 Do you have any siblings? How do you think being the youngest, oldest, middle, or only child affected you?
6 What’s the oldest possession you currently have? Why do you still have it?
7 Can you tell me a story from your childhood?
8 What was the most interesting job you ever had?
9 What’s the best piece of advice you ever received?
10 What was the strangest thing or coolest object in your childhood home? Does it have a story?
11 If money were no issue, how would you spend your time?
12 What’s one thing that someone borrowed from you and never brought back? Do you miss it?
13 Did you ever play any musical instruments when you were growing up? If you didn’t, what would
you like to learn? Tell me something about the instrument.
14 What are some of your favorite things to cook or eat? What are things that you do not like to cook
or eat? Just to note, if you are sharing a cultural food, please only discuss it if it is ok for people
not from your community to know about.
15 Tell me about a vacation place that you would like to visit and why.
16 What animals are you afraid of? Why?
17 What was your favorite tv show when you were a child? Tell me something about the show.
18 Do you like to read? Why or why not?
19 How important do you think science and math courses are in school, in comparison with literature,
technology, arts, and history classes?
20 What is the most disgusting vegetable to eat, in your opinion? Why?
21 What piece of technology are you the most reliant on, and why?
22 What sports do you like to play or watch?
Table 5.2: List of all prompts
Apart from capturing a variety of vocabulary, the prompts were selected to elicit various syntactic forms. Prompts such as prompt one were intended to capture the present tense, while prompts like two, seven, and eight would capture the past tense. Questions like eleven, thirteen, and fifteen could capture hypothetical tenses and conditionals. Another syntactic feature the prompts were intended to capture was negation, as in the second part of fourteen, eighteen, and twenty.
5.5.2.3 Summary
Next, the summary part is where the system asks the user, “Could you summarize what you just said?”
Figure 5.3 is a screenshot of what this page looked like in the interface. The summary part asks the user
to both translate into English what was said in the preceding prompt part and to briefly summarize it. To prevent fatigue, this was not said aloud but only presented to the user as text on the interface.
5.5.2.4 Small talk
Finally, the small talk part of the dialogue was partially reused from the pilot studies, and new small talk
was based on the prompts. Figure 5.4 shows a screenshot of what the small talk page looked like in the
interface. The small talk was only said orally; it was not displayed on the screen.
Not all prompts included small talk to reduce fatigue. There was no small talk for prompts 3, 4, 6, 8,
11, 14, 19, 21, or 22. In these cases, the system typically thanked the participant for their answer and instructed them to proceed to the next page.
Prompts 2, 7, and 10 invited the participant to share another story or anecdote. Sharing an anecdote
or story was not required to continue to the next round.
Affirmative- and negative-type small talk responses were formulated based on small talk examples in Nakamura, Kobori, and Nakano (2019). When small talk was available, the summary from the previous page was converted to text. The system then selected a response based on keywords in
the text. For example, if the participant said they liked reading, the system asked what their favorite book
was. Most small-talk interactions were one or two additional turns. For example, in prompt 15 ("Tell me
about a vacation place that you would like to visit and why.") the system would rst say, "That sounds like
a good choice! Do you know anyone who has been there before?" The following if-else statements show
how the Monolingual English system would respond to the user’s affirmative, negative, or other type of
response.
• If yes:
Monolingual English - Always helpful to have a recommendation! Very good. Well let’s move on,
you can click next now.
• If no:
Monolingual English - Paving your own way! Very good. Well let’s move on, you can click next
now.
• Else:
Monolingual English - "Well that’s all very interesting. Let’s move on, you can click next now."
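The keyword-based selection described above can be sketched roughly as follows. The matching logic and keyword table are illustrative assumptions (the actual system's script and matching rules may differ); the two scripted responses are taken from the examples above:

```python
# Sketch of keyword-based small-talk selection: the English summary is
# matched word-by-word against a small keyword table, falling back to a
# generic response when nothing matches. The keywords and matching rule
# are illustrative assumptions, not the system's actual script.

SMALL_TALK = {
    "yes": "Always helpful to have a recommendation! Very good. "
           "Well let's move on, you can click next now.",
    "no": "Paving your own way! Very good. Well let's move on, "
          "you can click next now.",
}
FALLBACK = "Well that's all very interesting. Let's move on, you can click next now."

def select_small_talk(summary_text):
    words = summary_text.lower().split()
    for keyword, response in SMALL_TALK.items():
        if keyword in words:
            return response
    return FALLBACK
```

Matching whole words (rather than substrings) avoids, for example, treating "know" as a match for "no"; a real system would likely need richer normalization of punctuation and paraphrase.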
5.5.2.5 Code-switching design
To design the random code-switched utterances, I randomly selected any part of speech as a switching
point, including question words, nouns, verbs, and morphemes. Code-switching could be in an English
matrix or a Choctaw matrix. In this example, only one of the sibling adjectives (“akni”) in the second
sentence is code-switched rather than all of the terms, “Do you have an ishki or nak? How do you
think being the youngest, akni, middle, or only child aected you?” In another example, only the question
word (“nanta”) is switched, “Nanta the best piece of advice you ever received?” In this example, only the
Choctaw phrase in English matrix: How did you learn to speak Chahta? (How did you learn to speak Choctaw?)
English to Choctaw alternation: Who are you closest with in your family? Yvmmvt katimi a̱? (Who are you closest with in your family? Why is that?)
Choctaw to English alternation: Chukka cho̱ holisso apisa? Tell me where and with whom you speak your language nowadays. (How did you learn to speak Choctaw, at home or at school? Tell me where and with whom you speak your language nowadays.)
Table 5.3: Code-switching options in DAPEL prompts and small talk. Examples are given after each option name, with English counterparts in parentheses.
morpheme (ish) is switched rather than the entire verb phrase, “Tell me about a vacation place that ish
would like to visit micha katiohmi.” I then applied this strategy to the small talk parts of the conversation.
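One way to sketch this random switch-point strategy is below. The lexicon entries and the word-level granularity are simplifying assumptions for illustration; the actual design also switched morphemes and could use a Choctaw matrix rather than an English one:

```python
import random

# Sketch of the non-framework ("random") code-switching strategy: pick
# one random switchable word in an English utterance and replace it with
# its Choctaw counterpart from a small lexicon. The lexicon entries here
# are illustrative assumptions, and real switch points also included
# morphemes, not just whole words.

LEXICON = {"what": "nanta"}  # hypothetical lexicon entry

def random_switch(utterance, lexicon, rng=random):
    words = utterance.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in lexicon]
    if not candidates:
        return utterance          # nothing switchable; leave as-is
    i = rng.choice(candidates)    # random switch point
    words[i] = lexicon[words[i].lower()]
    return " ".join(words)
```

Passing an explicit `rng` makes the otherwise random choice reproducible for testing.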
To design the “framework” code-switched utterances, I followed a framework outlined in Parekh et al.
2020. A code-switched utterance could be one of the three options illustrated in Table 5.3.
To create the alternation utterances, I switched at a conjunction point, a comma, before a verb, or intersententially. I tended to leave indirect objects in English for clarity, as these are typically null in Choctaw.
As there are 22 prompts, I cycled through the options going down the list of prompts. I had to swap some options because certain words have no Choctaw equivalents and had to remain in English, which would otherwise conflict with what the option required.
Nearly all the prompts are identical to their English or non-framework code-switched counterparts;
however, some utterances were reworded or reordered to emphasize an option. In the Choctaw to English
alternation version of prompt question 20, I put “in your opinion” first in the sentence so that the rest of the
sentence could alternate to English, “Chim anukla, what is the most disgusting vegetable to eat? Why?”
For reference, the English counterpart is What is the most disgusting vegetable to eat in your opinion? I then
applied the code-switching options to the small talk utterances part of the dialogue.
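The cycling assignment of the three framework options to the 22 prompts can be sketched as below. This is an illustrative sketch only; as noted above, the actual assignment swapped some options where Choctaw equivalents were unavailable:

```python
from itertools import cycle

# Sketch of assigning the three framework code-switching options to the
# prompts by cycling down the list. Option names follow Table 5.3;
# prompt texts are placeholders, and the real assignment swapped some
# options where vocabulary constraints required it.

OPTIONS = [
    "Choctaw phrase in English matrix",
    "English to Choctaw alternation",
    "Choctaw to English alternation",
]

def assign_options(prompts):
    return list(zip(prompts, cycle(OPTIONS)))
```

With 22 prompts and 3 options, each option covers seven or eight prompts before any manual swaps.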
5.5.2.6 Code-switching in user responses
The system is not equipped to process code-switching in participant responses, as the Choctaw ASR (see Section 3.0.3) is only used for the monolingual Choctaw version of DAPEL. If a participant code-switches, the dialogue model reverts to an off-topic response. For example, an off-topic response to the first prompt is, "How lucky to speak Choctaw! Let’s move on, you can click the button marked next to go to the next page."
5.5.2.7 System Design
The backend systems include an off-the-shelf English ASR system and my implementation of a Choctaw ASR. I designed four versions of DAPEL: a monolingual English, two bilingual, and a monolingual Choctaw.
5.5.3 Methods
5.5.3.1 Four versions
Four different system versions were designed (as mentioned in Section 5.5.2.7). Figure 5.5 shows the technical components for each version. The intention was to recruit 20 participants to test each system. Due to the realities of recruiting speakers of an endangered language, described in Section 5.5.3.5 below, only
two systems were tested: the monolingual English system and the framework code-switching system.
5.5.3.2 Survey design
The survey utilized in this experiment was nearly the same as that used in Masheli 2.0 (see Chapter
4.4.3.2). The survey was again designed to evaluate the user’s sense of rapport, the naturalness of the
code-switching, and the feeling of connection due to language identity. The survey consisted of thirteen 5-point Likert-scale questions, with answers scored from 1 (strongly disagree) to 5 (strongly agree). Twelve questions are the same as in the Masheli 2.0 experiment, with the addition of question 10
for evaluating the DAPEL system’s small talk. Again, the final two questions were open-ended questions
where participants could write sentences to respond. All survey questions were optional, and participants
were informed that they could choose to skip any questions.
1. The system understood me.
2. The system seemed unengaged.
3. The system was friendly.
4. The system and I worked towards a common goal.
5. The system and I did not seem to connect.
6. I didn’t understand the system.
7. The system knows the Choctaw language.
8. The interaction was interesting.
9. The interaction felt natural.
10. I enjoyed the small talk.
11. I felt the system and I were in the same social group.
12. I would be willing to continue the conversation with the system for longer.
13. I would recommend interacting with this system to a friend.
14. Was there anything else that you wanted to talk to the system about? (open-ended)
15. Do you have any other comments to share about your experience? (open-ended)
Questions were selected to determine levels of rapport (1, 2, 4, 5, 6, 9) and engagement and connection
(3, 8, 10, 11, 12, 13). I hypothesized that the code-switching cohort would score the system higher on these
questions. Question 10 evaluates an aspect of the system (small talk), and question 7 measures people’s
perception of the system’s knowledge of the Choctaw language. I hypothesized participants might score
the non-framework system lower on question 7 than the other interaction groups.
5.5.3.3 Experiment session
The experiment session lasted thirty minutes, and participants could continue voluntarily. Participants first received a consent form. I then read the following script.
In this task, you will be speaking with a computer program. The program is collecting recordings of the Choctaw language. The program will ask you a question that you should respond
to in Choctaw. When you finish, the program will then ask you to summarize what you said in Choctaw; you can be as brief or detailed as you’d like in the summary. The program will then
ask you a few follow-up questions and/or engage in some chit-chat with you. After that, it
will ask you another question for you to respond to in Choctaw. The program will follow this
pattern and will run for 20 minutes. Answer in ways that feel comfortable and natural for you,
and please only share information in your responses that are appropriate for you personally
and culturally. All parts of this study are voluntary.
I provided participants with a dictionary, scratch paper, and a pen and verbally told them they were
welcome to use these materials to prepare their responses if they wished. Next, I guided participants
through the first prompt, summary, and chit-chat screens. After the first round, I asked participants if they wanted assistance clicking through the screens or preferred to work independently. For participants who requested assistance, I sat on their right side and confirmed when they wanted to start and finish recordings and proceed to the next screen. For the independent participants, I remained in the experiment room in case they needed technical assistance. At the end of twenty minutes, all participants were given the survey. All participants were verbally informed of the voluntary option to continue interacting with the system after the survey. Participants were paid $15 for their time and effort at the end of the experiment
session.
5.5.3.4 Research context
Several steps are required to conduct research on the Oklahoma Choctaw language or with Choctaw tribal members. First, a sponsor must review and support the work. A sponsor must be someone who works for the tribe. My sponsors have been leadership members at the School of Language: first Mr. James Parrish, then Mr. Phil Lewis, and finally Ms. Angie Williston. My sponsors evaluated the proposal for sensitivity to the community, adequate protection of tribal members, and alignment with tribal initiatives. Additionally, all of the data collected in my research was requested to be archived at the Choctaw Cultural Center’s archives to ensure that the tribe would continue to benefit from this effort.
Following a sponsor’s approval and support, I applied to the Choctaw Nation’s IRB. I submitted to USC’s IRB only after review and approval from the CNO IRB. An additional review through the US Army’s review process was required, as the Army funded the data collection for this dissertation.
5.5.3.5 Setting and participants
All experiments were conducted in person. I spent roughly two weeks in Oklahoma on the Choctaw reservation (see Figure 5.6 for the location of the reservation within the state). I collected data from participants
in the towns of Durant, Broken Bow, and Idabel, districts 9, 2, and 1, respectively. Districts are shown in
Figure 5.7. All participants self-identied as Choctaw speakers capable of holding a conversation. All
participants were English bilinguals. No other data was collected about other languages the participants
might know and speak.
The School of Language generously provided me with a research space at the Choctaw headquarters
in Durant. I primarily recruited language apprentices and teachers at this recruitment site.
I also collected data at the Choctaw Community Centers in Durant, Broken Bow, and Idabel with the
permission of the tribal representative. Participants were mainly recruited from elder lunches held once a
week at the various community centers.
5.5.4 Results
In total, 28 participants took part in the study. No data was collected about participants’ age or
gender identity. To measure the experience, I analyze responses to the post-interaction survey. I also
compare the average duration of the collected recordings and examine the variety of recorded content
across conditions.
5.5.4.1 Survey results
Using a two-tailed t-test, I first compared the responses to each survey question of the code-switching group against the monolingual group; the results are in Table 5.4. A marginally significant difference between the two groups was observed only for question 7, "The system knows the Choctaw language." This makes sense, as the monolingual system spoke no words in Choctaw. However, it is also surprising: the code-switching system switched only a minimal amount, yet users still perceived it as a knowledgeable Choctaw speaker.
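As a sketch of the per-question comparison, the two-sample t statistic can be computed as below. This is an illustration with made-up numbers using Welch's formulation (which does not assume equal variances); converting t to a two-tailed p-value additionally requires a t-distribution CDF (e.g., from scipy.stats):

```python
import math
from statistics import mean, variance

# Sketch of the two-sample comparison behind Table 5.4: Welch's t
# statistic computed from two groups of Likert responses. The input
# data in the tests are illustrative, not the study's actual responses,
# and the choice of Welch's variant is an assumption.

def welch_t(a, b):
    # standard errors of each group's mean
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / math.sqrt(va + vb)
```

A positive t here means the first group's mean is higher; the two-tailed p-value then measures how extreme |t| is under the null hypothesis of equal means.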
It became apparent during recruitment that the participant population comprised two cohorts. One
cohort comprises 13 second-language (L2) speakers of Choctaw, all of whom were recruited in Durant. The
other cohort is 15 individuals whose first language (L1) is Choctaw. I then divided the survey results based
on these two cohorts.
Separating survey responses by cohort revealed some interesting results, as seen in Table 5.5. The L1
group felt the code-switching system understood them better than the monolingual system did (p<0.09).
The L1 participants recognized the code-switching system as a Choctaw speaker (0.04) and as part of
their social group (0.04). Notably, the L2 cohort did not recognize the code-switching system as a Choctaw
speaker to the same degree.
Question  T-test  Mono avg (std dev)  CSW avg (std dev)
1 The system understood me. 0.20 3.5 (1.40) 4 (0.67)
2 The system seemed unengaged. 0.47 3.14 (1.16) 2.58 (1.50)
3 The system was friendly. 1 4.78 (0.42) 4.78 (0.42)
4 The system and I worked towards a common goal. 0.56 4.14 (1.23) 4.35 (0.84)
5 The system and I did not seem to connect. 0.34 2.5 (1.40) 2 (1.10)
6 I didn’t understand the system. 0.16 2.35 (1.33) 1.86 (1.14)
7 The system knows the Choctaw language. 0.06* 2.85 (1.51) 4.07 (0.86)
8 The interaction was interesting. 0.75 4.57 (0.75) 4.64 (0.63)
9 The interaction felt natural. 0.52 3.71 (1.43) 4.07 (1.07)
10 I enjoyed the small talk. 0.72 4.53 (0.77) 4.38 (0.86)
11 I felt the system and I were in the same social group. 0.46 3.5 (1.45) 4 (1.08)
12 I would be willing to continue the conversation with the system for longer.  1  3.92 (1.43)  3.92 (1.43)
13 I would recommend interacting with this system to a friend.  0.87  4.21 (1.18)  4.14 (1.16)
Table 5.4: Survey results of all participants comparing those who interacted with the monolingual system
versus the code-switching system. Averages and standard deviations are given for each group. A single
asterisk indicates p<0.10; two indicates p<0.05.
I continued to compare the cohorts, shown in Table 5.6. In the monolingual column (Mono), I compared
the survey results of the L1 cohort (eight participants) against the L2 cohort (six participants) that interacted
with the monolingual system. The L1 cohort was significantly more likely than the L2 cohort to rate the
monolingual system as unengaged (0.02). The code-switching columns (CSW) make the same comparison,
with five L1 speakers and nine L2 speakers. The L1 group marginally thought the system worked with them
towards a common goal (0.08), significantly felt the system knows Choctaw (0.02), marginally thought the
interaction felt natural (0.10), significantly enjoyed the small talk (0.02), and significantly felt the system
was in the same social group as them (0.002). The final three columns also provide insights: the L2 cohort
was less likely than the L1 cohort to think the system was working towards a common goal with them
(0.07), and the L2 cohort enjoyed the small talk significantly less than the L1 cohort (0.02).
Question  All T-test  All Mono avg (std dev)  All CSW avg (std dev)  L2 T-test  L2 Mono avg (std dev)  L2 CSW avg (std dev)  L1 T-test  L1 Mono avg (std dev)  L1 CSW avg (std dev)
1 The system understood me. 0.20 3.5 (1.40) 4 (0.67) 0.64 3.33 (1.63) 3.66 (0.5) 0.09 3.62 (1.30) 4.6 (0.54)
2 The system seemed unengaged. 0.47 3.14 (1.16) 2.58 (1.50) 0.53 2.33 (1.03) 2.75 (1.38) 0.20 3.75(0.86) 2.25 (1.89)
3 The system was friendly. 1 4.78 (0.42) 4.78 (0.42) 0.67 4.66 (0.51) 4.77 (0.44) 0.75 4.87 (0.35) 4.8 (0.44)
4 The system and I worked towards a common goal. 0.56 4.14 (1.23) 4.35 (0.84) 0.53 3.66 (1.50) 4.11 (0.92) 0.45 4.5 (0.92) 4.8 (0.44)
5 The system and I did not seem to connect. 0.34 2.5 (1.40) 2 (1.10) 0.38 2.66 (1.21) 2.11 (1.05) 0.49 2.37 (1.59) 1.8 (1.30)
6 I didn’t understand the system. 0.16 2.35 (1.33) 1.86 (1.14) 0.11 2.5 (1.22) 1.5 (0.53) 0.87 2.25 (1.48) 2.4 (1.67)
7 The system knows the Choctaw language. 0.06* 2.85 (1.51) 4.07 (0.86) 0.14 2.66 (1.50) 3.77 (0.83) 0.04** 3 (1.60) 4.75 (0.5)
8 The interaction was interesting. 0.75 4.57 (0.75) 4.64 (0.63) 0.89 4.5 (0.83) 4.55 (0.72) 0.60 4.62 (0.74) 4.8 (0.44)
9 The interaction felt natural. 0.52 3.71 (1.43) 4.07 (1.07) 0.71 3.5 (1.51) 3.77 (1.20) 0.23 3.87 (1.45) 4.6 (0.54)
10 I enjoyed the small talk. 0.72 4.53 (0.77) 4.38 (0.86) 0.88 4.2 (1.09) 4.11 (0.92) 0.17 4.75 (0.46) 5 (0)
11 I felt the system and I were in the same social group. 0.46 3.5 (1.45) 4 (1.08) 0.74 3.33 (1.36) 3.55 (1.01) 0.04** 3.62 (1.59) 5 (0)
12 I would be willing to continue the conversation with the system for longer.  1  3.92 (1.43)  3.92 (1.43)  0.62  4.5 (0.83)  4.22 (1.30)  0.91  3.5 (1.69)  3.4 (1.67)
13 I would recommend interacting with this system to a friend.  0.87  4.21 (1.18)  4.14 (1.16)  0.35  3.66 (1.50)  4.33 (0.70)  0.37  4.62 (0.74)  3.8 (1.78)
Table 5.5: Survey results of all participants divided into L1 and L2 cohorts comparing those who interacted with the monolingual system versus
the code-switching system. Bolded column names are T-test results. Averages and standard deviations are given for each group. A single asterisk
indicates p<0.10; two indicates p<0.05.
Question  Mono T-test  Mono L2 avg (std dev)  Mono L1 avg (std dev)  CSW T-test  CSW L2 avg (std dev)  CSW L1 avg (std dev)  L1 v L2 T-test  L2 avg (std dev)  L1 avg (std dev)
1 The system understood me. 0.72 3.62 (1.30) 3.33 (1.63) 0.01** 3.66 (0.5) 4.6 (0.54) 0.27 3.53 (1.06) 4 (1.15)
2 The system seemed unengaged. 0.01** 3.75 (0.88) 2.33 (1.03) 0.65 2.75 (1.38) 2.25 (1.89) 0.20 2.57 (1.22) 3.25 (1.42)
3 The system was friendly. 0.42 4.87 (0.35) 4.66 (0.51) 0.93 4.77 (0.44) 4.8 (0.44) 0.48 4.73 (0.45) 4.84 (0.37)
4 The system and I worked towards a common goal. 0.26 4.5 (0.92) 3.66 (1.50) 0.08* 4.11 (0.92) 4.8 (0.44) 0.07* 3.93 (1.16) 4.61 (0.79)
5 The system and I did not seem to connect. 0.70 2.37 (1.59) 2.66 (1.21) 0.66 2.11 (1.05) 1.8 (1.30) 0.72 2.33 (1.11) 2.15 (1.46)
6 I didn’t understand the system. 0.73 2.25 (1.48) 2.5 (1.22) 0.29 1.5 (0.53) 2.4 (1.67) 0.45 1.92 (0.99) 2.30 (1.49)
7 The system knows the Choctaw language. 0.69 3 (1.60) 2.66 (1.50) 0.02** 3.77 (0.83) 4.75 (0.5) 0.65 3.33 (1.23) 3.58 (1.56)
8 The interaction was interesting. 0.77 4.62 (0.74) 4.5 (0.83) 0.45 4.55 (0.72) 4.8 (0.44) 0.54 4.53 (0.74) 4.69 (0.63)
9 The interaction felt natural. 0.65 3.87 (1.45) 3.5 (1.51) 0.10* 3.77 (1.20) 4.6 (0.54) 0.31 3.66 (1.29) 4.15 (1.21)
10 I enjoyed the small talk. 0.33 4.75 (0.46) 4.2 (1.09) 0.02** 4.11 (0.92) 5 (0) 0.02** 4.14 (0.94) 4.83 (0.38)
11 I felt the system and I were in the same social group. 0.71 3.62 (1.59) 3.33 (1.36) 0.002** 3.55 (1.03) 5 (0) 0.23 3.46 (1.12) 4.08 (1.44)
12 I would be willing to continue the conversation with the system for longer.  0.17  3.5 (1.69)  4.5 (0.83)  0.37  4.22 (1.30)  3.4 (1.67)  0.11  4.33 (1.11)  3.46 (1.61)
13 I would recommend interacting with this system to a friend.  0.19  4.62 (0.74)  3.66 (1.50)  0.55  4.33 (0.70)  3.8 (1.78)  0.59  4.06 (1.09)  4.30 (1.25)
Table 5.6: Survey results comparing L2 to L1 participants who interacted with a given system. Mono indicates the monolingual system, and CSW
indicates the CSW system. A comparison of all L2 participants against all L1 participants, regardless of system, is given in the three rightmost
columns. Averages and standard deviations for a given group and system are given for each pairing. A single asterisk indicates p<0.10; two indicates
p<0.05.
Group Positive Neutral Negative
Mono L1 1 0 0
Mono L2 0 1 0
CSW L1 1 0 0
CSW L2 1 1 0
Table 5.7: Sentiment analysis on first open-ended survey question
There were two additional open-ended questions at the end of the survey. The first question asked,
"Were there any other questions the system should have asked you about to better document the Choctaw
language?" Seventeen participants wrote a response; however, twelve of these responses were "no" or
something similar, such as "I can't think of anything." I conducted sentiment analysis on the remaining
responses using Excel's Power Apps, rating each as positive, neutral, or negative. None of the responses
were negative, and most were positive. The sentiment of the responses is summarized in Table 5.7.
The responses to the question were (presented as entered in the survey response):
1. "I feel like the system could ask about how one could learn the choctaw language or our experiences
with the choctaw language, but for the most part I enjoyed it."
2. "Would have liked to have the questions beforehand to prepare longer, better responses"
3. "I think conversations about family and growing up are more common and therefore have a need to
be documented and understood. However, I think it would be interesting to expand into discussion
about instructions, ways of life, how people think about social and political issues. At least when
gaining information from uent speakers."
4. "what are some of the dialect differences that you have heard?"
5. "Questions were good."
Some of these suggestions could be incorporated into a future system, such as allowing users to pick
which questions they would like to respond to at the start of the interaction. Suggestions about topics for
prompt questions could be easily incorporated into future designs.
Group Positive Neutral Negative
Mono L1 1 0 2
Mono L2 1 0 0
CSW L1 1 2 0
CSW L2 4 1 0
Table 5.8: Sentiment analysis on final survey question
The final open-ended survey question was, "Do you have any other comments to share about your
experience?" Fifteen participants responded, of which three stated "no" or similar. I conducted sentiment
analysis on the content of the remaining twelve responses. Most responses were positive. Only two were
negative, both from the monolingual L1 cohort.
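The ratings in Tables 5.7 and 5.8 were produced with Excel's Power Apps. As a rough, hypothetical stand-in for that rating step, a simple lexicon-based rater could look like the sketch below; the word lists here are illustrative assumptions, not the tool's actual method.

```python
# Illustrative positive/negative word lists (assumptions, not Power Apps' lexicon).
POSITIVE = {"enjoyed", "good", "interesting", "fun", "liked"}
NEGATIVE = {"boring", "confusing", "frustrating", "bad"}

def rate_sentiment(text):
    """Rate a free-text survey response as positive, neutral, or negative."""
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

For example, `rate_sentiment("Questions were good.")` rates the fifth listed response as positive, matching Table 5.7.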
Some participants shared thoughts orally at the end of the session that they did not include in their
survey responses.
• Participant 19 remarked that the next button was not easy to click and suggested larger button sizes.
They also liked the questions.
• Participant 21 said they liked the questions in the prompts.
• Participant 25 stated they had fun and enjoyed the experience.
In addition, multiple participants voiced their relief during the session that they could consider their
prompt responses before starting the recording. Many consulted reference materials or took notes before
pressing the record button. As a result, most people did not re-record any of their responses.
5.5.4.2 Audio duration
Next, I evaluated potential differences between the groups in the length of the audio recordings for the
prompts, summaries, and small talk portions of the interactions.
Analyzing the audio durations presents several challenges. First, not all participants completed the
same prompts. Participants were encouraged to skip any they did not feel comfortable with or interested in
answering. No single prompt was answered by all participants, so a balanced comparison is not possible.
Additionally, some participants had extra free time to continue the interaction for longer. For example,
Participant 23, the only participant to reach the final prompt, had no other time commitments; this
individual also gave the longest responses for most prompts, which heavily skewed overall durations and
averages for the code-switching group in which they participated. A second factor was that some participants
could not continue longer even if they wished to because other participants were scheduled immediately
after their experiment session. I encouraged any participants who wanted to continue to schedule a second
experiment session on a different day or time, but none went beyond verbally expressing interest.
A final challenge was that the external audio recorder malfunctioned when capturing small talk for
participants 1 through 12. As a result, only half of the participants' small talk was recorded and could be
analyzed for duration.
Mono [14]  CSW [13]  L1 [13]  L2 [14]
Average recording duration for prompts 0:05:16 (0:03:13) 0:08:56 (0:17:10) 0:08:50 (0:16:41) 0:05:05 (0:02:02)
Total recording time for prompts 1:13:42 1:56:02 1:06:10 2:03:34
Average recording duration for summaries 0:03:35 (0:02:32) 0:04:30 (0:08:24) 0:03:22 (0:01:52) 0:04:39 (0:08:14)
Total recording time for summaries 0:50:16 0:58:35 0:43:43 1:05:08
Table 5.9: Audio durations for prompts are given in hours:minutes:seconds. Standard deviations are given
in parentheses. The number of participants is given in the column headers in square brackets; the groups
are not even as L1 and L2 were not explicitly recruited, and one participant from the CSW group was
excluded due to technical recording issues.
An overview of the average duration for prompts and summaries is given in Table 5.9. One participant
was excluded from the L2 group and CSW group due to technical recording issues. The average and total
recording durations for both prompts and summaries were generally longer in the code-switching group,
but as the standard deviations indicate, there was an enormous variation in the durations observed between
individuals. In the code-switching group and L2 cohort, the variation was driven mainly by one participant
who spoke more per response and responded to nearly all of the prompts.
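The duration figures in Table 5.9 can be computed from h:mm:ss strings as in the following sketch; the helper names are my own assumptions, not part of DAPEL.

```python
from datetime import timedelta
from statistics import mean, stdev

def parse_hms(s):
    """Parse an h:mm:ss duration string, the format used in Table 5.9."""
    h, m, sec = (int(part) for part in s.split(":"))
    return timedelta(hours=h, minutes=m, seconds=sec)

def summarize(hms_list):
    """Return (average, standard deviation, total) as timedeltas,
    rounded to whole seconds."""
    secs = [parse_hms(d).total_seconds() for d in hms_list]
    return (timedelta(seconds=round(mean(secs))),
            timedelta(seconds=round(stdev(secs))),
            timedelta(seconds=round(sum(secs))))

# Usage with made-up per-participant prompt durations:
avg, sd, total = summarize(["0:05:16", "0:08:56", "0:03:35"])
```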
Limited statistical significance was found when comparing groups and conditions for total prompt
recording time and for the durations of individual prompt responses.
• Between conditions, ignoring groups: No statistical significance was found (in other words,
comparing monolingual versus code-switching while disregarding the L1 or L2 group) for either the total
or individual prompt duration times.
• Between groups, ignoring conditions: No statistical significance was found comparing the total
prompt duration times. However, when comparing the responses of all of the L2 speakers to all of the
L1 speakers, a p-value of 0.03 was found for Prompt 4 ("If you could go back and talk to one ancestor,
who would it be and why?"), 0.04 for Prompt 6 ("What's the oldest possession you currently have? Why
do you still have it?"), and 0.01 for Prompt 7 ("Can you tell me a story from your childhood?"). In all
instances, the L2 speakers spoke more on average than the L1 speakers.
• Between groups and within conditions: No statistical significance was found comparing the total
prompt duration times. Among individual prompts, a p-value of 0.03 was found for Prompt 4 when
comparing the responses of the CSW L2 speakers to the CSW L1 speakers, with the L2 speakers speaking
an average of 0:01:56 and the L1 speakers an average of 0:00:15.
• Within groups and between conditions: Again, no statistical significance was found comparing
the total prompt duration times. A p-value of 0.03 was found for Prompt 4, and a p-value of 0.02 for
Prompt 6, when comparing the L1 speakers paired with the monolingual system against the L1 speakers
paired with the code-switching system. In both cases, the L1 speakers paired with the monolingual system
spoke more (an average of 0:00:28 compared to 0:00:15 for Prompt 4, and 0:00:45 compared to 0:00:16
for Prompt 6).
Figure 5.8 visualizes the total recording durations per condition (monolingual versus code-switching
system) and group (L1 versus L2) for prompt responses. The first few questions have high durations, with
Prompt 2 ("Can you share a good story your grandparents or other family members told you about their
lives?") being a very popular prompt for all participants; the total duration for the monolingual group
was nearly sixteen minutes, and for the code-switching group a little under eleven minutes. Prompts 11
("If money were no issue, how would you spend your time?") and 12 ("What's one thing that someone
borrowed from you and never brought back; do you miss it?") elicited among the shortest responses, with
total durations under one minute for all conditions and groups. Additional details are given in Table 6.4.
Finally, I had anticipated that small talk engagement would be higher in the code-switching group.
However, an equal number of participants in both groups contributed an equal number of small talk
responses (12 per group).
Given this population size, individual differences and schedules were stronger determinants of duration
and small talk engagement than which system a participant was paired with. Participants were equally
likely to speak at length with either system.
5.5.4.3 Measuring diversity in recordings
One interesting finding was that many people spoke informally with the system. It is common to shorten
words in Choctaw in informal settings, and this was noted in many responses, for example, saying chukma
instead of achukma ("good"). The Choctaw transcriptionist also noted that many people spoke in a slang
style, frequently substituting "g" in place of "k," resulting in words like achugma instead of achukma. A
second general finding was that people were eager to speak Choctaw in other sections of the interaction,
such as in the summaries. Three participants gave summaries in Choctaw, and two of them each gave two
summaries in Choctaw. Several participants also spoke in Choctaw during the small talk. It is unclear why
these language errors occurred, as all participants were guided through a complete round of prompt,
summary, and small talk. None of the participants made errors in the first round; errors typically occurred
toward the middle or end of the interaction. It is possible people were inattentive to which part of the
interaction they were completing, were tired, forgot which language to use for a given part, or needed the
interface to reiterate which language should be used for each part. In instances where Choctaw was spoken
instead of English in the small-talk part, the off-the-shelf English speech recognizer used in the monolingual
English and code-switching versions of the system was not able to process Choctaw. I wrote all of the small
talk responses as if-else statements (see Section 5.5.2.4), and as a result, Choctaw responses were all
captured by the else statements. However, since not all of the small talk was recorded, I am unable to
accurately measure how many participants in total used Choctaw in other parts of the interaction.
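The if-else small talk handling described above can be illustrated as follows. This is a minimal sketch, not the system's actual rules: the keywords and replies are invented, and the point is only the structure in which an else branch absorbs anything the English recognizer cannot match, including Choctaw speech.

```python
def small_talk_reply(asr_text):
    """Hand-written if-else small talk rules over English ASR output."""
    text = asr_text.lower()
    if "weather" in text:
        return "It has been lovely out lately."
    elif "family" in text:
        return "I hope your family is doing well."
    else:
        # Fallback: fires for unmatched English and for Choctaw input,
        # since the English-only ASR produces no recognizable keywords.
        return "That's interesting! Shall we continue?"
```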
Finally, I analyzed how lexically rich the prompt responses were. I first looked at the total word counts
of transcriptions of each response to the prompts; the overview is given in Table 5.10. The unique word
count was measured per response. The average total word count and average unique word count
were both higher for the code-switching cohort. These word counts are raw counts of all words said in the
response, including any English words. The participants with the highest and lowest word counts were
both in the code-switching group. No statistical significance was found between the different groups or
cohorts for word count or unique word count.
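The total and unique word counts above can be sketched as below. The tokenizer is an assumption on my part, since the dissertation does not describe how the transcriptions were tokenized (for example, how contractions like story't were split).

```python
import re

def word_counts(transcript):
    """Return (total words, unique words) for one transcribed response.

    Naive tokenizer: runs of letters/apostrophes, lowercased; an
    illustrative assumption, not the dissertation's actual method.
    """
    tokens = re.findall(r"[a-zA-Z']+", transcript.lower())
    return len(tokens), len(set(tokens))
```

For instance, `word_counts("Chukma, chukma! Achukma.")` gives 3 total words and 2 unique words.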
The prompts with the highest average total words in the monolingual group were Prompt 2 ("Can you
share a good story your grandparents or other family members told you about their lives?") and Prompt 3
("Who are you closest with in your family? Why is that?"), with averages of 103 and 77 respectively. Prompt 2
was also popular with the code-switching group, with an average word count of 64, as was Prompt 4 ("If you
could go back and talk to one ancestor, who would it be and why?"), also with an average of 64.
                    Word Count      Unique Word Count
High total          3592 (CSW, L2)  1474 (CSW, L2)
Low total           32 (CSW, L2)    27 (CSW, L2)
Average Mono        404             233
Standard deviation  27.76           13.05
Average CSW         537             275
Standard deviation  109.66          20.32
Average L1          356             238
Standard deviation  18.05           9.89
Average L2          533             249
Standard deviation  60.72           22.57
Table 5.10: Overview of total and total unique word counts for all prompt responses in interaction.
Regarding unique word counts, prompts 2 and 3 elicited the highest average unique word counts in
the monolingual group, respectively 52 and 44. In the code-switching cohort, prompt 2 elicited 35 unique
words on average, and prompt 4 had the next highest average with 32. Prompt 14 had a high average
word count and unique word count for the code-switching group; however, this was driven mainly by one
participant. As only two other participants in the cohort answered this same prompt, and with much lower
scores, prompt 14 is not a good indicator of an average for the code-switching group overall.
Prompt 12 produced the lowest average total words (15 monolingual; 13.5 code-switching) and the
lowest number of unique words (12.5 for both) for both cohorts. The prompt was, "What's one thing that
someone borrowed from you and never brought back; do you miss it?"
Next, I looked at examples of code-switching in the participants' prompt responses. Only five participants
did not use any code-switching in their responses: two in the monolingual group and three in the
code-switching group. The monolingual group used an average of 13 English words in their prompt
responses, with a standard deviation of 12.69; the highest usage was 35 English words. The average number
of English words in the code-switching group was 26, double that of the monolingual group, with a standard
deviation of 65.19. The highest usage was by Participant 23, who, as mentioned previously, was the only
participant to go through the entire list of prompts.
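Counting English insertions as described above might be sketched like this. The `ENGLISH_LEXICON` here is a hypothetical stand-in, since the dissertation does not name the wordlist or method used to identify English tokens; its entries are drawn from the examples discussed in this section.

```python
# Hypothetical English wordlist; a real analysis would use a full lexicon.
ENGLISH_LEXICON = {"classmates", "class", "ring", "saxophone", "well",
                   "insurance", "nineteen", "eighty", "two"}

def count_english(tokens):
    """Count tokens in a transcribed response that appear English."""
    return sum(1 for t in tokens if t.lower() in ENGLISH_LEXICON)
```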
The majority of the code-switches were nouns (classmates), noun phrases (class ring), or proper nouns
(Santa Claus). Many nouns have no equivalent in Choctaw, such as saxophone, state senate, and insurance.
No L2 speakers code-switched familial relations, but this was observed several times in the L1 group. Another
common code-switch for all groups was fillers, such as well. One interesting code-switching example was
a noun with a Choctaw subject pronoun attached, story't ("the story"). All participants who mentioned years
in their responses code-switched them, preferring to say, for example, nineteen eighty-two. Only one
participant gave two entire prompt responses in English; this participant was in the monolingual English
cohort. An uncommon example was switching a verb, such as raise in the context of raising a child; this was
observed in only one response. Another uncommon switch was inserting an entire English phrase, such as
how do I say.
5.6 Discussion
One conclusion is that the system is a successful method for recording a language in the absence of a
language practitioner such as a linguist. Participants were able to engage and interact with the system
independently. All responses meaningfully contributed to the documentation of the Choctaw language.
The experiments described in this chapter demonstrate that bilingual dialogue systems can be used to
successfully gather language data on Indigenous, endangered, low-resource languages.
The research questions and findings for the code-switching experiment of DAPEL were:
1. DCSW-1: Will code-switching lead to a better user experience? The results indicate that L1
speakers have a positive response to code-switching. L2 speakers do not respond to it in the same
way.
2. DCSW-2: Will code-switching lead to an increase in recorded audio? I found that code-switching did not lead to an increase in the amount of recorded audio.
3. DCSW-3: Will users prefer a system that uses a linguistic framework for code-switching
over one that does not (non-framework)? Unfortunately, this question could not be tested due
to limitations in recruitment.
4. DCSW-4: How do users want to interact with small talk in the system, using buttons or
speaking aloud when ready? Some users were confused about the small talk not having a record
button and unsure when to start speaking. A record button should be integrated into this page for
simplicity and clarity.
5. DCSW-5: Will users like having a recording button in the prompt section? Will users forget
to press the button to start the recording? Most users did not have an issue remembering to press
the record button. Users expressed liking the visual cue that they were ready to start speaking with
the system and knew when the recording was completed.
To respond to the overarching questions of DAPEL:
• D-1: Will the proposed system be useful for collecting endangered language audio data? I
found that DAPEL was able to collect around 1,500 unique words and 500 new words to add to my
corpus.
• D-2: Will users be comfortable using two languages in a conversation but only communicating in one with the system? All pilot studies and experiments demonstrated that users were
comfortable being recorded in one language and interacting with the system in a second language.
• D-3: Will the proposed system be comparable in recorded audio duration or in the level of
enjoyment the user experiences as that with a human interviewer? The pilot study demonstrated
that both were slightly, but not substantially, lower than with a human interviewer.
• D-4: Will code-switching improve the user's experience through increased scores on a rapport
survey, higher number of unique words in recordings, or longer recording durations?
A difference was observed between L1 speakers and L2 speakers. Additional research is encouraged
to better understand why different speakers responded in separate ways to the system's code-switching.
The literature in Section 1.3.1 indicated that code-switching can be a means to build trust
and intimacy in a relationship for code-switching speakers.
I will now review the findings in relation to the hypotheses.
1. Bilingual systems lead to a better user experience for language preservation interactions
than monolingual English systems.
This was not found to be the case overall. However, based on the survey responses, it was true when
comparing L1 speakers to L2 speakers. The L1 speakers significantly felt they related to the code-switching
system, which indicates there were different language ideologies among some of the participants. While
learning speakers are working to suppress the use of English and may view it as incorrect behavior, the
first-language speakers were predominantly older and did not perceive using English nouns and phrases
when speaking Choctaw as wrong. The same preference for the code-switching system was not observed
in the L2 group.
2. Bilingual systems lead to longer recorded responses in language preservation interactions
than monolingual English systems.
No significant differences in recorded duration were found between the groups. All participants generally produced substantial recordings of the prompts and small talk.
3. Modeling framework bilingualism leads to better dialogue in terms of greater and more
diverse production in the language being preserved than monolingual English or random bilingual systems.
Participants paired with the code-switching system produced more total words and more unique words
on average than those paired with the monolingual system. The total unique words said per participant
for all prompts during the interaction were slightly higher for the monolingual cohort.
4. The monolingual Choctaw system produces longer recorded responses and higher rates
of positive user experiences than the monolingual English system. This still needs to be
tested. I leave it to future work to recruit additional participants to test the monolingual Choctaw and
non-framework code-switching systems.
5.7 Summary of Findings
The system captured roughly 500 new words that were not present in the ChoCo corpus (see Section 3.0.1).
For comparison, the ChoCo corpus contains roughly 300,000 tokens, the Byington dictionary (Byington
1915) around 15,000 words, and the newest dictionary around 5,000 (The Choctaw Nation of Oklahoma
Dictionary Committee 2016). Many of the 500 new words were inected forms, which were not attested
to in previous dictionaries. Over 1,500 unique Choctaw words were said in the total collected audio.
There were also several interesting code-switch examples in the collected audio. One participant
borrowed the word "story" from English and formed a contraction with the subject pronoun "Vt", creating
the inflected word "story't". The framework linguistic literature stated that switches would not occur at
bound morphemes (Poplack 2000), yet this example shows a noun phrase with the subject marker in
Choctaw while the noun is in English. Additionally, the contraction would not be considered acceptable
by the literature.
To summarize the findings of the DAPEL experiments: in pilot studies one and two, I observed that
people were comfortable speaking in two languages with the system even if the system only interacted in
one language. In the code-switching DAPEL study, I expanded the system to code-switch between two
languages.
In the code-switching DAPEL study, I found that the best system design regarding whether to include
code-switching may depend on the speaker's fluency. I observed that the code-switching system was better
than the monolingual English system for L1 speakers but did not have an impact on L2 participants. I
observed that L1 speakers of Choctaw identified strongly with the code-switching system over the
monolingual system. L2 speakers did not show this same level of identification with either system. Neither
group rated a high preference for speaking longer with the system or recommending the system to others.
All open-ended survey responses were positive or neutral; no participants felt negatively about the
interaction.
While the L1 group’s preferences for the code-switching system over the L2 group manifested in higher
survey scores, it did not impact the duration of the audio collected. Instead, all speakers were as likely to
record long responses or voluntarily continue past the required maximum experiment time regardless of
the system.
The preferences for a specific system slightly impacted the total number of words and unique words
that users said, with the code-switching cohort speaking slightly more and using a bit more vocabulary
per prompt.
Figure 5.1: DAPEL system interaction
Figure 5.2: Interface of the system showing the prompt page
Figure 5.3: Screenshot of the summary page
Figure 5.4: Screenshot of the small talk page
Figure 5.5: All versions of DAPEL, illustrating ASR components and which language would be spoken by
the system and user at each point in the interaction
Figure 5.6: The Choctaw Nation reservation is located in the southeastern corner of Oklahoma. Figure
from Google Maps.
Figure 5.7: The Choctaw Nation has 12 districts. Durant is located in District 9, Broken Bow in District 2,
and Idabel in District 1. Figure from Wikimedia.
Figure 5.8: Total recording durations per condition (Mono versus code-switching) and group (L1 versus
L2) for prompt responses.
Chapter 6
Conclusions
This chapter will conclude the dissertation by summarizing the key research findings in relation to the
research aims and questions and discussing their value and contributions. Additionally, it will address the
study’s limitations and suggest areas for future research.
6.1 Summary
This dissertation expanded research on bilingual dialogue systems and how code-switching can enhance
the user experience. Table 6.1 gives an overview of the research questions addressed. The two studies in
this dissertation investigated unbalanced bilingualism in two scenarios with dialogue systems: a chatbot
application for gaining conversational fluency and an application for recording endangered languages.
At a high level, the results indicate that code-switching adds to the user experience for learning speakers
interacting with a chatbot. Code-switching also adds to the user experience of L1 speakers (those
who learned the language as a child) interacting with a language documentation dialogue system.
6.1.1 Masheli
I conducted two experiments with a chatbot involving human subjects. In the first experiment, the chatbot
could code-switch between turns but could not process or say intra-turn code-switched utterances within
Section        Number   Question                                                                     Addressed
Overall        O-1      Could dialogue systems lead to useful applications, such as for language learning and language preservation? To what degree can bilingual dialogue systems facilitate this process?   X
Overall        O-2      What code-switching strategies can lead to an increase in learning when interacting with a bilingual chatbot?   X
Overall        O-3      What code-switching strategies can lead to an increase in recorded audio when interacting with a bilingual dialogue system?   X
Overall        O-4      Will users show a higher preference for, level of enjoyment of, or rapport with a code-switching system?   X
Overall        O-5      Will users be comfortable using two languages in a conversation but only communicating in one with a dialogue system?   X
Chapter 4.3    M1.0-1   Do language learners enjoy a code-switching chatbot?   X
Chapter 4.3    M1.0-2   Do language learners like a chatbot that uses inter-turn code-switching?   X
Chapter 4.4    M2.0-1   Will translanguaging lead to a better user experience? Will users show a higher preference for the translanguaging chatbot?   X
Chapter 4.4    M2.0-2   Will translanguaging lead to an increase in learning?   X
Chapter 4.4    M2.0-3   Will users prefer a chatbot that uses a linguistic framework for code-switching over one that does not (non-framework)?
Chapter 5      D-1      Will the proposed system be useful for collecting endangered language audio data?   X
Chapter 5      D-2      Will users be comfortable using two languages in a conversation but only communicating in one with the system?   X
Chapter 5      D-3      Will the proposed system be comparable in recorded audio duration or in the level of enjoyment the user experiences as that with a human interviewer?   X
Chapter 5      D-4      Will code-switching improve the user's experience through increased scores on a rapport survey, higher number of unique words in recordings, or longer recording durations?   X
Chapter 5.4.1  D1-1     Do people respond with lengthy and meaningful responses when prompted by a computer to record answers in another language?   X
Chapter 5.4.1  D1-2     Is a summarizing act of what was said previously in the other language natural for the user and the dialogue flow?   X
Chapter 5.4.1  D1-3     Does small talk help users feel comfortable with the interaction? Are the small talk responses adequate?   X
Chapter 5.4.3 D2-1 What can be changed in the system’s dialogue design to improve the
naturalness and comfort that users experience?
X
Chapter 5.4.3 D2-2 What can be changed in the system’s dialogue design to increase the
duration that the user speaks in the endangered language?
X
Chapter 5.4.3 D2-3 How could code-switching be incorporated into the interaction? X
Chapter 5.5 DCSW-1 Will code-switching lead to a better user experience? X
Chapter 5.5 DCSW-2 Will code-switching lead to an increase in recorded audio? X
Chapter 5.5 DCSW-3 Will users prefer a system that uses a linguistic framework for codeswitching over one that does not (non-framework)?
Chapter 5.5 DCSW-4 How do users want to interact with small talk in the system, using
buttons or speaking aloud when ready?
X
Chapter 5.5 DCSW-5 Will users like having a recording button in the prompt section? Will
users forget to press the button to start the recording?
X
Table 6.1: All research questions listed in this dissertation. The leftmost column states where in the chapter
it was discussed, and the rightmost column indicates whether the research question was addressed.
the dialogue. In the second experiment, I compared a monolingual Choctaw Masheli 2.0 chatbot with a
code-switching version that included translanguaging.
The findings from the second experiment suggest that users favored the code-switching chatbot over the monolingual one, as indicated by survey results. Both groups showed learning gains on a vocabulary and grammar quiz; the monolingual group learned slightly more, though the difference was not significant.
A key finding was that the code-switching chatbot was more effective at retrieving stories for both groups; the increased number of story request attempts that the monolingual cohort experienced was correlated with negative perceptions of the chatbot's ability to speak Choctaw. In the code-switching group, strong correlations between survey responses and the number of story requests and stories received indicated that participants felt a sense of shared social identity with the system.
6.1.2 DAPEL
I demonstrated that the system is a viable means for recording endangered languages, as it captured over 1,500 unique Choctaw words in the collected audio, of which 500 were new words not found in the ChoCo corpus (see Section 3.0.1).
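As an illustrative sketch only (the strings below are stand-ins, not the actual ChoCo or DAPEL data, and the tokenization rule is an assumption), the unique-word comparison described above amounts to a set difference over normalized tokens:

```python
import re

def unique_words(text: str) -> set:
    # Lowercase and extract letter runs; a real Choctaw tokenizer would
    # also need to handle diacritics such as the underlined vowels.
    return set(re.findall(r"[a-z\u0331]+", text.lower()))

# Stand-ins for the ChoCo corpus text and the new DAPEL transcripts.
choco_vocab = unique_words("ofi katos oka")
dapel_vocab = unique_words("oka chukka himona")

# Words attested in the new recordings but absent from the corpus.
newly_recorded = dapel_vocab - choco_vocab
```

The same set arithmetic scales to the full corpus: the size of `newly_recorded` is the count of newly documented word forms.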
From the DAPEL experiments, I found that participants were comfortable speaking in two languages with the system, even when it used only one language. In the second experiment, I introduced code-switching between the two languages.
I discovered that the effectiveness of including code-switching depended on the speaker's fluency. The code-switching system was better for L1 speakers but did not affect L2 participants. L1 speakers identified more with the code-switching system, while L2 speakers showed less identification with either system. All open-ended survey responses were positive or neutral; no one reported negative feelings about the interaction.
Although L1 speakers preferred the code-switching system, this did not affect the length of the audio collected. All speakers were equally likely to give long responses or to continue speaking beyond the maximum required time of the experiment session.
6.2 Contributions
This dissertation makes significant contributions to the field of bilingual dialogue systems research. In our multilingual world, dialogue systems should ideally be at least bilingual, if not multilingual, to effectively support diverse users.
It also highlights critical gaps in the existing research on bilingual dialogue systems. First, the literature review on bilingual dialogue systems (Section 2.4) developed in this work offers a comprehensive overview of previous studies and outlines potential future directions for research. Second, this dissertation addresses several key gaps by exploring the socio-psychological aspects of code-switching and examining intra-turn, intra-sentential code-switching.
My experiments yielded novel insights into the user experience of interacting with a code-switching dialogue system. I discovered that learners find a code-switching chatbot more enjoyable than a monolingual system, which could positively affect learner retention. Maintaining learner engagement is crucial for endangered languages to prevent further language loss. Additionally, I found that L1 speakers favor a code-switching dialogue system for recording endangered languages. Many of these languages have never been documented, and the ability to quickly record and preserve them could be vital for their potential reclamation in the future.
Additionally, as stated in the Introduction (see Section 1.7), several peer-reviewed journal and conference publications have resulted from this dissertation, along with a number of publications in the media.
Next, this dissertation meaningfully added to Choctaw language documentation. Prior to this work, no data set suitable for computational techniques existed for Choctaw. I collected over 300,000 tokens in ChoCo, a Choctaw language corpus (Section 3.0.1). In addition, I added roughly 300 minutes of novel Choctaw recordings through the DAPEL experiments, providing a snapshot of the state of spoken Choctaw both in general and post-COVID (an event that sadly killed many speakers). The text conversations collected from the Masheli experiments are a valuable data set for linguists and for language instructors working with learning Choctaw speakers in conversation. The complete data set is useful for linguistic studies, language learners, and computational methods that require larger amounts of data.
As a whole, this dissertation also increased the representation of Indigenous languages in technology. It outlined how others might build similar systems for other Indigenous languages. While there are very few Indigenous computer scientists, the experiments conducted during this dissertation may have increased interest in the field within the Choctaw community. One Masheli participant shared that she was so interested in the chatbot that she was going to resume a Computer Science degree, which she had put on hold during the pandemic.
6.3 Limitations
There were several limitations in the study. First, I was unable to test the non-framework version of Masheli and the non-framework version of DAPEL due to limited recruitment. The benefit of learning whether non-framework code-switching is acceptable to users is that more lexical items, including verbs and morphemes, could be code-switched. This would mean that potentially less expert linguistic knowledge would be needed to create code-switching scripts, as knowledge about nouns and noun phrase boundaries was required to write the code-switched utterances used in the experiments in this dissertation.
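For intuition only, here is a minimal sketch of the kind of noun substitution that framework-based code-switching scripts rely on; the word list and sentence are illustrative stand-ins, and the actual scripts in this dissertation were written by hand with expert knowledge of noun phrase boundaries:

```python
# Hypothetical English-to-Choctaw noun lexicon. Under a framework-based
# approach only nouns/noun phrases are switched, so verbs and function
# words stay in the matrix language (here, English).
NOUN_LEXICON = {"dog": "ofi", "cat": "katos", "water": "oka"}

def code_switch(sentence: str) -> str:
    # Swap each noun found in the lexicon; all other words are untouched.
    out = []
    for word in sentence.split():
        key = word.lower().strip(".,!?")
        out.append(NOUN_LEXICON.get(key, word))
    return " ".join(out)

switched = code_switch("The dog drank water")
# e.g. "The ofi drank oka"
```

A non-framework system would drop the noun-only restriction, which is exactly why it would need less linguistic expertise to script but remained untested here.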
Second, code-switching can also occur phonologically (Poplack 2000), for example, by saying an English word with Choctaw intonation. This aspect of code-switching was not considered in this work, but it could be a fruitful area of new code-switching research in spoken dialogue systems.
6.3.1 Masheli
One computational limitation of Masheli was that the system could not process some of the unique characters. The system was trained using specic ASCII characters but had not been trained on some of the
other possible ASCII variations. Additionally, some of the characters did not render correctly for unclear
reasons, such as a
¯
presented as å. Participants were encouraged during the experiment to use alternative
spellings if the system could not process their original statement; however, this may have impacted user
satisfaction.
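A standard mitigation, sketched here under the assumption that the mismatches stem from variant Unicode encodings of the same character (the actual Masheli pipeline is not reproduced), is to normalize all incoming text to one canonical form before matching:

```python
import unicodedata

def canonicalize(text: str) -> str:
    # NFC composes a base letter plus combining mark into a single
    # precomposed character where one exists, so visually identical
    # inputs compare equal regardless of how they were typed.
    return unicodedata.normalize("NFC", text)

# "a" + combining acute accent becomes the precomposed letter "á".
assert canonicalize("a\u0301") == "\u00e1"

# Choctaw's underlined vowel (a + combining macron below) has no
# precomposed form, so NFC leaves it as-is but still guarantees a
# stable, comparable representation.
assert canonicalize("a\u0331") == "a\u0331"
```

Canonicalizing both the training data and user input in this way would prevent two encodings of the same letter from being treated as different tokens.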
An additional limitation of Masheli was limited recruitment. I was unable to test one of the systems (non-framework code-switching), and further recruitment may have led to statistical significance for the two systems that were tested.
6.3.2 DAPEL
There were several limitations and improvements that could be made to the system. The small talk design was based on experiment 1, where all participants gave summaries that were often shorter than the prompt response. However, some participants in experiment 2 spoke as long in their summaries as in their prompt responses. This made the speech-to-text processing in the small talk extremely slow. I suggested to some participants that they feel free to skip the small talk if they did not wish to wait. One example was Participant 22, who gave five-minute-long summaries.
A second improvement to the system would be a skip button. Participants who wished to skip a prompt
were forced to click through the prompt, summary, and small talk pages to proceed to the next prompt. A
skip button to bypass all three pages would be a worthwhile implementation to simplify the user experience
and save time.
A third improvement would be simplifying the recording setup for the small talk portion. I relied on an external device to capture the small talk; however, some responses were not recorded due to technical difficulties with the device. I was uncertain how users might prefer to record the small talk: a button might make the conversation less natural, while starting the recording as soon as the small talk page was presented might capture only errant noises. I therefore captured the small talk on an external device and based future designs on how participants responded in experiment 2. Many users were unclear whether they were being recorded and asked for clarification. An area for future work would be to include a record button for the small talk portions so that users know when the recording begins and to align with how the start of a recording is presented on other pages.
6.4 Implications
While working on a low-resource, endangered language has its challenges, this dissertation demonstrates the steps involved in creating a language corpus that can support the development of a speech recognizer, chatbot, and spoken dialogue system from scratch. Given that I spoke no Choctaw before starting the Ph.D. program, my efforts demonstrate that lack of prior language knowledge is not an insurmountable barrier to pursuing similar work.
The experiments and results of this dissertation also demonstrate that it is possible to build bilingual systems that users appreciate and that effectively support their needs. I have shown that creating such systems with limited data is feasible while still achieving meaningful impact. Furthermore, these findings highlight the importance of user-centered design in developing dialogue systems that resonate with diverse linguistic communities. The potential applications of these systems extend beyond education, offering valuable tools for cultural preservation and for enhancing communication in multilingual environments. Ultimately, this work lays the groundwork for future research and development in bilingual dialogue systems and emphasizes the necessity of linguistic inclusivity, in its many different meanings within this dissertation, in technology.
6.5 Future research directions
There are many possibilities for future research in the bilingual dialogue system and endangered Indigenous language technology spaces. A first area for future research would be exploring the limitations described above.
6.5.1 Masheli and language learning companions
One possibility for future work with Masheli and other language learning companion systems is testing the non-framework code-switching chatbot, which could not be tested in this work due to insufficient recruitment. Another is to evaluate learning gains over a longer period to determine whether additional time spent interacting with the chatbot over several sessions could produce significant results. It is possible that retention would be higher in the group paired with the code-switching chatbot, given the higher survey scores, and with higher retention, the possibility of greater learning. A final consideration is that replication is needed in other language communities to confirm that the results found here are not unique to the Choctaw language.
6.5.2 DAPEL and systems for recording endangered languages in conversation
A future direction for DAPEL and other systems for recording languages in conversation is to design one system for L1 speakers and another for L2 speakers. L1 speakers showed a significant preference for the small talk in the code-switching condition compared to their L2 counterparts.
Additionally, future work with more subjects is needed to investigate the monolingual Choctaw and the non-framework code-switching systems, as I was unable to recruit enough participants in the data collection time frame to test them. Based on the results of the work in this chapter, both L1 and L2 speakers should be recruited to test these systems.
Finally, it can be concluded that the system is viable. It could be used by the Choctaw Nation or adapted to collect data on other endangered languages. The system could easily be altered to change prompts and small talk items to make it culturally appropriate for other communities.
6.5.3 State of the Art
At the time of writing, publicly available state-of-the-art systems do not engage in code-switching with users, even when prompted. This is true of LLMs such as GPT-3 and GPT-4, which lack support for code-switching users. While these models may parse and respond to code-switched utterances, users may react negatively to corrections of their language (as in the example conversations in Section 2.4.1.1), preferring that the system code-switch with them. Supporting the many multilingual speakers who code-switch as part of their normal speech is an interesting avenue for future research.
6.5.4 Indigenous Language Technology
Finally, Indigenous language technology is a vast area for future work. Many NLP approaches have never been applied to Indigenous languages. More technology can be developed for these languages beyond language learning and revitalization, including everyday technologies such as speech-to-text and machine translation.
Bibliography
[1] Gilles Adda, Sebastian Stüker, Martine Adda-Decker, Odette Ambouroue, Laurent Besacier,
David Blachon, Hélene Bonneau-Maynard, Pierre Godard, Fatima Hamlaoui, Dmitry Idiatov, et al.
“Breaking the unwritten language barrier: The BULB project”. In: Procedia Computer Science 81
(2016), pp. 8–14.
[2] Prabhat Agarwal, Ashish Sharma, Jeenu Grover, Mayank Sikka, Koustav Rudra, and
Monojit Choudhury. “I may talk in English but gaali toh Hindi mein hi denge: A study of
English-Hindi code-switching and swearing pattern on social networks”. In: 2017 9th international
conference on communication systems and networks (comsnets). IEEE. 2017, pp. 554–557.
[3] Gustavo Aguilar, Sudipta Kar, and Thamar Solorio. “LinCE: A Centralized Benchmark for
Linguistic Code-switching Evaluation”. English. In: Proceedings of The 12th Language Resources
and Evaluation Conference. Marseille, France: European Language Resources Association, May
2020, pp. 1803–1813. isbn: 979-10-95546-34-4.
[4] Emily Ahn, Cecilia Jimenez, Yulia Tsvetkov, and Alan W Black. “What Code-Switching Strategies are Effective in Dialog Systems?” In: Proceedings of the Society for Computation in Linguistics 2020. 2020, pp. 213–222.
[5] Tarmo Ahvenainen. “Language proficiency facework and perceptions of language proficiency face in L2 interaction”. In: JYU dissertations (2021).
[6] Firoj Alam, Shammur Absar Chowdhury, Sabri Boughorbel, and Maram Hasanain. “LLMs for Low
Resource Languages in Multilingual, Multimodal and Dialectal Settings”. In: Proceedings of the
18th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial
Abstracts. 2024, pp. 27–33.
[7] Seyed Hossein Alavi, Jacqueline Brixey, and David Traum. “Can we use a spoken dialogue system
to document endangered languages?” In: Dialog for Good (DiGo). 2019.
[8] Bri Alexander. “Contextualizing technology: Designing Indigenous language CALL programs”.
PhD thesis. The University of Arizona, 2018.
[9] Dana Abu Ali, Muaz Ahmad, Hayat Al Hassan, Paula Dozsa, Ming Hu, Jose Varias, and
Nizar Habash. “A bilingual interactive human avatar dialogue system”. In: Proceedings of the 19th
Annual SIGdial Meeting on Discourse and Dialogue. 2018, pp. 241–244.
[10] Dee H Andrews, Thomas D Hull, and Jennifer A Donahue. “Storytelling as an Instructional Method: Descriptions and Research Questions”. In: The Interdisciplinary Journal of Problem-based Learning 3.1 (2009), pp. 6–28.
[11] Peter Auer. “Code-Switching in Conversation: Language, interaction and identity”. In: Routledge,
1995. Chap. 1, pp. 1–24.
[12] Peter Auer. “The pragmatics of code-switching: a sequential approach”. In: One speaker, two
languages. Ed. by Leslie Milroy and Pieter Muysken. University of Cambridge, 1995. Chap. 6,
pp. 115–135.
[13] Peter Auer and Raihan Muhamedova. “Embedded language and ‘matrix language’ in insertional language mixing: Some problematic cases”. In: Rivista di linguistica 17.1 (2005), pp. 35–54.
[14] Glenn Auld. “The role of the computer in learning Ndjébbana”. In: Language Learning & Technology 6.2 (2002), pp. 41–41.
[15] Marie Battiste. “Enabling the autumn seed: Toward a decolonized approach to Aboriginal
knowledge, language, and education”. In: Canadian Journal of Native Education 22.1 (1998).
[16] Anshul Bawa, Monojit Choudhury, and Kalika Bali. “User Perception of Code-Switching Dialog
Systems”. In: 15th International Conference on Natural Language Processing. Vol. 171. 2018.
[17] Anshul Bawa, Pranav Khadpe, Pratik Joshi, Kalika Bali, and Monojit Choudhury. “Do
Multilingual Users Prefer Chat-bots that Code-mix? Let’s Nudge and Find Out!” In: Proceedings of
the ACM on Human-Computer Interaction 4.CSCW1 (2020), pp. 1–23.
[18] Bruce A Beatie. “Macaronic poetry in the Carmina Burana”. In: Vivarium 5.1 (1967), pp. 16–24.
[19] Anne L Beatty-Martínez, Christian A Navarro-Torres, and Paola E Dussias. “Codeswitching: A
bilingual toolkit for opportunistic speech planning”. In: Frontiers in Psychology 11 (2020), p. 1699.
[20] Winoka Rose Begay. “Mobile apps and indigenous language learning: New developments in the field of indigenous language revitalization”. In: (2013).
[21] Hedi M Belazi, Edward J Rubin, and Almeida Jacqueline Toribio. “Code switching and X-bar
theory: The functional head constraint”. In: Linguistic inquiry (1994), pp. 221–237.
[22] Steven Bird. “Decolonising speech and language technology”. In: Proceedings of the 28th
International Conference on Computational Linguistics. 2020, pp. 3504–3519.
[23] Steven Bird, Florian R Hanke, Oliver Adams, and Haejoong Lee. “Aikuma: A mobile app for
collaborative language documentation”. In: Proceedings of the 2014 Workshop on the Use of
Computational Methods in the Study of Endangered Languages. 2014, pp. 1–5.
[24] Astik Biswas, Febe de Wet, Ewald van der Westhuizen, Emre Yilmaz, and Thomas Niesler.
“Multilingual Neural Network Acoustic Modelling for ASR of Under-Resourced English-isiZulu
Code-Switched Speech.” In: INTERSPEECH. 2018, pp. 2603–2607.
[25] Erman Boztepe. “Issues in code-switching: Competing theories and models”. In: Studies in Applied
Linguistics and TESOL 3.2 (2003).
[26] Jacqueline Brixey. “Resumptive L1 Bilinguals: A Conceptual Model”. In: Proceedings of the fortieth Western Conference on Linguistics. Vol. 22. 2013, pp. 47–57. url: http://www.fresnostate.edu/artshum/linguistics/documents/WECOL%202013%20VOLUME%2022.pdf.
[27] Jacqueline Brixey. “Virtual rapport with extraverted agents”. Masters Thesis. The University of
Texas at El Paso, 2015.
[28] Jacqueline Brixey and Ron Artstein. “ChoCo: a multimodal corpus of the Choctaw language”. In:
Language Resources and Evaluation 55 (2021), pp. 241–257.
[29] Jacqueline Brixey, Rens Hoegen, Wei Lan, Joshua Rusow, Karan Singla, Xusen Yin, Ron Artstein,
and Anton Leuski. “SHIHbot: A Facebook chatbot for Sexual Health Information on HIV/AIDS”.
In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue. 2017, pp. 370–373.
[30] Jacqueline Brixey, Eli Pincus, and Ron Artstein. “Chahta Anumpa: A Multimodal Corpus of the
Choctaw Language”. In: Proceedings of LREC 2018. Miyazaki, Japan, 2018.
[31] Jacqueline Brixey and David Traum. “Masheli: A Choctaw-English Bilingual Chatbot”. In:
Conversational Dialogue Systems for the Next Decade. Springer, 2021, pp. 41–50.
[32] George Aaron Broadwell. A Choctaw Reference Grammar. U of Nebraska Press, 2006.
[33] George Aaron Broadwell. “Choctaw”. In: Native Languages of the Southeastern United States.
Ed. by Janine Scancarelli and Heather K. Hardy. U of Nebraska Press, 2005.
[34] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal,
Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. “Language models are
few-shot learners”. In: Advances in neural information processing systems 33 (2020), pp. 1877–1901.
[35] Harry Bunt, Volha Petukhova, David Traum, and Jan Alexandersson. “Dialogue act annotation
with the ISO 24617-2 standard”. In: Multimodal Interaction with W3C Standards: Toward Natural
User Interfaces to Everything (2017), pp. 109–135.
[36] Wolfgang Butzkamm and John AW Caldwell. The bilingual reform: A paradigm shift in foreign
language teaching. Narr Francke Attempto Verlag, 2009.
[37] Cyrus Byington. A Dictionary of the Choctaw Language. Edited by John R. Swanton and Henry S. Halbert. Smithsonian Institution Bureau of American Ethnology Bulletin 46. US Government Printing Office, 1915.
[38] Cyrus Byington. “Grammar of the Choctaw Language”. In: Proceedings of the American
Philosophical Society 11 (1870). Edited by Daniel Garrison Brinton. Also published as a
monograph by McCalla and Stavely, Philadelphia, 1870., pp. 317–367.
[39] Lyle Campbell. American Indian Languages. Oxford University Press, 1997.
[40] Suresh Canagarajah. “Translanguaging in the classroom: Emerging issues for research and
pedagogy”. In: Applied linguistics review 2.1 (2011), pp. 1–28.
[41] Gina P Cantoni. “Using TPR-Storytelling to Develop Fluency and Literacy in Native American
Languages”. In: Revitalizing Indigenous Languages (1999), p. 53.
[42] Justine Cassell. Embodied Conversational Agents. MIT Press, 2000.
[43] Justine Cassell, Timothy Bickmore, Mark Billinghurst, Lee Campbell, Kenny Chang,
Hannes Vilhjálmsson, and Hao Yan. “Embodiment in conversational interfaces: Rea”. In:
Proceedings of the SIGCHI conference on Human Factors in Computing Systems. 1999, pp. 520–527.
[44] Justine Cassell, Alastair Gill, and Paul Tepper. “Coordination in conversation and rapport”. In:
Proceedings of the workshop on Embodied Language Processing. 2007, pp. 41–50.
[45] Morgan Cassels and Chloë Farr. “Mobile applications for Indigenous language learning: Literature
review and app survey”. In: Working Papers of the Linguistics Circle 29.1 (2019), pp. 1–24.
[46] William B Cavnar, John M Trenkle, et al. “N-gram-based text categorization”. In: Proceedings of
SDAIR-94, 3rd annual symposium on document analysis and information retrieval. Vol. 161175. Las
Vegas, NV. 1994.
[47] Jasone Cenoz and Durk Gorter. “Minority languages and sustainable translanguaging: Threat or opportunity?” In: Journal of Multilingual and Multicultural Development 38.10 (2017), pp. 901–912.
[48] Özlem Çetinoğlu, Sarah Schulz, and Ngoc Thang Vu. “Challenges of Computational Processing of
Code-Switching”. In: Proceedings of the Second Workshop on Computational Approaches to Code
Switching. 2016, pp. 1–11.
[49] Molly J Champlin. “Translanguaging and bilingual learners: A study of how translanguaging
promotes literacy skills in bilingual students”. MS Thesis. St. John Fisher College, 2016.
[50] Khyathi Chandu, Thomas Manzini, Sumeet Singh, and Alan W Black. “Language informed
modeling of code-switched text”. In: Proceedings of the Third Workshop on Computational
Approaches to Linguistic Code-Switching. 2018, pp. 92–97.
[51] Hongshen Chen, Xiaorui Liu, Dawei Yin, and Jiliang Tang. “A survey on dialogue systems: Recent
advances and new frontiers”. In: Acm Sigkdd Explorations Newsletter 19.2 (2017), pp. 25–35.
[52] Yu Chen, Scott Jensen, Leslie J Albert, Sambhav Gupta, and Terri Lee. “Artificial intelligence (AI) student assistants in the classroom: Designing chatbots to support student success”. In: Information Systems Frontiers 25.1 (2023), pp. 161–182.
[53] Neasa Ní Chiaráin and Ailbhe Ní Chasaide. “Chatbot technology with synthetic voices in the
acquisition of an endangered language: motivation, development and evaluation of a platform for
Irish”. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation
(LREC’16). 2016, pp. 3429–3435.
[54] Choctaw Tribal Language Program. Chahta im annopa hiyohli alhíha. 2005. url:
http://www.choctaw.org/culture/pdf/CLM-1ChahtaImAnnopaHiyohliAlhiha.pdf.
[55] Yunjae J Choi, Minha Lee, and Sangsu Lee. “Toward a Multilingual Conversational Agent:
Challenges and Expectations of Code-mixing Multilingual Users”. In: Proceedings of the 2023 CHI
Conference on Human Factors in Computing Systems. 2023, pp. 1–17.
[56] Chih-Yueh Chou, Tak-Wai Chan, and Chi-Jen Lin. “Redefining the learning companion: the past, present, and future of educational agents”. In: Computers & Education 40.3 (2003), pp. 255–269.
[57] Jared Coleman, Bhaskar Krishnamachari, Khalil Iskarous, and Ruben Rosales. “LLM-Assisted Rule
Based Machine Translation for Low/No-Resource Languages”. In: arXiv preprint arXiv:2405.08997
(2024).
[58] Rolando Coto-Solano, Sally Akevai Nicholas, and Samantha Wray. “Development of natural language processing tools for Cook Islands Māori”. In: Proceedings of the Australasian Language Technology Association Workshop 2018. 2018, pp. 26–33.
[59] Tyne Crow and David Parsons. “A mobile game world for Māori language learning”. In: International Conference on Mobile and Contextual Learning. Springer. 2015, pp. 84–98.
[60] Marta Dąbrowska et al. “Functions of code-switching in Polish and Hindi Facebook users’ posts”.
In: Studia Linguistica Universitatis Iagellonicae Cracoviensis 130 (2013), pp. 63–84.
[61] Amitava Das and Björn Gambäck. “Identifying languages at the word level in code-mixed Indian
social media text”. In: (2014).
[62] David DeVault, Kallirroi Georgila, Ron Artstein, Fabrizio Morbini, David Traum, Stefan Scherer,
Albert A Rizzo, and Louis-Philippe Morency. “Verbal indicators of psychological distress in
interactive dialogue with a virtual human”. In: Proceedings of the SIGDIAL 2013 Conference. 2013,
pp. 193–202.
[63] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. “BERT: Pre-training of Deep
Bidirectional Transformers for Language Understanding”. In: CoRR abs/1810.04805 (2018). arXiv:
1810.04805.
[64] Anik Dey and Pascale Fung. “A Hindi-English Code-Switching Corpus.” In: LREC. 2014,
pp. 2410–2413.
[65] Lise Dobrin, Peter K Austin, and David Nathan. “Dying to be counted: The commodification of endangered languages in documentary linguistics”. In: Language documentation and description 6 (2009), pp. 37–52.
[66] A Seza Doğruöz, Sunayana Sitaram, Barbara Bullock, and Almeida Jacqueline Toribio. “A Survey
of Code-switching: Linguistic and Social Perspectives for Language Technologies”. In: Proceedings
of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th
International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021,
pp. 1654–1666.
[67] Jessica Dougherty. “Translanguaging in Action: Pedagogy That Elevates.” In: ORTESOL Journal
38 (2021), pp. 19–32.
[68] Ted Dunning. Statistical identification of language. Computing Research Laboratory, New Mexico State University Las Cruces, 1994.
[69] Abteen Ebrahimi, Manuel Mager, Shruti Rijhwani, Enora Rice, Arturo Oncevay, Claudia Baltazar,
María Cortés, Cynthia Montaño, John E. Ortega, Rolando Coto-solano, Hilaria Cruz,
Alexis Palmer, and Katharina Kann. “Findings of the AmericasNLP 2023 Shared Task on Machine
Translation into Indigenous Languages”. In: Proceedings of the Workshop on Natural Language
Processing for Indigenous Languages of the Americas (AmericasNLP). Ed. by Manuel Mager,
Abteen Ebrahimi, Arturo Oncevay, Enora Rice, Shruti Rijhwani, Alexis Palmer, and
Katharina Kann. Toronto, Canada: Association for Computational Linguistics, July 2023,
pp. 206–219. doi: 10.18653/v1/2023.americasnlp-1.23.
[70] European Language Resources Association. LT4All: Language Technologies for All. 2019. url:
https://en.unesco.org/LT4All.
[71] Yujie Feng, Zexin Lu, Bo Liu, Liming Zhan, and Xiao-Ming Wu. “Towards LLM-driven Dialogue
State Tracking”. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language
Processing. 2023, pp. 739–755.
[72] Sarah E Finch and Jinho D Choi. “Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols”. In: Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue. 2020, pp. 236–245.
[73] Alan Firth. “The discursive accomplishment of normality: On ‘lingua franca’ English and conversation analysis”. In: Journal of pragmatics 26.2 (1996), pp. 237–259.
[74] Luke Fryer and Rollo Carpenter. “Bots as language learning tools”. In: Language Learning &
Technology 10.3 (2006), pp. 8–14.
[75] Ofelia García. “Education, multilingualism and translanguaging in the 21st century”. In: Social
justice through multilingual education. Multilingual Matters, 2009, pp. 140–158.
[76] Penelope Gardner-Chloros and Malcolm Edwards. “Assumptions behind grammatical approaches
to code-switching: when the blueprint is a red herring”. In: Transactions of the Philological Society
102.1 (2004), pp. 103–129.
[77] Elodie Gauthier, David Blachon, Laurent Besacier, Guy-Noel Kouarata, Martine Adda-Decker,
Annie Rialland, Gilles Adda, and Grégoire Bachman. “LIG-AIKUMA: A mobile app to collect
parallel speech for under-resourced language studies”. In: Interspeech 2016 (short demo paper).
2016.
[78] Barbara Gawronska and David House. “Information Extraction and text generation of news
reports for a Swedish-English bilingual spoken dialogue system”. In: Fifth International
Conference on Spoken Language Processing. 1998.
[79] Howard Giles. “Accommodation theory: Optimal levels of convergence”. In: Language and social
psychology (1979), pp. 45–65.
[80] Rosario Gingràs. “Problems in the description of Spanish-English intrasentential code-switching”.
In: Southwest areal linguistics (1974), pp. 167–174.
[81] Erving Goffman. “On Face-Work”. In: Interaction ritual: Essays on face-to-face interaction. Aldine,
1967. Chap. 2, pp. 5–46.
[82] Raymond G Gordon Jr. “Ethnologue, languages of the world”. In: http://www.ethnologue.com/
(2005).
[83] Jonathan Gratch, Ning Wang, Jillian Gerten, Edward Fast, and Robin Duffy. “Creating rapport
with virtual agents”. In: Intelligent Virtual Agents: 7th International Conference, IVA 2007 Paris,
France, September 17-19, 2007 Proceedings 7. Springer. 2007, pp. 125–138.
[84] Gregory Grefenstette. “Comparing two language identification schemes”. In: Proceedings of JADT.
Vol. 95. 1995.
[85] Bryan Gregorius and Takeshi Okadome. “Generating Code-Switched Text from Monolingual Text
with Dependency Tree”. In: Proceedings of the 20th Annual Workshop of the Australasian
Language Technology Association. Adelaide, Australia: Australasian Language Technology
Association, Dec. 2022, pp. 90–97.
[86] François Grosjean. “Bilingualism: A short introduction”. In: The psycholinguistics of bilingualism.
Ed. by François Grosjean and Ping Li. Wiley-Blackwell Chichester, 2013. Chap. 1, pp. 1–25.
[87] Deepak Gupta, Asif Ekbal, and Pushpak Bhattacharyya. “A semi-supervised approach to generate
the code-mixed text using pre-trained encoder and transfer learning”. In: Findings of the
Association for Computational Linguistics: EMNLP 2020. 2020, pp. 2267–2280.
[88] Deepak Gupta, Shubham Tripathi, Asif Ekbal, and Pushpak Bhattacharyya. “A hybrid approach
for entity extraction in code-mixed social media data”. In: Money 25.66 (2016).
[89] Marcia Haag and F Wayne Coston. “Early effects of technology on the Oklahoma Choctaw
language community”. In: Language Learning & Technology 6.2 (2002), pp. 70–70.
[90] Marcia Haag and Henry Willis. Choctaw language and culture: Chahta anumpa. Vol. 1. University
of Oklahoma Press, 2001.
[91] Marcia Haag and Henry Willis. Choctaw language and culture: Chahta anumpa Volume 2. Vol. 2.
University of Oklahoma Press, 2007.
[92] Mary R. Haas. “Southeastern Languages”. In: The Languages of Native America: Historical and
Comparative Assessment. Ed. by Lyle Campbell and Marianne Mithun. University of Texas Press,
1979, pp. 299–326.
[93] Darcy Hallett, Michael J Chandler, and Christopher E Lalonde. “Aboriginal language knowledge
and youth suicide”. In: Cognitive development 22.3 (2007), pp. 392–399.
[94] Arno Hartholt, Jonathan Gratch, Lori Weiss, et al. “At the virtual frontier: Introducing
Gunslinger, a multi-character, mixed-reality, story-driven experience”. In: International Workshop
on Intelligent Virtual Agents. Springer. 2009, pp. 500–501.
[95] Michael Haugh. “Face and interaction”. In: Face, communication and social interaction. Ed. by
Michael Haugh and Francesca Bargiela-Chiappini. Equinox London, 2009. Chap. 1, pp. 1–30.
[96] Monica Heller. “The politics of codeswitching and language choice”. In: Journal of Multilingual &
Multicultural Development 13.1-2 (1992), pp. 123–142.
[97] Megan Herrera, Ankit Aich, and Natalie Parde. “TweetTaglish: A Dataset for Investigating
Tagalog-English Code-Switching”. In: Proceedings of the Thirteenth Language Resources and
Evaluation Conference. Marseille, France: European Language Resources Association, June 2022,
pp. 2090–2097.
[98] Ahmad Fathan Hidayatullah, Atika Qazi, Daphne Teck Ching Lai, and Rosyzie Anna Apong. “A
systematic review on language identification of code-mixed text: techniques, data availability,
challenges, and framework development”. In: IEEE access (2022).
[99] Nikolaus P. Himmelmann. “Language Documentation: What is it and what is it good for?” In:
Essentials of Language Documentation. Ed. by Jost Gippert, Nikolaus P. Himmelmann, and
Ulrike Mosel. Berlin: Mouton de Gruyter, 2006. Chap. 1, pp. 1–30.
[100] Sebastian Hobert. “How Are You, Chatbot? Evaluating Chatbots in Educational Settings-Results
of a Literature Review”. In: DELFI 2019-Die 17. Fachtagung Bildungstechnologien, Lecture Notes in
Informatics (2019).
[101] Thomas Holtgraves. “Face, politeness and interpersonal variables: implications for language
production and comprehension”. In: Face, communication and social interaction. Ed. by
Michael Haugh and Francesca Bargiela-Chiappini. Equinox London, 2009. Chap. 10, pp. 192–207.
[102] Weijiao Huang, Khe Foon Hew, and Luke K Fryer. “Chatbots for language learning—Are they
really useful? A systematic review of chatbot-supported language learning”. In: Journal of
Computer Assisted Learning 38.1 (2022), pp. 237–257.
[103] John T. Jensen. Morphology: Word Structure in generative grammar. Vol. 70. John Benjamins
Publishing Company, 1990.
[104] Michael Johnston, Patrick Ehlen, Frederick G Conrad, Michael F Schober, Christopher Antoun,
Stefanie Fail, Andrew Hupp, Lucas Vickers, Huiying Yan, and Chan Zhang. “Spoken dialog
systems for automated survey interviewing”. In: Proceedings of the SIGDIAL 2013 Conference.
2013, pp. 329–333.
[105] Navya Jose, Bharathi Raja Chakravarthi, Shardul Suryawanshi, Elizabeth Sherly, and
John P McCrae. “A survey of current datasets for code-switching research”. In: 2020 6th
international conference on advanced computing and communication systems (ICACCS). IEEE. 2020,
pp. 136–141.
[106] Aravind Joshi. “Processing of sentences with intra-sentential code-switching”. In: Coling 1982:
Proceedings of the Ninth International Conference on Computational Linguistics. 1982.
[107] Daniel Jurafsky and James H. Martin. Speech and Language Processing. Prentice Hall, 2000.
[108] Gemma Karstens-Smith. “B.C. teen creates app to help revive fading Indigenous language”. In:
Toronto Star (Jan. 7, 2018).
[109] Simran Khanuja, Sandipan Dandapat, Anirudh Srinivasan, Sunayana Sitaram, and
Monojit Choudhury. “GLUECoS: An Evaluation Benchmark for Code-Switched NLP”. In:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online:
Association for Computational Linguistics, July 2020, pp. 3575–3585. doi:
10.18653/v1/2020.acl-main.329.
[110] Elizabeth A Kickham. “Purism, Prescriptivism, and privilege: Choctaw language ideologies and
their impact on teaching and learning”. PhD thesis. University of Oklahoma, 2015.
[111] Alistair Knott, Ian Bayard, Samson De Jager, and Nick Wright. “An architecture for bilingual and
bidirectional NLP”. In: Proceedings of the 2nd Australasian Natural Language Processing Workshop
(ANLP 2002). Citeseer. 2002.
[112] Alistair Knott, John Moorfield, Tamsin Meaney, and Lee-Luan Ng. “A human-computer dialogue
system for Māori language learning”. In: EdMedia+ Innovate Learning. Association for the
Advancement of Computing in Education (AACE). 2003, pp. 3336–3343.
[113] Takahiro Kobori, Mikio Nakano, and Tomoaki Nakamura. “Small talk improves user impressions
of interview dialogue systems”. In: Proceedings of the 17th annual meeting of the special interest
group on discourse and dialogue. 2016, pp. 370–380.
[114] András Kornai. “Digital language death”. In: PloS one 8.10 (2013), e77056.
[115] Stephen D Krashen, Michael A Long, and Robin C Scarcella. “Age, rate and eventual attainment
in second language acquisition”. In: TESOL quarterly (1979), pp. 573–582.
[116] Judith F Kroll and Annette MB De Groot. Handbook of bilingualism: Psycholinguistic approaches.
Oxford University Press, 2009.
[117] Wesley Y Leonard. “Framing language reclamation programmes for everybody’s empowerment.”
In: Gender & Language 6.2 (2012).
[118] Anton Leuski and David Traum. “NPCEditor: Creating virtual human dialogue using information
retrieval techniques”. In: Ai Magazine 32.2 (2011), pp. 42–56.
[119] Xuansong Li, Jennifer Tracey, Stephen Grimes, and Stephanie Strassel. “Uzbek-English and
Turkish-English Morpheme Alignment Corpora.” In: LREC. 2016.
[120] Yu Li, Shang Qu, Jili Shen, Shangchao Min, and Zhou Yu. “Curriculum-Driven Edubot: A
Framework for Developing Language Learning Chatbots through Synthesizing Conversational
Data”. In: Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and
Dialogue. Ed. by Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri,
David Howcroft, and Kazunori Komatani. Kyoto, Japan: Association for Computational
Linguistics, Sept. 2024, pp. 400–419. doi: 10.18653/v1/2024.sigdial-1.35.
[121] Patsy M Lightbown and Nina Spada. How Languages are Learned. 4th ed. Oxford Handbooks for
Language Teachers. Oxford University Press, 2013.
[122] Fei Liu, Fuliang Weng, Bingqing Wang, and Yang Liu. “Insertion, deletion, or substitution?
Normalizing text messages without pre-categorization nor supervision”. In: Proceedings of the
49th Annual Meeting of the Association for Computational Linguistics: Human Language
Technologies. 2011, pp. 71–76.
[123] Winnie WM Low and Dan Lu. “Persistent use of mixed code: An exploration of its functions in
Hong Kong schools”. In: International Journal of Bilingual Education and Bilingualism 9.2 (2006),
pp. 181–204.
[124] Chris Lowman, Tangimai Fitzgerald, Patsy Rapira, and Rahera Clark. “First language literacy skill
transfer in a second language learning environment: Strategies for biliteracy”. In: Set 2 (2007),
pp. 24–28.
[125] Gale M Lucas, Jonathan Gratch, Aisha King, and Louis-Philippe Morency. “It’s only a computer:
Virtual humans increase willingness to disclose”. In: Computers in Human Behavior 37 (2014),
pp. 94–100.
[126] Birgit Lugrin, Catherine Pelachaud, and David Traum. The Handbook on Socially Interactive
Agents: 20 Years of Research on Embodied Conversational Agents, Intelligent Virtual Agents, and
Social Robotics Volume 2: Interactivity, Platforms, Application. ACM, 2022.
[127] Megan Lukaniec and Kayla Palakurthy. “Additional language learning in the context of
Indigenous language reclamation”. In: The Routledge handbook of second language acquisition and
sociolinguistics. Routledge, 2022, pp. 341–355.
[128] Dau-Cheng Lyu, Tien-Ping Tan, Eng Siong Chng, and Haizhou Li. “Seame: a mandarin-english
code-switching speech corpus in south-east asia”. In: Eleventh Annual Conference of the
International Speech Communication Association. 2010.
[129] Manuel Mager, Abteen Ebrahimi, Arturo Oncevay, Enora Rice, Shruti Rijhwani, Alexis Palmer,
and Katharina Kann, eds. Proceedings of the Workshop on Natural Language Processing for
Indigenous Languages of the Americas (AmericasNLP). Toronto, Canada: Association for
Computational Linguistics, July 2023.
[130] Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, and Ivan Meza-Ruiz. “Challenges of
language technologies for the indigenous languages of the Americas”. In: Proceedings of the 27th
International Conference on Computational Linguistics. 2018, pp. 55–69.
[131] Leketi Makalela. “Translanguaging as a vehicle for epistemic access: Cases for reading
comprehension and multilingual interactions”. In: Per Linguam: a Journal of Language Learning=
Per Linguam: Tydskrif vir Taalaanleer 31.1 (2015), pp. 15–29.
[132] Thembekile Malaza, Luvuyo Martins, Justus Roux, and Thomas Niesler. “Porting an English
Spoken Dialogue System to Xhosa and Zulu”. In: South African Journal of African Languages 25.2
(2005), pp. 101–110.
[133] Wari Maroengsit, Thanarath Piyakulpinyo, Korawat Phonyiam, Suporn Pongnumkul,
Pimwadee Chaovalit, and Thanaruk Theeramunkong. “A survey on evaluation methods for
chatbots”. In: Proceedings of the 2019 7th International conference on information and education
technology. 2019, pp. 111–119.
[134] S Martinčić-Ipšić, J Žibert, I Ipšić, and F Mihelič. “A bilingual spoken dialog system for Slovenian
and Croatian weather forecast”. In: The IEEE Region 8 EUROCON 2003. Computer as a Tool. Vol. 2.
IEEE. 2003, pp. 140–143.
[135] Deepthi Mave, Suraj Maharjan, and Thamar Solorio. “Language identification and analysis of
code-switched social media text”. In: Proceedings of the third workshop on computational
approaches to linguistic code-switching. 2018, pp. 51–61.
[136] Tracey McHenry. “Words as big as the screen: Native American languages and the Internet”. In:
(2002).
[137] Paul McNamee. “Language identification: a solved problem suitable for undergraduate
instruction”. In: Journal of computing sciences in colleges 20.3 (2005), pp. 94–101.
[138] Helen M Meng, Shuk Fong Chan, Yee Fong Wong, Tien Ying Fung, Wai Ching Tsui, Tin Hang Lo,
Cheong Chat Chan, Ke Chen, Lan Wang, Ting-Yao Wu, et al. “ISIS: A multilingual spoken dialog
system developed with CORBA and KQML agents.” In: INTERSPEECH. 2000, pp. 150–153.
[139] Helen M Meng, Steven Lee, and Carmen Wai. “CU FOREX: a bilingual spoken dialog system for
foreign exchange enquiries”. In: 2000 IEEE International Conference on Acoustics, Speech, and
Signal Processing. Proceedings (Cat. No. 00CH37100). Vol. 2. IEEE. 2000, pp. II1229–II1232.
[140] Renata F. I. Meuter. “Language selection in bilinguals: Mechanisms and processes”. In: Oxford
University Press, 2009. Chap. 17, pp. 349–370.
[141] Patrick Minges. “Beneath the underdog: Race, religion, and the trail of tears”. In: American Indian
Quarterly 25.3 (2001), pp. 453–479.
[142] Rosamond Mitchell, Florence Myles, and Emma Marsden. Second Language Learning Theories.
Routledge, 2013.
[143] Giovanni Molina, Fahad AlGhamdi, Mahmoud Ghoneim, Abdelati Hawwari,
Nicolas Rey-Villamizar, Mona Diab, and Thamar Solorio. “Overview for the Second Shared Task
on Language Identification in Code-Switched Data”. In: Proceedings of the Second Workshop on
Computational Approaches to Code Switching. 2016, pp. 40–49.
[144] Lina Morgado, MTS Oliveira, Ana Paula Afonso, Isabel Cristina Carvalho, and Nathalie Ferret.
“Exploring the role of digital assistants and chatbots as learning companions in open and distance
learning”. In: EDULEARN24 Proceedings. IATED. 2024, pp. 10547–10552.
[145] Christopher Moseley. Atlas of the World’s Languages in Danger. Unesco, 2010.
[146] Quim Motger, Xavier Franch, and Jordi Marco. “Software-Based Dialogue Systems: Survey,
Taxonomy and Challenges”. In: ACM Computing Surveys (CSUR) (2022).
[147] Pieter Muysken. “Code-switching and grammatical theory”. In: One speaker, two languages. Ed. by
Leslie Milroy and Pieter Muysken. University of Cambridge, 1995. Chap. 9, pp. 177–198.
[148] Carol Myers-Scotton. Codes and consequences: Choosing linguistic varieties. Oxford University
Press, 1998.
[149] Carol Myers-Scotton. Duelling languages: Grammatical structure in codeswitching. Oxford
University Press, 1997.
[150] Mark Myslín and Roger Levy. “Code-switching and predictability of meaning in discourse”. In:
Language (2015), pp. 871–905.
[151] Tomoaki Nakamura, Takahiro Kobori, and Mikio Nakano. “Learning Dialogue Strategies for
Interview Dialogue Systems that Can Engage in Small Talk”. In: 9th International Workshop on
Spoken Dialogue System Technology. Springer. 2019, pp. 307–317.
[152] Angela Nazarian, Elnaz Nouri, and David Traum. “Initiative Patterns in Dialogue Genres”. In:
DialWatt—Semdial (2014), p. 229.
[153] M Eleanor Nevins. Lessons from Fort Apache: Beyond language endangerment and maintenance.
Vol. 4. John Wiley & Sons, 2013.
[154] Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, and Erik Cambria. “Recent advances in deep
learning based dialogue systems: A systematic survey”. In: Artificial intelligence review 56.4
(2023), pp. 3055–3155.
[155] Thurston Dale Nicklas. Reference grammar to the Choctaw Language. Choctaw Bilingual
Education Program, Southeastern Oklahoma State University, 1979.
[156] Thurston Dale Nicklas. “The Elements of Choctaw”. Ph.D. dissertation. University of Michigan,
1972.
[157] Sebastian Nordhoff, Siri Tuttle, and Olga Lovick. “The Alaskan Athabascan Grammar Database”.
In: Proceedings of the Tenth International Conference on Language Resources and Evaluation
(LREC’16). 2016, pp. 3286–3290.
[158] Elnaz Nouri and David Traum. “Initiative taking in negotiation”. In: Proceedings of the 15th Annual
Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL). 2014, pp. 186–193.
[159] David Novick and Iván Gris. “Building rapport between human and ECA: A pilot study”. In:
Human-Computer Interaction. Advanced Interaction Modalities and Techniques: 16th International
Conference, HCI International 2014, Heraklion, Crete, Greece, June 22-27, 2014, Proceedings, Part II
16. Springer. 2014, pp. 472–480.
[160] Justyna Olko and Julia Sallabank. Revitalizing endangered languages: A practical guide. Cambridge
University Press, 2021.
[161] Lourdes Ortega. Understanding second language acquisition. Routledge, 2014.
[162] Richard T Oster, Angela Grier, Rick Lightning, Maria J Mayan, and Ellen L Toth. “Cultural
continuity, traditional Indigenous language, and diabetes in Alberta First Nations: a mixed
methods study”. In: International journal for equity in health 13 (2014), pp. 1–11.
[163] Alexandros Papangelis, Nicole Chartier, Pankaj Rajan, Julia Hirschberg, and Dilek Hakkani-Tur.
“Understanding how people rate their conversations”. In: Conversational AI for Natural
Human-Centric Interaction: 12th International Workshop on Spoken Dialogue System Technology,
IWSDS 2021, Singapore. Springer. 2022, pp. 179–189.
[164] Tanmay Parekh, Emily Ahn, Yulia Tsvetkov, and Alan W Black. “Understanding Linguistic
Accommodation in Code-Switched Human-Machine Dialogues”. In: Proceedings of the 24th
Conference on Computational Natural Language Learning. 2020, pp. 565–577.
[165] Dwija Parikh and Thamar Solorio. “Normalization and back-transliteration for code-switched
data”. In: Proceedings of the Fifth Workshop on Computational Approaches to Linguistic
Code-Switching. 2021, pp. 119–124.
[166] Ronakkumar Patel, Anton Leuski, and David Traum. “Dealing with out of domain questions in
virtual characters”. In: Intelligent Virtual Agents: 6th International Conference, IVA 2006, Marina
Del Rey, CA, USA, August 21-23, 2006. Proceedings 6. Springer. 2006, pp. 121–131.
[167] Parth Patwa, Gustavo Aguilar, Sudipta Kar, Suraj Pandey, Srinivas Pykl, Björn Gambäck,
Tanmoy Chakraborty, Thamar Solorio, and Amitava Das. “SemEval-2020 Task 9: Overview of
Sentiment Analysis of Code-Mixed Tweets”. In: Proceedings of the Fourteenth Workshop on
Semantic Evaluation. 2020, pp. 774–790.
[168] Aneta Pavlenko and Adrian Blackledge. “Introduction: New Theoretical Approaches to the Study
of Negotiation of Identities in Multilingual Contexts”. In: Negotiation of Identities in Multilingual
Contexts. Ed. by Aneta Pavlenko and Adrian Blackledge. Multilingual Matters LTD, 2004, pp. 1–33.
[169] Lizette Peter, Tracy Hirata-Edds, Durbin Feeling, Wyman Kirk, and Philip T Duncan. “The
Cherokee nation immersion school as a translanguaging space”. In: Journal of American Indian
Education 56.1 (2017), pp. 5–31.
[170] Arja Piirainen-Marsh. “Face in second language conversation”. In: JYU dissertations (1995).
[171] Ingrid Piller. “Identity constructions in multilingual advertising”. In: Language in society 30.2
(2001), pp. 153–186.
[172] Aidan Pine and Mark Turin. “Language Revitalization”. In: Oxford Research Encyclopedia of
Linguistics. 2017.
[173] Claudio S Pinhanez, Paulo Cavalin, Marisa Vasconcelos, and Julio Nogima. “Balancing social
impact, opportunities, and ethical constraints of using ai in the documentation and vitalization of
indigenous languages”. In: Proceedings of the Thirty-Second International Joint Conference on
Artificial Intelligence. 2023, pp. 6174–6182.
[174] Shana Poplack. “Sometimes I’ll start a sentence in Spanish y termino en español: Toward a
typology of code-switching”. In: The bilingualism reader 18.2 (2000), pp. 221–256.
[175] Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel,
Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, et al. “The Kaldi speech
recognition toolkit”. In: IEEE 2011 workshop on automatic speech recognition and understanding.
CONF. IEEE Signal Processing Society. 2011.
[176] Adithya Pratapa, Gayatri Bhat, Monojit Choudhury, Sunayana Sitaram, Sandipan Dandapat, and
Kalika Bali. “Language modeling for code-mixing: The role of linguistic theory based synthetic
data”. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics
(Volume 1: Long Papers). 2018, pp. 1543–1553.
[177] Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, and Min Li.
“End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions”. In:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023,
pp. 5925–5941.
[178] Vikram Ramanarayanan and David Suendermann-Oeft. “Jee haan, I’d like both, por favor:
Elicitation of a Code-Switched Corpus of Hindi-English and Spanish-English Human-Machine
Dialog.” In: INTERSPEECH. 2017, pp. 47–51.
[179] Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Anna Korhonen, and Ivan Vulić.
“Crossing the conversational chasm: A primer on multilingual task-oriented dialogue systems”.
In: arXiv preprint arXiv:2104.08570 (2021).
[180] Jon Reyhner. “American Indian language policy and school success”. In: The journal of educational
issues of language minority students 12.3 (1993), pp. 35–59.
[181] Ruth Rouvier. Language Documentation, Revitalization and Reclamation: Supporting Young
Learners and Their Communities. White Paper. May 2017, p. 31.
[182] Koustav Rudra, Shruti Rijhwani, Raya Begum, Kalika Bali, Monojit Choudhury, and
Niloy Ganguly. “Understanding language preference for expression of opinion and sentiment:
What do hindi-english speakers do on twitter?” In: Proceedings of the 2016 conference on empirical
methods in natural language processing. 2016, pp. 1131–1141.
[183] Michael Running Wolf, Noelani Arista, Caroline Running Wolf, Caleb Moses, and Joel Davison.
“How to build-your-own practical AI tools for language maintenance”. In: (2021).
[184] Maarten van Schagen and Alistair Knott. “Tauira: A tool for acquiring unknown words in a
dialogue context”. In: Proceedings of the Australasian Language Technology Workshop 2004. 2004,
pp. 131–138.
[185] Ana I Schwartz and Judith F Kroll. “Bilingual lexical activation in sentence context”. In: Journal of
memory and language 55.2 (2006), pp. 197–212.
[186] Corinne A Seals and Vincent Olsen-Reeder. “Translanguaging in conjunction with language
revitalization”. In: System 92 (2020), p. 102277.
[187] Frank Seifart, Nicholas Evans, Harald Hammarström, and Stephen C Levinson. “Language
documentation twenty-five years on”. In: Language 94.4 (2018), e324–e345.
[188] Iulian Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. “Building
end-to-end dialogue systems using generative hierarchical neural network models”. In:
Proceedings of the AAAI conference on artificial intelligence. Vol. 30. 1. 2016.
[189] Emad Al-Shawakfa and Martha Evens. “Bilingual Dialogs with a Network Operating System”. In:
Proceedings of Midwest Artificial Intelligence and Cognitive Science Conference. Vol. 99. 1999,
pp. 7–12.
[190] Bayan Abu Shawar and Eric Atwell. “Different measurement metrics to evaluate a chatbot
system”. In: Proceedings of the workshop on bridging the gap: Academic and industrial research in
dialog technologies. 2007, pp. 89–96.
[191] Bayan Abu Shawar and Eric Atwell. “Fostering language learner autonomy through adaptive
conversation tutors”. In: Proceedings of the The fourth Corpus Linguistics conference. 2007.
[192] Amber Jean Shilling. “Exploring the use of mobile language learning technology as a means for
urban Indigenous youth to connect to identity and culture”. PhD thesis. University of British
Columbia, 2020.
[193] Gary F. Simons and Charles D. Fennig, eds. Ethnologue: Languages of the World. Twenty-first.
Dallas, Texas: SIL International, 2018. url: https://www.ethnologue.com/language/cho.
[194] Gary F. Simons and Charles D. Fennig, eds. Ethnologue: Languages of the World. Twenty-first.
Dallas, Texas: SIL International, 2018. url: https://www.ethnologue.com/country/US/.
[195] Rajat Singh, Nurendra Choudhary, and Manish Shrivastava. “Automatic normalization of word
variations in code-mixed social media text”. In: Computational Linguistics and Intelligent Text
Processing: 19th International Conference, CICLing 2018, Hanoi, Vietnam, March 18–24, 2018,
Revised Selected Papers, Part I. Springer. 2023, pp. 371–381.
[196] Sunayana Sitaram, Khyathi Raghavi Chandu, Sai Krishna Rallabandi, and Alan W Black. “A
survey of code-switched speech and language processing”. In: arXiv preprint arXiv:1904.00784
(2019).
[197] Sunit Sivasankaran, Brij Mohan Lal Srivastava, Sunayana Sitaram, Kalika Bali, and
Monojit Choudhury. “Phone Merging For Code-Switched Speech Recognition”. In: Proceedings of
the Third Workshop on Computational Approaches to Linguistic Code-Switching. Melbourne,
Australia: Association for Computational Linguistics, July 2018, pp. 11–19. doi:
10.18653/v1/W18-3202.
[198] Thamar Solorio and Yang Liu. “Learning to predict code-switching points”. In: Proceedings of the
2008 Conference on Empirical Methods in Natural Language Processing. 2008, pp. 973–981.
[199] Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Margaret Mitchell,
Jian-Yun Nie, Jianfeng Gao, and William B Dolan. “A Neural Network Approach to
Context-Sensitive Generation of Conversational Responses”. In: Proceedings of the 2015
Conference of the North American Chapter of the Association for Computational Linguistics: Human
Language Technologies. 2015, pp. 196–205.
[200] Helen Spencer-Oatey. “Face, identity and interactional goals”. In: Face, communication and social
interaction. Ed. by Michael Haugh and Francesca Bargiela-Chiappini. Equinox London, 2009.
Chap. 7, pp. 137–154.
[201] Helen Spencer-Oatey. “Theories of identity and the analysis of face”. In: Journal of pragmatics
39.4 (2007), pp. 639–656.
[202] Ganji Sreeram and Rohit Sinha. “Exploration of end-to-end framework for code-switching speech
recognition task: Challenges and enhancements”. In: IEEE Access 8 (2020), pp. 68146–68157.
[203] Tonya N Stebbins, Kris Eira, and Vicki L Couzens. Living languages and new approaches to
language revitalisation research. Routledge, 2017.
[204] John Reed Swanton. Source material for the social and ceremonial life of the Choctaw Indians.
Vol. 103. US Government Printing Office, 1931.
[205] William Swartout, David Traum, Ron Artstein, Dan Noren, Paul Debevec, Kerry Bronnenkant,
Josh Williams, Anton Leuski, Shrikanth Narayanan, Diane Piepol, et al. “Ada and Grace: Toward
realistic and engaging virtual museum guides”. In: International Conference on Intelligent Virtual
Agents. Springer. 2010, pp. 286–300.
[206] S Thara and Prabaharan Poornachandran. “Code-mixing: A brief survey”. In: 2018 International
conference on advances in computing, communications and informatics (ICACCI). IEEE. 2018,
pp. 2382–2388.
[207] The Choctaw Nation of Oklahoma Dictionary Committee. Chahta Anumpa Tosholi Himona: New
Choctaw Dictionary. 1st. Choctaw Print Services, 2016.
[208] Stella Ting-Toomey. “Facework collision in intercultural communication”. In: Face,
communication and social interaction. Ed. by Michael Haugh and Francesca Bargiela-Chiappini.
Equinox London, 2009. Chap. 12, pp. 227–249.
[209] David Traum, Andrew Jones, Kia Hays, Heather Maio, Oleg Alexander, Ron Artstein,
Paul Debevec, Alesia Gainer, Kallirroi Georgila, Kathleen Haase, et al. “New Dimensions in
Testimony: Digitally preserving a Holocaust survivor’s interactive storytelling”. In: International
Conference on Interactive Digital Storytelling. Springer. 2015, pp. 269–281.
[210] Trung Ngo Trong, Kristiina Jokinen, and Ville Hautamäki. “Enabling spoken dialogue systems for
low-resourced languages – End-to-end dialect recognition for North Sami”. In: 9th International
Workshop on Spoken Dialogue System Technology. Springer. 2019, pp. 221–235.
[211] G Richard Tucker. “A global perspective on bilingualism and bilingual education”. In: Georgetown
University Round Table on Languages and Linguistics 1999 (2001), p. 332.
[212] Charles H Ulrich. “The glottal stop in Western Muskogean”. In: International journal of American
linguistics 59.4 (1993), pp. 430–441.
[213] Andre Valente, W Lewis Johnson, and Hannes Högni Vilhjálmsson. “The Tactical Language and
Culture Training System: A Demonstration.” In: AAAI. 2006, pp. 1955–1957.
[214] Mahesh Vanjani, Jamison Posey, and Milam Aiken. “An Evaluation of a multilingual chatbot”. In:
Issues in Information Systems 20.1 (2019), pp. 134–143.
[215] David Vilares, Miguel A Alonso, and Carlos Gómez-Rodríguez. “Sentiment analysis on
monolingual, multilingual and code-switching twitter corpora”. In: Proceedings of the 6th
workshop on computational approaches to subjectivity, sentiment and social media analysis. 2015,
pp. 2–8.
[216] Ngoc Thang Vu, Heike Adel, and Tanja Schultz. “An investigation of code-switching attitude
dependent language modeling”. In: Statistical Language and Speech Processing: First International
Conference, SLSP 2013, Tarragona, Spain, July 29-31, 2013. Proceedings 1. Springer. 2013,
pp. 297–308.
[217] Ngoc Thang Vu, Dau-Cheng Lyu, Jochen Weiner, Dominic Telaar, Tim Schlippe, Fabian Blaicher,
Eng-Siong Chng, Tanja Schultz, and Haizhou Li. “A rst speech recognition system for
Mandarin-English code-switch conversational speech”. In: 2012 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP). IEEE. 2012, pp. 4889–4892.
[218] Ning Wang and Jonathan Gratch. “Rapport and facial expression”. In: 2009 3rd International
Conference on Affective Computing and Intelligent Interaction and Workshops. IEEE. 2009, pp. 1–6.
[219] Ning Wang, W Lewis Johnson, Richard E Mayer, Paola Rizzo, Erin Shaw, and Heather Collins.
“The politeness effect: Pedagogical agents and learning outcomes”. In: International journal of
human-computer studies 66.2 (2008), pp. 98–112.
[220] Ben Watkins. Complete Choctaw Definer: English with Choctaw Definition. JW Baldwin, 1892.
[221] Jochen Weiner, Ngoc Thang Vu, Dominic Telaar, Florian Metze, Tanja Schultz, Dau-Cheng Lyu,
Eng-Siong Chng, and Haizhou Li. “Integration of language identication into a recognition
system for spoken conversations containing code-switches”. In: Spoken Language Technologies for
Under-Resourced Languages. 2012.
[222] Douglas H Whalen, Margaret Moss, and Daryl Baldwin. “Healing through language: Positive
physical health effects of indigenous language use”. In: F1000Research 5.852 (2016), p. 852.
[223] Frederick White. “Rethinking Native American language revitalization”. In: American Indian
Quarterly 30.1/2 (2006), pp. 91–109.
[224] Graham Wilcock. “Bilingual Japanese/English robot dialogues with knowledge graphs and
conversational AI”. In: The 37th Annual Conference of the Japanese Society for AI. 2023.
[225] Graham Wilcock, Kristiina Jokinen, and Seiichi Yamamoto. “What topic do you want to hear
about?: A bilingual talking robot using English and Japanese Wikipedias”. In: COLING. 2016.
[226] Bronwyn T Williams. “Multilingual literacy strategies in online worlds”. In: JAC (2009),
pp. 255–259.
[227] Jason D Williams, Antoine Raux, and Matthew Henderson. “The dialog state tracking challenge
series: A review”. In: Dialogue & Discourse 7.3 (2016), pp. 4–33.
[228] Robert S Williams. “Referential tracking in Oklahoma Choctaw: Language obsolescence and
attrition”. In: Anthropological linguistics (1999), pp. 54–74.
[229] Genta Indra Winata, Alham Fikri Aji, Zheng-Xin Yong, and Thamar Solorio. “The Decades
Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges”.
In: arXiv preprint arXiv:2212.09660 (2022).
[230] Genta Indra Winata, Samuel Cahyawijaya, Zhaojiang Lin, Zihan Liu, Peng Xu, and Pascale Fung.
“Meta-Transfer Learning for Code-Switched Speech Recognition”. In: Proceedings of the 58th
Annual Meeting of the Association for Computational Linguistics. 2020, pp. 3770–3776.
[231] Allen Wright. Chahta Leksikon: A Choctaw in English Definition for the Choctaw Academies and
Schools. First. St. Louis: The Presbyterian Publishing Company, 1880.
[232] Odilia Yim and Richard Clément. “Acculturation and attitudes toward code-switching: A
bidimensional framework”. In: International Journal of Bilingualism 25.5 (2021), pp. 1369–1388.
[233] Ken York and J. Robert Scott. Bilingual Education for Choctaws of Mississippi. Annual Evaluation
Report FY 75–76. Mississippi Band of Choctaw Indians, 1976. url:
https://eric.ed.gov/?id=ED137007.
[234] Qi Zhang, Huan Chen, and Xuanjing Huang. “Chinese-English mixed text normalization”. In:
Proceedings of the 7th ACM international conference on Web search and data mining. 2014,
pp. 433–442.
[235] Shiyue Zhang, Ben Frey, and Mohit Bansal. “How can NLP Help Revitalize Endangered
Languages? A Case Study and Roadmap for the Cherokee Language”. In: Proceedings of the 60th
Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022,
pp. 1529–1541.
[236] Ran Zhao, Oscar J Romero, and Alex Rudnicky. “SOGO: a social intelligent negotiation dialogue
system”. In: Proceedings of the 18th International Conference on intelligent virtual agents. 2018,
pp. 239–246.
Appendices
Participant | Pre-test questions attempted | Pre-test questions correct | Post-test questions attempted | Post-test questions correct | Change attempted | Change correct
Monolingual
1m 10 7.5 11 9.5 1 0
2m 15 13.5 15 13.5 0 0
3m 12 8.5 12 8.5 0 0
4m 8 8 10 10 2 2
5m 12 10 15 14 3 4
6m 15 11.5 15 12.5 0 1
7m 10 7.5 10 7.5 0 0
8m 9 7 11 8.5 2 1.5
10m 13 12 15 14.5 2 2.5
11m 4 3 6 5 2 2
12m 11 8.5 12 9.5 1 1
Code-switching
1c 7 5.5 8 6.5 1 1
2c 6 6 9 8 3 2
3c 13 11 14 12 1 1
4c 9 6.5 9 6.5 0 0
5c 8 7 10 9.5 2 2.5
6c 11 7.5 12 9 1 1.5
7c 7 4 7 4.5 0 0.5
8c 8 6 13 10.5 5 4.5
9c 7 5.5 7 5.5 0 0
10c 7 7 7 7 0 0
11c 9 7 11 9 2 2
Table 6.2: Results of the language test. Raw scores are given in columns 2–5, with the change
given in the final two columns on the right.
Participant | Pre-test questions attempted | Pre-test questions correct | Post-test questions attempted | Post-test questions correct | Change attempted | Change correct
Monolingual
1m 2 1 2 2 0 1
2m 3 2.5 3 2.5 0 0
3m 2 0.5 2 0.5 0 0
4m 2 2 2 2 0 0
5m 2 1.5 3 2.5 1 1
6m 3 1.5 3 1.5 0 0
7m 3 1.5 3 1.5 0 0
8m 3 1 3 1 0 0
10m 3 2.5 3 2.5 0 0
11m 1 0 1 0 0 0
12m 3 2 3 2.5 0 0.5
Code-switching
1c 1 0.5 1 0.5 0 0
2c 1 1 2 1 1 0
3c 2 0 2 0 0 0
4c 2 1 2 1 0 0
5c 2 1.5 2 2 0 0.5
6c 3 0.5 3 1 0 0.5
7c 3 1.5 3 2 0 0.5
8c 3 1 3 1.5 0 0.5
9c 2 1 2 1 0 0
10c 1 1 1 1 0 0
11c 3 1.5 3 1.5 0 0
Table 6.3: Results of the language test for the grammar questions. Raw scores are given in
columns 2–5, with the change given in the final two columns on the right.
Prompt | CSW: Total / Avg / Num res. | Mono: Total / Avg / Num res. | L1: Total / Avg / Num res. | L2: Total / Avg / Num res.
1 0:07:14 0:00:36 12 0:12:23 0:00:53 14 0:04:50 0:00:22 13 0:14:47 0:01:08 13
2 0:10:37 0:00:58 11 0:15:59 0:01:20 12 0:11:17 0:00:52 13 0:15:19 0:01:32 10
3 0:06:38 0:00:36 11 0:12:25 0:01:02 12 0:10:41 0:00:53 12 0:08:22 0:00:46 11
4 0:08:57 0:01:00 9 0:09:43 0:00:49 12 0:07:16 0:00:34 13 0:11:24 0:01:25 8
5 0:10:58 0:01:06 10 0:08:01 0:00:48 10 0:08:00 0:00:40 12 0:10:59 0:01:22 8
6 0:06:41 0:00:45 9 0:04:19 0:00:32 8 0:04:04 0:00:22 11 0:06:56 0:01:09 6
7 0:11:19 0:01:25 8 0:05:55 0:00:44 8 0:06:46 0:00:37 11 0:10:28 0:02:06 5
8 0:07:36 0:01:16 6 0:02:13 0:00:27 5 0:04:13 0:00:28 9 0:05:36 0:02:48 2
9 0:04:59 0:01:00 5 0:03:13 0:00:32 6 0:04:37 0:00:28 9 0:03:35 0:03:35 1
10 0:03:10 0:00:38 5 0:00:22 0:00:11 2 0:01:19 0:00:13 6 0:02:13 0:02:13 1
11 0:00:34 0:00:17 2 0:00:20 0:00:20 1 0:00:54 0:00:18 3 0 0 0
12 0:00:50 0:00:12 4 0 0 0 0:00:28 0:00:09 3 0:00:22 0:00:22 1
13 0:06:01 0:01:12 5 0 0 0 0:00:33 0:00:16 2 0:05:28 0:01:49 3
14 0:05:14 0:01:45 3 0 0 0 0:00:52 0:00:26 2 0:04:22 0:04:22 1
15 0:04:00 0:01:20 3 0 0 0 0:00:22 0:00:11 2 0:03:38 0:03:38 1
16 0:03:35 0:01:12 3 0 0 0 0:00:27 0:00:14 2 0:03:08 0:03:08 1
17 0:03:10 0:01:03 3 0 0 0 0:00:42 0:00:21 2 0:02:28 0:02:28 1
18 0:02:41 0:02:41 1 0 0 0 0 0 0 0:02:41 0:02:41 1
19 0:04:01 0:04:01 1 0 0 0 0 0 0 0:04:01 0:04:01 1
20 0:02:05 0:02:05 1 0 0 0 0 0 0 0:02:05 0:02:05 1
21 0:02:03 0:02:03 1 0 0 0 0 0 0 0:02:03 0:02:03 1
22 0:03:39 0:03:39 1 0 0 0 0 0 0 0:03:39 0:03:39 1
Table 6.4: Total duration, average duration, and number of responses per condition (code-switching and
monolingual) and group (L1 and L2).
Figure 6.1: Masheli CNO IRB approval letter
Figure 6.2: DAPEL CNO IRB approval letter page 2
Figure 6.3: Masheli USC IRB approval letter
Figure 6.4: DAPEL CNO IRB approval letter page 1
Figure 6.5: DAPEL CNO IRB approval letter page 2
Figure 6.6: DAPEL USC IRB approval letter
Abstract
This dissertation explores the development and application of bilingual dialogue systems, focusing specifically on systems that support English and Choctaw, an endangered American Indigenous language. Bilingual dialogue systems are critical in facilitating more natural and inclusive interactions for the many bilingual users worldwide, yet current systems often fail to accommodate linguistic features of bilingualism, such as code-switching.
The dissertation investigates dialogue systems that manage unbalanced bilingualism and appropriate code-switching, improving user experience and system performance. I explore research questions such as whether code-switching leads to higher rapport, greater learning gains, or enhanced interactions for collecting endangered-language audio data. Additionally, I address the sociocultural and linguistic challenges of developing conversational agents for endangered Indigenous languages.
The contributions of this dissertation include, first, the development of two major bilingual applications. Both applications meaningfully add to the documentation of the Choctaw language and provide new avenues for language revitalization and preservation. The results of experiments with these applications demonstrate that bilingual dialogue systems serve some user populations better than monolingual systems. The research highlights the potential for bilingual dialogue systems to be used in language learning and preservation efforts for endangered Indigenous languages. Finally, this work introduces several Choctaw language resources and language-based technologies: a multimodal corpus, the first automatic speech recognition system for Choctaw, novel bilingual Choctaw-English corpora of fluent and learning speakers, and dictionaries. This dissertation demonstrates a path for revitalizing other low-resource and Indigenous languages through computational methods.
Asset Metadata
Creator: Brixey, Jacqueline (author)
Core Title: Code-switching dialogue systems: an investigation into how systems can support code-switching and when they should, with analysis of two Choctaw-English applications
School: Viterbi School of Engineering
Degree: Doctor of Philosophy
Degree Program: Computer Science
Degree Conferral Date: 2024-12
Publication Date: 12/23/2024
Defense Date: 11/25/2024
Publisher: Los Angeles, California (original); University of Southern California (original); University of Southern California. Libraries (digital)
Tag: American Indigenous languages, bilingualism, code-switching, dialogue system, low resource languages, OAI-PMH Harvest
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Traum, David (committee chair)
Creator Email: brixey@usc.edu, jacbrixey@usc.edu
Unique Identifier: UC11399EZL5
Identifier: etd-BrixeyJacq-13711.pdf (filename)
Legacy Identifier: etd-BrixeyJacq-13711
Document Type: Dissertation
Rights: Brixey, Jacqueline
Internet Media Type: application/pdf
Type: texts
Source: 20241223-usctheses-batch-1230 (batch); University of Southern California (contributing entity); University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu