Copyright 2019 Binh Ngo
VIETNAMESE PRONOUNS IN DISCOURSE
by
Binh Ngo
A Dissertation Submitted to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(LINGUISTICS)
August 2019
Acknowledgement
As the African proverb says, “It takes a village to raise a child”. In my case, one may say “It takes
two countries to raise a Vietnamese linguist”. It is not an exaggeration considering the amount of
support I have received during my years in graduate school from the many amazing people in my
life, in the U.S. as well as in Vietnam. Thank you all for always being there for me during this
seemingly never-ending roller-coaster ride called the Ph.D.
I would like to start by saying how grateful I am to my two dedicated advisors, Andrew
Simpson and Elsi Kaiser, for their unwavering support, insightful guidance, immense patience and
kindness throughout this program. Thank you for indulging my wish to be a syntactician as well
as a psycholinguist, both of which took many hours of your time. You have taught me to be not
only a better researcher and teacher but also a better person. Thank you for everything!
I would also like to thank my committee members: Toby Mintz for answering each and every
question of mine as if it were the most interesting question of them all and Jerry Hobbs for his
generosity and for making me realize the power of the fundamental questions. I would also like to
thank my professor, Audrey Li, for her caring and her brilliant comments whenever we talk.
I would not have made it through this program without the love and support from my friends
in the department, outside of the department, as well as those living all the way across the ocean.
This is teamwork! To my linguistics people, Ana, Arunima, and Monica, you have seen me
through the best and the worst. Thank you for always standing by my side! I would also like to
thank my dear friend Marcia for showing me the true meaning of “Fight on”, and Ryan for the
many long conversations late into the night which helped me keep on going. A special shout-out
to my hero, Việt Ngân, for helping me with the ‘dreadful’ data collection process in Vietnam and
for sending me your best students. You played a key role in making the experiments in this
dissertation happen. I am forever in your debt.
I would like to show my utmost gratitude to Miss Được and Mr. Greg who are not only my
English teachers but also my mentors. You have a huge influence on my life.
My biggest thanks go to my parents who have worked tirelessly and sacrificed everything they
have for my educational adventures, first in Vietnam and later in the U.S. It is my dream to keep
on learning and you have always backed me up on it. Whether they are English, French, biology
or chemistry classes, or whatever ‘linguistics’ and recently, ‘data camp’ could be, you always
encourage me to do my best. I am very fortunate to have you as my parents.
Finally, I would like to thank all family and friends who not only were so willing to participate
in my experiments but also helped me spread the word. Your kindness and patience are beyond
my imagination.
Table of Contents
1. Introduction
2. Why pronouns?
3. Structural and discourse factors
4. Modality
5. Implicit causality and the Subject vs. Object bias in pronoun processing
6. Aims of the dissertation

1. Introduction
2. Experiment 1 – Narratives
3. Results
4. Discussion

1. Introduction
2. Experiment 2 – Written sentence completion
3. Experiment 3 – Spoken sentence completion
4. Comparison of Experiments 2 and 3: Effects of modality
5. General discussion

1. Introduction
2. Experiment 4 – English and Vietnamese implicit causality verbs
3. Results
4. Discussion

1. Introduction
2. Experiment 5 – Subject vs. Object bias
3. Experiment 6 – Subject preference
4. General Discussion

1. Effects of structural and discourse factors on Vietnamese null vs. overt pronoun use
2. Subject vs. Object bias during online pronoun processing
Abstract
In everyday communication, language users are often confronted by the presence of multiple
competing linguistic choices. Referential form use (e.g. she, Mary, that girl), for example, is a
puzzle that has attracted much attention from both linguists and psychologists. In ‘Mary talked to
Sally because she was a friendly person’, the pronoun ‘she’ is ambiguous between ‘Mary’ and
‘Sally’. Why does the speaker choose to use ‘she’ instead of an unambiguous form (e.g. ‘Mary’,
‘Sally’)? How does the listener recognize the speaker’s intention despite the ambiguity? These
questions are further complicated in a language like Vietnamese, in which pronouns are not just
function words like English ‘he/she’ but are derived from a complex kinship system. In my
dissertation, I investigate speakers’ choice of referential form in Vietnamese focusing on pronouns.
Through a series of experiments, I probe a range of structural and discourse factors which may
influence the comprehension as well as the production of Vietnamese pronouns. In sum, these
studies aim to broaden our understanding of the impact of universal and language-specific features
on referential form choice in communication.
To provide a comprehensive picture of the behavior of Vietnamese pronouns, given their
crosslinguistically unique features, in Chapter 2 I conducted a narrative experiment to examine the
overall distribution of Vietnamese referential forms, particularly null pronouns (i.e. empty/zero
anaphora) and overt pronouns (e.g. kinship term pronouns). I incorporated structural factors such
as grammatical roles and grammatical parallelism into the analysis to obtain a detailed
characterization of Vietnamese pronoun production. I found that both grammatical roles and
grammatical parallelism have a strong influence on Vietnamese speakers’ choice of referential
form. When both the referent and the referring expression are in the grammatical subject position
(i.e. subject parallelism), speakers mostly use pronouns (null and overt pronouns). In contrast, the
lack of parallelism results in mostly NPs. Interestingly, hints of a parallelism effect are also found
in object parallelism, where pronouns are used more than in the non-parallel cases. These results
highlight the importance of considering the grammatical roles of not only the antecedent but also
the anaphoric expression (e.g. pronouns) in investigating referential form choice.
Vietnamese speakers in the narrative experiment (Chapter 2) use both null and overt pronouns
equally. This finding poses a challenge to the salience-hierarchical approach (e.g. Ariel, 1990;
Givón, 1983), which suggests that null pronouns are often used to refer to highly salient referents
(i.e. those in subject position) while overt pronouns are used for less salient referents (e.g. those in
object position). Chapter 3 of my dissertation examined whether there is a division of labor
between null and overt pronouns in Vietnamese. In the sentence completion studies of this chapter,
I probed topicality, a discourse factor, while keeping grammatical roles constant. I found that
topicality is a crucial factor influencing Vietnamese speakers’ choice of null and overt pronouns.
Specifically, Vietnamese speakers mostly use null pronouns rather than overt pronouns when
referring back to the topicalized referents in discourse. However, this null vs. overt pronoun
distinction is only observed in production but not in comprehension. In sum, the results from
Chapters 2 and 3 support the form-specific multiple-constraints approach (Kaiser & Trueswell,
2008) since Vietnamese null and overt pronoun behaviors do not exhibit a clear hierarchy.
One intriguing finding in Chapter 3 is Vietnamese speakers’ tendency to refer back to the
objects of sentences despite the fact that previous results (Chapter 2) show a strong subject
preference. Furthermore, this object bias also challenges the well-known crosslinguistic subject
bias (Chafe, 1976). In Chapters 4-5, I investigate whether the object bias is present during
real-time processing or only emerges in off-line tasks (Chapter 3). One crucial piece
of information required for these experiments is verbs’ implicit causality. Thus, in Chapter 4, I
conducted a large-scale norming study of 162 Vietnamese implicit causality verbs since there was
no prior verb database in Vietnamese. In addition to the verb norming, I also compared Vietnamese
verbs to their English equivalents. The comparison shows a stronger object bias in Vietnamese.
Keeping this in mind, in Chapter 5, I implemented the self-paced reading paradigm to examine the
subject vs. object bias in Vietnamese as well as age cue effects, specific to Vietnamese kinship
term pronouns, during online processing. I found that Vietnamese speakers have a tendency to
initially associate the pronouns with preceding subject antecedents. This finding is in line with the
subject preference found in the narrative study (Chapter 2). With regard to the object bias found
in the sentence completion studies (Chapter 3) and the verb study (Chapter 4), these results suggest
that verbs’ object-bias information is incorporated toward the end of sentence processing rather
than during the processing of the pronoun itself, providing evidence for the clausal integration
account (e.g. Garnham et al., 1996; Stewart et al., 2000). Regarding the use of age cues embedded
in the pronouns, I found that Vietnamese speakers rapidly use age cues to successfully resolve the
pronoun in the presence of verb bias.
Taken together, this dissertation shows that Vietnamese speakers’ use of referential forms,
particularly null pronouns and kinship term overt pronouns, is influenced by structural and
discourse factors. Nevertheless, how the effects of these factors manifest varies depending on
the language. The results of the current work are in line with Kaiser & Trueswell’s (2008)
form-specific multiple-constraints approach since Vietnamese null and overt pronouns lack a clear division of
labor. These findings open doors to further investigation of Vietnamese pronouns, notably the null
vs. overt pronoun choice, the underlying factors driving the object bias as well as the role of kinship
features in pronoun resolution.
List of Tables
Table 1. Four configurations based on grammatical roles in preceding and current clause.
Table 2. Average length of the narratives by word count, utterance count, and average number of words per utterance among participants.
Table 3. Overall percentages of null pronouns, overt pronouns and NPs used in written and spoken narratives.
Table 4. Percentage of each configuration in spoken narratives.
Table 5. Percentage of each configuration in written task.
Table 6. Proportion of referential forms in the no-prompt conditions in Experiment 2 (written task).
Table 7. Types of verb biases and their proportions in the active no-prompt condition.
Table 8. Proportion of referential forms in the no-prompt conditions in Experiment 3 (spoken task).
List of Figures
Figure 1. Percentages of referential forms in four grammatical configurations in spoken task.
Figure 2. Percentages of referential forms in four grammatical configurations in written task.
Figure 3. Proportion of the four grammatical configurations in written and spoken narratives.
Figure 4. Percentages of referential forms in the four grammatical configurations in both written and spoken narratives.
Figure 5. Percentage of subject and object referents in active and passive conditions in Experiment 2 (written task).
Figure 6. Referential biases & forms in (active vs. passive) no-prompt conditions in Experiment 2 (written task).
Figure 7. Referential biases based on verb biases in null and overt prompt conditions.
Figure 8. Referential biases among strong biased verbs.
Figure 9. Percentage of subject and object referents in active and passive conditions in Experiment 3 (spoken task).
Figure 10. Referential biases & forms in (active vs. passive) no-prompt conditions in Experiment 3 (spoken task).
Figure 11. Interpretation of null and overt pronouns in active-prompt (null vs. overt) conditions (Written & Spoken).
Figure 12. Interpretation of null and overt pronouns in passive-prompt (null vs. overt) conditions (Written & Spoken).
Figure 13. Choice of referential form in active-no-prompt conditions (Written & Spoken).
Figure 14. Choice of referential form in passive-no-prompt conditions (Written & Spoken).
Figure 15. Percentages of subject responses for Vietnamese and English verbs by verb class.
Figure 16. Correlations between English and Vietnamese verbs by verb class based on the percentage of subject responses.
Figure 17. How age cues and verb bias influence pronoun assignment.
Figure 18. Disambiguating noun – First half.
Figure 19. Disambiguating noun – Last half.
Figure 20. Pronoun – First half.
Figure 21. Pronoun – Last half.
Figure 22. The use of x’s and reading times at disambiguating noun and at the pronoun.
Introduction
1. Introduction
In everyday communication, language users are often confronted with choices during both
language comprehension and language production. One example is how we choose to refer to an
entity in discourse (e.g. he/John/the man), also known as referential form choice. This is a topic
that has attracted much attention from both linguists and psychologists. In ‘John talked to Bill
because he was a friendly person’, the pronoun ‘he’ is ambiguous between John and Bill. Why
does the speaker use ‘he’ instead of an unambiguous form (e.g. ‘John/Bill’)? How does the listener
identify the intended referent despite the ambiguity? These questions are further complicated in a
language like Vietnamese in which pronouns are not function words like English ‘he/she’, but are
instead derived from a complex kinship system and provide information about the age and/or social
status of the referent. In addition to kinship pronouns, Vietnamese also allows null pronouns (i.e.
empty pronouns/zero anaphora). In this dissertation, I investigate speakers’ choice of referential
form in Vietnamese with a focus on pronouns. Specifically, I probe the extent to which
grammatical (i.e. grammatical roles and grammatical parallelism) and discourse (i.e. topicality)
factors can influence the comprehension and production of different types of pronouns in
Vietnamese. I start out by examining how grammatical roles and grammatical parallelism affect
the production of Vietnamese referential forms. Next, I investigate the interplay between
grammatical roles and topicality and their effects on the null vs. overt pronoun choice in
Vietnamese, in comprehension as well as in production. Finally, I focus on the effects of
grammatical roles on pronoun assignment during online processing.
2. Why pronouns?
Pronouns, as under-specified forms, require context for interpretation. In languages with more than
one kind of pronoun, it is often argued that null (i.e. empty pronouns/zero anaphora) and
phonologically overt pronouns have different sensitivities to grammatical and discourse factors.
From a theoretical view, syntactic rules such as the Binding principles (e.g. Chomsky 1981,
Reinhart 1983) can restrict the set of possible referents for pronoun interpretation. However, the
Binding Theory is not intended to explain why, in a cross-sentential context, comprehenders would
associate a null pronoun with a certain referent and an overt pronoun with another, nor why
speakers would choose to produce a null pronoun over an overt pronoun in a particular context.
Studies examining pronoun choice have suggested that a number of factors may influence
how pronouns are interpreted or produced. These factors include the general notion of
prominence/salience/accessibility (e.g. Givón, 1983; Ariel, 1990; Gundel, Hedberg, & Zacharski,
1993) and more specific notions such as the grammatical and linear position of potential
antecedents (e.g. Chafe, 1976; Crawley and Stevenson, 1990; Crawley, Stevenson and Kleinman,
1994; Carminati, 2002), structural parallelism (e.g. Smyth, 1994; Chambers and Smyth, 1998),
thematic preference (e.g. Stevenson, Crawley and Kleinman, 1994), and discourse coherence (e.g.
Hobbs, 1979; Kehler et al., 2008). Much of the prior psycholinguistic work has focused on English,
which has a morphologically fairly simple pronominal system. Some experimental work has also looked
at languages with a wider range of anaphoric forms (e.g. German: Bosch et al. 2003; Finnish:
Kaiser & Trueswell, 2008; Estonian: Kaiser, 2010; Chinese: Simpson, Wu, & Li, 2016; Yang,
Gordon, Hendrick, & Hue, 2003; Japanese: Ueno & Kehler, 2016). However, empirical and
typological questions remain open regarding the referential properties of a more complex
pronominal system such as Vietnamese.
With regard to null and overt pronouns, the patterns of use seem to vary across languages,
and this may be due to how null pronouns are licensed in each language. For
Italian and Spanish, for example, the claim is that null pronouns are licensed via verbal agreement
(e.g. Rizzi, 1982; Borer, 1989). In contrast, null pronouns in languages such as Chinese and
Japanese are licensed via discourse (e.g. Huang, 1984). This yields two classes: (i) agreement
pro-drop languages (e.g. Italian, Spanish) and (ii) discourse pro-drop languages (e.g. Chinese,
Japanese). The different mechanisms of licensing null pronouns are also reflected in how null
pronouns are distributed in these languages. For instance, null pronouns in agreement pro-drop
languages typically only occur in the grammatical subject position, whereas null pronouns in
discourse pro-drop languages can occur in both subject and object positions. Consequently, null
vs. overt pronoun choice in agreement pro-drop languages and discourse pro-drop languages may
exhibit varying degrees of sensitivity to different factors.
3. Structural and discourse factors
This section provides an overview of a range of factors that may influence how speakers use
different types of referential forms (e.g. null pronouns, overt pronouns, NPs) and their relevance
for the experiments in this dissertation. I first introduce two structural factors, (i) grammatical
roles and (ii) grammatical parallelism, and their effects on referential form choice. I then discuss
how grammatical factors can also interact with topicality, a discourse factor, to guide pronoun
resolution.
Previous work on pronoun resolution has shown that the grammatical role of the antecedent
is a crucial factor in pronoun interpretation (e.g. Chafe, 1976; Brennan, Friedman, & Pollard,
1987; Crawley & Stevenson, 1990; Gordon, Grosz, & Gilliom, 1993). Specifically regarding the
division of labor between null vs. overt pronouns, Carminati (2002) proposed the Position of
Antecedent Hypothesis (PAH) based on data from Italian: According to the PAH, null pronouns
tend to refer back to antecedents in spec-IP position (typically subjects) and overt pronouns to
antecedents lower in the syntactic tree (typically non-subjects or post-verbal subjects in Italian)
(see also Fedele, 2016). As illustrated in example (1) below, the null pronoun tends to be
interpreted as referring to the subject Mario while the overt pronoun lui is often interpreted as the
object Giovanni.
(1) Carminati (2002)
Mario_i ha telefonato a Giovanni_j, quando Ø_i/?j / lui_?i/j aveva appena finito di mangiare
Mario_i has telephoned to Giovanni_j, when Ø_i/?j / he_?i/j had just finished of eating
‘Mario_i has telephoned Giovanni_j when Ø_i/?j / he_?i/j had just finished eating.’
Null and overt pronouns in Spanish also share these tendencies (e.g. Alonso-Ovalle, Fernández-
Solera, Frazier, & Clifton, 2002). However, both Spanish and Italian are agreement pro-drop
languages. Turning to discourse pro-drop languages, prior work shows that the division between
null and overt pronouns with respect to antecedents’ grammatical position is much less clear. It
has been found that in Chinese and Japanese (e.g. Chinese: Simpson et al., 2016; Yang, Gordon,
Hendrick, & Wu, 1999; Japanese: Ueno & Kehler, 2016), both null and overt pronouns are
frequently interpreted as referring back to subject antecedents. Accordingly, there are open
questions regarding how null and overt pronouns behave across languages. For instance, is there a
division of labor between null and overt pronouns in all languages? If there is, which factors
modulate this?
It should be noted that prior work on grammatical roles (e.g. Alonso-Ovalle et al., 2002;
Carminati, 2002; Simpson et al., 2016; Ueno & Kehler, 2016) focuses only on the grammatical roles
of antecedents and on the interpretation of pronouns in the grammatical subject position. Meanwhile,
other work on parallel function points out that the grammatical role of the pronoun itself matters
as well as that of the antecedent: “A pronoun with two or more
grammatically and pragmatically possible antecedents in a preceding clause will be interpreted as
coreferential with the candidate that has the same grammatical role” (Smyth, 1994). For instance,
in example (2) below the pronoun he in subject position (2a) is resolved to the subject antecedent
William and the object pronoun him in (2b) refers to the object antecedent Oliver (e.g. Sheldon,
1974; Smyth, 1994; Chambers & Smyth, 1998). For ease of exposition, I will refer to this effect
as the grammatical parallelism effect, in contrast to the grammatical role effect, which involves only
antecedents’ grammatical roles.
(2) (Smyth, 1994)
a. William hit Oliver and he slapped Rod. (he = William)
b. William hit Oliver and Rod slapped him. (him = Oliver)
Although the parallelism strategy has been criticized for its inconsistent effect on pronoun
assignment (e.g. Crawley, Stevenson, & Kleinman, 1990), other studies have shown that in order
for parallelism to take effect, the clauses/sentences must be entirely parallel in their structures
(Smyth, 1994). Keeping that in mind, it should also be pointed out that parallelism has mostly been
studied in English, a language that does not (normally) allow null pronouns. To my knowledge,
work on parallelism has not examined languages that allow both null and overt pronouns in both
subject and object position. Crucially, given that null pronouns in Italian and Spanish occur in
subject position but not in object position, these kinds of languages do not allow for a full
investigation of whether the difference between null/overt pronouns in object position is
influenced by parallelism. However, as we will see, Vietnamese allows null and overt pronouns in
both subject and object positions. Therefore, in addition to being typologically interesting for its
kinship term pronouns, Vietnamese is also a good testing ground for questions regarding parallelism
effects on null and overt pronouns.
In fact, the grammatical subject position of a sentence is not a purely grammatical notion. Givón
(1983) points out that the grammatical subject also functions as the topic of that sentence (see also
Chafe, 1976). Therefore, considering grammatical roles alone may conflate the effects of
grammatical factors with those of discourse factors, in this case, topicality. Rohde and Kehler (2014) tested
this hypothesis with English pronouns. To tease apart subjecthood and topicality effects, one needs
to be able to manipulate referents’ degrees of topicality while maintaining the same grammatical
subject role. The active vs. passive manipulation in English is a good tool for this purpose. Previous
work has shown that English passivization promotes the subject of a sentence to a topicalized
position (Davison, 1984; Lambrecht, 1994). Therefore, in comparison to the subject of an active
sentence, the subject of a passive sentence is much more likely to be the topic of discourse.
Using passivization, Rohde and Kehler (2014) found that pronouns are used significantly more often to
refer back to the subject of passives than to the subject of actives. This indicates that grammatical
role and topicality have different effects on pronoun resolution.
Further evidence of topicality effects on pronoun interpretation comes from Spanish, in which
topicality can shift the referential bias of overt pronouns: Even though overt pronouns in Spanish
tend to refer back to the object, speakers mostly interpret overt pronouns as referring to the
topicalized subject rather than to the non-topicalized subject (Alonso-Ovalle et al., 2002). What
about null and overt pronouns in discourse pro-drop languages? Considering grammatical roles of
the antecedent alone, it seems that they do not differ much from each other. Ueno and Kehler
(2016) used topic marking to examine whether topicality affects null and overt pronoun use in
Japanese. No significant effects were found with this type of manipulation. The patterns of null
and overt pronoun interpretations do not change whether the subject is subject-marked or topic-
marked. The question remains whether topicality plays a role in speakers’ choice of null vs. overt
pronouns in discourse pro-drop languages.
4. Modality
In Section 3, I discussed a number of structural and discourse factors which may have an
impact on how pronouns are used. Another factor which has not been fully explored is the effect
of modality (i.e. spoken vs. written language use) on null vs. overt pronoun comprehension and
production. There have been numerous studies looking at the choice between NP and pronoun use
in English (see Chafe & Tannen (1987) for an overview). These studies reveal that written and
spoken language differ: Spoken language has more pronouns and written language has more NPs.
However, English does not have null pronouns as in the case of agreement pro-drop languages
(e.g. Italian, Spanish) and discourse pro-drop languages (e.g. Chinese, Japanese). In fact, prior
work relating to modality effects on null and overt pronouns is quite limited. It has been reported
that in Chinese, null pronouns are used more in written than in spoken narratives (Li
and Thompson, 1979; Christensen, 2000). In contrast to Chinese, null pronouns in Japanese are the
default pronominal form used in both spoken and written language, and Japanese overt pronouns
rarely occur (Clancy, 1980, 1982). These results call for further crosslinguistic investigation of
the effect of modality on the null vs. overt pronoun choice.
However, the sets of written and spoken samples used for comparison in many of these works
vary in their topics and genres. Therefore, the results do not provide a direct comparison between the two
modalities. Regarding null and overt pronouns in discourse pro-drop languages, two studies in
Chinese and Japanese show different results. In Chinese narratives, both null and overt pronouns
are frequently used (Christensen, 2000). However, in Japanese narratives, null pronouns are the
most frequent form while overt pronouns rarely occur (Clancy, 1980, 1982). The different patterns
of overt pronoun occurrences in Chinese and Japanese may be due to how overt pronouns are
derived in the two languages: While overt pronouns in Chinese are function words similar to
English pronouns (Li and Thompson, 1981), Japanese overt pronouns are derived from nouns
(Kuroda, 1965; Hinds, 1975, 1983).
Furthermore, it is important to note that in prior work about NP vs. null vs. overt pronoun use,
the results only report the total numbers of tokens (i.e. how many NPs/null pronouns/overt
pronouns are used in total) without taking into account the environment in which these forms occur (e.g.
whether they refer to preceding subjects or objects, and whether they occur in subject or object
position). Thus, it is unclear whether modality has an influence on the mechanisms licensing the
use of these referential forms. In addition, the Vietnamese pronoun system is typologically different
from those in Chinese and Japanese: Vietnamese overt pronouns are derived from kinship terms.
A closer look at Vietnamese is needed to gain insights into the crosslinguistic patterns of null and
overt pronouns.
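The difference between reporting overall token counts and reporting counts by grammatical environment can be made concrete with a small sketch. The Python example below is only an illustration: the token list, the role labels and the function names are hypothetical and are not drawn from the studies cited above or from the data of this dissertation.

```python
from collections import Counter

# Hypothetical hand-coded tokens, invented for illustration only.
# Each token is (referential form, antecedent role, role of the referring expression).
tokens = [
    ("null", "subject", "subject"),
    ("null", "subject", "subject"),
    ("null", "object", "object"),
    ("overt", "subject", "object"),
    ("overt", "object", "object"),
    ("NP", "object", "subject"),
]


def overall_counts(tokens):
    """Total number of tokens per form, ignoring the grammatical environment."""
    return Counter(form for form, _, _ in tokens)


def counts_by_environment(tokens):
    """Number of tokens per (antecedent role, anaphor role, form) cell."""
    return Counter((ante, ana, form) for form, ante, ana in tokens)


def parallel_share(tokens, form):
    """Proportion of tokens of one form whose role matches the antecedent's role."""
    roles = [(ante, ana) for f, ante, ana in tokens if f == form]
    if not roles:
        return None  # the form does not occur in the sample
    return sum(ante == ana for ante, ana in roles) / len(roles)


if __name__ == "__main__":
    print(overall_counts(tokens))
    print(counts_by_environment(tokens))
    print(parallel_share(tokens, "overt"))
```

Two samples can have identical overall counts while differing in every cell of the environment table, which is why totals alone cannot show whether modality affects the mechanisms licensing each form.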
5. Implicit causality and the Subject vs. Object bias in pronoun processing
As will become clear in Chapters 2 and 3 of this dissertation, Vietnamese appears to show
contradictory effects of grammatical roles on pronoun use: While the results of Chapter
2 (Experiment 1) show that Vietnamese speakers tend to use both null and overt pronouns for
subject antecedents (i.e. subject bias), the results of Chapter 3 (Experiments 2 and 3) reveal that
pronouns are frequently assigned to object antecedents (i.e. object bias). These results naturally
prompt further questions about the nature of these biases.
In addition to grammatical and discourse factors, one other factor that may influence pronoun
resolution is verb semantics, specifically the semantics of the verb whose arguments the potential
antecedents are. Verb bias, in particular implicit causality bias, has been shown in a large body of work
to have a very powerful effect on speakers’ interpretation of ambiguous pronouns (e.g. Caramazza,
Grober, Garvey, & Yates, 1977; Garvey & Caramazza, 1974). Consider example (3) below.
The two sentences are identical except for their verbs, annoy in (3a) and admire in (3b). Let us
now consider the interpretation of the pronoun he in both sentences. In (3a), English speakers tend
to think that he refers to John whereas in (3b), he is interpreted as Bill.
(3) a. John annoyed Bill because he was loud. (he = John)
b. John admired Bill because he was kind. (he = Bill)
In psycholinguistic work, implicit causality verbs are often labeled based on the grammatical role
of the cause (i.e. the antecedent that the pronoun he refers to). For example, verbs such as annoy
in (3a) are called subject-biased verbs since the cause, John, is in the subject position, and those such as
admire in (3b) are called object-biased verbs. Prior work on implicit causality has proposed
different taxonomies in an attempt to categorize verbs with respect to their biases (see Rudolph
& Försterling, 1997, for a full review). For example, the widely known Revised Action-State
Distinction (Brown & Fish, 1983b; Au, 1986), illustrated in example (4) below, claims that
Agent-Patient (4a) and Stimulus-Experiencer (4c) verbs tend to be subject-biased whereas
Agent-Evocator (4b) and Experiencer-Stimulus (4d) verbs are often object-biased.
(4) a. Agent-Patient
Sally hit Mary because she… (subject bias: she = Sally)
b. Agent-Evocator
Sally punished Mary because she… (object bias: she = Mary)
c. Stimulus-Experiencer
Sally impressed Mary because she… (subject bias: she = Sally)
d. Experiencer-Stimulus
Sally liked Mary because she… (object bias: she = Mary)
Many researchers have also investigated verbs' implicit causality biases in languages other than
English. Among the languages investigated are German (Fiedler, 1978; Rudolph, 1997), Spanish
(Goikoetxea, Pascual, & Acha, 2008), Dutch (Semin & Marsman, 1994), and Italian (Manetti &
De Grada, 1991). As seen from the list of languages, most of them are Indo-European with very
few exceptions (e.g. Brown & Fish, 1983a; Hartshorne, Sudo, & Uruwashi, 2013). Thus, it may
not be surprising that they have similar patterns of verb bias.
Several prior studies have tended to focus on a small number of verbs (but for important large-
scale studies, see Ferstl, Garnham, and Manouilidou (2011); Hartshorne and Snedeker (2013) on
English; Goikoetxea et al., 2008 on Spanish). For non-Indo-European languages, there appear to
be no publicly available large-scale norms about verb biases.
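One simple way to summarize norming data of this kind is a per-verb bias score. The sketch below is an assumption made for illustration, not the measure used in the studies cited above or in Experiment 4: it takes hypothetical coded continuations, in which each response records whether the pronoun was resolved to the subject or the object, and computes (subject count minus object count) divided by the total, so that +1 is fully subject-biased, -1 is fully object-biased and 0 is unbiased. The verbs and responses are invented.

```python
from collections import defaultdict

# Hypothetical coded continuations, invented for illustration only.
# Each response is (verb, referent the pronoun was resolved to).
responses = [
    ("annoy", "subject"), ("annoy", "subject"), ("annoy", "object"),
    ("admire", "object"), ("admire", "object"), ("admire", "object"),
    ("see", "subject"), ("see", "object"),
]


def bias_scores(responses):
    """Return a score in [-1, 1] per verb: (subject - object) / total responses."""
    counts = defaultdict(lambda: {"subject": 0, "object": 0})
    for verb, referent in responses:
        counts[verb][referent] += 1
    scores = {}
    for verb, c in counts.items():
        total = c["subject"] + c["object"]
        scores[verb] = (c["subject"] - c["object"]) / total
    return scores


def weakly_biased(scores, threshold=0.2):
    """Verbs whose absolute score is at most the (arbitrary) threshold."""
    return sorted(v for v, s in scores.items() if abs(s) <= threshold)


if __name__ == "__main__":
    scores = bias_scores(responses)
    print(scores)
    print(weakly_biased(scores))
```

A filter like `weakly_biased` mirrors the logic of selecting verbs that do not strongly favor either antecedent; the threshold value here is arbitrary.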
This leads to the concern that perhaps the object bias found in Chapter 3 is a result of verb
choice. To investigate whether referents in the grammatical subject position have a more prominent
role than those in the object position, verbs that do not strongly bias pronoun interpretation toward
either subject or object antecedents were chosen. However, they were chosen based on an English
verb bias study and then translated into Vietnamese. Thus, the object bias observed may stem
from a lack of equivalence between the two languages. As becomes clear in the verb study
(Experiment 4) in Chapter 4, this is not the case.
The issue remains whether the grammatical subject role in Vietnamese plays a prominent role
or whether it is the object bias that is present during pronoun processing. In the two experiments in Chapter
5 (Experiments 5 and 6), I address this question by presenting real-time information about how
Vietnamese speakers resolve pronouns during online processing.
6. Aims of the dissertation
This dissertation aims to further our understanding about referential form use in comprehension
and production with a focus on null and overt pronouns. Vietnamese is used as a case study for
two reasons. First, Vietnamese null and overt pronouns have a broad syntactic distribution and
reference function: They can refer to both subjects and objects, and can occur in both subject and
object positions. These features allow us to examine effects such as grammatical parallelism,
which has not previously been explored for null and overt pronouns. Second, Vietnamese has a
typologically distinct pronoun system from the languages that have previously been studied. As
has been shown for Chinese and Japanese, even though they belong to the same group of discourse
pro-drop languages, null and overt pronouns in these languages have different patterns of use.
Therefore, Vietnamese with its kinship pronoun system can contribute to our understanding of
pronouns crosslinguistically.
The studies in this dissertation examine a number of factors which may influence Vietnamese
null and overt pronoun use.
(i) Grammatical factors: Grammatical roles and grammatical parallelism have been shown to
affect pronoun use and interpretation. Thus, in my investigation, I use these factors as a tool to
examine how Vietnamese null and overt pronouns are used as referential devices and their
distribution in language production (Experiment 1, Chapter 2).
(ii) Discourse factors: Pronouns' preference for subjects may be related to a preference for
topics. To better understand the relation between grammatical role effects and topicality-based
effects, I examine how null and overt pronouns are interpreted and produced when speakers refer
to the topicalized referent vs. non-topicalized referent in discourse while keeping referents’
grammatical roles constant (Experiments 2 and 3, Chapter 3).
(iii) Implicit causality and the Subject vs. Object bias during online pronoun processing:
Experiments 2 and 3 in Chapter 3 yield a potentially surprising result, namely an object preference
in pronoun assignment. To gain additional insights into how real-time processing of pronouns in
Vietnamese works -- in particular, when does a subject preference or object preference emerge
during online processing -- in Experiment 4 (Chapter 4) and Experiments 5 and 6 (Chapter 5) I
take a closer look at how pronouns are being interpreted during online processing.
The structure of this dissertation is as follows: In Chapter 2, I present a narrative
study looking at Vietnamese speakers’ choice of referential form (Experiment 1). Two factors,
grammatical roles and grammatical parallelism, are taken into account to help uncover the
underlying mechanisms licensing referential form choice. To do this, I use a narrative task and
examine the relationship between the grammatical role of the antecedent and of the referring
expression, and the form in which the referring expression occurs.
Chapter 3 aims to untangle the effects of grammatical roles and topicality and how these
factors may affect Vietnamese null and overt pronoun use. Experiments 2 and 3 in this chapter
tackle this question from both the comprehension and production perspectives.
In the next two chapters, Chapters 4 and 5, I focus on the unexpected object preference found
in Chapter 3. In Chapter 4, I address the concerns regarding the choice of verbs for the experimental
items in Chapter 3 by conducting an implicit causality study of 147 Vietnamese verbs (Experiment
4). Following the verb norming in Chapter 4, Chapter 5 examines whether this object bias is indeed
present during online processing of pronouns (Experiments 5 and 6).
Finally, Chapter 6 provides a summary of the findings of the experiments in this dissertation,
their implications and future directions.
Structural factors and referential form choice in narratives
1. Introduction
It is widely agreed that entities in a discourse vary in their salience/prominence: At a particular
point in time, some entities are more salient or prominent in the discourse participants’ mental
models than other entities. Prior work suggests that the salience level of entities influences
speakers’ referential form choice as well as comprehenders' interpretation of referential forms
(Ariel, 1990; Givón, 1983; Gundel et al., 1993). It is frequently suggested that more reduced
referential forms tend to be used for highly salient referents while fuller referential forms tend to
be used for less salient referents. Thus, if a language has both null and overt pronominal forms,
null pronouns are often used to refer to highly salient referents while overt pronouns are used to
refer to less salient referents, as shown in (5).
(5) Most salient referents                              Less salient referents
    Null pronouns      >      Overt pronouns      >      NPs
The claim that there exists a relationship between the salience of the referent and the type of
referring expressions leads to the question of what influences how salient referents are. Prior work
indicates that referents’ salience¹ can be influenced by a number of factors, including the
grammatical role of the antecedent (e.g. subject vs. object) (Chafe, 1976; Crawley & Stevenson,
1990) and whether the pronoun and its antecedent occupy parallel grammatical roles (i.e. both
elements are in subject position or in object position) (Smyth, 1994; Chambers and Smyth, 1998).
The work I report in this chapter builds on the insight that referential form use depends not only on the
grammatical role of the antecedent but also on the grammatical role of the anaphoric form. As I
discuss below, theories of referential form cannot focus solely on a notion of salience derived from
the prior realization of the antecedent but also have to take into account the argument structure of
the anaphor-containing sentence.
Many of the fundamental studies on the grammatical parallelism effect have focused on
English and English overt pronouns (e.g. Smyth, 1994; Stevenson et al., 1995; Chambers and
Smyth, 1998). Consequently, even though it is widely known that null and overt pronouns across
languages have different properties (e.g. Spanish: Alonso-Ovalle et al. 2002; Italian: Carminati,
2002; Japanese: Clancy, 1980; Chinese: Li and Thompson, 1979), to the best of my knowledge,
little is known about the extent to which grammatical parallelism can affect the comprehension
and production of null and overt pronouns.
¹ Terms such as ‘referent’ and ‘referring expression’ are standardly used in psycholinguistic work on pronoun
interpretation and production. In this dissertation, I use the term ‘referent’ to mean not only the entity in the world that
a certain linguistic element picks out, but also, more relevantly for our present purposes, the linguistic realization of that
entity. Thus, I will often say, for the sake of brevity, that a particular referring expression refers to a preceding subject
or object, even though this is not strictly speaking correct, since the referring expression refers not to the grammatical
role of subject or object but to the entity which occurs/is linguistically realized in subject/object position.
In this chapter, I report a narrative study on Vietnamese, a language that allows null and overt
pronouns in both subject and object position (Experiment 1). Experiment 1 examines whether and
how Vietnamese speakers’ choice of referential forms, particularly null and overt pronouns, is
influenced by (i) the grammatical role of the antecedent and (ii) the grammatical role of the
referring expression, in particular whether they have the same grammatical role (grammatical
parallelism) or not. Thus, the first aim of this work is to shed light on referential form choice in a
parallelism) or not. Thus, the first aim of this work is to shed light on referential form choice in a
context where the alternation between null and overt pronominal forms has not previously been
systematically considered. My second aim is to investigate potential differences between spoken
and written language in the use of referential forms, i.e. possible effects of language ‘modality’,
the physical means used to express language (e.g. speech, writing or gestural communication).
Prior work suggests that spoken and written language differ with regard to the kinds of referential
forms that are regularly produced in these different modalities (see Chafe and Tannen (1987) for
an overview). Generally speaking, it has been suggested that pronouns are more common in spoken
than in written language, and that full NPs are used more frequently in written language (e.g. Biber
et al., 1999; Christensen, 2000). However, these studies mostly discuss overall counts and many
of them contain data from different genres with various levels of formality. Thus, it is difficult to
know whether the differences are due to modality per se or to other properties that have been
correlated with modality in these prior studies.
Thus, in the current study, I carefully consider the effect of modality (written vs. spoken) on
the choice of referential form, while keeping the genre and level of formality constant by using
explicit instructions. This allows us to test for potential differences between written and spoken
language more directly.
The structure of this chapter is as follows: In the remainder of section 1, I discuss previous
findings on the effects of grammatical role, grammatical parallelism and modality on referential
form interpretation and production. I also discuss the nature of the Vietnamese pronominal system
and compare it to other pro-drop languages and other pronominal systems. In Section 2, I describe
the spoken and written narrative tasks that I used to elicit data as well as how the data was
analyzed. Section 3 presents the results from the written and spoken tasks and provides a
comparison between the two types of data. Section 4 discusses the implications of our findings,
compares them to findings from other languages, and outlines directions for future work.
1.1. GRAMMATICAL ROLES AND GRAMMATICAL PARALLELISM
One well-known factor that influences referents’ salience is grammatical role (e.g. being realized
in subject or object position) (Chafe, 1976; Brennan, Friedman, & Pollard, 1987; Crawley &
Stevenson, 1990; see also Gordon, Grosz, & Gilliom, 1993; Gordon & Chan, 1995; Perfetti &
Goldman, 1974). To identify salient referents, prior work has often used pronoun interpretation or
subsequent mention likelihood as a diagnostic. In one of the earliest works on this topic, Chafe
(1976) presented a number of observations and argued that subjects indeed have a special
prominent cognitive status - for example, that knowledge about subjects is more readily accessible
than knowledge about other parts of sentences. The special status of subjects has been confirmed
in many subsequent studies.
Recent work by Fukumura and van Gompel (2010) used sentence-continuation tasks to
investigate whether and to what extent the production of pronouns in English is influenced by
semantic biases (induced by verbs and connectives such as ‘because’) and the grammatical roles
of potential antecedents (subject vs. object). Fukumura and van Gompel found that participants
produced more pronouns (relative to names) when referring to the preceding subject than to the
preceding object, regardless of the semantic biases of verbs and connectives. These results add to
the body of literature showing that grammatical subjects are privileged as antecedents of
subsequent pronouns.
The effect of grammatical roles is also reflected in parallelism effects (Smyth, 1994;
Stevenson et al., 1995; Chambers and Smyth, 1998). Chambers and Smyth (1998) found that
pronouns, at least in English, tend to prefer antecedents in matching grammatical positions:
Pronouns in subject position tend to be interpreted as referring back to preceding subjects, and
pronouns in object position tend to be interpreted as referring back to preceding objects. However,
to the best of my knowledge, work on grammatical parallelism effects has focused on English
(overt) pronouns and has not systematically looked at the null vs. overt pronoun distinction.
Although the null vs. overt distinction has not been investigated systematically in parallelism
configurations, a large body of prior work has investigated the referential properties of null and
overt pronouns in subject position. Before continuing on to review this prior work, it is important
to note that, broadly speaking, languages with both null and overt pronouns come in two types:
‘agreement pro-drop’ languages, which have rich subject-verb agreement, and
‘discourse pro-drop’ languages (Barbosa 2011, Neeleman and Szendrői 2007), which typically
lack verb agreement and permit pro in subject and object positions subject to discourse
recoverability. Prior work on pronoun interpretation in agreement pro-drop languages such as
Italian and Spanish has led researchers to conclude that the antecedent’s grammatical role is crucial
for the use and interpretation of null and overt pronouns in subject position: while null pronouns
tend to refer back to preceding subjects, overt pronouns tend to refer to preceding objects (e.g.
Alonso-Ovalle et al., 2002; Carminati, 2002; but see Fedele, 2016, for Italian data that point to a
more nuanced picture).
1.2. DISCOURSE PRO-DROP LANGUAGES
I focus on referential forms in a discourse pro-drop language, Vietnamese, for several reasons.²
First, discourse pro-drop languages typically have null and overt pronouns occurring in both
subject and object position. This distributional property allows us to expand the investigation
beyond overt pronouns and subject pronouns. In addition, the availability of null and overt
pronouns in both subject and object position means that I can investigate the full range of parallel
and non-parallel configurations (as explained above in Section 1.1) with both null and overt
pronouns. This would not be possible if we were to investigate agreement pro-drop languages, which have
strict constraints on the use of null pronouns in object position. Thus, discourse pro-drop
languages are an ideal tool to explore the interaction between pronominal form, the grammatical
role of the antecedent (subject or object), and crucially, also the grammatical role of the referring
expression (subject or object).
Previous work suggests that the null vs. overt pronoun distinction in discourse pro-drop
languages appears to be less clear than in agreement pro-drop languages. Several studies looking at Chinese
pronouns in narratives suggest that the choice between null and overt pronouns appears to be in
free variation and reflects speakers’ personal interpretations of the discourse context (Li and
Thompson, 1979) as well as speakers’ personal preferences (Christensen, 2000). However, while
the results in Li and Thompson (1979) suggest that null pronouns seem to be the common, default
form, other work (Chen, 1986; Christensen, 2000) found that both null and overt pronouns are
used frequently in narratives. These studies indicate that speakers’ choice and discourse structure
have the main influence on the use of null and overt pronouns in Chinese.
² Vietnamese is naturally classified as a discourse pro-drop language as it is not a language in which verbal agreement
licenses the occurrence of pro, unlike Spanish and Italian. Vietnamese patterns like other discourse pro-drop
languages such as Chinese and Japanese, where pro is essentially available whenever its content can be recovered
from the ongoing discourse context (Barbosa 2011).
In contrast, other work shows that null and overt pronouns in Chinese are strongly influenced
by syntactic structure. In terms of comprehension, Yang et al. (1999, 2003) conducted a number
of self-paced reading studies and found that grammatical role (subject vs. object) has a strong
effect on how rapidly pronouns are read in Chinese. For example, in a reading-time study reported
in Yang et al. (1999), participants slowed down when repeated names rather than null or overt
pronouns were used to refer to prior-mentioned referents – but only when the repeated names
occurred in subject position (see also Gordon et al. (1993) on the repeated name penalty in
English). In fact, Yang et al.’s (1999) follow-up study found that slow-downs only occurred when
a repeated name in subject position was used to refer back to a preceding subject (subject
parallelism). In addition, Yang et al. (2003) found that participants read subject pronouns faster when
they referred back to the preceding subject than to the preceding object, even in contexts that
favored object interpretations. As a whole, these findings show that Chinese null and overt
pronouns in subject position have an interpretation preference toward antecedents in subject
position.
In related work, Simpson et al. (2016) examined the comprehension of Chinese overt pronouns
in subject position through a series of sentence completion experiments. These studies mostly
focused on transfer-of-possession verbs (e.g. send, give, kick). Simpson et al. (2016) found that
participants tend to interpret overt subject pronouns in the continuations as referring back to the
preceding subject. Although this tendency can be modulated by other discourse factors such as the
nature of the event (e.g. perfective vs. imperfective) and the type of coherence relation (e.g.
Explanation vs. Occasion), evidence for a subject preference remains strong. Put together, the
results in Yang et al. (1999, 2003) and Simpson et al. (2016) emphasize the importance of the
antecedent’s grammatical role. However, these studies did not explore the production aspect of
null and overt pronouns. Furthermore, they have only focused on subject pronouns and have not
yet examined object pronouns. Thus, further work is needed to obtain a more complete picture.
Related work has been conducted in Japanese, another discourse pro-drop language. Hinds
(1975, 1983) and Clancy (1980, 1982) investigated Japanese null and overt subject pronouns by
means of questionnaires, conversations and narratives and found that the use of overt pronouns in
Japanese is very restricted, compared to null pronouns. One potential explanation for this
restriction lies in the fact that Japanese overt pronouns are historically derived from nouns and
exhibit semantic and syntactic behaviors similar to nouns (Kuroda, 1965), which is different from
pronouns in other languages such as Chinese. For example, kare in Japanese can function as a
pronoun meaning ‘he’ and a noun meaning ‘boyfriend’. Kare can also take modifiers and
determiners as nouns do (e.g. ureshii kare ‘happy guy’) (Hinds, 1975). It is important to
note that although overt pronouns are not found in traditional-style narratives in Japanese
(according to Clancy, 1980), they can occur in daily conversations (according to Hinds, 1975,
1983; Amano and Kondo, 2000).
Null and overt pronouns in subject position in Japanese have also been examined by means of
experimental work. Ueno and Kehler (2016) conducted a series of sentence completion studies on
the interpretation of Japanese null and overt pronouns. Their experiments employed transfer-of-
possession verbs as well as implicit causality verbs (e.g. surprise, praise). Similar to Kehler and
Rohde’s (2013) work in English, they had both pronoun-prompt (comprehension) and no-prompt
conditions (production): Participants either had to interpret an overt subject pronoun before
providing their continuations or they could freely use whatever referential form they preferred.
Similar to Simpson et al.’s (2016) study on Chinese, perfective and imperfective aspect were also
manipulated. Furthermore, since Japanese has topic marking, Ueno and Kehler also manipulated
topicality using topic vs. nominative marking on the preceding subject. The results of Ueno and
Kehler (2016) show that Japanese overt pronouns in subject position, similar to English overt
pronouns, are sensitive to a number of pragmatic factors (e.g. (im)perfective marking, implicit
causality bias). In contrast, null subject pronouns have much less sensitivity to pragmatic
manipulations, none for the (im)perfective manipulation and only limited sensitivity to the implicit
causality manipulation. Nevertheless, both Japanese null and overt pronouns in subject position
exhibit a subject bias similar to what has been found for Chinese subject pronouns.
In sum, cross-linguistically, it is unclear whether null and overt pronouns in subject position
behave differently and how the grammatical role of the antecedent can affect the use of null and
overt pronouns in discourse pro-drop languages. Furthermore, null and overt pronouns in object
position have not been systematically investigated in prior work.
1.3. VIETNAMESE
Vietnamese makes an interesting case study for two reasons. First, as a discourse pro-drop
language, Vietnamese allows both null and overt pronouns in both subject and object positions as
shown in example (6). In (6b), null pronouns are used to refer back to both the preceding subject
and object while in (6b’), an equivalent of (6b), overt pronouns are used. (Null pronouns are
denoted with parentheses in the translation.)
(6) a. Vân nhìn thấy Nam trên đường về nhà.
Vân saw Nam on way back home
‘Vân saw Nam on her way home.’
b. Gọi mấy lần nhưng anh không nghe.
Call several time but he not hear
‘(She) called (him) several times but he didn’t hear (her).’
b’. Cô gọi anh mấy lần nhưng anh không nghe cô.
she call he several time but he not hear she
‘She called him several times but he didn’t hear her.’
Second, unlike many other languages discussed in the pronoun resolution literature, Vietnamese
overt pronouns are most commonly derived from kinship terms.³ In example (7a), the element ông
is used as a kinship term and is interpreted with its literal meaning ‘grandfather.’ In (7b), ông is
used as part of a compound noun and no longer has the literal kin term interpretation ‘grandfather’
but contributes the meaning of ‘old male’ to the compound. In (7c), ông is used as an overt
pronoun, where it again does not mean ‘grandfather’, but is used in a pronominal way for anaphoric
reference to some antecedent in the discourse which has the properties of being male and old.
³ In addition to the extensive list of kinship pronouns, Vietnamese also has a small set of pronouns which do not come
from kinship terms (e.g. nó ‘he/she/it’, họ ‘they’). The pronoun nó occurred very infrequently (only one participant
used nó) in the narratives: participants had a strong preference for using kinship pronouns. However, occurrences of
nó were nevertheless counted as an overt pronoun whenever this element did occur. The pronoun họ ‘they’ was also
very infrequent. I did not include the few occurrences of họ in our analyses, because this pronoun is often ambiguous
in terms of which group it refers to and therefore difficult to code.
(7) a. Ông của Lan vừa đến.
grandfather of Lan just arrive
‘Lan’s grandfather just arrived.’
b. Ông nông dân đang hái trái cây.
old.male.farmer PROG pick fruit
‘The farmer is/was picking fruit.’
c. Ông hái từng trái một.
old.male.he pick each fruit at once
‘He picked the fruit one by one.’
Thus, many elements which are used in a typically pronominal way in Vietnamese also appear in
other linguistic contexts, incorporated into larger compound words frequently depicting
professions and as pure kinship terms with relational meanings. The use of such elements as
pronouns is established by means of two criteria. First, as pronouns, such elements do not project
their literal kin term meaning (‘grandfather’, ‘uncle’, ‘aunt’, etc.), but communicate more general
information about gender and age. Second, in their pronominal use, elements such as ông, cô, anh
and bà occur either bare (i.e. not part of a larger compound word) or with a demonstrative modifier,
e.g. ông ấy (lit. ‘that old male person’). In the pear story narratives investigated in the current study,
elements such as ông, cô, anh and bà were never used as kin terms encoding a relational meaning
to others in the storyline, but occurred either as parts of larger compounds, when a discourse
participant was introduced (and sometimes referred back to at a much later point), or as pronouns,
when reference was made to some other NP in the discourse.
This kin term pronoun system distinguishes Vietnamese from other discourse pro-drop
languages such as Chinese and Japanese which have previously been studied. Chinese overt
pronouns are similar to English-type pronouns in that they only denote number (and gender in third
person pronouns in written Chinese) (Li and Thompson, 1981). Meanwhile, as shown in Section
1.2, Japanese overt pronouns have noun-like behaviors (Kuroda, 1965; Hinds, 1975, 1983). More
importantly, the difference in Chinese vs. Japanese overt pronoun systems is also correlated with
different patterns of use: Previous work on Chinese narratives shows that both null and overt
pronouns are frequently used (Christensen, 2000). However, in Japanese narratives, null pronouns
are the most frequent form while overt pronouns occur only rarely (Clancy, 1980, 1982). Null
pronouns in Japanese are considered the equivalent of English pronouns. In contrast, Japanese
overt pronouns have a very restricted use with specific connotations (see Hinds, 1975 for a full
discussion) and their occurrences are often considered to be due to the influence of Western languages
such as English. Thus, among discourse pro-drop languages, null and overt pronouns vary greatly
in their properties and usage. A closer look at the typologically different kinship pronoun system
in Vietnamese can contribute valuable information regarding pronoun behavior cross-
linguistically.
In this chapter, I present our work on narratives as an initial investigation of null and overt
pronouns in Vietnamese. I also aim to draw a direct comparison between pronouns in Vietnamese
and in other discourse pro-drop languages. Since previous studies on pronouns in Chinese and
Japanese which also discuss spoken and written modality have used narratives (Christensen, 2000;
Clancy, 1982), I also use a narrative task to keep our study maximally parallel to prior work.
Most importantly, I am interested in the effects that grammatical roles of the antecedent and
of the referring expression itself have on referential form choice in both subject and object
positions. A sentence completion task is typically used to investigate referents’ subsequent
mentions in subject position but not in object position. Therefore, a narrative task which allows us
to examine referents’ occurrences in both subject and object positions is better suited for our
purposes.
Furthermore, in order to test for potential effects of spoken vs. written modality, I keep the
number of referents and the genre constant in both written and spoken modalities. Prior work on
narratives and modality only reports overall counts of referential forms without details about the
grammatical positions of their occurrences (Clancy, 1980; Christensen, 2000). Additionally, many
of these studies also look at written and spoken data in different genres (news vs. conversational)
(Biber et al., 1999). Thus, the differences found may be due to the discourse type and not modality.
Taking these factors into consideration, our study maintains maximal parallelism between our
spoken and written narratives in genre and formality. I also include grammatical roles and
grammatical parallelism in our analysis. Our goal is to shed light on the mechanisms licensing
referential forms and to examine whether modality (i.e. the use of spoken vs. written language)
indeed has a direct influence on these mechanisms.
2. Experiment 1 – Narratives
2.1. DATA COLLECTION
I used a narrative task based on the Pear film, similar to the narratives used in work on Chinese
(Christensen, 2000) and Japanese (Clancy, 1980). The experiment consisted of two parts, spoken
and written. Prior work on Chinese and Japanese either only discussed the overall counts of
referential forms (Christensen, 2000) or how referential forms are used with regards to discourse
structure (e.g. number of intervening clauses, number of intervening referents) (Clancy, 1980). In
contrast, our study focuses on the mechanisms licensing referential form choice (null vs overt).
Thus, I incorporate factors such as (i) grammatical roles of the antecedent and of the pronominal
element and (ii) grammatical parallelism into our analysis and examine their influence on
referential form use in both spoken and written modalities.
2.2. METHOD
Twenty native speakers of Vietnamese (living in Vietnam) participated in the experiment. First,
each participant was shown the Pear film (Chafe, 1980) about a boy stealing pears. There are sound
effects in the film but no spoken words. After watching the film, participants were first instructed
to recount the story as if they were speaking to a friend who had not seen it. The narratives were
recorded. This made up the spoken task of the experiment. After verbally narrating the story,
participants were instructed to recount the story as if they were writing to a friend who had not
seen the film. This made up the written task of the experiment.
2.3. DATA ANALYSIS
To prepare the data for further analysis, I transcribed the spoken narratives orthographically. I also
included features of spoken language such as hesitations, pauses, false starts, repetitions and self-
corrections in the transcription. When repetitions and self-corrections occurred, I only considered
the final occurrence in the analysis, under the assumption that this is the version with which
participants were most satisfied. In the next step, I divided the spoken narratives into utterances.
Following Hurewitz (1998), Passonneau (1998) and others, I define an utterance as a finite clause
(i.e. containing a finite verb) but do not consider relative clauses as separate utterances for purposes
of discourse segmentation. Relative clauses are grouped with the main clause whose components
they modify, following Hurewitz (1998) and others. As a consequence, given that our analysis
focuses on subjects and objects of the main clause, referents that are only mentioned inside relative
clauses are not included in the analysis⁴ (see also Bel et al., 2010; Walker et al., 1998). I adopted
these same criteria to divide the written narratives into utterances. Thus, similar to an utterance in
the spoken narratives, each utterance in the written narratives consists of a finite verb and may
include a relative clause. It is important to note that in this analysis, I only report cases in which
referents occur in adjacent utterances. I did not encounter ambiguous pronouns in this dataset
(with the exception of họ ‘they’, which was not counted due to its ambiguity, see footnote 3).
I coded all mentions of singular third-person human referents in adjacent utterances for (i)
grammatical role and (ii) referential form. Regarding (i) grammatical role, I coded referents’
grammatical roles in both the preceding and the current utterances (e.g. subject, object,
possessive, etc.). In other words, I coded the grammatical roles of the antecedent and of the
anaphoric element. For the purposes of the current work, I only discuss Subject and Object roles
in our analysis. Four grammatical configurations were established based on referents’ preceding
and current grammatical roles as shown in Table 1. (See footnote 1 regarding our use of the term
‘referent’ in this chapter.)
Preceding clause    Current clause         Grammatical configuration
(antecedent)        (anaphoric element)
Subject             Subject                Subject-Subject (Subject parallelism)
Subject             Object                 Subject-Object
Object              Subject                Object-Subject
Object              Object                 Object-Object (Object parallelism)
Table 1. Four configurations based on grammatical roles in preceding and current clause.
Regarding (ii) referential form, since our goal is to observe how grammatical roles can influence
the current choice of referential form (i.e. null pronoun, overt pronoun, and NP), I only coded
referents’ referential form in the current clause. Examples (8-11) illustrate how data is coded with
regards to the four grammatical configurations. The referents of interest are in bold. Null pronouns
are indicated in the English translations by pronouns in parentheses.
(8) a. khi cậu bé này đi ngang qua một con đường
when CL boy this go past a CL road
‘when this boy went past a road’
b. thì (Ø) gặp một cô bé cũng đi một chiếc xe đạp
then (Ø) see a CL girl also ride a CL bike
‘then (he) saw a girl who also rode a bike’
→ Configuration: Subject-Subject
→ Referential form: null pronoun⁵
⁴ Although arguments might be made that referents that are only mentioned inside relative clauses should be included
in studies of anaphor-antecedent relations, following the norms adopted by previous investigations allows us to create
a profile of Vietnamese which can be compared directly with studies of other languages. As for complement clauses,
these were included in the current study of Vietnamese (though they were very rare – only one relevant token in each
of the spoken and written narratives).
(9) a. cậu thấy ba cậu bé đang đứng trước mặt mình
he see three CL boy PROG stand front face self
‘he saw three boys standing in front of him’
b. một cậu bé đỡ cậu dậy
a CL boy pull he up
‘a boy pulled him up’
→ Configuration: Subject-Object
→ Referential form: overt pronoun
(10) a. thì (Ø) đã đỡ cái cậu bé này dậy
then (Ø) PAST pull CL CL boy this up
‘then (they) pulled this boy up’
b. cậu bé này lúc này đau chân
CL boy this time this hurt leg
‘at this time, this boy hurt his leg’
→ Configuration: Object-Subject
→ Referential form: NP
(11) a. thì nó gặp một bé gái đi ngược chiều
then he see a CL girl go opposite direction
‘then he saw a girl going in the opposite direction’
b. và do (Ø) mãi nhìn bé gái
and because (Ø) busy look CL girl
‘and because (he) was busy looking at the girl’
→ Configuration: Object-Object
→ Referential form: NP
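The coding scheme in Table 1 and its application to examples (8-11) can be sketched in a few lines of Python. This is only an illustration of the scheme as described in the text, not the procedure actually used in the study; the record type `Mention` and the helper names `configuration` and `is_parallel` are hypothetical.

```python
from dataclasses import dataclass

# Only Subject and Object roles are discussed in the analysis (see Table 1).
ANALYZED_ROLES = {"Subject", "Object"}


@dataclass(frozen=True)
class Mention:
    """One coded re-mention of a referent in adjacent utterances (hypothetical record type)."""
    example: str          # example number in the text
    antecedent_role: str  # grammatical role in the preceding utterance
    current_role: str     # grammatical role in the current utterance
    form: str             # "null pronoun", "overt pronoun", or "NP"


def configuration(mention):
    """Return the Table 1 configuration label, or None if a role is not analyzed."""
    roles = (mention.antecedent_role, mention.current_role)
    if not all(role in ANALYZED_ROLES for role in roles):
        return None
    return f"{roles[0]}-{roles[1]}"


def is_parallel(mention):
    """True for Subject-Subject and Object-Object (Subject/Object parallelism)."""
    return (mention.antecedent_role in ANALYZED_ROLES
            and mention.antecedent_role == mention.current_role)


# Examples (8)-(11), coded as in the text.
EXAMPLES = [
    Mention("8", "Subject", "Subject", "null pronoun"),
    Mention("9", "Subject", "Object", "overt pronoun"),
    Mention("10", "Object", "Subject", "NP"),
    Mention("11", "Object", "Object", "NP"),
]

for m in EXAMPLES:
    label = "parallel" if is_parallel(m) else "non-parallel"
    print(f"({m.example}) {configuration(m)}: {m.form} [{label}]")
```

Mentions whose antecedent or anaphoric element bears another role (e.g. possessive) receive no configuration label in this sketch, mirroring the restriction of the discussion to Subject and Object roles.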
When counting null pronouns, I excluded those that occur in coordinate constructions with “and”,
“but” and so on. I did this to avoid inadvertently inflating the number of null pronouns. Even in
languages like English, standardly analyzed as not allowing pro-drop, coordination structures like
“Lisa went home and made a sandwich” and “Lisa went to the library but could not find her friend”
allow what superficially looks like a missing pronoun/pro. As a result, excluding these types of
structures in our analyses ensures that all null pronouns reported in our results are ‘proper’ null
pronouns and not analyzable in terms of coordination.
3. Results
In this section, I first present some general information about the narratives. I then discuss how
referential forms are used with regards to the four grammatical configurations in Table 1. I also
discuss the details of the spoken task (Section 3.1) followed by those of the written task (Section
3.2). Finally, I draw a comparison between written and spoken results (Section 3.3).
⁵ Note that the element cậu which appears in examples (4-7) occurs either as part of a larger compound cậu bé meaning
‘boy’, or as a pronoun meaning ‘he’ (young male). Its original lexical meaning is the kin term relation ‘uncle’
(mother’s brother). Other speakers used the pairs chú bé ‘boy’ and chú ‘he’ (young male) for the same discourse
referent. The original lexical meaning of chú is also ‘uncle’ (father’s younger brother).
Let us first look at the length of the narratives. I removed hesitations, pauses, repetitions and
self-corrections from the spoken narratives prior to performing the word count to keep them
parallel with the written narratives. Table 2 shows that on average, the spoken narratives are longer
than the written narratives considering both the average number of words and the average number
of utterances. I also calculated the average number of words per utterance for each participant and
averaged them across all participants. The result shows that spoken utterances are longer than
written ones. This is in line with prior work on written vs. spoken differences, specifically that
spoken language tends to be more elaborate while written language is more concise (e.g. Drieman,
1962; Horowitz & Newman, 1964; Tannen, 1982).
           Avg. words    Avg. utterances    Avg. words per utterance
Written    317.2         35.45              9.15
Spoken     381.45        39.85              9.91
Table 2. Average length of the narratives by word count, utterance count, and average number of
words per utterance among participants.
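A note on reading Table 2: the last column is a mean of per-participant ratios (words per utterance were computed for each participant and then averaged), so it need not equal the ratio of the two column means. The short Python sketch below makes the difference concrete using only the figures reported in Table 2; the function and variable names are illustrative assumptions.

```python
# Column means reported in Table 2: modality -> (avg. words, avg. utterances).
TABLE_2_MEANS = {
    "Written": (317.2, 35.45),
    "Spoken": (381.45, 39.85),
}

# Last column of Table 2: mean of the per-participant words-per-utterance ratios.
REPORTED_MEAN_OF_RATIOS = {"Written": 9.15, "Spoken": 9.91}


def ratio_of_means(avg_words, avg_utterances):
    """Words per utterance obtained by dividing the two column means."""
    return avg_words / avg_utterances


for modality, (avg_words, avg_utterances) in TABLE_2_MEANS.items():
    pooled = ratio_of_means(avg_words, avg_utterances)
    reported = REPORTED_MEAN_OF_RATIOS[modality]
    print(f"{modality}: ratio of means = {pooled:.2f}, "
          f"reported mean of ratios = {reported:.2f}")
```

Both ways of computing the measure give a higher value for the spoken narratives than for the written ones, so the conclusion that spoken utterances are longer does not depend on the choice.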
Table 3 shows the overall use of null pronouns, overt pronouns and NPs in the narratives based on
the four configurations discussed in Table 1 above. As seen in Table 3, among the three types of
referential forms, null pronouns and NPs occur slightly more frequently than overt pronouns.
Additionally, I found no difference between written and spoken narratives with regards to
referential form use. These patterns might seem to suggest that, at least on this broad level,
referential form choice occurs randomly/at chance since there is no clear preference for any of the
forms. However, as I show later in this chapter, this is not the case. When grammatical roles and
grammatical parallelism are taken into account, clear patterns of preference start to emerge. Thus,
it is important to not only look at the overall frequency of referential form use but also to consider
the environment in which the forms occur.
           Null pronouns    Overt pronouns    NPs    Total
Written    35%              30%               35%    100%
Spoken     34%              30%               36%    100%
Table 3. Overall percentages of null pronouns, overt pronouns and NPs used in written and
spoken narratives.
3.1. SPOKEN NARRATIVES
Let us first look at the spoken results. Among the four configurations, participants used the Subject-
Subject configuration to refer to referents in adjacent utterances (78.85%) much more frequently
than any of the other three configurations.
     Subject-Subject    Subject-Object    Object-Subject    Object-Object    Total
%    78.85              5.14              9.06              6.95             100
Table 4. Percentage of each configuration in spoken narratives.
Patterns of referential form choice (i.e. null, overt, NP) in the current utterance across the four
configurations are shown in Figure 1. To examine the pattern of pronoun vs. NP across the four
configurations, I conducted a series of chi-square tests.⁶ The results confirm that the distribution
of pronouns vs. NP use in the Subject-Subject configuration (Subject parallelism) differs
significantly from the other three configurations. Specifically, pronouns (null pronouns + overt
pronouns = 73.18%) are the dominant choices in the Subject-Subject configuration. In contrast,
the other three configurations Subject-Object, Object-Subject and Object-Object consist of mostly
NPs (> 60% NPs in each configuration). With regards to the null vs. overt pronoun choice in the
Subject-Subject configuration, participants show no preference for either null or overt pronouns
(p = .13).
I also examined the pronoun vs. NP choice in the other three configurations, Subject-Object,
Object-Subject and Object-Object (Object parallelism). No significant difference was found in the
distribution of pronouns and NPs (p = .08) among these configurations. Nevertheless, the parallel
Object-Object configuration has slightly more pronouns (39.13%) than the other non-parallel
Subject-Object and Object-Subject configurations (35.29% and 13.33% respectively). More
interestingly, while the non-parallel Subject-Object and Object-Subject configurations have no
null pronouns at all, the parallel Object-Object configuration elicits 26.1% null pronouns.
Figure 1. Percentages of referential forms in four grammatical configurations in spoken task.
(The first part of each label refers to the grammatical role of the antecedent and the second part
refers to the grammatical role of the pronoun or NP, e.g. Subject_antecedent-Subject_anaphoric_element.)
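For readers who wish to see how a pronoun vs. NP comparison across the three non-Subject-Subject configurations can be set up, the following Python sketch computes a Pearson chi-square statistic for a 3 x 2 contingency table. The counts are NOT reported in the text: they are an assumption, back-calculated as the smallest whole numbers consistent with the percentages given in Section 3.1 and Table 4 (e.g. 9/23 = 39.13%). The function names are likewise illustrative, and the exact tests run in the study may have been set up differently.

```python
import math

# ASSUMPTION: (pronouns, NPs) per configuration, back-calculated from the
# reported percentages; these counts are not given in the text.
ASSUMED_COUNTS = {
    "Subject-Object": (6, 11),   # 6/17 = 35.29% pronouns
    "Object-Subject": (4, 26),   # 4/30 = 13.33% pronouns
    "Object-Object": (9, 14),    # 9/23 = 39.13% pronouns
}


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    rows = [list(row) for row in table]
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail chi-square probability; the closed form exp(-x/2) holds only for 2 df."""
    return math.exp(-statistic / 2)


statistic, dof = chi_square_statistic(ASSUMED_COUNTS.values())
assert dof == 2  # the closed-form p-value above is only valid in this case
print(f"chi-square = {statistic:.2f}, df = {dof}, p = {p_value_two_dof(statistic):.3f}")
```

Under these assumed counts the p-value falls just below .08, in line with the non-significant result reported above. As the text itself acknowledges, the test assumes independent observations, which the narrative data only approximate.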
⁶ I used chi-squared test for the statistical analyses, although I realize that aspects of my data are not ideal for this
statistical test. My elicited-narration technique yielded a corpus of multiple narratives and thus involves multiple
observations from each participant. However, my open-ended task differs from the standard, more narrowly-controlled
within-subjects design often used in psycholinguistics, and although I have multiple observations from each person,
the nature of these observations is highly variable across participants. This, as well as the fact that my analysis of
pronominal forms involves analyzing responses dependent on the syntactic configuration that a participant chose to
produce, lead me to opt for the chi-squared analysis over other options, although chi-square assumes independence.
3.2. WRITTEN NARRATIVES
Let us turn to the written results. I first examined how frequently participants used each type of
grammatical configuration in Table 1 in their narratives. When participants produced an NP or a
(null or overt) pronoun in subject position or object position, I noted what position the antecedent
was in. In Table 5 and Figure 2, as in Table 1, the first part of each label refers to the grammatical
role of the antecedent and the second part refers to the grammatical role of the pronoun or NP (e.g.
Subject_antecedent-Subject_anaphoric_element). I found that re-mentioning of the same referent is most
likely to occur in the Subject-Subject configuration (Subject parallelism). As seen in Table 5, the
Subject-Subject configuration occurs at a rate of 76%, far more frequently than any of the other
three configurations.
     Subject-Subject    Subject-Object    Object-Subject    Object-Object    Total
%    76                 7                 9                 8                100
Table 5. Percentage of each configuration in written task.
I also investigated referential form choice (i.e. null pronouns, overt pronouns, NPs) in the current
utterance in each grammatical configuration. Figure 2 shows the percentages of each referential
form in the four configurations. Null and overt pronouns are presented ‘stacked’ in a single bar to
make the overall percentage of pronouns relative to NPs easier to see.
Figure 2. Percentages of referential forms in four grammatical configurations in written task.
(The first part of each label refers to the grammatical role of the antecedent and the second part
refers to the grammatical role of the pronoun or NP, e.g. Subject_antecedent-Subject_anaphoric_element.)
As seen in Figure 2, the Subject-Subject configuration (Subject parallelism) mostly occurs with
pronouns (null pronouns + overt pronouns = 74.71%), whereas the other three configurations
consist of mostly NPs (> 55% NPs in each configuration). Results of a series of chi-square tests
suggest that the distribution of pronouns vs. NPs in the Subject-Subject configuration differs
significantly from the other three – as expected from the patterns in Figure 2. Specifically,
participants produced significantly more pronouns (null + overt pronouns) relative to NPs in the
Subject-Subject configuration than in the Subject-Object configuration (p < .001), the Object-
Subject configuration (p < .001), and the Object-Object configuration (p < .01). I also compared
the use of null vs. overt pronouns in the Subject-Subject configuration and found no significant
difference between the two forms (p = .06) – as the patterns visible in Figure 2 lead us to expect.
A closer look at the other three configurations, Subject-Object, Object-Subject and Object-
Object (Object parallelism) shows that the proportions of pronouns vs. NPs used in these
configurations are not significantly different from each other (p = .39). However, the proportion
of pronouns in the parallel Object-Object configuration is numerically slightly higher than those
in the non-parallel Subject-Object and Object-Subject configurations, 44% compared to 33% and
24% respectively.
3.3. COMPARING WRITTEN AND SPOKEN RESULTS
In this section, I examine the effects of modality (i.e. written, spoken) on (i) the use of grammatical
configurations as well as (ii) the choice of referential forms in each configuration. For this purpose,
I provide a side-by-side comparison of the written and spoken results in Figures 3 and 4.
Figure 3 shows the proportions of four types of grammatical role configurations in the
written and spoken narratives. I observe the same patterns in both types of narratives. In particular,
the Subject-Subject (Subject parallelism) configuration is the most frequent (more than 75% of all
occurrences). The other three configurations occur at a similar rate as seen in Figure 3. In short,
there is no effect of modality on the occurrence of the four different types of configurations.
Figure 3. Proportion of the four grammatical configurations in written and spoken narratives.
In terms of referential form use, I conducted a series of chi-square tests to compare the numbers
of null pronouns, overt pronouns and NPs in each grammatical configuration between written and
spoken narratives. The results show that Vietnamese participants do not differ in their referential
form use in writing and in speaking (p’s = n.s.). In both types of narratives, the Subject-Subject
configuration differs significantly from the other three configurations. Figure 4 shows that in the
Subject-Subject configuration, pronouns (null + overt pronouns in the stacked bars) are the
preferred forms. However, in the other three configurations, participants exhibit a preference for
NPs over pronouns. This preference for NP use is very clear in the non-parallel Subject-Object
and Object-Subject configurations. Interestingly, the parallel Object-Object configuration,
although still yielding a high number of NPs, has slightly more pronouns than the non-parallel
configurations. Most prominently, in the spoken narratives, null pronouns are found in the parallel
Object-Object configuration, but they did not occur at all in the non-parallel configurations. In
sum, modality does not affect Vietnamese participants’ choice of referential form across all four
grammatical configurations. Nevertheless, in both modalities, I find that the four configurations
elicit different kinds of referential forms (as can be seen in Figure 4, and as I previously discussed
in Sections 3.1 and 3.2).
Figure 4. Percentages of referential forms in the four grammatical configurations in both written
and spoken narratives.⁷
Taken together, our results show no effects of written vs. spoken modality on how Vietnamese
participants use either grammatical configurations or referential forms with respect to these
configurations. The lack of modality effect on referential form choice in the current study contrasts
with previous claims that pronouns and NPs occur at different rates in written and in spoken
language (Biber et al., 1999; Christensen, 2000).
4. Discussion
Experiment 1 investigated the effects of (i) grammatical roles, (ii) grammatical parallelism and
(iii) modality on speakers’ referential form choices in Vietnamese narratives. I am particularly
interested in how Vietnamese null and overt pronouns are used. This interest stems from the fact
that Vietnamese not only allows null and overt pronouns in both subject and object positions but
also has a complex kinship pronoun system that differs from other discourse pro-drop languages
such as Chinese and Japanese. Thus, this chapter aims to add to our understanding of pronoun
behavior in typologically different languages.
The narrative experiment (Experiment 1) has two parts, spoken and written. I instructed
participants to recount the Pear film first by speaking, and then by writing. I analyzed the narratives
taking into account (i) referents’ grammatical roles in the preceding and current utterances and
(ii) their referential forms in the current utterance. This method allowed me to investigate the
extent to which grammatical roles and grammatical parallelism affect referential form choice. The
results of both spoken and written narratives show that grammatical role and grammatical
parallelism play a key role in Vietnamese speakers’ choice of referential form. Specifically,
Vietnamese speakers use significantly more pronouns (null and overt pronouns combined) when
the grammatical subject role is maintained across utterances (i.e. Subject parallelism). In contrast,
the non-parallel configurations (i.e. Subject-Object, Object-Subject) result in mostly NPs.
⁷ The absence of columns for null pronouns in the Subj-Obj and Obj-Subj configurations in the spoken narratives is
due to the fact, noted at the end of section 3.1, that speakers did not produce such elements in these two configurations
in the spoken narratives.
Interestingly, I also detect hints of a parallelism effect in the Object-Object configuration (i.e.
Object parallelism). Although NPs are the most frequent choice, Vietnamese speakers produced
more pronouns (null and overt pronouns) in the Object parallelism configuration than in the non-
parallel ones. I also observed parallelism effects in the patterning of null pronouns in the spoken
narratives: null pronouns only occurred in Subject and Object parallelism configurations.
Nevertheless, the Subject and Object parallelism configurations differed in their overall patterns
with Subject parallelism favoring pronouns and Object parallelism favoring NPs. These patterns
indicate that grammatical role still has a strong impact on referential form choice.
I am also interested in the potential effect of modality (i.e. spoken vs. written) on the production
of Vietnamese referential forms. Our results show that modality has no significant effect on
Vietnamese speakers’ referential form choice when the level of formality and subject matter being
described are kept parallel in spoken and written descriptions. The patterns of pronoun and NP use
are similar in spoken and written narratives. Moreover, Vietnamese speakers also use null and
overt pronouns similarly in both modalities. At first glance, this finding seems to contradict prior
claims that written language utilizes more NPs than spoken language (Biber et al., 1999) and that
null pronouns are used more in written than in spoken narratives (Li and Thompson,
1979; Christensen, 2000). However, there is a major difference between these studies and our
work. Whereas previous studies report the number of tokens without specifying the environment
of occurrence (Christensen, 2000; Clancy, 1982), our study reports these numbers with respect to
grammatical roles and grammatical parallelism. Crucially, including grammatical factors in the
analyses allows us to obtain a clearer view of the underlying mechanism licensing use of different
referential forms, particularly null and overt pronouns. Thus, the lack of modality effects in our
results suggests that the same underlying mechanism guides production of referential forms in both
spoken and written Vietnamese, which I regard as a desirable conclusion.
Another focus of attention in the current study is the choice of null vs. overt pronouns in
Vietnamese. In previous, highly influential work on the discourse pro-drop language Chinese,
Givón (1983) has proposed that there is a strong preference for the use of null pronouns (‘zero
anaphora’) rather than overt pronouns when the antecedent for such elements is highly
salient/prominent in a discourse. Givón (1983) and a broad range of functional studies adopting
Givón’s approach suggest that ‘the more accessible a referent is within a discourse, the less overt
coding it will be given, hence that highly accessible antecedents will be referenced with zero
anaphora, less accessible antecedents with (overt) pronouns, and very weakly accessible referents
with the use of a full NP’ (Simpson et al., 2016: 2). Similar observations about the relationship
between the form of referring expressions and the salience/prominence of the antecedent are made
by Ariel (1990) and Gundel et al. (1993).
A large number of studies have shown that grammatical role has a significant influence on
referents’ salience and thus, the choice of referring expression (Chafe, 1976; Brennan, Friedman,
& Pollard, 1987; Crawley & Stevenson, 1990; see also Gordon, Grosz, & Gilliom, 1993; Gordon
& Chan, 1995; Perfetti & Goldman, 1974). Particularly, referents in subject position are more
salient than those in object position. These studies along with the salience hierarchy (Givón, 1983;
Ariel, 1990; Gundel et al., 1993) predict that more reduced referential forms are preferred for
highly salient subject antecedents while fuller forms are frequently used for less salient object
antecedents. These predictions have been supported in English (e.g. Fukumura and van Gompel,
2010) as well as in agreement pro-drop languages such as Italian and Spanish (Carminati, 2002;
Alonso-Ovalle et al., 2002).
The finding of the current study on Vietnamese, that speakers employ broadly equal amounts
of null and overt pronouns in situations where the grammatical roles of the antecedent and
anaphoric element are the same, poses a clear challenge to the salience hierarchy. Null pronouns
in Vietnamese – being the more reduced referential form – are expected to be chosen much more
frequently than overt pronouns to refer to highly salient subject referents, but this was not observed
in either the spoken or written narratives. As a clear preference for null pronouns was not found
in Subject-Subject coreference relations, the conclusion can be drawn that there is no necessary
cross-linguistic application of the salience hierarchy in the choice of referential forms,
automatically favoring more reduced forms in instances of reference to recent, highly salient
elements within a discourse.
The absence of a straightforward mapping between more reduced forms and more salient
elements is in line with Kaiser & Trueswell’s (2008) form-specific multiple-constraints approach.
Based on data from Finnish overt pronouns and anaphoric demonstratives, Kaiser and Trueswell
argue against the assumption that different kinds of referring expressions can be straightforwardly
mapped onto a unified salience hierarchy.
The discovery of broadly equal use of null and overt pronouns in the current study of
Vietnamese interestingly converges with the results of a recent investigation of Chinese described
in Christensen (2000) which also found that speakers tend to use null and overt pronouns equally
in similar conditions, at least in spoken Chinese.⁸ This suggests that the connections posited
between salience and representational form in instances of anaphoric reference should be carefully
reexamined in other pro-drop languages, to establish which of these follow the
Vietnamese/Chinese patterning, and which may perhaps show stronger preferences for null
pronouns when these are licensed by the context.
Our results also distinguish pronouns in Vietnamese from those which occur in Japanese in a
potentially informative way (Clancy, 1980, 1982). It has previously been claimed that the
observed, highly restricted use of Japanese overt pronouns may be due to the fact that they are
historically derived from nouns and are rich in semantics. The latter property is suggested to
constrain their use, resulting in a significantly lower frequency of occurrence than that of null
pronouns in the language (Hinds, 1975, 1983). Comparing Vietnamese and Japanese, it can be
noted that kinship overt pronouns in Vietnamese are also semantically rich, but this
does not seem to restrict their use in the same ways as in Japanese.
There are two factors that may influence pronoun use in Vietnamese and null/overt pronoun
alternations which have not been explored in the current study, warranting further investigation.
First, although the grammatical subject of a sentence is also often the topic of a particular stretch
of discourse and is highly salient (Givón, 1983), being a grammatical subject does not always
entail being a topic (Lambrecht, 1994). As a result, the second subject in our Subject parallelism
configuration is likely to be a topic but does not have to be one. If speakers favor the use of null
subjects for reference to the discourse topics and were to use overt pronouns for other instances of
anaphoric reference, this might account for some of the variation between null and overt pronouns
attested in Experiment 1. Consequently, one may question the degree to which grammatical
subjects regularly function as topics in Vietnamese. If such a relation does not exist strongly in
Vietnamese, this might allow for a more nuanced account of the distribution of null and overt
pronouns in patterns of Subject-Subject co-reference. I address this concern in the following
chapter (Chapter 3, Experiments 2 and 3) using a sentence completion task.
⁸ Christensen’s investigation of oral and written narratives recounting the pear story in Chinese showed that null and
overt pronouns were used at nearly the same rate in the oral narratives. However, unlike Vietnamese, this patterning
was not maintained in the written narratives, where null pronouns occurred 55% of the time, while overt pronouns
were used at a rate of less than 20%. There is thus a clear effect of modality at play in Chinese, which does not seem
to occur in Vietnamese.
Second, another factor which my analysis has not accounted for is the role that coherence
relations potentially may play in anaphoric reference. Previous work shows that the production
and comprehension of pronouns can be influenced by the type of coherence relation which exists
between two clauses (Kehler, 2002; Kehler & Rohde, 2013). With regards to discourse pro-drop
languages, Simpson et al. (2016) confirm the effects of coherence relations on the likelihood of
mention and referential form use in Chinese. They found that the Explanation⁹ relation results in
more continuations referring back to the preceding subjects than the Occasion¹⁰ relation does. This
indicates that the subjects in Explanation relations are more likely to be continuing discourse
topics. Although I have not computed the details regarding coherence relations in our data, a
preliminary examination suggests that there was a high number of Occasion relations in our
narrative data. According to Kehler (2008), Occasion is the typical relation used in narratives. In
this light, the subjects in our narratives might not be “strong topics”, which could be a reason why
null pronouns were not the dominant referential form choice – perhaps null pronouns are only used
to refer to very strong discourse topics, and are less commonly used in subject chains which do
not involve topics of such strength. I aim to disentangle these factors in future work.
In sum, the results of Experiment 1 indicate that grammatical role and grammatical parallelism
play an important role in how Vietnamese speakers choose referential forms. I found that not only
subjecthood but also grammatical role parallelism increases pronoun use. In contrast, if the referring
expression and its antecedent are not in parallel grammatical roles, and in particular if they are
not both subjects, I observe a significant increase in the production of NPs. Unlike prior work,
our study found no effects of written vs. spoken modality, indicating that the effects of grammatical
roles and parallelism on referential form use are not affected by modality. These results highlight
the importance of considering referents’ grammatical roles in adjacent utterances when
investigating speakers’ choice of referential form.
Regarding the distinction between Vietnamese null and overt pronouns, no differences were
found in the current experiment. This leads me to conclude that in Vietnamese, grammatical roles
and parallelism have similar effects on both null and overt pronouns. Interestingly, despite the fact
that Chinese, Japanese and Vietnamese are all discourse pro-drop languages, overt pronoun use
varies crosslinguistically. Although Vietnamese overt pronouns are semantically rich kinship
terms, they are used very frequently, much like Chinese overt pronouns (Christensen, 2000). This
contrasts sharply with Japanese overt pronouns which are historically derived from nouns, and as
claimed in Hinds (1975, 1983), are used restrictively due to their semantics.
⁹ An Explanation relation occurs when a follow-on sentence is used to provide an explanation of the content of a
preceding sentence.
¹⁰ An Occasion relation occurs with a temporal sequencing of events, the content of one sentence preceding that of a
second sentence in time.
Topicality and the null vs. overt pronoun choice
1. Introduction
Previously, in Chapter 2, I investigated how Vietnamese speakers use referential forms (i.e. null,
overt, NP) in narratives. Two grammatical factors, grammatical roles and grammatical parallelism,
were taken into account in Experiment 1. The results of that experiment show that even though
Vietnamese is a discourse pro-drop language, structural factors play an important role in
referential form choice. Specifically, Vietnamese speakers produce significantly more pronouns
in the subject parallelism configuration than in the object parallelism or non-parallelism
configurations. One crucial finding in the narrative results is the lack of variation in null and overt
pronoun use: speakers produced null and overt pronouns in equal numbers. This finding is
unexpected in light of the widespread view (Ariel, 1990; Givón, 1983; Gundel et al., 1993)
according to which a specific referential form is selected based on the referent’s degree of salience
in discourse, with null pronouns typically regarded as being associated with more salient
antecedents than overt pronouns. On this view, null and overt pronouns should have different
patterns of use.
As previously discussed in Chapter 2, Section 4, the effect of grammatical roles found in
Experiment 1 may be confounded with a topicality effect, since referents in the grammatical subject
position frequently coincide with the topic of discourse (Givón, 1983). Furthermore, the coherence
relation between utterances can also affect the re-mention rate of the referent and the choice of
referential form when referents are mentioned again (e.g. Kehler, 2002; Kehler & Rohde, 2013;
Simpson, Wu, & Li, 2016). These factors were not fully controlled for in Experiment 1 (spoken
and written narratives) in Chapter 2.
The current chapter focuses on Vietnamese null and overt pronouns. I present two sentence
completion experiments (Experiments 2 and 3) examining the effect of topicality on Vietnamese
speakers’ use of null vs. overt pronouns while keeping the coherence relation constant across the
clauses. Passivization is used as a tool to promote topicality. Unlike the narrative study,
Experiments 2 and 3 investigate not only how Vietnamese speakers produce null vs. overt
pronouns but also how they interpret the two pronoun types. I also examine
whether modality has an effect on the comprehension and production of null and overt pronouns.
The structure of this chapter is as follows: In the remainder of Section 1, I discuss details of
the salience-hierarchical approaches and their predictions for a null vs. overt pronoun distinction.
I also present evidence for an interplay between grammatical roles and topicality and how that may
affect pronoun use. In Section 2, I describe the written sentence completion experiment
(Experiment 2) and the results obtained from the data. Section 3 presents the spoken sentence
completion experiment (Experiment 3) and its results. Results from both experiments are
compared in Section 4. Section 5 discusses the implications of our findings, compares them to
findings from other languages, and outlines directions for further work.
1.1. SALIENCE-HIERARCHICAL APPROACHES
As previously discussed in Chapter 2, Section 1, salience/prominence is an important factor for
referential form choice (Ariel, 1990; Givón, 1983; Gundel et al., 1993). While being the subject of
a sentence increases referents’ salience, being the topic of discourse also boosts it. The notion of
topicality in Givón (1983) was presented as a continuum with zero anaphors as the most topical
entities, and referential indefinite NPs as the least topical entities. Applying the hierarchy of
salience to null and overt pronouns in Vietnamese, null pronouns should be ranked higher than
overt pronouns in this hierarchy. Thus, null pronouns are used to refer to the most salient, topical
entities available in discourse while overt pronouns are used for the less salient entities.
Although there is no argument about the link between the choice of referential form and the
salient status of a referent in discourse, it is questionable whether these forms are mutually
exclusive. For example, if speakers choose to use a null pronoun, does that mean other choices
such as an overt pronoun or an NP cannot be used? To address this question, Gundel et al. (1993)
proposed the Givenness Hierarchy with six cognitive statuses for referents in discourse and how
they are related to various referential expressions. With regard to the current work, the ranking of
null and overt pronouns remains the same as seen in (8). The crucial point here is that Gundel et
al.’s hierarchy is an implicational hierarchy in which the higher status forms entail the lower
status forms but not vice versa. To demonstrate how this can be implemented, let us take a look at
the referential forms in (8). In this hierarchy, a null pronoun is ranked higher than an overt pronoun
and higher than an NP. According to the Givenness Hierarchy, when a null pronoun is used, it is
possible to replace it with an overt pronoun. Nevertheless, I do not see all of these forms freely
replacing one another in discourse. Gundel et al. suggested that the actual distribution of the forms
is constrained by Grice’s Maxim of Quantity (i.e. “Do not make your contribution more informative
than is required”). For instance, if the use of a reduced form such as a null pronoun is sufficient,
there is no need to use a fuller form such as an overt pronoun.
This leads us back to the question about the equal use of null and overt pronouns in the subject
parallelism configuration (i.e. Subject_antecedent-Subject_anaphoric_element) in Experiment 1 in Chapter 2.
Since subject parallelism is the most frequently used configuration and it also elicits significantly
more pronouns (null and overt pronouns) than the other three configurations, there is little doubt that
referents in the subject position gain a special status in discourse. If so, why are overt pronouns
still very frequently used? One possible explanation is that perhaps in Vietnamese, being in the
grammatical subject position alone does not promote referents to the highest-ranked, most salient
entities in discourse. Thus, at this level of salience, overt pronouns are permitted to occur as much
as null pronouns without violating Grice’s Maxim of Quantity. As will become clear in
Section 1.2, occurring in the grammatical subject position and occurring in a subject position
that is also topicalized confer different degrees of salience.
1.2. SUBJECT PREFERENCE AND TOPICALITY
While some researchers have focused largely on the syntactic differences between subjects and
objects (i.e. grammatical roles and grammatical parallelism), others take a more discourse-level
approach and focus on notions such as 'topic.' However, the nature of the relationship between
these two types of factors is not always clear. In this section, I discuss how both syntactic and
discourse-level factors may play a role in pronoun comprehension and production and
the types of constructions that may help tease apart the effects of these factors.
One constraint of grammatical parallelism on pronoun assignment is that it requires the two
utterances/clauses to be fully parallel in their structures for the effect to take place (Crawley et
al., 1990; Smyth, 1994; see discussion in Chapter 2, section 1.1). What happens, then, when
parallelism is not maintained? Smyth (1994) found that English speakers were more likely to
assign a pronoun in object position to a subject referent than to an object referent. For example,
Smyth (1994) tested pronoun assignment in both subject and object positions in non-parallel
sentences and found that whereas pronouns in the grammatical object position were occasionally
resolved to the referent in the subject position, no pronouns in the subject position were resolved
to referents in non-subject position. The results from Smyth (1994) show that in comprehension,
English speakers have a strong tendency to interpret pronouns as referring to the referent in the
grammatical subject position. Similarly, in production, other work (Stevenson et al., 1994; Arnold,
2001; Kehler, 2008) also found that when participants were allowed to choose both the referential
expressions and which referent to mention next, they tended to choose pronouns when referring to
referents in subject position but used names for those in object position. Taken together, these
findings support the subject assignment strategy for pronoun use.
Other researchers, however, have focused on referents’ status in discourse (i.e. topicality)
(Givón, 1983). If the subject of a sentence can also function as the sentential topic, it is difficult to
see whether pronoun use is being driven by a grammatical role effect or a topicality effect.
Due to the canonical word order of English, the grammatical subject position very often
coincides with the topic position. Rohde and Kehler (2013) suggested that topicality effects can be
better observed in a production task, where speakers can select a particular referential form to
continue with the chosen topic. According to their topicality hypothesis, the rate of
pronominalization for subjects, objects and other referents can indicate whether or not the referent
is a topic without relying on grammatical roles. The English passive construction is a good tool
to test this hypothesis as it promotes the subject of the sentence to a topic position (Creider, 1979;
Davison, 1984). Thus, even though subjects in different active and passive constructions share the
same grammatical role, they may vary in the likelihood of becoming the discourse topics. Consider
the sentences in (12) below. Both Amanda and Brittany are syntactic subjects. Nonetheless, the
likelihood of Brittany being construed as the topic of (12b) is higher than the likelihood of Amanda
being construed as the topic of (12a) (Rohde & Kehler, 2014). Indeed, Rohde and Kehler found that
speakers are more likely to continue mentioning Brittany, the subject of the passive sentence in
(12b), in their continuations than Amanda, the subject of the active sentence in (12a). Additionally,
they are also more likely to use pronouns when referring to the subjects in passives than in actives.
In sum, these results show that grammatical role and topicality have different effects on pronoun
use.
(12) (Rohde and Kehler, 2014)
a. Active
Amanda amazed Brittany. ________________
b. Passive
Brittany was amazed by Amanda. ____________
The relationship between topicality and pronouns has also been investigated in other languages
with a null and overt pronoun system. In agreement pro-drop languages such as Spanish (e.g.
Alonso-Ovalle et al., 2002), null pronouns are mainly used to refer to the subject antecedents
whereas overt pronouns tend to refer to object antecedents. Additionally, Spanish word order is
more flexible than English. Subjects can occur pre-verbally (SV) or post-verbally (VS). In one of
their experiments, Alonso-Ovalle et al. showed that subjects in preverbal (SV) position are topics.
They proceeded to manipulate topicality using the different word order patterns. Participants were
asked to indicate whether the pronoun, in either SV or VS pattern, refers to the subject or the object
of the previous sentence. Alonso-Ovalle et al. found that overt pronouns in pre-verbal (SV) subject
position (i.e. topic position) were more likely to be interpreted as referring to the subject antecedents. Their
results illustrate that topicality can override the object preference previously found for overt
pronouns. With regard to null and overt pronouns in discourse pro-drop languages, Ueno and
Kehler (2016) found that both null and overt pronouns in Japanese tend to be used to refer back
to the subject antecedents. To promote topicality, Ueno and Kehler used topic marking on the
subject antecedents in place of the nominative marking. However, no significant results were found
with the topic manipulation relative to the nominative marking regarding null vs. overt pronoun
use.
In sum, the studies discussed in this section show that referents’ salience varies not only with
grammatical roles (i.e. subject vs. object) but also with topicality (i.e. referents in subject position
vs. referents in topicalized subject position). Crucial to the current work, it is not yet clear how
topicality can affect the use of null vs. overt pronouns, as findings in agreement pro-drop languages
and discourse pro-drop languages differ. These differences might stem from how topicality is
manipulated in the studies, given the language-specific means of marking topicality. In the current
work, I aim to shed light on whether topicality can affect Vietnamese speakers’ comprehension
and production of null and overt pronouns. Since Vietnamese is a language with fixed word order
without topic marking, I use passivization as a topicalization device to promote referents in the
grammatical subject position to a topic position in discourse.
1.3. VIETNAMESE
As previously described, Vietnamese is a discourse pro-drop language. Null and overt
pronouns in Vietnamese can occur in both subject and object positions. More importantly,
Vietnamese overt pronouns are not reduced anaphoric expressions as in other languages (e.g.
English: he/she) but are derived from kinship terms (for details see Chapter 2, Section 1.3). In this
study, I use role nouns (e.g. engineer, driver, seamstress) which have the structure of [kinship term
+ occupation] as seen in (13a). In these cases, the kinship term functions as the head of the role
nouns. Hence, the occurrence of an overt pronoun is the occurrence of the head of the full NPs it
refers to as illustrated in (13b). The occurrence of the demonstrative with the kinship term is
optional though its presence increases naturalness (Thompson, 1987). Therefore, overt pronouns
in the current study have the structure of [kinship term + demonstrative].
(13) a. ông kĩ sư
older.male engineer
‘the male engineer’
b. ông (ấy)
older.male that
‘he’
Turning to passivization, unlike in English, the syntactic status of the passive in Vietnamese is
under debate. Earlier work (Emeneau, 1951; Li and Thompson, 1976) has claimed that since the
language lacks verbal morphology, it does not have passive voice. On the other hand, other
researchers have argued that passive exists in Vietnamese although there are differences in the
analyses of the so-called passive markers bị and được (Simpson and Ho, 2008; Bruening and Tran,
2015). Since this is not a crucial point for our study, I will still refer to this type of syntactic
structure as passive in the sense that it promotes a non-Agent argument to the grammatical subject
position. This use of passive in English is also known to promote a non-Agent argument into the
topic position (Givón, 1990).
1.4. AIMS OF THIS WORK
In Sections 1.1 and 1.2 of this chapter, I have presented some crucial factors and theories related
to pronoun resolution, particularly to Experiments 2 and 3 in the current chapter. The topics
discussed included salience, grammatical roles, coherence relations and topicality. It is important
to keep in mind that although these factors may influence the use of pronouns, the effect on each
form, null vs. overt, may vary. Prior work (Kaiser and Trueswell, 2008), studying the referential
properties of the pronoun hän ‘s/he’ and the demonstrative tämä ‘this’ in Finnish, found that the
effects of syntactic roles and the position of antecedents on the two forms are not alike. It will
become clear later that this form-specific approach is relevant to the findings in this chapter.
The main goal of this chapter is to shed light on Vietnamese speakers’ use of null and overt
pronouns. Previously in the narrative study in Chapter 2, when considering only grammatical
factors, I found that null and overt pronouns in the subject position were equally used to refer back
to the subject antecedents. This finding poses a challenge for the salience-hierarchical approaches
in which referential form use is ranked based on referents’ salience in discourse. One possibility
is that grammatical factors alone do not result in a strong enough indication for salience to tease
apart the two types of pronouns in Vietnamese. In the current chapter, I incorporate a discourse
factor, topicality, into the investigation. Specifically, I examine how topicality can affect the null
vs. overt pronoun choice while keeping grammatical roles of the antecedents (i.e. subjects of
actives vs. subjects of passives) and of the pronouns (i.e. subject pronouns) constant. One way to
observe a topicality effect is by looking at how participants interpret null and overt pronouns in the
comprehension task. Another way is by looking at the likelihood-of-mention and the choice of
referential form in the production task. Thus, both comprehension and production are investigated
in this chapter. Last but not least, as mentioned in Section 4, Chapter 1, it is possible that modality
may also drive referential form choice. Thus, I also investigate whether modality (spoken vs.
written) has an effect on the choice of null vs. overt pronouns by comparing the findings of the
written (Experiment 2) and the spoken (Experiment 3) experiments.
2. Experiment 2 – Written sentence completion
In this experiment, I use a sentence completion task to investigate (i) whether the interpretation and
production of null and overt pronouns in Vietnamese are influenced by the information-structural
status of potential antecedents, in particular the distinction between topics and non-topics, and
(ii) whether null and overt pronouns behave similarly.
I am interested both in (i) how people interpret different kinds of referential forms – in other
words, the comprehension side, and in (ii) when people choose to produce different kinds of
referential forms – in other words, the production side. Thus, in some conditions participants were
given an overt or null pronoun and had to interpret it (decide who it refers to) in order to write a
continuation for the sentence, whereas in some conditions participants could freely choose what
form to produce and who to refer to. I tested how information-structural properties, particularly
topicality (as signaled by promotion to the syntactic subject in a passive construction), influence
participants’ interpretation and production of null and overt pronouns in Vietnamese.
2.1. METHODS
2.1.1. Participants
Twenty-four adult native speakers of Vietnamese participated. Among the twenty-four
participants, only two had experienced living outside of Vietnam for more than 12 months (one
for 2 years and one for 8 years). At the time of the experiment, all participants were living in
Vietnam.
2.1.2. Materials and design
I created 24 transitive target sentences, as shown in example (14). All subjects and objects were
role nouns or professional nouns (e.g. the engineer, the dressmaker, the student, etc.)¹¹, and all
target clauses were followed by the connective vì ‘because’ (to keep the coherence relation
constant). All targets contained two same-gender nouns (e.g. ông kĩ sư ‘male.engineer’ and ông
lái xe ‘male.driver’, or cô thợ may ‘female.dressmaker’ and cô khách hàng ‘female.customer’).
Crucially, I manipulated (i) the voice of the critical sentence (active/passive) and (ii) the
anaphoric form that people had to use in their continuation (null pronoun/overt pronoun/no
prompt), as shown in example (14). This yielded a 2x3 design.
(14) a. Active
Ông kĩ sư cám ơn ông lái xe vì ông ấy … / đã… / …
male.engineer thank male.driver because he ASP
‘The engineer thanked the driver because he …’
b. Passive
Ông kĩ sư được ông lái xe cám ơn vì ông ấy … / đã… / …
male.engineer PASS male.driver thank because he ASP
‘The engineer was thanked by the driver because he …’
The overt pronouns of Vietnamese are kin terms, as discussed in Section 1.2.2. The presence of a
null pronoun was signaled by the presence of the aspectual particle đã.¹² In the no-prompt conditions,
the sentence ended right after the connective ‘because’ and thus participants could continue
however they wished.
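The crossing of voice and prompt type described above can be sketched in a few lines of Python. This is an illustration only: the condition labels are placeholders, and the rotation of conditions over items (a Latin-square style assignment) is an assumption, since the text does not state how items were distributed across conditions or participants.

```python
from itertools import product

# Factor levels as described in the text; the label strings are illustrative.
VOICES = ["active", "passive"]
PROMPTS = ["null", "overt", "none"]

# The six cells of the 2x3 design, in a fixed order.
CONDITIONS = list(product(VOICES, PROMPTS))


def assign_conditions(n_items, list_index):
    """Rotate the six conditions over the items.

    The rotation scheme is an assumption made for illustration;
    it is not a procedure reported in the source.
    """
    return [CONDITIONS[(item + list_index) % len(CONDITIONS)]
            for item in range(n_items)]


if __name__ == "__main__":
    # With 24 target items, each of the six conditions occurs four times.
    for item, cond in enumerate(assign_conditions(24, 0)[:6], start=1):
        print(item, cond)
```

Under this hypothetical rotation, changing `list_index` shifts which condition each item appears in, so every item would be seen in every condition across six lists.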
When selecting the verbs, I looked at the English norming study conducted by Hartshorne and
Snedeker (2013), who tested how likely different verbs are to show a subject or object bias when
followed by the connective because (e.g. “Sally frightens Mary because she is a dax” creates a
bias to interpret ‘she’ as referring to the subject Sally, whereas “Sally fears Mary because she is a
dax” creates a bias to interpret ‘she’ as referring to the object Mary). Crucially, I selected verbs
that do not show a strong bias towards the preceding subject or object in these kinds of sentence
frames (40%-59% object bias). This was done to ensure that the subject or object biases found in
our study are not just results of verb effects.
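The selection criterion just described (keeping verbs whose object bias falls between 40% and 59%) amounts to a simple range filter. The sketch below assumes the norming data are available as a mapping from verb to object-bias proportion; the function name and the example values are hypothetical and are not taken from the norming study.

```python
def select_unbiased_verbs(object_bias, low=0.40, high=0.59):
    """Return verbs whose object-bias proportion lies within [low, high]."""
    return sorted(verb for verb, p in object_bias.items() if low <= p <= high)


# Placeholder proportions for illustration only; not real norming values.
example = {"verb_a": 0.12, "verb_b": 0.45, "verb_c": 0.59, "verb_d": 0.83}
print(select_unbiased_verbs(example))
```

Here `verb_a` (strongly subject-biased) and `verb_d` (strongly object-biased) would be discarded, leaving only the verbs in the middle of the range.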
In addition to the targets, the study also included thirty fillers. Eighteen fillers contained the
connective mặc dù ‘although’ and twelve contained vì ‘because’. The verbs in the fillers were
evenly split between subject-biased verbs and object-biased verbs, and the arguments were a mix
of proper names and descriptive noun phrases. Some fillers ended with mention of a new referent,
some with a pronoun followed by an aspect marker as in example (15a), and some simply ended
at the connective as in example (15b), similar to the no-prompt conditions.
¹¹ Out of the role nouns/professions, 60% are male and 40% are female.
¹² The aspect marker đã in Vietnamese is an optional marker indicating a past event. It cannot co-occur
with the progressive aspect as in ‘was sleeping’ in English. For this reason, we do not include đã in the
pronoun prompt condition to provide participants more freedom in their choice of a natural continuation.
(15) a. Quỳnh kiếm Nga mặc dù cô ấy đã _________.
Quỳnh look for Nga although she ASP
‘Quỳnh looked for Nga although she ASP _________.’
b. Anh lính cứu hỏa bị anh thợ máy nghi ngờ vì _________.
male.firefighter PASS male.mechanic suspect because
‘The firefighter was suspected by the mechanic because _________.’
2.1.3. Procedure
Participants were instructed to provide natural-sounding continuations for the sentence fragments.
Participation took place over the internet, using the web-based software Qualtrics.
2.1.4. Analyzing the continuations
The subject in the continuation sentence was coded for whether it referred to the subject or object
of the preceding sentence. If it was not clear which referent the subject referred to, it was coded as
“unclear”. These unclear continuations were excluded from subsequent analyses. Thus, in what
follows, I only focus on trials that could be clearly analyzed as referring to the preceding subject
or the preceding object. Consequently, the proportion of subject continuations is the complement of the
proportion of object continuations for any one condition (the two sum to 1). Below are some example continuations
and how they were coded in the data analysis.
(16) a. Cô thợ may bị cô khách hàng gạt vì cô ta không lấy tiền cọc.
‘The dressmaker was fooled by the customer because she did not take the deposit.’
→ Subject
b. Cô giữ trẻ ép buộc cô ca sĩ vì cô ấy đã lỡ hứa sẽ cho cô ta một số tiền.
‘The babysitter forced the singer because she promised to give her some money.’
→ Object
c. Chị thu ngân quan tâm đến chị hướng dẫn viên du lịch vì chị ấy rất thân thiện.
‘The cashier was concerned about the tour guide because she is very friendly.’
→ Unclear
In the production task, I coded the likelihood-of-mention (subject or object) as well as the choice
of referential form. I only considered the singular forms that were possible mentions of subject or
object antecedents in our analysis and excluded other forms such as plural nouns, possessives and
so on. The relevant forms were divided into three categories: a null pronoun, an overt pronoun,
and an NP. Some examples of participants’ choice of referential form are illustrated in (17).
(17) a. Ông bác sĩ khiêu khích ông đầu bếp vì đã uống quá chén.
‘The doctor provoked the chef because (he) drank too much.’
→ Subject, null pronoun
b. Ông gác cổng kéo lê ông quản lý vì ông ta đã quá say.
‘The gate-keeper dragged the manager because he was too drunk.’
→ Object, overt pronoun
c. Chị dược sĩ được cô thợ thêu chào hỏi vì hai người có quen biết.
‘The pharmacist was greeted by the embroiderer because the two of them knew each
other.’
→ Plural noun: excluded
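The coding scheme in Section 2.1.4 can be summarized computationally. The sketch below assumes each continuation has already been hand-coded with one of the labels "subject", "object", or "unclear"; the label strings and the function name are illustrative choices, not part of the original analysis.

```python
from collections import Counter


def subject_proportion(codes):
    """Proportion of subject continuations among clearly coded trials.

    Trials coded as anything other than "subject" or "object"
    (e.g. "unclear") are left out of the denominator.
    Returns None when no trial could be clearly coded.
    """
    counts = Counter(codes)
    clear = counts["subject"] + counts["object"]
    if clear == 0:
        return None
    return counts["subject"] / clear


if __name__ == "__main__":
    coded = ["subject", "object", "unclear", "subject"]
    p = subject_proportion(coded)
    # The object proportion is the complement of the subject proportion.
    print(p, 1 - p)
```

Because unclear trials are dropped before the proportion is computed, the subject and object proportions for a condition always sum to 1.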
2.2. PREDICTIONS
I begin with the predictions for the prompt conditions (comprehension task), and continue with the
predictions for the no-prompt conditions (production task).
2.2.1. The prompt conditions
• Prompt conditions/active voice: Is the interpretation of null vs. overt pronouns in
Vietnamese sensitive to the grammatical role of potential antecedents? Previous work in Japanese
(Ueno and Kehler, 2016) showed that both null and overt pronouns have a subject bias though with
varying degrees. These results show that null and overt pronouns do not occur in complementary
distribution. Turning to Chinese, Yang et al. (1999) did not find differences in reading times for
null and overt pronouns in their self-paced reading study. This indicates that null and overt
pronouns in Chinese pattern alike, at least in a reading task. Using a production task, Simpson et al.
(2016) found that although Chinese speakers mainly produced null pronouns when referring to
subject antecedents, there was no evidence that overt pronouns were the preferred forms for non-
subject antecedents. Consequently, there is no clear division of labor between null and overt
pronouns in Chinese and Japanese. These findings contrast with studies in Italian and Spanish,
which suggest that null and overt pronouns have different referential biases. Specifically, work in
Italian and Spanish (Carminati, 2000; Alonso-Ovalle et al., 2002) suggests that null pronouns
exhibit a strong subject bias while overt pronouns do not (see also Fedele and Kaiser (2014, 2015)
for further work on coherence relations and the intra-/inter-sentence distinction in Italian
pronouns).
To shed light on the question of whether null and overt pronouns in Vietnamese are sensitive
to antecedents’ grammatical role, I analyzed how speakers continue the null and overt pronoun
prompts in the active voice. If Vietnamese patterns like Chinese and Japanese, I expect that
Vietnamese null and overt pronouns will share the same referential biases and that both pronoun
types will exhibit a preference for the preceding subject. This prediction is based on prior work
showing that preceding subjects are more likely to be interpreted as the antecedents of subsequent
pronouns (Chafe, 1976; Crawley and Stevenson, 1990). Alternatively, another possible outcome
is that null and overt pronouns have different referential biases, following the predictions of
form-based hierarchy-based approaches (e.g. Givón, 1983; Ariel, 1990; Gundel et al., 1993), according
to which null pronouns tend to refer to highly salient entities whereas overt pronouns tend to refer to
less salient entities. Combining this claim with the subjecthood preference yields the prediction
that null pronouns will be more likely to refer back to preceding subjects than overt pronouns,
similar to Carminati’s Position-of-Antecedent Hypothesis for Italian (see Section 1.2.1).
• Prompt conditions/passive voice: So far I have been focusing on the active voice. What
about prompt sentences in the passive voice? Comparing the active and passive conditions allows
us to investigate whether the interpretation of null and overt pronouns is sensitive to topicality. It
has been shown that the passive construction in English promotes subject referents into topic position;
hence, subjects of passives are more prominent in the discourse than regular subjects (Creider,
1979; Davison, 1984; Rohde and Kehler, 2014). Crucially, if passivization in Vietnamese
topicalizes the promoted object and renders the original agent non-topical, then I predict:
o If null and overt pronouns have the same referential biases: If null and overt pronouns
are both (i) sensitive to grammatical role and prefer subjects and (ii) also sensitive to
topicality and prefer topics, then I expect both forms to exhibit a stronger subject
preference after passive sentences than after active sentences.
o If null and overt pronouns differ in their referential biases: If null pronouns are used
for more salient antecedents than overt pronouns (Ariel, 1990; Givón, 1983; Gundel
et al., 1993), then – if passives mark the constituent that has been promoted to subject
position as being a topic – I expect that the subject preference exhibited by null
pronouns (relative to overt pronouns) will be stronger after passive sentences than
after active sentences.
On the other hand, if passives do not mark their syntactic subjects as topics, I expect the results in the
passive conditions to mimic those in the active conditions. Pronouns in the passive conditions will
exhibit a preference for subjects (the promoted patients/themes). More importantly, I expect no
effect of topicality. The subject preference in passive sentences will not be stronger than the subject
preference in active sentences.
2.2.2. The no-prompt conditions
The no-prompt conditions present us with two questions: (i) the likelihood-of-mention (i.e. will
participants refer to the subject or object antecedent in their continuations?) and (ii) the choice of
referential expression (i.e. if participants continue to refer to the subjects of active sentences, which
referential form do they use?).
• No-prompt conditions/active voice: Regarding the likelihood-of-mention, I predict that
participants will be more likely to continue to mention the preceding subjects. This prediction is
built on Crawley and Stevenson’s (1990) results in which the preceding subjects were mentioned
much more frequently than the preceding objects. Regarding the choice of referential forms, two
predictions can be made. First, if null and overt pronouns in Vietnamese have the same referential
biases, I expect participants will use both pronoun types when referring to the preceding subjects.
The use of NPs will mostly occur when participants refer to the preceding objects as previously
observed in Crawley and Stevenson (1990). Second, if null and overt pronouns have different
referential biases, based on the form-based hierarchy-based approaches (Givón, 1983; Ariel, 1990;
Gundel et al., 1993), I expect that when referring to highly salient entities such as the preceding
subjects, participants will mostly use null pronouns. In contrast, when referring to less salient
entities such as the preceding objects, I expect participants to use overt pronouns. Fuller forms
such as NPs, if used, are expected to refer to less salient entities (i.e. object antecedents) more
often than highly salient entities (i.e. subject antecedents).
• No-prompt conditions/passive voice: As previously mentioned, subjects of passives in
English are in a topic position; hence, they are more prominent than subjects of actives in discourse
(Creider, 1979; Davison, 1984; Rohde and Kehler, 2014). Regarding the likelihood-of-mention, if
passivization in Vietnamese also promotes patients/themes into a topicalized subject position, I
predict that participants will continue to refer to the topicalized subjects and that this subject
preference is stronger in the passive conditions than in the active conditions. Regarding the choice
of referential forms, I predict:
o If null and overt pronouns have the same referential biases: Participants will be equally
likely to use null and overt pronouns to refer to the topicalized subjects in passive sentences
and pronoun use is higher in passive sentences than in active sentences. Participants will
mostly use NPs to refer to the object rather than the subject.
o If null and overt pronouns differ in their referential biases: If null pronouns are used for
more salient antecedents than overt pronouns (Ariel, 1990; Givón, 1983; Gundel et al.,
1993), participants will use null pronouns to refer to the subject and they will do so more
frequently in passive sentences than in active sentences. Overt pronouns and NPs will more
likely be used to refer to the object.
In contrast, if the grammatical subjects are not marked as topics in passive sentences, I do not
expect the results in the passive conditions to differ from those in the active conditions. More
specifically, I expect a subject preference for the promoted patients/themes in the passive
conditions. However, this subject preference is not overlaid by topicality; hence, it is not stronger
than the subject preference found in the active conditions. The choice of referential forms will also
mirror the results in the active conditions with the same amounts of null and overt pronouns used
for subject and object referents.
2.3. RESULTS
I first present the results for the prompt conditions and then the no-prompt conditions. Figure 5
shows both prompt and no-prompt conditions, but the results are discussed separately in sections
2.3.1 (prompt conditions) and 2.3.2 (no-prompt conditions), for clarity of exposition.
2.3.1. Results for prompt conditions
As can be seen in Figure 5, in the active conditions, regardless of the type of prompt, the subjects
of the continuations were more likely to refer back to the preceding object. This goes against the
expected subject preference. Numerically, null pronouns exhibit a stronger object preference than
overt pronouns (76.62% and 62.92% respectively).
In the passive conditions, the continuations exhibited a strong preference toward the preceding
subject referent. This subject preference is equally strong in the null pronoun condition (86.36%)
and in the overt pronoun condition (83.33%), as shown in Figure 5.
Figure 5. Percentage of subject and object referents in active and passive conditions in
Experiment 2 (written task).
I used a logistic mixed-effects regression model to analyze the proportion of subject continuations
as a function of anaphor type (overt vs. null, coded as 1 and -1 respectively) and voice (active vs.
passive, coded as 1 and -1 respectively) with participant and item as random effects. Tests were
conducted with the glmer function in the lme4 package (Bates et al., 2015) in the R environment
(R Core Team, 2016).¹³ (I analyzed the proportion of subject continuations, which is the complement
of the proportion of object continuations.) The analyses show a main effect of voice (β = -1.37, SE
= 0.15, Wald Z = -8.88, p < 0.001), no main effect of prompt type (β = 0.06, SE = 0.14, Wald Z =
0.43, p = 0.66) and a marginal interaction between voice and prompt type (β = 0.25, SE = 0.14,
Wald Z = 1.8, p = 0.071). Thus, passive sentences elicited significantly more continuations
referring to the preceding subject than active sentences.
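To make the ±1 coding concrete, here is a minimal sketch in Python (not the R analysis reported above). It computes sum-coded effects directly from the four cell percentages given in Section 2.3.1. It deliberately ignores the random effects for participant and item, so its values only approximate the reported coefficients. All function and variable names are illustrative assumptions.

```python
import math

def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

# Proportion of SUBJECT continuations per cell, taken from the percentages
# reported in the text (active cells are 1 minus the object proportions).
subject_rate = {
    ("active", "null"): 1 - 0.7662,
    ("active", "overt"): 1 - 0.6292,
    ("passive", "null"): 0.8636,
    ("passive", "overt"): 0.8333,
}

# Contrast codes as described in the text.
VOICE = {"active": 1, "passive": -1}
ANAPHOR = {"overt": 1, "null": -1}

def sum_coded_effects(rates):
    """Return (intercept, voice, anaphor, interaction) on the log-odds scale.

    Assumption: a balanced 2x2 design with no random effects, so each effect
    is the mean of (contrast code * cell log-odds) over the four cells.
    """
    cells = [(VOICE[v], ANAPHOR[a], logit(p)) for (v, a), p in rates.items()]
    n = len(cells)
    intercept = sum(y for _, _, y in cells) / n
    voice = sum(v * y for v, _, y in cells) / n
    anaphor = sum(a * y for _, a, y in cells) / n
    interaction = sum(v * a * y for v, a, y in cells) / n
    return intercept, voice, anaphor, interaction

intercept, voice, anaphor, interaction = sum_coded_effects(subject_rate)
```

Under this coding, a negative voice effect means fewer subject continuations in actives (coded 1) than in passives (coded -1). Because the two levels are coded 1 and -1, the difference between them on the log-odds scale is twice the coefficient.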
Regarding the marginal interaction, I conducted separate analyses to examine the effects
of anaphor type first on active then passive conditions. I fitted a model in which the proportion of
subject continuations was a function of anaphor type (overt vs. null, coded as 1 and -1 respectively)
with participant and item as random effects. The analyses show a significant effect of prompt type
in active conditions (β = 0.42, SE = 0.2, Wald Z = 2.07, p = 0.03) but not in the passive conditions
(β = -2.46, SE = 3.8, Wald Z = -0.63, p = 0.52). This indicates that in active conditions, there were
significantly more continuations referring to the preceding objects when null pronouns were used
compared to when overt pronouns were used.
As seen in Figure 5, the proportions of subject continuations in passive conditions are higher
than the proportions of object continuations in active conditions. To see whether the subject
preference in passives is significantly different from the object preference in actives, I used a
mixed-effects regression model in which the dependent variable was the proportion of object
continuations for active conditions and the proportion of subject continuations for passive
conditions. Voice and anaphor type were included as independent variables and participants and
items were included as random effects. This analysis reveals a main effect of voice (β = -0.63, SE
= 0.16, Wald Z = -3.89, p < 0.001), which indicates that subject preference in passive conditions
is significantly stronger than the object preference in active conditions.
¹³ When specifying the structure of random effects, we started with fully crossed and fully specified random effects,
tested whether the model converges, and reduced random effects (starting with item effects) until the model converged
(see Jaeger at http://hlplab.wordpress.com, May 14, 2009). Then, we used model comparison to test each random
effect; only those that were found to contribute significantly to the model were included in the final analyses. However,
all models contained random intercepts for subjects and items.
2.3.2. Results for no-prompt conditions
Let us first consider how frequently participants continued by referring back to the preceding
subject or object, regardless of what referring expression they used. Similar to the results of the
prompt conditions, passive voice in the no-prompt conditions yields a high proportion of
continuations that start by referring back to the preceding subject (82.14%) as seen in Figure 5.
I fitted a model containing only the intercept with the proportion of subject continuations
as the dependent variable. Participant and item were included as random effects. Analyses were
conducted separately for active and passive conditions. By looking at the intercept, I can see
whether participants’ preferences for objects in active condition and subjects in passive condition
are higher than chance. I found no main effect in the active condition (β = -0.66, SE = 0.55, Wald
Z = -1.19, p = 0.23). However, there was a main effect in the passive condition (β = 1.96, SE =
0.68, Wald Z = 2.86, p < 0.01). While the rate of subject continuations does not differ from chance
in the active condition, it is significantly above chance in the passive condition.
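The logic of the intercept-only test can be illustrated with a short Python sketch. It assumes the usual Wald approximation (Z = β / SE, with a two-sided p-value from the standard normal distribution) and uses the inverse logit to express each intercept as a probability. The helper names are hypothetical. Because the inputs are the rounded values reported above, the results will only approximately match the reported statistics.

```python
import math

def inverse_logit(beta):
    """Convert a log-odds intercept into a probability."""
    return 1.0 / (1.0 + math.exp(-beta))

def wald_test(beta, se):
    """Return (z, two-sided p) under a standard normal approximation."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Rounded intercepts and standard errors as reported in the text.
active_z, active_p = wald_test(-0.66, 0.55)
passive_z, passive_p = wald_test(1.96, 0.68)

# An intercept of 0 corresponds to chance (probability 0.5).
active_rate = inverse_logit(-0.66)
passive_rate = inverse_logit(1.96)
```

An intercept that does not differ reliably from 0 on the log-odds scale corresponds to a subject-continuation rate that does not differ reliably from 50%. The model-based probabilities need not equal the raw percentages, since the fitted intercept also reflects the random effects.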
I tested whether the subject preference in passives is significantly different from the object
preference in actives using a mixed-effects regression model in which the dependent variable was
the proportion of object continuations for active conditions and the proportion of subject
continuations for passive conditions. Voice was included as the independent variable and
participants and items were included as random effects. This analysis reveals a main effect of voice
(β = -0.84, SE = 0.25, Wald Z = -3.3, p < 0.001), which indicates that subject preference in passive
conditions is significantly stronger than the object preference in active conditions.
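The recoding behind this comparison can be sketched as follows. The dependent variable is 1 when a continuation refers to the object in an active trial or to the subject in a passive trial, and 0 otherwise. This is an illustrative Python sketch; the function name and the example trials are hypothetical and not taken from the experimental data.

```python
def preferred_referent_continuation(voice, referent):
    """Return 1 if the continuation refers to the object (active trials)
    or to the subject (passive trials), and 0 otherwise."""
    if voice not in ("active", "passive"):
        raise ValueError(f"unknown voice: {voice!r}")
    if referent not in ("subject", "object"):
        raise ValueError(f"unknown referent: {referent!r}")
    preferred = "object" if voice == "active" else "subject"
    return 1 if referent == preferred else 0

# Hypothetical trials: (voice, referent chosen in the continuation).
trials = [
    ("active", "object"),
    ("active", "subject"),
    ("passive", "subject"),
    ("passive", "object"),
]
recoded = [preferred_referent_continuation(v, r) for v, r in trials]
```

With this outcome, an effect of voice compares the strength of the preference across the two voices rather than its direction.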
Let us now consider participants’ choice of referential expression, as they had the freedom
to select which form to use, as well as who to refer to. Figure 6 is a stacked graph illustrating the
proportions of the forms (NP vs. overt vs. null) used in the active and passive conditions. The
overall bars shown in Figure 6 are the same as the active-no-prompt and passive-no-prompt bars
shown in Figure 5. In Figure 6, I am simply taking a closer look at the data – specifically, a closer
look at what kind of referring expressions make up each of the bars.
Thus, in Figure 6, the subsections inside each bar show the relative percentage of each type
of referential form. Note that the proportions of the referential forms shown inside each bar are the
relativized proportions calculated based on the proportion of subject or object continuations in
each condition. For example, in the active voice/no-prompt condition, the overall proportion of
subject continuations is 0.43 as seen in Figure 6 (or 43% as shown in Figure 5). The proportions
of NP, null and overt pronouns are 0.04, 0.22 and 0.17 respectively; and they add up to a total of
0.43 which is the overall proportion of subject continuations in this condition. The relativized
proportions are illustrated in Figure 6 as well as shown in Table 6.
Figure 6. Referential biases & forms in (active vs. passive) no-prompt conditions in Experiment
2 (written task).
                   RELATIVE               Total proportion of     ABSOLUTE (%)
                   NP     Null   Overt    subject/object          NP      Null    Overt
                                          continuations
ACTIVE   Subject   0.04   0.22   0.17     0.43                    10      50      40
         Object    0.16   0.25   0.16     0.57                    28.21   43.59   28.21
PASSIVE  Subject   0.06   0.61   0.15     0.82                    7.25    73.91   18.84
         Object    0.06   0.1    0.02     0.18                    33.33   53.33   13.33
Table 6. Proportion of referential forms in the no-prompt conditions in Experiment 2 (written
task).
Besides the relative numbers, Table 6 also presents another set of numbers called absolute
numbers. Different from the relative numbers which add up to the total proportion of subject/object
continuations, absolute numbers are calculated out of 100. For instance, in active voice, if I
consider the total trials where people chose to refer back to the subject as equal to 100%, then
participants used a null pronoun on 50% of those trials. I will use absolute numbers to discuss
participants’ choice of referential expression in more detail.
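The relation between the two sets of numbers can be sketched in Python. Each absolute share is the relative proportion divided by the row total, multiplied by 100. The sketch below recomputes the shares from the rounded relative proportions in Table 6, so the results only approximate the table's absolute values, which were presumably computed from raw counts. The names are illustrative.

```python
# Relative proportions from Table 6: (voice, referent) -> form -> proportion.
relative = {
    ("active", "subject"): {"NP": 0.04, "null": 0.22, "overt": 0.17},
    ("active", "object"): {"NP": 0.16, "null": 0.25, "overt": 0.16},
    ("passive", "subject"): {"NP": 0.06, "null": 0.61, "overt": 0.15},
    ("passive", "object"): {"NP": 0.06, "null": 0.10, "overt": 0.02},
}

def absolute_shares(forms):
    """Rescale the relative proportions of one row so they sum to 100%."""
    total = sum(forms.values())
    return {form: 100.0 * value / total for form, value in forms.items()}

absolute = {cell: absolute_shares(forms) for cell, forms in relative.items()}
```

For example, `absolute[("passive", "subject")]["null"]` gives the share of null pronouns among all continuations that refer back to the subject of a passive.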
The active condition reveals that participants’ referential choices were similar regardless
of whether they chose to continue by referring back to the preceding subject or object: Null
pronouns were the most frequent choice both when participants referred back to the preceding subject (50%) and
when they referred back to the preceding object (43.59%). These proportions, however, do not differ very much from
proportions of overt pronouns. Overt pronouns were chosen on 40% and on 28.21% of the trials
when participants referred to the preceding subject and object respectively. Participants only used
an NP on 10% of the trials when referring to the preceding subject but they used an NP on 28.21%
when referring to the preceding object. Overall, I do not see a clear difference in the use of null
and overt pronouns in the active condition.
In the passive condition, participants mostly used null pronouns, especially when they
referred back to the preceding subject (presumably topicalized by the passive construction).
Indeed, if I consider only those trials where participants chose to continue by referring back to the
preceding subject, I find that they used a null pronoun on 73.91% of these trials. An NP was used on
only 7.25% of these trials, and an overt pronoun on only 18.84%. If I consider only those trials
where participants chose to continue by referring back to the less-preferred preceding object, I find
that they used a null pronoun on 53.33% of these trials. Participants used an NP on 33.33% of these
trials and only used overt pronouns on 13.33%. Crucially, this is the first situation in our
experiment where I see evidence of null and overt pronouns in Vietnamese patterning differently:
It seems that in a production task, when a sentence contains a clearly topical referent (the subject
of the passive), that is an ideal antecedent for null pronouns.
2.4. EFFECTS OF VERB BIAS
A possible hypothesis about the object preference in our active conditions is that it comes from
verb effects. As explained in Section 2.1.2, I tried to select verbs that, when used in the frame
‘Sally (verb) Mary because …’, would not have a strong bias for the preceding subject or
object. I refer to these as equi-biased verbs. However, because no prior norming study had been
conducted on Vietnamese verbs, I used the data from English collected by Hartshorne et al. (2013).
Thus, one possible concern is that perhaps the verbs that are equi-biased in English may not be
equi-biased in Vietnamese. This section addresses the concern that the topicality effect found in
Sections 2.3.1 and 2.3.2 above is in fact due to verb choice. It is also worth mentioning that in the
following chapter (Chapter 4), I present a study dedicated to Vietnamese implicit causality verbs.
This verb study examines the referential bias of verbs in Vietnamese in comparison to those in
English and shows that overall, the 24 verbs used in the current experiment are indeed equi-biased.¹⁴
Let us now turn back to the topicality effect and the concern whether it is a result of the verb
choice. Recall that as mentioned in Section 2.3.2, when I looked at the active no-prompt condition,
I found that participants were equally likely to continue by referring back to the preceding subject
and to the preceding object. This finding suggests that as a group, the verbs I chose are indeed
equi-biased in that they do not create a strong expectation for the preceding subject or object to be
mentioned.
Although that concern seems to not be a problem, one might still wonder about individual
verbs and their biases. The active no-prompt condition is a good tool to probe verb bias since it
closely resembles the norming task in Hartshorne et al. from which our verbs were chosen. I
computed the percentage of subject bias for each verb from the number of subject and object
continuations.
I then divided the verbs into four groups based on the percentage of subject bias¹⁵ as shown
in Table 7 below. Henceforth, the label “subject-strong” is used for verbs that elicited 75% or
more subject continuations. The label “object-strong” is used for verbs that elicited 24% or fewer
subject continuations. Verbs that elicited 25%-49% subject continuations are “object-weak”, and
verbs that elicited 50%-74% are “subject-weak”.
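The grouping can be expressed as a small Python sketch. The cut-offs follow the four bands described above. How percentages that fall between two bands (e.g. 24.5%) are treated is an assumption of this sketch, as are the function names and the example counts, which are hypothetical rather than taken from the data.

```python
def subject_bias_pct(n_subject, n_object):
    """Percentage of subject continuations, or None if the verb has no counts."""
    total = n_subject + n_object
    if total == 0:
        return None
    return 100.0 * n_subject / total

def classify_verb_bias(subject_pct):
    """Map a subject-bias percentage onto one of the four labels."""
    if not 0 <= subject_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if subject_pct < 25:
        return "object-strong"
    if subject_pct < 50:
        return "object-weak"
    if subject_pct < 75:
        return "subject-weak"
    return "subject-strong"

# Hypothetical (subject, object) continuation counts per verb.
counts = {"verb_a": (1, 5), "verb_b": (3, 3), "verb_c": (0, 0)}

labels = {}
for verb, (n_subj, n_obj) in counts.items():
    pct = subject_bias_pct(n_subj, n_obj)
    if pct is not None:  # verbs without usable continuations are left out
        labels[verb] = classify_verb_bias(pct)
```

A verb with no subject or object continuations receives no label in this sketch.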
¹⁴ The average subject bias of these 24 Vietnamese verbs as shown in the verb study (Chapter 4) is 44.85% (i.e. within
the equi-bias zone of 40%-60%). Even when we consider verb class (i.e. Agent-Patient, Agent-Evocator, Stimulus-
Experiencer, Experiencer-Stimulus, Source-Goal), all classes have an average subject bias between 40%-60% except
for Agent-Evocator which has an average of 37.65%. Thus, choosing verbs based on the English norms is not an
explanation for the results in the current chapter.
¹⁵ Among the 24 verbs, one verb did not yield any subject or object count; thus, there’s a total of 23 verbs in Table 7.
Percentage of subject bias   0%-24%          25%-49%       50%-74%        75%-100%
Number of verbs              10              3             5              5
Type of verb bias            object-strong   object-weak   subject-weak   subject-strong
Table 7. Types of verb biases and their proportions in the active no-prompt condition.
To examine the extent to which individual verbs’ biases may influence referential biases, I
reanalyzed the results of the prompt conditions using the verb bias information in the no-prompt
condition as norming results. The group of object-preferring verbs consists of the ‘object-strong’
and ‘object-weak’ verbs (13 verbs in total) while the group of subject preferring verbs consists of
the ‘subject-strong’ and ‘subject-weak’ verbs (10 verbs in total).
It is clear from Figure 7 that in the passive conditions, regardless of the verb biases,
participants tend to continue by referring back to the preceding subject. However, in the active
conditions, verb biases appear to have some influence. If the verbs are object-biased verbs,
speakers tend to refer to the preceding object more often than the subject, and more so with the
null pronoun prompt than with the overt pronoun prompt. Nevertheless, this verb effect seems to
be less clear when the verbs are subject-biased. There seems to be no indication of a subject or
object preference regardless of prompt type. These patterns remain even when I only consider the
object-strong and subject-strong verbs as seen in Figure 8.
Figure 7. Referential biases based on verb biases in null and overt prompt conditions.
Figure 8. Referential biases among strong biased verbs only (0%-24% and 75%-100% subject-
biased verbs) in null and overt prompt conditions.
The results in Figure 8 above are compiled from the object-strong and subject-strong verbs only
(verbs which exhibit 75%-100% subject or object continuations in the active no-prompt condition;
i.e., strongly biased towards either subject or object continuations). Once again, there’s a clear
preference for the subject of passives regardless of the verb biases. I also observe a preference for
the object among the object-biased verbs. On the contrary, the strong subject-biased verbs do not
exhibit a preference for the subject.
In sum, the results presented in Sections 2.3.1 and 2.3.2 cannot be attributed simply to the
lexical biases of verbs. When participants had the freedom to choose which referent to mention
next and the referential form for their choice of referent, they clearly had a strong bias toward the
subject antecedent. However, when presented with a null or an overt pronoun, participants’ subject
bias disappeared. It can be said that the presence of the null and overt pronouns weakens the subject
bias of the verbs.
2.5. DISCUSSION OF EXPERIMENT 2 - WRITTEN TASK
Our experiment investigated the comprehension and the production of null and overt pronouns in
Vietnamese in active and in passive sentences. In the comprehension task (prompt conditions), I
examined whether pronoun interpretation (null vs. overt) was sensitive to the grammatical role of
the potential antecedent. I also examined the effects of topicality on pronouns’ referential biases
through the use of the passive construction. The production task (no-prompt conditions) provided us with
two pieces of information. First, which antecedent (subject vs. object) would participants be more
likely to mention in the continuations? Second, which referential expressions would they use to
refer back to these antecedents? I also observed how topicality could influence participants’ choice
of referent and choice of referential expressions in production. Following this structure, I will first
discuss the findings in the prompt conditions then the no-prompt conditions.
2.5.1. The prompt conditions
Our data showed that there was an object bias in the active conditions which does not support the
prediction that there would be a bias toward the grammatical subjects. Participants tend to interpret
both null and overt pronouns as referring to the preceding object although the preference for
objects is stronger for null than for overt pronouns. The passive conditions, in contrast, show a
strong subject preference. Null and overt pronouns were equally likely to be interpreted as referring
to the preceding subject.
At first glance, these results seem to contradict each other. A closer look at the patterns
reveals that they do not. In fact, the objects in active sentences and the subjects in passive sentences
both play the role of the patients/themes. It can be said that pronouns in our study exhibit a
patient/theme bias. More importantly, this patient/theme bias is significantly stronger in passive
conditions than in active conditions, which suggests there is also an effect of topicality in the
passive conditions. I conclude that passivization functions as topic marking in Vietnamese and
that pronoun interpretation is sensitive to topicality.
It is also important to point out that null and overt pronouns in our study do not differ in their
referential biases, which does not support the predictions from the form-based hierarchy-based
approaches (Givón, 1983; Ariel, 1990; Gundel et al., 1993) and the Position of Antecedent
Hypothesis (Carminati, 2000). According to these predictions, null pronouns would have a bias
toward the preceding subject while overt pronouns would have a bias toward the preceding object.
Instead, Vietnamese null and overt pronouns share the same referential biases and do not exhibit
a clear division of labor, similar to null and overt pronouns in Chinese and Japanese.
2.5.2. The no-prompt conditions
The likelihood-of-mention in the no-prompt conditions resembles the referential biases found in
the prompt conditions. Participants were more likely to continue to refer to the preceding object in
the active condition although the object preference is only numerically higher than the subject
preference. In the passive condition, participants mainly continue to refer back to the preceding
subject. As previously discussed in the prompt section, these preferences can be viewed as a
patient/theme bias. I also found that this patient/theme bias is significantly stronger in the passive
condition than in the active condition, which indicates that passivization promotes the
patient/theme into a topic position.
Turning to the choice of referential form, the patterns differ between the active and the
passive condition. When referring to the preceding object in the active condition, participants
showed no clear preference for either null or overt pronouns. Conversely, when referring to the
preceding subject in the passive condition, participants clearly preferred null pronouns over overt
pronouns. So far, this is the only condition in which null and overt pronouns behave differently
from each other. Since the differences occur in the passive condition, I attribute them to the effect
of topicality.
A closer look at individual verbs’ bias shows that the object preference found in the prompt
conditions was not a result of the lexical biases of verbs. Using the results in the no-prompt
condition as norming results, the verbs were categorized into four groups based on their degrees
of subject bias. I then reanalyzed the results from the prompt conditions based on the verb biases.
As expected, the object-strong verbs exhibited a strong object bias for both null and overt
pronouns. Surprisingly, the subject-strong group did not exhibit a subject bias, but instead an
equi-bias for both pronoun types. It appears that the object preference found in the active-prompt
conditions is due to the presence of the pronoun prompts.
In sum, pronouns in Experiment 2 exhibit a patient/theme bias in comprehension as well as
production. The patient/theme bias is strengthened by topicality effects in the passive conditions.
Despite their similar behaviors in various conditions, null and overt pronouns started to exhibit
different degrees of sensitivity to topicality in the production task. Specifically, participants
produced more null pronouns than overt pronouns when they continued to refer to the topics. I
suggest an explanation for these data under the form-specific view (Kaiser and Trueswell, 2008)
in which different forms can be influenced by the same factors but with different magnitudes.
3. Experiment 3 – Spoken sentence completion
In Experiment 2, I investigated how referential expressions in Vietnamese can be interpreted and
produced by speakers in a written sentence-completion task. In this section, I investigate these
processes using a spoken task. The discrepancy between spoken and written modalities has long
been acknowledged and studied in discourse analysis (DeVito, 1964; Poole and Field, 1976;
Tannen, 1982; Chafe, 1982, 1985; to name a few). Nevertheless, these studies were conducted in
the framework of narratives, dialogues, and corpus analyses and the written and spoken data came
from different sources. Consequently, they do not offer a direct comparison of the use of referential
expressions between spoken and written language. Regarding experimental work in pronoun
resolution, I see that a large number of these studies are conducted in the form of a written task
(Crawley and Stevenson, 1990; Carminati, 2002; Chambers and Smyth, 1998; Kehler et al., 2008;
Alonso-Ovalle et al.; Ueno and Kehler, 2016; to name a few). Other studies have used
visual-world eye tracking for speech production (Arnold et al., 2000; Kaiser and Trueswell, 2008). Thus,
results from the written task cannot be compared directly to results from the spoken task.
In this experiment, I used the same materials and design as in Experiment 2 while manipulating
modality for a direct comparison. Crucially for our study, it is possible that null and overt pronouns
may exhibit different patterns in the spoken task which have not been observed in the written task.
3.1. METHODS
3.1.1. Participants
Thirty-six adult native speakers of Vietnamese participated in the task. All participants were
currently living in Vietnam and none of them had lived abroad for more than 12 months.
3.1.2. Materials
Experiment 3 used the same design and materials as Experiment 2. I again manipulated (i) voice
(active vs. passive) and (ii) prompt types (overt pronoun vs. null pronoun vs. no-prompt).
3.1.3. Procedure and data analysis
The experiment was conducted on a computer using Paradigm software (Perception Research
Systems). Participants first read the sentence fragment on the screen. Once ready, they pressed a
key to move to the recording screen. Participants were instructed to say out loud the complete
sentence, i.e. both the fragment shown on the screen and their own continuation. The sentence
fragment was displayed throughout the recording, in order to avoid imposing a memory burden on
participants. Participants were allowed to repeat their sentences if they were not satisfied with the
initial trial.¹⁶ When they finished saying their continuation, participants pressed a key to move to
the next item. This allowed participants to proceed at their own pace and helped minimize speech
disfluencies.
Similar to Experiment 2, I coded the referent of the subject continuation as referring to the
preceding subject or the preceding object. When there was repetition, I coded the final repetition
as it is the one participants were satisfied with. When the subjects of the continuations did not have
a clear interpretation as referring to the subject or object antecedents, they were coded as “unclear”.
Regarding the no-prompt conditions, I only considered the singular referents that referred to either
the subject or the object antecedents and excluded the plural referents. The referential expressions
used by participants were categorized as a null pronoun, an overt pronoun, or an NP.
3.2. PREDICTIONS
If spoken and written language pattern alike, the predictions for Experiment 3 are the same as for
Experiment 2. Null and overt pronouns will show similar biases toward the patients/themes and
these biases will be stronger in the passive conditions than in the active conditions due to topicality
effects. In production, null and overt pronouns will be equally used in the active condition but in
the passive condition, null pronouns will be the preferred choice of referential expression.
However, if spoken language differs from written language, keeping the patient/theme bias and
topicality effects constant, I predict null and overt pronouns will have different referential biases. More
specifically, I predict:
o In the prompt conditions (comprehension): Null pronouns will more likely be interpreted
as referring back to the object in actives and the subject in passives since the patient/theme
bias promotes these antecedents to be highly salient referents (Givón, 1983; Ariel, 1990;
Gundel et al., 1993). Overt pronouns, in contrast, will mostly be used to refer to the less-salient
antecedents (the agents). I also expect a higher use of null pronouns as referring back
to the patient/theme in the passive conditions than in the active conditions.
o In the no-prompt conditions (production): Participants will produce more null pronouns
when referring back to the patient/theme and overt pronouns to refer to the agents. They
will also use more null pronouns when referring back to the patient/theme in the passive
conditions than in the active conditions. In addition, I expect to see an overall increase of
pronoun use in the spoken task compared to the written task since previous work has shown
that written language contains more full NPs and spoken language contains more pronouns
(DeVito, 1964; Poole and Field, 1976; Tannen, 1982; Chafe, 1982, 1985).
3.3. RESULTS
I first present the results for the prompt conditions then the results for the no-prompt conditions.
Figure 9 consists of both the prompt and the no-prompt results but they will be discussed in two
separate sections, 3.3.1 and 3.3.2 respectively.
¹⁶ Participants occasionally repeated their sentences to provide a different or a more detailed explanation for the event. When this
happened, they rarely changed the referent of the subject in the continuations.
3.3.1. Results for prompt conditions
As seen in Figure 9, in the active conditions, participants tend to continue to refer back to the
preceding object regardless of prompt type. This goes against our prediction of a subject
preference. The object preference is numerically stronger in the null condition than in the overt
condition, 75.42% and 69.78% respectively. These patterns were also found in Experiment 2.
In contrast, the passive conditions exhibit a strong subject preference. Participants were
equally likely to refer back to the preceding subject in the null pronoun condition (86.4%) and in
the overt pronoun condition (85.83%) as shown in Figure 9.
Figure 9. Percentage of subject and object referents in active and passive conditions in
Experiment 3 (spoken task).
Similar to Experiment 2, I evaluated the effects of anaphor type (overt vs. null, coded as 1 and -1
respectively) and voice (active vs. passive, coded as 1 and -1 respectively) using a mixed-effects
regression model in which the dependent variable is the proportion of subject continuations.
Participant and item were included as random effects. Tests were conducted with the glmer
function in the lme4 package (Bates et al., 2015) in the R environment (R Core Team, 2016). Note
that the proportion of subject continuations used in the analysis is the complement of the proportion of
the object continuations.
I observe a main effect of voice (β = -1.43, SE = 0.12, Wald Z = -11.83, p < 0.01), which
indicates passive conditions trigger significantly more subject continuations than active
conditions. There is no main effect of prompt (β = 0.06, SE = 0.12, Wald Z = 0.53, p = 0.6)
showing that the two prompt types do not differ in terms of referential biases. I also found no
interaction between voice and anaphor type (β = 0.08, SE = 0.11, Wald Z = 0.7, p = 0.48) indicating
that the degree of sensitivity to the voice manipulation does not differ for the two anaphor types.
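As a rough illustration of what the 1/-1 coding implies, the sketch below recomputes the three contrasts from the raw cell percentages reported above. It is not the mixed-effects model itself: it ignores the participant and item random effects, and the helper names (`logit`, `contrast`) are introduced here purely for illustration. In a balanced 2x2 design with this coding, each coefficient is the code-weighted average of the four cell log-odds, so the raw values should land close to, but not exactly on, the reported estimates.

```python
import math

def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

# Proportion of subject continuations per cell, taken from the percentages
# reported in the text (active cells are 1 minus the reported object percentages).
subject_prop = {
    ("active", "null"): 1 - 0.7542,
    ("active", "overt"): 1 - 0.6978,
    ("passive", "null"): 0.8640,
    ("passive", "overt"): 0.8583,
}

# Coding as described in the text: active = 1, passive = -1; overt = 1, null = -1.
voice_code = {"active": 1, "passive": -1}
anaphor_code = {"overt": 1, "null": -1}

def contrast(weight):
    """Code-weighted average of the four cell log-odds (no random effects)."""
    total = 0.0
    for (voice, anaphor), p in subject_prop.items():
        total += weight(voice_code[voice], anaphor_code[anaphor]) * logit(p)
    return total / len(subject_prop)

voice_effect = contrast(lambda v, a: v)
anaphor_effect = contrast(lambda v, a: a)
interaction = contrast(lambda v, a: v * a)

print(f"voice: {voice_effect:.2f}, anaphor: {anaphor_effect:.2f}, interaction: {interaction:.2f}")
```

On the reported percentages, the raw contrasts come out at approximately -1.40, 0.06 and 0.08, in the same range as the β estimates from the fitted model.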
Another analysis was conducted to compare the object preference in active sentences to the
subject preference in passive sentences using a mixed-effects regression model. The dependent
variable was the proportion of object continuations for active conditions and the proportion of
subject continuations for passive conditions. Voice and anaphor type were the independent
variables. Participants and items were included as random effects. I found a main effect of voice
(β = -0.56, SE = 0.13, Wald Z = -4.28, p < 0.001), which shows that the subject preference in
passives is significantly stronger than the object preference in actives.
3.3.2. Results for no-prompt conditions
Now, I turn to the conditions where no prompt pronoun was given, so participants could freely
choose what form to use and who to refer to. Overall, results in the no-prompt conditions pattern
with results in the prompt conditions as seen in Figure 10. In the active condition, participants
were more likely to continue to refer to the preceding object (69.31%) than the preceding subject
(30.69%). However, in the passive condition, the preceding subject was more likely to be
mentioned than the preceding object, 82.46% and 17.54% respectively.
To see whether these referential biases are higher than chance level, I fitted a model containing
only the intercept with the proportion of subject continuations as the dependent variable first for
the active condition then for the passive condition. Participant and item were included as random
effects. The details of the intercepts reveal a main effect in the active condition (β = -1.29, SE =
0.57, Wald Z = -2.23, p < 0.05) as well as a main effect in the passive condition (β = 1.68, SE =
0.4, Wald Z = 4.2, p < 0.001). It can be said that participants exhibited a strong tendency to refer
back to the preceding object in the active condition and to the preceding subject in the passive
condition.
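Because the intercept of a logistic model is on the log-odds scale, a value of 0 corresponds to chance (50%). The sketch below converts the two reported intercepts, together with approximate 95% Wald intervals, to the probability scale. The 1.96 critical value and the normal approximation are assumptions made here for illustration; the text does not report confidence intervals, and the function names are hypothetical.

```python
import math

def inv_logit(x):
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def wald_interval(beta, se, z_crit=1.96):
    """Approximate 95% Wald interval on the log-odds scale (assumed, not reported)."""
    return beta - z_crit * se, beta + z_crit * se

# Intercept estimates (beta, SE) as reported in the text.
intercepts = {"active": (-1.29, 0.57), "passive": (1.68, 0.40)}

summary = {}
for condition, (beta, se) in intercepts.items():
    low, high = wald_interval(beta, se)
    summary[condition] = {
        "wald_z": beta / se,
        "p_subject": inv_logit(beta),
        "ci": (inv_logit(low), inv_logit(high)),
    }
    ci_low, ci_high = summary[condition]["ci"]
    print(f"{condition}: Z = {summary[condition]['wald_z']:.2f}, "
          f"P(subject) = {summary[condition]['p_subject']:.2f}, "
          f"approx. 95% interval = [{ci_low:.2f}, {ci_high:.2f}]")
```

Under these assumptions, the interval for the active condition lies entirely below 0.5 and the interval for the passive condition lies entirely above it, which is what a reliable departure from chance looks like on the probability scale.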
To compare the object preference in the active condition to the subject preference in the
passive condition, a mixed-effects regression model was used. In this model, the dependent
variable was the proportion of object continuations for active conditions and the proportion of
subject continuations for passive conditions. Voice was the independent variable. Participants and
items were included as random effects. This analysis reveals a main effect of voice (β = -0.43, SE
= 0.17, Wald Z = -2.4, p < 0.05), which indicates that participants tended to refer back to the
subject in passive sentences significantly more than to the object in active sentences.
Besides the results of referential biases, the no-prompt condition also provides information
about the referential forms that participants chose to use. Figure 10 presents the proportions of
referential forms used when referring back to the subjects and objects of the preceding sentence in
each condition (active vs. passive). The details of these proportions are also presented in Table 8.
Note that these are relativized proportions based on the proportions of subject or object
continuations within a given condition. The sum of the subject and object proportions in either the
active or passive condition is equal to 1 (or 100%).
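The relationship between the relativized proportions and the within-antecedent percentages in Table 8 can be sketched as follows. The numbers are copied from Table 8; the variable names are chosen here for illustration, and the multiplication rule is inferred from the description above rather than stated as a formula in the text.

```python
# Values copied from Table 8 (spoken task, no-prompt conditions).
# total: share of continuations referring to that antecedent within a voice condition.
# absolute_pct: percentage of each referential form among those continuations.
total = {
    ("active", "subject"): 0.31,
    ("active", "object"): 0.69,
    ("passive", "subject"): 0.82,
    ("passive", "object"): 0.18,
}
absolute_pct = {
    ("active", "subject"): {"NP": 26, "null": 35, "overt": 39},
    ("active", "object"): {"NP": 59, "null": 21, "overt": 20},
    ("passive", "subject"): {"NP": 33, "null": 47, "overt": 20},
    ("passive", "object"): {"NP": 30, "null": 30, "overt": 40},
}

# Relativized proportion = share of the antecedent * share of the form for that antecedent.
relative = {
    cell: {form: total[cell] * pct / 100 for form, pct in forms.items()}
    for cell, forms in absolute_pct.items()
}

for (voice, antecedent), forms in relative.items():
    row = ", ".join(f"{form}={value:.2f}" for form, value in forms.items())
    print(f"{voice:8s}{antecedent:8s}{row}")

# Within each voice condition, the six relativized proportions sum to 1.
sums = {
    voice: sum(sum(forms.values()) for (v, _), forms in relative.items() if v == voice)
    for voice in ("active", "passive")
}
print(sums)
```

The recomputed values agree with the RELATIVE columns of Table 8 to within rounding of the published figures (for instance, the passive-object NP cell comes out at 0.054 against a tabled 0.06).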
Figure 10. Referential biases & forms in (active vs. passive) no-prompt conditions in Experiment
3 (spoken task).
Table 8. Proportion of referential forms in the no-prompt conditions in Experiment 3 (spoken
task).
Another way to observe the proportions of referential choice is with the absolute numbers also
shown in Table 8. For example, if I consider the total trials in the active voice where participants
chose to refer back to the subject, the absolute number shows that they used a null pronoun on
35% of these trials. It appears that in the active condition, participants were equally likely to use
null and overt pronouns regardless of whether they chose to refer to the preceding subject or object.
When referring to the preceding subject, participants used a null pronoun on 35% and an overt
pronoun on 39% of these trials. When referring to the preceding object, they chose a null pronoun
on 21% and an overt pronoun on 20% of these trials. In contrast, the use of an NP reveals an
interesting pattern. There is a significant increase in participants’ choice of an NP to refer back to
the preceding object compared to the subject, 59% and 26% respectively. More importantly, this
large number of NPs (59%) also dominates other choices of null (21%) and overt pronouns (20%).
In the passive condition, null pronouns were mainly used when participants referred back to the
preceding subject. Among the total trials where participants referred to the preceding subject, they
produced a null pronoun on 47% of these trials, an overt pronoun on 20% and an NP on 33%.
However, when they referred back to the less-preferred preceding objects, there was no clear
preference for any referential forms. Participants produced a null pronoun on 30% of these trials,
an overt pronoun on 40% and an NP on 30%. It is important to point out that unlike in other
conditions, null and overt pronouns patterned differently in this condition. It appears that null
pronouns are the preferred referential forms for a topicalized antecedent - the subject of a passive
sentence. In addition, I also found that participants frequently used an NP to refer to the preceding
subject, more frequently than they did with an overt pronoun.
3.4. DISCUSSION OF EXPERIMENT 3 - SPOKEN TASK
The results in Experiment 3 overall pattern with those in Experiment 2 in many respects. In the
active conditions, participants tended to continue to refer to the preceding object. Conversely, in
the passive conditions, they preferred to continue to refer to the preceding subject. As previously
discussed, these preferences do not contradict each other. In fact, they represent a patient/theme
bias across all conditions. Furthermore, compared to the patient/theme bias in the active conditions,
the bias found in the passive conditions is significantly stronger. This suggests that the bias in the
passive conditions was also influenced by topicality.
Let us turn to the patterns of null and overt pronouns, which are the main focus of this
experiment. I first discuss the interpretation of null and overt pronouns in the prompt conditions,
then the production of referential forms in the no-prompt conditions.
                     RELATIVE               Total proportion      ABSOLUTE (%)
                     NP     Null   Overt    of subject/object     NP     Null   Overt
                                            continuations
Active    Subject    0.08   0.11   0.12     0.31                  26     35     39
          Object     0.41   0.14   0.14     0.69                  59     21     20
Passive   Subject    0.27   0.39   0.16     0.82                  33     47     20
          Object     0.06   0.05   0.07     0.18                  30     30     40
3.4.1. The prompt conditions
It appears that null and overt pronouns in the spoken task share the same referential biases. In other
words, participants were equally likely to interpret them as referring to the patient/theme of the
previous clause in active and passive conditions. Despite the fact that the topicalized subjects in
passives are highly salient, I did not see null pronouns being interpreted as the preceding subject
more frequently than overt pronouns. Overall, these results do not support the form-based
hierarchy-based prediction (Givón, 1983; Ariel, 1990; Gundel et al., 1993) in which null pronouns
are more likely to be interpreted as the more salient antecedents (the patients/themes) and overt
pronouns as the less salient antecedents (the agents). Indeed, these similarities in referential biases
between null and overt pronouns in Vietnamese mirror previous findings in Chinese and Japanese
(Yang et al., 1999; Ueno et al., 2010).
3.4.2. The no-prompt conditions
In the active condition, I did not find any differences in how participants used null and overt
pronouns to refer back to the patient/theme (21% and 20% respectively) or the agent (35% and
39% respectively).
Interestingly, the use of null and overt pronouns diverged in the passive condition. Participants
produced null pronouns much more frequently than overt pronouns, 47% and 20% respectively,
when referring back to the patient/theme in the topicalized subject position. Crucially, this is the
only condition in our study in which null pronouns and overt pronouns differ from each other. In
addition, I also found a large number of NPs when participants referred back to the patient/theme
in both active and passive conditions, 59% and 33% respectively. These results go against our
prediction that participants would use more pronouns in speaking than in writing. I will discuss
the increasing use of NPs in more detail in section 3.5 where I directly compare the results of
Experiment 2 and Experiment 3.
In sum, our results did not support the prediction that the grammatical subject would be the
preferred antecedent in the active conditions. Instead, I found an object preference in the active
conditions and a subject preference in the passive conditions, thus, a patient/theme bias. Regarding
the behaviors of null and overt pronouns, I found that they patterned alike in many of the
conditions. Only in the passive-no-prompt condition was there evidence that they differed from
each other. This suggests that topicality has different effects on pronoun interpretation and
production. These results cannot be explained under the form-based hierarchy-based approach or
the antecedent-position approach. I suggest that the form-specific approach (Kaiser and Trueswell,
2008) should be used to account for how null and overt pronouns were influenced by topicality
to varying degrees.
4. Comparison of Experiments 2 and 3: Effects of modality
My main goal in conducting both the written and the spoken task is to investigate the effect of
modality on the referential biases between null and overt pronouns as well as the production of
referential forms. Thus, I use logistic mixed-effects regression models to investigate the effects of
modality on referential form use. I will discuss each question in turn. To provide a complete
picture, the results from both Experiment 2 and 3 are included in our discussion, keeping in mind
that the two experiments only vary by modality (Experiment 2 is a written task and Experiment 3
is a spoken task, with the same items and design).
4.1. NULL AND OVERT PRONOUNS IN COMPREHENSION (THE PROMPT CONDITIONS)
I use a mixed-effects regression model in which the dependent variable was the proportion of
subject continuations. (The proportion of subject continuations is the complement of the proportion
of object continuations.) Anaphor type (overt vs. null, coded as 1 and -1 respectively) and voice
(active vs. passive, coded as 1 and -1 respectively) were included as within-subjects variables and
modality (written vs. spoken, coded as 1 and -1 respectively) as a between-subjects variable.
Participants and items were included as random effects. The analyses showed a main effect of
voice (β = -1.39, SE = 0.09, Wald Z = -14.26, p < 0.001), but no main effects of prompt (β = 0.06,
SE = 0.09, Wald Z = 0.74, p = 0.45). More importantly, we found no main effect of modality (β =
0.02, SE = 0.09, Wald Z = 0.29, p = 0.76). Furthermore, there was a marginal interaction between
voice and prompt (β = 0.16, SE = 0.09, Wald Z = 1.83, p = 0.066). However, no interaction was
found between voice and modality (β = 0.05, SE = 0.09, Wald Z = 0.57, p = 0.56), between prompt
and modality (β = 0.002, SE = 0.09, Wald Z = 0.03, p = 0.97), and among voice, prompt and
modality (β = 0.08, SE = 0.09, Wald Z = 0.98, p = 0.32). It can be said that modality does not
affect how Vietnamese speakers interpret null and overt pronouns.
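As a quick consistency check on the reported statistics, a Wald Z value can be converted to a two-sided p-value under a standard normal reference distribution. The sketch below does this for the effects listed above, using only the Python standard library. It is a reader's check under that normal-approximation assumption, not the original R analysis, and the function name is hypothetical. Small discrepancies are expected because the Z values are rounded to two decimals.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a Wald Z statistic under a standard normal reference."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Wald Z values and p-values as reported in the text for the combined model.
reported = {
    "prompt": (0.74, 0.45),
    "modality": (0.29, 0.76),
    "voice x prompt": (1.83, 0.066),
    "voice x modality": (0.57, 0.56),
    "prompt x modality": (0.03, 0.97),
    "voice x prompt x modality": (0.98, 0.32),
}

recomputed = {term: two_sided_p(z) for term, (z, _) in reported.items()}
for term, (z, p_reported) in reported.items():
    print(f"{term:28s} Z = {z:5.2f}  reported p = {p_reported:<6}  "
          f"recomputed p = {recomputed[term]:.3f}")

# The voice effect (Wald Z = -14.26) is reported only as p < 0.001.
p_voice = two_sided_p(-14.26)
print(f"voice: recomputed p = {p_voice:.3g}")
```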
In the following sections, I discuss in more detail the referential biases of null and overt
pronouns in comprehension in both written and spoken tasks, first in the active then in the passive
conditions.
4.1.1. Interpretation of null and overt pronouns (active-prompt conditions)
Previously, I hypothesized that if null and overt pronouns had the same referential biases, they
would have a subject preference (Chafe, 1976; Crawley and Stevenson, 1990). However, if null and
overt pronouns had different referential biases, I hypothesized that null pronouns would be
interpreted as referring to the subject antecedents while overt pronouns would tend to refer to the
object antecedents based on the form-based hierarchy-based approach (Givón, 1983; Ariel, 1990;
Gundel et al., 1993) and the Position of Antecedent Hypothesis (Carminati, 2000). Data from our
study do not support the grammatical subject preference hypothesis nor do they support the
hypothesis that null and overt pronouns have different referential biases. Instead, participants
tended to interpret both null and overt pronouns as referring to the object antecedents. As seen in
Figure 11, this object preference is significantly stronger for null pronouns than for overt pronouns
in the written task, yet it is only numerically stronger in the spoken task. Overall, null and overt
pronouns in Vietnamese do not differ in terms of their referential preferences.
Figure 11. Interpretation of null and overt pronouns in active-prompt (null vs. overt) conditions
(Written & Spoken).
4.1.2. Effects of topicality in comprehension (prompt conditions)
In contrast to the object preference in actives, a clear preference for the subjects of passives
can be seen in Figure 12, with the written results on the left and the spoken results on the right. As
the object in actives (the patient/theme) is also the subject in passives, these preferences illustrate
a patient/theme bias. However, the bias in the passive sentences is not solely a patient/theme bias
as seen in the active sentences. Crucially, I found that the bias in passives was significantly stronger
than the bias in actives, indicating a topicality bias on top of the patient/theme bias. It can be said
that passivization in Vietnamese promotes the subject into a topic position and that pronoun
interpretation is sensitive to topicality.
Let us turn to how each type of pronoun was influenced by topicality. Figure 12 shows
that participants were equally likely to interpret null pronouns and overt pronouns as the preceding
topicalized subject. It appears that topicality has the same effect on null and overt pronouns and
that the two pronoun types do not differ in their referential biases regardless of modality.
Figure 12. Interpretation of null and overt pronouns in passive-prompt (null vs. overt) conditions
(Written & Spoken).
4.2. NULL AND OVERT PRONOUNS IN PRODUCTION (THE NO-PROMPT CONDITIONS)
In the no-prompt conditions, participants were more likely to continue to refer to the preceding
object in active conditions and to the subject in the passive conditions. To see whether these
referential biases are higher than chance level and whether modality has an effect on the biases,
we fitted a model containing the intercept with the proportion of subject continuations as the
dependent variable and modality as a between-subjects variable (written vs. spoken, coded as 1
and -1 respectively). Participant and item were included as random effects. We conducted separate
analyses for active and passive conditions. In the active conditions, the details of the intercept
revealed a main effect (β = -0.89, SE = 0.43, Wald Z = -2.05, p < 0.05). However, there was no
main effect of modality in the active conditions (β = 0.20, SE = 0.31, Wald Z = 0.65, p = 0.51).
Similarly, in the passive conditions, we found a main effect of the intercept (β = 1.89, SE = 0.41,
Wald Z = 4.59, p < 0.001) but no main effect of modality (β = 0.11, SE = 0.29, Wald Z = 0.37, p
= 0.71). Once again, no effect of modality is found on how Vietnamese speakers use null and overt
pronouns.
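With modality coded as 1 (written) and -1 (spoken), the fitted log-odds for each task are the intercept plus or minus the modality coefficient. The sketch below turns the reported estimates into approximate proportions of subject continuations per task. It ignores the random effects, so the values are illustrative rather than model predictions for any particular participant or item, and the names used are hypothetical.

```python
import math

def inv_logit(x):
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Coding as described in the text: written = 1, spoken = -1.
MODALITY_CODE = {"written": 1, "spoken": -1}

# Fixed-effect estimates as reported in the text (log-odds of a subject continuation).
estimates = {
    "active": {"intercept": -0.89, "modality": 0.20},
    "passive": {"intercept": 1.89, "modality": 0.11},
}

predicted = {}
for voice, est in estimates.items():
    for modality, code in MODALITY_CODE.items():
        log_odds = est["intercept"] + est["modality"] * code
        predicted[(voice, modality)] = inv_logit(log_odds)
        print(f"{voice:8s}{modality:8s}P(subject continuation) = "
              f"{predicted[(voice, modality)]:.2f}")
```

In both tasks the active values fall below 0.5 and the passive values above it, which is consistent with the direction of the biases holding across modality while the modality coefficient itself is not reliable.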
In the next sections, I discuss the patterns found in the active no-prompt and the passive no-
prompt conditions in both written and spoken modalities.
4.2.1. Production of null and overt pronouns (active-no-prompt conditions)
Regarding the likelihood-of-mention, the results in our written task illustrate the equi-biased verb
choice. In Hartshorne and Snedeker’s (2013) study, from which I extracted our verbs, participants
read sentences such as ‘Sally frightens Mary because she is a dax’ and were then asked “Who was
the dax?”. The subject/object bias was calculated based on whether participants answered
Sally or Mary. The equi-biased verbs chosen in our study are verbs which had a 40%-60% object
bias; hence, there was a roughly equal number of answers for Mary as for Sally. In this sense, it is not
surprising that participants in our study were as likely to refer back to the object as to the preceding
subject in the written task. I see in Figure 8 that the total height of the object bar is higher than the
height of the subject bar, but this difference is not statistically significant. Interestingly, this equi-
preference for subject and object antecedents disappeared in the spoken task. There was a clear
preference for the preceding objects. I suggest that referential biases are indeed sensitive to
modality (spoken vs. written).
Besides the overall trend of referential biases, I also looked at the details regarding the type
of referential forms speakers used with these biases. Let us focus here on the use of null and overt
pronouns and return to the use of NPs in section 6.3 on modality. Figure 13 shows participants’
choice of referential form with the written results on the left and the spoken results on the right. In
the written task, participants only slightly preferred using null pronouns to overt pronouns and they
clearly did not have a preference for either null or overt pronouns in the spoken task. For example,
when referring to the preceding subject, in the written task, null pronouns were chosen on 50% of
these trials and overt pronouns were chosen on 40%. In the spoken task, these percentages were
35% and 39% for null and overt pronouns respectively.
Figure 13. Choice of referential form in active-no-prompt conditions (Written & Spoken).
4.2.2. Effects of topicality in production (no-prompt conditions)
Looking at the overall heights of the bars in Figure 14, I see that they resemble the patterns in
Figure 12 (prompt conditions). Participants were more likely to continue to mention the subject
antecedents in the passive conditions regardless of modality. As previously said, this is indeed a
patient/theme bias which was boosted by topicality. Therefore, topicality not only has an effect on
pronoun interpretation, but it also influences the likelihood-of-mention in the production task.
Figure 14 also shows how participants chose referential forms when referring to the
preceding subject or object. As can be seen, null pronouns are the preferred referential forms when
participants continued to refer to the preceding subject in passive sentences. This is true in both
the written and the spoken tasks. Rohde and Kehler (2014) also pointed out that the rates of
pronominalization for subject referents in passive sentences are a good indication of topicality.
More specifically, in our study, the rates of pronominalization can be seen as the percentages of
null pronouns. Regarding modalities, no differences were found in the patterns of null and overt
pronouns. I only found differences in the use of NPs, with more NPs in the spoken task than in the
written task.
One may suggest that the increase of null pronouns is due to salience: because passivization
promotes a highly salient topic, more null pronouns were used to refer to the topicalized
subject. Although this explanation appears to fit with the data at hand, it cannot accommodate all
of the patterns found in our study, namely how null and overt pronouns have the same patterns in
comprehension and in the active conditions of the production task.
Figure 14. Choice of referential form in passive-no-prompt conditions (Written & Spoken).
The increasing use of null pronouns when referring to the subjects of passives may be captured
under the form-specific approach (Kaiser and Trueswell, 2008). Applying this approach to
our current study, it can be said that null and overt pronouns may be influenced by the same factors
with varying degrees of sensitivity. As seen in Figure 14, null and overt pronouns in Vietnamese
were both affected by a patient/theme bias as well as topicality. Null pronouns, however, exhibited
a stronger sensitivity to the topicality effect than overt pronouns did in the production task.
While the form-specific approach can be used to explain our data, it does not help us bridge
the gap between the results in production and comprehension. Whereas null and overt pronouns
share the same behaviors in comprehension, they differ in production under the effects of
topicality. One suggestion is to use a Bayesian approach for these patterns. In their study, Kehler
and Rohde (2013) implemented this approach to reconcile the differences between the
comprehenders’ expectations and the speakers’ choice of referential form. Although I do not
discuss this approach in the current paper, I will include this probabilistic method in future work.
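For orientation only, the core of the Bayesian approach associated with Kehler and Rohde (2013) can be written as a single application of Bayes' rule. The notation below is chosen here for illustration and is not taken from the present study: the comprehender's interpretation bias combines the producer's likelihood of choosing a pronoun for a given referent with the prior expectation that the referent will be mentioned next.

```latex
P(\text{referent} \mid \text{pronoun}) =
  \frac{P(\text{pronoun} \mid \text{referent}) \, P(\text{referent})}
       {\sum_{\text{referent}'} P(\text{pronoun} \mid \text{referent}') \, P(\text{referent}')}
```

On this view, production patterns (the choice of form given a referent) and comprehension patterns (the choice of referent given a form) are related but need not mirror each other, which is the kind of gap described above.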
4.3. EFFECT OF MODALITY
Previous work using corpus data and discourse analysis has shown that modality can influence
the use of referential expressions (DeVito, 1964; Poole and Field, 1976; Tannen, 1982; Chafe,
1982, 1985). In particular, they found that the use of nouns was prevalent in writing. In
contrast, pronouns were predominantly used in speaking. To date, I know of no
experiment which directly manipulates written and spoken language in pronoun resolution. Our
experiments then can shed some light on this issue. Recall that the referents provided in our
sentence fragments were ambiguous and regardless of the prompt or no-prompt manipulation, no
extra information was presented to disambiguate them. Overall, our results show that null and
overt pronouns in Vietnamese do not differ across modalities.
Now that I have discussed null and overt pronouns, let us turn to the only difference between
the spoken and written tasks, the use of NPs in production (i.e. the no-prompt conditions) in Figures
13 and 14. The spoken task triggered a significant increase in nouns compared to the written task,
especially when the nouns refer to the patient/theme antecedents. This goes against the prediction
that the spoken task would yield more pronouns than the written task. One possibility is that our tasks
presented participants with ambiguous referents; hence, they used NPs to disambiguate though this
explanation may not provide us with a complete picture. This claim is supported by previous work
examining written language (e.g. newspapers and academic essays), which often contains a higher
number of referents than speech. Consequently, NPs are used as a strategy of ambiguity avoidance
(Biber et al., 1999). Thus, if the participants in the spoken task (Experiment 3) were highly
cooperative in communication, they would try to help the addressees identify the intended referents
by using NPs. Consequently, they would use NPs more often in the spoken task than in the written
task since listeners, unlike readers, do not have opportunities to re-hear the sentences to re-identify
the intended referents.
However, existing psycholinguistic work (Arnold and Griffin, 2007; Fukumura et al., 2011)
has shown that the increasing use of NPs may not simply be a result of ambiguity avoidance. In
their studies, the presence of a second character in discourse reduces pronoun use even when the
characters are of different genders; thus, there is no ambiguity in using a pronoun. This
phenomenon can be explained using the cognitive model (Ariel, 1990; Gundel et al., 1993) in which
the choice of referential form depends on the referent’s cognitive status. The more activated the
referent is, the more reduced the referential form that is used. Considering that activation and attention are
two closely related processes, the activation of a referent in one’s memory at a specific time t is a
result of how much attention it received prior to t (Chafe, 1994; Kibrik, 1996). If attention is a
limited resource (Kahneman, 1973), when a discourse contains only one character, the character
obtains all of the attention and is highly activated in the speakers’ mind. This gives rise to a high
number of pronouns referring to this character. On the contrary, when a second character occurs,
attention is divided between the two referents. Even when one character is more activated than the
other, the amount of attention speakers give to this character is still less than when there is only
one character in discourse. Consequently, even when using a pronoun would create no ambiguity,
speakers still use fewer pronouns since the referent is less activated. As a result, the increasing
number of NPs found in our spoken task can also be a result of competition between referents and
not just a result of avoiding ambiguity. However, this does not explain why the increase of NPs only
occurs in the spoken task and not in the written task. In our paper, we suggested that this is because (i)
repeating an NP is more convenient in speaking than in writing and (ii) perhaps the repeated name
penalty is more acceptable in speaking than in writing. Further investigation is needed to confirm
these suggestions. More importantly, the question here is whether the spoken task presented
participants with a higher amount of competition between the referents than the written task did.
To address this question, we take a look at the experimental set-ups of the two tasks.
In the written task, participants had the sentence fragments presented in front of them and they
were asked to write the continuations for the fragments. In contrast, in the spoken task, participants
were asked to read the fragments on the computer screen and to think of the continuations.
Participants were instructed to press a button to proceed to the recording screen only after they had
come up with the continuations for the fragments. At the recording screen, they spoke the full
sentence (i.e. the fragment and the continuation) out loud. The sentence fragments remained
on the screen during the recording. This procedure aims to give participants time to
familiarize themselves with the given context and to minimize time pressure. In doing so, the spoken
and written tasks can be more closely related with regard to their experimental design since we did
not provide participants with a specific time to start writing the continuations nor block the
sentence fragments from them during the writing process.
Nonetheless, the procedure in the spoken task can still place a heavier load on participants’
working memory than in the written task because they had to produce the fragments containing the
NPs. In other words, participants had to pay attention to both referents in the fragments and
temporarily memorize them while producing the sentence; thus, competition between referents
in the spoken task is higher than in the written task. As previously mentioned, competition between
referents increases the use of NPs (Arnold and Griffin, 2007; Fukumura et al., 2011). It can be said
that the increase in NP use in our spoken task is in line with the competition-based account.
Although the competition-based account can be a plausible explanation for the patterns in our
studies, one may argue that participants might not memorize the NPs while speaking the sentences
out loud but that they simply repeated the fragments they saw on the screen and added the
continuations. If this were the case, there would be no difference in the degree of competition
between the spoken task and the written task. This leads us back to the question of why participants
used NPs more often in the spoken task than in the written one. One hypothesis is that the shift
from reading to speaking in the spoken task interferes with the continuity of discourse; hence, it
triggers NP use. The continuity or flow of discourse, according to Li and Thompson (1979) and
Chen (1986), is “the speaker’s perception of the degree of ‘connection’ between clauses in
discourse”. This degree of 'connection' (Chen, 1986) can be defined in terms of topic continuity
and semantic continuity. As in Givón (1983), topic continuity is affected by the change of
topic. Semantic continuity (Chen, 1986) is influenced by factors such as (i) turning from
background information to foreground information, or vice versa, (ii) insertion of some digression
into the theme development, (iii) insertion of temporal, locative, adversative, or other types of
adverbial, and (iv) switch or turn in conversation. When either topic or semantic continuity is
interrupted, fuller referential forms such as an NP should be used instead of reduced forms
such as a pronoun. Evidence for this claim can be found in Simpson et al.’s (2015) work on
pronouns in Chinese. In their experiment 4, participants were asked to write continuations for the
given sentences (e.g. 陈元把钱还给了赵云, ... ‘Chen Yuan returned the money to Zhao Yun. __’).
This experimental set-up yielded a significantly higher number of pronouns (92% null pronouns
and 5% overt pronouns) than NPs (3%) when participants referred to the preceding subject.
Unlike experiment 4, their experiment 5 presented the sentences in the form of a conversation
as seen in (18), repeated from Simpson et al. (2016).
(18) (Simpson et al., 2016)
A: Lu Jian reng le yi gen xiangjiao gei Zhou Ping
Lu Jian throw ASP 1 CL banana to Zhou Ping
‘Lu Jian tossed a banana to Zhou Ping.’
B: weishenme
why
‘Why?’
A: _____
When given this dialogue, participants were more likely to use NPs than overt pronouns to refer
to the subject antecedents, 79.6% and 20.4% respectively. In contrast to experiment 4, no null pronouns
were found in this experiment. The switch or turn in conversation in Simpson et al.’s experiment
is an example of semantic discontinuity mentioned in Chen (1986). Consequently, pronouns,
especially null pronouns, were dispreferred and NPs became the preferred referential form. Along
these lines, when participants in our spoken task were asked to read the fragments, they could be in one
continuity or flow of discourse. Then when they were asked to speak the sentences in which they
provided the second subordinate clause (i.e. the because clause), this continuity was disrupted,
which in turn triggered the use of NPs. However, unlike in Simpson et al., the results of our spoken
task still include pronouns. Specifically, when participants referred to the preceding object of
active sentences, 21% null pronouns and 20% overt pronouns were used. In the passive sentences,
47% null pronouns and 20% overt pronouns were used to refer to the preceding subject. In short,
if shifting from reading to speaking interrupts the continuity of discourse, this type of interruption
has a much weaker effect than the switch or turn in conversation observed in Simpson et al. (2015).
According to Arnold and Griffin (2007) and Fukumura et al. (2011), the decreasing number of
pronouns is not just a strategy to avoid ambiguity. They found that when participants were
presented with a second, competing character in discourse, pronouns were used less even when they
did not result in ambiguity. However, this explanation does not eliminate the differences between
our spoken and written tasks. It is possible that although participants had the need to use more
pronouns, it was more convenient for them to produce long full noun phrases in speech than in
writing.
Overall, I found no differences in how null and overt pronouns were interpreted in the two
experiments. Both pronoun types exhibited a patient/theme bias and they were equally influenced
by topicality in the passive conditions in written and in spoken tasks. They also had the same
patterns in production in both experiments. Although null and overt pronouns were equally used
in the active condition, null pronouns were the preferred choice of referential form in the passive
condition. It can be concluded that participants exhibited a strong tendency to refer back to the
preceding object in the active condition and to the preceding subject in the passive condition
regardless of modality. The only variation between the two experiments was the use of NPs. Contrary to
our prediction, participants produced more NPs, not more pronouns, in the spoken task. Since
our prediction was built on previous findings, it is worth noting how those studies differ from ours
in ways that could lead to different outcomes. In earlier work, data on referential expressions were
extracted from corpora and discourse data collection methods (DeVito, 1964; Poole and Field, 1976;
Tannen, 1982; Chafe, 1982, 1985). These studies found more NP use in written language and
higher pronoun use in speech. Biber et al. (1999) noted that these differences may be due to the
various degrees of ambiguity in each type of data. Written language, for example newspapers and
academic essays, often contains many referents. In order to disambiguate, people tend to use an
NP when they refer to previous referents. In contrast, spoken language seems to have fewer
referents in the discourse; hence, there is less ambiguity and the use of a pronoun is sufficient in
most cases. In our experiments, I controlled for the number of referents in each item. The referents
in each sentence also shared the same gender; hence, they were ambiguous. In this case, perhaps
the increase in NP use indicates the need to disambiguate. Existing psycholinguistic work,
however, has claimed that speakers’ choice of referring expression may be governed by other factors
besides ambiguity avoidance. Arnold and Griffin (2007) pointed out that even when the use of a
pronoun was not ambiguous between the referents, the presence of a second character in discourse
decreased the use of a pronoun. They attributed this effect to referents’ competition for speakers’
attention in discourse. Similarly, Fukumura et al. (2011) also found fewer pronouns when a
competitor sharing similarities with the referent was presented despite the fact that the use of a
pronoun in these cases would not create ambiguity. In this light, the increased number of NPs
found in our spoken task cannot be solely explained by ambiguity avoidance. Nevertheless, the
fact that NPs were frequently used only in the spoken task and not in the written task does not rule out
the suggestion that the spoken task presents participants with the means to better express their
choice of referential form.
5. General discussion
Previous work in languages such as Italian and Spanish has claimed that null and overt pronouns
have different referential biases (Carminati, 2002; Alonso-Ovalle et al., 2002). Studies in Japanese
and Chinese, however, have found that null and overt pronouns do not pattern differently from
each other (Yang et al., 1999; Ueno and Kehler, 2016). In this chapter, I revisited this topic using
Vietnamese as a case study. Several questions were addressed: (i) whether null and overt pronouns
in Vietnamese have the same or different referential biases and what their referential biases may
be, (ii) whether passivization is a topic-marking device in Vietnamese and how it influences the
rates of null vs. overt pronominalization, (iii) whether modality (spoken vs. written) has an effect
on the choice of null vs. overt pronouns.
This chapter focuses on null and overt pronouns in Vietnamese and their relationship with
topicality. Prior work on salience presents us with a hierarchy in which different referential forms
are chosen based on the statuses of the referents (Givón, 1983; Ariel, 1990; Gundel et al., 1993).
Null pronouns, for instance, are used to refer to highly salient referents while overt pronouns are
used for less salient ones. These form-based, hierarchy-based approaches present different roles for
null and overt pronouns. Studies in pro-drop languages such as Italian and Spanish (Carminati,
2002; Alonso-Ovalle et al., 2002) have found that there is a division of labor between null and overt
pronouns as claimed in the Position of Antecedent Hypothesis. While null pronouns tend to refer
back to the subject antecedents, overt pronouns prefer the object antecedents. This pattern,
however, has not been observed in topic-drop languages such as Chinese and Japanese (Yang et
al., 1999; Ueno and Kehler, 2016) in which null and overt pronouns do not differ in their referential
biases. Furthermore, although topicality has been shown to shift overt pronouns’ referential biases
from the object to the subject antecedents in Spanish using word-order manipulation (Alonso-Ovalle
et al., 2002), no effects of topicality were found in Japanese using topic marking (Ueno and
Kehler, 2016). Besides these topic manipulations, it has been shown that in English, passivization can
be used to promote topicality (Rohde and Kehler, 2014). Since Vietnamese is a language with
fixed word order and no topic marking, our study implemented passivization as a test for topicality.
I also manipulated modality for a direct comparison in the use of null and overt pronouns in spoken
and written language.
Regarding comprehension, our prompt conditions show no differences between null and overt
pronouns. In the active conditions, both null and overt pronouns exhibited an object bias. In
contrast, they both had a subject bias in the passive conditions. I concluded that null and overt
pronouns in our study had a patient/theme bias since passivization promoted the object
(patient/theme) in active into a subject position. Further analyses of these biases revealed that the
subject bias in passive sentences was significantly stronger than the object bias in active sentences.
This indicates that besides a patient/theme bias, null and overt pronouns were also influenced by
topicality in the passive conditions. Overall, the two pronoun types do not differ from each other
in their interpretations.
Regarding production, the no-prompt conditions show similar patterns with the prompt
conditions regarding the likelihood-of-mention. In both active and passive conditions, participants
were more likely to refer back to the patient/theme. There was also a topicality effect with the
subject bias significantly stronger in passive than in active conditions. Regarding the choice of
referential form, although participants were equally likely to use a null or an overt pronoun in the
active conditions, they had a strong preference to produce a null pronoun in the passive conditions.
Crucially, this is the only condition in our study in which null and overt pronouns show different
patterns.
In terms of modality, the overall results of referential biases and topicality effect do not differ
between the written (Experiment 2) and spoken (Experiment 3) modalities. Null and overt
pronouns in comprehension and production shared similar patterns across modalities. These results
are in line with previous findings in the narrative study (Chapter 2) in that modality does not
influence the underlying mechanisms of null vs. overt pronoun use. However, in contrast with the
narrative study, the production of NPs in the current chapter varies between writing and speaking.
It has been shown that the presence of a second, competing character in discourse may decrease
the use of pronouns (Arnold and Griffin, 2007; Fukumura et al., 2011). Since my items contained two
ambiguous referents, I suggest that the increased use of NPs in the spoken task is a result of this
tendency. Because it is more convenient for participants to produce NPs in speaking than in
writing, the increase in NP use appears in the spoken experiment.
The findings in this chapter raise two key issues.
• Vietnamese pronoun choice cannot be explained using form-based, hierarchy-based
approaches, nor can it be explained with the Position of Antecedent Hypothesis. One suggestion
is to use the form-specific approach (Kaiser and Trueswell, 2008) in which different forms may
show different degrees of sensitivity to the same factor. However, this approach does not explain
why null and overt pronouns in Vietnamese do not differ in comprehension, yet they start to
diverge in production. Another suggestion is to use a probabilistic method such as a Bayesian
approach to bridge the gap between pronoun interpretation and pronoun production. This method
has been successfully used in Kehler and Rohde (2013). Although I do not discuss this method in
the current paper, future work will investigate data patterns using this method.
• The object bias found in the active conditions is not only unexpected but also contradicts
the subject preference found earlier in Experiment 1, Chapter 2. Crucially, the object bias also poses
a challenge to the widely known subject bias found in pronoun resolution across languages (e.g.
Chafe, 1976; Alonso-Ovalle et al., 2002; Carminati, 2002; Crawley & Stevenson, 1990; Simpson
et al., 2016; Ueno & Kehler, 2016). Two key questions concern the subject vs. object bias: (i)
To what extent is the object bias found in Experiments 2 and 3 a result of the verb choice based on
the English verb study by Hartshorne and Snedeker (2013)? (ii) Do Vietnamese speakers have a
subject or object bias in pronoun assignment? I examine these questions in the following chapters.
Specifically, in Chapter 4 (Experiment 4), I present results of a large-scale study of 162
Vietnamese implicit causality verbs and provide a comparison between Vietnamese verbs and their
English equivalents. In Chapter 5 (Experiments 5 and 6), I use the self-paced reading paradigm to
gain incremental, real-time information and investigate whether a subject or object bias is present
during online processing of Vietnamese pronouns.
Implicit causality in Vietnamese: Is there an object bias?
1. Introduction
This chapter sets out to investigate referential biases of Vietnamese implicit causality verbs.
Results from the sentence completion experiments in Chapter 3 (Experiments 2 and 3) raise an
intriguing question with regard to pronouns’ referential bias. Specifically, in active sentences,
Vietnamese speakers showed a tendency to refer back to object antecedents rather than to the non-
topicalized subject antecedents. This object bias is unexpected since previously, the narrative
experiment (Experiment 1 in Chapter 2) reveals a clear preference for the grammatical subject
role: The subject parallelism configuration has a distinct pattern compared to the other three
configurations including the object parallelism. Particularly, pronouns (null and overt pronouns)
are the preferred choice when the antecedent and the referring expression are both subjects in
adjacent utterances. These are also the grammatical roles being investigated in the sentence
completion tasks (Chapter 3). Moreover, the majority of the utterances in the narratives are also in
the active construction, like the active, non-topicalized sentences in the sentence completion
experiments.
Considering the similarities and differences between the two types of tasks, three things
should be pointed out. First, the sentence completion experiments controlled the coherence relation
between the clauses (all items involved an Explanation relation, signaled by because). The
narrative study did not control for the kind of coherence relation between sentences and
presumably involved a range of relations, including 'Occasion' relations which describe a narrative
sequence of events. Second, the verbs used in the sentence completion tasks were all equi-biased
verbs (i.e. verbs that do not strongly bias pronoun interpretations toward either the subject or the
object), whereas the narrative study included a wide range of verbs. Lastly, the equi-biased verbs
used in Vietnamese are based on the English verb study. Taken together, these differences raise
the possibility that the seemingly unexpected object bias in Chapter 3 may stem from the verb
choice.
Since there is no prior study of Vietnamese implicit causality verbs, I report in this chapter a
study of 149 implicit causality verbs in Vietnamese. I aim to (i) provide an overview of the
Vietnamese verbs’ referential biases, and (ii) draw a crosslinguistic comparison between the
Vietnamese and English implicit causality verbs.
The structure of the chapter is as follows: In the remainder of Section 1, I provide an overview
of prior work on implicit causality and the taxonomy (i.e. the Revised Action-State Distinction
(Brown & Fish, 1983b; Au, 1986)) often used to categorize implicit causality verbs. Section 2
describes the methods used in the experiment and compares the findings in Vietnamese with those
in English. Section 3 provides a summary of the patterns found in the study and suggestions for
further research.
1.1. WHAT IS IMPLICIT CAUSALITY?
It has been pointed out that when reading sentences such as (19), English speakers have a strong
preference to interpret the pronoun she in (19a) as Lisa and in (19b) as Kate (Caramazza et al.,
1977; Garvey & Caramazza, 1974). This phenomenon, in which the cause of the event (i.e. she)
can be inferred from the verb in the main clause, is called implicit causality.
(19) a. Lisa frightened Kate because she… (she = Lisa)
b. Lisa blamed Kate because she… (she = Kate)
It has been shown that in these sentences, a because connective is necessary for a causal
interpretation and thus verbs’ causal attributions are used to guide pronoun assignment to the cause
(e.g. Ehrlich, 1980).
Verbs’ implicit causality is often described in terms of subject vs. object bias. For instance,
the verb frighten in (19a) leads to an interpretation that the cause (i.e. she) of the event is the
subject of the main clause, Lisa. In contrast, blame in (19b) results in an interpretation that the
cause is Kate, the object of the main clause. In other words, frighten is a subject-biased verb
(sometimes referred to as an IC-1 verb) and blame is an object-biased verb (sometimes referred to
as an IC-2 verb). It should be noted that even though implicit causality is often described in these
terms, the bias itself should be thought of as a continuum rather than in absolute terms. Thus,
verbs’ implicit causality varies from strongly subject/object-biased to equi-biased (i.e. not strongly
biased toward either subject or object).
1.2. THE REVISED ACTION-STATE DISTINCTION
Since different verbs may vary drastically in their implicit causality biases, researchers have been
very interested in finding a way to categorize them and thus help predict their biases.
Two prominent taxonomies proposed in the literature are the Revised Action-State Distinction
(Brown & Fish, 1983b; Au, 1986) and the Linguistic Category Model (Semin & Fiedler, 1988,
1991). A detailed comparison of these two taxonomies and their implications is provided in
Rudolph and Försterling (1997). In short, Rudolph and Försterling found that the Revised Action-
State Distinction is a more straightforward taxonomy and it is also better than the Linguistic
Category Model in capturing variance in causal attributions in verbs. In this work, I will also use
the Revised Action-State Distinction for verb categorization. Another advantage of using this
taxonomy is that it is widely used in a large body of work and thus, it can help draw a more direct
comparison of verbs in Vietnamese and other languages.
To illustrate the four verb categories in the Revised Action-State Distinction, the examples in (20)
are provided below. As we can see in these examples, Agent-Patient (20a) and Stimulus-Experiencer
(20c) verbs have a subject bias. In contrast, Agent-Evocator (20b) and Experiencer-Stimulus
(20d) verbs elicit an object bias.
(20) a. Agent-Patient
Sally hit Mary because she… (subject bias: she = Sally)
b. Agent-Evocator
Sally punished Mary because she… (object bias: she = Mary)
c. Stimulus-Experiencer
Sally impressed Mary because she… (subject bias: she = Sally)
d. Experiencer-Stimulus
Sally liked Mary because she… (object bias: she = Mary)
The Vietnamese verbs examined in this chapter are also discussed with regard to the four verb
categories mentioned above. One may ask: if verbs’ implicit causality biases can be predicted using
this taxonomy, why would we need to do a verb study to confirm these already known biases?
First, even though the taxonomy can help us predict the biases of a good number of verbs, it is not
always the case that verbs’ biases follow these predictions. Let us take a look at example (21)
below.
(21) Agent-Evocator
a. Sally praised Mary because she… (object bias: she = Mary)
b. Sally apologized to Mary because she… (subject bias: she = Sally)
Despite the fact that both praise and apologize are Agent-Evocator verbs, they exhibit opposite
biases: Praise is object-biased while apologize is subject-biased. Consequently, it is important to
examine individual verbs’ biases rather than assuming the bias from the taxonomy. Second, even
though studies have examined implicit causality in a number of languages, they are mostly done
with European languages (German: Fiedler, 1978; Rudolph, 1997; Spanish: Goikoetxea et al.,
2008; Dutch: Semin & Marsman, 1994; Italian: Manetti & De Grada, 1991). One exception is the
study of implicit causality in Chinese by Brown and Fish (1983a). Last but not least, very few studies test
a large number of verbs (English: Ferstl et al., 2011; Hartshorne & Snedeker, 2013; Spanish:
Goikoetxea et al., 2008) and thus it is difficult to see how the predictions hold up once other words
are included.
In this study, I test a total of 162 verbs in Vietnamese (including 24 verbs used in the sentence
completion study in Chapter 3, Experiments 2 and 3) to obtain information about each individual
verb’s implicit causality bias. To examine the subject-/object-biased predictions with respect to
verb class, I categorize the Vietnamese verbs based on the Revised Action-State Distinction. Most
importantly, I am interested in the crosslinguistic differences between the English and Vietnamese
verbs’ implicit causality biases and whether these differences influence the results in the sentence
completion study. In the results section, I provide a comparison of 147 Vietnamese verbs and their
English equivalents (Ferstl et al., 2011) based on the Revised Action-State Distinction. Separately,
I also compare the biases of the twenty-four verbs used in the sentence completion study with the
results in Hartshorne and Snedeker (2013).
2. Experiment 4 – English and Vietnamese implicit causality verbs
2.1. PARTICIPANTS
One hundred and sixty-three adult native speakers of Vietnamese participated in the experiment.
None of the participants had lived outside of Vietnam for more than six months.
2.2. MATERIALS AND DESIGN
The target items are designed based on Hartshorne and Snedeker (2013) [17]. Each sentence had two
male or two female names; thus, the pronoun is ambiguous in terms of gender, as seen in example (22)
below. The lengths of the names in each item are matched so that they differ by a maximum
of one letter. Similar to Hartshorne and Snedeker’s use of the nonce word dax (e.g. ‘Sally frightens
Mary because she is a dax’), I use đăn tuê, a Vietnamese nonce word, in all of the items. Therefore,
đăn tuê functions as a filler word and does not have any influence on how participants interpret
the cause of the event. Since Vietnamese pronouns are derived from kinship terms denoting not
only gender but also age, both old (22a) and young pronouns (22b) are used in the experiments.
[17] Due to the large number of verbs in this study, I chose to use a naming task similar to the one in Hartshorne and
Snedeker (2013) rather than a sentence completion task for ease of coding the results.
(22) a. Trúc la Hằng vì bà ấy/cô ấy đăn tuê.
Trúc scold Hằng because sheOLD/sheYOUNG đăn tuê
‘Trúc scolded Hằng because she is đăn tuê.’
b. Công la Nhật vì ông ấy/anh ấy đăn tuê.
Công scold Nhật because heOLD/heYOUNG đăn tuê
‘Công scolded Nhật because he is đăn tuê.’
A total of 162 verbs were tested. The verbs are divided into three lists, each with fifty-four verbs.
The verbs are pseudo-randomized so that no more than three verbs of the same category (e.g.
Agent-Patient) occur in a row. Two pseudo-randomizations are used for each list. Eight catch-trials
are also added to each list. The catch-trials contain names of different genders (i.e. one
male, one female); thus, the pronouns in the catch-trials are unambiguous and participants need not use
verb bias to know which referent the pronoun refers to. An example of a catch-trial is shown in
(23). Each participant completed only one list, with a total of sixty-two items.
(23) Nghĩa quý mến Thắm vì cô ấy đăn tuê.
NghĩaMALE cherish ThắmFEMALE because sheYOUNG đăn tuê
‘Nghĩa cherished Thắm because she is đăn tuê.’
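The ordering constraint described above (no more than three verbs of the same category in a row) can be illustrated with a short sketch. This is not the procedure used in the study, which is not specified beyond the constraint itself: the rejection-sampling strategy, the function names, the placeholder verb labels and the class sizes below are all assumptions made for illustration.

```python
import random


def max_run_length(categories):
    """Return the length of the longest run of identical adjacent categories."""
    longest = 0
    run = 0
    previous = object()  # sentinel that never equals a real category
    for category in categories:
        run = run + 1 if category == previous else 1
        previous = category
        longest = max(longest, run)
    return longest


def pseudo_randomize(items, max_run=3, seed=None, max_tries=10_000):
    """Shuffle (verb, category) pairs until no category occurs more than
    max_run times in a row (simple rejection sampling, an assumed strategy)."""
    rng = random.Random(seed)
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        if max_run_length(category for _, category in order) <= max_run:
            return order
    raise RuntimeError("no valid order found; relax max_run or raise max_tries")


# Hypothetical 54-item list: placeholder verb labels and made-up class sizes.
class_sizes = {
    "Agent-Patient": 20,
    "Agent-Evocator": 12,
    "Experiencer-Stimulus": 11,
    "Stimulus-Experiencer": 11,
}
toy_list = [
    (f"{category}_{i}", category)
    for category, size in class_sizes.items()
    for i in range(size)
]

# Two pseudo-randomizations of the same list, as in the design described above.
order_a = pseudo_randomize(toy_list, seed=1)
order_b = pseudo_randomize(toy_list, seed=2)
```

Catch-trials would be interleaved in a further step; how they were positioned is not stated, so that step is left out of the sketch.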
2.3. PROCEDURE
Participants were instructed to read sentences such as (24a) and answer the question in (24b)
by writing the name of the referent.
(24) a. Trúc la Hằng vì cô ấy đăn tuê.
Trúc scold Hằng because sheYOUNG đăn tuê
‘Trúc scolded Hằng because she is đăn tuê.’
b. QUESTION: Who is đăn tuê? ________ [write down a name]
3. Results
In this section, I first discuss the results of 147 Vietnamese verbs with regard to (i) the four verb
categories, and (ii) the biases of their English equivalents from Ferstl et al.’s (2011) study. I then
discuss the set of twenty-four verbs previously used in Experiments 2 and 3 (Chapter 3) with respect
to their English equivalents from Hartshorne and Snedeker (2013).
Prior to data analysis, I excluded participants based on two criteria: (i) a lack of variation
(i.e. they always replied with the subject name or always with the object name) and/or (ii) their
performance on the catch-trials (i.e. participants were retained only if they answered at least 5 out of
8 catch-trials correctly). This process leaves us with a total of ninety-eight participants.
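The two exclusion criteria can be written as a simple filter. The sketch below is only an illustration under stated assumptions: the data layout, the function name `keep_participant` and the three toy participants are hypothetical, and only the thresholds (variation in answers, at least 5 of 8 catch-trials correct) come from the text.

```python
def keep_participant(target_answers, catch_correct, min_catch_correct=5):
    """Return True if a participant passes both criteria.

    target_answers: "subject"/"object" answers on the target items
    catch_correct:  one boolean per catch-trial (True = correct answer)
    """
    # Criterion (i): the participant did not always choose the same role.
    answered_both_roles = len(set(target_answers)) > 1
    # Criterion (ii): at least min_catch_correct catch-trials answered correctly.
    passed_catch_trials = sum(catch_correct) >= min_catch_correct
    return answered_both_roles and passed_catch_trials


# Hypothetical participants (shortened answer lists, for illustration only).
participants = {
    "P01": (["subject", "object", "object", "subject"], [True] * 8),
    "P02": (["object", "object", "object", "object"], [True] * 8),
    "P03": (["subject", "object", "object", "object"], [True] * 4 + [False] * 4),
}

retained = [
    pid
    for pid, (answers, catch) in participants.items()
    if keep_participant(answers, catch)
]
# P02 fails criterion (i); P03 fails criterion (ii); only P01 is retained.
```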
A total of 147 verbs are discussed in the following analysis. Figure 15 below shows the
percentages of subject responses for Vietnamese and English, by verb class. Overall, English and
Vietnamese verbs appear to share the same pattern in their subject bias with 38.97% subject bias
in Vietnamese and 42.7% subject bias in English. A Pearson correlation is used to test whether there
is a correlation among verbs in the two languages. I found that as a whole, English and Vietnamese
verbs are significantly correlated (r = 0.46, n = 147, p < 0.001). However, the subject bias is stronger
in English than in Vietnamese.
Figure 15. Percentages of subject responses for Vietnamese and English (Ferstl et al., 2011) by
verb class (n = 147).
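The Pearson correlation reported above compares, verb by verb, the percentage of subject responses in the two languages. A minimal sketch of that computation follows. The per-verb percentages in it are invented for illustration, and the function names are hypothetical; the conversion from r to a t statistic is the standard formula, not something described in the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Invented per-verb percentages of subject responses (illustration only).
vietnamese = [30.0, 45.0, 52.0, 38.0, 61.0]
english = [35.0, 50.0, 48.0, 44.0, 70.0]
r_toy = pearson_r(vietnamese, english)

# For the reported values r = 0.46 and n = 147, the same formula gives a
# t statistic of roughly 6.2 on 145 degrees of freedom.
t_reported = t_from_r(0.46, 147)
```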
A closer look reveals that there are indeed variations with regard to verb class. As seen in Figure
16 below, subject biases in English and Vietnamese pattern similarly in three of the classes.
Specifically, Agent-Patient verbs in both English and Vietnamese exhibit a stronger subject bias
than verbs in the Agent-Evocator class, and verbs in the Experiencer-Stimulus class have an object
bias. Pearson correlations show that there are indeed significant correlations
within these classes: Agent-Patient (r = 0.3, n = 65, p < 0.05), Agent-Evocator (r = 0.5, n = 29, p
< 0.01) and Experiencer-Stimulus (r = 0.51, n = 25, p < 0.01). However, no significant correlation is found
with Stimulus-Experiencer verbs (r = 0.3, n = 22, p = 0.14). In fact, English Stimulus-Experiencer
verbs have 65% subject responses. In contrast, Vietnamese Stimulus-Experiencer verbs receive
55.6% object responses. However, the object bias is stronger in Vietnamese Experiencer-Stimulus
than Stimulus-Experiencer verbs (p < 0.01); thus, these categories are still distinct in Vietnamese.
Figure 16. Correlations between English and Vietnamese verbs by verb class based on the
percentage of subject responses.
As expected, results from the paired t-tests show that information about age (t(148)=0.62, p=0.54)
and gender (t(148)=0.40, p=0.69) encoded in the kinship-term pronouns has no significant effect on
the strength of the subject preference.
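For readers unfamiliar with the test, a paired t-test compares two measurements taken on the same units by testing whether the mean of their differences departs from zero. The sketch below shows the computation of the t statistic only. What exactly was paired in the analysis above is not spelled out, so the toy data (subject-response percentages for the same items under two pronoun types) and the function name are assumptions; the degrees of freedom are the number of pairs minus one, so t(148) corresponds to 149 pairs.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, degrees_of_freedom) for a paired t-test on two sequences."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two sequences of equal length (at least 2 pairs)")
    diffs = [a - b for a, b in zip(first, second)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    standard_error = sd / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error, len(diffs) - 1


# Invented percentages of subject responses for the same five items
# presented with an "old" vs. a "young" pronoun (illustration only).
old_pronoun = [42.0, 35.0, 58.0, 40.0, 47.0]
young_pronoun = [40.0, 37.0, 55.0, 43.0, 46.0]
t_toy, df_toy = paired_t(old_pronoun, young_pronoun)
```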
Let us now turn to the twenty-four verbs used in the sentence completion study (Chapter 3).
The crucial question is whether or not these verbs are equi-biased in Vietnamese. Recall that
previously, these verbs were chosen from the English study by Hartshorne and Snedeker (2013)
on the basis that they are not strongly biased toward either the subject or the object (40%-60%
object responses or, inversely, 40%-60% subject responses). In this discussion, I refer to verbs’
biases using percentages of subject responses as I have done so far in this chapter.
Verbs in this group can be divided into five classes: Agent-Patient (n=9), Agent-Evocator
(n=6), Stimulus-Experiencer (n=3), Experiencer-Stimulus (n=2), and Source-Goal (n=4).
Overall, these twenty-four Vietnamese verbs do not differ from their English equivalents in their
biases: On average, the verbs are 44.85% subject-biased in Vietnamese and 49.54% subject-biased
in English (Hartshorne & Snedeker, 2013). Thus, they are equi-biased verbs in both English and
Vietnamese. The subject bias is slightly stronger in English than in Vietnamese, similar to the
pattern found in the analysis of the 147 verbs. Looking at each verb class, I found that all but
Agent-Evocator are equi-biased in Vietnamese. Specifically, the average percentage of subject
responses is 55.14% for Agent-Patient verbs, 44.44% for Stimulus-Experiencer verbs, 40.74% for
Experiencer-Stimulus verbs, and 46.30% for Source-Goal verbs. Meanwhile, Agent-Evocator
verbs only have an average of 37.65% subject bias. Nevertheless, these results show that overall,
the Vietnamese verbs used in the sentence completion study are indeed equi-biased (i.e. 40%-60%
subject bias). It can be concluded that the object bias found in the sentence completion study
(Chapter 3) cannot be due to verb choice.
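The equi-bias criterion (40%-60% subject responses) can be applied mechanically to the class averages reported above. In the sketch below, the averages are the ones given in the text; the function name and the labels used for values outside the 40%-60% band are assumptions added for illustration.

```python
def classify_bias(percent_subject, low=40.0, high=60.0):
    """Label a verb or verb class by its percentage of subject responses.

    The 40-60 band follows the equi-bias criterion in the text; the two
    labels outside the band are an assumed extension of that criterion.
    """
    if percent_subject < low:
        return "object-biased"
    if percent_subject > high:
        return "subject-biased"
    return "equi-biased"


# Average percentage of subject responses per class, as reported in the text.
class_averages = {
    "Agent-Patient": 55.14,
    "Stimulus-Experiencer": 44.44,
    "Experiencer-Stimulus": 40.74,
    "Source-Goal": 46.30,
    "Agent-Evocator": 37.65,
}

labels = {name: classify_bias(value) for name, value in class_averages.items()}
# Only Agent-Evocator falls below the 40% threshold; the other four classes
# fall inside the equi-biased band.
```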
4. Discussion
This chapter describes a large-scale study that collected information about the implicit causality
bias of over one hundred Vietnamese verbs.
The motivation for this study stems from the object bias found in the sentence completion
tasks in Chapter 3. Participants in the sentence completion experiments exhibit a tendency to use
pronouns (null and overt pronouns) to refer back to the object antecedents in active sentences. This
contrasts sharply with the finding in the narrative study in Chapter 2 in which pronouns are
mostly used to refer back to subject antecedents. One primary concern about the object bias found in
the sentence completion study is that the equi-biased verbs used in these sentences were chosen on
the basis of the English verb study. Thus, English verb bias results may not generalize
fully to Vietnamese verbs. The current chapter addresses this concern by
reporting results from a study of 162 Vietnamese implicit causality verbs.
I found that overall, English and Vietnamese verbs share similar patterns in their referential
biases. However, English verbs exhibit a stronger subject bias than Vietnamese verbs.
Interestingly, looking at each verb class shows that in fact, there are important differences among
verbs in the two languages. Specifically, even though Agent-Patient, Agent-Evocator and
Experiencer-Stimulus verbs in English and Vietnamese behave similarly, Stimulus-Experiencer
verbs in Vietnamese exhibit an object bias, in contrast to the subject bias in English Stimulus-Experiencer
verbs. In other words, both Stimulus-Experiencer and Experiencer-Stimulus verbs in
Vietnamese have an object bias, even though the object bias in Experiencer-Stimulus is
significantly stronger. Crucially, the verb study in this chapter shows that Vietnamese speakers
tend to interpret pronouns as referring to object antecedents more frequently than English speakers
do. The question remains whether Vietnamese speakers have a bias toward the subject or the object
in pronoun resolution. The following chapter, Chapter 5, addresses this issue.
Subject vs. Object bias in online pronoun processing
1. Introduction
This chapter takes a closer look at the unexpected object bias found in the sentence completion
task in Chapter 3 and in the verb study in Chapter 4. In this chapter, I report two self-paced reading
studies that explore the real-time processing of pronouns in Vietnamese, in order to get a better
understanding of how the object bias emerges during on-line comprehension.
The finding that pronouns prefer object antecedents (Chapters 3 and 4) is unexpected --
both from a crosslinguistic perspective and in light of the narrative study in Chapter 2 which
suggests the grammatical subject in Vietnamese has a prominent role in discourse. More
specifically, in the narrative study, people produced pronouns significantly more often when the
referents’ grammatical subject role is maintained across utterances (i.e. subject parallelism effect).
Furthermore, the narrative study also showed that, while pronouns (both null and overt) are the
predominant choice in subject parallelism configurations, NPs are the most frequently used
referential form in object parallelism configurations (i.e. subjecthood effect). Both of these
findings point towards a preference for pronouns to be used for subject antecedents. In contrast,
the sentence completion task in Chapter 3 and the verb bias selection task in Chapter 4 reveal an
object bias in Vietnamese.
These seemingly conflicting results lead to several hypotheses. On the one hand, it could
be that the subjecthood effect is limited to narratives (Chapter 2) and that generally speaking,
Vietnamese speakers exhibit an object bias in pronoun resolution (Chapters 3 and 4). However, this
view seems to be controversial considering the vast amount of crosslinguistic evidence indicating
that subjects are prominent in discourse and preferred as pronoun antecedents (other things being
equal). Why would Vietnamese differ from other languages? Which characteristic of the language
could underlie the typologically unusual object bias? So far, the only major difference we could
see about Vietnamese compared to other languages is the use of kinship terms as pronouns.
However, this alone is not sufficient to explain why kinship terms would result in an object preference.
On the other hand, it could be that Vietnamese actually has an underlying subject
preference -- in line with crosslinguistic patterns -- and that the object preference observed in the
sentence completion tasks (Chapters 3 and 4) might be an epiphenomenon of implicit causality
processing. In this chapter, I will examine whether the subject or object bias is present during
online processing of Vietnamese pronouns. Using the self-paced reading paradigm, I will measure
people’s reading times for each word to gain incremental, real-time information about the
subject/object bias.
1.1. EFFECTS OF VERB BIAS ON PRONOUN INTERPRETATION
Verb bias, particularly implicit causality bias, has been shown to have a strong influence on
pronoun interpretation in both offline and online processing. For offline processing, when asked
to choose which referent the pronoun he in (25) refers to, participants have a strong tendency to
choose John in (25a) and Bill in (25b).
(25) a. John annoyed Bill because he was loud. he = John
b. John admired Bill because he was kind. he = Bill
The effect of implicit causality bias has also been observed for online processing. Caramazza et
al. (1977) asked participants to read sentences such as those in (26) and to indicate who the pronoun
referred to. In (26a), the continuation (i.e. the because-clause) is congruent with the bias by the
verb scolded and hence, the pronoun he was interpreted as Bill. In contrast, the continuation in
(26b) leads to an interpretation of he as Tom, which is incongruent with the verb bias. Caramazza
et al. found that participants took longer to read and to respond to incongruent sentences (26b) than
congruent ones (26a).
(26) (Caramazza et al., 1977)
a. Tom scolded Bill because he was annoying.
b. Tom scolded Bill because he was annoyed.
Importantly, Caramazza et al. also found that the effect of implicit causality persists even in cases
where the gender feature is sufficient to interpret the pronoun itself: Participants had shorter
response times for gender-different, congruent sentences (27a) than gender-different, incongruent
sentences (27b).
(27) (Caramazza et al., 1977)
a. Sue scolded Bill because he was annoying.
b. Bill scolded Sue because he was annoyed.
Since Caramazza et al.'s seminal work, many researchers have investigated effects of implicit
causality using off-line as well as on-line methods. Today, researchers agree that implicit causality
bias influences pronoun interpretation, but disagreements remain about the time course in which
this verb-based information takes effect during online sentence processing. Two contrasting
accounts have been tested: the clausal integration account and the immediate focusing account.
The clausal integration account claims that verb bias only has full effects toward the end of the
because-clause (Garnham et al., 1996; Stewart et al., 2000). In contrast, the immediate focusing
account suggests that verb information is rapidly used by comprehenders as soon as they reach the
because-clause (Greene & McKoon, 1995; Long & De Ley, 2000).
To explore how and when effects of implicit causality emerge, many of these studies have
employed so-called 'probe tasks' (Garnham et al., 1996; Greene & McKoon, 1995; Long & De
Ley, 2000; McDonald & MacWhinney, 1995). In this type of task, participants are asked to read
sentences and to respond as quickly as possible whether or not a probe word has occurred
previously in the sentence. The probe word may or may not relate to the congruent referent with
regard to the verb bias. For example, the referent Bill in (26a) is in focus under the effect of the
verb scolded but not in (26b). By presenting the probe word at various points in a sentence such as
(26a), researchers can detect when verb bias takes effect, which in turn facilitates participants’
response times when the in-focus probe word, Bill, appears. Garnham et al. (1996) found a late
effect for the probe occurring at the end of the because-clause but no early effect for the probe
after the pronoun he. This supports the clausal integration account. However, results from other
studies using the probe task (Greene & McKoon, 1995; Long & De Ley, 2000; McDonald &
MacWhinney, 1995) show an early effect of verb bias, at the probe immediately following the
pronoun. These results support the immediate focusing account.
The use of the probe task and the contradictory results have been criticized in more recent
studies. As pointed out by Koornneef & van Berkum (2006) and Stewart et al. (2000), the
occurrence of the probe questions is unnatural: It is not something we normally do when reading
texts. Consequently, such a direct query may result in participants developing a strategy such as
memorizing the referents, which is specific to the probe task and not part of natural sentence
comprehension (Gordon, Hendrick, & Foster, 2000). Furthermore, Stewart et al. (2000) also
pointed out that the probe task may not be sensitive enough to detect the time course of language
processing. One example provided in their paper is the failure to detect an effect of antecedent
facilitation in a pronoun interpretation study conducted by Greene, McKoon, & Ratcliff (1992)
using the probe task. However, using the more sensitive eye-tracking paradigm, Garrod et al. (1994)
were able to detect this facilitation effect. To avoid the concerns associated with the probe task,
Stewart et al. (2000) used the self-paced reading paradigm to investigate the time course of verbs’
implicit causality bias. In their experiments (Experiments 2-4), participants read two large
fragments of a sentence such as (28). The fragment division is indicated by “/”.
(28) (Stewart et al., 2000:436)
a. Daniel apologized to Joanne because he / had been behaving selfishly.
b. Joanne apologized to Arnold because he / didn’t deserve the criticism.
Stewart et al. found that it took participants longer to read sentences such as (28b) in which the
pronoun interpretation is incongruent with the verb bias, compared to sentences where the pronoun
interpretation is congruent with the verb bias, as in (28a). However, the reading time slowdown only
occurred in the second fragment and not in the first fragment. This led them to conclude that verb
bias effect is a late effect, supporting the clausal integration account (Garnham et al., 1996).
Koornneef & van Berkum (2006) challenged this interpretation and suggested that the lack of
an early detection of verb bias effect in Stewart et al. (2000) might be due to their design. Even
though self-paced reading may be a good task to detect the time course of the effect, presenting
the sentences in two large fragments could prime participants to a button-press rhythm. In addition,
the effect of incongruence may not occur exactly at the pronoun but at the next few words right
after the pronoun (Badecker & Straub, 1992; Ehrlich & Rayner, 1983). Therefore, having the
sentence fragment boundary right after the pronoun could mask this spill-over effect. To address
this issue, Koornneef & van Berkum (2006) examined whether implicit causality verbs have an
immediate impact during sentence processing using a finer-grained word-by-word self-paced
reading task. They found that participants slowed down immediately after encountering a
bias-incongruent pronoun (e.g. he in example 27b). A subsequent eye-tracking experiment yielded
similar patterns. Thus, Koornneef & van Berkum (2006) concluded that verbs’ implicit causality
bias has a very rapid effect during sentence processing.
These results regarding verbs’ implicit causality bias relate to the current work in two ways.
First, prior work shows that implicit causality information can influence pronoun interpretation
and there is strong evidence supporting this claim not only from offline judgment tasks but also
from online language processing tasks. Therefore, when Vietnamese speakers resolved the
ambiguous pronouns in the sentence completion task (Chapter 3) and the verb judgement task
(Chapter 4) to the grammatical object referent, they might have done so under the influence of the
verb information. Second, the discussions about the different experimental methods used in
previous studies and how they may affect the outcomes can point to an appropriate task for the
current study. Specifically, it is important to have a task that is sensitive enough to pick up effects
during online processing. As shown in prior work, the self-paced reading paradigm can be a good
choice to examine the object bias in Vietnamese.
1.2. EFFECTS OF INFORMATION MARKED ON THE PRONOUN
In addition to information from verb semantics, information from the pronoun itself can also guide
pronoun interpretation. In English, gender information on third person singular pronouns (he/she)
can serve as a cue to the intended antecedent. Thus, pronouns following clauses with two different-
gender referents (29b) are not ambiguous (compare to (29a)).
(29) a. John scolded Bill when he entered the room.
b. John scolded Mary when he entered the room.
While gender marking on the pronoun can function as an effective device to guide selection of the
appropriate referent in sentences such as (29b), many researchers have questioned the extent of the
influence of gender cues during real-time pronoun resolution. Here, we are especially interested in the role of
gender cues on pronoun resolution in the presence of other information (e.g. subjecthood
preference, verbs’ implicit causality bias).
Some researchers have suggested that gender information is used before other information.
For instance, Crawley, Stevenson, & Kleinman (1990) -- and many others -- found that ambiguous
pronouns tend to be interpreted as referring to preceding subjects. In addition, they also found
evidence during an online reading experiment that when the pronoun is not ambiguous, gender
information is used instead of the subject assignment strategy. In contrast, in a probe task, Greene,
McKoon, & Ratcliff (1992) found no evidence that participants identified the referents of gender-
disambiguated pronouns, leading them to conclude that gender cues are only used when the referents
are highly accessible.
With regard to how rapidly gender information is processed, contradictory results have also
been found. While rapid use of gender cues has been found in some studies (Boland, Acker,
& Wagner, 1998; MacDonald & MacWhinney, 1990), others found limited use of gender in
pronoun processing (Gernsbacher, 1989; McDonald & MacWhinney, 1995). One concern
regarding these mixed results is the fact that these studies employed the probe task, where
participants not only resolved the pronoun, but they also had to remember the referents mentioned
previously. As discussed for verb bias in Section 1.1, this type of task has been
criticized for disrupting the language comprehension process and may cause participants to use
other strategies such as memorization.
In an effort to better address this question, Arnold et al. (2000) tracked participants’ eye-
movements to potential (visually depicted) antecedents while they listened to stories containing
gender-ambiguous and gender-unambiguous referents. Arnold et al. found rapid use of gender
information right after the onset of the pronoun. They also found evidence that gender cues and
other information such as referents’ salience are used simultaneously in pronoun processing. Even
when referents’ salience is decreased, participants in Arnold et al. still rapidly used gender cues to
identify the intended referent in the stories. They concluded that gender information and other
types of cues are used rapidly in online pronoun processing.
Regarding the question of how information from verb bias (implicit causality) interacts with
gender marking on pronouns, prior work also shows mixed results. While Garnham & Oakhill
(1985) found an effect of gender cue facilitating pronoun processing, they found no systematic
effect of verb bias. In contrast, Vonk (1985) found an effect of verb bias but not gender cue.
Diverging from these claims, Garnham, Oakhill, & Cruttenden (1992) found that both verb
bias and gender cue facilitate pronoun assignment. However, they observed that gender cues are
only used when the specific task demands participants to make use of such information. In
particular, Garnham et al.'s results are supported by Stewart et al. (2000, Experiment 4), which also
found that both gender cues and verb bias facilitate pronoun interpretation. However, due to how
they segmented the sentence (28), the effects only show in the second sentence fragment but not
in the first fragment, where the (un)ambiguous pronoun is located. As noted in Garnham et al. (1992),
perhaps a more naturalistic task than the two-chunk self-paced reading task used in
their experiments would be better suited to investigate these effects in online language processing. Nicol
& Swinney (2003) have also noted that verb bias information may be more readily available than
pronouns’ gender cue in previous reading-type studies since participants have already read and
computed the verb by the time they reach the pronoun. Thus, Arnold et al.’s (2000)
audio-visual eye-tracking task mentioned above might be a better type of task to detect the time course
of these effects. However, as I shall proceed with the self-paced reading paradigm in the current
study, it is important to keep these discussions in mind when interpreting the final results.
1.3. AIMS OF THE CURRENT STUDY
The current study examines whether the object bias previously found in offline judgement tasks
(Chapters 3 and 4) is also present during the real-time processing of Vietnamese pronouns. When
the pronoun identity goes against speakers’ initial interpretation, online processing difficulties
should be detected. Therefore, if Vietnamese speakers indeed have an initial preference to interpret
the pronoun as referring to object antecedents, we should not detect any processing difficulties at
the disambiguation region when it becomes clear that the intended referent of the pronoun is the
preceding object. In addition to the object bias, the study also investigates whether age cues, which are
specific to Vietnamese kinship term pronouns, can be used to guide pronoun processing in the presence of
verb bias.
2. Experiment 5 – Subject vs. Object bias
This experiment employs the word-by-word self-paced reading paradigm to test pronoun
interpretation in Vietnamese. In light of the results reported in the preceding chapters, I am
specifically interested in whether online processing of pronouns reveals evidence for an early
object preference (as the offline data in Chapters 3 and 4 might lead us to expect) or if there are
any signs of an early subject preference (as prior crosslinguistic work might lead us to expect). To
examine this, I manipulated whether the pronoun interpretation matches the verb bias (what I will
refer to as 'congruence') using object biased verbs. I am also interested in the use of age cues
denoted by Vietnamese kinship term pronouns in reference resolution. Age cues are features
specific to Vietnamese. However, similar to gender cues in English, they are encoded lexically on
the pronouns and can be used to distinguish referents in discourse (e.g. ông ‘heOLD’ vs. anh
‘heYOUNG’). For these reasons, I am also interested in how age cues in Vietnamese can guide
pronoun assignment. To test how age cues are used, I manipulate the age of the referents (i.e. same
age vs. different age) and thus, the pronoun is either ambiguous or unambiguous with regards to
age.
In this experiment, I examine (i) whether Vietnamese speakers have a subject bias or object
bias during pronoun processing using object-biased verbs and (ii) how age cues are used in the
presence of verb bias during pronoun assignment when pronouns are unambiguous. Participants
in this experiment read sentences which consist of either same age referents (i.e. ambiguous
pronoun conditions) or different age referents (i.e. unambiguous pronoun conditions). I detect
effects of verb bias and age cues by measuring reading times at the disambiguating noun (i.e.
the repetition of one of the two referents), which reveals which referent the pronoun refers
to. In the ambiguous pronoun conditions, if verb bias has a strong influence, participants should
resolve the pronoun according to the verb bias and thus, experience processing difficulties (i.e.
longer reading times) when the disambiguating noun does not match their expectations. In
contrast, in the unambiguous pronoun conditions, if age cue has a strong effect on pronoun
assignment, participants should be able to resolve the pronoun without experiencing any
slowdowns when the disambiguating noun does not match with the verb bias. However, if verb
bias still has an influence despite the presence of age cues, participants should still slow down when
they encounter the disambiguating noun which does not match with the verb bias.
2.1. METHODS
2.1.1. Participants
Thirty-eight Vietnamese native speakers living in Vietnam participated in the study. None of the
participants had lived outside Vietnam for more than six months. Participants received
VND100.000 (approximately USD 4.30) for their participation.
2.1.2. Materials and Design
2.1.2.1. Norming studies
Prior to the main experiment, three norming studies were conducted to create and select the
experimental stimuli.
a. Noun norming
Since I am interested in the age cues embedded in pronouns, the potential referents of these
pronouns must be clearly recognized as having specific ages (i.e., clearly young or clearly old). I
also want to be able to control the social status of the potential referents, as higher social status
and increased age are tightly associated in Vietnamese culture. To do this, I chose to use role nouns
(e.g. dentist, teacher etc.) in the target items.
As shown in (30), Vietnamese role nouns consist of two components: the kinship term (e.g.
ông, old.male) and the occupation term (e.g. nha sĩ 'dentist'). The gender and age of the role nouns
are explicitly denoted in the kinship term. In addition, there is an implicit social hierarchy
associated with both age and different professions/roles: older people are more respected than
younger people, and socially prestigious roles (e.g. dentist) are more respected than less prestigious
roles (e.g. janitor). To identify target items that allow me to test for effects of the age cue while ensuring
that other prestige factors do not come into play, I conducted a noun norming study to gather
information about the prestige of role nouns.
(30) ông nha sĩ
old.male dentist
‘the dentistOLD.MALE’
Before I discuss the norming study in more detail, it is important to note that in addition to the
social hierarchy in role nouns, frequency is also an important factor since it can affect reading time
and lexical retrieval. Unfortunately, frequency information in Vietnamese corpora is not readily
available for all of the role nouns in question. Therefore, a secondary frequency judgement task
was added into the noun norming study. This type of subjective frequency rating task is not
uncommon, especially in cases where corpora are not reliable or available, such as in American
Sign Language (Balota, Pilotti, & Cortese, 2001; Mayberry, Hall, & Zvaigzne, 2014).
The noun rating task was implemented using the online survey platform Qualtrics. Seventy-
five Vietnamese native speakers rated the prestige of 40 role nouns (24 targets, 12 fillers, 4 catch
trials) and the frequency of another set of 40 role nouns (24 targets, 12 fillers, 4 catch trials). Nouns
occurred in both old vs. young forms (indicated by kinship term) and where possible, male vs.
female forms (indicated by kinship term). Participants first rated the nouns on social prestige, and
then rated the nouns on frequency. For the prestige rating task, participants were instructed to rate
how respectable the jobs (indicated by the role nouns) are from 1 to 6, with 1 as least respectable
and 6 as highly respectable. For the frequency rating task, participants were told to rate on how
commonly used the nouns are with 1 as very rarely used and 6 as very frequently used. In total 72
role nouns were rated, and participants did not see any noun twice. Six catch trials were included
in each part (i.e. prestige rating, frequency rating) to detect whether participants were paying
attention to the task. The catch trials for prestige consisted of three jobs with very low prestige and
three with very high prestige. Similarly, the catch trials for frequency included three very frequently
used nouns and three very low frequency nouns (based on the Vietnamese mixed corpus from
Leipzig Corpora Collection (2018)).
Ratings for prestige and frequency were analyzed separately. Participants who performed less
than 70% correct on catch trials (less than 3 correct answers out of 4) were eliminated. This left
us with 64 participants for the prestige task and 51 participants for the frequency task. Raw scores
were transformed into z-scores.
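
The exclusion and standardization steps just described can be illustrated with a minimal Python sketch. It is not the analysis script used for the norming study. It assumes that z-scores are computed within each participant using the population standard deviation, which the text does not specify, and the function names and example ratings are hypothetical.

```python
from statistics import mean, pstdev

def passes_catch_trials(n_correct, n_catch, threshold=0.70):
    """Return True if catch-trial accuracy reaches the threshold (70% in the text)."""
    return n_correct / n_catch >= threshold

def z_scores(ratings):
    """Standardize raw ratings: subtract the mean and divide by the SD.

    Assumption: standardization is done per participant with the population SD.
    """
    m = mean(ratings)
    sd = pstdev(ratings)
    if sd == 0:
        # A participant who gave the same rating everywhere carries no information.
        return [0.0 for _ in ratings]
    return [(r - m) / sd for r in ratings]

# Hypothetical participant: raw 1-6 prestige ratings for five role nouns.
raw = [2, 4, 4, 5, 5]
standardized = z_scores(raw)
```

Standardizing within participants removes differences in how individual raters use the 1-6 scale before nouns are compared.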
Based on these scores, I selected nouns and used them to make noun pairs, since each target
needs two role nouns to provide two potential antecedents. Nouns with the same gender that were
closely ranked in both prestige and frequency were matched to create noun pairs.
When creating the noun pairs for the targets, I also made sure that nouns that were paired
together to be the potential pronoun antecedents in a target have the same number of syllables and
differ by no more than two letters.
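
The pairing constraints can be expressed as a simple check, sketched below in Python under several assumptions. I read "differ by no more than two letters" as a difference in total letter count, I count syllables as space-separated orthographic units, and the cut-off for "closely ranked" (max_z_gap) as well as the z-score values in the example are hypothetical, since the text does not report them.

```python
import unicodedata

def syllables(noun):
    """Split a Vietnamese noun into orthographic syllables (space-separated)."""
    # NFC normalization makes each accented letter count as one character.
    return unicodedata.normalize("NFC", noun).split()

def letter_count(noun):
    """Count letters, ignoring the spaces between syllables."""
    return sum(len(s) for s in syllables(noun))

def can_pair(a, b, max_letter_diff=2, max_z_gap=0.5):
    """Check the pairing constraints for two role nouns.

    a, b: dicts with keys 'noun', 'gender', 'prestige_z', 'frequency_z'.
    max_z_gap is a hypothetical threshold for 'closely ranked'.
    """
    return (
        a["gender"] == b["gender"]
        and len(syllables(a["noun"])) == len(syllables(b["noun"]))
        and abs(letter_count(a["noun"]) - letter_count(b["noun"])) <= max_letter_diff
        and abs(a["prestige_z"] - b["prestige_z"]) <= max_z_gap
        and abs(a["frequency_z"] - b["frequency_z"]) <= max_z_gap
    )

# Hypothetical z-scores, for illustration only.
dentist = {"noun": "nha sĩ", "gender": "male", "prestige_z": 0.9, "frequency_z": 0.2}
engineer = {"noun": "kỹ sư", "gender": "male", "prestige_z": 0.8, "frequency_z": 0.3}
pair_ok = can_pair(dentist, engineer)
```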
b. Main clause plausibility
In addition to norming the prestige of the nouns to ensure that targets have two roles with equal
social prestige, I also wanted to norm the plausibility of the main clause to ensure that the events
being described in that clause (e.g. ‘the young dentist praised the old engineer’ vs. ‘the young
engineer praised the old dentist’) are roughly equally plausible. This is important because we
want to avoid a situation where a certain noun is unusual or implausible in a particular scenario
(e.g. ‘the gardener cured the doctor’) as that could influence the activation/salience of that noun
and thus also the likelihood of that noun being interpreted as the antecedent of a subsequent
pronoun. In essence, I wanted to make sure all events and event participant pairings were similar
in plausibility.
The noun pairs created on the basis of the noun norming were paired with the object biased
verbs from the verb study (Chapter 4) to create a set of 43 sentences, which were then tested for
plausibility. The Subject-Object order (i.e. N1N2 vs. N2N1) and age cue (i.e. same age vs. different
age) in these sentences were manipulated in a 2x2 design shown in (31). The Subject-Object order
manipulation is crucial for the congruence in the main self-paced experiment.
(31) a. N1N2_same age
Ông nha sĩ khen ông kỹ sư
old.male.dentist praise old.male.engineer
‘The dentistOLD.MALE praised the engineerOLD.MALE.’
b. N2N1_same age
Ông kỹ sư khen ông nha sĩ
old.male.engineer praise old.male.dentist
‘The engineerOLD.MALE praised the dentistOLD.MALE.’
c. N1N2_different age
Anh nha sĩ khen ông kỹ sư
young.male.dentist praise old.male.engineer
‘The dentistYOUNG.MALE praised the engineerOLD.MALE.’
d. N2N1_different age
Ông kỹ sư khen anh nha sĩ
old.male.engineer praise young.male.dentist
‘The engineerOLD.MALE praised the dentistYOUNG.MALE.’
In addition, another set of 41 sentences similar to (31b & c) but with equi-biased verbs from
Chapter 5 were also created for the purpose of Experiment 7. These two sets of sentences were
intermixed for a total of 84 items in this study as shown in (32) below.
(32) a. N1N2_equi-biased verb
Anh nha sĩ hẹn với ông kỹ sư
young.male.dentist meet with old.male.engineer
‘The dentistYOUNG.MALE met with the engineerOLD.MALE.’
b. N2N1_equi-biased verb
Ông kỹ sư hẹn với anh nha sĩ
old.male.engineer meet with young.male.dentist
‘The engineerOLD.MALE met with the dentistYOUNG.MALE.’
Ninety-one Vietnamese speakers rated a total of ninety-eight items. Participants were instructed
to rate the plausibility of the events described in the sentences on a 7-point scale (1 = implausible,
7 = plausible). Each participant rated a total of 98 sentences (42 targets and 56 fillers). Thirty-two
of the fillers functioned as catch trials describing highly plausible and highly implausible events. I
excluded participants who (i) didn’t provide sufficient background information for us to identify
them as native Vietnamese speakers, (ii) had already participated in prior studies, or (iii)
performed less than 75% correct on catch trials. This left us with sixty-five
participants.
For data analysis, ratings were transformed into z-scores. Outliers were detected based on
MAD-score (Leys et al., 2013) using the normalize function from the Rling package in the R
environment.
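
To make the outlier criterion concrete, the following Python sketch flags values by their distance from the median in units of the median absolute deviation (MAD), following the general recommendation in Leys et al. (2013). It is not the Rling implementation used in the analysis. The cut-off of 2.5 and the scaling constant 1.4826 are the defaults suggested by Leys et al. and are assumed here, since the text does not state which threshold was applied. The example values are hypothetical.

```python
from statistics import median

def mad_outliers(values, threshold=2.5, scale=1.4826):
    """Flag values lying more than `threshold` scaled MADs from the median.

    threshold and scale are assumed defaults (Leys et al., 2013), not values
    reported in the text.
    """
    med = median(values)
    # Scaled MAD: median of absolute deviations from the median.
    mad = scale * median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [False for _ in values]
    return [abs(v - med) / mad > threshold for v in values]

# Hypothetical z-scored plausibility ratings for one item.
ratings = [0.1, -0.2, 0.0, 0.3, -0.1, 4.0]
flags = mad_outliers(ratings)
```

Unlike a criterion based on the mean and standard deviation, the median and MAD are themselves barely affected by the extreme values being tested.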
Based on these constraints, I chose for the main experiment the twenty-four items which had the least variation in
ratings across all 6 conditions (as described in examples 31 and 32). In other words, I selected
those sentences where all six permutations received
maximally similar plausibility ratings. In addition, I also conducted t-tests to confirm that the
different conditions do not differ significantly in terms of their plausibility ratings.
c. Interpretations of x’s
To maintain the verb bias in the main clause while also introducing a transitive event in the because
clause so that the disambiguating noun can appear, I use x’s in place of a real verb. The syllable
length and character length of this placeholder were chosen based on the frequency of their appearance
in the verb study in Chapter 5. For ease of exposition, I am going to use x’s to refer to this placeholder even
though the actual length of the x’s varies from 3 to 13 letters and from 1 to 3 words. Further
information about this choice is discussed in Section 3.2.2 below.
Due to this design, there is a concern about the possible interpretations of x’s used in the
study. A sentence completion task was conducted to ensure that x’s is indeed interpreted as a verb.
Thirteen Vietnamese speakers were instructed to provide natural continuations for twelve sentence
fragments which are identical to the stimuli used in the main experiment. The fragments end at the
x’s. The first word (i.e. the choice for x’s) in participants’ continuations was coded. Participants
chose to use verbs 70% of the time for x’s. Thus, the use of x’s instead of actual verbs should not
be an issue for the main experiment.
2.1.2.2. Materials
The target items consist of two-clause sequences as illustrated in (33). The first clause involves a
transitive verb (e.g. praise) with role nouns as its subject and object. This first clause is followed
by the connective 'because' and a second clause with a pronominal subject and transitive verb
which mentions one of the preceding characters as its object (thereby disambiguating the intended
referent of the pronoun to be the other person).
Recall that, as discussed in Section 1.3, the main aim of this study is to test (i) whether
Vietnamese speakers have a subject bias or object bias during pronoun processing using object-
biased verbs and (ii) how age cues are used in the presence of verb bias during pronoun assignment
when pronouns are unambiguous. To investigate this, I manipulated (i) congruency (i.e. whether
the referent of the pronoun in the because-clause matches the implicit causality bias of the verb in
the first clause) and (ii) the age configuration of the two referents (i.e. same age vs. different age).
Example (33) illustrates the four conditions. As can be seen in (33), in the same-age conditions
the pronoun is ambiguous between the two referents, but in the different age conditions the age
information on the pronoun can be used to identify the intended referent.
In both the same-age and different-age conditions, the intended referent of the pronoun is
clearly disambiguated once people encounter the object (in bold) in the because-clause, because
participants then realize that the pronoun must refer to the other character. (For a similar design,
see e.g. Stewart et al., 2000.) I refer to the object as the disambiguating noun, and this is the critical
region of interest for my analyses. I manipulated whether the pronoun is disambiguated as referring
to the preceding object (33a & c) or preceding subject (33b & d). Thus, it is at the disambiguating
noun that the congruence of the pronoun's antecedent becomes clear, i.e. whether the referent of
the pronoun in the because-clause matches the implicit causality bias of the verb in the first clause.
(33) a. same age_congruent => pronoun disambiguated as referring to preceding object
Ông nha sĩ khen ông kỹ sư vì ông ấy trong nhiều năm
old.male.dentist praise old.male.engineer because heOLD in many year
qua đã xxx ông nha sĩ trong ngôi trường ở ngoại thành.
past ASP xxx old.male.dentist in school at outskirts
‘The dentistOLD.MALE praised the engineerOLD.MALE because heOLD for many years has
xxx the dentistOLD.MALE at the school in the outskirts of the city.’
b. same age_incongruent => pronoun disambiguated as referring to preceding subject
Ông kỹ sư khen ông nha sĩ vì ông ấy … ông nha sĩ …
old.male.engineer praise old.male.dentist because heOLD… old.male.dentist
‘The engineerOLD.MALE praised the dentistOLD.MALE because heOLD for many years has
xxx the dentistOLD.MALE at the school in the outskirts of the city.’
c. different age_congruent => pronoun disambiguated as referring to preceding object
Anh nha sĩ khen ông kỹ sư vì ông ấy… anh nha sĩ …
young.male.dentist praise old.male.engineer because heOLD… young.male.dentist
‘The dentistYOUNG.MALE praised the engineerOLD.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
d. different age_incongruent => pronoun disambiguated as referring to preceding subject
Ông kỹ sư khen anh nha sĩ vì ông ấy … anh nha sĩ …
old.male.engineer praise young.male.dentist because heOLD … young.male.dentist
‘The engineerOLD.MALE praised the dentistYOUNG.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
In addition to manipulating whether the two referents have the same age (both old) or different
ages (one old, one young), I also manipulated whether the implicit causality bias of the verb is
congruent or incongruent with what the pronoun's referent is disambiguated to be. All verbs were
object-biased verbs. Thus, when the pronoun is disambiguated as referring to the preceding object
(33 a and c), this is in line with what the verb's IC bias leads us to expect. I call these conditions
congruent. When the pronoun is disambiguated as referring to the preceding subject (33b and d),
this goes against expectations triggered by the verb's IC bias. I call these conditions incongruent.
It is worth emphasizing that, as mentioned above, in Vietnamese it is clear from the role nouns
whether the person is old or young. Because English does not encode this information lexically, I
use subscripts to signal whether the Vietnamese role noun refers to an old or young person.
As can be seen in the examples, instead of an actual verb in the second clause, I used a
placeholder which consisted of a variable number of repetitions of the letter x as discussed in
Section 3.2.2.c. This was done to avoid problems due to semantic biases that would have been
created by the use of real verbs in the second clause. In this experiment, it is crucial to ensure that
by the time participants encounter the disambiguating noun, their expectations are guided by (i)
the verb in the first clause and (ii) the kind of pronoun in the second clause, but not by anything
else. Because it was not possible to identify verbs for the second clause that would not potentially
introduce additional biases, I instead chose to use a 'placeholder verb' consisting of x's. (This
placeholder was preceded by the aspect marker đã, which signals that it is a verb.)
In addition, to ensure that reading times at the critical disambiguating noun are not influenced
by any reading time patterns on the pronoun region, a four-word adverbial phrase occurs after the
pronoun. In addition, to be able to detect any spill-over effects after the critical disambiguating
noun region, two prepositional phrases occur after that noun. The adverbial phrase and the
prepositional phrases were held constant across all conditions of an item.
Twenty-four targets were combined with forty-eight fillers of three types (fillers involving relative
clauses, fillers with when/while clauses, and fillers with ambiguous nouns/verbs). Thus, each
participant saw a total of seventy-two items.
Fillers also contained placeholder x's, in place of content words such as nouns and verbs. The
location of the x placeholder in fillers varied (beginning, middle and end regions of the sentences).
This was done to prevent participants from developing expectations about when the x’s occur.
Targets and fillers were pseudo-randomized, and there was at least one filler between two
target items and no more than three fillers consecutively. A Latin-square design was used to create
four experimental lists. The lists were also reversed to control for effects of ordering, resulting in
a total of eight lists.
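The list construction described above can be sketched as follows. This is a minimal illustration, not the script used in the study: the condition labels, the function names and the gap-filling strategy are assumptions made for the example. Only the counts (24 targets, 48 fillers, four Latin-square lists, eight lists after reversal) and the two ordering constraints come from the text.

```python
import random

# Hypothetical condition labels for the 2x2 design (age configuration x congruence).
CONDITIONS = ["same_congruent", "same_incongruent",
              "different_congruent", "different_incongruent"]

def latin_square_lists(n_items=24, conditions=CONDITIONS):
    """Rotate conditions over items: every list contains each item once,
    and each item appears in every condition across the lists."""
    k = len(conditions)
    return [[(item, conditions[(item + shift) % k]) for item in range(n_items)]
            for shift in range(k)]

def pseudo_randomize(targets, fillers, rng, max_run=3):
    """Interleave targets and fillers so that two targets are always separated
    by at least one filler and no more than max_run fillers occur in a row."""
    targets, fillers = targets[:], fillers[:]
    rng.shuffle(targets)
    rng.shuffle(fillers)
    # One gap before the first target, one after the last, one between neighbours.
    gaps = [0] + [1] * (len(targets) - 1) + [0]
    spare = len(fillers) - sum(gaps)
    if spare < 0 or len(fillers) > max_run * len(gaps):
        raise ValueError("constraints cannot be satisfied")
    while spare > 0:
        g = rng.choice([i for i, size in enumerate(gaps) if size < max_run])
        gaps[g] += 1
        spare -= 1
    order = []
    for i, size in enumerate(gaps):
        order.extend(fillers.pop() for _ in range(size))
        if i < len(targets):
            order.append(targets[i])
    return order

rng = random.Random(0)  # fixed seed so the sketch is reproducible
fillers = [("filler", i) for i in range(48)]
lists = []
for assignment in latin_square_lists():
    targets = [("target", item, cond) for item, cond in assignment]
    order = pseudo_randomize(targets, fillers, rng)
    lists.extend([order, order[::-1]])  # reversed copy controls for order effects
```

Reversing a list preserves both ordering constraints, so the reversed copies need no further checking.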
All targets and fillers were followed by a comprehension question. There were two types of
comprehension questions, the 'mention' type and the 'content' type. To illustrate, a target sentence
like (33) could be followed by either of the comprehension questions in (34).
(34) a. Mention type
Câu trên có nhắc đến từ ‘ngôi trường’ không?
‘Was the word ‘school’ mentioned in the previous sentence?’
b. Relation type
Ông nha sĩ có cắt tóc cho ông kỹ sư không?
‘Did the dentist give the engineer a haircut?’
For targets, comprehension questions inquired about the content in either the main clause (i.e. the
beginning of the sentence) or in the prepositional phrases of the because-clause (i.e. the end of the
sentence). The questions never probe the referent of the pronoun. For fillers, comprehension
questions inquired about the content in all regions of the sentences (i.e. beginning, middle and
end). The numbers of yes and no answers were balanced across all targets and fillers.
2.1.2.3. Procedure
The study was conducted using the moving-window self-paced reading paradigm, using the
Linger software (written by Doug Rohde). Sentences first appeared on the screen as lines of
underscore dashes. Participants read sentences word-by-word by pressing the space bar. Each key
press revealed the next word while masking the previous word.
With regard to the x’s, it was explained to participants beforehand that some words may not be
displayed properly. In these cases, they would see x's in place of the actual word. Participants were
instructed to keep on reading when encountering x's and to try their best to understand the
sentences.
Since the Vietnamese lexicon contains many compounds (e.g. Emeneau, 1951), a full word
may be a combination of several words. For example, the noun ông kỹ sư ‘engineerOLD’ is a
compound consisting of three words. These were all displayed at the same time during self-paced
reading. In other words, to make the displays seem as natural as possible, sentences were
segmented so that a full compound word was shown on the screen all at once; we did not display
each sub-part separately.
Each target sentence, like example (33) above, was segmented into sixteen words, as
shown in example (35) below (‘/’ marks the boundary of each word). Furthermore, target sentences
were displayed in two separate lines. The line break occurs at the clause boundary (right before
'because'). This was done to ensure that there is no line break during the critical because-clause.
(35) Ông nha sĩ/khen/ông kỹ sư/
vì/ông ấy/trong/nhiều/năm/qua/đã/xxx/ông nha sĩ/trong/ngôi trường/ở/ngoại thành.
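The moving-window presentation of a segmented item such as (35) can be sketched as below. This is only an illustration of the display logic and not the Linger implementation; in particular, the function names and the choice to keep the spaces inside a masked compound visible are assumptions.

```python
def mask(segment):
    """Replace every non-space character of a segment with an underscore."""
    return "".join(" " if ch == " " else "_" for ch in segment)

def moving_window_frames(lines):
    """Yield one display per key press: first the fully masked sentence,
    then one frame per segment in which only that segment is visible."""
    positions = [(li, si) for li, segs in enumerate(lines)
                 for si in range(len(segs))]
    for current in [None] + positions:
        yield "\n".join(
            " ".join(seg if (li, si) == current else mask(seg)
                     for si, seg in enumerate(segs))
            for li, segs in enumerate(lines)
        )

# Segmentation of example (35); the line break falls at the clause boundary.
item_35 = [
    "Ông nha sĩ/khen/ông kỹ sư".split("/"),
    "vì/ông ấy/trong/nhiều/năm/qua/đã/xxx/ông nha sĩ/trong/ngôi trường/ở/ngoại thành.".split("/"),
]

frames = list(moving_window_frames(item_35))
print(frames[0])  # all dashes, before the first key press
print(frames[2])  # only the second segment is revealed
```

Because a compound such as ông kỹ sư is stored as a single segment, it is revealed in one key press, in line with the segmentation described above.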
Fillers also appear in two lines as shown in examples (36) and (37). Some fillers have a short first
line and a long second line similar to the targets as in (36), while others have a longer first line or
both lines of approximately equal length as in (37). This was intentional: When the sentence first
appears as lines of dashes, participants do not know whether the sentence is a target or a filler
merely by looking at the length of each line.
(36) Trung/người/Chương/đưa/gói hàng/
đã/lên đường/đi/công tác/xa/trên/xxxxx xx xxxx/màu xanh/được/vài/giờ.
‘Trung who Chương gave the package to left for a business trip on a blue xxxxx xx xxxx
a few hours ago.’
(37) Khi/các/bé mẫu giáo/đang/xếp/xx xxxx/trong/lớp học/
thì/có/một/đàn/chim bồ câu/bay/đến/đậu/quanh/cửa ra vào.
‘While the preschoolers were arranging xx xxxx in the classroom, a flock of doves
flew over and landed around the doorway.’
After reading each sentence, participants were presented with a comprehension question. To
answer the question, participants were instructed to press the F key for ‘Yes’ and the J key for
‘No’. If participants provided an incorrect answer, a message, “Sorry! This answer is incorrect.”,
would appear on the screen to remind participants to read more carefully next time. If participants
provided a correct answer, no message was displayed. Participants completed four practice
sentences before starting the experiment.
2.2. PREDICTIONS
In this study, the disambiguating noun region is the critical region of interest, because this is the
point where the intended referent of the preceding pronoun becomes clear. Thus, by looking at
reading times at the disambiguating noun, we can glean insights into how people interpreted the
preceding pronoun. Recall that this disambiguating noun logically has to refer to the other
character, i.e. whoever the pronoun does not refer to (to avoid a Binding Principle C violation).
Consider an item like ‘The dentistOLD.MALE praised the engineerOLD.MALE because heOLD for many
years has xxx the dentistOLD.MALE at the school in the outskirts.’ Here, if I assume that the pronoun
refers to the old dentist, then if the disambiguating noun turns out to be 'old dentist', this tells me
that my interpretation of the pronoun was incorrect. This is expected to cause processing
difficulties at the disambiguating noun region, detectable by means of reading time slowdowns.
However, if I interpret the pronoun as referring to the old engineer, then the disambiguating noun
'old dentist' is compatible with my pronoun interpretation -- since it refers to the other character.
Before discussing the predictions, let us take a look again at the conditions, depicted
schematically in (38) below. Example (33) above provides a concrete example. Example (38) below
shows that in the congruent conditions, the pronoun is disambiguated as referring to the preceding
object, while in the incongruent conditions, it is disambiguated as referring to the preceding
subject.
(38) a. same age_congruent (pronoun => object)
dentistOLD … engineerOLD because heOLD … dentistOLD
b. same age_incongruent (pronoun => subject)
engineerOLD … dentistOLD because heOLD … dentistOLD
c. different age_congruent (pronoun => object)
dentistYOUNG … engineerOLD because heOLD … dentistYOUNG
d. different age_incongruent (pronoun => subject)
engineerOLD … dentistYOUNG because heOLD … dentistYOUNG
Let us now consider the predictions for reading time patterns at the disambiguating noun. If
Vietnamese speakers have a preference to interpret pronouns as referring to preceding objects (as
the data in Chapters 3 and 4 suggests), then we expect the critical region (disambiguating noun) in
the congruent conditions (where the pronoun refers to the preceding object, in line with the verb
bias) to be read faster than the critical region in the incongruent conditions (where the pronoun
refers to the preceding subject).
In particular, in the same age conditions, where the age information on the pronoun
does not allow readers to identify the intended referent, we might expect to see faster reading times
in the object-referring congruent conditions than the subject-referring incongruent conditions -- if
it is the case that Vietnamese pronouns tend to be interpreted as referring to preceding objects,
especially in contexts where the implicit causality bias of the verb also favors the object.
In contrast, if online processing of pronouns in Vietnamese exhibits a (potentially transient)
subject preference (even in the presence of an object-biasing implicit causality verb), we may find
that the same age incongruent conditions are easier to process -- i.e. that the disambiguating
region is read more quickly -- than the same age congruent conditions.
What about the different age conditions? These conditions allow us to test the extent to
which the age information encoded on the pronoun guides online pronoun interpretation, and how
this information interacts with verb bias. By looking at effects of the age cues, I aim to contribute
to the debate regarding the effects of lexically-encoded features on reference resolution. As we
saw in Section 1.2 above, prior work on gender cues has led to divergent results.
If the age information provided on the pronoun plays a central role in pronoun interpretation
and has a stronger effect than the expectations created by verb bias, we expect no differences in
reading times between the congruent and incongruent conditions at the disambiguating noun. This
is because, under this view, comprehenders will simply use the age cue on the pronoun to identify
the intended referent and thus will know to expect the disambiguating noun in both conditions.
However, if both the age information on the pronoun and the expectations triggered by the
bias of the preceding verb guide pronoun resolution, then (i) reading times at the disambiguating
noun should be faster in the congruent condition than in the incongruent condition, but (ii) the
[different age_incongruent] condition should still be read faster than the [same age_incongruent]
condition, due to the presence of the age cue.
Regarding the effect of the age cue at the pronoun region: since previous work using the
self-paced reading task has not found significant differences in reading times between
ambiguous and unambiguous conditions (Stewart et al., 2000), reading times are unlikely to differ
between the same age and the different age conditions. However, it is possible
that an effect of the age cue emerges at or soon after the pronoun. In that case, one would expect shorter
reading times in the different age conditions, as the age cue can facilitate pronoun processing.
2.3. RESULTS
2.3.1. Data Analysis
Let us first examine participants’ performance on comprehension questions. On average,
participants answered 94% of the questions correctly and all participants answered at least 84% of
the questions correctly. Three participants who scored below 90% were eliminated from the
analysis.
With regard to reading time measurement at each word, it is common practice in self-paced
reading experiments to exclude outlying data points that are too fast (e.g. participants pressed a key by
mistake before they could read the word) or too slow (e.g. participants were not paying attention).
The reason for this practice is that outlying data can dramatically affect the average reading times
and distort the results. Thus, prior to computing the average reading time at each word position,
reading times shorter than 150ms and those longer than 3000ms were removed from the
analysis. In addition, reading times more than 2.5 standard deviations from the mean reading time
of a given word in a specific condition were also removed. In total, 10.84% of the data were
removed prior to analysis.
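The trimming procedure can be sketched as follows. This is an illustrative reimplementation rather than the analysis script: the row format, the function name and the toy data are hypothetical, and the sketch assumes that the 2.5 standard deviation criterion is computed after the absolute cutoffs have been applied, an ordering the text does not state.

```python
from collections import defaultdict
from statistics import mean, stdev

def trim_reading_times(rows, low=150, high=3000, n_sd=2.5):
    """rows: list of dicts with keys 'position', 'condition' and 'rt' (in ms).
    Step 1 drops RTs below `low` or above `high`; step 2 drops RTs more than
    `n_sd` standard deviations from the mean of their position x condition cell."""
    kept = [r for r in rows if low <= r["rt"] <= high]
    cells = defaultdict(list)
    for r in kept:
        cells[(r["position"], r["condition"])].append(r["rt"])
    bounds = {}
    for key, rts in cells.items():
        m = mean(rts)
        sd = stdev(rts) if len(rts) > 1 else 0.0
        bounds[key] = (m - n_sd * sd, m + n_sd * sd)
    return [r for r in kept
            if bounds[(r["position"], r["condition"])][0]
            <= r["rt"]
            <= bounds[(r["position"], r["condition"])][1]]

# Toy data for a single word position in a single condition (made-up values).
rts = [380, 390, 400, 410, 420, 395, 405, 415, 385, 400, 410, 900, 100, 3500]
rows = [{"position": 12, "condition": "same_congruent", "rt": rt} for rt in rts]

trimmed = trim_reading_times(rows)
removed_pct = 100 * (len(rows) - len(trimmed)) / len(rows)
print(f"{removed_pct:.2f}% of data points removed")
```

In the toy data, the 100ms and 3500ms values fail the absolute cutoffs and the 900ms value falls outside 2.5 standard deviations of its cell mean.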
Statistical analyses were performed using mixed-effect linear regressions with the lme4
(Bates et al., 2015) and lmerTest (Kuznetsova et al., 2019) packages in the R environment (R Core
Team, 2016) to assess the effects of the age configuration (i.e. same age vs. different age) and
congruency (i.e. whether the pronoun's referent matches the object bias of the verb) on reading
times at each word position. In the statistical
models, age cue and congruency were fixed effects, and participant and item were random effects.
The fixed effects were coded using simple difference sum coding. For age cue, cases where subject and
object have different age cues were coded as 1, and cases where they share the same age cue were coded
as -1. For congruence, cases where the pronoun interpretation is congruent with the verb bias were coded
as 1, and all others as -1. The fixed effect structure was kept constant across all models, because these
effects were theoretically motivated by the design of the study. I also included random intercepts
for participants and items as well as random slopes for the fixed effects by participants and by
items in the models.
In the following sections, I report results of the best-fit model among those that converge
(following Baayen et al., 2008; Jaeger, 2008) for each region of interest. In all analyses, the best-
fit model includes only random intercepts for participants and items.
2.3.2. Critical Region (Disambiguating Noun)
Figure 17 shows the mean word-by-word reading times in the four conditions. As seen in Figure
17, the disambiguating noun in the [same age_congruent] condition was read slower than in the
other three conditions. In other words, in the absence of a disambiguating age cue, when the pronoun is
disambiguated as referring to the preceding object, the disambiguating noun is read more slowly
-- suggesting that participants had not interpreted the preceding pronoun as referring to the
preceding object. Thus, we see no indication of an object preference in this condition: instead, the
results suggest that participants had interpreted the pronoun as referring to the preceding subject
and were surprised when the disambiguating region indicated otherwise.
Figure 17. How age cues and verb bias influence pronoun assignment.
The mean reading time for the disambiguating noun in the [same age_congruent] condition is
451.54ms compared to 422.67ms, 421.24ms, and 424.67ms for the [same age_incongruent],
[different age_congruent] and [different age_incongruent] conditions respectively. However, this
slowdown is only numerical and not statistically significant. There is no main effect of congruence
(β = 5.494, SE = 5.505, p = 0.319), no main effect of age (β = -7.913, SE = 5.508, p = 0.151) and
no interaction between congruence and age (β = -7.907, SE = 5.505, p = 0.151).
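To make the direction of these coefficients easier to read, the sketch below applies the +1/-1 coding described in Section 2.3.1 directly to the four condition means reported above. It is a worked illustration only: the reported estimates come from mixed-effects models fitted to trial-level data, so they need not equal these cell-mean contrasts, and the variable names are hypothetical.

```python
# Mean reading times (ms) at the disambiguating noun, as reported in the text.
cell_means = {
    ("same", "congruent"): 451.54,
    ("same", "incongruent"): 422.67,
    ("different", "congruent"): 421.24,
    ("different", "incongruent"): 424.67,
}

# Sum coding as described in the data analysis section.
AGE = {"different": 1, "same": -1}
CONGRUENCE = {"congruent": 1, "incongruent": -1}

def contrast(weight):
    """Average of the cell means weighted by a +1/-1 contrast."""
    return sum(weight(a, c) * m for (a, c), m in cell_means.items()) / len(cell_means)

grand_mean = contrast(lambda a, c: 1)
b_congruence = contrast(lambda a, c: CONGRUENCE[c])
b_age = contrast(lambda a, c: AGE[a])
b_interaction = contrast(lambda a, c: AGE[a] * CONGRUENCE[c])

print(f"grand mean  = {grand_mean:.3f}")     # 430.030
print(f"congruence  = {b_congruence:.3f}")   # 6.360
print(f"age         = {b_age:.3f}")          # -7.075
print(f"interaction = {b_interaction:.3f}")  # -8.075
```

Under this coding, a positive congruence coefficient means longer reading times in the congruent conditions, and a negative age coefficient means shorter reading times in the different age conditions. The cell-mean contrasts have the same signs and roughly the same magnitudes as the reported model estimates.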
As previously mentioned in Section 2.1.2.2.c, a possible complicating factor is the position of
the disambiguating noun: It occurs right after the placeholder x’s (which are standing in for the
verb). Upon encountering the x’s, participants may experience processing difficulty due to the
unexpectedness of seeing non-linguistic/non-meaningful elements. Consequently, possible effects
at the disambiguating noun may be masked by processing difficulty caused by the occurrence of
the x’s.
If processing difficulty triggered by the x’s is indeed masking effects of the experimental
manipulations at the disambiguating noun, one might expect these effects to be stronger early on
in the experiment. Over time, as participants become accustomed to the existence of x’s (in both
targets and fillers), the surprisal effect may subside -- which could make it easier to detect potential
effects of the experimental manipulations on the disambiguating noun.
To investigate this possibility, I analyzed the first half and the last half of the experimental
trials separately. Figure 18 shows the data from the first half of the experiment, and Figure 19
shows the data from the second half of the experiment.
Figure 18. Disambiguating noun – First half. Figure 19. Disambiguating noun – Last half.
As expected, in the first half of the experiment (Figure 18), participants’ reading times at the
disambiguating noun do not vary significantly across the four conditions.
In contrast, the patterns in the last half of the experiment show that the disambiguating noun
is read numerically more slowly in the [same age_congruent] condition than in the other three
conditions (Figure 19). Statistical analyses were conducted on the data from the second half of the
experiment to test the reliability of this observation. Since the model with both subjects and items
does not converge, I report here the by-subjects and by-items analyses separately. Based on the by-subjects
analysis, congruent conditions are read slower than incongruent ones (main effect of congruence,
β = 11.747, SE = 5.531, p < 0.05), and same age conditions are read marginally slower than different
age conditions (marginal main effect of age, β = -10.424, SE = 5.529, p = 0.06). Importantly,
we also find a marginal interaction between congruence and age (β = -10.558, SE = 5.526, p =
0.057), showing that verb bias has an effect on pronoun resolution in the ambiguous pronoun
conditions. However, no significant effects are found in the by-item analysis for congruence (β =
11.65, SE = 7.543, p = 0.123) or age (β = -9.683, SE = 7.554, p = 0.201) or for the interaction
between congruence and age (β = -9.956, SE = 7.556, p = 0.188).
2.3.3. Pronoun Region
Although the disambiguating noun is our main region of interest, the pronoun region can reveal
interesting information on how age cues embedded in Vietnamese pronouns are used in pronoun
assignment.
Looking at the pronoun region in Figure 17, we see that the pronoun in the [different
age_congruent] condition was read slower than in the other three conditions. The statistical
analysis revealed a main effect of congruence (β = 6.971, SE = 3.182, p < 0.05) but no effect of
age (β = 3.373, SE = 3.181, p = 0.289) and a marginal interaction between congruence and age (β
= 6.028, SE = 3.181, p = 0.058).
A closer look at the first and last halves of the experiment shows that this effect of congruence
only occurred at the beginning of the experiment. As seen in Figure 20, a marginal effect of
congruence is found in the first half, as the [different age_congruent] condition has longer reading
times than the other three conditions (β = -8.095, SE = 4.375, p = 0.065). In contrast, data
from the last half of the experiment show no effect of congruence (β = -6.6678, SE = 3.91, p =
0.089). Figure 21 shows no significant differences in reading times across conditions at
the pronoun region in the last half of the experiment. However, both of the congruent conditions
have slightly slower reading times than the incongruent ones.
2.4. DISCUSSION
Experiment 5 aimed to test whether the real-time processing of subject-position pronouns in
Vietnamese exhibits a preference for object referents (as the data from Chapters 3 and 4 might lead
us to expect) or whether there are any signs of a subject preference (as one might expect based
on the patterns observed in other languages). In addition, I was interested in the interplay between verb
bias and age cues in pronoun resolution.
Thus, the study was designed to compare pronouns that were ambiguous vs. unambiguous (in
terms of whether the age information on the pronoun allows for unique identification of the
antecedent) and whose reference was congruent or incongruent relative to the preceding verb's
implicit causality bias. The key question is whether, upon encountering an ambiguous pronoun,
Vietnamese speakers would resolve it based on verb information (i.e. the object bias seen
in Chapters 3 and 4) or on a structural factor (i.e. the subject bias seen in Chapter 2).
The results of Experiment 5 suggest that Vietnamese speakers do not have a bias for the
grammatical object during online processing of pronouns. In the [same age_congruent] condition,
participants slowed down at the disambiguating noun despite the fact that this interpretation is
expected given the object-biased verbs. In contrast, the [same age_incongruent] condition shows no
slowdown at the disambiguating region and is read as fast as the different age conditions.
Figure 20. Pronoun – First half. Figure 21. Pronoun – Last half.
Taken as a whole, these results point to a subject bias, contradicting previous findings of an object bias for
pronoun interpretation in Vietnamese. One possible explanation is the discrepancy between offline
and online tasks. When providing offline judgments, participants read the full sentences and decide
which referent the pronoun refers to. In this case, all of the information about verbs’ implicit
causality and the referents were fully processed before the judgements were given. In contrast,
during the self-paced reading task, participants must process information incrementally and the
identity of the pronoun is only revealed at the disambiguating region. Therefore, when verb
information is still being processed, other factors such as subjecthood can still affect pronoun
assignment.
I also found that participants took longer to read the pronoun in the [different age_congruent]
condition. This pattern was not predicted. It is unclear why the slowdown only occurs in this
particular condition and not in both of the different age conditions (i.e. congruent and incongruent),
where the age cue is present. One possible explanation is that participants were surprised by
the configuration of age cues being used. More specifically, after encountering an event with both
young and old referents where the young referent is the initiator (i.e. the subject), participants did
not expect the old referent to occur next as the pronoun in the because-clause. Since there is no
ambiguity between the referents in this condition and the slowdown does not occur in the condition
where the old referent is the initiator, this might be a pragmatic effect rather than an effect of verb
bias or age cue.
3. Experiment 6 – Subject preference
Experiment 5 suggests that during online processing of ambiguous pronouns, Vietnamese speakers
initially interpret the pronoun as referring to a preceding subject. Specifically, in the ambiguous
pronoun conditions (i.e. same age conditions), participants took longer to read the same
age_congruent condition (i.e. pronoun referring to the object, matching the verb bias) than the same
age_incongruent condition (i.e. pronoun referring to the subject, against the verb bias). However,
this slowdown could also result from a surprisal effect upon encountering the subject referent being
rementioned as the disambiguating noun. In fact, evidence from prior work shows that this is
a possibility. Kaiser (2009), investigating the likelihood of rementioning referents in discourse,
found that if the subject pronoun is used to refer to the highly salient referent in discourse,
participants are also likely to remention the less salient referent. In contrast, once participants use
the subject pronoun to refer to the less salient referent, they are much less likely to remention the
highly salient one. Taking this into account, the use of object-biased verbs in the because-construction
in Experiment 5 pushes pronoun interpretation toward the object referent (Chapter 4)
and thus makes it the more salient referent in discourse. Therefore, if participants rapidly use verb
information in processing and interpret the pronoun as referring to the object (i.e. same
age_congruent condition), this may decrease the likelihood that the subject referent is rementioned.
If the subject referent is not expected to occur again, participants may be surprised when they
encounter the subject referent at the disambiguating region. In this light, the effect found in the
same age_congruent condition in Experiment 5 may be a surprisal effect and not an effect of the subject bias
during pronoun resolution.
To ensure that the effect in Experiment 5 is due to the subject preference, I conducted
Experiment 6 to investigate whether the object verb bias results in a slowdown upon encountering
the subject referent being rementioned at the disambiguating region. I vary the degree of verb bias
in Experiment 6 by using equi-biased and object-biased verbs, which in turn changes the likelihood
of reoccurrence of the subject referent. If reading times at the subject referent’s reoccurrence (i.e.
the disambiguating noun/region) vary across the equi-biased and the object-biased conditions, it
can be said that the effect in Experiment 5 is the result of a surprisal effect. In contrast, if there
are no differences in reading times across conditions, it can be concluded that the effect in
Experiment 5 is the result of the subject preference in pronoun assignment.
3.1. METHODS
3.1.1. Participants
Thirty-eight Vietnamese native speakers participated in this study. Participants had not lived
outside of Vietnam for more than 6 months and had not participated in any previous experiments.
3.1.2. Materials and Design
Similar to Experiment 5, congruence (i.e. whether the pronoun refers to the same referent inferred
from the verb bias) was manipulated. In addition, I also manipulated verb bias, i.e. whether the
verb was object-biased or equi-biased (meaning that it did not show a preference for preceding
subjects or objects in implicit causality-type contexts with a because connective). This yields a
2x2 design. All referents are unambiguous with respect to age: Sentences always had one young
and one old referent. An example is shown in (39) below. The object-biased items were identical
to the ones used in Experiment 5, examples (33c and d). The equi-biased items were chosen
from the first clause plausibility norming in Section 2.1.2.1 of Experiment 5. All items have the
same role nouns across all conditions.
(39) a. different age_object-biased verb (pronoun as object matching verb bias)
Anh nha sĩ khen ông kỹ sư vì ông ấy… anh nha sĩ …
young.male.dentist praise old.male.engineer because heOLD … young.male.dentist
‘The dentistYOUNG.MALE praised the engineerOLD.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
b. different age_object-biased verb (pronoun as subject against verb bias)
Ông kỹ sư khen anh nha sĩ vì ông ấy … anh nha sĩ …
old.male.engineer praise young.male.dentist because heOLD … young.male.dentist
‘The engineerOLD.MALE praised the dentistYOUNG.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
c. different age_equi-biased verb (pronoun: no preference for subject nor object)
Anh nha sĩ hẹn với ông kỹ sư vì ông ấy… anh nha sĩ …
young.male.dentist meet with old.male.engineer because heOLD … young.male.dentist
‘The dentistYOUNG.MALE met with the engineerOLD.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
d. different age_equi-biased verb (pronoun: no preference for subject nor object)
Ông kỹ sư hẹn với anh nha sĩ vì ông ấy … anh nha sĩ …
old.male.engineer meet with young.male.dentist because heOLD young.male.dentist
‘The engineerOLD.MALE met with the dentistYOUNG.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
3.1.3. Procedure
The procedure was identical to that of Experiment 5.
3.2. PREDICTIONS
If the strong object bias affects the likelihood of rementioning of the subject referent in the
because-clause, reading times at the disambiguating noun region should be longer in the object-
biased verb conditions than in the equi-biased verb conditions. In contrast, if verb bias does not
affect the remention rate of the subject referent, there should be no difference in reading times at
the disambiguating noun across all conditions.
3.3. RESULTS
Prior to statistical analysis, four participants who scored below 91% correct on
comprehension questions were eliminated. (They were identified as outliers based on a boxplot.)
This leaves us with a total of 34 participants. Reading times that are too short (<150ms) or too long
(>3000ms) or more than 2.5 standard deviations away from the mean based on word position and
condition were also eliminated from the analysis. The total amount of data points lost after this
process is 13.3%.
Overall, reading times across all four conditions do not differ. This can be seen in Figure 22.
Mixed-effect linear regression models were used to assess the effects of congruency (i.e. whether
the pronoun's referent matches the verb bias) and verb bias (i.e. object-biased verbs vs. equi-biased
verbs) on reading times at the
disambiguating noun and at the pronoun. In the statistical models, congruency and verb bias were
fixed effects and participant and item were random effects. The fixed effects were coded using
simple difference sum coding. For congruence, cases where the pronoun interpretation is congruent with
the verb bias were coded as 1, and all others as -1. For verb bias, object-biased verbs were coded
as 1 and equi-biased verbs as -1. Random intercepts for
participants and items as well as random slopes for the fixed effects by participants and by items
were also included in the models. The results I report below are from the models that converge,
which include only random intercepts for participants and items.
At the disambiguating region, there is no effect of congruence (β = -2.081, SE = 5.1547, p = 0.687),
no effect of verb bias (β = -0.491, SE = 5.153, p = 0.924) and a marginal interaction (β = -9.667,
SE = 5.153, p = 0.061). Interestingly, in contrast to Experiment 5, congruence does not have an
effect at the pronoun (β = -2.279, SE = 3.39, p = 0.5) nor does verb bias (β = -2.279, SE = 3.39, p
= 0.5) although there is a marginal interaction (β = -2.279, SE = 3.39, p = 0.5).
Figure 22. The use of x’s and reading times at disambiguating noun and at the pronoun.
4. General Discussion
This chapter investigates (i) whether the object bias found in Chapters 3 and 4 is present during
online pronoun processing in Vietnamese and (ii) the extent to which age cues can guide pronoun
resolution in the presence of verb bias.
To address these questions, I used the self-paced reading paradigm to detect the effects of verb
bias and age cues in Experiment 5. I measured Vietnamese speakers’ reading times at the
disambiguating noun region (i.e. the critical region) to see whether they experienced any
processing difficulties, detectable by means of reading time slowdowns, once the intended referent
of the preceding pronoun becomes clear. Let us first consider the same age conditions, where
the age information on the pronoun does not allow readers to identify the intended referent.
Interestingly, I found that participants slowed down after encountering the
disambiguating noun in the congruent condition in which the pronoun is interpreted as referring to
the object as inferred by the verb bias. In contrast, no slowdown occurred in the incongruent
condition where the pronoun refers back to the subject. These results show that Vietnamese
speakers have a tendency to interpret pronouns as referring to subject antecedents. This finding is
in line with the results found in Experiment 1, Chapter 2 in which pronouns in subject position are
mostly used to refer to subject antecedents (i.e. Subject parallelism). With regard to the object bias
found in Experiment 2, 3 (Chapter 2) and 4 (Chapter 4), these results suggest that verb bias
information is incorporated toward the end of sentence processing rather than during the
processing of the pronoun itself.
Let us now turn to the different age conditions. These conditions allow us to test the extent
to which the age information encoded on the pronoun guides online pronoun interpretation, and
how this information interacts with verb bias. At the disambiguating noun, no slowdown was
detected: participants read these conditions as fast as they read the same age_incongruent condition,
in which the pronoun refers back to the subject. This suggests that age cues are successfully used to
guide pronoun resolution with little effect of verb information. Looking at the pronoun region,
where the pronoun features are processed, there seem to be some pragmatic effects, with the old
pronoun being read slower than the young pronoun. This may be due to the order of the referents in
the main clause (i.e. young-old vs. old-young).
Finally, there is a concern that the effects found in Experiment 5 do not come from
a subject preference during online processing but from a surprisal effect upon
encountering the object (i.e. participants did not expect a transitive event in the because-clause).
As shown in Kaiser (2009), when the subject pronoun is interpreted as referring to the less salient
object antecedent, the rementioning rate of the subject antecedent decreases. Since Experiment 5 uses
object-biased verbs, which lead to an object interpretation for the pronoun in question, one may
worry that seeing the subject mentioned again as the disambiguating noun is
unexpected. Experiment 6 addresses this concern. Taking the different age conditions from
Experiment 5, I varied the degree of verb bias (i.e. object-biased verbs vs. equi-biased verbs). I found
no differences at the disambiguating noun region across conditions, suggesting
that the slowdown at the disambiguating noun in Experiment 5 is indeed an effect of the subject
preference and not a surprisal effect caused by the rementioning of the subject referent.
Interestingly, the effect at the pronoun region found in the different age conditions in Experiment
5 was not found in Experiment 6. Therefore, this may be an epiphenomenon rather than an effect
of age cues.
Conclusion
This dissertation provides a comprehensive picture of Vietnamese pronoun behaviors which
broadens our understanding of long-standing claims in pronoun resolution and also recognizes the
unique features of Vietnamese pronouns. The widely-known salience-hierarchy approach states
that referential form use is largely influenced by referents’ degrees of salience (Ariel, 1990; Givón,
1983; Gundel et al., 1993). According to this approach, the more salient the referent, the more
reduced the referential form is, and referential forms are mapped onto a unified hierarchy. However,
defining which referent is ‘more salient’ is not always straightforward as salience can be modulated
by a range of factors such as grammatical roles (e.g. Chafe, 1976; Crawley & Stevenson, 1990;
Crawley, Stevenson & Kleinman, 1990; Carminati, 2002), structural parallelism (e.g. Smyth,
1994; Chambers and Smyth, 1998), topicality (e.g. Givón, 1983; Rohde & Kehler, 2014), discourse
coherence (e.g. Hobbs, 1979; Kehler et al., 2008) and verbs’ implicit causality (e.g. Garnham et
al., 1992; Garvey & Caramazza, 1974; McDonald & MacWhinney, 1995). Furthermore, it is
possible that each type of referential form may exhibit different degrees of sensitivity to these
factors. In fact, more recent work has pointed out that referential form use cannot easily be captured
based on a uniform salience scale. Kaiser and Trueswell (2008), studying Finnish pronouns and
demonstratives, found that antecedents’ grammatical roles and position have varying effects on
the two referential forms. As a result, they proposed a form-specific multiple-constraints approach
to account for this phenomenon. In this dissertation, I investigate Vietnamese speakers’ use of
referential forms with respect to these approaches. Vietnamese is an interesting case study due to
its typologically distinct pronoun system which has both null pronouns (i.e. zero anaphora/empty
pronouns) and overt kinship-term pronouns (e.g. ông ‘grandfather/heOLD’). In addition to
referential form use, I also examine how different factors guide online pronoun processing.
Looking at real-time processing provides fine-grained information and insights that help to shed
light on some seemingly contradictory results that I obtained in offline studies. Taken together, the
studies in this dissertation highlight the impact of universal and language-specific features on
pronoun resolution.
1. Effects of structural and discourse factors on Vietnamese null vs. overt pronoun use
The first part of the dissertation (Chapter 2, Experiment 1) examines the effects of structural
factors on Vietnamese referential form production (e.g. null pronouns, overt pronouns, NPs).
Numerous studies have shown that the grammatical role of the antecedent (i.e. subject vs. object)
(e.g. Chafe, 1976; Crawley & Stevenson, 1990; Gordon, Grosz, & Gilliom, 1993) plays a crucial
role in the comprehension and production of referential forms. Nevertheless, in languages with
both null and overt pronouns (e.g. Italian, Spanish, Japanese, Chinese), the extent to which the
antecedent’s grammatical role can affect null vs. overt pronoun choice varies. On the one hand,
there is a (relatively) clear division of labor between null and overt pronouns in agreement pro-
drop languages (e.g. Italian and Spanish) with respect to antecedent’s grammatical role.
Specifically, other things being equal, null pronouns are typically used to refer back to referents in
subject position, whereas overt pronouns are used for object referents (Alonso-Ovalle et al., 2002;
Carminati, 2002; but see also Fedele, 2016). On the other hand, in discourse pro-drop
languages (e.g. Chinese, Japanese), the grammatical role of the antecedent seems to have the same
effect on both null and overt pronouns: Both pronoun types can be used to refer back to the subject
antecedent. Therefore, the influence of the antecedent’s grammatical role on null vs. overt pronoun use needs
to be investigated further.
In addition to the grammatical role of the antecedent, prior work on grammatical
parallelism (e.g. Sheldon, 1974; Smyth, 1994; Chambers and Smyth, 1998) shows that the
grammatical role of the pronoun itself can affect how speakers interpret pronouns. Specifically,
subject pronouns tend to be interpreted to refer back to the subject antecedent while object
pronouns are often interpreted as referring to the object antecedent. However, studies on
grammatical parallelism effects have largely focused on languages with English-type pronouns and
not languages with both null and overt pronouns. Consequently, little is known about how
grammatical parallelism can play a role in null vs. overt pronoun use. The lack of research could
be due to the fact that not all languages allow for null pronouns to occur in object position. In this
sense, Vietnamese null and overt pronouns are an ideal case study since they can occur in both
subject and object position and can be used to refer to either the subject or object antecedent.
In the narrative experiment (Chapter 2, Experiment 1), I investigate when Vietnamese
speakers choose to produce null and overt pronouns and how grammatical roles and grammatical
parallelism can influence speakers’ choices. As a result, the analysis of this narrative study
(Experiment 1) includes details of the antecedent’s and the anaphoric expression’s grammatical
roles. These details set this study apart from prior narrative work which only reports the overall
number of each type of referential form (Christensen, 2000; Clancy, 1982).
In Experiment 1, I found that grammatical roles and grammatical parallelism are crucial
factors influencing referential form choice in Vietnamese. Specifically, speakers use significantly
more pronouns (null and overt pronouns combined) when both the antecedent and the anaphoric
expression are in subject position (i.e. Subject parallelism). In contrast, when parallelism is absent
or in configurations with Object parallelism, speakers produce mostly NPs. Interestingly, there are
also some hints of a parallelism effect in the Object parallelism case, as more pronouns are used
than in the non-parallel ones. Still, the patterns in Subject and Object parallelism configurations
differ, indicating the importance of grammatical role. The results in Experiment 1 highlight the
effects of structural factors, especially the importance of considering the grammatical roles of not
only the antecedent but also the anaphoric expression (i.e. parallelism) in investigating referential
form choice.
Another finding from the narrative study (Experiment 1) has to do with crosslinguistic
variation in overt pronoun use. Considering the rare occurrences of Japanese overt pronouns which
are derived from nouns and are semantically rich (Clancy, 1982; Hinds, 1975), one might expect
the same restriction for Vietnamese overt kinship-term pronouns. Nevertheless, I find that
Vietnamese overt pronouns are used as frequently as null pronouns. This emphasizes the need for
further work on more typologically distinct pronoun systems.
Turning to the null vs. overt pronoun distinction, I found no differences in how Vietnamese
speakers use the two pronoun types in this narrative study (Experiment 1). These patterns suggest
that Vietnamese null and overt pronouns are similar to null and overt pronouns in Chinese and
different from those in Italian and Spanish: They do not differ from each other in terms of how
sensitive they are to grammatical roles and grammatical parallelism. Importantly, the lack of
variation between null and overt pronoun use poses a challenge to the salience-hierarchical
approach (Ariel, 1990; Givón, 1983) which suggests that different degrees of salience are
represented by different referential forms. Thus, the question remains whether there is a division
of labor between null and overt pronouns in Vietnamese and whether it can be better captured with
the salience-hierarchical approach or with the form-specific multiple-constraints approach.
Let us now turn to other factors that may affect referents’ degrees of salience, beyond those
considered in Experiment 1. First, it is important to note that the notion of topic – which is often
viewed as related to subjecthood – is well-known to promote referents’ salience (e.g. Givón, 1983).
However, being a grammatical subject does not always entail being a topic (Lambrecht, 1994).
Since the narrative analysis does not separate these two notions, it is possible that the equal use of
null and overt pronouns is a consequence of some subjects being more topical than others. Second,
the narrative analysis also did not consider the role coherence relations may play in pronoun use.
As shown in Kehler (2002), Kehler and Rohde (2013) and many others, the production and
comprehension of pronouns can be influenced by the type of coherence relation which exists
between two clauses. These effects have also been confirmed in discourse pro-drop languages.
Simpson et al. (2016), studying pronoun use in Chinese, found that coherence relations affect both
the likelihood of mention and referential form use.
To tease apart the effects of grammatical roles and topicality, in Chapter 3 (Experiments 2
and 3), I use passivization as a tool to manipulate the likelihood of a referent being the topic while
keeping the grammatical subject role constant. Crucially, only one type of coherence relation, the
Explanation relation, is used. Following Rohde and Kehler (2014), I employed a sentence
completion task in which participants read active/passive sentence fragments such as ‘The engineer
thanked the driver/was thanked by the driver because …/he …/ASP …’ and provided continuations
for the fragments. When the fragments end with because, participants can freely choose
referential forms (i.e. production task). When they end with he or the aspect marking (indicating
a null pronoun), participants must interpret the overt or null pronoun before providing the
continuations (i.e. comprehension task).
The results of Experiments 2 and 3 reveal that topicality is a crucial factor influencing
Vietnamese speakers’ choice of null vs. overt pronouns. In both comprehension and production,
participants exhibit a strong tendency to refer back to the topicalized subjects of passive sentences
compared to the subjects or objects of active sentences. More importantly, when looking at null
vs. overt pronoun use, I found that null pronouns are the preferred form when referring to the
topicalized subjects of passives. In contrast, there are no differences between null and overt
pronoun use in actives. Interestingly, this pattern of null pronoun preference only occurs in the
production task where speakers could choose which referential form to use and which antecedent
to refer back to. When speakers had to interpret the given null or overt pronouns prior to providing
their own continuations (i.e. comprehension task), no difference was found between the two
pronoun types: Both null and overt pronouns are equally likely to be interpreted as referring back
to the subjects of passives. These findings suggest that it is important not only to examine different
factors but also to examine both comprehension and production when investigating pronoun use.
What do the findings in Experiments 1-3 tell us about Vietnamese pronouns? First, let us return
to the two approaches to pronoun resolution discussed previously. The salience-hierarchical
approach arranges the different types of referential form on a unified hierarchy (Ariel, 1990;
Givón, 1983; Gundel et al., 1993). As a result, different referential forms should have different
patterns of use. In contrast, the form-specific multiple-constraints approach (Kaiser & Trueswell,
2008) allows for the forms to have varying degrees of sensitivity to a range of factors. Let us now
consider the results of Experiments 1-3 in Chapters 2 and 3. Experiment 1 (narrative production)
suggests that null and overt pronouns show similar patterns when it comes to the grammatical role
of the antecedent. Experiments 2-3 show that once we consider cases where topicality – a
discourse-level factor – is more clearly marked, null and overt pronouns diverge in their patterns:
Null pronouns are the preferred form used for topical referents. Since Vietnamese null and overt
pronouns exhibit varying degrees of sensitivity to structural and discourse factors, the
form-specific multiple-constraints approach is more suitable to account for the findings.
It should also be noted that in Experiments 1-3, I not only investigate the roles of structural
and discourse factors on null and overt pronoun use but I also examine whether and how these
underlying mechanisms can be affected by the distinction between spoken and written language
(i.e. modality). Previous work using corpus data and discourse analysis concluded that modality
can influence the use of referential expressions (see Chafe and Tannen (1987) for an overview). In
particular, prior work suggests that the use of nouns is more prevalent in writing, whereas pronouns
are (relatively) more frequent in speech. This claim is supported by previous work examining
written language (e.g. newspapers and academic essays), which often contains a higher number of
referents than speech (e.g. Tannen, 1982; Biber et al., 1999). Other work looking specifically at
null and overt pronoun use has claimed that null pronouns are used more in written than in spoken
narratives (Li and Thompson, 1979; Christensen, 2000). However, these previous studies only
report the number of tokens without specifying the environment of occurrence. In Experiments 1-
3, I examine the effect of modality on how referential forms are used with respect to structural and
discourse effects, in order to more directly assess potential effects of modality on the mechanisms
licensing null and overt pronouns in Vietnamese. I show that modality does not influence null and
overt pronoun use in Vietnamese in either the narrative study (Experiment 1) or in the sentence
completion study (Experiments 2 and 3). Thus, the modality effect reported in prior work is likely
to be an epiphenomenon rather than an effect on the pronoun licensing mechanisms.
2. Subject vs. Object bias during online pronoun processing
The last two chapters, Chapters 4 and 5, of the dissertation address the conflicting findings in
the previous chapters (Chapters 2 & 3) regarding the effect of grammatical roles on Vietnamese
pronoun use. On the one hand, the narrative study (Chapter 2, Experiment 1) found that
Vietnamese speakers have a strong tendency to continue to refer back to referents in the
grammatical subject position. On the other hand, the sentence completion studies (Chapter 3,
Experiments 2 and 3) revealed a preference to continue with reference to the preceding object.
These seemingly conflicting results raise an important question with regard to the interaction
between the grammatical subject vs. object role and saliency in Vietnamese. In addition, the idea
that the object in Vietnamese is more salient than the subject is very controversial considering the
well-known crosslinguistic subject preference in pronoun resolution literature (e.g. Chafe, 1976;
Crawley & Stevenson, 1990; Gordon et al., 1993). Thus, it is necessary to take a closer look at
how Vietnamese speakers process pronouns to better understand these patterns.
Prior to conducting the online experiments on pronoun processing, in Chapter 4, I address an
important concern regarding the design of the sentence completion task (Experiments 2 and 3) in
which the object preference was found. In these experiments, my choice of equi-biased verbs (i.e.
verbs that do not have a strong bias toward either the subject or the object when used in an
explanation relation) was based on the English study by Hartshorne & Snedeker (2013). The reason
for choosing equi-biased verbs was to ensure the chance of detecting any potential differences
between null and overt pronouns (since strongly object-biased or subject-biased verbs could have
yielded floor or ceiling effects that could have masked effects of pronoun form). Although I chose
verbs with maximally similar meanings in English and Vietnamese, one may still wonder whether
the unexpected object preference in Vietnamese could be due to me having inadvertently chosen
verbs that, although equi-biased in English, have an object bias in Vietnamese when used in the
explanation frame that I tested in Experiments 2 and 3. If this is the case, the object bias I observed
for Vietnamese would not be unexpected.
To avoid these kinds of complications for the online processing study reported in Chapter 5,
I conducted a large-scale norming study of Vietnamese implicit causality verbs (Chapter 4,
Experiment 4). It should be noted that there is a lack of large-scale studies of verbs in
non-Indo-European languages and therefore, it is unclear to what extent the classification of implicit
causality verbs (i.e. the Revised Action-State Distinction (Brown & Fish, 1983b; Au, 1986)) holds
crosslinguistically. It is also not known to what extent verbs’ individual referential biases (i.e.
subject vs. object bias) may vary from one language to another. To understand these properties in
Vietnamese verbs, Experiment 4 examined a total of 162 verbs in Vietnamese and compared their
referential biases with those in the English equivalents (taken from Ferstl et al., 2011). The results
of Experiment 4 show that even though there is an overall strong correlation between Vietnamese
and English verbs, Vietnamese implicit causality verbs have a stronger object bias than English
ones. Interestingly, despite prior results that Stimulus-Experiencer and Experiencer-Stimulus
categories have consistent results across languages (Hartshorne et al., 2013, on 8 languages), in
Experiment 4, I find that Stimulus-Experiencer verbs in Vietnamese have an object bias in contrast
to the subject bias found in English and other languages. Crucially, looking at the set of 24 verbs
used in the sentence completion study (Chapter 3), I found that overall, these verbs are also equi-
biased in Vietnamese (44.85% subject bias). Consequently, the object bias found in the sentence
completion study is not a result of verb choice. Further work is needed to understand this
unexpected object bias.
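The subject-bias percentage reported above can be illustrated with a small computation. The following Python sketch is not the analysis used in Experiment 4. It assumes that each continuation has already been hand-coded as referring to the subject, the object, or something else, and the function names, the 15-point margin for calling a verb equi-biased, and the toy counts are all hypothetical choices made for illustration.

```python
from collections import Counter


def subject_bias(continuations):
    """Return the percentage of subject-referring continuations.

    Only continuations coded "subject" or "object" enter the calculation;
    any other code (e.g. "other") is ignored.
    """
    counts = Counter(continuations)
    n_subject = counts["subject"]
    n_object = counts["object"]
    total = n_subject + n_object
    if total == 0:
        raise ValueError("no subject- or object-referring continuations")
    return 100.0 * n_subject / total


def classify(bias, margin=15.0):
    """Label a verb by its subject-bias percentage.

    The margin around 50% is a hypothetical cutoff, not a value
    taken from the dissertation.
    """
    if bias >= 50.0 + margin:
        return "subject-biased"
    if bias <= 50.0 - margin:
        return "object-biased"
    return "equi-biased"


# Toy, invented codings for one verb: 9 subject, 11 object, 2 other.
toy = ["subject"] * 9 + ["object"] * 11 + ["other"] * 2
bias = subject_bias(toy)
label = classify(bias)
print(bias, label)
```

With these toy codings the subject bias is 9 out of 20 usable continuations, which falls inside the assumed equi-biased band. A set of verbs could be summarized the same way by averaging the per-verb percentages.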
The object bias found in Chapters 3 and 4 (Experiments 2-4) leads to an intriguing question: Do
Vietnamese speakers exhibit this object bias during real-time processing as well, or is it something
that emerges in off-line data? Or, putting it differently, could it be that real-time processing of
Vietnamese pronouns shows evidence for a subject bias, even if this is not present in offline data?
This is a question I investigate in Experiments 5 and 6 (Chapter 5). Additionally, I am
interested in the role that age cues, which are encoded on Vietnamese kinship-term overt pronouns,
play in pronoun assignment. Experiments 5 and 6 used the self-paced reading paradigm to
investigate the subject vs. object bias as well as age cue effects during online processing of
pronouns. Reading times are used as an indication of whether participants experience processing
difficulties (i.e. when something does not match participants’ expectations). Participants read
sentences similar to those used in the sentence completion study in which the main clause consists
of two same-gender role nouns and because is used as the connective (e.g. ‘The dentistOLD.MALE
praised the engineerOLD.MALE because heOLD for many years has xxx the dentistOLD.MALE at the school
in the outskirts’). The verbs in the main clause (e.g. praise) are object-biased verbs chosen from
Experiment 4 (Chapter 4). The x’s stand in for a verb, so that participants maintain their
interpretation of the pronoun he until they reach the disambiguating noun, the dentist, in the
because-clause. At the disambiguating noun region (e.g. the dentistOLD.MALE in the because-clause)
the identity of the pronoun he is revealed and reading times in this region can show us whether the
intended referent of the pronoun matches participants’ expectation.
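The logic of this region-based comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the statistical analysis reported in Chapter 5: the record layout (condition, region, rt_ms), the condition labels and the reading times are all invented for the example.

```python
from statistics import mean


def mean_rt_by_condition(trials, region):
    """Average reading time (ms) per condition within a single region."""
    by_condition = {}
    for trial in trials:
        if trial["region"] == region:
            by_condition.setdefault(trial["condition"], []).append(trial["rt_ms"])
    return {cond: mean(rts) for cond, rts in by_condition.items()}


# Invented toy records; "subject_referent" means the disambiguating noun
# reveals that the pronoun referred to the subject, "object_referent"
# that it referred to the object.
trials = [
    {"condition": "subject_referent", "region": "disambiguating_noun", "rt_ms": 400},
    {"condition": "subject_referent", "region": "disambiguating_noun", "rt_ms": 420},
    {"condition": "object_referent", "region": "disambiguating_noun", "rt_ms": 480},
    {"condition": "object_referent", "region": "disambiguating_noun", "rt_ms": 500},
    {"condition": "object_referent", "region": "pronoun", "rt_ms": 390},
]

means = mean_rt_by_condition(trials, "disambiguating_noun")
# Positive value: slower reading when the pronoun turns out to be the object.
slowdown = means["object_referent"] - means["subject_referent"]
print(means, slowdown)
```

A positive difference at the disambiguating noun would be consistent with an initial subject interpretation that has to be revised. Real data would call for inferential statistics over participants and items, not a raw difference of means.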
I found that Vietnamese speakers have a tendency to initially associate the pronouns with
preceding subject antecedents. This finding is in line with the results found in Experiment 1,
Chapter 2 in which pronouns in subject position are mostly used to refer to subject antecedents
(i.e. Subject parallelism). With regard to the object bias found in Experiments 2 and 3 (Chapter 3) and
Experiment 4 (Chapter 4), these results suggest that verb bias information is incorporated toward the end of
sentence processing rather than during the processing of the pronoun itself. Regarding the use of
age cues embedded on the pronouns, I found that Vietnamese speakers rapidly use age cues to
successfully resolve the pronoun in the presence of verb bias.
There is a concern that the slowdown in the same age condition in (40a) could be a surprisal
effect upon encountering the re-mention of the subject referent in the because-clause rather than
an effect of the subject preference. I conducted Experiment 6 to confirm that it is indeed the subject
preference effect that was found in Experiment 5. The reasoning behind a possible surprisal effect comes from
Kaiser (2009), who found that if a subject pronoun is interpreted as referring to the less salient
object, the likelihood that the highly salient referent (i.e. subject) will be re-mentioned later in the
sentence decreases. Thus, one may wonder whether the slowdown that occurred at the
disambiguating noun (i.e. the dentist) in (40a) could be the result of a surprisal effect. In
Experiment 6, I vary the degree of verb bias (i.e. object-biased verbs vs. equi-biased
verbs) to modulate the salience of the referents in discourse, which in turn varies the likelihood of
re-mentioning the subject referent. If the slowdown is indeed due to a surprisal effect, there should
be differences in reading times between the object-biased and the equi-biased conditions when
participants encounter the subject referent at the disambiguating noun. The results of Experiment
6 show no differences at the disambiguating noun region across all conditions, suggesting that the
slowdown at the disambiguating noun in Experiment 5 is indeed an effect of the subject preference
and not a surprisal effect caused by the re-mentioning of the subject referent.
In short, Experiments 5 and 6 provide evidence that during online processing, Vietnamese
speakers initially assign the pronoun to the subject antecedent. When we combine these findings
with the observations from the sentence completion study (Chapter 3, Experiments 2 & 3) that there
exists an object preference in offline data, the picture that emerges is that verbs’ information seems
to contribute – and guide people towards an object interpretation – at a somewhat later point during
processing.
What can be said about the late effect of verbs’ implicit causality information? In fact, this
question has long been an interesting debate between two accounts, the clausal integration account
and the immediate focusing account. On the one hand, the clausal integration account claims that
verbs’ implicit causality has a late effect during comprehension, at or toward the end of the
sentence (e.g. Garnham et al., 1996; Stewart et al., 2000). On the other hand, the immediate
focusing account suggests that implicit causality information takes effect much more rapidly
during comprehension, at or soon after the encounter of the verb. Various criticisms have been
raised with regard to the methods used in these prior studies. Thus, in Experiments 5 and 6 (Chapter
5), I implemented the word-by-word self-paced reading task which has been shown in recent work
to be a better method for detecting implicit causality effects (Koornneef & van Berkum, 2006; see
also Nicol & Swinney, 2003; Pyykkönen & Järvikivi, 2010 on the use of a spoken language
comprehension task). Interestingly, even though Koornneef and van Berkum found evidence for
the immediate focusing account in their study, the results in Experiment 5 in this dissertation
support the clausal integration account. Specifically, the results of Experiments 5 and 6, put
together, suggest that verbs’ information is not integrated until much later on, toward the end of
the sentence, allowing for an initial interpretation of the pronoun solely based on the grammatical
factor of subjecthood. Consequently, further work is needed to examine the time course of implicit
causality in sentence comprehension.
In sum, the work in this dissertation shows that even though structural and discourse factors
may influence pronoun use across languages, how their effects manifest may vary depending on
the language. The results of the current work are in line with Kaiser & Trueswell’s (2008)
form-specific multiple-constraints approach. Based on data from Finnish overt pronouns and anaphoric
demonstratives, Kaiser and Trueswell argue against the assumption that different kinds of referring
expressions can be straightforwardly mapped onto a unified salience hierarchy. As shown in
Experiments 1-3, Vietnamese null and overt pronouns lack a clear division of labor and share similar
sensitivity to a number of factors such as grammatical roles and grammatical parallelism. These
findings open doors to further investigation of Vietnamese, notably the underlying factors driving
the object bias found in implicit causality verbs and the role of kinship features in pronoun
resolution.
References
Alonso-Ovalle, L., Fernández-Solera, S., Frazier, L., & Clifton, C. (2002). Null vs. overt pronouns
and the topic-focus articulation in Spanish. Italian Journal of Linguistics, 14, 151–170.
Amano, N., & Kondo, M. (2000). NTT database series nihongo-no goikokusei: Lexical properties
of Japanese (Vol. 7). Tokyo: Sanseido.
Ariel, M. (1990). Accessing Noun-Phrase Antecedents. Routledge.
Arnold, J. E., & Griffin, Z. M. (2007). The effect of additional characters on choice of referring
expression: Everyone counts. Journal of memory and language, 56(4), 521-536.
Arnold, J. E. (2001). The effect of thematic roles on pronoun use and frequency of reference
continuation. Discourse Processes, 31(2), 137-162.
Arnold, J. (2000). The rapid use of gender information: Evidence of the time course of pronoun
resolution from eye-tracking. Cognition, 76(1), B13–B26. https://doi.org/10.1016/S0010-0277(00)00073-1
Badecker, W., & Straub, K. (1992, March 19). Resolving pronoun-antecedent relations. Paper
presented at the 5th CUNY Conference on Human Sentence Processing, New York.
Barbosa, P. (2011). Pro‐drop and Theories of pro in the Minimalist Program Part 2: Pronoun
Deletion Analyses of Null Subjects and Partial, Discourse and Semi pro‐drop. Language
and Linguistics Compass 5/8: 571-587. Available at Wiley Online Library Free Access:
https://onlinelibrary.wiley.com/doi/full/10.1111/j.1749-818X.2011.00292.x#ss9-title
Balota, D. A., Pilotti, M., & Cortese, M. J. (2001). Subjective frequency estimates for 2,938
monosyllabic words. Memory & Cognition, 29(4), 639–647.
https://doi.org/10.3758/BF03200465
Bel, A., Perera, J., & Salas, N. (2010). Anaphoric devices in written and spoken narrative
discourse: Data from Catalan. Written Language and Literacy, 13(2), 236.
https://doi.org/10.1075/wll.13.2.03bel.
Borer, H. (1989). Anaphoric AGR. In: Jaeggli O.A. & Safir K.J. (Eds.), The Null Subject
Parameter. Studies in Natural Language and Linguistic Theory, vol 15. Springer: Dordrecht.
Bosch, P., Rozario, T., & Y. Zhao. (2003). Demonstrative Pronouns and Personal Pronouns.
German der vs. er. Proceedings of the EACL2003: Workshop on The Computational
Treatment of Anaphora. Budapest.
Brennan, S., Friedman, M., & Pollard, C. (1987). A centering approach to pronouns. Proceedings
of the 25th Annual Meeting of the Association for Computational Linguistics (pp. 155-162).
Stanford, CA.
Brown, R., & Fish, D. (1983a). Are there universal schemas of psychological causality? Archives
de Psychologie, 51(196), 145–153.
Brown, R., & Fish, D. (1983b). The psychological causality implicit in language. Cognition, 14(3),
237–273. https://doi.org/10.1016/0010-0277(83)90006-9
Bruening, B., & Tran, T. (2015). The nature of the passive, with an analysis of
Vietnamese. Lingua, 165, 133-172.
Caramazza, A., Grober, E., Garvey, C., & Yates, J. (1977). Comprehension of anaphoric pronouns.
Journal of Verbal Learning and Verbal Behavior, 16(5), 601–609.
https://doi.org/10.1016/S0022-5371(77)80022-4
Carminati, M. N. (2002). The processing of Italian subject pronouns. Retrieved from
http://scholarworks.umass.edu/dissertations/AAI3039345/
Chafe, Wallace L. (1980). The pear stories: Cognitive, cultural, and linguistic aspects of narrative
production. Norwood, NJ: Ablex.
Chafe, Wallace L. (1976). Givenness, contrastiveness, definiteness, subjects, topics, and point of
view. Subject and topic, 25-56. New York: Academic Press Inc.
Chafe, W., & Tannen, D. (1987). The relation between written and spoken language. Annual
Review of Anthropology, 16(1), 383-407.
Chambers, C. G., & Smyth, R. (1998). Structural parallelism and discourse coherence: A test of
centering theory. Journal of Memory and Language, 39(4), 593-608.
Chen, P. (1986). Referent introducing and tracking in Chinese narratives. Ph.D. dissertation,
Department of Linguistics, University of California, Los Angeles.
Chomsky, N. (1981). A note on Non-control PRO. Journal of Linguistic Research, 1(4), 1-11.
Christensen, M. B. (2000). Anaphoric reference in spoken and written Chinese narrative discourse.
Journal of Chinese Linguistics 28(2): 303-336.
Clancy, P. M. (1980). Referential choice in English and Japanese narrative discourse. In The pear
stories: Cognitive, cultural, and linguistic aspects of narrative production, Wallace L Chafe
(ed), 127-202. Norwood, NJ: Ablex.
Clancy, P. M. (1982). Written and spoken style in Japanese narratives. In Spoken and written
language: exploring orality and literacy, Deborah Tannen (ed), 55-76 Norwood, NJ: Ablex.
Crawley, R. A., & Stevenson, R. J. (1990). Reference in single sentences and in texts. Journal of
Psycholinguistic Research, 19(3), 191–210. https://doi.org/10.1007/BF01077416
Crawley, R. A., Stevenson, R. J., & Kleinman, D. (1990). The use of heuristic strategies in the
interpretation of pronouns. Journal of Psycholinguistic Research, 19(4), 245–264.
https://doi.org/10.1007/BF01077259
Creider, C. A. (1979). On the explanation of transformations. T. Givón (Ed.), Syntax and
semantics: Vol. 12. Discourse and syntax, Academic Press, New York (1979), pp. 3–21.
Davison, A. (1984). Syntactic markedness and the definition of sentence topic. Language, 60(4),
797. https://doi.org/10.2307/413799
DeVito, J. A. (1964). A quantitative analysis of comprehension factors in samples of oral and
written technical discourse of skilled communicators. PhD thesis. University of Illinois,
Urbana.
Drieman, G. H. J. (1962). Differences between written and spoken languages: An exploratory
study. Acta Psychologica, 20, 78-100. https://doi.org/10.1016/0001-6918(62)90009-4.
Ehrlich, K. (1980). Comprehension of Pronouns. Quarterly Journal of Experimental Psychology,
32(2), 247–255. https://doi.org/10.1080/14640748008401161
Ehrlich, K., & Rayner, K. (1983). Pronoun assignment and semantic integration during reading:
Eye movements and immediacy of processing. Journal of Verbal Learning and Verbal
Behavior, 22(1), 75–87. https://doi.org/10.1016/S0022-5371(83)80007-3
Emeneau, M. B. (1951). Studies in Vietnamese (Annamese) grammar (Vol. 8). University of
California Press.
Fedele, E. (2016). Discourse-level processing and pronoun interpretation. Ph.D. dissertation,
Department of Linguistics, University of Southern California, Los Angeles.
Fedele, E., & Kaiser, E. (2014). Looking back and looking forward: Anaphora and cataphora in
Italian. University of Pennsylvania Working Papers in Linguistics, 20(1), 10.
Fedele, E., & Kaiser, E. (2015). Resolving null and overt pronouns in Italian. In
Proceedings of the 15th Texas Linguistic Society Meeting (TLS), eds. Christopher Brown,
Qianping Gu, Cornelia Loos, Jason Mielens, Grace Neveu, 55-72.
Ferstl, E. C., Garnham, A., & Manouilidou, C. (2011). Implicit causality bias in English: A corpus
of 300 verbs. Behavior Research Methods, 43(1), 124–135.
https://doi.org/10.3758/s13428-010-0023-2
Fukumura, K., & Van Gompel, R. P. (2010). Choosing anaphoric expressions: Do people take into
account likelihood of reference? Journal of Memory and Language, 62(1), 52-66.
https://doi.org/10.1016/j.jml.2009.09.001
Garnham, A., Oakhill, J., & Cruttenden, H. (1992). The role of implicit causality and gender cue
in the interpretation of pronouns. Language and Cognitive Processes, 7(3–4), 231–255.
https://doi.org/10.1080/01690969208409386
Garnham, A., Traxler, M., Oakhill, J., & Gernsbacher, M. A. (1996). The Locus of Implicit
Causality Effects in Comprehension. Journal of Memory and Language, 35(4), 517–543.
https://doi.org/10.1006/jmla.1996.0028
Garvey, C., & Caramazza, A. (1974). Implicit Causality in Verbs. Linguistic Inquiry, 5(3), 459–
464.
Givón, T. (Ed.). (1983). Topic continuity in discourse: A quantitative cross-language study.
Philadelphia: J. Benjamins Pub. Co.
Givón, T. (1990). Syntax: A Functional-Typological Introduction. John Benjamins Publishing
Company.
Goikoetxea, E., Pascual, G., & Acha, J. (2008). Normative study of the implicit causality of 100
interpersonal verbs in Spanish. Behavior Research Methods, 40(3), 760–772.
https://doi.org/10.3758/BRM.40.3.760
Gordon, P. C., Grosz, B. J., & Gilliom, L. A. (1993). Pronouns, Names, and the Centering of
Attention in Discourse. Cognitive Science, 17(3), 311–347.
https://doi.org/10.1207/s15516709cog1703_1
Gordon, P. C., Hendrick, R., & Foster, K. L. (2000). Language comprehension and probe-list
memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(3),
766–775. https://doi.org/10.1037/0278-7393.26.3.766
Greene, S. B., & McKoon, G. (1995). Telling Something we can’t Know: Experimental
Approaches to Verbs Exhibiting Implicit Causality. Psychological Science, 6(5), 262–270.
https://doi.org/10.1111/j.1467-9280.1995.tb00509.x
Greene, S. B., McKoon, G., & Ratcliff, R. (n.d.). Pronoun Resolution and Discourse Models. 18.
Gundel, J. K., Hedberg, N., & Zacharski, R. (1993). Cognitive Status and the Form of Referring
Expressions in Discourse. Language, 69(2), 274. https://doi.org/10.2307/416535
Hartshorne, J. K., & Snedeker, J. (2013). Verb argument structure predicts implicit causality: The
advantages of finer-grained semantics. Language and Cognitive Processes, 28(10), 1474–
1508. https://doi.org/10.1080/01690965.2012.689305
Hartshorne, J. K., Sudo, Y., & Uruwashi, M. (2013). Are Implicit Causality Pronoun Resolution
Biases Consistent Across Languages and Cultures? Experimental Psychology, 60(3), 179–
196. https://doi.org/10.1027/1618-3169/a000187
Hinds, J. (1975). Third person pronouns in Japanese. In Language in Japanese Society, Fred C
Peng (ed), 129-157. Tokyo: University of Tokyo Press.
Hinds, J. (1983). Topic continuity in Japanese. In Topic Continuity in Discourse: a Quantitative
Cross-language Study, Talmy Givón (ed), 47-93. Amsterdam & Philadelphia: John Benjamins
Publishing Company.
Horowitz, M. W. & Newman, J. B. (1964). Spoken and written expression: An experimental
analysis. The Journal of Abnormal and Social Psychology 68(6): 640.
https://doi.org/10.1037/h0048589
Hurewitz, F. (1998). A quantitative look at discourse coherence. In Centering Theory in Discourse,
Marilyn A Walker, Arvind K Joshi & Ellen F Prince (eds), 273–291. Oxford: Clarendon Press.
Hobbs, J. R. (1979). Coherence and coreference. Cognitive science, 3(1), 67-90.
Huang, C. T. J. (1984). On the distribution and reference of empty pronouns. Linguistic inquiry,
531-574.
Kaiser, E. (2009). Investigating effects of structural and information-structural factors on pronoun
resolution. In M. Zimmermann & C. Féry (Eds.), Information Structure from Different
Perspectives (pp. 332–353). Oxford University Press.
Kaiser, E. (2010). Effects of Contrast on Referential Form: Investigating the Distinction Between
Strong and Weak Pronouns. Discourse Processes, 47(6), 480–509.
https://doi.org/10.1080/01638530903347643
Kaiser, E. & Trueswell, J. C. (2008). Interpreting pronouns and demonstratives in Finnish:
Evidence for a form-specific approach to reference resolution. Language and Cognitive
Processes, 23(5), 709–748. https://doi.org/10.1080/01690960701771220
Kehler, A. (2002). Coherence, reference, and the theory of grammar. Stanford, Calif: CSLI
Publications.
Kehler, A. & Rohde, H. (2013). Aspects of a theory of pronoun interpretation. Theoretical
Linguistics, 39(3–4). https://doi.org/10.1515/tl-2013-0019
Kibrik, A. A. (1996). Anaphora in Russian narrative prose: A cognitive calculative
account. Typological Studies in Language, 33, 255-304.
Koornneef, A., & van Berkum, J. (2006). On the use of verb-based implicit causality in sentence
comprehension: Evidence from self-paced reading and eye tracking. Journal of Memory and
Language, 54(4), 445–465. https://doi.org/10.1016/j.jml.2005.12.003
Kuroda, S. (1965). Generative Grammatical Studies in the Japanese Language. MIT.
Lambrecht, K. (1994). Information structure and sentence form: Topics, focus, and the mental
representations of discourse referents. https://doi.org/10.1017/CBO9780511620607
Leys, C., Ley, C., Klein, O., Bernard, P. & Licata, L. (2013). Detecting outliers: Do not use
standard deviation around the mean, use absolute deviation around the median. Journal of
Experimental Social Psychology. http://dx.doi.org/10.1016/j.jesp.2013.03.013
Li, C. N. & Thompson, S. A. (1981). Mandarin Chinese: a Functional Reference Grammar.
Berkeley: University of California Press.
Li, C. N. & Thompson, S. A. (1979). Third person pronouns and zero anaphora in Chinese
discourse. In Syntax and semantics 12: Discourse and syntax, Talmy Givón (ed), 311–335.
New York: Academic Press.
Li, C. N., & Thompson, S. A. (1976). Subject and topic: A new typology of language. In C. N. Li
(Ed.), Subject and Topic. New York: Academic Press.
Long, D. L., & De Ley, L. (2000). Implicit Causality and Discourse Focus: The Interaction of Text
and Reader Characteristics in Pronoun Resolution. Journal of Memory and Language, 42(4),
545–570. https://doi.org/10.1006/jmla.1999.2695
MacDonald, M. C., & MacWhinney, B. (1990). Measuring inhibition and facilitation from
pronouns. Journal of Memory and Language, 29(4), 469–492.
https://doi.org/10.1016/0749-596X(90)90067-A
Mayberry, R. I., Hall, M. L., & Zvaigzne, M. (2014). Subjective frequency ratings for 432 ASL
signs. Behavior Research Methods, 46(2), 526–539.
https://doi.org/10.3758/s13428-013-0370-x
Mayol, L. (2010). Refining salience and the Position of Antecedent Hypothesis: a study of Catalan
pronouns. University of Pennsylvania Working Papers in Linguistics, 16(1), 15.
McDonald, J. L., & MacWhinney, B. (1995). The time course of anaphor resolution: Effects of
implicit verb causality and gender. Journal of Memory and Language, 34(4), 543–566.
Neeleman, A., & Szendrői, K. (2007). Radical pro drop and the morphology of pronouns.
Linguistic Inquiry, 38(4), 671-714.
Nicol, J. L., & Swinney, D. A. (2003). The Psycholinguistics of Anaphora. In A. Barss (Ed.),
Anaphora: A Reference Guide (pp. 72–104).
Ngo, B., & Kaiser, E. (2018). Effects of grammatical roles and topicality on Vietnamese
referential form production. Proceedings of the Linguistic Society of America, 3(57), 1-12.
https://doi.org/10.3765/plsa.v3i1.4354
Passonneau, R. J. (1998). Interaction of discourse structure with explicitness of discourse
anaphoric noun phrases. In Centering Theory in Discourse, Marilyn A Walker, Arvind K Joshi
& Ellen F Prince (eds), 327-358. Oxford: Oxford University Press.
Perfetti, C. A., & Goldman, S. R. (1974). Thematization and sentence retrieval. Journal of
Verbal Learning and Verbal Behavior, 13(1), 70–79.
https://doi.org/10.1016/s0022-5371(74)80032-0
Poole, M. E., & Field, T. W. (1976). A comparison of oral and written code elaboration. Language
and Speech, 19(4), 305-312.
Pyykkönen, P., & Järvikivi, J. (2009). Activation and persistence of implicit causality information
in spoken language comprehension. Experimental Psychology.
Rizzi, L. (1982). Issues in Italian syntax (Vol. 11). Walter de Gruyter.
Rohde, H., & Kehler, A. (2014). Grammatical and information-structural influences on pronoun
production. Language, Cognition and Neuroscience, 29(8), 912–927.
https://doi.org/10.1080/01690965.2013.854918
Rudolph, U., & Försterling, F. (1997). The psychological causality implicit in verbs: A review.
Psychological Bulletin, 121(2), 192–218. https://doi.org/10.1037/0033-2909.121.2.192
Sheldon, A. (1974). The role of parallel function in the acquisition of relative clauses in
English. Journal of verbal learning and verbal behavior, 13(3), 272-281.
Simpson, A., Wu, Z., & Li, Y. (2016). Grammatical roles, Coherence Relations, and the
interpretation of pronouns in Chinese. Lingua Sinica, 2(1).
https://doi.org/10.1186/s40655-016-0011-2
Simpson, A., & Ho, H. T. (2008). The comparative syntax of passive structures in Chinese and
Vietnamese. In Proceedings of the 20th North American Conference on Chinese Linguistics
(NACCL-20) (Vol. 2, pp. 825-841).
Smyth, R. (1994). Grammatical determinants of ambiguous pronoun resolution. Journal of
Psycholinguistic Research, 23(3), 197–229. https://doi.org/10.1007/BF02139085
Stewart, A. J., Pickering, M. J., & Sanford, A. J. (2000). The Time Course of the Influence of
Implicit Causality Information: Focusing versus Integration Accounts. Journal of Memory
and Language, 42(3), 423–443. https://doi.org/10.1006/jmla.1999.2691
Tannen, D. (1980). Spoken/Written Language and the Oral/Literate Continuum. Annual Meeting
of the Berkeley Linguistics Society (pp. 207-218). https://doi.org/10.3765/bls.v6i0.2133.
Ueno, M., & Kehler, A. (2016). Grammatical and pragmatic factors in the interpretation of
Japanese null and overt pronouns. Linguistics, 54(6). https://doi.org/10.1515/ling-2016-0027
Vonk, W. (1985). The immediacy of inferences in the understanding of pronouns. In Inferences in
text processing, G. Rickheit & H. Strohner (eds), 205–218. Amsterdam: North-Holland.
Walker, M.A., Joshi, A.K., & Prince, E. (1998). Centering Theory in Discourse. Oxford University
Press.
Yang, C. L., Gordon, P. C., Hendrick, R., & Hue, C. W. (2003). Constraining the comprehension
of pronominal expressions in Chinese. Cognition, 86(3), 283–315.
https://doi.org/10.1016/S0010-0277(02)00182-8
Yang, C. L., Gordon, P. C., Hendrick, R., & Wu, J. T. (1999). Comprehension of Referring
Expressions in Chinese. Language and Cognitive Processes, 14(5–6), 715–743.
https://doi.org/10.1080/016909699386248
Appendix A. Chapter 3 Target Items
The items shown in this list are the sentence fragments used in the sentence completion study
(Chapter 3, Experiments 2 & 3). The items are shown in the active condition, as in (1). Each
fragment ends with the connective vì ‘because’, followed by one of three prompt types (no
prompt / overt-pronoun prompt / null-pronoun prompt). The passive equivalent of each fragment
can be created using the passive marking listed for that item (bị or được). An example of the
active-passive transformation is shown below.
(1) Active
[Cô thợ may] gạt [cô khách hàng] vì … / cô ta… / đã…
dressmaker fool customer because … / she… / ASP…
‘The dressmaker fooled the customer because … / she… / ASP…’
(2) Passive
[Cô thợ may] bị [cô khách hàng] gạt vì … / cô ta… / đã…
dressmaker PASS customer fool because … / she… / ASP…
‘The dressmaker was fooled by the customer because … / she… / ASP…’
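The word-order change between (1) and (2) can be sketched in a few lines of Python. This is only an illustration of the pattern shown in that one example pair, where the passive marking and the second noun phrase precede the verb. The names Item, active and passive are hypothetical, and the sketch makes no claim about items whose verbs include an extra element such as cho, với or đến.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One stimulus item; field names are illustrative, not from the study."""
    subject: str  # first noun phrase, e.g. "Cô thợ may"
    verb: str     # simple transitive verb, e.g. "gạt"
    obj: str      # second noun phrase, e.g. "cô khách hàng"
    marker: str   # passive marking for this item: "bị" or "được"


def active(item: Item) -> str:
    # Active order as in example (1): NP1 VERB NP2 vì
    return f"{item.subject} {item.verb} {item.obj} vì"


def passive(item: Item) -> str:
    # Passive order as in example (2): NP1 MARKER NP2 VERB vì
    # Assumption: this mirrors example (2) only; other verb types may differ.
    return f"{item.subject} {item.marker} {item.obj} {item.verb} vì"


item1 = Item("Cô thợ may", "gạt", "cô khách hàng", "bị")
print(active(item1))
print(passive(item1))
```

The three prompt types (no prompt, overt pronoun, null pronoun with đã) would then be appended after vì.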
Item  Sentences  [Passive marking]
1  Cô thợ may gạt cô khách hàng vì … / cô ta… / đã… [bị]
   ‘The dressmaker fooled the customer because … / she… / ASP…’
2  Ông nhà báo ra hiệu cho ông giáo sư vì … / ông ta… / đã… [được]
   ‘The journalist signaled the professor because … / he… / ASP…’
3  Người thám tử báo tin cho người kiến trúc sư vì … / ông ấy… / đã… [được]
   ‘The detective informed the architect because … / he… / ASP…’
4  Anh bán báo thách thức anh giữ xe vì … / anh ta… / đã… [bị]
   ‘The newspaper seller dared the parking guard because … / he… / ASP…’
5  Anh họa sĩ tố cáo anh bồi bàn vì … / anh ta… / đã… [bị]
   ‘The artist reported the waiter because … / he… / ASP…’
6  Cô lao công năn nỉ cô kế toán vì … / cô ấy… / đã… [được]
   ‘The janitor begged the accountant because … / she… / ASP…’
7  Bà giúp việc thông cảm với bà bán hàng vì … / bà ấy… / đã… [được]
   ‘The maid understood the shop assistant because … / she… / ASP…’
8  Ông phóng viên thôi miên ông khách trọ vì … / ông ta… / đã… [bị]
   ‘The reporter hypnotized the hotel guest because … / he… / ASP…’
9  Anh lính đánh bại anh thủy thủ vì … / anh ta… / đã… [bị]
   ‘The soldier defeated the sailor because … / he… / ASP…’
10 Chị dược sĩ chào hỏi cô thợ thêu vì … / chị ấy… / đã… [được]
   ‘The pharmacist greeted the embroiderer because … / she… / ASP…’
11 Ông nông dân khuyến khích ông sửa xe vì … / ông ấy… / đã… [được]
   ‘The farmer encouraged the mechanic because … / he… / ASP…’
12 Cô thợ cắt tóc ra lệnh cho cô chủ nhà vì … / cô ta… / đã… [bị]
   ‘The hairdresser ordered the landlady because … / she… / ASP…’
13 Cô thư kí thúc giục cô người mẫu vì … / cô ấy… / đã… [bị]
   ‘The secretary urged the model because … / she… / ASP…’
14 Bác thợ mộc liên lạc với bác thợ chụp hình vì … / bác ấy… / đã… [được]
   ‘The carpenter contacted the photographer because … / he… / ASP…’
15 Anh sinh viên tha thứ cho anh công nhân vì … / anh ấy… / đã… [được]
   ‘The student forgave the worker because … / he… / ASP…’
16 Ông gác cổng kéo lê ông quản lý vì … / ông ta… / đã… [bị]
   ‘The gate-keeper dragged the manager because … / he… / ASP…’
17 Cô giữ trẻ ép buộc cô ca sĩ vì … / cô ta… / đã… [bị]
   ‘The babysitter forced the singer because … / she… / ASP…’
18 Chị thu ngân quan tâm đến chị hướng dẫn viên du lịch vì … / chị ấy… / đã… [được]
   ‘The cashier was concerned about the tour guide because … / she… / ASP…’
19 Ông kĩ sư cám ơn ông lái xe vì … / ông ấy… / đã… [được]
   ‘The engineer thanked the driver because … / he… / ASP…’
20 Ông thợ hồ cắn ông bảo vệ vì … / ông ta… / đã… [bị]
   ‘The builder bit the guard because … / he… / ASP…’
21 Cô giáo ngắt lời cô diễn viên vì … / cô ấy… / đã… [bị]
   ‘The teacher interrupted the actress because … / she… / ASP…’
22 Ông chủ quán đại diện cho ông thợ điện vì … / ông ấy… / đã… [được]
   ‘The restaurant owner represented the electrician because … / he… / ASP…’
23 Chị hàng xóm gửi thư cho chị phát thanh viên vì … / chị ấy… / đã… [được]
   ‘The neighbor mailed the TV host because … / she… / ASP…’
24 Ông bác sĩ khiêu khích ông đầu bếp vì … / ông ta… / đã… [bị]
   ‘The doctor provoked the chef because … / he… / ASP…’
Appendix B. Chapter 5 Target Items
This list contains the items used in Experiment 5. The items are shown in the same age_congruent
condition, as in (1). An example of how all four conditions are constructed is shown in (1-4)
below.
(1) same age_congruent
Ông nha sĩ khen ông kỹ sư vì ông ấy trong nhiều năm
old.male.dentist praise old.male.engineer because heOLD in many year
qua đã xxx ông nha sĩ trong ngôi trường ở ngoại thành.
past ASP xxx old.male.dentist in school at outskirts
‘The dentistOLD.MALE praised the engineerOLD.MALE because heOLD for many years has xxx
the dentistOLD.MALE at the school in the outskirts of the city.’
(2) same age_incongruent
Ông kỹ sư khen ông nha sĩ vì ông ấy … ông nha sĩ …
old.male.engineer praise old.male.dentist because heOLD… old.male.dentist
‘The engineerOLD.MALE praised the dentistOLD.MALE because heOLD for many years has
xxx the dentistOLD.MALE at the school in the outskirts of the city.’
(3) different age_congruent
Anh nha sĩ khen ông kỹ sư vì ông ấy… anh nha sĩ …
young.male.dentist praise old.male.engineer because heOLD… young.male.dentist
‘The dentistYOUNG.MALE praised the engineerOLD.MALE because heOLD for many years has
xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
(4) different age_incongruent
Ông kỹ sư khen anh nha sĩ vì ông ấy … anh nha sĩ …
old.male.engineer praise young.male.dentist because heOLD … young.male.dentist
‘The engineerOLD.MALE praised the dentistYOUNG.MALE because heOLD for many years
has xxx the dentistYOUNG.MALE at the school in the outskirts of the city.’
Item Sentence
1 Ông nha sĩ khen ông kỹ sư vì ông ấy trong nhiều năm qua đã xxx ông nha sĩ trong ngôi trường
ở ngoại thành.
2 Bà dược sĩ tin tưởng bà luật sư vì bà ấy suốt mấy năm liền đã xxx xxxx bà dược sĩ tại ngôi
làng dưới thung lũng.
3 Ông bác sĩ phê bình ông giám đốc vì ông ấy vào một buổi sáng đã xxxx xxxx ông bác sĩ tại
tòa án cạnh ủy ban.
4 Bà diễn viên ghét bà kế toán vì bà ấy vào dịp cuối tuần đã xxx xxxxx bà diễn viên trong khu
vườn gần cầu vượt.
5 Ông đạo diễn khen ông nhà báo vì ông ấy ngay từ lúc đầu đã xxx xxxx xxx ông đạo diễn ngoài
hành lang cạnh phòng họp.
6 Ông luật sư tố cáo ông thanh tra vì ông ấy mấy ngày gần đây đã xxx xxx xxx ông luật sư trong
bãi xe gần công trường.
7 Bà bán rau chú ý đến bà giúp việc vì bà ấy mấy lúc gần đây đã xxx xxx bà bán rau ở ngôi
chùa cạnh bến xe.
8 Ông thợ hồ kiện ông giữ xe vì ông ấy hôm sáng thứ Ba đã xxxx ông thợ hồ trong quán cơm
kế phim trường.
9 Ông nông dân nhắc nhở ông thợ mộc vì ông ấy cách đây không lâu đã xxxx ông nông dân
giữa phòng khách trong nhà trọ.
10 Ông thợ mộc trách móc ông nông dân vì ông ấy hôm trưa thứ Năm đã xxxx xxx ông thợ mộc
trước quầy bánh trong siêu thị.
11 Ông thợ sơn tránh xa ông làm vườn vì ông ấy suốt mấy ngày qua đã xxx xxxx ông thợ sơn
trong con hẻm cạnh nhà thờ.
12 Ông công nhân tôn trọng ông tài xế vì ông ấy buổi trưa hôm kia đã xxx ông công nhân ở quán
nước trước bến phà.
13 Bà tạp vụ thăm bà bán vé vì bà ấy hôm chiều thứ Bảy đã xxxx xxx bà tạp vụ trong cửa hàng
gần văn phòng.
14 Ông xe ôm bỏ mặc ông thợ hồ vì ông ấy buổi tối hôm qua đã xxxx xxx xxx ông xe ôm trước
nhà ga trong trung tâm.
15 Bà giữ trẻ tin tưởng bà thợ may vì bà ấy hôm chiều mùng ba đã xxxxx xxx bà giữ trẻ trong
tiệm phở trước chung cư.
16 Bà thợ may thăm bà giữ trẻ vì bà ấy từ nào đến giờ đã xxx xxxxx bà thợ may trong nhà sách
ngay ngã tư.
17 Ông bảo vệ theo dõi ông thợ điện vì ông ấy hôm tối thứ Sáu đã xxxx xxx ông bảo vệ trong
công viên kế xưởng dệt.
18 Ông đầu bếp bỏ mặc ông lái xe vì ông ấy thứ Hai tuần trước đã xxxx ông đầu bếp ở nhà thuốc
cạnh bệnh viện.
19 Ông lái xe ghét ông đầu bếp vì ông ấy từ xưa đến nay đã xxxxxx xxx ông lái xe trong quán
nước ngay đầu ngõ.
20 Bà khách hàng nhắc nhở bà chủ quán vì bà ấy cách đây vài ngày đã xxx xxxxxx bà khách
hàng trong ngân hàng cạnh nhà hát.
21 Bà thư ký chỉ trích bà y tá vì bà ấy thứ Bảy tuần rồi đã xxxxx xxxx bà thư ký tại khách sạn
trong thị trấn.
22 Bà giữ trẻ thăm bà thủ thư vì bà ấy trong mấy ngày liền đã xxxxx xxx bà giữ trẻ trong phòng
chờ tại bệnh viện.
23 Bà thư ký khen bà chủ quán vì bà ấy mới cuối tuần rồi đã xx xxxx xxx bà thư ký tại quầy vé
trong sân bay.
24 Ông thợ điện chú ý đến ông lái xe vì ông ấy vào hôm thứ Tư đã xx xxxx ông thợ điện ở trạm
xăng gần xa lộ.
Below is the list of equi-biased verbs used in Experiment 6.
Item Equi-biased verbs
1 ngắt lời
2 biết ơn
3 đón tiếp
4 ghen tị với
5 ngắt lời
6 mua chuộc
7 nói xấu
8 đánh
9 lừa
10 kéo lê
11 rượt
12 vu oan cho
13 an ủi
14 chỉ đường cho
15 cãi lời
16 tha thứ cho
17 kéo lê
18 hẹn với
19 chịu đựng
20 liên lạc với
21 chào hỏi
22 an ủi
23 năn nỉ
24 lừa
Abstract
In everyday communication, language users are often confronted by the presence of multiple competing linguistic choices. Referential form use (e.g. she, Mary, that girl), for example, is a puzzle that has attracted much attention from both linguists and psychologists. In ‘Mary talked to Sally because she was a friendly person’, the pronoun ‘she’ is ambiguous between ‘Mary’ and ‘Sally’. Why does the speaker choose to use ‘she’ instead of an unambiguous form (e.g. ‘Mary’, ‘Sally’)? How does the listener recognize the speaker’s intention despite the ambiguity? These questions are further complicated in a language like Vietnamese, in which pronouns are not just function words like English ‘he/she’ but are derived from a complex kinship system. In my dissertation, I investigate speakers’ choice of referential form in Vietnamese, focusing on pronouns. Through a series of experiments, I probe a range of structural and discourse factors which may influence the comprehension as well as the production of Vietnamese pronouns. In sum, these studies aim to broaden our understanding of the impact of universal and language-specific features on referential form choice in communication.

To provide a comprehensive picture of Vietnamese pronoun behavior, given their cross-linguistically unique features, in Chapter 2 I conducted a narrative experiment to examine the overall distribution of Vietnamese referential forms, particularly null pronouns (i.e. empty/zero anaphora) and overt pronouns (e.g. kinship term pronouns). I incorporated structural factors such as grammatical roles and grammatical parallelism into the analysis to obtain a detailed characterization of Vietnamese pronoun production. I found that both grammatical roles and grammatical parallelism have a strong influence on Vietnamese speakers’ choice of referential form. When both the referent and the referring expression are in the grammatical subject position (i.e. subject parallelism), speakers mostly use pronouns (null and overt pronouns). In contrast, the lack of parallelism results in mostly NPs. Interestingly, hints of a parallelism effect are also found in object parallelism, in which pronouns are used more than in the non-parallel cases. These results highlight the importance of considering the grammatical roles of not only the antecedent but also the anaphoric expression (e.g. pronouns) in investigating referential form choice.

Vietnamese speakers in the narrative experiment (Chapter 2) use both null and overt pronouns equally. This finding poses a challenge to the salience-hierarchical approach (e.g. Ariel, 1990
Asset Metadata
Creator: Ngo, Binh Nha (author)
Core Title: Vietnamese pronouns in discourse
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Linguistics
Publication Date: 08/01/2019
Defense Date: 05/30/2019
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: discourse, grammatical roles, implicit causality, kinship terms, modality, OAI-PMH Harvest, object bias, parallelism, passivization, pro-drop languages, pronouns, reference resolution, salience, subject preference, topicality, Vietnamese
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Kaiser, Elsi (committee chair), Hobbs, Jerry (committee member), Mintz, Toben (committee member), Simpson, Andrew (committee member)
Creator Email: binhnngo@usc.edu, nhabinhngo@gmail.com
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c89-208175
Unique identifier: UC11663033
Identifier: etd-NgoBinhNha-7714.pdf (filename), usctheses-c89-208175 (legacy record id)
Legacy Identifier: etd-NgoBinhNha-7714.pdf
Dmrecord: 208175
Document Type: Dissertation
Rights: Ngo, Binh Nha
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA