TOWARDS A CORRELATIONAL LAW OF LANGUAGE:
THREE FACTORS CONSTRAINING JUDGEMENT VARIATION
by
Daniel Hoagberg Plesniak
A Dissertation Presented to the
FACULTY OF THE USC GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the
Requirements for the Degree
DOCTOR OF PHILOSOPHY
(LINGUISTICS)
May 2022
Copyright 2022 Daniel Hoagberg Plesniak
Acknowledgements
I would first and foremost like to thank my committee members, Audrey Li, Hajime Hoji,
Andrew Simpson, Elsi Kaiser, and Namkil Kim for all their help in the multi-year process that has
been the making of this dissertation. To say just a few words about each: Audrey Li, my advisor, has
been my constant supporter and advocate since before I even joined the linguistics department at
USC. Hajime Hoji has always challenged me to be a better scientist and supported me in becoming
one. Andrew Simpson has long been an inspiration of mine and has been a consistent source of
good leads for new areas of research. Elsi Kaiser has singlehandedly taught me (like many students
in our department) almost everything I know about psycholinguistics, and her practical advice about
how to conduct experiments has been supremely useful. Last, but not least, I still routinely think of
the advice Namkil Kim gave me when we first spoke, about the need to understand judgement data
in context; I hope what I have done for this dissertation can live up to the spirit of those remarks.
I would also like to express my gratitude to the USC linguistics faculty and linguistics
graduate students (former and present) for the profound insights they have shared over the years; I
have learned more here than I ever imagined. I am particularly indebted to the members of the
group formerly known as Syntax+ and the Psycholinguistics Lab meetings. Both attending and
presenting at these venues has been an invaluable part of my education. I do not have space to thank
everyone individually, but suffice it to say that I have been truly blessed in terms of mentors,
colleagues, and friends.
This dissertation represents the culmination of my long career as a student. As such, I want
to thank all the teachers I’ve had along the way. I came into this world with a desire and
determination to learn; everything else, I owe to you. My first teachers, of course, were my parents,
from whom I continue to learn to this day; it would take a much greater volume than this
dissertation to tell even a meaningful fraction of all the things they’ve done for me, so all I will say is
this: Mommy and Daddy, no one could be luckier than to be born your son. Other special mention
must certainly be made of my undergraduate advisor, Shizhe Huang, who, in one brief class period,
brought me from not knowing what linguistics was to wanting to pursue it as a career. And that is to
say nothing of everything she’s taught and done for me since then. There are so many others worthy
of mention here, but if my teachers have given me one consistent piece of feedback over the years, it
is that I should try to be a bit more concise, so I will attempt to move towards an ending.
I cannot, however, end without thanking two individuals who helped me greatly when
dealing with the non-English languages I worked with in this dissertation, Korean and Mandarin
Chinese. I mention them in footnotes later, but from translating the experiments to recruiting
participants, they made my life exponentially easier. (A special thank you should also be
extended to my mom in this regard, as she essentially recruited my English-speaking participants
single-handedly.) These two individuals are Yoona Yee (who, over the course of the writing of this
dissertation, has become my fiancée) and Felix Qin (whom I met during this process and who has proved
to be a wonderful friend). Their thoughts on the various issues we discussed have been crucial for
my understanding, and I thank them both from the bottom of my heart.
Finally, I want to thank (and re-thank) all my family and friends for all their consistent love
and support. Because of you, though the work was hard, I always knew I could get through it. Many
of you also answered way too many weird judgement questions that probably made your heads hurt,
but you always offered to answer more. If that isn’t paradise, then I don’t know what is.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
1 INTRODUCTION
1.1 PREFACE
1.2 VARIATION AND THE COMPUTATIONAL SYSTEM
1.3 GENERAL GOALS
1.4 A BASIC LAW
1.5 KEY SUMMARY
1.6 PREVIEW OF FINAL RESULTS
1.7 OUTLINE
2 STRUCTURE AND MEANING
2.1 INTRODUCTION
2.2 MERGE AND C-COMMAND
2.3 CROSSOVER EFFECTS
2.4 THE ROLE OF LINEAR PRECEDENCE
2.5 POSSESSOR-BINDING AND BEYOND
2.6 EXPLANATIONS FROM COVERTNESS
2.7 INCORPORATING VARIATION
2.8 SPECULATIVE THEORIES OF QUIRKY EFFECTS
2.9 CONCLUSION
3 EXPERIMENTS, PAST AND PRESENT
3.1 INTRODUCTION
3.2 PREVIOUS EXPERIMENTS
3.3 THE EXPERIMENTAL TEMPLATE
3.4 INITIAL DISCUSSION OF THE EXPERIMENTS CONDUCTED
4 ENGLISH
4.1 PRELIMINARIES
4.2 A(S, X, Y)
4.3 B(S, X, Y)
4.4 C(S, X, Y)
4.5 FULL RESULTS AND DISCUSSION
5 KOREAN
5.1 PRELIMINARIES
5.2 A(S, X, Y)
5.3 B(S, X, Y)
5.4 C(S, X, Y)
5.5 FULL RESULTS AND DISCUSSION
6 MANDARIN CHINESE
6.1 PRELIMINARIES
6.2 A(S, X, Y)
6.3 B(S, X, Y)
6.4 C(S, X, Y)
6.5 FULL RESULTS AND DISCUSSION
7 FURTHER ANALYSIS AND CONCLUSION
7.1 SUMMARY OF PREVIOUS RESULTS
7.2 OVERALL RESULTS ASSESSMENT
7.3 EVALUATION OF POTENTIAL CRITICISMS
7.4 LOOKING FORWARD
7.5 CONCLUSION AND BROADER SIGNIFICANCE
GLOSSARY
BIBLIOGRAPHY
DATA APPENDIX
READING THE DATA
ENGLISH PARTICIPANTS
KOREAN PARTICIPANTS
MANDARIN CHINESE PARTICIPANTS
Abstract
Critics of generative syntax have sometimes alleged that, despite its universalist aspirations, real world
data provide an inexhaustible list of counterexamples to even its most carefully formulated predictions. Even
among practicing generative syntacticians, there is a common assumption that predictions may occasionally
be contradicted by the judgements of certain individuals. Regardless of whose perspective we take, the basic
observation is the same; while we may make predictions that hold across most individuals in most cases, we
should expect that there will be “atypical” individuals/cases that will go against those predictions. The anti-
and pro-generativist camps thus disagree not so much on the data, but on how such instances of “atypicality”
should be interpreted.
This basic assumption, however, has been called into question by new theoretical developments and
experimental techniques. Hoji (2017, 2019, 2022b) lays out a program for addressing judgement variation
which, rather than dismissing “atypical” individuals as statistical outliers, seeks to make predictions that hold
across the judgements of all individuals, typical or atypical, based on independently diagnosed properties of
each individual’s grammar/I-language. The fundamental insight of this program is that, while individuals do
differ in their judgements, they do so for principled reasons that have measurable effects. Once we measure
said effects for a given individual, we can make a relativized prediction based on the detected grammatical
properties, with the aim to accurately predict the judgements of each individual without exception.
Skepticism as to whether such a high degree of accuracy is possible is natural, but evidence strongly
indicates that it is. In this dissertation, I consider the case of BVA (Bound Variable Anaphora)
interpretations, which, though they have played an outsized role in the development of generative syntax, are
subject to rather significant variation in judgement. Reinhart (1983)’s seminal work on the topic makes use of
paradigms of BVA’s differing availability across sentence types to motivate the claim that abstract structure,
via the relation c-command, constrains the mapping from form to meaning, a central component of the basic
generative syntactic endeavor. Unfortunately, the sort of judgements Reinhart relied upon are not themselves
consistent. While in many cases, the majority of individuals do have judgements as predicted by
c-command-based accounts, a persistent and sizable minority do not. BVA is thus precisely a phenomenon where
purported universalist accounts appear to be merely tendencies.
If we develop a slightly more sophisticated theory of BVA, however, we find that this is not the case.
A straightforward “pure c-command” approach does seem untenable, but Ueyama (1998) provides a revised
theory of BVA that allows it to occur between elements X and Y under one of three conditions: (a) X
c-commands Y, (b) X precedes Y, and (c) something called a “quirky effect” occurs. Conditions (a) and (b) are
easily controlled for and disentangled, but, as Ueyama notes, (c) is highly idiosyncratic to the individual and
the particular choice of X, Y, and sentence; it seems to have something to do with discursive factors, but the
precise nature of these factors remains elusive. Nevertheless, such effects are clearly observable, as many
sentences where X neither precedes nor c-commands Y are nevertheless accepted with the relevant BVA
readings by at least some individuals. Fortunately, however, Hoji (2017/2019/2022b)’s basic methodology
provides a way of predicting the presence or absence of quirky effects on a given BVA-sentence from
diagnosable properties of the individual in question.
Given these diagnostics, the Ueyama model of BVA is fully testable. We make the strong prediction
that, when there is no precedence, c-command, or quirky effects between X and Y, BVA(X, Y) should be
impossible. In this dissertation, I demonstrate exactly that, across multiple speakers of three different
languages, English, Korean, and Mandarin Chinese. Using multiple different constructions and choices of X
and Y, we find not a single judgement in contradiction to the prediction stated above; in cases where X
neither precedes nor c-commands Y, nor are there any quirky effects detected, BVA is never accepted,
whereas in all other cases, BVA is at least sometimes accepted. This is precisely as predicted by Ueyama
(1998)’s theory of BVA. As such, we have supported that theory in a manner consistent with its universalist
nature, without needing to exclude the judgements of certain individuals due to atypicality. Generative syntax
is thus not merely confined to tendencies; under the correct conditions, it can make definite and testable
predictions that are true in all cases for all individuals.
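The prediction stated above amounts to a simple decision rule, which the following Python sketch spells out. It is illustrative only: the names Configuration, bva_predicted_impossible, and consistent_with_prediction are hypothetical and do not come from Ueyama (1998) or Hoji's work, and each independently diagnosed property is reduced here to a single boolean, which is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """One individual's situation for a given sentence and choice of X and Y.

    The three fields are hypothetical stand-ins for properties that are
    diagnosed independently; reducing them to booleans is an assumption.
    """
    x_c_commands_y: bool          # condition (a)
    x_precedes_y: bool            # condition (b)
    quirky_effect_detected: bool  # condition (c), specific to individual, X, Y, sentence


def bva_predicted_impossible(cfg: Configuration) -> bool:
    """Return True when none of conditions (a), (b), (c) holds.

    In that case the model predicts that BVA(X, Y) is never accepted.
    When at least one condition holds, BVA is merely possible, not required.
    """
    return not (cfg.x_c_commands_y or cfg.x_precedes_y or cfg.quirky_effect_detected)


def consistent_with_prediction(cfg: Configuration, accepted: bool) -> bool:
    """A judgement contradicts the model only if BVA is accepted
    in a configuration where it is predicted to be impossible."""
    return not (bva_predicted_impossible(cfg) and accepted)
```

Under this sketch, a rejection is always consistent with the model, since the model only says when BVA cannot be accepted; the single falsifying observation is an acceptance where all three conditions are absent.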
1 Introduction
1.1 Preface
In this preface, I provide a brief explanation as to the whys and wherefores of this
dissertation. The intention is not to address the specific findings to be presented or even their
significance per se, but rather, to explain the overall motivations for the project. Such an explanation
is perhaps more personal than objective, and uninterested readers should feel free to skip ahead to
Section 1.2; this introductory chapter is written in such a way that one may begin reading from that
point and miss none of the relevant information required to understand the rest of the dissertation.
Indeed, Section 1.2 was originally written to be the first section of this chapter, but on
reflection, I found that it did not address certain metascientific concerns that I have heard raised in
the past about similar work, both my own and others’. These concerns, as far as I understand them,
stem from a confusion as to what the purpose of such work is. Such confusion is natural, as what I
am doing is, superficially at least, somewhat atypical, because of my focus on certain very “basic”
cases, the general nature of which has been considered well-established in generative linguistics for
decades. This focus seems to give rise to incorrect impressions such as: (I), that I am somehow
trying to pass off well-known generalizations as my own new discoveries, or (II), that I am somehow
trying to argue that there is some fundamental flaw in our understanding that these basic cases
reveal, in effect implying that all research that builds on this understanding is a kind of “house of cards”
that will come crashing down because of said flaw.
To be clear, these are both very far from the truth, and readers of this dissertation will
hopefully note the lack of any widespread claims of revolutionary generalizations discovered or fatal
flaws uncovered. There is discussion of new results, yes, as well as critical evaluation of previous
works, but no more so than is typical in other dissertations in this field.
Why then the focus on such basic cases? The answer is that it derives from the challenges of
prediction. What I aim to do here, in essence, is to provide an affirmative answer to the question,
“given enough information, can the hypotheses of generative linguistics make non-trivial yet
exceptionlessly correct universal predictions about the judgements of any given individual?”, or,
more succinctly “can generative hypotheses make predictions that are always true for everyone?”
This has, for the most part, not been attempted in generative linguistics, and for good reason. While
various successes have been achieved in making predictions that are true for most people most of
the time, it has been recognized that human linguistic behavior is affected by many factors, and that
introspective linguistic judgements are no exception to this. As such, it is hard to guarantee that
every person will conform to precisely the expected patterns of judgements in all cases, even if we
are pretty sure that those expected patterns are essentially “correct”.
To circumvent this issue, generative linguists have tried a number of different approaches,
some of the most common being focusing on the judgements of a single individual (usually at a
specific time), making use of what informally appear to be the “consensus” judgements on a given
item, or more formally, using statistical approaches to compare mean judgements under different
conditions and to dismiss severe outliers from the norm in a principled manner. The question is thus
why I am not (primarily) following one of these approaches. It is certainly not because of any
animus or distaste for such approaches; they all have a role to play in linguistic research, and I
provide a typology of these and other useful investigative methodologies in the beginning of
Chapter 3. Nevertheless, I contend that, while it is difficult to overcome the sorts of “noise” that
plague judgement behavior, it is in fact possible to do so in ways that result in the sort of
exceptionless results I alluded to above, meaning that, for the specific purpose at hand, we can go
beyond what has already been done via these circumvention-based approaches.
Because this type of approach is largely unprecedented, it is natural that “basic cases” should
be the place to start; further, basic cases being the simplest, they are inherently the easiest to make
predictions about, which is an important consideration given that successfully predicting judgement
behavior to such a precise degree is very difficult. Despite this difficulty, however, there are two
reasons why such an endeavor is worthwhile. These reasons are rather divergent from one another,
and that divergence, I suspect, is part of what adds to the confusion as to the purpose of such work.
The first reason is that I believe successful predictions of this sort may convince reasonable skeptics
of the validity of the basic hypotheses of the generative enterprise, while the second is that I believe
the theoretical and methodological development necessary to achieve such success will reveal new
aspects of the nature of the language faculty that have thus far been largely undetectable. As such,
there are really two audiences for this work: those outside of generative linguistics altogether,
potentially skeptical of the field’s validity, and those very much inside it, interested in fine-grained
accounts of the fundamental workings of the language faculty.
I will address these two reasons/audiences in turn. First, the skeptics. Let me make
something of a historical analogy: as we shall discuss in Chapter 2, Chomsky (2017) credits the
16th/17th-century scientist Galileo with making certain observations about the nature of language.
Galileo, however, is far more famous for his astronomical work, in particular his championing of the
heliocentric model of the solar system and his subsequent trials and censorship by the Catholic
Church. While this story is often reduced to something like a historical fairy tale describing the
confrontation of free-inquiry-based science with dogmatic religion, the historical record paints a far
more complex picture; Galileo’s story was one that played out over many acts and involved a cast of
characters both within the Church and the broader scientific/astronomical community. These
individuals had diverse motives and a range of dispositions towards Galileo, reflecting the
complicated political, religious, social, and scientific dynamics of the time. I will not attempt to
summarize the story here, but I will comment briefly on one aspect.
As I noted, the reasons why various individuals became opponents (or supporters) of
Galileo were varied, but not all such individuals did so simply out of adherence to Church doctrine.
At least in some cases, Galileo’s opponents were serious and data-minded scholars who found fault
with the predictions of the heliocentric model. They themselves had sophisticated models of the
solar system, which could in many cases capture the key data taken as evidence for heliocentrism
while still retaining a fundamentally geocentric model. There were, however, points at which the
heliocentric and geocentric models made fundamentally different predictions. One such case
concerned the motion of the stars, particularly with regard to a phenomenon now known as “stellar
parallax”. In essence, if the Earth were at the center of everything and the stars moved around the
Earth, it was imaginable that the stars were always equidistant from the Earth compared to one
another, i.e., they were all points on the same sphere, centered on the Earth. On the other hand, if
the Earth moved around, then it ought to sometimes be closer to certain stars than to others.
Anyone who has looked out the window of a moving car has had the chance to notice that nearby
pedestrians go by quickly whereas features in the distance, say mountains, appear to move much
more slowly. This is parallax, the phenomenon wherein closer objects appear to move faster relative
to the observer than more distant ones. Applying this principle to stars, if the Earth were really
moving around such that it is sometimes closer to some stars than to others, then the closer stars
ought to appear to change position relative to the ones that are farther away; in essence, the angle at
which the observer views the closer star would change more rapidly than the angle at which the
observer views the farther star, giving the impression that the closer star is moving “faster”
towards/away from the observer than the farther star is. The heliocentric model thus predicts this
“stellar parallax”.
Unfortunately for heliocentrists like Galileo, no such effect had ever been observed, despite
the rather detailed observations of the night sky that had been undertaken at the time. The
geocentrists touted this as a failed prediction of heliocentrism. The heliocentrists, on the other hand,
generally responded with the suggestion that the stars were simply so far away that such an effect
was too small to be observable by the technology of the time. As it turns out, this was indeed
correct; the stars are orders of magnitude further away than was typically imagined at that time,
rendering the effects of stellar parallax minuscule, and indeed, it took over two hundred years of
telescope development until the relevant observations were finally made (by various individuals in
the 1830’s). At the time, however, this explanation was naturally greeted by geocentrists as
hand-waving, an attempt to explain away problems with an already controversial theory.
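To give a rough sense of why distance makes the effect so small, the following Python sketch computes the angle that the Earth-Sun distance subtends when viewed from a star at a given distance. It is a geometric illustration only: the function name parallax_arcsec and the sample distances are assumptions chosen for the sketch and are not figures from the historical record discussed above.

```python
import math

AU_KM = 1.495978707e8        # Earth-Sun distance (one astronomical unit) in km
LIGHT_YEAR_KM = 9.4607e12    # one light-year in km
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi


def parallax_arcsec(distance_ly: float) -> float:
    """Angle, in arcseconds, subtended by a 1 AU baseline
    as seen from a star `distance_ly` light-years away."""
    angle_rad = math.atan(AU_KM / (distance_ly * LIGHT_YEAR_KM))
    return angle_rad * ARCSEC_PER_RADIAN


# Illustrative distances only (assumed for this sketch).
for d in (1, 4, 10, 100):
    print(f"{d:>4} light-years -> {parallax_arcsec(d):.3f} arcseconds")
```

For any star more than about 3.26 light-years away, the resulting angle is under one arcsecond (1/3600 of a degree), and it shrinks in inverse proportion to the distance.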
I think it is fair to say that, in certain respects, generative linguistics finds itself today in an
environment somewhat similar to that of Galileo and the heliocentrists at that time, in the following
sense: despite decades of work spent trying to provide evidence for our models, skepticism abounds
in both linguistics and other language-adjacent fields, as well as in the general academic world. This
is to say nothing of the general public, who, as far as I can tell, are mostly ignorant of our basic
“worldview”, while ideas from other fields, e.g., computer science, are frequently seen on television,
in popular science articles, etc. It is true that generative linguistic principles are sometimes seen in
scholarly or popular materials outside of generative linguistics, though at least in my experience, they
are often there for the sake of criticizing them, or at the very least contrasting them with another
theory, rather than simply propounding them as correct.
The antipathy towards generative linguistics is, I believe, multifaceted. Much like in Galileo’s
case, sometimes it appears to me to be based in either ignorance, misunderstanding, or purely
ideological grounds, but in other cases, it is indeed principled, due precisely to perceived failed
predictions of our theories. The most common argument is something along the lines of what is
articulated succinctly in Evans and Levinson (2009), where it is stated, “the claims of Universal
Grammar […] are either empirically false, unfalsifiable, or misleading in that they refer to tendencies
rather than strict universals.” Here, we see the negative reference to tendencies; though generative
linguistics is inherently universalist in its basic hypotheses, as I noted above, it would be fair to say
that, up to this point, most attempts to empirically support these purported universal hypotheses
have been tendency-based. This has been sufficient for generative linguists themselves to continue
developing our theories, but it has quite clearly not been sufficient to convince many of our critics.
The sense is that we predict universal, exceptionless patterns which, on closer inspection, are not
observed; much like the case of stellar parallax, this purportedly “failed” prediction is taken as
evidence that our basic hypotheses are incorrect.
As such, making an attempt to demonstrate the success of our hypotheses in a universal
manner, beyond tendencies, is, in my view, an essential part of responding to the critiques of our
skeptics. We can, of course, wonder if such a response will really convince them, just as we can
wonder whether even data from the 1830’s regarding stellar parallax would really have satisfied
Galileo’s critics. The answer to both, I think, is probably not in all cases, but perhaps in some. As
such, it is my hope through this dissertation to engage with reasonable skeptics who have been put
off by the various “exceptions” to generative theories, and to demonstrate to them that such
exceptions should not be taken as reason to discard these theories altogether. It is perhaps overly
optimistic of me to assume such individuals might read this dissertation, but in the case that they do,
I have tried to make things accessible for even those without technical training in generative
linguistics. As such, I hope those readers with background in generative linguistics will excuse me if
I sometimes briefly explain rather simple “Syntax 101”-style concepts, as I occasionally do in
Chapter 2, especially in the footnotes (which I advise those who are unfamiliar with generative
syntax to check, especially regarding the hypothesized structures corresponding to various sentence
types). Similarly, I at times eschew certain aspects of syntactic theory which are technically relevant
but do not strictly affect the matter at hand; this too is part of my attempts to accommodate skeptics
(or curious members of the general public, for that matter), who might want to engage with the ideas
presented here. As will become clear, the apparent simplicity of the presentation does not prevent a
very nuanced and detailed theoretical picture from emerging; if desired, this picture can easily be
supplemented with the relevant hypotheses to integrate it more fully into any particular more
“developed” model of syntax.
This brings me to the second purpose I mentioned, for which the audience is very much
those enmeshed in the generative linguistics framework. As I stated, it is my belief that attempting to
predict judgement behavior in such a precise way will lead us to an even deeper understanding of the
language faculty than the one we have now. When we try to make such predictions, we see where
our current hypotheses break down, which in turn allows us to discover phenomena we might have
missed in the past. To return to the analogy with stellar parallax, this is precisely what happened to
the heliocentrists who, in the decades after Galileo’s death, were dutifully trying to find some
evidence for the existence of said effects. As noted, it would take well over a century before such a
feat could be attempted, but in the meantime, by trying to find it by looking closer and closer at
stars, astronomers discovered various never-before-seen phenomena. One of these is what we now
call “stellar aberration”; without getting into any great detail, said aberration is not an everyday
perceptual phenomenon like parallax is, but is in fact a distortive consequence of the fixed finite
speed of light in a vacuum and the high speeds at which astronomical bodies move relative to one
another. The discovery was an important step in recognizing the existence of this fixed speed, which
is famously a crucial component of modern physical theory (namely special relativity). As such, it is
fair to say that the search for the minute parallax-based predictions of heliocentrism paid dividends
far greater than was ever expected, certainly far beyond simply observing aspects of the behavior of
stars.
Like astronomy, generative linguistics is also a science of stars, though these are our *-
markings, used to indicate the unacceptability of a given hypothetical item. We have hitherto mostly
assumed, like the old geocentrists, that these stars do not “move” relative to one another. That is, if
two sentences contrast in acceptability, they should contrast in acceptability for everyone, even if
there might be some noise that gets in the way of this prediction coming out perfectly in every case.
What I hope to convince readers of is that this simply is not true; our stars do move, and judgements,
genuine judgements, distort from their expected values according to clear and consistent principles.
Not only does tracking and accounting for such distortion give us a way of dismissing the claims of
skeptics, it also gives us a window to a range of phenomena we have not been able to clearly see
before, painting a new picture of the linguistic universe the same way that increasingly powerful
telescopes have, for centuries now, routinely painted new pictures of the physical universe.
The theoretical derivation and experimental validation of precise predictions of the kind I
have described above is thus not only a vivid demonstration of the successes the generative
enterprise has achieved, but also a tool for beginning one of many new chapters in our investigations
into the nature of the human language faculty. The challenge is great, but I suspect the rewards will
be greater. As I hope to demonstrate in this dissertation, promising first steps are already being
undertaken, and the promised benefits have already begun to accrue. It thus is my sincere hope that
the explorations discussed here will be understood not only in their own right but also as an early
piece of a much broader project.
1.2 Variation and the Computational System
One of the basic claims of generative linguistics¹ is that, though certain aspects of language
are chaotic or arbitrary, there is a mathematically formal system underlying the basic structure and
rules of language. Chomsky (1995, 2017, and elsewhere) hypothesizes that at least part of this
mathematical formalism comes about due to the existence of a mental “computational system” (CS),
which operates by combining linguistic elements such as words into sets, allowing for the creation of
larger structures. In particular, the act of combination allows for the creation of a “digitally infinite
array of structured expressions”, in effect, an endless series of sentences.
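The idea of building ever larger structures by repeated set formation can be sketched in a few lines of code. This is only an illustrative toy, not Chomsky's formal definition: the function names `combine` and `depth` and the example words are assumptions introduced here.

```python
# Toy sketch (hypothetical names): a syntactic object is either a word (str)
# or an unordered set of two previously built syntactic objects.

def combine(a, b):
    """Form the unordered set {a, b}; no left-to-right order is recorded."""
    return frozenset({a, b})

def depth(obj):
    """Count the levels of set embedding in a syntactic object."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(part) for part in obj)

# Each output can serve as input to a further combination, so there is
# no upper bound on the size of the structures that can be built.
vp = combine("likes", "cake")
clause = combine("John", vp)
bigger = combine("thinks", clause)
```

Because the output is a set rather than a string, it encodes hierarchy but not word order, which is consistent with the point that externalization is handled separately.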
We must, however, be careful about the understanding of the term “sentence”. What is
created by Chomsky’s CS is specifically a set-based hierarchical grouping of linguistic elements; these
hierarchies are then mapped, by various other modules of the mind concerned with language, onto
both an external sequence of sounds/signs/symbols, and into an internally represented “meaning”.
The ultimate output of the CS and related modules is thus a pairing of form and meaning.
In keeping with the notion of Universal Grammar (UG) (Chomsky 1965), this CS itself is
said to be genetically specified in such a way that it is common to all human beings (barring certain
developmental conditions or injuries). That is, while it is undeniable that different humans speak
different languages with different words and different rules for how to externalize structures (e.g.,
whether nouns come before or after the adjectives that modify them), the CS that builds those
structures is hypothesized to be identical in each human. Under this conception, the differences
between the languages of different individuals are taken to be reflections of the different linguistic input
¹ A terminological note: in this dissertation, we will be focusing specifically on generative syntax, rather than, say, generative
phonology. The topics discussed, however, are all instantiations of abstract notions, challenges, and methodologies that
will have reflexes in any given generative field, e.g., acceptability judgements, inter-speaker variation, correlational
predictions, etc. Furthermore, it is fair to say that the generative linguistic enterprise is ultimately a collaborative one, with
developments in one sub-area having ramifications for all. As such, my use of “generative linguistics” is not meant to
imply that “generative syntax” is somehow more “core” than other generative areas, but to reflect generative syntax’s place
in the broader generative enterprise.
that the individuals were exposed to, especially during maturation. As a result, while the externalized
form of one person’s language may vary dramatically from another person’s, the system which
generates the structures that serve as the basis for the externalized forms, as well as the ability to
interpret such forms, is the same in each case.
The centrality of structure, emphasized in Chomsky’s work since at least Chomsky 1957, has
been a guiding principle for the sub-field of generative syntax in particular since its inception.
However, as an effectively mind-internal phenomenon, it cannot normally be observed directly.
Rather, what can be observed directly are its effects on the aforementioned linking of the forms of
sentences to their meanings. In particular, generative linguists have long argued that abstract
linguistic structures may be studied via the use of introspective “acceptability judgements”; for this
dissertation at least, the acceptability judgements in which we are interested take the form of a
speaker’s intuition(s) about whether a given sentence can correspond to a given (type of)
interpretation. If certain sentence-forms have systematic relations with certain meanings, that is,
individuals judge that certain meanings are consistently (im)possible with certain forms, then this
may provide information as to properties of the abstract structure that links them.
A key challenge to such an approach, however, is the tension between the assertion of
universal properties stemming from the CS and the intense variability of acceptability judgements.
Certainly, there are dramatic differences when comparing individuals who speak different languages,
but even between individuals who speak the same language there is variation in which form-meaning
combinations are judged to be acceptable². Indeed, even a specific person at different times may
have different judgements on the same items.
² Criticizing those who try to draw conclusions from the purported fact that there are roughly 5,000 distinct languages in
the world, Kayne (2019) forcefully articulates the way in which this dramatically underestimates the variation that exists
across individuals:
Several solutions have been proposed. For example, some have questioned whether reported
data variation may result from researchers insufficiently vetting their own data, leading to calls for
greater oversight and scrutiny during the publication process (e.g., Linzen and Oseki 2018). Others
contend that judgement variation calls for the replacement of acceptability judgements of specific
individuals with averages collected over many individuals via the types of experiments and statistical
tests common in fields like psychology (see, for example, the program thoroughly put forth in
Schütze 1996/2016). Yet another approach assumes that variation results from individuals either not
understanding the questions they are being asked or otherwise not meeting the conditions
necessary for the relevant hypotheses to apply; the proposal in this case is that separate informant
classification procedures should be applied to ensure judgements are collected from individuals who
are suited to the judgement task (see, for example, Hoji 2015).
These approaches, however, all seem to make the assumption that there is a single “true”
judgement for a given form-meaning pair, and that variation from that judgement is a form of noise
and/or inaccuracy. While it would be useful if this were true, it is not clear that the Chomskyan view
of language supports such a conclusion in the first place. Rather, as articulated by Chomsky (1980,
1986a, and elsewhere), each individual possesses their own “I-language”. In Chomsky’s terminology,
an I-language is a “steady state” of the language faculty, created when the universally shared “initial
“We know that there are distinct varieties of English – many syntactic differences have been discussed that distinguish
American from British English. And various regional syntactic differences within the United States or within the United
Kingdom are well known. But what if it turned out that for every single pair of English speakers (and similarly for other
languages) one could find at least one sharp syntactic difference? My own experience in observing the syntax of English
speakers, both linguists and non-linguists, makes me think that it is likely that no two speakers of English have exactly the
same syntax. If it is true that no two English speakers have the same (syntactic) grammar, then the number of syntactically
distinguishable varieties of English must be as great as the number of native speakers of English. Extrapolating to the
world at large, one would reach the conclusion that the number of syntactically distinct languages/dialects is at least as
great as the number of individuals presently alive (i.e. more than 5 billion). Adding in those languages/dialects which have
existed but no longer exist (not to mention those which will exist but do not yet exist) it becomes clear that the number
of syntactically distinct (potential) human languages is far greater than 5 billion.”
As we shall see shortly below, this great diversity in languages is a natural consequence of Chomsky’s I-language-based
conception of the language faculty.
state” is exposed to external data. The I-language constitutes the “grammar” of the individual and is
thus the driving force behind their judgements.
That judgements on analogous sentences may vary between speakers of different languages
is relatively uncontroversial; the external data that went into forming the steady state of a Korean
speaker is likely quite different than that of an English speaker. Though both the Korean speaker
and the English speaker should have judgements that obey the general constraints imposed by the
initial state, e.g., constraints coming from making use of a CS, there may be other constraints left
underspecified in the initial state, which are acquired based on the data to which the individual is
exposed. As such, the two speakers may differ in their judgements on analogous sentences in their
respective languages, due to these experience-specific learned constraints.
The same logic applies to two speakers of the same language; though we expect differences
between their I-languages to be smaller, it is nevertheless true that no two individuals are exposed to
the same external data, and thus, it is possible that they will have acquired slightly different rules as
to which form-meaning pairs are acceptable. Indeed, even the I-language of the same speaker
making judgements at two different times may, through their experiences in between the acts of
judgement, have developed into a slightly different steady state. Of course, when the time between
judgements is very small, the chances of such an occurrence having an effect are relatively small, but
as time increases, so do the chances of a noticeable change occurring³.
As such, given that speakers make judgements based on their current I-languages, particular
to the specific person and time, there is no reason to expect that judgements on analogous, or even
³ Indeed, consistent with what Hoji (2022b) suggests based on data from his own I-language, at least some of the perceived
variation among individuals may be simply an artefact of sampling each individual at a given time; if we sampled each
individual at multiple times, we might well see, for example, two individuals who had seemed to fall into two different
“categories” of judgement pattern now each falling into the opposite category, suggesting there are not different kinds of
people so much as different kinds of state that individuals’ I-languages range across over time.
identical, items must always be the same. Rather, such judgements must obey all constraints imposed
by the initial state but are otherwise free to vary. In principle, it might still be the case that
universal CS principles or similarities in experience constrain judgements to such a degree that
variation is minimal, but in practice, this does not seem to be the case, at least not in general. This
complicates attempts to use acceptability judgements as tools for investigating CS properties, as it
follows that in many instances, judgements will reflect idiosyncratic constraints acquired based on
the experiences to date of the particular individual rather than revealing anything universal.
A central challenge is thus how to use the varying judgements coming from individual I-languages
in order to study the properties of the universal CS. In response to this challenge, Hoji
(2015, 2017, 2022a) sets forth a program for using acceptability judgements to test universal
hypotheses in the face of I-language-based variation. This approach maintains that there are genuine
universal constraints coming from the CS which demonstrably and consistently constrain
individuals’ judgements. It differs from other approaches in that it dispenses with the notion of a
“true” judgement for a given sentence-interpretation pair; for any given such pair, it may be
acceptable or unacceptable to a given speaker at a given time, and it is not generally possible to
predict the judgement that will be made in an absolute sense. Instead, Hoji argues that, by making
careful use of various diagnostic tests (see Sub-Section 2.7), correlational predictions can be
established, which are inviolable across individuals. That is, while it is not possible to predict how an
individual will judge a given item in isolation, it is possible to predict how that individual will judge
an item based on other judgements that individual has made. These other judgements serve to
diagnose various aspects of the I-language of the individual in question, allowing for both universal and
I-language-specific factors to be taken into account, which allows for a deterministic prediction to be
made.
This correlational approach is in some sense a natural outgrowth of past research in the
generative linguistics program, and various scholars have at times suggested the need for further
diagnostic tests in making predictions (including many of the authors of the various works to be
discussed in Chapter 2, for example). Hoji’s approach, however, is the first to attempt such an idea
on a large scale, at least that I am aware of. In that sense, it marks a fundamental shift in approach to
research using acceptability judgements compared to what has previously been pursued. As such, the
methodologies to be used in implementing this program are still developing, and there is much
further work to be done. This dissertation seeks to advance the development of this general project
by applying its principles in new ways and making improvements to the implementational
methodology. The experiments performed both contain practical improvements over past
experiments and cover a great deal of new ground; not only is a much broader range of
structures considered than has been in past works concerning correlated judgements, but these
structures are also considered across three typologically different languages, English, Korean, and
Mandarin Chinese, and demonstrate predicted convergences across all three. While by no means is it
the final word on any of the matters considered, this dissertation represents a major step forward,
both in terms of implementation and empirical coverage, and adds new support to a number of
basic hypotheses advanced over the past decades.
1.3 General Goals
Before discussing the particular contribution to be made by this dissertation, it is useful to
first discuss the general goals of the program of research to which the dissertation contributes. Hoji
(2015) terms this program of studying the language faculty at the level of the I-language “Language
Faculty Science” (LFS). While this dissertation departs from Hoji 2015 in a number of ways, it is
nevertheless appropriate to see them both as parts of this same LFS program of research: in both
works, the crucial concern is to rigorously determine and demonstrate properties of the CS⁴, which,
as noted previously, is hypothesized to be common to all (linguistically competent) human beings.
Definitionally, the CS is a module of the mind which uses mathematically formal rules relating to
abstract structure that links sentence forms and sentence meanings. It need not be the only module
establishing links between form and meaning, nor even the only mathematically formal one, but it is
the one that does so via syntactic structure.
We thus aim to show there are universal hypotheses about form-meaning pairs, formulated
in terms of abstract hierarchical structure, that can be rigorously supported by explicit experimental
tests. Further, this support should be consistent when those tests are deployed for a variety of
different individuals. Such concerns are thus distinct from, albeit deeply linked to, other legitimate
concerns that a researcher working on language might have. For example, we are not primarily
concerned with how these CS properties interact in everyday life with other cognitive factors, nor
are we focused on using CS properties to provide coherent explanations for complicated patterns of
linguistic data. Such tasks are important in their own right, and indeed, may be relevant as secondary
concerns to the type of research pursued here, but it should be clarified that the focus of this
project lies elsewhere, namely, in identifying and demonstrating universal CS properties.
To achieve this, the initial hurdle that must be cleared is the demonstration that there are
surface level “laws of language”, that is, universally true generalizations that accurately predict the
results of acceptability judgements. In other words, we must show that the acceptability of
sentence-interpretation pairs, despite great variation from person to person, is still nevertheless
subject to general and universal constraints that can be explicitly demonstrated via some sort of
empirical investigation (an “experiment”).
⁴ The demonstration of the CS’s properties implying the demonstration of the CS’s existence, which is itself a matter of
debate.
Once this first goal has been met, a second key goal is to show that such “laws” are not only
concerned with the surface forms of sentences or the semantic details of the interpretations, but also
rely on notions of abstract, hierarchical syntactic structure; that is, the laws established are
demonstrably linkable to some properties of the CS. In general, we expect that there may be many
factors that regulate which interpretations an individual can assign to a given sentence. Most such
factors will be only tangentially relevant for the demonstration of CS properties. However, we
expect that, in certain cases, at least one such relevant factor will demonstrably make reference to
the syntactic structure of the sentence, and the positions in which different elements are situated in
that structure with regard to one another. Such a demonstration requires that we hypothesize a
syntactic structure for the sentences in question and find types of interpretations that show some
sort of relationship with that structure. To use the phraseology of Hoji 2015, we would ideally want
to find meaning relations (MR’s), such that MR(S, X, Y), the meaning relation in the interpretation
of sentence S⁵ connecting linguistic elements X and Y, is possible only if X and Y stand in a certain
syntactic configuration with regard to one another in S (which for us will be the “c-command”
relation, to be defined in Section 2.2).
Hoji 2015 states this goal quite strongly, arguing that we must find an MR(S, X, Y) whose
availability is exclusively dependent upon syntactic structure. However, as is articulated in Hoji’s
subsequent work (2017, 2022a/b), it is not clear that such exclusively structure-based MR’s exist⁶.
Rather, we will have to content ourselves with demonstrating that some MR’s are dependent on
structure among other factors. To do so, it will thus be necessary to isolate each relevant factor in
⁵ Hoji 2015 uses only MR(X, Y), though the relativization to an S is implicit given that what is being discussed is indeed
the interpretation of specific sentences.
⁶ Indeed, even Hoji 2015 (see in particular the last two paragraphs of Section 3.5 of that work) discusses the need to
identify “effective probes” for a given speaker, specifically because of the possibility that a given MR might not always be
constrained by c-command for all individuals.
turn, and show that, when all other relevant factors are absent, then MR(S, X, Y) is indeed
dependent purely on the one remaining factor. In particular, we want to show that, when all non-
structural factors are absent, then the availability of MR(S, X, Y) is purely based on X’s structural
relationship with Y in S.
Once the above is established, the relevance of abstract syntactic structure in constraining
the MR in question is demonstrated. Assuming this has been done in a convincing way, it constitutes
evidence for the existence of the Chomskyan CS and its role in deriving the “law” established in
accomplishing the first goal. In achieving this, we have therefore gone beyond surface level
generalizations to underlying explanations for those generalizations, particularly the aspects of those
explanations involving the CS. Once we have come to this position, the third and final goal is to
leverage such laws, and the demonstrated role of syntactic structure within them, to discover and
refine our view of the properties of the CS. That is, once we have clearly isolated the role of the CS,
we have thereby established a method for “probing” the properties of the CS in the given domain,
providing us with the opportunity to test further hypotheses. In this way, we can accumulate new
knowledge about the CS specifically, and the workings of the human language faculty in general.
It must be noted from these goals that we are making use of particular MR’s and particular
syntactic structures as tools, rather than the central objects of our investigation. As stated above, this
central object of investigation is the language faculty and its properties, particularly the properties of
the CS; inasmuch as knowledge about certain MR’s and structures is revealing of such properties,
then they are of interest, but anything more than that is tangential to this particular form of inquiry.
As such, we are not primarily concerned with pinpointing the exact “nature” of the MR’s in
question, nor with articulating the exact details of the structural representations of the sentences
employed. Rather, throughout this dissertation, I will articulate the properties of the MR’s and
structures in question as far as is needed to demonstrate the link between a given MR and a given
syntactic structure, but (usually) no further. As a result, while a certain degree of understanding of
these elements is necessary for the purpose of demonstrating/establishing CS properties, we can
remain relatively agnostic about various details of their aspects while still completely satisfying our
main purposes.
At times, those whose primary research orientation is more geared towards, e.g., either (a) a
complete analysis of interpretative phenomena, or (b) a fully articulated syntactic parse for a given
construction, may feel a sense that more should be said. I by no means deny this; more absolutely
should be said on all these matters. However, I hope it nevertheless becomes clear that the findings
in this dissertation are indeed contributing to our knowledge about both particular MR’s and
particular syntactic structures. Though the results of the investigation conducted do not reach a
conclusive or exhaustive account of the relevant properties (about particular MR’s and particular
syntactic structures), they nevertheless provide thought-provoking considerations that directly
constrain the range of possible solutions to the relevant problems. That is, while I will not be
providing a holistic account of any particular MR or sentence structure, I will be unearthing new
facts, and adding new support to old ones, that must be accommodated by any such account; this in
turn will help us establish more detailed accounts of the MR’s/structures by favoring certain
approaches to their analysis over others.
1.4 A Basic Law
Turning now to specifics, in this dissertation, I provide evidence for a particular “law” of
form-meaning relations, which I formalize as follows:
(1) *A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
These highly abstract expressions require some unpacking. To begin, as mentioned in the
previous section, S represents a given sentence, or, more accurately, a given sentence pattern or
“schema” in the terms of Hoji 2015. For example, if S=N(oun)1 V(erb) N(oun)2 (N1 V N2), it might
correspond to the sentence ‘dogs like cake’, ‘rice is food’, or ‘John found Mary’, or other such
expression. The sentence patterns we are interested in, however, are going to be ones that are
relevant to some MR that relates an element X to an element Y. As such, S should contain a “slot”
for X and for Y, such as S=X V Y. If X is ‘the man’ and Y is ‘his daughter’, this S might become
‘the man praised his daughter’, ‘the man loved his daughter’, ‘the man fed his daughter’, etc. We will
thus be concerned with triplets of a schematic S and two consistent choices of X and Y, as can be
seen repeatedly in (1). Note that in all expressions, S, X, and Y are to be understood as the same
choices of S, X, and Y; one sentence frame, one choice of X, and one choice of Y, as with the
example S=X V Y, X=‘the man’, Y= ‘his daughter’ above.
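The notion of a schema with slots can be made concrete with a small sketch. The function name `instantiate` and the dictionary-based slot filling are assumptions made purely for illustration; only the schema S=X V Y and the example choices of X, Y, and verbs come from the text above.

```python
# Toy sketch (hypothetical helper): fill the slots of a sentence schema
# with one fixed choice of X and Y and a varying verb.

def instantiate(schema, x, y, verb):
    """Replace each slot label in the schema with the chosen expression."""
    slots = {"X": x, "V": verb, "Y": y}
    return " ".join(slots[label] for label in schema.split())

schema = "X V Y"
sentences = [
    instantiate(schema, "the man", "his daughter", verb)
    for verb in ("praised", "loved", "fed")
]
# One schema plus one consistent choice of X and Y yields many sentences;
# the triplet (S, X, Y) is what the expressions in (1) are stated over.
```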
Moving on to the functions taking S, X, and Y as their arguments: the rightmost one is
BVA(S, X, Y), which represents a particular MR interpretation, BVA(X, Y), obtaining in sentence S
between elements X and Y. BVA (“Bound Variable Anaphora”) itself is another term used in Hoji
2015⁷, which is also sometimes referred to as “binding” or specifically “quantificational binding”.
This MR is roughly understood to describe a situation whereby the interpretation of an otherwise
singular-denoting expression Y varies across the different individuals expressed by X. For example,
⁷ Though the abbreviation is in use elsewhere, see Déchaine and Wiltschko 2017, for example. The term “Bound Variable
Anaphora” itself has a long history in the field, and dates at least to Partee 1978’s use of the term to delineate between
pronouns which get their interpretation through a formal process of variable binding, long discussed in the field of logic,
and those which are bound “pragmatically” through merely happening to refer to the same individual(s) as their
antecedents. In keeping with this origin, the term “binding” is sometimes used, or a derivative thereof, e.g.,
“quantificational binding” in works like Barker (2012), as well as variants of “bound variable anaphora”, e.g., Reinhart
1983’s “bound anaphora”. Some authors mean slightly different things by these terms; Partee (1978) for example considers
interpretations like the coreferential reading of ‘the prosecutor’ and ‘he’ in ‘the prosecutor believed that he would win the
case’ to be a (potential) instance of bound variable anaphora, because for her, the term refers to a particular way of
establishing an interpretation, not the interpretation itself. For Hoji, on the other hand, the term refers to a type of
interpretation, as described in the main text, rather than a way of achieving such an interpretation; it is in this latter way of
understanding the term that I will be using it here.
if we have S=X V Y N, but this time, X= ‘every boy’, and Y= ‘his’, we can form sentences like
‘every boy loves his mother’. BVA(S, X, Y) here is a way of understanding the sentence such that it
expresses that each boy loves that boy’s own mother, boy A loves boy A’s mother, boy B loves boy
B’s mother, etc. Under such an interpretation, the “singular” expression ‘his mother’ does not, in fact,
refer to a single individual, but rather, a different individual per each member of the set identifiable
from the expression ‘every boy’.
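The difference between the BVA interpretation and a reading on which ‘his mother’ picks out one fixed individual can be sketched with a toy model. Everything here is an assumption made for illustration (the set of boys, the `mother_of` mapping, the function names); the sketch only states when each reading would be true and makes no claim about how either reading comes about.

```python
# Toy model (hypothetical data): three boys, plus John, each with a mother.
boys = {"A", "B", "C"}
mother_of = {"A": "mother of A", "B": "mother of B",
             "C": "mother of C", "John": "mother of John"}

def bva_reading(loves):
    """True if each boy loves that boy's own mother."""
    return all((boy, mother_of[boy]) in loves for boy in boys)

def fixed_referent_reading(loves, referent):
    """True if every boy loves the mother of one fixed individual."""
    return all((boy, mother_of[referent]) in loves for boy in boys)

# Two possible states of affairs, given as sets of (lover, loved) pairs.
own_mothers = {(boy, mother_of[boy]) for boy in boys}
johns_mother = {(boy, mother_of["John"]) for boy in boys}
```

In the first state of affairs only the BVA reading of ‘every boy loves his mother’ is true; in the second, only the reading on which ‘his’ refers to John is true.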
As discussed in the previous sub-section, I am not here making a
claim as to how BVA interpretations come about, what their formal properties are, whether BVA
might or might not be an instance of a more general type of interpretation, or whether or not BVA
is a fundamental “category” of interpretation or merely a heterogenous grouping of different types
of interpretations that happen to seem superficially similar. These are all relevant questions, and the
establishment of the law in (1) is an important step in resolving them, but for now, I will simply
stipulate that interpretations of the style described above are “BVA interpretations”. Any such
reading, where sentences of type S can have interpretations where the meaning of expression Y
varies across the different individuals “expressed by” X will be called BVA(S, X, Y).
As can be seen in (1), the expression is not simply BVA(S, X, Y), but *BVA(S, X, Y). * here
is taken to mean “unavailable to the individual in question”, so *BVA(S, X, Y) means that the
speaker cannot have such a BVA interpretation for the S, X, Y triplet in question. That is, BVA(X,
Y) is not a possible interpretation of S. Returning to the example of S=‘every boy loves his mother’,
*BVA(S, every boy, his) would indicate that the individual in question feels that ‘every boy loves his
mother’ cannot have an interpretation where it means that A loves A’s mother, B loves B’s mother,
etc.; perhaps the individual thinks ‘his mother’ in that sentence can only refer to a specific boy’s
mother, say John’s mother, who is loved by all boys. Such an inability to accept BVA is unlikely for
the sentence ‘every boy loves his mother’, but for other sentences to be considered, e.g., ‘his student
spoke to every teacher’, such rejection is not only possible, but often likely.
We can note that the law takes the form of an implication “→ *BVA(S, X, Y)”. Thus, if
certain conditions are met, BVA(S, X, Y) for a particular S, X, and Y is predicted to be impossible
for the individual in question. These conditions are expressed to the left of the implication sign,
being *A(S, X, Y), *B(S, X, Y), and *C(S, X, Y); we can further note that they are joined with a
logical “and”, so all must be true in order for the implication to hold. A, B, and C are themselves
various sets of “factors” that can allow the establishment of BVA(S, X, Y). This list, I claim, is
exhaustive. That is, BVA can only come about if at least one of A, B, or C is present to allow it. If
none of these (types of) factors are present, then BVA is impossible. This is what the law expresses;
if A(S, X, Y), B(S, X, Y), and C(S, X, Y) are all * (unavailable) for a given S, X, and Y for a given
individual, that is, the speaker cannot understand S, X, or Y to have any of the relevant properties,
then BVA(S, X, Y) will be unavailable for that individual.
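The logical shape of the law can be sketched as a small Boolean function. This is only an illustration in Python, under the simplifying assumption that each factor is either available (`True`) or unavailable (`False`) for a given individual and a given S, X, and Y; the function and variable names are my own and carry no theoretical weight.

```python
from itertools import product


def bva_predicted_unavailable(a_ok: bool, b_ok: bool, c_ok: bool) -> bool:
    """Antecedent of the law in (1): *A and *B and *C all hold."""
    return (not a_ok) and (not b_ok) and (not c_ok)


# Enumerate all eight combinations of factor availability.
table = {
    (a, b, c): bva_predicted_unavailable(a, b, c)
    for a, b, c in product([False, True], repeat=3)
}
for (a, b, c), star_bva in table.items():
    verdict = "*BVA predicted" if star_bva else "no prediction"
    print(f"okA={a!s:5} okB={b!s:5} okC={c!s:5} -> {verdict}")
```

Only one of the eight rows, the one where all three factors are unavailable, yields a prediction; in the other seven the law is silent about whether BVA is accepted.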
What then are these factors? A is an umbrella term for factors having to do with the surface
form of the utterance. At the present time, only one such relevant factor is known: whether X
precedes Y in S⁸ (see discussion of Ueyama 1998 in Sub-Section 2.7). That is, if X comes before Y in
the linear surface form of S, then BVA(S, X, Y) is in principle possible. *A(S, X, Y), on the other
hand, would mean that X does not precede Y in S⁹. In such a case, BVA(S, X, Y) could not be
established due to X preceding Y, but it might still be possible if one of the other factors enabled it.
⁸ Or even perhaps outside of one S; see discussion in Ueyama 1998.
⁹ Technically *A(S, X, Y) should mean that the particular individual in question cannot understand X to come before Y in
S, but surface ordering should generally not be something on which different individuals disagree, given that it is clearly
verifiable. Were we comparing analogous sentences across languages with word orders that varied such that X and Y
changed positions, however, this would be a more important distinction to keep in mind.
The second element, B, is a set of factors having to do with how the sentence and its
elements are interpreted; these can range from narrow concerns, e.g., what sort of formal
representation can a particular individual assign to the word “his”, to very broad ones, e.g., whether
the individual understands ‘every boy loves his mother’ to be a generic statement of the properties
of boys, an observation being made about all the boys in a specific school, etc. B is thus most
likely a complex and heterogeneous blend of different factors, but, because they are all fundamentally
“interpretational” they all share the common trait of needing to be diagnosed experimentally via
certain tests (to be discussed in Section 2.7). That is, because these “quirky factors” (to borrow a
term from Ueyama 1997/1998) have to do with idiosyncrasies of how a specific individual
understands a given sentence, word, or phrase, they cannot be predicted just based on the surface
form of a given sentence, and thus require some sort of independent diagnostic to be run on the
individual in question to see if B(S, X, Y) is possible for that individual for the particular S, X, and Y
in question.
If those diagnostics come up negative, letting us know there is *B(S, X, Y), and also, X does
not precede Y, so there is *A(S, X, Y), the final way that BVA(S, X, Y) can come about is via C(S, X,
Y). These last factors are those of greatest interest to us, namely those dealing with properties
of the syntactic structure of the utterance¹⁰. As mentioned previously, the particular way in which
syntactic properties are formulated in this dissertation will be in terms of the abstract structural
relation c-command (to be introduced in Section 2.2), so whether there is C(S, X, Y) will be a
question of whether X is understood to c-command Y in (the structure of) S¹¹. Unlike factors in A,
whether or not X c-commands Y is not a priori obvious from the form of the sentence in question,
but unlike B, we will be able to make definite hypotheses linking a given S and whether or not X
c-commands Y in it without the need for individual-specific tests. As such, whether we have *C(S, X,
Y) can be read off of a given sentence provided we have (correctly) made the relevant structural
hypotheses for that sentence.
¹⁰ Note that such factors have a different status in our theory than the others, as these alone are said to derive from the
CS, because, by definition, the CS is mathematically formal, whereas the sources of A(S, X, Y) and B(S, X, Y) may or may
not be.
¹¹ For terminological simplicity, I will conflate the surface form of a sentence, relevant for A(S, X, Y) with the structural
representation of that sentence, relevant for C(S, X, Y), as well as potentially whatever form or forms of representation
are relevant for B(S, X, Y), calling all of these things “S”; clearly they are not quite the same, but for a given judgement,
there should be a function mapping between them, so this should not create any problems.
If there is indeed *C(S, X, Y), and also *A(S, X, Y) and *B(S, X, Y) for the individual in
question, then the law in (1) predicts that there must be *BVA(S, X, Y) for that individual. Put into
words, given our current understanding of the various factors, the law states that if X does not
precede Y in S, no “quirky factors” are diagnosed for X, Y, and S for the individual in question, and
X does not c-command Y in S, then the individual in question cannot accept a reading for S that
involves X entering into BVA with Y. It should also be noted that an implication necessarily implies
its contrapositive; thus, if we take “ok” to mean the opposite of “*”, that is roughly, “is available for
the individual in question”, then (1) implies (2):
(2) okBVA(S, X, Y) → okA(S, X, Y) ∨ okB(S, X, Y) ∨ okC(S, X, Y)¹²
In essence, (2) states that if BVA(S, X, Y) is available for the individual in question, then it must
be the case that either (i) X precedes Y in S, (ii) “quirky factors” are diagnosed for S, X, or Y for the
individual, and/or (iii) X c-commands Y in S. This articulation is truth-conditionally equivalent to
the previous one, but highlights the fact that the “law” in question is essentially a statement about
when BVA(S, X, Y) is possible; namely, at least one of its facilitating “conditions” must be met,
either A(S, X, Y), X preceding Y in S, B(S, X, Y), “quirky effects” on S, X, and/or Y, or C(S, X, Y),
X c-commanding Y in S. If none of these conditions is met, then BVA(S, X, Y) is impossible,
bringing us back to what is stated in (1). The law thus gives us predictions both about when BVA(S,
X, Y) must be impossible for an individual, and about what must be true when BVA(S, X, Y) is
possible for the individual.
¹² This form makes clear the debt this formulation of the law owes to Ueyama (1998), who proposes, in essence that
BVA(S, X, Y) comes about due to either X c-commanding Y in S or X preceding Y in S, but also, as mentioned, coins the
term “quirky effects” in an appendix discussing apparent exceptions to this “c-command or precedence” requirement.
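The equivalence between (1) and its contrapositive (2) can be checked mechanically by enumerating every combination of truth values. The following Python sketch does so under the assumption, taken from the text, that “ok” is simply the opposite of “*”; the function names `law_1` and `law_2` are labels of convenience.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def law_1(ok_a, ok_b, ok_c, ok_bva):
    # (1): *A and *B and *C -> *BVA, reading *P as "not okP"
    return implies((not ok_a) and (not ok_b) and (not ok_c), not ok_bva)


def law_2(ok_a, ok_b, ok_c, ok_bva):
    # (2): okBVA -> okA or okB or okC
    return implies(ok_bva, ok_a or ok_b or ok_c)


rows = list(product([False, True], repeat=4))
equivalent = all(law_1(*r) == law_2(*r) for r in rows)
violations = [r for r in rows if not law_1(*r)]
print(equivalent, violations)
```

The two formulations agree on all sixteen rows, and the only row that violates either is the one in which A, B, and C are all unavailable and yet BVA is accepted.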
In this dissertation, I demonstrate that, across multiple choices of S, X, and Y, in three different
languages, English, Korean, and Mandarin Chinese, this law holds; BVA interpretations are available
only once the conditions are met for either A, B, or C. I show that, across speakers of each
language, when each factor (A-C) is isolated, it can facilitate BVA by itself, and when all three are
absent, BVA is consistently impossible. That is, if, say, we have *A(S, X, Y) and *B(S, X, Y) (X does
not precede Y and there are no “quirky effects”), then whether or not BVA(S, X, Y) is possible is
predicted to depend entirely on whether there is *C(S, X, Y) or not; if X does not c-command Y in S
under these conditions, BVA(S, X, Y) is always impossible, whereas if X does c-command Y in S
under these conditions, BVA(S, X, Y) is not impossible. The same can be established for both A(S,
X, Y) and B(S, X, Y) as well.
Such results both build on and expand the coverage of what has been found in previous
experiments (some of which are discussed in Section 3.2), covering a much wider set of languages
and lexical items, as well as making improvements in the collection methodology and thus the
richness of the data collected. The successful demonstration of the law’s predictive power in these
domains helps to fulfill the three goals articulated in the previous section: not only does it provide
evidence of a universal principle linking forms and meanings, but, via factor C, it demonstrates the
crucial role that syntactic structure plays in regulating such a link. Further, as will be shown, this law
may be exploited for use as a diagnostic tool for the presence/absence of structural relations like c-
command, as well as the properties of the elements taking part in MR relations, allowing for fine-
grained exploration of the nature of the structure-generating CS.
1.5 Key Summary
At this point, let me provide a brief summary of certain crucial points that have been stated
above, as these particular points will serve as guiding principles for the rest of the dissertation and
thus bear repeating. What must be kept in mind is that the general project to which this dissertation
belongs is to reconcile the universality of Chomsky’s proposed structure-building CS with the
variation in the acceptability judgements produced by human beings. Rather than treating this
variation as noise to be filtered out, the solution pursued is to build theories general enough
to accommodate genuine disagreement while simultaneously retaining definite predictive power.
As laid out in Section 1.3, the endeavors of this dissertation, and of all such works, primarily
concentrate on the following:
(3) General Goals
(I) Demonstrating the existence of “laws of language” mediating form and meaning that hold despite inter-speaker variation.
(II) Supporting the claim that these laws make reference to syntactic structure and thus reflect the properties of the CS.
(III) Leveraging the structure-sensitivity of these laws to verify and/or discover the properties of the CS.
In particular, this dissertation argues for the existence of a law, as formalized in (1), repeated
as (4) below:
(4) *A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
This law (which we may abbreviate as the “ABC-BVA law”¹³) essentially claims that the MR
BVA can be established only if at least one of three (types of) conditions is met. For our purposes,
we can understand these as one purely “external” condition, X preceding Y in S, one purely
“internal” condition, X, Y, and/or S being interpreted in such a way as to induce one of several
things lumped under the term “quirky effects”, and one “structural” condition, X c-commanding Y.
These notions will all be explicated in greater detail in Chapter 2. Crucially, however, this law is
demonstrated to consistently and deterministically hold across three different languages: English,
Korean, and Mandarin Chinese. Further, for each language, evidence for this law is found across
multiple different constructions with multiple different lexical items therein, and across multiple
speakers of the language.
Not only is this law shown to hold, meeting goal (I), it is further shown that each sub-part of
the law is necessary. No one part can be relaxed, which clearly demonstrates the crucial role played
by each factor in constraining form-meaning pairs. This necessarily includes the role played by C(S,
X, Y), namely abstract hierarchical structure, thus achieving (II). Further, in ways to be discussed in
the subsequent chapters, the analysis conducted breaks new ground by examining structures not
investigated previously, thus allowing for further testing and discovery of CS properties, fulfilling
goal (III). The successful demonstration of the law in (4) thus fulfills the goals articulated in (3), and
thereby sheds new light on the CS, providing a strong argument for both its general existence and
some of its particular properties, as will be described in subsequent chapters.
¹³ Note that as I have defined them, A(S, X, Y), B(S, X, Y), and C(S, X, Y) need to be filled in with hypotheses as to what
their contents are, as I have done in the previous section. As such, it is perhaps improper to speak of “the” ABC-BVA
law; there could be other hypotheses as to the contents of A, B, and C. ABC-BVA is thus a sort of family of laws, rather
than a single law itself, and what I am arguing for in this dissertation is a particular ABC-BVA law. For the sake of brevity,
I will refer to it as “the ABC-BVA law”, with the understanding that there are no other competing ABC-BVA laws currently
under discussion.
1.6 Preview of Final Results
How is the ABC-BVA law to be demonstrated? To understand this fully, we will need the
theoretical and methodological tools laid out in Chapters 2 and 3, respectively. At this point,
however, it is possible to get an appreciation for the overall results, even if a deeper understanding
will have to wait until later.
Because the ABC-BVA law predicts the (un)acceptability of BVA under certain conditions,
the natural way to test whether this law is correct is to elicit judgements about the availability of
BVA in various sentences with different properties. For this dissertation, this is done not only for
various individuals, but as mentioned, for various individuals across three different languages. For
any given S containing X and Y, each individual may, in essence, give one of two judgements: BVA(S, X, Y)
is possible or BVA(S, X, Y) is not possible. As I mentioned in Section 1.4, we can abbreviate these
as respectively okBVA(S, X, Y) and *BVA(S, X, Y).
Given the large number of datapoints collected, a way to visually represent the results will be
useful so that the overall patterns may be easily detected. To do this, we will represent each BVA
judgement received as either a red square, denoting okBVA(S, X, Y), or a green circle, denoting
*BVA(S, X, Y)¹⁴. Therefore, if there are, say, 100 red squares and 50 green circles, that would mean
that, in the judgements collected across different sentences and different individuals, BVA(S, X, Y)
was accepted 100 times and rejected 50 times. For the sake of consistency with experiments in other
works, let me refer to these as red and green “dots”, where we will understand “dot” to refer to
whatever the relevant shapes are in the particular visualization scheme used.
¹⁴ This may seem somewhat “flipped” relative to what is culturally typical, given that green usually means “yes” and red “no”.
However, the ABC-BVA law predicts *BVA(S, X, Y). As a result, no instance of *BVA(S, X, Y) will ever be a
counterexample to that law; as such, they are green in the sense of “non-threatening”. Instances of okBVA(S, X, Y), on
the other hand, might or might not be counterexamples to the law. If they occur in situations where one of the law’s three
conditions is not met, then they in fact support the law (see discussion of (2)), but if all three conditions are met and yet
okBVA(S, X, Y) still obtains, then the law is falsified. As such, red denotes something like “possible danger” here.
The counts of these dots alone will not tell us much about whether the ABC-BVA law is
correct or not. Rather, the ABC-BVA law divides up the space of responses by giving us three
relevant conditions to consider: whether we have *A(S, X, Y), *B(S, X, Y), and/or *C(S, X, Y) for
the given sentence for the given individual making the judgement. If all three conditions are met, the
prediction is that every judgement in such a situation must be *BVA(S, X, Y). If at least one of these
conditions is not met, then “all bets are off”, and we may see either okBVA(S, X, Y) or *BVA(S,
X, Y).
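What would count as a counterexample can be stated very compactly. The Python sketch below uses a hypothetical `Judgement` record with one flag per condition; the three sample records are invented purely to show the check running and are not data from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    star_a: bool  # *A(S, X, Y): X does not precede Y
    star_b: bool  # *B(S, X, Y): no "quirky factors" diagnosed for this individual
    star_c: bool  # *C(S, X, Y): X is not hypothesized to c-command Y
    ok_bva: bool  # True = okBVA(S, X, Y), False = *BVA(S, X, Y)


def is_counterexample(j: Judgement) -> bool:
    """Only okBVA under *A, *B, and *C together would falsify the law."""
    return j.star_a and j.star_b and j.star_c and j.ok_bva


# Invented records, for illustration only.
sample = [
    Judgement(True, True, True, False),   # predicted *BVA, observed *BVA
    Judgement(True, True, False, True),   # "all bets are off": okBVA is compatible
    Judgement(False, True, True, False),  # "all bets are off": *BVA is compatible
]
found = [j for j in sample if is_counterexample(j)]
print(len(found))
```

The check is deliberately asymmetric: a rejection of BVA can never be flagged, whatever conditions it occurs under.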
Let us represent these three conditions, *A, *B, and *C, as three intersecting circles, as shown
below in (5). If a red or green dot is inside a given circle, that means that the relevant condition was
met when the judgement was made. For example, a green dot inside of the *A circle and the *B
circle but outside the *C circle would mean that: (i) the individual in question’s judgement on the
sentence in question was *BVA(S, X, Y), and (ii) the conditions of that judgement were such that
there was *B(S, X, Y) and *A(S, X, Y), but not *C(S, X, Y)¹⁵. Similarly, a red dot outside of all three
circles would mean that (i) the individual in question’s judgement on the sentence in question was
okBVA(S, X, Y), and (ii) the conditions of that judgement were such that there was none of
*A/*B/*C(S, X, Y)¹⁶.
¹⁵ There is some heterogeneity between these factors; whether or not a given sentence is diagnosed as *A(S, X, Y) or *C(S,
X, Y) will simply depend on the form of the sentence in question. Namely, a sentence S will be considered *A(S, X, Y) if
X does not precede Y in S, and *C(S, X, Y) if X is not hypothesized to c-command Y in S. Whether or not the sentence
is diagnosed as *B(S, X, Y) will make reference to certain other properties of the individual in question, and thus a given
sentence may be diagnosed as *B(S, X, Y) or okB(S, X, Y) depending on which individual is judging it. Readers may be
concerned at this stage that diagnosis B(S, X, Y) may be circular, essentially a convenient “excuse” to dismiss any point of
data that does not fit our other predictions. As we will see in the subsequent two chapters, this is not the case; there
are clear criteria that determine whether we have *B(S, X, Y) or not for a given sentence judged by a given individual; these
criteria are independently diagnosed via specific tests, the basics of which are elaborated in Section 2.7.
¹⁶ One way of thinking about such graphs (and we will be seeing more throughout this dissertation) is that the color/shape
of the dots represents qualities of the judgements that were made, whereas the locations represent the properties of the
items that were judged (with the understanding that such properties are determined by the way in which the individual in
question represents such sentences in their own I-languages, allowing for the individual-specific factors in B(S, X, Y) to
form part of these “properties”).
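The placement scheme just described, in which location encodes the conditions and color encodes the judgement, can be sketched as a simple tally. This is a minimal Python illustration; the region labels and the three sample tuples are my own inventions for the purpose of the example and do not reproduce the dissertation's data.

```python
from collections import Counter


def region(star_a: bool, star_b: bool, star_c: bool) -> str:
    """Name a diagram region by the circles (*A, *B, *C) a dot falls inside."""
    flags = (("*A", star_a), ("*B", star_b), ("*C", star_c))
    inside = [name for name, flag in flags if flag]
    return "&".join(inside) if inside else "outside"


# Invented (star_a, star_b, star_c, ok_bva) tuples; not the dissertation's data.
judgements = [
    (True, True, False, False),   # green dot inside *A and *B, outside *C
    (False, False, False, True),  # red dot outside all three circles
    (True, True, True, False),    # green dot in the central intersection
]

tally = Counter()
for star_a, star_b, star_c, ok_bva in judgements:
    colour = "red" if ok_bva else "green"
    tally[(region(star_a, star_b, star_c), colour)] += 1

print(dict(tally))
```

On this representation, the law amounts to the expectation that the count for red dots in the central region stays at zero however many judgements are added.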
Taking the data from the experiments conducted, if we categorize all relevant judgements in
this fashion, we obtain the diagram in (5):
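(5) [Diagram: three intersecting circles labeled *A, *B, and *C, with red squares (okBVA) and green circles (*BVA) placed by region]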
This diagram will be discussed in greater detail as (367) in Sub-Section 7.2.2, but we can
already appreciate the basic result it represents. Of the almost 800 datapoints, gathered from over 30
individuals, including English, Korean, and Mandarin Chinese speakers, a completely exceptionless
pattern emerges, precisely in line with the ABC-BVA law. Crucially, in the central intersection of all
three circles, representing judgements on sentences for which *A, *B, and *C did obtain for the
individual in question, there is not a single instance of an okBVA, while there are many instances of
*BVA. On the other hand, in literally every other area of the diagram, there are numerous instances
of okBVA occurring alongside instances of *BVA. This is precisely what the ABC-BVA law
predicts; without a single exception, BVA is unacceptable (resulting in a green dot) provided that *A,
*B, and *C are ensured, and if any one of these conditions is relaxed, BVA becomes potentially
acceptable¹⁷.
There are of course many questions left unanswered at this point, regarding how things
break down with regards to different languages, different individuals, different sentences, etc., as
well as relevant hypotheses, experimental methodology, and other related issues. Answering these
questions is key to establishing the significance of these results and doing so will be the primary goal
of the remainder of the dissertation. Even at this early stage, however, we can note the striking
correspondence between the ABC-BVA law and the results achieved; not a single point of data is in
contradiction to the law, and, as we will increasingly see, most of the data gathered indeed actively
support it. I think even severe skeptics will agree with me that predictions matched so precisely with
results are not easy to achieve in research dealing with language, which is notoriously subject to the
sort of variation I have discussed above. As such, the probability that such results have obtained merely
by chance is quite low, a point I will discuss further in Chapter 7. Regardless of the ultimate
correctness of various aspects of my argument and interpretation, it thus seems that the ABC-BVA
law has captured a very real property of human language. Even this alone, I hope, readers will find
exciting.
¹⁷ Note that acceptability (represented by red dots) is not guaranteed in such situations by the ABC-BVA law, and as we
see, this is indeed the case; there are green dots in all regions of the diagram. The reason for this will be addressed further
in the following two chapters, but in essence, meeting one of conditions A-C is necessary, but not necessarily sufficient,
to enable BVA.
1.7 Outline
The rest of this dissertation will proceed as follows: first, in Chapter 2, I provide an overview
of relevant works addressing the relationship between syntactic structure and MR’s like BVA.
Starting from basic structural primitives, this discussion builds to include various factors that seem
to modulate whether different MRs are acceptable for a given sentence or not. The focus is on the
structural relation c-command, but as we will see, c-command alone seems insufficient for providing
a straightforward explanation for the phenomena involved. This discussion thus follows a trail of
theoretical concepts and points of concern that have led to the hypothesization of the multi-faceted
law expressed in (1).
While Chapter 2 provides the full theoretical underpinnings of the correlational approach to
be employed, it does not discuss practical matters regarding how this approach may be implemented.
Chapter 3 thus complements this theoretical discussion with methodological details. It starts off with
discussion of the basic philosophy behind the experiments performed, followed by discussion of
recent experimental work conducted in this style. These works in turn inform the experiments that
are performed for this dissertation; following the review of such works, I give an overview of the
experimental “template” that is followed by all the main experiments conducted for this dissertation.
The discussion there includes motivation for the way in which experimental items¹⁸ were designed
and displayed, as well as explicit articulation of other procedures involved in obtaining the data. At
this point, readers should have a much more grounded sense of how the basic program summarized
in Section 1.5 will be carried out, and thus how the results previewed in Section 1.6 will be obtained.
As such, to end Chapter 3, I give a brief overview of the results obtained from these experiments in
more detail, as well as provide overall discussion of various issues and points of interest to keep in
mind going forward.
¹⁸ By which I mean the individual sentences and interpretations as they were conveyed to participants.
The next three chapters each focus on experiments conducted in one of the three languages
under consideration in this dissertation: Chapter 4 focuses on English, Chapter 5 on Korean, and
Chapter 6 on Mandarin Chinese. This ordering reflects the order in which the data were gathered,
which is occasionally relevant for specific concerns (to be discussed further in Sub-Section 3.4.2 and
in the relevant chapters). These issues are relatively minor, however, and the results of each chapter
stand on their own as independently significant results, albeit results that are made even more
significant through comparison with one another.
These three chapters follow a relatively consistent internal structure. In the first section of
each chapter (X.1), I introduce relevant background literature and discussion of any language-
specific issues that affect either the experimental setup or interpretation of results. In the following
three sections, I then consider each of the three “factors”, A, B, and C, and their demonstrated
influence on the data obtained: A, namely linear word order, in X.2; B, quirky effects, in X.3; and C,
syntactic structure via c-command, in X.4. In each section, I consider the factor’s influence across
different lexical items, different MRs, and different constructions/sentence types, ultimately
demonstrating how the influence of the factor in question can be consistently seen across these
different domains once the appropriate controls and correlations are applied. In the final section of
each chapter, (X.5), I consider the overall implications deriving from the combined results for each
of the factors in question, explaining how the results of the experiments in the particular language in
question contribute to the overall project, as well as discussing any remaining issues to be addressed.
In the final chapter, Chapter 7, I initially summarize what has been shown for each language
separately in Section 7.1 and then go on in Section 7.2 to discuss the significance of the results taken
as a whole. Subsequently, in Section 7.3 I address potential criticisms that may be levelled at the
experiments conducted and/or the significance of their results, with the intent to demonstrate that the
data and methodology used to collect it in fact answer these concerns well. Following this, in Section
7.4, I discuss what I see as the natural limitations of the particular experiments conducted and
explain how future work can expand upon and correct these various issues. Finally, I close in Section
7.5 with discussion of how this work has contributed to the much broader project of studying
judgement variation and the CS and what its implications might be, both within the field of
linguistics and beyond it.
2 Structure and Meaning
2.1 Introduction
In this chapter, I build up the key theoretical notions that underlie the BVA law formalized
in the previous chapter, which thereby motivate the specific investigations performed for this
dissertation. I will briefly review the development of some of these ideas in the literature, and
comment on some relevant alternative proposals.
To start, in Section 2.2, I build up the notion of c-command from theoretical primitives,
most crucially the operation “Merge” (Chomsky 1995). As will be noted, this relation forms the basis
for all structural hypotheses in this dissertation. Then, in Section 2.3, I go on to discuss how
c-command has been argued to constrain MR’s (Meaning Relations) like BVA, as well as other
MR’s relevant to this dissertation, namely Coref (Coreference) and DR (Distributive Reading)¹⁹.
From there, I go on to discuss the most relevant of the numerous complications and controversies
surrounding the link between MR’s and c-command in the theoretical literature. These include
issues of linear precedence, addressed in Section 2.4, as well as seemingly exceptional constructions,
such as possessives, addressed in Section 2.5.
¹⁹ Abbreviations taken from Hoji (2015, 2022a-c, and elsewhere)
Sections 2.4 and 2.5 also address attempts to deal with these apparent counterexamples to
c-command-based theories, by either building a multi-faceted theory of BVA or by redefining various
structural notions in order to preserve a unified account. The former approach, though occasionally
explored, has generally not found widespread acceptance in the literature, yet as I show (and others
have shown), the redefinitional approach, though at times promising, seems to lead us down an
endless rabbit hole of ever weakening our hypotheses, with very little ultimate payoff. As I note,
such attempts have also frequently been marred by inadequate attention to issues of precedence,
yet even a hybrid theory that incorporates precedence will still encounter persistent issues. As such,
in Section 2.6, I explore yet another type of approach commonly taken, which is to assume that
certain relevant operations apply “covertly”. There are various ways in which this general idea is
manifested, and while the relevant works surveyed do not quite resolve all the relevant issues
regarding BVA and other MR’s, they make significant progress in certain areas.
Because it has not generally been the focus of past theoretical work, I mostly set aside the
issue of judgement variation for the above discussions. However, as it turns out, such variation is
exactly the sort of thing that needs to be considered, in combination with the various theoretical
innovations described above. I thus return to the discussion of variation in Section 2.7. Ultimately,
these considerations lead to the “correlational methodology” first proposed by Hoji 2017. While
certain questions are still left unanswered, Hoji’s approach to variation provides an account for
(potentially) all the “problematic cases” discussed in previous sections, offering an avenue to resolve
the longstanding theoretical disputes which were not initially recognized as having anything to do
with variation. I further demonstrate how this correlational approach, combined with other key
hypotheses in the ABC-BVA law, in fact derives many of the past generalizations regarding BVA as
special cases of this general law. Thus, many of the seemingly disparate theoretical proposals
discussed have in fact been pointing towards the same basic principle all along. In Section 2.8, I
offer some speculative analysis of the “questions left unanswered” by the methodology; this analysis
is not crucial for understanding the rest of the dissertation but may be of interest to some. Finally, in
Section 2.9, I note how this understanding in turn yields a deeper insight into the ABC-BVA law,
properly contextualizing the role that structure plays in constraining BVA, which previous
approaches have come close to but not fully captured due to, I suggest, their incomplete reckoning
with judgement variation.
2.2 Merge and C-Command
2.2.1 The Operation Merge
Though syntactic structure has been an aspect of generative syntactic theory since the 1950’s,
Chomsky 1995 articulates a significant theoretical claim that represents a potential advance in our
understanding of the nature of such structure. In particular, Chomsky claims that all syntactic
structures derive from the repeated application of a single operation. This operation, Merge,
combines two linguistic elements, A and B, into a set {A,B}, as schematized in (6):
(6) [Schematic: Merge combines A and B into the set {A, B}]
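The set-forming character of Merge described above can be mimicked with an immutable set type. The following Python sketch uses `frozenset` merely as a stand-in for set formation; the function name `merge` and the choice of words are my own, and nothing here is meant as a claim about the actual structure of any English phrase.

```python
def merge(a, b):
    """Combine two elements into the unordered set {a, b}."""
    return frozenset({a, b})


# Lexical items are plain strings here; the words are arbitrary examples.
inner = merge("his", "mother")
outer = merge("loves", inner)

print(merge("his", "mother") == merge("mother", "his"))
print(inner in outer)
```

Two properties are visible even in this toy version: the output is unordered, so the order of the inputs does not matter, and the output of one application can itself be an input to the next, which is what allows repeated application to build hierarchical structure.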
Chomsky motivates Merge on the grounds of being the simplest solution for (one aspect of)
the problem of linguistic creativity. Specifically, Chomsky (2017) provides an observation whose
genesis he attributes to certain remarks by Galileo, that there are an infinite
20
number of sentences,
even though there are a finite number of linguistic elements, e.g., letters, words, etc. This can be
easily confirmed; given a set of sentences S, it is always possible to make a sentence S’ which is not
in S, simply by adding on a new sentence with some sort of conjunction like ‘and’ (or inserting
modifying adjectives/adverbs into the sentence, etc.). At some point, the sentence will likely become
nonsensical and difficult for a human to effectively process, but it will formally be a sentence,
20
Technically, Chomsky does not credit Galileo with observing the infiniteness of possible sentences, but rather, noting
with wonderment that just reconfiguring the few letters of the alphabet can express the thought of any one at any time
anywhere in the world. However, unless Galileo thought that there were a limited number of thoughts people could have,
then he was effectively saying that the letters of the alphabet could be combined to represent an infinite array of utterances,
which is the intended point here.
nevertheless. As such, it cannot be that any individual has a preset list of possible sentences in their
mind, as such a list would be infinitely long and thus un-storable in memory. Indeed, even were the
list finite, there would be the question of where it comes from. Different individuals speak different
languages and must therefore have different “lists”, implying they are not inborn. If so, then they
must come from experience. The claim, however, that an individual can only repeat or understand
sentences they have experienced is easily falsified, as in Chomsky (1957 and elsewhere)’s famous
“colorless green ideas sleep furiously”. Though the intent of this sentence is to show that effectively
nonsensical strings of words can still be valid “sentences” of English, because it is nonsensical, it is
also highly unlikely that Chomsky had heard it from someone else before he thought of it. If he did
hear it from someone, then the question becomes from whom that someone else heard it, and on
and on in an endless chicken-and-egg sequence, demonstrating a conceptual problem with the
“repetition” theory.
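The conjunction argument given earlier in this subsection can be illustrated with a small sketch. The function name `new_sentence` and the toy sentences are my own illustrative assumptions; the point is only that, for any finite, non-empty collection of sentences, conjoining its longest member with itself yields a string longer than every member, and hence one that is not on the list.

```python
def new_sentence(sentences):
    """Build a sentence that cannot be in the given finite, non-empty collection.

    Hypothetical helper: it conjoins the longest member with itself, so the
    result is strictly longer than every sentence already in the collection.
    """
    longest = max(sentences, key=len)
    return longest + " and " + longest


s = {"John ate cake", "Mary drank milk"}
for _ in range(3):
    s_prime = new_sentence(s)
    print(s_prime in s)  # False on every pass
    s = s | {s_prime}    # extend the "list" and repeat the argument
```

Each pass enlarges the list, yet the same procedure always produces a further sentence outside it, which is the sense in which no finite list can be exhaustive.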
If we can accept that all possible sentences are not simply stored in a person’s mind, then
there must instead be some sort of generative procedure that allows for new sentences to be made.
This in turn entails there ought to be some sort of generative module of the mind (sometimes called
a mental “organ”) that executes this procedure, giving us motivation for the CS discussed in the
previous chapter. This CS must achieve “digital infinity”, allowing for an infinite set of utterances
created by combining finite, discrete atomic elements, via some sort of combinatory operation(s).
Chomsky argues that the simplest such operation that will achieve this digital infinity is an operation
that combines two elements into a new element, this operation being the aforementioned Merge.
To quote from Chomsky 2017, Merge is:
“[…] the simplest recursive operation, embedded in one or another way in all others. This
operation takes two objects already constructed, say X and Y, and forms a new object Z,
without modifying either X or Y, or adding any further structure to them. Z can be taken to
be just the set {X, Y}. […] Since Merge imposes no order, the objects constructed, however
complex, will be hierarchically structured, but unordered; operations on them will necessarily
keep to structural distance, ignoring linear distance. The linguistic operations yielding the
language of thought must be structure-dependent, as indeed is the case.”
There are several relevant points to unpack here. First, Merge is recursive; it is able to apply
to its own outputs. As a result, if we take my hypothetical {A,B} in (6) and Chomsky’s Z, a.k.a.
{X,Y} from the above quote, we should be able to apply Merge to these two items, creating
{{A,B},Z}:
(7)
Second, Merge does not itself encode an order. (7) is therefore equivalent to (8):
(8)
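As a minimal sketch of these first two points, Merge can be modeled as a function that returns an unordered set. Representing atoms as strings and Merge outputs as Python `frozenset` objects is my own illustrative assumption, not part of Chomsky's proposal; the variable names `seven` and `eight` simply stand for the structures in (7) and (8).

```python
def merge(x, y):
    """Combine two syntactic objects into the unordered set {x, y}."""
    return frozenset([x, y])


ab = merge("A", "B")   # (6): {A, B}
z = merge("X", "Y")    # Chomsky's Z = {X, Y}

# Recursion: Merge applies to its own outputs.
seven = merge(ab, z)   # (7): {{A, B}, Z}
eight = merge(z, ab)   # (8): the same two items, supplied in the other order

print(seven == eight)                       # True
print(merge("A", "B") == merge("B", "A"))   # True
```

Because the output is a set, the order in which the two items are supplied leaves no trace in the resulting object.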
Merge creates unordered sets, and as such {A,B} is equivalent to {B,A}, as they have the
same elements, A and B. Under Chomsky’s account, therefore, there is no direct “Merge order”
analogue of “word order”. If the structures Merge builds are universal, then it is possible that word
order is an I-language property determined by what data the individual in question is exposed to.
That is, two individuals might both form the structure in (7), but one individual “pronounces” this
structure “A B X Y”, another “Y X B A”, and yet another “B A X Y”, etc., based on particular
externalization rules the individual has learned (which may or may not make reference to the
properties of A, B, X, and Y as well as the structure itself). There is a popular type of alternative to
this language-specific notion of word order (most famously articulated in Kayne 1994’s theory of
“Antisymmetry”), which holds that the externalization process is in fact universal, and that languages
in fact vary in terms of what structures they use Merge to build, partially influenced by what
properties they assign to their lexical elements (our A, B, X, and Y). Resolving this question is far
beyond the scope of this dissertation. I will note, however, that despite the differences between the
languages to be investigated, they all seem to behave as if the underlying structure of the sentences
in question is indeed at least quite similar, if not identical. However, in fairness to the “universal
externalization” camp, there are no particularly “crucial” cases tested, so it may well be that more
focused investigation would indeed reveal underlying structural differences between the languages.
For our purposes, we can remain agnostic, noting simply that if structural differences exist, they
were not relevant for the hypotheses and predictions utilized in this dissertation.
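The language-specific view of word order described in this paragraph can be sketched as follows: one and the same unordered structure is passed to different externalization rules, each of which decides how to order the two daughters of every set. The rules `ascending`, `descending`, and `mixed` are hypothetical stand-ins (alphabetical ordering is used only because it is easy to compute), and `mixed` illustrates a rule that refers to a property of the elements involved.

```python
def merge(x, y):
    return frozenset([x, y])


def leaves(node):
    """Return the atoms inside a node as a sorted list."""
    if isinstance(node, str):
        return [node]
    return sorted(leaf for d in node for leaf in leaves(d))


def externalize(node, order_daughters):
    """Spell out a structure; order_daughters orders each pair of daughters."""
    if isinstance(node, str):
        return [node]
    first, second = order_daughters(node)
    return externalize(first, order_daughters) + externalize(second, order_daughters)


def ascending(node):
    return sorted(node, key=leaves)


def descending(node):
    return sorted(node, key=leaves, reverse=True)


def mixed(node):
    # Hypothetical element-sensitive rule: a set with A as a daughter is reversed.
    return descending(node) if "A" in node else ascending(node)


seven = merge(merge("A", "B"), merge("X", "Y"))   # the structure in (7)
print(" ".join(externalize(seven, ascending)))    # A B X Y
print(" ".join(externalize(seven, descending)))   # Y X B A
print(" ".join(externalize(seven, mixed)))        # B A X Y
```

The structure itself is identical in all three cases; only the rule applied at externalization differs.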
A final point to note from Chomsky’s quote above is that Merge allows us to set up a new
type of measure of distance between two elements, not based on word order/sequence, but based
on structural properties. There are many such metrics we can set up, defining not only distance but
other abstract measures and relations as well. To do so in an intelligible fashion, however, it is
necessary to develop the appropriate vocabulary for the relations between elements in Merge-based
structures. It is crucial to note that Merge is binary as Chomsky has defined it, combining only two
items at a time. As such, if all syntactic structure is built up by Merge, then for any sentence/phrase
consisting of more than two “units” (e.g., words), there must be exclusive hierarchical subgroupings
which contain some of the elements but not others. As it turns out, the relationships between these
groupings, often referred to as “constituents”, play, or at least have been argued to play, an outsized
role in constraining mappings between form and meaning, as will become evident shortly.
2.2.2 Basic Structural Relations
Before considering the effects of the relations between constituents, we first need to define
what those relations might be. That is, given the nested structures created by Merge, how can we
describe one element’s position with respect to another? While the possibilities are endless, there are
basic notions whose relevance is immediately apparent and which have long been mainstays of
generative syntactic theory, even before Chomsky’s articulation of Merge. Consider the simple case
in which we have elements A and B, which may themselves be atomic or derived. If Merge applies to
create {A, B}, then the relationship between A and B is labeled “sisterhood”:
(9) Sisterhood:
X and Y are sisters iff X and Y have undergone Merge to create {X,Y}
In (7), for example, A and B are sisters, X and Y are sisters, and Z and {A,B} are sisters,
with no other such pairs existing. Following the genealogical metaphor, we can term the constituents
which undergo Merge to form larger constituents the “daughters” of that larger constituent:
(10) Daughterhood:
X and Y are the daughters of Z iff Z={X,Y}
Returning to (7), A and B are the daughters of {A,B}, X and Y are the daughters of Z, and
{A,B} and Z are the daughters of {{A,B},Z}, with no other “daughterhood” relationships existing.
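The definitions in (9) and (10) can be checked mechanically against (7). The sketch below again models Merge outputs as `frozenset` objects, which is my own assumption; the helper names `constituents`, `daughters`, and `are_sisters` are likewise illustrative.

```python
from itertools import combinations


def merge(x, y):
    return frozenset([x, y])


def constituents(node):
    """Yield a node and every constituent inside it."""
    yield node
    if isinstance(node, frozenset):
        for d in node:
            yield from constituents(d)


def daughters(node):
    """(10): the daughters of Z = {X, Y} are X and Y; atoms have none."""
    return set(node) if isinstance(node, frozenset) else set()


def are_sisters(x, y, root):
    """(9): X and Y are sisters iff Merge has created {X, Y} within root."""
    return x != y and any(c == frozenset([x, y]) for c in constituents(root))


ab = merge("A", "B")
z = merge("X", "Y")
seven = merge(ab, z)   # (7)

print(are_sisters("A", "B", seven))   # True
print(are_sisters(ab, z, seven))      # True
print(are_sisters("A", "X", seven))   # False
print(daughters(seven) == {ab, z})    # True

# Exhaustive check: exactly three sister pairs exist in (7).
nodes = set(constituents(seven))
pairs = [p for p in combinations(nodes, 2) if are_sisters(*p, seven)]
print(len(pairs))                     # 3
```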
Let us consider a slightly more built-up structure, resulting from iteratively applying Merge
to A and B, then to C, and then to D, as in (11) below:
(11)
Here, we see that A, B, C, and D are all “inside of” {D,{C,{A,B}}}. Yet, in terms of the
relationships previously defined, all we can say is that {D,{C,{A,B}}}’s daughters are D and
{C,{A,B}}. This correctly captures {D,{C,{A,B}}}’s relationship with D, but does not directly
indicate that A, B, and C are ultimately “inside” {D,{C,{A,B}}} as well. We can, however, create a
more “recursive” definition for a relation called “domination” or “containment” which captures the
relevant properties:
(12) Domination/Containment:
X dominates/contains Y iff:
(a) Y is X’s daughter or
(b) Y is the daughter of an element Z, which X dominates/contains.
By the definition in (12), we can find that {D,{C,{A,B}}} contains its daughters, D and
{C,{A,B}}, the latter of which also contains its daughters, C and {A,B}, and {A,B} in turn contains
A and B. By (12), {C,{A,B}} contains everything {A,B} contains, so {C,{A,B}} contains not only
C, but also A and B. By a similar logic, {D,{C,{A,B}}} contains A, B, C and D (as well as {A,B} and
{C,{A,B}}). Thus, the containment/domination relation accurately captures the desired property of
“inside-ness”.
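The recursive character of (12) is easy to see in a short sketch. The function `dominates` below is an illustrative top-down restatement of (12), which I take to be equivalent to it: X dominates Y iff Y is a daughter of X or some daughter of X dominates Y. The set-based encoding of Merge is again my own assumption.

```python
def merge(x, y):
    return frozenset([x, y])


def dominates(x, y):
    """(12), restated top-down: Y is X's daughter, or a daughter of X dominates Y."""
    if not isinstance(x, frozenset):
        return False   # atoms contain nothing
    return any(d == y or dominates(d, y) for d in x)


ab = merge("A", "B")
cab = merge("C", ab)
dcab = merge("D", cab)   # the structure in (11)

print(sorted(y for y in ["A", "B", "C", "D"] if dominates(dcab, y)))
# ['A', 'B', 'C', 'D']
print(dominates(cab, "A"), dominates(cab, "D"))   # True False
print(dominates(ab, "C"))                         # False
```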
It is important to note that such structural notions precede Chomsky 1995’s conception of
them as derivative of Merge; one does not need to have any particular explanation for how such a
structure is made in order to define them. The Merge-based account, however, connects such
relations with Chomsky’s theory of the CS and its solution to the problem of infinite syntactic
creativity. We should remember too that our goal in defining such relations is, following the general
goals laid out in Section 1.3, to help us to formulate hypotheses that make predictions about the
ways in which structure constrains the mapping between form and meaning. For this purpose,
containment/domination by itself is relatively limited, as the best it can do is express that the
properties of a given thing are based on the properties of what is inside of it
21
. Rather, containment
can be combined with sisterhood to create a new relation, “c-command”, which was identified by
Reinhart (1976)
22
, like the other structural notions discussed here, well before the theory of Merge
came about. In terms of the definitions given above, it can be succinctly defined as in (13):
(13) C-command:
X c-commands Y if X is sister to some element Z which contains Y.
In terms of (11), D c-commands everything {C,{A,B}} contains, most crucially the atomic
elements A, B, and C, while C does not c-command D but does c-command A and B, and A and B
21
Of course, there may be times when what is “inside of” what is non-obvious. Further, it is possible that a given
phenomenon might make reference to a constituent and its contained elements in theoretically interesting/revelatory ways.
As such, the relation by itself should not be dismissed as predictively useless out of hand, even though it is not by itself
the focus here.
22
Though, as Reinhart notes, X c-commanding Y is effectively the same as what Klima 1964 calls “Y being in
construction with X”, so the conceptual history of the relation is somewhat longer than it might appear.
each do not c-command anything
23
. In terms of Merge, c-command can be understood to formalize
the long-distance relations created by the “path” of Merge; D c-commanding A, B, and C, for
example, signifying that D has been merged with an element which was itself built up via Merge
from elements including A, B, and C.
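The definition in (13) can likewise be sketched and tested against (11). The code below follows the strict variant used in the main text, under which elements do not c-command their own sisters; the helper names and the `frozenset` encoding are illustrative assumptions.

```python
def merge(x, y):
    return frozenset([x, y])


def constituents(node):
    """Yield a node and every constituent inside it."""
    yield node
    if isinstance(node, frozenset):
        for d in node:
            yield from constituents(d)


def dominates(x, y):
    """X contains Y: Y is X's daughter, or a daughter of X contains Y."""
    if not isinstance(x, frozenset):
        return False
    return any(d == y or dominates(d, y) for d in x)


def c_commands(x, y, root):
    """(13): X c-commands Y if X is sister to some Z which contains Y."""
    for c in constituents(root):
        if isinstance(c, frozenset) and x in c:
            if any(z != x and dominates(z, y) for z in c):
                return True
    return False


ab = merge("A", "B")
cab = merge("C", ab)
dcab = merge("D", cab)   # the structure in (11)

print([y for y in "ABC" if c_commands("D", y, dcab)])   # ['A', 'B', 'C']
print(c_commands("C", "D", dcab))                       # False
print([y for y in "AB" if c_commands("C", y, dcab)])    # ['A', 'B']
print(any(c_commands("A", y, dcab) for y in "BCD"))     # False
```

Adopting the alternative in which sisters c-command one another would only require also returning True when Y is itself the sister of X.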
2.3 Crossover Effects
2.3.1 Strong Crossover
C-command was developed not just as an abstract structural relation, but as a tool for
formalizing the way in which sentence structure constrains possible interpretations, as can be seen in
Reinhart 1976 and Chomsky 1981. These early works on c-command expressed optimism that it
might be the fundamental relationship mediating form and meaning. As will be seen later in this
chapter, however, this c-command-based program has encountered great difficulties in the decades
since its inception, which have led many to advocate either modifying the program or abandoning it
altogether. Hoji (2017) notes pointedly that the challenge for the basic Chomskyan approach is to
show that a Merge-based structural relation like c-command can serve as a basis for hypotheses that
predict acceptability judgements in testable and demonstrable ways, the strong implication being that
this has not been done, or at least not to a sufficient degree. As will be shown, despite having
promise in a wide variety of areas, c-command-based predictions do seem to suffer predictive failure
time and time again. The correlational approach, itself initially suggested in Hoji 2017, can be seen as
an attempt to overcome these failures, and to rehabilitate (a modified version of) the Reinhartian
account of c-command as the key mediator of form and meaning.
23
Some theories modify either the definition of containment or c-command itself to let elements also c-command their
sisters, so A and B would mutually c-command one another. Which variant we follow will not be crucial for this dissertation.
Before turning to address the problems of the c-command account and their potential
resolutions, let us first see how the account works via an example, which
will be relevant for much of this dissertation. This rather classical illustration is the case of
“crossover effects”. These effects were documented prior to Reinhart 1976 and 1983’s formulation of
c-command and its links to interpretation but find straightforward analysis under such an account.
Postal (1971) notes that certain configurations, which Wasow (1972) would later label
“strong crossover” configurations, disallow certain readings. Consider a sentence like (14):
(14) “Who does he trust?”
We can note that (14) is an interrogative version of a sentence like “he trusts X” where X is
filled in by whoever the object of ‘trust’ is. As is typical in English, in the interrogative version of the
sentence, the wh-word, in this case ‘who’, appears at the beginning of the sentence, rather than
where X would go in the declarative. However, to use a term originating from Chomsky 1981, ‘who’
is still clearly receiving the same “theta role” as X; that is, it has the same semantic relationship with
the verb ‘trust’, namely, the individual who is being trusted. Assuming that the “object” of ‘trust’
always receives the “trustee” theta role, we may thus still refer to ‘who’ as the “object” of ‘trust’. We
will return to its “object” status shortly below, but in the meantime, we may say that (14) is an
instantiation of a general sentence pattern S=X Y V, where Y is the “subject” in its usual pre-verbal
position and X is the wh-object.
For convenience, let us introduce another MR at this point, which, following Hoji
(2019/2022b), we will call “Coref(S, X, Y)”. Coref(S, X, Y) simply means that in sentence S, Y is
interpreted as referring to the same individual as X, where Y will typically be something like a
pronoun. For example, if we have S=“John complained about his homework”, then English
speakers will generally agree that Coref(S, John, his), that is, ‘his homework’ means essentially
‘John’s homework’, is a possible and altogether likely interpretation of the sentence. Regarding
sentences like (14), however, Postal reports *Coref
24
(S, who, he), that is, in this sentence, ‘he’ cannot
co-refer with, i.e., refer to the same individual as, ‘who’. The sentence can thus not mean something
like “which person is such that that person trusts himself?” or more simply “who trusts himself?”.
Postal’s observation is that, in general, for sentences of the S=X Y V type, Coref(S, X, Y) is
impossible.
2.3.2 Crossover and Structure
Postal hypothesizes that the fact that we have *Coref(S, X, Y) for such S is due to the
underlying syntactic structure of S. Specifically, Postal advances the theory that the “object-hood” of
elements like ‘who’ in (14) should be taken literally. In essence, not only is the sentence-initial
wh-element fulfilling the same semantic role as a typical, post-verbal, declarative object, but it in fact
occupies the same position in the syntactic structure as well, or at least has a close relationship with
that position.
As to declarative objects, various tests can be applied to determine their likely syntactic
position
25
. The results of these tests have generally been taken to suggest that verbs and objects are
24
Whether the MR in question is properly Coref(S, who, he), BVA(S, who, he), or a third, wh-word specific MR(S, who,
he) is a somewhat subjective point. MR’s are labels of convenience for surface phenomena, so we are free to define them
however we want; these definitions of course have consequences, but since wh-sentences will be of only passing interest
to us here, we need not worry too much about what we call this sort of MR.
25
These tests include examining what tends to move together as a unit, what tends to be replaced together by things like
pronouns, and what tends to get conjoined by conjunctions, etc. (see the overview in Lasnik 1990, for example, for a
smattering of such tests). For example, if we take “John ate cake” as our base sentence, we can see in (i) that it is easy to
replace ‘ate cake’ with something like ‘did so’, but there is no obvious equivalent for replacing ‘John ate’ with something;
trying ‘did so’ in (ii) for this purpose yields an essentially uninterpretable sentence:
(i) John did so.
(ii) Did so cake.
Merged first, and then undergo Merge with the subject as a unit, giving a structure like {Subject,
{Verb, Object}}, displayed in “tree form” in (15) below:
(15)
It is typical, rather than writing out the “set” notation, to provide syntactic category labels
like Noun Phrase (NP) and Verb Phrase (VP), which I will adopt for future trees in this chapter,
rendering Subject-Verb-Object (SVO) sentence structures, like the one schematized in (15), as in (16)
26
Similarly, the phrase ‘ate cake’ can be moved around as a unit inside of a bigger sentence, whereas ‘John ate’ cannot easily
do so. For example, consider the pairings below:
(iii) Ate/eat cake, I know that John did.
(iv) John ate/eat, I know that (did) cake.
Displacing ‘ate cake’ (which might change to ‘eat cake’ due to complications arising from the insertion of ‘did’) to the start of this
sentence is possible, albeit a little bit “stuffy” sounding to my ear. However, trying to do something similar with ‘John ate’
again yields essentially uninterpretable word salad.
Finally, we can consider trying to conjoin either (a), the verb and the object and another verb and object, excluding the
subject, or (b), the reverse, namely the subject and the verb with another subject and verb, excluding the object:
(v) John ate cake and drank milk.
(vi) John ate and Mary baked cake.
(v) is a relatively normal sentence. (vi) on the other hand, while perhaps possible, is quite “marked”. Such sentences have
been studied extensively, under the heading of “right node raising” (Postal 1974), but the general intuition is that there is
something very “special” and “atypical” about these sentences, in such a way that it suggests an extra operation of some
sort has been performed. Whatever the correct analysis, facts like the asymmetries displayed in (i)-(vi) have generally caused
generative linguists to hold that verbs and objects form a unit to the exclusion of the subject.
26
I am simplifying the phrases labeled and omitting non-crucial levels of structure for the sake of simplicity; in this
dissertation, we will only end up caring about the syntactic relationship (whether there is c-command or not) between
different constituents, so labels of nodes and intermediate levels of structure will not end up making a significant difference.
(16)
Postal (1971)’s claim that the “object” wh-element occupies the same syntactic position as a
regular object thus constitutes the claim that sentences like (14) have a structural representation that
includes something like {he, {trust, who}}, as below:
(17)
How exactly ‘who’ comes to appear away from ‘trusts’ when the sentence is externalized has
long been subject to various competing theories. One option, pursued in Chomsky 1981 and most
subsequent works in the “Government and Binding” tradition, is that ‘who’ is displaced to a higher
position in the structure, leaving behind an element like a silent “trace” (symbolized via a t with a
shared numerical index with the moved element) to fill the position it has moved away from,
yielding something roughly like (18):
(18)
Alternatively, in the “Minimalist” program launched by Chomsky (1993), ‘who’ may in fact
exist in two places in the structure at once, in the form of two copies, only one of which is
pronounced:
(19)
This latter process can be understood as a special type of Merge, namely “re-merge”,
whereby an already merged element, in this case ‘who’, is merged again into the structure, thus
yielding the set representation {who, {he, {trust, who}}}; the element ‘who’ here appears multiple
times in the structure, such that the maximal set contains multiple instances of ‘who’.
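To make the “two copies” idea concrete, the sketch below builds the set representation just given and locates each instance of ‘who’ by its depth of embedding. The `frozenset` encoding and the helper name `depths` are my own illustrative assumptions, and labels and other structure are omitted.

```python
def merge(x, y):
    return frozenset([x, y])


def depths(node, target, depth=0):
    """Return the sorted embedding depths at which target occurs inside node."""
    if node == target:
        return [depth]
    if isinstance(node, frozenset):
        return sorted(d for child in node for d in depths(child, target, depth + 1))
    return []


vp = merge("trust", "who")   # {trust, who}
tp = merge("he", vp)         # {he, {trust, who}}, as in (17)
cp = merge("who", tp)        # re-merge: {who, {he, {trust, who}}}, as in (19)

print(depths(tp, "who"))   # [2]     one instance, in object position
print(depths(cp, "who"))   # [1, 3]  a high copy and a low copy
```

The atom ‘who’ is a single object in this encoding; what re-merge adds is a second position at which that same object is contained.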
Though the above two theories have been historically the most popular in Chomskyan
linguistics (to say nothing of views in other similar traditions, e.g., Head-driven Phrase Structure
Grammar), there is a relevant minority view that does not take the movement itself to be syntactic.
Such a view was argued for with respect to a limited set of phenomena by Aoun and Benmamoun (1998) and as a
much broader account of such displacement in Sauerland and Elbourne 2002. Such an account is
also essentially assumed in Hoji 2015 and subsequent works. This view holds that displacement of
this sort is entirely the result of the (language-specific) externalization process and is not represented
in the structure at all. Under such an approach, there are neither multiple instances of ‘who’ nor
anything like a trace, but rather, the rough structure in (17) is essentially complete in terms of the
structural representation
27
.
These differing accounts will have minor impacts on the way certain configurations are
classified in this dissertation, and I will comment on them when relevant. In general, however, they
will be interchangeable for our purposes. I will essentially assume this last approach, where
displacement is an entirely “external” phenomenon, just for the reason that it makes discussing the
representations in question a bit simpler. As I will show, however, assuming one of the other
theories would not cause any interpretative problems for the discussion of the results of my
investigations.
What all these approaches have in common is that they allow for what Chomsky (1977)
terms “reconstruction effects”, where displaced elements like ‘who’ in (14) optionally or obligatorily
behave as if they are in their low-down structural position of origin. This analytic framework
provides the motivation for Postal (1971)’s use of the term “crossover”, as the wh-element starts in
the object position, usually thought of as being on the right of the subject, and then “crosses over” it
to land at the far left of the sentence. Postal goes on to suggest that in such a sentence S with such a
configuration, Coref(S, Subject, Object) is impossible.
2.3.3 Strong Crossover and C-command
We can note that, by the definition of c-command given in (13) and the rough structure
given in (15), subjects c-command objects and not vice versa. One analysis of this “strong
crossover” effect might be to take X c-commanding Y in S as a pre-condition for MR’s like Coref(S,
27
Hoji does allow for some apparent instances of displacement of this kind to in fact be cases of base-generation, e.g.,
what Ueyama 1998 calls “Deep OS”, which we will return to later. As will be discussed, this would have interpretative
consequences for the results of the experiments performed, but similar to the consequences of the different movement-
based accounts, such consequences would not be problematic for the purposes of demonstrating the intended point(s).
X, Y) and BVA(S, X, Y). This is the position taken by the “binding theory” developed based on
Reinhart (1976 and 1983) and Chomsky (1981), albeit with certain additional nuances, some of
which will be discussed shortly.
Following this account, in sentences like (14), Coref(S, who, he) is impossible because ‘who’
does not underlyingly c-command ‘he’ in the syntactic structure, even though it appears before it in
the externalized sentence
28
. Indeed, we can note that the effect obtains even if X is just a “regular”
object, not a moved wh-one, as in (20), where most individuals are likely to reject Coref(S, John, he):
(20) He trusts John.
Accepting such an interpretation would mean allowing the sentence to be interpreted as
meaning something like “John trusts himself”, which is not generally reported to be possible
29
.
This follows quite directly from the hypothesis that X must c-command Y in S for Coref(S, X, Y)
30
28
This is one area where the “displacement in externalization” analysis is much simpler. If we hold that ‘who’ moves in
the syntax itself, then ‘who’ does, in fact, c-command ‘he’, from its high-up position. This can of course be rectified in a
number of ways, such as requiring all copies of ‘who’ to c-command ‘he’, requiring ‘he’ not to c-command ‘who’s trace,
etc., but these are extra complications.
29
As we will be seeing throughout, statements about what is impossible are often something of a simplification. It is
possible, in rare contexts, for the sentence to have such an interpretation. These cases, however, likely arise in a special
way, having to do with B(S, X, Y) factors mentioned in the previous section. See Hoji (2022b) for more discussion about
cases along such lines.
30
An alternative c-command-based constraint would be, along the same lines as Chomsky 1981’s “Binding Condition C”,
to say that, under the assumption that Y is a pronoun and X is a name or other definite expression, that Y cannot c-
command X. For this particular type of sentence, that constraint will do just as well as the X c-commanding Y requirement,
and will suffer from the same problems as well. Once we look at weak-crossover, however, the two accounts will make
different predictions; namely, if all that is required is that Y not c-command X, then a sentence like (i) should be fine with
a Coref reading, whereas if X needs to c-command Y, it should not:
(i) His parents trust John.
As we will be discussing, it has frequently been assumed that (i) is uncontroversially acceptable with the Coref(S, John, his)
reading, but experimental evidence from experiments such as those discussed in Section 3.2 does not support this claim.
We will see that the situation is, in fact, more complex than either of these two “pure c-command” accounts make it out to
be, but the evidence to be presented does, in my view, favor the requirement for X to c-command Y, rather than simply
to be accepted and the asymmetry between subjects and objects, where ‘he’ c-commands ‘John’ but
not vice versa, as below:
(21)
Despite this apparent success of this c-command-based theory of Coref, there is a
problematic complication, namely that flipping the subjects and objects of sentences like (14) and
(20) to make (22)a and (22)b respectively does not make Coref much easier to accept:
(22) a. Who trusts him?
b. John trusts him.
(23) a.
the need to avoid Y c-commanding X.
b.
The sentences in (22) resist a Coref(S, who, him) and Coref(S, John, him) reading,
respectively. That is, (22)a is not usually able to be interpreted as meaning “who trusts himself” and
(22)b is not usually able to be interpreted as meaning “John trusts himself”. As can be seen from
(23), however, the c-command account will not help us explain this; ‘who’ and ‘John’ do in fact c-
command ‘him’ in these sentences, so there is not predicted to be any sort of “crossover” effect.
As formalized in Chomsky (1981)’s “Binding Condition B”, it turns out there is a general
“anti-local” condition on MR’s like Coref (BVA too, for that matter), at least when involving
pronoun elements like ‘him’, whereby X and Y of Coref(S, X, Y) must be separated by certain
structural interveners, such as a clause boundary or a noun phrase boundary. While this issue does
not serve as evidence against a c-command-based condition on Coref, it does make the use of
“strong crossover” as evidence for this c-command condition fairly weak
31
. That is, because
switching to a minimally different sentence where X does c-command Y does not enable Coref(S, X,
31
One approach not pursued here is to continue with strong crossover but make use of clause boundaries. See the
contrasts between the acceptability of Coref(S, who/John, him) in (ia) and (iia), where ‘who’/‘John’ c-command ‘him’, and
in (ib) and (iib), where they do not:
(i) a. Who says Mary trusts him?
b. Who does he say Mary trusts?
(ii) a. John says Mary trusts him.
b. He says Mary trusts John.
Y), there is not much reason to believe that the lack of such c-command is what made Coref
impossible in strong crossover configurations in the first place.
2.3.4 Weak Crossover
Not all is lost, however. What Wasow (1972) terms “weak crossover” avoids the locality
confound, albeit at the cost of slightly more intricacy. These are instances where the element being
“crossed over” contains Y of MR(S, X, Y), rather than simply being Y. That is, Y is embedded
in something like an NP, such as ‘his’ in ‘his parents’ in (24):
(24) a. Who do his parents trust?
b. Underlyingly:
In such a sentence, the wh-word originates as the object, and thus does not c-command the
pronoun ‘his’, which is contained within the subject. Additionally, ‘his’ is separated from ‘who’ by a
noun phrase boundary, so there are no concerns with ‘anti-locality’ here. The Coref(S, who, his)
reading is something like “who is trusted by his own parents?” It is generally the case that English
speakers find this interpretation to be unacceptable with the sentence in (24)a, though this is subject
to a fair degree of variation, hence the label “weak” crossover. The proper analysis of this variation
is, in fact, one of the main concerns of this dissertation, but let us put it aside for now and just
assume that we have *Coref(S, X, Y) for such sentences. The question is whether, when we switch
to a minimally different sentence where X does c-command Y, Coref(S, X, Y) becomes possible.
Consider, for example, the minimally switched version of (24):
(25) a. Who trusts his parents?
b. Underlyingly:
In (25), ‘who’ now occupies the subject position, and is sister to the VP ‘trusts his parents’.
By the definition given in (13), ‘who’ thus c-commands not only the object ‘his parents’, but also just
‘his’. Under the Reinhartian account, which holds that Coref(S, X, Y) requires X to c-command Y in
S, we then expect that Coref(S, X, Y) might be possible in sentences like (25), even though it was
not possible in minimally different sentences like (24). As it turns out, most English speakers find
(25) uncontroversially acceptable with this reading, namely that the sentence means something like
“who trusts his own parents?” and indeed, that is perhaps the most plausible reading of the sentence
when interpreted without a context. This contrast is precisely in keeping with Reinhart’s predictions;
when X c-commands Y in S, Coref(S, X, Y) is possible, and when X does not c-command Y in S,
Coref(S, X, Y) is impossible. As such, weak crossover, unlike strong crossover, provides clear
evidence for the role of c-command in constraining the availability of MR’s.
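The structural contrast between (24) and (25) can be verified with the same kind of sketch used for the basic relations. The simplified structures below omit category labels and assume, as I do in this chapter, that the wh-word is represented only in its underlying position; the `frozenset` encoding and the helper names are illustrative assumptions.

```python
def merge(x, y):
    return frozenset([x, y])


def constituents(node):
    """Yield a node and every constituent inside it."""
    yield node
    if isinstance(node, frozenset):
        for d in node:
            yield from constituents(d)


def dominates(x, y):
    """X contains Y: Y is X's daughter, or a daughter of X contains Y."""
    if not isinstance(x, frozenset):
        return False
    return any(d == y or dominates(d, y) for d in x)


def c_commands(x, y, root):
    """(13): X c-commands Y if X is sister to some Z which contains Y."""
    return any(
        isinstance(c, frozenset) and x in c
        and any(z != x and dominates(z, y) for z in c)
        for c in constituents(root)
    )


his_parents = merge("his", "parents")

# (24) "Who do his parents trust?": 'who' is the object.
s24 = merge(his_parents, merge("trust", "who"))
# (25) "Who trusts his parents?": 'who' is the subject.
s25 = merge("who", merge("trusts", his_parents))

print(c_commands("who", "his", s24))   # False
print(c_commands("who", "his", s25))   # True
```

Replacing ‘who’ with ‘John’ or ‘every boy’ leaves both results unchanged, since only the positions of the elements matter to the computation.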
We can also note that, as predicted, this contrast is not specific to any actual “crossing over”
and obtains without the use of a wh-word. Consider the following two pairs, the first with Coref(S,
John, his) and the second with BVA(S, every boy, his):
(26) a. John trusts his parents.
b. His parents trust John.
(27) a. Every boy trusts his parents.
b. His parents trust every boy.
(26)a and (27)a have the same structure as given in (25), just with ‘who’ replaced with ‘John’
and ‘every boy’ respectively. Likewise, (26)b and (27)b have the same structure as (24). As such, in
the (a) cases, ‘John’/‘every boy’ c-command ‘his’, but not in the (b) cases. The (a) cases are
uncontroversially acceptable with Coref(S, John, his) and BVA(S, every boy, his) respectively. That
is, ‘his parents’ can be interpreted as John’s parents or the parents of each boy respectively. The
question is whether the (b) cases yield *Coref(S, John, his), and *BVA(S, every boy, his), that is,
whether the (b) sentences cannot have such readings.
At this point, readers will likely have diverging opinions about whether this is the case or
not. While Reinhart (1983) discusses many cases where Coref(S, X, Y) does indeed seem to require
X to c-command Y in S, Reinhart 1983 distinguishes Coref and BVA, recognizing that Coref is far
less “well-behaved” in that regard than BVA (see esp. Chapter 7 of that work). This has led to an
assumption one sees from time to time in the literature that pronouns like ‘his’ do not make
reference to c-command. As will be shown in Section 3.2, this assumption seems to be empirically
incorrect; many individuals do indeed behave as if they have such a requirement, and find Coref(S,
John, his) impossible for sentences like (26)b. However, as will be discussed throughout this
dissertation, it is clear that there is an asymmetry between Coref and BVA, with BVA being much
more likely in general to follow a Reinhartian pattern than Coref.
The contrast in the availability of BVA between (27)a and (27)b is much more
uncontroversially recognized in the literature[32]. As noted above and to be seen throughout this
dissertation, it is still the case that quite a few people do not find there to be such a contrast, although
these individuals seem to be the minority. Indeed, as we will be seeing in the next section, there are
also independent problems with such pairings having to do with confounds coming from word
order, though Reinhart’s original analysis provides solutions to deal with such problems. All in all,
we can say that, if (I) all confounding issues are dealt with, and (II) the minority group of “weak
crossover accepters” is somehow accounted for, then the contrast between sentences analogous to
(27)a and (27)b would be a strong piece of evidence in favor of the Reinhartian account, and thus, in
favor of the role of structure (via c-command) in constraining the mapping between form and
meaning. Resolving issues (I) and (II) is, in fact, the primary motivation for the methodology to be
adopted, as will be seen below.
2.3.5 Parallels in DR
Before moving on to explore such issues however, let us note a similar situation in yet
another MR. This MR, which Hoji (2019 and elsewhere) calls DR (standing for “distributive
reading”) occurs in sentences like “two boys greeted three girls (each)”. Especially with the addition
of ‘each’, the sentence can be interpreted such that the three girls greeted are distinct for each boy,
e.g., John greeted Mary, Sarah, and Jane, whereas Bill greeted Susan, Elizabeth, and Helen. This
sense of “multiplying” or “distributing” different sets of ‘three girls’ across each member of ‘two
boys’ is what is classified as DR(S, two boys, three girls).
[32] As an anecdote attesting to the widespread belief in this contrast, Barker 2012, the main thrust of which is the claim that
Reinhart’s analysis is totally wrong and that BVA has no c-command requirements, holds up such weak crossover BVA
examples as the sole piece of clear evidence in favor of a Reinhartian approach.
Much like Coref(S, X, Y) and BVA(S, X, Y), DR(S, X, Y) has long been argued to require X
to c-command Y in S. As early as Chomsky 1957, long before c-command was formalized, it was
nevertheless argued that such readings were structurally constrained, as exemplified by the contrast
between (28) and (29):
(28) Everyone in the room knows at least two languages.
(29) At least two languages are known by everyone in the room.
Chomsky 1957 reports that when S=(28), DR(S, everyone in the room, at least two
languages) is possible, i.e., that (28) could be understood to refer to a situation where each person
knows two languages, which do not overlap with the two languages each other person knows[33], e.g.,
John knows English and Spanish, Mary knows Korean and Chinese, etc. On the other hand,
Chomsky reports that when S=(29), this is impossible, *DR(S, everyone in the room, at least two
languages), i.e., everyone must know the same two languages. For example, John knows English and
Spanish, Mary knows English and Spanish, Kate knows English and Spanish, etc.
Though, as noted above, “c-command” had not yet been formalized when it was first
observed, this pattern is nevertheless consistent with the requirement that DR(S, X, Y) requires X to
c-command Y. In (28), ‘everyone in the room’ is the subject of the sentence, and thus, by the basic
structure given in (16), c-commands the object, ‘at least two languages’. Thus, X of DR(S, X, Y)
c-commands Y in (28). The hypothetical c-command-based requirement is thus met, allowing for
DR(S, X, Y) in such sentences.
[33] It could also be something with partial overlap, e.g., John knows English and Spanish, Mary knows English and Korean,
etc. The point is that there need not be any overlap.
On the other hand, in (29), it is ‘at least two languages’ that is the subject. There is no object
per se, but there is a post-verbal prepositional phrase ‘by everyone in the room’. The structure of
such expressions is typically hypothesized[34] to be roughly[35] something like (30) below. The key detail
is that the subject c-commands the prepositional phrase, and the “agent” inside of it, that is, the NP
that expresses the “doer” of the verb in question, and not vice versa:
[34] As in footnote 25, we can deploy various constituency tests to hypothesize this structure. Using parallels to those tests
used for active voice sentences, we can see that these “agents” of passive voice sentences pattern quite like objects in terms
of forming a unit with the verb:
(i) Mary was praised by Bill and John was too.
(ii) Praised by Bill, I understand that John was.
(iii) John was praised by Bill and mentioned by Susan.
From (i), we can see that ‘praised by Bill’ can go “missing” together, being simply implied by ‘was’, somewhat like how
Verb+Object could be replaced by “did so” in active sentences. Further, though (ii) may strike some ears as awkward or
dialectal, it is still clearly English, unlike the garbled (iv):
(iv) John was praised, I understand that by Bill.
(iv) simply does not express the meaning “I understand John was praised by Bill”, whereas (ii) plausibly can.
Additionally, (iii) is clearly a rather “straightforward” sentence of English, demonstrating the ease with which
verb+prepositional phrase can be conjoined with other such units. Overall, we have seen in these tests evidence that the
verb+prepositional phrase act as a unit to the exclusion of the subject, whereas from our previous tests (and (iv)) we
have seen that the verb and the subject do not seem to like to form a unit to the exclusion of the rest of the predicate. This
motivates the structural hypothesis given in the text.
[35] As before, the exact details of labels and intermediate phrases are not crucial for our purposes. Following generative
traditions, I have shown the PP merging with a VP, rather than a V directly. This distinguishes such phrases from
prepositional objects, such as ‘to the store’ in ‘I went to the store’. Whether or not this is the correct analysis for by-PP’s
in passives will not be relevant; all that matters is that they clearly form some sort of unit with the verb to the exclusion of
the subject, and that the subject is merged with the structure in such a way that it c-commands the agent.
(30)
Given the structure in (30), the inability to have DR(S, everyone in the room, at least two
languages) in (29) is expected under the hypothesis that DR(S, X, Y) requires X to c-command Y in
S; in (29), ‘everyone in the room’ is the agent, and ‘at least two languages’ is the subject, and as such,
the former does not c-command the latter. (28) and (29) thus provide a minimal pair of sentences,
where ‘everyone in the room’ and ‘at least two languages’ have the same theta role in each, that is,
‘everyone in the room’ is the know-er and ‘at least two languages’ is what is known. If we accept
Chomsky 1957’s judgements, despite this apparent semantic similarity, the sentences have divergent
interpretative potentials based on their structure, where DR is possible in the first but not in the
second, specifically because of c-command-based considerations.
In an early instance of reported variation in judgements, however, Katz and Postal (1964)
disagreed with Chomsky (1957)’s judgements, maintaining that DR(S, everyone in the room, at least
two languages) was possible for S=(29). It is unclear whether Chomsky came to agree with this
particular judgement, but certainly, subsequent works of his and other generative authors do not
maintain that DR(S, X, Y) requires overt c-command.
Despite this concession, the link between DR and c-command was not abandoned. Most
famously, May (1977)’s popular “quantifier raising” (QR) proposal holds that DR(S, X, Y) does
indeed require X to c-command Y in S, and that apparent counterexamples derive from changes to
the syntactic structure that are not reflected in the surface word order. Specifically, these changes
involve movement of quantificational elements like ‘everyone in the room’ to a higher position in
the structure, somewhat like ‘who’ has been argued to overtly move. From this higher position, the
moved element c-commands Y of DR(S, X, Y), ‘at least two languages’, even though it does not
appear to. The rough underlying structure of (29) would be something like what is given below:
(31)
QR and related phenomena are treated more extensively later in Section 2.6, but the
Chomsky-Katz-Postal debate, and its proposed resolutions, exemplify a common theme to be
discussed in the following sections: despite c-command-based generalizations’ apparent explanatory
power, they are riddled with apparent exceptions, which have given rise to either more complex
proposals as to the nature of syntactic structure (like QR) or to attacks from competing theories,
which argue that they are not structurally governed at all.
2.4 The Role of Linear Precedence
2.4.1 The Confound of Word Order
One challenge to c-command-based theories of MR’s comes from precedence-based
theories. Such theories, generally with regard to MR’s like Coref and BVA, in fact predate
discussions of c-command, given that the latter was not yet formulated, though the basic idea of
syntactic structure was still explored (see such early examples as Langacker (1969), Jackendoff
(1972), and Lasnik (1976)). In the vein of such theories, Chomsky (1976) initially proposed what has
been called the “leftness condition” (a name given by Higginbotham 1980), namely:
(32) Leftness Condition:
A variable cannot be an antecedent of a pronoun on its left.
The notion of “variable” here is a bit of a historical detour, but to briefly summarize,
Chomsky was concerned with sentences involving certain quantificational phrases like ‘someone’,
which were to be treated as “variables”; for example, Chomsky gives the below:
(33) The woman he loved betrayed someone.
Chomsky notes that, at least in his judgement[36], Coref(S, someone, he) is impossible for
the sentence; that is, it cannot mean something like, “there is some person X such that the woman
loved by X betrayed X”[37]. Chomsky’s leftness condition explains this inability by the fact that
‘someone’ comes after, rather than before, ‘he’.
[36] Again, as in so many of these cases, there is variation in judgement depending on the individual in question.
We can immediately note, however, an alternative account based in c-command. Recalling
the structure for simple SVO sentences given in (16), (33) ought to have the rough structure given
below:
(34)
In addition to not preceding ‘he’, ‘someone’ also does not c-command it. It is thus
ambiguous whether surface word order or underlying structure is what is preventing Coref in this
case. We can note a similar ambiguity in our weak-crossover “evidence” for the role of c-command
in constraining BVA; repeating (27) below as (35):
(35) a. Every boy trusts his parents.
b. His parents trust every boy.
[37] Note the usefulness of “X” in describing this meaning, hence (part of) the reason that Chomsky and others wanted to
understand such words as “variables”.
In (35)a, ‘every boy’ c-commands ‘his’, by virtue of the former being the subject and the
latter being inside the object. In (35)b, the positions are reversed, and as such, ‘every boy’ no longer
c-commands ‘his’. As reported, BVA(S, every boy, his) is felt by many to be possible in (35)a and
impossible in (35)b, potentially evidence that the availability of BVA depends on a certain syntactic
configuration. At the same time, however, ‘every boy’ precedes ‘his’ in (35)a but not in (35)b, so this
pair equally could be said to be evidence that the availability of BVA depends on which
word/phrase comes first, and not on anything structural per se.
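The confound can be made concrete with a small computational sketch. The Python below is only an illustration under simplifying assumptions: the bracketings for (35)a and (35)b are rough binary-branching stand-ins for the structures discussed above, c-command is taken to mean that a sister of X contains Y, and the function names (leaves, contains, c_commands, precedes) are hypothetical helpers introduced here, not part of any cited work.

```python
# Trees are nested tuples; leaves are words. The bracketings are simplified
# assumptions, not the full structures hypothesized in the text.
EVERY_BOY = ("every", "boy")
S35A = (EVERY_BOY, ("trusts", ("his", "parents")))   # (35)a
S35B = (("his", "parents"), ("trust", EVERY_BOY))    # (35)b


def leaves(tree):
    """Return the words of a tree in surface order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree for word in leaves(child)]


def contains(tree, target):
    """True if target is the tree itself or occurs somewhere inside it."""
    if tree == target:
        return True
    if isinstance(tree, str):
        return False
    return any(contains(child, target) for child in tree)


def c_commands(tree, x, y):
    """True if some sister of x contains y (assumed sister-based c-command)."""
    if isinstance(tree, str):
        return False
    for i, child in enumerate(tree):
        if child == x:
            sisters = tree[:i] + tree[i + 1:]
            if any(contains(sister, y) for sister in sisters):
                return True
        if c_commands(child, x, y):
            return True
    return False


def precedes(tree, x, y):
    """True if the first word of x comes before the first word of y."""
    words = leaves(tree)
    return words.index(leaves(x)[0]) < words.index(leaves(y)[0])


for name, sentence in (("(35)a", S35A), ("(35)b", S35B)):
    print(name,
          "c-command:", c_commands(sentence, EVERY_BOY, "his"),
          "precedence:", precedes(sentence, EVERY_BOY, "his"))
```

Under these assumptions, both relations hold between ‘every boy’ and ‘his’ in (35)a and both fail in (35)b, so the pair by itself cannot tell the two accounts apart.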
2.4.2 Reinhart’s Response
As hinted at in the previous section, Reinhart (1983) shows many counterexamples to the
claim that MR’s like BVA and Coref are predicated on a specific precedence relationship. These are
frequently achieved by making use of the previously mentioned “reconstruction” effects, often
regarding “topicalized” phrases; that is, phrases that have apparently moved from their normal
position to the front of a sentence (much as do wh-phrases in English), yet can still be understood
as if they occupy the base positions. Consider, for example, one of Reinhart’s examples:
(36) a. Zelda spent her sweetest hours in her bed.
b. In her bed, Zelda spent her sweetest hours.
c.
Combining the structures given in (16) and (30)[38], we can see that (36)a ought to have
roughly the structure given in (36)c; most crucially, the subject ought to c-command the PP ‘in her
bed’, and thus, the word ‘her’. Coref(S, Zelda, her) is uncontroversially acceptable in (36)a, but this
does not resolve our issue, given that ‘Zelda’ both c-commands and precedes ‘her’. Under
reconstruction accounts, however, we hypothesize that the “topicalized” PP ‘in her bed’ in (36)b in
fact occupies the same position in the structure as shown in (36)c, that is, that (36)a and (36)b are
structurally identical[39]. In that case, (36)b gives us an ideal test case, where X c-commands Y but
does not precede it. A c-command-based theory would predict Coref(S, Zelda, her) to be potentially
acceptable under such circumstances, whereas a precedence-based theory would not.
[38] At least assuming that PP’s with ‘by’ in passives have a similar enough syntactic position to “regular” post-verbal
PP’s. Repeating the sort of constituency tests discussed in previous footnotes will suggest this is true:
(i) Zelda spent time in her bed and Mary [did too].
(ii) It is [spending time in her bed] that Zelda likes.
(iii) Zelda [spent time in her bed] and [threw coins out the window].
[39] Or the latter is like the former except for movement, though then we need to add a formal “reconstruction” operation
to account for why it is interpreted in the lower position rather than the higher position. These two types of accounts,
post-syntactic displacement or pure syntactic movement, do not always make the same predictions, depending on how the
latter treats such movement “chains”. For the purposes of this particular section though, the relevant predictions should
be identical, at least under most standard approaches.
As it turns out, Coref(S, Zelda, her) is widely judged to be acceptable in (36)b. We can use an
identical strategy to assuage precedence-based concerns about the pairing like that in (35), by
topicalizing the phrase containing Y of BVA(S, X, Y), e.g., ‘his parents’ in (35)a so as to make both
sentences lack precedence of Y by X. This yields:
(37) a. His parents, every boy trusts.
b. His parents trust every boy.
Here, the sentences are matched in terms of ‘every boy’ not preceding ‘his’, while, under
reconstruction-style hypotheses, in (37)a ‘every boy’ c-commands ‘his’, and it does not do so in
(37)b. The prediction under the Reinhartian account is that the former sentence should allow a
BVA(S, every boy, his) reading and the latter should not; despite quite a lot of variation as before,
the most common judgement pattern does seem to be that such a reading is possible in sentences
like (37)a and not in sentences like (37)b, demonstrating that the observed effect cannot be
reduced to an issue of precedence. Indeed, Chomsky’s original sentence of concern, (33) above, can be
addressed succinctly by using this very strategy:
(38) a. Someone was betrayed by the woman he loved.
b. By the woman he loved, someone was betrayed.
To the extent that we can accept Coref(S, someone, he) in (38)a, we can probably also accept
it in (38)b. It therefore does not seem possible to reduce purported c-command effects to the effects
of precedence[40].
[40] Though see Bruening (2014), who argues that a precedence-based constraint is still detectable if certain intermediate
stages in the derivation are considered. This, however, expands the meaning of “precedence” considerably beyond the
mere linear word order of the final utterance “output”. In that sense, a similar approach is taken by Barker (2012), who
argues for precedence to be evaluated at the “reconstructed position”, i.e., where it would be if it had not been displaced
to the sentence-initial position. Again, this expands the notion of precedence considerably, essentially incorporating
syntax into the meaning of precedence.
Based on the findings in this dissertation and the previous work to be discussed in Section 3.2, it is not clear we need these
more complex articulations of precedence to capture the data at hand, and indeed, it seems that the simpler account of
precedence may actually fare a bit better. Nevertheless, both Barker and Bruening consider many more sentence types
than are considered in this dissertation and the other works to be summarized, so such views cannot simply be dismissed
out of hand at this point.
2.4.3 C-Command and Precedence
It would be theoretically tidy if the above was the “end of the story” for precedence-based
accounts. All we have demonstrated, however, is that BVA(S, X, Y) does not require X to precede Y
in S if X c-commands Y in S. What about the reverse? Namely, if X precedes Y in S but does not
c-command it in S, is BVA(S, X, Y) still in principle possible?
The answer appears to be “yes”. An early-observed potential example can be found via a
phenomenon identified by Geach (1962), who noted what came to be called “donkey anaphora”.
These occur in sentences like the now classic:
(39) Every farmer who owns a donkey beats it.
Such sentences are noted for their relatively confusing quantificational structure. The most
normal reading of (39) seems to be something like, “every donkey who is owned by a farmer is
beaten by said farmer”, but nowhere does ‘donkey’ overtly receive a quantifier like ‘every’. It is
challenging, then, to understand how ‘it’ essentially enters into a somewhat BVA-like MR, that yields
a reading very like BVA(S, every donkey, it); since there is no ‘every donkey’, perhaps this MR is
more accurately MR(S, a donkey, it), or something even more complex.
I will not endeavor to give a proper analysis of how such readings come about here, but we
can nevertheless note several points of interest. Structurally, the phrase ‘a donkey’ is contained
within the relative clause ‘who owns a donkey’, itself contained within the subject ‘every farmer who
owns a donkey’. There thus cannot be any straightforward account given under c-command
relations between ‘a donkey’ and ‘it’; they do not remotely c-command one another. However, ‘a
donkey’ does precede ‘it’. Consider what happens if we reverse the positions of the two:
(40) Every farmer who owns it beats a donkey.
I suspect the consensus judgement will be that the MR(S, a donkey, it) that can obtain in (39)
cannot obtain here; that is, (40) cannot mean “every donkey who is owned by a farmer is beaten by
said farmer”. If so, then it seems that what was permitting this MR in (39) was something to do with
‘a donkey’ preceding ‘it’. One may of course object, noting that ‘a donkey’ and ‘it’ have now changed
places with regard to ‘every farmer’ as well. Perhaps the MR that obtains is more properly analyzed
as obtaining due to a certain structural relationship between ‘every farmer’/ ‘every farmer who owns
a donkey’ and ‘it’. In (39) ‘every farmer who owns a donkey’ is the subject, and ‘it’ is the object, so
the former straightforwardly c-commands the latter, whereas in (40), ‘it’ is actually a part of the
subject, ‘every farmer who owns it’, and c-command relations are thus somewhat murkier.
Fortunately, we can resolve this issue via Reinhart’s topicalization strategy. One minor issue
is that ‘it’ tends not to be easily acceptable with topicalization in English, so we probably cannot
meaningfully obtain results by topicalizing (39) into (41):
(41) It, every farmer who owns a donkey beats.
Nevertheless, we can do something very similar if we simply replace ‘it’ with ‘that donkey’:
(42) a. Every farmer who owns a donkey beats that donkey.
b. That donkey, every farmer who owns a donkey beats.
(42)a should be uncontroversially acceptable with the MR(S, a donkey, that donkey) reading;
that is, where multiple donkeys are being beaten by multiple farmers. We understand (42)b to have
the same basic underlying structure as (42)a, by the same hypotheses about topicalization and
reconstruction we have seen above. However, to my ear, and I expect to many others’, (42)b lacks
the MR(S, a donkey, that donkey) reading; rather, it must mean something like, “there is a specific
(and extremely unfortunate) donkey who is beaten by all farmers who own donkeys”. That is, the
donkey is a sort of “scapegoat” (scape-donkey?) for all the other donkeys. Such a contrast supports a
precedence-based account of MR’s like BVA; structure remains constant, but changes in precedence
affect the availability of MR’s.
Ueyama (1998) compiles various examples of both types of the cases discussed, where Coref,
BVA, and other MR(S, X, Y) obtain when X c-commands Y but does not precede Y, and vice versa,
when X precedes Y but doesn’t c-command it. Her basic conclusion is that MR’s like BVA are
convergently similar surface manifestations of multiple different underlying mechanisms, one of
which, formal dependency (FD), relies on c-command and one of which, indexical dependency (ID),
relies on precedence. If this is the case, then we cannot have a “pure c-command” theory of such
MR’s but must acknowledge the potential for multiple sources.
This alone is not a fatal blow to the Reinhartian approach; c-command still plays an
important and clearly detectable role in constraining the availability of MR’s like BVA.
Unfortunately, precedence-based complications are only the tip of the iceberg; there are several
other issues that plague c-command accounts as well.
2.5 Possessor-Binding and Beyond
2.5.1 Introduction to Possessor Binding
The first of the above-mentioned issues centers on various configurations which appear to
lack c-command and yet nevertheless seem to reliably allow for BVA (as well as DR and Coref)
readings for at least some individuals. A classic instance of this is so-called “possessor-binding” or
“spec(ifier)-binding”[41], noted as early as Higginbotham 1980, for sentences of the following sort:
(43) Every author’s mother loves his books.
The subject of (43) is the possessive ‘every author’s mother’. The exact structure of a
possessive is a rather complex matter[42], but for our purposes, we can understand the merge-path of
sentences like (43) to be something roughly like what is displayed in (44).
[41] This latter term, I believe, originates from Reinhart 1987.
[42] It is easy to argue, though, that the possessor does form a constituent with what it possesses. Besides the clear semantic
unit that they form, we can provide further evidence using the tests deployed in previous footnotes, e.g., replacing it with
something, moving it around as a unit, and coordinating it with analogous units:
(i) (Speaking of every author’s mother), [she] loves his books.
(ii) It is [every author’s mother] that loves his books.
(iii) [Every author’s mother] and [his wife] love his books.
We will not be able to do anything like this for the string ‘mother loves his books’; the results are truly bizarre, in most
cases effectively gibberish (Note that whether we assume the ‘s to be part of ‘every author’s’ or part of ‘’s mother loves
his books’ does not impact whether or not these come out as gibberish or not.):
(iv) Every author’s [she].
(v) It is [mother loves his books] that every author’s.
(vi) Every author’s [mother loves his books] and [father hates his books].
The “common sense” conclusion here is that possessors and what they possess form their own syntactic units, as we
have been in fact assuming throughout. It is not the case that the possessor somehow merges with the sentence in a
position higher than the rest of the subject; it is part of the subject.
(44)
As such, the possessor in the subject, e.g., ‘every author’ in (43) does not c-command the
possessor in the object, e.g., ‘his’ in (43). Nevertheless, readings like BVA(S, every author, his) are
frequently accepted for such sentences. That is, many feel that (43) can mean something like “for
each author X, X’s mother likes X’s books”, so author A’s mother likes author A’s books, author B’s
mother author B’s books, etc.
The relevance of such constructions for c-command-based theories of BVA was noted even
in Reinhart 1983, which is to say, essentially as long as c-command-based theories of BVA have
been advanced, as they appear to provide a relatively straightforward counterexample to the claim
that BVA(S, X, Y) requires X to c-command Y in S. Unfortunately, discussions of these
constructions have often ignored the potential confound of linear word order[43]; while ‘every author’
does not c-command ‘his’ in (43), it does precede it. By what we have just seen in the previous
section, the acceptability of BVA(S, every author, his) is therefore not necessarily remarkable. The
better test case would be the topicalized version:
(45) His books, every author’s mother loves.
[43] One potential exception is Buring (2004), who links such cases and donkey-anaphora, which, as previously discussed, I
understand to be precedence-based rather than c-command-based. Buring does not seem to share this understanding, but
nevertheless, we are both identifying the two phenomena to be linked in some way and different from “standard” BVA.
Indeed, Reinhart (1987) also made a similar link, albeit in a more convoluted way, whereby the donkey-anaphora
interpretations are achieved via a combination of an index copying operation and spec-binding. As I will mention in Section
5.1, Kang (1988) also develops a unified account of the two, expanding an account formulated by Haik (1984) for
donkey-anaphora, but again, does not (explicitly) discuss the role of precedence. As such, the connectedness of the phenomena
has long been recognized, but the role precedence plays in both of them has much more rarely been a focus.
As will be shown in later chapters by the data gathered for this dissertation, many, and
perhaps even most, individuals who accept possessor-binding with precedence, as in (43), do not
accept it once that precedence is removed, as in (45). This effect cannot be reduced to some
problem with reconstruction or topicalization, as many of the individuals in question accept BVA in
both forms of (46):
(46) a. Every author loves his books.
b. His books, every author loves.
Given this distribution of judgements, where (43) is acceptable, (46)a and (46)b are
acceptable, but (45) is not, it is quite straightforward to advance an account where “possessor-
binding” is merely precedence-based binding. When ‘every author’ c-commands ‘his’, as in (46), then
the BVA reading can survive topicalization, whereas the “possessor” binding structure lacks such c-
command, and thus must rely on precedence for BVA, meaning that topicalization will disrupt it.
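The distribution just described can be summarized as a small table. The Python below is a minimal sketch, not an analysis: the c-command and precedence flags simply restate what the text says about (43), (45), and (46) (with reconstruction assumed for the topicalized sentences), the "reported" column encodes the judgement pattern attributed above to many individuals, and predicts_bva is a hypothetical toy rule assuming that either relation alone suffices for BVA.

```python
# Each entry: (X c-commands Y?, X precedes Y?, reported as acceptable with BVA?)
# X = 'every author', Y = 'his'. The flags restate the text; they are not computed.
SENTENCES = {
    "(43) Every author's mother loves his books.": (False, True, True),
    "(45) His books, every author's mother loves.": (False, False, False),
    "(46)a Every author loves his books.": (True, True, True),
    "(46)b His books, every author loves.": (True, False, True),
}


def predicts_bva(c_command, precedence):
    """Toy assumption: either relation on its own is enough to allow BVA."""
    return c_command or precedence


for sentence, (c_command, precedence, reported) in SENTENCES.items():
    predicted = predicts_bva(c_command, precedence)
    print(f"{sentence:48} predicted={predicted!s:5} reported={reported}")
```

On this simplified picture, the only sentence predicted to lack BVA is (45), which has neither relation; this matches the pattern described for the individuals in question, though not for those who accept BVA in (45).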
In some sense, the literature that has sprung up around this issue is a consequence of a
somewhat “all or nothing” attitude towards c-command and BVA, where proponents have held that
all BVA must be conditioned by c-command, and detractors held that BVA has nothing to do with
c-command. Recognition of a more nuanced perspective, that c-command is one of multiple factors
playing a role in constraining BVA might, therefore, have resolved such issues sooner. In the
defense of past works, however, the pattern I presented above is not universally true; there are
individuals who do, in fact, accept BVA(S, every author, his) with S=(45). Though, as will be shown,
accepting such instances of BVA is not particularly common, it is not so rare as to render it
implausible that some of the authors in question might have encountered it. To give these authors the
benefit of the doubt, we can assume that either they themselves or at least someone they consulted
with had this pattern of judgements. With this assumption, possessor-binding cannot simply be
subsumed under precedence binding and does indeed constitute a novel form of exception to c-
command-based generalizations about BVA.
2.5.2 Label Sensitive C-Command
There have been multiple different sorts of responses to this sort of “exceptional” BVA.
One main body of thought on the matter has been to attempt to revise the structural constraints on
BVA. Reinhart (1983) essentially tries to “bandage” the original theory by simply allowing BVA(S,
X, Y) if X c-commands Y or X is the possessor in a larger nominal phrase[44] Z that c-commands Y.
This is not a very “comprehensive” solution to the problem, and others have tried to take more
radical approaches to subsume possessor-binding under c-command-based binding.
One well known example of such an approach comes from Kayne (1994). This approach
involves adapting structural notions from May (1985) and Chomsky (1986b), who were themselves
trying to define various structural relations in terms of the relationship “exclusion”. The way in
which Kayne ends up using the term is essentially as follows:
(47) Exclusion:
X excludes Y iff no segment of X dominates Y.
[44] From this point, I will start sometimes replacing the term “NP” with “nominal (phrase)” when the precise label is
unimportant, for reasons that we will come to shortly.
The obvious question is of course what a “segment” is. We can essentially understand a
segment to be a nested series of elements sharing the same syntactic label, such as “NP”[45]. For
example, because ‘dog’, ‘white dog’, and ‘fluffy white dog’ all can occur in essentially the same
positions in a sentence, it has been argued that ‘white’ and ‘fluffy’ are “adjuncts” whose merge with
the rest of the phrase does not change the label of the phrase, yielding a nested series of NP’s:
(48)
The three NP’s, ‘dog’, ‘white dog’, and ‘fluffy white dog’ would all be considered one
“segment” under the definition in (47). Note that we are adding quite a lot into the list of what the
CS “sees” by adopting such a definition; we are now defining relations not only on hierarchies of
sets, but also based on what “labels” those sets receive. That is not to say such an approach is
implausible, and indeed, as we have already seen, it is standard to assume we need such a thing for
defining the relevant domains for the “anti-locality” effect for pronouns participating in BVA/Coref
45
Technically, there may be situations where two phrases, both bearing label X are both daughters of another phrase
labeled X; in that case, we will need to know which of the two lower X’s the higher X has “projected” from, but we can
(mostly) ignore this detail.
(see explanation around (22) earlier in the section). Nevertheless, if we use such notions to define c-
command, rather than the “label-less” set relations used in (9)-(13), c-command becomes an
inherently more complicated relation.
To go about achieving such a redefinition of c-command, domination is also given an
additional constraint:
(49) Domination:
X dominates Y iff every segment of X dominates Y.
While in some cases identical to the definition given in (12), in cases like the one
schematized in (48), including this condition or not yields very different answers as to what
dominates what. Specifically, by just the definition in (12), the NP ‘fluffy white dog’ dominates its
daughters and recursively anything its daughters dominate, meaning that it dominates all the terminal
nodes, ‘fluffy’, ‘white’, and ‘dog’. Thinking in terms of segments, however, the highest NP, ‘fluffy
white dog’, forms a segment with the lowest NP, just ‘dog’. The only thing this lowest NP
dominates is, unsurprisingly, its one atomic member, ‘dog’. But by the definition in (49), that means
that the higher NP, ‘fluffy white dog’, also only dominates ‘dog’; ‘fluffy’ and ‘white’ appear to be
dominated by some of the NP segments, but because they are not dominated by all of them, they are
not dominated by any.
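The contrast between the two notions of domination can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the `Node` class, the `cat` field that groups nodes into one nested series, and all function names are invented here, and `contains` merely stands in for the label-free definition in (12). “Segment” is used for each individual node in the series, following the wording of (49).

```python
class Node:
    """A tree node. Nodes that share a `cat` value are segments of one series."""
    def __init__(self, label, children=(), cat=None):
        self.label = label
        self.children = list(children)
        # Hypothetical bookkeeping: a node without an explicit `cat` is its own category.
        self.cat = cat if cat is not None else id(self)

def contains(node, target):
    """Label-free containment: target is a daughter of node, or inside a daughter."""
    return any(c is target or contains(c, target) for c in node.children)

def segments(root, x):
    """Every node at or below root that belongs to x's category."""
    found = [root] if root.cat == x.cat else []
    for child in root.children:
        found.extend(segments(child, x))
    return found

def dominates(root, x, y):
    """Segment-sensitive domination, after (49): every segment of x contains y."""
    return all(contains(seg, y) for seg in segments(root, x))

# Assumed encoding of the nested series of NP's described for (48).
dog, white, fluffy = Node("dog"), Node("white"), Node("fluffy")
np_low = Node("NP", [dog], cat="NP")
np_mid = Node("NP", [white, np_low], cat="NP")
np_top = Node("NP", [fluffy, np_mid], cat="NP")

words = (fluffy, white, dog)
plain = {w.label: contains(np_top, w) for w in words}            # label-free view
strict = {w.label: dominates(np_top, np_top, w) for w in words}  # view under (49)
print(plain)
print(strict)
```

Under these assumptions, the label-free check treats all three words as inside the highest NP, while the segment-sensitive check leaves only ‘dog’ dominated, since the lowest NP contains neither ‘fluffy’ nor ‘white’.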
C-command is then defined based on this new understanding of domination:
(50) C-Command (Exclusion-based):
X c-commands Y iff X and Y are categories and X excludes Y and every category that
dominates X dominates Y
We can see that there are three conditions here, but the first, “being a category”, is
essentially just clarifying the domain on which c-command is defined and thus need not concern us
here. The second is that X should exclude Y. As expressed in (47), this means that no segment of X
should dominate Y. This essentially prevents X from c-commanding anything “inside itself”, which
the purely structural account of c-command in (13) does by defining c-command on things
contained within X’s sister, which is inherently not part of X itself⁴⁶.
The third and final part of this definition is that everything that dominates X must dominate
Y. Were (49) not in use, this would essentially reduce to the requirement that Y be either X’s sister
or contained within X’s sister (or X or contained in X, but this is taken care of in the previous
condition), giving us the same requirement as (13). However, with the new condition on domination
in (49), we get a different result.
Before we see this, it will be helpful to revise the way sentences have been schematized
slightly. The way I have schematized basic sentences, e.g., in (16) will generate a minor issue if we do
not make a minor alteration. I repeat it here as:
(51) [VP NP [VP V NP]]
By the same logic as applied to (48), because not all segments of VP dominate the subject
NP (specifically the lower VP does not), then VP does not dominate the subject. If that is the case,
46
Except under certain less common theories of movement that permit movement of elements in X’s sister into X or vice
versa, but these would not make a relevant difference for the matter at hand.
then the subject is, in fact, not dominated by anything. Technically, it would still c-command the
object, as there is no category that dominates the subject that does not dominate the object, but this
is a rather “degenerate” case. Kayne (1994) does not have this problem, as he is taking an articulated
view of the structure of the sentence that has various intervening phrases for various categories
(most crucially verbal inflection, which makes the node at which the subject is merged an IP, which
itself contains VP). For demonstration purposes, we can approximate this more simply by reverting
to an older tradition in syntax whereby the highest node of the sentence was simply referred to as
the “sentence”, or S⁴⁷:
(52) [S NP [VP V NP]]
In the structure in (52), S dominates the subject NP, the VP, V, and the object NP. There is
nothing else that dominates the subject NP. Thus, everything that dominates the subject NP also
dominates the object NP. Given that subject NP excludes the object NP, i.e., the object NP is not
inside of it, the subject NP thus c-commands the object NP by the definition given in (50).
47
To be clear, I am not advocating that syntacticians should return to doing so. Merely, it is simpler to adopt this notation
for the purposes of demonstration, rather than explain a rather intricate theory of how functional projections like tense are
represented, when such a theory is not particularly relevant to what is being explored. As stated, none of these labels will
matter in the end for the theory we ultimately adopt.
2.5.3 Kayne’s Approach: Possessors as DP Adjuncts
The above result, namely the subject c-commanding the object, is the same as we find under
the definition of c-command given in (13). Something different happens, however, if we take the
case of a subject with “adjuncts”. Consider the following diagram:
(53)
As noted in the discussion of (48), not all segments of NP dominate the adjunct, and as
such, the highest NP, i.e., the subject, does not in fact dominate the adjunct “inside” itself. As such,
the only node that dominates this adjunct is S. As such, by the same logic that the subject c-
commands the object given above, by the definition of c-command in (50), the adjunct c-commands
the object too.
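The reasoning in the last few paragraphs can be checked mechanically. What follows is a minimal sketch, not Kayne's own formalization: the class and function names are hypothetical, categoryhood is simply taken for granted, and the tree is an assumed encoding of a subject with one adjunct, a verb, and an object under S. The `plain_dominates` option stands in for the earlier label-free view, in which a node dominates whatever it contains.

```python
class Node:
    """A tree node. Nodes that share a `cat` value are segments of one category."""
    def __init__(self, label, children=(), cat=None):
        self.label = label
        self.children = list(children)
        self.cat = cat if cat is not None else id(self)  # hypothetical default

def contains(node, target):
    return any(c is target or contains(c, target) for c in node.children)

def all_nodes(root):
    yield root
    for child in root.children:
        yield from all_nodes(child)

def segments(root, x):
    return [n for n in all_nodes(root) if n.cat == x.cat]

def dominates(root, x, y):
    """After (49): every segment of x contains y."""
    return all(contains(seg, y) for seg in segments(root, x))

def plain_dominates(root, x, y):
    """Label-free stand-in: the node x itself contains y."""
    return contains(x, y)

def excludes(root, x, y):
    """After (47): no segment of x contains y."""
    return not any(contains(seg, y) for seg in segments(root, x))

def c_commands(root, x, y, dom=dominates):
    """After (50): x excludes y, and whatever dominates x also dominates y."""
    if x.cat == y.cat or not excludes(root, x, y):
        return False
    return all(dom(root, z, y) for z in all_nodes(root) if dom(root, z, x))

# Assumed tree: [S [NP Adjunct [NP N]] [VP V [NP N]]]
adjunct = Node("Adjunct")
subj_low = Node("NP", [Node("N")], cat="subject")
subj = Node("NP", [adjunct, subj_low], cat="subject")
obj = Node("NP", [Node("N")], cat="object")
s = Node("S", [subj, Node("VP", [Node("V"), obj])])

subject_over_object = c_commands(s, subj, obj)
adjunct_over_object = c_commands(s, adjunct, obj)
adjunct_over_object_plain = c_commands(s, adjunct, obj, dom=plain_dominates)
object_over_adjunct = c_commands(s, obj, adjunct)
print(subject_over_object, adjunct_over_object,
      adjunct_over_object_plain, object_over_adjunct)
```

In this toy encoding the only thing that dominates the adjunct in the segment-sensitive sense is S, so the adjunct patterns with the subject; switching to the label-free stand-in reinstates the subject NP as a dominator and the adjunct no longer c-commands the object.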
This is a major difference from the approach to c-command given near the beginning of this
section, whereby the adjunct would be considered dominated by the NP, and thus would not c-
command the object. We should, however, observe an important caveat that Kayne adopts, namely
Abney (1987)’s DP hypothesis. Under this hypothesis, the elements labeled NP in the above tree
structures are in fact DP’s (determiner phrases) with NP’s inside of them. For example, if we
combine our ‘fluffy white dog’ with the “determiner” ‘the’, and use it as the subject of a sentence,
we would get (roughly) the following under this hypothesis:
(54)
In the structure given in (54), it is still true that the NP ‘fluffy white dog’ does not dominate,
say ‘fluffy’. The DP ‘the fluffy white dog’, however, does; there are no segments of the DP that do
not dominate ‘fluffy’. As this subject DP does not dominate the object DP, then there is an element
that dominates ‘fluffy’ that does not dominate the object, and so, by the definition in (50), ‘fluffy’
does not c-command the object.
Delving into the intricacies of NP vs. DP structures for nominals would take us far afield at
this point, as all we really need to know is the place of the possessor in the DP structure Kayne
assumes, and crucially, how that differs from other accounts.
The main “other account” of concern derives from Chomsky (1970)’s X-bar theory, which
offers a particular structural position for the possessor⁴⁸, namely the “specifier” position, or “spec”
(from which the term “spec-binding” derives). While adjuncts are both sisters and daughters of XP,
48
Why does the possessor get this special position? In English at least, D’s like ‘the’ mandatorily come before adjectives
and other traditional adjuncts like prepositional phrases and relative clauses. We cannot “scramble” this order, as below:
(i) a. The white dog on the porch that I saw
b. White the dog on the porch that I saw
c. On the porch the white dog that I saw.
d. That I saw the white dog on the porch.
Of the phrases in (i) the only way to express (i)a is in fact (i)a itself; (i)b-(i)d cannot be understood as synonymous with
(i)a, if they can even be understood at all. This is easily explained under a DP approach, assuming that all the adjuncts are
part of the NP. Under this account, the linearization of the phrase is simply keeping syntactic constituents together (that
is, there is no “displacement” going on), and thus, the structurally high D ‘the’ cannot intervene between one of these
adjuncts and the rest of the linearized NP.
Possessors in English seem to occupy a similar position to ‘the’; they are mutually exclusive with ‘the’, and like ‘the’, they
also cannot be reordered with respect to adjuncts like adjectives:
(ii) a. John’s fluffy white dog.
b. John’s the fluffy white dog.
c. Fluffy John’s white dog.
Once again there is no synonym in (ii) for (ii)a. (ii)b and (ii)c can perhaps be interpreted in some rather unusual
contexts, but even so, they will not mean (ii)a. As such, English possessors are not assumed under DP-theory to be part
of the NP, but rather, part of the DP, with the possessive ‘s’ perhaps occupying the D position, accounting for the lack
of ‘the’ or other determiner. Is the possessor then an adjunct of the DP? Originally, this was thought not to be the case
among DP theorists, because it lacks the “repeatability” and “deletability” aspects common to most adjuncts; that is,
adjuncts can usually be added or deleted freely, so long as they appear in the correct position.
(iii) a. The [dog] is friendly.
b. The [white dog] is friendly.
c. The [fluffy white dog] is friendly.
d. The [big fluffy white dog] is friendly.
e. The [big fluffy white dog on the porch] is friendly
f. The [big fluffy white dog on the porch with a wagging tail] is friendly.
As shown in (iii)b-f we can have as many adjectives, e.g., ‘white’, ‘fluffy’, and ‘big’, and prepositional phrases, e.g., ‘on the
porch’ and ‘with a wagging tail’, as we want, or we can have none whatsoever as in (iii)a; the sentence is well formed
regardless of whether these adjuncts are there or not. English possessors are not like this at all:
(iv) a. John’s dog is friendly.
b. Mary’s John’s dog is friendly.
c. (‘s) dog is friendly.
While having one possessor as in (iv)a is fine, adding another one as in (iv)b is (except with certain very atypical readings)
impossible, as is deleting the possessor, i.e., (iv)c. As such, possessors seem to have a different relationship with the
nominal phrase than do adjuncts, though there are many different ways in which such a difference might be understood,
of which the specifier-based explanation is a particularly “structural” one.
specifiers are daughters of XP but sisters to an “intermediate projection” called X’ (giving the theory
its name), as schematized below:
(55) Adjunct:
(56) Specifier:
As with the DP hypothesis, we do not need to go into too much detail with regards to X-bar
theory. All we need to understand is that, if possessors are specifiers of DP, then a sentence like (43)
will have roughly the following structure:
(57)
I have added subscript numbers to (57) to make it clear which parts of which DP’s are
linked. Crucially, DP₁ (the possessor in the subject) is not a “segment” of DP₂ (the entire subject);
the two just both happen to be DP’s (see footnote 45). What is important is that, in this structure,
possessor DP₁ ‘every author’ is sister to D₂’ but dominated by DP₂. Thus, the definition of
domination in (49) yields that DP₂ dominates DP₁; there is no other segment labeled DP₂ that does
not dominate DP₁. As such, DP₂ dominates DP₁, the possessor, but does not dominate DP₃, the
object. By the definition in (50), therefore, the possessor does not c-command the object.
This is where Kayne finally makes his key contribution. Under Kayne (1994)’s theory of
“antisymmetry”, there is no structural distinction between specifiers and adjuncts; specifiers
essentially are adjuncts as far as syntax is concerned, with the issues of repetition and deletion being
relegated elsewhere. As such, the structure in (57) changes ever-so-subtly to the one in (58):
(58)
Under the definition of domination given in (49), DP₁ is no longer dominated by DP₂, as
there is now a lower segment of DP₂ that does not dominate DP₁. The only node that dominates DP₁,
the possessor in the subject, is S, which also dominates DP₃, the object. Thus, by the definition of c-
command in (50), the possessor of the subject c-commands the object.
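The effect of treating the possessor as an adjunct rather than a specifier can be illustrated with the same kind of toy model. This is a sketch under stated assumptions only: all names are hypothetical, the two trees are rough encodings of the specifier and adjunct structures as they are described in the text, and D₂’ is assumed to count as a node that is not a segment of DP₂.

```python
class Node:
    """A tree node. Nodes that share a `cat` value are segments of one category."""
    def __init__(self, label, children=(), cat=None):
        self.label = label
        self.children = list(children)
        self.cat = cat if cat is not None else id(self)  # hypothetical default

def contains(node, target):
    return any(c is target or contains(c, target) for c in node.children)

def all_nodes(root):
    yield root
    for child in root.children:
        yield from all_nodes(child)

def segments(root, x):
    return [n for n in all_nodes(root) if n.cat == x.cat]

def dominates(root, x, y):
    """After (49): every segment of x contains y."""
    return all(contains(seg, y) for seg in segments(root, x))

def c_commands(root, x, y):
    """After (50): x excludes y, and whatever dominates x also dominates y."""
    if x.cat == y.cat or any(contains(seg, y) for seg in segments(root, x)):
        return False
    return all(dominates(root, z, y) for z in all_nodes(root) if dominates(root, z, x))

def build(possessor_as_adjunct):
    """Return (S, DP1, DP3) for the specifier or the adjunct analysis."""
    dp1 = Node("DP1")  # the possessor, e.g. 'every author'
    if possessor_as_adjunct:
        # Adjunct analysis: the possessor's sister is a lower segment of DP2.
        inner = Node("DP2", [Node("D2"), Node("NP")], cat="DP2")
    else:
        # Specifier analysis: the possessor's sister is D2', not a segment of DP2.
        inner = Node("D2'", [Node("D2"), Node("NP")])
    dp2 = Node("DP2", [dp1, inner], cat="DP2")
    dp3 = Node("DP3")  # the object
    s = Node("S", [dp2, Node("VP", [Node("V"), dp3])])
    return s, dp1, dp3

as_specifier = c_commands(*build(possessor_as_adjunct=False))
as_adjunct = c_commands(*build(possessor_as_adjunct=True))
print(as_specifier, as_adjunct)
```

The two trees differ only in whether the possessor's sister shares a category with the whole subject, and that single difference is what flips the c-command result in this sketch.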
Under Kayne’s theory, possessor-binding is thus no longer an instance of BVA(S, X, Y)
where X does not c-command Y. ‘Every author’ does c-command ‘his book’ in sentences like (43).
Of course, we already had a potential precedence-based explanation for the acceptability of (43), but
Kayne apparently hoped to subsume all cases of BVA under the Reinhartian c-command approach.
Indeed, Kayne's theory also offers a plausible explanation for why some individuals accept
topicalized possessor-binding, as in (45), which neither a traditional definition of c-command nor a
precedence-based explanation could account for.
2.5.4 Unabating Counterexamples
As noted previously, this explanation comes at a cost, namely a somewhat more complicated
theory. To be fair, there are numerous points of motivation for Kayne’s theory, accounting for spec-
binding being only one. A complicated theory that gets the facts right is presumably better than a
simple one which does not. Unfortunately, however, while Kayne’s theory accounts for possessor-
binding in particular, possessor-binding is only one of a fairly large set of exceptions to c-command-
based generalizations. Consider, for example, “inverse linking” constructions, identified by May
(1985)⁴⁹:
(59) Someone from every city hates it.
49
These are, in form at least, similar to donkey anaphora. Barker (2012) considers inverse-linking to be a clearer exception
to c-command-based generalizations than donkey anaphora, though the analysis we adopt here does not necessarily force
us to maintain that distinction.
(59) is generally accepted with a BVA(S, every city, it) reading, e.g., someone from New York
hates New York, someone from Los Angeles hates Los Angeles, someone from Chicago hates Chicago,
etc. ‘Every city’ is contained within the PP ‘from every city’, itself within the subject ‘someone from
every city’. As we have seen above, adjunct PP’s are held to occupy a position lower than possessors
in the structure of the nominal, and even if they did not, ‘every city’ would still be inside of the PP
itself. It is impossible then, even under Kayne’s redefined c-command approach, to hold that ‘every
city’ c-commands ‘it’.
Of course, once again, we have the confound of precedence. We can alleviate it, using a
similar strategy to what we did for donkey anaphora in (42):
(60) a. Someone from every city hates that city.
b. That city, someone from every city hates.
Based on previous experience, I suspect that, as with possessor binding, while rates of
acceptance of BVA in (60)a will be high, they will drop off considerably in (60)b, though not reach
zero. Assuming that is the case, there is still something left to account for here. There are other
approaches that do achieve this, such as Hornstein (1995)’s “almost c-command”, which essentially
allows the structurally highest nominal phrase within another nominal phrase to c-command out
of it, but this too is essentially a problem-specific bandage. Barker (2012) compiles a large list of
such exceptions, many of which do not involve “highest nominals” or even nominals inside
nominals at all. Just to list a few of the many examples:
(61) The amount of wealth that each person had was added to their overall score.
(62) In everyone’s own mind, they are the most important person in the world.
(63) We will sell no wine before its time.
My usual caveat about failure to control for the effects of linear precedence applies here⁵⁰.
Ignoring that, however, let us briefly examine each one of these sentences:
In (61), BVA(S, each person, their) is reported to be possible, despite ‘each person’ being
embedded in the relative clause ‘that each person had’, which is itself embedded in the nominal
phrase ‘wealth that each person had’, itself in the PP ‘of wealth that each person had’, which is
embedded in the subject ‘the amount of wealth that each person had’. ‘Their’ is inside a post-verbal
PP ‘to their overall score’. Given that the deep embedding of ‘each person’, which would preclude it
from c-commanding ‘their’ without a fairly intensive revision to hypotheses regarding structure or c-
command, does not block BVA here, it seems difficult to provide a c-command account for this
BVA⁵¹.
In (62), ‘in everyone’s own mind’ is in fact a sentence-level PP adjunct, and yet BVA(S,
everyone, they) is reported to be possible. We need not worry too much about the structural
position of such PP’s; even if ‘in everyone’s own mind’ does c-command ‘they’, ‘everyone’ does not;
it is just the possessor inside of the nominal phrase ‘everyone’s own mind’, itself inside of the larger
PP. ‘Everyone’ is thus deeply buried and the highest constituent it is buried in is not a nominal, and
thus even if we relax our definition of c-command so much as to let any NP/DP c-command out of
NP/DP’s that contain it, ‘everyone’ still cannot plausibly c-command ‘they’.
Finally, in (63), BVA(S, no wine, its) is reported to be possible. Here, ‘no wine’ is the object
and ‘before its time’ is a post-verbal PP. By the structural parses given earlier in this section (see the
50
At least for Barker (2012), the inclusion of precedence is somewhat deliberate; as mentioned in footnote 40, he considers
a somewhat syntacticized version of precedence to be a requirement for BVA.
51
One might wonder about the use of ‘their’, which may be regarded as plural, but substituting in ‘his’ does not seem to
affect the availability of the reading, at least for me. As we will be seeing in Chapter 4, the use of ‘their’ on its own does
not seem to radically alter the availability of BVA when the subject is itself grammatically singular, as ‘each person’ is (even
though it is not necessarily “semantically” singular).
tree in (36)), the post-verbal PP c-commands the object and not vice versa. As such, ‘no wine’ does
not c-command ‘its’ and vice versa. This dispels any hopes we might have had of reducing
“exceptional BVA” to having something to do with subjects and sentence-level PP’s; such cases
obtain when X of BVA(S, X, Y) is the object of the sentence too⁵².
We could of course modify our theory of c-command further in order to account for any
and all of these exceptions. There are, however, many more such counterexamples, which would
force us to likely modify things even further. In the end, the modification strategy will leave us with
a theory of c-command that bears little resemblance to the original; practically everything will have
to be said to be able to c-command everything else. Far from a predictive theory, under such a
watered-down version of c-command, the claim that BVA(S, X, Y) requires X to c-command Y in S
becomes an almost entirely vacuous statement. If we are really pushed to this extreme, it becomes
unclear what utility the notion of c-command provides, as it is at that point simply a convoluted way
of restating the fact that various elements in a given sentence can enter into BVA with one another.
2.6 Explanations from Covertness
2.6.1 The Basic QR/SR Approach
There is an alternative approach, which does not require any redefinition of c-command, but
which greatly expands the possibilities for X to c-command Y. This is the previously mentioned QR
approach of May (1977). Under this account, usually applied to DR but potentially applicable to any
52
Here, there is a potential structural explanation that some have provided. For example, Pesetsky (1995) argues for
structural ambiguity in the post-verbal domain that would allow the object to c-command a post-verbal PP, which would
account for the availability of BVA here. While such an approach may have merits for individual constructions, applying
it to all cases of “exceptional” BVA will run into the same problem I have articulated for the approach of redefining c-
command in response to each new exception; eventually, one reaches a state where anything can be said to c-command
anything else, and the theory becomes non-predictive. That is not to say that the availability of BVA should not be used
as a diagnostic for structure; rather, what I am arguing is that, in order to use BVA as a diagnostic, several other factors
must be controlled first. If we can do that, we can evaluate proposals like Pesetsky’s.
MR, if X does not appear to c-command Y in S yet MR(S, X, Y) is possible, this can be explained by X
silently having raised higher in the structure. Making liberal use of “…” to indicate spaces for
additional structure, we can schematize this roughly as below:
(64)
Let us return to our original structural definitions given in (9)-(13). While X does not c-
command Y in its low position in (64), it does once it has moved to its higher position. QR thus
provides a tool by which potentially any counterexample to c-command-based generalizations can
be dismissed.
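To see how raising changes the picture without any redefinition of c-command, consider the following sketch. It is only an illustration under stated assumptions: the names are hypothetical, c-command is the sister-based notion (Y is X's sister or is contained in X's sister), and the landing site of the raised X is assumed to be a position adjoined at the root, which the schematic discussion above does not commit to.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def contains(node, target):
    """target is a daughter of node, or inside one of its daughters."""
    return any(c is target or contains(c, target) for c in node.children)

def sister_of(root, x):
    """Sister of x in a binary-branching tree, or None if x is the root or absent."""
    for child in root.children:
        if child is x:
            return next((c for c in root.children if c is not x), None)
        found = sister_of(child, x)
        if found is not None:
            return found
    return None

def c_commands(root, x, y):
    """Sister-based c-command: y is x's sister or is contained in x's sister."""
    sis = sister_of(root, x)
    return sis is not None and (sis is y or contains(sis, y))

# Assumed low structure: X is buried inside a larger phrase ZP, Y is inside VP.
x_low, y = Node("X"), Node("Y")
low = Node("S", [Node("ZP", [Node("W"), x_low]), Node("VP", [Node("V"), y])])

# Assumed raised structure: a silent higher occurrence of X is sister to the old S.
x_high = Node("X")
raised = Node("S", [x_high, low])

before = c_commands(low, x_low, y)
after = c_commands(raised, x_high, y)
print(before, after)
```

Nothing in the definition of c-command has changed between the two checks; only the position of X has.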
There is a danger, however, in adopting such a powerful theory, in that it may again make c-
command so widely available that our theories lose any semblance of predictive power. To guard
against this, there must either be (a) some restrictions⁵³ on QR⁵⁴, or (b) some detectable
consequences of QR. I will return to (a) later in this chapter, so let us focus on (b) for now.
At this point, we have seen the major trend among certain syntacticians defending the c-
command approach to BVA, namely to address seeming counter-examples by finding ways to
“widen” the scope of what c-commands what. This is achieved either through redefinition of c-
command, as in Kayne (1994)’s approach, or by allowing for some sort of structure-altering
operation like QR. An alternative approach has been to argue that BVA is not syntactic at all and is
instead dependent entirely on semantic considerations. This position is strongly articulated in Barker
(2012) (see works cited therein for previous similar approaches, esp. Safir 2004).
Interestingly enough, Barker’s approach is quite like a QR approach. While he argues that
BVA(S, X, Y) cannot result from X QR-ing to a high position, he primarily does so based on a
minor technicality. He cites Heim and Kratzer (1998), who in turn cite Chomsky (1981), to argue
that, by definition, BVA(S, X, Y) is only possible if X is in an “argument position” in S
53
May’s initial formulation does give us one restriction, in that QR is motivated by a mismatch in semantics. Following
accounts of quantification such as Generalized Quantifier Theory (see e.g., Barwise and Cooper, 1981), quantificational
phrases like ‘every boy’ must (semantically) take predicates as their arguments, with taking something as an argument being
essentially the same thing as undergoing Merge with it. If, however, a quantificational phrase appears in, say, object position,
e.g., ‘John saw every boy’, then this is not possible. QR is thus motivated as a solution to this mismatch, allowing the
quantified expression to move higher in the structure and thus take an appropriate (semantic) argument. If we are going
to use QR as an explanation for BVA though, we can immediately see that restricting QR to only cases where this mismatch
occurs will not give us a sufficiently constrained theory to explain why so many individuals reject sentences like ‘his parents
saw every boy’ with a BVA reading; the type mismatch is there, yet ‘every boy’ raising up to a position where it c-commands
‘his parents’ is apparently insufficient for those individuals to allow it to enter into BVA with ‘his’. Indeed, this is essentially
the line of reasoning that led Reinhart (1987) to reject QR as a potential source for BVA. We do not necessarily need to
adopt such a strong stance, but it is clear that a QR account will need something other than just the mismatch requirement in
order to account for the observed patterns.
54
A somewhat analogous approach is rather to derive apparent QR from something else, e.g., Hornstein (1995)’s argument
that what we have is actually multiple overlapping overt movements, such that, say, the object overtly moves to a position
where it c-commands the subject for independent reasons and then the subject moves back over the object, ultimately
yielding the original word order. In such cases, the scope that obtains between them is just a matter of which element(s)
reconstruct(s) to which positions. Much like the type-mismatch motivation discussed in the previous footnote, this does
provide certain restrictions to “covert” scopes, but not ones that will particularly help in resolving our BVA mysteries.
That is not a reason to object to said accounts, but rather, simply, those accounts alone do not help us to recover (an
updated version of) the original Reinhartian account of BVA.
(essentially, a position where X is “required to be” by another element, e.g., a verb needing an
object), and that the position reached by QR is not an “argument position”. Based on this, Barker
argues that, even if DR(S, X, Y) can be achieved via QR, BVA(S, X, Y) cannot be, so a QR-based
account will not help in explaining apparently c-command-less BVA.
Instead, Barker (2012) argues that DR⁵⁵ and BVA can be obtained via purely semantic
operations. That is, the hierarchical structure built by syntax contains no relevant movement of any
kind, but the semantic module of the mind applies certain interpretative operations which achieve
the same result, namely the structurally low X achieving “high scope” in the interpretation. Crucially
for Barker, these operations are the same or at least similar across DR and BVA, so while QR
apparently could not successfully provide a joint explanation for such MR’s, these semantic
operations can.
As I implied above, the argument against QR-based BVA is somewhat weak. It is not entirely
clear why Barker cannot simply assume that either (I) the stipulation that BVA only happens from
an argument position is wrong, or (II) the stipulation that QR does not target argument positions is
wrong. As far as I can see from my reading of the sources, neither of these positions was very
strongly motivated, and modern works on the subject rarely reference them. That is not to say
Barker’s preference to locate scope changing operations in the semantics, rather than the syntax via
QR, is unreasonable. However, it seems that Barker’s arguments will work equally well for either a
QR approach or a post-syntactic approach. In fact, they will work for essentially any approach that
holds the general positions outlined in (65):
55
At least some cases of DR; it is not clear whether Barker is arguing that QR does not exist or merely relegating it to a
minor role.
(65) a. There are places where X can be merged into the structure of S where it does not
c-command Y.
b. There is an operation (or set of operations) that applies to the base structure created
in (a) whereby X comes to take a “higher scope” than Y, whether this be in the syntax or
post-syntactically.
c. This (set of) operation(s) can facilitate both BVA(S, X, Y) and DR(S, X, Y).
Let me call such accounts “Scope Raising” (SR) accounts, to be neutral as to how this raising
takes place. The question to be addressed then is the same as mentioned above: are there ways of
“detecting” whether SR has taken place? If so, then the theory is predictive; we may not always be
able to know from looking at the surface form of a sentence whether BVA(S, X, Y) is possible, but
if BVA(S, X, Y) is possible, there ought to be some “signature” indicating that SR has happened.
2.6.2 Barker’s Operational Test for Scope
Barker (2012) argues for one method of detecting such an “SR signature”. He phrases this
as an “operational test for scope”:
(66) Operational Test for Scope:
A quantifier can take scope over a pronoun only if it can take scope over an indefinite in the
place of the pronoun.
By “a quantifier […] tak[ing] scope over a pronoun”, Barker means what I have called
BVA(S, X, Y), and by “a quantifier […] tak[ing] scope over an indefinite”, Barker means what I have
called DR(S, X, Y). This test in (66) thus suggests that the availability of BVA should be linked to
that of DR. In particular, BVA(S, X, Y) should only be possible if DR(S’, X, Y’) is possible, where Y’
is an indefinite expression, unlike Y, which is a pronoun, and where S’ is the same as S, only changed
minimally to allow for Y’ rather than Y. We can phrase this as either (67)a or (67)b; as noted in Sub-
Section 1.4, such “contrapositives” are equivalent:
(67) a. okBVA(S, X, Y) → okDR(S’, X, Y’)
b. *DR(S’, X, Y’) → *BVA(S, X, Y)
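The equivalence of the two formulations in (67) can be verified by brute force over the four possible combinations of judgements. The sketch below is only illustrative; the function and variable names are invented for the purpose, and “ok” and “*” are modelled simply as `True` and `False`.

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q

rows = []
for bva_ok, dr_ok in product([True, False], repeat=2):
    form_a = implies(bva_ok, dr_ok)          # (67)a: okBVA -> okDR
    form_b = implies(not dr_ok, not bva_ok)  # (67)b: *DR -> *BVA
    rows.append((bva_ok, dr_ok, form_a, form_b))

# The two forms agree on every row, so they make the same prediction.
equivalent = all(a == b for _, _, a, b in rows)

# The only judgement pattern that the prediction rules out.
excluded = [(bva, dr) for bva, dr, a, _ in rows if not a]
print(equivalent, excluded)
```

The single excluded row is the combination of acceptable BVA with unacceptable DR; every other combination is compatible with (67).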
The assumption here is essentially that DR requires just SR to occur, whereas BVA requires
SR and perhaps something more⁵⁶. As such, BVA can occur in a subset of the configurations where
DR can occur; if DR cannot occur, BVA cannot, and if DR can occur, BVA perhaps can. For
example, Barker gives the following pair:
(68) No one’s mother-in-law fully approves of her.
(69) No one’s mother-in-law fully approves of an unemployed son-in-law.
(68) is, in fact, a possessor-binding case, where there is attempted BVA from the subject
possessor ‘no one’ to the object ‘her’, the intended meaning being something like “no woman is
approved of by her own mother-in-law”. (69) is an analogous sentence, only using DR instead of
BVA, substituting ‘an unemployed son-in-law’ for ‘her’, the intended reading being that there are
different unemployed sons-in-law that are disapproved of, one by each mother-in-law (rather than
every mother-in-law in question disapproving of the same unemployed son-in-law).
Barker treats judgements on BVA and DR here as if all individuals have the same such
judgements⁵⁷. In particular, he reports that (68) is indeed acceptable with a BVA(S, no one, her)
reading, and, as predicted, (69) is acceptable with a DR(S’, no one, an unemployed son-in-law) reading.
Indeed, as Barker shows, in a variety of places where apparently c-command-less BVA is acceptable,
56
For Barker (2012) in particular, this “something more” is his version of a linear precedence constraint, as mentioned in
previous footnotes. This is primarily motivated by his belief that BVA in weak crossover configurations is impossible,
whereas DR in analogous constructions is possible. As noted earlier in this chapter, this is a (misleading) simplification of
the situation, and to understand the possibility of both DR and BVA in such constructions, we need to get a more serious
handle on variation, as will be discussed in the following section.
57
Though note that (66) would work equally well as a statement regarding judgement variation.
DR is acceptable too. This alone, however, is only partial evidence for (67); we need cases where DR
is unacceptable and thus BVA is unacceptable too. Otherwise, we are only testing half of the
relevant predictions.
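The point about testing only half of the predictions can be made explicit by sorting paired judgements according to what they tell us about (67). The sketch below uses entirely invented toy data (the speakers, sentence pairs, and judgements are hypothetical and report no actual results), and the category names are mine; it only illustrates which combinations bear on the prediction.

```python
from collections import Counter

def classify(bva_ok, dr_ok):
    """Sort one paired judgement by what it tells us about (67)."""
    if bva_ok and not dr_ok:
        return "counterexample"        # okBVA with *DR: ruled out by (67)
    if not dr_ok:
        return "confirms *DR -> *BVA"  # the half of the prediction that needs *DR cases
    return "compatible"                # okDR: (67) lets BVA go either way

# Invented toy data: (speaker, sentence pair, BVA accepted?, DR accepted?)
judgements = [
    ("A", "pair-1", True, True),
    ("A", "pair-2", False, False),
    ("B", "pair-1", False, True),
    ("B", "pair-2", True, False),
]

tally = Counter(classify(bva, dr) for _, _, bva, dr in judgements)
counterexamples = [(speaker, pair) for speaker, pair, bva, dr in judgements
                   if classify(bva, dr) == "counterexample"]
print(tally)
print(counterexamples)
```

Cases where DR is acceptable fall into the “compatible” bin whatever the BVA judgement is, which is why such cases alone cannot establish (67).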
Barker gives very little in the way of such examples; most are somewhat indirect. For
example, he reports a contrast between the following:
(70) That Mary seems to know every boy surprised his mother.
(71) The grade that each student receives is recorded in his file.
Barker reports that when S=(70), BVA(S, every boy, his) is not possible, whereas when
S=(71), BVA(S, each student, his) is possible. Further, he notes that Szabolcsi (2011) reports that
‘each’, unlike ‘every’, can SR outside a tensed clause, at least as far as DR is concerned. In a sense,
this is consistent with his central claim, ‘each’ seems to have greater SR potential than ‘every’, but it
is a rather weak application of his general approach. The DR equivalents of (70) and (71) are not
produced or examined, and (70) and (71) are far from a minimal pair; a cursory examination can
show that in the former, ‘every student’ is inside of a clause serving as the subject of the sentence,
whereas in the latter’ each student’ is inside of a relative clause inside of the subject. They are indeed
both in tensed clauses, but those tensed clauses occupy very different structural positions, which
may or may not account for the contrast in judgements by themselves, with no need for reference to
Barker’s DR-BVA correlation⁵⁸.
If we put Barker’s account to a more complete test, the results are mixed. As will be shown
throughout this dissertation, including in the previous works reviewed in Section 3.2, there are
58
Further, I am certain there will be considerable variation in judgements on the acceptability of (70) and (71); this is not
a problem for Barker though so long as there is parallel variation in the acceptability of their DR counterparts.
ample cases where *DR does indeed imply *BVA. Such cases lend further credence to Barker’s basic
approach, and to the notion that there is some “SR” operation(s) that is behind both BVA(S, X, Y)
and DR(S, X, Y) in cases where X does not (or at least does not appear to) c-command Y in S. As
we will be seeing, however, there are numerous counterexamples to this correlation as well. That is,
there are cases where DR is impossible but in an analogous sentence, BVA is possible; as such, the
Barker-style approach cannot be said to be complete by itself.
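The logic of testing both halves of this correlation can be sketched in a few lines of Python. This is only an illustrative sketch: the speaker labels and judgements are invented for the example and are not reported data, and the function name `classify` is an assumption of the sketch rather than anything in Barker's work. The point is simply that a pair of judgements in which DR is accepted places no restriction on BVA, so only pairs in which DR is rejected can confirm or disconfirm "*DR implies *BVA".

```python
from collections import Counter


def classify(dr_ok: bool, bva_ok: bool) -> str:
    """Classify one pair of judgements against the implication '*DR implies *BVA'."""
    if not dr_ok and not bva_ok:
        return "confirms"        # DR rejected and BVA rejected
    if not dr_ok and bva_ok:
        return "counterexample"  # DR rejected but BVA accepted
    return "untested"            # DR accepted: the implication does not restrict BVA


# Hypothetical judgements: (speaker, DR accepted?, BVA accepted?)
judgements = [
    ("speaker_1", True, True),
    ("speaker_2", False, False),
    ("speaker_3", False, True),
    ("speaker_4", True, False),
]

tally = Counter(classify(dr, bva) for _, dr, bva in judgements)
for label in ("confirms", "counterexample", "untested"):
    print(label, tally[label])
```

Under these invented judgements, a dataset containing only pairs like that of speaker_1 would never land in the "confirms" or "counterexample" rows, which is the sense in which it tests only half of the relevant predictions.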
2.6.3 The Role of the Bindee
Let us observe one further aspect of this SR-based approach: it is almost entirely focused on
the behavior of X of MR(S, X, Y). This is apparent from the general formulation I provided in (67),
which involves terms BVA(S, X, Y) and DR(S’, X, Y’). Here, X is shared between BVA and DR, but
Y is not; as Barker’s own formulation in (66) puts it, the goal is to predict how a quantifier (X)
behaves with regard to the ability to scope over a pronoun (Y) based on its ability to scope over an
indefinite expression (Y’). The implicit assumption here seems to be that it is some property of X
that determines whether BVA(S, X, Y) should be available. Setting aside issues like c-command and
linear precedence, this still raises the question as to whether Y’s properties might also be relevant.
As it turns out, a separate literature exists concerning the nature of Y in BVA(S, X, Y), and
its role in constraining BVA interpretations. Quite relevantly, a pair of responses to Barker 2012,
namely Baltin, Dechaine, and Wiltschko 2015 and Dechaine and Wiltschko 2017, argue for a
c-command-based theory of BVA but include considerations about cases where the properties of Y might lead to
the appearance of BVA without c-command. In the first of these papers, Baltin et al. argue that
seeming c-command violations are due to a different type of BVA, which they term “Q-binding”,
which is achieved via discursive mechanisms outside of syntax. Crucially, this type of binding is said
not to be equally available for all lexical items; the authors cite the difference in acceptability between pronouns
like ‘him’ and reflexives like ‘himself’; the claim is that the former can be Q-bound and the latter
cannot. For example, the authors provide the following contrast in terms of possessor-binding between
‘her’ and ‘herself’:
(72) a. Every dissident’s lawyer visited her.
b. Emily’s father visited herself.
The claim is that (72)a is possible with a BVA(S, every dissident, her) reading, while (72)b is
not possible with a Coref(S, Emily, herself) reading. Of course, this is not perfectly minimal, but we
could change (72)b to (73), and I suspect judgements would be similar⁵⁹:
(73) Every dissident’s lawyer visited herself.
Following the judgements given, it seems that possessor binding, taken to be a form of
BVA(S, X, Y) without X c-commanding Y in S, is possible for typical pronouns in English but not
for reflexives. Baltin et al. go on to provide a host of other cases wherein these two groups differ
from one another in terms of their BVA and Coref capacities, with the general pattern being that the
reflexives are far more restrictive than the other pronouns. This restrictiveness, the authors claim, is
a sign that reflexive pronouns can only be “A-bound”, for which c-command is a requirement,
whereas other pronouns can be Q-bound. This latter form of binding is said to be “legislated by
discourse representation”, and therefore not subject to the same sort of structural constraints. Q-
binding, they claim, is also responsible for the sort of donkey anaphora discussed in Section 2.4; this
of course neglects the precedence-based aspects of such phenomena, though in principle, there is no
59
Whether said judgements actually align with Baltin et al.’s reported judgements will, I suspect, be yet another case of
variation.
reason to exclude multiple sources of donkey anaphora as well, and the role of precedence may also
be considered fundamentally discursive in nature depending on our treatment of it⁶⁰.
While, as we will see later in this chapter, discourse-based mechanisms may indeed be
a plausible explanation for c-command-less MR, we should note that, on the surface at least, Baltin
et al.’s proposal seems to radically limit the role of c-command in constraining BVA/Coref; if only
reflexives obey c-command-based constraints, then what was the source of things like the weak
crossover phenomena discussed in Section 2.3? Here, the authors are forced to claim that such
phenomena are discursive in nature rather than structural; this is a rather extraordinary claim, given
that they are conditioned by certain structural configurations, but it is essentially the view that their
account forces them to hold. If weak-crossover configurations block BVA(S, X, Y) when Y is a
pronoun like ‘his’ or ‘her’, as they do for many individuals, but pronouns like ‘his’ and ‘her’ are
bound through discursive Q-binding rather than structural A-binding, then the constraint must be
on Q-binding.
While, of course, such an account is not impossible, it is a rather unfortunate corner to be
backed into for a paper which claims to be trying to support a structure-based account of BVA. The
basic account is, however, further refined in Dechaine and Wiltschko 2017, which gives a more
nuanced analysis of such pronouns. Though slightly different language is used, the authors
essentially claim that non-reflexive pronouns are ambiguous between Baltin et al.’s A- and Q-
variables (things that can be A- and Q-bound, respectively). As a result, at certain times, such
pronouns will appear to follow c-command-based constraints, and other times, they will not.
Instead of the term Q-binding, however, Dechaine and Wiltschko (2017) argue that the
cases of BVA without c-command are the results of the pronouns in question being covert “definite
60
This would be more or less consistent with Ueyama (1998)’s account of co-I-indexation.
descriptions”, which they again analogize to donkey anaphora. In order to make this claim non-
circular, the authors propose a diagnostic for such covert definite descriptions, namely that they can
be replaced with epithets, while regular, c-command-requiring pronouns cannot. As an example, the
authors provide the following case:
(74) a. Everyone’s mother thinks that he is a genius.
b. Everyone’s mother thinks that the idiot is a genius.
In (74)a, BVA(S, everyone, he) is reported to be acceptable, despite the lack of ‘everyone’
c-commanding ‘he’. Dechaine and Wiltschko’s claim is thus that ‘he’ here is a covert definite descriptor,
which predicts it should be possible to replace ‘he’ with an epithet like ‘the idiot’ and maintain the
BVA meaning. This is done in (74)b, which is indeed reported to be acceptable with a BVA(S,
everyone, the idiot) reading.
In contrast, when Y is not a covert definite descriptor, such replacement is said to be
impossible. For example, Dechaine and Wiltschko give the contrast in (75):
(75) a. Every woman was outraged that she was underpaid.
b. Every woman was outraged that the bitch was underpaid.
Here, ‘every woman’ does in fact c-command ‘she’/‘the bitch’. BVA(S, every woman, she) is
reported to be possible for (75)a, but BVA(S, every woman, the bitch) is reported to be impossible
for (75)b. The claim is thus that ‘she’ in such a sentence is not a covert definite descriptor but is
entering into a syntactic relationship with ‘every woman’, which enables BVA. As such, replacement
by an epithet is predicted to be impossible by their theory, and indeed it seems to be.
I should note, however, that I have been giving a somewhat “reinterpreted” version of the
account that Dechaine and Wiltschko give. As they have it originally, a condition on BVA(S, X, Y) is
that Y must be a pronoun of some sort; my use of terms like “BVA(S, everyone, the idiot)” and
“BVA(S, every woman, the bitch)” are thus improper under their account. Essentially, they take
BVA not as a surface-level description of a meaning, but as an underlying mechanism that produces
that meaning, and which relies on c-command. They assume that epithets, as a subset of the broader
class of (non-pronominal) nominal phrases, cannot enter into (c-command-based) BVA, because
that would violate Condition C of Chomsky (1981)’s binding principles, which requires that such
nominal phrases not enter into Coref/BVA with an element that c-commands them. Under this
view, epithets participating in the BVA-looking MR (let us call it “BVA” in quotes) can thus only be
inserted in positions where X of “BVA”(S, X, Y) does not c-command them.
If this is true, however, then Dechaine and Wiltschko’s account is completely non-predictive.
We have two facts: (I) “BVA”(S, X, epithet) is only possible if X does not c-command the epithet in
S, and (II) BVA(S, X, pronoun) is sometimes possible when X does not c-command the pronoun in
S. The fact that we find epithets doing “BVA” in positions where pronouns can participate in BVA
without being c-commanded is then trivial; those are the only places such epithets can ever go in the
first place. We could essentially rephrase the entire argument as, “pronouns can participate in BVA
whether they are c-commanded or not, and epithets can only participate in ‘BVA’ if they are not c-
commanded, so when BVA-participating pronouns are not c-commanded, they must be epithets”.
This is clearly a rather large leap of logic; it is certainly possible, but none of the evidence provided
actually provides active support for the conclusion. (74) and (75) demonstrate that pronouns can
undergo BVA whether they are c-commanded by the quantificational phrase or not, while epithets
can undergo “BVA” only if they are not c-commanded by the quantificational phrase, but this does
not tell us anything new about the behavior of pronouns that would suggest they can be covert
definite descriptions.
However, Dechaine and Wiltschko’s account can be salvaged somewhat by adopting the
“reinterpretation” I have suggested. Indeed, the areas wherein their original account differs from my
reinterpretation are areas where their account appears to rely on generalizations that are empirically
incorrect. Most problematically, the claim that Condition C prevents epithets from undergoing
“BVA”(S, X, epithet) when X c-commands epithet in S is not correct. Some individuals may even
accept “BVA” in (75)⁶¹, but for those who do not, I suspect many will find the reading much
improved in (76), as I myself do:
(76) Every woman was outraged that the bitch’s husband was underpaid.
The Condition C-based account holds that (75)b disallows BVA(S, every woman, the bitch)
because ‘every woman’ is the subject of the sentence, and thus c-commands the clausal object ‘that
the bitch was underpaid’, and therefore, the epithet ‘the bitch’. Moving ‘the bitch’ into a possessor
inside that clause, as I have done in (76), should not alter this basic c-command relation, and yet for
me at least, it greatly improves the possibility of “BVA”(S, every woman, the bitch). It thus cannot
be maintained in such an absolute way that Condition C rules out all instances of epithets
participating in “BVA” with elements that c-command them.
Further, as we will be seeing in later chapters, non-pronominal nominal phrases do not
distribute so differently from pronouns in terms of Coref and BVA after all. Consider, for example,
the use of phrases like “that N”, where N matches the noun used in the relevant quantificational
expression, as in (77):
61
I think this is something of a confound given how distasteful it is to refer to ‘every woman’ as ‘the bitch’. If we simply
followed the model of (74) and generated ‘everyone is outraged that the idiot is underpaid’, acceptance would be more
likely, though still, I suspect, subject to intense variation. Note that I have switched to present tense, as I find that helps
improve acceptability for me.
(77) Every teacher praised that teacher’s student.
While there is variation, many individuals can accept “BVA”(S, every teacher, that teacher) in
such cases. As I will be showing, this “BVA” patterns qualitatively identically to BVA involving
pronouns, including sensitivity to c-command. There is thus little motivation for holding that,
because epithets are non-pronominal nominal phrases, they must undergo some sort of “BVA” that
is different from true c-command-based BVA⁶².
If we assume that BVA(S, X, pronoun) and BVA(S, X, epithet) are both equally “true”
BVA, and that the latter is not so strictly bound by Condition C as is claimed, then Dechaine and
Wiltschko’s account becomes more interesting. Rather than simply noting two distributions and
trying to conclude something from that, they have effectively made a prediction that, in cases where
X does not c-command Y in S, the structural positions where epithets can be Y of BVA(S, X, Y) will
be positions where pronouns can be Y of BVA(S, X, Y) or vice versa. They have not shown this, and I will
not assume it to be correct, but it is a promising line of inquiry in any case.
Regardless of these issues, what is most important for our purposes is that both Baltin et al.
(2015) and Dechaine and Wiltschko (2017) have advanced theories to account for c-command-less
BVA that focus on the Y of BVA(S, X, Y), as opposed to QR/SR-style approaches that focus on the
X. Further, both sets of authors, particularly Baltin et al., link such phenomena to something
“discursive”, rather than something occurring in the narrow syntax. As we will see in the following
62
Dechaine and Wiltschko can be easily excused for not knowing this latter fact though; as I will be discussing, past works
that have attempted to provide a treatment of BVA involving exactly phrases like ‘that N’, such as Hoji 1995 and Ueyama
1998, have also concluded that they do not participate in the same sort of BVA that pronouns do. This, as I will argue, is
a case of mistaking a tendency for an inviolable property; it is true that such phrases are not usually used in the same way
as pronouns, but this does not mean they cannot be. As will be shown across all three languages examined, they very much
can be used in this way, though, as I have said, there is individual-level variation.
section, both X and Y do indeed have a role to play, and a discursive/extra-syntactic analysis of that
role may turn out to be the most straightforward explanation. Before we can understand such an
analysis, however, we will need to get a handle on variation.
2.7 Incorporating Variation
2.7.1 Ueyama’s Quirky Effects
Let us return to the account of BVA provided by Ueyama (1998), which incorporates both
c-command and precedence as two separate conditions which can, each by itself, facilitate BVA. As
we have seen in preceding sections, there are cases of BVA which cannot be explained via either c-
command or precedence. Ueyama (1998) notes this too, highlighting a phenomenon Ueyama (1997)
terms “quirky binding”. Her primary examples come from Japanese; up to this point, I have been
discussing BVA only in English, but the universalist approach predicts that its relationship with
precedence and c-command should be the same in all languages. Certainly, as we will see more
directly in Section 3.2 in particular, everything I have discussed with regards to English can be
reproduced in Japanese, including both basic support for c-command’s role in constraining MR’s, as
well as the problems for such accounts.
Ueyama’s quirky binding occurs in sentences like the following:
(78) So-ko-no         bengosi-ga    subete-no  zidoosya-gaisya-o          uttaeteiru
     that-place-GEN   attorney-NOM  every-GEN  automobile-company-ACC⁶³   sued
     ‘Its retained attorney sued every automobile company.’
63 GEN, NOM, and ACC are genitive, nominative, and accusative markers respectively, providing overt marking as to
what are the possessors, subjects, and objects (again respectively), which is rather convenient considering those are
precisely the things we have been discussing in this section.
In (78), we actually have a weak crossover configuration; subete-no zidoosya-gaisya-o ‘every
automobile company’ does not c-command soko ‘that place’ by virtue of the basic structure for SVO
sentences given in (16); because this is Japanese, the word order is in fact SOV, but by hypothesis,
these are simply language-specific ways of linearizing the same basic underlying structure. If we
attempt BVA(S, subete-no zidoosya-gaisya-o, soko), we are attempting to establish BVA from an
object into a subject, giving us neither the relevant precedence nor c-command configuration, which
should result in BVA being impossible. On the contrary, however, Ueyama notes that such
sentences can indeed sometimes allow this reading, namely that each company was sued by its own
retained attorney.
This alone is not surprising; as we have seen in this chapter, exceptions to c-command
generalizations about BVA are abundant. As noted, this particular one lacks a precedence-based
explanation, as soko precedes subete-no zidoosya-gaisya-o, but we have seen many examples of such cases
as well, though not settled on a particular way of resolving them yet. Ueyama (1998), however,
makes an important connection to variation here that many others have missed, noting in her
“Appendix D” that “some speakers tend to accept [quirky binding] rather easily, while others seem
to firmly reject it.” Ueyama goes on to note that, even for individuals who do accept it sometimes, a
number of factors seem to play a role as to whether the reading is acceptable in a particular instance.
These factors are generally not structural but semantic, and mostly discursive in nature, such as
the “salience” of the X of BVA(S, X, Y) in the discourse, its ability to be understood as the “topic”
of the sentence, and other factors such as X’s specificity, as well as factors pertaining to Y such as it
being interpreted in a “non-individual-denoting” sense⁶⁴.
64
Ueyama leaves some of these terms relatively vague, using them primarily in the “intuitive” sense rather than referring
to any specific theory of the syntactic/semantic/pragmatic definition of “topichood” or “specificity”. I will discuss these
terms further in the following section, though we will still primarily be using informal definitions.
This account somewhat dovetails with Baltin et al. (2015)’s, though the latter is concerned
with the properties of Y of BVA(S, X, Y), while Ueyama here is focused on those of both X
and Y. Like Baltin et al., Ueyama is also concerned with Coref. She holds that something
analogous to quirky binding can occur in the Coref domain⁶⁵, for example, in the analogue of (78):
(79) soko-no          bengosi-ga    Toyota-o    uttaeta
     that-place-GEN   attorney-NOM  Toyota-ACC  sued
     ‘Its retained attorney sued Toyota.’
In (79), Coref(S, Toyota, soko) is also reported to be possible, with similar conditions on Y
as were required to achieve quirky-binding readings (the “non-individual-denoting” requirement, to
be discussed further in Section 2.8). Thus, while not per se explaining the source of such an effect or
set of effects, Ueyama has nevertheless uncovered a phenomenon operating across different MR’s
that leads to c-command-less and precedence-less acceptance of said MR’s, at least for certain
people in certain discursive situations⁶⁶.
65
As with BVA, discussion of this is in Appendix D of Ueyama 1998.
66
This still leaves open the question of why Coref is so much more permissive than BVA in this regard. Ueyama’s
theory provides one possible solution, namely that Coref can come about as a result of what she terms “Co-D-
Indexation”, whereas BVA cannot. Co-D-Indexation essentially involves separately interpreting two expressions as
referring to the same individual, without making use of an operation to link interpretations of the two elements. She
goes on to argue that D-Indexation (no Co- in this case) is what is responsible for interpretations like ‘he’ being
interpreted as “person C” in the following situation:
(i) Person A walks into a room where person B is; person A believes that person B is hiding person C.
Person A: “Where is he?”
In (i), ‘he’ has no linguistic antecedent in the conversation whatsoever, so it cannot have picked up its reference by any
operation that requires another element to be in the conversation, let alone the same sentence. Note that we could not
do something analogous for BVA; if Person A were looking for ‘every boy’, opening the conversation with “where is he”
could not convey this. Further, in Japanese, Ueyama reports that depending on whether the expression used for ‘he’ is
headed with the demonstrative ‘a’ or the demonstrative ‘so’, the interpretation is possible or impossible respectively,
evidence for a lexical constraint on the availability of D-indexation. Further, this distinction rules out the possibility of
co-D-indexation being responsible for the Coref reading of (79), as the demonstrative used there is ‘so’, so it is not the
case that co-D-indexation can subsume all quirky effects in the Coref domain; reference to both is needed (and besides,
quirky effects would be needed to account for cases like (78), where co-D-indexation would be unavailable due to the
MR being BVA rather than Coref).
2.7.2 Variation in DR
An observation similar to the one made in Ueyama 1998 with regard to BVA and Coref is made in
Hayashishita 2004/2013 with regard to DR. Since discussions of QR/SR accounts of DR first
began, it has been known that it is not equally available across constructions, contexts, or lexical
items. With regards to lexical items in particular, Szabolcsi (2010, esp. Chapters 10 and 11), and
Beghelli and Stowell (1997) give detailed descriptive typologies with regards to which quantifiers can
QR from which positions, as well as attempting to derive these differences from structure-based
considerations. Were such attempts to be successful, they would indeed constitute evidence for the
types of “constraints” on QR that I alluded to in Section 2.6, which would make the theory
predictive and thus non-circular⁶⁷.
Assessing whether Ueyama’s analysis is correct, and if so, whether it accounts for all the cases where Coref is more
permissive than BVA, is outside the scope of this dissertation, as the focus is on BVA, rather than Coref. However, as
we will be seeing, Coref’s permissiveness presents certain “annoyances” for the basic correlational methodology
employed to “make use of” BVA. As such, improvements in our understanding of Coref’s sources may improve our
mastery of BVA as well, and as such, investigating Ueyama (1998)’s account of Coref and, if it turns out to be well
supported, finding ways to better control co-D-indexation (especially in languages like English which, as far as has been
discovered, lack an analogue to ‘a’ vs. ‘so’), ought to be a high priority for future work along similar lines as this one.
67
In discussion of constraints on QR, one would be remiss to not mention Fox (1995/2000)’s well known theory of
“scope economy”, which also tries to establish constraints on QR. The key notion of Fox’s theory is that each application
of QR must move the moved quantificational phrase structurally higher than some other quantificational phrase (or
more specifically, it must take higher semantic scope than some other quantificational phrase, but Fox assumes all
semantic scope derives from c-command). That is, QR can never be “vacuous”, and must always result in changing c-
command relations between quantificational phrases. Further, following assumptions about movement first put forward
by Chomsky (1977), movement is said to be “cyclic”; for our purposes, this means that it must “stop” at the edge of
each clause the moved element is being moved through. To move to a position in a higher clause then, an element must
first move to the edge of the lower clause, and then move into the higher clause.
Fox uses these two requirements on movement to derive that QR ought to normally be clause bounded, which is
frequently reported to be the case. If an element wants to QR into a higher clause, it must first move to the edge of the
clause it is currently in, but unless this movement puts it structurally higher than another quantificational element, then
this movement is banned by Fox’s requirement that QR always change structural relations between quantificational
elements. Such movement should, however, be possible if there is an intermediate quantificational element to “move
over”. Fox takes the following pair of sentences (taken from Moltmann and Szabolcsi 1994) as demonstrating the
resulting predicted contrast:
(i) One girl knows that every boy bought a present for Mary.
(ii) One girl knows what every boy bought for Mary.
Exploring the specific descriptions of the behaviors of different quantificational phrases
given in these works would take us somewhat far afield of the discussion at hand, not least of all
because Hayashishita argues that there is a much more crucial aspect of quantificational behavior
that they are overlooking. Instead of different quantifiers being able to QR to different places in the
syntactic structure, Hayashishita argues that the only cases wherein DR is based on c-command
are the “surface scope” readings, those cases where DR(S, X, Y) obtains when X c-commands Y in
S without invoking any covert movement operations like QR. “Inverse scope” readings, where
something like QR/SR would have to be invoked, he argues, are achieved through discourse-based
operations. Effectively, Hayashishita (2013)’s account takes DR interpretations of sentences like (80)
and (81)a to be formed completely differently, with the latter effectively being a “shorthand” for
(81)b.
(80) Every abstract was read by three reviewers.
(81) a. Three reviewers read every abstract.
b. Three reviewers read abstract #1, three reviewers read abstract #2, three reviewers read
abstract #3, etc.
As is assumed for most SR accounts, because (80) is such that ‘every abstract’ c-commands
‘three reviewers’ simply by virtue of ‘every abstract’ being the subject of the sentence, DR(S, every
The reported judgement is that DR(S, every boy, one girl) is impossible for (i) but not for (ii). On the assumption that
‘what’ is a quantificational element at the structural edge of the clause, but ‘that’ is not quantificational, then Fox’s
account indeed predicts this contrast; in (i), there is no intermediate quantificational element to “move over”, whereas in
(ii) there is, so ‘every boy’ can only reach the higher clause in (ii).
Unfortunately, I know from experience that judgements are not consistent with regard to such pairs, even though their
contrast is (on the surface at least) strongly predicted by Fox’s theory. This is not to say though that this theory is wrong;
much like Reinhart’s approach to BVA, there may simply be other factors to reckon with besides movement and c-
command with regards to DR. An approach similar to the one I am pursuing here, which seeks to detect and thereby
control these additional factors, is precisely the sort of thing that will let us investigate whether Fox’s account makes
predictions that stand up to the sort of experimental scrutiny employed in this dissertation, which would in turn provide
substantial information with regard to the debate about the syntactic reality (or lack thereof) of movement and QR.
abstract, three reviewers) is possible to achieve with no additional operations. (81)a lacks such c-
command, so DR(S, every abstract, three reviewers) is only possible via a special operation. For
Hayashishita, however, this is not an operation on (81)a; rather, (81)a is formed when this operation
combines the various sentences in (81)b, turning the different entities, ‘abstract #1’, ‘abstract #2’,
‘abstract #3’, etc. into one shorthand expression, ‘every abstract’. Because each sentence in isolation
permits a reading where the three reviewers in question are different from the three reviewers
mentioned in the other sentences, the DR reading essentially comes “for free”.
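The "shorthand" idea can be made concrete with a small Python sketch. It is not Hayashishita's formalism, only an illustration under stated assumptions: the helper `expand_universal`, the three-abstract domain, and the reviewer labels are all hypothetical. The sketch spells out (81)a as the separate sentences of (81)b and then checks that, when each sentence is satisfied independently, the reviewers may differ from abstract to abstract, which is exactly the DR reading.

```python
from typing import Dict, List, Set


def expand_universal(template: str, members: List[str]) -> List[str]:
    """Spell out a sentence with a universal object as one sentence per member."""
    return [template.format(obj=member) for member in members]


# Hypothetical domain picked out by 'every abstract'
abstracts = ["abstract #1", "abstract #2", "abstract #3"]
sentences = expand_universal("Three reviewers read {obj}.", abstracts)

# Hypothetical situation: each sentence is satisfied on its own,
# so each abstract may come with its own set of three reviewers.
readers: Dict[str, Set[str]] = {
    "abstract #1": {"r1", "r2", "r3"},
    "abstract #2": {"r4", "r5", "r6"},
    "abstract #3": {"r1", "r5", "r7"},
}

each_has_three = all(len(group) == 3 for group in readers.values())
reviewers_vary = len({frozenset(group) for group in readers.values()}) > 1

for sentence in sentences:
    print(sentence)
print("distributive reading available:", each_has_three and reviewers_vary)
```

Nothing in the sketch moves ‘every abstract’ above ‘three reviewers’; the variation in reviewers arises only because the component sentences are evaluated one at a time.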
Hayashishita’s exact analysis, however, is not so important for our present purposes as the
evidence he uses to support it. For that, consider sets of sentences like the following, which differ
only in terms of what quantificational expression is used in the object position.
(82) a. Three reviewers read every abstract.
b. Three reviewers read many abstracts.
c. Three reviewers read two abstracts.
d. Three reviewers read more than two abstracts.
In all the cases in (82), taking X to be whatever the object is, DR(S, X, three reviewers) will
be an inverse scope reading, as X does not c-command ‘three reviewers’. We can compare these
readings to those in (83), where we have flipped quantifiers between subjects and objects.
(83) a. Every reviewer read three abstracts.
b. Many reviewers read three abstracts.
c. Two reviewers read three abstracts.
d. More than two reviewers read three abstracts.
Taking X to be the subject in the sentences above, we now have DR(S, X, three abstracts)
as a surface scope reading, as X does c-command ‘three abstracts’. Hayashishita checks judgements
of different individuals, speakers of both Japanese and English, with regard to their ability to accept
“inverse DR” in sentences like (82) and “surface DR” in sentences like (83). As expected, these latter
types of interpretations are widely accepted, and there is no obvious effect of the choice of X. The
straightforward conclusion is that surface scope readings essentially come “for free”; if X c-
commands Y (without invoking reference to QR), then DR(S, X, Y) is generally possible.
The inverse DR cases, however, show clear and consistent differences based on the choice
of X. Specifically, as X gets more and more numerically intricate, as shown via the progression from
‘every’ to ‘many’ to ‘two’ to ‘more than two’ in (82)a-(82)d, DR(S, X, Y) is increasingly rejected by
more and more individuals. This is subtly different from what is reported in other works examining
the distribution of QR (e.g., the aforementioned Beghelli and Stowell 1997). Specifically, it is not
merely that different quantificational elements have different abilities to participate in inverse DR;
rather, different individuals have different levels of permissiveness in terms of whether they allow
these different quantifiers to participate in inverse DR. While there is an overall pattern as
mentioned, some individuals reject inverse DR in all cases given in (82), whereas others accept it in
all such cases. Further, Hayashishita documents that rejection and acceptance frequently change
based on the context in which the sentences are presented, suggesting that the ability to participate
in inverse DR is not only a property of the choice of X of DR(S, X, Y), or even of S⁶⁸. As such,
Hayashishita argues that not only are inverse scope readings not “free” like surface scope readings,
but they are not bound by the same sort of universal structure-based constraints that surface scope
readings are.
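The difference between a group-level pattern and individual-level permissiveness can be illustrated with a short Python sketch. The judgements below are invented purely for illustration and are not Hayashishita's data; the speaker labels and variable names are likewise assumptions of the sketch. It shows how acceptance of inverse DR can fall across the progression in (82) overall, while individual speakers range from accepting every case to rejecting every case.

```python
# Object quantifiers in the order of (82)a-d
quantifiers = ["every", "many", "two", "more than two"]

# Hypothetical judgements: True means the speaker accepts inverse DR
# with that quantifier as X. These values are invented for illustration.
judgements = {
    "speaker_A": [True, True, True, True],
    "speaker_B": [True, True, False, False],
    "speaker_C": [True, False, False, False],
    "speaker_D": [False, False, False, False],
}

# Proportion of speakers accepting inverse DR for each quantifier
acceptance = {
    q: sum(row[i] for row in judgements.values()) / len(judgements)
    for i, q in enumerate(quantifiers)
}

# The group-level pattern: acceptance never rises along the progression
rates = [acceptance[q] for q in quantifiers]
group_trend_holds = all(a >= b for a, b in zip(rates, rates[1:]))

# The individual-level picture: how many of the four cases each speaker accepts
individual_profiles = {s: sum(row) for s, row in judgements.items()}

print(acceptance)
print(group_trend_holds)
print(individual_profiles)
```

A summary that reported only the per-quantifier proportions would hide the fact that, in this toy dataset, one speaker accepts all four cases and another accepts none.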
68
The contexts used are fairly varied, and getting to Hayashishita’s intended analysis from them is somewhat involved, so
I will direct interested readers to the original papers. In essence, Hayashishita eventually settles on some version (different
in 2004 and 2013) of “specificity”, understood here as roughly whether there is a unique set of individuals in the discourse
who are clearly picked out by X. This also derives why universal quantifiers like ‘every’ are very amenable to inverse
readings (it's easy to pick out in a given context which reviewers are part of ‘every reviewer’: it’s all of them!), but highly
non-specific expressions like ‘more than two’ are not. I will discuss such constraints a bit more in the following section,
including their link with Ueyama’s similar constraints on the X of quirky-BVA(S, X, Y).
Whether or not Hayashishita’s analysis is correct, its supporting data clearly adds yet a third
MR to the list of those which display variation in judgement depending on the individual in
question, in particular when MR(S, X, Y) is paired with a sentence where X does not c-command (or
precede) Y. Further, it offers yet more evidence that factors outside of syntactic structure, such as
the context in which a sentence is presented, affect such readings. Finally, taken alongside works
discussed previously, not only Ueyama 1998 but also Barker 2012, Baltin et al. 2015, and Dechaine
and Wiltschko 2017, it is yet more evidence that the choice of X and Y of MR(S, X, Y) has impact
on whether such c-command-less readings are accepted or not, albeit on an individual-specific level.
2.7.3 Hoji’s Correlational
69
Approach
The proper analysis of such “exceptional” MR’s, and especially of the “exceptional BVA” of interest here, is
still an unsettled matter. I will offer some thoughts on the matter in Section 2.8, but reach no firm
conclusions. The lack of a truly complete analysis threatens to limit the utility of MR’s like BVA as
tools for investigating the properties of the CS. If BVA(S, X, Y) strictly required X to c-command Y
in S, or even required either c-command or precedence, then it would be quite useful for that
purpose, as BVA would in effect be a sort of “c-command detector”. As we have seen, however,
there are persistent issues with c-command-based generalizations, even when precedence is
accounted for. Such issues complicate the effort to make such a “detector”, and indeed, potentially
cast doubt on whether c-command is really relevant for BVA or even exists at all.
69
Audrey Li (p.c. Aug 2021) suggests that Hoji’s approach is better understood as “implicational” rather than
“correlational”, given that it is exclusively formulated in terms of one-way entailments, whereas “correlation” may give the
impression of a sort of two-way relationship. On the other hand, Hoji (p.c. Sept 2021) states that “implicational” may give
a sense of tendencies, rather than a definite entailment. Given that I am summarizing Hoji’s work, I will follow his
terminology, but it is important that readers not be confused by terminology either way; in the piece of argumentation I
am reviewing here, Hoji attempts to establish conditions that definitely entail the impossibility of BVA for an individual,
but Hoji does not take the impossibility of BVA for an individual to predict anything. (This is unlike BVA being accepted, which
does make predictions in the “contrapositive” sense given in (2).)
Using Ueyama (1998)’s term and calling all these cases of “exceptional MR” “quirky effects”
as an umbrella term, we can state the problem in terms of wanting to have complete understanding
of quirky effects. Such an understanding, at least in principle, would give us a means to control
them. If such control could be achieved, and then used to reveal patterns consistent with the
original, c-command-based theories, then this would finally overcome the issues that have plagued
such accounts. Indeed, though they did not use this terminology, achieving such an understanding
and such control has been one of the major goals of many of the works described thus far in this
chapter.
At this point in time, however, while we have made significant progress in identifying the
problems a theory of quirky effects must overcome, a complete theory is still lacking. Despite
this outstanding issue, Hoji (2017, 2019, 2022a) provides us with a methodology to check the
validity of Reinhart-style c-command-based generalizations while explicitly accounting for quirky-
based variation in individuals. The main focus of Hoji’s proposal is finding a way to “deal with”
quirky effects, which are essentially treated under his methodology as stochastic noise, obscuring the
orderly “signal” coming from the effects of things like c-command
70
. Under such an approach, Hoji
argues, it is not strictly necessary to understand why or even when quirky effects come about; rather,
all one needs is a robust way of determining whether or not a quirky effect can occur for a specific
individual and for particular choices of S, X, and Y. If we determine such an effect can occur for
that individual with those choices of S, X, and Y, then that individual’s judgements on an MR with
that particular S, X, and Y do not necessarily constitute evidence relevant to the testing of
c-command-based hypotheses. Thus, seemingly “problematic” cases can potentially be dismissed as
70
As we will be seeing in later chapters, this basic approach can also be turned on its head, so that said quirky effects can
be studied via controlling for the effects of c-command. That this was possible was realized, as I understand it,
independently by both Hoji and myself at around the same time.
being the result of quirky effects, whatever those effects are. If, on the other hand, we determine
that quirky effects cannot occur for that individual, S, X, and Y, then such a dismissal is impossible,
and the results of their MR judgements have direct bearing on the correctness of our c-command
hypotheses (assuming no interference from something like precedence).
To put things in more concrete terms, Hoji, following Ueyama (1998), assumes that there are
multiple possible sources for MR’s like BVA. In the case of BVA(S, X, Y) in particular, one of these
sources is Ueyama’s FD(S
71
, X, Y), which requires X to c-command Y in S. For Hoji, there are two
other (types of) sources of BVA(S, X, Y); again following Ueyama (1998), these are X preceding Y in
S, and some sort of quirky effect. These are said to exhaust the possible sources of BVA, deriving
precisely the ABC-BVA law stated in Sub-Section 1.4, repeated here as (84):
(84) *A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
Though Hoji does not use this terminology, as I stated when introducing the law, A(S, X, Y)
can be understood as X preceding Y in S, B(S, X, Y) as a quirky effect on S, X, and/or Y, and C(S,
X, Y) as X c-commanding Y in S. Indeed, Hoji’s exposition is significantly more conceptual and
general than what I am presenting, but this presentation makes clear what must be done. Assuming
these three factors, or perhaps sets of factors in the case of quirky/B, can all independently enable
a version of BVA, in order to see c-command’s effect on BVA clearly, we must somehow eliminate
the influence of the other two factors. Thus, if we know we have ensured *A(S, X, Y) and *B(S, X,
Y), then we will derive the prediction:
71
I am altering her notation slightly to make things compatible with the way I have presented things here. Ueyama (1998)
would just have FD(X, Y).
(85) a. *C(S, X, Y) → *BVA(S, X, Y)
b. okBVA(S, X, Y) → okC(S, X, Y)
(85) is in fact Reinhart’s original generalization for BVA. Hoji’s approach, itself highly
indebted to Ueyama (1998)’s approach, thus derives the c-command theory of BVA as a special case
of this more general theory. The question, naturally, is how to ensure that the conditions *A(S, X, Y)
and *B(S, X, Y) are met. The former is conceptually straightforward; we just need to pick sentences
where X does not precede Y, as we have been doing at points throughout this chapter. Ensuring
*B(S, X, Y) however, namely finding sentences that lack “quirky effects” for the individual in
question, is much trickier, especially since we do not know precisely what enables such effects. This
is where Hoji’s main contribution lies.
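The logic of (84) and (85) can be made concrete with a minimal sketch. The function and parameter names are illustrative assumptions, not notation from Hoji or Ueyama; each boolean simply records whether the factor in question is present for a given individual and S-X-Y triplet. Note that the law is a one-way entailment: it says nothing about whether BVA will be accepted when some factor is present.

```python
def bva_predicted_impossible(a: bool, b: bool, c: bool) -> bool:
    """ABC-BVA law (84): if A, B, and C are all absent, BVA is predicted to be rejected.

    a: X precedes Y in S
    b: a quirky effect is available for this individual with S, X, and Y
    c: X c-commands Y in S
    """
    return not a and not b and not c


def consistent_with_law(a: bool, b: bool, c: bool, bva_accepted: bool) -> bool:
    """A judgement contradicts the law only if BVA is accepted with A, B, and C all absent."""
    return not (bva_predicted_impossible(a, b, c) and bva_accepted)


# With A and B controlled for (both absent), the law reduces to (85):
# absence of c-command entails absence of BVA.
for c in (False, True):
    verdict = bva_predicted_impossible(False, False, c)
    print(f"c-command={c}: BVA predicted impossible = {verdict}")
```

Holding the first two arguments at `False` is the code analogue of ensuring *A(S, X, Y) and *B(S, X, Y); the remaining difficulty, as the text notes, is establishing empirically that B really is absent.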
2.7.4 Testing for Quirky Effects
In essence, Hoji proposes empirically motivated diagnostics for the presence or absence of B(S, X,
Y). This approach is reminiscent of Barker (2012)’s “operational test for scope”, though Hoji is
explicitly assuming the possibility of variation. Nevertheless, Hoji’s first test is essentially the same as
Barker’s, namely checking whether X can engage in some sort of QR/SR by using comparison with
DR
72
. Though Hoji does not, we can formalize this as below:
(86) Hoji’s DR test
73
:
If X does not c-command or precede Y in S, acceptance of BVA(S, X, Y) can be attributed
to a quirky effect if the individual in question also accepts DR(S’, X, Y’) (where S’ is minimally
different from S such that an appropriate Y’ can be used).
72
Hoji assumes for independent reasons that SR is not syntactic QR, but something extra-syntactic along the lines of what
Hayashishita (2003/2013) proposes. Nothing about Hoji’s diagnostic approach, however, precludes the possibility of QR,
and if one wants to take QR to be a subset (proper or improper) of quirky effects, that should not contradict any of the
analysis given here.
73
Note that, at this point, this is merely a diagnostic test, and is not making any predictions about the relationship between
DR and BVA. Such predictions will be made, but we need to “build up” to that point first.
To make this less abstract, let us consider my own judgements at the time of writing.
Contrary to what is expected under purely c-command-based considerations, I accept BVA(S, every
teacher, his) in weak crossover configurations like (87):
(87) His student praised every teacher.
‘Every teacher’ does not precede ‘his’, so precedence is eliminated as a potential source for
BVA here. Two options thus remain: (I) either one of my hypotheses is wrong, e.g., I have
misidentified the structure of SVO sentences, have the wrong definition of c-command, c-command
is in fact not relevant for BVA, etc., or (II) a quirky effect has taken place. To check for (II), we can
change our sentence slightly to see if inverse scope DR is possible; this will require replacing ‘his
student’ with something like ‘two students’, and adjusting accordingly
74
:
(88) Two students praised every teacher.
I do, in fact, accept DR(S, every teacher, two students) in (88); for me, inverse scope is
possible with ‘every teacher’. As a result, a quirky effect is diagnosed, and my acceptance of weak
crossover BVA in (87) can be attributed to this effect
75
.
74
Another option would be to replace just ‘his’ with a Y’, yielding something like ‘two schools’ students praised every
teacher’. While this might be more parallel to the original, Hoji does not pursue this option, and it seems that the version
of replacement done in (88) is “close enough” for quirky detection. It seems an important task for future work
to determine what makes two sentence-types “close enough” in this regard, though I will simply follow Hoji’s procedure
for this dissertation.
75
To be clear, while my judgement here is consistent with Hoji’s account, it does not serve as strong evidence for that
account. As I will discuss, what we would further want to show is that individuals who do not accept inverse scope DR in
sentences like (88) (and also have the intended judgement pattern with Coref to be described below) do not accept BVA
So far, this is, on a practical if not conceptual level, equivalent to Barker’s test for scope.
However, Hoji has a second test which Barker lacks. While the DR test checks for a potential quirky
effect centering on S or X, note that the original Y of the BVA sentence is absent in the DR
sentence, e.g., there is ‘his’ in (87) but not in (88). As a result, if there is something “quirky” about
the Y in question, such as the sorts of effects discussed by Baltin et al. (2015) and Dechaine and
Wiltschko (2017), the DR test will miss it. As such, Hoji employs a second test using Coref, which
we can formalize as follows:
(89) Hoji’s Coref test:
If X does not c-command Y in S, acceptance of BVA(S, X, Y) can be attributed to a quirky
effect if the individual in question also accepts Coref(S’’, X’, Y) (where S’’ is minimally
different from S such that an appropriate X’ can be used).
To instantiate this, let us imagine that I did not accept DR(S, every teacher, two students) in
(88) while still accepting (87). In that case, Hoji’s DR-based diagnostic in (86) would not have
detected a quirky effect. Such an effect could still be diagnosed, however, if I accept Coref(S, John,
his) in (90):
(90) His student praised John.
I in fact do accept such a Coref reading, so I am once again diagnosed as having a potential
quirky effect, again providing a potential explanation as to why I show insensitivity to c-command
with BVA in (87). If, on the other hand, I did not accept this reading, then both of the diagnostics
would have failed to detect a quirky effect. The combination of these two diagnostics, Hoji argues, captures all
in sentences like (87). In this chapter, I will merely stipulate this is true, but we will see in the subsequent chapters that
repeated attempts to test this prediction have all yielded positive results; the correlation does indeed hold.
quirky effects, so in that case, we would have to conclude that I did not experience a quirky effect,
yet still somehow managed to accept weak-crossover BVA in (87). Hoji’s contention, in effect, is
that no such individual exists; all individuals who accept BVA in sentences like (87) accept DR in
sentences like (88) and/or Coref in sentences like (90)
76
.
It should be noted, though, that the term “like” is doing a bit of work here. Hoji is focused
on general sentence patterns, “schemata”, rather than specific sentences instantiating them. As I will
be discussing further when reviewing Hoji’s relevant experimental work in Sub-Section 3.2.2,
following Hoji (2015), any given sentence might get rejected by an individual for reasons unrelated
to the factors we are concerned with. To take just a trivial example, perhaps the intended DR
reading for (88) is in principle acceptable, but I find it hard to imagine a situation where each teacher
gets praised by two students, so I believe the reading to be unacceptable. This is easily remedied if
we switch the words in the sentence but keep the structure, perhaps talking about something more
common like each teacher being asked questions by two students, or being praised by two
evaluators, etc. There is thus an implementational question of how many instances of each schema
we have to check in order to be sure that there is no such “sentence-incidental” factor distorting
acceptability judgements. As will be shown, however, in practice, the required number of instances
to check for accurate results appears to be fairly low.
Further, even if we are sure that an individual is consistently rejecting a given MR with a
certain sentence pattern, there is an additional empirical challenge in determining whether rejection
76
Note that Hoji is not claiming that all individuals who accept DR/Coref in sentences like (88) and (90) must necessarily
accept BVA in sentences like (87). Indeed, such a pattern of judgements is not particularly uncommon; as I will emphasize
repeatedly going forwards, it seems that the conditions that allow “quirky-based” BVA are a subset of the conditions that
allow for quirky-based DR and Coref; indeed, this is probably why the correlation “works” in the first place. As such, it is
entirely possible that the right factors are present to enable quirky-DR/Coref, but not quirky BVA. The reverse is not true,
however; no one accepts quirky-BVA under a given condition but not either quirky-DR or quirky-Coref, precisely because
of the set-subset relationship described above, which means that if the conditions for quirky-BVA are met, the conditions
for at least one of quirky-DR and quirky-Coref must also be met.
of the MR in question is actually due to the sentence pattern in question. For example, if the X of
BVA(S, X, Y) is particularly complicated, e.g., Hayashishita’s ‘more than two reviewers’, it may be
hard for an individual to even properly conceptualize what a DR(S’, X, Y’) reading should be. As
such, the individual’s consistent rejection of such DR readings may have nothing to do with whether
X induces a quirky effect, and tells us little about whether BVA(S, X, Y) will be accepted. In this
case, Hoji’s solution will rely on checking other sentences, in this case, those with a different
structure, but again, I consider this to be an “implementational” question, and will delay discussion of it
until Sub-Section 3.2.2.
2.7.5 Previous Generalizations as Special Cases
While I have left such implementational details vague for now, what should be clear from
the above discussion is that, while diagnosing the presence of a quirky effect under Hoji’s system is
straightforward, diagnosing the absence of such an effect in a rigorous manner is rather involved.
Let us imagine though that we have performed whatever checks are necessary and are confident that
an individual’s judgements are “significant” in the way Hoji intends. In that case, a failure to detect a
quirky effect for a given S with a given choice of X and Y via both the DR test and the Coref test is
taken to mean that there are no quirky effects operative on that S-X-Y triplet for the individual in
question. Translating that into my “ABC” notation, we can say:
(91) *DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *B(S, X, Y)
That is, we can consider factor(s) B “controlled for” if neither the DR nor the Coref test
detects anything “quirky”. Plugging this condition back into the general form of the “ABC-BVA” law,
we derive the following:
(92) *A(S, X, Y) ∧ *DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
In (92), we have simply substituted the conditions for the diagnosis of *B in (91) for *B in
the law given in (84). From here, we can derive Hoji’s most crucial prediction. To do so, we must be
in a situation where X neither precedes Y in S nor (sans QR
77
) c-commands it in S, thus giving us *A(S,
X, Y) and *C(S, X, Y). Such a situation is not hard to find; we have in fact seen several already, e.g.,
a weak crossover configuration like (87). In such a case, (92) reduces to (93):
(93) a. *DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *BVA(S, X, Y)
b. okBVA(S, X, Y) → okDR(S’, X, Y’) ∨ okCoref(S’’, X’, Y)
This is an instance of what Hoji (2022a) calls a correlational prediction. That is, it does not
tell us whether a given sentence (pattern) will be acceptable or not with a given MR (even with given
choices of X, Y, etc.). Rather, it tells us, for each individual, how judgements on three different MR’s
will pattern with respect to one another. In particular, while there is a wide range of variation
possible, a specific pattern will never obtain under the relevant conditions:
(94) Possible and Impossible Patterns of Judgements
Prediction    BVA(S, X, Y)    DR(S’, X, Y’)    Coref(S’’, X’, Y)
Possible      *               *                *
Possible      *               *                ok
Possible      *               ok               *
Possible      *               ok               ok
Impossible    ok              *                *
Possible      ok              *                ok
Possible      ok              ok               *
Possible      ok              ok               ok
77
I will stop repeating this caveat at this point, but it can be assumed to apply in all cases going forward.
This may seem like a somewhat minor prediction, but as we will be seeing in subsequent
chapters, experimentally obtained results conform strikingly well to this pattern. While all seven
“possible” patterns obtain with frequency, the one lone impossible pattern simply never occurs (so
long as the conditions laid out above have been met). It is of course possible, perhaps even
probable, that a relevant counterexample will be found at some point, but for now, Hoji’s predicted
correlation remains exceptionless.
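The table in (94) can be reproduced mechanically by enumerating every combination of judgements and checking each against (93)b. The following is a minimal sketch; the helper names are hypothetical, and `True` stands for acceptance (“ok”) while `False` stands for rejection (“*”).

```python
from itertools import product


def pattern_is_possible(bva: bool, dr: bool, coref: bool) -> bool:
    """(93)b: accepting BVA entails accepting DR or Coref (given *A and *C)."""
    return (not bva) or dr or coref


def mark(accepted: bool) -> str:
    """Render a judgement in the notation of (94)."""
    return "ok" if accepted else "*"


# Enumerate all eight combinations, in the same row order as the table in (94).
patterns = list(product((False, True), repeat=3))
for bva, dr, coref in patterns:
    label = "Possible" if pattern_is_possible(bva, dr, coref) else "Impossible"
    print(f"{label:<11}{mark(bva):<4}{mark(dr):<4}{mark(coref)}")

# Collect the combinations that the correlation excludes.
impossible = [p for p in patterns if not pattern_is_possible(*p)]
```

Only one of the eight combinations ends up in `impossible`, namely BVA accepted with both DR and Coref rejected, which is what makes the prediction testable despite the wide variation it otherwise permits.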
Just as we derive Reinhart’s generalization in the case of *A(S, X, Y) and *B(S, X, Y), and
derive “Hoji’s Correlation” in (93) in the case of *A(S, X, Y) and *C(S, X, Y), we could also derive a
precedence-based generalization akin to Chomsky (1976)’s “Leftness Condition” described earlier in
this section by looking at cases where we have *B(S, X, Y) and *C(S, X, Y). It is thus apparent that
various past theories of BVA are simply special cases of the general ABC-BVA law. We can
summarize this as in (95):
(95) In the case that any two conditions of (a) are met, we derive one of (b)-(d):
a. The ABC-BVA Law:
*A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
b. Chomsky’s Leftness Condition:
BVA(S, X, Y) is possible only if X precedes Y in S.
c. Hoji’s Correlation:
*DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *BVA(S, X, Y)
78
d. Reinhart’s Generalization:
BVA(S, X, Y) is possible only if X c-commands Y in S.
78
I will again mention that this is equivalent to okBVA(S, X, Y) → okDR(S’, X, Y’) ∨ okCoref(S’’, X’, Y), which emphasizes
that b-d are the three observable conditions under which BVA is acceptable. Note the focus on observable; I am not
providing an interpretation for (95)c in particular. I will speculate a bit more in the following section, but I will reach no
firm conclusions on what the correct analysis of this fact might be. Possible explanations range from something entirely
“discursive”, e.g., the tests are checking whether the individual is making use of something outside of syntax that enables
special interpretations to occur, to something entirely “structural”, e.g., the tests are checking whether the individual is
considering a specific way of interpreting traces/raising an element to specific position, to something in between or
otherwise, e.g., diagnosing lexical properties of X and Y, how “accommodating” the individual is, etc. What is important
is that, whatever is going on, the correlation with DR and Coref allows us to track its behavior across different I-languages.
That is, if *B(S, X, Y) and *C(S, X, Y) are guaranteed, then BVA(S, X, Y) is strictly
constrained by whether or not we have A(S, X, Y), that is, whether X precedes Y or not, as in
(95)b. Similarly, if we guarantee *A(S, X, Y) and *C(S, X, Y), then BVA(S, X, Y) is strictly
constrained by the status of B(S, X, Y), which can be determined by consulting both the DR and
Coref tests, the gist of which is spelled out in (95)c
79
. Finally, if *A(S, X, Y) and *B(S, X, Y) are
guaranteed, then it is C(S, X, Y), that is, whether X c-commands Y or not, that constrains BVA(S, X, Y), as in (95)d.
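The way the three special cases in (95) fall out of the general law can be checked by brute force. In the sketch below, which uses hypothetical names throughout, two factors are held absent and the verdict of the law is compared with the value of the remaining factor. Since the law is a one-way entailment, “allows” here means only “does not exclude”.

```python
FACTORS = ("A", "B", "C")  # A: precedence, B: quirky effect, C: c-command

SPECIAL_CASES = {
    "A": "Leftness Condition (95)b",
    "B": "Hoji's Correlation (95)c",
    "C": "Reinhart's Generalization (95)d",
}


def law_allows_bva(present: dict) -> bool:
    """ABC-BVA law: BVA acceptance is excluded only when A, B, and C are all absent."""
    return any(present[f] for f in FACTORS)


def special_case(controlled: set) -> str:
    """Hold two factors absent and name the generalization governing the third."""
    (remaining,) = [f for f in FACTORS if f not in controlled]
    for value in (False, True):
        present = {f: False for f in controlled}
        present[remaining] = value
        # With the other two factors absent, the law's verdict tracks the remaining factor.
        assert law_allows_bva(present) == value
    return SPECIAL_CASES[remaining]


for controlled in ({"B", "C"}, {"A", "C"}, {"A", "B"}):
    print(sorted(controlled), "->", special_case(controlled))
```

In each case the internal check succeeds: once two factors are guaranteed absent, whether the law excludes BVA depends solely on the third.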
As noted, Hoji’s relevant work does not formalize things in the same way I have here, and
thus the interconnectedness of these various predictions is not so directly elucidated. Nevertheless,
this elegant result is inherent in the theory he advances. Further, predictions like those deriving from
Barker (2012)’s “operational test for scope” can be obtained as a special case of (95)c as well. As
such, ABC-BVA captures a wide range of observed behaviors of BVA, while still retaining many of
the strong predictions made by simpler theories. Rather than being overturned, these previous predictions
are merely relativized to certain environments where only one of the three basic types of factors
that can enable BVA is present. The question of course is whether these relativized predictions are
in fact accurate, and this will be the main subject of the remainder of this dissertation.
79
To reiterate what I have stated above, the formulation of “Hoji’s Correlation” simplifies things somewhat.
Specifically, as I will discuss in full detail in the next chapter, we need to make sure that conditions are met such that
rejection of the relevant sentences with Coref and DR readings is “significant” for our purposes; that is, we need to ensure
that rejection is not due to some outside reason. To take just one example, we would need to be sure that the X’-Y pairing
used is not incompatible with a Coref(S’’, X’, Y) reading in general (we would not want to predict that rejecting Coref(S’’,
that man, she) has any bearing on whether someone will reject BVA(S, every teacher, she), as the issue is simple gender
mismatch in the Coref case). As I see it, however, such issues are practical matters rather than theoretical ones, so the
heart of the correlation is indeed stated via *DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *BVA(S, X, Y).
2.8 Speculative Theories of Quirky Effects
2.8.1 Introduction
Before moving on to empirical evaluations of Hoji’s theory of BVA, which will be a major
concern in Chapter 3, in this section I will review speculative analyses of what might underlie “quirky
effects”, as well as provide my own. As I noted in the previous section, while Hoji’s correlational
approach can successfully detect quirky effects (in the sense of being able to control for them), it
leaves open the question of what such effects really are. We do not strictly need such an
understanding for accomplishing the goals laid out in Section 1.3; however, I think it is appropriate
to sketch an analysis, both because it may help provide a meaningful “interpretation” of what I have
labeled “Hoji’s Correlation” and because it may provide a jumping-off point for future inquiries into the
nature of quirky effects. Those not interested in such speculation, however, should feel free to skip
this section; I will not rely on the analysis here elsewhere, so understanding it is not necessary.
In essence, the question is how it is possible for an individual to accept BVA(S, X, Y) when
X does not precede or c-command Y, as in a weak crossover configuration; more broadly, how does
an individual accept any MR(S, X, Y) under such circumstances? Let us start to answer this by
reviewing the model laid out in Ueyama 1998. The MR’s under consideration include Coref and
BVA, as well as E-type links (e.g., donkey anaphora); DR is not considered. Before addressing quirky
effects, Ueyama initially provides for three underlying mechanisms that can give rise to some or all
of these readings
80
:
80
As before, I am modifying the notation slightly, but the concepts are the same.
(96) Sources of MR(S, X, Y)
Source: MR’s that can arise therefrom:
a. FD(S, X, Y) BVA/Coref
b. Co-I(S
81
, X, Y) BVA/Coref/E-type
c. Co-D(S, X, Y) Coref
We have encountered all these sources already in this chapter (albeit the last one only in
footnote 66). FD(S, X, Y) is the c-command-based relation that enables BVA(S, X, Y) when X c-
commands Y, and has other conditions such as anti-locality, Y being a suitable bindee, etc. Co-I is
the precedence-based counterpart
82
of FD, requiring X to precede Y in S. Finally, Co-D is specific to
Coref and essentially requires that both X and Y receive the same “D index”; intuitively, we can
understand that the interpreter assigns X to refer to, say, John, and also assigns Y to refer to John,
and thus the two happen to corefer.
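The inventory in (96) amounts to a small lookup structure from sources to the MR’s they can give rise to. The sketch below encodes it as such; the dictionary layout and the function name are illustrative assumptions rather than anything in Ueyama (1998), and the stated requirements are abbreviated from the description above.

```python
# Sources of MR(S, X, Y) as listed in (96), with the main requirement of each.
SOURCES = {
    "FD": {"requires": "X c-commands Y in S", "yields": {"BVA", "Coref"}},
    "Co-I": {"requires": "X precedes Y", "yields": {"BVA", "Coref", "E-type"}},
    "Co-D": {"requires": "X and Y receive the same D index", "yields": {"Coref"}},
}


def sources_for(mr: str) -> list:
    """Return, in alphabetical order, the sources in (96) that can give rise to an MR."""
    return sorted(name for name, info in SOURCES.items() if mr in info["yields"])


for mr in ("BVA", "Coref", "E-type"):
    print(mr, "<-", sources_for(mr))
```

Queried for BVA, the structure returns only FD and Co-I, so without some further “quirky” source, BVA is predicted to be unavailable when X neither c-commands nor precedes Y, while Coref remains available via Co-D.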
2.8.2 Hints from Co-D-Indexation
While Co-D considerations are not directly relevant to BVA, I suspect that understanding
Co-D better will help us to ultimately understand quirky effects as they relate to BVA. Indeed, in
some sense, Co-D might be seen as a type of “quirky” (though Ueyama does not call it such), in that
it does not have conditions that allow/disallow it that can be clearly identified based on the surface
form of a sentence. As I discussed in footnote 66, Ueyama does provide one Japanese-specific way
of controlling it, namely her argument that Y’s prefixed by the demonstrative so- cannot be D-
indexed. For a language like English, however, which lacks this distinction, no such fact can be
81
Technically, Ueyama allows Co-I indexations across sentences, so “S” here should not be taken literally.
82
I use this term somewhat loosely, as for Ueyama, FD is established within the CS and Co-I-indexation is not, so the two
are not on equal footing in her theory, even though they ultimately achieve parallel interpretations.
utilized. As we have seen in this chapter and will be seeing more in other chapters, English
nevertheless behaves as if Co-D(S, X, Y) is not entirely free. Consider the following sentences:
(97) a. His teacher praised John.
b. His teacher praised every student.
If we consider Coref((97)a, John, his) and BVA((97)b, every student, his), then we find that
some people accept neither interpretation, some accept both, and some accept one but not the
other. Further, Coref with (97)a seems much easier to accept than BVA with (97)b; this is
underscored by the previously noted fact that, in the literature, Coref with sentences like (97)a is
typically assumed to be possible, whereas BVA with (97)b is typically assumed to be impossible. The
asymmetry between the two would be easily explainable via the list of sources for different MR’s
given in (96): Coref can be achieved via Co-D, which is possible even in “weak crossover”
constructions, whereas BVA cannot be achieved that way.
If that is the case, however, then why do individuals reject Coref in (97)a? As we will be
seeing, many, perhaps most, experimental participants reject it, even though it seems as though most
linguists writing on the subject (myself included) can accept it. There are several (non-mutually
exclusive) possibilities that might explain this: (i) Co-D has certain interpretative
restrictions/requirements that complicate its use to achieve Coref, e.g., maybe the sentence has to be
interpreted in a certain way in order for Co-D to apply, (ii) Co-D is somehow “costly” (more so than
doing Coref under c-command, which most everyone seems to be able to do), such that people are
unwilling to do it or have trouble achieving it, and/or (iii) Co-D is not usually done in such
situations, so (at least non-expert) individuals are not used to it and thus do not think to do it
83
.
83
It is easy to imagine that the requirements of (i) might underlie the increased costs of (ii) and/or the infrequent usage of
(iii), though we do not need to commit ourselves to such a view at this point.
While I suspect all three are true, I will use my own judgements to provide support for (i) at
this point. Given that Co-D involves “identifying” X and Y as the same individual, we might
suppose that it imposes a requirement that the X in question must be “identifiable”. Consider the
following:
(98) a. His student spoke to John.
b. His student spoke to that professor.
c. His student spoke to a professor.
d. His student spoke to an unknown professor.
e. His student spoke to a professor, but I don’t know who.
The relevant Coref interpretations, Coref(S, John/that professor/a professor/an unknown
professor, his), are not equally available to me. The difficulty of getting them increases as one goes down
the list, from (98)a allowing it almost effortlessly to (98)e almost completely blocking it for me as of
the time of this writing. Such a difficulty gradient makes sense if one thinks in terms of identification
of X; in (98)a, we have a clear individual ‘John’, in (98)b, we have a relatively clear designated
person, ‘that professor’, though we still have to “decide” who that is, whereas in all others, we have
an indefinite/non-specific ‘a professor’. In (98)c, we can still imagine that it is a particular known
professor the speaker is just not bothering to label, but in (98)d and (98)e, it becomes clear that no
individual is known. In (98)d, I can at least create a sort of label, “the unknown professor”, but in
(98)e, even that seems dubious to do, as what is stated is effectively that the correct index to assign
the professor is not known. As such, Co-D may have at least one restriction, that X must be
identifiable. Given that many people do not even accept Coref(S, John, his) in (98)a and analogous
sentences
84
, where identification of X is easy, we can suspect that there may be even more
restrictions imposed, or that there is some influence of possibilities (ii) and (iii) that I discussed.
84
And, as we will be seeing, in other types of sentences where X neither precedes nor c-commands Y.
Regardless of the exact analysis, it is evident that Co-D-indexation is clearly not simply “free” in all
cases.
2.8.3 Ueyama’s Conditions on Quirky Effects
Given that we know BVA-centered quirky effects have a similar “evasive” quality to Co-D in
the sense that such quirky effects do not always happen but can happen for certain individuals, we
might wonder if some of the hypothetical factors (i)-(iii) might also be relevant to quirky
phenomena. Ueyama certainly holds that there is something in the vein of (i), namely interpretative
conditions on quirky. I mentioned these briefly in the preceding section, but let me discuss them in
greater depth here.
The account given in (96) cannot be considered complete, as Ueyama notes that there are
times when weak crossover cases are accepted with Coref/BVA readings (and this despite the Y of
Coref(S, X, Y) being an NP prefixed with so-, meaning that Co-D would not be possible). As stated
in the previous section, Ueyama calls such cases “quirky” binding (at least in the case of BVA).
Ueyama (2022) reviews several conditions that she holds are necessary for the establishment of such
quirky binding:
(99) Ueyama (2022)’s conditions on Quirky (S, X, Y)
a. X must “refer” to a specific set of individuals.
b. S must not be one of certain constructions.
c. No other quirky binding in the same clause.
d. X must be in a position which is salient enough to be the “topic” of a sentence.
e. Y must be non-individual denoting.
Let me start with (99)e first. While the presentation in Ueyama (1998/2022) suggests that all
five of the conditions in (99) are required for quirky-BVA, she holds that quirky-Coref requires only
this last condition. Essentially, Y of Coref(S, X, Y) does not enter into any kind of direct relationship
with X (somewhat like Co-D), but rather acquires an interpretation where it comes to mean
something analogous to an English word prefixed with an article like ‘a’ or ‘the’. As such, (100)
comes to be interpreted analogously to (101)[85].
(100) So-ko-no bengosi-ga Toyota-o uttaeta.
that-place-GEN[86] attorney-NOM Toyota-ACC sued
“Its attorney sued Toyota”
(101) An/The (retained) attorney sued Toyota.
In essence, the relationship between the attorney and Toyota is left vague, and is presumably
filled in by the pragmatics/world knowledge of the individual interpreting the sentence, which might
allow for ‘an attorney’ to be understood as ‘an attorney of Toyota’s’. Furthering this sense, Ueyama
goes on to point out that such expressions can be used in generic expressions (where the antecedent
does not even appear in the sentence), such as:
(102) So-ko-no bengosi toyuu mono-wa sin’yoosi-nai hoo-ga ii.
that-place-GEN attorney-NOM that thing-TOP trust-not way-NOM good.
“It is better not to trust a (retained) attorney (in general)”
The more literal translation of this sentence would be ‘it is better not to trust its retained
attorney’, or perhaps more idiomatically, ‘it is better not to trust one’s retained attorney’. Regardless,
the intended point is that the soko in sokono bengosi ‘its lawyer’ is not receiving its interpretation
“directly” from the antecedent, but rather by participating in a generic statement, which we can then
apply to various entities to make it look “as if” there were a direct relationship between the entity
and sokono bengosi. As such, the condition on quirky-Coref is not so much one of structure as one of
interpretation, namely that Y of Coref(S, X, Y) must get interpreted in this “non-individual-denoting” way.
[85] As I have mentioned previously, these sentences and their discussion can be found in Ueyama 1998’s “Appendix D”.
[86] Abbreviations for relevant case/etc. “markers” in Japanese: NOM-nominative, ACC-accusative, DAT-dative, GEN-genitive, and TOP-topic.
When we turn to BVA, there are even more conditions, namely (99)a-d. Taking these in turn,
(99)a is something similar to what we saw with Hayashishita (2004/2013)’s analysis of inverse-scope
DR; as quantifiers get less clear in their referents (Ueyama contrasts ‘every’ on the one hand with
‘55%’ on the other), then using them as X of quirky-BVA(S, X, Y) becomes harder. Indeed, as I
noted in footnote 68, Hayashishita also formulates his constraints in terms of specificity, so his DR
analysis can in some sense be seen as an extension of Ueyama’s BVA analysis. As to (99)b, Ueyama
notes various constructions which do not seem to allow quirky readings. These are a bit intricate to
explain, so I will refer interested readers to Ueyama 1998 for specific discussion. Importantly,
however, Ueyama (p.c. September 2021) confirms that the basic intuition seems to be that such
constructions involve elements already occupying the “topic” position of the relevant clause, with
the implication being that X of quirky-BVA(S, X, Y) needs to occupy that position in order to
achieve the desired reading.
This implication is interesting, given the general understanding in the literature that topics
must be specific, or at least generic (see discussion in Reinhart 1981, for example); though
terminology and exact understanding change from author to author, the intuition is that topics need
to refer to some identifiable individual or set thereof, which is exactly what Ueyama’s
restriction on quirky binding is intended to convey. (99)d goes on to reference topichood
specifically, and Ueyama explores a number of conditions where, for example, changing the verb in
a sentence grants a given element greater or lesser prominence, making it easier or harder to
understand a given element as the sentence topic, which correspondingly makes quirky BVA easier
or harder to accept, though as she notes, reactions vary greatly between different speakers. The final
condition to discuss, (99)c, can also be understood in terms of topics; if there is only one topic per
clause, then obviously, two elements cannot both be that topic, and if being a “quirky binder”
requires being a topic, then there cannot be more than one quirky binder per sentence. As an
example, Ueyama provides sentences including the following:
(103) So-no hito tantoo-no so-ko-no syokuin-ga
that-GEN person in.charge-GEN that-place-GEN staff-NOM
subete-no giin-ni USC to UCLA-o suisen-saseta
every-GEN senator-DAT USC and UCLA-ACC recommend-made.
“That place’s staff who are in charge of him made every senator recommend USC
and UCLA.”
Intended: A USC staff member and a UCLA staff member who are in charge of that man
made all the senators recommend each of USC and UCLA
Ueyama’s analysis is that attempting to interpret the sentence in the way specified requires
both BVA(S, every senator, him) and BVA(S, USC and UCLA, that place), which would be two
quirky BVA’s in a clause, and is thus impossible. Again, Ueyama does not provide the topic-based
analysis directly, but given the other statements made, it is not hard to draw the line to subete-no
giin ‘every senator’ and USC to UCLA ‘USC and UCLA’ both needing to occupy the topic position,
which they cannot do simultaneously[87].
[87] For relevant contrasts and other examples of this kind, readers can refer to Section 2.3 of Ueyama 1998’s Appendix D.
As such, while Ueyama (1997, 1998, and 2022) makes a primarily “descriptive” presentation
of the facts surrounding “quirky binding”, directly following the logic she has basically already laid
out for us allows (99) to be condensed to the following:
(104) Conditions on Quirky-BVA(S, X, Y) (based on Ueyama’s)
a. X must serve as the “topic” of the relevant clause
b. Y must be non-individual denoting
Interestingly, at this point, we have that Quirky-Coref(S, X, Y) requires only condition
(104)b, whereas, following Hayashishita (2004/2013), it would seem that Quirky-DR(S, X, Y)
requires only condition (104)a. As we will be seeing, however, this state of affairs seems like
something of a simplification; in particular, Hoji’s analysis suggests that what Ueyama calls quirky
binding may, in fact, refer to multiple different phenomena. For now though, we can summarize (my
interpretation of) the Ueyama(-Hayashishita) model of the relevant MR(S, X, Y) as below. I include
“DD”, which is Hoji (2022b)’s equivalent of FD for DR, which lacks the same anti-locality
condition as FD:
(105) Ueyama(-Hayashishita)’s Sources for, and Conditions on, MR’s
Source | Condition | MR’s
FD(S, X, Y) | X c-commands Y; X and Y are non-local | Coref/BVA(S, X, Y)
DD(S, X, Y) | X c-commands Y | DR(S, X, Y)
Co-I(S, X, Y) | X precedes Y | Coref/BVA(/DR?)(S, X, Y)
Co-D(S, X, Y) | X and Y can bear D-indices | Coref(S, X, Y)
Quirky(S, X, Y) | (i) X is a topic; (ii) Y is non-individual denoting | (i) DR(S, X, Y); (ii) Coref(S, X, Y); (i)+(ii) BVA(S, X, Y)
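One way to make the logic of (105) concrete is to read the table as a lookup from properties of a configuration to the MR’s that each source licenses. The Python sketch below is only an illustration of that reading. The `Config` fields and the `licensed_mrs` function are hypothetical names introduced here, not part of Ueyama’s or Hayashishita’s own formulations, and the uncertain DR entry for Co-I (marked “?” in the table) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical description of one (S, X, Y) configuration for one speaker."""
    x_c_commands_y: bool
    x_precedes_y: bool
    x_y_local: bool
    d_indices_ok: bool        # X and Y can both bear D-indices
    x_is_topic: bool
    y_non_individual: bool    # Y is interpreted as non-individual denoting


def licensed_mrs(c):
    """Map each source in (105) to the set of MR's it licenses in configuration c."""
    out = {}
    if c.x_c_commands_y and not c.x_y_local:
        out["FD"] = {"Coref", "BVA"}
    if c.x_c_commands_y:
        out["DD"] = {"DR"}
    if c.x_precedes_y:
        out["Co-I"] = {"Coref", "BVA"}  # DR omitted: marked "?" in (105)
    if c.d_indices_ok:
        out["Co-D"] = {"Coref"}
    quirky = set()
    if c.x_is_topic:
        quirky.add("DR")
    if c.y_non_individual:
        quirky.add("Coref")
    if c.x_is_topic and c.y_non_individual:
        quirky.add("BVA")
    if quirky:
        out["Quirky"] = quirky
    return out


# A weak-crossover-like case: no c-command, no precedence, no D-indices,
# but X is a topic and Y is non-individual denoting.
weak_crossover = Config(
    x_c_commands_y=False,
    x_precedes_y=False,
    x_y_local=False,
    d_indices_ok=False,
    x_is_topic=True,
    y_non_individual=True,
)
quirky_only = licensed_mrs(weak_crossover)
```

In this configuration only the Quirky row applies, which is the situation the remainder of the section is concerned with.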
2.8.4 Hoji (2022b)’s Non-Formal Sources
There is a crucial question left unaddressed by the Ueyama(-Hayashishita) approach: why
does Hoji’s method for detecting quirky effects require two tests, both DR and Coref? As we will
see in the following chapters, both tests are needed, and when we use both, the results are indeed
precisely as predicted. Of course, when the Ueyama approach was being developed, Hoji’s results
had not been found, so it is understandable that it does not address them. In fact, it leads us to an
almost opposite conclusion: we see from (105) that quirky-Coref requires Y to be non-individual
denoting, quirky-DR requires X to be a topic, and quirky-BVA requires both. In that case, we should
be able to check just one of either Coref or DR; if quirky-Coref is impossible, then Y cannot be
non-individual denoting, so quirky-BVA should be impossible, no reference to DR needed.
Likewise, if quirky-DR is impossible, then X cannot be a topic, so again quirky-BVA should be
impossible, this time with no reference to Coref needed.
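The reasoning in this paragraph can be checked mechanically. The sketch below assumes the conjunctive reading of (105) described above (quirky-DR needs a topic X, quirky-Coref needs a non-individual-denoting Y, quirky-BVA needs both) and enumerates all four combinations. The function names are hypothetical and introduced only for illustration.

```python
from itertools import product


def quirky_readings_conjunctive(x_is_topic, y_non_individual):
    """Quirky readings available under the conjunctive reading of (105)."""
    return {
        "DR": x_is_topic,
        "Coref": y_non_individual,
        "BVA": x_is_topic and y_non_individual,
    }


def one_test_suffices():
    """True if, in every combination, a failed Coref test or a failed DR test
    is by itself enough to rule out quirky-BVA."""
    for x_is_topic, y_non_individual in product([False, True], repeat=2):
        r = quirky_readings_conjunctive(x_is_topic, y_non_individual)
        if (not r["Coref"] or not r["DR"]) and r["BVA"]:
            return False
    return True
```

Under this assumption `one_test_suffices()` returns `True`, which is exactly the prediction that a single test should be enough.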
Given that this is not what we observe, the predictions need to be “loosened” a bit. As
discussed in the previous section, Hoji’s methodology essentially assumes that “quirky effects” could
be independently present on either X or Y, and that such a quirky effect could by itself facilitate
BVA. That means, however, that the supposition that quirky-BVA has two requirements that must
be satisfied simultaneously cannot be maintained. As such, while Hoji (2022b) accepts many of the
intuitions behind Ueyama and Hayashishita’s generalizations, his approach is different from the one
detailed in (105). There are two main ways in which this difference manifests: first, he assumes there
are (at least) two types of “quirky effects”, which derive from what he calls “non-formal sources”
(NFS’s), namely NFS1 and NFS2; and second, he assumes that NFS’s have generally the same requirements
across different MR’s (with the requirements of NFS-based DR differing slightly from those of NFS-based
BVA and Coref). As such, his model adds one more “source” for MR’s by splitting quirky
into two:
(106) Sources of MR(S, X, Y)
a. FR(S, X, Y)
b. Co-I(S, X, Y)
c. Co-D(S, X, Y)
d. NFS1(S, X, Y)
e. NFS2(S, X, Y)
Hoji holds that NFS1 requires that the sentence in question be
interpreted in a “categorical context”, a term that he adapts from Kuroda (1972). The complexities
of what both Hoji and Kuroda mean by this term are a little involved, but essentially it is a question
of whether or not the sentence is understood as making an evaluative judgement/comment about
which elements have what properties, such cases being “categorical”. For Hoji, the possibility of
categorical interpretation can be blocked by using phrases like a! (roughly, “look!”) before the
sentence and using verbal aspects that encode ongoing rather than completed events; the basic idea
is to make the sentence seem as much as possible like an off-the-cuff reaction to the exterior world rather
than a reflection-laden pronouncement by the speaker. Hoji (2022b) does not
provide direct examples illustrating the intended points, but he does provide templates, from which
I can construct analogous English examples. Certainly, I too can detect a contrast between, say:
(107) a. Long ago, his student would always praise every professor.
b. Look over there! His student is praising every professor!
(107)a is clearly “better” with the BVA(S, every professor, his) reading than is (107)b, though
how intense the degradation is for me changes fairly dramatically between different instances of
judging it. Hoji also notes that X and Y having matching N-heads, e.g., ‘every professor’ and ‘the
professor’, seems to facilitate NFS1[88], though it is unclear how that relates to whether or not the
context is categorical, and it does not seem to be a consistent requirement for the reading to obtain.
[88] I am basing this on an official draft of (the soon to be published) Hoji 2022b. As I understand it, the draft has since
been revised to make the head-matching NFS a separate NFS, NFS3, and this may be what ends up reflected in the
published version. I do not consider the head-matching condition much in this section, so whether we count it as a separate
NFS or not should have minimal impact on the analysis to be presented.
As for NFS2, Hoji holds that it is possible in both categorical and non-categorical contexts,
but that it becomes impossible in cases where X and Y of MR(S, X, Y) are “local” in S; for Hoji, this
means co-arguments of the same verb[89]. Thus, Hoji expects a potential contrast between sentences
like the following[90]:
(108) a. His student spoke to every teacher.
b. He spoke to every teacher.
As I discussed in Section 2.3, it was precisely this sort of contrast occurring (for some
individuals) that led to strong crossover being considered “strong” and weak crossover “weak”.
The fact that BVA(S, every teacher, his) is more widely available in (108)a than is BVA(S, every
teacher, he) in (108)b is commonly analyzed as an issue of anti-locality and/or, more specifically,
Condition C-type restrictions on pronouns c-commanding their antecedent noun phrases (see
discussion in Section 2.3). Hoji, however, considers anti-locality to be a property not of BVA in
general but of FD. Given that FD requires c-command, and neither of the relevant constructions
involves X of BVA(S, X, Y) c-commanding Y, Hoji does not consider anti-locality to be a possible
explanation, and indeed, he does not consider Condition C to be descriptively accurate.
Here, I am inclined to agree with Hoji, as I have been tacitly doing throughout this
chapter. Were the issue driving the contrast in (108) that pronouns cannot c-command their
antecedents, then we would not expect a contrast in the availability of BVA in the following
sentences:
(109) His mother and John lied to every man.
(110) He and John lied to every man.
[89] Technically Hoji states that NFS2 is impossible in a case where X and Y are local and X does not c-command Y, but I
understand this to be due to the fact that if X does c-command Y, then distinguishing between NFS2 and something
actually c-command-based like FD becomes difficult.
[90] I am taking slight liberties here, as Hoji’s comments are only about Japanese nouns prefixed by so-, but I do not see why
they would not extend to English, as indeed they seem to do.
Neither (109) nor (110) is easy to accept with a BVA(S, every man, he/his) reading, at least for me.
In both cases, however, ‘he’/‘his’ is just one element inside a much larger, conjoined subject, and
thus does not directly c-command ‘every man’. In this case, neither X nor Y c-commands the
other; the Hoji-style approach would predict BVA to be impossible in both sentences (absent a
quirky effect), whereas a Condition C account requiring the pronoun to not c-command its
antecedent would predict BVA to be acceptable in both sentences without requiring any quirky
effects. The former account is consistent with my difficulty in accepting BVA in such sentences,
whereas the latter leaves it unexplained. Further, much like (108)a vs. (108)b, (109) is far
better than (110) (again, at least for me). The difference is that ‘his’ in (109) is a possessor inside
of one of the primary nominal phrases in that conjunction, whereas in (110), ‘he’ is itself one of
those primary nominal phrases. This again reinforces the notion that the relevant contrast is not
one of whether Y c-commands X, but something else.
Of course, analyses of conjoined phrases are notoriously tricky, and one might argue that
somehow the conjoined phrase “counts” as one of its conjuncts. In that case though, it is unclear
why there is a contrast between the sentences in (111) and those in (112) in terms of the
acceptability of BVA(S, every man, his):
(111) a. Every man lied to his mother.
b. To his mother, every man lied.
(112) a. Every man and John lied to his mother.
b. To his mother, every man and John lied.
For me, BVA in (112) is clearly degraded (even worse if we topicalize), unlike in (111). We
can of course keep coming up with counter analyses, but the simplest explanation for now seems
to be that conjunctions do not have a special status with regard to c-command and BVA. If that is
true, then contrasts like those between (109) and (110) cannot be explained via a restriction against
Y of BVA(S, X, Y) c-commanding X. (I will address later in this section what I think they can be
explained by.)
Returning to Hoji’s account of NFS2, given that BVA in (108)a-b cannot be FD-based,
by Hoji’s hypotheses, it is not sensitive to anti-locality; rather, he claims that it is NFS2 that is
responsible, because it is blocked in sentences like (108)b but not (108)a. That is, if an
individual could accept BVA(S, every teacher, his) in (108)a due to NFS2(S, every teacher, his), this
strategy is no longer an option in (108)b, creating a potential contrast. Essentially, NFS2 has
its own anti-locality constraint.
Interestingly, there does not seem to be a parallel observation to be made with DR, given
that replacing (108)b with an equivalent DR sentence seems to greatly improve acceptability. While
many will still reject it for being an “inverse scope” reading, we have seen that many individuals
do accept such readings, as in:
(113) Two students spoke to every teacher. With DR(S, every teacher, two students)
Indeed, we have just seen that Hoji relies on such sentences as part of the DR test. Setting
this oddity aside for the time being, we can thus review the conditions under which “quirky
effects” can be accepted under Hoji’s theory:
(114) Hoji’s conditions allowing quirky-MR(S, X, Y)
a. NFS1: S can be interpreted as “categorical”
(in certain cases, X and Y match in N-head)
b. NFS2: Y is not local to X
With this understanding, Hoji’s DR test is checking whether X (when used in S by the
individual in question) of BVA(S, X, Y) can be affected by either NFS1 or NFS2, and the Coref test
is checking the same about Y (again, relativized to the choice of S and individual). Once we have
ensured that neither NFS1 nor NFS2 can affect either X or Y, then we know that neither NFS1 nor
NFS2 can facilitate a quirky effect with BVA(S, X, Y), and thus Hoji’s correlational predictions are
derived.
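The derivation just described can be sketched as a small truth table. The code assumes, as in the text, that the DR test probes whether an NFS can affect X and the Coref test probes whether an NFS can affect Y, and that an NFS on either one can by itself facilitate BVA. It deliberately ignores c-command-based and precedence-based sources, and the function names are hypothetical.

```python
from itertools import product


def quirky_bva_possible(nfs_can_affect_x, nfs_can_affect_y):
    """Disjunctive assumption: an NFS on either X or Y can facilitate BVA."""
    return nfs_can_affect_x or nfs_can_affect_y


def predicted_bva_rejection(dr_test_shows_quirk, coref_test_shows_quirk):
    """BVA with no formal source is predicted to be rejected only when
    both the DR test and the Coref test show no quirky effect."""
    return not quirky_bva_possible(dr_test_shows_quirk, coref_test_shows_quirk)


# All four test outcomes, keyed by (DR test shows quirk, Coref test shows quirk).
outcomes = {
    (dr, coref): predicted_bva_rejection(dr, coref)
    for dr, coref in product([False, True], repeat=2)
}
```

Only the outcome in which both tests come out negative yields a predicted rejection, which is why neither test can be dropped under this model.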
2.8.5 More on Non-Individual-Denoting Interpretations
While Hoji (2022b)’s NFS-based account does indeed derive the intended correlation, it
loses something of the “intuitive” nature of the Ueyama(-Hayashishita) model. Regarding NFS1, we
may wonder how the sentence being understood as “categorical” allows for an MR to be established,
and why that is tied to specific choices of X and Y. NFS2, as presented, is even more enigmatic;
what role does being non-local play in allowing an MR to be established? This question comes with a
sense of “duplication”: FD itself has an anti-locality condition, and DD does not, meaning that
when the MR’s are established via c-command, Coref and BVA (coming from FD) have anti-locality
conditions and DR (coming from DD) does not. Meanwhile, NFS2 apparently has the same rules:
BVA and Coref must be non-local to be based on NFS2, whereas DR does not have to be[91].
[91] At least this is my interpretation; Hoji 2022b technically does not state what enables “quirky DR” in non-categorical
contexts, that is, when NFS1 is not available. Given that DR is sometimes available in those contexts, and that its
availability in such cases is, as far as I know, correlated with BVA’s, it would be reasonable to assume that the same
source/sources are enabling both. The locality difference may be taken as evidence to the contrary, but as I will argue later
in this section, I believe this can be accounted for under a unified approach.
The NFS’s stand in contrast with other ways of establishing MR’s, namely those given in
(96). We have discussed above how Co-D-indexation can lead to Coref readings. Similarly, it seems
intuitive that X “coming before” Y would allow Y to refer back to it, and Ueyama (1998) has laid
out a formal semantics of how that might happen. Finally, c-command’s role is also relatively clear, if
we assume a compositional semantics which takes the syntactic structure as its basic input; the fact
that structurally “higher up” elements come out as having semantic scope over “lower down”
elements is clear from work in compositional semantics as foundational as Montague (1973).
Hoji’s NFS’s do not have this same clarity in terms of how they give rise to
Coref/DR/BVA. This, of course, may mean that we simply have not understood the relevant
factors involved. My suspicion, however, is that Hoji has not identified the fundamental
requirements for establishing the various NFS’s, but rather, he has identified situations that
allow/facilitate or inhibit/block them, without having captured the actual underlying requirements.
As I will show, we can actually synthesize Hoji’s NFS approach with a modified version of the
Ueyama(-Hayashishita) model, which provides both an intuitive account of quirky effects and
correctly derives Hoji’s Correlation. Further, under this account, rather than testing both X and Y
for the same two NFS’s, we are in fact testing X for one kind of NFS and Y for another, and I will
show that it is clear how each one of these independently can allow quirky-Coref/DR/BVA.
To begin, let us consider Ueyama’s “non-individual-denoting” Y’s again. Here, the link
between such a way of interpreting Y and the resulting overall Coref/BVA interpretation was
somewhat straightforward, if not completely fleshed out. We can perhaps bring what is going on
even more to light, however; consider the following set of sentences.
(115) Context: it is a well-known fact that each teacher has a favorite special student in their class.
We are discussing the typical circumstances surrounding this favorite student.
a. Usually, every teacher praises that student a lot.
b. Generally speaking, that student is smart.
c. That student praises every teacher; that’s how they get to be their favorite.
We are interested in DR(S, every teacher, that student) readings; that is, readings where the
identity of ‘that student’ varies with the teacher in question, namely ‘that student’ is the teacher in
question’s special favorite student[92]. For me, such a reading is easily possible in (115)a; ‘that student’
is essentially interpreted as “that kind of student”. The reading is also available in (115)b, which is
striking because there is no overt ‘every teacher’ in the sentence at all. Barring some sort of covert
elements being present in the structure corresponding to the sentence, it must be the case that
DR(S, every teacher, that student) is not being established due to ‘every teacher’ c-commanding ‘that
student’ here. Finally, in (115)c, we have to be a little bit careful with the reading we are after; it is
essentially “each teacher’s student praises that teacher” (the teachers are vain; it turns out their
favorite students are just the ones who are the most sycophantic). I also judge this reading possible.
Indeed, I have replicated these judgements with speakers of both Korean and Mandarin Chinese[93],
using the equivalent phrases. These speakers are individuals who generally did not accept
“quirky”-DR in the types of sentences used in the experiments for this dissertation, which makes it
all the more surprising. Once the intended context and interpretation were explained to them,
however, they both accepted the intended readings with relative ease[94].
[92] Hoji (p.c. July 2021) suggests this might be better understood as a BVA reading, primarily because his working definition
of DR requires reference to different sets of multiple individuals. What I have called the DR reading here would involve
different “sets” of just one individual. Either way though, it does not matter much for this particular purpose.
[93] Yoona Yee and Felix Qin respectively, who I thank in footnotes in the upcoming chapters for their other assistance
with this dissertation.
[94] The Korean speaker noted that it would be “better” to use 그런 학생 geuleon hagsaeng ‘that kind of student’ than just
그 학생 geu hagsaeng ‘that student’ but still found DR possible with the latter.
Two things are clear from this example: first, this is a rather consistent way to enable (at least
some) individuals to accept “inverse scope” DR, and second, that accepting inverse scope DR in this
way is not interpretationally “free”; rather, it requires a specific type of interpretation of the Y of
DR(S, X, Y), where Y essentially comes to refer to a kind of thing, rather than any specific
individual. My sense is that this sort of interpretation is possible with BVA as well. Consider (116),
repeated from (107)a.
(116) Long ago, his student would always praise every professor.
I accept BVA(S, every professor, his) here, but in a particular way. It is not a description of
the fact that, say, John was praised by Tim, who was his student, Bill was praised by Robert, who
was his student, etc., even if it is truth-conditionally equivalent to something of that sort. Rather, ‘his
student’ is completely abstract, denoting a certain kind of person. This can perhaps be seen even
better if we put it in the analogue of (115)b, where ‘every professor’ disappears completely:
(117) His student was always obedient and polite.
I can accept the BVA(S, every professor, his) here, but again, ‘his student’ sounds like a sort
of genericized/idealized person, not anyone in particular. As I alluded to above, we can infer from
the truth of this generic statement to the truth of individual instantiations (because one’s student was
always obedient and polite, it must be that John’s student was always obedient and polite, Bill’s too,
etc.), but that is not the core meaning. Note that once we have established this “generic person”,
there is nothing that stops us from using it in Coref:
(118) His student praised John.
(118) could of course be understood as ‘his student’=John ‘praised John’, the Co-D-indexation
way, but we could also choose to understand it as the “abstract” ‘his student’; each will
lead to Coref(S, John, his). Again, the latter may be made clearer by putting it in an X-less context:
(119) His student was always the smartest of the group.
(119) can clearly (to me at least) have a reading, “John’s student, whoever that was at the
time, was always the smartest one in the group”. Under such an interpretation, ‘his student’ does not
refer to anyone in particular; it is a purely abstract individual which different individuals may
instantiate. We can therefore achieve all three relevant MR’s by making use of this “generic
individual”.
Note that, because of this reliance on genericity, this reading is much easier to obtain when
speaking about things in a contemplative manner, which disfavors things like present progressive
aspect or prefacing by “look over there!”, consistent with what Hoji has noted for Japanese. For
example, consider:
(120) Look over there! His student is obedient!
Trying to get a BVA(S, every professor, his) reading here yields something truly bizarre.
Perhaps there are scenarios where one could imagine declaring that the individual corresponding to
the abstract relation ‘his student’ for each professor is currently being obedient, but practically
speaking, it is very difficult to imagine an appropriate context, and thus to interpret the sentence in
this way[95]. We can thus see the relevance of what Hoji has been labeling categorical contexts, even
though such considerations are really secondary to the issue of whether or not Y can be treated as a
kind/generic/non-individual-denoting expression.
Though the interpretation in question is not per se identical, we can extend a version of this
analysis to indefinite numeral expressions as well, which we have frequently seen as Y of DR(S, X,
Y), as in (121):
(121) Three students praised two teachers.
There is a reading of (121), which we may consider a DR(S, two teachers, three students)
reading, but which is not the same thing as “teacher A was praised by students 1-3 and teacher B
was praised by students 4-6”, even though the two will be truth-conditionally equivalent. This is the
reading which simply holds that the ratio of praising students to each teacher was 3:1. In other
words, two teachers are such that a three-student group praised them, regardless of whether it was
the same three-student group. This reading essentially expresses that the cardinality of the praising
group was 3 in each case.
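The difference between this quantity-style reading and the “students 1-3 vs. students 4-6” scenario can be illustrated with a toy model. The sketch below is only an illustration: it assumes an “exactly three” construal of the numeral, the scenario data are invented, and the function names are hypothetical.

```python
def quantity_reading(praisers_by_teacher, n_teachers=2, group_size=3):
    """True if at least n_teachers teachers were each praised by a group of
    exactly group_size students, whoever those students are."""
    qualifying = [
        teacher
        for teacher, group in praisers_by_teacher.items()
        if len(group) == group_size
    ]
    return len(qualifying) >= n_teachers


def groups_are_disjoint(praisers_by_teacher):
    """True if no student belongs to the praising group of two different teachers."""
    groups = list(praisers_by_teacher.values())
    return all(
        groups[i].isdisjoint(groups[j])
        for i in range(len(groups))
        for j in range(i + 1, len(groups))
    )


# Invented scenarios: students are numbered, teachers are labeled A and B.
different_groups = {"teacher A": {1, 2, 3}, "teacher B": {4, 5, 6}}
same_group = {"teacher A": {1, 2, 3}, "teacher B": {1, 2, 3}}
```

Both scenarios satisfy `quantity_reading`, while only the first satisfies `groups_are_disjoint`; the quantity reading is thus indifferent to the identity of the students, which is the point made above.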
This discussion recalls what Li (1998) calls the “quantity reading” of such numeral
expressions. As will be discussed briefly in Chapter 6, Mandarin Chinese typically does not allow
such expressions to serve as the subjects of sentences, as in (122):
(122) 三个 学生 在 学校 受伤了
Sāngè xuéshēng zài xuéxiào shòushāngle.
Three students at school were.hurt
“Three students were hurt at school”.
[95] The best I can imagine is something like: we are watching an instructional video for teachers, “how to make your student
obedient”, which features a bunch of teachers simultaneously acting out the required steps. Once the requisite stage is
reached, the narrator says the sentence, indicating that each of the teachers has achieved his goal of making his student
obedient. Very, very weird.
To express the intended meaning, speakers normally need to put the existential marker 有
yǒu ‘have’ at the beginning of the sentence, rendering it something akin to “there were three students
hurt at school”. As Li[96] points out, however, such sentences can sometimes be accepted, provided
that what is being conveyed is the quantity of students hurt. It therefore seems that such
numeral expressions are indeed ambiguous between “individual-denoting” and “quantity-denoting”
interpretations[97], which have measurably different properties, allowing such elements to
appear in subject positions, and for me, to serve as Y of inverse-scope DR(S, X, Y)[98].
I take such readings to be analogous to the “non-individual-denoting”/“generic” cases I
discussed previously, even if they are not quite the same. What unifies them is that MR(S, X, Y) is
achieved, not by the establishment of any direct relationship between X and Y, but by Y simply being
inherently generic/abstract/kind-like, such that the sentence in question truth-conditionally implies
the relevant MR reading. We can thus take Hoji’s Coref test to be a test of whether the individual in
question can understand the Y to be used in BVA(S, X, Y) as non-individual denoting in this way, at
least in the given (type of) sentence under consideration.
[96] This is her general point about many different types of sentences involving subject numeral expressions, but about
sentence (122) in particular, readers can see her footnote 3.
[97] Which may have different syntactic realizations; this is Li (1998)’s argument.
[98] Technically, Li (1998) argues that such quantity expressions cannot enter into DR, which is inconsistent with what I am
claiming, though there may be issues specific to Chinese indefinite subjects that add a layer of complication that results in
these differences.
2.8.6 Topichood Revisited
The question now is whether we can determine what the DR test is checking with regard to
X of MR(S, X, Y). In fact, I think Ueyama already has essentially found this in her statement that X
must be the topic of the sentence/clause; my main change from her analysis is that I believe
that this alone is sufficient to act as an NFS to enable an MR(S, X, Y) reading, without Y needing to
have any special interpretation. Let me initially eschew the term “topic”, as my sense is it means
rather different things to different researchers. Let us rather say that X ought to be the “property-
bearer” of the sentence in question, which ties in again to some degree with the notion of categorical
judgements. In essence, it is a question of whether a sentence like (123) can be understood to be
intended to convey a property of ‘every teacher’.
(123) His student spoke to every teacher.
The relevant BVA reading here is something like “it is the property of every teacher that his
student spoke to him”. In essence, ‘every teacher’ in the object position is treated as a stand-in
for a variable; what the sentence really says is “every teacher X is such that X’s student spoke to
X”. This latter formulation is nothing new; it looks quite similar to the typical formulations for
inverse scope one finds in introductory textbooks. What I am suggesting, however, is that this
formulation be taken particularly literally and understood as having the pragmatic requirement that
the sentence must be interpreted as being “about” X and the properties thereof.
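For concreteness, the paraphrase “every teacher X is such that X’s student spoke to X” can be rendered in a standard first-order notation. This is only a sketch: the predicate names, and the treatment of ‘X’s student’ as a function student_of(x), are simplifying assumptions made for illustration and are not part of the analysis itself.

```latex
\forall x\,\bigl[\mathrm{teacher}(x) \rightarrow \mathrm{spoke\_to}\bigl(\mathrm{student\_of}(x),\, x\bigr)\bigr]
```

In this rendering, both ‘his’ and the object position correspond to occurrences of the same variable x, which is what the “property of every teacher” paraphrase is meant to convey.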
The issue Hoji noted of shared N-heads is relevant here
99
, as it seems to enhance this sense
of “variable-ness”; I think (124) may be even easier to accept than (123) with a BVA reading:
99
There may be other ways in which matching N-heads are relevant as well; my intention is not to claim that this “explains”
Hoji’s observation, merely to note that it is along the same lines. Further, as I noted in footnote 88, Hoji considers it possible
that head-matching is in fact an NFS3, so it may be an entirely different “entity” altogether.
(124) That teacher’s student spoke to every teacher.
We can immediately bring in the Ueyama/Hayashishita observations about the need of X to
have a “specific” reading. I noted earlier in this section that specificity has often been said to be a required
property of topics, but when we think of topic as a “property-bearer”, an intuitive reason for this
requirement presents itself. In essence, it is rather straightforward to conceive of stating a property
about ‘every teacher’, as what it represents is quite ‘entity like’ and is thus something we can easily
imagine having properties as a collective whole. The more numerically intricate and vague we make
X, the harder it is to conceive of X as having properties as a collective whole. How can, for
example, “at least one or more teachers” be considered a “thing” that it is appropriate to assign
properties to? Indeed, something quite interesting happens, for me at least, with sentences with such
X’s, e.g., ‘more than one teacher’:
(125) His student spoke to more than one teacher.
I can accept this sentence with a BVA(S, more than one, his) reading, but it feels distinctly
“logical” and “math-book-like”, almost as if we are turning language into a kind of word problem. I
suspect the reason for this is that, to accept the BVA reading, ‘more than one teacher’ must be
conceived as a coherent collective entity to which we can assign properties, requiring the adoption
of a very artificial and “math-like” way of viewing the world. Of course, we can do so, but that
produces the sense of facetiousness and/or the sense that this might be found in a book with
word-problems. Indeed, this tracks well with Hayashishita’s generalization that inverse scope DR seems to
require X to refer to a uniquely identifiable set of individuals; if X would not normally refer to such
a set, we must create an artificial scenario wherein we can imagine that it does, allowing the reading,
but at the cost of requiring a very specific mode of interpretation.
One may note here that what I have stated above, “every teacher X is such that X’s student
spoke to X”, sounds precisely like what is described in many theories of QR, where (123) (when
interpreted with inverse scope BVA) might be said to have a structure (very roughly) like:
(126)
Of course, we could also have an account where the relationship between ‘every teacher’ and
the object position is not QR, but one of base generation in that position (which we can label the
topic position), or even one where the conversion of the object position to a variable happens in the
semantics, not the syntax (an SR approach). Indeed, as I noted before, all I have really done is
impose a requirement that, if an element wants to (come to) occupy this “high up” position
(whether structurally high up or just semantically), it must be viewed as “entity-like” enough to serve
as a topic/property-bearer. That is, what I am claiming is that being a “property bearer” is an
interpretive requirement of any element that undergoes QR/SR, consistent with many of the
observations and intuitions about variation noted by Ueyama and Hayashishita, as discussed in
Section 2.7.
This formulation also captures the intuitions behind the “categorical context” constraint
Hoji (2022b) discusses. We can easily understand how utterances spoken as a reaction to ongoing
events are unlikely to be interpreted as statements about the properties of abstract collections of
entities; they are instead likely to be assertions about what is happening in some ongoing events,
almost the opposite of a “word problem”-like mode of interpretation. We also now have a clear
logic as to what the DR test is testing for, namely whether or not the individual in question can
understand the sentences in question such that X can be understood to be a topic/property bearer.
I will provide one final speculative extension to this analysis, which may shed light on the
nature of Hoji’s NFS2. Recall Hoji’s claim that NFS2-based BVA(S, X, Y) and Coref(S, X, Y) are
impossible if S=[Y V X], together with the fact that NFS2-based DR(S, X, Y) is apparently possible, at least as
I have defined it (DR(S, X, Y) acceptance in S where X does not precede or c-command Y and there
is no categorical context). Let us suppose that the structure given in (126) is in fact correct, and X is
actually syntactically in the high-up topic position (regardless of how it got there). In such a case,
even though the word order will be “Y V X”, we have a structure like the following:
(127)
We can immediately note two things: first, X c-commands Y, and, more specifically, X c-
commands Y locally
100
. As such, Hoji’s FD(S, X, Y) cannot be established, whereas DD(S, X, Y) can
be, leading to FD-based BVA and Coref being impossible, and DD-based DR being possible. This
contrast coincides precisely with what Hoji reports
101
. As such, we do away with the “coincidence”
in Hoji’s system that FD-based BVA and Coref and NFS2-based BVA and Coref both experience
anti-locality effects whereas DD-based DR and NFS2-based DR both do not. This explanation also
correctly derives the patterns I observed with BVA with conjunctions in (108)-(112); all we need is
to suppose that conjunctions do not serve as barriers to locality (cf. the degraded status of Coref(S,
John, him) in ‘John spoke to Mary and him’), and the pattern is derived by precisely the same
argumentation as for the non-conjunction case given earlier in this paragraph.
This account also makes a prediction, namely that the only interpretation under which a
sentence like (128) could be acceptable with BVA(S, every teacher, he) is via the “non-individual-
denoting”/ “generic” interpretation of ‘he’.
(128) He praised every teacher.
That is, because ‘every teacher’ remains local to ‘he’, raising it to the property position will
not enable FD(S, every teacher, he), so the only potential solution is to interpret ‘he’ as non-individual-denoting.
100
Two technical points to note: first, technically, Hoji’s formulation of locality is in terms of “co-arguments of the same
verb”, which may or may not apply to X in its high-up position. Logically, it would seem as if X and Y here are co-
arguments, but one might perhaps adopt a particular understanding of the term which renders them not co-arguments. If
we use a more structure-based definition though, along the lines of Chomsky (1981)’s formulation in terms (effectively)
of intervening clause or nominal phrase boundaries, then X can certainly be considered local to Y.
Second, as I mentioned in Section 2.6.1, there is the issue of whether X is in an “A-position”, but as I discussed there, it
is unclear to me why we should continue to maintain the stipulation that BVA(S, X, Y) can only occur when X occupies
such a position.
101
Again, he reports it for Japanese, but it seems to extend straightforwardly to English.
While this interpretation is very difficult to accept or even understand, I have
found one way that I am able to reliably do so. I must imagine we are talking about the “universal
self” of each person (that little voice inside all of us who is in fact our own self talking to us), and
that each teacher, perhaps in great introspection and dialogue with their inner nature, gave praise to
themselves from the perspective of their own self. The reading is essentially “He(=the self) praised
each teacher”, and since every teacher has a different self, a BVA-like reading is achieved. The
reading is difficult to stumble on because it is really only possible in a very specific sort of
spiritual/philosophical/meditative context: “he heard a voice from within, and he realized the voice
was himself, and the voice praised him” is the basic sense. As far as I can see, this is the only way to
achieve a BVA-like reading in the sentence, and if so, then this does correspond to our prediction;
‘every teacher’ cannot simply be understood as the topic, but instead, a special interpretation of ‘he’
is needed.
2.8.7 Summary and Remaining Questions
In summary, I have argued for the following:
(129) Conditions that independently allow for B(S, X, Y)
a. Y can be understood as non-individual-denoting/generic/kind-like, allowing it to achieve
a reading that is truth-conditionally equivalent to BVA/DR/Coref.
b. X can be understood as the topic/“property bearer” of the sentence, allowing it to occupy
a position above the sentence and thus potentially establish FD/DD via c-command.
102
102
Note that this, much like Hoji’s formulation, makes it clear (i) why we need both the Coref and the DR tests, and (ii) why
those tests can only make negative predictions about BVA(S, X, Y). With regards to (i), the Coref test checks whether Y
of Coref(S’’, X’, Y) can be non-individual denoting, while the DR test checks whether the X of DR(S’, X, Y’) can be
understood as a topic; if the answer is no in both cases, then BVA(S, X, Y) cannot be enabled by X being a topic or Y
being non-individual denoting, whereas if the answer is not clearly no in at least one case, then BVA in principle could be
enabled by one of these two strategies. We must say “in principle” though, because, addressing (ii), Coref(S’’, X’, Y) might
be acceptable for other reasons than Y being non-individual-denoting, such as X’ being a topic, and the same is true of
DR(S’, X, Y’), e.g., it might be accepted not due to X being a topic but due to Y’ being non-individual-denoting; thus the
source of DR/Coref acceptance might come from factors other than S, X, and/or Y, and thus have no bearing on BVA(S,
X, Y). Thus, if a test fails to come back negative, that is, DR/Coref is accepted, we cannot necessarily predict that BVA
will also be accepted. If both do come back negative though, that is, both DR and Coref are rejected, we can make the
prediction that BVA will be rejected (assuming we have controlled for precedence and c-command).
This is essentially Ueyama’s proposal for quirky-BVA, but with the previously linked
conditions on X and Y split into two independent conditions which do not both need to be met for
quirky-BVA to be possible; as with Hoji’s understanding of NFS’s, this allows for the necessity of
both the Coref and the DR test to be motivated. Further, similar to Hoji’s account, and unlike
Ueyama and Hayashishita’s, I assume that these conditions can facilitate all three of Coref, DR, and
BVA. This proposal further allows for factors such as the categorical context that Hoji (2022b)
identifies to be relevant for facilitating all these MR’s, even though the context itself is not the
mechanism per se that leads to their being established.
I must stress that I have not provided a significant test of this proposal, simply some basic
motivation. Doing so is outside the scope of this dissertation, and as I have stated, I will not rely on
this analysis in future chapters. Further, this analysis does not fully explain why individuals do not
simply always accept quirky-MR; after all, it should theoretically always be possible to understand
terms like ‘every teacher’ as a topic and ‘his student’ as non-individual denoting with the correct
mode of interpretation, but as we will see in subsequent chapters, many individuals do not accept
BVA(S, every teacher, his) in weak crossover configurations. As I speculated regarding Co-D-
indexation, this may be due to a combination of factors; by their very nature, these special
interpretations may not be obvious to all individuals who judge the relevant sentences, and indeed,
because so much of the interpretation happens unconsciously, even knowing the intended way of
interpreting the sentence may not guarantee that the individual’s mind actually “builds” that
particular representation.
Further, there may be issues less “grammatical” in nature and more closely connected to
processing difficulties. To take a non-linguistic example, consider the classic “challenge” of rubbing
one’s stomach with one hand while simultaneously patting one’s head with the other. This is difficult
for many, impossible for some, even though there is nothing about the body’s neural or
physiological architecture that prevents it. Rather, we are very used to having our two hands act in
concert with one another, rather than in independent and essentially totally unrelated ways. Thus,
even with effort, we find the head-patting hand often making small circles between pats, or the
stomach-rubbing hand bobbing up and down as it circles. In essence, we are trying to use our hands,
normally coordinated with one another, in ways that are totally at odds with each other;
while this is physically possible, it is difficult because it is so far outside of what we typically
do.
The various MR-facilitating sources discussed above behave quite similarly, though we have
even less conscious control and awareness of them than we do our hands. C-command and
precedence, for example, usually go together; if you want to achieve BVA, why not put X in a
position where it both c-commands and precedes Y? Add to this the desire to make X the topic of
the sentence; should not the topic normally come first in the sentence (or at least be marked in some
special way if it does not)? When ‘every teacher was praised by his student’ is an alternative, where
‘every teacher’ both precedes and c-commands ‘his’, why would anyone ever say ‘his student praised
every teacher’, but still want ‘every teacher’ to be the topic? Of course, with a combination of natural
inclination, practice, and effort, at least some people probably can do these things, and thus
disparities in these factors may explain variation to some degree
103
.
103
To say nothing of differences in linguistic experience that may arise based on the properties of one’s own and one’s
surrounding community’s I-languages, e.g., whether topicalization/scrambling of word order is common or not, which
may make one more or less tolerant towards phrases in unusual positions being considered to be the “topic” of the
sentence. My hunch is that the relatively inflexible word order of written English is partially responsible for
English appearing so much more permissive than other languages where the written standard allows for more frequent
word order switches, but that is purely a guess at this point.
Again, the above (and indeed, most of the content in this section) is purely speculative at this
point, and needs much work in order to be even meaningfully testable, let alone supported. I
provide it merely for the sake of offering my (and others’) best “interpretation” of what lies behind
the somewhat enigmatic correlation between MR’s that underlies Hoji’s methodology. If this is
helpful (or at least thought-provoking) in understanding the rest of this dissertation, so much the
better, but if some have found this section overly confusing or ungrounded, they need not worry
about its contents; going forward, all that will be of issue is “Hoji’s Correlation” itself, not whatever
lies behind it.
2.9 Conclusion
In this chapter, we have traced a long path, starting with the postulation of a basic CS-building
block, the operation Merge. This structure-building operation creates structures with certain
properties, which we can categorize using various structural relations. The most relevant for us is
c-command, which essentially formalizes the notion of one element being structurally “higher” than
another. If such a relation is to be detected via acceptability judgements, it must have some
consistent effect on meaning. The basic Reinhartian approach is to argue that BVA(S, X, Y) is
possible only if X c-commands Y in S. Though this account makes numerous correct predictions,
we have seen that it also faces a number of counterexamples. There have been attempts to
maintain the pure c-command-based approach, such as trying to redefine certain aspects of
c-command, as in Kayne (1994)’s attempt to capture possessor-binding effects, or allow for
c-command to be covert, as in QR-based theories. Unfortunately, these theories often either do not
account for all counterexamples or run the risk of becoming so permissive that they lose most or all
predictive power.
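As a concrete illustration of the two notions recalled at the start of this conclusion, Merge and c-command can be sketched in a few lines of code. This is a minimal sketch under stated assumptions, not an implementation of any particular theory discussed here: it assumes strictly binary Merge, the common textbook definition on which X c-commands Y iff Y is X's sister or is contained in X's sister, and deliberately simplified constituent structures in which phrases such as 'every teacher' and 'spoke to' are treated as single leaves. The names `Node`, `merge`, `dominates`, and `c_commands` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity-based equality: two 'his' tokens are distinct nodes
class Node:
    """A syntactic object: a lexical item (leaf) or the output of Merge."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = None


def merge(a: Node, b: Node) -> Node:
    """Combine two syntactic objects into a new binary-branching object."""
    node = Node(label=f"[{a.label} {b.label}]", left=a, right=b)
    a.parent = node
    b.parent = node
    return node


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b (a is an ancestor of b)."""
    current = b.parent
    while current is not None:
        if current is a:
            return True
        current = current.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """a c-commands b iff b is a's sister or is dominated by a's sister."""
    if a.parent is None:
        return False
    sister = a.parent.right if a.parent.left is a else a.parent.left
    return sister is b or dominates(sister, b)


# 'His student spoke to every teacher' (simplified constituency, assumed).
his, student = Node("his"), Node("student")
spoke_to, every_teacher = Node("spoke to"), Node("every teacher")
subject = merge(his, student)
clause = merge(subject, merge(spoke_to, every_teacher))

# 'Every teacher was praised by his student' (simplified constituency, assumed).
every_teacher_2, his_2 = Node("every teacher"), Node("his")
by_phrase = merge(Node("by"), merge(his_2, Node("student")))
passive = merge(every_teacher_2, merge(Node("was praised"), by_phrase))

print(c_commands(every_teacher, his))      # False: the object does not c-command into the subject
print(c_commands(subject, every_teacher))  # True: the subject c-commands the object
print(c_commands(every_teacher_2, his_2))  # True: the surface subject c-commands 'his'
```

On these assumptions, the weak crossover sentence is one where X ('every teacher') does not c-command Y ('his') in the surface structure, whereas the passive alternative is one where it does; covert-movement accounts such as QR amount to positing an additional structure in which the relation holds.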
Further, as we have seen, these problems are not unique to BVA, but manifest in other MR’s
such as Coref and DR as well. In general, pure c-command-based accounts seem to simply miss key
aspects of the data, which has caused some (e.g., Barker 2012) to argue that c-command simply does
not constrain MR’s in the ways its proponents had hoped. Even defenders of the basic c-command
approach such as Ueyama (1998), Hayashishita (2004/2013), Baltin et al. (2015), and Dechaine and
Wiltschko (2017), have generally had to conclude that there are a number of factors in addition to c-
command that can enable BVA/DR/Coref readings, including linear word order and discourse-level
effects.
Further, as can perhaps be inferred from my comments about judgement variation
throughout, many (though certainly not all) arguments for or against various theories of BVA and
other MR’s have fallen victim to an overreliance on terms like “English” and “Japanese”. While these
are helpful abstractions, in a Chomskyan conception of the human language faculty, languages like
English do not have an independent existence. Rather, each individual has an I-language, a steady
state of the shared universal grammar, instantiated in their brain/mind. Because these steady states
develop from their initial state due to external stimulus, and each individual has been exposed to
different external stimuli, in principle no two individuals need have exactly the same judgements, nor
even need the same individual at different times. Any theory that lives or dies on the basis of a
certain sentence-interpretation pair being unacceptable in “English” is thus already on potentially
shaky ground.
Of course, it may turn out that, despite the potential differences in I-languages, all speakers
of a given language do in fact have the same judgements on a given item. While there may indeed be
areas of language where this is the case, it is certainly not so with BVA, DR, and Coref. Judgements
vary immensely; thus, an approach is ultimately needed that can contend with such variation. Hoji
(2017, 2019, 2022b) attempts to establish such an approach, reframing older hypotheses as
constraints on variation in the space of an individual’s judgements and establishing conditions under
which accepting or rejecting a given sentence-interpretation pair may be judged as significant. The
resulting predictions are far more robust, and as we will see, appear to capture the range of
possible BVA judgements with total accuracy, while still maintaining c-command as an integral part
of the theory.
Of course, we may still wonder whether superficially non-syntactic aspects of the theory may
be subsumed under a c-command-based approach. For example, we have noted that Hoji’s DR test,
used to determine whether X of BVA(S, X, Y) has a “quirky effect”, looks a lot like a test one might
use to see whether X can undergo QR, a fundamentally syntactic operation. Note, however, that if
such an analysis is correct, it would not change our basic predictions, as summarized in (95). Rather,
it would either make the same predictions, or might in fact make an additional and/or more specific
prediction. A similar note can be made for attempts to subsume precedence-effects under some
version of c-command, as well as various possible (re)interpretations of Hoji’s Coref test for quirky
effects. As such, the jury is still out on whether a purely c-command-based approach to BVA is
possible, but this will not be problematic for us moving forward. Rather, the ABC-BVA law argued
for in this dissertation makes a maximally conservative guess with regards to the role of c-command,
which can be amended as desired based on a given reanalysis.
Even with this conservative guess, c-command is still clearly very relevant in constraining
BVA. As will be shown throughout the rest of this dissertation, it makes clear predictions that
cannot be subsumed under a purely precedence-based account or an entirely “meaning”-based
account, at least not without redefining those terms to essentially include notions of structure and c-
command. As such, we find strong evidence for the cognitive reality of c-command, and thus the
operation of Merge on which it is based, as well as obtain an investigative tool with which we can
test various properties of the language faculty and the CS. The basic Reinhartian intuitions seem to
have been correct after all; though other factors must be considered, structure does constrain the
mapping between form and meaning, just as predicted by the fundamental Chomskyan approach to
language.
3 Experiments, Past and Present
3.1 Introduction
3.1.1 Overview
In this chapter, we move from theoretical discussion to matters of data collection and
analysis. As we left things in Chapter 2, we had a fairly detailed idea of the factors that can enable
BVA(S, X, Y). These include factors we can more or less “see”, like word order; ones that
we can also “see” if we have the relevant hypotheses, such as sentence structure; and those that cannot
simply be “seen” just by looking at the form of a sentence, namely so-called “quirky effects”. This
last category is especially problematic, as it leads to variation that cannot be reliably controlled
simply by manipulating aspects of the sentence in question. Courtesy of Hoji (2017, 2019, 2022a),
we have a correlational way of dealing with such cases by using judgements on parallel sentences
with DR and Coref interpretations as “quirky detectors”. As noted repeatedly in Section 2.7,
however, a fair deal of work is necessary to bridge the divide between these abstract hypotheses and
real-world judgement behavior.
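The logic of using DR and Coref judgements as “quirky detectors” can be sketched schematically. The following is a minimal sketch, assuming the simplified picture summarized in Chapter 2: for a single individual, if X neither precedes nor c-commands Y, and that individual rejects both the parallel DR and the parallel Coref sentence, BVA(S, X, Y) is predicted to be rejected; in every other case no prediction is made. The class and function names are hypothetical, and the sketch abstracts away from how the parallel sentences are constructed and how judgements are elicited.

```python
from dataclasses import dataclass
from enum import Enum


class Prediction(Enum):
    BVA_REJECTED = "BVA(S, X, Y) is predicted to be rejected"
    NO_PREDICTION = "no prediction is made"


@dataclass(frozen=True)
class Judgements:
    """One individual's situation for a given BVA(S, X, Y) (illustrative fields)."""
    x_precedes_y: bool     # word order in S
    x_c_commands_y: bool   # hypothesized structure of S
    accepts_dr: bool       # judgement on the parallel DR(S', X, Y') sentence
    accepts_coref: bool    # judgement on the parallel Coref(S'', X', Y) sentence


def predict_bva(j: Judgements) -> Prediction:
    """The tests only ever yield a negative prediction about BVA."""
    # If precedence or c-command holds, BVA may be enabled by those factors.
    if j.x_precedes_y or j.x_c_commands_y:
        return Prediction.NO_PREDICTION
    # Both quirky detectors negative: no remaining source for BVA.
    if not j.accepts_dr and not j.accepts_coref:
        return Prediction.BVA_REJECTED
    # At least one detector is not negative: BVA could in principle be enabled.
    return Prediction.NO_PREDICTION


def is_counterexample(j: Judgements, accepts_bva: bool) -> bool:
    """True only if BVA is accepted where it was predicted to be rejected."""
    return predict_bva(j) is Prediction.BVA_REJECTED and accepts_bva
```

Note the asymmetry built into `predict_bva`: accepting DR or Coref never yields a prediction that BVA will be accepted, since that acceptance may have a source that has no bearing on the BVA sentence.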
This chapter explores how this divide is to be bridged such that we can design and perform
experiments that yield results which meaningfully reflect on our hypotheses. In the rest of this section in
particular, I will give a conceptual overview of the basic type of experiments to be performed, as
well as review the relevant hypotheses that have been established in the previous two chapters. From
there, in Section 3.2, I will discuss previous similar experiments conducted to test such hypotheses.
That discussion begins with Hoji’s own work in Japanese, followed by
my previous follow-ups to that work in English. After having established this background, in Section
3.3, I discuss the design of the experiments conducted for this dissertation, with emphasis on how
they both build upon and differ from previous experiments. Finally, in Section 3.4, I preview certain
aspects of the analysis conducted and results of the experiments performed, so that readers may
know roughly what to watch for when reading the details of each experiment in the following
chapters.
3.1.2 Individuals vs. Populations
I use the term “experiment” to describe the data collection done for both this dissertation
and previous similar works. Let me clarify, however, how I am using this term, as there is a chance
that some readers may be misled regarding what I mean by it; this may lead to the belief that I am
erroneously claiming a certain type of result obtains
104
when in fact I do not discuss such a result. In
particular, it should be noted that all analysis done in these experiments is on the judgements of
individuals
105
, rather than on aggregates of the judgements of different individuals. The distinction
here is subtle, but it has fairly noticeable consequences for how analysis is performed.
From a broad perspective, as is commonly remarked in introductory textbooks, generative
linguistics is inherently an experimental field. The most basic such “experiment” consists in making
a judgement regarding whether a given item is “acceptable” to the individual in question or not. This
is not specific to the domain of research being carried out here; the item in question may be a way of
pronouncing a word, conjugating a verb in a certain way in a certain context, etc. As noted in
Chapter 1, in this dissertation, the items of interest are pairings of sentence strings and MR’s. Even
with this restriction, there are a variety of ways that such experiments can be conducted, ranging
from simple introspection by the researcher, to work with consultants, to informal surveys, and to
various large-scale data-collection enterprises.
104
E.g., statistical significance under certain tests in differences between mean responses to items in various different
conditions, with certain types of controls applied, as I will discuss further below.
105
Perhaps even more specifically, individuals at particular times; at least under idealized conditions, this should reduce to
studying I-languages.
This last possibility is itself quite diverse in terms of what sorts of experiments can
instantiate it. When such experiments are conducted using tools and techniques common in
psychology, they are said to be “psycholinguistic” experiments, and it is my impression that this is
often what linguists mean these days when they say “experiments”. Psycholinguistics itself is a
heterogeneous field, but there is nevertheless a set of typical assumptions that often go along with
such work, as represented by works such as the influential Schutze (1996/2016), which acts as a kind
of “manual” for how such experiments are performed.
Some aspects of the program Schutze lays out are widely applicable to any sort of
experimental science, e.g., the use of careful experimental controls to make sure results are not
confounded by some factor not under investigation. As Plesniak (2022c) notes, however, there are
aspects of Schutze’s program that are quite specific to a certain type of research. In particular,
Schutze seems to take the focus of experimentation to be determining the average or typical
behavior of participants in a given sample, and then quantifying the confidence with which it can be
said that this average/typical behavior of the sample reflects the true average/typical behavior of the
population. The key questions of interest become things like “how do people typically judge a given
item?” (or more commonly, “a given type of item”), “how do such typical judgements on different
items differ from one another (or not)?”, and “are such differences incidental to the sample or do
they reflect true trends in the population?”.
As Plesniak (2022c) notes, this “typicality-seeking” design can be seen in Schutze’s suggestions
for alleviating the influence of atypical speakers, for instance by the use of statistical tools
applied to a well-populated sample. The key quote I give there is: “subjects must be sufficient
in number in order for the assumptions of the required
statistical tests to be met and to avoid distorting the results with atypical speakers.” (emphasis
mine, Schutze 2016, page 184). As such, Schutze’s methodological ontology recognizes the existence
of atypical (and thus typical) speakers, and also the possibility that their judgements can “distort” the
overall results; as I understand it, “distort” here is relative to the sorts of results that would have
obtained with only judgements from typical speakers included
106
.
Plesniak (2022c) contrasts this sort of experiment with the kind of experiment conducted in
Plesniak (2022b) and Hoji (2022c), to be discussed below in Section 3.2. The crucial difference is
that, unlike for Schutze, the goal is not to find out what judgements/patterns of judgements are
typical, but rather, what patterns of judgements are possible. As such, notions of typical and atypical
speakers will not have (direct) status, nor is there a possibility for results to be distorted by such
speakers, at least so long as their judgements are accurately reported. Indeed, at least in certain cases,
“non-typical” speakers will actually give us a much better idea of the range of possible variation than
typical ones will, meaning that “possibility-seeking” experiments very much do not want to discard
or smooth over such data.
The experiments conducted for this dissertation are of this “possibility-seeking” type, rather
than the Schutze-ian “typicality-seeking” type, hence the caveat I provided regarding the term
“experiment”, as I suspect for many, that term brings to mind something like what Schutze is
describing. Of course, both finding out what is typical and what is possible are informative ways of
probing the language faculty, but as Plesniak (2022c) discusses, both methods require certain
standards of investigation that are largely incompatible, or at least hard to reconcile with one
106
It is worthwhile to note, however, that Schutze is not claiming that such speakers are fundamentally exceptional to any
systematic account. Indeed, in discussing variation, he goes to some length to argue against the claim that individual judgement
differences should be regarded as purely random effects. For example, he states, “if […] we start with the assumption that
[variation] has a cause within the system of judgement performance, then as we understand more about that process, we
might eventually be in a position to say precisely what governs variation over time and predict it as a function of other
cognitive and situational variables. Only after we have exhausted the search for such an explanation should we resort to
random probabilities.” One might indeed argue that Hoji’s quirky diagnostics are precisely a way of predicting variation as
a function of cognitive variables. As such, the difference between a Schutze-style approach and the one I will pursue in
this dissertation is not fundamentally a question of what atypical speakers/judgements represent; rather, it is a
methodological question of how we deal with the judgements of those speakers, and indeed, of all speakers, as I detail
below.
another. Crucially, what is to be analyzed can differ wildly between the two; as just noted, an atypical
participant is definitionally noise for typicality-seeking but may be important data for possibility-
seeking. Likewise, the distribution or frequency of a given judgement pattern is not immediately
relevant to determining what is possible but is inherently crucial for determining what is typical.
Both approaches can be brought to bear on the same sorts of phenomena, and one would
hope that their results converge to a degree; what is theoretically possible ought to constrain what
actually happens, which yields the typical. Likewise, a given behavior being typical can point to some
sort of underlying constraint as to what is possible. Discovering “laws” of the sort investigated in
this dissertation can be approached either way. I think it is fair to say, however, that the possibility-
seeking style of inquiry is more direct than the typicality-seeking style with regard to such “law
discovery”. This is because it is fairly straightforward to argue that, if something is found to never
happen, despite repeatedly trying to induce scenarios where it hypothetically might occur, then there
is some principle or set of principles that is preventing it from happening. If, however, something is
not typical, the results are a bit harder to interpret. On the one hand, non-typicality may indeed
suggest an underlying prohibition, one which is sometimes circumvented via an exceptional
mechanism(s), rendering violations merely atypical rather than impossible, similar to what we have
seen happens with regard to BVA in weak crossover configurations. This is especially compelling if
we find the lack of typicality replicated across many different conditions. On the other hand, if a
pattern is not typical, this does not preclude the existence of outliers, and indeed, if there were no
outliers, we would likely say that the thing in question was not possible or at least not observed,
rather than just not typical. The existence of such outliers suggests that what has been observed is
not the full “law” governing the relevant phenomenon. Of course, it is possible that the outliers
are simply the results of errors/imprecisions inherent to the method of data collection employed, but
unless that can be rigorously demonstrated, we must entertain the possibility that such outliers
represent genuine minority responses. Typicality-based approaches must thus do extra work to
discriminate between these two possibilities, and indeed, if it is shown that all exceptions to the
typical response were due to non-grammatical issues with data collection, then we would have in fact
shown that the typical response was the only possible (genuine) response, bringing us back to
possibility-seeking.
Though the two styles of inquiry have many similarities, their differences are such that using
the term “experiment” to describe both becomes imprecise. To describe the sort of inquiry pursued
in this dissertation and like works, let me coin a term and call it “systematic consultation of
individuals” (SCI). Discussing each word in this term from right to left, we start with the term
“individuals”; the notion to be expressed here is that, unlike in a Schutze-style experiment, the
purpose is not to determine values aggregated across a sample/population. The aggregation
approach, what I have referred to as “typicality-seeking”, essentially relies on averages across the
judgements of individuals, such that the ultimate “judgements” used for analysis are not properties
of each individual, but composites of those properties averaged across the individuals. As we will be
seeing in the discussion of experiments in Section 3.2, this is not what is done in “SCI-experiments”
(at least not as their core analysis). Rather, each individual’s pattern of judgements is analyzed on its
own, without much regard for the judgements on aggregate.
The other two terms do not contrast with a Schutze-style approach; indeed, we may call such
an approach “systematic consultation of populations” (SCP) or something along those lines.
“Consultation” is something both SCI and SCP have in common, as neither is based on a linguist relying on
their own introspective judgements, at least not alone; this contrasts with cases where precisely that is
done. As I noted near the beginning of this sub-section, relying on one’s own judgements is also
properly a type of experiment, so I do not mean to dismiss such an approach by any means; indeed,
as Hoji (2022a-c, and elsewhere) argues extensively, one’s own “self-experiments” are perhaps the
most reliable and extensive form of experiment one can do. The primary limitation of such
experiments is that they cannot tell us much about variation between individuals, as there is only one
individual involved, albeit that individual’s judgements may change over time. As such, consultation-
style and introspection-style experiments both have uses, though the former is necessary for the task
laid out in Section 1.4 for this dissertation regarding providing strong evidence for the “law” that
holds across individuals.
Finally, the term “systematic” represents the fact that these experiments follow a set
investigative procedure, and thus differ from other cases of consultation of individuals, such as are
common in informal consultations of linguists with their colleagues or other informants. In such
cases, the investigator may have things in mind to be checked, but there is no algorithmic procedure
to be followed in checking them. As with all other types of experiments, this “non-systematic” way
of doing things has a crucial role to play in linguistic inquiry, especially when breaking ground into
new areas of research. I am thus not “ranking” any sort of experiment versus any other, but rather
simply providing a typology of experiments, which can serve to distinguish the sort of SCI-
experiments pursued in this dissertation and like works from experiments of other types. As far as I
know, these three parameters, whether the investigations are systematic or non-systematic, whether
they consist of consultation or introspection, and whether their focus is individuals or populations,
should capture the range of different types of experiments employed in linguistics, at least to the
degree that the sort of experiments pursued here are clearly distinguished from other types.
To close, I want to reiterate that it should not be understood that I am claiming SCI-
Experiments to meet the experimental standards of any other type, such as Schutze-style SCP
experiments or the sort of systematic self-experiments described in, for example, Hoji (2022b). I
thus hope that the use of “experiment” to describe them is not taken as an attempt to endow the
results with greater “legitimacy” than is deserved; the SCI-experiments to be discussed make no
attempt, for example, to prevent “non-typical” speakers from unduly influencing aggregate trends
that might emerge from analysis of the results. As such, if evaluated as an SCP-experiment, the
quality of the results would be fairly judged as quite poor. Rather, whether or not the results of any
experiment are “legitimate” should be evaluated based on the experiment’s stated goals and the
procedures used to achieve those goals. I have elucidated the goals for these experiments in Chapter
1, and the procedures used will be detailed throughout the rest of this chapter.
3.1.3 Review of Key Hypotheses
Before moving on to discuss such procedures, however, let us recall certain hypotheses given
in Chapter 2; these will be the ones that will be relevant for all the experiments discussed moving
forward. First, we have the hypothesis of Merge and the resulting structural relations, as given in
Section 2.2:
(130) Merge:
Merge takes elements X and Y and forms {X,Y}
(131) Sisterhood:
X and Y are sisters iff X and Y have undergone Merge to create {X,Y}
(132) Daughterhood:
X and Y are the daughters of Z iff Z={X,Y}
(133) Domination/Containment:
X dominates/contains Y iff:
(a) Y is X’s daughter, or
(b) Y is the daughter of an element Z which X dominates/contains.
(134) C-command:
X c-commands Y if X is sister to some element Z which contains Y.
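As an illustration only, the definitions in (130)-(134) can be sketched in Python. The sketch assumes that unordered two-member sets model the output of Merge, that atoms are plain strings with unique labels, and that (134) is read literally; the function names and the labels X, W, and Y are hypothetical and are not part of the hypotheses themselves.

```python
def merge(x, y):
    """(130) Merge: take X and Y and form the unordered set {X, Y}."""
    return frozenset({x, y})


def daughters(z):
    """(132): the daughters of Z = {X, Y} are X and Y; atoms have none."""
    return set(z) if isinstance(z, frozenset) else set()


def contains(x, y):
    """(133): X contains Y iff Y is X's daughter, or Y is the daughter
    of some element that X contains."""
    return any(d == y or contains(d, y) for d in daughters(x))


def sister(root, x):
    """(131): find the element that was merged with X somewhere in root."""
    if not isinstance(root, frozenset):
        return None
    if x in root:
        return next(iter(root - {x}), None)
    for d in root:
        found = sister(d, x)
        if found is not None:
            return found
    return None


def c_commands(root, x, y):
    """(134): X c-commands Y if X's sister Z contains Y."""
    z = sister(root, x)
    return z is not None and contains(z, y)


# Hypothetical structure {X, {W, Y}}, built by two applications of Merge.
inner = merge("W", "Y")
root = merge("X", inner)

print(c_commands(root, "X", "Y"))  # True: X's sister {W, Y} contains Y
print(c_commands(root, "Y", "X"))  # False: Y's sister W contains nothing
```

Because the sets are unordered, nothing in the sketch encodes linear order; only the sequence of Merge operations determines the resulting c-command relations.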
Given (130)-(134), we are equipped to determine c-command relations for any relevant
structure built by Merge. We will, however, need to know which sentences correspond to which
structures. For the sake of convenience, I will discuss sentences using terms like “subject” and
“object”; formally, we should have procedures telling us which elements constitute the subject or
object of a given sentence, but in the sentences to be dealt with in this dissertation, such
determinations are not controversial. The first type of sentence we will deal with is the basic Subject-
Verb-Object (SVO) sentence of English, for which we hypothesize the following “rough”
107
structure:
(135)
We further hypothesize that (135) is the correct structure for SVO sentences in languages
where the word order is similar to English, like Mandarin Chinese, as well as for SOV languages like
Japanese and Korean. That is, regardless of how a language “canonically” orders its verb and object,
the underlying structure is identical
108
, recalling that structure is not being claimed here to provide a
linear order, so the fact that “Verb” precedes “Object” in (135) is incidental; the structure is
Predicate={Verb, Object}, which is itself formally equivalent to {Object, Verb}
109
.
107
“Rough” in the sense that there may be numerous details left out, but none such that they would change our predictions.
108
As mentioned early in Chapter 2, there are plenty of arguments to the effect that the underlying structures are not
completely identical across these languages. For this dissertation, I am proceeding “as if” the structures are identical,
because that seems to be all that is needed to produce the correct predictions. A different set of hypotheses, where the
structures are not identical but are similar enough that the relevant predictions turn out the same (namely that c-command
relations between X and Y of MR(S, X, Y) are not altered) would fare just as well.
109
Even for languages with word orders where V and O are not contiguous, e.g., where the dominant word order is VSO,
a uniform account of the structure can be maintained; the linearization process simply does not preserve the constituency
of the verb and object. Of course, such an account might be incorrect, though as long as S asymmetrically c-commands
O in the structure, we would get the same predictions, as noted in the previous footnote.
Bringing in “reconstruction effects”, we additionally hypothesize that, even if the object is
“displaced” to the front of the sentence, yielding an OSV order, the sentence still corresponds
110
to
the structure in (135). As such, a great many sentences will be taken to correspond to this structure,
wherein, as we can see from the definitions given above, the subject c-commands the object, but not
vice versa.
Indeed, we will only be considering one other basic sentence structure, namely the passive.
In the passive, there is still a subject, but rather than an object, there is what I termed an “agent”
111
.
How such sentences “look” varies a bit by language; in English, the agent is inside of a prepositional
phrase headed by ‘by’, and the auxiliary verb be is also inserted. Mandarin Chinese looks like English
(at least superficially) but lacks the insertion of the auxiliary verb; whether or not this reflects a
structural difference will not be relevant for us. At least in the types of sentences to be
considered, Korean (and Japanese too, with certain caveats, though we will not be looking at
Japanese passives so these are not relevant) instead forms passives via the use of a case marker for
the agent (dative case), which is different than that assigned in the corresponding active sentences to
either the object (accusative case) or the subject (nominative case). This change in case is also
accompanied by a change in the form of the verb. Though I will discuss them in greater detail in
each of their respective chapters, I provide examples of active and passive versions of the same
basic sentence in each language below. Here, I will use the verb ‘praise’ or its equivalent in the
110
Or at least, still can correspond; Hoji (2015 and elsewhere) uses “can” here in deference to Ueyama 1998’s deep OS
hypothesis, which allows that such cases may correspond to a different structure. As noted previously, I will mention any
cases where this difference is relevant.
111
This term is in quotes because what appears in the by-phrase is not, strictly speaking, necessarily an “agent” in the sense
that word is usually used, i.e., the active “do-er” of an action. The correct term is perhaps “logical subject”, but I want to
avoid that due to its potential confusion with the structural subject position. In the particular sentences we will be
considering, however, the logical subject is always an agent, so the term is technically applicable, although it is being
“misappropriated” to a degree.
language, with A standing for the nominal phrase expressing the “praiser” and B standing for the
nominal phrase expressing the “praise-ee”.
(136) English
a. A praised B.
b. B was praised by A.
(137) Mandarin Chinese
a. A 夸了 B。
A kuāle B.
A praised B.
‘A praised B.’
b. B 被 A 夸了
112
。
B bèi A kuāle
B by A praised.
‘B was praised by A.’
(138) Korean
a. A 가 B 를 칭찬했다.
A-ga B-leul chingchan-haessda.
A-NOM B-ACC praise-did.
‘A praised B’
b. B 가 A 에게 칭찬받았다.
B-ga A-ege chingchan-padassda.
B-NOM A-DAT praise-received
‘B was praised by A’
112
It has been noted that passives in Chinese often convey a sense of “adversity” (see discussion and references in Chapter
4 of Huang et al. 2009, for example). This might render the use of a “positive”-sounding verb like kuā, ‘praise’, unacceptable.
As we will discuss later in this chapter and in Chapter 6, there were independent issues with the Chinese passive that
rendered its impact on the outcome of the experiment more limited than in the experiments for other languages, though
it did have a role to play (and it had a much larger role in the follow-up experiment discussed in Section 6.5.3). Interestingly
though, looking at the pattern of responses, this does not seem to have prevented participants from accepting such
sentences; barring other issues, passives with kuā were almost always accepted.
Felix Qin (p.c. September 2021), who I credit in Chapter 6 with helping to “translate” the experiment to Mandarin Chinese,
agrees that the adversity effect is generally encoded by passives, but he also finds that kuā-passives do not induce this sense
for him. “I think it is pretty commonly used for scenarios like parents telling others why they or their kid’s happy because
their kid 被老师夸了”. (bèi lǎoshī kuāle, by teacher praised, ‘praised by the teacher’). It is possible that this is an
uncontroversial opinion; if not, I wonder if there is a degree of language change occurring in (certain) younger speakers
due to prolonged contact with/bilingualism in English, where passives lack this “adversity” sense.
In each of (136)-(138), the (a) case is the active and the (b) case is the passive. Setting aside
the fine-grained details of each structure, however, we can settle on a language-universal hypothesis of
the following form, which is much as it was presented in Section 2.3, albeit a bit vaguer than the
highly English-influenced presentation given there:
(139)
In (139), the […] indicate points where the structure may be more complex than is written,
and AgentP is merely meant to be the phrase that contains the agent, which may just be the agent, or
may be the agent plus a preposition, etc
113
. The important point to note is that, just as subject c-
commands object, but not vice versa, in (135), here subject c-commands agent, and again, not vice
versa. Further, again like the case of actives, let us hypothesize that displacing the agent or agent-
containing phrase does not alter this basic structure
114
.
113
This is purely a notation of convenience; I am not trying to suggest that “AgentP” is a meaningful syntactic label.
114
Again, there are caveats if we believe in something like Ueyama (1998)’s “Deep OS”, namely the possibility that some cases
of what I have been calling displaced phrases are actually base-generated in a structurally high position. While this would
have potential consequences for us, they are not particularly problematic; at worst, they reduce to certain things I will
attribute to “precedence” being ambiguously attributable to either precedence or c-command, which I will note at the
relevant points. This in no way harms the success of our basic predictions, or the results’ support for the role of c-
command in constraining BVA; the role of precedence might be diminished, but as we will see, only partially.
At this point, our structural hypotheses are almost complete. All we need to include is how
to treat things like possessors, and, in one case to be discussed later in this chapter, relative clauses
modifying nouns. Section 2.5 covers this topic in quite a bit of detail, but for our purposes, all we
need is the following:
(140) Possessors and modifying relative clauses
115
are contained within the nominal (subject,
object, or passive “agent”) that contains the noun they possess/modify.
In other words, we do not need to worry too much about the internal structure of
NP’s/DP’s/nominal phrases, we just need to know that possessors and modifying relative clauses
form a constituent with their possessee/modifiee to the exclusion of other elements. As such, if an
element c-commands a nominal phrase, it will also c-command the possessor or relative clause in
that nominal phrase. Likewise, relevant to possessor-binding, if the entire nominal phrase c-
commands something, the possessor inside of it will not; we will thus be making different
predictions than, say, Kayne (1994) in that regard. We can see both of these principles in the rough
structure given for (141) (repeated from (44)). Here, the subject c-commands the object, and thus
the possessor in the object; on the other hand, the possessor in the subject does not c-command the
object or the possessor therewithin:
115
At least of the restrictive “X that/who did Y” form we will be considering.
(141)
From these basic hypotheses, we can derive the following c-command relations, which will
be all we need for our predictions moving forward:
(142) a. In an active, the subject c-commands the object.
b. In an active, the subject c-commands a possessor in the object.
c. In an active, the subject c-commands anything inside of a relative clause in the object.
d. In an active, the object does not c-command the subject.
e. In an active, the object does not c-command a possessor in the subject.
f. In an active, a possessor in the subject does not c-command the object.
g. In an active, a possessor in the subject does not c-command a possessor in the object.
h. Everything in (a)-(g) is true if we displace the object to the front of the sentence.
i. Everything in (a)-(h) is true if we substitute “a passive” for “an active” and
“agent”
116
for “object”.
All the statements in (142) can be more succinctly summarized as in (143), which holds true
for all the sentences with which we will be concerned
117
:
116
To reiterate, “agent” is just meant to denote the nominal inside the “by-phrase” or its equivalent.
117
I underline this to emphasize that this is restricted to only these sentences. I am not claiming, for example, that there
are no constructions where anything c-commands a subject, only that we will not be dealing with sentences where this is
relevant.
(143) a. The subject c-commands all other nominals in the sentence.
b. No nominal c-commands the subject or anything inside it.
c. Possessors in nominals do not c-command anything outside those nominals.
The sections in the rest of the chapter will all make reference to the deductions given in
(143), which as we have seen, derive from our basic structural notions and a small number of
hypotheses about the structures corresponding to simple sentence types. We can then combine them
with the ABC-BVA law described in Section 1.3, especially making use of its special cases, which I
repeat here from (95) in Chapter 2:
(144) In the case that any two conditions of (a) are met, we derive one of (b)-(d):
a. The ABC-BVA Law:
*A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
b. Chomsky’s Leftness Condition:
BVA(S, X, Y) is possible only if X precedes Y in S.
c. Hoji’s Correlation:
*DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *BVA(S, X, Y)
d. Reinhart’s Generalization:
BVA(S, X, Y) is possible only if X c-commands Y in S.
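The logical shape of (144) can be sketched as follows. This is a minimal illustration, assuming that A stands for precedence, B for quirky effects, and C for c-command, which is how the surrounding discussion appears to use the letters; the function names are hypothetical. The law is a one-way implication, so the sketch only reports when BVA is excluded, never when it must be accepted.

```python
# Assumed reading of the three factors, inferred from the surrounding text.
FACTORS = {"A": "precedence", "B": "quirky effects", "C": "c-command"}


def law_excludes_bva(a, b, c):
    """(144)a: if none of A, B, C holds of (S, X, Y), BVA(S, X, Y) is excluded.

    Returns True when the law predicts *BVA, and False when the law is silent.
    """
    return not (a or b or c)


def remaining_factors(ensured_absent):
    """Factors that could still license BVA once the given ones are ruled out."""
    return [name for key, name in FACTORS.items() if key not in ensured_absent]


# Ruling out two factors leaves a single necessary condition, as in (144)b-d.
print(remaining_factors({"A", "B"}))  # ['c-command']
print(remaining_factors({"B", "C"}))  # ['precedence']
```

With A and B excluded, the only remaining factor is c-command, which corresponds to the special case in (144)d; with B and C excluded, the remaining factor is precedence, as in (144)b.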
In particular, (144)d obtains in cases where both *A(S, X, Y) and *B(S, X, Y) are ensured. In
such a case, the c-command-based deductions in (143) in fact become predictions about BVA, as in
(145)
118
:
(145) a. BVA(S, X, Y) may occur between X and Y in S if X is the subject of S
b. BVA(S, X, Y) cannot occur between X and Y in S if Y is or is inside the subject of S.
118
As is hopefully apparent, if an individual is such that, for a given case of DR(S, X, Y) and/or Coref (S, X, Y), DR/Coref
is necessarily dependent on X c-commanding Y in S, then that individual should also obey a version of these predictions
with “DR”/“Coref” substituted for “BVA”.
c. BVA(S, X, Y) cannot occur between X and Y in S if X is a possessor in the subject of S
and Y is outside of that subject.
The predictions in (145) may seem simple at first glance; after all, they reduce simply to
BVA(S, X, Y) being possible when X is the subject of S and at no other times. As we will be seeing,
however, manipulating the relevant factors, e.g., active vs. passive voice, we can obtain sentences
that appear superficially to “mean” identical things in terms of what elements receive what theta
roles, and yet for which the availability of BVA differs, precisely in line with our structure-based
predictions. That is, sentences one would normally think of as expressing identical meanings do not, in
fact, have the same interpretative possibilities, with those interpretative possibilities being
constrained by the sentence structure.
Further, we can also make use of (144)a and (144)b, both of which obtain only when we
have ensured *C(S, X, Y). From (143), we can see that these will only be relevant in cases where X is
not the subject of S. Relatedly, as noted, to guarantee that the predictions in (145) hold, we will have
to control for linear precedence and quirky effects. Controlling for the former is rather
straightforward; we can just choose either the “canonical” or the “displaced” word order of a given
sentence type, whichever has X before/after Y as desired. Controlling for B(S, X, Y) though is a
more complicated affair. We now turn to how this has been achieved in previous experiments,
which has indeed led to the successful replication of some of the predictions in (145) across a variety
of different individuals.
3.2 Previous Experiments
3.2.1 Introduction
In this section, I summarize what is essentially the most immediate experimental background
of this dissertation. As mentioned previously in this chapter, this discussion will begin with an
experiment (or set of experiments) of Hoji’s (reported in various places, including Hoji 2019, 2022c),
which demonstrates an initial success of the “correlational methodology” in Japanese. In discussing
this experiment, we will answer some of the implementational questions raised towards the end of
Chapter 2 with regard to how to bridge the gap between abstract theoretical predictions using the
correlational approach and experimentally-obtained judgement data.
Hoji’s investigations are limited primarily to contrasting two constructions, weak crossover
configurations and OSV topicalization/scrambling, but consider multiple different choices of X’s
and Y’s for BVA(S, X, Y). All of these choices converge to the same basic correlational pattern,
supporting basic hypotheses about syntactic structure and the correlations between BVA, DR, and
Coref. Further convergence is found in Plesniak 2022a, which performs a limited replication of
Hoji’s Japanese results in English. While this experiment has less coverage than Hoji’s, it
nevertheless makes important contributions by demonstrating that Hoji’s methodology can be
reliably applied to languages other than Japanese. Further, as we will be seeing, it also contradicts
claims that are sometimes made to the effect that such correlations between MR’s cannot be
established at all in English. These especially concern the purported structural insensitivity of Coref,
though to some extent DR as well.
From Plesniak 2022a, we move to the final work to be reviewed here, Plesniak 2022b. Here,
Plesniak 2022a’s findings are extended in various ways, including consideration of “possessors/spec
binding” constructions. As in the other cases, the correlational hypotheses yield correct predictions
in Plesniak 2022b, providing further support for the results first observed by Hoji and applying the
correlational methodology successfully to new areas. As such, this section traces the rather short,
experimental history of SCI-experiments conducted in the correlational manner, setting up the final
piece of background for the experimental endeavor that is at the heart of this dissertation.
3.2.2 Hoji’s “Kyudai” Experiment: Methodological Preliminaries
Hoji’s experiment is reported in multiple places (e.g., Hoji 2022c), but given that it was
conducted at Kyushu University, a.k.a. Kyudai, it has come to be referred to informally as the
“Kyudai experiment”, a label I will adopt here. This Kyudai experiment employs multiple
constructions in Japanese paired with BVA(S, X, Y) readings. The most crucial of these are instances
of either the SOV order, or of the scrambled OSV equivalent thereto; by the hypotheses adopted in
Sub-Section 3.1.3, these can be taken as corresponding to the structure given in (135); crucially, as
noted in (143), the subject, which is marked with the nominative marker ga in Japanese, c-commands
the object, which is marked with the accusative/dative marker o/ni
119
in Japanese
120
. Because Hoji is
making use of BVA(S, X, Y), X and Y will have to feature in the sentence; however, given the anti-
locality condition on FD, the hypothesized c-command-based source for BVA, Y will be embedded
into a broader nominal phrase, so that X and Y are always non-local (see Section 2.3). In this
nominal, Y will be the possessor, marked in Japanese with the genitive marker no. As a result, either
the subject or the object of such sentences will be X, and the other one will be Y-no NP
121
. There
are thus four possible sentence types to consider:
(146)  Japanese Sentence Pattern:    English Equivalent:
   a.  X-ga Y-no NP-o/ni V           X V Y’s NP
   b.  Y-no NP-o/ni X-ga V           Y’s NP, X V
   c.  Y-no NP-ga X-o/ni V           Y’s NP V X
   d.  X-o/ni, Y-no NP-ga V          X, Y’s NP V
119
The differences between these markers will not be of concern here; as Hoji shows, the basic structure seems to be the
same in either case, at least so far as being asymmetrically c-commanded by the subject goes.
120
There are instances where these case markers pattern differently than what I have described here, but we will not be
considering them; see Hoji 2022b for some discussion (which shows how the predicted correlations can be extended to
these other cases as well).
121
This NP being the part of the subject/object that contains at least the “head noun” of the overall nominal phrase.
While sentences of all the types in (146) feature in the Kyudai experiment, it is (146)b and
(146)c that are the most important for Hoji. The reasoning for this is that these two sentence types
are such that X does not precede Y, eliminating the possibility of precedence-based BVA(S, X, Y).
As such, given that Hoji is following Ueyama’s approach to BVA, BVA(S, X, Y) can only come
about in these sentences as the result of either: (i) X c-commanding Y or (ii) X, Y, and/or S having
some sort of quirky effect.
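The four sentence types in (146) can be tabulated by precedence and c-command. The following is a rough sketch, assuming that each pattern is reduced to the surface order of its two nominals plus the identity of the ga-marked subject, and that c-command follows the deductions in (143); the dictionary layout and function names are hypothetical.

```python
# Each pattern from (146): surface order of the two nominals, and which
# nominal is the (ga-marked) subject.
PATTERNS = {
    "a": {"order": ["X", "Y-no NP"], "subject": "X"},
    "b": {"order": ["Y-no NP", "X"], "subject": "X"},
    "c": {"order": ["Y-no NP", "X"], "subject": "Y-no NP"},
    "d": {"order": ["X", "Y-no NP"], "subject": "Y-no NP"},
}


def x_precedes_y(pattern):
    """True if X comes before the nominal containing Y in the surface order."""
    order = pattern["order"]
    return order.index("X") < order.index("Y-no NP")


def x_c_commands_y(pattern):
    """By (143), X c-commands Y only when X is the subject; displacement
    of the object does not change this."""
    return pattern["subject"] == "X"


for label, pattern in PATTERNS.items():
    print(label, x_precedes_y(pattern), x_c_commands_y(pattern))
```

On these assumptions, only (146)b and (146)c lack precedence of X over Y, and they differ from each other in whether X c-commands Y.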
Further, these two sentence types form a near-minimal pair in terms of c-command, with X
c-commanding Y in (146)b but not in (146)c. Resultingly, if there are no quirky effects, BVA(S, X,
Y) should only be possible in the former, not the latter. To reflect this, Hoji terms this a case of
“predicted schematic asymmetry” following a terminological convention of Hoji 2015. The
prediction in question has two components: the “*Schema” prediction and the “okSchema”
prediction, corresponding to the two types of sentences, those without X c-commanding Y,
instances of a *Schema, and those with X c-commanding Y, instances of an okSchema. Regarding
the *Schema, the prediction is universal; no individual should ever accept BVA(S, X, Y) in such a
case unless a quirky effect has occurred. This is a slight modification of the terminology introduced
in Hoji 2015, where *Schemata are predicted to never be accepted at all. For analysis of the Kyudai
results, Hoji retreats from this strong position, as, even though the data considered in Hoji 2015
come close at points, such absolute predictions have not been successfully replicated. Indeed, it is
precisely this issue that leads to Hoji (2019, 2022a/b)’s emphasis on the importance of controlling
quirky effects.
For both understandings of *Schemata, however, the prediction of unacceptability,
relativized or not, applies to an individual’s “true” judgements on the sentences in question, by
which I mean the judgements the individual would have if they knew precisely what the
experimenter was trying to ask them about. An individual’s reported judgements may or may not
reflect these true judgements. The reasons why true and reported judgements might differ include
inattention or confusion on the part of the judgement-maker, as well as other, more specific issues, such
as the individual in question having certain I-language properties that would invalidate crucial
assumptions of the hypotheses being tested. This last case does not necessarily lead to a mismatch
between “true” and “reported” judgements but may cause the judgements reported to be
judgements on somewhat different sorts of items than what the experimenter intended to present,
which will have an analogous effect to simply answering inaccurately. In all cases, Hoji makes efforts
to distinguish such individuals, whose responses have no direct bearing on the *Schema prediction,
from those whose responses reflect their true judgements, by use of what he terms “sub-
experiments”. Other studies may use terms like “catch trials” or “control questions”, but the basic
intuition is the same regardless of what such things are called: they are questions that anyone
meeting the basic assumptions of the experiment (attentive, understanding the instructions, etc.)
should answer in a particular way. If individuals fail to answer in this way, especially if they do so
repeatedly and across multiple different types of “sub-experiment”, then something is “wrong”
about the way they are answering, and thus their reported judgements have no bearing on a *Schema
prediction.
The other type of prediction is the okSchema prediction. This prediction is not universal in
the same way that the *Schema prediction is; the latter predicts that all relevant individuals will have
the same judgement, namely *BVA(S, X, Y) in this case. The okSchema prediction, on the other
hand, is not that all relevant individuals will judge okBVA(S, X, Y), but merely that such a
judgement is possible. From that, we can predict that at least some individuals ought to have such a
judgement, if we consider a sufficiently large sample of judgements. In essence, it is an existential,
rather than a universal, prediction
122
.
Why the difference between the two? If quirky factors are absent, then narrow CS factors
like whether X c-commands Y are the only relevant constraints articulated with regard to whether
BVA(S, X, Y) can be established. Meeting such constraints, however, is only necessary, not
sufficient, for BVA readings to arise. As such, it is possible that an instance of an okSchema gets
rejected with a BVA reading, even though it meets the constraints necessary to allow such a reading
in principle. Thus, while the prediction is that all instances of a *Schema will be unacceptable
provided that the relevant preconditions are met, the okSchema prediction is essentially only a
“hope” that some individuals will sometimes accept instantiations of the okSchema¹²³.
Given this distinction, it may seem that the okSchema prediction is essentially useless. Unlike
the *Schema prediction, it does not give us anything meaningfully “testable” on its own. As
discussed in Section 2.7, however, simply knowing that an individual rejects MR(S, X, Y) in an S
where X does not c-command Y does not per se tell us that the rejection was due to the lack of c-
command. This is one place in which the okSchema prediction becomes useful. Essentially, we are
able to categorize an individual’s responses on the * and okSchemata according to one of three
labels:
¹²² Hoji (2022b) argues that, in a self-experiment, okSchema predictions can, at least sometimes, be made universal. He
also notes that sub-experiments that check for attentiveness and understanding of the intended interpretation are not
necessary, as the experimenter knows that they themselves are paying attention, etc. (although other types of
sub-experiments, such as those checking whether various conditions are met for the relevant hypotheses to apply, may still be
needed in a self-experiment). We are concerned, however, with SCI-experiments, which by definition involve consultation
with others, and as such, these details will not play a major role in the exposition to come.
¹²³ This asymmetry between a *Schema-based prediction and an okSchema-based prediction is named the
“fundamental schematic asymmetry” in Hoji 2015: Chapter 2.
(147) a. “Detection”: *Schema rejected, okSchema accepted
b. “Disconfirming”: *Schema accepted.
c. “Neutral”: Any other pattern, commonly both * and okSchema rejected
(147)b is perhaps the easiest to understand; the *Schema prediction has been
“disconfirmed”, because the individual in question accepted BVA(S, X, Y) paired with an S where X
does not c-command Y. Technically, we should say it has been “potentially disconfirmed” because
of the possibility of quirky effects, but let us assume for now that we have ensured *B(S, X, Y).
Given that the okSchema is not disconfirmable, this is the only way any such disconfirmation can
happen. As noted above, however, if such disconfirmation does not happen, Hoji holds that we
cannot necessarily claim that the result provides significant support for the hypotheses in question.
For example, as noted in (147)c, an individual might reject BVA in both schemata. Such a situation
fails to demonstrate the predicted contrast between an S where X c-commands Y and an S where it
does not; a plausible explanation for the unacceptability of BVA in such a case might simply be that
the individual in question had trouble accepting the relevant BVA readings at all, perhaps due to
some unrelated issue like the complexity of the scenario such a reading would describe, or the
individual simply never accepting BVA with that choice of X and/or Y. If, on the other hand, an
individual does accept BVA(S, X, Y), but does so only with the okSchema and not with the
*Schema, then Hoji takes this as evidence for the “detection” of the effects of the presence/absence
of X c-commanding Y in S. As such, for Hoji, the okSchema prediction is used to assess the
significance of a successful replication of the *Schema prediction; data can only be counted as
directly supporting the hypotheses in question when both predictions are met, resulting in the
predicted asymmetry between the two schemata.
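The three-way classification in (147) can be sketched as a small function. This is only an illustrative sketch under simplifying assumptions: the names `Pattern` and `classify` are introduced here for exposition and do not come from Hoji's materials, and each schema is reduced to a single accepted/rejected flag for one individual.

```python
from enum import Enum


class Pattern(Enum):
    DETECTION = "detection"
    DISCONFIRMING = "disconfirming"
    NEUTRAL = "neutral"


def classify(star_accepted: bool, ok_accepted: bool) -> Pattern:
    """Classify one individual's responses following (147).

    star_accepted: the individual accepted the MR with the *Schema.
    ok_accepted:   the individual accepted the MR with the okSchema.
    (Both flags are simplifying assumptions of this sketch.)
    """
    if star_accepted:
        # (147)b: any acceptance of the *Schema is (potentially) disconfirming.
        return Pattern.DISCONFIRMING
    if ok_accepted:
        # (147)a: *Schema rejected and okSchema accepted.
        return Pattern.DETECTION
    # (147)c: any other pattern, here both schemata rejected.
    return Pattern.NEUTRAL
```

Note that the order of the checks reflects the asymmetry discussed above: acceptance of the okSchema matters only once the *Schema has been rejected.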
Translating the above discussion into experiment-specific terms, Hoji predicts the following:
(148) Assuming an individual passes all relevant sub-experiments:
a. If the individual shows no signs of a quirky effect, that individual should not violate the
*Schema prediction, namely, should reject all instances of BVA(S, X, Y) with sentences of
the form in (146)c, ‘Y-no NP-ga X-o/ni V’.
b. Assuming enough individuals are considered, then at least a subset of the individuals
described in (a) will follow the okSchema prediction, namely, accepting at least some
instances of BVA(S, X, Y) with sentences of the form in (146)b, ‘Y-no NP-o/ni X-ga V’.
c. If, on the other hand, the individual does violate the *Schema prediction, accepting BVA(S,
X, Y) in a sentence of the form ‘Y-no NP-ga X-o/ni V’, then that individual should have
been otherwise diagnosed with quirky effects by the relevant tests.
This is a rather striking prediction, as we can see that all that changes in the sentence itself
from the * to the okSchema is the swap in placement of the case markers ga and o/ni; despite this
similarity, Hoji’s claim is that the sort of asymmetrical behavior described in (148) will obtain
without exception for all judgements of all individuals, although some judgements may either fall into
the “neutral” category or be diagnosed with quirky effects and as such not have a direct impact on
the success or failure of the predictions. Some may find the existence of such neutral/quirky cases
disturbing, but for Hoji, they are a necessary consequence of ensuring both control of noise and
significance of experimental results when working with individuals other than the experimenter’s
own self (or other trained specialists).
We may fairly raise the critique, however, that given that the hypotheses in question are
universal, they should be in principle replicable in any individual; as such, categorizing individuals’
judgements as “neutral” somewhat weakens the evidence for this universality. A similar point may
be raised about individuals showing quirky effects; in these individuals, there is no prediction that
the c-command-based patterns will be observed, again weakening the claim that there is something
inherently “universal” about said patterns. It is important, though, to distinguish this critique from a
claim that the predictions in (148) are unfalsifiable; they are very clearly quite falsifiable, as all that such
falsification would require is for an individual to “pass” the relevant sub-experiments and
tests for quirky effects but to nevertheless accept BVA(S, X, Y) when S is an instance of *Schema. A
single individual having this judgement just once would constitute such a disconfirmation of the
prediction, and Hoji’s Kyudai experiment features hundreds of individuals making dozens of such
judgements, providing ample opportunity for falsification of this sort to occur.
The issue of “discarding” neutral/quirky individuals is one I attempt to improve upon
significantly in this dissertation, and I will address it further in Section 3.3. Hoji, however, anticipates
this issue to a degree, by employing multiple different choices of X and Y and checking their
combinations in a fairly exhaustive manner, offering many chances for each individual to have a
judgement that “counts” towards the results. There are a total of six different basic types of X and
two different choices of Y used in the Kyudai experiment:
(149) X’s:
asoko to koko ‘that place and this place’
aitu to koitu ‘that guy and this guy’
subete-no N ‘every N’
#-Cl-no N ‘# N’s’
sukunakutomo #-Cl izyoo-no N ‘at least # or more Ns’
N-ga/o sukunakutomo #-Cl izyoo ‘at least # or more Ns’
(150) Y’s:
soko ‘it/the place/that place’
soitu ‘the guy/that guy’
In (149), N stands for a noun, of which there are different choices used, # stands for a
number, which also varies in terms of which number is used, and Cl for a numeral classifier of some
kind, which, as is standard in Japanese, is dependent on the choice of the noun¹²⁴. Hoji also makes
¹²⁴ The X’s used denoted either people or institutions of some sort, hence the absence of any equivalent to Cl in the
English translations, given that these are not usually mass nouns in English.
use of different instantiations of V and NP of (146) (this varying between just a noun and a small
multi-word phrase), but X and Y are more crucial for the predictions in (148), in that the “relevant
tests” for quirky effects referred to in (148)c make reference to the specific choices of X and Y
employed in the sentence.
This dependence on X and Y arises because of the nature of Hoji’s tests for quirky effects,
as discussed in Sub-Section 2.7.4. Recall the two diagnostics for quirky effects (applying only to
sentences where X neither precedes nor c-commands Y) as they are presented
in that section:
(151) Hoji’s DR test:
If X does not c-command Y in S, acceptance of BVA(S, X, Y) can be attributed to a quirky
effect if the individual in question also accepts DR(S’, X, Y’) (where S’ is minimally different
from S such that an appropriate Y’ can be used).
(152) Hoji’s Coref test:
If X does not c-command Y in S, acceptance of BVA(S, X, Y) can be attributed to a quirky
effect if the individual in question also accepts Coref(S’’, X’, Y) (where S’’ is minimally
different from S such that an appropriate X’ can be used).
For Hoji, if DR(S’, X, Y’) or Coref(S’’, X’, Y) is acceptable for S when X does not precede or
c-command Y, then a quirky effect is diagnosed. As we saw a bit in Sections 2.6 and 2.7, whether an
MR is available in such a position varies not only based on the individual in question, but also the X
and Y used. As such, even if the DR test diagnoses a quirky effect for subete-no N ‘every N’ for an
individual, such an effect might not obtain when that individual considers sukunakutomo #-Cl izyoo-
no N ‘at least # or more Ns’; indeed, such a pattern would not be surprising given the data
Hayashishita (2004/2013) reports. Consequently, testing multiple combinations of X and Y offers
individuals multiple chances of finding an X-Y pair for which they themselves do not exhibit quirky
effects. If such a case is found, then the individual in question has a chance to potentially disconfirm
the predictions laid out in (148), namely judging instances of the *Schema when no quirky effects are
diagnosed. Should such an individual ever accept BVA(S, X, Y) under such a situation, then the
predictions are disconfirmed; as Hoji shows, however, this does not happen in the Kyudai dataset.
Before we reach Hoji’s analysis, however, we need to deal with one further issue, which was
mentioned previously with regard to the tests for quirky effects. Namely, while (151) and (152) tell
us conditions under which we can be sure that a quirky effect is present, what Hoji needs is a way of
determining if a quirky effect is absent. I noted this issue when discussing (91) in Sub-Section 2.7.5,
repeated here as (153):
(153) *DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *B(S, X, Y)
Written out in words, this expression states that there are no quirky effects affecting BVA(S,
X, Y) if the individual in question rejects both DR(S’, X, Y’) and Coref(S’’, X’, Y); this, we should
recall, is under the assumption that S, S’, and S’’ are minimally different, just enough so to
accommodate the appropriate Y’ or X’ for DR or Coref respectively. As I noted during its initial
discussion, however, (153) only holds if we know that the individual in question rejected DR and
Coref “for the right reasons”; if we do not know this, then the rejection of DR and Coref in the DR
and Coref tests does not necessarily ensure that there will be no quirky effect affecting BVA. Besides
basic experimental issues like attentiveness and comprehension, which may cause an experimental
participant to respond with a judgement that does not reflect their true judgement on the item at
hand, we also encounter exactly the same issue we did with okSchema predictions when dealing with
BVA. Namely, quirky effects are merely necessary, not sufficient, conditions for MR(S, X, Y) to
arise. As such, there may be a quirky effect present on X, and yet DR(S’, X, Y’) is not accepted; if we
took (153) at face value, this might erroneously lead us to suspect that X is free of quirky effects,
leading us to incorrectly believe we have ensured *B(S, X, Y) and thus wrongly predict *BVA(S, X, Y).
Hoji notes, however, that his solution regarding * and okSchemata predictions will also work
for resolving these issues with the DR and Coref tests. When dealing with * and okSchemata, we
were concerned with the possibility that BVA(S, X, Y) might be rejected in the *Schema case due to an
incidental reason rather than being due to X not c-commanding Y as claimed. That is essentially the
same question we are concerned with regarding the DR and Coref tests: are DR(S’, X, Y’) and
Coref(S’’, X’, Y) really being rejected because of a lack of a quirky effect or because of some
incidental reason?
We need some way of discriminating between these two situations, and this is precisely what
okSchema predictions allow us to do. Let us adapt notation from Hoji (2022a) and consider two
types of S, *S and okS, corresponding to the * and okSchemata of our experiments. Under this
system, (153) is slightly rewritten to (154), reflecting its focus on cases where X neither c-commands
nor precedes Y in S.
(154) *DR(*S’, X, Y’) ∧ *Coref(*S’’, X’, Y) → *B(*S, X, Y)
(154) is incomplete, because as noted, we do not know whether the rejection of DR or Coref
is “significant” for our purposes or not. Looking at this somewhat differently, if there are no quirky
effects on S, X, and Y for the individual in question, and like Hoji, we are only looking at sentences
where X does not precede Y, then DR/Coref acceptance should only be possible in S where X c-
commands Y. If we check DR and Coref with *S’ and *S’’, where X does not c-command Y, and
find that the individual rejects such readings, then as noted, this is ambiguous between (i) DR/Coref
requiring X to c-command Y in S, indicating a lack of quirky effects, and (ii) some other, interfering
effect. If, however, we find that, when we consider minimally different schemata such that X does c-
command Y (okS’/okS’’), DR and Coref become possible, then this minimal pair suggests that it was
indeed the lack of c-command that was causing the rejection with *S’/*S’’; this in turn implicates a
lack of quirky effects as desired. If, on the other hand, changing c-command relations does not
result in a switch in acceptance, that is both *S’/*S’’ and okS’/okS’’ are rejected with DR/Coref,
then we have reason to suspect that it is some outside factor causing rejection, rather than any
dependence on c-command, in which case, our tests for quirky effects are inconclusive.
In other words, the same sort of “confirmed (predicted) schematic asymmetry” that suggests
that a given instance of *BVA(*S, X, Y) is significant for c-command detection can also let us know
that instances of *DR(*S’, X, Y’) and *Coref(*S’’, X’, Y) are significant for “lack of quirky”
detection as well; indeed, such “lack of quirky” detection is equivalent to c-command effect
detection in the sense of (147) with DR/Coref substituted for BVA. Hoji thus makes the implication
in (154) “reliable” by incorporating judgements on okS’ and okS’’, as below:
(155) Hoji’s diagnostic of a lack of quirky effects:
*DR(*S’, X, Y’) ∧ okDR(okS’, X, Y’) ∧ *Coref(*S’’, X’, Y) ∧ okCoref(okS’’, X’, Y) →
*B(*S, X, Y)
Of course, as mentioned in Section 2.7, it is always possible for any single
sentence to happen not to be accepted with a given MR for an incidental reason; Hoji thus checks
multiple instances of all schemata to better ensure that this does not happen. We can thus
understand *MR(*S, X, Y) to mean, “MR is never accepted in any sentence of pattern *S
containing X and Y”, and okMR(okS, X, Y) to mean, “MR is accepted sufficiently frequently in
sentences of pattern okS containing X and Y”. We should keep in mind that considerations of what
counts as “sufficient frequency”, as well as (155) itself, are matters of practice, not necessarily of
theory; there is thus a certain degree of “fuzziness” that Hoji acknowledges in such matters,
regarding issues like how many instances of each schema need to be checked and how many times
an okSchema must be accepted with an MR to consider rejection of the corresponding *Schema
with that MR to be “significant”. Fortunately, while certain relatively arbitrary choices have to be
made, the interpretation of Hoji’s data is not affected too greatly by this issue¹²⁵.
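The readings of *MR and okMR just given, together with the diagnostic in (155), can be sketched as follows. This is a minimal sketch rather than Hoji's actual procedure: the function names are hypothetical, each list holds one accept/reject response per tested sentence, and both the default threshold of 0.25 and the inclusive comparison are illustrative assumptions standing in for the "matters of practice" noted above.

```python
def star_mr(star_responses: list[bool]) -> bool:
    """*MR(*S, X, Y): the MR is never accepted in any sentence of pattern *S."""
    return not any(star_responses)


def ok_mr(ok_responses: list[bool], threshold: float = 0.25) -> bool:
    """okMR(okS, X, Y): the MR is accepted 'sufficiently frequently' in okS.

    The threshold value and the use of >= are assumptions of this sketch.
    """
    if not ok_responses:
        return False  # nothing was tested, so nothing was accepted
    return sum(ok_responses) / len(ok_responses) >= threshold


def lacks_quirky_effects(
    dr_star: list[bool],
    dr_ok: list[bool],
    coref_star: list[bool],
    coref_ok: list[bool],
    threshold: float = 0.25,
) -> bool:
    """Antecedent of (155): all four conjuncts must hold."""
    return (
        star_mr(dr_star)
        and ok_mr(dr_ok, threshold)
        and star_mr(coref_star)
        and ok_mr(coref_ok, threshold)
    )
```

If `lacks_quirky_effects` returns `False`, the tests are merely inconclusive; the sketch does not distinguish a diagnosed quirky effect from an incidental rejection of the okSchema.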
With these subtleties in mind, we can finally reach the general form of Hoji’s main
prediction, which is consistent with (148) but now more explicitly addresses how quirky effects are
detected. Expressing this as an implication, we essentially just substitute into (155) with the
appropriate conditions from (148) and other parts of our exposition:
(156) Hoji’s correlational prediction:
Provided that:
a. The individual in question passes all relevant sub-experiments.
b. *S and okS are properly chosen such that X c-commands Y in okS but not *S.
c. *S and okS are both such that X does not precede Y.
Then we predict that:
*DR(*S’, X, Y’) ∧ okDR(okS’, X, Y’) ∧ *Coref(*S’’, X’, Y) ∧ okCoref(okS’’, X’, Y) →
*BVA(*S, X, Y)
As before, we do not have an equivalent prediction regarding okBVA(okS, X, Y), as the
okSchema prediction is only existential, but, in order to have an individual’s judgements count as c-
command effect “detection” in the sense of (147), we will need the okSchema to be accepted.
Indeed, the language of “detection” is a simpler, albeit less precise, way that Hoji expresses the
prediction in (156):
¹²⁵ Discussing the subtleties of such choices is a bit beyond our scope here, but interested readers can see some discussion
in Plesniak (2022c), as well as in Hoji 2022c.
(157) “Detection” in Coref ∧ “Detection” in DR → “Detection” (or at least “Neutral”) in BVA
Here, we recall that “detection” in the Kyudai experiment constitutes rejecting the MR in
sentences of the form ‘Y-no NP-ga X-o/ni V’, i.e., weak crossover configurations, and accepting the
given MR with ‘Y-no NP-o/ni X-ga V’, i.e., OSV reconstruction configurations, or the equivalent
S’/S’’ forms of these configurations for DR and Coref.¹²⁶
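The implication in (157) can be stated as a simple check on an individual's three classifications for a given X-Y pair. The function name and the string labels below are hypothetical conveniences of this sketch; the only thing taken from the text is the shape of the implication itself.

```python
def consistent_with_157(coref: str, dr: str, bva: str) -> bool:
    """Return True unless the pattern would disconfirm (157).

    Each argument is one of "detection", "disconfirming", or "neutral"
    (labels assumed for this sketch).
    """
    antecedent = coref == "detection" and dr == "detection"
    # The implication fails only if the antecedent holds and BVA is disconfirming.
    return not (antecedent and bva == "disconfirming")
```

As with any material implication, a "disconfirming" BVA pattern is compatible with (157) whenever detection is lacking in either Coref or DR.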
3.2.3 Hoji’s “Kyudai” Experiment: Results
With these methodological issues taken care of, let us now turn to the Kyudai experiment’s
results. Hoji divides these results into various sub-parts, each corresponding to a different pairing of
X and Y, the options for which are repeated below from (149) and (150):
(158) X’s:
asoko to koko ‘that place and this place’
aitu to koitu ‘that guy and this guy’
subete-no N ‘every N’
#-Cl-no N ‘# N’s’
sukunakutomo #-Cl izyoo-no N ‘at least # or more Ns’
N-ga/o sukunakutomo #-Cl izyoo ‘at least # or more Ns’
(159) Y’s:
soko ‘it/the place/that place’
soitu ‘the guy/that guy’
¹²⁶ As noted elsewhere, there is a small issue with the translation between BVA and DR, whereby [Y-no NP] cannot simply
become [Y’-no NP] but must just become [Y’]. As such, the sentences of the *S type become just [Y’-ga X-o/ni V] in the
DR case, which is not a weak crossover configuration. It is “analogous enough”, however, for the correlations to come
out as predicted.
To reiterate in concrete terms what was said in the preceding sub-section, for any X-Y pair,
we crucially want to classify them according to (160), which is the experiment-specific instantiation
of (147):
(160) a. “Detection”: BVA(S, X, Y) impossible in weak crossover, ‘Y-no NP-ga X-o/ni V’,
but possible in OSV reconstruction, ‘Y-no NP-o/ni X-ga V’
b. “Disconfirming”: BVA(S, X, Y) possible in weak crossover
c. “Neutral”: Any other pattern
Using this system for classifying the Kyudai experiment participants’ BVA judgements on a
given X-Y pair, we can now make diagrams involving “dots”, somewhat analogous to (5) in Section
1.6. An important difference, however, is that, in (5), each dot represents a single judgement,
whereas in these diagrams, each dot will represent an individual’s pattern of judgements, according
to the classification scheme in (160). Each dot in these diagrams will thus represent what would be
multiple related dots in diagrams like (5); this is an important distinction to keep in mind, as we will
be seeing more of both types of diagrams going forward. In this section, however, they will all be of
the “pattern of judgement” type.
Taking the case of X=subete-no N, ‘every N’ and Y=soitu ‘the guy/that guy’, Hoji first simply
plots a random scattering of the judgements, with no regard to things like tests for quirky effects. Doing so,
he obtains the following (reproduced from Hoji 2019 with permission of the author):
(161)
As noted, each “dot” here represents an individual’s judgement pattern on BVA in sentences
involving this particular choice of X and Y¹²⁷. The color of each dot represents their classification
according to (160): green “detection”, red “disconfirming”, and blue “neutral”¹²⁸. Without getting
into numerical details, it can be clearly seen that there are many dots in each group. If we were to
¹²⁷ At this point, this is just a random assortment of dots whose position is entirely arbitrary. Hoji’s intention is to contrast
this disordered state with the one that emerges once the proper classification is done. One could have summarized the
same information here with simple counts of how many “dots” are in each category.
¹²⁸ We do not need to worry too much about the numbers written in the legend. In essence, they refer to what rate of
acceptance and rejection each schema needs to “count” as significant for the category in question. “b” is the *Schema,
and as such, must be accepted 0% of the time if the individual is not to be counted as “disconfirming”. As noted, there is
a somewhat less theoretically motivated distinction to be made regarding how often the okSchema, “a” in this case, must
be accepted for an individual’s response to constitute “detection”. Hoji’s decision here is to make the relevant threshold
25% of the time. There is a third type of schema in his data, represented by “c”, but as can be seen, it is set to be equal to
or greater than 0%, which will always be true, so it is not being used here.
“leave things here”, then very little could be said; “reds” are hopelessly interspersed with “greens”,
so no clear c-command-based pattern has been detected. As noted, however, Hoji has a key tool for
discriminating between these two cases, namely the presence or absence of quirky effects, as can be
detected via (155). Indeed, as they are stated in (156), it is impossible to test Hoji’s predictions
without such classification.
What (156) predicts is that, once we take into account whether individuals pass the
respective DR and Coref tests, all “red” dots ought to show evidence of quirky effects. Hoji thus
classifies all individuals according to whether they show c-command detection using DR(S’, X, Y’)
and Coref(S’’, X’, Y), where X and Y are still subete-no N, ‘every N’ and soitu ‘the guy/that guy’, and
Y’ is ‘#-Cl-no N’, ‘# Ns’ and X’ is ‘aitu’, which translates roughly to the same thing as ‘soitu’, namely
‘that guy/the guy’¹²⁹.
Hoji¹³⁰ represents “passing” each test by being inside of the relevant circle; those whose
judgements constitute “detection” with DR(*S/okS, subete-no N, #-Cl-no N) have their dots placed
inside of the circle with that label. Likewise, those whose judgements constitute detection with
Coref(*S/okS, aitu, soitu) are placed in that circle. If the individual has detection with both MR’s,
then they are placed in the intersection of the two, in the center of the graph.
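The placement of a dot relative to the two circles can be sketched as follows. The function name and the region labels are hypothetical; the sketch simply assumes that each individual's DR and Coref patterns have already been classified and reduced to a flag indicating whether they constitute "detection".

```python
def venn_region(dr_detection: bool, coref_detection: bool) -> str:
    """Return the region of the two-circle diagram in which a dot is placed."""
    if dr_detection and coref_detection:
        return "center"      # intersection: detection with both MR's
    if dr_detection:
        return "DR only"     # inside the DR circle only
    if coref_detection:
        return "Coref only"  # inside the Coref circle only
    return "outside"         # detection with neither MR
```

Only dots placed in the "center" region bear directly on the prediction in (156).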
Applying this procedure, Hoji derives the following:
¹²⁹ The difference between expressions like ‘soitu’ and ‘aitu’ is of importance for Hoji, but not immediately for us;
essentially, the former is claimed by Ueyama (1998) to be able to participate in FD-based MR’s (and not
Co-D-indexation-based Coref), whereas the latter is claimed by her to not be able to participate in FD-based MR’s (while
being able to participate in others, such as Co-D-indexation-based Coref), as discussed in the previous chapter.
¹³⁰ Technically, Hoji is following (the manuscript precursor of) Plesniak 2022a in doing this. However, my decision to
present things in this way in Plesniak 2022a was entirely due to his suggestion, so credit for this design should certainly
rest with him.
(162)
Crucially, we see that there are red dots everywhere except in the central intersection. This is
precisely what is predicted by the implication in (156), whose simpler formulation in (157) is repeated as (163) below:
(163) “Detection” in Coref ∧ “Detection” in DR → “Detection” (or at least “Neutral”) in BVA
All cases of “disconfirmation” in BVA are also cases where there was not sufficient
“detection” in Coref or DR, indicating a potential quirky effect. As such, all instances that seem to
violate c-command-based predictions, namely acceptance of weak crossover BVA, may be attributed
to experimentally-diagnosed interference by quirky effects. In essence, though there is variation in
BVA judgements, this variation is systematically constrained. As predicted by the ABC-BVA law,
given that X follows Y in the linear form, BVA(S, X, Y) is only acceptable in these sentences if
either X c-commands Y or a quirky effect is diagnosed.
Further, in the central intersection, which contains the individuals for whom it has been assessed that no
quirky effects are taking place for this choice of X and Y, there are at least some green dots,
indicating full c-command detection. Taken all together, such results support the basic hypotheses
about the structures of the sentences in question; that is, they suggest that we correctly identified the
lack of X c-commanding Y in *S and the presence of such c-command in okS, and were also correct
about other hypotheses, such as the relationship between BVA and c-command and the relationship
between BVA, DR, and Coref.
In some cases, one additional step is necessary for Hoji to make results come out this clearly.
I have already mentioned this step several times in the previous sub-section, namely checking the
relevant sub-experiments to ensure that the individuals in question are attentive, comprehending the
directions, meet the assumptions necessary for the relevant hypotheses to apply, etc. I will discuss
details of such sub-experiments further later in this section. For demonstrational purposes though,
let us consider judgements on a different X-Y pair, namely X=aitu to koitu ‘that guy and this guy’ and Y=soitu ‘the guy/that guy’. As
Hoji (2019) notes, for this pair, simply classifying according to Coref and DR judgements does not
eliminate all cases of disconfirming judgements:
(164)
As can be seen, one red dot remains in the middle, indicating that there was an individual for
whom it seemed that neither the DR test nor the Coref test identified a quirky effect, and yet the
individual still accepted BVA(S, aitu to koitu, soitu) in at least one instance where S was a weak
crossover configuration. When Hoji includes consideration of sub-experiments, however, it becomes
clear that this individual had some sort of problem or incompatibility with the experiment.
Hoji includes the status of having passed the relevant sub-experiments as yet another circle,
along with passing the DR and Coref tests:
(165)
The top circle in (165) is the “sub-experiment-passing” circle. As we can see, the red dot that
was formerly in the central intersection does not make it inside this final circle, while three green
dots do. Again, without getting into details of what such sub-experiments are at this point, we can
simply note that not “passing” such experiments constitutes independent evidence to exclude that
individual’s judgement from consideration; this individual may have given signs of misunderstanding
certain key directions, for example. Given that, once these tests are applied, only individuals whose
judgements are as predicted remain in the center, this too is judged by Hoji to be a successful
replication of the main prediction in (156).
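The three-circle analysis can be sketched as a filter over participants. Everything in this sketch is hypothetical: the `Participant` record, the function names, and the toy data, which only loosely mirror the situation described for (165) and are not Hoji's actual data.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    bva: str                     # "detection", "disconfirming", or "neutral"
    dr_detection: bool           # inside the DR circle
    coref_detection: bool        # inside the Coref circle
    passed_subexperiments: bool  # inside the sub-experiment circle


def center(participants: list[Participant]) -> list[Participant]:
    """Individuals inside all three circles."""
    return [
        p for p in participants
        if p.dr_detection and p.coref_detection and p.passed_subexperiments
    ]


def prediction_survives(participants: list[Participant]) -> bool:
    """True if no individual in the center has a disconfirming BVA pattern."""
    return all(p.bva != "disconfirming" for p in center(participants))


# Hypothetical toy data (illustration only).
toy = [
    Participant("detection", True, True, True),
    Participant("detection", True, True, True),
    Participant("detection", True, True, True),
    Participant("disconfirming", True, True, False),  # excluded by sub-experiments
    Participant("disconfirming", False, True, True),  # no detection in DR
]
```

On this toy data, the center contains three individuals, all with "detection" in BVA, and so the prediction survives; adding a single "disconfirming" individual who is inside all three circles would be enough to make `prediction_survives` return `False`.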
While they do conform to the predictions, results like (162) and (165) are limited in scope,
with the information in the central intersection representing the judgement of just a few individuals
on just a few BVA items. Hoji, however, repeats this same sort of analysis with all other X-Y pairs in
the Kyudai experiment (as indicated in (149) and (150) above), finding the same sort of results
throughout, thus demonstrating the consistency and efficacy of the correlational approach. In
essence, all individuals who give responses that might go against what would be predicted by a
purely c-command-based account can be directly accounted for by either the detection of quirky
effects or the failure to pass the sub-experiments. Further, after focusing only on individuals who
pass the sub-experiments and have no quirky effects with the relevant X and Y, we find that these
individuals have judgments that reflect c-command-based asymmetries in the availability of BVA,
further adding significance to the results.
One may wonder how many such individuals there are in each case. The number varies, but
the two above examples are fairly representative; generally only a few individuals reach the “center”
of each diagram. To be fair to Hoji, these are not the only individuals who “count”, so it is not as if
the other individuals are being completely discarded; red dots outside the center are just as
predicted as green dots within it, if not more so. Green dots out of the center and blue dots
anywhere also have their proper role in Hoji’s theory¹³¹. It is nevertheless true, as discussed in the
previous sub-section, that we would want to show that the purportedly universal c-command-based
patterns can be rigorously found in as many individuals as possible¹³², which would involve finding
X-Y pairs not diagnosed with quirky effects for every such individual. Any one X-Y pair in Hoji’s
¹³¹ We may wonder what role the judgements of those who fail sub-experiments in particular might play, as those
individuals may be inattentive or not be comprehending what they are being asked. I will concede that those individuals
certainly have a different status from those who “pass” the relevant sub-experiments (regardless of whether the “passers”
ultimately make it into the “center” or not). Nevertheless, knowing what individuals who are confused or guessing
randomly are doing may provide useful information about certain baseline tendencies, so I am hesitant to
rule out their utility altogether. Further, those who pass many sub-experiments but fail one in particular may be analyzed
in more depth, as we can presumably isolate what aspect “went wrong” in their case, e.g., they understood everything but
the way the DR interpretation was conveyed, which may help us examine what role that particular aspect plays in
establishing the overall results.
¹³² This is not to say that we are making a numerical prediction; the key is the lack of red dots in the center, not the number
of green dots. The number of green dots in the center rather adds quantitative significance to the qualitative total lack of
red dots in the center that constitutes Hoji’s main prediction.
data yields just a few such individuals; across all X-Y’s, Hoji reports that 12 different individuals
make it into the center at some point, out of about 179-186 total¹³³ (p.c. Hajime Hoji, Dec 2022).
What is important is that each one of the relevant judgements from each of these individuals in
principle could have been “disconfirming”, and the fact that none are is striking. In that sense,
Hoji’s Kyudai experiment represents a successful first-of-its-kind experiment designed to extract the
pattern of c-command in BVA from the noise of inter-speaker variation. The results of other
experiments to be presented here, including those from the experiments conducted for this
dissertation, may thus be fairly said to be building on the foundation laid by the Kyudai experiment.
3.2.4 Plesniak (2022a)’s “Toy Experiment”
The first attempt to build on the Kyudai experiment’s foundation is Plesniak 2022a, which
attempts to demonstrate that Hoji’s correlational predictions are not limited to Japanese but can be
successfully replicated in English as well. While in some sense this is not as groundbreaking as the
Kyudai experiment itself, it is crucial that such predictions can be replicated in other languages; as
we can note, none of the hypotheses articulated in Section 2.7 or the implementational strategies
discussed earlier in this section make any specific reference to the properties of “Japanese”. Rather,
the predictions flow from general hypotheses, such as the ABC-BVA law and Hoji’s DR-Coref-BVA
correlation, and practical considerations as to how to ensure reliability and control in experiments.
As a result, if it were to turn out that the results of the Kyudai experiment were in fact specific to
Japanese, this might suggest something fundamentally wrong about, or at least missing in, the basic
hypotheses underlying the correlational approach.
133
What percentage of the participant population this represents is irrelevant for the point I am making here, though again, it might
be relevant for other purposes related to significance assessment.
Fortunately, that is not what happens, and as we will see, things turn out in essentially the
same way in English as in Japanese. It is worth noting, however, that Plesniak 2022a’s experiment is
a much smaller experiment than the Kyudai experiment. There is only one choice for X and Y of
BVA(S, X, Y), namely ‘every teacher’, and ‘his’ respectively; the latter is always part of the broader
phrase ‘his student’, and there is only one verb, ‘spoke to’. Plesniak 2022a uses the same two crucial
schemata as Hoji (weak crossover and OSV reconstruction constructions, “translated” to English),
with only one sentence instantiating each:
(166) English Schemata
a. *Schema: Y’s NP V (to) X
134
b. okSchema: (to) Y’s NP, X V
(167) BVA Sentences
a. *Schema: His student spoke to every teacher.
b. okSchema: To his student, every teacher spoke.
The full experiment contains only ten questions, three of which are unused in the final
analysis. In addition to the two BVA sentences, following the correlational methodology, there are
necessarily sentences to check for quirky effects via DR and Coref. These are created by substituting
‘three students’ for ‘his student’ and ‘John’ for ‘every teacher’ respectively:
(168) DR Sentences
a. *Schema: Three students spoke to every teacher.
b. okSchema: To three students, every teacher spoke.
134
The inclusion of ‘to’ was motivated by the sense that OSV reconstruction sentences are more easily acceptable in written
form for English speakers when ‘to’ is present; the OSV pattern is relatively normal in spoken English, but it is far rarer
in written English, and sometimes this seems to cause some English speakers to reject such sentences out of hand when
asked for a judgement. We are thus using a slightly different structural hypothesis than for a “regular” SVO sentence, given
that ‘to his student’ and ‘to every teacher’ are PP’s in (indirect) object positions, but we can assume that these occupy
roughly the same space as the regular (direct) objects as schematized in (135), or as the agent PP’s in passives, as
schematized in (139); in either case, the crucial hypothesis that the nominal inside the PP is c-commanded by the subject
and not vice versa holds, which is all we need.
(169) Coref Sentences
a. *Schema: His student spoke to John.
b. okSchema: To his student, John spoke.
Although with far fewer instantiating sentences, this set of six sentence types, two of each of
BVA, DR, and Coref, closely mirrors Hoji’s Kyudai experiment. Translation of the Coref sentences,
however, is not without some controversy. It is claimed in various sources
135
that English sentences
like (169)a (that is, “weak crossover” configurations with Coref interpretations) are simply always
acceptable to English speakers, contrasting with languages like Japanese, where it has been reported
that “weak crossover Coref” is impossible. This would render Hoji’s Coref test for identifying
whether the individual in question has a structurally sensitive Y unusable in English. This claim,
however, has two major problems: first, as we have seen in the preceding sub-section, not all
Japanese speakers find a contrast in Coref in sentences like those in (169); indeed, in Hoji’s data, it is
clear that a non-trivial number of individuals can accept Coref in weak-crossover
constructions like (169)a. Given that Japanese speakers vary in whether a given instance of
weak-crossover Coref is acceptable or not, we may wonder if the situation is the same in English. As
we shall see, indeed it is, constituting the second problem for this generalization. In fact, quite a few
individuals show the predicted asymmetry between (169)a and (169)b, suggesting that they very
much did experience a weak-crossover effect with Coref in English. As such, Hoji’s Coref test turns
out to be indeed usable for English.
135
I have been able to trace some of the more recent claims back to comments by Boskovic (2014), where it is used to
argue that Japanese and English have different nominal-internal structures. However, prior to the experiment conducted
for Plesniak 2022a (in 2019), Hoji also assumed that Coref in English could not fulfill the same role it does in Japanese,
following Ueyama (1998)’s discussion of co-D-indexation, which she manages to control by the use of so-NP’s in Japanese,
which as discussed in footnote 66, lack a clear analogue in English. Indeed, one can note that Chomsky (1981)’s Binding
Principle B, supposed to govern the Coref behavior of such pronouns, simply holds that Coref(S, X, Y) is impossible if X
c-commands Y locally; unlike Binding Principle A for reflexives, it does not hold that X must c-command Y. This too
seems to suggest the view that Coref cannot be expected to conform to c-command based patterns when English
pronouns are involved.
The final type of sentences included were four attentiveness/comprehension checking “sub-
experiments”. Three of these turn out to be completely redundant (including one which was passed
by every one of the 100+ individuals who participated); as such Plesniak 2022a makes use of only
one of them
136
. This one particular sub-experiment is termed the “BVA Instruction Sub-
Experiment” or “BVA-Inst-Sub” by Hoji 2015, and it plays a fairly crucial role in ensuring that
reported judgements do indeed match “true” judgements. Namely, it tests whether participants
seem to correctly understand the way in which the BVA interpretations are being conveyed to
them. That is, because neither the term “BVA” nor the concept it describes are commonly known to
non-specialists, some form of informal description is used to convey the interpretations to be judged
to the participants. For Plesniak 2022a, building on Hoji 2015’s work with English, this was done via
the use of the phrase “each teacher’s own student”, e.g., the example item presented schematically in
(170)
137
:
(170) Sentence: “His student spoke to every teacher”
Option A: “His student” can refer to each teacher’s own student.
Option B: “His student” cannot refer to each teacher’s own student.
The hope was that most individuals understand that ‘each teacher’s own student’ does not
refer to a particular student, but varies in interpretation with each teacher in question, i.e., reflects
BVA(S, every teacher, his). If, however, an individual does not understand it in this way, then such a
136
The more sub-experiments there are, and the more distributed they are throughout the experiment, the more likely we
are to catch anyone who is, for example, sometimes attentive and sometimes careless. Plesniak (2022c) discusses
how many such experiments are necessary, but the bottom line is that it depends on a great number of other factors.
Though we could imagine the required number to be quite high, the results of the experiments presented here suggest this
is not the case; results often come out as predicted even when very few such controls are utilized.
137
Somewhat simplified from its actual presentation in Plesniak (2022a) for the sake of expositional brevity. For example,
participants were allowed to reject both options if they felt that the sentence was not grammatical English, expressed as a
third option.
person’s reported judgements do not, in fact, reflect judgements on the availability of BVA for that
speaker; instead, that speaker is judging a different sort of interpretation, one which is not BVA, and
thus for which no predictions have been made.
To check whether this “paraphrase” is correctly understood, participants in Plesniak 2022a’s
experiment are asked to judge the sentence in (171), which contains the crucial phrase of interest,
namely ‘each teacher’s own student’:
(171) John spoke to each teacher’s own student.
What is being checked is whether ‘each teacher’s own student’ mandatorily conveys the
intended “multiple different students” reading, or whether the participant in question could
somehow understand it as referring to one individual. If the individual could understand ‘each
teacher’s own student’ as referring to just one individual, then the use of the phrase clearly was not
forcing that individual to consider BVA readings, meaning that the directions were not effective for
that individual. The BVA-Inst-Sub asked precisely about this kind of interpretation, and those who
answered that (171) could describe an act of speaking to just one student were categorized as
potentially inattentive or misunderstanding the directions; while most individuals “passed” this sub-
experiment, a non-trivial minority did not, and, as in the Kyudai experiment, taking note of this
minority was necessary for results to come out as predicted.
With these seven items in mind, two BVA schemata, two DR and two Coref schemata, and
the BVA-Inst-Sub, we can now turn to the results of Plesniak 2022a. The basic Venn Diagram used
is effectively identical to Hoji’s
138
; the three circles represent the three categorization tools, the DR
138
I have modified terminology used in the diagram slightly from Plesniak 2022a’s original to be more consistent with what we
have seen so far.
and Coref tests for quirky effects and the BVA-Inst-Sub. Likewise, the labels of the dots are the
same, with the same conditions (albeit “neutral” is now yellow rather than blue). In terms of the
sentences in (167):
(172) BVA(S, every teacher, his) acceptable in…
a. (167)a: “Disconfirming”
b. (167)b but not (167)a: “Detection”
c. Anything else: “Neutral”
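The three-way labeling in (172) amounts to a small decision rule. A minimal sketch in Python follows, assuming each participant's two BVA judgements are recorded as booleans; the function and argument names are hypothetical and are not taken from Plesniak 2022a.

```python
def classify_bva(accepts_star: bool, accepts_ok: bool) -> str:
    """Label one participant's BVA judgements following (172).

    accepts_star: BVA(S, every teacher, his) accepted in the *Schema sentence (167)a.
    accepts_ok:   BVA(S, every teacher, his) accepted in the okSchema sentence (167)b.
    Both argument names are illustrative assumptions.
    """
    if accepts_star:
        return "Disconfirming"  # (172)a: weak-crossover BVA accepted
    if accepts_ok:
        return "Detection"      # (172)b: (167)b accepted, (167)a rejected
    return "Neutral"            # (172)c: anything else
```

Note that acceptance of (167)a alone suffices for "Disconfirming", regardless of the judgement on (167)b, which is why that check comes first.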
We can understand the classification circles in a similar way, recalling that to get into the DR
and Coref circles, one had to get “detection” in those domains as well; for DR, this is rejecting
(168)a and accepting (168)b, and for Coref it is rejecting (169)a and accepting (169)b. For the BVA-
Inst-Sub, there was only one item, which had to be rejected with the provided reading, providing
evidence that the individual did not understand the directions in an “unintended” way.
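The circle-membership conditions just described can be restated as a conjunction of three checks. The following is only a sketch under stated assumptions: the record layout and all names (`Judgements`, `detection`, `in_center`) are hypothetical and are not drawn from Plesniak 2022a's materials.

```python
from dataclasses import dataclass


@dataclass
class Judgements:
    """One participant's answers; True means the reading was accepted.

    Field names are illustrative assumptions.
    """
    dr_star: bool      # DR accepted in (168)a
    dr_ok: bool        # DR accepted in (168)b
    coref_star: bool   # Coref accepted in (169)a
    coref_ok: bool     # Coref accepted in (169)b
    inst_single: bool  # "just one student" reading accepted for (171)


def detection(star: bool, ok: bool) -> bool:
    """'Detection' pattern: reject the *Schema item, accept the okSchema item."""
    return (not star) and ok


def in_center(j: Judgements) -> bool:
    """True if the participant falls inside all three circles."""
    in_dr = detection(j.dr_star, j.dr_ok)
    in_coref = detection(j.coref_star, j.coref_ok)
    in_inst = not j.inst_single  # BVA-Inst-Sub: the provided reading must be rejected
    return in_dr and in_coref and in_inst
```

For example, `in_center(Judgements(False, True, False, True, False))` holds, whereas changing any single field of that record places the participant outside at least one circle.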
Putting these all together, we derive Plesniak 2022a’s results diagram:
(173)
Each color of dot is manifested by about a third of the participants. Crucially, a third of
individuals do accept BVA(S, every teacher, his) with (167)a, i.e., they accept weak-crossover BVA,
while the other two thirds reject BVA in this context. Of this latter group, half (again, about a third
of the overall) do accept BVA(S, every teacher, his) with (167)b, i.e., OSV reconstruction. The first
of these three groups has judgements that do not pattern according to c-command-based
predictions, whereas the last of these three groups does; as with the Kyudai experiments, the
question is whether classification according to sub-experiments and tests for quirky effects explains the red,
c-command-violating dots, while leaving some green, c-command-conforming dots in the center.
As can be seen from (173), this is precisely what happens; all red dots fail to make it into at
least one of the circles, whereas there are ten dots, three yellow and seven green, which reach the
center. Interpretationally, this means that all cases of accepting weak-crossover BVA can be
attributed to either failing to comprehend the directions of the experiment in the way intended, or to
quirky effects on either ‘every teacher’, ‘his’, or both, or possibly the sentence pattern itself (which
we will be seeing in later chapters does seem to be possible). When we focus on individuals who
apparently did understand the directions as intended, and who showed no signs of quirky effects, we
find that they all reject weak-crossover BVA, which, as discussed, is a sub-case of BVA(S, every
teacher, his) when ‘every teacher’ does not c-command ‘his’. Further, the majority of these
individuals do accept BVA(S, every teacher, his) with the OSV reconstruction schema, where ‘every
teacher’ does c-command ‘his’, even though it does not precede it. As such, the evidence suggests
that c-command is the main differentiating factor for these individuals as to whether BVA is
possible, precisely as predicted by the ABC-BVA law to occur when precedence and quirky effects
are absent
139
.
Plesniak 2022a thus directly replicates the Kyudai experiment’s results in English, with
minimal change from the format of the original. As noted, the scale at which this is done is
somewhat smaller than what Hoji originally did; indeed, Plesniak 2022a labels this experiment a “toy
experiment” because it uses so few items, and as such, cannot be considered a “full-scale”
replication of Hoji’s experiment. There are, however, different aspects to the “smallness” of Plesniak
2022a’s experiment. On the one hand, it is true that including other instances of X and Y, or simply
more BVA sentences using the same choices of X and Y, would allow for a more thorough
investigation of the state of affairs of BVA in English. Of course, we do not have particular reason
to suspect that ‘every teacher’ and ‘his’ are unrepresentative choices; indeed, as noted in Sections 2.6
139
I have been speaking in this section about the ABC-BVA law as if Hoji and/or Plesniak 2022a made use of that law;
technically, they had not formalized it as such, as this dissertation is the first work to do so. However, as I noted in Section
2.7, it was effectively inherent in their hypotheses, and indeed, as noted in Section 1.4, follows fairly directly from Ueyama
(1998)’s account of BVA. So, though the terminology is a bit anachronistic for the works reviewed in this section, the ideas
of the ABC-BVA law are nevertheless solidly present in the hypotheses these works advance.
and 2.7, universal quantifiers like ‘every’ and “regular” pronouns like ‘his’ are among the most
permissive cases of X and Y of BVA(S, X, Y), so we would expect them to be less likely to show
clear c-command effects than other choices of X and Y. As such, that these effects obtain clearly
and exceptionlessly with these choices adds further significance to Plesniak 2022a’s results.
Nevertheless, thoroughness adds reliability and therefore significance, and Plesniak 2022a is,
in this sense, not particularly thorough when investigating BVA. We should note, however, that the
same cannot be said for DR and Coref, at least when considering the significance of the overall
results. Testing more BVA sentences, at least of the *Schema variety, increases the chances of
finding a “disconfirming” judgement, which would contradict the predictions being tested. On the
other hand, under Hoji’s system, testing more DR and Coref sentences increases the chances of
detecting quirky effects, and actually reduces the chances that a given judgement makes it into the
central intersection. As such, while checking more BVA sentences increases the degree to which the
key predictions are tested, testing more DR and Coref sentences in fact reduces the risk of those key
predictions getting disconfirmed, albeit while making the results more reliable. As such, assessing the
significance of Plesniak 2022a’s results compared to Hoji’s Kyudai results is not an easy matter;
compared to the former, the latter has both increased and decreased the risk of disconfirmation
along certain dimensions, essentially resulting in a “high risk high reward”-style experiment. As I will
discuss further in Section 7.4, the downside of this style of experiment is that it sacrifices a degree of
reliability, but the upside is that it allows for much quicker checking of things, allowing for more
efficient design, and thus potentially greater overall coverage.
These issues will be relevant considerations in Section 3.3, where I discuss the design of the
experiments performed in this dissertation. Regardless of such issues, however, we can still
appreciate that Plesniak 2022a’s results are unlikely to have come about by chance and are thus
significant in that respect. Of the ten dots in the center, none are red, meaning that all individuals in
question rejected BVA in (167)a; given that BVA interpretations were frequently accepted by
participants of this experiment, this would be an unexpected result if the hypotheses in question
were incorrect. To make a very simplistic assumption, if BVA acceptance or rejection was simply
random, then the odds of getting this result would be 1/(2^10), a little less than 0.1%.
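The arithmetic behind this figure can be checked directly. The sketch below adopts the same very simplistic assumption as the text, namely that each of the ten central individuals independently rejects BVA in (167)a with probability 1/2; the variable names are illustrative only.

```python
from fractions import Fraction

n_center = 10                    # dots in the central intersection
p_reject = Fraction(1, 2)        # assumed chance of rejecting BVA in (167)a
p_no_red = p_reject ** n_center  # probability that all ten reject, if independent

print(p_no_red)                  # 1/1024
print(float(p_no_red) * 100)     # 0.09765625 (percent)
```

Under these assumptions the probability of an all-rejecting center is 1/1024, i.e., just under 0.1%.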
One could of course take a much less simplistic approach; I defer discussion of such issues to
Section 7.3. In the meantime, we can
simply note that Plesniak 2022a, like Hoji’s Kyudai work, tests the relevant predictions via a SCI-
experiment, and, despite encountering a wide range of variation, finds no individual who does not
behave according to the predictions (see discussion of (94) in Sub-Section 2.7.5). As such, Plesniak
2022a’s results provide support to the hypotheses in question, even if the strength of such support
remains to be quantified. Plesniak 2022a thus achieves the goal of providing a demonstration
that Hoji’s correlations are not Japanese-specific and may indeed be universal. Of course, more
languages would need to be checked to show this in a fully convincing manner, as well as additional
sentence patterns, both of which are part of what this dissertation proposes to do.
3.2.5 Plesniak (2022b)’s Weak Crossover Experiment
The checking of more sentence patterns, however, is also one of the things achieved in
Plesniak (2022b)’s experiments. These experiments are each somewhat different from one another
but are generally concerned with addressing potential objections to the general correlational
methodology. In terms of thoroughness, they also approach Hoji’s Kyudai experiments along certain
dimensions, but have crucial differences as well.
As discussed in both Plesniak 2022b and Plesniak 2022c, the methodology used in the
Plesniak 2022b experiments builds on that of Plesniak 2022a but adds more items so as to make the
experiment much less “toy”-like. Multiple instantiating sentences are used per schema, and further,
of the ~200 participants per experiment, half of the participants use ‘every teacher’ as X of BVA(S,
X, Y), and the other half use ‘more than one teacher’ as X of BVA(S, X, Y). While this is still far
fewer choices of X compared to the Kyudai experiment, and the format also differs from the Kyudai
experiment in that each participant sees only one X, Plesniak 2022b nevertheless doubles the
number of X’s compared to what is done in Plesniak 2022a. Additionally, Plesniak 2022b makes use
of far more attentiveness/comprehension-checking sub-experiments, along with various other
“controls”, e.g., adding a degree of randomness to the presentation order of the different items, the
use of additional sentences without MR interpretations, etc.
140
.
The first of Plesniak 2022b’s three experiments conducted in this way is much like Plesniak
2022a/Hoji’s weak-crossover experiment, but with a significant twist. The *Schema in question
remains the same, but the okSchema changes. Compare (176) and (177), used in this experiment, to
Plesniak 2022a’s equivalents in (174) and (175) (repeated from (166) and (167) in the previous sub-
section):
(174) Plesniak 2022a’s English Schemata
a. *Schema: Y’s NP V (to) X
b. okSchema: (to) Y’s NP, X V
(175) Plesniak 2022a’s BVA Sentences
a. *Schema: His student spoke to every teacher.
b. okSchema: To his student, every teacher spoke.
(176) Plesniak (2022b) (Experiment 1): English Schemata
a. *Schema: Y’s NP V (to) X
b. okSchema: X was V (to) by Y’s NP
(177) Examples of Plesniak 2022b’s BVA Sentences
a. *Schema: His student spoke to every teacher.
b. okSchema: Every teacher was spoken to by his student.
140
Readers can see Plesniak (2022b) and especially Plesniak (2022c) for a full discussion of the implementation of the
experiments.
Plesniak 2022b swaps Plesniak 2022a’s OSV reconstruction schema, (174)b, for a passive,
(176)b. As mentioned, one of Plesniak (2022b)’s goals is to address potential concerns regarding the
Kyudai experiment and Plesniak 2022a’s experiment. We may note that, while the *Schema and the
okSchema in (174) are matched in terms of the linear ordering of X and Y, they are not matched in
terms of X and Y’s theta roles; in the *Schema, X is the one spoken to, and Y is the one who speaks,
while this is reversed in the okSchema. The match in terms of the ordering of X and Y means that
the observed asymmetries between such sentences cannot be attributed to the effects of precedence,
but one might still wonder if the difference in theta roles accounts for the difference in acceptability,
rather than whether or not X c-commands Y. In (176), on the other hand, the *Schema and
okSchema are matched in terms of X and Y’s theta roles; Y is the “do-er” in both cases and X is the
“do-ee”. As per the hypotheses in Sub-Section 3.1.3, the schemata do contrast in terms of whether
X c-commands Y or not; X c-commands Y in the okSchema but not in the *Schema. As such, if
similar results to those found in Plesniak 2022a and the Kyudai experiment obtain, we can rule out
switching theta-roles as the source of the contrasts observed (or at least as the only source),
supporting the c-command-based account.
Of course, proceeding in this way, Plesniak (2022b) is losing some degree of control over
precedence; “some” because the *Schema is still such that X does not precede Y. As the *Schema
prediction is the main “testable” prediction (see discussion in Sub-Section 3.2.2, especially (156)), the
basic empirical conditions for falsification of this experiment are identical to the previous ones.
Rather, what has changed is the okSchema, which as discussed previously in this section, is a tool for
helping to ensure the “significance” of the experiment. As noted, the Kyudai/Plesniak 2022a
experiments use their okSchemata to demonstrate that the results cannot be reduced to a difference
in precedence between X and Y in different sentences; the Plesniak 2022b experiments use their
okSchemata to demonstrate that the results cannot be reduced to a difference in theta-roles between
the sentences. Put together, the results of each suggest that neither precedence nor theta roles can
account for the observed contrasts, jointly lending strong support for the necessary role of c-
command in constraining BVA interpretations
141
.
The results of the experiment are qualitatively identical to Hoji’s and Plesniak 2022a’s. As
such, results are once again consistent with Hoji’s initial correlational predictions; once all non-c-command
sources of BVA are controlled for, no attentive/comprehending individual accepts weak-crossover
BVA. I have here preserved Plesniak (2022b)’s labels (“WC” standing for “weak crossover”),
but the conditions on each color are essentially the same as in Plesniak 2022a:
(178) BVA(S, X, his) acceptable in…
a. At least one instance of (176)a: Red dot
b. Some instances of (176)b but no instance of (176)a: Green dot
c. Anything else: Yellow dot
As in previous experiments, individuals are classified according to three criteria, the
attentiveness/comprehension checking sub-experiments, as well as the DR and Coref tests. These
141
That is, assuming that precedence and theta roles are not “conspiring together” in some way. This possibility could be
addressed by controlling for both at once, which indeed is done for the experiments for this dissertation; no such
conspiracy is discovered.
One potential objection, raised in essence by Hoji (p.c. January 2021 among other times) is, to use the terminology of this
dissertation, that A(S, X, Y), which reduces to X preceding Y in S for us, is part of the ABC-BVA law, and thus must be
controlled to observe the clear effects of X c-commanding Y, whereas there is no corresponding “Theta”(S, X, Y), and
thus it is unclear (a) whether an experiment that does not control for precedence can be said to be fully significant re-the
effects of c-command, and (b) why we would want to control for theta-roles in the first place. The answer to (b), at least
to me, is rather straightforward; it is a fairly intuitive possibility that the ability to participate in BVA might be dependent
on the semantic role an element plays in a sentence, and so it would behoove us to check that possibility (at least once or
twice) just to make sure that isn’t the case. Indeed, a full check of that possibility is not performed by Plesniak (2022b) or
in this dissertation; I merely check that c-command effects are not reducible to theta-role effects, which is not the same as
demonstrating that theta-role effects do not exist. As to (a), it is true that we cannot say that an individual who accepts the
okSchema in Plesniak 2022b’s experiment necessarily does so because of X c-commanding Y and not because of X
preceding Y. Regardless, as mentioned, I control for both in the experiments for this dissertation, so this “historical” issue
need not concern us much.
latter tests make use of sentences following equivalent patterns to their BVA equivalents, e.g., the
following:
(179) Examples of Plesniak (2022b)’s DR Sentences
a. *Schema: Three students spoke to every teacher.
b. okSchema: Every teacher was spoken to by three students.
(180) Examples of Plesniak (2022b)’s Coref Sentences
a. *Schema: His student spoke to John.
b. okSchema: John was spoken to by his student.
To pass the DR/Coref test (visually, to get inside the DR/Coref circle), the individual in
question had to always reject sentences of the form in (179)a/(180)a with a DR/Coref
interpretation while accepting at least some sentences of the form in (179)b/(180)b with the
DR/Coref interpretation.
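With multiple instantiating sentences per schema, the pass condition combines an "always reject" requirement with an "at least sometimes accept" requirement. A minimal sketch, assuming each participant's answers are stored as one boolean per item (True for acceptance); the function name and data layout are hypothetical, not Plesniak (2022b)'s.

```python
from typing import Sequence


def passes_quirky_test(star_accepts: Sequence[bool],
                       ok_accepts: Sequence[bool]) -> bool:
    """Return True if the participant lands inside the DR (or Coref) circle.

    star_accepts: acceptance of the DR/Coref reading on each *Schema item,
                  e.g., items of the form (179)a or (180)a.
    ok_accepts:   acceptance of the DR/Coref reading on each okSchema item,
                  e.g., items of the form (179)b or (180)b.
    """
    always_rejects_star = not any(star_accepts)  # no *Schema item accepted
    sometimes_accepts_ok = any(ok_accepts)       # at least one okSchema item accepted
    return always_rejects_star and sometimes_accepts_ok
```

The asymmetry between the two requirements is deliberate in this sketch: a single acceptance of a *Schema item is enough to fail, while a single acceptance of an okSchema item is enough to satisfy the second condition.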
The results of this classification are as summarized below:
(181)
As with the previous experiments, the individuals who pass all tests, i.e., those in the center,
are represented only by green or yellow dots; that is, they reject all instances of (176)a, and most of
them do accept some instances of (176)b. All red dots, weak-crossover-accepters, are outside the
center, indicating that there are clear sources for this acceptance, either a lack of
attentiveness/comprehension or diagnosed quirky effects.
3.2.6 Plesniak (2022b)’s “Bad Hypotheses” Experiment
Given that (181) is essentially the same graph we have seen repeated multiple times in the
past sub-sections, I will not discuss it in further detail; the results support the basic hypotheses, so
purported c-command effects can indeed not be reduced to theta-role effects. The “essentially the
same graph” point, however, is yet another one of Plesniak (2022b)’s concerns. Throughout the
experiments reported in this section, graphs of this sort have been considered successful replications
of the predictions. Crucially, the fact that, in the central intersection, there are some green dots but
no red dots, i.e., no individual who passes the tests for attentiveness and lack of quirky effects
accepts BVA in weak crossover configurations, while many of those who do pass these tests show
clear, c-command-based patterns, has been taken to indicate support for the hypotheses under
consideration. The key assumption here is that these results are obtaining because those hypotheses
are correct. That this assumption is correct is required for such results to be significant; if we got this
sort of “some greens no reds” result regardless of whether the hypotheses in question were correct,
then the result would have no significance. It is imaginable, even if improbable, that such a result
might obtain just because of some aspect of the experimental design itself, not related to the
hypotheses being tested. It is unclear exactly how such a thing would occur, but rather than consider
hypotheticals, Plesniak (2022b) opts to simply investigate this issue directly.
As a result, Plesniak 2022b’s second experiment essentially tests a set of “incorrect”
structural hypotheses. The intention is to show that the “some greens no reds” result does not
obtain under these conditions. Specifically, Plesniak (2022b) adopts roughly the hypotheses in (182),
which yields the predictions in (183):
(182) a. In sentences of the form SVO, S c-commands O and not vice versa.
b. In sentences of the form OSV, O c-commands S and not vice versa.
(183) a. BVA(S, X, Y) may be possible in sentences of the form [X V Y’s N], even if *B(S, X, Y).
b. BVA(S, X, Y) is only possible in sentences of the form [Y’s N, X V] if okB(S, X, Y).
That is, “reconstruction effects” can only occur due to quirky effects, as subjects do not c-
command topicalized objects in the way they do non-topicalized objects
142
. We have seen ample
evidence throughout both Chapter 2 and the other experiments discussed in this section that this is
untrue; reconstruction effects have long been held to be structural in nature, not quirky-based, and
we have seen them obtaining repeatedly in experiments for individuals for whom quirky effects are
not diagnosed. As such, if we adopt these hypotheses, our experimental results should not turn out
like those of the previous experiments discussed; they should fail to support these wrong hypotheses
and would ideally give us some clue that we are “wrong” about something.
To test whether this indeed happens, Plesniak (2022b) essentially re-performs the first
experiment, switching *Schema and okSchema to the following:
(184) Plesniak (2022b) (Experiment 2): English Schemata
a. *Schema: (to) Y’s N, X V
b. okSchema: X V to Y’s N
(185) Examples of BVA Sentences
a. *Schema: To his student, every teacher spoke.
b. okSchema: Every teacher spoke to his student.
143
(186) Examples of DR Sentences
a. *Schema: To three students, every teacher spoke.
b. okSchema: Every teacher spoke to three students.
(187) Examples of Coref Sentences
a. *Schema: To his student, John spoke.
b. okSchema: John spoke to his student.
142
This is different from Ueyama (1998)’s proposal, which holds that OSV sentences are ambiguous between structures
where S c-commands O and where S does not c-command O, the latter being called “Deep OS”. In that case, S still can
be understood as c-commanding O in OSV, it just need not be. Plesniak 2022b, on the other hand, is deliberately adopting
an incorrect hypothesis that S can never be understood to c-command O in OSV, a much more extreme version of
Ueyama’s proposal.
143
One might think that this could be a way to test for the effects of A(S, X, Y), as the schemata are minimally different,
with the *Schema having Y before X and the okSchema having X before Y. However, C(S, X, Y) has not been controlled,
so no clear prediction is made. We will be coming to how such a test may be made later in this chapter.
Despite this superficially small change, the results that obtain in the experiment are qualitatively
different:
(188)
As can be seen in (188), the result is not, as in previous experiments, “some greens but no reds
in the center”; rather, there are no dots in the center at all. Interpretationally, no individual is such
that the individual both passes the attentiveness/comprehension checks and tests negative for quirky
effects. It is in these quirky tests that the issue lies. Recall that, for a dot to be inside a given DR or
Coref circle, the individual it represents must always reject the DR/Coref in the *Schema while at
least sometimes accepting DR/Coref in the okSchema. In this experiment, that would mean
consistently rejecting OSV reconstruction DR/Coref while at least sometimes accepting regular
SVO DR/Coref. We can see that some “attentive” individuals do have this pattern for DR, a
mixture of greens, yellows, and reds, but no attentive individuals have this pattern for Coref. That is,
no “attentive” individual consistently rejects readings Coref(S, John, his) in sentences like (187)a
while accepting such readings in sentences like (187)b. This indeed makes a great deal of sense if we
consider the correct hypotheses, namely that ‘John’ does
144
c-command ‘his’ in (187)a; no quirky
effect is required to get the relevant interpretation, it comes “for free” via c-command, and thus
there is little reason for individuals to systematically reject this type of interpretation.
What we see in (188) is effectively a “neutral result”; no red dots make it into the center, so
we cannot say the prediction has failed per se, but no green (or even yellow) dots make it there
either, so we do not get the same sort of positive evidence that occurred in previous experiments.
Reanalyzing Hoji’s Kyudai experiment data in this way, Plesniak (2022b) notes that this same basic
result obtains; no green dot is ever found in the center. For almost all choices of X and Y, no
individual at all reaches the center, but in the rare cases that one does, that individual’s
corresponding dot is always red, providing active disconfirmation of the hypotheses at hand.
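The way such diagrams are read can be sketched as a short procedure. The following is only an illustrative sketch, not code used in Plesniak (2022b): the class, field, and function names are hypothetical, and treating a center containing only yellow dots as “neutral” is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Individual:
    color: str       # BVA classification: "red", "green", or "yellow"
    attentive: bool  # passed the attentiveness/comprehension checks
    in_dr: bool      # inside the DR circle (no DR-diagnosed quirky effects)
    in_coref: bool   # inside the Coref circle (no Coref-diagnosed quirky effects)


def in_center(person: Individual) -> bool:
    # The center is the intersection of all three circles.
    return person.attentive and person.in_dr and person.in_coref


def read_result(people: Iterable[Individual]) -> str:
    center = [p.color for p in people if in_center(p)]
    if "red" in center:
        return "disconfirmed"  # a red dot in the center falsifies the prediction
    if "green" in center:
        return "supported"     # some greens and no reds in the center
    return "neutral"           # assumption: empty or yellow-only center
```

Under this sketch, the result in (188) would come out as “neutral”, since no individual reaches the center at all.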
Such results assuage the concerns that something about the experimental design itself, rather
than the correctness of the hypotheses, was behind the “supportive” looking results that have
obtained in previous experiments. Plesniak (2022b)’s evidence all indicates the contrary; across
multiple experiments, the correlational methodology is sensitive to the correctness of the
hypotheses, such that while it will not always falsify incorrect hypotheses (which would require a red
dot in the center), it never actively supports them (which would require greens in the center, along with
no reds). As such, the fact that it has multiple times provided support for the claim that BVA(S, X,
Y) in weak crossover configurations is due only to quirky effects can be considered a reliable test of
the hypotheses, not a mere methodological fluke
145
.
144
Or, considering Deep OS again, at least “can”.
145
A similar type of investigation is performed in Hoji 2022c, where the absence of various non-predicted types of
entailments/correlations between different MR’s is demonstrated, lending further significance to the fact that the predicted
entailments do occur as predicted.
3.2.7 Plesniak (2022b)’s Spec-Binding Experiment
The final point of concern Plesniak (2022b) seeks to address is that all previous results
described in this section attained using the correlational methodology have focused on the same
basic *Schema, namely a basic weak crossover configuration. Though this methodology is predicted
to work regardless of what the *Schema in question is, this is a prediction one naturally would want
to test. To do this, Plesniak (2022b) turns to cases of possessor/spec-binding, as has been
extensively discussed in Section 2.5. To briefly review, using the hypotheses and deductions laid out
in Sub-Section 3.1.3, we can see that, in sentences of the form in (189), X should not c-command Y.
(189) X’s N V Y’s N.
Because X is embedded in a larger phrase, X’s N, it does not c-command “out of” that
phrase according to the definition of c-command we have adopted (contra Kayne (1994) and others
discussed in Section 2.5). As previously noted, however, there is a problem with making use of such
sentences, namely that X still precedes Y, so we do not have *A(S, X, Y). If such precedence has not
been controlled, the availability of BVA(S, X, Y) is not predicted to require either X to c-command
Y in S or quirky effects, and thus, we cannot apply the correlational methodology to such
sentences
146
. A straightforward solution, also as previously discussed, is to topicalize the Y-containing
phrase, which removes the precedence but should not alter the fact that X does not
c-command Y. Doing so, we obtain the *Schema used for this third and final experiment of Plesniak
2022b:
(190) a. *Schema: (to) Y’s N, X’s N V.
b. BVA Example: To his student, every teacher’s colleague spoke.
c. DR Example: To three students, every teacher’s colleague spoke.
d. Coref Example: To his student, John’s colleague spoke.
146
Both Hoji and I are actively working to develop a method that works under such circumstances, but we have not yet
settled on a way that can be reliably deployed in SCI-experiments. What would be required would be to find X and Y such
that, for the individual in question, BVA(S, X, Y) could not be based on A(S, X, Y), as diagnosed via correlations with
their other judgements.
The question then is what can serve as an adequate okSchema counterpart to this *Schema.
There are many conceivable options, each with different strengths and drawbacks, and for the
experiments conducted on spec-binding for this dissertation, different options are in fact used
simultaneously. As such, let me defer discussion of the options to the discussion of experimental
design in the next section, and simply state here that Plesniak (2022b) chooses to match once again
in terms of (rough) theta-roles, rendering the okSchema as below:
(191) a. okSchema: X has a(n) N who V (to) Y’s N.
b. BVA Example: Every teacher has a colleague who spoke to his student.
c. DR Example: Every teacher has a colleague who spoke to three students.
d. Coref Example: John has a colleague who spoke to his student.
We can see that the sentences in (190) and (191) roughly “mean” the same thing, modulo
MR interpretations, yet in (191), the structural hypotheses and deductions in 3.1.3 tell us that X does
c-command Y in such sentences, contrary to (190). As such, the three “colors” of individuals to
appear in the results diagram will be as follows:
(192) a. Red: Accepts BVA with at least one sentence like (190)a
b. Green: Never accepts BVA with sentences like (190)a but does so with some sentences
like (191)b
c. Yellow: Otherwise
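The classification in (192) can be written out as a small decision procedure. This is a minimal sketch for illustration only, assuming each individual's BVA judgements are available as lists of accept/reject values; the function names and the data encoding are hypothetical and are not taken from Plesniak (2022b).

```python
from typing import Iterable


def rejects_star_accepts_ok(star: Iterable[bool], ok: Iterable[bool]) -> bool:
    """True if the MR is never accepted in the *Schema but is accepted
    at least once in the okSchema (True = accepted)."""
    return not any(star) and any(ok)


def classify_bva(star: Iterable[bool], ok: Iterable[bool]) -> str:
    """Assign the color in (192) from one individual's BVA judgements."""
    star, ok = list(star), list(ok)
    if any(star):
        return "red"     # accepts BVA in at least one *Schema sentence
    if rejects_star_accepts_ok(star, ok):
        return "green"   # never in the *Schema, sometimes in the okSchema
    return "yellow"      # otherwise
```

For instance, `classify_bva([False, False], [True, False])` returns `"green"`. The helper `rejects_star_accepts_ok` has the same shape as the condition an individual must meet on the DR and Coref items to count as free of diagnosed quirky effects.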
Likewise, to pass the relevant DR and Coref tests for lack of quirky effects, individuals
have to consistently reject sentences like (190)c and (190)d while at least sometimes accepting
sentences like (191)c and (191)d. Classifying according to these parameters and the usual
attentiveness/comprehension sub-experiments, Plesniak (2022b) finds a familiar result:
(193)
As is also the case for all the weak crossover experiments reviewed in this Section, Plesniak
2022b’s spec-binding experiment finds the characteristic “some greens and no reds in the center”
result. Interpretationally, all individuals who accept (topicalized) spec-binding do so as a result of
either issues related to taking the experiment (inattentiveness, etc.) or diagnosed quirky effects.
Indeed, as can be seen from the diagram, reference to quirky effects alone is sufficient; attentiveness
is superfluous, though individuals whose judgements correspond to the predictions but are
inattentive perhaps should not be considered to provide particularly strong evidence for the
predictions. Individuals who lack such confounding factors never accept spec-binding and, in this
case universally, do accept BVA from subjects into objects in “theta-role equivalent” sentences,
further supporting the notion that the rejection of spec-binding sentences for such individuals is due
to the lack of proper c-command relations
147
.
Not only does this result establish that the correlational methodology can be applied
successfully beyond investigations of the original weak-crossover configurations, it also has
significant implications for the status of possessor binding in syntactic theory. Namely, this result
supports the claim I made at several points in Chapter 2 that possessor binding is not a particularly
exceptional case of BVA. It arises due to either precedence or quirky effects, which are not specific
to possessive structures but affect BVA regardless of structural configuration. There is thus no need
to revise structural theories to accommodate such acceptances, nor does the fact that possessor
binding is sometimes accepted imply (as Barker 2012 claims) that c-command is not relevant for
BVA. Rather, once careful control of both precedence and quirky effects is achieved, it becomes
clear that c-command is indeed very relevant, and possessors cannot enter into BVA relations with
elements outside of the broader possessive phrase that contains them precisely because they do not
c-command out of that phrase. In this sense, Plesniak 2022b’s third experiment is an instance of
using the correlational methodology not merely to account for variation, but also to learn new things
about the nature of the CS and language faculty, showing the correlational methodology’s utility in
making new contributions to long-debated issues in syntactic theory.
147
Note here that one could argue that such individuals simply do not allow for reconstruction effects; in essence, the
claim would be that these are the individuals for whom topicalizing Y/the phrase containing Y prevents participation in
MR(S, X, Y). The experiments performed for this dissertation will disambiguate this possibility, as there, we can see that
many of the individuals in question do accept “regular” reconstruction (when X is a subject), but not “spec-binding”
reconstruction (when X is a possessor in the subject).
3.2.8 Looking Ahead
The experiments discussed in this section provide initial demonstrations of various aspects
of the correlational methodology. In particular, Hoji’s “Kyudai” results represent the first
application of the methodology to a large-scale experiment involving multiple non-specialists, while
Plesniak 2022a replicates these results from Japanese in English, which is followed up on by Plesniak
2022b, where the results are established more rigorously and various “what if” scenarios are
investigated. Plesniak 2022b is also significant for us in that it makes use, at least in some way, of all
the different constructions discussed in Sub-Section 3.1.3, which are the constructions that will form
the basis of the experiments performed for this dissertation (or at least, the ones in English).
As noted in several places, however, the dissertation experiments will expand on this basis in
several ways, including far more thorough investigation of the possible permutations of these
constructions and their potential relationships with one another in each individual’s judgements.
One specific aspect of these permutations that especially pertains to the discussion in this section is
that, as noted in various places, Hoji’s Kyudai experiments and Plesniak 2022a’s experiment all use
okSchemata that are matched in precedence with their corresponding *Schemata, whereas Plesniak
2022b matches by theta-role; the experiments in this dissertation do both at once, as well as
separately. No unexpected result obtains because of this, which is good news for previous
experiments as it shows they were not missing some “confound”. Further, it allows us to get a much
more fine-grained picture of what is going on in the judgements of each individual, enhancing the
overall significance of the results.
This dissertation seeks to build on previous results in other ways as well. Most obviously, it
seeks to expand testing coverage to yet more languages, obtaining results not only from English but
also Korean and Mandarin Chinese. Not only does this provide more varied support for the relevant
hypotheses, but it also allows for a stronger argument as to the “universality” of the results. The
three languages are typologically quite different from one another and thus the fact that the same
basic results obtain makes it hard to advance the argument that some specific property of these
languages in particular is responsible for said results.
Further, while Plesniak 2022b’s experiments cover a wider range of constructions than
previous experiments by making use of more choices of X of BVA(S, X, Y) compared to Plesniak
2022a, these are all “siloed” in the sense that each participant only made use of one choice of X.
This state of affairs contrasts with the Kyudai experiments, where each participant made use of
multiple choices of X and Y. This dissertation brings the situation much closer to that of the Kyudai
experiments by including multiple X-Y pairs, each to be judged by all the participants in the relevant
language. As with the Kyudai data, we will see that the results nevertheless converge across X-Y
pairs, further demonstrating the independence of the relevant predictions from any particular choice
of X and Y.
Overall, as we will be seeing shortly, this dissertation employs a rather different
implementation of the basic methodology, at least in terms of the experience participants had while
doing the experiment. I will delay discussion of this different implementation until the next section,
but two points can be noted for now: first, the fact that these results do in fact converge with
previous results provides evidence that the basic methodology is robust, regardless of the particular
way it is implemented. Second, as discussed at various points in this section, previous experiments
provided results where many individuals were excluded from key considerations. Graphically, these
individuals were displayed as dots outside the central intersection of the relevant Venn diagrams. As
discussed, it is not the case that previous work has ignored these individuals, and their
judgement patterns have been consistent with, and in many cases predicted by, the hypotheses at
hand. However, it is nevertheless true that many of these individuals never showed the pattern of
c-command-based asymmetries predicted by the hypotheses laid out near the beginning of this
chapter. As a result, if we think in terms of the three goals laid out in Section 1.3, these individuals
do indeed help in establishing the ABC-BVA law, but they do not help us demonstrate the crucial
role that structure plays within that law. Naturally, the more individuals this can be demonstrated
with, the more confident we can be in the correctness of our basic hypotheses, especially in the
proportional sense; our hypotheses are “universal”, so if we can reproduce their most crucial
predictions only in a small subset of participants, it creates an “unseemly” gap between hypotheses
and observed results, which would ideally be rectifiable.
The experiments conducted for this dissertation seek to do just that, with a design focused
on maximizing our ability to meaningfully include a given individual’s data in as many relevant
analyses as possible and to test as many possible predictions with each individual as time, money,
and other relevant resources will allow. I hold this represents not so much a deviation from previous
approaches as a natural development from them, but regardless, there are many key points of both
similarity and difference between the two styles of experimentation.
3.3 The Experimental Template
3.3.1 Introduction
In this section, I discuss the general format of the experiments conducted in this
dissertation, including various implementational details and the reasoning behind them. Of course,
each of the three languages covered by the experiments will require slightly different experimental
implementations; at the very least, the words used must change to reflect the language in question,
but there are less obvious changes as well. While I will note such differences in the relevant chapters,
it is nevertheless true that the overall format remained the same across all three languages, and it is
that overall format that will be discussed here.
As noted at the end of the previous section, these experiments differ from those reviewed in
that section in several ways, including the use of more sentence patterns, more choices of X and Y
of BVA(S, X, Y), at least as compared to the experiments done in English, and combining what had
previously been different experiments on multiple different sets of individuals into one experiment
carried out with all participants. There is also a shift from a focus on “quantity” to a focus on
“quality”, though both these experiments and previous experiments do emphasize both. In this case
though, as mentioned previously, rather than elicit data from hundreds of individuals, many of
whose judgements are effectively “filtered out” from the final analysis, these experiments seek to
reproduce the predicted patterns of judgements in as high a percentage of the individuals
participating as possible. As will be shown in Chapters 4-6, this approach is largely successful,
though there is a cost, namely that the experiments are far more intensive, and thus resource constraints
limit the number of participants to a fraction of that in previous experiments. As we will see,
however, the data gathered from the participants is very detailed and wide-reaching, so there is
reason to believe that this cost is worthwhile, at least for the purpose at hand. For now, I will lay out
the format of the experiments mostly without reference to cost-benefit analyses, but I will at times
comment on such aspects; they will receive a more thorough discussion in Chapter 7.
3.3.2 The Basic Procedure
The experiments took the form of approximately hour-long one-on-one “interviews”
conducted with each participant by the experimenter over video-conferencing software
148
.
Participants were shown items they were to judge, which were stored on PowerPoint slides, and
gave their responses orally. These responses were recorded by hand by the experimenter in a
printed-out form and were later digitized for analysis. This is already quite different from previous
experiments, which were performed via digital forms that were filled out by many individuals at
once. This “mass-survey” way of doing things is certainly easier on the experimenter, given that the
participants can take the survey simultaneously and the experimenter does not need to engage with
them directly. For participants, however, the opportunity to interact with the experimenter seemed
to prove useful, as I will discuss shortly.
The specific design of the items judged is discussed slightly later in this section, but we can
note at this point that all items followed the same basic template, as exemplified below:
148
This was necessitated by various constraints, most relevantly the COVID-19 pandemic. As with so many things during
the pandemic, the experiments were carried out on the platform Zoom.
(194)
At the top of each slide was a sentence, e.g., “Every professor spoke to their student” in
(194). This was the sentence to be judged. Beneath each sentence were two pictures, labeled (I) and
(II), which represented different potential interpretations of the sentence. The participants were
asked to read the sentence aloud, then consider (I) and (II), and then decide whether the situation
depicted in either was a possible interpretation of the sentence. This way of conveying
interpretations is yet another departure from the experiments described in Section 3.2, which used
text-based “paraphrases” to convey the desired interpretations. The intention behind this change
was again to make things easier for participants, with the hope being that judging specific “scenes”
might be easier than judging abstract interpretations presented via sentences.
The interpretations provided in (II) corresponded to the MR’s under investigation, Coref,
DR, or BVA, depending on the item in question. In (194), we can see both from the sentence and
the (II) interpretation that this was a BVA interpretation; for each individual A-G (the professors),
there is a corresponding distinct student to whom that professor spoke: A to A’s student, B to B’s
student, etc. On the other hand, the interpretation in (I) was always a non-MR reading, such as the
“referential” reading depicted in (194). In that particular case, there is just one student, who every
professor is speaking to, and thus the reading is not a BVA reading.
The inclusion of this type of item served two purposes: first, it provided a clear alternative to
the (II) interpretation, helping to highlight what the participants were intended to notice about the
(II) interpretation. For example, in (194), the fact that (I) involves only one student being spoken to
serves to highlight the fact that (II) involves multiple students being spoken to, which participants
might overlook given that ‘student’ in the sentence is singular. It may be hard to imagine how
participants could make such a mistake, but, given answers to sub-experiments testing precisely such
issues in previous experiments, it is clear that such mistakes are not uncommon, at least when
experiments are performed using the text-based “mass survey” method that those experiments used.
The second reason for the use of (I)-type interpretations is that judgements on them give insight
into whether a participant who rejects a (II)-type interpretation can assign any interpretation to the
sentence in question. If this is the case, e.g., the individual in question rejects interpretation (II) but
does accept interpretation (I), this demonstrates that the sentence itself is an interpretable string for
the individual, and (II) is just simply not a valid interpretation of that string. If we did not know that
some other interpretation was possible, however, then rejection of (II) might simply indicate that the
string is uninterpretable, which would make their rejection of (II) less significant
149
.
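The second purpose of the (I)-type interpretations amounts to a simple decision rule, sketched below. This is an illustration only, assuming a participant's answer on an item is recorded as whether each of (I) and (II) was accepted; the function name and the wording of the returned labels are hypothetical.

```python
def weight_of_rejecting_II(accepts_I: bool, accepts_II: bool) -> str:
    """Describe what a judgement on one item tells us about reading (II)."""
    if accepts_II:
        return "(II) accepted"
    if accepts_I:
        # The string is interpretable for this individual; (II) is simply
        # not a valid interpretation of it, so the rejection is significant.
        return "(II) rejected, string interpretable"
    # No reading was accepted, so the string itself may be uninterpretable
    # and the rejection of (II) is less significant.
    return "(II) rejected, interpretability unknown"
```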
Note, however, that this is not a “forced choice” experiment, and participants were not
being asked to contrast the two interpretations. Rather, participants had four response options:
setting aside the exact wordings, they could choose “(I) only”, “(II) only”, “both”, or “neither”
150
.
Further, for the purposes of double-checking for string (un)interpretability, if “neither” was chosen,
the experimenter asked a follow-up question as to whether the participant could think of any
interpretation for the sentence, or whether the sentence was simply uninterpretable, the response to
which was also recorded.
149
Indeed, both Hoji’s Kyudai experiments and Plesniak (2022b)’s experiments include items which are the sentences of
interest paired with (I)-style (non-MR) interpretations, to check exactly that. This serves this second purpose but does not
so clearly help with the first purpose, i.e., helping participants understand the intended MR interpretation better, the reason
being that, in these past works, the two items are not always juxtaposed. Here, the two items are presented together, allowing
participants to much more clearly “compare” them.
150
A reasonable objection might be that even if participants are told that there is no “forced choice”, they may nevertheless
feel they should be choosing one option or the other. As can be seen from the data provided in the appendix, however,
this is clearly not what happened; the “both” option was chosen with very high frequency in the cases where we would
expect that both options were indeed possible. Indeed, except for very specific cases, participants almost always chose
either “both” or “(I) only”, as expected given that (I)-type readings, having no c-command requirements, should in
principle always be available. As such, there is little evidence to support the notion that the participants felt they had to
choose just the “best” interpretation. Indeed, their verbal comments often indicated that they thought one was much
better than the other, but they were saying “both” anyways because the other one was at least possible, even if unlikely,
exactly as intended.
To avoid biasing the participant in some way, the experimenter spoke only at predetermined
times and then generally only in predetermined ways, for example, in response to the “neither”
response discussed above. Other times included saying things like “yes” or “got it” in response to
hearing the participant’s judgement, as well as asking the participant to clarify if an ambiguous
answer was given, e.g., “(I)”, which could mean “(I) only”, or might mean that the participant had
finished checking (I) and was now moving on to check (II)
151
. The experimenter also answered any
questions the participants asked regarding what they were supposed to be doing.
151
It might also mean that the participant had checked (I), found it to be acceptable, and then forgotten to check (II);
verbal comments in response to such prompts suggested this was often the case, so this procedure was rather important.
One other time when the experimenter spoke, this time more extensively than others, was
during the “tutorials” provided to participants. These tutorials were given near the beginning of the
experiment; they began with a demonstration to the participants of what was meant by judging
whether a sentence could have an interpretation in general, which was followed by three subsequent
tutorials that demonstrated the meaning of each MR in question, namely Coref, DR, and BVA. Each
of these tutorials consisted of three items: one which was intended to be ambiguous between the (I)
and (II) interpretations, one which was intended to have just one interpretation (usually (I)),
and one which was intended to be interpretable as neither (I) nor (II); this helped familiarize
participants both with the MR’s used in the experiment and with their options for answering
the questions.
Participants sometimes had different judgements on these tutorial items than intended; for
example, one of the “general” tutorial items, across all three languages, used sentences following the
basic format in (195) and (196):
(195) General Tutorial Sentences (English)
a. John likes his roommate, and Bill does too.
b. John likes Bill’s roommate, and Bill does too.
c. John likes pizza, and Bill does too.
(196) General Tutorial Interpretations (English)
The expected judgements are that (195)a should be in principle ambiguous between both
readings (Ross (1967)’s “strict” vs. “sloppy” identities in ellipsis). (195)b, on the other hand, because
the roommate is not referred to as ‘his roommate’ but ‘Bill’s roommate’ is expected to lack a
“sloppy” reading, allowing only the interpretation in (II). (195)c is simply not about roommates at
all, and so neither (I) nor (II) is expected to be acceptable. Despite these being the clear consensus
judgements in relevant works on the topic, a surprisingly persistent minority of participants still
found (195)b ambiguous between (I) and (II), across all three languages investigated, even after the
experimenter asked various clarification questions to ensure that was indeed their judgement
152
.
Indeed, one individual even suggested that (195)c would be acceptable with at least one of (I) or (II)
on the assumption that someone has a roommate named “Pizza”
153
.
In the case of any such deviation from expectations, the experimenter asked participants to
elaborate on how they felt the given interpretation was/was not acceptable. So long as the
participant gave an answer that suggested they understood the original question correctly, no issue
was raised; in the case where a participant did not accept something the experimenter had intended
them to, sometimes an “improved” version of the sentence was offered, e.g., adding a helping word
like ‘each’ in the appropriate place to facilitate DR; this usually led to the participant accepting the
intended judgements, or at least recognizing how someone else might accept them. On the other
hand, in the case that a participant’s answers to the tutorial questions were consistently not as
expected, and that participant could not explain their judgements in a way that suggested they had
understood the question, then the experimenter made a note that this participant’s data should not
be considered (necessarily) reliable
154
. In that case, the experiment nevertheless would proceed, and
the participant’s full data would still be recorded. This happened only once, in the English
experiment, and is discussed in Chapter 4; it ended up making little difference for the final analysis
whether that participant was included or excluded, though there is a small degree of evidence that
indeed, that individual was potentially not understanding everything in quite the same way as the
other participants.
152
I think I see what these participants had in mind; it is something like “John likes the other person (=Bill)’s roommate,
and Bill does too”, meaning that Bill likes the other person’s roommate, which is John’s roommate. Essentially, the name
‘Bill’ is actually a disguised reference to “the other individual”. Under such an understanding, I also find (I) acceptable.
153
This occurred during the Mandarin Chinese experiment, so: (a) “pizza” was conveyed via Chinese characters, and thus there was
no lower-case initial letter indicating that it was not a name, and (b) because Mandarin Chinese is somewhat less “strict”
about what can serve as a first name than standard English is (names based on common nouns being not infrequent in
China), this is not quite as absurd a possibility as it would be for the average American to bear the name Pizza (though
even that is of course possible). Nevertheless, given that only one individual (out of roughly a dozen individuals who took
the Mandarin Chinese experiment) thought of this possibility, we can assume this is not a thought that is likely to occur to
most people.
154
Note that the decision to mark the individual in question as not reliable is made before any of that individual’s BVA
judgements on the actual sentences of interest are recorded. There is thus no way that such a decision can be used to exclude
individuals because the experimenter considers their judgements on crucial sentences to be “inconvenient” in some way.
The fact that the interviewer was present to answer questions and provided participants
with tutorials essentially obviated the need for any filtering “sub-experiments” in the
style of Hoji (2015). This provides yet another departure from the experiments discussed in Section
3.2, where such sub-experiments were an integral part of classifying participants for the purposes of
analysis. The risk of individuals either not paying attention or not understanding the task correctly is
naturally much higher in an automated survey than a one-on-one “interview”. In the case of these
experiments, the tutorial items, along with the experimenter’s feedback on them, essentially fulfilled
the role of ensuring that the participants understood the questions, and as mentioned, could also be
used if necessary to detect participants who were likely not understanding. Additionally, as noted
above, the use of pictures of specific scenarios, instead of the general verbal descriptions of the
intended MR used in previous experiments, does indeed seem to have made it easier for participants
to understand what they were being asked to judge
155
. It is difficult to assess which of these changes
were the most impactful, but as will be shown in subsequent chapters, the combined result of these
changes is far less “filtering” of participants than was required previously. To be clear, while this in
no way invalidates previous results, as discussed, it is generally true that being able to meaningfully
replicate predictions across all individuals is the ideal result for any SCI-experiment, and these
experiments represent a significant methodological attempt to move in that direction.
155
See, however, discussion in Plesniak (2022c), where pros and cons of both methods are discussed, with the primary
con of the visual method being its loss of generality; given that results found here converge with the previous experiments,
however, there does not seem to have been any great qualitative impact in this particular case, though of course it is a
relevant concern for future experiments.
3.3.3 Displaying MRs
Having summarized the basic experimental procedures, we can now address the particulars
of how the various interpretations were displayed. We have already seen in (194) an example of a
BVA item, with two interpretations, “referential” and BVA. As noted, the key difference is that in
the former, all the relevant individuals have arrows going from them to the same individual, whereas
in the latter, all the relevant individuals have arrows going from them to different individuals.
Considering X and Y of BVA(S, X, Y), we can note that the choice of Y does not really change the
“meaning” of the interpretation very much, as Y is being “interpreted” according to X. The choice
of X, however, does change the meaning rather significantly. For example, ‘every professor’ was one
choice of X, but there were other, more numerically intricate choices of X as well. For example, one
choice of X for the English experiment was ‘every professor but one’. This was displayed as in (197):
(197)
To represent the “less than every” property of ‘every professor but one’, one individual of
A-G is displayed with no lines at all, specifically G, and correspondingly, in (II), there is one
individual, G’s student, who has no line drawn to them. In cases that the quantifier chosen called for
even fewer individuals, e.g., “more than one professor”, the solution was to delete even more lines
until the picture became (a) accurate to the sentence, and (b) distinct from the pictures
corresponding to any other choice of X.
Moving on to other MR’s, DR readings were expressed in a similar manner to BVA readings.
The key difference was that the corresponding pictures involved multiple arrows originating from
each “individual” on the left, as can be seen in (198):
(198)
In this case, once again, (I) is the “referential”, non-DR
156
, reading, where there are really
only two students total, who were spoken to by each of the professors (or, in the case of quantifiers
156
One may say that (I) might also be a case of a DR or DR-like reading where the two students for each professor just
happen to be the same students as for each other professor. This is not a concern for us, as the reading in (I)’s only purpose
is to show that readings other than (II) are possible if (II) cannot be accepted; we are not actually interested in the non-
DR reading itself. If DR readings are acceptable with a given sentence for a given individual, then presumably (II) would
be acceptable, so if an individual rejects (II) and accepts (I), then we have evidence that DR was not possible for that
individual, which is all we are after here.
other than ‘every’, some subset of the professors), represented by two lines coming from each
professor and converging on the same two students. In (II), the DR reading, each professor still has
two lines coming from them, but these now all point to unique students, two per each professor.
Further, while in the BVA case, the individuals on the right were labeled according to the individuals
on the left, e.g., an arrow from “A” to “A’s student”, in the DR cases, no such labelling occurs, but
instead, the individuals on both sides are assigned numbers. This is consistent with the difference
between BVA and DR, given that in DR, there is no sense that Y of DR(S, X, Y) takes on “the
identity” of the member of X, which it does in BVA by definition. That is, the ‘two students’ in
DR(S, every professor, two students), are not necessarily the two students of each professor in
question, just a set of two students per each professor, regardless of whose students they are.
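The contrast between pictures (I) and (II) can be stated as a property of the arrows alone. The following is a minimal sketch, assuming a hypothetical encoding in which each individual on the left is mapped to the set of individuals on the right that their arrows point to; the function names and the labels used for the individuals are illustrative and are not taken from the experiment materials.

```python
from itertools import combinations

def is_shared_reading(picture):
    """Picture (I): every individual with arrows points to the same target(s)."""
    targets = [t for t in picture.values() if t]
    return len(targets) > 0 and all(t == targets[0] for t in targets)

def is_distributed_reading(picture):
    """Picture (II): no two individuals with arrows share any target."""
    targets = [t for t in picture.values() if t]
    return len(targets) > 1 and all(
        a.isdisjoint(b) for a, b in combinations(targets, 2)
    )

professors = "ABCDEFG"

# BVA with X = 'every professor but one': G has no arrow at all.
# "shared student" is a placeholder label for the single referent in (I).
bva_I = {p: frozenset() if p == "G" else frozenset({"shared student"})
         for p in professors}
bva_II = {p: frozenset() if p == "G" else frozenset({f"{p}'s student"})
          for p in professors}

# DR with Y = 'two students': two arrows per professor, numbered individuals.
dr_I = {n: frozenset({1, 2}) for n in (1, 2, 3)}
dr_II = {n: frozenset({2 * n - 1, 2 * n}) for n in (1, 2, 3)}
```

Under this encoding, deleting lines for quantifiers such as ‘every professor but one’ amounts to giving some individuals an empty set of targets, which both checks ignore.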
Coref interpretations were displayed somewhat differently, due to the fact that they do not
involve multiple individuals. An example is given below in (199):
(199)
Here, (I) and (II) look quite similar, each having just one individual on each side of the line
with an arrow between them. The distinction between them is simply a matter of the identity of the
individual on the right, whether they are “someone else’s student”, as in the non-coreferential (I), or
are the student of the individual on the left, the Coref reading. Because the X of Coref(S, X, Y) is
always singular in these experiments, to avoid potential agreement issues with the singular Y, these
diagrams always featured this “one on each side” design and did not have to change to
accommodate different quantities in the way the BVA and DR diagrams did.
3.3.4 Schemata Used
As noted in Section 3.2, there are multiple ways to match *Schemata to corresponding
okSchemata, with the point of bifurcation in past works being whether to match in terms of linear
precedence of X and Y or whether to match in terms of theta-roles. In these experiments, both are
done, separately and together. As a result, for each of our *Schemata, there are multiple
corresponding okSchemata, and vice versa. To see how this is achieved, let us consider some of the
basic sentence patterns described in Sub-Section 3.1.3, the most basic of which is probably the
simple SVO pattern (I will use English here, but an equivalent procedure can be done for all
languages and will be reviewed in the relevant chapters). Let us begin with a very straightforward
okSchema, where X is the S of an SVO sentence and Y is the possessor in the O:
(200) X V Y’s N.
We expect by our basic structural hypotheses and the ABC-BVA law that BVA(S, X, Y) will
be, in principle, possible via C(S, X, Y) in sentences instantiating the pattern in (200), given that the
subject, X, c-commands the object, which contains Y. At the same time, BVA(S, X, Y) will also be,
in principle, possible via A(S, X, Y), given that X precedes Y in S (and via B(S, X, Y) too, but that
will not be relevant here). Let us set aside the issue of precedence briefly and focus on theta-roles.
We can ask, what type of sentence is such that X and Y’s N get the same theta-roles as in (200) but
X does not c-command Y’s N? We have already seen something like this in discussion of Plesniak
(2022b), namely, the resulting sentence after passivization:
(201) Y’s N was
157
V by X.
The difference here is that, in Plesniak 2022b, the SVO sentence was the *Schema and the
passive was its corresponding okSchema, but here this is reversed; in sentences corresponding to
(201), X is hypothesized to not c-command Y, and X also does not precede Y, so it is itself a passive
*Schema, a sentence-type unexplored by the previous experiments described in Section 3.2.
(200) is an okSchema that matches (201) in terms of theta roles, but as can be seen, not the
linear order of X and Y. As we have seen numerous times at this point, this issue can be eliminated
by displacing Y’s N to the front of (200), yielding:
(202) Y’s N, X V.
In sentences corresponding to (202), X is still hypothesized to c-command Y, but does not
precede it, thus permitting BVA(S, X, Y) only via C(S, X, Y) according to the ABC-BVA law.
(That is, assuming there are no quirky effects; we will return to this issue later in this sub-section
when we discuss DR and Coref). As such, we now have three types of sentence, all matched in theta
roles: (200), where X both precedes and c-commands Y, (201), where X neither precedes nor c-
157
All sentences in these experiments are given in the past tense or its equivalent in the relevant language, so we should
technically be writing something like “V-ed” here instead of “V”, but I will suppress this issue for the sake of not cluttering
up the schemata unnecessarily. Likewise, I will ignore issues like ‘was’ vs. ‘were’ and ‘a’ vs. ‘an’, as these are essentially
deterministic given whatever elements are put in for the various variables.
commands Y, and (202), where X does not precede Y but does c-command it. There is one more
combination possible here, which is where X precedes Y and does not c-command it. This has been
largely ignored in previous similar experiments, because their focus was demonstrating various
things about c-command. Our focus here, however, is on the entirety of the ABC-BVA law, even if c-
command is the part of that law which we most care about. We therefore ideally also want to be able
to demonstrate the effects of precedence alone, when the effects of c-command are absent. To that end,
we can indeed construct such a fourth sentence-type, making use of the same displacement strategy
as before, now applied to (201):
(203) By X, Y’s N was V.
158
Now we have a complete paradigm of four sentence types, all matched in terms of theta-
roles, such that they manifest all possible combinations of the ok/*A(S, X, Y) and ok/*C(S, X, Y).
For the sake of summary, I represent this in (204) below (suppressing issues of where ‘to’ might go
if the verb requires it):
158
Note that, even under an analysis where movement happens in the syntactic structure itself or we have Ueyama (1998)’s
base-generated Deep OS, X would still not be predicted to c-command Y here, as it is embedded in the larger phrase ‘by
X’. The same argument would hold for Mandarin Chinese, but not as clearly for Korean, to be discussed in the relevant
chapters. One could, however, argue that the ‘by’ or ‘by’-like element in these languages is “transparent” for c-command,
as Li (1985, 1990) does for Chinese bèi. This would involve enriching our definition of c-command somewhat, but there
would be clear testable consequences that could help us distinguish the relevant possibilities. Checking such issues is
beyond the scope of this dissertation, though it is a “low-hanging fruit” for future research. Let me reiterate though that
such issues are not really problematic for us; if the ‘by’-transparency accounts are correct, then some of the effects I am
attributing to precedence are actually attributable to c-command. Given that our main desire is to demonstrate the
relevance of c-command to BVA, this would in fact be excellent news for us; c-command plays an even greater role than
we thought! As I have stated elsewhere, the current formulation of the hypotheses represents the most conservative and
minimal role one can assign c-command and still arrive at the correct predictions. The point is that even under these
assumptions, reference to c-command is still necessary for accurately predicting judgements. If it turns out more “liberal”
theories of c-command are correct, all the better for us. This applies not only to transparency-based theories, but also
those that might try to explain the patterns of contrast via reference to different movement types (e.g., A vs. A-bar, etc.).
Crucially, such analysis already assumes the relevance of structure for BVA, which is what we are trying to demonstrate.
The only problematic account would be one that relies only on things like precedence and semantic/pragmatic factors like
quirky effects, bypassing structure entirely. My intent is to show that the data do not support such an analysis, at least not
a straightforward one.
(204) Set 1
a. okA, okC: X V Y’s N.
b. *A, *C: Y’s N was V by X.
c. *A, okC: Y’s N, X V.
d. okA, *C: By X, Y’s N was V.
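The 2-by-2 design of Set 1 can be checked mechanically. The sketch below is only illustrative: precedence (okA) is read off the linear order of the schema string, while the c-command status (okC) is simply stipulated from the hypotheses stated in the text rather than computed from any parse. The names `SET_1`, `x_precedes_y`, and `paradigm` are hypothetical.

```python
# Set 1 schemata from (204), paired with the okC status hypothesized in the text.
SET_1 = {
    "a": ("X V Y's N.", True),
    "b": ("Y's N was V by X.", False),
    "c": ("Y's N, X V.", True),
    "d": ("By X, Y's N was V.", False),
}

def x_precedes_y(schema):
    """okA: does the token for X come before the token for Y in the string?"""
    tokens = schema.split()
    x_pos = next(i for i, t in enumerate(tokens) if t.startswith("X"))
    y_pos = next(i for i, t in enumerate(tokens) if t.startswith("Y"))
    return x_pos < y_pos

def paradigm(schemata):
    """Map each (okA, okC) combination to the label of the schema showing it."""
    return {(x_precedes_y(s), ok_c): label
            for label, (s, ok_c) in schemata.items()}

cells = paradigm(SET_1)
# All four (okA, okC) combinations are filled by exactly one schema each.
assert len(cells) == 4
```

The same check applies to any set of four schemata written in this notation, provided the okC values are supplied by hand.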
If we reverse the theta-roles, such that Y’s N becomes the do-er and X becomes the do-ee,
we flip the roles of actives and passives, producing yet another 2-by-2 paradigm:
(205) Set 2
a. okA, okC: X was V by Y’s N.
b. *A, *C: Y’s N V X.
c. *A, okC: By Y’s N, X was V.
d. okA, *C
159
: X, Y’s N V.
This set includes the familiar weak-crossover configuration, (205)b, as well as Plesniak
(2022b)’s passive okSchema for it, (205)a. It now adds the topicalized versions of both of these, (205)c
and (205)d. Likewise, we can create our third and final set of sentences by applying a similar logic to
Plesniak (2022b)’s possessor/spec-binding sentences and their okSchema equivalents, producing:
159
If we adopt either a syntactic movement analysis or Ueyama (1998)’s Deep OS hypothesis (adopted for English) for
“X, Y’s N V” , then this may be considered a case of okC(S, X, Y), as X may c-command Y from its high-up position in
the structure (in the movement case, it will depend on how one treats the interaction of BVA and movement, however, as
this is very much like a wh-weak crossover case (suppressing potential issues of A vs. A’ movement), which many
movement accounts ban through conditions on precisely such interactions). I will not adopt either of these analyses here,
but they are of course strong contenders for possible follow-ups. I will note, however, that (a) none of these alternatives
are such that choosing to believe them will render certain examples contradictory to the ABC-BVA law; they merely shift
when factors are available, namely whether it is just A or A and C together, and (b) there are other cases where there is
certainly just A under either hypothesis, so while having to exclude sentences instantiating this pattern from our analysis of
pure precedence effects will certainly weaken it, as will be seen, it does not live or die by such sentences alone. Additionally,
at least in English, where ‘to’ is used, X will be embedded in ‘to X’, and as such, this issue may not arise (depending on
the sorts of “transparency” issues discussed in the previous footnote).
(206) Set 3
a. okA, okC: X has an N who V Y’s N.
b. okA, *C: X’s N V Y’s N.
160
c. *A, okC: An N who V Y’s N, X has.
161
d. *A, *C: Y’s N, X’s N V.
A close inspection of Set 3 reveals a minor difference relative to the other sets: while (206)a
is the same as Plesniak (2022b)’s relevant okSchema, (206)b is not the same as Plesniak (2022b)’s
relevant *Schema, which is (206)d. As can be seen, (206)b is the “untopicalized” version of
possessor binding, where X does indeed precede Y, so the distribution of *A is somewhat different,
with the topicalized sentence patterns, (206)c-d, both being the ones that lack X preceding Y.
With all twelve sentence patterns taken together, we now have the following configuration
of sentences:
(207) a. Three *Schemata (*A, *C): (204)b, (205)b, and (206)d
b. Three (*A, okC) okSchemata: (204)c, (205)c, and (206)c
c. Three (okA, *C) okSchemata: (204)d, (205)d, and (206)b
d. Three theta-role matching okSchemata per each *Schema (all the others in each set)
As such, there are multiple ways that a rejection of a given *Schema can be considered
significant. I will discuss the significance of such rejections in various ways in the upcoming
chapters, but I will describe the most common type of significance briefly here. First of all, we can
recall that, by design, the members of a given set all share the same basic theta-role-based meaning.
As a consequence, BVA(S, X, Y) constitutes the same basic interpretation for each sentence type in a
160
Here, the concerns about movement/Deep OS discussed in the previous two footnotes do not apply, so this at least is
(I am arguing) a case of pure precedence regardless of what analysis one adopts for the other constructions.
161
Across all three languages, this construction was generally felt to be “awkward” by most individuals. As we will see,
some did not accept it as an interpretable string at all. Most, however, did accept it as such, and their judgments on it were
precisely as expected, so the issue of its awkwardness/marginal status does not seem to have caused any problems here.
Further, its role in the analysis is generally quite minor, as we will be seeing throughout.
given set, e.g., in Set 1, it is an interpretation where for each of the members of X, there is an N that
they V-ed. On the other hand, for every sentence type in Set 2, for example, this is reversed: for
every member of X there is a Y’s N that V-ed them. As such, if an individual rejects BVA(S, X, Y) in
one of the sentences in (207)a (with a particular X and Y) but accepts one of the corresponding
sentences in (207)d, then we know that there is at least no problem with the individual accepting the
basic BVA(S, X, Y) interpretation. That is, the individual can accept BVA(S, X, Y) at least
sometimes when, say, X is the do-er and Y’s N is the do-ee in general, lending significance to the fact
that it is specifically bad when paired with the relevant construction from (207)a.
Once we have established this basic type of significance, we can then ask whether the
rejection has contrasts along one of the predicted dimensions. For example, if the individual in
question does accept BVA(S, X, Y) (again with that particular choice of X and Y) with at least some
sentence or sentences of (207)b, then we know that, for that individual, there is evidence that
rejection of BVA(S, X, Y) in (207)a can be attributed to the lack of X c-commanding Y, as when X
does c-command Y (in (207)b), BVA(S, X, Y) is sometimes acceptable. Likewise, if an individual
accepts BVA(S, X, Y) with at least some sentences of (207)c, then there is evidence that the
rejection of BVA(S, X, Y) in (207)a can be attributed to the lack of X preceding Y. Note that these
sentences need not come from the same sets for the relevant contrast to be established; the sets are
relevant only for determining theta-role-based contrasts.
At this point, these ideas are fairly abstract and may be hard for readers to fully grasp
without seeing actual examples; I will come to these in the following chapters, where, as I stated
above, I will reiterate the relevant reasoning. [I present it here merely to show why the various
sentences were included, namely to provide minimal pairs for contrasts along the three dimensions
mentioned: same theta-roles but different precedence and/or c-command, same precedence but
different c-command, and same c-command but different precedence.] One may rightly point out that the
(a) sentence of each of Sets 1-3 is somewhat redundant, given that it is both okA(S, X, Y) and
okC(S, X, Y), that is, X both precedes and c-commands Y in it. As one may note, however, these (a)
sentences are often the most straightforward or simple ones of their set, and as such, accepting them
can provide a clear case of accepting a theta-role matched okSchema, also important for assessing
significance as described above. In summary:
(208) Significance of rejecting BVA(S, X, Y): a rejection in one of (207)a (X and Y being constant)
can be considered significant if we have at least one of the following:
a. Theta-role matched: (one of) (207)d is accepted with BVA(S, X, Y)
b. Contrast in c-command: (one of) (207)b is accepted with BVA(S, X, Y)
c. Contrast in precedence: (one of) (207)c is accepted with BVA(S, X, Y)
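The conditions in (208) can be read as a simple decision procedure over one individual's judgements. The following is a minimal sketch, assuming judgements for a fixed choice of X and Y are recorded as a set of accepted schema labels such as "204a"; the labels, set names, and the function `significance` are hypothetical conveniences and not part of the actual analysis materials.

```python
# Classification of the twelve schemata, following (207).
STAR = {"204b", "205b", "206d"}        # (207)a: *A, *C
OK_C_ONLY = {"204c", "205c", "206c"}   # (207)b: *A, okC
OK_A_ONLY = {"204d", "205d", "206b"}   # (207)c: okA, *C

def significance(star_schema, accepted):
    """Return which conditions of (208) hold for a rejected *Schema.

    `accepted` holds the labels of the schemata on which the individual
    accepted BVA(S, X, Y), with X and Y held constant.
    """
    if star_schema not in STAR:
        raise ValueError(f"{star_schema} is not a *Schema")
    if star_schema in accepted:
        return set()  # not rejected, so there is nothing to assess
    found = set()
    # (208)a: some other schema of the same set (same theta-roles) is accepted.
    if any(s[:3] == star_schema[:3] for s in accepted):
        found.add("theta-role matched")
    # (208)b and (208)c: the accepted schema may come from any set.
    if accepted & OK_C_ONLY:
        found.add("contrast in c-command")
    if accepted & OK_A_ONLY:
        found.add("contrast in precedence")
    return found

example = significance("204b", {"204a", "205c"})
```

Here the rejection of (204)b counts as significant on two grounds: (204)a is a theta-role matched okSchema, and (205)c, although from a different set, supplies a contrast in c-command.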
3.3.5 Structure of the Experiments
We can now turn to DR and Coref in the experiment, which, as before, play the crucial role
of checking for quirky effects, and thereby determining if we have *B(S, X, Y) or not. In all cases,
these are constructed from their BVA equivalents by either simply substituting a non-
quantificational expression, e.g., ‘that teacher’, for X in the schema, as in Coref, or substituting a
numerical expression, e.g., ‘two students’ for Y’s N in the schema, as in DR. This yields a total of 36
schemata across BVA, DR, and Coref. Moreover, three distinct X-Y pairs are used, meaning that, if
we treat the choice of X and Y as determining its own “sub-schemata” for every schema, there are in
fact a total of 108 schemata.
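The counts of 36 and 108 follow directly from the substitution procedure just described. As a rough sketch, the twelve BVA schemata from (204)-(206) can be turned into Coref and DR schemata by string substitution; the fillers ‘that teacher’ and ‘two students’ are the examples given in the text, while the function names and the placeholder pair labels are assumptions made for illustration.

```python
from itertools import product

BVA_SCHEMATA = [
    # Set 1, (204)
    "X V Y's N.", "Y's N was V by X.", "Y's N, X V.", "By X, Y's N was V.",
    # Set 2, (205)
    "X was V by Y's N.", "Y's N V X.", "By Y's N, X was V.", "X, Y's N V.",
    # Set 3, (206)
    "X has an N who V Y's N.", "X's N V Y's N.",
    "An N who V Y's N, X has.", "Y's N, X's N V.",
]

def to_coref(schema, x_prime="that teacher"):
    """Coref: substitute a non-quantificational expression for X."""
    return schema.replace("X", x_prime)

def to_dr(schema, y_prime="two students"):
    """DR: substitute a numerical expression for "Y's N"."""
    return schema.replace("Y's N", y_prime)

schemata = (
    [("BVA", s) for s in BVA_SCHEMATA]
    + [("Coref", to_coref(s)) for s in BVA_SCHEMATA]
    + [("DR", to_dr(s)) for s in BVA_SCHEMATA]
)

# Placeholder labels for the three X-Y pairs used in the experiment.
XY_PAIRS = ("pair 1", "pair 2", "pair 3")
sub_schemata = list(product(XY_PAIRS, schemata))

n_schemata = len(schemata)          # 12 schemata x 3 MRs
n_sub_schemata = len(sub_schemata)  # x 3 X-Y pairs
```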
Given that we intend non-specialists to be able to finish this experiment in one sitting, this
high number of schemata means we must sacrifice in terms of the number of instantiations of each
schema. This is especially true given that participants are generally giving a much higher degree of
attention to the task than online survey participants would be, and thus take longer to respond. For
this reason, this experiment returns to a degree to the system used in Plesniak 2022a, having only
one instantiation of each sub-schema, though if we combine instantiations that differ only in the
choice of X and Y, this is three instantiations per schema.
Even so, given the twelve tutorial questions, this makes for a total of 120 questions; if we
assume an average time of about half a minute per question
162
, this makes for about an hour of
experiment per participant. That rough hour was divided into three “rounds”, each corresponding
to one choice of X-Y pair. A break of indefinite duration was allowed to the participants in between
these rounds, should the participant opt to take it
163
. Within each round, there were three “parts”,
each focusing on a different MR, first Coref, then DR, and finally BVA. As such, on a given round,
for a specific X-Y pair, a participant judged the availability of Coref(S, X’, Y) for 12 sentences, then
the availability of DR(S, X, Y’) for 12 sentences, and then the availability of BVA(S, X, Y) for 12
sentences (as well as their non-MR counterparts in picture (I)). Note that, within a round, taking a
break would not have been desirable, as even short durations of time can cause a participant’s I-
language to shift such that Hoji’s correlational may no longer apply, so answering the questions in
each round as temporally close together as possible is optimal.
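The question count and the time estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the bookkeeping, using the figures given in the text (three rounds, three parts per round in the order Coref, DR, BVA, twelve sentences per part, twelve tutorial questions, and roughly half a minute per question); the variable names are illustrative.

```python
from itertools import product

ROUNDS = (1, 2, 3)                # one X-Y pair per round
PARTS = ("Coref", "DR", "BVA")    # fixed order within each round
SENTENCES_PER_PART = 12
TUTORIAL_QUESTIONS = 12
SECONDS_PER_QUESTION = 30         # the text's rough estimate

# Every main item, in presentation order: round, then part, then sentence.
main_items = list(product(ROUNDS, PARTS, range(1, SENTENCES_PER_PART + 1)))

total_questions = len(main_items) + TUTORIAL_QUESTIONS
estimated_minutes = total_questions * SECONDS_PER_QUESTION / 60
```

Because `product` varies its last argument fastest, the resulting list keeps all twelve items of a part adjacent and all three parts of a round adjacent, mirroring the requirement that judgements within a round be collected as close together in time as possible.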
These parts were further divided into “sets”, corresponding to the sets schematized in
(204)-(206) (in the order given). This was helpful for participants, given the point observed earlier
that within a set, the potential MR (and non-MR) interpretations are the same, whereas between sets,
they change slightly (do-ers become do-ees, etc.). As such, when judging members of the same set,
the pictures presented as options (I) and (II) did not change, whereas between sets, they did
164
. To
162
There was variation between individuals, but this estimate does not seem to be far off the mark. The average individual
was probably a little faster than this, but not by terribly much.
163
Few did.
164
Changes in sets, parts, and rounds were all overtly cued in the PowerPoint, so participants were certainly aware of them.
take an example, consider the DR part of the third round of the English experiment, where X was ‘more
than one professor’. The sets were as follows:
(209) Set 1 Sentences
a. More than one professor spoke to two students.
b. Two students were spoken to by more than one professor.
c. To two students, more than one professor spoke.
d. By more than one professor, two students were spoken to.
(210) Set 1 Interpretations
(211) Set 2 Sentences
a. More than one professor was spoken to by two students.
b. Two students spoke to more than one professor.
c. By two students, more than one professor was spoken to.
d. To more than one professor, two students spoke.
(212) Set 2 Interpretations
(213) Set 3 Sentences
a. More than one professor has a colleague who spoke to two students.
b. More than one professor’s colleague spoke to two students.
c. A colleague who spoke to two students, more than one professor has.
d. To two students, more than one professor’s colleague spoke.
(214) Set 3 Interpretations
As can be seen, for the sentences within each of the sets, theta-roles do not change, so the
same pictures for (I) and (II) apply equally well to all of them. Between the sets though, theta roles
do change, with ‘more than one professor’ going from the speakers in Set 1 to the ones spoken to in
Set 2, and then to the “haver” of the colleague(s) who spoke in Set 3. As such, the overt division
into sets of the items was assumed to be helpful for participants, as it prevented them from worrying
that a subtle change in the options for (I) and (II) had taken place without them realizing it, such
that they would be making a mistake if they did not recognize it
165
.
165
One may of course question whether this organization of items into sets, or sets into parts, parts into rounds, etc. might
have biased the results in some way or another. In fact, we know that any particular way of configuring things results in
some bias; the more relevant question is whether the biases induced might have affected the judgement process in such a
way that the results might not accurately bear on the hypotheses in question. While I personally do not see how this
configuration might have enhanced or disrupted the correlations between MR’s, I have included all the relevant data in
the appendix, so that anyone who has such a concern may check whether the data seem to have been influenced in such
a way. To provide evidence in support of my view, however, let us note that, while the data do precisely follow the
predicted correlations, which are rather intricate correlations between different sentences in different parts of the
experiment, in almost every other respect, there is wild variation. This does not seem consistent with an experimental bias-
based explanation. We would presumably expect this bias to manifest “across the board”, rather than in just the specific
places it is predicted to, but even setting that issue aside, such bias should also manifest itself as a sort of tendency, rather
than a hard and fast rule. Given that we instead find an exception-less pattern in only a very specific aspect of the
judgements, it simply seems hard to think of a source of bias that could lead to such results, though of course, such issues
The judging of these various sets constituted the entirety of Rounds 2 and 3; in Round 1,
however, the tutorials were also provided. We have already discussed the general tutorial, given at
the start; see (195) and (196). The MR-specific tutorials were each given before the first instance of
the corresponding MR “part” of the round, i.e., the Coref tutorial before the Coref part, the DR
tutorial before the DR part, and the BVA tutorial before the BVA part
166
. I provide the English
sentences and images used for each tutorial below, the Korean and Mandarin Chinese cases
essentially being direct translations.
(215) Coref Tutorial Sentences (English)
167
a. John likes his roommate.
b. John likes Bill’s roommate.
c. John likes pizza.
(216) Coref Tutorial Interpretations (English)
should be investigated thoroughly and seriously for the purposes of bettering future experiments.
166
To say this all more succinctly, the experiment consisted of 3 rounds (each concerning a particular choice of X and Y),
which were broken down into 3 parts (each concerning a particular MR, in the sequence Coref-DR-BVA), which were
broken down into 3 sets, those being the theta-role-matched groups of 4 sentences discussed in (204)-(206), in that order.
In Round 1, each part was preceded by the relevant tutorial, and the general tutorial came before everything else (and thus
immediately before the Coref tutorial).
167
These are rather similar to the sentences from the general tutorial; this is because the two tutorials came “back-to-back”,
given that they both occurred before the Part 1 of Round 1, that being the start of the experiment and the first instance
of Coref. As such, this was intended to make participants feel “familiar” with the task and thus gain some confidence in
making judgements.
(217) DR Tutorial Sentences (English)
a. Two boys greeted three girls
168
.
b. Two boys greeted those three girls.
c. Two boys greeted no girls.
(218) DR Tutorial Interpretations (English)
(219) BVA Tutorial Sentences (English)
a. Every man loves his dog.
b. Every man loves that dog over there.
c. Every man loves his cat.
(220) BVA Tutorial Interpretations (English)
In each case, the expected judgements were: (a), both possible, (b), only (I) possible (the
non-MR reading), and (c), usually neither possible
169
. As noted earlier in this section, it was after
168
This was a case where the inclusion of ‘each’ or its equivalent was often helpful for participants to accept (II) as intended.
169
Because of the difficulty of finding cases where neither interpretation is possible, the (c) case often involved some kind
of “trick”, like swapping out the words ‘dog’ and ‘cat’. Participants often missed this, but if so, it was simply pointed out
to them, and presuming they then agreed that the interpretations were not possible, it was not “held against them”. Indeed,
this was somewhat deliberate, as it encouraged participants to pay close attention to the sentences and their interpretations;
completing the last of these tutorials that the determination was made whether the individual
seemed to understand the tutorials or should have a special mark placed on their data sheet
indicating that they did not seem to understand them fully. To repeat though, only one individual
was excluded through this process, and we will see that this individual’s data would not have violated
the basic predictions of our hypotheses anyway, so the procedure was perhaps unnecessary, though
it does in principle seem to be important to have some sort of quality control in place in case a truly
non-attentive individual were to take the experiment. I will return to issues of quality control later, in
Chapter 7.
3.3.6 Summary of Changes from Previous Experiments
As noted in Sub-Section 3.2.8, the design of this experiment innovates over its predecessors
in several respects. Now that we have seen the abstract template of this design in full, we can
review these innovations in more detail than before.
First, with the experimenter present, able to answer questions and providing
active feedback on tutorials, non-attentiveness/non-comprehension drops to a minimal, perhaps
negligible, amount. As will be shown in the remaining chapters, doing without any direct sub-
experiments in the style of Hoji 2015 did not seem to have an adverse effect on the results of these
experiments, and indeed, they replicate previous findings from experiments that did use such sub-
experiments. It is also quite possible that the visual presentation of the intended interpretations may
have played a role in making the directions clearer and harder to misinterpret, though as I noted, it is
hard to say at this point which of these aspects contributed more.
this could not be done directly during the main questions of the experiment itself, as saying “make sure you’re paying close
attention” could be (wrongly) taken as a sign to the participants that they were making some kind of mistake, so the tutorial
was a tool for (hopefully) achieving a similar result.
Further, as can be seen in the discussion of Sets 1-3 ((204)-(206)), an unprecedented number
of * and okSchemata are used, a total of 12, which are coupled with three different X-Y pairs. This
yields an experiment where participants judge an instantiation of each of the resulting 36 BVA sub-
schemata, as well as the corresponding DR and Coref items. The inclusion of this diverse set
of items enables several things: first, as noted several times now, there has been a division in past
experiments between making use of okSchemata matched in precedence with the *Schemata and
okSchemata matched in theta-roles with the *Schemata. This experiment includes both (together and
separately), and adds a third category, “okSchemata” (if we can expand the
meaning of that term a bit) that are matched in c-command with the *Schemata but differ in
precedence (that is, X precedes but does not c-command Y). Further, the inclusion of multiple X’s,
Y’s, and * and okSchemata, coupled with the elimination of attentiveness/comprehension-checking
sub-experiments, enables the relevant patterns of judgements to be replicated in a far greater
percentage of the participants than in previous experiments; simply put, having more sentence types
and choices of X and Y increases the chance of finding instances for which quirky effects are not
detected, meaning that the crucial c-command-based asymmetries can be observed. Of course, this
in no way lowers testability; as much as each instance of a judgement on a given *Schema has a
chance of supporting our hypotheses, it has an equal chance of contradicting them as well.
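The item count described above can be sketched as a simple enumeration. This is only an illustration: the schema and pair labels are hypothetical placeholders, since the text here gives only the counts (12 schemata and three X-Y pairs).

```python
from itertools import product

# Hypothetical labels; only the counts (12 schemata, 3 X-Y pairs) come from the text.
schemata = [f"schema_{i}" for i in range(1, 13)]
xy_pairs = [f"pair_{j}" for j in range(1, 4)]

# Each schema combined with each X-Y pair gives one BVA sub-schema.
bva_sub_schemata = list(product(schemata, xy_pairs))
print(len(bva_sub_schemata))  # 36
```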
In a broader sense, the wide variety of structures, which vary both in terms of precedence
and c-command, coupled with the already established use of DR and Coref as detectors for quirky
effects, allows us to get a much broader picture of the interaction of factors A, B, and C with the
availability of BVA in each participant. This in turn allows for a much more direct demonstration of
the *A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y) law than has been possible with the
largely C(S, X, Y)-focused efforts previously. Indeed, I think it is fair to say that, while previous
experiments have supported parts of this law, and future experiments may support it in a much
more thorough/rigorous way, the experiments here may be rightly claimed to be the first attempt to
fully demonstrate all aspects of this law, at least by means of an SCI-experiment.
3.4 Initial Discussion of the Experiments Conducted
3.4.1 Analytic Basics
In Section 1.6, I gave a brief preview of the final results, but this was necessarily a very rough
sketch, as the relevant concepts had not been established. At this point, however, I have presented
this dissertation’s primary theoretical background, experimental background, and experimental setup
through the various topics discussed in this chapter and in Chapters 1 and 2, so we are in a better
position to talk about the results in more detail. This discussion is not designed to supplant or even
summarize the discussion to be given in the following chapters; this section will not “tease” the key
results as Section 1.7 does, for example. Rather, I will overview the major components of the
analysis to be performed in the upcoming chapters, as well as flag certain issues and key points of
interest, so that readers can have a sense of what to expect when reading those chapters.
As has been mentioned, the results of the experiments all support the basic ABC-BVA law,
repeated here as (221):
(221) The ABC-BVA Law:
*A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
That is, when all three relevant conditions hold, namely, X does not precede Y in S, there is
no “quirky effect” on X, Y or S, and X does not c-command Y in S, then BVA(S, X, Y) is never
possible; this is true across all individuals who participated in the experiments, and thus across all
three languages examined. There are various details of the experiment to be discussed that pertain to
each language in particular; I will leave that to the various language-specific chapters. The broader
issues of concern here are how the analysis is carried out so as to show that the ABC-BVA law is
indeed supported, what kinds of “issues” arise in performing these analyses, and also what the
overall “point” of such analysis is in the first place. With these understandings in place, we shall be
in a good position to evaluate the specific analyses of the data from each language, as well as the
overall effect of the data from all three languages combined.
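To make the logical form of (221) concrete, the law can be written as a simple Boolean check. This is a minimal sketch rather than part of the analysis itself; the function name and the encoding of A, B, and C as truth values are assumptions made for illustration.

```python
def law_predicts_star_bva(a: bool, b: bool, c: bool) -> bool:
    """Return True when (221) predicts *BVA(S, X, Y).

    a: X precedes Y in S (okA)
    b: a quirky effect is present on S, X, or Y (okB)
    c: X c-commands Y in S (okC)
    """
    # The law makes a prediction only when all three factors are absent.
    return not a and not b and not c


# With any one factor present, the law is silent about BVA(S, X, Y).
print(law_predicts_star_bva(False, False, False))  # True
print(law_predicts_star_bva(True, False, False))   # False
```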
As noted in Section 3.3, these experiments revolve around three basic *Schemata; in addition
to the weak-crossover sentences in the active voice, which have been the main subjects of most of
the previous investigations, as detailed in Section 3.2, we also have “weak-crossover passives”, as
well as (topicalized) possessor binding sentences. These sentences are all such that *A(S, X, Y) is a
given, as X does not precede Y in them, and according to the structural hypotheses summarized
in Sub-Section 3.1.3, X does not c-command Y in them either, giving us *C(S, X, Y). The ABC-BVA
law thus predicts that such sentences will only be acceptable with a BVA(S, X, Y) reading in the case
of okB(S, X, Y), that is, if enabled by some sort of quirky effect. All other sentence types, on the
other hand, have either okA(S, X, Y) or okC(S, X, Y), meaning that no such quirky effect is, in
principle, needed to facilitate the acceptance of BVA(S, X, Y) in such sentences.
Diagnosis of the presence or absence of said quirky effects, as in previous experiments, is
done by examining each individual’s judgements on analogous sentences, using the same X or Y,
with different MR’s, DR or Coref. If a given *Schema instantiation is judged acceptable with one of
these MR’s, then, in principle, we cannot guarantee *B(S, X, Y) for BVA. If, on the other hand, the
corresponding *Schema examples are not judged to be acceptable with DR/Coref, we assume this
does diagnose *B(S, X, Y), in which case, *BVA(S, X, Y) is predicted.
There are two important points to note regarding this “quirky detection”, however, which
contrast with previous experiments. The first is that, for reasons that are somewhat
unclear, it proved to be the case that in the data gathered, it was never necessary to use both DR and
Coref to accurately predict BVA behavior; that is, one MR alone seemed sufficient for diagnosing
quirky effects, though which MR that was, DR or Coref, varied based on the language in question. It
is unclear at this point whether this represents an important new finding or is merely an artifact of
the particular data gathered; as such, I will at times analyze things in both ways, first using both
MR’s to test for B(S, X, Y), and then using just one MR to test for B(S, X, Y). Such cases will be
clearly marked and discussed in the relevant parts of the upcoming chapters; the determination of
which of the analyses is more appropriate to perform will have to wait for future data. For now,
there are pros and cons to both approaches, which I will discuss, though using both MR’s, the more
conservative approach, is probably the more “rigorous” option for now, at least until the one MR
analysis has been more thoroughly vetted.
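The difference between the two analyses can be sketched as a choice of which MR’s are consulted. The following is an illustrative sketch under the assumption that each judgement on the corresponding *Schema instantiation is recorded as a simple accept/reject value; the function name and its parameters are hypothetical.

```python
def star_b_diagnosed(dr_accepted: bool, coref_accepted: bool,
                     mrs: tuple = ("DR", "Coref")) -> bool:
    """Return True when *B(S, X, Y) is diagnosed.

    *B is diagnosed only if none of the consulted MR's was accepted on the
    corresponding *Schema instantiation. The default consults both MR's
    (the 2-MR test); passing a single MR gives the 1-MR test.
    """
    accepted = {"DR": dr_accepted, "Coref": coref_accepted}
    return not any(accepted[mr] for mr in mrs)


# Coref accepted, DR rejected: the 2-MR test does not diagnose *B,
# while a 1-MR test that consults only DR does.
print(star_b_diagnosed(False, True))                # False
print(star_b_diagnosed(False, True, mrs=("DR",)))   # True
```

The 2-MR test is the more conservative of the two in that it diagnoses *B(S, X, Y), and hence commits to a *BVA(S, X, Y) prediction, in a subset of the cases where a 1-MR test would.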
The second relevant point to note is that previous experiments have evaluated the
presence/absence of quirky effects based not only on an individual’s judgements on the Coref/DR
*Schemata, but also on the corresponding Coref/DR okSchemata as well (as I discussed earlier in
this chapter, in Sub-Section 3.2.2). The use of these okSchemata was not due to a theoretical
requirement per se, but rather, as a way of double-checking the significance of a lack of quirky-
detection using the *Schemata. In the data gathered for this experiment, however, such double-
checking has no qualitative effect on the data, and indeed, very little quantitative effect either. As
such, I rely purely on the *Schemata for quirky detection, just for the sake of analytic simplicity. This
represents a slight methodological departure from previous experiments, though, as can be seen in
the data in the appendix, almost all cases of *Schema rejection in DR/Coref were accompanied by at
least some acceptances in the corresponding okSchemata; as such, the two methods of quirky
detection are rendered effectively identical for our purposes due to this widespread okSchema
acceptance in DR and Coref.
Returning to the BVA *Schemata discussed above (that is, the sentences for which we have
*A(S, X, Y) and *C(S, X, Y)), if we have ensured *B(S, X, Y), we can make the definite prediction
of *BVA(S, X, Y). This prediction is indeed borne out, as we have in fact already seen in the “preview”
given in Section 1.6, which I will repeat here once more (primary discussion of its significance is still,
however, deferred until Chapter 7):
(222)
The *Schemata are all *A(S, X, Y) and *C(S, X, Y), so all judgements on instantiations of
*Schemata are contained in the intersection of the two corresponding circles in (222). This
intersection is itself partitioned into a part which also overlaps with *B(S, X, Y) and a part outside of
that overlap, corresponding to cases where the absence of quirky effects was and was not diagnosed,
respectively. As we can see, when quirky effects were absent for instantiations of *Schemata (the
central intersection), all judgements were *BVA(S, X, Y) as predicted (green dots), and likewise, all
cases where BVA(S, X, Y) was accepted with *Schemata instantiations (red dots) are cases where
quirky effects were detected[170].
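The partition just described can be sketched as a classification of each judgement on a *Schema instantiation. The region labels and the toy records below are hypothetical and serve only to illustrate the logic; they are not data from the experiments.

```python
from collections import Counter


def region(star_b: bool, bva_accepted: bool) -> str:
    """Classify one judgement on a *Schema instantiation (*A and *C are given)."""
    if star_b:
        # Central intersection: the law predicts *BVA(S, X, Y).
        return "violation" if bva_accepted else "predicted rejection"
    # Outside *B: a quirky effect was detected, so the law makes no prediction.
    return "quirky acceptance" if bva_accepted else "no prediction (rejected)"


# Hypothetical (star_b, bva_accepted) records, for illustration only.
toy_judgements = [(True, False), (True, False), (False, True), (False, False)]
counts = Counter(region(sb, acc) for sb, acc in toy_judgements)
print(counts["violation"])  # 0
```

On this encoding, the law is violated only by a judgement that falls in the "violation" region, that is, BVA accepted where the absence of quirky effects was diagnosed.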
As such, we can already see that the ABC-BVA law’s predictions were never violated in the
data gathered for this dissertation. While this alone is of interest, our analysis will consist of far more
than simply reiterating this fact. What we want to know is not merely that the ABC-BVA law was
not violated, but that it was actively supported. That is, we want to show that each of the relevant
constraints, A, B, and C, was indeed playing an active role in constraining BVA judgements, and was
doing so across languages, individuals, and choices of S, X, and Y of BVA(S, X, Y). To do this, the
data from each language are broken down according to these three factors, focusing on one at a
time. The data are then analyzed in terms of minimal contrasts that show the effects of each of the
factors, similar to what was done for the effect of c-command in previous experiments, and along
the lines of the tests for “significance” summarized in (208).
As such, each of the following three chapters will have individual sections discussing the
roles of factors A, B, and C, that is for us, linear precedence, quirky effects, and c-command,
respectively. These sections will not only discuss the role of the factors in the data overall, but also
examine the data in various ways, examining the roles of each individual, sentence-type, and X-Y
pair in establishing the final results. While these discussions ultimately culminate in the sort of Venn-
Diagram-based correlational analysis that we have seen in Section 3.2, I will also take care to point
out various trends in the data before reaching such analyses. These trends serve two main roles: first,
they give an overall sense of the distributions of judgements in the data, which not only helps to
understand the ultimate analysis, but also may be useful for independent purposes. Second, in
looking at these trends, it will become clear just how hard it is to establish exceptionless patterns as
to whether BVA will be judged acceptable or not with a given sentence. By noting the inherent
difficulty in finding exceptionless patterns, we can better appreciate the significance of the fact that it
is only in the very specific way predicted by our hypotheses that such patterns reliably appear.
[170] This is using the more “conservative” 2-MR test for quirky effects; the 1-MR test version will be discussed alongside
this one in Chapter 7.
There is, of course, much more we could do; the dataset is fairly large, and the number of
possible ways of dividing according to various properties is therefore huge. What I have included in
each chapter is merely a subset of this possible set of ways of examining the data, but it hopefully
sufficiently serves to demonstrate at least the rough shape of the data. Those with further interest
can see the data appendix included in this dissertation; it is detailed enough that alternative analyses
or descriptions I have not provided should still be possible to carry out with the data provided there.
3.4.2 Complications
In Section 7.4, I discuss some of the limitations of the experiments, which are areas
for improvement in future experiments. These issues, however, do not substantially impact the
interpretation of the results at a basic level, so we do not need to preview them here; they are
essentially areas where reliability and significance can be increased, rather than serious problems with
the experimental results themselves.
There are, however, a few “caveats” to the sort of results displayed in the chart in (222),
which take the form of excluded datapoints. This issue will come up in various forms throughout
the different experiments, so I flag it here in a general sense before discussing it in more detail in the
relevant chapters. These exclusions are, in all cases, principled, and further, as I will discuss in
Section 7.3, almost all of the data excluded is totally “benign” in terms of its impact on the results.
To be clear, these datapoints are not being suppressed in the absolute sense and do receive full
discussion in the relevant sections (as well as being present in the appendix); they are merely
“quarantined” from the rest of the data for purposes of analysis because of various principled
reasons.
One such reason for “quarantining” has already been discussed in this chapter, namely the
one English speaker whose responses to the tutorial questions suggested confusion to the degree
that this individual’s data was marked as unreliable. This was the only case where datapoints were
excluded from the main analyses based on the particular individual involved. In the other cases, this
was done on the basis of the items judged, rather than the individual judging them.
There are two such cases. The first concerns certain instances of the possessor binding
sentences, where it seems there were unintended ambiguities that proved confusing to participants.
These ambiguities had to do with the presence of the possessee, as in the following sentence:
(223) Every professor’s colleague spoke to his student.
In most BVA sentences that participants judged, there were only really two relevant readings,
one where ‘his’ is interpreted as meaning some third person not mentioned in the sentence, and the
BVA reading. In possessor-binding sentences, however, there is a third sort of interpretation, one
where ‘his’ is interpreted as ‘the colleague’s’. Such a reading is perhaps a sort of BVA(S, every
professor’s colleague, his) reading, or possibly the equivalent Coref reading if there is one individual
who is every professor’s colleague, but it is certainly not a BVA(S, every professor, his) reading.
Because participants have not encountered such readings often, however, and because they
are indeed MR readings, just not the particular MR reading that was intended to be conveyed to
them, I was already concerned about such cases before the experiments began; because participants
do not really know the difference between subtly different BVA readings or even BVA and Coref
readings, they may easily mistake this “third reading” for the BVA reading; the pictures provided do
not quite allow for this mistake if one is paying close attention, but the participants are going
through an hour-long judgement exercise, and as such, they may not pay maximal attention to such
minute details[171]. To counteract this, a procedure was instituted where, whenever sentences
instantiating Set 3 were judged with BVA (or Coref, which has a similar issue), the first time in a
given part of a given round that the participant judged a BVA/Coref interpretation to be acceptable
(that is, chose option (II) or said that both options (I) and (II) were possible), the experimenter
attempted to clarify by saying something like “just to check, when you say (II) is possible, you mean
that Y’s student means something like X’s student, not X’s colleague’s student, correct?” (with X
and Y being filled in with the current choice of X and Y)[172].
Some participants did indeed sometimes change their answers at this point, realizing that (II)
was not the interpretation they had thought of. Others understood that the answer was different
from the get-go, and even made comments to the effect that there was a third interpretation possible.
Unfortunately, there seems to have been a third group who thought they understood the distinction,
but in fact, did not. I will discuss the details more in the relevant chapters, but in essence, while there
was nothing “wrong” per se, certain responses to possessor-binding sentences in the English
experiments were different enough from responses to other types of sentences in terms of the
overall patterns of judgement that it seemed plausible that the intended interpretations were not
being understood.
[171] This is one disadvantage of representing interpretations as pictures rather than sentences; with sentences, we can
specifically direct readers’ attention to the fact that ‘his’ is to be interpreted as something like ‘each professor’s own’,
whereas with pictures, we cannot guarantee that participants will look at the right part of the picture to notice that the
people involved are students of the professors, and not of the professors’ colleagues.
[172] Note that, in the vast majority of cases, this was on an okSchema, as the *Schema was the last one judged in this set,
alleviating concerns that this procedure might bias judgements on the *Schema away from BVA acceptance.
Fortunately, not all possessor-binding sentences were so ambiguous, and the rare “unusual”
judgements did not seem to show up along with the non-ambiguous sentences. The supposition
after the English experiments concluded was that the procedures designed to ensure that
participants understood the intended interpretations correctly were not working 100% of the time,
even if they were effective in most cases. The prediction was then that we might see apparent
violations of the ABC-BVA law, specifically with the “ambiguous” possessor-binding cases, but
never with the non-ambiguous ones. As it turned out, this is precisely what then went on to happen
in the Korean experiment, essentially confirming that the suspicion was correct[173]. As such, there are
grounds for holding that the judgements recorded on such sentences could not be reliably assumed
to reflect the participant’s true judgements, and thus, such data was set aside. This decision was thus
based on concerns of reliability and not simply a post-hoc exclusion for the sake of convenience[174].
Of course, resorting to this kind of segregation is not ideal, and as such, the significance of the
possessor-binding portion of these experiments is limited, if nothing else by the fact that there are
fewer datapoints left to consider once we focus only on the “unambiguous” cases. As I attempt to
show, however, even with this reduced amount of possessor-binding data, a compelling case can
nevertheless be made in favor of our hypotheses.
[173] Or at least that something unusual was going on with those purportedly “ambiguous” sentences in particular; it could
be that the source of the confusion has been misidentified or that participants were understanding correctly, and that this represents a
genuinely new phenomenon, but absent further evidence, I think the ambiguity account is fairly compelling, especially
given the not infrequent participant responses to the effect that they had indeed misunderstood the interpretation that was
to be judged before they were reminded what that interpretation was meant to be.
[174] Indeed, as I will note in various places, it is in fact a very inconvenient exclusion; the vast majority of the data excluded
strongly supports our predictions, and the only really “problematic” case is just one datapoint. We could, presumably, find
some sort of clever post-hoc way of dismissing this one datapoint and keeping at least some of the others, but that is not
the intention of the exclusion. As I will stress further throughout the upcoming chapters, such exclusion should be done
whether the data supports our predictions or not; we have reason to believe that at least some judgements reported on
such sentences are not reliable, and no principled way of determining which judgements those are, so we cannot treat
those judgements as having the same status regarding our predictions as other judgements, which we have no reason to
believe suffered from the same problems. See further discussion in Chapter 7.
The other type of sentence to be excluded from main consideration comes as part of my
decision to take something of a “gamble” when dealing with Mandarin Chinese passives. As such,
similar to the possessor-binding issue, it was known beforehand that there was a potential problem,
so again, exclusion of such sentences is not merely a post-hoc repair for inconvenient data. I will
discuss the relevant issues in greater detail in Chapter 6, but in essence, simple Mandarin passives do
not generally allow the same degree of freedom of dislocation as in the other languages considered.
To try to overcome this issue, I made use of a more complex structure, the well-known but
notoriously difficult to analyze shi…de construction; this construction does facilitate the desired
topicalization, but there is a serious question as to what the structure of such sentences is; as one
may note, there is no shi…de construction covered under our structural hypotheses. As such, these sentences were
included on a “try it and see what happens” basis, with the hope that the passives involving shi…de
would have the same structure as their shi…de-less equivalents.
Though, as I will note, there is still reason to think this might be correct, the data found in
this experiment do not universally support such a conclusion. As such, some but not all of the
Mandarin Chinese passive data must also be “quarantined”, though other parts of it can still be used,
as I will discuss in Chapter 6. As with the excluded possessor-binding sentences, this exclusion limits
the significance of certain aspects of the results, but, as I will show, the rest of the Chinese data still
paints a convincing picture with regard to our main predictions.
These three cases, comprehension difficulties, unintended ambiguities with possessor-
binding, and the use of a construction not covered under our structural hypotheses, constitute the
whole of the sources for excluded data. To reiterate, all of these issues were anticipated beforehand,
even if in the latter two cases, it was (incorrectly) hoped that the relevant problems could be
overcome, so this is not a case of simply cherry-picking out undesired datapoints; the exclusion of
such data is entirely principle-based, and I suspect that many potential concerns readers may have at
this point will be assuaged once they see the actual details of the data excluded, which will be
detailed in each of the language-specific chapters.
As will be discussed in Chapter 7, the occurrence of such issues may be a result of using
so few sentences to instantiate each schema. I have argued in this chapter that, to some extent, doing
so was integral to accomplishing the stated goal of “casting a wide net” as to sentence patterns and
choices of X and Y and thus replicating the full range of predicted patterns in more individuals. To
the overall credit of the correlational methodology, this minimal checking does indeed prove
sufficient once minor issues like those discussed above are accounted for. It is fair to say, however,
that regardless of whether or not it worked in this case, in general such a strategy is overly “risky”. In
keeping with Mukai 2012/2022’s articulation of how research in areas such as this should
proceed, we want to maximize our chance of finding meaningful errors in our predictions, and
having so few instances of each schema means that it is hard to know if apparent deviations from
predictions are meaningful or are merely a reflection of insufficient “checking” for things such as
quirky effects. I will return to these themes in Section 7.4, but as stated, they do not directly affect
the interpretation of the results of this experiment, even if they are important considerations as such
research moves forward.
3.4.3 Outline of the Language-Specific Chapters
In this chapter, we have moved from theory to practice; we started in Section 3.1 discussing
the hypotheses and predictions established in the previous chapter, as well as the general “type” of
experiments to be conducted, namely what I have termed “SCI-experiments”. We saw several
examples of past SCI-experiments in Section 3.2 and also overviewed the design of the particular
SCI-experiments employed for this dissertation in Section 3.3. While the I in SCI stands for
“individuals”, meaning that the experiments are focused on the judgements of individuals rather
than the aggregated judgements arising from a sample, the S stands for “systematic”. Despite dealing
with variation, the experiments discussed thus far have been highly formulaic. There are of course
differences between them, having to do with both issues of how exactly various aspects of the
experiment are implemented as well as differing topics of investigation. Nevertheless, it is true that,
because they all seek to implement the basic correlational methodology first proposed in Hoji 2017,
there is a great deal of procedural overlap between them.
This will be even more true for the three different language-specific experiments conducted
for this dissertation; given that the general template provided in Section 3.3 underlies all three of
these language-specific experiments, they are even more similar to one another than the cases we
have seen thus far. Because they are so similar, the way I go about discussing them will also be quite
similar, so to conclude this “previewing” section, I will provide the basic outline that each of the
subsequent three chapters follows. This will allow us to minimize repetitive summaries in each
chapter.
Each of these chapters will start with a section which bridges the gap between the general
template presented in this chapter and the language-specific implementation of that template used to
gather the data to be analyzed in that chapter. This will involve reviewing relevant concepts from
previous chapters or potentially introducing new ones, especially for Korean and Mandarin Chinese,
which we have thus far not discussed in much depth. I will also provide a full discussion of the
choices of X and Y of BVA(S, X, Y), as well as all other lexical choices made, and the various
sentence-types used, such that the substantive parts of the experiments should be totally
reconstructable from the information provided. Information about any parts of the data
“quarantined” away from the rest will also be presented here, though full analysis of that data will
wait until the final section of each chapter.
As mentioned earlier in this section, the next three sections of each chapter will address
factors A, B, and C of the ABC-BVA law individually, starting with observing rough patterns
regarding the distribution of BVA according to the presence or absence of the factor in question.
This discussion includes looking at the behavior of the different X-Y pairs and different sentence-
types both separately and together, building to the full, fine-grained correlational analysis. In this
analysis, we will see an account not only of how the factor in question has constrained BVA
interpretations, but also a breakdown in terms of which individuals, sentence types, and choices of X
and Y have served as the most crucial cases for demonstrating the intended point; the intention of
this breakdown is to demonstrate the generality of the results, specifically that they are not merely an
artefact of any particular participant, sentence type, or choice of X and Y.
After having done this for all three factors separately, in the final section, I consider all three
factors together, producing language-specific diagrams of the sort we have briefly examined in (222)
earlier in this section. The resulting analysis succinctly presents what we will have seen throughout
the chapter, namely that the ABC-BVA law is not violated by any of the data considered, and
indeed, is actively supported by a variety of relevant judgements. It is at this point that we will also
consider the otherwise excluded data in detail, examining what its general features are and how it
does or does not differ from the rest of the data. Finally, in addition to summarizing the significance
of the results at the end of the chapter, I will close with discussion of any follow-ups or extensional
investigations performed, as well as any immediate questions raised that have yet to be answered.
Once this is completed for each of English, Korean, and Mandarin Chinese (in that order),
then we can come to the joint assessment of the experiments as a whole, as well as their broader
significance, to be discussed in Chapter 7.
4 English
4.1 Preliminaries
4.1.1 Introduction
In this chapter, I discuss and analyze the results of the experiment(s) conducted in English,
noting the different ways in which these results support our various hypotheses. In subsequent
chapters, I will repeat this process in Korean and Mandarin Chinese, demonstrating convergence
across all three languages. In these later chapters, I will begin by overviewing some basic
language-specific background regarding the language in question, but in the case of English, our discussion in
previous chapters has already covered the relevant details fairly thoroughly. As such, I will begin this
chapter simply by summarizing the most crucial aspects of our hypotheses, predictions, and
experimental design, so that they can be easily reviewed. I will omit this in subsequent chapters in
favor of the aforementioned language-specific review.
Our central hypothesis is the so-called ABC-BVA law, which sets forth necessary conditions
that must be met for BVA(S, X, Y) to obtain. Repeated from Section 1.4 here, it is:
(224) *A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
That is, BVA(S, X, Y) is impossible if none of the three factors, A(S, X, Y), B(S, X, Y), or
C(S, X, Y), is present (and by simple deduction, if BVA(S, X, Y) is possible, then at least one of the
factors must have been present to enable it). While A, B, and C in principle represent more general
notions (as discussed in Section 1.4), for our particular purposes, we can understand A(S, X, Y) to
mean that X precedes Y in the linear form of S, B(S, X, Y) to be a “quirky effect” obtaining on S, X,
and/or Y for the individual in question, and C(S, X, Y) to be X c-commanding Y in S.
The presence or absence of A(S, X, Y) is easily ascertained by simple examination of S.
C(S, X, Y) is also fairly simple to check for; the various structural hypotheses summarized in
Sub-Section 3.1.3 lead to the following basic deduction (regarding specifically the sentence types we will
be interested in):
(225) a. The subject c-commands all other nominals in the sentence.
b. No nominal c-commands the subject or anything inside it.
c. Possessors in nominals do not c-command anything outside those nominals.
As such, for us, diagnosing C(S, X, Y) will be simply a matter of asking whether X is the
subject of the sentence. If we find that, in a given sentence, X is neither the subject nor does it
precede Y, then the only remaining possible source of BVA is B(S, X, Y). Identifying its presence or
absence is somewhat more involved than the other two factors, but still fairly straightforward, as this
can be diagnosed via correlations with other MR’s. Specifically, following a particular
implementation of Hoji 2017’s general program (see discussion in Sections 2.7 and 3.3), we will say
that *B(S, X, Y) is diagnosed under the following condition:
(226) Assuming *A(S, X, Y) and *C(S, X, Y)
*DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *B(S, X, Y)
That is, for a particular individual, with particular items S, X, and Y, to check for the
possibility of a quirky effect in BVA, we check analogous sentences to S with DR/Coref
interpretations. If these interpretations are unavailable, then we will assume that we have *B(S, X,
Y). If one or both interpretations are available, the test is technically inconclusive as stated, but we
will assume that it indicates okB(S, X, Y).[175]
Putting all these identification criteria together, we noted in Section 2.7 that a series of
previously proposed generalizations come out as predictions of the ABC-BVA law, specifically as
special cases when the two factors other than A, B, or C, respectively, are guaranteed to be absent:
(227) In the case that any two conditions of (a) are met, we derive one of (b)-(d):
a. The ABC-BVA Law:
*A(S, X, Y) ∧ *B(S, X, Y) ∧ *C(S, X, Y) → *BVA(S, X, Y)
b. Chomsky’s Leftness Condition:
BVA(S, X, Y) is possible only if X precedes Y in S.
c. Hoji’s Correlation:
*DR(S’, X, Y’) ∧ *Coref(S’’, X’, Y) → *BVA(S, X, Y)
d. Reinhart’s Generalization:
BVA(S, X, Y) is possible only if X c-commands Y in S.
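The logical shape of (226) and (227) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names are hypothetical, each factor is reduced to a single boolean for a given individual and a given S, X, and Y, and the law is treated strictly as a one-way implication (it says when BVA is ruled out, not when it must be accepted).

```python
from itertools import product


def b_assumed_present(dr_ok: bool, coref_ok: bool) -> bool:
    """Diagnose B following (226): *B only if both DR and Coref are rejected.

    If either reading is accepted, the test is strictly inconclusive, but the
    working assumption adopted in the text is that B is then present.
    """
    return dr_ok or coref_ok


def law_rules_out_bva(a: bool, b: bool, c: bool) -> bool:
    """Return True exactly when the antecedent of (227a) holds.

    That is, when A, B, and C are all absent, so that the ABC-BVA law
    predicts BVA to be impossible. False means the law makes no prediction.
    """
    return not (a or b or c)


# Holding two factors absent reduces the law to a one-factor generalization.
for a in (True, False):
    # B and C absent: the prediction depends on precedence alone, as in (227b).
    assert law_rules_out_bva(a, False, False) == (not a)
for c in (True, False):
    # A and B absent: the prediction depends on c-command alone, as in (227d).
    assert law_rules_out_bva(False, False, c) == (not c)
for dr_ok, coref_ok in product((True, False), repeat=2):
    # A and C absent: the prediction depends on the DR/Coref diagnosis, as in (227c).
    b = b_assumed_present(dr_ok, coref_ok)
    assert law_rules_out_bva(False, b, False) == (not dr_ok and not coref_ok)
```

The three loops correspond to the three "controlled" situations: in each, two factors are fixed as absent and the remaining one alone determines whether the law excludes BVA.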
We thus want to consider various types of sentences that vary with respect to these factors
and demonstrate that when (and only when) two of the three factors are “controlled”, the expected
generalization obtains without exception. To do this, we consider three different “sets” of sentences,
wherein each set is matched in terms of rough “theta-role meaning”, as discussed in Section 3.3:
(228) Set 1:
a. An active-voice sentence where X is the subject and Y is in the object (both A and C)
b. a.’s passive equivalent. (Neither A nor C)
c. A topicalized version of a. (C, but not A)
d. A topicalized version of b. (A, but not C)
(229) Set 2:
a. A passive-voice sentence where X is the subject and Y is in the “agent” (both A and C)
[175] Technically, if we consider B(S, X, Y) to be the factors that enable quirky-BVA in particular, then accepting DR and/or
Coref in these environments might not indicate the presence of B(S, X, Y), as there might be quirky factors that enable
DR or Coref but not BVA. If we consider B(S, X, Y) to be the factors that enable any quirky-MR, then we do not have
this issue.
b. a.’s active equivalent. (Neither A nor C)
c. A topicalized version of a. (C, but not A)
d. A topicalized version of b. (A, but not C)
(230) Set 3:
a. A sentence where X is the subject and Y is in the object (both A and C)
b. a.’s equivalent where X is the possessor in the subject, Y in the object. (A but not C)
c. A topicalized version of a. (C, but not A)
d. A topicalized version of b. (Neither A nor C)
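The factor annotations in (228)-(230) can be collected into a small lookup table, which makes it easy to pull out, for instance, the sentence types in which neither precedence nor c-command is available. This is a minimal sketch; the dictionary layout and the helper name are hypothetical conveniences, and the True/False values are simply transcribed from the parenthesized notes above.

```python
# Presence (True) or absence (False) of A (precedence) and C (c-command)
# for each sentence type, transcribed from the annotations in (228)-(230).
# Each value is an (A, C) pair.
FACTORS = {
    "Set 1": {"a": (True, True), "b": (False, False),
              "c": (False, True), "d": (True, False)},
    "Set 2": {"a": (True, True), "b": (False, False),
              "c": (False, True), "d": (True, False)},
    "Set 3": {"a": (True, True), "b": (True, False),
              "c": (False, True), "d": (False, False)},
}


def types_with(a: bool, c: bool):
    """List the sentence types whose (A, C) values match the arguments."""
    return [
        f"{set_name} {label}"
        for set_name, types in FACTORS.items()
        for label, values in types.items()
        if values == (a, c)
    ]


# Types where only B could license BVA (neither A nor C is present).
neither = types_with(False, False)

# Types where c-command is absent, whether or not precedence holds.
no_c_command = types_with(True, False) + types_with(False, False)
```

Each set contains every one of the four (A, C) combinations exactly once, which is what allows minimal-pair comparisons within a set.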
A given participant’s pattern of judgments on these sentences lets us see how factors A and
C affect the availability of BVA. Once we bring in judgements on the analogous DR/Coref
sentences, we can also see how B interacts with these patterns. What we are interested in is
showing that (i) the cases where all three factors are absent are indeed cases where BVA is rejected,
(ii) (at least some of) those cases of rejections have “minimal pair” contrasts along one or more of
the relevant dimensions, where BVA is accepted when one or more of the factors are restored, and
(iii) there is no simpler generalization that would suffice to capture the data at hand.
In each of the following three sections, I will demonstrate that (i) is generally true and that
(ii) can be demonstrated along the specific dimension in question, *A(S, X, Y) vs. okA(S, X, Y) in
Section 4.2, *B(S, X, Y) vs. okB(S, X, Y) in Section 4.3, and *C(S, X, Y) vs. okC(S, X, Y) in Section
4.4. Further, in each section, I will show that the “simpler generalizations” mentioned in (iii) never
seem to be possible; when we try to relax things a bit in any possible area, we find a messier pattern,
suggesting the necessity of the ABC-BVA law precisely as formatted. These results, especially when
taken in combination with the results of previous experiments and the data gathered from other
languages, strongly support our hypotheses.
In the rest of this section, I explain the particulars of the implementation of the general
template and strategy discussed above for English; after going through the various factors in
subsequent sections as detailed above, I will return to some of these details in Section 4.5, where I
will also overview the final results, as well as elaborate certain conceptual issues of potential interest.
4.1.2 Language-specific implementation
We now turn to how the general template laid out in Chapter 3 and schematized by the sets
given in (228)-(230) is implemented in the English experiment(s). As mentioned in Chapter 3, three
choices of X and Y were used, one X-Y pair per “round” of the experiment. In terms of different
X’s, Plesniak (2022b)’s experiments, which as noted in Section 3.2 are the most similar in scope and
focus to the ones performed here, provide us with two potential choices of X, ‘every N’ and
‘more than one N’. These choices are not incidental; recalling Hayashishita (2004/2013)’s
observations of different quantifiers from Section 2.7, ‘every N’, as a universal, is one of the most
likely quantificational phrases to partake in things like “inverse scope” readings, whereas numerically
intricate quantificational phrases like ‘more than one N’ are the least likely to do so (others have
made similar observations; see discussion in Beghelli and Stowell 1997, for example). That is, ‘every
N’ is perhaps the most likely to have a quirky effect leading to B(S, X, Y) and ‘more than one N’ the
least likely. Note, we are not actually studying and/or testing the difference between these two
quantifiers; using quantifiers that differ in this way simply gives us a greater chance of observing
different behaviors that might be of interest to us.
Because we need a particular N[176], and ‘student’-‘teacher’ pairings seemed to have been
effective in previous experiments, N was chosen to be ‘professor’. This gives us two full choices of
[176] We could, in theory, have different N’s for different X’s, but this would complicate comparison between them. By
using the same N’s throughout the experiment, we keep consistent various qualities of the noun, such as animacy,
humanness, the relationship between the N of X and the N that Y possesses, etc., all of which might, in theory, have
effects relevant to what we are studying. (Not that they are expected to cause a breakdown in the ABC-BVA law, but they
might affect tendencies in ways that make comparison across cases more difficult. Though it is entirely conceivable that
different features of the noun DO interact with the ABC-BVA law, all the more reason to take a conservative approach
so that we can establish a baseline for future research, as I discuss further in Sub-Section 7.3.3).
X, ‘every professor’ and ‘more than one professor’. We need a third option for X, which, like the
first two, needs to obey two constraints: first, it must be grammatically singular, as English in
particular has challenges related to non-singular subjects and BVA. ‘Two professors’, for example,
cannot (generally) participate in BVA(S, two professors, his), e.g., if S is ‘two professors praised his
student’. Using ‘their’ in this situation resolves the issue to a degree (though it introduces the
ambiguity of a group reading, which should be resolved via the images presented). However, as will
be discussed shortly, while one could design an experiment this way, that is using exclusively things
like ‘their’, there were reasons in this particular case not to do so.
The second, more practical constraint is that the quantifier chosen must be “visually distinct”
from the other two quantifiers, while still being easily representable, as per the sample items given in
Section 3.3. Many options are possible, but the one ultimately used here is ‘every professor but one’;
it fulfills the singular requirement and also is easily visualizable in ways that differ from the other
two choices of X. Though it had not been used in previous such studies, as the results will show, it
seems to perform as expected.
Turning to Y’s of BVA(S, X, Y), one may note that the English experiments discussed in
Section 3.2 all made use of ‘his’, as embedded in ‘his student’. Based on preliminary tests that I ran,
however, it seemed that some disliked using ‘his’ as Y of BVA(S, X, Y), preferring ‘their’ instead;
certainly, sentences like ‘every professor spoke to their student’ are more common in contemporary
speech than ‘every professor spoke to his student’, given the increasing loss of ‘his’ as a
“mixed-gender” pronoun. One option in response to this issue would be to simply switch ‘his’ to ‘their’ as
one of the choices of Y. Unfortunately, some individuals (at least prescriptively) disallow ‘their’ as
singularly-denoting, insisting that the pronoun used should be ‘his’, ‘her’, or ‘his or her’. As such, we
have two competing constraints on which pronouns should be used.
Given that the difference between ‘his’ and ‘their’ is not particularly relevant to the investigation
at hand, the eventual decision I made was to make ‘their’ the default, but to allow participants to
request an alternative version of the experiment that substituted ‘his’, ‘her’, or ‘his or her’ for ‘their’.
This is somewhat sub-optimal, as data are not as perfectly comparable as they might have been if
only one such element were used, but again I stress that comparison between different choices of Y
is not a goal of these investigations. As long as choices of Y are consistent for a given individual,
then that is enough for our analytic purposes; essentially, we are looking at a given individual’s
judgements on sentences using whatever that individual claims to be the best mixed-gender third
person pronoun in their I-language.
For the other two Y’s, the strategy employed is to use one which is believed to be “better at”
being the Y of BVA(S, X, Y), than ‘their’/‘his’/etc. and one that is “worse at” being the Y of
BVA(S, X, Y). Again, note this is not for the purposes of investigating the differences between
different choices of Y, but rather, for giving the participants a more varied set of sentences on which
to produce judgements. For the “better” Y, as noted in discussions of past experiments in Section
3.2, ‘his own’ has previously been used as part of directions conveying to participants what a BVA
interpretation is, e.g., “every teacher spoke to his student” meaning “every teacher spoke to his own
student”. Note, however, that even such sentences with ‘his own’ are technically ambiguous; in the
correct context, e.g., John was so shocked that it was his own particular student that was spoken to
by every teacher, ‘his own’ can have a non-BVA reading. The fact, however, that the use of ‘his own’
was generally effective in conveying the BVA interpretations in past experiments suggests that ‘his
own’ is much more strongly biased towards a BVA interpretation than is just ‘his’. As such, ‘their
own’/‘his own’/ etc., serves as the second choice of Y in the experiments performed here.
As for a more “difficult” Y, the option settled upon was to use ‘that N’, where N matches
the noun in X, in this case rendering it ‘that professor’. Such elements do not typically participate in
BVA readings, but consider the sort of sentence given below:
(231) Every professor praised that professor’s student.
Sentences like (231) are cases where judgements tend to “split” between individuals; some
readily accept BVA(S, every professor, that professor) in such cases, whereas others find it
impossible to do so; such splits do not seem to happen with ‘their’, and if anything, happen in the
“opposite direction” with ‘their own’ (some finding any other interpretation besides BVA
impossible), so it seems that ‘that N’ is on the opposite side of the relevant spectrum. Further, as we
will discuss in subsequent sections, there have been certain claims made about phrases like ‘that N’,
which I will call “demonstrative phrases” for convenience, and their behavior with regard to BVA;
including demonstrative phrases as a choice of Y thus allows us to comment on these past claims,
even if briefly.
To reiterate, though these choices of X and Y were intended to be different than one
another, the main intention behind the experiment is not to compare them to one another. At times
I will highlight cases where there were minor differences in tendency, but this is merely reporting
what obtained in this particular experiment; I am not testing any claims regarding the behaviors of
these different X’s and Y’s or trying to claim that their behavior in the dataset resulting from this
particular experiment represents their behavior in general. As stated, the overall goal of using
different X’s and Y’s is to provide participants a diverse set of environments to make judgements in,
and to show that the (correlational) pattern that emerges is unchanged regardless of the choice of X
and Y, suggesting that the results are general and not being conditioned by specific lexical items.
Further, as noted in Chapter 3, multiple choices of X and Y give participants a greater chance of
having some X’s and Y’s differ in whether or not they are *B(S, X, Y), allowing a far less
“exclusionary” approach to data than has been taken in the past[177].
Given that we are not testing anything about each X or Y in particular, we are thus free to
“pair” them in fairly arbitrary ways. If we were trying to independently look at the effects of each
choice of X and Y on BVA, we would need a 3×3 design, that is, combining every X with every Y.
As we are not studying the different X’s and Y’s though, but instead using them as tools of
convenience, each X was paired with just one choice of Y and vice versa. Specifically, in each round,
the pairings are as follows:
(232) Round 1: X=Every professor Y=Their (student)
Round 2: X=Every professor but one Y=Their own (student)
Round 3: X=More than one professor Y=That professor(‘s student)
In addition to these choices of X and Y of BVA(S, X, Y), we also need Y’ of DR(S’, X, Y’)
and X’ of Coref(S’’, X’, Y); that is, for checking DR, we need some sort of quantified expression
rather than ‘Y’s student’, and for checking Coref, we need a non-quantificational element rather than
X. For Y’, ‘two students’ is used in this experiment, simply because it is easy to depict DR involving
two diagrammatically (higher numbers make the diagrams rather crowded). For X’, there are further
constraints given the fact that ‘that professor’ is a possible choice of Y in this experiment. Plesniak
2022b used ‘John’ and ‘that professor’ as choices of X’, but consider the following sentences:
[177] In some sense, following this logic, one could argue that the fact that most individuals used ‘their’
as Y while others used ‘his’ or ‘her’ (no one ended up requesting ‘his or her’) is not an analytical weakness, but a strength;
a diversity of pronoun choices does not result in any deviation from the predictions. Indeed, perhaps it would be better if
every participant had had a completely unique set of X’s and Y’s; there are logistical and analytic reasons that would make
conducting such an experiment more difficult, but such a thing could, in principle, be done.
(233) John spoke to that professor’s student.
(234) That professor spoke to that professor’s student.
A Coref(S, John, that professor) reading is quite hard to achieve in (233) (at least for me); it
is perhaps possible if one knows that John is a professor, but even then, it is rather awkward in most
contexts. (234), on the other hand, does fairly easily allow Coref(S, that professor, that professor),
for me at least, but now we are introducing a sort of confound in that X’ and Y are completely
identical. Such an issue is not problematic per se, but considering that when X’ switches to one of
the three X’s discussed above, this identity will be lost, we may be concerned that judgements on
Coref may not implicate onto BVA in the way predicted[178]. To preclude this possibility, an adjective
is introduced, rendering X’ ‘that new professor’; whether or not potential problems regarding the use
of ‘John’ or ‘that professor’ discussed above would have had any effect is unclear, but based on the
results, it seems that ‘that new professor’ did not do anything particularly “problematic”, at the very
least.
A final choice to make is that of the verb. Though both ‘praised’ and ‘spoke to’ are used in
previous English experiments discussed in Section 3.2, as mentioned there, ‘spoke to’ seems to be
easier for participants to accept with topicalization; as I noted in that section, this does not seem to
me to be reflective of spoken English, but rather, based on the infrequency of topicalization in
written English. For whatever reason, having a ‘to’ in front of a topicalized object seems to make it
look “more normal” in written English; though the presence of the preposition may make the
structure a bit more complex than necessary, this does not seem to make a difference with respect to
what c-commands what. As such, it seems a worthwhile trade off in English to make use of ‘spoke
[178] Especially given Hoji 2022b’s speculation about the role of matching between X and Y (see Section 2.7), though he is
primarily concerned with matching N-heads, which I do in fact deliberately use with the choice of Y=‘that N’. The point
though is that that choice is constant across Coref and BVA, whereas using ‘that professor’ as both X’ and Y of Coref but
not BVA would introduce an even greater identity into Coref that was absent in BVA.
to’; in the experiments in other languages, Korean and Mandarin Chinese, where this anti-
topicalization/scrambling “bias” does not seem to be so strong, this step is not taken, and
equivalents to ‘praised’ are used, given that they do not involve this extra complication.
With all this established, we can see the full form of the sentences employed in the English
experiments, modulo the choice of X and Y. For BVA, these are:
(235) Set 1:
a. X spoke to Y’s student.
b. Y’s student was spoken to by X.
c. To Y’s student, X spoke.
d. By X, Y’s student was spoken to.
(236) Set 2:
a. X was spoken to by Y’s student.
b. Y’s student spoke to X.
c. By Y’s student, X was spoken to.
d. To X, Y’s student spoke.
(237) Set 3:
a. X has a colleague who spoke to Y’s student.
b. X’s colleague spoke to Y’s student.
c. A colleague who spoke to Y’s student, X has.
d. To Y’s student, X’s colleague spoke.
The particular BVA sentences used in each round can be derived by substituting the relevant
X-Y pair, as given in (232). Similarly, the Coref sentences can be derived by substituting ‘that new
professor’ in for X, while the DR sentences can be derived by substituting in ‘two students’ for ‘Y’s
student’. A given sentence type thus appears nine times across the different rounds. Taking the
example of (235)a, for instance:
(238) Round 1:
Coref: That new professor spoke to their student.
DR: Every professor spoke to two students.
BVA: Every professor spoke to their student.
Round 2:
Coref: That new professor spoke to their own student.
DR: Every professor but one spoke to two students.
BVA: Every professor but one spoke to their own student.
Round 3:
Coref: That new professor spoke to that professor’s student.
DR: More than one professor spoke to two students.
BVA: More than one professor spoke to that professor’s student.
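The substitution procedure described above is mechanical enough to sketch in code. The following Python is a minimal illustration, not the procedure actually used to build the experiment: the names are hypothetical, the placeholder {YS} stands for the whole ‘Y’s student’ phrase (so that ‘their student’ and ‘that professor's student’ can both be slotted in), apostrophes are straight rather than typographic, and the alternative pronoun versions (‘his’, ‘her’, ‘his or her’) are not modeled.

```python
# Sentence frames from (235)-(237); {X} is X, {YS} is the "Y's student" phrase.
TEMPLATES = {
    ("Set 1", "a"): "{X} spoke to {YS}.",
    ("Set 1", "b"): "{YS} was spoken to by {X}.",
    ("Set 1", "c"): "To {YS}, {X} spoke.",
    ("Set 1", "d"): "By {X}, {YS} was spoken to.",
    ("Set 2", "a"): "{X} was spoken to by {YS}.",
    ("Set 2", "b"): "{YS} spoke to {X}.",
    ("Set 2", "c"): "By {YS}, {X} was spoken to.",
    ("Set 2", "d"): "To {X}, {YS} spoke.",
    ("Set 3", "a"): "{X} has a colleague who spoke to {YS}.",
    ("Set 3", "b"): "{X}'s colleague spoke to {YS}.",
    ("Set 3", "c"): "A colleague who spoke to {YS}, {X} has.",
    ("Set 3", "d"): "To {YS}, {X}'s colleague spoke.",
}

# X and "Y's student" for each round, following (232).
ROUNDS = {
    1: ("every professor", "their student"),
    2: ("every professor but one", "their own student"),
    3: ("more than one professor", "that professor's student"),
}

COREF_X = "that new professor"  # X' for Coref
DR_Y = "two students"           # Y' for DR, replacing the whole "Y's student"


def fill(template: str, x: str, ys: str) -> str:
    """Fill a frame and capitalize the first letter of the sentence."""
    sentence = template.format(X=x, YS=ys)
    return sentence[0].upper() + sentence[1:]


def build_items():
    """Return a dict mapping (round, set, type, reading) to a sentence."""
    items = {}
    for rnd, (x, ys) in ROUNDS.items():
        for (set_name, label), template in TEMPLATES.items():
            items[(rnd, set_name, label, "BVA")] = fill(template, x, ys)
            items[(rnd, set_name, label, "Coref")] = fill(template, COREF_X, ys)
            items[(rnd, set_name, label, "DR")] = fill(template, x, DR_Y)
    return items


items = build_items()
```

With twelve frames, three rounds, and three readings per frame, the sketch yields 108 sentences, and the entries for Set 1, type a, match the sentences listed in (238).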
4.1.3 Further Considerations
From the patterns I have laid out in (235)-(238), we can see that each participant in the
experiment judged 108 main items (not including tutorial items, see Section 3.3). There were 13
participants for the English experiment, making for roughly 1,400 datapoints collected. As discussed
in Sub-Section 3.4.2, however, not all of these datapoints are used in the final analysis. Though I
have provided an overview there, I will discuss in slightly more detail here the exclusions made in
the English experiment in particular.
Two basic issues arose that led to such exclusions being made. First, one individual was such
that their answers to the tutorial questions consistently did not align with the expected answers, and
this individual was not generally able to provide a clear explanation for their divergent answers or to
understand the expected answers when those were explained. As per the procedures laid out in
Section 3.3, a note was made on this participant’s datasheet to the effect that the data gathered could
not be considered reliable, as it was unclear if the participant had understood the task at hand. I will
reiterate that this decision was made and recorded before any answers to BVA judgements were
elicited from the individual in question; as such, this exclusion was in no way designed to eliminate a
participant with unusual or prediction-violating data.
Indeed, in Sub-Section 4.5.2, I will offer a summary of this excluded data, and it will be clear
that the individual in question had judgements that were both consistent with the predictions and
generally similar to the judgements of the other English speakers. There are a couple of areas in
which this individual does noticeably differ from the other English speakers, as will be discussed, but
none of these in any way contradict the predictions of the ABC-BVA law or the other supporting
hypotheses. The most notable difference between this individual and the others has to do with the
fact that, as will be discussed more in the following sections, all other English speakers had a robust
correlation between DR and BVA, stronger than the expected correlation between DR and Coref
together and BVA. That is, the other English speakers seem to obey the following pattern, somewhat
reminiscent of Barker (2012)’s proposal:
(239) When there is *A(S, X, Y) and *C(S, X, Y),
*DR(S’, X, Y’) → *BVA(S, X, Y)
This in no way contradicts Hoji’s correlation; (239) entails (227)c, though this result is still
somewhat surprising, especially given previous experiments on English did not find a similar pattern
(see Section 3.2). As such, it is hard to know how to interpret this result; nevertheless, it is true that
the participant excluded for comprehension reasons did not follow this pattern, but rather required
reference to both DR and Coref to accurately predict BVA judgements. Again, such behavior is
entirely as predicted by our hypotheses, but nevertheless, it is different from the patterns of the
other participants. Whether there is any significance to this difference will have to wait until future
experiments can investigate the issues surrounding the (239) judgement pattern further.
The second issue noted in Sub-Section 3.4.2 that is relevant for the English experiment
involves possessor binding sentences, those corresponding to (237)b and (237)d. Readers can review
the full description of the issue given there, but in essence, certain instances of the items containing
these sentences had unintended ambiguities[179] concerning whether the intended X of BVA(S, X, Y)
was actually X or ‘X’s colleague’. These ambiguities were facilitated by the somewhat underspecified
choices of Y, ‘their’ and ‘their own’. Fortunately, the final choice of Y, ‘that professor’, lacks such
ambiguity or at least strongly biases against it. Consider the following paradigm (simplified from the
actual experiment by using ‘every’ as the quantifier for X, when it would have been ‘more than one’):
(240) a. Every professor’s colleague spoke to their student.
b. Every professor’s colleague spoke to their own student.
c. Every professor’s colleague spoke to that professor’s student.
(240)a and (240)b are amenable to a BVA(S, every professor’s colleague, their (own))
reading; that is, it is plausible to think that the sentence expresses the meaning that the colleague(s)
each have a student who they spoke to. (240)c resists this reading, because it explicitly uses the
phrase ‘professor’ in the Y, which clearly refers back to ‘every professor’. Of course, it is imaginable
that someone might do some “mental gymnastics” and consider that the colleagues of professors are
also professors, but getting this reading involves doing a form of “analysis” that was not necessary in
the other two cases. Though it is of course imaginable that some might do this, based on the results
of all of the experiments performed, across all three languages, no one shows signs of having done
so (unlike for the other two, where there are indeed some signs, as we will discuss).
As mentioned in Sub-Section 3.4.2, it was hoped that explicit verbal clarification would help
alleviate the issue. Results from the experiment, however, cast doubt on this procedure’s efficacy,
though as in the case of the excluded individual, the results themselves were not particularly
problematic. As with the other excluded data, I will discuss these issues more in Sub-Section 4.5.2,
[179] Or more precisely, these items were not ambiguous, but as discussed, the relevant details that would have disambiguated
them were hard enough to catch that it was very understandable that participants might misunderstand the question in a
slight but very significant way, which many reported that they in fact did.
but in essence, we once again see that one individual (different from the excluded individual) requires
both Coref and DR, rather than just DR, to predict BVA. As I stated above, this is not a violation of
any of our predictions, but given the striking DR-BVA correlation that obtains everywhere else in
this dataset, it is noticeably “different” from other judgements. Further, this anomalous judgement
obtains with one of the “ambiguous” Y’s, rather than ‘that professor’. While innocuous on their
own, these facts taken all together suggested to me that at least some participants were indeed
suffering from the “ambiguity” issue I had hoped to avoid.
As such, at this point, I made the tentative decision to exclude the Set 3 items from Rounds
1 and 2 of the experiment (using ‘their’ and ‘their own’ as Y) from the main analysis of the
experiments in all languages[180] and to focus on the possessor binding data from Round 3, where
‘that professor’ serves as Y. At this point, the basis for the decision to exclude was merely a
“hunch”, but as it turns out, data subsequently gathered from the Korean experiment (to be discussed in the
next chapter) directly supported the guess that possessor-binding in Rounds 1 and 2, but not Round
3, is ambiguous, so evidence suggests that this hunch was correct.
As such, though they are all analyzed in Sub-Section 4.5.2, as well as viewable in the
data provided in the appendix, the datapoints coming from the one seemingly confused participant
are excluded from the main analysis, and of the remaining twelve, Set 3 items from Rounds 1 and 2
are excluded as well (besides as noted in the above footnote). While these exclusions are of course
unfortunate, they are principled, and, as we will see, they do not prevent us from providing plentiful
[180] Except in the rare cases where which element is serving as the X of BVA(S, X, Y) is not relevant for establishing the
intended point, e.g., certain cases having to do with the role of precedence, when ‘X’ and ‘X’s colleague’ both precede
or do not precede Y. In such cases, the data are still not fully analyzed per se, but may be useful for demonstrating the
“significance” of other data points, as discussed towards the end of Sub-Section 3.3.4; in graphical terms, this means that
such items may influence the “color” certain datapoints receive on some of the Venn diagrams, but will not themselves
be represented as points on those diagrams.
evidence for all the intended points.
4.2 A(S, X, Y)
In this section, the relationship between A(S, X, Y) and BVA(S, X, Y) for the individuals
surveyed is considered. As such, we are concerned with the following implication:
(241) (Under the correct conditions)
*A(S, X, Y) → *BVA(S, X, Y)
In other words, if the correct conditions are met, BVA(S, X, Y) is possible only if X precedes
Y in S. What we will endeavor to show, across all individuals, sentence-types, and choices of X and
Y, is that the content of these “correct conditions” is *C(S, X, Y) and *B(S, X, Y). To see that this
restriction is indeed necessary, we want to categorize each individual, for any given subset of the
data (all of it, with a particular choice of X-Y, particular choice of S, etc.), with regards to whether
the implication in (241) holds. Reminiscent of the divisions we saw in Section 3.2, we can divide up
individuals’ patterns of judgements on a given subset of our data as follows:
(242) On a given subset of the data…
a. “Non-implicating” (NI):
Individuals who sometimes accept BVA(S, X, Y) when X does not precede Y
b. “Fully implicating” (FI):
Individuals who never accept BVA(S, X, Y) when X does not precede Y,
but who do sometimes accept BVA(S, X, Y) when X does precede Y.[181]
c. “Degeneratively implicating” (DI):
Individuals who never accept BVA(S, X, Y)
[181] Further, these individuals must also (at least sometimes) accept BVA(S, X, Y) with other members of the same
(theta-role-matched) set as the elements they reject, such that it is clear that there is not some sort of “meaning-based” issue
with BVA involving that particular X and Y in those particular “theta roles”. I will make further comments on this later
in this section; it plays a more minimal role here, as we are primarily focusing on individuals’ overall patterns (at least on a
given subset of the data), rather than those individuals’ specific judgements.
In essence, those who are “non-implicating” simply do not obey the implication in (241) at
all, at least in the particular subset of the data in question; one of our main goals is to show that, if
the subset is chosen to be datapoints where we have *B(S, X, Y) and *C(S, X, Y), there are no such
individuals, but that no (straightforward) superset of that subset has this property, matching our
expectations given the ABC-BVA law. The second group, “fully implicating”, are the ones who not
only obey the implication, but show clear, precedence-based contrasts. Ideally, we want to show
that, when the aforementioned *B+*C subset is examined, there are such individuals falling into this
group, as these individuals are the ones who most strongly support the basic claim that A(S, X, Y)
constrains BVA(S, X, Y). These two terms can be compared to Hoji’s “disconfirming” and
“detection” designations respectively. Similarly, the final term, “degeneratively implicating” is quite
like Hoji’s “neutral” designation. These individuals give judgments that are compatible with the
implication but only because of the technicality that they never accept BVA readings of the
sentences in the given subset of the data; this does not really tell us much about whether linear
precedence was important to them or not.
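The three-way categorization in (242) amounts to a simple decision procedure over an individual's judgements on a chosen subset of the data. The Python below is a rough sketch of that procedure under simplifying assumptions: the function name and the input format (pairs of booleans) are hypothetical, and the additional theta-role-matching requirement on “fully implicating” individuals is not modeled.

```python
def classify(judgements):
    """Classify one individual's judgements on a subset of the data, per (242).

    `judgements` is a sequence of (x_precedes_y, bva_accepted) pairs,
    one pair per judged sentence.
    """
    judgements = list(judgements)
    accepts_without_precedence = any(
        accepted and not precedes for precedes, accepted in judgements
    )
    accepts_with_precedence = any(
        accepted and precedes for precedes, accepted in judgements
    )
    if accepts_without_precedence:
        return "NI"  # non-implicating: BVA accepted although X does not precede Y
    if accepts_with_precedence:
        return "FI"  # fully implicating: BVA accepted only when X precedes Y
    return "DI"      # degeneratively implicating: BVA never accepted


# Hypothetical judgement patterns, for illustration only.
example_ni = classify([(True, True), (False, True)])
example_fi = classify([(True, True), (False, False)])
example_di = classify([(True, False), (False, False)])
```

Restricting the input to a particular subset (for instance, only datapoints with *B and *C) changes which category an individual falls into, which is exactly the comparison the following tables are built around.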
As mentioned, we can look at individuals’ BVA judgements across the whole dataset, or on
sentences from specific sets, or sentences using certain X-Y pairs. These X-Y pairs, I will abbreviate
throughout this chapter as below:
(243) a. Every Professor-Their EP-T
b. Every Professor But One-Their Own EPBO-TO
c. More Than One Professor-That Professor(‘s) MTOP-TP
Let us first assume that only precedence is relevant, i.e., that there are no “correct
conditions” to be met at all for the implication to hold. We can generate the following table, where
each cell counts the number of individuals falling into the categories laid out in (242), arranged in the
format “X/Y/Z”, where X is the number of “fully implicating” individuals, Y is the number of
“degeneratively implicating” individuals, and Z is the number of “non-implicating” individuals. Were
the implication in (241) without qualification, we would expect to see no “non-implicating”
individuals, and hopefully many “fully implicating” individuals, but this is not what happens. Indeed,
this issue is not alleviated by focusing on specific sets or choices of X-Y pair, or even combinations
thereof; in all instances, individuals remain who accept BVA(S, X, Y) in cases when X does not
precede Y:
(244) Categorization of individuals with regards to (241)
FI/DI/NI    EP-T        EPBO-TO     MTOP-TP     All X-Y’s
Set 1       1/1/10      3/0/9       6/3/3       2/0/10
Set 2       3/0/9       0/0/12      5/4/3       0/0/12
Set 3       N/A[182]    N/A         5/1/6       N/A
All Sets    2/0/10      0/0/12      4/1/7       0/0/12
Because the number of individuals involved is only 12, we do not want to draw strong
conclusions based on the proportions of numbers in one cell vs. another. It is, however, striking
that, considering the data as a whole, all individuals are “non-implicating”. That is, there are no
individuals who have a completely “precedence-based” pattern to BVA. Indeed, most cells are such
that the majority of individuals are non-implicating. The only exceptions are found in some of the
cells in the MTOP-TP column of (244). There though, quite a few individuals are degeneratively
implicating, suggesting there were challenges in accepting BVA(S, more than one professor, that
professor) in general. As we can see from the intersection of “All Sets” and “MTOP-TP” though,
ultimately, all but one individual did eventually accept such readings at least once.
182
These N/A cells come about as a result of the exclusions discussed in Sub-Section 4.1.3; technically, the cell that is the
intersection of the “All X-Y’s” column and “Set 3” row has a value, but because MTOP-TP is the only column it would
draw from, it would be identical to the Set 3-MTOP-TP cell, so I have marked it “N/A” to avoid giving a false impression
that something is being tabulated.
Clearly, precedence alone does not correctly predict BVA. If we add in the first of our two
“correct conditions”, namely if we have also ensured that X does not c-command Y, *C(S, X, Y),
the situation improves somewhat. In essence, this means considering only the cases of “weak
crossover” passives and actives, and the possessor binding cases, comparing their topicalized
versions with their non-topicalized versions, which permutes precedence but not c-command¹⁸³. We
thereby generate the following table:
(245) Categorization of individuals with regards to (241) (c-command controlled)
FI/DI/NI    EP-T      EPBO-TO    MTOP-TP    All X-Y’s
Set 1       7/1/4     3/1/8      8/4/0      2/1/9
Set 2       9/0/3     5/1/6      7/5/0      6/0/6
Set 3       N/A       N/A        6/3/3      N/A
All Sets    8/0/4     3/1/8      6/3/3      2/0/10
We find that there are indeed two “fully implicating” individuals under these conditions, that
is, individuals for whom X preceding Y is the sole determining factor of whether BVA is possible in
cases where X does not c-command Y. The other ten, however, are still non-implicating. If we look
across the different conditions, we can see that many cells now have considerably fewer non-implicating
individuals in various subsets of the data; some, namely Sets 1 and 2 in the MTOP-TP cases, even
have no such individuals while still having a high number of fully implicating individuals. Overall,
though, we can see that, while the numbers of non-implicators have certainly been reduced in
specific scenarios, across the board, there are still many cases of accepting BVA(S, X, Y) in cases
where X neither precedes nor c-commands Y. We thus will need to maintain not only *C(S, X, Y), but
also *B(S, X, Y) as a necessary condition for the implication in (241) to hold.
183
Note that topicalized weak crossover cases are thereby grouped with non-topicalized possessor-binding cases as
sentences where X precedes Y, and vice versa for sentences where X does not precede Y.
We could repeat the above exercise for DR and Coref as well. This, however, is not
particularly relevant to us, as we do not actually make predictions about DR and Coref judgements
themselves¹⁸⁴. What we are interested in is DR and Coref’s ability to diagnose *B(S, X, Y). The
question is thus not whether DR(S’, X, Y’) and Coref(S’’, X’, Y) are accepted when X/X’ does not
precede Y’/Y, but rather, whether cases where BVA(S, X, Y) is accepted without precedence
“line up” with cases where the corresponding DR and Coref cases are also accepted without
precedence, at least once *C(S, X, Y) is ensured.
I will show shortly below that the answer to this question is indeed “yes”. Further, it will also
be shown that, in cases where such effects do not take place, once we have guaranteed *C(S, X, Y),
the implication in (241) does not merely hold, but is in fact directly supported by a number of
minimal and near minimal pairs wherein the presence or absence of X preceding Y in S modulates
the acceptability of BVA(S, X, Y).
To do this, we must transition from looking at individuals to looking at individuals’
judgements on particular sentences of interest. This is because, for the “lining up” to occur, we must
match judgements not only in terms of individuals, but also in terms of specific choices of S, X, and
Y, ensuring *B(S, X, Y). From the preceding tables, we can see that there are effectively 7 “sets” of
sentences under consideration, six of these deriving from combinations of Sets 1 and 2 with all three
different X-Y pairs, and the last being the one instance of Set 3 analyzed, using the ‘more than one
professor’-‘that professor’ pair. Within each of these 7 sets, there are two sentences where X
does not precede Y, of which one is such that X does not c-command Y either. There are thus 14
sentences where we have *A(S, X, Y), and 7 sentences where we have both *A(S, X, Y) and *C(S, X,
Y); these latter seven are the three (non-topicalized) weak crossover passives, the three
(non-topicalized) weak crossover actives, and the one topicalized possessor binding construction. The
other seven are topicalized actives and passives where X is the subject and Y is in the topicalized
object/agent phrase.
184
The relevant data can be easily generated from what is given in the appendix. To save curious readers the trouble, the
basic conclusion is that DR and Coref are even less well behaved than BVA; that is, even once we ensure *A(S, X, Y) and
*C(S, X, Y), Coref(S, X, Y) and DR(S, X, Y) are still very often accepted. In some specific S-X-Y combinations, they are
not, but once we generalize across a given X-Y pair or a given set, then usually almost all individuals accept the MR in
question without X c-commanding or preceding Y. This will be somewhat different in the Korean and Mandarin Chinese
cases, which I will mention in the relevant chapters.
For each sentence with *A(S, X, Y), for each individual, we want to see whether BVA(S, X,
Y) is accepted or rejected, especially those wherein we have *C(S, X, Y). Further, we want to focus
on individual-X-Y-construction combinations for which we have diagnosed *B(S, X, Y). Thus, we
must consider each participant’s judgements on DR(S’, X, Y’) and Coref(S’’, X’, Y), and apply the
correlational methodology described in Chapters 2 and 3 to detect whether there is (or at least might
be) B(S, X, Y). Indeed, as I have mentioned in Section 2.14.1, at least for these particular English
speakers judging these particular sentences, it seems as if DR(S, X, Y) alone is sufficient as a
diagnostic tool for B(S, X, Y) (see discussion in Section 4.3 as to why I am including this “post-hoc”
DR-only method). Thus, we may obtain a more precise, though somewhat post-hoc, sub-analysis by
focusing only on DR and not Coref. I will thus report results in terms of both the DR+Coref and
DR alone methods.
Similar to the diagrams from previous works given in Section 3.2, we want to classify each of
these judgements in two ways. First, considering only BVA judgements themselves, we will have a
system resembling the three-way division given in (242), but with different labels and with different,
albeit reminiscent, definitions for each group. Crucially, in (242), we were interested in classifying
participants (on particular subsets of the data), whereas here, we will be more interested in individual
*Schema judgements, and their corresponding okSchema judgements by the same participant. We
will thus be returning to a classification system very much like the ones used in the experiments
discussed in Section 3.2, albeit adapted to work for precedence-effects rather than c-command
effects. The three groups, which are somewhat “telegraphically” labeled given that they need to fit in
graphs, and their classification criteria are as follows:
(246) For a given *A(S, X, Y) sentence:
a. *A(S, X, Y) and BVA(S, X, Y)
The individual in question accepted this sentence with a BVA(S, X, Y) reading.
b. *A(S, X, Y)→*BVA(S, X, Y), okA(S, X, Y)¹⁸⁵ and BVA(S, X, Y)
The individual in question rejected this sentence with a BVA(S, X, Y) reading,
but accepted BVA(S, X, Y)¹⁸⁶ in sentences where X preceded Y but did not c-command it
and also accepted BVA in sentences from the same set¹⁸⁷ as the sentence in question.
c. Just *A(S, X, Y)→*BVA(S, X, Y)
Like b., but insufficient minimal pair sentences were accepted; either the individual
never accepted BVA(S, X, Y) in a sentence where X preceded Y, or the individual
did not accept sentences from the same set as the sentence in question.
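The three-way classification in (246) can be sketched procedurally. The following is only an illustrative sketch, not the procedure actually used to build the diagrams: the `Judgement` record, its field names, and the `classify` function are hypothetical, and the "same set" condition is read literally as any accepted BVA judgement by the same individual with the same X-Y pair in the same Set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Judgement:
    """One individual's BVA judgement on one sentence (hypothetical record layout)."""
    participant: int
    xy_pair: str         # e.g. "EP-T", "EPBO-TO", "MTOP-TP"
    sentence_set: int    # 1, 2, or 3
    x_precedes_y: bool
    x_ccommands_y: bool
    bva_accepted: bool

def classify(target, all_judgements):
    """Return 'a', 'b', or 'c' following (246) for a sentence where X does not precede Y."""
    if target.x_precedes_y:
        raise ValueError("(246) only classifies *A(S, X, Y) sentences")
    if target.bva_accepted:
        return "a"  # BVA accepted without precedence
    # Accepted BVA judgements by the same individual with the same choice of X and Y.
    accepted = [
        j for j in all_judgements
        if j.participant == target.participant
        and j.xy_pair == target.xy_pair
        and j.bva_accepted
    ]
    # Condition 1: BVA accepted where X precedes but does not c-command Y.
    precedence_only = any(j.x_precedes_y and not j.x_ccommands_y for j in accepted)
    # Condition 2: BVA accepted in some sentence from the same set.
    same_set = any(j.sentence_set == target.sentence_set for j in accepted)
    return "b" if precedence_only and same_set else "c"
```

Under this reading, a single accepted sentence can satisfy both conditions for (246)b at once, which is consistent with the remark that the two conditions can, but need not, be satisfied by the same sentence(s).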
In essence, (246)a is like (242)a; both involve BVA(S, X, Y) being accepted when X does not
precede Y. (246)b is also like (242)b, but it is a bit stricter; not only does the individual have to
accept BVA(S, X, Y) in sentences where X precedes Y, those sentences have to be
ones where X also does not c-command Y. The individual must also accept BVA in sentences from the same
“set” as the rejected sentence in question¹⁸⁸. The first of these conditions guarantees that the
individual can accept BVA(S, X, Y) purely based on precedence with the given X-Y pair; they are
not only accepting c-command-based BVA, for example. The second of these conditions ensures
that we cannot attribute the non-acceptance of BVA in the *Schema sentence in question to some
problem with the narrow “meaning” of the sentence; the individual in question accepts BVA in at
least one other sentence that has the same superficial (i.e., theta-role-based) “meaning” as the
sentence in question. Turning to the last category, (246)c, like (242)c, is the category for the
remainders, in this case, those judgements which, though definitely not cases of (246)a, nevertheless
fall short of the rather exacting standards for “full significance” required by (246)b. Summarized
briefly, (246)a are judgements that do not follow precedence-based patterns, (246)b are judgements
that strongly follow precedence-based patterns, and (246)c are those that follow precedence-based
patterns, but do so in a weaker way than those categorized as (246)b.
185
Here is where the “telegraphic” aspect can be confusing; it should really be okA(S, X, Y) and *C(S, X, Y), as the text
makes clear.
186
To be clear, this is BVA with the same choice of X and Y; X and Y mean particular choices of X and Y, not just any X
and Y.
187
“Same set” meaning the same choice of Set 1, 2, or 3 with the same choice of X and Y.
188
These two conditions can, but need not, be satisfied by the same sentence(s).
In the diagrams to be given shortly below, the categories in (246) will determine the quality
of the “dots” that represent each judgement on a *A(S, X, Y) sentence; similar to Plesniak 2022b,
those falling into category (246)a are red squares, those falling into category (246)b are green circles,
and those falling into (246)c are yellow triangles. As mentioned above, each of these dots must also
be categorized in a second way as well, regarding the presence or absence of the interfering factors,
B(S, X, Y) and C(S, X, Y). The latter is rather straightforward; if a sentence is such that X does not
c-command Y, it will appear inside the *C(S, X, Y) circle, and if it is such that X does c-command Y,
it will appear outside of it¹⁸⁹. Turning to B, as with the experiments discussed in Section 3.2, because
B(S, X, Y) is diagnosed experimentally, what will be represented on the diagram are the results of the
corresponding DR and Coref tests. Like in those diagrams, if a given sentence is such that the
individual judging it accepted DR/Coref in the corresponding sentence¹⁹⁰ (i.e., the same sentence
with ‘two students’ substituted in for ‘Y’s student’ or ‘that new teacher’ substituted in for X), then
the judgement dot will appear outside the DR/Coref circle. If the individual in question did not
accept DR/Coref in the relevant corresponding sentence, then the dot will be inside the circle. If a
dot is inside both the DR and Coref circles, then *B(S, X, Y) is diagnosed as per (226), i.e., quirky
effects are absent for that speaker with that choice of S, X, and Y. As mentioned, there is also a
simpler “post-hoc” method that relies on just the DR test; under this version, just being within the
DR circle is enough to qualify as *B(S, X, Y), without regard to being inside or outside the Coref
circle. To indicate these two different versions of the test for B(S, X, Y), I have made the Coref
circle dashed, to indicate its secondary, possibly redundant, role.
189
Since there are no sub-experiments in the sense of the previous experiments discussed in Section 3.2, the *C(S, X, Y)
circle occupies the same place in the diagram (the top circle) as those did.
190
Technically, according to how we have formulated the test, it is not applicable for sentences where X c-commands Y.
I will suppress this technical issue for the sake of conveying the relevant information; it has no bearing on our conclusions,
as those sentences are included mainly for comparison. This will also be the case in all such Venn diagrams going forward.
Before discussing the predictions, let me simply show the resulting diagram, so readers need
not hold all the above in memory:
(247) a. Venn Diagram
b. Summary Table
# of judgements    Green    Yellow    Red    Total
*C, DR, Coref      10       3         0      13
*C, DR             13       4         0      17
*C, Coref          8        6         6      20
DR, Coref          4        3         6      13
*C                 14       2         18     34
DR                 8        3         6      17
Coref              1        4         15     20
None               4        2         28     34
Total              62       27        79     168
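As a cross-check on the summary table, the counts can be re-tallied mechanically. The sketch below simply transcribes the rows of (247)b; the dictionary layout and the `tally` helper are illustrative assumptions rather than part of the original analysis. Each key lists the circles a judgement falls inside.

```python
# Rows of (247)b: circles a dot falls inside -> (green, yellow, red).
COUNTS = {
    ("*C", "DR", "Coref"): (10, 3, 0),
    ("*C", "DR"): (13, 4, 0),
    ("*C", "Coref"): (8, 6, 6),
    ("DR", "Coref"): (4, 3, 6),
    ("*C",): (14, 2, 18),
    ("DR",): (8, 3, 6),
    ("Coref",): (1, 4, 15),
    (): (4, 2, 28),
}

def tally(required):
    """Sum (green, yellow, red) over all regions lying inside every circle in `required`."""
    rows = [v for k, v in COUNTS.items() if set(required) <= set(k)]
    return tuple(sum(col) for col in zip(*rows))

print(tally([]))                     # whole dataset: (62, 27, 79)
print(tally(["*C", "DR", "Coref"]))  # DR+Coref method: (10, 3, 0)
print(tally(["*C", "DR"]))           # DR-only method: (23, 7, 0)
```

The grand total of 168 judgements matches 12 individuals times the 14 sentences where X does not precede Y, assuming every individual judged every such sentence.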
Returning now to the predictions: regardless of how we diagnose *B(S, X, Y), we make the
strong prediction that if there is *C(S, X, Y) and *B(S, X, Y) for a given individual (for a particular S-
X-Y), then *A(S, X, Y) must imply *BVA(S, X, Y). Thus, our prediction is supported by any
instance of a sentence that is such that there is *C(S, X, Y) and *B(S, X, Y), X does not precede Y,
and BVA(S, X, Y) is judged impossible by the participant in question. If the same conditions are met
but BVA(S, X, Y) is judged to be possible, then the prediction is contradicted. If the conditions are
not met, then we make no definite prediction; we predict instead that BVA(S, X, Y) might be
acceptable. As mentioned in Section 3.2, such acceptability can in fact be used to strengthen the
significance of our results. Specifically, because our focus is A(S, X, Y), for cases where we have
both *A(S, X, Y) and *BVA(S, X, Y), we ideally want to further demonstrate that the presence of
A(S, X, Y) in S makes BVA(S, X, Y) sometimes possible, even in the absence of c-command and
quirky effects.
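The logic of the prediction can be restated as a small decision procedure. This is a minimal sketch under the assumptions stated in the text; the function and parameter names are hypothetical, and `dr_only` stands for the “post-hoc” DR-only method of diagnosing *B(S, X, Y).

```python
def diagnoses_not_b(dr_accepted, coref_accepted, dr_only=False):
    """*B(S, X, Y) is diagnosed when the corresponding DR (and, unless
    dr_only is set, Coref) reading was rejected by the individual."""
    if dr_only:
        return not dr_accepted
    return not dr_accepted and not coref_accepted

def prediction_status(x_precedes_y, x_ccommands_y, not_b, bva_accepted):
    """Classify one judgement relative to *A(S, X, Y) -> *BVA(S, X, Y)."""
    controlled = (not x_precedes_y) and (not x_ccommands_y) and not_b
    if not controlled:
        return "no prediction"  # BVA might be acceptable here
    return "contradicted" if bva_accepted else "supported"

# A weak crossover sentence with DR rejected but Coref accepted:
not_b = diagnoses_not_b(dr_accepted=False, coref_accepted=True, dr_only=True)
print(prediction_status(False, False, not_b, bva_accepted=False))
```

Under the DR+Coref method the same judgement would yield "no prediction", since accepting the Coref reading leaves open the possibility of a quirky effect.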
Diagrammatically, what we expect is that, though there may be red dots, there should be
none in the central intersection, where we have guaranteed both *C(S, X, Y) and *B(S, X, Y); that is,
because all the judgements represented are on sentences where X does not precede Y,
BVA(S, X, Y) should be impossible if the aforementioned conditions are met. Thus, under the
original approach to quirky effect detection, in the intersection of the *C(S, X, Y) circle, the DR
circle, and the Coref circle, we predict there to be no red dots. If we adopt the more post-hoc
method of quirky detection relying on just DR, then we predict that this “no red dots” status will
also extend to the area where the DR and *C circles intersect but which is outside of the Coref
circle. Regardless of the *B detection method, we further hope to see as many non-red dots as
possible inside the intersection where we have *B(S, X, Y) and *C(S, X, Y), ideally green dots as
those are the most “significant” in establishing minimal-pair-based evidence for the role of
precedence in constraining/allowing BVA readings.
Turning to the results shown in the chart and the accompanying summary table in (247), the
plurality of judgements on *A(S, X, Y) sentences are in fact “red”, that is BVA(S, X, Y)-accepting. A
narrow majority, however, are BVA-rejecting, green or yellow, though the numbers are quite close,
89 green/yellow to 79 red; we can say that BVA was roughly as likely to be accepted as it was
rejected in sentences where X did not precede Y.
When classified according to *C(S, X, Y) and *B(S, X, Y), however, whatever symmetry
there is between acceptances and rejections completely breaks down. If we consider the judgements
such that the corresponding *A(S, X, Y) sentence is hypothesized to also be *C(S, X, Y) and is
identified by the DR and Coref tests as being *B(S, X, Y) for the individual making the judgement,
then there are no “reds” in the group whatsoever; there are 10 “greens” and 3 “yellows”. Further, if
we ignore the Coref test for *B(S, X, Y) and rely solely on the DR test, the total number of included
dots grows dramatically, but again no reds are found: there are 23 (10+13) greens and 7 (3+4)
yellows. As such, we can see clearly that, when B(S, X, Y) and C(S, X, Y) are controlled for, *A(S,
X, Y)→*BVA(S, X, Y) holds without exception. Furthermore, in the vast majority of cases, the individual
in question does accept BVA(S, X, Y) in (at least some) cases where X does precede Y, suggesting
that the lack of X preceding Y is indeed responsible for the unavailability of BVA(S, X, Y) in such
sentences.
These results are exactly as predicted by the ABC-BVA law and show direct evidence of
precedence’s involvement in constraining BVA. However, several questions arise due to the
heterogeneous nature of the data. The judgements “in the center”, that is, in the intersection of the
*C(S, X, Y) and the DR: *B(S, X, Y) circles (and the Coref: *B(S, X, Y) circle, if we want to include it), represent
the “controlled” cases, wherein the *A(S, X, Y)→*BVA(S, X, Y) prediction is expected to hold.
Those judgements not in this “center” are of course still quite relevant, but they do not allow
anything about *A itself to be tested directly given the presence of confounding factors like c-
command and quirky effects.
Given the nature of the chart, however, we do not know the “demographic” makeup of the
judgements in the center; how many different individuals are represented? Which choices of X-Y
pair and which constructions? Absent this information, one might rightly worry that these
datapoints come from just a few of the individuals, or only from one choice of X-Y pair, or one
sentence type. If that were the case, while it would not contradict our predictions, it would limit their
significance; our claim is that the role of precedence is universal, and should find reflexes in every
individual, X-Y pair, and sentence type used to generate the dataset¹⁹¹.
Fortunately, inspection of the data reveals the results are quite general. Let us “zoom in” and
label each point in the center according to the following three-part numbering system:
(248) i-j-k
i: The (arbitrary) number identifying the participant in question, 1-12.
k: The choice of X and Y, 1 for ‘every professor’-‘their’, 2 for ‘every professor but one’-‘their own’,
and 3 for ‘more than one professor’-‘that professor’.
j: The sentence type in question, 1 for (non-topicalized) weak crossover passive, 2 for
(non-topicalized) weak crossover active, and 3 for topicalized possessor-binding¹⁹².
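To make the labeling scheme concrete, the following sketch decodes a label of the form i-j-k. It takes the definitions in (248) at face value, so the second slot (j) is the sentence type and the third slot (k) is the X-Y pair; the function and dictionary names are hypothetical.

```python
# Codes as defined in (248); the names of these dictionaries are hypothetical.
SENTENCE_TYPES = {
    1: "weak crossover passive",
    2: "weak crossover active",
    3: "topicalized possessor-binding",
}
XY_PAIRS = {
    1: "every professor / their",
    2: "every professor but one / their own",
    3: "more than one professor / that professor",
}

def parse_label(label):
    """Split an 'i-j-k' label into participant, sentence type (j), and X-Y pair (k)."""
    i, j, k = (int(part) for part in label.split("-"))
    if not 1 <= i <= 12:
        raise ValueError(f"participant number out of range: {i}")
    return {
        "participant": i,
        "sentence_type": SENTENCE_TYPES[j],
        "xy_pair": XY_PAIRS[k],
    }

print(parse_label("4-2-3"))
```

For example, the label 4-2-3 would denote participant 4 judging a weak crossover active sentence with the ‘more than one professor’-‘that professor’ pair.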
Once we zoom and label accordingly, we have the following:
191
It is possible that there are, for example, X-Y pairs that do not permit precedence-based BVA, at least for a given
individual (Hoji, p.c. Aug 2021, argues this to be the case). This would not contradict the ABC-BVA law, but would add
to the properties that would need to be satisfied to meet the criteria for A(S, X, Y), just as, say, considerations of locality
would add to the properties that would need to be satisfied to meet the criteria for C(S, X, Y). The point, here, however,
is that we have no reason to believe that the X-Y pairs chosen for this experiment are such X-Y pairs, and part of our
intent in choosing different such pairs is to show that precedence effects obtain across a variety of different lexical items.
As such, we want to demonstrate that all the particular X-Y pairs used in this experiment show signs of precedence-based
behavior regarding BVA.
192
That is, 1=[Y was V by X’s N], 2=[Y V X’s N], and 3=[Y’s N, X’s N2 V].
(249)
We can see that there is a wide variety of different individual-X-Y-sentence combinations
represented. To go into more detail, if we first consider all the labeled judgements in the center
(ignoring the dashed Coref circle), then every individual is represented at least once, most more than
once, many several times. Further, every X-Y pair and every construction is represented anywhere
from 4 (the lowest being the ‘every professor but one’-‘their own’ X-Y pair) to 17 times (the highest
being the (weak crossover) actives). Indeed, not only is every X-Y pair and every construction
represented at least once, most several times, but in fact every S-X-Y set is represented at least once,
and in almost every case more than once.
Looking only at the “greens”, which are the most significant for considerations of
precedence because they are those for which precedence-based and theta-role-matched minimal pair
contrasts could be established, little changes from when we were looking at both greens and yellows,
except that no datapoints are found for individual 4; the other individuals, as well as the various
S-X-Y combinations, are all represented among the green dots, usually multiple times.
If we repeat this exercise considering only those who passed both the DR and the Coref
tests for *B(S, X, Y) (so this time, only those within the dashed Coref circle), the results are naturally
somewhat diminished, but are still fairly comprehensive. If we consider both yellows and greens,
nothing qualitatively changes for the various X-Y pairs, constructions, and combinations thereof. Fewer
individuals, however, make it in, with individuals 2, 3, 7, and 9 having no relevant judgements. If we
further restrict to just greens, individual 4 is once again excluded, and the ‘more than one professor’-
‘that professor’ pair in the weak crossover passive construction is not represented. However, every
other X-Y-construction pair is, and most of the X-Y pairs and constructions by themselves are
represented multiple times.
As such, even under the most restrictive measure of what “counts” as supporting data, the
prediction in question is still widely borne out across individuals, X-Y pairs, and constructions.
Relaxing this measure a bit allows for more evidence for the universality of the prediction, but
regardless, the predicted effect of precedence is demonstrated across all choices of X-Y, all
constructions, all or almost all pairings thereof, and all or almost all individuals. Taken all together,
this evidence thus supports the prediction that, once we have guaranteed X does not c-command Y
and no quirky effects intervene, *A(S, X, Y)→*BVA(S, X, Y), as predicted by the ABC-BVA law; in
such cases, linear precedence alone is frequently enough to enable BVA to be accepted, and if linear
precedence is absent, then BVA becomes universally unavailable.
4.3 B(S, X, Y)
In this section, we consider the role of B(S, X, Y) in constraining BVA(S, X, Y) in English.
As noted, B(S, X, Y) constitutes a potentially heterogeneous mixture of the various
lexically/semantically/pragmatically conditioned “quirky effects” that Hoji’s correlational diagnostics
(the DR and Coref tests) can capture. As such, these effects are not tied to the particular sentences
in the way that linear precedence or c-command are, but rather, appear idiosyncratically based on the
individual in question.
We have already seen in the previous section that considerations of B(S, X, Y) are
indispensable for correctly predicting the impossibility of BVA(S, X, Y). As such, in this section, we
will consider several additional issues: (I) whether *B(S, X, Y) alone is sufficient to predict *BVA(S,
X, Y), (II), the role of DR and Coref in establishing B(S, X, Y), expanding on the discussion in the
previous section, and finally, (III), whether we can see clear, minimal pair-like contrasts based on the
presence/absence of B(S, X, Y). The answers to these questions will be: (I), *B(S, X, Y) alone is not
sufficient, (II), DR alone seems to be sufficient to diagnose *B here, and (III), yes, we can see
“quirky-based” contrasts.
I turn first to question (I). As discussed briefly in Section 2.6, some (e.g. Safir (2004), Barker
(2012)) have discussed the possibility that what I am considering quirky properties might be
sufficient to capture the distribution of BVA; that is, perhaps BVA(S, X, Y)’s availability depends
purely on the semantic/lexical/pragmatic properties of S, X, and Y, and reference to other factors
like precedence and c-command is not needed. I will be arguing that this is not the case, and that as
per the ABC-BVA law, all three types of factors must be equally considered.
This debate can be captured in terms of a disagreement about the content of “under the
correct conditions” in (250) below, which is simply (241) from the previous section, with A replaced
with B.
(250) (Under the correct conditions)
*B(S, X, Y)→*BVA(S, X, Y)
The “quirky absolutist” position I described above would hold that the reference to “correct
conditions” in (250) is redundant, and consideration of *B alone is sufficient to predict *BVA; that
is, unlike under the ABC-BVA law, we need not ensure *A(S, X, Y) and *C(S, X, Y) (which for us
reduces to X neither preceding nor c-commanding Y), for the *B(S, X, Y)→*BVA(S, X, Y)
implication to hold. Testing this possibility will require us to relax our standards for how quirky
effects are detected greatly; as phrased in the definitions given throughout the previous two
chapters, such tests are only applicable when dealing with sentences where X neither precedes nor c-
commands Y, as these are alternative potential sources of Coref, DR, and BVA. The claim of the
“quirky absolutist” position, however, is that such considerations are not relevant for the
establishment of MR’s; the only relevant factors are found in the semantics or other such meaning-
related module(s) of the mind. I will thus assume that under such an account, since no other factor
could be enabling DR or Coref, that the DR and Coref tests would validly diagnose quirky effects
regardless of whether X precedes or c-commands Y in the sentences under consideration. We can
then easily test the resulting predictions against those made by the ABC-BVA law.
We actually already know to some extent that the ABC-BVA prediction does indeed work
out; as shown in (247), there are no cases where we have *A(S, X, Y), *B(S, X, Y), and *C(S, X, Y)
but BVA(S, X, Y) is accepted. This pattern, however, is also consistent with the quirky-absolutist
position. What we would want to do to test this other prediction would be to check sentences
where X may precede and/or c-command Y, but where the “expanded”
DR/Coref tests nevertheless diagnose *B(S, X, Y), and see if BVA is ever accepted. There are various possible
ways that this “expansion” could occur. As noted in the paragraph above, the test for *B(S, X, Y)
given in (226) technically only applies to cases where we have *A(S, X, Y) and *C(S, X, Y). There are
many conceivable ways of modifying these diagnostic criteria, but I will explore two fairly
straightforward ones here.
One option is to roughly follow the spirit of Barker (2012)’s “operational
test for scope” (see Section 2.6) and hold, as I described above, that only the factors diagnosed by
the DR/Coref tests, as expanded to apply to any sentence, are relevant to establishing DR,
Coref, and BVA. In that case, the restriction of (226) to cases where we have *C(S, X, Y) and *A(S,
X, Y) is simply irrelevant; surface scope and inverse scope DR, for example, have essentially the
same “status” in this theory, with neither being “special” because of c-command. As such, the
procedure is simply, for every S, to check its respective DR and Coref possibilities in S’ and S’’. If
DR(S’, X, Y’) and/or Coref(S’’, X’, Y) is possible, then BVA(S, X, Y) may be possible, and if not,
then BVA(S, X, Y) should be impossible. For example, following this logic, ‘every professor spoke
to their student’ should allow a BVA(S, every professor, their) reading only if ‘every professor spoke
to two students’ allows a DR(S, every professor, two students) reading, and/or ‘that new teacher
spoke to their student’ allows a Coref(S, that new teacher, their) reading.
It is quite straightforward to show that this prediction does not hold of the English data. Let
us recall the categorization of individuals into “fully implicating”, “degeneratively implicating”, and
“non-implicating” given in (242) in the previous section. We now substitute in *B(S, X, Y) for *A(S,
X, Y), but leave all other conditions mostly the same. There is a slight issue with regards to what
counts as “fully-implicating”, as there are not predesignated minimally contrasting sentences that
have quirky effects which can be compared to a given instance of *BVA(S, X, Y) when there are no
diagnosed quirky effects. For our purposes here, it will be sufficient to say that “fully implicating”
individuals are those who have at least some sentences diagnosed as *B(S, X, Y), and for whom the
implication in (250) holds; that is, there is at least some relevant S for which quirky effects are not
diagnosed by the relevant DR and Coref tests (whether those are “expanded” to ignore c-command
and precedence or not), and for which BVA(S, X, Y) is thus not accepted. “Degeneratively
implicating” is thus reserved for individuals for whom quirky effects are simply always diagnosed on
the relevant sentences, meaning that the implication is true, but only because it could not be tested:
(251) On a given subset of the data…
a. “Non-implicating”:
Individuals who sometimes accept BVA(S, X, Y) when no quirky effect is diagnosed.
b. “Fully implicating”:
Individuals who never accept BVA(S, X, Y) when no quirky effect is diagnosed,
and who do have S for which no quirky effect is diagnosed.
c. “Degeneratively implicating”:
Individuals who never have any S for which quirky effects are not diagnosed.
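The definitions in (251) amount to a simple three-way decision per individual, which can be sketched as follows. The input format (a list of pairs recording, for each sentence S, whether a quirky effect was diagnosed and whether BVA was accepted) and the function name are assumptions made for illustration.

```python
def classify_individual(judgements):
    """Classify one individual per (251).

    `judgements` is an iterable of (quirky_diagnosed, bva_accepted) pairs,
    one per sentence S in the subset of the data under consideration.
    """
    # BVA judgements on sentences with no diagnosed quirky effect.
    testable = [bva for quirky, bva in judgements if not quirky]
    if not testable:
        # (251)c: quirky effects always diagnosed, so nothing can be tested.
        return "degeneratively implicating"
    if any(testable):
        # (251)a: BVA accepted at least once with no quirky effect diagnosed.
        return "non-implicating"
    # (251)b: testable sentences exist and BVA was never accepted on them.
    return "fully implicating"

print(classify_individual([(True, True), (False, False)]))  # fully implicating
```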
Classifying individuals according to (251), without taking into account c-command or
precedence, we derive the following:
(252) Quirky Absolutism, Take 1
# of Individuals    Fully Impl.    Degenerate Impl.    Non-Impl.
BVA                 2              2                   8
While there are some individuals for whom *DR and *Coref can perfectly predict *BVA,
regardless of c-command or precedence, this is simply untrue for most participants; that is, there
were plentiful cases where a given sentence was accepted with a BVA reading even when its
corresponding sentences were not accepted with DR or Coref readings. Such results directly
contradict predictions of accounts such as Barker (2012)’s, which hold that BVA should only be
available in a subset of the cases where corresponding DR readings are; we have added in
consideration of Coref as well as DR, which should make it even harder for such predictions to get
disconfirmed, and yet, they clearly still are.
We could, however, try “relaxing” the test, and the logic behind it a bit; perhaps c-command
and precedence are relevant for Coref and DR, just not for BVA. Under such a logic, we might
suppose that it is not merely the ability to accept DR(S’, X, Y’)/Coref(S’’, X’, Y) that implies the
ability to accept BVA(S, X, Y), but rather, the ability to accept DR/Coref in a minimally similar
S’/S’’ where X/X’ neither precedes nor c-commands Y’/Y. That is, for a given S in which X
c-commands and/or precedes Y, we can use a minimally similar S, say the one in the same “set” as
S, in which X neither precedes nor c-commands Y, to do the relevant tests for B(S, X, Y)¹⁹³. This
position makes concessions as to the relevance of c-command and precedence in the detection of
quirky effects but denies the ability of c-command and precedence to facilitate BVA on their own,
i.e., absent quirky effects. It is hard to see how such an account might be deduced from any
particular theory of BVA, but the question is moot; while this version of the prediction does slightly
better, it still falls quite short:
(253) Quirky Absolutism, Take 2
# of Individuals    Fully Impl.    Degenerate Impl.    Non-Impl.
BVA                 1              5                   6
Even with this “compromise” technique, the results still leave half the individuals accepting
BVA(S, X, Y) in the absence of any diagnosed B(S, X, Y). We could of course try various other
permutations of this strategy, but as I have pointed out already, what we have done so far has
already been based on fairly odd concessions to try to make results come out as predicted,
namely that c-command and precedence must be controlled in DR and Coref to detect quirky
effects, and that said quirky effects carry over universally to analogous BVA but suddenly lose any
reference to c-command or precedence. At least in my view, a much simpler approach at this point
is to simply accept the claims of the ABC-BVA law and assume that c-command and precedence are
independent factors, which, in addition to quirky effects, can facilitate BVA readings, as well as DR
and Coref ones.
[193] For example, to perform the DR test for the sentence ‘every teacher spoke to their student’, we would use not ‘every
teacher spoke to two students’, but rather ‘two students were spoken to by every teacher’.
If we accept the above and focus only on those sentences where we have *A(S,
X, Y) and *C(S, X, Y), we find the following:
(254) BVA acceptance in sentences where X neither precedes nor c-commands Y
# of Individuals             Never Accepts    Sometimes Accepts
BVA (No DR/Coref)            2                10
(255) Status of (250) with regards to sentences where X neither precedes nor c-commands Y
# of Individuals             Fully Impl.    Degenerate Impl.    Non-Impl.
BVA (DR/Coref considered)    7              5                   0
In the first of these two tables, no reference to DR/Coref is made, and participants are
classified simply by their overall responses to the items wherein X did not c-command or precede Y.
Most of these individuals accepted BVA(S, X, Y) at least sometimes in those cases, though two
individuals did indeed never accept BVA in those conditions.[194] As the second table shows, once we
filter for B(S, X, Y) using the MR’s DR and Coref, removing from consideration any S that was
diagnosed as having the potential for B(S, X, Y), then no individual accepts BVA(S, X, Y) in such
sentences. Further, most do accept BVA(S, X, Y) without precedence and c-command when B(S,
X, Y) is diagnosed (the “fully implicating” 7), suggesting that the presence/absence of B(S, X, Y) is
indeed constraining the availability of BVA(S, X, Y) for them.
[194] These two individuals did accept BVA(S, X, Y) at other times, however, so we know they were not simply rejecting all
BVA readings out of hand.
Notably, however, 5 of the 12 individuals in (255) are “degenerately implicating”, meaning
essentially that the DR and Coref tests always diagnosed the potential for B(S, X, Y). As such, *B(S,
X, Y)→*BVA(S, X, Y) was not meaningfully testable for those individuals, because there were never
any *B cases for them to judge to begin with. This is not problematic per se, but it is unfortunate in
that we would like to replicate the predicted patterns in as many individuals as possible, as doing so
provides evidence for the universality of our claims. It should be remembered, however, that the use
of DR and Coref as diagnostic tools is primarily an empirical matter. We have discussed some
principled speculation in Sections 2.7 and 2.8, basically that quirky effects tend to attach to particular
choices of X and Y, and thus DR testing X (in combination with the particular S), and Coref testing
Y (also in combination with the particular S) seem to be sufficient to “catch” all these effects. As we
have seen in Section 3.2, this approach has a solid track record; the combined DR and Coref tests
have apparently accurately diagnosed the presence of B(S, X, Y), if not perfectly, then with an error-
rate so low it has not been detectable in past experiments.
As I mentioned previously, however, it seems as if, in this dataset, both MR’s are in fact not
needed to achieve results consistent with predictions. This brings us to issue (II) that I raised in the
beginning of this section. In particular, using only the DR test, we successfully detect all cases where
BVA(S, X, Y) is accepted in S where X does not precede or c-command Y. Indeed, doing so results
in much wider replication of the predictions, as now all individuals have at least some relevant
judgements for which *B(S, X, Y) was indeed diagnosed. The same cannot be said about Coref,
however; see the table below:
(256) Status of (250) using just one MR to detect B(S, X, Y) (assuming *A/*C(S, X, Y))
# of Individuals                Fully Impl.    Degenerate Impl.    Non-Impl.
BVA, considering just DR        12             0                   0
BVA, considering just Coref     7              1                   4
Coref(S’’, X’, Y) on its own fails to be an effective enough tool for the detection of B(S, X,
Y) for these individuals; while it works for 8/12 of them, 4, a third, still allow for BVA(S, X, Y) in
the absence of diagnosed B(S, X, Y) when relying on the Coref test only. DR, on the other hand, not
only apparently accurately serves as a diagnostic for B(S, X, Y), leaving no “non-implicating”
individuals, but also improves the number of individuals for whom a purely quirky-based contrast in
judgements can be seen to 100%; all individuals have at least one situation in which okB(S, X, Y) is
diagnosed and BVA(S, X, Y) is possible, and in all situations where *B(S, X, Y) is diagnosed,
BVA(S, X, Y) is impossible.
I readily admit that using just one MR to detect quirky effects purely on the basis that it
works sufficiently well for the data in question is an entirely post-hoc measure. Further, given that
previous studies (see Section 1.3) have not found DR on its own to be a sufficient diagnostic for
B-factors, this result may simply be a fluke of having mostly participants whose sources of B(S, X, Y)
generally center on the choice of X, which is what DR(S’, X, Y’) directly tests for.[195] Nevertheless, it
is an interesting and undeniably present trend in the data, with a straightforward and plausible
explanation (namely that the DR test alone was sufficient to detect quirky effects), so I think it is
worth reporting the results of such an analysis here, both to shed light on the data gathered and to
inform future investigations into the role of DR and Coref as detectors of quirky effects.
[195] It is fair to say, though, that both in the previous results examined and in the results of other unpublished experiments, DR
has been consistently more successful than Coref for detecting quirky-based BVA in English, at least if we go by sheer
quantity of cases detected. One possibility may be that a certain thing or things that the DR test tests for are more common
(at least in English) than what is tested for by the Coref test. Alternatively, perhaps we have misunderstood either the role
of the Coref test in past experiments, or the role of the DR test in these ones. Of course, these results may simply have
been a “fluke”, though that is not incompatible with these other possibilities.
We will see in later chapters that the DR-only approach does work for the Korean data as
well, though it is considerably less impressive there given that quirky effects are much rarer in that
dataset. It does not, however, work for the Mandarin Chinese data (in fact the opposite is true:
Coref, rather than DR, is sufficient by itself); further, as has been mentioned in Sub-Section 4.1.3
and will be discussed further in Sub-Section 4.5.2, if we include data that was excluded from the
main analysis for various principled reasons, then it looks as if the Coref test was useful for detecting
apparent quirky effects, however we choose to interpret such results. As such, the “DR alone”
analyses presented here should be taken with a considerable grain of salt in terms of what they
represent for the broader population. We may indeed have stumbled on something quite
consequential, but it will take significant future work to verify whether or not this is indeed the case.
With this issue in mind, we can finally address issue (III) that I raised at the beginning of this
section. Namely, how are we to observe the impact of the presence or absence of quirky effects on
judgements of a given individual? Note that this is different from merely establishing that *B(S, X,
Y) is a necessary condition for ensuring *BVA(S, X, Y); we want to demonstrate something further,
specifically that minimally modulating the presence or absence of a quirky effect can
correspondingly modulate the availability of BVA within the judgements of a given individual
(assuming of course that precedence and c-command have been controlled for).[196]
Ideally then, we want to compare across X-Y pairs, on sentences that are otherwise identical
except for the choice of X and Y. In particular, we want to find one X-Y pair, call them X1-Y1, such
that BVA(S, X1, Y1) is impossible for the individual in question, yet for the other pair, X2-Y2,
BVA(S’, X2, Y2) is possible for that individual, where S’ differs from S only by substitution of X2
for X1 and Y2 for Y1.[197] To ensure we are looking at the influence of quirky effects, S/S’ should be
such that X1/X2 neither precedes nor c-commands Y1/Y2.
[196] Note that we have, in fact, already shown this across individuals in the previous section, given that those who had
quirky effects with a given S-X-Y sometimes accepted BVA(S, X, Y) when other individuals who had no such quirky
effects on the same triplet did not.
Technically, if we find such a pattern of judgements, we are not quite done. Such a contrast
suggests that it is at least possible that the X1-Y1 pair in S lacked some property which the X2-Y2
pair had that enabled BVA(S’, X2, Y2). Given that both A(S’, X2, Y2) and C(S’, X2, Y2) have been
eliminated as factors, it is further possible that the factor in question is B(S’, X2, Y2). This, however,
cannot be assumed from the ABC-BVA law, as the ABC-BVA law does not guarantee that all
instances of *BVA(S, X, Y) are accompanied by *B(S, X, Y); the implication is only one way, from
*B(S, X, Y) to *BVA(S, X, Y), not vice versa. We can, however, address this issue by checking the
DR and Coref tests.
We will come to this “full” test later in this section. For now, let us focus on the
“candidates” for such an effect, without worrying about the caveat mentioned in the previous
paragraph. One caveat we will have to observe, however, is that, as discussed in Sub-Section 4.1.3, only
one X-Y pair is considered reliable for the possessor-binding cases, so such an analysis cannot be
done on that construction, given that it requires comparison between X-Y pairs of the same
sentence type. However, for the other two relevant cases, weak crossover versions of passives and
actives respectively, the full range of three choices of X-Y is available.
Given that there are three X-Y pairs, there are also three pairs of pairs; that is, we can
compare ‘every professor’-‘their’ to ‘every professor but one’-‘their own’ or to ‘more than one
professor’-‘that professor’, or compare ‘every professor but one’ to ‘more than one professor’. For a
given sentence type, there are three relevant types of results of such a comparison: (i) BVA
judgements were the same in both (rejected or accepted), (ii) BVA was rejected in the first but not
the second, (iii) BVA was rejected in the second but not the first. The results of this classification are
presented shortly below.
[197] If we are considering S to represent a schema with variable positions for X and Y, rather than a particular sentence,
then we have the same S in both cases.
First, however, I will be a little bit careful regarding “significance”, as we want to ensure that
rejection of BVA with a given X-Y pair is not simply due to an inability to accept BVA with those
particular choices of X and Y in general. For example, some individuals seem to simply reject BVA
with a demonstrative phrase either frequently or always, and as such, their rejection of such a
reading may not tell us much about quirky effects or other such things. We can address this for now
by simply noting whether that individual accepted BVA(S, X, Y) with another item in the same
theta-role-matched set as the one where they rejected BVA(S, X, Y) in the case of X not preceding
or c-commanding Y. To distinguish these “less significant” cases, where there was no such
acceptance, I will indicate them in parentheses in the tables below; those outside of the parentheses
were “fully significant” in this respect.
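The comparison procedure just described, together with the “significance” check, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used to compile the tables: the function name, the dictionary layout, and the example values are all hypothetical, and a contrast is treated as “fully significant” only if the individual accepted BVA with the rejected X-Y pair somewhere else in the same theta-role-matched set.

```python
from itertools import combinations


def compare_xy_pairs(accepted, accepted_elsewhere_in_set):
    """Compare one individual's BVA judgements across X-Y pairs.

    accepted: maps an X-Y pair label to whether BVA was accepted in the
        sentence where X neither precedes nor c-commands Y.
    accepted_elsewhere_in_set: maps the same labels to whether BVA was
        accepted with that pair in another item of the same set.
    Returns a dict from (first, second) to (outcome, fully_significant).
    """
    results = {}
    for first, second in combinations(accepted, 2):
        if accepted[first] == accepted[second]:
            # Result type (i): same judgement on both pairs.
            results[(first, second)] = ("no asymmetry", None)
        elif accepted[first]:
            # Result type (iii): rejected in the second but not the first.
            results[(first, second)] = (
                "BVA in 1, not 2", accepted_elsewhere_in_set[second])
        else:
            # Result type (ii): rejected in the first but not the second.
            results[(first, second)] = (
                "BVA in 2, not 1", accepted_elsewhere_in_set[first])
    return results


# Invented judgements for a single hypothetical individual.
example = compare_xy_pairs(
    {"EP-T": False, "EPBO-TO": True, "MTOP-TP": False},
    {"EP-T": True, "EPBO-TO": True, "MTOP-TP": False},
)
```

In this invented example, the EPBO-TO versus MTOP-TP contrast would be reported in parentheses, since the individual never accepted BVA with MTOP-TP elsewhere in the set.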
Consider first the results with the weak crossover passives:
(257) Weak crossover passives, comparison on different X-Y pairs
# of Individuals      BVA in 1, not 2    BVA in 2, not 1    No asymmetry
EP-T : EPBO-TO        1(0)               5(1)               5
EP-T : MTOP-TP        4(3)               0(1)               4
EPBO-TO : MTOP-TP     7(2)               0(0)               3
The pattern that emerges for the weak-crossover passives is entirely as expected given
various previous points at which we noted that ‘every professor but one’-‘their own’ was likely to be
the easiest to accept with a BVA reading in cases where X did not precede or c-command Y. Readers may
also have noted that, of the two remaining pairs, ‘more than one professor’-‘that professor’ was the
more likely to be rejected with a BVA reading in such cases; this pattern is also clear here. Indeed,
the most common contrast in terms of X and Y was an individual accepting the BVA reading of
(258) while rejecting the BVA reading of (259).
(258) Their own student was spoken to by every professor but one.
(259) That professor’s student was spoken to by more than one professor.
‘Every professor’-‘their’ also had a number of contrasting cases, with BVA frequently being
rejected in (260) by individuals who accepted it in (258), but also BVA in (260) being frequently
accepted by those who rejected it in (259).
(260) Their student was spoken to by every professor.
I have set off the relevant cases in (257) in bold and italics (bold being cases where that
MTOP-TP pair is involved, italics being cases where the EPBO-TO pair is involved). If we consider
weak crossover actives, we can see that a quite similar pattern emerges, albeit somewhat weaker
numerically:
(261) Weak crossover actives, comparison on different X-Y pairs
# of Individuals      BVA in 1, not 2    BVA in 2, not 1    No asymmetry
EP-T : EPBO-TO        0(0)               3(0)               9
EP-T : MTOP-TP        2(3)               0(0)               7
EPBO-TO : MTOP-TP     5(3)               0(0)               4
Once again, we can essentially put sentences in a hierarchy, with the sentences above being
more likely to be accepted with a BVA reading by an individual who rejects a BVA reading in the
lower sentences:
(262) Their own student spoke to every professor but one.
←
Their student spoke to every professor.
←
That professor’s student spoke to more than one professor.
As noted previously, the numbers here are not drawn from a large enough dataset that we can
expect them to generalize to the broader population; however, this ordering is roughly consistent
with what one might expect given the exposition in Section 4.1, and so is not particularly
surprising. We still need to check, however, whether these contrasts are actually reflexes of the
presence or absence of diagnosed quirky effects. At this point, all we can say is that they
are plausible candidates for such effects; we have ensured that these contrasts are not due to
compatibility issues between the particular X-Y pairs involved and BVA via comparison to theta-
role-matched minimal contrasts, but we will need to go to DR and Coref to diagnose the presence
or absence of quirky effects.
In order to achieve the sort of analysis described above, we can take a strategy very similar to
that which was employed to generate the diagram in (247). The categorization of the dots must be
slightly altered, to reflect our focus on B(S, X, Y) rather than A(S, X, Y), but this is largely just a
matter of incorporating the significance conditions we discussed immediately above:
(263) For a given *A(S, X, Y) and *C(S, X, Y) sentence:
a. BVA(S, X, Y)
The individual in question accepted this sentence with a BVA(S, X, Y) reading.
b. *B(S, X, Y)→*BVA(S, X, Y), okB(S, X, Y) and BVA(S, X, Y)
No quirky effects were diagnosed for this sentence for the individual in question, and
that individual rejected this sentence with a BVA(S, X, Y) reading; additionally, as
per (257) and (261), the individual in question did accept other sentences with BVA
readings, including one that minimally differed in the identity of X and Y, and also
ones from the same “set”.
c. Just *B(S, X, Y)→*BVA(S, X, Y)
Like b., but insufficient minimal pair sentences were accepted; either the individual
never accepted a BVA(S, X, Y) sentence that differed just in terms of X-Y pair, or
the individual did not accept BVA in any sentences from the same set as the
sentence in question.
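The categorization in (263) amounts to a small decision procedure over each judgement. The sketch below is illustrative only: it assumes that (263b) and (263c) apply to judgements where BVA was rejected, and that categories a, b, and c correspond to the red, green, and yellow dots respectively (a mapping inferred from the surrounding discussion rather than stated in (263) itself). The function and parameter names are hypothetical.

```python
def categorise_judgement(bva_accepted: bool,
                         accepted_minimal_xy_contrast: bool,
                         accepted_in_same_set: bool) -> str:
    """Assign a colour to one judgement on a *A(S, X, Y), *C(S, X, Y) sentence.

    bva_accepted: BVA(S, X, Y) was accepted for this sentence.
    accepted_minimal_xy_contrast: the individual accepted BVA in a sentence
        differing only in the choice of X-Y pair.
    accepted_in_same_set: the individual accepted BVA in some sentence from
        the same set as this one.
    """
    if bva_accepted:
        return "red"      # (263a), assumed colour
    if accepted_minimal_xy_contrast and accepted_in_same_set:
        return "green"    # (263b), assumed colour: "strongly" significant
    return "yellow"       # (263c), assumed colour: "weakly" significant
```

Whether a given dot then falls inside or outside the DR and Coref circles is determined separately, by the results of those tests for the same sentence and individual.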
To add the additional element of ensuring that a given instance of BVA rejection is due to
*B(S, X, Y), we can plot the above-categorized judgments, as in (247), with respect to whether the
DR and Coref tests diagnose B(S, X, Y) for that particular sentence (for the particular individual in
question). Unlike for the analysis in the previous section, however, there is no need for a separate
categorization as to whether or not X c-commands Y (or X precedes Y, for that matter), as we are
focusing solely on the cases where X neither precedes nor c-commands Y. As noted, this will
reduce us to the weak crossover passives and actives, as the possessor binding cases do not have
contrasting X-Y pairs for us to use.
Applying all the above, we obtain the following:
(264) a. Venn Diagram
b. Summary Table
# of judgements    Green    Yellow    Red    Total
DR, Coref          7        5         0      12
DR                 8        6         0      14
Coref              3        10        4      17
None               6        6         17     29
Total              24       27        21     72
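As a quick arithmetic check on the summary table in (264b), the row and column totals can be recomputed from the individual cells. The snippet below simply re-enters the counts reported in the table; the variable names are illustrative.

```python
# Counts from (264b), as (green, yellow, red) per region of the diagram.
cells = {
    "DR, Coref": (7, 5, 0),
    "DR": (8, 6, 0),
    "Coref": (3, 10, 4),
    "None": (6, 6, 17),
}

# Total number of judgements in each region (row totals).
row_totals = {region: sum(counts) for region, counts in cells.items()}

# Total number of green, yellow, and red judgements (column totals).
column_totals = tuple(sum(column) for column in zip(*cells.values()))

grand_total = sum(column_totals)

# Red judgements in the two regions where the DR test diagnosed *B(S, X, Y).
red_with_dr = cells["DR, Coref"][2] + cells["DR"][2]
```

The recomputed totals agree with those given in the table, and the count of red judgements in the two DR regions is zero, which is the pattern discussed in the text that follows.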
As was the case in the previous section, when we consider judgements on sentences where X
does not precede or c-command Y and that are diagnosed as *B(S, X, Y) by the DR and Coref (or
just DR) tests, no judgements are such that BVA(S, X, Y) was accepted. This, by itself, is not new
information; indeed, the location of the relevant “dots”, and whether they are “red” or not, are
definitionally identical to what can be seen in (247), because many of the same classification criteria are
used: whether the DR and Coref tests are passed, and whether BVA(S, X, Y) is accepted. What
differs here, however, is whether judgements show up as green or yellow, that is, which judgements
count as “strongly” vs. “weakly” significant for the factor under consideration, in this case, B(S, X,
Y). This difference comes about because “significance” in terms of precedence makes reference to
comparison with sentences that contrast in terms of precedence, whereas “significance” here makes
reference to sentences that contrast in terms of quirky effects. As such, while we already knew that
(264) would not show us any direct violations of the ABC-BVA law, it shows us new information
regarding the role of B(S, X, Y) in both constraining and permitting BVA readings. This is because
(i) every dot which is in the center represents a case where quirky effects were not detected, and
correspondingly, BVA was unacceptable, and (ii) every such dot that is green in color represents a
case where an almost identical sentence, differing only in terms of using a choice of X-Y pair that
was diagnosed with a quirky effect,[198] was accepted. Such cases represent clear instances of the
presence or absence of a quirky effect determining the acceptability of BVA, precisely as is predicted
to happen by the ABC-BVA law for sentences where X does not precede or c-command Y.
[198] One might note that no steps were taken to ensure that this contrasting X-Y pair was diagnosed with a quirky effect, but
this is because we already knew that it was. If it were not, then there would be a red dot in the center of the diagram
(because they would have passed the DR and Coref tests but still have accepted BVA), but there are clearly no such dots.
As such, there is no need to check that the minimal contrasting sentences were diagnosed with quirky effects, as such
checks have in fact already been performed.
As before, to get a better sense of the most crucial parts of the data, we can “zoom in” on
judgements in the central intersection:
(265)
Here I use the same numbering system as is given in (248): participant number followed by
X-Y pair followed by construction. As noted above, if we consider both greens and yellows
together, we are simply repeating what was done in the previous section (though we have lost the
possessor binding cases). If we use only DR to diagnose *B(S, X, Y), then all individuals, all
X-Y pairs, and all constructions are represented. If we use both the DR and the Coref test, while
quantities drop, we still have all constructions and X-Y pairs represented, though we no longer have
relevant datapoints from a quarter of the participants (lacking data from individuals 2, 3, and 7).
If we restrict to only “green” judgements, however, things are no longer guaranteed to
match so closely with what obtained in the previous section, because, as discussed above, what
determines green vs. yellow here is different because of the minimal contrasts considered. If we use
only DR as the test for quirky effects, then we no longer have judgements from individuals 3, 4, 9,
and 12, but the other eight are still represented. All X-Y pairs and constructions are represented (most
more than once), as are all their combinations except the particular sentence ‘their own student
spoke to every professor but one’; this is consistent with what was noted earlier, namely that
sentences with the ‘every professor but one’-‘their own’ pair are the ones most likely to be diagnosed
with quirky effects in this dataset, and as such, they are the least likely to make it into the central
intersection. Indeed, these sentences are in many cases serving as the minimal contrasts for the ones
that do make it into the center, so though they are graphically absent, they are still playing an
important role in these results.
If we require both Coref and DR tests to be passed to diagnose *B(S, X, Y), judgements
from individuals 2, 6, and 10 also are not represented, nor is the particular X-Y-construction
combination ‘that professor’s student was spoken to by more than one professor’, but every other
such combination of X-Y pair and construction is represented. As such, even with the most
restrictive considerations, the effects on the acceptability of BVA(S, X, Y) of changing X-Y pairs
from one which is diagnosed as *B(S, X, Y) to one that is diagnosed as potentially okB(S, X, Y) are
demonstrated across multiple constructions, X-Y pairs, and individuals. One may note that the
numbers of green dots involved are somewhat lower than what we saw in the diagrams for A(S, X, Y) in
the previous subsection (and what we will see in the next section for C(S, X, Y)), but this is not
surprising given that fewer sentences are considered for this particular test due to the exclusion of
the possessor binding sentences; there are also challenges in getting relevant minimal contrasts, as
we cannot ensure that there is always a minimally different sentence with quirky effects for each
participant to judge (since what items quirky effects occur with are specific to the individual making
the judgement), a challenge we do not have with precedence and c-command-based contrasts. The
exact numbers, however, are not particularly crucial; the effects of B(S, X, Y) here are clearly
demonstrated for various individuals/environments. As far as I am aware, this is a first-of-its-kind
analysis; while previous works such as those reviewed in Section 3.2 have certainly dealt with the
subject of different X-Y pairs, data has not been arrayed in such a way so as to reveal the role of
switching choices of X-Y pairs for each individual. As such, this section provides a novel form of
evidence for the active role B(S, X, Y) plays in enabling BVA(S, X, Y) readings, at least when the
relevant precedence or c-command relations are absent.
4.4 C(S, X, Y)
In this section, we finally consider the role of C(S, X, Y) in constraining BVA(S, X, Y); that
is, the relationship between X c-commanding Y (or not) and BVA(S, X, Y) being possible (or not).
At this point, we already have a fairly good idea about how c-command influences BVA judgements
in this dataset because of our discussions in the previous sections regarding what happens when we
do or do not control for c-command. Despite some redundancy, however, repeating the sorts of
analysis on c-command we did when focusing on precedence and quirky effects still provides useful
insight into the rough nature of the relevant patterns in the data, and will yield more direct evidence
for the active role c-command plays as well. To achieve this, let us once again establish a convention
for classifying an individual’s patterns of judgements on a given subset of the data. In this case, we
are interested in the implication given in (266):
(266) (Under the correct conditions)
*C(S, X, Y)→*BVA(S, X, Y)
To classify individuals’ results on various parts of the data with regard to this implication, we
can simply modify (242) to reverse the roles of c-command and precedence, yielding:
(267) On a given subset of the data…
a. “Non-implicating”:
Individuals who sometimes accept BVA(S, X, Y) when X does not c-command Y
b. “Fully implicating”:
Individuals who never accept BVA(S, X, Y) when X does not c-command Y,
but who do sometimes accept BVA(S, X, Y) when X does c-command Y.
c. “Degeneratively implicating”:
Individuals who never accept BVA(S, X, Y)
As with A(S, X, Y) and B(S, X, Y), it can be quickly demonstrated that C(S, X, Y) alone is
not enough to constrain the distribution of BVA(S, X, Y) judgements. If we construct the c-
command-based analogue of (244) for example, simply categorizing all the BVA judgements based
on whether the sentence judged involved X c-commanding Y, it is quite apparent that c-command
alone is insufficient for predicting BVA:
(268) Categorization of individuals with regards to (267)
FI/DI/NI    ET-T     EPBO-TO    MTOP-TP    All X-Y’s
Set 1       5/1/6    4/0/8      6/3/3      3/0/9
Set 2       5/0/7    3/0/9      7/4/1      3/0/9
Set 3       N/A      N/A        3/1/8      N/A
All Sets    4/0/8    3/0/9      2/1/9      1/0/11
The overall results are, numerically, slightly more “well behaved” than they were for (244), in
the sense of there being slightly fewer “non-implicating” individuals; there is even one individual
who, regardless of sentence, choice of X-Y, etc. always required X to c-command Y for BVA(S, X,
Y) to be acceptable. Nevertheless, in the vast majority of cases, individuals at least sometimes
(usually often) accept BVA(S, X, Y) in situations where X does not c-command Y.
Adding in control of precedence helps reduce the numbers of non-implicators, but, as we
should expect by now, it does not eliminate them. To do so, we focus only on those sentences
where Y precedes X, e.g., in Set 1, the “canonical” weak crossover passive and the topicalized non-
weak crossover active. This yields the following:
(269) Categorization of individuals with regards to (267) (precedence controlled)
FI/DI/NI    ET-T     EPBO-TO    MTOP-TP    All X-Y’s
Set 1       6/2/4    4/0/8      6/6/0      3/0/9
Set 2       8/1/3    6/0/6      4/8/0      6/0/6
Set 3       N/A      N/A        4/5/3      N/A
All Sets    7/1/4    4/0/8      4/5/3      2/0/10
The situation improves a bit, though only somewhat. For example, from the Set 2 row, we
can see that only half of the individuals in question accept BVA in weak crossover actives, down from
¾ when we were not controlling for precedence. One more individual has joined the group of those
who are completely “fully-implicating” across the whole dataset, bringing such individuals to a grand
total of two. The improvements are thus marginal; as we already know from the previous sections,
B(S, X, Y) must be considered as well before a categorical pattern can be found.
Before moving on to that full analysis, however, there is one interesting point to note about
one of the X-Y pairs, specifically the ‘more than one professor’-‘that professor’ pair. As also noted
in Section 4.2, this pair has quite a high number of ‘degenerately implicating’ individuals, meaning
that, while these individuals did not accept BVA(S, X, Y) when X did not c-command Y in S, they
also did not accept it when X did c-command Y in S. When we did not control for precedence in
(268), there was only one individual who never accepted BVA(S, more than one professor, that
professor) under any circumstances. Once we control for precedence, however, we see that there are
5 (out of 12) individuals who never accepted BVA(S, more than one professor, that professor) when
‘that professor’ came first, regardless of c-command relations.
Given that we have only 12 individuals represented, we should take such patterns with a
large grain of salt. It is true, however, that the difficulty of finding individuals who will accept such
sentences without ‘more than one professor’ preceding ‘that professor’ is in fact consistent with
claims made in previous literature. In particular, a series of works, Hoji 1995, Ueyama 1998, and
Hoji et al. 2000, develop a theory of what Hoji (1995) calls “dem-binding” (given its connection to
such demonstrative phrases), which morphs into Ueyama’s precedence-based source for BVA and
Coref, co-I-indexation, effectively equivalent to A(S, X, Y) in this dissertation. One claim made in
these works, most directly in Hoji et al. 2000, is that phrases like the English ‘that N’ are unable to
serve as Y of FD(S, X, Y), which is what enables c-command-based BVA. Translating to the terms
used in this dissertation, the expectation would be that BVA(S, X, Y) with such a demonstrative
phrase as Y could only be established via A(S, X, Y), not C(S, X, Y). They do not address directly
how quirky effects would be treated in such a case, so we do not know if B(S, X, Y) could also enable
such BVA under such accounts.
We can see from (269) (and we will be seeing this further later in this section and in
subsequent chapters), that this claim is not quite accurate. We can see, for example, that four out of
twelve individuals consistently had fully c-command-based patterns for ‘that professor’, to say
nothing of the three who accepted it at least once without either c-command or precedence. As
such, the data do not support the claim that BVA involving demonstratives requires precedence, and
further, we are already finding signs that such BVA can in fact be based on c-command; we will see
this more clearly at the end of this section. There may well be something “substantive” behind the
“precedence-only” claim, given that many individuals do seem to require precedence, a pattern we
will see repeated in other languages; if there is, however, better control of the relevant factors is
required in order to make a clear pattern emerge. Until this can be done, the precedence-favoring
nature of “dem-binding” seems better analyzed as a sort of preference or tendency, rather than a
fundamental rule.
Regardless of demonstrative-specific issues, we have seen from (268) and (269) the expected
necessity of controlling for both B(S, X, Y) and A(S, X, Y). To control for B, we again start by
classifying the judgements of individuals on sentences where X does not precede Y:
(270) For a given *C(S, X, Y) sentence:
a. *C(S, X, Y) and BVA(S, X, Y)
The individual in question accepted this sentence with a BVA(S, X, Y) reading
b. *C(S, X, Y)→*BVA(S, X, Y), okC(S, X, Y) and BVA(S, X, Y)¹⁹⁹
The individual in question rejected this sentence with a BVA(S, X, Y) reading,
but accepted BVA(S, X, Y) in sentences where X c-commanded Y but did not
precede it and also accepted BVA in sentences from the same set as the sentence
in question.
c. Just *C(S, X, Y)→*BVA(S, X, Y)
Like b., but insufficient minimal pair sentences were accepted; either the individual
never accepted BVA(S, X, Y) in a sentence where X c-commanded Y, or the
individual did not accept sentences from the same set as the sentence in question.
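The three-way classification in (270) can be pictured as a small decision procedure. What follows is only an illustrative sketch, not the procedure actually used for the dissertation: the function name and its boolean parameters are hypothetical, and the color labels (red for a., green for b., yellow for c.) are inferred from the discussion of the diagrams rather than stated in (270) itself.

```python
def classify_star_c_judgement(accepted_bva, accepted_c_command_only, accepted_same_set):
    """Classify one judgement on a *C(S, X, Y) sentence, following (270).

    accepted_bva: the individual accepted BVA(S, X, Y) in this sentence.
    accepted_c_command_only: the individual accepted BVA(S, X, Y) in some
        sentence where X c-commanded Y but did not precede it.
    accepted_same_set: the individual accepted BVA in some sentence from
        the same set as this sentence.
    All names and color labels are assumptions made for illustration.
    """
    if accepted_bva:
        return "red"      # (270)a: BVA accepted despite *C
    if accepted_c_command_only and accepted_same_set:
        return "green"    # (270)b: rejection backed by the minimal contrasts
    return "yellow"       # (270)c: rejection without sufficient contrasts


print(classify_star_c_judgement(False, True, True))
```

Under these assumptions, a rejection counts as "green" only when both kinds of contrasting acceptance are on record for the same individual; a rejection with either one missing falls into the weaker "yellow" category.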
As before, this categorization will graphically translate to the color/shape of the “dots”, each
representing a given judgement. These dots are then further categorized into whether they are inside
or outside of three circles; two of those are identical to the ones used in the preceding sections,
namely the results of the corresponding DR and Coref tests, while the final one is whether or not
the sentence was *A(S, X, Y). We predict that, so long as there is *A(S, X, Y) and the DR (and
Coref) test(s) diagnose *B(S, X, Y), BVA(S, X, Y) should be impossible for any *C(S, X, Y)
sentence. In fact, based on the analyses in the previous sections, we already know this will turn out
to be true. As such, what we mainly hope to demonstrate here is the (well-represented) existence of
199 Recalling, as the text makes clear, that “okC(S, X, Y)” in this label is really an abbreviation for “okC(S, X, Y) and *A(S,
X, Y)”; that is, it refers to sentences where X c-commands but does not precede Y.
“*C→*BVA, C & A” individuals, who meet the stringent standards to give the strongest evidence
that c-command specifically is playing an active role in constraining/permitting BVA readings. As it
turns out, we do indeed find such evidence:
(271) a. Venn Diagram
b. Summary Table
# of judgements    Green  Yellow  Red  Total
*A, DR, Coref          7       6    0     13
*A, DR                 9       8    0     17
*A, Coref             10       4    6     20
DR, Coref              3       5    5     13
*A                    12       4   18     34
DR                     6       7    4     17
Coref                  6       3   11     20
None                   9       3   22     34
Total                 62      40   66    168
Depending on whether we make use of the Coref test or rely purely on DR, the number of
“green” judgements in the center is either 7 or 16, and, as predicted and expected, the number of
“red” judgements is zero. That is, no individual accepts BVA(S, X, Y) when there is *C(S, X, Y),
assuming that *B(S, X, Y) and *A(S, X, Y) are ensured, and further, there are many instances of such
individuals accepting BVA(S’, X, Y) in S’ where X does c-command Y, suggesting that c-command
plays an active role in determining whether BVA is acceptable.
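The counts cited in this paragraph can be recomputed from the summary table in (271)b. The sketch below assumes that each row of the table is a mutually exclusive region of the Venn diagram (an assumption, though one consistent with the column totals); the dictionary layout and the helper name `center_counts` are hypothetical and introduced only for illustration.

```python
# Rows of the summary table in (271)b: region -> (green, yellow, red).
# Each key lists the circles that the region lies inside.
TABLE_271 = {
    ("*A", "DR", "Coref"): (7, 6, 0),
    ("*A", "DR"): (9, 8, 0),
    ("*A", "Coref"): (10, 4, 6),
    ("DR", "Coref"): (3, 5, 5),
    ("*A",): (12, 4, 18),
    ("DR",): (6, 7, 4),
    ("Coref",): (6, 3, 11),
    (): (9, 3, 22),
}


def center_counts(table, required):
    """Sum (green, yellow, red) over all regions inside every circle in `required`."""
    totals = [0, 0, 0]
    for region, counts in table.items():
        if set(required) <= set(region):
            for i, n in enumerate(counts):
                totals[i] += n
    return tuple(totals)


with_coref = center_counts(TABLE_271, ["*A", "DR", "Coref"])
dr_only = center_counts(TABLE_271, ["*A", "DR"])
print(with_coref)  # (7, 6, 0)
print(dr_only)     # (16, 14, 0)
```

With the Coref test included, the center contains 7 green judgements; relying on DR alone adds the "*A, DR" row, giving 16. In both cases the red count is zero.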
As before, we can “zoom in” and look at judgements in the center to investigate how
representative they are of the different conditions/individuals in the dataset:
(272)
As has been discussed in previous sections (the dots in question being the same as
there), the results when we consider both green and yellow “dots” together need not be covered
in great depth here; whether or not the Coref test is included, all or almost all individuals, X-Y pairs,
constructions, and combinations of X-Y and constructions are represented, most multiple times.
If we look at just the greens, all individuals except for individuals 4 and 11 are represented,
as are all X-Y pairs, constructions, and combinations thereof. If we restrict further by including the
Coref tests for diagnosing B(S, X, Y), we no longer include individuals 2, 3, 7, 9, 10, and 12,
meaning that only four of the twelve individuals remain. This may seem like a stark reduction, but
compared to previous experiments such as those discussed in Section 3.2, where often <10% of
individuals were represented in this particular slice, the rate of inclusion is actually much improved
from what it has been previously. Regardless, even though this is the most restrictive category
considered, there are nevertheless still several individuals represented.
As for X-Y pairs and constructions, under this most restrictive view, each one is represented,
as are their combinations, except for the weak crossover passive with the third X-Y pair, i.e., “that
professor’s student was spoken to by more than one professor” (_-3-1, where the blank is filled by
the participant number), and the possessor binding case. The fact that the possessor binding cases
show up less is understandable given that only one third as many judgements on them are analyzed
here as compared to the other constructions, but the fact that no green cases of it make it to the very
center is a shortcoming of the result. If we consider the DR-only test for B(S, X, Y), there is one
instance of possessor binding in the center, with more being available if we consider yellow cases,
regardless of the method of B-detection. As such, we have indeed provided a degree of evidence for
the role of c-command (or the lack thereof) in modulating the availability of BVA in the possessor
binding construction but have failed to provide the strongest “type” of evidence for it. Taken
together with the findings of Plesniak 2022b and the experiments to be discussed in later chapters,
however, a consistent picture nevertheless emerges of the role of c-command in such
constructions, which is supportive of our predictions.
Overall, these results are not only consistent with the predictions of the ABC-BVA law, but
further demonstrate the crucial role c-command plays in determining whether or not BVA is
possible. One may of course note that they do not quite reach the ultimate ideal of fully
demonstrating clear c-command-based patterns across each individual (exactly how close they come
to this ideal depending on how stringent a requirement one imposes as to what counts as
“significant” for such a demonstration, as explained in the above paragraphs). The results
nevertheless represent a marked improvement over previous methods in that regard. Further, not
“fully demonstrating clear c-command-based patterns across each individual” is not the same thing as
not clearly demonstrating the role of c-command in constraining the availability of BVA. In this
regard, the results provide strong evidence, drawn from a diverse set of constructions, lexical items,
and individuals, as to the active role that c-command plays in constraining BVA readings, in keeping
with the ABC-BVA law. As we can see from both (271) and (272), whether X c-commands Y is
clearly a crucial consideration for determining whether or not BVA(S, X, Y) will be rejected, and
further there are numerous cases when X c-commanding Y alone appears to be enough to enable
BVA(S, X, Y) to occur. As such, the evidence is not only entirely as predicted by the ABC-BVA law,
but also strongly supports the role that c-command plays within it.
4.5 Full Results and Discussion
4.5.1 Full Results
In the preceding sections, we have focused on each of A(S, X, Y), B(S, X, Y), and C(S, X, Y)
in turn, evaluating the various ways in which they contribute to the distribution of BVA-acceptability
in the dataset. Now, we can take a more holistic approach, which will allow us to see fully how the
predictions of the ABC-BVA law are realized in the English data, across all the (non-excluded) BVA
judgements recorded during the experiment. In this case, we will consider each judgement
separately, this time without reference to any “minimal pairs” or the like. Rather, we will simply
categorize each judgement as to: (I), whether BVA(S, X, Y) was accepted or not, and (II), whether
we had *A(S, X, Y), *B(S, X, Y), and/or *C(S, X, Y), that is, X not preceding Y, no quirky effects,
and X not c-commanding Y; this is reminiscent of the diagrams showing the overall results for all
the experiments, as previewed in Sections 1.6 and 3.4, only now the results are just for the English
experiments specifically.
We predict that it will be generally possible for a given judgement to turn out such that
either BVA is accepted or not accepted, except specifically in the case that all three of the relevant
factors are absent, i.e., when we have *A, *B, and *C. Under that specific combination of conditions
(and that one only), BVA is predicted to be always rejected. This prediction is borne out precisely,
though there is a small question when it comes to detection of B(S, X, Y); as discussed, it appears
that, for this dataset, we can either use both the DR and the Coref test or just the DR test. The
former is the better-established test, while the latter is at this point essentially post-hoc, but I will
present both, because, as discussed in Section 4.3, the DR-only version is interesting in its own right
and may indeed give us a slightly more “accurate” picture of what is going on in English.
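The prediction just stated can be written as a simple predicate. The sketch below is only an illustration of the logic in this paragraph; the function names and the boolean encoding of A, B, and C (True when the factor is present, False when it is absent, i.e., *A, *B, or *C) are assumptions introduced here.

```python
def bva_predicted_possible(a, b, c):
    """BVA(S, X, Y) is predicted to be impossible only when A, B, and C are all absent."""
    return a or b or c


def consistent_with_law(a, b, c, accepted):
    """A judgement conflicts with the prediction only if BVA is accepted
    under *A, *B, and *C; a rejection is always compatible with it."""
    return bva_predicted_possible(a, b, c) or not accepted


# Acceptance under *A, *B, and *C would be a counterexample:
print(consistent_with_law(False, False, False, accepted=True))   # False
# Rejection when some factor is present is compatible with the prediction:
print(consistent_with_law(True, False, False, accepted=False))   # True
```

The asymmetry is deliberate: the prediction rules out acceptance in exactly one cell and makes no commitment about whether BVA will be accepted or rejected in the other seven.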
(273) Using DR+Coref for Quirky Detection
a. Venn diagram²⁰⁰
b. Summary Table
# of judgements    Green  Red  Total
*A, *B, and *C        13    0     13
*A and *B              7    6     13
*A and *C             47   24     71
*B and *C              8    5     13
*A                    22   49     71
*B                     2   11     13
*C                    34   37     71
None                   8   63     71
Total                141  195    336
200 Just as in the other Venn diagrams, we are being a bit sloppy here about *B(S, X, Y), as our tests technically do not
apply to S where we do not have *A(S, X, Y) and *C(S, X, Y). Mostly just for the purpose of displaying such dots in a
way that plausibly reflects the odds of B(S, X, Y) factors being available to the participant for that choice
of S, X, and Y, I have adopted an approach somewhat like the second “quirky absolutist” one detailed in Section 4.3; that
is, a given S for a given individual is classified as *B(S, X, Y) if the *A(S, X, Y) & *C(S, X, Y) S in its theta-role matched
set is diagnosed as *B(S, X, Y). This convention will be used for all such “combined” graphs.
(274) Using just DR for Quirky Detection
a. Venn diagram
b. Summary Table
# of judgements    Green  Red  Total
*A, *B, and *C        30    0     30
*A and *B             18   12     30
*A and *C             30   24     54
*B and *C             21    9     30
*A                    11   43     54
*B                     5   25     30
*C                    21   33     54
None                   5   49     54
Total                141  195    336
Note that, between the two methods, the overall number of green dots vs. red dots, as well
as the number of dots inside the *A and *C circles, does not change; all that changes is how many
dots get inside the *B circle. Regardless, we see that overall, there are in fact more reds than greens,
that is, more instances in which BVA was accepted than instances in which it was rejected. In every area of the
diagram(s) that is not the central intersection, we find both green and red dots; that is, when
anything less than all three conditions of the ABC-BVA law were met, BVA was sometimes accepted
and sometimes rejected, consistent with our predictions. This pattern completely changes, however,
when all three conditions are met: in the central intersection, there are 13 or 30 datapoints
(depending on B detection), all of which are colored green; BVA was never accepted in such cases.²⁰¹
Alternatively, if we want to view things in terms of the “contrapositive” sense discussed in Section
1.4, all 195 cases of BVA acceptance (red dots) occur outside that central intersection, these being
cases where BVA was accepted when the ABC-BVA law predicted such acceptance to be possible.
Either way we look at it, these results are precisely what is predicted by the ABC-BVA law; in all
cases where A, B, and C are absent, BVA is impossible, and all cases wherein BVA was accepted are
cases when at least one of A, B, and C is present. The data thus directly supports our various
hypotheses.
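The figures cited above can be checked directly against the summary tables in (273)b and (274)b. The sketch below assumes that each table row is a mutually exclusive region of the Venn diagram, with green marking rejection and red marking acceptance of BVA, as in the discussion above; the dictionary layout and the helper names `summarize` and `circle_total` are hypothetical.

```python
# Summary tables (273)b and (274)b: region -> (green, red).
TABLE_273 = {  # DR+Coref used for quirky detection
    ("*A", "*B", "*C"): (13, 0),
    ("*A", "*B"): (7, 6),
    ("*A", "*C"): (47, 24),
    ("*B", "*C"): (8, 5),
    ("*A",): (22, 49),
    ("*B",): (2, 11),
    ("*C",): (34, 37),
    (): (8, 63),
}
TABLE_274 = {  # DR alone used for quirky detection
    ("*A", "*B", "*C"): (30, 0),
    ("*A", "*B"): (18, 12),
    ("*A", "*C"): (30, 24),
    ("*B", "*C"): (21, 9),
    ("*A",): (11, 43),
    ("*B",): (5, 25),
    ("*C",): (21, 33),
    (): (5, 49),
}


def summarize(table):
    """Return (total green, total red, red dots in the central intersection)."""
    greens = sum(g for g, _ in table.values())
    reds = sum(r for _, r in table.values())
    center_reds = table[("*A", "*B", "*C")][1]
    return greens, reds, center_reds


def circle_total(table, circle):
    """Count all dots (green and red) lying inside the given circle."""
    return sum(g + r for region, (g, r) in table.items() if circle in region)


for name, table in (("(273)", TABLE_273), ("(274)", TABLE_274)):
    print(name, summarize(table), [circle_total(table, c) for c in ("*A", "*B", "*C")])
```

Under both methods the totals are 141 green and 195 red, with no red dots in the center; the *A and *C circles each contain 168 dots either way, and only the size of the *B circle differs (52 vs. 120).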
4.5.2 Analysis of Other Data Gathered
For various reasons, the diagrams given in (273) and (274) do not summarize all of the data
gathered for the experiment. Readers may have noticed, for example, that though I spoke a fair
amount in Chapter 3 regarding the non-MR interpretations that participants judged alongside the
MR readings, I have not analyzed them so far in this chapter. Recall that the primary significance of
these non-MR readings is to shed further light on cases where a given sentence was rejected with an
201 I will not repeat the “zoom in” procedure here, as it would yield identical results to what we have already seen; there is
no “green” vs. “yellow” distinction here, so we do not need to give the same sort of account of “significance”. The basic
summary of the dots in the center is that the set of individuals and S-X-Y combinations in this dataset is internally diverse,
not just representing a specific subset of the individuals/X-Y pairs/sentence types. Further details can be gleaned from
the appendix, where it should be fairly easy, given the discussion in this chapter, to identify how each BVA judgement
listed there is (or would be, in the case of excluded datapoints) represented in such diagrams.
MR reading; if the individual who rejects the sentence with an MR reading does accept it with a non-
MR reading, then we can be sure that there was no independent problem with the sentence in
question. If that does not happen, however, we cannot be so sure. As such, it is in principle possible
that we might have to “throw out” some of the BVA rejections listed in the above diagrams due to
lack of non-MR acceptance.
Fortunately, however, such a situation never occurs. The full details can be seen in the
appendix, but I will overview the cases of importance here. It is important to keep in mind that the
non-MR readings were intended to be always acceptable, so in principle there is no particular reason
that any of them should be rejected. This is generally what happened; such readings were almost
always accepted, but there were sporadic cases of rejections as well. There were even certain
repeated tendencies, such as participants insisting that sentences involving ‘their own’ must have a
BVA/Coref interpretation, e.g., reporting that ‘their own’ in (275) can only refer to ‘that new
professor’ and cannot refer to a third party.
(275) That new professor spoke to their own student.
Such cases are not at all problematic for us, as they involve the individual in question
accepting the MR-interpretation; non-MR interpretations are only used in our analysis for precisely
the opposite cases, namely, where the MR-interpretation was rejected. The vast majority of the time,
in such cases, the non-MR interpretations were accepted. There were, however, cases where neither
interpretation was accepted; as we will see though, none of these impacted the significance of our
results.
One subset of these cases, which I have already mentioned briefly, occurred with the Set 3
items (the ones involving possessor binding and paraphrases thereof). Sometimes, participants
reported that neither of the interpretations given was possible, but instead reported a third
possibility, namely that the ‘colleague’ in question was serving as the X of BVA(S, X, Y). This third option
was not one of the displayed possibilities, but as noted in Chapter 3, it is not the particular non-MR
interpretation that matters, only whether some alternative interpretation is possible. It was
precisely for that reason that, if both presented interpretations were rejected by a participant, the
procedure was to ask that participant whether there was such a third interpretation possible for the
given sentence. Given that, for these particular cases, such an interpretation did exist, the rejection
of the intended BVA interpretation can still be regarded as significant in that regard, i.e., we know
that the sentence is not uninterpretable in general, but rather, it is specifically incompatible with the
particular reading presented.
Also concerning Set 3, some individuals rejected sentences of the form given in (237)c, i.e.:
(276) A colleague who spoke to Y’s student, X has.
This is perhaps unsurprising, given how stylistically awkward the sentence can be absent a
very specific context for its use, but is analytically unconcerning, because the sentence is
hypothesized to be such that X c-commands Y. As such, we never in fact make the prediction that it
will be impossible for it to have a BVA(S, X, Y) reading, and thus never rely on the rejection of
BVA(S, X, Y) for such sentences as evidence for anything. As such, whether the BVA(S, X, Y)
reading in such cases was rejected due to independent problems with the sentence is irrelevant.
Indeed, the only time the sentence factors into the final analysis is when it is accepted; in such cases,
it can serve as a contrast to a sentence without X c-commanding Y, which can highlight the
“significance” of the latter sentence getting rejected.
Given the above, the only sentences for which the reason for rejection matters in the
ultimate analysis are those wherein X neither precedes nor c-commands Y (and for which no quirky
effect is detected, for that matter), and such sentences were never found to be unacceptable with any
reading; if the particular MR-reading was rejected, the participant in question always accepted some
other reading for the sentence. As such, we can dismiss the one other sentence for which no
interpretation could be accepted, at least for one particular participant:
(277) By every professor, their student was spoken to.
Here, ‘every professor’ precedes ‘their’, so as with (276), this rejection is not used in any way
as evidence in the argumentation, so it is not relevant what its source is; it was only ever crucial
evidence when it was accepted, not when it was rejected.
The above summary exhaustively describes all the relevant cases of non-MR rejection, and
since none of them were significant with regards to our predictions, we can set such cases aside. The
other sort of relevant data not featured in the Venn diagrams is the “excluded” data discussed in
Sub-Section 4.1.3; as promised, I will give a brief overview of that data here. The first case of such
exclusion was the one participant who was excluded for attentiveness/comprehension issues. Given
that this individual’s data is potentially “faulty” I will not fully analyze it but will provide a summary
of the most important parts. Crucially, of the nine BVA items judged where X neither c-commanded
nor preceded Y (including the two spec-binding cases that would typically be excluded, see below),
the individual’s judgements were entirely consistent with the predictions of the ABC-BVA law. In two
cases, both instances of weak crossover passives, BVA(S, X, Y) was rejected; given this individual’s
other judgements, these rejections would have been considered “significant”, as they had accepted
BVA(S, X, Y) in a number of minimally contrasting pairs.
In the other seven such cases, BVA(S, X, Y) was accepted. In all of these cases, however,
the expected quirky effects were detected. In all but one case, only the DR test was needed,
consistent with the general trend seen in the other English participants. As noted in Sub-Section
4.1.3 though, there was one case for this participant where the Coref test was needed; this was
because the participant had the following pattern of judgements:
(278) a. Their student spoke to every professor. okBVA(S, every professor, their)
b. Two students spoke to every professor. *DR(S, every professor, two students)
c. Their student spoke to that new professor. okCoref(S, that new professor, their)
As such, this individual in fact conforms to the predictions of the ABC-BVA law and other
related hypotheses, but is not following the pattern of other English speakers in needing only DR(S’,
X, Y’) as a test for B(S, X, Y). The inclusion or exclusion of this individual thus does not change the
fundamental results of the experiment, though it would have consequences for the use of the DR vs.
DR+Coref tests for B(S, X, Y); given that this individual seemed to have difficulties with the
experiment, it is hard to draw firm conclusions from this difference from the other participants.
The other area of “excluded” judgements is the data from the possessor-binding cases
involving ‘every professor’-‘their’ and ‘every professor but one’-‘their own’, which were excluded
due to potential ambiguities (again, see Sub-Section 4.1.3). As with the excluded participant, given the
data’s potentially “faulty” nature, I will not analyze it in full (again, the full data can be found in the
appendix) but will give a brief summary.
Of the 24 excluded topicalized possessor-binding judgements (12 individuals * 2 X-Y pairs;
we are not counting the excluded individual’s judgements on these sentences, though I have
discussed them above), 16 were such that BVA(S, X, Y) was accepted, and 8 were such that it was
not. Of the 16 cases of acceptance, all were such that B(S, X, Y) was diagnosed; in all but one case,
the DR test was sufficient to achieve this, but there was one individual who had the following
judgement pattern, which would have necessitated the use of the Coref test:
(279) a. To their student, every professor’s colleague spoke. okBVA
b. To two students, every professor’s colleague spoke. *DR
c. To their student, the new professor’s colleague spoke. okCoref
Again, such a pattern of judgement is totally consistent with our basic predictions, though it
is inconsistent with the general observation, found in the “untainted” parts of the dataset, that no
reference to Coref was necessary for detecting quirky effects. Like the case of the excluded
individual, this result is difficult to interpret, but its inclusion would not have been fundamentally
problematic. Indeed, of the 8 BVA rejection cases, most²⁰² individuals had judgements that could be
taken as “significant” minimal contrasts with the rejected sentences, so the data superficially seem to
support our hypotheses in essentially all respects.
As such, the data gathered is actually totally consistent with, and indeed directly supportive
of, the ABC-BVA law and the particular structural hypotheses provided about possessors put forth
in preceding chapters. Nevertheless, we should not put great stock in this “evidence”; while it is a
positive sign that it does support our hypotheses, there are reasons to believe it is not reliable, and
this cuts both ways; we cannot alternate between excluding and including data simply based on
whether inclusion or exclusion would be the most useful. Further, as discussed in Sub-Section 4.1.3,
we will be seeing in the next chapter that the Korean data would indeed be problematic if such cases
were included. Indeed, if any hypothesis is to have been considered supported by these datapoints, it
is ultimately the “hunch” derived from the English data that the ambiguities in such items were not
202 Five, to be specific. The remaining three cases were such that the individuals in question did not accept other elements in
the same set, failing to establish “theta-role-matched” contrasts and thus diminishing the significance of the rejection.
being adequately disambiguated for participants, which the Korean data bears out in precisely the
sort of way expected.
4.5.3 Coref and DR
Because discussion of the issues surrounding the two tests for B(S, X, Y) has been scattered
through this chapter, I will briefly summarize it here, as it is an issue we will return to in the
subsequent chapters. As has been noted throughout this chapter, for the data considered, DR(S’, X,
Y’) was a sufficient test to diagnose *B(S, X, Y), rather than both DR(S’, X, Y’) and Coref(S’’, X’, Y)
being necessary, which is what we would expect given Hoji’s formulations as discussed in Section
2.7. While needing only one test instead of two is not a direct contradiction of anything, this finding
is fairly inconsistent with the results of the previous experiments discussed in Section 3.2, including
those for English, where the Coref test most certainly was required (though it can be seen from the
relevant graphs that DR often “did more” in terms of barring crucial red dots from the center).
Another piece of the puzzle: as noted earlier in this section, some of the data excluded from
consideration on independent grounds (namely that there were potential
attentiveness/comprehension issues) turned out to be such that both Coref and DR were necessary
to make the results come out as predicted. It is unclear how to interpret this; it could be a total
coincidence, given that the data was already inherently faulty. On the other hand, it could be
meaningful, but speak more to the Coref test having a sort of attentiveness/comprehension
checking role of some sort; this would also explain the role that Coref played in previous
experiments as well. Alternatively, it could be a sign (however unreliable or not the data) that both
Coref and DR are necessary to detect B(S, X, Y) in English, and that the ability to use DR alone is
merely a fluke of the data (perhaps influenced by DR in English often being relatively “easy to get”
in the absence of the “required” precedence/c-command relations). This would also be consistent
with previous results.
Data from Korean and Mandarin Chinese, to be discussed in the following two chapters, will
bear further on this question. The Korean data will turn out to be somewhat like English, in that
only the DR test will be necessary for quirky detection, though the number of relevant cases is much
smaller, so it is hard to assess the relevance of this result. What we will also see with the Korean data
is that, if Coref is indeed serving as an attentiveness check, it is not a sufficiently good one (at least
not with the current experimental design); one piece of excluded data has an issue that reference to
Coref cannot help to explain. In Mandarin Chinese, on the other hand, we will see that it is Coref,
and not DR that is needed to detect B(S, X, Y), which serves as evidence that Coref cannot simply
be “done away with” entirely, at least not in all languages.
Nevertheless, even if DR and Coref are both required in general, it is still worth asking why
only the DR test was needed in this particular dataset. Perhaps the new methodology employed here,
where participants are much more likely to be attentive/comprehending of the task, has revealed a
pattern that had previously been masked by noise. On the other hand, perhaps there were simply
too few individuals considered, and if we had investigated other individuals (or other
constructions/choices of X-Y), “Coref-require-ers” would have emerged. Future studies in English
should thus not abandon the use of Coref but should keep in mind the question of how much DR
alone can do, and take care to check whether the pattern observed here is replicated again.
4.5.4 Summary and Assessment
In this chapter, we have seen that the data gathered in the English experiment consistently
followed the predictions deriving from the ABC-BVA law. When X does not precede Y, neither X
nor Y has “quirky effects”, and X does not c-command Y, BVA(S, X, Y) is impossible, across all
individuals, X-Y pairs, and constructions considered. Further, when any one of these three factors is
restored, BVA(S, X, Y) is frequently possible, lending further significance to the results.
Such results not only support the ABC-BVA law itself, but also the other hypotheses used in
deriving the predictions; in particular, the evidence supports the hypothesized c-command relations
in the various sentences, i.e., that the subject of an active voice sentence asymmetrically c-commands
the object, that the subject of a passive asymmetrically c-commands the “by-phrase”, and that the
possessor in a subject does not “c-command out” of it. The English results are thus consistent with
predictions, and both replicate the results of previous experiments and expand their coverage.
As will be shown in the following chapters, results like these are further replicated in both
Korean and Mandarin Chinese as well. We will see that in the data from speakers of these other
languages, certain quantities are different. In particular, the other languages actually give much
stronger evidence (quantitatively speaking) for the role of c-command in constraining BVA, while
their evidence for the role of B(S, X, Y) is correspondingly much weaker. That these two would vary
inversely with one another is expected, given that quirky effects can only be conclusively detected in
cases where the relevant c-command relations are absent and vice versa.
While B(S, X, Y) nevertheless plays a clear role in the data from other languages, the English
data has provided a particularly clear opportunity to see the effects of modulating B by the use of
different X-Y pairs in a given construction; as noted in Section 4.3, this is the first time such a
demonstration has been made systematically on an individual level. Further, it is worth noting of
these experiments that, compared to previous experiments, e.g., Plesniak 2022b’s, the rate at which
individuals/data are “filtered” from consideration has dropped considerably. This achievement brings
the results closer to the “universal” patterns we would ideally like to be able to demonstrate. As
such, while the Korean and Mandarin Chinese experiments are somewhat more “novel” in terms of
addressing languages that have not been studied before with this method, the English experiment
nevertheless has broken ground in several ways, in addition to replicating the expected results. As we
will see, once we consider the results of all three languages together, despite their many differences,
they point to a consistent picture: the ABC-BVA law holds in all cases, and variation between
languages is constrained by inviolable principles.
5 Korean
5.1 Preliminaries
5.1.1 Introduction
In this chapter, I describe the results of the experiment(s) conducted in Korean.²⁰³ These
results essentially function as a replication in a different language of the English results discussed in
the previous chapter. The general hypotheses and predictions will thus be the same, barring a few
language-specific implementational issues. As such I will not rediscuss these basic issues; readers can
refer back to Sub-Section 4.1.1, which should be equally applicable here. Rather, let us focus on the
Korean-specific issues that impact the way in which the experiment is implemented here and the
ways, if any, the results obtained differ from what was found for the English participants in the
previous chapter.
This latter aspect is easier to address, as the results differ very little, qualitatively speaking.
Overall, the same basic result found in English obtains in Korean. That is, the predictions deriving
from the ABC-BVA law and other hypotheses hold; when X does not precede Y in S, neither X nor
Y is subject to “quirky effects” in S, and X is not hypothesized to c-command Y in S, then BVA(S,
X, Y) is judged impossible in all cases. Though the number of Korean participants is slightly smaller
than in the English experiment (10 vs. 12), the Korean data actually show quantitatively more direct
evidence for the role of c-command (and to some degree precedence) in constraining BVA. As will
be seen, this is because the effects of B(S, X, Y) are much weaker in this particular data sample;
however, this is a trade-off we are more than happy to make, recalling that, as laid out in Section 1.3,
203 Enormous thanks to Yoona Yee for her many hours of assistance providing advice, translation, and practical help that
enabled the conversion of the initial English experiment into its Korean version and the implementation thereof. The
work underlying this chapter could not have been completed without her help.
our main interest is in the effects of syntactic structure/c-command, with other factors (A(S, X, Y)
and B(S, X, Y)) being more secondary, albeit entirely necessary, considerations.
Before turning to the implementation aspect, it is useful to understand some of the relevant
background literature on BVA interpretations in Korean. There has been extensive discussion of
different choices of Y of BVA(S, X, Y) in Korean in relation to how they behave in different
structural positions and pragmatic contexts, much of which will not be directly relevant for the
purpose at hand. There are, however, several key issues that have been highlighted that will be worth
considering going forward.
One persistent issue has been which elements can serve as Y of BVA(S, X, Y). Kang (1988),
for example, devotes a number of examples to showing that the Korean third person
pronoun/demonstrative geu (그) can indeed serve as such a Y (though see further discussion of geu
later in this section). This claim contradicts earlier claims made by Hong (1985) and Choe (1988),
which reach a similar conclusion regarding geu as is reached in Saito and Hoji 1983 regarding the
Japanese third person “pronoun” kare (彼), namely that it can only be used referentially, referring to
one individual at a time. Kang concedes that this is indeed sometimes the case, as sentences like
(280) are degraded if not totally unacceptable with a BVA(S, nuguna, geu) reading.
(280) 누구나 그가 현명하다-고 생각한다
nuguna geu-ga hyeonmyeonghada-go saenggaghanda
everyone he-NOM wise-that thinks
“Everyone thinks that he is wise”
Kang nevertheless provides numerous examples wherein geu can participate in BVA. Many
of these involve constructions where geu is rather deeply embedded in the structure compared to the
element intended as X of BVA(S, X, geu), potentially reflecting parallel observations made about
BVA in Chinese by Aoun and Hornstein (1987)[204], but Kang shows this is not always a necessary
property. For example, Kang judges BVA(S, nuguna, geu) possible in (281):
(281) 누구나 그의 어머니를 좋아한다
nuguna geu-eui eomeoni-leul johahanda
everyone he-GEN mother-ACC likes
“Everyone likes his mother”
It thus cannot be maintained that geu can never participate in BVA. Nevertheless, Kang
expresses a degree of uncertainty regarding the distribution of geu, tantalizingly linking it to
“unknown pragmatic factors” (p. 193), and also briefly mentioning differences between speakers
with regard to its acceptability. Indeed, such variation has become a topic of interest in recent years,
with Kim and Han (2016) claiming that there is effectively micro-dialectal variation between
speakers with regard to the nature of geu. Conducting two experiments, they find that participants
tend to bifurcate with regard to whether they accept BVA with geu or not. For example, considering
sentences like:
(282) 모두-가 농구장-에서 그가 농구-를 잘 한다-고 말했다
modu-ga nonggujang-eseo geu-ga nonggu-leul jal handa-go marhaessda.
every-NOM basketball.court-at he-NOM basketball-ACC well does-that said.
“Everyone at the basketball court said that he plays basketball well”.
204
At least this is who Kang cites. There seems to be a somewhat complicated history, as the work in question is listed as
an unpublished manuscript by Kang, whereas it seems to have ultimately been published in 1992. That published work,
however, gives attribution for the relevant observations to Aoun and Li 1990, the content of which was presented and
published in proceedings as early as 1987. Presumably, Kang had access to the Aoun and Hornstein manuscript, which
cited Aoun and Li’s presentation/proceedings paper, and thus cited Aoun and Hornstein for Aoun and Li’s
data/conclusions.
The vast majority of participants were self-consistent, either always accepting or always
rejecting the BVA(S, modu, geu) readings in such sentences, but they were split relatively evenly into
the “always rejecting” and “always accepting” groups. Further, Kim and Han go on to show that an
individual’s rate of acceptance/rejection of BVA(S, X, geu) when X and geu are in the same clause in
S is correlated with the individual’s rate of acceptance/rejection of the same reading when X and geu
are separated by a clause boundary, suggesting that what is at play is not reducible to variation with
regard to the presence or absence of Aoun and Hornstein 1987-style anti-locality. Kim and Han thus
conclude that there is inter-speaker variation in terms of how geu is represented in the lexicons of
different Korean speakers, deriving the situation wherein some individuals can use it in BVA and
others cannot.
The experiments conducted for this dissertation make use of geu, and do not find the sort of
bifurcation that Kim and Han describe, at least not obviously, given that all individuals at least
sometimes accept BVA with geu. It should be noted, however, that what was labeled as “always” in
Kim and Han’s work had a 25% threshold of error; thus “always” rejecting referred to individuals
who only accepted BVA with geu 25% of the time or less, meaning it is quite possible that they
sometimes did accept it. The experiment conducted for this dissertation thus cannot directly
replicate (or fail to replicate) these results, as there are too few tokens of BVA of geu which do not
also involve some other sort of confounding factor (lack of c-command, lack of precedence, etc.). As
noted, all individuals accept BVA of geu in the data to be analyzed here at least sometimes, but again,
this cannot speak to the 25% threshold hypothesis directly. Interestingly, as we will see, the
conditions under which this acceptance happens are often restrictive; it is not atypical for individuals
to require both c-command and precedence for BVA with geu, which potentially differs from other
choices of Y. As such, the intuition behind Kim and Han’s (2016) result, namely that BVA with
geu may be relatively restricted for some individuals, which is also the intuition found in Kang’s
(1988) discussion, does receive some degree of support.
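To make the role of the 25% threshold concrete, the following is a minimal sketch of a tolerance-based grouping of the kind described above. It is not Kim and Han’s procedure: the function name, the “inconsistent” label, and in particular the symmetric treatment of the “always accepting” group (acceptance 75% of the time or more) are assumptions made for illustration; the text above only states the threshold for the “always rejecting” group.

```python
def classify_speaker(accepted: int, total: int, tolerance: float = 0.25) -> str:
    """Group a participant by how often they accept BVA with geu.

    accepted  -- number of geu-BVA items the participant accepted
    total     -- number of geu-BVA items the participant judged
    tolerance -- allowed "error" rate (25% in the description above)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    rate = accepted / total
    if rate <= tolerance:
        return "always rejecting"   # may still include occasional acceptances
    if rate >= 1 - tolerance:       # assumed mirror image of the rejecting group
        return "always accepting"
    return "inconsistent"
```

Under this grouping, a participant who accepts one item in four still counts as “always rejecting”, which is why the observation that everyone in the present data accepts BVA with geu at least sometimes does not by itself contradict a bifurcation of this kind.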
Returning again to Kang’s discussion, it is noted there that Possessor-binding obtains readily
in Korean, as in BVA(S, nugu, geu) in (283):
(283) 누구의 어머니나 그의 실패를 안타까워했다
nugu-eui eomeonina[205] geu-eui sirpe-leul antaggaueohessda
everyone-GEN mother he-GEN failure-ACC distressed
“Everyone’s mother was distressed at his failure.”
Kang goes on to give an account that unifies such spec-binding sentences and donkey
anaphora in Korean, expanding Haik’s (1984) “indirect binding” proposal. The details of this
account matter less than the fact that it shows that the same issues that plague c-command
accounts of BVA in English are present in Korean as well. Indeed, weak crossover
violations are also observed. In particular, Kwon et al (2009) set out to conduct an experiment
(SCP-style; see Sub-Section 3.1.2) regarding the BVA behaviors of three choices of Y of BVA(S, X, Y):
the pronoun/demonstrative geu; jagi ‘self’, a reflexive-like element roughly analogous to
‘himself’/‘herself’ in English, albeit with a much broader distribution; and the hypothesized silent pro
element, which Korean, unlike English, can make use of to achieve possessed readings without any
overt possessor present in the phrase. Unexpectedly, however, they discover evidence of BVA with
geu being accepted in weak crossover configurations.
205
This na element is crucial to Kang’s analysis and plays a role in marking what is taking scope over what; at this point, it
is tangential to our main purposes, so I am leaving it unglossed in the examples, but further experiments in Korean of this
sort may well find various uses for it. Crucially, Kang reports that if it does not stay in its position after the full noun phrase, but
instead occurs after nugu, the possessor-binding reading becomes impossible; this, however, is contradicted by the
judgements of at least some other native speakers I have spoken to, so it seems that in that domain too, there is variation.
Essentially, one of Kwon et al’s experiments manipulates a four-way paradigm much like
those we will be dealing with in the remainder of this chapter. With i/ga(이/가), eul/leul (을/를), and
eui (의) being case markers for nominative, accusative, and genitive case respectively, schematically,
their paradigm is:
(284) a. X-i/ga Y-eui N-(l)eul V.
         English: X V Y’s N.
      b. Y-eui N-(l)eul X-i/ga V.
         English: Y’s N, X V.
      c. Y-eui N-i/ga X-(l)eul V.
         English: Y’s N V X.
      d. X-(l)eul Y-eui N-i/ga V.
         English: X, Y’s N V.
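The paradigm in (284) crosses two binary factors, which can be tabulated explicitly. The sketch below is illustrative only: the class and field names are hypothetical, and the c-command values encode the hypotheses adopted in this dissertation (the nominative subject c-commands into the object, and a fronted object does not c-command into the subject), not anything measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Condition:
    label: str
    schema: str
    x_precedes_y: bool
    x_c_commands_y: bool  # hypothesized, not observed


PARADIGM: List[Condition] = [
    Condition("(284)a", "X-i/ga Y-eui N-(l)eul V", True, True),
    Condition("(284)b", "Y-eui N-(l)eul X-i/ga V", False, True),   # "reconstruction"
    Condition("(284)c", "Y-eui N-i/ga X-(l)eul V", False, False),  # weak crossover
    Condition("(284)d", "X-(l)eul Y-eui N-i/ga V", True, False),
]


def linear_precedence(schema: str) -> bool:
    """Read precedence directly off the schema string (X before Y)."""
    return schema.index("X") < schema.index("Y")


def weak_crossover(conditions: List[Condition]) -> List[str]:
    """Conditions where X neither precedes nor c-commands Y."""
    return [c.label for c in conditions
            if not c.x_precedes_y and not c.x_c_commands_y]
```

Each of the four combinations of the two factors occurs exactly once, which is what allows the effects of precedence and c-command to be separated.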
Essentially, what is being permuted is on one hand whether X precedes Y, and on the other,
whether X c-commands Y. Interestingly, while sentences where Y is jagi show the expected dip in
average acceptability when X no longer c-commands Y compared to when it does, both geu and
sentences with null possession via pro show a much murkier picture. While sentences of the (284)a
type are indeed judged the best, the other three conditions have mean ratings that are almost
identical to one another; further, these ratings are rather “so-so”, in the middle of the 1-5 range of
possible values, meaning that such sentences were not consistently strongly rejected. As such, weak
crossover cases like (284)c seem to be judged about as good as simple “reconstruction”
cases like (284)b, which are expected to be fine with a BVA interpretation.[206]
206
As I noted regarding the results of the Korean experiment for this dissertation, there seem to be many individuals for
whom BVA with geu requires both precedence and c-command, so one possible analysis could be that many participants
in Kwon et al’s experiment were simply rejecting all options besides (284)a. There are, however, individuals who do not
share this pattern of judgements, allowing BVA with geu in the absence of precedence, so it is somewhat mysterious why
such individuals would not manifest themselves in Kwon et al’s data, pushing the average ratings in the two conditions further
apart. As such, while the precedence+c-command requirement may explain Kwon et al’s results, I suspect such an
explanation is incomplete.
Kwon et al thus claim that the sort of c-command restrictions that apply to elements like his
in English do not apply to geu; as we have seen, however, these restrictions do not consistently apply
in English either, so the fact that weak crossover sentences are sometimes accepted in Korean is,
contra Kwon et al’s assessment, evidence of similarity between English and Korean. Interestingly, as
we will see, this pattern is not what obtains in the data from the experiment to be discussed in this
chapter, as all the Korean participants consistently reject weak crossover BVA. As with Kim and
Han’s (2016) experiment, the comparison between experiments is apples to oranges in many
respects, so the fact that different results obtain is not necessarily cause for concern for either party.
What is of importance for us though is that Korean clearly has analogous issues to those of English,
regarding variation in BVA acceptability with and without c-command, which, in Korean’s case,
have been connected at times to differing behaviors of different elements serving as Y of BVA(S, X,
Y).
5.1.2 Language-specific implementation
I turn now to the various details necessary to “fill in” the general template presented in Section
3.3 in order to construct the Korean experimental items. The general design philosophy was to keep
these items roughly parallel with their English counterparts, simply for ease of comparison and
consistency across languages. As such, the more “incidental” lexical items were effectively translated
literally: ‘professor’, ‘student’, and ‘colleague’ become gyosu (교수), hagsaeng (학생), and dongryo (동료)
respectively, each a literal translation. Likewise, the X’ and Y’ of Coref(S, X’, Y) and DR(S, X, Y’),
which in the English experiments were ‘that new professor’ and ‘two students’ respectively, became
geu sae gyosu (그 새 교수) (lit. ‘that new professor’) and du myeongeui hagsaeng (두 명의 학생) (lit. ‘two
[person classifier] of student’).
One slight difference comes in terms of the choice of verb. Recall from Sub-Section 4.1.2
that the verb ‘speak to’ was used for English, so that the topicalized elements could appear with ‘to’,
which seems to help some individuals accept them in writing. With Korean, however,
scrambling/topicalization is common in the written language, and thus there was no analogous need to
use such a verb. Indeed, all nominals in the sentences used received overt case marking, so in a
sense, an analogous sort of “marking” to what ‘to’ provides can perhaps be said to already have been
present. In any case, we are free to use a more grammatically “straightforward” verb (recalling that
‘speak to’ introduces slight complications because its “object” is really a post-verbal PP). As such,
the verb used in Korean was chingchan-haessda (칭찬했다) (lit. ‘praise-did’) ‘praised’, mirroring
choices made in some of the previous experiments discussed in Section 3.2.
Chingchan-haessda marks the “praiser” with the nominative case marker i/ga (이/가) and the
“praise-ee” with the accusative case marker eul/leul (을/를). To achieve the rough equivalent of
passive voice for this verb, haessda (했다) ‘did’ can be replaced by badassda (받았다) ‘received’; the
“praise-ee” is now marked with the nominative case marker and the “praiser” with the dative case
marker ege (에게). One may question whether or not this is precisely the same structure as an
English passive, but as noted in Chapter 3, this is not necessary to determine for our purposes;
rather, all that is necessary is the hypothesis that the nominative-marked element (the subject) c-
commands the dative-marked element and not vice versa, just as in English the subject is
hypothesized to asymmetrically c-command the by-phrase. Crucially, what we get is a class of
sentences that superficially “mean the same thing” as their “active” counterparts, at least in terms of
“who did what”, but which differ in terms of which elements c-command which other elements; this
is what we are interested in, to demonstrate the independence of c-command effects from those of
theta-role-based meaning, not the exact structural equivalents of English passives.
As for the choices of X’s and Y’s of BVA(S, X, Y), these were also kept relatively constant
with their English equivalents, based on the same logic expressed in Section 4.1 that these choices
are diverse enough to provide a small yet meaningful sample of the range of possible X’s
and Y’s. For X’s, we thus have essentially literal translations: modeun gyosu (모든 교수) ‘every
professor’[207], han myeongeul jeoihan modeun gyosu (한 명을 제외한 모든 교수) ‘every professor except
one’ (lit. ‘one [person classifier]-[accusative marker] except every professor’), and han myeong isangeui
gyosu (한 명 이상의 교수) ‘one or more professors’[208] (lit. ‘one [person classifier] more than’s
professor’); the logic behind these choices is unchanged from their English equivalents, namely
providing a range of visually distinct choices of quantifiers that also differ with regard to their
“numeral specificity” along the lines of what is discussed in Hayashishita 2004/2013 (see Sub-Section
2.7.2).
As for the choices of Y, exact matches between English and Korean are harder due to
differences in the pronominal system. We have seen earlier in this section extensive discussion of the
pronominal/demonstrative element geu, which may roughly translate as ‘he’, but which seems to have
somewhat different properties. It does, however, appear to be the closest element in Korean, and as
we have seen, has an extensive history of discussion in the literature on BVA in Korean, so it was
selected as the first choice of Y. Nevertheless, it is important to recognize that geu is only the Korean
equivalent of ‘he’ in a superficial sense; though it has served as a demonstrative for centuries, its use
as a third person pronoun is a relatively recent introduction to the language, well-known to have
207
Technically, modeun gyosu is ambiguous between ‘every professor’ and ‘all professors’, with disambiguation to the latter
possible via a plural marker. The disambiguated plural form seems to be more restricted in its participation in BVA, at
least with some of the choices of Y used in this experiment, but since we are not using that form, the ‘every’ interpretation
is available, and thus the participants did not seem to have any trouble accepting it in BVA.
208
This is subtly different from the English ‘more than one professor’, but it seems to have functioned roughly equivalently
for our purposes.
been modeled after the Japanese kare, which was itself essentially invented as a third person pronoun
for use in translating European language texts. Martin’s 1992 grammar of Korean notes that it is
used “only in rather formal writing”, though its usage seems to have been slowly expanding over
time.
Given all this, it is clear that geu does not have the same status in Korean as he does in
English, even if they roughly approximate one another in translation. As we will be seeing, however,
geu does behave in a manner highly analogous to he and other choices of Y examined in this
experiment, at least so far as obeying the ABC-BVA law is concerned. That even geu, unusual as its
status in the I-languages of today’s Koreans may be, obeys this law perfectly is a testament to the
universality of the ABC-BVA law and the correlational methodology; as I have formulated them,
they are not dependent on choosing a particularly “pronoun-like” choice of Y of BVA(S, X, Y), and
the fact that the predicted correlations are successfully replicated with elements like geu provides dramatic
confirmation of that point.
As to the second choice, roughly serving as a counterpart to ‘their own’, the choice made
was jasin (자신), an element similar in meaning to the jagi (‘self’) mentioned briefly earlier in this
section. When used as an argument of a verb, jasin means something like ‘himself’, but when used as
a possessor, it can convey a sense analogous to the ‘X’s own’-type expressions in English. Finally,
the third choice of Y was a more straightforward translation: geu gyosu (그 교수) ‘that professor’.
Putting these various options into pairings, we have the following:
(285) Round 1: X = Modeun gyosu ‘every professor’
               Y = Geu(eui hagsaeng) ‘his (student)’
      Round 2: X = Han myeongeul jeoihan modeun gyosu ‘every professor except one’
               Y = Jasin(eui hagsaeng) ‘his own (student)’
      Round 3: X = Han myeong isangeui gyosu ‘one or more professors’
               Y = Geu gyosu(eui hagsaeng) ‘that professor(’s student)’
We pair these choices of X and Y with the following sentence frames, following what has been
discussed in this section and also in Section 3.3. As with English, I give the forms of the BVA
sentences; the Coref and DR sentences can be derived by the appropriate substitutions of X’
and Y’ respectively:
(286) Set 1:
a. X 가 Y 의 학생을 칭찬했다.
X-ga Y-eui hagsaeng-eul chingchan-haessda.
X-NOM Y-GEN student-ACC praise-did.
“X praised Y’s student.”
b. Y 의 학생이 X 에게 칭찬받았다.
Y-eui hagsaeng-i X-ege chingchan-badassda.
Y-GEN student-NOM X-DAT praise-received.
“Y’s student was praised by X.”
c. Y 의 학생을 X 가 칭찬했다.
Y-eui hagsaeng-eul X-ga chingchan-haessda.
Y-GEN student-ACC X-NOM praise-did.
“Y’s student, X praised.”
d. X 에게 Y 의 학생이 칭찬받았다.
X-ege Y-eui hagsaeng-i chingchan-badassda.
X-DAT Y-GEN student-NOM praise-received.
“By X, Y’s student was praised.”[209]
209
I have made this point elsewhere, but let me repeat: a question may be raised about whether, in such sentences, X does
not merely precede Y, but in fact c-commands it. I have assumed that this is not the case, i.e., that X is merely displaced
leftwards at PF and is always underlyingly in the same place as in (b) here. If this were untrue, say that X either syntactically
moved to this position or was base-generated there, then BVA(S, X, Y) being acceptable in such sentences might be due
to c-command, rather than precedence alone (or in addition to it). While this would mean some of the acceptances I report
as being due to precedence might actually be due to c-command (whether working in concert with or acting independently
of precedence), note that this is not problematic for our ultimate goals. What we are most interested in is showing the
clear effects of c-command on BVA; precedence is largely just “along for the ride” because we seem to need it. As such,
if purported precedence effects turn out to be c-command effects, this only further supports our main contentions. Also
note that in the spec-binding cases, such issues do not arise; X in (288)b precedes Y and does not c-command it (at least
under traditional c-command definitions), and this sentence type does not have the same “movement based” complications
that the other two sentence types have. As such, precedence effects cannot be easily gotten rid of completely, regardless
of whichever account of the OSV word order one adopts.
(287) Set 2:
a. X 가 Y 의 학생에게 칭찬받았다.
X-ga Y-eui hagsaeng-ege chingchan-badassda.
X-NOM Y-GEN student-DAT praise-received.
“X was praised by Y’s student.”
b. Y 의 학생이 X 를 칭찬했다.
Y-eui hagsaeng-i X-leul chingchan-haessda.
Y-GEN student-NOM X-ACC praise-did.
“Y’s student praised X.”
c. Y 의 학생에게 X 가 칭찬받았다.
Y-eui hagsaeng-ege X-ga chingchan-badassda.
Y-GEN student-DAT X-NOM praise-received.
“By Y’s student, X was praised.”
d. X 를 Y 의 학생이 칭찬했다.
X-leul Y-eui hagsaeng-i chingchan-haessda.
X-ACC Y-GEN student-NOM praise-did.
“X, Y’s student praised.”
(288) Set 3:
a. X 는 Y 의 학생을 칭찬한 동료가 있다.
X-neun Y-eui hagsaeng-eul chingchanha-n dongryo-ga issda.
X-TOP Y-GEN student-ACC praise-MOD colleague-NOM has.
“X has a colleague who praised Y’s student.”
b. X 의 동료가 Y 의 학생을 칭찬했다.
X-eui dongryo-ga Y-eui hagsaeng-eul chingchan-haessda.
X-GEN colleague-NOM Y-GEN student-ACC praise-did.
“X’s colleague praised Y’s student.”
c. Y 의 학생을 칭찬한 동료가 X 는 있다.
Y-eui hagsaeng-eul chingchanha-n dongryo-ga X-neun issda.
Y-GEN student-ACC praise-MOD colleague-NOM X-TOP has.
“A colleague who praised Y’s student, X has.”
d. Y 의 학생을 X 의 동료가 칭찬했다
Y-eui hagsaeng-eul X-eui dongryo-ga chingchan-haessda.
Y-GEN student-ACC X-GEN colleague-NOM praise-did.
“Y’s student, X’s colleague praised.”
As I did with English, for the sake of giving readers something “concrete” to view rather
than just templates, I here provide examples of the Coref, DR, and BVA sentences, instantiating
(286)a. Since these are somewhat larger in size due to glossing requirements, I will just provide
examples from Round 1; other rounds are derived simply by substituting the choices of X and Y as
provided for in (285):
(289) Round 1:
Coref: 그 새 교수가 그의 학생을 칭찬했다.
geu sae gyosu-ga geu-eui hagsaeng-eul chingchan-haessda.
that new professor-NOM he-GEN student-ACC praise-did.
“That new professor praised his student.”
DR: 모든 교수가 두 명의 학생을 칭찬했다.
modeun gyosu-ga du myeong-eui hagsaeng-eul chingchan-haessda.
every professor-NOM two CL-GEN student-ACC praise-did.
“Every professor praised two students.”
BVA: 모든 교수가 그의 학생을 칭찬했다.
modeun gyosu-ga geu-eui hagsaeng-eul chingchan-haessda.
every professor-NOM he-GEN student-ACC praise-did.
“Every professor praised his student.”
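The substitution procedure used to derive the Coref and DR sentences from the BVA frames can be sketched as follows. This is an illustration, not the software used to build the experiment: the dictionary and function names are hypothetical, and treating Y’ as the bare numeral phrase du myeong (두 명) that fills the Y slot before -eui hagsaeng is an inference from the DR example in (289). No particle allomorphy is needed because every choice of X, as well as X’, ends in gyosu (교수), which ends in a vowel.

```python
# Set 1 frames from (286); {X} and {Y} mark the slots for the noun phrases.
SET1_FRAMES = {
    "a": "{X}가 {Y}의 학생을 칭찬했다.",
    "b": "{Y}의 학생이 {X}에게 칭찬받았다.",
    "c": "{Y}의 학생을 {X}가 칭찬했다.",
    "d": "{X}에게 {Y}의 학생이 칭찬받았다.",
}

# X-Y pairings by round, as in (285).
ROUNDS = {
    1: ("모든 교수", "그"),
    2: ("한 명을 제외한 모든 교수", "자신"),
    3: ("한 명 이상의 교수", "그 교수"),
}

X_PRIME = "그 새 교수"  # 'that new professor', replaces X for Coref
Y_PRIME = "두 명"       # 'two [CL]', replaces Y for DR (assumed slot filler)


def build_items(frame: str, round_no: int) -> dict:
    """Return the BVA, Coref, and DR sentences for one frame and round."""
    x, y = ROUNDS[round_no]
    template = SET1_FRAMES[frame]
    return {
        "BVA": template.format(X=x, Y=y),
        "Coref": template.format(X=X_PRIME, Y=y),
        "DR": template.format(X=x, Y=Y_PRIME),
    }
```

Calling build_items("a", 1) yields the three Round 1 sentences given in (289); the other rounds and frames follow by changing the two arguments.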
5.1.3 Further Considerations
Unlike in the case of English, no participants showed problems with the tutorials that caused
their data to be excluded from the main analysis. However, the same issues regarding the Set 3
sentences remain, namely that, unless ‘professor’/gyosu is explicitly mentioned in the choice of Y of
BVA(S, X, Y) (as it is for geu gyosu ‘that professor’, but not for the other two choices), such
sentences, especially the possessor binding ones, have a confusing “third reading”, namely that the
student(s) in question is/are the students of the professor’s colleague(s), not the professor. Because
such readings can also be BVA readings, this is deceptively similar to the BVA reading being asked
about, and some participants may not carefully distinguish the two. As in English, then, the
ambiguous cases are suppressed for the purposes of the main analysis, though they are dealt with
separately in Sub-Section 5.5.2.
As it turns out, the “suspicion” arising from this item's unusual, but not necessarily
prediction-violating behavior discussed in the previous chapter receives much stronger support here;
while the vast majority of the data is as predicted, there is one directly prediction-violating datapoint.
As noted in the previous chapter, the decision to exclude such sentences was made on the basis of
the English experiment alone, before the Korean data was collected. The decision to leave these
potentially “ambiguous” sentences in was primarily for the purposes of checking whether the initial
suspicion was borne out by further data.[210] As such, the fact that the lone apparent counterexample
shows up precisely in this excluded data in fact supports the prior supposition that such data were faulty.
I make this note to emphasize the non-post-hoc nature of this exclusion; the data were expected to
be problematic based on design considerations and atypical but not prediction-violating results in
the English data, long before the one “problematic” Korean datapoint was collected, and the fact
that such a case showed up in fact validates the initial decision to exclude such sentences from
consideration in English.
5.2 A(S, X, Y)
As we did with English, we now go “factor by factor”, examining the role that each of
precedence, quirky effects, and c-command play in constraining BVA interpretations. We start with
linear precedence, A(S, X, Y). As was the case with English, we are interested in demonstrating that
the following implication holds, and that the “correct conditions” are “no quirky effects” and “X
does not c-command Y”:
(290) (Under the correct conditions)
*A(S, X, Y) → *BVA(S, X, Y)
210
The alternative would have been to change the implementation so as to try to avoid the pitfalls suspected to be present
in English. Had this been done, however, I feel that an important opportunity would have been lost, namely to gather
more information about the “problematic” cases in order to check if there really was something “wrong” with them, and
if so, what. To me, this is an important methodological contribution, which can inform future experiments with regard to
what might happen if similar “ambiguity” is present. Changing the experimental design would have precluded this option,
and thus left us in a more indeterminate state: maybe the atypical English results were due to the ambiguity issue, maybe
they were not. While it leaves us in a position where we can say less about possessor-binding than otherwise, I judged that
understanding the issues that can confound the results of SCI-experiments is of more long-term importance than getting
more data on a particular construction, thus leading me to leave the sentences “as is” to see if my suspicions would be
validated. As it turns out, not only were the suspicions indeed validated, but plenty of useful possessor-binding data was
still gathered, so overall it turned out to be roughly a “win-win” situation.
That is, if we ensure *C(S, X, Y) and *B(S, X, Y), it is predicted that BVA(S, X, Y) is
impossible if X does not precede Y. Further, though not strictly predicted, it is also highly expected
that this result should obtain only if *C(S, X, Y) and *B(S, X, Y) are ensured; that is, if we have
either okB(S, X, Y) or okC(S, X, Y), then we have no reason to expect the implication to hold. We
can see quite clearly that dividing up the dataset by precedence alone will be insufficient. Using the same
“fully implicating”-“degeneratively implicating”-“non-implicating” notation introduced in Section
4.2, we can construct the same sort of summary chart as we did for English in (244):
(291) Abbreviations
a. Modeun Gyosu-Geu MG-G
every professor-his
b. Han Myeongeul Jeoihan Modeun Gyosu-Jasin HMJMG-J
every professor except one-his own
c. Han Myeong Isangeui Gyosu-Geu Gyosu HMIG-GG
One or more professors-that professor
(292) Categorization of individuals with regards to (290)
FI/DI/NI MG-G HMJMG-J HMIG-GG All X-Y’s
Set 1 9/0/1 2/0/8 10/0/0 2/0/8
Set 2 8/0/2 1/0/9 9/0/1 0/0/10
Set 3 N/A N/A 2/1/7 N/A
All Sets 8/0/2 1/0/9 3/0/7 0/0/10
The overall result, shown in the bottom right corner of the table in (292), is that every
participant is in some way non-implicating; that is, at some point, all participants accept BVA(S, X,
Y) in a sentence where X does not precede Y. If we look at the breakdown by set or X-Y pair, or by
combinations thereof, we see a much greater range of variation, however. For example, we see that
everyone in the Set 1 HMIG-GG intersection is “fully implicating”, meaning that no one accepted
BVA in the sentences in Set 1 where geu gyosu ‘that professor’ preceded han myeong isangeui gyosu ‘one
or more professors’, regardless of whether the latter c-commanded the former or not; that is,
whether the sentence was a “non-scrambled weak crossover passive” or a “scrambled non-weak-
crossover active”, BVA(S, HMIG, GG) was never accepted by anyone, even though at least some
cases where geu gyosu followed han myeong isangeui gyosu were accepted with BVA(S, HMIG, GG).
Most cells, however, are such that a majority of individuals are “non-implicating”; that is, most
subdivisions of the data by lexical items or sentence types or combinations thereof still result in
most individuals accepting BVA(S, X, Y) in cases where X does not precede Y. The only exception
seems to be the modeun gyosu-geu ‘every professor’-‘him’ case, where most individuals (though not all)
are fully implicating across the board. Whether this reflects a meaningful difference between this pair
and the others or is just a numerical accident is hard to say at this point, but regardless, there are still
individuals who do accept BVA(S, modeun gyosu, geu) when X does not precede Y, so it is certainly
not a categorical pattern.
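For readers who want the three-way categorization in (292) spelled out, here is a minimal sketch. The definitions themselves are given in Section 4.2 and are not repeated here, so the logic below is a reconstruction from how the labels are used in this section: “non-implicating” (NI) for anyone who accepts BVA at least once without precedence, “fully implicating” (FI) for anyone who never does so but accepts BVA at least once with precedence, and the remaining “DI” category for anyone who never accepts BVA in the subset considered. The function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Judgement = Tuple[bool, bool]  # (X precedes Y?, BVA accepted?)


def categorize(judgements: Iterable[Judgement]) -> str:
    """Classify one participant with respect to *A -> *BVA."""
    js = list(judgements)
    if any(accepted for precedes, accepted in js if not precedes):
        return "NI"  # accepted BVA without precedence: implication violated
    if any(accepted for precedes, accepted in js if precedes):
        return "FI"  # rejects without precedence, accepts with it
    return "DI"      # never accepts: implication holds only vacuously


def tally(participants: Dict[str, List[Judgement]]) -> str:
    """Summarize a cell of the table in the FI/DI/NI format."""
    counts = Counter(categorize(js) for js in participants.values())
    return f"{counts['FI']}/{counts['DI']}/{counts['NI']}"


# Invented toy data for three hypothetical participants.
toy = {
    "p1": [(True, True), (False, False)],
    "p2": [(True, False), (False, False)],
    "p3": [(True, True), (False, True)],
}
```

Restricting the judgements passed to categorize to a given set, X-Y pair, or c-command condition corresponds to moving between cells and between tables.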
We can vastly improve the overall situation by controlling for c-command; that is,
considering only sentences where X is not hypothesized to c-command Y (namely weak crossover
and spec-binding sentences):
(293) Categorization of individuals with regards to (290) (c-command controlled)
FI/DI/NI MG-G HMJMG-J HMIG-GG All X-Y’s
Set 1 6/4/0 5/5/0 10/0/0 7/3/0
Set 2 6/4/0 5/5/0 10/0/0 7/3/0
Set 3 N/A N/A 7/1/2 N/A
All Sets 6/4/0 5/5/0 8/0/2 8/0/2
Compared to what obtained in English, the results for Korean here are dramatically better;
almost no cases of “non-implicating” individuals remain. That is, almost no one ever accepts
BVA(S, X, Y) when X neither precedes nor c-commands Y in this dataset. Indeed, the two cases
where this is violated both come from the same cell of the chart, namely the possessor-binding
sentences with the han myeong isangeui gyosu ‘one or more professors’-geu gyosu ‘that professor’ pair. In
particular, this means that a BVA(S, HMIG, GG) interpretation was sometimes accepted in the
sentence:
(294) 그 교수의 학생을 한 명 이상의
geu gyosu-eui hagsaeng-eul han myeong isangeui
that professor-GEN student-ACC one CL or.more
교수의 동료가 칭찬했다.
gyosu-eui dongryo-ga chingchan-haessda.
professor-GEN colleague-NOM praise-did.
“That professor’s student, one or more professors’ colleague praised.”
Two individuals found (294) acceptable with the relevant BVA reading, but this was not the
case for any of the other sentences considered where X neither preceded nor was hypothesized to
c-command Y. We can note, however, that with the other two X-Y pairs, there was a high
number of “degenerately implicating” individuals, whereas there were none for these pairs in (292), meaning that
these individuals only accepted BVA with the MG-G and HMJMG-J pairs when X c-commanded Y;
we will return to this a bit later in Section 5.4.[211]
211
If we were to repeat all this analysis for DR and Coref, we would find a similar, if more “muted” version of what
happens in English; namely, both Coref and DR are more widely accepted without precedence and/or c-command than
is BVA. Controlling for c-command (or precedence) makes things look a bit better behaved, but there would still be non-
implicators in all rows and columns of the hypothetical tables. Korean speakers seem to simply be more permissive in
accepting Coref(S, X, Y) and DR(S, X, Y) when X does not c-command or precede Y than they are BVA(S, X, Y).
One interesting point to note is that, with regard to Coref with geu-eui ‘his’ and geu gyosu-eui ‘that professor’s’, all participants
are willing to accept them at least sometimes when X precedes but does not c-command Y. However, the
majority (7/10) do not accept Coref with jasin-eui ‘(him)self’s’ under these conditions, even though they are generally fine
with Coref with jasin-eui when X c-commands but does not precede Y. Again, these are not strictly categorical patterns, so
interpreting them is difficult, but it is perhaps relevant to the question of why there are so many “degeneratively implicating”
individuals with jasin here.
The effects of B(S, X, Y) are thus much weaker in this dataset than they were for the English
one considered in the previous chapter. While this diminishes our ability to investigate quirky effects
(see the next section), it is excellent for the purpose of investigating the role of precedence (and c-
command), as their influence is not being obscured by said quirky effects in the same way that it was
in English. Nevertheless, we do need to account for the couple of cases where BVA(S, X, Y) was
accepted even after *A(S, X, Y) and *C(S, X, Y) were ensured. Fortunately, the usual tests for B(S,
X, Y) accurately diagnose these cases as ones where quirky effects were present, precisely as
predicted.
Constructing an equivalent diagram to the one given for English in (247), we can fully see
the role of A(S, X, Y) revealed through the control of B(S, X, Y) and C(S, X, Y). (I reuse the same
conventions of naming and presentation as were used throughout the previous chapter; readers can
refer back to the discussion preceding (247) for the relevant explanations.)
(295) a. Venn Diagram
b. Summary Table
# of judgements    Green   Yellow   Red   Total
*C, DR, Coref         30       15     0      45
*C, DR                12        1     0      13
*C, Coref              2        3     2       7
DR, Coref             18        8    19      45
*C                     5        0     0       5
DR                     9        1     3      13
Coref                  1        1     5       7
None                   4        0     1       5
Total                 81       29    30     140
There are no “red dots” in the central intersection of the graph. This signifies that, as
predicted by the ABC-BVA law, if a given sentence is *A(S, X, Y), assuming that *B(S, X, Y)
and *C(S, X, Y) are also true, then BVA(S, X, Y) is consistently impossible (in other words, the
conditions are met for (290) to hold). Further, numerous dots in the center are “green”, meaning
that, among other things, there are relevant minimally contrasting sentences where X does precede
Y and, for the same individual, BVA(S, X, Y) is possible. This demonstrates that X preceding Y in S
is capable on its own of permitting BVA(S, X, Y), without need for X to c-command Y or for there
to be a quirky effect.
We can note that, as with English, only the DR test is actually necessary to achieve the
desired results. This is less striking here, as there are only two red dots that need to be “taken care
of” by the tests in the first place, but it is nevertheless consistent with the results found in the
previous chapter. If we ignore this, and use both tests, then there are 45 dots in the center, 30
of them green; if we can use the same sort of “post-hoc” analysis we explored with English, using
only the DR test, then the number rises slightly, 58 total dots, 42 of them green. Either way, this is a
much greater quantity of direct evidence for the determinative role of precedence in
constraining/permitting BVA than was found in English, again probably due to the much lower
number of quirky effects obtaining, which had obscured the pattern to a greater degree in English.
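The dot counts cited in this paragraph can be recomputed directly from the summary table in (295b). The following Python sketch is offered only as an illustration of that arithmetic; the dictionary layout and the names `TABLE_295` and `center_counts` are assumptions made for the example and are not part of the experimental materials.

```python
# Summary table (295b): region -> (green, yellow, red) judgement counts.
TABLE_295 = {
    "*C, DR, Coref": (30, 15, 0),
    "*C, DR": (12, 1, 0),
    "*C, Coref": (2, 3, 2),
    "DR, Coref": (18, 8, 19),
    "*C": (5, 0, 0),
    "DR": (9, 1, 3),
    "Coref": (1, 1, 5),
    "None": (4, 0, 1),
}


def center_counts(table, regions):
    """Sum the green/yellow/red counts over the regions treated as the 'center'."""
    green = sum(table[r][0] for r in regions)
    yellow = sum(table[r][1] for r in regions)
    red = sum(table[r][2] for r in regions)
    return {"total": green + yellow + red, "green": green, "yellow": yellow, "red": red}


# Both tests: only judgements that passed DR and Coref count as *B.
both_tests = center_counts(TABLE_295, ["*C, DR, Coref"])
# "Post-hoc" DR-only test: judgements that passed DR count, with or without Coref.
dr_only = center_counts(TABLE_295, ["*C, DR, Coref", "*C, DR"])

print(both_tests)
print(dr_only)
```

Under either choice of regions, the red count in the center is zero, which is the property the discussion above relies on.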
As discussed in the previous chapter, we may want to know which individuals, lexical pairs,
and constructions are represented in the set of dots “in the middle”, namely those that provide the
most direct support for our prediction about the *A(S, X, Y)→*BVA(S, X, Y) implication. We can
thus perform our usual “zoom in” maneuver, as below (along with an updated “decoding” guide):
(296) i-j-k
i: The (arbitrary) number identifying the participant in question, 1-10.
k: The choice of X and Y: 1 for modeun gyosu ‘every professor’ -geu ‘his’, 2 for han
myeongeul jeoihan modeun gyosu ‘every professor except one’ -jasin ‘himself/his own’, and
3 for han myeong isangeui gyosu ‘one or more professors’ -geu gyosu ‘that professor’.
j: The sentence type in question: 1 for (non-topicalized) weak crossover passive, 2 for
(non-topicalized) weak crossover active, and 3 for topicalized possessor-binding.
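For readers who wish to process the labels programmatically, the decoding guide in (296) can be sketched as a small lookup. This is a minimal sketch, assuming labels are written as three hyphen-separated integers; the names `Judgement` and `decode`, and the example label `"4-2-3"`, are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple

# k: choice of X and Y, following (296).
XY_PAIRS = {
    1: "modeun gyosu 'every professor' - geu 'his'",
    2: "han myeongeul jeoihan modeun gyosu 'every professor except one' - jasin 'himself/his own'",
    3: "han myeong isangeui gyosu 'one or more professors' - geu gyosu 'that professor'",
}

# j: sentence type, following (296).
SENTENCE_TYPES = {
    1: "(non-topicalized) weak crossover passive",
    2: "(non-topicalized) weak crossover active",
    3: "topicalized possessor-binding",
}


class Judgement(NamedTuple):
    participant: int
    sentence_type: str
    xy_pair: str


def decode(label: str) -> Judgement:
    """Decode an 'i-j-k' label into participant, sentence type, and X-Y pair."""
    i, j, k = (int(part) for part in label.split("-"))
    if not 1 <= i <= 10:
        raise ValueError(f"participant number out of range: {i}")
    if j not in SENTENCE_TYPES or k not in XY_PAIRS:
        raise ValueError(f"unknown sentence type or X-Y pair in label: {label}")
    return Judgement(i, SENTENCE_TYPES[j], XY_PAIRS[k])


# Hypothetical label: participant 4, weak crossover active, HMIG-GG pair.
print(decode("4-2-3"))
```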
(297)
If we consider all individuals in the “center”, ignoring whether they are “green” or “yellow”
or inside both the DR and Coref circles or just the DR circle, then all individuals, X-Y pairs, and
constructions are represented, and most possible sub-combinations of the three are as well. That is,
almost every judgement on a sentence that involved *A(S, X, Y) and *C(S, X, Y) was included,
because so few relevant sentences were not diagnosed as *B(S, X, Y) by the DR test. If we include
the Coref test as well, the numbers fall a little bit; individual #1 has no judgements remaining (that
individual never passed both the DR and Coref tests for the relevant sentences), and a smattering of
other individual-X-Y-construction combinations drop out, but overall, every condition by itself is
represented, usually more than once.
If we consider only greens, those who specifically provide evidence for A(S, X, Y)’s active
role in both facilitating and inhibiting BVA(S, X, Y), very little changes. If we ignore the Coref test,
results are near identical; a few individuals don’t have a datapoint for every single X-Y-construction
combination, but every individual, X-Y pair, and construction (and combination of X-Y pair and
construction) are represented, all more than once, most several times. If we include the Coref test,
then, in addition to the dropping out of individual #1, we also lose individual #10, and a smattering of
combinations of individuals, X-Y pairs, and constructions. Nevertheless, judgements from 8/10
individuals are represented, most multiple times, as well as every combination of X-Y pairs and
constructions, which of course means that every X-Y pair and construction individually are
themselves represented, all multiple times.
Thus, regardless of how “strict” or not one is with the filtering, it can be seen that the
general predictions of the ABC-BVA law are borne out across the diversity of individuals, X-Y pairs,
and constructions. Further, the active role of A(S, X, Y), flipping *BVA(S, X, Y) to okBVA(S’, X, Y)
when S and S’ minimally differ in terms of precedence and/or meaning, is also quite well
represented across this diversity, as manifested through the green dots specifically. As such,
not only have we replicated the results found for precedence in English, but, in a quantitative sense, the
Korean data makes the point much more strongly, even though there are slightly fewer participants.
5.3 B(S, X, Y)
In this section, I discuss more deeply the role of B(S, X, Y), i.e., quirky effects, in
constraining BVA in the Korean dataset. As before, we can understand this in terms of settling
various issues regarding the implication:
(298) (Under the correct conditions)
*B(S, X, Y)→*BVA(S, X, Y)
We have seen from the previous section that reference to B(S, X, Y) has not been as
frequently necessary for results to come out as predicted. Nevertheless, in precisely the couple of
cases where considerations of precedence and c-command alone failed to make the correct
predictions, adding in considerations of B(S, X, Y) correctly predicted the data. As such, while the
signal is much weakened, B(S, X, Y) still plays a directly observable role in this dataset.
As with English in Section 4.3, I intend to address three specific issues here. To repeat
verbatim what I stated there, the issues are: (I) whether *B(S, X, Y) alone is sufficient to predict
*BVA(S, X, Y), (II), the role of DR and Coref in establishing B(S, X, Y), expanding on the
discussion in the previous section, and finally, (III), whether we can see clear, minimal pair-like
contrasts based on the presence/absence of B(S, X, Y). As it turns out, we will find very similar
results regarding (I) and (II) here, but (III) will be different due to the infrequency of quirky-based
BVA in this dataset.
Turning first to issue (I): in Section 4.3, I discussed two ways one might attempt to take a
“quirky absolutist” position, analogous to the one Barker (2012) takes for English. Readers can
review the details there, but in essence, the first simply checks whether each BVA sentence’s
corresponding Coref and/or DR sentences were accepted, and if neither were, predicts the BVA
sentence should not be accepted; the second does something similar, but instead of relying on each
BVA sentence’s exact Coref/DR analogue, it makes use of whichever Coref and DR sentences were
most closely semantically matched but were such that X/X’ did not precede or c-command Y’/Y.
Both approaches fail completely (see (251) for the precise meanings of the various implication
types):
(299) Quirky Absolutism, Take 1
# of Individuals    Fully Impl.   Degenerate Impl.   Non-Impl.
BVA                           0                  1           9
(300) Quirky Absolutism, Take 2
# of Individuals    Fully Impl.   Degenerate Impl.   Non-Impl.
BVA                           0                  1           9
In either case, the vast majority of individuals are such that BVA(S, X, Y) is accepted in at
least some S’s where the corresponding/analogous sentences are rejected with DR(S, X, Y’) and
Coref(S, X’, Y). The only individual for whom this is not the case is “degenerately implicating”,
meaning that all their judgements are simply rendered irrelevant because nothing is ruled as *B(S, X,
Y) (the revised DR and Coref tests diagnosed potential quirky effects everywhere). Contrast this
with the results that emerge once we take c-command and precedence into account:
(301) BVA acceptance in sentences where X neither precedes nor c-commands Y
# of Individuals             Never Accepts   Sometimes Accepts
BVA (No DR/Coref)                        8                   2
(302) Status of (298) with regards to sentences where X neither precedes nor c-commands Y
# of Individuals             Fully Impl.   Degenerate Impl.   Non-Impl.
BVA (DR/Coref considered)              9                  1           0
As we saw in the previous section, ensuring *A(S, X, Y) and *C(S, X, Y) alone is sufficient to
accurately predict the BVA judgements of most, though not all, individuals in this dataset, and
adding in *B(S, X, Y) to the other two takes care of the remaining individuals. As such, we can see
that while considerations of B(S, X, Y) are indeed necessary, they are hardly sufficient by themselves,
precisely as was the case in English.
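The three-way categorization used in tables such as (299)-(302) can be sketched in code. The precise definitions are those given in (251); the Python sketch below encodes only a simplified reading of them inferred from the prose of this chapter, and that reading is an assumption: an individual is “non-implicating” if BVA is ever accepted where the antecedent of the implication holds, “fully implicating” if BVA is always rejected where the antecedent holds and is accepted at least once where it does not, and “degenerately implicating” otherwise. The function name `classify` and the input format are hypothetical.

```python
def classify(judgements):
    """Classify one individual with respect to an implication *P -> *BVA.

    `judgements` is a list of (antecedent_holds, bva_accepted) pairs, one per
    sentence judged. This is a simplified reading of the categories in (251),
    adopted here as an assumption for illustration.
    """
    where_antecedent_holds = [acc for holds, acc in judgements if holds]
    where_antecedent_fails = [acc for holds, acc in judgements if not holds]

    # Any acceptance where the antecedent holds contradicts the implication.
    if any(where_antecedent_holds):
        return "Non-Impl."
    # The implication is checked at least once AND a contrasting acceptance exists.
    if where_antecedent_holds and any(where_antecedent_fails):
        return "Fully Impl."
    # Not contradicted, but only vacuously or without any contrast.
    return "Degenerate Impl."


# Hypothetical individuals, not taken from the dataset.
print(classify([(True, False), (False, True)]))   # rejects under *P, accepts otherwise
print(classify([(True, True), (False, True)]))    # accepts even under *P
print(classify([(False, True), (False, False)]))  # antecedent never holds
```

On this reading, the single “degenerately implicating” individual in (299) and (300) falls under the last case, since no sentence was ruled *B(S, X, Y) for that individual.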
As mentioned, the resolution to issue (II) is likewise similar to what was found in English,
namely that, once we have ensured *A(S, X, Y) and *C(S, X, Y), the DR test alone is sufficient for
diagnosing *B(S, X, Y) and thus predicting BVA, whereas the Coref test alone fails to achieve this:
(303) Status of (298) using just one MR to detect B(S, X, Y) (assuming *A/*C(S, X, Y))
# of Individuals                 Fully Impl.   Degenerate Impl.   Non-Impl.
BVA, considering just DR                  10                  0           0
BVA, considering just Coref                7                  1           2
Indeed, we can see from these results and from (302) that the Coref test is in some sense
“over-eliminative”, in that it converts one individual from “fully implicating” to “degenerately
implicating”; interpretationally, this individual never passed the Coref test on the relevant sentences,
so their data had to be flagged as constantly quirky, and thus the relevant implication in (298) could
not be checked. As we can see from the other data though, this exclusion achieved nothing; indeed,
that individual never accepted BVA(S, X, Y) when X did not precede or c-command Y at all. As
noted in the previous section, it is difficult to draw conclusions here, as there were so few instances
where BVA(S, X, Y) was accepted when X did not precede or c-command Y, but it is nevertheless
interesting that DR and Coref do seem to have the same level of efficacy in detecting BVA in this
Korean dataset as they did in the English one. Once again, while it is hard to draw any conclusions
at this point, it is something that should be watched carefully in future experiments.
Finally, I turn to issue (III): whether we can observe minimal contrasts within an individual
due to the presence vs. absence of quirky effects on the particular X and Y of BVA(S, X, Y). The
answer here is unfortunately that we cannot. This is not so much a reflection of the data but of the
experimental design. As we have seen in the previous section, all the cases of quirky effects
manifesting were with the possessor binding sentences, which are precisely the sentences for which
we do not have “minimal contrasts” to compare with, as only one X-Y pair’s data was usable. If we
were to include possessor-binding data from the excluded sentences, we could indeed see such
contrasts, but that would be a somewhat inconsistent (or at least complicated) application of our
“exclusion” criteria.[212]
Given that nothing particularly relevant for our purposes will come out of investigating this
data further, I will forgo most of the kinds of analysis I did for the English dataset in Section 4.3.
For the sake of comparison, I present here the Korean equivalent of the English Venn diagram in
(264), which is confined to weak crossover passives and actives:
[212] One might argue, however, that since these individuals (Korean participants #5 and #6, see data in the appendix) each
reported rejecting (topicalized) possessor binding with at least one other choice of X and Y, the issue of “misunderstanding”
the question does not apply. That is, the participants reported that the BVA interpretation in question was impossible; the
issues discussed in previous chapters regarding possessor binding were specifically about cases where it was erroneously
judged to be possible, not impossible. As such, we might still count “impossible” judgements as significant for the purposes
of establishing minimal contrasts with other judgements. If we do that, then we do indeed observe two clear cases of
changing X-Y leading to changes in the acceptability of BVA(S, X, Y), showing us a more direct picture of the effects of
*/okB(S, X, Y).
(304) a. Venn Diagram
b. Summary Table
# of judgements    Green   Yellow   Red   Total
DR, Coref              0       41     0      41
DR                     0       13     0      13
Coref                  0        3     0       3
None                   0        3     0       3
Total                  0       60     0      60
This diagram simply emphasizes that BVA was never accepted in a relevant sentence
for the relevant sort of effect to be demonstrated. The data are thus consistent with the *B(S, X,
Y)→*BVA(S, X, Y) prediction, but in a degenerate way. It is crucial to understand, however, that
this does not mean that the prediction made about *B by the ABC-BVA law has been falsified, or
even that the results are inconclusive. Quirky effects cannot be studied quite so “directly” in this
dataset as they were in the English dataset, but as we have already seen in this section, their effects can
be observed through the combination of *A(S, X, Y) and *C(S, X, Y) failing to completely predict
*BVA(S, X, Y), and reference to the *B(S, X, Y) diagnostics, specifically the DR test, accounting for
all the instances of such apparent “failure”. Thus, while we have not been able to observe the effects
of B(S, X, Y) as clearly as in the previous chapter, this in no way imperils the successful status of the
ABC-BVA law in predicting BVA judgements. Indeed, as we will see, the weakness of B(S, X, Y)
means we can much more clearly see the effects of C(S, X, Y), which, of the three factors of the
ABC-BVA law, is our primary interest.
5.4 C(S, X, Y)
I turn now finally to the analysis of the role of C(S, X, Y) in the Korean data gathered. As
usual, we can understand this section as analyzing the data from the perspective of a specific
implication, namely:
(305) (Under the correct conditions)
*C(S, X, Y)→*BVA(S, X, Y)
That is, if X does not c-command Y in S, then BVA(S, X, Y) should not be possible,
assuming the other relevant conditions are met, which we predict to be X not preceding Y in S and
no quirky effects obtaining. As before, we start by considering what happens if only c-command is
taken into account; in other words, how well does C(S, X, Y) do on its own at predicting BVA,
before we add in the other considerations? The following obtains:
(306) Categorization of individuals with regards to (305)
FI/DI/NI    MG-G     HMJMG-J   HMIG-GG   All X-Y’s
Set 1       9/0/1    9/0/1     4/0/6     8/0/2
Set 2       6/0/4    8/0/2     2/0/8     6/0/4
Set 3       N/A      N/A       2/1/7     N/A
All Sets    6/0/4    8/0/2     0/0/10    0/0/10
Overall, we see that all individuals at least sometimes accept BVA(S, X, Y) in S where X does
not c-command Y. This seems to vary quite a lot depending on the choice of X and Y; the HMIG-
GG pair, for example, is such that all individuals at least sometimes accept BVA(S, HMIG, GG) in
some S where X does not c-command Y, whereas with the HMJMG-J pair, only two individuals do
so, with the other eight showing clear c-command-based patterns. The first pair is somewhere in
between; as per usual, the numbers are not drawn from a large enough sample that we can feel very
confident assigning significance to the differences between judgements on one set of lexical items
vs. another; the most relevant point is that the hypothesized c-command relations alone clearly do
not reliably constrain BVA on their own, even if things happen to work out that way in certain
situations.
If we restrict ourselves to considering only sentences where X does not precede Y, then, as
expected, the number of “non-implicators” drops across the board:
(307) Categorization of individuals with regards to (305) (precedence controlled)
FI/DI/NI    MG-G     HMJMG-J   HMIG-GG   All X-Y’s
Set 1       7/3/0    9/1/0     7/3/0     10/0/0
Set 2       7/3/0    9/1/0     7/3/0     10/0/0
Set 3       N/A      N/A       5/3/2     N/A
All Sets    7/3/0    9/1/0     5/3/2     8/0/2
As we already know from previous sections, two individuals remain who accept BVA(S, X,
Y) at least once when X neither precedes nor c-commands Y in S. Interestingly, recall that in 5.2
the MG-G and HMJMG-J pairs had “degenerately implicating” individuals once we looked only at
cases where X did not c-command Y. Here, we see that the same occurs with MG-G when we look
at cases where X does not precede Y, and to a lesser degree with HMJMG-J. It thus seems that there
are a number of individuals for whom, with some X-Y pairs, BVA(S, X, Y) requires X to both precede
and c-command Y in S. Further, given that no individuals are “degenerately implicating” overall,
we can see that this requirement is indeed specific to certain X-Y pairs,[213] not something “global” for
the individuals in question. A final point of comparison is that, unlike in 5.2, some
individuals in the HMIG-GG case become “degenerately implicating” as well, perhaps reflecting
the same pattern that we saw in English and that accounts like Hoji 1995, Ueyama 1998, and Hoji et al.
2000 picked up on, namely that some individuals require demonstrative phrases like geu gyosu ‘that
professor’ to be preceded by X in order to achieve BVA(S, X, geu gyosu).
To account for the remaining problematic datapoints, we need to consider B(S, X, Y) as
well. The question is, after we have ensured *B(S, X, Y), do we still have data that demonstrate the
clear effects of c-command in determining whether BVA(S, X, Y) is possible or not? The answer in
this case is overwhelmingly yes. Constructing a diagram equivalent to the one given in Section 4.4
(and using the same labeling system as the one provided there), we find many times over the number
of clearly demonstrated c-command effects:
[213] My experience checking various sentences with native speakers is that this requirement is tied to Y in particular. One
might hope that this would become clear in the DR vs. Coref data, but it is not so. In both cases, no individuals are
“degenerate” for a certain choice of X or Y before precedence is controlled, and then once precedence is controlled, some
are; in some cases, it is just one individual, in others as many as eight, but it does not seem to line up in any straightforward
way with the BVA data, and as mentioned, there are also “non-implicators” throughout (again, one can view the relevant
data in the appendix). More work is thus needed to pinpoint the source of this apparent requirement and how it comes to
attach itself to certain choices of X and/or Y for specific individuals.
(308) a. Venn Diagram
b. Summary Table
# of judgements    Green   Yellow   Red   Total
*A, DR, Coref         37        8     0      45
*A, DR                 7        6     0      13
*A, Coref              3        2     2       7
DR, Coref             25        3    17      45
*A                     4        1     0       5
DR                     5        3     5      13
Coref                  3        1     3       7
None                   1        0     4       5
Total                 85       24    31     140
Considering just the green dots in the central intersection, which show clear c-command
effects, i.e., rejecting BVA(S, X, Y) when X does not c-command Y but accepting it in
precedence/meaning-matched S’ where it does, the Korean data offers more than five times as many
cases as English, 37 vs. 7, despite having two fewer participants. If we do our slightly “post-hoc”
version of the test for B(S, X, Y), relying only on the DR portion of the test, the ratio decreases
slightly; there are 44 green dots for the Korean data vs. 16 for the English, but this is still almost
three times as many. I point this out not to critique the English results, but rather, to highlight just
how widely the desired effects have been demonstrated in Korean. We have many cases of the
absence of X c-commanding Y in S disallowing BVA(S, X, Y), and for the same person with the same
choices of X and Y, there exist minimal pairs where X c-commands but does not precede Y and
BVA(S, X, Y) is acceptable. Further, despite there being so many datapoints “in the center”,
absolutely none of them are “red”; that is, in all these 45-58 cases where all three necessary
conditions were ensured, BVA(S, X, Y) was never once accepted; all cases of acceptance are in cases
where at least one of the conditions was not met.
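The coloring convention behind these counts can be summarized as a simple decision rule. The sketch below is a minimal illustration, assuming that each dot corresponds to one judgement on a sentence S where X does not c-command Y, paired with the same individual's judgement on a precedence/meaning-matched S’ where it does; the function name `dot_color` and the sample judgements are hypothetical and are not drawn from the dataset.

```python
from collections import Counter


def dot_color(accepted_in_s: bool, accepted_in_matched_s_prime: bool) -> str:
    """Color one judgement on a sentence S where X does not c-command Y."""
    if accepted_in_s:
        return "red"     # BVA accepted in S itself
    if accepted_in_matched_s_prime:
        return "green"   # rejected in S, accepted in the matched S'
    return "yellow"      # rejected in S, but no contrast could be established


# Hypothetical judgements: (accepted in S, accepted in matched S').
judgements = [(False, True), (False, True), (False, False), (True, True)]
tally = Counter(dot_color(s, s_prime) for s, s_prime in judgements)
print(dict(tally))
```

Only green dots demonstrate a full c-command effect, since they pair a rejection with a minimally contrasting acceptance from the same individual.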
We can “zoom in” once again on the judgements in the center:
(309)
From previous sections, we already know what happens if we consider both “green” and
“yellow” judgements, so let us just focus on the greens, which are those for which a full c-
command-effect is detected. If we use only the DR test to diagnose *B(S, X, Y), then judgements
from 9 of the 10 individuals are represented; only individual #10 has no data included. Further, all
X-Y pairs, constructions, and combinations thereof are represented, multiple times. If we use both
the DR and Coref tests to diagnose B, little changes; we no longer have judgements from individual
#1, but otherwise, every other condition and combination thereof are well represented across
multiple participants.
The numerosity of these detected c-command effects should be compared to the previous
experiments conducted in English and Japanese, as presented in Section 3.2. This experiment had only a
fraction of the number of participants that those did, and yet found far more instances of the
desired effects, across a much higher percentage of the participant population. The same was to
some extent true for English, as discussed in the previous chapter, but the demonstration here is
much more dramatic. Although we have no “control” case of an analogous experiment done in Korean
under one of the other styles of experimentation, I think it fair to say this result presents strong
evidence of the improvement made in some areas by the current experimental design. This in turn
may provide some optimism as to the overall potential of SCI-style experimentation to be
more inclusive and less “eliminative” of the individuals surveyed in pursuit of categorical results
demonstrating the effects of c-command.
5.5 Full Results and Discussion
5.5.1 Full Results
Having looked at each of the three factors in the ABC-BVA law individually, I now combine
these three analyses, showing the overall results that emerge from the Korean data. These results
highlight the way in which the data conform precisely to the ABC-BVA law. As with English, these
diagrams will be slightly different from those that have come earlier in this chapter; here, judgements
are colored only according to whether BVA was accepted or rejected (no comparison to BVA in
other analogous sentences), and the three circles will represent the sentence being *A(S, X, Y) (X
can be seen to not come before Y in S), *B(S, X, Y) (the individual in question passed the relevant
diagnostic tests for this particular X, Y and S[214]), and *C(S, X, Y) (X is not hypothesized to
c-command Y in this sentence). If we use the original test for B(S, X, Y) that requires the use of both
Coref and DR, the following diagram is obtained:
[214] See the small note about how this is calculated for sentences other than the *Schemata provided in Section 4.5.
(310) a. Venn Diagram
b. Summary Table
# of judgements    Green   Red   Total
*A, *B, and *C        45     0      45
*A and *B             26    19      45
*A and *C             23     2      25
*B and *C             28    17      45
*A                    16     9      25
*B                     4    41      45
*C                    13    12      25
None                   1    24      25
Total                156   124     280
As demanded by the ABC-BVA law, if an S-X-Y combination is such that, for the individual
in question, there is *A(S, X, Y), *B(S, X, Y), and *C(S, X, Y), then such sentences universally are
*BVA(S, X, Y), as evidenced by the 45 green dots in the intersection of all three circles and the lack
of any red dots there. Further, providing evidence that all three conditions are needed, in all other
regions of the diagram, there are red dots, meaning that no sub-combination of the three factors is
sufficient to predict BVA in the way that the full combination is. As noted throughout this chapter,
the central intersection is far more populated than in English, due to more dots making it inside the
*B(S, X, Y) circle than did in English. As stated above, however, green dots in the center and
red dots outside the center are both forms of evidence for the ABC-BVA law, so it is not necessarily
true that the Korean data offers more evidence for the law than the English data does; rather, the
evidence is quantitatively stronger in different areas in the data from the different languages. English
has more thorough evidence of the role of B(S, X, Y) than does Korean, while Korean has more
such evidence regarding C(S, X, Y). Regardless, qualitatively, the results from both languages come
out exactly as predicted.
If we use our somewhat “post-hoc” method for diagnosing *B(S, X, Y), ignoring the Coref
test and relying solely on the DR test, we move more green dots to the center, but all red dots
remain outside it:
(311) a. Venn Diagram
b. Summary Table
# of judgements    Green   Red   Total
*A, *B, and *C        58     0      58
*A and *B             36    22      58
*A and *C             10     2      12
*B and *C             36    22      58
*A                     6     6      12
*B                     4    54      58
*C                     5     7      12
None                   1    11      12
Total                156   124     280
While the number of red dots outside the center cannot possibly increase under this change,
13 more green dots move into the center, bringing the total to 58. Thus, we have 58 cases of the
ABC-BVA law’s negative prediction being borne out, and 124 pieces of supporting evidence
provided by BVA being possible when the conditions for the law’s absolute restriction to apply are not
met. Finally, the other 98 pieces of data are somewhat “neutral”, not directly supporting the law but not
contradicting it either, given that we do not predict acceptability, only unacceptability; nevertheless, as we
have seen throughout this chapter, they may speak to specific restrictions, such as individuals being
unable to accept BVA with certain lexical items under certain conditions (e.g., the individuals who
could not accept BVA(S, X, Y) where Y=geu ‘that/him’ unless X came first, regardless of other
factors). Overall, just as with English, the Korean results provide a wealth of data, all of which
strictly obeys the ABC-BVA law and supports its various predictions.
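The tallies in this sub-section follow directly from the summary tables in (310b) and (311b), where green marks a rejected BVA judgement and red an accepted one. The Python sketch below recomputes them; it is illustrative only, and the names `TABLE_310`, `TABLE_311`, `CENTER`, and `summarize` are assumptions introduced for the example.

```python
# Region -> (green, red); green = BVA rejected, red = BVA accepted.
TABLE_310 = {  # *B diagnosed with both the DR and Coref tests
    "*A, *B, and *C": (45, 0), "*A and *B": (26, 19),
    "*A and *C": (23, 2), "*B and *C": (28, 17),
    "*A": (16, 9), "*B": (4, 41), "*C": (13, 12), "None": (1, 24),
}
TABLE_311 = {  # *B diagnosed with the DR test only
    "*A, *B, and *C": (58, 0), "*A and *B": (36, 22),
    "*A and *C": (10, 2), "*B and *C": (36, 22),
    "*A": (6, 6), "*B": (4, 54), "*C": (5, 7), "None": (1, 11),
}
CENTER = "*A, *B, and *C"


def summarize(table):
    """Check the law's negative prediction and tally the kinds of evidence."""
    center_green, center_red = table[CENTER]
    green_total = sum(g for g, _ in table.values())
    red_total = sum(r for _, r in table.values())
    return {
        # No acceptance when all three conditions are ensured.
        "law_holds": center_red == 0,
        # Every proper sub-combination admits at least one acceptance.
        "all_three_needed": all(
            r > 0 for region, (_, r) in table.items() if region != CENTER
        ),
        "rejections_predicted": center_green,
        "acceptances_outside_center": red_total - center_red,
        "neutral_rejections": green_total - center_green,
    }


print(summarize(TABLE_310))
print(summarize(TABLE_311))
```

Moving from the two-test diagnostic to the DR-only diagnostic changes only how the rejections divide between the predicted and the neutral tallies; the number of acceptances, all of which lie outside the center, is unchanged.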
5.5.2 Analysis of Other Data Gathered
Now that these overall results have been presented, I turn here to the datapoints gathered
that are not represented in the diagrams in the preceding sub-section. First, as in the previous
chapter, we may wonder about individuals’ judgements on the non-MR interpretations provided
alongside the MR ones. In general, there was no systematic pattern, and these interpretations were
widely accepted. Some tendency-based sub-patterns might be detected, such as sentences involving
jasin ‘self/their own’ being less likely to receive a non-MR interpretation, but such patterns were
fairly inconsistent.[215] Unlike in English, there were almost no cases where both interpretations were
rejected, except for occasionally with the items in Set 3, where some individuals wanted the X of
MR(S, X, Y) to be the dongryo ‘colleague’, rather than any of the interpretations provided. As
discussed in Sub-Section 4.5.2, such cases pose no significant concern to the analysis conducted.
There is thus no risk for any of the datapoints where BVA was rejected that such rejections were
due to mere string unacceptability; all sentences were judged by all individuals to be interpretable in
some way, just not necessarily with a BVA reading.
[215] Looking at the data in the appendix for both Coref and BVA, we can see that jasin frequently can have the non-MR
interpretation, which means that it can be understood as referring to an individual who is not overtly represented in the
sentence. Thus, claims that are occasionally made that jasin must refer to a c-commanding antecedent are clearly incorrect,
even if they do have some basis in preferential tendencies.
The other major area not covered in the data presented is the spec-binding results for the
modeun gyosu ‘every professor’ -geu ‘his’ and han myeongeul jeoihan modeun gyosu ‘every professor except
one’ -jasin ‘himself/his own’ pairs, which, as discussed, were excluded due to potential unintended
ambiguities. To briefly summarize the results (as always, the full data is viewable in the appendix): of
the 20 cases (10 participants times 2 X-Y pairs) where the crucial topicalized spec-binding sentences
were judged, BVA(S, X, Y) was rejected 16 times (80% of the total). Of those times, most would
have been considered “significant” according to the standards for deciding whether a judgement is a
“green” dot or a “yellow” dot for the Venn diagrams in the previous sections; for 12 of them,
though the individual rejected that particular sentence, the individual accepted other sentences which
were matched in terms of theta-role meaning, having X c-command but not precede Y, and having
X precede but not c-command Y. For two of the others, all contrasts could be established except those
based on pure c-command; that is, the individuals in question did not seem to accept BVA with
topicalization in general for the relevant X-Y pair, so c-command could not be isolated from
precedence, but all other minimal contrasts could be established. Finally, for the remaining two,
none of the comparison sentences were accepted, so little significance could be assigned.
Of the four times the sentences were accepted with a BVA reading, two were such that both
DR and Coref tests diagnosed the potential of B(S, X, Y). These are thus in keeping with the rest of
the dataset. A third was such that the Coref test but not the DR test diagnosed the potential of B(S,
X, Y), which is analogous to the one such case in the excluded possessor-binding data in English; in
both cases, the result is consistent with the ABC-BVA law but contradicts the unexpected but
otherwise consistent behavior in the rest of the two datasets in which DR is sufficient to detect B(S,
X, Y). Finally, there was one instance where BVA(S, X, Y) was accepted in such a sentence but
neither of the two tests diagnosed B(S, X, Y); this is unlike anything found in English and would
constitute a contradiction of the main predictions of the experiment if taken at face value.
Both these last two cases involved the han myeongeul jeoihan modeun gyosu ‘every professor but
one’ -jasin ‘their own’ X-Y pair. As noted several times across both this chapter and the previous
one, in a sense, the fact that this seemingly contradictory data pops up here in fact supports the
initial hunch from English that such cases were ambiguous/confusing to the participants. That is
not to say such cases should be “swept under the rug”; though I believe I had made both a
principled and non-post-hoc decision to exclude such data, it is quite possible that the “hunch”
about what caused this deviation is wrong, and in fact, there is some other issue, theoretical or
methodological, that led to this apparent violation of the prediction (see discussion in Section 7.4).
In particular, if such violations continue to appear in the future, even after the appropriate
adjustments have been made, I want to make certain it is recorded that they were also observed
here, so such future results can be understood as a replication of such a finding, helping us to better
assess the significance of such a result. Of course, this present observation is quite limited; it involves only 1
datapoint out of the (13+10+11) participants across the three languages × 3 (X-Y choices) = 102
relevant datapoints gathered for this experiment. Nevertheless, I think it is important to note it here,
in case more such cases show up in future experiments, so it can be clearly seen that they are part of
a pattern, suggesting there is indeed something more complicated going on than simple participant
confusion.
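The size of this tally can be checked with a short sketch. The variable names are illustrative assumptions; the three participant counts are taken directly from the figure above and are deliberately not assigned to particular languages.

```python
# Participant counts for the three languages, as given in the text (13+10+11).
participants_per_language = [13, 10, 11]
# Each participant contributes one relevant datapoint per X-Y choice.
xy_choices = 3

relevant_datapoints = sum(participants_per_language) * xy_choices
violations_observed = 1  # the single apparent violation discussed above

print(relevant_datapoints)                                  # 102
print(f"{violations_observed / relevant_datapoints:.1%}")   # 1.0%
```

That is, the apparent violation amounts to roughly one percent of the relevant datapoints.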
5.5.3 DR and Coref
Just to keep a “running tally” with regards to the observed roles of DR and Coref: thus far,
we have seen both the English and the Korean data are such that the DR test alone seems to be
sufficient to diagnose *B(S, X, Y). As noted in this chapter, the data here was much more limited,
with only two crucial judgements, but still, it is rather striking that the same sort of pattern obtained
again. As it turns out, however, we will be seeing in the next chapter that this may simply have been
a coincidence; the DR test alone does not sufficiently account for the Mandarin Chinese data. At the
very least, we will not be able to conclude from the data in this dissertation as a whole that the Coref
test is totally irrelevant; whether there is something “language-specific”, “methodology-specific”, or
“dataset-specific” (just pure happenstance) remains to be seen.
We can also be somewhat curious about the DR-only result for Korean, as it would contrast
rather sharply with what has been found for Japanese in Hoji’s experiments (see Section 3.2); while
there is no iron law requiring Korean and Japanese to pattern in the same way, given their overall
typological similarities, it would be striking if they really were fundamentally different in this way.
Overall, this result reinforces the urgency with which more investigation is needed, so that this
issue can be studied in a more conclusive manner.
5.5.4 Summary and Assessment
The Korean data examined in this Chapter provides a cross-linguistic replication of the
results found for English in Chapter 2, being both consistent with, and indeed actively supportive
of, the predictions of the ABC-BVA law and our auxiliary hypotheses. In particular, Korean
provides quantitatively stronger evidence for the role of syntactic structure, via c-command, in
constraining/facilitating BVA readings. While the data do not support the categorization of Korean
as a “scope-rigid” or “quirky-less” language (one can see from the diagram in (271), for example, that
things like “inverse scope” Coref and DR were accepted sometimes), Korean speakers’
generally less permissive judgements regarding the availability of the various MR’s, at least as
compared to English speakers, surely help in providing this stronger evidence.
We will see a similar pattern emerge in the following chapter with Mandarin Chinese.
Korean, however, offers certain advantages for this particular experiment because of its relatively
free word order in certain regards as compared to both English and Chinese; the Chinese data in
particular suffers some limitations because of this, though on the other hand, the typological
“differentness” of Mandarin has its own advantages, in that the successful replication of the results
gains significance precisely because it obtains despite these differences. Nevertheless, Korean
provides strong evidence for the ABC-BVA law generally, and for the role of C(S, X, Y) in
particular, due to its high rate of fully “significant” evidence for c-command effects across a broad
subset of the data.
6 Mandarin Chinese
6.1 Preliminaries
6.1.1 Introduction
In this chapter, I will report on the results of the experiment(s) conducted in Mandarin
Chinese²¹⁶. At this point, we have seen two analogous experiments in English and Korean in the
preceding chapters, so I will abbreviate the exposition significantly, so as to avoid repetition from
previous chapters. In essence, as with both English and Korean, the Mandarin results are as
predicted by the ABC-BVA law and our other hypotheses; namely, once we control for factors A, B,
and C, that is, linear word order, quirky effects, and syntactic structure, then we can accurately
predict the (im)possibility of BVA readings.
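As a rough sketch of the logic just summarized, the prediction can be written as a check on three yes/no factors. The function name and the Boolean encoding are illustrative assumptions rather than part of the formal statement of the law; note that, as the law is used here, a firm prediction (impossibility) is made only when all three factors are absent.

```python
def bva_ruled_out(a_precedence: bool, b_quirky: bool, c_command: bool) -> bool:
    """Return True when BVA(S, X, Y) is predicted to be impossible.

    Assumed encoding: each argument says whether the corresponding factor
    (A: linear word order, B: quirky effects, C: syntactic structure)
    is available to support the reading. When none of them is available,
    i.e. *A, *B, and *C all hold, *BVA is predicted.
    """
    return not (a_precedence or b_quirky or c_command)


# With none of the three factors available, BVA is predicted to be impossible.
print(bva_ruled_out(False, False, False))  # True
# With c-command available, the reading is not ruled out
# (which is not the same as predicting that it will be accepted).
print(bva_ruled_out(False, False, True))   # False
```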
Implementing this experiment in a Chinese language, however, offers some unique
challenges as compared to English and Korean, for which implementation was fairly
straightforward. If we had started with Chinese, then this would not necessarily be the case, but
since the template laid out in Section 3.3 was primarily designed with English in mind, it is natural
that there is some degree of alteration needed for implementation in typologically different
languages. In this section, I will discuss the relevant properties of Mandarin Chinese, and how they
affect the implementation and analysis of the experiment.
As we will see, there is an unfortunate limitation when it comes to passive sentences, which
means this particular experiment can only partially address issues related to passives, but overall, the
experiment is still successfully implemented and gives the predicted results. Indeed, as alluded to at
the end of the preceding chapter, the fact that this still happens despite the typological differences between
Mandarin Chinese on the one hand and English and Korean on the other (with the latter two being
quite different from one another as well) is in fact a strength of the experiment. Further, as I will
discuss in Section 6.5, follow-up experiments have been conducted that help us to overcome some
of the limitations of the original experiments, further supporting our basic hypotheses.
²¹⁶ I should here give a massive thank you to my Mandarin-speaking helper, Felix Qin. His many hours of labor in
providing advice, translation, and practical help were invaluable in “converting” this experiment to Mandarin Chinese and
in implementing it successfully. His judgements and suggestions had an outsized influence on the way things turned out,
and the degree of experimental success that was achieved is certainly a direct result of that influence. I will occasionally
refer to “my consultant” in this chapter; this is always him.
Before discussing the implementation itself, I will briefly discuss a few relevant observations
that have been made about Mandarin Chinese in previous literature. The literature on BVA, DR, and
Coref-related phenomena in Mandarin is vast, but much of it deals with relatively complicated
phenomena and is thus somewhat tangential to our rather “basic” interest at this point. Some
directly relevant observations, however, have been made in the literature concerning the 3rd person
pronoun tā (written 他/她/它, depending on gender). Aoun (1985), for example, notes a potential
contrast between the following types of sentences:
(312) 没人 说 他 要 来
méirén shuō tā yào lái
nobody say he will come
“nobody said he would come”
(313) 没人 说 李四 讨厌 他
méirén shuō Lǐsì tǎoyàn tā
nobody say Lisi hate him
“nobody said Lisi hates him”
Aoun 1985’s analysis focuses on those speakers for whom BVA(S, meiren, ta) is impossible
in sentences of the type in (312) but not for those of the type in (313); that is, for these individuals,
BVA(S, X, Y) with Y=tā requires some sort of intervening subject, such as the name Lǐsì in the case
of (313). Further, Aoun notes that the pattern for these individuals changes if the “anaphor” zìjǐ
(自己), analogous to the Korean jagi/jasin we have dealt with in the previous chapter, is substituted
for tā. In this case, these individuals have no intervening subject restriction, and BVA is possible in
both types of sentences.
Even more interesting for our purposes, Aoun reports that at least two other patterns of
judgements are possible on these types of sentences with tā: some individuals accept both sentences
with BVA(S, meiren, ta), and others accept neither with BVA(S, meiren, ta). That is, while some
Mandarin Chinese speakers seem to have certain anti-local constraints on BVA of tā, others do not
show this effect, while still others reject BVA of tā altogether²¹⁷. In some sense, this recalls rather
closely our discussion of previous literature on the Korean 3rd person “pronoun” geu in Section 5.1,
which some individuals find impossible to use as Y of BVA(S, X, Y), while others find it quite easy
to use in this way in all cases, and still others can use it sometimes as Y of BVA(S, X, Y) but with
certain restrictions. This parallel even extends to zìjǐ and jagi/jasin, both of which are claimed to
contrast with their 3rd person pronoun counterparts by being universally available for BVA for all
individuals (provided an environment where X of BVA(S, X, Y) c-commands them).
In a somewhat similar vein, in contrast to Huang (1982)’s general claim that quantifier scope
in Chinese must match the “surface scope” given by the surface form, Aoun and Li (1989) provide
examples from passive constructions which are judged ambiguous with regard to quantifier scope by
some Mandarin Chinese speakers. As they report, however, much like in the BVA example above,
not all speakers share these judgements; for some, there is no ambiguity.
²¹⁷ Aoun and Li 1990 further complicate this picture, noting that, for those speakers who report a contrast between the
two sentence types, the choice of intervening element is at least sometimes relevant, such as whether the element can be
understood as “quantificational” or not, e.g., liǎnggè rén (两个人) ‘two people’ vs. wǒ (我) ‘I’. Crucially, however, the
contrast is not consistent, with some individuals preferring the quantificational intervening elements to the
non-quantificational ones, and others the reverse.
This situation seems somewhat reminiscent of what we have seen for quirky effects in preceding chapters; that is, different
items produce different levels of permissiveness for different individuals, with no consistent pattern obtaining. While this
sort of intervention/anti-locality effect is outside the scope of this dissertation, it seems a direct potential extension:
what we would need to check is whether the DR/Coref tests accurately diagnose which elements will be consistent
interveners. If so, then perhaps the pattern Aoun and Li note could be replicated in a consistent manner across all (relevant)
individuals.
The general spirit of this
finding is likewise replicated in the experiment reported in this chapter; as was the case in both
English and Korean, there is inter-speaker variation as to the acceptability of things like DR(S, X, Y)
when X does not c-command Y. Indeed, the variation seems to be even broader than Aoun and Li
report, with some accepting “inverse scope” in regular active sentences as well. While many
individuals come down on the side of “fixed scope”, this is not a universally shared judgement²¹⁸. At
this point, it seems more appropriate to say that having a fixed scope is a tendential property of a
given language, but not an absolute. Some English speakers act as if the surface form for a sentence
fixes scope for things like DR, some do not, and the same variation is found among Chinese
speakers, albeit with a larger share of Chinese speakers than of English speakers apparently coming
down on the side of “fixed” rather than “not fixed”.
I think the above point is rather crucial to emphasize; by simplistically dividing languages
into fixed- and non-fixed-scope languages, we risk misanalysing a number of relevant patterns. For
example, Scontras et al. (2017) analyze DR possibilities in English speakers, Mandarin Chinese
speakers, and English speakers whose heritage language is Mandarin Chinese, finding that the third
group falls somewhere in the middle of the first two in terms of their willingness to accept
inverse-scope DR. This is an interesting result, but its discussion is somewhat restricted by the authors’
focus on whether the heritage speakers have a flexible “English-like” grammar or a rigid
“Chinese-like” grammar²¹⁹. Given what we have found throughout this dissertation and the various works
summarized herein, it does not appear that scope flexibility/rigidity is an inherent property of a
given language at all.
²¹⁸ This will be most obvious in the discussion of the follow-up experiments in Sub-Section 6.5.3.
²¹⁹ In fairness to the authors, they do consider the possibility that scope-rigidity might not be a consequence of Mandarin’s
influence on the heritage speakers, but rather, due to pressure for simpler representations that make their DR possibilities
converge with those of non-heritage Mandarin speakers. This, however, is still the same dichotomy, just understood in a
more nuanced way.
This is not to deny the existence of relevant tendencies nor to dismiss the
potential sources of those tendencies as non-grammatical; there may indeed be grammatical
properties of English and Chinese (and Korean, for that matter) that favor flexibility/rigidity. It is
only once we let go of the overly binary distinction regarding scope possibilities, however, that we
can even begin to examine such properties. My hope is that the kind of experiment performed here,
which allows for investigation on the level of the individual, with specific choices of sentences and
lexical items, can contribute to this type of discussion, allowing us to see better how and why
interpretations like “inverse scope” differ in availability across individuals, sentences, and
languages.
6.1.2 Language-specific implementation
Returning to more concrete matters, I now address the exact sentences that were used to implement
the Mandarin Chinese experiments. These sentences were kept relatively close to the English and
Korean sentences, but as noted, there were certain Chinese-specific constraints that had to be
accommodated. One such issue becomes apparent even when we consider very basic lexical items.
The X’s of BVA(S, X, Y) all involved the head noun jiàoshòu (教授) ‘professor’. Following a pattern
similar to English and Korean, the three choices of X were: měigè jiàoshòu (每个教授) ‘every professor’,
dà duōshù jiàoshòu (大多数教授) ‘most professors’²²⁰, and bùzhǐ yīgè jiàoshòu (不止一个教授) ‘more
than one professor’. Here, already, a complication arises, namely the requirement of some of these
X’s to coincide with the element dōu (都).
²²⁰ It would have been more directly parallel with the other experiments to use the equivalent expression to ‘every
professor but one’, but this is the somewhat long/awkward chúle yīgè jiàoshòu yǐwài, qítā jiàoshòu (除了一个教授以外,其他教授)
(‘except for one professor, the other professors’). Inclusion of this term made sentences overly long, and so
even though it was more similar to what was used in English and Korean, it was decided to use the shorter and
numerically (in our diagrams, at least) relatively similar dà duōshù jiàoshòu ‘most professors’, keeping in mind that the
precise choices of X’s and Y’s are arbitrary and not an aspect of the hypotheses being tested.
Sometimes translated as “all”, dōu’s precise analysis and
co-occurrence patterns are tricky to pin down (interested readers can consult Huang 2005 for a
nuanced accounting of many of its intricate behaviors). For our purposes, all that is important to
understand is that dōu mandatorily appears immediately preverbally when a clause contains certain
quantificational expressions (prototypically those involving universal quantifiers like měigè ‘every’), at
least when those expressions occupy certain syntactic positions, e.g., the subject. Templatically, this
means the relevant sentences must be rendered like (314)b rather than (314)a:
(314) a. X Predicate
b. X dou Predicate²²¹
The sense of the Mandarin Chinese speaker I consulted with when designing the experiment
was that, for the sentences in question, dōu was never necessary when the X in question was bùzhǐ
yīgè jiàoshòu ‘more than one professor’ (lit. ‘not only one professor’) but was necessary if the X was
měigè jiàoshòu ‘every professor’ or dà duōshù jiàoshòu ‘most professors’, assuming that X was not in
object position, in which case dōu was not necessary. Dōu, however, has been noted for the
multiplicity of roles it appears to play, including its purported roles as a “distributor”, “free choice
marker”, and other relevant semantic functions (see for example the review of such functions
provided in Xiang 2020). As such, one might be concerned that dōu’s inclusion might disrupt the
baseline behavior of BVA itself or impede the use of DR(S’, X, Y’) as a detector for B(S, X, Y).
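The consultant’s co-occurrence rule described above can be sketched as a small helper. This is only an illustration for the three X’s used in this experiment, not an analysis of dōu in general; the function and constant names are assumptions introduced here.

```python
# X's that, per the consultant's judgement, require dōu
# when they are not in object position (illustrative constant).
DOU_REQUIRING_X = {"měigè jiàoshòu", "dà duōshù jiàoshòu"}


def needs_dou(x: str, x_in_object_position: bool) -> bool:
    """Return True if dōu must accompany X in the experimental sentences."""
    return x in DOU_REQUIRING_X and not x_in_object_position


print(needs_dou("měigè jiàoshòu", x_in_object_position=False))       # True
print(needs_dou("měigè jiàoshòu", x_in_object_position=True))        # False
print(needs_dou("bùzhǐ yīgè jiàoshòu", x_in_object_position=False))  # False
```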
However, as will be apparent from the data in the following sections, there is no obvious
difference between the X’s with and without dōu, and indeed, they pattern much like the sentences in
the other languages examined, which lack any overt equivalent of dōu. As such, while the role of dōu
and its interactions with MR’s like BVA and DR (and their correlations) certainly merit further
investigation, its presence or absence with the different choices of X does not seem to have
“disrupted” anything in this case.
²²¹ Similar patterns were followed in the sentences for this experiment when the relevant X’s were possessors of subjects
or the “agents” of passives.
As for the Y’s of BVA(S, X, Y), these were relatively straightforward “translations” from the
Korean experiment: they were, in matching order with the X’s, the previously discussed tā²²² and zìjǐ,
as well as nàgè jiàoshòu (那个教授) ‘that professor’. All of these were always coupled with de xuéshēng
(的学生), ‘’s student’. X’ of Coref(S’’, X’, Y) was straightforward as well, again the simple translation
nàgè xīn lái de jiàoshòu (那个新来的教授) ‘that new(ly come) professor’, parallel to what it was in
English and Korean.
Y’ of DR(S’, X, Y’), however, was slightly more challenging. As observed in many works
(Chao 1968, Li and Thompson 1981, and others), indefinite expressions have some degree of
restriction in Mandarin (and other varieties of) Chinese, especially as regards the subject position.
I mentioned this briefly in the context of discussing Li 1998 in Section 2.8, but I will re-summarize
the relevant points here. Typically, sentences that would have indefinite subjects in languages like
English are instead translated using existential constructions. Taking a pair of Li’s examples:
(315) 三 个 学生 在 学校 受伤-了.
Sān gè xuéshēng zài xuéxiào shòushāng-le.
Three Cl student at school hurt-Perf.
“Three students were hurt at school.”
(316) 有 三 个 学生 在 学校 受伤-了.
Yǒu sān gè xuéshēng zài xuéxiào shòushāng-le.
Have three Cl student at school hurt-Perf.
“There were three students hurt at school.”
²²² Written as 他, so technically reading as masculine, like ‘he’ in English.
Sentences of the former type are often marked as impossible in papers on Chinese syntax,
with comments indicating the sentence should be modified to resemble the latter via the addition of
the existential marker yǒu (有). As Li notes, however, the distinction is not quite so binary as it is
often made out to be. Especially in the right context, specifically one which focuses on the
“quantity”/“number” aspect of the sentence, the acceptability of the yǒu-less sentences improves.
Because of this, there is a wide range of possible reactions to such sentences, which will differ based
on the speaker, the imagined context of the utterance, and the particular lexical items involved.
As such, if we had simply copied the decision of the English (and Korean) experiments to use ‘two
students’ as Y’ for the DR sentences, we would have risked a number of speakers rejecting these sentences
out of hand, not due to the DR interpretation, but rather due to the difficulty of having an indefinite subject
of this sort²²³.
What my consultant advised me was that other quantity-expressing expressions, such as
hěnduō xuéshēng (很多学生) ‘many students’ were much more “comfortable” as subjects without any
yǒu-marking. That is, while the consultant generally found (315) uncomfortable, replacing
its subject with hěnduō xuéshēng made the sentence seem relatively normal:
(317) 很多 学生 在 学校 受伤-了.
Hěnduō xuéshēng zài xuéxiào shòushāng-le.
Many student at school hurt-Perf.
“Many students were hurt at school.”
²²³ To say nothing of my speculation in Section 2.8 to the effect that such readings may overlap with what we have been
calling quirky effects on Y of DR(S, X, Y), which, while it should not invalidate Hoji’s DR test, may well cause it to become
overly-eliminative (given that quirky effects on the Y of DR(S, X, Y) might provide “false-positives”, when what we really
care about is quirky effects on X).
As such, hěnduō xuéshēng was used as Y’ of DR(S’, X, Y’). One subsequent issue arises here
though, because ‘many’ is not easily picturable in the DR context because of space limitations (see
the diagrams shown in Section 3.3; adding more students to the picture would crowd the diagram
significantly). For the sake of not changing the items presented dramatically, the solution chosen was
to simply tell participants that the number of individuals depicted was to be understood as “many”/
hěnduō for the purposes of judging the sentence; this seems to have worked as intended, as the
participants’ answers to the DR question (as can be seen in the appendix) do not show any
“unusual” signs as compared to those from other languages.
Continuing our list of implementation choices made, the verb used in the sentences, similar
to the Korean case, was kuāguò²²⁴ (夸过) ‘praised’. Further, as discussed briefly in Chapter 3, unlike
Korean, Mandarin Chinese has an (at least superficially) similar passive form to English, involving
the word bèi (被) ‘by’. However, like many non-nominal elements in Chinese, bèi-phrases resist
movement to the pre-subject position²²⁵. This creates a challenge for the control of precedence;
namely, in English, we were able to show that the subject of the passive can, in many cases,
participate in BVA(S, X, Y) where X is the subject and Y is in the by-phrase, even when the
by-phrase is moved to the front of the sentence.
²²⁴ Some may wonder why the aspect marker here is guò (过) rather than le (了). The former encodes more of a remote
past or “previous experience” sense than the latter, which is usually taken as a more or less “standard” preterit marker.
The answer is simply that my consultant thought that the former made the sentences sound more natural, and given that
I saw no obvious negative to using it, I decided to follow his intuition on the matter.
²²⁵ Possibly because bèi does not actually form a constituent with what I have informally termed the “agent”. It may instead
form a constituent with the whole predicate, with the “agent” and the rest of the predicate forming a constituent nested
inside of that one (see discussion and references in Huang et al. 2009). This would change the structural hypotheses we
have been adopting, but, crucially, would not alter the c-command relationship between the subject and the agent of
Chinese passive sentences. For our purposes therefore, this account makes equivalent predictions to the “constituent”
account.
This is important, as it provides a contrast (matched in
terms of word order and theta-role-based meaning) with the weak crossover active sentence, as in
the paradigm below:
(318) a. By Y, X was spoken to
b. Y spoke to X.
In (318), both sentences have the same word order and same theta-role-based meaning, but
assuming we have *B(S, X, Y), BVA(S, X, Y) is possible only in the first sentence. This contrast
plays a crucial role in demonstrating the role of C(S, X, Y). While we could have just used the bèi-
phrases as is and forgone the “scrambled” versions, this would have severely limited the significance
of the results of the weak-crossover actives (which have generally been the “strongest” of the three
sets in terms of providing evidence for C(S, X, Y), given what we have seen in the previous two
chapters). Recall from Section 1.3 that finding such evidence, rather than investigating the passives
per se, is our primary purpose.
As such, rather than throw out these important contrasts, a work-around was attempted.
There is at least one way to get bèi-phrases before the subject of a sentence, namely via the shì…de
(是 ... 的) construction. This construction is generally understood to signal a kind of “focus”, but
its analysis is subject to considerable controversy (see review, discussion, and analysis in Cheng
2008, for example). While there are various nuances, in short, the shì…de construction transforms a
sentence of the form in (319)a to the form in (319)b:
(319) a. Subject Predicate.
b. Subject shi Predicate de.
Crucially, at least for some speakers²²⁶, the Predicate+de complex can be fronted, yielding
(320):
(320) Predicate de, Subject shi.
If we apply this construction to passives, we are able to generate the following pair, of which
the latter achieves the desired “agent-first” word order:
(321) a. Subject shi bei Agent Verb de.
b. Bei Agent Verb de, Subject shi.
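The string manipulations in (319)-(321) can be sketched as follows. This is a purely surface-level illustration of the word orders involved, with hypothetical function names; it makes no claim about the syntactic structure of the shì…de construction, and tone marks are omitted as in the schematic examples.

```python
def shi_de(subject: str, predicate: str) -> str:
    """(319)b: Subject shi Predicate de."""
    return f"{subject} shi {predicate} de."


def shi_de_fronted(subject: str, predicate: str) -> str:
    """(320): Predicate de, Subject shi."""
    fronted = predicate[0].upper() + predicate[1:]  # capitalize sentence-initial word
    return f"{fronted} de, {subject} shi."


def passive_predicate(agent: str, verb: str) -> str:
    """A passive predicate of the form 'bei Agent Verb'."""
    return f"bei {agent} {verb}"


pred = passive_predicate("Agent", "Verb")
print(shi_de("Subject", pred))          # Subject shi bei Agent Verb de.
print(shi_de_fronted("Subject", pred))  # Bei Agent Verb de, Subject shi.
```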
While the employment of the shì…de construction achieves its goal, it raises the question as
to whether the same hypotheses can be extended to this form as might be extended to a simple
passive. That is, does the subject still asymmetrically c-command the bèi-phrase, or is there a
different structure/multiple structures and/or various types of movement/displacement going on?
If the subject does still asymmetrically c-command the bèi-phrase, that would be ideal, as then the passive
could play the same roles as it did in the previous
chapters: both as an okSchema (to contrast with the weak crossover actives) and as a *Schema
(weak crossover in its own right). If not, then only the former of these is possible, as we do not
really know for sure that the subject cannot be understood to c-command the bèi-phrase. This is the
reverse of the situation that would occur if we just used the regular, shì…de-less, passive; we could
use it for certain as a *Schema, but it could not be as useful a contrast for the weak crossover
actives, as precedence could not be controlled.
²²⁶ The reaction from the experiment participants seemed to be that most did accept this, though several of them reported
it to sound rather “dialectal”.
As the shì…de way had at least a chance of accomplishing both objectives, I elected to try it,
albeit with the caveat that any result stemming from its use as a *Schema ought to be considered
merely a preliminary investigation rather than rigorous hypothesis testing. The outcome of this
investigation, however, is somewhat complicated. I will discuss it briefly in the following sub-
section, and in more depth in Section 6.5.
Before turning to that, let us review the full template employed in these experiments. Given
the above discussion of X’s and Y’s, we have established that the choices for each round were as
follows:
(322) Round 1: X = Měigè jiàoshòu ‘every professor’
               Y = Tā (de xuéshēng) ‘his (student)’
      Round 2: X = Dà duōshù jiàoshòu ‘most professors’
               Y = Zìjǐ (de xuéshēng) ‘his own (student)’
      Round 3: X = Bùzhǐ yīgè jiàoshòu ‘more than one professor’
               Y = Nàgè jiàoshòu (de xuéshēng) ‘that professor(’s student)’
These in turn plug into the basic template for the BVA sentences in the Mandarin Chinese
experiments as given below (as usual, with the DR and Coref sentences derivable by the relevant
substitution of Y’ and X’ respectively):
(323) Set 1:
a. X (都) 夸过 Y 的 学生。
X (dōu) kuāguò Y-de xuéshēng.
X DOU praised Y’s student.
“X praised Y’s student.”
b. Y 的 学生 是 被 X (都) 夸过 的。
Y-de xuéshēng shì bèi X (dōu) kuāguò de.
Y’s student SHI by X DOU praised DE.
“Y’s student was praised by X.”
c. Y的 学生, X (都) 夸过。
Y-de xuéshēng, X (dōu) kuāguò.
Y’s student, X DOU praised.
“Y’s student, X praised.”
d. 被 X (都) 夸过 的, Y的 学生 是。
bèi X (dōu) kuāguò de, Y-de xuéshēng shì.
By X DOU praised DE, Y’s student SHI.
“By X, Y’s student was praised.”
(324) Set 2:
a. X (都) 是 被 Y的 学生 夸过 的。
X (dōu) shì bèi Y-de xuéshēng kuāguò de.
X DOU SHI by Y’s student praised DE.
“X was praised by Y’s student.”
b. Y的 学生 夸过 X。
Y-de xuéshēng kuāguò X.
Y’s student praised X.
“Y’s student praised X.”
c. 被 Y的 学生 夸过 的, X (都) 是。
bèi Y-de xuéshēng kuāguò de, X (dōu) shì.
by Y’s student praised DE, X DOU SHI.
“By Y’s student, X was praised.”
d. X, Y的 学生 夸过。
X, Y-de xuéshēng kuāguò.
X, Y’s student praised.
“X, Y’s student praised.”
(325) Set 3: (for which ‘colleague’ was straightforwardly rendered as tóngshì (同事).)
a. X (都) 有 夸过 Y的 学生 的 同事。
X (dōu) yǒu kuāguò Y-de xuéshēng de tóngshì.
X DOU have praised Y’s student DE colleague.
“X has a colleague that praised Y’s student.”
b. X的 同事 (都) 夸过 Y的 学生。
X-de tóngshì (dōu) kuāguò Y-de xuéshēng.
X’s colleague DOU praised Y’s student.
“X’s colleague praised Y’s student.”
c. 夸过 Y的 学生 的 同事, X (都) 有。
kuāguò Y-de xuéshēng de tóngshì, X (dōu) yǒu.
praised Y’s student DE colleague, X DOU have.
“A colleague that praised Y’s student, X has.”
d. Y的 学生, X的 同事 (都) 夸过。
Y-de xuéshēng, X-de tóngshì (dōu) kuāguò.
Y’s student, X’s colleague DOU praised.
“Y’s student, X’s colleague praised.”
As I have done in preceding chapters, I will provide an example of a (non-weak crossover) active
sentence from round one in its Coref, DR, and BVA forms.
(326) Round 1:
Coref: 那个 新来的 教授 夸过 他的 学生。
nàgè xīn-lái-de jiàoshòu kuāguò tā-de xuéshēng.
that newcome professor praised his student.
“That newcome professor praised his student.”
DR: 每个 教授 都 夸过 很多 学生。
Měigè jiàoshòu dōu kuāguò hěnduō xuéshēng.
Every professor DOU praised many student.
“Every professor praised many students.”
BVA: 每个 教授 都 夸过 他的 学生。
Měigè jiàoshòu dōu kuāguò tā-de xuéshēng.
Every professor DOU praised his student.
“Every professor praised his student.”
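To make the substitution procedure concrete, the following sketch instantiates template (323)a in pinyin for a given round and derives the DR and Coref versions by substituting Y’ and X’ respectively. The function and constant names are illustrative assumptions, and the dōu rule encoded here is simply the consultant’s judgement reported in Sub-Section 6.1.2, not a general analysis.

```python
# X's that take dōu when they are subjects (per the consultant's judgement).
DOU_X = {"měigè jiàoshòu", "dà duōshù jiàoshòu"}
X_PRIME = "nàgè xīn-lái-de jiàoshòu"  # X' used in the Coref sentences
Y_PRIME = "hěnduō xuéshēng"           # Y' used in the DR sentences


def set1a(x: str, obj: str) -> str:
    """Instantiate template (323)a: 'X (dōu) kuāguò <object>.'"""
    dou = " dōu" if x in DOU_X else ""
    sentence = f"{x}{dou} kuāguò {obj}."
    return sentence[0].upper() + sentence[1:]  # capitalize the first letter


def round_sentences(x: str, y: str) -> dict:
    """Build the Coref, DR, and BVA versions of the plain active sentence."""
    y_phrase = f"{y}-de xuéshēng"
    return {
        "Coref": set1a(X_PRIME, y_phrase),  # X replaced by X'
        "DR": set1a(x, Y_PRIME),            # Y's student replaced by Y'
        "BVA": set1a(x, y_phrase),
    }


for label, sentence in round_sentences("měigè jiàoshòu", "tā").items():
    print(f"{label}: {sentence}")
```

For Round 1, this reproduces the three pinyin lines in (326), apart from capitalizing the first word of the Coref sentence.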
6.1.3 Further Considerations
As noted above, because of the restrictions on the fronting of bèi-phrases in passives, an additional
structure, the shì…de structure, was introduced, so that precedence could be properly controlled,
allowing it to serve as a counterpoint to the weak-crossover active sentence. However, in doing so, we
lose some of our predictive power because it is not made clear from our hypotheses whether shì…de
sentences have the same structure as “regular” sentences, specifically whether the “matrix” subject
still must asymmetrically c-command all other nominal arguments. As such, we cannot say for
certain whether a given such sentence corresponds to a *-schema or not; that is, whether there is
*C(S, X, Y), which, given *A(S, X, Y) and *B(S, X, Y), would have allowed us to predict *BVA(S, X,
Y). As noted, it would have been “nice” if such were the case; it could not have been considered a
piece of direct support for the hypotheses laid out in the preceding chapters, as again there was
nothing said about shì…de there, but it would still have been consistent with our hypotheses with
the simple additional assumption that shì…de does not change the overall structure of sentences (at least
not to such a degree that it alters the relevant c-command relations).
This “nice” possibility does not seem to be what happened, however. As I will discuss in greater
detail in Section 6.5, while almost all the data does behave as we would expect for a “normal”
passive, there is one case of BVA acceptance that would violate our predictions if we assumed that
such sentences have the same structure as “normal” passives. To avoid unnecessary ambiguity in
graphs and charts, data regarding judgements on such sentences is restricted in Sections 6.2-6.4 to
their ok-schema usage; the acceptance of such a sentence can be used for contrast with the rejection
of another sentence, but as we make no predictions about the sentences’ rejection, such cases will
not be analyzed together with the rest of the data.²²⁷
As a result, there is effectively no “weak crossover passive” in these experiments; we are
restricted to “weak crossover active” and “(topicalized) possessor binding” as our main *-schemas.
To reiterate, this is not specifically because including such data would cause a disconfirmation of our
predictions; even if things had gone as hoped, the data should have been treated separately, given its
much more “informal” status in terms of prediction. I will discuss the “excluded” data further in
Sub-Section 6.5.2 and then discuss the results of a follow-up experiment that addresses the issue
more directly in Sub-Section 6.5.3.
Additionally, to maintain consistency with English and Korean, the potentially ambiguous
possessor binding cases (those using the non-demonstrative choice of Y of BVA(S, X, Y)) are also
excluded from the main consideration and discussed separately in Sub-Section 6.5.2. As will be seen
there, there is no evidence of any sort of problem; in fact, the data look quite supportive of our
predictions. I do not, however, think it is reasonable to believe that there are ambiguities in English
227
Note that this does not mean that Set 1 is being excluded as a whole from such analysis. Rather, passives in both Set 1
and Set 2 are being excluded in cases where their c-command structure matters; the actives in Set 1 are still considered.
This makes Set 1 as a whole harder to talk about, because its *Schemata are excluded, so in the tables that follow, I have
excluded it, but the Set 1 actives do feature as okSchemata where relevant and also in the “overall” analysis in the Venn
diagrams in the final section of this chapter.
and Korean on the one hand and not in Mandarin Chinese on the other, just because the data in
Mandarin looks favorable (though recall that English looked relatively favorable too; only in
Korean was there a “real” problem). Perhaps future work will uncover that indeed, there was some
ambiguity in English and Korean that is lacking in Mandarin Chinese²²⁸, but there is nothing to
support that here. Indeed, if we assume it was ambiguous in English and Korean, it only resulted in
1 erroneous datapoint out of all the data gathered, so, assuming the same “rate of error”, we might
not even expect to find such a case in the Chinese data even if it is equally likely to happen.
Finally, as with Korean, no participants had to be excluded due to inattention or
miscomprehension.
6.2 A(S, X, Y)
Following the same order as in the previous chapters, let us first examine the role of
precedence in the Mandarin Chinese data. As in other cases, we are checking to verify that the ABC-
BVA correctly predicts the “correct conditions” necessary for the implication in (327) to hold:
(327) (Under the correct conditions)
*A(S, X, Y) → *BVA(S, X, Y)
As before, let us start by looking at the data when nothing but precedence is controlled
(ignoring quirky effects and issues of c-command). Using the labeling conventions given in Section
4.2, and the abbreviations given below in (328), we find the data to be as given in (329):
228
Anecdotally, the Mandarin-speaking participants did seem to understand the difference between the BVA reading being
asked about and BVA by the tóngshì ‘colleague’ much more consistently than did the others, or at least, they reacted with
less surprise when the difference was explained as required by the procedure.
(328) Abbreviations
a. Měigè Jiàoshòu-Tā MJ-T
every professor-his
b. Dà Duōshù Jiàoshòu-Zìjǐ DDJ-Z
most professors-his own
c. Bùzhǐ Yīgè Jiàoshòu-Nàgè Jiàoshòu BYJ-NJ
(329) Categorization of individuals with regards to (327)
FI/DI/NI    MJ-T     DDJ-Z    BYJ-NJ   All X-Y’s
Set 1²²⁹    N/A      N/A      N/A      N/A
Set 2       5/0/6    0/0/11   1/7/3    0/0/11
Set 3       N/A      N/A      3/3/5    N/A
All Sets    5/0/6    0/0/11   3/3/5    0/0/11
We can see that, overall, there were no individuals who had a purely precedence-based
pattern of BVA; that is, BVA(S, X, Y) was accepted by all individuals at least sometimes when X did
not precede Y in S. While other choices of X and Y were mixed in terms of whether individuals
did or did not follow this pattern, the DDJ-Z pair was universally accepted with BVA in such
precedence-less cases, presumably because of zìjǐ’s aforementioned tendency to follow c-command-
based patterns.
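As a rough illustration of how the FI/DI/NI labels in (329) might be computed for one individual, consider the Python sketch below. The function name `categorize`, the tuple layout, and the `min_contrasts` threshold are hypothetical choices made only for this illustration; the actual criteria are those laid out in Section 4.2 and may differ in detail.

```python
from typing import Iterable, Tuple


def categorize(judgements: Iterable[Tuple[bool, bool]], min_contrasts: int = 1) -> str:
    """Classify one individual with regard to an implication *P -> *BVA.

    Each judgement is a pair (p_absent, bva_accepted), where p_absent is True
    when the factor P (e.g., precedence) does not hold in the sentence.
    The threshold min_contrasts is an assumption made for illustration.
    """
    judgements = list(judgements)
    # Judgements on sentences where P is absent (the "starred" cases).
    starred = [accepted for p_absent, accepted in judgements if p_absent]
    # Judgements on the contrasting sentences where P holds.
    contrasts = [accepted for p_absent, accepted in judgements if not p_absent]
    if any(starred):
        return "NI"  # BVA accepted at least once without P: non-implicating
    if starred and sum(contrasts) >= min_contrasts:
        return "FI"  # always rejected without P, accepted with P: fully implicating
    return "DI"      # no acceptance to contrast with: degenerate implication
```

Under these assumptions, an individual who accepts BVA even once when X does not precede Y counts as NI, which is the situation of all eleven participants in the final column of (329).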
If we ensure *C(S, X, Y), considering only sentences where X does not c-command Y, in
addition to not preceding it, the results are dramatically reversed:
(330) Categorization of individuals with regards to (327) (c-command controlled)
FI/DI/NI MJ-T DDJ-Z BYJ-NJ All X-Y’s
Set 1 N/A N/A N/A N/A
Set 2 5/6/0 7/3/1 3/8/0 8/2/1
Set 3 N/A N/A 3/8/0 N/A
All Sets 5/6/0 7/3/1 3/8/0 8/2/1
229
Given the discussion in the preceding section, judgements on Set 1 items are not being (directly) considered here
(because the crucial items in those sets are the shì…de passives), but I have included the row in the table so that it is easier
to compare between tables in different chapters.
Now, all but one individual never accepts BVA(S, X, Y) when X does not c-command Y; an
even lower rate than in Korean, though of course, the Korean experiments had more sentences to
consider because of the inclusion of the passives. Further, that individual that has the relevant quirky
effect only has it once; in particular, they accept BVA(S, dà duōshù jiàoshòu, zìjǐ) in the sentence:
(331) 自己的 学生 夸过 大多数 教授。
zìjǐ -de xuéshēng kuāguò dà-duōshù jiàoshòu.
self’s student praised most professors
“His own student praised most professors.”²³⁰
Other than this individual, eight of the participants do accept BVA(S, X, Y) sometimes when
X does precede but not c-command Y, with the remaining two apparently requiring c-command for
BVA.²³¹
We can also note that the MJ-T and BYJ-NJ pairs seem to be such that a number of
individuals do not accept them with BVA(S, X, Y) without X c-commanding Y, though in the case
of MJ-T, we do not have terribly many datapoints to consider.
Regardless of these sub-analyses, the issue remains that we must account for the individual
who accepted BVA in (331). By the ABC-BVA law, we predict that this will be achieved by
controlling for B(S, X, Y); this prediction is borne out in the Venn diagram below:
230
Audrey Li (p.c. November 2021) notes that the most natural reading of this sentence is the generic one, which she
compares with zìjǐ de xuéshēng yǒngyuǎn shì zuì hǎo de ‘one’s own students are always the best’, the suspicion being that zìjǐ
in particular facilitates this reading (and indeed, the participant did reject this sentence type with BVA for all other choices
of X and Y). This is precisely in line with the various speculations given in Section 2.8 about the nature of quirky effects,
at least those tied to the choice of Y. Furthering this analysis, as can be seen from the appendix, this individual passes the
relevant DR test, but fails the relevant Coref test, meaning that Y in particular is implicated as the source of the quirky
effect.
231
If we apply an analogous analysis to DR and Coref, results are relatively similar, except there are a few more cases of
individuals not requiring X to c-command or precede Y in order to establish DR/Coref(S, X, Y).
(332) a. Venn Diagram
b. Summary Table
# of judgements    Green   Yellow   Red   Total
*C, DR, Coref      12      25       0     37
*C, DR             4       0        1     5
*C, Coref          2       0        0     2
DR, Coref          2       17       18    37
*C                 0       0        0     0
DR                 0       0        5     5
Coref              0       0        2     2
None               0       0        0     0
Total              20      42       26    88
From the above, we see that the red dot of concern, namely the one inside the *C(S, X, Y)
circle, does not make it past the tests for *B(S, X, Y). An interesting “twist” occurs though, as this
time it is the Coref test, rather than the DR test, which catches it, reversing the pattern we saw in the
previous two chapters, though not challenging our basic hypotheses.
Of the ~40 dots that do make it into the center, most are in fact yellow, meaning they are
instances where the individual rejected BVA(S, X, Y) on a sentence where X did not precede Y, but
also did not accept BVA in enough of the minimal contrasts (e.g., sentences with the same X and Y
where X did precede Y but did not c-command Y) for the judgement to be considered a maximally
significant demonstration of the effects of precedence on BVA. This too is different from the
English and Korean experiments, where most of the center dots were green. There are, however,
still 12 such green dots overall (which is still more than in the central intersection in English). If we
relax our criteria in a somewhat post-hoc way and employ only the Coref test for determining *B(S,
X, Y), then we go from 37 to 39 dots overall, and the two dots we gain are both green, bringing the
total of greens to 14.
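The arithmetic behind these two counts can be sketched in Python from the rows of the summary table in (332b). The dictionary layout and the function name `center_counts` are hypothetical; the figures themselves are copied from the table, and the sketch simply sums every region whose label includes all of the tests being required.

```python
# Rows of the summary table in (332b): region label -> (green, yellow, red).
TABLE_332 = {
    ("*C", "DR", "Coref"): (12, 25, 0),
    ("*C", "DR"): (4, 0, 1),
    ("*C", "Coref"): (2, 0, 0),
    ("DR", "Coref"): (2, 17, 18),
    ("*C",): (0, 0, 0),
    ("DR",): (0, 0, 5),
    ("Coref",): (0, 0, 2),
    (): (0, 0, 0),
}


def center_counts(table, required):
    """Sum (green, yellow, red) over all regions that contain every required label."""
    green = yellow = red = 0
    for region, (g, y, r) in table.items():
        if set(required) <= set(region):
            green, yellow, red = green + g, yellow + y, red + r
    return green, yellow, red


# Original criterion: *C plus both the DR and Coref tests.
both_tests = center_counts(TABLE_332, {"*C", "DR", "Coref"})
# Relaxed, post-hoc criterion: *C plus the Coref test only.
coref_only = center_counts(TABLE_332, {"*C", "Coref"})
```

With the table as given, `both_tests` sums to 37 judgements and `coref_only` to 39, and in neither case does a red judgement enter the count.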
As we will discuss in Section 6.4, this pattern is reversed in the case of c-command: there,
greens outnumber yellows in the center. Our main concern, however, is not how many dots there
are per se, but what the dots represent. As such, as in previous chapters, I will once again “zoom in”
on the central intersection and label the dots according to the following convention:
(333) i-j-k
i: The (arbitrary) number identifying the participant in question, 1-11.
j: The sentence type in question, 2 for (non-topicalized) weak crossover active, and 3
for topicalized possessor-binding, with 1 being unused due to exclusions.
k: The choice of X and Y, 1 for měigè jiàoshòu ‘every professor’ -tā ‘his’, 2 for dà duōshù
jiàoshòu ‘most professors’ -zìjǐ ‘himself/his own’, and
3 for bùzhǐ yīgè jiàoshòu ‘one or more professors’ -nàgè jiàoshòu ‘that professor’.
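For readers who wish to process such labels mechanically, the convention in (333) can be sketched as a small Python parser. The names `DotLabel` and `parse_label` are hypothetical, and the X-Y pair names reuse the abbreviations from (328); nothing here is part of the experimental materials themselves.

```python
from typing import NamedTuple

# j: sentence types in use; 1 is unused because of the exclusions noted above.
SENTENCE_TYPES = {
    2: "weak crossover active (non-topicalized)",
    3: "topicalized possessor-binding",
}
# k: X-Y pairs, named with the abbreviations from (328).
XY_PAIRS = {1: "MJ-T", 2: "DDJ-Z", 3: "BYJ-NJ"}


class DotLabel(NamedTuple):
    participant: int
    sentence_type: str
    xy_pair: str


def parse_label(label: str) -> DotLabel:
    """Split an 'i-j-k' label into participant, sentence type, and X-Y pair."""
    i, j, k = (int(part) for part in label.split("-"))
    if not 1 <= i <= 11:
        raise ValueError(f"participant number out of range: {i}")
    if j not in SENTENCE_TYPES:
        raise ValueError(f"unused or unknown sentence type: {j}")
    if k not in XY_PAIRS:
        raise ValueError(f"unknown X-Y pair: {k}")
    return DotLabel(i, SENTENCE_TYPES[j], XY_PAIRS[k])
```

For instance, a label such as 4-2-3 would be read as participant #4's judgement on the weak crossover active sentence with the BYJ-NJ pair.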
(334)
Whether or not we use both Coref and DR or just Coref to test for *B(S, X, Y) does not
make a great difference here, given that only two more datapoints are included in the latter case. If
we consider the judgements of all individuals, regardless of whether those judgements are yellow or
green, we can see that almost every single possible combination of individual, X-Y pair, and
construction is included in the center. There are a few incidental exclusions, but the only consistent
pattern is that none of participant #11’s judgements make it in; those consistently failed the Coref
test.
If we consider the green judgements only, those giving the clearest evidence for the role of
A(S, X, Y), we still find judgements from the majority of individuals, with participants #5 and #10
dropping out. We also maintain multiple judgements from all the different combinations of X-Y
pairs and constructions. As such, though there are quite a lot of yellow dots here, representing
weaker evidence, the green dots still tell a rather thorough story; across most individuals, X-Y pairs,
and constructions, evidence of the active role of A(S, X, Y) of constraining and permitting BVA(S,
X, Y) can be found. Indeed, given the strength of this evidence, it is reasonable to ask whether the
apparent preference for BVA via c-command in this dataset is actually at all reflective of general
properties of Mandarin Chinese or just an artefact of the data here. Judgment on this issue will have
to be reached via future studies, but for now we can be satisfied knowing that there are numerous
instances where Mandarin speakers have judgements reflecting that precedence alone can facilitate
BVA readings.
6.3 B(S, X, Y)
In this section, I will discuss the now familiar three issues concerning B(S, X, Y) from
preceding chapters, as applied to this dataset. To repeat, these are: (I) whether *B(S, X, Y) alone is
sufficient to predict *BVA(S, X, Y), (II), the role of DR and Coref in establishing B(S, X, Y),
expanding on the discussion in the previous section, and finally, (III), whether we can see clear,
minimal pair-like contrasts based on the presence/absence of B(S, X, Y).
(I) is easily addressed; using either of the proposed methods of “quirky” absolutism
discussed in Section 4.3, that is, assuming that the “correct conditions” in (335) are just “any time
whatsoever”, the resulting predictions fail dramatically:
(335) (Under the correct conditions)
*B(S, X, Y) → *BVA(S, X, Y)
(336) Quirky Absolutism, Take 1
# of Individuals Fully Impl. Degenerate Impl. Non-Impl.
BVA 0 1 10
(337) Quirky Absolutism, Take 2
# of Individuals Fully Impl. Degenerate Impl. Non-Impl.
BVA 0 1 10
Almost all individuals accept BVA(S, X, Y) at least sometimes when *B(S, X, Y)
is diagnosed, and the one individual of whom this is not true simply never passed both the DR and
Coref tests for a given sentence, so the implication in (335) never applied to them. On the other
hand, as we have seen, if we do properly control for c-command and precedence, *B(S, X, Y) does
become determinative, as predicted by the ABC-BVA law:
(338) BVA acceptance in sentences where X neither precedes nor c-commands Y
# of Individuals       Never Accepts   Sometimes Accepts
BVA (No DR/Coref)      10              1
(339) Status of (335) with regards to sentences where X neither precedes nor c-commands Y
# of Individuals             Fully Impl.   Degenerate Impl.   Non-Impl.
BVA (DR/Coref considered)    10            1                  0
As with Korean, we see here that reference to B(S, X, Y) plays a small, but crucial role in
predicting BVA in Mandarin Chinese. While most individuals in the dataset did not experience
quirky effects, it is no coincidence that in the one case where someone did, the tests for B(S, X, Y)
accurately predicted the possibility of BVA(S, X, Y) being accepted.
Turning now to issue (II), as noted, the roles of Coref and DR seem to reverse in this dataset
relative to what they have been doing in previous chapters:
(340) Status of (335) using just one MR to detect B(S, X, Y) (assuming *A/*C(S, X, Y))
# of Individuals             Fully Impl.   Degenerate Impl.   Non-Impl.
BVA (No DR/Coref)            10            0                  1
BVA (DR/Coref considered)    10            1                  0
In some sense, this result is expected given the previous studies discussed in Section 3.2;
these previous experiments found that reference to both DR and Coref was necessary to accurately
diagnose *B(S, X, Y), and if we take data from all three languages tested here into consideration, this
is indeed true. The DR test is needed more often than the Coref one, but both are necessary. On the
other hand, this result does not directly rule out the possible analysis that only one such test is
needed per language, the DR test for English and Korean and the Coref test for Mandarin Chinese.
Given, however, that there was only one relevant datapoint in Mandarin, it seems premature at this
point to conclude that there is any such “primacy” to the Coref test in Mandarin, though it is
something to keep in mind for future experiments. I will discuss this briefly again in Sub-Section
6.5.4.
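The difference between the two-test and one-test diagnostics can be made concrete with a short Python sketch. The function name `diagnoses_star_b` and its parameters are hypothetical; the sketch assumes only what is stated in this chapter, namely that *B(S, X, Y) is diagnosed when every test in use is passed, and that the participant who accepted BVA in (331) passed the relevant DR test but failed the relevant Coref test.

```python
def diagnoses_star_b(dr_passed: bool, coref_passed: bool, tests=("DR", "Coref")) -> bool:
    """Return True if *B(S, X, Y) is diagnosed using the given set of tests.

    Assumption for illustration: *B is diagnosed only when every test
    listed in `tests` has been passed.
    """
    results = {"DR": dr_passed, "Coref": coref_passed}
    return all(results[name] for name in tests)


# The quirky judgement on (331): DR test passed, Coref test failed.
with_both = diagnoses_star_b(True, False)                      # both tests required
with_dr_only = diagnoses_star_b(True, False, tests=("DR",))    # DR test alone
with_coref_only = diagnoses_star_b(True, False, tests=("Coref",))  # Coref test alone
```

On this sketch, only the DR-only diagnostic would wrongly treat that judgement as a case of *B(S, X, Y), and thus as a counterexample to (335); requiring the Coref test, alone or alongside DR, sets it aside.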
Finally, addressing issue (III), much like Korean, the fact that so few quirky effects were
observed prevents us from getting the same quality of clear evidence for the role of B(S, X, Y) as we
were able to in English. Unlike Korean, the Mandarin data does offer at least a potential chance to
do so, as the quirky effect that was observed happened in the weak crossover active case, meaning
that we can compare judgements by the same individual on the same sentence type with different
choices of X and Y. However, the individual in question never had any X-Y pair which passed the
Coref test for the relevant sentence type; that is, a potentially quirky Y was diagnosed for all three X-
Y pairs for that individual, so we cannot do the sort of minimal comparison we would want to. Just
for the sake of comparison with the diagrams in Sections 4.3 and 5.3, I present the relevant Venn
diagram below:
(341) a. Venn Diagram
b. Summary Table
# of judgements    Green   Yellow   Red   Total
DR, Coref          0       27       0     27
DR                 2       1        1     4
Coref              0       2        0     2
None               0       0        0     0
Total              2       30       1     33
However, as was the case with Korean, we can note that, while this data is not ideal for
capturing the role of B(S, X, Y), we have nevertheless shown quite clearly the necessity of
consideration of B(S, X, Y) for achieving the intended predictions. As such, the failure to find these
ideal datapoints in no way goes against our basic predictions; it merely fails to provide a very specific
type of support for them. Indeed, while it would be going too far to say this lack of evidence is a good
thing, it is true that a low frequency of quirky effects makes the study of c-command effects much
easier. This is a silver lining, but a significant one, as out of all three factors, it is c-command with
which we are the most concerned.
6.4 C(S, X, Y)
We turn now to the role of c-command in the dataset, meaning that the relevant implication
to consider is as given in (342):
(342) (Under the correct conditions)
*C(S, X, Y) → *BVA(S, X, Y)
As usual, we first look at c-command’s role in the dataset as a whole, without regard for
precedence or quirky effects. The relevant data is summarized in the table below:
(343) Categorization of individuals with regards to (342)
FI/DI/NI MJ-T DDJ-Z BYJ-NJ All X-Y’s
Set 1 N/A N/A N/A N/A
Set 2 8/0/3 4/0/7 2/7/2 3/0/8
Set 3 N/A N/A 5/3/3 N/A
All Sets 8/0/3 4/0/7 5/3/3 3/0/8
C-command alone does do slightly better than precedence alone in predicting BVA: there
were three individuals for whom BVA(S, X, Y) was never accepted in a sentence where X did not
c-command Y. The majority of individuals, however, did accept BVA at least sometimes in such
sentences, with no clear pattern across choices of X and Y.
If we remove sentences where X precedes Y, we get a much clearer pattern, as expected:
(344) Categorization of individuals with regards to (342) (precedence controlled)
FI/DI/NI MJ-T DDJ-Z BYJ-NJ All X-Y’s
Set 1 N/A N/A N/A N/A
Set 2 10/1/0 10/0/1 3/8/0 10/0/1
Set 3 N/A N/A 5/6/0 N/A
All Sets 10/1/0 10/0/1 5/6/0 10/0/1
There is of course the one individual who had the one quirky effect, but everyone else
consistently rejects BVA(S, X, Y) when X does not c-command Y. Further, all individuals are such
that they accept sufficient minimally contrasting items (e.g., BVA(S, X, Y) in S where X does c-
command Y) that the rejections of BVA(S, X, Y) can be considered maximally significant for our
purposes. We can also note that, if anything, BVA involving nàgè jiàoshòu is more widely accepted
when X c-commands but does not precede Y than the reverse, though the two are close. Such a
result once again contradicts the claims discussed in Hoji et al. 2000 and elsewhere that
demonstrative phrases can only participate in precedence-based BVA, not c-command-based BVA.
Finally, we can note that, as in the data from every other language, if we consider Set 3,
where the relevant sentences are the possessor-binding cases, when precedence is not controlled
(i.e., when the object is untopicalized), there are individuals who accept BVA in such sentences.
When precedence is controlled, in all three languages, the number of “non-implicating” individuals
goes down and the number of “fully implicating” individuals goes up, indicating that at least some
acceptances of possessor-binding-BVA were due to precedence, as predicted by our hypotheses. In
the Mandarin data in particular, this effect is total; all possessor-binding acceptances disappear once
the object is topicalized²³². We thus have ample evidence that the widespread reports in the literature
as to the acceptability of possessor-binding are due to confounds from precedence (and, to some
extent, quirky effects), rather than indicating something about the nature of c-command as
suggested by theories like Kayne (1994)’s.
Such issues aside, we still need to assess what happens once we control for quirky effects, so
that we deal with the one “non-implicating” datapoint. Incorporating the tests for *B(S, X, Y), we
arrive at the diagram given below:
232
This cannot be reduced to a ban on BVA(S, X, Y) when Y is in a topicalized object, as in topicalized non-possessor-
binding sentences, the “fully-implicating” individuals accept BVA with the same choices of X and Y.
(345) a. Venn Diagram
b. Summary Table
# of judgements    Green   Yellow   Red   Total
*A, DR, Coref      22      15       0     37
*A, DR             4       0        1     5
*A, Coref          2       0        0     2
DR, Coref          13      15       9     37
*A                 0       0        0     0
DR                 1       0        4     5
Coref              0       0        2     2
None               0       0        0     0
Total              42      30       16    88
Depending on whether we use the original DR+Coref test for *B(S, X, Y), or the language-
specific post-hoc Coref test, there are 37-39 dots in the center, 22-24 of them greens and none of
them red. As such, there were no judgements which violated the *C → *BVA prediction once we
ensured *A and *B; in cases where those were not ensured, i.e., outside the central intersection, such
judgements did exist. Further, there were quite a few individuals whose judgements did follow the
*C → *BVA prediction and who also accepted BVA(S, X, Y) in other cases where the same X did
c-command Y in S, even when X did not precede Y in S. As mentioned, this is something of a reversal
from the precedence case discussed earlier in this chapter, where most judgements in the center
ended up being neutral, albeit with still quite a few green ones. Further, despite the Chinese
experiment both having fewer participants than the English one and not having data from passives
considered here, there are still far more green dots here than in the equivalent diagram for English
(see Section 4.4), showing how much stronger the “signal” of C(S, X, Y) was for this dataset
due to the lesser amount of “noise” coming from B(S, X, Y).
As before, we can “zoom in” on the central dots to see how representative they are of the
various participants/conditions in the experiment:
(346)
If we consider both green and yellow dots together, then the result will not differ from what
is stated in Section 6.2, so we will focus on the green dots, those which show the most direct
evidence of a “c-command effect”. As before, whether or not the two dots in the Coref circle but
not the DR circle are included makes minimal difference. Either way, all individuals except for
participant #11 have data included; in fact, in all cases, there is more than one judgement from each
of those participants. The same can be said for the X-Y pairs, different constructions, and the
combinations thereof; all possible combinations are included and represented multiple times, and
indeed, most participants have datapoints representing their judgement on most such combinations.
As such, despite being impoverished by the loss of the data coming from passive sentences, the
Mandarin Chinese experiment demonstrates a wide base of support for the c-command-based
predictions deriving from the ABC-BVA law, finding maximally clear c-command effects in almost
every I-language measured, frequently multiple such effects across multiple lexical items and
sentence types. Indeed, as can be seen from the Venn diagram, almost every judgement that
hypothetically could have been used to support the predictions was used that way; very few were
excluded due to issues with B(S, X, Y); this once again highlights the general “inclusivity” achieved via
the current experimental methodology.
6.5 Full Results and Discussion
6.5.1 Full Results
Having examined each of the relevant factors A-C individually in the preceding sections, we
now turn to the “overall” analysis, which combines all three. As in preceding chapters, in the
resulting overview graph, the meaning of each dot changes, representing now just one individual’s
judgement on one BVA sentence: green circles for BVA being rejected, red squares for BVA being
accepted. Also as before, we have a choice with regards to how we detect *B(S, X, Y): using both
the DR and Coref tests as expected, or using the more post-hoc option of just picking the one that
was needed in this particular dataset. If we elect for the former option, we derive the following:
(347) a. Venn Diagram
b. Summary Table
# of judgements Green Red Total
*A, *B, and *C 37 0 37
*A and *B 24 11 35
*A and *C 6 1 7
*B and *C 28 9 37
*A 7 2 9
*B 14 21 35
*C 1 6 7
None 4 5 9
Total 121 55 176
As can be seen in (347), once *A(S, X, Y), *B(S, X, Y), and *C(S, X, Y) are ensured, there are
37 datapoints, all of which are cases of *BVA(S, X, Y), with no cases of okBVA(S, X, Y), exactly as
predicted by the ABC-BVA law. Likewise, in every other area, where at least one of the three
conditions does not hold, there are at least some cases of okBVA(S, X, Y), again exactly what we
would expect to find if all three conditions of the ABC-BVA law were necessary to guarantee
*BVA(S, X, Y).
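Both halves of this observation can be checked mechanically against the summary table in (347b). The Python sketch below is only an illustration: the dictionary layout and the names `check_law` and `totals` are hypothetical, the rows are treated as the separate regions listed in the table, and the counts are copied directly from it (green for BVA rejected, red for BVA accepted).

```python
A, B, C = "*A", "*B", "*C"

# Rows of the summary table in (347b): region -> (green, red).
TABLE_347 = {
    frozenset({A, B, C}): (37, 0),
    frozenset({A, B}): (24, 11),
    frozenset({A, C}): (6, 1),
    frozenset({B, C}): (28, 9),
    frozenset({A}): (7, 2),
    frozenset({B}): (14, 21),
    frozenset({C}): (1, 6),
    frozenset(): (4, 5),
}


def check_law(table):
    """Return (no acceptance in the center, some acceptance in every other region)."""
    center = frozenset({A, B, C})
    center_clean = table[center][1] == 0
    outside_has_acceptance = all(
        red > 0 for region, (_, red) in table.items() if region != center
    )
    return center_clean, outside_has_acceptance


def totals(table):
    """Return the overall (green, red, total) counts across all regions."""
    green = sum(g for g, _ in table.values())
    red = sum(r for _, r in table.values())
    return green, red, green + red
```

Applied to `TABLE_347`, both checks come out true, and the region counts add up to the totals given in the last row of the table.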
If we allow ourselves to use the post-hoc “language specific” diagnostic for *B, namely just
the Coref test in this case, there are far fewer changes than in the other languages:
(348) a. Venn Diagram
b. Summary Table
# of judgements Green Red Total
*A, *B, and *C 39 0 39
*A and *B 27 11 38
*A and *C 4 1 5
*B and *C 28 11 39
*A 4 2 6
*B 16 22 38
*C 1 4 5
None 2 4 6
Total 121 55 176
The only change that particularly matters for our purposes is the central intersection gaining
two more “green” dots, adding slightly more quantitative support to the prediction. Regardless of
which method of quirky effect detection we use, the Chinese experiment receives more “greens” in
the center than the English experiment (13 or 30), and fewer than Korean (45 or 58). In terms of
evidence of this sort, it is in the middle in terms of its quantitative strength, though when compared
to some of the results reported in Section 3.2, we see that it is on the high end of quantitative
support in terms of SCI-experiments of this kind performed thus far. If, on the other hand, we
look at reds outside the center, evidence for the “contrapositive” prediction of the ABC-BVA law,
here there are 55, which is fewer than Korean’s 124 and English’s 195, but still hardly low in number.
The experiment thus is completely consistent with the ABC-BVA law’s predictions and provides
direct evidence for the relevance and necessity of each of the components of that law.
6.5.2 Analysis of Other Data Gathered
Now that we have seen the “overall” results, in this sub-section, I give a brief account of the
various types of data not included in, or at least not apparent from, those results. As we have done
with the analyses of such data from other languages (see Sub-Sections 4.5.2 and 5.5.2), we first look
at the answers of participants regarding the non-MR readings, e.g. a “referential” reading on a BVA
item. As explained previously, this is important for ensuring that rejections of sentences with BVA
readings were