Building Phrase Structure from Items and Contexts

by Katherine S. McKinney-Bock
University of Southern California
Doctoral Dissertation

Table of Contents

CHAPTER 1 INTRODUCTION: INCLUSIVENESS, LOCALITY & GRAPH THEORETIC SYNTAX
1.1. Goals of the Dissertation
1.2. Foundational Issues in Syntactic Theory
1.2.1. The Minimalist Program
1.2.2. One Overarching Issue: Inclusiveness
1.2.3. Another (likely) related problem: Locality Constraints on Movement
1.3. Empirical Phenomena: Control, Raising, and Relativization
1.4. The Key Proposal
1.4.1. From Grafting to Graphs: On Phrase-markers
1.4.2. Graph Theoretic Syntax
1.5. To Conclude the Introduction
CHAPTER 2 THE ITEMS AND CONTEXTS ARCHITECTURE: ITS ROOTS ARE IN CHAINS
2.1. Conceptual Overview of Vergnaud (forthcoming)
2.2. From Phonology to Syntax: a formal notion of occurrence and chain is the same in both
2.2.1. Formalizing the Notion of "Occurrence": A Comparison of Circular Permutations and Metrical Structure
2.2.2. Applying the notion of occurrence to linguistics: Metrical structure
2.3. The heart of the formal system: The ICA
2.3.1. Bare Phrase Structure and a Family of Trees
2.3.2. The Mirror Principle
2.3.3. Dual Domains: Two types of Merge
2.3.4. Categorial Symmetry
2.4. The ICA and A′-Structure
2.5. The ICA as an Ideal: Certain Shortcomings
2.6. Conclusion
CHAPTER 3 PHRASE STRUCTURE AND CASE COMPLEMENTIZERS
3.1. Introduction
3.2. The Minimalist Program
3.2.1. Case Theory
3.2.2. Phrase Structure and the Lexical/Functional Distinction
3.3. Building Phrase Structure and the role of Case
3.3.1. Primitive Binary Features: Argument Structure
3.3.2. Embedding in the Literature
3.3.3. Proposal: Linking Phases through the 'K' ('Komplementizer')-Domain
3.3.4. Phrase Structure, and Spell-Out
3.3.5. To Phrase-markers (review from chapter 1)
3.3.6. Spell-out of Phrase Structure
3.3.7. Case Theory/Phrase Structure under the ICA
3.4. Generalizing features and phases
3.5. Conclusion
CHAPTER 4 EMBEDDING IS SHARING
4.1. Introduction
4.2. Empirical & Theoretical Considerations from Minimalism
4.2.1. D-T Parallels: Key Observations from the Literature
4.2.2. Functions of D and T in Control Clauses
4.3. Nonfinite/Embedded Clauses
4.3.1. Derivation: A Typology of Embedding
4.3.2. A Shared Property of Raising & Control
4.3.3. Raising, Control, and PF Spell-out
4.3.4. Benefits for the ICA over current theories
4.3.5. Shortcomings
4.4. Conclusion
CHAPTER 5 DEFINING SPELL-OUT AND FUSING RELATIVIZATION/COORDINATION
5.1. Introduction
5.2. The Minimalist Program/Traditional Syntactic Theory: Relativization & Coordination
5.2.1. Relativization
5.2.2. Coordination
5.2.3. Freidin & Vergnaud: On the relationship of movement and coordination
5.3. An Empirical Problem Resolved: Split-Antecedent Relative Clauses
5.3.1. Perlmutter & Ross 1970: The empirical issue
5.3.2. How the literature handles SARC
5.3.3. Accounting for SARC
5.4. Open Issues
5.4.1. A more specific theory of PF P-markers and Extraposition
5.4.2. Object relative clauses and a generalization to wh-movement (which should subsume object relative clauses)
5.4.3. Parallelism constraint
5.5. Conclusion
CHAPTER 6 CONCLUSION
6.1. Summary of Contributions
6.2. A Comparison of Frameworks
6.3. Future Directions
REFERENCES

Abstract

This dissertation aims to revisit foundational issues in syntactic theory regarding cyclicity and displacement. I take narrow syntax to operate over domains (phases) more local than in current Minimalism. To do this, I define a notion of phase overlap which involves the sharing of grammatical features across two independent phases. Phase overlap applies to phases involved in the construction of argument structure, e.g., linking subject and object phases in further building clausal structure, as well as in the embedding of complement clauses, and phase overlap also plays a role in A′ constructions, such as relativization. To overlap phases, I take the idea that generalized binary connectives build phrase structure (Vergnaud forthcoming), and extend it in such a way that it gives rise to phases that involve parallel nominal and verbal domains, rather than treating the verbal domain as 'privileged'. In this dissertation, both the verbal and nominal domains are implicated at the edges of phases, creating phase overlap and a novel notion of cyclicity: to construct two (consecutive) cycles is to share a pair of features across (both) the nominal and verbal domains.

The definition of sharing across phases, or phase overlap, is grounded in the scientific hypothesis that long-distance grammatical relationships are a by-product of interface requirements such as linearization, rather than a fundamental aspect of the architecture at narrow syntax. This hypothesis is based in part on the Items and Contexts Architecture (ICA, Vergnaud forthcoming), although the ICA remains incomplete in its formalization of embedding. From this type of sharing, I develop a strong hypothesis that the appearance of displacement (of a noun) is a product of how the formal computational system spells out, rather than a movement operation that takes place at narrow syntax.

From this hypothesis, I then set forth a unified analysis of the D-C-T domain, where noun sharing plays a crucial role in the linking – generalized linking – of two (otherwise independent) phases, including subject/object phases to build a transitive clause (chapter 3), two CP phases involving embedded and matrix clauses (chapter 4), and relative and matrix clauses (chapter 5). Along with noun sharing, I maintain the idea of verbal sharing – in a certain way following the standard literature, i.e. that v is visible to both the lower (object) phase and the higher (subject) phase. This is a key component of phase overlap. However, I extend this idea to all embedding, and hypothesize that all embedding shares (semantically interpretable) features. This is seen empirically, especially in cases of control.

Acknowledgements

The completion of this work is due in part to many hands and many minds. I am profoundly indebted to my advisor, Roumyana Pancheva, for not hesitating to say yes to taking me on halfway through graduate school as an advisee; for diving into this project as deeply – and in many ways, more deeply – than I; and for permitting – and pushing – me to take risks in order to move this project further than I thought possible.
Perhaps equally important have been her continual reminders to keep the bigger picture in view and to keep in mind the pivotal concepts that allow for communicating ideas to others. My dissertation committee has been an incredible source of support. Andrew Simpson has always been there for me, to listen to slivers of ideas from me and direct them to new paths. I wouldn't be pursuing the colorful, sizeable world of adjectival modification (in a line of research outside of this dissertation) without his early guidance. Khalil Iskarous came to USC most recently, and his theoretical imagination and breadth of scientific knowledge have positively kept me from ever completing this dissertation project; I am indebted to him for newer, deeper questions to continue pursuing in the next steps. Richard Arratia unfailingly responded to all my more mathematical questions, hopeless as some of them were, and, even if he was not aware of how much of a support he was during a very difficult transition for me, provided me with the motivation and inspiration to keep on. Special thanks to Elsi Kaiser, who has inspired and enriched my linguistic training and knowledge. She always made time to guide my experimental research and train me in the tools, techniques and methods that are crucial to both types of linguistic research that I have been able to pursue. Special thanks also to Maria Luisa Zubizarreta. She has been a mentor, both in life and linguistics, and has inspired me to continue along the research path that I am pursuing here, in spite of its challenges. I dearly acknowledge my late advisor, Jean-Roger Vergnaud, for the immeasurable support he gave me even when his own need for support was deep indeed. As he credits his own mentors, he has most definitely contributed more to this dissertation than is in it, and more to my own personal growth than I can express here. I treat as an ideal his intellectual beauty and ability to bring mathematical elegance into linguistic theory, which guides my research daily. I keep with me what I can of his personal wisdom and humor, which has impacted my life greatly. The faculty at USC have been incredibly supportive of me every step of the way, and special thanks go out to Hagit Borer, Louis Goldstein, Elena Guerzoni, Jim Higginbotham, Hajime Hoji, Audrey Li, Karen Jesney and Rachel Walker for their personal support and guidance. Barry Schein's guidance and mentorship has always been a somewhat destabilizing force, pushing me to rebuild to greater heights. To my cohort – Xiao He, Canan Ipek, Lucy Kim, David Li, Sarah Ouwayda, Ben Parrell, Erika Varis Doggett, Hector Velasquez, and Mary Byram Washburn – I am indebted to you for your support and for our parallel endeavours, both personal and academic, which have helped shape my outlook on life. To my colleagues and friends in the program – Janet Anderson, Ed Holsinger, Barbara Tomaszewicz, Ulrike Steindl, Priyanka Biswas, Mythili Menon, Syed Saurov, and Michael Shepherd – I am grateful for your support these past years. Michal Temkin Martinez has been there for me both as a peer and a mentor, always there with sound advice at the ready. Mary Byram Washburn and Erika Varis Doggett have gone hand-in-hand with me (with beach towels in hand) through the tough times and the fun times. I will never forget late nights in the department with Sarah Ouwayda, who has richly and importantly contributed to my thinking. I have deeply appreciated the collaborative spirits of Roger Liao and Tommi Leung, who have forged paths with me.
Joyce Perez was a rock throughout the program, and always a source of useful information and guidance for just about anything (and everything) that came up. Thanks also to the University of Southern California for its support through fellowships and teaching assistantships over the six years that I was there (in particular, the Del Amo Fellowship program, the Gold Family Fellowship, and the Dissertation Completion Fellowship). Beyond USC, there are many people who kept me sane as I was treading between graduate school and family: Elizabeth Shin, Lynn Sconyers (and Paul, and Kiera), and Roslyn Chaplin. I treasure your friendship. Before USC, Therese Tardio, Jeanne Baxtresser, Alberto Almarza, Marcy Lohman, and Jeff Putterman gave me the background and life skills to pursue graduate school and gain as much as I have from my experience here. And, my family. To my husband, Dan, your empathy, late nights, and support (among everything else) have allowed me to grow as a person both beside you and with you. I feel so much richer to have you by my side (and, boy, am I glad you went through this first!). To my son, Nathaniel, who spent the first six months of his life helping his mother write her qualifying paper, who felt every ounce of grief and joy along with me, both before and after birth, and whose curiosity and persistence I try to mirror in my own endeavours, thank you. To my baby girl, who I have not yet met, thank you for reminding me of life and love while I spent hours writing this dissertation (and thank you for feeling every emotion in the rainbow throughout). To my parents, thank you for steadfastly answering the phone (still), and always being there to talk with me. Graduate school coincided with many life changes for me, and my growth as a person and my research as a linguist are inevitably intertwined. I am grateful to all those who helped me bring all the branches of my life together through these past six years. And, as always, any and all shortcomings, errors and blunders are my own.

Chapter 1
Introduction: Inclusiveness, Locality & Graph Theoretic Syntax

1.1. Goals of the Dissertation

This dissertation aims to revisit foundational issues in syntactic theory regarding locality and displacement, and argues that narrow syntax is much more local than in previous theories. I take narrow syntax to operate over domains (phases) more local than in current Minimalism, involving only a single argument and verbal head. To do this, I define a notion of phase overlap in such a way that noun sharing plays a critical role in linking two independent phases. Phase overlap applies to phases involved in the construction of argument structure, linking e.g. subject and object phases, as well as to the embedding of complement clauses; phase overlap also plays a role in A′ constructions, such as relativization, with future extensions/predictions for wh-movement and islands. To overlap phases, I take the idea that generalized binary connectives build phrase structure (Vergnaud forthcoming), and modify it in such a way that it gives rise to phases based on transformations that involve parallel nominal and verbal features in the Case/complementizer domain, rather than treating the verbal domain as 'privileged' (i.e. v, C).
This builds on work that treats the nominal and verbal domains as parallel (Abney 1987, Vergnaud & Zubizarreta 2001, Megerdoomian 2002, Liao 2011, Vergnaud forthcoming, among many others); however, the literature has not, to my knowledge, developed a notion of cyclicity that involves assumptions other than the verbal domain/verbal extended projection as the extension of one phase into another (i.e. v as visible in both the lower and the higher phase, and C of an embedded clause as complement to V). In this dissertation, both the verbal and nominal domains are implicated at the edges of phases, creating phase overlap and a novel notion of cyclicity: any two (consecutive) cycles share a pair of features across (both) the nominal and verbal domains.

The definition of sharing across phases, or phase overlap, is grounded in the scientific hypothesis that long-distance grammatical relationships are a by-product of interface requirements such as linearization, rather than a key part of the architecture at narrow syntax. This hypothesis is based in part on the Items and Contexts Architecture (Vergnaud forthcoming, McKinney-Bock & Vergnaud forthcoming), although the ICA focuses on the formatives within a single phase (and remains incomplete in its formalization of linking phases/phase overlap). From this type of sharing, I develop a strong hypothesis that the appearance of displacement (of a noun) is a product of how the formal computational system spells out, rather than a movement operation that takes place at narrow syntax.

The contribution of this dissertation is, in large part, an extension of the Minimalist Program of syntax. The generative theory of phrase structure has undergone several major revisions over the years in an attempt to rid itself of posited primitives which, while seemingly empirically motivated, were nevertheless stipulations. For example, binary branching, which was stipulated in X-bar theory, is derived in Minimalism. And the idea of government, which was central under the Government and Binding approach for a host of phenomena, has been dispensed with. Continuing in that tradition, I address current stipulations within Minimalist syntax: (1) narrow syntax contains conditions that are really relevant only at the interfaces, (2) long-distance grammatical relationships require additional grammatical mechanisms to constrain the power introduced by movement/agreement, and (3) current clause structure neglects certain empirical parallels between the nominal and verbal domains. Additionally, I discuss issues arising with the operations Merge and Agree in MP, especially at the level of chains (linguistic objects occurring in multiple grammatical relationships). I explore what it means to be truly local, and use a generalization of grafting or sharing to account for clausal embedding and relativization, with predictions extending to other movement/agreement phenomena on a much stronger level.

First, current Minimalist theory contains conditions in narrow syntax that are really relevant only for the interfaces. This is an inherent result of the fact that current theory uses the same syntactic object in narrow syntax as an input to the two interfaces. For example, linearization algorithms and rules of chain pronunciation are relevant only at PF, for pronunciation and deletion (e.g. in the case of checked features), and yet copy-and-movement occurs at narrow syntax to check features before Spell-out in order to obtain the proper structure for pronunciation.
Also relevant is ellipsis, where there must be an indication in narrow syntax that the structure will be deleted, in order to allow certain syntactic operations to occur and to avoid unwanted outcomes such as antecedent-contained deletion. Another example is c-command, which is relevant at LF but not necessarily needed at the level of computation. Currently, c-command is used to constrain movement and Agreement in syntax, but if narrow syntax does not involve movement or long-distance Agree, c-command becomes unnecessary.

We see the opposite as well: certain mechanisms that are useful at narrow syntax must also be retained, and hence interpreted, at the interfaces, given current grammatical architecture, and this runs into difficulties. One example of this is Multidominance as it is realized in current theories. These theories use Multidominance at narrow syntax to represent certain displacements, such as Right-Node-Raising or coordinated wh-movement, but at the interfaces Multidominance structures are problematic for linearization algorithms, because these types of structures create ordering paradoxes with items that are multiply dominated (see Wilder 2008, de Vries 2009). In this dissertation, narrow syntax is taken to be entirely local, deriving the appearance of displacement and structural relations such as Agreement via c-command only at the interfaces.

Second, long-distance grammatical relationships require additional grammatical mechanisms like locality constraints, copying and indexing operations, and agreement operations in order to constrain the power introduced by movement or agreement. These operations take place across a larger domain of syntactic structure than an immediate Merge relationship, and so require conditions on what size of structure the relationship can span in order to avoid unbounded dependencies. With a more local syntax, such as the one developed in this dissertation, operations such as these are replaced with a condition on Spell-out of narrow syntax.

Third, current clause structure does not adequately reflect empirical parallelisms between the nominal and verbal domains (cf. Vergnaud & Zubizarreta 2001, Megerdoomian 2002, Vergnaud forthcoming). X-bar theory did account for the fact that both nominal and verbal categories involve similar types of syntactic objects (XPs). However, while X-bar theory accounts for the parallels through the use of a similar primitive phrase structure, the system in this dissertation accounts for the parallel hierarchy that we see within nominal and verbal projections. In X-bar theory, this would be the order in which the XPs combine to form the verbal and nominal hierarchies (e.g. TP > AspP > VP and DP > ClP > NP), which remains a stipulation despite the use of XPs in both domains. In fact, ever since Abney (1987), there has been no good way to represent selection of an argument by the verb. A verb, e.g. eat, has no semantic relationship with D alone – the D could be any type of D (every, the, a, etc.). What eat thematically selects is the NP below the D – apples – but the NP is now embedded below a hierarchy of functional projections, and there is no way to encode this relationship locally. The system in this dissertation, following ideas in Vergnaud (forthcoming), works to derive the parallelisms in the hierarchies across the nominal and verbal domains, such as the semantic relationship between V and N, which both sit low in their hierarchies, from properties of the computation itself.
At its roots, this new approach to syntax dispenses with the notion of tree used in current theory, and relaxes syntactic representations to allow for simple graphs. A simple graph is defined mathematically as a graph that contains no self-loops (an edge that connects a vertex to itself) and no multiple edges (two or more edges connecting the same two vertices); at the current juncture, objects such as those add unnecessary structural complications that have no obvious parallel to human language:

(i) Tree (rooted, binary branching)
(ii) Simple Graph
(iii) Graph with a self-loop and multiple edges
[Diagrams of (i)-(iii) appear in the original.]

This reinterpretation of syntactic objects allows all non-local relationships to be represented locally: the relationship between eat and what is represented as an edge connecting the terminal nodes that represent the lexical items (words) eat and what, in addition to representing the relationship between the root node ("S", below) and what as an edge connecting those terminal nodes. There is no need for a transformation operation to copy, move and delete the object what.

(iv) Long-distance relationship between two positions for what: [tree in which what appears both below ate and at the clausal root S, alongside John]
(v) Local, 'graph' representation (simplified here): [graph in which a single node what is linked by edges both to ate and to S]

Then, an algorithm interprets the graph representation at narrow syntax and creates a set of objects (binary branching, rooted trees) that, as in current theory, are used at the interface of sound and meaning. The key difference from the Minimalist architecture is that the representation at narrow syntax is a graph, rather than a tree, which allows us to remove from narrow syntax the stipulations for transformations, while still deriving objects that are useful at the interfaces: trees. The graph representation of narrow syntax allows us to state relationships such as those embodied by wh-words in a local fashion, along a single edge in the graph, avoiding the need for the complicated transformations that are stipulated in current Minimalism. Using graphs at narrow syntax is not just an alternative notation to binary-branching trees: it derives grammatical relationships that are not always implicated in trees without certain stipulations, and, on a broader level, it groups together under a common syntax families of trees that have structural and semantic similarities but have not been derived from a common source using a tree notation in narrow syntax. In doing this, I take narrow syntax to be a more abstract representation, from which constituent structure (represented as a tree) is derived and used at the interfaces.

(1) Figure: Architecture of Language
    Tier #1: Narrow Syntax [graph]
    Tier #2: Interfaces – Interface with Sound (Phonological Form) [tree]; Interface with Meaning (Logical Form) [tree]

This two-tiered system provides a novel resolution to some of the problems discussed above, while the mechanism of a 'deeper' narrow syntax than before is motivated by the minimalist idea that narrow syntax shouldn't require conditions relevant only at the interfaces. This new conception of narrow syntax allows us to look at the computational system without the complications needed for externalization and interaction with other cognitive systems. It allows two properties of syntax to emerge: the symmetry of syntax and the duality of features (see also Liao 2011).

This introductory chapter includes a broad conceptual discussion of the Minimalist Program, looking at its goals and principles.
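To make the graph-theoretic representation concrete, here is a minimal sketch (in Python, purely illustrative; the vertex names and the frozenset encoding are my own assumptions for exposition, not the dissertation's formalism) of the narrow-syntax graph for the wh-example above, together with a check of the simple-graph property:

    # Narrow-syntax graph for the wh-dependency discussed above, encoded as
    # a simple graph: vertices are lexical items (plus the clausal root "S"),
    # and every grammatical relationship -- including both relationships
    # that "what" enters into -- is a single local edge.
    vertices = {"S", "John", "ate", "what"}
    edges = {
        frozenset({"ate", "what"}),  # selection: ate--what
        frozenset({"John", "ate"}),  # predication: John--ate
        frozenset({"S", "ate"}),     # clausal root--verb
        frozenset({"S", "what"}),    # wh-scope: root--what, stated locally
    }

    def is_simple_graph(vertices, edges):
        """No self-loops and no multiple edges: encoding each edge as a
        2-element frozenset already collapses parallel edges, and a
        self-loop would surface as a 1-element frozenset, which we reject."""
        return all(len(e) == 2 and e <= vertices for e in edges)

    print(is_simple_graph(vertices, edges))  # True

The point of the sketch is only that the two roles of what (object of ate, scope-marker at S) are two ordinary edges on one vertex; no copy of what and no movement operation is introduced.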
One conceptual advantage that MP has over earlier syntactic theories is that it has the goal of excluding extralinguistic mechanisms from the system – and, in some ways, this dissertation is an exercise in pushing this principle further. A criticism is set out regarding the operation Merge as completely local. It is argued that it is necessary to look at larger structures to evaluate which kind of Merge is being used, despite the source independence of Merge (Boeckx 2004). From this, I turn to an overview of the empirical investigations in this dissertation: clausal embedding and relativization. I begin to refine a notion of noun sharing which will unify at some level my accounts of these empirical phenomena in chapters 3 and 4. I then turn to the Items and Contexts Architecture and define locality in a different way at narrow syntax, under the name of Graph Theoretic Syntax.

1.2. Foundational Issues in Syntactic Theory

1.2.1. The Minimalist Program

This section sets out the important facets of the Minimalist Program. Movement, chaining, feature checking, and Bare Phrase Structure are addressed, to set up a foundation for what types of grammatical relationships are available and the domains within which they apply. It is in this foundation that I begin to explore what it means to move to a system with entirely local grammatical relationships, even at the level of head projection. Eventually this will pave the way for an exploration of grafting as a generalized mechanism in syntax, replacing other types of grammatical relationships with this single mechanism.

1.2.1.1. Merge, Bare Phrase Structure & Set Formation

Chomsky (1995, and subsequent work) presents an argument for a more minimal syntax, and sets out the system of Bare Phrase Structure (BPS), which is conceptually more economical than X′-theory. The Minimalist Program assumes a model of language where the computational module (the 'syntax') builds structures, and then sends these structures to two other modules of language for use: the PF component, or phonetic form (the module for speech and linearization of language – sound), and the LF component, or semantic (logical) form (the module for interpretation – meaning). This model assumes that there are no direct interactions between PF and LF; rather, the output sent to PF and LF is mediated through narrow syntax. So, if there is some linguistic phenomenon that affects both phonetic form and meaning, then there must be something in the syntax that creates this apparent relationship. Importantly, the same object that is built in narrow syntax gets sent to PF and LF for interpretation (Chomsky 2001).

The Minimalist Program assumes a derivational approach to syntax: structure is generated step by step, using a set of rules, and if these rules are violated then the derivation crashes. It can crash at PF if there is a violation related to externalization in sound, or the phonological form, and it can crash at LF if there is a violation problematic for the generation of meaning. In order for a derivation to converge, it must satisfy the principle of Full Interpretation, which states that everything in the PF and LF representations can be given an interpretation. Derivations are cyclic, meaning that at a certain point – called Spell-out – at the end of a cycle, the syntax sends an object to PF and LF. An independent clause, for example, would consist of multiple cycles.
The derivational approach gets rid of multiple levels of syntactic representation, namely a "Deep Structure" before transformations apply and a "Surface Structure" that is then sent for interpretation. The derivational approach also ensures that all transformations occur in narrow syntax, before Spell-out, as the structure is being built – rather than building an "entire" tree and then performing transformations on that tree, as in past generative theories.

A derivation utilizes a numeration, which is a set of lexical items drawn from the lexicon (i.e. morphemes that are stored in memory). Each lexical item has a lexical entry that is made up of a set of linguistic features; e.g. airplane would be made up of features such as [noun] ('formal' feature), [begins with a vowel] (phonological feature), or [artifact] (semantic feature) (Chomsky 1995: 230). Crucially, in MP only the information in the numeration (lexical items) is open to computation. There is no notion of X′ or XP, only X, as minimal or maximal, nor is there a notion of interpretable indices, such as i on who_i.

Certain operations manipulate items from the numeration to combine them in the syntactic component. Important operations here are: Merge, Move, Agree. Chomsky argues that Merge is the simplest operation – it takes a syntactic object A, and another syntactic object B, and combines them into a new syntactic object. For example, a syntactic object [eat] and another syntactic object [an apple] would Merge to form the new object [eat an apple]. Merge is taken to be set formation: so Merging eat and an apple creates the set {eat, an apple}. Chomsky argues that one additional component to the set is needed, to identify the category of the new constituent (nominal or verbal), as these categories behave differently in language. So Chomsky introduces a label for a Merged pair, and argues that the label must be one of the syntactic objects, either A or B. So, a Merged pair of A and B creates the set {A, B} with the label being either A or B: {A, {A, B}} or {B, {A, B}}.[1] It is this object that is diagrammed by trees. The syntactic object {A, {A, B}} would be a tree as follows:

(2)
    A
    ├── A
    └── B

In MP, Chomsky puts indices on the root A versus the leaf A, but notes immediately that "this is informal notation only: empirical evidence would be required to postulate the additional elements" – like indices – that are not lexical features. Here, I exclude these indices, as the label A and leaf A are the same object. When A labels the set {A, B} we say that A projects, and we also say that A is the head of the phrase. We then take further features of A and continue to use them in the derivation (the merging of A and B has checked some feature already, perhaps an Edge feature [Chomsky 2008] or other agreement feature that licenses Merge). So, if A is a verb, then the Merged object {A, {A, B}} would have verbal features in the next step of the derivation (even if B is not a verb). An example of this would be something like:

(3) A = eat
    B = an apple (we have already Merged an and apple to form a more complex object)

    Merge(A, B) = Merge(eat, an apple) = {eat, {eat, an apple}}

Since eat has verbal features, we will treat this new object [eat an apple] as a verb.

[Footnote 1: Chomsky 1995 proposes that the syntactic object created by adjunction is: L = {<H(K), H(K)>, {A, K}} (p. 248). The label in the case of adjunction is an ordered pair consisting of H(K), the head of K.]
The next step of the derivation would give this object:

(4) Merge(eat an apple, past tense) = {past, {past, eat an apple}} = ate an apple

This continues to project Tense, which is verbal material, and continues to be a verbal category. Finally, we also need to Merge a subject, i.e. John:[2]

(5) Merge(John, ate an apple) = {past, {John, ate an apple}} = John ate an apple

The verbal material still labels the Merged pair, rather than John, as we see that further operations require an extension of the verbal domain (i.e. the next Merge would be of the C-domain, or 'CP,' which involves verb-raising to C in languages such as German).

In section 1.2.2, I discuss some issues that arise with Merge, namely (1) the requirement that a broader context than the most local/recent Merge relationship is needed to be able to deduce relations in trees, and (2) the generality of Merge: the operation must somehow be restrained to prevent deviant operations from occurring (standard solutions use Agreement features). I relate both of these issues to an underlying goal of the MP related to the Inclusiveness Condition, which is to rid the system of non-linguistic elements.

[Footnote 2: In this description of a derivation there is a simplification in that the VP-Internal Subject Hypothesis is not represented.]

1.2.1.2. Agreement

Another primitive operation is Agree (Chomsky 2001). Agree involves long-distance feature-checking between two items A and B with (typically) a semantically uninterpretable feature [uF], a probe, and a semantically interpretable feature [iF], a goal, respectively. So for A and B to 'agree,' A must have an [uF] as part of its feature matrix, and B an [iF] that can value the [uF]. The two features must be in a c-command relationship. Agree allows semantically uninterpretable features to be checked/deleted, so they are not visible to the interfaces. Only interpretable grammatical features may be sent to the interfaces for semantic or phonetic interpretation; otherwise the derivation crashes.

The operation Agree can apply only within a specified domain; Chomsky (2001) defines this domain as a strong phase. A strong phase obtains by the merging of a phase head such as C or v, with certain properties: (i) operations such as Agree can only apply within a strong phase, (ii) strong phases are targets for movement (EPP-driven) as they are reconstruction sites, (iii) once the phase head has been merged, the phase is removed from the active computation and sent to Spell-out; everything within it becomes inaccessible to further operations (with the exception of the phase 'edge'). Any potential goal B with [iF] is not visible to further computation from outside a strong phase, and so some probe A with [uF] outside of the strong phase cannot enter into agreement with B. For example, Agree can occur between A and B within the strong phase defined by C, below, but D and B cannot Agree, as D is outside of this phase.

(6) a.
    C
    ├── [ ]
    └── C
        ├── C
        └── T
            ├── A [uF]
            └── T
                ├── T
                └── V
                    ├── V
                    └── B [iF]

    b.
    V
    ├── D [uF]
    └── V
        ├── V
        └── C
            ├── [ ]
            └── C
                ├── C
                └── T
                    ├── A
                    └── T
                        ├── T
                        └── V
                            ├── V
                            └── B [iF]

Chomsky allows strong phases to have an edge position that is 'visible' to the higher phase; otherwise cyclic movement of, e.g., wh-phrases would be prevented. The [Spec, C] position (notated with [ ] in the above trees) is a phase edge, and any expression in this position would be permitted to enter into an Agree relationship as a goal with some higher probe; say, some higher C position:

    c.
    C
    ├── [ ]
    └── C
        ├── C [uF]
        └── …
            ├── V
            └── C
                ├── [ B [iF] ]
                └── C
                    ├── C
                    └── T
                        ├── A
                        └── T
                            ├── T
                            └── V
                                ├── V
                                └── N

In the case of wh-phrases in English, more than Agree is required: displacement also occurs (B moves to the [Spec, C] of the higher C).

1.2.1.3. Chaining & Movement

The complex operation just referred to, Move, helps create sentences with displaced syntactic objects. Movement is 'feature-driven', like Agree, in the sense that there must be some uninterpretable feature that needs to have a (linguistic) interpretation, acting as a probe and requiring a goal with an interpretable feature. First, the probe and goal enter into an Agree relationship, and then the goal is copied and the copy is re-Merged with the probe to 'check' [uF] and interpret it. An example of this would be the sentence:

(7) Who did Bob meet?

Who plays two roles in the sentence – as a question word that needs to enter into a relationship with a question operator at the clausal level, and as the object of the verb meet. We represent this duality as a movement during the derivation of the clause:

(8) Derivation:[3]
    1) Merge(meet, who)
    2) Merge(PAST, meet who)
    3) Merge(Bob, PAST meet who)

At this point, we have Bob met who. Now, we need to move who to a second location so it can play its role as a question marker. Move takes an object that has already been Merged and re-Merges it into the structure again. This is feature-driven; here, we have an interpretable wh-feature (for who, what, when, where, why-type question words), and the operation 'Move who' occurs when an interpretable wh-feature raises to check an uninterpretable wh-feature. We assume the presence of a "C" category, which contains the uninterpretable wh-feature. After step 3) in (8), "C" is Merged:

(9) Derivation:
    1) Merge(C[uWh], Bob PAST meet who)
    2) Move who = [who C Bob PAST meet who]

Finally, specific to English, we see that the past tense T also moves – is copied and re-merged – to the C position and is pronounced as "did."

(10) Derivation:
    1) Move T = [who C-did Bob meet who]

The two instances of who form what is called a Chain, or movement chain, and are indexed so that we know there are two copies of the same who in our derivation. Only the first element of a chain is pronounced (so the second who would be deleted at PF):

(11) Who C-did Bob meet who.

[Footnote 3: I have simplified here a bit, for purposes of exposition. It is generally assumed that vP is also a phase, and who needs to move first to the edge of the "v" category, prior to moving to the "C" category. This type of cyclic movement is given empirical support by languages (like German) with overt partial wh-movement, where the wh-word is pronounced at the edge of vP, rather than CP.]

An abstract representation of Move as a complex operation containing both Agree and re-Merge is shown below (in the original figure, arrows labeled 'Agree' and 're-Merge' link C [uF] to the two positions of B):

(12)
    C
    ├── B [iF]
    └── C
        ├── C [uF]
        └── …
            ├── V
            └── C
                ├── [ B [iF] ]
                └── C
                    ├── C
                    └── T
                        ├── A
                        └── T
                            ├── T
                            └── V
                                ├── V
                                └── N

1.2.2. One Overarching Issue: Inclusiveness

This section looks at the Inclusiveness Condition, and discusses why certain mechanisms remain problematic under this notion. I turn to bar-levels, identified by Chomsky (1995) as a mechanism that is undesirable in syntax. Interpreting bar-levels requires a system of relational properties based on grammatical features, and using trees/contexts broader than an extremely local Merge relationship, even at the level of an X^MAX phrase, is problematic. After a discussion of bar-levels, I turn to another mechanism that Chomsky (1995) identifies as undesirable: that of indices.
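As a rough illustration of the accessibility pattern just described, the following toy sketch (Python; the names Phase and agree and the feature encoding are invented here for exposition, not part of the theory's formalism) models a strong phase whose interior has been sent to Spell-out, so that only the edge remains visible to a higher probe:

    class Phase:
        """A strong phase: once spelled out, only its edge remains
        accessible to probes in the higher phase (Phase Impenetrability)."""
        def __init__(self, head, edge, interior):
            self.head = head          # the phase head, e.g. "C" or "v"
            self.edge = edge          # items at the phase edge, e.g. [Spec, C]
            self.interior = interior  # items already sent to Spell-out

    def agree(probe_uf, accessible):
        """Return the first accessible goal whose interpretable features
        can value the probe's uninterpretable feature, or None."""
        for item, ifeatures in accessible:
            if probe_uf in ifeatures:
                return item
        return None

    # Embedded CP phase: "who" has raised to the phase edge.
    embedded_cp = Phase(
        head="C",
        edge=[("who", {"wh"})],
        interior=[("Bob", {"D"}), ("meet", {"V"})],
    )

    # A higher C probe bearing [uWh] sees only the edge of the lower phase:
    print(agree("wh", embedded_cp.edge))      # ('who', {'wh'}): Agree succeeds
    print(agree("wh", embedded_cp.interior))  # None: interior is inaccessible

The point of the sketch is only the asymmetry: the very same goal that is found at the phase edge is invisible once it sits in the spelled-out interior, which is why successive-cyclic movement through edges is forced in the standard account.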
More specifically, indices on a moved constituent lack substantial information about which part of the constituent may reconstruct, and we need a way to syntactically account for the fact that A-bar movement is not full reconstruction – something which requires a long-distance relationship between two items in a tree. Indexing two items in an A-bar chain clearly does not accomplish that. I turn to this particular issue in this section because it relates abstractly to bar-levels: both mechanisms are used to represent chaining. Across a set of co-indexed phrases, and across a head that has projected through multiple 'bar-levels', a similar abstraction exists: these mechanisms are used to illustrate how a single LI exists in multiple contexts, and this is the notion of chain.

These problems are directly related to the Inclusiveness Condition (Chomsky 1995), in that they require knowledge of whether or not an item has been Merged previously, and/or will be Merged again: to paraphrase, whether the item is in multiple grammatical relationships. The Inclusiveness Condition is a good jumping-off point for the discussion of a foundational issue that arises with the MP.

(13) Inclusiveness Condition (Chomsky 1995: 228): "Any structure formed by the computation (in particular, π and λ) is constituted of elements already present in the lexical items selected for N [the numeration]; no new objects are added in the course of computation apart from rearrangements of lexical properties."

In particular, Chomsky 1995 identifies violations of inclusiveness as indices or bar-levels (p. 227). An example of bar-levels is the following (bar-levels marked in the tree). I am not using category labels here (such as T, T′, TP) because Bare Phrase Structure eliminates category labeling:

(14)
    -ed^MAX
    ├── -ed′
    │   ├── John
    │   └── -ed′
    │       ├── -ed^MIN
    │       └── walk
    └── quickly

Here, looking to the X′ projections, for example, the immediate constituents labeled by -ed′ are: [-ed′ John -ed′] and [-ed′ -ed^MIN walk]. Note that the lack of labeling -ed as category T (alternatively one could label -ed^MIN as T, -ed′ as T′, and -ed^MAX as TP) under Bare Phrase Structure is independent of bar-levels: the fact that -ed^MIN is a minimal projection is independent of its status as a Tense morpheme (what is important is that it has not projected), the fact that -ed′ is a bar-level projection is independent of its status as a Tense morpheme (what is important is that it has projected), and the fact that -ed^MAX is a maximal projection is independent of its status as a Tense morpheme (what is important is that it projects no further). The issue still remains: we have identified, using bar-levels, a relational property of -ed, in that it either (i) has not projected (X^0 or X^MIN), (ii) has projected and continues to project (X′), or (iii) has projected and projects no further (XP or X^MAX). As bar-levels are external to language's (internal) lexical properties, they violate Inclusiveness.

What are the purposes of these extra-linguistic mechanisms? The purpose of bar-levels is to describe the relational grammatical functions related to projection. Chomsky (1995) suggests the following two relevant properties for minimal and maximal projections: (1) minimal projections allow access to formal features of lexical items, and (2) maximal projections play a role at LF, in the sense that noun phrases and verb phrases are interpreted "in terms of their type." He points out that X′ plays no other interpretive role; it is "invisible" (p. 242).
One could take issue with this statement, in a semantic sense, as the "invisibility" of X′ is not always true in the semantics; for example, the constituent representing V′ as [V NP] will have an interpretation as a predicate of individuals. Even still, whether John merges with -ed′ or with -ed^MIN does not matter for the checking of syntactic features; the only difference is semantic, as the DP John is a subject to the predicate constituent labeled -ed′. And, the head-specifier relationship of John and its V is not maximally local, as it is mediated through the Merging of the specifier with the phrase rather than the head directly. Even with an objection to the notion of "invisibility", the only reason we have the V′ constituent syntactically is for the purposes of our system of semantic interpretation at LF, and we want to maintain the same tree at LF, where the constituency matters.

Even setting aside semantic requirements at LF, it is important in the syntax that the bar-levels all contain some instantiation of -ed as it is Merged with another constituent. Take a list of projections of -ed from the tree above:

(15) (i) [-ed^MAX]
     (ii) [-ed′ -ed′ quickly]
     (iii) [-ed′ John -ed′]
     (iv) [-ed′ -ed^MIN walk]

Dealing with TP/-ed^MAX, we know that the next projection will be into a complementizer domain, so let us add this Merge relationship to the above list:

(16) (i) [that′ that^MIN -ed^MAX]
     (ii) [-ed′ -ed′ quickly]
     (iii) [-ed′ John -ed′]
     (iv) [-ed′ -ed^MIN walk]

If we assume that bar-levels are unimportant, we can simplify the Merge relationships to the following:

(17) (i) [that that -ed]
     (ii) [-ed -ed quickly]
     (iii) [-ed John -ed]
     (iv) [-ed -ed walk]

This looks very similar to a chain of occurrences of -ed in a local context (a single Merge relationship). Importantly, the existence of -ed′ and -ed^MAX (as distinct from -ed^MIN) depends on -ed occurring in multiple grammatical relationships – multiple Merge relationships – creating the list of 'projections' above. Bar-levels are indeed extralinguistic and don't serve a syntactic purpose (though, perhaps, a semantic one); they are something that Chomsky (1995) says could be excluded. However, the discussion below goes a bit deeper into the notion of projection using the minimal/maximal distinction, and shows that we still do use a notion of projection in our syntax. We see that this function remains: to distinguish one LI in multiple contexts, or multiple Merge relationships. This is an important function (even if it does not involve the operation 'Move'), and it links projection to indexing – at some level, the two serve the same purpose.

Turning to indexing, here are some examples:[4]

(18) a. Bob_i thought himself_i intelligent.
     b. Who_i did Mary say who_i/t_i loved John?
     c. He_i thinks that he_i/he_k is smart.

The index is represented with a subscripted i, and lexical items which share the same index share the same referent. As the index i is external to language's lexical properties, it violates Inclusiveness. One could, in fact, assign reference to any DP/NP through an index even if the derivation does not have another DP that co-refers with it:

(19) Who_i did Mary_j say who_i/t_i loved John_k?

This reveals precisely one key purpose of indices. One might point out that it is informatively useless to index Mary and John in the sentence above. Why is this? It is because there is no other reference to either Mary or John in the sentence; were we to provide an answer to the question, the indexing might prove more useful:

(20) Q: Who_i did Mary_j say who_i/t_i loved John_k?
     A: She_j said that Susan loved him_k.

Here, the redundant indexing becomes informatively useful to the linguist; it links the pronoun she to Mary with index j and him to John with index k. It seems that indexing is most useful when there is one referent in multiple contexts, which could be taken to be one referent in a chain. The chains in the above sentences are: (Mary, she), (John, him) and (who, who).[5]

Generalizing across indices and bar-levels, the purpose of these inclusiveness violations is very nearly always to distinguish one LI in multiple grammatical relationships. Another way of stating this is that a single LI exists in multiple contexts; this is the notion of chain that is so ubiquitous in generative theory. And indices and bar-levels serve to show this LI in multiple contexts: who in its base-generated and moved position, and -ed in its various contexts with walk, John, quickly and that. We think of indexing as linking a moved constituent to its base-generated position, and we think of a notion of projection as not being movement, but at a deep level their purpose is to represent the same phenomenon.

Now, what is the problem with bar-levels and indices; why should they violate a computational condition? At a broad conceptual level, Chomsky (1995) holds the computational system of human language C_HL to a high standard, attempting to rid C_HL of any mechanism that is not driven by grammatical features or items in the lexicon (this is the basis for Bare Phrase Structure as well). At an equally important level, the literature has shown that multiple occurrences of an item in an A′-movement chain are not equivalent.

[Footnote 4: Note that I don't distinguish in these examples between indexed traces, t, and indexed copies. If one wants one copy of who to bind another, then one must identify that the two copies of who co-vary in reference; this is what the index does. I set aside for a moment the key idea that the lower copy is converted to an object that allows bound readings, i.e. a nominal headed by the (Fox 2000). One could argue, to the contrary, that indexing is simply a representation of what the derivation has already done: the same lexical item (LI) who has been Selected (using the operation Select) from the numeration twice. However, Chomsky permits two Selections of the same LI to have different reference at LF; see Chomsky 1995 p. 228, something that becomes an issue with referring expressions in particular – though, as indexing doesn't distinguish referring expressions from wh-expressions (see fn. 5, 6, 7 below), I take it to be an issue for both types of expressions. So, the indexing here plays a key role in providing information as to the reference of who; without it there would not be enough information for the derivation to correctly interpret the two occurrences of who at LF.]

[Footnote 5: I treat pronouns as being in a chain, just like wh-words. For a preliminary account of pronominalization, see Freidin & Vergnaud 2001 (republished in McKinney-Bock & Zubizarreta forthcoming, chapter 2). This is also along the lines of Kayne 2002, where pronouns are treated as having a constituent with the noun referent in a theta/argument position, which then moves: [he John].]
It turns out that indices are not enough information for reconstruction (see Lebeaux 1988, Chomsky 1995, Fox 2000), and that a notion of domain is required. This is diagnosed by Condition C effects, with an argument/adjunct asymmetry (van Riemsdijk & Williams 1981, Freidin 1986, Lebeaux 1988, cited in Fox 2000) reminiscent of the same thing (adapted from Fox 2000: 151): 6 (21) (a) [Which argument that John i made] did he i believe? (b) % ??[Which argument that John i is a genius] did he i believe? (c) [How many different arguments that John i made] did he i believe? (d) % ??[How many different arguments that John i is a genius] did he i believe? Here, partial reconstruction is available in the (a) example, but not in the (b) example: (22) a. [Which argument that John i made] did he i believe [x-argument]? b. % ??[Which argument that John i is a genius] did he i believe [x-argument that John i is a genius]? What supposedly accounts for the deviance in the (b) example is that John reconstructs below he, violating Principle C. One way of accounting for this is Fox (2002), who proposes a rule of 6 Note that I add a %-sign to the (b) and (d) examples, as in my dialect this example is not deviant. I believe dialectal variation for these examples may be in order; I leave this to future work. 29 Trace Conversion which converts a trace that has Quantifier-Raised into a definite description, e.g. Every book she read the book. What is important from the literature is that, in general, A’- movement requires an additional syntactic mechanism that allows for partial reconstruction; simple (co-) indexing is insufficient to account for why only parts of a raised constituent may reconstruct. For bar-levels, Chomsky notes that they can be equivalent to features in a rich enough computational system. “With sufficiently rich formal devices (say, set theory), counterparts to any object (nodes, bars, indices, etc.) can readily be constructed from features. There is no essential difference, then, between admitting new kinds of objects and allowing richer use of formal devices; we assume that these (basically equivalent) options are permitted only when forced by empirical properties of language” (Chomsky 1995: fn 7). Again, conceptually, Chomsky chooses the option which uses grammatical/linguistic features instead of the mechanism of bar-levels, which involves extralinguistic notations to describe relational properties of trees. However, while reducing bar-levels to properties of formal features may be more conceptually ideal, one particular problem remains. I will attempt to illustrate this via both bar-levels and via my own interpretation of what it means to have a ‘richer use of formal devices.’ What remains conceptually under MP and Bare Phrase Structure is the intermediate (non-minimal, non-maximal) node, which is imposed by the tree structure generated by Merge and head projection. If this is relaxed, the intermediate node is not needed. To understand what makes –ed’ an X’ level, we begin by looking only to the immediate Merge relationship of –ed’ in a tree for John walked quickly: 30 (23) -ed MAX 5 [-ed’] quickly 3 John [-ed’] 3 -ed MIN walk Let’s rid ourselves of bar-levels at this point and focus on this single Merge operation: (24) -ed 3 -ed walk Now, with labeling, we can only distinguish between –ed as it has been Merged with walk, and – ed as it has not been Merged with walk. There is no information as to whether the projection –ed continues to project (i.e. 
is –ed′) or whether this is the final, maximal projection of –ed (i.e. –ed^MAX). There is also no information as to whether the occurrence of –ed that Merged with walk has been Merged for the first time (i.e. –ed^MIN) or has been Merged in a previous operation (i.e. –ed′), the latter presumably not being empirically attested. In fact, one could take this to be –ed^MAX by default at this point in the derivation, if this Merge operation has just occurred. The only way to tell whether an occurrence of –ed is minimal, maximal, or neither, is to look at a broader context: more of the tree/derivation. If we attempt to scan 'down' in the tree, we find that there are no previous Merge relationships, and we can deduce that the Merged –ed is a minimal projection:

(25) [–ed –ed^MIN walk]

If we scan 'up' in the tree (or continue the derivation), we find that –ed Merges with John and continues to project:

(26) [–ed John [–ed –ed^MIN walk]]

We can deduce from this single 'look' that the lower projection of –ed is not maximal. As it is not a minimal projection either (this we can deduce from the single Merge relationship, since the head –ed is merged and projects in that relationship), we know that it is an X′, neither minimal nor maximal:

(27) [–ed John [–ed′ –ed^MIN walk]]

And so on. The only way to deduce bar-level information is by looking at a broader stretch of the tree, that is, at multiple Merge relationships.

Returning to Chomsky 1995, 2001: under Bare Phrase Structure, a label is not equivalent to its projection. Rather, Merge is a set-forming operation (a "rich formal device") which creates the set containing the two Merged constituents and the label of the set. Merging A and B gives the object K = {Z, {A, B}}, where Z is the label of the phrase, Z = A or Z = B:

(28) [Z A B]

If A projects, Z = A, and if B projects, Z = B. Let us assume A projects. The tree and the syntactic object (set) are as follows:

(29) (i) [A A B]   (ii) {A, {A, B}}

One could re-notate the tree to reflect the syntactic object that is now available to the computation:

(30) [{A, {A, B}} A B]

If we successively compute this for John walked quickly, replacing labels with sets, we obtain:

(31) [{ed, {quickly, {ed, {John, {ed, walk}}}}} [{ed, {John, {ed, walk}}} John [{ed, {ed, walk}} –ed walk]] quickly]

Now, let us take the local Merge relationship between –ed and walk:

(32) [{ed, {ed, walk}} –ed walk]

Looking only at this Merge relationship, we can deduce the status of the head –ed: we know that –ed is a minimal projection, as it is not itself a set recording any prior derivation. However, we still cannot know whether the label is a maximal projection or an X′ projection. To deduce this, we (once again) need to scan 'up' the tree to see that, yes, further Merge operations take place:

(33) [{ed, {John, {ed, walk}}} John [{ed, {ed, walk}} –ed walk]]

Only now can we deduce the status of the label {ed, {ed, walk}}, and the reason for this is that the entire history of the derivation is visible from the maximal node (not just from the local Merge relationship). It is clear that Merge as a set-building operation provides more information about the history of a derivation than bar levels do, and I interpret this as part of what Chomsky means by a "rich formal device" replacing bar levels. However, it still does not allow us to 'know' whether a set is a maximal projection, or even a root node, without scanning further up the tree/continuing along the derivation.
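The asymmetry can be made concrete. In the following sketch (mine; Node, merge and the status-checking functions are illustrative, not an implementation of C_HL), minimality is computable from a node in isolation, while maximality is only computable relative to the entire derivation:

# Merge as a label-projecting pair-former; by stipulation here, the
# first argument of merge projects.

class Node:
    def __init__(self, label, left=None, right=None):
        self.label = label                # the projecting item (Z = A or B)
        self.left, self.right = left, right

def merge(head, other):
    return Node(head.label, head, other)

def is_minimal(node):
    # Locally decidable: a lexical item has no Merge history of its own.
    return node.left is None and node.right is None

def is_maximal(node, root):
    # NOT locally decidable: we must scan the whole tree to see whether
    # the node's head projects again somewhere higher up.
    def projects_above(n):
        if n is None or is_minimal(n):
            return False
        if (n.left is node or n.right is node) and n.label == node.label:
            return True
        return projects_above(n.left) or projects_above(n.right)
    return not projects_above(root)

# John walked quickly: -ed projects over walk, then John, then quickly.
ed, walk = Node("-ed"), Node("walk")
john, quickly = Node("John"), Node("quickly")
vp = merge(ed, walk)          # {ed, {ed, walk}}
tp = merge(vp, john)          # {ed, {John, {ed, walk}}}
root = merge(tp, quickly)     # {ed, {quickly, {ed, {John, {ed, walk}}}}}

print(is_minimal(ed))           # True: decidable from the node alone
print(is_maximal(vp, root))     # False: required the whole tree to decide
print(is_maximal(root, root))   # True

The signature of is_maximal, which cannot be written without the root of the derivation as an argument, is a direct expression of the problem.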
Let us call this issue the 'Maximal Projection Problem.' We will see below that the Maximal Projection Problem is in fact an even more general problem, related to the Phase Impenetrability Condition (and to the 'look-ahead' problem, although somewhat in reverse).

A not-unrelated issue, discussed in Chomsky 2008, is what restricts/controls the operation Merge at a local level (Liao 2011 points out that this is not sufficiently addressed in standard theory, although he discusses potential options for restricting Merge to a set of relations in Bowers 2001, Collins 2002, Hiraiwa 2005; see also Vergnaud forthcoming). If Merge is a truly free/general operation, then two objects should be permitted to Merge independently of their status. This quickly becomes problematic, as we know (for example) that many LIs are incompatible:

(34) *walk-ed-ing
     {ed, {walk, {ed, {ed, ing}}}}

There are many such 'ridiculous' possibilities that are not attested in human language. One possible way to constrain Merge is to introduce an Edge Feature (Chomsky 2008), which regulates whether an LI may be Merged with certain other LIs or not. Using this, Chomsky leaves the notion of 'deviance' to the interfaces, and permits derivations that do not converge at Spell-out because they lack the proper features to be interpreted or linearized. Essentially, an additional type of (local) feature Agreement between two LIs is required in order for Merge to occur.

Setting these attempts to restrict Merge aside, the Maximal Projection Problem remains: more than one Merge relationship is required in order to discern the type of projection. The type of projection is important, according to Chomsky, for interpretation as a minimal or maximal projection (which have interpretive properties at LF), or neither (being "invisible" to the syntactic computation). So to know whether something is "invisible" we must look at a larger portion of the derivational tree – there is not enough information provided locally. And this is just at the level of A-structure/projection; even larger issues arise when one looks at A′-movement and chaining (to which I return in Section 1.2.2.1 immediately below) and their local Merge relationships.

And so, taking Merge to be a set-building operation comes with its own set of problems, related to (1) the requirement that a broader context than the most local Merge relationship be available in order to deduce relations in trees – information which is used at LF (Chomsky 2005); and (2) the fact that Merge as a general operation must somehow be restrained to prevent deviant operations from occurring (but see Bowers 2001, Vergnaud forthcoming). These problems are not unrelated to the mechanisms that violate Inclusiveness, in the abstract sense that items are Merged multiple times: (1) and (2) require knowledge of whether an item has been Merged previously and/or will be Merged again – to paraphrase, whether the item stands in multiple grammatical relationships.

To summarize: first, indices lack substantial information about which part of a constituent may reconstruct; once a semantic interpretation mechanism is built in, we see that indices may not even be necessary. We need a way to account syntactically for the fact that A′-movement does not involve full reconstruction.
Second, interpreting bar-levels requires a system of relational properties based on grammatical features, and there is an open issue with using trees/contexts broader than an extremely local Merge relationship (more on this below). 1.2.2.1. Generalizing the Maximal Projection Problem to Long-Distance Grammatical Relationships: Move/Agree Another key issue with the current syntactic objects (trees) is that certain relations must be defined over a larger domain than a single grammatical relationship (e.g. between a probe C and goal who), as there is not enough information provided at a local level. Under the Minimalist Program, one goal should be to identify these relations and explore alternatives that don’t require a notion of relational domain to be defined. As discussed above, Move is a complex operation that involves long-distance Agreement between two features, and then re-Merging of the lexical item hosting the lower feature with (a projection of) the lexical item hosting the higher feature. As part of this, Chomsky (2004) argues that Merge is source independent and does not require information about whether the objects it is Merging come from the numeration or from a previous step in the derivation. Van Riemsdijk (2001, 2006) argues that the source independence of Merge predicts the existence of the type of Merge that creates a graft, a constituent that is multiply dominated. The idea governing source independence is the same as the one governing a notion of bar-levels, above: the only way to ‘see’ the difference between Merge and Re-Merge (‘External’ – from the 36 external numeration - or ‘Internal’ – pulling a syntactic object from the existing derivation – Merge) is to look at a ‘bigger picture’ of the syntactic structure to see if the item already exists in the derivation, or if it is still in the numeration. With source independence, instances of grafting are not excluded because grafting involves looking at a bigger structure than a simple Merge relationship; it minimally involves two Merge relationships: in (35), the Merging of (α, δ) and (β, δ). (35) α β α δ β To know that the constituent δ in the Merge relationships above is grafted, we must first know that it is Merged twice, once with α and once with β, and that both α and β project and NOT δ. It could be argued even further that distinguishing cases of grafting from cases of Internal Merge involves looking all the way up to the root, to ensure that the two heads α and β that dominate the grafted point do not end up re-Merging at the root (an Internal Merge). The following two trees illustrate this point. (36) β δ β β γ γ α α 37 (37) γ σ γ α σ β α δ β In (36) and (37), the Merge relationships Merge(α, δ) and Merge(β, δ) are present, with both α and β projecting and NOT δ, and so the figure in (35) is present in both (36) and (37). One can see this below, where the relevant configuration in (38)/(39) is in black; the rest of the tree is grayed out: (38) β δ β β γ γ α α (39) γ σ γ α σ β α δ β 38 The only difference between the above structures is that β has been previously Merged with a constituent that dominates α in (38), and not in (39). This is only visible from a much larger ‘chunk’ of the tree than even the immediate two Merge relationships of Merge(α, δ) and Merge(β, δ). However, in (38), when the entire structure is taken into account, δ appears to be a case of Internal Merge, and (39) appears to be a case of External Merge. 
7 To review, here are the three cases of Merge discussed in the literature: (40) Merge/ ‘External Merge’ (from the numeration) of β: α/β α β (41) Re-Merge/ ‘Internal Merge’ of β: α/β β α … γ γ β (42) The case of ‘grafting’ (van Riemsdijk 2001, 2006), or ‘External Re-Merge’ (de Vries 2009) of δ: α β α δ β 7 Notice here that we are referring to a representation of a tree, rather than referring to the derivation. However, this is not crucial to the discussion at hand of source independence. Rather, in the case of (35)/(36), if the structure in (35) had been Merged first, the Extension Condition would have been violated to create (36) when β is Merged with γ. So an additional condition on Merge is required to rule out instances of grafting when looking at how a derivation might proceed. 39 However, this description of grafting is a bit fragile in the sense that it is not clear how the structure created by grafting differs from general X-bar theory; rather, it is one of many possible configurations created by different projections and headedness – another one being the X-bar configuration. While the literature points out that grafting is predicted by Merge, the utility of grafting is empirically restricted by these papers to certain constructions involving, for example, transparent free relatives (van Riemsdijk 2006), right-node-raising (McCawley 1982, de Vries 2009), or across-the-board wh-questions (Citko 2005) (see de Vries 2009 for a nearly exhaustive list of constructions that have used External Re-Merge – it numbers around 12) – and various constraints are proposed to restrict grafting to these phenomena. Here are some examples: (43) There is what I suspect to be a meteorite on the front lawn. [Transparent Free Relative] (van Riemsdijk 2006, his 9b) (44) John admires, but Jill hates, Bush. [RNR] (de Vries 2009, his 8-9) (45) I wonder what Gretel recommended and Hansel read. [ATB wh-question] (Citko 2005, her 7) What is focused on is the linearization problems created by grafted structures such as those in (8). For example, Citko (2005) addresses the issue of linearization with a constraint on multidominance structures that prevents them unless there is an antisymmetric projection that ‘brings together’ the parallel structures and creates an antisymmetric relationship between the parallel projections that is essentially a Spec-Comp c-command relationship. Additionally, Citko’s constraint states that movement of some constituent within the projections must occur in order to prevent a derivation crash. Van Riemsdijk (2001) addresses linearization by allowing for the ‘grafted’ constituent (in his case, the free relative or transparent free relative) to be inserted into the PF string at the point where the graft occurs. 40 What I put forth for these accounts of grafting is that two components, or levels, of analysis should be separated out: (1) working out complex linearization algorithms for constructions like parentheticals, free relatives, and coordinated wh-questions, and (2) generalizing this type of sharing within narrow syntax and exploring the underlying Merge mechanism. Grafting is part of the more general Merge mechanism and needs to be explored as a separate component. One of the important principles in generative grammar is that everything follows from a set of non-discriminating computational rules instead of construction specific mechanisms. The grafting mechanism is not construction specific, and ‘grafting Merge’ is a predictable consequence of a general Merge mechanism. 
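To see concretely why the classification itself demands global information, consider the following sketch (my own; the dictionary encoding and function names are illustrative). It distinguishes the configurations in (38) and (39) only by computing dominance between the two merge sites – information unavailable from the two local Merge relationships themselves:

def dominates(a, b):
    # True if node a properly dominates node b (child links assumed).
    return any(c is b or dominates(c, b) for c in a.get("children", []))

def classify_shared(mothers):
    # mothers: the two nodes that each Merge with the shared constituent.
    m1, m2 = mothers
    if dominates(m1, m2) or dominates(m2, m1):
        return "Internal Merge (re-merge within one projection line)"
    return "External Re-Merge / graft (merge sites in parallel structures)"

# Configuration (38): the higher merge site dominates the lower one.
delta = {"label": "delta"}
low = {"label": "alphaP", "children": [delta, {"label": "alpha"}]}
mid = {"label": "gammaP", "children": [low, {"label": "gamma"}]}
high = {"label": "betaP", "children": [mid, delta]}
print(classify_shared((low, high)))   # Internal Merge ...

# Configuration (39): the two merge sites are incomparable.
d2 = {"label": "delta"}
p1 = {"label": "alphaP", "children": [{"label": "alpha"}, d2]}
p2 = {"label": "betaP", "children": [d2, {"label": "beta"}]}
print(classify_shared((p1, p2)))      # External Re-Merge / graft ...

Both calls inspect the same two local Merge relationships; only the dominance computation over the larger structure tells them apart.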
In fact, grafting is reducible to the same problem as the Maximal Projection Problem, above: it involves one constituent being in multiple Merge relationships, or in a chain. At the root of our account is a more specific notion of occurrence, and chain, of a certain constituent. In this dissertation, I take grafting to be a generalized phenomenon that extends beyond coordinate structures and free relatives or other parentheticals, and even movement: I address the ‘Maximal Projection Problem’ (whose name should now be revised to include long-distance grammatical relationships as well as more-local head projection), as it is a deep foundational issue within MP. The goal of this dissertation is to reformulate the object(s) used at narrow syntax to involve a generalization of sharing, or what it means to occur in multiple grammatical relationships. A strong hypothesis is that Merge is strictly local, taking strictly local in a new sense: long-distance movement relationships, projection, as well as feature agreement within a single Merge relationship are all represented with a local Merge relationship. One consequence is that domains/cyclicity becomes a fundamental restrictor under syntax, even more so than previously taken to be. 41 1.2.3. Another (likely) related problem: Locality Constraints on Movement Agree, and therefore, Move, is restricted to strong phases, which relieves much of the problem of overgeneration through long-distance dependencies. However, one could take movement constraints to be ‘extralinguistic’ in a similar way to indexing, and conceptually disprefer them. In grouping Move within the same domain of mechanisms which violate Inclusiveness, this dissertation acts as an exercise in comparison between certain movement- based and non-movement-based approaches. However, certain feature configurations arise within the phase when various components with the same feature ‘compete’ for movement. Conditions that are additional to strong phases have been introduced to control this competition. For example, Relativized Minimality (RM, Rizzi 1990) is used extensively to decide which XP gets to move and which is blocked. (46) Relativized Minimality (Rizzi 1990) X alpha-governs Y iff there is no Z such that: i. Z is a potential alpha-governor for Y ii. Z c-commands Y and does not c-command X iii. alpha-governors: heads, A-Spec, A’-Spec Note that the concept driving RM exists within MP in terms of attraction, after the notion of government was abandoned (see Chomsky 1995: 297): (47) Attract F K attracts F if F is the closest feature that can enter into a checking relation with a sublabel of K. Various constraints interact with RM/Attraction in order to select which XP gets to Move. This occurs within a phase, in addition to the notion of domain that the phase already has introduced. In a more ideal system, constraints that act within a phase (defining sub-domains within the strong phase/domain) would be dispensed with. This is one goal of the dissertation, although it remains a work in progress. Constraints like those used in Pesetsky & Torrego 2001 (see P&T 2001 for citations), work to cover significant empirical ground and unify previously disjoint 42 empirical phenomena. However, they are conceptually not preferable for the same reasons that Inclusiveness forbids indexing and bar-levels: they are extralinguistic factors introduced in the system. It would be best to minimize them to one notion of domain (or perhaps none) instead of introducing competition within a single domain. 
I turn here to Pesetsky & Torrego (2001), as they provide a rich discussion of feature deletion ('checking'), within which they discuss and utilize several constraints for empirical phenomena within the C-domain (see chapter 4 of this dissertation as well). I summarize certain constraints from their discussion here, to illustrate how movement is kept 'reeled in' within the MP literature, yet loose enough for particular empirical applications. The additional constraints I discuss are: (1) further attraction/checking of already checked features (Pesetsky & Torrego 2001); (2) Attract Closest (Chomsky 1995); and (3) the Principle of Minimal Compliance (Richards 1997). These constraints are of immediate interest because they deal with 'competition' within a single syntactic domain.

Pesetsky & Torrego allow already checked features to undergo further checking (they refer to this in terms of feature deletion). The key example in their paper is subject DP movement to T, and then to C. They hypothesize that Nominative Case on a DP is the realization of an uninterpretable T feature, [uT], which in standard English has a sub-feature [+EPP] that requires movement (instead of simple Agreement). So, a subject DP will raise to [Spec, TP] to check its [uT] against the T-head (which has [iT]). Note here that P&T allow feature checking to obtain in the 'reverse' direction: we can have an uninterpretable goal with an interpretable probe. They also permit checking between two uninterpretable features. This is, in fact, the next step in the derivation I am alluding to, with DP moving to [Spec, CP]. They further assume that C can have a [uT] as well, which requires checking/deletion. Movement of either the DP with Nominative Case, [uT], or movement of T, with [iT], can check the [uT] feature on C. The result, when combined with further constraints on feature deletion, allows for either T-to-C movement or subject DP raising.

One further constraint that plays a role with multiple checking (as utilized in P&T 2001) is a definition of closeness (Chomsky 1995), which Pesetsky & Torrego call Attract Closest F (P&T 2001: 5). The definition of closeness in Chomsky 1995 (p. 296) is:

(48) A can raise to target K only if there is no legitimate operation Move B targeting K, where B is closer to K.

Pesetsky & Torrego simplify the language somewhat, to the following (their (10)):

(49) Attract Closest F (ACF)
If a head K attracts Feature F on X, no constituent that bears F is closer to K than X.

The purpose of this constraint is to help select, for example, which of DP[uT] or T[iT] is closer to C[uT]. The empirical application in Pesetsky & Torrego (2001) is an interaction of [uWh] and [uT] that derives T-to-C movement with object wh-movement versus subject wh-movement (which shows no T-to-C movement). The interaction is this: when an object wh-feature is present, the object raises to C to check its wh-feature; then T is attracted to C instead of the subject DP, through an adaptation of closeness dependent upon Head Movement which defines the head of TP as closer to C than the specifier of TP. In the case of a subject wh, the subject DP is attracted to [Spec, C] to check its [iWh] feature, and then, by virtue of being closer, checks [uT] on C as well.
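Schematically, ACF amounts to a closest-first search over feature bearers. The sketch below (my own; it flattens P&T's system considerably – interpretability is suppressed, and closeness is encoded simply as list order) reproduces the subject/object asymmetry just described:

def attract_closest(probe, candidates):
    # Return the closest candidate bearing the probed feature category.
    for cand in candidates:              # ordered closest-first
        if probe in cand["features"]:
            return cand["name"]
    return None

# Subject wh-question: DP[who] in Spec,TP is closest to C for both probes.
subject_wh = [
    {"name": "DP[who]", "features": {"Wh", "T"}},  # [iWh]; [uT] = Nom Case
    {"name": "T",       "features": {"T"}},        # [iT]
]
print(attract_closest("Wh", subject_wh))   # DP[who] -> subject raises
print(attract_closest("T", subject_wh))    # DP[who] -> no T-to-C movement

# Object wh-question: T, as the head of C's complement, counts as closer
# than the subject in Spec,TP under the head-based notion of closeness.
object_wh = [
    {"name": "T",        "features": {"T"}},
    {"name": "DP[subj]", "features": {"T"}},
    {"name": "DP[what]", "features": {"Wh"}},
]
print(attract_closest("Wh", object_wh))    # DP[what] -> object raises
print(attract_closest("T", object_wh))     # T -> T-to-C movement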
Yet another constraint on movement is that of the Principle of Minimal Compliance (Richards 1997, simplified by Pesetsky & Torrego 2001, their (18)): (50) Principle of Minimal Compliance (PMC) Once an instance of movement to α has obeyed a constraint on the distance between source and target, other instances of movement to α need not obey this constraint. 44 The purpose of this constraint is to allow a violation of closeness, but only after the first movement has satisfied this. This constraint is used by Richards (1997) to explain empirical Superiority effects in Bulgarian. Once the ‘closest’ wh-word is moved, one observes that the ordering between further moved multiple wh-words is free (P&T 2001, Richards 1997, Boskovic 1995): (51) Superiority effect in Bulgarian multiple questions with three wh-phrases a. Koj kogo kakvo e pital? [wh1 wh2 wh3] who whom what AUX asked b. Koj kakvo kogo e pital? [wh1 wh3 wh2] who what whom AUX asked ‘Who asked what to whom?’ Once koj has moved – the closest wh-word – then ‘minimal compliance’ with the closeness constraint is observed, and any other accessible wh-words may be attracted in any order. Notice that many of these movement constraints have specific empirical justification, and tend to serve as fixes to the strength of the PIC and strong phases. Conceptually, I take it to be more ideal that attention shifts from movement constraints within a phase, to how to build phrase structure in a way that makes displacement appear on the surface, an empirical observation, but actually comes from a deeper notion of syntax that relies only on locality and constraints on local relationships alone (see Vergnaud forthcoming). Of course, other constraints will be necessary, but they will be targeted less at the operation of movement and more at how grammatical relationships can be formed between lexical items. This is one theoretical/formal focus (on ‘cyclicity’ and domains alone), rather than two (both movement and cyclicity/domains). One could take movement and movement constraints to be ‘extralinguistic’ similarly to indexing, and conceptually disprefer them. Along a similar vein with respect to projection, if Merge is minimally “costless” it should be preferable to look to the most minimal structure, rather than being necessary to look at larger 45 structure to discern relational properties. With EPP features, the same problem arises, and grafting as well. 1.2.3.1. The Minimalist Model uses same object within each module of language There are several issues with Bare Phrase Structure that our approach addresses, which I address in turn in the following sections as I outline our system. Immediately, however, there is an overarching conceptual issue with BPS in that BPS assumes the use of the same syntactic object at narrow syntax and at PF/LF, which is not ideal because this requires that certain mechanisms that are only needed at the interfaces also exist in narrow syntax. This gives rise to stipulations at the level of narrow syntax such as the Single Root Constraint and rules of chain pronunciation to constrain Merge that are really constraints on the objects at the interfaces. The Single Root Constraint is important for semantic interpretation, and not so much for syntax, and rules on chain pronunciation are important for pronunciation. At PF, we must constrain the pronunciation of a chain so that it is only pronounced once. This requires Move to copy and then check the lower copies in a chain during the derivation at narrow syntax, prior to Spell-out. 
8 This move is unnecessary for the computation at syntax. It is also unnecessary at LF – the deleted copy can be reconstructed and interpreted in the deleted positions for the semantics (and often is). Recall that everything in narrow syntax is sent to both PF and LF – the deletion, however, seems relevant only for PF. Under this approach, no copying (for purposes of PF deletion) is required in narrow syntax. Rather, the syntactic object will contain only one occurrence of the constituent, and will take care of chain formation through permitting the single occurrence to be in multiple 8 We see, as in Nunes 2004, that sometimes multiple occurrences of a chain are pronounced. This questions even more the need for a constraint for the single pronunciation of a chain’s head, and the copy-and-delete approach. However, in addition to PF structures that allow a single pronunciation of a chain, our system will also need to derive these cases of multiple pronunciations. I do not address this data here. 46 grammatical relationships. Then, a new object for use at PF is ‘read’ from narrow syntax, with the proper linear requirements on pronunciation. This new object will be a classical Phrase- marker (P-marker), in the sense that only External Merge (regular Merge) has applied. Move (‘Internal Merge’) does not occur in a classical P-marker. 9 At LF, the Single Root Constraint, a constraint on syntactic objects that only allows for one item that dominates everything else in the tree, is required as well as dominance relations; namely, those of c-command and scope. The Single Root Constraint is necessary for semantic interpretation. The semantic theory we assume interprets items via truth conditions, and so for interpretation we require a single truth condition, computed at the root of the tree after all arguments are saturated. Also, interpretation of c-command relationships such as quantifier scope and NPIs arise at the level of LF. So, in BPS, narrow syntax needs to have c-command relationships in addition to constraining Merge to only create objects with a single root. Both of these are not immediately necessary requirements to the syntactic computation (apart from Agree and Move), but are stipulations that limit what narrow syntax produces so that LF can have interpretable structures. Here, since long-distance dependencies (agreement or movement) are not available, agreement is satisfied through a local Merge relationship, rendering c-command unnecessary. In the system here, the Single Root Constraint is relaxed at narrow syntax and in turn the Extension Condition on Merge is dispensed with, but in the same way as PF, our syntax still produces LF objects with single roots and more typical c-command relationships for interpretation. 9 This is a bit of a simplification, in the sense that parameters need to be developed that ‘select’ the correct Phrase- markers for pronunciation and interpretation; this will vary across languages. 47 In short, we move the constraints like those that work on linearization, chain pronunciation, single root, and c-command relations from narrow syntax to PF and LF objects, which is where these constraints are used for pronunciation and interpretation. This allows us to simplify from narrow syntax the stipulations that are really only necessary for the interfaces, while still deriving objects that are useful at the interfaces. 
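As a minimal sketch of the intended division of labor (my own; the pair encoding of grammatical relations is purely illustrative), one narrow-syntax object can yield distinct PF and LF objects, each realizing the shared item in exactly one of its relationships:

# One narrow-syntax object: the shared item D stands in two grammatical
# relationships, with no copy and no index.
relations = [("T", "D"), ("C", "D")]     # D as specifier of T and of C

def readout(relations, realize_in):
    # Read an interface object off the graph, realizing the shared item
    # in exactly one of its relationships (the others stay silent).
    return [(head, "D" if head == realize_in else "_")
            for head, _ in relations]

print(readout(relations, "C"))  # [('T', '_'), ('C', 'D')] PF: spelled out high
print(readout(relations, "T"))  # [('T', 'D'), ('C', '_')] LF: interpreted low

The single-pronunciation requirement and the Single Root Constraint then apply to the outputs of the readout, not to the narrow-syntax object itself.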
The Minimalist model is still utilized, but we have three different sets of objects: narrow syntax has one object that gives rise to a set of P-markers that are used at PF and LF (see earlier Figure). This takes the place of certain derivational conditions at narrow syntax that are truly interface conditions. 1.3. Empirical Phenomena: Control, Raising, and Relativization Here, I move away from conceptual discussion regarding ideals involving Merge and Move, and turn to two empirical phenomena which have been extensively studied in the history of generative grammar. While I will not be able to provide a full account, in this dissertation, for the rich empirical observations in the literature regarding these two phenomena, I propose that they are related and, underlyingly, share the same mechanism. To my knowledge, the empirical domain of Control vs. Raising verbs has been considered independently of relativization. There is somewhat of a similar argument, however, in both domains: does control involve movement of a noun phrase, or not? Do relative clauses involve raising of a noun phrase, or not? Both of these discussions center around whether or not the phenomena involve one noun phrase which has undergone movement, or whether it involves a noun phrase that binds a (covert) pronoun. The similarity in the argument might simply come from the fact that we have two such mechanisms in the theory which can be manipulated: pronoun binding and movement. However, this dissertation explores a generalization across Control and Relativization, and looks at the 48 issue not as movement vs. variable binding (a pronoun), but rather as to what it means to share a noun phrase that occurs in two argument positions. The key relationship in Control constructions is the C-T domain (see, e.g. Landau 2004). Landau argues for a theory of control that utilizes both the nominal and verbal domain: facts about C-selection in the embedded clause create different control scenarios, and the C-T relationship is crucial to deriving dependent vs. anaphoric tense in control predicates. Additionally, instead of the distribution of PRO being based on case (no case or null case), he treats the distinction between PRO and pro/lexical DPs with a referentiality feature, +/-R, with contextual rules about where the +R feature can surface with respect to the C-T domain within which the DP is merged. In chapter 3, I take referentiality to occur on a D-feature (see Liao 2011 for a discussion of referentiality on D under the ICA) and so the D relationship with C is crucial to distinguishing Control constructions. On the other hand, in chapter 5 I propose that the D-C-T domain is crucial to derive relative clauses as well, and that relative clauses are two clauses with a shared DP argument. The natural ‘asymmetry’ between the two clauses arises when one CP undergoes relativization and receives a restrictive interpretation. The relativization results from which C participates in some (D, C) relationship. Like Control, the relationship between D-C in the subordinate clause and the C-T domain, plays an important role in relativization. While I cannot address this in the dissertation, there are several empirical phenomena that occur in matrix clauses: Pesetsky & Torrego (2001) unify the T-to-C asymmetry (between subject and object wh-movement) with that-trace effects, and other subject/object asymmetries from the 1980s, using feature agreement between C-T. Alexiadou & Anagnostoupoulou 1998 account for subject vs. 
verb-raising by a D-feature which occurs in the T-domain, which 49 accounts for word order possibilities across language families. Then, of course, there is complementizer agreement and V2 languages. The literature is rich in its exploration of the C-T dependency, and how a DP is involved in this dependency. There is a rich ground for further testing the account of noun sharing in the C-T domain that I propose in this dissertation, but this will have to wait for another occasion. 1.4. The Key Proposal The primary goal of this dissertation (and research program) is to set forth a unified analysis of the D-C-T domain, where noun sharing/grafting plays a crucial role in linking two (otherwise independent) clauses, and subordinating one through the C-T domain. D-N sharing will be involved in control, raising and relativization. The C-T relationship also plays a crucial role in the analyses proposed here, and looks at the possibility of complementizers occurring in both the nominal and verbal domain: extending Vergnaud & Zubizarreta (2001), Megerdoomian (2002), Liao (2011) into the complementizer domain as well. What I contribute is an account of embedded non-finite clauses (cases of subject control and raising-to-subject), looking to address the movement vs. PRO accounts of Control in a way that uses both and neither, and an account of relative clauses that includes split-antecedent relative clauses; an inherent problem still under current theory. There are limits (both empirical and formal) to what I can begin to address in this dissertation; I will not yet be able to address fully wh-movement or T-to-C movement – although doors will be opened to exploration of this and island constraints. A preliminary analysis of coordination is proposed as well, deriving a family of coordination which renders many different surface structures and unifies several different coordinate constructions which, under standard theory, would have completely different derivations. 50 How do I deal with the Maximal Projection Problem (above)? I don’t believe that I fully resolve the MPP, but I do provide an account that resolves the need for “invisible” projections at narrow syntax: every grammatical relationship is motivated with a linguistic feature, and there is no sense of an X’ projection under the ICA, as I interpret it here. As for looking ‘up’ or ‘down’ in a tree, this problem becomes slightly more abstract. The condition on Phrase-markers that I propose defines what a ‘local’ domain is when a DP is shared. This does require looking to multiple Merge relationships (and ruling out those which violate the condition on Phrase-markers), but once again it becomes not a property of projection (i.e. knowing whether –ed should become “invisible” further along, or a “maximal” projection), but it stems from sensori-motor constraints on pronunciation and linearization – an interface problem – not in narrow syntax. No indexing is needed, as there is no movement in this system: rather, Narrow Syntax provides ‘options’ for the surface placement of a DP, and the condition on Phrase-markers along with constraints on LF/PF select where to interpret/pronounce the DP. And, bar-levels – the projection of a head within multiple grammatical relationships – becomes a product of the interfaces: necessary for linearization, but not a part of the syntactic computation. Next I provide a basic description of ‘Graph Theoretic Syntax’ as it is utilized in this dissertation. 1.4.1. 
From Grafting to Graphs: On Phrase-markers 10

A classical Phrase-marker – one without traces – is a derivation in which only External Merge applies. Specifically, a new constituent can be formed by merging a pair of maximal constituents drawn from an array of formatives or from the list of antecedently constructed constituents. The important point here is that two applications of External Merge will not share any constituent, yielding the classical Phrase-marker structure without multidominance or movement. If, as has been assumed here, Merge is restricted to applying to pairs of formatives (linguistic features, e.g. N/V, functional/lexical features, or wh), with the merging of non-terminals arising from headedness/labeling, then External Merge will have to allow for overlapping applications in any case, e.g. in order to allow heads to project multiple times (see below). The following condition adequately describes the standard workings of such derivations:

(52) Given two applications of Merge to two distinct pairs of formatives {f_i, f_j} and {f_i, f_k} sharing the element f_i, f_i must be the head/label in at least one of the relations Merge(f_i, f_j) and Merge(f_i, f_k).

This translates into the following condition on Phrase-markers, which are needed at the interfaces:

(53) Condition on Phrase-markers
Let P be some classical Phrase-marker and let (f_i, f_j), (f_i, f_k) – with f_i, f_j, f_k distinct formatives in P – be a pair of grammatical relations in P which share the formative f_i. At least one of the two relations is labeled/headed by f_i.

A classical Phrase-marker P, then, is a graph with the following two restrictions:

(54) (i) P is a tree 11 (in the graph theoretic sense).
     (ii) P obeys the condition in (53).

10 This section is developed from McKinney-Bock & Vergnaud (2010), (forthcoming).
11 A tree is a simple graph without cycles (as in standard graph theory; see, e.g., Balakrishnan & Ranganathan 2000).

I briefly return to the idea that, due to the head/label relation, a classical Phrase-marker will in general admit more than one representation. To illustrate, consider a tree built from two formatives, drawn as a circle and a square, where the circle represents one formative and the square represents another. The root node 1 (labeled with the circle) immediately dominates the leaf node 2 (the circle) and node 3 (labeled with the square); node 3 in turn dominates the leaf node 4 (the square) and further material:

(55) [1(circle) 2(circle) [3(square) 4(square) … ]]

With the heads projecting in the way just indicated, the same Phrase-marker could also be drawn with the projection lines running from the leaf nodes – from node 2 up to node 1, and from node 4 up to node 3 – rather than from the intermediate nodes:

(56) the same constituency as (55), projected from leaf nodes 2 and 4.

Here, the root node 1 and the leaf node 2, both labeled with the circle, are the same formative. Note that I am not talking about the syntactic object (cf. Chomsky 1995) that is created; rather, I am talking about the tree representation, which we draw using only the labels of the given objects. If we take only labeling into account, then we can pursue the following discussion. The reason both node 1 and node 2 bear the circle label is that the circle is the head of its phrase: in the Merge(circle, square) relationship, the circle is the head of the [circle square] phrase, and the circle labels that phrase. Similarly, node 3 and leaf node 4 are equivalently labeled with the square in the Merge(square, …) relationship. Returning to Chomsky 1995 (p. 244), I do not introduce indexing to distinguish the Merged square head (node 4) from its projection (node 3). Because of this, we could technically 'project' from the leaf node 4, rather than drawing the more typical picture in which we project from node 3. The two drawings are equivalent, because we do not distinguish the projecting label from its head.
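The condition in (53) is simple enough to state mechanically. Here is a small sketch (mine; the pair encoding of Merge relations is illustrative):

def obeys_condition(relations):
    # relations: (head, dependent) pairs, one per application of Merge
    # over formatives.
    for i, (h1, d1) in enumerate(relations):
        for h2, d2 in relations[i + 1:]:
            for f in {h1, d1} & {h2, d2}:   # formatives shared by both
                if f != h1 and f != h2:     # f heads neither relation
                    return False
    return True

# A head projecting twice satisfies (53): circle heads both relations.
print(obeys_condition([("circle", "square"), ("circle", "x")]))   # True

# A formative shared as the NON-head of two relations violates (53).
print(obeys_condition([("alpha", "delta"), ("beta", "delta")]))   # False

The False case is exactly the graft/chain configuration that classical Phrase-markers exclude.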
53 If we were to take into account the syntactic objects that are created via Merge in BPS, then the nodes are not equivalent – while labels are equivalent, the constituent that is labeled is not part of the equivalence class. Put another way, in (55) and (56) node 2 is a minimal projection and 1 is a maximal projection (and not minimal) – the only sense in which the two structures are equivalent is for feature-checking purposes, where, e.g. the node 2 wants to check a feature of node 4. In MP, feature-checking (including both checking and selectional features) is done by merging with a head or a projection of the head, which is both not a ‘minimal’ system and also creates the issue discussed above in that the label is not the same thing as the category/projection itself. Here, under the ICA, checking features happens directly: labels and projections are the same thing. Similarly, the X-bar diagram in (57) could be drawn as in (58): (57) 1 3 2 4 5 (58) 1 3 2 4 5 Here, the circle is Merged with two separate items, nodes 2 and 5, but the circle is the head of both Merge relationships. We see it projecting in the picture in (57) as a more ‘typical’ X-bar diagram, but (58) is no different – it still shows the circle being the head of the Merge 54 relationships with nodes 2 and 5. 12 Another way of describing this is that nodes 3 and 4 are equivalent, so we can draw representations projecting from one or the other without a difference in representation. In what follows, I only use the more standard representations in (55) and (57), keeping in mind that there are equivalent representations. Representational problems immediately arise when Internal Merge gets involved. More generally, classical Phrase-markers cannot adequately represent nontrivial chains, i.e., lists of grammatical relations which share a formative f i (see Chomsky 1981, 1995), as only External Merge is applied in them and f i can only be in one context/Merged once. It is possible to draw “augmented” Phrase-markers in which multiple occurrences of a formative are coindexed, as is standardly done in current theory. (59) C what i C C … eat what i There is a conceptual difficulty, though, as discussed above: indexing is excluded by Inclusiveness, as indexes are artificial mechanisms that are externally imposed on the structures. For a certain number of structures, the difficulty can be overcome by using ‘intersecting’ Phrase-markers, or restricted multidominance in the sense of the literature discussed above. A representation along the lines of (60) would be required. 12 Strictly speaking, nodes 2 and 3 are merged in (57), even if the feature checking is with 2 and 4 – which is important to the discussion at hand. 55 (60) ‘Internal Merge’ C what C C … eat The above diagram illustrates the case of internal merge while respecting inclusiveness: the two occurrences of β are not indexed or ‘copied’ so there is no additional mechanism that would violate inclusiveness. Still, the theory cannot be extended to the most general case. In particular, standard wh chains cannot be naturally described in terms of intersecting Phrase-markers, even assuming a derivation by phase. Such a description would require that any two phases along a wh “path” intersect as in (61), violating cyclicity (see above). Even taking successive cyclic movement through vP into account, the problem that the wh path is still ‘visible’ in the lower phase remains. 
56 (61) C what C C … vP v v eat eat The problem is this: a single wh-word ‘moves’ cyclically, and so has a Merge relationship inside each phase of a clause. With a copying/indexing mechanism, several copies of the wh-word end up in the derivation, one of which is pronounced. If there is no copying, then the single wh-word is in a Merge relationship with all the positions to which it has been cyclically Merged. This is in violation of the Phase Impenetrability Condition, defined in Chomsky 2001, where movement out of a phase is only permitted if the constituent has first been moved to the left edge of the phase), because there is a grammatical relationship that remains within the cycle that is technically still visible from the what leaf, despite the fact that Spell-out has occurred. The phase is still ‘penetrable’ because what is still visible to the computation. This is because the wh- word’s position in a tree is ‘visible’ in the derivation even after Spell-out of several phases after the wh-word was Merged into this position. However, setting aside the issue of wh-cyclicity temporarily, in order to respect inclusiveness and avoid copying, one needs to admit graphs with cycles, e.g., as in (62): 57 D C T (62) The elementary graph in (62) represents a chain of two occurrences of D, with one occurrence in the context of T and one, in the context of C, say, in the specifier positions of T and C, respectively. In other words, (62) represents the raising of D from the specifier position of T to that of C (or, alternatively, the lowering of D from the specifier position of C to that of T). So, a proper representation of nontrivial chains then requires that the condition on trees (repeated below) in turn be relaxed to allow for graphs (e.g. with multiple roots and a single item in multiple Merge relationships): (63) (i) P is a tree (in the graph theoretic sense). Still, classical Phrase-markers seem to be the right objects to describe interpretive properties of expressions at the interface levels. Then, in the spirit of CAT theory (Williams 2002) and following the ICA (Vergnaud forthcoming), I assume a two-tiered architecture. Narrow syntax will be formalized as a graph in the general sense. Phrase-markers will be read from that graph, subject to various conditions, in particular, locality conditions of various types (cyclical restrictions, etc.). To illustrate, the elementary graph in (63) gives rise to the two classical Phrase-markers in (64): 58 (64) (i) T (ii) C D T D C The fact that the two occurrences of D across the two classical Phrase-markers in (64) are occurrences of the same formative is enshrined in the primordial syntactic graph in (63). Again, the architecture is that of the CAT theory, in which constituent structures are read from linear arrangements of categories and transformations apply to such linear arrangements, not to the “emergent” constituent structures. So: (65) Grammatical computations do not apply directly to Phrase-markers, but instead to more abstract representations, from which Phrase-markers are ultimately derived. (Vergnaud forthcoming) 1.4.2. Graph Theoretic Syntax I assume the existence of a labeling relation, construed as an ordering relation between the merged items (see Vergnaud forthcoming). Assuming labeling, Merge is taken to apply to pairs of grammatical formatives, or features. In this dissertation, I work with a limited array of grammatical formatives/features, which are akin to features assumed currently in the Minimalist Program – e.g. 
categorical features [noun], [verb], functional features such as v, C, D, Asp, etc. A unique formal theory of features is not developed in this dissertation. Merge applied to some α, some β, establishes one of two possible grammatical relationships between α and β. If α is the head, then β is either a Complement of α or a Specifier of α. From this, two varieties of Merge are established: Selectional-Merge (S-Merge) and Checking-Merge (C-Merge). S-Merge creates 59 f i x f j f i f j x f i f j C a Head-Complement relation, or selection relation (Chomsky 1965), e.g. the pair (C, T) in the upper phase of a clause. C-Merge creates a Head-Specifier relation, or checking relation (Chomsky 1993), e.g. the pair (T, D) or (C, D). A derivation is represented as a graph with labeled edges. The edge (f i , f j ) x with end points/vertices f i and f j , f i and f j formatives, represents the x-merging of f i and f j , x=S (selection) or C (checking): (66) We call a derivational graph an M-graph (“M” for “Merge”), and the notion of M-graph is akin to that of T-marker in the sense of Chomsky 1975. Assuming labeling, an M-graph is a directed graph. For example, the one-edged graph in (66) should be oriented as shown in (67) if f j is the head of (f i , f j ) x : (67) If x in (67) is C (checking), f i is in the relation Specifier-of to f j . (68) Checking is itself a symmetrical relation. Specifier-of is just the relation that arises when two formatives in a checking relation are ordered by labeling/headedness. Nominal arguments are uniformly treated as specifiers (cf. Larson 1988, Bowers 1993, Hale & Keyser 1993, Lin 2001, Vergnaud & Zubizarreta 2001, Megerdoomian 2002, Liao et al. 2010, Liao 2011). 60 S V v C C S N D X Y 1.4.2.1. Symmetry I take syntactic configurations to be governed by basic symmetry principles (cf. Vergnaud & Zubizarreta 2001, Megerdooman 2002, Lasnik and Uriagereka 2005, Liao 2011, Vergnaud forthcoming), in the sense that there is a certain uniformity of syntactic structure/hierarchical organization, even across categories. 13 Thus, there is a one-one correspondence between the constituents in the specifier of some phrase X and the constituents in X. For example, one expects the relations in (69) to hold, with labeling implicit (see chapter 3 for details): (69) 1.4.2.2. From graphs to Phrase-markers Phrase-markers can be read from graphs. If one were to construct a tree based on Merge of (X, Y), with head projection/labeling, one could build a Phrase-marker from the graph (above). In other words, X Y means that X and Y are Merged and Y projects. (70) is the object at narrow syntax, and (71) is a Phrase-marker derived from (70). (70) 13 This notion is present even in X-bar Theory, where the uniform structure of an XP across different categories (NP, VP, AP, etc.) suggests symmetry. It is taken to be even more deeply involved in the fundamental architecture of syntax in the later works listed and discussed here. 61 Z X Y Z X Y (71) Y X Y And so on: (72) (73) Z Y Z X Y (74) (75) Y Z Y Y X Essentially, graphs are more abstract representations of a set of possible trees, and the proposed Condition on Phrase-markers serves to produce trees that are interpretable at the interfaces. It is necessary to keep in mind that the graphs at narrow syntax are not simply a notational variant of what the trees used in the Minimalist Program are, and this is so in several respects (see the results in chapters 3, 4 and 5). 
To illustrate this briefly, I turn to an argument by 62 Vergnaud in his lectures at the University of Southern California. Standard phrase structure – with labeling - creates the following derivation for a vP: (76) Merge a, prize to obtain { a a, prize} Merge win, { a a prize} to obtain { win win, { a a prize}} Merge v, { win win, { a a prize}} to obtain { v v,{ win win, { a a prize}} Then, head movement of win to v obtains: { v { v v, win}, { a a, prize}} A tree representing the penultimate step of the derivation is as follows (prior to head movement): (77) v v win win a a prize Vergnaud then describes the grammatical relationships that are present between these four grammatical formatives, taking Merge to establish a grammatical relationship between the two merged components. He establishes that v-a are in a case checking relationship, a-prize are in a selection relationship, and v-win are also in a selection relationship. However, two components are not correct in this derivation: it incorrectly generates a grammatical relationship between win and a (the Merge relationship in step 2), which doesn’t exist (remember, case is licensed by v not win). It also misses out on the relationship (theta) between win and prize, which is never generated. Notice immediately that the graph representation in (78), with labeling, correctly generates the four grammatical relationships between these items, and does not contain a 63 S V v C C S N D grammatical relationship between win and a (there is no diagonal edge connecting these two formatives in a merge relationship): (78) And, the Phrase-markers that this graph gives rise to never include a tree of the type given in the derivation above (a “standard” tree appearing in current theories). I use * (78) here to represent a Phrase-marker that is not generated from (78), which is a Phrase-marker with the undesirable grammatical relationship between win and a (note that the Phrase-marker containing [ v [ v v win][ a a prize]] is generated, which is the desirable one): (79) * (78) v v win win a a prize 64 Notice that standard phrase structure theory never gives rise to the trees shown by the ICA, although it would be possible to generate them using merge – however, due to assumptions about how syntax merges at the interfaces, we do not use these trees in “standard” theory. Importantly, though, the ICA does not generate the structure above (because of how phases are built, as binary products of features – see question 1.1, however, for discussion about this principle), while standard theory does. The graphs (abstract representations of sets of Phrase-markers) in their current form are not isomorphic to Phrase structure in MP. 1.5. To Conclude the Introduction In this chapter, I have attempted to set out the broad exploratory questions under the Minimalist Program which are relevant to the investigation at hand. Chapter 2 explains the foundations of Vergnaud’s Items and Contexts Architecture, which forms the basis for the investigation in chapters 3-5. Each chapter that follows contains (1) further developments to the grammatical architecture developed here; specially notions of embedding, sharing and phase overlap, and (2) an empirical exploration relevant to the formal development at hand. Chapter 3 looks at Case assignment and phrase structure. A notion of symmetry in syntax (Vergnaud forthcoming, Liao & Vergnaud 2010, Liao 2011) is set out for the structure of a phase. 
From this I propose a linking mechanism, formed by a close relationship between case and complementizers, which links these symmetric phases to build the fabric of syntax. Empirically, simple transitive clauses are described, and quirky Case in Icelandic is briefly explored. Chapter 4 turns to clausal embedding. Using the notion of symmetry and phrase structure in chapter 3, I propose to define embedding following Williams’ (2003) Level Embedding, where embedding may occur at any level of the clause and deriving the generalization that embedded clauses are always smaller than matrix clauses. However, I further define the 65 mechanism of embedding, something which is not accounted for clearly in the literature. I take embedding to involve a shared constituent across the matrix and embedded clauses. From this, I explore pre-theoretical structural descriptions of different ‘sized’ embedded clauses: raising, control and indicative clauses. Chapter 5 looks to long-distance wh-dependencies within the realm of relativization. Using sharing once again (the same mechanism from chapter 4), I draw parallels between clausal embedding and relativization, and propose that movement within relative clauses is actually a case of the head noun being shared across relative and matrix clauses. The chapter culminates by returning to an empirical paradox in current theory, that of split-antecedent relative clauses. I show that my account of relativization, interacting with a notion of coordination which is defined in the chapter, serves to naturally predict split-antecedent clauses as part of a family of coordination, rather than treat them as the anomaly that they are under current theory. 66 Chapter 2 The Items and Contexts Architecture: Its Roots are in Chains * 2.1. Conceptual Overview of Vergnaud (forthcoming) Graph theoretic syntax, as laid out above, has its roots in Vergnaud (forthcoming), who formalizes and proposes the Items and Contexts Architecture (ICA). The Items and Contexts Architecture is a grammatical/computational architecture that is an extension of the Minimalist Program, with certain conceptual revisions. Vergnaud addresses A-movement and phrase structure, hypothesizing that the nominal and verbal domains are computed/constructed in parallel, so that the structural size of a specifier is the same as that of its sister. He addresses A- bar movement and issues with movement and copy/deletion, dealing with the notion of occurrence and chaining in terms of mathematical permutation and cyclicity. Three primary research questions guide the core of Vergnaud (forthcoming), and find exploration and interpretation within this dissertation – in particular, question (C) is explored in subsequent chapters. (A) Methodologically, how can broadly defined mathematical/cognitive principles guide linguistic investigation? (B) To what extent do general mathematical principles apply across linguistic domains? What principles guide computation at different levels of linguistic structure (phonology, metrical structure, syntax)? (C) How is the computational domain defined? Looking to (A), an overarching theme explores the general principle of symmetry (Ch. 2, 4, 6) and how it may guide syntactic thought. Empirically, Vergnaud derives the parallels between nominal and verbal domains from his grammatical architecture. To cross linguistic domains, (B), Vergnaud uses a formal notion of periodicity (Vergnaud 2003) to derive stress * Parts of this chapter will appear in the following volume as: McKinney-Bock, K. 
and M-L Zubizarreta. Introduction. Primitive Elements of Grammatical Theory: Papers by Jean-Roger Vergnaud and his Collaborators, eds. K. McKinney-Bock & M-L Zubizarreta. Routledge. 67 patterns in metrical structure and wh-chains in syntactic structure in the same formal way. Then, looking to (C), Vergnaud explores notions of long-distance movement and clausal domains and what it means to be non-local. The book gravitates toward a theory that derives all grammatical relationships as inherently local, and the appearance of displacement is a product of how the formal computational system spells out. Dealing with higher-level operations, such as coordination and wh-quantification, leads to an extension of copy theory that has copying and deletion at the level of phases, triggering phase reduplication that allows for the appearance of displacement/movement. Empirically, Vergnaud’s ICA has not yet been explored to its richest potential. Empirical extensions of the ICA to date in the literature include classifier constructions in Chinese and few/little in English (Liao forthcoming), modality, aspect, and referentiality (Liao 2011), correlatives (Leung 2007), as well as empirical parallels between the nominal and verbal domains (Megerdoomian 2002, Vergnaud forthcoming). This dissertation addresses basic transitive phrase structure, raising and control (chapters 3-4), relativization and the problem of split-antecedent relative clauses (chapter 5), and touches on wh-questions and focus (chapter 5) and quantification (chapter 5). Using the ICA as a foundation for narrow syntax, I introduce a notion of sharing across phases that is grounded in similar scientific goals as the ICA to eliminate long-distance grammatical relationships, based in part on Vergnaud (forthcoming) and McKinney-Bock & Vergnaud (forthcoming). Apparent throughout Vergnaud’s writings is the belief, in line with Chomskyan inquiry, that linguistic methods should have foundations in scientific methods. Using, for example “the heuristic of eliminating overlapping conditions” (Freidin & Vergnaud 2001, reprinted in McKinney-Bock & Zubizarreta forthcoming), the move from Government and Binding (GB) to 68 the Minimalist Program (MP) allowed for the elimination of the Case Filter and government to the more general principle of Full Interpretation. Vergnaud’s work on the ICA lays the foundation for an extension of the Minimalist Program, collapsing conditions such as the EPP, checking, and the Mirror Principle into a single formal mechanism based on a guiding principle of symmetry (Vergnaud forthcoming). For example, the feature/category D is T, but found in a general nominal context rather than a verbal context. Here, I introduce the key ideas and illustrations from Vergnaud (forthcoming) which will be important within this dissertation. The ideas and innovations remain Vergnaud’s, and any errors in argumentation, representation and interpretation are mine. Finally, after introducing Vergnaud (forthcoming), in Section 2.5 I briefly discuss certain aspects of the ICA which remain incomplete and not developed: (1) the ICA focuses its hypotheses on the formatives within a single phase and remains incomplete in its formalization of linking phases/phase overlap/embedding, and (2) the ICA follows current theory in privileging the verbal domain. Chapters 3-5 in this dissertation develop a notion of phase overlap and embedding, using both nominal and verbal domains. 2.2. 
2.2. From Phonology to Syntax: a formal notion of occurrence and chain is the same in both

A theory of metrical structure from Vergnaud (2003), reprinted in McKinney-Bock & Zubizarreta (forthcoming), provides the roots of the grammatical architecture for both phonology and syntax under the ICA. Vergnaud begins with a comparison of "clocks and number systems" to metrical structure, looking to the general mathematical notion of circular permutation, which he then uses to formalize metrical structure (and the notion of chain, which runs formally through both phonology and syntax). Vergnaud then returns to the hypothesis that metrical structure and (syntactic) constituent structures are both hierarchical, and that the 'congruence' between the two is reduced to a correspondence between the two independently defined hierarchical structures.

From this, Vergnaud derives a notion of occurrence that allows for a formalization of chain in both metrical structure and syntactic structure. Each object used in a derivation (of stress, or of syntax) has two roles: that of a goal, or item/interpretable feature, and that of a source, or context/uninterpretable feature. Then, a checking mechanism allows for the generation of strings and a single pronounced item, from the two roles it plays.

2.2.1. Formalizing the Notion of "Occurrence": A Comparison of Circular Permutations and Metrical Structure

Vergnaud illustrates a useful relationship between 'beats' of metrical structure (syllables and stressed syllables) and a clock system with two hands (a big hand and a little hand). To do this, Vergnaud begins with a metrical grid for the word nominee:

(80)
              *
    *    *    *
    1    2    3
    no   mi   nee

Here, the lower tier represents the syllable, and the higher tier represents the stressed syllable, at the level of the word. One could repeat the metrical grid periodically to create a 'beat' (imagine repeating the word nominee over and over):

(81)
    . . .            *              *              *  . . .
    . . .  *   *   *    *   *   *    *   *   *  . . .
           1   2   3    1   2   3    1   2   3
           no  mi  nee  no  mi  nee  no  mi  nee

A metrical constituent is defined as a group of positions (syllables), and the stressed position is the head of the metrical constituent, representing the 'projection' of the head. This is the Hypothesis of Metrical Constituency (HMC), and it allows syntactic structure and metrical structure to be a correspondence between two constituent structures (Halle & Vergnaud 1987).

The pattern in (81) resembles a clock system, or a circular permutation, that repeats infinitely (of course, linguistic structure does not – but Vergnaud shows that the formal nature of these periodic structures is relevant to linguistic structure nonetheless). Nominee has 3 beats, so Vergnaud uses a clock with three numbers as an analogy. The bottom tier/syllable level corresponds to the big hand of a clock, and the top tier/word level corresponds to the little hand.

(82) [a clock face with the three numerals 0, 1, 2 arranged in a circle]

As the clock progresses, the little hand sits at 0 until the big hand makes a full revolution. When the big hand returns to 0, the little hand moves to 1, and sits at 1 until the big hand has made another revolution. This can be written out as in (83), with the top tier = little hand, the bottom tier = big hand, and the final line indexing the positions. This picture, in (83), resembles the metrical beat in (81).
(83)
    little hand:               1           2           0
    big hand:          1   2   0   1   2   0   1   2   0
    position:      0   1   2   3   4   5   6   7   8   9

As Vergnaud points out, "In essence, a clock/number system defines a hierarchy of equivalence relations among numbers." The classes can be constructed by mapping the circular permutation displayed in (82) onto the linear set of numbers (as in the bottom line of (83), above).

(84) [the clock face of (82), with directed edges from 0 to 1, from 1 to 2, and from 2 to 0]

(84) can be notated using pairs of numbers, or coordinates. For example, the edge going from 0 to 1, above, can be written as the pair (0, 1) – where the left coordinate represents the initial number, and the right coordinate represents the final number. In (0, 1), the clock hand starts at 0 and ends at 1. Here is the full permutation:

(85) (i) (0, 1) (1, 2) (2, 0)

Mathematically, Vergnaud defines the pair as follows:

(ii) [(x, y) =def "y is the image of x"]: a circular permutation defined over the set {1, 2, 0}

The left coordinate, e.g. 0, precedes the right coordinate in a pair, e.g. 1. The clock hand starts at 0, and 0 occurs first, followed by the clock hand moving to 1, where 1 occurs second. So:

(iii) [(x, y) =def "any occurrence of x precedes an occurrence of y"]: a periodic distribution of 1, 2, and 0 defined over an infinite discrete linear set

The analogy Vergnaud draws between a metrical grid and a circular permutation is the beginning of his defining a formal notion of occurrence in linguistics, both in metrical structure and in syntax.

Notice, in (85), that each number occurs once as a left coordinate and once as a right coordinate. Similarly in (83), repeated here, the number 0 occurs three times in the second line/bottom tier:

(86)
    little hand:               1           2           0
    big hand:          1   2   0   1   2   0   1   2   0
    position:      0   1   2   3   4   5   6   7   8   9

We can separate the type of an object from its instances. The type of the number 0 is an element of the set {0, 1, 2}, and it has some set of properties associated with it. There are three instances of 0 in (86), marked by the little hand of the clock: 0 occurs when the little hand is at 1, again when it is at 2, and again when it is at 0 (marked by positions 3, 6 and 9). Vergnaud defines a set of occurrences (e.g. of 0) as a chain. Here, the fact that 0 occurs three times is independent of the set of properties that define what 0 is; rather, it is based on the hands of the clock.

Vergnaud calls ω the set of properties associated with some object (say, 0, or, more linguistically, some grammatical object, e.g. T, which has a bundle of features), and he calls I the set of properties that can be freely associated with all objects, which can alternatively be called a set of indices – here, the system of the hands of the clock. Then, he defines an occurrence of ω (e.g. 0) as the pairing of ω with some element in I. However, the indices here are not arbitrary; rather, they arise from the properties of the clock itself: they arise from the properties of the formal system.

Vergnaud discusses the difference between the clock system/permutation and metrical structure, which is topological: the former is two-dimensional, and the latter is one-dimensional. One can convert the two-dimensional clock to one dimension (see Vergnaud 2003, section 5), and end up with the following:

(87) 0|1|2|0

The clock can be arranged linearly to generate any of the following sequences:

(88) (i) 0|1|2|0
     (ii) 1|2|0|1
     (iii) 2|0|1|2

Notice that, to preserve the effects of a permutation (or a circle) in one dimension, the ends must be repeated. To briefly illustrate, we can look to a circle such as that below.
The circle is "topologically equivalent" to a line in which the two ends are "identified," or equal to one another:

(89) (i) [a circle]
     (ii) A |⎯⎯⎯⎯⎯⎯⎯⎯| B, where A = B

Here, A and B are two occurrences of the same object (e.g. 0 in (87), above), and constitute a chain. Returning to the two-dimensional permutation, we see that each object/number occurs as a left coordinate and as a right coordinate:

(90) (i) (0, 1) (1, 2) (2, 0)

Vergnaud shows (Vergnaud 2003, Section 5) that each object/number type may be analyzed as a chain reflecting the two roles of the object (two instances): one as a left coordinate, and the other as a right coordinate. Vergnaud labels these source (left) and goal (right). There is an occurrence of each type of object as a source, in the domain, and as a goal, in the co-domain. But, as introduced above, we can collapse this onto one dimension:

(91) 0|1|2|0

When we do this, we no longer 'see' that each of the numbers in the set {1, 2, 0} forms a chain – in fact, we can only allow ourselves to see one of the numbers that form a chain (the endpoints) – and we must do this to preserve the topological equivalence Vergnaud discusses, above. But, in two dimensions, it remains that each object occurs twice – once as a source and once as a goal – and so each object plays two roles in the structure. It is in the linearization/collapsing onto one dimension that we collapse the two roles onto one object, or one "geometric point."

It is this notion of occurrence within this mathematical exploration of circular permutations that Vergnaud applies to linguistics, being reminded of two conjectures by Chomsky and Halle (resp.).

2.2.2. Applying the notion of occurrence to linguistics: Metrical structure

Reaching from mathematics to linguistics, Vergnaud identifies two conjectures about chaining, one by M. Halle and the other by N. Chomsky:

(92) Halle's conjecture
Given a metrical grid Μ and some position i in Μ, the asterisks in the column above i form a chain.

(93) Chomsky's conjecture
"We could, for example, identify this [JRV: the full context of α in K [KMB: K a syntactic object]] as K' = K with the occurrence of α in question replaced by some designated element OCC distinct from anything in K." (Vergnaud forthcoming, note 64)

Vergnaud develops a theory that "vindicates" both conjectures, extending his formal notion of chain both to syntax and to metrical structure. The question is: how does one get from strings to circular permutations (as described above)? Vergnaud treats a string as a 'severed' permutation, as below:

(94) [the circle of (84), with the edge returning from C to A severed: A → B → C]

He replaces the notation A|B|C|A, notated below, with a delta representing a 'junction', or an edge of some constituent:

(95) a. ABC = {(A, B), (B, C), (C, Δ)}
     b. A|B|C|A = {(A, B), (B, C), (C, A)}

As with circular permutations, an object in a permutation (e.g. A) has two roles. Taking Δ to be an instance of the OCC feature from Chomsky 1998, e.g. {(X, Δ)} = <X, OCC>, we have one role for X – this is the source role, discussed above. The second role, <X, ID>, represents X's ability to 'substitute' for Δ, or, in other words, act as the goal for some {(Y, Δ)} pair (allowing us to recursively create a string, e.g. XY). In creating the string XY, Vergnaud defines a co-chain, which exists with the identification of <X, OCC> with <Y, ID>.

(96) {<X, OCC>, <Y, ID>} is a co-chain iff <X, OCC> = <Y, ID>

What this represents is the adjacency of X and Y in the string XY. We could rephrase this and say that Y creates a context for X.
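Before turning to Vergnaud's worked example, the occurrence/chain/co-chain machinery can be made concrete. The following is a minimal Python sketch of my own (not part of Vergnaud's text), using the permutation {(0, 1), (1, 2), (2, 0)} from (85); the function names are illustrative, not Vergnaud's.

```python
# A minimal sketch of occurrences, chains, and co-chains over the
# circular permutation {(0,1), (1,2), (2,0)} from (85).

perm = [(0, 1), (1, 2), (2, 0)]  # each pair (x, y): "y is the image of x"

# Each object occurs twice: once as a source (left coordinate) and once
# as a goal (right coordinate). The set of occurrences of an object is
# its chain.
def chain(obj, pairs):
    occurrences = []
    for (x, y) in pairs:
        if x == obj:
            occurrences.append((obj, "OCC"))  # source role
        if y == obj:
            occurrences.append((obj, "ID"))   # goal role
    return occurrences

for n in (0, 1, 2):
    print(n, chain(n, perm))     # every object forms a two-member chain

# A co-chain identifies <X, OCC> with <Y, ID>, i.e. X is adjacent to Y.
# 'Severing' the permutation (replacing (2, 0) with (2, Δ)) yields the
# string 0 1 2, stitched together by co-chains.
def linearize(pairs, start):
    string, current = [start], start
    image = dict(pairs)
    while image.get(current) is not None:
        current = image.pop(current)   # identify <current, OCC> with <next, ID>
        string.append(current)
    return string

severed = [(0, 1), (1, 2)]             # (2, Δ): a junction, no goal for 2
print(linearize(severed, 0))           # [0, 1, 2]
```

The point of the sketch is only that the chain of an object is independent of what the object is: it falls out of the object's two roles in the formal system, exactly as in the clock analogy.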
We have a chain with the elements <X, OCC> and <X, ID> – recall that these are two occurrences of X, as a source and as a goal – and we have a co-chain with <X, OCC> and <Y, ID>, which allows us to stitch together adjacency relations in chains. Here is Vergnaud's example of the string ABC (see Vergnaud 2003, Section 6.3 for details):

(97)
    <A, OCC>   <B, OCC>   <C, OCC>
    <A, ID>    <B, ID>    <C, ID>
    [with arrows linking each OCC-gesture to the ID-gesture of the adjacent unit]

Returning to metrical grids, Vergnaud shows that the string in (95) can create the metrical grid from (81). He demonstrates that this occurs, really, by a checking mechanism (in the minimalist sense): an OCC 'gesture', or occurrence, gets deleted when it is linked in a co-chain with an ID 'gesture' (the arrows, above). Essentially, an OCC-gesture is an uninterpretable feature in syntax; in phonology, it means "that a stressed unit is defined as one that cannot be interpreted as a context for another unit" – stress as a junctural mark.

This is the core of Vergnaud's system in Vergnaud (2003, forthcoming) as it relates to both phonology and syntax, with an illustration in the domain of metrical grids. The ideas here underpin the ICA, as well as previewing Vergnaud's collaboration with Louis Goldstein in 2009 for a seminar at the University of Southern California linking gestural phonology with syntax, a realm which the 2009 seminar opened and one that remains open to investigation.

2.3. The heart of the formal system: The ICA

Here, I provide an introductory window into Vergnaud's grammatical architecture, the Items and Contexts Architecture (ICA), using Chomsky's Bare Phrase Structure and Baker's Mirror Principle as a basis for discussion and interpretation of the ICA.

Under the Minimalist Program, linguistic theory has attempted to move toward a general set of computational principles, trying to disambiguate which principles are language-specific and which are general (Chomsky 1995, 2004, et seq.). Part of this program is to explore posited primitives and mechanisms that, while empirically motivated, remain stipulations. Continuing in that tradition, Vergnaud (forthcoming) recharacterizes the grammatical architecture under Bare Phrase Structure (BPS) as a matrix, or a graph, dispensing with properties of the derivation that are only relevant for the interfaces (here, we discuss three) and allowing for a more general phrase structure that is used in both nominal and verbal contexts – by characterizing the primitives of phrase structure as playing dual roles (Manzini 1995, 1997): that of an item and that of a context.

Current clause structure neglects empirical parallelisms between the nominal and verbal domains (cf. Megerdoomian 2002, Liao 2011, Vergnaud forthcoming), or at least treats superficially the structural parallels between nominal structure and verbal structure. While X-bar theory accounts for the parallels through the use of a similar primitive phrase structure template, the ICA accounts for the parallel hierarchy that we see with certain linguistic features, based on observations by Lasnik & Uriagereka (2005) and others about clausal symmetry. In X-bar theory, this symmetry is represented by the order in which the XPs combine to form the verbal and nominal hierarchies, which remains a stipulation despite the use of XPs serving similar functions in both domains (i.e. classifier and aspect heads serving to represent the same abstract notion of 'end-point' in the nominal domain – mass vs. count nouns – and the verbal domain – telic vs. atelic events).
Ever since Abney (1987), there has been no good way to represent selection of an argument by the verb, or local theta (θ-)selection. A verb, e.g. eat, has no semantic relationship with D alone – the D could be any type of D (every, the, a, etc.) (Sportiche 2005). What eat selects is the NP below the D – apples – but the NP is now embedded below a hierarchy of functional projections, and there is no way to encode this relationship locally.

Reaching out empirically, Vergnaud's paper proposes that the extended projections of the nominal and verbal domains are formed from "the same primitives in the same hierarchical order" for the DP and the VP, by having a single feature merged (checked) in two contexts: the N-category context, and the V-category context. In other words, Vergnaud's system takes the N/V parallels as a strong hypothesis, and his system derives exact parallels between the domains. It remains an empirical question whether or not this generalizes, but see chapters 3, 4, 5 of this dissertation and Liao 2011 for examples of an application across the N-V domains that resolves empirical problems for previous approaches, and that shows that even paradigms that don't appear on the surface to require parallels still do. Vergnaud works to derive the parallelisms in the hierarchies across nominal and verbal domains, such as the semantic relationship between V and N, which both sit low in their respective hierarchies, from properties of the computation (narrow syntax) itself.

2.3.1. Bare Phrase Structure and a Family of Trees

At the level of computation, Vergnaud's approach derives a family of trees from a syntactic representation that is a 2 x n matrix – or, as is shown in Appendices II-III of Vergnaud (forthcoming), and in this dissertation, a graph (isomorphic to the matrix). In doing this, he takes narrow syntax to be a more abstract representation, from which constituent structure (represented as a tree or set of trees) is derived. From this, the family (or plurality) of trees computed from the matrix/graph in narrow syntax can be used at the interfaces for interpretation and linearization, at a second tier of analysis. This system, in the spirit of continued theoretical progress under the Minimalist Program, argues for a two-tiered architecture, which resolves concerns about computation that occurs at narrow syntax but is relevant only to interface properties.

This can be understood by looking to Bare Phrase Structure (Chomsky 1995). Here, merging α and β creates the set {α, {α, β}}. Then, merging another element δ builds the set to: {δ, {δ, {α, {α, β}}}}. This can be represented by the labeled tree in (98):

(98)
          δ
        /   \
       δ     α
            / \
           α   β

Notice, however, that at any 'level' of the derivation, or any 'level' of structure, one can see that α and β were merged originally. Here are the steps of the derivation so far:

1. Merge (α, β) = {α, {α, β}}
2. Merge (δ, {α, {α, β}}) = {δ, {δ, {α, {α, β}}}}

In both steps 1 and 2, one can see the 'history' of what has been merged – that is, the merging of α and β which took place originally. The derivation itself encodes at every level the history of the derivation, and at every step of the derivation there is a new tree. So, at the end of a long derivation, there are as many trees (representations of structure) as there are steps in the derivation. This set of trees, or 'family' of trees, that arises – and that is encoded in the derivation – is one way to think about Bare Phrase Structure.
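The point can be made mechanical. Below is a minimal Python sketch (my illustration, not Vergnaud's or Chomsky's formalism) of Merge building the sets in steps 1-2 while recording one constituent structure per derivational step – the 'family of trees'.

```python
# A minimal sketch of Merge under Bare Phrase Structure, recording the
# 'family of trees': one complete structure per derivational step.

def merge(head, other):
    """Merge two objects; the head projects as the label: {head, {head, other}}."""
    # frozenset stands in for the unordered set notation of BPS
    return (head, frozenset([head, other]))

family_of_trees = []

step1 = merge("a", "b")            # {α, {α, β}}
family_of_trees.append(step1)

step2 = merge("d", step1)          # {δ, {δ, {α, {α, β}}}}
family_of_trees.append(step2)

# Each step is itself a complete tree, and later steps contain the
# earlier ones: the derivation encodes its own history.
for i, tree in enumerate(family_of_trees, 1):
    print("step", i, ":", tree)
```

Nothing here goes beyond the text: the list simply makes explicit that a derivation of n steps yields n nested structures, each recoverable from the last.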
Vergnaud takes this perspective in Vergnaud (forthcoming), and derives a family of trees for each derivation. This family of trees – or, as Vergnaud puts it, plurality of constituent structures – can be represented by a series of dominance relations between labels (which Vergnaud aligns with Chomsky's 1981, 1986b notion of government). For example:

(99) δ → α → β

This graph represents the selection of β by α, and a domination relationship between α and β. Note that earlier the directed graph indicated headedness, except that here the direction of the arrow is reversed. Also, δ dominates α and everything that α contains, which here includes β.[14] Two possible trees can be used to represent this structure:

(100)
       δ
      / \
     δ   α

(101)
       δ
      / \
     δ   α
        / \
       α   β

This is because α can represent/label the merge relationship for α and β. This is akin to telescoping, cf. Brody 1997. From these graphs that represent government relations between formatives, Vergnaud shows that several trees are possible – an equivalence class, or plurality, of trees arises, representing the different levels of structure that can be represented.

In Vergnaud (forthcoming), Vergnaud introduces such graphs for both nominal and verbal domains, which can be schematized as follows (here, the arrows show selection of Asp by T, V by Asp, etc., rather than headedness):

(102) T → Asp → V → R(oot)
      T → Asp → N → R

Here, T is the standard tense formative, Asp is the aspect formative, V/N are categorial formatives, and R is the formative for the Root of the phrase. He takes, along the lines of Borer 2005 and others, Aspect and Tense to be found in the nominal domain as well (with their respective interpretations, i.e. Aspect representing the mass/count distinction and T the definite/indefinite distinction, possibly). He uses this clause structure throughout, though he acknowledges that the clausal and nominal structures are likely much more complex than this illustration (see chapter 3 of this dissertation, Liao 2011). However, the formal components of his system will remain despite possible modifications to the primitive grammatical formatives he uses.

[14] Earlier, I introduced directed graphs to indicate headedness. As the head projects, it thus continues to dominate.

The family of possible trees, or possible constituent analyses (limited by the head projection relation), is a syntactic arrangement that Vergnaud calls an Iterated Head-Complement (IHC) structure. He argues that IHC structures make up the fabric of syntax, and are the primitives, or "minimal units," of grammatical structure. This is an important point, because typically a derivation is represented by a single labeled tree. What Vergnaud illustrates is that using a single tree as the only representation of a derivation misses some properties rendered by Bare Phrase Structure that remain unaddressed – properties that come from a grammatical object occurring in two roles (as in Vergnaud 2003).

To compute the parallel nominal and verbal domains, pictured in the graphs above, Vergnaud departs from the standard notion of Merge as the simplest function. He proposes that there are two kinds of Merge, rather than a simple concatenation notion. One type of Merge forms selectional relationships – it builds hierarchical structure in an extended projection: Head-Merge/S(electional)-Merge. The second type of Merge forms checking relationships – it allows for an N-context and a V-context to check one another: EPP-Merge/C(hecking)-Merge.
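As a preview of the matrix schematized in (103)-(106) below, the two Merge types can be sketched in a few lines of Python (my construal, using the simplified formatives of the illustration; the row/column reading of the matrix is from the text, the code itself is not Vergnaud's):

```python
# A minimal sketch of the 2 x n matrix: rows give S-Merge (selection)
# within each domain, columns give C-Merge (checking) across domains.

matrix = [
    ["T", "Asp", "V", "R"],   # verbal domain
    ["D", "CL",  "N", "R"],   # nominal domain (T=D, Asp=Classifier)
]

# S-Merge: each formative selects the one to its right in the same row.
s_merge = [(row[i], row[i + 1]) for row in matrix for i in range(len(row) - 1)]

# C-Merge: each verbal formative is checked against the nominal
# formative in the same column - a local specifier at every level.
c_merge = list(zip(matrix[0], matrix[1]))

print("S-Merge (selection):", s_merge)
# [('T', 'Asp'), ('Asp', 'V'), ('V', 'R'), ('D', 'CL'), ('CL', 'N'), ('N', 'R')]
print("C-Merge (checking):", c_merge)
# [('T', 'D'), ('Asp', 'CL'), ('V', 'N'), ('R', 'R')]
```

The design point the sketch makes is that neither operation is long-distance: selection runs along a row, checking runs down a column, and every relationship is strictly local.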
This two-Merge system differs from Chomsky 2004, who notes that the first Merge relationship creates a head-complement relationship, and that further applications of Merge with the same head create what we call specifier relationships. Chomsky points out that it remains an empirical question whether this distinction is necessary or not, and proposes not to restrict the number of times a head can project (or the number of specifiers that can occur). Vergnaud departs from this concern, and permits one type of merge that creates head-complement relationships (selection), and another type of merge that creates head-specifier relationships (checking).

The parallel nominal and verbal domains that are created by C-Merge and S-Merge can be represented by a more abstract structure: a 2 x n matrix, from which constituent structure (or Phrase-markers – or, standard trees) is derived. This is schematized as follows:

(103)
    | T | Asp | V | R |
    | T | Asp | N | R |

Here, we will take T=D and Asp=Classifier/#, so we can revise the matrix as follows:

(104)
    | T | Asp | V | R |
    | D | CL  | N | R |

The horizontal rows of the matrix represent S-Merge, or selection, in the parallel domains:

(105)
    | T → Asp → V → R |
    | D → CL  → N → R |

The vertical arrangement of boxes in the matrix represents C-Merge, or checking, across the nominal and verbal domains. This represents argument/specifier relationships:

(106)
    | T | Asp | V | R |
      ↕    ↕    ↕   ↕
    | D | CL  | N | R |

Notice that there is a direct, local relationship at every level of the clause. Vergnaud develops this further when he argues that each grammatical formative plays a role as an item and a context across dual domains (discussed below).

This formal exploration opens the door to Vergnaud's (forthcoming) main contribution: the Items and Contexts Architecture (ICA), which derives constituent structure from a more abstract grammatical representation. It provides an account of A-movement in which structure accretes feature-by-feature into the nominal and verbal domains at the same time: a parallel (nominal) specifier is paired with every verbal projection, and is of equal structural size. This leads to parallel growth of the nominal and verbal domains. More deeply, Vergnaud hypothesizes that, in the grammatical interface with the cognitive brain system, the primitive objects of the grammatical interface are not lexical items, or grammatical items, but rather the roles played by these items. Each item has two roles: one role refers to an object in the mental system, and the other role relates to being a context in the mental system. Then, a grammatical structure is a set of mappings between items and contexts, the two types of roles of constituents.

2.3.2. The Mirror Principle

Vergnaud collapses Baker's Mirror Principle, the EPP, and checking under one formal mechanism. This predicts an empirical distinction between specifier creation and selection, which remain heterogeneous in the computation. Possible empirical consequences include predicting the existence of polysynthetic languages and deriving the (empirically observed) Mirror Principle. Here, we introduce and explore the mechanism in detail, beginning with the Mirror Principle as stated informally in Baker 1985:

(107) Mirror Principle (Baker 1985)
Morphological derivations must directly reflect syntactic derivations (and vice versa).

Baker observes that certain morphological patterns are the mirrored order of syntactic patterns, more specifically in the linear ordering of these elements. Baker (1988) uses the head movement constraint (Travis 1984) to form agglutinative words via (syntactic) head-movement of morphological affixes.
The morphological derivation is the syntactic derivation, and the mirrored order is created via affixation by cyclic, local head-movement.

Vergnaud's system takes the Mirror Principle to stem from the relationship between interpretable and uninterpretable pairs of features, paired e.g. across the nominal and verbal domains. Empirically, we observe that features that are uninterpretable in the verbal domain, e.g. person, are interpretable in the nominal domain. Vergnaud takes this to be a fundamental aspect of the computational component of human language, C_HL: some uninterpretable formative in the verbal domain is interpretable in a dual, nominal domain. He refers to an abstract categorial feature of the verbal domain (typically 'V') as O and the categorial feature of the nominal domain as O* (typically 'N'), as an abstraction away from standard theory, to illustrate the concept. Using Vergnaud's notation, where <O| is a context and |O*> is an item, he represents the uninterpretable verbal contextual formative as being under identity with the interpretable nominal item formative:

(108) <O| = |O*>

The verbal contextual feature is the nominal item feature, where the contextual features are uninterpretable and the item features are interpretable.

The notation that Vergnaud uses for items (interpretable) and contexts (uninterpretable) is the bra-ket notation used in quantum physics:

(109) <x | y>, where <x| is called a bra and |y> is called a ket.

The kets are the items, and the bras are the contexts. This formal notation is only one of a few that Vergnaud introduces in the paper, as different ways of exploring the same idea, but it is one of the more intuitive ones (and it is used pervasively throughout the formal consequences in the final sections of Vergnaud (forthcoming)), so I utilize this notation here.

From this, he derives a Mirrored template (under the Mirror Principle) by allowing a 'reversal' of which grammatical formatives act as contexts and which act as items from one domain to its dual. Essentially, we reverse, in each feature pair, which formative is the item and which is the context (this is based on the general feature identity, mentioned above):

(110) R = Root, V = categorial verbal feature, N = categorial noun feature, A = Aspect, T = Tense, ∅ = Edge feature, X = either V or N feature (a variable for a categorial feature)

Nominal Template, X=N: {<∅|R>, <R|X>, <X|A>, <A|T>, <T|∅>}

           T
         /   \
        T     Asp
            /    \
          Asp     N
                /   \
               N     Root
Notice that Vergnaud uses Edge features, ∅, for both T and Root, to note that there is only a 'joker' feature creating an item for (context) Root, and a context for (item) T (this is akin to Δ in Vergnaud 2003). We see here (and it is explained in Vergnaud forthcoming) that the ket for each formative is akin to 'projecting' or labeling a head. Then, taking <O| = |O*>, or, more specifically, reversing each item-context pair in the nominal domain to create its dual (verbal) domain, derives a mirrored verbal template:

(111) Mirrored Verbal Template, X=V*: {<R|∅>, <X|R>, <A|X>, <T|A>, <∅|T>}

           Root
          /    \
        Root    V*
               /   \
             V*     Asp
                   /    \
                 Asp     T

Notice that this tree represents the morphological linear ordering of a verbal root with its affixes, crucially taking the inverse of the features in the other domain. Here, the Mirror Principle is derived by parallel nominal and verbal domains being expressed in a dual relationship, whereas in standard theory it is taken to be a syntax-morphology isomorphism.

What Vergnaud does here is pair two domains (a domain defined as one instance of a template, as illustrated above), where one 'checks' the other. He requires that there be dual domains, which is the same as requiring that any single template have a specifier – this is the essence of the EPP. By doing this, and by requiring the dual set to have a mirrored inverse as its dual, he derives the Mirror Principle. This comes from the single requirement that the contextual/uninterpretable features of the nominal domain act as items/interpretable features in the verbal domain (and vice versa). This requirement is not given by a principle, but Vergnaud hints at the possibility of it being an effect of Full Interpretation: if we have a set of grammatical formatives, then they must play both roles – as an item (interpretable) and as a context (uninterpretable).

This requirement about the duality of features, and the dual domains, is the essence of the ICA, and how Vergnaud stitches together the fabric of syntax in an unexpected way – and this requirement allows the Mirror Principle, checking, and the EPP to be unified under one structural principle. This duality also allows for the nominal/verbal domains to be paired, and requires that each domain be the same size as its dual (specifiers are the same size as the constituent with which they are merged), because each feature plays a role in both the nominal and verbal domains.
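Before turning to the two types of Merge, the reversal that derives (111) from (110) can be made fully explicit. The following minimal Python sketch is my illustration (not Vergnaud's code); a pair ("A", "T") stands for <A|T>, with the context on the left and the item on the right, and "0" stands in for the Edge feature ∅.

```python
# A minimal sketch: reversing each item-context pair in a template
# derives its mirrored dual, as in (110)-(111).

nominal_template = [("0", "R"), ("R", "X"), ("X", "A"), ("A", "T"), ("T", "0")]

def mirror(template):
    """Swap context and item in every pair: <P|Q> becomes <Q|P>."""
    return [(item, context) for (context, item) in template]

verbal_template = mirror(nominal_template)
print(verbal_template)
# [('R', '0'), ('X', 'R'), ('A', 'X'), ('T', 'A'), ('0', 'T')]
# i.e. {<R|0>, <X|R>, <A|X>, <T|A>, <0|T>} - the mirrored order of (111)
```

The single swap operation is the whole mechanism: the mirrored morphological order is not stipulated but falls out of exchanging the item and context roles across the dual domains.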
2.3.3. Dual Domains: Two types of Merge

Here, let's return to the idea of parallel/dual nominal and verbal domains, looking to how Vergnaud creates these structures. He introduces the idea that there are two types of Merge: one that creates extended projections (what he calls Head-Merge), and one that creates checking relationships across domains (what he calls EPP-Merge). The role of Sections 5, 6 and 7 of Vergnaud (forthcoming) is to pull together the notion of EPP-Merge and Head-Merge with the idea of the functional duality of a grammatical formative as both an item and a context.

Notice that, to derive the Mirror Principle, Vergnaud required that a grammatical formative that is used as an interpretable item in one domain be used as an uninterpretable context in a paired domain. However, we have necessarily skimmed over the idea that a given grammatical formative plays two roles within a single nominal or verbal domain. Let's return to the given structure for a verbal structure (notice that the template is always the same):

(112) Verbal Template, X=V: {<∅|R>, <R|X>, <X|A>, <A|T>, <T|∅>}

Recall that the feature T is an item in the pair <A|T>, and a context in the pair <T|∅>. This allows for selectional domains to be created, such as the verbal domain above (as we elaborate below).

2.3.4. Categorial Symmetry

In Vergnaud (forthcoming), Vergnaud introduces the principle of Categorial Symmetry. From the notion of Head-Merge, he builds IHC (Iterated Head-Complement) structures to create extended projections, as in the templates above (110-112); from EPP-Merge, he then pairs two IHC structures in parallel. Vergnaud links the idea of parallel IHC structures to the notion of symmetry in syntax from Lasnik & Uriagereka 2005, and proposes a (strong) principle to incorporate this notion of symmetry with that of IHC structures. This is the principle of Categorial Symmetry (CS), which states that all IHC structures are constructed from the same grammatical formatives in the same hierarchical order. Most importantly, the principle of CS implies that a noun phrase and a verb phrase (the parallel IHC structures linked by EPP-Merge) are made up of the same abstract components. This embodies observations in the literature that certain components of the verbal and nominal domains seem to play similar roles, such as aspect in the verbal domain being linked with count/mass distinctions, or Div (cf. Borer 2005), in the nominal domain. Essentially, an IHC structure with an N categorial feature will create a noun phrase, and one with a V categorial feature will create a verb phrase.

Given the IHC structure for a verb phrase set out in Vergnaud (forthcoming), he further shows the template for a verb phrase, where each formative in (112) plays a role as an item (i.e. |X>) and as a context (i.e. <X|). As mentioned earlier, he notates the context with a delta notation, where ΔX is equivalent to <X|, representing the item X as creating a context. Then, to achieve the telescoping relation, each item is paired with a context. In bra-ket notation, the verb template is (R = Root, A = Aspect, T = Tense, ∅ = Edge):

(113) {<∅|R>, <R|X>, <X|A>, <A|T>, <T|∅>}

Here, X is in the context of R (recall X = V or v), Aspect is in the context of X, and Tense is in the context of Aspect. The symbol ∅, as before, represents an Edge feature (so R is found in the context of an 'edge', and an 'edge' is found in the context of the T of an IHC structure). The item/context notation is crucially represented in (113) as a directed edge of the graph, which expresses the headedness/labeling property of constituent structure.

We see that Tense falls in the context of Aspect in (113) as well. Or, synonymously, Tense is the label of the constituent created when Aspect and Tense are Merged. Essentially, as Vergnaud states, headedness is the manifestation of a "mental/neural notion of context" – and this notion is what creates asymmetry in syntax. Vergnaud discusses the mental notion of context in Vergnaud (2009: 33), which is a symmetrical one, and hypothesizes that a simplification creating an asymmetry is useful:

(114) (p/q) ⇔ (q/p), where (x/y) is "x is in the context of y"

Vergnaud hypothesizes that the computation can be simplified by only representing half of this relationship, and argues that this asymmetry is computationally useful: "The syntactic relation of … headedness ultimately derives from the asymmetry of the mental/neural concept. That asymmetry is the basis of many features of the mind, such as the focus/background dichotomy."

A constituent is defined as any bra-ket pair, or context-item pair: <Q|P> = {Q, P} is a constituent. This relates the notion of context to that of constituent, and then we see that this is an asymmetric notion, because if {Q, P} is a constituent, then it must be that either Q is a context or P is a context (either <Q|P> or <P|Q>, but not both). In other words, either P or Q must be the label of a constituent. To allow for recursivity, he adds a recursive rule that links two constituents:

(115) If Q is in the context of P and R is in the context of Q, then R is in the context of the couple {Q, P}.

(116) (Q, ΔP), (R, ΔQ) ⇒ (R, Δ{Q, P})

So we can 'nest' items within constituents, allowing for more complex contexts. Then, the item in the item-context pair is the label/head for the pair, as mentioned above.
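The constituency rules just given can be rendered as a short Python sketch. This is my own loose interpretation of (115)-(116), under the simplifying assumption that a constituent can be modeled as a (head, members) record; neither the function names nor the data layout are Vergnaud's.

```python
# A minimal sketch of constituent formation and the recursive rule (116):
# (Q, ΔP) makes {Q, P} a constituent headed by Q, and nesting lets a new
# item take a whole couple as its context.

def constituent(Q, P):
    """(Q, ΔP): {Q, P} is a constituent, headed/labeled by Q."""
    return {"head": Q, "members": (Q, P)}

def nest(outer, inner):
    """(Q, ΔP), (R, ΔQ) => (R, Δ{Q, P}): R takes the couple as context."""
    R = outer["head"]
    return {"head": R, "members": (R, inner["members"])}

# Building the verbal template bottom-up: X in the context of R,
# A in the context of X, T in the context of A.
xR = constituent("X", "R")              # {X, R}, head X
aX = nest(constituent("A", "X"), xR)    # A in the context of {X, R}
tA = nest(constituent("T", "A"), aX)    # T in the context of {A, {X, R}}

print(tA)
# {'head': 'T', 'members': ('T', ('A', ('X', 'R')))}
# Reading the heads off each pair recovers the telescoped tree of (112).
```

The nesting order is deliberately free in Vergnaud's system: as noted below, the same pairs can be assembled top-down, bottom-up, or in any order, with the same relations encoded.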
Vergnaud puts all of the above axioms into a set of axioms called the Contextuality and Constituency Axioms, repeated here:

(117) Contextuality and Constituency Axioms (CC Axioms)
Let Q and P be two constituents. Then:
a. (Q, ΔP) ⇒ {Q, P} is a constituent
b. {Q, P} is a constituent ⇒ (Q, ΔP) or (P, ΔQ)
c. (Q, ΔP), (R, ΔQ) ⇒ (R, Δ{Q, P})
d. (Q, ΔP) ⇒ Q is the head of the constituent {Q, P}
e. A phrase with label Q is analyzed as a pair {{Q, P}, (Q, ΔP)}.

This set of axioms describes what is known in the literature as Bare Phrase Structure, and derives the intuitive idea of 'family of trees' which we discussed earlier.[15]

[15] As a note, Vergnaud interchanges several equivalent formal notations throughout the paper. For the reader's clarification, here is a list of notations used for items and contexts:

    <X|Y>      bra-ket notation      <context|item>
    (X, ΔY)    delta notation        (item, Δcontext)
    X → Y      directed graph        context → item

Vergnaud also addresses the derivational process in syntax, showing that the item-context pairs given in (113) can be selected top-down, bottom-up, or in any order. The same grammatical relations are encoded either way. As long as certain conditions are met – every ket is paired with a bra that contains a distinct formative, conversely, every bra is paired with a ket that contains a distinct formative, and the mapping is one-to-one – the structure is licit.

Now, synthesizing the discussion of Bare Phrase Structure as creating families of constituent structures with the verb template, we can see that the verb template is actually a family of possible constituent structures (or tree structures) that make up an equivalence class. This is represented by the bra-ket notation, formalized by the notion of how the mathematical interpretation of the domain relates to the notion of a family of constituent structures.

2.4. The ICA and A'-Structure

Under the ICA, A'-structure is separate from A-structure, and is built using a different type of grammatical formative. In short, the formatives that create A'-structure are pairs of grammatical connectives/relators (cf. den Dikken 2006), which 'select' for entire IHC structures/phases, rather than selecting an extended projection (as in Head-Merge). These could be called 'supercategories', as they are of a higher type than the primitive grammatical formatives that make up a phase (i.e. T, Asp, N/V, Root as above). These formatives are features that select for a phase, and reduplicate the phase. Then, deletion patterns will result in movement (A'-movement). The details are below.

This conceptualization of A'-Structure and Generalized Connectives compares to phase-based movement under MP, although without the notion of EPP or feature-driven movement needing to be constrained by stipulations such as the phase heads C, v. As grammatical connectives take only phases, transformations will be limited to that phase alone.

Vergnaud (forthcoming) sets out a working hypothesis of the formatives that make up an IHC structure. A phase will be defined as the pairing of two IHC structures (one in the nominal, and one in the verbal domain), or 'dual domains'. As we have been utilizing in our example above, he proposes that the elements that make up a verb phrase are T, Asp, X, and R, where X = the v/V distinction, and that a noun phrase has T, Asp, X and R as well, with X=N. However, a clause is standardly made up of two verbal phases, a vP and a CP/TP phase.
To do this with IHC structures, Vergnaud proposes that a clause is made up of two IHC structures, with two tenses, two aspects, a verbal v/V feature, and two R (root) features. Each of these IHC structures has a dual nominal projection. Essentially, as in Minimalism, two phases make up a transitive clause. However, the mechanism for linking the lower (vP) phase and the higher (CP) phase remains incomplete. As we discuss below, Vergnaud has a mechanism for linking phases with wh-movement, focus, or other discourse-level properties, but this does not necessarily translate to linking a pair of CP and vP phases to build a transitive clause. As Vergnaud suggests, perhaps the right edge of the higher phase and the left edge of the lower phase are shared, although the formative that would partake in this role is not yet clear. The architecture presented by Vergnaud allows for many working hypotheses about the actual structure of a clause, and about which formatives make up an IHC structure – he uses these formatives for illustrative purposes, and acknowledges that the system will need to be richer than his initial proposal.

Vergnaud (forthcoming) defines a phase under this architecture as the pairing of a constituent with its dual ('mirror' from before, or specifier). So, the pair of verbal-nominal templates, repeated here, constitutes a phase:

(118) Verbal Template and Nominal Template:
{<∅|R>, <R|V>, <V|A>, <A|T>, <T|∅>}
{<∅|R>, <R|N>, <N|A>, <A|T>, <T|∅>}

However, one important function of phases is the edge position, which allows for cyclic movement to occur. Here, Vergnaud extends the architecture further and discusses how a category and its specifier may be linked by higher-order logical connectives that represent discourse-level functions (or functions commonly found in the C-domain). These connectives can only be used with a complete IHC structure, and signal the edge of a phase. There may be one 'focus' per phase, or one set of linkers that connect the nominal and verbal domains. He extends this to wh-movement and complement clause embedding (see chapter 4 of this dissertation for further discussion of complement clauses and an interpretation of Vergnaud's theory). The details are not given explicitly by Vergnaud (forthcoming), but I provide a basic introduction here, from Vergnaud's lecture notes and personal communication (the following is quoted from his notes):

"Human Language (HL) 'quantifiers' are viewed as iterated logical connectives (in line with analyses found in the logical literature, such as in Skolem …, as well as in the linguistic literature, such as Harris …). To illustrate, English every and each arise from the iteration of ∧ (and in English), while English any arises from the iteration of some version of ∨ (or in English). Connectives are intrinsically binary, in the sense that a connective is an indissoluble pair of elements, of the form in (119), giving rise to the structure in (120) [and (121) – added by KMB]:

(119) (K, k)

(120) Kx, ky – x, y phases

(121) K {<∅|R>, <R|V>, <V|A>, <A|T>, <T|∅>} {<∅|R>, <R|N>, <N|A>, <A|T>, <T|∅>}
      k {<∅|R>, <R|V>, <V|A>, <A|T>, <T|∅>} {<∅|R>, <R|N>, <N|A>, <A|T>, <T|∅>}

Thus, a connective is a discontinuous lexical item. We elaborate below. A connective can be symmetrical, with K=k, or asymmetrical, with K≠k. The conjunctive connective and the disjunctive connective are symmetrical. The former is the pair AND … , and … , as in English And Mary screamed and John cried, typically realized as Mary screamed and John cried.
The disjunctive connective is OR … , or … , as in Or Mary screamed or John cried, typically realized as Mary screamed or John cried. An asymmetrical connective is the connective IF … , then … . Another one is the definiteness connective. We slightly depart from Russell 1905 here in identifying definiteness as a particular connective, not [necessarily – KMB] as a particular quantification. The defining property of a connected structure Kx, ky is:

(122) Connectedness is chaining (Cc)
Given the connective (K, k) in the structure Kx, ky, there must exist a constituent ∂ such that Kx and ky each contain a copy of ∂.

In the case of a symmetrical connective, Cc is trivially satisfied by taking ∂ to be K=k. [E.g., from Liao & Vergnaud (forthcoming), there is an (OF, of) connective linking two noun phrases, OF=of. – KMB] In the case of an asymmetrical connective, there must be an independent ∂ that gives rise to a chain spanning the two halves of Kx, ky. [KMB: For example, in a relative clause (see Chapter 5), there is an asymmetrical (D, C) connective which requires a shared noun phrase as its ∂ (the relativized noun). These connectives link two phases in the sense of complete, dual IHC nominal and verbal structures (x and y are variables representing two separate phases). Each contains a nominal and a verbal counterpart.]"

Additionally, Vergnaud's lectures and an appendix to Vergnaud (forthcoming) elaborated and developed the idea of a phase, looking at the equivalence of a label/head of a phrase and its constituent:

(123) Head(K) ↔ K
(124) (v-V see) ↔ (v-V see him)

Vergnaud treats this equivalence as a generalized substitution transformation, which renders Transformation-markers (T-markers in the sense of Chomsky 1957) as a 'system' of phases, each one depending on whether substitution of the entire constituent for the head has occurred (or not). Vergnaud expands the idea of phasal connectedness to adjuncts, and elaborates (as in the lecture notes, above) on wh-movement and phasal connectives.

In this dissertation I explore and interpret further the idea that generalized connectives give rise to phases, and look at generalized substitution transformations through the lens of parallel nominal and verbal domains, rather than treating the verbal projection as 'privileged.'
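The Connectedness-is-chaining condition (122) lends itself to a small executable check. The following Python sketch is my construal of the lecture-note material, representing each phase simply as the set of constituents it contains; the example nouns are hypothetical.

```python
# A minimal sketch of (122): a connective (K, k) links two phases only
# if some constituent d (Vergnaud's ∂) occurs in both halves.

def cc_satisfied(K, k, phase_x, phase_y):
    """Check Connectedness-is-chaining for the structure Kx, ky.

    phase_x, phase_y: sets of constituents contained in each phase.
    """
    if K == k:                       # symmetrical connective: d = K = k
        return True
    shared = phase_x & phase_y       # candidate d's spanning the two halves
    return len(shared) > 0

# Symmetrical: (AND, and) - trivially connected.
print(cc_satisfied("and", "and", {"Mary screamed"}, {"John cried"}))   # True

# Asymmetrical: a (D, C) connective, as in relativization (chapter 5),
# needs a shared noun - here 'book' occurs in both phases.
print(cc_satisfied("D", "C", {"book", "read"}, {"book", "buy"}))       # True
print(cc_satisfied("D", "C", {"pen"}, {"book"}))                        # False
```

Read this way, apparent long-distance movement in relativization reduces to the last two calls: a licit structure is one where the head noun is literally shared across the two phases, and an illicit one is where no such shared constituent exists.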
2.5. The ICA as an Ideal: Certain Shortcomings

There are two shortcomings of Vergnaud (forthcoming) which I focus on in chapters 3-5. First, the ICA is concerned with the formatives within a single phase, and develops parallels between the coupled nominal-verbal domains within an IHC structure (a phase) containing a single nominal argument and its verbal counterpart. To link IHC structures, Vergnaud (forthcoming) takes an 'inner Tense' formative to be shared across the lower (object) and higher (subject) phases of a transitive clause. The T formative of the object phase is taken to be the Root/R formative of the higher phase, presumably acting similarly to v in the standard theory in having both lexical and functional status ('semi-lexical'). Essentially, Vergnaud allows for the verbal domain to continue extending between object and subject phases, as in current theories. This is a different mechanism than that used to link phases via binary grammatical connectives (as above), which uses pairs of logical connectives to link full phases. This predicts that there is a fundamental difference between embedding full clauses and linking smaller phases. However, I argue – in the sense of Williams' (2003) Level Embedding – that clauses come in many different sizes, that the same generalized embedding mechanism is what links any type of phase with another phase, and that there are empirical benefits, particularly in the realm of Control and Raising clauses, that justify using the same generalized embedding mechanism.

Secondly, the phase extension from Vergnaud (forthcoming) follows current theories in privileging the verbal domain. In and of itself, this is not undesirable; however, it is unclear that there is a fundamental principle behind the mechanism which allows the verbal domain, and not the nominal domain, to extend across phases. This is especially apparent within MP when one considers phase boundaries and specifier positions: it is empirically necessary to allow the specifier position of v (generally a 'nominal' domain) to be visible to higher phases, as well as v itself. And yet v (and C) – verbal elements – define phase boundaries. Here, I move away from both Vergnaud (forthcoming) and standard theories, and argue that phase boundaries are defined using both parallel nominal and verbal domains, and that 'nominal complementizers' exist. In some sense, this takes the ideal of the ICA's Principle of Categorial Symmetry and extends it to the "supercategories" that are used to overlap phases – and to build both clause structure and embedded clauses. Chapters 3-5 in this dissertation develop a notion of phase overlap and embedding, using both nominal and verbal domains. Positive consequences of this approach are discussed in chapter 3 regarding Case, chapter 4 with Control/Raising clauses, and chapter 5 with relativization.

2.6. Conclusion

Key elements in Vergnaud (forthcoming) will be utilized in this dissertation; namely, the Principle of Categorial Symmetry, the notion of binary grammatical connective, and the notion of an IHC structure as defining a phase. Then, this dissertation attempts a unification of the D-C-T domain across Control and Relativization, using the following: (1) a formal notion of Case as a complementizer, (2) a notion of noun sharing which allows for phase overlap, and (3) a general definition of 'domain' which allows us to move from graphs at narrow syntax to Phrase-markers at PF and LF (see the Constraint on Phrase-Markers, chapter 5). These three contributions, couched in the primitives of the ICA and Vergnaud (forthcoming), allow for the start of a resolution to the conceptual problems addressed in MP above, with respect to Merge and locality.

Chapter 3
Phrase Structure and Case Complementizers

3.1. Introduction

This chapter lays out the basics of phrase structure argued for in this dissertation, and then proposes a new role for abstract Case as a nominal complementizer. First, a discussion of Case Theory under the Minimalist Program explores the role of Case under Chomsky's Visibility Condition and an Agreement-based model. The purpose of Case is discussed and evaluated. Then, I turn to how lexical-functional relationships are constructed under current theory, and discuss the necessity of economy conditions/relativized minimality as a result of defining domains around lexical-functional structure.

Two core outstanding issues under current theory will be presented. (1) The VP-internal subject hypothesis creates competition among DPs within the domain for raising. Rules of attraction become necessary for the subject to raise, leaving the object in a lower (raised) position.
A similar problem arises with wh-movement, particularly in the TP domain, where attraction must be regulated in order to arrive at correct empirical results for subjacency (see Boskovic 1997, Pesetsky & Torrego 2001 for discussion). The ICA returns to a notion of interwoven lexical-functional relationships (one per phase), which has consequences for this competition and nullifies the need for discussion about attraction of two DPs within one domain. Along with this, difficulties arise with wh-quantification and island domains, which I cannot yet account for.

After this review, I turn to a proposal about how phrase structure is constructed, as a series of phases that introduce one argument/specifier for a set of verbal projections. A single lexical and a single functional relationship are expressed within each phase (Vergnaud forthcoming, Liao 2011, McKinney-Bock & Vergnaud 2010), and the semantics of features found in subject and object phases is set out.

Then I turn to the central proposal of this chapter, which reinterprets Case as a complementizer found in the nominal domain, rather than being housed at the level of T/[Spec, TP] and v/[Spec, vP]. Case is one feature found in a pair of complementizers {D_Kase, C_Komp} (abbreviated {D_K, C_K}). This binary grammatical connective plays a key role in stitching together the fabric of syntax, and in linking subject-object phases in a transitive clause. Some open consequences of this connective are discussed. Then I review how the EPP under the ICA is presented (Vergnaud forthcoming), and I discuss the relationship of the EPP to the proposal about Case in this chapter. Finally, a comparison of phrase structure under TAG, MP and the ICA is given, along with possible conceptual arguments for and against the ICA relative to current theory.

3.2. The Minimalist Program

3.2.1. Case Theory

I address Case in this chapter because it is a feature that is always uninterpretable – somewhat of a linguistic anomaly, given that almost all other features (barring, e.g., the EPP) are interpretable in some context and uninterpretable in others. Additionally, Case doesn't have an overt dual feature occurring in both the nominal and verbal domains. I look to the open question regarding the relationship of Case to the verbal domain, and question why it is necessary, when the purpose of Case has been to describe the distribution of DPs. In some sense, Case remains a linguistic mystery.

Following Vergnaud 1977's Case Filter (see Vergnaud 2008), which states that "every phonetically realized NP must be assigned (abstract) Case" (Chomsky 1995: 111), Chomsky (1995) reduces Case to "a reflection of Spec-Head agreement" between nominals and certain heads that are said to be Case-licensers (Chomsky 1995: 122), and the Case Filter is satisfied only at the interface level. This is a revision of Chomsky (1981), from two levels of structure – Deep (D-)Structure licensing non-structural case, and S-Structure licensing structural Case – to an interface condition. Here, chains are linked at LF for Case-checking purposes, and he revises the Visibility Condition (Chomsky 1986) from a condition on marking arguments to a condition that makes DPs 'active', or available for Agreement operations. The Visibility Condition states that "A chain is visible for theta-marking if it contains a Case position… necessarily, its head, by Last Resort" (Chomsky 1995: 116-119). In other words, the chain becomes available for interpretation and pronunciation as long as it contains a Case position.
Otherwise, the argument cannot be given a θ-role, nor used in the computation. On the more recent view, an unchecked Case feature makes the DP's Φ-features visible for checking the uninterpretable Φ-features of a probe (e.g. uF on T).

Like Chomsky (1981), Chomsky (1995) takes Case to be a condition on a derived/transformed level of representation (S-Structure in 1981, the LF interface in 1995), rather than a condition on narrow syntax. Empirical evidence comes from:

(125) (i) *it seems [Susan to be here]
      (ii) Susan seems to be here

At the level of narrow syntax, both structures should be fine – and it is not until LF that the absence of Case on Susan crashes the derivation, because the two constituents it and Susan do not form a chain at LF.

Case Theory under the Minimalist Program relies on the operation Agree, a long-distance relationship between two items with the same feature. Prior to MP, and within MP, there has been quite a lot of revision in the mechanism of case licensing, where it happens in the grammar, and what phenomena Case is responsible for. For example, the debate about the role of Case in Control structures remains open (Hornstein 1999, Landau 2001, 2004, Grano 2012, a.o.). Quirky Case in Icelandic also raises the question of whether Case itself is responsible for movement, or whether agreement can check Case features instead, making the EPP a separate mechanism.

In Chomsky (1995), Agree is taken to be a complex operation that triggers movement, by which one item with an interpretable feature iF (the goal) enters into a long-distance relationship with a head that has an uninterpretable uF version of that feature (the probe), followed by movement of the goal to the probe. Later, Chomsky (1998) revises this notion to allow Agree to check features without requiring that feature checking be accompanied by movement. He thereby relegates all possible (overt) movement to an EPP feature, a special feature that is separate from Case. Case Theory in MP, then, is independent from the EPP, and therefore from movement.

There are some empirical reasons to believe that Case and the EPP (used to raise arguments to a specifier position, e.g. [Spec, TP]) are separate mechanisms. This is due, in part, to the behavior of morphological case in Icelandic (see Adger 2003 for discussion) as well as the behavior of Exceptional Case Marking (ECM) verbs in English (see Liao 2011 for discussion). In Icelandic, the EPP drives movement of either a dative or a nominative subject to the specifier of TP, depending on what the verb selects (taken to be a lexical property of the verb). Dative, however, is not a structural Case, but rather an inherent case – and so it appears that structural nominative Case is not what drives the raising to the specifier of TP. Before, the EPP had been taken to be a Case-checking mechanism, but the presence of a non-structural case with raising to [Spec, TP] in Icelandic suggests that raising is separate from structural Case.
Consequently, raising-to-object seems to be driven also by a general raising requirement, i.e. the EPP, rather than by structural accusative Case. As a result, Chomsky (1995) revises the mechanism that is responsible for Case checking from movement to Agreement. Previously, movement from a lower position of the DP would check a nominative (NOM) Case feature on T. This is eventually revised to an agreement relationship using Agree. This works for subjects, but leaves the status of accusative Case on objects somewhat unclear. Pesetsky & Torrego (2001) discuss two possibilities for the checking of accusative (ACC) Case on an object. (1) The vP checks the ACC Case feature of the DP in its derived position in some specifier position above the vP. This movement is necessarily covert, as objects are not generally seen above the v position: (127) Bob had caused Mary [Mary to fall over]. (128) *Bob had Mary caused [Mary to fall over]. (2) T could check ACC as well as NOM, with some sort of multiple specifier configuration - akin to multiple wh-movement in Slavic. This option would also involve covert movement for the object in English too, as the object is not seen this high in a clause: (129) *Bob Mary had caused [Mary to fall over]. The underdetermined possibilities of T checking both NOM and ACC, versus T and v playing a similar role in (Case) checking different arguments, brings to light a more general question about Case. What is the relationship of Case to the verbal paradigm, and why do DPs 103 require Case - an abstract notion whose only role is to describe the distribution of DPs? Chomsky’s argument is that Case makes a DP ‘active’ to the computation so the uF Φ-features on the T-head can be checked. In other words, an unchecked Case feature makes the DP’s Φ- features available for checking the uninterpretable Φ-features of a probe. Once Case on the DP has been checked the DP’s Φ-features are no longer visible to the computation (ruling out unwanted structures, etc). In assigning a structural configuration for Case, Chomsky 1995, citing Baker (1988), states that agreement can make chains available to the computation (just as Case does). “Abstract Case should include agreement along with standard Case phenomena. Case is a relation of XP to H, H an X 0 head that assigns or checks the Case of XP. Where the feature appears in both XP and H, we call the relation ‘agreement’; where it appears only on XP, we call it ‘Case’” (Chomsky 1995: 119). At a deeper level, Chomsky is relating Case directly to overt morphological agreement, but without realization of an overt morpheme on the head (most likely verbal in nature). This works, but raises the question about duality in feature (un)interpretability: Case and the EPP are the only uninterpretable features without a dual, interpretable feature – this remains an imperfection in the system. In MP, Chomsky takes Case to be a similar relation as agreement (and later, uses Agree, without movement, to check case.) 3.2.2. Phrase Structure and the Lexical/Functional Distinction The way in that lexical-functional relationships under current theory is represented raises the issue of locality, which I turn to here. In the 1980s-early 1990s, empirical evidence led the literature to propose the VP-internal subject hypothesis (see, e.g. Zagona 1982, Koopman & Sportiche 1985, Kitagawa 1986, Kuroda 1988, Sportiche 1988, among others), which states that the subject of a sentence is generated within the VP instead of in [Spec, TP] (or [Spec, IP] at the 104 time). 
Along with this hypothesis, a theory of lexical and functional categories was developed that situated all lexical categories lower in the clausal structure than functional categories (e.g. the extended projection of Grimshaw 1991). The two work hand-in-hand: the VP-internal subject receives its lexical (θ-)role from the lexical V and then moves to the functional domain, where it receives its functional role(s). Along with this, however, comes the concern that the object must also raise out of the VP for licensing of case and agreement. It was proposed (Chomsky 1995) that the object moves to an agreement phrase, AgrOP, immediately above the lexical projection/VP in order to check Case (in parallel with subject movement to AgrSP, located immediately above TP). This allows both the subject and the object to receive a thematic role from the lexical domain (low), and a functional role from the functional domain (high).

However, the extended projection, which puts all lexical operations below functional ones, creates a domain that always contains multiple DPs. In effect, this architecture introduces competition over which DP is selected to move when. More specifically, how do we prevent any DP argument from moving to some functional projection licensing specific arguments? For example, how do we prevent the object DP from moving to AgrSP, and the subject DP from moving to AgrOP? Something like the following must be ruled out:

(i) [CP [AgrSP DP_OBJECT [TP [AgrOP DP_SUBJECT [VP t_SUBJECT [V' V t_OBJECT]]]]]]

This is a case of nested movement, where the object crosses over the landing position of the subject DP's movement. It should be ruled out, as it gives the wrong empirical result, although it is not clear that this type of dependency is computationally bad. On the other hand, we must allow another type of crossing movement, where the object crosses the base-generated position of the subject, as long as the subject has moved higher:

(ii) [CP [AgrSP DP_SUBJECT [TP [AgrOP DP_OBJECT [VP t_SUBJECT [V' V t_OBJECT]]]]]]

To do this, economy constraints on movement were required to rule out the bad cases of cross-over movement. These constraints are needed crucially because all the arguments/DPs are found within the same domain, creating competition for feature checking.

Similar problems arise with sentences containing multiple wh-words, as well as with the collection of subject/object movement asymmetries discussed in the 1980s and unified under the account in Pesetsky & Torrego (2001), under the umbrella of 'Attract Closest' (Chomsky 1995) interacting with the 'Principle of Minimal Compliance' (Richards 1997). Once again, competition is created by multiple DPs in a single domain. In the case of wh-movement, this happens when a wh-object moves to the edge of the vP phase to be visible to further movement, thereby competing with any wh-subject that is merged later on.

It is this presence of multiple DPs within a single domain (agreement domain, or movement domain) that triggers the need for constraints on which argument may be selected for the agreement relationship. These constraints are not cognitively or linguistically motivated; they are an artifactual necessity of the fact that all lexical relationships occur below functional ones. In the next section, I propose utilizing the ICA (Vergnaud forthcoming), within which a single lexical-functional relationship creates a phase.
An immediate consequence is that only one DP is merged per phase. From this, I show that Vergnaud's system is inherently non-recursive, and that it remains a problem how even transitive clauses are constructed. I then propose that a pair of nominal and verbal complementizers is what links the edges of phases. In returning to a notion of interwoven lexical-functional relationships (one per phase), we obviate the need for attraction of a closest XP within a single domain: only one XP will occur per domain.

3.3. Building Phrase Structure and the role of Case

3.3.1. Primitive Binary Features: Argument Structure

Syntax is taken to be composed of features that come in binary sets, as binary grammatical connectives. Phases are then composed of the Cartesian product of two sets of features (Vergnaud forthcoming, Liao & Vergnaud 2011, McKinney-Bock & Vergnaud 2010), in particular {R, Φ} and {N, V}. The pair {R, Φ} is the lexical-functional pair ({Substantive, Functional} in Liao 2011). The details of clause make-up are simplified under Vergnaud (forthcoming) for illustrative purposes, leaving full clause structure to future revision; Liao 2011 makes significant advances within the structure of modality, aspect and referentiality. Here, I take transitive clauses to be comprised of two separate phases:

(130)  SUBJECT PHASE          OBJECT PHASE
       (V, Φ)  (V, R)         (V, Φ)  (V, R)
       (N, Φ)  (N, R)         (N, Φ)  (N, R)

I interpret these functional-lexical pairs, across the nominal and verbal domains, as follows:

(131)  SUBJECT PHASE          OBJECT PHASE
       T       V_L            v       V
       D_ext   N_ext          D_int   N_int

In the Subject phase, T and D play functional referential roles (see Landau 2004, Liao 2011), where T sets reference to a time and D sets reference for the noun. [Footnote 16: There is a typology of tense features, which I will later refer to as +/-T, or independent/dependent (+T) and anaphoric (-T) tense; so T here does not refer only to a representation of finite, independent tense.] Light Verb (V_L) and N_ext are the lexical items in the subject phase, with N_ext playing the role of a nominal root (for the external argument) and V_L playing the role of contributing a component of meaning to the verbal extended projection similar to that of a light verb (e.g. causation, psychological experience, or change of state): external θ-roles, such as those assigned by verbs like kick, fear, run.

In the Object phase, v and D play the same functional, referential role. For the time being, we can think of v as an existential quantifier over the event variable, with some verbal root denoting a predicate of events (along the lines of, e.g., Diesing 1992). This is analogous to referential D, which plays the role of closing off the individual variable for the nominal root (which denotes a predicate of individuals). However, issues of quantification are more complex, and I do not discuss them here. Then, V and N_int are the verbal root and nominal root for the object phase.

The edges between the four vertices also represent grammatical relationships. The {D, T} and {D, v} relationships are functional, and the {V_L, N} and {V, N} relationships are lexical, i.e. theta (θ-)relationships. The functional, or Φ-, relationships between {D, T} and {D, v} are referential. Each phase has a functional head and a lexical head for both nominal and verbal counterparts. Each edge represents a parallel relationship, labeled as such below, with S representing S-Merge.
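Because a phase is simply the Cartesian product of two binary feature sets, the construction in (130)-(131) can be illustrated directly. The following Python sketch is an illustration only, under my own encoding: the strings and the head dictionaries are shorthand for (130)-(131), not part of the formal system.

    from itertools import product

    # The two binary feature sets (Vergnaud forthcoming):
    CATEGORIAL = ('N', 'V')      # nominal vs. verbal
    SUBSTANCE = ('Phi', 'R')     # functional (Phi) vs. lexical/substantive (R)

    def build_phase():
        """The four vertices of a phase: the Cartesian product of the two
        feature sets, per (130)."""
        return set(product(CATEGORIAL, SUBSTANCE))

    # Interpretation of the vertices as heads, per (131).
    SUBJECT_HEADS = {('V', 'Phi'): 'T', ('V', 'R'): 'V_L',
                     ('N', 'Phi'): 'D_ext', ('N', 'R'): 'N_ext'}
    OBJECT_HEADS = {('V', 'Phi'): 'v', ('V', 'R'): 'V',
                    ('N', 'Phi'): 'D_int', ('N', 'R'): 'N_int'}

    print(sorted(build_phase()))
    print([SUBJECT_HEADS[vtx] for vtx in sorted(build_phase())])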
(132)  SUBJECT PHASE                       OBJECT PHASE

       T -------S------- V_L               v -------S------- V
       |                 |                 |                 |
  (Referential)       (AGT-θ)         (Referential)     (THEME-θ)
       |                 |                 |                 |
       D_ext ----S----- N_ext              D_int ----S----- N_int

Notice here that V_L and v represent two functions that are carried out by the vP in standard theory (or by different little v heads, or by little v and Voice heads), a phrase which has been treated as semi-lexical (van Riemsdijk 1998), as it both assigns Case (ACC) and a θ-role (e.g. AGENT). Similarly, Vergnaud's work also assumes a shared v head between higher and lower phases, though in Vergnaud (forthcoming) it is labeled/characterized as an inner T head (corresponding to the stage-/individual-level predicate distinction). What I do here is split the lexical and functional relationship that v has traditionally had into two heads. [Footnote 17: This makes a prediction for languages with serial verbs. Specifically, each V_L will introduce its own argument. I leave this to future research/revisions.]

Most, if not all, previous approaches to clause structure to my knowledge (including Vergnaud forthcoming, Liao 2011) assume a privileged relationship for the verbal domain, allowing it to project across phases. I diverge from this idea below, and assume that both nominal and verbal domains project across phases in parallel. It remains unclear why the same labeled head would necessarily play both a Case- and a θ-assigning role. Empirically, it does capture Burzio's generalization, although this generalization can also be captured by the selection relationship of two separate heads.

I diverge from this view slightly in the dissertation, and use the idea of a Case Phrase (KP) (Lamontagne and Travis 1987), as a generalized grammatical connective (see chapter 2), to model Case assignment, rather than v. Historically, case-markers develop from the reduction of prepositions (Simpson, class notes, 2008). Non-structural Case in English is still marked with prepositions (e.g. to in the dative). Prepositions and Complementizers play similar roles in language (Kayne 2002), and so I take structural Case to be the reflection of a complementizer in the D-domain. [Footnote 18: This line of reasoning would predict that historical change of Ps to case-markers would reflect a shift from being spelled out as a verbal category in the V-domain to a nominal category in the N-domain; I leave this to future research.]

To my knowledge, the proposal to link Case to phrase structure building has not been made before. Instead of a feature on D that facilitates Φ-feature checking, I reanalyze Case as a nominal complementizer, whose responsibility (shared with the verbal complementizer) is to allow the phase to combine with another phase, and so build phrase structure. This gives a new role to Case, to which I return below.

Note that the structure of Subject and Object phases is still somewhat of a simplification of clause structure; namely, the quantificational properties of modality and viewpoint aspect are absent (but see Liao 2011 for one possible account; see chapter 4 and McKinney-Bock & Vergnaud (2010) for another possibility related to generalized quantification). For the purposes of this dissertation, though, the above phases will serve to illustrate the proposals surrounding relativization and embedding. For a richer theory of clause make-up, see Liao 2011, who adds aspectual and modal heads. Despite Liao's elegant analysis of modality and aspect within his model, I remain neutral on the placement of these (primarily semantic) elements, as they involve quantification (see chapter 4). Cf. Williams (2003) and McKinney-Bock & Vergnaud (2010): quantification may involve a more complex reduplication taking both DPs and VPs as reduplicated arguments, rather than being expressed by functional heads in and of themselves. I am concerned, in particular, with the Modality head; Aspect may be defined within the functional structure, as a featural distinction rather than a quantificational property.
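The edge labeling in (132) can be made concrete in the same style. Again, a sketch under my own encoding (the label strings are mine); each phase is a dictionary from (undirected) edges to the relation they instantiate:

    # Edges of the subject and object phases, per (132): the vertical
    # checking edges are referential (D-T) or thematic (N-V), and the
    # horizontal selection edges are S-Merge (the D-N and T-V hierarchies).
    SUBJECT_PHASE = {
        frozenset({'D_ext', 'T'}):     'referential',
        frozenset({'N_ext', 'V_L'}):   'AGT-theta',
        frozenset({'D_ext', 'N_ext'}): 'S-Merge',
        frozenset({'T', 'V_L'}):       'S-Merge',
    }
    OBJECT_PHASE = {
        frozenset({'D_int', 'v'}):     'referential',
        frozenset({'N_int', 'V'}):     'THEME-theta',
        frozenset({'D_int', 'N_int'}): 'S-Merge',
        frozenset({'v', 'V'}):         'S-Merge',
    }

    def relations_of(head, phase):
        """Every grammatical relation a head enters into within one phase."""
        return [(sorted(edge), label) for edge, label in phase.items()
                if head in edge]

    print(relations_of('D_ext', SUBJECT_PHASE))
    # D_ext: referential with T, S-Merge with N_ext -- and nothing else,
    # which is the point: one DP per phase, so no competition.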
From this structure, a question arises: how does one link the phase that contains the subject with the phase that contains the object? Each phase serves as an independent VP-shell (Larson 1988), and without using the extended projection to link phases, each one is a (non-recursive) syntactic computation on its own.

3.3.2. Embedding in the Literature

I follow Williams' (2003) notion of small clauses, where clauses are built in parallel, and embedding happens when one clause continues growing and the other stops being built (it embeds). However, there isn't a clear notion of what it means to embed in his system; it is an operation that remains unspecified. Under the ICA, embedding is also not clearly defined: Vergnaud assumes that the verbal domain is privileged, and links the lower phase of a clause to the higher phase. Following the ICA, Liao (2011: 158) defines clausal embedding as an extension of the verbal projection: the pair {k, K} = T_embed, which essentially allows two syntactic objects (what I am calling 'phases' here) to be linked through the verbal domain. Liao links the CP object to the TP object, and the T_embed connective creates a relationship between C and T:

(133)  [_C C T]

Extending the verbal domain, and allowing various DP 'nominal domains' to create relations with various elements of the extended projection, is essentially what the Head-Spec relationship does in Minimalism, and it continues into Vergnaud's and Liao's work as well. Here, I take a different route, and formalize embedding as generalized sharing. The key conjecture is:

(134) Embedding Conjecture (to be revised in chapter 4)
      To embed is to share a constituent across clauses

This is built off of Vergnaud's idea that nominal and verbal phases are built in parallel, and also off of Williams' (2003) idea that embedded and embedding clauses are built in parallel, with embedding occurring at the point where one clause is 'finished'. The smaller clause is the embedded one. This idea is developed further in chapter 4, which takes a first look at clausal embedding. However, it applies here as well at the level of building clause structure, and linking subject and object phases.

In the following sections, I illustrate how the phase-linking via {D_K, C_K} works. I present a full analysis of a matrix indicative clause (started above). Then, in chapter 4, I discuss how the conjecture about embedding presented here can capture Landau's (2004) thesis that Control depends on the C-T relationship as well as on the controlee, and Pesetsky & Torrego's (2001) and Alexiadou & Anagnostopoulou's (1998) theses regarding the close relationship between D and T. In chapter 4, I present analyses of embedded indicative sentences, embedded control predicates and embedded raising predicates. I note here that these are pre-theoretical structural descriptions that will hopefully coalesce into structural generalizations under the ICA, and eventually realize a general theory of embedding (but this is currently outside the scope of chapters 3-4).
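Before turning to the proposal, note that the Embedding Conjecture in (134) can be sketched as an operation on graphs: two clause graphs are built in parallel and then simply identified at a shared vertex. The sketch below is purely illustrative; the clause encoding and the names (embed, the C_emb pivot) are mine, and are far cruder than the structures developed in chapter 4.

    def embed(matrix, embedded, pivot):
        """Link two clause graphs by identifying them at a shared
        constituent. Each clause is a set of frozenset edges over head
        labels, and `pivot` must be a vertex of both. The union is one
        graph in which the pivot connects the two clauses -- no copying,
        no indexing."""
        vertices = lambda graph: {v for edge in graph for v in edge}
        assert pivot in vertices(matrix) and pivot in vertices(embedded)
        return matrix | embedded

    matrix = {frozenset({'T1', 'V1'}), frozenset({'V1', 'C_emb'})}
    lower = {frozenset({'C_emb', 'T2'}), frozenset({'T2', 'V2'})}
    print(sorted(map(sorted, embed(matrix, lower, 'C_emb'))))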
3.3.3. Proposal: Linking Phases through the 'K' ('Komplementizer')-Domain

I propose a binary grammatical connective whose sole responsibility is to link phases:

(135)  {D_K (Kase), C_K (Komp)}

In a phase, the verbal and nominal domains receive a higher-order analysis as a phase, which Merges with a 'supercategory' (Vergnaud forthcoming). The phase is, essentially, treated as one object rather than as separate features. Vergnaud (forthcoming) treats focus and wh-movement as binary grammatical connectives that take a supercategory as a sister, rather than entering into a single grammatical relationship with another feature, as we saw when building Subject and Object phases. Here I apply the idea of a binary connective, previously suggested for A'-phenomena, to Case and phrase structure, phenomena of the A-type.

Phase P: {F_1, F_2} × {G_1, G_2}

(136)  PHASE
       (F_1, G_1)  (F_2, G_1)
       (F_1, G_2)  (F_2, G_2)

'Supercategory': P × {wh, WH}

(137)  (wh, (F_1, G_1))  (wh, (F_2, G_1))    (WH, (F_1, G_1))  (WH, (F_2, G_1))
       (wh, (F_1, G_2))  (wh, (F_2, G_2))    (WH, (F_1, G_2))  (WH, (F_2, G_2))

What the connective {wh, WH} does is take a phase and reduplicate it (Vergnaud p.c., lecture notes; see chapter 2 of this dissertation). Vergnaud proposes to extend this type of feature pair (the {wh, WH}) to all quantification, although a full implementation of this remains to be tested (but see chapter 4 for further developments). I propose that {D_K, C_K} plays a role similar, if not identical, to that of {wh, WH}: it takes a phase as its complement, and reduplicates it. Let's start with an object phase:

(138)  OBJECT PHASE
       v       V
       D_int   N_int

Now, the object phase merges with {D_K, C_K}:

(139)  'Supercategory': Object Phase × {D_K, C_K}
       (D_K, v)      (D_K, V)        (C_K, v)      (C_K, V)
       (D_K, D_int)  (D_K, N_int)    (C_K, D_int)  (C_K, N_int)

In doing this, I assume that the mechanism responsible for Case checking, D_K, is also responsible for acting as an edge feature, connecting phases. Vergnaud (2007) proposes a generalized attachment transformation (GAT), with a pivot, which is a shared constituent across two phases. GAT orders the Phrase-markers in terms of subordination. GAT is, however, quite powerful, and Vergnaud only begins to address which forms of GAT are seen in grammar, and how they are labeled. He defines a phase P and head H, with H' the projection of H. From this, the phase is defined as the maximal tree of H' made up of all constituents which are contained within H' and not dominated by another phase within H'. All specifiers and adjuncts are included. His illustration is of the structure in (140), analyzed as in (141):

(140)  (_C that (_T Past ((she) (_v-V see him))))

(141)  (i)  Phase I:  (_v-V see him)
       (ii) Phase II: (_C that (_T Past (_v-V (she) see)))

[Footnote 19: Note that the object is not present in Phase II; this is because only v links the higher and lower phases in Vergnaud (2007). This is the privileging of the verbal domain.]

This is a simplified illustration in the sense that Vergnaud is only dealing with linearized, Merged lexical items (for example, both EPP- and Head-Merge have already occurred in see him, with the linear order [v-V D-N]). The generalized binary connective {D_K, C_K}, much like GAT above, has a pivot that allows the lower and higher phases to 'see' each other. In this case, I take reduplication of the entire object phase (containing {D_int, N_int, v, V}) to be the pivot, using D_K/C_K. [Footnote 20: If one were to object to this, an alternative would be to do something similar to (141ii), but allowing the D from the object DP to also be visible. This revised Phase II would look like: (ii) Phase II: (_C that (_T Past (_v-V (she) see the))). The decision between this approach and the one I assume above remains to be investigated.]

Here is a graphical representation of phasal connection, taken from Vergnaud (2007) for wh-movement.
Here, the arrows represent the time course (linearization), so the lines would be read first from left to right, then bottom to top. The connective {or, or} which links phases is represented at the beginning of each tier on the graph, and the phases are represented immediately after:

(142)  or   X_1 Y_1 wh-Z
       or   X_2 Y_2 X_1 Y_1 wh-Z
       or   X_3 Y_3 X_2 Y_2 X_1 Y_1 wh-Z
       Etc.
       or   X_{i+1} Y_{i+1} X_i Y_i ............ Y_1 wh-Z
       Etc.

Here is a representation of {D_K, C_K} linking a Subject and an Object phase. (I have simplified the cube by representing the nominal and verbal domains (the horizontal lines) as phrases, where [TP DP_ext] is a shorthand representation for the subject phase {D_ext, N_ext, T, V_L}, and [vP DP_int] is shorthand for the object phase. The cube in (139) is represented by the reduplication of [vP DP_int] across the D_K and C_K tiers.) Notice that this treats the vP (object) phase as both nominal and verbal: Kase, a nominal head, is assigned to both domains in the first row. It also treats the vP phase as verbal under Komp, a 'verbal' head. [Footnote 21: This could be instrumental in two empirical domains: one possible empirical domain would be to turn to the languages which mark clausal complements with case, and the second would be to turn to Hartman (2012), the most recent in a long line of literature claiming that sentential CPs are nominal in nature.] [Footnote 22: One problem that arises is that, by representing {D_K, C_K} as full reduplication, we are now re-introducing the object back into the subject domain.]

(143)  D_K:  [vP DP_int]
       C_K:  [TP DP_ext]  [vP DP_int]

These two phases set out the functional and lexical properties of the clause: each one introduces one θ-relationship (introduces an argument), and has the function of situating the reference of the participant and the event. The {D_K, C_K} pair of connectives in (143) allows the matrix clause to be stitched together using the complementizer/case connective. This can alternatively be notated with a set of diagrams as in the phases above, such as (139), as each element has a 'coordinate' that contains either a D_K or a C_K. I will put the heads in a coordinate with either D_K or C_K, notating the nodes as, for example, (C_K, v) and (D_K, v):

(144)  Subject phase, C_K layer:  (C_K, T)      (C_K, V_L)
                                  (C_K, D_ext)  (C_K, N_ext)
       Object phase, D_K layer:   (D_K, v)      (D_K, V)
                                  (D_K, D_int)  (D_K, N_int)
       Object phase, C_K layer:   (C_K, v)      (C_K, V)
                                  (C_K, D_int)  (C_K, N_int)

In (144), the D_K coordinates apply to the object phase, and the C_K coordinates link the object phase to the subject phase. There is also a relationship between the D_K and C_K copies of [vP DP_int] (the reduplication is a result of the object phase being the pivot, shared across {D_K, C_K}), the same as in (139):

(145)  (D_K, v)      (D_K, V)        (C_K, v)      (C_K, V)
       (D_K, D_int)  (D_K, N_int)    (C_K, D_int)  (C_K, N_int)
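The reduplicative effect of a binary connective, as in (136)-(137) and (139)/(144)-(145), is again just a Cartesian product, this time of the connective with an entire phase. A minimal sketch, with my own encoding:

    from itertools import product

    OBJECT_PHASE = ('v', 'V', 'D_int', 'N_int')

    def apply_connective(connective, phase):
        """Reduplicate a phase across a binary connective, yielding the
        'supercategory': one coordinate (c, head) per connective member,
        per (139)."""
        return [(c, head) for c, head in product(connective, phase)]

    # The D_K tier and the C_K tier of (139)/(145):
    for coordinate in apply_connective(('D_K', 'C_K'), OBJECT_PHASE):
        print(coordinate)

The same function covers Vergnaud's {wh, WH} connective, e.g. apply_connective(('wh', 'WH'), phase), which is the sense in which {D_K, C_K} plays a role parallel to wh-quantification.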
One could stitch together the two cubes above to create a full object. Notice that the C_K cube in (146) represents a relationship between the internal and external DPs, shown by the marked edges below:

(146)  (C_K, T)      (C_K, V_L)      (C_K, v)      (C_K, V)
       (C_K, D_ext)  (C_K, N_ext)    (C_K, D_int)  (C_K, N_int)

       [marked edges: (C_K, D_ext)---(C_K, D_int) and (C_K, N_ext)---(C_K, N_int)]

This relationship remains superficial in this thesis. There are two possible motivations for it, although neither is a knockdown argument (and, to my knowledge, there is no argumentation in the field that there is a relationship between the Ds and Ns of the external and internal arguments). First, in the case of Romance reflexives, the reflexive could be filling in one of the two argument slots, reflexivizing the predicate. Second, the person-case constraint (Bonet 1994, 2008; Bhatt & Simik 2009; Walkow 2010) shows interaction of arguments when they appear as clitics. For example, first- and second-person direct object clitics may be banned in the presence of indirect object clitics. This agreement paradigm could result across the relationship highlighted in (146).

At any rate, the phases are then stitched together using {D_K, C_K}, with the pivot being the existing phase. It is in this way that transitive clauses are built: once an argument is introduced by a section of the extended verbal projection, with both a functional and a lexical component in both the N and V domains, the (generalized) phase is complete, and then the role of the complementizer/case domain is to link these generalized phases with higher phases. I crucially depart here from the notion that the verbal domain is privileged over the nominal domain in linking phases.

3.3.3.1. Why might we have Case here? A possible world without case.

This {D_K, C_K} pair allows phases to be 'visible' to further computation and to higher phases. It essentially stitches together the fabric of syntax, and allows for the combination of multiple phases in building phrase structure. In some way (still somewhat weakly), it derives a notion of visibility that isn't dependent on head-specific stipulations.

3.3.3.2. A Note: Structural vs. Inherent Case

The above assigns Structural Case. For Inherent Case, a similar set of connectives will be assigned, {p, P}, but the connectives will be symmetric (i.e. {of, OF}, consisting of the same preposition) instead of asymmetric (a Det and a Comp, D_K and C_K) (see Liao 2011, Liao & Vergnaud 2010 for the {of, OF} connective in the nominal domain).

3.3.4. Phrase Structure, and Spell-Out

Let's assume the simplified phase structure presented above and in McKinney-Bock & Vergnaud 2010. I'll limit this chapter to the higher (subject) phase of a clause, again simplifying away things like aspect and number. I continue to take syntactic configurations to be governed by basic symmetry principles (cf. Lasnik and Uriagereka 2005, McKinney-Bock & Vergnaud 2010, Liao 2011, Vergnaud forthcoming). Thus, there is a one-to-one correspondence between the constituents in the specifier of X and the constituents in X. The assumed categories present are D_ext, N_ext, T, V_L, where the checking relationships (cf. Chomsky 1995), the nominal-verbal pairs, are D_ext-T and N_ext-V_L. Then, the selectional relationships, or extended projection (Grimshaw 1991), are the nominal hierarchy, D_ext-N_ext, and the verbal hierarchy, T-V_L.
Each of these categories is represented by a vertex on a graph, and each pair that occurs in a grammatical relationship (either checking or selection) is represented by an edge between the two vertices on the graph:

(147)  SUBJECT PHASE
       T --------- V_L
       |           |
       D_ext ----- N_ext

We are missing a notion of head, or projection, as in the labels of BPS. This becomes important at the interfaces, when classical Phrase-markers are read from these graphs. So, a head/label is represented by a directed edge, with the arrow facing the head of the grammatical relationship:

(148)  SUBJECT PHASE
       T <-------- V_L
       ^           ^
       D_ext <---- N_ext

This idea can be generalized. As discussed, a derivation is represented as a graph with directed edges. The edge (f_i, f_j)_x, with end points/vertices f_i and f_j, f_i and f_j formatives, represents the x-merging of f_i and f_j, x = S (selection) or C (checking):

(149)  f_i ---x--- f_j

I call a derivational graph an M-graph ('M' for 'Merge'). The notion of M-graph is akin to that of T-marker (cf. Chomsky 1975), where the transformational (here, derivational) 'history' is shown as a representation. Assuming labeling, an M-graph is a directed graph, as before. For example, the one-edged graph in (149) should be oriented as shown in (150), if f_j is the head of (f_i, f_j)_x:

(150)  f_i ---x--> f_j

If x in (150) is C (checking), f_i is in the relation Specifier-of to f_j:

(151)  f_i ---C--> f_j

Checking is itself a symmetrical relation, represented in chapter 3 without directed graphs (Vergnaud forthcoming, Liao 2011). Specifier-of is just the relation that arises when two formatives in a checking relation are ordered by labeling/headedness, as shown in (151). Again, via VP shells, arguments are uniformly treated as specifiers (cf. Bowers 1993, Hale & Keyser 1997, Larson 1988, Lin 2001, Megerdoomian 2002, Liao 2009, Liao & Vergnaud 2010, Vergnaud 2009, forthcoming), including objects (cf. Borer 2005, Schein 2010, with arguments regarding the introduction of the object through a different projection than the verbal root), and shells are linked via the complementizer domain, {D_K, C_K}.

3.3.5. To Phrase-markers (review from chapter 1)

Phrase-markers (more restricted graphs, used here at the interfaces) can be read from the M-graphs at narrow syntax. If one were to construct a tree based on Merge of (X, Y), with head projection/labeling, that tree (or classical Phrase-marker) could be 'read' from a graph as follows, with Y being the head of X:

(152)-(157)  [diagrams: one- and two-edge M-graphs over X, Y, Z, paired with the classical Phrase-markers read from them; e.g. the directed edge X --> Y is read as the tree [_Y X Y], adding Z --> Y yields [_Y Z [_Y X Y]], and so on]

Essentially, graphs are more abstract representations of a set of possible trees/Phrase-markers that are interpretable at the interfaces.

Now, if, as has been assumed in Vergnaud (forthcoming), Merge is restricted to applying to pairs of formatives/features, with the merging of non-terminals arising from headedness/labeling, then Merge must allow for overlapping applications. For example, in order to generate the X-bar schema, to allow heads to project multiple times and generate a phrase with both a complement and a specifier for the head, we must allow two applications of Merge to that head: one to Merge the complement to the head and another to Merge the specifier. This is synonymous, now, with being in multiple grammatical relationships. Immediate examples of this occur in chapters 3 and 4, where pivots overlap higher and lower phases to create transitive clauses, or to embed clauses.
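An M-graph, as just defined, is nothing more than a directed graph whose edges are typed by the kind of Merge. A minimal sketch (the class and method names are mine):

    class MGraph:
        """A derivational graph: vertices are formatives; each directed
        edge (dependent -> head) is typed 'S' (selection) or 'C' (checking),
        with the arrow facing the head, per (150)."""

        def __init__(self):
            self.edges = set()          # triples (dependent, head, type)

        def merge(self, dependent, head, merge_type):
            assert merge_type in ('S', 'C')
            self.edges.add((dependent, head, merge_type))

        def specifiers_of(self, head):
            """Formatives in the Specifier-of relation to `head`: those
            checking-merged with `head` as the label, per (151)."""
            return {d for d, h, t in self.edges if h == head and t == 'C'}

    # The subject-phase M-graph in (148):
    g = MGraph()
    g.merge('D_ext', 'T', 'C')       # checking: D_ext specifier of T
    g.merge('N_ext', 'V_L', 'C')     # checking: N_ext specifier of V_L
    g.merge('N_ext', 'D_ext', 'S')   # selection: nominal hierarchy
    g.merge('V_L', 'T', 'S')         # selection: verbal hierarchy
    print(g.specifiers_of('T'))      # {'D_ext'}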
If, however, we allow unbounded Merge to apply at the level where Phrase-markers are 'read' and used at the interfaces, then an issue arises. To avoid this issue, to which I return shortly, a condition on Phrase-markers is required. The following condition describes the standard workings of such derivations:

(158) Given two applications of Merge to two distinct pairs of formatives {f_i, f_j} and {f_i, f_k} sharing the element f_i, f_i must be the head/label in at least one of the relations Merge(f_i, f_j) and Merge(f_i, f_k).

This translates into the following condition on Phrase-markers:

(159) Condition on Phrase-markers
      Let P be some classical Phrase-marker and let (f_i, f_j), (f_i, f_k), with f_i, f_j, f_k distinct formatives in P, be a pair of grammatical relations in P which share the formative f_i. At least one of the two relations is labeled/headed by f_i.

[Footnote 23: Recall that we are creating sets of ordered pairs with Merge, using a Cartesian product, and so the pairs {X, Y} and {A, B} that Merge will give rise to the set {(X, A), (X, B), (Y, A), (Y, B)}. Further applications of Merge will create more complicated ordered pairs, and I am simplifying the notation so that (X, A) = some grammatical formative, e.g. T, so that the three coordinates are represented as (T, C) instead of the more complex ((X, A), C) that is really occurring. Importantly, the notation (X, Y) denotes a grammatical relationship that arose from some application of Merge, while {X, Y} is a set of features.]

A classical Phrase-marker P, then, is a graph with the following two restrictions:

(160) (i) P is a tree (in the graph-theoretic sense: a simple graph without cycles). [Footnote 24: As discussed in chapter 2, the definition is that of standard graph theory; see, e.g., Balakrishnan & Ranganathan (2000). A 'cycle,' following graph theory, is created when one can trace a 'path' through the edges of the graph, beginning at some vertex and ending at that same vertex (if no vertices are repeated in this path, this is sometimes called a 'simple cycle').]
      (ii) P obeys the condition in (159) immediately above.

I briefly return to the idea that, due to the head/label relation, a classical Phrase-marker will in general admit more than one representation. To illustrate, the tree in (161), with geometric heads (squares, circles) projecting in the way just indicated above, could also be drawn as in (162). The circle represents one formative, and the square represents another formative.

(161)/(162)  [two equivalent tree drawings over nodes 1-4: node 1, labeled with the circle, dominates node 2 (the circle) and node 3 (the square); node 3 dominates node 4 (the square) and further structure; (162) differs from (161) only in which of the equivalently labeled nodes is drawn as projecting]

Here, the root node 1 and the leaf node 2, both labeled with the circle, are the same formative. Note that I am not talking about the syntactic object (cf. Chomsky 1995) that is created; rather, I am talking about the tree representation, which we draw using only the labels of the given objects. If we only take into account labeling, then we can pursue the following discussion. The reason the circle labels both nodes 1 and 2 is that it is the head/label of the phrase [circle square]; in the Merge(circle, square) relationship, the circle is the head, and the circle labels the phrase. Similarly, node 3 and leaf node 4 are also equivalently labeled, as a square, in the Merge(square, ...) relationship. Returning to Chomsky 1995 (p. 244), I do not introduce indexing to distinguish the Merged square head (node 4) from its projection (node 3). Because of this, we could technically 'project' from the leaf node 4, rather than use the more typical drawing where we project from node 3.
The two drawings are equivalent, because we don't distinguish the projecting label from its head. If we were to take into account the syntactic objects that are created via Merge in BPS, then the nodes are not equivalent: while labels are equivalent, the constituent that is labeled is not part of the equivalence class. Put another way, in (161) and (162) node 2 is a minimal projection and node 1 is a maximal projection (and not minimal); the only sense in which the two structures are equivalent is for feature-checking purposes, where, e.g., node 2 wants to check a feature of node 4. In the MP, feature-checking (including both checking and selectional features) is done by merging with a head or a projection of the head, which is both not a 'minimal' system and also creates the issue discussed above, in that the label is not the same thing as the category/projection itself. Here, under the ICA, checking features happens directly: labels and projections are the same thing.

Similarly, the X-bar diagram in (163) could be drawn as in (164):

(163)/(164)  [two equivalent tree drawings over nodes 1-5: the circle is Merged with two separate items, nodes 2 and 5, and is the head of both Merge relationships; (163) is the 'typical' X-bar drawing, while (164) projects from the other equivalently labeled node]

Here, the circle is Merged with two separate items, nodes 2 and 5, but the circle is the head of both Merge relationships. We see it projecting in (163) as a more 'typical' X-bar diagram, but (164) is no different: it still shows the circle being the head of the Merge relationships with nodes 2 and 5. Another way of describing this is that nodes 3 and 4 are equivalent, so we can draw representations projecting from one or the other without a difference in representation. In what follows, I only use the more standard representations in (161) and (163), keeping in mind that there are equivalent representations.

Returning to the condition on Phrase-markers, we rule out Phrase-markers containing the following:

(165)  *(X, Y), (Y, Z), with Y heading neither relation:

           X       Z
            \     /
             Y

This configuration, common in the literature on Multidominance, creates problems for linearization. The constraint on Phrase-markers proposed above rules out this configuration for P-markers used at the interfaces, and so avoids the linearization problem. As this configuration is typically used for arguments, we also run into the semantic problem at LF that we cannot apply Y to both X and Z, and so we run into problems with argument saturation. For this reason, we restrict Phrase-markers so as not to have this configuration. However, we do see this configuration arise in M-graphs, and it is natural for the linguistic relationships we observe. It will become evident that the condition we propose is useful, and natural, for constraining certain P-markers with displaced items such as the head of a relative clause. I return to this in the analysis of split-antecedent relative clauses in chapter 5.

With the above conditions defined, we return to the original M-graph that depicted a subject phase:

(166)  SUBJECT PHASE
       T <-------- V_L
       ^           ^
       D_ext <---- N_ext

From here on out, I simplify the notation of the set of vertices {T, V_L, D_ext, N_ext} to the set {T, V, D, N}, keeping in mind that the phases will be subject phases consistently throughout, unless otherwise noted.

(167)  SUBJECT PHASE
       T <--- V
       ^      ^
       D <--- N

We derive two maximal (using all vertices) Phrase-markers from this M-graph. Notice that the edge between V and N will not be contained in the same Phrase-marker as the edge between D and N, because that would be ruled out by the Condition on Phrase-markers.
Then, the two possible maximal Phrase-markers are:

(168)  [_T [_D D N] [_T T V]]

(169)  [_T [_T T D] [_V V N]]

We do not derive a Phrase-marker containing all four edges, as this creates problems for linearization and interpretation, as discussed above. Were one to ignore this condition, a Phrase-marker could arise from the M-graph containing all four edges, and could be depicted as follows:

(170)  T ----- V
       |       |
       D ----- N     (all four edges retained)

It would necessarily contain a cycle, and this is not an object that is useful at the interfaces. Of the valid Phrase-markers in (168) and (169), a parameter across languages determines which P-marker is used at PF. English uses the first, and, developing further the ideas of Alexiadou & Anagnostopoulou 1998, I suggest that Romance uses the second. Vergnaud (forthcoming) suggests that polysynthetic languages utilize this second P-marker as well, along the lines of Baker 1996.

In the following sections, I demonstrate that the Condition on Phrase-markers is highly useful for displacement/long-distance relationships, and allows us to constrain narrow syntax to only local, immediate grammatical relationships. Long-distance relationships are only apparent, and arise only in the objects used at the interfaces, which allows for linear movement at PF and interpretive movement (i.e. Quantifier Raising) at LF.

3.3.5.1. Phases and Cyclic Spell-out

Earlier, I illustrated that the typical vP/CP phases arise as the result of a Cartesian product of linguistic features, and that a condition on Phrase-markers gives rise to two possible Phrase-markers which are argued to be used parametrically across languages (at the interfaces). In this section, I repeat some argumentation from chapter 2, as it is helpful for a discussion of the long-distance relationships present in relativization and split-antecedent relative clauses (SARC). I provide a possible extension of this condition in terms of an informal condition on Spell-out, which begins to generalize the idea of parameterization in Phrase-markers.

(171) Condition on Spell-out (informal):
      Spell-out goes by pairs of parallel pairs of features (edges or planes) that exhaust clause structure.

[Footnote 25: In the future, a useful way of defining domains or cycles in syntax may be by feature-sharing. A domain could be defined as a set of nodes/vertices on a graph that share a feature, which would possibly begin to motivate the informal Condition on Spell-out that I am using here. In this case, the structure of subject/object phases may have to be modified (as they do not currently share a single feature).]

The advantage of having Spell-out proceed in this way is that it allows us to avoid introducing the operations Move and Agree into the system, and to stick only to (local) Merge. In BPS, defining long-distance grammatical relationships using Move/Agree requires an additional mechanism of indexing, as well as spanning a stipulated domain of a tree searching for a Probe/Goal relationship. When we have Spell-out proceed in the way described above, we find that the parametric choices often match the in-situ or moved position of constituents. In the analysis of split-antecedent relative clauses in chapter 5, we will see that it gives rise to several possible (and empirically attested) coordinated structures.

We will see that this Condition on Spell-out works nicely with coordination and also (as we have seen) with subject-phase and object-phase Phrase-markers. One drawback is that it does not account for relativization, so to account for split-antecedent relative clauses the original Condition on Phrase-markers is used, as it makes the correct predictions. Future work will attempt to explore whether the Condition on Phrase-markers, used with relativization, and the Condition on Spell-out, used in clause structure and coordination, can be unified, or whether they are truly different phenomena. This requires an in-depth exploration of wh-movement and islandhood.
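The Condition on Phrase-markers in (159)-(160) is straightforward to operationalize: a candidate Phrase-marker is a subset of the M-graph's edges that is acyclic and in which any formative entering two relations heads at least one of them. The following sketch (encoding mine, with headedness as in (148)) recovers exactly the two maximal Phrase-markers (168) and (169) from the M-graph in (167):

    from itertools import combinations

    # Directed edges (dependent, head) of the simplified subject-phase
    # M-graph (167).
    EDGES = [('N', 'D'), ('V', 'T'), ('D', 'T'), ('N', 'V')]

    def obeys_condition(edges):
        """(159): any formative shared by two relations heads at least
        one of them."""
        for (d1, h1), (d2, h2) in combinations(edges, 2):
            for f in {d1, h1} & {d2, h2}:
                if f != h1 and f != h2:
                    return False
        return True

    def acyclic(edges):
        """(160i): the edges, read undirected, contain no cycle
        (union-find check)."""
        parent = {}
        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                x = parent[x]
            return x
        for a, b in edges:
            ra, rb = find(a), find(b)
            if ra == rb:
                return False
            parent[ra] = rb
        return True

    maximal = [pm for pm in combinations(EDGES, 3)
               if obeys_condition(pm) and acyclic(pm)]
    print(maximal)
    # Exactly two survive: {N-D, V-T, D-T}, cf. (168), and
    # {V-T, D-T, N-V}, cf. (169). Any subset pairing N-D with N-V is
    # excluded, since N would head neither relation.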
In the Minimalist Program, Chomsky argues that any mechanism in the computational system must have empirical motivation. Chomsky (1995: 225) proposes the Inclusiveness Condition: the output of the system may contain nothing beyond its input. If we follow Inclusiveness, we cannot have traces left by movement, or indices left by movement (see chapter 1). However, even in a copy-and-deletion approach to movement, we have to be able to distinguish copies that are pronounced from copies that aren't pronounced. We also have to be able to recognize that copies are indeed copies: if we simply copy and Re-Merge some item B without indexing it, there is no way to ensure that B is the same item as B without some mechanism indexing the first copy of B to the second. This is a problem of how to deal with multiple occurrences of items in syntax. [Footnote 26: Alternatively, indexing may not be needed if you have a different mechanism, cf. something like Relativized Minimality, that takes long-distance syntactic relationships ('chunks' or domains of the tree) into account. More specifically, if the structure is a multiple wh-question, like Who who likes who, the first two whos form a chain. The head of the chain is in a relationship with C and checks a wh-feature, so it can be recognized as the head of the chain without a need for indexing. And it is clear that only the higher who can be the foot of the chain, because of Minimality.] If there are long-distance relationships between occurrences of the same syntactic object/constituent, then we require an additional mechanism such as indexing (or Minimality, see fn. 26) to mark that they are occurrences of the same object. In our system, without long-distance movement dependencies, we do away with indexing, which is a violation of Inclusiveness.

More specifically, representational problems immediately arise when Move (Internal Merge) gets involved. More generally, classical Phrase-markers cannot adequately represent nontrivial chains, i.e. lists of grammatical relations which share a formative f_i (see Chomsky 1981, 1995), because classical Phrase-markers are defined as only allowing External Merge (pulling from the numeration). It is possible to draw 'augmented' Phrase-markers in which multiple occurrences of a formative are coindexed and the Move operation is introduced. Coindexing is still the default theoretical mechanism in the literature. There is a conceptual difficulty, though, as such chimeras with indexing or traces are excluded by the Inclusiveness Condition. For a certain number of structures, the difficulty can be overcome by using 'intersecting' Phrase-markers: restricted multidominance in the sense of van Riemsdijk 2001, 2006, Wilder 2008, Citko 2005, etc. In defining these objects, a representation along the lines of (172) would be required.

(172)  'Internal Merge'

         α/β
        /    \
       β      α
             /  \
            α    γ
                /  \
               γ    β    (the two positions of β are one and the same node)

The above diagram illustrates the case of Internal Merge while respecting Inclusiveness: the two occurrences of β are not indexed or 'copied', so there is no additional mechanism that would violate Inclusiveness. [Footnote 27: Even if one doesn't accept that copying involves coindexation (see fn. 26; Pancheva, p.c.), copying is still an extra mechanism required by the system, even if it doesn't leave a labeled trace/index of the operation 'copy'.]
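The point of (172) is that a single node may simply have two mothers, so that no index is ever needed to relate 'copies': identity is object identity. A sketch of such a structure as a plain directed acyclic graph (the encoding and helper are mine):

    class Node:
        """A syntactic object with a label and daughters. A shared node
        appears at two tree positions while being literally one object."""
        def __init__(self, label, daughters=()):
            self.label = label
            self.daughters = list(daughters)

    beta = Node('beta')
    gamma = Node('gamma', [Node('gamma'), beta])    # [gamma gamma beta]
    alpha = Node('alpha', [Node('alpha'), gamma])   # [alpha alpha [gamma ...]]
    root = Node('alpha/beta', [beta, alpha])        # beta re-merged at the root

    def occurrences(node, target, path=()):
        """All positions at which `target` occurs, by object identity
        ('is'), not by label or index."""
        found = [path] if node is target else []
        for i, d in enumerate(node.daughters):
            found += occurrences(d, target, path + (i,))
        return found

    print(occurrences(root, beta))   # [(0,), (1, 1, 1)]: one object, two occurrences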
Still, the theory cannot be extended to the most general case. In particular, standard wh-chains cannot be naturally described in terms of intersecting Phrase-markers, even assuming a derivation by phase. Such a description would require that any two phases along a wh 'path' intersect as in (173), which violates cyclicity. To illustrate this, assume that α is a phase head, as in C/v. Then, moving β to be Re-Merged with α is equivalent to movement to Spec, CP or Spec, vP. We have not yet violated cyclicity and the PIC. But when we continue with the higher phase, and we cyclically move β once again to the next Spec position, this time of some phase head δ, we see that β now exists in the higher phase and in the lower phase at the same time.

(173)  [diagram: β re-merged at the edge of the higher phase headed by δ while still attached within the lower phase headed by α; the dashed lines indicate what exists in the lower phase]

However, the lower phase should no longer be available for operations according to the PIC, so we have a contradiction: we need β to be available for the operation Move, but we also need β to no longer be accessible to the computation. Since we are not manipulating indexed copies of β in a chain, we have a problem.

To summarize, introducing Move requires the addition of indexing mechanisms for copies and/or a copying mechanism into the system, and introducing restricted Multidominance runs into issues when we begin to deal with phases and cyclicity. Our system avoids these problems by allowing only local Merge operations. We do allow multiple applications of Merge to a single item (as seen in previous sections), a generalized form of Multidominance, but we do not define cycles based on any specific item/category as in standard MP. Rather, I base cyclicity in syntax on Cartesian products of features (cf. Vergnaud forthcoming), and then use the Condition on Spell-out, in addition to the Condition on Phrase-markers, to allow for Spell-out of certain shared constituents in only a single position. In the empirical illustration of split-antecedent relative clauses, we will see that the Spell-out of parallel planes predicts a family of coordinated structures that are indeed empirically available.

Returning to the fact that our system allows multiple applications of Merge, we see that in order to avoid copying, one needs to admit graphs with cycles, e.g. as in (174). A 'cycle,' following graph theory, is created when one can trace a 'path' through the edges of the graph, beginning at some vertex and ending at that same vertex (if no vertices are repeated in this path, this is sometimes called a 'simple cycle').

(174)        C
           /   \
         D ----- T     (D --> T and D --> C are checking edges; T --> C is a selection edge)

The simple graph in (174) represents a chain of two occurrences of D, with one occurrence in the context of T and one in the context of C; say, in the specifier positions of T and C, respectively (as marked by the directed edges). We have a selectional relationship between T and C, extending the verbal projection, and we have checking relationships between D and T, and between D and C. This graph represents a derivational history of merging D and T, and then merging C (see the earlier sections on M-graphs). In other words, (174) represents the raising of D from the specifier position of T to that of C (or, alternatively, the lowering of D from the specifier position of C to that of T).
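On this view, a nontrivial chain is simply read off the cyclic graph: the shared formative and the set of contexts (here T and C) in which it occurs. A sketch, with the same edge encoding as before (mine):

    # The graph in (174): a selection edge from T to C, and checking
    # edges from D to both T and C; D, T and C form a simple cycle.
    EDGES = [('D', 'T', 'C'), ('D', 'C', 'C'), ('T', 'C', 'S')]

    def chain_of(formative, edges):
        """The chain of a formative: the contexts (heads) in which it
        occurs via checking edges, cf. the two occurrences of D in (174)."""
        return [head for dep, head, kind in edges
                if dep == formative and kind == 'C']

    print(chain_of('D', EDGES))   # ['T', 'C']: one formative, two contexts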
So, a proper representation of nontrivial chains requires that the condition that our syntactic objects be trees (in a graph-theoretic sense) in turn be relaxed. This is what the ICA, as presented here, allows at narrow syntax. Still, classical Phrase-markers seem to be the right objects to describe interpretive properties of expressions at the interface levels, for semantic interpretation at LF and linearization at PF. At least, I have assumed as much. I therefore propose a two-tiered architecture (McKinney-Bock & Vergnaud 2010, Liao 2011, Vergnaud forthcoming), where narrow syntax is formalized as a graph in the general sense, and Phrase-markers are read from that graph, subject to various conditions, in particular locality conditions of various types (cyclical restrictions, etc.). To illustrate, the elementary graph in (174) gives rise to the three classical Phrase-markers in (175):

(175)  (i) [_T D T]    (ii) [_C D C]    (iii) [_C C T]

The fact that the two occurrences of D across the two classical Phrase-markers in (175i) and (175ii) are occurrences of the same formative is enshrined in the primordial syntactic graph in (174). So:

(176) 'Grammatical computations do not apply directly to Phrase-markers, but instead to more abstract representations [such as that in (174)] from which Phrase-markers are ultimately derived.' (Vergnaud forthcoming)

Finally, long-distance dependencies must be constrained by defining a syntactic domain. Chomsky does this in the MP using phases of interpretation that are cyclically Spelled-out to the interfaces, based on the Merge of certain categories: C and v, and possibly D. In this system, we define cycles based on Cartesian products of linguistic features, which build up a single verb-argument phase and then overlap with other phases via a binary grammatical superconnective which stitches together the fabric of syntax, rather than defining the end of a domain as the Merging of some (rather arbitrary) category of the extended projection such as CP, vP, or DP. To this end, we see Spell-out following pairs of features to create the Phrase-markers used at PF and LF.

3.3.5.2. The System allows sharing

Expanding upon the previous section, we allow multiple applications of Merge to any given formative in our graph. This is a generalization of the Multidominance approach. Multidominance has been discussed under a variety of different guises and formalizations: Parallel Merge (Citko 2005), grafting (van Riemsdijk 2001, 2006), External Re-Merge (de Vries 2009). Here I refer to Multidominance as sharing, or multiple applications of Merge, although I recognize that there are theoretical differences that correlate with the different terms and different discussions. Most accounts of sharing converge on the idea that this type of Merge is predicted by the existence of both Internal and External Merge (Chomsky 1995, 2001, and subsequent work), where a single constituent is Merged with two separate heads that project in the system.

External Merge:

(177)    α/β
        /    \
       α      β

Internal Merge:

(178)    α/β
        /    \
       β      α
             /  \
            α    γ
                /  \
               γ    β    (β multiply dominated)

(179) The case of 'grafting' (van Riemsdijk 2001, 2006) or 'External Re-Merge' (de Vries 2009):

         α         δ
        / \       /
       α   β ----+       (β is a daughter of both roots α and δ)

This description of sharing is a bit fragile, in the sense that it is not clear how the structure created by sharing differs from general X-bar theory; rather, it is one of many possible configurations created by different projections and headedness, another one being the X-bar configuration.
It is only ruled out by the Extension Condition, which prevents Merge from applying anywhere but at the root of a tree; and we can only know where the root is by looking at the syntactic objects generated (at the complex sets present in {A, {A, B}}), as the complex sets present in the syntactic object give a version of the derivational history. The labels/heads themselves do not give us this information.

Much of the discussion about sharing centers either on the fact that the type of structure in (179) violates the single root constraint, or on issues that arise with linearization/surface-order conflicts. Here, we do not treat the single root constraint as a necessary one in narrow syntax. The issue of linearization is discussed in, e.g., Citko 2005, with a constraint on Multidominance projections (as in the extended projections of the two parallel heads in (179) above) that prevents them unless there is an antisymmetric projection that 'brings together' the parallel projections and creates an antisymmetric relationship between them, essentially a Spec-Comp c-command relationship; this allows a set of parallel projections to be linearly ordered. While this resolves the problem of parallel projections, the multiply dominated constituent, α, is still impossible to linearize, even given the fact that one of the projections that dominates α c-commands the other. So Citko adds the additional constraint that the multiply dominated object α move to a position outside of the Multidominance structure so that it can be linearized. She applies this empirically to coordinated wh-questions (Who did Bob like and Mary hate?), where the coordinated sentences are the parallel projections, and who is the multiply dominated, and moved, constituent. Van Riemsdijk addresses linearization by allowing the 'grafted' constituent (in his case, the free relative or transparent free relative) to be inserted at the point where the graft occurs. In general, the theoretical frameworks that utilize Multidominance have conditions on Multidominance that allow for linearization.

The focus on linearization, rather than on what the underlying Merge mechanism is, seems to us to be of less importance. Rather, we believe that sharing is a more generalized phenomenon and that it extends beyond coordinate structures and free relatives or other parentheticals. It is not construction-specific: as van Riemsdijk has claimed, the 'grafting Merge' is a predictable consequence of a general Merge mechanism. We take this type of Merge and expand upon it to account for regular relative clauses, extendable to movement in general. At the root of our account is a more specific notion of occurrence of a certain constituent, one that does not violate Inclusiveness by introducing an indexing mechanism.

In fact, a non-homogeneous theory of displacement, with both movement (traces or copy/deletion) and sharing, requires linearization rules for both types of structures. I take such theoretical heterogeneity to be unnecessary, and extend the notion of sharing to all notions of multiple occurrence, including those of Internal (re-)Merge as well as External (re-)Merge. This is a generalization of grafting. This generalized mechanism will require a somewhat more refined notion of a chain of occurrences.

In sum, this system appears on the surface to be quite different from the BPS syntax proposed in the Minimalist Program.
However, we follow quite closely the principles of Minimalism that Chomsky proposes, positing only the minimal necessary conditions on the system, such as the Inclusiveness Condition: the only items used in syntax are interpretable linguistic features. The increased heterogeneity of representations that occurs in this two-tiered system is more than offset by the fact that the Single Root Constraint, c-command, and linearization rules are not needed in narrow syntax. There are also no long-distance grammatical relationships, so there is no need for coindexing movement or Probe/Goal feature matching. Additionally, the complex descriptions of objects created when restricted Multidominance is allowed (as in the literature) are avoided by the fact that we freely allow multiple applications of Merge to create graphs at narrow syntax, restricted only by the requirement that Merge apply to binary pairs of features.

3.3.6. Spell-out of Phrase Structure

In this section I briefly cover how the above structure for a transitive clause is Spelled out (particularly to PF), with English parameters, as proposed in this chapter. Recall the Condition on Phrase-markers and the Condition on Spell-out. These two conditions overlap in their application to a basic transitive clause, although I will show that the Condition on Phrase-markers is less restricted than the Condition on Spell-out, which allows for its application to relativization and sharing in chapters 4 and 5. [Footnote 28: A full investigation of the consequences of these two conditions is left to future research.]

The Spell-out of the subject and object phases goes as follows. Recall that the nominal and verbal planes of both phases are linked via a shared C_K/complementizer:

(180)  (C_K, T)      (C_K, V_L)      (C_K, v)      (C_K, V)
       (C_K, D_ext)  (C_K, N_ext)    (C_K, D_int)  (C_K, N_int)

For a transitive clause in English, first the lexical-functional planes get Spelled-out, traversing the D-N and T-V/v-V edges (which stretch horizontally):

(181)  [as in (180), with the horizontal edges D_ext-N_ext, T-V_L, D_int-N_int and v-V traversed]

Recall that for a directed edge with head X and dependent Y, X is the head/label of the constituent [X Y]. This gives us four Phrase-markers:

(182)  [_D_ext D_ext N_ext]   [_T T V_L]   [_D_int D_int N_int]   [_v v V]

Then, the parallel planes for the nominal-verbal domains are Spelled-out, which run vertically:

(183)  [as in (180), with the vertical edges D_ext-T, N_ext-V_L, D_int-v and N_int-V traversed]

This applies to the constituents already formed:

(184)  [_T [_D_ext D_ext N_ext] [_T T V_L]]   [_v [_D_int D_int N_int] [_v v V]]

i.e. [[D_ext N_ext] [T V_L]] and [[D_int N_int] [v V]]. Finally, the parallel planes for the shared complementizer are Spelled-out, which run along the z-axis:

(185)  [as in (180), with the C_K coordinate linking the two phases traversed]

This introduces the two relationships of C_K with the constituents [[D_ext N_ext] [T V_L]] and [[D_int N_int] [v V]]:

(186)  [_C_K C_K [[D_ext N_ext] [T V_L]]]   [_C_K C_K [[D_int N_int] [v V]]]
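The Spell-out just walked through is a deterministic fold over the three edge directions of the cube: horizontal (lexical-functional), then vertical (nominal-verbal), then the C_K axis. Here is a sketch that reproduces the English bracketing in (182)-(186); the nested-list encoding of constituents (head-constituent first within each pair) and the function name are mine:

    def spell_out(edge_groups):
        """Spell-out as an ordered fold, per (181)-(186): each directed
        edge (head, dependent) wraps the current constituents of the two
        formatives into a new constituent labeled by the head. Updates
        apply group by group (plane by plane)."""
        const = {}      # current maximal constituent headed by each formative
        formed = []     # every constituent, in order of formation
        for group in edge_groups:
            staged = []
            for head, dep in group:
                c = [const.get(head, head), const.get(dep, dep)]
                staged.append((head, c))
                formed.append(c)
            for head, c in staged:   # a plane is Spelled-out all at once
                const[head] = c
        return formed

    # English parameter: horizontal edges first (181), then vertical (183),
    # then the C_K axis (185).
    ENGLISH = [
        [('D_ext', 'N_ext'), ('T', 'V_L'), ('D_int', 'N_int'), ('v', 'V')],
        [('T', 'D_ext'), ('v', 'D_int')],
        [('C_K', 'T'), ('C_K', 'v')],
    ]
    for constituent in spell_out(ENGLISH)[-2:]:
        print(constituent)
    # [C_K [[T V_L] [D_ext N_ext]]] and [C_K [[v V] [D_int N_int]]],
    # i.e. the two Phrase-markers in (186) (linear daughter order not encoded).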
Keep in mind that at narrow syntax all the formatives comprising the subject and object phases ({D, N, T, V} and {D, N, v, V}) have a C_K feature, but Spell-out makes it appear as if C_K has a local relationship only with T and v. In this sense, C_K acts as if it were in 'agreement', or an Agree relationship, with the entire clause; but in fact, at narrow syntax, there is a local relationship with each formative. It is in this sense that the architecture developed in this dissertation derives what appear to be long-distance grammatical relationships, when instead at narrow syntax these relations are entirely local. Morphological agreement, as with Case/complementizers in this chapter, behaves this way, and in chapter 5 we will see that coordination and A'-phenomena involving wh-quantification behave this way as well. PF will then interpret and linearize the constituents in the standard way, allowing for the 'Calder mobile' effect to give rise to parametric word orders.

In this chapter, I would like to directly compare the Condition on Spell-out, which was used to derive (186) above, with the Condition on Phrase-markers, which I described as somewhat less restricted. It will become apparent in chapters 4 and 5 that the Condition on Phrase-markers more clearly illustrates where a shared phrase (for example, the head noun of a relative clause) may be Spelled-out, and I will illustrate here that the Condition on Phrase-markers for a single phase allows more options than the Condition on Spell-out, raising empirical questions that need to be tested. Both the Condition on Phrase-markers and the Condition on Spell-out derive the following Phrase-markers for an object phase:

(187)  OBJECT PHASE
       v <------- V
       ^          ^
       D_int <--- N_int

       Phrase-markers:
       (i)  [_v [_D_int D_int N_int] [_v v V]]
       (ii) [_v [_v v D_int] [_V V N_int]]

By the Condition on Spell-out, this is a result of whether the lexical-functional (horizontal) edges are Spelled-out first, or the nominal-verbal (vertical) edges are Spelled-out first (see also Vergnaud forthcoming, Liao 2011). By the Condition on Phrase-markers, these are allowed as one option due to the fact that N_int occurs only in a relationship to D_int in (187i), or to V in (187ii). However, there are further options available under the Condition on Phrase-markers that are not available via the Condition on Spell-out:

       (iii) [_v D_int [_v v [_V V N_int]]]
       (iv)  [_v V [_v v [_D_int D_int N_int]]]

The question is twofold. First, there is the question of c-command. In (187iii), D_int c-commands v, V and N_int. However, in the comparable (187ii), D_int does not c-command out of v into V and N_int. The same is true for the pair (187i) and (187iv), for c-command of V. The second part of the question is constituency: in (187iii) we have the constituent [v V N_int], and in (187ii) this is not a constituent. The same holds in (187iv), where [v D N] is a constituent to the exclusion of V. This is an empirical question, with morphological consequences (e.g. whether or not [v V N_int] is a separate morphological constituent from D_int). However, we will see that a couple of similar options which arise in the case of relativization are empirically determined to require the Condition on Phrase-markers, at least prior to a more formalized definition of the Condition on Spell-out.
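The c-command and constituency differences among (187i)-(187iv) noted above can be checked mechanically. A sketch over the bracketed structures (the nested-list encoding of daughters and the helper names are mine):

    # Trees as nested lists of daughters; leaves are head names.
    P_ii = [['v', 'D_int'], ['V', 'N_int']]       # (187ii)
    P_iii = ['D_int', ['v', ['V', 'N_int']]]      # (187iii)

    def contains(tree, leaf):
        if tree == leaf:
            return True
        return isinstance(tree, list) and any(contains(d, leaf) for d in tree)

    def c_commands(a, b, tree):
        """Leaf a c-commands leaf b iff b lies inside a sister of a."""
        if not isinstance(tree, list):
            return False
        if a in tree:   # a is an immediate daughter here
            return any(d != a and contains(d, b) for d in tree)
        return any(c_commands(a, b, d) for d in tree)

    print(c_commands('D_int', 'N_int', P_iii))   # True: D_int c-commands N_int
    print(c_commands('D_int', 'N_int', P_ii))    # False: no c-command out of [v D_int]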
As discussed in chapter 2, Vergnaud derives the Mirror Principle through a rule that reverses which features are Items and which are Contexts in each bra-ket, which gives a pair of dual domains:

(189) (i)  {<∅|R_O*>, <R_O*|O*>, <O*|A_O*>, <A_O*|T_O*>, <T_O*|∅>}   Nominal Domain
      (ii) {<R_O|∅>, <O|R_O>, <A_O|O>, <T_O|A_O>, <∅|T_O>}           "Dual" Verbal Domain

Essentially, the items of the verbal domain are the contexts in the nominal domain. When one recalls that items are interpretable features, and contexts are uninterpretable features, one sees that the uninterpretable contexts in the verbal domain are interpretable items in the nominal domain. This works well, for example, in the domain of Person features: Person is interpretable in the nominal domain, as it positions the referent of the DP as speaker, addressee, or external referent. On the other hand, verbal person agreement (in English, however impoverished: -s) has no semantic value; showing a person feature contributes nothing to the verb's semantics.

With respect to Structural Case, it would be natural to have the uninterpretable features in the nominal domain Spell-out as Case, akin to the suggestion in Pesetsky & Torrego 2001. However, Vergnaud proposes that this is not the case. Rather, Structural Case is the signature of the mapping in the nominal domain, "as distinct from the mere morphological realization of the set of uninterpretable (contextual) formatives of O." Vergnaud (p.c.) suggested that Case should arise from the fact that a DP is 'complete' and hence visible to computation – which, under the ICA, means that both the nominal and verbal domains have been built in parallel 'to completion' – or, rather, to a phase boundary.

Here, I have interpreted Case to play the role of a nominal complementizer which merges once the phase is complete. A pair of connectives – a 'Case complementizer' or nominal complementizer D_K in the DP, as well as a verbal complementizer C_K in the VP – makes the phase visible to further computation. These complementizers combine with 'supercategories' and are of a higher type: they take phases as their complements and allow for phrase structure to be 'linked' and 'visible' to further computation. Note that Vergnaud did specify what Structural Case arises from: the 'signature of the mapping' above, for the nominal domain in (189i). [Footnote 29: Future work will reconcile the formalism that Vergnaud alludes to with the formal system utilized and developed in this dissertation, but should maintain the conceptual benefit of Case being a complementizer.]

As in Minimalism, Case makes the DP 'visible' to further computation, and the complementizer is what allows for phase completion – leaving the complementizer domain 'visible' to further computation as well (under the PIC). I take these two concepts to stem from the same formal function: Case and C are nominal-verbal parallels.

3.4. Generalizing features and phases

I detailed a possible way to build transitive clauses earlier. However, clause structure still remains considerably simplified, with the infrastructure for layering phases set out – and the empirical details not quite worked out. Ideally, clause structure will end up looking akin to that in Lin (2001), where several serial 'shells' of a verb with a single argument layer together to form the extended projection (possibly as many layered 'phases').
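The item/context reversal behind (189) is simple enough to state mechanically. In this sketch (the tuple encoding is mine, and the O-subscripts and stars of (189) are suppressed for readability), a domain is a sequence of <item|context> pairs, and the dual domain swaps the two roles in each pair:

    # The nominal domain of (189i), as (item, context) pairs;
    # '0' stands in for the null symbol of (189).
    NOMINAL = [("0", "R"), ("R", "O"), ("O", "A"), ("A", "T"), ("T", "0")]

    def dual(domain):
        # Items become Contexts and vice versa: <x|y> -> <y|x>.
        return [(context, item) for item, context in domain]

    def show(domain):
        return "{" + ", ".join("<%s|%s>" % pair for pair in domain) + "}"

    print("nominal:      ", show(NOMINAL))
    print("dual (verbal):", show(dual(NOMINAL)))

The printed dual is the verbal domain of (189ii): what is interpretable (an item) in one domain is uninterpretable (a context) in the other – exactly the Person asymmetry just discussed.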
In this section I look at empirical support for the parametric Phrase-markers at the level of D and T, and possibly extend it to Aspect and Number. Returning to the simplified clause structure once again, we see that two possible Phrase-markers arise:

(190) [Figure: the subject-phase graph over {D, N, T, V}, and its two possible Phrase-markers: [[D N] [T V]] and [[D T] [N V]]]

We explored extensively the first Phrase-marker, which is the typical X-bar schema with a subject in [Spec, T] and V incorporated into T. This represents a standard English subject and verb phase, and this Phrase-marker represents English PF. However, the second possible Phrase-marker is not as straightforward. Vergnaud references Baker 1996 and suggests that this may be an instance of the Polysynthesis Parameter, where polysynthetic languages utilize this other Phrase-marker.

Looking to the literature, I suggest that there are even less exotic examples where this Phrase-marker could be used. One key property of this Phrase-marker is that D and N do not form a constituent alone; rather, [V N] is a constituent, and [D T] has a local relationship without N. Crucially, this could appear in a situation where D+T+V are incorporated into verbal material, with a separate N. This seems to be what Alexiadou and Anagnostopoulou 1998 (henceforth A&A 1998) propose with respect to the EPP in Romance languages and Greek.

A&A 1998 develop a typology of languages based on two parameters: the EPP parameter and the Spec,TP parameter. The EPP parameter is set according to whether the language checks its EPP feature through verb raising ("Move X°") or through the movement of a phrase ("Move XP"). Germanic utilizes Move XP, and we see things like DPs raising to check the EPP feature. On the other hand, they argue for the novel analysis of Celtic, Greek and Romance that these languages satisfy the EPP via verb-raising. Then, a further split is realized based on whether or not Spec,TP is available for subjects (it divides Celtic versus Greek/Romance for the Move X° languages, and then English versus Icelandic for the Move XP languages). Instead of satisfying the EPP via a silent pronoun in Romance/Celtic/Greek, they argue that a +D feature occurs on the verbs, and that verb-raising checks the EPP. This seems to parallel the structure above, where [D T] are in a close relationship, as are [T V]; the noun is then a separate entity.

One immediate objection to the above structure is that the D is entirely separate from N, whereas in the Move X° languages a full 'DP' can appear following the verb:

(191) a. Llegó Pedro/el chico
        arrived Pedro/the boy
        'Pedro/the boy arrived.'
     b. Llegaron todos los chicos.
        arrived all the boys
        'All the boys arrived.'

If the D feature that checks the EPP is separate from the D feature that determines definiteness or quantification, then this analysis remains possible. In fact, A&A 1998 argue that this D-feature is essentially agreement, which is why the verb can have this D-feature – it behaves much like a pronoun (p. 517):

(192) You love – agapas (Greek for "you love")

The agreement features, or Φ-features, then, are the D-feature that occurs in the Phrase-marker above. Definiteness (which would come into play for existential there constructions in English) is a different feature, which is underspecified in the current analysis. [Footnote 30: Note that existential constructions in English still have a 'D' in the lower subject's position: (i) There was a man at the party. The presence of a, representing singular agreement, would also be the presence of the D/Φ-feature, while the presence of there would require another type of analysis.] [Footnote 31: I am not sure what the role of agreement in Move XP languages (like English or Icelandic) is under A&A 1998; it is much more impoverished, but still contains traces of Φ-features – e.g. Person, in English.]

There is evidence, then, for both types of Phrase-markers, even without looking to polysynthetic languages. In fact, as we begin to complicate the picture and look to other linguistic features that have been proposed to be independent heads (i.e.
aspect/Asp or number/Num), we see that there is evidence for both types of Phrase-markers within these domains as well. However, since we have defined a phase by a shared feature, as we introduce more features we will introduce several more phases.

Let us look at evidence for the second configuration of Phrase-marker in a different domain, using pseudo-incorporation as evidence for the presence of this type of syntactic feature configuration. Dayal 2011 argues that Hindi has incorporated NPs – incorporated because they are non-specific, generic nouns, and NPs because they can be coordinated and modified (although these behaviors are somewhat restricted and not fully productive). There is a peculiar behavior with respect to plural marking on these incorporated NPs – they are not number-neutral. In fact, she argues that they are incorporated NumPs, and that their number marking is dependent on aspect. When verbs are marked with perfective aspect, they can be interpreted as Accomplishment or Activity predicates. When marked with a completive particle and perfective aspect, the Accomplishment reading is forced, and an Activity reading is not permitted (as evidenced by the unacceptability of *for 3 hours). When marked completive and perfective, then, incorporated nominals must show plural agreement to be acceptable (Dayal 2011: 20).

We see in Hindi that Num-Asp can be incorporated, and when they are, the features must match. This is an argument that we have a constituent structure much like [D T] [N V], but with aspect and number – something akin to:

(193) [[Num Asp] [N V]]

She also notes that these incorporated NPs (NumPs) do not have to be adjacent to the verb. This would be a problem for the argument I am suggesting here. However, the contexts she provides involve scrambling and negation – both contexts where focus/quantification come into play. As these types of operations often involve A′-movement, modeled here as reduplication and complementary deletion, one could imagine the above constituent structure undergoing reduplication and deletion to 'move' certain elements – or, possibly, triggering Spell-out of different parallel planes. The motivation for why this would happen is unclear, but the possibility is there.

Essentially, pseudo-incorporation and incorporation provide evidence for the Merge parameter I have laid out here, at many different levels of clause structure. Presumably, these levels of clause structure will correspond to verbal shells – which are created by overlapping phases with respect to nominal and verbal complementizer feature pairs. The nominal argument in the higher phase is shared with the verbal element in the lower phase through a semantic (θ-)relationship. Then, the nominal argument in the lower phase is assigned Case by the verbal element. This could give rise to several layers, or phases, including Number/Aspect – as evidenced preliminarily in this section.
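Before concluding, the Merge/Spell-out parameter argued for in this section can be stated compactly: one and the same four-formative phase yields either constituency, depending on which plane of edges is traversed first. A minimal sketch (the traversal function and its names are illustrative, not part of the formal system):

    def spell_out_first(phase, plane):
        # phase = (nominal-functional, nominal-lexical,
        #          verbal-functional, verbal-lexical), e.g. (D, N, T, V)
        d, n, t, v = phase
        if plane == "horizontal":          # lexical-functional edges first
            pairs = [(d, n), (t, v)]       # e.g. English: [D N] [T V]
        else:                              # nominal-verbal edges first
            pairs = [(d, t), (n, v)]       # e.g. A&A-style: [D T] [N V]
        return "  ".join("[%s %s]" % p for p in pairs)

    print(spell_out_first(("D", "N", "T", "V_L"), "horizontal"))
    print(spell_out_first(("D", "N", "T", "V_L"), "vertical"))
    # The same parameter one level down gives (193): [[Num Asp] [N V]]
    print(spell_out_first(("Num", "N", "Asp", "V"), "vertical"))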
3.5. Conclusion

This chapter provided a new theory of phrase structure, extending ideas in the ICA to account for the syntax of transitive clauses and developing an analysis of phase linking. Phase linking involves a binary grammatical connective of the complementizer type: {D_K, C_K}. Case is taken to be a nominal complementizer, rather than a correlate of Φ-agreement at the level of D-T and D-v.

Chapter 4
Embedding is Sharing

4.1. Introduction

This chapter discusses a typology of embedded clauses: raising, control (partial and exhaustive), and indicative. The proposal is couched within a key idea in the literature that [Spec, T] involves both verbal and nominal material. Working from this idea, I propose that embedding necessarily involves sharing a set of features. In doing this, I argue that embedding involves both the nominal and verbal domains (both D and T, in some cases), not just an extension of the verbal domain, which the literature, including Vergnaud (forthcoming) and Liao (2011), takes to be privileged. This relates to the proposal in chapter 3, as the complementizer domain involves both nominal and verbal features {D_K, C_K}.

The larger issues this proposal touches on relate to a central question: what is the mechanism that allows for embedding? Williams (2003) proposes the notion of Level Embedding, which builds clauses in parallel – meaning that the clausal hierarchies are constructed simultaneously for matrix and embedded clauses (this is not unlike Vergnaud forthcoming, who builds phases in parallel). However, Williams (2003) simply allows for a nonspecific mechanism of 'embedding' to occur once one of the two clauses is finished, i.e. reaches the highest point in the hierarchy that it will have. This derives the result that the smaller of the two clauses is always the embedded one, and the matrix clause is always the larger, and it has strong empirical support. However, Williams never makes clear what it means to embed as an operation in the syntax.

The same question exists in the literature at the level of complementation: what drives selection by V for a CP complement (in some cases, and a DP in others)? What allows for the verbal domain to recurse like that, permitting a complementation rule of the type VP → V CP? [Footnote 32: Note that this problem exists at a slightly more complicated level, as arguments have been developed that infinitivals may differ in 'size': e.g. Landau 2013, where VP → CP or VP → FinP, or Grano 2012, where VP → CP or VP → vP. Williams (2003) allows for a generalization of this, where any level of embedding is permitted – but his is not the only account in the literature that allows VP → V XP, where XP is an element of the extended verbal projection.] I believe there is more to embedding than a simple selectional rule that allows for V to have a CP sister, and even more than Williams' elegant result that embedded clauses are always smaller than the matrix. Empirically, there is a semantic relationship between the embedded and matrix clauses – this relationship can vary, but it could be a tense dependency, or a shared subject reference, for example. These are particularly observable in the domain of infinitivals, both in raising and control.

Following Williams (2003), what makes the embedded clause embedded, and not matrix, is the presence of another clause being built in parallel that is larger: that has more features. We see, for example in Landau (2004), that certain embedded control clauses have a semantic dependency on the matrix clause: the tense of the embedded clause is anaphoric to that of the matrix. This is not derived simply from the notion of Level Embedding alone, whose only requirement for embedding is the size of the embedded clause relative to the matrix.
Instead, I work with both Level Embedding and empirical facts about infinitivals, and I propose that there is always a relationship between the embedded and matrix clauses: this relationship is a sharing of some semantic feature which occurs across both clauses. The Level of sharing is, essentially, the point where the embedded clause has stopped growing and embeds. My account is, then, a hybrid of Level Embedding with a requirement that to embed is to share some set of semantic features.

In this sense, embedding takes on a mechanism similar to that of relativization, where one NP is shared across both the relative and matrix clause (e.g. The man who went to the store bought cookies) – see chapter 5. To do this, I implement the Generalized Attachment Transformation (GAT) from Vergnaud (2007) in the realm of embedding, with the additional component that GAT allows not just one head H (as Vergnaud stipulates), but rather allows sharing of a set of features – involving both the nominal and verbal domains. In proposing this, I extend nominal-verbal parallels to the mechanism of embedding. Empirically, I explore different infinitival constructions and the nominal-verbal dependencies they express, and provide a pre-theoretical structural description to begin to support the argument that embedding involves sharing.

4.2. Empirical & Theoretical Considerations from Minimalism

4.2.1. D-T Parallels: Key Observations from the Literature

A key idea that percolates throughout the literature in different ways (Pesetsky & Torrego 2001, Alexiadou & Anagnostopoulou 1998) is that sometimes DPs check features on T, and at other times VP material checks features on T or D. This accounts for a variety of empirical phenomena, including a typology of Romance and Germanic languages and subject-raising (Alexiadou & Anagnostopoulou 1998), as well as allowing for a unification of T-to-C movement and subject/object asymmetries (Pesetsky & Torrego 2001).

Pesetsky & Torrego (2001) (henceforth P&T 2001) unify T-to-C movement, the that-trace effect, and subject/object asymmetries – which had previously been explained by the Empty Category Principle (ECP) in the 1980s – under the proposal that there is an uninterpretable T feature [uT] located/checked on the DP. The ECP was based on the notion that subject traces must be properly governed by their antecedents. Complementizers such as that, or a tensed auxiliary such as did which had raised from T to C (Koopman 1983), being heads and thus head-governors, blocked proper government by the antecedent. P&T 2001 note that this line of reasoning was abandoned, importantly because there was no deeper explanation for why subjects have this requirement in the first place. Objects had a lexical governor, the verb, so it was only for subject traces that antecedent government was a necessity – and the stipulation regarding subject government remained just that, a stipulation.
In their article, P&T 2001 argue that the current formalization under the Minimalist Program allows for an update to the ECP, and provides a deeper explanation of the empirical observations from the early 1980s. The crucial elements of the Minimalist Program that P&T assume are as follows (P&T 2001: 4): (1) uninterpretable features must be checked, and disappear, by the end of the derivation; (2) movement only occurs as a response to an EPP feature; (3) even after being marked for deletion, a feature may continue to participate in the derivation – to 'be alive' in the derivation.

The first of these elements is the requirement that uninterpretable features not be seen by the interfaces, as they cannot be interpreted either in the logical form (they have no semantic import) or in the phonetic form (they are not pronounceable). Under MP, the derivation crashes if uninterpretable features are not gone by the time the derivation is complete (every cycle has been Spelled-out, leaving nowhere for uninterpretable features to continue as part of the derivation). The second is an important point in relation to Case-driven movement, which, as discussed above, Adger 2003 has pointed out is problematic. P&T 2001 assume Chomsky (1998), where there is another operation besides Move that can check features: Agree. The difference between Move and Agree is that Agree establishes a basic, long-distance relationship between some feature F occurring on a goal X, and another feature F on some higher head, the probe Y. Move involves Agree, but also involves another feature, the EPP feature, which requires that X be in an immediate domain of Y (both have the Agreement relationship with F). So, Move is essentially Agree, but with a copy-and-delete operation that 'moves' X into the higher, immediate domain of Y (usually the specifier position). Finally, the third element they utilize is that this feature F, even after being 'marked for deletion', or checked by another feature F, may continue to enter into higher Agree (or Move) relations as the derivation proceeds.

From these crucial assumptions within MP, P&T 2001 derive the T-to-C asymmetry based on the hypothesis that nominative case is actually an uninterpretable T feature [uT] being checked on the DP. From this, they derive the that-trace effect, and other empirical phenomena based on the interplay of the +/-EPP feature (crosslinguistic differences, as well as differences between embedded and matrix clauses), various economy conditions, and other derivational conditions.

The key element in their analysis, relevant to the discussion at hand, is that [uT] features may be found on both verbal and nominal elements. When [uT] appears on D, it appears as NOM Case. This permits a DP with a NOM feature (now a [uT] feature) to delete a [uT] feature on another head, as well as allowing a T head with a [uT] feature to check the same feature on another head. Crucially, this 'other' head is C. So, either a DP phrase or a T head can move to check a [uT] feature that appears on C. To derive T-to-C movement, for example, they take English matrix clauses to have a C head with the following features: a [uT] feature and a [uWh] feature. Both of these features have a +EPP 'subfeature' which requires Move to satisfy any Agreement relation entered into.
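As a reference point for what follows, here is a toy rendering of the Agree/Move distinction as P&T use it (the data structures are mine; the immediate deletion of checked features simplifies away P&T's third assumption, that checked features remain 'alive'):

    from dataclasses import dataclass, field

    @dataclass
    class Head:
        name: str
        ufeatures: dict = field(default_factory=dict)  # feature -> has EPP subfeature?
        features: set = field(default_factory=set)     # matchable features on a goal
        spec: list = field(default_factory=list)       # the head's immediate domain

    def check(probe, goals):
        for feat, epp in list(probe.ufeatures.items()):
            for goal in goals:             # goals listed closest-first
                if feat in goal.features:
                    del probe.ufeatures[feat]          # Agree: checked at a distance
                    print("Agree(%s, %s) on %s" % (probe.name, goal.name, feat))
                    if epp and goal.name not in probe.spec:
                        probe.spec.append(goal.name)   # Move = Agree + EPP
                        print("  EPP: %s raises to Spec,%s" % (goal.name, probe.name))
                    break                  # attract the closest matching goal

    # A matrix C probing a subject DP that bears a matching T feature:
    c = Head("C", ufeatures={"T": True})   # uT with an EPP subfeature
    dp = Head("DP_subj", features={"T"})   # e.g. NOM Case as [uT] on D
    check(c, [dp])
    print("Spec,C =", c.spec)

Note that a single raised goal can check more than one feature on the probe – the economy effect that, in (195d) below, lets a wh-subject satisfy both the T- and wh-features of C without do-support.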
To derive Koopman's T-to-C asymmetry (see (195) below), they assume an Attract Closest condition, as well as conditions defining 'closeness', which select one of two combinations: (1) the non-wh T head (with a [uT] feature) as well as a +wh object DP, to satisfy the EPP subfeatures of both the T- and wh-features of C, in (194a); or (2) the subject DP, with both [uWh] and [uT] features, which, by economy conditions, will check both the T- and wh-features on C, in (194b). By an economy principle, subject-raising is selected over raising an overt T auxiliary to satisfy (194b) (P&T 2001, their (14)).

(194) Structures before movement to C and Spec,CP
     a. [C, uT, uWh] [TP [Mary, uT] T [VP bought what]]
     b. [C, uT, uWh] [TP [who, uT] T [VP bought the book]]

From (194), the structures pre-movement, P&T 2001 derive the T-to-C asymmetry paradigm below, with (195a,b) derived from the pattern in (194a), and (195c,d) from the pattern in (194b):

(195) T-to-C asymmetry (Koopman 1983)
     a. What did Mary buy?
     b. *What Mary bought?
     c. *Who did buy the book?
     d. Who bought the book?

They go on to derive the that-trace effect for subjects in a similar way. While the details of derivational order and movement are somewhat glossed over here, as are other empirical predictions borne out by this idea (such as differences between Belfast English and Standard English, as well as other empirical paradigms in English), one key idea is clear: as a result of both DPs and Ts being permitted to carry a [T] feature (either [uT] or [iT]), both are permitted to enter into a syntactic relationship with the complementizer based on a long-distance Agreement relationship. From this key idea, P&T unify a variety of subject/object asymmetries in the literature which were previously accounted for by the ECP and assumptions about the special government status of subject traces. P&T provide a deeper explanation of why subjects and objects behave differently, as well as a move to reduce structural Case to a single verbal feature, which is interpretable in one domain and uninterpretable in the other.

On the flip side, the literature contains arguments that D-features can occur in the verbal domain. Alexiadou & Anagnostopoulou 1998 account for word order possibilities in Germanic and Romance by allowing T-raising to replace DP-raising. The key idea here is that languages with rich overt agreement morphology (such as Greek) have an EPP parameter that allows the verb to enter into agreement, as the verb has 'pronominal' material – agreement – that acts like +D (they are noncommittal as to whether this is a +D feature, a +Φ feature, or a +Case feature). They call this an AGR parameter, requiring only an X head to agree, which stands in opposition to an EPP parameter that requires XP movement to satisfy the EPP. In languages such as English, without rich agreement morphology on the verb, the subject DP must rise to check the D-feature. This account helps explain the distribution of expletive constructions, as well as subject inversion constructions crosslinguistically, and has a strong empirical basis. It differs from P&T 2001 in that it assumes that the EPP parameter itself attracts an XP or an X head, while P&T 2001 argue, based on movement conditions, that an XP or an X head will move based on definitions of closeness – and that the feature itself is identical on both XPs and Xs.
But, under both accounts, the key idea remains the same: there is some feature, part of the D-T relationship, which is permitted to occur in both the nominal and verbal domains to drive different types of movement. In P&T, a [uT] feature is permitted in the nominal domain, realized as NOM Case; in A&A, a +D feature is permitted in the verbal domain, realized as rich, pronominal agreement morphology on the verb.

Both papers, however, require either conditions on movement or an interaction of parameters involving the EPP to allow for the availability of movement to a specifier position. This is a byproduct of the fact that specifiers, under MP, are built up to a maximal projection (XP) prior to being merged at different levels in the verbal hierarchy, while the extended verbal projection comprises a set of Xs and their complements. XPs move through the verbal projection based on parameters like EPP or AGR, which are theory-internal mechanisms. Under the proposal in this chapter, by contrast, which is couched within the ICA, specifiers and their heads are built dually, and this leads us to expect a strong D-T relationship based on the same abstract feature falling in the context of N, or of V. In a sense there is no +D feature on the verb, nor a T-feature on the noun; rather, the same feature occurs in both. The order of Spell-out then derives the empirical 'movement' or 'agreement' effects seen.

4.2.2. Functions of D and T in Control Clauses

Nonfinite clauses in Minimalism are taken to have a nonfinite [-T] feature, which in turn lacks a nominative Case feature. This is in opposition to finite clauses, where [+T] and nominative Case are associated. The byproduct of [-T] in nonfinite clauses is that there is no need to check Φ-features at this level. Instead, it is proposed that PRO is licensed via null Case, a Case which is only carried by PRO. This is a stipulation regarding Case only in infinitival contexts, for the purposes of Control (Chomsky & Lasnik 1993, Chomsky 1995).

More recently, linguists such as Landau (2000, 2004) take issue with theories of control that assume null Case for PRO, especially in light of data from languages which show case agreement on the verb. The observation is that, in infinitival contexts with silent PRO, case agreement may still appear on the embedded verb, and this calls into question the idea that a special null Case on PRO is what drives the silent subject in Control constructions.

Landau's proposal shares, abstractly, some of the properties of P&T 2001 and A&A 1998. His (2004) account of Control, like his more recent (2013) account, proposes an interaction of feature checking across the nominal and verbal domains. To explain this, I will begin by discussing the detailed empirical generalization that Landau presents.

Empirically, Landau's (2004) formal system makes a distinction between three types of embedded clauses. The first type is an embedded clause with a 'free complementizer.' In this case, the type of C selected for the embedded clause has no dependency upon the matrix clause: there is no T-feature on C. This accounts for typical indicative embedded clauses. The second type is an embedded clause with a selected complementizer which has an uninterpretable [-T] feature, whose tense is anaphoric to the matrix clause.
This generates sentences that are ungrammatical under mismatching tense, as below:

(196) *Yesterday, John managed to solve the problem tomorrow.
(197) *tora, o Yani skseri/arxizi na kolimbai avrio. [Greek]
      now, the John knows-how/begins PRT swim.3sg tomorrow
      'Now, John knows how/begins to swim tomorrow.'

The third type is an embedded clause with a selected complementizer which has an uninterpretable [+T] feature, whose tense is then dependent upon the matrix clause, although not necessarily the same. This accounts for partial control examples in English, what are called 'F-subjunctives' in the Balkan languages, and Hebrew 'finite control'/subjunctive cases (in 3rd person).

Based on these empirical facts from Hebrew and the Balkan languages, as well as partial control in English (Landau 2000), Landau observes that in certain Control cases (which he calls Partial Control) PRO is not under identity with its controller, but rather refers to a plural group containing the controller. Correlated with this property of partial control is tense dependency: while languages use different (morphologically realized) vehicles for embedded clauses with empty subjects, they divide along the lines of tense dependency – these clauses have 'dependent tense,' where the matrix clause restricts the possibilities for the tense of the embedded clause.

Looking to tense dependency to motivate his account of the two types of Control, Landau (2004) discusses two types of infinitives in the Balkan languages. Both use subjunctive rather than infinitive morphology: the F-Subjunctives and the C-Subjunctives. C-Subjunctives have anaphoric tense, while F-Subjunctives have dependent tense:

(198) a. tora, o Yanis elpizi/theli na figi avrio.
        now, the John hopes/wants PRT leave.3sg tomorrow
        'Now, John hopes/wants to leave tomorrow.'
     b. *tora, o Yani skseri/arxizi na kolimbai avrio.
        now, the John knows-how/begins PRT swim.3sg tomorrow
        'Now, John knows how/begins to swim tomorrow.'
     (Greek: Varlokosta 1993, ex. 43, 44, 46; cited in Landau 2004: 831)

In English, there are also two types of control: partial control, as mentioned above, and exhaustive control. Partial control occurs when the controller is a member of some plurality denoted by PRO, and has dependent tense. Exhaustive control occurs when the controller DP and the referent of PRO are one and the same. Landau points out that these two types of control are distinguished by tense: PC complements are tensed, EC complements are not. Exhaustive control, like the C-Subjunctives, involves anaphoric tense, and partial control, like the F-Subjunctives, involves dependent tense:

(199) a. *Yesterday, John managed to solve the problem tomorrow. (EC)
     b. Yesterday, John hoped to solve the problem tomorrow. (PC)

The typology of anaphoric vs. dependent tense in control predicates occurs cross-linguistically, even if the morphological realization of embedded clauses differs. Landau argues for a theory of control that accounts for both the nominal and verbal domains: facts about C-selection in the embedded clause create the different control scenarios, and the C-T relationship is crucial to deriving dependent vs. anaphoric tense in control predicates.
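Landau's three-way typology, and the tense facts it is built on, can be summarized in a compact sketch (the encoding and the crude tense check are mine; dependent tense is in reality restricted by the matrix rather than fully free, which the toy ignores):

    # The three embedded-clause types described above.
    CLAUSE_TYPES = {
        "indicative": {"C": "free",          "tense": "free"},
        "EC":         {"C": "selected [-T]", "tense": "anaphoric"},  # C-subjunctives
        "PC":         {"C": "selected [+T]", "tense": "dependent"},  # F-subjunctives
    }

    def tense_ok(clause_type, matrix_time, embedded_time):
        # Anaphoric tense forbids a distinct embedded time reference,
        # modeling the contrast in (199).
        if CLAUSE_TYPES[clause_type]["tense"] == "anaphoric":
            return embedded_time == matrix_time
        return True

    # *Yesterday, John managed to solve the problem tomorrow.   (199a, EC)
    print(tense_ok("EC", "yesterday", "tomorrow"))   # False: ruled out
    #  Yesterday, John hoped to solve the problem tomorrow.     (199b, PC)
    print(tense_ok("PC", "yesterday", "tomorrow"))   # True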
Additionally, instead of basing the distribution of PRO on Case (null Case or a lack of Case), he treats the distinction between PRO and pro/lexical DPs with a referentiality feature, +/-R, with contextual rules about where the +R feature can surface with respect to the C-T domain within which the DP is merged. [Footnote 33: I have glossed over the Hebrew data, where control with an overtly tense-marked ('morphologically finite') verb is used. Landau argues that, despite the lack of overt subjunctive morphology in Hebrew, control occurs in the subjunctive in Hebrew (based on semantic tests for the subjunctive, such as the irrealis). The difference between Hebrew and the Balkan languages is that, under dependent tense situations (not anaphoric), the subjunctive leads to obligatory control in Hebrew, while there is no control in the Balkan languages (Landau 2004: 837).]

Crucially, this is a notion of abstract tense/T. Landau (2004: 839) specifically states that morphological tense is independent of the formal typology he presents: for example, infinitive clauses in English with Partial Control are 'tensed' (+T in his system) but infinitival, while Balkan subjunctives under anaphoric tense are morphologically tensed but have the same tense as the matrix clause – so they are -T in his system, which is nonfinite. The feature [Agr], then, is purely morphological in his formal system: infinitives are [-Agr], while subjunctives and indicatives are [+Agr].

The distribution of nominal phrases in the subject position of infinitives still plays a role in Landau's theory, but rather than base the analysis on Case and its distribution, Landau looks to referentiality: PRO is [-R], not having its own reference, whereas pro and lexical DPs have their own reference, [+R]. The [+R] feature on the subject of the embedded clause plays a role in preventing control, and the [-R] feature on the subject of the embedded clause interacts with C to derive the two types of control that Landau discusses (for details see Landau 2004: 848). The formal system derives the different types of control through an interplay of R-features, T-features and Agr-features.

Landau's contribution is both empirical and formal: he provides an extended empirical typology of control, showing that both finite and non-finite control are possible. He also unifies the subjunctive cases of control with tensed/non-tensed control, under the T-to-C feature movement system (independently argued for in the Balkan literature). Formally, he contributes semantic import to the DP features used to license PRO versus pro/lexical DPs (referentiality), and shows that T-to-C movement in the verbal domain is the key verbal component accounting for the empirical facts, as opposed to selecting different T-heads to account for control (as in earlier Minimalism). In the literature, these accounts of control are another facet of the argument that Case is not responsible for licensing subjects in embedded clauses (much like the Icelandic argument given above that the EPP, not NOM Case features, is responsible for subject-raising, and the ECM data above).
Now, where Landau's analysis is similar to that of P&T 2001 and A&A 1998 is in the idea that referentiality is responsible for control, in both the nominal and verbal domains:

- R-feature: referentiality in the DP, in the standard interpretation
- T-feature: referentiality in the VP, in the form of tense anaphoricity/dependency

I will use this key idea to derive the various types of Control clauses under the ICA. For Landau, these are two separate features; under the ICA, referentiality occurs in both the nominal and verbal domains as the same feature.

The issue of Case in embedded clauses is relevant to Case Theory in the ICA, where Case does not actually get checked on T – rather, Case is a nominal complementizer and falls within the C-domain. As a result, Landau's claims about the C-domain playing a critical role in Control directly relate to Case, and will perhaps unite, in some cases of Control, the claims about null Case and PRO with ideas about C-agreement and Control.

4.3. Nonfinite/Embedded Clauses

Using Landau (2004) and Pesetsky & Torrego (2001) as a springboard, I now proceed to derive a limited typology of embedded clauses: embedded indicatives, infinitives with raising verbs, and infinitives with both exhaustive (predicative) and partial (logophoric) control verbs. Here, I present a pre-theoretic, descriptive notion of what it means to embed an infinitival clause. The purpose of this is to (a) open doors to future research on embedding, through testable hypotheses under the architecture I employ in this dissertation; and (b) illustrate how I build phrase structure, by showing how various formal notions capture empirical facts about raising and control clauses. This section should provide a clear picture of the functions that the notions of category/supercategory, structural size, and pivot play in the syntax I present in this dissertation.

Three notions play a crucial role in this initial account of embedding. First, the notion of pivot within GAT becomes crucial. I return to the conjecture about embedding, refining it with the notion of pivot:

(200) Embedding Conjecture (revised)
     To embed is to have a pivot between the embedded and matrix clause.

Second is the structural 'size' of the embedded clause, or how much phrase structure the embedded clause has. Williams (2003) allows for embedding to occur at any point during a derivation, when the 'embedded' clause stops growing. At the point where one clause grows larger than its parallel clause, the smaller clause embeds within the larger clause. Third, the phasal connective {D_K, C_K} is involved, which was introduced in chapter 3 as a way to overlap and link phases.

Note that the hypothesis about 'embedding' that I explore here gives {D_K, C_K} somewhat of a backseat to the two other notions, the pivot and the structural 'size' of the embedded clause: {D_K, C_K} is not always present as the pivot linking these small(ish) clauses with their matrix, though it plays more of a role in the largest embeddings. This again pulls from Williams (2003), in the sense that clauses have different sizes and different levels of 'completeness,' leading to different extraction possibilities. This is an (old) proposal from Rosenbaum (1967), most recently employed in Grano (2012), whereby certain features become 'unavailable' for extraction in Control structures as a result of the embedded clause being too 'large' to access certain positions.
However, the mechanism of embedding always involves sharing – which alleviates the concern about allowing VP to select any number of types of complements (as discussed above). A table of the typology is given here; I build phrase structure for each type in turn. The table is ordered from the 'smallest' embedded clause to the 'largest.' The Pivot column shows what is shared between matrix and embedded clause; the {D_K, C_K} column states a binary distinction: does {D_K, C_K} occur as a LINKER between the embedded and matrix clause, or not? [Footnote 34: {D_K, C_K} can and does occur in other places in the clause – but the crucial distinction I make here is whether it does the work of embedding the infinitive/embedded clause inside the matrix.]

Table: Typology of Embedding

  Type of Embedding                  Pivot                                    {D_K, C_K} linker
  Raising                            D-N_ext, C_K                             yes, partially shared
  Exhaustive (Predicative) Control   higher phase: D_ext-N_ext, T_func-V_L    yes
  Embedded Indicative                {D_K, C_K}                               yes

I begin by discussing independent empirical properties of each type of clause, and how I account for them in terms of the columns above. Then I discuss some shared properties of raising and control, and present a hypothesis about how these are perhaps derived.

4.3.1. Derivation: A Typology of Embedding

4.3.1.1. Deriving Raising Clauses

The structure for raising verbs in English has certain properties that I will represent here. First, raising involves subjects, not objects:

(201) *John is likely [Mary to kiss t]
(202) *Mary believes John [Sue to kiss t]
(203) John is likely [t to kiss Mary]
(204) Mary believes John [t to kiss Sue regularly]
(205) John is likely [t to be kissed (by Mary)]
(206) Mary believes John [t to be kissed (by Sue) regularly]

Second, the tense is dependent upon the matrix tense, T, but not necessarily anaphoric:

(207) Today, the man is likely to propose tomorrow (because he is confident). Tomorrow, who knows.

Third, the NOM Case for the subject is set by {D_K, C_K} of the matrix clause. Non-finite complementizers tend to be prepositions, as Landau (2013) notes, and I treat the lower C_K (which introduces the vP in the embedded clause) as housing the preposition to, the infinitival marker, as well as D_K, the ACC case for the object.

I'll begin by returning to the transitive clause from chapter 3, which introduces the lower {D_K, C_K} as follows (in a sentence, e.g.
John kissed Mary):

(208) Kase [vP DP_int]    Komp [TP DP_ext [vP DP_int]]

For the raising verb, the structure is essentially the same, with accusative case and the non-finite complementizer:

(209) ACC [win the game]    to [[is likely the man] [win the game]]

Graphically, the coordinates for a transitive sentence were as follows:

(210) [Figure: the transitive-sentence cubes, with nodes (C_K, T), (C_K, V_L), (D_K, v), (D_K, V), (C_K, v), (C_K, V), (C_K, D_ext), (C_K, N_ext), (D_K, D_int), (D_K, N_int), (C_K, D_int), (C_K, N_int)]

For the raising verb, it appears essentially the same, with D_K as ACC and C_K as to:

(211) [Figure: the raising-clause cubes, with nodes (to, is), (to, likely), (ACC, EVENT_v), (ACC, win), (to, EVENT_v), (to, win), (to, the), (to, man), (ACC, the), (ACC, game), (to, the), (to, game)]

Then, in the higher phase, the subject gets its Case from the higher {D_K, C_K}, which is {NOM, that_MATRIX}:

(212) D_K = NOM, C_K = that_MATRIX: [Figure: nodes (NOM, is), (NOM, likely), (NOM, the), (NOM, man), (that_MATRIX, is), (that_MATRIX, likely), (that_MATRIX, the), (that_MATRIX, man)]

Let's return, for a moment, to the grammatical relationships given by the non-finite complementizer to:

(213) [Figure: as in (211)]

Notice that the man is not in a relationship with win, which it should be. For the man to be in a relationship with win, one would have to have a diagonal line connecting the nodes (to, man) and (to, win) in the cube. Here is where the difference between a fully transitive clause and a raising clause is found: the man is shared across both the embedded clause to win the game and the matrix clause is likely.

(214) [Figure: is likely – the man – EVENT win, with the man linked to both]

Along which dimension is this dual relationship? I propose that, since there is no Case feature in the embedded clause (a hallmark of raising constructions), the D_K within the matrix clause is in fact also applied to the lower (embedded) clause, and that it too is shared across the {D_K, C_K} for the matrix clause (pictured above), as well as the {D_K, C_K} for the lower clause – which I introduce now. The nominative case marker appears in the lower D_K position, creating the following additional relationship between [the man] and [v win]:

(215) [Figure: nodes (NOM, is), (NOM, likely), (NOM, the), (NOM, man), (NOM, EVENT), (NOM, win)]

The complementizer to has already been introduced with the accusative Case marker (see (211) above), but now it is also shared with the higher, nominative Case marker:

(216) [Figure: the combined structure, with nodes (NOM, is), (NOM, likely), (ACC, EVENT_v), (ACC, win), (to, is), (to, likely), (to, EVENT_v), (to, win), (NOM, the), (NOM, man), (ACC, the), (ACC, game), (to, the), (to, man), (to, game)]

This is akin to the 'transparency' noted for raising clauses in the literature: instead of arguing for a lack of a complementizer, the complementizer is indeed there, and is shared across the embedded and matrix clauses, creating the transparency we see with raising clauses.
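The 'sharing' at work here can be pictured very directly: if each phase is (simplifying the cube geometry) just a set of (K, formative) coordinates, the pivot is literally the intersection of the matrix and embedded sets. A sketch, with coordinates following (211)-(216):

    matrix = {
        ("NOM", "the"), ("NOM", "man"), ("NOM", "is"), ("NOM", "likely"),
        ("that", "the"), ("that", "man"), ("that", "is"), ("that", "likely"),
    }
    embedded = {
        ("NOM", "the"), ("NOM", "man"),            # the subject, D-N_ext ...
        ("NOM", "EVENT"), ("NOM", "win"),          # ... now related to the lower vP
        ("to", "the"), ("to", "man"), ("to", "EVENT"), ("to", "win"),
        ("ACC", "the"), ("ACC", "game"), ("to", "game"),
    }

    pivot = matrix & embedded
    print("shared pivot:", sorted(pivot))
    # -> [('NOM', 'man'), ('NOM', 'the')]: one object occurring in two
    #    contexts, rather than a moved element plus its trace.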
The different levels of analysis are copied here:

Lower phase:
(217) [Figure: as in (211): nodes (to, is), (to, likely), (ACC, EVENT_v), (ACC, win), (to, EVENT_v), (to, win), (to, the), (to, man), (ACC, the), (ACC, game), (to, game)]

Shared constituents, D-N_ext and C_K:
(218) Shared D-N_ext: [Figure: nodes (NOM, is), (NOM, likely), (NOM, the), (NOM, man), (NOM, EVENT), (NOM, win)]
(219)-(220) Shared C_K: [Figure: as in (216)]

Higher phase:
(221) [Figure: nodes (NOM, is), (NOM, likely), (NOM, the), (NOM, man), (that_MATRIX, is), (that_MATRIX, likely), (that_MATRIX, the), (that_MATRIX, man)]

One could stitch together these cubes to make a larger structure, but I will leave it at this for now. The key representation here is that two elements are shared across the matrix and embedded clauses: {D_K, C_K} in the higher phase of the matrix clause, and {D_K, C_K} in the higher phase of the embedded clause – more specifically, {NOM, C_K} and {ACC, C_K}. Also, the external argument is shared across the two Case positions for the matrix and embedded clauses: [the man.NOM] with [is likely.NOM] and [to win.NOM].

One empirical objection to the above proposal has to do with quirky case and raising in Icelandic. For example, with a verb that assigns accusative (ACC) case in a predicate such as 'lack money', in a raising environment the constituent [D-N_ext] is assigned accusative case (example from Wurmbrand 1999, (5c)). Nominative case is unavailable in this environment (though the particular example does not illustrate this point).

(222) Harald virðist vanta ekki peninga
     Harold-ACC seems lack not money
     'Harold seems to not lack money.'

This is known as case percolation (Andrews 1976; see Landau 2008). To account for the status of quirky case percolation in Icelandic, where a non-nominative case appears on the noun that has been raised, we would see a Case mismatch in the following structure:

(223) Shared D-N_ext: [Figure: nodes (NOM, T_-PAST), (NOM, virðist), (?, Harald_D), (?, Harald_N), (ACC, STATE_v), (ACC, vanta)]

Before, we assumed that there was essentially no independent nominative case in the lower phase in English. Now, there must be a case assigner in the lower phase that assigns accusative case to Harald, prior to raising. This is not represented in the structure for shared C_K that we had before:

(224)-(225) *Shared C_K [missing ACC case]: [Figure: nodes (NOM, T_-PAST), (NOM, virðist), (GEN, STATE_v), (GEN, vanta), (to, T_-PAST), (to, virðist), (to, STATE_v), (to, vanta), (NOM, Harald_D), (NOM, Harald_N), (GEN, D_int), (GEN, peninga), (to, Harald_D), (to, Harald_N), (to, D_int), (to, peninga)]

Instead, we assume yet another VP shell for the predicate vanta 'lack', which comes with its own {D_K, C_K} assigning accusative case, {ACC, to}:

(226) [Figure: nodes (NOM, T_-PAST), (NOM, virðist), (ACC, STATE_v), (ACC, vanta), (to, T_-PAST), (to, virðist), (to, STATE_v), (to, vanta), (NOM, Harald_D), (NOM, Harald_N), (ACC, Harald_D), (ACC, Harald_N), (to, Harald_D), (to, Harald_N)]

Then, the lower VP shell which assigns genitive case to the object of 'lacks' – in this case, peninga 'money' – will have its own linking {D_K, C_K}, which appears with {GEN, C_K} (perhaps the C_K for genitive case is something akin to English of, as in {GEN, of}).
This will link with the higher phase of the embedded clause, which links Harald to the accusative case:

(227) Within the Embedded Clause:
     Shell for the accusative argument: [Figure: nodes (ACC, STATE_v), (ACC, vanta), (to, STATE_v), (to, vanta), (ACC, Harald_D), (ACC, Harald_N), (to, Harald_D), (to, Harald_N)]
     Shell for the genitive argument: [Figure: nodes (GEN, v), (GEN, vanta), (of, v), (of, vanta), (GEN, D_int), (GEN, peninga), (of, D_int), (of, peninga)]

These shells are linked, as usual, with the C_K complementizer appearing across both phases (see chapter 3, also repeated above). I have not specified the 'flavor' of v for the genitive argument shell, but it would be another type of functional verbal element.

(228) Lower, genitive shell linking with the higher, accusative shell through of: [Figure: nodes (of, v), (of, vanta), (GEN, STATE_v), (GEN, vanta), (of, STATE_v), (of, vanta), (of, D_int), (of, peninga), (GEN, D_int), (GEN, peninga), (of, Harald_D), (of, Harald_N)]

This additional complication may occur in English as well, with the man appearing in the context of the lower accusative phase. Returning to Icelandic, we have an interesting case of competition for the shared D-N_ext, repeated below:

(229) Shared D-N_ext: [Figure: nodes (NOM, T_-PAST), (NOM, virðist), (?, Harald_D), (?, Harald_N), (ACC, STATE_v), (ACC, vanta)]

This competition does not exist for English, as any such structure would have two nominative cases (one from the embedded clause, and one from the matrix). For Icelandic, then, to account for case percolation with raising verbs, one has to assume that during Spell-out the morphology of accusative case ends up being pronounced on Harald, even though Harald technically has two cases. [Footnote 35: See Pesetsky (2010) for an analysis of case stacking in Russian – the analysis is similar here, as Harald appears with multiple cases, and only one is spelled out.]

To summarize, raising verbs involve the shared pivot of the subject argument, D-N_ext, as well as a shared non-finite complementizer C_K, here realized as to. From this, together with the regular Case-assigning shells (as in chapter 3), one sees that a derivation for raising verbs is very similar to that of transitive clauses, with the exception that the embedded clause is 'smaller,' as it lacks an independent C_K.

4.3.1.2. Deriving Exhaustive Control (Predicative Control) Clauses

Exhaustive Control (EC) is defined in Landau (2004) as the more standard case of Control, where, e.g. in subject control, the subject of the embedded clause (the controlee) is controlled by the subject of the matrix clause. The reference must be exact; hence the term 'exhaustive.' In Grano (2012), EC predicates are proposed to be raising predicates, which embed a vP (as opposed to PC predicates, which embed a full CP without raising). In Landau (2013), EC predicates are proposed to be predicates with a FinP 'head of complement' (the matrix embeds a FinP, rather than a CP). The difference between raising and EC predicates that I present here is that EC predicates have tense anaphoric to the matrix clause (which we saw was not necessary with raising predicates):

(230) a. *Yesterday, John managed to solve the problem tomorrow. (EC)
     b. Today, the man is likely to propose tomorrow (because he is confident). (Raising)

I propose that this difference is key to deriving the difference between raising and EC embedded infinitivals. The EC infinitival is larger than the raising infinitival, because it contains a higher phase containing the D_ext-T relationship, as well as the V_L-N_ext relationship.
Like Grano (2012), EC predicates involve a 'raising' structure much like seem, as they share the same pivot: D_ext-N_ext. Then, the sharing of the parallel verbal component of the phase, T-V_L, derives the tense anaphoricity property, as well as the (exhaustive) identity of the matrix subject with the embedded subject. Also, cf. Grano, the only 'independent' part of the embedded infinitive under a control predicate is the vP. Exhaustive Control allows for overt infinitival complementizers, here C_K. The crucial empirical difference between raising and EC is a case mismatch between the embedded subject and the matrix subject, D_K. So, I propose that the Control predicate is larger, and includes an independent {D_K, C_K} for the higher phase in the embedded clause. This introduces the independent NOM case in the embedded predicate, as well as the infinitival complementizer. The structure is as follows:

(231) The man tried to win the game.

Embedded (accusative) phase:
(232) C_K^ACC = to: [Figure: nodes (to, T), (to, V_L), (ACC, EVENT_v), (ACC, win), (to, EVENT_v), (to, win), (to, the), (to, man), (ACC, the), (ACC, game), (to, game)]

Embedded (nominative) phase:
(233) [Figure: nodes (NOM, T), (NOM, V_L), (C_K^NOM, T), (C_K^NOM, V_L), (NOM, the), (NOM, man), (C_K^NOM, the), (C_K^NOM, man)]

Shared constituents, D-N_ext and T-V_L:
(234) Shared D-N_ext: [Figure: the embedded and matrix phases, with the plane of nodes (NOM, T), (NOM, V_L), (NOM, the), (NOM, man) shared between the embedded (C_K^NOM, ...) nodes and the matrix (that_MATRIX, ...) nodes]

Matrix phase:
(235) [Figure: nodes (NOM, T), (NOM, V_L), (that_MATRIX, T), (that_MATRIX, V_L), (NOM, the), (NOM, man), (that_MATRIX, the), (that_MATRIX, man)]

Landau's (2004) analysis involves both referentiality in the DP (controlee to controller) and tense referentiality – the anaphoricity of T in the embedded clause to T in the matrix clause. He accounts for this through a C-selection that is either 'dependent' or 'independent,' and a C-T Φ-agreement that allows C to 'assign' the embedded T the tense from the matrix clause. This analysis requires two mechanisms: long-distance featural agreement between the C in the embedded clause and the T in the matrix clause, as well as feature agreement between C and T in the embedded clause. The analysis above derives the 'agreement' between the matrix and embedded Tense via T-sharing, rather than using C as a conduit for agreement. There are independent complementizers in both clauses, which assign independent (nominative) cases to the shared subject. The referentiality of 'PRO' (here, a shared subject) is under identity because the subject D-N_ext is also shared.

To summarize, the key difference between raising and Exhaustive Control is that EC involves a larger pivot: the entire subject phase. Also, the presence of an embedded {D_K, C_K} (1) gives rise to the possibility of an embedded non-finite overt complementizer (seen in some languages), and (2) derives a 'full' argument in the embedded clause – something which PRO attempts to capture. Where this analysis differs from the literature on the DP side is that there the DP 'controls' PRO, via agreement (Landau 2004, 2013) or via raising movement (Grano 2012); here, 'PRO' is a distributional description reflecting the presence of a complementizer in the embedded clause that assigns Case, while the sharing of a DP derives both raising and Control.
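Summarizing the two derivations so far, the pivot itself predicts the dependencies: sharing D-N_ext gives subject identity, and sharing T-V_L additionally gives tense anaphoricity. A sketch of this emerging typology (the boolean 'consequences' are my gloss on the discussion above, not an independent formal result):

    TYPOLOGY = {
        "raising":    {"pivot": {"D-N_ext", "C_K"}},
        "EC control": {"pivot": {"D-N_ext", "T-V_L"}},
        "indicative": {"pivot": {"D_K-C_K"}},       # see section 4.3.1.4
    }

    def consequences(pivot):
        return {
            "subject identity": "D-N_ext" in pivot,  # the shared subject
            "tense anaphoric":  "T-V_L" in pivot,    # T-sharing, not C-agreement
        }

    for name, entry in TYPOLOGY.items():
        print(name, consequences(entry["pivot"]))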
4.3.1.3. Deriving Partial (Logophoric) Control

The empirical differences between EC and Partial Control (PC) that I address here (Landau 2000, 2004) are: (1) the controlee must either contain or match the controller in the matrix clause – it can match exactly, or it can refer to a group (a plural) that includes the controller; and (2) PC has some tense dependency on the matrix, but it is not anaphoric. [Footnote 36: There are further properties associated with logophoric control (Landau 2013); I leave these to future research at this time.]

(236) Yesterday, John hoped to meet tomorrow.

I leave an account of logophoric control to future research at this time, and suggest that it is in fact 'larger' along some dimension than exhaustive (predicative) control, because it involves two Tense heads, and a looser referentiality between the subject of the matrix clause and that of the embedded clause. However, I am unsure how to account for the facts that (a) tense remains restricted by the matrix clause, and (b) the referent remains restricted by the matrix clause – although neither is under exact identity. Landau (2013) begins to account for this through a variable-binding account from the matrix clause into the embedded C (rather than identity matching under predication), but it requires a semantics that I have not yet built into the syntax. I think sharing at the level of the higher phase may still be present, but there is some sort of referential quantification going on that allows for a looser interpretation, and I leave this to future research.

4.3.1.4. Deriving Indicative Embedded Clauses

The simplest case, in some sense, is the embedded finite indicative clause (with a possible overt complementizer). Here, there is no true pivot, and the embedded clause has its own {D_K, C_K}, assigning independent Case to its arguments within the embedded clause. It must, however, still be embedded – and 'to embed is to share (something).' So the pivot, I propose, involves the {D_K, C_K} pair of features, along with the entire clause, just as when two verbal phases are linked within a transitive clause.

(237) D_K [TP_embed DP_embed]    C_K [vP_matrix DP_matrix [TP_embed DP_embed]]

The pivot becomes the entire embedded clause, and C_K is what allows the verbal phase to extend into the matrix clause. One interesting possibility with {D_K, C_K} is that it could be reversed; this may account for why embedded clauses are nominalized (or Case-marked) in some languages (see, e.g., Coon 2010):

(238) C_K [TP_embed DP_embed]    D_K [vP_matrix DP_matrix [TP_embed DP_embed]]

Here, the embedded clause appears 'nominal' to the higher, matrix clause, rather than 'verbal,' because it is found within the context of D_K rather than C_K, which could allow for D_K/Case to appear on the embedded clause. I set the details aside for now, but refer to a work in progress by McKinney-Bock & Vergnaud (2011), which proposes that a {D, C} connective – as proposed in chapter 5 for relative clauses, and serving the same role as the {D_K, C_K} developed here – plays a crucial role in both embedding and relativization, deriving factive, non-factive and relative clauses.

4.3.1.5. Summary

To summarize, there is a distinction between raising, EC and PC clauses with regard to their syntactic size as well as their embedded pivot.
There are still many empirical issues to work out, as the generalizations about control in the literature run broad and deep, but the general hypothesis is that these clauses are different 'sizes' and so embed at different points in their structure. We can obtain all of the long-distance agreement relations noted by Landau through the sharing of various elements. Were this project to be worked out fully, there would be an enormous conceptual benefit: there would be no passive C-structure whose only job is to percolate features from the matrix to the embedded clause, nor would there be any long-distance agreement. Rather than selection of various heads (whose definitions are stipulated), there would be a principled reason why we see raising, PC and EC clauses in syntax, having specifically to do with how large the embedded clause is and how much is shared. There would also be a link between embedding and relativization (cf. Kayne 2008); NOUN-sharing in general stitches together clauses and builds relationships between matrix clauses and complement/adjunct clauses. The role of the {D_K, C_K} connective is key in both, sharing either D_K or C_K in meaningful ways.

4.3.2. A Shared Property of Raising & Control

Vergnaud (forthcoming) addresses raising clauses, in particular the data from Chomsky 1995 with respect to the scope of negation inside the infinitive:

(239) Everyone seems not to be there yet.
     not: not everyone is there yet   *NEG > every
     yes: nobody is there yet          every > NEG

This is interpreted as the following:

(240) everyone seems [not one to be there yet]

Vergnaud's interpretation is similar to Fox's Trace Conversion, in the sense that the quantifier is only merged 'high' and does not reconstruct; rather, the trace is a definite object (something like 'the one'). A similar analysis is that of Sportiche (2005), who analyzes the D as merging higher than N (see discussion in chapter 5). Here, I observe that this property of quantificational every is observable with control clauses as well:

(241) Everyone has managed not to be there yet.
     not: not everyone is there yet   *NEG > every
     yes: nobody is there yet          every > NEG

With Partial (Logophoric) Control, the scope issue with negation remains:

(242) Everyone preferred not to be there at 5.
(243) a. Everyone preferred not to be there at 5 (and in fact, nobody was).
     b. Everyone preferred not to be there at 5 (and in fact, not everyone was).

In the (a) sentence, everyone had their preferences met, because they weren't there at 5. In the (b) example, unfortunately, some people had to go against their preference and be there by 5 anyway. So the scope of negation, once again, is below everyone. If everyone were reconstructed in the subject position, this would not be the case:

     b′. #Everyone preferred not everyone to be there at 5.

The meaning of (243b′) is very different: the preference is that only some people arrive by 5pm, not that nobody arrive by 5pm. At any rate, the problem persists: the quantificational determiner D must not be in the embedded clause, only the matrix. There seems to be an odd interaction between the plural/group interpretation of everyone arriving and the stricter reading on which each man cares only about his own arrival. Quantification is discussed, once again, as problematic for relative clauses in chapter 5 – the relative clause restricts the domain of quantification, which is absent from the matrix.
This is strongly related to the possibility of every scoping within the matrix, but not within the embedded clause. I discuss this further in chapter 5.

4.3.3. Raising, Control, and PF Spell-out

Here, as in chapter 3, I illustrate how the structures at narrow syntax are Spelled-out to the interfaces (in particular, looking to PF linearization). For Raising structures, the Condition on Phrase-markers becomes important, as the subject noun phrase is shared across the matrix and embedded clauses. Taking the structure that illustrates the shared D-N, we have the following sets of Phrase-markers:

(244) Shared D-N ext : (NOM, is), (NOM, likely), (NOM, the), (NOM, man), (NOM, EVENT), (NOM, win)

Phrase-markers (the original gives tree diagrams; they are rendered here as labeled bracketings):

(i) [is [the the man] [is [is likely] [EVENT EVENT win]]]
(ii) [is [is likely] [EVENT [the the man] [EVENT EVENT win]]]

In (i), [the man] occurs in the Phrase-marker with [is likely]; in (ii), it occurs with [EVENT win]. The Phrase-markers in (244i) and (244ii) would be 'stitched' together via the complementizer domain, in this case to, which I illustrate below. Here, looking only to the narrow syntax given in (244), we derive the sets above. Crucially, the following Phrase-marker is ruled out, where [the man] occurs both with [is likely] and with [EVENT win]:

*(iii) [is [the the man] [is [is likely] [EVENT [the the man] [EVENT EVENT win]]]]

This will consequently (once the complementizer to links the two Phrase-markers) disallow the PF pronunciation of something like:

(245) *The man is likely the man to win.

Now let's turn to the Spell-out of the lower and upper phases, using to to link the two phases. As in chapter 3, Spelling out first the lexical-functional and nominal-verbal edges, we obtain the following Phrase-markers:

(246) [[the man] [is likely]] and [[win-EVENT] [the game]]

Then, the linking of these occurs via the complementizer, in this case taken to be the infinitival complementizer to:

(247) [[[the man] [is likely]] to [[win-EVENT] [the game]]]

As with the option above, there is a further Spell-out option in which [the man] appears low in the tree, with [win-EVENT]:

(248) [[is likely] to [[the man] [win-EVENT] [the game]]]

I'd like to take the opportunity here to mention something I omitted in chapter 3, for simplicity's sake. The reader may be wondering what happened to the lower object phase, which is shared across the {D K , C K } phases in a form of reduplication. Following Vergnaud (forthcoming), I take Spell-out to allow for a complementary deletion of reduplicated forms. 37 In this case, one would end up with the following linear constituent, for the object phase [[win-EVENT] [the game]]:

(249) [ [ C K [win-EVENT the game]] [ D K [win-EVENT the game]] ]

And the following complementary deletion applies (the original marks the deleted material with strike-through, which this transcript does not preserve: each copy deletes what the other pronounces, so that [win-EVENT the game] is pronounced exactly once):

(250) [ [ C K [win-EVENT the game]] [ D K [win-EVENT the game]] ]

37 Along with Vergnaud (forthcoming), I do not have a complete theory of deletion at present, though this is a rich domain of inquiry to pursue in further research. See McKinney-Bock (2010) for the beginnings of such a theory.

Turning to Exhaustive Control clauses, a very similar sharing occurs with a larger constituent: this time, [D ext -N ext T-V L ] is shared across the matrix and embedded clauses, marked with NOM case, rather than simply [D ext -N ext ] as in the raising clauses. Again implementing the Condition on Phrase-markers, there are two possibilities for this shared 'plane':
(251) Shared [D ext -N ext T-V L ], Matrix and Embedded [Diagram: the matrix and embedded coordinate sets side by side. Both contain the plane (NOM, T), (NOM, V L ), (NOM, the), (NOM, man), the shared constituent; the matrix side additionally has the coordinates (C K NOM , T), (C K NOM , V L ), (C K NOM , the), (C K NOM , man) and (that MATRIX , T), (that MATRIX , V L ), (that MATRIX , the), (that MATRIX , man).]

This is the structure for the shared constituent of an EC clause. To illustrate better, I will stitch together the matrix and embedded clauses/cubes (this is equivalent to the above diagram):

(252) Shared [D ext -N ext T-V L ] [Diagram: the two cubes stitched together. One cube spans the contexts {C K NOM , NOM} and the other spans {NOM, that MATRIX }, each crossed with the items {T, V L , the, man}; the (NOM, ·) plane is the face the two cubes share. In the original, the shared plane is highlighted in gray and the (NOM) coordinates are bold.]

Notice that the Condition on Phrase-markers rules out the nominative [the man T-V L ] from occurring in both the matrix clause and the embedded clause:

(253) [Diagram: the configuration in (252), with [the man T-V L ] marked as occurring in both cubes, which the Condition on Phrase-markers excludes.]

And so, in EC we see the Phrase-markers Spelled-out such that [the man T-V L ] is found in the matrix, rather than the embedded, clause (though a parameter is predicted, such that the alternative is available).

4.3.4. Benefits for the ICA over current theories

A generalized mechanism of embedding as sharing accounts for the empirical facts without depending on selection of some type of root node that can bear features (or not). There is also a link between A- and A-bar movement, namely generalized sharing, and a unification of the mechanism behind embedding and relativization is possible (see the discussion of relativization in chapter 5). The eventual empirical benefit may be a large-scale unification of embedding, clausal complementation, relativization, and islands through the same mechanism of sharing.

4.3.5. Shortcomings

As of yet, there is no principled definition of a pivot, and the empirical consequences of free sharing have not been explored. A typology of sharing in the subject phase may, though, be free, and derive the different types of raising and control (not all combinations were explored here, however). Also, the empirical coverage is not yet complete (quantifier raising and partial control, both involving quantification), and although the negation data is not discussed in the MP or TAG literature, it is quite likely feasible to account for it by assuming that CP blocks the scope of negation. Finally, the elegant unification by Pesetsky & Torrego 2001 of various phenomena involving T and C is not yet fully derived; the lack of a worked-out notion of wh-movement prevents us from completing the unification of T-to-C and that-t effects, etc.

4.4. Conclusion

This chapter proposes that embedding involves sharing a (semantically interpretable) constituent across two phases. From this, a typology of embedding is derived.

Chapter 5
Defining Spell-Out and Fusing Relativization/Coordination

5.1. Introduction

In this chapter, I develop a theory of Spell-out using a constraint on directed graphs. Adding to the structure set out in prior chapters, in which symmetric phases in syntax are linked with a nominal-verbal complementizer pair and embedded using sharing, this chapter introduces an asymmetric notion of headedness/labeling, which is critical for interpreting the objects of narrow syntax at the PF and LF interfaces.
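Since the asymmetric notion of headedness/labeling does the interpretive work at the interfaces, it may help to picture it procedurally before the chapter develops it. The sketch below is only my illustration of the general idea (none of these names come from the dissertation): Merge yields unordered pairs, and choosing a head/label orients each pair into a directed edge, the kind of object a linearization procedure can consume.

```python
# A minimal sketch, not the dissertation's formalism: Merge produces
# symmetric (unordered) pairs of formatives; choosing a head/label for
# each pair orients it, yielding a directed graph for the interfaces.

def orient(merges, head_of):
    """merges: iterable of frozenset pairs {x, y};
    head_of: maps each pair to its chosen head/label.
    Returns directed (head, dependent) edges."""
    edges = []
    for pair in merges:
        head = head_of[pair]
        (dependent,) = pair - {head}   # the remaining, non-head member
        edges.append((head, dependent))
    return edges

merges = [frozenset({"T", "V"}), frozenset({"D", "N"})]
heads  = {merges[0]: "T", merges[1]: "D"}
print(orient(merges, heads))  # [('T', 'V'), ('D', 'N')]
```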
After introducing the theory, I turn to the empirical side and use the constraint on digraphs to look at simple subject relativization, in order to illustrate how the constraint gives rise to headed relative clause constructions that can be linearized. Then, I illustrate the benefits of the building blocks and operations of this new conceptualization of syntax with an old empirical problem: split-antecedent relative clauses. There is difficulty in representing relative clauses with split antecedents that bind reciprocal anaphors (cf. Perlmutter & Ross 1970, McCawley 1982, Link 1984, Wilder 1994):

(254) Mary met a man and John met a woman who know each other well.

The issue is how to connect a relative clause with a plural relativizer to two antecedent nominals, each in a separate clause. I argue that the split-antecedent relative clause (SARC) structure is not unique to relative clauses. Instead, a general notion of coordinating sets of grammatical formatives allows SARC to be derived from the same syntax as a host of other coordinated structures, rather than requiring additional stipulated syntactic mechanisms. Within 'graph theoretic syntax,' I argue that SARC are a natural, predicted consequence.

As with any new theory, and as in the other chapters, many open issues remain, and I begin to discuss them at the end of this chapter. In this chapter, however, I put forth an analysis of coordination and relativization that makes more explicit the ideas in McKinney-Bock & Vergnaud 2010, and I begin to look at clause structure and expand the simplifying assumptions made in past work, to paint a more complete picture of the primitives of syntax.

The roots of this approach lie at the intersection of several lines of research: (i) recent work within the Minimalist Program (Chomsky 1999, 2001, and work cited there), as discussed; (ii) CAT theory (Williams 2002), from which a similar two-tiered architecture is utilized; (iii) graft theory and Multidominance, from which the idea of sharing a syntactic object is used, and in this system generalized; (iv) antisymmetric syntax (Kayne 1994, 2005, and work cited there); (v) the theory of relators (den Dikken 2006), from which the idea of generalized relators as grammatical connectives is used; (vi) work by Schein (2010), looking at the semantic interpretation of arguments; (vii) work in articulatory phonology (Browman & Goldstein 1986, 1992, 2000, Goldstein, Byrd & Saltzman 2006, Saltzman & Byrd 2000, Saltzman, Nam, Krivokapic & Goldstein 2008), which argues that a system of overlapping gestures at the level of phonology, rather than segments, creates linear speech (here, 'overlapping' grammatical relationships at the level of syntax lead to linearization at PF, rather than grammatical relationships being required to be linearizable at the level of syntax); (viii) work by Liao & Vergnaud and Liao (2011), who explore the symmetry of syntax both in the DP and at the level of aspect/modality; and, finally (and centrally), (ix) work by Vergnaud (forthcoming). 38

38 As a note, some of the sections found in this chapter are repeated in chapter 2; I do this because the argument develops in both chapters, at a more general level in chapter 2 and specific to relativization here in chapter 5, and I feel it is best to utilize some of these concepts in both places. My apologies to the reader.

5.2.
The Minimalist Program/Traditional Syntactic Theory: Relativization & Coordination

5.2.1. Relativization

There are several approaches to relativization in syntactic theory (starting from Kuroda 1968); here I discuss the three main ones: the head-raising (movement) approach, the head-external approach, and the deletion-under-identity, or 'matching', approach.

A head-raising derivation of a single-headed relative clause (Vergnaud 1974, Kayne 1994, a.o.) proceeds as follows. First, the head of the relative clause merges into the argument/adjunct position inside the relative clause (Kayne 1994). Then, it extracts to an A′-position in the relative clause, and subsequently moves into the matrix clause.

(255) [the [ CP book that [Mary read book]]]

Under a head-external approach, the head noun of the relative clause is merged outside of the relative clause, and an operator undergoes A-bar movement from the internal position of the head to the A-bar position in [Spec, CP].

(256) [the book [ CP Op i that [Mary read t i ]]]

The matching approach contains two independent copies of the noun book, and involves relativization of the inner copy, followed by deletion:

(257) [the book [ CP which book [Mary read which book]]]

5.2.2. Coordination

Under current theory, coordination is treated as a maximal projection, which asymmetrically takes two phrases that are 'matched' in size (see, e.g., Munn 1992):

(258) [ And MAX CP 1 [ and and MIN CP 2 ]]

This type of coordination account derives 2nd-conjunct asymmetries that we observe in language, such as quantifier binding of a pronoun:

(259) Every man and his dog walk in this park.
(260) *His dog and every man walk in this park.

Notice that there are no parallels between the theories of relativization and coordination under current syntactic theory; they are independent constructions with independent structures.

5.2.3. Freidin & Vergnaud: On the relationship of movement and coordination

One concept that forms the foundation of this chapter is that movement and coordination are 'flip sides' of the same coin (Freidin, p.c., Vergnaud, p.c.). Intuitively, there is a parallel between the two types of constructions: movement involves one syntactic object in two contexts, and coordination involves two syntactic objects in one context. To illustrate:

(261) [X 1 α … β … X 1 γ] (movement)

X 1 falls in two immediate contexts:

(262) [— γ] [— α]

On the other hand, coordination involves two Xs in the same context:

(263) [and X 1 and X 2 ] [— and]

I use an implementation of coordination and relativization in this chapter which reflects this intuition: the head noun will be shared across two immediate contexts, representing the two argument positions in the relativized and matrix clauses. Then, turning to SARC, I use the coordinator and as a reduplicator, which takes a single syntactic structure (i.e., a CP) and creates two abstract, functional contexts for that CP, allowing two sets of lexical items to be merged as CP 1 and CP 2 .

5.3. An Empirical Problem Resolved: Split-Antecedent Relative Clauses

5.3.1. Perlmutter & Ross 1970: The empirical issue

The problem of split-antecedent relative clauses (SARC) was brought to light in a squib by Perlmutter and Ross 1970. Ross' Rule of Extraposition had derived extraposed relative clauses from an assumed counterpart containing the relative clause adjacent to its antecedent:

(264) (Perlmutter & Ross' ex. 1, 2) A woman who was wearing a fur coat entered the room.
A woman entered the room who was wearing a fur coat. However, they point out that not all relative clauses can have the clause adjacent to the head noun, and can only be extraposed (Perlmutter & Ross’ 3-5): (265) A man entered the room and a woman went out who were quite similar (266) *A man who were quite similar entered the room and a woman went out (267) *A man entered the room and a woman who were quite similar went out They leave this as a paradox. As of yet, this empirical paradox does not have a clear solution, and remains a difficulty in BPS and current theory. I discuss possible analyses that have been mentioned in the literature, although a detailed analysis of these structures has not yet (to my knowledge) been undertaken. I then attempt to detail an analysis of SARC utilizing the ICA as I have described it in this paper, and show that SARC are naturally predicted from the way I propose to treat coordination under the ICA. 197 5.3.2. How the literature handles SARC 5.3.2.1. Solution #1: Extraposition, RNR, and Multidominance at PF 5.3.2.1.1. Multidominance at PF (McCawley 1982) McCawley 1982 argues against the stipulation that discontinuous constituent structures, structures that dominate items without dominating everything that is between them, are not part of syntax. Instead of taking strings to be primitives in linguistic theory (which only allow for certain types of trees), he takes trees to be primitives, and defines a set of axioms on trees that he uses to develop a syntax of structures for parentheticals, RNR structures, extraposed relative clauses, and several other empirical phenomena that seem difficult to account for in the transformational grammar of the 1970s. Then, he proposes two types of transformations on trees in order to account for discontinuous constituent structures. He defines ‘relation-changing’ and ‘order-changing’ transformations. Relation-changing transformations change constituent structure, which also involve changes to the surface order of the constituents, while order-changing transformations just change the surface order of the syntactic items without changing the constituency of the structure. Order-changing transformations yield discontinuous constituents, as in the following Right-Node-Raised (RNR) construction, which shares a single NP. The structure appears as follows (McCawley 1982: his 11c): (268) Tom may be, and everyone is sure Mary is, a genius. 198 While McCawley’s discourse aims to redefine transformations that create surface trees from deep structure (for the purposes of the article, he assumes that deep structure does have continuous trees, p.94), which is not what I am addressing here, the fundamental idea that ‘surface’ Phrase-markers allow for shared constituents has become crucial to many recent accounts of Multidominance, including Right-Node-Raising, which has been proposed to account for split-antecedent relative clauses. What is important here is McCawley’s underlying idea, that strings – which have roots in automata theory – may be replaced by trees – which have roots in graph theory. While I do not argue that Multidominance is a surface feature – seemingly in contradiction to what McCawley is proposing with respect to surface structures – the idea that trees and graphs can ‘share’ a syntactic item that occurs only once in surface structure, yet is semantically interpreted in multiple grammatical relationships, is crucial. 
The difference is that this single occurrence comes from an item in an abstract syntactic structure, a deep(er) structure, which creates a family of 199 Phrase-markers that are used at PF – including structures similar to those that McCawley is generating in transformational grammar. Perlmutter & Ross 1970 noted that split-antecedent relative clauses require extraposition. McCawley looks at extraposition as an order-changing operation that doesn’t change constituent structure, which he takes to explain why extraction from extraposed relative clauses is still disallowed (his 10b,bʹ′): (269) A man entered who was wearing a black suit. (270) *What kind of clothing did a man enter who was wearing? In McCawley’s proposal, this still falls under a violation of Ross’ CNPC, because the NP and the relative clause remain a constituent – the order-changing transformation has triggered extraposition, but the constituency remains the same. Unfortunately, the order-changing transformation is not enough to account for the paradox in Perlmutter and Ross 1970 with split-antecedent relative clauses. There is no ‘deep structure’ (or narrow syntax) for the collective relative clause that would allow for extraposition. One would have to draw a structure similar to the RNR examples in McCawley’s paper, but for narrow syntax and work from there, where the relative clause is shared across the coordinated structures, as in his RNR example. The following section summarizes the possibilities. 5.3.2.1.2. Extraposition: the possibilities, and the problems Baltin 2005 summarizes four types of extraposition in the literature: (i) rightward movement (Ross 1967), (ii) base generation with some process interpreting the extraposed element in its original position (Culicover & Rochemont 1990), (iii) base generation with leftward movement into its original position (Kayne 1994), and (iv) a mixed analysis (Fox & Nissenbaum 1999). 200 5.3.2.1.2.1. Rightward movement Baltin 2005 uses SARC as an argument against rightward movement, following the paradox proposed by Perlmutter & Ross 1970. In this account, the relative clause would move rightward and be extraposed/moved to a higher position. However, in the case of SARC it is unclear what position the relative clause could be base generated in, as the antecedents are split across two clauses. 5.3.2.1.2.2. Base generation with leftward movement of the head (Kayne 1994) This analysis generates the extraposed element in its rightward position, which Kayne (1997) argues is low in the clause, and then the head moves leftward into its non-extraposed position. Through this, Kayne (1994) accounts for Ross’ Right Roof Constraint, which states that a syntactic object cannot be moved (to the right) higher than the clause where it originates, because a violation of the Right Roof Constraint under the structure Kayne proposes would require lowering the head into the clause where its non-extraposed position is, and movement into a non c-commanding position is generally assumed to be ruled out. At any rate, this analysis runs into two serious problems for SARC. First, two heads would have to move out of the extraposed relative clause and then be ‘split’ into two separate clauses. Second, it is also unclear where the extraposed relative clause would actually be generated in the structure. Under this account, extraposed elements are meant to be in a very low position in the clause, with the head raising out of it into a higher position. 
With the conjoined CPs, there is no shared position that would be low enough to allow this type of movement. This is a similar problem that the rightward-movement account runs into. 201 In sum, the two movement accounts run into problems for SARC. Now I turn to the approaches that base generate the relative clause in a higher, extraposed position and that don’t require movement of the heads or of the relative clause. 5.3.2.1.2.3. Base generation with an interpretive mechanism (Culicover & Rochemont 1990) As cited in Baltin 2005, Culicover & Rochemont 1990 propose a rule accounting for the Right Roof Constraint that says that an extraposed element must be adjoined to the “minimal maximal” projection containing its host. This is similar to the principle from Guerón and May 1984, which argues that the extraposed constituent is governed by its head which has QRed higher than the extraposed clause. Both Culicover & Rochemont 1990 and Guerón & May 1984 are accounts of Ross’ Right Roof Constraint, and their definition of locality would be violated if the extraposed element were adjoined to a higher clause than the one where its head sits. In the case of SARC, this is a plausible analysis, as the conjunction of two CPs would be the minimal maximal projection required for interpretation of the two split antecedents. One could imagine that the extraposed relative clause could be base-generated as an adjunct to CP. As long as the two DPs in each conjunct could QR out of the clause, in ATB fashion, and attach to CP, this would obey the constraints on movement noted in the literature. However, Guerón and May have data that show differences between extraposition of definite noun phrases and quantified noun phrases – definite noun phases cannot extrapose as easily (Baltin 2005, citing Guerón and May): (271) *The man showed up that hated Chomsky The argument here is that the man cannot QR, and so the extraposition is not possible. This analysis, then, makes an immediate prediction that SARC are ruled out with definite NPs. (272) The man showed up and the woman left that hate each other. 202 This is not the case, so it is doubtful that a base generated extraposition analysis, where the coordinated subjects are both QRed (in an ATB fashion) to CP in order to license the CP- adjoined, extraposed relative clause, is appropriate for SARC (under the assumptions made above). Also, in the above construction, if the subjects don’t raise, then it appears that the Right Roof Constraint is violated, which is a strange violation to an otherwise strong generalization. However, it appears that SARCs violate the Right Roof Constraint in another way as well (Pancheva, p.c.): (273) I expect to meet a man and you expect to meet a woman who know each other well An analysis using maximal-minimal projections that depends on the Right Roof Constraint may not be appropriate here. As a note, Baltin 2005 also shows problems for this approach that are independent of the SARC construction, which should be kept in mind. 5.3.2.1.2.4. Fox & Nissenbaum 1999 Baltin proposes to apply Fox & Nissenbaum’s analysis to SARC, where extraction of both conjuncts occurs and the extracted constituents QR to a higher position, with Late Merger of the relative clause. He gives the following derivation (Baltin 2005, his 85): (274) a. [[a man entered the room] and [a woman left]] [who were similar] (QR from both conjuncts, yielding (b)) b. 
[[[DP a man] entered the room] and [[DP a woman] left]] [DP [DP a man] and [DP a woman]] [who were similar] (merger of the relative clause to the conjoined DP yields (c))
c. [[[DP a man] entered the room] and [[DP a woman] left]] [DP [DP [DP a man] and [DP a woman]] [CP who were similar]]

His example is a bit incomplete, as it doesn't mention that the DPs must be moved out of their conjuncts and then additionally coordinated before being Merged to the relative clause. This is essentially what I suggested in the previous section, but it is more specific about the details: Late Merger of the relative clause, rather than simple base-generation followed by QR to license the already-Merged relative clause. To argue for this possibility, Baltin points out that ATB topicalization of coordinated nominals can occur, and likens this type of ATB to those constructions (his 86):

(275) This book i and that magazine j John bought t i at Borders and Bill bought t j at Dalton's respectively.

Before looking at the fine details of this analysis of SARC, the bigger picture is that 1) SARC here are being treated as extraposed, and 2) Late Merger assumes a non-head-raising approach. As little empirical support for 1) or 2) is presented, other than the fact that the relative clause is pronounced at the end of the sentence, it is necessary to look at the facts first. Fox & Nissenbaum present data that can serve as an empirical test for whether or not SARC are truly extraposed. In extraposed adjuncts, Principle C effects are alleviated, while in-situ adjuncts still show Principle C effects. So, we expect certain Principle C effects to be alleviated in SARC. We do find this. Another key prediction of this extraposition account with Late Merger is that the relative clause cannot have the head(s) in situ, because they QR from the main clauses and attach prior to merging of the relative clause. This predicts that reconstruction is not possible. We find evidence of reconstruction, which calls into question a Baltin/Fox & Nissenbaum-type analysis of SARC.

Principle C

From the facts in Fox & Nissenbaum 1999 and Fox 2002, who argue that Late Merger occurs with adjuncts but not complements, we see that there are important consequences for Condition C with extraposed relative clauses (Fox 2002: 74):

(276) Extraposed
a. I gave him i an argument yesterday that supports John's i theory. [adjunct]
b. ??/*I gave him i an argument yesterday that John's i theory is correct. [complement]

The late-merged adjunct allows co-reference between him/John in (a), while the complement is not late-merged, and Condition C effects remain between him/John. Also, in non-extraposed clauses, both complement and adjunct, Condition C effects are expected to remain:

(277) Not Extraposed
a. *I gave him i an argument that supports John's i theory yesterday. [adjunct]
b. *I gave him i an argument that John's i theory is correct yesterday. [complement]

There is a difference between extraposed and non-extraposed relative clauses (which are adjuncts) with respect to Condition C. We now have a direct test as to whether SARC behave like regular relative clauses or like relative clauses that have been extraposed. If SARC behave like extraposed relative clauses, then we should see no Condition C effects (i.e., no reconstruction). If SARC behave like regular relative clauses, Condition C effects should show up (i.e., reconstruction).

(278) John gave her i an argument and Bob gave her i a linguistic judgment that (both) support Mary i 's theory.
Principle C effects are alleviated, so SARC behave like extraposed relative clauses. Reconstruction 205 Another aspect of a Late Merger analysis of relative clauses is that the head of the relative clause cannot be raised out of the relative clause, as the head of the relative clause is merged prior to the (Late) Merger of the relative clause. If the Late Merger account of SARC holds, then we predict no reconstruction effects. This is not what we find. There are certain cases in the literature where reconstruction effects are observed. For example, it has been noted that pronouns can be bound by a quantifier inside the relative clause in certain cases: (279) The relative of his that every boy likes the most is his mother The pronouns cannot be easily bound by a c-commander inside the relative clause when the relative clause is extraposed: (280) *?The relative of his is his mother that every boy likes the most Let’s test with SARC: (281) Susan met a grad student of his and Mary met an undergrad student of his that every man saw get married to each other. Here, we do have reconstruction, which is not predicted by a Late Merger analysis. Interestingly, insertion of an adverb like “yesterday” (the typical assumption being that extraposition has occurred) maintains grammaticality: (282) Susan met a grad student of his and Mary met an undergrad student of his yesterday that every man saw get married to each other. Also interesting is that we can combine the reconstruction example with the Principle C example, and the sentence is acceptable: (283) The graduate students gave her i an argument of his j and the undergraduate students gave her i a linguistic judgment of his j that every professor j thought (both) supported Mary i ’s theory. It is a bit mysterious how Principle C effects could be alleviated at the same time as pronouns bound within the relative clause. Under a Late Merger analysis, the Principle C alleviation is 206 accounted for, but we would not be able to bind his with the subject of the relative clause. Conversely, under a raising analysis (which, it was already discussed, does not have a mechanism to raise conjoined heads and then split them amongst two CPs), we would predict the binding of his, but not the Principle C alleviation. To complete the paradigm, we see a similar paradox arise when we bind pronouns in the relative clause from quantifiers in the matrix, while still checking for Principle C alleviation: (284) The graduate students gave her i an argument of every professor’s j and the undergraduate students gave her i a linguistic judgment of every professor’s j that he j thought (all) supported Mary i ’s theory. It seems that we can have quantifiers binding pronouns from the matrix into the relative (implying c-command from the matrix into the relative), but we still allow for pronouns in the matrix that refer to R-expressions in the relative clause (implying no c-command). It seems that we need a system that would allow for certain LF configurations for pronoun binding, and certain PF configurations for the Principle C effect, while still maintaining the coordinated DPs as a constituent for the plural relative clause and allowing them to be split at PF into two different clauses. Perlmutter & Ross’ paradox persists. In our system, however, which gives rise to a family of Phrase-markers, an analysis maintaining both Principle C (at PF) and pronoun binding (at LF) would be possible, as the Phrase-markers at PF and LF do not necessarily have to be the same. 
Additionally, the coordinated DPs in our system do remain a constituent and split at PF into two different clauses – this is accounted for in Section 5.3.3.4. 5.3.2.1.3. A note on Parallelism Baltin notes that it is necessary to account for the apparent need for parallel positions with the split antecedence, giving the following examples (Baltin 2005, his 50-52): (285) A man entered the room and a woman left who were similar. (286) *A man visited a woman (yesterday) who were similar. 207 (287) *A man entered the room and I saw a woman who were similar. 39 Baltin points out that his analysis (given above) would additionally derive the parallelism constraint on SARC, as subjects and objects cannot be ATB-extracted. Interestingly, while subjects and objects seemingly cannot be coordinated in SARC, direct and indirect objects can be. We also see this relaxation with regular ATB: (288) The book that Mary bought and (that) John was given (289) Mary submitted an essay and John was given a term paper that matched each other word-for-word. The parallelism constraint remains unaccounted for this under this system. 5.3.2.2. Solution #2: Ellipsis Wilder 1994 discusses SARC in the context of an argument for full CP/DP coordination with ellipsis, as opposed to a Williams-esque small clause coordinate structure, following the Like-and-Like constraint (i.e. NP-and-NP, VP-and-VP, vP-and-vP, etc are permitted). Wilder argues for a full CP/DP coordination (‘large’ coordination), and gives an empirical 39 As an anonymous GLOW reviewer for McKinney-Bock & Vergnaud (2010) pointed out to me, parallel positions may not be a constraint on SARC. The reviewer gave the following example: (i) Kyra marched in with a young man and another young man showed up with Emily who turned out to have attended the same high school in Tasmania. However, as Roumyana Pancheva points out, the use of ‘same’ does not guarantee that this is a plural relative clause that is shared by split antecedents. The interpretation of the above example could have an “external reading,” as in the following example: (ii) John attended School X. I attended the same school. [external ‘same’] (iii) John and I attended (one and the) same school. [internal ‘same’] If we have an external reading of ‘same’ then the relative clause might just be a singular relative clause that is with the head another young man. We can test with ‘one and the same’: (iv) ??Kyra marched in with a young man and another young man showed up with Emily who turned out to have attended one and the same high school in Tasmania. It seems that we may have a structure here like the following, but with the relative clause extraposed: (v) Kyra marched in with a young man and [another young man [who turned out to have attended the same high school in Tasmania]] showed up with Emily. 208 generalization that there are three types of deletion: forward deletion (both left-peripheral and gapping), and backward deletion (accounting for RNR structures). As a result, his analysis of split-antecedent relative clause structure follows his general argument for ellipsis in coordination. He argues that split-antecedent relative clauses come from the following structure (his 143): (290) [John met a man [who knew each other well]] and [Mary met a woman [who knew each other well]]. This is his ‘backward deletion’, which he also uses to account for other empirical phenomena that have been analyzed as Right-Node-Raising (RNR). 
It also requires him to explain how, at LF, this structure is interpretable, since the surface form *a man who knew each other well is ungrammatical. He turns to Chomsky and says that this syntactically well-formed, but “receives an interpretation as ‘gibberish’.” Wilder pushes the collective interpretation into a discourse model, and not into the syntax. Additionally, (and perhaps unfortunately), he also pushes plural agreement between the head of the relative and its relative pronoun into another, semantic, module of language, which threatens any syntactic account of agreement in general. Interestingly, Wilder observes that the collective relative clause can occur in another context – one in which two DPs are coordinated (cf. Link 1984), and that the interpretation of such structures does not differ much semantically from the split-antecedent structures: (291) The man and the woman [who knew each other well] sat together. Since Wilder has assumed that DP coordination exists, he assumes that the two DPs are coordinated and that the relative clause adjoins to the coordinated DPs. As this assumption is not syntactically available for the split-antecedent relative clauses, he is cornered into the analysis I have outlined above. 209 Wilder’s observation about the similar semantic interpretation between split-antecedent relative clauses and these coordinated DP relative clauses is important to the analysis proposed in McKinney-Bock & Vergnaud 2010, and expanded upon in this paper. Rather than assuming that a syntactic mechanism works for one case and not the other (as the coordinated DP-relative clause adjunction analysis would entail), our structure can derive the coordinated DP structure and the split-antecedent relative clause structure from a single underlying structure, with one parameter that changes. In the spirit of Wilder 1994, who argues for large coordination (at the level of C and D), that parameter has to do with the relationship between phase heads D and the Cs of the clauses being relativized. 5.3.2.3. Interim Summary In the literature, two possible accounts of SARC emerge: an extraposition analysis or an ellipsis analysis. At their roots, the extraposition and ellipsis accounts rely on various types of long-distance relationships – movement, or an identity relationship that permits deletion. Both of these accounts run into empirical problems, and we treat the long-distance relationships as less than theoretically ideal. Our account (below) is not ellipsis or extraposition: an additional level of syntactic abstraction allows for an LF structure that allows for ‘normal’ semantic interpretation, with arguments in situ in both matrix and relative clauses without movement. Missing from the literature (to my knowledge) is how SARC could be treated with a Multidominance approach, other than the proposal in McCawley 1984 with Multidominance at PF. One would expect that it would make similar predictions to Kayne’s approach, where the relative clause would exist ‘down low,’ right next to each of the antecedents (cf. McCawley’s structure, at narrow syntax). Here, the problem of where the relative clause is attached is made 210 clear, but we still have the problem that Principle C effects are alleviated – if the relative clause is ‘down low’ we would expect Principle C violations, as discussed above. There is a deeper issue with these tests for SARC. 
Both Principle C and pronoun binding are diagnostics for a single tree that is created at narrow syntax, interpreted at LF, and linearized at PF. This makes it very difficult to obtain split antecedents via movement of the split heads out of the relative clause, or via movement of the relative clause itself. The current tests return mixed arguments for the position of SARC, and all current potential analyses have difficulties. Taking a slightly different route, we can set aside for now the issues of PF and LF relations such as Principle C and pronoun binding, which are interface requirements that utilize c-command relations. I choose to focus instead on the fact that we want to be able to derive a plural relative clause from objects that are syntactically distinct, and that we do find some types of reconstruction effects; so I will assume that the split antecedents are indeed within both the relative clause and the matrix clause, and that a head-raising type of analysis, rather than a matching analysis, should come into effect somehow. Then we should be able to derive Phrase-markers that have the antecedents inside the relative clause, and Phrase-markers where the antecedents are outside the relative clause, which together can account for pronoun binding as well as Principle C alleviation (and linearization). Differences between extraposed and non-extraposed relative clauses will necessarily be at PF. 40

40 One could continue to work with a matching-type analysis, rather than a head-raising analysis. This would entail a search for an explanation of why quantifiers inside the relative clause can bind pronouns outside it (in the head). I do not pursue this here, because of the issues discussed above, but leave it as a competing analysis that I must continue to consider in future work.

5.3.3. Accounting for SARC

A solution to the problem of split-antecedent relatives lies in an observation by Wilder
In the usual representation of a relative clause as an adjunct to DP, SARC and CCDP have no syntactic relationship to one another – a collection of DPs and CPs would be combined from the numeration to form either representation, without regard to the structural and interpretational symmetries. 212 This section is organized as follows: First, I present an analysis of regular relative clauses. Then, I present how coordination functions here, followed by an account for SARC and CCDP. 5.3.3.1. Regular Relative Clauses One difficulty of having a sharing representation in narrow syntax is the issue of asymmetry. A tree containing one item linked to two positions, without additional notions, represents the multiple occurrences of the shared constituent as symmetrical (van Riemsdijk 2006, his 9a): (296) I ate what was euphemistically referred to as a steak [a steak] exists in the following contexts: (i) [I ate ⎯⎯ ] (ii) [something was euphemistically referred to as ⎯⎯ ] The two separate CPs described as sharing the constituent [a steak] are on a par. Then, the asymmetry in the construction between the matrix and the relative clause (in van Riemsdijk’s example, a free relative) remains unaccounted for. To account for this asymmetry, we return to the basic notion of M-graph and begin with a similar configuration for the higher (subject) phase, (297), given some subject relative clause: 213 T v D N (297) A man that knew Mary laughed. 41 (298) Integrating the proposal from chapter 3, I utilize the {D K , C K } relationship for the complementizer domain. This is also motivated by Chomsky 2008, where C is the “locus of agreement” for the higher phase, in the nominative domain, and that T only inherits the ability to agree when selected by C, otherwise resulting in a non-finite ECM structure (Chomsky 2008: 143). However, for the purposes of simplicity and exposition, I will notate the subject phrase when it is merged with {D K , C K } in a slightly different way. Recall: (299) (D K , T) (D K , V) (C K , T) (C K , V) (D K , D ext ) (D K , N ext ) (C K , D ext ) (C K , N ext ) 41 Here, I have used an unergative verb laugh for one of the CPs, and I have used a subject relative clause for the other CP. The structure I draw is simplified for purposes of explanation, but the clauses will appear with overlapping subject/object phases for a transitive sentence. 214 And the alternate notation, which collapsed the cube into two ‘dimensions,’ essentially collapses the cube along the front and back planes of the cube (red), which share the D K coordinate and C K coordinate, as well as collapsing the verbal selection domain and the nominal selection domain: (300) D K TP DP ext C K TP DP ext (301) (D K , T) (D K , V) (C K , T) (C K , V) (D K , D ext ) (D K , N ext ) (C K , D ext ) (C K , N ext ) (302) Selection domains (D K , T) (D K , V) (C K , T) (C K , V) (D K , D ext ) (D K , N ext ) (C K , D ext ) (C K , N ext ) Keeping this in mind, lets begin to look at the subject phases of two independent clauses, P and P´: 215 (303) P (D K , T) (D K , V) (C K , T) (C K , V) (D K , D ext ) (D K , N ext ) (C K , D ext ) (C K , N ext ) (304) P´ (D K ´, T´) (D K ´, V´) (C K ´, T´) (C K ´, V´) (D K ´, D´) (D K ´, N´) (C K ´, D´) (C K ´, N´) These are independent clauses, with no shared argument. Neither are relative clauses. 
Now, if I were to share the D-N argument, I would need to 'delete' the relationship of D-N with either {D K ´, C K ´} or {D K , C K }, in order to be able to overlap the two planes in (303)-(304) which represent the D-N, as well as delete D´-N´ (since D-N will be shared). Below I 'delete' the relationship of D-N with {D K , C K }, generating the object Q:

(305) Q: (D K , T), (D K , V), (C K , T), (C K , V), (D K ´, D), (D K ´, N), (C K ´, D), (C K ´, N), (D K ´, T´), (D K ´, V´), (C K ´, T´), (C K ´, V´)

This, I argue, will represent a relative clause, with P´ as the relative clause and P as the matrix clause. Following the standard raising analysis of relative clauses (cf. Vergnaud 1974, Kayne 1994), D K is in a relationship with the C K ´ of the relative clause:

(306) Q, as in (305), with the relationship between D K and C K ´ marked.

This relationship is a checking relationship between C K ´ and the nominal domain. C K ´ is the head/label of the pair {D K , C K ´}. From here, I will simplify the relationship between the complementizer domain and the phase, pulling instead from the analysis in McKinney-Bock & Vergnaud (2010), developed further in McKinney-Bock (2011). This is for simplicity of exposition, as well as because I have not yet been able to unify these two types of structures; it remains to be seen how the implementation below applies to the additional {D K , C K } dimensions that are a key part of the clause. The simplification is that I treat D and D K as the same feature, rather than splitting the duties of referentiality (D) and Case/Kase (D K ). As above, the relative clause is built like the matrix clause, and the {D, N} pair is shared in the argument positions of both verbs. D is also shared between the C K and C K ´ domains:

(307) [Diagram: the shared structure over the vertices C, C´, T, V, D, N, T´, V´.]

Again following the standard raising analysis of relative clauses (Vergnaud 1974, Kayne 1994), D K is in a relationship with the C K ´ of the relative clause. This relationship is a checking relationship between C K ´ and the nominal domain. C K ´ is the head/label of the pair {D K , C K ´}:

(308) [Diagram: as (307), with the {D, C´} checking relationship marked.]

We return to the condition on classical Phrase-markers:

Condition on Phrase-markers
Given two applications of Merge to two distinct pairs of formatives {f i , f j } and {f i , f k } sharing the element f i , f i must be the head/label in at least one of the relations Merge(f i , f j ) and Merge(f i , f k ).

Under this condition, Phrase-markers containing the following are ruled out:

(309) {X, Y} and {Y, Z}, with Y the head of neither: *[X X Y], *[Z Y Z]

We see that the condition rules out the pairs {D, C} and {D, T´} from being in the same Phrase-marker, which rules out a Phrase-marker containing multiple occurrences of D-N (see immediately below):

(310) [Diagram: the graph of (307)-(308), with the offending pairs marked.]

With the condition in place, some relevant maximal Phrase-markers are (trees rendered here as labeled bracketings):

(311) [C C [T [D D N] [T T V]]]
(312) [C´ [D D N] [C´ C´ [T´ T´ V´]]]
(313) [C´ C´ [T´ [D D N] [T´ T´ V´]]]

Importantly, we rule out Phrase-markers that contain multiple occurrences of the D-N pair, which create problems for both linearization (Wilder 2008) and interpretation:

(314) *[C´ [D D N] [C´ C´ [T´ [D D N] [T´ T´ V´]]]]

Then, PF and LF interpret a restricted family of Phrase-markers, for example the set (311)-(313) above.
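The Condition on Phrase-markers is stated abstractly above, but it is simple enough to run as a check. This sketch is my own executable paraphrase, with arbitrary labels: each Merge is recorded as a pair plus a head choice, and a shared formative must head at least one of any two Merges it enters into, which is exactly what excludes the doubled D-N in (314).

```python
# A minimal sketch of the Condition on Phrase-markers: for two Merges
# {f_i, f_j} and {f_i, f_k} sharing f_i, f_i must be the head/label of
# at least one of them. Each merge is encoded as ((x, y), head).

from itertools import combinations

def satisfies_condition(merges):
    for (pair1, head1), (pair2, head2) in combinations(merges, 2):
        for f in set(pair1) & set(pair2):      # shared formatives
            if f != head1 and f != head2:
                return False                   # non-head in both: ruled out
    return True

# The schematic violation in (309): Y is shared but heads neither Merge.
bad = [(("X", "Y"), "X"), (("Y", "Z"), "Z")]
ok  = [(("X", "Y"), "Y"), (("Y", "Z"), "Z")]
print(satisfies_condition(bad), satisfies_condition(ok))  # False True
```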
Leaving the details of linearization across a family of Phrase-markers for now, I postulate the following: Spell-Out to PF of a family of Phrase-markers involves looking at ordering information across multiple trees, which is dependent upon the same object occurring in multiple Phrase-markers.

5.3.3.1.1. Obtaining the observed symmetry in the data: 'reversing' relativized and matrix clauses

The graph maintains identical argument relations for the head noun a man in the above sentences. What reverses is which CP is relativized, and which is matrix. A simple reassignment of the {D, C} checking relationship from one clause to the other obtains this result.

CP = (A man) laughed heartily
C´P = (A man) knew Mary

(315) a man that knew Mary laughed heartily [Diagram: the shared graph over C, C´, T, V, D, N, T´, V´, with the checking relationship assigned to C´.]

(316) a man that laughed heartily knew Mary [Diagram: as (315), with the checking relationship reassigned to C.]

5.3.3.2. Relative Pronouns

I have sidestepped a bit the issue of the relative pronoun in relative clauses. In the head-raising analysis, which I have assumed due to the reconstruction facts in SARC, the standard analysis postulates an external D, with the wh-word (as in which man or that man) acting as the internal D. The internal D then raises to Spec,CP, and the NP moves further to a higher Spec position, to get the correct linear order (as in Kayne 1994): the external [NP man] that [that man] internal arrived (subscripts lost in this transcript). Here, there is only one occurrence of D, which acts as both the external and the internal D. The grammatical relationships with T and C determine whether D-N is promoted to be pronounced in Spec,CP or in Spec,TP within the relative clause. So, the relative pronoun will need to arise from this type of syntactic configuration – and I don't yet know how lexical insertion will proceed here. We need to be able to insert both the (an external D, or a quantifier) and that/which (an internal-type D/C) into the Phrase-marker. This is closely related to the problem of wh-movement in the system, which requires a more explicit analysis. I suspect the re-splitting of D into D and D K is one step in the resolution of this issue – presumably independently necessary, as D K is needed for phase overlap and Case.

5.3.3.3. The "family of coordination"

To obtain coordinated CPs/DPs in the two types of sentences, a notion of coordination must be introduced into the system. This section will show that the SARC structure is not unique; there is no reason to derive SARC in a "special" way, distinct from simple headed relatives. In fact, this section shows that a family of structures arises from a general notion of coordinating sets of grammatical formatives, within the ICA. The family of coordinated structures is derived using a representation of and as a binary grammatical connective {and, and´}.

A large question surrounding coordination in the literature is whether and is a phasal connective, or whether sub-parts of phases can be coordinated (Williams 1978, Wilder 1994, Johnson 2000). This is the 'large coordination' versus 'small coordination' hypothesis. Large-coordination hypotheses claim that only CPs (and possibly DPs) can be coordinated, as in Wilder 1994. In these types of analyses, deletion of identical copies occurs across constituents in order to derive coordination of like constituents that are not CPs (e.g., coordinated TPs: John has arrived and will leave, just one of many examples). The small-coordination hypothesis allows a general Like-and-Like constraint in order to coordinate all types of phrases, such as VPs, vPs, TPs, DPs, NPs, etc.
Because the level of abstract syntax present in our system is restricted to local relationships, it is a bit difficult to say whether our analysis is a 'large conjunct' or a 'small conjunct' analysis. Because {and, and´} comes in a binary pair, we must coordinate across the nominal and verbal domains – so subject coordination like John and Mary is not possible, in this take on coordination. On the other hand, we find that vPs can technically be coordinated, as {and, and´} can Merge with an object phase. I will remain undecided on the issue, as I use {and, and´} only at the CP level, to coordinate relative and matrix clauses when I account for the SARC alternation.

5.3.3.3.1. Coordination of vP Phases

I will begin with a simpler example than the case of SARC: that of vP (object) phases, made up of the set {D int , N int , v, V}. To get a coordination of vP phases, one takes the Cartesian product of the phase with the pair of features {and, and´}:

(317) {D, N, v, V} × {and, and´} = {(D, and), (N, and), (v, and), (V, and), (D, and´), (N, and´), (v, and´), (V, and´)}

This can be represented as follows:

(318) [Diagram: the eight vertices of (317), the two vP 'planes' linked along the {and, and´} dimension.]

This coordination represents a coordination of vPs, for example:

(319) A man arrived and a woman left.
(320) win a prize and lose a million dollars 42

42 In a full sentence, this would be: Bob didn't [win a prize] and [lose a million dollars].

Now, note that this is not the only possible empirical coordination. Following Goodall 1987, one can also have coordinations as follows:

(321) A man and a woman arrived and left (respectively).
(322) win and lose a prize and a million dollars (respectively)

The respectively reading is a natural interpretation of the coordinated sentences I noted: only the man can arrive, and only the woman can leave, in the [vP] and [vP] coordination. The respectively reading is one of two optional readings in the second set of coordinations, namely the [DP and DP] [vP and vP] coordinations. The other reading is collective, where the man and the woman arrived and left together. The collective reading is more difficult in the win/lose example, because winning and losing constitute a pair of verbs that exhausts a set of actions. It is easier to get this reading without exhaustivity:

(323) bake and eat (both) cookies and cake

At any rate, we see that we can get the 'respectively' reading for both types of coordination. Setting aside the semantic contribution of respectively for now, and looking only to obtain the syntactic distribution, I derive these two coordinations from the same abstract syntactic structure via the Condition on Spell-out, repeated here:

(324) Condition on Spell-out (informal, to be revised): 43 Spell-out goes by pairs of parallel pairs of features (edges or planes) that exhaust clause structure.

43 In the future, a useful way of defining domains or cycles in syntax may be by feature-sharing. A domain could be defined as a set of nodes/vertices on a graph that share a feature, which would begin to motivate the informal Condition on Spell-out that I am using in this dissertation. In this case, the structure of subject/object phases may have to be modified (as they do not currently share a single feature).

Under this condition, the three parallel planes of features for the coordinated vPs are as follows:
(325)
{N, V} pair:
N category feature: [a man and a woman]
V category feature: [arrived and left]
{and, and´} pair:
and feature: [a man arrived]
and´ feature: [a woman left]
{ϕ, R} pair:
Φ-features (for D, v)
R (lexical/θ) features (for N, V)

Notice that the three groupings each exhaust all vertices. These groupings are the three pairs of parallel planes, following the Condition on Spell-out, and they give rise to three Spell-out possibilities. I now discuss each possibility and the empirical result obtained.

5.3.3.3.2. Option #1: Spell-out N/V features

If we Spell-out first the phases containing the N-feature and the V-feature, we send the following planes to LF/PF:

(326) N category feature: [a man and a woman]; V category feature: [arrived and left]

The Phrase-markers that we obtain for PF are:

(327) [Diagram: a Phrase-marker in which the two conjuncts' D-N material meets at the top layer.] OR
(328) [Diagram: a Phrase-marker in which the Ns and the Ds are adjacent.]

Notice that there is no pronounced "and" in these Phrase-markers; I have not yet specified a rule for "and." Where "and" is pronounced is the place where and and and´ come 'in contact' with one another; that is, in the set of adjacent nodes where the features match except for the and feature. In the first Phrase-marker above, this is at the top of the tree, in the first layer, which would generate the man and the woman. In the other possible Phrase-marker, it would occur where the Ns and the Ds are adjacent. This is not quite the result we want, as we would generate a and a man and woman. However, it is possible that we would like to generate a man and woman. I hope to constrain this Phrase-marker to generate that, but at the moment it is a problem for the system, and one that I hope to address together with an analysis of quantifiers more generally in this system.

Returning to the pronunciation of and, we see from the PF Phrase-markers that we pronounce and whenever the and and and´ features come into local/direct contact. This occurs when the edge, or 'dimension', along which the and/and´ feature varies is collapsed, or traversed. So we will want to take our PF Phrase-marker and insert and whenever this edge is traversed. For abstract syntax, here is an informal rule for the pronunciation of and:

(329) Rule for pronunciation of and: Pronounce "and" when any {and, and´} pair is Spelled-out.

With this rule, we generate:

(330) N-phase: [The man and the woman]; V-phase: [danced and sang]

Once we have created these Phrase-markers, the parallel phases are Spelled-out again, together (details to be worked out formally):

(331) The man and the woman danced and sang.

5.3.3.3.3. Option #2: Spell-out and/and´ features

The second option is to Spell-out first along the planes with the and feature and the and´ feature:

(332) and: [a man arrived]; and´: [a woman left]

Here, we end up with two identical Phrase-markers, one for and and one for and´:

(333) [Diagram: the two parallel clause-shaped Phrase-markers; not preserved in this transcript.]

Empirically, we get:

(334) and feature: [a man arrived]; and´ feature: [a woman left]

At this point, the and/and´ dimension has not been traversed. After these planes are Spelled-out, we Spell-out the two planes again, together. At this point the and dimension is pronounced, and we obtain:

(335) A man arrived and a woman left.

5.3.3.3.4. Option #3: Spell-out R-phase, ϕ-phase

The final option spells out first the phases defined by the lexical (R) and functional (Φ-) domains:

(336) Φ-features (for D, v); R (lexical/θ) features (for N, V)

This, in short, obtains the empirical result:

(337) [D T and D T] [N V and N V]

In English it is not clear that this has an empirical counterpart.
5.3.3.3.4. Option #3: Spell-out R-phase, ϕ-phase

The final option spells out first the phases defined by the lexical (R) and functional (Φ-) domains:

(336) Φ-features (for D, v); R (lexical/θ) features (for N, V)

In short, this yields the empirical result:

(337) [D T and D T] [N V and N V]

It is not clear that this has an empirical counterpart in English. A non-coordinated example of it would be:

(338) [[D T] [N V]]

This would be a structure in which the Φ-features shared by D and T are merged together, and N and V are incorporated. I discuss this possibility further in Section 4, for Romance languages (cf. Alexiadou & Anagnostopoulou 1998), with a possible generalization of this type of structure to the aspectual hierarchy in pseudo-incorporation as well as in full noun-incorporation structures (cf. Dayal 2011). It is possible that the [D T and D T] phase runs into the same problem as the [D and D] we obtained under Option #1, with the Phrase-marker that generated [D and D][N and N]: only one [D T] is ever spelled out.

To sum up, we begin to obtain a family of coordinated structures with different hierarchical properties from the same abstract syntax. Setting aside the semantics of collectivity — we obtain respectively readings with both structures — we see that a family of coordinated structures can be generated from a single underlying syntax of and, with the same set of contexts and two different sets of lexical items (those bearing the and feature and those bearing the and´ feature).

5.3.3.4. Obtaining the SARC/CCDP alternation with coordination

To account for the SARC/CCDP alternation, let us start by looking at the three CPs that belong to the two structures:

(339) Mary met a man; John met a woman; A man and a woman know each other

In both SARC and CCDP, the two coordinated objects are Mary met a man and John met a woman. In SARC they are matrix clauses; in CCDP they are relative clauses. Just as with ordinary relative clauses, we obtain this reversal by reversing the relationship between D and C/C´. Now, however, we have coordination involved as well as a collective/plural sentence. The two coordinated objects appear as follows:

(340) Mary met a man and John met a woman

And the collective sentence is linked with both of these objects:

(341) [graph not reproduced in this transcript]

What is underspecified in the representation in (341) is the relative/matrix-clause asymmetry: the D and C/C´ relationship. For SARC, the structure has a D, C´ relationship, so as to relativize the collective sentence: (fn. 44: This picture may be a little confusing; to clarify, both (D, and´) and (D, and) have a relationship with (C, and). The lines are difficult to see.)

(342) Mary met a man and John met a woman who know each other well

And for CCDP, we reverse the relationship so that D and the two C's are relativized:

(343) A man that Mary met and a woman that John met know each other well.

As with ordinary relative clauses, the two structures are obtained from a simple reversal of the relationship between D and C. In this case, the coordinated sentences have two Cs, so we are dealing with the additional complexity of two relationships between Ds and Cs; otherwise, it is the same as the ordinary relative clause example.

5.3.3.4.1. More on the collective

One could ask whether the plural relative clause is itself a more complex set of coordinates, as in the full coordinated structure:

(344) [graph not reproduced in this transcript]

In that case, the PF a man and a woman know each other well might be a pronunciation of something akin to A man knows her well, and a woman knows him well. Negation acts as a test for this:

(345) Mary met a man and John met a woman who didn't know each other well.

This sentence can be read as true if the woman does not know the man well, or if the man does not know the woman well, but neither condition is necessary: as long as one of the two conditions holds, the sentence is true. So the plural seems to arise from a single set of nodes for the sentence a man and a woman know each other well.
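The reversal behind (342)-(343) is small enough to state as a toggle. The toy function below is a stand-in for Spell-out, not a derivation of it: it merely registers that SARC and CCDP share one abstract structure and differ only in whether D relates to C or to C´.

COLLECTIVE = "know each other well"

def realize(d_linked_to):
    # D-C' relationship: the met-clauses surface as matrix clauses and
    # the collective clause is relativized -> SARC, cf. (342).
    if d_linked_to == "C'":
        return f"Mary met a man and John met a woman who {COLLECTIVE}"
    # D-C relationship: each met-clause is relativized under its
    # determiner and the collective clause is the matrix -> CCDP, (343).
    if d_linked_to == "C":
        return f"a man that Mary met and a woman that John met {COLLECTIVE}"
    raise ValueError("D must relate to C or C'")

print(realize("C'"))  # SARC
print(realize("C"))   # CCDP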
5.3.3.4.2. Deriving the family of coordination from the structure proposed for SARC

In fact, several other coordinated structures fall out of the structure proposed for SARC and CCDP.

(346) Relative Clause Coordination (CCDP): [graph not reproduced in this transcript]

For the coordinated relative clause, we get:

(347)
#1: [[[D N and D N] [T´ V´ and T´ V´]] T V] = [[[The man and the woman] [who Mary met and who John met]] danced]
#2 (CCDP): [[D N T´ V´ and D N T´ V´] T V] = [[The man who Mary met and the woman who John met] danced]
#3: [[[D T´ and D T´] [N V´ and N V´]] T V] — see the 'Open Issues' section for empirical possibilities.

(348) Matrix Clause Coordination (SARC): [graph not reproduced in this transcript]

(349)
#1: [[[D N and D N] T´ V´] T V and T V] = [[[The man and the woman] who Mary met] danced and sang]
#2 (SARC): [[[D N T V] and [D N T V]] T´ V´] = [[[The man danced] and [the woman sang]] who Mary met]
#3: [[[D T´ and D T´] [N V´ and N V´]] T V] — see the 'Open Issues' section for empirical possibilities.

Finally, we get a coordination of the entire structure:

(350) [graph not reproduced in this transcript]

Here, we have the same options as before. We can Spell-out along the and/and´ feature dimensions, which gives us two entirely separate relative-clause/matrix sentences:

(351)
#1: [[[D N T´ V´] T V] and [[D N T´ V´] T V]] = [[[The man who Mary met] danced] and [[the woman who John met] sang]].
Or, we can break the structure down by parallel (exhaustive) planes and get:
#2: [[[D N and D N] [T´ V´ and T´ V´]] T V and T V] = [[[The man and the woman] [who Mary met and who John met]] danced and sang]
or
#3: [[[D T´ and D T´] [N V´ and N V´]] T V and T V] — see the 'Open Issues' section for empirical possibilities.

This covers several possible coordinations with relative clauses. Empirically, we also find a mixed result across the phases:

(352) [[D N T´ V´ and D N T´ V´] T V and T V] = [[The man who Mary met and the woman who John met] danced and sang]

and

(353) [[D N T V and D N T V] T´ V´ and T´ V´] = [[The man danced and the woman sang] who Mary met and who John met]
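The first two outputs in (347) can be generated with the same plane-based Spell-out used for the vP sketch above. As before, the chunked vocabulary (a whole relative clause per vertex), the vertex order, and the hand-inserted inter-conjunct and of option #2 (standing in for the second, joint Spell-out) are simplifications, not claims about the formal system.

# Each vertex bundles a {D, N} or {T', V'} pair with a connective.
WORDS = {
    ("DN", "and"): "the man",        ("DN", "and'"): "the woman",
    ("TV'", "and"): "who Mary met",  ("TV'", "and'"): "who John met",
}
SHARED_MATRIX = "danced"  # the single (T, V) pair shared by both conjuncts

def plane(vertices):
    """Spell out one plane; 'and' appears where the {and, and'}
    dimension is traversed, per rule (329)."""
    parts = [WORDS[v] for v in vertices]
    if len({v[1] for v in vertices}) == 2:
        parts.insert(len(parts) // 2, "and")
    return " ".join(parts)

# (347) #1: group by category -> [DP and DP] [RC and RC].
dp = plane([("DN", "and"), ("DN", "and'")])
rc = plane([("TV'", "and"), ("TV'", "and'")])
print(f"[[{dp}] [{rc}]] {SHARED_MATRIX}")
# [[the man and the woman] [who Mary met and who John met]] danced

# (347) #2, the CCDP: group by connective -> [DP+RC and DP+RC]; the
# inter-conjunct 'and' stands in for the later joint Spell-out.
left = plane([("DN", "and"), ("TV'", "and")])
right = plane([("DN", "and'"), ("TV'", "and'")])
print(f"[{left} and {right}] {SHARED_MATRIX}")
# [the man who Mary met and the woman who John met] danced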
5.4. Open Issues

In setting out this system, which has several benefits over the current theory, I have necessarily left several issues open. Here I briefly discuss possible directions for some of them.

5.4.1. A more specific theory of PF P-markers and Extraposition

Empirically, the issue of extraposition is immediate: I am to some extent forced into the position that extraposition is a PF phenomenon. However, Fox & Nissenbaum (1999) document scope differences between extraposed and non-extraposed structures, looking at free-choice any and the scope of for in look for. There are LF effects for extraposed structures in addition to PF effects, so narrow syntax should show a difference in scope between extraposed and non-extraposed structures. Because it is not clear whether SARC are extraposed, the immediate relevance of this to the analysis above remains unclear. A fuller consideration of the empirical details in Culicover & Rochemont (1990) and Baltin (1981, a.o.) is necessary, however, as is a theory of LF under this system.

5.4.2. Object relative clauses and a generalization to wh-movement (which should subsume object relative clauses)

Additionally, I have not shown how cyclic wh-movement works in this system. I have put forth an analysis of clause structure in this chapter, focusing on subject relative clauses and unergatives, as a detailed analysis of how cyclic wh-movement works here is not yet complete. Deriving object relative clauses is the first step. Following Vergnaud (2009, forthcoming), I suspect that wh-movement will work in a way similar to coordination, in the sense that a binary pair of wh-features will reduplicate the phase containing the wh-word. This is what Vergnaud (forthcoming) proposes, with the details still to be worked out for long-distance successive-cyclic movement. This requires another look at Object Shift (cf. Fox & Pesetsky 2004). Once successive-cyclic wh-movement has been formalized, an extension to islands is necessary. In my prospectus, I examine data from Lakoff and Postal on the CSC and apparent violations of it, which may be accounted for by the system here.

5.4.3. Parallelism constraint

Finally, with the research proposed above, I should be able to return to SARC to derive the parallelism constraint that Baltin (2005) describes, as well as ATB movement.

5.5. Conclusion

In this chapter, I revisited primitives of current Minimalist theory that are, in effect, stipulations. The system presented here once again 'minimizes' narrow syntax by restricting grammatical relationships to entirely local Merge relationships, and it derives parallel hierarchies for the nominal and verbal domains under the phrase structure presented in chapter 2, based on foundations from Vergnaud's ICA (forthcoming). This simplifies the theory by dispensing with conditions in narrow syntax that are really relevant only at the interfaces — in particular, linearization constraints and c-command relationships, which arise in narrow syntax only because current theory uses the same syntactic object in narrow syntax and at the interfaces. In the opposite direction, Multidominance structures were removed from PF/LF and made permissible only in narrow syntax, and sharing was generalized to any grammatical formative. This escapes the well-known linearization problems of Multidominance structures.

I then laid out an analysis of the SARC/CCDP alternation within this framework. Current analyses of relative clauses and extraposition cannot provide a complete account of SARC without stipulation, and they are not empirically adequate; SARC are, however, a natural, predicted consequence of the framework of coordination put forth here. In this chapter I have given a more explicit analysis of coordination and have begun to look at clause structure, extending the simplifying assumptions of past work to more levels of the clause. Future work on this system includes more explicit theories of the PF and LF relationships and an explicit account of relative pronouns and wh-movement — the relationship between D and D_K remains open. More empirical work on SARC constructions, and on why they violate the Right Roof Constraint, should also be pursued. Finally, a complete hierarchy of the clause under the ICA needs to be further elaborated.

Chapter 6. Conclusion

6.1. Summary of Contributions

This dissertation takes narrow syntax to operate over domains (phases) more local than in current Minimalism, defining a notion of phase overlap which involves the sharing of grammatical features across two independent phases.
Phase overlap applies to phases involved in the construction of phrase structure, as well as in the embedding of finite and non-finite complement clauses and of relative clauses. To overlap phases, I take the idea that generalized binary connectives build phrase structure (Vergnaud forthcoming) and extend it in such a way that it involves both a nominal and a verbal complementizer, {D_K, C_K}, rather than treating the verbal domain as 'privileged'. In this dissertation, both the verbal and the nominal domains are implicated at the edges of phases, creating phase overlap and a novel notion of cyclicity: to construct two (consecutive) cycles is to share a pair of features across (both) the nominal and verbal domains. Case plays a key role in phase overlap.

6.2. A Comparison of Frameworks

To conclude this dissertation, it is worthwhile to compare the architecture developed here with both the Minimalist Program (MP) and Tree-Adjoining Grammar (TAG), as the three form a continuum with respect to the locality of operations: MP allows long-distance operations, the present architecture minimizes them, and TAG falls between the two.

TAG is a formal architecture that falls somewhere between the (local) architecture developed in this dissertation and the (Agreement-based) Minimalist Program. Elementary trees are constructed and then 'adjoined' to, or inserted into, other elementary trees to build clause structure. Frank (2006) provides a discussion of agreement across embedded finite and infinitival clauses in Hindi that illustrates the general TAG mechanism for embedding. The mechanism that Frank (2006) proposes limits agreement to elementary trees, but imposes no locality restrictions on agreement within an elementary tree: features — Φ-features, for example — percolate freely to the root node of an elementary tree, so any DP within the elementary tree can have its features percolate to the root node. Once the features have percolated to the root node and the elementary tree is adjoined to another tree, agreement can occur between the adjoined ('embedded') clause and the matrix clause.

Frank (2006) points out that, under the phase-based Minimalist approach, constraints on movement and agreement should be the same, whereas under the TAG approach movement and agreement are different mechanisms and should be subject to different constraints. He proposes an account of cross-clausal agreement on which the matrix verb does not enter into an agreement relationship with the embedded DP; rather, agreement is 'indirect' and occurs across elementary trees. There is one domain of agreement in the embedded elementary tree, where the DP agrees with the root node of that tree; then the root node (which is now part of the matrix elementary tree) triggers agreement on the verb in the matrix clause. This indirect agreement is very reminiscent of Landau's (2004) proposal, by which the root node C of the embedded clause mediates agreement between the matrix tense and the embedded tense. One might assume that Frank's (2006) analysis is amenable to this, in terms of feature percolation (or its absence) to C. In fact, he proposes an analysis of topic agreement in Tsez that hints at how his system could deal with EC vs. PC agreement. In Tsez, topic agreement is possible only when the embedded clause is a TopicP (so that the Topic head acts as the root of the clause); where cross-clausal agreement is not possible, the embedded clause is a CP, whose head Frank assumes cannot bear Φ-features, so that no features percolate into the matrix clause.
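The contrast just described can be caricatured in a few lines of Python. The sketch below is my own gloss on Frank's (2006) mechanism, not his formalism: percolation within an elementary tree is free, but only a root that can bear Φ-features passes them on for matrix agreement.

class ElementaryTree:
    """A toy TAG elementary tree: only its root and one feature slot
    are modeled."""
    def __init__(self, root_label, root_bears_phi):
        self.root_label = root_label
        self.root_bears_phi = root_bears_phi  # TopicP vs. plain CP
        self.root_phi = None

    def percolate(self, dp_phi):
        # Percolation inside the elementary tree is unbounded, but it
        # stops at a root that cannot bear the feature.
        if self.root_bears_phi:
            self.root_phi = dp_phi

def matrix_agreement(adjoined):
    # The matrix verb agrees with the adjoined root, never with the
    # embedded DP directly: 'indirect' agreement.
    return adjoined.root_phi or "default"

topic_clause = ElementaryTree("TopicP", root_bears_phi=True)
topic_clause.percolate("3pl")
print(matrix_agreement(topic_clause))  # 3pl: cross-clausal agreement

plain_cp = ElementaryTree("CP", root_bears_phi=False)
plain_cp.percolate("3pl")
print(matrix_agreement(plain_cp))      # default: agreement blocked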
Instead of C-selection, Frank's proposal depends on the 'structural size' of the embedded complement, much like the analysis I propose for control and raising clauses. So the TAG approach is reminiscent of the present one in appealing to structural size, rather than to different definitions/selections of what constitutes the C-edge of a phase (as in Minimalism). However, the notion of pivot under the ICA and agreement under TAG differ. Here, feature 'percolation' (as a non-theoretical notion) is taken to arise from sharing — from one occurrence of the shared heads. In other words, agreement between the matrix verb and the embedded DP is direct, which is precisely what Frank says is not possible. Frank (2006) instead takes long-distance agreement to apply freely throughout the elementary tree/domain, with a feature percolating higher only if it can permissibly be borne by the root node. This requires assumptions about which verbal heads allow which features, whereas the ICA makes assumptions about pivots and sharing. A discussion of syntactic islands will be crucial to this proposal.

In MP, embedding occurs by extending the verbal projection indefinitely, allowing some C-head (or Fin-head, or other verbal head) to be the complement of a lexical V-head. Different selectional requirements derive raising vs. control predicates, and agreement between clauses is mediated by a typology of the edge of the C-phase, allowing different features to agree or not. In TAG, embedding occurs by taking elementary trees and 'sharing' the root node, with its features; agreement is unbounded within elementary trees, but the root node must be able to bear a feature for agreement to percolate from an embedded clause to a higher clause. (I am at present unsure how abstract Case should be interpreted in TAG.) Under the proposal in this dissertation, by contrast, embedding is not a fundamentally different mechanism from building transitive clauses: phases are allowed to be degenerate at (almost) any level and to be shared across clauses. The conjecture is that embedding involves sharing some pivot between clauses, the pivot being (tentatively) the highest merged object of the phase. There are no selectional requirements driving the feature percolation and long-distance agreement present in both MP and TAG; rather, agreement is the result of sharing the same feature/grammatical item across two domains. Ideally this is unrestricted, and a typology of embedding can be developed for raising and control predicates. This is in the spirit of Williams (2003), and it predicts that an embedded clause is always smaller than its matrix clause — unlike in TAG and MP, where domains are relatively independent of one another. A discussion of islands in light of Williams (2003) is crucial here; it is possible that islands are 'larger' than their matrix clauses, so that extraction cannot occur.
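As a loose schematic of this conjecture — with vertex counting as an invented placeholder for whatever the real measure of 'size' turns out to be — embedding-as-sharing and the island speculation can be put as follows.

def embed(matrix, embedded, pivot):
    """Overlap two phases by sharing a pivot: the union contains one
    occurrence of the pivot, so cross-clausal 'agreement' is just that
    single shared item, with no percolation step."""
    assert pivot in matrix and pivot in embedded
    return matrix | embedded

def extraction_possible(matrix, embedded):
    # Williams (2003)-style size condition, as conjectured above: an
    # embedded clause must be no 'larger' than its matrix clause.
    return len(embedded) <= len(matrix)

matrix = {"C", "T", "v", "V", "D_subj", "N_subj"}
raising = {"T", "V_emb", "D_obj", "N_obj"}               # shares pivot T
island = {"C_i", "Top_i", "T_i", "v_i", "V_i", "D_i", "N_i"}

print(embed(matrix, raising, "T"))           # one graph, one 'T'
print(extraction_possible(matrix, raising))  # True
print(extraction_possible(matrix, island))   # False: the island case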
6.3. Future Directions

Possible future directions for this project are vast. The most immediate work is to further explore the binary grammatical connectives that take a phase as their complement ('supercategories'), in particular those of quantification and wh-movement. A model of quantification is critical, in particular for the interpretive differences across relative clauses seen in chapter 5. This dissertation also has strong implications for wh-movement and islands, and so it is necessary to derive the island constraints within this framework next. Finally, further work on the relationship between LF and narrow syntax is a next step, as is refining Spell-out to PF.

While many open questions and issues remain, this novel approach to syntax finds immediate justification in that, at its root, it follows the minimalist idea that narrow syntax should not require conditions relevant only at the interfaces. The approach dispenses with the notion of tree and relaxes syntactic representations to allow simple graphs. In doing so, narrow syntax becomes a more abstract representation, from which constituent structure (represented as a tree) is derived and used at the interfaces. This two-tiered system provides a novel resolution to some of the problems, both theoretical and empirical, in current generative grammar.

References

Abney, S. 1986. A Grammar of Projections. Ms.
Abney, S. 1987. The English noun phrase in its sentential aspect. Doctoral dissertation, MIT.
Adger, D. 2003. Core Syntax: A Minimalist Approach. Oxford: Oxford University Press.
Alexiadou, A. & E. Anagnostopoulou. 1998. Parameterizing AGR: Word order, V-movement and EPP-checking. Natural Language and Linguistic Theory 16. 491-539.
Andrews, A. 1976. The VP-complement analysis in Modern Icelandic. In Proceedings of NELS 6, 1-21. Amherst, MA: GLSA.
Baker, M. 1985. The Mirror Principle and morphosyntactic explanation. Linguistic Inquiry 16. 373-415.
Baker, M. 1988. Incorporation: A Theory of Grammatical Function Changing. Chicago: University of Chicago Press.
Baker, M. C. 1996. The Polysynthesis Parameter. Oxford: Oxford University Press.
Balakrishnan, R. & K. Ranganathan. 2000. A Textbook of Graph Theory. New York: Springer.
Baltin, M. 1981. Strict bounding. In The Logical Problem of Language Acquisition, eds. C. L. Baker & J. McCarthy, 257-295. Cambridge, MA: MIT Press.
Baltin, M. 2005. Extraposition. In The Blackwell Companion to Syntax, eds. M. Everaert & H. van Riemsdijk. Oxford: Blackwell. Blackwell Reference Online, accessed 1 September 2011.
Bhatt, R. & R. Simik. 2009. Variable binding and the Person-Case Constraint. Paper presented at the 25th Annual Meeting of the Israel Association for Theoretical Linguistics (IATL 25), Ben-Gurion University of the Negev.
Bonet, E. 1994. The Person-Case Constraint: A morphological approach. In The Morphology-Syntax Connection, eds. H. Harley & C. Phillips, 33-52.
Bonet, E. 2008. The Person-Case Constraint and repair strategies. In Person Restrictions, eds. R. d'Alessandro, S. Fischer & G. H. Hrafnbjargarson. Berlin: Mouton de Gruyter.
Borer, H. 2005. In Name Only. Structuring Sense, Volume I. Oxford: Oxford University Press.
Bowers, J. 1993. The syntax of predication. Linguistic Inquiry 24.4. 591-656.
Bowers, J. 2001. Syntactic relations. Ms., Cornell University.
Brody, M. 1997. Mirror Theory. Ms., University College London.
Browman, C. & L. Goldstein. 1986. Towards an articulatory phonology. Phonology Yearbook 3. 219-252.
Browman, C. & L. Goldstein. 1992. Articulatory phonology. Phonetica 49. 155-180.
Browman, C. & L. Goldstein. 2000. Competing constraints on intergestural coordination and self-organization of phonological structures. Les Cahiers de l'ICP, Bulletin de la Communication Parlée 5. 25-34.
Carrier, J. & J. Randall. 1992. The argument structure and syntactic structure of resultatives. Linguistic Inquiry 23. 173-234.
Chomsky, N. 1951. Morphophonemics of Modern Hebrew. [Published 1979, New York: Garland.]
Chomsky, N. 1955 [1975]. The Logical Structure of Linguistic Theory. Ms., Harvard University, Cambridge, MA. [Published 1975, New York: Plenum.]
Chomsky, N. 1956. Three models for the description of language. IRE Transactions on Information Theory 2. 113-124.
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, N. 1959. On certain formal properties of grammars. Information and Control 2. 137-167.
Chomsky, N. 1962. Context-free grammar and pushdown storage. Quarterly Progress Report 65, 187-194. Research Laboratory of Electronics, Cambridge, MA.
Chomsky, N. 1963. Formal properties of grammars. In Handbook of Mathematical Psychology, Vol. II, eds. R. D. Luce, R. R. Bush & E. Galanter, 323-418. New York: John Wiley and Sons.
Chomsky, N. 1964. Current issues in linguistic theory. In The Structure of Language, eds. J. Fodor & J. Katz, 50-118. Englewood Cliffs, NJ: Prentice-Hall.
Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1970. Remarks on nominalization. In Readings in English Transformational Grammar, eds. R. Jacobs & P. Rosenbaum. Waltham, MA: Ginn & Co.
Chomsky, N. 1976. Conditions on rules of grammar. Linguistic Analysis 2. 303-351.
Chomsky, N. 1977. On wh-movement. In Formal Syntax, eds. P. Culicover et al., 71-132. New York: Academic Press.
Chomsky, N. 1980. On binding. Linguistic Inquiry 11. 1-46.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. 1986a. Knowledge of Language. New York: Praeger.
Chomsky, N. 1986b. Barriers. Cambridge, MA: MIT Press.
Chomsky, N. 1991. Some notes on the economy of derivation and representation. In Principles and Parameters in Comparative Grammar, ed. R. Freidin, 417-454. Cambridge, MA: MIT Press. [An early version appeared in MIT Working Papers in Linguistics 10, 1989; reprinted in Chomsky 1995b.]
Chomsky, N. 1993. A minimalist program for linguistic theory. In The View from Building 20, eds. K. Hale & S. J. Keyser, 1-52. Cambridge, MA: MIT Press. [Reprinted in Chomsky 1995b.]
Chomsky, N. 1995a. Language and nature. Mind 104.
Chomsky, N. 1995b. The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. 1998. Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, eds. R. Martin, D. Michaels & J. Uriagereka. Cambridge, MA: MIT Press, 2000. [An early version appeared as MIT Occasional Papers in Linguistics 15, 1998.]
Chomsky, N. 2000. New Horizons in the Study of Language and Mind. New York: Cambridge University Press.
Chomsky, N. 2001. Derivation by phase. In Ken Hale: A Life in Language, ed. M. Kenstowicz, 1-52. Cambridge, MA: MIT Press. [An early version appeared as MIT Occasional Papers in Linguistics 18, 1999.]
Chomsky, N. 2004. Beyond explanatory adequacy. In Structures and Beyond, ed. A. Belletti, 104-133. Oxford: Oxford University Press.
Chomsky, N. 2005. Three factors in language design. Linguistic Inquiry 36. 1-22.
Chomsky, N. 2008. On phases. In Foundational Issues in Linguistic Theory, eds. R. Freidin, C. P. Otero & M. L. Zubizarreta, 133-166. Cambridge, MA: MIT Press.
Chomsky, N. & H. Lasnik. 1993. The theory of principles and parameters. In Syntax: An International Handbook of Contemporary Research, Vol. 1, eds. J. Jacobs, A. von Stechow, W. Sternefeld & T. Vennemann, 506-569. Berlin: Walter de Gruyter.
Citko, B. 2005. On the nature of Merge: External Merge, Internal Merge, and Parallel Merge. Linguistic Inquiry 36. 475-496.
Collins, C. 2002. Eliminating labels. In Derivation and Explanation in the Minimalist Program, eds. S. Epstein & T. D. Seely, 42-64. Malden, MA: Blackwell.
Compton, A. H. 1923. A quantum theory of the scattering of X-rays by light elements. The Physical Review 21. 483-502.
Culicover, P. & M. Rochemont. 1990. Extraposition and the Complement Principle. Linguistic Inquiry 21. 23-47.
Dayal, V. 2011. Hindi pseudo-incorporation. Natural Language and Linguistic Theory 29. 123-167.
Diesing, M. 1992. Bare plural subjects and the derivation of logical representations. Linguistic Inquiry 23.3. 353-380.
den Dikken, M. 2006. Relators and Linkers: The Syntax of Predication, Predicate Inversion, and Copulas. Cambridge, MA: MIT Press.
Fox, D. 2000. Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Fox, D. 2002. Antecedent-contained deletion and the copy theory of movement. Linguistic Inquiry 33. 63-96.
Fox, D. & J. Nissenbaum. 1999. Extraposition and scope: A case for overt QR. In Proceedings of the 18th West Coast Conference on Formal Linguistics, eds. S. Bird, A. Carnie, J. Haugen & P. Norquest, 132-144. Somerville, MA: Cascadilla Press.
Fox, D. & D. Pesetsky. 2004. Cyclic linearization of syntactic structure. Ms., MIT.
Frank, R. 2006. Phase theory and Tree Adjoining Grammar. Lingua 116. 145-202.
Freidin, R. 1986. Fundamental issues in the theory of binding. In Studies in the Acquisition of Anaphora, ed. B. Lust, 151-188. Dordrecht.
Freidin, R. & J.-R. Vergnaud. 2001. Exquisite connections: Some remarks on the evolution of linguistic theory. Lingua 111. 639-666.
Goldstein, L., D. Byrd & E. Saltzman. 2006. The role of vocal tract gestural action units in understanding the evolution of phonology. In From Action to Language: The Mirror Neuron System, ed. M. Arbib, 215-249. Cambridge: Cambridge University Press.
Gracanin-Yuksek, M. 2007. About Sharing. Doctoral dissertation, MIT.
Grano, T. 2012. Control and Restructuring at the Syntax-Semantics Interface. Doctoral dissertation, University of Chicago.
Grimshaw, J. 1991. Extended projection. Ms., Brandeis University, Waltham, MA.
Guerón, J. & R. May. 1984. Extraposition and Logical Form. Linguistic Inquiry 15. 1-31.
Hale, K. & S. J. Keyser. 1993. On argument structure and the lexical expression of syntactic relations. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, eds. K. Hale & S. J. Keyser. Cambridge, MA: MIT Press.
Hale, K. & S. J. Keyser. 2002. Prolegomenon to a Theory of Argument Structure. Cambridge, MA: MIT Press.
Halle, M. & J.-R. Vergnaud. 1987. An Essay on Stress. Cambridge, MA: MIT Press.
Hiraiwa, K. 2005. Dimensions of Symmetry in Syntax: Agreement and Clausal Architecture. Doctoral dissertation, MIT.
Hornstein, N. 1999. Movement and control. Linguistic Inquiry 30.1. 69-96.
Hudson, R. A. 1984. Word Grammar. Oxford: Basil Blackwell.
Johnson, K. 2000. Restoring exotic coördinations to normalcy. Ms., University of Massachusetts, Amherst.
Kayne, R. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kayne, R. 2002. Pronouns and their antecedents. In Derivation and Explanation in the Minimalist Program, eds. S. Epstein & T. D. Seely, 133-166.
Kayne, R. 2005. Movement and Silence. Oxford: Oxford University Press.
Kayne, R. 2008. Why isn't This a complementizer? Ms., New York University.
Kitagawa, Y. 1986. Subject in English and Japanese. Doctoral dissertation, University of Massachusetts, Amherst.
Koopman, H. 1983. Control from COMP and comparative syntax. The Linguistic Review 2.2. 139-160.
Koopman, H. & D. Sportiche. 1985. Theta theory and extraction. Abstract in GLOW Newsletter.
Krifka, M. 1989. Nominal reference, temporal constitution and quantification in event semantics. In Semantics and Contextual Expression, eds. R. Bartsch, J. van Benthem & P. van Emde Boas. Dordrecht: Foris.
Krifka, M. 1992. Thematic relations as links between nominal reference and temporal constitution. In Lexical Matters, eds. I. Sag & A. Szabolcsi. Stanford, CA: CSLI Publications.
Kuroda, S.-Y. 1968. English relativization and certain related problems. Language 44. 244-266.
Kuroda, S.-Y. 1988. Whether we agree or not. Lingvisticae Investigationes 12. 1-47.
Laka, I. 1993. Unergatives that assign ergative, unaccusatives that assign accusative. In Papers on Case and Agreement, eds. J. Bobaljik & C. Phillips, 149-172. MIT Working Papers in Linguistics 18. Cambridge, MA: MITWPL.
Lamontagne, G. & L. Travis. 1987. The syntax of adjacency. In Proceedings of WCCFL 6, ed. M. Crowhurst, 173-186.
Landau, I. 2000. Elements of Control: Structure and Meaning in Infinitival Constructions. Studies in Natural Language and Linguistic Theory. Dordrecht.
Landau, I. 2004. The scale of finiteness and the calculus of control. Natural Language & Linguistic Theory 22.4. 811-877.
Landau, I. 2013. Predication vs. logophoric anchoring: A two-tiered theory of control. Paper presented at CLS 49, University of Chicago, April 20.
Larson, R. 1988. On the double object construction. Linguistic Inquiry 19. 335-391.
Lasnik, H. & J. Uriagereka. 2005. A Course in Minimalist Syntax. Oxford: Blackwell.
Lebeaux, D. 1988. Language Acquisition and the Form of Grammar. Doctoral dissertation, University of Massachusetts, Amherst.
Lebeaux, D. 1991. Relative clauses, licensing, and the nature of the derivation. In Perspectives on Phrase Structure: Heads and Licensing, ed. S. Rothstein, 209-239. San Diego, CA: Academic Press.
Lebeaux, D. 2000. Language Acquisition and the Form of Grammar. Philadelphia, PA: John Benjamins.
Lebeaux, D. 2001. Prosodic form, syntactic form, phonological bootstrapping and telegraphic speech. In Approaches to Bootstrapping: Phonological, Lexical, Syntactic, and Neurophysiological Aspects of Early Language Acquisition, Vol. 2, eds. J. Weissenborn & B. Höhle, 87-120. Philadelphia, PA: John Benjamins.
Leung, T. 2007. Syntactic Derivation and the Theory of Matching Contextual Features. Doctoral dissertation, University of Southern California, Los Angeles.
Levin, B. & M. Rappaport Hovav. 1994. A preliminary analysis of causative verbs in English. Lingua 92. 35-77.
Liao, W.-W. R. 2011. The Symmetry of Syntactic Relations. Doctoral dissertation, University of Southern California.
Liao, W.-W. R. Forthcoming. On Merge-markers and nominal structures. In Primitive Elements of Grammatical Theory: Papers by Jean-Roger Vergnaud and His Collaborators, eds. K. McKinney-Bock & M. L. Zubizarreta. Routledge.
Liao, W.-W. R. & J.-R. Vergnaud. 2010. Of NPs and phases. Paper presented at GLOW in Asia 8, August 12-16.
Lin, T. J. 2001. Light Verb Syntax and the Theory of Phrase Structure. Doctoral dissertation, University of California, Irvine.
Link, G. 1984. Hydras: On the logic of relative clause constructions with multiple heads. In Varieties of Formal Semantics, eds. F. Landman & F. Veltman, 151-180. Dordrecht: Reidel.
Manzini, M.-R. 1995. From Merge and Move to Form Dependency. UCLA Working Papers in Linguistics 7. 323-345.
Manzini, M.-R. 1997. Adjuncts and the theory of phrase structure. In Proceedings of the Tilburg Conference on Rightward Movement, eds. D. LeBlanc & H. van Riemsdijk. Amsterdam: John Benjamins.
Martin, R. & J. Uriagereka. 2000. Some possible foundations of the Minimalist Program. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, eds. R. Martin, D. Michaels & J. Uriagereka. Cambridge, MA: MIT Press.
McCawley, J. 1982. Parentheticals and discontinuous constituent structure. Linguistic Inquiry 13.1. 91-106.
McKinney-Bock, K. 2010. Linearization when multiple orderings are possible: Adjective ordering restrictions and focus. Paper presented at the 32nd Conference of the Deutsche Gesellschaft für Sprachwissenschaft (DGfS), Humboldt University, Berlin, February 23-26.
McKinney-Bock, K. & J.-R. Vergnaud. 2010. Grafts and beyond. Paper presented at Generative Linguistics in the Old World (GLOW) 33, Wrocław, Poland, April 14-16.
McKinney-Bock, K. & J.-R. Vergnaud. Forthcoming. Grafts and beyond: Graph-theoretic syntax. In Primitive Elements of Grammatical Theory: Papers by Jean-Roger Vergnaud and His Collaborators, eds. K. McKinney-Bock & M. L. Zubizarreta. Routledge.
McKinney-Bock, K. & M. L. Zubizarreta, eds. Forthcoming. Primitive Elements of Grammatical Theory: Papers by Jean-Roger Vergnaud and His Collaborators. Routledge.
Megerdoomian, K. 2002. Beyond Words and Phrases: A Unified Theory of Predicate Composition. Doctoral dissertation, University of Southern California.
Megerdoomian, K. 2008. Parallel nominal and verbal projections. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, eds. R. Freidin, C. P. Otero & M. L. Zubizarreta, 73-104. Cambridge, MA: MIT Press.
Nunes, J. 2004. Linearization of Chains and Sideward Movement. Cambridge, MA: MIT Press.
Ogawa, Y. 2001. A Unified Theory of Verbal and Nominal Projections. Oxford: Oxford University Press.
Perlmutter, D. M. & J. R. Ross. 1970. Relative clauses with split antecedents. Linguistic Inquiry 1. 350.
Pesetsky, D. & E. Torrego. 2001. T-to-C movement: Causes and consequences. In Ken Hale: A Life in Language, ed. M. Kenstowicz. Cambridge, MA: MIT Press.
Richards, N. 1997. What Moves Where When in Which Language? Doctoral dissertation, MIT.
van Riemsdijk, H. 1998. Categorial feature magnetism: The endocentricity and distribution of projections. Journal of Comparative Germanic Linguistics 2. 1-48.
van Riemsdijk, H. 2001. A far from simple matter: Syntactic reflexes of syntax-pragmatics misalignments. In Semantics, Pragmatics and Discourse: Perspectives and Connections. A Festschrift for Ferenc Kiefer, eds. I. Kenesei & R. M. Harnish, 21-41. Amsterdam: John Benjamins.
van Riemsdijk, H. 2006. Grafts follow from Merge. In Phases of Interpretation, ed. M. Frascarelli, 17-44. Berlin: Mouton de Gruyter.
van Riemsdijk, H. & E. Williams. 1981. NP-structure. The Linguistic Review 1. 171-217.
Rizzi, L. 1990. Relativized Minimality. Cambridge, MA: MIT Press.
Robinson, J. J. 1970. Dependency structures and transformational rules. Language 46. 259-285.
Ross, J. R. 1967. Constraints on Variables in Syntax. Doctoral dissertation, MIT.
Saltzman, E. & D. Byrd. 2000. Task-dynamics of gestural timing: Phase windows and multifrequency rhythms. Human Movement Science 19. 499-526.
Saltzman, E., H. Nam, J. Krivokapic & L. Goldstein. 2008. A task-dynamic toolkit for modeling the effects of prosodic structure on articulation. In Proceedings of the Speech Prosody 2008 Conference, eds. P. A. Barbosa, S. Madureira & C. Reis. Campinas, Brazil.
Sauzet, P. 1993. Attenance, gouvernement et mouvement en phonologie. Les constituants dans la phonologie et la morphologie de l'occitan. Doctoral dissertation, Université Paris 8. [Published 1994, Montpellier: CEO/UPV.]
Schein, B. 2001. The semantics of Right-Node Raising and number agreement. Paper presented at the Semantics Workshop, Center for Cognitive Science, Rutgers University, May.
Schein, B. 2002. Number agreement in Lebanese Arabic. Linguistics colloquium, MIT, Cambridge, MA, May.
Schein, B. 2007. Simple clauses conjoined. Symposium: Philosophy & Linguistics, American Philosophical Association, Eastern Division, Baltimore, MD, December 28.
Schein, B. 2010. Conjunction Reduction redux. Ms., University of Southern California.
Simpson, A. 2008. Lecture notes, LING 535. University of Southern California.
Sportiche, D. 1988. A theory of floating quantifiers and its corollaries for constituent structure. Linguistic Inquiry 19.3. 425-449.
Sportiche, D. 2005. Division of labor between Merge and Move: Strict locality of selection and apparent reconstruction paradoxes. In Proceedings of the Workshop Divisions of Linguistic Labor, The La Bretesche Workshop.
Szabolcsi, A. 1989. Noun phrases and clauses: Is DP analogous to IP or CP? In The Structure of Noun Phrases, ed. J. Payne. Mouton de Gruyter.
Travis, L. 1984. Parameters and Effects of Word Order Variation. Doctoral dissertation, MIT.
Vergnaud, J.-R. 2003. On a certain notion of "occurrence": The source of metrical structure, and of much more. In Living on the Edge, ed. S. Ploch. Berlin: Mouton de Gruyter.
Vergnaud, J.-R. 2007. A format for syntactic analysis. Ms., University of Southern California.
Vergnaud, J.-R. 2009. Defining constituent structure. Ms., University of Southern California.
Vergnaud, J.-R. Forthcoming. Avatars of conceptual necessity: Elements of UG. In Primitive Elements of Grammatical Theory: Papers by Jean-Roger Vergnaud and His Collaborators, eds. K. McKinney-Bock & M. L. Zubizarreta. Routledge.
Vergnaud, J.-R. & M. L. Zubizarreta. 2001. Derivation and constituent structure. Ms., University of Southern California.
de Vries, M. 2009. On multidominance and linearization. Biolinguistics 3.4. 344-403.
Walkow, M. 2010. A unified analysis of the Person-Case Constraint and 3-3 effects in Barceloni Catalan. In Proceedings of NELS 40, eds. S. Kan, C. Moore-Cantwell & R. Staubs. Amherst, MA: GLSA.
Wilder, C. 1994. Coordination, ATB and ellipsis. Groninger Arbeiten zur Germanistischen Linguistik 37. 291-331.
Wilder, C. 2008. Shared constituents and linearization. In Topics in Ellipsis, ed. K. Johnson. Cambridge: Cambridge University Press.
Williams, E. 1978. Across-the-board rule application. Linguistic Inquiry 9. 31-43.
Williams, E. 2003. Representation Theory. Cambridge, MA: MIT Press.
Wurmbrand, S. 1999. Modal verbs must be raising verbs. In Proceedings of WCCFL 18. Somerville, MA: Cascadilla Press.
Zagona, K. T. 1982. Government and Proper Government of Verbal Projections. Doctoral dissertation, University of Washington.
Abstract
This dissertation aims to revisit foundational issues in syntactic theory regarding cyclicity and displacement. I take narrow syntax to operate over domains (phases) more local than in current Minimalism. To do this, I define a notion of phase overlap which involves the sharing of grammatical features across two independent phases. Phase overlap applies to phases involved in the construction of argument structure, e.g., linking subject and object phases, in further building clausal structure, as well as in embedding of complement clauses, and phase overlap also plays a role in A-bar constructions, such as relativization.

To overlap phases, I take the idea that generalized binary connectives build phrase structure (Vergnaud forthcoming), and extend it in such a way that it gives rise to phases that involve parallel nominal and verbal domains, rather than treating the verbal domain as 'privileged'. In this dissertation, both the verbal and nominal domains are implicated at the edges of phases, creating phase overlap and a novel notion of cyclicity: to construct two (consecutive) cycles is to share a pair of features across (both) the nominal and verbal domain.

The definition of sharing across phases, or phase overlap, is grounded in the scientific hypothesis that long-distance grammatical relationships are a by-product of interface requirements such as linearization, rather than a fundamental aspect of the architecture at narrow syntax. This hypothesis is based in part on the Items and Contexts Architecture (ICA, Vergnaud forthcoming), although the ICA remains incomplete in its formalization of embedding. From this type of sharing, I develop a strong hypothesis that the appearance of displacement (of a noun) is a product of how the formal computational system spells out, rather than a movement operation that takes place at narrow syntax.

From this hypothesis, I then set forth a unified analysis of the D-C-T domain, where noun sharing plays a crucial role in the linking — generalized linking — of two (otherwise independent) phases, including subject/object phases to build a transitive clause (chapter 3), two CP phases involving embedded and matrix clauses (chapter 4), and relative and matrix clauses (chapter 5). Along with noun sharing, I maintain the idea of verbal sharing — in a certain way following the standard literature, i.e., that v is visible to both the lower (object) phase and the higher (subject) phase. This plays a key role in phase overlap. However, I extend this idea to all embedding, and hypothesize that all embedding shares (semantically interpretable) features. This is seen empirically, especially in cases of control.