
CONTROL AS MOVEMENT

The movement theory of control (MTC) makes one major claim: that control relations in sentences like ‘John wants to leave’ are grammatically mediated by movement. This goes against the traditional view that such sentences involve not movement, but binding, and analogizes control to raising, albeit with one important distinction: whereas the target of movement in control structures is a theta position, in raising it is a non-theta position; however, the grammatical procedures underlying the two constructions are the same. This book presents the main arguments for MTC and shows it to have many theoretical advantages, the biggest being that it reduces the kinds of grammatical operations that the grammar allows, an important advantage in a minimalist setting. It also addresses the main arguments against MTC, using examples from control shift, adjunct control, and the control structure of “promise,” showing MTC to be conceptually, theoretically, and empirically superior to other approaches.

Cedric Boeckx is Research Professor at the Catalan Institute for Advanced Studies (ICREA), and a member of the Center for Theoretical Linguistics at the Universitat Autònoma de Barcelona.

Norbert Hornstein is Professor in the Department of Linguistics at the University of Maryland, College Park.

Jairo Nunes is Professor in the Department of Linguistics at the Universidade de São Paulo, Brazil.

In this series

81 Roger Lass: Historical linguistics and language change
82 John M. Anderson: A notional theory of syntactic categories
83 Bernd Heine: Possession: cognitive sources, forces and grammaticalization
84 Nomi Erteschik-Shir: The dynamics of focus structure
85 John Coleman: Phonological representations: their names, forms and powers
86 Christina Y. Bethin: Slavic prosody: language change and phonological theory
87 Barbara Dancygier: Conditionals and prediction
88 Claire Lefebvre: Creole genesis and the acquisition of grammar: the case of Haitian creole
89 Heinz Giegerich: Lexical strata in English
90 Keren Rice: Morpheme order and semantic scope
91 April McMahon: Lexical phonology and the history of English
92 Matthew Y. Chen: Tone Sandhi: patterns across Chinese dialects
93 Gregory T. Stump: Inflectional morphology: a theory of paradigm structure
94 Joan Bybee: Phonology and language use
95 Laurie Bauer: Morphological productivity
96 Thomas Ernst: The syntax of adjuncts
97 Elizabeth Closs Traugott and Richard B. Dasher: Regularity in semantic change
98 Maya Hickmann: Children’s discourse: person, space and time across languages
99 Diane Blakemore: Relevance and linguistic meaning: the semantics and pragmatics of discourse markers
100 Ian Roberts and Anna Roussou: Syntactic change: a minimalist approach to grammaticalization
101 Donka Minkova: Alliteration and sound change in early English
102 Mark C. Baker: Lexical categories: verbs, nouns and adjectives
103 Carlota S. Smith: Modes of discourse: the local structure of texts
104 Rochelle Lieber: Morphology and lexical semantics
105 Holger Diessel: The acquisition of complex sentences
106 Sharon Inkelas and Cheryl Zoll: Reduplication: doubling in morphology
107 Susan Edwards: Fluent aphasia
108 Barbara Dancygier and Eve Sweetser: Mental spaces in grammar: conditional constructions
109 Matthew Baerman, Dunstan Brown, and Greville G. Corbett: The syntax–morphology interface: a study of syncretism
110 Marcus Tomalin: Linguistics and the formal sciences: the origins of generative grammar
111 Samuel D. Epstein and T. Daniel Seely: Derivations in minimalism
112 Paul de Lacy: Markedness: reduction and preservation in phonology
113 Yehuda N. Falk: Subjects and their properties
114 P. H. Matthews: Syntactic relations: a critical survey
115 Mark C. Baker: The syntax of agreement and concord
116 Gillian Catriona Ramchand: Verb meaning and the lexicon: a first phase syntax
117 Pieter Muysken: Functional categories
118 Juan Uriagereka: Syntactic anchors: on semantic structuring
119 D. Robert Ladd: Intonational phonology, second edition
120 Leonard H. Babby: The syntax of argument structure
121 B. Elan Dresher: The contrastive hierarchy in phonology
122 David Adger, Daniel Harbour, and Laurel J. Watkins: Mirrors and microparameters: phrase structure beyond free word order
123 Niina Ning Zhang: Coordination in syntax
124 Neil Smith: Acquiring phonology
125 Nina Topintzi: Onsets: suprasegmental and prosodic behaviour
126 Cedric Boeckx, Norbert Hornstein, and Jairo Nunes: Control as movement

Earlier issues not listed are also available

CAMBRIDGE STUDIES IN LINGUISTICS

General editors: P. Austin, J. Bresnan, B. Comrie, S. Crain, W. Dressler, C. J. Ewen, R. Lass, D. Lightfoot, K. Rice, I. Roberts, S. Romaine, N. V. Smith


CONTROL AS MOVEMENT

CEDRIC BOECKX
ICREA/Universitat Autònoma de Barcelona

NORBERT HORNSTEIN
University of Maryland, College Park

JAIRO NUNES
Universidade de São Paulo, Brazil

CAMBRIDGE UNIVERSITY PRESS

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

www.cambridge.org
Information on this title: www.cambridge.org/9780521195454

© Cedric Boeckx, Norbert Hornstein, and Jairo Nunes 2010

This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2010

ISBN-13 978-0-511-78955-7 eBook (NetLibrary)

ISBN-13 978-0-521-19545-4 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

Acknowledgments
1 Introduction
2 Some historical background
  2.1 Introduction
  2.2 What any theory of control should account for
  2.3 Control in the standard-theory framework
  2.4 Control in GB
  2.5 Non-movement approaches to control within minimalism
    2.5.1 The null-case approach
    2.5.2 The Agree approach
  2.6 Conclusion
3 Basic properties of the movement theory of control
  3.1 Introduction
  3.2 Departing from the null hypothesis: historical, architectural, and empirical reasons
  3.3 Back to the future: elimination of DS and the revival of the null hypothesis
  3.4 Controlled PROs as A-movement traces
    3.4.1 Configurational properties
    3.4.2 Interpretive properties
    3.4.3 Phonetic properties and grammatical status
  3.5 Conclusion
4 Empirical advantages
  4.1 Introduction
  4.2 Morphological invisibility
  4.3 Interclausal agreement
  4.4 Finite control
    4.4.1 Finite control and hyper-raising
    4.4.2 Finite control, islands, and intervention effects
    4.4.3 Summary
  4.5 The movement theory of control under the copy theory of movement
    4.5.1 Adjunct control and sideward movement
    4.5.2 The movement theory of control and morphological restrictions on copies
    4.5.3 Backward control
    4.5.4 Phonetic realization of multiple copies and copy control
  4.6 Conclusion
5 Empirical challenges and solutions
  5.1 Introduction
  5.2 Passives, obligatory control, and Visser’s generalization
    5.2.1 Relativizing A-movement
    5.2.2 Impersonal passives
    5.2.3 Finite control vs. hyper-raising
  5.3 Nominals and control
    5.3.1 Finite control into noun-complement clauses in Brazilian Portuguese
    5.3.2 Raising into nominals in Hebrew
    5.3.3 The contrast between raising nominals and control nominals in English
  5.4 Obligatory control and morphological case
    5.4.1 Quirky case and the contrast between raising and control in Icelandic
    5.4.2 Apparent case-marked PROs
  5.5 The minimal-distance principle, control shift, and the logic of minimality
    5.5.1 Control with promise-type verbs
    5.5.2 Control shift
    5.5.3 Summary
  5.6 Partial and split control
    5.6.1 Partial control
    5.6.2 Split control
  5.7 Conclusion
6 On non-obligatory control
  6.1 Introduction
  6.2 Obligatory vs. non-obligatory control and economy computations
  6.3 Some problems
  6.4 A proposal
7 Some notes on semantic approaches to control
  7.1 Introduction
  7.2 General problems with selectional approaches to obligatory control
  7.3 “Simpler syntax”
    7.3.1 Some putative problems for the movement theory of control
    7.3.2 Challenges for “simpler syntax”
  7.4 Conclusion
8 The movement theory of control and the minimalist program
  8.1 Introduction
  8.2 Movement within minimalism and the movement theory of control
  8.3 The movement theory of control and the minimalist architecture of UG
  8.4 Inclusiveness, bare phrase structure, and the movement theory of control
  8.5 Conclusion
References
Index

Acknowledgments

Previous versions of (part of) the material discussed here have been presented at the following universities: Connecticut, Harvard, Leiden, Lisbon, Maryland, New York, Rutgers, São Paulo, Stony Brook, Tilburg, and Utrecht; and at the following meetings: ANPOLL 2003, XVIII Colloquium on Generative Grammar, Edges in Syntax, EVELIN 2004, GLOW XXX, Going Romance 2007, LSA 2005, Romania Nova II, Ways of Structure Building, and V Workshop on Formal Linguistics at USP. We would like to thank these audiences for comments and suggestions. Special thanks to Željko Bošković, Hans Broekhuis, Lisa Cheng, Marcelo Ferreira, Michaël Gagnon, Terje Lohndal, Carme Picallo, and Johan Rooryck. We would also like to acknowledge the support received from the Generalitat de Catalunya (grant 2009SGR1079; first author), NSF (grant NSD.BCS.0722648; second author), and CNPq and FAPESP (grants 302262/2008–3 and 2006/00965–2; third author).

1 Introduction

In the following pages we develop an extended argument for a proposal whose conceptual simplicity and empirical success will, we trust, be evident to all readers. The proposal says that (obligatory) control is movement, more specifically, A-movement. We propose that the phenomena that have been used to motivate a special and separate control construction are best explained if control is treated as an A-movement dependency, on a par with other phenomena that have been traditionally treated in terms of A-movement such as passive, raising, and (local) scrambling. Put another way, we claim that maintaining the constructional specificity of control (in whatever form, be it in terms of the PRO theorem [e.g., Chomsky 1981], null case [e.g., Chomsky and Lasnik 1993; Martin 1996; and Bošković 1997], or ad hoc “anaphoric” tense-agreement dependencies [e.g., Landau 1999, 2000, 2004]) significantly hampers our understanding of the phenomenon as it leads to explanations that are roughly as complex as the phenomenon itself.

Despite virtues that we believe are transparent (see e.g., Hornstein 1999, 2001), the movement theory of control (hereafter, MTC) has proven to be quite controversial.¹ We believe that there are several reasons for this.

The first one is historical. Differentiating raising from control in terms of movement has been a fixed point within generative grammar from the earliest accounts within the standard theory to current versions of minimalism (see Davies and Dubinsky 2004). Under this long-held view, which became crystallized in GB with the formulation of the (construction-specific) control module (Chomsky 1981), if raising involves movement, control cannot. It is thus not surprising that the MTC has been welcomed with considerable skepticism, as its basic proposal is exactly to analyze control in terms of (A-)movement. However, such historical bias should not deter us from a fair evaluation of the conceptual properties and empirical coverage of the MTC.

1 See e.g., Landau (2000, 2003); Culicover and Jackendoff (2001, 2005); Kiss (2005); and van Craenenbroeck, Rooryck, and van den Wyngaerd (2005) for a useful sample.


The second reason behind the controversy is also related to the long interest control has enjoyed within the generative tradition. Over the years, control phenomena have been richly described. Consequently, any new approach will likely fail, at least initially, to adequately handle some of the relevant data. Moreover, if the novel approach is conceptually tighter than the more descriptive accounts that it aims to replace (as we believe to be the case with the MTC), some features of the phenomenon heretofore assumed to be central may not be accommodated at all. This should occasion no surprise, as it reflects the well-known tension between description and explanation. Odd as it may seem, failure to cover a data point may be a mark of progress if those that are covered follow in a more principled fashion. The virtues of a proposal can be seriously misevaluated unless one keeps score of both what facts are covered and how facts are explained. A weak theory can often be “easily” extended to accommodate yet another data point, and this is not a virtue. Correspondingly, a tight theory may miss some “facts” and this is not necessarily a vice, particularly if the account is comparatively recent and the full implications of its resources have not yet been fully developed. We believe that many have been too impressed by these apparent problems without considering how the MTC might be developed to handle them. In fact, we believe that the MTC actually faces few empirical difficulties (and none of principle), whereas the current alternatives both face very serious empirical hurdles (e.g., backward control) and often empirically succeed by stipulating what should be explained (e.g., the distribution of PRO through null case). One aim of what follows is to make this case in detail.

Finally, it is fair to say that the resistance to MTC is in part due to the inadequacies and limitations of previous versions of the MTC (including our own work), which we have tried to overcome here. Addressing the vigorous critiques of MTC here and in previous work (Hornstein 2003; Boeckx and Hornstein 2003, 2004, 2006a; Nunes 2007; Boeckx, Hornstein, and Nunes in press) has allowed us to rectify some errors, clarify the proposal, and sharpen the arguments. This stimulating intellectual exercise has led us to better appreciate the consequences of the MTC and has in fact convinced us that it covers even more empirical ground than we at first thought, as we will argue in the following chapters.

For all these reasons, we thought that a detailed defense of MTC required a monograph. But before we launch our defense of MTC, a few notes are in order.

First, we cannot emphasize enough that MTC does not equate “control” with “raising.” Since the MTC was first proposed, it has been regularly objected that the MTC cannot be right because of features that control has, but raising does not, and vice versa. However, control is raising only in the descriptive sense that control is an instance of A-movement, but it is not raising qua construction. In other words, all the MTC is saying is that, like the derivation of raising, passive, or local scrambling constructions, the derivation of obligatory-control constructions also involves A-movement. The different properties of constructions involving wh-movement and topicalization, for instance, do not argue against analyzing them in terms of A’-movement. Similarly, we urge the reader not to dismiss our proposal simply because (unanalyzed) control–raising asymmetries exist. Although raising often proves useful in illustrating properties of A-movement that carry over to control, it is a ladder that ought to be kicked away as theory advances. In the chapters that follow, we in fact argue that control–raising asymmetries generally reduce to independent factors – something we take to be an indication that the MTC is on the right track.

Second, the MTC is actually not a radically new idea. It goes back as far as Bowers (1973), who already proposed that raising and control should be basically generated in the same way. However, as the proposal conflicted with core principles of almost every model of UG from Aspects to GB, it did not find fertile soil to blossom for a long time. This scenario drastically changed when the minimalist program came into the picture. Chomsky’s (1993) proposal that D-structure should be eliminated provided a very natural conceptual niche for the MTC within the generative enterprise as it removed the major theoretical obstacle that prevented movement to θ-positions. In a system with D-structure, movement to θ-positions is a non-issue, for movement can only take place once θ-assignment is taken care of. By contrast, in a system without D-structure, where movement and θ-assignment intersperse, movement to θ-positions arises at least as a logical possibility. Thus, whether or not it is a sound option has to be determined on the basis of the other architectural features of the system, as well as its empirical coverage. We hope to show that the MTC fits snugly with some leading minimalist conceptions and thus constitutes an interesting argument in its favor.

Third, as minimalism aspires to explain why UG properties are the way they are, we are interested in developing a theory of control that deduces the properties of control configurations from more basic postulates, rather than merely listing the possible controllers, controllees, control predicates, and control complements coded as features of individual lexical items.

Finally, although our specific implementation of MTC is the one that has been extended to the broadest range of data thus far, it is certainly not the only one possible. O’Neil (1995), Manzini and Roussou (2000), Kayne (2002), and Bowers (2006) share the spirit but not the details of our analysis. For reasons of space, we will not be able to do proper justice to these works and the reader is invited to evaluate each different implementation in its own right.

Let us close this introductory chapter by providing an overview of the subsequent chapters. Chapter 2 offers a brief overview of how control is handled in the standard-theory framework, in GB, and in non-movement approaches within minimalism. Chapter 3 lays out the broad features of our version of the MTC. Chapter 4 discusses some of the empirical advantages that the MTC has. Chapter 5 addresses many of the empirical challenges that have been considered to be fatal to the MTC and proposes solutions compatible with the MTC. Chapter 6 presents our take on how non-obligatory control is to be analyzed. Chapter 7 discusses the extent to which the MTC is based on more solid conceptual and empirical grounds than semantic/selectional approaches to obligatory control. Finally, Chapter 8 concludes the monograph.

2 Some historical background

2.1 Introduction

Up to very recently, there had been a more or less uncontroversial view that control phenomena should be analyzed in terms of special grammatical primitives (e.g., PRO) and construction-specific interpretive systems (e.g., the control module). In this chapter, we examine how this conception of control was instantiated in the standard-theory framework (section 2.3), in GB (section 2.4), and in non-movement analyses within the minimalist program (section 2.5), briefly outlining what we take to be the virtues and problems of each approach.¹ This discussion will provide the general background for us to discuss the core properties of (our version of) the MTC in Chapter 3 and evaluate its adequacy in the face of the general desiderata for grammatical downsizing explored in the minimalist program.

2.2 What any theory of control should account for

A theoretically sound approach to control – one that goes beyond the mere listing of the properties involved in control – must meet (at least) the following four requirements. First, it must specify the kinds of control structures that are made available by UG and explain how and why they differ. Assuming, for instance, that obligatory control (OC) and non-obligatory control (NOC) are different, their differences should be reduced to more basic properties of the system.

Second, it must correctly describe the configurational properties of control, accounting for the positions that the controller and the controllee can occupy. In addition, it should provide an account as to why the controller and the controllee are so configured. Assuming, for instance, that the controllee can only appear in a subset of possible positions (e.g., ungoverned subjects), why are controllees so restricted?

Third, it must account for the interpretation of the controllee, explaining how the antecedent of the controllee is determined and specifying what kind of anaphoric relation obtains between the controllee and its antecedent (in both OC and NOC constructions) and why these relations obtain and not others. For instance, assuming that controllers must locally bind controllees in OC constructions, why is the control relation so restricted in these cases?

Fourth, it must specify the nature of the controllee: what is its place among the inventory of null expressions provided by UG? Is it a formative special to control constructions or is it something that is independently attested?

In the next sections, we briefly review how these concerns have been addressed from the standard-theory model to the minimalist program.

1 For much more detailed discussion, we urge the reader to consult Davies and Dubinsky’s (2004) excellent history of generative treatments of raising and control.

2.3 Control in the standard-theory framework

Within the framework of the standard theory, control phenomena were coded in the obligatory transformation referred to as equi(valent) NP deletion (END), which for our current purposes can be described as follows:²

(1) Structural description:  X - NP - Y - [S {for/poss} - NP - Z] - W
                             1   2    3      4            5    6    7
    Structural change:       1   2    3      4            Ø    6    7
    Conditions:  i. 2 = 5
                ii. the minimal-distance principle is satisfied

Irrelevant details aside, END applies to the (a)-structures in (2)–(5), for instance, and converts them into the corresponding (b)-sentences.

(2) a. John tried/wanted/hoped [for John to leave early]
    b. John tried/wanted/hoped to leave early

(3) a. John regrets/insisted on/prefers [poss John leaving early]
    b. John regrets/insisted on/prefers leaving early

(4) a. John persuaded/ordered/forced/asked/told Mary [for Mary to leave early]
    b. John persuaded/ordered/forced/asked/told Mary to leave early

(5) a. John kissed Mary before/after/without [poss John asking if he could]
    b. John kissed Mary before/after/without asking if he could

2 Here we abstract away from issues that are orthogonal to our discussion such as the interaction between END and the rule of complementizer deletion, which has the effect of deleting the term numbered 4 in (1). See Rosenbaum (1967, 1970) for discussion.


According to this approach, there is nothing of special interest in the nature of the controllee. It is a regular NP in the underlying structure and the fact that the corresponding surface position is phonetically null follows from the kind of transformation END is. It is a deletion transformation that removes the targeted NP, leaving nothing at surface structure. To put it differently, the superficial phonetic difference between controller and controllee results not from intrinsic lexical properties of the controllee, but from properties of the computation itself, i.e., that END is a deletion operation.

As far as the configurational properties of control are concerned, END explicitly specifies that the controllee (the target of deletion) must occur in the subject position of infinitival clauses (for-clauses) and gerunds (poss-clauses), and that the controller must be the closest NP (in compliance with the minimal-distance principle). Thus, according to the minimal-distance principle, sentences such as (4b) must be derived from the structures in (4a) and not from the one in (6) below, which would incorrectly allow the understood subject of the embedded clause to be interpreted as being coreferential with the matrix subject. As opposed to what we find in (4a), the antecedent of the controllee in (6) is not the closest NP around. As for adjunct control in sentences such as (5), the minimal-distance principle is satisfied under the assumption that the embedded clause is adjoined to the matrix clause and, as such, it is structurally closer to the subject than it is to the object.³

(6) John persuaded/ordered/forced/asked/told Mary [for John to leave early]
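The way END in (1) and the minimal-distance principle jointly separate (4a) from (6) can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions, not the standard-theory formalism: sentences are flat records rather than phrase markers, "identity" (condition i) is reduced to string equality, and "closeness" is hard-coded (the object for complement clauses when there is one, the subject for adjunct clauses). The names `Sentence`, `Embedded`, `closest_np`, and `apply_end` are hypothetical, introduced here for exposition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Embedded:
    comp: str              # term 4 in (1): "for" (infinitive) or "poss" (gerund)
    subject: str           # term 5: the NP targeted for deletion
    rest: str              # term 6: remainder of the embedded clause
    connective: str = ""   # e.g. "without"; non-empty marks an adjunct clause


@dataclass
class Sentence:
    subject: str
    verb: str
    obj: Optional[str]
    embedded: Embedded


def closest_np(s: Sentence) -> str:
    """Simplified minimal-distance principle: an adjunct clause is taken to be
    closest to the matrix subject; a complement clause is closest to the
    object when there is one, and to the subject otherwise."""
    if s.embedded.connective or s.obj is None:
        return s.subject
    return s.obj


def apply_end(s: Sentence) -> Optional[str]:
    """Return the surface string if END applies, or None if its conditions fail."""
    # Conditions (i) and (ii) together: term 5 must be identical to the closest NP.
    if closest_np(s) != s.embedded.subject:
        return None
    # Term 5 is deleted; the complementizer (term 4) is dropped as well,
    # standing in for the separate complementizer-deletion rule.
    words = [s.subject, s.verb, s.obj, s.embedded.connective, s.embedded.rest]
    return " ".join(w for w in words if w)


ex4a = Sentence("John", "persuaded", "Mary", Embedded("for", "Mary", "to leave early"))
ex6 = Sentence("John", "persuaded", "Mary", Embedded("for", "John", "to leave early"))
ex5a = Sentence("John", "kissed", "Mary",
                Embedded("poss", "John", "asking if he could", connective="without"))

print(apply_end(ex4a))  # John persuaded Mary to leave early
print(apply_end(ex6))   # None: 'John' is not the closest NP
print(apply_end(ex5a))  # John kissed Mary without asking if he could
```

Reducing identity to string equality is a deliberate simplification; it says nothing about how coreference is actually established.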

Finally, the interpretation properties of control are enforced by condition (i), which requires that controller and controllee be “identical,” which was understood in terms of coreference.

This general approach was refined within the standard theory as more complex control structures were considered, but its axiomatic (i.e., stipulative) nature remained. The configurational and interpretive properties of control were analyzed as irreducible features of the END transformation itself. This by no means diminishes the value of these earlier approaches to control. Identifying the different properties of control phenomena with such formal rigor was unquestionably an achievement, with large consequences for theorizing beyond control structures, and it paved the way for subsequent reanalyses in GB and in the minimalist program.

3 END as stated is not entirely adequate empirically. Given (1) above, the structure in (ia), for example, should allow for control by ‘Mary’ in (ib):

(i) a. John persuaded a friend of Mary [for Mary to leave]
    b. John persuaded a friend of Mary to leave

It should be clear how requiring that some sort of command relation hold between the antecedent NP and the deleted one will help screen out cases like (i), where the “wrong” NP is chosen.

Before we leave this brief review, two points are worth mentioning which will be relevant to the discussion of these later reanalyses, including the MTC. The first one regards an empirical problem that the standard-theory approach faced in relation to the way it handled the interpretive properties of control. As we saw above, the controller and the controllee were taken to be lexically identical and the semantic relation between them was understood as coreference. Problems arise when the controller is not a referential NP, as exemplified by the contrast between (7) and (8).

(7) a. [John wants [John to win]]
    b. John wants to win

(8) a. [Everyone wants [everyone to win]]
    b. Everyone wants to win

Whereas (7a) might be taken to roughly represent the meaning of (7b), (8a) in no way represents the interpretation of (8b), which should rather be paraphrased as ‘Everyone wants himself to win.’ This suggests that, instead of an NP identical to (i.e., coreferential with) its controller in underlying structure, what we actually need is a kind of bound anaphor or an expression that can be so interpreted.⁴ The obvious question then is how to obtain this bound interpretation.

The second point worth mentioning concerns the identification of another type of control. Relatively early on, END was distinguished from a related operation dubbed super-equi (SEND). This operation also deletes a subject of a non-finite clause but, in contrast to END, it operates across unbounded stretches of sentential material, as illustrated in (9).⁵

4 If there is an anaphoric relation in control structures, then END is unlikely to be a chopping (“gap”-leaving) rule. Rather, it is more like the rules of reflexivization or pronominalization, which were operations governed by command relations. The problem is that control structures do not appear to leave lexical residues like the other construal operations. They appear to require a phonetic gap. Seen from a contemporary perspective, the problem of how to characterize the rules that lead to control structures (are they chopping rules or construal rules?) highlights the tension that we will see constantly recurring: how best to account for both the distribution of the controllee and its interpretation.


(9) a. [S1 John said [S2 that Mary believes [S3 that [S4 John washing himself] would make a good impression on possible employers]]]
    b. John said that Mary believes that washing himself would make a good impression on possible employers

Note that (9) violates the minimal-distance principle, as ‘Mary’ intervenes between the target of deletion (‘John’ in S4) and its antecedent (‘John’ in S1). Moreover, in contrast to standard END configurations, the controllee is not within a clausal complement (or adjunct) of a higher predicate. In (9), for instance, the controllee is within the sentential subject of S3. The following question then arises: what is the relation between END and SEND? Or to put the question somewhat differently: why should UG have two rules that have the same effect (deletion of an identical NP), but apply to different configurations?⁶

In the next sections we examine some answers to these two issues that were offered within GB and the minimalist program.

2.4 Control in GB

Building on earlier work in the extended standard theory (EST), the GB approach to control is considerably more ambitious and empirically more successful than the standard-theory model. Within GB, the controllee is a PRO, a base-generated NP containing no lexical material ([NP Ø]). This conception of the controllee as a base-generated non-lexical formative arises as a natural consequence of the GB assumptions regarding the base component. The GB theory of the base includes both phrase-structure rules, like the ones in (10), and lexical-insertion operations, like the ones in (11).

(10) a. S → NP INFL VP
     b. VP → V NP
     c. NP → N

(11) a. N → John/he/it/Bill
     b. V → kiss/see/admire
     c. INFL → past/to

These two types of rules operate in tandem to generate structures such as (12) below. However, they can also be used to generate structures like (13), where the subject of the clause has been generated by the phrase-structure component but has not been filled by lexical insertion. In short, a theory of the base factored into a set of phrase-structure rules and lexical-insertion operations has room for an element like PRO: it is what one gets when one generates an NP structure but does not subject it to lexical insertion.

6 Grinder (1970) actually collapsed END and SEND. However, later approaches identified many substantial differences between the constructions underlying END and SEND that are better captured if two kinds of control are recognized, as we shall see below.

(12) [S [NP John] past [VP see [NP Bill]]]

(13) [S [NP Ø] to [VP see [NP Bill]]]
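The division of labor between (10) and (11) can be sketched in a few lines of code. This is a toy illustration, not a claim about how GB formalizes the base: the rule tables copy (10) and (11), while the function names `build`, `insert`, and `render`, and the use of `None` to mean "lexical insertion did not apply", are assumptions made here for exposition.

```python
# Phrase-structure rules from (10) and lexical-insertion options from (11).
PS_RULES = {"S": ["NP", "INFL", "VP"], "VP": ["V", "NP"], "NP": ["N"]}
LEXICON = {
    "N": {"John", "he", "it", "Bill"},
    "V": {"kiss", "see", "admire"},
    "INFL": {"past", "to"},
}


def build(category):
    """Expand a category top-down; terminal slots start out empty (None)."""
    if category in PS_RULES:
        return (category, [build(child) for child in PS_RULES[category]])
    return (category, None)


def insert(node, words):
    """Fill terminal slots left to right from the iterator `words`.
    A None entry skips lexical insertion for that slot."""
    label, content = node
    if content is None:
        word = next(words)
        if word is not None and word not in LEXICON[label]:
            raise ValueError(f"{word!r} cannot be inserted under {label}")
        return (label, word)
    return (label, [insert(child, words) for child in content])


def render(node):
    """Bracket phrasal categories; show an unfilled terminal slot as Ø."""
    label, content = node
    if isinstance(content, list):
        return "[" + label + " " + " ".join(render(c) for c in content) + "]"
    return content if content is not None else "Ø"


ex12 = render(insert(build("S"), iter(["John", "past", "see", "Bill"])))
ex13 = render(insert(build("S"), iter([None, "to", "see", "Bill"])))
print(ex12)  # [S [NP John] past [VP see [NP Bill]]]
print(ex13)  # [S [NP Ø] to [VP see [NP Bill]]]
```

Nothing PRO-specific is added to the rule tables: the empty subject in the second output arises only because insertion was skipped for one slot, which is the point the text makes about (13).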

This way of understanding PRO has an interesting consequence for the constructions that were captured by END in the standard theory. If one assumes that categories without lexical content are uninterpretable unless provided with “content” (by being linked with an antecedent, for example) and, furthermore, that the principle of full interpretation does not tolerate contentless structures, then the requirement that PRO must have an antecedent follows naturally. 7

We wish to stress this point as it is important for some of the discussion that follows. If one treats PRO as a lexical element, it is hard to explain why PRO must be phonetically null and why it requires an antecedent. Of course, it is possible to stipulate that these two features are inherent properties of a specific lexical item (PRO), but this cannot explain why PRO is necessarily anaphoric and null. Moreover, so conceived, PRO is a rather unusual lexical element as it has no positive properties. It has no phonetic matrix and its only semantic feature is the requirement that it must be coindexed with a grammatical antecedent. 8 This point is worth emphasizing. PRO, on this view, is not simply a semantically dependent expression that needs to be interpreted with respect to some salient element in the discourse (e.g., like ‘the other’ in ‘John ate one of the bagels. Harry ate the other.’). Rather, PRO is specified as needing an antecedent in a particular structural configuration. However, this is a very odd lexical feature as it is only definable in configurational (i.e., grammatical) terms. In other words, invoking such features in the construction of lexical items (be it PRO or any other item) is just a way of simulating a grammatical requirement via lexical stipulation. 9 The GB approach offers a sounder alternative as it treats PRO’s properties as the result of interacting grammatical principles. This feature of the GB analysis of control is clearly a desirable one for any theory to have. Any adequate theory of control should eschew lexically stipulating PRO’s basic properties and specify how grammatical principles interact so that the desired properties of PRO emerge. We consider some possibilities below (see also Chapter 3).

7 See Chomsky (1980: 8): “If Coindex does not apply and the embedded clause contains PRO, then we end up with a ‘free variable’ in LF; an improper representation, not a sentence but an open sentence.”

8 This point is similar to Chomsky’s (1995) argument against considering Agr as a lexical category. Given that its only features are uninterpretable, a preferable approach, all things being equal, is to take these features as belonging to related true lexical categories.

Being a grammatical, non-lexical formative specified as [ NP Ø], PRO is in fact quite similar to NP-traces (standard traces of A-movement) in GB. 10 What distinguishes them is neither their internal structures nor their interpretation, but how they are introduced in the derivation and how they get their indices. 11 PRO is inserted at D-structure, but is only coindexed later in the derivation. In contrast, NP-traces receive their indices as they are created in a movement operation (they must be coindexed with the NP that moves). However, after PROs get their indices (at S-structure or LF), they become completely indistinguishable from NP-traces. Notice that, once we take PRO and NP-traces to be indistinguishable at some point(s) in the derivation, we are already very close to the MTC. We return to this point in Chapters 3 and 4.

10 See Chomsky (1977: 82): “We may take PRO to be just a base-generated t(x) [trace of x], x a variable; i.e., as a base generated NP x , an NP without an index.”

11 See Chomsky (1977: 82): “trace and PRO are the same element; they differ only in the way the index is assigned – as a residue of a movement rule in one case, and by a rule of control in the other ... Note also that PRO is a non-terminal.”

The GB account of the distribution of PRO is similarly ambitious. Rather than simply stipulate that it appears in the subject position of non-finite clauses, GB strove to derive this fact from the binding theory. The proposal, known as the PRO theorem, went as follows (see Chomsky 1981). PRO was taken to be a pronominal anaphor and, as such, subject to both principles A and B. 12 Principle A states that an anaphor must be bound in its domain; principle B that a pronoun must be free in its domain. Under the assumption that these principles apply within the same domain, they end up imposing contradictory requirements on a pronominal anaphor, namely, that it should be both free and bound in the same domain. The only way for such an expression to meet both requirements is for it to vacuously satisfy them, i.e., by not meeting the necessary conditions for these requirements to be enforced. Thus, PRO cannot have a binding domain. Given that the binding domain for an expression was defined (in one of its formulations) as the smallest clause within which it is governed, then PRO does not have a binding domain if it is ungoverned (if it has no governor, for instance). Finally, if one takes an Infl head to be a governor if it is finite but not if it is non-finite (‘to’ and ‘ing’), one then derives the distribution of PRO: it can only appear in the subject position of non-finite clauses.

A side benefit of this reasoning is that it provides an account of why PRO must be phonetically null. Within GB, case theory requires that nominals with phonetic content bear case and case is taken to be assigned under government. If PRO only appears in ungoverned positions, it cannot be case marked. Therefore, PRO cannot have phonetic content, for otherwise the case filter would be violated. Once again, this makes PRO very similar to NP-traces. These too occur in caseless positions and, not surprisingly, are phonetically null.

Notice also that, by taking PRO to be a non-lexical formative, the problem posed by quantified expressions in the standard-theory framework dissolves. A sentence like (8b), for instance, repeated below in (14a), will be associated with a structure along the lines of (14b), where PRO does not have quantification properties on its own, but is rather interpreted as a bound variable, as desired.

(14) a. Everyone wants to win
     b. [Everyone wants [PRO to win]]
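The logic of the PRO theorem summarized above can be made concrete with a small sketch. The Python below is only an illustrative toy, not a formalization drawn from the GB literature: the names `Nominal`, `has_binding_domain`, `satisfies_binding_theory`, and `pro_licensed_as_subject` are hypothetical, and binding is reduced to Boolean flags. It assumes, with the text, that an element has a binding domain only if it is governed, that principles A and B are enforced only inside such a domain, and that a finite Infl governs its subject whereas a non-finite Infl does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nominal:
    """Toy nominal; every field is a simplifying assumption."""
    anaphoric: bool        # subject to principle A
    pronominal: bool       # subject to principle B
    governed: bool         # does the element have a governor?
    bound_in_domain: bool  # bound inside its domain (irrelevant if no domain)


def has_binding_domain(n: Nominal) -> bool:
    # One GB formulation: the domain is the smallest clause in which
    # the element is governed, so no governor means no domain.
    return n.governed


def satisfies_binding_theory(n: Nominal) -> bool:
    if not has_binding_domain(n):
        # Principles A and B are satisfied vacuously.
        return True
    principle_a = n.bound_in_domain if n.anaphoric else True
    principle_b = (not n.bound_in_domain) if n.pronominal else True
    return principle_a and principle_b


def pro_licensed_as_subject(finite_infl: bool) -> bool:
    """PRO is [+anaphoric, +pronominal]; try both binding options."""
    governed = finite_infl  # assumption: only finite Infl governs its Spec
    return any(
        satisfies_binding_theory(Nominal(True, True, governed, bound))
        for bound in (True, False)
    )
```

On these assumptions, `pro_licensed_as_subject(False)` holds and `pro_licensed_as_subject(True)` does not, which mirrors the claim that PRO is restricted to the subject position of non-finite clauses. The sketch also makes visible how much depends on the single stipulation that only finite Infl governs.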

At first sight, taking PRO to be a pronominal anaphor also seems to have other welcome consequences as far as its interpretation is concerned. One indeed finds examples of its anaphoric behavior, as illustrated in (15), as well as examples of its pronominal behavior, as illustrated in (16). 13

(15) a. *It was expected PRO to shave himself
     b. *John₁ thinks that it was expected PRO₁ to shave himself
     c. *John₁’s campaign expects PRO₁ to shave himself
     d. John₁ expects PRO₁ to win and Bill₂ does too (‘and Bill expects himself to win,’ not ‘and Bill expects John to win’)
     e. [The unfortunate]₁ expects PRO₁ to get a medal
     f. [Only Churchill]₁ remembers PRO₁ giving the ‘Blood, Sweat, and Tears’ speech

(16) a. It is illegal PRO to park here
     b. John₁ thinks that Mary said that PRO₁ shaving himself is vital
     c. John₁’s friends believe that PRO₁ keeping himself under control is vital if he is to succeed
     d. John₁ thinks that PRO₁ getting his résumé in order is crucial and Bill does too (‘Bill₂ thinks that his₁/₂ getting his résumé in order is crucial’)
     e. [The unfortunate]₁ believes that PRO₁ getting a medal is unlikely
     f. Only Churchill remembers that PRO giving the BST speech was momentous

13 On the properties illustrated in (15) and (16) as well as further data and discussion, see e.g., Fodor (1975), Williams (1980), Lebeaux (1985), and Higginbotham (1992).


In (15), PRO roughly behaves like a reflexive. In configurational terms, it requires an antecedent (cf. [15a]) which must be local (cf. [15b]) and c-command it (cf. [15c]). On the interpretation side, it only supports a sloppy interpretation under VP ellipsis (cf. [15d]), a de se reading in sentences such as (15e) (i.e., it is only felicitous if the unfortunate is conscious of who he is and expects himself to get a medal), and a bound reading when its antecedent is associated with ‘only’ (that is, [15f] can be paraphrased as ‘Only Churchill is such that he remembers himself giving the BST speech’ and not as ‘Only Churchill remembers that Churchill gave the BST speech’). By contrast, in (16) PRO behaves like a pronoun in every respect. Hence, it does not require an antecedent (cf. [16a]) and, where there is an antecedent, the antecedent need not be local (cf. [16b]) or c-command it (cf. [16c]). In addition, (16d) allows both strict and sloppy readings, (16e) permits both de se and non-de se interpretations, and (16f) may be falsified by situations in which people other than Churchill recall the import of the BST speech.

Despite appearances, the data in (15) and (16) actually turn out to be quite problematic for the specification of PRO as a pronominal anaphor within GB. Notice that PRO displays either properties of reflexives or properties of pronouns. But in no case does it display properties of both pronouns and reflexives. Not by coincidence were data such as (15) and (16) handled by two different transformations (END and SEND, respectively) within the standard theory. Thus, it makes more sense to assume that PRO is ambiguous between a reflexive and a pronoun, than to assume it is a pronominal anaphor. However, this ambiguity thesis completely undermines the PRO theorem, as the theorem crucially assumes the existence of an element that is simultaneously a pronoun and an anaphor. In turn, if the PRO theorem falls, we are left with no account of the distribution of PRO.

The requirements of the PRO theorem have one further architectural consequence: in order to explain the distributional properties of PRO in terms of the PRO theorem, some other component of the grammar must be responsible for PRO’s specific interpretation in a given configuration. This accounts for the addition of the control module in the GB framework. The control module recognizes two types of control: obligatory control (OC), illustrated in (15), and non-obligatory control (NOC), illustrated in (16). In the case of OC, the controller is lexically specified as an argument of the embedding control verb and, in the case of non-local control, other (rarely specified but frequently adverted to) principles come into play. Notice that this amounts to saying that, like in the earlier treatment in terms of END and SEND, OC and NOC are rather distinct types of relations. Importantly, it is tacitly assumed that the control module somehow obliterates the pronominal specification of PRO in OC constructions, and, conversely, its anaphoric specification in NOC constructions. The problem is not so much that the details of how this would be achieved are never spelled out, but that this tacit assumption casts suspicion over the initial specification of PRO as a pronominal anaphor. Why should UG provide PRO with such a specification only to see it blotted out later? After all, this does not happen with standard pronouns and anaphors: they must live with the pronominal or anaphoric specifications stated in their birth certificate.

One could reply that we should learn to live with the PRO module given the nice results we obtain with the PRO theorem concerning the distribution of PRO. However, this apparent success does not survive closer scrutiny either. The first thing to be noted is that the PRO-theorem account of the distributional properties of PRO is intrinsically associated with a specific formulation of binding domains, one in which government is essentially the one and only requirement to be satisfied. Recall that all that matters for PRO to vacuously satisfy both principles A and B is that it does not have a binding domain. To lack a governor is certainly one way for PRO to be deprived of a binding domain. But, if the correct definition of binding domain ends up including other requirements, there may be other ways for PRO to lack a domain. Take, for instance, the definition in (17) (see Chomsky 1981).

(17) α is a binding domain for β iff α is the minimal NP or S containing β, in which β is governed and has a subject accessible to β.

This is not the place to review the various reasons for including the notion of accessible subject in the definition of binding domain within GB. 14 The relevant point for our discussion is that, once accessible subjects become part of the definition of binding domains, PRO may also lack a domain if it does not have an accessible subject. This in turn undermines the account of the distribution of PRO in terms of the PRO theorem, as government is no longer the only player on the field. 15 To put it broadly, if binding domains are to be formulated along the lines of (17), the account of the distribution of PRO exclusively in terms of government involves an independent axiom, rather than a theorem.

14 See Lasnik and Uriagereka (1988) for a good discussion of the notion of accessible subject and the motivations for its inclusion in the definition of binding domain.

But the problem is actually worse than these remarks suggest. Recall that a crucial assumption in the PRO-theorem account is that a finite Infl is a governor, but a non-finite Infl is not. From an empirical point of view, this assumption is challenged by languages like Brazilian Portuguese, which allow obligatory control into indicative clauses, as we will see in detail in sections 2.5.2.2 and 4.4 below. In order to make room for finite control in such languages, the PRO-theorem account would be forced to assume that their finite Infls are optional governors. However, it is not at all obvious how this assumption can be formally encoded in the system. Given that government is a structural relation, being a governor cannot be listed as a lexical property for the reasons discussed above. Such a lexical specification would be comparable to saying, for instance, that a given lexical item is lexically specified as being unable to c-command. 16 This just does not make sense. What is required is a structural reason for preventing a non-finite Infl from governing its Spec. But, regardless of the definition of government one assumes, if a finite Infl can govern its Spec, so should a non-finite Infl, as the two structural configurations are identical. Again, to assume the opposite would be parallel to saying that, although the configurational Spec-head relation is exactly the same in both cases, a finite Infl head m-commands its Spec, but a non-finite Infl does not. In short, when details are considered, the distributional properties of PRO do not follow theorematically and it is not even obvious how to convert the PRO theorem into an axiom, as it is unnatural to encode structural properties as lexical features or formulate different notions of government for different lexical items.

16 See Hornstein, Nunes, and Grohmann (2005) for a discussion of this point.

Even the apparent benefit of the account of PRO’s lack of case has an undesirable consequence. An A-chain in GB must be headed by a case-marked position unless it is headed by PRO. 17 This statement is transparently troublesome. If A-chains are independently subject to a case-licensing requirement (say, Aoun’s [1979] visibility condition, which requires that θ-roles be associated with case in order to be visible at LF), why should A-chains headed by PRO be exempted from such a requirement? Notice in particular that, given that chains headed by pro and null operators also required case licensing, PRO’s lack of phonetic content could not be the reason for this exception.

To sum up: despite its laudable ambitions and its improvement over the standard-theory approach to control, the GB approach has significant empirical and theoretical problems. On the plus side, treating PRO as a grammatical formative circumvents the previous problem related to control involving quantified expressions, and accounts for why PRO is phonetically null and why (at least in the case of OC) it needs a grammatical antecedent. On the down side, the account of the distribution of PRO turns out on closer consideration to be less a theorem than an axiomatic stipulation. Moreover, the assumption that PRO is a pronominal anaphor leads to empirical problems as the system cannot predict when PRO behaves like an anaphor and when it behaves like a pronoun. A separate control module must then be added to the theory to specify the interpretive properties of PRO. Moreover, the construction-specific flavor of this new addition to the model is at odds with the general goal of the principles-and-parameters theory of deducing properties of rules and constructions from the interaction of more basic features. It is no wonder that the control module always felt like an appendix to the model and never occupied a bright spot among GB’s theoretical achievements. The GB take on control was therefore ripe for a minimalist reanalysis.

2.5 Non-movement approaches to control within minimalism

2.5.1 The null-case approach

Given that the addition of the construction-specific control module in GB was prompted by the problematic assumption that PRO is a pronominal anaphor, one would in principle expect that the abandonment of this assumption should also lead to the abandonment of the control module. However, history and logic are known to frequently go their separate ways. The first minimalist reanalysis of control, outlined in Chomsky and Lasnik (1993), gave up on the account of PRO in terms of its alleged pronominal-anaphoric nature, but basically left intact the assumption that the interpretation of PRO required a special module in the system. Let us consider how the distributional properties of PRO are handled on this account. Take the contrast between (18a) and (18b), for instance.

(18) a. John hoped [PRO₁ to be elected t₁]
     b. *John hoped [PRO₁ to appear to t₁ [that Bill was innocent]]

From the perspective of GB, PRO cannot occupy a governed position as it would then meet the requirements for binding theory to apply and would end up violating principle A or principle B. Hence, PRO cannot remain in the object position of the embedded verb in (18a) or the preposition ‘to’ in (18b). However, once it moves to the subject position of the infinitival clause, which, by assumption, is ungoverned, it should circumvent the binding violation in both (18a) and (18b). The ungrammaticality of (18b) is therefore unaccounted for within GB. Notice that the contrast in (18) mimics the contrast in the ECM constructions in (19), which can be straightforwardly captured if one assumes that a given expression cannot move from a case-marked position to another case-marked position.

(19) a. We expected [John₁ to be hired t₁]
     b. *We never expected [John₁ to appear to t₁ [that the job was easy]]

Based on the parallelism between pairs like (18) and (19), Chomsky and Lasnik (1993) propose a case-based account of the distributional properties of PRO under which A-chains headed by PRO are not exceptional as far as case licensing is concerned, as they were in GB (see section 2.4). 18 The gist of their proposal is that PRO must be licensed by a special kind of case, dubbed null case, which is checked by some non-finite Infl heads. Under the assumption that the infinitival ‘to’ in (18) checks null case, movement of PRO is licit in (18a) as it proceeds from a caseless (passives generally do not check case) to a case-checking position, but not in (18b), where it proceeds from a case-checking to another case-checking position. This proposal also extends to the standard cases regarding the distribution of PRO. Thus, under this view, PRO cannot appear in the subject position of finite clauses or in the object position of a transitive verb, as respectively illustrated in (20), because these are not positions in which null case is checked.

(20) a. *John hoped (that) PRO could eat a bagel
     b. *Bill saw PRO
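The two conditions at work in the null-case account just described can be sketched as follows. This is a minimal illustration under stated assumptions, not Chomsky and Lasnik’s own formalization: the function names are hypothetical, each position in a chain is represented only by the case it checks (`None` for a caseless position), and the labels "nominative", "accusative", and "oblique" are placeholders for whatever non-null case a position checks.

```python
from typing import List, Optional


def movement_is_licit(source_case: Optional[str], target_case: Optional[str]) -> bool:
    """No movement from one case-checking position to another."""
    return source_case is None or target_case is None


def pro_chain_is_licensed(positions: List[Optional[str]]) -> bool:
    """Check a chain headed by PRO.

    `positions` lists the case checked at each position, from the base
    position to the final landing site. The chain is licensed if every
    movement step is licit and the final position checks null case.
    """
    steps_ok = all(
        movement_is_licit(src, tgt)
        for src, tgt in zip(positions, positions[1:])
    )
    return steps_ok and positions[-1] == "null"


# (18a): passive object (caseless) -> Spec of control 'to' (null case)
example_18a = [None, "null"]
# (18b): object of the preposition 'to' (case-checking) -> Spec of control 'to'
example_18b = ["oblique", "null"]
# (20a): subject of a finite clause; (20b): object of a transitive verb
example_20a = ["nominative"]
example_20b = ["accusative"]
```

Under these assumptions only `example_18a` passes `pro_chain_is_licensed`; the other three fail, either because a movement step links two case-checking positions or because the final position does not check null case.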

Notice that some extra assumption must be made in order to capture the standard contrast between (21) and (22) below, for instance. In other words, the null-case approach must somehow ensure that the infinitival ‘to’ of control constructions can license PRO, but not the infinitival ‘to’ of ECM or raising constructions. The obvious question is how to independently distinguish the ‘to’ that can check null case in (21) from its siblings in (22), which cannot. One thing is certain. One cannot simply say that these heads are lexically ambiguous in terms of their specification for case checking; otherwise, structures corresponding to (22) should be grammatical with the case-checking version of ‘to’.

(21) John hopes [PRO to graduate soon]

(22) a. *I believe [PRO to be nice]
     b. *It seems [PRO to be nice]

18 The idea of accounting for the distribution of PRO in terms of case finds its origins in Bouchard’s (1984) proposal that PRO cannot appear in a case-marked position.


Martin (1996, 2001) is the most fully worked out version of the null-case approach to the distribution of PRO, which attempts to couch the distinction between (21) and (22) on more solid grounds. Building on Stowell’s (1982) proposal that control infinitives are tensed whereas ECM and raising infinitivals are tenseless, Martin proposes that only tensed infinitivals check null case. 19 Tying null case to tense has the virtue of rendering it more natural and less stipulative. Under this perspective, null case would be very similar to nominative case, as both would be checked by a tensed Infl, differing only in terms of their morphological realization. Unfortunately, the proposed independent diagnostics for distinguishing tensed from tenseless infinitivals fail to yield the expected divide between control predicates, on the one hand, and ECM and raising predicates, on the other, as convincingly shown by Wurmbrand (2005). 20 Take the contrasts between the infinitival complements of the control verb ‘decide’ and the ECM verb ‘believe’ in (23)–(26) (from Wurmbrand 2005), for example.

(23) a. At 6, Leo decided to sing in the shower right then
     b. *At 6, Leo believed Bill to sing in the shower right then

(24) a. Leo decided yesterday to leave tomorrow
     b. John believes/believed Mary to be pregnant

(25) a. Leo decided [[to leave] [which was/is true]]
     b. Leo believes [[John to be smart] [which is true]]

(26) a. Leo doesn’t want John to sing in the shower, but he decided to, anyway
     b. *Leo believes John to be honest and she believes Frank to, as well

The contrasts above are supposed to show that the control infinitival is tensed as it is compatible with eventive predicates (cf. [23a]), triggers a future reading (cf. [24a]), requires an irrealis interpretation (that is, the truth of the complement is left unspecified at the time of the utterance; cf. [25a]), and licenses VP ellipsis (cf. [26a]). Conversely, the ECM/raising infinitival clauses are taken to be tenseless as they are incompatible with eventive predicates (cf. [23b]), require a simultaneous interpretation with respect to the embedding clause (cf. [24b]), 21 allow a realis interpretation (cf. [25b]), and do not license VP ellipsis (cf. [26b]).

19 See also Bošković (1997) for relevant discussion.

20 For further discussion and arguments against null case and its ties to tense, see also Landau (2000), Pires (2001, 2006), Baltin and Barrett (2002), and Hornstein (2003).


The above paradigm does indeed distinguish ‘decide,’ a control verb, from ‘believe,’ an ECM verb. The problem, as Wurmbrand (2005) shows, is that the criteria do not generalize to other control and ECM/raising cases. For instance, the control verb ‘claim’ does not license eventive predicates (cf. [27a] below) and allows a realis interpretation for its complement (cf. [27b]), whereas the control verb ‘manage’ does not trigger a future reading (cf. [27c]). In turn, the infinitival complement of the ECM verb ‘expect’ is compatible with an eventive predicate (cf. [28a]), does not permit a realis interpretation (cf. [28b]), and allows a non-simultaneous interpretation (cf. [28c]).

(27) a. *At 6, Leo claimed to sing in the shower right then
     b. Leo claimed [[to be a king], which was true]
     c. *John managed to bring his toys tomorrow

(28) a. The bridge is expected to collapse tomorrow
     b. The train is expected [[to arrive late tomorrow] [which is true]]
     c. The printer is expected to work again tomorrow

Wurmbrand (2005) also reviews the VP ellipsis data and observes that the data that purport to demonstrate a distinction between control (where it is allowed) and raising (where it is prohibited) are subject to substantial speaker variation (when the contrast exists at all). Besides, the clear acceptability of raising examples such as the ones in (29) indicates that the licensing of VP ellipsis fails to cleanly distinguish control from raising.

(29) a. The tower started to fall down and the church began to as well
     b. John expects the printer to break down whereas Peter expects the copier to
     c. They say that Mary doesn’t know French but she seems to

The above arguments, which decouple tense properties from control infinitivals, are seconded by the observation that PRO may also occur in gerundive subject positions, despite the fact that gerunds are generally analyzed as not tensed (see Stowell 1982; Pires 2001, 2006). This is illustrated in (30) below, where the gerund licenses PRO but not the temporal adverb.

(30) a. John hated [PRO eating turnips (*tomorrow)]
     b. John preferred [PRO eating turnips (*tomorrow)]

The overall conclusion one reaches is that, whatever tense properties non-finite clauses have, they do not seem to be useful for distinguishing raising from control configurations. There are surely differences between raising and control complements, but this varies across verbs and there is no apparent systematic way to distinguish the two classes using the “tense” diagnostics mentioned above. Thus, although conceptually appealing, the attempt to analyze null case as similar to nominative by associating it to a form of tense ends up failing. This is really bad news. Once the distribution of PRO cannot be reduced to a [±tense] feature of T, null case finds no independent motivation within the system and follows from nothing but the attested distribution of PRO. And the picture is not very glamorous. In order to work, the null-case approach requires three stipulations: (i) PRO has no phonetic content; (ii) null case must be assigned to PRO; and (iii) only PRO can bear null case. These three stipulations track but do not explain the facts under discussion. In other words, despite its explanatory aspirations, it seems fair to say that the null-case approach amounts to stipulating that PRO appears where it does and that it has the phonetic properties it has.

What of PRO’s interpretive properties? Here there is some good news. With the PRO theorem abandoned, PRO can be treated as ambiguous, a null reflexive in some contexts (OC cases) and a null pronoun in others (NOC cases). It is then possible to reduce the interpretive properties of PRO to the interpretive properties of pronouns and reflexives. For example, that OC PRO requires a local, c-commanding antecedent follows from its being subject to principle A of the binding theory (or whatever substitutes for principle A). The fact that NOC PRO does not need an antecedent follows from its being pronominal. Given such a reduction, what remains to be determined is why OC and NOC PROs distribute as they do, i.e., why reflexive PRO appears in OC contexts and pronominal PRO in NOC contexts. One can, of course, stipulate that certain predicates select for OC and so for reflexive-like PROs, while others do not. However, it is not clear how this is to be implemented grammatically (see Chapters 6 and 7 below).
First, it is not clear how selection of embedded subjects by matrix verbs (so-called control predicates) is to be stated. If selection is a head-to-head relation, then OC is not an obvious case of selection. Second, adjunct control seems to pattern like OC and, on the standard assumption that predicates can select complements but not adjuncts, then adjunct control is expected to be NOC, contrary to fact. These are issues that we revisit in later chapters. What is worth noting here is that simply reducing OC to something like principle A and NOC to something like principle B does not by itself suffice to account for the interpretive properties of OC and NOC configurations.

2.5.2 The Agree approach

Let us now consider Landau’s (1999, 2000, 2004) alternative approach to control. Like the null-case approach reviewed in the previous section, Landau takes the existence of PRO for granted but, unlike proponents of the null-case approach, he takes PRO to bear regular case like any other DP. In addition to this take on case, three other aspects of Landau’s approach stand out: (i) the special attention given to “partial-control” constructions; (ii) the dependence of obligatory control on the postulation of certain features and feature specifications; (iii) the interpretation of PRO mediated by (a version of) Chomsky’s (2000, 2001) Agree operation. Let us examine each of these major aspects of Landau’s system, leaving the discussion of whether or not PRO bears regular case to section 5.4.2 below. 22

2.5.2.1 The relevance of partial control

Partial control refers to control constructions where an embedded predicate must take a (semantically) plural subject, but the antecedent of the controllee is (semantically) singular, as illustrated in (31).

(31) The chair hoped [PRO to gather/meet at 6/to apply together for the grant]

In (31), the matrix subject is understood as a member of the set of people denoted by the embedded subject. Assuming that this interpretive fact shows that controller and controllee are not identical, Landau takes partial-control constructions to be a strong argument for a PRO-based account of control. According to him, the mismatch in interpretation between PRO and its antecedent results from PRO being independently specified for the semantic feature mereology, which characterizes group names (for instance, ‘committee’ is [+Mer], while ‘chair’ is [−Mer]), as illustrated in (32).

(32) The chair [−Mer] hoped [PRO [+Mer] to gather/meet at 6/to apply together for the grant]

It is a great merit of Landau’s work to have shown that partial control is indeed an instance of obligatory control (the controllee requires a local c-commanding antecedent, triggers sloppy readings under ellipsis, and enforces de se readings, for instance), and to have provided a very detailed description of the types of predicates that allow partial control. Landau argues that a tensed infinitive such as the complement of “desiderative” verbs licenses it, but an untensed infinitive such as the complement of “implicative” verbs does not, as illustrated by the contrast between (31) and (33) (see section 2.5.2.2 below for details).

22 Here we will primarily focus on Landau (2004)’s analysis of obligatory control, which he takes to replace his older treatment (Landau 1999, 2000). For discussion of the limitations of his previous treatment, see Hornstein (2003) and Landau (2007) for a rejoinder.


(33) *The chair managed [PRO to gather/meet at 6/to apply together for the grant]

It is fair to say that, after Landau’s work, partial control came to be part of the empirical basis that any approach to obligatory control must take into consideration. However, the amount of ad hoc machinery required to account for partial control in Landau’s system, as we will see below in section 2.5.2.3, ends up undermining the initial appeal that a PRO-based theory appears to have. And there are empirical problems, as well. As observed by Hornstein (2003), it is not the case that any predicate that selects a plural subject licenses partial control, as shown in (34).

(34) a. They sang alike/were mutually supporting
     b. *John hoped/wants [PRO to sing alike/to be mutually supporting]

Notice that the matrix predicate of (34b) is of the type that licenses partial control (cf. [31]). So (34b) shows that partial control must in part be determined by properties of the embedded predicate. In fact, Hornstein suggests that what seems to distinguish the predicates that support partial control from the ones that do not is that the former can select a commitative PP, as shown in (35).

(35) a. The chair met/gathered/applied together for the grant with Bill
     b. *The chair sang alike/is mutually supporting with Bill

The data in (36)–(37) further show that being compatible with a commitative PP is not sufficient for partial control to be licensed: the commitative must be selected.

(36) a. The chair met/gathered/applied together for the grant *(with Bill)
     b. The chair left/went out (with Bill)
     c. The committee left/went out

(37) The chair preferred [PRO to leave/go out at 6] (exhaustive control: OK; partial control: *)

Example (36b) shows that, as opposed to what happens with ‘meet/gather/apply together’ in (36a), the commitative associated with ‘leave/go out’ is not selected. In turn, (36c) shows that a [+Mer] noun can be the subject of ‘leave/go out.’ Now, given that in Landau’s system PRO can always be intrinsically specified as [+Mer], one would expect that a sentence such as (37), whose matrix predicate is of the type that licenses partial control, should allow a partial-control reading with a [+Mer] PRO. But this does not happen. Example (37) only has an exhaustive control reading.


The fact that the availability of partial control is contingent on there being a predicate that selects a commitative complement suggests that, rather than involving a plural subject, partial control may in fact involve the licensing of a null commitative argument in a standard (“exhaustive”) obligatory-control construction. That is, a sentence such as (38a) should actually be represented as in (38b) (still keeping PRO for purposes of discussion), where pro is a null commitative argument.

(38) a. The chair preferred to meet at 6

b. [The chair]_i hoped [PRO_i to[+tense] meet pro_k at 3]

Here is not the place for us to pursue the suggestion encapsulated in (38b) (see section 5.6.1 below for discussion). What is relevant for our current purposes is to point out that, in Landau’s system, the availability of partial control should be quite free once the tense requirements on the infinitive are satisfied. It is indeed quite mysterious in his system why partial control should depend on the potential licensing of commitative arguments within the infinitival clause. And if partial control turns out to be more related to the licensing of null commitative arguments, whatever accounts for exhaustive control should also cover partial control. In other words, if something along the lines of (38b) is on the right track, partial control does not intrinsically favor a PRO-based approach and we are back to the original question of what the best account of the null embedded subject of (38a) is (see section 5.6.1 below for a suggestion of how partial control can be analyzed under the MTC).
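The descriptive generalization emerging from (34)–(37) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the two dictionaries are a hypothetical mini-lexicon built from the predicates cited in the examples (with ‘manage’ treated as an exhaustive-control predicate, as in (47) below), and the function name is ours, not part of any proposal in the literature.

```python
# Hypothetical mini-lexicon; the entries only mirror the judgments reported
# in (34)-(37). Nothing here is an independent claim about English.

# Does the matrix predicate belong to the class that licenses partial control?
MATRIX_LICENSES_PC = {
    "hope": True,
    "want": True,
    "prefer": True,
    "manage": False,  # assumption: exhaustive control only, cf. (47)
}

# Does the embedded predicate SELECT a commitative PP (not merely tolerate one)?
SELECTS_COMMITATIVE = {
    "meet": True,
    "gather": True,
    "apply together": True,
    "leave": False,       # commitative possible but optional, cf. (36b)
    "go out": False,      # same as 'leave'
    "sing alike": False,  # commitative impossible, cf. (35b)
    "be mutually supporting": False,
}


def partial_control_available(matrix: str, embedded: str) -> bool:
    """Partial control needs both a licensing matrix predicate and an
    embedded predicate that selects a commitative argument."""
    return MATRIX_LICENSES_PC[matrix] and SELECTS_COMMITATIVE[embedded]


pc_meet = partial_control_available("prefer", "meet")        # cf. (38a)
pc_leave = partial_control_available("prefer", "leave")      # cf. (37)
pc_sing = partial_control_available("hope", "sing alike")    # cf. (34b)
```

On this sketch only the first combination comes out as allowing partial control, which is the pattern that is unexpected if a [+Mer] PRO is freely available.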

2.5.2.2 [Tense] and [Agr] features and finite control

The second major aspect of Landau’s system is the specific typology of control configurations it establishes, which involves both non-finite and finite clauses. Following a venerable tradition, Landau assumes that the local environment of the embedded subject must provide all the necessary information to determine whether it must, can, or cannot be PRO. In particular, Landau takes the relevant local licensing features to be (semantic) [T(ense)] and (morphological) [Agr(eement)]. Where Landau departs from previous accounts is in the way these features conspire to determine the nature of control, as shown in (39) (from Landau 2004: 840). 23

23 EC and PC in (39) stand for exhaustive and partial control respectively. C(ontrol)-subjunctives and F(ree)-subjunctives are distinct in that only the former necessarily require an obligatory-control interpretation of their subjects. For purposes of exposition, below we use I for the tense head T in order to distinguish it from the tense feature [T].


(39)
                                   I⁰             C⁰
Obligatory control
  EC-infinitive                    [−T, −Agr]     [−T]
  Balkan C-subjunctive             [−T, +Agr]     [−T]
  Hebrew 3rd-person subjunctive    [+T, +Agr]     [+T, +Agr]
  PC-infinitive                    [+T, −Agr]     [+T, (+Agr)]
No control
  Balkan F-subjunctive             [+T, +Agr]     [+T, +Agr]
  indicative                       [+T, +Agr]     Ø

Consider the infinitives in (39), for instance. As mentioned in section 2.5.2.1, Landau has argued that the essential difference between an infinitival that allows partial control and one that disallows it is its tense properties: an infinitival I allows both exhaustive and partial control if specified as [+T], but only exhaustive control if specified as [−T]. This difference is meant to capture the fact that the infinitival clauses that allow partial control can be temporally independent from the matrix clause, as illustrated in (40) below. Given that the tense properties of I are predicted by the selecting predicate and that selection is a local relation, the [T] features of I are accordingly replicated on C in (39). Thus, a verb like ‘hope,’ for instance, selects a CP headed by C[+T], which in turn selects an IP headed by I[+T].

(40) a. Yesterday John hoped to travel tomorrow

b. *Yesterday John managed to travel tomorrow

As Landau observes, the basic intuition underlying the typology in (39) is that obligatory-control configurations do not form a natural class; they are in fact the complement subset of the natural class of non-controlled environments. Putting aside the case of Hebrew third-person subjunctives for the moment, the generalization is that if I is positively specified for both [T] and [Agr], it does not trigger obligatory control. On the other hand, a single negative specification for [T] or [Agr] ([+T, −Agr] or [−T, +Agr]) or a negative specification on both ([−T, −Agr]) will necessarily lead to obligatory control. In sum, obligatory control is the elsewhere case.

Given this feature distribution, it follows that indicative complements should not display obligatory control. As Landau (2004: 849–850) puts it, “the only generalization in this domain that appears to be universal is the incompatibility of indicative clauses with OC. Anything else is possible, under certain circumstances.” However, this generalization is falsified by “referential” (i.e., non-expletive, non-arbitrary) null subjects in (colloquial) Brazilian Portuguese. As extensively argued by Ferreira (2000, 2004, 2009) and Rodrigues (2002, 2004), null subjects in Brazilian Portuguese show all the diagnostics of obligatory control. Take the Brazilian Portuguese sentences in (41)–(45), for instance. 24

(41) *Comprou um carro novo
Bought a car new
‘She/he bought a new car’

(42) [[o João] disse que [o pai d[o Pedro]] acha que vai ser promovido]
The João said that the father of-the Pedro thinks that goes be promoted
‘João_i said that [Pedro_j’s father]_k thinks that he_k/*i/*j/*l is going to be promoted’

(43) Só o João acha que vai ganhar a corrida
Only the João thinks that goes win the race
‘Only João is an x such that x thinks that x will win the race’
NOT: ‘Only João is an x such that x thinks that he, João, will win the race’

(44) O João está achando que vai ganhar a corrida e o Pedro também está
The João is thinking that goes win the race and the Pedro too is
‘João thinks that he’s going to win the race and Pedro does, too (think that he, Pedro, is going to win the race)’

(45) O infeliz acha que devia receber uma medalha
The unfortunate thinks that should receive a medal
‘The unfortunate thinks that he himself should receive a medal’

Example (41) shows that null subjects in Brazilian Portuguese require an antecedent 25 and (42), that the antecedent must be the closest c-commanding DP. As for interpretation matters, a null subject in Brazilian Portuguese is interpreted as a bound variable when its antecedent is an only-DP (cf. [43]); it obligatorily triggers sloppy reading under ellipsis (cf. [44]); and it only admits a de se reading in sentences such as (45). Importantly, in all the sentences of (41)–(45), the null subject displays the diagnostics of obligatory control despite the fact that it is within a standard indicative clause. The existence of finite control into indicative complements in Brazilian Portuguese therefore presents prima facie problems for the typology proposed by Landau. 26 Below we discuss the implications of this empirical fact within Landau’s Agree-based approach.

24 See Ferreira (2000, 2004, 2009) and Rodrigues (2002, 2004) for additional tests.

25 Referential null subjects in matrix clauses in Brazilian Portuguese can only be licensed as instances of topic-drop (see Ferreira 2000, Modesto 2000, and Rodrigues 2004 for relevant discussion).
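The clash between the typology in (39) and the Brazilian Portuguese facts can be stated mechanically. The following is a minimal sketch, assuming that the generalization behind (39) is exactly the “elsewhere” statement given above (Hebrew third-person subjunctives set aside) and that a Brazilian Portuguese indicative I counts as [+T, +Agr] in Landau’s terms; the function and variable names are illustrative only.

```python
def landau_predicts_oc(tense: bool, agr: bool) -> bool:
    """Elsewhere generalization behind (39): obligatory control unless the
    embedded I is positively specified for both [T] and [Agr].
    Hebrew 3rd-person subjunctives are deliberately left out of this sketch."""
    return not (tense and agr)


# Embedded I specifications from (39), as (T, Agr) pairs; True stands for '+'.
ROWS_39 = {
    "EC-infinitive": (False, False),
    "Balkan C-subjunctive": (False, True),
    "PC-infinitive": (True, False),
    "Balkan F-subjunctive": (True, True),
    "indicative": (True, True),
}

predictions = {name: landau_predicts_oc(*spec) for name, spec in ROWS_39.items()}

# Assumption: a Brazilian Portuguese indicative is [+T, +Agr] on I.
bp_predicted = landau_predicts_oc(True, True)
bp_observed = True  # obligatory-control diagnostics reported for (41)-(45)
undergenerates = bp_observed and not bp_predicted
```

Here `undergenerates` comes out true: the feature system leaves no room for the obligatory-control behavior that the diagnostics in (41)–(45) point to.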

2.5.2.3 Determining the interpretation of obligatorily controlled PRO via Agree

In addition to the features [T] and [Agr] to be hosted by C and I, Landau (2004: 841) also proposes that DPs must be featurally specified as to whether or not they support independent reference ([R]): lexical DPs and pro are specified as [+R] and PRO as [−R]. According to Landau (p. 841), “[b]oth values on [R] are interpretable, when occurring on nominal phrases.” However, the [−R] feature makes PRO a potential goal for agreement, for “this feature acts as an instruction to coindex the φ-features of PRO with those of an antecedent; Agree is a way of achieving that” (p. 843). The feature [R] is also assigned to some functional categories, according to the rule in (46).

(46) R-assignment rule (Landau 2004: 842)
For X⁰[αT, βAgr] ∈ {I⁰, C⁰, ...}:
Ø → [+R]/X⁰[__], if α = β = +
Ø → [−R]/elsewhere
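The rule in (46) is simple enough to be written out as a function, which makes it easy to check against the derivations that follow. This is a minimal sketch under our own encoding assumptions: `True`/`False` stand for ‘+’/‘−’, `None` stands for a feature that is absent from the head, and the function name is hypothetical.

```python
from typing import Optional


def assign_r(tense: Optional[bool], agr: Optional[bool]) -> Optional[str]:
    """Sketch of the R-assignment rule in (46).

    The rule is stated over heads carrying both [T] and [Agr]; a head
    lacking either feature (None) receives no [R] value at all."""
    if tense is None or agr is None:
        return None
    # [+R] only if both features are positively specified; [-R] elsewhere.
    return "+R" if (tense and agr) else "-R"


ec_infinitive_I = assign_r(False, False)  # I[-T, -Agr]: exhaustive control
pc_infinitive_I = assign_r(True, False)   # I[+T, -Agr]: partial control
pc_infinitive_C = assign_r(True, True)    # C[+T, +Agr]
ec_infinitive_C = assign_r(False, None)   # C[-T] with no [Agr]
indicative_C = assign_r(None, None)       # featureless indicative C
```

The values obtained this way are the [R] specifications that appear on I and C in the derivations discussed in the remainder of this section.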

Given these assumptions, let us consider the derivation of an exhaustive control construction such as (47) in Landau’s system, which is given in (48).

(47) John managed to fix the car

(48) [DP I₂ [ t_DP [CP C[−T] [IP PRO[−R] I₁[−T, −Agr, −R] [ t_PRO ]]]]]
Agree relations: I₁–PRO, C–I₁, I₂–DP, I₂–PRO

Agreement between I₁ and PRO in (48) deletes I₁’s [−Agr] and [−R] features. Agreement between C and I₁ then deletes C’s [−T] feature. Finally, after agreeing with the matrix subject, I₂ agrees with PRO, coindexing their features and licensing PRO’s [−R] feature. 27

26 See Rodrigues (2004) for arguments that Finnish may also allow obligatory control into finite indicative clauses.

27 The only relevant difference between (48) and a typical obligatory-control subjunctive in Greek such as (i) in Landau’s system is that in the latter, I₁ has overt agreement morphology ([+Agr]), rather than “abstract” agreement ([−Agr]), as represented in (ii).

(i) Greek (Terzi 1997):
I Maria_1 prospathise Pro_1/*2 na divasi
the Maria tried.3SG PRT read.3SG
‘Maria tried to read’


In turn, a partial-control construction like (49) is to be derived along the lines of (50).

(49) The chair hoped to meet at 6

(50) [DP I₂[+T, +Agr, +R] [ t_DP [CP C[+T, +Agr, +R] [IP PRO[−R] I₁[+T, −Agr, −R] [ t_PRO ]]]]]
Agree relations: I₁–PRO, C–I₁, I₂–C

As before, agreement between I₁ and PRO deletes I₁’s [−Agr] and [−R] features. Agreement between C and I₁ now deletes C’s [+T] and [+Agr] features, but not its [+R] feature, for it mismatches the [−R] feature of I₁. C then checks its [+R] feature with I₂. Notice that I₂ agrees with C and not with PRO, which raises the question of how PRO can license its [−R] feature. According to Landau (p. 845), this feature gets licensed in virtue of I₂ agreeing with C, which in turn is “coindexed” with PRO via I₁. Furthermore, Landau assumes (p. 849) that if I₂ and PRO do not agree directly, their [Mer] features need not match. If I₂ is specified as [−Mer] and PRO is inherently specified as [+Mer], a partial-control effect will arise.

Let us finally consider the last type of obligatory-control configuration listed in (39): Hebrew third-person subjunctives. 28 Given the feature specification for Hebrew subjunctives in (39), the derivation of a sentence such as (51) proceeds as in (52).

(51) Hebrew (Landau 2004)
Gil_i hivtiax [še- ec_i yitna’heg yafe]
Gil promised that will-behave.3SG.M well
‘Gil promised to behave’

(52) [DP I₂[+R] [ t_DP [CP C[+T, +Agr, +R] [IP PRO[−R] I₁[+T, +Agr, +R] [ t_PRO ]]]]]
Agree relations: I₁–PRO, C–I₁, I₂–C

(ii) [DP I₂ [ t_DP [CP C[−T] [IP PRO[−R] I₁[−T, +Agr, −R] [ t_PRO ]]]]]
(Agree arrows omitted)

28 Landau (2004: 815, 846) attributes the lack of a derivation with an uncontrolled third-person pro in Hebrew subjunctives to the non-existence of referential third-person pro in the language. More specifically, he assumes Shlonsky’s (1997) proposal that third-person pros in Hebrew are null Num heads and, because they are null, they cannot support a third-person feature hosted by a higher D-head.


Agreement between I₁ and PRO in (52) checks the [+Agr] feature of I₁, but not its [+R] feature, which mismatches the [−R] feature of PRO. Agreement between C and I₁ then checks all of the features of C and the [+R] feature of I₁. If the matrix I₂ had a [−R] feature, the remaining unchecked [−R] feature of PRO would be licensed by agreement with I₂. However, in (52) I₂ is specified as [+R]. The [−R] feature of PRO must then be indirectly licensed in virtue of the agreement relations between I₂ and C, between C and I₁, and between I₁ and PRO.

As the reader can easily check, the feature specifications and computations proposed above are such that they track, but do not explain, the distribution and interpretation of PRO. Landau (p. 842) in fact acknowledges that his R-assignment rule is an “honest stipulation,” which played the role of case in previous models. Unfortunately, if the distribution and interpretation of PRO is to rest on a stipulation, calling it honest does not make the analysis less stipulative. In other words, it is subject to the same criticism leveled at the null-case approach: the distribution and interpretation of PRO ends up being stipulated under the guise of lexical features.

It is also worth pointing out that, under the label Agree, Landau’s proposal actually groups different kinds of relations, which do not obviously form a natural class. Thus, in addition to the familiar valuation procedure involving a [−interpretable] and a [+interpretable] feature of Chomsky (2001), the Agree operation assumed by Landau encompasses three other types of relations. First, it admits relations between two [−interpretable] features such as the agreement between C and I₁ with respect to [Agr] features in (50) or the agreement between I₂ and C in (52) with respect to [R] features. According to Landau (p. 849), “[t]he fact that C bears [+Agr] does not stop this feature from entering Agree with [−Agr] of I; recall that [+Agr] on C represents abstract [Agr] to begin with (in most cases), thus [Agr] on both heads is semantically uninterpretable and phonologically null.” That may be so, but the resort to features which are motivated neither in LF nor in PF terms not only is completely at odds with core minimalist assumptions, but also reinforces the impression that these features are only redescribing the facts to be explained. The second type of relation encompassed by Landau’s version of Agree includes “coindexing” relations such as the agreement between I₂ and PRO in (48) to license PRO’s [−R] feature (which was assumed to be a [+interpretable] feature, as mentioned above). Finally, it also includes composite-“coindexing” relations such as the licensing of the [−R] feature of PRO in (50) and (52), which involves the conjunction of three basic agreement relations: between I₂ and C, between C and I₁, and between I₁ and PRO.

Even if we put aside the fact that “coindexing” and feature valuation/deletion seem to be of different nature, it is not at all trivial to explain how the composition of the three agreement relations mentioned should result in “coindexing.” Recall that, in (50) and (52), I₂ and C agree with respect to [R] as do I₁ and PRO, but C and I₁ agree with respect to [Agr]. In virtue of these two agreement relations, PRO agrees with I₁ through some kind of transitivity assumption. However, it is worth asking how transitivity arises given that the agreement relations computed do not target the same type of feature. Note that, if A is taller than B and B is fatter than C, then one can conclude nothing regarding A’s height or weight as regards C. However, for the account above to work, we must assume that this logic is overturned when certain feature sets are involved, which in turn brings the obvious minimalist question: why are these features endowed with their alleged properties? This shows that the proposed transitivity in the account of (51) does not follow as a point of logic, but is rather a stipulated feature in Landau’s system. Thus, the proposed composite-“coindexing” relations should be subject to the same skepticism we accord the Barriers approach to A-movement, which licenses A-traces by resorting to a chain coindexing mechanism combining Spec-head agreement with head-to-head government (see Chomsky 1986a, section 11).

The non-explanatory nature of the proposal is further highlighted when Landau’s account of (51) is examined in light of his take on the impossibility of PRO in indicative clauses. As we saw in (39), the feature specification proposed for indicatives involved the features [+T] and [+Agr] for I and no features for C. The reason for C not to be associated with [T] features is that the tense value of I is completely independent from the matrix clause. Furthermore, since Landau (p. 840) assumes that the presence of [Agr] on C is parasitic on [+T], if indicative C does not have [+T], it cannot have [Agr] either. Finally, if it is not specified for both features, it cannot be associated with an [R] feature, according to the R-assignment rule in (46). That being so, Landau (p. 843) claims that the reason why PRO cannot be licensed in the indicative configuration in (53) below (Landau’s [40b]) is that “Agree fails due to a feature mismatch in the R value between I and PRO. Thus, indicative clauses with independent tense universally do not display OC.”

(53) [DP I₂ [ t_DP [CP C [IP I₁[+T, +Agr, +R] [VP PRO[−R] ]]]]]
*Agree: I₁[+R]–PRO[−R]

This specific claim now introduces an additional aspect of composite-agreement relations: feature mismatch is taken to cause a derivational crash under direct agreement, like the relation between I₁[+R] and PRO[−R] in (53), but not under “composite” agreement, like the relation among I₂[+R]–C[+R]–I₁[−R]–PRO[−R] in (50). Putting aside the fact that no motivation was provided for why these two instantiations of “Agree” should yield opposite results, it is important to point out that this stipulated aspect of composite agreement leads to overgeneration. Notice that the feature mismatch at the derivational step depicted in (53), that is, before PRO moves, cannot be the reason for the derivation to crash. As we saw in the derivation proposed by Landau for Hebrew subjunctive control in (52), mismatch in the values for [R] by itself is not a problem if the features can be licensed later on in the derivation. Consider for instance the structure in (54), which depicts the movement of PRO in (53).

(54) [DP I₂ [ t_DP [CP C [IP PRO[−R] I₁[+T, +Agr, +R] [VP t_PRO ]]]]]
(Agree arrows omitted)

If I₂ in (54) is specified as [−R], it will be able to agree with PRO, but the [+R] feature of I₁ will remain unchecked, causing the derivation to crash. Suppose, by contrast, that I₂ is specified as [+R]. As such, it should be able to agree with I₁, as represented in (55), checking the [+R] feature of the latter.

(55) [DP I₂[+R] [ t_DP [CP C [IP PRO[−R] I₁[+T, +Agr, +R] [VP t_PRO ]]]]]
Agree relations: I₂–I₁, I₁–PRO

What about the [−R] feature of PRO? Recall from (50) and (52) that PRO can be indirectly licensed by a chain of “agreement” relations. In (52), for example, its [−R] feature is taken to be licensed in virtue of PRO’s having agreed with I₁, which had agreed with C, which in turn had agreed with I₂. That being so, there should be no reason for PRO not to get licensed in (55) via a composite-“agreement” relation. That is, its [−R] feature should be licensed once PRO has agreed with I₁, which agrees with I₂. Crucially, composite agreement is assumed to be oblivious to feature mismatch. In other words, once the composite-agreement relations proposed by Landau are assumed, finite control into indicatives becomes freely available.

In fairness, Landau (2004: 846–847) seems to assume that the [+R] feature of I₁ cannot be checked by a probe higher than C: “We still account for the fact that indicative complements in Hebrew do not display OC. In a configuration like (40b) [= (53) above], as opposed to (43b) [= (52) above], the [+R] feature of I remains unchecked as no corresponding feature exists on the indicative C.” Notice, however, that C does not prevent a higher probe from agreeing with the embedded subject in (48), for instance. Given that the embedded subject and the embedded I are equidistant (see Chomsky 1995), it does not seem plausible to exclude the checking of the [+R] feature of I₁ in (55) based on the intervention of C. Notice also that C in (55) has no features that could block an agreement relation between a higher probe and I₁.

A phase-based approach to this conundrum is of no help either. Landau (2004, footnote 26) claims that “[a]lthough not at the edge of its phase, PRO is visible to Agree from the outside since its (φ- and R-) features are interpretable (hence, never erased).” Note, however, that in Landau’s system the φ- and R-features of PRO are only valued after agreement. Thus, it is plausible to assume that spell-out/transfer must be halted until PRO has its features valued and, if this is so, we are back to the technical question of why I₁ in (55) cannot be checked by the matrix probe if spell-out/transfer is on hold. Of course, one may attempt to specify the inner workings of spell-out/transfer in such a way that PRO becomes immune to spell-out/transfer at the relevant derivational step, but not I₁. But that would only add to an already loaded machinery, without actually shedding light on the discussion. Still, such an attempt would require further complications. Recall that, under composite-agreement relations in Landau’s system (cf. [52] for Hebrew and [ii] in footnote 27 for Balkan languages), the higher probe agrees with the embedded C, “which is also coindexed with PRO via I” (Landau 2004: 845). This in turn indicates that the embedded I must still be available to the computation at the derivational step where PRO is to have its R-feature licensed.

To wrap up: if composite relations must be assumed in order to account for Hebrew subjunctive control, control into indicative clauses becomes freely allowed. Although this may be good news for languages such as Brazilian Portuguese, as discussed in section 2.5.2.2, it is certainly unwelcome for most languages.
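The overgeneration argument can be replayed in a few lines. The sketch below is our own reconstruction of the reasoning, not Landau’s formalism: direct Agree is modeled as crashing on an [R] mismatch, whereas composite agreement is modeled as mere connectedness through pairwise Agree links, ignoring feature values. The node labels and function names are hypothetical.

```python
def direct_agree(probe_r, goal_r):
    """Direct Agree: a mismatch in [R] values is taken to crash, as in (53)."""
    return probe_r == goal_r


def linked(links, start, goal):
    """True if `goal` can be reached from `start` through pairwise Agree
    links, whatever the feature values on the heads involved."""
    seen, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for a, b in links:
            if a == node:
                nxt = b
            elif b == node:
                nxt = a
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return goal in seen


# Pairwise Agree links as described for (52) and (55).
HEBREW_52 = [("I2", "C"), ("C", "I1"), ("I1", "PRO")]
INDICATIVE_55 = [("I2", "I1"), ("I1", "PRO")]

crash_53 = not direct_agree("+R", "-R")              # I1[+R] vs. PRO[-R]
licensed_52 = linked(HEBREW_52, "I2", "PRO")         # intended result
licensed_55 = linked(INDICATIVE_55, "I2", "PRO")     # unintended result
```

Under these assumptions `licensed_55` is true just as `licensed_52` is: nothing in composite agreement, so construed, distinguishes the indicative configuration from the Hebrew subjunctive one.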

2.5.2.4 Simplifying Landau’s “calculus of control”

Let us examine what the relevant property of Brazilian Portuguese indicative clauses is that triggers obligatory control. Ferreira (2000, 2004, 2009) proposes that finite Ts in Brazilian Portuguese are ambiguous in being associated with either a complete or an incomplete set of φ-features and that obligatory control is licensed in clauses with a φ-incomplete T. 29 Nunes (2007, 2008a) reinterprets Ferreira’s proposal in terms of the presence or absence of the feature [person] in T. He observes that the verbal-agreement paradigm of finite clauses in Brazilian Portuguese is such that the only inflection that overtly encodes both number and person is the first-person singular inflection. All the other cases involve either number specification with default value for person (third) or default values for both person and number (third singular), as illustrated in (56).

29 Ferreira (2000, 2004, 2009) (as well as Rodrigues 2002, 2004) in fact analyzes null subjects in Brazilian Portuguese in terms of the MTC. Thus, in his system a φ-incomplete T does not value the case feature of the subject of its clause, which can then undergo A-movement to the matrix clause. We will leave a detailed discussion of null subjects in Brazilian Portuguese under the MTC for section 4.4 below.

(56) Verbal-agreement paradigm in (colloquial) Brazilian Portuguese
cantar ‘to sing’: indicative present

Subject                                                       Verb      Features
eu ‘I’                                                        canto     P:1; N:SG
você ‘you (SG)’, ele ‘he’, ela ‘she’, a gente ‘we’            canta     P:default; N:default (= 3SG)
vocês ‘you (PL)’, eles ‘they (MASC)’, elas ‘they (FEM)’       cantam    P:default; N:PL (= 3PL)

Nunes proposes that φ-complete and φ-incomplete finite Ts in Ferreira’s terms correspond to Ts specified with number and person features or a number feature only. In case a T with just a number feature is selected, the corresponding person specification will be added in the morphological component by redundancy rules, as sketched in (57) below. That is, if T has only a number feature and it is valued as singular in the syntactic component, it will later be associated with first person in the morphological component; if the number feature receives any other value in the syntactic component (default or plural), it will later be associated with a default value for person (third) in the morphological component.

(57) cantar ‘to sing’: indicative present

Valuation of T in the       Addition of [person] in the       Surface form
syntactic component         morphological component           of the verb
N:SG                        P:1; N:SG                         canto
N:default                   P:default; N:default              canta
N:PL                        P:default; N:PL                   cantam
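The redundancy rules summarized in (57) amount to a two-step mapping, which can be sketched as follows. The encoding is an assumption made for illustration: number values are the strings "SG", "default" and "PL", and the function and table names are ours.

```python
def add_person(number: str) -> tuple:
    """Morphological redundancy rule for a T that enters syntax with only
    a number feature: singular is paired with first person, any other
    value (default or plural) with default (third) person."""
    person = "1" if number == "SG" else "default"
    return (person, number)


# Surface forms of cantar 'to sing' (indicative present), as in (57).
SURFACE_FORM = {
    ("1", "SG"): "canto",
    ("default", "default"): "canta",
    ("default", "PL"): "cantam",
}


def spell_out(number: str) -> str:
    """Map the syntactic valuation of number-only T to the verb form."""
    return SURFACE_FORM[add_person(number)]


forms = [spell_out(n) for n in ("SG", "default", "PL")]
```

The point of the sketch is that [person] never figures in the input to `add_person`: it is supplied after the syntactic computation, which is what allows such a T to count as φ-incomplete in the syntax.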

If finite control in Brazilian Portuguese is related to the possibility that its finite Ts may be specified only for number in the syntactic component, we now find a commonality with Hebrew subjunctive control. As argued by Landau (2004), subjunctive control in Hebrew is restricted to the third person. This can be interpreted as indicating that the relevant subjunctive T in Hebrew may also be associated with only a number feature, in which case it will surface with default-person morphology, that is, third person. This in turn paves the way for a considerable simplification in Landau’s typology, with the desired effects. In other words, the environments where one finds obligatory control involve deficient T-heads, i.e., heads that are temporally deficient, φ-deficient, or both. Landau’s table in (39) can now be revised as in (58), where ‘+’ stands for fully specified and ‘−’ for deficient (or null).

(58)
Obligatory control
  [T−, φ−]   untensed uninflected infinitives, etc.
  [T+, φ−]   tensed uninflected infinitives, Brazilian Portuguese indicatives, Hebrew 3rd-person subjunctives, etc.
  [T−, φ+]   Balkan untensed subjunctives, etc.
No control
  [T+, φ+]   English indicatives, Balkan tensed subjunctives, etc.

The table in (58) shares with Landau the intuition that obligatory control is typologically more diverse, but drastically simplifies the “calculus of control” in Landau’s terms. Under (58), finding out whether or not a given clause licenses obligatory control does not need to take the features of C into consideration, for I is sufficient: if either [T] or [φ] is deficient, obligatory control is possible. Besides simplifying Landau’s system and accounting for finite control into indicatives in Brazilian Portuguese, (58) also eliminates the suspicious ambiguity of the combination of the specifications C[+T, +Agr] and I[+T, +Agr] in Landau’s table in (39), which were employed to describe both obligatory control in Hebrew subjunctives and no control in Balkan F-subjunctives (see section 4.4 below for further discussion).

As opposed to Landau’s system, what matters in (58) is not the morphological realization of agreement features, but rather how specified the set of agreement features associated with I is. Conceptually, this is also a welcome result. If obligatory control is to be ultimately determined in the syntactic component, why should the PF realization of agreement features matter? From the perspective of (58), the availability of obligatory control is determined by the tense and agreement features that enter in the syntactic component, regardless of their later morphological realization.
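The simplified calculus in (58) can be stated as a single condition on I. The following is a minimal sketch, assuming a Boolean encoding in which `True` means “fully specified” and `False` means “deficient (or null)”; the clause-type labels are taken from (58), while the function and dictionary names are hypothetical.

```python
def oc_possible(t_full: bool, phi_full: bool) -> bool:
    """(58): obligatory control is possible iff [T] or [phi] on I is
    deficient. The features of C are not consulted at all."""
    return not (t_full and phi_full)


# Clause types from (58), as (T fully specified?, phi fully specified?).
CLAUSE_TYPES_58 = {
    "untensed uninflected infinitive": (False, False),
    "tensed uninflected infinitive": (True, False),
    "Brazilian Portuguese indicative": (True, False),
    "Hebrew 3rd-person subjunctive": (True, False),
    "Balkan untensed subjunctive": (False, True),
    "English indicative": (True, True),
    "Balkan tensed subjunctive": (True, True),
}

oc_by_clause = {name: oc_possible(*spec) for name, spec in CLAUSE_TYPES_58.items()}
```

Because the function takes only the specification of I as input, Hebrew third-person subjunctives and Balkan tensed subjunctives can no longer end up with identical inputs and different outcomes, as they did under the C/I combinations of (39).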


The question that now arises is why obligatory control should correlate with deficiency in tense or φ-feature specification, as depicted in (58). We have seen that Landau’s Agree-based approach rested on the admittedly stipulated postulation and distribution of [R]-features, which mimicked the case-based approach to PRO in previous models. Given that the distribution of PRO is handled in such a stipulative manner, it would not be surprising if Landau’s R-assignment rule in (46) could be reformulated in such a way that it became compatible with the generalizations embodied in (58), something that we will not pursue here. However, notice that tense or φ-feature deficiency generally characterizes “porous” domains out of which movement can take place (see, e.g., Boeckx 2003, 2005). Thus, from the perspective of the MTC, the picture embodied in (58) is exactly what one would expect: we can simply replace control by A-movement, as in (59).

(59)
A-movement: OK
  [T−, φ−]   untensed uninflected infinitives, etc.
  [T+, φ−]   tensed uninflected infinitives, Brazilian Portuguese indicatives, Hebrew 3rd-person subjunctives, etc.
  [T−, φ+]   Balkan untensed subjunctives, etc.
A-movement: *
  [T+, φ+]   English indicatives, Balkan tensed subjunctives, etc.

We return to this correlation between movement/obligatory control and Infl-deficiency in section 4.4.

2.5.2.5 Summary

Combining aspects of syntactic and semantic approaches, Landau’s Agree-based approach to control involves a rich array of features that allows him to capture many manifestations of control, with a very high degree of formal explicitness. Here we have focused on three major pillars of his proposal: (i) the importance ascribed to partial control; (ii) the typology predicted by his feature system; and (iii) the technical details of how the distribution and interpretation of PRO is to be obtained through the operation Agree. We have seen that, given their sensitivity to the argument properties of the embedded predicate, partial-control constructions may also be conceived as involving a null commitative argument, instead of an obligatory-controlled PRO with an independent semantic plural feature. In other words, the existence of partial control is not by itself an argument for PRO-based theories. As for the typology predicted, the system undergenerates in that it has no room for finite control into indicatives, which is allowed in Brazilian Portuguese. Finally, the technical apparatus rests on various stipulations regarding the properties of features needed to track the distribution and interpretation of PRO and on composite-agreement relations that are not independently motivated and lead to overgeneration. All in all, we agree with Landau (p. 842) that the theoretical foundations for his approach are on equal footing with the Case-based approach in previous models. Despite its technical precision and empirical coverage, it accounts for the distribution and the interpretation of PRO by ultimately encoding the facts to be explained in the guise of stipulative lexical features.

2.6 Conclusion

In this chapter we have discussed different approaches to control within the generative enterprise, from the standard theory to minimalism. It is an interesting fact that the standard theory and the GB approaches took PRO not as a lexical formative, but as the output of the computations of the syntactic component. Accordingly, each of them attempted to account for the distribution and interpretation of PRO in terms of the broad architectural properties of the model of UG then assumed. By contrast, the null-case and Agree approaches take PRO to be a lexical item and, despite their laudable attempts to deduce the distributional and interpretive properties of PRO, they end up simply encoding them as lexical features, thereby eschewing true explanation. With this background, we are now ready to examine the major properties of the MTC, given a minimalist setting.

3 Basic properties of the movement theory of control

3.1 Introduction

If we could start afresh, without our historical baggage and the preconceptions that often come with it, we would likely be struck by the similarities between sentences like (1a) and (1b) below. Both sentences involve a matrix predicate that embeds a non-finite sentential complement and, more interestingly, the unrealized subject of the embedded clause is interpreted as being “the same” as the subject of the matrix clause. That is, ‘John’ is the kisser in both (1a) and (1b).

(1) a. John seemed to kiss Mary
    b. John tried to kiss Mary

In face of these structural and interpretive similarities, our fresh minds – unbiased but armed with Occam’s razor – would undoubtedly attempt to capture them in a uniform way, with the same mechanisms, unless presented with strong independent reasons for not doing so. The seduction of this simple reasoning encapsulates the MTC. The MTC takes it that the null hypothesis for the derivation of raising and control constructions such as (1a) and (1b) should resort to the same grammatical devices. Thus, if (1a) is to be analyzed in terms of A-movement, (1b) should prima facie be analyzed as involving A-movement as well. Of course, null hypotheses can be, and frequently are, incorrect. But the incorrectness has to be demonstrated and this – in our view – has not been the case with the MTC, despite claims to the contrary, as we shall discuss. In this chapter, we present the basic features of (our version of) the MTC, leaving a detailed discussion of its empirical advantages and its solutions to problems raised in the literature to chapters 4 and 5. Section 3.2 starts with a historical discussion of factors that prevented the MTC from being entertained as the null hypothesis from day one. Section 3.3 shows how the abandonment of D-structure in the minimalist program made it possible and natural to explore the null hypothesis underlying the MTC. Section 3.4 shows how an analysis


of controlled PRO as a trace of A-movement deduces the configurational, phonetic, and interpretive properties of obligatory control and how obligatorily controlled PRO can be dispensed with as a grammatical formative. Finally, section 3.5 reviews the architectural features of the MTC and its place within the minimalist program, showing that there exist no strong reasons for discarding the null hypothesis regarding the derivation of control and raising constructions.

3.2 Departing from the null hypothesis: historical, architectural, and empirical reasons

Consider once again the examples in (1). For all their similarities, there is a difference between the two sentences. In (1b) ‘John’ has an interpretive function in virtue of being the matrix subject that is absent in (1a). Infelicitously, we might say that in (1b) John is described as both a kisser and a trier while in (1a), though he is a kisser still, he is in no sense a seemer. This is reflected in the fact that (1a) has a paraphrase like ‘it seemed that John kissed Mary,’ while (1b) has (at best) the very awkward paraphrase ‘John tried for John to kiss Mary’ and no possible paraphrase analogous to the one for (1a): ‘it tried for John to kiss Mary’ is almost incomprehensible. Generative grammarians, it is fair to say, have been more impressed by the last noted difference than the aforementioned similarities, for they have generally taken the semantic difference revealed by the paraphrases to indicate that these otherwise similar sentences have entirely different generative (derivational) profiles. Example (1a) is taken to mediate the relation between the two subject positions by moving ‘John’ from the embedded to the matrix position, leaving behind a coindexed trace, as illustrated in (2a) below. By contrast, (1b) is taken to relate ‘John’ to the embedded-subject position through some kind of binding relation, as represented in (2b). 1

(2) a. [John₁ seemed [t₁ to kiss Mary]]
    b. [John₁ tried [PRO₁ to kiss Mary]]

It is important to note that enriching the theoretical apparatus by having UG invoke different grammatical resources in order to capture the interpretive difference between (1a) and (1b) is not the only conceivable option. As already suggested in section 3.1, one might imagine keeping the theoretical apparatus


constant and analyzing (1b) also in terms of movement, as illustrated in (3) below. From this perspective, the interpretive difference between (1a) and (1b) is to be ascribed to the indisputable fact that there is an additional θ-role available in the matrix clause of (1b) (the “trier” θ-role), but not in the matrix clause of (1a). Therefore, as ‘John’ moves to the matrix clause, it may establish a new thematic relation in (1b), but not in (1a).

(3) [John₁ tried [t₁ to kiss Mary]]

The analysis in (3), which essentially embodies the MTC, is arguably the null hypothesis for the analysis of (1b). Given the pervasive role of movement in the grammar – however it is encoded – Occam’s razor should urge us to attempt to make do with movement, given that it is already independently required. Interestingly, this has not been a widely explored path in generative grammar 2 and it is worth pausing to consider why this is so.

In part, this can be attributed to historical reasons regarding the development of the field. In the earliest days of generative grammar, simply attaining descriptive adequacy was a tremendous challenge as the basic formal tools to handle linguistic data were still in the making. It is therefore unsurprising that these first stages were essentially taxonomic, establishing the inventory of possible constructions in natural languages and formulating the rules that should yield the catalogued constructions. Thus, until the early 1980s, transformations were complex operations that generated alternative structures typed by construction (e.g., the passive rule, the wh-question formation rule, the relativization rule, the topicalization rule, etc.). In this scenario, differentiating a raising rule from a control rule (the equi(valent) NP deletion rule; see section 2.3) makes good sense. One very good diagnostic for differentiating two constructions is their differing effects on meaning and, as we noted above, raising sentences like (1a) do differ from control sentences like (1b) as far as the thematic powers of the subject ‘John’ are concerned.

This motivation, though reasonable in this background, ceases to be persuasive with the emergence of the principles-and-parameters approach. One important legacy of the GB era with the shift from constructions and rules to principles and parameters is that constructions are now viewed as epiphenomena, resulting from the interaction of more basic operations. Instead of a (roughly) one-to-one relation between constructions and rules, we now find basic operations such as Move (or, more radically still, Lasnik and Saito’s [1992] Affect) underlying the derivation of a multitude of different types

2 An early notable exception is Bowers (1973).


of constructions. For instance, the derivation of passive and raising constructions is taken to employ the same grammatical device, namely, (A-)movement, rather than resorting to the distinct rules of passive and raising. The differences between these constructions such as the dethematicization of the external argument in the case of passives are then factored out and analyzed in terms of other independent components of the grammar (e.g., θ-theory). But once one goes this far, there is no obvious conceptual barrier to categorizing control with passive and raising, all sharing the same generative resources. The three constructions could potentially be derived by the same grammatical tool (A-movement) and their differences allocated to different components of the grammar. Let us make the same point in a slightly different way: nobody assumes that wh-questions are the same as relative clauses. Nonetheless, it is now widely accepted (among generative grammarians) that, whatever differences the two constructions enjoy (and there are many), these differences do not undermine the claim that their derivations both involve a common (A’-)movement operation. The MTC asks that this same reasoning be applied to raising and control configurations. The source of their differences may reside not in the operations that go into generating them, but in the interaction of their specific properties with other grammatical components. In section 3.3 and Chapter 5 below, we will examine in detail potential sources for the differences between control and raising constructions documented in the literature. The important point to bear in mind here is that with the abandonment of constructions as grammatical primitives, there remains no logical impediment for entertaining the hypothesis that control structures are derived by movement.
In fact, the principles-and-parameters perspective, in which constructions are not theoretical primitives, invites one to eliminate the exceptional theoretical status of control qua construction in the grammar. Recall from our discussion in section 2.4 that the analysis of control in GB could not ultimately be reduced to binding/construal without additional provisos. Once PRO is analyzed as a pronominal anaphor, the account of the interpretive properties of PRO requires a specific grammatical module, the control module, which must somehow disregard the pronominal specification of PRO in OC constructions and its anaphoric specification in NOC constructions. Such construction sensitivity looks like a fossil from previous stages of the generative enterprise that one would like to get rid of.

The second historical reason for why the MTC was not pursued within generative grammar, which also accounts for why the construction-specific flavor of the control module was tolerated within GB, has to do with the general architecture of the grammar standardly assumed prior to the minimalist


program. More specifically, the assumption of D(eep)-structure (DS) as a level of representation left no room for an approach along the lines of the MTC. 3 Let us see why. In prior models, DS has two important properties. First, it codes all the relevant argument-structure information a sentence expresses. It is, in technical lingo, a pure “representation of GF-θ,” where all and only argument (thematic) positions are filled. 4 Thus, if a predicate has a logical subject and object (agent and theme), then both subject and object positions must be lexically filled at DS. On the other hand, if a verb has a logical object but no subject (e.g., a passive or unaccusative verb), the object position must be lexically filled at DS, whereas the subject position must be empty. The second important aspect of DS is that, functionally, it is the output of phrase-building operations (namely, phrase-structure rules and lexical-insertion operations) and the input to the transformational component. In other words, in models that include a DS level all lexical-insertion operations precede all movement transformations. Let us now examine how the sentences in (1) repeated below in (4) are analyzed under these assumptions. Take the representations in (5), for instance, which indicate that ‘John’ has moved from the embedded to the matrix-subject position, leaving a trace behind.

(4) a. John seemed to kiss Mary
    b. John tried to kiss Mary

(5) a. [John₁ seemed [t₁ to kiss Mary]]
    b. [John₁ tried [t₁ to kiss Mary]]

Given the requirement that lexical insertion must precede movement, the DS representation of (5a) and (5b) should be as in (6) below, with ‘John’ generated in the embedded-subject position. In both (6a) and (6b), ‘kiss’ has two arguments and its subject and object positions are correctly filled. ‘Seem’ in (6a) does not assign a θ-role to its subject position and, accordingly, its subject position is left empty. By contrast, ‘try’ does assign a θ-role to its subject position, but there is no category filling this position in (6b). Hence, a movement-based derivation along the lines of (5b) for the control construction in (4b) is ruled out at DS, as the thematic requirements of ‘try’ are not

3 Deep structure is not quite the same as D-structure. However, for current purposes the differences are insignificant, as are the differences between various models of grammar that included a D-structure level, e.g., EST and both early and late GB models.


satisfied at this level. The derivation of (4b) should therefore have two distinct elements filling the subject positions, as illustrated in (7b), with ‘John’ occupying the matrix-subject position and PRO the embedded-subject position. Notice that a similar DS representation for raising constructions, as illustrated in (7a), is not licit, as the subject position of ‘seem’ is filled despite the fact that it is not thematic. In other words, assuming DS in the grammar unavoidably leads to a movement analysis of raising and a construal analysis of control.

(6) a. DS: [Seemed [John to kiss Mary]]
    b. DS: [Tried [John to kiss Mary]]

(7) a. DS: [John seemed [PRO to kiss Mary]]
    b. DS: [John tried [PRO to kiss Mary]]
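The way the DS requirement sorts the representations in (6) and (7) can be made concrete with a small sketch. The Python fragment below is purely our illustration and not part of any GB formalism: it assumes a toy lexicon (`ASSIGNS_SUBJECT_THETA`, a hypothetical name) that records only whether a predicate assigns a θ-role to its subject, and it encodes each DS representation as a list of (predicate, subject) pairs, with `None` standing for an empty subject position. Object positions are ignored for simplicity.

```python
# Toy check of the DS condition: all and only thematic positions are filled.
# The lexicon and the clause encoding are illustrative assumptions.

ASSIGNS_SUBJECT_THETA = {"seem": False, "try": True, "kiss": True}


def ds_ok(clauses):
    """Return True if every subject position respects the DS condition.

    clauses: list of (predicate, subject) pairs, matrix clause first;
    subject is None when the position is empty at DS.
    """
    for predicate, subject in clauses:
        thematic = ASSIGNS_SUBJECT_THETA[predicate]
        if thematic and subject is None:
            return False  # a thematic position is left unfilled
        if not thematic and subject is not None:
            return False  # a non-thematic position is filled at DS
    return True


# The four DS candidates discussed in the text.
ds_6a = [("seem", None), ("kiss", "John")]
ds_6b = [("try", None), ("kiss", "John")]
ds_7a = [("seem", "John"), ("kiss", "PRO")]
ds_7b = [("try", "John"), ("kiss", "PRO")]

for name, ds in [("6a", ds_6a), ("6b", ds_6b), ("7a", ds_7a), ("7b", ds_7b)]:
    print(name, "licit" if ds_ok(ds) else "illicit")
```

Under these assumptions only (6a) and (7b) pass the check, which mirrors the conclusion in the text: with DS in place, raising must start with an empty matrix subject and control must start with two distinct subject elements.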

The big empirical virtue of assigning distinct DS representations to raising and control constructions along the lines of (6a) and (7b) is that it derives semantic differences between these constructions in a principled manner. Take the contrasts in (8)–(10), for example.

(8) a. There seems to be someone kissing Mary
    b. *There tried to be someone kissing Mary

(9) a. The cat seems to be out of the bag (idiomatic interpretation: OK)
    b. The cat tried to be out of the bag (idiomatic interpretation: *)

(10) a. The doctor seemed to examine Mary ≈ Mary seemed to be examined by the doctor
     b. The doctor tried to examine Mary ≠ Mary tried to be examined by the doctor

In (8), we can find an expletive in the subject of ‘seem’ but not of ‘try.’ Why? Because expletives have no semantic content and so cannot bear θ-roles. 5 As the subject of ‘seem’ is not thematic, it must be empty at DS, as shown in (11a) below; ‘there’ can then be inserted in the subject position of ‘seem’ after DS without semantic (or grammatical) violence being done. In contrast, as the subject position of ‘try’ is thematic, at DS it must be filled by a category that can bear a θ-role. If it is left empty at DS, as in (11b) and later filled with ‘there,’ the thematic properties of ‘try’ will not be satisfied at DS. If ‘there’ fills this position already at DS, as in (11c), we again have an illicit DS representation, for ‘there’ is not a valid θ-role bearer.

5 At least not conventional ones. We set aside for the nonce the “quasi”-argument status of ‘it’ in weather constructions like ‘it is raining’ (see Chomsky 1986b for discussion).


(11) a. DS: [Seems [to be someone kissing Mary]]
     b. DS: [Tried [to be someone kissing Mary]]
     c. DS: [There tried [to be someone kissing Mary]]

We can play the same game with the idioms in (9). Idioms are not compositionally interpreted. If we take this to mean that they cannot bear conventional θ-roles, then an idiom (or part of one) cannot be simultaneously interpreted idiomatically and thematically. In (9a) ‘the cat’ can retain its idiomatic meaning, as the subject of ‘seem’ is not thematic and is left empty at DS, as represented in (12a) below. Thus, (9a) can be interpreted as meaning that it seems like the secret has been revealed. This is not possible in (9b). Once ‘the cat’ is in a thematic position, as shown in (12b), it cannot have an additional idiomatic interpretation. Consequently, though (9b) is well formed and interpretable, it means something like ‘the kitty tried to escape the confining sac.’ It has nothing to do with secrets revealed or otherwise.

(12) a. DS: [Seems [[the cat] to be out of the bag]]
     b. DS: [[The cat] tried [PRO to be out of the bag]]

The pairs of sentences in (10) offer a final illustration of the same point. The sentences in (10a) are rough paraphrases of one another, both meaning that it seems that the doctor examined Mary. They display “voice transparency” in the sense that passivizing the embedded clause has little effect on the interpretation of the whole. This transparency is captured at DS as ‘Mary’ fills the object position of the embedded verb in the DS representation of both sentences, as illustrated in (13) below. By contrast, the control counterparts in (10b) have completely different meanings, with the effort being made by the doctor in the first sentence but by Mary in the second. Lack of voice transparency in (10b) is attributed to the different positions ‘Mary’ occupies at DS in each case. As illustrated below, at DS ‘Mary’ is the thematic object of ‘examine’ in (14a) but the thematic subject of ‘try’ in (14b).

(13) a. DS: [Seemed [the doctor to [examine Mary]]]
     b. DS: [Seemed [to be [examined Mary] by the doctor]]

(14) a. DS: [The doctor tried [PRO to [examine Mary]]]
     b. DS: [Mary tried [to be [examined PRO] by the doctor]]

In sum, the two features of DS reviewed above (namely, that DS is substantively the level at which all and only thematic positions must be filled and functionally the level that feeds movement operations) conspire to eliminate a movement approach to control along the lines of (15) below at DS. This is one


reason why the MTC was not considered viable in GB or in previous models that assumed DS.

(15) [John₁ tried [t₁ to kiss Mary]]

Furthermore, the interaction of these two features of DS derives subtle interpretive properties of raising and control structures and this was certainly a big achievement. So much so that retaining the clumsy construction sensitivity of the control module in a principles-and-parameters model seemed a reasonable price to pay. But questions arise if the architecture of the model changes. Specifically, what if we give up DS? Can the interpretive differences between raising and control still be captured? If so, are we not then free to reconsider the null hypothesis regarding control, expressed in (15)? This is the subject of the next section.

3.3 Back to the future: elimination of DS and the revival of the null hypothesis

As we discussed in section 3.2, assuming DS has drastic consequences for the (simplest) version of the MTC expressed in (15). As DS requires that all θ-roles must be discharged before any movement takes place, the θ-roles associated with the controller and the controllee must be assigned before any movement operation, which in practice prevents the controller and the controllee from being associated via movement. Models that eschew a DS level are therefore free to pursue the MTC. In other words, without DS there is no obvious architectural reason against reducing raising and control to a common movement source. This observation has become especially relevant with the emergence of the minimalist program in the early 1990s, as minimalism argues against the postulation of non-interface levels such as DS. 6 In particular, minimalists have explored the idea that lexical insertion and θ-assignment, on the one hand, and movement, on the other, can be freely interspersed. A sentence such as (16), for instance, is to be associated with the (simplified) derivation in (17) (see footnote 1), where movement of ‘what’ in (17e) is sandwiched between different applications of lexical insertion and θ-assignment.

6 For relevant discussion, see Chomsky (1995), Uriagereka (1998), and Hornstein, Nunes, and Grohmann (2005), among others.


(16) What did Mary say that John saw

(17) a. Merger of ‘saw’ and ‘what’ + θ-assignment:
        [saw what]
     b. Merger of T:
        [T [saw what]]
     c. Merger of ‘John’ + θ-assignment:
        [John [T [saw what]]]
     d. Merger of ‘that’:
        [that [John [T [saw what]]]]
     e. Movement of ‘what’:
        [whatᵢ [that [John [T [saw tᵢ]]]]]
     f. Merger of ‘say’ + θ-assignment:
        [say [whatᵢ [that [John [T [saw tᵢ]]]]]]
     g. Merger of T:
        [T [say [whatᵢ [that [John [T [saw tᵢ]]]]]]]
     h. Merger of ‘Mary’ + θ-assignment:
        [Mary [T [say [whatᵢ [that [John [T [saw tᵢ]]]]]]]]
     i. Merger of C:
        [C [Mary [T [say [whatᵢ [that [John [T [saw tᵢ]]]]]]]]]
     j. Movement of ‘what’:
        [what [C [Mary [T [say [whatᵢ [that [John [T [saw tᵢ]]]]]]]]]]
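The interleaving of operations in (17) can be replayed mechanically. The following Python sketch is an illustration under simplifying assumptions of ours, not a formalization of the minimalist operations themselves: trees are nested two-element lists, `merge` and `move` are hypothetical helper names, and θ-assignment is recorded as a simple flag on each step. One notational choice differs from the display in (17j): the sketch marks the intermediate landing site with a trace (`t_i`).

```python
# Toy bottom-up derivation in which Merge and Move are freely interleaved
# (no DS level). All function names and encodings are illustrative.

def render(node):
    """Return the labeled-bracket string for a tree."""
    if isinstance(node, str):
        return node
    return "[" + " ".join(render(child) for child in node) + "]"


def replace(node, old, new):
    """Return a copy of the tree with every leaf `old` replaced by `new`."""
    if isinstance(node, str):
        return new if node == old else node
    return [replace(child, old, new) for child in node]


def merge(item, tree):
    """External Merge: combine a new lexical item with the current tree."""
    return [item, tree]


def move(tree, item, index="i"):
    """Move: leave a coindexed trace and remerge the item at the root."""
    moved = item if item.endswith("_" + index) else item + "_" + index
    return [moved, replace(tree, item, "t_" + index)]


def derive():
    """Replay (17); return (operation, assigns_theta, structure) triples."""
    plan = [
        ("merge", "T", False),      # (17b)
        ("merge", "John", True),    # (17c): θ-assignment
        ("merge", "that", False),   # (17d)
        ("move", "what", False),    # (17e): movement
        ("merge", "say", True),     # (17f): θ-assignment after movement
        ("merge", "T", False),      # (17g)
        ("merge", "Mary", True),    # (17h): θ-assignment
        ("merge", "C", False),      # (17i)
        ("move", "what_i", False),  # (17j): movement
    ]
    tree = merge("saw", "what")     # (17a): θ-assignment
    steps = [("merge", True, render(tree))]
    for operation, item, theta in plan:
        tree = merge(item, tree) if operation == "merge" else move(tree, item)
        steps.append((operation, theta, render(tree)))
    return steps


for operation, theta, structure in derive():
    print(f"{operation:5} theta={theta!s:5} {structure}")
```

The point the sketch is meant to bring out is simply the ordering: the movement step corresponding to (17e) sits between two merger steps that are flagged as θ-assigning, which is exactly the state of affairs a DS level would prohibit.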

Once it is independently assumed that θ-role assignment may follow applications of movement, it is not at all odd to suppose that a control construction such as (18) below could be derived along the lines of (19), where the “trier” θ-role is assigned after movement of ‘John’ (cf. [19g]). This is even more so if movement is in fact a composite operation that includes merger as one of its basic operations. 7 That is, if movement of ‘John’ in (19g) involves plugging it into the structure via merger, this merger operation should in principle license θ-assignment in the same way merger of ‘Mary’ in (19a) or ‘John’ in (19c), for instance, does.

(18) John tried to kiss Mary

(19) a.

Merger of