Competition in Syntax
Studies in Generative Grammar 49
Editors
Harry van der Hulst
Jan Koster
Henk van Riemsdijk
Mouton de Gruyter
Berlin · New York
Competition in Syntax
Edited by
Gereon Müller
Wolfgang Sternefeld
Mouton de Gruyter
Berlin · New York 2001
Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
The series Studies in Generative Grammar was formerly published by
Foris Publications Holland.
Printed on acid-free paper which falls within the guidelines
of the ANSI to ensure permanence and durability.
Die Deutsche Bibliothek — Cataloging-in-Publication Data
Competition in syntax / ed. by Gereon Müller ; Wolfgang Sternefeld. -

Berlin ; New York : Mouton de Gruyter, 2001
(Studies in generative grammar ; 49)
ISBN 3-11-016945-2
© Copyright 2000 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin.

All rights reserved, including those of translation into foreign languages. No part of this
book may be reproduced in any form or by any means, electronic or mechanical, including
photocopy, recording, or any information storage and retrieval system, without permission
in writing from the publisher.
Printing: Werner Hildebrand, Berlin.
Binding: Lüderitz & Bauer, Berlin.
Printed in Germany.
Contents
The Rise of Competition in Syntax: A Synopsis
Gereon Müller & Wolfgang Sternefeld
Let's Phrase It! Focus, Word Order, and Prosodic Phrasing in German Double Object Constructions
Daniel Büring
Remarks on the Economy of Pronunciation
Gisbert Fanselow & Damir Ćavar
On the Integration of Cumulative Effects into Optimality Theory
Silke Fischer
Quantifier Scope in German and Cyclic Optimization
Fabian Heck
Experimental Evidence for Constraint Competition in Gapping Constructions
Frank Keller
Word Order Variation: Competition or Co-Operation?
Jürgen Lenerz
OT Accounts of Optionality: A Comparison of Global Ties and Neutralization
Tanja Schmid
The Interpretation of Object Shift and Optimality Theory
Sten Vikner
Case Conflict in German Free Relative Constructions: An Optimality Theoretic Treatment
Ralf Vogel
The Optimal Linking of Arguments: The Case of English Psych Verbs
Anja Wanner
Index of OT-Constraints
Index of Subjects

The Rise of Competition in Syntax: A Synopsis
Gereon Müller & Wolfgang Sternefeld
1 Local vs. Competition-Based Approaches
Syntactic theories differ with respect to how they determine the wellformedness
or illformedness of a given sentence S_i in a given language. One possibility
is that the decision of whether S_i is grammatical or not can be made by
exclusively considering properties of S_i; properties of other sentences S_j, S_k,
... are irrelevant. Another possibility is that properties of other sentences S_j,
S_k, ... do play a role in deciding whether S_i is grammatical or not, in addition to
S_i's own properties. The first possibility, which we may call a local approach,
can arguably be viewed as the standard one; this strategy is pursued in, e.g.,
most versions of government and binding theory (principles and parameters
theory), head-driven phrase structure grammar, lexical-functional grammar
(until recently), and in certain versions of the minimalist program. The sec-
ond possibility presupposes a competition of sentences; hence, we will refer
to it as a competition-based approach. This strategy is the one that this book
is about; it is chosen in certain versions of the minimalist program (in par-
ticular, in earlier manifestations), in theories that incorporate the Blocking
Principle (the Elsewhere Condition), and, last but not least, in optimality-
theoretic syntax. In what follows, we will illustrate fundamental differences
between and points of convergence among local and competition-based ap-
proaches by considering government and binding theory (section 2) and its
development into the minimalist program (section 3), blocking syntax (sec-
tion 4), and optimality-theoretic syntax (section 5).
2 Government and Binding Theory
Chomsky's (1981, 1986a,b) theory of government and binding is a typical

instance of a local approach. In this theory, syntactic objects are viewed as
(D-structure, S-structure, LF) triples which are created by phrase-structure

rules (X-bar theory) and the transformational rule Move α. Syntactic constraints
can take various forms. First, they can be representational filters in
the sense that they apply at one or more of the three levels D-structure,
S-structure and/or LF (cf. the principles of θ-theory, the binding theory
principles, or the Empty Category Principle (ECP)). Alternatively, constraints can
be derivational, which implies that they do not apply at any specific level,
but restrict the movement operation itself (cf. the Subjacency Condition). Fi-
nally, government and binding theory envisages the possibility that syntactic
constraints can be global (in Lakoff's 1971 terminology). Global constraints
are neither representational nor derivational; rather, they relate non-adjacent
representations in a complex derivation. In the case at hand, the Projection
Principle (see Chomsky 1981:38) is a global constraint that relates the three
levels of representation by demanding that the subcategorization properties
of lexical items be respected from one level to the next. Still, all these types
of constraints, including the global one, can be checked by exclusively looking
at a given syntactic object S_i.¹ Properties of other syntactic objects are
irrelevant. Thus, all these types of constraints (representational, derivational,
and global) can be viewed as local in this sense.
2.1 Representational Constraints
To see this, consider first the effects of representational constraints like the
binding principles A and B, which are given here in a simplified form.
(1) a. Principle A:
An anaphor is bound in its binding domain.
b. Principle B:
A pronominal is free in its binding domain.
By assuming that these principles apply at S-structure, we can derive that

(2-a,c) (more precisely, the (D-structure, S-structure, LF) triples of which
(2-a,c) are simplified S-structure representations) are grammatical, whereas
(2-b) and (2-d) are ungrammatical, due to violations of Principle A and Prin-
ciple B, respectively. Crucially, however, the determination of grammatical-
ity of the four sentences proceeds on a local basis. Thus, the illformedness of
(2-d) is not related to the wellformedness of (2-a), and the illformedness of
(2-b) is not related to the wellformedness of (2-c), even though the two strate-
gies - pronominalization and reflexivization - seem to be in complementary
distribution for the most part in English.
(2) a. John_1 likes himself_1
b. *John_1 thinks that Mary likes himself_1
c. John_1 thinks that Mary likes him_1
d. *John_1 likes him_1
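The local character of this evaluation can be made concrete with a small sketch. The Python fragment below is only an illustration under simplifying assumptions: each sentence in (2) is reduced to two hypothetical attributes (the type of the bound element, and whether it is bound inside its binding domain), and the function name `violates_binding` is ours, not part of the theory. What matters is that each sentence is judged from its own properties alone.

```python
def violates_binding(kind, bound_in_domain):
    """Check the simplified Principles A and B for one item.

    kind            -- "anaphor" or "pronominal" (hypothetical labels)
    bound_in_domain -- True if the item is bound within its binding domain
    """
    if kind == "anaphor":
        return not bound_in_domain   # Principle A: must be bound in its domain
    if kind == "pronominal":
        return bound_in_domain       # Principle B: must be free in its domain
    raise ValueError(f"unknown kind: {kind}")


# Toy encodings of the sentences in (2); the attribute values are assumptions
# read off the indexing shown in the examples.
examples = {
    "(2-a)": ("anaphor", True),      # himself bound by John inside its domain
    "(2-b)": ("anaphor", False),     # himself has no binder inside its domain
    "(2-c)": ("pronominal", False),  # him is free inside its domain
    "(2-d)": ("pronominal", True),   # him bound by John inside its domain
}

# Each verdict is computed without consulting any other sentence.
judgements = {label: not violates_binding(*props)
              for label, props in examples.items()}
print(judgements)
```

No entry in `examples` is consulted when another entry is judged, which is the sense of "local" used in the text.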
Another representational constraint in government and binding theory is the

ECP, which is assumed to apply to LF representations only (see Chomsky
1986a, Lasnik & Saito 1992).
(3) ECP ("Empty Category Principle"):
Every trace must be marked [+γ].
A trace is marked [+γ] if it is properly (antecedent or lexically) governed, and
[-γ] if it is not properly governed. Once assigned, a γ-feature is irreversible.
By assumption, the complementizer that blocks antecedent-government of
a subject trace. Consequently, t_1 is marked [-γ] in (4-a), which exhibits a
familiar complementizer-trace effect, and [+γ] in (4-b). Even though these
features are assigned at S-structure, the actual ECP violation in (4-a) takes
place at LF. There is a reason why the ECP cannot be assumed to hold at
S-structure already: If it did, (4-c) would also be ill formed, due to
[-γ]-marking of the intermediate subject trace t'_1. Being an argument trace in an
A-bar position, this intermediate trace is freely deletable on the way to LF (an
option that does not exist for t_1 in (4-a,b,c)). With it goes the [-γ]-marking,
and (4-c) is correctly predicted to be well formed despite a lack of proper
government at S-structure - the ECP is respected at LF.
(4) a. *Who_1 do you think [CP t'_1([+γ]) that [IP t_1([-γ]) will leave ]] ?
b. Who_1 do you think [CP t'_1([+γ]) — [IP t_1([+γ]) will leave ]] ?
c. Who_1 do you think [CP t''_1([+γ]) that Mary said [CP t'_1([-γ]) — [IP t_1([+γ]) will leave ]]] ?
It is again worth noting that the account of the illformedness of (4-a) in no

way relies on the fact that (4-b) is well formed - the grammaticality status of
the two sentences is determined on a local basis.
2.2 Derivational Constraints
Basically the same situation arises with constraints like the Subjacency Con-
dition in (5).
(5) Subjacency Condition:

Movement must not cross two bounding nodes.
The standard assumption in government and binding theory is that Subja-

cency is a derivational constraint on (overt) movement, rather than a repre-
sentational constraint applying at S-structure. Theory-internal motivation for
this assumption comes from the consideration of the data in (6) (cf. Sternefeld
1991).
(6) a. Who_1 do [IP you think [CP t'_1 that [IP Mary loves t_1 ]]] ?
b. Who_1 do [IP you believe [IP John to be in love with t_1 ]] ?
IP is a bounding node for movement. The two wh-movement steps in (6-a)

cross one IP each, and given that no other bounding node intervenes, the well-
formedness of this sentence is compatible with both a derivational and a rep-
resentational interpretation of the Subjacency Condition. The case is different
in the exceptional Case marking context in (6-b). As it stands, t_1 is separated
from its antecedent who_1 by two IP bounding nodes at S-structure, since there
is no intermediate trace t'_1. Thus, under a representational interpretation of the
Subjacency Condition, we would in fact wrongly expect (6-b) to be ungram-
matical. No such problem arises under a derivational interpretation, though.
On this latter view, (6-b) is derived from a D-structure representation with
a CP shell present in the embedded clause. Wh-movement can precede CP
deletion (as required for exceptional Case marking of the embedded subject
NP), and the Subjacency Condition is correctly predicted to be satisfied by the
movement operation itself. This is shown in (7), which outlines the relevant
part of the derivation of (6-b).
(7) a. — do [IP you believe [CP — C [IP John to be in love with who_1 ]]]
b. — do [IP you believe [CP who_1 C [IP John to be in love with t_1 ]]]
c. who_1 do [IP you believe [CP t'_1 C [IP John to be in love with t_1 ]]]
d. who_1 do [IP you believe [IP John to be in love with t_1 ]]
Among other phenomena, the Subjacency Condition derives Complex Noun

Phrase Condition (CNPC) islands. The contrast between (8-a) and (8-b) follows,
given that wh-movement crosses two bounding nodes in (8-b), but not
in (8-a). Again, the fact that the movement strategy seems to be in complementary
distribution with a resumptive pronoun strategy is not theoretically
reflected by relating the illformedness of resumptive pronouns as in (8-c) to
the wellformedness of movement in (8-a), and the (relative) wellformedness
of resumptive pronouns in (8-d) to the illformedness of movement in (8-b):²
Independently of what exactly an account of the contrast in (8-c,d) looks like
in government and binding theory, it seems that it must rely on a local constraint
that is violated if a resumptive pronoun has an antecedent that is too
close.³
(8) a. the man [CP who(m)_1 I saw t_1 ]
b. *the man [CP who(m)_1 [IP I don't believe [NP the claim [CP t'_1 that anyone saw t_1 ]]]]
c. *the man [CP who(m)_1 I saw him_1 ]
d. ?the man [CP who(m)_1 [IP I don't believe [NP the claim [CP that anyone saw him_1 ]]]]
2.3 Global Constraints
Finally, the Projection Principle can always be checked by looking at the

properties of a given sentence, and does not necessitate a consideration of
other sentences. The Projection Principle is given in a simplified form in (9).
(9) Projection Principle:

a. If α selects β in γ as a lexical property, then α selects β in γ at
some level L_i.
b. If α selects β in γ at level L_i, then α selects β in γ at level L_j.
To find out whether a given sentence S_i respects the Projection Principle,

we have to take into account the sentence's representations at D-structure, S-
structure, and LF. This way, the presence of traces of overt movement in argu-
ment positions can be enforced at S-structure and at LF, among other things.
Clearly, such a global constraint is quite complex; however, it does not yet
rely on the notion of competition - properties of other sentences (i.e., other
(D-structure, S-structure, LF) triples) are irrelevant for determining whether
a given sentence violates or respects the Projection Principle.
2.4 An Exception: The Avoid Pronoun Principle
To sum up so far, government and binding theory has different types of con-
straints, with varying complexity, but all of them are local in the above sense;
i.e., they do not involve competition. Interestingly, though, there is a notable
exception: the Avoid Pronoun Principle of Chomsky (1981).
The empty pronominal PRO and lexical pronouns come close to being in
complementary distribution: As stated in the PRO-theorem, PRO is confined
to positions that are ungoverned (e.g., the subject position of control infini-
tives). In contrast, overt pronouns typically show up in positions that are gov-
erned. This is so because overt pronouns must be assigned Case, and Case
is normally assigned under government only. However, there is one position
in which government and binding theory permits both an empty pronomi-
nal PRO (because the position is ungoverned) and an overt pronoun (because
Case can be assigned without government by a special Case assignment rule).
This is the subject position of English gerunds. Consider first the possibility
of PRO in this position:
(10) a. John_1 would much prefer [ PRO_1 going to the movie ]
b. *John_1 would much prefer [ PRO_2/arb going to the movie ]
As shown by the contrast between (10-a) and (10-b), PRO must be co-indexed
with the matrix subject here; it cannot bear a different index or be interpreted
arbitrarily. Obligatory control follows from the control rule in (11)
(see Manzini 1983).⁴
(11) Control Rule:
If PRO is minimally dominated by a declarative clausal object α,
then it must be controlled by an antecedent within the minimal CP
dominating α.
Next consider overt pronouns in English gerunds:
(12) a. *John_1 would much prefer [ his_1 going to the movie ]
b. John_1 would much prefer [ his_2 going to the movie ]
c. John_1 would much prefer [ his_1 book ]
(12-b) shows that an overt pronoun is possible in the subject position of

English gerunds. However, an overt pronoun cannot be co-indexed with the
matrix subject, in striking contrast to PRO; cf. (12-a). The illformedness of
(12-a) does not follow from any of the relevant local constraints in govern-
ment and binding theory. In particular, (12-c) strongly suggests that principle
B of the binding theory is not violated in (12-a) (the pronoun his occupies
SpecNP in both cases, according to Chomsky's 1981 assumptions), and the
wellformedness of (12-b) proves that Case can be assigned to his in (12-a). In
view of this situation, Chomsky (1981:65) suggests that the illformedness of
(12-a) is not to be traced back to a violation of some local constraint. Rather,
it should be related to the wellformedness of (10-a): The two sentences com-
pete, and (12-a) is ungrammatical because (10-a) is grammatical. This idea is
implemented by adopting the Avoid Pronoun Principle in (13).
(13) Avoid Pronoun Principle:

Lexical pronouns are blocked by empty pronouns if possible.
This implies that the grammaticality of a sentence S_i with an overt pronoun
cannot be checked on a local basis anymore: To find out whether such a sentence
S_i is grammatical or not, a minimally different sentence S_j has to be
considered in which the overt pronoun is replaced by PRO. If S_j is grammatical,
S_i is ungrammatical by the Avoid Pronoun Principle; this effect occurs in
the case of (10-a) vs. (12-a). If, on the other hand, S_j is ungrammatical (e.g.,
because PRO violates the PRO-theorem, or because PRO has an index that
leads to a violation of the Control Rule), S_i can be grammatical (provided
that no local constraints of grammar are violated); this situation arises in the
case of (10-b) (which violates the local Control Rule) vs. (12-b).
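The two-step reasoning just described (check the sentence itself, then check its PRO counterpart) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not an implementation of government and binding theory: the class `Gerund`, its fields, and the helper names are hypothetical, the Control Rule (11) is the only local constraint modelled, and the overt pronoun is assumed to pass all local constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gerund:
    """Toy encoding of 'John_1 would much prefer [ X going to the movie ]'."""
    subject: str           # "PRO" or "his" (hypothetical labels)
    index: int             # referential index of the gerund subject
    matrix_index: int = 1  # index of the matrix subject John


def locally_ok(s: Gerund) -> bool:
    # Only local constraint modelled here: the Control Rule (11), which
    # forces PRO to be co-indexed with the matrix subject.
    if s.subject == "PRO":
        return s.index == s.matrix_index
    return True  # assumption: overt 'his' violates no local constraint


def pro_competitor(s: Gerund) -> Gerund:
    # The minimally different sentence S_j: same indices, subject replaced by PRO.
    return Gerund("PRO", s.index, s.matrix_index)


def grammatical(s: Gerund) -> bool:
    if not locally_ok(s):
        return False
    # Avoid Pronoun Principle (13): an overt pronoun is blocked whenever the
    # corresponding PRO sentence is itself well formed.
    if s.subject != "PRO" and locally_ok(pro_competitor(s)):
        return False
    return True


verdicts = {
    "(10-a)": grammatical(Gerund("PRO", 1)),
    "(10-b)": grammatical(Gerund("PRO", 2)),
    "(12-a)": grammatical(Gerund("his", 1)),
    "(12-b)": grammatical(Gerund("his", 2)),
}
print(verdicts)
```

Unlike `locally_ok`, the function `grammatical` has to construct and inspect a second sentence, which is exactly what sets the Avoid Pronoun Principle apart from the local constraints discussed so far.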
The question arises of what type of constraint the Avoid Pronoun Principle
is. Clearly, it does not belong to any of the three types of local constraints
discussed above: To find out whether a given sentence S_i respects or violates
this constraint, it does not suffice to consider properties of S_i alone. Rather,
the properties of another sentence S_j must be considered. For the present
purposes, let us refer to constraints like the Avoid Pronoun Principle that rely
on a competition of sentences as translocal constraints.⁵
It seems that the Avoid Pronoun Principle has been widely accepted in gov-
ernment and binding theory. However, conceptually it has arguably always
been an alien element in that syntactic approach. In line with this, it has
never been fully clear whether it should best be viewed as a purely syntactic
constraint, or indeed as a pragmatic constraint derivable from, e.g., Gricean
maxims.⁶ Such a question does not arise with the use of translocal constraints
in the minimalist program.
3 The Minimalist Program
Translocal constraints are employed in various versions of the minimalist pro-

gram that are developed in Chomsky (1991, 1993, 1995, 1998). In general,
translocal constraints can be viewed as selection instructions: Out of a given
set of syntactic objects, one (or more) is chosen according to a metric that
is specified by the constraint. The set of syntactic objects that participate in
the competition is called the reference set. The minimalist program dispenses
with D-structure and S-structure, retaining only the interface levels LF and
PF as levels of representation. Consequently, representational constraints that
apply at D-structure or S-structure are dispensed with (as well as global con-
straints). The remaining local constraints are either derivational, or they apply
at LF or PF ("bare output conditions"). With respect to the issue of competi-
tion, this derivational orientation of the minimalist program has an immediate
consequence. The competing syntactic objects are derivations (not representations
or n-tuples of levels of representations). In line with this, translocal
constraints in the minimalist program are often called transderivational con-
straints. A common property of most (if not all) translocal constraints in the
minimalist program is that they can be conceived of as economy constraints
in some sense; i.e., a translocal constraint chooses the most economical derivation
in the reference set, according to some metric of economy. This certainly
holds for the first translocal constraint suggested in the minimalist program,
the Fewest Steps condition.
3.1 Fewest Steps
3.1.1 V-to-I Movement in Chomsky (1991)
Chomsky (1991) is concerned with deriving the difference between French

and English with respect to V-to-I movement (cf. Pollock 1989): French has
overt V-to-I movement of finite verbs; English does not have such movement
(except for auxiliaries). This is shown in (14).
(14) a. Jean embrasse_1 souvent [VP t_1 Marie ]
b. *Jean souvent [VP embrasse_1 Marie ]
c. *John kisses_1 often [VP t_1 Mary ]
d. John often [VP kisses_1 Mary ]
As a first step towards accounting for these data, Chomsky assumes that
French has "strong" I nodes, whereas English has "weak" I nodes. This dis-
tinction becomes important for the following local (derivational) constraint:
(15) Strength of I:
Strong I tolerates adjunction of all Vs; weak I tolerates adjunction
only of "light" Vs (auxiliaries).
This excludes (14-c) in English: Overt V-to-I movement violates Strength of

I. In contrast, overt V-to-I movement in (14-a) in French does not violate
this constraint. Still, something extra needs to be said about (14-b) in French,
which vacuously fulfills Strength of I, just like (14-d) in English does. Thus,
the question is: Why is overt V-to-I raising obligatory if it is possible? Chom-
sky's (1991) background assumption is that inflection is base-generated in
I. If V does not raise to I, I must lower to V in overt syntax, so as to ful-
fill another local constraint, the Stray Affix Filter, which prohibits inflectional
affixes that are not attached to a verbal host. The crucial idea now is that
overt I lowering creates an unbound trace that must be undone by LF via
covert raising of the whole V-I complex to the position of the trace of I. The
derivations underlying (14-a) and (14-b) in French are given in (16) and (17),
respectively.
(16) a. Jean I_2 souvent [VP embrasse_1 Marie ] (overt raising)
b. Jean [I_2 embrasse_1-I_2 ] souvent [VP t_1 Marie ]
(17) a. Jean I_2 souvent [VP embrasse_1 Marie ] (overt lowering)
b. Jean t_2 souvent [VP [V_1 embrasse_1-I_2 ] Marie ] (covert raising)
c. Jean [V_1 embrasse_1-I_2 ] souvent [VP t_1 Marie ]
The second derivation has more movement steps than the first one, and it is
therefore filtered out as uneconomical by the translocal economy constraint
Fewest Steps, which can be formulated as follows:
(18) Fewest Steps:
If two derivations D_1 and D_2 are in the same reference set and D_1
involves fewer operations than D_2, then D_1 is to be preferred over
D_2.
A definition of reference set that works for the approach in Chomsky (1991) is
(19); here, the numeration is the set of all lexical items (including functional
heads) that are used in a derivation.⁷
(19) Reference Set:

Two derivations D_1 and D_2 are in the same reference set iff they (i)
have the same numeration and (ii) respect all local constraints.
The qualification in (i) ensures that, e.g., (20-b) cannot accidentally block
(20-a) even though it involves fewer syntactic operations. Furthermore, the
statement in (ii) produces the welcome consequence that (20-c) cannot ac-
cidentally block (20-a) even though it involves fewer syntactic operations
by leaving the wh-phrase and the auxiliary in situ - (20-c) violates a local
constraint like the Wh-Criterion, which requires a wh-phrase to move to
SpecC[+wh] overtly in English.
(20) a. What_1 have_2 you t_2 seen t_1 ?
b. You have seen a car
c. *You have_2 seen what_1 ?
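The interaction of (18) and (19) amounts to a selection procedure: partition the locally well-formed derivations by numeration, then keep the cheapest member(s) of each partition. The following Python sketch illustrates this under simplifying assumptions. The class `Derivation`, its fields, and the step counts are hypothetical stand-ins chosen for illustration; in particular, the `local_ok` flag merely records whether a derivation is assumed to respect all local constraints, and nothing here derives that verdict.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    name: str
    numeration: frozenset  # the lexical items used, cf. clause (i) of (19)
    steps: int             # number of operations (illustrative counts)
    local_ok: bool         # assumed verdict on all local constraints, clause (ii)


def reference_sets(derivations):
    """Group derivations into reference sets following definition (19)."""
    sets = defaultdict(list)
    for d in derivations:
        if d.local_ok:                    # clause (ii): locally ill-formed ones never compete
            sets[d.numeration].append(d)  # clause (i): same numeration
    return sets


def fewest_steps(derivations):
    """Return the names of the derivations that survive Fewest Steps (18)."""
    winners = []
    for members in reference_sets(derivations).values():
        best = min(d.steps for d in members)
        winners += [d.name for d in members if d.steps == best]
    return sorted(winners)


FRENCH = frozenset({"Jean", "I", "souvent", "embrasse", "Marie"})
WH = frozenset({"what", "have", "you", "seen"})
DECL = frozenset({"you", "have", "seen", "a", "car"})

candidates = [
    Derivation("(16) overt raising", FRENCH, 1, True),
    Derivation("(17) lowering + covert raising", FRENCH, 2, True),
    Derivation("(20-a)", WH, 2, True),
    Derivation("(20-b)", DECL, 0, True),   # different numeration: no competition with (20-a)
    Derivation("(20-c)", WH, 0, False),    # violates the Wh-Criterion: excluded by clause (ii)
]

print(fewest_steps(candidates))
```

With these inputs, (17) loses to (16), while (20-a) survives because neither (20-b) nor (20-c) is in its reference set. Ties are all returned, which corresponds to the remark that choice points are allowed only among derivations of equal cost.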
This particular application of a translocal constraint in the minimalist pro-

gram is not generally accepted anymore. Still, it can serve as an illustration
of certain recurring properties and problems of translocal constraints, and in-
deed of competition-based approaches in general.
First, as Chomsky (1991:433) observes, this system "tends to eliminate the
possibility of optionality in derivation. Choice points will be allowable only
if the resulting derivations are all minimal in cost ... This may well be too
strong a conclusion, raising a problem for the entire approach." As an ex-
ample, consider optional topicalization in English. (21-a) and (21-b) are both
well formed, even though (21-b) invariably involves one movement operation
more than (21-a).
(21) a. Mary gave a book to John_1
b. To John_1 Mary gave a book t_1
The solution suggested by Chomsky (1991:433) is that certain movement op-

erations (like, we can assume, topicalization in English) might be "assigned
to some other component of the language system, perhaps a 'stylistic' com-
ponent of the mapping... to PF." Movement operations of this type might then
be exempt from the Fewest Steps constraint. Alternatively, we might revise
the definition of reference set appropriately, such that the two derivations in
(21) do not compete anymore. For instance, we might add the requirement
that competing derivations must have identical LF representations, as in (22).
Assuming that sentences which differ with respect to whether topicalization

has applied must have different LFs, this would yield the desired result.
(22) Reference Set (revised):
Two derivations D_1 and D_2 are in the same reference set iff they (i)
have the same numeration and the same LF, and (ii) respect all local
constraints.
Second, a potential conceptual problem arises. Translocal economy con-

straints increase complexity: To find out whether a given sentence is gram-
matical, it does not suffice to look at internal properties of the sentence (Does
it violate a local constraint?); rather, the properties of other sentences have
to be taken into account as well (Does the sentence have the most econom-
ical derivation in the reference set?). Chomsky (1991:448) remarks that this
might suggest that "language design as such appears to be in many respects
'dysfunctional,' yielding properties that are not well adapted to the functions
language is called upon to perform."⁸
Third, Chomsky (1991) discusses successive-cyclic movement. It is standardly
assumed that long-distance wh-movement of adjunct wh-phrases must
be successive-cyclic; otherwise, a locality constraint (like the ECP) will be
fatally violated, as with, e.g., wh-island configurations; cf. (23).
(23) *How_1 do you wonder [CP whether to fix the car t_1 ] ?
Chomsky (1991) observes that successive-cyclic movement creates a poten-

tial problem for the Fewest Steps condition: Successive-cyclic movement as
in (24-a) should always be blocked by one-swoop movement as in (24-b).
(24) a. How_1 do you think [CP t''_1 that John said [CP t'_1 that Bill fixed the car t_1 ]] ?
b. *How_1 do you think [CP — that John said [CP — that Bill fixed the car t_1 ]] ?
Note, though, that no particular problem arises under the notion of reference
set in (19) or (22). According to these definitions, only those derivations can
compete that respect all local constraints of grammar, i.e., that are otherwise
well formed. By hypothesis, the derivation that generates the surface repre-
sentation (24-b) violates a locality constraint; hence, it cannot compete with
the derivation that generates (24-a), and (24-a) is chosen by Fewest Steps
because there is no competing derivation that would be more economical.
But what if we were to dispense with clause (ii) in the definition of reference
set, or if clause (ii) were weakened in such a way that some derivations violating
local constraints could compete after all? (As we will see below, there is
some evidence for this latter option.) Then, the derivations generating (24-a)
and (24-b) might compete, and the problem of accounting for successive-
cyclic movement under Fewest Steps would persist. How, then, can we permit
successive-cyclic movement in (24)? Chomsky (1993) advances the follow-
ing solution: "Operations" as they are relevant for Fewest Steps do not simply
involve applications of Move α as such. Rather, a more complex process of
chain formation that (a) moves some item to its target position and (b) automatically
inserts intermediate traces in appropriate positions counts as a
single operation for the purposes of Fewest Steps:⁹
(25) Form Chain:
Move α to its target position and freely insert intermediate traces in
appropriate positions.
On this view, "successive-cyclic" movement is no more costly from the per-

spective of Fewest Steps than one-swoop movement. Furthermore, the initial
evidence concerning the French case of overt I-lowering followed by covert
V-raising remains unaffected: A succession of movement operations involv-
ing a single item can only be reanalyzed as one instance of Form Chain (one
operation for the purposes of Fewest Steps) if there is no other operation that
intervenes; but in the case at hand, the operation spell-out that creates the
branching in the derivation to PF and LF must intervene between overt low-
ering and covert raising in the derivation in (17). Hence, (17) still involves
two applications of Form Chain.
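The counting convention behind Form Chain can be illustrated with a short Python sketch. It rests on assumptions of ours: a derivation is flattened into a list of `(kind, item)` pairs, only applications of Form Chain are counted, and consecutive movements of the same item merge into one chain unless some other operation intervenes. For the lowering-and-raising case, the moved element is labelled `"I"` in both steps, which is a simplification that gives merging its best chance.

```python
def count_form_chain(ops):
    """Count applications of Form Chain in a sequence of operations.

    ops -- list of (kind, item) pairs, e.g. ("move", "how") or ("spell-out", None).
    Consecutive moves of the same item count once; any intervening operation
    closes the current chain.
    """
    count = 0
    open_chain = None  # item whose chain can still be extended
    for kind, item in ops:
        if kind == "move":
            if item != open_chain:
                count += 1       # a new chain is formed
            open_chain = item
        else:
            open_chain = None    # e.g. spell-out intervenes
    return count


# (24-a): three successive-cyclic steps of the same wh-phrase
successive_cyclic = [("move", "how")] * 3
# (24-b): one-swoop movement
one_swoop = [("move", "how")]
# (17): overt lowering, then spell-out, then covert raising
lowering_then_raising = [("move", "I"), ("spell-out", None), ("move", "I")]

print(count_form_chain(successive_cyclic))      # 1
print(count_form_chain(one_swoop))              # 1
print(count_form_chain(lowering_then_raising))  # 2
```

On this way of counting, successive-cyclic and one-swoop movement tie, whereas the derivation in (17) remains more costly than the one in (16).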
3.1.2 Wh-Topicalization in Epstein (1992)
Another application of the Fewest Steps condition is the account of the ban
on wh-topicalization in Epstein (1992). As noted above, topicalization is in
principle optional in English; cf. (21). For many speakers, topicalization is
also optionally possible in contexts like (26), where the target position is
in an embedded clause and the matrix clause involves short wh-movement.
Given the qualification that competing derivations must have identical LFs,
this poses no problem for Fewest Steps.
(26) a. Who_1 t_1 said [CP that [IP Mary gave a book to John_2 ]] ?
b. Who_1 t_1 said [CP that to John_2 [IP Mary gave a book t_2 ]] ?
Interestingly, embedded topicalization becomes impossible when the item
that is topicalized is a wh-phrase; cf. (27-b). Note that the embedded wh-phrase
may stay in situ in overt syntax, giving rise to a multiple question
interpretation; cf. (27-a).
(27) a. Who_1 t_1 said [CP that [IP Mary gave a book to whom_2 ]] ?
b. *Who_1 t_1 said [CP that to whom_2 [IP Mary gave a book t_2 ]] ?
Epstein proposes deriving the ban on wh-topicalization in (27-b) from the
Fewest Steps condition. The derivations D_1 (generating (27-a)) and D_2 (generating
(27-b)) are in the same reference set. Assuming that all wh-phrases
must be in the domain of a SpecC[+wh] at LF, they both end up with the LF
representation in (28):
(28) *Who_1 to whom_2 t_1 said [CP that [IP Mary gave a book ]] ?
D_1 reaches this LF by applying one (covert) instance of wh-movement to the
embedded object to whom_2. Note that there is only one movement operation
in this case, either because LF movement of arguments does not have to be
successive cyclic, or because successive-cyclic covert movement can be analyzed
as one instance of Form Chain. D_2, on the other hand, reaches the
same LF by applying two instances of wh-movement to the embedded object
to whom_2 (viz., one overtly and one covertly - given the intervening spell-out
operation, these two movement operations cannot be reanalyzed as one
instance of Form Chain). Hence, D_1 blocks D_2 via Fewest Steps.
As shown in Müller & Sternefeld (1996), the same kind of analysis may be
given for the ban on wh-scrambling in German, which is illustrated in (29).
(29) a. Warum_1 hat der Fritz was_2 gelesen ?
why has ART Fritz what read
b. *Warum_1 hat was_2 der Fritz t_2 gelesen ?
why has what ART Fritz read
However, it is also shown in Müller & Sternefeld (1996) that the Fewest Steps
approach to the ban on optional movement of wh-phrases which is later undone
by further, covert operations is not entirely unproblematic, and may
necessitate additional assumptions. For one thing, German exhibits the same
ban on wh-topicalization as English:
(30) a. Wer_1 sagte t_1 [CP daß Maria wem_2 ein Buch gegeben hat_3 ] ?
who said that Maria whom a book given has
b. *Wer_1 sagte t_1 [CP wem_2 hat_3 Maria t_2 ein Buch gegeben t_3 ] ?
who said whom has Maria a book given
This strongly suggests an identical account in terms of Fewest Steps. How-

ever, since German topicalization always requires V/2 movement, and since
V/2 movement is incompatible with the presence of a complementizer in Ger-
man, the derivations generating (30-b) and (30-a) do not share an identical
numeration, and we would wrongly expect no competition to arise.¹⁰ Thus,
to accommodate this evidence, it seems as though the definition of reference set
must be revised, as in (31) - given complementizer deletion at LF, (30-a,b)
can be assumed to be identical at this level.
(31) Reference Set (second revision):
Two derivations D_1 and D_2 are in the same reference set iff they (i)
have the same LF, and (ii) respect all local constraints.
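The three definitions of reference set considered so far differ only in what two derivations must share in order to compete. The Python sketch below makes that difference explicit. It is an illustration under assumptions of ours: the class `Derivation`, the LF labels (plain strings standing in for full LF representations), and the dictionary `DEFINITIONS` are hypothetical, and all derivations are taken to respect the local constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    numeration: frozenset  # lexical items used
    lf: str                # placeholder label for the LF representation
    local_ok: bool = True  # assumed to respect all local constraints


# What clause (i) demands under each definition of reference set.
DEFINITIONS = {
    "(19)": lambda a, b: a.numeration == b.numeration,
    "(22)": lambda a, b: a.numeration == b.numeration and a.lf == b.lf,
    "(31)": lambda a, b: a.lf == b.lf,
}


def compete(a, b, definition):
    """True if a and b fall into the same reference set under the given definition."""
    return a.local_ok and b.local_ok and DEFINITIONS[definition](a, b)


def verdicts(a, b):
    return {name: compete(a, b, name) for name in DEFINITIONS}


# (21-a,b): same numeration, LFs assumed to differ under topicalization
english = frozenset({"Mary", "gave", "a", "book", "to", "John"})
plain = Derivation(english, "LF without topicalization")
topicalized = Derivation(english, "LF with topicalization")

# (30-a,b): numerations differ in the complementizer, LFs assumed identical
with_comp = frozenset({"wer", "sagte", "daß", "Maria", "wem", "ein", "Buch", "gegeben", "hat"})
german_a = Derivation(with_comp, "LF of (30)")
german_b = Derivation(with_comp - {"daß"}, "LF of (30)")

print(verdicts(plain, topicalized))
print(verdicts(german_a, german_b))
```

Under these assumptions, the English topicalization pair competes only under (19), and the German pair competes only under (31), which is the pattern the two revisions are designed to produce.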
Moreover, it turns out that there are several well-formed constructions attested
in the world's languages in which wh-phrases can in fact undergo optional
overt fronting to a non-target position. In Müller & Sternefeld (1996),
we discuss evidence from partial wh-movement, wh-imperatives, and
wh-reconstruction. For the present purposes, the example of optional partial
wh-movement to a SpecC[-wh] position in Ancash Quechua may suffice (cf.
Cole 1982). (32-a) shows that wh-phrases may be fronted to a SpecC[+wh]
target position in overt syntax in Ancash Quechua; (32-d) shows that wh-phrases
may also stay in situ in overt syntax, raising (by assumption) to the
SpecC[+wh] position in covert syntax. Interestingly, (32-b) and (32-c) are also
possible. Here, the wh-phrases raise to an intermediate SpecC[-wh] overtly.
Given that this implies an additional wh-movement operation at LF, Epstein's
(1992) Fewest Steps approach should rule out these cases.
(32) a. [CP Ima-ta-taq1 (qam) kreinki [CP t''1 Maria muna-nqa-n-ta [CP t'1
what-acc you believe Maria want-nom-3-acc
José t1 ranti-na-n-ta ]]] ?
José buy-nom-3-acc
b. [CP — (Qam) kreinki [CP ima-ta-ta1 Maria muna-nqa-n-ta [CP t'1
c. [CP — (Qam) kreinki [CP — Maria muna-nqa-n-ta [CP ima-ta-ta1
d. [CP — (Qam) kreinki [CP — Maria muna-nqa-n-ta [CP — José ima-ta-ta1
ranti-na-n-ta ]]] ?
The conclusion drawn in Müller & Sternefeld (1996) in view of well-formed
constructions like this one is that reference sets should be significantly re-
duced in size by assuming that identity of surface structure (rather than iden-
tity of LF) matters in the definition of reference sets, as in (33).
(33) Reference Set (third revision):
Two derivations D1 and D2 are in the same reference set iff they (i)
have the same surface structure, and (ii) respect all local constraints.
This way, partial wh-movement is permitted, but it is clear that much (in fact,
most) of the original evidence in favor of Fewest Steps is lost: Thus, on this
view, neither French V-in situ, nor English (or German) wh-topicalization can
be ruled out by Fewest Steps anymore. As noted in Sternefeld (1997), this
situation might be viewed as indicative of a general problem with translocal
constraints: A significant reduction of competition in reference sets may be
empirically desirable so as to account for cases of optionality (as in partial
wh-movement constructions); but as an unwanted side effect, it also threatens
to undermine the notion of translocal economy itself: Many ill-formed
derivations that could be ruled out by translocal constraints will now survive
because the more economical derivation is not part of the same reference set
anymore. Finding a suitable definition of reference set that is weak enough to
permit optionality and strong enough to actually do some work is one of the
fundamental concerns of all versions of the minimalist program that employ
the notion of competition.
3.1.3 Freezing in Collins (1994)
Evidence for yet another definition of reference sets comes from Collins'
(1994) account of freezing effects with A-movement in English. As shown
in (34-a,b), subject NPs are islands for extraction in English, whereas object
NPs permit extraction (with certain types of verbs). In the present context, the
interesting case is that of subject NPs that originate in object position, as in
the case of passivization. As can be seen in (34-c), such derived subject NPs
are also islands.
(34) a. Who1 did John take [NP a picture of t1 ] ?
b. *Who1 is [NP a picture of t1 ] on sale ?
c. *Who1 was [NP2 a picture of t1 ] taken t2 by John ?
In a derivational approach, it must be shown that any derivation of (34-c)
leads to illformedness. In one derivation, D1, NP raising to subject position
applies before wh-extraction from NP takes place.
(35) a. [CP — was [IP — taken [NP2 a picture of who1 ] by John ]]
b. [CP — was [IP [NP2 a picture of who1 ] taken t2 by John ]]
c. *[CP who1 was [IP [NP2 a picture of t1 ] taken t2 by John ]]
Here, extraction of who1 from NP2 (which is in subject position already,
hence non-L-marked, and therefore a barrier) violates a local constraint like
the CED.11
(36) CED ("Condition on Extraction Domain"):

Movement must not cross a barrier.
In another derivation, D2, wh-movement precedes NP raising:
(37) a. [CP — was [IP — taken [NP2 a picture of who1 ] by John ]]
b. [CP who1 was [IP — taken [NP2 a picture of t1 ] by John ]]
c. [CP who1 was [IP [NP2 a picture of t1 ] taken t2 by John ]]
This derivation violates another local constraint, the Strict Cycle Condition
in (38). The reason is that NP raising targets the subject position. The subject
position is included in the CP domain, which has already been affected by
wh-movement to SpecC earlier in the derivation.
(38) SCC ("Strict Cycle Condition"):

No movement operation may target a landing site that is included in
a domain that has already been affected by movement earlier in the
derivation.
So far, translocal constraints are not needed in an account of the illformedness
of (34-c). The Fewest Steps condition does become relevant, though, when
we consider a third derivation, D3. This derivation proceeds by what Collins
(1994) calls chain interleaving. First, the wh-phrase who1 is extracted from
the object NP2 while NP2 is still in situ (i.e., transparent for extraction); who1
adjoins to VP. Second, NP raising to the subject position takes place. Finally,
who1 moves from its intermediate position to SpecC; see (39).
(39) a. [CP — was [IP — taken [VP [NP2 a picture of who1 ] by John ]]]
b. [CP — was [IP — taken [VP who1 [VP [NP2 a picture of t1 ] by John ]]]]
c. [CP — was [IP [NP2 a picture of t1 ] taken [VP who1 [VP t2 by John ]]]]
d. [CP who1 was [IP [NP2 a picture of t1 ] taken [VP t'1 [VP t2 by John ]]]]
D1 violates the CED; D2 violates the SCC. D3 violates neither of these local
constraints. However, D3 is blocked by D1 and D2 via Fewest Steps: Other
things being equal, D3 needs three movement steps where D1 and D2 make
do with two movement steps.12
This approach has an important consequence for the definition of reference
sets. The three derivations D1, D2, and D3 yield the same surface string,
which is ill formed. Thus, the more economical derivations that block D3
via Fewest Steps are not well-formed derivations, as in the applications of
Fewest Steps discussed above, but rather ill-formed derivations that violate
local constraints, viz., the CED and the SCC. This reasoning implies that ref-
erence sets can in fact not be defined as assumed so far, by requiring that only
those derivations can compete that satisfy all local constraints - in the case
at hand, D1 and D2 violate local constraints. Still, we cannot simply drop
this requirement in the definition of reference sets; otherwise, all instances
of movement would invariably be blocked in favor of in-situ derivations by
Fewest Steps, and syntactic derivations would be fairly trivial. It seems that
what is needed in view of this conflicting evidence is a relativized notion of
local constraint satisfaction.
In this context, the idea of convergence of derivations introduced in Chom-
sky (1993) becomes relevant: Only those derivations that converge can com-
pete with respect to translocal constraints. Essentially, whereas all violations
of local constraints lead to ungrammaticality, only a subset of violations of
local constraints also leads to non-convergence. Ungrammatical derivations
that converge may then still be used to block other derivations as ungrammat-
ical, as in the freezing construction discussed by Collins (1994). It is an em-
pirical issue how convergence is to be defined. As a rule of thumb, and for the
present purposes, we can say that a violation of those constraints that trigger
movement (like the Wh-Criterion, the Extended Projection Principle (EPP),
which triggers subject raising, or whatever constraint optionally triggers top-
icalization) leads to non-convergence, whereas a violation of constraints like
the CED and the SCC permits convergence of a derivation.13
Under these assumptions, the notion of reference set needed for the ap-
proach in Collins (1994) can be defined as in (40). Note that the analysis is
compatible with assuming that either numerations, or surface structures, or
LF representations, or any combination of these determines the competition;
hence, a commitment to one of these options is not necessary in the case at
hand.
(40) Reference Set (fourth revision):
Two derivations D1 and D2 are in the same reference set iff they (i)
have the same numeration/surface structure/LF, and (ii) converge.
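The interplay of convergence, Fewest Steps, and local constraints in the freezing case can be made concrete with a minimal Python sketch. It is only an illustration of the reasoning above, not part of Collins' proposal: the class `Derivation`, its fields, and the two helper functions are hypothetical names, and the step counts and violations are simply those stated in the text for D1, D2, and D3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Derivation:
    name: str
    steps: int                     # number of movement operations
    converges: bool                # False if a movement-triggering constraint is violated
    local_violations: tuple = ()   # violated local constraints that still allow convergence

def fewest_steps_survivors(reference_set):
    """Convergent derivations that no competitor beats on step count."""
    competitors = [d for d in reference_set if d.converges]
    if not competitors:
        return []
    fewest = min(d.steps for d in competitors)
    return [d for d in competitors if d.steps == fewest]

def grammatical(reference_set):
    """Survivors of Fewest Steps that also violate no local constraint."""
    return [d for d in fewest_steps_survivors(reference_set) if not d.local_violations]

# The three derivations of (34-c), with the step counts given in the text.
d1 = Derivation("D1", steps=2, converges=True, local_violations=("CED",))
d2 = Derivation("D2", steps=2, converges=True, local_violations=("SCC",))
d3 = Derivation("D3", steps=3, converges=True)
freezing = [d1, d2, d3]

print([d.name for d in fewest_steps_survivors(freezing)])  # ['D1', 'D2']
print(grammatical(freezing))                               # []
```

D3 loses to the shorter D1 and D2, which are themselves excluded by the CED and the SCC, so no derivation of (34-c) comes out as grammatical.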
To end the discussion of Fewest Steps, we would like to emphasize that there
is no inherent reason why the notion of an "operation" that is mentioned in
the Fewest Steps condition should be confined to movement. Indeed, Chom-
sky & Lasnik (1993) argue that the deletion of intermediate traces in the LF
component (which is argued to be an option with arguments and impossible
with adjuncts in Chomsky 1986a, Lasnik & Saito 1992, and related work) is
also regulated by the Fewest Steps condition.
There have been many more applications of the Fewest Steps condition in
the minimalist program (see, e.g., the Fewest Steps account of the ban on
semantically vacuous quantifier raising in Fox 1995), but these may suffice
for the time being.14 Let us now consider the translocal economy constraint
Shortest Paths.
3.2 Shortest Paths
The Shortest Paths condition can be defined as follows (cf. Chomsky 1993,
1995):
(41) Shortest Paths:
If two derivations D1 and D2 are in the same reference set and the
movement paths of D1 are shorter than the movement paths of D2,
then D1 is to be preferred over D2.
Various applications of this condition have been suggested in minimalist syntax.
Perhaps the most striking one concerns the derivation of superiority effects.
3.2.1 Superiority Effects in Chomsky (1993) and Kitahara (1993)
Superiority effects in English are illustrated by the examples in (42).
(42) a. I wonder [CP who1 C [IP t1 bought what2 ]]
b. *I wonder [CP what2 C [IP who1 bought t2 ]]
c. Whom1 did John persuade t1 [CP to visit whom2 ] ?
d. *Whom2 did John persuade whom1 [CP t'2 to visit t2 ] ?
The Superiority Condition proposed by Chomsky (1973) demands that in
cases where there are two (or more) wh-phrases that could in principle
be moved to a given SpecC[+wh] position, only the highest wh-phrase can
undergo such wh-movement overtly, i.e., the one that asymmetrically c-commands
the other(s). This condition is respected in (42-a,c). The examples
in (42-b,d) are ungrammatical because the highest wh-phrase in the clause
has failed to undergo overt movement to SpecC; rather, the lower wh-object
has moved to SpecC.
As indicated in Chomsky (1993) and argued extensively in Kitahara (1993),
superiority effects can systematically be accounted for by the translocal condition
Shortest Paths. For instance, the movement path from t1 to whom1 in
(42-c) is shorter than the movement path from t2 to whom2 in (42-d). Hence,
given that the two derivations D1 and D2 generating (42-c) and (42-d), respectively,
compete (which would follow from most definitions of reference
sets envisaged above), D1 blocks D2 as ungrammatical by Shortest Paths.
Or does it? Recall that at least some of the evidence for Fewest Steps (the
ban on V-in situ in French, the ban on wh-topicalization and wh-scrambling
in English and German) has relied on the assumption that covert movement
counts in the same way that overt movement does. But assuming that
LF movement also counts for the Shortest Paths condition leads straightforwardly
into a dilemma: In the case at hand, the derivation that has the shorter
overt wh-movement path invariably has the longer covert wh-movement path,
and it seems that by LF, both derivations have wh-movement paths of equal
length. Hence, ceteris paribus, both should be well formed. In view of this,
several steps can be taken. First, one can assume that there is in fact no covert
wh-movement of any kind; this makes it possible to maintain the Shortest
Paths account of superiority phenomena without qualification, but is incompatible
with the Fewest Steps applications sketched above. Second, one might
explicitly distinguish between Fewest Steps and Shortest Paths in this respect:
Whereas Fewest Steps compares whole derivations, Shortest Paths compares
only the overt parts of derivations. A third possibility is developed in Sternefeld
(1997).
The underlying intuition of this account is that LF movement from a position
α to another position β has a chance to be shorter than overt movement
from α to β. The central observation is that the issue of whether a Shortest
Paths account of superiority effects is incompatible with covert wh-movement
is highly dependent on how path length is defined. If (movement) path length
is determined by considering the number of nodes crossed by a movement
operation, there is indeed a problem. But suppose now that path length is
determined by considering the number of complete chains that are crossed.
Now LF movement of whom2 in (42-c) will create a shorter path than overt
movement of whom2 in (42-d), even though whom2 originates in the same position
and targets the same landing site. The reason is that covert movement
of whom2 in (42-c) crosses t1, which is part of a chain, but not a complete
chain, whereas overt movement of whom2 in (42-d) crosses whom1, which
is a complete chain at this point of the derivation. Given that whom1 crosses
the same number of complete chains in the course of overt movement in D1
(generating (42-c)) and in the course of covert movement in D2 (generating
(42-d)), the small difference pertaining to movement of whom2 becomes decisive,
and D1 successfully blocks D2 as ungrammatical via Shortest Paths,
even under the assumption that covert wh-movement exists and is relevant for
the Shortest Paths condition.
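The contrast between the two ways of measuring path length can be sketched in a few lines of Python. This is a toy model under loudly stated assumptions: the node counts in `NODES_TO_SPEC_C` are invented, only the two wh-phrases of (42-c,d) are tracked (all other chains are taken to be crossed equally often in both derivations, as the text assumes), and the function names are hypothetical.

```python
# Hypothetical structural facts about (42-c,d): whom1 is the higher wh-phrase.
HEIGHT = {"whom1": 1, "whom2": 2}           # 1 = structurally highest
NODES_TO_SPEC_C = {"whom1": 3, "whom2": 6}  # invented node counts

def node_length(derivation):
    """Total nodes crossed; overt and covert steps count alike."""
    return sum(NODES_TO_SPEC_C[item] for item, _ in derivation)

def chain_length(derivation):
    """Total complete wh-chains crossed. A higher wh-phrase counts as a
    complete chain only while it is still unmoved; once it has moved,
    the mover crosses just its trace, which does not count."""
    moved, total = set(), 0
    for item, _ in derivation:
        total += sum(1 for other in HEIGHT
                     if HEIGHT[other] < HEIGHT[item] and other not in moved)
        moved.add(item)
    return total

# Movement steps in derivational order, tagged overt or covert.
D1 = [("whom1", "overt"), ("whom2", "covert")]   # generates (42-c)
D2 = [("whom2", "overt"), ("whom1", "covert")]   # generates (42-d)

print(node_length(D1), node_length(D2))    # 9 9
print(chain_length(D1), chain_length(D2))  # 0 1
```

With node counting the two derivations tie once covert movement is included, which is the dilemma described above; with chain counting D1 comes out shorter and can block D2.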
Another interesting issue raised by the Shortest Paths account of superior-
ity effects is posed by what one might call "LF-optionality." Sentences like
(43-a) have two possible readings (see Baker 1970) that correspond to two
different LF representations, given LF movement of wh-in situ elements.
(43) Who1 t1 wonders [CP where2 we bought what3 t2 ] ?
a. who1 what3 t1 wonders [CP where2 we bought t3 t2 ]
Answer: John wonders where we bought the books, Mary wonders
where we bought the records, etc.
b. who1 t1 wonders [CP where2 what3 we bought t3 t2 ]
Answer: John wonders where we bought what, Mary wonders
where we bought what, etc.
Given that all wh-in situ phrases must undergo movement to a SpecC[+wh]
position at LF, D2 (creating (43-b)) should block D1 (creating (43-a)) because
D1's paths are longer. Again, there are several possible solutions.15 As before,
one might stipulate that covert wh-movement either does not exist, or does not
count with respect to the Shortest Paths condition. Alternatively, this evidence
could be viewed as a further argument that reference sets are defined in such
a way that competing derivations must have identical LF representations.
3.2.2 Yo-Yo Movement in Collins (1994)
The term yo-yo movement characterizes a combination of lowering and raising
operations affecting a single item in the course of a derivation, or even
within the overt part of a derivation. Derivations employing yo-yo movement
are identified as problematic in Chomsky (1986a) (the main observation being
attributed to Andy Barss), but envisaged as legitimate possibilities in Lasnik
& Saito (1992). Collins (1994) shows that the availability of yo-yo movement
would make a wrong prediction for the West African language Ewe, and at-
tempts to derive a ban on yo-yo movement from the Shortest Paths condition.
Ewe is among the languages that show reflexes of successive-cyclic wh-
movement in the C domain. The reflex of successive cyclicity concerns the
morphological form of the 3.Pers.Sing. subject pronoun in the canonical subject
position. The regular form of the pronoun is é; cf. (44-a). The regular
pronoun é can be replaced by wo in cases of long-distance extraction (focus
movement, in the case at hand); cf. (44-b).
(44) a. Kofi gblö [CP be é/*wo fo Kösi ]
Kofi said that he hit Kösi
b. Kofi1 ε me gblö [CP (t'1) be é/wo fo t1 ]
Kofi Foc I said that he hit
Collins assumes as the correct underlying generalization that é is replaced by
wo if and only if the local SpecC position is filled. Accordingly, we can postulate
that the apparent optionality of wo in (44-b) is due to an option for long-distance
A-bar movement of arguments in Ewe to apply either successive-cyclically,
via SpecC (in which case wo is obligatory), or in one swoop (in
which case wo is impossible). It does not come as a surprise from a pretheoretical
point of view that A-bar movement that originates in the matrix clause
and targets a SpecC position there does not trigger the morphological change
in the embedded subject position:
(45) Kofi1 ε me gblö na t1 [CP be é/*wo fo Kösi ]
Kofi Foc I said to that he hit Kösi
Still, to ensure that wo is impossible in (45), a derivation like (46) that employs
yo-yo movement must be ruled out. In this derivation D1, Kofi1 is first
lowered to the embedded SpecC position, licensing wo in the subject position
there, and then raised to the target SpecC position in the matrix clause.16
(46) a. Foc [IP I [VP said [PP to Kofi1 ] [CP that [IP he hit Kösi ]]]]
b. Foc [IP I [VP said [PP to t1 ] [CP Kofi1 that [IP he hit Kösi ]]]]
c. Kofi1 Foc [IP I [VP said [PP to t1 ] [CP t'1 that [IP he hit Kösi ]]]]
There are various possibilities to exclude such a derivation (relying, e.g.,
on versions of the SCC or versions of the Proper Binding Condition). Still,
Collins (1994) observes that D1 in (46) is blocked by the derivation D2 in (47)
via Shortest Paths. D2 proceeds without intermediate lowering.17
(47) a. Foc [IP I [VP said [PP to Kofi1 ] [CP that [IP he hit Kösi ]]]]
b. Kofi1 Foc [IP I [VP said [PP to t1 ] [CP that [IP he hit Kösi ]]]]
A final remark is due concerning the notion of reference set presupposed by
this analysis. Since "he" is a wo in D1 and an é in D2, it is clear that this
difference must not suffice to create different reference sets. This can be accomplished
in a number of ways, e.g., by defining reference sets with respect
to a level of representation at which the difference in pronoun shape is invisible
(possibly LF), or by explicitly stipulating in the definition of reference set
that minor differences like the one at hand do not suffice to create different
competitions. It is this latter strategy that is also pursued by Nakamura (1998)
in his approach to wh-movement in Tagalog.
3.2.3 Tagalog Wh-Movement in Nakamura (1998)
A generalization underlying wh-movement in the Austronesian language
Tagalog is that only the highest A-position of a given clause (the subject position)
is accessible for wh-movement; Nakamura (1998) assumes this to be
SpecT (or SpecI). In constructions in which an agent NP occupies the highest
A-position (the so-called Agent Topic (AT) constructions), this NP can be
wh-moved; an NP bearing a different Theta-role that shows up in an object
position cannot undergo wh-movement; cf. (48).18
(48) a. [CP Sino1 ang [XP t'1 b-um-ili [VP t1 tV ng damit2 ]]] ?
who Ang bought-AT dress-inh
'Who is the one that bought the dress?'
b. *[CP Ano2 ang [XP si Juan1 b-um-ili [VP t1 tV t2 ]]] ?
what Ang Juan-abs bought-AT
'What is the thing that Juan bought?'
A different marking on the verb triggers the so-called Theme Topic (TT) construction.
Here, the theme NP occupies the structural subject position SpecT;
and indeed, only the theme NP can undergo wh-movement; cf.:
(49) a. *[CP Sino1 ang [XP damit2 b-in-ili [VP t1 tV t2 ]]] ?
who Ang dress-abs bought-TT
'Who is the one that bought the dress?'
b. [CP Ano2 ang [XP t'2 b-in-ili [VP ni Juan tV t2 ]]] ?
what Ang bought-TT Juan-erg
'What is the thing that Juan bought?'
Nakamura's (1998) basic idea is that the derivations generating (48-a) and
(49-a) compete, as do the derivations generating (48-b) and (49-b). The
derivations underlying (48-a) and (49-b) can then block their respective competitors
as ungrammatical because of the Shortest Paths constraint. To see
this, consider the case of wh-movement of the theme NP in (48-b) and (49-b).
The movement path from the VP-internal object position to the SpecC target
position in (48-b) is longer than the path from the subject position SpecT to
SpecC in (49-b). Consequently, the Shortest Paths condition guarantees that
the derivation generating (49-b) blocks the derivation generating (48-b) as ungrammatical.
An analogous account is available for the agent wh-movement
case in (48-a) vs. (49-a).
As Nakamura observes, this analysis raises two further potential problems.
First, we have to ensure that derivations can compete even though they do
not have identical lexical material - the Agent Topic and the Theme Topic
constructions clearly differ in lexical make-up. Nakamura accomplishes this
by replacing the notion of "identical numeration" in the definition of refer-
ence set with the more liberal notion of "non-distinct numeration;" the latter
is defined in such a way that two numerations that only differ with respect
to functional features do not count as distinct. (Clearly, this raises some non-
trivial questions for other languages in which competitions of the type that
Nakamura postulates seem unwanted.)
Second, the derivation that generates, e.g., (49-b) may minimize the wh-path
in comparison with the derivation that generates (48-b), but it increases
path lengths in the A-domain. It is not quite clear how problematic this is;
in the case presently under consideration, the A-chain formed in (49-b) by
theme raising is only minimally longer than the A-chain formed in (48-b) by
agent raising, whereas the wh-chain formed in (49-b) is much shorter than
the wh-chain formed in (48-b). There would be even less of a problem for the
agent wh-extraction case in (48-a) and (49-a). In any event, Nakamura (1998)
replaces the notion of "movement paths" in the definition of the Shortest
Paths condition with the more specific notion of "comparable chain links."
This yields the effect that, e.g., the derivation generating (49-b) blocks the
derivation generating (48-b) just because the former derivation's wh-chain
links are shorter than the latter derivation's comparable wh-chain links, irrespective
of the length of other chain links created by A-movement, V raising,
etc.
3.2.4 Freezing in Chomsky (1995)
Recall the English freezing construction in (34-c), which is repeated here:
(50) *Who1 was [NP2 a picture of t1 ] taken t2 by John ?
Above, we have considered three derivations. If NP raising precedes wh-movement,
the CED is violated. If wh-movement precedes NP raising, the
SCC is violated; the CED and the SCC are both local constraints. Finally, if
chain interleaving occurs, this derivation can be excluded by invoking the
Fewest Steps condition (cf. Collins 1994). Chomsky (1995) suggests that
translocal economy constraints might play an even bigger role in accounting
for the illformedness of (50). The idea is that the second derivation can
be excluded without recourse to something like the SCC; the Shortest Paths
condition can do this just as well. Consider again the two derivations that are
compatible with Fewest Steps:
(51) a. (i) [CP — was [IP — taken [NP2 a picture of who1 ] by John ]]
(ii) [CP — was [IP [NP2 a picture of who1 ] taken t2 by John ]]
(iii) *[CP who1 was [IP [NP2 a picture of t1 ] taken t2 by John ]]
b. (i) [CP — was [IP — taken [NP2 a picture of who1 ] by John ]]
(ii) [CP who1 was [IP — taken [NP2 a picture of t1 ] by John ]]
(iii) [CP who1 was [IP [NP2 a picture of t1 ] taken t2 by John ]]
Chomsky's (1995:328) suggestion reads as follows: "Passive [i.e., NP raising]
is the same in both [derivations]; wh-movement is 'longer' in the illicit one in
an obvious sense, object being more remote from SpecC than subject in terms
of number of XPs crossed. The distinction might be captured by a proper
theory of economy of derivation." In other words: D2 in (51-b), which does
not violate a local constraint (the SCC, by assumption, being irrelevant or
dispensable), is blocked as ungrammatical via Shortest Paths by the more
economical D1 in (51-a), which does violate a local constraint (the CED) but
converges.
3.3 Procrastinate
Chomsky (1993, 1995) assumes the following local condition as a trigger for
overt movement.
(52) Feature Condition:

a. Strong features must be checked in overt syntax.
b. Weak features must be checked by LF.
As in the approach in Chomsky (1991), French I features are classified
as "strong," English I features as "weak." However, whereas the constraint
Strength of I (cf. (15)) in the 1991 analysis states that strong I tolerates V
raising, the Feature Condition forces V raising to a strong I. Another dif-
ference between the two analyses is that the 1993 model does not require
I lowering in the syntax anymore if V does not overtly raise to I. Looking
back at the French/English paradigm in (14), we can now see that the original
problem has disappeared: (53-a) respects the Feature Condition, (53-b) does
not, and there is no need to invoke a translocal economy constraint to exclude
(53-b).
(53) a. Jean embrasse1 souvent [VP t1 Marie ]
b. *Jean souvent [VP embrasse1 Marie ]
c. *John kisses1 often [VP t1 Mary ]
d. John often [VP kisses1 Mary ]
However, a new problem has appeared: Assuming LF V-to-I movement in
(53-d), this derivation respects the Feature Condition; but it seems that (53-c),
with overt V-to-I movement, does so as well. This derivation was ruled out
by Strength of I in Chomsky's (1991) approach, but with this condition gone,
something else must be said. More generally, a condition is needed that guarantees
that overt movement is possible only if it is forced by the Feature
Condition, i.e., in the presence of strong features. This is achieved by the Procrastinate
condition, which is explicitly formulated as a translocal constraint
in Marantz (1995:357).
(54) Procrastinate:
If two derivations D1 and D2 are in the same reference set, and D1
differs from D2 in that an item α is moved covertly in D1 and overtly
in D2, then D1 is to be preferred over D2.
Procrastinate blocks (53-c) in favor of (53-d), which delays V-to-I movement
to the LF component. However, (53-b) does not block (53-a), assuming that
only those derivations can compete that converge, and that (53-b) does not
converge because of its Feature Condition violation.
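The division of labor between the Feature Condition and Procrastinate in (53) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: each derivation is reduced to two hypothetical boolean properties (`strong_I`, `overt_raising`), and a Feature Condition violation is equated with non-convergence, as assumed in the text; none of the names below come from the cited works.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VDerivation:
    label: str
    strong_I: bool       # True for French I, False for English I
    overt_raising: bool  # True if V raises to I in overt syntax

def converges(d):
    # Feature Condition (52-a): a strong feature must be checked overtly.
    # Weak features can always be checked by LF, so they never cause a crash here.
    return d.overt_raising or not d.strong_I

def procrastinate_winners(reference_set):
    """Among convergent competitors, prefer those that move covertly."""
    competitors = [d for d in reference_set if converges(d)]
    covert = [d for d in competitors if not d.overt_raising]
    return covert if covert else competitors

french = [VDerivation("(53-a)", strong_I=True, overt_raising=True),
          VDerivation("(53-b)", strong_I=True, overt_raising=False)]
english = [VDerivation("(53-c)", strong_I=False, overt_raising=True),
           VDerivation("(53-d)", strong_I=False, overt_raising=False)]

print([d.label for d in procrastinate_winners(french)])   # ['(53-a)']
print([d.label for d in procrastinate_winners(english)])  # ['(53-d)']
```

In French the covert derivation never enters the competition, so overt raising survives; in English both derivations converge and Procrastinate selects the covert one.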
The current status of Procrastinate is somewhat unclear. There have been
attempts to dispense with this translocal constraint, either by deriving its ef-
fects from local constraints, or by reducing it to the Fewest Steps condition
(cf. Kitahara 1997).19
3.4 Merge before Move
Chomsky (1995, 1998) assumes that syntactic structures are created by al-
ternating operations of structure-building (Merge) and movement (Move). At
any given stage of the derivation, the situation can arise that it must be de-
cided whether the next step is a Merge or a Move operation. The following
translocal condition settles the issue by preferring Merge to Move if both are
possible as such; the specific formulation is based on Frampton & Gutman
(1999).
(55) Merge before Move:
Suppose that two derivations D1 and D2 are in the same reference set
and respect all local constraints, and $D_1 = (\Sigma_0, \ldots, \Sigma_n, \Sigma_{n+1}, \ldots, \Sigma_k)$
and $D_2 = (\Sigma_0, \ldots, \Sigma_n, \Sigma'_{n+1}, \ldots, \Sigma_l)$. Then D1 is to be preferred over
D2 if $\Sigma_n \to \Sigma_{n+1}$ is an instance of Merge and $\Sigma_n \to \Sigma'_{n+1}$ is an
instance of Move.
Evidence for this condition comes from expletive constructions in English.
Consider the data in (56-a,b).
(56) a. There1 seems [IP t1 to be [PP someone2 in the room ]]
b. *There1 seems [IP someone2 to be [PP t2 in the room ]]
Given the predicate-internal subject hypothesis, someone is first merged in
the SpecP position. When the derivation reaches the embedded IP domain,
the Extended Projection Principle (EPP) becomes active; this local constraint
requires filling of SpecI by either Merge or Move. Assuming that there is
part of the numeration at this stage of the derivation, two possibilities arise:
Either there is merged in SpecI (and subsequently raised to the matrix SpecI
position), as in (56-a), or someone is moved to SpecI (and there is merged
later in the derivation, directly in the matrix SpecI position), as in (56-b).
These two derivations involve an identical numeration, and they both respect
all local constraints. In this case, Merge before Move tells us to choose the
derivation underlying (56-a) and dispense with the derivation that generates
(56-b).
Given that identity of numeration is a prerequisite for competition, (57) is
correctly predicted to be possible - if there is no there present in the numera-
tion, there is no competing derivation here that could be preferred by Merge
before Move.
(57) Someone1 seems [IP t'1 to be t1 in the room ]
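The comparison that Merge before Move performs on the expletive data can be sketched as follows. This is a deliberately coarse Python illustration: a derivation is represented as a list of hypothetical `(operation, item, position)` steps rather than as a sequence of stages, the numeration is read off the Merge steps, and only the items relevant to (56) and (57) are listed.

```python
def numeration(derivation):
    """Items introduced by Merge; used here as a stand-in for the numeration."""
    return frozenset(item for op, item, _ in derivation if op == "Merge")

def merge_before_move(d1, d2):
    """Return the preferred derivation, or None if the condition is silent."""
    if numeration(d1) != numeration(d2):
        return None                      # different reference sets: no competition
    for s1, s2 in zip(d1, d2):
        if s1 == s2:
            continue                     # still on the shared initial part
        if (s1[0], s2[0]) == ("Merge", "Move"):
            return d1
        if (s1[0], s2[0]) == ("Move", "Merge"):
            return d2
        return None                      # first difference is not Merge vs. Move
    return None

d56a = [("Merge", "someone", "SpecP"),
        ("Merge", "there", "embedded SpecI"),
        ("Move", "there", "matrix SpecI")]
d56b = [("Merge", "someone", "SpecP"),
        ("Move", "someone", "embedded SpecI"),
        ("Merge", "there", "matrix SpecI")]
d57 = [("Merge", "someone", "SpecP"),
       ("Move", "someone", "embedded SpecI"),
       ("Move", "someone", "matrix SpecI")]

print(merge_before_move(d56a, d56b) is d56a)  # True
print(merge_before_move(d57, d56a))           # None
```

At the first point of divergence, merging there in the embedded SpecI wins over moving someone, so the derivation of (56-a) is preferred; the derivation of (57) has no there in its numeration and therefore faces no competitor.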
The question arises of whether there is a deeper reason why Merge operations
count as more economical than Move operations. Chomsky (1995, 1998) suggests
that Move is to be defined in terms of Merge, which would make it
inherently more complex, and this fact might ultimately be exploited in an attempt
to derive the Merge before Move condition. Chomsky (1998:14) himself
remarks: "Good design conditions would lead us to expect that simpler
operations are preferred to more complex ones, so that Merge ... preempt[s]
Move, which is a 'last resort,' chosen when nothing else is possible."
3.5 Conclusion
The four translocal constraints discussed so far do not yet exhaust the list of
translocal constraints that have been proposed; see, e.g., the translocal Econ-
omy of Representation constraint in Chomsky (1991), or the translocal Pref-
erence Principle for Reconstruction in Chomsky (1993). Still, the constraints
discussed here can be considered representative. At this point, we can address
the question of what the structure of a minimalist syntax with translocal con-
straints looks like. Such a syntax has two parts. In the first part, derivations
are created by structure-building (Merge), movement (Move), deletion, and
perhaps other operations. Convergent derivations are assembled in reference
sets according to criteria that must be decided on (see the above definitions
of reference sets for some options). In the second part, translocal constraints
choose among the competing derivations and thus determine the wellformed-
ness of sentences. In essence, then, it turns out that a minimalist syntax with
translocal constraints has exactly the shape that Prince & Smolensky (1993)
attribute to an optimality-theoretic grammar: A first generator part (called
Gen) creates the candidate set (= reference set, in minimalist syntax); Gen has
only local constraints. A second "harmony"-evaluation part (called H-Eval)
determines the optimal candidate(s) (= derivation(s), in minimalist syntax) in
the candidate set. More generally, we will see that all kinds of competition-
based syntax have this structure, which is schematically shown in (58).
(58) Structure of a competition-based syntax:
a. Gen creates the candidate set {C1, C2, ...}.
b. H-Eval determines the optimal candidate(s) Ci (Cj, ...) in
{C1, C2, ...}.
Two issues concerning the notions of optimality and grammaticality in minimalist
syntax remain to be clarified.20 First, does optimality equal grammaticality?
Whereas this is the case in optimality-theoretic syntax (see below),
things are slightly more involved in minimalist syntax. As we have seen, it
has been argued that derivations that converge can enter the competition, even
though they may violate certain local constraints (recall the discussion of
freezing effects in English). Accordingly, an optimal candidate may be one
that violates a local constraint, and is therefore ungrammatical.
Second, we have so far left open the question of how optimality evaluation
proceeds in the presence of more than one translocal economy constraint in
the grammar. In this case, conflicts may arise. As a simple, abstract example,
suppose that there are two translocal constraints (T1 and T2), and only three
derivations (D1, D2, and D3) in the reference set.21 Suppose further that T1
prefers D1 over D2 and D3; that T2 prefers D2 over D1 and D3; and that a
derivation D0 that would be preferred by both T1 and T2 fails to converge,
so that it cannot participate in the competition. In such a situation, various
possibilities arise. A first possibility would be what we can call "tolerance."
On this view, it suffices to be selected by one translocal constraint to be
optimal (hence, potentially grammatical); consequently, both D1 and D2 would be
classified as optimal. A second possibility would be "ranking": The conflict
among translocal constraints is resolved by a ranking, such that the derivation
that is preferred by the higher-ranked constraint is optimal in the case of
conflict. If T1 is ranked higher than T2, this would imply that only D1 is
optimal. Finally, a third possibility is what we can call "breakdown": In the case
of conflicting instructions made by translocal constraints, no derivation can
emerge as optimal. It turns out that this last possibility is the one that is gen-
erally assumed, and though it is not easy to come up with decisive evidence,
it also strikes us as the most adequate one (see Collins 1994, Sternefeld 1997,
and Müller 2000). On this basis, we can conclude that grammaticality can be
defined as follows in a minimalist syntax with translocal constraints:
(59) Grammaticality:
A derivation D_i is grammatical iff (a) and (b) hold:
a. D_i does not violate a local constraint.
b. D_i is optimal.
(60) Optimality in minimalist syntax:
A derivation D_i is optimal iff there is no derivation D_j in the same
reference set that is preferred over D_i by a translocal constraint.
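The three ways of resolving a conflict between translocal constraints can be made concrete with a small sketch. It is an illustration under simplifying assumptions only: each translocal constraint is modelled as a set of (preferred, dispreferred) pairs over the derivations of one reference set, and the function names are hypothetical rather than taken from any of the cited proposals.

```python
# Illustrative sketch only: a translocal constraint is modelled (by assumption)
# as a set of (preferred, dispreferred) pairs over one reference set.
REFERENCE_SET = ["D1", "D2", "D3"]   # D0 fails to converge, so it is absent
T1 = {("D1", "D2"), ("D1", "D3")}    # T1 prefers D1 over D2 and D3
T2 = {("D2", "D1"), ("D2", "D3")}    # T2 prefers D2 over D1 and D3


def beaten(d, constraint, competitors):
    """True if this constraint prefers some competitor over d."""
    return any((other, d) in constraint for other in competitors)


def tolerance(derivations, constraints):
    """Optimal iff at least one constraint selects the derivation."""
    return [d for d in derivations
            if any(not beaten(d, c, derivations) for c in constraints)]


def ranking(derivations, ranked_constraints):
    """Filter the reference set by each constraint, highest-ranked first."""
    survivors = list(derivations)
    for c in ranked_constraints:
        survivors = [d for d in survivors if not beaten(d, c, survivors)]
    return survivors


def breakdown(derivations, constraints):
    """Definition (60): optimal iff no constraint prefers a competitor."""
    return [d for d in derivations
            if not any(beaten(d, c, derivations) for c in constraints)]


print(tolerance(REFERENCE_SET, [T1, T2]))   # ['D1', 'D2']
print(ranking(REFERENCE_SET, [T1, T2]))     # ['D1']
print(breakdown(REFERENCE_SET, [T1, T2]))   # []
```

Under "breakdown", the reference set of the abstract example yields no optimal derivation at all, which is exactly what (60) predicts when two translocal constraints issue conflicting instructions.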
The minimalist system that emerges in this way is not without problems.
Some of those show up in all versions of competition-based syntax. For one
thing, since a minimalist syntax of this type involves a global competition
in a reference set that may be large, or even infinite, the overall complexity
of the system is significantly increased. For another, we have seen that it
is difficult to come up with a single, unified definition of reference set that
accommodates all available evidence that one may want to treat in terms of
translocal constraints.
Other problems are more specific and confined to the particular notion of
optimality that is employed in minimalist syntax. Most notably, the H-Eval
metric is not maximally homogeneous and simple (because it may depend on
a number of formally unrelated translocal constraints); however, it is rather
inflexible nevertheless. Specifically, all translocal constraints must be classifi-
able as economy constraints in some sense (thus, properties of sentences that
are not related to economy considerations cannot be subject to optimization).
Even more importantly, it implies that all variation among languages must
take place in the Gen part of the grammar - there is no room for parameter-
ization in the H-Eval system. It is not always obvious that this position can
be maintained in the light of conflicting empirical evidence. As an example,
consider the effect that the Shortest Paths condition has on wh-movement in
German. Recall that the Shortest Paths condition accounts for the superiority
effect with wh-movement in languages like English; cf. (42). As has often
been noted (see, e.g., Haider 1983), German does not exhibit superiority
effects of this kind:
(61) a. Ich frage mich [CP wer1 C t1 was2 gekauft hat ]
I ask myself who what bought has
b. Ich frage mich [CP was2 C wer1 t2 gekauft hat ]
I ask myself what who bought has
Still, it seems clear that the path from t1 to wer1 in (61-a) is shorter than the
path from t2 to was2 in (61-b). To avoid the result that the Shortest Paths
condition blocks (61-b) in favor of (61-a), additional assumptions concerning

Gen are therefore necessary. 22
Not least because of problems like this, there is a strong tendency in re-
cent versions of the minimalist program to dispense with translocal con-
straints - and hence, with the concept of competition - altogether; see in
particular Collins (1997), Frampton & Gutman (1999), and also Chomsky
(1995, 1998).23 Local (derivational) constraints like the Last Resort condi-
tion and the Minimal Link Condition (MLC) have been developed in Chom-
sky (1995) and much recent work as economy conditions that can take over
at least some of the work that was done by translocal constraints like Fewest
Steps and Shortest Paths. Similarly, effects that were attributed to Procrasti-
nate and Merge before Move have been shown to be derivable without invok-
ing translocal constraints.
Apart from these theory-internal considerations, it is interesting to note that
the fall of translocal constraints (and with it the fall of the concept of com-
petition) in minimalist syntax goes hand in hand with the rise of optimal-
ity theory, and hence optimality-theoretic syntax, which inherently relies on
translocality and competition. However, before turning to this approach, we
will discuss another model of competition-based syntax, one that developed
concurrently with (and as an extension of) government and binding theory:
blocking syntax.
4 Blocking Syntax
Blocking syntax was developed by DiSciullo & Williams (1987) and

Williams (1997) on the basis of Aronoff's (1976) approach to blocking in
morphology. To a large extent, it is equivalent to a syntactic theory that incor-
porates the Elsewhere Condition which has played a major role in phonology
(see Kiparsky 1982 and the references cited there). A version of blocking
syntax that is even closer to Kiparsky's approach is developed in Fanselow
(1989, 1991); this approach relies on the Proper Inclusion Principle. A block-
ing syntax has the same general form as a minimalist syntax, viz., that in
(58): Gen generates the candidates (in blocking syntax typically S-structure
representations) which are assembled in candidate sets. The competing can-
didates are then subjected to an H-Eval procedure that determines the optimal
candidate(s).
The underlying idea of blocking syntax is that synonyms are not tolerated
in natural languages. Consequently, candidate sets are defined in terms of

identity of meaning:
(62) Candidate Set:

Two candidates C_i and C_j are in the same candidate set iff they (i)
have the same meaning, and (ii) respect all local constraints.
A candidate is grammatical iff it is optimal (given (62), an optimal candidate

cannot violate a local constraint). The concept of optimality is different from
that adopted in the minimalist program. The optimal candidate is the most
specific one. Thus, more specific candidates block less specific ones: This is
the Blocking Principle.
(63) Optimality in blocking syntax:

A candidate C_i is optimal iff there is no candidate C_j in the same
candidate set that is more specific.
This approach crucially depends on how specificity is understood. In mor-

phology, irregular forms count as more specific than regular forms. In syntax,
we can assume that C_i is more specific than C_j if local constraints (or rules)
lead to C_i's distribution being more restricted than that of C_j.
4.1 Comparative Formation in Williams (1997)
With this in mind, let us consider an example that is discussed in Williams

(1997): English comparative formation.
As shown in (64), English has two ways of comparative formation: a mor-
phological one and a syntactic one. The two strategies appear to be in com-
plementary distribution.
(64) a. hot → hotter, *more hot
b. happy → happier, *more happy
c. colorful → *colorfuller, more colorful
Williams suggests that these data can be accounted for by the following two
rules, which we call rule A and rule B.
(65) a. Rule A (morphological):

Comparatives can be formed by attaching the suffix -er to monosyllabic
adjectives, and to disyllabic adjectives ending in -y.
b. Rule B (syntactic):
Comparatives can be formed by adding more in the syntax.
The candidate that employs the morphological comparative *colorfuller vi-

olates rule A, while the candidate that uses the syntactic comparative more
colorful respects both rule A (vacuously) and rule B. In contrast, candidates
like hotter or happier respect rule A (and, vacuously, rule B); but what is
left open by rule A and rule B is why candidates like *more hot or *more
happy are ungrammatical. The illformedness of these forms could be derived
by replacing rule B with rule B' in (66). However, adopting this rule would
lead to a redundancy: The context that permits morphological comparatives
in rule A is repeated in an identical form as the context that prohibits syntactic
comparatives in rule B'.
(66) Rule B':

Comparatives can be formed by adding more in the syntax, unless
the adjective is monosyllabic, or disyllabic and ending in y.
To avoid this redundancy, Williams proposes maintaining rule B instead of

adopting rule B'. The generalization that syntactic comparatives seem to be
possible only if morphological comparatives are impossible is then derived
by invoking the notion of blocking embodied in (63). Given rules A and B,
morphological comparatives are more restricted in their distribution than
syntactic comparatives - in fact, rule B imposes no specific restrictions on
syntactic comparative formation at all. Hence, if both morphological and
syntactic comparatives respect all local constraints (like rules A and B), the more
specific morphological comparative is selected as optimal. If, however, the
morphological comparative violates a local constraint (like rule A), the syn-
tactic comparative cannot be blocked anymore, and is consequently selected
as optimal.
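The interplay of rules A and B with the Blocking Principle in (63) can be sketched as follows. This is a toy illustration, not Williams' formalization: the three-word lexicon with hand-coded syllable counts, the rule names, and the idea of measuring specificity by proper inclusion of rule domains within that lexicon are all assumptions made for the example.

```python
# Toy lexicon (assumption): adjective -> number of syllables.
LEXICON = {"hot": 1, "happy": 2, "colorful": 3}


def rule_a(adj):
    """Rule A: -er attaches to monosyllables and to disyllables ending in -y."""
    n = LEXICON[adj]
    return n == 1 or (n == 2 and adj.endswith("y"))


def rule_b(adj):
    """Rule B: 'more' can be added in the syntax without restriction."""
    return True


RULES = {"morphological": rule_a, "syntactic": rule_b}


def domain(strategy):
    """All adjectives of the toy lexicon to which a strategy can apply."""
    return {adj for adj in LEXICON if RULES[strategy](adj)}


def more_specific(s1, s2):
    """s1 is more specific than s2 if its domain is properly included in s2's."""
    return domain(s1) < domain(s2)


def candidate_set(adj):
    """Gen: only strategies that respect the local rules enter the competition."""
    return [s for s in RULES if RULES[s](adj)]


def optimal(adj):
    """H-Eval, following (63): no competitor may be more specific."""
    cands = candidate_set(adj)
    return [c for c in cands if not any(more_specific(o, c) for o in cands)]


for adj in LEXICON:
    print(adj, optimal(adj))
# hot ['morphological']
# happy ['morphological']
# colorful ['syntactic']
```

The syntactic comparative surfaces only for colorful, where the morphological candidate never enters the candidate set; no rule B' is needed.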
4.2 Anaphors vs. Pronouns in Fanselow (1991)
Recall from section 2.1 the data that seem to suggest a complementary distri-
bution of anaphors and pronominals in English (at least in the domain under
discussion here).
(67) a. John1 likes himself1
b. *John1 thinks that Mary likes himself1
c. John1 thinks that Mary likes him1
d. *John1 likes him1
We have seen that standard government and binding theory accounts for these
data by invoking the principles A and B in (68).
(68) a. Principle A:
An anaphor is bound in its binding domain.
b. Principle B:
A pronominal is free in its binding domain.
However, as in the case of the comparative formation rules A and B' that
were just discussed, it seems that this approach involves a redundancy: A
generalization is missed if two separate local constraints are postulated for
anaphors and pronominals, where the context that permits one strategy is
identical to the context that precludes the other strategy (viz., the binding
domain in both cases). As noted by Fanselow (1989, 1991), Burzio (1991),
and Richards (1997), among others, a more elegant account can be given if the
notion of competition is invoked. Here, we will sketch Fanselow's blocking
approach. 24
Fanselow's analysis relies on the Proper Inclusion Principle (PIP), a version
of the Elsewhere Condition (cf. Kiparsky 1982) that can be viewed as a
translocal constraint:
(69) Proper Inclusion Principle (PIP):

a. Suppose that two feature assignment mechanisms M1 and M2 compete
in a given structure. Then, other things being equal, M2 cannot
be applied if M1 is more specific.
b. M1 is more specific than M2 if the application domain of M2 properly
includes the application domain of M1.
The feature assignment mechanisms that play a role in the present context
are (a) the assignment of the feature [+anaphoric] to an NP, and (b) the
assignment of the feature [+pronominal] to an NP - in short, reflexivization
(or reciprocalization) and pronominalization. By assumption, the assignment
of the feature [+anaphoric] is subject to (something like) Principle A,
whereas there is no comparable requirement for the assignment of the feature
[+pronominal]; i.e., Principle B is dropped. This implies that, due to Principle
A, anaphors are more restricted in their distribution than pronominals; the
application domain of pronominalization properly includes the application
domain of reflexivization. From this it follows directly that in all those cases
where both anaphors and pronominals respect all local constraints, the PIP
forces the choice of the anaphor. Pronominals can emerge only in contexts in
which anaphors are precluded (e.g., because of a violation of Principle A, as
in the examples presently under consideration).
The PIP can be viewed as a version of the blocking principle that is part
of the definition of optimality in (63). The only relevant change that must be
made for the case at hand concerns the question of which entities compete.
We can now assume that the competing items are complete syntactic objects
(syntactic candidates), rather than feature assignment mechanisms.
4.3 Conclusion
Blocking syntax is characterized by the fact that it is fairly simple in vari-

ous respects. Most importantly, blocking syntax employs a simple concept
of optimality in its H-Eval part. There is only one translocal constraint (the
Blocking Principle that selects the most specific candidate), not more than
one, as in minimalist syntax. In addition, due to the origin of the blocking
principle as a means to avoid synonymy, blocking analyses uniformly rely on
identity of meaning in the definition of candidate sets, again in contrast to the
variability involved in minimalist syntax. However, the simplicity comes at
a certain price: Harmony evaluation is even less flexible than in minimalist
syntax. First, the blocking principle by its very nature can only have a small
domain in which it is active; in general, the role of H-Eval in blocking
syntax is smaller than in the minimalist program. Second, there is no room for
parameterization in the H-Eval domain at all. And third, since blocking anal-
yses depend on complementarity of distribution, cases of optionality pose
problems that are almost insurmountable. 25 A competition-based approach
that strengthens the role of the H-Eval part of the grammar and increases
flexibility in this domain is optimality-theoretic syntax; and it is this model
that we finally turn to now.
5 Optimality-Theoretic Syntax
5.1 Basic Concepts
By definition, an optimality-theoretic syntax takes the general form in (58),

with the grammar divided into a Gen part that creates the competing candi-
dates, and an H-Eval part that selects the optimal candidate(s). Recall that
the notion of optimality in a minimalist syntax or in a blocking syntax is a
comparatively simple one: Optimality is determined by a small set of simple

translocal economy constraints in the former case (cf. (60)), and by a sin-
gle translocal blocking principle (selecting the most specific candidate) in the
latter (cf. (63)). In optimality-theoretic syntax, there is only one translocal
constraint that determines optimality: Optimal (and grammatical) is a candidate
that has the best "constraint profile" - or, more precisely, a candidate for
which there is no competitor that has a better constraint profile; cf. (70). This
definition makes it possible for more than one candidate to be optimal in a
given candidate set.
(70) Optimality in optimality-theoretic syntax:

A candidate C_i is optimal (= grammatical) iff there is no candidate
C_j in the same candidate set that has a better constraint profile.
However, the evaluation metric is internally highly complex. The notion of

constraint profile is defined in (71).
(71) Constraint Profile:
A candidate C_j has a better constraint profile than a candidate C_i iff
there is a constraint Con such that (a) and (b) hold:
a. C_j satisfies Con better than C_i; i.e., C_j satisfies Con and C_i violates
Con, or C_j violates Con less often than C_i.
b. There is no constraint Con' ranked higher than Con on which C_i and
C_j differ.
This presupposes that in addition to the local constraints employed by the

Gen component, which are inviolable and unranked, the H-Eval component
relies on a system of local constraints that are violable and ranked (and, by
assumption, universal) in order to determine the best constraint profile, hence,
optimality. The ranking among the violable local constraints of the H-Eval
component is indicated by the symbol »; the H-Eval constraints themselves
are typically written with small capitals. Optimality-theoretic competitions
are often illustrated by tables (so-called tableaux); optimality of a candidate
is indicated by the pointing finger ☞; violation of a local constraint is shown
by a star * in the appropriate column of the table; if this violation is fatal for
a candidate (i.e., responsible for its suboptimality), an exclamation mark ! is
added (redundantly). In the abstract H-Eval competition in table T1, in which
the candidate set consists of C1-C5, C1 emerges as the optimal candidate: It
avoids a violation of the high-ranked constraints A and B (unlike C3-C5), and
it minimizes a violation of the low-ranked constraint C (unlike C2). Hence,
there is no competing candidate with a better constraint profile than C1.
T1: Determining optimality
Candidates | A  | B  | C
☞ C1       |    |    | *
C2         |    |    | **!
C3         |    | *! |
C4         | *! |    |
C5         | *! | *  |
By reranking the constraints B and C in T1, candidate C3 emerges as the
optimal candidate; cf. table T2.
T2: Reranking
Candidates | A  | C   | B
C1         |    | *!  |
C2         |    | *!* |
☞ C3       |    |     | *
C4         | *! |     |
C5         | *! |     | *
Reranking of constraints forms the basis of the concept of parameterization in
optimality-theoretic syntax. A further characteristic feature of this approach
is that it is essentially non-cumulative; i.e., no number of violations of a
low-ranked constraint can outweigh a single violation of a higher-ranked
constraint. Thus, suppose that there were an additional, lowest-ranked constraint
D in T1 that C1 violated, say, five times, and that C2-C5 did not violate at all.
This would not undermine C1's optimality.
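The definitions in (70) and (71), the effect of reranking, and the non-cumulative character of the evaluation can be checked mechanically. The following sketch is only an illustration: the function names are hypothetical, and the violation counts are read off the abstract competition in tables T1 and T2, with C5's second violation assigned to constraint B as an assumption (the outcome does not depend on this choice, since C5 already fails on A).

```python
# Violation counts per candidate; absent constraints count as zero violations.
# Assumption: C5's second star is placed under B.
VIOLATIONS = {
    "C1": {"C": 1},
    "C2": {"C": 2},
    "C3": {"B": 1},
    "C4": {"A": 1},
    "C5": {"A": 1, "B": 1},
}


def better(cj, ci, ranking, violations):
    """(71): cj beats ci on the highest-ranked constraint on which they differ."""
    for con in ranking:                       # highest-ranked constraint first
        vj = violations[cj].get(con, 0)
        vi = violations[ci].get(con, 0)
        if vj != vi:
            return vj < vi
    return False                              # identical constraint profiles


def optimal(ranking, violations):
    """(70): a candidate is optimal iff no competitor has a better profile."""
    cands = list(violations)
    return [c for c in cands
            if not any(better(o, c, ranking, violations) for o in cands)]


print(optimal(["A", "B", "C"], VIOLATIONS))   # ['C1']  (table T1)
print(optimal(["A", "C", "B"], VIOLATIONS))   # ['C3']  (table T2, reranking)

# Non-cumulativity: five violations of a lowest-ranked D do not hurt C1.
with_d = {c: dict(v) for c, v in VIOLATIONS.items()}
with_d["C1"]["D"] = 5
print(optimal(["A", "B", "C", "D"], with_d))  # ['C1']
```

Because `better` stops at the first constraint on which two candidates differ, lower-ranked violations are never summed against higher-ranked ones; this is what strict domination amounts to.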
Before we turn to some illustrations of optimality-theoretic analyses,
something must be said about the nature of candidates and candidate sets.
Optimality-theoretic syntax is strongly influenced by work in optimality-
theoretic phonology. Since the latter is characterized by an orientation that is
predominantly representational (cf. Prince & Smolensky 1993 and McCarthy
& Prince 1995), it does not come as a surprise that many approaches in
optimality-theoretic syntax postulate that the competing candidates created
by Gen are surface structure representations. This holds, e.g., for what can
arguably be viewed as the three most influential analyses in optimality-
theoretic syntax so far, viz., Grimshaw (1997), Pesetsky (1998), and Le-
gendre, Smolensky & Wilson (1998). However, there is no inherent reason
why the candidates that are subject to optimization should not be syntactic
objects of a more complex type, like (D-structure, S-structure, LF) tuples as
in government and binding theory, or, indeed, complete derivations, as in the
minimalist program. 26 The choice of candidate type goes hand in hand with
the choice of local constraint type that shows up in the H-Eval part as vio-
lable and ranked: If candidates are representations, constraints will be repre-
sentational; if candidates are derivations, constraints will be derivational; and
if candidates are (D-structure, S-structure, LF) tuples as in government and
binding theory, constraints can take any of the forms sketched in section 2.
Similarly, candidate sets can be defined in various ways, which of course
significantly influences the nature of the competition. Basically, all of the def-
initions of reference sets in minimalist syntax that have been proposed (see
section 3 and Sternefeld 1997) are also potential definitions of candidate sets
in optimality-theoretic syntax. A further influential definition of candidate
sets comes from Grimshaw (1997). She postulates that two candidates (S-
structure representations) compete iff they are realizations of the same pred-
icate/argument structure and have non-distinct logical forms (or non-distinct
interpretations).
By making optimality depend on an intricate system of violable and ranked
constraints, H-Eval - and hence, the concept of competition - becomes even
more important than in minimalist syntax and blocking syntax. As a matter
of fact, much work in optimality-theoretic syntax has tried to minimize the
role of the Gen component, and maximize the role of the H-Eval component
(but see Pesetsky 1997, 1998 for some cautionary remarks).
An optimality-theoretic approach gains immediate support in all those con-
texts where postulating a competition of syntactic objects is initially plausi-
ble. This includes, but is by no means confined to, contexts where notions
of economy seem to play a role. A prototypical case is one in which the
wellformedness of a sentence S_i that exhibits an otherwise peculiar property
seems to depend on the unavailability of another sentence S_j that exhibits
the property one would normally expect. Here, S_i is often referred to as
a "repair" form; a typical instance is the English do-support construction.
Accordingly, do-support was among the first phenomena to be tackled in
optimality-theoretic syntax (see Speas 1995 and Grimshaw 1997). Most of
the constructions discussed in sections 2-4 can also be viewed as suggesting
an underlying competition; and indeed, they can fruitfully be addressed in
optimality-theoretic syntax. This is shown in the following section.
5.2 Case Studies
5.2.1 Anaphors vs. Pronouns in Wilson (1999)
Let us begin with the competition between reflexivization and pronominaliza-

tion. The following optimality-theoretic account is based on Wilson (1999). 27
Recall the generalization that, by and large, pronominals are allowed to ex-
press binding relations in English in just those cases in which anaphors are
not allowed to do so. To account for this, a prerequisite is that two sentences
which differ only with respect to the choice of anaphor vs. pronominal in a
given position must compete. The ranking LOC-ANT » REF-ECON of the
two constraints in (72) then produces the right results.
(72) a. LOC-ANT ("Local Antecedent"):
If a binding domain contains an anaphor, then it must also contain
the anaphor's antecedent.
b. REF-ECON ("Referential Economy"):
A (referentially dependent) argument must not have lexical φ-feature
specification.
LOC-ANT is a version of Principle A; it requires a local antecedent for an

anaphor. Hence, as a tendency, this constraint favors pronominals, which sat-
isfy it vacuously. REF-ECON, on the other hand, inherently prefers anaphors
to pronominals if we are willing to make the assumption that anaphors do
not have a lexical φ-feature specification, whereas pronominals do. Consequently,
when an anaphor can respect LOC-ANT, the violation of REF-ECON
incurred by a pronominal is fatal; cf. table T3.
T3: Reflexivization
Candidates | LOC-ANT | REF-ECON
☞ C1: John1 likes himself1 | |
C2: John1 likes him1 | | *!
However, when an anaphor cannot find an antecedent in its binding domain

and must violate LOC-ANT, a violation of the lower-ranked REF-ECON con-
straint becomes possible, and pronominalization turns out to be optimal; cf.
table T4.
T4: Pronominalization
Candidates | LOC-ANT | REF-ECON
C1: John1 thinks that Mary likes himself1 | *! |
☞ C2: John1 thinks that Mary likes him1 | | *
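The two competitions can be replayed with constraints stated as functions over candidates. This is a minimal sketch under simplifying assumptions: a candidate is reduced to the form of the bound element and a flag saying whether its antecedent lies inside the binding domain, and the dictionary keys and function names are hypothetical.

```python
RANKING = ["LOC-ANT", "REF-ECON"]             # LOC-ANT » REF-ECON


def loc_ant(cand):
    """One violation if an anaphor lacks an antecedent in its binding domain."""
    return int(cand["form"] == "anaphor" and not cand["antecedent_in_domain"])


def ref_econ(cand):
    """One violation for a pronominal (assumed to carry lexical φ-features)."""
    return int(cand["form"] == "pronominal")


CONSTRAINTS = {"LOC-ANT": loc_ant, "REF-ECON": ref_econ}


def profile(cand):
    """Violation counts ordered by rank; tuples compare lexicographically."""
    return tuple(CONSTRAINTS[name](cand) for name in RANKING)


def optimal(cands):
    best = min(profile(c) for c in cands)
    return [c["sentence"] for c in cands if profile(c) == best]


local = [
    {"sentence": "John1 likes himself1", "form": "anaphor",
     "antecedent_in_domain": True},
    {"sentence": "John1 likes him1", "form": "pronominal",
     "antecedent_in_domain": True},
]
nonlocal_binding = [
    {"sentence": "John1 thinks that Mary likes himself1", "form": "anaphor",
     "antecedent_in_domain": False},
    {"sentence": "John1 thinks that Mary likes him1", "form": "pronominal",
     "antecedent_in_domain": False},
]

print(optimal(local))             # ['John1 likes himself1']
print(optimal(nonlocal_binding))  # ['John1 thinks that Mary likes him1']
```

Comparing rank-ordered tuples is equivalent to the definition of a better constraint profile as long as the ranking is total, which is assumed here.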
5.2.2 Complementizer-Trace Effects in Grimshaw (1997)
In section 2, we noted that government and binding theory accounts for the
complementizer-trace effect in (4-a) on a purely local basis, without postulat-
ing a competition with the complementizer-less variant in (4-b) from which
only the latter would emerge as optimal. This view is abandoned in Déprez
(1991), which is the basis of the optimality-theoretic account advanced in
Grimshaw (1997). As background, Grimshaw assumes that the size of clauses
is variable. Clauses are extended projections of V; they are minimally VPs,
but they can be IPs, CPs, or functional projections of an even bigger size,
depending on the outcome of optimization. Bridge verbs in English permit
both CP-embedding (with a complementizer - a declarative CP without a
complementizer will typically fatally violate a high-ranked constraint that
precludes empty head positions) and IP- or VP-embedding (without a com-
plementizer). In the latter case, IP must be chosen if an auxiliary or do is
present (i.e., if the need arises to accommodate an additional lexical head);
VP can be chosen otherwise. The main constraints that are needed in the ac-
count of complementizer-trace effects are listed in (73). A possible ranking
for English is OP-SPEC » T-LEX-GOV » STAY.28
(73) a. OP-SPEC ("Operator in Specifier"):
Wh-operators must occupy a specifier position from which they
c-command all elements of the extended V projection over which they
take scope.
b. T-LEX-GOV ("Lexical Government of Traces"):
A trace is lexically governed.
c. STAY ("Economy of Movement"):
Trace is not allowed.
STAY is a local version of the translocal economy constraint Fewest Steps;
OP-SPEC is a version of the Wh-Criterion that is often postulated in government
and binding theory (see, e.g., Lasnik & Saito 1992). The ranking
OP-SPEC » STAY ensures overt wh-movement in simple questions in English.
T-LEX-GOV corresponds to (a part of) the ECP. Assuming that candidates
with and without that compete, the complementizer-trace effect is derived as
shown in table T5.29
T5: Subject wh-movement
Candidates | OP-SPEC | T-LEX-GOV | STAY
C1: ... who1 you think [CP that [IP t1 will leave ]] | | *! | *
☞ C2: ... who1 you think [IP t1 will leave ] | | | *
C3: ... you think [CP that [IP who1 will leave ]] | *! | |
C4: ... you think [IP who1 will leave ] | *! | |
C3 and C4 fatally violate OP-SPEC. Both C1 and C2 violate STAY, but C1
violates T-LEX-GOV in addition: t1 in C1 is not lexically governed (that being
unable to do so), whereas t1 in C2 is lexically governed (by the matrix V).30 In
contrast, an embedded V governs object traces throughout, irrespective of the
presence or absence of a complementizer that; hence, T-LEX-GOV is satisfied
equally well by C1 and C2 in table T6. Given that C1 and C2 do not differ
with respect to any other constraint either, optionality of a complementizer is
correctly predicted in cases of object extraction, due to an identical constraint
profile.
T6: Object wh-movement
Candidates | OP-SPEC | T-LEX-GOV | STAY
☞ C1: ... who1 you think [CP that [IP she will invite t1 ]] | | | *
☞ C2: ... who1 you think [IP she will invite t1 ] | | | *
C3: ... you think [CP that [IP she will invite who1 ]] | *! | |
C4: ... you think [IP she will invite who1 ] | *! | |
Thus far, there is no evidence for treating T-LEX-GOV as a violable
constraint in the H-Eval part of the grammar (rather than as an inviolable
constraint in the Gen part). Such evidence can be gained by considering adjunct
extraction. In this case, T-LEX-GOV is violated by both candidates involving
wh-movement (C1, C2). However, given that there is no competing candidate
that can avoid a violation of T-LEX-GOV without violating a higher-ranked
constraint (e.g., a candidate that employs a resumptive pronoun; see below),
C1 and C2 can emerge as optimal despite this violation.
T7: Adjunct wh-movement
Candidates | OP-SPEC | T-LEX-GOV | STAY
☞ C1: ... why1 you think [CP that [IP she has left t1 ]] | | * | *
☞ C2: ... why1 you think [IP she has left t1 ] | | * | *
C3: ... you think [CP that [IP she has left why1 ]] | *! | |
C4: ... you think [IP she has left why1 ] | *! | |
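The three extraction types can be run through one and the same evaluation. The sketch below is a deliberately coarse rendering, not Grimshaw's system: candidates are reduced to two binary choices (complementizer that present or absent, wh-phrase moved or in situ), and lexical government is hard-coded as described in the text (object traces are always governed, subject traces only in the absence of that, adjunct traces never). All names are hypothetical.

```python
# Candidate labels as in the tableaux: (has_that, wh_moved).
CANDIDATES = {
    "C1": (True, True),
    "C2": (False, True),
    "C3": (True, False),
    "C4": (False, False),
}


def trace_governed(extracted, has_that):
    """Simplified lexical government, hard-coded from the prose description."""
    if extracted == "object":
        return True                  # governed by the embedded V
    if extracted == "subject":
        return not has_that          # governed by the matrix V only without 'that'
    return False                     # adjunct traces are not lexically governed


def profile(extracted, has_that, moved):
    """Violations ordered by the ranking OP-SPEC » T-LEX-GOV » STAY."""
    op_spec = 0 if moved else 1
    t_lex_gov = 1 if moved and not trace_governed(extracted, has_that) else 0
    stay = 1 if moved else 0
    return (op_spec, t_lex_gov, stay)


def optimal(extracted):
    profiles = {name: profile(extracted, that, moved)
                for name, (that, moved) in CANDIDATES.items()}
    best = min(profiles.values())    # tuples compare lexicographically
    return [name for name, p in profiles.items() if p == best]


print(optimal("subject"))   # ['C2']
print(optimal("object"))    # ['C1', 'C2']
print(optimal("adjunct"))   # ['C1', 'C2']
```

The adjunct case shows why T-LEX-GOV must be violable: both winners violate it, but every competitor that avoids the violation fails on the higher-ranked OP-SPEC.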
5.2.3 Subjacency and Resumptive Pronouns in Pesetsky (1998) and Legendre, Smolensky & Wilson (1998)
Recall that resumptive pronouns often seem to be possible only as last resort
strategies in cases where traces are blocked (see (8)). Competition-free mod-
els like government and binding theory have no obvious means to relate one
construction to the other (at least, as long as they are supposed to stay strictly
competition-free; see note 3); but the case is different in optimality-theoretic
syntax. An optimality-theoretic account of resumptive pronoun strategies is
developed in Legendre, Smolensky & Wilson (1998) (on the basis of evidence
from Chinese) and Pesetsky (1998) (on the basis of English data comparable
to those in (8), as well as evidence from Hebrew, Russian, and Polish). The
details of the two analyses differ a great deal, but the gist of the explanation
is identical; it centers around two constraints like those in (74). 31
(74) a. CNPC ("Complex Noun Phrase Condition"):

Traces must not be separated by a complex noun phrase from their
antecedents.
b. RES ("Resumptive Constraint"):
Resumptive pronouns are prohibited.
The CNPC prohibits traces in certain (non-local) environments; RES disfavors
resumptive pronouns (i.e., pronouns that are bound from an A-bar position)
in general. The ranking is CNPC » RES. (Thus, the two constraints and their
ranking are analogous to what we have seen with LOC-ANT and REF-ECON
in the domain of binding theory.) As with the wh-movement construction
discussed in the last section, it must be ensured that overt movement of the
relative operator takes place in examples like those in (8). We assume that
this is independently taken care of.32 Based on these assumptions, consider
table T8.
T8: Trace vs. resumptive pronoun in transparent contexts
Candidates | CNPC | RES
☞ C1: the man [CP who(m)1 I saw t1 ] | |
C2: the man [CP who(m)1 I saw him1 ] | | *!
Both candidates respect CNPC. Consequently, the RES violation incurred by
the resumptive pronoun in C2 becomes fatal, and C1 is optimal. However, in
the competition illustrated in table T9, C1 violates CNPC. In this case, C2's
RES violation is tolerable, and the resumptive pronoun strategy emerges as
optimal.33
T9: Trace vs. resumptive pronoun in CNPC contexts
Candidates | CNPC | RES
C1: the man [CP who(m)1 [IP I don't believe [NP the claim [CP t'1 that anyone saw t1 ]]]] | *! |
☞ C2: the man [CP who(m)1 [IP I don't believe [NP the claim [CP that anyone saw him1 ]]]] | | *
A lot more could be said about relativization in English and other languages
in an optimality-theoretic approach (in particular, concerning that-relatives
and their relation to wh-relatives), but these considerations will have to suffice
for now; cf. Grimshaw (1997) and Pesetsky (1998).
5.2.4 Avoid Pronoun
Consider now the Avoid Pronoun facts that were discussed in section 2 (cf.
(10) and (12)). In English gerunds, PRO and a lexical pronoun can both oc-
cur in principle; however, PRO must be used instead of a lexical pronoun if
it can fulfill the Control Rule. A transfer of Chomsky's (1981) approach into
optimality theory is straightforward. The Control Rule in (11) can directly
be viewed as an optimality-theoretic constraint (with the same qualification
as in government and binding theory; see note 4); cf. (75-a). The Avoid Pro-
noun Principle in (13) can be simplified by turning this translocal constraint
into a local (though violable) one; cf. (75-b). 34 The ranking for English is
CONTROL » *PRON.
(75) a. CONTROL ("Control Rule"):

If PRO is minimally dominated by a declarative clausal object α,
then it must be controlled by an antecedent within the minimal CP
dominating α.
b. *PRON ("Avoid Pronoun"):
Pronouns are prohibited.
Suppose that candidate sets are defined in such a way that candidates with
PRO and candidates with a lexical pronoun can compete, but, crucially, that
sentences with different indexings (hence, different logical forms) do not
compete. Then, the facts fall into place. The blocking of a lexical pronoun
by PRO in cases where CONTROL can be satisfied is illustrated in table T10.
T10: PRO vs. pronoun under co-indexing
Candidates | CONTROL | *PRON
C1: John1 would much prefer [ his1 going to the movie ] | | *!
☞ C2: John1 would much prefer [ PRO1 going to the movie ] | |
Table Τ π illustrates the case where PRO is not co-indexed with the matrix
antecedent, thereby violating CONTROL. Here, the *PRON violation incurred
by all pronouns is non-fatal, and the pronoun strategy is optimal.
T11: PRO vs. pronoun under contra-indexing
Candidates                                                   CONTROL   *PRON
☞ C1: John1 would much prefer [ his2 going to the movie ]              *
  C2: John1 would much prefer [ PRO2 going to the movie ]    *!
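The evaluation behind T10 and T11 can be sketched in a few lines of Python. This is an illustration only: the function name `optimal`, the dictionary encoding of violation profiles, and the abbreviated candidate labels are assumptions introduced here, not part of the analysis. The sketch models strict domination by comparing violation profiles lexicographically along the ranking.

```python
# Minimal sketch of H-Eval under strict domination (illustrative names only).
RANKING = ["CONTROL", "*PRON"]  # CONTROL >> *PRON, the ranking assumed for English

def optimal(candidates, ranking):
    """Return the labels of all candidates with the best violation profile.

    `candidates` maps a label to a dict of constraint -> number of violations.
    Profiles are compared lexicographically along `ranking`.
    """
    def profile(label):
        return tuple(candidates[label].get(c, 0) for c in ranking)
    best = min(profile(label) for label in candidates)
    return [label for label in candidates if profile(label) == best]

# T10: the embedded subject is co-indexed with the matrix subject.
t10 = {
    "C1: John1 ... [ his1 going to the movie ]": {"*PRON": 1},
    "C2: John1 ... [ PRO1 going to the movie ]": {},
}
# T11: contra-indexing; PRO2 is not controlled and therefore violates CONTROL.
t11 = {
    "C1: John1 ... [ his2 going to the movie ]": {"*PRON": 1},
    "C2: John1 ... [ PRO2 going to the movie ]": {"CONTROL": 1},
}

print(optimal(t10, RANKING))
print(optimal(t11, RANKING))
```

Keeping the two indexings in separate candidate sets mirrors the assumption that sentences with different logical forms do not compete.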
5.2.5 Superiority Effects
The question arises of whether the evidence that is accounted for by translo-
cal constraints in minimalist syntax can also be reanalyzed in optimality-
theoretic syntax. At least to some extent, this seems to be the case. As
noted above, the STAY constraint adopted in Grimshaw (1997), Legendre,
Smolensky & Wilson (1998), and much related work, is essentially a local
version of the translocal Fewest Steps condition. Similarly, a local counter-
part has been suggested for the translocal economy constraint Shortest Paths.
Let us reconsider the superiority phenomenon as one of the core applications
of Shortest Paths. Some relevant examples are repeated in (76).
(76) a. I wonder [CP who1 C [IP t1 bought what2 ]]
b. *I wonder [CP what2 C [IP who1 bought t2 ]]
Ackema & Neeleman (1998) propose a local version of Shortest Paths that we
may call MIN-CHAIN ("Minimize Chain Length"). This constraint records a
star * for every node crossed by a movement chain. 35 Assuming that only
overt movement counts for the purposes of this constraint, (76-a) can successfully
block (76-b) under MIN-CHAIN: Other things being equal, the wh-chain
in (76-a) violates MIN-CHAIN twice (IP and C' are crossed), whereas
the wh-chain in (76-b) violates MIN-CHAIN four times (VP, I', IP, and C' are
crossed), the third violation being fatal already.
Another account is developed by Legendre, Smolensky & Wilson (1998).
They start out with BAR, which is nearly identical to the CED given above:
(77) BAR ("Barriers Condition"):

A chain link must not cross a barrier.
By a general operation of local constraint conjunction, BAR can be conjoined
with various constraints, including itself. Reflexive local conjunction of BAR
and BAR yields a new constraint BAR & BAR = BAR2 that is violated if a
chain link crosses two barriers. Local conjunction of BAR2 and BAR yields
BAR3, which is violated if a chain link crosses three barriers, and so on. This
mechanism recursively produces a subhierarchy of BAR^i constraints which
has a fixed internal ranking, given a universal meta-restriction on constraint
conjunction: Con1 & Con2 » Con1, Con2. 36 The violability of BAR^i subhierarchy
constraints makes it possible to adopt a simple theory of barriers
according to which a significant number of XPs are barriers (see Koster 1987
for this general idea). Adopting a strict interpretation of Chomsky (1986a),
Legendre, Smolensky & Wilson (1998) assume that all non-L-marked XPs
are barriers, including VP, IP, subject NPs, adjunct CPs, etc. This makes the
BAR^i subhierarchy a good means to measure path lengths and derive typical
Shortest Paths effects. In the superiority case currently under consideration,
subject wh-movement as in (76-a) crosses only an IP barrier, violating BAR,
whereas object wh-movement as in (76-b) crosses a VP barrier and an IP
barrier, violating BAR2, which is universally ranked higher. Given that the
two candidates do not differ with respect to other constraints, and that they
have a better constraint profile than all other competitors, it follows that the
availability of short subject movement blocks the possibility of longer object
movement, as desired.
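How the subhierarchy measures path length can be illustrated with a small Python sketch. It rests on simplifying assumptions that are not part of the original proposal: only BAR1 to BAR3 are modelled, each chain link is charged solely for the highest-ranked constraint of the subhierarchy that it violates, and the names `bar_profile` and `MAX_POWER` are hypothetical.

```python
# Sketch: comparing chain links via the BAR subhierarchy (illustrative only).
MAX_POWER = 3  # assumption: BAR1..BAR3 suffice for this example

def bar_profile(barriers_per_link):
    """Map a list of barrier counts (one per chain link) to a violation profile.

    The profile is ordered from BAR3 down to BAR1, mirroring the fixed
    ranking BAR3 >> BAR2 >> BAR1. Simplification: each link is charged only
    for the highest-ranked BAR constraint it violates.
    """
    profile = [0] * MAX_POWER
    for n in barriers_per_link:
        if n > 0:
            power = min(n, MAX_POWER)
            profile[MAX_POWER - power] += 1
    return tuple(profile)

# (76-a): subject wh-movement crosses one barrier (IP).
subject_movement = bar_profile([1])
# (76-b): object wh-movement crosses two barriers (VP and IP).
object_movement = bar_profile([2])

# Lexicographic comparison: the smaller tuple is the better profile.
print(subject_movement, object_movement, subject_movement < object_movement)
```

Under these assumptions the shorter movement has the better profile, so it blocks the longer one as long as no higher-ranked constraint distinguishes the two candidates.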
To end this section, note that Legendre, Smolensky & Wilson (1998) also
succeed in reanalyzing the Tagalog wh-movement evidence discussed above
in an optimality-theoretic way by invoking the BAR^i subhierarchy. More generally,
we can conclude that all of the analyses involving translocal constraints
that have been proposed in minimalist syntax or blocking syntax can be recast
in optimality-theoretic terms by employing local, violable constraints. 37
5.3 Some Open Issues
Optimality-theoretic syntax inherits the complexity problem from minimalist
syntax. Candidate sets are typically large (as in Pesetsky 1998), often infinite
(as in Grimshaw 1997). In addition, there are several open issues that are
specific to the optimality-theoretic approach. We will focus on two of these
in what follows, and briefly mention five others after that.
5.3.1 Inputs and Faithfulness
An important optimality-theoretic concept that has played no role so far in our
discussion is the notion of input. In optimality theory (cf. Prince & Smolensky
1993), Gen does not create competing candidates (the outputs) freely; rather,
it does so on the basis of a given input. In phonology, inputs are underlying
representations stored in the lexicon; here, inputs qualify as roughly the same
types of objects as outputs. In syntax, it is much less clear what the input
might look like (see Archangeli & Langendoen 1996). The null hypothesis -
that the input is a completely articulated potential sentence of the same type
as the output candidates - is not unproblematic because it would seem to
imply the assumption that all possible sentences are "stored," which cannot
possibly be true. To find out what the input in syntax is, it is instructive to
consider its theory-internal functions. By and large, there are two. First, the
input is standardly taken to define the competition. Second, the input serves
as a basis for faithfulness constraints that demand input/output identity and
thereby minimize deviations from the input in the optimal output. Let us con-
sider the second function first.
Faithfulness constraints play an important role in phonology. Constraints of
the PARSE (or MAX) family prohibit deletion of input material in the output;
constraints of the FILL (or DEP) family prohibit insertion of output material
that is not part of the input; constraints of the IDENT family prohibit
modifying input material. Faithfulness constraints have also been adopted
in much recent work in optimality-theoretic syntax. The following two con-
straints are taken from Legendre, Smolensky & Wilson (1998) and Bakovic
& Keer (1999), respectively.
(78) a. PARSE[SCOPE]:
Scope assignment in the input must be realized by chain formation
in the output.
b. FAITH[COMP]:
The output value of [±COMP] is the same as the input value.
Note that (78-a) implies that the input is a more complex object than just a
collection of words (a numeration) or a predicate/argument structure; it must
be a highly structured representation that encodes the relative scope of operators.
(78-b) presupposes an abstract feature [±COMP] that for the present
purposes we can assume to be located on a V that selects a proposition. Let
us consider candidates that violate these constraints. Suppose that (79-a) is
the input for output candidate (79-b), and (79-c) is the input for output candidate
(79-d). Legendre, Smolensky & Wilson (1998) assume that (79-b) violates
PARSE[SCOPE] because matrix scope for how1 in the input (79-a) (indicated
by [+wh]1) is reduced to embedded scope in the output (again indicated
by [+wh]1). Similarly, Bakovic & Keer (1999) assume that (79-d) violates
FAITH[COMP] because a [-COMP] specification in the input contrasts with a
[+COMP] specification (hence, a complementizer) in the output. 38
(79) a. [+wh]1 ... wonder[+wh] [ [+wh]2 ... what2 ... how1 ... ] (input)
b. You wonder[+wh] [CP [+wh]1 [+wh]2 how1 John did what2 ] (output)
c. ... V[-comp] [ ... ] (input)
d. I think [CP that [PP on him ]1 no coat looks good t1 ] (output)
At this point, we need not go into the actual analyses in which these con-
straints play a role (as it happens, both faithfulness violations turn out to
be non-fatal, i.e., (79-b,d) are optimal). The crucial question is: Is it really
necessary to refer to the concept of input here, or is it possible to read the
respective violations off the output forms, without any reference to inputs?
At least for the cases at hand, the answer is straightforward: By enriching
output representations in ways that have independently been proposed, a ref-
erence to inputs becomes unnecessary. (79-a,b) is a case where the intended
matrix scope is not reached by chain formation in the candidate. Employing
abstract scope markers (Σ) in S-structure representations (cf., e.g., Williams
1986), we can equivalently encode this input information in the output, as
in (80-a). 39 As for the case in (79-c,d), the only assumption that we have to
make (and which strikes us as innocuous, in fact, completely standard) is that
selectional properties of lexical heads are accessible in syntax; cf. (80-b).
(80) a. Σ1 you wonder[+wh] [CP [+wh]1 [+wh]2 how1 John did what2 ] (output)
b. I think[-comp] [CP that [PP on him ]1 no coat looks good t1 ] (output)
PARSE[SCOPE] and FAITH[COMP] can now be modified in obvious ways,
without reference to inputs.
(81) a. PARSE[SCOPE] (revised):
Scope markers must be reached by chain formation.
b. FAITH[COMP] (revised):
Lexical [±COMP] selection requirements must be respected.
If this result can be generalized, and all syntactic faithfulness constraints can
be reanalyzed in this way, we can conclude that these constraints do not
support the concept of input anymore. Why should it be that the notion of
input is relevant for phonological faithfulness constraints, but not for their
syntactic counterparts? The answer, we believe, follows from what appears
to be a fundamental difference between syntax and phonology: Syntax is
an information-preserving system with richly structured output candidates,
whereas phonology is a system that loses information, so that reference to an
underlying input is necessary in constraints.
With this in mind, let us turn to the other input function noted above, that
of defining candidate sets. Since syntactic output candidates are richly struc-
tured, all the relevant information that they must share in order to compete can
be read off them, independently of what notion of candidate set is adopted;
again, this is in sharp distinction to phonology. Thus, it is possible to ex-
plicitly define candidate sets without reference to the concept of input. For
instance, if we follow Grimshaw (1997) in assuming that competing can-
didates must have the same predicate/argument structure, we can read this
information off the potentially competing candidates themselves.
As a matter of fact, it turns out that an input-independent characterization of
candidate sets cannot even be avoided in Grimshaw's own approach. Recall
that Grimshaw (1997) postulates that two candidates compete only if they
have non-distinct logical forms (in addition to identical predicate/argument
structures). If the input fully determines the candidate set, this presupposes
that an input is a complex object that exhibits all relevant logical form in-
formation. It is generally assumed that outputs can deviate from inputs in
many ways, subject only to faithfulness constraints. Hence, if nothing else
is said, we expect that output candidates can be semantically unfaithful to
the input by, e.g., applying scope reduction (such that, e.g., a wh-phrase with
matrix scope in the input is interpreted with embedded scope in the output).
This clearly implies that candidates with distinct logical forms can compete.
This consequence is embraced by Legendre, Smolensky & Wilson (1998) (cf.
(79-b)). However, such a result is incompatible with Grimshaw's (1997) as-
sumptions, according to which competing candidates must have (not: go back
to) non-distinct logical forms. Thus, even in this approach, the input cannot
completely determine the competition; the requirement of non-distinct logi-
cal forms must be stipulated on top of it.
More generally, it emerges that an input-free characterization of candidate
sets is both readily available and independently motivated. Hence, reference
to inputs is unnecessary for the purpose of defining competition in syntax.
From all this, we would like to conclude that it may eventually be possible
to dispense with the notion of input in syntax; but further research is needed
in this domain (also see note 43 below).
5.3.2 Absolute Ungrammaticality
Another important open question in optimality-theoretic syntax is how to account
for the phenomenon of "absolute ungrammaticality" or "ineffability,"
i.e., cases where there does not seem to be a candidate in a candidate set
that is grammatical. As an example, consider the following ungrammatical
example involving wh-extraction across an adjunct island in German:
(82) *Was1 ist Fritz eingeschlafen [CP nachdem er t1 gelesen hat ] ?
what is Fritz fallen asleep after he read has
Let us apply the suggestions that can be found in the literature to the case at
hand. First, Pesetsky (1997, 1998) emphasizes that certain sentences may be
ungrammatical not because they are classified as suboptimal in the H-Eval
part of the grammar, but because they cannot be generated by Gen in the first
place. Thus, a constraint like (83) might be part of Gen.
(83) ADJ-ISL ("Adjunct Island Constraint"):
A trace must not be separated by an adjunct clause from its antecedent.
Second, it is suggested in Grimshaw (1994) and Müller (1997) that certain
optimal candidates may have properties that make them inaccessible for other
domains of the language faculty, like, e.g., semantic interpretation. ADJ-ISL
might be part of H-Eval, but ranked higher than OP-SPEC. On this view,
(84) could block (82) as suboptimal; but this optimal candidate would be
uninterpretable (indicated by #) and, hence, unusable.
(84) #Fritz ist eingeschlafen [CP nachdem er was1 gelesen hat ] ?
Fritz is fallen asleep after he what read has
These two approaches have in common that they allow the possibility that ab-
solute ungrammaticality is not located in the H-Eval component of grammar,
but in a component that precedes (Gen) or follows (interpretation) optimiza-
tion. If, however, H-Eval is to be held responsible for the ungrammaticality
of (82), there must be a competing candidate with a better constraint profile
that blocks it. A priori, this might be a candidate that employs a resumptive
pronoun strategy, which is only legitimate in this context as a last resort. If
this were so, the ineffability problem would be spurious in the case at hand.
However, (85) shows that the resumptive pronoun strategy is not an option
in German (a constraint like RES must outrank ADJ-ISL and other locality
constraints in German):
(85) *Was1 ist Fritz eingeschlafen [CP nachdem er es1 gelesen hat ] ?
what is Fritz fallen asleep after he it read has
What, then, could the optimal candidate blocking (82) look like? Following
Prince & Smolensky (1993), Ackema & Neeleman (1998) propose that the
empty candidate ∅ (the "null parse") is part of every candidate set. This candidate
violates the constraint in (86), which is typically ranked high. 40
(86) *∅ ("Avoid Null Parse"):
∅ is prohibited.
Constraints that are ranked higher than *∅ in effect become inviolable (given
that there is no constraint except *∅ that ∅ can violate). In this sense, *∅
introduces a dividing line into rankings. Thus, if both ADJ-ISL and the constraint
that triggers wh-movement (e.g., OP-SPEC) outrank *∅, adjunct islands
become inviolable. This is shown in table T12.
T12: Adjunct islands and the null parse
Candidates                                ADJ-ISL   OP-SPEC   *∅
  C1: was1 ... [CP nachdem er t1 V ]      *!
  C2: — ... [CP nachdem er was1 V ]                 *!
☞ C3: ∅                                                       *
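The "dividing line" effect of *∅ can be made concrete with a short Python sketch of T12. The encoding, the function name `winner`, and the second ranking are assumptions added for illustration; the second ranking is a hypothetical reranking, not a claim about any language.

```python
# Sketch of tableau T12; all identifiers are illustrative assumptions.
T12 = {
    "C1: was1 ... [CP nachdem er t1 V ]": {"ADJ-ISL": 1},
    "C2: — ... [CP nachdem er was1 V ]": {"OP-SPEC": 1},
    "C3: ∅": {"*∅": 1},
}

def winner(tableau, ranking):
    """Pick the candidate whose violation profile is lexicographically smallest."""
    return min(tableau, key=lambda cand: [tableau[cand].get(c, 0) for c in ranking])

# Both ADJ-ISL and OP-SPEC outrank *∅: the null parse is optimal.
low_null = winner(T12, ["ADJ-ISL", "OP-SPEC", "*∅"])
# Hypothetical reranking with *∅ on top: a non-null candidate must win.
high_null = winner(T12, ["*∅", "ADJ-ISL", "OP-SPEC"])

print(low_null)
print(high_null)
```

With *∅ at the bottom, neither ADJ-ISL nor OP-SPEC can be violated by a winner. Once *∅ is promoted, the lower-ranked of the two remaining constraints becomes violable.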
A final possibility to be discussed here is the neutralization approach to absolute
ungrammaticality in syntax. Such an approach has been adopted by
Legendre, Smolensky & Wilson (1998), Schmid (1998), Bakovic & Keer
(1999), and Wilson (1999), among others. For the present case, a neutral-
ization analysis might posit that the optimal candidate blocking (82) is (87).
(87) Fritz ist eingeschlafen [CP nachdem er was1 gelesen hat ]
Fritz is fallen asleep after he something read has
The crucial difference from (84) is that was1 is turned into an indefinite pronoun,
and the matrix C[+wh] is turned into a C[-wh]. Thus, there is a feature
change from [+wh] in (82) to [-wh] in (87), and the sentence is interpreted as
declarative, rather than as a question. 41 If (87) is to block (82) as suboptimal,
this presupposes that candidates that differ in their wh-feature specification
can compete. But then, the problem arises that we would also wrongly expect
one of the sentences in (88) to block the other.
(88) a. Was1 hat er t1 gelesen ?
what has he read
b. Er hat was1 gelesen
he has something read
The neutralization approach solves this problem as follows. The [±wh]
specification is fixed unambiguously in the input; an input with a [+wh]
specification on some item and a minimally different input with a [-wh] specification
count as different, and define different candidate sets. The important
assumption is that there is a faithfulness constraint that demands preservation
of the [±wh] feature specification in the output:
(89) FAITH[WH]:
The output value of [±wh] is the same as the input value.
Suppose now that ADJ-ISL and OP-SPEC are ranked higher than
FAITH[WH]. Then, (87) will have a better constraint profile than (82) both
in the competition that has a [-wh] specification in the input, and in the competition
that has a [+wh] specification in the input. Thus, there is a "neutralization"
of different input specifications in the output. This is shown in tables
T13 and T14. 42
T13: Adjunct islands and neutralization; [-w] in the input
Candidates                                      ADJ-ISL   OP-SPEC   FAITH[WH]
  C1: was1[+w] ... [CP nachdem er t1 V ]        *!                  *
  C2: — ... [CP nachdem er was1[+w] V ]                   *!        *
☞ C3: — ... [CP nachdem er was1[-w] V ]
T14: Adjunct islands and neutralization; [+w] in the input
Candidates                                      ADJ-ISL   OP-SPEC   FAITH[WH]
  C1: was1[+w] ... [CP nachdem er t1 V ]        *!
  C2: — ... [CP nachdem er was1[+w] V ]                   *!
☞ C3: — ... [CP nachdem er was1[-w] V ]                             *
In transparent contexts, where movement may occur without a violation of
a high-ranked locality constraint like ADJ-ISL (cf. (88)), FAITH[WH] violations
become fatal, and the candidate that maintains the [±wh] specification
of the input emerges as optimal. 43
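The contrast between opaque and transparent contexts can be sketched as follows. The Python below is a hedged illustration: the function `evaluate`, the tuple encoding of candidates, and the candidate labels are assumptions made here, and the in-situ [+wh] candidate in the transparent context is a hypothetical competitor added for symmetry with T13 and T14.

```python
# Sketch of neutralization: FAITH[WH] is computed relative to the input.
RANKING = ["ADJ-ISL", "OP-SPEC", "FAITH[WH]"]

def evaluate(input_wh, candidates, ranking=RANKING):
    """Return the label of the optimal candidate for a given input [±wh] value.

    Each candidate is (label, output_wh, markedness_violations); FAITH[WH]
    is violated once when output_wh differs from input_wh.
    """
    def profile(cand):
        _, output_wh, marks = cand
        violations = dict(marks)
        violations["FAITH[WH]"] = int(output_wh != input_wh)
        return tuple(violations.get(c, 0) for c in ranking)
    return min(candidates, key=profile)[0]

# Opaque context: the wh-item originates inside an adjunct clause, cf. (82)/(87).
adjunct = [
    ("C1: wh-moved", "+", {"ADJ-ISL": 1}),
    ("C2: wh in situ", "+", {"OP-SPEC": 1}),
    ("C3: indefinite", "-", {}),
]
# Transparent context: a simple clause, cf. (88).
simple = [
    ("(88-a): wh-moved", "+", {}),
    ("wh in situ (hypothetical)", "+", {"OP-SPEC": 1}),
    ("(88-b): indefinite", "-", {}),
]

for input_wh in ("+", "-"):
    print(input_wh, evaluate(input_wh, adjunct), evaluate(input_wh, simple))
```

In the adjunct context both inputs converge on the indefinite candidate, which is the neutralization effect. In the simple clause each input selects its own faithful output, so neither sentence in (88) blocks the other.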
Of the four approaches to absolute ungrammaticality discussed here (Gen,
interpretation, null parse, neutralization), the neutralization approach is ar-
guably the most elegant one. Still, it is not without problems. One conspic-
uous peculiarity is that neutralization creates massive derivational ambigu-
ity. A well-formed sentence like (87) can have different "histories," being
an optimal candidate in two candidate sets with different inputs. This vac-
uous ambiguity may be considered problematic from the point of view of
language acquisition and parsing; and it can only be avoided by additional
meta-optimization procedures that compare the competitions in T13 and T14;
cf. the notion of input optimization in Prince & Smolensky (1993) (called
lexicon optimization in phonology).
5.3.3 Residual Issues
As remarked above, this does not exhaust the list of open issues that are cur-
rently under debate in optimality-theoretic syntax. We end this section by
briefly mentioning a few others.
Optionality
In the best of all possible worlds, one would not expect optionality to arise
in a theory that selects the best candidate. The solutions that have been pro-
posed in view of this situation center around concepts like (i) true optionality,
according to which more than one candidate can be optimal due to an iden-
tical constraint profile (recall the above discussion of complementizer-trace
effects); (ii) constraint ties, which come in various versions (global and local,
ordered, conjunctive, and disjunctive) and all somehow incorporate the idea
that two (or more) constraints are equally important; (iii) pseudo-optionality,
which rests on the idea that the observed optionality is only apparent, and
reducible to different optimization procedures in different candidate sets; and
(iv) neutralization again, essentially an elaborate version of (iii). It turns
out that none of these solutions is completely unproblematic. See Müller
(2000:chapter 5) for a critical overview.
Degrees of Grammaticality
According to the definition of optimality in (70), an optimal candidate is
grammatical, and a suboptimal candidate is invariably ungrammatical, no
matter what the relative quality of its constraint profile is in comparison
with other suboptimal candidates. Without further assumptions, this makes
it impossible to account for degrees of grammaticality (or acceptability) in a
syntax-internal way, in contrast to what is the case in government and binding
theory (albeit only by stipulation; cf., e.g., the traditional distinction between
"mild" Subjacency violations and "strong" ECP violations).
Cumulativity
A related property of optimality-theoretic syntax is that, in its standard form,
it does not capture cumulative effects; in government and binding theory,
cumulativity manifests itself in the assumption that a sentence gets "more
ungrammatical," the more constraints it violates. The reason for optimality
theory's failure to integrate cumulativity is that many violations of a lower-
ranked constraint cannot outweigh a single violation of a higher-ranked con-
straint. However, as we have seen, this consequence does not hold if we adopt
the mechanism of local constraint conjunction. Whether this is a positive or
negative result remains to be seen.
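The point about strict domination and its relaxation by conjunction can be illustrated with a toy Python example. Everything in it is hypothetical: the constraints HIGH and LOW, the candidates A and B, the helper names, and the simplifying assumption that all violations of LOW fall within a single local domain, so that two of them trigger a violation of the self-conjoined constraint LOW2 = LOW & LOW.

```python
# Toy illustration of cumulativity via self-conjunction (hypothetical constraints).
def best(candidates, ranking):
    """Return the candidate with the lexicographically smallest violation profile."""
    def profile(name):
        return tuple(candidates[name].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

def add_self_conjunction(violations, constraint):
    """Add one violation of C & C (written C2) if C is violated at least twice.

    Simplifying assumption: all violations of C occur in one local domain.
    """
    extended = dict(violations)
    extended[constraint + "2"] = int(violations.get(constraint, 0) >= 2)
    return extended

plain = {"A": {"HIGH": 1}, "B": {"LOW": 2}}
conjoined = {name: add_self_conjunction(v, "LOW") for name, v in plain.items()}

# Strict domination only: two LOW violations cannot outweigh one HIGH violation.
print(best(plain, ["HIGH", "LOW"]))
# With LOW2 ranked above HIGH, the two LOW violations jointly become fatal.
print(best(conjoined, ["LOW2", "HIGH", "LOW"]))
```

The ranking LOW2 » LOW follows the universal meta-restriction on conjunction; placing LOW2 above HIGH is a language-particular choice assumed for the example.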
Parameterization
Work in government and binding theory and the minimalist program has fo-
cussed on morphological properties of lexical items as factors that determine
parameterization. Such a view can in principle be reconciled with optimality-theoretic
syntax without too much ado (one and the same syntactic constraint
ranking may yield different optimal candidates if the morphological proper-
ties of these candidates differ from language to language, and there are con-
straints that refer to these morphological properties). However, in practice,
work in optimality-theoretic syntax has often sought to account for syntactic
parameterization exclusively in terms of syntactic reranking, and either deny
a relation to morphology, or view morphological properties not as the ba-
sis, but as a reflex of syntactic parameterization. Again, this issue is far from
being settled; for opposing views, see, e.g., Grimshaw & Samek-Lodovici
(1998) and Legendre, Smolensky & Wilson (1998) on the one hand, and
Vikner (2000) on the other.
Another recurring question in the optimality-theoretic approach to parame-
terization is whether every reranking of constraints that is logically possible
is also linguistically plausible (i.e., results in a potential grammar). The hypothesis
that this is so is known as factorial typology, and is the focus of much
recent work.
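What a factorial typology amounts to can be sketched by enumerating all rankings of a small constraint set. The Python below reuses the three constraints of T12 purely as an illustration; the candidate labels and the helper `winner` are assumptions, and the resulting "typology" is a toy, not an empirical claim.

```python
# Sketch: factorial typology over the three constraints of T12 (toy example).
from itertools import permutations

CONSTRAINTS = ["ADJ-ISL", "OP-SPEC", "*∅"]
CANDIDATES = {
    "wh-movement out of adjunct": {"ADJ-ISL": 1},
    "wh in situ": {"OP-SPEC": 1},
    "null parse": {"*∅": 1},
}

def winner(ranking):
    """Return the optimal candidate under one total ranking of CONSTRAINTS."""
    return min(
        CANDIDATES,
        key=lambda name: tuple(CANDIDATES[name].get(c, 0) for c in ranking),
    )

# Group the 3! = 6 logically possible rankings by the output they select.
typology = {}
for ranking in permutations(CONSTRAINTS):
    typology.setdefault(winner(ranking), []).append(" >> ".join(ranking))

for output, rankings in typology.items():
    print(output, rankings)
```

Each of the six rankings selects the candidate that violates only the lowest-ranked constraint, so the six rankings collapse into three predicted language types. The hypothesis discussed above is that every such type corresponds to a possible grammar.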
Multiple Optimization
Following Prince & Smolensky (1993), it is standardly assumed that there is
exactly one optimization procedure in syntax; the candidates are evaluated
only once. An alternative that is considered in Prince & Smolensky (1993) is
that optimization procedures can affect candidates more than once. Recently,
this idea has been pursued in various ways in optimality-theoretic syntax.
Several proposals rely on the distinction between interpretive optimization
and expressive optimization: Interpretive optimization may precede expres-
sive optimization (see Wilson 1999), expressive optimization may precede
interpretive optimization (see Hendriks & de Hoop 1999), or the two proce-
dures may influence each other (see Blutner 2000 and Jäger & Blutner 2000).
Heck (1998:this volume) argues that the government and binding model can
be transferred into optimality-theoretic syntax by assuming that optimization
applies three times: First, D-structures are subject to optimization; second, the
optimal D-structure output serves as the input to S-structure optimization; fi-
nally, the optimal S-structure output serves as the input to LF optimization.
Fanselow, Kliegl & Schlesewsky (1999) develop an optimality-theoretic approach
to parsing that is based on the idea that parsing can be viewed as an
iteration of optimization procedures that stop when the final word of a sen-
tence has been taken in. Finally, Heck & Müller (2000) adopt a minimalist
syntax in which each cyclic node (XP) created in the derivation is subject to
optimization; only the optimal XP is submitted to the next step of the deriva-
tion, and so on, until the optimal root node is determined. Thus, in this system,
optimization is not just multiple; it is local in the sense that each optimization
procedure affects only a small unit.
None of these cases of multiple optimization can be viewed as a notational
variant of standard, single optimization. It remains to be seen to what extent
multiple optimization is a viable alternative.
6 The Contributions to This Volume
Most of the papers in this volume originate from a workshop at the 21st
Annual Conference of the DGfS (German Linguistic Society), which took
place at the University of Constance in February 1999. The contributions
have in common that they discuss pieces of empirical evidence for which
a competition-based approach has some initial plausibility. They are all pri-
marily concerned with optimality theory, and they take up a number of the
open issues that were just mentioned.
Büring's paper is a study of free word order in German, a domain that
has been tackled in terms of violable and ranked constraints in pre-optimality
work going back to the 1970s and 1980s. Like Choi (1999), Büring
presents an optimality-theoretic analysis that rests on Lenerz's (1977) seminal
work. Central theoretical notions that play a role include optionality, degrees
of grammaticality, and, in particular, the prosody/syntax interface.
Fanselow & Cavar adopt the copy theory of movement and assume that
overt and covert movement both apply before spell-out. The crucial difference
relates to the question of which members of a copy chain are pronounced,
and which are deleted. To give a comprehensive answer to this question,
the authors discuss evidence from a variety of languages that includes long-distance
and partial wh-movement, the wh-copy construction, the NP split
construction, and instances of head movement. They develop an optimality-
theoretic approach that reconciles features of the analyses in Pesetsky (1998)
and Grimshaw (1997), and that relies on a system of multiple (local) opti-
mization which integrates Chomsky's (1998) concept of a phase.
As in the case of word order, it has often been argued in pre-optimality-theoretic
analyses of relative quantifier scope that the notions of violability
and ranking (or weight) of interacting factors play an important role. Fischer
sets out to transfer some main results of one such study (viz., Pafel's 1998
approach to quantifier scope in German) into optimality theory. In view of
the fact that this approach also employs the notions of optionality (in the
guise of scope ambiguity) and cumulativity, Fischer develops an analysis that
rests on constraint ties and local constraint conjunction.
Heck is also concerned with quantifier scope in German. Based on the ob-
servation that scope relations at LF are highly dependent on word order at
S-structure, which is in turn strongly influenced by the variable order of ar-
guments at D-structure, he argues for a new system of multiple optimization
(called "cyclic optimization"). This system takes the government and binding
organization of grammar as a starting point and postulates three optimization
procedures: at D-structure, at S-structure, and at LF.
In his experimental study of English gapping constructions, Keller ob-
serves the influence of various interacting constraints. On the basis of this
evidence, he argues for an approach that incorporates features of optimality-
theoretic syntax but also provides room for (a) cumulativity of constraint
violation; (b) gradient acceptability of candidates (related to the number
and quality of constraint violations); and (c) a distinction between "hard"
and "soft" constraints (e.g., a hard clause-mateness constraint, and soft sub-
ject/predicate and minimal distance constraints), the latter being violable and
subject to choice of context.
Like Büring, Lenerz addresses free word order structures in German. This
paper complements Büring's, since Lenerz argues that the empirical evidence
does not in fact support a competition-based approach. Going through all the
main pieces of word order evidence that have been analyzed in terms of com-
petition, Lenerz shows that an analysis that focusses on the variable semantic
and pragmatic contributions of definite and indefinite NPs in different posi-
tions in the German middle field can yield empirically adequate results with-
out any recourse to the notion of competition; the particular approach that he
develops relies on choice functions and the partitioning of clauses into do-
mains with background-determined reference and with immediate sentence
constituent reference.
Schmid's paper is a close investigation of different ways to handle optional-
ity in optimality-theoretic syntax. After reviewing the options that exist in the
literature, Schmid focusses on a comparison of a specific (global) notion of
constraint tie and the concept of neutralization. The cases of optionality that
serve as the empirical basis are (a) complementizer drop in English, (b) wh-
movement in French root clauses, and (c) the German "Ersatzinfinitiv" (IPP)
construction. For each of these phenomena, a global tie analysis is compared
with a neutralization analysis; general strategies are suggested that permit a
transfer from one type of approach to the other; and a conclusion is drawn
that ultimately favors the neutralization solution.
The focus of Vikner's paper is the conflict that arises between two well-
motivated constraints in Icelandic: First, the relative scope of quantified items
must correspond to their surface order; second, NPs can undergo object shift
in front of an adverbial only if the main verb has undergone movement. In-
terestingly, it seems as though relative scope does not have to correspond
to surface order in exactly those contexts in which object shift is blocked.
Vikner shows that this supports an optimality-theoretic analysis in which the
first constraint is ranked below the second one, and is thus violable in the case
of conflict. Finally, the analysis is extended to German.
Vogel takes as a starting point the observation that free relative construc-
tions by their very nature strongly suggest constraint violability and constraint
ranking: They are incompatible with the standard assumption that there is
a one-to-one correspondence between Case assigners and items that are as-
signed Case. Moreover, Case conflicts can show up in free relatives which
are often resolvable by ranking (but may also result in absolute ungrammati-
cality). On this basis, Vogel develops an optimality-theoretic analysis of free
relative constructions in German, and he investigates the typological implica-
tions that result from reranking the proposed constraints; among other things,
the analysis sheds new light on the concepts of factorial typology and neu-
tralization.
Finally, Wanner observes that there are conflicts between linking rules
which become manifest in the domain of psych verbs in English. For instance,
the CONTROL-RULE favors experiencers as external arguments, whereas
the CAUSER-RULE prefers causers as external arguments; in an optimality-
theoretic approach, the conflict can be resolved by ranking the latter rule
above the former, and this is what explains the difference between Mary
frightens John (where the theme is a causer) and John fears Mary (where
the theme is not a causer). An interesting theoretical aspect of this analysis is
that the competing candidates are not sentences, but argument structures.
We believe that the papers collected in this volume give a fair indication
of both the potential and the limitations of optimality-theoretic syntax, and
of competition-based syntax in general. To us, they strongly suggest that it is
fruitful to further explore the concept of syntactic competition, even though
an eventual success of this enterprise cannot be taken for granted at this point.
Acknowledgments
We would like to thank Kirsten Brock for the enormous amount of excellent
work that she put into the present volume. We are also grateful to Oliver
Avieny and Annette Farhan for their editorial assistance. Müller's work was
supported by DFG grants MU 1444/1-1,2-1; Sternefeld's work was supported
by a DFG grant within the SFB 441.
Notes
1. Our use of the term global follows its original interpretation in Lakoff (1971)
throughout this introduction. Sometimes, global is understood in a rather dif-
ferent sense in the literature (including Chomsky 1995 and Collins 1997), as a
synonym for translocal or transderivational (see below). As we will see, in this
second interpretation, a global constraint can in fact not be checked by exclusively
looking at a given syntactic object S_i.
2. The resumptive pronoun strategy is by itself marginal in English and is cho-
sen here mainly for expository reasons; see Chomsky (1981:173) for a discus-
sion of the case at hand. However, resumptive pronouns as a last resort in cases
where movement is blocked are widely attested in other languages. See Shlonsky
(1992), Pesetsky (1998), and the references cited there.
3. Note, however, that Chomsky (1982:63f.) envisages an account in terms of the
Avoid Pronoun Principle, which, as we will see, is an exception insofar as it is in
fact a non-local constraint in government and binding theory.
4. As Manzini shows, the Control Rule is actually a theorem that can be derived
from more primitive assumptions. This need not concern us here.
5. The Avoid Pronoun Principle has been applied to pro-drop phenomena in lan-
guages like Italian by Haegeman (1994:217). The idea here is that the availabil-
ity of the empty pronominal pro in the subject position of finite clauses tends to
make the use of an overt pronoun impossible; on this view, overt subject pronouns
can only show up in pro-drop languages if they fulfill a function that pro cannot
fulfill (like, e.g., focus interpretation). Also recall Chomsky's (1982) analysis of
resumptive pronouns that was mentioned above.
6. Also see Reinhart (1983) on a version of binding theory that relies on pragmatic
constraints of this type.
7. The notion of a numeration is first introduced in Chomsky (1993), so this is
strictly speaking an anachronism.
8. This view is later abandoned in the minimalist program. Thus, Chomsky (1998:6)
speculates that "language design might be optimal... approaching a 'perfect so-
lution' to minimal design specifications."
9. We can assume that a position is "appropriate" for insertion of intermediate traces
if the resulting structure does not violate local constraints - e.g., those on improper
movement (an A-bar trace must not end up within an A-chain, see May
1979; an adjoined trace must not end up within a chain headed by an antecedent
in a specifier position (and vice versa), see Müller & Sternefeld 1996).
10. At least, this holds as long as we are not prepared to assume that embedded V/2
in German is derived by deletion of a complementizer daß that is present in the
numeration.
11. All XPs which are not L-marked are barriers. XPs which are not in complement
position are therefore always barriers.
12. Note that the concept of Form Chain cannot undermine this reasoning because
the two instances of chain formation applying to the wh-phrase in D3 are not
adjacent, but interrupted by another operation - that of NP raising to subject
position.
13. See Chomsky (1993, 1995) for discussion of the various options that arise here.
14. Also compare Kitahara's (1997) reconstruction of Procrastinate effects in terms
of Fewest Steps; see section 3.3 on Procrastinate.
15. Note in passing that Chomsky's (1991) way out in terms of 'stylistic movement'
that can be chosen in the case of optional overt movement is not viable here for
obvious reasons, LF movement never being 'stylistic.'
16. The derivation originally envisaged by Collins (1994) is actually even more com-
plex since it involves two additional VP-adjunction operations. The derivation in
(46) is sufficient for our present purposes, though.
17. One might think that Fewest Steps would also suffice to block D1 in favor of D2.
However, assuming that the two movement operations in D1 can be reanalyzed as
a single instance of Form Chain, this is not the case. That said, it is worth noting
that Shortest Paths would indeed suffice to account for the ban on V-in situ in
French that was explained by invoking Fewest Steps in Chomsky (1991). The
derivation in (17) also instantiates yo-yo movement; the only essential difference
from the derivation in (46) is that yo-yo movement is interrupted by spell-out in
the former case, but not in the latter.
18. Two additional assumptions must be clarified. First, in line with what is probably
the majority of literature on the topic, Nakamura (1998) postulates that wh-constructions
in Tagalog do not actually involve movement of the wh-phrase,
but rather movement of an empty operator in a relative clause-like construction.
That is, English questions like "What did Juan buy?" are rendered as "What is it
Op1 that Juan bought t1?" For expository purposes, we will ignore this complication
in what follows, but the correct structure is still reflected in the translation.
Second, note that actual positions of items that are overtly visible do not always
reflect the position that is theoretically relevant in Nakamura's (1998) analysis. In
particular, he assumes that the structural subject position SpecT is left-peripheral,
and in many cases can only be filled at LF. Still, subject NPs behave in every re-
spect as if they occupied the SpecT position overtly. This covert subject raising
with overt effects is indicated here by italicizing the relevant subject NP; thus ital-
icization is meant to imply that the italicized NP is pronounced in the position
of its trace. Note that these complications do not arise in a language like Toba
Batak, which otherwise exhibits the same general effect; see Schachter (1984)
and Sternefeld (1995).
19. Note, however, that a residue of the Procrastinate condition still shows up in
Chomsky's (1998:14) translocal principle that prefers Agree over Move. Also
see the next section.
20. It is worth noting that the notion of optimality has systematically been used in
minimalist syntax, apparently without recourse to optimality theory as developed
by Prince & Smolensky (1993), and at a time when optimality-theoretic syn-
tax papers did not yet exist. See, e.g., Chomsky (1993:4), and, for explicit uses
of the notion, Collins (1994:46), Kitahara (1997:18), and Frampton & Gutman
(1999:5).
21. See Müller (2000:chapter 4) for a slightly more realistic (albeit still simplified)
example.
22. See Fanselow (1997), who argues that was2 can be scrambled to a position in
front of wer1 before wh-movement takes place in (61-b). However, to avoid
a blocking of (61-a) by (61-b) in a system with translocal constraints (which
Fanselow does not assume), it would then also have to be ensured that the two
derivations do not compete; this could be achieved by assuming that the presence
vs. absence of the trigger for was-scrambling creates two different reference sets.
An alternative would be to assume that whereas translocal constraints cannot be
parameterized, the definition of reference set can be. Without recourse to inter-
mediate scrambling, reference sets might then be defined in German in such a
way that (61-a) and (61-b) do not compete, whereas they could be defined differently
in English, so that the English counterparts of these derivations do compete.
See Sternefeld (1997) for an extensive discussion of this option.
23. Recall, however, that Chomsky retains some translocal constraints even in more
recent work, though often hesitantly and with a sense that if truly necessary,
translocality would qualify as an "imperfection" of language. Thus, directly after
suggesting the Shortest Paths account of the ban on the acyclic derivation of
freezing effects with NP raising cited above, Chomsky (1995, 328) comes close
to revoking it by stating: " - though the issue is nontrivial, in part because we are
invoking here a 'global' [i.e., translocal] notion of economy of the sort we have
sought to avoid."
24. Also see Hornstein (2000), who gives the same kind of account in a minimalist
setting.
25. The standard way out of the problems created by optionality chosen by pro-
ponents of blocking syntax is to find subtle semantic differences between the
relevant sentences - in other words, to deny true optionality.
26. The former strategy is discussed in Heck (1998:this volume); the latter strategy
is pursued in Müller (1997).
27. We hasten to add that all the case studies in this section are simplified versions
of the actual analyses proposed in the literature. In the present context, we are
mainly interested in the logic of the argument, not in the specific (or maximally
elegant) formulation of the constraints. Accordingly, we leave open the questions
of defining candidates and candidate sets where they do not seem to be important
for our present purposes. Note also that the simplification is particularly radical in
Wilson's (1999) case. Based on evidence from binding theory, Wilson argues for
an elaborate model of multiple optimization in syntax (see section 5.3.3 below);
he is concerned with many more data and, eventually, typological universals that
the naive analysis presented here cannot possibly account for.
28. The ranking of T-LEX-GOV and STAY could also be reversed; this ranking is not
determined by the cases we are interested in here.
29. To avoid the issue of do-support in root clauses, which is orthogonal to the issue
of complementizer-trace effects in embedded clauses, we have chosen examples
here in which the SpecC[+wh] target position is in an embedded clause.
30. Note that a violation of T-LEX-GOV will automatically imply a violation of the
more general STAY constraint. Hence, given that there is no other constraint on
which C1 and C2 differ, it follows that C1's constraint profile is better than that
of C2 under any ranking; i.e., C1 harmonically bounds C2.
31. The relevant constraints are BAR3 (see below) and FILL in Legendre, Smolensky
& Wilson (1998), and ISLAND-COND and SILENT-TRACE in Pesetsky (1998).
CNPC should be viewed as a placeholder for one or more general conditions
that yield the described effects; RES is arguably part of a more general system
of constraints on pronouns. Also see Hornstein (2000) on resumptive pronouns and
islands, including CNPC, in English.
32. Operator movement in relative clauses in English can be achieved by (something
along the lines of) Grimshaw's (1997) ranking OP-SPEC » STAY (plus STAY
» RES); this option is chosen by Legendre, Smolensky & Wilson (1998). In
contrast, Pesetsky (1997) does not assume the movement operation in relative
clauses to be subject to optimization; in his view, Gen does not generate the in-
situ version in English in the first place.
33. It seems that in order to achieve compatibility of this account of resumptive pro-
nouns with the account of the lack of complementizer-trace effects with adjuncts
sketched in the preceding section, the ranking RES » T-LEX-GOV would have
to be assumed. That said, one will probably have to assume independent, high-
ranked constraints that block resumptive pronouns in adjunct chains, anyway.
34. Is *PRON confined to personal (and possessive) pronouns, or does it also cover
anaphoric pronouns? Under the first option, REF-ECON and *PRON might be the
same constraint. Under the second option, we would in fact face what is known
as a subhierarchy of constraints: A general constraint *PRON prohibits all kinds
of pronouns, a more specific constraint *PERS-PRON (= REF-ECON) prohibits
only personal pronouns, and an even more specific constraint *RES-PERS-PRON
(= RES) prohibits only personal pronouns used as resumptives.
35. Ackema & Neeleman call this constraint STAY, but this may be somewhat un-
fortunate, given that MIN-CHAIN differs substantially from Grimshaw's (1997)
STAY, in the same way that Shortest Paths differs from Fewest Steps.
36. Local constraint conjunction makes it possible to reintroduce the concept of cumulativity
into optimality-theoretic syntax: Multiple violations of a given constraint
Con1 may not directly outweigh a single violation of a higher-ranked
constraint Con2, but can do so indirectly by triggering a violation of an even
higher-ranked constraint Con1'.
37. An interesting question is whether a translation of translocal constraints into lo-
cal constraints is actually needed in optimality theory; in other words: Could
not some of the violable and ranked constraints in the H-Eval part be translocal
themselves, just like the basic optimality principle is? For instance, one could en-
visage a translocal SHORTEST PATHS that fulfills the same task as MIN-CHAIN
or BAR': SHORTEST PATHS selects the candidate C_i with the shortest movement
paths in a given candidate set, and this can be signalled by stipulating that all candidates
except C_i are assigned a star * under this constraint. Such an approach
may raise additional complexity issues, and it has - to the best of our knowledge
- not yet been proposed in optimality-theoretic syntax. Still, it seems to us to be
viable in principle. Indeed, a translocal constraint of this type has been proposed
for phonology in Prince & Smolensky (1993) (H-NUC, which, however, is even-
tually replaced there by a subhierarchy of local constraints that are derived by a
process of harmonic alignment).
38. In both cases, only those aspects of the input are considered that matter for the
faithfulness constraints under consideration.
39. Note that the distinction between the actual scope position for a wh-item (here
designated by [+wh]) and the "intended" scope position for a wh-item (here designated
by Σ) is fundamental in the analysis by Legendre, Smolensky & Wilson
(1998), and not an artefact of the input-independent approach.
40. Assuming the concept of input, this constraint amounts to the statement that the
input must not be left completely unrealized.
41. Here we exploit the fact that was is ambiguous between a wh-reading and an indefinite
reading in (colloquial) German. This does not hold for other wh-phrases
like welches Buch ('which book'), which, however, also cannot be extracted from
adjunct islands. For these cases, the neutralization approach would have to be
complicated in such a way that the candidate with the [-wh] NP (perhaps ein
Buch ('a book')) deviated from the one that must be blocked not just in feature
specification, but also in morphological shape. Such complications do not affect
the general argument, though.
42. A side remark: The candidates in (79-b) and (79-d) that were discussed in the pre-
vious section also signal input neutralization; these candidates are also optimal
in candidate sets where they do not violate the respective faithfulness constraint.
43. This account rests on the concept of input. Is it possible to maintain the analysis
without reference to this notion? It is, but the task is slightly more difficult here
than in the cases that were discussed in the last section. We have to ensure that
an output candidate like (87), with a C[-wh] and a was[-wh], has abstract [+wh]
or [-wh] markers that encode the postulated input difference, and that can be
referred to by an appropriately revised FAITH[WH] constraint.
References
Ackema, Peter — Ad Neeleman
1998 Optimal questions. Natural Language and Linguistic Theory 16: 443-490.
Archangeli, Diana — D. Terence Langendoen
1996 Afterword. In: Diana Archangeli & D. Terence Langendoen (eds.), Optimality
Theory: An Overview, 200-215. Oxford: Blackwell.
Aronoff, Mark
1976 Word Formation in Generative Grammar. Cambridge, MA: MIT Press.
Baker, Carl L.
1970 Notes on the description of English questions: The role of an abstract
question morpheme. Foundations of Language 6: 197-219.
Bakovic, Eric — Ed Keer
1999 Optionality and ineffability. Ms., Harvard University & UMass.,
Amherst. To appear in: Géraldine Legendre, Jane Grimshaw & Sten
Vikner (eds.), Optimality-Theoretic Syntax, Cambridge, MA: MIT Press.
Blutner, Reinhard
2000 Some aspects of optimality in natural language interpretation. Ms.,
Humboldt-Universität Berlin.
Burzio, Luigi
1991 The morphological basis of anaphora. Journal of Linguistics 27: 81-105.
Choi, Hye-Won
1999 Optimizing Structure in Context: Scrambling and Information Structure.
Stanford: CSLI Publications.
Chomsky, Noam
1973 Conditions on transformations. In: Stephen Anderson & Paul Kiparsky
(eds.), A Festschrift for Morris Halle, 232-286. New York: Academic
Press.
Chomsky, Noam
1981 Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam
1982 Some Concepts and Consequences of the Theory of Government and
Binding. Cambridge, MA: MIT Press.
Chomsky, Noam
1986a Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam
1986b Knowledge of Language. New York: Praeger.
Chomsky, Noam
1991 Some notes on economy of derivation and representation. In: Robert
Freidin (ed.), Principles and Parameters in Comparative Grammar, 417-
454. Cambridge, MA: MIT Press.
Chomsky, Noam
1993 A minimalist program for linguistic theory. In: Kenneth Hale & Samuel
Jay Keyser (eds.), The View from Building 20, 1-52. Cambridge, MA:
MIT Press.
Chomsky, Noam
1995 Categories and transformations. (Chapter 4). In: The Minimalist Pro-
gram, 219-394. Cambridge, MA: MIT Press.
Chomsky, Noam
1998 Minimalist inquiries. Ms., MIT, Cambridge, MA.
Chomsky, Noam — Howard Lasnik
1993 Principles and parameters theory. In: Joachim Jacobs, Arnim von Ste-
chow, Wolfgang Sternefeld & Theo Vennemann (eds.), Syntax, vol. I,
506-569. Berlin: de Gruyter.
Cole, Peter
1982 Subjacency and successive cyclicity: Evidence from Ancash Quechua.
Journal of Linguistic Research 2: 35-58.
Collins, Chris
1994 Economy of derivation and the generalized proper binding condition. Lin-
guistic Inquiry 25: 45-61.
Collins, Chris
1997 Local Economy. Cambridge, MA: MIT Press.
Déprez, Viviane
1991 Economy and the that-t effect. In Proceedings of the Western Conference
on Linguistics 4: 74-87.
DiSciullo, Anna-Maria — Edwin Williams
1987 On the Definition of Word. Cambridge, MA: MIT Press.
Epstein, Samuel David
1992 Derivational constraints on A'-chain formation. Linguistic Inquiry 23:
235-259.
Fanselow, Gisbert
1989 Konkurrenzphänomene in der Syntax. Linguistische Berichte 123: 385-414.
Fanselow, Gisbert
1991 Minimale Syntax. Habilitation thesis, Universität Passau.
Fanselow, Gisbert
1997 The proper interpretation of the minimal link condition. Ms., Universität
Potsdam.
Fanselow, Gisbert — Reinhold Kliegl — Matthias Schlesewsky
1999 Optimal parsing. Ms., Universität Potsdam.
Fox, Danny
1995 Economy and scope. Natural Language Semantics 3: 283-341.
Frampton, John — Sam Gutman
1999 Cyclic computation. Syntax 2: 1-27.
Grimshaw, Jane
1994 Heads and optimality. Handout, Universität Stuttgart.
Grimshaw, Jane
1997 Projection, heads, and optimality. Linguistic Inquiry 28: 373-422.
Grimshaw, Jane — Vieri Samek-Lodovici
1998 Optimal subjects and subject universals. In: Pilar Barbosa et al. (eds.),
Is the Best Good Enough?, 193-219. Cambridge, MA: MIT Press &
MITWPL.
Haegeman, Liliane
1994 Introduction to Government and Binding Theory. Oxford: Blackwell.
Haider, Hubert
1983 Connectedness effects in German. Groninger Arbeiten zur Germanistis-
chen Linguistik 23: 82-119.
Heck, Fabian
1998 Relativer Quantorenskopus im Deutschen - Optimalitätstheorie und die
Syntax der Logischen Form. M.A. thesis, Universität Tübingen.
Heck, Fabian — Gereon Müller
2000 Repair-driven movement and the local optimization of derivations. Ms.,
Universität Stuttgart & IDS Mannheim. Short version in: Glow Newslet-
ter 44: 26-27.
Hendriks, Petra — Helen de Hoop
1999 Optimality theoretic semantics. Ms., University of Groningen. (Cognitive
Science and Engineering Prepublications 98-3.)
Hornstein, Norbert
2000 Is the binding theory necessary? Ms., University of Maryland.
Jäger, Gerhard — Reinhard Blutner
2000 Against lexical decomposition in syntax. Ms., ZAS & Humboldt-
Universität Berlin.
Kiparsky, Paul
1982 From cyclic phonology to lexical phonology. In: Harry van der Hulst &
Neil Smith (eds.), The Structure of Phonological Representations, vol 1,
131-175. Dordrecht: Foris.
Kitahara, Hisatsugu
1993 Deducing 'superiority' effects from the shortest chain requirement. Har-
vard Working Papers in Linguistics 3: 109-119.
Kitahara, Hisatsugu
1997 Elementary Operations and Optimal Derivations. Cambridge, MA: MIT
Press.
Koster, Jan
1987 Domains and Dynasties. Dordrecht: Foris.
Lakoff, George
1971 On generative semantics. In: Danny Steinberg & Leon Jakobovits (eds.),
Semantics, 232-296. Cambridge: Cambridge University Press.
Lasnik, Howard — Mamoru Saito
1992 Move a. Cambridge, MA: MIT Press.
Legendre, Géraldine — Paul Smolensky — Colin Wilson
1998 When is less more? Faithfulness and minimal links in wh-chains. In: Pilar
Barbosa et al. (eds.), Is the Best Good Enough?, 249-289. Cambridge,
MA: MIT Press & MITWPL.
Lenerz, Jürgen
1977 Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Stauffen-
burg.
Manzini, Rita
1983 On control and control theory. Linguistic Inquiry 14: 421-446.
Marantz, Alec
1995 The minimalist program. In: Gert Webelhuth (ed.), Government and
Binding Theory and the Minimalist Program, 351-382. Oxford: Black-
well.
May, Robert
1979 Must COMP-to-COMP movement be stipulated? Linguistic Inquiry 10:
719-725.
McCarthy, John — Alan Prince
1995 Faithfulness and reduplicative identity. In: Jill Beckman, Laura Walsh-
Dickie & Suzanne Urbanczyk (eds.), Papers in Optimality Theory, 249-
384. Amherst, MA: UMass Occasional Papers in Linguistics 18.
Müller, Gereon
1997 Partial wh-movement and optimality theory. The Linguistic Review 14:
249-306.
Müller, Gereon
2000 Elemente der optimalitätstheoretischen Syntax. Tübingen: Stauffenburg.
Müller, Gereon — Wolfgang Sternefeld
1996 A-bar chain formation and economy of derivation. Linguistic Inquiry 27:
480-511.
Nakamura, Masanori
1998 Reference set, minimal link condition, and parameterization. In: Pilar
Barbosa et al. (eds.), Is the Best Good Enough?, 291-313. Cambridge,
MA: MIT Press & MITWPL.
Pafel, Jürgen
1998 Skopus und logische Struktur — Studien zum Quantorenskopus im
Deutschen. Habilitationsschrift, Universität Tübingen.
Pesetsky, David
1997 Optimality theory and syntax: Movement and pronunciation. In: Diana
Archangeli & D. Terence Langendoen (eds.), Optimality Theory: An
Overview, 134-170. Oxford: Blackwell.
Pesetsky, David
1998 Some optimality principles of sentence pronunciation. In: Pilar Barbosa
et al. (eds.), Is the Best Good Enough?, 337-383. Cambridge, MA: MIT
Press & MITWPL.
Pollock, Jean-Yves
1989 Verb movement, universal grammar, and the structure of IP. Linguistic
Inquiry 20: 365-424.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar. Ms.,
Rutgers University. To appear: Cambridge, MA: MIT Press.
Reinhart, Tanya
1983 Anaphora and Semantic Interpretation. London: Croom Helm.
Richards, Norvin
1997 Competition and disjoint reference. Linguistic Inquiry 28: 178-187.
Schachter, Paul
1984 Studies in the Structure of Toba Batak. UCLA Occasional Papers in Lin-
guistics 5.
Schmid, Tanja
1998 West Germanic "Infinitivus Pro Participio" (IPP) constructions in optimality
theory. In: Tina Cambier-Langeveld, Anikó Lipták, Michael Redford
& Erik Jan van der Torre (eds.), Proceedings of Console VII, 229-244.
Leiden: SOLE.
Shlonsky, Ur
1992 Resumptive pronouns as a last resort. Linguistic Inquiry 23: 443-468.
Speas, Margaret
1995 Generalized control and null objects in optimality theory. In: Jill Beckman,
Laura Walsh-Dickie & Suzanne Urbanczyk (eds.), Papers in Optimality
Theory, 637-653. Amherst, MA: UMass Occasional Papers in
Linguistics 18.
Sternefeld, Wolfgang
1991 Chain formation, reanalysis, and the economy of levels. In: Hubert
Haider & Klaus Netter (eds.), Representation and Derivation in the The-
ory of Grammar, 71-137. Dordrecht: Kluwer.
Sternefeld, Wolfgang
1995 Voice phrases and their specifiers. FAS Papers in Linguistics 3: 48-85.
Sternefeld, Wolfgang
1997 Comparing reference sets. In: Chris Wilder, Hans-Martin Gärtner & Manfred
Bierwisch (eds.), Economy in Linguistic Theory, 81-114. Berlin:
Akademieverlag.
Vikner, Sten
2000 Checking strong verbal inflection in optimality theory. Ms., Universität
Stuttgart.
Williams, Edwin
1986 A reassignment of the functions of LF. Linguistic Inquiry 17: 265-299.
Williams, Edwin
1997 Blocking and anaphora. Linguistic Inquiry 28: 577-628.
Wilson, Colin
1999 Bidirectional optimization and the theory of anaphora. Ms., Johns Hop-
kins University. To appear in: Géraldine Legendre, Jane Grimshaw &
Sten Vikner (eds.), Optimality-Theoretic Syntax, Cambridge, MA: MIT
Press.
Let's Phrase It!
Focus, Word Order, and Prosodic Phrasing in German
Double Object Constructions
Daniel Büring
This paper presents a case study in the interaction of word order, prosody and
focus. The construction under consideration is the double object construction
in German. The analysis proposed is in line with the following more general
hypotheses:
First, focus and word order do not interact directly. There are no grammati-
cal rules that relate focus to specific phrase structural positions. Rather, focus
interacts with prosodic phrasing, which in turn may interact with word order.
Second, the kind of word order variation under investigation here is gov-
erned by two potentially conflicting types of constraints: morphosyntactic
constraints that express ordering preferences relating to case, definiteness and
possibly other categories, and prosodic constraints that define what a prosodic
structure should look like. If these constraint families call for incompatible
demands, languages may allow only the morphosyntactically perfect struc-
ture, or only the prosodically perfect structure, or, as is arguably the case in
German, both.
Third, violable ranked constraints provide a well-suited framework to
account for these kinds of phenomena. Both the morphosyntactic and
the prosodic constraints, as well as those governing the relation between
prosody and focus, are implemented as markedness constraints. Their rela-
tive (non-)ranking accounts for the variation observed within a language and
cross-linguistically.
1 Introduction
German, like many of its Germanic cousins, is a verb-second language. What
sets it, along with Dutch, apart from the other Germanic verb-second languages
is what Bech (1955/57) calls its Klammerstruktur (lit.: 'bracket structure').
All non-finite verb forms appear at the very end of the clause, so that
the finite verb in second position and the non-finite ones in final position together
form a sort of bracket around the main body of the clause.
(1) initial position | finite verb | Mittelfeld | non-finite verb forms
As indicated, this main body of the clause, as delimited by the finite verb to
its left and the non-finite ones to its right, is traditionally called the Mittelfeld
('middle field').
In embedded clauses, the initial position usually remains empty and the
finite verb is found at the end, too. In its place the subordinating complemen-
tizer constitutes the left bracket of the Mittelfeld.
(2) complementizer | Mittelfeld | non-finite verb forms | finite verb
The Mittelfeld contains all non-clausal complements of the verb, some non-
finite clausal ones, and most adverbials (almost any of these can alternatively
occupy the initial position in declarative main clauses, a fact we can ignore
here). The relative order among the elements in the Mittelfeld is basically
free. In particular, German, unlike Dutch, allows reordering among the nom-
inal arguments quite freely. Subject and object as well as the two objects in
a ditransitive construction can be found in various orders. The following ex-
amples of embedded clauses from Müller (1998) (his (31) and (36)) illustrate
this:
(3) nominative-accusative-order
a. ... dass eine Frau den Fritz geküsst hat.
that a woman the-ACC Fritz kissed has
b. ... dass den Fritz eine Frau geküsst hat.
that the-ACC Fritz a woman kissed has
'... that a woman kissed Fritz.'
(4) dative-accusative-order
a. ... dass man das Buch dem Fritz geschickt hat.
that one the book the-DAT Fritz sent has
b. ... dass man dem Fritz das Buch geschickt hat.
that one the-DAT Fritz the book sent has
'...that someone sent Fritz the book.'
All arguments are nominal. Overt case marking for nominative, dative and
accusative is found on articles. As one might suspect, (4) allows even more
different orderings involving subject-object-reordering, which we did not list
here.
It has long been observed that various factors determine the acceptability
of a given word order in a particular case, among them case, definiteness,
animacy, and focus (cf. Lenerz 1977, Uszkoreit 1987, Müller 1998, among
others). In the present study we will concentrate on the particular role that
focus plays in relation to case (which we take as representative of the other
morphosyntactic constraints). We also limit our discussion to the relative or-
dering of accusative and dative objects in double object constructions.
2 Focus and Word Order: A Summary of the Proposal
In his seminal study on German word order, Lenerz (1977) found that there
are two main semantic/pragmatic factors that co-determine object ordering
in German double object constructions: definiteness and focus. Simplifying
slightly, the generalizations in (5) hold:
(5) a. Definite NPs precede indefinite NPs.
b. Non-focused NPs precede focused NPs.
An equally important finding of that study was that there is one purely mor-
phosyntactic factor involved, too: 1
(6) Dative NPs precede accusative NPs.
As Lenerz observed, these three conditions interact in a complex and interesting
fashion: Either one or both of (5) can be violated, as long as (6) is met;
and (6) can be violated only if both conditions in (5) are met. Put differently,
if the dative object precedes the accusative object (henceforth DatO>AccO
order), any distribution of focus and (in)definiteness between the objects is
possible; but the accusative object can precede the dative object (henceforth
AccO>DatO order) only if DatO is in focus and AccO is definite. Lenerz
(1977) concluded from this that DatO>AccO is the "unmarked" word order,
and that deviance from it is only justified in compliance with the conditions
in (5).
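The interaction just described can be restated as a small decision procedure. The following is a minimal sketch of the descriptive generalization only, not of the analysis developed below; the function name and the string encodings of order and focus are our own illustrative assumptions.

```python
def order_is_acceptable(order: str, focus: str, acc_definite: bool) -> bool:
    """Restate the generalization for double object constructions.

    order        -- "dat>acc" or "acc>dat" (hypothetical encoding)
    focus        -- which object is in focus: "dat" or "acc"
    acc_definite -- whether the accusative object is definite
    """
    if order not in ("dat>acc", "acc>dat") or focus not in ("dat", "acc"):
        raise ValueError("unknown order or focus value")
    if order == "dat>acc":
        # The unmarked order tolerates any focus/definiteness distribution.
        return True
    # The marked order requires a focused DatO and a definite AccO.
    return focus == "dat" and acc_definite

# Configurations corresponding to examples (7-b), (8-b) and (10) below.
ex_7b = order_is_acceptable("acc>dat", focus="dat", acc_definite=True)
ex_8b = order_is_acceptable("acc>dat", focus="acc", acc_definite=True)
ex_10 = order_is_acceptable("acc>dat", focus="dat", acc_definite=False)
```

On this encoding only the first of the three AccO>DatO configurations comes out as acceptable, which matches the judgments reported for the examples.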
The focus-case interaction is demonstrated in (7) and (8) (Lenerz' (2) and
(3), p. 43). To control for focus, a context-question as in (7) and (8) is pro-
vided; the focus in the answer can then be identified as the constituent that
corresponds to the wh-phrase in the question ([...]F brackets indicate focus,
capitals represent pitch accents).
The DatO > AccO order in (a) is fine in both cases, whereas the AccO >
DatO order in (b) is only acceptable if DatO is in focus (or, as we shall some-
times say, F-marked).
(7) Wem hast du das Geld gegeben?
'Who did you give the money to?'
a. [+def. DatO]F > [+def. AccO]
Ich habe [dem KasSIErer]F das Geld gegeben.
I have the teller the money given
b. [+def. AccO] > [+def. DatO]F
Ich habe das Geld [dem KasSIErer]F gegeben.
I have the money the teller given
'I gave the money to the teller.'
(8) Was hast du dem Kassierer gegeben?
'What did you give to the teller?'
a. [+def. DatO] > [+def. AccO]F
Ich habe dem Kassierer [das GELD]F gegeben.
I have the teller the money given
b. [+def. AccO]F > [+def. DatO]
?*Ich habe [das GELD]F dem Kassierer gegeben.
'I gave the teller the money.'
The definiteness-case interaction is illustrated in (9) and (10) (Lenerz' (18)

and (20) on p. 52f.). DatO > AccO order is possible with an indefinite pre-
ceding a definite, contra (5-a), as in (9).
(9) Was hast du einem Schüler geschenkt?

'What did you give to a student?'
[-def. DatO] > [+def. AccO]F
Ich habe einem Schüler [das BUCH]F geschenkt.
I have a-DAT student the book given
'I gave a student the book.'
But AccO > DatO order is unacceptable if AccO is indefinite; cf. (10) (note
that in both examples the focus follows the non-focus, in accordance with
(5-b)):
(10) Wem hast du ein Buch geschenkt?

'Who did you give a book?'
[-def. AccO] > [+def. DatO]F
*Ich habe ein Buch [dem SCHÜler]F geschenkt.
I have a book the student given
'I gave a book to the student.'
In an unpublished paper (Büring 1996) I proposed reinterpreting Lenerz'

findings in the following terms: DatO > AccO is the base generated VP-
internal order of objects in German; AccO > DatO is the result of a syntactic
movement operation called scrambling, which adjoins AccO to the VP (this
follows a common line of syntactic analysis for German; cf. Webelhuth 1989,
Müller 1991, Vikner 1991). There are two constraints on scrambling, which
can be phrased as in (11):
(11) a. Don't scramble a focused NP!

b. Don't scramble an indefinite NP!
To derive these I proposed utilizing two constraints along the lines of (12)
and (13):
(12) FINALFOCUS (FF)

Focus should be sentence final.
(13) IND(EFINITES)
Indefinites must be properly contained in VP (if they are to receive
an existential reading).
Both these constraints have been proposed in the literature and can be seen
to be independently motivated. I will return to this issue below. They inter-
act with a general syntactic faithfulness constraint that penalizes movement,
including scrambling, which we will call STAY (cf. Grimshaw 1997). Optionality
of movement results where the base order violates FF and the derived
order violates STAY but respects INDEFINITES and FF. Movement is prohib-
ited where the base order fulfills both STAY and FF; it is also prohibited if the
derived order violates INDEFINITES.2
In order to discuss the workings of this system I will implement it in the
form of an optimality grammar, as proposed in Choi (1996) and, independently,
in Büring (1997a). It is the latter proposal I am going to discuss here. Choi's
analysis uses essentially the same constraint tie, between her CN2 (dative
precedes accusative) and NEW (roughly: a non-focused argument precedes a
focused one), to derive focus-related word order variation; since I will
propose a fundamental reanalysis later in this paper, I will not attempt a
comparison of the two accounts here. To achieve the desired results,
INDEFINITES must be undominated, while FINALFOCUS and STAY are tied. The
proposed ranking is thus the one in (14):
(14) INDEFINITES » STAY <<>> FINALFOCUS
The <<>> notation indicates the constraint tie. A tie can be resolved in two
different ways, in this case as in (15-a) or as in (15-b) (cf. Prince & Smolensky's
'ordered global ties'; see also Müller 1999 for more discussion).
(15) a. INDEFINITES » STAY » FINALFOCUS

b. INDEFINITES » FINALFOCUS » STAY
If a structure is optimal under one of these rankings, it is grammatical. If the

winner under (15-a) has a different word order from the one under (15-b),
optionality results. If they are the same, word order is strict.
Note that the only one of these constraints that can ever favor scrambling is
FF: Scrambling may bring an F-marked NP into a position closer to the end
of the clause. In that sense, scrambling in this analysis is focus driven.
It should be fairly easy to see how the facts around indefinites follow in
this system: Scrambling of an indefinite will always result in a violation of
INDEFINITES (VP-adjoined sites are not properly contained in VP) and STAY
(movement has occurred). Leaving it in situ may at worst violate FF. But
INDEFINITES dominates FF in either resolution of the tie (i.e., in both rankings
in (15)), so the optimal candidate will never violate INDEFINITES in favor of
FF.
If the AccO (the potential scramblee) is definite (notated as dAccO), how-
ever, a more interesting constraint interaction can be observed. Suppose DatO
is focus (and AccO is not). Leaving AccO in situ satisfies STAY but violates
FF, since AccO, and not DatOF, is final. This structure, (16-a), will be the
optimal candidate under (15-a). Scrambling AccO across DatO satisfies FF
(DatOF ends up sentence final) but violates STAY; this candidate wins under
(15-b). The situation is summarized in tableau (16).
(16) i: [VP DatOF [V' dAccO V]]                       IND   STAY   FF
a. ☞ [VP DatOF [V' dAccO V]]                                       *
b. ☞ [VP dAccO [VP DatOF [V' tAccO V]]]                     *
c.   [VP dAccO [VP DatOF [VP tDatO [V' tAccO V]]]]          **!
(I added (16-c) to illustrate that multiple scrambling, though generally per-

mitted, will be harmonically bounded and therefore blocked.) Note that both
winners in (16) have a non-focused verb following the focused argument. But
there is no application of scrambling that would remedy this situation. I will

return to the question of whether the sentence final V should be taken to vio-
late FF at all in section 3. For the purpose of this exposition I will understand
FF to be satisfied if no non-focused argument follows the focused one.
If AccO is focus (and DatO is not), the situation changes again. Now
the DatO > AccOF candidate fulfills both STAY and FF (and INDEFINITES),
while the scrambled candidate AccOF>DatO violates them both. That is,
under either resolution of the tie the in situ version is optimal. Movement is
blocked:
(17) i: [VP DatO [V' dAccOF V]]             IND   STAY   FF
a. ☞ [VP DatO [V' dAccOF V]]
b.   [VP dAccOF [VP DatO [V' tAccO V]]]           *      *
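The evaluation behind tableaux (16) and (17) can be made concrete with a small computational sketch. The Python fragment below is only an illustration under stated assumptions, not part of the analysis itself: the violation counts are entered by hand as they appear in the two tableaux, the tied ranking in (14) is expanded into the two total rankings in (15), and a candidate counts as grammatical if it is optimal under at least one of them. All function names and candidate labels are hypothetical.

```python
from itertools import permutations

# Violation counts entered by hand from tableaux (16) and (17).
# The candidate labels are ad hoc and serve readability only.
TABLEAU_16 = {
    "a: in situ": {"IND": 0, "STAY": 0, "FF": 1},
    "b: AccO scrambled": {"IND": 0, "STAY": 1, "FF": 0},
    "c: both scrambled": {"IND": 0, "STAY": 2, "FF": 0},
}
TABLEAU_17 = {
    "a: in situ": {"IND": 0, "STAY": 0, "FF": 0},
    "b: AccO scrambled": {"IND": 0, "STAY": 1, "FF": 1},
}

# Ranking (14): INDEFINITES dominates the tie between STAY and FINALFOCUS.
RANKING_14 = [["IND"], ["STAY", "FF"]]


def resolutions(strata):
    """Expand a ranking with tied strata into all total rankings, as in (15)."""
    rankings = [[]]
    for stratum in strata:
        rankings = [r + list(p) for r in rankings for p in permutations(stratum)]
    return rankings


def optimal(tableau, ranking):
    """Return the candidates whose violation profile is lexicographically minimal."""
    def profile(cand):
        return tuple(tableau[cand][c] for c in ranking)
    best = min(profile(cand) for cand in tableau)
    return {cand for cand in tableau if profile(cand) == best}


def grammatical(tableau, strata):
    """A candidate is grammatical if it wins under at least one tie resolution."""
    winners = set()
    for ranking in resolutions(strata):
        winners |= optimal(tableau, ranking)
    return winners


print(sorted(grammatical(TABLEAU_16, RANKING_14)))
print(sorted(grammatical(TABLEAU_17, RANKING_14)))
```

Under these assumptions the sketch returns both (16-a) and (16-b) when DatO is the focus (optional scrambling), but only (17-a) when AccO is the focus (scrambling blocked).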
As I already noted in that earlier work, this system also derives a case not
considered in Lenerz (1977), but observed in Eckardt (1996): If both objects
are focused, scrambling is excluded, regardless of (in)definiteness. In terms
of the system proposed: STAY must not be violated if no improvement in
terms of FF results:
(18) i: [VP DatOF [V' dAccOF V]]            IND   STAY   FF
a. ☞ [VP DatOF [V' dAccOF V]]
b.   [VP dAccOF [VP DatOF [V' tAccO V]]]
It was finally observed that scrambling of indefinites is possible if these are

not to receive an existential interpretation (cf. the exact wording of the con-
straint in (13)). 3 To illustrate this, let us compare two sentences with focus
on DatO and an indefinite AccO. In the first version, the indefinite AccO is
meant to be existential (indicated as iAccO∃):
(19) i: [VP DatOF [V' iAccO∃ V]]            IND   STAY   FF
a. ☞ [VP DatOF [V' iAccO∃ V]]                            *
b.   [VP iAccO∃ [VP DatOF [V' tAccO V]]]    *!    *
The scrambled structure is blocked for its violation of INDEFINITES: Since

the indefinite is existential, it ought to stay within VP.
Turning now to the second version, we observe that scrambling the indef-
inite AccO becomes an option if the indefinite is supposed to be interpreted
as a generic NP (indicated by subscript Gen):
(20) i: [VP DatOF [V' iAccOGen V]]            IND   STAY   FF
a. ☞ [VP DatOF [V' iAccOGen V]]                            *
b. ☞ [VP iAccOGen [VP DatOF [V' tAccO V]]]          *
The INDEFINITES constraint does not apply here since AccO is generic.
Accordingly, (20-a) is optimal if the tie is resolved to STAY » FF, while (20-b)
is optimal if it is resolved to FF » STAY. Movement of the indefinite across
the definite is thus (optionally) possible. 4
This quick overview illustrates all the relevant aspects of the system as
proposed in Büring (1996) in its application to German double object
constructions. Empirically successful though it is, many questions remain open.
Some of them regard the nature of the constraints. Why should they hold
in the way they do? Others regard the technical set-up of the system. What
advantages does it have to specify focus patterns (rather than, say, contexts,
accent patterns, or nothing at all) in the input?
Regarding the first set of questions, the INDEFINITES constraint in (13) is
a fairly direct adaptation of the seminal proposals in de Hoop (1992) and
Diesing (1992). If the position taken in these works is basically correct, posi-
tional preferences of indefinites can be explained in terms of the way syntax
is mapped onto semantics. The effects of FINALFOCUS can and should, I
believe, be derived from the way syntax is mapped onto prosody, utilizing
ideas found in Truckenbrodt (1995, 1999) and Büring (1997a). It is this latter
aspect that the present paper is mainly concerned with.
The system I will present below shares many essential properties with the
one sketched in this section and preserves its basic tenets: Object ordering
in German is determined by morphosyntactic and focus-related constraints,
F-marking is specified in the input, and optional reordering is derived by a
constraint tie in the very way illustrated above.
I will not, however, continue to use the particular constraints STAY,
FINALFOCUS and INDEFINITES. In section 3 I propose replacing FINALFOCUS
by a group of constraints relating focus, prosody and syntax. Their net
effect will be similar to that observed with FINALFOCUS above; in
contradistinction to this single constraint, however, their empirical coverage is much
broader, and they are compatible with and well motivated by current work
in prosodic phonology. In section 4 I will then introduce a constraint DAT,
which takes over the work of STAY. The effects of DAT will be the same as
those of STAY; it is chosen merely to avoid commitment to a derivational
syntactic framework. The issue of (in)definiteness and its influence on object
order will be ignored in what follows, along with the constraint introduced to
handle it; reintegration of it within the analysis developed below will have to
await a later occasion (Büring in prep.).
Regarding the second set of questions, the issue to be addressed here re-
gards the specification of the input. The system sketched above and elabo-
rated in what follows crucially specifies F-marking (and the different readings
of indefinites) in the input, but not, e.g., accenting or prosodic phrasing. This
choice could be made differently. I don't think that the present paper presents
conclusive evidence in favor of the set-up chosen here. Its purpose is to show
that such a system can be devised, and explore what properties it will have,
facilitating further discussion. I will touch upon some of the issues involved
after the main exposition in section 5 below.
3 Deconstructing FINALFOCUS
This section explores the rationale behind a constraint like FINALFOCUS, and
proposes replacing it with more precise and natural constraints on prosodic
phrasing. Likewise, we will no longer assume the constraints INDEFINITES
and STAY (which will be replaced by a less committing constraint called
DAT(IVE) in section 4 below).
3.1 Phrasing, Stress, and Accent
Let me start by clarifying some of the assumptions about the relation be-
tween context, focus and accent I am making. I follow Selkirk (1984, 1995),
Rochemont (1986), and many others in assuming an overall picture as in (21).
(21) [Context (specified by, e.g., a question)] → [Syntactic Structure with F-marking] → [Prosodic Structure with stress and pitch accents]
The context determines which constituents in the syntactic structure need to

be F-marked. I will adopt the most straightforward characterization of this
relation, as proposed in Schwarzschild (1999): Any constituent which is not
contextually Given (or c(ontext)-construable in Rochemont's terms) needs
to be F-marked. Usually this will be the constituent that corresponds to the
wh-phrase in a context-question (see Selkirk 1995 and Schwarzschild 1999
for enlightening discussion), plus all or most of its sub-constituents. In this
paper I will have nothing more to say on the Context-F-marking relation; my

subject will be the correspondence between the two boxes on the right in (21),
focus realization.
In English, focus is signalled by pitch accents, i.e., movements of the funda-
mental frequency of the speaker's voice, centered around prominent syllables.
But, as is well known, not every terminal element that bears F must receive
a pitch accent. Selkirk (1984), like many following her, assumes two addi-
tional steps on the way from the context-determined F-marking to the actual
prosodic structure: First, a set of conditions on the possible F-patterns within
the syntactic tree, usually cast in terms of focus projection rules; second,
a correspondence condition between F-marked terminals and pitch accents,
e.g., the basic focus rule of Selkirk (1984: 207).
In what follows I want to explore a different line: I will derive the relevant
effects of the focus projection rules in terms of prosodic principles (in
keeping with a larger project aimed at eliminating focus projection rules
altogether; cf. Büring 1997a, Drubig 1994, Schwarzschild 1999), and I will
follow Truckenbrodt (1995, 1999) in assuming that there is no rule that
directly relates F-marking to pitch accents, but that the focus-accent relation is
mediated through prosodic phrasing.
My assumptions about prosodic phrasing are fairly standard: Lexical heads,
sometimes together with lighter material accompanying them, form prosodic
words (PWds). Prosodic words are grouped into intermediate prosodic categories
which I will call accent domains, ADs (a term used by Uhmann 1991,
similar to Gussenhoven's 1984 focus domains, Pierrehumbert & Hirschberg's
1990 intermediate phrases, and Truckenbrodt's 1999 phonological phrases),
which in turn are grouped into intonational phrases (iPs). Following Selkirk
(1984) and many others, I assume that prosodic phrasing is exhaustive, strictly
layered, and non-recursive.5
Each such prosodic category has a unique head. The head is the most prominent
element of the category. For example, the syllables sie, geld and ge
(marked by capitals) in (22) - which is the end of sentence (8-a) above -
are the heads of their respective prosodic words; they receive a grid mark at
the word-level, and hence are more prominent than all the other syllables.
(22) ...                      x                )iP
    (        x    )AD   (     x                )AD
    (        x    )PWd  (     x  )PWd  (   x   )PWd
     dem KasSIErer       das GELD       geGEben
     the-DAT teller      the money      give
The prosodic words (dem Kassierer and das Geld) are the heads of their
respective accent domains. Accordingly they are more prominent than the
prosodic word (gegeben), which means that their most prominent syllables
receive AD-level stress. Finally, the AD (das Geld gegeben) is the head of
the iP that wraps the entire sentence (the dots indicate that the iP extends fur-
ther to the left) and thus receives a grid mark at the iP-level (note that this is
different from the notation used in Halle & Vergnaud 1987, where heads are
indicated by grid marks on the next higher level).
As noted, the grid marks represent stress, where higher columns represent
a higher degree of stress. Finally, stressed syllables are associated with pitch
accents (I will not be concerned with the choice of pitch accent here, see
Pierrehumbert & Hirschberg 1990 for general discussion, and Büring 1997b
on German). For our purposes it is sufficient to state that each sentence con-
tains at least one pitch accent, and that if a syllable gets a pitch accent asso-
ciated with it, every other syllable with the same or higher degree of stress
must get a pitch accent, too; the range of the pitch movement (the perceived
"intensity" of the accent) is positively correlated with the level of stress on
the syllable the accent is associated with (cf. Pierrehumbert 1980). The result
will be that the head of iP always bears a pitch accent. A common pattern
in German is that all AD-heads have a pitch accent, too (cf., e.g., Uhmann
1991). We stipulate that syllables with only PWd-level stress never bear pitch
accents.
The convention we will use where no prosodic trees are given is the following:
AD-heads are marked by capitalization of the pertinent syllable, the iP-head
by capitalization plus underlining of the word; PWd-heads aren't marked
at all. (22) can thus be abbreviated as in (23):
(23) dem KasSIErer das GELD gegeben.
Given what we said above, Geld must bear a pitch accent here, while
KasSIErer may (along with every other AD-head that may precede it); the
V gegeben cannot. The pitch accent on Geld will be the most prominent one
(the nuclear stress).6
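The three conditions on accent placement just stated (at least one pitch accent per sentence; an accented syllable requires accents on all syllables of equal or higher stress; no accents on syllables with only PWd-level stress) can be checked mechanically. The following Python sketch is an illustration only; the numeric encoding of grid heights and the function name are assumptions introduced here.

```python
from itertools import combinations

# Grid heights (an assumed encoding): prosodic word, accent domain, intonational phrase.
PWD, AD, IP = 1, 2, 3


def licit_accent_patterns(stress):
    """Enumerate the sets of accented syllables compatible with the three conditions.

    `stress` maps each stressed syllable to the height of its grid column.
    """
    syllables = list(stress)
    patterns = []
    for n in range(1, len(syllables) + 1):          # at least one pitch accent
        for chosen in combinations(syllables, n):
            accented = set(chosen)
            if any(stress[s] == PWD for s in accented):
                continue                            # PWd-level stress never bears an accent
            weakest = min(stress[s] for s in accented)
            # every syllable at least as strong as an accented one must be accented too
            if all(s in accented for s in syllables if stress[s] >= weakest):
                patterns.append(accented)
    return patterns


# Example (23): dem KasSIErer das GELD gegeben
example_23 = {"SIE": AD, "GELD": IP, "ge": PWD}
for pattern in licit_accent_patterns(example_23):
    print(sorted(pattern))
```

For (23) this leaves exactly two patterns, an accent on GELD alone or accents on both GELD and SIE, in line with the observation that Geld must, KasSIErer may, and gegeben cannot bear a pitch accent.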
Let us start by elaborating on the notion of accent domain (I will ignore the
issue of prosodic word formation, because nothing hinges on it in the present
context). An AD has an "ideal size", which is described by the constraints
in (24). Since its two parts, PRED and XP, don't conflict in the examples I
discuss in this paper, I will treat them as one constraint, ADF, in the tableaux
that follow:
(24) ADF (ACCENTDOMAINFORMATION):
a. PRED:
A predicate shares its AD with at least one of its arguments.
b. XP:
AD contains an XP. If XP and YP are within the same AD, one
contains the other (where X and Y are lexical categories).
As stated quite explicitly in (24-b), the ideal aimed at is to map lexically
headed XPs onto ADs. Two special cases arise: If one such XP contains another,
the dominating one will be mapped onto an AD. And if an argument
XP is adjacent to its predicate, the predicate will be integrated into the XP's
AD (borrowing Jacobs' 1992 apt term). For example, the NP and its selecting
predicate will be mapped onto one AD in the German (25-a) (the same results
obtain mutatis mutandis for English VO structures). This candidate doesn't
violate XP, because even though it contains two lexically headed maximal
projections, NP and VP, the latter contains the former. 7 It doesn't violate
PRED either because the predicate - i.e., the verb - gets to share the AD
with its NP argument.
(25) das Geld geben
     the money give
     ('give the money')
a. ☞ (                   )AD    ✓
     ( das Geld )( geben )PWd
b.   (          )(       )AD    * (PRED: V is alone in its AD)
     ( das Geld )( geben )PWd
c.   ((         )        )AD    * (illegal recursion of AD)
     ( das Geld )( geben )PWd
As said above, every AD has one head, which is the most prominent element
within it, indicated by a grid mark within the AD. If an AD consists of more
than one prosodic word, as in (25-a), the head will be determined by (26):
(26) A/P (ARGUMENT-OVER-PREDICATE): Within AD, an argument is
more prominent than a predicate.
(27) a.   (               x   )AD    violates A/P
          ( das Geld )( geben )PWd
     b. ☞ (      x            )AD    satisfies A/P
          ( das Geld )( geben )PWd
The effect of (26) is demonstrated in (27) (note that here and henceforth I
don't indicate prosodic words in a separate line; hence, PWd-heads are no
longer marked with a grid mark at all).
3.2 Simple Foci
ADF and A/P in tandem govern phrasing and prominence, all other things
being equal. How does focus enter the picture? I submit that one simple con-
straint, (28), is all that is needed: 8
(28) FP(FOCUSPROMINENCE)
Focus is most prominent.
Importantly, (28) inspects the prosodic structure of the sentence; for example,
if an AD contains two prosodic words, only one of which contains an F-marked
node, FP will demand that this PWd become the head of AD; likewise for
higher prosodic categories (to which I will turn below).
In German, FP is crucially ranked above A/P; for reasons that will become
clear later, ADF must be ranked in-between them. To understand the workings
of FP, let us consider AD-formation in cases in which exactly one immediate
constituent of the clause bears an F-feature (that constituent may in
turn contain more F-features, which I will ignore here); I will refer to these
as simple or narrow focus cases. What we observe here is that among the
elements within VP, only the constituent in focus has AD-level stress (and,
for reasons to be discussed in a moment, the main pitch accent). (Since narrow
V-foci are hard to elicit by a wh-question, I will use contrasting contexts,
which, in accordance with Schwarzschild (1999), I assume work the same in
all relevant respects.)
(29) a. (Was hast du dem Kassierer gegeben? Ich habe dem Kassierer) das
GELDF gegeben.
'What did you give the teller? I gave the teller [the money]F.'
b. (Ich habe nicht gesagt, du sollst dem Kassierer das Geld beschreiben,
sondern du sollst dem Kassierer) das Geld GEbenF.
'I didn't say you should describe the money to the teller, but you
should [give]F the money to the teller.'
The pitch accents on Geld and geben, respectively, tell us that these must be
AD-heads, while their respective sisters are not (otherwise they could at least
bear a secondary accent, which I would have indicated by capitals). Let us
see how this follows from our assumptions: If the AccO alone is F-marked
(case (29-a)), FP will require that the prosodic word containing it become the
head of the AD. This is the case in (30-a) and (30-c), but not in (30-b), which
is therefore blocked ((30-b) violates A/P on top of this, which is irrelevant
here). Between (30-a) and (30-c), ADF prefers the former, for the reason
already seen in (25-b): The V alone is not a good AD, violating ADF/PRED.
(30) i: [VP [NP das Geld]F geben]       FP    ADF   A/P
a. ☞ (     x          )AD
     (das GeldF)(geben)PWd
b.   (             x  )AD               *!          *
     (das GeldF)(geben)PWd
c.   (     x   )(  x  )AD                     *!
     (das GeldF)(geben)PWd
If V alone is F-marked (case (29-b)), it has to be head of AD, which blocks

(31-a). Both (31-b) and (31-c) achieve this goal, though by different means.
The former shifts the accent within an otherwise perfect AD, violating A/P;
the latter makes V an AD of its own, which then trivially satisfies FP.
(31) i: [VP [NP das Geld] gebenF]       FP    ADF   A/P
a.   (     x          )AD               *!
     (das Geld)(gebenF)PWd
b. ☞ (            x   )AD                           *
     (das Geld)(gebenF)PWd
c.   (     x  )(  x   )AD                     *!
     (das Geld)(gebenF)PWd
Note that A/P is not violated in (31-c), since there is no AD containing a predicate
and its argument. Yet (31-c) is blocked by (31-b) since ADF dominates
A/P.
Let us now include the intonational phrase, iP, in the picture. Since we are
only concerned with VP-internal focus here, it suffices to assume that every
sentence (or inflection phrase) is mapped onto one iP.
Intonational phrases are strictly right-headed in German: The head will be
that AD which is aligned with iP's right edge. We implement this by assuming
(32) as an undominated constraint (cf. McCarthy & Prince 1993).
(32) iP-HEAD-RIGHT:
ALIGN(iP, right, head(iP), right)
Main prominence within iP will therefore be on the most prominent element

of its final AD (this is essentially equivalent to Selkirk's 1984 final strengthening).
Since at least the most prominent syllable of a sentence must be associated
with an accent (see above), the iP-head will be perceived as the main
accent or "nuclear stress" of the sentence; on casual inspection it might even
be perceived as the only prominence, even though that constitutes an over-
simplification in most cases.
Since (32) is not violated in any of the examples I discuss in this paper I
won't include it in the tableaux and will indicate violations of it separately,
where necessary.
The full representations of (30-a) and (31-b) are then (33-a) and (33-b),
respectively (the dots again indicate that iP extends to the left):
(33) a.       x          )iP
        (     x          )AD
        (das GeldF)(geben)PWd
     b.              x   )iP
        (            x   )AD
        (das Geld)(gebenF)PWd
Notice that, also at the iP-level, the structures in (33) respect FP: The AD
containing the focus ends up being the head of iP. Note in particular that
although there is plenty of material following the most prominent syllable
Geld within the iP in (33-a), there is no AD following the AD containing
Geld. Therefore iP-HEAD-RIGHT in (32) is respected here: The rightmost
daughter AD is the head of iP.9
To conclude the simple focus cases, what if the VP-initial dative argument
is the sole focus? As already discussed in section 2 above, the nuclear accent
then falls on the DatO; moreover, no secondary accents on either AccO or V
are allowed.
(34) (Wem hast du das Geld gegeben? Ich habe) dem KasSIErerF das
Geld gegeben.
'Who did you give the money? I gave [the teller]F the money.'
This is predicted: The dominant constraint FP will force the head of AD and
iP to be on the focused DatO. As Truckenbrodt (1995: ch. 5) was the first to
point out, this, together with iP-HEAD-RIGHT, excludes the presence of any
AD following the one containing the focus. Consider (35-a) and (35-b); in
both, the final AD (das Geld geben) becomes the head of the iP, consonant
with the right-headedness of iP, (32). But this violates FP at the iP level, since
the AD (dem KassiererF) is not most prominent in iP. Since FP dominates
ADF, (35-a) and (35-b) are blocked by (35-d).
Alternatively consider (35-c). Here the AD containing the focus, dem
KassiererF, becomes the head of iP, satisfying FP, but violating iP-HEAD-RIGHT,
which again dominates ADF.
(35) i: [VP [dem Kassierer]F [NP das Geld] geben]   FP    ADF   A/P
a.                         x         )iP            *!
     (        x     )(     x         )AD
     (dem KassiererF)(das Geld)(geben)PWd
b.                                x  )iP            *!          *
     (        x     )(            x  )AD
     (dem KassiererF)(das Geld)(geben)PWd
c.            x                      )iP   iP is not right-headed
     (        x     )(     x         )AD
     (dem KassiererF)(das Geld)(geben)PWd
d. ☞          x                      )iP                  *
     (        x                      )AD
     (dem KassiererF)(das Geld)(geben)PWd
The only way to have a non-final argument be most prominent is thus to

make it the head of the final AD. In other words, all subsequent AD bound-
aries are "deleted", or put more accurately: No more ADs are formed. This
blatantly violates ADF, in particular its XP sub-constraint (cf. (24-b)): DatO
and AccO are both lexically headed XPs, and neither one contains the other.
But since ADF is dominated by both FOCUSPROMINENCE and iP-HEAD-RIGHT,
these violations are unavoidable. Consequently the post-focal stretch
of the sentence gets totally de-structured. There can be no AD-level stresses
and, accordingly, no pitch accents. This effect has been observed in various
languages; cf. again the discussion in Truckenbrodt (1995: ch. 5).
3.3 Complex Foci
We now turn to cases in which more than one immediate constituent of the
sentence is in focus. (36) is an example of this sort (quite possibly there is
another F-mark on the VP here and in (38) below; I'll address this issue in
the next sub-section).
(36) (Wie soll die NSF dabei helfen? - Sie soll) das GELDF gebenF.
'How can the NSF help? They should [give]F [the money]F.'
The final V, though F-marked itself, does not have AD-level stress and cannot
bear a pitch accent. That means that object and verb continue to form an AD;
(37) shows how this is accounted for:
(37) i: [VP [NP das Geld]F gebenF]      FP    ADF   A/P
a.                 x   )iP              *     *!
     (     x   )(  x   )AD
     (das GeldF)(gebenF)PWd
b. ☞       x           )iP              *
     (     x           )AD
     (das GeldF)(gebenF)PWd
c.                 x   )iP              *           *!
     (             x   )AD
     (das GeldF)(gebenF)PWd
d.         x           )iP   iP is not right-headed
     (     x   )(  x   )AD
     (das GeldF)(gebenF)PWd
The winner in this case has exactly the same prosodic structure as in the
object-only-F case in (30). The tableaux look considerably different though.
In particular, notice that even the winner in (37) has one violation of FP.
This is unavoidable if a sentence has two F-marked constituents, given that
every phrase has only one head: At some level, one of the F-constituents must
become the non-head.
The most instructive candidates to compare are (37-a) and (37-b). In (37-a),
V is the head of its own AD, and the AD with the AccO is subordinated
at the level of iP, inducing an FP violation. In (37-b), V is subordinated, but
already at the AD level. It incurs a violation of FP, too, this time because the
PWd containing gebenF is not the head of the AD containing it. Note that since
there is only one AD (which then is the head of iP), no further violations
of FP occur. The choice in favor of (37-b) is made by the lower constraint ADF,
which prefers the "integrated" structure (37-b) over the "split" one in (37-a):
A perfect AD cannot consist of just a predicate as in (37-a) (the fact that
the prominence within the AD is on the object rather than on the verb, as in
(37-c), is then regulated by A/P).
What happens if the verb and two arguments are F-marked? In this case we
get the nuclear accent on the rightmost argument and a secondary accent, or
at least AD-level stress, on the VP-initial one.
(38) (Was hast du gemacht? - Ich habe) dem KasSIErerF das GELDF
gegebenF.
'What did you do? - I [gave]F [the teller]F [the money]F.'
Turning to the tableau, there will inevitably be two violations of FP (since

there are three prosodic words - two arguments and the predicate - only one
of which will be prominent at all levels). But this time, since the ideal AD
contains at most one of the arguments (cf. (24-b) above), each argument will
get its own AD, and they will only be "merged" at the level of the iP. The
predicate will integrate with the closest argument for reasons of ADF, as seen
before.
(39) i: [VP [dem Kassierer]F [NP das Geld]F gebenF] FP ADF A/P
a. x )iP ** *!
( x )AD
(dem KassiererF)(das GeldF)(gebenF)PWd
b. x )iP ** *!
( x )AD
(dem KassiererF)(das GeldF)(gebenF)PWd
c. ☞ x )iP **
( x )( x )AD
(dem KassiererF)(das GeldF)(gebenF)PWd
d. x )iP **
( x )( x )AD
(dem KassiererF)(das GeldF)(gebenF)PWd
e. x )iP ** *!
( x )( x )AD
(dem KassiererF)(das GeldF)(gebenF)PWd
f. x )iP ** *!
( x )( x )( x )AD
(dem KassiererF)(das GeldF)(gebenF)PWd
It should be noted once more that the crucial difference is between predicates
and non-predicates in complex-F-constructions. While predicates have an in-
centive to join the AD of one of their arguments and therefore integrate at
that lower level, arguments do not. In fact, ADF prefers for them not to share
an AD with any other argument, which is why they end up forming their own
AD. Incidentally, this reasoning applies to adjuncts, too, except that these
never join ADs with a predicate, given that they are never selected by a pred-
icate (cf. again the definition in (24)); it is beyond the scope of this paper to
go into this, though.
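The interaction of FP, ADF and A/P in this section can be replayed with a small generate-and-evaluate sketch. The Python code below is an illustration under simplifying assumptions, not the paper's formalization: candidates are all ways of splitting a string of prosodic words into contiguous ADs with one head each; iP-HEAD-RIGHT is taken to be undominated, so the final AD always heads the iP; FP assigns one mark per F-marked word that is not the most prominent element of the iP; ADF assigns one mark per AD without an argument (PRED) and one per surplus argument within an AD (XP); F-marks on projections and the treatment of adjuncts are ignored. The tuple encoding and all function names are hypothetical.

```python
from itertools import product

# A prosodic word is encoded as (form, role, focused), with role "arg" or "pred".


def gen(words):
    """GEN: every split of the words into contiguous ADs, with every head choice."""
    for cuts in product([False, True], repeat=len(words) - 1):
        ads, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                ads.append(words[start:i])
                start = i
        ads.append(words[start:])
        for heads in product(*(range(len(ad)) for ad in ads)):
            yield tuple(zip(ads, heads))


def fp(cand):
    """FP: one mark per F-marked word that is not the head of the final AD (the iP head)."""
    last = len(cand) - 1
    return sum(1 for i, (ad, head) in enumerate(cand)
               for j, (_, _, focused) in enumerate(ad)
               if focused and not (i == last and j == head))


def adf(cand):
    """ADF: PRED (an AD containing no argument) plus XP (surplus arguments in one AD)."""
    marks = 0
    for ad, _ in cand:
        n_args = sum(1 for (_, role, _) in ad if role == "arg")
        marks += 1 if n_args == 0 else n_args - 1
    return marks


def ap(cand):
    """A/P: one mark per AD headed by a predicate although it contains an argument."""
    return sum(1 for ad, head in cand
               if ad[head][1] == "pred" and any(role == "arg" for (_, role, _) in ad))


def winners(words):
    """EVAL under the strict ranking FP >> ADF >> A/P."""
    scored = [((fp(c), adf(c), ap(c)), c) for c in gen(words)]
    best = min(profile for profile, _ in scored)
    return [c for profile, c in scored if profile == best]


def show(cand):
    """Render a candidate with AD brackets; AD-heads are printed in capitals."""
    return " ".join(
        "(" + " ".join(w[0].upper() if j == head else w[0] for j, w in enumerate(ad)) + ")"
        for ad, head in cand)


all_focus = [("dem Kassierer", "arg", True), ("das Geld", "arg", True), ("geben", "pred", True)]
dat_focus = [("dem Kassierer", "arg", True), ("das Geld", "arg", False), ("geben", "pred", False)]
print([show(c) for c in winners(all_focus)])
print([show(c) for c in winners(dat_focus)])
```

On these assumptions the sketch selects two ADs, with the verb integrated into the AD of the accusative object, for the all-focus input, and a single AD headed by the dative object when only DatO is F-marked, matching the winners of (39) and (35).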
3.4 F on a Verbal Projection
In the above cases we have concentrated on F's sitting on immediate con-

stituents of the clause such as DatO, AccO and V. I remarked above (see
discussion of (36)) that sometimes, there may be F-marking on clausal pro-
jections such as V', VP, IP, etc., as well. It would lead us too far afield to
discuss the pragmatic differences leading to, say, a [VP NPF VF] pattern as
opposed to a [VP NPF VF]F pattern, especially since the issue seems controversial.
Fortunately, no commitment is needed, because, as we will see,
nothing changes with or without additional F-marking on clausal projections.
Consider (40), which is mostly a repetition of (37) with an F on VP added. 10
(40) i: [VP [das Geld]F gebenF]F        FP    ADF   A/P
a. ☞       x           )iP              *
     (     x           )AD
     (das GeldF)(gebenF)PWd
b.                 x   )iP              *           *!
     (             x   )AD
     (das GeldF)(gebenF)PWd
c.                 x   )iP              *     *!
     (     x   )(  x   )AD
     (das GeldF)(gebenF)PWd
In (40-a) and (40-b), the smallest prosodic constituent containing VPF is the
AD. Since that AD is the head of iP, FOCUSPROMINENCE is met.11
If we look at a double object example along the lines of (39), a similar
reasoning applies; (41) repeats the winning candidate for this structure with
an F on VP added:
(41)    (                    x          )iP
        (        x    )(     x          )AD
     [VP dem KassiererF das GeldF gebenF]F
No smaller prosodic unit than iP contains the VP, and since iP is the highest
category, nothing is more prominent than iP. Therefore, no additional
violation of FP occurs, hence no other candidate will improve relative to
(41)/(39-c).
3.5 Deaccenting
So far we have looked at simple foci (DatO_F, AccO_F, and V_F) or complex
foci in which all VP-internal arguments were F-marked. Now I will turn to
examples that contain deaccenting. As a starting point, recall that a
ditransitive VP/IP focus without deaccenting results in a structure with the
last AD consisting of a prominent AccO and a non-prominent V. (42)/(43)
illustrates this with a new example (foci on, external to, and above VP are
not indicated):
(42)  'Why was Veronika arrested?' → DatO_F AccO_F V_F
      Weil sie ihrem MAcker den KaMINhaken überzog.
      bec. she her-DAT man the-ACC poker landed
      'Because she beat her man with the poker.'
(43)  i: [VP DatO_F AccO_F V_F]                            | FP | ADF | A/P
      a.  (                                     x     )iP  | ** | *!  |
          (        x     )(        x       )(   x     )AD
          (ihrem Macker_F)(den Kaminhaken_F)(überzog_F)PWd
      b.  (        x                                  )iP  iP not right-headed
          (        x     )(        x                  )AD
          (ihrem Macker_F)(den Kaminhaken_F)(überzog_F)PWd
      c.  (                                     x     )iP  | ** |     | *!
          (        x     )(                     x     )AD
          (ihrem Macker_F)(den Kaminhaken_F)(überzog_F)PWd
   ☞  d.  (                        x                  )iP  | ** |     |
          (        x     )(        x                  )AD
          (ihrem Macker_F)(den Kaminhaken_F)(überzog_F)PWd
      e.  (                        x                  )iP  | ** | *!  |
          (                        x                  )AD
          (ihrem Macker_F)(den Kaminhaken_F)(überzog_F)PWd
If we now introduce AccO as part of the context-question, it no longer needs
to be F-marked (since it is Given). But its surrounding elements, DatO and
V, still are (and so are, presumably, its dominating VP and IP). Such a struc-
ture is realized with the nuclear accent on V, no stress or secondary accent on
AccO, and AD-level stress (and usually a secondary pitch accent) on DatO.
So the contextually given AccO must remain unstressed and unaccented be-
tween its two stressed neighbors, which is presumably why cases like this
have been labelled "deaccented".
(44)  'Why was Veronika arrested? Only because she had a poker in her
      trunk?' → DatO_F AccO V_F
      Nein,...
      No,...
      a.  ...weil sie ihrem MAcker den Kaminhaken ÜBerzog.
      b. #...weil sie ihrem MAcker den Kaminhaken überzog.
          ...bec. she her-DAT man the-ACC poker landed
It should also be noted that, unlike all the other cases of XP+V focus we have
considered thus far, this type of context strictly excludes an unstressed V, as
(44-b) shows. This result is predicted, as tableau (45) demonstrates.
(45)  i: [VP DatO_F AccO V_F]                            | FP  | ADF | A/P
   ☞  a.  (                                   x     )iP  | *   |     | *
          (        x     )(                   x     )AD
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
      b.  (                                   x     )iP  | *   | *!  |
          (        x     )(        x     )(   x     )AD
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
      c.  (                        x                )iP  | **! |     |
          (        x     )(        x                )AD
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
      d.  (        x                                )iP  | *   | *!  |
          (        x                                )AD
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
In contradistinction to the case of a narrowly focused DatO - (35), whose
winner is structurally parallel to (45-d) - the verb together with the AccO
will form an AD in this case. The reason is that this last AD does not threaten
to violate FP: It contains a focus, and to make that focus prominent, it merely
needs to violate A/P.
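The evaluation behind tableau (45) can also be checked mechanically. The following is only a minimal sketch under stated assumptions, not part of the analysis itself: it assumes strict domination under the ranking FP » ADF » A/P, encodes each candidate solely by the violation counts read off the tableau, and uses an invented helper name, `optimal`.

```python
# Violation counts for the candidates of tableau (45), as read off the
# tableau. The dictionary encoding and the helper name are illustrative
# assumptions, not notation from the text.
RANKING = ("FP", "ADF", "A/P")  # assumed ranking: FP >> ADF >> A/P

TABLEAU_45 = {
    "a": {"FP": 1, "A/P": 1},  # V integrates with AccO and heads that AD
    "b": {"FP": 1, "ADF": 1},  # three separate ADs
    "c": {"FP": 2},            # two FP violations
    "d": {"FP": 1, "ADF": 1},  # one big AD, parallel to the winner of (35)
}

def optimal(tableau, ranking):
    """Return the candidates whose violation profile is lexicographically
    smallest under the ranking (strict domination)."""
    def profile(name):
        return tuple(tableau[name].get(c, 0) for c in ranking)
    best = min(profile(name) for name in tableau)
    return sorted(name for name in tableau if profile(name) == best)

print(optimal(TABLEAU_45, RANKING))  # ['a']
```

Comparing profiles as tuples is what makes a single violation of a higher constraint outweigh any number of violations of lower ones.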
Before I go on I would like to emphasize some points in which the present
account differs from others in the literature: First, the integration effect (F-
marked V remains unstressed if its argument is stressed) is arrived at without
any syntactic conditions on F-patterns (such as the second phrasal focus rule
in Selkirk 1984: 207, or the third focus assignment rule of Rochemont 1986:
85), a result which brings us closer to a theory that dispenses with focus
projection rules altogether. Second, it maintains that stress assignment and
accenting are related to focus through prosodic phrasing (unlike the proposal
in Schwarzschild 1999: sec. 6, which otherwise shares many properties with
the present one). Third, it generalizes to complex F-patterns and deaccenting,
cases not discussed in, e.g., Truckenbrodt (1995, 1999). And fourth, it allows
integration with a theory of word order variation, as I will demonstrate now.
4 Word Order Variation
4.1 A First Look
In (35) above we derived the fact that single focus on the DatO yields a
structure as in (46). FP requires that DatO be the head of AD and iP; since iP
is right-headed, this blocks insertion of further AD boundaries to the right of
AD. Due to this, a "super-big" AD is formed, violating ADF.
(46)  ...   x           )iP
      (     x           )AD
         DatO_F AccO V
Let us now see what happens if word order permutations enter the picture.
In this case, the same F-pattern could be realized without violating any con-
straint by utilizing AccO > DatO order. And in fact, this option exists along-
side the one in (46) in German, as noted in (7) above. I repeat both vari-
ants for convenience here (note that I have added the indication of AD-level
prominence on AccO in (47-b), which may or may not be associated with a
pre-nuclear pitch accent):
(47)  Who did you give the money to?
      a. Ich habe dem KasSIErer das Geld gegeben.
         (DatO > AccO = (46))
      b. Ich habe das GELD dem KasSIErer gegeben.
         (AccO > DatO = (48))
      'I gave the teller the money/the money to the teller.'
Let us examine (47-b) more closely, since we haven't done so before. Its
prosodic structure is (48). It is perhaps worth pointing out that the prosodic
structure of (47-b)/(48) is identical to that of the parallel DatO-AccO-V
example in (38)/(39) above; in particular, V and the adjacent DatO integrate
into one AD in just the same way that V and AccO do.
(48)  ...           x      )iP
      (  x   )(     x      )AD
        AccO    DatO_F  V
(49-a) is the constraint profile for this AccO > DatO order. No violations oc-
cur. (I will henceforth leave out the iP-level for reasons of space; the head of
iP - which is predictably the rightmost AD - will be indicated by a capital
bold face grid mark: X.)
(49)  i: [VP AccO DatO_F V]                     | FP | ADF | A/P
   ☞  a.  (     x  )(        X             )AD  |    |     |
          (das Geld)(dem Kassierer_F)(geben)PWd
      b.  (                  X             )AD  |    | *!  |
          (das Geld)(dem Kassierer_F)(geben)PWd
      c.  (     x  )(        x      )(  X  )AD  | *! | *   | *
          (das Geld)(dem Kassierer_F)(geben)PWd
Likewise, the deaccentuation case in (44)/(45) above can be realized in an
optimal prosodic structure with AccO > DatO order. As (50) illustrates, this
structure maps the V and its argument into one AD, avoiding a violation of
ADF.
(50)  'Why was Veronika arrested? Only because she had a poker in her
      trunk?'
      AccO DatO_F V_F
      Nein, weil sie den KaMINhaken ihrem MAcker überzog.
      no    bec. she the-ACC poker  her-DAT man  landed
(51)  i: [VP AccO DatO_F V_F]                            | FP  | ADF | A/P
      a.  (        x     )(                   X     )AD  | *   |     | *!
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
      b.  (        x     )(        x     )(   X     )AD  | *   | *!  |
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
   ☞  c.  (        x     )(        X                )AD  | *   |     |
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
      d.  (        X                                )AD  | **! | *   |
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
If the F-marked argument is capable of forming a "perfect AD" with the verb,
which it is when it is adjacent to V, this option is preferred - (51-c).
Phrasing and accenting the verb separately - (51-b) - is just as impossible as
with AccO_F V_F in (37).
4.2 Competing with Movement
Suppose now that AccO > DatO structures as in (49) and (51) were to enter
into competition with their DatO > AccO siblings such as (35) and (45).
That is, suppose that the input was not specified with respect to object
ordering, allowing outputs with either order to compete with one another
(I'll use set notation in the input specification to indicate this).12 Then
the AccO > DatO structure would be the sole winner. It clearly beats even the
best DatO > AccO candidate. (52) demonstrates this for the simple DatO_F
case. It compares the optimal structure from (49) with that from (35),
rendering the latter sub-optimal.
(52)  Who did you give the money?
      i: {VP AccO, DatO_F, V}                   | FP | ADF | A/P
   ☞  a.  (     x  )(        X             )AD  |    |     |
          (das Geld)(dem Kassierer_F)(geben)PWd
      b.  (        X                       )AD  |    | *!  |
          (dem Kassierer_F)(das Geld)(geben)PWd
If German had a ranking like that in (52), the only object order we would ever
find for simple focus cases would be one where the F-marked object follows
the F-less one, so the former can be maximally prominent (satisfying FP)
and the latter can form an AD (satisfying ADF). Put in derivational terms,
we would find obligatory scrambling of non-focused objects around focused
ones. While this is obviously not the case in German, a situation much like
this can arguably be found in many Romance languages, e.g., Spanish, Italian
and, to a lesser degree, French (Ladd 1996, Zubizarreta 1998). In Spanish for
example, an NP which is the single focus in a sentence must occur postver-
bally, in VP-final position. The interpretation of this that I have in mind runs
along the following lines: Any structure in which the focus isn't sentence final
would require an AD which contains the focus and everything following it,
similar to (52-b) (otherwise the AD wouldn't be iP-final, hence not the head
of iP, yielding a violation of FP). But such a structure will always be sub-
optimal compared to one in which the focus occurs in sentence final position,
so that every element preceding it can form its own AD, similar to the struc-
ture in (52-a). Derivationally put, we find obligatory movement of the focus
to a peripheral position (see Gutiérrez-Bravo 1999 for an optimality analysis
along these lines).
So how is German different from a Spanish-type language? Why is there
optionality of object re-ordering at least in some cases, as discussed in
section 2? Assume a constraint that disfavors the AccO > DatO word order in
(52) (such as STAY from section 2 above). Such a constraint, if ranked high
enough, in particular higher than ADF, would be able to tip the scales in fa-
vor of (52-b) again. To derive the German case, then, the pertinent constraints
must be arranged so as to allow two optimal candidates sometimes. This we
achieve by ...
4.3 Tying the Focus Constraints with DAT
As in section 2 above, we will use a constraint-tie to derive the optionality.
Essentially, we want to tie the constraint that enforces a good prosodic
structure with the one that enforces DatO > AccO word order. Above we used the
constraint STAY for the latter purpose. We do not need to commit ourselves to
the syntactic assumptions connected to this constraint, though (i.e., a
particular base order, movement, VP-adjunction). We can simply follow the guide
of Müller (1998: 22) and use the constraint in (53):13
(53) DAT(IVE): Dative NPs precede accusative NPs.
Implementing this, however, requires an additional complication. Since we
have two relevant constraints regarding prosodic structure - ADF and A/P -
the question arises as to which of them we should tie with the word order
constraint DAT.
Upon closer inspection it turns out that the alternation - where it exists -
is always between the DatO > AccO order under some mediocre prosodic
phrasing and the AccO > DatO order under an optimal phrasing (i.e., one
which satisfies both ADF and A/P in an optimal way). What this means is
that we want the tie to be resolved into (54-a) or (54-b):
(54)  a. DAT » ADF » A/P
      b. ADF » A/P » DAT
It will turn out that we crucially never want the tie to resolve into any of (55):
(55)  a. A/P » DAT » ADF
      b. A/P » ADF » DAT
      c. DAT » A/P » ADF
      d. ADF » DAT » A/P
Put differently, the two prosodic constraints ADF and A/P never change their
ranking relative to each other, but only as a block relative to DAT.
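The split between (54) and (55) follows from that block property alone, as a short enumeration shows. This is a minimal sketch; the function name `respects_block` is an illustrative assumption and not part of the proposal.

```python
from itertools import permutations

CONSTRAINTS = ("DAT", "ADF", "A/P")

def respects_block(order):
    """True if ADF immediately dominates A/P, i.e. the two prosodic
    constraints stay adjacent and keep their relative ranking."""
    i = order.index("ADF")
    return i + 1 < len(order) and order[i + 1] == "A/P"

all_orders = list(permutations(CONSTRAINTS))
allowed = [o for o in all_orders if respects_block(o)]
excluded = [o for o in all_orders if not respects_block(o)]

for order in allowed:
    print(" >> ".join(order))
# DAT >> ADF >> A/P
# ADF >> A/P >> DAT
```

Of the six logically possible total orders, exactly the two in (54) survive; the remaining four are the rankings listed in (55).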
This special property of the system will become relevant only in the
deaccenting cases, but for reasons of comparability and uniformity I will use
it right from the beginning. The notation I invent for this purpose is that in
(56):
(56)  | FP | DAT ¦ prosodic constraints
      |    |     ¦ ADF | A/P
The narrow-focus cases fall out straightforwardly again as in section 2 above:
With AccO_F all constraints pull in the same direction, making the DatO >
AccO order the only one possible (I use set notation to designate the input
again):
(57)  What did you give the teller?
      i: {DatO, AccO_F, V}                      | FP | DAT ¦ ADF | A/P
   ☞  a.  (        x    )(     X           )AD  |    |     ¦     |
          (dem Kassierer)(das Geld_F)(geben)PWd
      b.  (     X                          )AD  |    | *!  ¦ *!  |
          (das Geld_F)(dem Kassierer)(geben)PWd
Since candidate (57-a) violates neither DAT nor any of the prosodic
constraints, it will be the winner regardless of how the tie is resolved.
With DatO_F, the focus constraints - in particular ADF - favor AccO >
DatO, while DAT favors the opposite order. Since both constraints are tied,
both structures emerge as optimal.
(58)  Who did you give the money?
      i: {DatO_F, AccO, V}                      | FP | DAT ¦ ADF | A/P
   ☞  a.  (        X                       )AD  |    |     ¦ *   |
          (dem Kassierer_F)(das Geld)(geben)PWd
   ☞  b.  (     x  )(        X             )AD  |    | *   ¦     |
          (das Geld)(dem Kassierer_F)(geben)PWd
      c.  (                  X             )AD  |    | *!  ¦ *!  |
          (das Geld)(dem Kassierer_F)(geben)PWd
Focus on both arguments (with or without the verb) again allows only the
order preferred by DAT. While scrambling doesn't make the phrasing worse,
it doesn't improve it either and is therefore excluded:
(59)  What will you do?
      i: {DatO_F, AccO_F, V_F}                      | FP | DAT ¦ ADF | A/P
   ☞  a.  (        x      )(     X             )AD  | ** |     ¦     |
          (dem Kassierer_F)(das Geld_F)(geben_F)PWd
      b.  (        x      )(              X    )AD  | ** |     ¦     | *!
          (dem Kassierer_F)(das Geld_F)(geben_F)PWd
      c.  (        x      )(     x    )(  X    )AD  | ** |     ¦ *!  |
          (dem Kassierer_F)(das Geld_F)(geben_F)PWd
      d.  (     x    )(        X               )AD  | ** | *!  ¦     |
          (das Geld_F)(dem Kassierer_F)(geben_F)PWd
      e.  (                    X               )AD  | ** | *!  ¦ *!  |
          (das Geld_F)(dem Kassierer_F)(geben_F)PWd
      f.  (     x    )(        x      )(  X    )AD  | ** | *!  ¦ *!  |
          (das Geld_F)(dem Kassierer_F)(geben_F)PWd
Let us then turn to the deaccenting cases. If DatO and V are F-marked and
AccO is not, two structures emerge as grammatical: the one that fulfills
DAT (and violates the prosodic constraint A/P), and the one that satisfies the
prosodic constraints (but violates DAT).14
(60)  Why was Veronika arrested? Because she had a poker in her trunk?
      No, because she...
      i: {AccO, DatO_F, V_F}                             | FP | DAT ¦ ADF | A/P
   ☞  a.  (        x     )(                   X     )AD  | *  |     ¦     | *
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
      b.  (        x     )(        x     )(   X     )AD  | *  |     ¦ *!  |
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
      c.  (        X                                )AD  | *  |     ¦ *!  |
          (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
      d.  (        x     )(                   X     )AD  | *  | *   ¦     | *!
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
      e.  (        x     )(        x     )(   X     )AD  | *  | *   ¦ *!  |
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
   ☞  f.  (        x     )(        X                )AD  | *  | *   ¦     |
          (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
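The effect of the tie in (60) can be reproduced by evaluating the candidates under each admissible resolution and collecting the winners. The sketch below rests on assumptions of mine: the violation counts are read off tableau (60), strict domination holds within each resolution, and the helper names `winners` and `tie_winners` are invented for illustration. For comparison it also evaluates a flat three-way tie of DAT, ADF and A/P.

```python
from itertools import permutations

# Violation counts for the candidates of tableau (60), read off the tableau.
TABLEAU_60 = {
    "a": {"FP": 1, "A/P": 1},
    "b": {"FP": 1, "ADF": 1},
    "c": {"FP": 1, "ADF": 1},
    "d": {"FP": 1, "DAT": 1, "A/P": 1},
    "e": {"FP": 1, "DAT": 1, "ADF": 1},
    "f": {"FP": 1, "DAT": 1},
}

def winners(tableau, ranking):
    """Candidates with the lexicographically smallest profile."""
    def profile(name):
        return tuple(tableau[name].get(c, 0) for c in ranking)
    best = min(profile(name) for name in tableau)
    return {name for name in tableau if profile(name) == best}

def tie_winners(tableau, rankings):
    """Union of the winners over all resolutions of a tie."""
    result = set()
    for ranking in rankings:
        result |= winners(tableau, ranking)
    return sorted(result)

# The tie as proposed: DAT against the block ADF >> A/P, cf. (54).
block_tie = [("FP", "DAT", "ADF", "A/P"), ("FP", "ADF", "A/P", "DAT")]
# A flat tie of all three constraints, for comparison only.
flat_tie = [("FP",) + p for p in permutations(("DAT", "ADF", "A/P"))]

print(tie_winners(TABLEAU_60, block_tie))  # ['a', 'f']
print(tie_winners(TABLEAU_60, flat_tie))   # ['a', 'b', 'c', 'f']
```

Under the block tie only (60-a) and (60-f) survive. The flat tie additionally lets through the candidates whose sole prosodic violation is on ADF, among them (60-c), which is the overgeneration that motivates keeping ADF above A/P.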
If DatO is non-F, on the other hand, AccO > DatO order is excluded because
the DatO > AccO order already allows a perfect prosodic structure.
(61)  Why was Veronika arrested? Because her man disappeared?
      No, because she...
      i: {DatO, AccO_F, V_F}                             | FP | DAT ¦ ADF | A/P
      a.  (        x   )(                     X     )AD  | *  |     ¦     | *!
          (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
      b.  (        x   )(        x       )(   X     )AD  | *  |     ¦ *!  |
          (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
   ☞  c.  (        x   )(        X                  )AD  | *  |     ¦     |
          (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
      d.  (                      X                  )AD  | *  |     ¦ *!  |
          (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
      e.  (        x       )(                 X     )AD  | *  | *!  ¦     | *!
          (den Kaminhaken_F)(ihrem Macker)(überzog_F)PWd
      f.  (        x       )(        x   )(   X     )AD  | *  | *!  ¦ *!  |
          (den Kaminhaken_F)(ihrem Macker)(überzog_F)PWd
      g.  (        X                                )AD  | *  | *!  ¦ *!  |
          (den Kaminhaken_F)(ihrem Macker)(überzog_F)PWd
To sum up then, the proposed system correctly predicts when we get two
different object orders with the same focus pattern, and when we don't. It
generalizes across the various cases because it doesn't invoke the notion of an
optimal focus placement (as the system in section 2 did), but only the notion
of an optimal prosodic structure. For each of the grammatical structures, it
delivers a unique prosodic phrasing, which in turn can be used to derive the
set of its possible accent patterns.
5 Concluding Remarks and Some Thoughts On Markedness
One way of looking at the constraint tie is that there are actually two gram-
mars at work: One that finds the prosodically optimal candidate, and one that
finds the morphosyntactically optimal one (which we have equated with the
one that displays DatO > AccO order here). All candidates which are prosod-
ically and morphosyntactically sub-optimal are predicted to be simply un-
grammatical.
A reasonable objection to the present proposal is that a sentence like, say,
(8-b), repeated here as (62-a), is awkward, but still much better than, say, (62-
b), and that (62-a) should therefore be question-marked, but not starred, as is
done here.
(62)  a. ??Ich habe das GELD dem Kassierer gegeben.
      b.  *Ich habe das GELD gegeben dem Kassierer.
           I have the money given the teller
          'I gave the money to the teller.'
Intuitively, (62-b) violates a hard and fast constraint about verb-argument
ordering in German, namely that nominal arguments cannot be post-verbal.
Constraints like DAT, ADF, A/P and their kin seem "softer" than that.
Ignoring the numerous interesting issues about the relation between gram-
maticality and acceptability that come up here, let us grab the bull by the
horns and directly implement the intuition described above. We simply need
to say that a candidate C* which is sub-optimal to the prosodically or mor-
phosyntactically optimal candidate C only by virtue of violating any of the
constraints in DAT, ADF, A/P - the word order constraints - is marked, but
not ungrammatical. 15
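The amended definition can be stated as a simple three-way classification. The sketch below is one possible reading of it, under assumptions of mine: a sub-optimal candidate counts as marked if it matches some optimal candidate on FP and hence loses only on DAT, ADF or A/P, and as ungrammatical otherwise. The function name `status` is invented, and the data are the violation counts of tableau (45), used purely for illustration.

```python
HARD = ("FP",)  # assumption: only losing on FP yields ungrammaticality

# Violation counts of tableau (45), reused here as example data.
TABLEAU_45 = {
    "a": {"FP": 1, "A/P": 1},
    "b": {"FP": 1, "ADF": 1},
    "c": {"FP": 2},
    "d": {"FP": 1, "ADF": 1},
}

def status(tableau, optimal):
    """Label every candidate as 'optimal', 'marked' or 'ungrammatical'."""
    def hard(name):
        return tuple(tableau[name].get(c, 0) for c in HARD)
    labels = {}
    for name in tableau:
        if name in optimal:
            labels[name] = "optimal"
        elif any(hard(name) == hard(o) for o in optimal):
            # Ties the winner on FP, so it loses on a lower constraint only.
            labels[name] = "marked"
        else:
            labels[name] = "ungrammatical"
    return labels

print(status(TABLEAU_45, {"a"}))
# {'a': 'optimal', 'b': 'marked', 'c': 'ungrammatical', 'd': 'marked'}
```

The set of optimal candidates is passed in rather than computed, so the same classification applies whether the winners come from a single ranking or from a tie.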
In practice this means that all sub-optimal candidates discussed in this pa-
per that lose on constraints lower than FP are no longer ungrammatical, but
merely marked. The result of this amendment is a system which is similar to,
though much less refined than, the one proposed in Müller (1998). I mention
this possibility merely to illustrate that everything proposed in this paper can
relatively easily be made to conform to such a system. There is no inherent
incompatibility between the type of constraints used here and the kind of sys-
tem Müller proposes in order to derive both grammaticality and markedness,
and the types of judgement reported above make it seem advantageous to
actually combine them.
Note though that the notion of markedness as introduced in the last para-
graph is only defined among members of the same competitor set. And given
our decision to include F-marking in the input, this means that otherwise iden-
tical sentences with different F-markings are not in competition with each
other. It follows that there will not be F-patterns that are sui generis more
marked than others.
To give an example, the structures in (63-a) and (63-b) are both optimal.
Since they involve different F-patterns they do not even compete. Hence nei-
ther is marked with respect to the other.
(63)  a.    (    x                       )iP
            (    x                       )AD
             dem Kassierer_F das Geld geben
      b.    (                   x          )iP
            (    x          )(  x          )AD
             dem Kassierer_F das Geld_F geben_F
      c. ?* (                            x    )iP
            (   x     )(    x          )(x    )AD
             das Geld_F dem Kassierer_F geben_F
(63-c), on the other hand, is a competitor to (63-b), and sub-optimal with
respect to it (it violates DAT and ADF). Hence it is marked (or ungrammatical,
on the previous interpretation).
This state of affairs reflects the intuition that in the appropriate contexts
(e.g., "Who did you give the money to?" and "What did you do?", respec-
tively) (63-a) and (63-b) are judged perfect, while (63-c), even in the context
that suits its F-pattern (again "What did you do?") sounds at best marginal.
It is of course possible to define different notions of markedness in addi-
tion. For example, (63-a) requires a much more specific context than (63-b)
(roughly one in which "give X the money" is already under discussion). It
has been observed that speakers, when presented with sentence-tokens out
of the blue, judge (63-b) as more "normal" than (63-a), presumably because
(63-a) forces them to accommodate a more specific context than (63-b) does
(cf. Höhle's 1982 explication of the notion "normal intonation").
I believe, however, that this sort of markedness, call it pragmatic marked-
ness, is essentially different from the one manifested in, e.g., (63-c), which
we might call structural markedness. Empirically, the first only occurs in the
absence of a context and disappears once the appropriate context is provided,
while the second invariably remains. As a consequence, pragmatic marked-
ness should be explained in pragmatic terms: It quite likely correlates with the
amount of accommodation speakers are required to make. Structural marked-
ness is a matter of grammar proper: It is no more reducible to non-structural
facts than, say, the requirement that noun phrases need case.
In the present proposal this has been done by specifying F-marking in the
input. Accordingly, for every input I with a specified F-pattern there are one
or more optimal structures, which are predicted to be fully acceptable in a
context that complies with the F-pattern of I. Sub-optimal output structures
for I are structurally marked (or ungrammatical) by virtue of the system de-
veloped, and thus predicted to be considerably less acceptable in that context.
No structural markedness ranking among different inputs with different F-
patterns is defined (though, as I indicated above, it could be on a different
scale). In this respect the present system essentially differs from that pro-
posed in Müller (1998).
In conclusion, this paper has offered a particular way of looking at word
order variation, exemplified with double object constructions in German. It
was proposed that two families of constraints co-determine word order:
morphosyntactic constraints and prosodic constraints. The apparent influence of
information structure (here: focus marking) on word order is really only
indirect, since information structure interacts with prosodic phrasing (this line
of argumentation was, I believe, first explicitly pursued in the works of Zubi-
zarreta and Reinhart, as documented in Zubizarreta 1998 and Neeleman &
Reinhart 1998, though in these works, unlike in the present one, information
structure interacts with accenting directly).
Depending on the relative ranking of these constraint families, languages
may display strict word order (morphosyntactic constraints outrank prosodic
ones), free word order (prosodic constraints outrank morphosyntactic ones),
or some mixture thereof (the two constraint types are intertwined or tied).
The German Mittelfeld presents an instance of the third type.
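This three-way typology can be illustrated with the narrow-focus inputs of (57) and (58). The sketch below is a simplification under assumptions of mine: each word order is represented only by its best phrasing, with violation counts read off those tableaux; the harmonically bounded candidate (58-c) and the constraint FP, which no listed candidate violates, are left out; and the labels in `GRAMMARS` are invented names for the three language types.

```python
# Best phrasing per word order, with violation counts from (57) and (58).
INPUTS = {
    "AccO focus": {"DatO > AccO": {}, "AccO > DatO": {"DAT": 1, "ADF": 1}},
    "DatO focus": {"DatO > AccO": {"ADF": 1}, "AccO > DatO": {"DAT": 1}},
}

# Each language type is a set of rankings; a tie contributes both orders.
GRAMMARS = {
    "strict": [("DAT", "ADF", "A/P")],
    "free": [("ADF", "A/P", "DAT")],
    "tied": [("DAT", "ADF", "A/P"), ("ADF", "A/P", "DAT")],
}

def orders(candidates, rankings):
    """Word orders that win under at least one of the given rankings."""
    result = set()
    for ranking in rankings:
        def profile(name):
            return tuple(candidates[name].get(c, 0) for c in ranking)
        best = min(profile(name) for name in candidates)
        result |= {name for name in candidates if profile(name) == best}
    return sorted(result)

for grammar, rankings in GRAMMARS.items():
    for focus, candidates in INPUTS.items():
        print(grammar, "|", focus, "|", orders(candidates, rankings))
```

On this data the strict grammar yields DatO > AccO for both inputs, the free grammar lets the order follow the focus, and the tied grammar allows both orders only with focus on the dative, which is the German pattern described above.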
In the present paper we have only included one exemplary morphosyntactic
constraint. We mainly tried to explore the prosodic constraints, attempting to
connect up approaches like the ones mentioned above with current work in
prosodic phonology. We have shown that such a "deconstruction" of accent
assignment rules is not only compatible with the general approach to word
order variation, but even serves to derive a wide and comprehensive range of
data in a satisfactory fashion.
Notes
This paper is built on an earlier economy-theoretic paper of mine (Büring 1996) and
a series of optimality-based talks I gave at the SFB 282 Colloquium "Die
Intonationale" in Cologne 7-96, the "Interfaces of Grammar" conference in Tübingen
10-96, and the Stuttgart "Workshop on OT Syntax" 10-97 (Büring 1997a). I would like
to thank the audiences at these conferences for their comments and discussion,
especially Kai Alter, Katharina Hartmann, Gerhard Jäger, Inga Kohlhof, Gereon Müller,
Roger Schwarzschild and Hubert Truckenbrodt. Judith Aissen, Armin Mester and
Line Mikkelsen offered invaluable comments and suggestions that greatly helped to
shape and improve the present version. All remaining errors and shortcomings have
been retained deliberately in order to stimulate future work.
1. The issue of whether the DatO>AccO order is preferred for all verbs or just a
lexically specified sub-group is controversial; cf. among others Haider (1993),
Fortmann & Frey (1997), Vogel & Steinbach (1998), and Müller (1998). Since
this question is orthogonal to the issue at hand we will leave it unresolved,
concentrating on verbs that are uncontroversially among the DatO>AccO ones.
2. I ignore cases of obligatory movement here, which I argue in Büring (1996) exist
in German, too.
3. In fact, the situation is more complicated since I argued that (13) is not accurate,
a point I won't go into here.
4. To derive Diesing's (1992) original generalization, INDEFINITES in (13) would
have to be strengthened to a biconditional, requiring that indefinites are
existential if and only if they are VP-internal. In Büring (1996) I argue, however,
that this generalization is too strong, i.e., that VP-internal indefinites can also be
generic; cf. also Büring (in prep.).
5. These conditions, as well as the one requiring headedness of prosodic phrases
introduced in the next paragraph, should be properly understood as undominated
markedness constraints in an optimality framework. It should be understood that
this only holds for German, then, leaving open the possibility that they are ranked
in a more interesting way in other languages.
6. Note that not all F-marked XPs will need to have a pitch accent on this view,
though they all will have AD-level stress. Likewise, non-focused XPs can be
pitch accented (if all focused ones are, too), since they usually receive AD-level
stress as well. Crucially, however, non-focused AD-heads cannot have accents if
the focused AD-heads don't. This is the empirical generalization reported, e.g.,
in Uhmann (1991). Others, for example Féry (1993), claim that (in our termi-
nology) all AD-heads must receive a pitch accent if they are part of a focused
phrase, while non-focused ones may or may not. This latter viewpoint can be im-
plemented by reformulating FP below, but it necessitates further complications.
Given that the empirical situation seems unclear, I will therefore stick with the
easier generalizations offered in Uhmann (1991).
7. PRED and XP are close relatives of Truckenbrodt's (1999) WRAP and STRESS-XP
constraints, respectively. Just like these they will favor two ADs for adjunct-
head, argument-argument and adjunct-argument structures (since neither phrase
contains the other and no predication is involved), and with head-complement
structures they unanimously favor one AD (since bare heads aren't XPs). Note,
however, that PRED - unlike STRESS-XP - favors (NP V)AD-integration, even
if V ends up being the head of the AD (cf. (31) below). Also, PRED applies
even if the predicate is an XP of its own. I'm thinking of subject-intransitive-
verb, object-secondary-predicate, and NP-short relative clause structures, which
all have been reported to allow, if not require, single ADs. This can be achieved
in the present system by ranking PRED above XP.
8. A more precise rendering is (i):
(i) FP (FOCUSPROMINENCE)
a. If α is the smallest prosodic constituent that contains an F-marked
syntactic node β, α is called a prosodic focus.
b. If α is a prosodic focus at level n it is the head at level n+1.
9. It is perhaps worth noting that the second best candidate in (31), (31-c), will
receive a very similar overall realization to the winner (31-b). In particular both
(31-b) and (31-c) have main prominence (= the head of iP) on V. The difference
is merely the presence of AD-level prominence on the AccO in (31-c), which
means that it can, but doesn't have to, bear a secondary pitch accent. As far as I
can tell, the data are inconclusive with regard to this issue (a lot depends upon the
relation between prominence and pitch accents; cf. note 6 above). (31-b) owes its
optimality to the fact that A/P is ranked below ADF. If both were tied, both
(31-b) and (31-c) would be grammatical. Within the set of data that I discuss in this
paper, this seems to be the only case in which ranking ADF above A/P is crucial.
If required on empirical or theoretical grounds, this ranking could be given up,
ruling in (31-c) as an optimal output.
10. The F-marking on VP is not represented in the outputs. This is because I only
indicate the prosodic structure in the output, with some F-marks added for
convenience, so there is no natural place for them. Strictly speaking the outputs
should be pairs of syntactic phrase markers with F-marking and prosodic
structures without (or, perhaps, only the latter).
11. Pedantically speaking, (40-c) violates FP as formulated in note 8 above, since
the smallest prosodic constituent containing VP_F, iP, is not the head of the next
higher prosodic category, because there is no such higher category. But since this
affects all structures in the competition equally, no change will arise from this. In
the main text I will ignore these extra violations for the sake of perspicuousness.
Put differently, I will interpret FP to say "...is the head at level n + 1, if there is
such a level."
12. Another implementation would specify object order in the input, but allow GEN
to change it. Nothing hinges on this in the present context.
13. As stressed in Müller 1998, this also allows us to add more morphosyntactic
word-order constraints upon demand, yielding different morphosyntactically un-
marked word orders without committing to the assumption of different base-
generated argument orders.
14. Note that if only DAT and ADF were tied, (60-f) could never emerge as an
optimal candidate; it would always lose to (60-a). If instead all of DAT, ADF and
A/P were tied, (60-a) and (60-f) would be permitted (as desired), both having one
violation. But so would (60-c), which has only one violation, too. But (60-c) is
not acceptable in this case.
This is where the more complicated construction that I introduced at the
beginning of this section pays off: Among the structures that violate DAT, only (60-f)
is grammatical, because it violates none of the prosodic constraints. This
candidate is optimal under the ranking ADF » A/P » DAT. Among the structures
that satisfy DAT, only (60-a) is optimal, because it violates the lower prosodic
constraint A/P, rather than the higher one ADF, as (60-c) does. This corresponds
to the outcome under the ranking DAT » ADF » A/P. Crucially, for (60-c) to
win the constraints would have to be ordered with A/P dominating ADF (as in
(55-a) or (55-c) above), but this is not permitted by the kind of tie that is
assumed here.
15. Note that FP is not included here, owing to the empirical fact that a sentence
with, say, main prominence on the AccO can absolutely not be used in a context
that requires focus on the DatO. In other words, a structure like (i) is strictly
impossible, not just marked.
(i)  (           X      )iP
     (   x    )( x      )AD
     (DatO_F)(AccO)(V)
References
Bech, Gunnar
1955/57 Studien ueber das deutsche verbum infinitum. København (Det Kongelige
Danske Videnskabernes Selskabs Historisk-filologiske Meddelelser 35,
No. 2, 1955/ 36, No. 6, 1957).
Büring, Daniel
1996 Interpretation and movement: Towards an economy-theoretic treatment
of German 'Mittelfeld' word order. Ms., Frankfurt University.
Büring, Daniel
1997a Perfect or just optimal? Towards an OT account of German Mittelfeld
word order. Talk presented at the Workshop on OT Syntax, October 1997,
Stuttgart University.
Büring, Daniel
1997b The Meaning of Topic and Focus - The 59th Street Bridge Accent.
London: Routledge.
Büring, Daniel
in prep. What do definites do that indefinites definitely don't? Ms., UC Santa
Cruz.
Choi, Hye-Won
1996 Optimizing structure in context: Scrambling and information structure.
Ph.D. dissertation, Stanford University, (to appear with CSLI Publica-
tions, Stanford).
Diesing, Molly
1992 Indefinites. Cambridge, MA: MIT Press.
Drubig, Hans Bernhard
1994 Island Constraints and the Syntactic Nature of Focus and Association
with Focus. Arbeitspapiere des Sonderforschungsbereichs 340, No. 51.
University of Tübingen.
Eckardt, Regine
1996 Intonation and Predication: An Investigation in the Nature of
Judgement Structure. Arbeitspapiere des Sonderforschungsbereichs 340, No.
77. Stuttgart & Tübingen.
Féry, Caroline
1993 German Intonational Patterns. Tübingen: Niemeyer.
Fortmann, Christian — Werner Frey
1997 Konzeptuelle Struktur und Grundabfolge der Argumente im Deutschen.
In: F.-J. d'Avis and U. Lutz (eds.) Zur Satzstruktur im Deutschen, 143-
170. (Arbeitspapiere des Sonderforschungsbereichs 340.) University of
Tübingen.
Grimshaw, Jane
1997 Projection, heads and optimality. Linguistic Inquiry 28: 373-422.
Gussenhoven, Carlos
1984 On the Grammar and Semantics of Sentence Accents. Dordrecht: Foris.
Gutiérrez-Bravo, Rodrigo
1999 An OT account of the crosslinguistic differences in focus and word order
in English, Spanish and French. Ms., UC Santa Cruz.
Haider, Hubert
1993 Deutsche Syntax - Generativ. Tübingen: Narr.
Halle, Morris — Jean-Roger Vergnaud
1987 An Essay on Stress. Cambridge, MA: MIT Press.
Höhle, Tilman
1982 Explikation für 'normale Betonung' und 'normale Wortstellung'. In: W.
Abraham (ed.) Satzglieder im Deutschen, 75-153. Tübingen: Narr.
de Hoop, Helen
1992 Case configuration and noun phrase interpretation. Ph.D. dissertation,
Rijksuniversiteit Groningen.
Jacobs, Joachim
1992 Neutral stress and the position of heads. In: J. Jacobs (ed.) Informations-
struktur und Grammatik, 220-244. (Linguistische Berichte Sonderheft 4.)
Opladen: Westdeutscher Verlag.
Ladd, Robert D.
1996 Intonational Phonology. Cambridge, UK: Cambridge University Press.
Lenerz, Jürgen
1977 Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Narr.
McCarthy, John — Alan Prince
1993 Generalized alignment. Ms., University of Massachusetts, Amherst &
Rutgers University.
Müller, Gereon
1991 In support of dative movement. In: S. Barbiers, M. den Dikken, and C.
Levelt (eds.) Proceedings of LCJL 3, 201-217. Leiden.
Müller, Gereon
1998 German Word Order and Optimality Theory. Arbeitspapiere des Sonder-
forschungsbereichs 340, No. 126. Stuttgart & Tübingen.
Müller, Gereon
1999 Optionality in optimality-theoretic syntax. GLOT International 4(5): 3-8.
Neeleman, Ad — Tanya Reinhart
1998 Scrambling and the PF interface. In: M. Butt and W. Geuder (eds.) The
Projection of Arguments: Lexical and Compositional Factors, 309-353.
Stanford: CSLI Publications.
Pierrehumbert, Janet
1980 The phonology and phonetics of English intonation. Ph.D. dissertation,
MIT.
Pierrehumbert, Janet — Julia Hirschberg
1990 The meaning of intonational contours in the interpretation of discourse.
In: P. Cohen, J. Morgan and M. Pollock (eds.) Intentions in Communica-
tions, 271-311. Cambridge, MA: MIT Press.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: A Theory of Constraint Interaction. RuCCS Techni-
cal Reports 2. Rutgers University (to appear with MIT Press).
Rochemont, Michael
1986 Focus in Generative Grammar. Amsterdam/Philadelphia: John Ben-
jamins.
Schwarzschild, Roger
1999 GIVENness, AvoidF and other constraints on the placement of accent.
Natural Language Semantics 7(2): 141-177.
Selkirk, Elisabeth
1984 Phonology and Syntax: The Relation between Sound and Structure. Cam-
bridge, MA: MIT Press.
Selkirk, Elisabeth
1995 Sentence prosody: Intonation, stress, and phrasing. In: J. Goldsmith (ed.)
Handbook of Phonological Theory, 550-569. Cambridge, MA/Oxford,
UK: Blackwell.
Truckenbrodt, Hubert
1995 Phonological phrases: Their relation to syntax, focus, and prominence.
Ph.D. dissertation, MIT. (Published 1998 by MITWPL).
Truckenbrodt, Hubert
1999 On the relation between syntactic phrases and phonological phrases. Lin-
guistic Inquiry 30(2): 219-255.
Uhmann, Susanne
1991 Fokusphonologie. Tübingen: Niemeyer.
Uszkoreit, Hans
1987 Word Order and Constituent Structure in German. Stanford: CSLI Pub-
lications.
Vikner, Sten
1991 Verb movement and the licensing of NP-positions in the Germanic lan-
guages. Ph.D. dissertation, University of Geneva.
Vogel, Ralf — Markus Steinbach
1998 The dative - An oblique case. Linguistische Berichte 173: 65-90.
Webelhuth, Gert
1989 Syntactic saturation phenomena and the modern Germanic languages.
Ph.D. dissertation, University of Massachusetts, Amherst.
Zubizarreta, Maria Luisa
1998 Prosody, Focus and Word Order. (Linguistic Inquiry Monographs 33.)
Cambridge, MA: MIT Press.
Remarks on the Economy of Pronunciation
Gisbert Fanselow & Damir Cavar
1 Introduction and Overview
The idea that syntactic movement is composed of two steps, a copying oper-
ation followed by a deletion operation (the C&D-theory of movement) - as
illustrated in (1) - has again become popular with the rise of the Minimal-
ist Program (Chomsky 1993). In one of the straightforward extensions of the
C&D-approach, at least certain instances of so-called covert movement arise
from the overt copying of a full phrase before SPELLOUT, followed by the
deletion of the higher rather than the lower copy - an assumption that implies
that spellout conventions regulate whether the target or the source position of
the copying operation is realized phonetically (see, e.g., Bobaljik 1995, Groat
& O'Neil 1996, Pesetsky 1997, 1998a, Roberts 1997 (for head movement),
Sabel 1998, among others) - as illustrated in (2) for Chinese.
(1) Overt Movement (deleted copies are enclosed in angle brackets)
a. (it does not matter) she likes who   COPYING =>
b. (it does not matter) who she likes who   DELETION OF SOURCE =>
c. (it does not matter) who she likes <who>
(2) Covert Movement
a. ta weishenme da meigeren   COPYING =>
he why hit everyone
b. weishenme [ ta weishenme da meigeren ]   DELETION OF TARGET =>
c. <weishenme> [ ta weishenme da meigeren ]
In this paper, we will discuss four constructions which we believe have a

fairly simple analysis in a C&D-theory of movement only: true partial wh-
movement as in Bahasa Indonesia (3), wh-copying (4), left-branch extrac-
tions/split constituents as in German (5), and (apparent) head movement. We
agree with Pesetsky (1997, 1998a) in the conviction that the particular suc-
cess of a C&D-theory of movement (as compared to other models) hinges
on its interaction with principles of sentence pronunciation in an optimality
theoretic fashion.1
(3) a. Bill tahu Tom men-cintai siapa

Bill knows Tom loves who
b. Bill tahu [ siapa yang Tom cintai ]
Bill knows who FOC Tom loves
c. Siapa yang Bill tahu Tom cintai
'Who does Bill know that Tom loves?'
(4) Wen denkst du wen sie liebt?
who think you who she loves
'Who do you think that she loves?'
(5) Was hat sie [ t für Bücher ] gelesen?
what has she for books read
'What kind of books has she read?'
Furthermore, the analysis of the constructions discussed here relies on the

notion of cyclic optimization in syntax, as recently proposed by Müller (1999)
and Heck & Müller (1999).
The paper is organized as follows. We will introduce the basic idea of pro-
nunciation economy approaches in the next section, with an analysis of partial
wh-movement. It will be argued that partial wh-movement data are exactly the
kind of construction one would expect to find if the C&D-theory of movement
is correct. Certain syntactic differences among the languages that have partial
movement will be analyzed in an optimality theoretic fashion.
The economy aspect of the approach defended here will become clear in
section 3, where we show that deletion in chains may be incomplete if cer-
tain constraints are ranked above the principle of pronunciation economy. In
section 4, we show that semantic and phonological conditions may imply that
deletion is distributed over both copies in a movement chain. One particularly
promising aspect of this account is that it allows us to reduce head movement
to phrasal movement without being confronted with notorious problems, like
missing freezing effects, that arise in other approaches (e.g., Kayne 1998,
Koopman & Szabolcsi 1999, Mahajan 1999).
2 True Partial Wh-Movement
An integration of the C&D-theory into an OT-approach to syntax (see

Grimshaw 1997, Pesetsky 1998a, Legendre (in press), among many others)
involves the following ingredients: Copying and deletion apply freely in the
GEN-component of grammar, but the effects they have on grammatical out-
puts are determined by a number of principles on syntactic structure and sen-
tence pronunciation. For concreteness, we will assume without further dis-
cussion that the creation of copies ("movement") is forced by the need to
check features of an attracting head (as in Chomsky 1995, but nothing hinges
on this assumption), 2 and we assume that movement is successive-cyclic.
Chains that are created in this way will "originally" contain multiple copies
of the same phonetic and semantic material. These copies may, but need not
be subjected to a deletion operation, so that GEN will generate at least the
candidates in (6) 3 for a question like who do you think she will invite?
(6) a. who do you think who she will invite who
b. who do you think who she will invite <who>
c. who do you think <who> she will invite who
d. <who> do you think who she will invite who
e. who do you think <who> she will invite <who>
f. <who> do you think who she will invite <who>
g. <who> do you think <who> she will invite who
h. <who> do you think <who> she will invite <who>
In the standard case of movement, only one of these copies is actually pro-
nounced. This follows from an interaction of the principles PRONECON and
RECOV.4 PRONECON favors those structures in which the deletion of pho-
netic matrices in chains is maximized, but deletion is subject to recoverabil-
ity, so that normally 5 exactly one copy will be retained in each chain. In other
words, in most situations, only (6-e-g) are potential winners.
(7) a. Pronunciation Economy (PRONECON) 6

* Phonetic Matrix
b. Recoverability (RECOV)7
The content of unpronounced elements must be recoverable from a
local antecedent.
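
The interaction of the two principles in (7) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the original text: the chain is reduced to three copies, RECOV is modelled as a single violation incurred when no copy of the chain is pronounced, and RECOV is taken to outrank PRONECON. All function and variable names are hypothetical.

```python
from itertools import product

# Three copies of the wh-phrase in the chain of (6), from highest to lowest.
COPIES = ["Spec,CP (matrix)", "Spec,CP (embedded)", "base position"]

def gen(n_copies):
    """GEN: every way of keeping (True) or deleting (False) each phonetic matrix."""
    return list(product([True, False], repeat=n_copies))

def pronecon(candidate):
    """PRONECON (*Phonetic Matrix): one violation per pronounced copy."""
    return sum(candidate)

def recov(candidate):
    """RECOV, simplified (assumption): violated once if no copy is pronounced."""
    return 0 if any(candidate) else 1

def winners(candidates):
    """EVAL with RECOV >> PRONECON: compare violation profiles lexicographically."""
    def profile(c):
        return (recov(c), pronecon(c))
    best = min(profile(c) for c in candidates)
    return [c for c in candidates if profile(c) == best]

candidates = gen(len(COPIES))   # the eight deletion patterns of (6)
optimal = winners(candidates)   # the patterns with exactly one pronounced copy

for cand in optimal:
    print([name for name, kept in zip(COPIES, cand) if kept])
```

Under these assumptions the eight deletion patterns of (6) are generated, and exactly the three patterns with a single pronounced copy survive; nothing in (7) alone decides which of the three copies that is.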
Ceteris paribus, this approach leads to the expectation that any copy in a
chain may be the one that is spelled out, with all the others being deleted.
So-called "partial wh-movement"8 as can be found in Bahasa Indonesia (cf.
(3), repeated here as (8), and Saddy 1991, 1992) or Malay (Cole & Hermon
1998) seems to bear this prediction out. In a wh-question, the wh-phrase may
either appear in situ (8-a), or be realized in its scope position (8-c), but it can
also show up in the specifier positions of any of the CPs that may intervene
between the root position of the wh-phrase and its scope position, as (8-b)
illustrates ("true partial wh-movement").
(8) a. Bill tahu Tom men-cintai siapa

Bill knows Tom loves who
b. Bill tahu [ siapa yang Tom cintai ]
Bill knows who FOC Tom loves
c. Siapa yang Bill tahu Tom cintai
'Who does Bill know that Tom loves?'
Cole & Hermon (1998) argue that partial wh-movement is not focus move-
ment; see also Basilico (1998) for arguments that partial movement in Slave
cannot be reduced to focus movement. The most straightforward analysis for
(8) (considered but rejected in Cole & Hermon 1998) assumes that siapa has
in fact been attracted to its scope position in all examples, forming the chain
indicated in (9-a). Due to the interaction of PRONECON and RECOV, all but
one of the copies of siapa must not be pronounced. In the optimal state of af-
fairs, any of the copies may be the one that is realized overtly, as the abstract
structures (9-b-d) illustrate, which (roughly) correspond to (8).
(9) a. [CP siapa ... [CP siapa ... siapa ]]
b. [CP <siapa> ... [CP <siapa> ... siapa ]]
c. [CP <siapa> ... [CP siapa ... <siapa> ]]
d. [CP siapa ... [CP <siapa> ... <siapa> ]]
There is at least one argument for analyzing true partial wh-movement along
these lines. As Saddy (1991, 1992) and Cole & Hermon (1998) observe, par-
tially moved wh-phrases behave as if they have moved to the scope posi-
tion at least in terms of island conditions: There must be no movement is-
land between the partially moved wh-phrase and its scope position. Thus, a
wh-phrase cannot be moved out of an adjunct clause in Malay, and partially
moved wh-phrases must not occur within adjuncts either, as (10) (taken from
Cole & Hermon 1998: 227, 236) illustrates. The same holds, for example, for
subject islands or for wh-islands: Wh-phrases cannot be extracted out of such
islands, and wh-phrases that seem to have undergone "partial" wh-movement
are not tolerated in these constructions either.
(10) a. *Apa (yang) Ali dipecat kerana dia beli t

what (that) Ali was-fired because he bought
b. *Ali dipecat apa (yang) kerana dia beli t
'What is the thing such that Ali was fired because he bought it?'
Such observations are explained straightforwardly if - as (9) suggests - "par-
tial" movement is in fact full wh-movement, involving a "non-standard" dele-
tion part though: The constellation in (11-a) created by copying is ruled out
by standard island theories. It does not matter then whether the uppermost or
an intermediate copy of this (illicit) wh-chain fails to undergo deletion trig-
gered by PRONECON.
(11) a. *wh-phrase ... [island ... wh-phrase ... wh-phrase ... ]
b. *<wh-phrase> ... [island ... wh-phrase ... <wh-phrase> ... ]
c. *wh-phrase ... [island ... <wh-phrase> ... <wh-phrase> ... ]
Island facts thus favor the C&D-analysis of partial wh-movement.

Saddy (1991) and Cole & Hermon (1998) point out, however, that argu-
mental wh-phrases in situ can appear within syntactic islands (although they
cannot be moved out of these islands), as (12) illustrates.
(12) Ali dipecat kerana Fatimah fikir dia mem-beli apa

Ali was-fired because Fatimah thinks he bought what
(12) illustrates, in fact, a fairly widespread property: Unlike their adjunct
counterparts, argumental wh-phrases in situ do not obey any island constraints
in a number of languages (but not in all), among them Chinese (see, e.g.,
Huang 1981). If island conditions affect overt and covert movement in the
same way - as they have to if the difference between the two movement types
is one of pronunciation only - (12) cannot involve movement. Rather, argu-
mental wh-phrases in situ must be assumed to be bound by a (null) question
operator in the appropriate Comp (see, e.g., Aoun & Li 1993, Tsai 1994, Cole
& Hermon 1998).9
More precisely, the [+wh]-Comp as in (13) may or may not have a syntactic
feature F that attracts a wh-phrase.10 If the feature F is missing, the wh-phrase
cannot move to the specifier position of Comp, but a binding relation can be
established for wh-phrases that are "referential" enough to be bound (i.e., for
argumental wh-phrases). Then no island effects are to be expected.11 If the
attracting feature F is present, the wh-phrase moves, so that island effects
arise, irrespective of where the wh-phrase is spelled out in the end.
(13) Comp [ +wh ] ... wh-phrase ...
Chinese illustrates the prediction that certain in situ wh-phrases can be island
sensitive: Wh-adjuncts can stay in situ, but they must not appear in islands.
Note that adjuncts cannot be bound by an (argumental) question operator base
generated in Comp. Therefore, wh-adjuncts can form a part of a question only
if a chain is built up which links the adjunct to its scope position. Wh-adjuncts
can thus be realized phonetically in situ only if a copy-chain (respecting is-
lands) is built up to the scope position, in which the lowest copy surfaces after
the deletions as forced by (7).12
We therefore follow Cole & Hermon (1998) in making the assumption that
two strategies for forming questions coexist in Malay at least: copying of the
wh-phrase to its scope position, and the binding of wh-arguments in situ. See
Pesetsky (1998b) for a related but slightly different view on English, German
and Slavic questions.
A phonetic sequence such as (14) in which an overt copy of the wh-phrase
appears in its base position is thus ambiguous in our account (but not in Cole
& Hermon 1998): buah apa may be bound by a [+wh]-Comp, or it may be
the copy of a chain link to the matrix Spec,C position that is spelled out pho-
netically. Given that the binding-in-situ strategy is, in general, more liberal
than the formation of questions by movement, (nearly) all examples that are
grammatical under a movement analysis are generatable with a binding anal-
ysis, too - so that the existence of an ambiguity is both hard to establish and
also hard to refute.
(14) Mary (mem)-beli buah apa di kedai

Mary prefix-buy fruit what at shop
'What fruit did Mary buy at the shop?'
This prediction of an ambiguity may come closer to what holds in Singa-
porean Malay than in Bahasa Indonesia. According to Saddy (1991), wh-
phrases in situ in Bahasa Indonesia are special in that they always take widest
possible scope with respect to other operators, and one can take this as an ar-
gument against the systematic structural ambiguity of wh-phrases in situ that
we predict; however, the relevant judgments seem not to be shared by native
speakers of Singaporean Malay (Cole & Hermon 1998: 225), so that we may
uphold our analysis at least for the latter language.
As for the situation in Bahasa Indonesia, systematic differences between
wh-phrases that are phonetically realized in situ and those that appear in other
positions can be accounted for in the following way. Müller (1997) argues
for a principle like (15) as one of the determinants of parametric variation
among languages with respect to question formation (we have adapted the
formulation to the needs of the system we develop here).
(15) WH-IN-SPEC
A wh-phrase must be phonetically realized in the specifier position
of a CP.
WH-IN-SPEC blocks the phonetic realization of wh-phrases in situ whenever
a chain has been formed which contains more than two members (and if at
least one of the chain members is a specifier of CP). Thus, unless other prin-
ciples override it, (15) implies that wh-phrases in situ are not part of a chain
reaching Spec,CP, as required for Bahasa Indonesia. As long as RECOV is
ranked above WH-IN-SPEC, however, wh-phrases that are bound by a null
operator in Comp (rather than being part of a chain created by copying) can
nevertheless be realized in situ, because their omission would violate recov-
erability.
For languages like Chinese in which wh-phrases show up in root posi-
tions only, the effects of (15) must be counteracted by a further principle
like STAY*, which has been proposed in a somewhat different form in vari-
ous works (see, e.g., Grimshaw 1997, Müller 1997, Legendre et al. 1998, and
Ackema & Neeleman 1998) and which can be formulated as in (16) in our
approach.
(16) STAY* (Nonstandard formulation)13
If the phonetic matrix of α c-commands a member of the chain of
β, then it c-commands the phonetic matrix of β.
When STAY* » WH-IN-SPEC, the Chinese type of question formation arises
(no visible effects of copying whatsoever); when WH-IN-SPEC » STAY*,
the Bahasa Indonesia system comes into being (wh-phrases in chains are
always displaced phonetically,14 but bound wh-phrases may be realized in
situ). Finally, if there is a tie between STAY* and WH-IN-SPEC, the grammar
of Singaporean Malay arises, in which wh-phrases that belong to movement
chains may surface in situ and in derived positions.
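
The three rankings just described can be checked mechanically on a toy example. The sketch below is not the authors' formalization: it assumes a single-clause chain with only two copies, reduces STAY* and WH-IN-SPEC to one violation each (for pronouncing the displaced copy and for pronouncing a copy outside Spec,CP, respectively), and models a tie as the union of the winners under every resolution of the tie. All names are hypothetical.

```python
from itertools import permutations

# Each candidate names the position whose copy keeps its phonetic matrix.
CANDIDATES = ["in situ", "Spec,CP"]

# Simplified violation counts (assumptions of this sketch).
CONSTRAINTS = {
    "STAY*": lambda pos: 0 if pos == "in situ" else 1,
    "WH-IN-SPEC": lambda pos: 0 if pos == "Spec,CP" else 1,
}

def eval_total(order, candidates):
    """EVAL for one total ranking: lexicographic comparison of violations."""
    def profile(c):
        return tuple(CONSTRAINTS[name](c) for name in order)
    best = min(profile(c) for c in candidates)
    return {c for c in candidates if profile(c) == best}

def eval_ranking(strata, candidates=CANDIDATES):
    """strata: list of lists; constraints in the same inner list are tied.
    A tie is modelled as the union of winners over all of its resolutions."""
    orders = [[]]
    for stratum in strata:
        orders = [o + list(p) for o in orders for p in permutations(stratum)]
    result = set()
    for order in orders:
        result |= eval_total(order, candidates)
    return result

chinese_type = eval_ranking([["STAY*"], ["WH-IN-SPEC"]])
indonesian_type = eval_ranking([["WH-IN-SPEC"], ["STAY*"]])
malay_type = eval_ranking([["STAY*", "WH-IN-SPEC"]])

print(chinese_type, indonesian_type, malay_type)
```

With these simplifications, the first ranking keeps the copy in situ, the second displaces it to Spec,CP, and the tie admits both realizations, which is the three-way pattern described above for chain members. Bound wh-phrases that are not part of a chain fall outside this toy model.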
A further observation on Malay and Bahasa Indonesia discussed in Saddy

(1991) and Cole & Hermon (1998) seems to be incompatible with our anal-
ysis and figures as the key argument against the spellout account of partial
movement in Cole & Hermon (1998: 251).15 Transitive verbs in Bahasa In-
donesia and Malay optionally combine with certain prefixes like meng. These
must be absent, however, when a wh-phrase has moved across them, but they
can be present when the wh-phrase is in situ. This rule holds for long wh-
movement as well (see Cole & Hermon 1998: 230-234 for details). Crucially,
in partial wh-movement constructions, meng must be absent between the root
position and the position of the phonetically realized wh-phrase only; it can
appear between the overt position of the wh-phrase and the latter's scope po-
sition, as (17) indicates.
(17) Ali (mem) beritahu kamu tadi apa yang Fatimah (*men)-baca
Ali meng told just now what that Fatimah meng read
'What did Ali tell you just now that Fatimah was reading?'
Whether this observation creates a problem for the C&D-analysis of "partial"

movement (as Cole & Hermon 1998 claim it does) or not, depends, of course,
on the details of the rule system that governs the distribution of meng. (18)
appears to capture the empirical facts, and if such statements are allowed as
(language-particular?) constraints in grammars, particle distribution cannot
even be used as an argument against a fully representational interpretation of
a C&D-analysis of partial movement.
(18) *MENG, whenever MENG is c-commanded by the phonetic matrix
of wh-phrase α and when it c-commands a trace of α.
A more convincing analysis of the particle facts can be constructed, however,

if we assume a notion of cyclic optimization, as proposed by Müller (1999)
and Heck & Müller (1999). In standard OT, the generator component of gram-
mar (GEN) constructs a set of syntactic objects from an input. These syntactic
objects are the candidates for the evaluation procedure (EVAL), which selects
optimal candidates, which are then grammatical sentences.
In a cyclic model of optimization, this interaction of GEN and EVAL applies
sequentially, building up and optimizing larger and larger syntactic objects.
For concreteness, suppose that by applying merger and copying operations
to lexical entries or syntactic objects previously formed, GEN generates a set
of syntactic objects, until the elements so formed correspond to cyclic nodes
(NPs and CPs) or to "phases" in the sense of Chomsky (1998). These can-
didates are then subjected to the EVAL procedure, yielding an optimal can-
didate. The optimal candidates for the expression of cyclic categories/phases
so formed may then be fed into the GEN component again, in order to gener-
ate even larger structures, until the level of cyclic nodes or phases is reached
again, at which the EVAL procedure selects the optimal structure again.
In such a system, the question of which of the copies created by move-
ment can be retained, and which copies are deleted phonetically, poses it-
self each time the construction of alternative structural representations has
reached the cyclic node level. Consider, then, a stage in a derivation in which
a wh-phrase has been copied to a higher position, crossing an occurrence of
meng in this context (19-a). Suppose that Σ is cyclic, so that optimization
can and must start. Because of PRONECON, one of the two wh-phrase copies
must disappear.16
If the upper copy loses its phonetic matrix (19-b), nothing seems to have
to happen to meng, i.e., it can be retained. In structures in which meng has
been retained, the uppermost specifier position of CP therefore does not have
a phonetic matrix, and it will not be able to re-acquire this phonetic matrix
in later copying steps for more or less obvious reasons.17 Therefore, above a
retained meng, no copy in a wh-chain originating lower than meng can have
a phonetic matrix.
Assume, however, that there is a principle requiring that meng must be
deleted (19-c) when the upper copy is retained phonologically. This can (and
must) be checked locally in each cyclic domain relevant for optimization.
Thus, the empirical generalizations that concern meng-distribution are very
well compatible with a C&D-approach when it is executed cyclically.18
(19) a. [Σ wh-phrase meng wh-phrase ]
b. [Σ <wh-phrase> meng wh-phrase ]
c. [Σ wh-phrase <meng> <wh-phrase> ]
A more straightforward version of this account assumes that cyclic wh-

movement always targets the specifier positions of the relevant "phases", i.e.,
it assumes that cyclic wh-movement always passes through Spec,CP and the
specifier position of a functional projection below the subject position but
above VP (see Chomsky 1986,1998; the relevant functional head might, e.g.,
be AGR-O relative to some earlier versions of the minimalist program, or
the "outer specifier of vP" as in Chomsky 1998). The ban on the use of the
meng-prefix in situations in which it has been "passed" by overt movement
can then be reduced to (20), which bears an obvious similarity to the Doubly-
Filled Comp Filter.
(20) *[AGR-O-P WH-PHRASE [ AGR-O meng [VP ... ]]], if WH-PHRASE has
a phonetic matrix.
When the wh-phrase has been copied to Spec,AGR-O, a phase is com-
pleted and the output must be optimized. If meng deletes, the wh-phrase in
Spec,AGR-O may or may not be the one that retains its phonetic matrix. If
meng fails to delete, the lower copy of the wh-phrase - and not the one in the
specifier position of AGR-O - must be the one that retains its phonetic matrix.
The upper copy (being stripped of its phonetic features) is, however, the one
that will undergo further (and therefore invisible) movement (see footnote 17
for a precise argumentation). Thus, the fact that meng is never "crossed" by
overt movement is derived.
That the wh-phrase is, apparently, never realized phonetically in Spec,AGR-
O seems to follow from the interaction of STAY* and WH-IN-SPEC. STAY*
only favors the root position of a wh-phrase, while WH-IN-SPEC disfavors
the realization of wh-phrases in anything but Spec,CP.
We now seem to run into a problem, however: If optimization applies cycli-
cally, WH-IN-SPEC forces the phonetic realization of a wh-phrase in the
lower Spec,CP-1 position when Σ is optimized in (21). Spec,AGR-O is then
left with a wh-phrase lacking a phonetic matrix - and this is the only one that
can be copied to higher positions. Spec,CP-2 thus seems to never be able to
acquire a phonetic matrix.
(21) [Σ* Spec,CP-2 ... [Σ Spec,AGR-O ... Spec,CP-1 ... ]]
Note, however, that we need a further principle anyhow in order to capture
languages that neither follow the wh-in-situ strategy (Chinese) nor partial movement
(Malay). Consider in this respect PARSESCOPEO, borrowed from Legendre
et al. (1998), but adapted to our current needs:
(22) PARSESCOPEO (Nonstandard formulation)
If α has scope over β then the phonetic matrix of α c-commands the
phonetic matrix of β.
Suppose PARSESCOPEO is tied with WH-IN-SPEC, while STAY* is low (or
tied). If α is a wh-phrase, the optimization of Σ in (23) will imply that α's
phonetic matrix appears in Spec,AGR-O-1 in (23) - only STAY* runs counter
to this conclusion, but STAY* has a lower rank. When Σ* is optimized, the
phonetic matrix of α is moved even further, because PARSESCOPEO and WH-
IN-SPEC pull in the same direction.
(23) [Σ*** Spec,CP-2 ... [Σ** Spec,AGR-O-2 ... [Σ* Spec,CP-1 ... [Σ
Spec,AGR-O-1 ... α ... ]]]]
The wh-phrase appears now in the lowest Spec,CP position. The derivation
bifurcates when Σ** is reached: α climbs up phonetically if PARSESCOPEO
is given more weight, while its phonetic material stays in Spec,CP-1 when the
tie is resolved towards WH-IN-SPEC. The former derivation will finally copy
the phonetic material of α further to Spec,CP-2 (because the two constraints
in question have the same implications for the last derivational step); the latter
cannot but leave α phonetically at Spec,CP-1. In other words, where there is a
tie between WH-IN-SPEC and PARSESCOPEO, partial wh-movement arises.
When PARSESCOPEO dominates WH-IN-SPEC, the phonetic material of
the wh-phrase will be realized in the highest chain position under considera-
tion (WH-IN-SPEC cannot block scope driven movement up to Spec,AGR-O)
- this characterizes languages with full and multiple wh-movement like Ro-
manian and Bulgarian.
The factorial typology leads us to expect that there are also languages in
which WH-IN-SPEC dominates PARSESCOPEO. Slave could be such a lan-
guage: In Slave (Basilico 1998), the wh-in-situ strategy is less restrictive than
overt movement, as (24-a-b) show: The complements of so-called "indirect
discourse verbs" are barriers for overt movement, but wh-in-situ is licensed.
It is thus surprising that (24-c), which involves partial movement within the
island, is in fact grammatical, quite in contrast to what one would expect from
the situation we found in Malay.
(24) a. *?Ode netá nimbáa enáih?á kenéhdzáh

where your father tent 3pitch 3tried
'Where did your father try to pitch the tent?'
b. Raymond Jane judeni ri yili kodisho
Raymond Jane where FOC 3is 3knows
c. Raymond [ judeni ri Jane yili ] kodisho
'Where does Raymond know Jane to be?' (= 1875a,b of Rice 1989)
Slave differs from Malay in a further respect: There are configurations in

which partial movement is ruled out, while complete movement is not: Com-
plements of direct discourse verbs are transparent for movement, but disallow
partial movement:
(25) a. John beya judeni ráwozée sudeli

John my-son where 3opt-hunt 3wants 1
'Where does John want my son to hunt?'
b. *John judeni beya ráwozée sudeli
c. Hodi nurse egháuhndá néndi
where nurse 1opt-see-2sg 3told2
'Where did the nurse tell you she would see you?'
Basilico (1998) suggests that complements of direct discourse verbs lack a

Spec,CP node, so that (25-b) cannot possibly arise. Thus, the grammatical-
ity of (24-c) is the only surprising property of Slave. We can understand the
contrast between (24-a) and (24-c), however, if we assume that complements
of indirect discourse verbs are transparent for movement, but that some con-
stellation C of principles rules out that any of the occurrences of wh-phrases
that have been copied out of the complement CP could ever bear a phonetic
matrix: Wh-movement then has to be obligatorily "partial".
We have already seen what this constellation of principles is: WH-IN-SPEC
dominating PARSESCOPEO inevitably prevents wh-phrases from leaving a
Spec,CP position they have reached if wh-movement needs to additionally
pass through an AGR-O-position in the next higher clause. The predictions
of the factorial typology constructible from STAY*, PARSESCOPEO and WH-
IN-SPEC are therefore borne out.
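
The cyclic derivation over the structure in (23) can be traced step by step in a short sketch. It rests on several simplifying assumptions that go beyond the text: the scope constraint in (22) is reduced to "pronounce the wh-phrase in the highest position of the current cycle", WH-IN-SPEC to "pronounce it in some Spec,CP", STAY* is left out because it is ranked too low to decide anything here, and a tie is modelled as the union of the outcomes of both resolutions. The function names and the key PARSESCOPE are hypothetical labels.

```python
from itertools import permutations

# Specifier positions the wh-phrase is copied to, bottom-up, as in (23).
# Reaching each position closes one optimization cycle.
PATH = ["Spec,AGR-O-1", "Spec,CP-1", "Spec,AGR-O-2", "Spec,CP-2"]

def violations(constraint, site, top):
    """Simplified local violation counts (assumptions of this sketch)."""
    if constraint == "WH-IN-SPEC":    # the pronounced copy should sit in a Spec,CP
        return 0 if site.startswith("Spec,CP") else 1
    if constraint == "PARSESCOPE":    # the pronounced copy should be the highest one
        return 0 if site == top else 1
    raise ValueError(constraint)

def best(candidates, order, top):
    """EVAL for one cycle under one total ranking."""
    def profile(site):
        return tuple(violations(c, site, top) for c in order)
    low = min(profile(s) for s in candidates)
    return {s for s in candidates if profile(s) == low}

def derive(strata):
    """Return the set of positions in which the wh-phrase may end up pronounced."""
    orders = [[]]
    for stratum in strata:
        orders = [o + list(p) for o in orders for p in permutations(stratum)]
    sites, previous_top = {"base"}, "base"
    for top in PATH:
        next_sites = set()
        for site in sites:
            # Only the copy at the edge of the previous cycle is copied further;
            # if it has lost its phonetic matrix, pronunciation cannot move up.
            candidates = [site, top] if site == previous_top else [site]
            for order in orders:
                next_sites |= best(candidates, order, top)
        sites, previous_top = next_sites, top
    return sites

tie = derive([["WH-IN-SPEC", "PARSESCOPE"]])
scope_first = derive([["PARSESCOPE"], ["WH-IN-SPEC"]])
spec_first = derive([["WH-IN-SPEC"], ["PARSESCOPE"]])

print(tie, scope_first, spec_first)
```

Under these assumptions the tie yields pronunciation in either Spec,CP-1 or Spec,CP-2 (optional partial movement), the scope constraint dominating yields Spec,CP-2 only (full movement), and WH-IN-SPEC dominating yields Spec,CP-1 only, that is, obligatorily partial movement of the kind suggested above for Slave.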
3 Wh-Copying
3.1 Preliminary Remarks on Question Formation in German
Before we discuss wh-copying in German, we will start with a few remarks on
the embedding of German in the principle system developed so far. German
is not a wh-in-situ language, and does not allow partial wh-movement of the
kind we find in Malay languages, at least not in simple questions. In multiple
questions, only one wh-phrase appears in its scope position:
(26) a. Wen hat er wem gezeigt?

whoACC has he whoDAT showed
'Who did he show to whom?'
b. *Wen wem hat er gezeigt?
On obvious grounds, (26) can be analyzed in two ways: Taking up ideas pro-
posed by Grewendorf (1999) and Sabel (1998), we may hypothesize that all
wh-phrases move to their scope position, but that there is a principle that bans
the spelling out of more than one wh-phrase per Spec,CP position. Alterna-
tively, we may follow Müller (1997) in the assumption that there is a principle
that bans the (phrasal) movement of more than one wh-phrase to Spec,CP in
German. The wh-phrase in situ would then have to be bound in situ (or un-
dergo feature movement in the system of Pesetsky 1998b).
When the two wh-phrases of a multiple question originate in different
clauses, no uniform pattern emerges: In addition to the constellation in (27-a),
which closely mirrors the English counterpart and which characterizes stan-
dard German, there are dialects in which the multiple question cannot be
formed in the way it is in (27-a).19 In such dialects, the lower wh-phrase ei-
ther has to undergo "partial" wh-movement to the specifier position of the
complement clause (as in (27-b), which is acceptable for at least some speak-
ers in Potsdam and surroundings), or the lower wh-phrase must be the one
that undergoes overt movement (as in (27-c), blatantly violating superiority
thereby).20
(27) a. !Wer denkt, dass sie wen liebt?

who thinks that she who loves
'Who thinks that she loves who?'
b. !Wer denkt, wen sie liebt?
c. !Wen denkt wer, dass sie liebt?
The latter two dialects thus resemble Iraqi Arabic (see Ouhalla 1996) and
Hindi (Mahajan 1990) in the sense that (a) wh-phrases in situ cannot take
scope out of the minimal finite clause they are contained in (unless they fill
this clause's specifier position) and (b) the distribution of wh-in-situ is there-
fore more constrained than the distribution of wh-phrases that have under-
gone overt movement. If German wh-phrases in situ are not moved covertly,
and are subject to additional binding requirements of the sort we find in Iraqi
Arabic (Ouhalla 1996), the unavailability of a multiple question interpretation
for (27-a) in the relevant dialects is captured fairly easily, while it is less clear
why covert movement should have to fulfill less liberal island conditions than
overt movement, if the major difference between the two operations is one of
the location of spellout. The dialects that rule out (27-a) thus suggest that Ger-
man wh-phrases in situ do not involve covert movement. This is quite in line
with the conclusions arrived at (for what he terms covert phrasal movement)
by Pesetsky (1998b) on quite different grounds.
Obviously, the dialects with the stricter binding restrictions on wh-in-situ
solve the pertinent problem in two ways, either by allowing partial movement
to move the wh-phrase to a position in which it can be bound from
outside (27-b) or by moving the lower phrase directly to its scope position (as
in (27-c)). The latter strategy cannot help with triple questions for straightforward
reasons: The lowest wh-phrase still is separated from its scope position
by a finite clause boundary.
(28) *Wen denkt wer, dass sie wem vorgestellt hat?
who.ACC thinks who that she who.DAT introduced has
'Who thinks that she introduced who to whom?'
On the other hand, (29) is fine in the dialects that tolerate partial movement,
so we must assume that wem fulfills the locality requirements on binding
because of the presence of wen in the next Spec,CP position.
(29) !Wer glaubt wen sie wem vorgestellt hat?

who believes who she who introduced has
'Who thinks that she introduced who to whom?'
Let us now turn to the different strategies of forming long-distance questions
in German: Like Romani or Frisian, German is fairly rich in this respect.
(30) a. !Wen denkst du dass sie liebt?

who think you that she loves
b. Wen denkst du liebt sie?
c. Was denkst du wen sie liebt?
what think you who she loves
d. !Wen denkst du wen sie liebt?
who think you who she loves
e. !/*Wen denkst du was sie liebt?
'Who do you think that she loves?'
(30-a) exemplifies standard long wh-movement of arguments, which is grammatical
in some, 21 but not all varieties of German (see, e.g., Kvam 1983,
Fanselow, Kliegl & Schlesewsky 2000). Extractions from so-called verb-second
complements (30-b) are well formed in all dialects of German. 22
(30-c) exemplifies so-called "wh-scope marking", and is often analyzed as
involving partial wh-movement plus the insertion of a scope marker (see
McDaniel 1989, Müller 1997, and the contributions to Lutz et al. 2000). If
Fanselow & Mahajan (1996, 2000) are correct, however, the construction
involves quite a different analysis. Presupposing that the latter approach is
correct, we will ignore the construction in the rest of this paper.
(30-d) is, however, a construction of particular interest in the context of pronunciation
economy: It appears as if more than one copy within a wh-chain
formed by overt movement is spelled out phonetically (Copy Construction,
CC). Similar constructions exist in Frisian (Hiemstra 1986), Afrikaans (du
Plessis 1977) and Romani (McDaniel 1989). We will focus our attention on
this construction in this section, after some brief comments on remaining options.
McDaniel, Chiu & Maxfield (1995: 741) state that structures like (30-e) are
ungrammatical in German although they are used, e.g., in Romani with the
wh-marker so. Comparable constructions involving what exist in Child English,
but it is indeed hard to find native speakers of German who accept (30-e).
Note that the "wh-scope marker" so appearing in the Romani counterparts of
(30-e) is homophonous with the complementizer in Romani, an observation
that suggests a straightforward analysis of the construction: was/so/what is
not a question word or a scope marker in (30-e), but rather the agreeing
form of a complementizer - which agrees with its specifier position hosting
a (silent) copy of a wh-phrase created by overt movement of an element into
an even higher clause. We assume that this analysis is correct.
(31-a,b) are taken from Anyadi & Tamrazian (1993), who have located
speakers who accept these sentences in the Ruhr dialect of German. This
does not appear entirely correct, but there are dialects which tolerate these
structures. 23 We will comment on (31) at the end of this section.
(31) a. !Welchem Mann glaubst du wem sie das Buch gegeben hat?
which man believe you who she the book given has
'Which man do you think that she gave the book to?'
b. !Mit welchem Werkzeug glaubst du womit Ede das Auto
with which tool think you what-with Ede the car
repariert hat?
repaired has
'With which tool do you think that Ede repaired the car?'
3.2 Some Facts about the Copy Construction
Let us turn, then, to the Copy Construction (CC), and see how it fits into
our analysis. The CC is characterized by a number of interesting generalizations,
two of which are fairly standard. First, no copy may appear in the root
position of the wh-chain.
(32) *Wen denkst du wen sie [VP wen liebt]
who think you who she who loves
'Who do you think she loves?'
Overt copies show up in Spec,CP only. If infinitive clauses have no Spec,CP

position in German, the ill-formedness of (33) is explained immediately.
(33) a. *Wen versuchst du wen zu küssen?

who try you who to kiss
b. *Wen batest du mich wen zu küssen?
who asked you me who to kiss
Keeping an overt copy in a root position (in addition to the copy in the scope
position) not only implies violations of PRONECON, it also incurs a further
violation of WH-IN-SPEC. As long as there is no reason to keep a copy there
(and there is none), (32)-(33) always give way to alternative structures in
which the lowest copy is not spelled out.
A second generalization concerns the nature of all overt copies of the wh-
chain (but the lowest one): They may not be syntactically complex:
(34) a. *Wessen Studenten denkst du wessen Studenten wir kennen?
whose student think you whose student we know
'Whose student do you think that we know?'
b. *Wieviel Studenten denkst du wieviel Studenten wir
how many students think you how many students we
kennen?
know
'How many students do you think that we know?'
(35) Womit denkst du womit er sie verletzt hat?
with what think you with what he her hurt has
'With what do you think that he hurt her?'
(36) (?*)Mit was denkst du mit was er sie verletzt hat?

While CCs that involve wh-phrases consisting of a single word are perfect
in those dialects that allow the CC at all, the situation differs radically when
the wh-phrase is syntactically complex: In the CC, ungrammaticality arises in
quite a number of dialects/idiolects as soon as the upper copy contains two or
more words (see (34)). (35)-(36) form a nice minimal pair in this respect - the
structures do not differ in meaning but just in the fact that womit is a single
word, in contrast to mit was. For some (most?) speakers, a contrast exists
between (35) and (36) - with the ungrammaticality of the latter example
being only rather mild. There is practically nobody who would go beyond
(36) in terms of the complexity of the copied wh-phrase, though.
It has been assumed (Fanselow & Mahajan 1996, Höhle 1996) that this
anti-complexity restriction affects all copies in the same way, but this claim
overlooks the greater flexibility we observe for the lowest copy:
(37) Wen denkst du wen von den Studenten man einladen sollte?
who think you who of the students one invite should
'Which of the students do you think that one should invite?'
(38) Wieviel sagst du wieviel Schweine ihr habt?
how many say you how many pigs you have
'How many pigs do you say that you have?'
That there is a syntactically complex wh-phrase in the specifier position of the
complement clause is obvious in (38), and can be argued for easily in the case
of (37) as well: Note that the boldface material precedes the pronoun man, to
the left of which it is normally impossible to scramble PPs. That the boldfaced
material of the lower copy forms one constituent (and not two, with the PP
being scrambled to second position) is also obvious in those dialects of German
that allow a mixing of long movement and CC, so that constellations like
(39) may arise, in which the boldface copy wen von den Studenten can have
reached the intermediate clause only by mono-constituental wh-extraction,
since there is no long scrambling in German.
(39) Wen denkst du wen von den Studenten sie sagte dass man
who think you who of the students she said that one
einladen sollte?
invite should
'Which of the students do you think she said that one should invite?'
(40), on the other hand, shows that it is not sufficient for grammaticality that
one copy only in the chain is syntactically complex:
(40) *Wen von den Studenten denkst du wen man einladen sollte?
who of the students think you who one invite should
Finally, in those dialects which have few problems with (36), (41) is perfect
as well. PPs are strict islands for movement in German, so aus Konstanz
could not possibly ever have left an wen aus Konstanz by standard movement.
Thus, there is no alternative to an analysis of (41) in which the lower Spec,CP
position is occupied by an wen aus Konstanz, and the upper one by an wen.
(41) An wen denkst du an wen aus Konstanz man das schicken

to who think you to who from Constance one that send
darf?
may
'To which person from Constance do you think one is allowed to
send that?'
The complexity restriction thus does not apply to the lowest copy. The complexity
restriction holding for upper copies renders wh-copying ungrammatical
whenever a wh-phrase cannot be split or "separated", as is, e.g., the case
for which-phrases.
(42) a. Welches denkst du welches er nehmen wird?

which think you which he take will
'Which one do you think he will take?'
b. * Welches denkst Du welches Schweinderl er nehmen wird?
which think you which piggie he take will
'Which piggie do you think he will take?'
c. *Welches Schweinderl denkst Du welches Schweinderl er nehmen
wird?
The CC obeys stricter locality restrictions than standard long overt movement,
as Höhle (1996) and others have observed; consider (43) - corresponding
examples with simple long movement would be grammatical. What we
get is exactly analogous to the intervention effect Beck (1996) and Pesetsky
(1998b) identify for German wh-in-situ: No operator may intervene between
the copies of the wh-phrase.
(43) a. *Wen glaubt keiner wen sie liebt?

who believes nobody who she loves
'Who does nobody believe that she loves?'
b. *Wen glaubt jeder wen sie liebt?
who believes everybody who she loves
'Who does everybody believe that she loves?'
3.3 Three Analyses
There have not been too many proposals for an analysis of the CC, but Inge
Hiemstra's (1986) theory of the construction (and its Frisian counterpart) is
certainly outstanding in many respects. Published nearly ten years before
Chomsky (1995), her contribution anticipates insights of much work in featural
movement theory in a number of respects. The central idea of her analysis of
(44) is that when a structure requires wh-movement, this may be carried out
as any of the following:
- movement of the wh-feature alone,
- the pied-piping of the φ-features aligned with the wh-feature, or
- the pied-piping of the whole phrase bearing the wh-feature.
The resulting system is, thus, quite reminiscent of a movement theory generally
adopted later in the mid-nineties. The first two options for effecting
movement must then be complemented by a theory of spellout for the displaced
feature complexes. According to Hiemstra, it is the most unmarked
lexical element bearing the relevant features that will realize the feature
complex in question. A single [+wh]-feature is therefore realized as was
(= (44-a)), and the feature complex [+wh, 3rd ps., acc] as wen (= (44-b)).
(44) a. Pure feature movement:

Was denkst du wen sie eingeladen hat?
what think you who she invited has
b. Pied-piping of φ-features:
Wen denkst du wen sie eingeladen hat?
c. Pied-piping of full phrase:
Wen denkst du dass sie eingeladen hat?
'Who do you think she has invited?'
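The spellout idea attributed to Hiemstra above can be illustrated with a minimal sketch. The mini-lexicon, the feature labels, and the function name `spell_out` are hypothetical, and "most unmarked" is approximated here as "bearing the fewest features"; none of this is taken from Hiemstra (1986).

```python
# Hypothetical mini-lexicon: each form is paired with the feature bundle it realizes.
# The feature labels are illustrative assumptions, not an analysis of German.
LEXICON = {
    "was": frozenset({"wh"}),
    "wen": frozenset({"wh", "3", "acc"}),
    "wem": frozenset({"wh", "3", "dat"}),
}


def spell_out(moved_features):
    """Return the least specified form that bears all moved features.

    Assumption: markedness is approximated by the size of a form's bundle.
    Returns None if no form in the lexicon bears all moved features.
    """
    moved = frozenset(moved_features)
    matches = [form for form, feats in LEXICON.items() if moved <= feats]
    if not matches:
        return None
    # Fewest features first; the form itself breaks ties deterministically.
    return min(matches, key=lambda form: (len(LEXICON[form]), form))


print(spell_out({"wh"}))              # pure feature movement, cf. (44-a)
print(spell_out({"wh", "3", "acc"}))  # pied-piping of phi-features, cf. (44-b)
```

Under these assumptions a bare [+wh] feature comes out as was, and the bundle [+wh, 3rd ps., acc] as wen.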
If Fanselow & Mahajan (1996, 2000) are correct, was in (44-a) is a sentential
wh-expletive originating in the object position of the matrix clause. The parallel
between (44-a) and (44-b) would thus be a spurious one. It is also not
too clear to what extent we want to consider wieviel as in (38) or an wen as in
(41) to be mere spellouts of complexes of φ-features. For welche, one would
have to give an answer to the question of why it allows φ-featural copying
only if what is left behind is a single word item.
Thus, a more promising version of her approach would seem to have to
move closer to the ideas developed in Chomsky (1995), in assuming that the
minimal element that can be moved overtly is the word carrying the attracted
feature (= wen, wieviel, welche). We can then analyze the CC as a sequence
of two types of movement steps, the first one involving the pied-piping of the
phrase dominating the attracted word, the second one confining itself to the
overt displacement of the word-level category. With the exception of those
dialects that can prepose a minimal PP like an wen in the CC, this approach
is descriptively correct, but it leaves open some questions: Why is the wh-
word that moves in the second step not deleted in the phrasal copy, as in (45)
(which would be grammatical if the second occurrence of wieviel had been
kept phonetically)?
(45) *Wieviel denkst du wieviel Schweine sie sagt dass wir

how many think you how many pigs she says that we
haben?
have
'How many pigs do you think she says that we have?'
Fanselow & Mahajan (1996, 2000) concentrate on the development of an account
for (44-a), but add a sketch of an analysis of the CC, too. Their analysis
makes use of the fact that the Comp position cannot be empty phonetically in
German embedded clauses 24 (except in indirect question complements), and
they assume that the copies of the wh-phrase occupying the lower Spec,C position
cliticize onto Comp whenever this position is not filled in another way.
For wh-phrases of the size of a single word, this approach works smoothly,
but they do not take into account the principled availability of more complex
constructions like (37) and (38), for which it is hard to believe that the
complete wh-phrase occupies the Comp position.
Pesetsky's (1997, 1998a) crucial insight concerning uneconomical pronunciations
of wh-chains is that they typically arise in contexts where standard
movement would violate island conditions. In fact, one often finds the CC
in dialects that do not allow long movement of arguments. In a form slightly
adapted to the general approach we pursue here, the relevant Island Constraint
takes the form (46):
(46) ISLAND
*α ... [Σ ... β ... ]
where α, β belong to a single chain, α or β is unpronounced, and
Σ is an island.
Suppose that in the dialects allowing CCs, CPs are (or can be) barriers for extraction.
In the first derivational step for a long distance question, the wh-word
will be copied to Spec,C at some stage:
(47) wen dass du wen eingeladen hast

who that you who invited have
Whether dass may be preserved in the overt presence of phonetic material in

Spec,C is, partially, a function of the rank of the Doubly Filled Comp Filter in
German - it is violable in some but not all varieties of German (see, e.g., Heck
1997 for a pertinent OT analysis), and we will not go into this issue here.
When it comes to the optimization of the pronunciation of (47), a number of
principles come into play. In addition to PARSESCOPEO introduced above,
LEFTEDGECP taken over from Pesetsky (1998a: 341) 25 seems relevant:
(48) LEFTEDGECP (LEC)

The first pronounced word in a CP must be the complementizer pro-
jecting that CP.
If PARSESCOPEO has a high rank in German (as seems to be the case),
PARSESCOPEO » LEC guarantees that wen can be kept in the initial position
of (47) whenever (47) represents a complete complement question or
forms part of a larger movement structure. PRONECON implies that one of
the two occurrences of wen disappears; if PARSESCOPEO » LEC, it is the
higher copy that must be retained. We thus end up with (49).
(49) wen dass du w̶e̶n̶ eingeladen hast

who that you who invited have
Suppose that the head and the specifier of a CP/a phase (but no other ele-
ments) are accessible in the next optimization cycle. (50) represents the AGR-
OP or vP of a matrix clause that will end up as a matrix question. If Σ is not
interpreted as an island, PARSESCOPEO implies that the upper copy of wen
is retained, and LEC implies that the lower copy of wen should disappear.
(50) wen denkst [Σ wen dass du t eingeladen hast]

who think 2sg who that you invite have
If Σ is interpreted as an island, ISLAND blocks the deletion of the lower copy

if ISLAND » LEC. In fact, we might capture the dialectal variation we find
in German by assuming that Σ = CP is always counted as a barrier and that
the ranking of ISLAND and LEC is not fixed in the same way in the various
dialects of German: the copy construction arises when ISLAND » LEC, while
we get long movement when LEC » ISLAND, as tables 1 and 2 show.
(51) a. wen denkst wen dass du eingeladen hast
b. wen denkst w̶e̶n̶ dass du eingeladen hast
c. w̶e̶n̶ denkst wen dass du eingeladen hast
d. w̶e̶n̶ denkst w̶e̶n̶ dass du eingeladen hast
Table 1 PARSESCOPEO ISLAND LEC
☞ (51-a) *
(51-b) *!
*
(51-c) *!
*
(51-d) *!
Table 2 PARSESCOPEO LEC ISLAND

(51-a) *!
*
☞ (51-b)
*
(51-c) *!
*
(51-d) *!
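The effect of reranking ISLAND and LEC can be checked mechanically with a small evaluator. The sketch below is an illustration only: it encodes a candidate as a pair of booleans (is the upper copy pronounced, is the lower copy pronounced), and the three violation functions are simplified readings of PARSESCOPEO, (46) and (48) that we assume for the example. It does not attempt to reproduce every cell of Tables 1 and 2.

```python
# Candidate = (upper_copy_pronounced, lower_copy_pronounced) for the chain in (50).
CANDIDATES = {
    "both copies overt (copy construction)": (True, True),
    "upper copy only (long movement)": (True, False),
    "lower copy only": (False, True),
    "no copy overt": (False, False),
}


def parse_scope(upper, lower):
    # Assumed reading: one violation if the copy in the scope position is silent.
    return 0 if upper else 1


def island(upper, lower):
    # Reading of (46): violated if a chain member on either side of the island is silent.
    return 0 if (upper and lower) else 1


def lec(upper, lower):
    # Reading of (48): an overt wh-word in Spec,C precedes the complementizer dass.
    return 1 if lower else 0


CONSTRAINTS = {"PARSESCOPEO": parse_scope, "ISLAND": island, "LEC": lec}


def optimal(ranking):
    """Return the candidates whose violation profile is lexicographically minimal."""
    def profile(name):
        return tuple(CONSTRAINTS[c](*CANDIDATES[name]) for c in ranking)

    best = min(profile(name) for name in CANDIDATES)
    return [name for name in CANDIDATES if profile(name) == best]


print(optimal(["PARSESCOPEO", "ISLAND", "LEC"]))  # ranking of Table 1
print(optimal(["PARSESCOPEO", "LEC", "ISLAND"]))  # ranking of Table 2
```

Because strict domination amounts to lexicographic comparison of violation profiles, swapping the last two constraints is enough to switch the winner from the copy construction to long movement under these assumptions.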
A cyclic application of the principles discussed so far, together with the assumption
that the specifiers and heads remain accessible for optimization
from outside, thus yields the CC under the ranking given in table 1. Why
is it that upper copies must be non-complex? We can derive this from the
principles PRONECON and PARSESCOPEO if we interpret them properly. In
showing how, we may confine our attention to the derivational step linking a
wh-phrase in Spec,C to its first landing site in the matrix clause. In (53), we
ignore one candidate structure to which we will return later.
(52) wen von den Studenten denkst wen von den Studenten dass ...
who of the students think who of the students that
(53) a. wen von den Studenten denkst wen von den Studenten dass ...
b. wen v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ denkst wen von den Studenten dass ...
c. wen von den Studenten denkst wen v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ dass ...
Presupposing the ranking ISLAND » LEC, we need to consider only those

candidates that retain phonetic material both in the lower Spec,C and in the
first matrix clause landing site position. Obviously, PRONECON is violated
more often in (53-a) than in (53-b,c) - it has three more words than the two
other candidates. If PRONECON is ranked below ISLAND (so that copying is
possible at all), the former principle will still block (53-a).
The decision between (53-b,c) seems to follow from PARSESCOPEO if we
interpret the principle as applying to semantic units, and if we make the fairly
standard assumption that the restrictor of a wh-quantifier should not appear in
the scope position of the operator, i.e., if (54-a) is preferred over (54-b) (see,
e.g., Chomsky 1993, Fox 1995).
(54) a. wh-x (Pred(x)) ...
b. wh-x, x a Pred x ...
The higher the restriction of a wh-operator is moved in a tree, the more violations
of PARSESCOPEO arise relative to it, so that (53-b) is favored over
(53-c). The core properties of the CC thus seem derived.
Note, however, that an account of (53-b) vs. (53-c) in terms of PARSESCOPEO
makes the incorrect prediction that a wh-phrase that can be split up
must be so. This is false, as (55) shows.
(55) a. (i) Wen von den Studenten kennst du?

who of the students know you
'Who of the students do you know?'
(ii) Wen kennst du von den Studenten?
b. (i) Was für Studenten kennst du?
what for students know you
'What kind of students do you know?'
(ii) Was kennst du für Studenten?
c. (i) Wieviel Studenten kennst du denn?
how many students know you then
'How many students do you know?'
(ii) Wieviel kennst du denn Studenten?
We therefore need a principle that penalizes structures that have been con-
tiguous at level L but cease to be so at level L'.
(56) Contiguity In Syntax (CIS)

The phonetic material corresponding to a constituent must be
spelled out in one position only.
CIS disfavors separation, whereas PARSESCOPEO requires it. When the two
constraints are tied, the constellation we find in (55) arises.26 The tie with
PARSESCOPEO implies a fairly high rank for CIS; in particular, it dominates
PRONECON. Therefore, we must understand CIS in such a way that it is satisfied
when there is at least one copy of a phrase that is pronounced in an unsplit
fashion - otherwise, the CC would be ruled out because it would always
imply a CIS violation. We must make sure, however, that the tie between
PARSESCOPEO and CIS does not imply that complex wh-phrases (which always
contain a restrictor that should be left in situ) do not have to move at all
(because the PARSESCOPEO violation by the operator part is always counterbalanced
by the PARSESCOPEO violation of the restrictor). This is effected
by the principle WH-IN-SPEC introduced above.
Consider, then, the consequence of CIS for the two crucial movement steps
- the one from the root position to Spec,C, and the subsequent step mapping
the wh-phrase into the matrix clause.
As table 3 shows, we correctly predict the distribution of grammaticality in
the first movement step of (57).
(57) wen von den Studenten du wen von den Studenten einlädst?
who of the students you who of the students invite
(58) a. wen von den Studenten du wen von den Studenten einlädst
b. wen von den Studenten du w̶e̶n̶ v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ einlädst
c. wen v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ du w̶e̶n̶ von den Studenten einlädst
d. w̶e̶n̶ v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ du wen von den Studenten einlädst
Table 3 PARSESCOPEO/CIS ISLAND WH-IN-SPEC PE LEC

* ****
(58-a) * (restrictor)
☞ (58-b) *(restrictor)
☞ (58-c) *(contiguity)
*
(58-d) * (operator)
If the derivation proceeds with (58-c), nothing new happens: A wh-phrase
consisting of a single word cannot violate CIS. If (58-b) is chosen, we proceed
as discussed in the context of (53-a-c) (repeated here as (59-a-c)), but it
is also obvious that we deliberately ignored a candidate above: (59-d). Nevertheless,
(59-b) is still optimal: CIS is respected by the lower copy, we do not
move the restrictor of the wh-operator further up, and the wh-operator moves
to a position c-commanding its scope.
(59) a. ... wen von den Studenten denkst wen von den Studenten dass ...
b. ... wen v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ denkst wen von den Studenten dass ...
c. ... wen von den Studenten denkst wen v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ dass ...
d. ... wen v̶o̶n̶ d̶e̶n̶ S̶t̶u̶d̶e̶n̶t̶e̶n̶ denkst w̶e̶n̶ von den Studenten dass ...
Table 4 PARSESCOPEO/CIS ISLAND WH-IN-SPEC PE LEC

(59-a) **** *
*!(restrictor)
☞ (59-b) * *
(59-c) *!(restrictor) * *
(59-d) * *
*!(contiguity)
Therefore, we have derived the fact that the copy construction allows complex
overt wh-phrases in the lowest Spec,C position only. As we have remarked
above, this restriction can be minimally violated in certain dialects in which
a PP may be copied; cf. (36), repeated here as (60).
(60) (?*)Mit was denkst du mit was er sie verletzt hat?

Given what we have seen so far, the optimal candidate should be one that
"strands" the preposition in the lower copy. For the constraint that makes (60)
possible by overriding PRONECON, a natural formulation comes to mind.
Note that the copying operation moves a PP category upwards, and one may
assume that phonetic material that does not contain a preposition in the head
position cannot be a phonetic realization of a PP:
(61) LeftEdgePP (LEP)

The leftmost element realized phonetically in a PP must be the
preposition from which that PP was projected.
If LEP is ranked above PRONECON, the prepositional head may be retained
in a CC.
Our final task in describing the CC is a discussion of the intervention effect.
As (62) shows (see also Beck 1996), an operator like sentential negation
cannot intervene between a wh-phrase and its restrictor - the problem can be
solved in various ways, by respecting contiguity (62-a) or by scrambling the
wh-phrase in front of the operator (62-c) before it is split up.
(62) a. Was für Bücher hat er nicht gelesen?

what for books has he not read
b. * Was hat er nicht für Bücher gelesen?
c. Was hat er für Bücher nicht gelesen?
According to Pesetsky (1998b), the relevant anti-intervention constraint implies
that a wh-operator must not be separated from its restrictor by a further
operator. This applies in a straightforward way to those copy constructions
in which a restrictor is indeed stranded; the same effect shows up in
(63), i.e., when the wh-phrase consists of a single word only. (63) fits into
Pesetsky's proposal if we assume that, semantically, there is a restrictor part
present for wen as well, and that this restrictor part is left behind with the
lowest visible copy of the phrase. Alternatively, the intervention constraint
might be reformulated so that it requires that links in a chain in which both
elements contain either phonetic or semantic material must not be separated
by an operator.
(63) a. *Wen glaubt keiner wen sie liebt?

who believes nobody who she loves
'Who does nobody believe that she loves?'
b. *Wen glaubt jeder wen sie liebt?
who believes everybody who she loves
'Who does everybody believe that she loves?'
The only potential problem that arises, then, is related to ineffability. A sufficiently
high rank of the intervention constraint will be able to block the copy
construction in the situations where this is called for, so that the winning competitor
is a long movement construction. The same consequence arises for
constellations in which the wh-phrase must not be split up (which-phrases, or
PPs, in certain dialects).
This is an acceptable result for those dialects in which the copy construc-
tion co-exists with long movement, but it does not capture the ineffability
effect that can arise for long distance dependencies when a dialect forbids
long movement and an intervention effect rules out the copy construction at
the same time. A standard solution (see Legendre et al. 1998) would be to
rank the intervention constraint higher than faithfulness constraints concern-
ing scope assignments for the intervening operators.
3.4 Some Related Issues
As we have mentioned above, certain varieties of German (e.g., Lower Rhine
area, Bavarian Swabia) allow questions to be constructed in the form given
in (64).
(64) Welchen Mann denkst du wen er kennt?

which man think you who he knows
'Which man do you think he knows?'
We have little to say about this construction, except for the observation that
it is not likely to be a subcase of a CC. While the lower occurrence of a wh-element
is a non-complex one, it does not copy the wh-operator of the upper
wh-phrase. It is rather the minimal spellout of the wh-features that should be
present in the lower Spec,C position, as (64) and (65) show: 27
(65) a. Wieviel Bier denkst du was er trinkt?

how-much beer think you what he drinks
'How much beer do you think that he drinks?'
b. *Wieviel Bier denkst du wieviel er trinkt?
One simple analysis would analyze wen and was as agreeing forms of the
complementizer. This would be consistent with the observation that (31-b) is
judged worse than (31-a) (repeated here as (66)), if we assume that womit
makes a bad [+wh]-complementizer.
(66) a. !Welchem Mann glaubst du wem sie das Buch gegeben hat?
which man believe you who she the book given has
'Which man do you think that she gave the book to?'
b. !Mit welchem Werkzeug glaubst du womit Ede das Auto
with which tool think you what-with Ede the car
repariert hat?
repaired has
'With which tool do you think that Ede repaired the car?'
Alternatively, the contrast in (66) might be caused by the fact that German
dialects tend not to block standard long movement when PPs are affected,
so that (66-b) might be blocked by a candidate involving long movement.
We might also consider wen, was, and womit as spellout forms for φ-features
of a wh-phrase that has lost its original phonetic content. In a dialect
that ranks ISLAND over LEC, the insertion of "expletive" phonetic material
spelling out wh-φ-features is an alternative means of avoiding an ISLAND
violation. Suppose that FI blocks the use of expletive material, and suppose
that PRONECON works in such a way that it blocks the repetition of phonetic
material only. Then if FI » PRONECON, we still get the CC, but if the ranking
is reversed, the situation in (64)-(66) arises. We do not wish to commit
ourselves to this analysis, though.
German dialects allow at least two further examples of "uneconomical pro-
nunciation". When the highest verbal element of a clause is topicalized as in
(67), the problem arises that the second position should be filled by exactly
this element, too, given that the V/2 constraint is not violated in German.
The problem may be solved by expletive insertion (67-b) or by uneconomical
pronunciation (as in (67-a)). Retention of two copies is confined to modals,
though.
(67) a. Können kann ich nicht

can can I not
'I am not ABLE to.'
b. Können tue ich nicht
can do I not
c. Schlafen tue ich nicht
sleep do I not
'Sleeping is not what I do.'
d. *Schlafen schlafe ich nicht
sleep sleep I not
In a simple split construction (68) (see also next section), the phonetic ma-
terial belonging to a single constituent is distributed over two places in the
sentence without any repetitions, but there are two exceptions to this property
of split XPs. In those dialects in which a PP can enter the split construc-
tion, the preposition must appear in both positions in which the PP is spelled
out partially (68-b). 28 Given the high rank of LEP, this is not unexpected.
Likewise, van Riemsdijk (1989) observes that the indefinite article may (and
sometimes must) be repeated in split noun phrases - a fact we can relate to
the observation that singular count noun phrases may never be realized pho-
netically without an initial determiner.
(68) a. Teure Bücher habe ich viele

expensive books have I many
'I have many expensive books.'
b. In Schlössern habe ich noch in keinen gewohnt

in castles have I yet in no lived
'I have not yet lived in castles.'
c. *In keinen habe ich noch in Schlössern gewohnt
d. Einen amerikanischen Wagen kann ich mir nur einen grünen
an American car can I me only a green
leisten
afford
'I can only afford a green American car.'
Thus, the split construction supports the idea that the economization of pro-
nunciation is in general quite subject to other constraints.
4 A Few Remarks on the Split Construction and So-Called Head Movement
A detailed analysis of the split construction is beyond the scope of the present
paper, and would mostly repeat what is said in Cavar & Fanselow (1997,
2000). The following remarks are meant to prepare the discussion of a further
advantage of a pronunciation economy account: It gives a straightforward
analysis of so-called head movement.
That constructions apparently involving rightward movement might (at
least in some contexts) have to be reanalyzed as resulting from the stranding
of phonetic material in a leftward movement operation has been proposed,
e.g., by Kayne (1994) and by Wilder (1995). It also seems obvious that the
"stranding" of β (say, a relative clause) in the process of moving Σ (say, a
DP) in (69) could be the result of an incomplete deletion 29 in the source position
of movement, followed by an erasure of β in the target position, due to
PRONECON.
(69) ... [Σ α β] ... ⇒
... [Σ α β] ... [Σ α β] ... ⇒
... [Σ α β] ... [Σ α̶ β] ... ⇒
... [Σ α β̶] ... [Σ α̶ β] ...
In OT terms, the candidate set for the pronunciation of a movement chain (70)
is simply enlarged by allowing (free) partial deletion in the copies created by
movement.
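(70) ... X Y Z ... X Y Z ...
(71) (Angle brackets enclose material that is not pronounced.)
a. X Y Z ... X Y Z
b. X Y Z ... ⟨X Y Z⟩
c. ⟨X Y Z⟩ ... X Y Z
d. X ⟨Y Z⟩ ... ⟨X⟩ Y Z
e. X Y ⟨Z⟩ ... ⟨X Y⟩ Z
f. X ⟨Y Z⟩ ... ⟨X Y Z⟩
g. X Y Z ... ⟨X Y⟩ Z
h. ⟨X Y⟩ Z ... X Y ⟨Z⟩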
Candidate (71-a) violates PRONECON three times, while (71-b) and (71-c)
satisfy this constraint and represent full overt and full covert movement,
respectively. Candidates (71-d,e) represent split constituents (the
pronunciation of the copies created by movement is distributed over two
places); they imply a CIS violation that is justified only when higher
constraints like PARSESCOPEO are thereby fulfilled. (71-f) implies a
(presumably fatal) RECOV violation, whereas the PRONECON violation in (71-g)
is acceptable only if a constraint like LEP forces it. Additional constraints
may come into play: The PARMOVE constraint of Müller (1998a) will imply that
the c-command relations between the phonetic occurrences of X, Y, and Z
should not change when the phrase is split up by free deletion; this is
respected in (71-c,d), but not in (71-h).
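The way such a candidate set is evaluated can be illustrated with a small sketch. It is not part of the analysis itself and rests on simplifying assumptions: a chain has exactly two copies of X Y Z; PRONECON is violated once for every item pronounced in both copies; RECOV is violated once for every item pronounced in neither copy; CIS is violated once whenever pronounced material is distributed over both copies; and the ranking RECOV >> PRONECON >> CIS is chosen for illustration only. The function names are hypothetical.

```python
from itertools import combinations

ITEMS = ("X", "Y", "Z")


def subsets(items):
    """All subsets of the items: the material a copy may pronounce."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def profile(upper, lower):
    """Violation profile (RECOV, PRONECON, CIS) under the assumptions above."""
    recov = sum(1 for i in ITEMS if i not in upper and i not in lower)
    pronecon = sum(1 for i in ITEMS if i in upper and i in lower)
    cis = 1 if upper and lower else 0
    return (recov, pronecon, cis)


# Free partial deletion: every pairing of pronounced subsets is a candidate.
CANDIDATES = [(u, l) for u in subsets(ITEMS) for l in subsets(ITEMS)]


def optimal(candidates):
    """Strict ranking corresponds to lexicographic comparison of profiles."""
    best = min(profile(u, l) for u, l in candidates)
    return [(u, l) for u, l in candidates if profile(u, l) == best]


if __name__ == "__main__":
    for upper, lower in optimal(CANDIDATES):
        print(sorted(upper), sorted(lower), profile(upper, lower))
```

With only these three constraints, the two winners are the candidates that pronounce the whole phrase in exactly one copy, that is, full overt and full covert movement. Split candidates can win only once a further constraint (such as PARSESCOPEO in the text) is added above or tied with CIS.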
Cavar & Fanselow (1997, 2000) extend the approach that we have pre-
supposed above for left branch extractions to the split construction that one
finds in German (68-a), Russian, Polish, Croatian and many other languages,
and that has so far resisted a satisfactory treatment. For constructions such
as (68-a), the obvious alternative to analyzing the split construction in terms
of partial deletion is simple extraction. But notice that split constituents do
not respect standard islands for movement. For example, PPs are islands for
standard movement in Croatian and Polish, yet PPs may be split up freely,
as the examples in (72-a,b) illustrate. Likewise, German PPs are extraction
islands, but split constructions arise, as (68-b) and (72-c) show. See Fanselow
(1988, 1993), Kuhn (1998), van Geenhoven (1996), and in particular Cavar
& Fanselow (2000) for a detailed discussion of the shortcomings of simple
movement accounts.
(72) a. Na kakav je Ivan krov skočio? (Croatian)

on what-kind has Ivan roof jumped?
'On what kind of roof has Ivan jumped?'
b. Na jaki Marek dach skoczył? (Polish)

on what-kind Marek roof jumped
c. Mit was hast Du für Problemen nicht gerechnet?
with what have you for problems not reckoned
'What kind of problems did you fail to take into account?'
The analysis offered in Cavar & Fanselow (2000) can be recast in OT terms in
the following way: The split construction arises because CIS is ranked below
(or rather tied with) PARSESCOPEO as applied to specific pragmatic-semantic
features. Thus, for Croatian, Polish, or German it can be shown that a DP or
PP is split up only if its phonetic material is linked to at least two different
pragmatic (focus) or semantic (wh-)features. For instance, suppose that krov bears
a focus feature and na kakav a wh-feature in (72-a), and suppose that focus
features have to be checked in a focus position in Croatian. If PARSESCOPEO
is ranked higher or at least as high as CIS, the PP na kakav krov need not or
cannot be realized in a single structural position - it is split up. 30
Up to now, we have only considered cases in which semantic constraints
(but see note 30) require that a phrase be linearized discontinuously after
movement. It is natural to suspect that conditions relating to the phonological
interface may have the same effect. Consider (73) in this respect. It is a noto-
rious fact that stressed particles have to be stranded (73-b,c) in German (and
Dutch) when a verb-second clause is being formed, while unstressed particles
go along with the verb (73-e,f).
(73) a. dass er das Buch zu lesen an-fängt

that he the book to read begins
b. Er fängt das Buch zu lesen an
c. *Er anfängt das Buch zu lesen
d. dass er das Buch zu lesen ver-sucht
that he the book to read tries
e. *Er sucht das Buch zu lesen ver
f. Er versucht das Buch zu lesen
Depending on one's assumptions concerning the internal structure of
[an+fangen], (73-b) violates what was called the "Lexical Integrity Principle" in
earlier versions of the generative approach, which rules out an operation like
movement that affects a part of a lexical item only. If we want to avoid such a
violation, we could assume that anfangen is structured as [V an [V fangen]],
and that head movement affects only the smallest item to which it is
applicable. But particles must be stranded in constructions involving movement
to Comp only: In certain varieties of German (but not in Dutch, see Roberts
1991) the particle must not be stranded in contexts of non-finite verb incor-
poration (see (74)-(75)).
(74) dass er es zu lesen hätte anfangen müssen

that he it to read had begin must
'that he should have begun to read it.'
(75) *dass er es zu lesen an hätte fangen müssen
In the system proposed here, (73)-(75) constitute no particular problem. The
principle OPW introduced in (76) blocks the appearance of more than one
prosodic word in the position to which the finite verb moves. If OPW >> CIS,
a particle that constitutes a second prosodic word must not be realized
in the landing site of verb movement, but a stressless particle, on the other
hand, will appear in second position together with the verb proper, because
the CIS violation which the stranding of the particle incurs is not justified by
the avoidance of a clash with OPW in this case.
(76) ONEPROSODICWORD (OPW)
The second position of the clause may host a single prosodic word
only.
In other words, a full lexical category is copied to second position in (73-b)

and (73-e), but whether its complete phonetic material may show up there de-
pends on conditions of the syntax-phonology interface. Note that OPW is also
justifiable on the basis of the observation that pronouns that are adjoined to
that second position must in fact be phonological clitics, because otherwise,
(76) would be violated (see Cavar 1999 for a related view).
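The interaction of OPW and CIS in (73) can be made concrete in a short sketch. It is only an illustration under stated assumptions: the verb stem always counts as one prosodic word, a stressed particle counts as a second prosodic word, an unstressed particle counts as none, stranding the particle incurs exactly one CIS violation, and OPW outranks CIS. The function names and candidate labels are hypothetical.

```python
def violations(particle_fronted, particle_stressed):
    """Return the violation profile (OPW, CIS) for one candidate."""
    # The stem is one prosodic word; a stressed particle realized in
    # second position adds another one.
    prosodic_words = 1 + (1 if particle_fronted and particle_stressed else 0)
    opw = 1 if prosodic_words > 1 else 0
    # Stranding the particle splits the pronunciation of the verb.
    cis = 0 if particle_fronted else 1
    return (opw, cis)


def winner(particle_stressed):
    """Pick the optimal candidate under the assumed ranking OPW >> CIS."""
    candidates = {
        "particle fronted with verb": True,
        "particle stranded": False,
    }
    return min(candidates,
               key=lambda name: violations(candidates[name], particle_stressed))


if __name__ == "__main__":
    print("an-fangen (stressed particle):", winner(True))
    print("ver-suchen (unstressed particle):", winner(False))
```

For a stressed particle, the stranding candidate wins because it trades an OPW violation for a lower-ranked CIS violation; for an unstressed particle, fronting the whole verb violates neither constraint, so stranding is never justified.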
Recourse to partial deletion and OPW not only allows us to respect the
Lexical Integrity Hypothesis, it also helps to avoid excorporation analyses
(Roberts 1991) for cases like (77): There is good reason to believe that a
verbal complex anrufen hätte können is formed in the derivation of (77), in
which anrufen first incorporates into können, with the complex so formed
moving up further to the finite auxiliary (this may explain, e.g., the need to
replace the past participle gekonnt with the infinitive können, see Schmid
1998 for an analysis). In standard analyses, hätte then has to be excorporated
out of this verbal complex when it moves to second position, but we can avoid
this conclusion by assuming that the whole verbal complex anrufen können
hätte is moved to second position, in which, however, due to OPW, only the
highest prosodic word may be spelled out.
(77) Er hätte Maria anrufen können

he had Mary call up could
'He could have called Mary up.'
In fact, Roberts (1997) comes quite close to a similar line of reasoning in
his discussion of Romance restructuring constructions, for which he assumes
that chains created by overt movement are spelled out in the higher position,
unless this violates a constraint *[X° W W], which bans two "syntactic" words
dominated by a single X° category.
For obvious reasons, a position characterized by (76) may even be targeted
by full phrasal copying without any change in the phonetic effect: Whatever
is copied to second position can find a single word spellout there only; the
remaining part must be realized in situ. We may therefore hypothesize that
movement is always phrasal copying: The overt appearance of the movement
of a head in isolation arises when additional constraints imply that little more
(or less) than the head of the copied phrase may appear phonetically in the
landing site.
For many instances of apparent head movement, OPW is too strong (recall
that verb inversion need not strand a particle in German), so it can and must
be complemented by a principle like (78), with proper choice of S.
(78) NOPHRASALSPEC (NPS)

The phonetic realization of specifier S must not contain material that
forms a phrase.
Under this perspective, there is an analysis for apparent verb movement to
Infl in French (79-a) in which the whole VP has been moved to the innermost
specifier position of Infl (79-b,c). The upper copy of the VP must then be
reduced phonetically to its head if NPS is to be respected by the innermost
specifier of Infl. Likewise, the head must be eliminated phonetically in the
lower copy because of PRONECON.
(79) a. Il embrasse souvent Marie

he kisses often Mary
b. Il ... souvent embrasse Marie
c. Il embrasse Marie souvent embrasse Marie
d. Il embrasse ⟨Marie⟩ souvent ⟨embrasse⟩ Marie
(Angle brackets enclose material that is not pronounced.)
It thus seems clear that such a reanalysis of head movement is a possible

extension of our approach, so that the question arises of whether it is also a
step that is called for. Several proposals (Kayne 1998, Koopman & Szabolcsi
1999, Mahajan 1999) have indeed been made that imply that at least certain
instances of head movement should be reinterpreted as phrasal movement.
Some of the obvious advantages of the resulting systems are:
— Phrasal movement can always be carried out in such a way that it ex-
tends the phrase marker (= it targets the root node) that is being con-
structed. The movement of a head to another head position does not
fulfill this extension requirement on obvious grounds: When head F
moves out of ZP to H in [H ZP], it does not land at the root node
dominating H and ZP. One unwelcome consequence of this is that
a head H moved to G does not c-command its root position, quite
in contrast to what one would normally assume to hold for move-
ment. Head movement violates most generalizations one would like
to defend in the context of a movement theory.
— In a feature driven theory of movement, it is not obvious what fac-
tor determines whether a given feature must be eliminated by head
movement or by phrasal movement.
— In the configurational definition of phrasal levels (Speas 1990,
Chomsky 1995), an element E is a maximal projection if its mother
node is not a projection of E. If a head H adjoins to head G, neither
G nor G' are projected from it. Therefore, the head H should be-
come a maximal projection by movement. Even if this consequence
were tenable, it would induce a violation of the chain uniformity
condition (Chomsky & Lasnik 1993), because the trace of the head
is not maximal.
We therefore seem to possess excellent evidence for the elimination of head

movement, but the approaches that are discussed in this context face at least
one potential problem. Typically, the visible displacement of a single word
W is reanalyzed as the remnant movement of a WP out of which all material
but W has been extracted. The derivation of (79-a) would therefore proceed
as in (80) (see, e.g., Mahajan 1999):
(80) Il souvent [VP embrasse Marie] ⇒
Il souvent Marie [VP embrasse Marie] ⇒
Il souvent Marie [VP embrasse t] ⇒
Il [VP embrasse t] souvent Marie [VP embrasse t] ⇒
Il [VP embrasse t] souvent Marie t
The movement operations that "evacuate" the VP or any other XP before

"head" movement are typically unmotivated in terms of feature checking or
semantic considerations, which may (but need not) create a problem in min-
imalist accounts, but will not necessarily in OT, since the relevant STAY*
violations may be called for by the need to respect a higher ranked NPS or
OPW. But we have known since Wexler & Culicover (1980) that movement
has a freezing effect on a phrase P in the sense that P becomes an island after
movement, as demonstrated impressively in Müller (1998b). "Head" move-
ment constellations do not induce such island effects, however:
(81) Über wen liest sie eine Geschichte t
about who reads she a story
'Who is she reading a story about?'
In (81), the verb has been preposed. If this implies phrasal fronting in the
sense that eine Geschichte über wen has been extracted out of VP before the
rest of VP (= liest) was preposed, then eine Geschichte über wen should have
become an island for movement, which it has not: über wen can still be moved
out of this phrase.
Our account avoids this problem because the splitting up of the VP into a
preposed verb and a stranded rest is effected by pronunciation laws (and not
by remnant movement) - there is no particular reason why the object should
thereby become an island. A comparison of attempts to guarantee that certain
movements have no freezing effects (Müller 1999) with our approach may
thus help to identify the proper way of eliminating head movement.
Notes
Parts of this paper were presented at the Workshop on Conflicting Rules in Phonol-
ogy and Syntax at the University of Potsdam (Dec. 15-16, 1999), and at the Linguis-
tics Colloquium at the University of Leipzig. We would like to thank the audiences
for useful comments and criticism. Particular thanks for helpful hints and for
support in various respects related to this article go to Joanna Błaszczak, Caroline Féry,
Susann Fischer, Gereon Müller, and Douglas Saddy. We would also like to thank
Artemis Alexiadou, Hans-Martin Gärtner, Anoop Mahajan, Matthias Schlesewsky,
Peter Staudacher, and Chris Wilder. Research for this paper was supported by the
grant INK 12/B1 "Innovationskolleg Formale Modelle kognitiver Komplexität" fi-
nanced by the Federal Ministry of Education and Research and administered by the
German Research Foundation. The paper was written while Damir Cavar was em-
ployed by the University of Potsdam.
1. However, we do not share Pesetsky's view that the optimality theoretic aspect of
syntax is confined to such principles of sentence pronunciation.
2. In an OT-framework, one expects that the link between the (uninterpretable) fea-
tures of an attracting head and the creation of copies can be violated in two ways:
There should be movement that does not check such features, and there should be
uninterpretable features that are not checked by movement/copy formation. The
former assumption helps at least in analyzing non-terminal movement steps in
cyclic extractions; see also Chomsky (1998) for an approach in which the strict
connection between the triggering of movement and feature checking is given
up.
3. We confine our attention here to those candidates that are well formed with re-
spect to conditions such as subjacency, those in which no superfluous movement
has taken place, those in which every necessary movement step has been carried
out, etc., i.e., we confine our attention to the effects of deletion on chains that are
fully grammatical in every other respect.
4. Given that PRONECON never seems to outrank RECOV, it is more adequate to
collapse the two principles into a single one that just bans the realization of pho-
netic material that is predictable from the syntactic environment. This principle
would also be more in the spirit of Pesetsky (1998a).
5. That there will be phonetic material in one copy only follows from (7) if we
make a further assumption: The phonetic material must not be scattered over the
various copies in the chain. See below for an elaboration of this point.
6. PRONECON is an obvious extension of Pesetsky's (1998a: 344) TEL-principle,
which penalizes the use of a phonetic matrix for function words.
7. See Pesetsky (1998a: 342) for a slightly different version of this principle.
8. In a considerable number of languages (see Fanselow 2000 for an overview),
a further option exists which is also discussed under the heading "partial wh-
movement":
(i) Was denkst du wie die Rosen riechen?

what think you how the roses smell
Wie seems to have matrix scope in (i), yet it is moved to the specifier position
of the complement clause only. The apparent scope position of the wh-phrase is
filled by a different wh-element (was 'what'). Malay and Bahasa Indonesia lack
this element in the scope position, at least overtly. There are various proposals for
the proper analysis of (i) (see, e.g., the contributions to Lutz, Müller & v. Stechow
2000). If Fanselow & Mahajan (1996, 2000) are correct, both was and wie are in
fact in their scope positions, i.e., the partial nature of wh-movement in (i) is only
apparent.
9. Alternatively, the lack of island effects for argumental wh-phrases in situ can also
be explained by assuming that they are subject to featural attraction in the sense
of Chomsky (1995) and Pesetsky (1998b), if featural attraction (or "Agreement",
see Chomsky 1998) is not constrained by subjacency, as Pesetsky (1998b) argues.
In such an account, there are two types of "covert" movement: standard phrasal
copying followed by the deletion of the phonetic material in the landing site
(= the topic of the present paper), and featural attraction.
10. An EPP-feature in the sense of Chomsky (1998), or a D-feature, as argued by
Fanselow & Mahajan (2000).
11. More precisely, it is island effects related to subjacency that one does not expect.
Intervention effects as discussed by Beck (1996) or Pesetsky (1998b) are not
excluded in this way. Furthermore, the binding of wh-phrases in situ may be subject
to binding conditions, as Ouhalla (1996) points out. That the binding option is
restricted to wh-phrases in situ is a consequence of the fact that wh-phrases can
appear in the specifier position of a CP only if they have been attracted to that
position. In other words, if the Comp-position to which the wh-phrase is
semantically linked has an attracting feature, this feature must be checked by copying,
so that island effects automatically become relevant.
12. The feature attracting wh-phrases to Spec,CP can be optionally absent in
languages with "overt wh-movement" (as seems to be the case for French matrix
Comps) and in languages without it (Chinese).
German is a language with "overt" wh-movement in which the wh-attracting
feature of Comp cannot be absent (if we ignore echo questions). Hindi, on the other
hand, seems to be a wh-in-situ language in which wh-in-situ phrases must not
appear in islands (see Mahajan 1990). In the system presupposed above, this array
of facts suggests that the attracting feature of Comp is again always present
(wh-arguments cannot be simply bound by an operator, just as in German). Therefore,
there seem to be languages with (German) and without (Hindi) overtly visible
wh-movement which require the wh-attracting feature of Comps to be present
obligatorily. The only option that appears to be unrealized is a language in which
the attracting feature of Comp is obligatorily absent (in such a language, adjunct
questions could not arise at all).
13. Movement that is string-vacuous in the strictest possible sense is thus not penal-
ized by STAY* if the principle relates to the realization of phonetic matrices. This
may be a welcome result for various kinds of "evacuation" operations necessary
in the context of remnant movement that are not feature driven. See, e.g., Müller
(1999) for a discussion.
14. Violations of STAY* must be assumed not to be cumulative, because otherwise
wh-phrases would move to the lowest Spec,C position only.
15. That island effects can be captured straightforwardly in our system has been
shown above - the pertinent argument Cole & Hermon bring forward in this
respect applies to a particular formulation of what they call "Multiple Spellout"
theories only, but not to the system we develop here.
16. More precisely, the GEN component produces some candidates in which all
copies in a chain retain their phonetic matrices, and others in which all but one
have lost theirs (and still many further candidates), and only the latter have a
chance of winning the competition because they violate PRONECON as little as
is possible in the light of RECOV. We will continue to use (slightly misleading)

derivational formulations of optimization for pronunciation, however, because
they are simpler.
17. More precisely, we assume that "deletion" means that the phonetic information of
(part of) a copy is marked as unpronounced. In principle, such a phonetic matrix
could be promoted to the status of being pronounced in later derivational steps,
but such syntactic objects are not likely to be optimal candidates. To see why,
consider the following derivation. Suppose that B = [A ... A ...] has been formed,
with B being cyclic, so that the optimal pronunciation must be selected. Suppose
B* = [⟨A1⟩ ... A2 ...] is optimal (angle brackets enclose unpronounced material).
Then B* may be subjected to further operations,
but note that only the specifier (and perhaps the head) of B* is accessible to
such further copying or deletion steps (Chomsky 1998); that is, in particular,
the phonetic matrix of A2 can no longer be affected - it cannot be marked as
unpronounced in further derivational steps. Thus, when the object
G = [A0 ... [B* ⟨A1⟩ ... A2] ...] is subjected to EVAL, A0 must lose its phonetic matrix as long
as chains are realized in one position only (= the standard case), because the
decision that A2 is pronounced can no longer be retracted. In other words, A0 has
a chance for phonetic realization only if other principles than RECOV crucially
override PRONECON. See below for such a case.
18. Wilder (1995) shows that deletion operations operating from left (retained) to
right (deleted) are subject to intervention effects from certain heads.
Meng-distribution fits this proposal, because the phonetic presence of meng excludes
the transformation of ... XP ... meng ... XP ... to ... XP ... meng ... ⟨XP⟩ ..., so that
we are left with ... ⟨XP⟩ ... meng ... XP ... as the only option (angle brackets
enclose deleted material).
19. At least this holds for questions asking for lists.
20. "!" indicates that the structure is acceptable in some versions of German only.
21. Long movement is, e.g., certainly acceptable in all dialects spoken to the south
of the Main river (the so-called white-sausage equator), but also in Eastfalian and
in some dialects in Schleswig-Holstein.
22. According to Reis (1995), however, this construction is parenthetical in nature
and therefore does not involve movement at all.
23. It is hard to find speakers from the Ruhr area (Dortmund, Bochum) who accept
and use these constructions, but they seem fine to at least some speakers from the
Lower Rhine area (around Wesel, Kleve), Eastern Westfalia (around Bielefeld),
and Bavarian Swabia (Kempten). Thanks to Susanne Anschütz, Daniel Büring,
Sascha Felix, Peter Gebert, Barbara Lenz, Sandra Muckel, Barbara Stiebels and
in particular to Susann Fischer for their help in this respect.
24. Recall that in German verb-second complement clauses like (i), the embedded
Comp position is filled by the verb attracted to that position.
(i) Ich denke er hat sie geküsst.

I think he has her kissed
'I think he has kissed her.'
25. Pesetsky (1998a) later revises this first version of LEC. The reformulation is not
relevant for the type of data we discuss in the main body of the article, though.
26. Given the cyclic nature of optimization, there seems little hope for an attempt to
guarantee that the tie between CIS and PARSESCOPEO is not resolved differently
in different movement steps. When a phrase is split, it will not be put together in
later derivational steps for obvious reasons, but, on the other hand, one expects
a phrase to be able to split at later derivational steps as well. (ii), modeled
after corresponding Dutch data in Barbiers (1999), suggests that at least for some
versions of German and some wh-phrases, the expectation is borne out. In
addition, (ii) corroborates the view that long movement passes through an AGR-O
position, as Barbiers observes.
(i) Was für Frauen hast denn du gedacht dass er einladen will?
what for women have ptc you thought that he invite wants
(ii) Was hast denn du für Frauen gedacht dass er einladen will?
(iii) ?Was hast denn du gedacht dass er für Frauen einladen will?
27. We are obliged to Susann Fischer for helping us get informants' judgments here.
28. We owe this observation to Josef Bayer, p. c.
29. This has been proposed recently by Hinterhölzl (1999).
30. Caroline Féry (p. c.) suggests the following alternative explanation for split XPs
that avoids the assumption of specific focus positions: XPs are split because
of their suboptimal phonological properties. Notice that two prominent accents
should not be adjacent in a string. If a noun phrase has two independent foci, it
must realize two prominent accents. The splitting of the phrase avoids a situation
in which these two accents would be too close to each other.
References

Ackema, Peter — Ad Neeleman
1998 WHOT? In: P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis and D.
Pesetsky (eds.) Is the Best Good Enough?, 15-33. Cambridge, MA: MIT
Press.
Anyadi, Stefanie — Armine Tamrazian
1993 Wh-movement in Armenian and Ruhr German. UCL Working Papers in
Linguistics 5: 1-22.
Aoun, Joseph — Yen-Hui A. Li
1993 Wh-elements in situ: Syntax or LF? Linguistic Inquiry 24: 199-238.
Barbiers, Sjef
1999 Remnant stranding and the theory of movement. Paper presented at the
Workshop on Remnant Movement, Feature Movement and Their Impli-
cations for the T-Model. Potsdam, July 1999.
Basilico, David
1998 Wh-movement in Iraqi Arabic and Slave. The Linguistic Review 15(4):
301-339.
Beck, Sigrid
1996 Quantified structures as barriers for LF movement. Natural Language
Semantics 4: 1-56.
Bobaljik, Jonathan
1995 Morphosyntax. The syntax of verbal inflection. Ph.D. dissertation, MIT.
Cavar, Damir
1999 Aspects of the syntax-phonology interface. Ph.D. dissertation, University
of Potsdam.
Cavar, Damir — Gisbert Fanselow
1997 Split constituents in Germanic and Slavic. Paper presented at the Interna-
tional Conference on Pied Piping, Jena.
Cavar, Damir — Gisbert Fanselow
2000 Discontinuous constituents in Slavic and Germanic languages. Ms., Uni-
versity of Potsdam.
Chomsky, Noam
1986 Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam
1993 A minimalist program for linguistic theory. In: K. Hale and S.J. Keyser
(eds.) The View from Building 20, 1-52. Cambridge, MA: MIT Press.
Chomsky, Noam
1995 The minimalist program. Cambridge, MA: MIT Press.
Chomsky, Noam
1998 Minimalist inquiries: The framework. Ms., MIT.
Chomsky, Noam — Howard Lasnik
1993 The theory of principles and parameters. In: J. Jacobs, A. v. Stechow,
W. Sternefeld and Th. Vennemann (eds.) Syntax: An International
Handbook of Contemporary Research, 506-569. Berlin: de Gruyter.
Cole, Peter — Gabriella Hermon
1998 The typology of wh-movement: Wh-questions in Malay. Syntax 1:
221-258.
du Plessis, Hans
1977 Wh movement in Afrikaans. Linguistic Inquiry 8: 723-726.
Fanselow, Gisbert
1988 Aufspaltung von NPn und das Problem der 'freien' Wortstellung. Lin-
guistische Berichte 114: 91-113.
Fanselow, Gisbert
1993 Die Rückkehr der Basisgenerierer. Groninger Arbeiten zur
Germanistischen Linguistik 36: 1-74.
Fanselow, Gisbert
2000 Partial movement. SynCom Project. Utrecht Institute of Linguistics.
Fanselow, Gisbert — Anoop Mahajan
1996 Partial movement and successive cyclicity. In: U. Lutz and G. Müller
(eds.) Papers on Wh-Scope Marking, 131-161. (Arbeitspapier des Son-
derforschungsbereichs 340, No. 76.) Stuttgart & Tübingen.
Fanselow, Gisbert — Anoop Mahajan
2000 Towards a minimalist theory of wh-expletives, wh-copying, and
successive cyclicity. In: U. Lutz, G. Müller and A. v. Stechow (eds.) Wh-Scope
Marking. Amsterdam: Benjamins.
Fanselow, Gisbert — Reinhold Kliegl — Matthias Schlesewsky
2000 'Long' movement in Northern German: A training study. Ms., University
of Potsdam.
Fox, Danny
1995 Condition C effects in ACD. MIT Working Papers in Linguistics 27: 105-
120.
van Geenhoven, Veerle
1996 Semantic incorporation and indefinite descriptions. Ph.D. dissertation,
University of Tübingen.
Grewendorf, Günther
1999 Multiple wh-fronting. Ms., University of Frankfurt.
Grimshaw, Jane
Groat, Erich — John O'Neil
1996 Spellout at the LF-interface. In: W. Abraham, S. D. Epstein, H. Thráinsson
and J.-W. Zwart (eds.) Minimal Ideas: Syntactic Studies in the Minimalist
Framework, 113-139. Amsterdam: Benjamins.
Heck, Fabian
1997 Komplementierer und ihre Spezifikatoren. Ms., University of Tübingen.
Heck, Fabian — Gereon Müller
1999 Repair is local. Paper presented at the Workshop on Conflicting Rules in
Phonology and Syntax. Potsdam, December 1999.
Hiemstra, Inge
1986 Some aspects of wh-questions in Frisian. NOWELE 8: 97-110.
Hinterhölzl, Roland
1999 Licensing movement and stranding in the West Germanic OV languages.
Ms, University of Potsdam.
Höhle, Tilman
1996 German w...w-constructions. In: U. Lutz and G. Müller (eds.) Papers on
Wh-Scope Marking, 37-58. (Arbeitspapier des Sonderforschungsbereichs
340, No. 76.) Stuttgart & Tübingen.
Huang, C.-T. James
1981 Move wh in a language without wh-movement. The Linguistic Review 1:
369-416.
Kayne, Richard
1994 The antisymmetry of syntax. Cambridge, MA: MIT Press.
Kayne, Richard
1998 Overt vs. covert movement. Syntax 1: 128-191.
Koopman, Hilda — Anna Szabolcsi
1999 Verbal complexes. Ms., UCLA.
Kuhn, Jonas
1998 Resource sensitivity in the syntax-semantics interface and the German
split NP construction. Ms., Universität Stuttgart.
Kvam, Sigmund
1983 Linksverschachtelung im Deutschen und Norwegischen. Tübingen: Nie-
meyer.
Legendre, Géraldine
in press An introduction to optimality theoretic syntax. In: G. Legendre, J.
Grimshaw, and S. Vikner (eds.) Optimality Theoretic Syntax. Cambridge,
MA: MIT Press.
Legendre, Géraldine — Paul Smolensky — Colin Wilson
1998 When is less more? Faithfulness and minimal links in wh-chains. In:
P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (eds.)
Is the Best Good Enough?, 249-289. Cambridge, MA: MIT Press.
Lutz, Uli — Gereon Müller — Arnim von Stechow (eds.)
2000 Wh-Scope Marking. Amsterdam: Benjamins.
Mahajan, Anoop
1990 The A/A-bar distinction and movement theory. Ph.D. dissertation, MIT.
Mahajan, Anoop
1999 Against head movement in syntax. Ms., UCLA.
McDaniel, Dana
1989 Partial and multiple wh-movement. Natural Language and Linguistic
Theory 7: 565-604.
McDaniel, Dana — Bonnie Chiu — Thomas L. Maxfield
1995 Parameters for wh-movement types: Evidence from child English.
Natural Language and Linguistic Theory 13: 709-753.
Müller, Gereon
1997 Partial wh-movement and optimality theory. The Linguistic Review 14:
249-306.
Müller, Gereon
1998a Order preservation, parallel movement, and the emergence of the un-
marked. Ms., ROA 275-0798.
Müller, Gereon
1998b Incomplete Category Fronting. Dordrecht: Kluwer.
Müller, Gereon
1999 Shape conservation and remnant movement. To appear in: Proceedings
of the 30th Annual Conference of the North-Eastern Linguistic Society.
Amherst, MA: GLSA.
Ouhalla, Jamal
1996 Remarks on the binding properties of wh-pronouns. Linguistic Inquiry
27: 676-708.
Pesetsky, David
1997 Optimality theory and syntax: Movement and pronunciation. In: D. Ar-
changeli and D.T. Langendoen (eds.) Optimality Theory: An Overview,
134-170. Oxford: Blackwell.
Pesetsky, David
1998a Some optimality principles of sentence pronunciation. In: P. Barbosa,
D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (eds.) Is the Best
Good Enough?, 337-383. Cambridge, MA: MIT Press.
Pesetsky, David
1998b Phrasal movement and its kin. Ms., MIT.
Reis, Marga
1995 Extractions from verb-second clauses in German? In: U. Lutz and J. Pafel
(eds.) On Extraction and Extraposition in German, 45-88. Amsterdam:
Benjamins.
Rice, Keren
1989 A Grammar of Slave. Berlin: Mouton de Gruyter.
van Riemsdijk, Henk
1989 Movement and regeneration. In: P. Benincà (ed.) Dialect Variation and
the Theory of Grammar, 105-136. Dordrecht: Foris.
Roberts, Ian
1991 Excorporation and minimality. Linguistic Inquiry 22: 209-217.
Roberts, Ian
1997 Restructuring, head movement, and locality. Linguistic Inquiry 28: 423-
460.
Roberts, Ian
1998 Have/Be raising, Move F and Procrastinate. Linguistic Inquiry 29: 113-
125.
Sabel, Joachim
1998 Principles and parameters of wh-movement. Habilitation thesis,
University of Frankfurt.
Saddy, Doug
1991 Wh-scope mechanisms in Bahasa Indonesia. MIT Working Papers in Lin-
guistics 15: 183-218.
Saddy, Doug
1992 A versus A-bar-movement and wh-fronting in Bahasa Indonesia. Ms.,
University of Queensland.
Schmid, Tanja
1998 Optional and Obligatory IPP Constructions in Westgermanic. Paper pre-
sented at the Second Workshop on Optimality Theory Syntax, October
1998, University of Stuttgart.
Speas, Margaret
1990 Phrase Structure in Natural Language. Dordrecht: Kluwer.
Tsai, Wei-Tien
1994 On economizing the theory of A-bar dependencies. Ph.D. dissertation,
MIT.
Wexler, Kenneth — Peter Culicover
1980 Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.
Wilder, Chris
1995 Rightward movement as leftward deletion. In: U. Lutz and J. Pafel (eds.)
On Extraction and Extraposition in German, 273-309. Amsterdam: Ben-
jamins.
Wilder, Chris
1997 Some properties of ellipsis in coordination. In: A. Alexiadou and T.A.
Hall (eds.) Studies on Universal Grammar and Typological Variation,
59-107. Amsterdam: Benjamins.
On the Integration of Cumulative Effects into
Optimality Theory
Silke Fischer
1 Introduction
The goal of this paper is to discuss the question of whether cumulative the-
ories are indispensable, because they are needed in order to capture certain
linguistic phenomena, or whether cumulative effects can be expressed equally
well in an optimality-theoretic framework. If so, cumulative theories could be
integrated into Optimality Theory (OT).
At first sight, the two theories seem to behave very differently. In OT, the
number of violations of low-ranked constraints does not play any role as long
as the constraint that is decisive for the outcome of the competition is higher-
ranked. In a cumulative theory, on the other hand, the situation is somewhat
different, because the underlying principle is that the weights of the involved
factors are added up. Thus it can happen that some factors which individually
do not have much weight and are therefore unimportant on their own become
decisive as soon as they cooccur or appear repeatedly.
As empirical background I will use Pafel's cumulative approach to quan-
tifier scope in German (cf. Pafel 1998). I will discuss whether it is possible
to "translate" it into OT, where the difficulties lie, and what kind of assump-
tions one might have to make. What I will not do is discuss Pafel's theory as
such, that is, discuss whether it is able to capture the phenomenon of quanti-
fier scope or where its advantages and disadvantages might lie; nor is the aim
of this paper to provide an adequate optimality-theoretic account of quanti-
fier scope in general (for this purpose see Heck, this volume). Pafel's theory
only serves as a case study for a more theoretical debate; therefore, the ap-
proach itself as well as the judgments on the sentences are neither changed
nor commented on.
152 Silke Fischer
2 Pafel's Approach to Quantifier Scope
Pafel introduces a number of factors that seem to have an impact on the scopal
behavior of quantifiers, i.e., whether they tend to take wide scope over other
quantifiers or not. Each factor is assigned some weight. In order to decide
which one of two quantifiers in a given sentence tends to take wide scope,
one has to determine which factors are relevant for each quantifier in the given
context. Then one can calculate the scopal value (SV) of each quantifier by
adding up the values of the relevant factors. The scopal behavior can then be
determined from the difference between the scopal values as follows:
(i) |SV(Q1) − SV(Q2)| ≥ 1:
    The quantifier with the larger SV takes wide scope (i.e., the sentence
    is unambiguous).
(ii) |SV(Q1) − SV(Q2)| < 1:
    Either quantifier may take wide scope (i.e., the sentence is ambiguous).
    a. 0 < |SV(Q1) − SV(Q2)| < 1:
       The reading in which the quantifier with the larger SV takes wide
       scope is preferred.
    b. |SV(Q1) − SV(Q2)| = 0:
       Both readings are equally well available.¹
The cumulative character of the approach is illustrated by the following four
examples (Pafel's examples (3.17), (3.45), (3.42), (3.1)), in which the factors
SUBJECT, EX-PRE (external precedence) and IN-DIS (inherent distributivity)
are involved. These factors are defined as follows:
Factor     ... is assigned to                                   Weight
EX-PRE     quantifiers in the "Vorfeld" which linearly          1.5
           precede other quantifiers
SUBJECT    subject quantifiers                                  1
IN-DIS     quantifiers that have an inherently distributive     1
           character
(1) Jeder Pianist hat eine Fuge in seinem Repertoire.
    [every pianist]_nom has [a fugue]_acc in his repertoire
    Q1 = jeder Pianist, Q2 = eine Fuge
    Q1: EX-PRE + SUBJECT + IN-DIS
    Q2: -
Cumulative Effects in OT 153
    SV(Q1) = 1.5 + 1 + 1 = 3.5
    SV(Q2) = 0
    Q1 > Q2 (i.e., Q1 has relative scope over Q2): possible
    Q2 > Q1 (i.e., Q2 has relative scope over Q1): impossible
(2) Jede Fuge hat ein Pianist in seinem Repertoire.
    [every fugue]_acc has [a pianist]_nom in his repertoire
    Q1: EX-PRE + IN-DIS       SV(Q1) = 1.5 + 1 = 2.5
    Q2: SUBJECT               SV(Q2) = 1
    Q1 > Q2: possible
    Q2 > Q1: impossible
(3) Ein Pianist hat jede Fuge in seinem Repertoire.
    [a pianist]_nom has [every fugue]_acc in his repertoire
    Q1: EX-PRE + SUBJECT      SV(Q1) = 1.5 + 1 = 2.5
    Q2: IN-DIS                SV(Q2) = 1
    Q1 > Q2: possible
    Q2 > Q1: impossible
(4) Eine Fuge hat jeder Pianist in seinem Repertoire.
    [a fugue]_acc has [every pianist]_nom in his repertoire
    Q1: EX-PRE                SV(Q1) = 1.5
    Q2: SUBJECT + IN-DIS      SV(Q2) = 1 + 1 = 2
    Q1 > Q2: possible
    Q2 > Q1: possible
As these examples show, the scopal behavior of the quantifiers depends on
the combination of the factors. Although, for instance, the factors SUBJECT
and IN-DIS on their own do not indicate a general tendency to wide scope
(cf. (2) and (3)), this can be the case if they cooccur (cf. example (4)).
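The additive procedure can be made concrete with a short sketch. The Python below is an illustration only and is not part of Pafel's account: the names WEIGHTS, scopal_value and scope_readings are hypothetical, and the unambiguous case is assumed to include a difference of exactly 1.

```python
# Illustrative sketch; all names are hypothetical.
WEIGHTS = {"EX-PRE": 1.5, "SUBJECT": 1.0, "IN-DIS": 1.0}

def scopal_value(factors):
    """Add up the weights of the factors assigned to a quantifier."""
    return sum(WEIGHTS[f] for f in factors)

def scope_readings(q1_factors, q2_factors):
    """Return (Q1 > Q2 possible, Q2 > Q1 possible)."""
    sv1, sv2 = scopal_value(q1_factors), scopal_value(q2_factors)
    if abs(sv1 - sv2) >= 1:
        # Unambiguous: only the quantifier with the larger SV takes wide scope.
        return (sv1 > sv2, sv2 > sv1)
    # Ambiguous: either quantifier may take wide scope.
    return (True, True)

# Factor assignments of examples (1)-(4): (factors of Q1, factors of Q2).
EXAMPLES = {
    1: (["EX-PRE", "SUBJECT", "IN-DIS"], []),
    2: (["EX-PRE", "IN-DIS"], ["SUBJECT"]),
    3: (["EX-PRE", "SUBJECT"], ["IN-DIS"]),
    4: (["EX-PRE"], ["SUBJECT", "IN-DIS"]),
}

for number, (q1, q2) in EXAMPLES.items():
    print(number, scopal_value(q1), scopal_value(q2), scope_readings(q1, q2))
```

Only example (4) comes out as ambiguous, because SUBJECT and IN-DIS together bring Q2 to within 0.5 of Q1.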
3 The Translation into OT
What we need in order to establish an optimality-theoretic account of the
data above are, informally speaking, candidates, constraints, and a constraint
ranking; and if we try to show that OT can do the job as well as the cumulative
theory, these have to be chosen in such a way that the results are equivalent
to Pafel's results. Of course we cannot restrict ourselves to the four examples
above, but as a starting point they already pose a challenge and draw
attention to the main problems.
Since in Pafel's theory relative quantifier scope only depends on the com-
parison of the involved quantifiers described in terms of a certain set of fac-
tors, and is not influenced by any further component like the syntactic deriva-
tion, the translation into OT might require some unconventional assumptions.
The starting point is that we have two quantifiers with different properties,
and based on this information alone our theory should be able to predict the
possible scope relations. In analogy to Pafel's procedure I therefore propose
that the quantifiers of the sentence under consideration constitute the can-
didate set and that the optimal candidate will be the quantifier which tends
to take wide scope. In the case of ambiguous sentences this means that the
candidates will have to be equally optimal.
As far as the constraints are concerned, it seems to be reasonable to adopt
Pafel's factors and, as a first try, rank them according to their weight in Pafel's
account such that constraints with greater weight are higher-ranked and con-
straints with the same weight are considered to be tied. With regard to the
examples above we thus have the following constraints:
EX-PRE (E): Quantifiers must occur in the "Vorfeld" and precede some
other quantifier.
SUBJECT (S): Quantifiers must be subjects.
IN-DIS (I): Quantifiers must be inherently distributive.
In order to get more plausible candidates than merely the quantifiers under
consideration, one can alternatively use the sentences' S-structures as input,
which yields potential LFs as output. If it is assumed that the possibility for
a quantifier to take wide scope is expressed by the fact that it precedes the
other quantifier at LF, and if the constraints are reinterpreted in such a way
that they refer to the first quantifier only (e.g., S: The first quantifier must be
the subject), we get exactly the same results. The candidates in the tableaux
are then to be understood as abbreviations for LF-representations in which
the quantifier in question precedes the other quantifier.
Let's see whether on these assumptions the predictions of Pafel's approach
can be captured.
(5) First ranking: E » S ◦ I

(1') Jeder Pianist hat eine Fuge in seinem Repertoire.
     [every pianist]_nom has [a fugue]_acc in his repertoire
T1:
       Candidates           | E  | S | I
  ☞    Q1: jeder Pianist    |    |   |
       Q2: eine Fuge        | *! | * | *

(2') Jede Fuge hat ein Pianist in seinem Repertoire.
     [every fugue]_acc has [a pianist]_nom in his repertoire
T2:
       Candidates           | E  | S | I
  ☞    Q1: jede Fuge        |    | * |
       Q2: ein Pianist      | *! |   | *

(3') Ein Pianist hat jede Fuge in seinem Repertoire.
     [a pianist]_nom has [every fugue]_acc in his repertoire
T3:
       Candidates           | E  | S | I
  ☞    Q1: ein Pianist      |    |   | *
       Q2: jede Fuge        | *! | * |

(4') Eine Fuge hat jeder Pianist in seinem Repertoire.
     [a fugue]_acc has [every pianist]_nom in his repertoire
T4:
       Candidates           | E  | S | I
  ☞    Q1: eine Fuge        |    | * | *
  *    Q2: jeder Pianist    | *! |   |
Unfortunately, this first approach does not work. Although the constraint
ranking in (5) predicts the scopal behavior of the sentences (1)-(3) analogously
to Pafel's theory (cf. T1-T3), it is not able to capture the ambiguity of
example (4); cf. T4.
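The outcome of the four tableaux can be checked mechanically. The following Python sketch is an assumption-laden illustration rather than a full OT implementation: the function optimal and the encoding of candidates as sets of violated constraints are hypothetical, and the tie between S and I is simplified by pooling their violations in one stratum.

```python
# Illustrative sketch of strict domination; names are hypothetical.
def optimal(candidates, ranking):
    """candidates: name -> set of violated constraints.
    ranking: list of strata, highest first; constraints inside one
    stratum are tied, and their violations are pooled (a simplification)."""
    def profile(violated):
        return tuple(sum(c in violated for c in stratum) for stratum in ranking)
    best = min(profile(v) for v in candidates.values())
    return {name for name, v in candidates.items() if profile(v) == best}

FIRST_RANKING = [["E"], ["S", "I"]]   # E » S ◦ I

TABLEAUX = {
    "T1": {"Q1": set(), "Q2": {"E", "S", "I"}},
    "T2": {"Q1": {"S"}, "Q2": {"E", "I"}},
    "T3": {"Q1": {"I"}, "Q2": {"E", "S"}},
    "T4": {"Q1": {"S", "I"}, "Q2": {"E"}},
}

for name, cands in TABLEAUX.items():
    print(name, sorted(optimal(cands, FIRST_RANKING)))
```

Under these assumptions Q1 is the sole winner in all four tableaux, so the ambiguity that Pafel's account predicts for sentence (4) is not derived.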
In order to predict this ambiguity, the two candidates in T4 both have to be
optimal, which means that in contrast to the situation in T1-T3, the violation of
the constraint E in T4 must not be fatal. If we compare the situation in T4 with
that in T1-T3, it can be concluded that it must be the simultaneous violation
of the two low-ranked constraints S and I that prevents the E-violation of Q2
from being fatal. At this point we are faced with an apparent contradiction.
As mentioned in the introduction, it is a basic principle of OT that violations
of low-ranked constraints cannot compensate for the violation of a
higher-ranked constraint.
One way out of the dilemma would be to assume that there is a further
constraint at work which renders the E-violation harmless, so to speak. And
based on the results of T1-T4, a natural way to describe this constraint would
be to say that it somehow combines the constraints S and I. Thus we might use
the following local conjunction as a further constraint.² (As far as local
conjunctions in general and in syntax in particular are concerned, cf. Smolensky
(1995) and Legendre et al. (1998) respectively.)
(6) S & I: Quantifiers must be subjects or inherently distributive.
This constraint will be satisfied as long as at least one of the two constraints
S or I is fulfilled, and it will be violated whenever S and I are simultaneously
violated, which corresponds exactly to the situation in T4 and distinguishes
it from T1-T3. In order to derive the right result in T4, we would like to say
that E and S & I are tied. But since ties are not defined in a unified way, it
has to be made explicit at this point what kind of ties we are talking about.
Basically, we can draw a distinction between local and global ties (for a de-
tailed analysis of different types of ties see Müller 1999). The main differ-
ence between these two concepts concerns the significance of violations of
lower-ranked constraints. Under a local tie approach the prediction will be
that these violations become relevant as soon as neither the tied constraints
nor higher-ranked constraints decide the competition. Formally, this means
that a given language is determined by one constraint ranking in which the tie
is integrated as follows:
(7) ... » E ◦ S&I » ...
Under a global tie approach, on the other hand, a language is determined by a
whole set of constraint rankings, namely those which result if every possible
resolution of the tie is understood to be part of an independent order. This can
be illustrated as in (8):
(8)               E » S&I » ...     → constraint order α
       ... »  {
                  S&I » E » ...     → constraint order β
Optimality is then to be understood as optimality with regard to at least one
of the resulting constraint orders. One consequence of this approach is that
violations of constraints that are lower-ranked than the tie itself are irrelevant
as long as there is at least one ranking under which the candidate is better
than the competing ones.
If we assume that E and S & I are locally tied, we will not immediately get
the right result for sentence (4), as T5 shows.³
(4'') Eine Fuge hat jeder Pianist in seinem Repertoire.
      [a fugue]_acc has [every pianist]_nom in his repertoire
T5:
       Candidates           | E | S&I | S  | I
  *    Q1: eine Fuge        |   | *   | *! | *!
  ☞    Q2: jeder Pianist    | * |     |    |
In this case, Q2 will win, because it does not violate the two low-ranked
constraints S and I, in contrast to Q1. For this approach to work, it would have to
be assumed that the local conjunction X & Y ("X or Y must hold") somehow
replaces the simple constraints X and Y, such that in a competition where X
& Y is involved, X and Y must be excluded. (Intuitively it does not seem to
be so unreasonable that one constraint should not be referred to twice, once
in the form of X and the second time in the form of the local conjunction X
& Y. For a related idea in which certain elements are only referred to once in
determining the grammaticality of a given derivation, cf. Richards's (1998)
Principle of Minimal Compliance.)
So if we replace T5 with T6, where S and I are excluded from the competition,
and if we assume that E and S & I are locally tied, we finally get the
right prediction for sentence (4):
T6:
       Candidates           | E | S&I
  ☞    Q1: eine Fuge        |   | *
  ☞    Q2: jeder Pianist    | * |
Alternatively, we could assume that the relation between the constraints E and
S & I is expressed in terms of an ordered global tie (as illustrated in diagram
(8)). With regard to sentence (4), this means that the tableau we would get
would be equivalent to T5, except that the violations of S and I would not be
fatal and both quantifiers would be optimal: Q1 under constraint order α and
Q2 under constraint order β.
T7:
       Candidates           | E    | S&I | S | I
  ☞    Q1: eine Fuge        |      | *   | * | *
  ☞    Q2: jeder Pianist    | *(!) |     |   |
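The difference between the two kinds of tie can be spelled out procedurally. The Python sketch below rests on several assumptions that are not in the text: the function names are hypothetical, a local tie is modeled by pooling the violations of the tied constraints in one stratum, and a global tie is modeled by taking the union of the winners under every linear resolution of the tie. The lower-ranked pair S ◦ I is pooled in both cases for simplicity.

```python
from itertools import permutations

def profile(violated, order):
    """Violation counts per stratum; a stratum is a list of tied constraints."""
    return tuple(sum(c in violated for c in stratum) for stratum in order)

def winners(candidates, order):
    """Candidates with the lexicographically smallest violation profile."""
    best = min(profile(v, order) for v in candidates.values())
    return {n for n, v in candidates.items() if profile(v, order) == best}

def local_tie(candidates, tied, lower):
    """Tied constraints act as one stratum; lower strata still break draws."""
    return winners(candidates, [list(tied)] + lower)

def global_tie(candidates, tied, lower):
    """Optimal = winner under at least one resolution of the tie."""
    result = set()
    for resolution in permutations(tied):
        order = [[c] for c in resolution] + lower
        result |= winners(candidates, order)
    return result

# Sentence (4): Q1 = eine Fuge, Q2 = jeder Pianist.
SENTENCE_4 = {"Q1": {"S&I", "S", "I"}, "Q2": {"E"}}
TIED = ["E", "S&I"]

print(local_tie(SENTENCE_4, TIED, [["S", "I"]]))    # configuration of T5
print(local_tie(SENTENCE_4, TIED, []))              # configuration of T6
print(global_tie(SENTENCE_4, TIED, [["S", "I"]]))   # configuration of T7
```

On this encoding the local tie lets the lower-ranked S and I decide in favor of Q2 unless they are removed from the competition, whereas the global tie yields both quantifiers as optimal without removing them.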
To sum up, the underlying ranking we have assumed so far is E ◦ S & I
» S ◦ I. However, the following example reveals that this order cannot be
completely correct. In order to capture the ambiguity of sentence (9), which
corresponds to Pafel's example (3.108b), E and S have to be tied.
(9) Eine Fuge haben einige Pianisten in ihrem Repertoire.
    [a fugue]_acc have [some pianists]_nom in their repertoire
    Q1: EX-PRE                SV(Q1) = 1.5
    Q2: SUBJECT               SV(Q2) = 1
    Q1 > Q2: possible
    Q2 > Q1: possible
T8:
       Candidates              | E | S
  ☞    Q1: eine Fuge           |   | *
  ☞    Q2: einige Pianisten    | * |
This observation raises a severe problem. If we assume on the one hand that
E is tied with S (and also with I, as the difference between the scopal values
shows), and on the other hand that E is tied with S & I, we have to conclude
that S & I is also tied with S and I because of transitivity. But if we consider
these constraints in the light of Pafel's approach,⁴ they correspond to factors
with the scopal values 2 and 1 respectively, which means that the difference is
≥ 1. Thus only the factor with the larger scopal value should be able to take
wide scope, and the corresponding constraint should be higher-ranked than
the other one. So it must be concluded that we face a problem with regard to
transitivity.
4 The Transitivity Problem
As far as the examples (1) to (4) are concerned, it seems to be possible
somehow to derive the predictions of Pafel's cumulative theory (CT) by means
of an optimality-theoretic analysis. However, there is one essential difference
between the two theories, which probably constitutes the main difficulty
for the integration of cumulative effects into OT. If we compare the behavior
of two quantifiers in Pafel's theory, there are three possible results: The
absolute value of the difference between the scopal values might be ≥ 1, = 0,
or ∈ ]0, 1[. In OT, on the other hand, we basically have two possibilities to
describe the relation between two constraints. One can be higher-ranked than
the other, or they can be tied. As mentioned before, it seems to be reasonable
to assume the following "translation rules" (where A and B are factors relevant
for scope, W(X) := the weight of factor X, and Con(X) := the constraint
derived from factor X):
(i) W(A) = W(B) → Con(A) ◦ Con(B)
(ii) W(A) − W(B) ≥ 1 → Con(A) » Con(B)
However, the third possibility, where 0 < |W(A) − W(B)| < 1, is problematic.
On the one hand, this configuration predicts ambiguity, thus the corresponding
constraints cannot be ranked in a dominance relation. But if they are tied,
we have a problem with transitivity, as was already observed at the end of the
last section. Consider the following configuration:
SV(Q1) = 2       involved factor: A
SV(Q2) = 1.5     involved factor: B
SV(Q3) = 1       involved factor: C
CT:   a. SV(Q1) − SV(Q2) = 0.5 → predicts ambiguity
      b. SV(Q2) − SV(Q3) = 0.5 → predicts ambiguity
but:  c. SV(Q1) − SV(Q3) = 1 → predicts no ambiguity
OT:   According to the result in (a.), one would like to say that A ◦ B; but
      according to the result in (b.), one would like to say that B ◦ C.
      → Because of transitivity, we would have to assume A ◦ C. This
      contradicts the result in (c.), according to which we would expect that
      A » C.
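That the ambiguity relation of the cumulative theory is not transitive can be verified directly for this configuration. The following Python sketch is purely illustrative; the names SV, ambiguous and transitivity_failures are hypothetical.

```python
from itertools import permutations

# Scopal values of the configuration above.
SV = {"Q1": 2.0, "Q2": 1.5, "Q3": 1.0}

def ambiguous(x, y):
    """Ambiguity in the cumulative theory: scopal values less than 1 apart."""
    return abs(SV[x] - SV[y]) < 1

def transitivity_failures(items):
    """Triples (x, y, z) where x~y and y~z hold but x~z does not."""
    return [(x, y, z) for x, y, z in permutations(items, 3)
            if ambiguous(x, y) and ambiguous(y, z) and not ambiguous(x, z)]

print(transitivity_failures(SV))
```

The only failures involve Q2 as the middle element, which mirrors the situation in which factor B is tied with both A and C although A should dominate C.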
So if we assumed a strict transitive order, the consequence would be that all
factors belonging to the set T_F would translate into tied constraints, where T_F
is defined as the set containing the factor F and all those factors whose scopal
values are less than 1 step away from the scopal value of an element belonging
to T_F. This domino effect would render most of the constraints equally
strong and lead to false predictions, as the following example illustrates. This
example (Pafel's number 3.104) contains a new factor, SL-PAT, which is assigned
to quantifiers with a slight tendency to be interpreted as Patients. It has
the weight 1 and translates into the constraint SL, which says that quantifiers
must have a slight tendency to be interpreted as Patients.
(10) Einem Kind hat er jedes Märchen erzählt.
     [a child]_dat has [he]_nom [every fairytale]_acc told
     Q1: EX-PRE + SL-PAT      SV(Q1) = 1.5 + 1 = 2.5
     Q2: IN-DIS               SV(Q2) = 1
     Q1 > Q2: possible
     Q2 > Q1: impossible
Starting with the difference in weight between the two factors E and I, which
is 1.5 − 1 = 0.5, we can assume that E ◦ I. Similarly, from the difference
between the weights associated with E and SL & I, which is 2 − 1.5 = 0.5,
we can conclude that E ◦ SL & I; so according to transitivity we get the
relation I ◦ SL & I. On the other hand, SL & I is tied with E & SL, since the
relevant difference is 2.5 − 2 = 0.5. Again because of transitivity, we therefore
get the result that I ◦ E & SL. But as illustrated in T9, this gives us the wrong
predictions with regard to sentence (10), in which only the first quantifier can
take wide scope.
T9:
       Candidates            | E & SL | I
  ☞    Q1: einem Kind        |        | *
* ☞    Q2: jedes Märchen     | *      |
If we want to make sure that only Qi wins, E & SL must be ranked higher
than I, a ranking which is also suggested by the difference between their
corresponding weights, which is 2.5—1 = 1.5.
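The domino effect can be reproduced by closing the "less than 1 apart" relation under transitivity. The Python sketch below is a hypothetical illustration: the function tie_classes and the dictionary WEIGHT are not part of the analysis, and the weights of the conjunctions are simply the sums given in the text.

```python
# Weights of the constraints discussed for example (10).
WEIGHT = {"I": 1.0, "E": 1.5, "SL&I": 2.0, "E&SL": 2.5}

def tie_classes(weight):
    """Group constraints into the classes of the transitive closure of
    'weights differ by less than 1'."""
    classes = []
    for name in weight:
        # Existing classes that contain a constraint less than 1 away.
        linked = [c for c in classes
                  if any(abs(weight[name] - weight[m]) < 1 for m in c)]
        merged = {name}.union(*linked)
        classes = [c for c in classes if c not in linked] + [merged]
    return classes

classes = tie_classes(WEIGHT)
print(classes)
print(abs(WEIGHT["E&SL"] - WEIGHT["I"]))
```

All four constraints end up in a single class, although the weights of E & SL and I differ by 1.5; this is the false prediction shown in T9.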
I do not know how to solve this problem without giving up to some extent
the idea that constraint orders must be strictly transitive. But if we allow that
A ◦ B and B ◦ C does not necessarily imply A ◦ C, we can account for the
examples above with the following diagram:
(11)
                    B » C » D     constraint order α   (... A » B » C ...)
           A »  {
   ... {            C » B » D     constraint order β   (... A » C » B ...)
           B » A » C » D          constraint order γ   (... B » A » C ...)
In (11), two global ties are involved, which express the relations A ◦ B and
B ◦ C, but still all three resulting constraint orders predict that A is
higher-ranked than C. This is possible because in contrast to usual assumptions,
according to which the branches of global ties are continued in the same way,
the second tie in (11) does not affect all branches, but is only part of the two
constraint orders α and β. So we could propose that the occurrence of global
ties need not necessarily affect all branches of the ranking structure. With this
assumption the transitivity problem can be solved, which means that the idea
of strict transitivity in constraint rankings must be given up (and this might
be a controversial result). However, transitivity does not have to be given up
completely, since each constraint order in itself remains transitive. It seems
to me that this is the easiest way to integrate the non-transitive effects of
cumulative theories into OT.⁵
The question then arises of how the underlying relation between the
constraints A, B, and C, which is illustrated in (11), can be formally expressed.
Following a suggestion by Ralf Vogel (p.c.), I propose that it can be captured
adequately by the relation (A » C) ◦ B, where this kind of interaction
between ties and hierarchical rankings is defined as follows:
(12) (A » C) ◦ B := A ◦ B » C ∨ A » C ◦ B
                  = A » B » C ∨ B » A » C
                    ∨ A » C » B (∨ A » B » C)
     resulting constraint orders: (i)   A » B » C
                                  (ii)  B » A » C
                                  (iii) A » C » B
This definition can be generalized in such a way that it can be applied to
all sorts of combinations between ties and (bracketed) asymmetric rankings.
The crucial point is that the brackets on hierarchical rankings make it possible
to preserve this hierarchy even in a tied order. This means that if the tie
is resolved, it will yield only those combinations possible between the tied
elements in which the hierarchy indicated in brackets is preserved.
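One way to read the definition is that the resolutions of a tie with a bracketed ranking are exactly those total orders that keep the bracketed hierarchy intact. The Python sketch below implements this reading; it is an interpretation offered for illustration, and the function name resolutions and its pair-based encoding of brackets are hypothetical.

```python
from itertools import permutations

def resolutions(constraints, preserved):
    """All total orders of `constraints` in which every bracketed pair
    (higher, lower) from `preserved` keeps its relative order."""
    orders = []
    for order in permutations(constraints):
        rank = {c: i for i, c in enumerate(order)}
        if all(rank[hi] < rank[lo] for hi, lo in preserved):
            orders.append(order)
    return orders

# (A » C) ◦ B: A must stay above C; B may occupy any position.
for order in resolutions(["A", "B", "C"], [("A", "C")]):
    print(" » ".join(order))
```

For (A » C) ◦ B this yields the three constraint orders A » B » C, A » C » B and B » A » C, and no order in which C outranks A.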
5 Combining Constraints
In section 3 we considered four sentences that involved the simple constraints
E, S, and I. In order to account for the behavior of the quantifiers in these
examples, the additional constraint S & I was introduced. But what about
constraints like E & S, E & I, or E & S & I? The question that needs to
be discussed at this point is what kind of constraint combinations have to
be taken into account. Since quantifiers can exhibit all sorts of combined
properties, the answer should be that in principle all constraint combinations
have to be considered. However, if we examined quantifiers with n different
properties, we would have to discuss 2^n − 1 constraints. Since the first four
sentences have already shown that a certain subset of all constraints seems to
suffice to determine the outcome of the competition for a concrete example,
it would be helpful to find out what this subset has to look like.
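The size of the full constraint set is easy to confirm by enumeration. The Python sketch below lists every non-empty combination of the three simple constraints; the function name conjunctions is hypothetical.

```python
from itertools import combinations

def conjunctions(constraints):
    """Every non-empty combination of simple constraints, joined with '&'."""
    return [" & ".join(combo)
            for size in range(1, len(constraints) + 1)
            for combo in combinations(constraints, size)]

combos = conjunctions(["E", "S", "I"])
print(len(combos), combos)
```

With n = 3 this gives the 2^3 − 1 = 7 constraints E, S, I, E & S, E & I, S & I and E & S & I; each additional property doubles the number of combinations.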
Remember that the last observation in section 3 was that E must not only
be tied with S & I, but also with S and I, whereas S & I is higher-ranked than
S and I, giving rise to the transitivity problem. In the light of the previous
section, we can now assume that the underlying formal relation is (S & I »
S ◦ I) ◦ E, which is illustrated by the diagram in (13) (cf. also the calculation
in the appendix).
This constraint ranking is indeed able to predict the ambiguity of sentence
(4)⁶ (cf. T10). But if the competition is restricted to the same set of constraints
(i.e., {S & I, E, S, I}), it does not make the correct predictions for the
unambiguous sentences (2) and (3), in which only the first quantifier can take wide
scope (cf. T11 and T12).
(4'') Eine Fuge hat jeder Pianist in seinem Repertoire.
      [a fugue]_acc has [every pianist]_nom in his repertoire
T10:
       Candidates           | S&I | E | S | I
  ☞    Q1: eine Fuge        | *   |   | * | *
  ☞    Q2: jeder Pianist    |     | * |   |
(2'') Jede Fuge hat ein Pianist in seinem Repertoire.
      [every fugue]_acc has [a pianist]_nom in his repertoire
T11:
       Candidates           | S&I | E | S | I
  ☞    Q1: jede Fuge        |     |   | * |
* ☞    Q2: ein Pianist      |     | * |   | *
(3'') Ein Pianist hat jede Fuge in seinem Repertoire.
      [a pianist]_nom has [every fugue]_acc in his repertoire
T12:
       Candidates           | S&I | E | S | I
  ☞    Q1: ein Pianist      |     |   |   | *
* ☞    Q2: jede Fuge        |     | * | * |
According to structure (13), not only Q1 but also Q2 is optimal in both
tableaux, namely under the constraint orders α and β in the case of T11, and
under the constraint orders γ and δ in the case of T12. The conclusion that
can be drawn is that the constraint subset relevant for the examples (2) and
(3) has not been completely taken into consideration in T11 and T12. Based
on our observations concerning sentence (4), it seems reasonable to assume
that the relevant subset CON_rel (i.e., the smallest set of constraints to which
the competition can be reduced) consists of two members only, namely the
combinations of the constraints derived from the properties of each quantifier.
As far as the examples (2) and (3) are concerned, this means that the relevant
constraint subsets are {E & I, S} and {E & S, I} respectively (cf. T13 and T14).
T13:
       Candidates          | E & I | S
  ☞    Q1: jede Fuge       |       | *
       Q2: ein Pianist     | *!    |

T14:
       Candidates          | E & S | I
  ☞    Q1: ein Pianist     |       | *
       Q2: jede Fuge       | *!    |
The following example⁷ serves as a further illustration of this generalization
concerning CON_rel. It contains two new factors: ST-L-DB refers to strong
lexical discourse binding and has the weight 2; FOCUS is assigned to focused
quantifiers⁸ and has the weight −1. These factors translate into the following
two constraints:
ST-L-DB (ST): Quantifiers must occur in strong lexical discourse binding
contexts.
FOCUS (F): Quantifiers must be focused.
(14) Welche Fuge hat jeder Pianist in seinem Repertoire?
     [which fugue]_acc has [every pianist]_nom in his repertoire
     Q1: EX-PRE + ST-L-DB + FOCUS
     Q2: SUBJECT + IN-DIS
     SV(Q1) = 1.5 + 2 − 1 = 2.5
     SV(Q2) = 1 + 1 = 2
     Q1 > Q2: possible
     Q2 > Q1: possible
     relevant constraint subset: {E & ST & F, S & I} ⊆ CON,
                                 where CON is the set comprising all
                                 constraint combinations
     constraint ranking: E & ST & F ◦ S & I
T15:
       Candidates           | E & ST & F | S & I
  ☞    Q1: welche Fuge      |            | *
  ☞    Q2: jeder Pianist    | *          |
However, sentences in which the quantifiers share some common properties
relevant for scope require a slight modification to the definition of CON_rel.
Since neither candidate would violate a constraint derived from (one of) these
properties, these constraints are irrelevant for the competition and must
therefore be excluded from CON_rel. Thus, CON_rel can be defined as follows: The
first element of CON_rel is the local conjunction that involves the constraints
derived from Q1's properties minus those Q1 shares with Q2, and the second
element combines the constraints derived from Q2's properties minus those
shared with Q1. Example (15) serves as an illustration. While constraint subset
(i) does not yield the correct result (cf. T16(i)), constraint subset (ii), which
consists of the same constraint combinations except that the common property
F is excluded, makes the correct predictions (cf. T16(ii)).
(15) Welche Fuge hat JEder Pianist gespielt?
     [which fugue]_acc has [EVery pianist]_nom played
     Q1: EX-PRE + ST-L-DB + FOCUS
     Q2: SUBJECT + IN-DIS + FOCUS
     SV(Q1) = 1.5 + 2 − 1 = 2.5
     SV(Q2) = 1 + 1 − 1 = 1
     Q1 > Q2: possible
     Q2 > Q1: impossible
     relevant constraint subset:
     (i)  {E & ST & F, S & I & F}
     (ii) {E & ST, S & I}
     constraint ranking:
     (i)  E & ST & F » S & I & F
     (ii) E & ST » S & I
T16(i):
       Candidates           | E & ST & F | S & I & F
  ☞    Q1: welche Fuge      |            |
* ☞    Q2: JEder Pianist    |            |
T16(ii):
       Candidates           | E & ST | S & I
  ☞    Q1: welche Fuge      |        | *
       Q2: JEder Pianist    | *!     |
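The definition of CON_rel amounts to a set difference followed by a comparison of summed weights. The Python sketch below illustrates this for examples (14) and (15) under assumptions that go beyond the text: the function names con_rel and predict are hypothetical, the ranking of the two conjunctions is read off the summed weights (tie if they are less than 1 apart, dominance otherwise), and the case in which a quantifier has no unshared property is not handled.

```python
WEIGHTS = {"EX-PRE": 1.5, "SUBJECT": 1.0, "IN-DIS": 1.0,
           "ST-L-DB": 2.0, "FOCUS": -1.0}

def con_rel(q1_props, q2_props):
    """Properties entering each conjunction: a quantifier's own
    properties minus those it shares with the other quantifier."""
    shared = set(q1_props) & set(q2_props)
    return set(q1_props) - shared, set(q2_props) - shared

def predict(q1_props, q2_props):
    """Each quantifier violates only the other quantifier's conjunction,
    so the winner is decided by how the two conjunctions are ranked."""
    c1, c2 = con_rel(q1_props, q2_props)
    w1 = sum(WEIGHTS[p] for p in c1)
    w2 = sum(WEIGHTS[p] for p in c2)
    if abs(w1 - w2) < 1:              # tied conjunctions: both optimal
        return {"Q1", "Q2"}
    return {"Q1"} if w1 > w2 else {"Q2"}

EX14 = (["EX-PRE", "ST-L-DB", "FOCUS"], ["SUBJECT", "IN-DIS"])
EX15 = (["EX-PRE", "ST-L-DB", "FOCUS"], ["SUBJECT", "IN-DIS", "FOCUS"])

print(con_rel(*EX15))
print(predict(*EX14), predict(*EX15))
```

For (15) the shared property FOCUS drops out, leaving E & ST against S & I with summed weights 3.5 and 2, so only Q1 is optimal; for (14) the sums are 2.5 and 2, and both quantifiers are optimal.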
As far as factors with negative weight are concerned, one might alternatively
translate them into negative constraints in order to avoid configurations where
X » X & Y, which contradicts the definition of local conjunction. The factor
FOCUS, for example, would then translate into the following constraint:
*F: Quantifiers must not be focused.
In fact, we could then also try to replace the factor FOCUS (with weight
−1), which is associated with focused quantifiers, with a factor *FOCUS with
weight 1, which is associated with unfocused quantifiers. In this way we could
generally reinterpret factors with negative weight such that they would all be
assigned positive weight. With regard to example (14), we would then have
the following configuration, which illustrates that the difference between the
scopal values and therefore the predictions on possible scope relations remain
unaffected by this reinterpretation.
(14') Welche Fuge hat jeder Pianist in seinem Repertoire?
      [which fugue]_acc has [every pianist]_nom in his repertoire
      Q1: EX-PRE + ST-L-DB
      Q2: SUBJECT + IN-DIS + *FOCUS
      SV(Q1) = 1.5 + 2 = 3.5
      SV(Q2) = 1 + 1 + 1 = 3
      Q1 > Q2: possible
      Q2 > Q1: possible
      relevant constraint subset: {E & ST, S & I & *F}
      constraint ranking: E & ST ◦ S & I & *F
T17:
       Candidates           | E & ST | S & I & *F
  ☞    Q1: welche Fuge      |        | *
  ☞    Q2: jeder Pianist    | *      |
As far as example (15) is concerned, the factor *FOCUS would not be involved
at all, because both quantifiers in the sentence are focused. Hence,
*F would not belong to the relevant constraint subset. However, all sentences
that contain unfocused quantifiers (like the examples (1)-(4)) are now associated
with the factor *FOCUS and therefore with the constraint *F; but as our
considerations above have shown, *F will be excluded from CON_rel in case
both involved quantifiers are unfocused. Thus the replacement of F/FOCUS
by *F/*FOCUS does not affect our earlier examples.
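That the reinterpretation leaves the differences between scopal values untouched can be checked with a few lines of arithmetic. The Python sketch below is an illustration only; the function names sv_negative and sv_positive and the Boolean encoding of focus are hypothetical.

```python
BASE = {"EX-PRE": 1.5, "SUBJECT": 1.0, "IN-DIS": 1.0, "ST-L-DB": 2.0}

def sv_negative(props, focused):
    """Original scheme: focused quantifiers receive FOCUS with weight -1."""
    return sum(BASE[p] for p in props) + (-1.0 if focused else 0.0)

def sv_positive(props, focused):
    """Reinterpretation: unfocused quantifiers receive *FOCUS with weight 1."""
    return sum(BASE[p] for p in props) + (0.0 if focused else 1.0)

# (14): Q1 focused, Q2 unfocused; (15): both quantifiers focused.
CASES = {
    "14": ((["EX-PRE", "ST-L-DB"], True), (["SUBJECT", "IN-DIS"], False)),
    "15": ((["EX-PRE", "ST-L-DB"], True), (["SUBJECT", "IN-DIS"], True)),
}

for label, (q1, q2) in CASES.items():
    old = sv_negative(*q1) - sv_negative(*q2)
    new = sv_positive(*q1) - sv_positive(*q2)
    print(label, old, new)
```

Both schemes differ by a constant of 1 for every quantifier, so the difference between any two scopal values, and with it the predicted scope relations, stays the same.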
Finally, there is another configuration in Pafel's approach that must be
mentioned. If a quantifier is not associated with any property that is relevant for
scope, it receives the scopal value 0. Thus it is possible for a sentence
containing such a quantifier to be ambiguous in case the second quantifier Q2
has a scopal value with −1 < SV(Q2) < 1. Assume that Q2 has the property
A, which translates into the constraint A. As indicated in T18, Q2 fulfils A in
contrast to Q1. Thus we are faced with the situation that Q2 will always win
if we do not introduce a further constraint which is violated by Q2 but not
by Q1.
T18:
       Candidates | A
  *    Q1         | *!
  ☞    Q2         |
In order to get the right result, we have to think of an additional constraint
which is satisfied exactly by those quantifiers which do not have any
properties that influence the quantifier's scopal behavior. Such a constraint might
look as follows:
NO PROPERTY (N-PR): Quantifiers must not have properties relevant for
scope.
On this assumption, the competition works as follows:
(16) Q1: -     SV(Q1) = 0
     Q2: A     −1 < SV(Q2) < 1
     Q1 > Q2: possible
     Q2 > Q1: possible
     relevant constraint subset: {N-PR, A}
     constraint ranking: N-PR ◦ A
T19:
       Candidates | A | N-PR
  ☞    Q1         | * |
  ☞    Q2         |   | *
Note that the constraint N-PR must also come into play if a quantifier shares
all its properties with the second quantifier of the sentence. This configuration
is illustrated in the following example, where A and B are properties relevant
for scope that translate into the constraints A and B respectively.
(17) Q1: A + B
     Q2: A, where |SV(Q1) − SV(Q2)| < 1, i.e., either quantifier can
     take wide scope.
As discussed above, the constraint derived from the common property A is
excluded from CON_rel. Thus the relevant constraint subset might be:
(i) {B}, or
(ii) {B, N-PR}.
For (ii), the constraint ranking is B ◦ N-PR, because we know from our
assumptions in (17) that |weight(B)| < 1. The results we get for (i) and (ii) are
illustrated in T20(i) and T20(ii), which show that we have to use the second
constraint subset.
T20(i):
       Candidates | B
  ☞    Q1         |
  *    Q2         | *!
T20(ii):
       Candidates | B | N-PR
  ☞    Q1         |   | *
  ☞    Q2         | * |
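The role of N-PR can be pictured as a fallback that applies whenever a quantifier is left without any unshared property. The Python sketch below is a hypothetical illustration: the function names, the treatment of N-PR as carrying weight 0, and the sample weights of 0.5 for the abstract properties A and B are assumptions made only for this example.

```python
def relevant_subset(q1_props, q2_props):
    """Unshared properties of each quantifier; N-PR steps in when a
    quantifier has none left (assumption: N-PR behaves like weight 0)."""
    shared = set(q1_props) & set(q2_props)
    first = (set(q1_props) - shared) or {"N-PR"}
    second = (set(q2_props) - shared) or {"N-PR"}
    return first, second

def optimal(q1_props, q2_props, weights):
    """Tie the two members if their summed weights are less than 1 apart;
    otherwise the member with the larger weight dominates."""
    first, second = relevant_subset(q1_props, q2_props)
    w1 = sum(weights.get(p, 0.0) for p in first)
    w2 = sum(weights.get(p, 0.0) for p in second)
    if abs(w1 - w2) < 1:
        return {"Q1", "Q2"}
    return {"Q1"} if w1 > w2 else {"Q2"}

SAMPLE = {"A": 0.5, "B": 0.5}   # hypothetical weights with absolute value < 1

print(relevant_subset([], ["A"]), optimal([], ["A"], SAMPLE))                  # cf. (16)
print(relevant_subset(["A", "B"], ["A"]), optimal(["A", "B"], ["A"], SAMPLE))  # cf. (17)
```

In both configurations the quantifier without an unshared property is represented by N-PR, so that each candidate violates exactly one of the two tied constraints and both remain optimal.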
One further situation that can occur in cumulative theories, which we do not
find in Pafel's approach however, is that the cumulative occurrence of one
and the same constraint violation might change the outcome of the whole
competition. Imagine the following configuration:
T21:
       Candidates | A  | B
       C1         | *! |
  ☞    C2         |    | *
T22:
       Candidates | A  | B
  ☞    C1         | *  |
       C2         |    | **!
If it is assumed that A » B, we can account for T21, but not for T22, and if
we assume that Β » A, we get the right prediction for T22, but not for T 2 i. In
the light of the ongoing discussion, one way out of the dilemma might be to
assume that constraint combinations of the sort X & Y are not only possible
in case X / Y , but also if X = Y. The resulting constraint would be a reflexive
local conjunction (cf. also Legendre et al. 1998), which would have to be
interpreted as follows:
(18) (i) The constraint X & X =: X 2 is violated iff X is violated twice;

(ii) more general:
The constraint X" is violated iff X is violated η times.
On these assumptions, T 2 i and T22 can be accounted for with the following
constraint ranking: Β 2 A » B. Since A » B, C2 wins in T 2 i, and since
B » A, Ci wins in T22, as illustrated more precisely in T23.
2
T23:
Candidates B2 A
03° C, *
c2 *!
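The dilemma posed by T21 and T22, and its resolution by reflexive local conjunction, can be checked mechanically. The following Python sketch makes two assumptions that go beyond the text: Xⁿ is treated as violated once as soon as X is violated at least n times, and a ranking is encoded as a list of (constraint, power) pairs. The names `profile` and `winner` are hypothetical.

```python
def profile(marks, ranking):
    """Violation profile of one candidate; a power n > 1 encodes the conjunction X^n."""
    out = []
    for constraint, power in ranking:
        count = marks.get(constraint, 0)
        # A simple constraint counts its violations; X^n is violated once
        # as soon as X is violated at least n times (assumption).
        out.append(count if power == 1 else int(count >= power))
    return tuple(out)

def winner(tableau, ranking):
    """Candidate with the lexicographically smallest profile (strict domination)."""
    return min(tableau, key=lambda cand: profile(tableau[cand], ranking))

t21 = {"C1": {"A": 1}, "C2": {"B": 1}}
t22 = {"C1": {"A": 1}, "C2": {"B": 2}}

a_over_b = [("A", 1), ("B", 1)]
b_over_a = [("B", 1), ("A", 1)]
with_conjunction = [("B", 2), ("A", 1), ("B", 1)]   # B² » A » B

for name, ranking in [("A » B", a_over_b), ("B » A", b_over_a),
                      ("B² » A » B", with_conjunction)]:
    print(name, "T21:", winner(t21, ranking), "T22:", winner(t22, ranking))
```

Only the third ranking selects C2 in T21 and C1 in T22; each of the two simple rankings gets one of the tableaux wrong.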
6 Conclusion
As the discussion showed, it seems to be possible to integrate cumulative
effects, as they occur, for example, in Pafel's approach to quantifier scope, into
OT if some special assumptions are accepted. In order to get effective con-
straints, it was first of all necessary to introduce (reflexive) local conjunction,
which multiplies the number of constraints enormously and might therefore
give rise to criticism. But as could be shown in the previous section, the out-
come of the competition only hinges on a small subset of the whole set of
constraints.
A much more severe problem was approached in section 4 and concerns the
transitivity of constraint rankings. Since in cumulative theories, transitivity
does not need to hold, we face the problem that we might have to integrate
non-transitive effects into a transitive order. I think that this is only possible
if the idea of strict or global transitivity, where A o B and B o C necessarily
imply A o C, is given up. Thus, I proposed that the occurrence of global
ties within global ties might only affect some of the branches. This approach
allows on the one hand the integration of non-transitive effects, but preserves
on the other hand at least locally the transitive order, because each resulting
constraint order remains transitive. Thus, this step is not as radical as it might
seem at first sight. Of course, it has to be pointed out that global ties in general
increase the amount of complexity tremendously; however, the number of
resulting constraint rankings is again reduced somewhat if global ties do not
necessarily have to affect all branches. As far as the formal realization of
this relation is concerned, it can be expressed as an interaction between ties and
bracketed hierarchical rankings. This seems to me to be a natural elaboration
of the two basic relations "»" and "o", which is to some extent reminiscent
of the interaction between addition and multiplication.
Finally, the question arose as to how CON_rel, the smallest set of constraints
relevant for a competition, can be defined. It is clear that constraints on which
the candidates behave alike can be excluded and that furthermore simple con-
straints which are also part of relevant local conjunctions need not be taken
into consideration. (In the latter case, the simple constraints will not be de-
cisive, since the corresponding local conjunctions are higher-ranked.) More-
over, the cumulative character of the constraints ensures that (A & X) » or
o (B & X) ⇔ A » or o B, which allows us to ignore certain higher-ranked
local conjunctions on which the candidates differ. As far as the integration of
Pafel's approach into OT is concerned, it could therefore be concluded that
CON_rel contains only two constraints, namely the constraint combinations
derived from the properties associated with each quantifier.
There are two questions I have not addressed here. First, it could be asked
whether anything would change if CON_rel contained more than two con-
straints or if more than two candidates were involved. The second ques-
tion concerns the representation of tendencies in OT, as for example the
preference for certain readings. One possibility might be that it can some-
how be captured by the number of constraint orders which are affected by
certain ties, since this is exactly how ambiguities predicted by the relation
0 < |SV(Q1) − SV(Q2)| < 1 are characterized. However, whether this ap-
proach would really work would have to be discussed in more detail.
Appendix
The ranking we finally assumed for the constraints S & I, S, I, and E was
(S & I » S o I) o E, which results in eight constraint orders if the ties are resolved
(cf. diagram (13)). This outcome can be predicted very easily if we assume
the following definition, which is a generalization of definition (12):
Generalization of definition (12):
(A1 » ... » An) o B :=   A1 o B » A2 » A3 » ... » An
                       ∨ A1 » A2 o B » A3 » ... » An
                       ∨ ...
                       ∨ A1 » A2 » A3 » ... » An o B
Example:
(D » A o B) o C
This is the underlying formal relation if A o B, A o C, B o C, D o C, but
D » A and D » B. If we apply the definition above, we get the following result:
(D » A o B) o C
(D » A » B) o C ∨ (D » B » A) o C
   D o C » A » B ∨ D o C » B » A
 ∨ D » A o C » B ∨ D » B o C » A
 ∨ D » A » B o C ∨ D » B » A o C
   D » C » A » B ∨ D » C » B » A
 ∨ C » D » A » B ∨ C » D » B » A
 ∨ D » A » C » B ∨ D » B » C » A
(∨ D » C » A » B ∨ D » C » B » A)
 ∨ D » A » B » C ∨ D » B » A » C
(∨ D » A » C » B ∨ D » B » C » A)
resulting constraint orders:
(i) C » D » A » B
(ii) C » D » B » A
(iii) D » A » B » C
(iv) D » A » C » B
(v) D » B » A » C
(vi) D » B » C » A
(vii) D » C » A » B
(viii) D » C » B » A
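The resolution of the example can be reproduced with a short Python sketch. It assumes, in line with the generalized definition, that tying B with a chain A1 » ... » An means tying B with each Ai in turn, and that resolving Ai o B places B immediately above or immediately below Ai. The function names `tie_chain_with` and `resolve_example` are hypothetical.

```python
def tie_chain_with(chain, b):
    """(A1 » ... » An) o B: tie B with each Ai in turn and resolve both ways."""
    orders = []
    for i in range(len(chain)):
        # Resolving Ai o B puts B directly above (pos = i) or below (pos = i + 1) Ai.
        for pos in (i, i + 1):
            order = chain[:pos] + (b,) + chain[pos:]
            if order not in orders:          # drop the duplicates shown in brackets
                orders.append(order)
    return orders

def resolve_example():
    """Resolve (D » A o B) o C into total constraint orders."""
    inner = [("D", "A", "B"), ("D", "B", "A")]   # the two resolutions of A o B
    result = set()
    for chain in inner:
        result.update(tie_chain_with(chain, "C"))
    return sorted(result)

for order in resolve_example():
    print(" » ".join(order))
```

Sorted alphabetically, the output lists the same eight orders as (i) to (viii) above.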
Notes
For comments and discussion I want to thank Fabian Heck, Gereon Müller, Tanja
Schmid, Wolfgang Sternefeld, Sten Vikner, and Ralf Vogel.
1. The distinction between (ii-a) and (ii-b) is only mentioned for completeness'
sake. It does not play any role in the further discussion, since the question of
how this difference can be expressed in an optimality-theoretic framework is not
addressed here.
2. If this constraint were translated back into Pafel's theory, it would correspond to
a factor with the weight 2, since it involves both properties S (weight 1) and I
(weight 1).
3. The question might arise of whether it is legitimate to restrict the competition
to the four constraints considered in T5. It is true that there are higher-ranked
constraints on which Qi and Q2 differ, namely E & X and any local conjunction
containing X and S or I, where X is a constraint that is violated by both candi-
dates. However, the cumulative character of the constraints ensures that (A & X)
» or o (B & X) ⇔ A » or o B. Thus, the outcome of a competition involving
the constraints A & X, B & X, A, and B does not change if A & X and B & X
are not taken into account.
4. It is not possible to provide a concrete example that only involves the two con-
straints S & I and S or I. These combinations are ruled out, because Pafel's pos-
tulation of the two contrasting factors EX-PRE and IN-PRE assures that one of
them is always involved. (The latter property is assigned to quantifiers in the
"Mittelfeld" that linearly precede other quantifiers.) But I think the general prob-
lem becomes clear nevertheless.
5. The situation in which A o B and B o C, but C » A must be excluded, is not as
unusual as it may seem at first sight. It also occurs, for example, in Müller (2000),
where it is assumed on the one hand (by transitivity) that A o C, but where on the
other hand C » A is excluded because of an underlying meta-constraint which
says that A must be higher-ranked than C.
6. The dotted lines in the tableaux indicate that two neighboring constraints X and
Y are tied, but that their corresponding weights are not equal.
7. The sentences (14) and (15) correspond to Pafel's examples (3.164') and (3.165).
8. Pafel assumes that wh-phrases are inherently focused (cf. Pafel 1998: 98).
References
Heck, Fabian
this volume Quantifier scope in German and cyclic optimization.
Legendre, Géraldine, Paul Smolensky and Colin Wilson
1998 When is less more? Faithfulness and minimal links in wh-chains. In: P.
Barbosa, D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (eds.), Is the
Best Good Enough?, 249-289. Cambridge, MA: MIT Press.
Müller, Gereon
1999 Optionality in optimality-theoretic syntax. GLOT International 4.5: 3-8.
Müller, Gereon
2000 Das Pronominaladverb als Reparaturphänomen. Linguistische Berichte
182: 139-178.
Pafel, Jürgen
1998 Skopus und logische Struktur. Studien zum Quantorenskopus im Deut-
schen. Technical Report 129, Arbeitspapiere des Sonderforschungsbereichs
340. Universität Tübingen.
Prince, Alan and Paul Smolensky
1993 Optimality Theory: Constraint interaction in generative grammar. Ms.,
Rutgers University & University of Colorado, Boulder. To appear as
Linguistic Inquiry Monograph, Cambridge, MA: MIT Press.
Richards, Norvin
1998 The principle of minimal compliance. Linguistic Inquiry 29: 599-629.
Smolensky, Paul
1995 On the internal structure of Con, the constraint component of UG. Ms.,
Johns Hopkins University.
Quantifier Scope in German and Cyclic Optimization
Fabian Heck
Standard Optimality Theory (OT) as developed by Prince & Smolensky
(1993) or McCarthy & Prince (1993) is based on the assumption that a gram-
matical structure S_i is derived in the following way: given a certain input I,
a function f first generates a (possibly infinite) set S_1 ... S_k of possible struc-
tures from I and then performs a computation of optimization to filter out all
suboptimal structures, leaving only S_i as the optimal output O. This is satis-
factory as long as it suffices to refer to two levels of representation. However,
in syntax it has often been argued that one needs more levels of representa-
tion, for example the levels of D-structure, S-structure, and Logical Form (cf.
Chomsky 1981). If OT is applicable to syntax at all, then the null hypothesis is
to assume that the computation f holds between all levels of representation. I
propose in this paper that this is indeed the case, and that the computation of
generation and optimization proceeds in a cyclic fashion. The hypothesis is
applied here to the description of relative quantifier scope in German.
1 Introduction
The goal of this paper is to account for the phenomenon of relative quantifier
scope in German. The discussion basically deals with sentence pairs of the
following type:
(1) a. Jeder hat einen Fehler gemacht
       everybody_NOM has one mistake_ACC made
    b. Einen Fehler hat jeder gemacht
       one mistake_ACC has everybody_NOM made
(1-b) is ambiguous. It can either have the meaning described in (2-a) or the
meaning described in (2-b):
(2) a. There exists one mistake x such that for every person y the follow-
       ing holds: y made x.
    b. For every person y there exists a mistake x such that y made x.
Interestingly, (1-a) only has the reading (2-b). Hence, the question is: when
does a sentence that contains two quantifiers have only one reading, and when
does it have two readings?
First of all, following May (1977), Stechow (1993), Heim & Kratzer (1997),
and others I assume that every meaning of a sentence is spelled out unambigu-
ously at the level of Logical Form.1 The relative scope of two quantifiers Q1
and Q2 is encoded by the relationship of c-command (following the definition
in Reinhart 1976): If Q1 c-commands Q2 at LF, then Q1 has scope over Q2.
Now, I think that there are two main observations that may lead to a prin-
cipled account of the given question: First, the relative quantifier scope in
German is highly dependent on the given S-structural configuration. That is,
if a quantifier Q1 c-commands another quantifier Q2 at S-structure (SS), then
Q1 will be able to c-command Q2 at LF as well. In other words, the mapping
from S-structure to LF is highly structure preserving (cf., for instance, Kiss
1999 for German, and Kroch 1974, Reinhart 1983, and McCawley 1999 for
English).
Second, it seems that the scope relations can be inverted on the way from
S-structure to LF if the derivation has involved S-structure movement. 2 This
means that whenever there is a quantifier at S-structure that does not fill its
D-structure position, the scope relations are destabilised and there may be an
accessible reading that does not correspond to the S-structural configuration.
Technically this will be spelled out by reconstructing the moved quantifier to
its base position.
The basic assumption about the transparent LF, together with the first ob-
servation, calls for the syntactic levels of S-structure and LF. The second
observation calls for the syntactic level of D-structure (DS).
Since I am using Optimality Theory to tackle the problem, I first want to
give a motivation for this decision: Often the data suggest that there are dif-
ferent principles at work that stand in conflict with each other, but that never-
theless are all needed. That is, even grammatical structures cannot fulfil every
constraint. We nevertheless need all constraints, and hence, constraints must
be violable. OT gives us the means to express the concept of a violable but
active constraint.
2 Cyclic Optimization
The strategy I will follow here is to reconcile the classical T-model of gram-
mar of Chomsky (1981) with the standard model of Optimality Theory of
Prince & Smolensky (1993) and McCarthy & Prince (1993). 3 The result of
this reconciliation will be an extended version of OT which will be referred
to as the model of Cyclic Optimization. Its basic characteristics are the fol-
lowing.
Starting with a kind of predicate-argument structure as input, a generator
GEN constructs a set of possible D-structures out of some "lexical" material. 4
This input defines the candidate set:
(3) Definition of candidate sets:
Two candidates C1 and C2 are in the same candidate set if and only if
they descend from the same predicate-argument structure and con-
sist of the same lexical material.
This set will then be optimized in the first cycle. The output will be an op-
timal D-structure DS_i. DS_i in turn will serve as input for the second cycle,
which starts with the generation of a set of possible S-structures, basically
using the transformation move-α. This set will again undergo the process of
optimization and the output will be an optimal S-structure SS_j. SS_j will be
the input for the last cycle. Again using move-α, a set of possible LFs will be
generated and one last time optimization will apply, resulting in an optimal
Logical Form LF_k. The whole computation can be seen in the diagram below:
[Diagram: First Cycle, Second Cycle, Third Cycle]
To put it in a nutshell: Optimal LFs are derived from optimal S-structures,
which in turn are derived from optimal D-structures. Optimization proceeds
cyclically, the different cycles being the syntactic levels of D-structure,
S-structure, and LF.
Another important property of the model is that it allows the reranking of
the constraints as soon as another cycle is entered. This will become impor-
tant when we examine the impact of one and the same constraint at different
levels of representation. As we will see, all the constraints will be present at
each cycle, but will be ranked in different ways depending on the level of
representation.
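The architecture just described, with three cycles that each generate and then optimize, and with a ranking that may change from cycle to cycle, can be sketched as follows. The Python code is a toy illustration only: the candidate strings, the generator `toy_gen`, and the two constraints X-FIRST and Y-FIRST are invented for the purpose and are not part of the analysis. Only the control flow (the optimal output of one cycle is the input of the next; the same constraints are present at every level but ranked differently) mirrors the model.

```python
def optimize(candidates, ranking, constraints):
    """Pick the candidate whose violation profile is best under a strict ranking."""
    return min(candidates,
               key=lambda cand: tuple(constraints[name](cand) for name in ranking))

def cyclic_optimization(inp, cycles, constraints):
    """Run the cycles in order; each optimal output feeds the next generator."""
    current, trace = inp, []
    for level, gen, ranking in cycles:
        current = optimize(gen(current), ranking, constraints)
        trace.append((level, current))
    return trace

# Hypothetical toy constraints, present at every cycle.
constraints = {
    "X-FIRST": lambda s: 0 if s.startswith("x") else 1,
    "Y-FIRST": lambda s: 0 if s.startswith("y") else 1,
}

def toy_gen(s):
    """Hypothetical generator: keep the input order or reverse it."""
    return [s, s[::-1]]

cycles = [
    ("DS", toy_gen, ["X-FIRST", "Y-FIRST"]),
    ("SS", toy_gen, ["Y-FIRST", "X-FIRST"]),   # reranked at the second cycle
    ("LF", toy_gen, ["X-FIRST", "Y-FIRST"]),   # reranked again at the third
]

trace = cyclic_optimization("yx", cycles, constraints)
print(trace)
```

Because the ranking is stated per cycle, one and the same constraint can be decisive at one level and inert at another.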
3 D-Structure
As I have already mentioned, scope inversion sometimes appears to be deriv-
able by LF-reconstructing a moved quantifier to its D-structure position. If
we want this prediction to be verifiable (or falsifiable), we first need an ex-
plicit theory of D-structure. This is so because we need to know the base
position of the quantifier Q in order to decide if reconstruction of Q to this
position can lead to scope inversion or not. Therefore, I will first introduce
some assumptions about sentence structure in German and then I will give a
(somewhat simplified) OT-account of German D-structure.
The assumptions are the following: 1. The subject is generated in SpecVP
(cf. Haider 1993). 2. Structural cases like nominative and accusative are as-
sociated with fixed positions. Nominative is assigned to SpecVP, accusative
is assigned to the sister of the verb. The dative is a lexical case in German and
can be freely adjoined within the verbal projection (both assumptions are due
to Vogel & Steinbach 1998). 3. Nominative will be assigned by the verb. 4.
Adjunction to a non-maximal projection is allowed.
This gives us the following structure projected by a ditransitive verb:
(4) [ VP IO [VP Subject [ v IO [ v DO [ v IO V ]]]]]
This means that the indirect object (IO) in (4) may principally occupy three
different positions at D-structure: it may be base adjoined above the subject,
between the subject and the direct object (DO), or below the direct object.
The claim I want to make is that its exact base position can be determined
by a process of D-structure optimization. The positions of the subject and the
DO are fixed, so the only variation in D-structure will come from the choice
of base adjoining the IO at different positions. This choice will give us the
basic or unmarked word order, where the term unmarked is to be understood
in the sense of Höhle (1982). As we will see, the unmarked word order is not
dependent on the verb (contra Haider 1992, Haider 1993) nor is it the result
of S-structure optimization (contra Müller 1999), but in this approach it is the
result of D-structure optimization.
3.1 Constraints
I will now introduce the first constraints, partially following Abraham (1986),
Hoberg (1981), Lenerz (1977), Stowell (1981), Uszkoreit (1986), and Müller
(1999). These three constraints will provide us with the traces we need to
derive scope inversion by reconstruction. This is the first step to linking rel-
ative scope, an LF-phenomenon, to basic word order, which is a property of
D-structure. The first constraint is the following:
(5) Constraint of Animacy (ANIM)
If α and β are arguments, α [+animate] and β [−animate], then α
precedes β.
Evidence for ANIM is given by the following examples, in which in the un-
marked case the animate argument always precedes the inanimate argument
(see the a-examples). If the order is reversed as in the b-examples, the result
is marked: 5
(6) a. daß er der Mutter das Sorgerecht entzogen hat
       that he the mother_DAT the custody_ACC withdrawn has
    b. ?daß er das Sorgerecht der Mutter entzogen hat
       that he the custody_ACC the mother_DAT withdrawn has
(7) a. daß er das Kind dem Einfluß entzogen hat
       that he the child_ACC the influence_DAT withdrawn has
    b. ?daß er dem Einfluß das Kind entzogen hat
       that he the influence_DAT the child_ACC withdrawn has
(8) a. daß einem Patienten ein Medikament geholfen hat
       that a patient_DAT a medicine_NOM helped has
    b. ?daß ein Medikament einem Patienten geholfen hat
       that a medicine_NOM a patient_DAT helped has
(9) a. daß Jakob einem Kind ein Märchen erzählt hat
       that Jacob a child_DAT a tale_ACC told has
    b. ?daß Jakob ein Märchen einem Kind erzählt hat
       that Jacob a tale_ACC a child_DAT told has
The second constraint is called the
(10) Constraint of Agentivity (AGENT)
If α and β are arguments and α bears the θ-role agent, then α pre-
cedes β.
What is particularly interesting here is that AGENT has not had any impact
in the examples so far. But AGENT comes into play as soon as ANIM is kept
constant:
(11) a. daß ein Blauhelm einem Flüchtling geholfen hat
        that a UN-soldier_NOM a refugee_DAT helped has
     b. ?daß einem Flüchtling ein Blauhelm geholfen hat
        that a refugee_DAT a UN-soldier_NOM helped has
This characteristic property is called emergence of the unmarked (cf.
McCarthy & Prince 1994). It means that a constraint that is inactive in many
cases suddenly becomes active. In OT, this property of constraints is expected be-
cause a constraint C1 might be overridden by another constraint C2 that is
higher ranked than C1. But as soon as C2 does not have an effect anymore for
independent reasons, C1 becomes relevant.6
All this suggests that the partial ranking between the two constraints here
is ANIM » AGENT. The third constraint is the
(12) Constraint of Adjacency (ADJA)
If α and β are arguments, α bearing structural case and β bearing
lexical case, then α is closer to the case assigning verb than β.
In a sense, the evidence for ADJA shows the same characteristics as the ex-
amples cited as evidence for AGENT: this time, if ANIM and AGENT both are
kept constant, the effects of ADJA can emerge:7
(13) a. daß er einem Beispiel eine Nummer zugeordnet hat
        that he an example_DAT a number_ACC assigned has
     b. ?daß er eine Nummer einem Beispiel zugeordnet hat
        that he a number_ACC an example_DAT assigned has
(14) a. daß er einem Arzt einen Patienten zugeteilt hat
        that he a doctor_DAT a patient_ACC assigned has
     b. ?daß er einen Patienten einem Arzt zugeteilt hat
        that he a patient_ACC a doctor_DAT assigned has
As a consequence the complete hierarchy so far is ANIM » AGENT »
ADJA.8
The conclusion is that the a-examples are unmarked D-structures which
allow for maximal Focus Projection (cf. Höhle 1982). The b-examples are
marked and hence they must have been derived by S-structure movement.
3.2 Analysis
We now turn to the explicit computations, which are shown in the OT-tables
below. Optimal candidates are indicated by the pointing hand ☞.
The tables in (15) and (16) show different unmarked orders for the examples
in (6) and (7) respectively, in which the main reason for different word order
is a difference in animacy:
(15) daß er der Mutter das Sorgerecht entzogen hat
     that he the mother_DAT the custody_ACC withdrawn has
     Input: entzieh-(er, mutter, sorgerecht)
     Candidates                               ANIM   AGENT   ADJA
     ☞ C1: der Mutter ... das Sorgerecht
        C2: das Sorgerecht ... der Mutter     *!             *
(16) daß er die Kinder dem Einfluß entzogen hat
     that he the children_ACC the influence_DAT withdrawn has
     Input: entzieh-(er, einfluß, kinder)
     Candidates                               ANIM   AGENT   ADJA
     ☞ C1: die Kinder ... dem Einfluß                        *
        C2: dem Einfluß ... die Kinder        *!
Examples like these show very clearly that basic word order can not be totally
dependent on the verb. In both cases we face the same verb. However, in one
case the basic word order is direct object before indirect object, whereas in
the other example it is the inverse.
The tables in (17) and (18) show the analysis of the examples in (8) and
(11) respectively:
(17) daß einem Patienten ein Medikament geholfen hat
     that a patient_DAT a medicine_NOM helped has
     Input: helf-(medikament, patient)
     Candidates                                    ANIM   AGENT   ADJA
     ☞ C1: einem Patienten ... ein Medikament             *
        C2: ein Medikament ... einem Patienten     *!             *
(18) daß ein Blauhelm einem Flüchtling geholfen hat
     that a UN-soldier_NOM a refugee_DAT helped has
     Input: helf-(blauhelm, flüchtling)
     Candidates                                    ANIM   AGENT   ADJA
     ☞ C1: ein Blauhelm ... einem Flüchtling                      *
        C2: einem Flüchtling ... ein Blauhelm             *!
As can be seen, the indirect object occurs to the left of the subject in one case
but to the right of the subject in the other case. This is due to a difference in
animacy of the arguments, and the emergence of AGENT.
Finally, we see an example of what happens if even agentivity is neutral-
ized:
(19) daß er einem Beispiel eine Nummer zugeordnet hat
     that he an example_DAT a number_ACC assigned has
     Input: zuordn-(er, beispiel, nummer)
     Candidates                                ANIM   AGENT   ADJA
     ☞ C1: einem Beispiel ... eine Nummer
        C2: eine Nummer ... einem Beispiel                    *!
Thus, we can finish with the first cycle. The optimal D-structure will now be
the input for the next cycle, the S-structure generation and optimization. 9
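The first cycle can be reproduced with a small Python sketch of the ranking ANIM » AGENT » ADJA. Several simplifications are assumptions of the sketch rather than claims of the analysis: only the two arguments listed in each tableau are compared, "closer to the verb" is equated with "further to the right" in the verb-final clause, the feature values (including the agent role for ein Medikament) are entered by hand, and the names `Arg`, `pairs`, and `best_order` are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations

@dataclass(frozen=True)
class Arg:
    form: str
    animate: bool
    agent: bool
    structural: bool   # True for nominative/accusative, False for dative

def pairs(order):
    """All (earlier, later) pairs of arguments in a linear order."""
    return [(a, b) for i, a in enumerate(order) for b in order[i + 1:]]

def anim(order):    # an inanimate argument precedes an animate one
    return sum(1 for a, b in pairs(order) if not a.animate and b.animate)

def agent(order):   # a non-agent precedes the agent
    return sum(1 for a, b in pairs(order) if b.agent and not a.agent)

def adja(order):    # a structural-case argument is further from the verb than a lexical one
    return sum(1 for a, b in pairs(order) if a.structural and not b.structural)

RANKING = (anim, agent, adja)   # ANIM » AGENT » ADJA

def best_order(args):
    """Optimal base order and its violation profile under strict domination."""
    def profile(order):
        return tuple(constraint(order) for constraint in RANKING)
    winner = min(permutations(args), key=profile)
    return [a.form for a in winner], profile(winner)

ex6 = [Arg("das Sorgerecht", False, False, True), Arg("der Mutter", True, False, False)]
ex7 = [Arg("dem Einfluß", False, False, False), Arg("das Kind", True, False, True)]
ex8 = [Arg("ein Medikament", False, True, True), Arg("einem Patienten", True, False, False)]
ex11 = [Arg("einem Flüchtling", True, False, False), Arg("ein Blauhelm", True, True, True)]
ex13 = [Arg("eine Nummer", False, False, True), Arg("einem Beispiel", False, False, False)]

for example in (ex6, ex7, ex8, ex11, ex13):
    print(best_order(example))
```

Comparing profiles as tuples implements strict domination: a single violation of a higher-ranked constraint outweighs any number of violations of lower-ranked ones.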
4 S-Structure
I will first clarify the basic assumptions about S-structure that are made here:
1. Topicalization is semantically empty movement. It is triggered by some
need for clause typing. 2. Scrambling may be semantically empty if it is trig-
gered by information structural needs (e.g., align focus to the right). But it
may also be semantically relevant if it is triggered in order to gain scope.
4.1 Constraints
The next constraint I will adopt is the economy constraint ECON, which pro-
hibits movement (cf. Chomsky 1995, Grimshaw 1997).
(20) Economy (ECON)

Movement is not allowed.
Since there is ECON, a trigger for S-structure movement is needed. In the case
of scope induced scrambling this will be formalised by stipulating an abstract
scope marker Q which is generated at D-structure and which c-commands
the base position of the quantifier it is supercoindexed with (coindexation
meaning that the quantifier and the scope marker share the same scope): 10
(21) [α Q^i [α ... Q1 ... [β ... Q2^i ... ]]] (D-structure)
If Q can be generated freely, this may lead to over-generation. This problem

will be addressed later, when we have a better understanding of what kind of
Q-insertion should be allowed and what kind of insertion should be blocked.
It is clear that at least at LF, every quantifier has to be near its scope marker
- if there is one - because it is at LF that the scope is interpreted. However,
in German it seems as if this already has to happen at S-structure (recall the
first main observation from the introduction):
(22) [α [Q Q2^i Q^i ] [α ... Q1 ... [β ... t2 ... ]]] (S-structure/LF)
To force the quantifier to move up to its scope marker we need the next con-
straint, which follows quite naturally:
(23) Scope Principle (SP)

Every quantifier has to be adjoined to the scope marker it is coin-
dexed with.
This partially explains why the relative scope is highly dependent on S-
structure in German. But it cannot be the whole story. Expressing scope re-
lations at S-structure does not guarantee that these scope relations will be
preserved at LF. We therefore will adopt a constraint which is due to Beck
(1996):
(24) Quantifier Induced Barrier (QUIB)

Movement across a scope bearing element is prohibited.
Of course, this constraint should only show its effects at LF because at S-
structure we do have movement across scope bearing elements. In this sense
QUIB is an LF constraint (and as such it was introduced by Sigrid Beck).
The reason why I mention it here is twofold: first, to complete
the explanation of the S-structure sensitivity of relative scope in German, and
second, to demonstrate that one and the same constraint can have different
impacts on different levels of representation, the difference being a result of
different rankings. That is, at S-structure we have the partial ranking SP »
QUIB, whereas at LF we have QUIB » SP.
But now back to S-structure. To assure that scope scrambling will not be
prohibited by economy we define the partial ranking SP » ECON, ANIM »
AGENT » ADJA. The D-structure constraints are ranked sufficiently low to
guarantee that they have no influence on this level of representation.11
Other types of S-structure movement can be derived in a similar fashion.
Topicalization will be derived by defining a constraint that forces the Top
position 12 to be filled in main clauses (for reasons of clause typing; cf., for
instance, Cheng 1991):
(25) Principle of Clause Typing (TYPE)

Every clause has to be typed.
It is a well-known fact that focused elements tend to go to the right edge of

the sentence (in right branching languages). Scrambling that applies in order
to align focus (whatever the ultimate reason for this may be; cf., for instance,
Samek-Lodovici 1997, Cinque 1993, Reinhart 1997, or Büring, this volume)
is due to the following principle:
(26) Align Focus (ALIGN)

Focus marked arguments have to be aligned to the right periphery
of the sentence.
The idea is that an unfocused constituent which separates a focused con-

stituent from the right edge of the sentence is supposed to move leftwards in
order to get the focused constituent into the rightmost position. This move-
ment will be called anti-focus scrambling.
TYPE and ALIGN must outrank ECON, because we see that they do apply
at S-structure. TYPE must also outrank ALIGN, because presumably focused
elements can be topicalized.13 It seems as if they both also outrank SP. We
therefore get the following partial ranking: TYPE » ALIGN » SP » ECON.
4.2 Analysis
Since the main interest here is relative scope at LF and not S-structure move-
ment, I shall only briefly discuss the consequences of this ranking. The input
for the computation will be optimal D-structures - the output of the previous
cycle.
First of all it is clear that if there is a scope marker, then the quantifier which
is coindexed with it has to move in order to fulfil SP:
(27) daß Jakob ein Märchen jedem Kind erzählt hat
     that Jacob one tale_ACC every child_DAT told has
     Input_DS: daß Jakob Q^1 jedem Kind [ein Märchen]^1 erzählt hat
     Candidates                                      TYPE   SP    ECON
     ☞ C1: [ein Märchen_1 Q^1] ... jedem ... t_1                  *
        C2: Q^1 ... jedem ... ein Märchen^1                 *!
On the other hand, if there is no scope marker present, then scrambling will
result in a fatal violation of economy:
(28) daß Jakob jedem Kind ein Märchen erzählt hat
     that Jacob every child_DAT a tale_ACC told has
     Input_DS: daß Jakob jedem Kind ein Märchen erzählt hat
     Candidates                                  TYPE   SP    ECON
     ☞ C1: jedem ... ein Märchen
        C2: ein Märchen_1 ... jedem ... t_1                   *!
If we have an object that is both topic marked and scope marked, then on
the one hand it has to move to the Top position in order to satisfy TYPE, but
on the other hand it has to move to its scope marker. Since we assumed that
TYPE » SP at S-structure, the object will be topicalized:14
(29) Ein Märchen hat Jakob jedem Kind erzählt
     one tale_ACC has Jacob every child_DAT told
     Input_DS: –Top daß J. Q^1 jedem Kind [+top ein Märchen]^1 erzählt hat
     Candidates                                         TYPE   SP    ECON
     ☞ C1: ein Märchen^1_1 ... Q^1 ... jedem ... t_1           *     *
        C2: [Q ein Märchen_1 Q^1 ] ... jedem ... t_1    *!           *
        C3: –Top ... Q^1 ... jedem ... ein Märchen^1    *!     *
More or less the same holds for ALIGN. For reasons of space I will only show
the result of the complex situation in which there is a conflict between scope
marking and focus alignment:15
(30) daß einem Flüchtling jeder BLAUhelm geholfen hat
     that one refugee_DAT every UN-soldier_NOM helped has
     Input_DS: Q^1 [F jeder BLAUhelm]^1 einem Flüchtling geholfen hat
     Candidates                                                  ALIGN   SP    ECON
     ☞ C1: Q^1 ... einem Flüchtling_1 ... [F jeder]^1 ... t_1            *     *
        C2: [Q [F jeder]_2 ] ... t_2 ... einem Flüchtling        *!            *
        C3: [Q [F jeder]_2 ] einem Flüchtling_1 ... t_1          *!            **
        C4: Q^1 ... [F jeder]^1 ... einem Flüchtling             *!      *
Candidate C1 wins. It dispenses with scope driven movement in order to align
focus. However, this conclusion is merely theoretical because some kind of
scope driven movement can apply later at the level of LF (but only due to a
conspiracy of two constraints, as we shall see). We know this, because (30)
has the inverted reading.
There is still another candidate, C5, that does better than C1. It is a candi-
date that raises the subject to its scope marker and scrambles the object even
further than this position in order to align focus:
(31) daß einem Flüchtling jeder BLAUhelm geholfen hat
     that one refugee_DAT every UN-soldier_NOM helped has
     Input_DS: Q^1 [F jeder BLAUhelm]^1 einem Flüchtling geholfen hat
     Candidates                                                ALIGN   SP    ECON
     C5: einem Flüchtling_1 ... [Q [F jeder]_2 ] t_2 t_1                     **
As we shall see later, the scope marker in (31) stands in an improper posi-
tion and therefore violates a constraint designed to avoid the proliferation of
improper scope marker insertion. We will come back to this constraint later.
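The competition in (30) and (31) can be replayed numerically. The Python sketch below is illustrative only: the violation counts are one possible reading of the tableaux (C3 is left out), and IMPROPER-Q is a placeholder name, invented here, for the constraint against improper scope marker insertion that is only announced at this point.

```python
def rank(candidates, ranking):
    """Order candidates from best to worst under a strict constraint ranking."""
    def profile(name):
        return tuple(candidates[name].get(con, 0) for con in ranking)
    return sorted(candidates, key=profile)

# Assumed violation counts for the S-structure candidates of (30) and (31).
candidates = {
    "C1": {"SP": 1, "ECON": 1},           # object scrambled, no scope movement
    "C2": {"ALIGN": 1, "ECON": 1},        # subject raised, focus not aligned
    "C4": {"ALIGN": 1, "SP": 1},          # no movement at all
    "C5": {"ECON": 2, "IMPROPER-Q": 1},   # both moved, scope marker misplaced
}

# With ALIGN » SP » ECON alone, C5 beats C1.
print(rank(candidates, ["ALIGN", "SP", "ECON"]))

# With the placeholder constraint ranked on top (an assumption), C1 wins.
print(rank(candidates, ["IMPROPER-Q", "ALIGN", "SP", "ECON"]))
```

The comparison shows why an additional constraint is needed: without it, the two ECON violations of C5 are cheaper than the single SP violation of C1.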
To sum up: S-structure is the level at which the relative scope is determined
in most cases in German. This means that a quantifier moves to its scope
position. The exceptions are cases where movement which is triggered by
information structure outranks scope movement and delays it until LF.
5 Logical Form
We now turn to the last cycle. The basic assumptions are: 1. The verb is in-
terpreted as an open proposition (following Nohl & Stechow 1995).16 This
means that there are argument variables which are generated directly at
the verb. As a consequence there is no type driven QR (contra May 1977,
May 1985), but quantifiers can be interpreted in situ. 2. Semantically empty
movement is obligatorily reconstructed at LF.
In addition to QUIB I will introduce two further constraints that show their
effects at LF. Together with the first two cycles these constraints will serve
as a means to derive some empirical facts about relative scope in German as
they have been noted by Pafel (1997).
5.1 Constraints
It is well known that some quantifiers tend to take wide scope as a mere
lexical property (cf., for instance, Milsark 1974 and Pafel 1997). In German
these quantifiers are jeder, mancher, and die meisten, and they will be referred
to as the strong quantifiers, as in the following constraint:
(32) Quantifier Raising (QR)

Adjoin a strong quantifier somewhere above its base position in the
tree.
Together with the next constraint, QR will be responsible for some instances
of scope inversion. The next constraint is based on the assumption that se-
mantically empty movement is reconstructed at LF. Reconstruction is to be
understood here as syntactic lowering that violates economy.
(33) Reconstruction (REC)

Movement is to be undone.
I propose that the following LF hierarchy holds between the constraints met
so far: QUIB » SP » REC, QR » ECON, TYPE, ALIGN.17 QUIB is ranked
above SP because there is no movement over scope bearing elements at LF
in German. SP is above REC and QR to ensure that interpretable movement
will not be undone at LF. REC and QR are above ECON for those operations
to be applicable at all. Finally, REC is above TYPE and ALIGN in order to
undo semantically empty movement.
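How such a hierarchy selects a winner can be illustrated with a small sketch. It is only an illustration under stated assumptions: the names `profile` and `optimal` and the two toy candidates are hypothetical, and violations of constraints that are unranked with respect to each other (separated by a comma in the hierarchy) are simply pooled, which is one common way to treat ties and is not claimed to be the treatment intended in the text.

```python
from typing import Dict, List, Tuple

# LF hierarchy as given in the text: QUIB » SP » REC, QR » ECON, TYPE, ALIGN.
# Each inner list is one stratum of mutually unranked constraints.
LF_HIERARCHY: List[List[str]] = [
    ["QUIB"],
    ["SP"],
    ["REC", "QR"],
    ["ECON", "TYPE", "ALIGN"],
]

Violations = Dict[str, int]


def profile(violations: Violations, hierarchy: List[List[str]]) -> Tuple[int, ...]:
    """Return one violation count per stratum, highest stratum first.

    Assumption: violations of tied constraints are summed within a stratum.
    """
    return tuple(sum(violations.get(c, 0) for c in stratum) for stratum in hierarchy)


def optimal(candidates: Dict[str, Violations], hierarchy: List[List[str]]) -> List[str]:
    """Return all candidates whose profile is lexicographically minimal."""
    best = min(profile(v, hierarchy) for v in candidates.values())
    return [name for name, v in candidates.items() if profile(v, hierarchy) == best]


# Hypothetical toy competition: a quantifier has moved to its scope marker.
# Undoing that movement violates SP; leaving it in place violates REC.
toy = {
    "undo interpretable movement": {"SP": 1, "ECON": 1},
    "keep interpretable movement": {"REC": 1},
}
print(optimal(toy, LF_HIERARCHY))  # ['keep interpretable movement']
```

Because SP outranks REC, the candidate that keeps the interpretable movement wins even though it violates REC, which is the effect the ranking argument above describes.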
188 Fabian Heck
I am now going to present some data about relative scope in German and
then we will see how the proposed constraints can account for them.
5.2 Scope Inversion by Reconstruction
5.2.1 Reconstructing topicalized quantifiers
It is argued in the literature that topicalized quantifiers often behave as if
reconstructed to their base position (cf., for instance, Beck 1996, Büring 1996,
Höhle 1991, Frey 1993, and Pafel 1997).
(34) a. Ein Märchen hat er allen Kindern erzählt
        one taleAcc has he all childrenDAT told
     b. Einem Kind haben alle geholfen
        one childDAT have allNOM helped
Both examples have an inverted reading besides the reading that corresponds
to the S-structural configuration. Scope inversion follows if (34-a,b) are in-
stances of the following scheme:18
(35) a. [ ... Q_1 ... [ ... Q_2 ... [ ... t_1 ... ]]] (S-structure)
     b. [ ... pre_1 ... [ ... Q_2 ... [ ... Q_1 ... ]]] (Logical Form)
The topicalized quantifier is reconstructed to its base position. However,
sometimes inversion does not seem to be possible:
(36) a. Ein Schüler hat alle Kinder verprügelt

one schoolboyNOM has all childrenAcc thrashed
b. Einem Kind hat er alle Märchen erzählt
one childDAT has he all talesAcc told
This follows if (36-a,b) are instances of the scheme in (37):
(37) a. [ ... Q_1 ... [ ... t_1 ... [ ... Q_2 ... ]]] (S-structure)
     b. [ ... pre_1 ... [ ... Q_1 ... [ ... Q_2 ... ]]] (Logical Form)
So, if we assume that reconstruction of topicalized constituents is obligatory,
then the difference between the two kinds of examples must be due to different
base positions. In one case the target of reconstruction is below the
embedded quantifier and inversion is possible; in the other, it is above the
embedded quantifier and inversion is impossible. D-structure optimization
provides us with the appropriate target positions.
5.2.2 Reconstruction in the Mittelfeld
There is no agreement on whether there is reconstruction in the Mittelfeld.
Whereas Frey (1993) reconstructs freely in the Mittelfeld, Beck (1996), Büring
(1996), and Höhle (1991) claim that reconstruction in the Mittelfeld is
impossible.
I want to claim that the truth is somewhere in between the two positions:
Reconstruction is possible if scrambling is semantically empty, for instance
if it has applied for information structural reasons. In the following examples
a question precedes the relevant sentence in order to set up a context which
suggests that scrambling has applied in order to align focus:
(38) a. Für wen gilt, daß er eine Fuge spielen kann?
        for whom holds that heNOM one fugueAcc play can
     b. Ich glaube, daß eine Fuge_1 [F fast jeder PianIST ] t_1
        I believe that one fugueAcc almost every pianistNOM
        spielen kann
        play can
(39) a. Wen hat er einigen schlechten Einflüssen entzogen?
        whoAcc has he some bad influencesDAT withdrawn
     b. Ich glaube, daß er einigen Einflüssen_1 schon [F fast jedes
        I believe that he some influencesDAT PART almost every
        KIND ] t_1 entzogen hat
        childAcc withdrawn has
(40) a. Wer hat einem Flüchtling geholfen?
        whoNOM has one refugeeDAT helped
     b. Ich glaube, daß einem Flüchtling_1 schon [F fast jeder
        I believe that one refugeeDAT PART almost every
        BLAUhelm ] t_1 geholfen hat
        UN-soldierNOM helped has
I think that in these examples reconstruction, and hence inversion, is indeed
possible.19 However, if scrambling has applied in order to enlarge the relative
scope of a quantifier, this movement should not be reconstructable. These are
exactly the cases where movement has not applied for information structural
needs.
5.3 Quantifier Raising
Now, what is the use of strong quantifiers that should be raised by QR? Re-
member that QUIB was introduced to account for the surface orientation of
German relative scope. Since QUIB outranks QR, it seems as if QR could
never apply.
The answer to this question is based on the following observation: Sometimes
inversion seems to be possible only if movement is involved and a strong
quantifier is also present:
(41) a. Ein Schüler_1 hat t_1 alle Kinder verprügelt
        one schoolboyNOM has all childrenAcc thrashed
     b. daß ein Schüler alle Kinder verprügelt hat
        that one schoolboyNOM all childrenAcc thrashed has
(42) a. Einem Kind_1 hat Jakob t_1 alle Märchen erzählt
        one childDAT has Jacob all talesAcc told
     b. daß Jakob einem Kind alle Märchen erzählt hat
        that Jacob one childDAT all talesAcc told has
(43) a. Ein Schüler_1 hat t_1 allen Kindern geholfen
        one schoolboyNOM has all childrenDAT helped
     b. daß ein Schüler allen Kindern geholfen hat
        that one schoolboyNOM all childrenDAT helped has
In the examples (41)-(43) inversion is impossible in both the main clauses
and the embedded clauses. The main clauses lack a strong quantifier, but
movement has applied. The embedded clauses have neither a strong quantifier
nor a moved constituent. Now contrast this with the following examples:
(44) a. Ein Schüler_1 hat t_1 jedes Kind verprügelt
        one schoolboyNOM has each childAcc thrashed
     b. daß ein Schüler jedes Kind verprügelt hat
        that one schoolboyNOM each childAcc thrashed has
(45) a. Einem Kind_1 hat Jakob t_1 jedes Märchen erzählt
        one childDAT has Jacob each taleAcc told
     b. daß Jakob einem Kind jedes Märchen erzählt hat
        that Jacob one childDAT each taleAcc told has
(46) a. Ein Schüler_1 hat t_1 jedem Kind geholfen
        one schoolboyNOM has each childDAT helped
     b. daß ein Schüler jedem Kind geholfen hat
        that one schoolboyNOM each childDAT helped has
(44)-(46) are completely analogous except that the weak quantifier all- has
been replaced by the strong quantifier jed-. Now inversion is possible in the
main clauses, which involve topicalization, but still impossible in the embedded
clauses! So it seems that movement may destabilise the scope configuration,
resulting in inversion if there is additionally a strong quantifier present
that takes advantage of the destabilisation.
Now, the claim is that inversion is created by the following conspiracy of
REC and QR, and by movement-induced destabilisation of the structure. First
the strong embedded quantifier is raised by QR across the base position of
the topicalized quantifier. Then the topicalized quantifier is reconstructed to
its base position. Inversion is the result:
(47) pre_1 [VP jedes Kind_2 [VP ein Schüler_1 t_2^LF verprügelt hat ]] (LF)
The only additional stipulation I have to make is that QUIB is blind to downward
movement. In other words, it is allowed to reconstruct across a scope
bearing element at LF, but it is not allowed to raise across such an element.
This is in accordance with speculations about QUIB (in non-negative cases) in
Beck (1995).
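The direction sensitivity of QUIB can be sketched as a simple counting function. This is a minimal illustration, not the author's formalization: the record type `LFMove` and its fields are hypothetical, and whether a step "crosses a scope bearing element" is taken as given rather than computed from a tree.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LFMove:
    """One LF movement step (hypothetical record, for illustration only)."""
    item: str
    direction: str               # "up" for raising, "down" for reconstruction
    crosses_scope_bearer: bool   # does the step cross a scope bearing element?


def quib_violations(moves: List[LFMove]) -> int:
    """Count QUIB violations; downward movement is ignored, as stipulated."""
    return sum(1 for m in moves if m.direction == "up" and m.crosses_scope_bearer)


# Raising a quantifier across another quantifier is penalised ...
raising = [LFMove("jedes Kind", "up", True)]
# ... whereas reconstructing across it is not.
lowering = [LFMove("ein Schüler", "down", True)]
# The derivation behind (47): QR crosses only a trace, then reconstruction
# crosses the raised quantifier in the downward direction.
derivation_47 = [
    LFMove("jedes Kind", "up", False),
    LFMove("ein Schüler", "down", True),
]

print(quib_violations(raising), quib_violations(lowering), quib_violations(derivation_47))
# 1 0 0
```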
5.4 Analysis
I will now present the concrete computation. But first some remarks about the
tables: 1. This time the input consists of S-structures that contain the relevant
S-structure traces that must be there according to D-structure optimization
and S-structure movement. 2. The LF-structures in the tables contain only LF-
traces for reasons of space and readability. 3. The scope marker Q will count
as relevant lexical material with respect to the definition of the candidate
sets.20
5.4.1 Topicalization
Topicalization of the direct object
(48) Mindestens einen Fehler hat jeder gemacht
     at-least one mistakeAcc has everybodyNOM made
Input_SS: Mindestens einen Fehler_1 hat [ jeder t_1 gemacht ]

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... jeder_2 ... mindestens_1                                 **
  C2: pre_1 ... jeder ... mindestens_1                              *!   *
  C3: mindestens ... jeder                                     *!   *
  C4: mindestens ... jeder_2 ... t_2^LF                        *!        *
  C5: jeder_2 ... mindestens ... t_2^LF             *!         *         *
(48) shows how inversion is derived by reconstruction of the topicalized
quantifier. C2 loses because it does not QR its strong quantifier. However, the
difference is merely theoretical because it does not affect the relative scope.
Neither C3 nor C4 reconstructs, and therefore both incur a fatal violation of REC.
C5 is ungrammatical because it tries to derive inversion by raising across a
scope bearing element. Of course, the reading which corresponds to the
S-structure configuration must also be derivable:
(49) Mindestens einen Fehler hat jeder gemacht
     at-least one mistakeAcc has everybodyNOM made
Input_SS: Mindestens einen Fehler_1 hat [ Q [ jeder t_1 gemacht ]]
Candidates QUIB SP REC QR ECON
F *
E®" CL pre|mini Q] ... jeder 2 ... tir
c 2 preQ... jeder ... min'i *! * *
F * **
C 3 pre\... [mini Q]... jeder ... t^ *!
F **
C 4 pre 1... jeder 2 ... [mini Q\... tif *!
l *
C 5 min'i... Q ... jeder 2 ... t ^ *! *
This is achieved by inserting a scope marker. At S-structure TYPE outranks
SP, therefore we topicalize. But at LF the ranking is reversed, so the topicalized
quantifier gets reconstructed to its scope marker. Since Q is present in (49)
but not in (48), these two examples are in different candidate sets. This is
important because if they were in the same set, then the optimal candidate in
the set without scope markers would block the optimal candidate in the set
with scope markers, since the first one exhibits one less violation of economy.
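The role of candidate sets can be illustrated with a short sketch. The two entries below are hypothetical stand-ins for the winners of the two competitions; the economy count of the Q-less winner follows tableau (48), while the count for the winner with Q is an assumption derived only from the remark that it has one more violation of economy. The sketch further assumes that both candidates tie on all higher-ranked constraints, so that only ECON matters.

```python
from collections import defaultdict

# Hypothetical records: presence of Q and number of ECON violations.
candidates = [
    {"name": "inverted reading (no Q)", "has_Q": False, "ECON": 2},
    {"name": "surface reading (with Q)", "has_Q": True, "ECON": 3},  # assumed count
]


def winners_per_set(cands):
    """Group candidates by lexical material (here: presence of Q), then
    pick the candidate(s) with the fewest ECON violations in each set."""
    sets = defaultdict(list)
    for c in cands:
        sets[c["has_Q"]].append(c)
    result = []
    for members in sets.values():
        best = min(c["ECON"] for c in members)
        result += [c["name"] for c in members if c["ECON"] == best]
    return sorted(result)


def winners_single_set(cands):
    """What would happen if Q did not count as lexical material."""
    best = min(c["ECON"] for c in cands)
    return [c["name"] for c in cands if c["ECON"] == best]


print(winners_per_set(candidates))     # both readings survive
print(winners_single_set(candidates))  # only the Q-less candidate survives
```

With separate candidate sets both readings are generated; in a single set the Q-less candidate would block the other one, which is the unwanted outcome described above.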
The same point can be made with a weak quantifier (the example with Q
which derives the S-structure reading is omitted for reasons of space):
(50) Mindestens einen Fehler haben alle Studenten gemacht
     at-least one mistakeAcc have all studentsNOM made
Input_SS: Mindestens einen Fehler_1 haben [ alle Studenten t_1 gemacht ]

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... alle ... mindestens_1                                    *
  C2: pre_1 ... alle_2 ... t_2^LF ... mindestens_1                       *!*
  C3: mindestens ... alle                                      *!
  C4: alle_2 ... mindestens ... t_2^LF              *!         *         *
Again, a candidate that reconstructs wins. This time, however, QR of the embedded
quantifier is not licensed (since it is a weak quantifier), and therefore
C2 fatally violates economy. Candidates that do not reconstruct or that raise
the embedded quantifier across the topicalized one are ill formed because of
fatal violations of REC and QUIB respectively (see C3 and C4).
The same holds for two quantifiers in object position if the one that is more
deeply embedded is topicalized (the example with Q is again omitted):
(51) Mindestens ein Märchen hat Jakob allen Kindern erzählt
     at-least one taleAcc has JacobNOM all childrenDAT told
Input_SS: Mindestens ein Märchen_1 hat [ Jakob allen Kindern t_1 erzählt ]

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... alle ... mindestens_1                                    *
  C2: pre_1 ... alle_2 ... t_2^LF ... mindestens_1                       *!*
  C3: mindestens ... alle                                      *!
  C4: mindestens ... alle_2 ... t_2^LF                         *!        *
  C5: alle_2 ... mindestens ... t_2^LF              *!         *         *
Of course, the analogue of the candidate C4 in (51) could have been listed in
table (50) as well. But it is ill formed anyway since it does not reconstruct
and string vacuously raises the weak embedded quantifier, causing another
violation of economy.
Topicalization of the subject
We now come to a trickier derivation of scope inversion that has already
been mentioned. (52) demonstrates how inversion can be derived by the conspiracy
of REC and QR (see C1):
(52) Mindestens ein Mann liebt jede Frau
     at-least one manNOM loves every womanAcc
Input_SS: Mindestens ein Mann_1 liebt [ t_1 jede Frau ]

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... jede_2 ... mindestens_1 ... t_2^LF                       **
☞ C2: pre_1 ... mindestens_1 ... jede_2 ... t_2^LF                       **
  C3: jede_2 ... mindestens_1 ... t_2^LF            *!         *         *
  C4: mindestens_1 ... jede_2 ... t_2^LF                       *!        *
  C5: mindestens_1 ... jede                                    *!   *
First the quantifier that remained unmoved at S-structure is raised across the
S-structure trace of the topicalized quantifier. Then reconstruction of the
topicalized quantifier can apply (recall that QUIB is not sensitive to
reconstruction). The S-structure reading is derivable by raising the strong
quantifier just string vacuously such that its target position still remains
below the target position of reconstruction (see C2).21
However, without a strong quantifier the example has only the reading cor-
responding to the surface:
(53) Mindestens ein Mann liebt alle Frauen
     at-least one manNOM loves all womenAcc
Input_SS: Mindestens ein Mann_1 liebt [ t_1 alle Frauen ]

Candidates                                          QUIB  SP   REC  QR   ECON
  C1: pre_1 ... alle_2 ... mindestens_1 ... t_2^LF                       *!*
☞ C2: pre_1 ... mindestens_1 ... alle                                    *
  C3: pre_1 ... mindestens_1 ... alle_2 ... t_2^LF                       *!*
  C4: mindestens ... alle                                      *!
  C5: alle_2 ... mindestens ... t_2^LF              *!         *         *
This is so because in this case raising of the embedded quantifier into a position
above the target position of reconstruction causes an additional violation
of economy which is fatal.22
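The contrast between the strong and the weak quantifier can be replayed mechanically. The sketch below is illustrative only: the violation counts are one reading of tableaux (52) and (53), following the prose description, and pooling the violations of the unranked pair REC, QR is an assumption about how ties are evaluated. The function names are hypothetical.

```python
# Strata of the LF hierarchy; tied constraints are pooled (an assumption).
HIERARCHY = [["QUIB"], ["SP"], ["REC", "QR"], ["ECON"]]


def profile(violations):
    """One pooled violation count per stratum, highest stratum first."""
    return tuple(sum(violations.get(c, 0) for c in stratum) for stratum in HIERARCHY)


def optimal(tableau):
    """All candidates with the lexicographically smallest profile."""
    best = min(profile(v) for v in tableau.values())
    return sorted(name for name, v in tableau.items() if profile(v) == best)


# Tableau (52): strong quantifier "jede".
strong = {
    "C1": {"ECON": 2},                       # QR across the trace, then reconstruction
    "C2": {"ECON": 2},                       # string-vacuous QR, then reconstruction
    "C3": {"QUIB": 1, "REC": 1, "ECON": 1},  # raising across the topicalized quantifier
    "C4": {"REC": 1, "ECON": 1},             # QR without reconstruction
    "C5": {"REC": 1, "QR": 1},               # nothing happens at LF
}
# Tableau (53): weak quantifier "alle", so QR is never violated.
weak = {
    "C1": {"ECON": 2},                       # unlicensed raising plus reconstruction
    "C2": {"ECON": 1},                       # reconstruction only
    "C3": {"ECON": 2},
    "C4": {"REC": 1},
    "C5": {"QUIB": 1, "REC": 1, "ECON": 1},
}

print(optimal(strong))  # ['C1', 'C2']
print(optimal(weak))    # ['C2']
```

With the strong quantifier two candidates tie, one for each reading; with the weak quantifier the extra economy violation removes the inverted reading.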
Topicalization of the indirect object
We can hold the same mechanism responsible for the readings available in a
configuration with a topicalized indirect object and a subject in situ (here one
example with a strong and another with a weak quantifier):
(54) Drei Beobachtern ist jeder Spieler aufgefallen
     three observersDAT is every playerNOM noticed
     'Three observers have noticed every player.'
Input_SS: Drei Beobachtern_1 ist [ t_1 jeder Spieler aufgefallen ]

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... jeder_2 ... drei_1 ... t_2^LF                            **
☞ C2: pre_1 ... drei_1 ... jeder_2 ... t_2^LF                            **
  C3: drei ... jeder_2 ... t_2^LF                              *!        *
  C4: drei ... jeder                                           *!   *
  C5: jeder_2 ... drei ... t_2^LF                   *!         *         *
Again, in one case the strong quantifier raises into a position above the target
of reconstruction, thereby causing scope inversion. In the other case it does
not raise far enough and the S-structural relative scope is preserved at LF.
(55) shows the same thing without a strong quantifier. Here the additional
violation of economy is decisive and blocks inversion.
(55) Drei Beobachtern sind alle Spieler aufgefallen
     three observersDAT are all playersNOM noticed
     'Three observers have noticed all players.'
Input_SS: Drei Beobachtern_1 sind [ t_1 alle Spieler aufgefallen ]

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... drei_1 ... alle                                          *
  C2: pre_1 ... alle_2 ... drei_1 ... t_2^LF                             *!*
  C3: drei ... alle                                            *!
  C4: alle_2 ... drei ... t_2^LF                    *!         *         *
  C5: alle_2 ... drei ... t_2^LF                    *!         *         *

Inversion impossible
If there is no appropriate trace, reconstruction cannot apply and inversion is
blocked, even if a strong quantifier is present:
(56) Jakob hat einigen Kindern jedes Märchen erzählt
     Jacob has some childrenDAT every taleAcc told
Input_SS: Jakob hat einigen Kindern jedes Märchen erzählt

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: einigen ... jedes_1 ... t_1^LF                                     *
  C2: einigen ... jedes                                             *!
  C3: jedes_1 ... einigen ... t_1^LF                *!                   *
In (56) QUIB blocks inversion by movement of the embedded quantifier. The
back door of first raising and then reconstructing below the base position is
not available since there is no appropriate trace.
5.4.2 Scrambling
If scrambling is triggered by information structure, then the moved item is
reconstructable. It is of no importance whether we insert a strong or a weak
quantifier in (57) because the target of reconstruction is already below the
embedded quantifier:
(57) daß mindestens eine Fuge fast jeder PiaNIST spielen kann
     that at-least one fugueAcc almost every pianistNOM play can
Input_SS: daß mindestens eine Fuge_1 fast [F jeder PiaNIST ] t_1 spielen kann

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... jeder_2 ... t_2^LF ... mind._1                           **
  C2: pre_1 ... jeder ... mindestens_1                              *!   *
  C3: mindestens ... jeder                                     *!   *
  C4: mindestens ... jeder_2 ... t_2^LF                        *!        *
  C5: jeder_2 ... mindestens ... t_2^LF             *!         *         **

If there is a scope marker that has triggered scrambling, it will now block
reconstruction together with the Scope Principle. Again, any attempt to derive
inversion by QR violates QUIB:
(58) daß mindestens eine Fuge jeder Pianist spielen kann
     that at-least one fugueAcc every pianistNOM play can
Input_SS: daß [ [ mind. eine Fuge ]_1 Q^1 ] jeder Pianist t_1 spielen kann
Candidates QUIB SP REC QR ECON
m- Ci: [mindestens Q] ... jeder2... t ^ * **
* *
C2: [mindestens Q] ...jeder
C3: \pr» 1 QM ...jeder ... mind.', *! * *
F *
C 4 : jeder 2 ... [mindestens Q] ... tir *!
The following examples demonstrate how the very same verb may or may
not allow scope inversion by reconstruction, depending on the different
D-structures which are determined by D-structure optimization.23
(59) daß er mindestens einem Einfluß jedes KIND entzogen hat
     that he at-least one influenceDAT every childAcc withdrawn has
Input_SS: daß er mindestens einem Einfluß_1 [F jedes KIND ] t_1 entzogen hat

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... jedes_2 ... t_2^LF ... mind._1                           **
  C2: pre_1 ... jedes ... mindestens_1                              *!   *
  C3: mindestens ... jedes_2 ... t_2^LF                        *!        *
  C4: mindestens ... jedes                                     *!   *
  C5: jedes_2 ... mindestens ... t_2^LF             *!         *         *
Because of ANIM the D-structure in (59) must be DO before IO. From this,
it follows that (59) must involve movement which can be reconstructed.
In contrast, in (60) no movement has applied. Hence, no reconstruction is
possible:24
(60) daß er mindestens ein Kind jedem Einfluß entzogen hat
     that he at-least one childAcc every influenceDAT withdrawn has
Input_SS: daß er mindestens ein Kind jedem Einfluß entzogen hat
☞ C1: mindestens ... jedem
  C2: jedem_1 ... mindestens ... t_1^LF   *! *
  C3: mindestens ... jedem_1 ...          *!*
In (61) and (62) the D-structure relations are inverse: The indirect object pre-
cedes the direct object. Since (61) exhibits this order, no S-structure move-
ment has applied and hence no trace is there that could serve as the target of
reconstruction:
(61) daß er mindestens einer Mutter jedes Sorgerecht entzogen hat
     that he at-least one motherDAT every custodyAcc withdrawn has
Input_SS: daß er mindestens einer Mutter jedes Sorgerecht entzogen hat

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: mindestens ... jedes_1 ... t_1^LF                                  *
  C2: mindestens ... jedes                                          *!
  C3: jedes_1 ... mindestens ... t_1^LF             *!                   *
In contrast to this, in (62) there is an appropriate trace and scope inversion is
available. Of course, the S-structure reading would be derivable in a variant
of (62) that contained a scope marker that blocked reconstruction, and hence,
inversion:
(62) daß er einige Sorgerechte schon jeder MUTter entzogen hat
     that he some custodiesAcc already every motherDAT withdrawn has
Input_SS: daß er einige Rechte_1 schon [F jeder MUTter ] t_1 entzogen hat

Candidates                                          QUIB  SP   REC  QR   ECON
☞ C1: pre_1 ... jeder_2 ... t_2^LF ... einige_1                          **
  C2: pre_1 ... jeder ... einige_1                                  *!   *
  C3: einige ... jeder                                         *!   *
  C4: einige ... jeder_2 ... t_2^LF                            *!        *
  C5: jeder_2 ... einige ... t_2^LF                 *!         *         *
Raising of the strong quantifier after reconstruction applies string vacuously
and therefore has no impact.
6 Scope Marker Insertion
Finally, I will address the problem of proper scope marker insertion. As I
already mentioned, insertion of Q is free and might therefore create readings
that are not borne out empirically. Intuitively, there are two configurations that
one wants to block: I will call them vacuous scope marking and downward
scope marking.
The first kind of improper Q-insertion is shown in (63). We can see in
(63-b,c) that the scope marker contributes nothing to the information of rela-
tive scope at S-structure. It marks more or less the same scope as the quantifier
it is coindexed with:
(63) Vacuous scope marking
     a. [[Top ... ] ... [ Q^2 ... Q_1 ... Q_2 ... ]] (D-structure)
     b. [[Top Q_1 ] ... [ Q^2 ... t_1 ... Q_2 ... ]] (S-structure 1)
     c. [[Top Q_1 ] ... [[ Q_2 Q^2 ] ... t_1 ... t_2 ... ]] (S-structure 2)
     d. [[Top pre_1 ] ... [[ Q_2 Q^2 ] ... Q_1 ... t_2 ... ]] (Logical Form)
This has an unwanted consequence. Suppose that in (63-b) Q_2 is a weak quantifier.
Then this quantifier should raise up to its scope marker (at S-structure or
at LF, see (63-b,c)), which stands above the base position of the topicalized
quantifier Q_1. In the next step, the topicalized quantifier gets reconstructed
below the raised quantifier: Inversion is the result. This derivation opens the
door to a proliferation of unwanted scope inversions.
Now, as I said, what is intuitively wrong with the S-structures in (63-b,c)
is that the scope marker does no job. It just marks the domain the coindexed
quantifier already c-commands.25 What might be strange is that the relevant
level for the appropriateness of the scope marker is not D-structure, the level
at which the scope marker is inserted (there, the scope marker is not vacuous!),
but S-structure. The reason may be that in German it is mainly at
S-structure that the relative scope is fixed.
The second kind of improper Q-insertion is shown in (64) and is rather
obvious. Here, Q^1 is generated below the base position of Q_1 and therefore
causes a change in relative scope at LF:
(64) Downward scope marking
     a. [ ... Q_1 ... [ ... Q_2 ... [ ... Q^1 ... ]]] (D/S-structure)
     b. [ ... pre_1 ... [ ... Q_2 ... [ ... [ Q_1 Q^1 ] ... ]]] (Logical Form)
The mechanism presented above allows reconstruction to a scope marker, so
why should it not be allowed in this case? It seems that scope marking always
proceeds upward in the tree, or should proceed upward at least at one step in
the derivation. In (64) this is never the case.
Now consider an S-structure like (64-a), which contains no traces left be-
hind by movement. This is a configuration that typically does not allow for
scope inversion. But this is exactly what downward scope marking does, so it
over-generates and it has to be blocked.
What is interesting is that both kinds of improper scope marker insertion
have something in common, and hence can be blocked by a single constraint:
(65) Proper Q (PROP-Q)
     A scope marker Q^i is licensed if and only if it c-commands a
     contraindexed quantifier Q_j that breaks the extended chain that is
     formed by Q^i and its coindexed quantifier Q_i.26
It is clear that this constraint blocks both vacuous scope marking and downward
scope marking.27
Vacuous marking is blocked because if a marker does not c-command a
contraindexed quantifier, it is vacuous by definition. And for cases of down-
ward marking that could result in inversion it is exactly the same: The scope
marker is vacuous.
I have no evidence how PROP-Q should be ranked within the hierarchy
because it does not stand in conflict with any other constraint. It simply filters
out candidates with an improper scope marker as they occur at any level of
representation. This might be an indication that PROP-Q is part of GEN.
7 Arguments for the Cycle
In this last section I want to review arguments that help to differentiate the
approach of cyclic optimization from another approach.
Remember that the goal of this paper was to derive the correspondence
between quantifier scope and basic word order, as it is suggested by the data.
Now, instead of using cyclic optimization one could think of defining candidates
as triples (D-structure, S-structure, LF). It would be absolutely reasonable
to do so, because there is a well-defined notion of optimality for such an
approach: A triple t = (DS, SS, LF) is optimal with regard to the other triples
t_1 ... t_n iff the constraint profile of t is better than the profiles of t_1 ... t_n,
where the three slots of every triple are computed in a parallel fashion and then
the violations are summed up in one big table. I will refer to this approach as the
parallel approach.
However, I think that there are three arguments that might support the cyclic
approach: 1. Complexity of derivation: If in the parallel approach an LF is
checked for optimality, there is no way to tell if it will ever have the chance
to be a winner because its D-structure and its S-structure are computed at
the same time. So this mechanism has to compute the whole set of structures
exhaustively. In the cyclic approach, however, a candidate will only be opti-
mal if all of its levels of representation are optimal. An LF that is based on
a non-optimal D-structure cannot be part of a winner even if the LF itself is
optimal. In the cyclic approach only those LFs that descend from an optimal
D-structure and an optimal S-structure are computed. All other potential LFs
are filtered out somewhere previously in the computation.
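The difference in the number of structures that have to be inspected can be made concrete with a toy model. Everything in the sketch is hypothetical: `gen` is a stand-in for GEN that produces a fixed number of successors per structure, and `cost` is a stand-in for a constraint profile. The only point is the count of evaluated candidates, not the linguistic content.

```python
def gen(parent, n=3):
    """Hypothetical GEN: every structure has n successors at the next level."""
    return [f"{parent}.{i}" for i in range(n)]


def cost(structure):
    """Hypothetical stand-in for a constraint profile: smaller is better."""
    return sum(int(d) for d in structure.split(".")[1:])


def parallel(n=3):
    """Enumerate every (DS, SS, LF) triple; return the winner and the
    number of triples that had to be compared."""
    triples = [(d, s, l) for d in gen("in", n) for s in gen(d, n) for l in gen(s, n)]
    best = min(triples, key=lambda t: sum(cost(x) for x in t))
    return best, len(triples)


def cyclic(n=3):
    """Optimize level by level; only the winner of a cycle feeds the next."""
    evaluated, current, path = 0, "in", []
    for _ in range(3):
        cands = gen(current, n)
        evaluated += len(cands)
        current = min(cands, key=cost)
        path.append(current)
    return tuple(path), evaluated


print(parallel()[1], cyclic()[1])  # 27 9
```

With n successors per structure the parallel model compares n to the power of three triples, whereas the cyclic model inspects 3n structures. In this toy model both find the same winner, which need not hold in general.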
2. Potential reranking: Under a parallel approach the constraint ranking is
fixed during the computation. That is, if a triple t = (DS, SS, LF) is evalu-
ated all elements of t are evaluated under the same ranking. Under the cyclic
approach, however, it is in principle possible to reorder the constraints on
the way from one cycle to the next (cf. McCarthy & Prince 1993:12, Mester
1999, Rubach 2000). I would like to point out why this might be a desirable
consequence.
Recall the constraint TYPE, which was introduced to derive the fact that the
Vorfeld in a German sentence has to be filled with a constituent in unembedded
contexts. Clearly, in order for TYPE to have any effect it must outrank
ECON and REC. At LF, however, it is exactly those topicalized elements that
undergo reconstruction and therefore TYPE must not have any impact at that
point of the derivation. There are at least two ways to achieve this. Either
one stipulates that certain constraints are "switched on" at a certain level of
representation and "switched off" at another level. Or one makes use of the
mechanism of neutralizing the effects of one constraint by ranking it sufficiently
low within the given hierarchy. In the case at hand, this implies that at
the surface we are dealing with a partial order TYPE » ECON, REC, whereas
at LF we are dealing with the inverted order REC » TYPE, ECON. Since
reranking is the mechanism that is supposed to be the locus of parametric
change in optimality theory anyway, it seems to me that the second strategy
is the most natural way to deal with the problem at hand.
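The effect of reranking between cycles can be sketched with one evaluation function and two rankings. The candidates and their violation profiles are hypothetical simplifications (in particular, the assumption that an emptied Vorfeld violates TYPE at LF), and the unranked pairs are linearized arbitrarily, which does not affect the outcome here.

```python
def optimal(candidates, ranking):
    """Strict-domination evaluation over a total ranking (highest first)."""
    def key(name):
        return tuple(candidates[name].get(c, 0) for c in ranking)
    best = min(key(n) for n in candidates)
    return [n for n in candidates if key(n) == best]


# Hypothetical violation profiles.
s_structure = {
    "Vorfeld filled": {"ECON": 1},   # topicalization costs one movement
    "Vorfeld empty": {"TYPE": 1},
}
logical_form = {
    "stay in Vorfeld": {"REC": 1},              # movement is not undone
    "reconstruct": {"TYPE": 1, "ECON": 1},      # Vorfeld is emptied by lowering
}

SS_RANKING = ["TYPE", "ECON", "REC"]   # TYPE » ECON, REC
LF_RANKING = ["REC", "TYPE", "ECON"]   # REC » TYPE, ECON

print(optimal(s_structure, SS_RANKING))   # ['Vorfeld filled']
print(optimal(logical_form, LF_RANKING))  # ['reconstruct']
print(optimal(logical_form, SS_RANKING))  # ['stay in Vorfeld']
```

The last line shows what would happen without reranking: under the S-structure ranking the topicalized element could never be reconstructed at LF.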
3. Accumulation of violations: The last argument is an empirical one. The
point is that cyclic optimization makes the prediction that each time a new cy-
cle is entered the reset button is pushed and all the violations so far are deleted
from memory. That is, the next competition restarts without any burden from
the past cycles. Now suppose that there is a violation of some constraint at
D-structure (that is the first slot in the parallel approach), and suppose further
that this violation does not have any impact on the computation of the optimal
candidate. Take a look at this violation in the parallel approach, where all
violations from every slot of the triple are accumulated. There this violation
might be decisive for the failure of the whole computation because it might
just be the violation that makes the sum of violations of the relevant candidate
exceed the sum of violations of some other candidate.
Since the two approaches make different predictions at this point, it could
serve as a test to check their empirical adequacy. An appropriate configuration
for such a test, however, may be hard to find.
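What such a configuration would have to look like can be sketched abstractly. The example is entirely hypothetical: it uses two placeholder constraints Y » X and only two levels (D-structure and LF) instead of three, and it is meant to show the logical shape of the diverging predictions, not an actual German datum.

```python
RANKING = ["Y", "X"]  # hypothetical constraints, Y » X


def key(violations):
    """Violation profile as a tuple ordered by the ranking."""
    return tuple(violations.get(c, 0) for c in RANKING)


def add(a, b):
    """Accumulate the violations of two levels."""
    return {c: a.get(c, 0) + b.get(c, 0) for c in RANKING}


# Hypothetical derivations: D-structures and the LFs that descend from them.
d_structures = {"dA": {"X": 1}, "dB": {"Y": 1}}
lfs = {"dA": {"lA": {"Y": 1}}, "dB": {"lB": {}}}


def cyclic():
    """Each cycle starts from zero; earlier violations are forgotten."""
    d = min(d_structures, key=lambda n: key(d_structures[n]))
    l = min(lfs[d], key=lambda n: key(lfs[d][n]))
    return (d, l)


def parallel():
    """Violations of all levels are summed before candidates are compared."""
    pairs = {(d, l): add(d_structures[d], v) for d in lfs for l, v in lfs[d].items()}
    return min(pairs, key=lambda p: key(pairs[p]))


print(cyclic())    # ('dA', 'lA')
print(parallel())  # ('dB', 'lB')
```

In the cyclic model the X violation of dA is harmless because dB fares worse on Y at D-structure. In the parallel model both derivations end up with one Y violation in total, and the accumulated X violation of dA becomes decisive.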
Notes
This paper is a partial elaboration of my master's thesis. It was supported by the DFG
grant MU 1444/2-1 for the project "Optimalitätstheoretische Syntax des Deutschen"
at the University of Stuttgart. For comments and discussion I want to thank Jane
Grimshaw, Gereon Müller, Tanja Schmid, Arnim von Stechow, Wolfgang Sternefeld,
Sten Vikner, Ralf Vogel, and the audience of the DGfS 1999 workshop "Competition
in Syntax". Of course, all errors are mine.
1. This assumption is called the concept of transparent LF and is in opposition to
May (1985), who allows ambiguity even at the level of LF.
2. By the way, whenever I use the term scope inversion I refer to a configuration
that is an inverted variant of the S-structural scope configuration.
3. Actually, in McCarthy & Prince (1993) the authors speculate that it might be
useful to define the process of optimization on different levels of morphological
representation. That is, something similar to the current proposal is considered
as a possible strategy. However, the considerations remain at best sketchy.
4. There is a variety of different approaches to what should be the input to GEN. The
debate is not finished yet, but the assumptions made here are rather unexotic, I
think (cf., for instance, Grimshaw 1997).
5. I should stress that the question marks before the b-examples do not indicate
ungrammaticality (e.g., due to a violation of economy), but only markedness.
In a suitable context the b-examples may be even more appropriate than the a-
examples. By and large this amounts to a theory of marked word order as has
been proposed by Choi (1996), Costa (1998), or Büring (this volume). The point
here is that the base generated structure should be the least specific one and hence
should be unmarked in the given neutral context.
6. To show that AGENT can really be blocked by ANIM, one needs examples in
which the agent is not animate, because only then is there a potential situation in
which the two constraints are in conflict and only then can we tell if animacy is
really more important than agentivity. These are very rare; however, (8) may be
a relevant instance.
7. The behaviour of psychological verbs that take a dative argument can be ex-
plained along the same lines:
(i) a. daß dem Fritz die Tänzerin gefallen hat
       that the FritzDAT the dancing-girlNOM pleased has
       'that Fritz was pleased by the dancing-girl'
    b. ?daß die Tänzerin dem Fritz gefallen hat
       that the dancing-girlNOM the FritzDAT pleased has
       'that Fritz was pleased by the dancing-girl'
However, psychological verbs that take an accusative argument are still mysterious
because in the approach of Vogel & Steinbach (1998) accusative case remains
unaffected by the word order constraints, and this for good reasons.
8. Examples like those in (11) presuppose a theory of linking as proposed by
Wunderlich (1997). The point is that we need to ensure that the θ-role agent is
always linked to nominative case (if there is such a role). If it were not, then the
following example (with the intended meaning given below it), in which the arguments
that bear dative and the θ-role agent coincide, would be a possible candidate:
(i) daß einem Blauhelm ein Flüchtling geholfen hat
    that a UN-soldierDAT a refugeeNOM helped has
    'that a UN-soldier helped a refugee'
Since (i) respects AGENT and ADJA, in contrast to (11), it would block (11) as
suboptimal. However, if we assume that the linking between case and θ-role of
an argument is determined by the lexical entry of each verb, then we can avoid
this unwanted consequence. Then (i) would never be generated (at least not with
the given meaning).
9. It is clear that the constraints that guide S-structure movement and LF movement
should not have any effects on the level of D-structure. One gets this for free if
one assumes that there cannot be any movement before D-structure has been built
up. In other words, the constraints that will be introduced in the next cycles are
present in the first cycle as well, but they are without impact for independent rea-
sons. So we do not have to bother about the explicit ranking of these constraints
with respect to the constraints met so far in the hierarchy of D-structure.
10. Basically, the idea is that scope markers are generated freely within the structure.
In some sense this is different from Diesing (1996), Diesing (1997), and Vikner
(this volume), where it is assumed (in a generative semantics style) that the rela-
tive scope is already part of the input, which in turn means that the information
about relative scope is doubled: First it is encoded in the input and then it is en-
coded within the syntactic structure. In opposition to this, here the relative scope
is determined by an optimized LF only, not by the input.
11. From now on, I shall dispense with explicitly mentioning the D-structure con-
straints in the ranking because they will be of no importance anymore, neither at
S-structure nor at LF. Simply assume that they are the lowest ranked constraints
in the hierarchies of S-structure and LF.
12. This may be SpecCP or the specifier of another functional head as the Top head
argued for in Müller & Sternefeld (1993).
13. Of course, we cannot satisfy ALIGN first and then move the focused constituent
to the Top position, because ALIGN is dependent on S-structure and cannot be
satisfied by a trace. This, however, does not mean that I assume every constraint
to be dependent on the surface. That traces sometimes can do a job can be seen
from examples like the following:
(i) [Which people]i [ t'i seem to each otheri [ ti to be intelligent ]]?
14. SP will also be assumed not to be satisfiable by traces, for reasons that hopefully
will become clear.
15. Focal stress is indicated by capital letters.
16. It is not clear who originally came up with this idea. But one of the latest propos-
als which follow this strategy is due to Wolfgang Sternefeld (cf. Sternefeld 1993).
17. Note also that the S-structure hierarchy is rather different from this, namely:
TYPE » ALIGN » SP » QUIB, ECON » QR, REC. Recall that TYPE is above
ALIGN because focused elements can be topicalized. TYPE and ALIGN are both
above economy for obvious reasons. SP is above QUIB because in German S-
structure movement over a scope bearing element is possible. Economy is above
QR because there is no S-structure movement of strong quantifiers (without a
scope marker). Finally, REC is ranked below economy because it would be un-
reasonable to stipulate some kind of Yo-Yo movement. That is, we do not want
to assume that on the way from D-structure to S-structure a movement operation

will first move a constituent upwards in the tree and then reconstruct it again.
18. According to May (1985), reconstruction leaves behind a little pro which is then
deleted. I will follow him in this, but only for reasons of readability. There is
nothing that hinges on this assumption. So any time you encounter a barred pro
you know that reconstruction has applied.
19. Of course, inversion should be impossible if there are no appropriate traces.
20. The idea behind this is that candidates with the scope marker and candidates
without the scope marker are not in the same candidate (or reference) set, hence
not in the same competition. What this amounts to is that candidates with dif-
ferent semantics cannot compete, as has been argued for by Fox (1995), for in-
stance. For an extensive discussion about how the notion of reference set should
be defined, cf. Sternefeld (1997).
21. This derivation could as well be blocked by inserting a scope marker for the
topicalized quantifier. Then the strong quantifier could not move to a position
above the scope marker since this is a scope bearing element.
22. Of course, raising the embedded quantifier into any position whatsoever causes
a violation of economy. But here we are only interested in movement that causes
scope inversion.
23. Since the following tables focus on reconstructability, they only contain candi-
dates without scope markers.
24. A derivation of (60) that first moves the embedded quantifier Q2 to a scope
marker across the matrix quantifier Q1 and then in turn raises Q1 across Q2 by
anti-focus scrambling would end up in a linear order identical to the optimal
D-structure. But since an appropriate trace would then be available, we would
expect (60) to have an inverted reading if Q2 is focused. And this is not borne
out empirically. However, it can be shown that this derivation can be blocked by
PROP-Q; see below.
25. To put it correctly, the domain marked by Q¹ does not contain any scope bearing
element that is not already contained in the domain that is marked by Q1.
26. A constituent breaks a chain if and only if it is c-commanded by at least one
element e1 of the chain and in turn c-commands at least one other element e2 of
the chain.
27. It is tempting to simplify the definition by only demanding every scope marker
to c-command a contraindexed quantifier. But the definition (65) refers to the
extended chain of a quantifier Qi and its scope marker to ensure that the scope
marker cannot be licensed by a third quantifier Q3:
(i) Qj ... QJ2 ... Q . . . Q3
If the definition only demanded that a scope marker c-command a contraindexed

quantifier, then Q3 in (i) would license Q¹ although Q¹ does not scope mark anything
that is not in the scope of Q1 already. Hence, (i) is an instance of vacuous
scope marking and must be blocked.
References
Abraham, Werner
1986 Word order in the middle field of the German sentence. In: W. Abraham
and S. de Meij (eds.) Topic, Focus, and Configurationality, 15-38. Ams-
terdam: Benjamins.
Beck, Sigrid
1995 Negative islands and reconstruction. In: U. Lutz and J. Pafel (eds.) On Ex-
traction and Extraposition in German, 121-144. Amsterdam: Benjamins.
Beck, Sigrid
1996 Wh-constructions and transparent logical form. Ph.D. dissertation,
University of Tübingen.
Büring, Daniel
1996 The 59th Street bridge accent. Ph.D. dissertation, University of Tübingen.
Büring, Daniel
t.v. Let's phrase it!
Cheng, Lisa Lai-Shen
1991 On the typology of wh-questions. Department of Linguistics and
Philosophy, MIT. Distributed by MIT Working Papers in Linguistics. MIT,
Cambridge, Massachusetts.
Choi, Hye-Won
Ph.D. dissertation, Stanford University.
Chomsky, Noam
1981 Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam
1995 The Minimalist Program. Cambridge, MA: MIT Press.
Cinque, Guglielmo
1993 A null theory of phrase and compound stress. Linguistic Inquiry 24: 239-
298.
Costa, João
1998 Word order variation. Ph.D. dissertation, HIL University of Leiden.
Diesing, Molly
1996 Semantic variables and object shift. In: H. Thráinsson, S. Epstein and
S. Peter (eds.) Studies in Comparative Germanic Syntax. Vol. II, 66-84.
Dordrecht: Kluwer.
Diesing, Molly
1997 Yiddish VP order and the typology of object movement in Germanic.
Natural Language and Linguistic Theory 15: 369-427.
Fox, Danny
1995 Economy and scope. Natural Language Semantics 3(3): 283-341.
Frey, Werner
1993 Syntaktische Bedingungen für die semantische Interpretation. Berlin:
Akademie-Verlag.
Grimshaw, Jane
Haider, Hubert
1992 Branching and discharge. Sonderforschungsbereich 340, University of
Stuttgart.
Haider, Hubert
1993 Deutsche Syntax - Generativ. Tübingen: Narr.
Heim, Irene — Angelika Kratzer
1997 Semantics in Generative Grammar. Oxford: Blackwell.
Hoberg, Ursula
1981 Die Wortstellung in der geschriebenen deutschen Gegenwartssprache.
München: Hueber.
Höhle, Tilman
1982 Explikationen für 'normale Betonung' und 'normale Wortstellung'. In:
W. Abraham (ed.) Satzglieder im Deutschen, 75-153. Tübingen: Narr.
Höhle, Tilman
1991 On reconstruction and coordination. In: H. Haider and K. Netter (eds.)
Representation and Derivation in the Theory of Grammar, 139-197. Dor-
drecht: Kluwer.
Kiss, Tibor
1999 Configurational and relational scope determination in German. Ms.,
Universität Bochum.
Kroch, Anthony
1974 The semantics of scope in English. Ph.D. dissertation, MIT. (Published
1979, New York: Garland).
Lenerz, Jürgen
1977 Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Narr.
May, Robert
1977 The grammar of quantification. Ph.D. dissertation, MIT.
May, Robert
1985 Logical Form: Its Structure and Derivation. Cambridge, MA: MIT Press.
McCarthy, John — Alan Prince
1993 Prosodic morphology I - Constraint interaction and satisfaction. Ms.,
University of Massachusetts, Amherst and Rutgers University.
McCarthy, John — Alan Prince
1994 The emergence of the unmarked. NELS 24: 333-379.
McCawley, James D.
1999 Why surface syntactic structure reflects logical structure as much as it
does, but only that much. Language 75: 34-62.
Mester, Armin
1999 Weak parallelism: Serial and parallel sources of opacity in OT. Ms., Uni-
versity of California, Santa Cruz.
Milsark, Gary
1974 Existential sentences in English. Ph.D. dissertation, MIT.
Müller, Gereon
1999 Optimality, markedness, and word order in German. Linguistics 37: 777-
818.
Müller, Gereon — Wolfgang Sternefeld
1993 Improper movement and unambiguous binding. Linguistic Inquiry 24:
461-507.
Nohl, Claudia — Arnim von Stechow
1995 Interpretation syntaktischer Strukturen - Eine Semantikeinführung
anhand des Deutschen. Technical Report 07-95, Seminar für
Sprachwissenschaft der Universität Tübingen.
Pafel, Jürgen
1997 Skopus und logische Struktur: Studien zum Quantorenskopus im Deut-
schen. Unpublished Habilitation, University of Tübingen.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint interaction in generative grammar. Ms.,
Rutgers University and University of Colorado at Boulder.
Reinhart, Tanya
1976 The syntactic domain of anaphora. Ph.D. dissertation, MIT.
Reinhart, Tanya
1983 Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya
1997 Interface economy. In: Wilder et al. (eds.), 146-169.
Rubach, Jerzy
2000 Glide and glottal stop insertion in Slavic languages: A DOT analysis.
Linguistic Inquiry 31(2): 271-317.
Samek-Lodovici, Vieri
1997 OT-interactions between focus and basic word order. Talk presented at
the Workshop on OT Syntax, October 1997, University of Stuttgart.
Stechow, Arnim von
1993 Die Aufgaben der Syntax. In: J. Jacobs, A. von Stechow, W. Sternefeld
and T. Vennemann (eds.) Syntax: Ein internationales Handbuch zeit-
genössischer Forschung, 1-88. Berlin: Walter de Gruyter.
Sternefeld, Wolfgang
1993 Plurality, Reciprocity and Scope. Technical Report 13-93, Seminar für
Sprachwissenschaft der Universität Tübingen. Revised version in Natural
Language Semantics 1998.
Sternefeld, Wolfgang
1997 Comparing reference sets. In: Wilder et al. (eds.), 81-114.
Stowell, Tim
1981 Origins of phrase structure. Ph.D. dissertation, MIT.
Uszkoreit, Hans
1986 Constraints on order. Linguistics 24: 883-906.
Vikner, Sten
t.v. The interpretation of object shift and optimality theory.
Vogel, Ralf — Markus Steinbach
1998 The dative - An oblique case. Linguistische Berichte 173: 65-90.
Wilder, Chris — Hans-Martin Gärtner — Manfred Bierwisch (eds.)
1997 The Role of Economy Principles in Linguistic Theory. Berlin: Akademie-
Verlag.
Wunderlich, Dieter
1997 Cause and the structure of verbs. Linguistic Inquiry 28: 27-68.
Experimental Evidence for Constraint Competition
in Gapping Constructions
Frank Keller
This paper presents the results of two experiments investigating gradient ac-
ceptability in gapping constructions. Experiment 1 shows that adjuncts and
complements are equally acceptable as remnants in gapping, a fact that has
been surrounded by controversy in the literature. It also provides evidence
against the claim that gapping must leave behind exactly two remnants, and
shows that subject remnants are less acceptable than object remnants. This
effect of remnant type can be overridden by context. Experiment 2 confirms
the remnant effect and investigates how it interacts with other constraints on
gapping to produce a gradient acceptability pattern.
A number of grammar models have been proposed to deal with gradient
linguistic data, including the re-ranking model (Keller 1998), which draws
on concepts from Optimality Theory. Two assumptions are central to this
model: (a) constraint violations are cumulative, i.e., the degree of unaccept-
ability increases with the number of constraints violated; and (b) constraints
cluster into two types based on their acceptability profile: hard constraints
cause strong unacceptability when violated, while violations of soft con-
straints cause only mild unacceptability. The experimental data presented in
this paper confirm both assumptions and provide additional evidence for the
hard/soft distinction by demonstrating that only soft constraints are subject to
context effects.
1 Introduction
The aim of this paper is twofold. Firstly, we aim to make a methodological

point by showing that experimental techniques can contribute to linguistic
theory by settling data disputes that cannot be resolved solely on the basis
of intuitive, informal acceptability judgments. More specifically, we apply
the experimental paradigm of magnitude estimation to gapping constructions,
which allows us to test claims made in the theoretical literature on gapping.
We provide evidence for constraint competition in gapping and investigate

the influence of context on the acceptability of gapped sentences.
The second aim of this paper is to obtain experimental data regarding sub-
optimal linguistic structures, i.e., structures that attract gradient acceptability
judgments. Such gradient data allow us to test aspects of a specific model
of gradience in grammar, the re-ranking model. More specifically, the data
bear on two central assumptions of this model: the cumulativity of constraint
violations and the dichotomy of hard and soft constraints.
In this introduction, we give a brief overview of the theoretical literature
on gapping, provide some background on Optimality Theory, and outline the
re-ranking model of gradience.
1.1 Gapping Constructions in English
Gapping is a grammatical operation that deletes certain subconstituents of a
coordinate structure. As examples consider (1)-(3) below, in which the (a)
examples constitute gapped versions of the (b) examples:1
(1) a. I ate fish, Bill rice, and Harry roast beef.

b. I ate fish, Bill ate rice, and Harry ate roast beef.
(2) a. Tom has a pistol, and Dick a sword.
b. Tom has a pistol, and Dick has a sword.
(3) a. I want to try to begin to write a novel, and Mary
{to try to begin to write / to begin to write / to write / ∅} a play.
b. I want to try to begin to write a novel, and Mary wants to try to
begin to write a play.
These examples indicate that gapping always deletes the matrix verb and
leaves behind exactly two constituents as remnants (Kuno 1976: 318). Based
on previous work by Hankamer (1973), Jackendoff (1971), and Ross (1970),
Kuno (1976) also observes that certain functional principles affect the
acceptability of gapping, such as the following restriction on the interpretation of
the constituents left behind by gapping:2
(4) The Minimal Distance Principle [MINDIS] (Kuno 1976: 306)

The two constituents left behind by Gapping can be most readily
coupled with the constituents (of the same structures) in the first
conjunct that were processed last of all.
The examples in (5) illustrate the Minimal Distance Principle: In (5-a), the
remnant Tom has to be paired with Mary, yielding the interpretation in (5-b).
It is not possible to pair Tom with the more distant subject John, yielding the
interpretation in (5-c).
(5) a. John believes Mary to be guilty, and Tom to be innocent.

b. John believes Mary to be guilty, and John believes Tom to be inno-
cent.
c. John believes Mary to be guilty, and Tom believes Mary to be inno-
cent.
A further generalization about gapping constructions is that the gap has to

represent contextually given information, while the remnant has to constitute
new information. Kuno (1976) captures this using the concept of Functional
Sentence Perspective (FSP):
(6) The FSP Principle of Gapping [SENTP] (Kuno 1976: 310)

Constituents deleted by Gapping must be contextually known. On
the other hand, the two constituents left behind by Gapping neces-
sarily represent new information and, therefore, must be paired with
constituents in the first conjunct that represent new information. [...]
Kuno (1976) notes that the FSP Principle seems to be able to override the
Minimal Distance Principle. (7-a) is acceptable as a gapped version of (7-b),
even though it violates MINDIS. We regard this fact as initial evidence that
gapping is subject to constraint competition in an optimality theoretic sense.
(7) a. With what did John and Bill hit Mary? John hit Mary with a stick,
and Bill with a belt.
b. With what did John and Bill hit Mary? John hit Mary with a stick,
and Bill hit Mary with a belt.
More evidence for constraint competition in gapping comes from Kuno's

(1976) observation that the remnants in a gapped sentence tend to be inter-
preted as a subject and its predicate:
(8) The Tendency for Subject-Predicate Interpretation

[SUBJPRED] (Kuno 1976: 311)
When Gapping leaves an NP and a VP behind, the two constituents
are readily interpreted as constituting a sentential pattern, with the

NP representing the subject of the VP.
This explains why (9-a) can be interpreted as the gapped version of (9-b)
(where Tom is the subject of donate), but not as the gapped version of
(9-c) (where Tom is the subject of the object control verb persuade). Exam-
ple (10-a), on the other hand, not only has (10-b) as a possible interpretation,
but also (10-c) (or at least (10-c) is considerably better than (9-c)). In (10-c),
Tom is the subject of donate, because the matrix verb promise is a subject
control verb. Such a subject-predicate interpretation is preferred in gapping
constructions. Note that (10-c) violates MINDIS, thus indicating a
competition between MINDIS and SUBJPRED.
(9) a. John persuaded Bill to donate $200, and Tom to donate $400.
b. John persuaded Bill to donate $200, and John persuaded Tom to
donate $400.
c. John persuaded Bill to donate $200, and Tom persuaded Bill to do-
nate $400.
(10) a. John promised Bill to donate $200, and Tom to donate $400.
b. John promised Bill to donate $200, and John promised Tom to do-
nate $400.
c. John promised Bill to donate $200, and Tom promised Bill to donate
$400.
Finally, Kuno (1976) also observes that gapping cannot leave behind rem-
nants that are part of a subordinate clause: (11-a) cannot be understood as a
gapped version of (11-b).
(11) a. John persuaded Dr. Thomas to examine Jane and Bill Martha.
b. John persuaded Dr. Thomas to examine Jane and Bill persuaded Dr.
Thomas to examine Martha.
This can be formulated as the generalization that the remnants in a gapping

construction must be part of a simplex sentence:
(12) The Requirement for Simplex-Sentential Relationship

[SIMS] (Kuno 1976: 314)
The two constituents left over by Gapping are most readily inter-
pretable as entering into a simplex-sentential relationship. The in-
telligibility of the gapped sentence declines drastically if there is no
such relationship between the two constituents.
According to Kuno (1976: 316), "the Requirement for Simplex-Sentential

Relationship is a very strong and nearly inviolable constraint," and a violation
of this constraint leads to strong unacceptability. Kuno (1976) claims that
the interaction of this constraint with weaker ones such as MINDIS, SENTP,
and SUBJPRED, allows us to derive the degree of acceptability of gapped
sentences.
However, Kuno (1976) does not make this interaction explicit; he fails to
give an account of how the degree of acceptability of a gapped sentence is
computed from the constraint violations it incurs. The present paper aims
to overcome this limitation. Using experimental data we investigate how the
interaction of constraints on gapping determines the degree of acceptability
of a gapped structure. Our investigation is guided by an explicit model of
constraint competition that draws on concepts from Optimality Theory, intro-
duced in the next section.
1.2 Optimality Theory
Our model of constraint interaction in gapping constructions relies on the

concept of grammatical competition recently introduced into linguistic theory
by approaches such as Optimality Theory (OT; Prince and Smolensky 1993,
1997) or the Minimalist Program (MP; Chomsky 1995). In what follows, we
focus on Optimality Theory, and briefly introduce its basic mechanisms.
Standard Optimality Theory deviates from more traditional linguistic
frameworks in that it assumes grammatical constraints to be (a) universal,
(b) violable, and (c) ranked. Assumption (a) means that constraints are maxi-
mally general, i.e., they contain no exceptions or disjunctions, and there is no
parameterization across languages. Highly general constraints will inevitably
conflict; therefore, assumption (b) allows constraints to be violated, even in a
grammatical structure, while assumption (c) states that some constraint viola-
tions are more serious than others. While, according to (a), the formulation of
constraints remains constant across languages, the ranking of the constraints
can differ between languages, thus allowing crosslinguistic variation to be
accounted for.
In an OT setting, a structure is grammatical if it is the optimal structure
in a set of candidate structures. Optimality is defined via constraint ranking:
The optimal structure violates the least highly ranked constraints compared
to its competitors. The number of violations plays a secondary role; if two
structures violate a constraint with the same rank, then the number of violations
incurred decides the competition. OT therefore deviates from traditional
grammatical frameworks in that the grammaticality of a sentence is not
determined in isolation, but in comparison with other possible structures. Note
that there is no inherent restriction on the number of optimal candidates for a
given candidate set; more than one candidate may be optimal if several candidates
share the same constraint profile, i.e., if they incur the same constraint
violations.

Table 1. Constraint profile for direct object extraction (simplified from Legendre et
al. 1995: (22-a))

[Q_i [think_CP [x_i]]]                                  SUBCAT  BAR4  BAR3  BAR2  *t
a. what_i do [you [think [he [said t_i]]]]              *             *           *
b. what_i do [you [think [t_i that [he [said t_i]]]]]                       **    **
c. what_i do [you [think [that [he [said t_i]]]]]               *                 *
We will illustrate how OT works with a simple example taken from an account
of wh-extraction by Legendre et al. (1995). Our example deals with extraction
from direct objects in English. Legendre et al. (1995) assume that the
following constraints govern extraction: SUBCAT, which states that the
subcategorization requirements of the verb have to be met; *t, which disallows
traces (i.e., movement); and BARn, which rules out movement that crosses
more than n barriers (for a definition of barrier, see Legendre et al. 1995). For
English, the assumption is that these constraints are ranked as follows:
(13) SUBCAT » BAR4 » BAR3 » BAR2 » *t
This means that a violation of SUBCAT is more serious than a violation of
BAR4, which in turn is more serious than a violation of BAR3, etc.
A crucial assumption in OT is that all candidate structures (syntactic rep-
resentations) that take part in a grammatical competition are generated from
a common input, assumed to be a predicate argument structure by Legendre
et al. (1995). The input structure specifies the verb and the arguments of the
verb, plus operators and scope relations that might be present. As an example,
consider the first line of Table 1: This input contains the verb think
(subcategorizing for a CP complement) and specifies that its argument has to
contain a syntactic variable x_i which is in the scope of a question operator Q_i.
Such an input has to be realized by a wh-question.
Possible realizations of this input are the candidates (a)-(c) in Table 1.
These candidates violate different constraints, as indicated by the asterisks
in Table 1. For example, candidate (a) violates SUBCAT (as the verb takes
an IP complement, instead of a CP complement), *t (due to the moved
wh-element it contains), and BAR3 (because the movement crosses three barriers).
The optimal structure in a candidate set is computed as the structure that
violates the least highly ranked constraints. As an example, consider the
competition between candidates (a) and (c): (a) violates SUBCAT, while (c)
violates BAR4. According to the constraint hierarchy in (13), SUBCAT is ranked
higher than BAR4, which means that candidate (c) wins the competition. Note
that all the other constraints that are violated by either of the candidates are
not taken into account in determining the winner. Only the most highly ranked
constraint on which the two candidates differ matters for the constraint
competition (strict domination of constraints). Two candidates differ on a
constraint if one candidate violates that constraint more often than the other one
(e.g., (a) violates SUBCAT once, while (b) violates it zero times).
In Table 1 the optimal candidate is (b): It wins against (c), as it violates
BAR2 instead of BAR4. The additional trace that (b) contains allows it to
avoid crossing four barriers at once. This means that (b) incurs two violations
of *t (instead of just one). However, this is not relevant to the competition
with (c), due to strict domination. (Note that (a) would win if the input con-
tained think subcategorizing for an IP.)
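The evaluation just described can be made concrete with a minimal Python sketch. It assumes the violation counts read off Table 1 (the two BAR2 marks for candidate (b) are an assumption based on the table layout and do not affect the outcome). The names `RANKING`, `CANDIDATES`, `profile`, and `optimal` are illustrative and not part of Legendre et al.'s account. Strict domination is modelled by comparing violation profiles lexicographically, in ranking order.

```python
# Illustrative sketch of OT evaluation under strict domination.
# Violation counts follow Table 1; all names here are hypothetical.

RANKING = ["SUBCAT", "BAR4", "BAR3", "BAR2", "*t"]  # ranking (13), highest first

CANDIDATES = {
    "a": {"SUBCAT": 1, "BAR3": 1, "*t": 1},
    "b": {"BAR2": 2, "*t": 2},  # BAR2 count assumed from the table layout
    "c": {"BAR4": 1, "*t": 1},
}


def profile(violations, ranking):
    """Return violation counts ordered from highest- to lowest-ranked constraint."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def optimal(candidates, ranking):
    """Return all candidates whose profile is lexicographically minimal."""
    profiles = {name: profile(v, ranking) for name, v in candidates.items()}
    best = min(profiles.values())
    return sorted(name for name, p in profiles.items() if p == best)


print(optimal(CANDIDATES, RANKING))
```

Under these assumptions the sketch selects candidate (b). Because tuples are compared position by position, the extra *t violation of (b) never comes into play once BAR4 has separated (b) from (c). Returning a list rather than a single name reflects the point that several candidates with identical profiles may all be optimal.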
Another important aspect of OT can also be illustrated using the extraction
example: In OT, crosslinguistic variation can be accounted for by constraint
re-ranking. Assume that there is an additional constraint *Q, which disallows
empty question operators. For English, the ranking *Q » *t holds. This
means that questions are formed by movement of wh-elements, while in-situ
wh-elements, which have to be bound by the Q operator, are ungrammatical.
Chinese, on the other hand, exhibits the opposite ranking *t » *Q, i.e.,
the use of an empty question operator is preferred to the use of a trace. This
explains why in Chinese, wh-elements remain in situ in direct object extractions,
where the wh-element is bound by the Q operator. English, on the other
hand, requires wh-movement in such configurations, as illustrated by the
example in Table 1.
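The effect of re-ranking can be sketched in the same way. The following Python fragment is a simplification under stated assumptions: it considers only two hypothetical candidates, one with a trace and one with an empty Q operator, and only the two constraints *Q and *t. The candidate labels and the function name `winner` are illustrative.

```python
# Illustrative sketch: the same two candidates evaluated under two rankings.
# Candidate labels and violation counts are hypothetical simplifications.

CANDIDATES = {
    "wh-movement": {"*t": 1},  # moved wh-element leaves a trace
    "wh-in-situ": {"*Q": 1},   # in-situ wh-element bound by an empty Q operator
}


def winner(candidates, ranking):
    """Return the candidate with the lexicographically smallest violation profile."""
    def profile(name):
        return tuple(candidates[name].get(c, 0) for c in ranking)
    return min(candidates, key=profile)


english = ["*Q", "*t"]  # *Q » *t
chinese = ["*t", "*Q"]  # *t » *Q

print(winner(CANDIDATES, english))
print(winner(CANDIDATES, chinese))
```

With the English ranking the movement candidate wins, and with the Chinese ranking the in-situ candidate wins, although the constraints and the candidates themselves are unchanged.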
1.3 Suboptimal Candidates
Standard OT assumes that all non-optimal candidates are equally ungrammatical,
which leads to a binary notion of grammaticality. We propose dropping
this assumption and argue for an extended version of OT that not only
computes the optimal candidate for a given candidate set, but also makes pre-
dictions about the relative grammaticality of suboptimal candidates. More
specifically, we adopt the following hypothesis (see Keller and Alexopoulou
2000 for details):
(14) Suboptimality Hypothesis

a. Suboptimal candidates differ in grammaticality.
b. The relative grammaticality of suboptimal candidates can be used
as evidence for constraint rankings.
Note that (14-b) follows from (14-a): If suboptimal candidates differ in gram-
maticality, then the comparison between two suboptimal candidates can be
used as evidence for constraint rankings in the same way as the comparison
between a grammatical candidate and an ungrammatical candidate is used to
determine rankings in standard OT.
There are several ways of implementing the suboptimality hypothesis, i.e.,
of extending OT to make predictions about suboptimal structures; the most
straightforward one is based on the assumption that the relative grammatical-
ity of a candidate corresponds to its relative optimality in the candidate set
(Keller 1997). Such a model will make predictions of the form: Candidate S1
is more optimal (i.e., more grammatical) than candidate S2, where both S1 and
S2 may be suboptimal candidates. This prediction can be tested empirically
by showing that S1 is more acceptable than S2.
This "naive" model of suboptimality (which simply equates relative op-
timality with relative grammaticality) has been criticized for a number of
reasons (Keller 1998, Müller 1999). One problem is that it predicts gram-
maticality differences only for structures in the same candidate set; relative
grammaticality cannot be compared across candidate sets. Another problem
is that grammaticality differences are predicted between all structures in a
candidate set. A typical OT grammar assumes a richly structured constraint
hierarchy, therefore all or most structures in a given candidate set will differ
in optimality. The naive model predicts that there is a grammaticality differ-
ence whenever there is a difference in optimality. This means it will probably
overgenerate, i.e., predict far more degrees of grammaticality than we can
reasonably expect to find in the data.
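The second problem can be illustrated with a short Python sketch. It assumes the violation profiles of Table 1 (with the BAR2 count for candidate (b) read off the table layout) and equates degree of grammaticality with position in the optimality ordering, as the naive model does. The function name `optimality_order` is hypothetical.

```python
# Illustrative sketch of the naive model: every difference in optimality
# is treated as a difference in grammaticality. Names are hypothetical.

RANKING = ["SUBCAT", "BAR4", "BAR3", "BAR2", "*t"]

CANDIDATES = {
    "a": {"SUBCAT": 1, "BAR3": 1, "*t": 1},
    "b": {"BAR2": 2, "*t": 2},  # BAR2 count assumed from the table layout
    "c": {"BAR4": 1, "*t": 1},
}


def optimality_order(candidates, ranking):
    """Group candidates into levels of equal profile, most optimal level first."""
    levels = {}
    for name, violations in candidates.items():
        key = tuple(violations.get(c, 0) for c in ranking)
        levels.setdefault(key, []).append(name)
    return [sorted(levels[key]) for key in sorted(levels)]


print(optimality_order(CANDIDATES, RANKING))
```

For the three candidates of Table 1 the sketch yields three separate levels, (b), then (c), then (a). In a larger candidate set evaluated against a richly structured hierarchy, nearly every candidate would occupy a level of its own, which is the proliferation of predicted degrees described above.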
1.4 The Re-Ranking Model
A number of suboptimality-based models of gradience have been proposed

that avoid the problems with the naive model (Hayes 2000, Hayes and
MacEachern 1998, Keller 1998, Müller 1999). The present paper takes as
its starting point the re-ranking model put forward by Keller (1998), which is
based on experimental research on gradient acceptability in extraction from
picture NPs (Cowart 1989, 1997, Keller 1996, 1997). We summarize the rel-
evant experimental findings:
— Soft and Hard Constraints: constraints cluster into two types

based on their acceptability profile: Hard constraints cause strong
unacceptability when violated (e.g., constraints on phrase structure,
agreement, and subcategorization), while violations of soft con-
straints cause only mild unacceptability (e.g., constraints on refer-
entiality and definiteness). Violations of hard constraints are signif-
icantly less acceptable than violations of soft constraints. 3
— Cumulativity: constraint violations are cumulative, i.e., the degree
of unacceptability increases with the number of constraints violated.
This finding holds both for soft and for hard constraints.
Apart from lending a certain plausibility to OT's notions of constraint ranking
and constraint interaction (see Keller 1998 for details), these results also
provide evidence against a naive model of gradience. The naive model fails
to accommodate the distinction between hard and soft constraints and cannot
explain the cumulativity effect.
Keller (1998) suggests an alternative model of gradience that draws on con-
cepts from OT learnability theory (Tesar and Smolensky 1998). The central
idea of this model is to compute which constraint re-rankings are required
to make a suboptimal structure optimal. This information can then be used
to compare structures with respect to their degree of grammaticality: The
assumption is that the degree of grammaticality of a candidate structure S
depends on the number and type of re-rankings required to make S optimal.
Such a re-ranking model offers the necessary flexibility to accommodate the
experimental findings on constraint ranking and constraint interaction in OT:
— The re-ranking model allows us to determine the relative grammaticality
of arbitrary structures by comparing the number and type of
re-rankings required to make them optimal. Comparisons of grammaticality
are not confined to structures in the same candidate set,
which accounts for the fact that subjects can judge the relative
grammaticality of arbitrary sentence pairs.
— It seems plausible to assume that some constraint re-rankings are
more serious than others, and hence cause a higher degree of un-
grammaticality in the target structure. This assumption allows us to
model the experimental findings that some constraint violations are
more serious than others. The experimental data justify two types of
re-rankings, corresponding to the soft and hard constraint violations
discussed above.
— Another assumption is that the degree of grammaticality of a struc-
ture depends on the number of re-rankings necessary to make it op-
timal: The more re-rankings a structure requires, the more ungram-
matical it becomes. This predicts the cumulativity of violations that
was found experimentally both for soft and for hard constraints. 4
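One way to read these three points together is as a partial ordering over structures. The Python sketch below is only an illustration under stated assumptions: the re-rankings needed for each structure are taken as given (computing them is not shown), each is classified as soft or hard, a hard re-ranking is assumed to be at least as costly as a soft one, and the example counts are invented. It deliberately assigns no numeric weights to the two types, since none are specified here.

```python
# Illustrative sketch: comparing structures by number and type of re-rankings.
# The class, the functions, and the example counts are all hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Reranks:
    """Summary of the re-rankings needed to make a structure optimal."""
    hard: int
    soft: int


def at_least_as_grammatical(s1, s2):
    """True if each re-ranking of s1 can be matched by a distinct re-ranking
    of s2 that is equally or more severe (hard counts as more severe)."""
    return s1.hard <= s2.hard and s1.hard + s1.soft <= s2.hard + s2.soft


def compare(s1, s2):
    """Return 'better', 'worse', 'equal', or 'undetermined' for s1 relative to s2."""
    up = at_least_as_grammatical(s1, s2)
    down = at_least_as_grammatical(s2, s1)
    if up and down:
        return "equal"
    if up:
        return "better"
    if down:
        return "worse"
    return "undetermined"


# Invented example counts, for illustration only.
one_soft = Reranks(hard=0, soft=1)
two_soft = Reranks(hard=0, soft=2)
one_hard = Reranks(hard=1, soft=0)

print(compare(one_soft, two_soft))  # cumulativity: fewer re-rankings
print(compare(one_soft, one_hard))  # soft versus hard re-ranking
print(compare(two_soft, one_hard))  # depends on relative cost, left open here
```

Nothing in the comparison refers to a candidate set, so structures with different inputs can be compared directly. The third case is left undetermined on purpose: deciding it would require a claim about how many soft re-rankings outweigh one hard re-ranking.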
The work presented in this paper aims to provide additional evidence for
two assumptions underlying the re-ranking model: (a) the dichotomy of hard
and soft constraints and (b) the cumulativity of constraint violations. An ad-
ditional aim is to investigate how context effects interact with the soft/hard
distinction and the cumulativity effect.
1.5 Magnitude Estimation
The present study relies on very subtle linguistic intuitions, viz., on judg-
ments about the relative acceptability of information structurally different re-
alizations of a sentence. Such intuitions about relative acceptability should be
measured experimentally, since the informal elicitation technique tradition-
ally used in linguistics is unlikely to be reliable here (Cowart 1997, Schütze
1996, Sorace 1992). A suitable experimental paradigm is magnitude estima-
tion, a technique standardly applied in psychophysics to measure judgments
of sensory stimuli (Stevens 1975). The magnitude estimation procedure re-
quires subjects to estimate the magnitude of physical stimuli by assigning nu-
merical values proportional to the stimulus magnitude they perceive. Highly
reliable judgments can be achieved for a whole range of sensory modalities,
such as brightness, loudness, or tactile stimulation.
The magnitude estimation paradigm has been extended successfully to the
psychosocial domain (Lodge 1981), and recently Bard et al. (1996) and
Cowart (1997) have shown that linguistic judgments can be elicited in the
same way as judgments of sensory or social stimuli. In contrast to the five
or seven point scale conventionally used to measure human intuitions,
magnitude estimation employs a continuous numerical scale. It provides
fine-grained measurements of linguistic acceptability, which are robust
enough to yield statistically significant results, while being highly
replicable both within and across speakers. Since magnitude estimation
provides data on an interval scale, parametric statistics can be used for
evaluation.

Table 2. Factors in Experiment 1

          verb frame (Frame)   remnant (Remn)   context (Con)
trans.    NP V NP              -                felicitous context
          NP V PP                               null context (control)
          NP V VP
          NP V PP-adj
ditrans.  NP V NP NP           NP _ XP XP       felicitous context
          NP V NP PP           _ _ XP XP        null context (control)
          NP V NP VP           NP _ _ XP
                               NP _ XP _

Magnitude estimation requires subjects to assign numbers to a series of lin-
guistic stimuli proportional to the acceptability they perceive. First, subjects
are exposed to a modulus item, to which they assign an arbitrary number.
Then, all other stimuli are rated proportional to the modulus, i.e., if a sentence
is three times as acceptable as the modulus, it gets three times the modulus
number, etc.
2 Experiment 1: Verb Frame, Remnant, and Context
2.1 Introduction
Experiment 1 was designed to investigate whether the following constraints

on gapping that have been proposed in the literature have a gradient effect
on the acceptability of gapped sentences: (a) the verb frame of the gapped
verb, (b) whether the remnant left behind by gapping is a complement or an
adjunct, (c) the structure of the remnant, and (d) the context preceding the
gapped sentence. Table 2 gives an overview of the factors included in this
experiment and their levels.
The factor verb frame (Frame) included both transitive and ditransitive
verbs. The transitive case included verbs with NP, PP, and VP complements.
PP adjuncts were also included in order to test the claim that adjunct remnants
are more acceptable than complement remnants (Hankamer 1973). The fol-
lowing examples illustrate the levels of the factor Frame for transitive verbs:
(15) a. NP V NP: She repeated the question, and he the answer.
b. NP V PP: She negotiated with the manager, and he with the secretary.
c. NP V VP: She expected to win, and he to lose.
d. NP V PP-adj: She read in the bedroom, and he in the lounge.
For ditransitive verbs, the factor Frame included verbs that have an NP as
their first complement, and another NP, a PP, or a VP as their second comple-
ment, such as the examples in (16).
(16) a. NP V NP NP: She charged the client 50 pounds, and he the manu-
facturer 100 pounds.
b. NP V NP PP: She accompanied the boy to school, and he the girl
to university.
c. NP V NP VP: She authorized the manager to leave, and he the sec-
retary to stay.
Transitive verbs allow only one type of remnant (where the subject and the
object are left behind, while the verb is gapped). Ditransitive verbs, on the
other hand, allow more complicated remnants, which we took into account
by including the additional factor remnant type (Remn) for ditransitive verbs.
The levels of Remn can be exemplified by the following sentences:
(17) a. NP _ XP XP: She charged the client 50 pounds, and he the
manufacturer 100 pounds.
b. _ _ XP XP: She charged the client 50 pounds, and the manufacturer
100 pounds.
c. NP _ _ XP: She charged the client 50 pounds, and he 100 pounds.
d. NP _ XP _: She charged the client 50 pounds, and he the manufacturer.
Note that we use pronouns in (17-c) and (17-d) to make sure that the remnant
is interpreted as the subject NP.
Context (Con), the third factor in the experiment, was meant to test the
influence of context on the acceptability of gapping. A felicitous context for
gapping (according to Kuno's 1976 SENTP constraint) is one in which the
gapped constituent contains given information, while the remnants constitute
new information. Such a given-new partition can be realized using a question
context: new constituents in the answer are realized as wh-phrases in the
question, while given constituents in the answer are realized as full NPs in
the question. This is illustrated by the questions in (18), which constitute
felicitous contexts for the transitive sentences in (15):
(18) a. What did Hanna and Michael repeat?
b. Who did Emily and Matthew negotiate with?
c. What did Rachel and Andrew expect to do?
d. Where did Rebecca and Mark read?
The factor Con was the same for the ditransitive condition. Here are the fe-
licitous contexts for the examples in (17):
(19) a. Who did Hanna and Michael charge what?
b. Who did Hanna charge what?
c. What did Hanna and Michael charge the client?
d. Who did Hanna and Michael charge 50 pounds?
A null context condition was included as a control condition, allowing us to
determine how subjects behave in the absence of contextual information.
2.2 Predictions
The predictions for the present experiment can be summarized as follows:
1. As far as the factor Frame is concerned, no clear predictions can be
derived from the literature as to the effect of complement type (NP, PP, or
VP) or arity (transitive or ditransitive) of the verb. As for the
complement/adjunct status of the remnant, our experiment allows us to verify
Hankamer's (1973) claims that PP adjuncts are more acceptable than PP
complements. 5
2. For the factor Remn, the constraint MINDIS predicts that the remnant
_ _ XP XP is more acceptable than the remnants NP _ _ XP and NP _ XP _.
Another relevant prediction is that the remnant NP _ XP XP is
unacceptable, based on the claim of Kuno (1976: 318) that gapping has
to leave behind exactly two constituents.
3. As for the effect of Con, Kuno's (1976) constraint SENTP predicts that
the acceptability of a gapped sentence should be increased in a felicitous
context, compared to the control condition (the null context).
Furthermore, we predict an interaction between the factors Remn and Con,
based on Kuno's (1976) observation that the satisfaction of SENTP seems to
override a violation of MINDIS (see Section 1.1).
2.3 Method
2.3.1 Subjects
Fifty-five native speakers of English participated in the experiment. The
subjects were recruited over the Internet by postings to newsgroups and mailing
lists. Participation was voluntary and unpaid. Subjects had to be naive, i.e.,
neither linguists nor students of linguistics were allowed to participate.
The data of two subjects were excluded as they turned out to be non-native
speakers. The data of a further two subjects were excluded because they were
linguists. Finally, the data of two subjects were eliminated after an
inspection of their response times showed that they had not completed the
experiment adequately (uniform response pattern or response times < 1 s). This left
49 subjects for analysis. Of these, 29 subjects were male, 20 female; eight
subjects were left-handed, 41 right-handed. The age of the subjects ranged
from 14 to 52 years; the mean was 30.6 years.
2.3.2 Materials
Training Materials
The experiment included a set of training materials that were designed to
familiarize subjects with the magnitude estimation task. The training set con-
tained six horizontal lines. The range of largest to smallest item was 1:6.7.
The items were distributed evenly over this range, with the largest item cov-
ering the maximal window width of the web browser. A modulus item in the
middle of the range was provided.
Practice Materials
A set of practice items was used to familiarize subjects with applying mag-
nitude estimation to linguistic stimuli. The practice set consisted of six sen-
tences that were representative of the test materials. A wide spectrum of ac-
ceptability was covered, ranging from fully acceptable to severely unaccept-
able. A modulus item in the middle of the range was provided.
Test Materials
The experiment included two subdesigns, as illustrated in Table 2. For the
transitive items, a full factorial design was used with verb frame (Frame) and
context (Con) as the two factors, yielding a total of
$\mathit{Frame} \times \mathit{Con} = 4 \times 2 = 8$ cells. For the
ditransitive items, the additional factor remnant type (Remn) was included,
yielding $\mathit{Frame} \times \mathit{Remn} \times \mathit{Con} = 3 \times 4 \times 2 = 24$
cells. Four lexicalizations were used for each of the cells, which resulted in
a total of 128 stimuli. A set of 32 fillers was used, designed to cover the
whole acceptability range.
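As a check on the arithmetic, the two subdesigns can be enumerated directly. The Python sketch below is an illustration, not part of the original study materials: the level labels follow Table 2 and examples (15) to (17), while the variable names are assumptions introduced here.

```python
from itertools import product

# Factor levels as listed in Table 2 (labels follow examples (15)-(17)).
TRANS_FRAMES = ["NP V NP", "NP V PP", "NP V VP", "NP V PP-adj"]
DITRANS_FRAMES = ["NP V NP NP", "NP V NP PP", "NP V NP VP"]
REMNANTS = ["NP _ XP XP", "_ _ XP XP", "NP _ _ XP", "NP _ XP _"]
CONTEXTS = ["felicitous", "null"]
LEXICALIZATIONS = 4

# Full factorial crossing within each subdesign.
trans_cells = list(product(TRANS_FRAMES, CONTEXTS))
ditrans_cells = list(product(DITRANS_FRAMES, REMNANTS, CONTEXTS))

# Every cell is realized in each lexicalization.
n_stimuli = (len(trans_cells) + len(ditrans_cells)) * LEXICALIZATIONS

print(len(trans_cells), len(ditrans_cells), n_stimuli)  # 8 24 128
```

The total of 128 stimuli thus corresponds to (8 + 24) cells times four lexicalizations.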
To control for possible effects from lexical frequency, the stimuli in both
subdesigns were matched for frequency. Verb and noun frequencies were
obtained from a lemmatized version of the British National Corpus (100 million
words) and average frequencies were computed for the verb, the head noun
of the subject, and the head noun of the object for each frame. An ANOVA
confirmed that the average verb, subject, and object frequencies did not differ
significantly between frames.
2.3.3 Procedure
The method used was magnitude estimation as proposed by Lodge (1981) and
extended to linguistic stimuli by Bard et al. (1996). Each subject took part in
an experimental session that lasted approximately 15 minutes and consisted
of a training phase, a practice phase, and an experimental phase. The experi-
ment was self-paced, though response times were recorded to allow the data
to be screened for anomalies.
The experiment was conducted remotely over the Internet. The subject ac-
cessed the experiment using his or her web browser. The browser established
an Internet connection to the experimental server, which was running Web-
Exp 2.1 (Keller et al. 1998), an interactive software package for administering
web-based psychological experiments.
Instructions
Before the actual experiment started, a set of instructions was presented. The
instructions first explained the concept of numerical magnitude estimation of
line length. Subjects were instructed to make estimates of line length relative
to the first line they would see, the reference line. Subjects were told to give
the reference line an arbitrary number, and then assign a number to each
following line so that it represented how long the line was in proportion to the
reference line. Several example lines and corresponding numerical estimates
were provided to illustrate the concept of proportionality.
Then subjects were told that linguistic acceptability could be judged in the
same way as line length. The concept of linguistic acceptability was not de-
fined; instead, examples of acceptable and unacceptable sentences were pro-
vided, together with examples of numerical estimates.
Subjects were told that they could use any range of positive numbers for
their judgments, including decimals. It was stressed that there was no upper
or lower limit to the numbers that could be used (exceptions being zero or
negative numbers). Subjects were urged to use a wide range of numbers and
to distinguish as many degrees of acceptability as possible. It was also
emphasized that there were no "correct" answers, and that subjects should base
their judgments on first impressions and not spend too much time thinking
about any one sentence.
Demographic Questionnaire
After the instructions, a short demographic questionnaire was administered.
The questionnaire included name, email address, age, sex, handedness, aca-
demic subject or occupation, and language region. Handedness was defined
as "the hand you prefer to use for writing", while language region was de-
fined as "the place (city, region/state/province, country) where you learned
your first language". The results of the questionnaire were reported above.
Training Phase
The training phase was meant to familiarize subjects with the concept of
numeric magnitude estimation using line lengths. Items were presented as
horizontal lines centered in the window of the subject's web browser. After
viewing an item, the subject had to provide a numerical judgment over the
computer keyboard. After pressing Return, the current item disappeared and
the next item was displayed. There was no possibility of revisiting previous
items or changing responses once Return had been pressed. No time limit was
set for either the item presentation or for the response.
Subjects first judged the modulus item, and then all the items in the training
set. The modulus remained on the screen all the time to facilitate comparison.
Items were presented in random order, with a new randomization being gen-
erated for each subject.
Practice Phase
This phase allowed subjects to practice magnitude estimation of linguistic
acceptability. The presentation and response procedures were the same as in the
training phase, with linguistic stimuli being displayed instead of lines. Each
subject judged the whole set of practice items.
Experimental Phase
The presentation and response procedures in the experimental phase were
the same as in the practice phase. A between-subjects design was used to
administer the factor Con: Subjects in Group A judged non-contextualized
stimuli, while subjects in Group B judged contextualized stimuli. The factors
Frame and Remn were administered within subjects. There were 64 stimuli
per group, which were placed in a Latin square design, generating four
lexicalizations of 16 items each for each of the groups.
Each subject saw one of the lexicalizations and 16 fillers, i.e., a total of
32 items. Each subject was randomly assigned to a group and a lexicalization:
26 subjects were assigned to Group A, and 23 to Group B. Instructions,
examples, training items, and fillers were adapted for Group B to take context
into account.
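The Latin square distribution of stimuli over presentation lists can be sketched as follows. This is a minimal illustration under stated assumptions: each group has 16 cells (4 transitive cells plus 12 ditransitive Frame by Remn cells) and 4 lexicalizations, and the lists are built by simple rotation. The text does not specify how the square was actually constructed, so the function name latin_square_lists and the rotation rule are hypothetical.

```python
def latin_square_lists(n_cells, n_lex):
    """Build n_lex lists; list k pairs cell c with lexicalization (c + k) % n_lex."""
    return [
        [(cell, (cell + k) % n_lex) for cell in range(n_cells)]
        for k in range(n_lex)
    ]


# Assumed sizes for one group in Experiment 1: 16 cells, 4 lexicalizations.
lists = latin_square_lists(n_cells=16, n_lex=4)

# Each list contains every cell exactly once (16 items per subject).
items_per_list = [len(lst) for lst in lists]

# Across the lists, every (cell, lexicalization) pairing occurs exactly once.
all_pairs = {pair for lst in lists for pair in lst}

print(items_per_list, len(all_pairs))  # [16, 16, 16, 16] 64
```

Under this scheme no subject sees the same cell twice or the same cell in more than one lexicalization, and the 64 stimuli of a group are each used in exactly one list.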
2.4 Results
The data were normalized by dividing each numerical judgment by the mod-
ulus value that the subject had assigned to the reference sentence. This oper-
ation creates a common scale for all subjects. All analyses were carried out
on the geometric means of the normalized judgments. The use of geometric
means is standard practice for magnitude estimation data (Bard et al. 1996,
Lodge 1981).
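The normalization and averaging steps can be illustrated with a short Python sketch. The function names and the sample numbers are invented for illustration; only the two operations (division by the modulus value, then geometric mean) come from the description above.

```python
import math


def normalize(judgments, modulus):
    """Divide each raw magnitude estimate by the subject's modulus value."""
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    return [j / modulus for j in judgments]


def geometric_mean(values):
    """Geometric mean, computed as the exponential of the mean of the logs."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical subject: modulus rated 50, three stimuli of one cell rated 25, 50, 100.
normalized = normalize([25, 50, 100], modulus=50)
cell_mean = geometric_mean(normalized)

print(normalized)           # [0.5, 1.0, 2.0]
print(round(cell_mean, 6))  # 1.0
```

Because the judgments are ratios, a rating of half the modulus and a rating of twice the modulus balance out under the geometric mean, which would not be the case for the arithmetic mean.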
Separate analyses of variance (ANOVAs) were performed for the transitive
and ditransitive verb frames. The analysis of the transitive frames failed to
find a significant main effect of verb frame. The main effect of context was
significant only by items ($F_1(1, 47) = .326$, $p = .571$; $F_2(1, 6) = 29.720$,
$p = .002$), and the interaction of frame and context was non-significant. The
average judgments for the transitive condition are graphed in Figure 1.
For the ditransitive frames, a marginal main effect of verb frame was found
($F_1(2, 94) = 2.727$, $p = .071$; $F_2(2, 12) = 6.037$, $p = .015$).
Furthermore, the ANOVA showed a highly significant main effect of remnant type
($F_1(3, 141) = 18.936$, $p < .0005$; $F_2(3, 18) = 6.564$, $p = .003$), and
an interaction of verb frame and context ($F_1(2, 94) = 5.661$, $p = .005$;
$F_2(2, 12) = 5.096$, $p = .025$). The interaction of remnant type and
context was significant only by subjects ($F_1(3, 141) = 5.483$, $p = .001$;
$F_2(3, 18) = 1.847$, $p = .175$). No main effect of context was found, and
all the remaining interactions were non-significant.
Figure 1. Effect of verb frame on gapping (transitive frames; x-axis: verb frame)
To further investigate the interactions context/verb frame and
context/remnant type, separate ANOVAs were performed for the context condition
and the null context condition. In the null context condition, remnant type was
significant ($F_1(3, 75) = 15.066$, $p < .0005$; $F_2(3, 9) = 5.766$, $p = .018$),
while verb frame, as well as all interactions, failed to reach significance. The
mean judgments for the null context conditions are graphed in Figure 2. This
graph shows that the _ _ XP XP remnant is more acceptable than the other
remnants. This effect is consistent across all frame types.
In the ANOVA for the context condition, remnant type ($F_1(3, 66) = 4.092$,
$p = .010$; $F_2(3, 9) = 1.112$, $p = .394$) and the interaction between verb
frame and remnant type ($F_1(6, 131) = 3.256$, $p = .005$; $F_2(6, 18) = 1.240$,
$p = .332$) produced weak effects that were significant only by subjects. The
mean judgments for the felicitous context conditions are depicted in Figure 3.
This graph shows that the remnant effect disappears in a felicitous context:
The _ _ XP XP remnant is not significantly more acceptable than the other
remnants. This is compatible with Kuno's (1976) account of the interaction
of the constraints MINDIS and SENTP.
The ANOVA for the context condition also revealed a significant main effect
of verb frame ($F_1(2, 44) = 7.677$, $p = .001$; $F_2(2, 6) = 15.919$, $p = .004$).
A post-hoc Tukey test showed that the NP V NP NP verb frame was significantly
less acceptable than both the NP V NP PP and the NP V NP VP frames
($\alpha < .05$).
Figure 2. Effect of verb frame and remnant type on gapping (ditransitive frames,
null context; x-axis: verb frame)
Figure 3. Effect of verb frame and remnant type on gapping (ditransitive frames,
felicitous context; x-axis: verb frame)
2.5 Discussion
For transitive verbs, we found that gapping is equally acceptable for all types
of verbal complements tested (NP, PP, VP). We also failed to find a differ-
ence between PP complements and PP adjuncts. This result settles the con-
troversy on the status of complements and adjuncts in gapping: Hankamer
(1973) claims that PP adjuncts are more acceptable than PP complements, a
claim that is disputed by Jackendoff (1971) and Kuno (1976). These negative
results are also important for our next experiment, as they allow us to disre-
gard the distinction between different verb frames, and between adjuncts and
complements, thus enabling us to use a more compact experimental design.
In contrast to transitive verbs, ditransitive verbs showed an effect of Frame:
in a felicitous context, the NP V NP NP frame was less acceptable than the
other frames. Note, however, that this effect, for which the literature on gap-
ping fails to offer an explanation, is rather small (see Figure 3).
The main finding of Experiment 1 is the effect of remnant type and its
interaction with context. We showed that the _ _ XP XP remnant is more
acceptable than all the other remnants, an effect that is very strong in a null
context, but disappears completely in a felicitous context. This provides
strong evidence for Kuno's (1976) Minimal Distance Principle, and in particular
for his observation that a violation of MINDIS can be overridden by a
satisfaction of the context requirements on gapping (his constraint SENTP).
On the other hand, we found that the NP _ XP XP remnant is not significantly
less acceptable than NP _ _ XP and NP _ XP _, contrary to Kuno's
(1976) claim that gapping must leave behind exactly two remnants.
Now let us briefly consider an alternative explanation for the interaction of
remnant type and context. One could argue that this effect is actually due to
the contexts used, rather than to the stimulus sentences proper. Some initial
plausibility for this view derives from the fact that two of the remnants
(NP _ XP XP and _ _ XP XP) used double wh-questions as contexts (see (19-a)
and (19-b)), while the other two remnants (NP _ _ XP and NP _ XP _) had
single wh-questions as contexts (see (19-c) and (19-d)). It seems plausible to
assume that multiple wh-questions are less acceptable than single ones, and
maybe subjects actually took the acceptability of the context into account
when they judged the acceptability of the stimulus sentences.
To test this hypothesis, an ANOVA was conducted on the contextualized
data with question type as the only factor. This yielded an effect of
question type which was significant by subjects ($F_1(1, 2) = 8.982$, $p = .007$;
$F_2(1, 3) = 1.257$, $p = .344$). However, this effect went in the opposite
direction from what was expected: Single questions (mean = $-.0085$) were less
acceptable than double questions (mean = $.0410$). This result allows us to rule
out the hypothesis that the effect of Remn is due to the type of question used,
rather than to the remnant itself.
Another alternative explanation for the remnant effect is that _ _ XP XP is more
acceptable because it does not contain a subject pronoun. This pronoun is
present in the other three remnants and might reduce acceptability in the null
context condition, as it cannot be anchored to an NP in the context. This
would explain why the remnant effect disappears in context, where such an
antecedent is provided (see (15) and (18)). This alternative explanation for
the remnant effect cannot be ruled out on the basis of Experiment 1. We will
address this issue in the next experiment, which will investigate the behavior
of gapping in non-felicitous contexts. A non-felicitous context provides an
antecedent for the subject pronoun, but differs from a felicitous context in
that it violates SENTP.
3 Experiment 2: Minimal Distance, Subject-Predicate Interpretation,
Simplex Sentence, and Context
3.1 Introduction
The aim of this experiment was to replicate and extend the findings of Ex-
periment 1. It was designed to investigate how the remnant effect found in
Experiment 1 interacts with other constraints on gapping, and how it behaves
in a neutral and non-felicitous context. Table 3 gives an overview of the fac-
tors included in Experiment 2. The constraints are the ones detailed in Sec-
tion 1.1, either violated or not: Minimal Distance (MINDIS), Functional Sen-
tence Perspective (SENTP), Subject-Predicate Interpretation (SUBJPRED),
and Simplex-Sentential Relationship (SIMS).
The constraint MINDIS (see (4)) is satisfied if the distance between the
remnants and their antecedents is minimal, as in (20-a), where the thief can be
paired with the criminal and for robbing the bank can be paired with for bur-
gling the house. (20-b), on the other hand, is in violation of MINDIS, as she
cannot be paired with the neighbor, but has to be paired with the subject he.
(20) a. He punished the criminal for robbing the bank and the thief for
burgling the house.
b. He helped the neighbor by doing the shopping and she by washing
the dishes.
c. He punished the criminal for robbing the bank and the thief the
house.
d. He helped the neighbor by doing the shopping and the friend by
washing the dishes.

Table 3. Factors in Experiment 2

MINDIS (Dis)               SUBJPRED (Pred)   SIMS (Sim)     SENTP (Con)
not violated (_ _ XP XP)   not violated      not violated   not violated (fel. context)
violated (NP _ _ XP)       violated          violated       violated (non-fel. context)
                                                            neutral context (control)
                                                            null context (control)

Another constraint on gapping postulated by Kuno (1976) is SUBJPRED
(see (8)), which requires that the remnants left behind by gapping be
interpreted as a subject and its predicate. This constraint is met in (20-a), where
the thief is the subject of for burgling the house, but it is violated in (20-d),
where the subject of washing the dishes is not the remnant the friend, but the
main clause subject he.
The constraint SIMS (see (12)) requires that the constituents left behind by
gapping have to be part of a simplex sentence, i.e., gapping out of subordi-
nate clauses is disallowed. This constraint is met in (20-a), where the gapped
clause is interpreted as he punished the thief for burgling the house, while
it is violated in (20-c), where the interpretation of the gapped clause is he
punished the thief for robbing the house.
Finally, the experiment included the constraint SENTP (see (6)), which gov-
erns the context required for gapping. Extending the results of Experiment 1,
we included not only a felicitous context condition, in which the remnants are
new while the gap is given (i.e., SENTP is satisfied), but also a non-felicitous
context, in which the remnants are given while the gap is new (i.e., SENTP is
violated). The contexts were formulated as questions, on a par with Experi-
ment 1. In addition to the felicitous and non-felicitous contexts, we included
two control conditions: a null context condition and a neutral context condi-
tion. In the null context condition, the stimuli were presented in isolation. In
the neutral context condition, the stimuli were prefixed by the question What
happened?, which indicates an all-focus information structure.
The examples in (21) show the felicitous contexts that belong to the stimuli
in (20), while (22) gives the corresponding non-felicitous contexts.
(21) a. Who did Michael punish, and why?
b. How did David and Hanna help the neighbor?
c. Who did Michael punish, and why?
d. Who did David help, and how?
(22) a. Why did Michael punish the criminal and the thief?
b. Who did David and Hanna help, and how?
c. Why did Michael punish the criminal and the thief?
d. How did David help the neighbor and the friend?
3.2 Predictions
3.2.1 Constraints
Based on the results of Experiment 1 and on the claims in the theoretical
literature on gapping, we can arrive at a set of predictions regarding the
constraints investigated in the present experiment.
We expect strong unacceptability for a violation of SIMS, i.e., for sentences
in which the remnants are not in a simplex-sentential relationship. Intuitively,
a violation of SIMS is so serious that it cannot be remedied by the satisfaction
of other constraints such as MINDIS, SUBJPRED, or SENTP.
An effect of MINDIS is also predicted, i.e., structures with subject rem-
nants (see (20-b)) are expected to be reduced in acceptability. In line with the
findings of Experiment 1 this effect should disappear in a felicitous context
(see (21-b)).
We also expect a significant effect of SUBJPRED; gapped sentences that
do not allow a subject-predicate interpretation of the remnants (see (20-d))
are predicted to be dispreferred. In line with Kuno's (1976) observations,
we expect this effect to interact with MINDIS, and possibly with SENTP,
i.e., with context (even though Kuno (1976) does not explicitly mention this
possibility).
Finally, Kuno's (1976) account also predicts an effect of SENTP, i.e., a
felicitous context should improve the overall acceptability of a gapped sen-
tence.
3.2.2 Constraint Ranking
The present experiment also allows us to test the validity of Keller's (1998)
model of gradient grammaticality: We predict that the constraints tested in
this experiment cluster into hard and soft constraints. Hard constraints are
expected to receive a high ranking, i.e., trigger a high degree of unaccept-
ability, while soft constraints will receive a low ranking, i.e., cause only mild
unacceptability when violated.
Intuitively, SIMS is a good candidate for a hard constraint, while SUBJPRED
and MINDIS are probably soft constraints. A particularly interesting
question is how context interacts with soft and hard constraints. It seems
plausible to expect soft constraints to be more susceptible to context effects
than hard ones.
3.2.3 Constraint Interaction
Another prediction is that constraint violations are cumulative, i.e., that the
degree of unacceptability of a sentence increases with the number of
constraint violations it incurs. Cumulativity underpins the re-ranking model of
gradience. Note that Keller (1998) found that the cumulativity effect holds
for both soft and hard constraint violations.
3.3 Method
3.3.1 Subjects
Sixty native speakers of English from the same population as in Experiment 1
participated in the experiment. None of them had previously participated in
Experiment 1.
The data of two subjects had to be excluded because they were linguists.
The data of another three subjects were eliminated after an inspection of
their response times showed that they had not completed the experiment
adequately (response times < 1 s or > 100 s). This left 55 subjects for analysis.
Of these, 32 subjects were male, 23 female; eight subjects were left-handed,
47 right-handed. The age of the subjects ranged from 17 to 72 years; the mean
was 31.6 years.
3.3.2 Materials
Training and Practice Materials
These were the same as in Experiment 1.
Test Materials
A full factorial design was used which included the factors Dis, Sim, Pred,
and Con, representing the constraints MINDIS, SIMS, SUBJPRED, and
SENTP, respectively (see Table 3 for an overview of the experimental
design). The factors Dis, Sim, and Pred had two levels (constraint violated
or not violated), while the factor Con had four levels: constraint violated
(non-felicitous context), not violated (felicitous context), plus the two
control conditions (null context and neutral context). This yielded a total of
$\mathit{Dis} \times \mathit{Sim} \times \mathit{Pred} \times \mathit{Con} = 2 \times 2 \times 2 \times 4 = 32$
cells. Eight lexicalizations were used for each of the cells, which resulted in
a total of 256 stimuli. A set of 24 fillers was used, designed to cover the
whole acceptability range.
3.3.3 Procedure
Instructions, Demographic Questionnaire, Training and Practice Phase
These were the same as in Experiment 1.
Experimental Phase
The presentation and response procedures in the experimental phase were the
same as in Experiment 1. A between-subjects design was used to administer
the experimental stimuli: Subjects in Group A judged non-contextualized
stimuli, while subjects in Group B judged contextualized stimuli.
For Group A, four test sets were used: Each set contained two lexicaliza-
tions for each of the cells in the design Dis χ Sim χ Pred, i.e., a total of 16
items. The items were distributed over the test sets in a Latin square design.
For Group B, eight test sets were used, each containing the design in one
lexicalization and three contextualizations. This yielded 24 items per test set,
which again were placed in a Latin square.
In Group A, each subject saw 32 items: 16 experimental items and 16 fillers.
In Group B, each subject saw 40 items: 24 experimental items and 16 fillers.
Each subject was randomly assigned to a group and a lexicalization; 25
subjects were assigned to Group A, and 30 to Group B. Instructions, examples,
training items, and fillers were adapted for Group B to take context into
account.
Figure 4. Context effects for SIMS (x-axis: context)
3.4 Results
As in Experiment 1, all analyses were carried out on the geometric means
of the normalized judgments. Separate ANOVAs were performed for the null
context condition and the context condition.
3.4.1 Constraints
Simplex Sentence
In the null context condition, a highly significant main effect of Sim was
found (F1(1, 24) = 23.415, p < .0005; F2(1, 7) = 18.918, p = .003). The
same effect of Sim was present in the context condition (F1(1, 29) = 97.310,
p < .0005; F2(1, 7) = 15.548, p = .006). The interaction between Sim and
context was non-significant.
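The paired F1 and F2 values come from running each test twice, once over subject means and once over item (lexicalization) means. The sketch below covers only the simplest case of a single two-level factor and is not the ANOVA actually run for the experiment; the data and all names are invented. For such a factor the F statistic equals the squared paired t statistic, with degrees of freedom (1, n - 1). This matches the degrees of freedom in the text: 25 subjects give (1, 24), 30 subjects give (1, 29), and 8 lexicalizations give (1, 7).

```python
from statistics import mean, variance


def unit_differences(rows, unit):
    """Average scores per unit and level, then return the difference
    (not violated minus violated) for each unit.

    rows: dicts with keys 'subject', 'item', 'violated' and 'score'.
    unit: 'subject' for a by-subjects analysis, 'item' for by-items.
    Every unit must have scores at both levels.
    """
    cells = {}
    for r in rows:
        cells.setdefault((r[unit], r["violated"]), []).append(r["score"])
    units = sorted({key[0] for key in cells})
    return [mean(cells[(u, False)]) - mean(cells[(u, True)]) for u in units]


def f_two_level(differences):
    """F statistic and degrees of freedom for a two-level within-unit
    factor, computed from per-unit difference scores."""
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two units")
    var = variance(differences)  # sample variance, n - 1 denominator
    if var == 0:
        raise ValueError("zero variance: F is undefined")
    return mean(differences) ** 2 / (var / n), (1, n - 1)


# Invented toy data: three subjects, two items.
rows = [
    {"subject": s, "item": i, "violated": v, "score": x}
    for (s, i, v, x) in [
        (1, "a", False, 0.3), (1, "b", True, -0.1),
        (2, "a", True, -0.2), (2, "b", False, 0.1),
        (3, "a", False, 0.2), (3, "b", True, -0.3),
    ]
]
f1, df1 = f_two_level(unit_differences(rows, "subject"))
f2, df2 = f_two_level(unit_differences(rows, "item"))
print(df1, df2)  # (1, 2) (1, 1)
```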
Figure 4 depicts the mean judgments for a violation of SIMS in all contexts.
It indicates that SIMS violations have a strong effect on acceptability and
illustrates the absence of a context effect: A violation of SIMS results in the
same decrease in acceptability in all contexts (including the null context and
the neutral context).
Figure 5. Context effects for MINDIS (x-axis: context, with the levels null,
neutral, non-felicitous, felicitous; series: MINDIS not violated, MINDIS
violated)
Figure 6. Context effects for SUBJPRED (x-axis: context; series: SUBJPRED not
violated, SUBJPRED violated)
Minimal Distance
In the null context condition, a highly significant main effect of Dis was found
(F1(1, 24) = 25.997, p < .0005; F2(1, 7) = 14.612, p = .007). Dis was
also significant in the context condition (F1(1, 29) = 23.315, p < .0005;
F2(1, 7) = 11.421, p = .012), where an interaction of Dis and Sim was
also present, significant by subjects only (F1(1, 29) = 4.568, p = .001;
F2(1, 7) = 2.111, p = .190).
The ANOVA also revealed a significant interaction of Dis and context
(F1(2, 58) = 4.568, p = .014; F2(2, 14) = 6.553, p = .010). We
investigated this interaction by conducting separate ANOVAs for the three
context conditions. In the neutral context condition, we found a main effect of
Dis (F1(1, 29) = 15.282, p = .001; F2(1, 7) = 11.207, p = .012). Also in
the non-felicitous context condition, a highly significant effect of Dis was
obtained (F1(1, 29) = 20.747, p < .0005; F2(1, 7) = 16.904, p = .005).
However, the ANOVA for the felicitous context failed to detect an effect of
Dis. Figure 5 depicts the interaction of context with Dis. It shows that the
effect of Dis disappears in the felicitous context, in line with our predictions.
Subject-Predicate Interpretation
The main effect of Pred failed to reach significance in the null context
condition. In the context condition, a main effect of Pred was found
(F1(1, 29) = 19.377, p < .0005; F2(1, 7) = 9.891, p = .016). The
interaction of Pred and context failed to be significant. There was, however,
an interaction of Pred and Sim that was significant by subjects only
(F1(1, 29) = 11.453, p = .002; F2(1, 7) = 2.524, p = .156).
Figure 6 depicts the interaction of context with Pred. Note the absence of a
context effect, contrary to our expectation that SUBJPRED is a context
dependent constraint. However, the presence of a Pred/Sim interaction might
indicate that the effect of Sim blocks out the context effect of Pred. Recall
that a violation of SIMS leads to a high degree of unacceptability, while
SUBJPRED only has a small effect on acceptability. It is therefore appropriate
to factor out violations of SIMS (and other constraints), and to look at the
effect of context on single violations of SUBJPRED. The mean judgments for
single violations of SUBJPRED are depicted in Figure 7, which indicates that
the effect of Pred in the neutral context is stronger than in the other
contexts. To confirm this observation, we conducted separate ANOVAs for single
violations of SUBJPRED for the four context conditions. In the null context,
the felicitous context, and the non-felicitous context, no significant effect of
a single SUBJPRED violation was found. In the neutral context, however, a
single violation of SUBJPRED led to a significant reduction in acceptability
(F1(1, 29) = 8.327, p = .007; F2(1, 7) = 5.610, p = .050).
Functional Sentence Perspective
The ANOVA on the context condition showed a significant main effect of Con
(F1(1, 29) = 10.209, p < .0005; F2(1, 7) = 13.082, p = .001). A post-hoc
Tukey test was conducted to investigate the locus of the Con effect. It was
found that the neutral context was significantly less acceptable than both the
felicitous and the non-felicitous context (α < .01 in both cases). However,
there was no difference between the felicitous and the non-felicitous context.
Figure 7. Context effects for SUBJPRED (single violations; x-axis: context)
Figure 8 compares the degree of unacceptability caused in each context by
single violations of the constraints SIMS, MINDIS, and SUBJPRED. The
graph indicates that a violation of SUBJPRED only has a small effect on
acceptability. A violation of SIMS leads to serious unacceptability, while a
violation of MINDIS is somewhere in-between.
To test if these differences in unacceptability were significant, we
conducted a separate ANOVA on the subset of the data that only contained single
violations. In the null context, a significant effect of constraint type was found
(F1(2, 48) = 6.817, p = .002; F2(2, 14) = 5.509, p = .017). A post-hoc
Tukey test showed that the degree of unacceptability caused by a violation of
SIMS was higher than the degree of unacceptability caused by a violation of
SUBJPRED (by subjects, α < .01, and by items, α < .05). Also, the degree
of unacceptability associated with a MINDIS violation was higher than that
associated with a SUBJPRED violation (by subjects only, α < .05).
We also found a significant effect of constraint type in the context condition
(F1(2, 58) = 19.251, p < .0005; F2(2, 14) = 3.693, p = .052). A Tukey
test showed that a violation of SIMS caused a higher degree of unacceptability
than either a violation of SUBJPRED (α < .05) or a violation of MINDIS (by
subjects only, α < .01). The difference between MINDIS and SUBJPRED
failed to reach significance in the context condition.
Figure 8. Effect of type of violation (single violations; x-axis: constraint
violation, with the levels none, SUBJPRED, MINDIS, SIMS; series: null context,
context)
To test the hypothesis that constraint violations are cumulative, we recoded
the data such that the number of constraint violations was the independent
variable. In the null context condition, an ANOVA on the recoded data revealed
a significant effect of number of violations (F1(3, 72) = 21.817, p < .0005;
F2(3, 21) = 19.217, p < .0005). Also in the context condition, a highly
significant effect of number of violations was obtained (F1(3, 87) = 65.062,
p < .0005; F2(3, 21) = 24.993, p < .0005).
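The recoding step can be sketched as follows. This is an illustration under assumptions, not the script used for the analysis: each stimulus is represented as a hypothetical mapping from constraint names to a violated/not violated flag, and the function names are invented. Counting violations over the three constraints yields four levels (0 to 3), which is consistent with the numerator degrees of freedom of 3 reported above.

```python
from itertools import product

CONSTRAINTS = ("MINDIS", "SIMS", "SUBJPRED")


def count_violations(cell):
    """cell maps each constraint name to True if the stimulus violates it."""
    return sum(1 for name in CONSTRAINTS if cell[name])


def recode(cells):
    """Group design cells by their number of constraint violations."""
    groups = {}
    for cell in cells:
        groups.setdefault(count_violations(cell), []).append(cell)
    return groups


# All eight combinations of violating or satisfying the three constraints.
all_cells = [dict(zip(CONSTRAINTS, flags))
             for flags in product((False, True), repeat=3)]
groups = recode(all_cells)
print({k: len(v) for k, v in sorted(groups.items())})  # {0: 1, 1: 3, 2: 3, 3: 1}
```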
The effect of number of violations is graphed in Figure 9. This graph shows
a consistent cumulativity effect for both the null context and the context
condition. A post-hoc Tukey test was conducted to locate the effect of number
of violations. For the null context condition, it was found that a single
violation was significantly less acceptable than zero violations (by subjects,
α < .01, and by items, α < .05). The difference between one and two violations
failed to be significant, but two violations were significantly less acceptable
than zero violations (α < .01). The difference between two and three violations
was again not significant, but three violations were significantly less
acceptable than one violation (α < .01).
Figure 9. Effect of number of violations (x-axis: number of violations)
The same post-hoc test was conducted for the context condition. Again, it
was found that one violation was less acceptable than zero violations (α <
.01), while two violations were less acceptable than one violation (α < .01).
The difference between two and three violations was again too small to reach
significance, but three violations were significantly less acceptable than
one violation (α < .01).
3.5 Discussion
3.5.1 Constraints
Experiment 2 found main effects of Sim, Dis, and Pred. This demonstrated
that violations of the constraints MINDIS, SUBJPRED, and SIMS significantly
reduce the acceptability of gapped sentences, as predicted by Kuno's
(1976) account of gapping. A main effect of Con was also present, but
contrary to predictions, no difference between the acceptability of gapping in a
felicitous and a non-felicitous context was found. However, the acceptability
of gapping in the felicitous and the non-felicitous context was significantly
higher than in the neutral context. This seems to indicate that even a
non-felicitous context provides an information structure that is partially
compatible with the requirements of the constraint SENTP.
We also found that SENTP interacts with other constraints on gapping. A
significant interaction of Con and Dis was obtained: A violation of MINDIS
leads to reduced acceptability in the null context, the neutral context, and the
non-felicitous context. In the felicitous context (that satisfies the information
structure constraint SENTP), the effect of Dis disappeared. Note that the null
context and the neutral context behaved in the same fashion with respect to
MINDIS violations; this is expected based on the hypothesis that even a null
context carries implicit information structural assumptions, and is interpreted
by subjects on a par with a neutral (all new) context.
Similar to the Dis effect, the effect of Pred was also found to be context
dependent. Considering stimuli that incur a single violation of SUBJPRED, we
found a significant effect of Pred only in the neutral context; in the felicitous
and non-felicitous contexts, the effect of Pred was too small to be significant.
Also, in the null context, no effect of Pred was found, even though this would
be expected under the assumption that the null context behaves like a neutral
(all new) context.
In contrast to MINDIS and SUBJPRED, the Simplex S constraint SIMS was
found to be immune to context effects: It caused consistently strong
unacceptability, independent of which context was presented. This is in line with
our predictions regarding the behavior of SIMS.
Another one of Kuno's (1976) observations can be tested against the data
from Experiment 2. Examples like (9) and (10) seem to indicate that a
satisfaction of SUBJPRED can override a violation of MINDIS. However, we
failed to find an interaction of Dis and Pred in either the null context
condition or the context condition. This might indicate that the interaction of
SUBJPRED and MINDIS that Kuno (1976) observes is limited to examples
like the ones in (9) and (10), and does not generalize to our experimental
stimuli.
Finally, the results of the present experiment allow us to evaluate the alter-
native explanation for the Dis effect we discussed in Section 2.5: The XP
XP remnant is more acceptable than the XP XP remnant because the lat-
ter contains a subject pronoun, which reduces acceptability if it is not contex-
tually anchored (in a null or neutral context). This explanation can be ruled
out on the basis of Experiment 2, which demonstrated a Dis effect for the non-
felicitous context condition, i.e., even if the subject pronoun can be anchored
to a contextually given NP.
A second set of predictions for Experiment 2 was based on Keller's (1998)
model of gradient grammaticality as constraint re-ranking. This model rests
on the assumption that constraints cluster into hard constraints (that lead to
serious unacceptability) and soft constraints (that cause only mild
unacceptability). Consider Figure 8, which graphs the unacceptability incurred
by single violations of the three constraints SIMS, MINDIS, and SUBJPRED.
We found that a SIMS violation was significantly more serious than a
violation of MINDIS, which in turn was significantly more serious than a
violation of SUBJPRED, leading to the overall ranking of SIMS ≫ MINDIS ≫
SUBJPRED. We conclude that SIMS qualifies as a hard constraint, as it leads
to strong unacceptability, while SUBJPRED induces only mild unacceptability
and thus should be classified as soft. The status of MINDIS is less clear,
as it falls in-between these two extremes.
Note, however, that we also observed that the soft constraint SUBJPRED
was subject to contextual variation (consider the increased effect of a
SUBJPRED violation in the neutral context). On the other hand, SIMS, a hard
constraint, was immune to context effects. This leads to the more general
hypothesis that soft constraints are subject to context effects, while hard
constraints are immune to contextual variation. If correct, this hypothesis would
provide us with a new diagnostic for the hard/soft distinction, in addition to
constraint strength (proposed in Keller 1998). Based on this hypothesis, we
can classify MINDIS as a soft constraint, as it is clearly subject to context
effects, even though its constraint strength is relatively close to that of SIMS,
a hard constraint.
The findings of Experiment 2 confirm another assumption on which the
re-ranking model rests: Constraint violations are cumulative, i.e., the degree of
unacceptability increases with the number of violations. A clear cumulativity
effect was obtained for both the null context condition and the context
condition (see Figure 9).
4 General Discussion
4.1 Implications for Linguistic Methodology
This paper is part of a line of research that draws on the experimental par-
adigm of magnitude estimation to obtain linguistic judgment data that are
reliable and maximally delicate. This line of research, which was initiated
by Bard et al. (1996) and Cowart (1997), has contributed to linguistic the-
ory by settling data disputes that could not be resolved solely on the basis of
intuitive, informal acceptability judgments. Relevant experimental findings
have been obtained in studies on extraction (Cowart 1989, 1997, Keller 1996,
1997), binding theory (Cowart 1997), unaccusativity (Sorace 1993a,b, 2000),
and word order (Keller and Alexopoulou 2000, Keller 2000a).
The results of Experiments 1 and 2 confirm the usefulness of an experi-
mental approach to linguistic data by applying magnitude estimation to gap-
ping constructions. Experiment 1 showed that PP adjuncts and PP comple-
ments are equally acceptable as remnants in gapping, a fact that has been
surrounded by controversy in the theoretical literature. It also provided evi-
dence against the claim that gapping must leave behind exactly two remnants
(Kuno 1976). Another theoretically interesting result is that subject remnants
are less acceptable than object remnants, an effect that turned out to be con-
text dependent. Experiment 2 confirmed this result and provided evidence
for another context dependent constraint on gapping (Subject-Predicate In-
terpretation), but also discovered a constraint that is immune to context ef-
fects (Simplex S). More importantly, Experiment 2 provided data on how
the constraints on gapping interact, i.e., on what happens if more than one
constraint is violated. Such interaction data, which cannot easily be obtained
with the traditional intuitive approach, allow us to make observations on
how constraints compete, and thus can inform an optimality theoretic model
that deals with gradient linguistic data (Hayes 2000, Hayes and MacEachern
1998, Keller 1998, Müller 1999).
4.2 Implications for Optimality Theory
The interaction of Dis, Pred, and Con demonstrated in Experiment 2 can
be regarded as evidence that gapping is subject to constraint competition, a
fact that was already noted by Kuno (1976) (who, however, did not have the
conceptual tools of modern Optimality Theory at his disposal).
Offering a detailed analysis of the experimental data based on an optimality
theoretic model is beyond the scope of the present paper. The reader is
referred to Keller (2000b), who presents both an explicit model of gradience
in Optimality Theory, and a detailed account of the gapping data from
Experiments 1 and 2.
4.3 Implications for the Re-Ranking Model
The work presented in this paper provided additional evidence for two cen-
tral assumptions underlying the re-ranking model of gradience (Keller 1998):
First, the experimental data confirmed the cumulativity of constraint viola-
tions assumed by the re-ranking model. In addition, the results support the
soft/hard distinction of constraint violations previously demonstrated for ex-
traction. Context effects on gapping were also investigated, and we arrived at
the hypothesis that soft constraints are subject to context effects, while hard
constraints are immune to contextual influences. If correct, this hypothesis
would provide us with an additional diagnostic for the hard/soft distinction.
This has to be validated in further experimental work.
Also, the present results allow us to speculate on the theoretical status of
hard and soft constraints, and its implications for grammar architecture. One
possible line of argumentation is that soft constraints are limited to the in-
terface level of the grammar (syntax-semantics, syntax-pragmatics, syntax-
lexicon), while hard constraints are internal to syntax. This would explain
why soft constraints cause only weak acceptability effects and can be over-
ridden by context, while hard violations cause strong unacceptability and are
immune to context effects.
The constraints identified as soft in the present study belong to the
syntax-semantics or syntax-pragmatics interface (Minimal Distance, Subject-
Predicate Interpretation), while the hard constraint (Simplex S) seems to be
syntactic in nature. This observation squares well with previous results on
extraction, where constraints on phrase structure, agreement, and subcatego-
rization were found to be hard, while soft constraints included referentiality
and definiteness, i.e., constraints located at the syntax-semantics interface.
Notes
Thanks to Mark Steedman for important advice on the work reported here.
Comments on earlier stages of this paper were provided by Maria Lapata and by the
audience of the Workshop on Competition in Syntax, Constance, March 1999. The
support of an ESRC Postgraduate Research Studentship is also acknowledged.
1. All examples in this section are taken from Kuno (1976).
2. We supply constraint names for notational convenience.
3. This terminology should not be taken to imply that hard constraints are invio-
lable, while soft constraints are violable in an optimality theoretic sense. The
soft/hard distinction is an empirical one, based on the acceptability profile of a
constraint.
4. Note however, that a re-ranking model can only explain cumulative violations of
different constraints. Cumulative violations of the same constraint are not pre-
dicted to lead to an increase of unacceptability, as they can be dealt with by a
single re-ranking (see Keller 1998 for details).
5. Consider the following examples from Hankamer (1973), which are analogous
to our sentences (15-b) and (15-d) (the acceptability judgments are his):
(i) a. *Max wanted to put the eggplant on the table, and Harvey in the sink.
    b. ?Max writes plays in the bedroom, and Harvey in the basement.
References
Bard, Ellen Gurman — Dan Robertson — Antonella Sorace

1996 Magnitude estimation of linguistic acceptability. Language 72(1): 32-68.
Chomsky, Noam
Cowart, Wayne
1989 Illicit acceptability in picture NPs. In: Caroline Wiltshire, Randolph
Graczyk and Bradley Music (eds.) Papers from the 25th Meeting of the
Chicago Linguistic Society, vol. 1: The General Session, 27-40. Chicago.
Cowart, Wayne
1997 Experimental Syntax: Applying Objective Methods to Sentence Judg-
ments. Thousand Oaks, CA: Sage Publications.
Hankamer, Jorge
1973 Unacceptable ambiguity. Linguistic Inquiry 4: 17-68.
Hayes, Bruce P.
2000 Gradient well-formedness in Optimality Theory. In: Joost Dekkers, Frank
van der Leeuw and Jeroen van de Weijer (eds.) Optimality Theory:
Phonology, Syntax, and Acquisition. Oxford: Clarendon Press.
Hayes, Bruce P. — Margaret MacEachern
1998 Folk verse form in English. Language 74(3): 473-507.
Jackendoff, Ray S.
1971 Gapping and related rules. Linguistic Inquiry 2: 21-35.
Keller, Frank
1996 How do humans deal with ungrammatical input? Experimental evidence
and computational modelling. In: Dafydd Gibbon (ed.) Natural Lan-
guage Processing and Speech Technology: Results of the 3rd KONVENS
Conference, Bielefeld, October 1996, 27-34. Berlin: Mouton de Gruyter.
Keller, Frank
1997 Extraction, Gradedness, and Optimality. In Alexis Dimitriadis, Laura
Siegel, Clarissa Surek-Clark, and Alexander Williams, eds., Proceedings
of the 21st Annual Penn Linguistics Colloquium, 169-186. (Penn Work-
ing Papers in Linguistics, no. 4.2.) Department of Linguistics, University
of Pennsylvania.
Keller, Frank
1998 Gradient grammaticality as an effect of selective constraint re-ranking.
In: M. Catherine Gruber, Derrick Higgins, Kenneth S. Olson and Tamra
Wysocki (eds.) Papers from the 34th Meeting of the Chicago Linguistic
Society, vol. 2: The Panels, 95-109. Chicago.
Keller, Frank
2000a Evaluating competition-based models of word order. In Proceedings of
the 22nd Annual Conference of the Cognitive Science Society. Philadel-
phia, PA.
Keller, Frank
2000b Gradience in Grammar: Experimental and Computational Aspects of De-
grees of Grammaticality. PhD thesis, University of Edinburgh.
Keller, Frank — Theodora Alexopoulou
2000 Phonology competes with syntax: Experimental evidence for the interac-
tion of word order and accent placement in the realization of information
structure. Cognition, to appear.
Keller, Frank — M. Corley — S. Corley — L. Konieczny — A. Todirascu
1998 WebExp: A Java Toolbox for Web-Based Psychological Experiments.
Technical Report HCRC/TR-99, Human Communication Research Cen-
tre, University of Edinburgh.
Kuno, Susumu
1976 Gapping: A functional analysis. Linguistic Inquiry 7: 300-318.
Legendre, Géraldine — C. Wilson — P. Smolensky — K. Homer — W. Raymond
1995 Optimality and wh-extraction. In: Jill Beckman, Laura Walsh Dickey and
Suzanne Urbanczyk (eds.) Papers in Optimality Theory, 607-636. (Uni-
versity of Massachusetts Occasional Papers in Linguistics 18) University
of Massachusetts, Amherst.
Lodge, Milton
1981 Magnitude Scaling: Quantitative Measurement of Opinions. Beverly
Hills, CA: Sage Publications.
Müller, Gereon
1999 Optimality, markedness, and word order in German. Linguistics 37(5):
777-818.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar.
Technical Report 2, Center for Cognitive Science, Rutgers University.
Prince, Alan — Paul Smolensky
1997 Optimality: From neural networks to universal grammar. Science 275:
1604-1610.
Ross, John R.
1970 Gapping and the order of constituents. In: Manfred Bierwisch and
Karl Erich Heidolph (eds.) Progress in Linguistics: A Collection of Pa-
pers, 249-259. The Hague: Mouton.
Schütze, Carson T.
1996 The Empirical Base of Linguistics: Grammaticality Judgments and Lin-
guistic Methodology. Chicago: University of Chicago Press.
Sorace, Antonella
1992 Lexical conditions on syntactic knowledge: Auxiliary selection in native
and non-native grammars of Italian. Ph.D. dissertation, University of Ed-
inburgh.
Sorace, Antonella
1993a Incomplete vs. divergent representations of unaccusativity in non-native
grammars of Italian. Second Language Research 9: 22-47.
Sorace, Antonella
1993b Unaccusativity and auxiliary choice in non-native grammars of Ital-
ian and French: Asymmetries and predictable indeterminacy. Journal of
French Language Studies 3: 71-93.
Sorace, Antonella
2000 Gradients in split intransitivity: Auxiliary selection in Western European
languages. Language, to appear.
Stevens, Stephen S.
1975 Psychophysics: Introduction to its Perceptual, Neural, and Social
Prospects. New York: John Wiley.
Tesar, Bruce — Paul Smolensky
1998 Learnability in Optimality Theory. Linguistic Inquiry 29(2): 229-268.
Word Order Variation: Competition or Co-Operation?
Jürgen Lenerz
The interplay of different factors in word order variation seems to call for a
description in terms of a competition model. Still, a general assessment of
competition models shows certain drawbacks in explanatory power. I argue
that an approach in terms of a detailed description of the interacting factors
of co-operating subsystems may result in a deeper understanding of central
facts of word order variation. Using well-known data from German, this is
exemplified by a semantic and pragmatic analysis of the referential power
of definite and indefinite noun phrases in the background and the focus re-
spectively. Such an analysis in terms of a choice function approach not only
provides a deeper insight into the relevant mechanisms, but it also, I hope,
opens the door for further research in a number of areas not connected with
the study of word order variation.
1 The State of the Art
Word order variation is a phenomenon of many languages which seems to
be situated at the crossroads of syntax, prosodic structure, semantics, and
discourse pragmatics. Many relevant factors of each of these determining
systems have been described in some detail in the literature of the last two
decades. In syntax, there has been some debate about the adequate approach
in terms of flat vs. branching structure, base generation or scrambling, and
obligatory or optional movement, to name but the most important issues. For
some literature concerning the state of affairs in German, cf., amongst
others, Haider (1993), Fanselow (1997), Haider & Rosengren (1998), Müller
(1998) and the literature cited there. As far as prosodic structure is
concerned, its interaction with (discourse related) focus-assignment in syntax
and its realisation as prosodic prominence has also been dealt with in a
number of works. Here, the questions of focus-projection (or: F-percolation),
the rise-fall-contour, etc. have been investigated in some detail by Selkirk
(1984), Uhmann (1991), Jacobs (1991, 1992a,b), Eckardt (1996), Truckenbrodt
(1996), Zubizarreta (1998) and others. Some important semantic aspects
of word order variation have also been addressed. Here, the problems of
scope assignment, scope inversion, and the related issues of the proper
interpretation of indefinite NPs as generic or existential have been investigated
in some detail by Diesing (1990), Büring (1996, 1997), Eckardt (1996), Krifka
(1998) and others. Finally, progress has also been made in the area of the
relevant aspects of discourse pragmatics, which is responsible for the central
concept of background vs. focus information; cf., amongst others, Choi
(1996), Büring (1997), and the literature mentioned there.
The apparent interaction of various determining systems has always given
rise to a description of word order variation in terms of a competition of these
systems. Various competition models have been proposed for a treatment
of word order variation in German and other languages; cf., among others,
Uszkoreit (1984), Jacobs (1988a,b), Hawkins (1990), Primus (1993, 1994),
Büring (1996), Choi (1996), and Müller (1998).
As natural as it seems to treat word order variation in terms of a compe-
tition model, there are, to my mind, a number of serious objections to these
approaches to word order variation in German. Without going into any de-
tailed analysis of the particular proposals, the following objections come to
mind: Most competition models seem to be content with stating a number
of purely observational generalizations, and putting these into some impli-
cational order. This can be shown for my own early account (Lenerz 1977a)
and many subsequent discussions. In such a procedure, the resulting word
order variation can, of course, be derived, and the relevant conditions can be
shown in their interaction. The relevant conditions themselves, however, are
not explained in any way. In addition, the conditions frequently lump together
information from different systems, thus in themselves violating the method-
ological principle of modularity. (An example will be given below by the
so-called "existential axiom"; cf. Büring 1996: 6.) Moreover, one of the most
puzzling questions of all competition models concerns the definition of the
set of candidates that are supposed to be in competition with each other when
faced with a number of conditions they should comply with in the best way
possible. For most approaches, there seems to be a tacit assumption that the
candidate set consists of sentences with the same lexical material ("numera-
tion" in terms of the Minimalist Program, Chomsky 1995; cf. Lenerz 1998)
and the same θ-structure, thus of sentences with (roughly) the same mean-
ing. (For some discussion of the proper restrictions, cf. Jacobs 1988a,b.) In
most cases, however, it is left unclear to which degree the elements of the
candidate set are supposed to be synonymous. Is identity of sentence mood
required, i.e., are questions, assertions, imperatives, etc. in the same candi-
date set? What about the position of adverbials and their varying influence
on meaning? What about quantifier placement and corresponding scope rela-
tions? When faced with these problems and related ones, it seems very diffi-
cult to decide what a proper definition of "candidate set" should be in order
to make a competition analysis work for a given case.
As regards Optimality Theory (OT) as the currently most prominent version
of a competition model, there seems to be a division of labour between an
underlying definition of competing structures by the function GEN (generate)
and the filtering device of constraints which are put in a specific hierarchy.
The more specific GEN is, the less work will be left for the constraints.
(An example will be discussed below; cf. Choi 1996.)
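The division of labour between GEN and the constraint hierarchy can be made concrete with a small sketch of standard OT evaluation. It is meant purely as an illustration: the two constraints, the string encoding of candidates, and all function names are hypothetical and are not taken from any of the analyses cited. Candidates are compared on their violation counts, constraint by constraint from the highest-ranked one down, so a single violation of a higher constraint outweighs any number of violations of lower ones.

```python
def violation_profile(candidate, ranked_constraints):
    """Violation counts, ordered from the highest-ranked constraint down."""
    return tuple(constraint(candidate) for constraint in ranked_constraints)


def evaluate(candidates, ranked_constraints):
    """Return the candidates whose violation profile is lexicographically
    minimal, i.e. the optimal candidates under strict ranking."""
    profiles = {c: violation_profile(c, ranked_constraints) for c in candidates}
    best = min(profiles.values())
    return [c for c in candidates if profiles[c] == best]


# Hypothetical toy constraints over word-order strings such as "IO>DO".
def unmarked_order(candidate):
    """One violation unless the candidate shows the order IO > DO."""
    return 0 if candidate == "IO>DO" else 1


def focus_last(candidate, focus="IO"):
    """One violation unless the (assumed) focused object comes last."""
    return 0 if candidate.endswith(focus) else 1


gen_output = ["IO>DO", "DO>IO"]  # stands in for the output of GEN
print(evaluate(gen_output, [focus_last, unmarked_order]))  # ['DO>IO']
print(evaluate(gen_output, [unmarked_order, focus_last]))  # ['IO>DO']
```

If the candidate list passed to `evaluate` contains only one element, that element wins whatever the ranking, which is the limiting case of a maximally specific GEN.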
Thus, all these particular objections taken together, there appears to be the
danger of a lack of descriptive adequacy as long as competition models deal
with observational generalizations instead of detailed and precise definitions
of the relevant properties of well-formed sentences, preferably based on a
small number of underlying general axiomatic principles.
In the following, I will try to show that a (more traditional) derivational
approach to some of the factors determining word order variation in German
may lead to a better understanding of the relevant forces and their interaction.
This can be achieved, I believe, in a model which assumes that the systems
which determine word order variation and its constraints all co-operate in
precise and predictable ways in defining the relevant properties of individual
sentences. If successful, such an approach will ideally define for any given
"numeration" with a specific meaning a candidate set containing exactly one
element, thus making a subsequent competition obsolete.
2 The Analysis
2.1 The Idea
In the following, I will try to demonstrate how a modular derivational
analysis of some well-known phenomena of word order variation in German is
able to provide a deeper and more adequate understanding of the relevant
principles co-operating in defining a sentence in its structure and its specific
meaning. In order to be able to do so, I will first present a traditional account
in 2.2; in 2.3, I will develop a proper structural distinction of sentences of
German in a B-part (background) and an F-part (focus and focus-affiliated
elements). This bi-partition will then be shown to correspond with a semantic
distinction in b(ackground)-determined vs. immediate sentence constituent
(isc)-dependent interpretation of constituents. The proper referential inter-
pretation of NPs will be derived on the basis of a choice function approach in
2.4, following a recent proposal by von Heusinger (1997). This analysis will
then be used in 2.5-2.8 to explain some well-known constraints on word or-
der variation in German. My proposal, which is summarized in 3., requires a
distinction between a referential and a non-referential part of the meaning of
NPs. Independent evidence for this distinction will be given in 4., indicating
its usefulness for further research in hitherto unrelated areas of syntax.
2.2 The Facts
The phenomena I will deal with were first described in some detail in Lenerz
(1977a) and have been discussed under various aspects in subsequent work.
Recent attempts to describe them in terms of an OT-account can be found in
Choi (1996), Büring (1996) and Müller (1998), with some discussion of the
relevant literature.
In German, the order of the indirect object (IO, dative) and the direct object
(DO, accusative) depends to a large degree on the particular verb, probably
basically being due to considerations of animateness (cf. Vogel & Steinbach
1998). There has been a considerable amount of debate about the proper syn-
tactic analysis, concerning the basic order of arguments, binding relations,
etc. (cf., amongst others, Rosengren 1993, Fanselow 1993, 1997, Haider &
Rosengren 1998, and Müller 1998). I won't take up this discussion. For the
present purposes, the following may suffice: It has generally been agreed that
for a large part of German verbs the unmarked order is IO > DO, as the evi-
dence in (2) shows. The order IO > DO in (2) is not subject to the constraints
obtaining for the reverse order in (3) and (4). A reasonable explanation for
naming IO > DO the unmarked order may be given along the lines of Höhle
(1982) and some subsequent work (cf. Büring 1997): The unmarked order
may be used in more discourse contexts since it allows more focus interpreta-
tions (via F-projection). I won't go into this matter, however, in the present pa-
per. More discussion may be found in Reinhart (1995, 1997), Eckardt (1996),
Truckenbrodt (1996), Zubizarreta (1998) and related work; cf. also Uhmann
(1991).
The following generalizations have been observed (cf. Lenerz 1977a,
Büring 1996):
(1) a. [±def IO] > [±def DO]: "unmarked order", regardless of focus
position (cf. (2-a), (3-a), (4-a))
b. [+def DO] > [IO]F: scrambling of [+def, -F] is o.k. (cf. (2-b))
c. *[±def DO]F > IO = don't scramble focus! (cf. (3))
d. *[-def DO] > [IO]F = don't scramble (existential) indefinites! (cf.
(4))
The questions of (2)-(4) are supposed to give an adequate context for the
respective answers in which the focus is given prominence by intonation, as
indicated by capital letters. The questioned constituent is indicated (Q: DO
or Q: IO).
(2) Wem hast du das Buch gegeben? (Q: IO)
'Whom did you give the book?'
a. Ich habe [dem/einem StuDENten]F das/ein Buch gegeben.
I have the/a student the/a book given
[±def IO]F > [±def DO] ("unmarked order")
b. Ich habe das Buch [dem/einem StuDENten]F gegeben.
I have the book the/a student given
[+def DO] > [±def IO]F (scrambled [+def DO, -F] is o.k.)
(3) Was hast du dem Studenten gegeben? (Q: DO)
'What did you give to the student?'
a. Ich habe dem Studenten [das BUCH]F gegeben.
I have the student the book given
[+def IO] > [+def DO]F ("unmarked order")
b. *?Ich habe [das BUCH]F dem Studenten gegeben.
I have the book the student given
*[+def DO]F > [+def IO] (*scrambled focus)
(4) Wem hast du ein Buch gegeben? (Q: IO)
'Whom did you give a book?'
a. Ich habe [dem/einem StuDENten]F ein Buch gegeben.
I have the/a student a book given
[±def IO]F > [-def DO] ("unmarked order")
b. *Ich habe ein Buch [dem StuDENten]F gegeben.
I have a book the student given
*[-def DO] > [+def IO]F (*scrambled indefinite NP)
c. *Ich habe ein Buch [einem StuDENten]F gegeben.
I have a book a student given
*[-def DO] > [-def IO]F (*scrambled indefinite NP)
In the unmarked order IO > DO, every conceivable distribution of definiteness
(±def) and focus (F) is possible; cf. (2-a), (3-a), and (4-a): [±def IO]F >
[±def DO] and [±def IO] > [±def DO]F. A re-arrangement of both objects
is possible only if the DO is definite and the IO is focus; cf. (1-b) and (2-b).
Without any discussion of the differing assumptions in the relevant literature,
I will simply assume the re-arrangement to be due to scrambling (cf. Haider
& Rosengren 1998, Müller 1998). The main reason for doing so is that there
has to be a structural difference between IO > DO and DO > IO for a proper
compositional analysis of their different semantic interpretations; cf. below.
Base generation, especially in a flat structure, generally does not provide the
means for a structural distinction. In other respects, the details of the proper
syntactic analysis are not relevant for the following discussion.
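
The generalizations in (1) can be restated as a small decision procedure. The following Python fragment is only an illustrative sketch of the observational pattern, not part of the analysis itself; the class `Obj`, the function `acceptable`, and the `generic` flag are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    definite: bool          # +def / -def
    focus: bool             # carries the focus F
    generic: bool = False   # hypothetical flag: indefinite read generically


def acceptable(order: str, io: Obj, do: Obj) -> bool:
    """Check an IO/DO sequence against the generalizations in (1)."""
    if order == "IO>DO":
        return True                 # (1-a): unmarked order, no restriction
    if order != "DO>IO":
        raise ValueError(f"unknown order: {order}")
    if do.focus:
        return False                # (1-c): don't scramble focus
    if not do.definite and not do.generic:
        return False                # (1-d): don't scramble existential indefinites
    return io.focus                 # (1-b): non-focus DO before a focused IO


# (2-b): das Buch [dem StuDENten]F
print(acceptable("DO>IO", io=Obj(True, True), do=Obj(True, False)))
# (4-b): ein Buch [dem StuDENten]F
print(acceptable("DO>IO", io=Obj(True, True), do=Obj(False, False)))
```

The sketch merely encodes the observations; it does not explain them, which is precisely the shortcoming of (1) taken by itself.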
The constraints to be dealt with in any analysis of word order variation in
German are given in (1-c) and (1-d): Focus must not be scrambled (cf. (3-b))
and, even if the last NP carries focus, the scrambled NP must not be indef-
inite (cf. (4-b,c))! This last condition, basically taken from Lenerz (1977a),
requires a further specification (cf. (5) below): The proper condition is that ex-
istential indefinites should not be scrambled. The examples (3) and (4) show
the effect of violations of these constraints: In the preceding question context
the respective answers are only well formed if the DO in DO > IO is not
the focus and not indefinite (with an existential reading) (cf. (2-a,b) and (4-
a)). If the indefinite DO, however, receives a generic interpretation, as in (5),
scrambling is allowed (cf. Büring 1996: 10):
(5) Wem erzählt Peter einen obszönen Witz?
'Whom does Peter tell an obscene joke?'
Peter erzählt einen obszönen Witz immer einem Schulfreund.
Peter tells an obscene joke always a-DAT schoolmate
(generic)
'Peter always tells an obscene joke to a schoolmate.'
It thus turns out that existential indefinites may only occur in a restricted
environment. Büring covers this case by a so-called "existential axiom":
(6) The Existential Axiom:
Existential indefinites can occur in the background only if they are
c-commanded by the focus. (Büring 1996: 6)
These, then, are the relevant conditions. They have served as a basis for
many studies on word order variation in German. Notice, however, that (1-a-d)
and (6) are only observational generalizations, lacking any explanatory
power in themselves. In particular, (6) is not an axiom in the proper sense,
as Büring (1996: 6) himself admits: A condition of such a complex structure,
referring to several subconditions of different sorts is itself in need of an
explanation. Even worse, (6) offers an observation which is paradoxical in a
sense: It is hard to understand why an indefinite in the background should be
c-commanded by the focus in order to receive an existential interpretation.
In the following, I will try to show how the conditions (1) and (6) may be
derived from more general principles.
2.3 Split Tree = BF-Structure
In order to be able to derive the effects of (1) and (6), I will first have to define
some terminology. This is necessary in order to be able to establish an ade-
quate correlation between syntactic structure and background-focus structure
(BF-structure). This, again, is necessary because the semantic interpretation
of NPs depends on the BF-structure of the sentence. Assuming that semantic
interpretation is compositional, the relation of NPs to BF-structure has to be
visible in syntactic structure.
Many proposals have been made concerning the (referential) semantics of
indefinite NPs. It is well known that indefinite NPs may have a generic read-
ing and an existential reading. (In addition, other aspects have to be distin-
guished; there is also an (un)specific, a referential, and an attributive reading;
for some discussion and exemplification, cf. von Heusinger 1997.) Many re-
cent proposals refer to the distinction between a generic and an existential
reading, as exemplified in (7):
(7) a. weil ein Feuerwehrmann_i immer [VP t_i beREIT ] ist.
because a fireman always ready is
'because a fireman is always ready' generic reading
b. weil immer [VP ein Feuerwehrmann beREIT ] ist.
because always a fireman ready is
'because there is always a fireman ready' existential reading
Here, ein Feuerwehrmann has a generic interpretation in (7-a): The sentence
states that it is a general property of every fireman (= any arbitrary fireman)
to be ready. In (7-b), however, it is stated that at all times there will be some
fireman ready, at least one fireman at each particular point in time. This is the
so-called existential reading. (Normally, it is unspecific, i.e., we do not know
who the fireman is. If ein Feuerwehrmann has a specific reading, speaker or
hearer have to assume that the fireman is somehow identifiable, e.g., as Hans
Feuer.) The distinction between a generic and an existential reading of an
indefinite NP seems to correlate to some degree either with syntactic structure
(split tree hypothesis; cf. Heim 1982 and Diesing 1990, amongst others) or
with BF-structure (Krifka 1984, Eckardt 1996, Büring 1996, Yeom 1998).
Common to most treatments, however, is the assumption that the semantic
analysis in terms of a quantifier logic distinguishes between two domains
(cf. Heim 1982 and von Heusinger 1997 for some discussion). A common
assumption is that a sentence may be translated into a logical structure of the
form given in (8):
(8) quantifier [restrictor: scope]
An equally common assumption is the so-called split tree hypothesis:
(9)              IP
               /    \
          SpecIP      I'
            |         |
            NP        VP
            |        /  \
            |     Sadv    VP
            |       |      |
(... weil) ein Feuerwehrmann_i  immer  t_i bereit ist
indef NP:  generic (GEN)               existential (∃)
If the indefinite NP ein Feuerwehrmann is moved out of its original base po-
sition (as VP-internal subject) to SpecIP, it receives a generic interpretation.
If it stays within the VP, the indefinite NP has an existential reading. (For
some more discussion and additional observations, cf. Eckardt 1996.)
In the following, I will assume a particular version of the split tree hypoth-
esis such that a structural bi-partition of each sentence in German correlates
with a specific understanding of background-focus structure. The referential
interpretation of indefinite (and definite) NPs in my analysis will not be in
terms of a quantificational logic as in (8), but in terms of a choice function
approach along the lines of von Heusinger (1997). On this basis, generic and
existential readings of indefinite NPs will then be explained.
I take it for granted that BF-structure is relevant for the proper seman-
tic interpretation of a sentence, especially as far as the referential interpre-
tation of NPs is concerned. Assuming the principle of compositionality of
semantic interpretation, BF-structure should be visible in syntactic structure
in some way or other. In German, I claim that we find a BF-bi-partition of
every sentence: Constituents inside the VP (i.e., non-moved, VP-dominated
constituents) are somehow focus-affiliated. The focus itself and all focus-
affiliated constituents, even if they are not (part of) the focus in a narrow
sense, belong to the F-part of the sentence. All other constituents (maybe ex-
cept the topicalized constituent in SpecCP and the finite verb in C°) belong to
the B-part of the sentence. The relevant BF-bi-partition is brought about by
movement: A-movement of the subjects to SpecIP or scrambling (cf. Haider
& Rosengren 1998; for some psycholinguistic evidence, cf. Clahsen & Feath-
erston 1998; for different solutions in other languages, cf. Vikner (this vol-
ume) on object shift in Icelandic and Williams (1999) on English; cf. also
Zubizarreta 1998 for Germanic and Romance; for some general discussion,
cf. Abraham 1995, ch. 14). A functional explanation for the specific condi-
tions in German may run as follows: The VP can be viewed as the syntactic
realisation of a predicate. A predicate refers to a property (of an individual
or of several individuals, i.e., a relation of these individuals). If the whole
predicate is new information, all relevant constituents will ideally remain in-
side the VP; cf. there-sentences in English, es-sentences in German, etc. If
an individual is known, the respective NP will be moved out of the VP if pos-
sible. Normally, this applies to the subject. So, we get the typical distinction
"subject-predicate" by movement of the subject to SpecIP. In German, other
NPs may also be moved out of the VP if they represent background infor-
mation, in this case by scrambling. Thus, in German, BF-structure may be
formally represented by a syntactic bi-partition in surface structure:
(10)
Topicalization   Subject      scrambled NP     VP-dominated
                 in SpecIP                     constituents
                 b-determined reference        isc-dependent reference
                 B-part                        F-part (focus + f-affiliated)
Choi (1996):     -NEW                          +NEW
                 +PROM        -PROM            +PROM     -PROM
                 background                    focus     ? (f-affiliated)
Clearly, (10) does not cover the whole range of problems which are connected
with the correlation between prosodic prominence, (syntactic) focus structure
and information structure. For a recent approach in the framework of the min-
imalist program, cf. Zubizarreta (1998). It is impossible here to give a detailed
justification of my view by comparing it with the many proposals in the lit-
erature. It may be useful, however, for a proper understanding of my specific
proposal to compare it with some similar approaches, noticing the small but
important differences. So, the terminological distinction into background and
focus, as shown in the bottom line in (10), is somewhat vague in several re-
spects: The term focus normally refers to prosodically prominent constituents
which are either themselves new information (minimal focus) or part of the
new information, in which case the focussed constituent serves as a focus ex-
ponent for a derived focus comprising additional constituents (F-projection
or F-percolation; cf. Eckardt 1996; for the distinction between a conception
of absolute focus or relative focus, cf. Höhle 1982, Jacobs 1984, 1991). Other
elements within the VP which are not proper parts of the focus are generally
not covered by this terminology (except, maybe, in Büring 1997, whose idea
of background vs. focus is very similar to my distinction between B-part vs.
F-part). So, in order to refer to both focus proper and focus affiliated con-
stituents, I propose talking about the F-part of the sentence. The so-called
background normally refers to old information, but the term "background"
normally does not relate to prosodically prominent constituents belonging to
old information. If both have to be addressed together, a new terminology is
necessary: I call it the B-part of the sentence.
A distinction very similar to mine is in fact developed in Choi (1996)
in some detail: Based on earlier work by Vallduví (1992), Choi distin-
guishes between old (given) information (-NEW) and new (added) informa-
tion (+NEW). In addition, she adds the finer distinction between prominent
(+PROM) and non-prominent (-PROM) parts of old or new information.
This gives a cross classification which allows for a number of necessary dis-
tinctions, e.g., rise-fall-intonation, etc. I basically agree with Choi's analysis
except for the fact that she distributes the features [±NEW, ±PROM] in a
random way on the constituents of a sentence. Thus, a subsequent battery of
ranked constraints in an OT approach has to filter out improper assignments
of [±NEW, ±PROM] to particular constituents. I hope to show that there
are rules for a proper assignment of constituents to the particular parts of
the information structure such that no subsequent filtering mechanism will be
needed.
Finally, I should add some remarks on the relation between a BF-structure
and a split tree. Biiring (1996) tries to show that a BF-analysis is superior to
a split tree analysis on empirical grounds. In particular, he claims that there
are sentences in German with a generic indefinite NP inside the VP. Biiring's
example is (11). He claims that einem Italiener ('a-DAT Italian', IO) may
receive a generic interpretation (as well as an existential one, to be sure). I
doubt that (11 -b) can have a generic interpretation, but let's assume this for
the sake of Biiring's argument.
(11) a. Wen hast Du einem Italiener vorgestellt?
who have you a-DAT Italian introduced?
'Who did you introduce to an Italian?'
b. Ich habe einem Italiener [ MARion ]F vorgestellt.
I have a-DAT Italian Marion introduced
(IO = *generic?)
Büring assumes that the IO einem Italiener in (11-b) is in the VP rather than
scrambled out of the VP. The possibility of a string-vacuous scrambling of
the IO (as in: IO_i [VP t_i DO]) should be ruled out, as Büring claims, because
then it would be unclear why the IO in (12) cannot have a generic reading,
i.e., why string-vacuous scrambling (as in DO_k IO_i [VP t_i t_k]) would not be
allowed there:
(12) Ich habe [NP MARion_k ]F einem Italiener t_k vorgestellt.
I have Marion a-DAT Italian introduced
(IO = existential, not generic!)
I must admit that I do not find Büring's argumentation conclusive: As far
as I can see, scrambling the focus-DO as in (12) should result in a deviant
sentence; cf. (1-c) and (3-b). This follows naturally if we assume that scram-
bling may apply to background elements only, moving them to the B-part of
the sentence. Thus, if (12) is not considered deviant, [DO]F > [IO] should be
base generated inside the VP. This may be possible for verbs taking an an-
imate DO, cf. also Müller (1998) on underlying c-command relations defin-
ing binding relations: With verbs like vorstellen anaphoric binding is always
from a c-commanding DO to a c-commanded IO, never vice versa. A similar
argument may apply to Büring's (1996: 5) example in footnote 9:
(13) a. Wen hat eine Blondine bedient?
whom has a blonde served
b. (Ich glaube,) daß [ meinen BRUder ]F eine Blondine bedient hat.
(I believe) that my brother a blonde served has
Here, if the sentence is judged correct at all, some sort of VP-internal [DO]F
> SU sequence has to be assumed. Notice, however, that sentences like (13-
b) should be deviant, assuming Lenerz' (1977a: ch. 4; 1977b) condition of
agentivity, which is relevant for DO > SU sequences. So, (12) and (13)
do not provide counterevidence against string-vacuous scrambling. Accord-
ingly, a sentence like (14) may indeed have a generic reading if we assume
(even string-vacuous) scrambling of the IO einem Italiener, as indicated by
the parenthesised adverb immer ('always'), which presumably marks the left
VP-boundary:
(14) Ich würde einem Italiener (immer) [VP [ MARION ]F vorstellen ]
I would a-DAT Italian (always) Marion introduce
(IO = generic)
'I would always introduce Marion to an Italian.'
Thus, I conclude that, contrary to Büring's claim, a split tree analysis for a
German sentence as in (10) may indeed properly represent a BF-bi-partition.
So far, the main reason for assuming a BF-bi-partition, as represented in a
split tree analysis, has not been made clear. As the top line in (10) states, the
different readings of NPs in the B-part and the F-part, respectively, are due to
a distinction between a b(ackground)-determined vs. an immediate sentence
context (isc)-dependent interpretation: The reference of elements in the back-
ground is b-determined: It is either given by the preceding linguistic context
or by general knowledge. In contrast, the reference of elements in the F-part,
being newly introduced or somehow affiliated to newly introduced elements,
has to be chosen in a context adequate manner, i.e., as isc-dependent refer-
ence. Take (15) as an example:
(15) Peter bought a book at a science fiction book store.
Without any particular context given, we may safely assume that the refer-
ence of Peter (being a proper name) is somehow b-determined. Assuming
that the VP (bought a book at a science fiction book store) is the F-part of
(15), the reference of a book is isc-dependent: In order to give the sentence a
proper interpretation, we have to assume that a book refers to a book whose
reference is determined by the immediate sentence context, i.e., a book which
is available at a certain book store at the time in the past at which Peter bought
it. So, a book cannot refer to the Book of Kells (which is not for sale), to the
Bible or to Chomsky & Halle, The Sound Pattern of English (both are not
available at science fiction book stores), nor to next year's best selling fiction
novel by John Irving (which cannot have been for sale in the past). Similar
reasoning applies to an isc-dependent referential interpretation of the PP at a
science fiction bookstore.
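
The reasoning about (15) can be pictured as filtering a set of candidate referents by what the immediate sentence context requires. The following Python fragment is a minimal sketch under that assumption; the class `Book`, its three attributes, and the function `isc_candidates` are hypothetical illustrations, and the last list entry is an invented placeholder rather than an example from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    for_sale: bool              # can it be bought at all?
    stocked_by_sf_store: bool   # would a science fiction book store carry it?
    published: bool             # did it exist at the time of the purchase?


def isc_candidates(books):
    """Keep only referents compatible with the sentence context of (15)."""
    return [b.title for b in books
            if b.for_sale and b.stocked_by_sf_store and b.published]


BOOKS = [
    Book("Book of Kells", False, False, True),
    Book("Bible", True, False, True),
    Book("The Sound Pattern of English", True, False, True),
    Book("next year's John Irving novel", False, False, False),
    Book("some science fiction paperback", True, True, True),  # hypothetical
]

print(isc_candidates(BOOKS))
```

Only candidates that satisfy every requirement imposed by the predicate and the PP survive, which is what an isc-dependent choice of a referent amounts to in this simplified picture.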
2.4 The Semantics of Definite and Indefinite NPs
All we need to complete the set of tools for my analysis is a proper way of
distinguishing between the identification of the referents of definite and indef-
inite NPs, on the one hand, and the assertion expressed in the sentence, on the
other hand. A semantics with choice functions, developed by von Heusinger
(1997, to appear), seems to be the appropriate tool to capture this distinction.
It also allows us to capture the distinction between b-determined and
isc-dependent reference. The basic idea of this semantics is that indefinite
and definite NPs are not quantifier phrases, but terms that pick out one of the
elements that are described by the descriptive material of the phrase. Thus
the referent of a book in (15) is chosen out of the set of books. This basic
idea is reconstructed by the so-called epsilon-operator, which was introduced
into mathematics by Hilbert & Bernays (1977) in order to replace the exis-
tential and universal quantifiers. The epsilon-operator takes a predicate as its
argument and yields a term. The most natural interpretation of the epsilon
operator is a choice function that arbitrarily assigns to each set one of its ele-
ments. Thus we can represent sentence (16) as (16-a), where the indefinite NP
is represented by the epsilon term "εxMx". The predicate S (snores) applies
to an individual x which is chosen by the choice function φ from the set of
individuals which have the property of being men (Mx); "S" is the predicate,
"εxMx" is an individual, "M" is the attribute defining the set of individuals
"Mx". The epsilon term is interpreted by the choice function applied to the
extension of the predicate man, i.e., as the operation of assigning to the set
of men one of its elements. In other words, the indefinite NP refers to a
particular man, even though we may not determine which one. The sentence is
true if this man is in the extension of the predicate snore, as in (16-b).
(16) A man snores.
a. S(εxMx)
b. ⟦S(εxMx)⟧ = 1 iff φ(⟦εx man'(x)⟧) ∈ ⟦snore'⟧
Since the choice of the particular man is arbitrary given the definition of the
choice function so far, the sentence is true if any (arbitrarily chosen) man
snores, which corresponds to the generic character of the sentence.
However, in order to analyse non-generic sentences, von Heusinger (1997)
introduces indexed epsilon-operators that are interpreted by different choice
functions. This makes it possible to choose the reference of an NP in a
context-adequate manner, i.e., in my words as isc-dependent. Thus the rep-
resentation for the indefinite NP a man in (17) is the indexed epsilon term
"ε_i x Mx" with the free index or parameter i for the choice of a choice func-
tion. Thus (17) is true if there is a choice function φ_k such that the object
assigned to the set of men (i.e., a particular man) is in the extension of the
predicate is snoring, as in (17-b):
(17) (What's that noise?) A man is snoring.
a. ∃i [S(ε_i x Mx)] (isc-dependent reference)
b. ⟦S(ε_i x Mx)⟧ = 1 iff there is a choice function φ_k such that
φ_k(⟦εx man'(x)⟧) ∈ ⟦snoring'⟧
Here, as the continuous form of the predicate (is snoring) indicates, the sen-
tence refers to a particular event. Thus, the reference of a man has to be
chosen in an isc-dependent way. This is done by the context-adequate choice
function ε_i. The formula in (17-a) expresses this by using an existential quan-
tification of contexts (∃i): There is a context i in which the predicate S (is
snoring) applies to an x which is chosen by a context-adequate choice func-
tion ε_i from the set of elements which have the property (attribute) of being
men (Mx). Here, the choice function is properly restricted, and there is no
generic reading available. (17-a) represents the normal "existential" (non-
specific) reference of an indefinite NP. Notice that the existential quantifier
in (17-a) does not apply to the individual a man, but to contexts (∃i).¹
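
The contrast between (16) and (17) can be made concrete over a small finite model. The following Python fragment is a rough sketch, not von Heusinger's formalism: it assumes that "any arbitrarily chosen man" in (16) may be read as "whichever choice function is used", and that (17) only requires some choice function to pick a snorer. The function names are hypothetical, and sets are assumed to be finite and non-empty.

```python
from itertools import product


def choice_functions(sets):
    """Enumerate all choice functions over the given finite, non-empty sets.

    Each choice function is a dict mapping a frozenset to one of its elements.
    """
    sets = list(sets)
    for picks in product(*[sorted(s) for s in sets]):
        yield dict(zip(sets, picks))


def holds_generic(restrictor, predicate):
    # (16): the choice is arbitrary, so every choice function must succeed.
    return all(phi[restrictor] in predicate
               for phi in choice_functions([restrictor]))


def holds_existential(restrictor, predicate):
    # (17): there is a (context-adequate) choice function that succeeds.
    return any(phi[restrictor] in predicate
               for phi in choice_functions([restrictor]))


men = frozenset({"al", "bo", "cy"})       # extension of man' (toy model)
snorers = {"bo"}                          # extension of snore' (toy model)

print(holds_generic(men, snorers))
print(holds_existential(men, snorers))
```

In this toy model the existential quantification ranges over choice functions (standing in for contexts i), not over individuals, which mirrors the point made about (17-a).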
In order to complete my short account of von Heusinger's approach, I will
also present a rough characterization of the interpretation of b-determined
reference. B-determined reference is usually expressed by definite NPs or
pronouns; cf. (18) in the (preceding) context of (17):
(18) The man / he is drunk (= in context k) = D(ε_k x M_S x) (b-determined
reference)
Here, the choice function ε is the particular (constant) choice function ε_k
which chose a man in the preceding context. This choice function ε_k chooses
the most salient man (who is snoring) in the given context k. The "most
salient" individual is
(19) (i) one that was mentioned last in the preceding context (where a man
is listed as a snoring man: M_S is the (background determined) sub-
set (Mx ∧ Sx) which was determined by the preceding context), or
(ii) one that is somehow present in a given non-linguistic context and is
"pointed at" (deictic reading), or
(iii) one that is the proper salient individual in our knowledge of the
world (cf. the president, the sun, my wife, etc.).
This completes my short rendition of von Heusinger's choice function approach,
which will now be applied to the problem of different interpretations
of different word orders in German.
2.5 Indefinite NPs in the B-Part: Generic Interpretation
I will now try to show how the puzzling aspects of word order variation in
German can be explained. In order to do so, I will first derive the generic and
existential reading of indefinite NPs. This will then be applied to explain the
constraints in (1).
Let us first turn to a derivation of the generic vs. existential reading of
indefinite NPs, as exemplified in (7), repeated here for convenience:
(7) a. weil ein Feuerwehrmann_i immer [VP t_i beREIT ] ist.
because a fireman always ready is
'because a fireman is always ready' generic reading
b. weil immer [VP ein Feuerwehrmann beREIT ] ist.
because always a fireman ready is
'because there is always a fireman ready' existential reading
Both readings will be derived as pragmatically induced interpretations of
a vague referential semantics. (For similar approaches in other areas of se-
mantics, cf. the concept of conventional implicature (Grice 1975) and, more
precisely, the concept of conceptual shift (Bierwisch 1983).) In (7-a), the NP
ein Feuerwehrmann ('a fireman') is in SpecIP, hence not VP-dominated, i.e.,
in the B-part of the sentence. Thus, its reference should be b-determined,
i.e., given (i) by preceding context, or (ii) by deixis, or (iii) by common
knowledge; cf. (19). Neither is obviously the case since ein Feuerwehrmann
is indefinite, hence it does not refer to a given or known individual. This is
the point where the pure semantic interpretation stops without providing a
proper result. Assuming the principle of co-operation (Grice 1975), however,
pragmatic inference may help us, possibly in terms of a conventional impli-
cature along the following lines: There is no given or known individual to
which ein Feuerwehrmann refers. However, the set of firemen is in the back-
ground knowledge, as is indeed the knowledge about every predicate! So,
the indefinite NP ein Feuerwehrmann may choose any (arbitrary) individual
from that set, and this is indeed the generic reading! Note especially that the
choice function is not isc-dependent. Hence, the generic reading of an in-
definite NP in the singular is nothing but an unspecific reading which is not
isc-dependent. In terms of a choice functional approach, ein Feuerwehrmann
in the B-part of a sentence is represented as "εxFx". The fact that the choice
function ε is not context specific (i.e., not ε_i x) is due to the NP being in the
B-part and the resulting pragmatic inference. (Similar derivations of other re-
alisations of generic NPs (bare plurals, definite generic NPs) may be worked
out.)
The existential reading of an indefinite NP can be derived in a similar way:
Ein Feuerwehrmann in (7-b) is in the F-part of the sentence. Hence, it should
be interpreted in an isc-dependent manner. In terms of a choice function ap-
proach this means that the choice function choosing an individual from the
set of firemen has to be a context adequate choice function:
(20) ∃i B(ε_i x Fx)
(20) is to be understood as has been exemplified for (17) above: There is a
context i in which a particular context-adequate choice function ε_i applies,
choosing an individual from the set of firemen.² (20) then states that this fire-
man has the property of being ready. Thus, the existential reading of the in-
definite NP ein Feuerwehrmann can be derived from its being in the F-part of
the sentence, hence isc-dependent in its interpretation. The existential read-
ing is thus explained as the result of a pragmatically induced and contextually
restricted application of the choice function.
2.6 Don't Scramble Existential Indefinites!
I have shown above how a generic reading of an indefinite NP in the B-part
of a sentence may be derived. This result can now also be applied to the
scrambled NP in sentences like (4-b,c) and the corresponding constraint (1-d):
"Don't scramble (existential) indefinites!" By scrambling, an NP is moved
to the B-part of the sentence. An indefinite NP in the B-part is inevitably
interpreted as generic, not as existential. Hence, the corresponding sentences
(4-b,c) are semantically deviant since it does not make any sense to state that
somebody gave a generic book to a specific student at a specific time (cf.
Büring 1996). A similar effect may possibly be achieved in English (even
without scrambling) by lexically enforcing a generic reading (any old book)
and putting the focus on the other object (PEter):
(21) a. I gave a book to PEter.
b. ?*I gave any old book to PEter.
There is additional evidence in German that the scrambling constraint (1-d)
is structure dependent. In other words: The generic reading responsible for
this constraint derives from a split tree analysis in German, reflecting the BF-
structure in surface syntax. There are a few verbs in German allowing Dative
Shift. So schicken ('to send') allows a PP-argument, as in (22-a), or an NP
argument in the dative, as in (22-b). The PP-argument is generally assumed
to be closest to the verb. Thus, (22-a) represents the unmarked order DO >
PP, whereas (22-b) shows a derived (scrambled) order DO > IO.
(22) An wen / wem hast du ein Buch geschickt? (Q: PP)
'Who did you send a book (to)?'
a. Ich habe [VP ein Buch an den VerLAG geschickt ]
I have a book to the publishers sent
[-def DO] > [+def PP]F
(DO in situ)
b. *Ich habe ein Buch_i [VP dem VerLAG t_i geschickt ]
I have a book the-DAT publishers sent
*[-def DO] > [+def IO]F
(*scrambled indefinite NP)
'I sent a book to the publishers.'
There is a clear distinction in the acceptability of (22-a) vs. (22-b) in the given
context: While (22-a) is perfect, (22-b) shows the same deviation as (4-b,c).
This can be explained by my analysis: The structure of (22-a) is (22-a'):
(22a')       VP
           /    \
         NP      V'
         |      /  \
         |    PP     V
         |    |      |
   ein Buch   an den VerLAG      geschickt
   a book     to the publishers  sent
Here, the indefinite NP ein Buch is clearly VP-dominated, hence in the F-part,
where it receives an isc-dependent interpretation, i.e., as (unspecific)
existential.
Notice that the NP ein Buch is not c-commanded by the focus (which is
placed on the PP an den VerLAG); thus, Büring's so-called Existential Axiom
(6) does not cover this case. It turns out that it is not a c-command relation
to focus, but an m-command relation which is relevant, i.e., the indefinite NP
must be dominated by (every segment of) VP, i.e., the NP must be in the F-
part! Being in the F-part results in an isc-dependent interpretation, hence in
an unspecific existential reading, as shown above.
Things are different for (22-b), as (22-b') shows:
(22b') [VP [NP ein Buch_i] [VP [NP dem VerLAG] [V' [NP t_i] [V geschickt]]]]
a book the-DAT publishers sent
B-part: ein Buch_i; F-part: dem VerLAG t_i geschickt
Here, the indefinite NP ein Buch has been scrambled from its base position
(IO > DO) into the B-part of the sentence, where it inevitably receives a
b-determined, hence generic interpretation. The sentence is as odd as the ex-
amples (4-b,c) above, again because of semantic deviance.
So far, the constraint (1-d) and the so-called Existential Axiom (6) have
been derived in my analysis. Still, there remains the obvious paradox in (6):
Why should an NP "in the background" be in the F-part of a sentence? It can
be shown, I believe, that the assumption that the existential indefinite NP in
the crucial examples is "in the background" is wrong.
2.7 Existential Indefinites in the F-Part
The assumption that an existential indefinite NP is in the F-part of a sentence

is crucial for my analysis. Hence, in order to be able to resolve the apparent
paradox in the so-called Existential Axiom in (6), I will have to show that
the existential NP is not in the background, as insinuated by the preceding
question context in (2-a) and (4-a). Let us first look at some examples in a
different context in which the preceding sentence (23-a) is not a question:
(23) a. Peter hat sich ein Buch_i gekauft.
Peter has himself a book bought
'Peter bought himself a book.'
b. Nein, MAX hat sich ein Buch_{*i/j} / eins_{*i/j} [N 0] gekauft.
no Max has himself a book / one bought
'No, Max bought himself a book / one.'
c. Fritz hat sich AUCH ein Buch_{*i/*j/k} / eins_{*i/*j/k} [N 0] gekauft.
Fritz has himself also a book / one bought
'Fritz bought himself a book / one, too.'
As the referential index i in (23-a) shows, a (specific or unspecific) reference
is determined as isc-dependent. A following sentence in which this reference
is taken as b-determined would have to refer to this book_i by the definite NP
das Buch_i ('the book_i') or by a pronoun es_i ('it_i'). In (23-b), however, the
book which Max_m bought himself_m is not necessarily the same book which
Peter_p bought for himself_p. In a strict physical sense it even has to be a different
book since, normally, the same copy cannot be sold twice. But even if we
think about a book in more abstract terms as referring to any copy of a certain
book identifiable by author and title, the book that Max bought in (23-b) is
not necessarily the same one as the one Peter bought in the preceding sen-
tence (23-a). The indices in (23) do not cover this situation correctly. A more
precise version should use different choice functions: Thus, if we replace
the index i by its choice function ε_i for the context i in (23-b), we get the
correct result: The isc-dependent choice function for the context j in (23-b)
cannot be the same one as in the preceding context i; hence, the reference of
a book in (23-b) is chosen by the context adequate choice function ε_j. To be
sure, this choice function may by chance pick the same book in its physical
or non-physical sense, but the relevant point is that its choice is independent
of the preceding choice of ε_i, thus not b-determined. (A similar argument
holds for (23-c).) Thus, the reference of the NP ein Buch is not necessarily
in the background of the following context. If anything is in the background,
then it is the property of the NP ein Buch, namely that Peter was involved in
some kind of book-buying. This shows that the distinction between a referen-
tial part and a non-referential (attributive or descriptive) part of the meaning
of NPs is relevant for a discourse adequate semantic interpretation. This difference
is in some way expressed in the choice function approach "ε_i x Bx",
where "ε_i x" may be seen as a representation of the referential part, and "Bx"
as a representation of the attributive or descriptive part of the NP meaning.
(More precise ways of representation will have to be worked out.)
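The division of labour between the two parts of the NP meaning can be pictured with a small Python sketch. It is only an informal illustration under invented assumptions, not a formalisation of the epsilon calculus: the set books stands in for the attributive part Bx, the dictionary epsilon holds one hypothetical choice function per context index, and the concrete picks (alphabetically first or last title) are arbitrary.

```python
from typing import Callable, Dict, FrozenSet

# Attributive part "Bx": the set of books (invented titles). It is shared
# across contexts, i.e. it plays the role of the b-determined description.
books: FrozenSet[str] = frozenset({"Emma", "Faust", "Ulysses"})

# Referential part: one choice function per context index. The concrete
# picks (alphabetically first / last) are arbitrary stand-ins.
epsilon: Dict[str, Callable[[FrozenSet[str]], str]] = {
    "i": min,  # context of (23-a)
    "j": max,  # context of (23-b)
}


def refer(context: str, extension: FrozenSet[str]) -> str:
    """Apply the choice function of the given context to the extension."""
    return epsilon[context](extension)


book_23a = refer("i", books)   # ein Buch in (23-a)
book_23b = refer("j", books)   # ein Buch / eins in (23-b): fresh, isc-dependent choice
same_book = refer("i", books)  # dasselbe Buch: the choice of context i is reused

print(book_23a, book_23b, same_book)
```

In this toy setting the description (the set of books) stays constant, while the referent depends on which context's choice function is applied. Reusing the function of context i models b-determined reference, and applying the function of the current context j models isc-dependent reference.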
This assumption is also supported by the fact that the repetition of the whole
NP ein Buch in the sentences following (23-a) does not sound normal. In
normal subsequent context, the "bare determiner" eins ('one') will be used,
deleting the noun Buch. This again indicates clearly that the attributive part
of the NP (Buch) is b-determined and may hence be deleted. The referential
part eins, however, has to be uttered since it establishes an isc-dependent
reference independent of the preceding context. Notice, especially, that the
use of a b-determined pronoun es ('it') is not possible; cf. (24):
(24) a. Peter hat sich ein Buch_i gekauft.
Peter has himself a book bought
b. *Nein, MAX hat es_i sich gekauft.
no, Max has it himself bought
c. *Fritz hat es_i sich AUCH gekauft.
Fritz has it himself also bought
Correct sentences expressing the intended meaning of (24-b,c) would have to
state explicitly that the same referential choice is intended:
(24') b'. Nein, MAX hat sich dieses Buch_i gekauft.
no, Max has himself this book bought
c'. Fritz hat sich dasselbe (Buch) AUCH gekauft.
Fritz has himself the-same (book) also bought
If we apply this argument to the crucial question-answer examples in (2)-
(4), we get a similar result: The answers containing the full NP ein Buch are
unnatural parrot-like answers. In natural dialogue, a previously mentioned NP
has to be replaced by a pro-form. If its reference is b-determined, the personal
pronoun is used, as in (25):
(25) Was hast du dem Studenten gegeben?

'What did you give to the student?'
Ich habe ihm ein BUCH gegeben.
I have him a book given
'I gave him a book.'
If the reference of the NP in the answer is not b-determined but isc-dependent,

then, instead of a pronoun (26-a), the bare indefinite determiner is used
(26-b):
(26) Wem hast du ein Buch gegeben?

'Whom did you give a book?'
a. ??Ich habe es dem StuDENten gegeben.
I have it the student given
'I gave it to the student.' (b-determined)
b. Ich habe dem StuDENten eins [N 0] gegeben.
I have the student one given
'I gave one to the student.' (isc-dependent)
This shows, too, that the indefinite (singular) NP ein Buch ('a book') in the
answer (26-b) is not "in the background" (although it was mentioned before
in the question); rather, it is in the F-part of (26-b), and thus it cannot be
referentially b-determined. Instead, it has to refer in an isc-dependent manner.
So, its reference is chosen isc-dependently; only its attributive part is
b-determined (as is, indeed, every attributive part of any NP).
This resolves the apparent paradox in the description of the distribution of
existential indefinite NPs.
To conclude, indefinite NPs receive a generic reading if they appear in the
B-part of a sentence. In the F-part of a sentence, indefinite NPs are interpreted
in an isc-dependent manner. This usually gives us an unspecific existential
reading. (There may, however, be certain contexts with a generic predicate, in
which an indefinite NP in the F-part may also receive a generic interpretation.
I will not discuss this here. I will also not discuss the distinction between a
non-specific and a specific interpretation of indefinite NPs.)
2.8 Problems with the "Unmarked Order"
This leaves us with the cases in which each order of IO and DO is acceptable
(2-a,b). If DO is a definite NP and IO carries focus, DO may remain in situ
([±def IO]F > [+def DO], unmarked order) or scramble ([+def DO] >
[±def IO]F). The scrambling of a non-focus DO to the background part of
the sentence does not present a problem. There is, however, the problem of
interpreting the non-scrambled definite DO in situ, i.e., in the F-part. A sim-
ilar problem arises, of course, for Diesing's (1990) analysis, as Choi (1996:
120f.) notices. In the original split tree analysis, the definite NP, being in-
side the VP, should be bound by the unselective existential quantifier which
contradicts its definite reference. Instead of discussing Diesing's and Choi's
proposals, I will present my own analysis, which is based on the principles
stated and applied above.
According to my analysis, a definite NP in the B-part should be b-

determined in reference, whereas a definite NP in the F-part should be isc-
dependent in reference. Thus, there should be a difference in meaning be-
tween (2-a) and (2-b), repeated here for convenience:
(2) Wem hast du das Buch gegeben?

'Whom did you give the book?'
a. Ich habe [dem StuDENten]F das Buch / *es gegeben.
I have the-DAT student the book / it given
(isc-dependent)
b. Ich habe das Buch / es [dem StuDENten]F gegeben.
I have the book / it the-DAT student given
(b-determined)
Such a difference in meaning is, however, hard to establish. I assume that the
difference does not reside in the final meaning of both sentences, but rather
in the way their interpretation is brought about.
Let me start with the more natural case (2-b). Here, the information of the
F-part (dem Studenten gegeben) is added to the B-part (ich habe das Buch ...).
The NP das Buch belongs to the B-part and its reference is established in a
b-determined manner, i.e., as the most salient individual of the set of books
in the given context; cf. (18) and (19). The explicit repetition of the whole
NP das Buch is, in fact, unnatural and would, of course, require additional
pragmatic reasoning. The most natural explicit answer replaces the NP with
a personal pronoun; cf. (2-b'). (I will disregard elliptic answers like "dem
Studenten" for obvious reasons; they don't allow any insight into word order
relations.)
(2b') Ich habe es dem Studenten gegeben.

I have it the-DAT student given
So, (2-b) or (2-b') do not present a problem to my analysis. This is, of course,
different for (2-a). My proposal will have to proceed as follows.
In (2-a), the information of the F-part (dem Studenten das Buch gegeben)
is added to the B-part (ich habe); the NP das Buch here belongs to the F-part
and its reference is established in an isc-dependent manner, i.e., as a
specific, aforementioned book whose reference has to be determined appro-
priately with respect to the immediate sentence context. This, of course, will
result in fixing the reference of the NP das Buch as referring to exactly the
same individual as in the question (2), or in the related answer (2-b). (2-a)
only achieves this in a more indirect way. Notice, too, that a replacement of
the NP with a personal pronoun in the F-part is not possible; cf. (2-a'):
(2a') *Ich habe dem Studenten es gegeben.

I have the-DAT student it given
Admittedly, more research into the semantics of definite NPs, especially if

they appear in the F-part, is necessary. Notice, however, that my proposal pre-
supposes for definite NPs a distinction similar to the one I required above for
indefinite NPs: The interpretation of each NP is a combination of a referential
part and a non-referential (attributive or descriptive) part. For the present case
of definite NPs this means that the attributive part is always treated as given
or known. If the definite NP is in the B-part of a sentence, then its referential
part is b-determined and, by being definite, the NP chooses the most salient
given/known individual of the set defined by the attributive part. If the defi-
nite NP is in the F-part of a sentence, its attributive part is b-determined (as is
true for all attributes). It is only its referential part which is to be interpreted
in an isc-dependent manner. Being definite, however, the appropriate context
adequate choice function chooses a specific, salient given/known individual.
My proposal thus describes two different ways of achieving the same mean-
ing for both sentences (2-a) and (2-b). The details of a precise formal analysis
will, however, still have to be worked out.
3 Conclusion
To conclude, I hope to have presented the outlines of a proper analysis for

the interpretation of definite and indefinite NPs in various word orders in
German. The specific interpretation is different for each possible word order,
and the constraints banning certain distributions of focussed and/or indefinite
NPs ((1-c,d), (6)) have thus found an explanation: The corresponding deviant
sentences do not in general receive a proper semantic interpretation.
As I said, the present paper sketched only the outlines of a proper analysis.
Independent evidence for the relevance of the distinction between referential
and attributive parts of the meaning of NPs is, of course, required (for a short
exemplification, cf. the appendix (4.)) and should receive a detailed analysis.
Also, a proper formal analysis of the required distinction is still missing. In
particular, a proper and detailed formal analysis of all the possible readings of
definite and indefinite NPs in all relevant distributions will have to be worked
out, describing in a precise way the co-operation of the different modules and
their respective principles such that a true account of the interaction of syntax,
prosodic structure, semantics and (discourse) pragmatics (BF-structure) can
be achieved. Even if this has not been fully accomplished in the present paper,
my analysis has shown that sentences with different word orders are indeed
different sentences, each with a different meaning, even if they show the same
lexical material ("numeration") and the same argument structure. So, in the
ideal case, the candidate set (numeration + meaning) for each sentence will in
fact be reduced to cardinality |1|, leaving no room for a competitional model.
This seems to be true at least for the cases discussed. Other constraints may
still exist which are not amenable to a similar approach. This may hold at
least for those cases in which the factors I mentioned are not involved. The
well-known case of growing length of constituents as discussed by Hawkins
(1983) and Primus (1993, 1994), amongst others, comes to mind.
Thus, my analysis has not proven that competition models are inadequate
altogether. Rather, I hope to have shown that it is worthwhile to investigate
the apparently competing conditions in detail and try to derive them from
more basic principles before applying them in a competition model. There
seems to be more co-operation than one might think at first sight.
4 Appendix
In the following appendix, I will present some independent evidence for the
distinction between a referential and a non-referential/attributive part of the
semantics of NP. In particular, I will show that such a distinction is reflected
in syntactic behaviour.
4.1 Evidence from Prepositions
In most grammars, some forms of prepositions in German are described as

amalgamations of a preposition with a following definite article; so zum is
analysed as zu+dem ('to the'), ins as in+das ('into the'), etc. The matter is
complex and not well understood in its phonological, morphological, syntac-
tic and semantic aspects. The analysis as "preposition + definite article" is,
however, doubtful, as the following example shows (cf. Siebert 1999: 125):
(27) Peter geht zum Arzt, *[der ihm empfohlen wurde].
Peter goes to (the) doctor who to-him recommended was
'Peter goes to the doctor *[who was recommended to him].'
Clearly, the noun Arzt ('doctor') does not refer; so, maybe the special form
of the preposition (zum) is only a merger of a preposition with a dative case
ending (zu + -m), lacking the referential part of the determiner and relating
only to the attributive meaning of the noun.
4.2 Evidence from Predicate Nouns
A similar point can be made for predicate nouns. They, too, do not seem to
refer, but to consist only of an attributive reading:
(28) a. Peter wird (*der/?ein) Lehrer, und Max will *er/es auch
Peter becomes (*the/?a) teacher, and Max wants *him/it also
werden.
become
b. Peter wird (*der/?ein) Lehrer, und Max will auch einer [N 0]
Peter becomes (*the/?a) teacher, and Max wants also one
werden.
become
'Peter will become a teacher, and Max also wants to become one.'
4.3 Evidence from Idioms
A similar distinction must be made for non-referential NPs in idioms: The

idiomatic reading is only obtained if the NP stays in its VP-internal position:
(29) a. Peter hat einer FRAU den Hof gemacht.

Peter has a-DAT woman the court made
'Peter courted a woman.' [idiomatic reading]
b. ??Peter hat den Hof einer FRAU gemacht.
Peter has the court a-DAT woman made
??'Peter made the court to a woman.' [no idiomatic reading]
In my analysis, the idiomatic interpretation in which the NP is non-referential

is only possible if it is interpreted in an isc-dependent manner. Scrambling,
as in (29-b), forces a b-determined reading, i.e., a referential reading of the
NP, leaving no possibility for the idiomatic meaning. Other examples of non-
referential idioms are given in (30):
(30) non-referential NPs in idioms:

a. jemandem den Marsch blasen
sb. (dat.) the march blow
= to give sb. a ticking-off
b. jemandem die Schau stehlen
sb. (dat.) the show steal
= to steal the show from sb.
c. jemandem die Tür weisen
sb. (dat) the door show
= to turn sb. out
Similarly, there are coined phrases which may or may not be read in an id-
iomatic way:
(31) (non-)referential NPs (in idioms):

a. jemandem die Quittung (für etwas) geben
sb. (dat.) the receipt (for sth.) give
= to make sb. pay (for sth.)
b. jemandem die rote Karte zeigen
sb. (dat.) the red card show
= (in soccer): to give sb. a ticket / a (final) warning
c. jemandem nicht das Wasser reichen (können)
sb. (dat.) not the water give (can)
= to be inferior to sb.
d. jemandem die Luft abdrehen
sb. (dat.) the air throttle
= to make sb. shut up / to kill sb.
Predictably, the idiomatic reading is again only obtained with the NP in situ
(32-a), whereas the scrambled NP in (32-b) only allows a non-idiomatic, i.e.,
referential interpretation:
(32) a. Der Schiedsrichter hat einem SPIELER die rote Karte gezeigt.
the referee has a-DAT player the red card shown
[idiomatic reading preferred]
b. Der Schiedsrichter hat die rote Karte einem SPIELER gezeigt.
the referee has the red card a-DAT player shown
[referential, non-idiomatic reading only]
4.4 Evidence from Syntax
Further evidence for a distinction between a referential and a non-referential/attributive
part of NPs may also come from syntax. Split topicalisation
as in (33) and was für-split as in (34) may on closer investigation
turn out to apply to only one semantic part of the NP.
(33) VolVOS_j haben mich ja [VP [NP viele t_j] überholt].
Volvos have me yes many overtaken
'(As for) Volvos, many overtook me.'
(34) Was_i haben dich denn [VP [NP t_i für Leute] angesprochen]?
what have you-ACC then for people addressed
'What kind of people addressed you, then?'
As has been noted before (cf. Müller & Sternefeld 1995), the source of the
movement must be VP internal. Both movements may not apply after scram-
bling (cf. also Lenerz 1994: 163):
(33') *Volvos_j haben mich [NP viele t_j]_m ja [VP t_m überholt].
Volvos have me many yes overtaken
(34') *Was_i haben dich [NP t_i für Leute]_k denn [VP t_k angesprochen]?
what have you-ACC for people then addressed
This, again, indicates a difference in referential capacity which may be reconstructed
in terms of the distinction between b-determined and isc-dependent
reference.
Notes
I would like to thank H. Weiß and K. v. Heusinger for valuable comments and helpful
proposals.
1. In particular, (17-a) is not a covered-up version of a quantifier approach: As von
Heusinger (1997: 98ff.) points out himself, the formula (17-a) is only a shorthand
version of a more precise logical expression without any quantifier whatsoever:
(i) ∃i [S(ε_i x Mx)] ≡ S(ε_{ζi [S(ε_i x Mx)]} x Mx)
As (i) shows, the quantification over contexts ∃i in (17-a) can be replaced by a
precise choice function ζ ("Zeta") ranging over contexts (ζi).
2. This is even clearer in the equivalent expression referred to in fn. 1:
(i) ∃i [B(ε_i x Fx)] ≡ B(ε_{ζi [B(ε_i x Fx)]} x Fx)
ζ is a choice function which chooses a context i. This context i is characterised
by the properties in the brackets [B(ε_i x Fx)]: It is a context in which a particular
choice function ε_i applies, choosing an individual x from the set of firemen Fx
who are ready (B). The choice function is thus the context adequate choice
function choosing an appropriate fireman in that context.
References
Abraham, Werner
1995 Deutsche Syntax im Sprachenvergleich. Tübingen: Narr.
Bierwisch, Manfred
1983 Semantische und konzeptuelle Repräsentation lexikalischer Einheiten.
In: W. Motsch and R. Růžička (eds.) Untersuchungen zur Semantik, 61-
99. (Studia Grammatica 22.) Berlin: Akademie-Verlag.
Büring, Daniel
1996 Towards an economy-theoretic treatment of German Mittelfeld word or-
der. Ms., DFG-Project "Ökonomieprinzipien" (GR 559/5-1). Universität
Frankfurt und Köln.
Büring, Daniel
1997 The Meaning of Topic and Focus - The 59th Street Bridge Accent. Lon-
don: Routledge.
Choi, Hye-Won
1996 Ph.D. dissertation, Stanford University.
Chomsky, Noam
Clahsen, Harald — Samuel Featherston
1998 Antecedent Priming at Trace Positions: Evidence from German scram-
bling. (Essex Research Reports in Linguistics Vol. 23.) Essex: University
of Essex.
Diesing, Molly
1990 The syntactic roots of semantic partition. Ph.D. dissertation, University
of Massachusetts, Amherst.
Eckardt, Regine
1996 Intonation and Predication: An Investigation in the Nature of Judge-
ment Structure. Arbeitspapiere des Sonderforschungsbereichs 340, No.
77. Stuttgart & Tübingen.
Egli, Urs
1995 Definiteness, binding, salience, and choice functions. In: F. Hamm, J.
Kolb and A. von Stechow (eds.) The Blaubeuren Papers: Proceedings
of the Workshop on Recent Developments in the Theory of Natural Lan-
guage Semantics. October, 9-16th 1994, 105-125. Technical Report 08-
95, Seminar für Sprachwissenschaft der Universität Tübingen.
Fanselow, Gisbert
1993 The return of the base generators. In: Groninger Arbeiten zur Germanistischen
Linguistik 36: 1-74.
Fanselow, Gisbert
1997 Features, 0-roles, and free constituent order. Ms., University of Potsdam.
Grewendorf, Günther
1995 German: A grammatical sketch. In: J. Jacobs, A. von Stechow, W. Sterne-
feld and Th. Vennemann (eds.) Syntax, Vol II, 1288-1391. Berlin: de
Gruyter.
Grice, H. Paul
1975 Logic and conversation. In: P. Cole and J.L. Morgan (eds.) Speech Acts,
41-58. (Syntax and Semantics 3.) New York: Academic Press.
Haftka, Brigitte (ed.)
1994 Was determiniert Wortstellungsvariation? Studien zu einem Interaktions-
feld von Grammatik, Pragmatik und Sprachtypologie. Akten der AG 5 der
DGfS Jahrestagung 1993. Opladen: Westdeutscher Verlag.
Haider, Hubert
1993 Deutsche Syntax - Generativ. Tübingen: Narr.
Haider, Hubert — Inger Rosengren
1998 Scrambling. (Sprache und Pragmatik, Arbeitsberichte 49.) University of
Lund.
Hawkins, John A.
1983 Word Order Universals. New York: Academic Press.
Hawkins, John A.
1990 A parsing theory of word order universals. Linguistic Inquiry 21: 223-
261.
Heim, Irene
1982 The semantics of definite and indefinite noun phrases. Ph.D. dissertation,
University of Massachusetts, Amherst.
Heusinger, Klaus v.
1997 Salienz und Referenz: Der Epsilonoperator in der Semantik der No-
minalphrase und anaphorischer Pronomen. (Studia Grammatica 43.)
Berlin: Akademie-Verlag.
Heusinger, Klaus v.
to appear The reference of indefinites. In: K. v. Heusinger and U. Egli (eds.) Ref-
erence and Anaphoric Relations, 265-284. (Studies in Linguistics and
Philosophy.) Dordrecht: Kluwer.
Hilbert, David — Bernays, Paul
1977 Grundlagen der Mathematik, Vol. II, 2nd ed. Berlin/New York: Springer
Verlag.
Höhle, Tilman N.
1982 Explikationen für 'normale Betonung' und 'normale Wortstellung'. In:
W. Abraham (ed.) Satzglieder im Deutschen, 75-153. Tübingen: Narr.
Jacobs, Joachim
1984 Funktionale Satzperspektive und Illokutionssemantik. Linguistische Berichte
91: 25-58.
Jacobs, Joachim
1988a Probleme der freien Wortstellung im Deutschen. Sprache und Pragmatik
5: 8-37.
Jacobs, Joachim
1988b Fokus-Hintergrund-Gliederung und Grammatik. In: H. Altmann (ed.) In-
tonationsforschungen, 89-134. Tübingen: Niemeyer.
Jacobs, Joachim
1991 Focus ambiguities. Journal of Semantics 8: 1-36.
Jacobs, Joachim
1992a Integration. Arbeitsbericht des Sonderforschungsbereichs 282 (Theorie
des Lexikons), No. 13. Universität Düsseldorf.
Jacobs, Joachim
1992b Neutral stress and the position of heads. In: J. Jacobs (ed.) Informations-
struktur und Grammatik, 220-244. Opladen: Westdeutscher Verlag.
Krifka, Manfred
1984 Fokus, Topik, syntaktische Struktur und semantische Interpretation. Ms.,
Universität München.
Krifka, Manfred
1998 Scope inversion under the rise-fall-contour in German. Linguistic Inquiry
29: 75-112.
Lenerz, Jürgen
1977a Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Narr.
Lenerz, Jürgen
1977b Zum Einfluß von 'Agens' auf die Wortstellung des Deutschen. In: H.W.
Viethen, W.-D. Bald and K. Sprengel (eds.) Grammatik und interdiszi-
plinäre Bereiche der Linguistik. Akten des II. Linguistischen Kolloqui-
ums, Aachen, 1976, 133-142. Tübingen: Niemeyer.
Lenerz, Jürgen
1994 Pronomenprobleme. In: Haftka (ed.), 161-173.
Lenerz, Jürgen
1998 Noam Chomsky, The minimalist program. In: Beiträge zur Geschichte
der Deutschen Sprache und Literatur, Bd. 120, Heft 1, 103-111. Tübin-
gen: Niemeyer.
Lenerz, Jürgen
1999 Besprechung: Klaus von Heusinger, „Salienz und Referenz. Der Ep-
silonoperator in der Semantik der Nominalphrase und anaphorischer
Pronomen". In: Beiträge zur Geschichte der deutschen Sprache und Literatur,
Bd. 121, Heft 3, 456-459. Tübingen: Niemeyer.
Müller, Gereon
1998 German Word Order and Optimality Theory. Arbeitspapiere des Sonder-
forschungsbereichs 340, No. 126. Stuttgart & Tübingen.
Müller, Gereon & Wolfgang Sternefeld
1995 Extraction, lexical variation, and the theory of barriers. In: U. Egli, P.
Pause, C. Schwarze et al. (eds.) Lexical Knowledge in the Organization
of Language, 35-80. Amsterdam: Benjamins.
Primus, Beatrice
1993 Word order and information structure: A performance-based account of
topic positions and focus positions. In: J. Jacobs, A. von Stechow, W.
Sternefeld and Th. Vennemann (eds.) Syntax: Ein internationales Hand-
buch zeitgenössischer Forschung, 880-896. Berlin: Walter de Gruyter.
Primus, Beatrice
1994 Grammatik und Performanz: Faktoren der Wortstellungsvariation im
Mittelfeld. Sprache und Pragmatik 32: 39-86.
Reinhart, Tanya
1995 Interface strategies. OTS Working papers 95-002. Utrecht: Research In-
stitute for Language and Speech.
Reinhart, Tanya
1997 Interface economy: Focus and markedness. In: C. Wilder, H.-M. Gärtner
and M. Bierwisch (eds.) The Role of Economy Principles in Linguistic
Theory, 146-169. Berlin: Akademie-Verlag.
Rosengren, Inger
1993 Wahlfreiheit mit Konsequenzen. Scrambling, Topikalisierung und FHG
im Dienste der Informationsstrukturierung. In: M. Reis (ed.) Wortstel-
lung und Informationsstruktur, 251-312. (Linguistische Arbeiten 306.)
Tübingen: Max Niemeyer Verlag.
Selkirk, Elisabeth O.
1984 Phonology and Syntax: The Relation between Sound and Structure. Cam-
Siebert, Susann
1999 Wortbildung und Grammatik: Syntaktische Restriktionen in der Struktur
komplexer Wörter. Tübingen: Niemeyer.
1996 Prosodie und Intonation im Deutschen. Talk presented at GGS, Berlin.
Uhmann, Susanne
1991 Fokusphonologie. (Linguistische Arbeiten 252.) Tübingen: Niemeyer.
Uszkoreit, Hans
1984 Word order and constituent structure in German. Ph.D. dissertation, Uni-
versity of Austin, Texas, (published 1987, Stanford: CSLI Publications).
Vallduvi, Enric
1992 The Information Component. New York: Garland.
Vikner, Sten
t.v. The Interpretation of Object Shift and Optimality Theory.
Williams, Edwin
1999 Economy as shape conservation. Talk presented at the Annual Meeting
of the DGfS. Konstanz, Feb. 1999.
Yeom, Jae-Il
1998 A Presuppositional Analysis of Specific Indefinites: Common Grounds as
Structured Information States. New York/London: Garland.
Zimmermann, Thomas Ede
1991 Kontextabhängigkeit. In: A. von Stechow and D. Wunderlich (eds.) Se-
mantik: Ein internationales Handbuch der zeitgenössischen Forschung,
156-229. Berlin/New York: Walter de Gruyter.
Zubizarreta, Maria Luisa
1998 Prosody, Focus, and Word Order. (Linguistic Inquiry Monographs 33.)
OT Accounts of Optionality: A Comparison of Global
Ties and Neutralization
Tanja Schmid
1 Introduction and Overview
At first sight, optionality poses a problem for all theories that assume a com-
petition between candidates (e.g. transderivational Minimalism and to a much
larger extent Optimality Theory (OT)). In such theories, the optimal (or the
most economical) candidate blocks the non-optimal (less economical) candi-
dates in a given candidate set (reference set). Only the optimal candidate is
grammatical.
In this paper I will introduce and compare two accounts of optionality in
OT and show that they are empirically equivalent. One account, the global tie
approach (see, e.g., Ackema & Neeleman 1995, 1998), involves constraint
ties and the other account, the neutralization approach (see Legendre et al.
1995, 1998; Bakovic & Keer 1999), makes use of the normal OT interaction
of faithfulness and markedness constraints. Both accounts will be applied to
different data sets. The accounts will be checked to determine whether one
is superior to the other. The result will be that in fact both approaches can
be used to account for the same kind of data. Nevertheless, to strengthen the
theory, one account should be seen as superfluous. For conceptual and not
empirical reasons, I will prefer the neutralization account in the end.
I will proceed as follows: In section 2, I will give a brief introduction to
OT and discuss different OT accounts of optionality including the two men-
tioned above. In order to justify the focus on these two accounts I will briefly
mention their advantages compared to other OT accounts of optionality.
In the following three sections, I will look at data for which either an account
in terms of neutralization (section 4) or global ties (section 5) or both
(section 6) has been given in the literature. In section 4 and in section 5, I first
introduce the analysis proposed in the literature. Then I give a new account in
terms of the opposing approach, keeping the constraints as similar as possible.
Section 4 will be concerned with complementizer optionality in English.
In section 5, I will look at the optionality of wh-movement in root questions in
standard and colloquial French and its breakdown in certain contexts. Section
6 gives an example of optional IPP constructions in German which again can
be analyzed by both approaches. In section 7, some advantages and disadvan-
tages of both approaches are presented. The last section briefly summarizes
the results.
2 Basic Assumptions of OT
The following five ideas are central to Optimality Theory:
(1) Basic assumptions (see among many others Prince & Smolensky
1993, Grimshaw 1997)
a. Constraints are universal.
b. Constraints are violable.
c. Grammars are rankings of constraints.
d. An optimal candidate in a candidate set is grammatical, all non-
optimal candidates are ungrammatical.
e. The grammaticality of a candidate not only depends on its inher-
ent properties, but on the properties of the competing candidates as
well.
(1-d) especially is problematic for optionality. How can optionality be achieved
when only one optimal candidate in a competition is grammatical?
Several proposals have been made in the OT literature, some of which will
be introduced in the next section. At this point it should suffice to give a definition
of optimality that allows for the possibility of more than one optimal
candidate in one and the same competition. Such a definition is given in (2):
(2) Optimality (following Müller 1999: 3):

A candidate Cᵢ is optimal with respect to a constraint ranking (CON₁
» ... CONₖ ... » CONₙ) iff there is no candidate Cⱼ in the same
candidate set such that:
a. There is a constraint CONₖ that Cⱼ satisfies better than Cᵢ; and
b. CONₖ is the highest ranking constraint on which Cᵢ and Cⱼ differ.
The candidates in a given candidate set are generated by a part of the gram-
mar (GEN, for generator) which contains only inviolable and unranked con-
straints. GEN takes an underlying form (the input) and builds up all possi-
ble output structures. These outputs, called the candidates, are evaluated by
another part of the grammar, the function H-EVAL (Harmony Evaluation),
which determines the optimal candidate(s) based on the constraint hierarchy
of the language.
I will use the following notation in this paper:
(3) a. ☞ : optimal candidate
b. *! : fatal violation
c. <> : constraint tie
d. » : constraint domination
3 OT Accounts of Optionality
The present definition of optimality is compatible with more than one opti-
mal candidate in one and the same competition. This is exactly what I will
understand by optionality from a theoretical point of view:
(4) Optionality: Two (or more) different candidates are optimal, i.e.,
grammatical, though they are (or seem to be) in the same competi-
tion.
In OT, optionality is possible, but only under certain conditions. Opinions in
the OT literature differ in what these conditions should look like (see Müller
1999 for an overview).
In this section, three different OT approaches towards optionality are intro-
duced: In the first one, identity of constraint profile is the only condition that
allows the optionality of two candidates, in the second one, constraint ties
(local or global) are needed in addition, and in the third one, the so-called
neutralization approach, the crucial condition for the optionality of two can-
didates is their optimality in different competitions which arise from slightly
different inputs.
The introduction and discussion of several approaches to optionality will
motivate the focus on the global tie approach and the neutralization approach
throughout the paper. It will be shown that only these two approaches allow
the optimal candidates to have a quite different constraint profile. This will
be needed to account for certain sets of data.
3.1 Identity of Constraint Profile
The obvious way of allowing for optionality in OT is that the winning can-
didates in one and the same competition have the same constraint profile
(Grimshaw 1997: 410f., and Vikner 1999 use identity of constraint profile
to account for complementizer optionality). The condition for the optimality
of two (or more) competing candidates under this point of view is absolute
identity of the optimal constraint profile.
Identity of constraint profile is an intrinsic part of the theory that results di-
rectly from the basic mechanisms of OT. Regardless of any additional mech-
anisms and assumptions used, identity of constraint profile can never be ex-
cluded.
Only if the identity of the constraint profile is used as the only way to ac-
count for optionality, and nothing else is stipulated, will I speak of an "ap-
proach" to optionality along these lines. Then, however, the question arises
as to whether this is sufficient to account for all cases of optionality. To il-
lustrate the idea of identity of constraint profile, a very simplified example is
given below:
(5) Abstract example of identity of constraint profile:

             | A   | B | C
  ☞ a. C1   | *   |   | **
  ☞ b. C2   | *   |   | **
     c. C3   | **! |   |
The tableau above shows one single competition with the three candidates
C1, C2 and C3, and an extremely small grammar consisting of only three
constraints, A, B, and C, with the ranking A » B » C. Candidate C3 fatally
violates the highest ranking constraint A. As both remaining candidates C1
and C2 have exactly the same constraint profile (they both violate constraint
A once, constraint B not at all, and constraint C twice) and as this constraint
profile is optimal (they fare better than C3 on the highest constraint on which
they differ), both C1 and C2 are grammatical.
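The evaluation just described can be sketched in Python. This is a minimal sketch, not part of the original analysis: the dictionary encoding of tableau (5) and the function names `beats` and `optimal` are assumptions made for illustration. The function `beats` spells out clauses (a) and (b) of definition (2), and `optimal` returns every candidate that no competitor beats.

```python
# Violation counts copied from tableau (5); the dict layout is an assumption of this sketch.
RANKING = ["A", "B", "C"]  # A » B » C
TABLEAU_5 = {
    "C1": {"A": 1, "B": 0, "C": 2},
    "C2": {"A": 1, "B": 0, "C": 2},
    "C3": {"A": 2, "B": 0, "C": 0},
}


def beats(cj, ci, ranking):
    """True if cj does better than ci on the highest constraint on which they differ."""
    for con in ranking:  # walk down the hierarchy from the top
        if cj[con] != ci[con]:
            return cj[con] < ci[con]
    return False  # identical profiles: neither candidate beats the other


def optimal(tableau, ranking):
    """Return every candidate that is not beaten by any competitor."""
    return {
        name
        for name, profile in tableau.items()
        if not any(beats(other, profile, ranking) for other in tableau.values())
    }


winners_5 = optimal(TABLEAU_5, RANKING)
```

Because C1 and C2 have identical profiles, neither beats the other, so both survive; C3 is beaten by each of them on constraint A.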
This approach is quite plausible for a small grammar, as in (5), but with a
larger number of constraints it is unlikely that the optimal candidates are not
distinguished by any constraint at all.
It is extremely difficult to keep an identical constraint profile of two (or
more) candidates. For this reason, identity of constraint profile should not be
seen as an independent approach to account for all cases of optionality, but
merely as a theoretical possibility that is not sufficient on its own for most
cases.
3.2 Constraint Ties
One additional assumption that is made in the literature to account for option-
ality is the possibility of constraint ties. Constraints that are tied are equally
important, i.e., two (or more) competing candidates may differ with respect
to the tied constraints but can nevertheless both (all) be optimal.
The notion of "constraint tie" is not used uniformly: At least five differ-
ent concepts of tie can be found in the literature (see Müller 1999 for an
overview). Prince & Smolensky (1993: 51, fn. 31) briefly mention the possi-
bility of constraint ties and open the door to different interpretations:
(6) It is entirely conceivable that the grammar should recognize nonranking
of pairs of constraints, but this opens up the possibility of
crucial nonranking (neither can dominate the other; both rankings
are allowed), for which we have not yet found evidence.
I will continue by concentrating on two quite common notions of tie that are
in accordance with Prince & Smolensky's considerations. I will call them
local ties and global ties. Local ties can be seen as special types of constraints
and global ties as underspecifications of different constraint rankings, i.e., in
a language with a global tie, multiple constraint rankings co-exist. 1
3.2.1 Local Ties
Local ties follow one of the notions mentioned in Prince & Smolensky (1993:
51), namely, the "crucial nonranking" of constraints.
With the type of local tie that I will introduce (see Müller 1997 for a crucial
application), tied constraints count as a single constraint, i.e., "a candidate
violates a tie if it violates a constraint that is part of this tie, and multiple
violations add up" (Müller 1999: 6). A simplified abstract example is given
in (7):
(7) Abstract example of a local tie

             | A <> B  | C
  ☞ a. C1   | *   *   | **
  ☞ b. C2   | **      | **
     c. C3   | ***!    |
As before, the candidates C1 and C2 are both grammatical in their competition.
In contrast to tableau (5), the simplified grammar this time includes a local
tie of the constraints A and B (A <> B). Candidate C1 violates both part A
and part B of the tied constraint once, and candidate C2 violates part A twice
but part B not at all. Added up, both candidates violate the tied constraint A
<> B twice. On the lowest ranked constraint, C, the candidates C1 and C2
behave alike: they both violate it twice. Although the sole competing candidate
C3 does not violate constraint C, it is nevertheless suboptimal as it fares
the worst on the highest ranking constraint on which the candidates differ,
i.e., the tied constraint, which it violates three times.
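The adding-up of violations inside a local tie can be sketched as follows. This is an illustrative sketch only: the function name `local_tie_winners` is hypothetical, and since the text states only that C3 violates the tie three times, placing all three marks under A is an assumption. The second call uses an invented variant of C2 to show how optionality breaks down once the candidates differ below the tie.

```python
TABLEAU_7 = {
    "C1": {"A": 1, "B": 1, "C": 2},
    "C2": {"A": 2, "B": 0, "C": 2},
    # Assumption: the text only says C3 violates the tie three times;
    # all three marks are placed under A here.
    "C3": {"A": 3, "B": 0, "C": 0},
}


def local_tie_winners(tableau, tied, below):
    """Treat the tied constraints as a single constraint whose violations add up."""

    def key(profile):
        tie_total = sum(profile[c] for c in tied)
        return (tie_total,) + tuple(profile[c] for c in below)

    best = min(key(p) for p in tableau.values())
    return {name for name, p in tableau.items() if key(p) == best}


winners_7 = local_tie_winners(TABLEAU_7, tied=["A", "B"], below=["C"])

# Hypothetical variant: C2 violates C only once, so the profiles differ below the tie.
variant = dict(TABLEAU_7, C2={"A": 2, "B": 0, "C": 1})
winners_variant = local_tie_winners(variant, tied=["A", "B"], below=["C"])
```

In the variant, only C2 survives, which matches the observation that a local tie tolerates differences on the tied constraint itself but not below it.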
In the local tie approach, unlike in the "identity of constraint profile"
approach, the optimal candidates may differ. But even here, they differ only to
a certain extent, namely, on the tied constraint itself. The optionality would
break down if the optimal candidates differed on constraint C or any other
constraint below the tie.
Both accounts introduced so far have in common that they allow for optionality
only if the constraint profiles of the candidates in question are exactly
(or nearly) the same. Apart from the improbability of this, cases can be found
in which the optimal candidates differ much more than would be
expected under such accounts. Examples will be discussed in the following
sections. They suggest that other accounts of optionality are needed.
3.2.2 Global Ties
The concept of global ties follows another consideration in Prince &
Smolensky (1993: 51), namely, that "... both rankings are allowed" (see, e.g.,
Ackema & Neeleman 1995 for an application).
As mentioned above, a constraint ranking with a global tie A <> B is
an underspecification and stands for two rankings: The constraint hierarchy
splits into two rankings from the constraint tie A <> B onwards. Under one
ranking, constraint A dominates constraint B, and under the opposite ranking,
constraint B dominates constraint A (for a formal definition of global tie see
Müller 1999: 5).
Two (or more) competing candidates are grammatical if each of them is optimal
under at least one possible resolution of the tie. This means that, unlike
with local ties, the optimal candidates may show a different constraint
profile below the tied constraint.
In (8) I give an abstract example of a global tie in the underspecified form.
The resolutions of the tie are given in (9) and (10) below.
(8) Abstract example of a global tie (underspecified)

             | A          | B | C
  ☞ a. C1   | **         |   | **
  ☞ b. C2   |            | * | *
     c. C3   | *(!)**(!)  |   |
The global tie A <> B above is not yet explicitly resolved. It is shown, however,
that again, both candidate C1 and candidate C2 are optimal. Notice that
this result could not be achieved under the assumption of identity of constraint
profile or local ties alone, as C1 and C2 differ in the number of violations both
on the tied constraint and on constraint C below the tie.
Candidate C3 is suboptimal under any resolution of the tie. This is expressed
by the two marks of fatal violation in brackets (!). Why these marks
are fatal will become clearer below, where the global tie is resolved into the
two possible total orders A » B » C (9) and B » A » C (10):
(9) Global ties resolved: A » B » C

             | A    | B | C
     a. C1   | *!*  |   | **
  ☞ b. C2   |      | * | *
     c. C3   | *!** |   |
Under this resolution of the tie candidate C2 is optimal as it fares best on the
highest constraint on which the candidates differ (i.e., constraint A). What
happens under the opposite total order is shown in (10):
(10) Global ties resolved: B » A » C

             | B  | A    | C
  ☞ a. C1   |    | **   | **
     b. C2   | *! |      | *
     c. C3   |    | ***! |
Under this resolution of the tie, candidate C1 is optimal. Candidate C2 fatally
violates the highest ranking constraint B and candidate C3 fatally violates
constraint A, on which it differs from candidate C1. It does not matter that
the optimal candidate C1 violates the constraints A and C more often than
C2.
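The resolution of a global tie into total orders, as in (9) and (10), can be sketched as the union of the winners under each resolution. The sketch below is an assumption-laden illustration: the names `winners` and `global_tie_winners` are hypothetical, and the violation counts are read off tableaux (9) and (10).

```python
from itertools import permutations

# Violation counts read off tableaux (9) and (10).
TABLEAU_8 = {
    "C1": {"A": 2, "B": 0, "C": 2},
    "C2": {"A": 0, "B": 1, "C": 1},
    "C3": {"A": 3, "B": 0, "C": 0},
}


def winners(tableau, ranking):
    """Optimal candidates under one total order of the constraints."""

    def key(profile):
        return tuple(profile[c] for c in ranking)

    best = min(key(p) for p in tableau.values())
    return {name for name, p in tableau.items() if key(p) == best}


def global_tie_winners(tableau, above, tied, below):
    """A candidate is grammatical if it wins under at least one resolution of the tie."""
    result = set()
    for order in permutations(tied):
        result |= winners(tableau, [*above, *order, *below])
    return result


under_a_b = winners(TABLEAU_8, ["A", "B", "C"])  # resolution (9)
under_b_a = winners(TABLEAU_8, ["B", "A", "C"])  # resolution (10)
grammatical = global_tie_winners(TABLEAU_8, above=[], tied=["A", "B"], below=["C"])
```

Each resolution has a single winner, and the two winners differ both on the tied constraints and on C, which is the property that distinguishes global ties from the two approaches discussed earlier.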
Global ties allow a greater number of differences in the constraint profiles
of optimal candidates than the two other approaches introduced so far. Com-
pared to the local tie approach, the optimal candidates may differ not only
on the tied constraint itself, but even on constraints below the tie. This is
necessary to account for the data sets shown in this paper.
The global tie approach, however, is not the only one that allows a greater
number of differences in the constraint profile of two optimal candidates. An-
other approach with the same effect is introduced below, the so-called neu-
tralization approach (see Legendre et al. 1995, 1998 and Bakovic & Keer
1999).
3.3 Neutralization
All accounts of optionality that I have introduced so far assume that two (or
more) grammatical candidates are optimal in one and the same competition.
The main idea of the neutralization account, however, is that the optimal can-
didates win different competitions, i.e., that they are not built from the same
input (although they may be included in each other's candidate sets by GEN).
A crucial assumption for the neutralization approach to optionality is that
the relevant inputs differ only minimally with respect to, e.g., functional fea-
tures (Bakovic & Keer 1999); otherwise, they are identical. The contrasts in
the input are either preserved in the output (apparent optionality) or neutral-
ized depending on the constraint ranking of the language.
Neutralization equals a "breakdown" of optionality: A candidate is optimal
not only in a candidate set in which it is faithful to the input, but also in a
candidate set in which it is unfaithful. This is the case when the unfaithful
candidate blocks the faithful one due to a higher ranked (markedness) con-
straint, i.e., a difference in the input is neutralized in the output; hence the
name "neutralization" for the whole approach.
An abstract example of both (apparent) optionality and neutralization is
given in (11). The table is taken from Bakovic & Keer (1999) (figure 1).
(11) Abstract example of neutralization

FAITH » MARKEDNESS          MARKEDNESS » FAITH
INPUTS      OUTPUTS         INPUTS      OUTPUTS
I1    →     O1              I1    →     O1
I2    →     O2              I2    →     O1
Assume an abstract example in which the faithfulness constraint (FAITH) requires
faithfulness to the input, and in which the markedness constraint does
not allow the occurrence of a feature X in the output: '*feature X'. Assume
further that input I2 differs from input I1 only in that it contains a feature X.
When the faithfulness constraint is ranked above the markedness constraint
(see lefthand side of the table) it is more important to be faithful to the input
than to obey the markedness requirements. In this case, feature X of input
I2 would occur in output O2 in the second competition. In the first, separate
competition (with the input I1), output O1 is optimal (without feature X, which
was not specified in I1). As feature X is the only difference between O1 and
O2, (apparent) optionality occurs, although O1 and O2 are optimal in different
competitions.
The righthand side of the table shows the mechanism of neutralization.
Here, the markedness constraint is ranked higher than the faithfulness constraint.
Under the assumption that everything else remains equal, it is now
more important to fulfill the markedness requirements than to be faithful to
the input. In the case at hand, the marked feature X that was present in I2
does not show up in the output. But now the output of I2 equals the output of
a different competition, namely, that of I1, which was not specified for feature
X from the beginning. In this case, feature specifications in the input are
neutralized in the output. 2
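The two halves of (11) can be sketched as two rankings applied to two separate competitions. The encoding below is a sketch under simple assumptions: inputs and outputs are reduced to the presence or absence of feature X, and the names `violations` and `winner` are hypothetical.

```python
CANDIDATES = {"O1": False, "O2": True}  # does the output contain feature X?
INPUTS = {"I1": False, "I2": True}      # does the input contain feature X?


def violations(input_has_x, output_has_x):
    return {
        "FAITH": int(input_has_x != output_has_x),  # output differs from the input
        "MARKEDNESS": int(output_has_x),            # '*feature X'
    }


def winner(input_name, ranking):
    """Optimal output in the competition built from one input."""

    def key(output_name):
        v = violations(INPUTS[input_name], CANDIDATES[output_name])
        return tuple(v[c] for c in ranking)

    return min(CANDIDATES, key=key)


faith_high = {i: winner(i, ["FAITH", "MARKEDNESS"]) for i in INPUTS}
markedness_high = {i: winner(i, ["MARKEDNESS", "FAITH"]) for i in INPUTS}
```

Under FAITH » MARKEDNESS the two inputs surface as two different outputs (apparent optionality); under MARKEDNESS » FAITH both inputs surface as O1 (neutralization).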
In what follows, I will concentrate on global ties and neutralization.
Of all the approaches that I have introduced in this section, these two are
similar enough to account for the same type of data, namely, data where the
two optimal candidates differ greatly. Three sets of data will be checked to
see whether both approaches can account for them, whether they are both
needed, or whether one approach is superfluous.
4 English Complementizer Optionality
The first set of data in the comparison between the global tie approach and the
neutralization approach comes from complementizer optionality in English.
A typical example of the optionality of complementizer drop is given in the
embedded object clauses in (12): 3
(12) a. Do you think [CP that [IP Jane looks like Mary]]?
b. Do you think [IP Jane looks like Mary]?
For these kinds of data, accounts in terms of neutralization have been pro-
posed in the literature (Legendre et al. 1995, Bakovic & Keer 1999, Kura-
fuji 1997). I will proceed as follows: First, I will introduce a neutralization
account based on those already proposed, and later on, I will give a new ap-
proach in terms of global ties that can account for the data as well.
4.1 The Constraints
The constraints that I will use are mostly taken from Bakovic & Keer (1999)
and Kurafuji (1997), 4 whose approaches I will combine. The constraints that
will become relevant in this section are given below: 5
(13) *EXP: *Expletives (e.g., complementizers). 6

(14) FAITH[COMP]: The output value of [COMP] is the same as the input
value.
(15) PURE-EP: Purity of Extended Projection: No adjunction to (and no
movement into the head of) a subordinate clause (see Grimshaw
1997: 374).
4.2 The Two Approaches
English shows an alternation of optionality, obligatoriness and ineffability of
complementizers, depending on the context. An example of optionality and
one of obligatoriness of complementizers will be given below. 7
4.2.1 The Neutralization Approach
The question of how the input is defined 8 becomes very relevant for the neu-
tralization approach because, under this approach, optionality is explicitly
connected with faithfulness to the input.
The input in OT-syntax can be defined in the following way: "... a lexical
head plus its argument structure (...) plus a specification of the associated
tense ..." (Grimshaw 1997: 375f.). It is crucial for the neutralization approach
to optionality that Bakovic & Keer (1999) add functional features like
[+/-COMP] to this definition.
Furthermore, Bakovic & Keer (1999) need to assume the following:
- Embedded clauses with that are CPs and those without that are IPs.
- Embedded CPs and IPs differ in their specification for a feature
[COMP].
- Bridge verbs like think can be equipped with either a [+COMP] or a
[-COMP] feature; [+COMP] requires a CP and [-COMP] requires
an IP.
The relevant faithfulness constraint that refers to the [COMP] feature is
FAITH[COMP] and the relevant markedness constraint is *EXP, which is
violated in the presence of a complementizer.
In the neutralization approach, optionality of complementizers is due both
to the existence of two inputs that differ only in their feature specification
for [COMP] and to the ranking of FAITH[COMP] above the markedness
constraint *EXP.
The relevant ranking is given in (16): 9
(16) PURE-EP » FAITH[COMP] » *EXP
In the first relevant competition, think is equipped with a [+COMP] feature.
The tableau is given in (17): 10
(17) Input: ... think[+COMP] ...

                                     | PURE-EP | FAITH[COMP] | *EXP
  ☞ a. ... think [CP that [IP ...   |         |             | *
     b. ... think [IP ...            |         | *!          |
In this competition, candidate (a) with a complementizer is optimal although
it violates the markedness constraint *EXP. Contrary to its competitor (b)
without a complementizer, it does not violate the higher ranked faithfulness
constraint FAITH[COMP].
When, however, the specification in the input is [-COMP], then the result
is different, as can be seen in (18):
(18) Input: ... think[-COMP] ...

                                     | PURE-EP | FAITH[COMP] | *EXP
     a. ... think [CP that [IP ...   |         | *!          | *
  ☞ b. ... think [IP ...            |         |             |
This time, candidate (b) without a complementizer is faithful to the input and
therefore optimal in the competition.
So far, it has been shown that the neutralization approach can account for
optionality: Optionality is the result of faithfulness to slightly different input
specifications.
4.2.2 The Global Tie Approach
In this section, I give a new account (in terms of global ties) for the same
data. Unlike in the neutralization approach, two (or more) candidates
emerge as optimal in one and the same competition in the global tie approach.
Another difference concerns the role of the input, which does not
need to be as explicitly specified. In all global tie accounts in syntax that I am
aware of (Ackema & Neeleman 1995 and 1998, Broekhuis & Dekkers 1999,
Schmid 1998 and 1999), markedness constraints are the only constraints that
are needed to account for optionality. Faithfulness constraints sensitive to
functional features seem not to be necessary.
When instead of the faithfulness constraint a markedness constraint is in-
troduced that contradicts *EXP, and when these two constraints are tied, op-
tionality can be accounted for. I assume that the markedness constraint in
question is HAVE (CP), which requires a clause to be a CP (because, e.g.,
only there the information about sentence mood can be stored).
(19) HAVE (CP): A C-projection is obligatory in a clause.
The relevant ranking is shown below:
(20) PURE-EP » HAVE (CP) <> *EXP
When the input-sensitive faithfulness constraint is no longer crucial,
it does not matter whether the input is specified for [-COMP] or [+COMP]. As
markedness constraints make the decision, the result would be the same in
either case.
The competition with a global tie (unresolved) is shown in (21):
(21) Global ties

                                     | PURE-EP | HAVE (CP) | *EXP
  ☞ a. ... think [CP that [IP ...   |         |           | *
  ☞ b. ... think [IP ...            |         | *         |
Under the resolution of the tie in which HAVE (CP) dominates *EXP, candidate
(a) (with a complementizer) is optimal, and under the opposite resolution
of the tie, candidate (b) (without a complementizer) is optimal.
In this section I have shown that both approaches can account for option-
ality. But what about cases in which the neutralization part of the neutral-
ization approach becomes relevant? In the next section I illustrate how both
approaches can handle obligatoriness of complementizers in a certain con-
text.
4.3 Complementizer Obligatoriness in Complements with Adjunction
One context in which complementizers become obligatory for most speakers
(see, e.g., Grimshaw 1997: 411) is sentential complements with adjunction.
This is shown in (22), which is taken from Bakovic & Keer (1999), ex. (4):
(22) a. I think [CP that on him, no coat looks good t]
b. *I think [IP on him, no coat looks good t]
To account for cases like these, the constraint PURE-EP, inactive so far,
becomes relevant. One part of PURE-EP prohibits adjunction to the highest
projection of a subordinate clause. The CP that is introduced by the
complementizer can function as a "shelter" for adjunction. When it is present, the
projection to which adjunction takes place is no longer the highest projection
of the subordinate clause.
In the neutralization approach, PURE-EP is the markedness constraint that
is responsible for neutralizing different feature specifications in the input to
only one output specification when it is ranked above the relevant faithfulness
constraint. No matter what the feature specification in the input, the output
will always show a complementizer. This can be seen in the tableaux below.
In (23), the input is specified for [+COMP]:
(23) Input: [+COMP] complement clause with adjunction

                                                   | PURE-EP | FAITH[COMP] | *EXP
  ☞ a. ... think [CP that [IP on him [IP ...      |         |             | *
     b. ... think [IP on him [IP ...               | *!      | *           |
The complementizer in the faithful candidate (a) violates *EXP. Nevertheless,
this candidate is optimal. Compared to its competitor (b) it does not violate
FAITH[COMP], just as before (see tableau (17)). The only difference is an
additional violation of PURE-EP by candidate (b) this time.
More interesting is the real neutralization case, in which the input is specified
for [-COMP] and the output nevertheless shows a complementizer due
to the highly ranked PURE-EP:
(24) Input: [-COMP] complement clause with adjunction

                                                   | PURE-EP | FAITH[COMP] | *EXP
  ☞ a. ... think [CP that [IP on him [IP ...      |         | *           | *
     b. ... think [IP on him [IP ...               | *!      |             |
Again, candidate (a) (with a complementizer) is optimal although this time,
it is unfaithful to the input. The decision is made by the highly ranked PURE-EP,
which is obeyed by candidate (a) but violated by the faithful candidate
(b). The optimal candidates (23-a) and (24-a) are identical outputs derived
from different inputs ("derivational ambiguity"). 11
As far as the global tie approach is concerned, obligatoriness of complementizers
can be accounted for as well. The highly ranked PURE-EP determines
the outcome, just as with neutralization; see (25):
(25) Global ties

                                     | PURE-EP | HAVE (CP) | *EXP
  ☞ a. ... think [CP that [IP ...   |         |           | *
     b. ... think [IP ...            | *!      | *         |
The decision is made above the tie at PURE-EP, favouring candidate (a) with
a complementizer.
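The interplay of optionality and obligatoriness under both approaches can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the analysis itself: candidates are reduced to whether they contain a complementizer, PURE-EP is encoded as violated exactly when there is adjunction and no CP layer, and the names `profile`, `winners`, `global_tie` and `neutralization` are hypothetical.

```python
from itertools import permutations


def profile(has_comp, adjunction, input_comp=None):
    """Violation counts for one candidate; input_comp matters only for FAITH[COMP]."""
    return {
        # adjunction to the highest projection of the subordinate clause;
        # a CP layer above the adjunction site avoids the violation
        "PURE-EP": int(adjunction and not has_comp),
        "FAITH[COMP]": int(input_comp is not None and input_comp != has_comp),
        "HAVE(CP)": int(not has_comp),
        "*EXP": int(has_comp),
    }


def winners(candidates, ranking):
    def key(p):
        return tuple(p[c] for c in ranking)

    best = min(key(p) for p in candidates.values())
    return {name for name, p in candidates.items() if key(p) == best}


def global_tie(adjunction):
    """Ranking (20): PURE-EP » HAVE(CP) <> *EXP, union over both resolutions."""
    cands = {"a": profile(True, adjunction), "b": profile(False, adjunction)}
    result = set()
    for order in permutations(["HAVE(CP)", "*EXP"]):
        result |= winners(cands, ["PURE-EP", *order])
    return result


def neutralization(adjunction, input_comp):
    """Ranking (16): PURE-EP » FAITH[COMP] » *EXP, one competition per input."""
    cands = {
        "a": profile(True, adjunction, input_comp),
        "b": profile(False, adjunction, input_comp),
    }
    return winners(cands, ["PURE-EP", "FAITH[COMP]", "*EXP"])
```

Without adjunction, `global_tie(False)` yields both candidates, and `neutralization` yields (a) for a [+COMP] input and (b) for a [-COMP] input. With adjunction, every call returns only candidate (a), whatever the input specification or the resolution of the tie.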
In this section I have shown that cases of complementizer optionality and
obligatoriness which have been accounted for under the neutralization ap-
proach in the literature can be accounted for just as well under the global
tie approach. Note that *EXP, which is crucial for the global tie approach, is
superfluous for the neutralization approach - at least for the cases that have
been looked at here.
5 French Root Questions
The second set of data comes from French, in which wh-movement of argument
XPs is optional in root questions. This time, an approach in terms of
global ties is given in the literature; see Ackema & Neeleman (1995, 1998). 12
An analysis along these lines will be introduced and taken as the basis for an
account in terms of neutralization. The relevant set of data is given below:
(26) a. Qui1 as2-tu t2 vu t1?
who have-you seen
b. Tu as vu qui?
you have seen who
c. Qui1 tu as vu t1?
who you have seen
d. *As1-tu t1 vu qui?
have-you seen who
As shown in (26) there are (at least) three possible ways of forming a root
question in French. 13 Example (a) shows wh-movement and subject-auxiliary
inversion. In example (b) the wh-element remains in situ and in example (c)
it is moved again, but this time without subject-auxiliary inversion. The only
ungrammatical example is subject-auxiliary inversion without wh-movement
as shown in (d). It is often assumed that (a) occurs in standard French and (b)
and (c) in colloquial French (see, e.g., Confais 1985: 175). Unlike Ackema
& Neeleman (1995), who are mainly interested in the optionality between (a)
and (b), 14 I will derive optionality inside the same register, i.e., between (b)
and (c), in the following.
5.1 The Constraints
The relevant constraints from Ackema & Neeleman (1995, 1998) are shown
below:
(27) SPC 15: Shortest Path Condition: Minimize movement paths.
Every node crossed by movement creates a "*". 16

(28) Q-MARK: Q-marking: In a question, assign a [+Q] feature to the
constituent corresponding to the proposition.
That is, the VP must be marked by a lexical X with a [+Q] feature. This [+Q]
feature is assigned via sisterhood by a wh-element in Spec XP. In root clauses,
this constraint can only be obeyed by a combination of wh-movement and X
movement.
(29) Q-SCOPE: [+Q] elements must c-command the constituent
corresponding to the proposition.
This constraint functions as a trigger for wh-movement.
5.2 The Global Tie Approach

First, I will introduce an account in terms of global ties along the lines of
Ackema & Neeleman (1995). I will assume a global tie between SPC, which
restricts movement, and Q-SCOPE, which potentially triggers movement. 17
The input of the relevant competition is not explicitly defined by Ackema
& Neeleman (1995, 1998). They implicitly assume, however, that a wh-
(question)pronoun always bears a Q-feature. Whether this feature has to be
in the input or may be added by GEN is not directly relevant for the global tie
approach.
5.2.1 Standard French
In standard French, the markedness constraint Q-MARK will be ranked above
the tie as shown in (30):
(30) Q-MARK » SPC <> Q-SCOPE
The following tableaux (for standard French and for colloquial French below)
are underspecified, representing two subtableaux simultaneously:
(31) Standard French: global ties

                               | Q-MARK | SPC    | Q-SCOPE
  ☞ a. Qui1 as2-tu t2 vu t1   |        | ****** |
     b. Tu as vu qui           | *!     |        | *
     c. Qui1 tu as vu t1       | *!     | ***    |
     d. As1-tu t1 vu qui       | *!     | ***    | *
In standard French, only candidate (a) with movement of both the wh-phrase
and the auxiliary is optimal. The highly ranked Q-MARK requiring both wh-movement
and verb movement eliminates any other candidate, independently
of the resolution of the tie.
5.2.2 Colloquial French
Now assume that the only difference between the two registers (i.e., the only
one relevant for the case at hand) is the position of Q-MARK. Contrary to
standard French, in which it is highly ranked, it is ranked below the tie in
colloquial French: 18
(32) SPC <> Q-SCOPE » Q-MARK

(33) Colloquial French: global ties

                               | SPC    | Q-SCOPE | Q-MARK
     a. Qui1 as2-tu t2 vu t1   | ****** |         |
  ☞ b. Tu as vu qui           |        | *(!)    | *
  ☞ c. Qui1 tu as vu t1       | ***    |         | *
     d. As1-tu t1 vu qui       | ***    | *       | *
Under one resolution of the constraint tie (SPC » Q-SCOPE), the requirement
to minimize movement paths is more important than the need to fulfill
scoping requirements via movement. Therefore, under this ranking, candidate
(b) without wh-movement is optimal. Under the opposite ranking (Q-SCOPE
» SPC), however, candidate (c) emerges as optimal because it respects
Q-SCOPE (with the shortest possible movement path). 19
Thus, the global tie between two constraints can account for the optionality
of wh-movement in colloquial French. 20
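The difference between the two registers can be sketched by moving Q-MARK from above the tie to below it. The following is only an illustrative sketch: the violation counts are copied from tableau (31), while the dictionary encoding and the helper names `winners` and `with_global_tie` are assumptions of this example.

```python
from itertools import permutations

# Violation counts copied from tableau (31).
CANDIDATES = {
    "a": {"Q-MARK": 0, "SPC": 6, "Q-SCOPE": 0},  # Qui as-tu vu
    "b": {"Q-MARK": 1, "SPC": 0, "Q-SCOPE": 1},  # Tu as vu qui
    "c": {"Q-MARK": 1, "SPC": 3, "Q-SCOPE": 0},  # Qui tu as vu
    "d": {"Q-MARK": 1, "SPC": 3, "Q-SCOPE": 1},  # As-tu vu qui
}
TIE = ("SPC", "Q-SCOPE")


def winners(ranking):
    def key(name):
        return tuple(CANDIDATES[name][c] for c in ranking)

    best = min(key(n) for n in CANDIDATES)
    return {n for n in CANDIDATES if key(n) == best}


def with_global_tie(above=(), below=()):
    """Union of the winners under both resolutions of SPC <> Q-SCOPE."""
    result = set()
    for order in permutations(TIE):
        result |= winners([*above, *order, *below])
    return result


standard = with_global_tie(above=["Q-MARK"])    # ranking (30)
colloquial = with_global_tie(below=["Q-MARK"])  # ranking (32)
```

With Q-MARK on top, candidate (a) wins under either resolution; with Q-MARK at the bottom, (b) wins under SPC » Q-SCOPE and (c) under Q-SCOPE » SPC, while (d) never wins.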
5.3 The Neutralization Approach
For the new account in terms of neutralization, a faithfulness constraint must
be introduced that is sensitive to an input feature. In the case at hand, the
presence vs. absence of this feature in connection with the placement of the
faithfulness constraint reflects the presence vs. absence of wh-movement. I
assume the feature to be [Q] and the relevant faithfulness constraint to be:
(34) FAITH[Q]: The output value of [Q] is the same as the input value.
The assumption will be that [Q] is a purely syntactic feature that may ([+])
or may not ([-]) be connected with a wh-(question)element in the input.
Note that it is crucial for the approach that the occurrence and interpretation
of a wh-element is independent of its bearing a [Q]-feature or not. It must be
possible to interpret even a wh-element in situ without a [Q]-feature as a
question. 21
The markedness constraint Q-MARK is as highly ranked as in the global tie
approach. In the neutralization approach, it is crucial that it outranks the
faithfulness constraint FAITH[Q]. The relevant parts of the constraint ranking are
given below:
(35) Q-MARK » FAITH[Q], and Q-MARK » SPC
Under the assumption that the functional feature [Q] may be freely inserted
in the input, more candidates than before need to be looked at. For all four
candidates (a-d) a phonetically identical counterpart exists that differs only in
the feature specification of the wh-element. These "counterpart candidates"
are included in the tableaux below.
The presence of a [Q]-feature on a wh-element will be marked by [+] and
the absence of a [Q]-feature by [-].
In the input of the first competition, the wh-element is equipped with a
[Q]-feature.
(36) Input: [+Q]

                                  | Q-MARK | FAITH[Q] | Q-SCOPE | SPC
  ☞ a. Qui[+]1 as2-tu t2 vu t1   |        |          |         | ******
     b. Tu as vu qui[-]           | *!     | *        |         |
     c. Qui[+]1 tu as vu t1       | *!     |          |         | ***
     d. As1-tu t1 vu qui[-]       | *!     | *        |         | ***
     e. Qui[-]1 as2-tu t2 vu t1   | *!     | *        |         | ******
     f. Tu as vu qui[+]           | *!     |          | *       |
     g. Qui[-]1 tu as vu t1       | *!     | *        |         | ***
     h. As1-tu t1 vu qui[+]       | *!     |          | *       | ***
Only candidate (a) fulfills the highly ranked constraint Q-MARK by both
moving the wh-element and the auxiliary. Therefore, it is optimal despite its
many violations of the lower ranked SPC. In this competition, the optimal
candidate is faithful to the input. Only the candidates (f) and (h) violate
Q-SCOPE: They include a [+Q] element that does not show up in scope position.
As the markedness constraint Q-MARK is ranked above the faithfulness
constraint FAITH[Q], the same candidate will come out as optimal even if
there is no [Q]-feature in the input. This is shown in (37):
(37) Input: [-Q] (i.e., no Q)

                                  | Q-MARK | FAITH[Q] | Q-SCOPE | SPC
  ☞ a. Qui[+]1 as2-tu t2 vu t1   |        | *        |         | ******
     b. Tu as vu qui[-]           | *!     |          |         |
     c. Qui[+]1 tu as vu t1       | *!     | *        |         | ***
     d. As1-tu t1 vu qui[-]       | *!     |          |         | ***
     e. Qui[-]1 as2-tu t2 vu t1   | *!     |          |         | ******
     f. Tu as vu qui[+]           | *!     | *        | *       |
     g. Qui[-]1 tu as vu t1       | *!     |          |         | ***
     h. As1-tu t1 vu qui[+]       | *!     | *        | *       | ***
Again, candidate (a) is optimal although this time, it is unfaithful to the input.
Only candidate (a) with movement of both the wh-phrase and the auxiliary is
optimal in standard French under the neutralization approach, just as under
the global tie approach. The highly ranked Q-MARK again eliminates any
other candidate and forces a [+Q] element to occur even if it is not present in
the input. Different underlying specifications are neutralized in the output.
In colloquial French, the constraint ranking differs slightly from the one in
standard French: As in the global tie approach, Q-MARK, which is highly
ranked in standard French, is ranked low in colloquial French. The faithfulness
constraint FAITH[Q] must be ranked above all markedness constraints,
and among the markedness constraints, Q-SCOPE and SPC have to be ranked
above Q-MARK. This is shown in (38):
(38) FAITH[Q] » Q-SCOPE » SPC » Q-MARK
Under this ranking it is guaranteed that distinct feature specifications in the
input come out differently in the output. Let us first look at an input in which
the wh-element is specified for the [Q]-feature:
(39) Input: [+Q]
                              | FAITH[Q] | Q-SCOPE | SPC     | Q-MARK
   a. Qui[+]1 as2-tu t2 vu t1 |          |         | ****!** |
   b. Tu as vu qui[-]         | *!       |         |         | *
☞  c. Qui[+]1 tu as vu t1     |          |         | ***     | *
   d. As1-tu t1 vu qui[-]     | *!       |         | ***     | *
   e. Qui[-]1 as2-tu t2 vu t1 | *!       |         | ******  | *
   f. Tu as vu qui[+]         |          | *!      |         | *
   g. Qui[-]1 tu as vu t1     | *!       |         | ***     | *
   h. As1-tu t1 vu qui[+]     |          | *!      | ***     | *
As FAITH[Q] is highest ranked, the optimal candidate must be faithful to the
input, i.e., it must have a [Q]-feature. Among all the faithful candidates ((a),
(c), (f) and (h) in the tableau), candidate (c) emerges as optimal: It satisfies
Q-SCOPE with the shortest possible movement path.
When the [Q]-feature is not specified in the input, then it cannot occur in
the output. This has the visible effect that no movement of the wh-element
takes place, as shown in (40):
(40) Input: [-Q] (i.e., no Q)
                              | FAITH[Q] | Q-SCOPE | SPC     | Q-MARK
   a. Qui[+]1 as2-tu t2 vu t1 | *!       |         | ******  |
☞  b. Tu as vu qui[-]         |          |         |         | *
   c. Qui[+]1 tu as vu t1     | *!       |         | ***     | *
   d. As1-tu t1 vu qui[-]     |          |         | *!**    | *
   e. Qui[-]1 as2-tu t2 vu t1 |          |         | *!***** | *
   f. Tu as vu qui[+]         | *!       | *       |         | *
   g. Qui[-]1 tu as vu t1     |          |         | *!**    | *
   h. As1-tu t1 vu qui[+]     | *!       | *       | ***     | *
Of the faithful candidates (b), (d), (e) and (g), the one which does not violate
Q-SCOPE and fulfills SPC best emerges as optimal. This is candidate (b).
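The four root-question competitions can be checked mechanically. The following is a minimal sketch, assuming that strict domination can be modelled as a lexicographic comparison of violation vectors. The violation counts are read off tableaux (36), (37), (39) and (40); the names CANDIDATES, profile and optimal are hypothetical and serve illustration only.

```python
# Hypothetical encoding of the root-question tableaux (36), (37), (39), (40).
# Each candidate: (output carries [+Q]?, input-independent markedness violations).
CANDIDATES = {
    "a": (True,  {"Q-MARK": 0, "Q-SCOPE": 0, "SPC": 6}),  # Qui[+] as-tu vu
    "b": (False, {"Q-MARK": 1, "Q-SCOPE": 0, "SPC": 0}),  # Tu as vu qui[-]
    "c": (True,  {"Q-MARK": 1, "Q-SCOPE": 0, "SPC": 3}),  # Qui[+] tu as vu
    "d": (False, {"Q-MARK": 1, "Q-SCOPE": 0, "SPC": 3}),  # As-tu vu qui[-]
    "e": (False, {"Q-MARK": 1, "Q-SCOPE": 0, "SPC": 6}),  # Qui[-] as-tu vu
    "f": (True,  {"Q-MARK": 1, "Q-SCOPE": 1, "SPC": 0}),  # Tu as vu qui[+]
    "g": (False, {"Q-MARK": 1, "Q-SCOPE": 0, "SPC": 3}),  # Qui[-] tu as vu
    "h": (True,  {"Q-MARK": 1, "Q-SCOPE": 1, "SPC": 3}),  # As-tu vu qui[+]
}

STANDARD = ["Q-MARK", "FAITH[Q]", "Q-SCOPE", "SPC"]
COLLOQUIAL = ["FAITH[Q]", "Q-SCOPE", "SPC", "Q-MARK"]  # ranking (38)


def profile(name, input_has_q, ranking):
    """Violation vector of one candidate, from highest to lowest constraint."""
    output_has_q, marks = CANDIDATES[name]
    violations = dict(marks)
    # FAITH[Q] is the only constraint here that inspects the input.
    violations["FAITH[Q]"] = int(output_has_q != input_has_q)
    return tuple(violations[c] for c in ranking)


def optimal(input_has_q, ranking):
    """Strict domination: the lexicographically smallest profile wins."""
    profiles = {n: profile(n, input_has_q, ranking) for n in CANDIDATES}
    best = min(profiles.values())
    return sorted(n for n, p in profiles.items() if p == best)


for label, ranking in (("standard", STANDARD), ("colloquial", COLLOQUIAL)):
    for input_has_q in (True, False):
        print(label, "[+Q]" if input_has_q else "[-Q]", optimal(input_has_q, ranking))
```

Under these assumptions the sketch selects candidate (a) for both inputs with the standard ranking (neutralization), and (c) versus (b) with ranking (38) (optionality).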
In this section, it has been shown that the neutralization approach can
account for the optionality of wh-movement in root questions in colloquial
French as can the global tie approach given in the literature. The two accounts
are empirically equivalent with respect to the above data. 22 In the global tie
approach, optionality is achieved by a global tie of SPC and Q-SCOPE, and
in the neutralization approach, by a combination of free [Q]-insertion in the
input and the ranking of FAITH[Q] above all relevant markedness constraints.
Due to reranking, all constraints are necessary in the neutralization approach
here.
Interestingly, the optionality is lost in embedded clauses.
5.4 Neutralization in Embedded Questions

Independently of the register, optionality of wh-movement breaks down in
embedded questions:
(41) a. *Je me demande qui1 as2-tu t2 vu t1
I myself ask who have-you seen
b. *Je me demande tu as vu qui
I myself ask you have seen who
c. Je me demande qui1 tu as vu t1
I myself ask who you have seen
d. *Je me demande as1-tu t1 vu qui
I myself ask have-you seen who
Only the candidate with wh-movement is grammatical; all other candidates
are ungrammatical.
To account for these data, two additional constraints are needed. They were
not shown in the tableaux before because they are only active in connection
with embedded clauses:
(42) SELECT(ion): Selectional requirements must be satisfied at
S-structure (Ackema & Neeleman 1995: 40).
In the case of a matrix verb like se demander ("ask oneself") that selects an
embedded question, SELECT is satisfied when the highest projection of the
embedded clause carries a Q-feature. 23
PURE-EP, which has already been introduced in section 4, will become
crucial in standard French. It is repeated in (43):
(43) PURE-EP: Purity of Extended Projection: No adjunction to (and no
movement into the head of) a subordinate clause (Grimshaw 1997:
374).
In both types of French (standard and colloquial) I assume SELECT and
PURE-EP to be ranked above the other constraints, whose order, of course,
remains the same as before. Only in standard French, however, is the ranking
of PURE-EP crucial.
Let us first look at a competition of standard French with a [Q]-feature in the
input. 24 It is crucial that PURE-EP is ranked above Q-MARK (cf. Grimshaw
1997 for the basic argument):
(44) Input: Je me demande ... [+Q] ...
                              | SELECT | PURE-EP | Q-MARK | FAITH[Q] | SPC    | Q-SCOPE
   a. qui[+]1 as2-tu t2 vu t1 |        | *!      |        |          | ****** |
   b. tu as vu qui[-]         | *!     |         | *      | *        |        |
☞  c. qui[+]1 tu as vu t1     |        |         | *      |          | ***    |
   d. as1-tu t1 vu qui[-]     | *!     |         | *      | *        | ***    |
The faithful candidate (c) with the [Q]-feature on the moved wh-element is
the optimal one. It is the only candidate that fulfills the highly ranked
SELECT (the highest projection carries a [Q]-feature as demanded by the matrix
verb) and PURE-EP (no movement into the head of the highest embedded
projection).
Nevertheless, with an input that does not show a [Q]-feature on the
wh-element, the same optimal candidate will result:
(45) Input: Je me demande ... [-Q] ... (i.e., no Q)
                              | SELECT | PURE-EP | Q-MARK | FAITH[Q] | SPC    | Q-SCOPE
   a. qui[+]1 as2-tu t2 vu t1 |        | *!      |        | *        | ****** |
   b. tu as vu qui[-]         | *!     |         | *      |          |        |
☞  c. qui[+]1 tu as vu t1     |        |         | *      | *        | ***    |
   d. as1-tu t1 vu qui[-]     | *!     |         | *      |          | ***    |
As before, the highest ranked markedness constraints SELECT and PURE-EP
make the decision between themselves before Q-MARK or the faithfulness
constraint are brought into the picture. The only difference from the winner
of the competition above is that this time, the optimal candidate is not faithful
to the input.
For colloquial French, I will assume that the order of SELECT and PURE-EP
is the same as in standard French. It is important that SELECT outranks
FAITH[Q]. The ranking of PURE-EP, however, is not crucial. The remaining
constraints are ranked as before in colloquial French (see (38)). Again, I will
first look at the competition with a [Q]-feature in the input:
(46) Input: Je me demande ... [+Q] ...
                              | SELECT | PURE-EP | FAITH[Q] | Q-SCOPE | SPC    | Q-MARK
   a. qui[+]1 as2-tu t2 vu t1 |        | *!      |          |         | ****** |
   b. tu as vu qui[-]         | *!     |         | *        |         |        | *
☞  c. qui[+]1 tu as vu t1     |        |         |          |         | ***    | *
   d. as1-tu t1 vu qui[-]     | *!     |         | *        |         | ***    | *
Due to the highly ranked constraint SELECT, only a candidate with a
[Q]-feature on a moved wh-element can win the competition. With the given
input, one of the faithful candidates (a) or (c) is possible. In the end, candidate
(c) comes out as optimal. The movement path of the competing candidate (a)
is longer: In addition to wh-movement, it moves the auxiliary to the head of
the highest projection of the embedded clause, thereby violating PURE-EP
and SPC more often than the optimal candidate (c).
With an input that does not show a [Q]-feature, the same optimal candidate
will still result. What becomes crucial here is the ranking of SELECT above
FAITH[Q]. It is responsible for neutralizing the underlying specifications of
two different inputs of the feature [Q] to one single output, as shown in (47):
(47) Input: Je me demande ... [-Q] ... (i.e., no Q)
                              | SELECT | PURE-EP | FAITH[Q] | Q-SCOPE | SPC    | Q-MARK
   a. qui[+]1 as2-tu t2 vu t1 |        | *!      | *        |         | ****** |
   b. tu as vu qui[-]         | *!     |         |          |         |        | *
☞  c. qui[+]1 tu as vu t1     |        |         | *        |         | ***    | *
   d. as1-tu t1 vu qui[-]     | *!     |         |          |         | ***    | *
The optionality of wh-movement that can be seen in root questions in
colloquial French is blocked in embedded questions. This is due to the high
ranking of SELECT (crucially above FAITH[Q]). Irrespective of the input, it
requires the highest projection of the embedded clause to carry a [Q]-feature
(in the case of embedded questions that are selected by, e.g., se demander).
This forces the optimal candidate to be unfaithful to its input. Of the
unfaithful candidates, (c) wins for the same reason as before: It has one (crucial)
movement less than its competitor, candidate (a).
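The ranking argument just made (SELECT must outrank FAITH[Q], whereas the position of PURE-EP is not crucial) can be tested by brute force. The sketch below is only an illustration under stated assumptions: the violation profiles are those readable from tableaux (46) and (47), strict domination is modelled as lexicographic comparison, and the names EMBEDDED, winners and neutralizes_to_c are hypothetical. It enumerates all 720 total rankings of the six constraints and keeps those under which candidate (c) wins for both inputs.

```python
from itertools import permutations

CONSTRAINTS = ["SELECT", "PURE-EP", "FAITH[Q]", "Q-SCOPE", "SPC", "Q-MARK"]

# Candidate: (output carries [+Q]?, markedness violations as in (46)/(47)).
EMBEDDED = {
    "a": (True,  {"PURE-EP": 1, "SPC": 6}),               # qui[+] as-tu vu
    "b": (False, {"SELECT": 1, "Q-MARK": 1}),             # tu as vu qui[-]
    "c": (True,  {"SPC": 3, "Q-MARK": 1}),                # qui[+] tu as vu
    "d": (False, {"SELECT": 1, "SPC": 3, "Q-MARK": 1}),   # as-tu vu qui[-]
}


def winners(input_has_q, ranking):
    """Optimal candidates under one total ranking (lexicographic comparison)."""
    def profile(name):
        output_has_q, marks = EMBEDDED[name]
        violations = dict(marks)
        violations["FAITH[Q]"] = int(output_has_q != input_has_q)
        return tuple(violations.get(c, 0) for c in ranking)

    best = min(profile(n) for n in EMBEDDED)
    return sorted(n for n in EMBEDDED if profile(n) == best)


def neutralizes_to_c(ranking):
    """True if (c) is the unique winner for the [+Q] and the [-Q] input."""
    return winners(True, ranking) == ["c"] and winners(False, ranking) == ["c"]


good = [r for r in permutations(CONSTRAINTS) if neutralizes_to_c(r)]

# Every such ranking places SELECT above FAITH[Q] ...
print(all(r.index("SELECT") < r.index("FAITH[Q]") for r in good))
# ... while PURE-EP may even be ranked lowest.
print(any(r[-1] == "PURE-EP" for r in good))
```

With only these four candidates, the enumeration suggests that the relative ranking of SELECT and FAITH[Q] carries the neutralization effect in colloquial French, as stated in the text.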
The global tie approach can account for the breakdown of optionality in
embedded questions along the same lines. The only thing to do is to rank the
constraints SELECT and PURE-EP above the global tie.
The topic of this section was the optionality of wh-movement in root
questions in colloquial French and its breakdown in standard French and in
embedded questions of both registers. For this kind of data, an account along
the lines of the global tie approach has been suggested in the literature. The
neutralization approach, however, which was introduced here, turns out to
account for the data as well.
6 IPP in German
The last set of data comes from IPP constructions in German. They may be
optional, depending on the verb class involved. IPP is short for Infinitivus Pro
Participio, which denotes a bare infinitive that in a certain context replaces the
expected past participle in some West Germanic languages. Simplifying a bit,
this context is given in the perfect tense, when the verb that is selected by the
temporal auxiliary takes a VP complement itself. IPP cases in German only
occur with a particular word order: The auxiliary, which normally follows
its complement verb in embedded sentences, precedes it. The connection be-
tween verb form and verb order is shown in (48) with the perception verb
hören ('hear'). Perception verbs optionally occur either in the IPP, with the
finite auxiliary hat ('has') preceding the other verbs, or as the past participle
in the "normal" verb order, with the finite auxiliary at the end.
Optionality of IPP with perception verbs:
(48) a. *... dass sie ihn singen hören hat

that she him sing hear-inf has
b. ... dass sie ihn hat singen hören
that she him has sing hear-inf
c. ... dass sie ihn singen gehört hat
that she him sing heard-pastp has
d. *... dass sie ihn hat singen gehört
that she him has sing heard-pastp
I have looked at these data in some detail in Schmid (1998, 1999). Again,
they can be analysed in both ways: either with the global tie approach or with
the neutralization approach.
6.1 The Constraints
The following constraints will become relevant below (see Schmid 1998,
1999):
(49) MORPH: (morphological selection):

Morphological selectional properties of lexical items must be ob-
served.
An example of a morphological selectional property is that a present perfect
auxiliary in German selects a complement headed by a past participle. 25
MORPH will be violated whenever the complement of a lexical item does not
occur in the right (i.e., selected) form, as is the case with IPP.
(50) *PASTP/PV/+V (*V-complement of a past participle of perception

verbs): A past participle of perception verbs may not take a verbal
complement. 26 This is the constraint that may act as a trigger for
IPP.
(51) HD-LFT (head left): The base position of a head is immediately to

the left of its complement.
(52) HD-RT (head right): The base position of a head is immediately to
the right of its complement.
(53) HD-LFT &D MORPH: HD-LFT and MORPH may not be violated
in the same domain D, 27 i.e., HD-LFT &D MORPH is violated iff
HD-LFT is violated ∧ MORPH is violated in the same domain. 28
A complex constraint like (53) is called a "Local Conjunction" 29 (see

Smolensky 1995, 1997). It can be defined as follows (Legendre et al. 1998:
262):
(54) Local Conjunction:
a. Given two constraints C1 and C2, their Local Conjunction (with
respect to a domain type D), C1 &D C2, is a new constraint which is
violated when two distinct violations of C1 and C2 occur within a
single domain of type D.
b. Universal ranking: C1 &D C2 » C1, C2
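Definition (54a) can be made concrete with a small sketch. It assumes, purely for illustration, that a candidate is represented as a list of domains, each recording the simplex constraints violated inside it; the names simplex and local_conjunction and the way violations are distributed over the lower domains are hypothetical choices made so that the totals match the IPP candidates discussed in this section.

```python
from typing import Callable, Dict, List

Domain = Dict[str, int]        # violations of simplex constraints in one domain
Candidate = List[Domain]       # a candidate is modelled as a list of domains
Constraint = Callable[[Candidate], int]


def simplex(name: str) -> Constraint:
    """A simplex constraint counts its violations across all domains."""
    return lambda cand: sum(d.get(name, 0) for d in cand)


def local_conjunction(c1: str, c2: str) -> Constraint:
    """C1 &D C2: one violation per domain in which both C1 and C2 are violated."""
    return lambda cand: sum(1 for d in cand if d.get(c1, 0) and d.get(c2, 0))


# First domain: the auxiliary 'hat' and its VP complement.
# The remaining domains (assumed layout) hold the lower verbs' order violations.
CAND_A = [{"HD-LFT": 1, "MORPH": 1}, {"HD-LFT": 1}, {"HD-LFT": 1}]  # singen hören hat
CAND_B = [{"HD-RT": 1, "MORPH": 1}, {"HD-LFT": 1}, {"HD-LFT": 1}]   # hat singen hören

conj = local_conjunction("HD-LFT", "MORPH")
print(conj(CAND_A), conj(CAND_B))
```

The design point is that the conjunction is evaluated per domain: a candidate that violates HD-LFT in one domain and MORPH in another does not violate the conjoined constraint.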
One possibility to account for the data is in terms of the global tie approach
(see Schmid 1999). I will assume that the constraint that demands the
occurrence of a selected past participle (MORPH) is tied with the constraint that
may demand the replacement of a past participle under certain conditions
(*PASTP/PV/+V). Other crucial rankings are: HD-LFT & MORPH outranks
HD-RT, HD-RT outranks HD-LFT and the tied constraints outrank HD-RT.
One possible (still underspecified) ranking is given in (55):
(55) HD-LFT & MORPH » MORPH <> *PASTP/PV/+V » HD-RT » HD-LFT
A competition with the above constraint ranking is shown below. The com-
peting candidates result from the manipulation of the verb form and of the
order of the lexical items by GEN.30
(56) Global ties
   ... dass sie ihn ...   | HD-LFT & MORPH | MORPH | *PASTP/PV/+V | HD-RT | HD-LFT
   a. singen hören hat    | *!             | *     |              |       | ***
☞  b. hat singen hören    |                | *     |              | *     | **
☞  c. singen gehört hat   |                |       | *            |       | ***
   d. hat singen gehört   |                |       | *            | *!    | **
Under one resolution of the tie (MORPH » *PASTP/PV/+V), it is more
important to observe the morphological selectional properties of the temporal
auxiliary hat than to obey the constraint against past participles in a certain
context. Therefore, under this ranking, candidate (c) with the perception verb
in the past participle is optimal (the other candidate with the past participle,
(d), fatally violates HD-RT).
Under the opposite resolution of the tie (*PASTP/PV/+V » MORPH), a
candidate with the perception verb in the bare infinitive (i.e., IPP) is optimal.
It does not violate the highly ranked *PASTP/PV/+V. Of the two candidates
with IPP, candidate (b) is optimal. Its competitor (a) fatally violates the con-
joined constraint (hat ('has') is on the right side of its complement, i.e., it
violates HD-LFT, and its complement hören ('hear') occurs in the bare in-
finitive, i.e., it violates MORPH). Candidate (b), however, only violates one
part of the conjoined constraint (MORPH), which is not enough to violate it
as a whole.
Note that the two optimal candidates differ in their constraint profile below
the tie. It is therefore crucial that the tie is global and not local.
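The difference between a global and a local reading of the tie can be illustrated with the violation profiles of tableau (56). The sketch below makes two assumptions that are not spelled out in this form in the text: a global tie is evaluated as the union of the winners under every resolution of the tie into a total ranking, and a local tie is approximated by pooling the violations of the tied constraints into one count (the reading suggested by note 20). All function names are hypothetical.

```python
from itertools import permutations, product

# Violation profiles as in tableau (56).
TABLEAU_56 = {
    "a": {"HD-LFT&MORPH": 1, "MORPH": 1, "HD-LFT": 3},        # singen hören hat
    "b": {"MORPH": 1, "HD-RT": 1, "HD-LFT": 2},               # hat singen hören
    "c": {"*PASTP/PV/+V": 1, "HD-LFT": 3},                    # singen gehört hat
    "d": {"*PASTP/PV/+V": 1, "HD-RT": 1, "HD-LFT": 2},        # hat singen gehört
}

# Ranking (55) as strata; the second stratum is the tie.
STRATA = [["HD-LFT&MORPH"], ["MORPH", "*PASTP/PV/+V"], ["HD-RT"], ["HD-LFT"]]


def winners(tableau, ranking):
    """Optimal candidates under one total ranking."""
    profiles = {n: tuple(v.get(c, 0) for c in ranking) for n, v in tableau.items()}
    best = min(profiles.values())
    return {n for n, p in profiles.items() if p == best}


def resolutions(strata):
    """All total rankings obtained by ordering the constraints inside each tie."""
    for combo in product(*(permutations(s) for s in strata)):
        yield [c for stratum in combo for c in stratum]


def global_tie_winners(tableau, strata):
    """A candidate is optimal if it wins under at least one resolution."""
    result = set()
    for ranking in resolutions(strata):
        result |= winners(tableau, ranking)
    return result


def pooled_tie_winners(tableau, strata):
    """Assumed local-tie reading: tied constraints count as one constraint."""
    profiles = {
        n: tuple(sum(v.get(c, 0) for c in stratum) for stratum in strata)
        for n, v in tableau.items()
    }
    best = min(profiles.values())
    return {n for n, p in profiles.items() if p == best}


print(sorted(global_tie_winners(TABLEAU_56, STRATA)))
print(sorted(pooled_tie_winners(TABLEAU_56, STRATA)))
```

Under the pooled reading, (b) and (c) tie on the pooled constraint, and HD-RT then decides in favour of (c) alone, so optionality is lost. Only the global reading keeps both candidates.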
For the neutralization approach to work, an additional faithfulness constraint
is needed: 31
(57) FAITH(PASTP): When a past participle is specified in the input, then

it must occur in the output and vice versa.
It is necessary that there are inputs that differ only with respect to their spec-
ification for the past participle. To allow for optionality, it is crucial that
FAITH(PASTP) is ranked above the relevant markedness constraint, namely
*PASTP/PV/+V:
(58) Relevant part of the ranking:
... FAITH(PASTP) » *PASTP/PV/+V ...
I will first look at a competition in which the input is specified for the past
participle (marked by the past participle prefix [ge-]):
(59) Input: Past participle
   ... dass sie ihn [+ge-] ... | HD-LFT & MORPH | FAITH(PASTP) | *PASTP/PV/+V | HD-RT | HD-LFT
   a. singen hören hat         | *!             | *            |              |       | ***
   b. hat singen hören         |                | *!           |              | *     | **
☞  c. singen gehört hat        |                |              | *            |       | ***
   d. hat singen gehört        |                |              | *            | *!    | **
The winner in this competition is the faithful candidate (c) with the
perception verb in the past participle and the temporal auxiliary following the other
verbs. Although it violates the markedness constraint *PASTP/PV/+V, it is
optimal due to the higher ranking of the faithfulness constraint.
Likewise, due to this ranking, a different winner emerges when the input is
not specified for a past participle, as shown in (60):
(60) Input: Bare infinitive
   ... dass sie ihn ...   | HD-LFT & MORPH | FAITH(PASTP) | *PASTP/PV/+V | HD-RT | HD-LFT
   a. singen hören hat    | *!             |              |              |       | ***
☞  b. hat singen hören    |                |              |              | *     | **
   c. singen gehört hat   |                | *!           | *            |       | ***
   d. hat singen gehört   |                | *!           | *            | *     | **
In this competition, too, the winner is faithful to the input, i.e., this time, the
perception verb occurs in the bare infinitive and not in the past participle. As
candidate (a) fatally violates the conjoined constraint, candidate (b) emerges
as optimal, with IPP and the auxiliary on the left side of its complement.
The result of section 6 is the same as before: Again, both the global tie
approach and the neutralization approach can account for the data.
7 Advantages and Disadvantages of the Two Approaches
The reason for the discussion of the global tie approach and the neutralization
approach in this paper is their ability to account for the same kind of data
(which are difficult or impossible to account for under other approaches, like
identity of constraint profile and local ties):
— Both approaches can handle cases of optionality in which the
optimal candidates seem to differ in many respects (i.e., they are more
impervious to differences lower down in the constraint hierarchy).
— Both approaches can account not only for optionality, but also for
its "breakdown" in certain contexts (i.e., alternation).
The similarities of the two accounts of optionality given in this paper suggest
that the neutralization approach may easily be translated into the global tie
approach and vice versa. Some considerations on possible "translation rules"
are given below:
— From the neutralization approach into the global tie approach:
The markedness constraint (e.g., *EXP in section 4) that was
crucially outranked by the relevant faithfulness constraint (e.g.,
FAITH[COMP]) in the neutralization approach will form a tie with a
conflicting markedness constraint (e.g., HAVE(CP)). The
faithfulness constraint is then either abandoned or ranked below the global
tie. As the decision about the optimal candidate(s) in a competition
is now made by markedness constraints alone, it does not matter
which feature specification is given in the input. 32
— From the global tie approach into the neutralization approach:
In the other direction, the relevant faithfulness constraint (e.g.,
FAITH[COMP] in section 4) must be ranked above the markedness
constraints that form the global tie (e.g., HAVE(CP) and *EXP).
The tie is then no longer needed. It is crucial that the faithfulness
constraint is sensitive to a functional feature (e.g., [COMP]) whose
presence or absence is the only distinction between the inputs.
To sum up, it can be said that a global tie of two markedness constraints,
one of which, say M1, prohibits [x] and the other of which, say M2, demands
[x], has the same effect as a faithfulness constraint F which is sensitive to
[x] and outranks the markedness constraints. This is so because F on its own
either demands or prohibits [x] already, depending on the input. Only one
optimal candidate results in both approaches if another relevant markedness
constraint outranks either the tie M1 ∘ M2 or the faithfulness constraint F.
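The schema summarized above can be spelled out for the smallest possible case. The following sketch is an abstract toy model, not an analysis of any of the data sets: there are two candidates (with and without [x]), M1 prohibits [x], M2 demands [x], F is faithful to [x], and a hypothetical higher-ranked constraint M3 demands [x] in a certain context (comparable to SELECT in embedded questions). All names are assumptions introduced for illustration.

```python
from itertools import permutations

CANDIDATES = {"with_x": True, "without_x": False}


def violations(has_x, input_has_x, context):
    """Violation counts of one candidate for the four toy constraints."""
    return {
        "M1": int(has_x),                     # M1 prohibits [x]
        "M2": int(not has_x),                 # M2 demands [x]
        "F": int(has_x != input_has_x),       # faithfulness to [x]
        "M3": int(context and not has_x),     # demands [x] in the given context
    }


def winners(ranking, input_has_x, context):
    """Optimal candidates under one total ranking (lexicographic comparison)."""
    profiles = {
        name: tuple(violations(has_x, input_has_x, context)[c] for c in ranking)
        for name, has_x in CANDIDATES.items()
    }
    best = min(profiles.values())
    return {n for n, p in profiles.items() if p == best}


def global_tie(context):
    """M3 » M1 <> M2, no faithfulness: union over both resolutions of the tie."""
    result = set()
    for m_order in permutations(["M1", "M2"]):
        # The input plays no role because F is absent from the ranking.
        result |= winners(["M3", *m_order], input_has_x=False, context=context)
    return result


def neutralization(context):
    """M3 » F » M1, M2: union over the two possible inputs."""
    result = set()
    for input_has_x in (True, False):
        result |= winners(["M3", "F", "M1", "M2"], input_has_x, context)
    return result


for context in (False, True):
    print(context, sorted(global_tie(context)), sorted(neutralization(context)))
```

In this toy model both approaches yield two outputs where M3 is not at stake and a single output where it is, which is the pattern of optionality and its breakdown described in the text.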
If it should indeed be the case that the global tie approach and the
neutralization approach can always be translated into each other without empirical
consequences (as suggested by the examples in this paper), 33 then it would be
preferable to dispense with one of the two approaches to avoid redundancy in
the grammar.
The question arises as to which approach should be dispensed with. As
the approaches seem to be empirically equivalent, I will list some more
conceptual and theory-internal points below, both for and against each of the
approaches.
The following points can be presented in favour of the neutralization
approach:
— Optionality (and neutralization) comes out as a result of the
"normal" constraint interaction of faithfulness constraints and
markedness constraints. Faithfulness constraints are needed anyway. If
there were only markedness constraints in the grammar, the way
would be open to "ba" everywhere (see Chomsky 1995: 224, fn. 4).
— Neutralization can account for absolute ungrammaticality. By
ranking a markedness constraint (M) above a faithfulness constraint (F),
even an unfaithful candidate can be optimal in a competition when it
differs from the faithful candidate by not violating M (see Legendre
et al. 1998: 274f. for neutralization to a candidate with a different
LF compared to its input, and Bakovic & Keer 1999).
The following points may be made against the neutralization approach:
— Candidate sets can become very large as unfaithful candidates must

be included to a certain degree.
— Derivational ambiguity: One and the same output can be derived
from several different inputs.
— It is neither completely clear what functional features a faithfulness
constraint can refer to, nor is it clear whether it is desirable that
different specifications of these functional features lead to different
inputs.
— GEN must be able to manipulate functional features.
— There is no obvious way to distinguish repair/exceptional forms (cf.
IPP) from regular forms.
Note, however, that most of the above points show general properties of an
OT system (see, e.g., "richness of the base"). The neutralization approach
only makes us more conscious of them.
Global ties allow the presence of two (or more) grammars simultaneously.
One way to see this complexity as an advantage of the approach is that it may
reflect the property of instability that languages show in their development.
In studies of language change, it is not unusual to assume the simultaneous
presence of two or more grammars (see, e.g., Kroch 1989, Pintzuk 1991). The
following points, however, can be raised against the global tie approach:
— Global ties are problematic for learnability, see, e.g., Tesar (1998),
who proposes a learning algorithm that builds on a total ranking of
constraints. 34 Something else that may complicate language acqui-
sition is the increasing number of possible grammars. The number
of grammars containing three constraints is 6 without, and 19 with
allowing for the possibility of constraint ties (see Vikner 1999).
— Global ties are also problematic from a conceptual point of view.

With one tie, two grammars may be simultaneously present in the
mind of a single speaker, increasing to four grammars with two
ties, six grammars with three ties, and so on. In addition, the pos-
sibility of a tie built by three (or more) constraints is not excluded
and would increase the number of simultaneously present grammars
enormously, resulting in (at least) six simultaneous grammars (see,
e.g., Sells et al. 1996).
Given these considerations I am inclined to favour the neutralization approach
over the global tie approach: It exploits the normal OT interaction of
markedness constraints and faithfulness constraints, it is conceptually less complex
and problematic than global ties and it can account for total ungrammaticality
as well.
8 Summary
At first sight, optionality poses a problem for OT. In the OT literature, how-
ever, several accounts of optionality can be found. What I wanted to do in
this paper was to compare two of these approaches that seem to be able to
cover the same kind of data, namely the global tie approach and the neutral-
ization approach. Where an account in terms of neutralization had already
been suggested in the literature, I developed an account in terms of global
ties and vice versa. Both approaches were applied to three different sets of
data: complementizer optionality in English, optionality of wh-movement in
French root questions and optionality of IPP constructions in German. The
result in all three cases was the same: The approaches are empirically equiv-
alent and can account for both optionality and the breakdown of optionality
in certain contexts. If two approaches can account for the same set of data,
one of them should be abandoned (for reasons of simplicity, elegance, "econ-
omy"). It was argued that the approach to be abandoned should be the global
tie approach. The neutralization approach can do the same and more (account
for ungrammaticality) without adding new mechanisms to the system. Both
accounts increase complexity, but only global ties result in different gram-
mars (i.e., rankings) for one and the same language and thus pose problems
for learnability.
Notes
I would like to thank the audience of the Graduiertenkolleg-Workshop in Söllerhaus,
June 1999, the audience of WOTS 3 at the University of Stuttgart, November 1999
and especially Gereon Müller, Ian Roberts, and Sten Vikner for their suggestions and
comments. All remaining shortcomings and errors are mine.
1. "Local" and "global" are the main criteria that Müller 1999: 5ff. uses to clas-
sify the different concepts of a tie. What I call a local tie is named "conjunctive
local tie" in his overview and what I call a global tie is named "ordered global
tie". In this article, however, I concentrate on the local-global distinction and, for
reasons of simplicity, pick out only one member of each set, leaving aside other
distinctions.
2. Neutralization can be used to account for ineffability, another potential problem
for an OT analysis; see Legendre et al. 1998 and Bakovic & Keer 1999.
3. Following Bakovic & Keer (1999) among many others, I mark the embedded
sentence with that as a CP and the embedded sentence without that as an IP.
4. Bakovic & Keer (1999) and Kurafuji (1997) adapt constraints from Grimshaw
(1997) and Vikner (1999), arguing, however, against the approach in terms of
identity of constraint profile suggested in these papers.
5. Some other constraints are implicitly assumed but left out of the tableaux for
reasons of simplicity:
(i) *Lx-Mv: *Lexical Movement: *Movement of a lexical head.
(ii) PR-BD: Proper Binding: *Every trace that c-commands its antecedent.
The ranking *Lx-Mv » PR-BD is responsible for the lack of V-to-I movement
in English (see Vikner 1999).
The ranking of the next constraint relative to, e.g., *EXP is responsible for
the (non-)occurrence of a complementizer:
(iii) OB-HD: Obligatory Head: *Every empty head.
For English, the ranking PURE-EP » OB-HD » *EXP requires the insertion of
the complementizer whenever a CP is available.
6. Cf. Grimshaw's (1997) Full Interpretation (FI). Notice, however, that Grimshaw
explicitly assumes that complementizers do not violate FI.
7. For cases of ineffability of complementizers, see, e.g., Bakovic & Keer (1999).
8. In general, this is an important but unanswered question in OT.
9. The ranking of PURE-EP above FAITH [COMP] will become clear later on, when
PURE-EP becomes active.
10. For the neutralization account to work, GEN must be quite powerful as it has to be
able to manipulate certain (functional) features of the input. It can, for example,
add them or delete them.
11. See Prince & Smolensky (1993: 192) for a proposal of a mechanism to determine
the optimal input for a given output ("input optimization").
12. In their 1998 paper they eventually reject the global tie analysis in favour of an
approach of apparent optionality, i.e., they assume that the optimal candidates
belong to different candidate sets.
13. Cases with est-ce que are commonly assumed to behave differently. I will ignore
them here.
14. At the end of their 1998 paper Ackema & Neeleman also briefly mention option-
ality in colloquial French (i.e., optionality between (b) and (c)) and sketch an
account in terms of optional inclusion of a complementizer in the input which
can optionally remain unpronounced.
15. This constraint is called STAY by Ackema & Neeleman (1998).
16. Exactly how often a candidate violates SPC depends on specific assumptions
about sentence structure.
17. Ackema & Neeleman (1995) assume a global tie between SPC and Q-MARK. I
will differ from them to be able to account for the optimality of candidate (26-c)
with wh-movement but without movement of the auxiliary.
18. Note that different registers (standard, colloquial,...) can be thought of as being
different resolutions of a global tie (see, e.g., Sells et al. 1996). Under such a
view, optionality in French root questions (in both registers) could also be ac-
counted for by a tie of all three of the above constraints. For the sake of con-
venience, however, I will continue to show different tableaux for standard and
colloquial French.
19. As pointed out by a reviewer, this analysis cannot straightforwardly be extended
to multiple questions in Colloquial French.
20. Note that the local tie approach would not be sufficient in this case (assuming the
constraints above): Candidate (b) violates the tied constraints only once, while
candidate (c) violates them three times. Both candidates, however, emerge as
optimal.
21. In this case the separation of a syntactic [Q]-feature and a semantic [wh]-feature
must be assumed (for a discussion of the difference between these features, see,
e.g., Müller 1993: 302 and the references given there).
22. But cf. fn. 18: There might also exist a language with three (or more) winners
which could be accounted for by a tie of three (or more) constraints in the global
tie approach. In this case, an account in terms of neutralization would be more
difficult, at least if the binarity of formal input features should be maintained.
23. A few words are in order concerning the relation between constraints like SE-
LECT and FAITH[X] (more generally between selectional constraints and faith-
fulness constraints; see, e.g., the relation between "morphological selection" and
FAITH(PastP) in section 6).
These two types of constraints are not as similar as they may seem at first
sight. They may overlap, but they might contradict each other as well, depend-
ing on the input. They see the element/feature in question in relation to different
entities: Selectional constraints in relation to the selecting element, and faithful-
ness constraints in relation to the input. Selectional constraints are markedness
constraints in the sense that it can be decided whether a candidate fulfills or vio-
lates them without knowing the input. To decide, however, whether a faithfulness
constraint is violated, the input must be taken into consideration. I will assume
in the following that both constraint types are needed in the grammar. However,
the question remains as to what the exact relation between them is and where
selectional requirements fit into an OT system at all.
24. For the sake of simplicity, I have left out the candidates (e) to (h) of the former
tableaux as they will never be optimal.
25. See Bech (1983) for an early and thorough investigation of verbal complementa-
tion in German.
26. As pointed out in Schmid (1998, 1999), this constraint is in fact part of a whole
constraint subhierarchy sensitive to verb classes.
27. In this case, the domain D consists of a verbal head and its VP-complement.
28. C1 &D C2 is equivalent to a logical disjunction (C1 ∨ C2 in a given domain D),
which is to be read as: C1 ∨ C2 is not violated iff C1 is not violated ∨ C2 is not
violated in D.
29. Note that "Local Conjunction" as defined in (54) is a recursive mechanism that
could in principle lead to a nonfinite number of constraints, which I do not con-
sider desirable. For my purposes here, however, it would suffice to see (53) as
one complex, universal constraint. The reason why I have formulated it as a con-
junction of two simplex constraints rather than as one complex constraint is that
it becomes more transparent: It is bad to have the wrong form or be in the wrong
place, but it is even worse to both have the wrong form and be in the wrong place.
30. That the bare infinitive and not the to-infinitive is used instead of the past par-
ticiple could be due to yet another constraint like, e.g., *STRUCTURE (under the
assumption that a bare infinitive has less structure than a to-infinitive). As all
non-selected to-infinitives are ruled out by this constraint, I have not included
them in the tableaux.
31. Note that with the introduction of FAITH(PASTP), the constraint MORPH is only
indirectly relevant, namely through Local Conjunction. For the sake of simplicity,
I will leave out the simplex constraint in the following tableaux.
32. Note, however, that the faithfulness constraint (FAITH[COMP]) could form a tie
with the conflicting markedness constraint (*EXP) under the assumption that
HAVE (CP) holds in the input, i.e., that a verb always selects a CP in the input.
Under this assumption, the input would be as relevant in the global tie approach
as in the neutralization approach.
33. Remember, however, that cases with three (or more) optimal candidates are more
difficult to account for in terms of neutralization. Nevertheless, a neutralization
account does not seem to be impossible if, e.g., formal input features are not
(always) assumed to be binary.
34. As pointed out by Tony Kroch (p.c.), it must be checked, however, whether global ties
really turn out to be this problematic for the learning algorithm. It could be the
case that whenever the learner comes to a piece of data that contradicts an as-
sumed ranking, the contradicting ranking is stored as a different grammar. The
number of simultaneous grammars should be restricted nevertheless.
References

1995 Optimal questions. OTS Working Papers. Utrecht University.
1998 Optimal questions. Natural Language and Linguistic Theory 16: 443-
490.
Bakovic, Eric — Ed Keer
1999 Optionality and ineffability. Ms., Harvard University & University of
Massachusetts, Amherst. To appear in: G. Legendre, J. Grimshaw and S.
Vikner (eds.) Optimality-Theoretic Syntax. Cambridge, MA: MIT Press.
Bech, Gunnar
1983 Studien über das deutsche verbum infinitum. Tübingen: Niemeyer. Re-
print from 1955.
Broekhuis, Hans — Joost Dekkers
1999 The Minimalist Program and Optimality Theory: Derivations and evalu-
ations. To appear in: J. Dekkers, F. van der Leeuw and J. van de Weijer
(eds.) Optimality Theory: Phonology, Syntax and Acquisition. Oxford:
Oxford University Press.
Chomsky, Noam
Confais, Jean-Paul
1985 Grammaire Explicative. München: Max Hueber.
Grimshaw, Jane
Kroch, Anthony
1989 Reflexes of grammar in patterns of language change. Journal of Lan-
guage Variation and Change 1: 199-244.
Kurafuji, Takeo
1997 Three OT approaches to the optionality of complementizers. Ms., Rutgers
University.
318 Tanja Schmid

1995 Optimality and wh-extraction. In: J. Beckman, L. Walsh-Dickie and S.
Urbanczyk (eds.) Papers in Optimality Theory, 607-635. (University of
Massachusetts Occasional Papers in Linguistics 18.) University of Mas-
sachusetts, Amherst.
Müller, Gereon
1993 On deriving movement type asymmetries. Ph.D. dissertation, Universität
Tübingen.
Müller, Gereon
1997 Partial wh-movement and Optimality Theory. The Linguistic Review 14:
249-306.
Müller, Gereon
Pintzuk, Susan
1991 Phrase structures in competition: Variation and change in Old English
word order. Ph.D. dissertation, University of Pennsylvania.
Rutgers University.
Schmid, Tanja
1998 West Germanic 'Infinitivus Pro Participio' (IPP) constructions in Opti-
mality Theory. To appear in: Proceedings of Console 1998. The Hague:
HAG.
Schmid, Tanja
1999 Die Ersatzinfinitivkonstruktion im Deutschen. Ms., Universität Stuttgart.
Sells, Peter — John Rickford — Thomas Wasow
1996 An optimality theoretic approach to variation in negative inversion in
AAVE. Natural Language and Linguistic Theory 14: 591-627.
Smolensky, Paul
1995 On the internal structure of the constraint component of UG. Talk given
at UCLA, April 1995.
Smolensky, Paul
1997 Constraint interaction in generative grammar II: Local conjunction (or,
random rules in Universal Grammar). Talk given at the Hopkins Opti-
mality Theory Workshop/University of Maryland Mayfest, May 1997.
Tesar, Bruce
1998 Error-driven learning in Optimality Theory via efficient computation of
optimal forms. In: P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis and
D. Pesetsky (eds.) Is the Best Good Enough?, 421-435. Cambridge, MA:
MIT Press.
Vikner, Sten
1999 V-to-I movement and 'do'-insertion in Optimality Theory. Ms., Univer-
sität Stuttgart. To appear in: G. Legendre, J. Grimshaw and S. Vikner
(eds.) Optimality-Theoretic Syntax. Cambridge, MA: MIT Press.
The Interpretation of Object Shift and Optimality
Theory
Sten Vikner
Diesing (1996, 1997) observes that the interpretations of object-shifted ob-
jects and non-object-shifted objects in Icelandic object shift constructions dif-
fer along lines very similar to the interpretation differences between scram-
bled and non-scrambled objects in, e.g., German. The present paper argues
that Optimality Theory has certain advantages over, e.g., Minimalism in ac-
counting for such data. This is because the interpretational differences only
hold of object shift constructions: In a construction in which object shift is
possible, a non-object-shifted object only has one interpretation (parallel to
a German non-scrambled object), but in a construction in which object shift
is not possible, a non-object-shifted object is ambiguous (interpretable either
like a German scrambled object or like a German non-scrambled object). In
other words, what matters is not just whether the object has moved, but also
whether it "could have moved" (i.e., it depends on how well those compet-
ing candidates are doing which contain object-shifted objects). In Optimality
Theory, such a situation can be accounted for in terms of violable constraints,
and the difference between object shift and scrambling can be derived from
different rankings of the same constraints.
1 Introduction
Object shift is a process found in the Scandinavian languages (Holmberg
1986, 1991, 1997, 1999, Vikner 1989, 1994 and the other papers in Corver &
van Riemsdijk 1994, Josefsson 1992, 1993, Holmberg & Platzack 1995, and
references in all of these) which moves the object out of its base position in-
side the VP to a position to the left of an element (e.g., negation or adverbial)
which is not part of the VP:
(1) Danish
a. *Hvorfor læste Peter aldrig den ?
b. Hvorfor læste Peter den aldrig t ?
why read Peter (it) never (it)
(2) Icelandic
a. *Af hverju las Pétur aldrei hana ?
b. Af hverju las Pétur hana aldrei t ?
why read Pétur (it) never (it)
In Icelandic, all DPs undergo object shift, whereas in the other Scandinavian
languages, only pronouns do:
(3) Danish
a. Hvorfor læste Peter aldrig den her bog ?
b. *Hvorfor læste Peter den her bog aldrig t ?
why read Peter (this book) never (this book)
(4) Icelandic
a. Af hverju las Pétur aldrei þessa bók ?
b. Af hverju las Pétur þessa bók aldrei t ?
why read Pétur (this book) never (this book)
The contrast between (1)/(2) and (3)/(4) shows that object shift of pronouns
is obligatory in both Danish and Icelandic, whereas object shift of full DPs is
only optional in Icelandic and downright impossible in Danish. Object shift
is only possible if the verb leaves VP, which a finite main verb does in main
clauses (which are V2, see (1)-(4)), but which a non-finite main verb never
does:
(5) Danish
a. Hvorfor har Peter aldrig læst den ?
b. *Hvorfor har Peter den aldrig læst t ?
why has Peter (it) never read (it)
(6) Icelandic
a. Af hverju hefur Pétur aldrei lesið þessa bók ?
b. *Af hverju hefur Pétur þessa bók aldrei lesið t ?
why has Pétur (this book) never read (this book)
Scrambling, an object movement very similar to object shift found in the con-
tinental West Germanic languages (cf. the papers in Grewendorf & Sternefeld
1990, Webelhuth 1992, Haider 1993, the papers in Corver & van Riemsdijk
1994, Müller 1995, Haider & Rosengren 1998, and references in all of these),
is not dependent on the position of the verb in this way:
(7) German
a. ...ob Peter nie dieses Buch liest ?
b. ... ob Peter dieses Buch nie t liest ?
if Peter (this book) never (this book) reads
(8) German
a. Warum liest Peter nie dieses Buch ?
b. Warum liest Peter dieses Buch nie t ?
why reads Peter (this book) never (this book)
(9) German
a. Warum hat Peter nie dieses Buch gelesen ?
b. Warum hat Peter dieses Buch nie t gelesen ?
why has Peter (this book) never (this book) read
Scrambling, too, becomes obligatory rather than optional when pronouns are
considered:
(10) German
a. *... ob Peter nie es liest ?
b. ...ob Peter es nie t liest ?
if Peter (it) never (it) reads
(11) German
a. *Warum liest Peter nie es ?
b. Warum liest Peter es nie t ?
why reads Peter (it) never (it)
(12) German
a. *Warum hat Peter nie es gelesen ?
b. Warum hat Peter es nie t gelesen ?
why has Peter (it) never (it) read
When pronouns are modified (e.g., we two, you and I, he with the red hair),
they behave like full DPs (cf. (3)-(4), (6), and (7)-(9)), and not like unmodi-
fied pronouns (cf. (1)-(2), (5), and (10)-(12)), i.e. they do not undergo object
shift/scrambling in Danish and only optionally in Icelandic and German.
2 The Interpretation of Object Shift (and Scrambling)
From the above, it might appear that (Icelandic) object shift and (German)
scrambling are completely optional, at least as far as non-pronouns are con-
cerned. This is not the case, however. As observed in Diesing & Jelinek
(1995:150) (from now on: D&J) and in Diesing (1996:79, 1997:418), the
interpretation of object-shifted objects in Icelandic differs from that of non-
object-shifted objects, and this difference is parallel to the difference in inter-
pretation between scrambled and non-scrambled objects in, e.g., German and
Yiddish (cf. Diesing 1992:129).
(13) German
... weil ich ...
... because I
a. ... selten die kleinste Katze streichle
b. ... die kleinste Katze selten streichle
(the smallest cat) seldom (the smallest cat) pet
(D&J: 130 (9-a), Diesing 1996:73 (17), 1997:379, (14-a))
D&J/Diesing observe that the interpretation of (13-a) is that whichever group
of cats I meet, I rarely pet the one which is the smallest in that particular
group. The interpretation of (13-b) is that there is a cat which is smaller than
all others, and that cat, I rarely pet. In other words, the relative scope of
seldom and the smallest cat corresponds to their surface order; the one furthest
left has wider scope.
(14) Icelandic
a. Hann les sjaldan lengstu bókina
b. Hann les lengstu bókina sjaldan
He reads (longest book-the) seldom (longest book-the)
(Diesing 1996:79 (32), 1997:418 (82))
According to Diesing (1996, 1997), the interpretation of (14-a) is that
whichever group of books he is put in front of, he rarely reads the one which
is the longest in that particular group. The interpretation of (14-b) is that there
is a book which is longer than all others, and that book, he rarely reads. Thus
also here, the relative scope of seldom and the longest book corresponds to
their surface order; the one furthest left has wider scope. Diesing's claim is
that these interpretation differences can be derived from her Mapping Hy-
pothesis (1992:10, 1997:373, see also D&J: 124), i.e., that the differences
follow from whether the object is inside the VP and thereby part of the "nu-
clear scope (the domain of existential closure)" or outside VP but inside IP
and thereby part of the "restriction (of an operator)". Diesing's observations
are supported by the following more extensive set of data:¹
(15) German
a. In den Prüfungen beantwortet er selten die schwierigste Frage
in the exams answers he rarely the most-difficult question
b. In den Prüfungen beantwortet er die schwierigste Frage selten
in the exams answers he the most-difficult question rarely
(16) German
a. In den Prüfungen hat er selten die schwierigste Frage
in the exams has he rarely the most-difficult question
beantwortet
answered
b. In den Prüfungen hat er die schwierigste Frage selten
in the exams has he the most-difficult question rarely
beantwortet
answered
The interpretation of the German (15-a)/(16-a) is that regardless of which
exam he is taking, he rarely answers whichever question happens to be the
most difficult one in that particular exam. The interpretation of the German
(15-b)/(16-b), on the other hand, is that there is one particular question which
is more difficult than all others (e.g., "list all the irregular verbs in Icelandic")
and which appears in most or all exams, and when he encounters this ques-
tion, he rarely answers it. The exact same differences in interpretation hold
of the Icelandic (17-a,b):
(17) Icelandic
a. Í prófunum svarar hann sjaldan erfiðustu spurningunni
in exams-the answers he rarely most-difficult question-the
b. Í prófunum svarar hann erfiðustu spurningunni sjaldan
in exams-the answers he most-difficult question-the rarely
There is one case which is not discussed by Diesing, namely, the context
in which object shift is not possible in Icelandic. In this context, only one
word order is possible, (18-a), and this word order has not just one of the two
interpretations discussed above, it actually has both interpretations:
(18) Icelandic
a. Í prófunum hefur hann sjaldan svarað erfiðustu
in exams-the has he rarely answered most-difficult
spurningunni
question-the
b. *Í prófunum hefur hann sjaldan erfiðustu spurningunni
in exams-the has he rarely most-difficult question-the
svarað
answered
c. *Í prófunum hefur hann erfiðustu spurningunni sjaldan
in exams-the has he most-difficult question-the rarely
svarað
answered
3 Optimality Theory and the Interpretation of Object Shift (and
Scrambling)
3.1 Optimality Theory
In optimality theory (cf., e.g., Prince & Smolensky 1993, Grimshaw 1993,
1997, Burzio 1995, Müller 1997, Archangeli & Langendoen 1997, Barbosa
et al. 1998), constraints are taken to be relative ("soft") rather than absolute
("hard"):
(19) a. Absolute: If a sentence violates constraint C, it is ungrammatical.
b. Relative: That a sentence violates constraint C may be bad, but not
as bad as if it had violated constraint B, which in turn is less bad than
if it had violated constraint A.
In other words: Although there is a price to be paid every time a constraint is
violated, the price is not always the grammaticality of the sentence in ques-
tion. The following four ideas are central to optimality theory (Grimshaw
1997:373):
(20) a. Constraints may be violated.
b. Constraints are ordered in a hierarchy (a grammar is a particular
ordering of constraints).
c. Constraints are universal, i.e., in all languages, the same constraints
apply, except that they are ordered differently from language to lan-
guage (language variation is variation in the constraint hierarchy).
d. Only the optimal version of a sentence is grammatical; all non-
optimal versions are ungrammatical (the optimal version/candidate
of two is the one with the least violation of the highest constraint on
which the two versions/candidates differ).
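The evaluation procedure in (20-d) can be illustrated with a minimal sketch, assuming that each candidate is represented by its number of violations per constraint. The function name `optimal`, the candidate labels and the toy violation counts are hypothetical and serve illustration only; the constraint names A, B, C follow (19-b).

```python
from typing import Dict, List


def optimal(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> List[str]:
    """Return the candidate(s) with the best violation profile.

    Profiles are compared constraint by constraint from the highest-ranked
    constraint downwards, so the decision is made by the highest constraint
    on which two candidates differ, as in (20-d).
    """
    def profile(name: str) -> tuple:
        return tuple(candidates[name].get(c, 0) for c in ranking)

    best = min(profile(name) for name in candidates)
    return [name for name in candidates if profile(name) == best]


# Hypothetical toy tableau.
toy = {
    "cand1": {"C": 2},  # two violations of C, none of A or B
    "cand2": {"B": 1},  # a single violation of B
}

winners_abc = optimal(toy, ["A", "B", "C"])  # B outranks C
winners_acb = optimal(toy, ["A", "C", "B"])  # C outranks B
print(winners_abc, winners_acb)
```

Under the ranking A » B » C, the single violation of B is worse than two violations of C, so `cand1` wins; reversing B and C reverses the outcome. This is the sense in which a grammar is a particular ordering of constraints.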
Let us now return to the data discussed in sections 1 and 2. These data showed
that the interpretation of an object in Icelandic depends on whether or not it
has undergone object shift, in a completely parallel way to how the interpre-
tation of an object in German depends on whether or not it has undergone
scrambling. It is crucial, however, that whereas scrambling is never impossi-
ble in German, there are many contexts in Icelandic which do not allow object
shift. In those Icelandic sentences in which object shift is excluded, the non-
object-shifted object has two interpretations: It may be interpreted either as if
it preceded the adverbial or as if it followed it, and not just the latter.
This ambiguity is the reason why an Optimality Theory analysis is particu-
larly suitable here:
On the one hand, the generalisation seems to hold of most of the data that
the scope of objects and adverbials is read off their surface position (Diesing's
"Scoping Condition", 1996:70, 1997:375-76, see also D&J: 127), hence the
differences seen in, e.g., (17), between the non-object-shifted object in (17-
a), "he rarely answers whichever is the most difficult question in any given
exam", and the object-shifted object in (17-b), "there is a question more dif-
ficult than all others, and when he encounters this question, he rarely answers
it".
On the other hand, this generalisation clearly does not hold in constructions
which disallow object shift, like (18). The Scoping Condition would predict
that also in (18), a non-object-shifted object would only have one interpreta-
tion, i.e., that (18-a) could only be interpreted like (17-a) and not like (17-b)
(and also that the interpretation of (17-b) would only be available in sentences
where object shift was possible). This is not correct; (18-a) is ambiguous be-
tween the two interpretations. In other words, what matters is not just whether
the object has undergone object shift or not, but also whether it "could have
moved if it had wanted to." This can be accounted for in Optimality Theory
terms by saying that in Icelandic, the LICENSING constraint is ranked higher
than the SCOPING constraint. The idea is that the object in an object shift
construction is licensed both in its base position and in the object-shifted po-
sition, whereas in a non-object-shift construction, the object is only licensed
in its base position.
Diesing (1997:414) suggests that this licensing could be constrained by the
Shortest Move constraint under the assumption of Equidistance of Chomsky
(1993:17-19 = 1995:184-186) (see also Bobaljik & Jonas 1996 and Collins
& Þráinsson 1996).
Object licensing could however also be a question of Case assignment
along the lines of Holmberg (1986:177) and Vikner (1994:493) (though see
Holmberg 1997:215, 1999:25-27, where it is a question of licensing the fea-
ture [−focus] by the closest c-commanding X°), i.e., Case may be assigned
either by a verb or a verb trace in V° or by a verb or a verb trace in I°, where
the former is relevant for objects in situ inside VP and the latter for objects
that have undergone object shift. (17-a,b) would thus be analysed as follows:
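(21) Icelandic
                   C°            I°                               V°
     a. Í prófunum svarar_v hann t_v                sjaldan       t_v spurningunni
     b. Í prófunum svarar_v hann t_v spurningunni_i sjaldan       t_v t_i
        in exams-the answers he      (question)     rarely            (question)
     [Arrows in the original mark Case assignment: from the verb trace in V° to
     the object in situ in (a), and from the verb trace in I° to the shifted
     object in (b).]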
Constructions which disallow object shift would do so because I° neither con-
tains the Case assigning verb (i.e., the main verb) nor a trace of it. In the fol-
lowing subsections, the optimality theory analysis of Icelandic will contain
the following three constraints, given in the order of the ranking for Icelandic:
(22) a. LICENSING:
An object must be licensed by being c-commanded by its se-
lecting verb or the trace of this verb (this subsumes Shortest
Move/Equidistance/Case assignment because c-command is the
lowest common denominator of the various licensing mechanisms
discussed).
b. SCOPING:
An element has the (surface) position in the clause that corre-
sponds to its relative scope (based on Diesing 1992:10-12, 1996:70,
1997:375, D&J: 126; cf. the discussion above, and see also Bobaljik
1995:362).
c. STAY:
Movement should be avoided. This corresponds to Procrastinate
(movement should not take place before LF) and/or Economy of
Derivation (movement should not take place at all).
German differs from Icelandic in that SCOPING is ranked above LICENSING,
not below it.
I have included the constraint STAY, which constrains all movements, in all
the tableaux. Just like LICENSING more or less corresponds to Shortest Move
in Minimalist terms, STAY more or less corresponds to Procrastinate (see also
the discussion in the conclusion in section 4).
If STAY were to receive the highest ranking, no movement would ever take
place. If STAY did not exist, we would expect not to find any "Movement as
a last resort" type phenomena (cf., e.g., Chomsky 1991) at all.
3.2 Icelandic Objects with Narrow Scope
Consider the analysis of an input in which the object has narrow scope relative to
the adverbial (i.e., "he rarely answers whichever is the most difficult question
in any given exam"). First the case in which object shift is possible:
(23) Input: narrow scope object          | LICENSING | SCOPING | STAY |
     a. ☞ ... svarar hann sjaldan OBJ    |           |         |      | = (17-a)
     b. *  ... svarar hann OBJ sjaldan   |           | *!      | *    | = (17-b)
The computation of the optimal candidate proceeds as follows: (23-a) is better
than (23-b), because (23-a) does better than (23-b) on the highest constraint
on which the two differ, SCOPING, where (23-a) has no violations and (23-b)
one. In other words, (23-b) loses out because of its violation of SCOPING,
hence the "!" next to this violation ("!" marks the fatal violation).
Let us now turn to the case in which object shift is not possible:
(24) Input: narrow scope object          | LICENSING | SCOPING | STAY |
     a. ☞ ... hann sjaldan svarað OBJ    |           |         |      | = (18-a)
     b. *  ... hann sjaldan OBJ svarað   | *!        |         | *    | = (18-b)
     c. *  ... hann OBJ sjaldan svarað   | *!        | *       | *    | = (18-c)
(24-a) is better than (24-b), because (24-a) does better than (24-b) on the
highest constraint on which the two differ, LICENSING, where (24-a) has no
violations and (24-b) one. The same goes for the comparison between (24-
a) and (24-c), and therefore (24-a) is better than either (24-b) or (24-c). The
result is that (23-a) and (24-a) are the optimal candidates and hence the only
grammatical versions of the sentences in question. However, this result could
be achieved in (almost) any theoretical framework, including ones with non-
violable constraints, as there is no conflict between the constraints here; the
winning candidates do not violate any constraints at all. In the derivations in
the following subsection, this is not the case: All candidates violate at least
one of the constraints.
3.3 Icelandic Objects with Wide Scope
If the input is such that the object has wide scope relative to the adverbial (i.e.,
"there is a question more difficult than all others, and when he encounters this
question, he rarely answers it"), the situation changes crucially:
(25) Input: wide scope object            | LICENSING | SCOPING | STAY |
     a. *  ... svarar hann sjaldan OBJ   |           | *!      |      | = (17-a)
     b. ☞ ... svarar hann OBJ sjaldan    |           |         | *    | = (17-b)
(25-b) is better than (25-a), because (25-b) does better than (25-a) on the
highest constraint on which the two differ, SCOPING, where (25-b) has no
violations and (25-a) one. Given that (25-b) nevertheless violates a constraint,
namely STAY (= Procrastinate/Economy of Derivation), it has to be the case
not only that STAY is a violable constraint (as it is also in Minimalism; cf.
section 4 below), but also that STAY has lower priority than SCOPING, as can
be seen here, where the choice is between having to violate either SCOPING
or STAY. Let us now turn to the case in which object shift is not possible:
(26) Input: wide scope object            | LICENSING | SCOPING | STAY |
     a. ☞ ... hann sjaldan svarað OBJ    |           | *       |      | = (18-a)
     b. *  ... hann sjaldan OBJ svarað   | *!        | *       | *    | = (18-b)
     c. *  ... hann OBJ sjaldan svarað   | *!        |         | *    | = (18-c)
Whereas in (25), there was a constraint conflict between SCOPING and STAY,
here there is a conflict between LICENSING and SCOPING. If both were non-
violable constraints, (18-a) would be impossible with the wide scope inter-
pretation (i.e., (26) should have no good output at all), which clearly is not
the case. (26-a) is better than (26-b), because (26-a) does better than (26-b) on
the highest constraint on which the two differ, LICENSING, where (26-a) has
no violations and (26-b) has one. The same goes for the comparison between
(26-a) and (26-c), and therefore (26-a) is better than either (26-b) or (26-c).
Given that (26-a) nevertheless violates a constraint, namely SCOPING, it has
to be the case not only that SCOPING is a violable constraint (which is not
the case in Diesing's minimalist analysis; cf. section 4 below), but also that
SCOPING has lower priority than LICENSING, as can be seen when the choice
is between having to violate either LICENSING or SCOPING. This section thus
shows that, given the three constraints, LICENSING, SCOPING, and STAY, de-
fined as in (22), the Icelandic ranking has to be LICENSING » SCOPING »
STAY, and that at least SCOPING and STAY have to be violable.
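The ranking argument of sections 3.2 and 3.3 can be checked mechanically. The following is a minimal sketch, assuming the violation marks of tableaux (23)-(26) and taking the (a) candidates of (23), (24), (26) and the (b) candidate of (25) as the attested winners; the helper names `winners` and `consistent` are hypothetical.

```python
from itertools import permutations

CONSTRAINTS = ("LICENSING", "SCOPING", "STAY")

# Violation marks as in tableaux (23)-(26); each entry pairs the candidate
# set with the attested (grammatical) candidate.
TABLEAUX = {
    "23": ({"a": set(), "b": {"SCOPING", "STAY"}}, "a"),
    "24": ({"a": set(),
            "b": {"LICENSING", "STAY"},
            "c": {"LICENSING", "SCOPING", "STAY"}}, "a"),
    "25": ({"a": {"SCOPING"}, "b": {"STAY"}}, "b"),
    "26": ({"a": {"SCOPING"},
            "b": {"LICENSING", "SCOPING", "STAY"},
            "c": {"LICENSING", "STAY"}}, "a"),
}


def winners(candidates, ranking):
    """Candidates with the best profile under the given ranking."""
    def profile(name):
        return tuple(int(c in candidates[name]) for c in ranking)

    best = min(profile(name) for name in candidates)
    return [name for name in candidates if profile(name) == best]


# Try all six orderings of the three constraints and keep those that
# select exactly the attested candidate in every tableau.
consistent = [
    ranking
    for ranking in permutations(CONSTRAINTS)
    if all(winners(cands, ranking) == [attested]
           for cands, attested in TABLEAUX.values())
]
print(consistent)
```

Of the six possible orderings, only LICENSING » SCOPING » STAY selects the attested candidate in all four tableaux: (25) requires SCOPING » STAY, and (26) then requires LICENSING » SCOPING.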
3.4 German Objects
I want to suggest that the relevant difference between Icelandic and German,
i.e., the difference between object shift and scrambling, is that where Ice-
landic has LICENSING ranked above SCOPING, German has SCOPING ranked
above LICENSING. This could reflect either that object LICENSING is less
necessary in German than in Icelandic, or that c-command is not a neces-
sary condition on object licensing in German. Consider first the narrow scope
cases in which the object has narrow scope relative to the adverbial (i.e., "he
rarely answers whichever is the most difficult question in any given exam"):
(27) Input: narrow scope object          | SCOPING | LICENSING | STAY |
     a. ☞ ... V er selten OBJ ...        |         |           |      | = (15-a)
     b. *  ... V er OBJ selten ...       | *!      |           | *    | = (15-b)
(28) Input: narrow scope object          | SCOPING | LICENSING | STAY |
     a. ☞ ... er selten OBJ V ...        |         |           |      | = (16-a)
     b. *  ... er OBJ selten V ...       | *!      | *         | *    | = (16-b)
This is the unproblematic case; as in the Icelandic (23-a) and (24-a), the win-
ning candidates here, (27-a) and (28-a), violate no constraints. Consider now
the wide scope cases in which the object has wide scope relative to the ad-
verbial (i.e., "there is a question more difficult than all others, and when he
encounters this question, he rarely answers it"), where the situation changes
crucially:
(29) Input: wide scope object            | SCOPING | LICENSING | STAY |
     a. *  ... V er selten OBJ ...       | *!      |           |      | = (15-a)
     b. ☞ ... V er OBJ selten ...        |         |           | *    | = (15-b)
Even though (29-b) violates STAY, because the object has undergone move-
ment, it is grammatical, because its competitor, (29-a), violates the higher
ranked SCOPING.
(30) Input: wide scope object            | SCOPING | LICENSING | STAY |
     a. *  ... er selten OBJ V ...       | *!      |           |      | = (16-a)
     b. ☞ ... er OBJ selten V ...        |         | *         | *    | = (16-b)
Even though (30-b) violates LICENSING, because the object is not c-com-
manded by the main verb, it is still grammatical, because its competitor,
(30-a), violates the higher ranked SCOPING. This section thus shows that
the three constraints, LICENSING, SCOPING, and STAY, defined as in (22),
have to be ranked as follows in German: SCOPING » LICENSING » STAY,
and that at least LICENSING and STAY have to be violable. The result of the
reranking of SCOPING and LICENSING (compared to Icelandic) is thus that
in cases of scrambling in German, SCOPING determines everything regard-
less of whether there is licensing via c-command. In other words, there is a
one-to-one correspondence between word order and interpretation in cases of
scrambling in German. That this is not necessarily the case outside scram-
bling contexts is shown in the next subsection.
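The effect of reranking can be sketched as follows, assuming the violation marks that (26-a)/(30-a) and (26-c)/(30-b) share: with a wide scope input and the main verb inside VP, the candidate with the object in situ violates SCOPING, and the candidate with the object moved across the adverbial violates LICENSING and STAY. The candidate labels and the helper `winner` are hypothetical.

```python
ICELANDIC = ["LICENSING", "SCOPING", "STAY"]
GERMAN = ["SCOPING", "LICENSING", "STAY"]

# Wide scope input, main verb has not left VP.
candidates = {
    "object in situ": {"SCOPING"},           # profile of (26-a) and (30-a)
    "object moved": {"LICENSING", "STAY"},   # profile of (26-c) and (30-b)
}


def winner(candidates, ranking):
    """Candidate with the best violation profile under the ranking."""
    def profile(name):
        return tuple(int(c in candidates[name]) for c in ranking)

    return min(candidates, key=profile)


icelandic_winner = winner(candidates, ICELANDIC)
german_winner = winner(candidates, GERMAN)
print(icelandic_winner, "|", german_winner)
```

The same two violation profiles thus yield the in situ order under the Icelandic ranking and the scrambled order under the German ranking, which is the sense in which the difference between object shift and scrambling reduces to a difference in ranking.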
3.5 Scoping May Also Be Violated in German
Both in Icelandic and in German, topicalisation of the object in (17-a)/(15-a)
results in exactly the same surface string(s) as topicalisation of the object in
(17-b)/(15-b):
(31) Icelandic Erfiðustu spurningunni svarar hann sjaldan
(32) German Die schwierigste Frage beantwortet er selten
(the) most-difficult question(-the) answers he rarely
These two sentences are both ambiguous, i.e., both have both the reading of
(17-a)/(15-a), "he rarely answers whichever is the most difficult question in
any given exam", and the one of (17-b)/(15-b), "there is a question more dif-
ficult than all others, and when he encounters this question, he rarely answers
it". To account for (31) and (32), it suffices to assume (as I do in Vikner 1999)
that the topic (the object) is an operator, and that operators are subject to a sepa-
rate constraint, OP-SPEC (Grimshaw 1997:377, Bakovic 1998:39), which re-
quires them to move to a specifier position (which for various reasons will
be CP-spec; see, e.g., Grimshaw 1997:377). OP-SPEC would then have to
be ranked above the other three constraints discussed so far. The effect of
OP-SPEC here is parallel to the effect of LICENSING in (24) and (26), i.e.,
regardless of whether the object has wide or narrow scope, OP-SPEC will let
(phonetically) identical candidates win in the two cases. Consider first the
tableaux for the Icelandic (31):
(33) Input: object = topic, narrow scope | OP-SPEC | LICENSING | SCOPING | STAY |
     a. *  ... svarar hann sjaldan OBJ   | *!      |           |         |      |
     b. *  ... svarar hann OBJ sjaldan   | *!      |           | *       | *    |
     c. ☞ OBJ svarar hann sjaldan        |         | *         | *       | *    |
(34) Input: object = topic, wide scope   | OP-SPEC | LICENSING | SCOPING | STAY |
     a. *  ... svarar hann sjaldan OBJ   | *!      |           | *       |      |
     b. *  ... svarar hann OBJ sjaldan   | *!      |           |         | *    |
     c. ☞ OBJ svarar hann sjaldan        |         | *         |         | *    |
Both (33-c) and (34-c) correspond to (31) in Icelandic. Consider now the
tableaux for the German (32), which show that the ranking difference be-
tween Icelandic and German discussed in the previous section has no effect
in the case at hand:
(35) Input: object = topic, narrow scope | OP-SPEC | SCOPING | LICENSING | STAY |
     a. *  ... beantwortet er selten OBJ | *!      |         |           |      |
     b. *  ... beantwortet er OBJ selten | *!      | *       |           | *    |
     c. ☞ OBJ beantwortet er selten      |         | *       | *         | *    |
(36) Input: object = topic, wide scope   | OP-SPEC | SCOPING | LICENSING | STAY |
     a. *  ... beantwortet er selten OBJ | *!      | *       |           |      |
     b. *  ... beantwortet er OBJ selten | *!      |         |           | *    |
     c. ☞ OBJ beantwortet er selten      |         |         | *         | *    |
Both (35-c) and (36-c) correspond to (32) in German and show that even in
German, SCOPING is outranked by another constraint, and therefore even in
German, SCOPING has to be violable.
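The neutralising effect of OP-SPEC in (33)-(36) can be sketched as follows. The violation marks for the two non-topicalised candidates follow the tableaux; that the topicalised candidate violates LICENSING and STAY (and SCOPING with a narrow scope input) is an assumption made here, motivated by the requirement that OP-SPEC outrank all three other constraints. The helper names `tableau` and `winner` are hypothetical.

```python
RANKINGS = {
    "Icelandic": ["OP-SPEC", "LICENSING", "SCOPING", "STAY"],
    "German": ["OP-SPEC", "SCOPING", "LICENSING", "STAY"],
}


def tableau(scope):
    """Violation marks of the three candidates; scope is 'narrow' or 'wide'."""
    # SCOPING is violated when surface order and scope do not match.
    scoping_in_situ = {"SCOPING"} if scope == "wide" else set()
    scoping_moved = {"SCOPING"} if scope == "narrow" else set()
    return {
        "object in situ": {"OP-SPEC"} | scoping_in_situ,
        "object shifted/scrambled": {"OP-SPEC", "STAY"} | scoping_moved,
        # Assumption: the object in CP-spec is not c-commanded by the verb.
        "object topicalised": {"LICENSING", "STAY"} | scoping_moved,
    }


def winner(candidates, ranking):
    """Candidate with the best violation profile under the ranking."""
    def profile(name):
        return tuple(int(c in candidates[name]) for c in ranking)

    return min(candidates, key=profile)


results = {
    (language, scope): winner(tableau(scope), ranking)
    for language, ranking in RANKINGS.items()
    for scope in ("narrow", "wide")
}
for key, value in results.items():
    print(key, "->", value)
```

Since only the topicalised candidate satisfies the highest-ranked constraint, it wins for both inputs under both rankings, so that one surface string corresponds to both interpretations.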
4 Conclusion
Within Optimality theory, it is not only possible but actually expected that
a constraint may override a second constraint and at the same time be over-
ridden itself by a third constraint. This paper has tried to show that such sit-
uations are found both in Icelandic and in German, where on the one hand,
SCOPING overrides Procrastinate/STAY, cf. the Icelandic (25) and the Ger-
man (29), and on the other hand, SCOPING is at the same time overridden
itself by Shortest Move/LICENSING in Icelandic, cf. (26), and by OP-SPEC
in German, cf. (35-c)/(36-c).
In other frameworks, e.g., the Minimalist Program, this is not straight-
forwardly possible, because there are only two levels, "Conditions on Con-
vergence" and "Economy Considerations", where the former are inherently
ranked above the latter.
According to Diesing (1997:422), a minimalist analysis (Chomsky 1993,
1995, Bobaljik & Jonas 1996 and Collins & Þráinsson 1996) regulates the
availability of object shift by means of Shortest Move, a rule which deter-
mines whether object shift is a possible movement. This is only the case if
the verb itself has moved, due to Equidistance (see, e.g., Chomsky 1993:17-
19 = 1995:184-186 and Bobaljik & Jonas 1996).
Shortest Move is a "Condition on Convergence" (Chomsky 1995:219-220),
i.e., if it is violated, the derivation will crash rather than converge. Procrasti-
nate, on the other hand, which is a generalisation that says that overt move-
ment (before Spell-Out, i.e., in the syntax) is more costly than covert move-
ment (after Spell-Out, i.e., at LF), is an "Economy Consideration", which
means that it can only select between different converging derivations, but
not cause a derivation to crash. This difference is important: If Procrasti-
nate were a condition on convergence, "there would never be any cases of
overt movement" (Diesing 1997:422). In terms of the present analysis, this
would correspond to STAY being inviolable, which is untenable, as discussed
in connection with (25) above. Given that clear cases of object shift do ex-
ist, Diesing (1997:422) concludes that it must be the case that "the Scoping
Condition is a condition on Convergence, which leads to the overriding of
Procrastinate". In terms of the present Optimality Theory analysis, this sim-
ply corresponds to SCOPING being higher ranked than STAY.
The difference between the Minimalist Program and Optimality Theory is
that if, in minimalist terms, the Scoping Condition is a condition on conver-
gence, the Scoping Condition itself may not be violated, as this would make
the derivation crash. However, as the discussion of the Icelandic (26) above
showed (see also the discussion of the German (35)), the Scoping Condition
must be a violable constraint,² otherwise a wide scope interpretation of the
object would only be possible in object shift constructions, which clearly is
not the case.
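The contrast between the two architectures can be made concrete with a minimal sketch, assuming the violation marks of tableau (26). The function `converging` is a hypothetical stand-in for a two-level model in which any violation of a condition on convergence makes a candidate crash; `optimal` is the ranked evaluation used in this paper.

```python
RANKING = ["LICENSING", "SCOPING", "STAY"]

# Wide scope input, object shift unavailable: marks as in tableau (26).
candidates = {
    "(26-a)": {"SCOPING"},
    "(26-b)": {"LICENSING", "SCOPING", "STAY"},
    "(26-c)": {"LICENSING", "STAY"},
}


def converging(candidates, hard):
    """Two-level model: candidates that violate no inviolable condition."""
    return [name for name, marks in candidates.items() if not (marks & hard)]


def optimal(candidates, ranking):
    """Ranked, violable constraints: the best profile wins."""
    def profile(name):
        return tuple(int(c in candidates[name]) for c in ranking)

    return min(candidates, key=profile)


both_hard = converging(candidates, {"LICENSING", "SCOPING"})
licensing_hard = converging(candidates, {"LICENSING"})
ot_winner = optimal(candidates, RANKING)
print(both_hard, licensing_hard, ot_winner)
```

If both LICENSING and SCOPING are treated as inviolable, no candidate survives for the wide scope input, contrary to fact; if SCOPING is violable and ranked below LICENSING, (26-a) is correctly selected.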
I thus hope to have shown that Optimality Theory allows a comprehen-
sive account of the interpretation of object shift (and of scrambling), which
includes aspects that would seem to be more difficult to account for within
other frameworks, e.g., the Minimalist Program.³
Notes
This paper is partly based on work which was part of the project The Syntax Com-
panion at the Netherlands Institute of Advanced Study (NIAS), Royal Dutch Acad-
emy of Sciences, Wassenaar, The Netherlands, Spring 1997. A preliminary version
appeared in Working Papers in Scandinavian Syntax no. 60, pp. 1-24, December
1997. I would like to thank the following for their comments, criticism and/or native
speaker assistance: various anonymous reviewers, Kristján Árnason, Eric Bakovic,
Jonathan Bobaljik, Molly Diesing, Hubert Haider, Gunnar Hrafn Hrafnbjargarson,
Johannes Gísli Jónsson, Ed Keer, Almut Klepper-Schudlo, Gereon Müller, Peter Ohl,
Christer Platzack, Ian Roberts, Ramona Römisch-Vikner, Tanja Schmid, Halldór
Ármann Sigurðsson, Arnim von Stechow, Höskuldur Þráinsson, Carl Vikner, Angelika
Wöllstein-Leisten, and Heike Zinsmeister. I would also like to thank my Syntax
Companion colleagues and audiences at the 1st Workshop on Optimality Theory Syntax
in Stuttgart, November 1997, at the University of Iceland, March 1998, at the
Workshop on Competition in Syntax at the 21st Conference of the German Linguistics
Society in Constance, February 1999, and at the University of Lund, March 1999.
I am particularly grateful to Daniel Büring and Hans-Martin Gärtner for untangling
the different interpretations and interpretational differences for me.
1. A few remarks on the data and on the native speaker informants are in order.
According to Molly Diesing (p.c.), the informants for the data and the interpretations
given in (14) above were Johannes Gísli Jónsson, Sigríður Sigurjónsdóttir,
Halldór Ármann Sigurðsson, and Höskuldur Þráinsson. In an earlier version of
this paper, Vikner (1997), I focussed on the interpretation of indefinite objects,
but as was pointed out by Johannes Gísli Jónsson, examples with an object in
situ in the context where object shift is possible, like (21-a) in Vikner (1997),
are not completely unambiguous, as opposed to what I claimed there (a generic
reading of (21-a) in Vikner 1997 is not impossible, just dispreferred). In this
paper I therefore focus on definite superlative objects like the ones discussed in
Diesing (1996, 1997). These works only discussed Icelandic data like (14) above,
where object shift is possible. The possible interpretations of data like (18) below,
in which object shift is not possible, were not mentioned. The informants
for the data discussed here include Kristján Árnason, Gunnar Hrafn Hrafnbjargarson,
Johannes Gísli Jónsson, and Halldór Ármann Sigurðsson. Whereas the
interpretation both of the object in the perfect case (where object shift is impos-
sible), (18-a) below, and of the object in the object-shifted case, (17-b) below, is
completely uncontroversial, the interpretation of the object in situ in the context
where object shift is possible remains controversial, in so far as (17-a) below is
found to be ambiguous by one of my four informants, Johannes Gísli Jónsson.
In the text, I shall nevertheless assume that (17-a) is unambiguous, following the
other three informants and following Diesing (1996, 1997).
2. Notice that this argument is still valid even if what were considered above to be
cases of non-shifted objects should turn out to be objects which only undergo
object shift after Spell-Out (at LF). Object shift would then always take place,
and the only variation would be whether it takes place before or after Spell-Out.
The Scoping Condition ("the scope of objects is read off their surface position,"
(22) above) might then have to be made more explicit, e.g., "the scope of objects
is read off their position at Spell-Out," but it would still have to be violable;
cf. the discussion of (18-a)/(26) above. If the Scoping Condition is not taken to
apply to surface positions/positions at Spell-Out, but instead to positions at LF,
and if object shift is assumed to vary only with respect to when it applies and not
with respect to whether it applies or not, the prediction would be that all objects
would receive (only) wide scope interpretations, a prediction which is clearly
undesirable.
Hornstein's (1995) analysis of scope ambiguities offers a way of account-
ing for the ambiguity of (18-a)/(26) while maintaining that object shift always
applies (at the latest at LF), but this not only requires sacrificing the Scoping
Condition, but also makes incorrect predictions for the non-ambiguous cases.
Hornstein (1995: 154) assumes that scope ambiguities arise as follows: What de-
termines scope is the relative position of the scope taking elements at LF, but
the position of an element which counts for scope may be any of the positions
in the chain of that element. The ambiguity of (18-a), a non-shifted object in a
construction in which object shift cannot apply, could thus be accounted for if
object shift is assumed to apply at LF iff it does not apply before Spell-Out: The
reading of (18-a) in which the object has narrow scope arises by having the pre-
object shift position of the object count for scope, and the reading in which the
object has wide scope arises by having the post-object shift position count. The
problem is that this account would incorrectly make exactly the parallel predic-
tions for object-shifted objects, as in (17-b). This should be ambiguous as well:
Object shift has applied, and scope may now be determined by either the non-
shifted or the shifted position. But (17-b) is not ambiguous; the object can only
have wide scope, not narrow scope. Also non-shifted objects in constructions in
which object shift could have applied, e.g. (17-a), would incorrectly be predicted
to be ambiguous in a parallel way, although here the object can only have narrow
scope, not wide scope (though see the remarks in note 1).
3. Admittedly, there are also at least two versions of minimalism that would seem
to have more in common with Optimality Theory than "standard" minimalism
does, in that they allow ranked and violable constraints: Bobaljik (1995:351),
which like this article is an attempt to formulate Diesing's Scoping Condition as
a violable constraint, and Holmberg (1997:214). However, ranked and violable
constraints are left out in a more recent version of the latter work, Holmberg
(1999). As for other OT analyses of object shift, see also Müller (1998), which
focusses on multiple object shift in double object constructions.
References
Archangeli, Diana — Terence D. Langendoen (eds.)

1997 Optimality Theory: An Overview. Oxford: Blackwell.
Bakovic, Eric
1998 Optimality and inversion in Spanish. In: Pilar Barbosa, Danny Fox, Paul
Hagstrom, Martha McGinnis and David Pesetsky (eds.) Is the Best Good
Enough?, 35-58. Cambridge, MA: MIT Press.
Barbosa, Pilar — D. Fox — P. Hagstrom — M. McGinnis — D. Pesetsky (eds.)
1998 Is the Best Good Enough? Cambridge, MA: MIT Press.
Bobaljik, Jonathan
1995 Morphosyntax: The syntax of verbal inflection. Ph.D. dissertation, MIT.
Bobaljik, Jonathan — Dianne Jonas
1996 Subject positions and the roles of TP. Linguistic Inquiry 27(2): 195-236.
Burzio, Luigi
1995 The rise of Optimality Theory. GLOT International 1 (6): 3-7.
Chomsky, Noam
1991 Some notes on economy of derivation and representation. In: Robert
Freidin (ed.) Principles and Parameters in Comparative Grammar, 417-
454. Cambridge, MA: MIT Press.
Chomsky, Noam
1993 A Minimalist Program for Linguistic Theory. In: Kenneth Hale and
Samuel Jay Keyser (eds.) The View from Building 20, 1-52. Cambridge,
MA: MIT Press.
Chomsky, Noam
1995 The Minimalist Program. Cambridge, MA: MIT Press.
Collins, Chris — Höskuldur Þráinsson

1996 VP-internal structure and object shift in Icelandic. Linguistic Inquiry
27(3): 391-444.
Corver, Norbert — Henk van Riemsdijk (eds.)
1994 Studies on Scrambling. Berlin: Mouton de Gruyter.
Diesing, Molly
1992 Indefinites. Cambridge, MA: MIT Press.
Diesing, Molly
1996 Semantic variables and object shift. In: Höskuldur Þráinsson, Samuel
Epstein and Steve Peter (eds.) Studies in Comparative Germanic Syntax
II, 66-84. Dordrecht: Kluwer.
Diesing, Molly
1997 Yiddish VP order and the typology of object movement in Germanic.
Natural Language and Linguistic Theory 15(2): 369-427.
Diesing, Molly — Eloise Jelinek
1995 Distributing arguments. Natural Language Semantics 3(1): 123-176.
Grewendorf, Günther — Wolfgang Sternefeld (eds.)
1990 Scrambling and Barriers. Amsterdam: John Benjamins.
Grimshaw, Jane
1993 Minimal Projection, Heads, and Optimality. Technical Report 4, Rutgers
University Center for Cognitive Science.
Grimshaw, Jane
1997 Projection, Heads, and Optimality. Linguistic Inquiry 28(3): 373-422.
Haider, Hubert
1993 Deutsche Syntax - Generativ. Tübingen: Gunter Narr Verlag.
Haider, Hubert — Inger Rosengren
1998 Scrambling. Sprache und Pragmatik 49:1-104.
Holmberg, Anders
1986 Word order and syntactic features in the Scandinavian languages and
English. Ph.D. dissertation, University of Stockholm.
Holmberg, Anders
1991 The distribution of Scandinavian weak pronouns. In: Henk van Riems-
dijk and Luigi Rizzi (eds.) Clitics and their Hosts, 155-173. (EUROTYP
Working Papers (European Science Foundation)), Tilburg University.
Holmberg, Anders
1997 The true nature of Holmberg's generalization. In: Proceedings of the
27th Annual Conference of the North-Eastern Linguistic Society, 203-
217. Amherst, MA: GLSA.
Holmberg, Anders
1999 Remarks on Holmberg's generalization. Studia Linguistica 53(1): 1-39.
Holmberg, Anders — Christer Platzack
1995 The Role of Inflection in Scandinavian Syntax. New York: Oxford University
Press.
Hornstein, Norbert
1995 Logical Form: From GB to Minimalism. Oxford: Blackwell.
Josefsson, Gunlög
1992 Object shift and weak pronominals in Swedish. Working Papers in Scan-
dinavian Syntax 49: 59-94.
Josefsson, Gunlög
1993 Scandinavian pronouns and object shift. Working Papers in Scandinavian
Syntax 52:1-28.
Müller, Gereon
1995 A-bar Syntax: A Study in Movement Types. Berlin: Mouton de Gruyter.
Müller, Gereon
1997 Partial wh-movement and optimality theory. The Linguistic Review
14:249-306.
Müller, Gereon
1998 Order preservation, parallel movement, and the emergence of the un-
marked. To appear in: Géraldine Legendre, Jane Grimshaw and Sten
Vikner (eds.) Optimality-Theoretic Syntax. Cambridge, MA: MIT Press.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar.
Technical Report 2, Rutgers University Center for Cognitive Science.
Vikner, Sten
1989 Object shift and double objects in Danish. Working Papers in Scandinavian
Syntax 44: 141-155.
Vikner, Sten
1994 Scandinavian object shift and West Germanic scrambling. In: Norbert
Corver and Henk van Riemsdijk (eds.) Studies on Scrambling, 487-517.
Berlin: Mouton de Gruyter.
Vikner, Sten
1997 The Interpretation of object shift, optimality theory, and minimalism.
Working Papers in Scandinavian Syntax 60: 1-24.
Vikner, Sten
1999 V°-to-I° movement and 'do'-insertion in optimality theory. To appear in:
Géraldine Legendre, Jane Grimshaw and Sten Vikner (eds.) Optimality-
Theoretic Syntax, Cambridge, MA: MIT Press.
Webelhuth, Gert
1992 Principles and Parameters of Syntactic Saturation. New York: Oxford
University Press.
Case Conflict in German Free Relative Constructions:
An Optimality Theoretic Treatment
Ralf Vogel
Languages differ as to how big a case conflict must be in a free relative (FR)
construction to cause ungrammaticality. While English requires true catego-
rial matching, German allows the suppression of structural cases if assigned
by the matrix verb. There are also different types of non-matching languages.
Paradigmatic examples are Gothic and Modern Greek. Earlier generative syn-
tactic accounts mainly propose a distinction only between matching and non-
matching languages. This is not fine-grained enough to capture the typolog-
ical findings. An optimality theoretic treatment permits a richer, but not in-
finite, typology, and it allows constraint violation (which obviously happens
in FR constructions). The proposed account makes use of the optimality the-
oretic conception of correspondence. The assumed constraints are on input-
output correspondence (input-LF as well as input-PF), and also on PF-LF
correspondence.
1 Free Relative Constructions
A considerable amount of attention has been paid to free relative constructions
in generative grammar in the last thirty years. Some interesting and puzzling
properties make this construction worth examining. Free relatives have
the somewhat paradoxical property of being clauses that stand for non-clausal
constituents. The following example contains five such relative clauses:
(1) [Wer sich mit freien Relativsätzen beschäftigt], verwendet, [was von
ihm dafür gehalten wird], [sooft er kann], [wann immer sich ihm
dafür eine Gelegenheit bietet] und [wo immer er sich befindet].
'Whoever deals with free relative clauses, uses what he considers
to be one, as often as he can, whenever he has the opportunity and
wherever he is.'
This paper will not deal with adverbial clauses, such as the last three high-
lighted clauses in (1). I will concentrate on free relatives (FRs) that realise
an argument of a verb. And here again I will restrict myself mainly to case-
marked complements, and widely ignore prepositional phrases. An important
phenomenon is the so-called matching effect (Bresnan & Grimshaw 1978):
In many languages the relative pronoun of the FR construction has to fulfil
both the requirements of the FR internal verb, and those of the matrix verb. In
German, the relative pronoun has to appear in the dative case if the FR stands
for a dative argument of the matrix verb:
(2) a. Ich folge wem immer ich vertraue

I follow who-DAT ever I trust
b. *Ich folge wen immer ich bewundere
I follow who-ACC ever I adore
c. *Ich folge wem immer ich bewundere
I follow who-DAT ever I adore
Both folgen and vertrauen assign dative to their object in (2-a); the relative
pronoun matches these requirements and the clause is fine. This is not the
case in (2-b), because the FR internal verb bewundern assigns accusative.
Whichever of the two cases is chosen (i.e., dative or accusative) for the rel-
ative pronoun, the result is ill-formed. Examples like this seem to have led
many researchers (starting with Groos & van Riemsdijk 1981) to conclude
that German FRs require matching in general. As already observed by Pittner
(1991, 1995, 1996), this is not the case. 1 The empirical generalisations about
German FR constructions seem to me to be the following:
— The pronoun of the FR has to carry the case assigned by the FR

internal verb.
— If the matrix verb assigns structural case (i.e., nominative or ac-
cusative) to the FR complement, it imposes no restrictions on the
case of the FR pronoun at all in one variant of German (henceforth
German A). Another variant, German B, does not allow an FR if the
matrix verb requires accusative and the embedded verb nominative.
— If the matrix verb assigns oblique case (i.e., dative or genitive) to
the FR complement, the FR pronoun has to carry exactly that case.
This in turn only leads to well-formedness if the FR internal verb
assigns the same case to the FR pronoun.
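The three generalisations above can be restated as a small decision procedure. The following is only a sketch under stated assumptions: the function name, the case labels, and the label "PP" for prepositional FR pronouns are introduced here for illustration, and the morphological syncretism of was is deliberately left out.

```python
STRUCTURAL = {"NOM", "ACC"}
OBLIQUE = {"DAT", "GEN"}


def fr_acceptable(matrix_case, embedded_case, variant="A"):
    """Is a German argument FR predicted to be well-formed?

    The FR pronoun is assumed to carry embedded_case throughout, i.e. the
    case assigned by the FR-internal verb (first generalisation).
    """
    if matrix_case in OBLIQUE:
        # Third generalisation: matrix oblique case demands full matching.
        return embedded_case == matrix_case
    if matrix_case in STRUCTURAL:
        # Second generalisation: no restriction in German A; German B
        # excludes matrix accusative with embedded nominative.
        if variant == "B" and matrix_case == "ACC" and embedded_case == "NOM":
            return False
        return True
    raise ValueError(f"unknown matrix case: {matrix_case}")


print(fr_acceptable("DAT", "DAT"))       # pattern of (2-a)
print(fr_acceptable("DAT", "ACC"))       # pattern of (2-b)
print(fr_acceptable("ACC", "NOM", "B"))  # the contested pattern
```

The sketch treats the two variants of German as differing in a single cell of the paradigm, which is all the generalisations commit to.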
The following examples show that matrix nominative need not occur on the
FR pronoun:
(3) Uns besucht...

us visits
a. wer Maria mag
who-NOM Maria-ACC likes
b. wen Maria mag
who-ACC Maria-NOM likes
c. wem Maria vertraut
who-DAT Maria-NOM trusts
d. auf wen Maria sich freut
on whom Maria SELF be-happy
'whoever Maria is looking forward to meeting'
e. wessen Maria sich erfreuen würde
who-GEN Maria SELF be-happy would
'whoever Maria would be happy about'
f. wessen Bücher Maria gefallen
who-GEN books-NOM Maria-DAT please
'whoever's books please Maria'
There is no big difference with the FR in sentence-initial position: 2
(4) a. Wer Maria mag, wird eingeladen

who-NOM Maria likes is invited
b. Wen Maria mag, wird eingeladen
who-ACC Maria likes is invited
c. Wem Maria vertraut, wird eingeladen
who-DAT Maria trusts is invited
d. Auf wen Maria sich freut, wird eingeladen
on who Maria SELF be-happy is invited
'Whoever Maria is looking forward to meeting is invited.'
e. Wessen Maria sich erfreuen würde, wird eingeladen
who-GEN Maria SELF be-happy would is invited
'Whoever Maria would be happy to meet is invited.'
f. Wessen Bücher Maria gefallen, wird eingeladen
who-GEN books-NOM Maria-DAT please, is invited
'Whoever's books please Maria is invited.'
The situation is nearly the same with accusative as matrix case:

(5) a. Ich erzähle, was immer mir gefällt

I tell what-NOM ever me-DAT pleases
b. Ich lade ein, wen auch Maria mag
I invite who-ACC also Maria likes
c. Ich lade ein, wem auch Maria vertraut
I invite who-DAT also Maria trusts
d. Ich lade ein, auf wen sich auch Maria freuen würde
I invite on who SELF also Maria be-happy would
'I invite whoever also Maria would be happy to meet.'
e. Ich lade ein, wessen sich auch Maria erfreuen würde
I invite who-GEN SELF also Maria be-happy would
'I invite whoever also Maria would be happy to meet.'
f. Ich lade ein, wessen Bücher auch Maria gefallen
I invite who-GEN books-NOM also Maria-DAT please
'I invite whoever's books please also Maria.'
There is a controversy about data in which the relative pronoun carries nom-
inative. For some speakers, the example in (6) is ill-formed (cf. Pittner 1991,
1995): 3
(6) (*)Er zerstört, wer immer ihm begegnet

he destroys who-NOM ever him-DAT meets
As far as I can see, there is a real disagreement among German native speak-
ers about data like this. I found, however, that parallel examples like the fol-
lowing are easier to judge as well-formed:
(7) a. (*)Ich lade ein, wer mir sympathisch ist

I invite who-NOM me-DAT nice is
'I invite who I like.'
b. (*)Er tötet, wer immer ihm begegnet
he kills who-NOM ever him-DAT meets
But even those speakers who do not accept (6) and (7) accept (5-a). This
must be due to the fact that the wh-pronoun for inanimates, was, is the same
for both nominative and accusative. 4 This shows that the matching effect
is not about a syntactic feature like "abstract case", but about the morpho-
phonological "identity" of elements with not necessarily identical syntactic
features. 5
For those speakers who accept (6) and the examples in (7), the patterns
for matrix nominative and accusative are alike. This variant of German will
be referred to as "German A", and the one for which (6) and (7) are odd as
"German B".
This is the only case in which the two variants of German seem to differ
in their judgements of FRs. 6 If the case "assigned" by the matrix verb to the
free relative construction is an oblique case like dative 7 or genitive, then the
relative pronoun has to appear in that case, and the verb inside the free relative
also has to assign that case to the pronoun: 8
(8) The matrix verb assigns dative to the FR:

a. *Ich vertraue, wer Hitchcock mag
I trust who-NOM Hitchcock likes
b. *Ich vertraue, wen auch Maria mag
I trust who-ACC also Maria-NOM likes
c. Ich vertraue, wem Maria gefällt
I trust who-DAT Maria-NOM pleases
d. *Maria hilft, wessen andere sich entledigen möchten
Maria helps who-GEN others SELF rid want
'Maria helps whoever others want to get rid of.'
e. *Maria hilft, wessen Eltern sie mag
Maria helps who-GEN parents-ACC she likes
f. *Maria hilft, von wem sie eine Belohnung erwartet
Maria helps from whom she a reward expects
(9) The matrix verb assigns genitive to the FR:
a. *Bodo entledigt sich, wer immer andere Ansichten hat als
Bodo rids SELF who-NOM ever other opinions has than
er
he
'Bodo gets rid of whoever has different opinions than he.'
b. *Bodo entledigt sich, wen immer Henkel nicht mag
Bodo rids SELF who-ACC ever Henkel not likes
c. *Bodo entledigt sich, wem immer Gerhard mißtraut
Bodo rids SELF who-DAT ever Gerhard mistrusts
d. Bodo entledigt sich, wessen er nicht mehr bedarf
Bodo rids SELF who-GEN he no longer needs
e. *Bodo entledigt sich, mit wem immer er einmal Streit hatte
Bodo rids SELF with who ever he once argument had
f. *Bodo entledigt sich, wessen Einverständnisses er nicht bedarf
Bodo rids SELF who-GEN agreement-GEN he not needs
In these cases, the pronoun must fulfil the case requirements of the two differ-
ent verbs. This "matching effect" is the most spectacular finding about free
relatives, and much of the work that has been done on them in generative
grammar addresses the question of how to derive it.
The next section discusses the syntactic properties of FR constructions and
various attempts to deal with them in generative syntax. None of the proposals
suggested so far can derive the full range of typological variation in the way
different grammars handle the case conflict that occurs in FR constructions.
Most approaches predict that there are only matching and non-matching lan-
guages, but not that there are languages like German, that have non-matching
with matrix structural case and matching with matrix oblique case, or lan-
guages with the pattern of Modern Greek.
In the third and fourth sections, I will develop an optimality theoretic ac-
count of the case conflict in German FR constructions. We will also see that
it can deal in a much better way with typological variation.
2 The Syntax of Argument Free Relative Clauses
Two observations are crucial for the syntactic analysis of argument FRs:
(10) I. They seem to obey the same selectional restrictions as NP constit-

uents.
II. FRs resemble relative clauses and embedded wh-clauses in their
internal structure and syntactic behaviour.
With respect to (10-I.), FRs behave like NPs or DPs, but with respect to
(10-II.), they behave like CPs. The task for the syntactic analysis is to bring
these two apparently contradictory observations together.
FRs differ from "normal" relative clauses in that they do not seem to be
"headed", i.e., they do not seem to have an antecedent, as opposed to ordinary
relative clauses:
(11) a. Der, den ich meine, steht dort drüben (Rel.cl.)
the-NOM the-ACC I mean stands there over
b. ∅ Wen ich meine, steht dort drüben (FR)
who-ACC I mean stands there over
2.1 The FR Pronoun
The two relative constructions in (11) use different pronouns. While the relative
clause uses the ordinary d-pronoun as relative pronoun, the FR uses the
wh-pronoun. It is mostly impossible to use the wh-pronoun as relative pronoun
in German:
(12) *Der, wen ich meine...

the-NOM who-ACC I mean
On the other hand, it is marginally possible to use the d-pronoun as FR pronoun.
This was much better in earlier stages of German, and even today it does not always
appear odd, as (13-c) shows:
(13) a. ?der mit dem Wolf tanzt

the-NOM with the wolf dances
'(the one) who dances with the wolf'
b. wer mit dem Wolf tanzt
who-NOM with the wolf dances
c. Wer/Der da gerade mit dem Wolf tanzt, ist ein Freund
who/the-NOM there just with the wolf dances is a friend
von mir
of mine
The d-pronoun is restricted to a definite interpretation, while the wh-pronoun
can also have a generic and/or universal quantification reading.
Modern Greek has a specific FR pronoun that looks like a hybrid of relative
and wh-pronoun (Alexiadou & Varlokosta 1995: 4):
(14) a. Potisa pjos irthe (Question)
asked-1Sg who came-3Sg
'I asked who came.'
b. Opjos theli erhete (FR)
whoever-NOM want-3Sg come-3Sg
'Whoever wants, may come.'
c. Agorasa to spiti pu/to opjo mu arese (Rel.cl.)
bought-1Sg the-house-ACC that/which me pleased
'I bought the house that I liked.'
This evidence allows for a rather construction-specific treatment of FR constructions:
A large number of the specific features of FRs might be attributable
to the lexical properties of FR pronouns. Not all languages need necessarily
have their own genuine set of pronouns for FR constructions, just as,
e.g., the relative pronoun in German is the same as the d-pronoun. Hence, the
fact that German uses the wh-pronoun for FRs is no evidence that FRs are
wh-clauses.
The pronouns introducing questions, FRs and relative clauses have in com-
mon that they occur in the operator position of an operator-variable chain.
They are pronominal operators of different types. This difference should be
expressed in terms of formal features. From the pattern of Modern Greek, we
might conclude that the FR pronoun has features of both wh-pronouns and
relative pronouns: 9
(15) TYPE               wh    REL
     wh-pronoun         +     -
     relative pronoun   -     +
     FR pronoun         +     +
FRs and indirect questions are in complementary distribution. This can be

shown by the addition of disambiguating items like immer ('ever', FR) and/or
alles ('all', question): 10
(16) a. Ich möchte wissen, wer alles/*immer kommt

I want-to know who-NOM all/*ever comes
b. Ich werde heiraten, wer *alles/immer mich darum bittet
I will marry who-NOM *all/ever me about-it asks
2.2 Variations of Non-Matching
The pronoun of the relative clause is sensitive to the case assigning properties
of both the matrix verb and the relative clause internal verb. This can again
be seen very clearly in Modern Greek (Alexiadou & Varlokosta 1995: 12f.):
(17) a. Agapo opjon/*opjos me agapa
love-1Sg whoever-ACC/*NOM me loves
'I love whoever loves me.'
b. Opjon/opjos piaso tha timorithi
whoever-ACC/NOM catch-1Sg FUT be-punished-3Pl
'Whoever I catch will be punished.'
With the FR in postverbal position, the FR pronoun has to carry the matrix
case, while both cases are possible when the FR is fronted. The relative pro-
noun of FR constructions is in a case conflict in postverbal position. The
structural configuration of this conflict can be represented in the following
way:
(18) Case conflict
[Tree diagram lost in extraction: a CP whose REL-PRON node bears numerical subscripts for both CASE1 and CASE2.]
CASE1 is the case assigned by the matrix verb to the FR and CASE2 is the
case assigned by the verb inside the FR to the trace of the relative pronoun,
t2. Under the assumption that CASE1 percolates from the top node of the
FR (which kind of category it actually is will be discussed below) to the FR
pronoun, the pronoun is case marked twice. This is indicated by the numerical
subscripts on the node REL-PRON. Examples like those in (17) show that the
pronoun can realise either of the two cases. But that configurations like (18)
need not necessarily lead to ungrammaticality, even if CASE1 and CASE2
differ, is already quite an unexpected fact: A single pronoun can carry only
one case feature; the other case feature is not assigned, or at least not realised,
and because of this, FR constructions should be defective. Obviously, they are
not.
The grammar of Modern Greek handles this case conflict by suppressing the
case assigned to the pronoun inside the FR for FRs in postverbal position. 11
This is the opposite of what happens in German, where the relative pronoun
always has to carry the case assigned to the wh-pronoun by the FR internal
verb. The phenomenon that relative pronouns realise a case from "outside"
of the relative clause is quite frequent in ancient languages like Gothic and
Ancient Greek (cf. Harbert 1983). It has been termed "case attraction".
Recall that German requires matching as soon as the matrix case assigned
to the FR is oblique. One obvious explanation for this could be that only
structural case may be "recoverable" if "suppressed", but not oblique case.
What, then, happens in Modern Greek if the "suppressed" case is an oblique
one? Again, we observe something that differs from the usual pattern (cf.
Alexiadou & Varlokosta 1995: 13):
(19) Tha voithiso opjon tu dosis to onoma mu
FUT help-1Sg whoever-ACC cl-GEN give-2Sg the name my
*opjou 'whoever-GEN'
*s'opjon 'to whoever'
*opjou tu 'whoever-GEN him-GEN'
'I will help whoever you give (him) my name.'
As usual, the pronoun carries the accusative assigned by the matrix verb, but
in addition to that we have a pronominal clitic following the relative pronoun
inside the FR that realises the genitive case assigned by the FR internal verb.
The clitic can be seen as a resumptive element spelling out a trace of the
relative pronoun in the sense of Pesetsky (1998).
Modern Greek has found a way to escape the case conflict by realising both
case forms within the chain of the relative pronoun, thereby violating the
restriction that a single chain should bear only one case feature.
A third type of non-matching language is Gothic, where, according to
Harbert (1983), both attraction and non-attraction were possible ways of han-
dling the case conflict. In this language, the relative pronoun systematically
chooses the case form that is higher on the case hierarchy:
... the two types of free relative are in complementary distribution, the
choice between them being determined by the relationship between the
case appropriate to the matrix clause role of the construction and the case
appropriate to the role of the missing argument in the lower clause. When
the matrix case is to the right of the lower clause case on a hierarchy
of the form Nom-Acc-{Dat, Gen} it prevails [(20-a), attraction]. When it is to
the left of the lower clause case the lower clause case prevails [(20-b),
non-attraction]. (Harbert 1983: 249)
(20) "Optional" attraction in Gothic (Harbert 1983: 248f.):

a. Jah þo-ei ist us Laudeikaion jus ussiggwaid
and Acc-Compl is from Laodicea you read
'and read (the one) which is from Laodicea' (Col. 4:16)
b. þan-ei frijos siuks ist
Acc-Compl you-love sick is
'(The one) whom you love is sick.' (Joh. 11:3)
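Harbert's generalisation for Gothic amounts to a comparison of two positions on the case hierarchy. The following minimal sketch assumes that dative and genitive share the highest position; the function name and the returned labels are hypothetical, and the quotation does not say what happens when the two cases are distinct but equally high, so that situation is left undetermined.

```python
# Assumed ranks on the hierarchy: NOM lowest, ACC next, DAT and GEN highest.
RANK = {"NOM": 0, "ACC": 1, "DAT": 2, "GEN": 2}


def gothic_pronoun_case(matrix_case, lower_case):
    """Return (case realised on the FR pronoun, name of the pattern)."""
    m, l = RANK[matrix_case], RANK[lower_case]
    if m > l:
        return matrix_case, "attraction"
    if l > m:
        return lower_case, "non-attraction"
    if matrix_case == lower_case:
        return matrix_case, "matching"
    # Distinct cases of equal rank: not covered by the quoted passage.
    return None, "undetermined"


print(gothic_pronoun_case("ACC", "NOM"))  # configuration of (20-a)
print(gothic_pronoun_case("NOM", "ACC"))  # configuration of (20-b)
```

In both configurations of (20) the pronoun surfaces with accusative, once as the matrix case and once as the lower clause case, which is why the attraction is only apparently optional.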
Any account of the syntax of FR constructions should be able to cover the

typological fact that there are not just matching and non-matching languages,
but also at least three different types of non-matching languages, as exempli-
fied by German, Modern Greek and Gothic.
2.3 FRs as Bare CPs
Three different kinds of proposals for the structure of FR clauses can be dis-
tinguished:
I. FRs have the structure of a DP or NP complemented by a CP or IP
with the FR pronoun in D° or N°:
[NP/DP FR-PRON [CP/IP ... ]]
II. FRs have the structure of an empty DP or NP complemented by a
CP with the FR pronoun in SpecCP:
[NP/DP ∅ [CP FR-PRON ... ]]
III. FRs are CPs, with the FR pronoun in SpecCP:
[CP FR-PRON ... ]
The first proposal was made by Bresnan & Grimshaw (1978) and the second
by Groos & van Riemsdijk (1981), the latter being a reply to the former. A
variant of the second proposal that includes a treatment of case attraction was
developed by Harbert (1983). The third analysis was proposed more recently
by Rooryck (1994).
Rooryck shows that both DP-CP accounts face empirical problems. The
structure under I. cannot deal very well with many of the extraposition prop-
erties of relative clauses in German and Dutch (this was first shown by Groos
& van Riemsdijk), and the structure under II. wrongly predicts subjacency vi-
olations in cases of extraction out of the specifier of the CP. The usual syntac-
tic tests show quite clearly that FRs behave like ordinary subordinate clauses
(i.e., like CPs). The DP-CP proposals also require some construction specific
stipulations and mechanisms in order to work. 12
I will adopt several insights from the discussed approaches. These are basi-
cally the assumption that FRs are CPs (cf. Rooryck 1994), that case attraction
is a PF phenomenon (cf. Harbert 1983) and that the case hierarchy plays a cru-
cial role in many languages (cf. Bresnan & Grimshaw 1978, Harbert 1983,
Pittner 1991). But the basic account should be rephrasable with different syn-
tactic analyses.
There are some further reasons why I do not make use of one of the two
proposals that claim that FRs are "headed": 13
— The "bare CP" analysis is conceptually simpler, in that it only makes

use of directly observable elements.
— If we assume that the FR pronoun is a pronoun of its own kind (cf.
the evidence from Modern Greek, given in (14)), then it is possible
to attribute most of the specific semantic properties of FRs to the
pronoun itself, in a way analogous to relative pronouns and subordinate
wh-clauses.
— Rooryck's (1994) proposal that the C° head of the FR has an agree-
ment function is sufficient to establish a configuration in which the
FR pronoun carries a case feature that does not stem from its local
domain, but from the matrix verb.
2.4 The Case Hierarchy
Several observations about FRs in different languages can be explained with

reference to a case hierarchy. Bresnan & Grimshaw (1978) already cite an
unpublished manuscript by L. Carlson (1977) about the following Finnish
data (cf. Bresnan & Grimshaw 1978: 373f.):
(21) a. Valitsen mistä sinä pidät
choose-I what-ELATIVE you like-you
b. *Pidän mitä sinä valitset
like-I what-PARTITIVE you choose-you
Finnish FRs seem to resemble German ones in that the FR pronoun always
takes the case assigned by the FR internal verb. Finnish can deal with some in-
stances of case conflict: (21) shows that a matrix partitive may be suppressed
if the embedded case is elative, but not vice versa. Bresnan & Grimshaw
(1978: 374) cite Carlson as follows:
Carlson suggests that nominative (the case of subjects and objects of im-
personal constructions), accusative, and partitive (the cases of objects of
transitive verbs) are unmarked cases in Finnish; the case of a free relative
may disagree with that of its head only when the relative has unmarked
case; and the head must agree in case with the subordinate verb that gov-
erns its trace.
Pittner (1991) claims that the following rule holds in German FR construc-
tions:
Bei einem Kasuskonflikt zwischen dem vom Matrixverb geforderten Kasus K1
und dem vom Verb im freien Relativsatz geforderten Kasus K2 kann K1
unrealisiert bleiben, wenn K1 K2 auf folgender Hierarchie vorangeht:
(KH) Nominativ < Akkusativ < Dativ/Präpositionalkasus 14
I do not think that dative and PPs should be grouped together on the case
hierarchy in general, but for German it does not seem to make an empirical
difference.
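Pittner's rule can be restated as a simple comparison of positions on (KH). The following is only a minimal sketch: the numeric ranks, the lowercase case labels, and the function name are illustrative assumptions, not part of Pittner's formulation.

```python
# Positions on the case hierarchy (KH); dative and prepositional case
# share one position, as in Pittner's formulation. The numbers are
# arbitrary labels chosen for this sketch.
CASE_RANK = {"nom": 0, "acc": 1, "dat": 2, "pp": 2}


def matrix_case_may_stay_unrealised(k1: str, k2: str) -> bool:
    """True if the matrix case k1 precedes the FR-internal case k2 on (KH)."""
    return CASE_RANK[k1] < CASE_RANK[k2]


# Matrix accusative, FR-internal dative: the matrix case may be suppressed.
print(matrix_case_may_stay_unrealised("acc", "dat"))
# Matrix dative, FR-internal accusative: it may not.
print(matrix_case_may_stay_unrealised("dat", "acc"))
```

The two calls correspond to the case combinations of the German pair given in (27) below.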
Harbert (1983) proposed a similar case hierarchy for Gothic, as discussed in
connection with the data in (20). Hierarchies of all kinds of features are quite
common in optimality theoretic models. We could, e.g., develop a constraint
family REALISE<case>, where "<case>" stands for the different cases.
The usual hierarchical ordering of these constraints in the OT model gives us
the implementation of the case hierarchy:
REALISEpp » REALISEdat » REALISEacc » REALISEnom
This hierarchy should, of course, be universally fixed and not freely rerank-
able.
2.5 Case assignment at PF?
I will make use of Harbert's (1983) intuition that case attraction is a PF-
related phenomenon. In such cases, we obviously have a mismatch between
the syntactic case feature of the FR pronoun (assigned by the FR internal
verb) and its morpho-phonological case feature ("assigned" by the matrix
verb). OT can deal with such rule violations. Whether such a candidate wins
depends on the system of constraints and their ranking.
What is particularly interesting is that the formulation of the constraint
makes reference to the notion of correspondence, which has become a fruit-
ful topic of discussion in Optimality Theory. The correspondence here is that
between a morpho-syntactic feature of a pronoun (at the relevant syntactic

level of representation) and its corresponding morpho-phonological expres-
sion at the level of morpho-phonological representation, which, for ease of
discussion, I will assume to be PF.
3 An Optimality Theoretic Analysis
The general architecture of optimality theoretic models can be schematised

as follows:
Input: I0
Candidate set C: C1 C2 C3 C4 ... Cn
    ↓ EVAL(R(CON), C)
Output: O0
I make the following assumptions about the components of this model:
— The standard proposal about the INPUT in OT syntax has been

given by Grimshaw (1997a: 375f.): "The input for a verbal extended
projection is a lexical head plus its argument structure and an as-
signment of lexical heads to its arguments, plus a specification of
the associated tense and aspect." For the present discussion, it is im-
portant that the specific structure of the free relative clause is part of
the input. I assume a version of the input that has been proposed by
Keer & Bakovic (forthcoming) that also adds functional features to
the elements in the input: "In addition to the lexical features, argu-
ment structure, tense and aspect of the Grimshavian input, we posit
that there are functional features such as [±COMP] and [±WH]."
The FR pronoun and its formal features should also be present in the
input. I assume that any functional features and projections might
be included in the input. To be concrete: A free relative construc-
tion can already be distinguished from an ordinary headed relative
clause construction at the input level. This version of the input re-
sembles a D- or S-structure much more than an unstructured set of
lexical and functional items.
— The OUTPUT is a pair (LF, PF), as usual (not only) in the mini-
malist branch of generative syntax.
— GEN produces the candidate set on the basis of the input: "[GEN]
... generates all extended projections that conform to X-bar theory,
that is, in which all projections are of the right basic structure."
(Grimshaw 1997a: 376). I further assume that GEN can manipulate
the functional categories of the input. Contrary to Keer & Bakovic
(forthcoming), I do not assume that GEN can perform manipula-
tions on the values of features. I assume that GEN can only manip-
ulate the distribution of features within the clause, and thereby add
or erase functional projections. The motivation for this move will
become clear in the next section.
— The Candidate set C is the set of possible output candidates, gen-
erated by GEN: (LF, PF) pairs that conform to universal well-
formedness rules.
— EVAL is an evaluation algorithm based on the particular ranking R
of the set of universal constraints CON. EVAL compares the candi-
dates in C and chooses the best competitor as output.
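The role of EVAL in this architecture can be sketched in a few lines. This is a toy illustration under stated assumptions: candidates are plain strings, the two constraints `no_a` and `no_b` are hypothetical stand-ins with no linguistic content, and the ranking is a total order, so that strict domination reduces to a lexicographic comparison of violation counts.

```python
from typing import Callable, Iterable, Sequence

Candidate = str
Constraint = Callable[[Candidate], int]  # returns a number of violations


def eval_ot(candidates: Iterable[Candidate],
            ranking: Sequence[Constraint]) -> Candidate:
    """Return the candidate whose violation profile is lexicographically
    smallest, reading the constraints from highest to lowest ranked."""
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))


# Hypothetical toy constraints: penalise each "a" or each "b".
def no_a(candidate: Candidate) -> int:
    return candidate.count("a")


def no_b(candidate: Candidate) -> int:
    return candidate.count("b")


toy_candidates = ["aab", "abb", "bbb"]
print(eval_ot(toy_candidates, [no_b, no_a]))  # ranking no_b >> no_a
print(eval_ot(toy_candidates, [no_a, no_b]))  # ranking no_a >> no_b
```

Reranking the same two constraints selects a different winner from the same candidate set, which is the sense in which grammars differ only in R.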
3.1 Correspondence Theory
OT models are always theories about the relation (i.e., correspondence)

between two different representations. These are mostly input and output.
McCarthy & Prince (1995) among others show that in some cases it is also
reasonable to assume correspondence between the outputs of two different
competitions (output-output correspondence), and even further kinds of cor-
respondences between different levels of representation. For our purposes, it
is particularly interesting that the output of the OT syntax model in (21) is
already a pair of two representations, LF and PF. The possibility of corre-
spondence constraints between these two representations is straightforward
and I will make use of it.
CON consists of two types of constraints: markedness constraints and faith-
fulness constraints. Markedness constraints allow the singling out of certain
properties of (input-faithful) candidates as marked and thus prefer candidates
that differ from the input in this property. Faithfulness constraints allow the
marking of a deviation from certain properties of the input in a candidate as
disadvantageous compared to a faithful candidate. Very often the ranking of
faithfulness and markedness constraints concerning a specific property de-
cides whether the input-faithful or the input-deviant candidate wins.
Following McCarthy & Prince (1995), Grimshaw (1997b) lists the follow-
ing two basic types of faithfulness constraints:
(22) a. MAX: Every segment in the input has a correspondent in the output.
b. DEP: Every segment in the output has a correspondent in the input.
The scheme in (23) illustrates the differences between these constraint fami-
lies. A MAX constraint that applies to two elements in the input can be satis-
fied by a single element in the output. The opposite holds for DEP constraints.
It is also less important whether segments are in the right order.
(23) MAX
input F F F X
output F X F
Thus, DEP and MAX constraints allow, say, "weak" unfaithfulness. Consider
the matching effect:
(24) a. Ich helfe wem ich helfen will

I help who-DAT I help want
b. Ich helfe einem, dem ich helfen will
I help one-DAT who-DAT I help want
Strictly speaking, we have two dative case features in (24-a), one assigned
by the matrix verb and the other assigned by the FR internal verb. But
neither MAXdat nor DEPdat is violated. DEPdat requires that for each dative
feature in the output there is (at least) one in the input. This is the case;
we have even more than one, but this is irrelevant. MAXdat requires that for
each dative feature in the input there is one in the output. Again, this is
the case, though it is the same dative feature of the output that corresponds
to both dative features of the input.
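The reasoning about (24-a) can be made explicit by treating correspondence as a set of (input index, output index) pairs. This is a sketch under my own encoding assumptions; the function names and the index-based representation are illustrative and not taken from McCarthy & Prince (1995).

```python
from typing import Set, Tuple

Correspondence = Set[Tuple[int, int]]  # (input index, output index) pairs


def max_violations(n_input: int, corr: Correspondence) -> int:
    """Count input elements that have no correspondent in the output."""
    linked_inputs = {i for i, _ in corr}
    return sum(1 for i in range(n_input) if i not in linked_inputs)


def dep_violations(n_output: int, corr: Correspondence) -> int:
    """Count output elements that have no correspondent in the input."""
    linked_outputs = {o for _, o in corr}
    return sum(1 for o in range(n_output) if o not in linked_outputs)


# (24-a): two dative features in the input (index 0: matrix verb,
# index 1: FR-internal verb), one dative form in the output (index 0).
matching_fr = {(0, 0), (1, 0)}
print(max_violations(2, matching_fr), dep_violations(1, matching_fr))

# Contrast: if no dative form is realised at all, both input features
# lack a correspondent.
print(max_violations(2, set()))
```

Because both input datives are linked to the single output dative, neither count is above zero for the matching case, which is the "weak" unfaithfulness described above.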
This might help to explain the following problematic example discussed in
Pittner (1995) and Leirbukt (1995):
(25) a. Sagen Sie das bitte Frau Schwarzkopf, Herrn Müller, Herrn
tell you that please Mrs. S.-DAT Mr. M.-DAT Mr.
Schmidt und wen sie sonst noch treffen
S.-DAT and who-ACC you else yet meet
'Please tell that to Mrs. S., Mr. M., Mr. S. and whoever else you
might meet.'
b. *Sagen Sie das bitte, wen Sie sonst noch treffen
tell you that please who-ACC you else yet meet
The obvious problem is the well-formedness of (25-a) in spite of the
ill-formedness of (25-b). There is no violation of MAXdat or DEPdat in (25-a)
if we assume that in a conjunction structure only the first conjunct is
relevant for case checking issues. Evidence for this assumption might be
given by the fact that a reversal of the two conjuncts leads to
ill-formedness:
(26) *Sagen Sie das bitte, wen Sie treffen und Frau Schwarzkopf
tell you that please who-ACC you meet and Mrs. S.
The FR clause with an accusative pronoun, which normally is impossible

when inserted for a dative marked complement, gets a "free ride" in (25-a), so
to speak: There is no case conflict, because the dative required by the matrix
verb is realised by the DPs in the first conjunct. For a traditional account it is
hard to explain why the accusative FR is allowed in (25-a). But it might not
be impossible.
3.2 Ungrammaticality
A problem that is never easy to solve in OT is absolute ungrammaticality. In

an OT competition there will always be a winner. If all candidates in the can-
didate set for (27-b) are FR constructions, there should be an FR construction
that wins and thus is well-formed.
(27) a. Ich habe eingeladen, wem ich vertraue

I have invited who-DAT I trust
b. *Ich vertraue, wen ich eingeladen habe
I trust, who-ACC I invited have
There are several ways to escape this problem. The one I choose here is to
include among the candidates not only FR constructions but also ordinary
headed relative constructions. These are not input-faithful, but will
sometimes win because they do not violate some crucial constraints. That is
to say, there is always a candidate like (28) among the candidates for a FR
construction. And in this case, (28) should even turn out to be the optimal
candidate. 15
(28) Ich vertraue einem, den ich eingeladen habe

I trust one-DAT who-ACC I invited have
It must be possible for GEN to generate an output like (28) from an input
of the form of a FR construction. For this to be possible, I assume that GEN
can perform manipulations on the functional features and categories. GEN

cannot add or delete features, but it can reorganise them. Let us assume that
the FR pronoun has two features A and B. Then in the free relative
construction, these are both included in the FR pronoun, but in the case of
the headed relative, they are distributed: A is in the head (i.e. "einem" in
(28)), and B is in the relative pronoun. The semantic property that
distinguishes an ordinary relative pronoun from an FR pronoun is
referentiality: A simple restrictive relative clause cannot introduce a new
discourse referent, but a free relative clause does. 16 So let us assume two
features, [±REF] for "referentiality" and [±REL] for the characteristics of
relative pronouns (these may be further analysable into other features, but
this is a minor issue here): 17
(29)                     [REF]  [REL]
     pronoun               +      -
     relative pronoun      -      +
     FR pronoun            +      +
In the input of a FR construction, the two features are joined under one func-
tional head. I assume that it is possible for GEN to split this feature bundle
and distribute the features over several functional projections, and thereby
introduce functional projections that were not present in the input. 18
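(30) Input: [+REF, +REL]
[Tree diagrams: output 1, with [+REF] and [+REL] joined on one element in
the specifier of C°; output 2, with [+REF] on a DP above the CP and [+REL]
on the element in the specifier of C°.]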
The correspondence that is involved here is input-LF correspondence. We

are comparing a structure that is given in the input with its output LF. The
faithfulness is about the functional features and functional projections.
3.3 The Constraints
In (30), the output structure 2 differs in three respects from the input structure:
There is one additional functional projection that was not in the input, there
is an additional functional head D°, and the features of the D° in the spec-
ifier of the CP differ from its correspondent in the input structure, and vice
versa. I want to summarise these cases as subcases of the same constraint,
FAITHfunc. In addition to the mentioned input-LF faithfulness, FAITHfunc
will also include instances of LF-PF correspondence, namely the occurrence
of resumptive clitic pronouns in cases like (19), repeated here:
(31) Tha voithiso opjon tu dosis to onoma mu
FUT help-1Sg whoever-ACC cl-GEN give-2Sg the name my
*opjou 'whoever-GEN'
*s'opjon 'to whoever'
*opjou tu 'whoever-GEN him-GEN'
'I will help whoever you give (him) my name.'
The requirement to be included in FAITHfunc is that an LF-chain of a
functional category should have exactly one PF correspondent. In the case of
resumptives, there are two PF elements corresponding to one LF-chain. 19
(32) FAITHfunc:
Each functional feature bundle in the input has a corresponding
functional head with the same feature specification and vice versa;
and a chain of a functional category has exactly one PF correspon-
dent.
FAITHfunc is violated if
— A functional head in the input has no corresponding functional head

with the same feature specification in the output LF (or vice versa)
or
— More than one element of the output PF corresponds to a link of the
LF-chain of the same functional category.
There are three mismatches between the input structure and the output LF 2
in (30), hence three violations of FAITHfunc:
input                 output 2/LF
[DP +REF, +REL]       ∅
∅                     [DP1 +REF]
∅                     [DP2 +REL]
The FR structure with the additional resumptive pronoun has one FAITHfunc
violation, because the resumptive pronoun introduces a second PF correspon-
dent of the same functional LF-chain.
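The two ways of violating FAITHfunc can be counted mechanically. In this sketch, each functional head is encoded as a set of feature strings and mismatches are counted as a multiset difference in both directions; the encoding, the function name, and the parameter `pf_correspondents_per_chain` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Sequence, Set


def faith_func_violations(input_heads: Iterable[Set[str]],
                          output_heads: Iterable[Set[str]],
                          pf_correspondents_per_chain: Sequence[int] = ()) -> int:
    """Count FAITHfunc violations.

    First clause: heads of the input without an identically specified
    head in the output LF, and vice versa.
    Second clause: every PF correspondent beyond the first for a chain.
    """
    inp = Counter(frozenset(h) for h in input_heads)
    out = Counter(frozenset(h) for h in output_heads)
    unmatched = sum(((inp - out) + (out - inp)).values())
    extra_pf = sum(max(0, n - 1) for n in pf_correspondents_per_chain)
    return unmatched + extra_pf


fr_pronoun = {"+REF", "+REL"}

# Headed relative: the bundle is split over two heads.
headed = faith_func_violations([fr_pronoun], [{"+REF"}, {"+REL"}])
# FR with a resumptive pronoun: same heads, but two PF correspondents.
resumptive = faith_func_violations([fr_pronoun], [fr_pronoun], [2])
# Plain FR: fully faithful.
plain = faith_func_violations([fr_pronoun], [fr_pronoun], [1])
print(headed, resumptive, plain)
```

Under this encoding the headed relative incurs three violations and the resumptive structure one, matching the counts given in the text.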
The second constraint that I am assuming is on PF-LF correspondence and
requires that the morpho-phonological case form of an element P of an output
PF must correspond to the morpho-syntactic case feature of the LF-chain
corresponding to P:
(33) MATCH:
The morpho-phonological case feature of a PF element X may not
contradict the syntactic case feature of the chain of XP, the
correspondent of X in the output LF. 20
MATCH is violated in all instances of case attraction, as in Modern Greek

and Gothic:
(34) PF:  X-acc
          |
     LF:  XP_i ... t_i V [+nom]
Take XP to be the FR pronoun. At LF, XP is the head of a chain assigned
nominative by the FR-internal verb or IP (henceforth "r-case"), but at PF the
morpho-phonological case affix of X, the correspondent of XP, corresponds
to the accusative assigned by the matrix verb (henceforth "m-case"). 21
The faithfulness constraints for the case features are likewise defined as cor-
respondence rules between syntactic and morpho-phonological case features.
The first constraint is violated by ordinary FR constructions:
(35) UNIcase:
"Uniqueness of case relations". For each case required by each
case assigner there is exactly one XP that stands in the appropri-
ate structural position for case assignment and realises (morpho-
phonologically) the required case feature. XPs may realise the case
of at most one case assigner.
To include attraction as a mode of realising the case feature assigned by the

matrix verb in the FR pronoun in the SpecCP of the FR, it must be possible
that the PF correspondent of the specifier of an AGR head can realise a case
feature of that head.
(36) A case feature is realised by XP iff

a. One of the elements within XP's chain is realised with the respective
case morphology at PF.
b. The head of one of the elements within XP's chain is an AGR-head
and "realises" the respective case morphology at PF via Spec-Head
agreement on the PF correspondent of its specifier.
(37) m-case=DAT, r-case=ACC   UNIcase  MATCH  FAITHfunc
     a. [CP dat                  *       *
     b. [CP acc                  *
     c. [CP dat ... acc                  *       *
     d. [DP dat [CP acc                          ***
The candidates in (37) are the candidates relevant in the competition. 22
(37-a) is an instance of case attraction, as exemplified by Gothic. It leaves
one case unrealised, so there is one UNIcase violation, and we have one MATCH
violation, because of the attraction structure. Candidate (37-b) is a
German-type FR clause with the wh-pronoun bearing r-case. There is just one
UNIcase violation. Candidate (37-c) has attraction and in addition a
resumptive pronoun spelling out the trace, as exemplified by some Modern
Greek FRs. We have one MATCH violation for attraction and one FAITHfunc
violation for the clitic. Candidate (37-d) is the headed relative
construction. It has three FAITHfunc violations for the change in the
functional categories and their feature distribution.
We already see from this tableau that different rankings of the constraints
will result in different winners. The case hierarchy will be implemented by a
series of input-PF constraints, resembling MAX constraints:
(38) REALcase_n:
Each XP chain with a syntactic case feature [+CASE_n] at the output
LF has a morpho-phonological case feature [+CASE_n] on a
corresponding element of XP at the output PF.
Again, REALcase_n can be fulfilled by the element in SpecXP if X has AGR
properties, i.e., the FR pronoun can fulfill REALcase_n for m-case:
(39) m-case=DAT, r-case=ACC   REALdat  REALacc
     a. [CP dat                          *
     b. [CP acc                  *
     c. [CP dat ... acc
     d. [DP dat [CP acc
Many possible candidates have no chance of "winning" under any ranking.

Some examples are given in (40): 23 candidate (40-d) is harmonically bounded
by (40-c); candidates (40-f) and (40-g) are harmonically bounded by (40-e).
Thus, (40-a,b,c,e) are the four relevant candidates.
(40)
m-case=NOM
r-case=ACC MAT Uc Ff REALd REALa REALn
a. [cp nom * * *
b. [cp acc * *
c. [cp nom ... acc * *
d. [CP acc ... nom * ** * *
e. [DP nom [cp acc ***
f. [DP nom [CP nom ... acc * * ****
g. [DP acc [CP nom ** **** *** * *
Note that (40-a) can never win over (40-b), because REALnom, which is
violated by (40-b), must never be higher than REALacc, which is violated by
(40-a). Thus, a candidate of type (40-a) can only win if m-case is higher
than r-case on the case hierarchy. This is exactly what has been found in
Gothic FRs.
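Harmonic bounding can be checked without reference to any ranking. The sketch below is an illustration under assumptions: the profile for (40-e) follows the three FAITHfunc violations of the headed relative, the profile for (40-b) follows the text, and for (40-f) I assume four FAITHfunc marks plus one MATCH and one UNIcase mark. The placement of those last two marks is my reading of the tableau and does not affect the result.

```python
from typing import Dict

CONSTRAINTS = ("MATCH", "UNIcase", "FAITHfunc", "REALdat", "REALacc", "REALnom")
Profile = Dict[str, int]


def harmonically_bounds(p: Profile, q: Profile) -> bool:
    """True if p is never worse than q on any constraint and strictly
    better on at least one, so q cannot win under any ranking."""
    never_worse = all(p.get(c, 0) <= q.get(c, 0) for c in CONSTRAINTS)
    better_once = any(p.get(c, 0) < q.get(c, 0) for c in CONSTRAINTS)
    return never_worse and better_once


cand_b = {"UNIcase": 1, "REALnom": 1}                  # (40-b)
cand_e = {"FAITHfunc": 3}                              # (40-e)
cand_f = {"MATCH": 1, "UNIcase": 1, "FAITHfunc": 4}    # (40-f), assumed columns

print(harmonically_bounds(cand_e, cand_f))  # e bounds f
print(harmonically_bounds(cand_e, cand_b))  # e and b are in real competition
print(harmonically_bounds(cand_b, cand_e))
```

Candidates that are not bounded in either direction, such as (40-b) and (40-e), are exactly those whose relative success depends on the ranking.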
4 Analysis of German Free Relative Clauses
The constraint ranking that yields the pattern for "German A" (cf. page 345)
is given in (41); the constraint for the oblique cases dative and genitive is
abbreviated with REALobl: 24
(41) MATCH REALobl » FAITHfunc » UNIcase REALacc REALnom
The top ranking of MATCH ensures that German FRs have no case attraction,
and the ranking of FAITHfunc above UNIcase ensures that German has FR
constructions under case conflict. That REALobl is higher than FAITHfunc
has the effect that FRs that suppress oblique cases are ungrammatical
(i.e., lose against the "headed" relative construction). The following
sections show this in a little more detail.
4.1 Matching Free Relatives
Let us first take a look at a non-conflicting example:
(42) Wer das glaubt, ist ein Träumer

who-NOM this believes is a dreamer
m-case=NOM
r-case=NOM MAT REALo Ff Uc REALa REALn
☞ a. [CP nom *
b. [CP nom ... nom *!
c. [DP nom [CP nom
The same results can be repeated wherever m-case and r-case match, and
if both are PPs with the same lexical preposition.
4.2 Free Relatives with Case Conflict
Examples with the inanimate FR pronoun was, which realises both nominative
and accusative, look like matching cases. The PF form is abbreviated with
"n/a" in the candidate set.
(43) Was ich nicht weiß, macht mich nicht heiß

what-ACC I not know makes me not hot
'What I don't know doesn't excite me.'
m-case=NOM
r-case=ACC MAT REALo Ff Uc REALa REALn
☞ a. [CP n/a *
b. [CP — ACC *!
c. [DP nom [cp acc
When the FR pronoun is animate, the PF forms of accusative and nominative

differ:
(44) Wen ich traf, wurde eingeladen

who-ACC I met was invited
m-case=NOM
r-case=ACC MAT REALo Ff Uc REALa REALn
☞ a. [CP acc * *
b. [CP nom * *
c. [CP nom ... acc *! *
d. [DP nom [CP acc
Nearly the same will happen if r-case is an oblique case. The only
difference will be that candidate b. will have a violation of REALobl instead
of REALacc. But the result will be the same, as expected. The examples
with accusative as m-case are totally analogous, with the exception of the
REALacc and REALnom violations, which are not relevant here. A possibly
complicated case is the following example:
(45) Wessen Bücher mir gefallen, wurde eingeladen

who-GEN books me-DAT please was invited
What is special here is that the FR pronoun is not in the SpecCP of the
FR, but in the Spec of a DP that itself occupies SpecCP. Presumably, it is
impossible for the pronoun to undergo attraction in this case. 25 And even if
it did, it is not the FR pronoun that would agree with the C°-AGR head of
the FR, but the complex DP, and thus the pronoun would be unable to fulfil
REALcase_n for m-case. The relevant example is put in brackets in (46). It
is, however, clear that the expected candidate would win, no matter how the
attraction candidate (46-b) were treated: 26
(46)
[CP [DP GEN NP ] ... ]-NOM MAT REALo Ff Uc REALa REALn
☞ a. [CP gen NP * *
(b. [CP nom NP) *! (*) * *
c. [DP nom [CP gen NP
The main reason for the possibility of non-matching when m-case is a
structural case is that REALacc and REALnom are ranked below FAITHfunc.
Things are different if m-case is oblique, because REALobl is ranked above
FAITHfunc.
(47) *Ich helfe, wer mich fragt

I help who-NOM me asks
m-case=DAT
r-case=NOM MAT REALo Ff Uc REALa REALn
a. [CP nom *! * *
b. [CP dat *! * *
c. [CP dat ... nom *! *
☞ d. [DP dat [CP nom * * *
Again, the pattern can be repeated with all other combinations of oblique and
structural cases. 27
4.3 Typological Predictions
Besides the German pattern, the model predicts another 7 different possible
grammars: 28
(48) a. UNIcase REALobl REALacc REALnom » FAITHfunc » MATCH

This is Modern Greek, under the assumption that in Greek the
"resumptive" pronouns for accusative and nominative are "spelled
out" by pro (Modern Greek is a pro drop language). 29 Here, the
candidate with the resumptive pronoun wins in all instances of case
conflict.
b. FAITHfunc REALobl » UNIcase REALacc » MATCH REALnom
This is Gothic, a language in which the higher marked case is chosen,
irrespective of whether it is m-case or r-case.
c. UNIcase MATCH REALobl REALacc REALnom » FAITHfunc
This is a matching language like English. The ordinary headed rel-
ative construction always wins under case conflict.
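The rankings in (41) and (48) can be checked against the violation profiles of tableaux (37) and (39), i.e. the competition with m-case=DAT and r-case=ACC. The following is a sketch under two assumptions of mine: REALdat is treated as an instance of REALobl, and constraints that are listed together without "»" are treated as tied, with their violations summed within the stratum. The candidate labels are informal.

```python
from typing import Dict, List

# Violation profiles for m-case=DAT, r-case=ACC, combining (37) and (39).
TABLEAU: Dict[str, Dict[str, int]] = {
    "attraction [CP dat":          {"UNIcase": 1, "MATCH": 1, "REALacc": 1},
    "r-case FR [CP acc":           {"UNIcase": 1, "REALobl": 1},
    "resumptive [CP dat ... acc":  {"MATCH": 1, "FAITHfunc": 1},
    "headed [DP dat [CP acc":      {"FAITHfunc": 3},
}

# Rankings as lists of strata, highest first.
GERMAN_A = [["MATCH", "REALobl"], ["FAITHfunc"],
            ["UNIcase", "REALacc", "REALnom"]]                    # (41)
GREEK = [["UNIcase", "REALobl", "REALacc", "REALnom"],
         ["FAITHfunc"], ["MATCH"]]                                # (48-a)
GOTHIC = [["FAITHfunc", "REALobl"], ["UNIcase", "REALacc"],
          ["MATCH", "REALnom"]]                                   # (48-b)
ENGLISH = [["UNIcase", "MATCH", "REALobl", "REALacc", "REALnom"],
           ["FAITHfunc"]]                                         # (48-c)


def winner(tableau: Dict[str, Dict[str, int]], strata: List[List[str]]) -> str:
    """Return the candidate with the lexicographically smallest profile,
    summing violations inside each stratum (an assumption of this sketch)."""
    def profile(marks: Dict[str, int]):
        return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in strata)
    return min(tableau, key=lambda name: profile(tableau[name]))


for label, ranking in [("German A", GERMAN_A), ("Modern Greek", GREEK),
                       ("Gothic", GOTHIC), ("English-type", ENGLISH)]:
    print(label, "->", winner(TABLEAU, ranking))
```

With these profiles, the German and the English-type ranking select the headed relative, the Greek ranking the resumptive structure, and the Gothic ranking the attraction structure, in line with the descriptions above.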
The following grammars are predicted, but have not been found yet:
(49) a. REALobl REALacc » FAITHfunc » UNIcase MATCH REALnom

In this language, nominative always loses, irrespective of whether
it is m-case or r-case. In those cases where accusative and an
oblique case come together, we get the FR construction with a
resumptive pronoun.
b. FAITHfunc REALobl » UNIcase MATCH » REALacc REALnom
This language has attraction if the m-case is oblique. In the other
cases, the r-case is chosen on the FR pronoun.
c. MATCH REALobl REALacc » FAITHfunc » REALnom UNIcase

This language only allows non-matching FRs if m-case is nominative.
Grosu (1994) claims that this holds in Spanish and Romanian.
There are, however, some empirical complications. 30
The following grammar is predicted, but perhaps implausible, because in two
instances the oblique case feature is suppressed: 31
(50) FAITHfunc MATCH » UNIcase REALobl REALacc REALnom
This language always lets r-case win, even if m-case is oblique.
4.4 A Different German Dialect
The prediction of an implausible language might already be a good reason to

look for further refinements of the model. Note also that thus far our model
does not predict the German variant B discussed on page 345. This variant
does not allow a free relative if m-case is accusative and r-case is
nominative. 32
However, this variant can be predicted with a different implementation of
the case hierarchy. This implementation allows hierarchically higher cases to
satisfy realisation requirements for hierarchically lower cases:
(51) REALnom can be fulfilled by any morphological case feature
REALacc can be fulfilled by any morphological case feature, except
nominative
REALobl can only be fulfilled by the proper oblique case feature
(i.e., dative for dative, genitive for genitive, etc.)
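The revised satisfaction conditions in (51) amount to a small decision procedure. The sketch below assumes lowercase case labels and a hypothetical function name; any required case other than "nom" and "acc" is treated as oblique.

```python
def real_fulfilled(required: str, morphological: str) -> bool:
    """Does the morphological case form satisfy REAL<required> under (51)?"""
    if required == "nom":
        return True                       # any case form will do
    if required == "acc":
        return morphological != "nom"     # anything except nominative
    return morphological == required      # oblique: exact match only


print(real_fulfilled("nom", "acc"))  # accusative form satisfies REALnom
print(real_fulfilled("acc", "nom"))  # nominative form cannot satisfy REALacc
print(real_fulfilled("acc", "dat"))  # a higher case satisfies REALacc
print(real_fulfilled("dat", "acc"))  # REALobl needs the proper oblique case
```

The asymmetry between the first two calls is what separates the competition with m-case=NOM, r-case=ACC from the one with m-case=ACC, r-case=NOM.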
With this we get both variants of German:
(52) German A:
/Er zerstört wer ihm begegnet
he destroys who-NOM him-DAT meets
m-case=ACC
r-case=NOM MAT REALo Ff Uc REALa REALn
☞ a. [CP nom * *
b. [CP acc *
c. [CP acc ... nom *! *
d. [DP acc [CP nom
The crucial reranking from German A to German B is that REALacc goes
above FAITHfunc and UNIcase:
(53) German B:
*Er zerstört wer ihm begegnet
he destroys who-NOM him-DAT meets
m-case=ACC
r-case=NOM MAT REALo REALa REALn Ff Uc
a. [CP nom *
b. [CP acc *! *
c. [CP acc ... nom *! *
☞ d. [DP acc [CP nom * * *
4.5 Revised Typology
With the modification in (51) and two additional ones, we can further restrict
the pattern. REALnom plays no role in the interaction of the constraints, so
we can eliminate it: it is always fulfilled in the competitions under debate
here. REALobl can be argued to be inviolable, and hence part of GEN. 33
Thus, candidates that violate REALdat or REALgen will never be generated.
So REALobl can also be removed. We then no longer need to postulate a
fixed ranking of constraints, because REALacc is the only constraint of the
case hierarchy that is left. The four constraints we now use are: REALacc
(in the new version where REALacc can be fulfilled by any case form except
nominative), UNIcase, FAITHfunc and MATCH. This rules out the possibility
of (50). We now get the prediction of six grammars, each of which is quite
reasonable:
(54) a. UNIcase REALacc » FAITHfunc » MATCH

This is instantiated by Modern Greek (48-a).
b. FAITHfunc REALacc » UNIcase MATCH
This is instantiated by Gothic (48-b).
c. UNIcase MATCH REALacc » FAITHfunc
A language without FRs under case conflict (48-c).
d. FAITHfunc » UNIcase MATCH » REALacc
Not found yet, but very close to Gothic, cf. (49-b).
e. MATCH » FAITHfunc » REALacc » UNIcase
German, variant A.
f. MATCH REALacc » FAITHfunc » UNIcase

German, variant B.
It is easy to see now how the two variants of German differ, namely, in the
relative ranking of FAITHfunc and REALacc. For German A, it is better to
leave accusative unrealised (and violate REALacc) than to rearrange the
functional material of the FR (and violate FAITHfunc). The opposite holds
for German B. 34
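The difference between the two variants can be replayed on the competition of (52) and (53), with m-case=ACC and r-case=NOM. This is a sketch: the violation profiles are my reading of those two tableaux under the four remaining constraints, and for (54-f) the tie between MATCH and REALacc is resolved as a total order in both possible ways, which is an assumption of the sketch rather than part of the analysis.

```python
from typing import Dict, Sequence

# Assumed violation profiles for m-case=ACC, r-case=NOM.
CANDIDATES: Dict[str, Dict[str, int]] = {
    "a. [CP nom":          {"REALacc": 1, "UNIcase": 1},
    "b. [CP acc":          {"MATCH": 1, "UNIcase": 1},
    "c. [CP acc ... nom":  {"MATCH": 1, "FAITHfunc": 1},
    "d. [DP acc [CP nom":  {"FAITHfunc": 3},
}

GERMAN_A = ["MATCH", "FAITHfunc", "REALacc", "UNIcase"]       # (54-e)
GERMAN_B = ["MATCH", "REALacc", "FAITHfunc", "UNIcase"]       # (54-f)
GERMAN_B_ALT = ["REALacc", "MATCH", "FAITHfunc", "UNIcase"]   # tie resolved the other way


def optimal(candidates: Dict[str, Dict[str, int]], ranking: Sequence[str]) -> str:
    """Return the candidate with the lexicographically smallest profile."""
    return min(candidates,
               key=lambda name: tuple(candidates[name].get(c, 0) for c in ranking))


print("German A:", optimal(CANDIDATES, GERMAN_A))
print("German B:", optimal(CANDIDATES, GERMAN_B))
print("German B (alt):", optimal(CANDIDATES, GERMAN_B_ALT))
```

Under these assumptions German A selects the FR with the nominative pronoun, while German B selects the headed relative regardless of how the tie is resolved.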
5 Concluding Remarks
The main advantage of the proposed analysis is its ability to predict
typological variation in a much better way than previous accounts, and it
does so by making use of an even simpler and comparatively unproblematic
syntactic analysis. The number of construction specific assumptions has been
reduced, but the typological predictions of the present account look much
more realistic. The system of constraints is able to filter out a set of candi-
dates that includes only the types of FR constructions that we find universally.
The constraints that are used here are all defined as constraints on correspon-
dences between different levels of representation, input-LF and PF-LF. The
constraints conflict: A candidate that performs well on input-LF correspon-
dence performs worse on PF-LF correspondence. Languages seem to differ
in which correspondence they consider to be more important. The constraints
are formulated in a general, non-construction-specific way. The system thus
should be able to account for phenomena other than FR constructions. This
opens further fields of research.
Notes
I would like to thank the following colleagues for helpful comments and fruitful
discussions: Artemis Alexiadou, Gisbert Fanselow, Hans-Martin Gärtner, Anasta-
sia Giannakidou, Jane Grimshaw, Alex Grosu, Fabian Heck, Géraldine Legendre,
Gereon Müller, Peter Ohl, Doug Saddy, Tanja Schmid, Sten Vikner. I'm further
thankful to the audiences of presentations of parts of this paper at the University of
Stuttgart on April 15, 1999, the GGS 1999 workshop in Stuttgart, May 15, 1999, at
the University of Potsdam on July 16, 1999, at Rutgers University, New Brunswick,
on September 10, 1999, and at Johns Hopkins University, Baltimore, on October 7,
1999. All remaining errors are mine. The work on this paper was fully supported by a
grant for the DFG research project "Optimalitätstheoretische Syntax des Deutschen"
(MU 1444/2-1), University of Stuttgart.
1. The article by Groos & van Riemsdijk has been very influential and is still the
source of the most-widely accepted syntactic analysis of FR constructions. But
the paper unfortunately bears some misleading judgements of German FR con-
structions which have been taken over by many researchers. Van Riemsdijk still
claims that German is a matching language, as can be seen in van Riemsdijk
(1998). Although there might be some German speakers whose judgements are
as described by Groos & van Riemsdijk, it seems that the majority of German
speakers judge differently.
2. For many speakers, examples with sentence-initial FRs are less acceptable than
the examples in (53). I view this as an effect of parsing difficulties. Because
German is a non-matching language, the hearer has several options for the gram-
matical function of a FR, in general. If the FR is clause-final, it is easy to detect
its grammatical function (GF), because usually there is only one GF left that is
assigned by the verb and not yet realised by some constituent. For a clause-initial
FR, however, everything is open, and its GF can only be guessed at the point of
its occurrence. So we expect that clauses with clause-initial FRs are parsed with
a higher error rate than those with clause-final FRs. The same problem occurs
with sentence-initial FRs assigned accusative by the matrix verb.
3. The brackets around the asterisk indicate the disagreement about the data
between German native speakers.
4. This explanation has already been given by Groos & van Riemsdijk (1981).
5. The question arises as to whether such effects of homophony can also be found
with lexical items other than pronouns. A case in point could be the following:
(i) a. (?)Ich lade ein, wessen Eltern ich vertraue

I invite who-GEN parents-DAT I trust
b. ??Ich lade ein, wessen Geschwistern ich vertraue
I invite who-GEN siblings-DAT I trust
The surface form of Eltern is the same in all four morphological cases of Ger-
man, while it is Geschwistern only in the dative, and Geschwister in nominative,
accusative and genitive. This is crucial for the judgements in (i). While German
native speakers may disagree on the grammaticality status of (i-a), they agree
that (i-b) is significantly worse than (i-a). Whatever explanation might be found
for this fact, it is obvious that the differing morphological paradigms for Eltern
and Geschwister play a role.
6. I cannot make a statement about where these two variants originate. As far as I
can see, it has nothing to do with different regions. I also cannot affirm that it is
a matter of different generations or social class.
7. For a broad range of arguments that German dative is not a structural case, con-
trary to what has often been claimed, see Vogel & Steinbach (1998).
8. The FR pronoun always realises the FR internal case in the following examples.
9. I abbreviate the features as "wh" and "REL", without discussing the actual nature
of these features. These labels should be seen as informal cover terms.
10. A quite extensive discussion of the different distributions of free relatives and
embedded wh-clauses in English can be found in the first sections of Bresnan &
Grimshaw (1978).
11. FRs in preverbal position should be analysed as instances of left dislocation, as
Alexiadou & Varlokosta (1995) convincingly demonstrate.
12. The main empirical reason for the assumption of a DP-CP structure is the match-
ing effect. We will see that it can also be derived in a "bare CP" analysis.
13. The somewhat misleading term "head" is used very often to refer to the "an-
tecedent" of restrictive relative clauses. I will avoid it as much as I can.
14. Translation: "In a case conflict between the case K1 required by the matrix verb
and the case K2 required by the verb in the free relative clause, K1 can remain
unrealised if K1 precedes K2 in the following case hierarchy:
(CH) nominative < accusative < dative/prepositional case"
15. This strategy of accounting for ungrammaticality is quite frequent in optimal-
ity theory. It is called neutralisation. The winner of one competition is also the
winner of another competition - in this case it would be one with the headed
relative clause as input. And in this competition, (55) has an even better profile,
because it presumably violates fewer constraints than with an FR construction in
the input. For other applications of neutralisation in OT syntax, cf. Legendre et
al. (1995, 1998) and Keer & Baković (forthcoming).
16. Wiltschko (1999) presents many pieces of evidence that FR pronouns are seman-
tically indefinites.
17. Independent empirical motivation for this feature composition analysis can be
seen in the design of the FR pronoun in Modern Greek; cf. section 2.1.
18. Note that this requires a theory that clarifies which features can be "project-
ing" and which cannot. But this is not an extra complication. In the domain of
verbs, e.g., there is an ongoing debate in generative syntax about which morpho-
syntactic features of verbs are heads of their own projection, and which are not.
INFL has been split into TENSE and AGR and the latter has become controver-
sial; under debate are also ASPECT, NEGATION, VOICE, and others. I mention
this to show that the proposed mechanism does not bring in extra complications
compared to traditional approaches, at least in this respect.
19. This is reminiscent of the constraint "SILENT TRACE" of Pesetsky (1998).
FAITHfunc can be seen as composed of simpler constraints. It is, however, a dis-
junction, not a conjunction, we are dealing with here, because a single violation
of one of the combined constraints suffices to get a violation of FAITHfunc. In
order to distinguish matching languages from languages without FRs, we would
have to split FAITHfunc. I avoid this here only for ease of presentation. See Vogel
(in prep.).
20. It has become standard in minimalist syntax to assume that case features of DPs
are no longer present at LF. However, what still is present is the trace in the case
position, which means that we can always reconstruct a DP's case feature from
its chain. This will suffice for the present purpose.
21. I do not want MATCH to be a constraint on PF-input correspondence, because
in this case MATCH would also be violated by output 2 in (30). Both DP1 and
DP2 have the same DP as the input. This DP is assigned r-case. But DP1
realises m-case. Thus, DP1 is unfaithful w.r.t. its case feature. On the other
hand, if we look at the LF, then DP1 and DP2 build chains of their own and
each corresponds to its own PF element. There is a one-to-one mapping between
the case features. MATCH is obeyed. A violation of MATCH would have the
effect that in (30) output 2 would be "harmonically bounded" by output 1: Both
would have one violation of MATCH, but in addition output 1 would have one
violation of FAITHfunc, while output 2 would have three. In the account being
developed here, the two candidates behave alike w.r.t. all other constraints, so
output 2 would always lose against output 1.
22. I abbreviate the structures by only indicating the syntactic category of the
candidates and the overt case features that occur in them. Abstract case features are
printed in capitals, overt (PF-) case features in italics.
23. A short glossary for the abbreviation of the constraints in the following tableaux:
MAT = MATCH; Ff = FAITHfunc; Uc = UNIcase; REALo = REALobl; REALd
= REALdat; REALa = REALacc; REALn = REALnom. These abbreviations are
only introduced to keep the width of the tables small enough.
24. That there is no between constraints (as, e.g., between MATCH and
REALobl) only means that we have no evidence from the given data for the
ranking of two constraints with respect to each other.
25. I know of no example of this kind from attraction languages. But even in German
this construction poses additional complications, which I do not have an answer
to yet. See footnote 5 for one example.
26. It is impossible to construe an example with a resumptive pronoun, because the
chain of the FR pronoun has only one link.
27. The situation with matrix PPs will only be mentioned briefly. GEN may not erase
lexical items from the input, so the only possible outputs are those variants in which
the preposition is preserved. This eliminates the candidate in which the wh-pronoun
realises r-case, and not the PP:
(i) Ich schreibe *wer / über einen, der mir gefällt
I write who-NOM / about one, who-NOM me-DAT pleases
m-case=PP
r-case=NOM MAT REALo Ff Uc REALa REALn
a. [CPP * *
b. [CP P ... nom *! *
☞ c. [PP P [CP * * *
28. The typological predictions of the proposed model have been calculated with
the help of the constraint ranking software OTSOFT, developed by Bruce Hayes
(Hayes 1998). Only FRs with case conflicts are taken into account here, so this
typology does not differentiate between languages with only matching FRs and
those without any FRs. The fully developed typology is presented in Vogel (in
prep.).
29. See Alexiadou & Varlokosta (1995) for a discussion of Modern Greek FRs.
30. These complications have to do with the fact that in Spanish and Romanian ani-
mate direct objects are realised with an obligatory preposition, while inanimates
are not. As a consequence of this, animate direct objects require matching, while
inanimates do not. The latter behave like nominatives (they are morphologically
indistinguishable from nominatives anyway). There are several possibilities to
account for this in the present system. The above ranking is only one option,
and it requires additional assumptions about inanimate direct objects. See Grosu
(1994) for a detailed discussion of the data. I will present my account of this
problem in Vogel (in prep.).
31. One could also state that this is the ranking for German A. The "implausible"
candidates occur in those cases where German A has no FRs. We might then
assume that an FR that suppresses dative or genitive wins the competition, but
is ungrammatical for independent reasons, e.g., uninterpretability because of the
deletion of semantically necessary oblique case features.
32. The problem is that in order to rule out a configuration with m-case=ACC
and r-case=NOM, we have to rank REALacc above FAITHfunc, but we then
wrongly rule out FRs with the configuration [m-case=ACC, r-case=DAT].
33. One could, for example, argue that these case forms are like lexical items. That
dative case in German can make its own specific semantic contribution to a clause
has been shown by Wunderlich (1996), among others. Pesetsky (1998) uses
a constraint called RECOVERABILITY to account for a related phenomenon in
Polish relative clauses: complementiser-introduced relative clauses require a re-
sumptive pronoun in the case position of the (phonetically empty) relative opera-
tor's chain if it is assigned oblique case. Pesetsky also assumes that oblique cases
have a semantic contribution that cannot be recovered if they are not realised at
PF.
34. The version of German that was proposed by Groos & van Riemsdijk (1981)
would be a matching language like (54-c).
References
Alexiadou, Artemis — Spyridoula Varlokosta

1995 The syntactic and semantic properties of free relatives in Modern Greek.
ZAS Working Papers in Linguistics 5: 1-30.
Bresnan, Joan — Jane Grimshaw
1978 The syntax of free relatives in English. Linguistic Inquiry 9: 331-391.
Carlson, Lauri
1977 Remarks on the syntax of free relative clauses. Ms., Linguistics and Phi-
losophy Department, MIT.
Chomsky, Noam
1986 Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam
1992 A Minimalist Program for Linguistic Theory. (MIT Occasional Papers in
Linguistics 1.).
Giannakidou, Anastasia — Jason Merchant
1996 On the interpretation of indefinite objects in Greek. Ms., Rijksuniversiteit
Groningen and University of California, Santa Cruz.
Grimshaw, Jane
1997a Projection, heads and optimality. Linguistic Inquiry 28: 373-422.
Grimshaw, Jane
1997b The best clitic: Constraint conflict in morphosyntax. In: L. Haegeman
(ed.) Elements of Grammar: A Handbook in Contemporary Syntactic
Theory, 169-196. Dordrecht: Kluwer.
Groos, Anneke — Henk van Riemsdijk
1981 Matching effects with free relatives: A parameter of core grammar. In:
A. Belletti et al. (eds.) Theory of Markedness in Generative Grammar,
171-216. Pisa: Scuola Normale Superiore di Pisa.
Grosu, Alexander
1994 Three Studies in Locality and Case. London/New York: Routledge.
Harbert, Wayne
1983 On the nature of the matching parameter. Linguistic Review 2: 237-284.
Hayes, Bruce
1998 OTSOFT.EXE. Optimality Theory Software, http://www.humnet.ucla.edu/humnet/linguistics/people/hayes/otsoft/otsoft.htm.
Keer, Edward — Eric Bakovic
forthcom. Optionality and ineffability. To appear in: Géraldine Legendre, Jane
Grimshaw and Sten Vikner (eds.) Optimality-Theoretic Syntax. Cam-
1995 Optimality in wh-chains. In: J. Beckman, L. Walsh Dickie and S. Urbanczyk
(eds.) Papers in Optimality Theory, 607-636. (University of Massachusetts
Occasional Papers in Linguistics 18.) GLSA, University of Massachusetts.
Leirbukt, Oddleif
1995 Über Setzung und Nichtsetzung des Korrelats bei Relativsätzen mit wer
im heutigen Deutsch. In: H. Popp (ed.) Deutsch als Fremdsprache: An
den Quellen eines Faches, 151-163. München: Iudicium.
1995 Faithfulness and reduplicative identity. In: J. Beckman, L. Walsh Dickie
and S. Urbanczyk (eds.) Papers in Optimality Theory, 249-384. (Uni-
versity of Massachusetts Occasional Papers in Linguistics 18.) GLSA,
University of Massachusetts.
Müller, Gereon
Pesetsky, David
1998 Some optimality principles of sentence pronunciation. In: P. Barbosa, D.
Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (eds.) Is the Best Good
Enough?, 337-384. Cambridge, MA: MIT Press.
Pittner, Karin
1991 Freie Relativsätze und die Kasushierarchie. In: E. Feldbusch (ed.) Neue
Fragen der Linguistik, Vol 1, 341-347. Tübingen: Niemeyer.
Pittner, Karin
1995 Regeln für die Bildung von freien Relativsätzen. Eine Antwort an Oddleif
Leirbukt. Deutsch als Fremdsprache 32(4): 195-200.
Pittner, Karin
1996 Attraktion, Tilgung und Verbposition: Zur diachronen und dialektalen
Variation beim Relativpronomen im Deutschen. In: E. Brandner and G.
Ferraresi (eds.) Language Change and Generative Grammar, 120-153.
Opladen: Westdeutscher Verlag.
Riemsdijk, Henk van
1998 Trees and Scions - Science and Trees.
http://mitpress.mit.edu/chomskydisc/riemsdyk.html.
Rooryck, Johan
1994 Generalized transformations and the wh-cycle: Free relatives and bare
wh-CPs. GAGL 37: 195-208.
Vogel, Ralf
in prep. Towards an 'optimal' typology of free relative clause constructions. Ms.,
University of Stuttgart.
Wiltschko, Martina
1999 Free relatives as indefinites. In: Shahin Kimary, Susan Blake and Eun-
Sook Kim (eds.) The Proceedings of the Seventeenth West Coast Confer-
ence on Formal Linguistics, 700-712. Stanford, CA: CSLI.
Wunderlich, Dieter
1996 Dem Freund die Hand auf die Schulter legen. In: Gisela Harras and Manfred
Bierwisch (eds.) Wenn die Semantik arbeitet, 331-360. Tübingen:
Niemeyer.
The Optimal Linking of Arguments:
The Case of English Psych Verbs
Anja Wanner
1 Introduction
Optimality Theory (OT) allows the assumption that the violation of universal
principles of grammar does not automatically lead to ungrammaticality. Every
violation of these principles has its price, and if the price is reasonable,
that is to say, if violating a low-ranked principle is the price paid for
not violating a higher-ranked one, the result will not be ungrammatical.
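The idea that a low-ranked violation can be an acceptable price may be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the article: the constraint names HIGH and LOW, the candidate labels, and the violation counts are purely hypothetical, and evaluation is modelled as a lexicographic comparison of violation profiles, which is one common way of rendering strict domination.

```python
from typing import Dict, List

# Hypothetical violation counts: candidate -> {constraint: number of violations}.
Tableau = Dict[str, Dict[str, int]]


def optimal(tableau: Tableau, ranking: List[str]) -> List[str]:
    """Return the candidate(s) with the best violation profile.

    Profiles are tuples ordered from the highest-ranked constraint down, so
    Python's tuple comparison models strict domination: one violation of a
    higher constraint outweighs any number of violations of lower ones.
    """
    def profile(candidate: str) -> tuple:
        return tuple(tableau[candidate].get(c, 0) for c in ranking)

    best = min(profile(c) for c in tableau)
    return [c for c in tableau if profile(c) == best]


# Illustrative candidates only; neither stands for a real structure.
tableau = {
    "cand_A": {"HIGH": 0, "LOW": 2},  # violates only the low-ranked constraint
    "cand_B": {"HIGH": 1, "LOW": 0},  # violates the high-ranked constraint once
}

winner_high_first = optimal(tableau, ["HIGH", "LOW"])
winner_low_first = optimal(tableau, ["LOW", "HIGH"])
print(winner_high_first, winner_low_first)
```

Under the ranking HIGH over LOW, cand_A wins although it violates LOW twice; reversing the ranking makes cand_B the winner, which is how the sketch renders language-specific ranking of the same universal constraints.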
The aim of this article is to suggest a way of extending the framework of
OT to another field of application: the mapping between lexical and syntactic
structures. It will be shown how to make OT the framework for a lexicon-
based linking theory. The advantages of such an approach will be pointed out
on the basis of English psych verbs, which are generally considered a dif-
ficult case for any projectional linking approach, i.e., any linking approach
that assumes that a verb's argument structure and syntactic class (transitive,
unergative, unaccusative) can be derived from its semantic representation in
a more or less straightforward manner (following universal mapping princi-
ples). The shortcomings of such an approach will be discussed in section 2 of
this article. Before we examine implicit instances of ranking in current link-
ing theories in section 4, we will present English psych verbs as a problem
for well-established linking principles such as the Thematic Hierarchy in sec-
tion 3. Section 5 will show how to turn this problem into an integral part of a
projectional linking theory within the OT framework. It will be assumed that
the core of an "optimal" linking theory lies in the interaction of linking rules
that apply between lexical semantic structure and argument structure. These
linking rules may overlap and even contradict each other (this will be called a
"mismatch situation"). The syntactic status of an argument (internal or exter-
nal argument) will depend on the language-specific ranking of these universal
linking rules. Finally, the preliminary ranking that results from investigating
one particular verb class in one particular language will be put into a wider
perspective in section 6.
2 Problems of Projectional Linking Theories
The central hypothesis of projectional linking theories is that
the syntactic behaviour of a verb - its argument structure in particular - is a
reflection of the verb's semantic representation. Critics of such an approach,
e.g., Borer (1994), argue that this would exclude the possibility of a verb
with a specific meaning showing different syntactic behaviour in different
languages. Even though these differences cannot be denied when it comes to
individual verbs, there is a strong tendency for verbs of coherent semantic
classes to form coherent syntactic classes, too (as a case in point, cf. Perlmut-
ter (1978) on passivization of verbs of specific semantic classes across dif-
ferent languages). Secondly, if one argues for a strict projectional approach,
one would not expect a particular verb to show variable syntactic behaviour
within one language. In English, for instance, a verb of manner of movement
can be used to express telic or atelic events, depending on the presence and
choice of the complement. Furthermore, every intransitive verb licenses an
NP complement in the shape of a cognate object or a reaction object; see
(1-a,b).
(1) a. They walked along the river./They walked themselves tired.
b. She smiled (her approval).
Thirdly, the influence of non-lexical categories must not be neglected. It has
often been described how the specification of T° or D° can determine the
event structure of the sentence (e.g., Verkuyl 1993, Tenny 1994). For instance,
verbs of consumption by their very nature normally denote telic events and
demand an object, but if this object is not measurable in space, the event
cannot be measured in time, i.e., the event becomes atelic; compare (1-c).
(1) c. She ate three apples/*apples within five minutes.
Fourthly, one would not expect verbs of a given semantic class to show vari-
able behaviour in one language. One typically refers to psych verbs to il-
lustrate this kind of argument: If there is anything like a canonical syntactic
representation of arguments, one would not expect a crosswise realization of
EXPERIENCER and THEME as in (1-d,e).
(1) d. [th The storm ] frightened [exp the children ].
e. [exp The children ] feared [th the storm ].
Facing these counter-arguments, how can a projectional approach be maintained?
In other words: How can we make it flexible enough to cover
single-language variability as well as cross-linguistic variation?
First of all it is important to differentiate between systematic syntactic flex-
ibility and non-predictable syntactic behaviour of verbs. In her descriptive
study of the syntactic behaviour of semantically coherent English verb classes
Levin (1993) convincingly shows that the syntactic distribution of a verb de-
pends to a large extent on its semantics. Verbs of coherent semantic classes
show similar syntactic behaviour. Any projectional linking theory will have
to isolate those parts of the semantic representation that matter to syntax.
"Relating to cognitive processes" might not be a syntactically relevant se-
mantic feature, while "movement towards a goal" might be. To find out about
the syntactically relevant parts of lexical semantics, one usually looks at the
structural parts of lexical representations first. It is generally accepted that
these should take the form of lexical templates, structured by aspectual pred-
icates (like DO, BECOME, CAUSE). 1 A potential for syntactic variability may
arise if one assumes that linking regularities apply to parts of lexical struc-
ture rather than to the templates as such. A single verb may thus be subject
to different linking rules, and there is nothing that would prevent a conflict
of linking rules. Optimality Theory seems to be the ideal framework for a
linking theory that is able to handle such a conflict situation. Optimality The-
ory also allows for cross-linguistic differences in the application of universal
linguistic principles. While it is not quite clear in the standard framework
of generative grammar which universal principle allows the formulation of
which parameter, Optimality Theory makes a clear statement about where
variation is to be expected, and which forms it may take. Focussing on Eng-
lish psych verbs, this article sets out to sketch how a linking theory that is
based on the framework of Optimality Theory might be organized.
3 A Case in Point: English Psych Verbs
Let us first look at the syntactic behaviour of psych verbs (like fear, frighten,
love, delight), which are characterized by the semantic roles they assign to
their two arguments, EXPERIENCER and THEME. In the following it will be
discussed whether these verbs form a coherent class according to semantic
and syntactic criteria.
380 Anja Wanner
3.1 Defining Psych Verbs
Psych verbs are those verbs that express the perception of an emotion, the
content of which is lexically fixed (cf. fear, frighten, enjoy, surprise), or the
causation of such a perception. That is to say, the emotion itself is not realized
as an argument of the verb (*Harry feared great fear of the dog), because it is
an integral part of the verb's meaning (and quite often the verb takes its name
from the emotion expressed). According to this criterion, verbs like decide,
think, and persuade do not belong to the class of psych verbs.
In most cases psych verbs are two-argument verbs. The emotion that is felt
is usually ascribed to human beings only. In terms of θ-Theory they bear the
semantic role of an EXPERIENCER. With psych verbs of the type fear the
EXPERIENCER is realized in subject position, while it is in the position of the
object with psych verbs of the type frighten. The second argument of psych
verbs is either the target or the cause/stimulus of the emotion (see the verbs
in (2-a,b)). In the latter case the second argument can be interpreted as an
AGENT if it is a human being (see the second example in (2-b)).
(2) a. Type fear (EXPERIENCER + target of emotion):
admire, abhor, adore, deplore, detest, enjoy, envy, fear, hate, like,
love, regret
Harry enjoyed the party.
Your sister needn't fear the exam.
b. Type frighten (EXPERIENCER + cause of emotion):
agonize, anger, disappoint, disgust, encourage, frighten, humiliate,
terrify
The article in the Times angered Sally greatly.
Sally frightened her little brother (deliberately) (by putting on an
African mask).
This semantic difference in the interpretation of the second argument of psych
verbs is sometimes neglected. Tenny (1987: 504), for example, makes the
claim that "the thematic role of the arguments in internal and external position
is exactly the same." The second argument of psych verbs is classified as
THEME, no matter whether it is the cause or the target of emotion. Essentially,
this makes the θ-grid [EXP, TH] the defining characteristic of psych verbs. It
seems almost heretical to question this criterion for establishing a common
class of psych verbs.
The following discussion will show that psych verbs do not form a homo-
geneous class according to structural criteria. What they have in common
is an incorporated argument: If anything, the emotion itself qualifies as the
THEME of the verb (in the original sense of THEME as the object in motion
or being located). The psych verbs of the fear type make a statement about
who feels what, i.e., about the location of an emotion (compare the possi-
ble paraphrases feel fear, feel love, etc.), while the frighten verbs express the
causation of such an emotion (accordingly, they can be paraphrased as cause
fear, cause delight, etc.), which - following Jackendoff's (1983) Thematic
Relations Hypothesis - can be analyzed as a change of location, caused by an
(animate or inanimate) object or an event.
Psych verbs are generally considered puzzling objects for a linking theory
because of the crosswise realization of their arguments. According to tradi-
tional linking principles like the Thematic Hierarchy, one would expect the
two arguments of psych verbs to be realized in the same way; more specifically,
one would expect the EXPERIENCER to be in the subject position in
both cases because it is the thematically higher argument of the verb.
(3) Thematic Hierarchy:
AGENT > INSTRUMENT > EXPERIENCER >
SOURCE/GOAL/LOCATION > THEME
From this perspective, the frighten-verbs are the more problematic subtype of
psych verbs because the thematically lower element is realized in the syntactically
higher position, while the pattern of argument realization of the fear-verbs
agrees with the Thematic Hierarchy. Even if one puts forward a relativized
form of the Thematic Hierarchy (as in (4)) and generates the subject
of frighten within the VP (in a position where it is asymmetrically c-commanded
by the EXPERIENCER argument), there remains the problem of non-identical
patterns of argument realization for verbs with supposedly identical argument
structures. 2
(4) Given a theta-grid [Experiencer, Theme], the Experiencer is projected
to a higher position than the theme.
(Belletti & Rizzi 1988: 344)
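The prediction of the Thematic Hierarchy in (3) for a two-argument verb can be sketched in a few lines of Python. This is a hedged illustration, not a formalization proposed in the literature: the numeric ranks, the function name predicted_subject, and the simplifying rule "the highest role becomes the subject" are assumptions made here, while the observed subject roles for fear and frighten are those described in the text.

```python
# Tiers of the Thematic Hierarchy in (3); a lower number means a higher role.
# SOURCE, GOAL and LOCATION share one tier, as in (3).
RANK = {
    "AGENT": 0,
    "INSTRUMENT": 1,
    "EXPERIENCER": 2,
    "SOURCE": 3,
    "GOAL": 3,
    "LOCATION": 3,
    "THEME": 4,
}


def predicted_subject(theta_grid):
    """Assumed linking rule: the thematically highest role is the subject."""
    return min(theta_grid, key=lambda role: RANK[role])


# Both subtypes are assumed to share the theta-grid [EXP, TH].
THETA_GRID = ["EXPERIENCER", "THEME"]

# Subject roles as described in the text for the two subtypes.
OBSERVED_SUBJECT = {"fear": "EXPERIENCER", "frighten": "THEME"}

mismatches = [
    verb
    for verb, role in OBSERVED_SUBJECT.items()
    if role != predicted_subject(THETA_GRID)
]
print(mismatches)  # ['frighten']
```

With a single shared theta-grid, the sketch can only ever predict one pattern, so one of the two subtypes necessarily comes out as a mismatch; this is the crosswise-realization problem in miniature.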
In contrast to Belletti & Rizzi, I will assume that the argument structure
of psych verbs follows from their semantic representation (rather than from
idiosyncratic Case marking abilities). Differences in the projection of argu-
ments will not be considered counter-examples for a projectional linking ap-
proach. They will be taken as an indicator for underlying differences in the
verbs' semantic structures instead.
3.2 The Argument Structure of Psych Verbs
There is no point in discussing argument realization problems if we do not
look into the assumed argument structure of the verbs in question first. If it
could be shown that verbs like fear and frighten do not have the same argu-
ment structure, there would be no problem with the Thematic Hierarchy in
the first place.3 I will assume that the mapping between argument structure
and D-structure is transparent. The external argument is generated in a po-
sition different from (and asymmetrically c-commanding) the base position
of the internal argument. According to this assumption we can apply either
argument structure diagnostics or D-structure tests to find out about the syn-
tactic class of a verb. Are psych verbs ordinary transitive verbs like break, or
do they have two internal arguments, which would make them double unac-
cusatives?
In the case of English psych verbs the argument structure tests in (5) indi-
cate that both subtypes belong to the class of transitive verbs: (5-a,b) show
that passivization is possible, which is a well-established diagnostic for the
presence of an external argument (cf. Perlmutter 1978).
(5) a. The children were frightened by the sudden noise in the garden.
b. Those pictures are loved by most people.
(5-c,d) illustrate the application of a diagnostic for the presence of an NP
complement. Levin & Rappaport (1986) have demonstrated that an NP can
function as the external argument of an attributive adjectival passive only
if it is the complement of the verb. According to their "Sole Complement
Generalization" psych verbs have one and only one complement, which is
the EXPERIENCER in the case of the frighten-verbs and the THEME in the
case of the fear-verbs.
(5) c. the frightened children, *the frightened noise
d. some well-loved pictures, *those unloved people(EXP)
The data in (5-c,d) are complemented by nominalization data. As illustrated
by (5-e,f), psych verbs quite freely allow the formation of -er-nouns. This
indicates that the EXPERIENCER is an external argument in the case of the
fear-verbs, while the frighten-verbs project the THEME as their external argument.
(5) e. admirer (of Jane Austen), lover (of good music), dog-hater
f. baffler, disturber, enchanter, startler (Levin 1993: 190)
It seems, then, that both subtypes of psych verbs are transitive verbs and that
the two arguments are indeed realized crosswise.
Some diagnostics, however, seem to point in another direction. On the basis
of data like (5-g,h) Grimshaw (1990) argues that psych verbs of the frighten-
class do not have an external argument. In contrast to fear-verbs, they do not
form compounds in which one of the two arguments (the internal argument)
is incorporated.
(5) g. *a child-frightening storm
g'. *a storm-frightening child
h. *some person-loving music
h'. a music-loving person
According to Grimshaw argument realization is a cyclic process, the smallest
domain of which is the word. If predicate and internal argument form a
domain that excludes the external argument, it should be possible to form a
deverbal compound within which only the internal argument of the verb is re-
alized. If there is a compound derived from a transitive verb, it is not possible
to realize the external argument within the compound without the realization
of the internal argument (heart-breaking music, *music-breaking of hearts).
Nor can there be a compound on the basis of a verb with two arguments of
the same rank (i.e., two internal arguments), since word structures are always
binary (*book-giving to children, *children-giving of books).
The data in (5-g,h) are taken as support for Grimshaw's hypothesis that the
two subclasses of psych verbs have different argument structures: While the
fear-verbs take the EXPERIENCER as their external argument, the frighten-verbs
do not have an external argument at all. However, there are native
speakers who report a clear difference in acceptability between (5-g) and
(5-g') - with a strong tendency to prefer the compound into which the
EXPERIENCER is incorporated (i.e. (5-g)). This would be in line with the original
assumption that the two arguments of psych verbs are not of the same rank in
the case of verbs like frighten: If it is "better" to realize the EXPERIENCER
argument within the smallest domain, the syntactic status of the EXPERIENCER
is lower than that of the THEME - and the conflict with the Thematic
Hierarchy remains. 4 Contrary to Grimshaw (1990), we will therefore assume
that the two subclasses of psych verbs have the same argument structure and
belong to the class of transitive verbs.
3.3 Semantic-Aspectual Subclassification
On the basis of a common θ-grid the fear-verbs and the frighten-verbs are
claimed to belong to the same semantic class. However, if one looks at the
meaning of these verbs more closely, it is obvious that the frighten-verbs express
a change of state in the EXPERIENCER argument, while the fear-verbs
rather behave like states. Consequently, the verbs love, hate, fear can be
phrased as feel love, feel hate, feel fear, while the verbs delight, anger and
horrify can be paraphrased as cause sb. to feel delight, cause sb. to feel anger,
cause sb. to feel horror. In some cases the causative component is reflected
by the morphological structure of the verbs; see the examples in (6), which
are denominal derivations containing a causative suffix.
(6) fright-en, agon-ize, horr-ify
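The two paraphrase patterns just described can be written down as a small data structure. The following Python sketch is only an informal illustration: the class PsychVerb and its fields are hypothetical names introduced here, and the paraphrase strings simply reproduce the patterns "feel X" and "cause sb. to feel X" from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PsychVerb:
    """Hypothetical record for a psych verb and the emotion it names."""
    verb: str
    emotion: str
    causative: bool  # True for the frighten type, False for the fear type

    def paraphrase(self) -> str:
        # fear type: "feel X"; frighten type: "cause sb. to feel X"
        core = f"feel {self.emotion}"
        return f"cause sb. to {core}" if self.causative else core


# Verbs and emotions taken from the paraphrases given in the text.
VERBS = [
    PsychVerb("love", "love", causative=False),
    PsychVerb("hate", "hate", causative=False),
    PsychVerb("fear", "fear", causative=False),
    PsychVerb("delight", "delight", causative=True),
    PsychVerb("anger", "anger", causative=True),
    PsychVerb("horrify", "horror", causative=True),
]

paraphrases = {v.verb: v.paraphrase() for v in VERBS}
for verb, text in paraphrases.items():
    print(f"{verb}: {text}")
```

The single boolean field is a deliberate simplification: it records only the presence of a causative component and says nothing about how that component is represented in a lexical template.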
This difference in the aspectual structure of the event has been dealt with
in a variety of ways. While Tenny claims that thematic roles should be re-
placed by what she calls "aspectual roles" altogether ("Aspectual Interface
Hypothesis"), see (7-a,b), other linguists have tried to complement the The-
matic Hierarchy by some more aspectually oriented linking principle.
(7) a. The mapping between thematic structure and syntactic argument
structure is governed by aspectual properties. [...] Only the aspectual
part of thematic structure is visible to the syntax.
(Tenny 1992: 2)
b. An argument can provide a measure, a path, or a terminus for the
event described by the verb. These three ways of participating in
aspectual structure may be thought of as aspectual roles.
(Tenny 1994: 94)
For instance, Grimshaw (1990) argues for the assumption of a hierarchical
"Aspectual Tier", which has the CAUSER as its highest position. Thus, the
THEME-subject of the frighten-verbs has a low thematic but a high aspectual
status, which makes the question of why it is realized as the subject less of a
riddle (for a more detailed account of Grimshaw's suggestion, see below).
Apart from the presence of a causer, a causative verb always implies a
change of state in a second argument. In the case of the frighten-verbs this is
the EXPERIENCER. It is important to keep this in mind when we talk about the
causation of the experience of an emotion. Otherwise one might feel tempted
to mix up linguistic information and world knowledge. If I fear dogs, it is
reasonable to assume that I feel fear whenever I see a dog and thus the dog
might be considered some kind of causer of my fear. Alternatively, one might
consider the EXPERIENCER some kind of causer of the emotion, as tentatively
suggested by Pinker (1989: 142), who relates psych verbs to perception
verbs, asking: "What is the cause in an act of perception? Is it the perceiver,
because he or she must be engaged in mental activity [...]? Or is the stimulus
the cause, because its salient properties call attention to itself?" After all,
both types of psych verbs can be complements of verbs like decide, which
indicates some kind of control potential in the EXPERIENCER argument; see
(8-a,b). 5
(8) a. Harry decided not to fear the presentation.
b. The children decided not to get frightened.
However, this causation of fear is not built into the meaning of the verb fear,
while the causative equivalent always implies that the emotion was actually
felt; see the contrast in (8-c,d).
(8) c. The article on AIDS upset/frightened/worried Sally, *yet she didn't
feel upset at all.
d. Sally loved/cherished/adored her little niece, yet she was angry
when she saw her.
It can be concluded that the frighten verbs - but not the fear verbs - express
a change of state in the EXPERIENCER argument. The causer of this change
is the second argument, the THEME. It seems, then, that the two subtypes of
psych verbs do not form a coherent class according to aspectual criteria.
The aspectual differences between frighten and fear hold for other verbs
of the two subtypes of psych verbs, too. The verbs listed in (2-a) - those
that project the EXPERIENCER as subject - do not express a change of state,
while all the verbs of the frighten type (the ones in (2-b)), whose pattern of
argument realization is a problem for the Thematic Hierarchy, are causative.
It seems reasonable to conclude that these underlying differences in the verbs'
semantics are the key to explaining the difference in argument realization.
The data in (9) and (10) support the assumption that the two subgroups
of psych verbs belong to different aspectual classes. Only the frighten-verbs
are compatible with frame adverbials, which refer to the time span of the
event until its culmination. The fear-verbs, on the other hand, do not license
a resultative predicate, they cannot easily be replaced by a dynamic proform
(like do), and they are not compatible with the progressive.
(9) a. The little boy feared the neighbour's dog (*within a minute).
b. *The little boy was fearing the storm.

c. *The little boy feared the dog into a frenzy/breathless.
d. ??What the little boy did was fear the dog.
(10) a. Smilla's personality fascinated him (completely) (within a minute).
b. The storm was frightening the children.
c. The storm frightened the children out of their wits.
d. What the stranger did was frighten the children.
Differences in lexical aspect or aktionsart are represented in the verb's lexical
semantic structure. We will essentially follow Dowty (1979), who analyzes
lexical causatives (accomplishments) as complex predicates expressing
a causal relation between two subevents, a causing activity (containing the
causer argument) and a caused achievement (containing the argument that un-
dergoes a change of state); see (11-a). Verbs like fear do not conform to this
type of event structure. As has been pointed out, there is no change of state in
either of the arguments (thus, there can be no causer of such a change), and
arguably there is no ongoing activity at all. On the other hand, they do not
represent true states, i.e., individual-level predicates in the sense of Kratzer
(1989). 6 Contrary to Dowty, who gives the same representation for temporary
states as for activities, we will regard temporary states as simple predicates
(see (11-b)), following Kratzer in assuming that the aspectual difference be-
tween individual-level and stage-level states will be represented by different
(non-thematic) event arguments.
(11) a. accomplishment: [[x (DO-STH)] CAUSE [y BECOME frightened]]
b. temporary state: [fear x, y]
In the case of the frighten-verbs x corresponds to the THEME; in the case
of the fear-verbs it corresponds to the EXPERIENCER. The latter situation is
what we would expect if the linking regularities that handle the mapping from
lexical semantic structure to argument structure were constrained by the re-
quirements of the Thematic Hierarchy only. Obviously, some aspect-related
principle is at work, too. Let us for a moment take up Grimshaw's suggestion
to establish an independent second hierarchy, the Aspectual Tier, in which
purely semantic notions like animacy are of no importance. If the highest po-
sition of this hierarchy is that of a causer, there will be a prominence clash of
the argument status of the THEME of the frighten-verbs. It would occupy the
highest position in the Aspectual Tier, but the lowest position in the Thematic
Tier.
(12) Thematic Tier: (agent (experiencer (goal/source/location (theme))))
Aspectual Tier: (cause (other (...))) (Grimshaw 1990: 24)
Since it is an external argument according to the argument structure tests in

(5), one is tempted to draw the conclusion that in a conflict between two
linking principles, the aspectual status is of higher importance. 7 Before we
examine this hypothesis in detail, let us first look at some other suggestions
on how to deal with the notion of weighing universal linking principles.
4 How to Deal with Mismatches
There are numerous approaches that try to tackle the argument structure of
psych verbs from a lexical point of view. 8 Most recent work recognizes the
possibility of conflict zones (or "mismatches") between two or more linking
principles when applied to a single item (e.g., Grimshaw 1990, Dowty 1991,
Levin & Rappaport Hovav 1995). However, explicit instructions on how to
solve such a conflict are not always given. This depends on the status mis-
matches are given within a linking theory. If they are considered unfortunate
constellations that should not occur if the linking principles one has estab-
lished are universally valid, the mismatch phenomenon will be pushed to the
grammar periphery. If, on the other hand, mismatches are considered an inte-
gral part of the interaction of linking principles, one will expect some discus-
sion of how to deal with them. It should be obvious that in a linking theory
within the OT framework mismatches would be essential to establishing the
language-specific ranking of constraints.
In Grimshaw's argument structure theory, linking conflicts are explicitly
mentioned. Each argument is placed in relation to its co-arguments into the
Thematic and the Aspectual Tiers as given in (12). If there is an argument
which is the highest ("most prominent") in both hierarchies, it gets the status
of the external argument; cf. Grimshaw (1990: 5). A mismatch situation arises
if an argument is the highest in only one of the two hierarchies (illustrated by
crossing lines in (13)). The outcome of such a mismatch situation is easy
to determine: According to Grimshaw's prominence criterion, the respective
argument will always be an internal argument. 9
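(13) [Diagram: two pairings of the positions on the Aspectual Tier and the
Thematic Tier. In one, the same argument is highest on both tiers;
A-Structure: (x (y)). In the other, the lines linking the two tiers cross,
since an argument is highest on only one tier; A-Structure: ((x, y)).]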
As clear as this procedure seems, there are some problems with this approach.
Leaving aside for a moment the empirical problem that our argument structure
tests do not confirm the conclusion that the THEME is an internal argument
in the case of the frighten-verbs, the theoretical basis of Grimshaw's
argument structure theory predicts some conflicts for which there is no ele-
gant solution. First of all the Aspectual Hierarchy is not really spelled out
(see (12)), which makes it impossible to assign an aspectual value to argu-
ments other than a causer. Secondly, the concept of two hierarchies only
makes sense if they are independent of each other. This is not the case in
the traditional understanding of θ-roles, since some θ-roles are explicitly
defined in terms of aspectual notions like dynamicity and action. Thirdly,
there is a general problem for Grimshaw's relativizing approach, pertaining
to the calculation of the argument structure of single-argument verbs. If there
is only one argument, this argument will automatically be the highest in both
hierarchies. Thus, single-argument verbs would always have an external ar-
gument, i.e., there would not be any unaccusative verbs. To rule out this over-
generalization Grimshaw has to make a concession, which includes an ele-
ment of weighting: "Apparently, more than relative prominence is involved;
some measure of absolute prominence must contribute too" (Grimshaw 1990:
39). She postulates that an argument which undergoes a change of state will
intrinsically have the status of an internal argument. It seems, then, that its
aspectual status overrules its thematic status - a situation that can be handled
more elegantly within the OT framework.
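The relative prominence criterion and its overgeneralization for single-argument verbs can be illustrated with a minimal sketch. It is not Grimshaw's own formalization: the names `Argument` and `external_argument` are hypothetical, tier positions are given as plain numbers (1 = highest, as in (13)), and the ranks in the examples are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Argument(NamedTuple):
    name: str
    thematic_rank: int   # 1 = highest position on the Thematic Tier
    aspectual_rank: int  # 1 = highest position on the Aspectual Tier


def external_argument(args) -> Optional[str]:
    """Relative prominence only: the argument that is highest on both
    tiers is external; if the tiers disagree, there is no external argument."""
    top_thematic = min(args, key=lambda a: a.thematic_rank)
    top_aspectual = min(args, key=lambda a: a.aspectual_rank)
    if top_thematic == top_aspectual:
        return top_thematic.name
    return None  # mismatch: all arguments are internal


# frighten: the THEME is the causer (aspectually highest) but thematically
# lower than the EXPERIENCER, so the tiers disagree (assumed ranks).
frighten = [Argument("THEME", 2, 1), Argument("EXPERIENCER", 1, 2)]

# break: the AGENT is assumed to be highest on both tiers.
break_ = [Argument("AGENT", 1, 1), Argument("THEME", 2, 2)]

# Any single-argument verb: the sole argument is trivially highest on both
# tiers, which is the overgeneralization discussed in the text.
single = [Argument("SOLE-ARGUMENT", 1, 1)]

print(external_argument(frighten))
print(external_argument(break_))
print(external_argument(single))
```

Under these assumptions the sketch returns no external argument for frighten and always returns the sole argument of a one-place verb, so unaccusative verbs cannot be derived from relative prominence alone.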
Coming back to our own conclusion that the THEME of the frighten-verbs
is an external argument, one might want to suggest another way to interpret
the interplay of the two hierarchies. We could regard them as separate linking
systems whose demands on a single argument can be in conflict with each
other - a situation that seems to lend itself ideally to the application of the
notion of weightable principles or rankable constraints: In the presence of
another argument, a THEME can become the external argument, under the
condition that it is the causer of a change of state in the second argument. The
constraints on argument structure established by the Thematic Hierarchy can
be violated if this is the only way to "buy" a non-violation of the constraints

set out by the Aspectual Hierarchy - a causing THEME becomes the external
argument and an EXPERIENCER argument that undergoes a change becomes
the internal argument.
If, however, the aspectual status of an argument overrides its thematic sta-
tus, we will have to ask why we need a semantic hierarchy in the first place.
The answer lies in the behaviour of the fear-verbs: When there is no difference
in aspectual status, as in the case of fear-verbs, which express atelic
events without any progress or change whatsoever, how should we determine
which argument will be external (or if any of the arguments is external, for
that matter) without referring to some semantic notion such as "sentience" (a
term used by Dowty 1991) or EXPERIENCER? After all, there is no such verb
as the verb fear in (14), which projects the THEME as its external argument
(at least not in English). The argument structure given in (14) could be called
ungrammatical because it is not in line with the Thematic Hierarchy - with-
out compensating this violation by fulfilling linking principles that relate to
the aspectual structure of the event.
(14) *fear (x (y)), such that y fears x
Still, this modification in handling mismatches would require that the posi-
tions in the Aspectual Tier and the Thematic Tier be put in concrete terms
first.
Another framework that explicitly refers to mismatch situations is Dowty's
(1991) prototype model, whose core is the "Argument Selection Principle"
given below:
In predicates with grammatical subject and object, the ar-

gument for which the predicate entails the greatest number
of Proto-Agent properties will be lexicalized as the subject
of the predicate; the argument having the greatest number
of Proto-Patient entailments will be lexicalized as the direct
object. (Dowty 1991: 576)
Unlike Grimshaw, Dowty does not take universal hierarchies as a starting

point, but looks at the semantic and aspectual properties of subjects and ob-
jects of "stable" transitive verbs, i.e., verbs whose arguments are realized as
subject and object in the same way in a variety of languages. The syntac-
tic status of a verb's arguments depends on their closeness to a prototypical
agent (realized in subject position, or, we might add, as external argument) or
a prototypical patient (realized in object position). The prototypical subject
(proto-agent) is characterized by the properties given in (15-a), while the
prototypical object (proto-patient) has the properties listed in (15-b) (and some
others, which I leave out because they cannot be applied to psych verbs). 10
(15) a. Contributing properties for the agent Proto-Role:
A1. volitional involvement in the event or state
A2. sentience (and/or perception)
A3. causing an event or change of state in another participant
b. Contributing properties for the patient Proto-Role:
P1. undergoes change of state
P2. incremental theme
P3. causally affected by another participant (Dowty 1991: 572)
It is quite clear that not every subject or object can be characterized by the
full bundle of these properties. What matters is if the argument in question
is closer to the proto-agent or the proto-patient. This is determined by count-
ing the number of properties that can be assigned. We might consider it a
mismatch situation if a single argument has the same number of P- and A-
properties, i.e., if it is in the middle between these two thematic poles. Again,
in the case of psych verbs the most obvious conflict occurs in the evaluation
of the THEME argument of frighten-verbs, as can be seen in (16-b). If we take
into consideration the possibility of volitional involvement of the EXPERIENCER
argument, we have the same number of A- and P-properties for the
EXPERIENCER. For one thing, this tells us that no matter whether the argu-
ment will end up in object or subject position, it is far from being a typical
AGENT or PATIENT. In contrast to this mismatch situation, the EXPERIENCER
of fear-verbs is assigned A-properties only, see (16-a). Thus, we should expect
more variation in the realization of arguments across languages in the
case of frighten-verbs. On the other hand, the THEME of the frighten-verbs
can be assigned only A-properties, while the THEME of the fear-verbs does
not qualify for the assignment of any of the above-mentioned properties at
all. 11 It seems, then, that the arguments of psych verbs do not exactly qualify
as prototypical subjects or objects in either case.
(16) a. The children_x feared the storm_y
x: A2, (A1)
y: -
b. The storm_y frightened the children_x
x: A2, (A1), P1, P3
y: A3
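The counting procedure behind (16) can be sketched as follows. This is only an illustration of the bookkeeping, not Dowty's formulation: the function names are hypothetical, the optional property A1 is included for the EXPERIENCER, and a "mismatch" is taken to mean an equal, non-zero number of A- and P-properties, which is an assumption made here to keep arguments without any properties apart.

```python
# Proto-role properties from (15); labels follow the numbering used there.
PROTO_AGENT = {"A1", "A2", "A3"}
PROTO_PATIENT = {"P1", "P2", "P3"}


def profile(properties):
    """Return (number of A-properties, number of P-properties)."""
    props = set(properties)
    return len(props & PROTO_AGENT), len(props & PROTO_PATIENT)


def is_mismatch(properties):
    """Assumed definition: equal and non-zero counts of A- and P-properties."""
    n_agent, n_patient = profile(properties)
    return n_agent == n_patient and n_agent > 0


# Property assignments as in (16), with the optional A1 included.
fear = {"children": {"A1", "A2"}, "storm": set()}
frighten = {"children": {"A1", "A2", "P1", "P3"}, "storm": {"A3"}}

for verb, arguments in (("fear", fear), ("frighten", frighten)):
    for name, props in arguments.items():
        print(verb, name, profile(props), "mismatch:", is_mismatch(props))
```

With these assignments only the EXPERIENCER of frighten comes out as a mismatch; a pure count gives no indication of which of its properties should decide its syntactic position.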
But how do we know where the arguments of frighten will end up in a spe-
cific language? Again we are confronted with the necessity of modifying a
general linking principle towards the idea that in a mismatch situation not all
demands are equal, as Dowty himself concedes: "I would not rule out
the desirability of 'weighting' some entailments more than others for pur-
poses of argument selection" (Dowty 1991: 574). And again, this element of
weighting could be handled best within the framework of Optimality Theory.
5 The Optimality Theory Answer
5.1 Prerequisites
Having presented two cases in which the notion of ranking within univer-
sal linking principles has found its way in through the back door, I will now
sketch a way to make this idea the basis of a linking theory within the OT
framework. Linking principles will be formulated as constraints on argument
structure. These constraints - which take the shape of linking rules - relate
either to positions in the semantic representation or to semantic properties.
Comparable candidates will be different argument structures (which have to
be in line with non-overridable structure principles). Since the possibilities
for the argument status of a given element are very limited (internal or ex-
ternal argument), there cannot be as much variation as between competing
candidates in syntax. Parallel to the status of X-bar Theory as a set of struc-
ture principles that restricts the kind of candidates to consider, we will take
for granted one universal structure principle each for argument structure and
for lexical semantic representation. The first one is well-established: There
can be maximally one external argument. The second principle, which I have
referred to as the "Identification Principle" elsewhere (Wanner 1999: 134f.),
relates to the complexity of the event: I will assume that one of the central
principles of linking between semantic structure and argument structure is
that each subevent has to be "identified" by at least one argument. 12 Con-
cerning the nature of linking rules, I would like to follow Levin & Rappaport
Hovav (1995), who have looked at patterns of argument realization of English
verbs extensively.
5.2 Psych Verbs and Linking Rules
For the sake of the argument, let us assume that all the constraints we need
are the linking rules listed in (17), which are in line with generally accepted
linking principles. Like Levin & Rappaport Hovav (1995) I have included a
DEFAULT-RULE that declares the presence of an internal argument the default
case. 13 In the linking theory of Levin & Rappaport Hovav (1995) this rule
has the status of an elsewhere condition. It cannot collide with any of the
specific rules, since it is only activated if none of them applies. This need
not necessarily be the case in the optimality framework, although it is quite
obvious that the default rule is likely to be ranked rather low (otherwise we
wouldn't have any verbs with external arguments).
(17) CAUSER-RULE: The argument identifying a CAUSE-subevent is an
external argument.
ACTOR-RULE: The first argument of DO is an external argument.
BECOME-RULE: The argument identifying a BECOME-(sub)event is
an internal argument.
CONTROL-RULE: An argument which potentially controls the event
is an external argument. 14
DEFAULT-RULE: Each argument of a verb is an internal argument.
The first three linking rules relate to structural information in the verb's lexical
semantic representation. The fourth rule tries to capture the special (if vague) treatment
that the EXPERIENCER is given in most linking theories; cf. Jackendoff (1987:
401), who considers the EXPERIENCER the argument of an "as yet unexplored
State-function having to do with mental states."
It has already been shown that the fear-verbs behave like (temporary)
STATES, while the frighten-verbs constitute ACCOMPLISHMENTS, which -
in the tradition of Vendler and Dowty - we assume to fall into two subevents;
see the templates in (11), repeated below.
(11) a. frighten: [[x (DO-STH)] CAUSE [y BECOME frightened]]
b. fear: [fear x, y]
In the case of frighten the arguments of the verb belong to two different
subevents. The THEME argument is the one to identify the first subevent
(causal subevent). As such it will be subject to the CAUSER-RULE. The EX-
PERIENCER, on the other hand, identifies the BECOME-subevent ("central
subevent" in terms of Levin & Rappaport Hovav 1995). As such it is subject
to the BECOME-RULE. Arguably it is also subject to the CONTROL-RULE,
which would create a rule conflict, since it cannot be an external and an in-
ternal argument simultaneously.
Whether the ACTOR-RULE is activated will depend on the kind of subject
we choose. Since frightening somebody does not necessarily presuppose that
any specific action takes place, we will consider this rule inactive or rather
"vacuously satisfied" in the case of psych verbs, particularly when the subject
is inanimate, as in The movie frightened the children.
The case is different with the fear-verbs. They do not have a complex aspectual
structure. Neither the CAUSER-RULE nor the BECOME-RULE can be
considered a relevant factor here; in other words: they are vacuously satisfied.
The only thing the fear-verbs have in common with the frighten-verbs is the
fact that the EXPERIENCER falls under the scope of the CONTROL-RULE, too,
which reflects the semantic relatedness of the two groups of verbs.
5.3 Ranking of Linking Rules
Let us now look at how to rank the linking constraints in (17) on the basis of
English psych verbs. Competing candidates are different argument structures,
as illustrated in the tableaux in (18). There are three comparable candidates
in the case of frighten (see (18-a)): Candidate A has an external THEME, candidate
B has two internal arguments, and candidate C has an external EXPERIENCER.
Candidate D is ruled out because a verb cannot have two external
arguments. (I have not included any candidates that do not contain an event
position since they would presumably be ruled out by independent argument
structure principles.)
The ranking of constraints can only be determined if the optimal candidate
is known. In syntax this is the grammatical sentence. Here, the optimal can-
didate is the argument structure of the verb as resulting from the application
of structural tests. From applying these tests we have concluded - contrary to
Grimshaw (1990) - that in English verbs like frighten project their "THEME"
(the argument identifying the causing subevent) as external argument. This
means that the argument structure given in A is the optimal candidate. Thus
we can conclude that the CONTROL-RULE is dominated by the BECOME-RULE
because otherwise the EXPERIENCER would not end up as an internal
argument. Alternatively, if we take into account the argument status of the
THEME, we might argue that the CAUSER-RULE dominates the CONTROL-RULE
since the CAUSER (and not the controlling EXPERIENCER) becomes
the external argument. The internal ranking of the CAUSER-RULE and the
BECOME-RULE is difficult to establish, since a case in which a single argument
is subject to both linking rules (thus identifying a causing and a central
subevent) is presumably ruled out by the principle of event identification. It
doesn't come as a surprise that the optimal candidate violates the DEFAULT-
RULE.
(18) a. Tableau:
The movie_x frightened        CAUSER-  ACTOR-  BECOME-  CONTROL-  DEFAULT-
the children_y                RULE     RULE    RULE     RULE      RULE
☞ A frighten: [e, (x (y))]    ✓                ✓        *         *
  B frighten: [e, ((x, y))]   *                ✓        *         ✓
  C frighten: [e, (y (x))]    *                *        ✓         *
  D frighten: [e, (x, y)]     cannot be generated (two external arguments)
In contrast to the situation in (18-a), the THEME of the fear-verbs is not subject
to the CAUSER-RULE, nor is the EXPERIENCER subject to the BECOME-RULE,
i.e., both of these constraints on argument structure are vacuously satisfied;
see (18-b).
(18) b. Tableau:
They_y feared the storm_x     CAUSER-  BECOME-  CONTROL-  DEFAULT-
                              RULE     RULE     RULE      RULE
  A fear: [e, (x (y))]                          *         *
  B fear: [e, ((x, y))]                         *         ✓
☞ C fear: [e, (y (x))]                          ✓         *
The EXPERIENCER ends up as the external argument by virtue of being subject
to the CONTROL-RULE. The grammaticality of candidate C vs. candidate
A thus illustrates the effectiveness of the CONTROL-RULE. The comparison
of candidate C with candidate B shows that a violation of the DEFAULT-RULE
is less severe than a violation of the CONTROL-RULE. Therefore, we can con-
clude that the former is dominated by the latter. Thus, the ranking of con-
straints that we can derive from the two tableaux in (18) is the following:
(19) CAUSER-RULE » DEFAULT-RULE
CAUSER-RULE and/or BECOME-RULE » CONTROL-RULE
CONTROL-RULE » DEFAULT-RULE
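The evaluation summarized in (18) and (19) can be reproduced with a small sketch. Everything in it is an illustrative assumption rather than part of the analysis itself: the names `Arg`, `violations` and `optimal` are hypothetical, the ACTOR-RULE is treated as inactive for psych verbs, and the total order in `RANKING` is just one linearization compatible with the partial ranking in (19).

```python
from typing import NamedTuple, Optional


class Arg(NamedTuple):
    name: str
    causer: bool = False      # identifies a CAUSE-subevent
    actor: bool = False       # first argument of DO
    undergoer: bool = False   # identifies a BECOME-(sub)event
    controller: bool = False  # potentially controls the event


def violations(args, external: Optional[str]):
    """Count violations of the linking rules in (17) for one candidate.
    A candidate is identified by its external argument (None = none)."""
    marks = dict.fromkeys(["CAUSER", "ACTOR", "BECOME", "CONTROL", "DEFAULT"], 0)
    for a in args:
        is_external = a.name == external
        marks["CAUSER"] += int(a.causer and not is_external)
        marks["ACTOR"] += int(a.actor and not is_external)
        marks["BECOME"] += int(a.undergoer and is_external)
        marks["CONTROL"] += int(a.controller and not is_external)
        marks["DEFAULT"] += int(is_external)
    return marks


def optimal(args, ranking):
    """Return the external argument of the optimal candidate.
    Comparing violation tuples lexicographically models strict domination."""
    candidates = [a.name for a in args] + [None]  # at most one external argument
    return min(candidates,
               key=lambda ext: tuple(violations(args, ext)[c] for c in ranking))


# One total order compatible with (19); the source fixes only a partial order.
RANKING = ["CAUSER", "ACTOR", "BECOME", "CONTROL", "DEFAULT"]

frighten = [Arg("THEME", causer=True),
            Arg("EXPERIENCER", undergoer=True, controller=True)]
fear = [Arg("THEME"), Arg("EXPERIENCER", controller=True)]

print(optimal(frighten, RANKING))  # THEME
print(optimal(fear, RANKING))      # EXPERIENCER
```

The violation counts computed here coincide with the marks in the two tableaux, so the sketch selects candidate A for frighten and candidate C for fear.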
Within the OT framework the problematic status of mismatches is resolved:

It is obvious that the internal hierarchy of constraints could not be determined
without a conflict between linking rules. Furthermore, it is possible to predict
variation in linking: If the hierarchy of constraints is different, a different

candidate will be optimal. We can even make a statement about the stability
of a verb's arguments across different languages. The arguments of fear are
not within the scope of the CAUSER-RULE and the BECOME-RULE. Thus, the
relative position of these rules within the grammar of a specific language does
not matter. The argument structure of the frighten-verbs, on the other hand,
crucially depends on the ordering of the aspectually accentuated linking rules.
Variation is more likely in this case.
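The prediction about variation can be checked by enumerating all total rankings of the four active constraints. The sketch below is a hedged illustration: the violation marks are read off the tableaux in (18), the ACTOR-RULE is left out because it is treated as inactive for psych verbs, and the names `TABLEAUX`, `winner` and `typology` are hypothetical.

```python
from itertools import permutations

CONSTRAINTS = ("CAUSER", "BECOME", "CONTROL", "DEFAULT")

# Violation marks from (18): 1 = violated, 0 = satisfied.
# A = external THEME, B = no external argument, C = external EXPERIENCER.
TABLEAUX = {
    "frighten": {
        "A": {"CAUSER": 0, "BECOME": 0, "CONTROL": 1, "DEFAULT": 1},
        "B": {"CAUSER": 1, "BECOME": 0, "CONTROL": 1, "DEFAULT": 0},
        "C": {"CAUSER": 1, "BECOME": 1, "CONTROL": 0, "DEFAULT": 1},
    },
    "fear": {
        "A": {"CAUSER": 0, "BECOME": 0, "CONTROL": 1, "DEFAULT": 1},
        "B": {"CAUSER": 0, "BECOME": 0, "CONTROL": 1, "DEFAULT": 0},
        "C": {"CAUSER": 0, "BECOME": 0, "CONTROL": 0, "DEFAULT": 1},
    },
}


def winner(verb, ranking):
    """Optimal candidate for a verb under one total ranking."""
    rows = TABLEAUX[verb]
    return min(rows, key=lambda cand: tuple(rows[cand][c] for c in ranking))


def typology(verb):
    """Map each attested winner to the list of rankings that select it."""
    result = {}
    for ranking in permutations(CONSTRAINTS):
        result.setdefault(winner(verb, ranking), []).append(ranking)
    return result


for verb in TABLEAUX:
    for cand, rankings in sorted(typology(verb).items()):
        print(verb, cand, len(rankings))
```

Under these assumptions the winner for fear depends only on the relative order of the CONTROL-RULE and the DEFAULT-RULE, whereas for frighten each of the three candidates is optimal under some ranking, and candidate A wins exactly when the conditions in (19) hold.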
6 Evaluation
On the basis of a particular verb class it was shown why and how the frame-
work of Optimality Theory could be used as the basis for a theory of argu-
ment linking. The OT-based linking model we argued for tries to integrate
important insights from current lexicon-based linking theories: It is based on
linking rules that constrain possible argument structures (like the ones formu-
lated by Levin & Rappaport Hovav 1995), its organization is inherently hier-
archical (as suggested by Grimshaw 1990), it takes into account the aspectual
function of arguments (which reminds us of Tenny 1987, 1992), it can be
used to make a statement about the prototypicality of an external or internal
argument (as pointed out by Dowty 1991), and it relates directly to the lexical
semantic representation of verbs, thus following Jackendoff (1987) in treating
θ-roles as inherently relational notions. While all of the linking models
we touched upon have to make concessions when it comes to linking con-
flicts arising from the collision of different linking principles, the framework
of Optimality Theory lends itself ideally to dealing with such mismatches. OT
gives us a framework to recognize conflicts between universal constraints as
necessary and desirable because otherwise we would neither be able to con-
strue the grammar of a specific language nor could we predict the potential
for cross-linguistic variation.
In the case of English psych verbs we essentially followed Grimshaw
(1990) in assuming that there are two aspectually different subclasses, each
with a homogeneous pattern of argument realization. What relates these two
classes is the presence of an argument that falls under the scope of the
CONTROL-RULE (generally labelled EXPERIENCER). Since the CONTROL-RULE
is dominated by the CAUSER-RULE, the EXPERIENCER cannot be projected
as external argument in the presence of a causer. Argument structure
tests confirmed that the causing THEME is indeed an external argument in the
case of frighten. The fear-verbs, on the other hand, do not express causative
events. Thus, the CAUSER-RULE is vacuously satisfied and the EXPERIENCER
becomes the external argument by virtue of falling under the scope of the
CONTROL-RULE.
It is obvious that we could merely sketch the outline of a linking theory
within the OT framework. For the sake of the argument we took the shape and
universal validity of linking constraints for granted. Since we examined only
one verb class we could not establish the ranking of linking rules unequivo-
cally. To check if the approach sketched here is superior to its competitors,
one would have to put different verb classes to the test and take into account
more (or different) linking principles. The (fragmentary) hierarchy of linking
rules that we established should of course be compatible with all verb classes
within one language. The high status of the CAUSER-RULE, for instance, is
confirmed by the fact that accomplishments (like break, build, eat) generally
are transitive verbs, projecting the causer argument as external argument, no
matter whether either of the two arguments of the verb is animate or not. On
the other hand, our set of constraints does not allow states with an external
argument if this does not fall under the scope of the CONTROL-RULE. 15
Apart from consistency within one language we would expect cross-
linguistic differences in the realization of arguments - as well as variation
along a historical dimension - to be attributable to different rankings of link-
ing constraints. If this hypothesis could be supported empirically, the case for
an OT-based linking theory would gain even more ground.
Notes
1. The notation used here basically follows that used by Levin & Rappaport Hovav
(1995).
2. The analysis argued for by Belletti & Rizzi (1988, 1991) hinges on the assumption
of a Case grid, which contains information about the idiosyncratic Case
assigning abilities of verbs, i.e., the ability to assign inherent Case. The argu-
ment structure of psych verbs ultimately follows from their idiosyncratic ability
to assign inherent Case to one of their arguments. For a critical discussion of this
approach, see Wanner (1999: 188ff.).
3. Within a projectional approach, however, this would only shift the problem to the
mapping from semantic structure to argument structure: One would have to ask
why members of one semantic class of verbs have different argument structures.
4. This leaves us with the question of how to explain the differences in acceptabil-
ity between (5-g) and (5-h). Restrictions on word formation processes can be
of a very fine-grained nature. In this case they might depend on the aspectual
differences between the two types of psych verbs.
5. In the linking approach developed here, this will be reflected by the assumption
of a specific linking rule (CONTROL-RULE).
6. For instance, they can appear as the complement of a perception verb, which is
not possible for true states.
7. We would have to establish first, of course, that the Thematic Hierarchy could
not simply be replaced by - or turned into - an Aspectual Tier.
8. For a list of the relevant literature, see Levin (1993: 188ff.). A Case-oriented
approach, which cannot be discussed here, is presented by Belletti & Rizzi (1988,
1991).
9. Following Grimshaw (1990), the external argument is marked by being sur-
rounded by only one set of round brackets in argument structure.
10. "Incremental Theme" is the term Dowty uses to refer to an object that changes
gradually as the event progresses, e.g., the complement in mow the lawn or build
a house.
11. In the example given the causing argument is inanimate and can be assigned only
one A-property. The situation is different, of course, when an agentive reading is
available (Sally frightened her little brother deliberately).
12. A similar principle ("Subevent Identification Condition") is put forward by
Rappaport Hovav & Levin (1998).
13. That any verb can take an internal argument is shown in Wanner (to appear). On
the other hand, not every verb has the capacity to take an external argument. I will
therefore consider the internal argument the default case of a thematic argument
licensed by a verb.
14. A controlling argument can be used in constructions like (8-a,b).
15. To explain which argument is projected as external argument in aspectually sym-
metric events one might have to include something like the notion of "internal
causation" as developed by Levin & Rappaport Hovav (1995: 91): "some prop-
erty inherent to the argument of the verb is 'responsible' for bringing about the
eventuality." It should be mentioned, however, that the criteria according to which
the "responsibility" of an argument is established are rather vague.
References
Belletti, Adriana — Luigi Rizzi

1988 Psych-verbs and θ-theory. Natural Language and Linguistic Theory 6:
291-352.
Belletti, Adriana — Luigi Rizzi

1991 Notes on psych-verbs, θ-theory, and binding. In: R. Freidin (ed.) Princi-
ples and Parameters in Comparative Grammar, 132-162. (Current Stud-
ies in Linguistics 20.) Cambridge, MA: MIT Press.
Borer, Hagit
1994 The projection of arguments. Ms. (Lecture notes, Girona International
Summer School in Linguistics).
Burzio, Luigi
1986 Italian Syntax: A Government-Binding Approach. (Studies in Natural
Language and Linguistic Theory.) Dordrecht: Reidel.
Dowty, David
1979 Word Meaning and Montague Grammar. Dordrecht: Reidel.
Dowty, David
1991 Thematic proto-roles and argument selection. Language 67: 547-619.
Grimshaw, Jane
1990 Argument Structure. (Linguistic Inquiry Monographs 18.) Cambridge,
MA: MIT Press.
Jackendoff, Ray
1983 Semantics and Cognition. (Current Studies in Linguistics 8.) Cambridge,
MA: MIT Press.
Jackendoff, Ray
1987 The status of thematic relations in linguistic theory. Linguistic Inquiry
18: 369-411.
Kratzer, Angelika
1989 Stage-level and individual-level predicates. Ms., University of Mas-
sachusetts, Amherst.
Levin, Beth
1993 English Verb Classes and Alternations: A Preliminary Investigation.
Chicago: University of Chicago Press.
Levin, Beth — Malka Rappaport
1986 The formation of adjectival passives. Linguistic Inquiry 17: 623-661.
Levin, Beth — Malka Rappaport
1989 An approach to unaccusativity mismatches. In: Proceedings of the 19th
Annual Conference of the North-Eastern Linguistic Society, 314-329.
Levin, Beth — Malka Rappaport Hovav
1995 Unaccusativity at the Syntax-Lexical Semantics Interface. (Linguistic In-
quiry Monographs 26.) Cambridge, MA: MIT Press.
Perlmutter, David
1978 Impersonal passives and the unaccusativity hypothesis. In: Proceedings
of the Annual Meeting of the Berkeley Linguistic Society 4, 157-189.
Pinker, Steven
1989 Learnability and Cognition: The Acquisition of Argument Structure.
Cambridge, MA: MIT Press.
Rappaport Hovav, Malka — Beth Levin
1998 Building Verb Meanings. In: M. Butt and W. Geuder (eds.) The Projec-
tion of Arguments: Lexical and Compositional Factors, 97-134. Stanford:
CSLI Publications.
Tenny, Carol
1987 The aspectual interface hypothesis. In: Proceedings of the 18th Annual
Meeting of the North-Eastern Linguistic Society, 490-508.
Tenny, Carol
1992 The aspectual interface hypothesis. In: I. Sag and A. Szabolcsi (eds.) Lex-
ical Matters, 1-27. (CSLI Lecture Notes 24.) Stanford: Stanford Univer-
sity Press.
Tenny, Carol
1994 Aspectual Roles and the Syntax-Semantics Interface. (Studies in Linguistics
and Philosophy 52.) Dordrecht: Kluwer.
Vendler, Zeno
1967 Linguistics in Philosophy. Ithaca, NY: Cornell University Press.
Verkuyl, Henk
1993 A Theory of Aspectuality. (Cambridge Studies in Linguistics.) Cam-
bridge: Cambridge University Press.
Wanner, Anja
1999 Verbklassifizierung und aspektuelle Alternationen im Englischen. (Lin-
guistische Arbeiten 398.) Tübingen: Max Niemeyer Verlag.
Wanner, Anja
to appear Intransitive verbs as case assigners. In: H. Janßen (ed.) Verbal Projec-
tions. (Linguistische Arbeiten.) Tübingen: Max Niemeyer Verlag.
Index of OT-Constraints
*EXP, 292
*FOCUS (*F), 166
*LX-MV (*Lexical Movement), 314
*MENG, 114
*PASTP/PV/+V, 307
*PRON (Avoid Pronoun), 43
*T, 216
*∅ (Avoid Null Parse), 49
ACCENTDOMAINFORMATION (ADF), 80
ACTOR-RULE, 392
ADJ-ISL (Adjunct Island Constraint), 48
ADJA (Adjacency), 180
AGENT (Agentivity), 180
ALIGN (Align Focus), 184
ANIM (Animacy), 179
ARGUMENT-OVER-PREDICATE (A/P), 80
BAR (Barriers Condition), 44
BAR 3, 60
BAR", 216
BECOME-RULE, 392
CAUSER-RULE, 392
CIS (Contiguity in Syntax), 130
CN2, 73
CNPC (Complex Noun Phrase Condition), 41
CONTROL-RULE, 392
CONTROL (Control Rule), 42
DAT(IVE), 93
DEFAULT-RULE, 392
DEP, 45, 356
DEPdat, 356
ECON (Economy), 183
EX-PRE (E), 154
FAITH(PASTP), 309
FAITH[COMP], 46, 47, 292
FAITH[WH], 50
FAITHfunc, 359
FAITH[Q], 299
FILL, 45, 60
FINALFOCUS (FF), 73
FI (Full Interpretation), 314
FOCUSPROMINENCE (FP), 81, 100
FOCUS (F), 164
H-NUC, 61
HAVE(CP), 294
HD-LFT (Head Left), 307
HD-LFT &D MORPH, 307
HD-RT (Head Right), 307
IDENT, 45
IN-DIS (I), 154
IND(EFINITES), 73
ISLAND-COND, 60
ISLAND, 127
IP-HEAD-RIGHT, 82
LEFTEDGECP (LEC), 127
LEFTEDGEPP (LEP), 131
LICENSING, 328
LOC-ANT (Local Antecedent), 38
MATCH, 360
MAX, 45, 356
MAXdat, 356
MIN-CHAIN (Minimize Chain Length), 44
MINDIS, 212
MORPH (Morphological Selection), 307
NEW, 73
NO PROPERTY (N-PR), 167
NOPHRASALSPEC (NPS), 139
OB-HD (Obligatory Heads), 314
ONEPROSODICWORD (OPW), 138
OP-SPEC, 39, 332
PARMOVE (Parallel Movement), 136
PARSESCOPE, 116
PARSE[SCOPE], 46, 47
PARSE, 45
PR-BD (Proper Binding), 314
PRONECON (Pronunciation Economy), 109
PROP-Q (Proper Q), 200
PURE-EP (Purity of Extended Projection), 292, 303
Q-MARK (Q-Marking), 298
Q-SCOPE, 298
QR (Quantifier Raising Constraint), 187
QUIB (Quantifier Induced Barrier), 183, 191
REALcase, 361
RECOV (Recoverability), 110, 372
REC (Reconstruction Constraint), 187
REF-ECON (Referential Economy), 38
RES (Resumptive Constraint), 41
S & I, 156
SCOPING, 328
SELECT (Selection), 303
SENTP, 213
SHORTEST PATHS, 61
SILENT TRACE, 60, 370
SIMS, 214
SL, 160
SP (Scope Principle), 183
SPC (Shortest Path Condition), 297
ST-L-DB (ST), 164
STAY, 39, 73, 315, 328
STAY*, 113
SUBCAT, 216
SUBJECT (S), 154
SUBJPRED, 214
T-LEX-GOV (Lexical Government of Traces), 39
TYPE (Principle of Clause Typing), 184
UNIcase (Uniqueness of Case Relations), 360
WH-IN-SPEC, 113
WRAP, 100
Index of Subjects
A-movement, 15
absolute ungrammaticality, 48-51, 312, 314, 357, 363
accent domain (AD), 78
addition, 170
adjectival passive, 382
adjunction, 295
adverbial clauses, 342
agent, 380, 381
Agent Topic (AT) construction, 22
agentivity, 182, 203, 380
Agree, 142
Aktionsart, 386
alternation, 310
ambiguity, 155, 170, 175, 327, 332, 335-337
anaphor, 2, 33
animacy, 179, 182, 203, 252, 344, 363, 381, 386
anti-complexity restriction, 123
Argument Selection Principle, 389
argument structure, 382
Aspectual Hierarchy, 388, 389
Aspectual Interface Hypothesis, 384
aspectual role, 384
Aspectual Tier, 384, 386-389
attraction, 109, 110, 112, 126, 142-144
Avoid Pronoun Principle, 7

b(ackground)-determined interpretation, 252, 261-264, 267-272, 274, 276
B-part (background), 252, 257, 259-261, 264, 265, 267, 270-272
Baker sentence, 20
bare determiner, 268
bare output condition, 8
barrier, 16, 44, 58, 117, 127, 128
basic focus rule, 78
basic word order, 181, 201, 252, 253
BF-structure (background-focus structure), 255
binding theory, 2, 260
Blocking Principle, 31
blocking syntax, 30-34
bounding node, 4
bridge verb, 39

c-construable, 77
candidate, 30, 36, 37, 45, 290
  argument structure as, 391, 393
  definition of, 154, 201, 285, 354
  derivation as, 28
candidate set, 27, 28, 30, 36, 205, 251, 283, 354, 355
  and input, 47
  definition of, 31, 37, 43, 154, 177, 191, 250, 251
  definition of - by input, 45
  size of, 45, 175, 312
Case
  abstract, 6, 344
  m-, 360
  r-, 360
Case attraction, 350, 352, 353, 363, 364
  optional, 351
Case Filter, 6
Case hierarchy, 350, 352, 353, 366, 370
causer, 384
chain, 46, 348, 350, 359, 371
  LF-, 359
chain interleaving, 16
choice function, 261-265, 268, 269, 272, 276, 277
clause typing, 182, 184
cliticization, 126
  in free relative clauses, 350
cognate object, 378
comparable chain link, 23
comparative formation
  in English, 31, 32
competition, 1, 28
complementary distribution, 3, 5, 6, 31, 32, 34
complementizer deletion
  and V/2, 58
  at LF, 14
complementizer drop, 286, 291-297
complementizer-trace effect, 3, 39, 40
complexity, 11, 29, 45, 170, 201, 313
compound, 383
CON, 355
CON r W, 164, 165, 168, 170
conceptual shift, 264
Condition on Extraction Domain (CED), 16
constraint
  derivational, 2, 4
  economy, 8
  faithfulness, 45, 47, 73, 283, 291, 294, 299, 301, 304, 311, 312, 355
  filter, 2
  global, 2, 5, 57
  hard vs. soft, 212, 219, 220, 234, 243, 245
  local, 2, 7, 27, 29, 35
  markedness, 69, 100, 283, 291, 295, 300, 301, 304, 311, 312, 355
  representational, 2
  transderivational, 8, 57, 283
  translocal, 7, 8, 57
  type and candidate type, 37
constraint profile, 35
constraint re-ranking, 36, 53, 178, 201, 202, 217, 299, 303, 327, 331, 395
context, 77, 81, 189, 203, 220, 228, 232, 240, 252, 254, 262-265, 267, 268
control, 6
Control Rule, 6
conventional implicature, 264
convergence of a derivation, 17, 334
copy theory of movement, 107, 109
correspondence, 341, 353, 355, 358, 368
crucial nonranking, 287
cumulative theory (CT), 159, 168
cumulativity, 36, 52, 61, 143, 151-171, 212, 215, 219, 220, 234, 240, 243
  and cyclic optimization, 202
  and same vs. different constraints, 246

D-structure, 2, 8, 178, 354, 382
dative
  as a lexical vs. structural Case, 178, 342, 345, 353, 362, 369
  shift, 266
deaccenting, 88-90, 95
definites, 71, 254, 261, 271, 272, 347
  superlative, 324, 335
deletion, 18, 107
denominal verb, 384
derivational ambiguity, 51, 296, 312
discourse pragmatics, 250
distributivity, 152
do-support, 37
Doubly-Filled Comp Filter, 116, 127

economy, 8, 29, 39, 108, 183, 185, 187, 193, 195, 203, 204, 283, 328, 330, 334
Economy of Representation, 27
ECP violation
  strength of, 52
Elsewhere Condition, 30, 33
emergence of the unmarked, 180
emotion
  perception of, 380
Empty Category Principle (ECP), 2, 3
epsilon operator, 262
  indexed, 262
Equidistance, 328, 334
Ersatzinfinitiv, 306-310
es sentences, 257
event
  atelic, 378
  - structure, 378
EX-PRE (factor), 152
excorporation, 138
Existential Axiom, 250, 254, 266, 267
existential closure, 325
experiencer, 378-385, 386, 389, 390, 392-396
expletive, 133
  -insertion, 134
Extended Projection Principle (EPP), 17, 26, 143
Extension Condition, 140
extraction from picture NPs, 219

F-marking, 72, 77, 97
F-part (focus (-affiliated)), 252, 257, 259, 261, 265-267, 270-272
factorial typology, 53, 117, 118
faithfulness, 46, 290, 292, 296, 300, 305, 310, 355, 357, 358
fatal violation, 35
feature
  strong, 25
  weak, 25
Feature Condition, 25
feature movement, 125, 140
Fewest Steps, 9
filter, 2
final strengthening, 83
focus, 57, 71, 78, 137, 182, 184, 186, 189, 204, 233, 249, 252, 254, 258, 266, 270, 328
  and wh-phrases, 172
  complex, 84-86
  narrow, 81
  simple, 81
FOCUS (factor), 164
focus domain, 78
focus movement, 110
focus projection, 78, 181, 249, 252, 259
Form Chain, 12
frame adverbial, 385
free relative clauses, 341-368
  Case conflict in, 349, 363
  typology of, 365, 367
  vs. indirect questions, 348
freezing, 15-18, 24, 109, 141
FSP Principle of Gapping, 213
functional sentence perspective, 213

gamma-marking, 3
gapping, 211, 221-245
  complements vs. adjuncts, 230
Gen (Generator), 27, 28, 30, 36, 37, 48, 49, 115, 177, 200, 203, 251, 285, 290, 298, 312, 314, 355, 357, 358, 367
generative semantics, 204
generic NP, 265, 347
gerunds
  English, 6
Gesetz der wachsenden Glieder, 273
Givenness, 77, 213, 223, 259, 261, 272
goal, 381
government and binding theory, 1-7, 52
gradient acceptability, 52, 211, 212, 218, 219, 234, 242, 245
grammaticality
  definition of, 29
Gricean maxim, 7

H-Eval (Harmony Evaluation), 27-30, 34, 35, 37, 48, 49, 115, 285, 355
harmonic alignment, 61
harmonic parallelism, 53, 201
harmonic serialism, 53, 201
harmonically bounded, 60, 371
head movement, 139, 140
homophony, 369

Identification Principle, 391
identity of constraint profile, 286
idioms, 274-275
improper movement, 58
IN-DIS (factor), 152
indefinites, 71, 253, 254, 261, 265, 270, 335
  existential vs. generic interpretation of, 75-76, 250, 254, 256, 257, 259, 260, 263-265
  FR pronouns as, 370
individual-level predicate, 386
ineffability, 48-51, 132, 292, 312, 314, 357, 363
Infinitivus Pro Participio (IPP), see Ersatzinfinitiv
input, 45-48, 50, 62, 76, 77, 97, 154, 175, 177, 203, 204, 216, 285, 290-294, 298, 300-302, 304, 305, 309, 310, 316, 341, 354-358
input optimization, 51, 315
instrument, 381
integration, 80
interface, 245
intermediate phrase, 78
intermediate trace, 3, 18
Internet, 225
interpretability, 49
intervention effect, 131
intonational phrase (iP), 78
isc-determined interpretation, 252, 261-272, 274, 276
island, 126, 142
  adjunct, 48, 111
  CNPC, 4
  freezing, 141
  indirect discourse verb, 117
  subject, 15, 111
  wh-, 11, 111

Klammerstruktur, 69

L-marking, 16, 44, 58
language change, 312
languages (except English and German)
  Afrikaans, 121
  Ancash Quechua, 14
  Bahasa Indonesia, 108, 110, 112-114, 142
  Bulgarian, 117
  Child English, 121
  Chinese, 41, 107, 111-113, 116, 143, 217
  Danish, 322, 323
  Dutch, 137, 138, 145, 351
  Ewe, 21-22
  Finnish, 352, 353
  French, 8, 9, 12, 15, 19, 25, 56, 58, 92, 139, 143, 297, 306
    standard vs. colloquial, 299
  Frisian, 120, 121, 125
  Gothic, 341, 350, 351, 353, 360-362, 365, 367
  Greek
    Ancient, 350
    Modern, 341, 346-352, 360, 361, 365, 367, 370, 372
  Hebrew, 41
  Hindi, 119, 143
  Icelandic, 257, 321-325, 327-335
  Iraqi Arabic, 119
  Italian, 57, 92
  Malay, 110-114, 116-118, 142
  Normalem Ulem, 372
  Polish, 41, 136, 372
  Romani, 120, 121
  Romanian, 117, 366, 372
  Russian, 41, 136
  Serbo-Croatian, 136
  Slave, 110, 117, 118
  Spanish, 92, 93, 366, 372
  Tagalog, 22-24, 44
Last Resort, 30
learnability, 219, 312
left branch extraction, 136
left dislocation, 370
Lexical Decomposition Grammar, 203
Lexical Integrity Principle, 137, 138
lexicon optimization, 51
linear precedence, 152
linking theory, 377-396
  projectional, 378, 379
local conjunction, 44, 61, 156, 162, 170, 172, 307, 316
  definition of, 307
  reflexive, 44, 169
location, 381
logical form (LF), 2, 3, 8, 18, 47, 154, 175-177, 187, 312, 334, 341, 358, 359
  transparent, 176, 202
lowering, 187

m-command, 266
magnitude estimation, 211, 220-221, 225, 227, 244
manner of movement verb, 378
mapping hypothesis, 324
markedness
  pragmatic, 98
  structural, 98
  vs. grammaticality, 96, 170, 179, 203, 211
matching effect, 341, 342, 344, 346, 363, 369, 370
meng-deletion, 114
Merge before Move, 26
Minimal Distance Principle, 212
Minimal Link Condition (MLC), 30
minimalist program, 8-30, 37, 54, 215, 250, 258, 283, 321, 328-330, 334, 335, 337, 354, 371
Mittelfeld, 70, 189
modal verb, 134
modularity, 250
Move alpha, 2, 12, 177
movement
  overt vs. covert, 19, 25, 107
  string-vacuous, 143
multiple question, 118, 119, 230
multiplication, 170

negation, 131, 321
neutralization, 50-52, 285, 290-293, 295, 299-302, 305, 309-312, 370
new information, 257
nominalization, 382
NP split, 134-137, 145, 276
nuclear scope, 325
nuclear stress, 79
null parse, 49
numeration, 9, 23, 46, 251, 273

object shift, 321-335
  and Case assignment, 328
  interpretation of, 324, 327
  multiple, 337
  of pronouns vs. full NPs, 322
  optional, 323
optimality, 28
  definition of, 35, 201, 215, 284, 327
    blocking definition of, 31
    minimalist definition of, 29
optimization
  cyclic, 53, 108, 114, 175, 177, 201
  D-structure, 53, 179, 189, 197
  expressive, 53
  LF-, 53, 187, 204
  local, 54
  multiple, 53
  S-structure, 53, 179, 182
optional movement, 12-14, 249, 323, 324
optionality, 10, 52, 283, 299, 310
  definition of, 285
  in blocking syntax, 60
  LF-, 20
  pseudo-, 52
  true, 52, 286
output, see candidate

parameterization, 53
  morphological basis of, 53
parsing, 54
partial wh-movement, 14, 108-120, 142
particle, 137
partitive Case, 352, 353
passivization, 24, 378, 382
perception verb, 306, 308
phase, 115
Phi-feature, 38, 125, 126, 134
phonological phrase, 78
pied piping, 125, 126
pitch accent, 78
pointing finger, 35
predicate nouns, 274
predicate/argument structure
  and candidate set, 37
Preference Principle for Reconstruction, 27
Principle A, 2, 33
Principle B, 2, 33
Principle of Minimal Compliance, 157
pro, 57, 205, 365
pro-drop, 57, 365
PRO-theorem, 6
Procrastinate, 25, 328-330, 334
Projection Principle, 2, 5
prominence, 78
pronominal, 2, 33
pronominalization, 3, 33, 38
pronoun, 21, 22, 231, 242, 268, 269, 271, 272, 322, 323, 347
  d-, 347, 348
  FR-, 348, 349, 353, 354, 358, 370
  lexical vs. empty, 7
  relative, 342, 344, 345, 347, 348, 350, 352, 358
  resumptive, 5, 41, 57, 350, 371
  wh-, 344, 347, 348, 350, 361
Proper Inclusion Principle (PIP), 33
proposition
  V as an open, 187, 204
prosodic phrasing, 78
prosodic structure, 77, 249, 258, 259, 273
prosodic word (PWd), 78
proto-role, 389, 390
psych verb, 203, 377-396
psychophysics, 220

Q-feature, 298, 300, 302, 304, 305, 315
Q-scope marking, 183, 185, 186, 191, 192, 197-200
  downward, 199, 200
  free, 204
  improper, 186, 199
  vacuous, 199, 200, 206
quantifier
  existential vs. universal, 262
  strong vs. weak, 187, 190-196, 204
quantifier raising, 18, 154, 187, 191, 193, 197, 199
  type-driven, 187
quantifier scope, 151, 175, 176, 187, 188, 190, 201, 250
  destabilization of, 191

ranking software, 372
re-ranking model, 211, 219, 234, 243, 245
reaction object, 378
reconstruction, 176, 179, 187-189, 192-200, 205
reference set, 8, 23, 27, 37, 205, 283
  definition of, 10, 11, 14, 15
referentiality, 112, 358
reflexivization, 3, 33, 38
relative clause, 41, 42
  standard vs. free, 346
remnant
  in gapping constructions, 212, 228, 230
remnant movement, 140, 141, 143
repair, 37, 312
Requirement for Simplex-Sentential Relationship, 214
reset button, 202
restriction, 325
richness of the base, 312
rise-fall contour, 249, 259
Rule A (comparative), 31
Rule B (comparative), 32
Rule B' (comparative), 32

S-structure, 2, 8, 30
salience, 263, 271, 272
scopal value (SV), 152, 158, 167
scope
  narrow - of objects, 329, 331, 332
  relative, 324, 328
  wide - of objects, 330-332
scope inversion, 176, 178, 179, 186-192, 194-200, 205, 250
Scoping Condition, 327, 334-337
scrambling, 73, 182, 183, 185, 186, 249, 253, 254, 265, 267, 270, 321-324
  anti-focus, 184, 189, 205
  interpretation of, 327
  long-distance, 123
  of idioms, 274, 275
  of PP, 123
  scope, 184
  semantically non-vacuous, 189, 197
  semantically vacuous, 189, 196
  string-vacuous, 260
  wh-, 13, 131
sentience, 389
Shortest Move, 328, 329, 334
Shortest Paths, 18
SL-PAT (factor), 160
Sole Complement Generalization, 382
source, 107, 381
spell-out, 107, 334
split Infl hypothesis, 370
split tree hypothesis, 256, 270
ST-L-DB (factor), 164
stage-level predicate, 386
stranding, 135
Stray Affix Filter, 9
Strength of I, 9
Strict Cycle Condition, 16, 140
strict domination, 217
Subevent Identification Condition, 397
subhierarchy, 44, 61, 316
Subjacency, 2, 4, 41, 142, 143, 351
  - Condition, 4
  - violation
    strength of, 52
SUBJECT (factor), 152
subject-aux inversion, 297
suboptimal candidate, 175, 218
Suboptimality Hypothesis, 218
successive-cyclic movement, 11, 12
  reflex of, 21
Superiority Condition, 19
superiority effect, 18, 19, 29, 43, 44

T-model, 177
tableau, 35
target, 107
template
  lexical, 379
Tendency for Subject-Predicate Interpretation, 213
that-trace effect, 3, 39, 40
Thematic Hierarchy, 381, 383-386, 389
Thematic Relations Hypothesis, 381
Thematic Tier, 387-389
theme, 378, 380-386, 388-390, 392-395
  incremental, 397
Theme Topic (TT) construction, 22
there sentences, 26, 27, 257
tie, 52, 74, 93, 96, 113, 116, 117, 130, 137, 154, 156, 158, 285, 287
  global, 52, 74, 156, 158, 161, 170, 285, 288-290, 294, 296, 298, 299, 302, 306, 308, 310-312, 314, 315
    ordered, 314
  local, 52, 156, 157, 285, 287, 288, 314, 315
    conjunctive, 314
Top position, 184, 185, 204
topicalization, 10, 14, 182, 184, 185, 188, 191-195, 199, 201, 204, 205, 276, 332
  embedded, 13
  wh-, 12, 13
transitivity
  of constraint rankings, 161
  of constraint ties, 159, 160
translation rule, 159, 310

unaccusative verb, 377, 382, 388
unselective binding, 143, 270

V-to-I movement, 8
V/2, 14, 137, 322
  complement, 120, 144
  constraint, 134
vacuous movement, 193, 194, 260
verb frame, 221, 228
violability, 35
Vorfeld, 152
VP-internal subject hypothesis, 26, 178

Wackernagel movement, 323
was für split, 276
weight, 152, 154, 160
  negative, 166
Wh-Copying, 118-132
Wh-Criterion, 10, 39
wh-extraction, 48, 216
wh-movement
  embedded, 303
  long-distance, 11
  optional, 297-306
wh-scope marking, 120, 121
  abstract, 46

yo-yo movement, 21, 58, 204

