(Studies in Generative Grammar) Gereon Muller, Wolfgang Sternefeld, Gereon M. Ller-Competition in Syntax - A Synopsis-De Gruyter (2000)
Studies in Generative Grammar 49
Editors
Harry van der Hulst
Jan Koster
Henk van Riemsdijk
Mouton de Gruyter
Berlin · New York
Competition in Syntax
Edited by
Gereon Müller
Wolfgang Sternefeld
Mouton de Gruyter
Berlin · New York 2001
Mouton de Gruyter (formerly Mouton, The Hague)
is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.
Syntactic theories differ with respect to how they determine the wellformedness
or illformedness of a given sentence S_i in a given language. One possibility
is that the decision of whether S_i is grammatical or not can be made by
exclusively considering properties of S_i; properties of other sentences S_j, S_k,
... are irrelevant. Another possibility is that properties of other sentences S_j,
S_k, ... do play a role in deciding whether S_i is grammatical or not, in addition to
S_i's own properties. The first possibility, which we may call a local approach,
can arguably be viewed as the standard one; this strategy is pursued in, e.g.,
most versions of government and binding theory (principles and parameters
theory), head-driven phrase structure grammar, lexical-functional grammar
(until recently), and in certain versions of the minimalist program. The second
possibility presupposes a competition of sentences; hence, we will refer
to it as a competition-based approach. This strategy is the one that this book
is about; it is chosen in certain versions of the minimalist program (in particular,
in earlier manifestations), in theories that incorporate the Blocking
Principle (the Elsewhere Condition), and, last but not least, in optimality-
theoretic syntax. In what follows, we will illustrate fundamental differences
between and points of convergence among local and competition-based ap-
proaches by considering government and binding theory (section 2) and its
development into the minimalist program (section 3), blocking syntax (sec-
tion 4), and optimality-theoretic syntax (section 5).
To see this, consider first the effects of representational constraints like the
binding principles A and B, which are given here in a simplified form.
(1) a. Principle A:
An anaphor is bound in its binding domain.
b. Principle B:
A pronominal is free in its binding domain.
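As a rough illustration of what "local" means here, the two principles in (1) can be written as a check that inspects one sentence at a time and never consults another sentence. The sketch below is in Python; the names NP, kind, and bound_in_domain are hypothetical labels introduced only for this illustration, and the binding relations are stipulated as input rather than computed from a structure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NP:
    form: str
    kind: str               # "anaphor", "pronominal", or "other" (hypothetical labels)
    bound_in_domain: bool   # stipulated: is the NP bound inside its binding domain?

def principle_a(np: NP) -> bool:
    """An anaphor is bound in its binding domain."""
    return np.kind != "anaphor" or np.bound_in_domain

def principle_b(np: NP) -> bool:
    """A pronominal is free in its binding domain."""
    return np.kind != "pronominal" or not np.bound_in_domain

def locally_wellformed(sentence: list) -> bool:
    # Only properties of this one sentence are inspected.
    return all(principle_a(np) and principle_b(np) for np in sentence)

john = NP("John", "other", False)
himself_bound = NP("himself", "anaphor", True)
him_bound = NP("him", "pronominal", True)

print(locally_wellformed([john, himself_bound]))  # anaphor bound in its domain
print(locally_wellformed([john, him_bound]))      # pronominal bound in its domain
```

Note that the two functions are independent of each other: nothing in the check for the pronominal refers to the availability of the anaphor.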
(2-b) is not related to the wellformedness of (2-c), even though the two strate-
gies - pronominalization and reflexivization - seem to be in complementary
distribution for the most part in English.
Basically the same situation arises with constraints like the Subjacency Con-
dition in (5).
(6) a. Who1 do [IP you think [CP t'1 that [IP Mary loves t1 ]]] ?
b. Who1 do [IP you believe [IP John to be in love with t1 ]] ?
(7) a. — do [IP you believe [CP — C [IP John to be in love with who1 ]]]
b. — do [IP you believe [CP who1 C [IP John to be in love with t1 ]]]
c. who1 do [IP you believe [CP t'1 C [IP John to be in love with t1 ]]]
d. who1 do [IP you believe [IP John to be in love with t1 ]]
in (8-a). Again, the fact that the movement strategy seems to be in comple-
mentary distribution with a resumptive pronoun strategy is not theoretically
reflected by relating the illformedness of resumptive pronouns as in (8-c) to
the wellformedness of movement in (8-a), and the (relative) wellformedness
of resumptive pronouns in (8-d) to the illformedness of movement in (8-b): 2
Independently of exactly what an account of the contrast in (8-c,d) looks like
in government and binding theory, it seems that it must rely on a local con-
straint that is violated if a resumptive pronoun has an antecedent that is too
close. 3
To sum up so far, government and binding theory has different types of con-
straints, with varying complexity, but all of them are local in the above sense;
i.e., they do not involve competition. Interestingly, though, there is a notable
exception: the Avoid Pronoun Principle of Chomsky (1981).
The empty pronominal PRO and lexical pronouns come close to being in
complementary distribution: As stated in the PRO-theorem, PRO is confined
to positions that are ungoverned (e.g., the subject position of control infini-
tives). In contrast, overt pronouns typically show up in positions that are gov-
erned. This is so because overt pronouns must be assigned Case, and Case
is normally assigned under government only. However, there is one position
in which government and binding theory permits both an empty pronomi-
nal PRO (because the position is ungoverned) and an overt pronoun (because
Case can be assigned without government by a special Case assignment rule).
This is the subject position of English gerunds. Consider first the possibility
of PRO in this position:
As shown by the contrast between (10-a) and (10-b), PRO must be co-indexed
with the matrix subject here; it cannot bear a different index or be inter-
preted arbitrarily. Obligatory control follows from the control rule in (11)
(see Manzini 1983).4
(12-a) does not follow from any of the relevant local constraints in govern-
ment and binding theory. In particular, (12-c) strongly suggests that principle
B of the binding theory is not violated in (12-a) (the pronoun his occupies
SpecNP in both cases, according to Chomsky's 1981 assumptions), and the
wellformedness of (12-b) proves that Case can be assigned to his in (12-a). In
view of this situation, Chomsky (1981:65) suggests that the illformedness of
(12-a) is not to be traced back to a violation of some local constraint. Rather,
it should be related to the wellformedness of (10-a): The two sentences com-
pete, and (12-a) is ungrammatical because (10-a) is grammatical. This idea is
implemented by adopting the Avoid Pronoun Principle in (13).
As a first step towards accounting for these data, Chomsky assumes that
French has "strong" I nodes, whereas English has "weak" I nodes. This dis-
tinction becomes important for the following local (derivational) constraint:
(15) Strength of I:
Strong I tolerates adjunction of all Vs; weak I tolerates adjunction
only of "light" Vs (auxiliaries).
The second derivation has more movement steps than the first one, and it is
therefore filtered out as uneconomical by the translocal economy constraint
Fewest Steps, which can be formulated as follows:
A definition of reference set that works for the approach in Chomsky (1991) is
(19); here, the numeration is the set of all lexical items (including functional
heads) that are used in a derivation. 7
10 Gereon Müller & Wolfgang Sternefeld
The qualification in (i) ensures that, e.g., (20-b) cannot accidentally block
(20-a) even though it involves fewer syntactic operations. Furthermore, the
statement in (ii) produces the welcome consequence that (20-c) cannot ac-
cidentally block (20-a) even though it involves fewer syntactic operations
by leaving the wh-phrase and the auxiliary in situ - (20-c) violates a local
constraint like the Wh-Criterion, which requires a wh-phrase to move to
SpecC[+wh] overtly in English.
(24) a. How1 do you think [CP t''1 that John said [CP t'1 that Bill fixed the car
t1 ]] ?
b. *How1 do you think [CP — that John said [CP — that Bill fixed the
car t1 ]] ?
Note, though, that no particular problem arises under the notion of reference
set in (19) or (22). According to these definitions, only those derivations can
compete that respect all local constraints of grammar, i.e., that are otherwise
well formed. By hypothesis, the derivation that generates the surface repre-
sentation (24-b) violates a locality constraint; hence, it cannot compete with
the derivation that generates (24-a), and (24-a) is chosen by Fewest Steps
because there is no competing derivation that would be more economical.
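The interaction just described can be sketched in Python. This is only an illustration under stated assumptions: the class Derivation, the flag require_local (standing in for clause (ii) of the reference set definition), and the step counts are hypothetical; in particular, the counts of 3 and 1 simply assume that each application of movement counts as one operation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Derivation:
    label: str
    numeration: frozenset   # lexical items used (hypothetical encoding)
    steps: int              # number of operations; illustrative counts
    respects_local: bool    # stipulated outcome of all local constraints

def reference_set(d, candidates, require_local=True):
    """Derivations competing with d: same numeration and, if clause (ii)
    is kept (require_local=True), no violation of a local constraint."""
    return [c for c in candidates
            if c.numeration == d.numeration
            and (c.respects_local or not require_local)]

def survives_fewest_steps(d, candidates, require_local=True):
    """d is not blocked if no competitor uses fewer steps."""
    rivals = reference_set(d, candidates, require_local)
    return all(c.steps >= d.steps for c in rivals)

num = frozenset({"how", "you", "think", "John", "said", "Bill", "fixed", "the car"})
d24a = Derivation("(24-a)", num, steps=3, respects_local=True)   # successive-cyclic
d24b = Derivation("(24-b)", num, steps=1, respects_local=False)  # one long movement
cands = [d24a, d24b]

print(survives_fewest_steps(d24a, cands))                       # clause (ii) kept
print(survives_fewest_steps(d24a, cands, require_local=False))  # clause (ii) dropped
```

With clause (ii) in place, the locality-violating derivation never enters the competition; once the clause is dropped, the shorter derivation blocks the successive-cyclic one, which is the problem taken up next.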
But what if we were to dispense with clause (ii) in the definition of reference
set, or if clause (ii) were weakened in such a way that some derivations violating
local constraints could compete after all? (As we will see below, there is
some evidence for this latter option.) Then, the derivations generating (24-a)
and (24-b) might compete, and the problem of accounting for successive-
cyclic movement under Fewest Steps would persist. How, then, can we permit
successive-cyclic movement in (24)? Chomsky (1993) advances the follow-
ing solution: "Operations" as they are relevant for Fewest Steps do not simply
involve applications of Move α as such. Rather, a more complex process of
chain formation that (a) moves some item to its target position and (b) au-
tomatically inserts intermediate traces in appropriate positions counts as a
single operation for the purposes of Fewest Steps: 9
Another application of the Fewest Steps condition is the account of the ban
on wh-topicalization in Epstein (1992). As noted above, topicalization is in
principle optional in English; cf. (21). For many speakers, topicalization is
also optionally possible in contexts like (26), where the target position is
in an embedded clause and the matrix clause involves short wh-movement.
Given the qualification that competing derivations must have identical LFs,
this poses no problem for Fewest Steps.
(26) a. Who1 t1 said [CP that [IP Mary gave a book to John2 ]] ?
b. Who1 t1 said [CP that to John2 [IP Mary gave a book t2 ]] ?
The Rise of Competition in Syntax 13
(27) a. Who1 t1 said [CP that [IP Mary gave a book to whom2 ]] ?
b. *Who1 t1 said [CP that to whom2 [IP Mary gave a book t2 ]] ?
(28) *Who1 to whom2 t1 said [CP that [IP Mary gave a book ]] ?
However, it is also shown in Müller & Sternefeld (1996) that the Fewest Steps
approach to the ban on optional movement of wh-phrases which is later undone
by further, covert operations is not entirely unproblematic, and may
necessitate additional assumptions. For one thing, German exhibits the same
ban on wh-topicalization as English:
(30) a. Wer1 sagte t1 [CP daß Maria wem2 ein Buch gegeben hat ] ?
who said that Maria whom a book given has
Moreover, it turns out that there are several well-formed constructions attested
in the world's languages in which wh-phrases can in fact undergo optional
overt fronting to a non-target position. In Müller & Sternefeld (1996),
we discuss evidence from partial wh-movement, wh-imperatives, and wh-reconstruction.
For the present purposes, the example of optional partial wh-movement
to a SpecC[-wh] position in Ancash Quechua may suffice (cf.
Cole 1982). (32-a) shows that wh-phrases may be fronted to a SpecC[+wh]
target position in overt syntax in Ancash Quechua; (32-d) shows that wh-phrases
may also stay in situ in overt syntax, raising (by assumption) to the
SpecC[+wh] position in covert syntax. Interestingly, (32-b) and (32-c) are also
possible. Here, the wh-phrases raise to an intermediate SpecC[-wh] overtly.
Given that this implies an additional wh-movement operation at LF, Epstein's
(1992) Fewest Steps approach should rule out these cases.
(32) a. [CP Ima-ta-taq1 (qam) kreinki [CP t''1 Maria muna-nqa-n-ta [CP t'1
what acc you believe Maria want-nom-3-acc
José t1 ranti-na-n-ta ]]] ?
José buy-nom-3-acc
b. [CP — (Qam) kreinki [CP ima-ta-ta1 Maria muna-nqa-n-ta [CP t'1
José t1 ranti-na-n-ta ]]] ?
c. [CP — (Qam) kreinki [CP — Maria muna-nqa-n-ta [CP ima-ta-ta1
José t1 ranti-na-n-ta ]]] ?
d. [CP — (Qam) kreinki [CP — Maria muna-nqa-n-ta [CP — José ima-ta-ta1
ranti-na-n-ta ]]] ?
This way, partial wh-movement is permitted, but it is clear that much (in fact,
most) of the original evidence in favor of Fewest Steps is lost: Thus, on this
view, neither French V-in situ, nor English (or German) wh-topicalization can
be ruled out by Fewest Steps anymore. As noted in Sternefeld (1997), this
situation might be viewed as indicative of a general problem with translocal
constraints: A significant reduction of competition in reference sets may be
empirically desirable so as to account for cases of optionality (as in partial
wh-movement constructions); but as an unwanted side effect, it also threatens
to undermine the notion of translocal economy itself: Many ill-formed
derivations that could be ruled out by translocal constraints will now survive
because the more economical derivation is not part of the same reference set
anymore. Finding a suitable definition of reference set that is weak enough to
permit optionality and strong enough to actually do some work is one of the
fundamental concerns of all versions of the minimalist program that employ
the notion of competition.
Evidence for yet another definition of reference sets comes from Collins'
(1994) account of freezing effects with A-movement in English. As shown
in (34-a,b), subject NPs are islands for extraction in English, whereas object
NPs permit extraction (with certain types of verbs). In the present context, the
interesting case is that of subject NPs that originate in object position, as in
the case of passivization. As can be seen in (34-c), such derived subject NPs
are also islands.
This derivation violates another local constraint, the Strict Cycle Condition
in (38). The reason is that NP raising targets the subject position. The subject
position is included in the CP domain, which has already been affected by
wh-movement to SpecC earlier in the derivation.
(39) a. [CP — was [IP — taken [VP [NP2 a picture of who1 ] by John ]]]
b. [CP — was [IP — taken [VP who1 [VP [NP2 a picture of t1 ] by
John ]]]]
D1 violates the CED; D2 violates the SCC. D3 violates neither of these local
constraints. However, D3 is blocked by D1 and D2 via Fewest Steps: Other
things being equal, D3 needs three movement steps where D1 and D2 make
do with two movement steps. 12
This approach has an important consequence for the definition of reference
sets. The three derivations D1, D2, and D3 yield the same surface string,
which is ill formed. Thus, the more economical derivations that block D3
via Fewest Steps are not well-formed derivations, as in the applications of
Fewest Steps discussed above, but rather ill-formed derivations that violate
local constraints, viz., the CED and the SCC. This reasoning implies that ref-
erence sets can in fact not be defined as assumed so far, by requiring that only
those derivations can compete that satisfy all local constraints - in the case
at hand, D1 and D2 violate local constraints. Still, we cannot simply drop
this requirement in the definition of reference sets; otherwise, all instances
of movement would invariably be blocked in favor of in-situ derivations by
Fewest Steps, and syntactic derivations would be fairly trivial. It seems that
what is needed in view of this conflicting evidence is a relativized notion of
local constraint satisfaction.
In this context, the idea of convergence of derivations introduced in Chom-
sky (1993) becomes relevant: Only those derivations that converge can com-
pete with respect to translocal constraints. Essentially, whereas all violations
of local constraints lead to ungrammaticality, only a subset of violations of
local constraints also leads to non-convergence. Ungrammatical derivations
that converge may then still be used to block other derivations as ungrammat-
ical, as in the freezing construction discussed by Collins (1994). It is an em-
pirical issue how convergence is to be defined. As a rule of thumb, and for the
present purposes, we can say that a violation of those constraints that trigger
movement (like the Wh-Criterion, the Extended Projection Principle (EPP),
which triggers subject raising, or whatever constraint optionally triggers top-
icalization) leads to non-convergence, whereas a violation of constraints like
the CED and the SCC permits convergence of a derivation. 13
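The division of labor between convergence and local constraint satisfaction in the freezing case can be made concrete with a small Python sketch. The step counts (two for D1 and D2, three for D3) and the violated constraints are taken from the discussion above; everything else, including the names Derivation, NON_CONVERGENCE, and the decision to treat all three derivations as one reference set, is an assumption made for illustration.

```python
from dataclasses import dataclass

# Hypothetical classification following the rule of thumb in the text:
# violating a movement trigger prevents convergence; violating the CED or
# the SCC makes a derivation ungrammatical but still lets it converge.
NON_CONVERGENCE = frozenset({"Wh-Criterion", "EPP"})

@dataclass(frozen=True)
class Derivation:
    label: str
    steps: int            # number of movement steps
    violated: frozenset   # names of violated local constraints

def converges(d):
    return not (d.violated & NON_CONVERGENCE)

def optimal(d, candidates):
    # Only convergent derivations compete with respect to Fewest Steps.
    rivals = [c for c in candidates if converges(c)]
    return converges(d) and all(c.steps >= d.steps for c in rivals)

def grammatical(d, candidates):
    return not d.violated and optimal(d, candidates)

d1 = Derivation("D1", 2, frozenset({"CED"}))
d2 = Derivation("D2", 2, frozenset({"SCC"}))
d3 = Derivation("D3", 3, frozenset())
cands = [d1, d2, d3]

for d in cands:
    print(d.label, converges(d), optimal(d, cands), grammatical(d, cands))
```

In this toy setting no candidate comes out as grammatical: D1 and D2 are optimal but violate a local constraint, and D3 respects the local constraints but is blocked by the shorter convergent derivations.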
Under these assumptions, the notion of reference set needed for the ap-
proach in Collins (1994) can be defined as in (40). Note that the analysis is
compatible with assuming that either numerations, or surface structures, or
LF representations, or any combination of these determines the competition;
To end the discussion of Fewest Steps, we would like to emphasize that there
is no inherent reason why the notion of an "operation" that is mentioned in
the Fewest Steps condition should be confined to movement. Indeed, Chom-
sky & Lasnik (1993) argue that the deletion of intermediate traces in the LF
component (which is argued to be an option with arguments and impossible
with adjuncts in Chomsky 1986a, Lasnik & Saito 1992, and related work) is
also regulated by the Fewest Steps condition.
There have been many more applications of the Fewest Steps condition in
the minimalist program (see, e.g., the Fewest Steps account of the ban on
semantically vacuous quantifier raising in Fox 1995), but these may suffice
for the time being. 14 Let us now consider the translocal economy constraint
Shortest Paths.
The Shortest Paths condition can be defined as follows (cf. Chomsky 1993,
1995):
Given that all wh-in situ phrases must undergo movement to a SpecC[+wh]
position at LF, D2 (creating (43-b)) should block D1 (creating (43-a)) because
D1's paths are longer. Again, there are several possible solutions. 15 As before,
one might stipulate that covert wh-movement either does not exist, or does not
count with respect to the Shortest Paths condition. Alternatively, this evidence
could be viewed as a further argument that reference sets are defined in such
a way that competing derivations must have identical LF representations.
Still, to ensure that wo is impossible in (45), a derivation like (46) that employs
yo-yo movement must be ruled out. In this derivation D1, Kofi1 is first
lowered to the embedded SpecC position, licensing wo in the subject position
there, and then raised to the target SpecC position in the matrix clause.16
(46) a. Foc [IP I [VP said [PP to Kofi1 ] [CP that [IP he hit Kösi ]]]]
b. Foc [IP I [VP said [PP to t1 ] [CP Kofi1 that [IP he hit Kösi ]]]]
c. Kofi1 Foc [IP I [VP said [PP to t1 ] [CP t'1 that [IP he hit Kösi ]]]]
(47) a. Foc [IP I [VP said [PP to Kofi1 ] [CP that [IP he hit Kösi ]]]]
b. Kofi1 Foc [IP I [VP said [PP to t1 ] [CP that [IP he hit Kösi ]]]]
(48) a. [CP Sino1 ang [TP t'1 b-um-ili [VP t1 tV ng damit2 ]]] ?
who Ang bought^ dress,,,/,
'Who is the one that bought the dress?'
b. *[CP Ano2 ang [TP si Juan1 b-um-ili [VP t1 tV t2 ]]] ?
what Ang Juanafo b o u g h t ^
'What is the thing that Juan bought?'
A different marking on the verb triggers the so-called Theme Topic (TT) con-
struction. Here, the theme NP occupies the structural subject position SpecT;
and indeed, only the theme NP can undergo wh-movement; cf.:
Nakamura's (1998) basic idea is that the derivations generating (48-a) and
(49-a) compete, as do the derivations generating (48-b) and (49-b). The
derivations underlying (48-a) and (49-b) can then block their respective com-
petitors as ungrammatical because of the Shortest Paths constraint. To see
this, consider the case of wh-movement of the theme NP in (48-b) and (49-b).
The movement path from the VP-internal object position to the SpecC target
position in (48-b) is longer than the path from the subject position SpecT to
SpecC in (49-b). Consequently, the Shortest Paths condition guarantees that
the derivation generating (49-b) blocks the derivation generating (48-b) as ungrammatical.
An analogous account is available for the agent wh-movement
case in (48-a) vs. (49-a).
As Nakamura observes, this analysis raises two further potential problems.
First, we have to ensure that derivations can compete even though they do
not have identical lexical material - the Agent Topic and the Theme Topic
constructions clearly differ in lexical make-up. Nakamura accomplishes this
by replacing the notion of "identical numeration" in the definition of refer-
ence set with the more liberal notion of "non-distinct numeration;" the latter
is defined in such a way that two numerations that only differ with respect
to functional features do not count as distinct. (Clearly, this raises some non-
trivial questions for other languages in which competitions of the type that
Nakamura postulates seem unwanted.)
Second, the derivation that generates, e.g., (49-b) may minimize the wh-path
in comparison with the derivation that generates (48-b), but it increases
path lengths in the A-domain. It is not quite clear how problematic this is;
in the case presently under consideration, the A-chain formed in (49-b) by
theme raising is only minimally longer than the A-chain formed in (48-b) by
agent raising, whereas the wh-chain formed in (49-b) is much shorter than
the wh-chain formed in (48-b). There would be even less of a problem for the
agent wh-extraction case in (48-a) and (49-a). In any event, Nakamura (1998)
replaces the notion of "movement paths" in the definition of the Shortest
Paths condition with the more specific notion of "comparable chain links."
This yields the effect that, e.g., the derivation generating (49-b) blocks the
derivation generating (48-b) just because the former derivation's wh-chain
links are shorter than the latter derivation's comparable wh-chain links, irrespective
of the length of other chain links created by A-movement, V raising,
etc.
3.3 Procrastinate
Chomsky (1993, 1995) assumes the following local condition as a trigger for
overt movement.
(54) Procrastinate:
If two derivations D1 and D2 are in the same reference set, and D1
differs from D2 in that an item α is moved covertly in D1 and overtly
in D2, then D1 is to be preferred over D2.
Chomsky (1995, 1998) assumes that syntactic structures are created by al-
ternating operations of structure-building (Merge) and movement (Move). At
any given stage of the derivation, the situation can arise that it must be de-
cided whether the next step is a Merge or a Move operation. The following
translocal condition settles the issue by preferring Merge to Move if both are
possible as such; the specific formulation is based on Frampton & Gutman
(1999).
These two derivations involve an identical numeration, and they both respect
all local constraints. In this case, Merge before Move tells us to choose the
derivation underlying (56-b) and dispense with the derivation that generates
(56-a).
Given that identity of numeration is a prerequisite for competition, (57) is
correctly predicted to be possible - if there is no there present in the numera-
tion, there is no competing derivation here that could be preferred by Merge
before Move.
The question arises of whether there is a deeper reason why Merge operations
count as more economical than Move operations. Chomsky (1995, 1998) suggests
that Move is to be defined in terms of Merge, which would make it
inherently more complex, and this fact might ultimately be exploited in an attempt
to derive the Merge before Move condition. Chomsky (1998:14) himself
remarks: "Good design conditions would lead us to expect that simpler
operations are preferred to more complex ones, so that Merge ... preempt[s]
Move, which is a 'last resort,' chosen when nothing else is possible."
3.5 Conclusion
The four translocal constraints discussed so far do not yet exhaust the list of
translocal constraints that have been proposed; see, e.g., the translocal Econ-
omy of Representation constraint in Chomsky (1991), or the translocal Pref-
erence Principle for Reconstruction in Chomsky (1993). Still, the constraints
discussed here can be considered representative. At this point, we can address
the question of what the structure of a minimalist syntax with translocal con-
straints looks like. Such a syntax has two parts. In the first part, derivations
are created by structure-building (Merge), movement (Move), deletion, and
perhaps other operations. Convergent derivations are assembled in reference
sets according to criteria that must be decided on (see the above definitions
of reference sets for some options). In the second part, translocal constraints
choose among the competing derivations and thus determine the wellformed-
ness of sentences. In essence, then, it turns out that a minimalist syntax with
translocal constraints has exactly the shape that Prince & Smolensky (1993)
attribute to an optimality-theoretic grammar: A first generator part (called
Gen) creates the candidate set (= reference set, in minimalist syntax); Gen has
only local constraints. A second "harmony"-evaluation part (called H-Eval)
(59) Grammaticality:
A derivation D_i is grammatical iff (a) and (b) hold:
a. D_i does not violate a local constraint.
b. D_i is optimal.
(60) Optimality in minimalist syntax:
A derivation D_i is optimal iff there is no derivation D_j in the same
reference set that is preferred over D_i by a translocal constraint.
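Definitions (59) and (60) leave open which translocal constraints are in play, so one way to picture them is with the constraints passed in as interchangeable preference relations. The Python sketch below does this under several assumptions: the record fields, the candidate labels Da, Db, and Dc, and all numbers are invented for illustration, and the procrastinate function is a deliberately crude stand-in for (54) that compares only the number of overt movements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Derivation:
    label: str
    steps: int             # number of operations (illustrative)
    overt_moves: int       # movements applied in overt syntax (illustrative)
    local_violations: int  # stipulated count of local constraint violations

# A translocal constraint is modelled as a relation "a is preferred over b".
def fewest_steps(a, b):
    return a.steps < b.steps

def procrastinate(a, b):
    # Simplification: same number of steps, but fewer of them overt.
    return a.steps == b.steps and a.overt_moves < b.overt_moves

def optimal(d, reference_set, constraints):
    """(60): no derivation in the reference set is preferred over d."""
    return not any(c(rival, d) for rival in reference_set for c in constraints)

def grammatical(d, reference_set, constraints):
    """(59): d violates no local constraint and d is optimal."""
    return d.local_violations == 0 and optimal(d, reference_set, constraints)

rs = [Derivation("Da", 2, 0, 0),
      Derivation("Db", 2, 1, 0),
      Derivation("Dc", 3, 0, 0)]
cs = [fewest_steps, procrastinate]

print([d.label for d in rs if grammatical(d, rs, cs)])  # ['Da']
```

Because the constraints are only ever consulted through the preference relation, nothing in this formulation ranks them against each other or allows them to differ across languages.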
The minimalist system that emerges in this way is not without problems.
Some of those show up in all versions of competition-based syntax. For one
thing, since a minimalist syntax of this type involves a global competition
in a reference set that may be large, or even infinite, the overall complexity
of the system is significantly increased. For another, we have seen that it
is difficult to come up with a single, unified definition of reference set that
accommodates all available evidence that one may want to treat in terms of
translocal constraints.
Other problems are more specific and confined to the particular notion of
optimality that is employed in minimalist syntax. Most notably, the H-Eval
metric is not maximally homogeneous and simple (because it may depend on
a number of formally unrelated translocal constraints); however, it is rather
inflexible nevertheless. Specifically, all translocal constraints must be classifi-
able as economy constraints in some sense (thus, properties of sentences that
are not related to economy considerations cannot be subject to optimization).
Even more importantly, it implies that all variation among languages must
take place in the Gen part of the grammar - there is no room for parameter-
ization in the H-Eval system. It is not always obvious that this position can
be maintained in the light of conflicting empirical evidence. As an example,
consider the effect that the Shortest Paths condition has on wh-movement in
German. Recall that the Shortest Paths condition accounts for the superiority
effect with wh-movement in languages like English; cf. (42). As has often
been noted (see, e.g., Haider 1983), German does not exhibit superiority ef-
fects of this kind:
Still, it seems clear that the path from t1 to wer1 in (61-a) is shorter than the
path from t2 to was2 in (61-b). To avoid the result that the Shortest Paths
4 Blocking Syntax
Williams suggests that these data can be accounted for by the following two
rules, which we call rule A and rule B.
b. Rule B (syntactic):
Comparatives can be formed by adding more in the syntax.
Recall from section 2.1 the data that seem to suggest a complementary distri-
bution of anaphors and pronominals in English (at least in the domain under
discussion here).
We have seen that standard government and binding theory accounts for these
data by invoking the principles A and B in (68).
(68) a. Principle A:
An anaphor is bound in its binding domain.
b. Principle B:
A pronominal is free in its binding domain.
However, as in the case of the comparative formation rules A and B' that
were just discussed, it seems that this approach involves a redundancy: A
generalization is missed if two separate local constraints are postulated for
anaphors and pronominals, where the context that permits one strategy is
identical to the context that precludes the other strategy (viz., the binding
domain in both cases). As noted by Fanselow (1989, 1991), Burzio (1991),
and Richards (1997), among others, a more elegant account can be given if the
notion of competition is invoked. Here, we will sketch Fanselow's blocking
approach. 24
Fanselow's analysis relies on the Proper Inclusion Principle (PIP), a version
of the Elsewhere Condition (cf. Kiparsky 1982) that can be viewed as a
translocal constraint:
The feature assignment mechanisms that play a role in the present context
are (a) the assignment of the feature [+anaphoric] to an NP, and (b) the
assignment of the feature [+pronominal] to an NP - in short, reflexivization
(or reciprocalization) and pronominalization. By assumption, the assignment
of the feature [+anaphoric] is subject to (something like) Principle A,
whereas there is no comparable requirement for the assignment of the feature
[+pronominal]; i.e., Principle B is dropped. This implies that, due to Principle A,
anaphors are more restricted in their distribution than pronominals; the
application domain of pronominalization properly includes the application
domain of reflexivization. From this it follows directly that in all those cases
where both anaphors and pronominals respect all local constraints, the PIP
forces the choice of the anaphor. Pronominals can emerge only in contexts in
which anaphors are precluded (e.g., because of a violation of Principle A, as
in the examples presently under consideration).
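The blocking logic of this paragraph can be sketched in a few lines of Python. The sketch is schematic and rests on assumptions: a context is reduced to a single stipulated flag, bound_in_domain, and the function names are hypothetical. It is meant only to show that once the more restricted mechanism takes precedence wherever it can apply, no separate Principle B is needed.

```python
def anaphor_licensed(context):
    # Stand-in for (something like) Principle A: a binder is available
    # inside the binding domain.
    return context["bound_in_domain"]

def pronominal_licensed(context):
    # Principle B is dropped: pronominalization is unrestricted here.
    return True

def choose_form(context):
    """Proper Inclusion Principle, schematically: if the more restricted
    mechanism (reflexivization) can apply, it blocks the more general one."""
    if anaphor_licensed(context):
        return "anaphor"
    if pronominal_licensed(context):
        return "pronominal"
    return None

local_binder = {"bound_in_domain": True}      # antecedent inside the domain
no_local_binder = {"bound_in_domain": False}  # antecedent outside the domain

print(choose_form(local_binder))     # anaphor
print(choose_form(no_local_binder))  # pronominal
```

The complementary distribution falls out of the ordering of the two tests rather than from two independent conditions on the same domain.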
The PIP can be viewed as a version of the blocking principle that is part
of the definition of optimality in (63). The only relevant change that must be
made for the case at hand concerns the question of which entities compete.
We can now assume that the competing items are complete syntactic objects
(syntactic candidates), rather than feature assignment mechanisms.
4.3 Conclusion
5 Optimality-Theoretic Syntax
T1: Determining optimality
Candidates |  A  |  B  |  C
☞ C1       |     |     | *
   C2      |     |     | **!
   C3      |     | *!  |
   C4      | *!  |     |
   C5      | *!  |     | *
T2: Reranking
Candidates |  A  |  C  |  B
   C1      |     | *!  |
   C2      |     | *!* |
☞ C3       |     |     | *
   C4      | *!  |     |
   C5      | *!  | *   |
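The way a ranking of violable constraints picks a winner in tableaux like T1 and T2 can be sketched in Python as a lexicographic comparison of violation profiles. The violation counts below are an assumption: they are one reading of the two tableaux (C1 and C2 violating C, C3 violating B, C4 and C5 violating A, with a further violation of C for C5), and the function name evaluate is hypothetical.

```python
def evaluate(candidates, ranking):
    """Return the candidates with the best violation profile, comparing
    constraints in ranking order (strict domination)."""
    def profile(name):
        # Violations listed from the highest-ranked constraint downward.
        return tuple(candidates[name].get(c, 0) for c in ranking)
    best = min(profile(n) for n in candidates)
    return [n for n in candidates if profile(n) == best]

# Violation counts as read off the tableaux (an assumption, see above).
candidates = {
    "C1": {"C": 1},
    "C2": {"C": 2},
    "C3": {"B": 1},
    "C4": {"A": 1},
    "C5": {"A": 1, "C": 1},
}

print(evaluate(candidates, ["A", "B", "C"]))  # ranking of T1
print(evaluate(candidates, ["A", "C", "B"]))  # ranking of T2
```

Comparing tuples position by position means that a single violation of a higher-ranked constraint outweighs any number of violations of lower-ranked ones, which is why swapping B and C is enough to change the winner.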
why the candidates that are subject to optimization should not be syntactic
objects of a more complex type, like (D-structure, S-structure, LF) tuples as
in government and binding theory, or, indeed, complete derivations, as in the
minimalist program. 26 The choice of candidate type goes hand in hand with
the choice of local constraint type that shows up in the H-Eval part as vio-
lable and ranked: If candidates are representations, constraints will be repre-
sentational; if candidates are derivations, constraints will be derivational; and
if candidates are (D-structure, S-structure, LF) tuples as in government and
binding theory, constraints can take any of the forms sketched in section 2.
Similarly, candidate sets can be defined in various ways, which of course
significantly influences the nature of the competition. Basically, all of the def-
initions of reference sets in minimalist syntax that have been proposed (see
section 3 and Sternefeld 1997) are also potential definitions of candidate sets
in optimality-theoretic syntax. A further influential definition of candidate
sets comes from Grimshaw (1997). She postulates that two candidates (S-
structure representations) compete iff they are realizations of the same pred-
icate/argument structure and have non-distinct logical forms (or non-distinct
interpretations).
By making optimality depend on an intricate system of violable and ranked
constraints, H-Eval - and hence, the concept of competition - becomes even
more important than in minimalist syntax and blocking syntax. As a matter
of fact, much work in optimality-theoretic syntax has tried to minimize the
role of the Gen component, and maximize the role of the H-Eval component
(but see Pesetsky 1997, 1998 for some cautionary remarks).
An optimality-theoretic approach gains immediate support in all those con-
texts where postulating a competition of syntactic objects is initially plausi-
ble. This includes, but is by no means confined to, contexts where notions
of economy seem to play a role. A prototypical case is one in which the
wellformedness of a sentence S_i that exhibits an otherwise peculiar
property seems to depend on the unavailability of another sentence S_j that
exhibits the property one would normally expect. Here, S_i is often referred to as
a "repair" form; a typical instance is the English do-support construction.
Accordingly, do-support was among the first phenomena to be tackled in
optimality-theoretic syntax (see Speas 1995 and Grimshaw 1997). Most of
the constructions discussed in sections 2-4 can also be viewed as suggesting
an underlying competition; and indeed, they can fruitfully be addressed in
optimality-theoretic syntax. This is shown in the following section.
T3: Reflexivization
Candidates LOC-ANT REF-ECON
☞ C1: John1 likes himself1
C2: John1 likes him1 *!
T4: Pronominalization
In section 2, we noted that government and binding theory accounts for the
complementizer-trace effect in (4-a) on a purely local basis, without postulat-
ing a competition with the complementizer-less variant in (4-b) from which
only the latter would emerge as optimal. This view is abandoned in Déprez
(1991), which is the basis of the optimality-theoretic account advanced in
Grimshaw (1997). As background, Grimshaw assumes that the size of clauses
is variable. Clauses are extended projections of V; they are minimally VPs,
but they can be IPs, CPs, or functional projections of an even bigger size,
depending on the outcome of optimization. Bridge verbs in English permit
both CP-embedding (with a complementizer - a declarative CP without a
complementizer will typically fatally violate a high-ranked constraint that
precludes empty head positions) and IP- or VP-embedding (without a com-
plementizer). In the latter case, IP must be chosen if an auxiliary or do is
present (i.e., if the need arises to accommodate an additional lexical head);
VP can be chosen otherwise. The main constraints that are needed in the ac-
count of complementizer-trace effects are listed in (73). A possible ranking
for English is OP-SPEC » T-LEX-GOV » STAY. 28
T7: Object wh-movement
Candidates OP-SPEC T-LEX-GOV STAY
☞ C1: ... who1 you think [CP that [IP she will invite t1 ]] *
C3: ... you think [CP that [IP she will invite who1 ]] *!
C4: ... you think [IP she will invite who1 ] *!
* *
☞ C2: ... why1 you think [IP she has left t1 ]
C3: ... you think [CP that [IP she has left why1 ]] *!
C4: ... you think [IP she has left why1 ] *!
Recall that resumptive pronouns often seem to be possible only as last resort
strategies in cases where traces are blocked (see (8)). Competition-free mod-
els like government and binding theory have no obvious means to relate one
construction to the other (at least, as long as they are supposed to stay strictly
competition-free; see note 3); but the case is different in optimality-theoretic
syntax. An optimality-theoretic account of resumptive pronoun strategies is
developed in Legendre, Smolensky & Wilson (1998) (on the basis of evidence
from Chinese) and Pesetsky (1998) (on the basis of English data comparable
to those in (8), as well as evidence from Hebrew, Russian, and Polish). The
details of the two analyses differ a great deal, but the gist of the explanation
is identical; it centers around two constraints like those in (74). 31
A lot more could be said about relativization in English and other languages
in an optimality-theoretic approach (in particular, concerning that-relatives
and their relation to wh-relatives), but these considerations will have to suffice
for now; cf. Grimshaw (1997) and Pesetsky (1998).
Consider now the Avoid Pronoun facts that were discussed in section 2 (cf.
(10) and (12)). In English gerunds, PRO and a lexical pronoun can both oc-
cur in principle; however, PRO must be used instead of a lexical pronoun if
it can fulfill the Control Rule. A transfer of Chomsky's (1981) approach into
optimality theory is straightforward. The Control Rule in (11) can directly
be viewed as an optimality-theoretic constraint (with the same qualification
as in government and binding theory; see note 4); cf. (75-a). The Avoid Pro-
noun Principle in (13) can be simplified by turning this translocal constraint
into a local (though violable) one; cf. (75-b). 34 The ranking for English is
CONTROL » *PRON.
Suppose that candidate sets are defined in such a way that candidates with
PRO and candidates with a lexical pronoun can compete, but, crucially, that
sentences with different indexings (hence, different logical forms) do not
compete. Then, the facts fall into place. The blocking of a lexical pronoun
by PRO in cases where CONTROL can be satisfied is illustrated in table T10.
Table T11 illustrates the case where PRO is not co-indexed with the matrix
antecedent, thereby violating CONTROL. Here, the *PRON violation incurred
by all pronouns is non-fatal, and the pronoun strategy is optimal.
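The interaction of CONTROL » *PRON with index-sensitive candidate sets can be sketched as follows. This is a simplified illustration: it assumes that CONTROL is violated exactly when PRO is not co-indexed with the matrix antecedent, that *PRON is violated by every lexical pronoun, and that each indexing defines its own competition. The function names are hypothetical.

```python
from typing import Dict

RANKING = ["CONTROL", "*PRON"]  # ranking for English as stated in the text


def violations(subject_kind: str, coindexed: bool) -> Dict[str, int]:
    """Assumed violation profile of a gerund subject (PRO or lexical pronoun)."""
    control = 1 if subject_kind == "PRO" and not coindexed else 0
    pron = 1 if subject_kind == "pronoun" else 0
    return {"CONTROL": control, "*PRON": pron}


def optimal_subject(coindexed: bool) -> str:
    """Pick the better subject type within one candidate set (one indexing)."""
    profiles = {kind: violations(kind, coindexed) for kind in ("PRO", "pronoun")}
    return min(profiles, key=lambda kind: tuple(profiles[kind][c] for c in RANKING))


blocking_case = optimal_subject(coindexed=True)    # CONTROL can be satisfied
pronoun_case = optimal_subject(coindexed=False)    # PRO would violate CONTROL
```

Since the two indexings are evaluated separately, the pronoun's *PRON violation is fatal only in the competition where PRO satisfies CONTROL.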
The question arises of whether the evidence that is accounted for by translo-
cal constraints in minimalist syntax can also be reanalyzed in optimality-
theoretic syntax. At least to some extent, this seems to be the case. As
noted above, the STAY constraint adopted in Grimshaw (1997), Legendre,
Smolensky & Wilson (1998), and much related work, is essentially a local
version of the translocal Fewest Steps condition. Similarly, a local counter-
part has been suggested for the translocal economy constraint Shortest Paths.
Let us reconsider the superiority phenomenon as one of the core applications
of Shortest Paths. Some relevant examples are repeated in (76).
Ackema & Neeleman (1998) propose a local version of Shortest Paths that we
may call MIN-CHAIN ("Minimize Chain Length"). This constraint records a
star * for every node crossed by a movement chain. 35 Assuming that only
overt movement counts for the purposes of this constraint, (76-a) can
successfully block (76-b) under MIN-CHAIN: Other things being equal, the
wh-chain in (76-a) violates MIN-CHAIN twice (IP and C' are crossed), whereas
the wh-chain in (76-b) violates MIN-CHAIN four times (VP, I', IP, and C' are
crossed), the third violation being fatal already.
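The counting behind this comparison can be made explicit in a short sketch. The lists of crossed nodes are taken from the description above; the helper names `min_chain` and `fatal_star` are assumptions introduced for illustration.

```python
from typing import List, Optional


def min_chain(crossed_nodes: List[str]) -> int:
    """MIN-CHAIN: one star for every node crossed by the overt movement chain."""
    return len(crossed_nodes)


def fatal_star(loser_stars: int, winner_stars: int) -> Optional[int]:
    """Position of the loser's fatal star: the first star beyond the winner's count."""
    return winner_stars + 1 if loser_stars > winner_stars else None


CROSSED_76A = ["IP", "C'"]              # subject wh-chain, as described for (76-a)
CROSSED_76B = ["VP", "I'", "IP", "C'"]  # object wh-chain, as described for (76-b)

stars_a = min_chain(CROSSED_76A)
stars_b = min_chain(CROSSED_76B)
fatal = fatal_star(stars_b, stars_a)
```

The two candidates tie on the first two stars, so the third star of (76-b) is the one that decides the competition.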
Another account is developed by Legendre, Smolensky & Wilson (1998).
They start out with BAR, which is nearly identical to the CED given above:
that have been proposed in minimalist syntax or blocking syntax can be recast
in optimality-theoretic terms by employing local, violable constraints. 37
(78) a. PARSE[SCOPE]:
Scope assignment in the input must be realized by chain formation
in the output.
b. FAITH[COMP]:
The output value of [±COMP] is the same as the input value.
Note that (78-a) implies that the input is a more complex object than just a
collection of words (a numeration) or a predicate/argument structure; it must
be a highly structured representation that encodes the relative scope of
operators. (78-b) presupposes an abstract feature [±COMP] that for the present
purposes we can assume to be located on a V that selects a proposition. Let
us consider candidates that violate these constraints. Suppose that (79-a) is
the input for output candidate (79-b), and (79-c) is the input for output
candidate (79-d). Legendre, Smolensky & Wilson (1998) assume that (79-b)
violates PARSE[SCOPE] because matrix scope for how1 in the input (79-a)
(indicated by [+wh]1) is reduced to embedded scope in the output (again indicated
by [+wh]1). Similarly, Bakovic & Keer (1999) assume that (79-d) violates
FAITH[COMP] because a [-COMP] specification in the input contrasts with a
[+COMP] specification (hence, a complementizer) in the output. 38
(79) a. [+wh]1 ... wonder[+wh] [ [+wh]2 ... what2 ... how1 ... ] (input)
b. You wonder[+wh] [CP [+wh]1 [+wh]2 how1 John did what2 ] (output)
c. ... V[-comp] [ ... ] (input)
d. I think [CP that [PP on him ]1 no coat looks good t1 ] (output)
At this point, we need not go into the actual analyses in which these con-
straints play a role (as it happens, both faithfulness violations turn out to
be non-fatal, i.e., (79-b,d) are optimal). The crucial question is: Is it really
necessary to refer to the concept of input here, or is it possible to read the
respective violations off the output forms, without any reference to inputs?
At least for the cases at hand, the answer is straightforward: By enriching
output representations in ways that have independently been proposed, a ref-
erence to inputs becomes unnecessary. (79-a,b) is a case where the intended
matrix scope is not reached by chain formation in the candidate. Employing
abstract scope markers (Σ) in S-structure representations (cf., e.g., Williams
1986), we can equivalently encode this input information in the output, as
in (80-a). 39 As for the case in (79-c,d), the only assumption that we have to
make (and which strikes us as innocuous, in fact, completely standard) is that
selectional properties of lexical heads are accessible in syntax; cf. (80-b).
(80) a. Σ1 you wonder[+wh] [CP [+wh]1 [+wh]2 how1 John did what2 ] (output)
b. I think[-comp] [CP that [PP on him ]1 no coat looks good t1 ] (output)
If this result can be generalized, and all syntactic faithfulness constraints can
be reanalyzed in this way, we can conclude that these constraints do not
support the concept of input anymore. Why should it be that the notion of
input is relevant for phonological faithfulness constraints, but not for their
syntactic counterparts? The answer, we believe, follows from what appears
to be a fundamental difference between syntax and phonology: Syntax is
an information-preserving system with richly structured output candidates,
whereas phonology is a system that loses information, so that reference to an
underlying input is necessary in constraints.
With this in mind, let us turn to the other input function noted above, that
of defining candidate sets. Since syntactic output candidates are richly struc-
tured, all the relevant information that they must share in order to compete can
be read off them, independently of what notion of candidate set is adopted;
again, this is in sharp distinction to phonology. Thus, it is possible to ex-
plicitly define candidate sets without reference to the concept of input. For
instance, if we follow Grimshaw (1997) in assuming that competing can-
didates must have the same predicate/argument structure, we can read this
information off the potentially competing candidates themselves.
As a matter of fact, it turns out that an input-independent characterization of
candidate sets cannot even be avoided in Grimshaw's own approach. Recall
that Grimshaw (1997) postulates that two candidates compete only if they
have non-distinct logical forms (in addition to identical predicate/argument
structures). If the input fully determines the candidate set, this presupposes
that an input is a complex object that exhibits all relevant logical form in-
formation. It is generally assumed that outputs can deviate from inputs in
many ways, subject only to faithfulness constraints. Hence, if nothing else
is said, we expect that output candidates can be semantically unfaithful to
the input by, e.g., applying scope reduction (such that, e.g., a wh-phrase with
matrix scope in the input is interpreted with embedded scope in the output).
This clearly implies that candidates with distinct logical forms can compete.
This consequence is embraced by Legendre, Smolensky & Wilson (1998) (cf.
(79-b)). However, such a result is incompatible with Grimshaw's (1997) as-
sumptions, according to which competing candidates must have (not: go back
to) non-distinct logical forms. Thus, even in this approach, the input cannot
completely determine the competition; the requirement of non-distinct logi-
cal forms must be stipulated on top of it.
More generally, it emerges that an input-free characterization of candidate
sets is both readily available and independently motivated. Hence, reference
to inputs is unnecessary for the purpose of defining competition in syntax.
From all this, we would like to conclude that it may eventually be possible
to dispense with the notion of input in syntax; but further research is needed
in this domain (also see note 43 below).
Let us apply the suggestions that can be found in the literature to the case at
hand. First, Pesetsky (1997, 1998) emphasizes that certain sentences may be
ungrammatical not because they are classified as suboptimal in the H-Eval
part of the grammar, but because they cannot be generated by Gen in the first
place. Thus, a constraint like (83) might be part of Gen.
(84) could block (83) as suboptimal; but this optimal candidate would be
uninterpretable (indicated by #) and, hence, unusable.
These two approaches have in common that they allow the possibility that ab-
solute ungrammaticality is not located in the H-Eval component of grammar,
but in a component that precedes (Gen) or follows (interpretation) optimiza-
tion. If, however, H-Eval is to be held responsible for the ungrammaticality
of (82), there must be a competing candidate with a better constraint profile
that blocks it. A priori, this might be a candidate that employs a resumptive
pronoun strategy, which is only legitimate in this context as a last resort. If
this were so, the ineffability problem would be spurious in the case at hand.
However, (85) shows that the resumptive pronoun strategy is not an option
in German (a constraint like RES must outrank ADJ-ISL and other locality
constraints in German):
(85) *Was1 ist Fritz eingeschlafen [CP nachdem er es1 gelesen hat ] ?
what is Fritz fallen asleep after he it read has
What, then, could the optimal candidate blocking (82) look like? Following
Prince & Smolensky (1993), Ackema & Neeleman (1998) propose that the
empty candidate ∅ (the "null parse") is part of every candidate set. This
candidate violates the constraint in (86), which is typically ranked high. 40
Constraints that are ranked higher than *∅ in effect become inviolable (given
that there is no constraint except *∅ that ∅ can violate). In this sense, *∅
introduces a dividing line into rankings. Thus, if both ADJ-ISL and the
constraint that triggers wh-movement (e.g., OP-SPEC) outrank *∅, adjunct
islands become inviolable. This is shown in table T12.
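The dividing-line effect of the null parse can be sketched with three candidates. The profiles are simplified assumptions for illustration (other constraints such as STAY are ignored), and the candidate labels are hypothetical.

```python
from typing import Dict, List

Profile = Dict[str, int]

# Assumed, simplified profiles for the adjunct-island competition.
CANDIDATES: Dict[str, Profile] = {
    "wh-extraction from adjunct": {"ADJ-ISL": 1},
    "wh-in-situ": {"OP-SPEC": 1},
    "null parse": {"*∅": 1},   # the empty candidate violates only *∅
}


def optimal(ranking: List[str]) -> str:
    """Return the candidate with the lexicographically smallest violation vector."""
    return min(
        CANDIDATES,
        key=lambda name: tuple(CANDIDATES[name].get(c, 0) for c in ranking),
    )


# ADJ-ISL and OP-SPEC both outrank *∅: the null parse wins, so no real
# sentence can violate either of the higher-ranked constraints.
island_inviolable = optimal(["ADJ-ISL", "OP-SPEC", "*∅"])

# ADJ-ISL ranked below *∅: the island violation is tolerated.
island_violable = optimal(["OP-SPEC", "*∅", "ADJ-ISL"])
```

Under the first ranking the competition has no pronounceable winner, which is one way of modelling absolute ungrammaticality inside H-Eval.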
The crucial difference from (84) is that was1 is turned into an indefinite
pronoun, and the matrix C[+wh] is turned into a C[-wh]. Thus, there is a feature
change from [+wh] in (82) to [-wh] in (87), and the sentence is interpreted as
declarative, rather than as a question. 41 If (87) is to block (82) as suboptimal,
this presupposes that candidates that differ in their wh-feature specification
can compete. But then, the problem arises that we would also wrongly expect
one of the sentences in (88) to block the other.
(89) FAITH[WH]:
The output value of [±wh] is the same as the input value.
Suppose now that ADJ-ISL and OP-SPEC are ranked higher than
FAITH[WH]. Then, (87) will have a better constraint profile than (82) both
in the competition that has a [-wh] specification in the input, and in the
competition that has a [+wh] specification in the input. Thus, there is a
"neutralization" of different input specifications in the output. This is shown in tables
T13 and T14. 42
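Neutralization can be sketched by running two competitions that differ only in the input value of [±wh]. The output inventory below is an assumption for illustration: the in-situ candidate and all violation marks other than FAITH[WH] are hypothetical simplifications.

```python
from typing import Dict, List

RANKING: List[str] = ["ADJ-ISL", "OP-SPEC", "FAITH[WH]"]

# Each output carries its own [±wh] value plus assumed markedness violations.
OUTPUTS: Dict[str, dict] = {
    "(82) wh-extraction from adjunct": {"wh": "+", "marks": {"ADJ-ISL": 1}},
    "wh-in-situ (hypothetical)": {"wh": "+", "marks": {"OP-SPEC": 1}},
    "(87) declarative with indefinite": {"wh": "-", "marks": {}},
}


def profile(output: dict, input_wh: str) -> tuple:
    """Violation vector of an output relative to a given input [±wh] value."""
    marks = dict(output["marks"])
    # FAITH[WH]: one star if the output value differs from the input value.
    marks["FAITH[WH]"] = 0 if output["wh"] == input_wh else 1
    return tuple(marks.get(c, 0) for c in RANKING)


def optimal_output(input_wh: str) -> str:
    return min(OUTPUTS, key=lambda name: profile(OUTPUTS[name], input_wh))


from_plus_wh = optimal_output("+")
from_minus_wh = optimal_output("-")
```

Both inputs converge on the same output because the only constraint that distinguishes them, FAITH[WH], is ranked lowest.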
As remarked above, this does not exhaust the list of open issues that are cur-
rently under debate in optimality-theoretic syntax. We end this section by
briefly mentioning a few others.
Optionality
In the best of all possible worlds, one would not expect optionality to arise
in a theory that selects the best candidate. The solutions that have been pro-
posed in view of this situation center around concepts like (i) true optionality,
according to which more than one candidate can be optimal due to an iden-
tical constraint profile (recall the above discussion of complementizer-trace
effects); (ii) constraint ties, which come in various versions (global and local,
ordered, conjunctive, and disjunctive) and all somehow incorporate the idea
that two (or more) constraints are equally important; (iii) pseudo-optionality,
which rests on the idea that the observed optionality is only apparent, and
reducible to different optimization procedures in different candidate sets; and
(iv) neutralization again, essentially an elaborate version of (iii). It turns
out that none of these solutions is completely unproblematic. See Müller
(2000:chapter 5) for a critical overview.
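One of these options, the global constraint tie, can be sketched under a common reading that is assumed here rather than taken from the text: a candidate counts as optimal if it wins under at least one total ranking obtained by resolving the tie. The constraint names A, B, C and candidates K1 to K3 are hypothetical.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Set

Profile = Dict[str, int]


def optimal(candidates: Dict[str, Profile], ranking: List[str]) -> Set[str]:
    """All candidates with a minimal violation vector under a total ranking."""
    def vector(name: str) -> tuple:
        return tuple(candidates[name].get(c, 0) for c in ranking)

    best = min(vector(name) for name in candidates)
    return {name for name in candidates if vector(name) == best}


def optimal_with_tie(candidates: Dict[str, Profile], above: Sequence[str],
                     tied: Sequence[str], below: Sequence[str]) -> Set[str]:
    """Union of the winners over every resolution of the tied constraints."""
    winners: Set[str] = set()
    for order in permutations(tied):
        winners |= optimal(candidates, list(above) + list(order) + list(below))
    return winners


# Hypothetical profiles for illustration.
CANDIDATES = {"K1": {"B": 1}, "K2": {"C": 1}, "K3": {"A": 1}}

strict = optimal(CANDIDATES, ["A", "B", "C"])            # total ranking A » B » C
tied = optimal_with_tie(CANDIDATES, ["A"], ["B", "C"], [])  # A » {B, C}
```

Under the strict ranking only one candidate survives, whereas tying B and C lets two candidates emerge as optimal, which is the kind of optionality the tie is meant to capture.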
Degrees of Grammaticality
Cumulativity
Parameterization
Work in government and binding theory and the minimalist program has fo-
cussed on morphological properties of lexical items as factors that determine
parametrization. Such a view can in principle be reconciled with optimality-
theoretic syntax without too much ado (one and the same syntactic constraint
ranking may yield different optimal candidates if the morphological proper-
ties of these candidates differ from language to language, and there are con-
straints that refer to these morphological properties). However, in practice,
work in optimality-theoretic syntax has often sought to account for syntactic
parameterization exclusively in terms of syntactic reranking, and either deny
a relation to morphology, or view morphological properties not as the ba-
sis, but as a reflex of syntactic parameterization. Again, this issue is far from
being settled; for opposing views, see, e.g., Grimshaw & Samek-Lodovici
(1998) and Legendre, Smolensky & Wilson (1998) on the one hand, and
Vikner (2000) on the other.
Another recurring question in the optimality-theoretic approach to parame-
terization is whether every reranking of constraints that is logically possible
is also linguistically plausible (i.e., results in a potential grammar). The
hypothesis that this is so is known as factorial typology, and is the focus of much
recent work.
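The idea of a factorial typology can be sketched by enumerating every ranking of a small constraint set and recording which candidate wins under each. The three constraints and five candidate profiles are hypothetical values chosen for illustration.

```python
from itertools import permutations
from typing import Dict, Tuple

Profile = Dict[str, int]

CONSTRAINTS = ["A", "B", "C"]

# Hypothetical violation profiles (assumption for illustration only).
CANDIDATES: Dict[str, Profile] = {
    "C1": {"C": 1},
    "C2": {"C": 2},
    "C3": {"B": 1},
    "C4": {"A": 1},
    "C5": {"A": 1, "B": 1},
}


def winner(ranking: Tuple[str, ...]) -> str:
    """Optimal candidate under one total ranking (profiles here never tie)."""
    return min(
        CANDIDATES,
        key=lambda name: tuple(CANDIDATES[name].get(c, 0) for c in ranking),
    )


# One entry per logically possible ranking: 3! = 6 grammars.
TYPOLOGY = {ranking: winner(ranking) for ranking in permutations(CONSTRAINTS)}
PREDICTED_OUTPUTS = set(TYPOLOGY.values())
```

Six rankings yield only three distinct winners in this toy system. C2 and C5 never win because each is harmonically bounded by another candidate, so reranking alone cannot make them grammatical in any predicted language.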
Multiple Optimization
Most of the papers in this volume originate from a workshop at the 21st
Annual Conference of the DGfS (German Linguistic Society), which took
place at the University of Constance in February 1999. The contributions
have in common that they discuss pieces of empirical evidence for which
a competition-based approach has some initial plausibility. They are all pri-
marily concerned with optimality theory, and they take up a number of the
open issues that were just mentioned.
Büring's paper is a study of free word order in German, a domain that
has been tackled in terms of violable and ranked constraints in pre-optimality
work going back to the 70's and 80's. Like Choi (1999), Büring's approach
presents an optimality-theoretic analysis that rests on Lenerz's (1977) seminal
work. Central theoretical notions that play a role include optionality, degrees
of grammaticality, and, in particular, the prosody/syntax interface.
Fanselow & Cavar adopt the copy theory of movement and assume that
overt and covert movement both apply before spell-out. The crucial difference
relates to the question of which members of a copy chain are pronounced,
and which are deleted. To give a comprehensive answer to this question,
the authors discuss evidence from a variety of languages that includes
long-distance and partial wh-movement, the wh-copy construction, the NP split
construction, and instances of head movement. They develop an
optimality-theoretic approach that reconciles features of the analyses in Pesetsky (1998)
and Grimshaw (1997), and that relies on a system of multiple (local) opti-
mization which integrates Chomsky's (1998) concept of a phase.
serve as the empirical basis are (a) complementizer drop in English, (b) wh-
movement in French root clauses, and (c) the German "Ersatzinfinitiv" (IPP)
construction. For each of these phenomena, a global tie analysis is compared
with a neutralization analysis; general strategies are suggested that permit a
transfer from one type of approach to the other; and a conclusion is drawn
that ultimately favors the neutralization solution.
The focus of Vikner's paper is the conflict that arises between two well-
motivated constraints in Icelandic: First, the relative scope of quantified items
must correspond to their surface order; second, NPs can undergo object shift
in front of an adverbial only if the main verb has undergone movement. In-
terestingly, it seems as though relative scope does not have to correspond
to surface order in exactly those contexts in which object shift is blocked.
Vikner shows that this supports an optimality-theoretic analysis in which the
first constraint is ranked below the second one, and is thus violable in the case
of conflict. Finally, the analysis is extended to German.
Vogel takes as a starting point the observation that free relative construc-
tions by their very nature strongly suggest constraint violability and constraint
ranking: They are incompatible with the standard assumption that there is
a one-to-one correspondence between Case assigners and items that are as-
signed Case. Moreover, Case conflicts can show up in free relatives which
are often resolvable by ranking (but may also result in absolute ungrammati-
cality). On this basis, Vogel develops an optimality-theoretic analysis of free
relative constructions in German, and he investigates the typological implica-
tions that result from reranking the proposed constraints; among other things,
the analysis sheds new light on the concepts of factorial typology and neu-
tralization.
Finally, Wanner observes that there are conflicts between linking rules
which become manifest in the domain of psych verbs in English. For instance,
the CONTROL-RULE favors experiencers as external arguments, whereas
the CAUSER-RULE prefers causers as external arguments; in an optimality-
theoretic approach, the conflict can be resolved by ranking the latter rule
above the former, and this is what explains the difference between Mary
frightens John (where the theme is a causer) and John fears Mary (where
the theme is not a causer). An interesting theoretical aspect of this analysis is
that the competing candidates are not sentences, but argument structures.
We believe that the papers collected in this volume give a fair indication
of both the potential and the limitations of optimality-theoretic syntax, and
of competition-based syntax in general. To us, they strongly suggest that it is
Acknowledgments
Notes
1. Our use of the term global follows its original interpretation in Lakoff (1971)
throughout this introduction. Sometimes, global is understood in a rather dif-
ferent sense in the literature (including Chomsky 1995 and Collins 1997), as a
synonym for translocal or transderivational (see below). As we will see, in this
second interpretation, a global constraint can in fact not be checked by
exclusively looking at a given syntactic object S_i.
2. The resumptive pronoun strategy is by itself marginal in English and is cho-
sen here mainly for expository reasons; see Chomsky (1981:173) for a discus-
sion of the case at hand. However, resumptive pronouns as a last resort in cases
where movement is blocked are widely attested in other languages. See Shlonsky
(1992), Pesetsky (1998), and the references cited there.
3. Note, however, that Chomsky (1982:63f.) envisages an account in terms of the
Avoid Pronoun Principle, which, as we will see, is an exception insofar as it is in
fact a non-local constraint in government and binding theory.
4. As Manzini shows, the Control Rule is actually a theorem that can be derived
from more primitive assumptions. This need not concern us here.
5. The Avoid Pronoun Principle has been applied to pro-drop phenomena in lan-
guages like Italian by Haegeman (1994:217). The idea here is that the availabil-
ity of the empty pronominal pro in the subject position of finite clauses tends to
make the use of an overt pronoun impossible; on this view, overt subject pronouns
can only show up in pro-drop languages if they fulfill a function that pro cannot
fulfill (like, e.g., focus interpretation). Also recall Chomsky's (1982) analysis of
resumptive pronouns that was mentioned above.
6. Also see Reinhart (1983) on a version of binding theory that relies on pragmatic
constraints of this type.
Opi that Juan bought t] ?" For expository purposes, we will ignore this compli-
cation in what follows, but the correct structure is still reflected in the translation.
Second, note that actual positions of items that are overtly visible do not always
reflect the position that is theoretically relevant in Nakamura's (1998) analysis. In
particular, he assumes that the structural subject position SpecT is left-peripheral,
and in many cases can only be filled at LF. Still, subject NPs behave in every re-
spect as if they occupied the SpecT position overtly. This covert subject raising
with overt effects is indicated here by italicizing the relevant subject NP; thus ital-
icization is meant to imply that the italicized NP is pronounced in the position
of its trace. Note that these complications do not arise in a language like Toba
Batak, which otherwise exhibits the same general effect; see Schachter (1984)
and Sternefeld (1995).
19. Note, however, that a residue of the Procrastinate condition still shows up in
Chomsky's (1998:14) translocal principle that prefers Agree over Move. Also
see the next section.
20. It is worth noting that the notion of optimality has systematically been used in
minimalist syntax, apparently without recourse to optimality theory as developed
by Prince & Smolensky (1993), and at a time when optimality-theoretic syn-
tax papers did not yet exist. See, e.g., Chomsky (1993:4), and, for explicit uses
of the notion, Collins (1994:46), Kitahara (1997:18), and Frampton & Gutman
(1999:5).
21. See Müller (2000:chapter 4) for a slightly more realistic (albeit still simplified)
example.
22. See Fanselow (1997), who argues that was2 can be scrambled to a position in
front of wer1 before wh-movement takes place in (61-b). However, to avoid
a blocking of (61-a) by (61-b) in a system with translocal constraints (which
Fanselow does not assume), it would then also have to be ensured that the two
derivations do not compete; this could be achieved by assuming that the presence
vs. absence of the trigger for was-scrambling creates two different reference sets.
An alternative would be to assume that whereas translocal constraints cannot be
parameterized, the definition of reference set can be. Without recourse to
intermediate scrambling, reference sets might then be defined in German in such a
way that (61-a) and (61-b) do not compete, whereas they could be defined
differently in English, so that the English counterparts of these derivations do compete.
See Sternefeld (1997) for an extensive discussion of this option.
23. Recall, however, that Chomsky retains some translocal constraints even in more
recent work, though often hesitantly and with a sense that if truly necessary,
translocality would qualify as an "imperfection" of language. Thus, directly after
suggesting the Shortest Paths account of the ban on the acyclic derivation of
freezing effects with NP raising cited above, Chomsky (1995, 328) comes close
to revoking it by stating: " - though the issue is nontrivial, in part because we are
invoking here a 'global' [i.e., translocal] notion of economy of the sort we have
sought to avoid."
24. Also see Hornstein (2000), who gives the same kind of account in a minimalist
setting.
25. The standard way out of the problems created by optionality chosen by pro-
ponents of blocking syntax is to find subtle semantic differences between the
relevant sentences - in other words, to deny true optionality.
26. The former strategy is discussed in Heck (1998:this volume); the latter strategy
is pursued in Müller (1997).
27. We hasten to add that all the case studies in this section are simplified versions
of the actual analyses proposed in the literature. In the present context, we are
mainly interested in the logic of the argument, not in the specific (or maximally
elegant) formulation of the constraints. Accordingly, we leave open the questions
of defining candidates and candidate sets where they do not seem to be important
for our present purposes. Note also that the simplification is particularly radical in
Wilson's (1999) case. Based on evidence from binding theory, Wilson argues for
an elaborate model of multiple optimization in syntax (see section 5.3.3 below);
he is concerned with many more data and, eventually, typological universals that
the naive analysis presented here cannot possibly account for.
28. The ranking of T-LEX-GOV and STAY could also be reversed; this ranking is not
determined by the cases we are interested in here.
29. To avoid the issue of do-support in root clauses, which is orthogonal to the issue
of complementizer-trace effects in embedded clauses, we have chosen examples
here in which the SpecC[+wh] target position is in an embedded clause.
30. Note that a violation of T-LEX-GOV will automatically imply a violation of the
more general STAY constraint. Hence, given that there is no other constraint on
which C1 and C2 differ, it follows that C1's constraint profile is better than that
of C2 under any ranking; i.e., C1 harmonically bounds C2.
31. The relevant constraints are BAR3 (see below) and FILL in Legendre, Smolensky
& Wilson (1998), and ISLAND-COND and SILENT-TRACE in Pesetsky (1998).
CNPC should be viewed as a placeholder for one or more general conditions
that yield the described effects; RES is arguably part of a more general system
of constraints on pronouns. Also see Hornstein (2000) on resumptive pronouns and
islands, including CNPC, in English.
32. Operator movement in relative clauses in English can be achieved by (something
along the lines of) Grimshaw's (1997) ranking OP-SPEC » STAY (plus STAY
» RES); this option is chosen by Legendre, Smolensky & Wilson (1998). In
contrast, Pesetsky (1997) does not assume the movement operation in relative
clauses to be subject to optimization; in his view, Gen does not generate the in-
situ version in English in the first place.
33. It seems that in order to achieve compatibility of this account of resumptive pro-
nouns with the account of the lack of complementizer-trace effects with adjuncts
sketched in the preceding section, the ranking RES » T-LEX-GOV would have
to be assumed. That said, one will probably have to assume independent, high-
ranked constraints that block resumptive pronouns in adjunct chains, anyway.
34. Is *PRON confined to personal (and possessive) pronouns, or does it also cover
anaphoric pronouns? Under the first option, REF-ECON and *PRON might be the
same constraint. Under the second option, we would in fact face what is known
as a subhierarchy of constraints: A general constraint *PRON prohibits all kinds
of pronouns, a more specific constraint *PERS-PRON (= REF-ECON) prohibits
only personal pronouns, and an even more specific constraint *RES-PERS-PRON
(= RES) prohibits only personal pronouns used as resumptives.
35. Ackema & Neeleman call this constraint STAY, but this may be somewhat un-
fortunate, given that MIN-CHAIN differs substantially from Grimshaw's (1997)
STAY, in the same way that Shortest Paths differs from Fewest Steps.
36. Local constraint conjunction makes it possible to reintroduce the concept of
cumulativity into optimality-theoretic syntax: Multiple violations of a given
constraint Con1 may not directly outweigh a single violation of a higher-ranked
constraint Con2, but can do so indirectly by triggering a violation of an even
higher-ranked constraint Con1'.
37. An interesting question is whether a translation of translocal constraints into lo-
cal constraints is actually needed in optimality theory; in other words: Could
not some of the violable and ranked constraints in the H-Eval part be translocal
themselves, just like the basic optimality principle is? For instance, one could en-
visage a translocal SHORTEST PATHS that fulfills the same task as MIN-CHAIN
or BAR': SHORTEST PATHS selects the candidate Ci with the shortest movement
paths in a given candidate set, and this can be signalled by stipulating that all
candidates except Ci are assigned a star * under this constraint. Such an approach
may raise additional complexity issues, and it has - to the best of our knowledge
- not yet been proposed in optimality-theoretic syntax. Still, it seems to us to be
viable in principle. Indeed, a translocal constraint of this type has been proposed
for phonology in Prince & Smolensky (1993) (H-NUC, which, however, is even-
tually replaced there by a subhierarchy of local constraints that are derived by a
process of harmonic alignment).
38. In both cases, only those aspects of the input are considered that matter for the
faithfulness constraints under consideration.
39. Note that the distinction between the actual scope position for a wh-item (here
designated by [+wh]) and the "intended" scope position for a wh-item (here
designated by Σ) is fundamental in the analysis by Legendre, Smolensky & Wilson
(1998), and not an artefact of the input-independent approach.
40. Assuming the concept of input, this constraint amounts to the statement that the
input must not be left completely unrealized.
41. Here we exploit the fact that was is ambiguous between a wh-reading and an
indefinite reading in (colloquial) German. This does not hold for other wh-phrases
like welches Buch ('which book'), which, however, also cannot be extracted from
adjunct islands. For these cases, the neutralization approach would have to be
complicated in such a way that the candidate with the [-wh] NP (perhaps ein
Buch ('a book')) deviated from the one that must be blocked not just in feature
specification, but also in morphological shape. Such complications do not affect
the general argument, though.
42. A side remark: The candidates in (79-b) and (79-d) that were discussed in the pre-
vious section also signal input neutralization; these candidates are also optimal
in candidate sets where they do not violate the respective faithfulness constraint.
43. This account rests on the concept of input. Is it possible to maintain the analysis
without reference to this notion? It is, but the task is slightly more difficult here
than in the cases that were discussed in the last section. We have to ensure that
an output candidate like (87), with a C[-wh] and a was[-wh], has abstract [+wh]
or [-wh] markers that encode the postulated input difference, and that can be
referred to by an appropriately revised FAITH[WH] constraint.
References
Choi, Hye-Won
1999 Optimizing Structure in Context: Scrambling and Information Structure.
Stanford: CSLI Publications.
Chomsky, Noam
1973 Conditions on transformations. In: Stephen Anderson & Paul Kiparsky
(eds.), A Festschrift for Morris Halle, 232-286. New York: Academic
Press.
Chomsky, Noam
1981 Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam
1982 Some Concepts and Consequences of the Theory of Government and
Binding. Cambridge, MA: MIT Press.
Chomsky, Noam
1986a Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam
1986b Knowledge of Language. New York: Praeger.
Chomsky, Noam
1991 Some notes on economy of derivation and representation. In: Robert
Freidin (ed.), Principles and Parameters in Comparative Grammar, 417-
454. Cambridge, MA: MIT Press.
Chomsky, Noam
1993 A minimalist program for linguistic theory. In: Kenneth Hale & Samuel
Jay Keyser (eds.), The View from Building 20, 1-52. Cambridge, MA:
MIT Press.
Chomsky, Noam
1995 Categories and transformations. (Chapter 4). In: The Minimalist Pro-
gram, 219-394. Cambridge, MA: MIT Press.
Chomsky, Noam
1998 Minimalist inquiries. Ms., MIT, Cambridge, MA
Chomsky, Noam — Howard Lasnik
1993 Principles and parameters theory. In: Joachim Jacobs, Arnim von Ste-
chow, Wolfgang Sternefeld & Theo Vennemann (eds.), Syntax, vol. I,
506-569. Berlin: de Gruyter.
Cole, Peter
1982 Subjacency and successive cyclicity: Evidence from Ancash Quechua.
Journal of Linguistic Research 2: 35-58.
Collins, Chris
1994 Economy of derivation and the generalized proper binding condition. Lin-
guistic Inquiry 25: 45-61.
Collins, Chris
1997 Local Economy. Cambridge, MA: MIT Press.
Déprez, Viviane
1991 Economy and the that-t effect. In Proceedings of the Western Conference
on Linguistics 4: 74-87.
DiSciullo, Anna-Maria — Edwin Williams
1987 On the Definition of Word. Cambridge, MA: MIT Press.
Epstein, Samuel David
1992 Derivational constraints on A'-chain formation. Linguistic Inquiry 23:
235-259.
Fanselow, Gisbert
1989 Konkurrenzphänomene in der Syntax. Linguistische Berichte 123: 385-414.
Fanselow, Gisbert
1991 Minimale Syntax. Habilitation thesis, Universität Passau.
Fanselow, Gisbert
1997 The proper interpretation of the minimal link condition. Ms., Universität
Potsdam.
Fanselow, Gisbert — Reinhold Kliegl — Matthias Schlesewsky
1999 Optimal parsing. Ms., Universität Potsdam.
Fox, Danny
1995 Economy and scope. Natural Language Semantics 3:283-341.
Frampton, John — Sam Gutman
1999 Cyclic computation. Syntax 2: 1-27.
Grimshaw, Jane
1994 Heads and optimality. Handout, Universität Stuttgart.
Grimshaw, Jane
1997 Projection, heads, and optimality. Linguistic Inquiry 28: 373-422.
Grimshaw, Jane — Vieri Samek-Lodovici
1998 Optimal subjects and subject universals. In: Pilar Barbosa et al. (eds.),
Is the Best Good Enough?, 193-219. Cambridge, MA: MIT Press &
MITWPL.
Haegeman, Liliane
1994 Introduction to Government and Binding Theory. Oxford: Blackwell.
Haider, Hubert
1983 Connectedness effects in German. Groninger Arbeiten zur Germanistis-
chen Linguistik 23: 82-119.
Heck, Fabian
1998 Relativer Quantorenskopus im Deutschen - Optimalitätstheorie und die
Syntax der Logischen Form. M.A. thesis, Universität Tübingen.
Heck, Fabian — Gereon Müller
2000 Repair-driven movement and the local optimization of derivations. Ms.,
Universität Stuttgart & IDS Mannheim. Short version in: Glow Newslet-
ter 44: 26-27.
Hendriks, Petra — Helen de Hoop
1999 Optimality theoretic semantics. Ms., University of Groningen. (Cognitive
Science and Engineering Prepublications 98-3.)
Hornstein, Norbert
2000 Is the binding theory necessary? Ms., University of Maryland.
Jäger, Gerhard — Reinhard Blutner
2000 Against lexical decomposition in syntax. Ms., ZAS & Humboldt-
Universität Berlin.
Kiparsky, Paul
1982 From cyclic phonology to lexical phonology. In: Harry van der Hulst &
Neil Smith (eds.), The Structure of Phonological Representations, vol 1,
131-175. Dordrecht: Foris.
Kitahara, Hisatsugu
1993 Deducing 'superiority' effects from the shortest chain requirement. Har-
vard Working Papers in Linguistics 3: 109-119.
Kitahara, Hisatsugu
1997 Elementary Operations and Optimal Derivations. Cambridge, MA: MIT
Press.
Koster, Jan
1987 Domains and Dynasties. Dordrecht: Foris.
Lakoff, George
1971 On generative semantics. In: Danny Steinberg & Leon Jakobovits (eds.),
Semantics, 232-296. Cambridge: Cambridge University Press.
Lasnik, Howard — Mamoru Saito
1992 Move α. Cambridge, MA: MIT Press.
Legendre, Géraldine — Paul Smolensky — Colin Wilson
1998 When is less more? Faithfulness and minimal links in wh-chains. In: Pilar
Barbosa et al. (eds.), Is the Best Good Enough?, 249-289. Cambridge,
MA: MIT Press & MITWPL.
Lenerz, Jürgen
1977 Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Stauffen-
burg.
Manzini, Rita
1983 On control and control theory. Linguistic Inquiry 14: 421-446.
Marantz, Alec
1995 The minimalist program. In: Gert Webelhuth (ed.), Government and
Binding Theory and the Minimalist Program, 351-382. Oxford: Black-
well.
May, Robert
1979 Must COMP-to-COMP movement be stipulated? Linguistic Inquiry 10:
719-725.
McCarthy, John — Alan Prince
1995 Faithfulness and reduplicative identity. In: Jill Beckman, Laura Walsh-
Dickie & Suzanne Urbanczyk (eds.), Papers in Optimality Theory, 249-
384. Amherst, MA: UMass Occasional Papers in Linguistics 18.
Müller, Gereon
1997 Partial wh-movement and optimality theory. The Linguistic Review 14:
249-306.
Müller, Gereon
2000 Elemente der optimalitätstheoretischen Syntax. Tübingen: Stauffenburg.
Müller, Gereon — Wolfgang Sternefeld
1996 A-bar chain formation and economy of derivation. Linguistic Inquiry 27:
480-511.
Nakamura, Masanori
1998 Reference set, minimal link condition, and parameterization. In: Pilar
Barbosa et al. (eds.), Is the Best Good Enough?, 291-313. Cambridge,
MA: MIT Press & MITWPL.
Pafel, Jürgen
1998 Skopus und logische Struktur — Studien zum Quantorenskopus im
Deutschen. Habilitationsschrift, Universität Tübingen.
Pesetsky, David
1997 Optimality theory and syntax: Movement and pronunciation. In: Diana
Archangeli & D. Terence Langendoen (eds.), Optimality Theory. An
Overview, 134-170. Oxford: Blackwell.
Pesetsky, David
1998 Some optimality principles of sentence pronunciation. In: Pilar Barbosa
et al. (eds.), Is the Best Good Enough?, 337-383. Cambridge, MA: MIT
Press & MITWPL.
Pollock, Jean-Yves
1989 Verb movement, universal grammar, and the structure of IP. Linguistic
Inquiry 20: 365-424.
Wilson, Colin
1999 Bidirectional optimization and the theory of anaphora. Ms., Johns Hop-
kins University. To appear in: Géraldine Legendre, Jane Grimshaw &
Sten Vikner (eds.) Optimality Theoretic Syntax, Cambridge, MA: MIT
Press.
Let's Phrase It!
Focus, Word Order, and Prosodic Phrasing in German
Double Object Constructions
Daniel Büring
This paper presents a case study in the interaction of word order, prosody and
focus. The construction under consideration is the double object construction
in German. The analysis proposed is in line with the following more general
hypotheses:
First, focus and word order do not interact directly. There are no grammatical
rules that relate focus to specific phrase structural positions. Rather, focus
interacts with prosodic phrasing, which in turn may interact with word order.
Second, the kind of word order variation under investigation here is governed
by two potentially conflicting types of constraints: morphosyntactic
constraints that express ordering preferences relating to case, definiteness and
possibly other categories, and prosodic constraints that define what a prosodic
structure should look like. If these constraint families make incompatible
demands, languages may allow only the morphosyntactically perfect structure,
or only the prosodically perfect structure, or, as is arguably the case in
German, both.
Third, violable ranked constraints provide a well-suited framework to
account for these kinds of phenomena. Both the morphosyntactic and
the prosodic constraints, as well as those governing the relation between
prosody and focus, are implemented as markedness constraints. Their relative
(non-)ranking accounts for the variation observed within a language and
cross-linguistically.
1 Introduction
the finite verb in second position and the non-finite ones in final position to-
gether form a sort of bracket around the main body of the clause.
(1) initial position | finite verb | Mittelfeld | non-finite verb forms
As indicated, this main body of the clause, as delimited by the finite verb to
its left and the non-finite ones to its right, is traditionally called the Mittelfeld
('middle field').
In embedded clauses, the initial position usually remains empty and the
finite verb is found at the end, too. In its place the subordinating complemen-
tizer constitutes the left bracket of the Mittelfeld.
(2) complementizer | Mittelfeld | non-finite verb forms | finite verb
The Mittelfeld contains all non-clausal complements of the verb, some non-
finite clausal ones, and most adverbials (almost any of these can alternatively
occupy the initial position in declarative main clauses, a fact we can ignore
here). The relative order among the elements in the Mittelfeld is basically
free. In particular, German, unlike Dutch, allows reordering among the nom-
inal arguments quite freely. Subject and object as well as the two objects in
a ditransitive construction can be found in various orders. The following ex-
amples of embedded clauses from Müller (1998) (his (31) and (36)) illustrate
this:
(3) nominative-accusative-order
a. ... dass eine Frau den Fritz geküsst hat.
that a woman the-ACC Fritz kissed has
b. ... dass den Fritz eine Frau geküsst hat.
that the-ACC Fritz a woman kissed has
'... that a woman kissed Fritz.'
(4) dative-accusative-order
a. ... dass man das Buch dem Fritz geschickt hat.
that one the book the-DAT Fritz sent has
b. ... dass man dem Fritz das Buch geschickt hat.
that one the-DAT Fritz the book sent has
'...that someone sent Fritz the book.'
All arguments are nominal. Overt case marking for nominative, dative and
accusative is found on articles. As one might suspect, (4) allows even more
In his seminal study on German word order, Lenerz (1977) found that there
are two main semantic/pragmatic factors that co-determine object ordering
in German double object constructions: definiteness and focus. Simplifying
slightly, the generalizations in (5) hold:
An equally important finding of that study was that there is one purely mor-
phosyntactic factor involved, too: 1
The DatO > AccO order in (a) is fine in both cases, whereas the AccO >
DatO order in (b) is only acceptable if DatO is in focus (or, as we shall some-
times say, F-marked).
But AccO > DatO order is unacceptable if AccO is indefinite; cf. (10) (note
that in both examples the focus follows the non-focus, in accordance with
(5-b)):
To derive these I proposed utilizing two constraints along the lines of (12)
and (13):
Both these constraints have been proposed in the literature and can be seen
to be independently motivated. I will return to this issue below. They inter-
act with a general syntactic faithfulness constraint that penalizes movement,
including scrambling, which we will call STAY (cf. Grimshaw 1997). Optionality
of movement results where the base order violates FF and the derived
order violates STAY but respects INDEFINITES and FF. Movement is prohibited
where the base order fulfills both STAY and FF; it is also prohibited if the
derived order violates INDEFINITES.2
In order to discuss the workings of this system I will implement it in the
form of an optimality grammar, as proposed in Choi (1996) and, independently,
in Büring (1997a) (it is the latter proposal I am going to discuss here,
although Choi's analysis uses essentially the same constraint tie, her CN2
- dative precedes accusative, and NEW - roughly: a non-focused argument
precedes a focused one, to derive focus-related word order variation; since I
will propose a fundamental reanalysis later in this paper, I will not attempt a
The <<C3> notation indicates the constraint tie. A tie can be resolved in two dif-
ferent ways, in this case as in (15-a) or as in (15-b) (cf. Prince & Smolensky's
ordered global ties', see also Müller 1999 for more discussion).
c. [VP dAccO [VP DatO_F [VP tDatO [V' tAccO V]]]]   **!
As I already noted in that earlier work, this system also derives a case not
considered in Lenerz (1977), but observed in Eckardt (1996): If both objects
are focused, scrambling is excluded, regardless of (in)definiteness. In terms
of the system proposed: STAY must not be violated if no improvement in
terms of FF results:
*
b. [VP iAccOa [VP DatO_F [V' tAccO V]]]   *!
The INDEFINITES constraint does not apply here since AccO is generic.
Accordingly, (20-a) is optimal if the tie is resolved to STAY » FF, while (20-b)
is optimal if it is resolved to FF » STAY. Movement of the indefinite across
the definite is thus (optionally) possible.4
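The way a constraint tie yields optionality can be made concrete with a small sketch. It is only an illustration under stated assumptions: the violation counts below are hypothetical values chosen to match the description of (20) (the base order respects STAY but violates FF, the scrambled order does the reverse), and the function names are not part of the analysis.

```python
from itertools import permutations

# Hypothetical violation counts, chosen to match the description of (20):
# the base order respects STAY but violates FF; the scrambled order
# violates STAY but respects FF.
CANDIDATES = {
    "(20-a) base order": {"STAY": 0, "FF": 1},
    "(20-b) scrambled order": {"STAY": 1, "FF": 0},
}


def optimal(candidates, ranking):
    """Return the candidates with the lexicographically smallest violation
    profile under a total ranking (highest-ranked constraint first)."""
    def profile(name):
        return tuple(candidates[name][c] for c in ranking)

    best = min(profile(name) for name in candidates)
    return {name for name in candidates if profile(name) == best}


def optimal_under_tie(candidates, tied):
    """Resolve a tie into every total order of the tied constraints and
    collect the winners of each resolution."""
    winners = set()
    for ranking in permutations(tied):
        winners |= optimal(candidates, ranking)
    return winners


if __name__ == "__main__":
    print(optimal(CANDIDATES, ("STAY", "FF")))
    print(optimal(CANDIDATES, ("FF", "STAY")))
    print(optimal_under_tie(CANDIDATES, ("STAY", "FF")))
```

Each resolution of the tie picks exactly one winner; the set of grammatical outputs is the union over both resolutions, which is how the tie derives optional reordering.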
This quick overview illustrates all the relevant aspects of the system as
proposed in Büring (1996) in its application to German double object
constructions. Empirically successful though it is, many questions remain open.
Some of them regard the nature of the constraints. Why should they hold
in the way they do? Others regard the technical set-up of the system. What
advantages does it have to specify focus patterns (rather than, say, contexts,
accent patterns, or nothing at all) in the input?
Regarding the first set of questions, the INDEFINITES constraint in (13) is
a fairly direct adaptation of the seminal proposals in de Hoop (1992) and
Diesing (1992). If the position taken in these works is basically correct, posi-
tional preferences of indefinites can be explained in terms of the way syntax
is mapped onto semantics. The effects of FINALFOCUS can and should, I
believe, be derived from the way syntax is mapped onto prosody, utilizing
ideas found in Truckenbrodt (1995, 1999) and Büring (1997a). It is this latter
aspect that the present paper is mainly concerned with.
The system I will present below shares many essential properties with the
one sketched in this section and preserves its basic tenets: Object ordering
in German is determined by morphosyntactic and focus-related constraints,
F-marking is specified in the input, and optional reordering is derived by a
constraint tie in the very way illustrated above.
I will not, however, continue to use the particular constraints STAY,
FINALFOCUS and INDEFINITES. In section 3 I propose replacing FINALFOCUS
by a group of constraints relating focus, prosody and syntax. Their net
effect will be similar to that observed with FINALFOCUS above; in
contradistinction to this single constraint, however, their empirical coverage
is much broader, and they are compatible with and well motivated by current work
in prosodic phonology. In section 4 I will then introduce a constraint DAT,
which takes over the work of STAY. The effects of DAT will be the same as
those of STAY; it is chosen merely to avoid commitment to a derivational
syntactic framework. The issue of (in)definiteness and its influence on object
order will be ignored in what follows, along with the constraint introduced to
handle it; reintegration of it within the analysis developed below will have to
await a later occasion (Büring in prep.).
Regarding the second set of questions, the issue to be addressed here re-
gards the specification of the input. The system sketched above and elabo-
rated in what follows crucially specifies F-marking (and the different readings
of indefinites) in the input, but not, e.g., accenting or prosodic phrasing. This
choice could be made differently. I don't think that the present paper presents
conclusive evidence in favor of the set-up chosen here. Its purpose is to show
that such a system can be devised, and explore what properties it will have,
facilitating further discussion. I will touch upon some of the issues involved
after the main exposition in section 5 below.
3 Deconstructing FINALFOCUS
This section explores the rationale behind a constraint like FINALFOCUS, and
proposes replacing it with more precise and natural constraints on prosodic
phrasing. Likewise, we will no longer assume the constraints INDEFINITES
and STAY (which will be replaced by a less committing constraint called
DAT(IVE) in section 4 below).
Let me start by clarifying some of the assumptions about the relation be-
tween context, focus and accent I am making. I follow Selkirk (1984, 1995),
Rochemont (1986), and many others in assuming an overall picture as in (21).
(21) Context (specified by, e.g., a question) | Syntactic Structure with F-marking | Prosodic Structure with stress and pitch accents
The prosodic words (dem Kassierer and das Geld) are the heads of their
respective accent domains. Accordingly they are more prominent than the
prosodic word (gegeben), which means that their most prominent syllables
receive AD-level stress. Finally, the AD (das Geld gegeben) is the head of
the iP that wraps the entire sentence (the dots indicate that the iP extends fur-
ther to the left) and thus receives a grid mark at the iP-level (note that this is
different from the notation used in Halle & Vergnaud 1987, where heads are
indicated by grid marks on the next higher level).
As noted, the grid marks represent stress, where higher columns represent
a higher degree of stress. Finally, stressed syllables are associated with pitch
accents (I will not be concerned with the choice of pitch accent here, see
Pierrehumbert & Hirschberg 1990 for general discussion, and Büring 1997b
on German). For our purposes it is sufficient to state that each sentence con-
tains at least one pitch accent, and that if a syllable gets a pitch accent asso-
ciated with it, every other syllable with the same or higher degree of stress
must get a pitch accent, too; the range of the pitch movement (the perceived
"intensity" of the accent) is positively correlated with the level of stress on
the syllable the accent is associated with (cf. Pierrehumbert 1980). The result
will be that the head of iP always bears a pitch accent. A common pattern
in German is that all AD-heads have a pitch accent, too (cf., e.g., Uhmann
1991). We stipulate that syllables with only PWd-level stress never bear pitch
accents.
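The conditions on pitch accent placement just stated can be summarized in a short check. This is a minimal sketch, assuming that each stressed syllable is represented only by its highest stress level; the names PWD, AD, IP and valid_accents are illustrative and not part of the analysis.

```python
# Stress levels, lowest to highest (illustrative encoding).
PWD, AD, IP = 1, 2, 3


def valid_accents(stress, accented):
    """Check a pitch accent assignment against the three conditions in the text.

    stress   -- list of stress levels, one per stressed syllable
    accented -- list of booleans of the same length
    """
    pairs = list(zip(stress, accented))
    # Each sentence contains at least one pitch accent.
    if not any(acc for _, acc in pairs):
        return False
    # Syllables with only PWd-level stress never bear pitch accents.
    if any(acc and level == PWD for level, acc in pairs):
        return False
    # If a syllable is accented, every syllable with the same or a higher
    # degree of stress must be accented, too.
    lowest = min(level for level, acc in pairs if acc)
    return all(acc for level, acc in pairs if level >= lowest)


if __name__ == "__main__":
    # dem KasSIErer (AD head), das GELD (iP head), gegeben (PWd only)
    stress = [AD, IP, PWD]
    print(valid_accents(stress, [False, True, False]))
    print(valid_accents(stress, [True, True, False]))
    print(valid_accents(stress, [True, False, False]))
```

Under this encoding the head of iP is accented in every valid assignment, since it has the highest degree of stress in the sentence.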
The convention we will use where no prosodic trees are given is the following:
AD-heads are marked by capitalization of the pertinent syllable, the iP-head
by capitalization plus underlining of the word; PWd-heads aren't marked
at all. (22) can thus be abbreviated as in (23):
Given what we said above, Geld must bear a pitch accent here, while
KasSIErer may (along with every other AD-head that may precede it); the
V gegeben cannot. The pitch accent on Geld will be the most prominent one
(the nuclear stress).6
Let us start by elaborating on the notion of accent domain (I will ignore the
issue of prosodic word formation, because nothing hinges on it in the present
context). An AD has an "ideal size", which is described by the constraints
in (24). Since its two parts, PRED and XP, don't conflict in the examples I
discuss in this paper, I will treat them as one constraint, ADF, in the tableaux
that follow:
(24) ADF (ACCENTDOMAINFORMATION):
a. PRED:
A predicate shares its AD with at least one of its arguments.
b. XP:
AD contains an XP. If XP and YP are within the same AD, one
contains the other (where X and Y are lexical categories).
(25) a. ☞ (                )AD     ✓
          (das Geld)(geben)PWd
     b.   (        )(      )AD     * (PRED: V is alone in its AD)
          (das Geld)(geben)PWd
     c.   ((       )       )AD     * (illegal recursion of AD)
          (das Geld)(geben)PWd
As said above, every AD has one head, which is the most prominent element
within it, indicated by a grid mark within the AD. If an AD consists of more
than one prosodic word, as in (25-a), the head will be determined by (26):
The effect of (26) is demonstrated in (27) (note that here and henceforth I
don't indicate prosodic words in a separate line; hence, PWd-heads are no
longer marked with a grid mark at all).
ADF and A/P in tandem govern phrasing and prominence, all other things
being equal. How does focus enter the picture? I submit that one simple
constraint, (28), is all that is needed:8
(28) FP (FOCUSPROMINENCE)
Focus is most prominent.
Importantly, (28) inspects the prosodic structure of the sentence; for example,
if an AD contains two prosodic words, only one of which contains an
F-marked node, FP will demand that this PWd become the head of AD; likewise for
higher prosodic categories (to which I will turn below).
In German, FP is crucially ranked above A/P; for reasons that will become
clear later, ADF must be ranked in-between them. To understand the workings
of FP, let us consider AD-formation in cases in which exactly one immediate
constituent of the clause bears an F-feature (that constituent may in
turn contain more F-features, which I will ignore here); I will refer to these
as simple or narrow focus cases. What we observe here is that among the
elements within VP, only the constituent in focus has AD level stress (and,
for reasons to be discussed in a moment, the main pitch accent). (Since narrow
V-foci are hard to elicit by a wh-question, I will use contrasting contexts,
which, in accordance with Schwarzschild (1999), I assume work the same in
all relevant respects.)
(29) a. (Was hast du dem Kassierer gegeben? Ich habe dem Kassierer) das
GELD_F gegeben.
'What did you give the teller? I gave the teller [the money]_F.'
b. (Ich habe nicht gesagt, du sollst dem Kassierer das Geld beschreiben,
sondern du sollst dem Kassierer) das Geld GEben_F.
'I didn't say you should describe the money to the teller, but you
should [give]_F the money to the teller.'
The pitch accents on Geld and geben, respectively, tell us that these must be
AD-heads, while their respective sisters are not (otherwise they could at least
bear a secondary accent, which I would have indicated by capitals). Let us
see how this follows from our assumptions: If the AccO alone is F-marked
(case (29-a)), FP will require that the prosodic word containing it become the
head of the AD. This is the case in (30-a) and (30-c), but not in (30-b), which
is therefore blocked ((30-b) violates A/P on top of this, which is irrelevant
here). Between (30-a) and (30-c), ADF prefers the former, for the reason
already seen in (25-b): The V alone is not a good AD, violating ADF/PRED.
Note that A/P is not violated in (31-c), since there is no AD containing a
predicate and its argument. Yet (31-c) is blocked by (31-b) since ADF dominates
A/P.
Let us now include the intonational phrase, iP, in the picture. Since we are
only concerned with VP-internal focus here, it suffices to assume that every
sentence (or inflection phrase) is mapped onto one iP.
Intonational phrases are strictly right-headed in German: The head will be
that AD which is aligned with iP's right edge. We implement this by assuming
(32) as an undominated constraint (cf. McCarthy & Prince 1993).
(32) iP-HEAD-RIGHT:
ALIGN(iP, right, head(iP), right)
(33) a. (      x            )iP
        (      x            )AD
        (das Geld_F)(geben)PWd
     b. (              x    )iP
        (              x    )AD
        (das Geld)(geben_F)PWd
Notice that, also at the iP-level, the structures in (33) respect FP: The AD
containing the focus ends up being the head of iP. Note in particular that
although there is plenty of material following the most prominent syllable
Geld within the iP in (33-a), there is no AD following the AD containing
Geld. Therefore Ì P - H E A D - R I G H T in ( 3 2 ) is respected here: The rightmost
daughter AD is the head of iP.9
To conclude the simple focus cases, what if the VP-initial dative argument
is the sole focus? As already discussed in section 2 above, the nuclear accent
then falls on the DatO; moreover, no secondary accents on either AccO or V
are allowed.
(34) (Wem hast du das Geld gegeben? Ich habe) dem KasSIErer_F das
Geld gegeben.
'Who did you give the money? I gave [the teller]_F the money.'
This is predicted: The dominant constraint FP will force the head of AD and
iP to be on the focused DatO. As Truckenbrodt (1995: ch. 5) was the first to
point out, this, together with iP-HEAD-RIGHT, excludes the presence of any
AD following the one containing the focus. Consider (35-a) and (35-b); in
both, the final AD (das Geld geben) becomes the head of the iP, consonant
with the right-headedness of iP, (32). But this violates FP at the iP level, since
     (   x          )(      x        )AD
b.   (dem Kassierer_F)(das Geld)(geben)PWd
     (   x                           )iP     iP is not right-headed
     (   x          )(      x        )AD
c.   (dem Kassierer_F)(das Geld)(geben)PWd
     (   x                           )iP     *
☞    (   x                           )AD
d.   (dem Kassierer_F)(das Geld)(geben)PWd
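The reasoning behind (35) can be replayed by brute force. The following is a minimal sketch under stated assumptions: candidates are all groupings of the three prosodic words into contiguous ADs, with one head word per AD and one head AD per iP; FP is counted as one violation for each level at which a focused word (or the AD containing it) fails to be the head. This way of counting, and all names in the code, are illustrative assumptions rather than the author's definitions.

```python
from itertools import product

WORDS = ["dem Kassierer", "das Geld", "geben"]
FOCUS = {0}  # index of the F-marked word: the dative object


def partitions(n):
    """Yield every way of cutting positions 0..n-1 into contiguous blocks (ADs)."""
    for cuts in product([False, True], repeat=n - 1):
        blocks, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                blocks.append(tuple(range(start, i)))
                start = i
        blocks.append(tuple(range(start, n)))
        yield tuple(blocks)


def candidates(n):
    """Yield (ADs, head word of each AD, index of the AD heading iP)."""
    for blocks in partitions(n):
        for heads in product(*blocks):
            for ip_head in range(len(blocks)):
                yield blocks, heads, ip_head


def ip_head_right(blocks, heads, ip_head):
    """iP-HEAD-RIGHT: the head of iP is the rightmost AD."""
    return ip_head == len(blocks) - 1


def fp_violations(blocks, heads, ip_head, focus):
    """Assumed counting: one violation per level at which focus is a non-head."""
    count = 0
    for i, (block, head) in enumerate(zip(blocks, heads)):
        for word in block:
            if word in focus:
                count += (word != head) + (i != ip_head)
    return count


survivors = [c for c in candidates(len(WORDS))
             if ip_head_right(*c) and fp_violations(*c, FOCUS) == 0]

if __name__ == "__main__":
    for blocks, heads, ip_head in survivors:
        print([[WORDS[w] for w in block] for block in blocks],
              "AD heads:", [WORDS[h] for h in heads])
```

Under these assumptions the only candidate that satisfies both iP-HEAD-RIGHT and FP is the one with a single AD spanning all three words and headed by the dative object, which corresponds to candidate (35-d).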
We now turn to cases in which more than one immediate constituent of the
sentence is in focus. (36) is an example of this sort (quite possibly there is
another F-mark on the VP here and in (38) below; I'll address this issue in
the next sub-section).
(36) (Wie soll die NSF dabei helfen? - Sie soll) das GELD_F geben_F.
'How can the NSF help? They should give_F [the money]_F.'
The final V, though F-marked itself, does not have AD-level stress and cannot
bear a pitch accent. That means that object and verb continue to form an AD;
(37) shows how this is accounted for:
The winner in this case has exactly the same prosodic structure as in the
object-only-F case in (30). The tableaux look considerably different though.
In particular, notice that even the winner in (37) has one violation of FP.
This is unavoidable if a sentence has two F-marked constituents, given that
every phrase has only one head: At some level, one of the F-constituents must
become the non-head.
The most instructive candidates to compare are (37-a) and (37-b). In (37-a),
V is the head of its own AD, and the AD with the AccO is subordinated
at the level of iP, inducing an FP violation. In (37-b), V is subordinated, but
already at the AD level. It incurs a violation of FP, too, this time because the
PWd containing geben_F is not the head of the AD containing it. Note that since
there is only one AD (which then is the head of iP), no further violations
of FP occur. The choice in favor of (37-b) is made by the lower constraint ADF,
which prefers the "integrated" structure (37-b) over the "split" one in (37-a):
A perfect AD cannot consist of just a predicate as in (37-a) (the fact that
the prominence within the AD is on the object rather than on the verb, as in
(37-c), is then regulated by A/P).
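The comparison of the three candidates can be sketched as a small computation. This is an illustration under assumptions, not the author's formalization: FP is counted once per level at which an F-marked word (or its AD) is a non-head, ADF is reduced to its PRED part, and A/P is taken to penalize an AD that contains an argument but is headed by the predicate (a reading inferred from the surrounding discussion). The candidate encodings are likewise hypothetical.

```python
ARGUMENTS = {"das Geld"}
PREDICATES = {"geben"}
FOCUSED = {"das Geld", "geben"}

# Each candidate is a list of ADs; each AD is (words, index of its head word).
# By iP-HEAD-RIGHT, the last AD is the head of iP.
CANDIDATES = {
    "37-a": [(["das Geld"], 0), (["geben"], 0)],  # V forms its own AD
    "37-b": [(["das Geld", "geben"], 0)],         # one AD, headed by the object
    "37-c": [(["das Geld", "geben"], 1)],         # one AD, headed by the verb
}


def fp(cand):
    """Assumed counting: one violation per level at which focus is a non-head."""
    last = len(cand) - 1
    count = 0
    for i, (words, head) in enumerate(cand):
        for j, word in enumerate(words):
            if word in FOCUSED:
                count += (j != head) + (i != last)
    return count


def adf_pred(cand):
    """PRED part of ADF: a predicate must share its AD with an argument."""
    return sum(1 for words, _ in cand
               if any(w in PREDICATES for w in words)
               and not any(w in ARGUMENTS for w in words))


def a_over_p(cand):
    """Assumed reading of A/P: an AD containing an argument is not headed
    by the predicate."""
    return sum(1 for words, head in cand
               if any(w in ARGUMENTS for w in words)
               and words[head] in PREDICATES)


def profile(cand):
    """Violation profile under the ranking FP >> ADF >> A/P."""
    return (fp(cand), adf_pred(cand), a_over_p(cand))


winner = min(CANDIDATES, key=lambda name: profile(CANDIDATES[name]))

if __name__ == "__main__":
    for name, cand in CANDIDATES.items():
        print(name, profile(cand))
    print("winner:", winner)
```

All three candidates incur exactly one FP violation, so the decision falls to the lower constraints: ADF eliminates the split structure and A/P eliminates the verb-headed one.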
What happens if the verb and two arguments are F-marked? In this case we
get the nuclear accent on the rightmost argument and a secondary accent, or
at least AD-level stress, on the VP-initial one.
gegeben_F.
'What did you do? - I gave_F [the teller]_F [the money]_F.'
It should be noted once more that the crucial difference is between predicates
and non-predicates in complex-F-constructions. While predicates have an in-
centive to join the AD of one of their arguments and therefore integrate at
that lower level, arguments do not. In fact, ADF prefers for them not to share
an AD with any other argument, which is why they end up forming their own
AD. Incidentally, this reasoning applies to adjuncts, too, except that these
never join ADs with a predicate, given that they are never selected by a pred-
icate (cf. again the definition in (24)); it is beyond the scope of this paper to
go into this, though.
In (40-a) and (40-b), the smallest prosodic constituent containing VP_F is the
AD. Since that AD is the head of iP, FOCUSPROMINENCE is met.11
If we look at a double object example along the lines of (39), a similar
reasoning applies; (41) repeats the winning candidate for this structure with
an F on VP added:
(41) (                        x           )iP
     (    x          )(       x           )AD
     [VP dem Kassierer_F das Geld_F geben_F]_F
No smaller prosodic unit than iP contains the VP, and since iP is the highest
category, nothing is more prominent than iP. Therefore, no additional
violation of FP occurs, hence no other candidate will improve relative to
(41)/(39-c).
3.5 Deaccenting
So far we have looked at simple foci (DatO_F, AccO_F, and V_F) or complex
foci in which all VP internal arguments were F-marked. Now I will turn to ex-
amples that contain deaccenting. As a starting point, recall that a ditransitive
VP/IP focus without deaccenting results in a structure with the last AD con-
sisting of a prominent AccO and a non-prominent V. (42)/(43) illustrates this
with a new example (foci on, external to, and above VP are not indicated):
(44) 'Why was Veronika arrested? Only because she had a poker in her
It should also be noted that, unlike all the other cases of XP+V focus we have
considered thus far, this type of context strictly excludes an unstressed V, as
(44-b) shows. This result is predicted, as tableau (45) demonstrates.
☞ ( x )( x )AD
a. (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
( x )iP * *!
( x )( x )( x )AD
b. (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
( x )iP **!
( X )( x )AD
c. (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
*
( x )iP *!
( x )AD
d. (ihrem Macker_F)(den Kaminhaken)(überzog_F)PWd
cases not discussed in, e.g., Truckenbrodt (1995, 1999). And fourth, it allows
integration with a theory of word order variation, as I will demonstrate now.
In (35) above we derived the fact that single focus on the DatO yields a structure
as in (46). FP requires that DatO be the head of AD and iP; since iP is
right-headed, this blocks insertion of further AD boundaries to the right of
AD. Due to this, a "super-big" AD is formed, violating ADF.
Let us now see what happens if word order permutations enter the picture.
In this case, the same F-pattern could be realized without violating any con-
straint by utilizing AccO > DatO order. And in fact, this option exists along-
side the one in (46) in German, as noted in (7) above. I repeat both vari-
ants for convenience here (note that I have added the indication of AD-level
prominence on AccO in (47-b), which may or may not be associated with a
pre-nuclear pitch accent):
Let us examine (47-b) more closely, since we haven't done so before. Its
prosodic structure is (48). It is perhaps worth pointing out that the prosodic
structure of (47-b)/(48) is identical to that of the parallel DatO-AccO-V example
in (38)/(39) above; in particular, V and the adjacent DatO integrate
into one AD in just the same way that V and AccO do.
(49-a) is the constraint profile for this AccO > DatO order. No violations oc-
cur. (I will henceforth leave out the iP-level for reasons of space; the head of
iP - which is predictably the rightmost AD - will be indicated by a capital
bold face grid mark: X.)
(50) 'Why was Veronika arrested? Only because she had a poker in her
trunk?'
AccO DatO_F V_F
Nein, weil sie den KaMINhaken ihrem MAcker überzog.
no bec. she the-ACC poker her-DAT man landed
i: [VP AccO DatO_F V_F]    FP   ADF   A/P
*
( X )( X )AD *!
a. (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
*
( X )( X )( X )AD *!
b. (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
*
☞ ( x )( X )AD
c. (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
**! *
( X )AD
d. (den Kaminhaken)(ihrem Macker_F)(überzog_F)PWd
If the F-marked argument is capable of forming a "perfect AD" with the verb,
which it is when it is adjacent to V, this option is preferred - (51-c). Phrasing
and accenting the verb separately - (51-b) - is just as impossible as with
AccO_F V_F in (37).
Suppose now that AccO > DatO structures as in (49) and (51) were to enter
into competition with their DatO > AccO siblings such as (35) and (45).
That is, suppose that the input was not specified with respect to object
ordering, allowing outputs with either order to compete with one another
(I'll use set notation in the input specification to indicate this).¹² Then the
AccO > DatO structure would be the sole winner. It clearly beats even the
best DatO > AccO candidate. (52) demonstrates this for the simple DatO_F
case. It compares the optimal structure from (51) with that from (35), rendering
the latter sub-optimal.
If German had a ranking like that in (52), the only object order we would ever
find for simple focus cases would be one where the F-marked object follows
the F-less one, so the former can be maximally prominent (satisfying FP)
and the latter can form an AD (satisfying ADF). Put in derivational terms,
we would find obligatory scrambling of non-focused objects around focused
ones. While this is obviously not the case in German, a situation much like
this can arguably be found in many Romance languages, e.g., Spanish, Italian
and, to a lesser degree, French (Ladd 1996, Zubizarreta 1998). In Spanish for
example, an NP which is the single focus in a sentence must occur postver-
bally, in VP-final position. The interpretation of this that I have in mind runs
along the following lines: Any structure in which the focus isn't sentence final
would require an AD which contains the focus and everything following it,
similar to (52-b) (otherwise the AD wouldn't be iP-final, hence not the head
of iP, yielding a violation of FP). But such a structure will always be sub-
optimal compared to one in which the focus occurs in sentence final position,
so that every element preceding it can form its own AD, similar to the struc-
ture in (52-a). Derivationally put, we find obligatory movement of the focus
to a peripheral position (see Gutiérrez-Bravo 1999 for an optimality analysis
along these lines).
It will turn out that we crucially never want the tie to resolve into any of (55):
Put differently, the two prosodic constraints ADF and A/P never change their
ranking relative to each other, but only as a block relative to DAT.
This special property of the system will become relevant only in the deaccenting
cases, but for reasons of comparability and uniformity I will use it
right from the beginning. The notation I invent for this purpose is that in (56):
Since candidate (57-a) violates neither DAT nor any of the prosodic constraints,
it will be the winner regardless of how the tie is resolved.
With DatO_F, the focus constraints - in particular ADF - favor AccO >
DatO, while DAT favors the opposite order. Since both constraints are tied,
both structures emerge as optimal.
Focus on both arguments (with or without the verb) again allows only the
order preferred by DAT. While scrambling doesn't make the phrasing worse,
it doesn't improve it either and is therefore excluded:
Let us then turn to the deaccenting cases. If DatO and V are F-marked and
AccO is not, two structures emerge as grammatical: the one that fulfills
DAT (and violates the prosodic constraint A/P), and the one that satisfies the
prosodic constraints (but violates DAT).¹⁴
(60)
Why was Veronika arrested? Because she had    FP  DAT ¦ pros. cons.
a poker in her trunk? No, because she...              ¦ ADF  A/P
i: {AccO, DatO_F, V_F}
*
☞ ( x x X )AD *
If DatO is non-F, on the other hand, AccO > DatO order is excluded because
the DatO > AccO order already allows a perfect prosodic structure.
(61)
Why was Veronika arrested? Because her man    FP  DAT ¦ pros. cons.
disappeared? No, because she...                       ¦ ADF  A/P
i: {DatO, AccO_F, V_F}
*
( x )( X )AD *!
a. (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
*
( X )( X )( X )AD *!
b. (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
*
☞ ( X )( X )AD
c. (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
*
( X )AD *!
d. (ihrem Macker)(den Kaminhaken_F)(überzog_F)PWd
*
( x )( X )AD *! *!
e. (den Kaminhaken_F)(ihrem Macker)(überzog_F)PWd
*
( X )( X )( X )AD *! *!
f. (den Kaminhaken_F)(ihrem Macker)(überzog_F)PWd
* *! *!
( X )AD
g. (den Kaminhaken_F)(ihrem Macker)(überzog_F)PWd
To sum up then, the proposed system correctly predicts when we get two
different object orders with the same focus pattern, and when we don't. It
generalizes across the various cases because it doesn't invoke the notion of an
optimal focus placement (as the system in section 2 did), but only the notion
of an optimal prosodic structure. For each of the grammatical structures, it
delivers a unique prosodic phrasing, which in turn can be used to derive the
set of its possible accent patterns.
One way of looking at the constraint tie is that there are actually two gram-
mars at work: One that finds the prosodically optimal candidate, and one that
finds the morphosyntactically optimal one (which we have equated with the
one that displays DatO > AccO order here). All candidates which are prosod-
ically and morphosyntactically sub-optimal are predicted to be simply un-
grammatical.
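The two-grammar reading of the tie can be made concrete with a small sketch. It assumes the violation profiles that note 14 reports for candidates (60-a), (60-c) and (60-f), takes FP to be satisfied by all three, and counts a candidate as grammatical if it is optimal under at least one of the two permitted resolutions of the tie. The names CANDIDATES, TIE_RESOLUTIONS, optimal and grammatical are illustrative only and not part of the proposal itself.

```python
from typing import Dict, List, Set

Profile = Dict[str, int]

# Violation counts as described in note 14 for tableau (60).
# Assumption: FP is satisfied by all three candidates and is therefore omitted.
CANDIDATES: Dict[str, Profile] = {
    "60-a": {"DAT": 0, "ADF": 0, "A/P": 1},
    "60-c": {"DAT": 0, "ADF": 1, "A/P": 0},
    "60-f": {"DAT": 1, "ADF": 0, "A/P": 0},
}

# ADF and A/P keep their relative order; only the block moves around DAT.
TIE_RESOLUTIONS: List[List[str]] = [
    ["DAT", "ADF", "A/P"],
    ["ADF", "A/P", "DAT"],
]


def optimal(cands: Dict[str, Profile], ranking: List[str]) -> Set[str]:
    """Return the candidates with the best profile under a strict ranking."""
    def key(name: str):
        # Higher-ranked constraints come first, so tuple comparison
        # implements strict domination.
        return tuple(cands[name].get(c, 0) for c in ranking)

    best = min(key(name) for name in cands)
    return {name for name in cands if key(name) == best}


def grammatical(cands: Dict[str, Profile], resolutions: List[List[str]]) -> Set[str]:
    """A candidate is grammatical if it wins under some resolution of the tie."""
    winners: Set[str] = set()
    for ranking in resolutions:
        winners |= optimal(cands, ranking)
    return winners


if __name__ == "__main__":
    print(sorted(grammatical(CANDIDATES, TIE_RESOLUTIONS)))
```

Under these assumptions the two permitted resolutions admit (60-a) and (60-f) but not (60-c); the latter could only win under a ranking in which A/P dominates ADF, which the block-wise tie excludes.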
A reasonable objection to the present proposal is that a sentence like, say,
(8-b), repeated here as (62-a), is awkward, but still much better than, say, (62-
b), and that (62-a) should therefore be question-marked, but not starred, as is
done here.
(63) a. ( x )iP
        ( )AD
        dem Kassierer_F das Geld geben
     b. ( x )iP
        ( x )( x )AD
        dem Kassierer_F das Geld_F geben_F
     c. ?*( X )iP
        ( X X x x X )AD
        das Geld_F dem Kassierer_F geben_F
Notes
This paper is built on an earlier economy-theoretic paper of mine (Büring 1996) and
a series of optimality-based talks I gave at the SFB 282 Colloquium "Die Intonationale"
in Cologne 7-96, the "Interfaces of Grammar" conference in Tübingen 10-96,
and the Stuttgart "Workshop on OT Syntax" 10-97 (Büring 1997a). I would like
to thank the audiences at these conferences for their comments and discussion, especially
Kai Alter, Katharina Hartmann, Gerhard Jäger, Inga Kohlhof, Gereon Müller,
Roger Schwarzschild and Hubert Truckenbrodt. Judith Aissen, Armin Mester and
Line Mikkelsen offered invaluable comments and suggestions that greatly helped to
shape and improve the present version. All remaining errors and shortcomings have
been retained deliberately in order to stimulate future work.
1. The issue of whether the DatO>AccO order is preferred for all verbs or just a
lexically specified sub-group is controversial; cf. among others Haider (1993),
Fortmann & Frey (1997), Vogel & Steinbach (1998), and Müller (1998). Since
this question is orthogonal to the issue at hand we will leave it unresolved, concentrating
on verbs that are uncontroversially among the DatO>AccO ones.
2. I ignore cases of obligatory movement here, which I argue in Büring (1996) exist
in German, too.
3. In fact, the situation is more complicated since I argued that (13) is not accurate,
a point I won't go into here.
4. To derive Diesing's (1992) original generalization, INDEFINITES in (13) would
have to be strengthened to a biconditional, requiring that indefinites are existential
if and only if they are VP-internal. In Büring (1996) I argue, however,
that this generalization is too strong, i.e., that VP-internal indefinites can also be
generic; cf. also Büring (in prep.).
5. These conditions, as well as the one requiring headedness of prosodic phrases
introduced in the next paragraph, should be properly understood as undominated
markedness constraints in an optimality framework. It should be understood that
this only holds for German, then, leaving open the possibility that they are ranked
in a more interesting way in other languages.
6. Note that not all F-marked XPs will need to have a pitch accent on this view,
though they all will have AD-level stress. Likewise, non-focused XPs can be
pitch accented (if all focused ones are, too), since they usually receive AD-level
stress as well. Crucially, however, non-focused AD-heads cannot have accents if
the focused AD-heads don't. This is the empirical generalization reported, e.g.,
in Uhmann (1991). Others, for example Féry (1993), claim that (in our termi-
nology) all AD-heads must receive a pitch accent if they are part of a focused
phrase, while non-focused ones may or may not. This latter viewpoint can be im-
plemented by reformulating FP below, but it necessitates further complications.
Given that the empirical situation seems unclear, I will therefore stick with the
easier generalizations offered in Uhmann (1991).
7. PRED and XP are close relatives of Truckenbrodt's (1999) WRAP and STRESS-XP
constraints, respectively. Just like these they will favor two ADs for adjunct-
head, argument-argument and adjunct-argument structures (since neither phrase
contains the other and no predication is involved), and with head-complement
structures they unanimously favor one AD (since bare heads aren't XPs). Note,
however, that PRED - unlike STRESS-XP - favors (NP V)AD-integration, even
if V ends up being the head of the AD (cf. (31) below). Also, PRED applies
even if the predicate is an XP of its own. I'm thinking of subject-intransitive-
verb, object-secondary-predicate, and NP-short relative clause structures, which
all have been reported to allow, if not require, single ADs. This can be achieved
in the present system by ranking PRED above XP.
8. A more precise rendering is (i):
(i) FP (FOCUSPROMINENCE)
a. If α is the smallest prosodic constituent that contains an F-marked syntactic
node β, α is called a prosodic focus.
9. It is perhaps worth noting that the second best candidate in (31), (31-c), will
receive a very similar overall realization to the winner (31-b). In particular both
(31-b) and (31-c) have main prominence (= the head of iP) on V. The difference
is merely the presence of AD-level prominence on the AccO in (31-c), which
means that it can, but doesn't have to, bear a secondary pitch accent. As far as I
can tell, the data are inconclusive with regard to this issue (a lot depends upon the
relation between prominence and pitch accents; cf. note 6 above). (31-b) owes its
optimality to the fact that A/P is ranked below ADF. If both were tied, both (31-b)
and (31-c) would be grammatical. Within the set of data that I discuss in this
paper, this seems to be the only case in which ranking ADF above A/P is crucial.
If required on empirical or theoretical grounds, this ranking could be given up,
ruling in (31-c) as an optimal output.
10. The F-marking on VP is not represented in the outputs. This is because I only
indicate the prosodic structure in the output, with some F-marks added for convenience,
so there is no natural place for them. Strictly speaking the outputs
should be pairs of syntactic phrase markers with F-marking and prosodic structures
without (or, perhaps, only the latter).
11. Pedantically speaking, (40-c) violates FP as formulated in note 8 above, since
the smallest prosodic constituent containing VP_F, iP, is not the head of the next
higher prosodic category, because there is no such higher category. But since this
affects all structures in the competition equally, no change will arise from this. In
the main text I will ignore these extra violations for the sake of perspicuousness.
Put differently, I will interpret FP to say "...is the head at level n + 1, if there is
such a level."
12. Another implementation would specify object order in the input, but allow GEN
to change it. Nothing hinges on this in the present context.
13. As stressed in Müller 1998, this also allows us to add more morphosyntactic
word-order constraints upon demand, yielding different morphosyntactically un-
marked word orders without committing to the assumption of different base-
generated argument orders.
14. Note that if only DAT and ADF were tied, (60-f) could never emerge as an optimal
candidate; it would always lose to (60-a). If instead all of DAT, ADF and
A/P were tied, (60-a) and (60-f) would be permitted (as desired), both having one
violation. But so would (60-c), which has only one violation, too. But (60-c) is
not acceptable in this case.
This is where the more complicated construction that I introduced at the beginning
of this section pays off: Among the structures that violate DAT, only (60-f)
is grammatical, because it violates none of the prosodic constraints. This candidate
is optimal under the ranking ADF » A/P » DAT. Among the structures
that satisfy DAT, only (60-a) is optimal, because it violates the lower prosodic
constraint A/P, rather than the higher one ADF, as (60-c) does. This corresponds
to the outcome under the ranking DAT » ADF » A/P. Crucially, for (60-c) to
win the constraints would have to be ordered with A/P dominating ADF (either
one of those in (55) above), but this is not permitted by the kind of tie that is
assumed here.
15. Note that FP is not included here, owing to the empirical fact that a sentence
with, say, main prominence on the AccO can absolutely not be used in a context
that requires focus on the DatO. In other words, a structure like (i) is strictly
impossible, not just marked.
(i) ( X )iP
    ( x )( x )AD
    (DatO_F)(AccO)(V)
References
Bech, Gunnar
1955/57 Studien ueber das deutsche verbum infinitum. København (Det Kongelige
Danske Videnskabernes Selskabs Historisk-filologiske Meddelelser 35,
No. 2, 1955/ 36, No. 6, 1957).
Büring, Daniel
1996 Interpretation and movement: Towards an economy-theoretic treatment
of German 'Mittelfeld' word order. Ms., Frankfurt University.
Büring, Daniel
1997a Perfect or just optimal? Towards an OT account of German Mittelfeld
word order. Talk presented at the Workshop on OT Syntax, October 1997,
Stuttgart University.
Büring, Daniel
1997b The Meaning of Topic and Focus - The 59th Street Bridge Accent. London:
Routledge.
Büring, Daniel
in prep. What do definites do that indefinites definitely don't? Ms., UC Santa
Cruz.
Choi, Hye-Won
1996 Optimizing structure in context: Scrambling and information structure.
Ph.D. dissertation, Stanford University, (to appear with CSLI Publica-
tions, Stanford).
Diesing, Molly
1992 Indefinites. Cambridge, MA: MIT Press.
Drubig, Hans Bernhard
1994 Island Constraints and the Syntactic Nature of Focus and Association
Truckenbrodt, Hubert
1995 Phonological phrases: Their relation to syntax, focus, and prominence.
Ph.D. dissertation, MIT. (Published 1998 by MITWPL).
Truckenbrodt, Hubert
1999 On the relation between syntactic phrases and phonological phrases. Lin-
guistic Inquiry 30(2): 219-255.
Uhmann, Susanne
1991 Fokusphonologie. Tübingen: Niemeyer.
Uszkoreit, Hans
1987 Word Order and Constituent Structure in German. Stanford: CSLI Publications.
Vikner, Sten
1991 Verb movement and the licensing of NP-positions in the Germanic lan-
guages. Ph.D. dissertation, University of Geneva.
Vogel, Ralf — Markus Steinbach
1998 The dative - An oblique case. Linguistische Berichte 173: 65-90.
Webelhuth, Gert
1989 Syntactic saturation phenomena and the modern Germanic languages.
Ph.D. dissertation, University of Massachusetts, Amherst.
Zubizarreta, Maria Luisa
1998 Prosody, Focus and Word Order. (Linguistic Inquiry Monographs 33.)
Cambridge, MA: MIT Press.
Remarks on the Economy of Pronunciation
The idea that syntactic movement is composed of two steps, a copying oper-
ation followed by a deletion operation (the C&D-theory of movement) - as
illustrated in (1) - has again become popular with the rise of the Minimal-
ist Program (Chomsky 1993). In one of the straightforward extensions of the
C&D-approach, at least certain instances of so-called covert movement arise
from the overt copying of a full phrase before SPELLOUT, followed by the
deletion of the higher rather than the lower copy - an assumption that implies
that spellout conventions regulate whether the target or the source position of
the copying operation is realized phonetically (see, e.g., Bobaljik 1995, Groat
& O'Neil 1996, Pesetsky 1997, 1998a, Roberts 1997 (for head movement),
Sabel 1998, among others) - as illustrated in (2) for Chinese.
missing freezing effects, that arise in other approaches (e.g., Kayne 1998,
Koopman & Szabolcsi 1999, Mahajan 1999).
In the standard case of movement, only one of these copies is actually pronounced.
This follows from an interaction of the principles PRONECON and
RECOV.⁴ PRONECON favors those structures in which the deletion of phonetic
matrices in chains is maximized, but deletion is subject to recoverability,
so that normally⁵ exactly one copy will be retained in each chain. In other
words, in most situations, only (6-e-g) are potential winners.
b. Recoverability (RECOV)⁷
The content of unpronounced elements must be recoverable from a
local antecedent.
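The interaction of the two principles can be sketched as a small evaluation over pronunciation patterns. The sketch makes two simplifying assumptions that go beyond the text: RECOV is approximated as "at least one copy in the chain keeps its phonetic matrix", and it is treated as strictly dominating PRONECON, which simply counts pronounced copies. The function name spellout_winners is hypothetical.

```python
from itertools import product
from typing import List, Tuple

Pattern = Tuple[bool, ...]  # True = this copy keeps its phonetic matrix


def spellout_winners(n_copies: int) -> List[Pattern]:
    """Return the pronunciation patterns that survive RECOV and PRONECON."""
    candidates = list(product([True, False], repeat=n_copies))

    # RECOV (simplified assumption): deleting every copy is unrecoverable.
    recoverable = [c for c in candidates if any(c)]

    # PRONECON: prefer the patterns with the fewest pronounced copies.
    fewest = min(sum(c) for c in recoverable)
    return [c for c in recoverable if sum(c) == fewest]


if __name__ == "__main__":
    for pattern in spellout_winners(3):
        print(pattern)
```

For a chain with three copies, the surviving patterns are exactly those in which a single copy is pronounced, one pattern per chain position.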
Ceteris paribus, this approach leads to the expectation that any copy in a
chain may be the one that is spelled out, with all the others being deleted.
So-called "partial wh-movement"⁸ as can be found in Bahasa Indonesia (cf.
(3), repeated here as (8), and Saddy 1991, 1992) or Malay (Cole & Hermon
1998) seems to bear this prediction out. In a wh-question, the wh-phrase may
either appear in situ (8-a), or be realized in its scope position (8-c), but it can
also show up in the specifier positions of any of the CPs that may intervene
between the root position of the wh-phrase and its scope position, as (8-b)
illustrates ("true partial wh-movement").
Cole & Hermon (1998) argue that partial wh-movement is not focus movement;
see also Basilico (1998) for arguments that partial movement in Slave
cannot be reduced to focus movement. The most straightforward analysis for
(8) (considered but rejected in Cole & Hermon 1998) assumes that siapa has
in fact been attracted to its scope position in all examples, forming the chain
indicated in (9-a). Due to the interaction of PRONECON and RECOV, all but
one of the copies of siapa must not be pronounced. In the optimal state of af-
fairs, any of the copies may be the one that is realized overtly, as the abstract
structures (9-b-d) illustrate, which (roughly) correspond to (8).
There is at least one argument for analyzing true partial wh-movement along
these lines. As Saddy (1991, 1992) and Cole & Hermon (1998) observe, partially
moved wh-phrases behave as if they have moved to the scope position
at least in terms of island conditions: There must be no movement island
between the partially moved wh-phrase and its scope position. Thus, a
Chinese illustrates the prediction that certain in situ wh-phrases can be island
sensitive: Wh-adjuncts can stay in situ, but they must not appear in islands.
Note that adjuncts cannot be bound by an (argumental) question operator base
generated in Comp. Therefore, wh-adjuncts can form a part of a question only
if a chain is built up which links the adjunct to its scope position. Wh-adjuncts
can thus be realized phonetically in situ only if a copy-chain (respecting islands)
is built up to the scope position, in which the lowest copy surfaces after
the deletions as forced by (7).¹²
We therefore follow Cole & Hermon (1998) in making the assumption that
two strategies for forming questions coexist in Malay at least: copying of the
wh-phrase to its scope position, and the binding of wh-arguments in situ. See
Pesetsky (1998b) for a related but slightly different view on English, German
and Slavic questions.
A phonetic sequence such as (14) in which an overt copy of the wh-phrase
appears in its base position is thus ambiguous in our account (but not in Cole
& Hermon 1998): buah apa may be bound by a [+wh]-Comp, or it may be
the copy of a chain link to the matrix Spec,C position that is spelled out phonetically.
Given that the binding-in-situ strategy is, in general, more liberal
than the formation of questions by movement, (nearly) all examples that are
grammatical under a movement analysis are generatable with a binding analysis,
too - so that the existence of an ambiguity is both hard to establish and
also hard to refute.
(15) WH-IN-SPEC
A wh-phrase must be phonetically realized in the specifier position
of a CP.
(17) Ali (mem) beritahu kamu tadi apa yang Fatimah (*men)-baca
Ali meng told just now what that Fatimah meng read
'What did Ali tell you just now that Fatimah was reading?'
(NPs and CPs) or to "phases" in the sense of Chomsky (1998). These can-
didates are then subjected to the EVAL procedure, yielding an optimal can-
didate. The optimal candidates for the expression of cyclic categories/phases
so formed may then be fed into the GEN component again, in order to gener-
ate even larger structures, until the level of cyclic nodes or phases is reached
again, at which the EVAL procedure selects the optimal structure again.
In such a system, the question of which of the copies created by movement
can be retained, and which copies are deleted phonetically, poses itself
each time the construction of alternative structural representations has
reached the cyclic node level. Consider, then, a stage in a derivation in which
a wh-phrase has been copied to a higher position, crossing an occurrence of
meng in this context (19-a). Suppose that Σ is cyclic, so that optimization
can and must start. Because of PRONECON, one of the two wh-phrase copies
must disappear.¹⁶
If the upper copy loses its phonetic matrix (19-b), nothing seems to have
to happen to meng, i.e., it can be retained. In structures in which meng has
been retained, the uppermost specifier position of CP therefore does not have
a phonetic matrix, and it will not be able to re-acquire this phonetic matrix
in later copying steps for more or less obvious reasons.¹⁷ Therefore, above a
retained meng, no copy in a wh-chain originating lower than meng can have
a phonetic matrix.
Assume, however, that there is a principle requiring that meng must be
deleted (19-c) when the upper copy is retained phonologically. This can (and
must) be checked locally in each cyclic domain relevant for optimization.
Thus, the empirical generalizations that concern meng-distribution are very
well compatible with a C&D-approach when it is executed cyclically.¹⁸
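The cyclic reasoning of the last three paragraphs can be sketched as a tiny enumeration. The sketch assumes an idealized chain in which every copying step crosses exactly one occurrence of meng, and it records for each crossed meng whether it may be retained (True) or must be deleted (False). The function name derive_cyclically and this encoding are assumptions made for illustration, not part of the analysis itself.

```python
from typing import List, Set, Tuple

# (index of the pronounced copy, retention status of each crossed meng)
Outcome = Tuple[int, Tuple[bool, ...]]


def derive_cyclically(n_cycles: int) -> Set[Outcome]:
    """Enumerate outcomes when spellout is decided cycle by cycle."""
    results: Set[Outcome] = set()

    def cycle(k: int, pronounced: int, meng: List[bool]) -> None:
        if k == n_cycles:
            results.add((pronounced, tuple(meng)))
            return
        # Option 1: the new upper copy loses its phonetic matrix;
        # the meng it crossed may be retained.
        cycle(k + 1, pronounced, meng + [True])
        # Option 2: the new upper copy is pronounced. This requires that the
        # copy just below it still carries the phonetic matrix (a matrix lost
        # in an earlier cycle cannot be re-acquired), and it forces deletion
        # of the crossed meng.
        if pronounced == k:
            cycle(k + 1, k + 1, meng + [False])

    cycle(0, 0, [])
    return results


if __name__ == "__main__":
    for pos, meng in sorted(derive_cyclically(2)):
        print(pos, meng)
```

In every outcome of this idealized model, each meng below the pronounced copy is deleted and each meng above it may be retained, which mirrors the generalization stated in the text.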
can then be reduced to (20), which bears an obvious similarity to the Doubly-
Filled Comp Filter.
The wh-phrase appears now in the lowest Spec,CP position. The derivation
bifurcates when Σ** is reached: α climbs up phonetically if PARSESCOPEO
is given more weight, while its phonetic material stays in Spec,CP-1 when the
tie is resolved towards WH-IN-SPEC. The former derivation will finally copy
the phonetic material of α further to Spec,CP-2 (because the two constraints
in question have the same implications for the last derivational step); the latter
cannot but leave α phonetically at Spec,CP-1. In other words, where there is a
tie between WH-IN-SPEC and PARSESCOPEO, partial wh-movement arises.
When PARSESCOPEO dominates WH-IN-SPEC, the phonetic material of
the wh-phrase will be realized in the highest chain position under consideration
(WH-IN-SPEC cannot block scope driven movement up to Spec,AGR-0)
- this characterizes languages with full and multiple wh-movement like Romanian
and Bulgarian.
The factorial typology leads us to expect that there are also languages in
which WH-IN-SPEC dominates PARSESCOPEO. Slave could be such a lan-
guage: In Slave (Basilico 1998), the wh-in-situ strategy is less restrictive than
overt movement, as (24-a-b) show: The complements of so-called "indirect
discourse verbs" are barriers for overt movement, but wh-in-situ is licensed.
It is thus surprising that (24-c), which involves partial movement within the
island, is in fact grammatical, quite in contrast to what one would expect from
the situation we found in Malay.
3 Wh-Copying
On obvious grounds, (26) can be analyzed in two ways: Taking up ideas proposed
by Grewendorf (1999) and Sabel (1998), we may hypothesize that all
wh-phrases move to their scope position, but that there is a principle that bans
the spelling out of more than one wh-phrase per Spec,CP position. Alternatively,
we may follow Müller (1997) in the assumption that there is a principle
that bans the (phrasal) movement of more than one wh-phrase to Spec,CP in
German. The wh-phrase in situ would then have to be bound in situ (or undergo
feature movement in the system of Pesetsky 1998b).
When the two wh-phrases of a multiple question originate in different
clauses, no uniform pattern emerges: In addition to the constellation in (27-a),
which closely mirrors the English counterpart and which characterizes standard
German, there are dialects in which the multiple question cannot be
formed in the way it is in (27-a).¹⁹ In such dialects, the lower wh-phrase either
has to undergo "partial" wh-movement to the specifier position of the
complement clause (as in (27-b), which is acceptable for at least some speakers
in Potsdam and surroundings), or the lower wh-phrase must be the one
that undergoes overt movement (as in (27-c), blatantly violating superiority
thereby).²⁰
The latter two dialects thus resemble Iraqi Arabic (see Ouhalla 1996) and
Hindi (Mahajan 1990) in the sense that (a) wh-phrases in situ cannot take
scope out of the minimal finite clause they are contained in (unless they fill
this clause's specifier position) and (b) the distribution of wh-in-situ is therefore
more constrained than the distribution of wh-phrases that have undergone
overt movement. If German wh-phrases in situ are not moved covertly,
and are subject to additional binding requirements of the sort we find in Iraqi
Arabic (Ouhalla 1996), the unavailability of a multiple question interpretation
for (27-a) in the relevant dialects is captured fairly easily, while it is less clear
why covert movement should have to fulfill less liberal island conditions than
overt movement, if the major difference between the two operations is one of
the location of spellout. The dialects that rule out (27-a) thus suggest that German
wh-phrases in situ do not involve covert movement. This is quite in line
with the conclusions arrived at (for what he terms covert phrasal movement)
by Pesetsky (1998b) on quite different grounds.
120 Gisbert Fanselow & Damir Cavar
(31) a. ¡Welchem Mann glaubst du wem sie das Buch gegeben hat?
which man believe you who she the book given has
'Which man do you think that she gave the book to?'
b. ¡Mit welchem Werkzeug glaubst du womit Ede das Auto
with which tool think you what-with Ede the car
repariert hat?
repaired has
'With which tool do you think that Ede repaired the car?'
Let us turn, then, to the Copy Construction (CC), and see how it fits into
our analysis. The CC is characterized by a number of interesting generaliza-
tions, two of which are fairly standard. First, no copy may appear in the root
position of the wh-chain.
While CCs that involve wh-phrases consisting of a single word are perfect
in those dialects that allow the CC at all, the situation differs radically when
the wh-phrase is syntactically complex: In the CC ungrammaticality arises in
quite a number of dialects/idiolects as soon as the upper copy contains two or
more words (see (34)). (35)-(36) form a nice minimal pair in this respect - the
structures do not differ in meaning but just in the fact that womit is a single
word, in contrast to mit was. For some (most?) speakers, a contrast exists
between (35) and (36) - with the ungrammaticality of the former example
being rather mild only. There is practically nobody who would go beyond
(36) in terms of the complexity of the copied wh-phrase, though.
It has been assumed (Fanselow & Mahajan 1996, Höhle 1996) that this
anti-complexity restriction affects all copies in the same way, but this claim
overlooks the greater flexibility we observe for the lowest copy:
(37) Wen denkst du wen von den Studenten man einladen sollte?
who think you who of the students one invite should
'Which of the students do you think that one should invite?'
(38) Wieviel sagst du wieviel Schweine ihr habt?
how many say you how many pigs you have
'How many pigs do you say that you have?'
(39) Wen denkst du wen von den Studenten sie sagte dass man
who think you who of the students she said that one
einladen sollte?
invite should
'Which of the students do you think she said that one should invite?'
(40), on the other hand, shows that it is not sufficient for grammaticality that
one copy only in the chain is syntactically complex:
(40) *Wen von den Studenten denkst du wen man einladen sollte?
which of the students think you who one invite should
Finally, in those dialects which have few problems with (36), (41) is perfect
as well. PPs are strict islands for movement in German, so aus Konstanz
could not possibly ever have left an wen aus Konstanz by standard movement.
Thus, there is no alternative to an analysis of (41) in which the lower Spec,CP
position is occupied by an wen aus Konstanz, and the upper one by an wen.
The complexity restriction thus does not apply to the lowest copy. The
complexity restriction holding for upper copies renders wh-copying
ungrammatical whenever a wh-phrase cannot be split or "separated", as is,
e.g., the case for which-phrases.
The CC obeys stricter locality restrictions than standard long overt move-
ment, as Höhle (1996) and others have observed; consider (43) - correspond-
ing examples with simple long movement would be grammatical. What we
get is exactly analogous to the intervention effect Beck (1996) and Pesetsky
(1998b) identify for German wh-in-situ: No operator may intervene between
the copies of the wh-phrase.
There have not been too many proposals for an analysis of the CC, but Inge
Hiemstra's (1986) theory of the construction (and its Frisian counterpart) is
certainly outstanding in many respects. Published nearly ten years before
Chomsky (1995), her contribution anticipates insights of much work in featural
movement theory in a number of respects. The central idea of her analysis of
(44) is that when a structure requires wh-movement, this may be carried out
as any of the following:
The resulting system is, thus, quite reminiscent of a movement theory gen-
erally adopted later in the mid-nineties. The first two options for effecting
movement must then be complemented by a theory of spellout for the dis-
placed feature complexes. According to Hiemstra, it is the most unmarked
lexical element bearing the relevant features that will realize the feature
complex in question. A single [+wh]-feature is therefore realized as was
(= (44-a)), and the feature complex [+wh, 3rd ps., acc] as wen (= (44-b)).
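The spellout idea attributed to Hiemstra here can be illustrated with a minimal sketch. It assumes a toy lexicon containing only was and wen, with the feature sets named in the text, and it takes "most unmarked" to mean "bearing the fewest features". The lexicon, the feature labels, and the function name spell_out are illustrative assumptions, not part of Hiemstra's proposal.

```python
# Toy lexicon (assumption): each form is paired with the features it bears.
LEXICON = {
    "was": frozenset({"+wh"}),
    "wen": frozenset({"+wh", "3rd", "acc"}),
}

def spell_out(bundle):
    """Return the least marked form bearing every feature in `bundle`.

    Markedness is approximated by the number of features a form bears
    (an assumption made for illustration only).
    """
    bundle = frozenset(bundle)
    matching = [
        (len(features), form)
        for form, features in LEXICON.items()
        if bundle <= features  # the form must bear all requested features
    ]
    if not matching:
        return None  # no lexical element can realize this feature complex
    return min(matching)[1]

print(spell_out({"+wh"}))                # was
print(spell_out({"+wh", "3rd", "acc"}))  # wen
```

Both forms bear [+wh], but was is chosen for the bare feature because it carries fewer features; only wen can realize the larger feature complex.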
adapted to the general approach we pursue here, the relevant Island Constraint
takes the form (46):
(46) ISLAND
*α ... [Σ ... β ... ]
where α, β belong to a single chain, α or β is unpronounced, and
Σ is an island.
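The configuration banned by (46) can be stated as a simple check. The following is a minimal sketch under the assumption that each copy is described by three properties only (its chain, whether it is pronounced, and whether it sits inside the island Σ); the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    chain: str            # label of the movement chain the copy belongs to
    pronounced: bool      # does the copy retain its phonetic matrix?
    inside_island: bool   # is the copy contained in the island Σ?

def violates_island(alpha: Copy, beta: Copy) -> bool:
    """True if alpha (outside Σ) and beta (inside Σ) instantiate (46)."""
    same_chain = alpha.chain == beta.chain
    spans_island = (not alpha.inside_island) and beta.inside_island
    one_silent = (not alpha.pronounced) or (not beta.pronounced)
    return same_chain and spans_island and one_silent

# Long movement: the copy inside the island is silent, so (46) is violated.
upper = Copy("wh", pronounced=True, inside_island=False)
lower_silent = Copy("wh", pronounced=False, inside_island=True)
print(violates_island(upper, lower_silent))   # True

# Copy construction: both copies are pronounced, so (46) is respected.
lower_overt = Copy("wh", pronounced=True, inside_island=True)
print(violates_island(upper, lower_overt))    # False
```

On this reading, pronouncing both copies is one way of crossing an island boundary without incurring an ISLAND violation.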
Suppose that in the dialects allowing CCs, CPs are (or, can be) barriers for
extraction. In the first derivational step for a long distance question, the
wh-word will be copied to Spec,C at some stage:
Suppose that the head and the specifier of a CP/a phase (but no other ele-
ments) are accessible in the next optimization cycle. (50) represents the AGR-
OP or vP of a matrix clause that will end up as a matrix question. If Σ is not
interpreted as an island, PARSESCOPEO implies that the upper copy of wen
is retained, and LEC implies that the lower copy of wen should disappear.
(51-b) *!
*
(51-c) *!
*
(51-d) *!
A cyclic application of the principles discussed so far, together with the as-
sumption that the specifiers and heads remain accessible for optimization
from outside, thus yields the CC under the ranking given in table 1. Why
is it that upper copies must be non-complex? We can derive this from the
principles PRONECON and PARSESCOPEO if we interpret them properly. In
showing how, we may confine our attention to the derivational step linking a
wh-phrase in Spec,C to its first landing site in the matrix clause. In (53), we
ignore one candidate structure to which we will return later.
(52) wen von den Studenten denkst wen von den Studenten dass ...
who of the students think who of the students that
(53) a. wen von den Studenten denkst wen von den Studenten dass ...
b. wen von den Studenten denkst wen von den Studenten dass ...
c. wen von den Studenten denkst wen von den Studenten dass ...
The higher the restriction of a wh-operator is moved in a tree, the more
violations of PARSESCOPEO arise relative to it, so that (53-b) is favored over
(53-c). The core properties of the CC thus seem derived.
Note, however, that an account of (53-b) vs. (53-c) in terms of PARSESCOPEO
makes the incorrect prediction that a wh-phrase that can be split up
must be so. This is false, as (55) shows.
We therefore need a principle that penalizes structures that have been con-
tiguous at level L but cease to be so at level L'.
CIS disfavors separation, whereas PARSESCOPEO requires it. When the two
constraints are tied, the constellation we find in (55) arises.26 The tie with
PARSESCOPEO implies a fairly high rank for CIS; in particular, it dominates
PRONECON. Therefore, we must understand CIS in such a way that it is sat-
isfied when there is at least one copy of a phrase that is pronounced in an un-
split fashion - otherwise, the CC would be ruled out because it would always
imply a CIS violation. We must make sure, however, that the tie between
PARSESCOPEO and CIS does not imply that complex wh-phrases (which always
contain a restrictor that should be left in situ) do not have to move at all
(because the PARSESCOPEO violation by the operator part is always counter-
balanced by the PARSESCOPEO violation of the restrictor). This is effected
by the principle WH-IN-SPEC introduced above.
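The effect of a tie between two constraints can be made concrete with a small sketch. It assumes one common interpretation of ties, on which a candidate is optimal if it wins under at least one of the two possible orderings of the tied constraints. The violation counts below are invented for illustration: the unsplit candidate is taken to violate PARSESCOPE once, and the split candidate to violate CIS once.

```python
# Hypothetical violation profiles (assumption, for illustration only).
PROFILES = {
    "unsplit": {"CIS": 0, "PARSESCOPE": 1},
    "split":   {"CIS": 1, "PARSESCOPE": 0},
}

def winners(profiles, ranking):
    """Candidates whose violation vector is minimal under a strict ranking."""
    def vector(name):
        return tuple(profiles[name][c] for c in ranking)
    best = min(vector(name) for name in profiles)
    return {name for name in profiles if vector(name) == best}

def winners_with_tie(profiles, tied):
    """Union of the winners under both orderings of two tied constraints."""
    a, b = tied
    return winners(profiles, (a, b)) | winners(profiles, (b, a))

print(sorted(winners(PROFILES, ("CIS", "PARSESCOPE"))))           # ['unsplit']
print(sorted(winners(PROFILES, ("PARSESCOPE", "CIS"))))           # ['split']
print(sorted(winners_with_tie(PROFILES, ("CIS", "PARSESCOPE"))))  # ['split', 'unsplit']
```

With a strict ranking only one of the two realizations survives; with the tie, both do, which is the kind of optionality the text attributes to the tie between CIS and PARSESCOPEO.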
Consider, then, the consequence of CIS for the two crucial movement steps
- the one from the root position to Spec,C, and the subsequent step mapping
the wh-phrase into the matrix clause.
As table 3 shows, we correctly predict the distribution of grammaticality in
the first movement step of (57).
(57) wen von den Studenten du wen von den Studenten einlädst?
who of the students you who of the students invite
(58) a. wen von den Studenten du wen von den Studenten einlädst
b. wen von den Studenten du wen von den Studenten einlädst
c. wen von den Studenten du wen von den Studenten einlädst
d. wen von den Studenten du wen von den Studenten einlädst
move the restrictor of the wh-operator further up, and the wh-operator moves
to a position c-commanding its scope.
(59) a. ... wen von den Studenten denkst wen von den Studenten dass ...
b. ... wen von den Studenten denkst wen von den Studenten dass ...
c. ... wen von den Studenten denkst wen von den Studenten dass ...
d. ... wen von den Studenten denkst wen von den Studenten dass ...
(59-c) *!(restrictor) * *
(59-d) * *
*!(contiguity)
Therefore, we have derived the fact that the copy construction allows complex
overt wh-phrases in the lowest Spec,C position only. As we have remarked
above, this restriction can be minimally violated in certain dialects in which
a PP may be copied; cf. (36), repeated here as (60).
Given what we have seen so far, the optimal candidate should be one that
"strands" the preposition in the lower copy. For the constraint that makes (60)
possible by overriding PRONECON, a natural formulation comes to mind.
Note that the copying operation moves a PP category upwards, and one may
assume that phonetic material that does not contain a preposition in the head
position cannot be a phonetic realization of a PP:
The only potential problem that arises, then, is related to ineffability. A suffi-
ciently high rank of the intervention constraint will be able to block the copy
construction in the situations where this is called for, so that the winning
competitor is a long movement construction. The same consequence arises for
constellations in which the wh-phrase must not be split up (which-phrases, or
PPs, in certain dialects).
This is an acceptable result for those dialects in which the copy construc-
tion co-exists with long movement, but it does not capture the ineffability
effect that can arise for long distance dependencies when a dialect forbids
long movement and an intervention effect rules out the copy construction at
the same time. A standard solution (see Legendre et al. 1998) would be to
rank the intervention constraint higher than faithfulness constraints concern-
ing scope assignments for the intervening operators.
We have little to say about this construction, except for the observation that
it is not likely to be a subcase of a CC. While the lower occurrence of a
wh-element is a non-complex one, it does not copy the wh-operator of the upper
wh-phrase. It is rather the minimal spellout of the wh-features that should be
present in the lower Spec,C position, as (64) and (65) show: 27
One simple analysis would analyze wen and was as agreeing forms of the
complementizer. This would be consistent with the observation that (31-b) is
judged worse than (31-a) (repeated here as (66)), if we assume that womit
makes a bad [+wh]-complementizer.
(66) a. ¡Welchem Mann glaubst du wem sie das Buch gegeben hat?
which man believe you who she the book given has
'Which man do you think that she gave the book to?'
b. !Mit welchem Werkzeug glaubst du womit Ede das Auto
with which tool think you what-with Ede the car
repariert hat?
repaired has
'With which tool do you think that Ede repaired the car?'
Alternatively, the contrast in (66) might be caused by the fact that German
dialects tend not to block standard long movement when PPs are affected,
so that (66-b) might be blocked by a candidate involving long movement.
We might also consider wen, was, and womit as spellout forms for φ-features
of a wh-phrase that has lost its original phonetic content. In a dialect
that ranks ISLAND over LEC, the insertion of "expletive" phonetic material
In a simple split construction (68) (see also next section), the phonetic ma-
terial belonging to a single constituent is distributed over two places in the
sentence without any repetitions, but there are two exceptions to this property
of split XPs. In those dialects in which a PP can enter the split construc-
tion, the preposition must appear in both positions in which the PP is spelled
out partially (68-b). 28 Given the high rank of LEP, this is not unexpected.
Likewise, van Riemsdijk (1989) observes that the indefinite article may (and
sometimes must) be repeated in split noun phrases - a fact we can relate to
the observation that singular count noun phrases may never be realized pho-
netically without an initial determiner.
Thus, the split construction supports the idea that the economization of pro-
nunciation is in general quite subject to other constraints.
A detailed analysis of the split construction is beyond the scope of the present
paper, and would mostly repeat what is said in Cavar & Fanselow (1997,
2000). The following remarks are meant to prepare the discussion of a further
advantage of a pronunciation economy account: It gives a straightforward
analysis of so-called head movement.
That constructions apparently involving rightward movement might (at
least in some contexts) have to be reanalyzed as resulting from the stranding
of phonetic material in a leftward movement operation has been proposed,
e.g., by Kayne (1994) and by Wilder (1995). It also seems obvious that the
"stranding" of β (say, a relative clause) in the process of moving Σ (say, a
DP) in (69) could be the result of an incomplete deletion 29 in the source po-
sition of movement, followed by an erasure of β in the target position, due to
PRONECON.
( 6 9 ) ... Ι Σ Α β ι ... = •
In OT terms, the candidate set for the pronunciation of a movement chain (70)
is simply enlarged by allowing (free) partial deletion in the copies created by
movement.
(70) - X Y Z - X Y Z -
(71) a. X Y Z - X Y Z
b. Χ Υ Ζ - XrX-Ζ
c. X-¥-Z — Χ Υ Ζ
d. XY-Z-XYZ
e. X Y Z - ^ Z
f. X Y-Z — X-Y-Z
g. X Y Z - X Y Z
h. J W Z - X Y Z
Candidate (71-a) violates PRONECON three times, while (71-b) and (71-c)
satisfy this constraint and represent full overt and full covert movement,
respectively. Candidates (71-d,e) represent split constituents (the pronunciation
of the copies created by movement is distributed over two places) - they
imply a CIS violation that is justified only when higher constraints like
PARSESCOPEO are thereby fulfilled. (71-f) implies a (presumably fatal) RECOV
violation, whereas the PRONECON violation in (71-g) is acceptable only if a
constraint like LEP forces it. Additional constraints may come into play: The
PARMOVE constraint of Müller (1998a) will imply that the c-command relations
between the phonetic occurrences of X, Y, and Z should not change
when the phrase is split up by free deletion - this is respected in (71-c,d), but
not in (71-h).
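The enlarged candidate set and its evaluation can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the text: a chain has exactly two copies of X Y Z; PRONECON counts elements pronounced in both copies; RECOV counts elements pronounced in neither copy; CIS is violated once if no copy is pronounced in full; and the ranking RECOV >> CIS >> PRONECON is adopted only for illustration (the text states that CIS dominates PRONECON, while the position of RECOV is an assumption). Constraints such as PARSESCOPEO, LEP and PARMOVE are left out.

```python
from itertools import product

ELEMENTS = ("X", "Y", "Z")

def candidates():
    """All ways of freely deleting material in the two copies of the chain."""
    n = len(ELEMENTS)
    for bits in product((True, False), repeat=2 * n):
        yield bits[:n], bits[n:]   # (upper copy, lower copy); True = pronounced

def recov(upper, lower):
    # Elements that are not pronounced anywhere in the chain.
    return sum(1 for u, l in zip(upper, lower) if not u and not l)

def cis(upper, lower):
    # Satisfied if at least one copy is pronounced in an unsplit fashion.
    return 0 if all(upper) or all(lower) else 1

def pronecon(upper, lower):
    # Elements that are pronounced more than once.
    return sum(1 for u, l in zip(upper, lower) if u and l)

RANKING = (recov, cis, pronecon)   # assumed ranking, highest first

def profile(candidate):
    return tuple(constraint(*candidate) for constraint in RANKING)

def optimal():
    cands = list(candidates())
    best = min(profile(c) for c in cands)
    return [c for c in cands if profile(c) == best]

def show(candidate):
    def fmt(copy):
        return " ".join(e if p else "_" for e, p in zip(ELEMENTS, copy))
    upper, lower = candidate
    return fmt(upper) + " - " + fmt(lower)

for c in optimal():
    print(show(c))
```

Under these assumptions only full overt and full covert movement survive; a split candidate can win only if a further constraint ranked above (or tied with) CIS is added to RANKING, which matches the role the text assigns to PARSESCOPEO.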
Cavar & Fanselow (1997, 2000) extend the approach that we have pre-
supposed above for left branch extractions to the split construction that one
finds in German (68-a), Russian, Polish, Croatian and many other languages,
and that has so far resisted a satisfactory treatment. For constructions such
as (68-a), the obvious alternative to analyzing the split construction in terms
of partial deletion is simple extraction. But notice that split constituents do
not respect standard islands for movement. For example, PPs are islands for
standard movement in Croatian and Polish, yet PPs may be split up freely,
as the examples in (72-a,b) illustrate. Likewise, German PPs are extraction
islands, but split constructions arise, as (68-b) and (72-c) show. See Fanselow
(1988, 1993), Kuhn (1998), van Geenhoven (1996), and in particular Cavar
& Fanselow (2000) for a detailed discussion of the shortcomings of simple
movement accounts.
The analysis offered in Cavar & Fanselow (2000) can be recast in OT terms in
the following way: The split construction arises because CIS is ranked below
(or rather tied with) PARSESCOPEO as applied to specific pragmatic-semantic
features. Thus, for Croatian, Polish, or German it can be shown that a DP or
PP is split up only if its phonetic material is linked to at least two different
pragmatic (focus) or semantic (wh-)features. Thus, suppose that krov bears
a focus feature and na kakav a wh-feature in (72-a), and suppose that focus
features have to be checked in a focus position in Croatian. If PARSESCOPEO
is ranked higher or at least as high as CIS, the PP na kakav krov need not or
cannot be realized in a single structural position - it is split up. 30
Up to now, we have only considered cases in which semantic constraints
(but see note 30) require that a phrase be linearized discontinuously after
movement. It is natural to suspect that conditions relating to the phonological
interface may have the same effect. Consider (73) in this respect. It is a noto-
rious fact that stressed particles have to be stranded (73-b,c) in German (and
Dutch) when a verb-second clause is being formed, while unstressed particles
go along with the verb (73-e,f).
to Comp only: In certain varieties of German (but not in Dutch, see Roberts
1991) the particle must not be stranded in contexts of non-finite verb incor-
poration (see (74)-(75)).
(76) ONEPROSODICWORD ( O P W )
The second position of the clause may host a single prosodic word
only.
1999, Mahajan 1999) have indeed been made that imply that at least certain
instances of head movement should be reinterpreted as phrasal movement.
Some of the obvious advantages of the resulting systems are:
— Phrasal movement can always be carried out in such a way that it ex-
tends the phrase marker (= it targets the root node) that is being con-
structed. The movement of a head to another head position does not
fulfill this extension requirement on obvious grounds: When head F
moves out of ZP to H in [H ZP], it does not land at the root node
dominating H and ZP. One unwelcome consequence of this is that
a head H moved to G does not c-command its root position, quite
in contrast to what one would normally assume to hold for move-
ment. Head movement violates most generalizations one would like
to defend in the context of a movement theory.
— In a feature driven theory of movement, it is not obvious what fac-
tor determines whether a given feature must be eliminated by head
movement or by phrasal movement.
— In the configurational definition of phrasal levels (Speas 1990,
Chomsky 1995), an element E is a maximal projection if its mother
node is not a projection of E. If a head H adjoins to head G, neither
G nor G' are projected from it. Therefore, the head H should be-
come a maximal projection by movement. Even if this consequence
were tenable, it would induce a violation of the chain uniformity
condition (Chomsky & Lasnik 1993), because the trace of the head
is not maximal.
semantic considerations, which may (but need not) create a problem in min-
imalist accounts, but will not necessarily in OT, since the relevant STAY*
violations may be called for by the need to respect a higher ranked NPS or
OPW. But we have known since Wexler & Culicover (1980) that movement
has a freezing effect on a phrase P in the sense that P becomes an island after
movement, as demonstrated impressively in Müller (1998b). "Head" move-
ment constellations do not induce such island effects, however:
In (81), the verb has been preposed. If this implies phrasal fronting in the
sense that eine Geschichte über wen has been extracted out of VP before the
rest of VP (= liest) was preposed, then eine Geschichte über wen should have
become an island for movement, which it has not: über wen can still be moved
out of this phrase.
Our account avoids this problem because the splitting up of the VP into a
preposed verb and a stranded rest is effected by pronunciation laws (and not
by remnant movement) - there is no particular reason why the object should
thereby become an island. A comparison of attempts to guarantee that certain
movements have no freezing effects (Müller 1999) with our approach may
thus help to identify the proper way of eliminating head movement.
Notes
Parts of this paper were presented at the Workshop on Conflicting Rules in Phonol-
ogy and Syntax at the University of Potsdam (Dec. 15-16, 1999), and at the Linguis-
tics Colloquium at the University of Leipzig. We would like to thank the audiences
for useful comments and criticism. Particular thanks for helpful hints and for
support in various respects related to this article go to Joanna Błaszczak, Caroline Féry,
Susann Fischer, Gereon Müller, and Douglas Saddy. We would also like to thank
Artemis Alexiadou, Hans-Martin Gärtner, Anoop Mahajan, Matthias Schlesewsky,
Peter Staudacher, and Chris Wilder. Research for this paper was supported by the
grant INK 12/B1 "Innovationskolleg Formale Modelle kognitiver Komplexität" fi-
nanced by the Federal Ministry of Education and Research and administered by the
German Research Foundation. The paper was written while Damir Cavar was em-
ployed by the University of Potsdam.
1. However, we do not share Pesetsky's view that the optimality theoretic aspect of
syntax is confined to such principles of sentence pronunciation.
2. In an OT-framework, one expects that the link between the (uninterpretable) fea-
tures of an attracting head and the creation of copies can be violated in two ways:
There should be movement that does not check such features, and there should be
uninterpretable features that are not checked by movement/copy formation. The
former assumption helps at least in analyzing non-terminal movement steps in
cyclic extractions; see also Chomsky (1998) for an approach in which the strict
connection between the triggering of movement and feature checking is given
up.
3. We confine our attention here to those candidates that are well formed with re-
spect to conditions such as subjacency, those in which no superfluous movement
has taken place, those in which every necessary movement step has been carried
out, etc., i.e., we confine our attention to the effects of deletion on chains that are
fully grammatical in every other respect.
4. Given that PRONECON never seems to outrank RECOV, it is more adequate to
collapse the two principles into a single one that just bans the realization of pho-
netic material that is predictable from the syntactic environment. This principle
would also be more in the spirit of Pesetsky (1998a).
5. That there will be phonetic material in one copy only follows from (7) if we
make a further assumption: The phonetic material must not be scattered over the
various copies in the chain. See below for an elaboration of this point.
6. PRONECON is an obvious extension of Pesetsky's (1998a: 344) TEL-principle,
which penalizes the use of a phonetic matrix for function words.
7. See Pesetsky (1998a: 342) for a slightly different version of this principle.
8. In a considerable number of languages (see Fanselow 2000 for an overview),
a further option exists which is also discussed under the heading "partial wh-
movement":
Wie seems to have matrix scope in (i), yet it is moved to the specifier position
of the complement clause only. The apparent scope position of the wh-phrase is
filled by a different wh-element (was 'what'). Malay and Bahasa Indonesia lack
this element in the scope position, at least overtly. There are various proposals for
the proper analysis of (i) (see, e.g., the contributions to Lutz, Müller & v. Stechow
2000). If Fanselow & Mahajan (1996, 2000) are correct, both was and wie are in
fact in their scope positions, i.e., the partial nature of wh-movement in (i) is only
apparent.
9. Alternatively, the lack of island effects for argumental wh-phrases in situ can also
be explained by assuming that they are subject to featural attraction in the sense
of Chomsky (1995) and Pesetsky (1998b), if featural attraction (or "Agreement",
see Chomsky 1998) is not constrained by subjacency, as Pesetsky (1998b) argues.
In such an account, there are two types of "covert" movement: standard phrasal
copying followed by the deletion of the phonetic material in the landing site
(= the topic of the present paper), and featural attraction.
10. An EPP-feature in the sense of Chomsky (1998), or a D-feature, as argued by
Fanselow & Mahajan (2000).
11. More precisely, it is island effects related to subjacency that one does not expect.
Intervention effects as discussed by Beck (1996) or Pesetsky (1998b) are not
excluded in this way. Furthermore, the binding of wh-phrases in situ may be subject
to binding conditions, as Ouhalla (1996) points out. That the binding option is
restricted to wh-phrases in situ is a consequence of the fact that wh-phrases can
appear in the specifier position of a CP only if they have been attracted to that
position. In other words, if the Comp-position to which the wh-phrase is
semantically linked has an attracting feature, this feature must be checked by copying,
so that island effects automatically become relevant.
12. The feature attracting wh-phrases to Spec,CP can be optionally absent in
languages with "overt wh-movement" (as seems to be the case for French matrix
Comps) and in languages without it (Chinese).
German is a language with "overt" wh-movement in which the wh-attracting
feature of Comp cannot be absent (if we ignore echo questions). Hindi, on the other
hand, seems to be a wh-in-situ language in which wh-in-situ phrases must not ap-
pear in islands (see Mahajan 1990). In the system presupposed above, this array
of facts suggests that the attracting feature of Comp is again always present (wh-
arguments cannot be simply bound by an operator, just as in German). Therefore,
there seem to be languages with (German) and without (Hindi) overtly visible
wh-movement which require the wh-attracting feature of Comps to be present
obligatorily. The only option that appears to be unrealized is a language in which
the attracting feature of Comp is obligatorily absent (in such a language, adjunct
questions could not arise at all).
13. Movement that is string-vacuous in the strictest possible sense is thus not penal-
ized by STAY* if the principle relates to the realization of phonetic matrices. This
may be a welcome result for various kinds of "evacuation" operations necessary
in the context of remnant movement that are not feature driven. See, e.g., Müller
(1999) for a discussion.
14. Violations of STAY* must be assumed to not be cumulative, because otherwise
wh-phrases would then move to the lowest Spec,C position only.
15. That island effects can be captured straightforwardly in our system has been
shown above - the pertinent argument Cole & Hermon bring forward in this
respect applies to a particular formulation of what they call "Multiple Spellout"
theories only, but not to the system we develop here.
16. More precisely, the GEN component produces some candidates in which all
copies in a chain retain their phonetic matrices, and others in which all but one
have lost theirs (and still many further candidates), and only the latter have a
chance of winning the competition because they violate PRONECON as little as
25. Pesetsky (1998a) later revises this first version of LEC. The reformulation is not
relevant for the type of data we discuss in the main body of the article, though.
26. Given the cyclic nature of optimization, there seems little hope for an attempt to
guarantee that the tie between CIS and PARSESCOPEO is not resolved differently
in different movement steps. When a phrase is split, it will not be put together in
later derivational steps for obvious reasons, but, on the other hand, one expects
a phrase to be able to split at later derivational steps as well. (ii), modeled
after corresponding Dutch data in Barbiers (1999), suggests that at least for some
versions of German and some wh-phrases, the expectation is borne out. In
addition, (ii) corroborates the view that long movement passes through an AGR-O
position, as Barbiers observes.
(i) Was für Frauen hast denn du gedacht dass er einladen will?
what for women have ptc you thought that he invite wants
(ii) Was hast denn du für Frauen gedacht dass er einladen will?
(iii) ?Was hast denn du gedacht dass er für Frauen einladen will?
27. We are obliged to Susann Fischer for helping us get informants' judgments here.
28. We owe this observation to Josef Bayer, p. c.
29. This has been proposed recently by Hinterhölzl (1999).
30. Caroline Féry (p. c.) suggests the following alternative explanation for split XPs
that avoids the assumption of specific focus positions: XPs are split because
of their suboptimal phonological properties. Notice that two prominent accents
should not be adjacent in a string. If a noun phrase has two independent foci, it
must realize two prominent accents. The splitting of the phrase avoids a situation
in which these two accents would be too close to each other.
References
Barbiers, Sjef
1999 Remnant stranding and the theory of movement. Paper presented at the
Workshop on Remnant Movement, Feature Movement and Their Impli-
cations for the T-Model. Potsdam, July 1999.
Basilico, David
1998 Wh-movement in Iraqi Arabic and Slave. The Linguistic Review 15(4):
301-339.
Beck, Sigrid
1996 Quantified structures as barriers for LF movement. Natural Language
Semantics 4: 1-56.
Bobaljik, Jonathan
1995 Morphosyntax. The syntax of verbal inflection. Ph.D. dissertation, MIT.
Cavar, Damir
1999 Aspects of the syntax-phonology interface. Ph.D. dissertation, University
of Potsdam.
Cavar, Damir — Gisbert Fanselow
1997 Split constituents in Germanic and Slavic. Paper presented at the Interna-
tional Conference on Pied Piping, Jena.
Cavar, Damir — Gisbert Fanselow
2000 Discontinuous constituents in Slavic and Germanic languages. Ms., Uni-
versity of Potsdam.
Chomsky, Noam
1986 Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam
1993 A minimalist program for linguistic theory. In: K. Hale and S.J. Keyser
(eds.) The View from Building 20, 1-52. Cambridge, MA: MIT Press.
Chomsky, Noam
1995 The minimalist program. Cambridge, MA: MIT Press.
Chomsky, Noam
1998 Minimalist inquiries: The framework. Ms., MIT.
Chomsky, Noam — Howard Lasnik
1993 The theory of principles and parameters. In: J. Jacobs, A. v. Stechow,
W. Sternefeld and Th. Vennemann (eds.) Syntax: An International Handbook
of Contemporary Research, 506-569. Berlin: de Gruyter.
Cole, Peter — Gabriella Hermon
1998 The typology of wh-movement: Wh-questions in Malay. Syntax 1: 221-
258.
du Plessis, Hans
1977 Wh movement in Afrikaans. Linguistic Inquiry 8: 723-726.
Fanselow, Gisbert
1988 Aufspaltung von NPn und das Problem der 'freien' Wortstellung. Lin-
guistische Berichte 114: 91-113.
Fanselow, Gisbert
1993 Die Rückkehr der Basisgenerierer. Groninger Arbeiten zur Germanistischen
Linguistik 36: 1-74.
Fanselow, Gisbert
2000 Partial movement. SynCom Project. Utrecht Institute of Linguistics.
Fanselow, Gisbert — Anoop Mahajan
1996 Partial movement and successive cyclicity. In: U. Lutz and G. Müller
(eds.) Papers on Wh-Scope Marking, 131-161. (Arbeitspapier des Son-
derforschungsbereichs 340, No. 76.) Stuttgart & Tübingen.
Fanselow, Gisbert — Anoop Mahajan
2000 Towards a minimalist theory of wh-expletives, wh-copying, and successive
cyclicity. In: U. Lutz, G. Müller and A. v. Stechow (eds.) Wh-Scope
Marking. Amsterdam: Benjamins.
Fanselow, Gisbert — Reinhold Kliegl — Matthias Schlesewsky
2000 'Long' movement in Northern German: A training study. Ms., University
of Potsdam.
Fox, Danny
1995 Condition C effects in ACD. MIT Working Papers in Linguistics 27: 105-
120.
van Geenhoven, Veerle
1996 Semantic incorporation and indefinite descriptions. Ph.D. dissertation,
University of Tübingen.
Grewendorf, Günther
1999 Multiple wh-fronting. Ms., University of Frankfurt.
Grimshaw, Jane
1997 Projection, heads, and optimality. Linguistic Inquiry 28: 373-422.
Groat, Erich — John O'Neil
1996 Spellout at the LF-interface. In: W. Abraham, S. D. Epstein, H. Thráinsson
and J. W. Zwart (eds.) Minimal Ideas: Syntactic Studies in the Minimalist
Framework, 113-139. Amsterdam: Benjamins.
Heck, Fabian
1997 Komplementierer und ihre Spezifikatoren. Ms., University of Tübingen.
Heck, Fabian — Gereon Müller
1999 Repair is local. Paper presented at the Workshop on Conflicting Rules in
Phonology and Syntax. Potsdam, December 1999.
Hiemstra, Inge
1986 Some aspects of wh-questions in Frisian. NOWELE 8: 97-110.
Hinterhölzl, Roland
1999 Licensing movement and stranding in the West Germanic OV languages.
Ms., University of Potsdam.
Höhle, Tilman
1996 German w...w-constructions. In: U. Lutz and G. Müller (eds.) Papers on
Wh-Scope Marking, 37-58. (Arbeitspapier des Sonderforschungsbereichs
340, No. 76.) Stuttgart & Tübingen.
Huang, C.-T. James
1981 Move wh in a language without wh-movement. The Linguistic Review 1:
369-416.
Kayne, Richard
1994 The antisymmetry of syntax. Cambridge, MA: MIT Press.
Kayne, Richard
1998 Overt vs. covert movement. Syntax 1: 128-191.
Koopman, Hilda — Anna Szabolcsi
1999 Verbal complexes. Ms., UCLA.
Kuhn, Jonas
1998 Resource sensitivity in the syntax-semantics interface and the German
split NP construction. Ms., Universität Stuttgart.
Kvam, Sigmund
1983 Linksverschachtelung im Deutschen und Norwegischen. Tübingen: Nie-
meyer.
Legendre, Géraldine
in press An introduction to optimality theoretic syntax. In: G. Legendre, J.
Grimshaw, and S. Vikner (eds.) Optimality Theoretic Syntax. Cambridge,
MA: MIT Press.
Legendre, Géraldine — Paul Smolensky — Colin Wilson
1998 When is less more? Faithfulness and minimal links in wh-chains. In:
P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (eds.)
Is the Best Good Enough?, 249-289. Cambridge, MA: MIT Press.
Lutz, Uli — Gereon Müller — Arnim von Stechow (eds.)
2000 Wh-Scope Marking. Amsterdam: Benjamins.
Mahajan, Anoop
1990 The A/A-bar distinction and movement theory. Ph.D. dissertation, MIT.
Mahajan, Anoop
1999 Against head movement in syntax. Ms., UCLA.
McDaniel, Dana
1989 Partial and multiple wh-movement. Natural Language and Linguistic
Theory 7: 565-604.
Economy of Pronunciation 149
Roberts, Ian
1997 Restructuring, head movement, and locality. Linguistic Inquiry 28: 423-
460.
Roberts, Ian
1998 Have/Be raising, Move F and Procrastinate. Linguistic Inquiry 29: 113-
125.
Sabel, Joachim
1998 Principles and parameters of wh-movement. Habilitation thesis, University
of Frankfurt.
Saddy, Doug
1991 Wh-scope mechanisms in Bahasa Indonesia. MIT Working Papers in Lin-
guistics 15: 183-218.
Saddy, Doug
1992 A versus A-bar-movement and wh-fronting in Bahasa Indonesia. Ms.,
University of Queensland.
Schmid, Tanja
1998 Optional and Obligatory IPP Constructions in Westgermanic. Paper pre-
sented at the Second Workshop on Optimality Theory Syntax, October
1998, University of Stuttgart.
Speas, Margaret
1990 Phrase Structure in Natural Language. Dordrecht: Kluwer.
Tsai, Wei-Tien
1994 On economizing the theory of A-bar dependencies. Ph.D. dissertation,
MIT.
Wexler, Kenneth — Peter Culicover
1980 Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.
Wilder, Chris
1995 Rightward movement as leftward deletion. In: U. Lutz and J. Pafel (eds.)
On Extraction and Extraposition in German, 273-309. Amsterdam: Benjamins.
Wilder, Chris
1997 Some properties of ellipsis in coordination. In: A. Alexiadou and T.A.
Hall (eds.) Studies on Universal Grammar and Typological Variation,
59-107. Amsterdam: Benjamins.
On the Integration of Cumulative Effects into
Optimality Theory
Silke Fischer
1 Introduction
The goal of this paper is to discuss the question of whether cumulative the-
ories are indispensable, because they are needed in order to capture certain
linguistic phenomena, or whether cumulative effects can be expressed equally
well in an optimality-theoretic framework. If so, cumulative theories could be
integrated into Optimality Theory (OT).
At first sight, the two theories seem to behave very differently. In OT, the
number of violations of low-ranked constraints does not play any role as long
as the constraint that is decisive for the outcome of the competition is higher-
ranked. In a cumulative theory, on the other hand, the situation is somewhat
different, because the underlying principle is that the weights of the involved
factors are added up. Thus it can happen that some factors which individually
do not have much weight and are therefore unimportant on their own become
decisive as soon as they cooccur or appear repeatedly.
As empirical background I will use Pafel's cumulative approach to quan-
tifier scope in German (cf. Pafel 1998). I will discuss whether it is possible
to "translate" it into OT, where the difficulties lie, and what kind of assump-
tions one might have to make. What I will not do is discuss Pafel's theory as
such, that is, discuss whether it is able to capture the phenomenon of quanti-
fier scope or where its advantages and disadvantages might lie; nor is the aim
of this paper to provide an adequate optimality-theoretic account of quanti-
fier scope in general (for this purpose see Heck, this volume). Pafel's theory
only serves as a case study for a more theoretical debate; therefore, the ap-
proach itself as well as the judgments on the sentences are neither changed
nor commented on.
Pafel introduces a number of factors that seem to have an impact on the scopal
behavior of quantifiers, i.e., whether they tend to take wide scope over other
quantifiers or not. Each factor is assigned some weight. In order to decide
which one of two quantifiers in a given sentence tends to take wide scope,
one has to determine which factors are relevant for each quantifier in the given
context. Then one can calculate the scopal value (SV) of each quantifier by
adding up the values of the relevant factors. The scopal behavior can then be
determined from the difference between the scopal values as follows:
(i) |SV(Q1) − SV(Q2)| ≥ 1:
The quantifier with the larger SV takes wide scope (i.e., the sentence
is unambiguous).
(ii) |SV(Q1) − SV(Q2)| < 1:
Either quantifier may take wide scope (i.e., the sentence is ambiguous).
a. 0 < |SV(Q1) − SV(Q2)| < 1:
The reading in which the quantifier with the larger SV takes wide
scope is preferred.
b. |SV(Q1) − SV(Q2)| = 0:
Both readings are equally well available.¹
Factor     Assigned to                                               Weight
EX-PRE     quantifiers in the "Vorfeld" which linearly precede       1.5
           other quantifiers
SUBJECT    subject quantifiers                                       1
IN-DIS     quantifiers that have an inherently distributive          1
           character

SV(Q1) = 1.5 + 1 + 1 = 3.5
SV(Q2) = 0
Q1 > Q2 (i.e., Q1 has relative scope over Q2): possible
Q2 > Q1 (i.e., Q2 has relative scope over Q1): impossible
(2) Jede Fuge hat ein Pianist in seinem Repertoire.
[every fugue]_acc has [a pianist]_nom in his repertoire
Q1: EX-PRE + IN-DIS    SV(Q1) = 1.5 + 1 = 2.5
Q2: SUBJECT            SV(Q2) = 1
Q1 > Q2: possible
Q2 > Q1: impossible
(3) Ein Pianist hat jede Fuge in seinem Repertoire.
[a pianist]_nom has [every fugue]_acc in his repertoire
Q1: EX-PRE + SUBJECT   SV(Q1) = 1.5 + 1 = 2.5
Q2: IN-DIS             SV(Q2) = 1
Q1 > Q2: possible
Q2 > Q1: impossible
(4) Eine Fuge hat jeder Pianist in seinem Repertoire.
[a fugue]_acc has [every pianist]_nom in his repertoire
Q1: EX-PRE             SV(Q1) = 1.5
Q2: SUBJECT + IN-DIS   SV(Q2) = 1 + 1 = 2
Q1 > Q2: possible
Q2 > Q1: possible
above, but for the beginning they already constitute a task and draw one's
attention to the main problems.
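The cumulative calculation summarised above can be illustrated with a small Python sketch. The weights are those quoted for EX-PRE, SUBJECT, and IN-DIS; the function names are hypothetical, and the sketch assumes that a difference of exactly 1 already counts as unambiguous.

```python
# Weights of the three factors quoted in the text (after Pafel 1998).
WEIGHTS = {"EX-PRE": 1.5, "SUBJECT": 1.0, "IN-DIS": 1.0}


def scopal_value(factors):
    """Add up the weights of the factors associated with one quantifier."""
    return sum(WEIGHTS[f] for f in factors)


def wide_scope_options(q1_factors, q2_factors):
    """Return the set of quantifiers ("Q1", "Q2") that may take wide scope.

    Assumption: a difference of at least 1 makes the sentence unambiguous;
    any smaller difference leaves both readings available.
    """
    sv1, sv2 = scopal_value(q1_factors), scopal_value(q2_factors)
    if abs(sv1 - sv2) >= 1:
        return {"Q1"} if sv1 > sv2 else {"Q2"}
    return {"Q1", "Q2"}


# Examples (2)-(4) from the text.
example_2 = wide_scope_options(["EX-PRE", "IN-DIS"], ["SUBJECT"])
example_3 = wide_scope_options(["EX-PRE", "SUBJECT"], ["IN-DIS"])
example_4 = wide_scope_options(["EX-PRE"], ["SUBJECT", "IN-DIS"])
print(example_2, example_3, example_4)
```

Only example (4), where the scopal values differ by 0.5, comes out as ambiguous; this is the case that the optimality-theoretic translation has to reproduce.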
Since in Pafel's theory relative quantifier scope only depends on the com-
parison of the involved quantifiers described in terms of a certain set of fac-
tors, and is not influenced by any further component like the syntactic deriva-
tion, the translation into OT might require some unconventional assumptions.
The starting point is that we have two quantifiers with different properties,
and based on this information alone our theory should be able to predict the
possible scope relations. In analogy to Pafel's procedure I therefore propose
that the quantifiers of the sentence under consideration constitute the can-
didate set and that the optimal candidate will be the quantifier which tends
to take wide scope. In the case of ambiguous sentences this means that the
candidates will have to be equally optimal.
As far as the constraints are concerned, it seems to be reasonable to adopt
Pafel's factors and, as a first try, rank them according to their weight in Pafel's
account such that constraints with greater weight are higher-ranked and con-
straints with the same weight are considered to be tied. With regard to the
examples above we thus have the following constraints:
EX-PRE (E): Quantifiers must occur in the "Vorfeld" and precede some
other quantifier.
SUBJECT (S): Quantifiers must be subjects.
IN-DIS (I): Quantifiers must be inherently distributive.
In order to get more plausible candidates than merely the quantifiers under
consideration, one can alternatively use the sentences' S-structures as input,
which yields potential LFs as output. If it is assumed that the possibility for
a quantifier to take wide scope is expressed by the fact that it precedes the
other quantifier at LF, and if the constraints are reinterpreted in such a way
that they refer to the first quantifier only (e.g., S: The first quantifier must be
the subject), we get exactly the same results. The candidates in the tableaux
are then to be understood as abbreviations for LF-representations in which
the quantifier in question precedes the other quantifier.
Let's see whether on these assumptions the predictions of Pafel's approach
can be captured.
Q2: jede Fuge    *!    *
Unfortunately, this first approach does not work. Although the constraint
ranking in (5) predicts the scopal behavior of the sentences (1)-(3) analogously
to Pafel's theory (cf. T1-T3), it is not able to capture the ambiguity of
example (4); cf. T4.
In order to predict this ambiguity, the two candidates in T4 both have to be
optimal, which means that in contrast to the situation in T1-T3, the violation of
the constraint E in T4 must not be fatal. If we compare the situation in T4 with
that in T1-T3, it can be concluded that it must be the simultaneous violation
of the two low-ranked constraints S and I that prevents the E-violation of Q2
from being fatal. At this point we are faced with an apparent contradiction.
As mentioned in the introduction, it is a basic principle of OT that violations
This constraint will be satisfied as long as at least one of the two constraints
S or I is fulfilled, and it will be violated whenever S and I are simultaneously
violated, which corresponds exactly to the situation in T4 and distinguishes
it from T1-T3. In order to derive the right result in T4, we would like to say
that E and S & I are tied. But since ties are not defined in a unified way, it
has to be made explicit at this point what kind of ties we are talking about.
Basically, we can draw a distinction between local and global ties (for a de-
tailed analysis of different types of ties see Müller 1999). The main differ-
ence between these two concepts concerns the significance of violations of
lower-ranked constraints. Under a local tie approach the prediction will be
that these violations become relevant as soon as neither the tied constraints
nor higher-ranked constraints decide the competition. Formally, this means
that a given language is determined by one constraint ranking in which the tie
is integrated as follows:
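[Diagrams: under a local tie, E and S & I occupy the same position within a
single constraint ranking (... » E ∘ S&I » ...); under a global tie, the ranking
branches into two constraint orders:
constraint order α: ... » E » S&I » ...
constraint order β: ... » S&I » E » ...]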
In this case, Q2 will win, because it does not violate the two low-ranked constraints
S and I, in contrast to Q1. For this approach to work, it would have to
be assumed that the local conjunction X & Y ("X or Y must hold") somehow
replaces the simple constraints X and Y, such that in a competition where X
& Y is involved, X and Y must be excluded. (Intuitively it does not seem to
be so unreasonable that one constraint should not be referred to twice, once
in the form of X and the second time in the form of the local conjunction X
& Y. For a related idea in which certain elements are only referred to once in
determining the grammaticality of a given derivation, cf. Richards's (1998)
Principle of Minimal Compliance.)
So if we replace T5 with T6, where S and I are excluded from the competition,
and if we assume that E and S & I are locally tied, we finally get the
right prediction for sentence (4):
T6:
Candidates              | E | S&I
☞ Q1: eine Fuge         |   | *
☞ Q2: jeder Pianist     | * |
Alternatively, we could assume that the relation between the constraints E and
S & I is expressed in terms of an ordered global tie (as illustrated in diagram
(8)). With regard to sentence (4), this means that the tableau we would get
would be equivalent to T5, except that the violations of S and I would not be
fatal and both quantifiers would be optimal: Q1 under constraint order α and
Q2 under constraint order β.
T7:
Candidates              | E    | S&I | S | I
☞ Q1: eine Fuge         |      | *   | * | *
☞ Q2: jeder Pianist     | *(!) |     |   |
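The effect of adding S & I and tying it globally with E can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: a candidate violates a simple constraint when it lacks the corresponding property, a conjunction "X&Y" is violated only when X and Y are both violated, and a global tie is resolved into every total order, a candidate counting as optimal if it wins under at least one of them. The names `viol` and `winners` are hypothetical.

```python
from itertools import permutations, product


def viol(constraint, props):
    """Return 1 if a candidate with the given properties violates the constraint.

    A simple constraint X is violated when property X is missing; a local
    conjunction "X&Y" is violated only when X and Y are both violated.
    """
    return int(all(part not in props for part in constraint.split("&")))


def winners(candidates, strata):
    """Candidates that are optimal under at least one resolution of the ties.

    `strata` is a list of tuples; the constraints inside one tuple are tied.
    """
    optimal = set()
    for choice in product(*(permutations(s) for s in strata)):
        order = [c for stratum in choice for c in stratum]
        profile = {name: tuple(viol(c, props) for c in order)
                   for name, props in candidates.items()}
        best = min(profile.values())  # lexicographic comparison = strict domination
        optimal |= {name for name, p in profile.items() if p == best}
    return optimal


example_2 = {"Q1": {"E", "I"}, "Q2": {"S"}}   # Jede Fuge hat ein Pianist ...
example_4 = {"Q1": {"E"}, "Q2": {"S", "I"}}   # Eine Fuge hat jeder Pianist ...

first_try = [("E",), ("S", "I")]              # E above the tied pair S, I
with_conjunction = [("E", "S&I"), ("S", "I")]  # E tied with S&I, above S, I

print(winners(example_4, first_try))
print(winners(example_4, with_conjunction))
print(winners(example_2, with_conjunction))
```

Under the first ranking only Q1 is optimal in example (4), which misses the ambiguity; with the conjunction tied to E both candidates are optimal, while example (2) still has Q1 as its only winner.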
To sum up, the underlying ranking we have assumed so far is E ∘ S & I
» S ∘ I. However, the following example reveals that this order cannot be
completely correct. In order to capture the ambiguity of sentence (9), which
corresponds to Pafel's example (3.108b), E and S have to be tied.
This observation raises a severe problem. If we assume on the one hand that
E is tied with S (and also with I, as the difference between the scopal values
shows), and on the other hand that E is tied with S & I, we have to conclude
that S & I is also tied with S and I because of transitivity. But if we consider
these constraints in the light of Pafel's approach,⁴ they correspond to factors
with the scopal values 2 and 1 respectively, which means that the difference is
≥ 1. Thus only the factor with the larger scopal value should be able to take
wide scope, and the corresponding constraint should be higher-ranked than
the other one. So it must be concluded that we face a problem with regard to
transitivity.
T9:
Candidates               | E & SL | I
☞ Q1: einem Kind         |        | *
*☞ Q2: jedes Märchen     | *      |
If we want to make sure that only Q1 wins, E & SL must be ranked higher
than I, a ranking which is also suggested by the difference between their
corresponding weights, which is 2.5 − 1 = 1.5.
I do not know how to solve this problem without giving up to some extent
the idea that constraint orders must be strictly transitive. But if we allow that
A ∘ B and B ∘ C does not necessarily imply A ∘ C, we can account for the
examples above with the following diagram:
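(11) [Diagram: the tie A ∘ B branches into two rankings; only the branch with
A » B contains the second tie B ∘ C. The three resulting constraint orders are:]
constraint order α: (... A » B » C ...)
constraint order β: (... A » C » B ...)
constraint order γ: (... B » A » C ...)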
In (11), two global ties are involved, which express the relations A ∘ B and
B ∘ C, but still all three resulting constraint orders predict that A is higher-ranked
than C. This is possible because in contrast to usual assumptions,
according to which the branches of global ties are continued in the same way,
the second tie in (11) does not affect all branches, but is only part of the two
constraint orders α and β. So we could propose that the occurrence of global
ties need not necessarily affect all branches of the ranking structure. With this
assumption the transitivity problem can be solved, which means that the idea
of strict transitivity in constraint rankings must be given up (and this might
be a controversial result). However, transitivity does not have to be given up
completely, since each constraint order in itself remains transitive. It seems
to me that this is the easiest way to integrate the non-transitive effects of
cumulative theories into OT.⁵
The question then arises of how the underlying relation between the con-
straints A, B, and C, which is illustrated in (11), can be formally expressed.
Following a suggestion by Ralf Vogel (p.c.), I propose that it can be captured
adequately by the relation (A » C) ∘ B, where this kind of interaction
between ties and hierarchical rankings is defined as follows:
(12) (A » C) ∘ B := A ∘ B » C ∨ A » C ∘ B
                  = A » B » C ∨ B » A » C
                    ∨ A » C » B (∨ A » B » C)
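One way to read the relation (A » C) ∘ B is that B may be placed anywhere relative to the fixed chain A » C. The short Python sketch below enumerates the resulting orders on that reading; the helper name `tie_with_chain` is hypothetical and the reading itself is an assumption, not part of the definition.

```python
def tie_with_chain(chain, x):
    """All total orders obtained by inserting x at any position of a ranked chain."""
    return [chain[:i] + [x] + chain[i:] for i in range(len(chain) + 1)]


orders = tie_with_chain(["A", "C"], "B")
for order in orders:
    print(" » ".join(order))

# In every resulting order A still outranks C.
a_above_c = all(order.index("A") < order.index("C") for order in orders)
print(a_above_c)
```

The enumeration yields B » A » C, A » B » C, and A » C » B, i.e., the three constraint orders of (11), and none of them reverses A » C.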
5 Combining Constraints
T13:
Candidates           | E & I | S
☞ Q1: jede Fuge      |       | *
   Q2: jede Fuge     | *!    |
As far as factors with negative weight are concerned, one might alternatively
translate them into negative constraints in order to avoid configurations where
X » X & Y, which contradicts the definition of local conjunction. The factor
FOCUS, for example, would then translate into the following constraint:
In fact, we could then also try to replace the factor FOCUS (with weight
−1), which is associated with focused quantifiers, with a factor *FOCUS with
weight 1, which is associated with unfocused quantifiers. In this way we could
generally reinterpret factors with negative weight such that they would all be
assigned positive weight. With regard to example (14), we would then have
the following configuration, which illustrates that the difference between the
scopal values and therefore the predictions on possible scope relations remain
unaffected by this reinterpretation.
T17:
Candidates              | E & ST | S & I & *F
☞ Q1: welche Fuge       |        | *
☞ Q2: jeder Pianist     | *      |
As far as example (15) is concerned, the factor *FOCUS would not be in-
volved at all, because both quantifiers in the sentence are focused. Hence,
*F would not belong to the relevant constraint subset. However, all sentences
that contain unfocused quantifiers (like the examples (1)-(4)) are now associated
with the factor *FOCUS and therefore with the constraint *F; but as our
considerations above have shown, *F will be excluded from CON_rel in case
both involved quantifiers are unfocused. Thus the replacement of F/FOCUS
by *F/*FOCUS does not affect our earlier examples.
Finally, there is another configuration in Pafel's approach that must be mentioned.
If a quantifier is not associated with any property that is relevant for
scope, it receives the scopal value 0. Thus it is possible for a sentence containing
such a quantifier to be ambiguous in case the second quantifier Q2
has a scopal value with −1 < SV(Q2) < 1. Assume that Q2 has the property
A, which translates into the constraint A. As indicated in T18, Q2 fulfils A in
contrast to Q1. Thus we are faced with the situation that Q2 will always win
if we do not introduce a further constraint which is violated by Q2 but not
by Q1.
T18:
Candidates | A
* Q1       | *!
☞ Q2       |
Note that the constraint N-PR must also come into play if a quantifier shares
all its properties with the second quantifier of the sentence. This configuration
is illustrated in the following example, where A and B are properties relevant
for scope that translate into the constraints A and B respectively.
(17) Q1: A + B
Q2: A, where |SV(Q1) − SV(Q2)| < 1, i.e., either quantifier can
take wide scope.
As discussed above, the constraint derived from the common property A is
excluded from CON_rel. Thus the relevant constraint subset might be:
(i) {B}, or
(ii) {B, N-PR}.
For (ii), the constraint ranking is B ∘ N-PR, because we know from our assumptions
in (17) that |weight(B)| < 1. The results we get for (i) and (ii) are
illustrated in T20(i) and T20(ii), which show that we have to use the second
constraint subset.
T20(i):
Candidates | B
☞ Q1       |
* Q2       | *!

T20(ii):
Candidates | B | N-PR
☞ Q1       |   | *
☞ Q2       | * |
One further situation that can occur in cumulative theories, which we do not
find in Pafel's approach however, is that the cumulative occurrence of one
and the same constraint violation might change the outcome of the whole
competition. Imagine the following configuration:
T21:
Candidates | A  | B
   C1      | *! |
☞ C2       |    | *

T22:
Candidates | A  | B
☞ C1       | *  |
   C2      |    | **!
If it is assumed that A » B, we can account for T21, but not for T22, and if
we assume that B » A, we get the right prediction for T22, but not for T21. In
the light of the ongoing discussion, one way out of the dilemma might be to
assume that constraint combinations of the sort X & Y are not only possible
in case X ≠ Y, but also if X = Y. The resulting constraint would be a reflexive
local conjunction (cf. also Legendre et al. 1998), which would have to be
interpreted as follows:
On these assumptions, T21 and T22 can be accounted for with the following
constraint ranking: B² » A » B. Since A » B, C2 wins in T21, and since
B² » A, C1 wins in T22, as illustrated more precisely in T23.
T23:
Candidates | B² | A
☞ C1       |    | *
   C2      | *! |
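The interplay of B², A, and B in T21-T23 can be replayed with a few lines of Python. This is a sketch under an explicit assumption: the reflexive local conjunction B² (written "B^2" in the code) is taken to be violated as soon as B is violated at least twice, which is the interpretation suggested by T23. The function name `evaluate` is hypothetical.

```python
def evaluate(candidates, ranking):
    """Return the candidates with the best violation profile under strict ranking.

    `candidates` maps a name to a dict of violation counts per simple constraint.
    A constraint written "X^2" stands for the reflexive local conjunction of X;
    it is assumed to be violated once X is violated at least twice.
    """
    def profile(marks):
        row = []
        for constraint in ranking:
            if constraint.endswith("^2"):
                row.append(int(marks.get(constraint[:-2], 0) >= 2))
            else:
                row.append(marks.get(constraint, 0))
        return tuple(row)

    best = min(profile(m) for m in candidates.values())
    return {name for name, m in candidates.items() if profile(m) == best}


t21 = {"C1": {"A": 1}, "C2": {"B": 1}}
t22 = {"C1": {"A": 1}, "C2": {"B": 2}}

print(evaluate(t21, ["A", "B"]), evaluate(t22, ["A", "B"]))
print(evaluate(t21, ["B", "A"]), evaluate(t22, ["B", "A"]))
print(evaluate(t21, ["B^2", "A", "B"]), evaluate(t22, ["B^2", "A", "B"]))
```

Neither A » B nor B » A selects C2 in T21 and C1 in T22 at the same time, whereas the ranking with the reflexive conjunction on top does.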
6 Conclusion
non-transitive effects into a transitive order. I think that this is only possible
if the idea of strict or global transitivity, where A ∘ B and B ∘ C necessarily
implies A ∘ C, is given up. Thus, I proposed that the occurrence of global
ties within global ties might only affect some of the branches. This approach
allows on the one hand the integration of non-transitive effects, but preserves
on the other hand at least locally the transitive order, because each resulting
constraint order remains transitive. Thus, this step is not as radical as it might
seem at first sight. Of course, it has to be pointed out that global ties in general
increase the amount of complexity tremendously; however, the number of the
resulting constraint rankings is again reduced somewhat if global ties do not
necessarily have to affect all branches. As far as the formal realization of
this relation is concerned, it can be expressed as interaction between ties and
bracketed hierarchical rankings. This seems to me to be a natural elaboration
of the two basic relations "»" and "∘", which is to some extent reminiscent
of the interaction between addition and multiplication.
Finally, the question arose as to how CON_rel, the smallest set of constraints
relevant for a competition, can be defined. It is clear that constraints on which
the candidates behave alike can be excluded and that furthermore simple con-
straints which are also part of relevant local conjunctions need not be taken
into consideration. (In the latter case, the simple constraints will not be de-
cisive, since the corresponding local conjunctions are higher-ranked.) Moreover,
the cumulative character of the constraints ensures that (A & X) » or
∘ (B & X) ⇔ A » or ∘ B, which allows us to ignore certain higher-ranked
local conjunctions on which the candidates differ. As far as the integration of
Pafel's approach into OT is concerned, it could therefore be concluded that
CON_rel contains only two constraints, namely the constraint combinations
derived from the properties associated with each quantifier.
There are two questions I have not addressed here. First, it could be asked
whether anything would change if CON_rel contained more than two constraints
or if more than two candidates were involved. The second question
concerns the representation of tendencies in OT, as for example the
preference for certain readings. One possibility might be that it can some-
how be captured by the number of constraint orders which are affected by
certain ties, since this is exactly how ambiguities predicted by the relation
0 < |SV(Q1) − SV(Q2)| < 1 are characterized. However, whether this approach
would really work would have to be discussed in more detail.
Appendix
The ranking we finally assumed for the constraints S & I, S, I, and E was (S &
I » S ∘ I) ∘ E, which results in eight constraint orders if the ties are resolved
(cf. diagram (13)). This outcome can be predicted very easily if we assume
the following definition, which is a generalization of definition (12):
(D » A ∘ B) ∘ C
:= (D » A » B) ∘ C ∨ (D » B » A) ∘ C
 = D ∘ C » A » B ∨ D ∘ C » B » A
   ∨ D » A ∘ C » B ∨ D » B ∘ C » A
   ∨ D » A » B ∘ C ∨ D » B » A ∘ C
 = D » C » A » B ∨ D » C » B » A
   ∨ C » D » A » B ∨ C » D » B » A
   ∨ D » A » C » B ∨ D » B » C » A
   (∨ D » C » A » B ∨ D » C » B » A)
   ∨ D » A » B » C ∨ D » B » A » C
   (∨ D » A » C » B ∨ D » B » C » A)
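The count of eight constraint orders can be verified by brute force. The Python sketch below is a hypothetical helper, not part of the definition: it first resolves the inner tie A ∘ B below D and then places C at every position of each resulting chain, skipping orders that have already been produced.

```python
from itertools import permutations


def resolve(top, tied, outer):
    """Enumerate the orders of (top » tied[0] ∘ tied[1] ∘ ...) ∘ outer.

    The inner tie is resolved into every chain below `top`; `outer` is then
    inserted at every position of each chain. Duplicates are dropped.
    """
    orders = []
    for perm in permutations(tied):
        chain = [top, *perm]
        for i in range(len(chain) + 1):
            order = tuple(chain[:i] + [outer] + chain[i:])
            if order not in orders:
                orders.append(order)
    return orders


# The ranking (S&I » S ∘ I) ∘ E assumed in the text.
orders = resolve("S&I", ("S", "I"), "E")
for order in orders:
    print(" » ".join(order))
print(len(orders))
```

On this construction the eight orders come out without the duplicates that appear in parentheses in the step-by-step expansion, and S & I outranks both S and I in each of them.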
Notes
For comments and discussion I want to thank Fabian Heck, Gereon Müller, Tanja
Schmid, Wolfgang Sternefeld, Sten Vikner, and Ralf Vogel.
1. The distinction between (ii-a) and (ii-b) is only mentioned for completeness'
sake. It does not play any role in the further discussion, since the question of
how this difference can be expressed in an optimality-theoretic framework is not
addressed here.
2. If this constraint were translated back into Pafel's theory, it would correspond to
a factor with the weight 2, since it involves both properties S (weight 1) and I
(weight 1).
3. The question might arise of whether it is legitimate to restrict the competition
to the four constraints considered in T5. It is true that there are higher-ranked
constraints on which Q1 and Q2 differ, namely E & X and any local conjunction
containing X and S or I, where X is a constraint that is violated by both candidates.
However, the cumulative character of the constraints ensures that (A & X)
» or ∘ (B & X) ⇔ A » or ∘ B. Thus, the outcome of a competition involving
the constraints A & X, B & X, A, and B does not change if A & X and B & X
are not taken into account.
4. It is not possible to provide a concrete example that only involves the two con-
straints S & I and S or I. These combinations are ruled out, because Pafel's pos-
tulation of the two contrasting factors EX-PRE and IN-PRE assures that one of
them is always involved. (The latter property is assigned to quantifiers in the
"Mittelfeld" that linearly precede other quantifiers.) But I think the general prob-
lem becomes clear nevertheless.
5. The situation in which A ∘ B and B ∘ C, but C » A must be excluded, is not as
unusual as it may seem at first sight. It also occurs, for example, in Müller (2000),
where it is assumed on the one hand (by transitivity) that A ∘ C, but where on the
other hand C » A is excluded because of an underlying meta-constraint which
says that A must be higher-ranked than C.
6. The dotted lines in the tableaux indicate that two neighboring constraints X and
Y are tied, but that their corresponding weights are not equal.
7. The sentences (14) and (15) correspond to Pafel's examples (3.164') and (3.165).
8. Pafel assumes that wh-phrases are inherently focused (cf. Pafel 1998: 98).
References
Heck, Fabian
t.v. Quantifier scope in German and cyclic optimization
Legendre, Géraldine — Paul Smolensky — Colin Wilson
1998 When is less more? Faithfulness and minimal links in wh-chains. In: P.
Barbosa, D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (eds.) Is the
Best Good Enough?, 249-289. Cambridge, MA: MIT Press.
Müller, Gereon
1999 Optionality in optimality-theoretic syntax. GLOT International 4.5: 3-8.
Müller, Gereon
2000 Das Pronominaladverb als Reparaturphänomen. Linguistische Berichte
182: 139-178.
Pafel, Jürgen
1998 Skopus und logische Struktur. Studien zum Quantorenskopus im Deut-
schen. Technical Report 129, Arbeitspapiere des Sonderforschungsbere-
ichs 340. Universität Tübingen.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar. Ms.,
Rutgers University & University of Colorado, Boulder. To appear as Linguistic
Inquiry Monograph, Cambridge, MA: MIT Press.
Richards, Norvin
1998 The principle of minimal compliance. Linguistic Inquiry 29: 599-629.
Smolensky, Paul
1995 On the internal structure of Con, the constraint component of UG. Ms.,
Johns Hopkins University.
Quantifier Scope in German and Cyclic Optimization
Fabian Heck
1 Introduction
The goal of this paper is to account for the phenomenon of relative quantifier
scope in German. The discussion basically deals with sentence pairs of the
following type:
(1-b) is ambiguous. It can either have the meaning described in (2-a) or the
meaning described in (2-b):
(2) a. There exists one mistake x such that for every person y the following
holds: y made x.
b. For every person y there exists a mistake x such that y made x.
Interestingly (1-a) only has the reading (2-b). Hence, the question is, when
does a sentence that contains two quantifiers have only one reading and when
does it have two readings?
First of all, following May (1977), Stechow (1993), Heim & Kratzer (1997),
and others I assume that every meaning of a sentence is spelled out unambiguously
at the level of Logical Form.¹ The relative scope of two quantifiers Q1
and Q2 is encoded by the relationship of c-command (following the definition
in Reinhart 1976): If Q1 c-commands Q2 at LF, then Q1 has scope over Q2.
Now, I think that there are two main observations that may lead to a prin-
cipled account of the given question: First, the relative quantifier scope in
German is highly dependent on the given S-structural configuration. That is,
if a quantifier Q1 c-commands another quantifier Q2 at S-structure (SS), then
Q1 will be able to c-command Q2 at LF as well. In other words, the mapping
from S-structure to LF is highly structure preserving (cf., for instance, Kiss
1999 for German, and Kroch 1974, Reinhart 1983, and McCawley 1999 for
English).
Second, it seems that the scope relations can be inverted on the way from
S-structure to LF if the derivation has involved S-structure movement.² This
means that whenever there is a quantifier at S-structure that does not fill its
D-structure position, the scope relations are destabilised and there may be an
accessible reading that does not correspond to the S-structural configuration.
Technically this will be spelled out by reconstructing the moved quantifier to
its base position.
The basic assumption about the transparent LF, together with the first ob-
servation, calls for the syntactic levels of S-structure and LF. The second
observation calls for the syntactic level of D-structure (DS).
Since I am using Optimality Theory to tackle the problem, I first want to
give a motivation for this decision: Often the data suggest that there are dif-
ferent principles at work that stand in conflict with each other, but that never-
theless are all needed. That is, even grammatical structures cannot fulfil every
constraint. We nevertheless need all constraints, and hence, constraints must
be violable. OT gives us the means to express the concept of a violable but
active constraint.
2 Cyclic Optimization
The strategy I will follow here is to reconcile the classical T-model of gram-
mar of Chomsky (1981) with the standard model of Optimality Theory of
Prince & Smolensky (1993) and McCarthy & Prince (1993).³ The result of
this reconciliation will be an extended version of OT which will be referred
to as the model of Cyclic Optimization. Its basic characteristics are the fol-
lowing.
Starting with a kind of predicate-argument structure as input, a generator
GEN constructs a set of possible D-structures out of some "lexical" material.⁴
This input defines the candidate set:
This set will then be optimized in the first cycle. The output will be an optimal
D-structure DS_i. DS_i in turn will serve as input for the second cycle,
which starts with the generation of a set of possible S-structures, basically
using the transformation move-α. This set will again undergo the process of
optimization and the output will be an optimal S-structure SS_j. SS_j will be
the input for the last cycle. Again using move-α, a set of possible LFs will be
generated and one last time optimization will apply, resulting in an optimal
Logical Form LF_k. The whole computation can be seen in the diagram below:
3 D-Structure
This means that the indirect object (IO) in (4) may principally occupy three
different positions at D-structure: it may be base adjoined above the subject,
between the subject and the direct object (DO), or below the direct object.
The claim I want to make is that its exact base position can be determined
by a process of D-structure optimization. The positions of the subject and the
DO are fixed, so the only variation in D-structure will come from the choice
of base adjoining the IO at different positions. This choice will give us the
3.1 Constraints
I will now introduce the first constraints, partially following Abraham (1986),
Hoberg (1981), Lenerz (1977), Stowell (1981), Uszkoreit (1986), and Müller
(1999). These three constraints will provide us with the traces we need to
derive scope inversion by reconstruction. This is the first step to linking rel-
ative scope, an LF-phenomenon, to basic word order, which is a property of
D-structure. And here is the first constraint:
Evidence for ANIM is given by the following examples, in which in the un-
marked case the animate argument always precedes the inanimate argument
(see the a-examples). If the order is reversed as in the b-examples, the result
is marked:⁵
What is particularly interesting here is that AGENT has not had any impact
in the examples so far. But AGENT comes into play as soon as ANIM is kept
constant:
In a sense, the evidence for ADJA shows the same characteristics as the examples
cited as evidence for AGENT: this time, if ANIM and AGENT both are
kept constant, the effects of ADJA can emerge:⁷
3.2 Analysis
We now turn to the explicit computations, which are shown in the OT-tables
below. Optimal candidates are indicated by the pointing hand ☞.
The tables in (15) and (16) show different unmarked orders for the examples
in (6) and (7) respectively, in which the main reason for different word order
is a difference in animacy:
Examples like these show very clearly that basic word order cannot be totally
dependent on the verb. In both cases we face the same verb. However, in one
case the basic word order is direct object before indirect object, whereas in
the other example it is the inverse.
The tables in (17) and (18) show the analysis of the examples in (8) and
(11) respectively:
As can be seen, the indirect object occurs to the left of the subject in one case
but to the right of the subject in the other case. This is due to a difference in
animacy of the arguments, and the emergence of AGENT.
Finally, we see an example of what happens if even agentivity is neutral-
ized:
Thus, we can finish with the first cycle. The optimal D-structure will now be
the input for the next cycle, the S-structure generation and optimization.9
4 S-Structure
I will first clarify the basic assumptions about S-structure that are made here:
1. Topicalization is semantically empty movement. It is triggered by some
need for clause typing. 2. Scrambling may be semantically empty if it is trig-
gered by information structural needs (e.g., align focus to the right). But it
may also be semantically relevant if it is triggered in order to gain scope.
4.1 Constraints
The next constraint I will adopt is the economy constraint ECON, which pro-
hibits movement (cf. Chomsky 1995, Grimshaw 1997).
Since there is ECON, a trigger for S-structure movement is needed. In the case
of scope induced scrambling this will be formalised by stipulating an abstract
scope marker Q which is generated at D-structure and which c-commands
the base position of the quantifier it is supercoindexed with (coindexation
meaning that the quantifier and the scope marker share the same scope):10
To force the quantifier to move up to its scope marker we need the next con-
straint, which follows quite naturally:
4.2 Analysis
Since the main interest here is relative scope at LF and not S-structure move-
ment, I shall only briefly discuss the consequences of this ranking. The input
for the computation will be optimal D-structures - the output of the previous
cycle.
First of all it is clear that if there is a scope marker, then the quantifier which
is coindexed with it has to move in order to fulfil SP:
On the other hand, if there is no scope marker present, then scrambling will
result in a fatal violation of economy:
If we have an object that is both topic marked and scope marked, then on
the one hand it has to move to the Top position in order to satisfy TYPE, but
on the other hand it has to move to its scope marker. Since we assumed that
TYPE » SP at S-structure, the object will be topicalized:14
Input (DS): –Top daß J. Qⁱ jedem Kind [+Top ein Märchen]ⁱ erzählt hat
Candidates                                       TYPE   SP   ECON
☞ C1: ein Märchenⁱ ... Qⁱ ... jedem ... tᵢ               *    *
   C2: [Q ein Märchenⁱ Qⁱ] ... jedem ... tᵢ       *!          *
   C3: –Top ... Qⁱ ... jedem ... ein Märchenⁱ     *!     *
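The evaluation behind the tableau above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the violation profiles are one reading of the tableau, the candidate labels are paraphrases, and the helper names `profile` and `optimal` are hypothetical. Strict domination is modelled by comparing violation vectors lexicographically in ranking order.

```python
from typing import Dict, List, Tuple

# S-structure ranking assumed in the text: TYPE >> SP >> ECON
RANKING: List[str] = ["TYPE", "SP", "ECON"]

# Violation counts per candidate (an assumed reading of the tableau).
CANDIDATES: Dict[str, Dict[str, int]] = {
    "C1: object moves to the Top position": {"SP": 1, "ECON": 1},
    "C2: object moves to its scope marker": {"TYPE": 1, "ECON": 1},
    "C3: object stays in situ": {"TYPE": 1, "SP": 1},
}


def profile(violations: Dict[str, int], ranking: List[str]) -> Tuple[int, ...]:
    """Order the violation counts by rank; unlisted constraints count as 0."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def optimal(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> List[str]:
    """Return every candidate whose profile is lexicographically minimal."""
    best = min(profile(v, ranking) for v in candidates.values())
    return [name for name, v in candidates.items() if profile(v, ranking) == best]


if __name__ == "__main__":
    print(optimal(CANDIDATES, RANKING))
    # Reversing TYPE and SP makes movement to the scope marker optimal instead.
    print(optimal(CANDIDATES, ["SP", "TYPE", "ECON"]))
```

Tuple comparison in Python is lexicographic, so a single violation of a higher-ranked constraint outweighs any number of violations further down, which is the behaviour a strict ranking requires.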
More or less the same holds for ALIGN. For reasons of space I will only show
the result of the complex situation in which there is a conflict between scope
marking and focus alignment:15
*
C2: I Q [F jeder h ] . . . t2... einem Flüchtling *!
* *
C3: [ Q [F jeder ] 2 ] einem Flüchtling] ... ti *!
*
C4: Q!... [F jeder]' ... einem Flüchtling *!
As we shall see later, the scope marker in (31) stands in an improper posi-
tion and therefore violates a constraint designed to avoid the proliferation of
improper scope marker insertion. We will come back to this constraint later.
To sum up: S-structure is the level at which the relative scope is determined
in most cases in German. This means that a quantifier moves to its scope
position. The exceptions are cases where movement which is triggered by
information structure outranks scope movement and delays it until LF.
5 Logical Form
We now turn to the last cycle. The basic assumptions are: 1. The verb is in-
terpreted as an open proposition (following Nohl & Stechow 1995).16 This
means that there are argument variables which are generated directly at
the verb. As a consequence there is no type driven QR (contra May 1977,
May 1985), but quantifiers can be interpreted in situ. 2. Semantically empty
movement is obligatorily reconstructed at LF.
In addition to QUIB I will introduce two further constraints that show their
effects at LF. Together with the first two cycles these constraints will serve
as a means to derive some empirical facts about relative scope in German as
they have been noted by Pafel (1997).
5.1 Constraints
It is well known that some quantifiers tend to take wide scope as a mere
lexical property (cf., for instance, Milsark 1974 and Pafel 1997). In German
these quantifiers are jeder, mancher, and die meisten, and they will be referred
to as the strong quantifiers, as in the following constraint:
Together with the next constraint, QR will be responsible for some instances
of scope inversion. The next constraint is based on the assumption that se-
mantically empty movement is reconstructed at LF. Reconstruction is to be
understood here as syntactic lowering that violates economy.
I am now going to present some data about relative scope in German and
then we will see how the proposed constraints can account for them.
Both examples have an inverted reading besides the reading that corresponds
to the S-structural configuration. Scope inversion follows if (34-a,b) are in-
stances of the following scheme:18
Now, what is the use of strong quantifiers that should be raised by QR? Re-
member that QUIB was introduced to account for the surface orientation of
German relative scope. Since QUIB outranks QR, it seems as if QR could
never apply.
The answer to this question is based on the following observation: Some-
times inversion seems to be possible only if movement is involved and a
strong quantifier is present:
(44)-(46) are completely analogous except that the weak quantifier all- has
been replaced by the strong quantifier jed-. Now inversion is possible in the
main clauses which involve topicalization, but still impossible in the embed-
ded clauses! So it seems that movement may destabilise the scope configura-
tion, resulting in inversion if additionally there is a strong quantifier present
that takes advantage of the destabilisation.
Now, the claim is that inversion is created by the following conspiracy of
REC and QR, and by movement induced destabilisation of the structure. First
the strong embedded quantifier is raised by QR across the base position of
the topicalized quantifier. Then the topicalized quantifier is reconstructed to
its base position. Inversion is the result:
(47) pre₁ [VP jedes Kind₂ [VP ein Schüler₁ t₂ᴸᶠ verprügelt hat]] (LF)
5.4 Analysis
I will now present the concrete computation. But first some remarks about the
tables: 1. This time the input consists of S-structures that contain the relevant
S-structure traces that must be there according to D-structure optimization
and S-structure movement. 2. The LF-structures in the tables contain only LF-
traces for reasons of space and readability. 3. The scope marker Q will count
as relevant lexical material with respect to the definition of the candidate
sets.20
5.4.1 Topicalization
F * **
C 3 pre\... [mini Q]... jeder ... t^ *!
F **
C 4 pre 1... jeder 2 ... [mini Q\... tif *!
l *
C 5 min'i... Q ... jeder 2 ... t ^ *! *
the set without scope markers would block the optimal candidate in the set
with scope markers, since the first one exhibits one less violation of economy.
The same point can be made with a weak quantifier (the example with Q
which derives the S-structure reading is omitted for reasons of space):
Again, a candidate that reconstructs wins. This time, however, QR of the em-
bedded quantifier is not licensed (since it is a weak quantifier), and therefore
C2 fatally violates economy. Candidates that do not reconstruct or that raise
the embedded quantifier across the topicalized one are ill formed because of
fatal violations of REC and QUIB respectively (see C3 and C4).
The same holds for two quantifiers in object position if the one that is more
deeply embedded is topicalized (the example with Q is again omitted):
Of course, the analogue of the candidate C4 in (51) could have been listed in
table (50) as well. But it is ill formed anyway since it does not reconstruct
and string vacuously raises the weak embedded quantifier, causing another
violation of economy.
We now come to a trickier derivation of scope inversion that has already
been mentioned. (52) demonstrates how inversion can be derived by the con-
spiracy of REC and QR (see C1):
First the quantifier that remained unmoved at S-structure is raised across the
S-structure trace of the topicalized quantifier. Then reconstruction of the top-
icalized quantifier can apply (recall that QUIB is not sensitive to reconstruc-
tion). The S-structure reading is derivable by raising the strong quantifier just
string vacuously such that its target position still remains below the target
position of reconstruction (see C2).21
However, without a strong quantifier the example has only the reading cor-
responding to the surface:
This is so because in this case raising of the embedded quantifier into a posi-
tion above the target position of reconstruction causes an additional violation
of economy, which is fatal.22
We can hold the same mechanism responsible for the readings available in a
configuration with a topicalized indirect object and a subject in situ (here one
example with a strong and another with a weak quantifier):
Again, in one case the strong quantifier raises into a position above the target
of reconstruction, thereby causing scope inversion. In the other case it does
not raise far enough and the S-structural relative scope is preserved at LF.
(55) shows the same thing without a strong quantifier. Here the additional
violation of economy is decisive and blocks inversion.
Inversion impossible
If there is no appropriate trace, reconstruction cannot apply and inversion is
blocked, even if a strong quantifier is present:
5.4.2 Scrambling
Input (SS): daß mindestens eine Fuge₁ fast [F jeder PiaNIST] t₁ spielen kann
Candidates                                   QUIB   SP   REC   QR   ECON
☞ C1: pre₁ ... jeder₂ ... t₂ᴸᶠ ... mind.₁                             **
   C2: pre₁ ... jeder ... mindestens₁                           *!    *
   C3: mindestens₁ ... jeder                              *!    *
If there is a scope marker that has triggered scrambling, it will now block
reconstruction together with the Scope Principle. Again, any attempt to derive
inversion by QR violates QUIB:
Input (SS): daß [Q [mind. eine Fuge]ⁱ₁ Qⁱ] jeder Pianist t₁ spielen kann
Candidates QUIB SP REC QR ECON
m- Ci: [mindestens Q] ... jeder2... t ^ * **
* *
C2: [mindestens Q] ...jeder
C3: \pr» 1 QM ...jeder ... mind.', *! * *
F *
C 4 : jeder 2 ... [mindestens Q] ... tir *!
The following examples demonstrate how the very same verb may or may
not allow scope inversion by reconstruction, depending on the different D-
structures which are determined by D-structure optimization.23
In (61) and (62) the D-structure relations are inverse: The indirect object pre-
cedes the direct object. Since (61) exhibits this order, no S-structure move-
ment has applied and hence no trace is there that could serve as the target of
reconstruction:
(65) Proper Q (PROP-Q)
A scope marker Qⁱ is licensed if and only if it c-commands a
contraindexed quantifier Qʲ that breaks the extended chain that is
formed by Qⁱ and its coindexed quantifier Qⁱ.26
It is clear that this constraint blocks both vacuous scope marking and down-
ward scope marking.27
Vacuous marking is blocked because if a marker does not c-command a
contraindexed quantifier, it is vacuous by definition. And for cases of down-
ward marking that could result in inversion it is exactly the same: The scope
marker is vacuous.
I have no evidence as to how PROP-Q should be ranked within the hierarchy
because it does not stand in conflict with any other constraint. It simply filters
out candidates with an improper scope marker as they occur at any level of
representation. This might be an indication that PROP-Q is part of GEN.
In this last section I want to review arguments that help to differentiate the
approach of cyclic optimization from another approach.
Remember that the goal of this paper was to derive the correspondence
between quantifier scope and basic word order, as it is suggested by the data.
Now, instead of using cyclic optimization one could think of defining candi-
dates as triples (D-structure, S-structure, LF). It would be absolutely reason-
able to do so, because there is a well-defined notion of optimality for such an
approach: A triple t = (DS, SS, LF) is optimal with regard to the other triples
t₁...tₙ iff the constraint profile of t is better than the profiles of t₁...tₙ, where
the three slots of every triple are computed in a parallel fashion and then the
violations are summed up in one big table. I will refer to this approach as the
parallel approach.
However, I think that there are three arguments that might support the cyclic
approach: 1. Complexity of derivation: If in the parallel approach an LF is
checked for optimality, there is no way to tell if it will ever have the chance
to be a winner because its D-structure and its S-structure are computed at
the same time. So this mechanism has to compute the whole set of structures
exhaustively. In the cyclic approach, however, a candidate will only be opti-
mal if all of its levels of representation are optimal. An LF that is based on
a non-optimal D-structure cannot be part of a winner even if the LF itself is
optimal. In the cyclic approach only those LFs that descend from an optimal
D-structure and an optimal S-structure are computed. All other potential LFs
are filtered out somewhere previously in the computation.
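The complexity argument can be made concrete with a toy model. The following Python sketch rests entirely on illustrative assumptions: structures are strings, every structure has three hypothetical continuations at the next level (`expand`), and each step incurs 0, 1, or 2 violations of a single constraint. It is not the constraint system of this paper; it only counts how many candidates each architecture has to inspect.

```python
from typing import Callable, List, Tuple

Profile = Tuple[int, ...]


def expand(structure: str) -> List[str]:
    """Hypothetical GEN: every structure has three continuations."""
    return [structure + step for step in "abc"]


def step_cost(structure: str) -> Profile:
    """Violations incurred by the most recent level only (cyclic view)."""
    return ("abc".index(structure[-1]),)


def total_cost(structure: str) -> Profile:
    """Violations of all levels summed in one table (parallel view)."""
    return (sum("abc".index(step) for step in structure),)


def winners(cands: List[str], cost: Callable[[str], Profile]) -> List[str]:
    best = min(cost(c) for c in cands)
    return [c for c in cands if cost(c) == best]


def cyclic(levels: int = 3) -> Tuple[List[str], int]:
    """Optimize level by level; only winners feed the next cycle."""
    current, evaluated = [""], 0
    for _ in range(levels):
        cands = [s for c in current for s in expand(c)]
        evaluated += len(cands)
        current = winners(cands, step_cost)
    return current, evaluated


def parallel(levels: int = 3) -> Tuple[List[str], int]:
    """Build every complete derivation first, then evaluate all of them."""
    cands = [""]
    for _ in range(levels):
        cands = [s for c in cands for s in expand(c)]
    return winners(cands, total_cost), len(cands)


if __name__ == "__main__":
    print(cyclic())    # (['aaa'], 9)
    print(parallel())  # (['aaa'], 27)
```

In this toy setting both architectures select the same winner, but the cyclic one inspects 3 + 3 + 3 candidates while the parallel one inspects all 3 × 3 × 3 triples. The gap grows with the number of continuations per level.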
2. Potential reranking: Under a parallel approach the constraint ranking is
fixed during the computation. That is, if a triple t = (DS, SS, LF) is evalu-
ated all elements of t are evaluated under the same ranking. Under the cyclic
approach, however, it is in principle possible to reorder the constraints on
the way from one cycle to the next (cf. McCarthy & Prince 1993:12, Mester
1999, Rubach 2000). I would like to point out why this might be a desirable
consequence.
Recall the constraint TYPE, which was introduced to derive the fact that the
Vorfeld in a German sentence has to be filled with a constituent in unem-
bedded contexts. Clearly, in order for TYPE to have any effect it must outrank
ECON and REC. At LF, however, it is exactly those topicalized elements that
undergo reconstruction and therefore TYPE must not have any impact at that
point of the derivation. There are at least two ways to achieve this. Either
one stipulates that certain constraints are "switched on" at a certain level of
representation and "switched off" at another level, or one makes use of the
mechanism of neutralizing the effects of one constraint by ranking it suffi-
ciently low within the given hierarchy. In the case at hand, this implies that at
the surface we are dealing with a partial order TYPE » ECON, REC, whereas
at LF we are dealing with the inverted order REC » TYPE, ECON. Since
reranking is the mechanism that is supposed to be the locus of parametric
change in optimality theory anyway, it seems to me that the second strategy
is the most natural way to deal with the problem at hand.
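The effect of reranking between cycles can be illustrated with a small Python sketch. Several things here are assumptions made for the illustration: a partial order is represented as a list of strata, violations of constraints in the same stratum are simply added up, and the violation profiles of the four hypothetical candidates are schematic (in particular, reconstruction is taken to violate both TYPE and ECON).

```python
from typing import Dict, List, Tuple

Ranking = List[List[str]]  # strata, highest first; ties inside a stratum

S_STRUCTURE: Ranking = [["TYPE"], ["ECON", "REC"]]  # TYPE >> ECON, REC
LOGICAL_FORM: Ranking = [["REC"], ["TYPE", "ECON"]]  # REC >> TYPE, ECON


def profile(violations: Dict[str, int], ranking: Ranking) -> Tuple[int, ...]:
    """Sum the violations within each stratum, ordered from high to low."""
    return tuple(sum(violations.get(c, 0) for c in stratum) for stratum in ranking)


def optimal(cands: Dict[str, Dict[str, int]], ranking: Ranking) -> List[str]:
    best = min(profile(v, ranking) for v in cands.values())
    return [name for name, v in cands.items() if profile(v, ranking) == best]


# Schematic candidates for a topic-marked constituent (assumed profiles).
SS_CANDIDATES = {
    "topicalize": {"ECON": 1},
    "stay in situ": {"TYPE": 1},
}
LF_CANDIDATES = {
    "reconstruct": {"TYPE": 1, "ECON": 1},
    "do not reconstruct": {"REC": 1},
}

if __name__ == "__main__":
    print(optimal(SS_CANDIDATES, S_STRUCTURE))   # topicalization wins
    print(optimal(LF_CANDIDATES, LOGICAL_FORM))  # reconstruction wins
    # Without reranking, TYPE would wrongly block reconstruction at LF:
    print(optimal(LF_CANDIDATES, S_STRUCTURE))
```

The last call shows why a single fixed ranking is problematic here: if the S-structure hierarchy were kept at LF, the candidate without reconstruction would be optimal.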
3. Accumulation of violations: The last argument is an empirical one. The
point is that cyclic optimization makes the prediction that each time a new cy-
cle is entered the reset button is pushed and all the violations so far are deleted
from memory. That is, the next competition restarts without any burden from
the past cycles. Now suppose that there is a violation of some constraint at
D-structure (that is the first slot in the parallel approach), and suppose further
that this violation does not have any impact on the computation of the opti-
mal candidate. Take a look at this violation in the parallel approach, where all
violations from every slot of the triple are accumulated. There this violation
might be decisive for the failure of the whole computation because it might
just be the violation that makes the sum of violations of the relevant candidate
surmount the sum of violations of some other candidate.
Since the two approaches make different predictions at this point, it could
serve as a test to check their empirical adequacy. An appropriate configuration
for such a test, however, may be hard to find.
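A schematic configuration of the kind such a test would need can at least be constructed artificially. The Python sketch below uses two hypothetical constraints A » B and two hypothetical derivations `t1` and `t2`; none of this corresponds to actual German data. It only shows that summing violations across levels and resetting them at each cycle can pick different winners.

```python
from collections import Counter
from typing import Dict, List, Tuple

RANKING = ["A", "B"]  # hypothetical constraints, A >> B

# Violations per level, in the order [D-structure, S-structure, LF].
DERIVATIONS: Dict[str, List[Dict[str, int]]] = {
    "t1": [{"B": 1}, {}, {"A": 1}],
    "t2": [{"A": 1}, {}, {}],
}


def profile(violations: Dict[str, int]) -> Tuple[int, ...]:
    return tuple(violations.get(c, 0) for c in RANKING)


def parallel_winner(derivs: Dict[str, List[Dict[str, int]]]) -> str:
    """Sum the violations of all three slots, then compare once."""
    def summed(levels: List[Dict[str, int]]) -> Tuple[int, ...]:
        total: Counter = Counter()
        for violations in levels:
            total.update(violations)  # adds counts constraint by constraint
        return profile(total)
    return min(derivs, key=lambda name: summed(derivs[name]))


def cyclic_winners(derivs: Dict[str, List[Dict[str, int]]]) -> List[str]:
    """Compare level by level; earlier violations are forgotten each cycle."""
    alive = list(derivs)
    for level in range(3):
        best = min(profile(derivs[name][level]) for name in alive)
        alive = [n for n in alive if profile(derivs[n][level]) == best]
    return alive


if __name__ == "__main__":
    print(parallel_winner(DERIVATIONS))  # t2
    print(cyclic_winners(DERIVATIONS))   # ['t1']
```

In the cyclic computation the B violation of `t1` at D-structure is harmless, because `t2` violates the higher-ranked A there and is eliminated. In the parallel computation both derivations end up with one A violation, so the accumulated B violation of `t1` becomes decisive.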
Notes
This paper is a partial elaboration of my master's thesis. It was supported by the DFG
grant MU 1444/2-1 for the project "Optimalitätstheoretische Syntax des Deutschen"
at the University of Stuttgart. For comments and discussion I want to thank Jane
Grimshaw, Gereon Müller, Tanja Schmid, Arnim von Stechow, Wolfgang Sternefeld,
Sten Vikner, Ralf Vogel, and the audience of the DGfS 1999 workshop "Competition
in Syntax". Of course, all errors are mine.
1. This assumption is called the concept of transparent LF and is in opposition to
May (1985), who allows ambiguity even at the level of LF.
2. By the way, whenever I use the term scope inversion I refer to a configuration
that is an inverted variant of the S-structural scope configuration.
3. Actually, in McCarthy & Prince (1993) the authors speculate that it might be
useful to define the process of optimization on different levels of morphological
However, psychological verbs that take an accusative argument are still mysteri-
ous because in the approach of Vogel & Steinbach (1998) accusative case remains
unaffected by the word order constraints, and this for good reasons.
8. Examples like those in (11) presuppose a theory of linking as proposed by Wun-
derlich (1997). The point is that we need to ensure that the θ-role agent is always
linked to nominative case (if there is such a role). If it were not, then the follow-
ing example (with the intended meaning given below it), in which the arguments
that bear dative and the θ-role agent coincide, would be a possible candidate:
Since (i) respects AGENT and ADJA, in contrast to (11), it would block (11) as
suboptimal. However, if we assume that the linking between case and θ-role of
an argument is determined by the lexical entry of each verb, then we can avoid
this unwanted consequence. Then (i) would never be generated (at least not with
the given meaning).
9. It is clear that the constraints that guide S-structure movement and LF movement
should not have any effects on the level of D-structure. One gets this for free if
one assumes that there cannot be any movement before D-structure has been built
up. In other words, the constraints that will be introduced in the next cycles are
present in the first cycle as well, but they are without impact for independent rea-
sons. So we do not have to bother about the explicit ranking of these constraints
with respect to the constraints met so far in the hierarchy of D-structure.
10. Basically, the idea is that scope markers are generated freely within the structure.
In some sense this is different from Diesing (1996), Diesing (1997), and Vikner
(this volume), where it is assumed (in a generative semantics style) that the rela-
tive scope is already part of the input, which in turn means that the information
about relative scope is doubled: First it is encoded in the input and then it is en-
coded within the syntactic structure. In opposition to this, here the relative scope
is determined by an optimized LF only, not by the input.
11. From now on, I shall dispense with explicitly mentioning the D-structure con-
straints in the ranking because they will be of no importance anymore, neither at
S-structure nor at LF. Simply assume that they are the lowest ranked constraints
in the hierarchies of S-structure and LF.
12. This may be SpecCP or the specifier of another functional head as the Top head
argued for in Müller & Sternefeld (1993).
13. Of course, we cannot satisfy ALIGN first and then move the focused constituent
to the Top position, because ALIGN is dependent on S-structure and cannot be
satisfied by a trace. This, however, does not mean that I assume every constraint
to be dependent on the surface. That traces sometimes can do a job can be seen
from examples like the following:
14. SP will also be assumed not to be satisfiable by traces, for reasons that hopefully
will become clear.
15. Focal stress is indicated by capital letters.
16. It is not clear who originally came up with this idea. But one of the latest propos-
als which follow this strategy is due to Wolfgang Sternefeld (cf. Sternefeld 1993).
17. Note also that the S-structure hierarchy is rather different from this, namely:
TYPE » ALIGN » SP » QUIB, ECON » QR, REC. Recall that TYPE is above
ALIGN because focused elements can be topicalized. TYPE and ALIGN are both
above economy for obvious reasons. SP is above QUIB because in German S-
structure movement over a scope bearing element is possible. Economy is above
QR because there is no S-structure movement of strong quantifiers (without a
scope marker). Finally, REC is ranked below economy because it would be un-
reasonable to stipulate some kind of Yo-Yo movement. That is, we do not want
quantifier, then Q* in (i) would license Q' although Ql does not scope mark any-
thing that is not in the scope of Q, already. Hence, (i) is an instance of vacuous
scope marking and must be blocked.
References
Abraham, Werner
1986 Word order in the middle field of the German sentence. In: W. Abraham
and S. de Meij (eds.) Topic, Focus, and Configurationality, 15-38. Ams-
terdam: Benjamins.
Beck, Sigrid
1995 Negative islands and reconstruction. In: U. Lutz and J. Pafel (eds.) On Ex-
traction and Extraposition in German, 121-144. Amsterdam: Benjamins.
Beck, Sigrid
1996 Wh-constructions and transparent logical form. Ph.D. dissertation, Uni-
versity of Tübingen.
Büring, Daniel
1996 The 59th Street bridge accent. Ph.D. dissertation, University of Tübingen.
Büring, Daniel
t.v. Let's phrase it!
Cheng, Lisa Lai-Shen
1991 On the typology of wh-questions. Department of Linguistics and Phi-
losophy, MIT. Distributed by MIT Working Papers in Linguistics. MIT,
Cambridge, Massachusetts.
Choi, Hye-Won
1996 Optimizing structure in context: Scrambling and information structure.
Ph.D. dissertation, Stanford University.
Chomsky, Noam
1981 Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam
1995 The Minimalist Program. Cambridge, MA: MIT Press.
Cinque, Guglielmo
1993 A null theory of phrase and compound stress. Linguistic Inquiry 24: 239-
298.
Costa, João
1998 Word order variation. Ph.D. dissertation, HIL University of Leiden.
Diesing, Molly
1996 Semantic variables and object shift. In: H. Thráinsson, S. Epstein and
S. Peter (eds.) Studies in Comparative Germanic Syntax. Vol. II, 66-84.
Dordrecht: Kluwer.
Diesing, Molly
1997 Yiddish VP order and the typology of object movement in Germanic.
Natural Language and Linguistic Theory 15: 369-427.
Fox, Danny
1995 Economy and scope. Natural Language Semantics 3(3): 283-341.
Frey, Werner
1993 Syntaktische Bedingungen für die semantische Interpretation. Berlin:
Akademie-Verlag.
Grimshaw, Jane
1997 Projection, heads, and optimality. Linguistic Inquiry 28: 373-422.
Haider, Hubert
1992 Branching and discharge. Sonderforschungsbereich 340, University of
Stuttgart.
Haider, Hubert
1993 Deutsche Syntax - Generativ. Tübingen: Narr.
Heim, Irene — Angelika Kratzer
1997 Semantics in Generative Grammar. Oxford: Blackwell.
Hoberg, Ursula
1981 Die Wortstellung in der geschriebenen deutschen Gegenwartssprache.
München: Hueber.
Höhle, Tilman
1982 Explikationen für 'normale Betonung' und 'normale Wortstellung'. In:
W. Abraham (ed.) Satzglieder im Deutschen, 75-153. Tübingen: Narr.
Höhle, Tilman
1991 On reconstruction and coordination. In: H. Haider and K. Netter (eds.)
Representation and Derivation in the Theory of Grammar, 139-197. Dor-
drecht: Kluwer.
Kiss, Tibor
1999 Configurational and relational scope determination in German. Ms., Uni-
versität Bochum.
Kroch, Anthony
1974 The semantics of scope in English. Ph.D. dissertation, MIT. (Published
1979, New York: Garland).
Lenerz, Jürgen
1977 Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Narr.
May, Robert
1977 The grammar of quantification. Ph.D. dissertation, MIT.
May, Robert
1985 Logical Form: Its Structure and Derivation. Cambridge, MA: MIT Press.
McCarthy, John — Alan Prince
1993 Prosodic morphology I - Constraint interaction and satisfaction. Ms.,
University of Massachusetts, Amherst and Rutgers University.
McCarthy, John — Alan Prince
1994 The emergence of the unmarked. NELS 24: 333-379.
McCawley, James D.
1999 Why surface syntactic structure reflects logical structure as much as it
does, but only that much. Language 75: 34-62.
Mester, Armin
1999 Weak parallelism: Serial and parallel sources of opacity in OT. Ms., Uni-
versity of California, Santa Cruz.
Milsark, Gary
1974 Existential sentences in English. Ph.D. dissertation, MIT.
Müller, Gereon
1999 Optimality, markedness, and word order in German. Linguistics 37: 777-
818.
Müller, Gereon — Wolfgang Sternefeld
1993 Improper movement and unambiguous binding. Linguistic Inquiry 24:
461-507.
Nohl, Claudia — Arnim von Stechow
1995 Interpretation syntaktischer Strukturen - Eine Semantikeinführung an-
hand des Deutschen. Technical Report 07-95, Seminar für Sprachwis-
senschaft der Universität Tübingen.
Pafel, Jürgen
1997 Skopus und logische Struktur: Studien zum Quantorenskopus im Deut-
schen. Unpublished Habilitation, University of Tübingen.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint interaction in generative grammar. Ms.,
Rutgers University and University of Colorado at Boulder.
Reinhart, Tanya
1976 The syntactic domain of anaphora. Ph.D. dissertation, MIT.
Reinhart, Tanya
1983 Anaphora and Semantic Interpretation. London: Croom Helm.
Reinhart, Tanya
1997 Interface economy. In: Wilder et al. (eds.), 146-169.
Rubach, Jerzy
2000 Glide and glottal stop insertion in Slavic languages: A DOT analysis.
Linguistic Inquiry 31(2): 271-317.
Samek-Lodovici, Vieri
1997 OT-interactions between focus and basic word order. Talk presented at
the Workshop on OT Syntax, October 1997, University of Stuttgart.
Stechow, Arnim von
1993 Die Aufgaben der Syntax. In: J. Jacobs, A. von Stechow, W. Sternefeld
and T. Vennemann (eds.) Syntax: Ein internationales Handbuch zeit-
genössischer Forschung, 1-88. Berlin: Walter de Gruyter.
Sternefeld, Wolfgang
1993 Plurality, Reciprocity and Scope. Technical Report 13-93, Seminar für
Sprachwissenschaft der Universität Tübingen. Revised version in Natural
Language Semantics 1998.
Sternefeld, Wolfgang
1997 Comparing reference sets. In: Wilder et al. (eds.), 81-114.
Stowell, Tim
1981 Origins of phrase structure. Ph.D. dissertation, MIT.
Uszkoreit, Hans
1986 Constraints on order. Linguistics 24: 883-906.
Vikner, Sten
t.v. The interpretation of object shift and optimality theory.
Vogel, Ralf — Markus Steinbach
1998 The dative - An oblique case. Linguistische Berichte 173: 65-90.
Wilder, Chris — Hans-Martin Gärtner — Manfred Bierwisch (eds.)
1997 The Role of Economy Principles in Linguistic Theory. Berlin: Akademie-
Verlag.
Wunderlich, Dieter
1997 Cause and the structure of verbs. Linguistic Inquiry 28: 27-68.
Experimental Evidence for Constraint Competition
in Gapping Constructions
Frank Keller
This paper presents the results of two experiments investigating gradient ac-
ceptability in gapping constructions. Experiment 1 shows that adjuncts and
complements are equally acceptable as remnants in gapping, a fact that has
been surrounded by controversy in the literature. It also provides evidence
against the claim that gapping must leave behind exactly two remnants, and
shows that subject remnants are less acceptable than object remnants. This
effect of remnant type can be overridden by context. Experiment 2 confirms
the remnant effect and investigates how it interacts with other constraints on
gapping to produce a gradient acceptability pattern.
A number of grammar models have been proposed to deal with gradient
linguistic data, including the re-ranking model (Keller 1998), which draws
on concepts from Optimality Theory. Two assumptions are central to this
model: (a) constraint violations are cumulative, i.e., the degree of unaccept-
ability increases with the number of constraints violated; and (b) constraints
cluster into two types based on their acceptability profile: hard constraints
cause strong unacceptability when violated, while violations of soft con-
straints cause only mild unacceptability. The experimental data presented in
this paper confirm both assumptions and provide additional evidence for the
hard/soft distinction by demonstrating that only soft constraints are subject to
context effects.
1 Introduction
These examples indicate that gapping always deletes the matrix verb and
leaves behind exactly two constituents as remnants (Kuno 1976: 318). Based
on previous work by Hankamer (1973), Jackendoff (1971), and Ross (1970),
Kuno (1976) also observes that certain functional principles affect the accept-
ability of gapping, such as the following restriction on the interpretation of
the constituents left behind by gapping:2
coupled with the constituents (of the same structures) in the first
conjunct that were processed last of all.
The examples in (5) illustrate the Minimal Distance Principle: In (5-a), the
remnant Tom has to be paired with Mary, yielding the interpretation in (5-b).
It is not possible to pair Tom with the more distant subject John, yielding the
interpretation in (5-c).
Kuno (1976) notes that the FSP Principle seems to be able to override the
Minimal Distance Principle. (7-a) is acceptable as a gapped version of (7-b),
even though it violates MINDIS. We regard this fact as initial evidence that
gapping is subject to constraint competition in an optimality theoretic sense.
(7) a. With what did John and Bill hit Mary? John hit Mary with a stick,
and Bill with a belt.
b. With what did John and Bill hit Mary? John hit Mary with a stick,
and Bill hit Mary with a belt.
This explains why (9-a) can be interpreted as the gapped version of (9-b)
(where Tom is the subject of donate), but not as the gapped version of
(9-c) (where Tom is the subject of the object control verb persuade). Exam-
ple (10-a), on the other hand, not only has (10-b) as a possible interpretation,
but also (10-c) (or at least (10-c) is considerably better than (9-c)). In (10-c),
Tom is the subject of donate, because the matrix verb promise is a subject
control verb. Such a subject-predicate interpretation is preferred in gapping
constructions. Note that (10-c) violates MINDIS, thus indicating a competi-
tion between MINDIS and SUBJPRED.
(9) a. John persuaded Bill to donate $200, and Tom to donate $400.
b. John persuaded Bill to donate $200, and John persuaded Tom to
donate $400.
c. John persuaded Bill to donate $200, and Tom persuaded Bill to do-
nate $400.
(10) a. John promised Bill to donate $200, and Tom to donate $400.
b. John promised Bill to donate $200, and John promised Tom to do-
nate $400.
c. John promised Bill to donate $200, and Tom promised Bill to donate
$400.
Finally, Kuno (1976) also observes that gapping cannot leave behind rem-
nants that are part of a subordinate clause: (11-a) cannot be understood as a
gapped version of (11-b).
(11) a. John persuaded Dr. Thomas to examine Jane and Bill Martha.
b. John persuaded Dr. Thomas to examine Jane and Bill persuaded Dr.
Thomas to examine Martha.
Table 1. Constraint profile for direct object extraction (simplified from Legendre et
al. 1995: (22-a))
Note that (14-b) follows from (14-a): If suboptimal candidates differ in gram-
maticality, then the comparison between two suboptimal candidates can be
used as evidence for constraint rankings in the same way as the comparison
between a grammatical candidate and an ungrammatical candidate is used to
determine rankings in standard OT.
There are several ways of implementing the suboptimality hypothesis, i.e.,
of extending OT to make predictions about suboptimal structures; the most
straightforward one is based on the assumption that the relative grammatical-
ity of a candidate corresponds to its relative optimality in the candidate set
(Keller 1997). Such a model will make predictions of the form: Candidate S1
is more optimal (i.e., more grammatical) than candidate S2, where both S1 and
S2 may be suboptimal candidates. This prediction can be tested empirically
by showing that S1 is more acceptable than S2.
This "naive" model of suboptimality (which simply equates relative op-
timality with relative grammaticality) has been criticized for a number of
reasons (Keller 1998, Müller 1999). One problem is that it predicts gram-
maticality differences only for structures in the same candidate set; relative
grammaticality cannot be compared across candidate sets. Another problem
is that grammaticality differences are predicted between all structures in a
candidate set. A typical OT grammar assumes a richly structured constraint
hierarchy, therefore all or most structures in a given candidate set will differ
in optimality. The naive model predicts that there is a grammaticality differ-
ence whenever there is a difference in optimality. This means it will probably
overgenerate, i.e., predict far more degrees of grammaticality than we can
reasonably expect to find in the data.
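The naive model can be made concrete with a small sketch. It assumes a strict constraint hierarchy and compares candidates by their violation profiles. The ranking and the candidate profiles below are hypothetical illustrations, not values taken from the experiments reported here.

```python
from typing import Dict, List, Tuple

# Hypothetical ranking, highest-ranked constraint first (illustration only).
RANKING: List[str] = ["SIMS", "MINDIS", "SUBJPRED"]

def profile(violations: Dict[str, int]) -> Tuple[int, ...]:
    """Return the violation counts ordered by the constraint ranking."""
    return tuple(violations.get(c, 0) for c in RANKING)

def rank_candidates(candidates: Dict[str, Dict[str, int]]) -> List[str]:
    """Order candidates from most to least optimal.

    Tuples compare lexicographically, which mirrors strict domination:
    one violation of a higher-ranked constraint outweighs any number of
    violations of lower-ranked constraints.
    """
    return sorted(candidates, key=lambda name: profile(candidates[name]))

# Hypothetical candidate set with made-up violation profiles.
candidates = {
    "S1": {"SUBJPRED": 1},
    "S2": {"MINDIS": 1},
    "S3": {"SIMS": 1, "SUBJPRED": 2},
}
order = rank_candidates(candidates)

# The naive model predicts one degree of grammaticality per distinct profile.
degrees = len({profile(v) for v in candidates.values()})
```

Because every distinct profile counts as its own degree of grammaticality, the number of predicted degrees grows with the size of the candidate set, which is the overgeneration problem noted above.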
Constraint Competition in Gapping Constructions 219
which accounts for the fact that subjects can judge the relative gram-
maticality of arbitrary sentence pairs.
— It seems plausible to assume that some constraint re-rankings are
more serious than others, and hence cause a higher degree of un-
grammaticality in the target structure. This assumption allows us to
model the experimental findings that some constraint violations are
more serious than others. The experimental data justify two types of
re-rankings, corresponding to the soft and hard constraint violations
discussed above.
— Another assumption is that the degree of grammaticality of a struc-
ture depends on the number of re-rankings necessary to make it op-
timal: The more re-rankings a structure requires, the more ungram-
matical it becomes. This predicts the cumulativity of violations that
was found experimentally both for soft and for hard constraints. 4
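The last two assumptions can be sketched as a simple scoring function. The sketch assumes that each violated constraint requires exactly one re-ranking and that hard re-rankings cost more than soft ones. The numeric costs and the soft/hard assignment of each constraint are hypothetical placeholders chosen only for illustration.

```python
from typing import Dict, Iterable

# Hypothetical costs: only the ordering hard > soft is motivated by the text.
COST: Dict[str, float] = {"soft": 1.0, "hard": 3.0}

# Hypothetical soft/hard assignment, used for illustration only.
TYPE: Dict[str, str] = {"SIMS": "hard", "MINDIS": "soft", "SUBJPRED": "soft"}

def degradation(violated: Iterable[str]) -> float:
    """Sum the costs of the re-rankings needed to make a structure optimal.

    Simplifying assumption: one re-ranking per violated constraint.
    """
    return sum(COST[TYPE[c]] for c in violated)

# Cumulativity: more violations yield a higher degree of ungrammaticality.
one_soft = degradation(["MINDIS"])
two_soft = degradation(["MINDIS", "SUBJPRED"])
one_hard = degradation(["SIMS"])
```

Under these assumptions, two soft violations are worse than one, and a single hard violation is worse than a single soft one. How soft and hard costs trade off against each other is left open here.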
The work presented in this paper aims to provide additional evidence for
two assumptions underlying the re-ranking model: (a) the dichotomy of hard
and soft constraints and (b) the cumulativity of constraint violations. An ad-
ditional aim is to investigate how context effects interact with the soft/hard
distinction and the cumulativity effect.
The present study relies on very subtle linguistic intuitions, viz., on judg-
ments about the relative acceptability of information structurally different re-
alizations of a sentence. Such intuitions about relative acceptability should be
measured experimentally, since the informal elicitation technique tradition-
ally used in linguistics is unlikely to be reliable here (Cowart 1997, Schütze
1996, Sorace 1992). A suitable experimental paradigm is magnitude estima-
tion, a technique standardly applied in psychophysics to measure judgments
of sensory stimuli (Stevens 1975). The magnitude estimation procedure re-
quires subjects to estimate the magnitude of physical stimuli by assigning nu-
merical values proportional to the stimulus magnitude they perceive. Highly
reliable judgments can be achieved for a whole range of sensory modalities,
such as brightness, loudness, or tactile stimulation.
The magnitude estimation paradigm has been extended successfully to the
psychosocial domain (Lodge 1981), and recently Bard et al. (1996) and
Cowart (1997) have shown that linguistic judgments can be elicited in the
same way as judgments of sensory or social stimuli. In contrast to the five
2.1 Introduction
PP adjuncts were also included in order to test the claim that adjunct remnants
are more acceptable than complement remnants (Hankamer 1973). The fol-
lowing examples illustrate the levels of the factor Frame for transitive verbs:
For ditransitive verbs, the factor Frame included verbs that have an NP as
their first complement, and another NP, a PP, or a VP as their second comple-
ment, such as the examples in (16).
(16) a. NP V NP NP: She charged the client 50 pounds, and he the manu-
facturer 100 pounds.
b. NP V NP PP: She accompanied the boy to school, and he the girl
to university.
c. NP V NP VP: She authorized the manager to leave, and he the sec-
retary to stay.
Transitive verbs allow only one type of remnant (where the subject and the
object are left behind, while the verb is gapped). Ditransitive verbs, on the
other hand, allow more complicated remnants, which we took into account
by including the additional factor remnant type (Remn) for ditransitive verbs.
The levels of Remn can be exemplified by the following sentences:
(17) a. NP _ XP XP: She charged the client 50 pounds, and he the manu-
facturer 100 pounds.
b. _ _ XP XP: She charged the client 50 pounds, and the manufac-
turer 100 pounds.
c. NP _ _ XP: She charged the client 50 pounds, and he 100 pounds.
d. NP _ XP _: She charged the client 50 pounds, and he the manu-
facturer.
Note that we use pronouns in (17-c) and (17-d) to make sure that the remnant
is interpreted as the subject NP.
Context (Con), the third factor in the experiment, was meant to test the
influence of context on the acceptability of gapping. A felicitous context for
gapping (according to Kuno's (1976) SENTP constraint) is one in which the
gapped constituent contains given information, while the remnants constitute
The factor Con was the same for the ditransitive condition. Here are the fe-
licitous contexts for the examples in (17):
2.2 Predictions
2. For the factor Remn, the constraint MINDIS predicts that the remnant
_ _ XP XP is more acceptable than the remnants NP _ _ XP and NP
_ XP _. Another relevant prediction is that the remnant NP _ XP XP is
unacceptable, based on the claim of Kuno (1976: 318) that gapping has
to leave behind exactly two constituents.
2.3 Method
2.3.1 Subjects
2.3.2 Materials
Training Materials
The experiment included a set of training materials that were designed to
familiarize subjects with the magnitude estimation task. The training set con-
tained six horizontal lines. The range of largest to smallest item was 1:6.7.
The items were distributed evenly over this range, with the largest item cov-
ering the maximal window width of the web browser. A modulus item in the
middle of the range was provided.
Practice Materials
A set of practice items was used to familiarize subjects with applying mag-
nitude estimation to linguistic stimuli. The practice set consisted of six sen-
tences that were representative of the test materials. A wide spectrum of ac-
ceptability was covered, ranging from fully acceptable to severely unaccept-
able. A modulus item in the middle of the range was provided.
Test Materials
2.3.3 Procedure
The method used was magnitude estimation as proposed by Lodge (1981) and
extended to linguistic stimuli by Bard et al. (1996). Each subject took part in
an experimental session that lasted approximately 15 minutes and consisted
of a training phase, a practice phase, and an experimental phase. The experi-
ment was self-paced, though response times were recorded to allow the data
to be screened for anomalies.
The experiment was conducted remotely over the Internet. The subject ac-
cessed the experiment using his or her web browser. The browser established
an Internet connection to the experimental server, which was running Web-
Exp 2.1 (Keller et al. 1998), an interactive software package for administering
web-based psychological experiments.
Instructions
Before the actual experiment started, a set of instructions was presented. The
instructions first explained the concept of numerical magnitude estimation of
line length. Subjects were instructed to make estimates of line length relative
to the first line they would see, the reference line. Subjects were told to give
the reference line an arbitrary number, and then assign a number to each
following line so that it represented how long the line was in proportion to the
226 Frank Keller
Demographic Questionnaire
After the instructions, a short demographic questionnaire was administered.
The questionnaire included name, email address, age, sex, handedness, aca-
demic subject or occupation, and language region. Handedness was defined
as "the hand you prefer to use for writing", while language region was de-
fined as "the place (city, region/state/province, country) where you learned
your first language". The results of the questionnaire were reported above.
Training Phase
The training phase was meant to familiarize subjects with the concept of
numeric magnitude estimation using line lengths. Items were presented as
horizontal lines centered in the window of the subject's web browser. After
viewing an item, the subject had to provide a numerical judgment over the
computer keyboard. After pressing Return, the current item disappeared and
the next item was displayed. There was no possibility of revisiting previous
items or changing responses once Return had been pressed. No time limit was
set for either the item presentation or for the response.
Subjects first judged the modulus item, and then all the items in the training
set. The modulus remained on the screen all the time to facilitate comparison.
Items were presented in random order, with a new randomization being gen-
erated for each subject.
Practice Phase
Experimental Phase
2.4 Results
The data were normalized by dividing each numerical judgment by the mod-
ulus value that the subject had assigned to the reference sentence. This oper-
ation creates a common scale for all subjects. All analyses were carried out
on the geometric means of the normalized judgments. The use of geometric
means is standard practice for magnitude estimation data (Bard et al. 1996,
Lodge 1981).
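The normalization step can be illustrated with a short sketch. The function names and the sample numbers are hypothetical. The sketch only shows the two operations described above: division by the subject's modulus value, followed by a geometric mean.

```python
import math
from typing import List, Sequence

def normalize(judgments: Sequence[float], modulus: float) -> List[float]:
    """Divide each raw judgment by the value the subject gave the modulus."""
    return [j / modulus for j in judgments]

def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean, computed via logs; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical subject: modulus judged as 50, two test items judged 25 and 100.
normalized = normalize([25.0, 100.0], modulus=50.0)
mean_judgment = geometric_mean(normalized)
```

In this made-up case the normalized judgments are 0.5 and 2.0, so the geometric mean is 1, i.e., the two items average out to the acceptability of the modulus. An arithmetic mean would instead give 1.25, which overweights the larger ratio.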
Separate analyses of variance (ANOVAs) were performed for the transitive
and ditransitive verb frames. The analysis of the transitive frames failed to
find a significant main effect of verb frame. The main effect of context was
significant only by items (F1(1, 47) = .326, p = .571; F2(1, 6) = 29.720,
p = .002), and the interaction of frame and context was non-significant. The
average judgments for the transitive condition are graphed in Figure 1.
For the ditransitive frames, a marginal main effect of verb frame was found
(F1(2, 94) = 2.727, p = .071; F2(2, 12) = 6.037, p = .015). Furthermore,
the ANOVA showed a highly significant main effect of remnant type
(F1(3, 141) = 18.936, p < .0005; F2(3, 18) = 6.564, p = .003), and
Figure 2. Effect of verb frame and remnant type on gapping (ditransitive frames,
null context). [Plot not reproduced; x-axis: verb frame (NP V NP NP, NP V NP PP,
NP V NP VP); one line per remnant type.]
Figure 3. Effect of verb frame and remnant type on gapping (ditransitive frames,
felicitous context). [Plot not reproduced; same x-axis.]
2.5 Discussion
For transitive verbs, we found that gapping is equally acceptable for all types
of verbal complements tested (NP, PP, VP). We also failed to find a differ-
ence between PP complements and PP adjuncts. This result settles the con-
troversy on the status of complements and adjuncts in gapping: Hankamer
(1973) claims that PP adjuncts are more acceptable than PP complements, a
claim that is disputed by Jackendoff (1971) and Kuno (1976). These negative
results are also important for our next experiment, as they allow us to disre-
gard the distinction between different verb frames, and between adjuncts and
complements, thus enabling us to use a more compact experimental design.
In contrast to transitive verbs, ditransitive verbs showed an effect of Frame:
in a felicitous context, the NP V NP NP frame was less acceptable than the
other frames. Note, however, that this effect, for which the literature on gap-
ping fails to offer an explanation, is rather small (see Figure 3).
The main finding of Experiment 1 is the effect of remnant type and its
interaction with context. We showed that the _ _ XP XP remnant is more
acceptable than all the other remnants, an effect that is very strong in a null
context, but disappears completely in a felicitous context. This provides strong
evidence for Kuno's (1976) Minimal Distance Principle, and in particular for his
observation that a violation of MINDIS can be overridden by a satisfaction of
the context requirements on gapping (his constraint SENTP).
On the other hand, we found that the NP _ XP XP remnant is not significantly
less acceptable than NP _ _ XP and NP _ XP _, contrary to Kuno's
(1976) claim that gapping must leave behind exactly two remnants.
Now let us briefly consider an alternative explanation for the interaction of
remnant type and context. One could argue that this effect is actually due to
the contexts used, rather than to the stimulus sentences proper. Some initial
plausibility for this view derives from the fact that two of the remnants (NP
_ XP XP and _ _ XP XP) used double wh-questions as contexts (see (19-a)
and (19-b)), while the other two remnants (NP _ _ XP and NP _ XP _) had
single wh-questions as contexts (see (19-c) and (19-d)). It seems plausible to
assume that multiple wh -questions are less acceptable than single ones, and
maybe subjects actually took the acceptability of the context into account
when they judged the acceptability of the stimulus sentences.
To test this hypothesis, an ANOVA was conducted on the contextualized
data with question type as the only factor. This yielded an effect of question
type which was significant by subjects (F1(1, 2) = 8.982, p = .007;
F2(1, 3) = 1.257, p = .344). However, this effect ran opposite to what
was expected: Single questions (mean = -.0085) were less acceptable than
double questions (mean = .0410). This result allows us to rule out the hy-
pothesis that the effect of Remn is due to the type of question used, rather
than to the remnant itself.
Another alternative explanation for the remnant effect is that _ _ XP XP is more
acceptable because it does not contain a subject pronoun. This pronoun is
present in the other three remnants and might reduce acceptability in the null
context condition, as it cannot be anchored to an NP in the context. This
would explain why the remnant effect disappears in context, where such an
antecedent is provided (see (15) and (18)). This alternative explanation for
the remnant effect cannot be ruled out on the basis of Experiment 1. We will
address this issue in the next experiment, which will investigate the behavior
of gapping in non-felicitous contexts. A non-felicitous context provides an
antecedent for the subject pronoun, but differs from a felicitous context in
that it violates SENTP.
3.1 Introduction
The aim of this experiment was to replicate and extend the findings of Ex-
periment 1. It was designed to investigate how the remnant effect found in
Experiment 1 interacts with other constraints on gapping, and how it behaves
in a neutral and non-felicitous context. Table 3 gives an overview of the fac-
tors included in Experiment 2. The constraints are the ones detailed in Sec-
tion 1.1, either violated or not: Minimal Distance (MINDIS), Functional Sen-
tence Perspective (SENTP), Subject-Predicate Interpretation (SUBJPRED),
and Simplex-Sentential Relationship (SIMS).
The constraint MINDIS (see (4)) is satisfied if the distance between the remnants
and their antecedents is minimal, as in (20-a), where the thief can be
paired with the criminal and for robbing the bank can be paired with for bur-
gling the house. (20-b), on the other hand, is in violation of MINDIS, as she
cannot be paired with the neighbor, but has to be paired with the subject he.
(20) a. He punished the criminal for robbing the bank and the thief for bur-
gling the house.
SENTP (Con)
not violated (fel. context)
violated (non-fel. context)
neutral context (control)
null context (control)
the neutral context condition, the stimuli were prefixed by the question What
happened?, which indicates an all focus information structure.
The examples in (21) show the felicitous contexts that belong to the stimuli
in (20), while (22) gives the corresponding non-felicitous contexts.
3.2 Predictions
3.2.1 Constraints
The present experiment also allows us to test the validity of Keller's (1998)
model of gradient grammaticality: We predict that the constraints tested in
this experiment cluster into hard and soft constraints. Hard constraints are
expected to receive a high ranking, i.e., trigger a high degree of unaccept-
ability, while soft constraints will receive a low ranking, i.e., cause only mild
unacceptability when violated.
Intuitively, SIMS is a good candidate for a hard constraint, while SUBJPRED
and MINDIS are probably soft constraints. A particularly interesting
question is how context interacts with soft and hard constraints. It seems
plausible to expect soft constraints to be more susceptible to context effects
than hard ones.
Another prediction is that constraint violations are cumulative, i.e., that the
degree of unacceptability of a sentence increases with the number of con-
straint violations it incurs. This finding underpins the re-ranking model of
gradience. Note that Keller (1998) found that the cumulativity effect holds
for both soft and hard constraint violations.
3.3 Method
3.3.1 Subjects
3.3.2 Materials
Test Materials
A full factorial design was used which included the factors Dis, Sim, Pred,
and Con, representing the constraints MINDIS, SIMS, SUBJPRED, and
SENTP, respectively (see Table 3 for an overview of the experimental de-
sign). The factors Dis, Sim, and Pred had two levels (constraint violated
or not violated), while the factor Con had four levels: constraint violated
(non-felicitous context), not violated (felicitous context), plus the two con-
trol conditions (null context and neutral context). This yielded a total of
Dis × Sim × Pred × Con = 2 × 2 × 2 × 4 = 32 cells. Eight lexicalizations
were used for each of the cells, which resulted in a total of 256 stimuli.
A set of 24 fillers was used, designed to cover the whole acceptability range.
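The cell and stimulus counts can be checked by enumerating the design. The sketch below is only a bookkeeping aid. The level labels are shorthand introduced here, and the lexicalizations are represented by placeholder indices rather than actual sentences.

```python
from itertools import product

# Factors and levels as described above; the label strings are shorthand.
FACTORS = {
    "Dis": ["not violated", "violated"],
    "Sim": ["not violated", "violated"],
    "Pred": ["not violated", "violated"],
    "Con": ["felicitous", "non-felicitous", "null", "neutral"],
}

# One cell per combination of factor levels (full factorial design).
cells = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]

# Eight lexicalizations per cell, represented here by indices 1..8.
N_LEXICALIZATIONS = 8
stimuli = [
    (cell, lex)
    for cell in cells
    for lex in range(1, N_LEXICALIZATIONS + 1)
]
```

Enumerating the design this way gives 32 cells and 256 stimuli, matching the totals stated above.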
3.3.3 Procedure
Experimental Phase
The presentation and response procedures in the experimental phase were the
same as in Experiment 1. A between-subjects design was used to administer
the experimental stimuli: Subjects in Group A judged non-contextualized
stimuli, while subjects in Group B judged contextualized stimuli.
For Group A, four test sets were used: Each set contained two lexicalizations
for each of the cells in the design Dis × Sim × Pred, i.e., a total of 16
items. The items were distributed over the test sets in a Latin square design.
For Group B, eight test sets were used, each containing the design in one
lexicalization and three contextualizations. This yielded 24 items per test set,
which again were placed in a Latin square.
In Group A, each subject saw 32 items: 16 experimental items and 16 fillers.
In Group B, each subject saw 40 items: 24 experimental items and 16 fillers.
Each subject was randomly assigned to a group and a lexicalization; 25 sub-
jects were assigned to Group A, and 30 to Group B. Instructions, examples,
training items, and fillers were adapted for Group B to take context into
account.
3.4 Results
3.4.1 Constraints
Simplex Sentence
In the null context condition, a highly significant main effect of Sim was
found (F1(1, 24) = 23.415, p < .0005; F2(1, 7) = 18.918, p = .003). The
same effect of Sim was present in the context condition (F1(1, 29) = 97.310,
p < .0005; F2(1, 7) = 15.548, p = .006). The interaction between Sim and
context was non-significant.
Figure 4 depicts the mean judgments for a violation of SIMS in all contexts.
It indicates that SIMS violations have a strong effect on acceptability and
illustrates the absence of a context effect: A violation of SIMS results in the
same decrease in acceptability in all contexts (including the null context and
the neutral context).
[Plots not reproduced: mean judgments by context (null, neutral, non-felicitous,
felicitous), with MINDIS violated vs. not violated and with SUBJPRED violated
vs. not violated.]
Minimal Distance
In the null context condition, a highly significant main effect of Dis was found
(F1(1, 24) = 25.997, p < .0005; F2(1, 7) = 14.612, p = .007). Dis was
also significant in the context condition (F1(1, 29) = 23.315, p < .0005;
F2(1, 7) = 11.421, p = .012), where an interaction of Dis and Sim was
also present, significant by subjects only (F1(1, 29) = 4.568, p = .001;
F2(1, 7) = 2.111, p = .190).
Subject-Predicate Interpretation
The main effect of Pred failed to reach significance in the null context
condition. In the context condition, a main effect of Pred was found
(F1(1, 29) = 19.377, p < .0005; F2(1, 7) = 9.891, p = .016). The interaction
of Pred and context failed to be significant. There was, however, an
interaction of Pred and Sim that was significant by subjects only
(F1(1, 29) = 11.453, p = .002; F2(1, 7) = 2.524, p = .156).
Figure 6 depicts the interaction of context with Pred. Note the absence of a
context effect, contrary to our expectation that SUBJPRED is a context-
dependent constraint. However, the presence of a Pred/Sim interaction might
indicate that the effect of Sim blocks out the context effect of Pred. Recall that a
violation of SIMS leads to a high degree of unacceptability, while SUBJPRED
only has a small effect on acceptability. It is therefore appropriate to factor
out violations of SIMS (and other constraints), and to look at the effect of
context on single violations of SUBJPRED. The mean judgments for single
violations of SUBJPRED are depicted in Figure 7, which indicates that the
effect of Pred in the neutral context is stronger than in the other contexts.
To confirm this observation, we conducted separate ANOVAs for single
violations of SUBJPRED for the four context conditions. In the null context,
the felicitous context, and the non-felicitous context, no significant effect of
a single SUBJPRED violation was found. In the neutral context, however, a
single violation of SUBJPRED led to a significant reduction in acceptability
(F1(1, 29) = 8.327, p = .007; F2(1, 7) = 5.610, p = .050).
The ANOVA on the context condition showed a significant main effect of Con
(F1(1, 29) = 10.209, p < .0005; F2(1, 7) = 13.082, p = .001). A post-hoc
Tukey test was conducted to investigate the locus of the Con effect. It was
found that the neutral context was significantly less acceptable than both the
felicitous and the non-felicitous context (α < .01 in both cases). However,
there was no difference between the felicitous and the non-felicitous context.
[Plot not reproduced: mean judgments by number of violations, null context vs.
context.]
was again not significant, but three violations were significantly less
acceptable than one violation (α < .01).
The same post-hoc test was conducted for the context condition. Again, it
was found that one violation was less acceptable than zero violations
(α < .01), while two violations were less acceptable than one violation (α < .01).
The difference between two and three violations was again too small to reach
significance, but three violations were significantly less acceptable than
one violation (α < .01).
3.5 Discussion
3.5.1 Constraints
Experiment 2 found main effects of Sim, Dis, and Pred. This demonstrated
that violations of the constraints MINDIS, SUBJPRED, and SIMS significantly
reduce the acceptability of gapped sentences, as predicted by Kuno's
(1976) account of gapping. A main effect of Con was also present, but con-
trary to predictions, no difference between the acceptability of gapping in a
felicitous and a non-felicitous context was found. However, the acceptability
of gapping in the felicitous and the non-felicitous context was significantly
higher than in the neutral context. This seems to indicate that even a
non-felicitous context provides an information structure that is partially
compatible with the requirements of the constraint SENTP.
4 General Discussion
This paper is part of a line of research that draws on the experimental par-
adigm of magnitude estimation to obtain linguistic judgment data that are
reliable and maximally delicate. This line of research, which was initiated
by Bard et al. (1996) and Cowart (1997), has contributed to linguistic the-
ory by settling data disputes that could not be resolved solely on the basis of
intuitive, informal acceptability judgments. Relevant experimental findings
have been obtained in studies on extraction (Cowart 1989, 1997, Keller 1996,
1997), binding theory (Cowart 1997), unaccusativity (Sorace 1993a,b, 2000),
and word order (Keller and Alexopoulou 2000, Keller 2000a).
The results of Experiments 1 and 2 confirm the usefulness of an experi-
mental approach to linguistic data by applying magnitude estimation to gap-
ping constructions. Experiment 1 showed that PP adjuncts and PP comple-
ments are equally acceptable as remnants in gapping, a fact that has been
surrounded by controversy in the theoretical literature. It also provided evi-
dence against the claim that gapping must leave behind exactly two remnants
(Kuno 1976). Another theoretically interesting result is that subject remnants
are less acceptable than object remnants, an effect that turned out to be con-
text dependent. Experiment 2 confirmed this result and provided evidence
for another context dependent constraint on gapping (Subject-Predicate In-
terpretation), but also discovered a constraint that is immune to context ef-
fects (Simplex S). More importantly, Experiment 2 provided data on how
the constraints on gapping interact, i.e., on what happens if more than one
constraint is violated. Such interaction data, which cannot easily be obtained
with the traditional intuitive approach, allow us to make observations on
how constraints compete, and thus can inform an optimality theoretic model
that deals with gradient linguistic data (Hayes 2000, Hayes and MacEachern
1998, Keller 1998, Müller 1999).
The work presented in this paper provided additional evidence for two cen-
tral assumptions underlying the re-ranking model of gradience (Keller 1998):
First, the experimental data confirmed the cumulativity of constraint viola-
tions assumed by the re-ranking model. In addition, the results support the
soft/hard distinction of constraint violations previously demonstrated for ex-
traction. Context effects on gapping were also investigated, and we arrived at
the hypothesis that soft constraints are subject to context effects, while hard
constraints are immune to contextual influences. If correct, this hypothesis
would provide us with an additional diagnostic for the hard/soft distinction.
This has to be validated in further experimental work.
Also, the present results allow us to speculate on the theoretical status of
hard and soft constraints, and its implications for grammar architecture. One
possible line of argumentation is that soft constraints are limited to the in-
terface level of the grammar (syntax-semantics, syntax-pragmatics, syntax-
lexicon), while hard constraints are internal to syntax. This would explain
why soft constraints cause only weak acceptability effects and can be over-
ridden by context, while hard violations cause strong unacceptability and are
immune to context effects.
The constraints identified as soft in the present study belong to the
syntax-semantics or syntax-pragmatics interface (Minimal Distance, Subject-
Predicate Interpretation), while the hard constraint (Simplex S) seems to be
syntactic in nature. This observation squares well with previous results on
extraction, where constraints on phrase structure, agreement, and subcatego-
rization were found to be hard, while soft constraints included referentiality
and definiteness, i.e., constraints located at the syntax-semantics interface.
Notes
Thanks to Mark Steedman for important advice on the work reported here. Com-
ments on earlier stages of this paper were provided by Maria Lapata and by the
(i) a. *Max wanted to put the eggplant on the table, and Harvey in the sink.
b. ?Max writes plays in the bedroom, and Harvey in the basement.
References
Jackendoff, Ray S.
1971 Gapping and related rules. Linguistic Inquiry 2: 21-35.
Keller, Frank
1996 How do humans deal with ungrammatical input? Experimental evidence
and computational modelling. In: Dafydd Gibbon (ed.) Natural Lan-
guage Processing and Speech Technology: Results of the 3rd KONVENS
Conference, Bielefeld, October 1996, 27-34. Berlin: Mouton de Gruyter.
Keller, Frank
1997 Extraction, Gradedness, and Optimality. In Alexis Dimitriadis, Laura
Siegel, Clarissa Surek-Clark, and Alexander Williams, eds., Proceedings
of the 21st Annual Penn Linguistics Colloquium, 169-186. (Penn Work-
ing Papers in Linguistics, no. 4.2.) Department of Linguistics, University
of Pennsylvania.
Keller, Frank
1998 Gradient grammaticality as an effect of selective constraint re-ranking.
In: M. Catherine Gruber, Derrick Higgins, Kenneth S. Olson and Tamra
Wysocki (eds.) Papers from the 34th Meeting of the Chicago Linguistic
Society, vol. 2: The Panels, 95-109. Chicago.
Keller, Frank
2000a Evaluating competition-based models of word order. In Proceedings of
the 22nd Annual Conference of the Cognitive Science Society. Philadel-
phia, PA.
Keller, Frank
2000b Gradience in Grammar: Experimental and Computational Aspects of De-
grees of Grammaticality. PhD thesis, University of Edinburgh.
Keller, Frank — Theodora Alexopoulou
2000 Phonology competes with syntax: Experimental evidence for the interac-
tion of word order and accent placement in the realization of information
structure. Cognition, to appear.
Keller, Frank — M. Corley — S. Corley — L. Konieczny — A. Todirascu
1998 WebExp: A Java Toolbox for Web-Based Psychological Experiments.
Technical Report HCRC/TR-99, Human Communication Research Cen-
tre, University of Edinburgh.
Kuno, Susumu
1976 Gapping: A functional analysis. Linguistic Inquiry 7: 300-318.
Legendre, Géraldine — C. Wilson — P. Smolensky — K. Homer — W. Raymond
1995 Optimality and wh-extraction. In: Jill Beckman, Laura Walsh Dickey and
Suzanne Urbanczyk (eds.) Papers in Optimality Theory, 607-636. (Uni-
versity of Massachusetts Occasional Papers in Linguistics 18) University
of Massachusetts, Amherst.
Lodge, Milton
1981 Magnitude Scaling: Quantitative Measurement of Opinions. Beverly
Hills, CA: Sage Publications.
Müller, Gereon
1999 Optimality, markedness, and word order in German. Linguistics 37(5):
777-818.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar.
Technical Report 2, Center for Cognitive Science, Rutgers University.
Prince, Alan — Paul Smolensky
1997 Optimality: From neural networks to universal grammar. Science 275:
1604-1610.
Ross, John R.
1970 Gapping and the order of constituents. In: Manfred Bierwisch and
Karl Erich Heidolph (eds.) Progress in Linguistics: A Collection of Pa-
pers, 249-259. The Hague: Mouton.
Schütze, Carson T.
1996 The Empirical Base of Linguistics: Grammaticality Judgments and Lin-
guistic Methodology. Chicago: University of Chicago Press.
Sorace, Antonella
1992 Lexical conditions on syntactic knowledge: Auxiliary selection in native
and non-native grammars of Italian. Ph.D. dissertation, University of Ed-
inburgh.
Sorace, Antonella
1993a Incomplete vs. divergent representations of unaccusativity in non-native
grammars of Italian. Second Language Research 9: 22-47.
Sorace, Antonella
1993b Unaccusativity and auxiliary choice in non-native grammars of Ital-
ian and French: Asymmetries and predictable indeterminacy. Journal of
French Language Studies 3: 71-93.
Sorace, Antonella
2000 Gradients in split intransitivity: Auxiliary selection in Western European
languages. Language, to appear.
Stevens, Stephen S.
1975 Psychophysics: Introduction to its Perceptual, Neural, and Social
Prospects. New York: John Wiley.
Tesar, Bruce — Paul Smolensky
1998 Learnability in Optimality Theory. Linguistic Inquiry 29(2): 229-268.
Word Order Variation: Competition or Co-Operation?
Jürgen Lenerz
The interplay of different factors in word order variation seems to call for a
description in terms of a competition model. Still, a general assessment of
competition models shows certain drawbacks in explanatory power. I argue
that an approach in terms of a detailed description of the interacting factors
of co-operating subsystems may result in a deeper understanding of central
facts of word order variation. Using well-known data from German, this is
exemplified by a semantic and pragmatic analysis of the referential power
of definite and indefinite noun phrases in the background and the focus re-
spectively. Such an analysis in terms of a choice function approach not only
provides a deeper insight into the relevant mechanisms, but it also, I hope,
opens the door for further research in a number of areas not connected with
the study of word order variation.
2 The Analysis
The phenomena I will deal with were first described in some detail in Lenerz
(1977a) and have been discussed under various aspects in subsequent work.
Recent attempts to describe them in terms of an OT-account can be found in
Choi (1996), Büring (1996) and Müller (1998), with some discussion of the
relevant literature.
In German, the order of the indirect object (IO, dative) and the direct object
(DO, accusative) depends to a large degree on the particular verb, probably
basically being due to considerations of animateness (cf. Vogel & Steinbach
1998). There has been a considerable amount of debate about the proper syn-
tactic analysis, concerning the basic order of arguments, binding relations,
etc. (cf., amongst others, Rosengren 1993, Fanselow 1993, 1997, Haider &
Rosengren 1998, and Müller 1998). I won't take up this discussion. For the
present purposes, the following may suffice: It has generally been agreed that
for a large part of German verbs the unmarked order is IO > DO, as the evidence
in (2) shows. The order IO > DO in (2) is not subject to the constraints
obtaining for the reverse order in (3) and (4). A reasonable explanation for
naming IO > DO the unmarked order may be given along the lines of Höhle
(1982) and some subsequent work (cf. Büring 1997): The unmarked order
may be used in more discourse contexts since it allows more focus interpretations
(via F-projection). I won't go into this matter, however, in the present paper.
More discussion may be found in Reinhart (1995, 1997), Eckardt (1996),
Truckenbrodt (1996), Zubizarreta (1998) and related work; cf. also Uhmann
(1991).
The following generalizations have been observed (cf. Lenerz 1977a,
Büring 1996):
(1) a. [±def IO] > [±def DO]: "unmarked order", regardless of focus
       position (cf. (2-a), (3-a), (4-a))
    b. [+def DO] > [IO]_F: scrambling of [+def, −F] is o.k. (cf. (2-b))
    c. *[±def DO]_F > IO = don't scramble focus! (cf. (3))
    d. *[−def DO] > [IO]_F = don't scramble (existential) indefinites! (cf.
       (4))
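The generalizations in (1) can be restated as a small decision procedure. The following Python sketch is only an illustration under simplifying assumptions: the names `NP` and `order_is_acceptable` are hypothetical, definiteness and focus are treated as plain Booleans, only the two objects IO and DO are considered, and generic readings of indefinites are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NP:
    """A noun phrase reduced to the two features used in (1) (illustrative)."""
    definite: bool  # [+def] vs. [-def]
    focus: bool     # carries the focus accent (subscript F)


def order_is_acceptable(order: str, io: NP, do: NP) -> bool:
    """Return True if the object order is admitted by (1a-d).

    `order` is "IO>DO" (unmarked) or "DO>IO" (scrambled DO).
    The IO's features play no role in (1); the parameter is kept for symmetry.
    """
    if order == "IO>DO":
        return True        # (1a): unmarked order, regardless of focus position
    if order != "DO>IO":
        raise ValueError(f"unknown order: {order!r}")
    if do.focus:
        return False       # (1c): don't scramble focus
    if not do.definite:
        return False       # (1d): don't scramble (existential) indefinites
    return True            # (1b): a definite, non-focused DO may scramble
```

On this encoding, the scrambled order DO > IO survives only for a definite, non-focused DO, which is the asymmetry the following examples are meant to illustrate.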
The questions of (2)-(4) are supposed to give an adequate context for the
respective answers in which the focus is given prominence by intonation, as
indicated by capital letters. The questioned constituent is indicated (Q: DO
or Q: IO).
It thus turns out that existential indefinites may only occur in a restricted
environment. Büring covers this case by a so-called "existential axiom":
So far, the relevant conditions are given. They have been used as basic in
In order to be able to derive the effects of (1) and (6), I will first have to define
some terminology. This is necessary in order to be able to establish an ade-
quate correlation between syntactic structure and background-focus structure
(BF-structure). This, again, is necessary because the semantic interpretation
of NPs depends on the BF-structure of the sentence. Assuming that semantic
interpretation is compositional, the relation of NPs to BF-structure has to be
visible in syntactic structure.
Many proposals have been made concerning the (referential) semantics of
indefinite NPs. It is well known that indefinite NPs may have a generic read-
ing and an existential reading. (In addition, other aspects have to be distin-
guished; there is also an (un)specific, a referential, and an attributive reading;
for some discussion and exemplification, cf. von Heusinger 1997.) Many re-
cent proposals refer to the distinction between a generic and an existential
reading, as exemplified in (7):
who the fireman is. If ein Feuerwehrmann has a specific reading, speaker or
hearer have to assume that the fireman is somehow identifiable, e.g., as Hans
Feuer.) The distinction between a generic and an existential reading of an
indefinite NP seems to correlate to some degree either with syntactic structure
(split tree hypothesis; cf. Heim 1982 and Diesing 1990, amongst others) or
with BF-structure (Krifka 1984, Eckardt 1996, Büring 1996, Yeom 1998).
Common to most treatments, however, is the assumption that the semantic
analysis in terms of a quantifier logic distinguishes between two domains
(cf. Heim 1982 and von Heusinger 1997 for some discussion). A common
assumption is that a sentence may be translated into a logical structure of the
form given in (8):
(9) IP
SpecIP I'
NP VP
Sadv VP
If the indefinite NP ein Feuerwehrmann is moved out of its original base po-
sition (as VP-internal subject) to SpecIP, it receives a generic interpretation.
If it stays within the VP, the indefinite NP has an existential reading. (For
some more discussion and additional observations, cf. Eckardt 1996.)
In the following, I will assume a particular version of the split tree hypoth-
esis such that a structural bi-partition of each sentence in German correlates
with a specific understanding of background-focus structure. The referential
interpretation of indefinite (and definite) NPs in my analysis will not be in
terms of a quantificational logic as in (8), but in terms of a choice function
approach along the lines of von Heusinger (1997). On this basis, generic and
existential readings of indefinite NPs will then be explained.
I take it for granted that BF-structure is relevant for the proper seman-
tic interpretation of a sentence, especially as far as the referential interpre-
tation of NPs is concerned. Assuming the principle of compositionality of
semantic interpretation, BF-structure should be visible in syntactic structure
in some way or other. In German, I claim that we find a BF-bi-partition of
every sentence: Constituents inside the VP (i.e., non-moved, VP-dominated
constituents) are somehow focus-affiliated. The focus itself and all focus-
affiliated constituents, even if they are not (part of) the focus in a narrow
sense, belong to the F-part of the sentence. All other constituents (maybe ex-
cept the topicalized constituent in SpecCP and the finite verb in C°) belong to
the B-part of the sentence. The relevant BF-bi-partition is brought about by
movement: A-movement of the subjects to SpecIP or scrambling (cf. Haider
& Rosengren 1998; for some psycholinguistic evidence, cf. Clahsen & Feath-
erston 1998; for different solutions in other languages, cf. Vikner (this vol-
ume) on object shift in Icelandic and Williams (1999) on English; cf. also
Zubizarreta 1998 for Germanic and Romance; for some general discussion,
cf. Abraham 1995, ch. 14). A functional explanation for the specific condi-
tions in German may run as follows: The VP can be viewed as the syntactic
realisation of a predicate. A predicate refers to a property (of an individual
or of several individuals, i.e., a relation of these individuals). If the whole
predicate is new information, all relevant constituents will ideally remain inside
the VP; cf. there-sentences in English, es-sentences in German, etc. If
an individual is known, the respective NP will be moved out of the VP if pos-
sible. Normally, this applies to the subject. So, we get the typical distinction
"subject-predicate" by movement of the subject to SpecIP. In German, other
NPs may also be moved out of the VP if they represent background infor-
mation, in this case by scrambling. Thus, in German, BF-structure may be
formally represented by a syntactic bi-partition in surface structure:
(10)  Topicalization | Subject in SpecIP, scrambled NP | VP-dominated constituents
      ?              | background                      | focus (f-affiliated)
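The correspondence between surface position and BF-structure described above can be sketched as a simple lookup. This is a minimal illustration, not part of the analysis itself: the enumeration `Position` and the function `bf_part` are hypothetical names, and the status of the topicalized constituent is deliberately left open, as in the text.

```python
from enum import Enum


class Position(Enum):
    """Surface positions distinguished in the bi-partition (illustrative labels)."""
    TOPICALIZED = "SpecCP"
    SUBJECT_SPEC_IP = "SpecIP"
    SCRAMBLED = "scrambled"
    VP_INTERNAL = "VP"


def bf_part(position: Position) -> str:
    """Map a surface position to "B" (B-part), "F" (F-part) or "?" (left open)."""
    if position is Position.VP_INTERNAL:
        return "F"   # VP-dominated constituents are focus-affiliated
    if position in (Position.SUBJECT_SPEC_IP, Position.SCRAMBLED):
        return "B"   # moved out of the VP: background
    return "?"       # topicalized constituent: not decided here
```

The point of the sketch is only that the assignment is a function of syntactic position; nothing has to be filtered out afterwards.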
Clearly, (10) does not cover the whole range of problems which are connected
with the correlation between prosodic prominence, (syntactic) focus structure
and information structure. For a recent approach in the framework of the min-
imalist program, cf. Zubizarreta (1998). It is impossible here to give a detailed
justification of my view by comparing it with the many proposals in the lit-
erature. It may be useful, however, for a proper understanding of my specific
proposal to compare it with some similar approaches, noticing the small but
important differences. So, the terminological distinction into background and
focus, as shown in the bottom line in (10), is somewhat vague in several re-
spects: The term focus normally refers to prosodically prominent constituents
which are either themselves new information (minimal focus) or part of the
new information, in which case the focussed constituent serves as a focus ex-
ponent for a derived focus comprising additional constituents (F-projection
or F-percolation; cf. Eckardt 1996; for the distinction between a conception
of absolute focus or relative focus, cf. Höhle 1982, Jacobs 1984, 1991). Other
elements within the VP which are not proper parts of the focus are generally
not covered by this terminology (except, maybe, in Büring 1997, whose idea
of background vs. focus is very similar to my distinction between B-part vs.
F-part). So, in order to refer to both focus proper and focus affiliated con-
stituents, I propose talking about the F-part of the sentence. The so-called
background normally refers to old information, but the term "background"
normally does not relate to prosodically prominent constituents belonging to
old information. If both have to be addressed together, a new terminology is
necessary: I call it the B-part of the sentence.
A distinction very similar to mine is in fact developed in Choi (1996)
in some detail: Based on earlier work by Vallduvi (1992), Choi distin-
guishes between old (given) information (—NEW) and new (added) informa-
tion (+NEW). In addition, she adds the finer distinction between prominent
(+PROM) and non-prominent (—PROM) parts of old or new information.
This gives a cross classification which allows for a number of necessary dis-
tinctions, e.g., rise-fall-intonation, etc. I basically agree with Choi's analysis
except for the fact that she distributes the features [±NEW, ±PROM] in a
random way on the constituents of a sentence. Thus, a subsequent battery of
ranked constraints in an OT approach has to filter out improper assignments
of [±NEW, ±PROM] to particular constituents. I hope to show that there
are rules for a proper assignment of constituents to the particular parts of
the information structure such that no subsequent filtering mechanism will be
needed.
Finally, I should add some remarks on the relation between a BF-structure
and a split tree. Büring (1996) tries to show that a BF-analysis is superior to
a split tree analysis on empirical grounds. In particular, he claims that there
are sentences in German with a generic indefinite NP inside the VP. Büring's
example is (11). He claims that einem Italiener ('a-DAT Italian', IO) may
receive a generic interpretation (as well as an existential one, to be sure). I
doubt that (11-b) can have a generic interpretation, but let's assume this for
the sake of Büring's argument.
Büring assumes that the IO einem Italiener in (11-b) is in the VP rather than
scrambled out of the VP. The possibility of a string-vacuous scrambling of
the IO (as in: IO_i [VP t_i DO]) should be ruled out, as Büring claims, because
then it would be unclear why the IO in (12) cannot have a generic reading,
i.e., why string-vacuous scrambling (as in DO_k IO_i [VP t_i t_k]) would not be
allowed there:
Here, if the sentence is judged correct at all, some sort of VP-internal DO_F
> SU sequence has to be assumed. Notice, however, that sentences like (13-
b) should be deviant, assuming Lenerz' (1977a: ch. 4; 1977b) condition of
agentivity, which is relevant for DO > SU sequences. So, (12) and (13)
do not provide counterevidence against string-vacuous scrambling. Accord-
ingly, a sentence like (14) may indeed have a generic reading if we assume
(even string-vacuous) scrambling of the IO einem Italiener, as indicated by
the parenthesised adverb immer ('always'), which presumably marks the left
VP-boundary:
Thus, I conclude that, contrary to Büring's claim, a split tree analysis for a
German sentence as in (10) may indeed properly represent a BF-bi-partition.
So far, the main reason for assuming a BF-bi-partition, as represented in a
split tree analysis, has not been made clear. As the top line in (10) states, the
different readings of NPs in the B-part and the F-part, respectively, are due to
a distinction between a b(ackground)-determined vs. an immediate sentence
context (isc)-dependent interpretation: The reference of elements in the back-
ground is b-determined: It is either given by the preceding linguistic context
or by general knowledge. In contrast, the reference of elements in the F-part,
being newly introduced or somehow affiliated to newly introduced elements,
has to be chosen in a context adequate manner, i.e., as isc-dependent refer-
ence. Take (15) as an example:
Without any particular context given, we may safely assume that the refer-
ence of Peter (being a proper name) is somehow b-determined. Assuming
that the VP (bought a book at a science fiction book store) is the F-part of
(15), the reference of a book is isc-dependent: In order to give the sentence a
proper interpretation, we have to assume that a book refers to a book whose
reference is determined by the immediate sentence context, i.e., a book which
is available at a certain book store at the time in the past at which Peter bought
it. So, a book cannot refer to the Book of Kells (which is not for sale), to the
Bible or to Chomsky & Halle, The Sound Pattern of English (both are not
available at science fiction book stores), nor to next year's best selling fiction
novel by John Irving (which cannot have been for sale in the past). Similar
reasoning applies to an isc-dependent referential interpretation of the PP at a
science fiction bookstore.
All we need to complete the set of tools for my analysis is a proper way of
distinguishing between the identification of the referents of definite and indef-
inite NPs, on the one hand, and the assertion expressed in the sentence, on the
other hand. A semantics with choice functions, developed by von Heusinger
Since the choice of the particular man is arbitrary given the definition of the
choice function so far, the sentence is true if any (arbitrarily chosen) man
snores, which corresponds to the generic character of the sentence.
However, in order to analyse non-generic sentences, von Heusinger (1997)
introduces indexed epsilon-operators that are interpreted by different choice
functions. This makes it possible to choose the reference of an NP in a
context-adequate manner, i.e., in my words, as isc-dependent. Thus the representation
for the indefinite NP a man in (17) is the indexed epsilon term
"ε_i x Mx" with the free index or parameter i for the choice of a choice function.
Thus (17) is true if there is a choice function φ_i such that the object
assigned to the set of men (i.e., a particular man) is in the extension of the
predicate is snoring, as in (17-b):
Here, as the continuous form of the predicate (is snoring) indicates, the sen-
tence refers to a particular event. Thus, the reference of a man has to be
chosen in an isc-dependent way. This is done by the context-adequate choice
function ε_i. The formula in (17-a) expresses this by using an existential quantification
of contexts (∃i): There is a context i in which the predicate S (is
snoring) applies to an x which is chosen by a context-adequate choice function
ε_i from the set of elements which have the property (attribute) of being
men (Mx). Here, the choice function is properly restricted, and there is no
generic reading available. (17-a) represents the normal "existential" (non-specific)
reference of an indefinite NP. Notice that the existential quantifier
in (17-a) does not apply to the individual a man, but to contexts (∃i).¹
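The contrast between an unrestricted and a context-adequate choice function can be made concrete over a small finite domain. The Python sketch below is an illustration only and not von Heusinger's formalism: a choice function is modelled as any function that picks one element from a set, "contexts" are simply a collection of such functions, "arbitrary" is read as "for every choice function" (an assumption), and all names and toy individuals are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, List

Individual = str
ChoiceFunction = Callable[[FrozenSet[Individual]], Individual]


def choice_functions(domain: FrozenSet[Individual]) -> List[ChoiceFunction]:
    """Enumerate the choice functions on one finite set.

    Restricted to a single set, a choice function is fixed by the element
    it picks, so there is one function per element.
    """
    def pick(element: Individual) -> ChoiceFunction:
        return lambda _set: element
    return [pick(e) for e in sorted(domain)]


def generic_reading(pred: Callable[[Individual], bool],
                    domain: FrozenSet[Individual]) -> bool:
    """Unrestricted choice: true only if every possible choice satisfies pred."""
    return all(pred(f(domain)) for f in choice_functions(domain))


def existential_reading(pred: Callable[[Individual], bool],
                        domain: FrozenSet[Individual],
                        contexts: Iterable[ChoiceFunction]) -> bool:
    """Analogue of (17-a): some context supplies a choice function whose
    pick from the domain satisfies pred."""
    return any(pred(f(domain)) for f in contexts)


men = frozenset({"Hans", "Otto"})   # toy extension of "man"
snorers = {"Otto"}                  # toy extension of "is snoring"

print(generic_reading(lambda x: x in snorers, men))                             # False
print(existential_reading(lambda x: x in snorers, men, choice_functions(men)))  # True
```

In the toy model only one of the two men snores, so the generic reading fails while the reading quantifying over contexts succeeds; the quantification ranges over choice functions, not over individuals.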
In order to complete my short account of von Heusinger's approach, I will
also present a rough characterization of the interpretation of b-determined
reference. B-determined reference is usually expressed by definite NPs or
pronouns; cf. (18) in the (preceding) context of (17):
(19) (i) one that was mentioned last in the preceding context (where a man
is listed as a snoring man: M_S is the (background-determined) subset
(Mx ∧ Sx) which was determined by the preceding context), or
(ii) one that is somehow present in a given non-linguistic context and is
"pointed at" (deictic reading), or
(iii) one that is the proper salient individual in our knowledge of the
world (cf. the president, the sun, my wife, etc.).
I will now try to show how the puzzling aspects of word order variation in
German can be explained. In order to do so, I will first derive the generic and
existential reading of indefinite NPs. This will then be applied to explain the
constraints in (1).
Let us first turn to a derivation of the generic vs. existential reading of
indefinite NPs, as exemplified in (7), repeated here for convenience:
B-part and the resulting pragmatic inference. (Similar derivations of other re-
alisations of generic NPs (bare plurals, definite generic NPs) may be worked
out.)
The existential reading of an indefinite NP can be derived in a similar way:
Ein Feuerwehrmann in (7-b) is in the F-part of the sentence. Hence, it should
be interpreted in an isc-dependent manner. In terms of a choice function ap-
proach this means that the choice function choosing an individual from the
set of firemen has to be a context adequate choice function:
(20) ∃i B(ε_i x Fx)
this constraint derives from a split tree analysis in German, reflecting the BF-
structure in surface syntax. There are a few verbs in German allowing Dative
Shift. So schicken ('to send') allows a PP-argument, as in (22-a), or an NP
argument in the dative, as in (22-b). The PP-argument is generally assumed
to be closest to the verb. Thus, (22-a) represents the unmarked order DO >
PP, whereas (22-b) shows a derived (scrambled) order DO > IO.
There is a clear distinction in the acceptability of (22-a) vs. (22-b) in the given
context: While (22-a) is perfect, (22-b) shows the same deviation as (4-b,c).
This can be explained by my analysis: The structure of (22-a) is (22-a'):
(22a') VP
NP V
ein Buch PP V
must be dominated by (every segment of) VP, i.e., the NP must be in the F-
part! Being in the F-part results in an isc-dependent interpretation, hence in
an unspecific existential reading, as shown above.
Things are different for (22-b), as (22-b') shows:
(22b') VP
NP VP
NP V'
NP V
B-part F-part
Here, the indefinite NP ein Buch has been scrambled from its base position
(IO > DO) into the B-part of the sentence, where it inevitably receives a
b-determined, hence generic interpretation. The sentence is as odd as the ex-
amples (4-b,c) above, again because of semantic deviance.
So far, the constraint (1-d) and the so-called Existential Axiom (6) have
been derived in my analysis. Still, there remains the obvious paradox in (6):
Why should an NP "in the background" be in the F-part of a sentence? It can
be shown, I believe, that the assumption that the existential indefinite NP in
the crucial examples is "in the background" is wrong.
deleting the noun Buch. This again indicates clearly that the attributive part
of the NP {Buch) is b-determined and may hence be deleted. The referential
part eins, however, has to be uttered since it establishes an isc-dependent
reference independent of the preceding context. Notice, especially, that the
use of a b-determined pronoun es ('it') is not possible; cf. (24):
This shows, too, that the indefinite (singular) NP ein Buch ('a book') in the
answer (26-b) is not "in the background" (although it was mentioned before
in the question); rather, it is in the F-part of (26-b), and thus it cannot be
referentially b-determined. Rather, it has to refer in an isc-dependent manner.
So, its reference is chosen isc-dependently; only its attributive part is
b-determined (as is, indeed, every attributive part of any NP).
This resolves the apparent paradox in the description of the distribution of
existential indefinite NPs.
To conclude, indefinite NPs receive a generic reading if they appear in the
B-part of a sentence. In the F-part of a sentence, indefinite NPs are interpreted
in an isc-dependent manner. This usually gives us an unspecific existential
reading. (There may, however, be certain contexts with a generic predicate, in
which an indefinite NP in the F-part may also receive a generic interpretation.
I will not discuss this here. I will also not discuss the distinction between a
non-specific and a specific interpretation of indefinite NPs.)
This leaves us with the cases in which each order of IO and DO is acceptable
(2-a,b). If DO is a definite NP and IO carries focus, DO may remain in situ
([±def IO]_F > [+def DO], unmarked order) or scramble ([+def DO] >
[±def IO]_F). The scrambling of a non-focus DO to the background part of
the sentence does not present a problem. There is, however, the problem of
interpreting the non-scrambled definite DO in situ, i.e., in the F-part. A sim-
ilar problem arises, of course, for Diesing's (1990) analysis, as Choi (1996:
120f.) notices. In the original split tree analysis, the definite NP, being in-
side the VP, should be bound by the unselective existential quantifier which
contradicts its definite reference. Instead of discussing Diesing's and Choi's
proposals, I will present my own analysis, which is based on the principles
stated and applied above.
Such a difference in meaning is, however, hard to establish. I assume that the
difference does not reside in the final meaning of both sentences, but rather
in the way their interpretation is brought about.
Let me start with the more natural case (2-b). Here, the information of the
F-part (dem Studenten gegeben) is added to the B-part (ich habe das Buch ...).
The NP das Buch belongs to the B-part and its reference is established in a
b-determined manner, i.e., as the most salient individual of the set of books
in the given context; cf. (18) and (19). The explicit repetition of the whole
NP das Buch is, in fact, unnatural and would, of course, require additional
pragmatic reasoning. The most natural explicit answer replaces the NP with
a personal pronoun; cf. (2-b'). (I will disregard elliptic answers like "dem
Studenten" for obvious reasons; they don't allow any insight into word order
relations.)
So, (2-b) or (2-b') do not present a problem to my analysis. This is, of course,
different for (2-a). My proposal will have to proceed as follows.
In (2-a), the information of the F-part (dem Studenten das Buch gegeben)
is added to the B-part (ich habe); the NP das Buch here belongs to the F-
part and its reference is established in an isc-dependent manner, i.e., as a
specific, aforementioned book whose reference has to be determined appro-
priately with respect to the immediate sentence context. This, of course, will
result in fixing the reference of the NP das Buch as referring to exactly the
same individual as in the question (2), or in the related answer (2-b). (2-a)
only achieves this in a more indirect way. Notice, too, that a replacement of
the NP with a personal pronoun in the F-part is not possible; cf. (2-a'):
3 Conclusion
their respective principles such that a true account of the interaction of syntax,
prosodic structure, semantics and (discourse) pragmatics (BF-structure) can
be achieved. Even if this has not been fully accomplished in the present paper,
my analysis has shown that sentences with different word orders are indeed
different sentences, each with a different meaning, even if they show the same
lexical material ("numeration") and the same argument structure. So, in the
ideal case, the candidate set (numeration + meaning) for each sentence will in
fact be reduced to cardinality |1|, leaving no room for a competitional model.
This seems to be true at least for the cases discussed. Other constraints may
still exist which are not amenable to a similar approach. This may hold at
least for those cases in which the factors I mentioned are not involved. The
well-known case of growing length of constituents as discussed by Hawkins
(1983) and Primus (1993, 1994), amongst others, comes to mind.
Thus, my analysis has not proven that competition models are inadequate
altogether. Rather, I hope to have shown that it is worthwhile to investigate
the apparently competing conditions in detail and try to derive them from
more basic principles before applying them in a competition model. There
seems to be more co-operation than one might think at first sight.
4 Appendix
In the following appendix, I will present some independent evidence for the
distinction between a referential and a non-referential/attributive part of the
semantics of NP. In particular, I will show that such a distinction is reflected
in syntactic behaviour.
Clearly, the noun Arzt ('doctor') does not refer; so, maybe the special form
of the preposition (zum) is only a merger of a preposition with a dative case
ending (zu + -m), lacking the referential part of the determiner and relating
only to the attributive meaning of the noun.
A similar point can be made for predicate nouns. They, too, do not seem to
refer, but to consist only of an attributive reading:
(28) a. Peter wird (*der/?ein) Lehrer, und Max will *er/es auch
Peter becomes (*the/?a) teacher, and Max wants *him/it also
werden.
become
b. Peter wird (*der/?ein) Lehrer, und Max will auch einer [n0]
Peter becomes (*the/?a) teacher, and Max wants also one
werden.
become
'Peter will become a teacher, and Max also wants to become one.'
Similarly, there are coined phrases which may or may not be read in an id-
iomatic way:
Predictably, the idiomatic reading is again only obtained with the NP in situ
(32-a), whereas the scrambled NP in (32-b) only allows a non-idiomatic, i.e.,
referential interpretation:
(32) a. Der Schiedsrichter hat einem SPIELER die rote Karte gezeigt.
the referee has a-DAT player the red card shown
[idiomatic reading preferred]
b. Der Schiedsrichter hat die rote Karte einem SPIELER gezeigt.
the referee has the red card a-DAT player shown
[referential, non-idiomatic reading only]
(33) Volvos_i haben mich ja [VP [NP viele t_i] überholt].
Volvos have me yes many overtaken
'(As for) Volvos, many overtook me.'
(34) Was_i haben dich denn [VP [NP t_i für Leute] angesprochen]?
what have you-ACC then for people addressed
'What kind of people addressed you, then?'
As has been noted before (cf. Müller & Sternefeld 1995), the source of the
movement must be VP internal. Both movements may not apply after scram-
bling (cf. also Lenerz 1994: 163):
Notes
I would like to thank H. Weiß and K. v. Heusinger for valuable comments and helpful
proposals.
1. In particular, (17-a) is not a covered-up version of a quantifier approach: As von
Heusinger (1997: 98ff.) points out himself, the formula (17-a) is only a shorthand
version of a more precise logical expression without any quantifier whatsoever:
References
Abraham, Werner
1995 Deutsche Syntax im Sprachenvergleich. Tübingen: Narr.
Bierwisch, Manfred
1983 Semantische und konzeptuelle Repräsentation lexikalischer Einheiten.
In: W. Motsch and R. Růžička (eds.) Untersuchungen zur Semantik, 61-
99. (Studia Grammatica 22.) Berlin: Akademie-Verlag.
Büring, Daniel
1996 Towards an economy-theoretic treatment of German Mittelfeld word or-
der. Ms., DFG-Project "Ökonomieprinzipien" (GR 559/5-1). Universität
Frankfurt und Köln.
Büring, Daniel
1997 The Meaning of Topic and Focus - The 59th Street Bridge Accent. Lon-
don: Routledge.
Choi, Hye-Won
1996 Optimizing structure in context: Scrambling and information structure.
Ph.D. dissertation, Stanford University.
Chomsky, Noam
1995 The Minimalist Program. Cambridge, MA: MIT Press.
Clahsen, Harald — Samuel Featherston
1998 Antecedent Priming at Trace Positions: Evidence from German scram-
bling. (Essex Research Reports in Linguistics Vol. 23.) Essex: University
of Essex.
Diesing, Molly
1990 The syntactic roots of semantic partition. Ph.D. dissertation, University
of Massachusetts, Amherst.
Eckardt, Regine
1996 Intonation and Predication: An Investigation in the Nature of Judge-
ment Structure. Arbeitspapiere des Sonderforschungsbereichs 340, No.
77. Stuttgart & Tübingen.
Egli, Urs
1995 Definiteness, binding, salience, and choice functions. In: F. Hamm, J.
Kolb and A. von Stechow (eds.) The Blaubeuren Papers: Proceedings
of the Workshop on Recent Developments in the Theory of Natural Lan-
guage Semantics. October, 9-16th 1994, 105-125. Technical Report 08-
95, Seminar für Sprachwissenschaft der Universität Tübingen.
Fanselow, Gisbert
1993 The return of the base generators. In: Groninger Arbeiten zur Germanistischen
Linguistik 36: 1-74.
Fanselow, Gisbert
1997 Features, θ-roles, and free constituent order. Ms., University of Potsdam.
Grewendorf, Günther
1995 German: A grammatical sketch. In: J. Jacobs, A. von Stechow, W. Sterne-
feld and Th. Vennemann (eds.) Syntax, Vol II, 1288-1391. Berlin: de
Gruyter.
Grice, H. Paul
1975 Logic and conversation. In: P. Cole and J.L. Morgan (eds.) Speech Acts,
41-58. (Syntax and Semantics 3.) New York: Academic Press.
Haftka, Brigitte (ed.)
1994 Was determiniert Wortstellungsvariation? Studien zu einem Interaktions-
feld von Grammatik, Pragmatik und Sprachtypologie. Akten der AG 5 der
DGfS Jahrestagung 1993. Opladen: Westdeutscher Verlag.
Haider, Hubert
1993 Deutsche Syntax - Generativ. Tübingen: Narr.
Haider, Hubert — Inger Rosengren
1998 Scrambling. (Sprache und Pragmatik, Arbeitsberichte 49.) University of
Lund.
Hawkins, John A.
1983 Word Order Universals. New York: Academic Press.
Hawkins, John A.
1990 A parsing theory of word order universals. Linguistic Inquiry 21: 223-
261.
Heim, Irene
1982 The semantics of definite and indefinite noun phrases. Ph.D. dissertation,
University of Massachusetts, Amherst.
Heusinger, Klaus v.
1997 Salienz und Referenz: Der Epsilonoperator in der Semantik der No-
minalphrase und anaphorischer Pronomen. (Studia Grammatica 43.)
Berlin: Akademie-Verlag.
Heusinger, Klaus v.
to appear The reference of indefinites. In: K. v. Heusinger and U. Egli (eds.) Ref-
erence and Anaphoric Relations, 265-284. (Studies in Linguistics and
Philosophy.) Dordrecht: Kluwer.
Hilbert, David — Bernays, Paul
1977 Grundlagen der Mathematik, Vol. II, 2nd ed. Berlin/New York: Springer
Verlag.
Höhle, Tilman N.
1982 Explikationen für 'normale Betonung' und 'normale Wortstellung'. In:
W. Abraham (ed.) Satzglieder im Deutschen, 75-153. Tübingen: Narr.
Jacobs, Joachim
1984 Funktionale Satzperspektive und Illokutionssemantik. Linguistische Berichte
91: 25-58.
Jacobs, Joachim
1988a Probleme der freien Wortstellung im Deutschen. Sprache und Pragmatik
5: 8-37.
Jacobs, Joachim
1988b Fokus-Hintergrund-Gliederung und Grammatik. In: H. Altmann (ed.) In-
tonationsforschungen, 89-134. Tübingen: Niemeyer.
Jacobs, Joachim
1991 Focus ambiguities. Journal of Semantics 8: 1-36.
Jacobs, Joachim
1992a Integration. Arbeitsbericht des Sonderforschungsbereichs 282 (Theorie
des Lexikons), No. 13. Universität Düsseldorf.
Jacobs, Joachim
1992b Neutral stress and the position of heads. In: J. Jacobs (ed.) Informations-
struktur und Grammatik, 220-244. Opladen: Westdeutscher Verlag.
Krifka, Manfred
1984 Fokus, Topik, syntaktische Struktur und semantische Interpretation. Ms.,
Universität München.
Krifka, Manfred
1998 Scope inversion under the rise-fall-contour in German. Linguistic Inquiry
29: 75-112.
Lenerz, Jürgen
1977a Zur Abfolge nominaler Satzglieder im Deutschen. Tübingen: Narr.
Lenerz, Jürgen
1977b Zum Einfluß von 'Agens' auf die Wortstellung des Deutschen. In: H.W.
Viethen, W.-D. Bald and K. Sprengel (eds.) Grammatik und interdiszi-
plinäre Bereiche der Linguistik. Akten des II. Linguistischen Kolloqui-
ums, Aachen, 1976, 133-142. Tübingen: Niemeyer.
Lenerz, Jürgen
1994 Pronomenprobleme. In: Haftka (ed.), 161-173.
Lenerz, Jürgen
1998 Noam Chomsky, The minimalist program. In: Beiträge zur Geschichte
der Deutschen Sprache und Literatur, Bd. 120, Heft 1, 103-111. Tübin-
gen: Niemeyer.
Lenerz, Jürgen
1999 Besprechung: Klaus von Heusinger, „Salienz und Referenz. Der Epsilonoperator
in der Semantik der Nominalphrase und anaphorischer Pronomen". In: Beiträge
zur Geschichte der deutschen Sprache und Literatur, Bd. 121, Heft 3, 456-459.
Tübingen: Niemeyer.
Müller, Gereon
1998 German Word Order and Optimality Theory. Arbeitspapiere des Sonder-
forschungsbereichs 340, No. 126. Stuttgart & Tübingen.
Müller, Gereon — Wolfgang Sternefeld
1995 Extraction, lexical variation, and the theory of barriers. In: U. Egli, P.
Pause, C. Schwarze et al. (eds.) Lexical Knowledge in the Organization
of Language, 35-80. Amsterdam: Benjamins.
Primus, Beatrice
1993 Word order and information structure: A performance-based account of
topic positions and focus positions. In: J. Jacobs, A. von Stechow, W.
Sternefeld and Th. Vennemann (eds.) Syntax: Ein internationales Hand-
buch zeitgenössischer Forschung, 880-896. Berlin: Walter de Gruyter.
Primus, Beatrice
1994 Grammatik und Performanz: Faktoren der Wortstellungsvariation im
Mittelfeld. Sprache und Pragmatik 32: 39-86.
Reinhart, Tanya
1995 Interface strategies. OTS Working papers 95-002. Utrecht: Research In-
stitute for Language and Speech.
Reinhart, Tanya
1997 Interface economy: Focus and markedness. In: C. Wilder, H.-M. Gärtner
and M. Bierwisch (eds.) The Role of Economy Principles in Linguistic
Theory, 146-169. Berlin: Akademie-Verlag.
Rosengren, Inger
1993 Wahlfreiheit mit Konsequenzen. Scrambling, Topikalisierung und FHG
im Dienste der Informationsstrukturierung. In: M. Reis (ed.) Wortstel-
lung und Informationsstruktur, 251-312. (Linguistische Arbeiten 306.)
Tübingen: Max Niemeyer Verlag.
Selkirk, Elisabeth O.
1984 Phonology and Syntax: The Relation between Sound and Structure. Cam-
bridge, MA: MIT Press.
Siebert, Susann
1999 Wortbildung und Grammatik: Syntaktische Restriktionen in der Struktur
komplexer Wörter. Tübingen: Niemeyer.
Truckenbrodt, Hubert
1996 Prosodie und Intonation im Deutschen. Talk presented at GGS, Berlin.
Uhmann, Susanne
1991 Fokusphonologie. (Linguistische Arbeiten 252.) Tübingen: Niemeyer.
Uszkoreit, Hans
1984 Word order and constituent structure in German. Ph.D. dissertation,
University of Texas at Austin (published 1987, Stanford: CSLI Publications).
Vallduvi, Enric
1992 The Information Component. New York: Garland.
Vikner, Sten
t.v. The Interpretation of Object Shift and Optimality Theory.
Vogel, Ralf — Markus Steinbach
1998 The dative - An oblique case. Linguistische Berichte 173: 65-90.
Williams, Edwin
1999 Economy as shape conservation. Talk presented at the Annual Meeting
of the DGfS. Konstanz, Feb. 1999.
Yeom, Jae-Il
1998 A Presuppositional Analysis of Specific Indefinites: Common Grounds as
Structured Information States. New York/London: Garland.
Zimmermann, Thomas Ede
1991 Kontextabhängigkeit. In: A. von Stechow and D. Wunderlich (eds.) Se-
mantik: Ein internationales Handbuch der zeitgenössischen Forschung,
156-229. Berlin/New York: Walter de Gruyter.
Zubizarreta, Maria Luisa
1998 Prosody, Focus, and Word Order. (Linguistic Inquiry Monographs 33.)
Cambridge, MA: MIT Press.
OT Accounts of Optionality: A Comparison of Global
Ties and Neutralization
Tanja Schmid
At first sight, optionality poses a problem for all theories that assume a com-
petition between candidates (e.g. transderivational Minimalism and to a much
larger extent Optimality Theory (OT)). In such theories, the optimal (or the
most economical) candidate blocks the non-optimal (less economical) candi-
dates in a given candidate set (reference set). Only the optimal candidate is
grammatical.
In this paper I will introduce and compare two accounts of optionality in
OT and show that they are empirically equivalent. One account, the global tie
approach (see, e.g., Ackema & Neeleman 1995, 1998), involves constraint
ties and the other account, the neutralization approach (see Legendre et al.
1995, 1998; Bakovic & Keer 1999), makes use of the normal OT interaction
of faithfulness and markedness constraints. Both accounts will be applied to
different data sets. The accounts will be checked to determine whether one
is superior to the other. The result will be that in fact both approaches can
be used to account for the same kind of data. Nevertheless, to strengthen the
theory, one account should be seen as superfluous. For conceptual and not
empirical reasons, I will prefer the neutralization account in the end.
I will proceed as follows: In section 2, I will give a brief introduction to
OT and discuss different OT accounts of optionality including the two men-
tioned above. In order to justify the focus on these two accounts I will briefly
mention their advantages compared to other OT accounts of optionality.
In the following three sections, I will look at data for which either an ac-
count in terms of neutralization (section 4) or global ties (section 5) or both
(section 6) has been given in the literature. In section 4 and in section 5, I first
introduce the analysis proposed in the literature. Then I give a new account in
2 Basic Assumptions of OT
(1) Basic assumptions (see among many others Prince & Smolensky
1993, Grimshaw 1997)
a. Constraints are universal.
b. Constraints are violable.
c. Grammars are rankings of constraints.
d. An optimal candidate in a candidate set is grammatical, all non-
optimal candidates are ungrammatical.
e. The grammaticality of a candidate not only depends on its inher-
ent properties, but on the properties of the competing candidates as
well.
The candidates in a given candidate set are generated by a part of the gram-
mar (GEN, for generator) which contains only inviolable and unranked con-
straints. GEN takes an underlying form (the input) and builds up all possi-
ble output structures. These outputs, called the candidates, are evaluated by
another part of the grammar, the function H-EVAL (Harmony Evaluation),
which determines the optimal candidate(s) based on the constraint hierarchy
of the language.
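The evaluation step can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the function names `violation_vector` and `h_eval`, the dictionary representation of a candidate, and the toy violation counts are all hypothetical and do not come from the literature cited here. It treats H-EVAL as a lexicographic comparison of violation counts, read from the highest-ranked constraint downwards.

```python
from typing import Dict, List

Profile = Dict[str, int]  # constraint name -> number of violations


def violation_vector(profile: Profile, ranking: List[str]) -> tuple:
    """Order a candidate's violations from the highest- to the lowest-ranked constraint."""
    return tuple(profile.get(constraint, 0) for constraint in ranking)


def h_eval(candidates: Dict[str, Profile], ranking: List[str]) -> List[str]:
    """Return every candidate whose violation vector is lexicographically minimal."""
    vectors = {name: violation_vector(p, ranking) for name, p in candidates.items()}
    best = min(vectors.values())
    return sorted(name for name, vec in vectors.items() if vec == best)


# Hypothetical toy competition (the counts are invented for illustration).
candidates = {
    "C1": {"A": 0, "B": 2, "C": 0},
    "C2": {"A": 0, "B": 1, "C": 3},
    "C3": {"A": 1, "B": 0, "C": 0},
}

# Under A >> B >> C, C3 loses on A, and C2 beats C1 on B.
print(h_eval(candidates, ["A", "B", "C"]))
# Re-ranking the same universal constraints yields a different grammar.
print(h_eval(candidates, ["C", "B", "A"]))
```

Because the function returns a list, nothing in this sketch prevents more than one candidate from being optimal, which is the property the discussion of optionality builds on.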
I will use the following notation in this paper:
3 OT Accounts of Optionality
The present definition of optimality is compatible with more than one opti-
mal candidate in one and the same competition. This is exactly what I will
understand by optionality from a theoretical point of view:
(4) Optionality: Two (or more) different candidates are optimal, i.e.,
grammatical, though they are (or seem to be) in the same competi-
tion.
The obvious way of allowing for optionality in OT is that the winning can-
didates in one and the same competition have the same constraint profile
(Grimshaw 1997: 410f., and Vikner 1999 use identity of constraint profile
to account for complementizer optionality). The condition for the optimality
of two (or more) competing candidates under this point of view is absolute
identity of the optimal constraint profile.
Identity of constraint profile is an intrinsic part of the theory that results di-
rectly from the basic mechanisms of OT. Regardless of any additional mech-
anisms and assumptions used, identity of constraint profile can never be ex-
cluded.
Only if the identity of the constraint profile is used as the only way to ac-
count for optionality, and nothing else is stipulated, will I speak of an "ap-
proach" to optionality along these lines. Then, however, the question arises
as to whether this is sufficient to account for all cases of optionality. To il-
lustrate the idea of identity of constraint profile, a very simplified example is
given below:
(5)               A      B      C
    ☞ a. C1       *             **
    ☞ b. C2       *             **
       c. C3      **!
The tableau above shows one single competition with the three candidates
C1, C2 and C3, and an extremely small grammar consisting of only three
constraints, A, B, and C, with the ranking A » B » C. Candidate C3 fatally
violates the highest-ranking constraint A. As both remaining candidates C1
and C2 have exactly the same constraint profile (they both violate constraint
A once, constraint B not at all, and constraint C twice) and as this constraint
profile is optimal (they fare better than C3 on the highest constraint on which
they differ), both C1 and C2 are grammatical.
This approach is quite plausible for a small grammar, as in (5), but with a
larger number of constraints, it is unlikely that the optimal candidates are not
distinguished by any constraint at all.
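The fragility of this kind of optionality can be illustrated with a short sketch. The violation counts for C1 and C2 follow the description of the abstract competition above; the marks of C3 on B and C are not stated there and are assumed to be zero, and the additional constraint D is purely hypothetical. The function name `h_eval` is likewise an assumption of the example.

```python
def h_eval(candidates, ranking):
    """Return all candidates with the lexicographically best violation vector."""
    vectors = {
        name: tuple(marks.get(c, 0) for c in ranking)
        for name, marks in candidates.items()
    }
    best = min(vectors.values())
    return sorted(name for name, vec in vectors.items() if vec == best)


# Three-constraint grammar: C1 and C2 have identical profiles.
small = {
    "C1": {"A": 1, "B": 0, "C": 2},
    "C2": {"A": 1, "B": 0, "C": 2},
    "C3": {"A": 2},  # marks on B and C assumed to be zero
}
print(h_eval(small, ["A", "B", "C"]))

# Hypothetical fourth constraint D, ranked lowest and violated only by C1.
larger = {name: dict(marks) for name, marks in small.items()}
larger["C1"]["D"] = 1
print(h_eval(larger, ["A", "B", "C", "D"]))
```

In the first competition both C1 and C2 come out as optimal; as soon as a single constraint, however low-ranked, distinguishes them, only one winner is left.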
It is extremely difficult to keep an identical constraint profile of two (or
more) candidates. For this reason, identity of constraint profile should not be
seen as an independent approach to account for all cases of optionality, but
merely as a theoretical possibility that is not sufficient on its own for most
cases.
One additional assumption that is made in the literature to account for option-
ality is the possibility of constraint ties. Constraints that are tied are equally
important, i.e., two (or more) competing candidates may differ with respect
to the tied constraints but can nevertheless both (all) be optimal.
The notion of "constraint tie" is not used uniformly: At least five differ-
ent concepts of tie can be found in the literature (see Müller 1999 for an
overview). Prince & Smolensky (1993: 51, fn. 31) briefly mention the possi-
bility of constraint ties and open the door to different interpretations:
I will continue by concentrating on two quite common notions of tie that are
in accordance with Prince & Smolensky's considerations. I will call them
local ties and global ties. Local ties can be seen as special types of constraints
and global ties as underspecifications of different constraint rankings, i.e., in
a language with a global tie, multiple constraint rankings co-exist. 1
Local ties follow one of the notions mentioned in Prince & Smolensky (1993:
51), namely, the "crucial nonranking" of constraints.
With the type of local tie that I will introduce (see Müller 1997 for a crucial
application), tied constraints count as a single constraint, i.e., "a candidate
violates a tie if it violates a constraint that is part of this tie, and multiple
violations add up" (Müller 1999: 6). A simplified abstract example is given
in (7):
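[Tableau (7), illustrating a local tie, is only partially preserved: candidate C2 is marked as optimal (☞ b. C2), and candidate C3 incurs a fatal third violation (c. C3 ***!).]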
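The idea that locally tied constraints count as a single constraint whose violations add up can be sketched as follows. This is a minimal illustration, not the formalization used in the cited work: the function name `local_tie_eval`, the representation of a ranking as a list of strata, and all violation counts are assumptions made for the example.

```python
def local_tie_eval(candidates, strata):
    """Evaluate with local ties.

    Each stratum is a list of constraint names. A stratum with more than
    one member is a local tie: the violations of its members are summed
    and compared as if they belonged to one constraint.
    """
    vectors = {
        name: tuple(sum(marks.get(c, 0) for c in stratum) for stratum in strata)
        for name, marks in candidates.items()
    }
    best = min(vectors.values())
    return sorted(name for name, vec in vectors.items() if vec == best)


strata = [["A", "B"], ["C"]]  # local tie of A and B, ranked above C

# Hypothetical counts: C1 and C2 differ on A and B but have the same sum.
same_below = {
    "C1": {"A": 2, "B": 0, "C": 0},
    "C2": {"A": 1, "B": 1, "C": 0},
    "C3": {"A": 1, "B": 2, "C": 0},
}
print(local_tie_eval(same_below, strata))

# Same competition, except that C2 now also violates C below the tie.
differ_below = {name: dict(marks) for name, marks in same_below.items()}
differ_below["C2"]["C"] = 1
print(local_tie_eval(differ_below, strata))
```

The second call shows the limitation that matters later on: with a local tie, two candidates can only both be optimal if their profiles below the tie are identical.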
Two (or more) competing candidates are grammatical if they are optimal
under one possible resolution of the tie. This means, contrary to what is the
case with local ties, that the optimal candidates may show a different con-
straint profile below the tied constraint.
In (8) I give an abstract example of a global tie in the underspecified form.
The resolutions of the tie are given in (9) and (10) below.
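[Tableau (8): the unresolved global tie A ○ B, ranked above C, with the candidates C1, C2 and C3; the violation marks are not preserved.]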
The global tie A ○ B above is not yet explicitly resolved. It is shown, however,
that again, both candidate C1 and candidate C2 are optimal. Notice that
this result could not be achieved under the assumption of identity of constraint
profile or local ties alone, as C1 and C2 differ in the number of violations both
on the tied constraint and on constraint C below the tie.
Candidate C3 is suboptimal under any resolution of the tie. This is expressed
by the two marks of fatal violation in brackets (!). Why these marks
are fatal will become clearer below, where the global tie is resolved into the
two possible total orders A » B » C (9) and B » A » C (10):
Under this resolution of the tie, candidate C2 is optimal as it fares best on the
highest constraint on which the candidates differ (i.e., constraint A). What
happens under the opposite total order is shown in (10):
   b. C2   *!   *
   c. C3   ***!
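The resolution of a global tie into total orders can be sketched in a few lines. The fragment is an illustration under assumptions: the function names, the stratum-based representation of the ranking, and the violation counts are hypothetical. The counts are chosen only so that they match the verbal description (C1 and C2 differ on the tied constraints and on C; C3 loses under every resolution).

```python
from itertools import permutations, product


def h_eval(candidates, ranking):
    """Set of candidates that are optimal under one total ranking."""
    vectors = {
        name: tuple(marks.get(c, 0) for c in ranking)
        for name, marks in candidates.items()
    }
    best = min(vectors.values())
    return {name for name, vec in vectors.items() if vec == best}


def resolutions(strata):
    """Yield every total order obtained by resolving each tied stratum."""
    for choice in product(*(permutations(stratum) for stratum in strata)):
        yield [c for stratum in choice for c in stratum]


def global_tie_eval(candidates, strata):
    """A candidate is grammatical if it wins under at least one resolution."""
    winners = set()
    for ranking in resolutions(strata):
        winners |= h_eval(candidates, ranking)
    return sorted(winners)


# Hypothetical violation counts.
candidates = {
    "C1": {"A": 1, "B": 0, "C": 0},
    "C2": {"A": 0, "B": 1, "C": 1},
    "C3": {"A": 1, "B": 3, "C": 0},
}
strata = [["A", "B"], ["C"]]  # global tie of A and B, ranked above C

print(list(resolutions(strata)))
print(global_tie_eval(candidates, strata))
```

With these counts, C2 wins under A » B » C and C1 wins under B » A » C, so both are grammatical although their profiles differ below the tie.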
3.3 Neutralization
All accounts of optionality that I have introduced so far assume that two (or
more) grammatical candidates are optimal in one and the same competition.
The main idea of the neutralization account, however, is that the optimal can-
didates win different competitions, i.e., that they are not built from the same
input (although they may be included in each other's candidate sets by GEN).
A crucial assumption for the neutralization approach to optionality is that
the relevant inputs differ only minimally with respect to, e.g., functional fea-
tures (Bakovic & Keer 1999); otherwise, they are identical. The contrasts in
the input are either preserved in the output (apparent optionality) or neutral-
ized depending on the constraint ranking of the language.
Neutralization equals a "breakdown" of optionality: A candidate is optimal
not only in a candidate set in which it is faithful to the input, but also in a
candidate set in which it is unfaithful. This is the case when the unfaithful
candidate blocks the faithful one due to a higher ranked (markedness) con-
straint, i.e., a difference in the input is neutralized in the output; hence the
name "neutralization" for the whole approach.
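The difference between preserved and neutralized input contrasts can be sketched abstractly. The fragment below is an illustration only: the feature [x], the constraint names FAITH and MARK, and the function names are assumptions of the example and are not taken from Bakovic & Keer (1999). MARK stands for some markedness constraint that is violated by the output without [x].

```python
OUTPUTS = ["+x", "-x"]


def winners(input_value, ranking):
    """Optimal outputs for one input ('+' or '-') under a total ranking."""

    def profile(output):
        return {
            # Faithfulness: the output value of [x] must match the input value.
            "FAITH": int(output[0] != input_value),
            # Hypothetical markedness constraint violated by the [-x] output.
            "MARK": int(output == "-x"),
        }

    vectors = {o: tuple(profile(o)[c] for c in ranking) for o in OUTPUTS}
    best = min(vectors.values())
    return [o for o in OUTPUTS if vectors[o] == best]


def surface_forms(ranking):
    """All outputs that win for at least one of the two inputs."""
    return {w for input_value in "+-" for w in winners(input_value, ranking)}


print(surface_forms(["FAITH", "MARK"]))  # contrast preserved
print(surface_forms(["MARK", "FAITH"]))  # contrast neutralized
```

When FAITH outranks MARK, each input surfaces faithfully and two output forms coexist (apparent optionality). When MARK outranks FAITH, both inputs converge on the same output.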
An abstract example of both (apparent) optionality and neutralization is
given in (11). The table is taken from Bakovic & Keer (1999) (figure 1).
The first set of data in the comparison between the global tie approach and the
neutralization approach comes from complementizer optionality in English.
(12) a. Do you think [CP that [IP Jane looks like Mary]]?
b. Do you think [IP Jane looks like Mary]?
For these kinds of data, accounts in terms of neutralization have been pro-
posed in the literature (Legendre et al. 1995, Bakovic & Keer 1999, Kura-
fuji 1997). I will proceed as follows: First, I will introduce a neutralization
account based on those already proposed, and later on, I will give a new ap-
proach in terms of global ties that can account for the data as well.
The constraints that I will use are mostly taken from Bakovic & Keer (1999)
and Kurafuji (1997), 4 whose approaches I will combine. The constraints that
will become relevant in this section are given below: 5
The question of how the input is defined 8 becomes very relevant for the neu-
tralization approach because, under this approach, optionality is explicitly
connected with faithfulness to the input.
The input in OT-syntax can be defined in the following way: "... a lexical
head plus its argument structure (...) plus a specification of the associated
tense ..." (Grimshaw 1997: 375f.). It is crucial for the neutralization approach
to optionality that Bakovic & Keer (1999) add functional features like
[+/-COMP] to this definition.
Furthermore, Bakovic & Keer (1999) need to assume the following:
— Embedded clauses with that are CPs and those without that are IPs.
— Embedded CPs and IPs differ in their specification for a feature
[COMP].
— Bridge verbs like think can be equipped with either a [+COMP] or a
[-COMP] feature; [+COMP] requires a CP and [-COMP] requires
an IP.
This time, candidate (b) without a complementizer is faithful to the input and
therefore optimal in the competition.
So far, it has been shown that the neutralization approach can account for
optionality: Optionality is the result of faithfulness to slightly different input
specifications.
In this section, I give a new account (in terms of global ties) for the same
data. In contrast to the neutralization approach, two (or more) candidates
emerge as optimal in one and the same competition in the global tie approach.
Another difference concerns the role of the input, which does not
need to be as explicitly specified. In all global tie accounts in syntax that I am
aware of (Ackema & Neeleman 1995 and 1998, Broekhuis & Dekkers 1999,
Schmid 1998 and 1999), markedness constraints are the only constraints that
are needed to account for optionality. Faithfulness constraints sensitive to
functional features seem not to be necessary.
When a markedness constraint that contradicts *EXP is introduced instead of
the faithfulness constraint, and when these two constraints are tied, optionality
can be accounted for. I assume that the markedness constraint in
question is HAVE (CP), which requires a clause to be a CP (because, e.g.,
only there can the information about sentence mood be stored).
When the input-sensitive faithfulness constraint is no longer crucial, it
does not matter whether the input is specified for [-COMP] or [+COMP]. As
markedness constraints make the decision, the result would be the same in
either case.
The competition with a global tie (unresolved) is shown in (21):
☞ b. ... think [IP ...   *
To account for cases like these, the constraint PURE-EP, inactive so far, becomes
relevant. One part of PURE-EP prohibits adjunction to the highest
projection of a subordinate clause. The CP that is introduced by the complementizer
can function as a "shelter" for adjunction. When it is present, the
projection to which adjunction takes place is no longer the highest projection
of the subordinate clause.
In the neutralization approach, PURE-EP is the markedness constraint that
is responsible for neutralizing different feature specifications in the input to
only one output specification when it is ranked above the relevant faithfulness
constraint. No matter what the feature specification in the input, the output
will always show a complementizer. This can be seen in the tableaux below.
In (23), the input is specified for [+COMP]:
The decision is made above the tie at PURE-EP, favouring candidate (a) with
a complementizer.
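The interaction of PURE-EP with the tie can be sketched as follows. The violation profiles are assumptions made for the illustration: the candidate with that is taken to violate *EXP once, the bare IP candidate to violate HAVE (CP) once, and, when something is adjoined to the embedded clause, the bare IP candidate is taken to violate PURE-EP in addition. The function names and candidate labels are hypothetical.

```python
from itertools import permutations


def optimal(candidates, ranking):
    """Set of candidates with the best violation vector under a total ranking."""
    vectors = {
        name: tuple(marks.get(c, 0) for c in ranking)
        for name, marks in candidates.items()
    }
    best = min(vectors.values())
    return {name for name, vec in vectors.items() if vec == best}


def with_global_tie(candidates, above, tied):
    """Union of the winners over all resolutions of the tie, ranked below `above`."""
    winners = set()
    for order in permutations(tied):
        winners |= optimal(candidates, [*above, *order])
    return winners


TIE = ["HAVE(CP)", "*EXP"]

# Plain embedded clause (assumed profiles).
plain = {
    "that-CP": {"*EXP": 1},
    "bare-IP": {"HAVE(CP)": 1},
}
# Embedded clause with adjunction to its highest projection (assumed profiles).
adjoined = {
    "that-CP": {"*EXP": 1},
    "bare-IP": {"HAVE(CP)": 1, "PURE-EP": 1},
}

print(with_global_tie(plain, ["PURE-EP"], TIE))
print(with_global_tie(adjoined, ["PURE-EP"], TIE))
```

In the plain case each resolution of the tie picks a different winner, so both variants are grammatical. In the adjunction case the competition is already decided at PURE-EP, and only the candidate with a complementizer survives under either resolution.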
In this section I have shown that cases of complementizer optionality and
obligatoriness which have been accounted for under the neutralization approach
in the literature can be accounted for just as well under the global
tie approach. Note that *EXP, which is crucial for the global tie approach, is
superfluous for the neutralization approach - at least for the cases that have
been looked at here.
The second set of data comes from French, in which wh-movement of argument
XPs is optional in root questions. This time, an approach in terms of
global ties is given in the literature; see Ackema & Neeleman (1995, 1998). 12
An analysis along these lines will be introduced and taken as the basis for an
account in terms of neutralization. The relevant set of data is given below:
As shown in (26) there are (at least) three possible ways of forming a root
question in French. 13 Example (a) shows wh-movement and subject-auxiliary
inversion. In example (b) the wh-element remains in situ, and in example (c)
it is moved again, but this time without subject-auxiliary inversion. The only
ungrammatical example is subject-auxiliary inversion without wh-movement,
as shown in (d). It is often assumed that (a) occurs in standard French and (b)
and (c) in colloquial French (see, e.g., Confais 1985: 175). Unlike Ackema
& Neeleman (1995), who are mainly interested in the optionality between (a)
and (b), 14 I will derive optionality inside the same register, i.e., between (b)
and (c), in the following.
The relevant constraints from Ackema & Neeleman (1995, 1998) are shown
below:
That is, the VP must be marked by a lexical X with a [+Q] feature. This [+Q]
feature is assigned via sisterhood by a wh-element in Spec XP. In root clauses,
this constraint can only be obeyed by a combination of wh-movement and X
movement.
The following tableaux (for standard French and for colloquial French below)
are underspecified, representing two subtableaux simultaneously:
In standard French, only candidate (a) with movement of both the wh-phrase
and the auxiliary is optimal. The highly ranked Q-MARK requiring both wh-movement
and verb movement eliminates any other candidate, independently
of the resolution of the tie.
Now assume that the only difference between the two registers (i.e., the only
one relevant for the case at hand) is the position of Q-MARK. Contrary to
standard French, in which it is highly ranked, it is ranked below the tie in
colloquial French: 18
☞ c. Qui1 tu as vu t1   *
   d. As1-tu t1 vu qui   *
Under one resolution of the constraint tie (SPC » Q-SCOPE), the requirement
to minimize movement paths is more important than the need to fulfill
scoping requirements via movement. Therefore, under this ranking, candidate
(b) without wh-movement is optimal. Under the opposite ranking (Q-SCOPE »
SPC), however, candidate (c) emerges as optimal because it respects Q-SCOPE
(with the shortest possible movement path). 19
Thus, the global tie between two constraints can account for the optionality
of wh-movement in colloquial French. 20
(34) FAITH[Q]: The output value of [Q] is the same as the input value.
Under the assumption that the functional feature [Q] may be freely inserted
in the input, more candidates than before need to be looked at. For all four
candidates (a-d) a phonetically identical counterpart exists that differs only in
the feature specification of the wh-element. These "counterpart candidates"
are included in the tableaux below.
The presence of a [Q]-feature on a wh-element will be marked by [+] and
the absence of a [Q]-feature by [-].
In the input of the first competition, the wh-element is equipped with a [Q]-feature.
(36) Input: [+Q]
                                      Q-MARK   FAITH[Q]   Q-SCOPE   SPC
     ☞ a. Qui[+]1 as2-tu t2 vu t1                                    ******
        b. Tu as vu qui[-]            *!       *
        c. Qui[+]1 tu as vu t1        *!                             ***
        f. Tu as vu qui[+]            *!                  *
        g. Qui[-]1 tu as vu t1        *!       *                     ***

(37) Input: [-Q]
                                      Q-MARK   FAITH[Q]   Q-SCOPE   SPC
        b. Tu as vu qui[-]            *!
        c. Qui[+]1 tu as vu t1        *!       *                     ***
        d. As1-tu t1 vu qui[-]        *!                             ***
        e. Qui[-]1 as2-tu t2 vu t1    *!                             ******
        f. Tu as vu qui[+]            *!       *          *
        g. Qui[-]1 tu as vu t1        *!                             ***
        h. As1-tu t1 vu qui[+]        *!       *          *          ***
Again, candidate (a) is optimal although this time, it is unfaithful to the input.
Only candidate (a) with movement of both the wh-phrase and the auxiliary is
optimal in standard French under the neutralization approach, just as under
the global tie approach. The highly ranked Q-MARK again eliminates any
other candidate and forces a [+Q] element to occur even if it is not present in
the input. Different underlying specifications are neutralized in the output.
In colloquial French, the constraint ranking differs slightly from the one in
standard French: As in the global tie approach, Q-MARK, which is highly
ranked in standard French, is ranked low in colloquial French. The faithfulness
constraint FAITH[Q] must be ranked above all markedness constraints,
and among the markedness constraints, Q-SCOPE and SPC have to be ranked
above Q-MARK. This is shown in (38):
(39) Input: [+Q]
        e. Qui[-]1 as2-tu t2 vu t1   *!   ******   *
        f. Tu as vu qui[+]           *!   *
        g. Qui[-]1 tu as vu t1       *!   ***   *
        h. As1-tu t1 vu qui[+]       *!   ***   *

Input: [-Q]
        h. As1-tu t1 vu qui[+]       *!   *   ***   *
Of the faithful candidates (b), (d), (e) and (g), the one which does not violate
Q-SCOPE and fulfills SPC best emerges as optimal. This is candidate (b).
In this section, it has been shown that the neutralization approach can account
for the optionality of wh-movement in root questions in colloquial
French, as can the global tie approach given in the literature. The two accounts
are empirically equivalent with respect to the above data. 22 In the global tie
approach, optionality is achieved by a global tie of SPC and Q-SCOPE, and
in the neutralization approach, by a combination of free [Q]-insertion in the
input and the ranking of FAITH[Q] above all relevant markedness constraints.
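The neutralization analysis of the French data can be checked mechanically with a small sketch. Everything in it is an assumption made for illustration: the markedness violations are read off the tableaux as far as they can be recovered (three SPC marks for one movement step, six for two), the relative order Q-SCOPE » SPC in colloquial French is inferred from the description of the winning candidate, and the function name `optimal` is hypothetical.

```python
# Candidate label -> ([Q] value of the wh-element, assumed markedness violations).
CANDIDATES = {
    "a": ("+", {"SPC": 6}),                               # Qui[+] as-tu vu
    "b": ("-", {"Q-MARK": 1}),                            # Tu as vu qui[-]
    "c": ("+", {"SPC": 3, "Q-MARK": 1}),                  # Qui[+] tu as vu
    "d": ("-", {"SPC": 3, "Q-MARK": 1}),                  # As-tu vu qui[-]
    "e": ("-", {"SPC": 6, "Q-MARK": 1}),                  # Qui[-] as-tu vu
    "f": ("+", {"Q-SCOPE": 1, "Q-MARK": 1}),              # Tu as vu qui[+]
    "g": ("-", {"SPC": 3, "Q-MARK": 1}),                  # Qui[-] tu as vu
    "h": ("+", {"Q-SCOPE": 1, "SPC": 3, "Q-MARK": 1}),    # As-tu vu qui[+]
}

STANDARD = ["Q-MARK", "FAITH[Q]", "Q-SCOPE", "SPC"]
COLLOQUIAL = ["FAITH[Q]", "Q-SCOPE", "SPC", "Q-MARK"]  # Q-SCOPE >> SPC is assumed


def optimal(input_q, ranking):
    """Optimal candidates for an input whose wh-element is [+Q] or [-Q]."""
    vectors = {}
    for name, (q_value, marks) in CANDIDATES.items():
        # FAITH[Q] is violated when the output value of [Q] differs from the input.
        profile = {**marks, "FAITH[Q]": int(q_value != input_q)}
        vectors[name] = tuple(profile.get(c, 0) for c in ranking)
    best = min(vectors.values())
    return sorted(name for name, vec in vectors.items() if vec == best)


for register, ranking in (("standard", STANDARD), ("colloquial", COLLOQUIAL)):
    for input_q in "+-":
        print(register, input_q, optimal(input_q, ranking))
```

Under these assumptions the sketch reproduces the pattern described in the text: in standard French both inputs converge on candidate (a), whereas in colloquial French the [+Q] input yields (c) and the [-Q] input yields (b).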
In the case of a matrix verb like se demander ("ask oneself") that selects an
embedded question, SELECT is satisfied when the highest projection of the
embedded clause carries a Q-feature. 23
PURE-EP, which has already been introduced in section 4, will become
crucial in standard French. It is repeated in (43):
b. tu as vu qui[-] *! * *
The faithful candidate (c) with the [Q]-feature on the moved wh-element is
the optimal one. It is the only candidate that fulfills the highly ranked SELECT
(the highest projection carries a [Q]-feature as demanded by the matrix
verb) and PURE-EP (no movement into the head of the highest embedded
projection).
Nevertheless, with an input that does not show a [Q]-feature on the wh-
element, the same optimal candidate will result:
b. tu as vu qui[-] *! *
I will assume that the order of SELECT and PURE-EP is the same as in stan-
dard French. It is important that SELECT outranks FAITH[Q]. The ranking of
PURE-EP, however, is not crucial. The remaining constraints are ranked as
before in colloquial French (see (38)). Again, I will first look at the competi-
tion with a [Q]-feature in the input:
b. tu as vu qui[-] *! * *
b. tu as vu qui[-] *! *
The global tie approach can account for the breakdown of optionality in
embedded questions along the same lines. The only thing to do is to rank the
constraints SELECT and PURE-EP above the global tie.
The topic of this section was the optionality of wh-movement in root questions
in colloquial French and its breakdown in standard French and in embedded
questions of both registers. For this kind of data, an account along
the lines of the global tie approach has been suggested in the literature. The
neutralization approach, however, which was introduced here, turns out to
account for the data as well.
6 IPP in German
The last set of data comes from IPP constructions in German. They may be
optional, depending on the verb class involved. IPP is short for Infinitivus Pro
Participio, which denotes a bare infinitive that in a certain context replaces the
expected past participle in some West Germanic languages. Simplifying a bit,
this context is given in the perfect tense, when the verb that is selected by the
temporal auxiliary takes a VP complement itself. IPP cases in German only
occur with a particular word order: The auxiliary, which normally follows
its complement verb in embedded sentences, precedes it. The connection be-
tween verb form and verb order is shown in (48) with the perception verb
hören ('hear'). Perception verbs optionally occur either in the IPP, with the
finite auxiliary hat ('has') preceding the other verbs, or as the past participle
in the "normal" verb order, with the finite auxiliary at the end.
Optionality of IPP with perception verbs:
The following constraints will become relevant below (see Schmid 1998,
1999):
One possibility to account for the data is in terms of the global tie approach
(see Schmid 1999). I will assume that the constraint that demands the occurrence
of a selected past participle (MORPH) is tied with the constraint that
may demand the replacement of a past participle under certain conditions
(*PASTP/PV/+V). Other crucial rankings are: HD-LFT & MORPH outranks
HD-RT, HD-RT outranks HD-LFT, and the tied constraints outrank HD-RT.
One possible (still underspecified) ranking is given in (55):
A competition with the above constraint ranking is shown below. The com-
peting candidates result from the manipulation of the verb form and of the
order of the lexical items by GEN.30
Note that the two optimal candidates differ in their constraint profile below
the tie. It is therefore crucial that the tie is global and not local.
It is necessary that there are inputs that differ only with respect to their spec-
ification for the past participle. To allow for optionality, it is crucial that
FAITH(PASTP) is ranked above the relevant markedness constraint, namely
*PASTP/PV/+V:
I will first look at a competition in which the input is specified for the past
participle (marked by the past participle prefix [ge-]):
... dass sie ihn [+ge-] ...   HD-LFT    FAITH     *PASTP/PV/+V   HD-RT   HD-LFT
                              & MORPH   (PASTP)
a. singen hören hat           *!        *                                ***
The winner in this competition is the faithful candidate (c) with the perception
verb in the past participle and the temporal auxiliary following the other
verbs. Although it violates the markedness constraint *PASTP/PV/+V, it is
optimal due to the higher ranking of the faithfulness constraint.
Likewise, due to this ranking, a different winner emerges when the input is
not specified for a past participle, as shown in (60):
... dass sie ihn ...          HD-LFT    FAITH     *PASTP/PV/+V   HD-RT   HD-LFT
                              & MORPH   (PASTP)
a. singen hören hat           *!                                         ***
In this competition, too, the winner is faithful to the input, i.e., this time, the
perception verb occurs in the bare infinitive and not in the past participle. As
candidate (a) fatally violates the conjoined constraint, candidate (b) emerges
as optimal, with IPP and the auxiliary on the left side of its complement.
The result of section 6 is the same as before: Again, both the global tie
approach and the neutralization approach can account for the data.
The reason for the discussion of the global tie approach and the neutralization
approach in this paper is their ability to account for the same kind of data
(which are difficult or impossible to account for under other approaches, like
identity of constraint profile and local ties):
The similarities of the two accounts of optionality given in this paper suggest
that the neutralization approach may easily be translated into the global tie
approach and vice versa. Some considerations on possible "translation rules"
are given below:
To sum up, it can be said that a global tie of two markedness constraints,
one of which, say M1, prohibits [x] and the other of which, say M2, demands
[x], has the same effect as a faithfulness constraint F which is sensitive to
[x] and outranks the markedness constraints. This is so because F on its own
either demands or prohibits [x] already, depending on the input. Only one
optimal candidate results in both approaches if another relevant markedness
constraint outranks either the tie M1 ○ M2 or the faithfulness constraint F.
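The correspondence just summarized can be verified on an abstract two-candidate competition. The sketch below is only an illustration under assumptions: the function names are hypothetical, and M3 is an invented higher-ranked markedness constraint that penalizes the output without [x] in some context.

```python
from itertools import permutations

OUTPUTS = ["+x", "-x"]


def violations(output, input_value=None, m3_active=False):
    """Violation profile of one output; F is only relevant when an input value is given."""
    return {
        "M1": int(output == "+x"),                 # M1 prohibits [x]
        "M2": int(output == "-x"),                 # M2 demands [x]
        "M3": int(m3_active and output == "-x"),   # hypothetical higher constraint
        "F": int(input_value is not None and output[0] != input_value),
    }


def optimal(ranking, **context):
    vectors = {o: tuple(violations(o, **context)[c] for c in ranking) for o in OUTPUTS}
    best = min(vectors.values())
    return {o for o in OUTPUTS if vectors[o] == best}


def global_tie(m3_active=False):
    """M3 >> (M1 tied with M2): union of the winners over both resolutions."""
    result = set()
    for order in permutations(["M1", "M2"]):
        result |= optimal(["M3", *order], m3_active=m3_active)
    return result


def neutralization(m3_active=False):
    """M3 >> F >> M1 >> M2: union of the winners over both input specifications."""
    result = set()
    for input_value in "+-":
        result |= optimal(["M3", "F", "M1", "M2"],
                          input_value=input_value, m3_active=m3_active)
    return result


print(global_tie(), neutralization())                              # optionality
print(global_tie(m3_active=True), neutralization(m3_active=True))  # breakdown
```

In this toy setting both approaches license the same set of outputs: two forms when M3 is inactive, and a single form when M3 decides the competition above the tie or above F. The sketch illustrates the claim for one configuration; it does not show that the translation holds in general.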
If it should indeed be the case that the global tie approach and the neutral-
ization approach can always be translated into each other without empirical
consequences (as suggested by the examples in this paper), 33 then it would be
preferable to dispense with one of the two approaches to avoid redundancy in
the grammar.
The question arises as to which approach should be dispensed with. As
the approaches seem to be empirically equivalent, I will list some more con-
ceptual and theory internal points below, both for and against each of the
approaches.
Note, however, that most of the above points show general properties of an
OT system (see, e.g., "richness of the base"). The neutralization approach
only makes us more conscious of them.
Global ties allow the presence of two (or more) grammars simultaneously.
One way to see this complexity as an advantage of the approach is that it may
reflect the property of instability that languages show in their development.
In studies of language change, it is not unusual to assume the simultaneous
presence of two or more grammars (see, e.g., Kroch 1989, Pintzuk 1991). The
following points, however, can be raised against the global tie approach:
— Global ties are problematic for learnability, see, e.g., Tesar (1998),
who proposes a learning algorithm that builds on a total ranking of
constraints. 34 Something else that may complicate language acquisition
is the increasing number of possible grammars. With three constraints,
there are 6 possible grammars if constraint ties are excluded and 19 if
they are allowed (see Vikner 1999).
Global Ties and Neutralization 313
8 Summary
At first sight, optionality poses a problem for OT. In the OT literature, how-
ever, several accounts of optionality can be found. What I wanted to do in
this paper was to compare two of these approaches that seem to be able to
cover the same kind of data, namely the global tie approach and the neutral-
ization approach. Where an account in terms of neutralization had already
been suggested in the literature, I developed an account in terms of global
ties and vice versa. Both approaches were applied to three different sets of
data: complementizer optionality in English, optionality of wh-movement in
French root questions and optionality of IPP constructions in German. The
result in all three cases was the same: The approaches are empirically equiv-
alent and can account for both optionality and the breakdown of optionality
in certain contexts. If two approaches can account for the same set of data,
one of them should be abandoned (for reasons of simplicity, elegance, "econ-
omy"). It was argued that the approach to be abandoned should be the global
tie approach. The neutralization approach can do the same and more (account
for ungrammaticality) without adding new mechanisms to the system. Both
accounts increase complexity, but only global ties result in different gram-
mars (i.e., rankings) for one and the same language and thus pose problems
for learnability.
314 Tanja Schmid
Notes
(ii) PR-BD: Proper Binding: *Every trace that c-commands its antecedent.
The ranking *Lx-Mv » PR-BD is responsible for the lack of V-to-I movement
in English (see Vikner 1999).
The ranking of the next constraint relative to, e.g., *EXP is responsible for
the (non-)occurrence of a complementizer:
11. See Prince & Smolensky (1993: 192) for a proposal of a mechanism to determine
the optimal input for a given output ("input optimization").
12. In their 1998 paper they eventually reject the global tie analysis in favour of an
approach of apparent optionality, i.e., they assume that the optimal candidates
belong to different candidate sets.
13. Cases with est-ce que are commonly assumed to behave differently. I will ignore
them here.
14. At the end of their 1998 paper Ackema & Neeleman also briefly mention option-
ality in colloquial French (i.e., optionality between (b) and (c)) and sketch an
account in terms of optional inclusion of a complementizer in the input which
can optionally remain unpronounced.
15. This constraint is called STAY by Ackema & Neeleman (1998).
16. Exactly how often a candidate violates SPC depends on specific assumptions
about sentence structure.
17. Ackema & Neeleman (1995) assume a global tie between SPC and Q-MARK. I
will differ from them to be able to account for the optimality of candidate (26-c)
with wh-movement but without movement of the auxiliary.
18. Note that different registers (standard, colloquial,...) can be thought of as being
different resolutions of a global tie (see, e.g., Sells et al. 1996). Under such a
view, optionality in French root questions (in both registers) could also be ac-
counted for by a tie of all three of the above constraints. For the sake of con-
venience, however, I will continue to show different tableaux for standard and
colloquial French.
19. As pointed out by a reviewer, this analysis cannot straightforwardly be extended
to multiple questions in Colloquial French.
20. Note that the local tie approach would not be sufficient in this case (assuming the
constraints above): Candidate (b) violates the tied constraints only once, while
candidate (c) violates them three times. Both candidates, however, emerge as
optimal.
21. In this case the separation of a syntactic [Q]-feature and a semantic [wh]-feature
must be assumed (for a discussion of the difference between these features, see,
e.g., Müller 1993: 302 and the references given there).
22. But cf. fn. 18: There might also exist a language with three (or more) winners
which could be accounted for by a tie of three (or more) constraints in the global
tie approach. In this case, an account in terms of neutralization would be more
difficult, at least if the binarity of formal input features should be maintained.
23. A few words are in order concerning the relation between constraints like SE-
LECT and FAITH[X] (more generally between selectional constraints and faith-
fulness constraints; see, e.g., the relation between "morphological selection" and
FAITH(PastP) in section 6).
These two types of constraints are not as similar as they may seem at first
sight. They may overlap, but they might contradict each other as well, depend-
ing on the input. They see the element/feature in question in relation to different
entities: Selectional constraints in relation to the selecting element, and faithful-
ness constraints in relation to the input. Selectional constraints are markedness
constraints in the sense that it can be decided whether a candidate fulfills or vio-
lates them without knowing the input. To decide, however, whether a faithfulness
constraint is violated, the input must be taken into consideration. I will assume
in the following that both constraint types are needed in the grammar. However,
the question remains as to what the exact relation between them is and where
selectional requirements fit into an OT system at all.
24. For the sake of simplicity, I have left out the candidates (e) to (h) of the former
tableaux as they will never be optimal.
25. See Bech (1983) for an early and thorough investigation of verbal complementa-
tion in German.
26. As pointed out in Schmid (1998, 1999), this constraint is in fact part of a whole
constraint subhierarchy sensitive to verb classes.
27. In this case, the domain D consists of a verbal head and its VP-complement.
28. C1 &_D C2 is equivalent to a logical disjunction (C1 ∨ C2 in a given domain D),
which is to be read as: C1 ∨ C2 is not violated iff C1 is not violated ∨ C2 is not
violated in D.
29. Note that "Local Conjunction" as defined in (54) is a recursive mechanism that
could in principle lead to a nonfinite number of constraints, which I do not con-
sider desirable. For my purposes here, however, it would suffice to see (53) as
one complex, universal constraint. The reason why I have formulated it as a con-
junction of two simplex constraints rather than as one complex constraint is that
it becomes more transparent: It is bad to have the wrong form or be in the wrong
place, but it is even worse to both have the wrong form and be in the wrong place.
30. That the bare infinitive and not the to-infinitive is used instead of the past par-
ticiple could be due to yet another constraint like, e.g., *STRUCTURE (under the
assumption that a bare infinitive has less structure than a to-infinitive). As all
non-selected to-infinitives are ruled out by this constraint, I have not included
them in the tableaux.
31. Note that with the introduction of FAITH(PASTP), the constraint MORPH is only
indirectly relevant, namely through Local Conjunction. For the sake of simplicity,
I will leave out the simplex constraint in the following tableaux.
32. Note, however, that the faithfulness constraint (FAITH[COMP]) could form a tie
with the conflicting markedness constraint (*EXP) under the assumption that
HAVE (CP) holds in the input, i.e., that a verb always selects a CP in the input.
Under this assumption, the input would be as relevant in the global tie approach
as in the neutralization approach.
33. Remember, however, that cases with three (or more) optimal candidates are more
difficult to account for in terms of neutralization. Nevertheless, a neutralization
account does not seem to be impossible if, e.g., formal input features are not
(always) assumed to be binary.
34. As pointed out by Tony Kroch (p.c.), it must be checked, however, if global ties
really turn out to be this problematic for the learning algorithm. It could be the
case that whenever the learner comes to a piece of data that contradicts an as-
sumed ranking, the contradicting ranking is stored as a different grammar. The
number of simultaneous grammars should be restricted nevertheless.
References
Tesar, Bruce
1998 Error-driven learning in Optimality Theory via efficient computation of
optimal forms. In: P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis and
D. Pesetsky (eds.) Is the Best Good Enough?, 421-435. Cambridge, MA:
MIT Press.
Vikner, Sten
1999 V-to-I movement and 'do'-insertion in Optimality Theory. Ms., Univer-
sität Stuttgart. To appear in: G. Legendre, J. Grimshaw and S. Vikner
(eds.) Optimality-Theoretic Syntax. Cambridge, MA: MIT Press.
The Interpretation of Object Shift and Optimality
Theory
Sten Vikner
1 Introduction
(1) Danish
a. *Hvorfor læste Peter aldrig den ?
b. Hvorfor læste Peter den aldrig t ?
why read Peter (it) never (it)
(2) Icelandic
a. *Af hverju las Pétur aldrei hana ?
b. Af hverju las Pétur hana aldrei t ?
why read Pétur (it) never (it)
In Icelandic, all DPs undergo object shift, whereas in the other Scandinavian
languages, only pronouns do:
(3) Danish
a. Hvorfor læste Peter aldrig den her bog ?
b. *Hvorfor læste Peter den her bog aldrig t ?
why read Peter (this book) never (this book)
(4) Icelandic
a. Af hverju las Pétur aldrei þessa bók ?
b. Af hverju las Pétur þessa bók aldrei t ?
why read Pétur (this book) never (this book)
The contrast between (1)/(2) and (3)/(4) shows that object shift of pronouns
is obligatory in both Danish and Icelandic, whereas object shift of full DPs is
only optional in Icelandic and downright impossible in Danish. Object shift
is only possible if the verb leaves VP, which a finite main verb does in main
clauses (which are V2, see (1)-(4)), but which a non-finite main verb never
does:
(5) Danish
a. Hvorfor har Peter aldrig læst den ?
b. *Hvorfor har Peter den aldrig læst t ?
why has Peter (it) never read (it)
(6) Icelandic
a. Af hverju hefur Pétur aldrei lesið þessa bók ?
b. *Af hverju hefur Pétur þessa bók aldrei lesið t ?
why has Pétur (this book) never read (this book)
Scrambling, an object movement very similar to object shift found in the con-
tinental West Germanic languages (cf. the papers in Grewendorf & Sternefeld
1990, Webelhuth 1992, Haider 1993, the papers in Corver & van Riemsdijk
1994, Müller 1995, Haider & Rosengren 1998, and references in all of these),
is not dependent on the position of the verb in this way:
(7) German
a. ...ob Peter nie dieses Buch liest ?
b. ... ob Peter dieses Buch nie t liest ?
if Peter (this book) never (this book) reads
(8) German
a. Warum liest Peter nie dieses Buch ?
b. Warum liest Peter dieses Buch nie t ?
why reads Peter (this book) never (this book)
(9) German
a. Warum hat Peter nie dieses Buch gelesen ?
b. Warum hat Peter dieses Buch nie t gelesen ?
why has Peter (this book) never (this book) read
Scrambling, too, becomes obligatory rather than optional when pronouns are
considered:
(10) German
a. *... ob Peter nie es liest ?
b. ...ob Peter es nie t liest ?
if Peter (it) never (it) reads
(11) German
a. *Warum liest Peter nie es ?
b. Warum liest Peter es nie t ?
why reads Peter (it) never (it)
(12) German
a. *Warum hat Peter nie es gelesen ?
b. Warum hat Peter es nie t gelesen ?
why has Peter (it) never (it) read
When pronouns are modified (e.g., we two, you and I, he with the red hair),
they behave like full DPs (cf. (3)-(4), (6), and (7)-(9)), and not like unmodified
pronouns (cf. (1)-(2), (5), and (10)-(12)), i.e. they do not undergo object
shift/scrambling in Danish and only optionally in Icelandic and German.
From the above, it might appear that (Icelandic) object shift and (German)
scrambling are completely optional, at least as far as non-pronouns are con-
cerned. This is not the case, however. As observed in Diesing & Jelinek
(1995:150) (from now on: D&J) and in Diesing (1996:79, 1997:418), the
interpretation of object-shifted objects in Icelandic differs from that of non-
object-shifted objects, and this difference is parallel to the difference in inter-
pretation between scrambled and non-scrambled objects in, e.g., German and
Yiddish (cf. Diesing 1992:129).
(13) German
... weil ich ...
... because I
a. ... selten die kleinste Katze streichle
b. ... die kleinste Katze selten streichle
(the smallest cat) seldom (the smallest cat) pet
(D&J: 130 (9-a), Diesing 1996:73 (17), 1997:379, (14-a))
(14) Icelandic
a. Hann les sjaldan lengstu bókina
b. Hann les lengstu bókina sjaldan
He reads (longest book-the) seldom (longest book-the)
(Diesing 1996:79 (32), 1997:418 (82))
follow from whether the object is inside the VP and thereby part of the "nu-
clear scope (the domain of existential closure)" or outside VP but inside IP
and thereby part of the "restriction (of an operator)". Diesing's observations
are supported by the following more extensive set of data:1
(15) German
a. In den Prüfungen beantwortet er selten die schwierigste Frage
in the exams answers he rarely the most-difficult question
b. In den Prüfungen beantwortet er die schwierigste Frage selten
in the exams answers he the most-difficult question rarely
(16) German
a. In den Prüfungen hat er selten die schwierigste Frage
in the exams has he rarely the most-difficult question
beantwortet
answered
b. In den Prüfungen hat er die schwierigste Frage selten
in the exams has he the most-difficult question rarely
beantwortet
answered
(17) Icelandic
a. Í prófunum svarar hann sjaldan erfiðustu spurningunni
in exams-the answers he rarely most-difficult question-the
b. Í prófunum svarar hann erfiðustu spurningunni sjaldan
in exams-the answers he most-difficult question-the rarely
There is one case which is not discussed by Diesing, namely, the context
in which object shift is not possible in Icelandic. In this context, only one
word order is possible, (18-a), and this word order has not just one of the two
interpretations discussed above, it actually has both interpretations:
(18) Icelandic
a. Í prófunum hefur hann sjaldan svarað erfiðustu
in exams-the has he rarely answered most-difficult
spurningunni
question-the
b. *Í prófunum hefur hann sjaldan erfiðustu spurningunni
in exams-the has he rarely most-difficult question-the
svarað
answered
c. *Í prófunum hefur hann erfiðustu spurningunni sjaldan
in exams-the has he most-difficult question-the rarely
svarað
answered
In optimality theory (cf., e.g., Prince & Smolensky 1993, Grimshaw 1993,
1997, Burzio 1995, Müller 1997, Archangeli & Langendoen 1997, Barbosa
et al. 1998), constraints are taken to be relative ("soft") rather than absolute
("hard"):
apply, except that they are ordered differently from language to lan-
guage (language variation is variation in the constraint hierarchy),
d. Only the optimal version of a sentence is grammatical; all non-
optimal versions are ungrammatical (the optimal version/candidate
of two is the one with the least violation of the highest constraint on
which the two versions/candidates differ).
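The evaluation procedure in (d) can be sketched as a pairwise comparison. The following is a minimal illustration and not part of the original text: the constraint names C1 and C2, the candidate names, and the violation counts are invented for the example.

```python
def better(a, b, ranking):
    """Compare two violation profiles under a ranked list of constraints.

    Returns "a" or "b" for the candidate with fewer violations of the highest
    constraint on which the two differ, or None if their profiles are equal.
    """
    for constraint in ranking:
        va, vb = a.get(constraint, 0), b.get(constraint, 0)
        if va != vb:
            return "a" if va < vb else "b"
    return None

def optimal(candidates, ranking):
    """Names of candidates that are not beaten by any competitor."""
    return [name for name, prof in candidates.items()
            if all(better(prof, other, ranking) != "b"
                   for other in candidates.values())]

# Hypothetical candidates: cand1 violates C1 once, cand2 violates C2 three times.
CANDIDATES = {"cand1": {"C1": 1}, "cand2": {"C2": 3}}

print(optimal(CANDIDATES, ["C1", "C2"]))  # C1 dominates: cand2 wins
print(optimal(CANDIDATES, ["C2", "C1"]))  # C2 dominates: cand1 wins
```

Reranking alone changes the winner, which is how variation between languages is modelled. Note also that three violations of the lower-ranked constraint do not outweigh a single violation of the higher-ranked one.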
Let us now return to the data discussed in sections 1 and 2. These data showed
that the interpretation of an object in Icelandic depends on whether or not it
has undergone object shift, in a completely parallel way to how the interpre-
tation of an object in German depends on whether or not it has undergone
scrambling. It is crucial, however, that whereas scrambling is never impossi-
ble in German, there are many contexts in Icelandic which do not allow object
shift. In those Icelandic sentences in which object shift is excluded, the non-
object-shifted object has two interpretations: It may be interpreted either as if
it preceded the adverbial or as if it followed it, and not just the latter.
This ambiguity is the reason why an Optimality Theory analysis is particu-
larly suitable here:
On the one hand, the generalisation seems to hold of most of the data that
the scope of objects and adverbials is read off their surface position (Diesing's
"Scoping Condition", 1996:70, 1997:375-76, see also D&J: 127), hence the
differences seen in, e.g., (17), between the non-object-shifted object in (17-
a), "he rarely answers whichever is the most difficult question in any given
exam", and the object-shifted object in (17-b), "there is a question more dif-
ficult than all others, and when he encounters this question, he rarely answers
it".
On the other hand, this generalisation clearly does not hold in constructions
which disallow object shift, like (18). The Scoping Condition would predict
that also in (18), a non-object-shifted object would only have one interpreta-
tion, i.e., that (18-a) could only be interpreted like (17-a) and not like (17-b)
(and also that the interpretation of (17-b) would only be available in sentences
where object shift was possible). This is not correct; (18-a) is ambiguous be-
tween the two interpretations. In other words, what matters is not just whether
the object has undergone object shift or not, but also whether it "could have
moved if it had wanted to." This can be accounted for in Optimality Theory
terms by saying that in Icelandic, the LICENSING constraint is ranked higher
than the SCOPING constraint. The idea is that the object in an object shift
construction is licensed both in its base position and in the object-shifted po-
sition, whereas in a non-object-shift construction, the object is only licensed
in its base position.
(21) Icelandic
C° I° V°
a. Í prófunum svarar_v hann t_v sjaldan t_v spurningunni
b. Í prófunum svarar_v hann t_v spurningunni_i sjaldan t_v t_i
in exams-the answers he (question) rarely (question)
[Diagram: two arrows, each labelled "Case"]
(22) a. LICENSING:
An object must be licensed by being c-commanded by its se-
lecting verb or the trace of this verb (this subsumes Shortest
Move/Equidistance/Case assignment because c-command is the
lowest common denominator of the various licensing mechanisms
discussed).
b. SCOPING:
An element has the (surface) position in the clause that corresponds
to its relative scope (based on Diesing 1992:10-12, 1996:70,
1997:375, D&J: 126; cf. the discussion above, and see also Bobaljik
1995:362).
c. STAY:
Movement should be avoided. This corresponds to Procrastinate
(movement should not take place before LF) and/or Economy of
Derivation (movement should not take place at all).
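The three constraints in (22) can be turned into a small evaluation sketch for Icelandic. This is only an illustration under explicit simplifications, none of which come from the text: each competition has just two candidates (object in situ vs. object shifted), movement of the verb is not counted under STAY, and the function and variable names are hypothetical. The ranking LICENSING » SCOPING » STAY is the one argued for Icelandic in this section.

```python
RANKING_ICELANDIC = ["LICENSING", "SCOPING", "STAY"]

def violations(shifted, verb_left_vp, wide_scope):
    """Violation profile of one candidate (simplified reading of (22))."""
    return {
        # A shifted object is c-commanded by the verb or its trace only if
        # the verb has left VP; an object in situ is always licensed.
        "LICENSING": int(shifted and not verb_left_vp),
        # Surface position must match scope: shifted = wide, in situ = narrow.
        "SCOPING": int(shifted != wide_scope),
        # Any movement of the object violates STAY.
        "STAY": int(shifted),
    }

def winner(ranking, verb_left_vp, wide_scope):
    """Return the optimal candidate for a given context and input scope."""
    candidates = {
        "in situ": violations(False, verb_left_vp, wide_scope),
        "shifted": violations(True, verb_left_vp, wide_scope),
    }
    return min(candidates,
               key=lambda name: tuple(candidates[name][c] for c in ranking))

for verb_left_vp in (True, False):
    for wide_scope in (False, True):
        scope = "wide" if wide_scope else "narrow"
        print(f"verb left VP: {verb_left_vp}, {scope} scope ->",
              winner(RANKING_ICELANDIC, verb_left_vp, wide_scope))
```

Under these assumptions the object shifts only when the verb has left VP and the input has wide scope; when the verb stays in VP, the in situ candidate wins for both inputs, which corresponds to the ambiguity of (18-a).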
Consider the analysis of input in which the object has narrow scope relative to
the adverbial (i.e., "he rarely answers whichever is the most difficult question
in any given exam"). First the case in which object shift is possible:
(24-a) is better than (24-b), because (24-a) does better than (24-b) on the
highest constraint on which the two differ, LICENSING, where (24-a) has no
violations and (24-b) one. The same goes for the comparison between (24-
a) and (24-c), and therefore (24-a) is better than either (24-b) or (24-c). The
result is that (23-a) and (24-a) are the optimal candidates and hence the only
grammatical versions of the sentences in question. However, this result could
be achieved in (almost) any theoretical framework, including ones with non-
violable constraints, as there is no conflict between the constraints here; the
winning candidates do not violate any constraints at all. In the derivations in
the following subsection, this is not the case: All candidates violate at least
one of the constraints.
If the input is such that the object has wide scope relative to the adverbial (i.e.,
"there is a question more difficult than all others, and when he encounters this
question, he rarely answers it"), the situation changes crucially:
(25-b) is better than (25-a), because (25-b) does better than (25-a) on the
highest constraint on which the two differ, SCOPING, where (25-b) has no
violations and (25-a) one. Given that (25-b) nevertheless violates a constraint,
namely STAY (= Procrastinate/Economy of Derivation), it has to be the case
not only that STAY is a violable constraint (as it is also in Minimalism; cf.
section 4 below), but also that STAY has lower priority than SCOPING, as can
be seen here, where the choice is between having to violate either SCOPING
or STAY. Let us now turn to the case in which object shift is not possible:
I want to suggest that the relevant difference between Icelandic and German,
i.e., the difference between object shift and scrambling, is that where
Icelandic has LICENSING ranked above SCOPING, German has SCOPING ranked
above LICENSING. This could reflect either that object LICENSING is less
necessary in German than in Icelandic, or that c-command is not a necessary
condition on object licensing in German. Consider first the narrow scope
cases in which the object has narrow scope relative to the adverbial (i.e., "he
rarely answers whichever is the most difficult question in any given exam"):
This is the unproblematic case; as in the Icelandic (23-a) and (24-a), the win-
ning candidates here, (27-a) and (28-a), violate no constraints. Consider now
the wide scope cases in which the object has wide scope relative to the ad-
verbial (i.e., "there is a question more difficult than all others, and when he
encounters this question, he rarely answers it"), where the situation changes
crucially:
Even though (29-b) violates STAY, because the object has undergone move-
ment, it is grammatical, because its competitor, (29-a), violates the higher
ranked SCOPING.
Even though (30-b) violates LICENSING, because the object is not c-com-
manded by the main verb, it is still grammatical, because its competitor,
(30-a), violates the higher ranked SCOPING. This section thus shows that
the three constraints, LICENSING, SCOPING, and STAY, defined as in (22),
have to be ranked as follows in German: SCOPING » LICENSING » STAY,
and that at least LICENSING and STAY have to be violable. The result of the
reranking of SCOPING and LICENSING (compared to Icelandic) is thus that
in cases of scrambling in German, SCOPING determines everything regard-
less of whether there is licensing via c-command. In other words, there is a
one-to-one correspondence between word order and interpretation in cases of
scrambling in German. That this is not necessarily the case outside scram-
bling contexts is shown in the next subsection.
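The effect of reranking SCOPING and LICENSING can be illustrated with a short sketch that compares the two rankings. As before, this is a simplified model rather than the analysis itself: only two candidates are considered per competition, the violation profiles are a reduced reading of (22), and all names in the code are hypothetical.

```python
GERMAN = ["SCOPING", "LICENSING", "STAY"]
ICELANDIC = ["LICENSING", "SCOPING", "STAY"]

def profile(moved, verb_left_vp, wide_scope):
    """Simplified violation profile for an object in situ or moved."""
    return {
        "LICENSING": int(moved and not verb_left_vp),
        "SCOPING": int(moved != wide_scope),
        "STAY": int(moved),
    }

def surface_order(ranking, verb_left_vp, wide_scope):
    """Return 'in situ' or 'moved', whichever is optimal under `ranking`."""
    candidates = {
        "in situ": profile(False, verb_left_vp, wide_scope),
        "moved": profile(True, verb_left_vp, wide_scope),
    }
    return min(candidates,
               key=lambda name: tuple(candidates[name][c] for c in ranking))

def readings(ranking, verb_left_vp):
    """Map each optimal surface order to the scope readings it expresses."""
    result = {}
    for wide_scope in (False, True):
        order = surface_order(ranking, verb_left_vp, wide_scope)
        result.setdefault(order, set()).add("wide" if wide_scope else "narrow")
    return result

# Context in which the main verb has not left VP (e.g., the perfect tense).
print("German:   ", readings(GERMAN, verb_left_vp=False))
print("Icelandic:", readings(ICELANDIC, verb_left_vp=False))
```

With SCOPING on top, each word order is paired with exactly one reading even when the verb stays in VP; with LICENSING on top, the in situ order ends up carrying both readings in that context.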
These two sentences are both ambiguous, i.e., both have both the reading of
(17-a)/(15-a), "he rarely answers whichever is the most difficult question in
any given exam", and the one of (17-b)/(15-b), "there is a question more dif-
ficult than all others, and when he encounters this question, he rarely answers
it". To account for (31) and (32), it suffices to assume (as I do in Vikner 1999)
that the topic (the object) is an operator, and that operators underlie a sepa-
rate constraint, OP-SPEC (Grimshaw 1997:377, Bakovic 1998:39), which re-
quires them to move to a specifier position (which for various reasons will
be CP-spec; see, e.g., Grimshaw 1997:377). OP-SPEC would then have to
be ranked above the other three constraints discussed so far. The effect of
OP-SPEC here is parallel to the effect of LICENSING in (24) and (26), i.e.,
regardless of whether the object has wide or narrow scope, OP-SPEC will let
(phonetically) identical candidates win in the two cases. Consider first the
tableaux for the Icelandic (31):
(33)
Input: object = topic,             OP-SPEC   LICENSING   SCOPING   STAY
narrow scope
a. * ... svarar hann sjaldan OBJ      *!
b. * ... svarar hann OBJ sjaldan      *!                     *        *
c. ☞ OBJ svarar hann sjaldan                      *          *        *
(34)
(35)
(36)
*
b. * ... beantwortet er OBJ selten *!
* *
c. ☞ OBJ beantwortet er selten
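The role of OP-SPEC in these tableaux can be sketched as follows. The violation profiles in the code are assumptions: they are inferred from the surviving rows of the tableaux and from the constraint definitions in (22), not copied from the original, and they presuppose that the finite verb has left VP. The candidate labels and function names are likewise hypothetical.

```python
ICELANDIC = ["OP-SPEC", "LICENSING", "SCOPING", "STAY"]
GERMAN = ["OP-SPEC", "SCOPING", "LICENSING", "STAY"]
POSITIONS = ["in situ", "shifted", "topicalized"]

def profile(position, wide_scope):
    """Assumed violations for a topic object in one of three positions."""
    return {
        # The topic is an operator and must sit in a specifier (CP-spec).
        "OP-SPEC": int(position != "topicalized"),
        # Assumption: in CP-spec the object is not c-commanded by the verb
        # or its trace; the other two positions are licensed here.
        "LICENSING": int(position == "topicalized"),
        # Narrow scope requires the in situ position; wide scope excludes it.
        "SCOPING": int((position == "in situ") == wide_scope),
        # Any movement of the object violates STAY.
        "STAY": int(position != "in situ"),
    }

def winner(ranking, wide_scope):
    """Return the optimal position of the topic object."""
    return min(POSITIONS,
               key=lambda pos: tuple(profile(pos, wide_scope)[c] for c in ranking))

for name, ranking in (("Icelandic", ICELANDIC), ("German", GERMAN)):
    for wide_scope in (False, True):
        scope = "wide" if wide_scope else "narrow"
        print(name, scope, "->", winner(ranking, wide_scope))
```

Because OP-SPEC outranks the other three constraints in both rankings, the same candidate wins for both inputs, so the topicalized word order is compatible with either scope reading.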
4 Conclusion
Within Optimality theory, it is not only possible but actually expected that
a constraint may override a second constraint and at the same time be over-
ridden itself by a third constraint. This paper has tried to show that such sit-
uations are found both in Icelandic and in German, where on the one hand,
SCOPING overrides Procrastinate/STAY, cf. the Icelandic (25) and the Ger-
man (29), and on the other hand, SCOPING is at the same time overridden
itself by Shortest Move/LICENSING in Icelandic, cf. (26), and by OP-SPEC
in German, cf. (35-c)/(36-c).
In other frameworks, e.g., the Minimalist Program, this is not straight-
forwardly possible, because there are only two levels, "Conditions on Con-
vergence" and "Economy Considerations", where the former are inherently
ranked above the latter.
According to Diesing (1997:422), a minimalist analysis (Chomsky 1993,
1995, Bobaljik & Jonas 1996 and Collins & Þráinsson 1996) regulates the
availability of object shift by means of Shortest Move, a rule which deter-
mines whether object shift is a possible movement. This is only the case if
the verb itself has moved, due to Equidistance (see, e.g., Chomsky 1993:17-
19 = 1995:184-186 and Bobaljik & Jonas 1996).
Shortest Move is a "Condition on Convergence" (Chomsky 1995:219-220),
i.e., if it is violated, the derivation will crash rather than converge. Procrasti-
nate, on the other hand, which is a generalisation that says that overt move-
ment (before Spell-Out, i.e., in the syntax) is more costly than covert move-
ment (after Spell-Out, i.e., at LF), is an "Economy Consideration", which
means that it can only select between different converging derivations, but
not cause a derivation to crash. This difference is important: If Procrasti-
nate were a condition on convergence, "there would never be any cases of
overt movement" (Diesing 1997:422). In terms of the present analysis, this
would correspond to STAY being inviolable, which is untenable, as discussed
in connection with (25) above. Given that clear cases of object shift do ex-
ist, Diesing (1997:422) concludes that it must be the case that "the Scoping
Condition is a condition on Convergence, which leads to the overriding of
Procrastinate". In terms of the present Optimality Theory analysis, this sim-
ply corresponds to SCOPING being higher ranked than STAY.
The difference between the Minimalist Program and Optimality Theory is
that if, in minimalist terms, the Scoping Condition is a condition on conver-
gence, the Scoping Condition itself may not be violated, as this would make
the derivation crash. However, as the discussion of the Icelandic (26) above
showed (see also the discussion of the German (35)), the Scoping Condition
must be a violable constraint, 2 otherwise a wide scope interpretation of the
object would only be possible in object shift constructions, which clearly is
not the case.
I thus hope to have shown that Optimality Theory allows a comprehen-
sive account of the interpretation of object shift (and of scrambling), which
includes aspects that would seem to be more difficult to account for within
other frameworks, e.g., the Minimalist Program. 3
Notes
This paper is partly based on work which was part of the project The Syntax Com-
panion at the Netherlands Institute of Advanced Study (NIAS), Royal Dutch Acad-
emy of Sciences, Wassenaar, The Netherlands, Spring 1997. A preliminary version
appeared in Working Papers in Scandinavian Syntax no. 60, pp. 1-24, December
1997. I would like to thank the following for their comments, criticism and/or native
speaker assistance: various anonymous reviewers, Kristján Árnason, Eric Bakovic,
Jonathan Bobaljik, Molly Diesing, Hubert Haider, Gunnar Hrafn Hrafnbjargarson,
Johannes Gísli Jónsson, Ed Keer, Almut Klepper-Schudlo, Gereon Müller, Peter Ohl,
Christer Platzack, Ian Roberts, Ramona Römisch-Vikner, Tanja Schmid, Halldór
Ármann Sigurðsson, Arnim von Stechow, Höskuldur Þráinsson, Carl Vikner, Angelika
Wöllstein-Leisten, and Heike Zinsmeister. I would also like to thank my Syntax
Companion colleagues and audiences at the 1st Workshop on Optimality Theory Syntax
in Stuttgart, November 1997, at the University of Iceland, March 1998, at the
Workshop on Competition in Syntax at the 21st Conference of the German Linguistics
Society in Constance, February 1999, and at the University of Lund, March 1999.
I am particularly grateful to Daniel Büring and Hans-Martin Gärtner for untangling
the different interpretations and interpretational differences for me.
1. A few remarks on the data and on the native speaker informants are in order.
According to Molly Diesing (p.c.), the informants for the data and the
interpretations given in (14) above were Johannes Gísli Jónsson, Sigríður Sigurjónsdóttir,
Halldór Ármann Sigurðsson, and Höskuldur Þráinsson. In an earlier version of
this paper, Vikner (1997), I focussed on the interpretation of indefinite objects,
but as was pointed out by Johannes Gísli Jónsson, examples with an object in
situ in the context where object shift is possible, like (21-a) in Vikner (1997),
are not completely unambiguous, as opposed to what I claimed there (a generic
reading of (21-a) in Vikner 1997 is not impossible, just dispreferred). In this
paper I therefore focus on definite superlative objects like the ones discussed in
Diesing (1996,1997). These works only discussed Icelandic data like (14) above,
where object shift is possible. The possible interpretations of data like (18) be-
low, in which object shift is not possible, were not mentioned. The informants
for the data discussed here include Kristján Árnason, Gunnar Hrafn Hrafnbjargarson,
Johannes Gísli Jónsson, and Halldór Ármann Sigurðsson. Whereas the
interpretation both of the object in the perfect case (where object shift is impos-
sible), (18-a) below, and of the object in the object-shifted case, (17-b) below, is
completely uncontroversial, the interpretation of the object in situ in the context
where object shift is possible remains controversial, in so far as (17-a) below is
found to be ambiguous by one of my four informants, Johannes Gísli Jónsson.
In the text, I shall nevertheless assume that (17-a) is unambiguous, following the
other three informants and following Diesing (1996, 1997).
2. Notice that this argument is still valid even if what were considered above to be
cases of non-shifted objects should turn out to be objects which only undergo
object shift after Spell-Out (at LF). Object shift would then always take place,
and the only variation would be whether it takes place before or after Spell-Out.
The Scoping Condition ("the scope of objects is read off their surface position,"
(22) above) might then have to be made more explicit, e.g., "the scope of objects
is read off their position at Spell-Out," but it would still have to be violable;
cf. the discussion of (18-a)/(26) above. If the Scoping Condition is not taken to
apply to surface positions/positions at Spell-Out, but instead to positions at LF,
and if object shift is assumed to vary only with respect to when it applies and not
with respect to whether it applies or not, the prediction would be that all objects
would receive (only) wide scope interpretations, a prediction which is clearly
undesirable.
Hornstein's (1995) analysis of scope ambiguities offers a way of account-
ing for the ambiguity of (18-a)/(26) while maintaining that object shift always
applies (at the latest at LF), but this not only requires sacrificing the Scoping
Condition, but also makes incorrect predictions for the non-ambiguous cases.
Hornstein (1995: 154) assumes that scope ambiguities arise as follows: What de-
termines scope is the relative position of the scope taking elements at LF, but
the position of an element which counts for scope may be any of the positions
in the chain of that element. The ambiguity of (18-a), a non-shifted object in a
construction in which object shift cannot apply, could thus be accounted for if
object shift is assumed to apply at LF iff it does not apply before Spell-Out: The
reading of (18-a) in which the object has narrow scope arises by having the pre-
object shift position of the object count for scope, and the reading in which the
object has wide scope arises by having the post-object shift position count. The
problem is that this account would incorrectly make exactly the parallel predic-
tions for object-shifted objects, as in (17-b). This should be ambiguous as well:
Object shift has applied, and scope may now be determined by either the non-
shifted or the shifted position. But (17-b) is not ambiguous; the object can only
have wide scope, not narrow scope. Also non-shifted objects in constructions in
which object shift could have applied, e.g. (17-a), would incorrectly be predicted
to be ambiguous in a parallel way, although here the object can only have narrow
scope, not wide scope (though see the remarks in note 1).
3. Admittedly, there are also at least two versions of minimalism that would seem
to have more in common with Optimality Theory than "standard" minimalism
does, in that they allow ranked and violable constraints: Bobaljik (1995: 351),
which like this article is an attempt to formulate Diesing's Scoping Condition as
a violable constraint, and Holmberg (1997: 214). However, ranked and violable
constraints are left out in a more recent version of the latter work, Holmberg
(1999). As for other OT analyses of object shift, see also Müller (1998), which
focusses on multiple object shift in double object constructions.
References
Holmberg, Anders
1999 Remarks on Holmberg's generalization. Studia Linguistica 53(1): 1-39.
Holmberg, Anders — Christer Platzack
1995 The Role of Inflection in Scandinavian Syntax. New York: Oxford Uni-
versity Press.
Hornstein, Norbert
1995 Logical Form: From GB to Minimalism. Oxford: Blackwell.
Josefsson, Gunlög
1992 Object shift and weak pronominals in Swedish. Working Papers in Scan-
dinavian Syntax 49: 59-94.
Josefsson, Gunlög
1993 Scandinavian pronouns and object shift. Working Papers in Scandinavian
Syntax 52: 1-28.
Müller, Gereon
1995 A-bar Syntax: A Study in Movement Types. Berlin: Mouton de Gruyter.
Müller, Gereon
1997 Partial wh-movement and optimality theory. The Linguistic Review
14: 249-306.
Müller, Gereon
1998 Order preservation, parallel movement, and the emergence of the un-
marked. To appear in: Géraldine Legendre, Jane Grimshaw and Sten
Vikner (eds.) Optimality-Theoretic Syntax. Cambridge, MA: MIT Press.
Prince, Alan — Paul Smolensky
1993 Optimality Theory: Constraint Interaction in Generative Grammar.
Technical Report 2, Rutgers University Center for Cognitive Science.
Vikner, Sten
1989 Object shift and double objects in Danish. Working Papers in Scandina-
vian Syntax 44: 141-155.
Vikner, Sten
1994 Scandinavian object shift and West Germanic scrambling. In: Norbert
Corver and Henk van Riemsdijk (eds.) Studies on Scrambling, 487-517.
Berlin: Mouton de Gruyter.
Vikner, Sten
1997 The Interpretation of object shift, optimality theory, and minimalism.
Working Papers in Scandinavian Syntax 60: 1-24.
Vikner, Sten
1999 V°-to-I° movement and 'do'-insertion in optimality theory. To appear in:
Géraldine Legendre, Jane Grimshaw and Sten Vikner (eds.) Optimality-
Theoretic Syntax, Cambridge, MA: MIT Press.
Webelhuth, Gert
1992 Principles and Parameters of Syntactic Saturation. New York: Oxford
University Press.
Case Conflict in German Free Relative Constructions:
An Optimality Theoretic Treatment
Ralf Vogel
Languages differ as to how big a case conflict must be in a free relative (FR)
construction to cause ungrammaticality. While English requires true catego-
rial matching, German allows the suppression of structural cases if assigned
by the matrix verb. There are also different types of non-matching languages.
Paradigmatic examples are Gothic and Modern Greek. Earlier generative syn-
tactic accounts mainly propose a distinction only between matching and non-
matching languages. This is not fine-grained enough to capture the typolog-
ical findings. An optimality theoretic treatment permits a richer, but not in-
finite, typology, and it allows constraint violation (which obviously happens
in FR constructions). The proposed account makes use of the optimality the-
oretic conception of correspondence. The assumed constraints are on input-
output correspondence (input-LF as well as input-PF), and also on PF-LF
correspondence.
(1) [ Wer sich mit freien Relativsätzen beschäftigt], verwendet, [was von
ihm dafür gehalten wird], [sooft er kann], [wann immer sich ihm
dafür eine Gelegenheit bietet] und [wo immer er sich befindet],
'Whoever deals with free relative clauses, uses what he considers
to be one, as often as he can, whenever he has the opportunity and
wherever he is.'
This paper will not deal with adverbial clauses, such as the last three high-
lighted clauses in (1). I will concentrate on free relatives (FRs) that realise
an argument of a verb. And here again I will restrict myself mainly to case-
marked complements, and widely ignore prepositional phrases. An important
phenomenon is the so-called matching effect (Bresnan & Grimshaw 1978):
In many languages the relative pronoun of the FR construction has to fulfil
both the requirements of the FR internal verb, and those of the matrix verb. In
German, the relative pronoun has to appear in the dative case if the FR stands
for a dative argument of the matrix verb:
Both folgen and vertrauen assign dative to their object in (2-a); the relative
pronoun matches these requirements and the clause is fine. This is not the
case in (2-b), because the FR internal verb bewundern assigns accusative.
Whichever of the two cases is chosen (i.e., dative or accusative) for the rel-
ative pronoun, the result is ill-formed. Examples like this seem to have led
many researchers (starting with Groos & van Riemsdijk 1981) to conclude
that German FRs require matching in general. As already observed by Pittner
(1991, 1995, 1996), this is not the case. 1 The empirical generalisations about
German FR constructions seem to me to be the following:
The following examples show that matrix nominative need not occur on the
FR pronoun:
There is a controversy about data in which the relative pronoun carries nom-
inative. For some speakers, the example in (6) is ill-formed (cf. Pittner 1991,
1995): 3
As far as I can see, there is a real disagreement among German native speak-
ers about data like this. I found, however, that parallel examples like the fol-
lowing are easier to judge as well-formed:
But even those speakers who do not accept (6) and (7) accept (5-a). This
must be due to the fact that the wh-pronoun for inanimates, was, is the same
for both nominative and accusative. 4 This shows that the matching effect
is not about a syntactic feature like "abstract case", but about the morpho-
phonological "identity" of elements with not necessarily identical syntactic
features. 5
For those speakers who accept (6) and the examples in (7), the patterns
for matrix nominative and accusative are alike. This variant of German will
be referred to as "German A", and the one for which (6) and (7) are odd as
"German B".
This is the only case in which the two variants of German seem to differ
in their judgements of FRs. 6 If the case "assigned" by the matrix verb to the
free relative construction is an oblique case like dative 7 or genitive, then the
relative pronoun has to appear in that case, and the verb inside the free relative
also has to assign that case to the pronoun: 8
In these cases, the pronoun must fulfil the case requirements of the two differ-
ent verbs. This "matching effect" is the most spectacular finding about free
relatives, and much of the work that has been done on them in generative
grammar addresses the question of how to derive it.
The next section discusses the syntactic properties of FR constructions and
various attempts to deal with them in generative syntax. None of the proposals
suggested so far can derive the full range of typological variation in the way
different grammars handle the case conflict that occurs in FR constructions.
Most approaches predict that there are only matching and non-matching lan-
guages, but not that there are languages like German, that have non-matching
with matrix structural case and matching with matrix oblique case, or lan-
guages with the pattern of Modern Greek.
In the third and fourth sections, I will develop an optimality theoretic ac-
count of the case conflict in German FR constructions. We will also see that
it can deal in a much better way with typological variation.
Two observations are crucial for the syntactic analysis of argument FRs:
With respect to (10-I.), FRs behave like NPs or DPs, but with respect to
(10-II.), they behave like CPs. The task for the syntactic analysis is to bring
these two apparently contradictory observations together.
FRs differ from "normal" relative clauses in that they do not seem to be
"headed", i.e., they do not seem to have an antecedent, as opposed to ordinary
relative clauses:
The two relative constructions in (11) use different pronouns. While the rela-
tive clause uses the ordinary d-pronoun as relative pronoun, the FR uses the
wh-pronoun. It is mostly impossible to use the wh-pronoun as relative pro-
noun in German:
TYPE               wh   REL
wh-pronoun         +    -
relative pronoun   -    +
FR pronoun         +    +
The pronoun of the relative clause is sensitive to the case assigning properties
of both the matrix verb and the relative clause internal verb. This can again
be seen very clearly in Modern Greek (Alexiadou & Varlokosta 1995: 12f.):
With the FR in postverbal position, the FR pronoun has to carry the matrix
case, while both cases are possible when the FR is fronted. The relative pro-
noun of FR constructions is in a case conflict in postverbal position. The
structural configuration of this conflict can be represented in the following
way:
CASE1 is the case assigned by the matrix verb to the FR and CASE2 is the
case assigned by the verb inside the FR to the trace of the relative pronoun,
t2. Under the assumption that CASE1 percolates from the top node of the
FR (which kind of category it actually is will be discussed below) to the FR
pronoun, the pronoun is case marked twice. This is indicated by the numerical
subscripts on the node REL-PRON. Examples like those in (17) show that the
pronoun can realise either of the two cases. But that configurations like (18)
need not necessarily lead to ungrammaticality, even if CASE1 and CASE2
differ, is already quite an unexpected fact: A single pronoun can carry only
one case feature; the other case feature is not assigned, or at least not realised,
and because of this, FR constructions should be defective. Obviously, they are
not.
The grammar of Modern Greek handles this case conflict by suppressing the
case assigned to the pronoun inside the FR for FRs in postverbal position. 11
This is the opposite of what happens in German, where the relative pronoun
always has to carry the case assigned to the wh-pronoun by the FR internal
verb. The phenomenon that relative pronouns realise a case from "outside"
of the relative clause is quite frequent in ancient languages like Gothic and
Ancient Greek (cf. Harbert 1983). It has been termed "case attraction".
Recall that German requires matching as soon as the matrix case assigned
to the FR is oblique. One obvious explanation for this could be that only
structural case may be "recoverable" if "suppressed", but not oblique case.
What, then, happens in Modern Greek if the "suppressed" case is an oblique
one? Again, we observe something that differs from the usual pattern (cf.
Alexiadou & Varlokosta 1995: 13):
As usual, the pronoun carries the accusative assigned by the matrix verb, but
in addition to that we have a pronominal clitic following the relative pronoun
inside the FR that realises the genitive case assigned by the FR internal verb.
The clitic can be seen as a resumptive element spelling out a trace of the
relative pronoun in the sense of Pesetsky (1998).
Modern Greek has found a way to escape the case conflict by realising both
case forms within the chain of the relative pronoun, thereby violating the
restriction that a single chain should bear only one case feature.
A third type of non-matching language is Gothic, where, according to
Harbert (1983), both attraction and non-attraction were possible ways of han-
dling the case conflict. In this language, the relative pronoun systematically
chooses the case form that is higher on the case hierarchy:
... the two types of free relative are in complementary distribution, the
choice between them being determined by the relationship between the
case appropriate to the matrix clause role of the construction and the case
appropriate to the role of the missing argument in the lower clause. When
the matrix case is to the right of the lower clause case on a hierarchy
of the form Nom-Acc-{Dat, Gen} it prevails [(20-a), attraction]. When it is to
the left of the lower clause case the lower clause case prevails [(20-b),
non-attraction]. (Harbert 1983: 249)
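The choice described in the quoted passage can be sketched as a small function. This is only an illustration under stated assumptions: the numeric ranks are invented (only their relative order matters), dative and genitive are assumed to share the rightmost position on the hierarchy, and the function name `gothic_fr_case` is hypothetical. The passage does not say what happens when two distinct cases occupy the same position, so that situation is left undetermined.

```python
# Illustrative ranks on the case hierarchy. The numbers are assumptions;
# only their relative order matters. DAT and GEN are assumed to share
# the rightmost (most oblique) position.
CASE_RANK = {"NOM": 0, "ACC": 1, "DAT": 2, "GEN": 2}


def gothic_fr_case(matrix_case, lower_case):
    """Return (surface case of the FR pronoun, label) for a Gothic-type pattern."""
    m = CASE_RANK[matrix_case]
    r = CASE_RANK[lower_case]
    if m > r:
        # Matrix case is further to the right on the hierarchy: it prevails.
        return matrix_case, "attraction"
    if m < r:
        # Matrix case is further to the left: the lower clause case prevails.
        return lower_case, "non-attraction"
    if matrix_case == lower_case:
        # Both verbs require the same case, so there is no conflict.
        return lower_case, "matching"
    # Two distinct cases on the same tier: not covered by the quoted passage.
    return None, "undetermined"


print(gothic_fr_case("DAT", "ACC"))
print(gothic_fr_case("NOM", "ACC"))
```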
Three different kinds of proposals for the structure of FR clauses can be dis-
tinguished:
The first proposal was made by Bresnan & Grimshaw (1978) and the second
by Groos & van Riemsdijk (1981), the latter being a reply to the former. A
variant of the second proposal that includes a treatment of case attraction was
developed by Harbert (1983). The third analysis was proposed more recently
by Rooryck (1994).
Rooryck shows that both DP-CP accounts face empirical problems. The
structure under I. cannot deal very well with many of the extraposition prop-
erties of relative clauses in German and Dutch (this was first shown by Groos
& van Riemsdijk), and the structure under II. wrongly predicts subjacency vi-
olations in cases of extraction out of the specifier of the CP. The usual syntac-
tic tests show quite clearly that FRs behave like ordinary subordinate clauses
(i.e., like CPs). The DP-CP proposals also require some construction specific
stipulations and mechanisms in order to work. 12
I will adopt several insights from the discussed approaches. These are basi-
cally the assumption that FRs are CPs (cf. Rooryck 1994), that case attraction
is a PF phenomenon (cf. Harbert 1983) and that the case hierarchy plays a cru-
cial role in many languages (cf. Bresnan & Grimshaw 1978, Harbert 1983,
Pittner 1991). But the basic account should be rephrasable with different syn-
tactic analyses.
There are some further reasons why I do not make use of one of the two
proposals that claim that FRs are "headed": 13
Finnish FRs seem to resemble German ones in that the FR pronoun always
takes the case assigned by the FR internal verb. Finnish can deal with some in-
stances of case conflict: (21) shows that a matrix partitive may be suppressed
if the embedded case is elative, but not vice versa. Bresnan & Grimshaw
(1978: 374) cite Carlson as follows:
Carlson suggests that nominative (the case of subjects and objects of im-
personal constructions), accusative, and partitive (the cases of objects of
transitive verbs) are unmarked cases in Finnish; the case of a free relative
may disagree with that of its head only when the relative has unmarked
case; and the head must agree in case with the subordinate verb that gov-
erns its trace.
Pittner (1991) claims that the following rule holds in German FR construc-
tions:
I do not think that dative and PPs should be grouped together on the case
hierarchy in general, but for German it does not seem to make an empirical
difference.
Harbert (1983) proposed a similar case hierarchy for Gothic, as discussed in
connection with the data in (20). Hierarchies of all kinds of features are quite
common in optimality theoretic models. We could, e.g., develop a constraint
family REALISE<case>, where "<case>" stands for the different cases.
The usual hierarchical ordering of these constraints in the OT model gives us
the implementation of the case hierarchy:
REALISEpp » REALISEdat » REALISEacc » REALISEnom
This hierarchy should, of course, be universally fixed and not freely rerank-
able.
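What a universally fixed subhierarchy means for reranking can be illustrated with a short check. This is a sketch, not part of the proposal itself: the helper `respects_fixed_hierarchy` and the placeholder constraint name `OTHER` are hypothetical. A language-particular ranking is admitted only if the REALISE constraints appear in the fixed relative order, while other constraints may be interleaved freely.

```python
# The fixed subhierarchy from the text, highest-ranked first.
FIXED = ["REALISEpp", "REALISEdat", "REALISEacc", "REALISEnom"]


def respects_fixed_hierarchy(ranking):
    """True if the REALISE constraints occur in `ranking` in the fixed order.

    Constraints outside the REALISE family may stand anywhere; REALISE
    constraints that are absent from `ranking` are ignored.
    """
    positions = [ranking.index(c) for c in FIXED if c in ranking]
    return positions == sorted(positions)


# "OTHER" stands for any constraint outside the REALISE family.
print(respects_fixed_hierarchy(["REALISEdat", "OTHER", "REALISEacc", "REALISEnom"]))
print(respects_fixed_hierarchy(["REALISEnom", "OTHER", "REALISEacc"]))
```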
I will make use of Harbert's (1983) intuition that case attraction is a PF-
related phenomenon. In such cases, we obviously have a mismatch between
the syntactic case feature of the FR pronoun (assigned by the FR internal
verb) and its morpho-phonological case feature ("assigned" by the matrix
verb). OT can deal with such rule violations. Whether such a candidate wins
depends on the system of constraints and their ranking.
What is particularly interesting is that the formulation of the constraint
makes reference to the notion of correspondence, which has become a fruit-
ful topic of discussion in Optimality Theory. The correspondence here is that
Input: Io
— GEN produces the candidate set on the basis of the input: "[GEN]
... generates all extended projections that conform to X-bar theory,
that is, in which all projections are of the right basic structure."
(Grimshaw 1997a: 376). I further assume that GEN can manipulate
the functional categories of the input. Contrary to Keer & Bakovic
(forthcoming), I do not assume that GEN can perform manipula-
tions on the values of features. I assume that GEN can only manip-
ulate the distribution of features within the clause, and thereby add
or erase functional projections. The motivation for this move will
become clear in the next section.
— The Candidate set C is the set of possible output candidates, gen-
erated by GEN: (LF, PF) pairs that conform to universal well-
formedness rules.
— EVAL is an evaluation algorithm based on the particular ranking R
of the set of universal constraints CON. EVAL compares the candi-
dates in C and chooses the best competitor as output.
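The way EVAL picks a winner under strict domination can be sketched in a few lines. The following is a minimal illustration, assuming that each candidate is represented by its violation counts. The function name `eval_ot` and the toy tableau (candidates `cand1` to `cand3`, constraints `C1` to `C3`) are invented for the example and do not come from the analysis.

```python
def eval_ot(candidates, ranking):
    """Return the optimal candidate(s) under a strict constraint ranking.

    candidates: dict mapping a candidate name to a dict of violation counts
                (constraints that are not listed count as zero violations).
    ranking:    list of constraint names, highest-ranked first.
    """
    survivors = list(candidates)
    for constraint in ranking:
        # Keep only the candidates with the fewest violations of this constraint.
        fewest = min(candidates[c].get(constraint, 0) for c in survivors)
        survivors = [c for c in survivors
                     if candidates[c].get(constraint, 0) == fewest]
        if len(survivors) == 1:
            break  # lower-ranked constraints can no longer change the outcome
    return survivors


# Hypothetical toy tableau; names and counts are invented for illustration.
toy = {
    "cand1": {"C1": 1},
    "cand2": {"C2": 2},
    "cand3": {"C2": 1, "C3": 1},
}
print(eval_ot(toy, ["C1", "C2", "C3"]))
print(eval_ot(toy, ["C2", "C3", "C1"]))
```

Reranking the same constraints changes the winner, which is the sense in which grammars differ only in the ranking R.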
(22) a. MAX: Every segment in the input has a correspondent in the output.
b. DEP: Every segment in the output has a correspondent in the input.
The scheme in (23) illustrates the differences between these constraint fami-
lies. A MAX constraint that applies to two elements in the input can be satis-
fied by a single element in the output. The opposite holds for DEP constraints.
It is also less important whether segments are in the right order.
(23) MAX
input F F F X
output F X F
Thus, DEP and MAX constraints allow, say, "weak" unfaithfulness. Consider
the matching effect:
Strictly speaking, we have two dative case features in (24-a), one assigned
by the matrix verb, and the other assigned by the FR internal verb. But nei-
ther MAXdat nor DEPdat is violated. DEPdat requires that for each dative
feature in the output there is (at least) one in the input. This is the case; we
have even more than one, but this is irrelevant. MAXdat requires that for each
dative feature in the input there is one in the output. Again, this is the case
- though it is the same dative feature of the output that corresponds to both
dative features of the input.
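The reasoning about MAXdat and DEPdat can be made concrete with a small sketch. It rests on a simplifying assumption: correspondence is modelled purely as identity of the case feature, so one output token can serve as correspondent for several input tokens. The function names `max_violations` and `dep_violations` are hypothetical.

```python
def max_violations(input_feats, output_feats, feature):
    """MAX-style check: every `feature` token in the input needs some
    correspondent in the output. One output token may correspond to
    several input tokens."""
    n_in = input_feats.count(feature)
    if n_in == 0 or feature in output_feats:
        return 0
    return n_in


def dep_violations(input_feats, output_feats, feature):
    """DEP-style check: every `feature` token in the output needs some
    correspondent in the input."""
    n_out = output_feats.count(feature)
    if n_out == 0 or feature in input_feats:
        return 0
    return n_out


# Matching: both verbs assign dative, and a single pronoun realises it.
matching_in, matching_out = ["DAT", "DAT"], ["DAT"]
print(max_violations(matching_in, matching_out, "DAT"))  # 0
print(dep_violations(matching_in, matching_out, "DAT"))  # 0

# Conflict: matrix dative is suppressed, only accusative is realised.
conflict_in, conflict_out = ["DAT", "ACC"], ["ACC"]
print(max_violations(conflict_in, conflict_out, "DAT"))  # 1
```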
This might help to explain the following problematic example discussed in
Pittner (1995) and Leirbukt (1995):
(25) a. Sagen Sie das bitte Frau Schwarzkopf, Herrn Müller, Herrn
tell you that please Mrs. S.-DAT Mr. M.-DAT Mr.
Schmidt und wen sie sonst noch treffen
S.-DAT and who-ACC you else yet meet
'Please tell that to Mrs. S., Mr. M., Mr. S. and whoever else you
might meet.'
b. *Sagen Sie das bitte, wen Sie sonst noch treffen
tell you that please who-ACC you else yet meet
(26) *Sagen Sie das bitte, wen Sie treffen und Frau Schwarzkopf
tell you that please who-ACC you meet and Mrs. S.
3.2 Ungrammaticality
There are several ways to escape this problem. The way that I am choos-
ing here is using not only FR constructions among the candidates, but also
ordinary headed relative constructions. These are not input-faithful, but will
sometimes win, because they do not violate some crucial constraints. That is
to say, there is always a candidate like (28) among the candidates for a FR
construction. And in this case, (28) should even turn out to be the optimal
candidate. 15
It must be possible for GEN to generate an output like (28) from an input
of the form of a FR construction. For this to be possible, I assume that GEN
                   [REF]   [REL]
pronoun            +       -
relative pronoun   -       +
FR pronoun         +       +
In the input of a FR construction, the two features are joined under one func-
tional head. I assume that it is possible for GEN to split this feature bundle
and distribute the features over several functional projections, and thereby
introduce functional projections that were not present in the input. 18
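(30) [Tree diagrams: an input structure with [+REF] and [+REL] bundled on one functional head, and an output structure with [+REF] at the DP level and [+REL] inside the CP]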
In (30), the output structure 2 differs in three respects from the input structure:
There is one additional functional projection that was not in the input, there
is an additional functional head D°, and the features of the D° in the spec-
ifier of the CP differ from its correspondent in the input structure, and vice
versa. I want to summarise these cases as subcases of the same constraint,
FAITHfunc. In addition to the mentioned input-LF faithfulness, FAITHfunc
will also include instances of LF-PF correspondence, namely the occurrence
of resumptive clitic pronouns in cases like (19), repeated here:
(32) FAITHfunc:
Each functional feature bundle in the input has a corresponding
functional head with the same feature specification and vice versa;
and a chain of a functional category has exactly one PF correspon-
dent.
FAITHfunc is violated if
There are three mismatches between the input structure and the output LF 2
in (30), hence three violations of FAITHfunc:
The FR structure with the additional resumptive pronoun has one FAITHfunc
violation, because the resumptive pronoun introduces a second PF correspon-
dent of the same functional LF-chain.
The second constraint that I am assuming is on PF-LF correspondence and
requires that the morpho-phonological case form of an element P of an output
PF must correspond to the morpho-syntactic case feature of the LF-chain
corresponding to P:
(33) MATCH:
The morpho-phonological case feature of a PF element X may not
contradict the syntactic case feature of the chain of XP, the corre-
spondent of X in the output LF. 20
(35) UNIcase:
"Uniqueness of case relations". For each case required by each
case assigner there is exactly one XP that stands in the appropri-
ate structural position for case assignment and realises (morpho-
phonologically) the required case feature. XPs may realise the case
of at most one case assigner.
The candidates in (37) are the candidates relevant in the competition. 22 (37-a)
is an instance of case attraction, as exemplified by Gothic. It leaves one case
unrealised, so there is one UNIcase violation, and we have one MATCH vio-
lation, because of the attraction structure. Candidate (37-b) is a German-type
FR clause with the wh-pronoun bearing r-case. There is just one UNIcase
violation. Candidate (37-c) has attraction and in addition a resumptive pro-
noun spelling out the trace, as exemplified by some Modern Greek FRs. We
have one MATCH violation for attraction and one FAITHfunc violation for
the clitic. Candidate (37-d) is the headed relative construction. It has three
FAITHfunc violations for the change in the functional categories and their
feature distribution.
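How reranking selects among these four candidates can be sketched by running through all orders of the three constraints mentioned so far. This is a reduced illustration only: it uses just the violation counts stated above for (37-a) to (37-d), it leaves out the case-hierarchy constraints, and the helper name `winner` and the candidate labels are hypothetical.

```python
from itertools import permutations

# Violation counts as stated above for candidates (37-a) to (37-d).
CANDIDATES = {
    "a: attraction":              {"MATCH": 1, "UNIcase": 1},
    "b: FR with r-case":          {"UNIcase": 1},
    "c: attraction + resumptive": {"MATCH": 1, "FAITHfunc": 1},
    "d: headed relative":         {"FAITHfunc": 3},
}


def winner(candidates, ranking):
    """Pick the candidate whose violation vector, read in ranking order,
    is lexicographically smallest (strict domination)."""
    return min(candidates,
               key=lambda c: [candidates[c].get(k, 0) for k in ranking])


for ranking in permutations(["MATCH", "UNIcase", "FAITHfunc"]):
    print(" » ".join(ranking), "->", winner(CANDIDATES, ranking))
```

With only these three constraints, candidate a is never selected, because candidate b incurs a proper subset of its violations. Whether attraction can win therefore depends on the constraints that implement the case hierarchy.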
We already see from this tableau that different rankings of the constraints
will result in different winners. The case hierarchy will be implemented by a
series of input-PF constraints, resembling MAX constraints:
(38) REALcase_n:
Each XP chain with a syntactic case feature [+CASE_n] at the out-
put LF has a morpho-phonological case feature of [+CASE_n] on a
corresponding element of XP at the output PF.
m-case=DAT
r-case=ACC              REALdat   REALacc
a. [CP dat                        *
b. [CP acc              *
c. [CP dat ... acc
d. [DP dat [CP acc
(40)
m-case=NOM
r-case=ACC MAT Uc Ff REALd REALa REALn
a. [cp nom * * *
b. [cp acc * *
c. [cp nom ... acc * *
d. [CP acc ... nom * ** * *
e. [DP nom [cp acc ***
f. [DP nom [CP nom ... acc * * ****
g. [DP acc [CP nom ** **** *** * *
Note that (40-a) can never win over (40-b), because REALnom, which is
violated by (40-b), must never be higher than REALacc, which is violated by
(40-a). Thus, a candidate of type (40-a) can only win if m-case is higher
than r-case on the case hierarchy. This is exactly what has been found in
Gothic FRs.
The constraint ranking that yields the pattern for "German A" (cf. page 345)
is given in (41); the constraint for the oblique cases dative and genitive is
abbreviated with REALobl: 24
The top ranking of MATCH ensures that German FRs have no case attraction,
and the ranking of FAITHfunc above UNIcase ensures that German has FR
constructions under case conflict. That REALoblique is higher than FAITH-
func has the effect that FRs that suppress oblique cases are ungrammatical
(i.e., lose against the "headed" relative construction). The following sections
show this in a little more detail.
m-case=NOM
r-case=NOM MAT REALo Ff Uc REALa REALn
☞ a. [CP nom *
Examples with the inanimate FR pronoun was, which realises both nominative
and accusative, look like cases of matching. The PF form is abbreviated with
"n/a" in the candidate set.
m-case=NOM
r-case=ACC MAT REALo Ff Uc REALa REALn
☞ a. [CP n/a *
b. [CP — ACC *!
c. [DP nom [cp acc
m-case=NOM
r-case=ACC MAT REALo Ff Uc REALa REALn
☞ a. [CP acc * *
b. [CP nom * *
c. [CP nom ... acc *! *
d. [DP nom [CP acc
Nearly the same will happen if r-case is an oblique case. The only dif-
ference will be that candidate b. will have a violation of REALobl instead
of REALacc. But the result will be the same, as expected. The examples
with accusative as m-case are totally analogous, with the exception of the
REALacc and REALnom violations, which are not relevant here. A possibly
complicated case is the following example:
What is special here is that the FR pronoun is not in the SpecCP of the
FR, but in the Spec of a DP that itself occupies SpecCP. Presumably, it is
impossible for the pronoun to undergo attraction in this case. 25 And even if
it did, it is not the FR pronoun that would agree with the C°-AGR head of
the FR, but the complex DP and thus the pronoun would be unable to fulfil
REALcase_n for m-case. The relevant example is put in brackets in (46). It
is, however, clear that the expected candidate would win, no matter how the
attraction candidate (46-b) were treated: 26
(46)
b. [CP dat *! * *
c. [CP dat ... nom *! *
☞ d. [DP dat [CP nom * * *
Again, the pattern can be repeated with all other combinations of oblique and
structural cases. 27
Besides the German pattern, the model predicts another 7 different possible
grammars: 28
The following grammars are predicted, but have not been found yet:
(52) German A:
√Er zerstört wer ihm begegnet
he destroys who-NOM him-DAT meets
m-case=ACC
r-case=NOM MAT REALo Ff Uc REALa REALn
☞ a. [CP nom * *
b. [CP acc *
c. [CP acc ... nom *! *
d. [DP acc [CP nom
(53) German B:
*Er zerstört wer ihm begegnet
he destroys who-NOM him-DAT meets
m-case=ACC
r-case=NOM MAT REALo REALa REALn Ff Uc
a. [CP nom *
b. [CP acc *! *
c. [CP acc ... nom *! *
☞ d. [DP acc [CP nom * * *
With the modification in (51) and two additional ones, we can further restrict
the pattern. REALnom plays no role in the interaction of the constraints, so
we can eliminate it - it is always fulfilled in the competitions under debate
here. REALobl can be argued to be inviolable, and hence part of GEN 33 -
thus, candidates that violate REALdat or REALgen will never be generated.
So REALobl can also be removed. We then no longer need to postulate a
fixed ranking of constraints, because REALacc is the only constraint of the
case hierarchy that is left. The four constraints we now use are: REALacc
(in the new version where REALacc can be fulfilled by any case form except
nominative), UNIcase, FAITHfunc and MATCH. This rules out the possibil-
ity of (50). We now get the prediction of six grammars, each of which is quite
reasonable:
It is easy to see now how the two variants of German differ, namely, in the rel-
ative ranking of FAITHfunc and REALacc. For German A, it is better to leave
accusative unrealised (and violate REALacc) than to rearrange the functional
material of the FR (and violate FAITHfunc). The opposite holds for German
B. 34
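The difference between the two variants can be checked with a short sketch for the configuration m-case=ACC, r-case=NOM. The two rankings are taken from the column order of tableaux (52) and (53). The violation profiles, by contrast, are reconstructed here from the constraint definitions and should be read as an assumption rather than as a copy of the tableaux. The helper name `optimal` is hypothetical.

```python
# Violation profiles for m-case=ACC, r-case=NOM. These are reconstructed from
# the constraint definitions and are an assumption, not a copy of the tableaux.
CANDIDATES = {
    "a: [CP nom":         {"UNIcase": 1, "REALacc": 1},
    "b: [CP acc":         {"MATCH": 1, "UNIcase": 1, "REALnom": 1},
    "c: [CP acc ... nom": {"MATCH": 1, "FAITHfunc": 1},
    "d: [DP acc [CP nom": {"FAITHfunc": 3},
}

# Rankings read off the column order of (52) and (53), highest-ranked first.
GERMAN_A = ["MATCH", "REALobl", "FAITHfunc", "UNIcase", "REALacc", "REALnom"]
GERMAN_B = ["MATCH", "REALobl", "REALacc", "REALnom", "FAITHfunc", "UNIcase"]


def optimal(candidates, ranking):
    """Return the candidate with the lexicographically smallest violation
    vector when constraints are read in ranking order."""
    return min(candidates,
               key=lambda c: [candidates[c].get(k, 0) for k in ranking])


print("German A:", optimal(CANDIDATES, GERMAN_A))
print("German B:", optimal(CANDIDATES, GERMAN_B))
```

Under these assumptions German A selects the free relative with the nominative pronoun, while German B selects the headed relative construction, in line with the winners marked in (52) and (53).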
5 Concluding Remarks
The main advantage of the proposed analysis is its ability to predict typo-
logical variation in a much better way than previous accounts - and it does
so by making use of an even simpler and comparatively unproblematic syn-
tactic analysis. The number of construction specific assumptions has been
reduced, but the typological predictions of the present account look much
more realistic. The system of constraints is able to filter out a set of candi-
dates that includes only the types of FR constructions that we find universally.
The constraints that are used here are all defined as constraints on correspon-
dences between different levels of representation, input-LF and PF-LF. The
constraints conflict: A candidate that performs well on input-LF correspon-
dence performs worse on PF-LF correspondence. Languages seem to differ
in which correspondence they consider to be more important. The constraints
are formulated in a general, non-construction-specific way. The system thus
should be able to account for phenomena other than FR constructions. This
opens further fields of research.
Notes
I would like to thank the following colleagues for helpful comments and fruitful
discussions: Artemis Alexiadou, Gisbert Fanselow, Hans-Martin Gärtner, Anasta-
sia Giannakidou, Jane Grimshaw, Alex Grosu, Fabian Heck, Géraldine Legendre,
Gereon Müller, Peter Ohl, Doug Saddy, Tanja Schmid, Sten Vikner. I'm further
thankful to the audiences of presentations of parts of this paper at the University of
Stuttgart on April 15, 1999, the GGS 1999 workshop in Stuttgart, May 15, 1999, at
the University of Potsdam on July 16, 1999, at Rutgers University, New Brunswick,
on September 10, 1999, and at Johns Hopkins University, Baltimore, on October 7,
1999. All remaining errors are mine. The work on this paper was fully supported by a
grant for the DFG research project "Optimalitätstheoretische Syntax des Deutschen"
(MU 1444/2-1), University of Stuttgart.
1. The article by Groos & van Riemsdijk has been very influential and is still the
source of the most-widely accepted syntactic analysis of FR constructions. But
the paper unfortunately bears some misleading judgements of German FR con-
structions which have been taken over by many researchers. Van Riemsdijk still
claims that German is a matching language, as can be seen in van Riemsdijk
(1998). Although there might be some German speakers whose judgements are
as described by Groos & van Riemsdijk, it seems that the majority of German
speakers judge differently.
2. For many speakers, examples with sentence-initial FRs are less acceptable than
the examples in (53). I view this as an effect of parsing difficulties. Because
German is a non-matching language, the hearer has several options for the gram-
matical function of a FR, in general. If the FR is clause-final, it is easy to detect
its grammatical function (GF), because usually there is only one GF left that is
assigned by the verb and not yet realised by some constituent. For a clause-initial
FR, however, everything is open, and its GF can only be guessed at the point of
its occurrence. So we expect that clauses with clause-initial FRs are parsed with
a higher error rate than those with clause-final FRs. The same problem occurs
with sentence-initial FRs assigned accusative by the matrix verb.
3. The brackets around the asterisk indicate the disagreement about the data
between German native speakers.
4. This explanation has already been given by Groos & van Riemsdijk (1981).
5. The question arises as to whether such effects of homophony can also be found
with lexical items other than pronouns. A case in point could be the following:
The surface form of Eltern is the same in all four morphological cases of Ger-
man, while it is Geschwistern only in the dative, and Geschwister in nominative,
accusative and genitive. This is crucial for the judgements in (i). While German
native speakers may disagree on the grammaticality status of (i-a), they agree
that (i-b) is significantly worse than (i-a). Whatever explanation might be found
for this fact, it is obvious that the differing morphological paradigms for Eltern
and Geschwister play a role.
6. I cannot make a statement about where these two variants originate. As far as I
can see, it has nothing to do with different regions. I also cannot affirm that it is
a matter of different generations or social class.
7. For a broad range of arguments that German dative is not a structural case, con-
trary to what has often been claimed, see Vogel & Steinbach (1998).
8. The FR pronoun always realises the FR internal case in the following examples.
9. I abbreviate the features as "wh" and "REL", without discussing the actual nature
of these features. These labels should be seen as informal cover terms.
10. A quite extensive discussion of the different distributions of free relatives and
embedded wh-clauses in English can be found in the first sections of Bresnan &
Grimshaw (1978).
11. FRs in preverbal position should be analysed as instances of left dislocation, as
Alexiadou & Varlokosta (1995) convincingly demonstrate.
12. The main empirical reason for the assumption of a DP-CP structure is the match-
ing effect. We will see that it can also be derived in a "bare CP" analysis.
13. The somewhat misleading term "head" is used very often to refer to the "an-
tecedent" of restrictive relative clauses. I will avoid it as much as I can.
14. Translation: "In a case conflict between the case K1 required by the matrix verb
and the case K2 required by the verb in the free relative clause, K1 can remain
unrealised if K1 precedes K2 in the following case hierarchy:
(CH) nominative < accusative < dative/prepositional case"
15. This strategy of accounting for ungrammaticality is quite frequent in optimal-
ity theory. It is called neutralisation. The winner of one competition is also the
winner of another competition - in this case it would be one with the headed
relative clause as input. And in this competition, (55) has an even better profile,
because it presumably violates fewer constraints than with an FR construction in
the input. For other applications of neutralisation in OT syntax, cf. Legendre et
al. (1995, 1998) and Keer & Baković (forthcoming).
16. Wiltschko (1999) presents many pieces of evidence that FR pronouns are seman-
tically indefinites.
17. Independent empirical motivation for this feature composition analysis can be
seen in the design of the FR pronoun in Modern Greek; cf. section 2.1.
18. Note that this requires a theory that clarifies which features can be "project-
ing" and which cannot. But this is not an extra complication. In the domain of
verbs, e.g., there is an ongoing debate in generative syntax about which morpho-
syntactic features of verbs are heads of their own projection, and which are not.
INFL has been split into TENSE and AGR and the latter has become controver-
sial; under debate are also ASPECT, NEGATION, VOICE, and others. I mention
this to show that the proposed mechanism does not bring in extra complications
compared to traditional approaches, at least in this respect.
19. This is reminiscent of the constraint "SILENT TRACE" of Pesetsky (1998).
FAITHfunc can be seen as composed of simpler constraints. It is, however, a dis-
junction, not a conjunction, we are dealing with here, because a single violation
of one of the combined constraints suffices to get a violation of FAITHfunc. In
order to distinguish matching languages from languages without FRs, we would
have to split FAITHfunc. I avoid this here only for ease of presentation. See Vogel
(in prep.)
20. It has become standard in minimalist syntax to assume that case features of DPs
are no longer present at LF. However, what still is present is the trace in the case
position, which means that we can always reconstruct a DP's case feature from
its chain. This will suffice for the present purpose.
21. I do not want MATCH to be a constraint on PF-input correspondence, because
in this case MATCH would also be violated by output 2 in (30). Both DP1 and
DP2 have the same DP as the input. This DP is assigned r-case. But DP1
realises m-case. Thus, DP1 is unfaithful w.r.t. its case feature. On the other
hand, if we look at the LF, then DP1 and DP2 build chains of their own and
each corresponds to its own PF element. There is a one-to-one mapping between
the case features. MATCH is obeyed. A violation of MATCH would have the
effect that in (30) output 2 would be "harmonically bounded" by output 1: Both
would have one violation of MATCH, but in addition output 1 would have one
violation of FAITHfunc, while output 2 would have three. In the account being
developed here, the two candidates behave alike w.r.t. all other constraints, so
output 2 would always lose against output 1.
22. I abbreviate the structures by only indicating the syntactic category of the
candidates and the overt case features that occur in them. Abstract case features are
printed in capitals, overt (PF-) case features in italics.
23. A short glossary for the abbreviation of the constraints in the following tableaux:
MAT = MATCH; Ff = FAITHfunc; Uc = UNIcase; REALo = REALobl; REALd
= REALdat; REALa = REALacc; REALn = REALnom. These abbreviations are
only introduced to keep the width of the tables small enough.
24. That no ranking is indicated between two constraints (as, e.g., between
MATCH and REALobl) only means that we have no evidence from the given data
for the ranking of these two constraints with respect to each other.
25. I know of no example of this kind from attraction languages. But even in German
this construction poses additional complications, which I do not have an answer
to yet. See footnote 5 for one example.
26. It is impossible to construe an example with a resumptive pronoun, because the
chain of the FR pronoun has only one link.
27. The situation with matrix PPs will only be mentioned briefly. GEN may not erase
lexical items from the input, so only those variants are possible outputs, where the
preposition is preserved, which eliminates the candidate where the wh-pronoun
realises r-case, and not the PP:
        m-case=PP
        r-case=NOM           MAT  REALo  Ff  Uc  REALa  REALn
        a. [CPP              *  *
        b. [CP P ... nom     *!  *
     ☞  c. [PP P [CP         *  *  *
28. The typological predictions of the proposed model have been calculated with
the help of the constraint ranking software OTSOFT, developed by Bruce Hayes
(Hayes 1998). Only FRs with case conflicts are taken into account here, so this
typology does not differentiate between languages with only matching FRs and
those without any FRs. The fully developed typology is presented in Vogel (in
prep.).
29. See Alexiadou & Varlokosta (1995) for a discussion of Modern Greek FRs.
30. These complications have to do with the fact that in Spanish and Romanian ani-
mate direct objects are realised with an obligatory preposition, while inanimates
are not. As a consequence of this, animate direct objects require matching, while
inanimates do not. The latter behave like nominatives (they are morphologically
indistinguishable from nominatives anyway). There are several possibilities to
account for this in the present system. The above ranking is only one option,
and it requires additional assumptions about inanimate direct objects. See Grosu
(1994) for a detailed discussion of the data. I will present my account of this
problem in Vogel (in prep.).
31. One could also state that this is the ranking for German A. The "implausible"
candidates occur in those cases where German A has no FRs. We might then
assume that an FR that suppresses dative or genitive wins the competition, but
is ungrammatical for independent reasons, e.g., uninterpretability because of the
deletion of semantically necessary oblique case features.
32. The problem is that in order to rule out a configuration with m-case=ACC
and r-case=NOM, we have to rank REALacc above FAITHfunc, but we then
wrongly rule out FRs with the configuration [m-case=ACC, r-case=DAT].
33. One could, for example, argue that these case forms are like lexical items. That
dative case in German can have its own specific semantic contribution in a clause,
has been shown by Wunderlich (1996), among others. Pesetsky (1998) uses
a constraint called RECOVERABILITY to account for a related phenomenon in
Polish relative clauses: complementiser-introduced relative clauses require a re-
sumptive pronoun in the case position of the (phonetically empty) relative opera-
tor's chain if it is assigned oblique case. Pesetsky also assumes that oblique cases
have a semantic contribution that cannot be recovered if they are not realised at
PF.
34. The version of German that was proposed by Groos & van Riemsdijk (1981)
would be a matching language like (54-c).
References
Rooryck, Johan
1994 Generalized transformations and the wh-cycle: Free relatives and bare
wh-CPs. GAGL 37: 195-208.
Vogel, Ralf
in prep. Towards an 'optimal' typology of free relative clause constructions. Ms.,
University of Stuttgart.
Vogel, Ralf — Markus Steinbach
1998 The dative - An oblique case. Linguistische Berichte 173: 65-90.
Wiltschko, Martina
1999 Free relatives as indefinites. In: Shahin Kimary, Susan Blake and Eun-
Sook Kim (eds.) The Proceedings of the Seventeenth West Coast Confer-
ence on Formal Linguistics, 700-712. Stanford, CA: CSLI.
Wunderlich, Dieter
1996 Dem Freund die Hand auf die Schulter legen. In: Gisela Harras and Manfred
Bierwisch (eds.) Wenn die Semantik arbeitet, 331-360. Tübingen:
Niemeyer.
The Optimal Linking of Arguments:
The Case of English Psych Verbs
Anja Wanner
1 Introduction
Optimality Theory (OT) allows the assumption that the violation of universal
principles of grammar does not automatically lead to ungrammaticality. Every
violation of these principles has its price, and if the price is reasonable,
that is to say, if violating a low-ranked principle is the price paid for
satisfying a high-ranked one, the result will not be ungrammatical.
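The pricing metaphor can be made concrete with a minimal sketch. It assumes strict ranking, so that violation profiles are compared constraint by constraint from the highest-ranked constraint downwards; the constraint names HIGH and LOW and the two candidates are hypothetical placeholders and are not taken from the analysis developed in this article.

```python
# Hypothetical illustration: constraint names and candidates are placeholders.
RANKING = ["HIGH", "LOW"]  # HIGH outranks LOW

# Number of violations each candidate incurs per constraint.
CANDIDATES = {
    "cand1": {"HIGH": 0, "LOW": 2},  # pays the low-ranked price twice
    "cand2": {"HIGH": 1, "LOW": 0},  # violates the high-ranked constraint once
}


def profile(violations, ranking):
    """Violation counts ordered from the highest- to the lowest-ranked constraint."""
    return tuple(violations.get(c, 0) for c in ranking)


def optimal(candidates, ranking):
    """Return the candidate whose profile is lexicographically smallest."""
    return min(candidates, key=lambda name: profile(candidates[name], ranking))


print(optimal(CANDIDATES, RANKING))
```

Under the ranking HIGH over LOW, two violations of LOW are a reasonable price for avoiding a single violation of HIGH; reversing the ranking reverses the outcome.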
The aim of this article is to suggest a way of extending the framework of
OT to another field of application: the mapping between lexical and syntactic
structures. It will be shown how to make OT the framework for a lexicon-
based linking theory. The advantages of such an approach will be pointed out
on the basis of English psych verbs, which are generally considered a dif-
ficult case for any projectional linking approach, i.e., any linking approach
that assumes that a verb's argument structure and syntactic class (transitive,
unergative, unaccusative) can be derived from its semantic representation in
a more or less straightforward manner (following universal mapping princi-
ples). The shortcomings of such an approach will be discussed in section 2 of
this article. Before we examine implicit instances of ranking in current link-
ing theories in section 4, we will present English psych verbs as a problem
for well-established linking principles such as the Thematic Hierarchy in sec-
tion 3. Section 5 will show how to turn this problem into an integral part of a
projectional linking theory within the OT framework. It will be assumed that
the core of an "optimal" linking theory lies in the interaction of linking rules
that apply between lexical semantic structure and argument structure. These
linking rules may overlap and even contradict each other (this will be called a
"mismatch situation"). The syntactic status of an argument (internal or exter-
nal argument) will depend on the language-specific ranking of these universal
linking rules. Finally, the preliminary ranking that results from investigating
one particular verb class in one particular language will be put into a wider
perspective in section 6.
Fourthly, one would not expect verbs of a given semantic class to show vari-
able behaviour in one language. One typically refers to psych verbs to il-
lustrate this kind of argument: If there is anything like a canonical syntactic
representation of arguments, one would not expect a crosswise realization of
EXPERIENCER and THEME as in (1-d,e).
Let us first look at the syntactic behaviour of psych verbs (like fear, frighten,
love, delight), which are characterized by the semantic roles they assign to
their two arguments, EXPERIENCER and THEME. In the following it will be
discussed whether these verbs form a coherent class according to semantic
and syntactic criteria.
Psych verbs are those verbs that express the perception of an emotion, the
content of which is lexically fixed (cf. fear, frighten, enjoy, surprise), or the
causation of such a perception. That is to say, the emotion itself is not realized
as an argument of the verb (*Harry feared great fear of the dog), because it is
an integral part of the verb's meaning (and quite often the verb takes its name
from the emotion expressed). According to this criterion, verbs like decide,
think, and persuade do not belong to the class of psych verbs.
In most cases psych verbs are two-argument verbs. The emotion that is felt
is usually ascribed to human beings only. In terms of θ-Theory they bear the
semantic role of an EXPERIENCER. With psych verbs of the type fear the
EXPERIENCER is realized in subject position, while it is in the position of the
object with psych verbs of the type frighten. The second argument of psych
verbs is either the target or the cause/stimulus of the emotion (see the verbs
in (2-a,b)). In the latter case the second argument can be interpreted as an
AGENT if it is a human being (see the second example in (2-b)).
From this perspective, the frighten-verbs are the more problematic subtype of
psych verbs because the thematically lower element is realized in the syntac-
tically higher position, while the pattern of argument realization of the fear-
verbs agrees with the Thematic Hierarchy. Even if one puts forward a rela-
tivized form of the Thematic Hierarchy (as in (4)) and generates the subject
of fear within the VP (in a position where it is asymmetrically c-commanded
by the EXPERIENCER argument), there remains the problem of non-identical
patterns of argument realization for verbs with supposedly identical argument
structures. 2
In contrast to Belletti & Rizzi, I will assume that the argument structure
of psych verbs follows from their semantic representation (rather than from
idiosyncratic Case marking abilities). Differences in the projection of argu-
ments will not be considered counter-examples for a projectional linking ap-
proach. They will be taken as an indicator for underlying differences in the
verbs' semantic structures instead.
(5) a. The children were frightened by the sudden noise in the garden.
b. Those pictures are loved by most people.
(5) e. admirer (of Jane Austen), lover (of good music), dog-hater
f. baffler, disturber, enchanter, startler (Levin 1993: 190)
It seems, then, that both subtypes of psych verbs are transitive verbs and that
the two arguments are indeed realized crosswise.
Some diagnostics, however, seem to point in another direction. On the basis
of data like (5-g,h) Grimshaw (1990) argues that psych verbs of the frighten-
class do not have an external argument. In contrast to fear-verbs, they do not
form compounds in which one of the two arguments (the internal argument)
is incorporated.
On the basis of a common θ-grid the fear-verbs and the frighten-verbs are
claimed to belong to the same semantic class. However, if one looks at the
meaning of these verbs more closely, it is obvious that the frighten-verbs
express a change of state in the EXPERIENCER argument, while the fear-verbs
rather behave like states. Consequently, the verbs love, hate, fear can be para-
phrased as feel love, feel hate, feel fear, while the verbs delight, anger and
horrify can be paraphrased as cause sb. to feel delight, cause sb. to feel anger,
cause sb. to feel horror. In some cases the causative component is reflected
by the morphological structure of the verbs; see the examples in (6), which
are denominal derivations containing a causative suffix.
This difference in the aspectual structure of the event has been dealt with
in a variety of ways. While Tenny claims that thematic roles should be re-
placed by what she calls "aspectual roles" altogether ("Aspectual Interface
Hypothesis"), see (7-a,b), other linguists have tried to complement the The-
matic Hierarchy by some more aspectually oriented linking principle.
However, this causation of fear is not built into the meaning of the verb fear,
while the causative equivalent always implies that the emotion was actually
felt; see the contrast in (8-c,d).
(8) c. The article on AIDS upset/frightened/worried Sally, *yet she didn't
feel upset at all.
d. Sally loved/cherished/adored her little niece, yet she was angry
when she saw her.
It can be concluded that the frighten verbs - but not the fear verbs - express
a change of state in the EXPERIENCER argument. The causer of this change
is the second argument, the THEME. It seems, then, that the two subtypes of
psych verbs do not form a coherent class according to aspectual criteria.
The aspectual differences between frighten and fear hold for other verbs
of the two subtypes of psych verbs, too. The verbs listed in (2-a) - those
that project the EXPERIENCER as subject - do not express a change of state,
while all the verbs of the frighten type (the ones in (2-b)), whose pattern of
argument realization is a problem for the Thematic Hierarchy, are causative.
It seems reasonable to conclude that these underlying differences in the verbs'
semantics are the key to explaining the difference in argument realization.
The data in (9) and (10) support the assumption that the two subgroups
of psych verbs belong to different aspectual classes. Only the frighten-verbs
are compatible with frame adverbials, which refer to the time span of the
event until its culmination. The fear-verbs, on the other hand, do not license
a resultative predicate, they cannot easily be replaced by a dynamic proform
(like do), and they are not compatible with the progressive.
(9) a. The little boy feared the neighbour's dog (*within a minute).
There are numerous approaches that try to tackle the argument structure of
psych verbs from a lexical point of view. 8 Most recent work recognizes the
possibility of conflict zones (or "mismatches") between two or more linking
principles when applied to a single item (e.g., Grimshaw 1990, Dowty 1991,
Levin & Rappaport Hovav 1995). However, explicit instructions on how to
solve such a conflict are not always given. This depends on the status mis-
matches are given within a linking theory. If they are considered unfortunate
constellations that should not occur if the linking principles one has estab-
lished are universally valid, the mismatch phenomenon will be pushed to the
grammar periphery. If, on the other hand, mismatches are considered an inte-
gral part of the interaction of linking principles, one will expect some discus-
sion of how to deal with them. It should be obvious that in a linking theory
within the OT framework mismatches would be essential to establishing the
language-specific ranking of constraints.
In Grimshaw's argument structure theory, linking conflicts are explicitly
mentioned. Each argument is placed in relation to its co-arguments into the
Thematic and the Aspectual Tiers as given in (12). If there is an argument
which is the highest ("most prominent") in both hierarchies, it gets the status
of the external argument; cf. Grimshaw (1990: 5). A mismatch situation arises
if an argument is the highest in only one of the two hierarchies (illustrated by
crossing lines in (13)). The outcome of such a mismatch situation is easy
to determine: According to Grimshaw's prominence criterion, the respective
argument will always be an internal argument. 9
As clear as this procedure seems, there are some problems with this approach.
Leaving aside for a moment the empirical problem that our argument structure
tests do not confirm the conclusion that the THEME is an internal argument
in the case of the frighten-verbs, the theoretical basis of Grimshaw's
argument structure theory predicts some conflicts for which there is no ele-
gant solution. First of all the Aspectual Hierarchy is not really spelled out
(see (12)), which makes it impossible to assign an aspectual value to argu-
ments other than a causer. Secondly, the concept of two hierarchies only
makes sense if they are independent of each other. This is not the case in
the traditional understanding of θ-roles, since some θ-roles are explicitly
defined in terms of aspectual notions like dynamicity and action. Thirdly,
there is a general problem for Grimshaw's relativizing approach, pertaining
to the calculation of the argument structure of single-argument verbs. If there
is only one argument, this argument will automatically be the highest in both
hierarchies. Thus, single-argument verbs would always have an external ar-
gument, i.e., there would not be any unaccusative verbs. To rule out this over-
generalization Grimshaw has to make a concession, which includes an ele-
ment of weighting: "Apparently, more than relative prominence is involved;
some measure of absolute prominence must contribute too" (Grimshaw 1990:
39). She postulates that an argument which undergoes a change of state will
intrinsically have the status of an internal argument. It seems, then, that its
aspectual status overrules its thematic status - a situation that can be handled
more elegantly within the OT framework.
Coming back to our own conclusion that the THEME of the frighten-verbs
is an external argument, one might want to suggest another way to interpret
the interplay of the two hierarchies. We could regard them as separate linking
systems whose demands on a single argument can be in conflict with each
other - a situation that seems to lend itself ideally to the application of the
notion of weightable principles or rankable constraints: In the presence of
another argument, a THEME can become the external argument, under the
condition that it is the causer of a change of state in the second argument. The
constraints on argument structure established by the Thematic Hierarchy can
Still, this modification in handling mismatches would require that the posi-
tions in the Aspectual Tier and the Thematic Tier be put in concrete terms
first.
Another framework that explicitly refers to mismatch situations is Dowty's
(1991) prototype model, whose core is the "Argument Selection Principle"
given below:
It is quite clear that not every subject or object can be characterized by the
full bundle of these properties. What matters is if the argument in question
is closer to the proto-agent or the proto-patient. This is determined by count-
ing the number of properties that can be assigned. We might consider it a
mismatch situation if a single argument has the same number of P- and A-
properties, i.e., if it is in the middle between these two thematic poles. Again,
in the case of psych verbs the most obvious conflict occurs in the evaluation
of the THEME argument of frighten-verbs, as can be seen in (16-b). If we take
into consideration the possibility of volitional involvement of the EXPERIENCER
argument, we have the same number of A- and P-properties for the
EXPERIENCER. For one thing, this tells us that no matter whether the argument
will end up in object or subject position, it is far from being a typical
AGENT or PATIENT. In contrast to this mismatch situation, the EXPERIENCER
of fear-verbs is assigned A-properties only, see (16-a). Thus, we should expect
more variation in the realization of arguments across languages in the
case of frighten-verbs. On the other hand, the THEME of the frighten-verbs
can be assigned only A-properties, while the THEME of the fear-verbs does
not qualify for the assignment of any of the above-mentioned properties at
all. 11 It seems, then, that the arguments of psych verbs do not exactly qualify
as prototypical subjects or objects in either case.
But how do we know where the arguments of frighten will end up in a spe-
cific language? Again we are confronted with the necessity of modifying a
general linking principle towards the idea that in a mismatch situation not all
demands are equal, as Dowty himself concedes: "I would not rule out
the desirability of 'weighting' some entailments more than others for pur-
poses of argument selection" (Dowty 1991: 574). And again, this element of
weighting could be handled best within the framework of Optimality Theory.
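The counting procedure, and the effect of weighting some entailments more than others, can be sketched as follows. The property assignments are illustrative assumptions chosen to match the prose description (an EXPERIENCER of frighten with equally many A- and P-properties, a THEME of frighten with A-properties only); they are not a reproduction of the tables in (16), and the weight value is arbitrary.

```python
# Illustrative assumptions: the entailment sets below are chosen to match the
# prose description, not copied from the original tables.
FRIGHTEN_EXPERIENCER = {
    "A": {"sentience", "volition"},                 # proto-agent entailments
    "P": {"change_of_state", "causally_affected"},  # proto-patient entailments
}
FRIGHTEN_THEME = {"A": {"causation"}, "P": set()}


def score(entailments, weights=None):
    """Weighted proto-agent total minus weighted proto-patient total."""
    weights = weights or {}

    def total(props):
        # Every entailment counts 1.0 unless a weight is supplied for it.
        return sum(weights.get(p, 1.0) for p in props)

    return total(entailments["A"]) - total(entailments["P"])


def classify(entailments, weights=None):
    """Say which thematic pole the argument is closer to."""
    s = score(entailments, weights)
    if s > 0:
        return "closer to proto-agent"
    if s < 0:
        return "closer to proto-patient"
    return "mismatch"


print(classify(FRIGHTEN_EXPERIENCER))
print(classify(FRIGHTEN_EXPERIENCER, {"change_of_state": 2.0}))
print(classify(FRIGHTEN_THEME))
```

With plain counting the EXPERIENCER of frighten sits exactly between the two poles; as soon as one entailment is weighted more heavily, the tie is resolved, which is the step that brings the procedure close to a ranking of constraints.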
5.1 Prerequisites
Having presented two cases in which the notion of ranking within univer-
sal linking principles has found its way in through the back door, I will now
sketch a way to make this idea the basis of a linking theory within the OT
framework. Linking principles will be formulated as constraints on argument
structure. These constraints - which take the shape of linking rules - relate
either to positions in the semantic representation or to semantic properties.
Comparable candidates will be different argument structures (which have to
be in line with non-overridable structure principles). Since the possibilities
for the argument status of a given element are very limited (internal or ex-
ternal argument), there cannot be as much variation as between competing
candidates in syntax. Parallel to the status of X-bar Theory as a set of struc-
ture principles that restricts the kind of candidates to consider, we will take
for granted one universal structure principle each for argument structure and
for lexical semantic representation. The first one is well-established: There
can be maximally one external argument. The second principle, which I have
referred to as the "Identification Principle" elsewhere (Wanner 1999: 134f.),
relates to the complexity of the event: I will assume that one of the central
principles of linking between semantic structure and argument structure is
that each subevent has to be "identified" by at least one argument. 12 Con-
cerning the nature of linking rules, I would like to follow Levin & Rappaport
Hovav (1995), who have looked at patterns of argument realization of English
verbs extensively.
For the sake of the argument, let us assume that all the constraints we need
are the linking rules listed in (17), which are in line with generally accepted
linking principles. Like Levin & Rappaport Hovav (1995) I have included a
DEFAULT-RULE that declares the presence of an internal argument the default
case. 13 In the linking theory of Levin & Rappaport Hovav (1995) this rule
has the status of an elsewhere condition. It cannot collide with any of the
specific rules, since it is only activated if none of them applies. This need
not necessarily be the case in the optimality framework, although it is quite
obvious that the default rule is likely to be ranked rather low (otherwise we
wouldn't have any verbs with external arguments).
The first three linking rules relate to structural information in the verb's lexi-
cal information. The third rule tries to capture the special (if vague) treatment
that the EXPERIENCER is given in most linking theories; cf. Jackendoff (1987:
401), who considers the EXPERIENCER the argument of an "as yet unexplored
State-function having to do with mental states."
It has already been shown that the fear-verbs behave like (temporary)
STATES, while the frighten-verbs constitute ACCOMPLISHMENTS, which -
in the tradition of Vendler and Dowty - we assume to fall into two subevents;
see the templates in (11), repeated below.
In the case of frighten the arguments of the verb belong to two different
subevents. The THEME argument is the one to identify the first subevent
(causal subevent). As such it will be subject to the CAUSER-RULE. The EX-
PERIENCER, on the other hand, identifies the BECOME-subevent ("central
subevent" in terms of Levin & Rappaport Hovav 1995). As such it is subject
to the BECOME-RULE. Arguably it is also subject to the CONTROL-RULE,
which would create a rule conflict, since it cannot be an external and an in-
ternal argument simultaneously.
Whether the ACTOR-RULE is activated or not will depend on the kind of subject
we choose. Since frightening somebody does not necessarily presuppose that
any specific action takes place, we will consider this rule inactive or rather
"vacuously satisfied" in the case of psych verbs, particularly when the subject
is inanimate, as in [The movie frightened the children].
The case is different with the fear-verbs. They do not have a complex aspectual
structure. Neither the CAUSER-RULE nor the BECOME-RULE can be
considered a relevant factor here; in other words: they are vacuously satisfied.
The only thing the fear-verbs have in common with the frighten-verbs is the
fact that the EXPERIENCER falls under the scope of the CONTROL-RULE, too,
which reflects the semantic relatedness of the two groups of verbs.
Let us now look at how to rank the linking constraints in (17) on the basis of
English psych verbs. Competing candidates are different argument structures,
as illustrated in the tableaux in (18). There are three comparable candidates
in the case of frighten (see (18-a)): Candidate A has an external THEME, candidate
B has two internal arguments, and candidate C has an external EXPERIENCER.
Candidate D is ruled out because a verb cannot have two external
arguments. (I have not included any candidates that do not contain an event
position since they would presumably be ruled out by independent argument
structure principles.)
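The way the candidate set arises can be sketched as follows. The sketch assumes that a candidate is simply an assignment of "external" or "internal" to each argument and that the only structure principle applied is the limit of one external argument; the event position and all other details of the argument structure notation are left out, and the function name is hypothetical.

```python
from itertools import product


def candidates(arguments):
    """All external/internal assignments with at most one external argument."""
    result = []
    for statuses in product(("external", "internal"), repeat=len(arguments)):
        # Structure principle: there can be maximally one external argument.
        if statuses.count("external") <= 1:
            result.append(dict(zip(arguments, statuses)))
    return result


cands = candidates(["THEME", "EXPERIENCER"])
for cand in cands:
    print(cand)
```

For a two-argument verb this leaves exactly the three candidates A, B and C; the assignment with two external arguments (candidate D) never enters the competition.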
The ranking of constraints can only be determined if the optimal candidate
is known. In syntax this is the grammatical sentence. Here, the optimal can-
didate is the argument structure of the verb as resulting from the application
of structural tests. From applying these tests we have concluded - contrary to
Grimshaw (1990) - that in English verbs like frighten project their "THEME"
(the argument identifying the causing subevent) as external argument. This
means that the argument structure given in A is the optimal candidate. Thus
we can conclude that the CONTROL-RULE is dominated by the BECOME-
RULE because otherwise the EXPERIENCER would not end up as an internal
argument. Alternatively, if we take into account the argument status of the
THEME, we might argue that the CAUSER-RULE dominates the CONTROL-
RULE since the CAUSER (and not the controlling EXPERIENCER) becomes
the external argument. The internal ranking of the CAUSER-RULE and the
(18) a. Tableau:
     The movie_x frightened        CAUSER-  ACTOR-  BECOME-  CONTROL-  DEFAULT-
     the children_y                RULE     RULE    RULE     RULE      RULE
     ☞ A frighten: [e, (x (y))]    ✓                ✓        *         *
       B frighten: [e, ((x, y))]   *                ✓        *         ✓
       C frighten: [e, (y (x))]    *                *        ✓         *
In contrast to the situation in (18-a), the THEME of the fear-verbs is not sub-
ject to the CAUSER-RULE, nor is the EXPERIENCER subject to the BECOME-
RULE, i.e., both of these constraints on argument structure are vacuously sat-
isfied; see (18-b).
(18) b. Tableau:
     They_y feared the storm_x     CAUSER-  BECOME-  CONTROL-  DEFAULT-
                                   RULE     RULE     RULE      RULE
       A fear: [e, (x (y))]                          *         *
       B fear: [e, ((x, y))]                         *         ✓
     ☞ C fear: [e, (y (x))]                          ✓         *
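The evaluation behind tableaux (18-a) and (18-b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the violation marks are entered as we read them off the tableaux, the left-to-right column order is taken as a total ranking (the text itself establishes only that the CONTROL-RULE is dominated), and the names RANKING, FRIGHTEN, FEAR, profile and optimal are our own labels, not part of the linking model.

```python
# Assumed total ranking: the column order of the tableaux, highest first.
RANKING = ["CAUSER-RULE", "ACTOR-RULE", "BECOME-RULE",
           "CONTROL-RULE", "DEFAULT-RULE"]

# Violation marks as read off the tableaux. Constraints that are satisfied,
# vacuously or not, are simply left out of a candidate's entry.
FRIGHTEN = {
    "A [e, (x (y))]":  {"CONTROL-RULE": 1, "DEFAULT-RULE": 1},
    "B [e, ((x, y))]": {"CAUSER-RULE": 1, "CONTROL-RULE": 1},
    "C [e, (y (x))]":  {"CAUSER-RULE": 1, "BECOME-RULE": 1, "DEFAULT-RULE": 1},
}
FEAR = {
    "A [e, (x (y))]":  {"CONTROL-RULE": 1, "DEFAULT-RULE": 1},
    "B [e, ((x, y))]": {"CONTROL-RULE": 1},
    "C [e, (y (x))]":  {"DEFAULT-RULE": 1},
}


def profile(violations, ranking):
    """Return the violation counts ordered from highest to lowest constraint."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def optimal(candidates, ranking):
    """Return the candidates whose profile is lexicographically smallest.

    Strict domination means that one violation of a higher constraint
    outweighs any number of violations of lower ones, which is exactly
    what tuple comparison does.
    """
    best = min(profile(v, ranking) for v in candidates.values())
    return [name for name, v in candidates.items()
            if profile(v, ranking) == best]


if __name__ == "__main__":
    print("frighten:", optimal(FRIGHTEN, RANKING))
    print("fear:    ", optimal(FEAR, RANKING))
```

Under these assumptions the sketch selects candidate A for frighten and candidate C for fear, in line with the pointing hands in the tableaux. Other total rankings that keep the CONTROL-RULE below the CAUSER-RULE or the BECOME-RULE may give the same result, so the order used here should not be read as established.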
6 Evaluation
On the basis of a particular verb class it was shown why and how the frame-
work of Optimality Theory could be used as the basis for a theory of argu-
ment linking. The OT-based linking model we argued for tries to integrate
important insights from current lexicon-based linking theories: It is based on
linking rules that constrain possible argument structures (like the ones formu-
lated by Levin & Rappaport Hovav 1995), its organization is inherently hier-
archical (as suggested by Grimshaw 1990), it takes into account the aspectual
function of arguments (which reminds us of Tenny 1987, 1992), it can be
used to make a statement about the prototypicality of an external or internal
argument (as pointed out by Dowty 1991), and it relates directly to the lexical
semantic representation of verbs, thus following Jackendoff (1987) in treat-
ing θ-roles as inherently relational notions. While all of the linking models
we touched upon have to make concessions when it comes to linking con-
flicts arising from the collision of different linking principles, the framework
of Optimality Theory lends itself ideally to dealing with such mismatches. OT
allows us to regard conflicts between universal constraints as necessary and
desirable: without them, we could neither construe the grammar of a specific
language nor predict the potential for cross-linguistic variation.
In the case of English psych verbs we essentially followed Grimshaw
(1990) in assuming that there are two aspectually different subclasses, each
with a homogeneous pattern of argument realization. What relates these two
classes is the presence of an argument that falls under the scope of the
CONTROL-RULE (generally labelled EXPERIENCER). Since the CONTROL-
RULE is dominated by the CAUSER-RULE, the EXPERIENCER cannot be pro-
jected as external argument in the presence of a causer. Argument structure
tests confirmed that the causing THEME is indeed an external argument in the
case of frighten. The fear-verbs, on the other hand, do not express causative
events. Thus, the CAUSER-RULE is vacuously satisfied and the EXPERIENCER
becomes the external argument by virtue of falling under the scope of the
CONTROL-RULE.
It is obvious that we could merely sketch the outline of a linking theory
within the OT framework. For the sake of the argument we took the shape and
universal validity of linking constraints for granted. Since we examined only
one verb class we could not establish the ranking of linking rules unequivo-
cally. To check if the approach sketched here is superior to its competitors,
one would have to put different verb classes to the test and take into account
more (or different) linking principles. The (fragmentary) hierarchy of linking
rules that we established should of course be compatible with all verb classes
within one language. The high status of the CAUSER-RULE, for instance, is
confirmed by the fact that accomplishments (like break, build, eat) generally
are transitive verbs, projecting the causer argument as external argument, no
matter whether either of the two arguments of the verb is animate or not. On
the other hand, our set of constraints does not allow states with an external
argument if this does not fall under the scope of the CONTROL-RULE.¹⁵
Apart from consistency within one language we would expect cross-
linguistic differences in the realization of arguments - as well as variation
along a historical dimension - to be attributable to different rankings of link-
ing constraints. If this hypothesis could be supported empirically, the case for
an OT-based linking theory would gain even more ground.
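The idea that variation follows from re-ranking can be made concrete with a small sketch that tries every total ranking of the four constraints that are active for frighten and records which candidate comes out as optimal. It is an illustration only: the violation marks are those we read off tableau (18-a), the ACTOR-RULE is left out because it is vacuously satisfied there, the names CONSTRAINTS, FRIGHTEN, winner and typology are our own, and nothing here is a claim about an actual language.

```python
from collections import defaultdict
from itertools import permutations

# The four constraints that distinguish the frighten candidates
# (assumption: the vacuously satisfied ACTOR-RULE is omitted).
CONSTRAINTS = ["CAUSER-RULE", "BECOME-RULE", "CONTROL-RULE", "DEFAULT-RULE"]

# Constraints violated by each candidate, as read off tableau (18-a).
FRIGHTEN = {
    "A": {"CONTROL-RULE", "DEFAULT-RULE"},                # external THEME
    "B": {"CAUSER-RULE", "CONTROL-RULE"},                 # two internal arguments
    "C": {"CAUSER-RULE", "BECOME-RULE", "DEFAULT-RULE"},  # external EXPERIENCER
}


def winner(candidates, ranking):
    """Return the candidate with the lexicographically smallest profile."""
    return min(
        candidates,
        key=lambda name: tuple(c in candidates[name] for c in ranking),
    )


def typology(candidates, constraints):
    """Map each candidate to the list of total rankings under which it wins."""
    result = defaultdict(list)
    for ranking in permutations(constraints):
        result[winner(candidates, ranking)].append(ranking)
    return dict(result)


if __name__ == "__main__":
    for name, rankings in sorted(typology(FRIGHTEN, CONSTRAINTS).items()):
        print(name, len(rankings), "rankings, e.g.", " >> ".join(rankings[0]))
```

In this toy typology every ranking that selects candidate A places the CAUSER-RULE or the BECOME-RULE above the CONTROL-RULE, which is the dominance relation argued for in the text. Rankings that select B or C would correspond to hypothetical grammars whose existence would have to be checked empirically.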
Notes
1. The notation used here basically follows that used by Levin & Rappaport Hovav
(1995).
2. The analysis argued for by Belletti & Rizzi (1988, 1991) hinges on the assump-
tion of a Case grid, which contains information about the idiosyncratic Case
assigning abilities of verbs, i.e., the ability to assign inherent Case. The argu-
ment structure of psych verbs ultimately follows from their idiosyncratic ability
to assign inherent Case to one of their arguments. For a critical discussion of this
approach, see Wanner (1999: 188ff.).
3. Within a projectional approach, however, this would only shift the problem to the
mapping from semantic structure to argument structure: One would have to ask
why members of one semantic class of verbs have different argument structures.
4. This leaves us with the question of how to explain the differences in acceptabil-
ity between (5-g) and (5-h). Restrictions on word formation processes can be
of a very fine-grained nature. In this case they might depend on the aspectual
differences between the two types of psych verbs.
5. In the linking approach developed here, this will be reflected by the assumption
of a specific linking rule (CONTROL-RULE).
6. For instance, they can appear as the complement of a perception verb, which is
not possible for true states.
7. We would have to establish first, of course, that the Thematic Hierarchy could
not simply be replaced by - or turned into - an Aspectual Tier.
8. For a list of the relevant literature, see Levin (1993: 188ff.). A Case-oriented
approach, which cannot be discussed here, is presented by Belletti & Rizzi (1988,
1991).
9. Following Grimshaw (1990), the external argument is marked by being sur-
rounded by only one set of round brackets in argument structure.
10. "Incremental Theme" is the term Dowty uses to refer to an object that changes
gradually as the event progresses, e.g., the complement in mow the lawn or build
a house.
11. In the example given the causing argument is inanimate and can be assigned only
one A-property. The situation is different, of course, when an agentive reading is
available (Sally frightened her little brother deliberately).
12. A similar principle ("Subevent Identification Condition") is put forward by
Rappaport Hovav & Levin (1998).
13. That any verb can take an internal argument is shown in Wanner (to appear). On
the other hand, not every verb has the capacity to take an external argument. I will
therefore consider the internal argument the default case of a thematic argument
licensed by a verb.
14. A controlling argument can be used in constructions like (8-a,b).
15. To explain which argument is projected as external argument in aspectually sym-
metric events one might have to include something like the notion of "internal
causation" as developed by Levin & Rappaport Hovav (1995: 91): "some prop-
erty inherent to the argument of the verb is 'responsible' for bringing about the
eventuality." It should be mentioned, however, that the criteria according to which
the "responsibility" of an argument is established are rather vague.
References
Perlmutter, David
1978 Impersonal passives and the unaccusativity hypothesis. In: Proceedings
of the Annual Meeting of the Berkeley Linguistics Society 4, 157-189.
Pinker, Steven
1989 Learnability and Cognition: The Acquisition of Argument Structure.
Cambridge, MA: MIT Press.
Rappaport Hovav, Malka — Beth Levin
1998 Building Verb Meanings. In: M. Butt and W. Geuder (eds.) The Projec-
tion of Arguments: Lexical and Compositional Factors, 97-134. Stanford:
CSLI Publications.
Tenny, Carol
1987 The aspectual interface hypothesis. In: Proceedings of the 18th Annual
Meeting of the North-Eastern Linguistic Society, 490-508.
Tenny, Carol
1992 The aspectual interface hypothesis. In: I. Sag and A. Szabolcsi (eds.) Lex-
ical Matters, 1-27. (CSLI Lecture Notes 24.) Stanford: Stanford Univer-
sity Press.
Tenny, Carol
1994 Aspectual Roles and the Syntax-Semantics Interface. (Studies in Linguis-
tics and Philosophy 52.) Dordrecht: Kluwer.
Vendler, Zeno
1967 Linguistics in Philosophy. Ithaca, NY: Cornell University Press.
Verkuyl, Henk
1993 A Theory of Aspectuality. (Cambridge Studies in Linguistics.) Cam-
bridge: Cambridge University Press.
Wanner, Anja
1999 Verbklassifizierung und aspektuelle Alternationen im Englischen. (Lin-
guistische Arbeiten 398.) Tübingen: Max Niemeyer Verlag.
Wanner, Anja
to appear Intransitive verbs as case assigners. In: H. Janßen (ed.) Verbal Projec-
tions. (Linguistische Arbeiten.) Tübingen: Max Niemeyer Verlag.
Index of OT-Constraints
*EXP, 292
*FOCUS (*F), 166
*Lx-Mv (*Lexical Movement), 314
*MENG, 114
*PASTP/PV/+V, 307
*PRON (Avoid Pronoun), 43
*T, 216
*∅ (Avoid Null Parse), 49
ACCENTDOMAINFORMATION (ADF), 80
ACTOR-RULE, 392
ADJ-ISL (Adjunct Island Constraint), 48
ADJA (Adjacency), 180
AGENT (Agentivity), 180
ALIGN (Align Focus), 184
ANIM (Animacy), 179
ARGUMENT-OVER-PREDICATE (A/P), 80
BAR (Barriers Condition), 44
BAR 3, 60
BAR", 216
BECOME-RULE, 392
CAUSER-RULE, 392
CIS (Contiguity in Syntax), 130
CN2, 73
CNPC (Complex Noun Phrase Condition), 41
CONTROL-RULE, 392
CONTROL (Control Rule), 42
DAT(IVE), 93
DEFAULT-RULE, 392
DEP, 45, 356
DEPdat, 356
ECON (Economy), 183
EX-PRE (E), 154
FAITH(PASTP), 309
FAITH[COMP], 46, 47, 292
FAITH[WH], 50
FAITHfunc, 359
FAITH[Q], 299
FILL, 45, 60
FINALFOCUS (FF), 73
FI (Full Interpretation), 314
FOCUSPROMINENCE (FP), 81, 100
FOCUS (F), 164
H-NUC, 61
HAVE(CP), 294
HD-LFT (Head Left), 307
HD-LFT &D MORPH, 307
HD-RT (Head Right), 307
IDENT, 45
IN-DIS (I), 154
IND(EFINITES), 73
ISLAND-COND, 60
ISLAND, 127
IP-HEAD-RIGHT, 82
LEFTEDGECP (LEC), 127
LEFTEDGEPP (LEP), 131
LICENSING, 328
LOC-ANT (Local Antecedent), 38
MATCH, 360
MAX, 45, 356
MAXdat, 356
MIN-CHAIN (Minimize Chain Length), 44
MINDIS, 212
MORPH (Morphological Selection), 307
NEW, 73
NOPROPERTY (N-PR), 167
NOPHRASALSPEC (NPS), 139
OB-HD (Obligatory Heads), 314
ONEPROSODICWORD (OPW), 138
OP-SPEC, 39, 332
PARMOVE (Parallel Movement), 136
PARSESCOPEO, 116
PARSE[SCOPE], 46, 47