Sentence Structure II: Phrase Structure Grammars: Introduction To Language - Lecture Notes 4B
☞ Goal: How are sentences built (or 'generated', as linguists say)? Corresponding to the two hypotheses that were considered in the preceding Lecture Notes, we discuss two possibilities. The first hypothesis, based on a 'word chain device' (formally called a 'finite state model' or a 'Markov model'), yields sentences that have a flat structure. We already found an argument against such a hypothesis in the preceding Lecture Notes: the sentences of English do not have a flat structure. We show that this hypothesis has other defects as well. The second hypothesis, by contrast, generates (=produces) sentences that do not have a flat structure. It involves Phrase Structure Rules, which yield trees with labels added to indicate the syntactic category of each constituent (e.g. Noun Phrase, Verb Phrase, etc.). The resulting tree is seen to recapitulate the process by which a sentence is generated (=produced) by the rules of grammar: a group of elements forms a constituent whenever they have been introduced by the application of the same rule.
1 Review: Constituency
(i) In every sentence, certain groups of words form 'natural units' [=constituents] and may:
-stand alone
-be moved as a unit
-be replaced as a unit by a pronoun
(ii) Trees encode the information about constituents: two expressions are a natural unit (=constituent) if there
is a sub-tree that contains them and nothing else.
(iii) A sentence that can be analyzed as 2 different trees is structurally ambiguous (e.g. Lucy will hit the
student with the book)
Pinker discusses in Chapter 2 of The Language Instinct (p. 29) the example of question formation. If we wish
to form a question that corresponds to the assertion John is in the garden, we may simply move the auxiliary is
to the beginning of the sentence, yielding Is John __ in the garden? [here __ simply indicates that a word has
been displaced]. In a slightly more complex case, such as John is in the garden next to someone who is asleep,
we form the corresponding question by moving the first is to the beginning of the sentence, yielding Is John __ in the garden next to someone who is asleep? If we tried instead to move the second is, we would obtain a sharply ungrammatical result ('ungrammatical' in the descriptive sense we will use throughout this course): *Is John is in the garden right next to someone who __ asleep?

P. Schlenker - Ling 1 - Introduction to the Study of Language, UCLA
These contrasts are recapitulated in (1):
(1) a. John is in the garden next to someone who is asleep.
b. Is John __ in the garden next to someone who is asleep? (Move the first is)
c. *Is John is in the garden right next to someone who __ asleep? (Move the second is)
From these examples one might be tempted to infer that the rule of question formation is to systematically move the first is that is uttered to the beginning of the sentence. Pinker shows that this hypothesis is incorrect, since it
predicts (incorrectly) that the question corresponding to (2)a is (2)b:
(2) a. A unicorn that is eating a flower is in the garden
b. *Is a unicorn that __ eating a flower is in the garden? (Move the first is)
c. Is a unicorn that is eating a flower __ in the garden? (Move the second is)
We do not discuss at this point what the correct rule is (it will turn out that it must be stated in more abstract
terms than 'moving the first is' or 'moving the second is'). But we observe that a child who has only heard simple cases of question formation (e.g. Is John __ in the garden?) would have to infer a rather complex and subtle rule from limited data. For the same reason as in the case of the integers mentioned above, the child must have something to guide his acquisition of a rule that goes beyond the sentences that he has heard.
The Solution: 'move the auxiliary which is immediately under the right-hand daughter of the root'
The solution of the puzzle is that the rule of question formation should be stated in terms of structure (i.e. in
terms of syntactic trees) rather than in terms of strings (=linear order). The rule of question formation in
English is to move to the beginning of the sentence (i.e. to add to the tree) the auxiliary which is immediately
under the right-hand daughter of the root (the root is the top-most node of the tree).
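The structure-sensitive rule can be sketched as a small program. This is an illustration only; the tree encoding and function names below are my own, not the notes':

```python
# A minimal sketch of the structure-based question-formation rule:
# front the auxiliary that sits immediately under the right-hand
# daughter of the root, leaving a gap ('__') in its place.
# A tree is a (label, children) pair; a leaf is (category, word).

def leaves(tree):
    """Collect the words at the leaves of a tree, left to right."""
    label, children = tree
    if isinstance(children, str):          # leaf: (category, word)
        return [children]
    return [w for child in children for w in leaves(child)]

def form_question(tree):
    """Front the auxiliary immediately under the right-hand daughter
    of the root (the I node inside I'), leaving a gap behind."""
    _, (subject, predicate) = tree         # root IP has two daughters
    _, pred_children = predicate           # predicate is I' = [I, VP]
    aux, rest = pred_children[0], pred_children[1:]
    return [aux[1]] + leaves(subject) + ['__'] + \
           [w for c in rest for w in leaves(c)]

# (IP (NP (PN Mary)) (I' (I will) (VP (Vi sleep))))
simple = ('IP', [('NP', [('PN', 'Mary')]),
                 ("I'", [('I', 'will'), ('VP', [('Vi', 'sleep')])])])
print(' '.join(form_question(simple)))     # will Mary __ sleep

# A subject that itself contains 'will' does not change which auxiliary
# is fronted: the rule looks at structure, not at linear order.
complex_subj = ('NP', [('D', 'the'), ('N', 'person'),
                       ('CP', [('C', 'who'), ('I', 'will'),
                               ('VP', [('V', 'be'), ('A', 'hired')])])])
embedded = ('IP', [complex_subj,
                   ("I'", [('I', 'will'), ('VP', [('Vi', 'sleep')])])])
print(' '.join(form_question(embedded)))   # will the person who will be hired __ sleep
```

Because the rule only ever inspects the right-hand daughter of the root, the will buried inside the subject constituent can never be targeted.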
(3) a. b. [tree diagrams omitted]
If Mary is replaced with the person who will be hired (clearly a constituent - for instance it may be replaced
with the pronoun 'he' or 'she'), the general structure of the sentence is not affected, and in particular the same word will is moved as in the simple sentence. Crucially, it is not the will contained in the person who will be hired that is moved, which is just the result we want. This is illustrated in (4) [note that a triangle stands
for a constituent whose internal structure is omitted for simplicity; in homeworks you should specify the
complete structure of a tree, i.e. you should not use triangles, unless the exercise tells you to do so]:
(4) a. b. [tree diagrams omitted]
Going back to our original puzzle with A unicorn is in the garden, we can apply exactly the same reasoning.
Constituency tests would lead one to posit the following structure, where a unicorn is a single constituent.
(5) [tree diagram omitted]
The rule of question formation can then be applied in the same way as in our earlier examples:
(6) [tree diagram omitted]
And just as we want, the rule functions in exactly the same way when a unicorn is replaced with a unicorn that
is eating flowers; and the right result is obtained:
(7) a. b. [tree diagrams omitted]
A plausible -but incorrect- model is discussed by Pinker in Chapter 4 of The Language Instinct, the
Finite State Model (also called 'Markov Model'; Pinker also calls it a 'word chain device'). It is both natural
and historically important, since it was considered plausible until the 1950's. In a nutshell, it attributes to a
speaker a simple mental system that allows him or her to determine whether a given word can or cannot follow
another given word. Here is the example of a Finite State Model discussed by Pinker (I have added 'START' and 'ACCEPT' states, which are implicit in Pinker's discussion; the idea is that you feed the sentence to the
machine, starting with the first word, one word after the other. If you end up in the ACCEPT state after the last
word has been processed, the sentence is accepted; otherwise the sentence is rejected):
(8) [finite state diagram omitted; one transition is a loop labeled 'happy']
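The accept/reject procedure just described can be sketched in a few lines. The transition table below is a hypothetical reconstruction in the spirit of Pinker's example (the exact diagram is in the omitted figure):

```python
# A minimal sketch of a finite state ('word chain') model. The machine
# reads a sentence word by word; it accepts the sentence if it ends in
# the ACCEPT state, and rejects it otherwise. The states and transitions
# here are illustrative assumptions, not Pinker's exact diagram.

TRANSITIONS = {
    ('START', 'the'): 'Q1',
    ('Q1', 'happy'): 'Q1',        # a loop: 'happy' may repeat any number of times
    ('Q1', 'boy'): 'Q2',
    ('Q1', 'girl'): 'Q2',
    ('Q2', 'eats'): 'Q3',
    ('Q3', 'ice cream'): 'ACCEPT',  # multi-word labels kept as single arcs
    ('Q3', 'hot dogs'): 'ACCEPT',
}

def accepts(words):
    state = 'START'
    for w in words:
        state = TRANSITIONS.get((state, w))
        if state is None:
            return False              # no transition for this word: reject
    return state == 'ACCEPT'

print(accepts(['the', 'happy', 'happy', 'boy', 'eats', 'ice cream']))  # True
print(accepts(['the', 'boy', 'happy', 'eats']))                        # False
```

Note that the machine's only record of the past is its current state: a single label, with no memory of how it got there.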
There are two important arguments against the Finite State Model:
-Argument 1: It does not account for the tree-like structure of sentences that we observed in Lecture Notes 3B.
-Argument 2: It cannot properly account for 'long distance dependencies', i.e. constructions in which two
elements that depend on each other are separated by an arbitrary number of words.
(12) Example of a long distance dependency: either ... or...
a. Either John is sick or he is depressed
b. Either John thinks that he is sick or he is depressed
c. Either Mary knows that John thinks that he is sick or she is depressed
d. Either the boy eats hot dogs or the dog eats hot dogs
e. Either the happy happy boy eats hot dogs or the dog eats candy
etc.
We could try to integrate the either ... or construction into our Finite State Model, but no simple solution
would work. To see this, observe that in the following model nothing requires that a sentence that starts with
either should also contain or somewhere down the road. And for good reason: in order to 'remember' this, the
model would need some kind of memory, which it lacks completely. The problem turns out to be very severe.
In fact, Noam Chomsky became famous in the 1950's by proving that no matter how complex a finite state
machine was, it could not handle all constructions of English.
(13) [finite state diagram omitted; like (8), with additional transitions including 'or' and 'then']
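The failure can be made concrete with a small program. The transitions below are a hypothetical stand-in for the omitted diagram in (13); the point is that no chain of word-to-word transitions can enforce the either ... or dependency:

```python
# A hypothetical word-chain model for sentences like
# 'either John is sick or he is depressed'. Because the machine keeps
# no memory of how the sentence started, nothing ties the presence of
# 'either' at the beginning to the presence of 'or' later on.

TRANSITIONS = {
    ('START', 'either'): 'CL1',    # 'either' is just an optional first word
    ('START', 'John'): 'SUBJ',
    ('CL1', 'John'): 'SUBJ',
    ('SUBJ', 'is'): 'V',
    ('V', 'sick'): 'END',
    ('V', 'depressed'): 'END',
    ('END', 'or'): 'CL2',          # the second clause is just optional too
    ('CL2', 'he'): 'SUBJ2',
    ('SUBJ2', 'is'): 'V2',
    ('V2', 'depressed'): 'END2',
}

ACCEPTING = {'END', 'END2'}

def accepts(words):
    state = 'START'
    for w in words:
        state = TRANSITIONS.get((state, w))
        if state is None:
            return False
    return state in ACCEPTING

# The dependency is not enforced: a sentence can start with 'either'
# and reach an accepting state without ever passing through 'or'.
print(accepts('either John is sick'.split()))                     # True (ungrammatical!)
print(accepts('either John is sick or he is depressed'.split()))  # True
```

To reject the first sentence, the machine would need to remember whether it saw 'either', i.e. it would need a memory, which is exactly what a finite state model lacks.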
I → will, might, can, should, does, did (an Inflection is: will, or might, or can, or should, or does, or did)
NP → PN, D N (a Noun Phrase comprises either a Proper Name/Pronoun alone, or a Determiner and a Noun)
PN → John, Bill, Mary, Sam, he, she...
N → President, director, boy, girl, Dean, friend, mother...
VP → Vi, Vt NP, Vs CP (a Verb Phrase comprises either an intransitive Verb Vi alone, or a transitive Verb Vt followed by a Noun Phrase, or a verb of speech or thought Vs followed by a Complementizer Phrase)
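The rules above can be put to work in a small random generator. The entries marked 'assumed' below (the rules for IP, I', CP, D and the verb and complementizer lists) fall on pages not reproduced here, so they are my assumptions, added only to round out the fragment:

```python
# A sketch of the phrase structure grammar as a dictionary from
# categories to lists of possible expansions, plus a random generator.
import random

GRAMMAR = {
    'IP': [['NP', "I'"]],                 # assumed
    "I'": [['I', 'VP']],                  # assumed
    'I':  [['will'], ['might'], ['can'], ['should']],
    'NP': [['PN'], ['D', 'N']],
    'PN': [['John'], ['Bill'], ['Mary'], ['he'], ['she']],
    'D':  [['the']],                      # assumed
    'N':  [['President'], ['director'], ['boy'], ['girl']],
    'VP': [['Vi'], ['Vt', 'NP'], ['Vs', 'CP']],
    'Vi': [['sleep']],                    # assumed
    'Vt': [['meet']],                     # assumed
    'Vs': [['claim'], ['think']],         # assumed
    'CP': [['C', 'IP']],                  # assumed
    'C':  [['that']],                     # assumed
}

def generate(category, depth=0):
    """Expand a category into a list of words by applying rules top-down."""
    if category not in GRAMMAR:
        return [category]                 # terminal word: stop expanding
    rules = GRAMMAR[category]
    if depth > 4:                         # cap recursion so generation terminates
        rules = [r for r in rules if 'CP' not in r] or rules
    rule = random.choice(rules)
    return [w for part in rule for w in generate(part, depth + 1)]

random.seed(1)
for _ in range(3):
    print(' '.join(generate('IP')))
```

Each run of `generate('IP')` traces out exactly one tree licensed by the rules, so a sentence is generated if and only if the grammar assigns it a structure.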
(18) [IP [NP [PN Mary]] [I' [I will] [VP [Vi sleep]]]]

(19) [IP [NP [D the] [N President]] [I' [I will] [VP [Vi sleep]]]]
We can also generate some of the sentences that occupied us in Lecture Notes 3B:
(20) [IP [NP [PN Mary]] [I' [I will] [VP [Vt meet] [NP [D the] [N President]]]]]

(21) [IP [NP [D Your] [N friend]] [I' [I will] [VP [Vt meet] [NP [D the] [N President]]]]]
Crucially, we observe that our Phrase Structure Grammar generates sentences 'with the right structure', i.e.
with the tree-like structure that was discussed in Lecture Notes 3B. The only difference is that some non-
branching nodes have been added (reminder: a non-branching node is a node with just 1 daughter). When the
non-branching nodes and the labels are disregarded, we obtain exactly the trees that were argued for in Lecture
Notes 3B:
(22) [[Mary] [will [meet [the President]]]]

(23) [[Your friend] [will [meet [the President]]]]
We also note that our little grammar can generate more complex sentences, thanks in particular to our rule for
verbs of speech and thought (e.g. believe, think, claim, etc.), which can embed an Inflection Phrase within
another Inflection Phrase, as is shown below (the embedding of a constituent of a given category within
another constituent of the same category is called recursion; it is essential to generate an infinite language):
(24) [IP [NP [PN Mary]] [I' [I will] [VP [Vs claim] [CP [C that] [IP [NP [PN John]] [I' [I will] [VP [Vi sleep]]]]]]]]
Recursion of IP: the inner IP (John will sleep) is embedded within another IP.
Observe that nothing would prevent us from embedding the IP in (24) within a larger IP, e.g. John will think
that ____. Since this procedure can be repeated as many times as we want, our grammar can generate an
infinite number of sentences.
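This repeatable embedding can be made vivid with a few lines of code (an illustration of the point, not part of the notes):

```python
# Each level of embedding under a verb of thought wraps the previous
# sentence in a new, longer grammatical sentence, so the procedure can
# be iterated without bound.

def embed(sentence, n):
    """Wrap a sentence n times in 'John will think that ...'."""
    for _ in range(n):
        sentence = 'John will think that ' + sentence
    return sentence

base = 'Mary will sleep'
for n in range(3):
    print(embed(base, n))
# Mary will sleep
# John will think that Mary will sleep
# John will think that John will think that Mary will sleep
```

Since every value of n yields a distinct sentence, the grammar generates infinitely many sentences even though it has only finitely many rules.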
At this point it should already be clear that we have met Requirement 1: our grammar does account for the
tree structure that was argued for in Lecture Notes 3B. What about Requirement 2, then? Do we now have an
account of long-distance dependencies? We do, as soon as we add one rule to our little grammar:
(25) IP → either IP1 or IP2
It is then clear that by adding under IP1 or IP2 any of the trees that can be generated by our grammar, we
obtain a grammatical sentence. Requirement 2 has thus been met as well.
[IP [NP [PN John]] [I' [VP [NP [PN Mary]] [V hit]] [I PAST]]]
(29) [IP [NP [PN Bill]] [I' [VP [CP [IP [NP [PN John]] [I' [VP [NP [PN Mary]] [V hit]] [I PAST]]] [C that]] [V think]] [I PAST]]]
It should be noted that English and Japanese are two extreme examples: head-initial for all constructions
(English), or head-final for all constructions (Japanese). Some languages display a mixed pattern, in which
some constructions (e.g. Verb Phrases) are head-initial, while others (e.g. Complementizer Phrases) are head-
final.
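The head parameter can be sketched as follows. The encoding is my own (a leaf is a word; a phrase is a head paired with its complement, leaving out subjects for simplicity):

```python
# One hierarchical structure, two word orders: head-initial
# (English-like) linearization puts each head before its complement,
# head-final (Japanese-like) linearization puts it after.

def words(x, head_final=False):
    """Linearize a tree into a list of words."""
    if isinstance(x, str):
        return [x]                         # leaf: a single word
    head, comp = x                         # phrase: (head, complement)
    h = words(head, head_final)
    c = words(comp, head_final)
    return c + h if head_final else h + c

# think [that [hit Mary]] -- the same structure in both languages
vp = ('think', ('that', ('hit', 'Mary')))
print(' '.join(words(vp, head_final=False)))  # think that hit Mary
print(' '.join(words(vp, head_final=True)))   # Mary hit that think
```

A mixed-pattern language would simply consult a per-category setting (head-initial VPs but head-final CPs, say) instead of a single global flag.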
-Two 'tricks'
(a) Arbitrariness of the sign [Saussure] (p. 75)
(b) Infinite use of finite means [Humboldt] (p. 75)