Owen Rambow
CCLS, Columbia University rambow@ccls.columbia.edu
2008-03-12
Overview: Terminology (ppt) · What is Tree Adjoining Grammar? · Some Syntactic Analyses · TAG and Syntactic Theory · Problem: German
Three Main Points About Tree Adjoining Grammar (TAG): TAG is a constrained mathematical formalism TAG supports the development of lexicalized grammars for natural languages TAG is not a linguistic theory
Constrained mathematical formalisms Mathematical device for specifying sets of structures Constrained: mathematical device cannot specify all possible sets
Linguistically appealing because the scope of linguistic theory is restricted; computationally appealing because of efficient processing models
Many syntactic phenomena are idiosyncratic to specific lexical items or classes of lexical items. Natural language processing: use of corpora is pervasive, and corpora (typically) consist of words
(1) a. He told a secret to Mary b. He told Mary a secret c. He divulged a secret to Mary d. *He divulged Mary a secret
Lexicalization
A formalism is lexicalized if every elementary structure contains at least one terminal symbol (= lexical item)
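This definition is easy to check mechanically. The sketch below (an illustration, not from the slides) encodes elementary structures as symbol lists and tests whether each contains a terminal; the rules mirror the toy CFG used later in the talk:

```python
# Sketch: checking whether a grammar is lexicalized. Each elementary
# structure is given as [LHS, *RHS]; symbols mirror the toy CFG used
# later in this talk.
RULES = [
    ["S", "NP", "VP"],       # no terminal on the right-hand side
    ["VP", "really", "VP"],
    ["VP", "V", "NP"],
    ["V", "likes"],
    ["NP", "John"],
    ["NP", "Lyn"],
]
TERMINALS = {"really", "likes", "John", "Lyn"}

def is_lexicalized(rules, terminals):
    """True iff every elementary structure contains at least one terminal."""
    return all(any(sym in terminals for sym in rule[1:]) for rule in rules)

# This CFG is not lexicalized: S -> NP VP and VP -> V NP have no terminal.
```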
Context-Free Grammar (CFG) (Chomsky 1957), used all over NLP, BUT
Wrong descriptive level (issue of lexicalization) Not powerful enough formally (Shieber 1985 on Swiss German)
Most current linguistic formalisms go beyond CFG (TAG, HPSG, LFG) Most of these lose constraint on formal power EXCEPT TAG
Let's Start with a CFG (2) a. S → NP VP b. VP → really VP c. VP → V NP d. V → likes e. NP → John f. NP → Lyn Elementary structures of this grammar: the rules
Derivation in a CFG
Derivation in a CFG
Rule 1: S → NP VP Rule 2: VP → really VP Rule 3: VP → V NP Rule 4: V → likes Rule 5: NP → John Rule 6: NP → Lyn
S ⇒ NP VP (Rule 1) ⇒ John VP (Rule 5) ⇒ John really VP (Rule 2) ⇒ John really V NP (Rule 3) ⇒ John really V Lyn (Rule 6) ⇒ John really likes Lyn (Rule 4)
CFG is a string-rewriting system Elementary structure: context-free rewrite rule Operation: rewrite Record of derivation is a phrase-structure tree, called derivation tree
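The string-rewriting view can be sketched in a few lines of Python (a minimal illustration; the dictionary encoding of the rules is mine, not the talk's):

```python
# Sketch: the toy CFG as a dict from nonterminal to right-hand sides,
# with a leftmost derivation driven by a list of rule choices.
CFG = {
    "S":  [["NP", "VP"]],
    "VP": [["really", "VP"], ["V", "NP"]],
    "V":  [["likes"]],
    "NP": [["John"], ["Lyn"]],
}

def derive(symbols, choices):
    """Rewrite the leftmost nonterminal using rule indices popped from choices."""
    result = list(symbols)
    while choices:
        for i, sym in enumerate(result):
            if sym in CFG:
                result[i:i + 1] = CFG[sym][choices.pop(0)]
                break
        else:
            break  # no nonterminal left
    return result

# S => NP VP => John VP => John really VP => ... => John really likes Lyn
sentence = derive(["S"], [0, 0, 0, 1, 0, 1])
```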
Derivation Tree:
(S (NP John) (VP really (VP (V likes) (NP Lyn))))
S expanded by Rule 1; NP → John by Rule 5; VP → really VP by Rule 2; VP → V NP by Rule 3; NP → Lyn by Rule 6; V → likes by Rule 4
Lexicalization (reminder)
A formalism is lexicalized if every elementary structure contains at least one terminal symbol (= lexical item)
Rule 1: S → NP VP Rule 2: VP → really VP Rule 3: VP → V NP Rule 4: V → likes Rule 5: NP → John Rule 6: NP → Lyn
Lexicalizing a CFG (ctd) Can we lexicalize this CFG? Greibach Normal Form: changes rules, linguistically unappealing (e.g., S → NP likes NP) What about really? Go from string rewriting to tree rewriting!
Substitution: a nonterminal leaf (a substitution node, marked A↓) is replaced by an initial tree whose root carries the same label A.
Example: substituting (NP John) and (NP Lyn) into (S NP↓ (VP (V likes) NP↓)) yields (S (NP John) (VP (V likes) (NP Lyn)))
New formalism: Tree Substitution Grammar (TSG) Elementary structures: phrase-structure trees Operations: substitution
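The substitution operation can be sketched directly on tree structures (the encoding is mine, not the talk's: trees are (label, children) pairs, and a substitution site is written "X!"):

```python
# Sketch of substitution in a Tree Substitution Grammar. A tree is
# (label, children); children are subtrees, terminal strings, or
# substitution sites written "X!".
def substitute(tree, initial):
    """Plug `initial` into the first substitution site matching its root."""
    _, children = tree
    site = initial[0] + "!"
    for i, child in enumerate(children):
        if child == site:
            children[i] = initial
            return True
        if isinstance(child, tuple) and substitute(child, initial):
            return True
    return False

def frontier(tree):
    """Left-to-right leaves of the tree (the derived string)."""
    _, children = tree
    return [leaf for c in children
            for leaf in (frontier(c) if isinstance(c, tuple) else [c])]

likes = ("S", ["NP!", ("VP", [("V", ["likes"]), "NP!"])])
substitute(likes, ("NP", ["John"]))   # fills the subject slot
substitute(likes, ("NP", ["Lyn"]))    # fills the object slot
# frontier(likes) is now ["John", "likes", "Lyn"]
```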
Formal power? Equivalent to CFG Linguistic expressive power (what range of theories can we formulate)? Greater than CFG; examples: agreement, subcategorization; extended domain of locality
Agreement
(S NP↓[number=sg, person=3] (VP (V likes[number=sg, person=3]) NP↓))
(S NP↓[number=pl] (VP (V like[number=pl]) NP↓))
Subcategorization
(S NP0↓ (VP (V tell) NP2↓ NP1↓))
(S NP0↓ (VP (V tell) NP1↓ (PP (P to) NP2↓)))
Adjunction: an auxiliary tree, with root label A and a foot node A*, is inserted at an A-labeled node of another tree; the subtree at that node moves to the foot.
Example: adjoining (VP really VP*) at the VP node of (S (NP John) (VP (V likes) (NP Lyn))) yields (S (NP John) (VP really (VP (V likes) (NP Lyn))))
Note: trees that can be adjoined are called auxiliary trees; trees that can be substituted are called initial trees
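Adjunction can be sketched in the same style (encoding mine, not the talk's: trees are (label, children) pairs, and the foot node of an auxiliary tree is written "X*"):

```python
# Sketch of adjunction: the subtree at the adjunction site is excised
# and re-planted at the foot node of (a copy of) the auxiliary tree.
import copy

def plant_foot(aux, subtree):
    """Replace the foot node of `aux` with `subtree`."""
    _, children = aux
    for i, child in enumerate(children):
        if child == subtree[0] + "*":
            children[i] = subtree
            return True
        if isinstance(child, tuple) and plant_foot(child, subtree):
            return True
    return False

def adjoin(tree, aux):
    """Adjoin `aux` at the first node whose label matches its root."""
    _, children = tree
    for i, child in enumerate(children):
        if isinstance(child, tuple):
            if child[0] == aux[0]:
                new_aux = copy.deepcopy(aux)
                plant_foot(new_aux, child)
                children[i] = new_aux
                return True
            if adjoin(child, aux):
                return True
    return False

def frontier(tree):
    """Left-to-right leaves of the tree (the derived string)."""
    _, children = tree
    return [leaf for c in children
            for leaf in (frontier(c) if isinstance(c, tuple) else [c])]

derived = ("S", [("NP", ["John"]),
                 ("VP", [("V", ["likes"]), ("NP", ["Lyn"])])])
really = ("VP", ["really", "VP*"])   # auxiliary tree with foot VP*
adjoin(derived, really)
# frontier(derived) is now ["John", "really", "likes", "Lyn"]
```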
Formal power? Greater than CFG and TSG Linguistic expressive power (what range of theories can we formulate)? Greater than CFG, but same as TSG; examples: agreement, subcategorization; extended domain of locality
Derivation structure?
Derived tree: (S (NP John) (VP really (VP (V likes) (NP Lyn))))
Derivation tree: root like; daughters John (tree address 1, SUBJ), really (address 2, ADJUNCT), Lyn (address 2.2, OBJ)
A note on elementary structures and rewriting CFG: elementary structures are rewrite rules
S NP VP
TAG: elementary structures are PS trees TAG trees are really tree rewrite rules:
e.g., the likes tree is a rule rewriting the symbol S as the tree (S NP (VP (V likes) NP)); the really auxiliary tree is a rule rewriting a VP node as (VP really VP)
CFG: Elementary Struc. = string (rewrite rule); Derived Structure = string; Derivation Struc. = PS tree; Can lexicalize Engl? = no
Overview: Terminology (ppt) · What is Tree Adjoining Grammar? · Some Syntactic Analyses · TAG and Syntactic Theory · Problem: German
Why TAGs are useful in Linguistics Constrained mathematical formalism Extended domain of locality
Extended Domain of Locality
Elementary structure is a tree, not a single layer of a tree
CFG: VP → V NP[obj]
TAG: (S NP↓[case:nom, agr:1] (VP (V[agr:1] /eat/) NP↓[case:obj]))
We can use the Extended Domain of Locality for: Case requirements imposed on complements (example: German objects) Agreement through co-referential feature structures between different parts of the tree (e.g., subject-verb agreement, or object clitic-participle agreement in French) Strongly governed prepositions Subcategorization frame of lexical item (basic subcat and variations) Syntax of lexical item (basic and variation)
(S NP↓[nom, agr:1] (VP (V[agr:1] /discriminate/) (PP (P against) NP↓[obj])))
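The co-referential feature sharing above can be pictured with a tiny unifier over flat feature structures (an illustrative sketch, not the talk's machinery):

```python
# Sketch: agreement via unification of flat feature structures (dicts).
def unify(f, g):
    """Unify two flat feature structures; return None on a clash."""
    out = dict(f)
    for key, val in g.items():
        if key in out and out[key] != val:
            return None   # feature clash: unification fails
        out[key] = val
    return out

subject = {"number": "sg", "person": "3"}   # e.g. the NP "John"
likes_v = {"number": "sg", "person": "3"}   # requirement of "likes"
like_v  = {"number": "pl"}                  # requirement of "like"

# "John likes" unifies; "*John like" clashes on number.
```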
Subcategorization frame:
(S NP↓ (VP (V /eat/) NP↓))
(S NP↓ (VP (V /give/) NP↓ (PP (P to) NP↓)))
Adjunct auxiliary tree: (VP Adv VP*)
Adjuncts are not required, but the adjunct can only modify certain trees
English Passive:
(S NP↓[nom, role:theme, agr:1] (VP (Aux /be/[agr:1]) (VP (V eaten[past-part]))))
Idiom
(S NP↓ (VP (V /kick/) (NP (Det the) (N bucket))))
Examples of Derivations (1) Subcategorization for clauses (2) Long-distance topicalization (3) Control verbs (4) Raising (5) Extraction from picture-NPs
Subcategorization for clauses, Approach 1: Substitution
/think/: (S NP↓[nom, agr:1] (VP (V[agr:1] /think/) S′↓))
complementizer: (S′ (COMP that) S↓)
/like/: (S NP↓[nom, agr:1] (VP (V[agr:1] /like/) NP↓))
Derived: (S NP↓[nom, agr:1] (VP (V[agr:1] /think/) (S′ (COMP that) (S NP↓[nom, agr:2] (VP (V[agr:2] /like/) NP↓)))))
S
r r rr
r r rr
COMP that S * NP
S
rrr
P V
VP
rr
V /like/
NP
/think/
NP
r rr r
r
VP
[nom,agr: 1 ]
r r r r
V [agr: 1 ] S
/think/ that
r r r
r
S
COMP
r rr r
NP [nom,agr: 2 ] V [agr: 2 ] /like/ VP
r r
NP
Subcategorization for clauses Approach 2: Adjunction (ctd) Exactly the same derived tree (except for S root node of matrix clause) Advantage of adjunction: extraction from embedded clause (coming up!)
Long-distance topicalization
/eat/ tree with topicalized object: (S NP_i↓ (S NP↓ (VP (V /eat/) e_i)))
/think/ auxiliary tree: (S NP↓ (VP (V /think/) S*)), or with overt complementizer: (S NP↓ (VP (V /think/) (S′ (COMP that) S*)))
Adjoining /think/ at the inner S node of the /eat/ tree keeps the topicalized NP and its trace in one elementary tree; iterated adjunction (/think/, /suspect/) derives unbounded embedding
Control verbs
/wishes/ (subject control) auxiliary tree: (S NP↓[ref:1] (VP (V wishes) S*[mood:inf-to, control:1]))
It adjoins at the root of the infinitival tree: (S[mood:inf-to, control:1] (NP PRO[index:1]) (VP V NP↓))
Control verbs
/asks/ (object control) auxiliary tree: (S NP↓ (VP (V asks) NP↓[ref:1] S*[mood:inf-to, control:1]))
It adjoins at the root of the same infinitival tree: (S[mood:inf-to, control:1] (NP PRO[index:1]) (VP V NP↓))
Raising
/seem/ auxiliary tree: (VP (V[agr:1] /seem/) VP*), adjoining at the VP node of a host like (S NP↓[agr:1] (VP[agr:1] (V to sing)))
Iterated adjunction derives e.g. 'seems to appear to sing'
Picture-NP extraction
/paint/: (S NP↓ (VP (V /paint/) NP↓))
picture NP: (NP (NP pictures) (PP (P of) NP↓))
John paints pictures of bridges. How do we get 'Bridges_i John paints pictures of t_i'?
A two-part (multi-component) set for the picture NP: an S auxiliary tree hosting the extracted NP[index:1] above S*, paired with (NP (NP pictures) (PP (P of) NP↓[index:1]))
Derived: (S NP[index:1] (S NP↓ (VP (V /paint/) (NP (NP pictures) (PP (P of) NP[index:1])))))
Overview: Terminology (ppt) · What is Tree Adjoining Grammar? · Some Syntactic Analyses · TAG and Syntactic Theory · Problem: German
The Lexicon as Grammar
Q: Where do all these trees come from? Aren't generalizations being missed (wh-movement is the same for like and dislike)?
A: We can generalize across classes of lexical items (e.g., Transitive-Verb), and associate each lexical item with a class (e.g., like is a Transitive-Verb)
Q: Still, wh-movement from the subject position is the same in intransitive, transitive, and ditransitive verbs; isn't a generalization being missed?
A: Generating all possible trees is the goal of a TAG-based theory of syntax
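The class-based answer can be sketched as a lookup from lexical items to tree families (family and tree names here are hypothetical, for illustration only):

```python
# Hypothetical sketch: lexical items point to tree families; the family
# enumerates the syntactic variations its members share.
FAMILIES = {
    "Transitive-Verb":   ["declarative", "wh-subject", "wh-object",
                          "topicalized-object", "passive"],
    "Intransitive-Verb": ["declarative", "wh-subject"],
}
LEXICON = {
    "like":    "Transitive-Verb",
    "dislike": "Transitive-Verb",
    "blush":   "Intransitive-Verb",
}

def trees_for(word):
    """All elementary trees for `word`: its family's trees, anchored by it."""
    return [(word, tree) for tree in FAMILIES[LEXICON[word]]]

# like and dislike get identical tree sets: wh-movement is stated once,
# in the family, not per lexical item.
```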
The Lexicon as Grammar (ctd) Q: How big are the elementary trees? A: A tree is a lexical item and its syntactic projection.
Option 1: the determiner projects its own tree: (DP (Det the) NP↓), with (NP tree) substituting
Option 2: the determiner is an auxiliary tree adjoining into the noun's projection: (NP (Det the) NP*)
TAG is not a linguistic theory (nor a linguistic framework). Like CFG, TAG is a mathematical formalism. Unlike GPSG, HPSG, LFG, CCG, and others, TAG is not a combined theory-and-formal-framework. This means that a linguistic theory (or a linguistic framework) must be added to TAG to use it for linguistic description and/or theorizing. We speak of TAG-based linguistic theories/approaches/frameworks
[Diagram, Model: Minimalism: Lexicon; operations project, insert, move-α, extend-target; (Spellout); Multiclausal LF and Multiclausal PF]
[Diagram, TAG-adapted model: Lexicon constrains Clausal D-Structure; move-α derives Clausal S-Structure; formal derivation yields Multiclausal S-Structure; Principles and Parameters constrain each level]
[Diagram, Model: GB: Lexicon constrains D-Structure; move-α derives S-Structure; Principles and Parameters constrain each level]
TAG-Based Syntactic Theory Syntactic theory explains variation on domains of extended projections only Combination of extended projections only through the formal operations of substitution and adjunction Scope of theory greatly reduced! See Frank (2000) for sample account
Overview: Terminology (ppt) · What is Tree Adjoining Grammar? · Some Syntactic Analyses · TAG and Syntactic Theory · Problem: German
German: long scrambling in the Mittelfeld
(3) a. dass es Hans zu reparieren versucht
       that it Hans to repair tries
       'that Hans tries to repair it'
    b. * dass es Hans repariert zu haben bereut
       that it Hans repaired to have regrets
       Intended meaning: 'that Hans regrets having repaired it'
Long scrambling is iterable and combinable with extraposition
(4) a. dass es Hans den Kindern zu geben versucht
       that it.ACC Hans the.children.DAT to give tries
       'that Hans tries to give it to the children'
    b. dass es Hans den Kindern versucht zu geben
Solution: V-TAG, UVG-DL, DSG (Rambow 1994, Rambow et al 2001) TAG: grammar = ready-made trees; trees are combined DSG: grammar = trees in parts, assembly required; assembled trees are combined
Example: the DSG tree set for geben comes in parts:
argument components (VP NP↓nom VP) [arg:+], (VP NP↓dat VP) [arg:+], (VP NP↓acc VP) [arg:+]
a head component (VP VP (V geben)) [head:+]
path constraints on the dominance links (e.g., no intervening node marked [complete:+], no intervening node marked [head:+]) restrict how the parts may be assembled
versuchen contributes its own component set ((VP VP (V versuchen)) plus argument components); interleaving the components of geben and versuchen derives the scrambled and extraposed orders in (3)-(4)
[Diagram, Old Model: Lexicon constrains Clausal D-Structure; move-α derives Clausal S-Structure; formal derivation yields Multiclausal S-Structure; Principles and Parameters constrain each level]
[Diagram, New Model: Clausal Structure; formal derivation yields Multiclausal LF and Multiclausal PF; Principles and Parameters constrain]
Conclusion Tree Adjoining Grammar is a good formalism for describing natural language syntax Exact type of TAG, and exact type of linguistic theory, are active areas of research
Backup Slides
Desiderata for a formal system for NL syntax (based on mildly context-sensitive, Joshi 1987): Constrained weak generative capacity (at most context-sensitive) Sentences made up incrementally from lexical items: constant-growth property Can form a new sentence using conjunction: closure under Kleene-star Sentential subjects: closure under iterated substitution Polynomially parsable
Null adjoining constraint (NA) (= terminal node for rewriting) Obligatory adjoining constraint (OA) (= nonterminal node for rewriting) Selective adjoining constraint (SA)
Example: (S NP↓ (VP_OA(β1, β2) (V seen))) with
β1 = (VP (Aux has) VP*)
β2 = (VP (Aux is) VP*)
TAG with Feature Structures Represent adjoining constraints with xed-sized feature structures
Adjoining at a node A carrying top/bottom features [t1 / b1] an auxiliary tree whose root carries [t2 / b2] and whose foot carries [t3 / b3] yields: root of the adjoined material [t1 ∪ t2 / b2], foot [t3 / b1 ∪ b3]
Example (tense):
seen: (S NP↓ (VP[top tense:+ / bottom tense:-] (V seen)))
β1 has: (VP[tense:+] (Aux has) VP*[tense:-])
β2 is: (VP[tense:+] (Aux is) VP*[tense:-])
β3 been: (VP[tense:-] (Aux been) VP*[tense:-])
At the end of the derivation, top and bottom must unify at every node; the [tense:+ / tense:-] clash at the VP of seen forces adjunction of a tensed auxiliary (has, is), while been leaves the requirement open for further adjunction
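The effect of the fixed-size top/bottom feature structures can be sketched with a small flat unifier (an illustration of the mechanism, not the talk's exact feature set):

```python
# Sketch: obligatory adjunction arising from a top/bottom feature clash.
def unify(f, g):
    """Unify two flat feature structures; return None on a clash."""
    out = dict(f)
    for key, val in g.items():
        if key in out and out[key] != val:
            return None
        out[key] = val
    return out

# VP node of the 'seen' tree: the context above demands a tensed VP,
# but 'seen' itself is untensed.
top, bottom = {"tense": "+"}, {"tense": "-"}

# A derivation may only finish if top and bottom unify at every node;
# the clash here makes adjunction obligatory.
must_adjoin = unify(top, bottom) is None

# Auxiliary 'has': root carries [tense:+], foot carries [tense:-].
# Adjunction splits the node: the old top pairs with the aux root, the
# aux foot pairs with the old bottom, and both unifications succeed.
root_ok = unify(top, {"tense": "+"}) is not None
foot_ok = unify({"tense": "-"}, bottom) is not None
```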
The Lexicon as Grammar (or the Grammar as Lexicon) We have many choices how to define elementary trees in a linguistic grammar, e.g.:
(S NP↓[agr:3sg] (VP (V[agr:3sg] blushes))), with (NP Betty[agr:3sg]) substituting
(S (NP Betty) (VP V↓ (NP bats)))
The Lexicon as Grammar (ctd) Almost all linguistic uses of TAG follow certain basic assumptions; we discuss those basic assumptions here. But the basic assumptions are ones linguists bring to TAG, not assumptions that TAG imposes on the linguists! For example, we can use EDL for capturing linguistic properties as just discussed, but TAG does not impose that choice
The Lexicon as Grammar (ctd) It seems that in language: Properties (morphological, syntactic, semantic) are associated with lexical items and with classes of lexical items TAG: Given the extended domain of locality, we can state all relevant properties in an elementary tree The properties of larger structures are compositionally derivable from the properties of the composing lexical items TAG: Grammar is a set of trees, each anchored by a lexical item
(S NP↓[agr:3sg] (VP (V[agr:3sg] likes) NP↓))
(S NP↓[agr:3sg] (VP (V[agr:3sg] blushes)))
(NP Betty) (NP bats)
The Lexicon as Grammar (ctd) Q: In a TAG grammar, is there one tree per lexical item? A: No. Example: verb Voice alternations (active/passive) Syntactic alternations (wh-movement, topicalization, dative shift, ...) Every lexical item is associated with a set of trees (a tree family) which represents the set of its syntactic variations
The Lexicon as Grammar (ctd) Conclusion: A TAG grammar (in the formal sense) is an enumeration of all lexical items in the language, in all possible syntactic variations The formal TAG grammar is a lexicon of the language
English Passive:
short passive: (S NP↓[nom, role:theme, agr:1] (VP (Aux /be/[agr:1]) (VP (V[past-part] eaten))))
long passive: (S NP↓[nom, role:theme, agr:1] (VP (Aux /be/[agr:1]) (VP (V[past-part] eaten) (PP (P by) NP↓[obj, role:agent]))))
with tense features: (S NP↓[agr:1] (VP[agr:1, tense:+] (VP[tense:-, mood:past-part] ...)))