
Lexical and semantic selection
Options for grammar engineers and what they might mean linguistically
Outline and
acknowledgements
1. Selection in constraint-based approaches
i. types of selection and overview of methods used in
LKB/ERG
ii. denotation
2. The collocation problem
i. collocation in general
ii. corpus data on magnitude adjectives
iii. possible accounts
3. Conclusions
 Acknowledgements: LinGO/DELPH-IN, especially
Dan Flickinger, also Generative Lexicon 2005
1(i): Types of grammatical
selection
 syntactic: e.g., preposition among selects for
an NP (like other prepositions)
 lexical: e.g., spend selects for PP headed by
on
 Kim spent the money on a car
 semantic: e.g., temporal at selects for times
of day (and meals)
 at 3am
 at three thirty five and ten seconds precisely
Lexical selection
 lexical selection requires method of
specifying a lexeme
 in the ERG, this is via the PRED value

spend (e.g., spend the money on Kim)


spend_v2 := v_np_prep_trans_le &
[ STEM < "spend" >,
SYNSEM [ LKEYS [ --OCOMPKEY _on_p_rel,
KEYREL.PRED "_spend_v_rel" ]]].
Lexical selection
 ERG relies on convention that different lexemes have
different relations
 `lexical’ selection is actually semantic. cf Wechsler
 no true synonyms assumption, or assume that grammar
makes distinctions that are more fine-grained than real-
world denotation justifies.
 near-synonymy would have to be recorded elsewhere: ERG
does (some) morphology, syntax and compositional
semantics
 alternatives?
 orthography: but ambiguity or non-monotonic semantics
 lexical identifier: requires new feature
 PFORM: requires features, values
Semantic selection
 Requires a method of specifying a
semantically-defined phrase
 In ERG, done by specifying a higher
node in the hierarchy of relations:
at_temp := p_temp_le &
[ STEM < "at" >,
SYNSEM [ LKEYS [ --COMPKEY hour_or_time_rel,
KEYREL.PRED _at_p_temp_rel ]]].
Hierarchy of relations
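A minimal sketch of selection against a hierarchy of relations, assuming a toy hierarchy. Only hour_or_time_rel and _on_p_rel are taken from the entries above; every other relation name, and the shape of the hierarchy itself, is invented for illustration and is not the ERG's actual hierarchy. A complement satisfies a COMPKEY constraint if its relation is the named node or one of its descendants.

```python
# Hypothetical fragment of a relation hierarchy, stored as child -> parent.
# Only hour_or_time_rel and _on_p_rel appear in the slides; the rest is invented.
PARENT = {
    "hour_or_time_rel": "temporal_rel",
    "_3am_n_rel": "hour_or_time_rel",
    "_lunch_n_rel": "hour_or_time_rel",
    "_tuesday_n_rel": "temporal_rel",
    "_on_p_rel": "prep_rel",
}


def subsumes(general, specific):
    """Return True if `general` is `specific` or one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = PARENT.get(node)  # None once we walk past the top of the toy hierarchy
    return False


def satisfies(compkey, complement_rel):
    """A complement is selected if its relation falls under the COMPKEY node."""
    return subsumes(compkey, complement_rel)


# Semantic selection: a higher node admits a whole family of complements.
print(satisfies("hour_or_time_rel", "_3am_n_rel"))      # True
print(satisfies("hour_or_time_rel", "_tuesday_n_rel"))  # False
# Lexical selection is the special case where the selected node is a leaf.
print(satisfies("_on_p_rel", "_on_p_rel"))              # True
```

On this view, selecting a single lexeme and selecting a semantic class use the same check; they differ only in how high in the hierarchy the selected node sits.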
Semantic selection
 Semantic selection allows for indefinitely large set of
alternative phrases
 compositionally constructed time expressions
 productive with respect to new words, but exceptions
allowable
• approach wouldn’t be falsified by an isolated exception, e.g., if *at tiffin is unacceptable
 ERG lexical selection is a special case of ERG
semantic selection!
 could assume featural encoding of semantic
properties (alternatively or in addition to hierarchy)
 TFS semantic selection is relatively limited practically
(see later)
 also idiom mechanism in ERG
1(ii): Denotation, grammar
engineering perspective
 Denotation is truth-conditional, logically formalisable (in
principle), refers to `real world’ (extension)
 Not necessarily decomposable
 Naive physics, biology, etc
 Must interface with non-linguistic components
 Minimising lexical complexity in broad-coverage grammars is
practically necessary
 Plausible input to generator:
 reasonable to expect real world constraints to be obeyed (except in
context)
• the goat read the book
 Potential disambiguation is not a sufficient condition for lexical
encoding
 The vet treated the rabbit and the guinea pig with dietary Vitamin C
deficiency
Denotation, continued
 Assume linkage to domain, richer knowledge
representation language available
 TFS language for syntax etc, not intended for
general inference
 Talmy example: the baguette lay across the
road
 across - Figure’s length > Ground’s width
 identifying F and G and location for comparison in
grammar?
 coding average length of all nouns?
 allowing for massive baguettes and tiny roads?
But ...
 Trend in KR is towards description logics rather than richer
languages.
 Need to think about the denotation to justify grammaticization
(or otherwise)
 if temporal in/on/at have same denotation, selectional account is
required for different distribution
 unreasonable to expect lexical choice for in/on/at in input to
generator
 Linguistic criteria: denotation versus grammaticization?
 effect found cross-linguistically?
 predictable on basis of world knowledge?
 closed class vs open class
 Practical considerations about interfacing go along with linguistic
criteria
 non-linguists expect some information about word meaning!
 allow generalisation over e.g., in/on/at in generator input, while
keeping possibility of distinction
2(i) Collocation: assumptions
 Significant co-occurrences of words in
syntactically interesting relationships
 `syntactically interesting’: for examples in this
talk, attributive adjectives and the nouns they
immediately precede
 `significant’: statistically significant (but on what
assumptions about baseline?)
 Compositional, no idiosyncratic syntax etc (as
opposed to multiword expression)
 About language rather than the real world
Collocation versus denotation
 Whether an unusually frequent word pair is a
collocation or not depends on assumptions about
denotation: fix denotation to investigate collocation
 Empirically: investigations using WordNet synsets
(Pearce, 2001)
 Anti-collocation: words that might be expected to go
together and tend not to
 e.g., flawless behaviour (Cruse, 1986): big rain (unless
explained by denotation)
 e.g., buy house is predictable on basis of denotation,
shake fist is not
2(ii): Distribution of
`magnitude’ adjectives
 some very frequent adjectives have magnitude-
related meanings (e.g., heavy, high, big, large)
 basic meaning with simple concrete entities
 extended meaning with abstract nouns, non-concrete
physical entities (high taxation, heavy rain)
 extended uses more common than basic
 not all magnitude adjectives have such extended uses – e.g., tall
 nouns tend to occur with a limited subset of these
extended adjectives
 some apparent semantic groupings of nouns which
go with particular adjectives, but not easily specified
Some adjective-noun
frequencies in the BNC

        number  proportion  quality  problem  part  winds  rain
large     1790         404        0       10   533      0     0
high        92         501      799        0     3     90     0
big         11           1        0       79    79      3     1
heavy        0           0        1        0     1      2   198
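The question of what baseline to assume for `significant' co-occurrence can be made concrete with a small calculation. The sketch below computes pointwise mutual information (PMI) for pairs in the table above. It treats the table itself as the whole universe of adjective-noun pairs, which is an assumption made only for illustration: a real study would take its marginals from the full corpus, and PMI is one association measure among several, not one named in these slides.

```python
import math

# Counts copied from the table above (BNC adjective-noun frequencies).
NOUNS = ["number", "proportion", "quality", "problem", "part", "winds", "rain"]
ROWS = {
    "large": [1790, 404, 0, 10, 533, 0, 0],
    "high": [92, 501, 799, 0, 3, 90, 0],
    "big": [11, 1, 0, 79, 79, 3, 1],
    "heavy": [0, 0, 1, 0, 1, 2, 198],
}
# Assumption: the baseline is restricted to the pairs in this table.
TOTAL = sum(sum(row) for row in ROWS.values())


def pmi(adjective, noun):
    """log2 of observed pair probability over the product of the marginals."""
    column = NOUNS.index(noun)
    joint = ROWS[adjective][column]
    if joint == 0:
        return None  # PMI is undefined for unseen pairs
    adjective_total = sum(ROWS[adjective])
    noun_total = sum(row[column] for row in ROWS.values())
    return math.log2(joint * TOTAL / (adjective_total * noun_total))


print(pmi("heavy", "rain"))    # strongly positive: the pair is far above baseline
print(pmi("large", "number"))  # only mildly positive despite the high raw count
print(pmi("large", "rain"))    # None: no evidence either way
```

The contrast between heavy rain and large number shows why raw frequency alone is a poor guide: large is frequent with many nouns, so a high count with number is less surprising than a moderate count for heavy with rain.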
Grammaticality judgments

number proportion quality problem part winds rain

large * ? * *

high * ? *
big ? *

heavy ? * * *
More examples
        importance  success  majority  number  proportion  quality  role  problem  part  winds  support  rain
great          310      360       382     172           9       11     3       44    71      0       22     0
large            1        1       112    1790         404        0    13       10   533      0        1     0
high             8        0         0      92         501      799     1        0     3     90        2     0
major           62       60         0       0           7        0   272      356   408      1        8     0
big              0       40         5      11           1        0     3       79    79      3        1     1
strong           0        0         2       0           0        1     8        0     3    132      147     0
heavy            0        0         1       0           0        1     0        0     1      2        4   198
Judgments
importance success majority number proportion quality role problem part winds support rain

great ? *

large ? ? * ? * *

high * ? ? * ? *

major ? ? ?

big ? ?

strong ? ? * * * * ?

heavy ? * ? * * * *
Distribution
 Investigated the distribution of heavy, high, big,
large, strong, great, major with the most common
co-occurring nouns in the BNC
 Nouns tend to occur with up to three of these
adjectives with high frequency and low or zero
frequency with the rest
 My intuitive grammaticality judgments correlate but
allow for some unseen combinations and disallow a
few observed but very infrequent ones
 big, major and great are grammatical with many
nouns (but not frequent with most), strong and
heavy are ungrammatical with most nouns, high and
large intermediate
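One way to inspect the claim that each noun is frequent with only a few of these adjectives is to list, per noun, the adjectives whose count passes a cut-off. The sketch below uses the counts from the `More examples' table; the cut-off of 50 is an arbitrary assumption chosen for illustration, not a figure from the study.

```python
# Counts copied from the "More examples" table (BNC adjective-noun frequencies).
NOUNS = ["importance", "success", "majority", "number", "proportion", "quality",
         "role", "problem", "part", "winds", "support", "rain"]
ROWS = {
    "great":  [310, 360, 382, 172, 9, 11, 3, 44, 71, 0, 22, 0],
    "large":  [1, 1, 112, 1790, 404, 0, 13, 10, 533, 0, 1, 0],
    "high":   [8, 0, 0, 92, 501, 799, 1, 0, 3, 90, 2, 0],
    "major":  [62, 60, 0, 0, 7, 0, 272, 356, 408, 1, 8, 0],
    "big":    [0, 40, 5, 11, 1, 0, 3, 79, 79, 3, 1, 1],
    "strong": [0, 0, 2, 0, 0, 1, 8, 0, 3, 132, 147, 0],
    "heavy":  [0, 0, 1, 0, 0, 1, 0, 0, 1, 2, 4, 198],
}


def frequent_adjectives(noun, threshold=50):
    """Adjectives whose count with `noun` is at least `threshold` (assumed cut-off)."""
    column = NOUNS.index(noun)
    return [adj for adj, row in ROWS.items() if row[column] >= threshold]


for noun in NOUNS:
    print(noun, frequent_adjectives(noun))
```

With this cut-off, rain is listed only with heavy and quality only with high, while winds is listed with both high and strong. Changing the threshold changes the picture, which is one reason the judgments and the counts only correlate.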
heavy: groupings?
magnitude: dew, rainstorm, downpour, rain,
rainfall, snowfall, fall, snow, shower: frost, spindrift:
clouds, mist, fog: flow, flooding, bleeding, period,
traffic: demands, reliance, workload, responsibility,
emphasis, dependence: irony, sarcasm, criticism:
infestation, soiling: loss, price, cost, expenditure,
taxation, fine, penalty, damages, investment:
punishment, sentence: fire, bombardment, casualties,
defeat, fighting: burden, load, weight, pressure: crop:
advertising: use, drinking:
magnitude of verb: drinker, smoker:
magnitude related? odour, perfume, scent, smell,
whiff: lunch: sea, surf, swell:
high: groupings?
magnitude: esteem, status, regard, reputation,
standing, calibre, value, priority; grade, quality, level;
proportion, degree, incidence, frequency, number,
prevalence, percentage; volume, speed, voltage,
pressure, concentration, density, performance,
temperature, energy, resolution, dose, wind; risk, cost,
price, rate, inflation, tax, taxation, mortality, turnover,
wage, income, productivity, unemployment, demand
magnitude of verb: earner
heavy and high
 50 nouns in BNC with the extended
magnitude use of heavy with frequency
10 or more
 160 such nouns with high
 Only 9 such nouns with both adjectives:
price, pressure, investment, demand,
rainfall, cost, costs, concentration,
taxation
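The size of the overlap can be put as a single number. The sketch below uses only the three counts given above; the choice of Jaccard overlap as the summary measure is an assumption made for illustration.

```python
# Figures from the slide: nouns with frequency 10 or more in the BNC.
heavy_nouns = 50
high_nouns = 160
shared = 9

# Jaccard overlap: shared nouns divided by nouns occurring with either adjective.
jaccard = shared / (heavy_nouns + high_nouns - shared)
# Proportion of the heavy nouns that also occur with high.
share_of_heavy = shared / heavy_nouns

print(round(jaccard, 3))         # 0.045
print(round(share_of_heavy, 2))  # 0.18
```

Fewer than one in five of the heavy nouns also take high at this frequency, which is hard to square with the two adjectives being freely interchangeable.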
2(iii): Possible empirical
accounts of distribution
1. Difference in denotation between
`extended’ uses of adjectives
2. Grammaticized selectional
restrictions/preferences
3. Lexical selection
• stipulate Magn function with nouns (Meaning-
Text Theory)
4. Semi-productivity / collocation
• plus semantic back-off
1 - Denotation account of
distribution
 Denotation of adjective simply prevents it being
possible with the noun. Implies that heavy and high
have different denotations
heavy’(x) => MF(x) > norm(MF,type(x),c) & (precipitation(x) or
cost(x) or flow(x) or consumption(x) ...)
(where rain(x) -> precipitation(x) and so on)
 But: messy disjunction or multiple senses, open-ended,
unlikely to be tractable.
 e.g., heavy shower only for rain sense, not bathroom sense
 Not falsifiable, but no motivation other than
distribution.
 Dictionary definitions can be seen as doing this
(informally), but none account for observed
distribution.
 Input to generator?
2 - Selectional restrictions and
distribution
 Assume the adjectives have the same denotation
 Distribution via features in the lexicon
 e.g., literal high selects for [ANIMATE false ]
 cf., approach used in the ERG for in/on/at in temporal
expressions
 grammaticized, so doesn’t need to be determined by
denotation (though assume consistency)
 could utilise qualia structure
 Problem: can’t find a reasonable set of cross-cutting
features!
 Stipulative approach possible, but unattractive.
3 - Lexical selection
 MTT approach
 noun specifies its Magn adjective
 in Mel’čuk and Polguère (1987), Magn is a
function, but could modify to make it a set, or vary
meanings
 could also make adjective specify set of
nouns, though not directly in LKB logic
 stipulative: if we’re going to do this, why not
use a corpus directly?
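The difference between Magn as a function and Magn as a set can be sketched with two small lookup tables. The pairings below are picked from the corpus data earlier in these slides; treating them as lexical stipulations, and the table and function names, are assumptions for illustration rather than an MTT implementation.

```python
# Magn as a function: each noun stipulates exactly one magnitude adjective.
MAGN_FUNCTION = {
    "rain": "heavy",
    "quality": "high",
    "problem": "major",
}

# Magn as a set: a noun may stipulate several adjectives.
MAGN_SET = {
    "rain": {"heavy"},
    "quality": {"high"},
    "taxation": {"heavy", "high"},  # one of the nouns found with both adjectives
    "number": {"large", "great"},
}


def allows(adjective, noun):
    """True if the noun's (set-valued) Magn entry licenses the adjective."""
    return adjective in MAGN_SET.get(noun, set())


print(MAGN_FUNCTION["rain"])       # heavy
print(allows("high", "taxation"))  # True
print(allows("heavy", "quality"))  # False
print(allows("heavy", "wombat"))   # False: unlisted nouns license nothing
```

The last line shows the cost of pure stipulation: a noun with no entry gets no magnitude adjective at all, so every noun has to be listed by hand or the table has to be derived from a corpus anyway.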
4 - Collocational account of
distribution
 all the adjectives share a denotation corresponding
to magnitude, distribution differences due to
collocation, soft rather than hard constraints
 linguistically:
 adjective-noun combination is semi-productive
 denotation and syntax allow heavy esteem etc, but speakers
are sensitive to frequencies, prefer more frequent phrases
with same meaning
 cf morphology and sense extension: Briscoe and Copestake
(1999). Blocking (but weaker than with morphology)
 anti-collocations as reflection of semi-productivity
Collocational account of
distribution
 computationally,
 fits with some current practice:
• filter adjective-noun realisations according to n-
grams (statistical generation – e.g., Langkilde
and Knight, recent experiments with ERG)
• use of co-occurrences in WSD
 back-off techniques
 requires an approach to clustering
semantic spaces
 acquired from corpora
 generally, collect vectors of words which co-occur
with the target
 best known is LSA: often used in psycholinguistics
 more sophisticated models incorporate syntactic
relationships
 currently sexy, but severe limitations!
      dog  bark  house  cat
dog     -     1      0    0
bark    1     -      0    0
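A minimal sketch of a semantic space of this kind, assuming a tiny invented corpus: each word is represented by counts of the words it co-occurs with in the same sentence, and words are compared by cosine similarity. The sentences are made up for illustration; real models such as LSA use large corpora and dimensionality reduction, neither of which is shown here.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus; not drawn from the BNC.
SENTENCES = [
    "heavy snow fell overnight",
    "heavy spindrift fell overnight",
    "high temperature rose quickly",
]


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the other words in its sentences."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i, target in enumerate(words):
            for j, context in enumerate(words):
                if i != j:
                    vectors[target][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(SENTENCES)
print(cosine(vectors["snow"], vectors["spindrift"]))    # close to 1: same contexts
print(cosine(vectors["snow"], vectors["temperature"]))  # 0.0: no shared contexts
```

Note that the adjective itself is one of the context words, so similarity computed this way is not independent of the collocation it might later be used to predict.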
Back-off and analogy
 back-off: decision for infrequent noun with no corpus
evidence for specific magnitude adjective
 should be partly based on productivity of adjective:
number of nouns it occurs with
 default to big
 back-off also sensitive to word clusters
 e.g., heavy spindrift because spindrift is semantically similar
to snow
 semantic space models: i.e., group according to distribution
with other words
 hence, adjective has some correlation with semantics of the
noun
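The back-off strategy described above can be sketched as a three-step lookup: direct corpus evidence, then evidence from semantically similar nouns, then a default. The counts for rain, quality and problem are from the tables earlier in these slides. The count for heavy snow and the SIMILAR_NOUNS table are invented placeholders, the latter standing in for clusters that would come from a semantic space model.

```python
# (adjective, noun) -> count. The snow figure is a made-up placeholder.
PAIR_COUNTS = {
    ("heavy", "rain"): 198,
    ("high", "quality"): 799,
    ("major", "problem"): 356,
    ("big", "problem"): 79,
    ("heavy", "snow"): 50,  # hypothetical count, not from the slides
}

# Hypothetical stand-in for semantic-space clusters.
SIMILAR_NOUNS = {"spindrift": ["snow"]}

DEFAULT_ADJECTIVE = "big"  # the slides suggest defaulting to big


def best_direct(noun):
    """Most frequent adjective attested with this noun, or None."""
    candidates = {adj: n for (adj, nn), n in PAIR_COUNTS.items() if nn == noun}
    return max(candidates, key=candidates.get) if candidates else None


def choose_magnitude_adjective(noun):
    direct = best_direct(noun)
    if direct is not None:
        return direct
    for neighbour in SIMILAR_NOUNS.get(noun, []):  # back off to similar nouns
        borrowed = best_direct(neighbour)
        if borrowed is not None:
            return borrowed
    return DEFAULT_ADJECTIVE  # no evidence at all


print(choose_magnitude_adjective("problem"))    # major
print(choose_magnitude_adjective("spindrift"))  # heavy, by analogy with snow
print(choose_magnitude_adjective("wug"))        # big
```

A fuller version would weight the default by adjective productivity (the number of distinct nouns each adjective occurs with) rather than hard-coding a single fallback.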
Metaphor
 Different metaphors for different nouns (cf., Lakoff et
al)
 `high’ nouns measured with an upright scale: e.g.,
temperature: temperature is rising
 `heavy’ nouns metaphorically like burden: e.g., workload:
her workload is weighing on her
 Doesn’t lead to an empirical account of distribution,
since we can’t predict classes. Assumption of literal
denotation followed by coercion is implausible.
 But: extended metaphor idea is consistent with idea
that clusters for backoff are based on semantic space
Collocation and linguistic
theory
 Collocation plus semantic space clusters may account for some
of the `messy’ bits, at least for some speakers.
 in/on transport: in the car, on the bus
 Talmy: presence of walkway, `ragged lower end of hierarchy’
 but trains without walkway, caravans with walkway?
 in/on choice perhaps collocational, not real exception to language-
independent schema elements
 Potential to simplify linguistic theories considerably.
 Success of ngrams, LSA models of priming.
 Practically testable: assume same denotation of heavy/high or
in/on, see if we can account for distribution in corpus.
 Alternative for temporal in/on/at?
 Experiments with machine learning temporal in/on/at (Mei Lin,
MPhil thesis, 2004): very successful at predicting distribution, but
used lots of Treebank-derived features.
Summary
 Selection in ERG
 Other aspects of ERG selection not described here:
multiword expressions and idioms
 Collocational models as adjunct to TFS
encoding
 Role of denotation is crucial
 Practical considerations about grammar
usability
Final remarks
 Grammar usability:
 A good broad-coverage grammar should have an
account of denotation of closed-class words at
least, but probably not within TFS encoding.
 Can we use semantic web languages for non-
domain-specific encoding?
 Collocational techniques require much further
investigation
 Can semantic space models be related to
denotation (e.g., somehow excluding
collocational component)?
Idioms
Idiom entry:
stand+guard := v_nbar_idiom &
[ SYNSEM.LOCAL.CONT.RELS
<! [ PRED "_stand_v_i_rel" ],
[ PRED "_guard_n_i_rel" ] !> ].
Idiomatic lexical entries:
guard_n1_i := n_intr_nospr_le &
[ STEM < "guard" >,
SYNSEM [ LKEYS.KEYREL.PRED "_guard_n_i_rel" ]].
stand_v1_i := v_np_non_trans_idiom_le &
[ STEM < "stand" >,
SYNSEM [ LKEYS.KEYREL.PRED "_stand_v_i_rel" ]].
Idioms in ERG/LKB
 Account based on Wasow et al (1982), Nunberg et al
(1994).
 Idiom entry specifies a set of coindexed MRS
relations (coindexation specified by idiom type, e.g.,
v_nbar_idiom)
 Relations may correspond to idiomatic lexical entries
(but may be literal uses: e.g., cat out of the bag –
literal out of the).
 Idiom is recognised if some phrase matches the
idiom entry.
 Allows for modification: e.g., stand watchful guard
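A rough sketch of how matching against an idiom entry might work, assuming a simplified encoding in which each MRS relation is a dictionary with a PRED and argument slots. The two idiomatic PRED values are from the entries above. The dictionary encoding, the specific coindexation (verb's ARG2 equals noun's ARG0) and the _watchful_a_rel relation are assumptions for illustration and do not reproduce the LKB's actual mechanism.

```python
# PRED values taken from the stand+guard idiom entry.
VERB_PRED = "_stand_v_i_rel"
NOUN_PRED = "_guard_n_i_rel"


def matches_stand_guard(relations):
    """True if some idiomatic stand takes some idiomatic guard as its object.

    Assumed coindexation: the verb's ARG2 is the noun's ARG0.
    Other relations in the list are ignored, so modifiers do not block a match.
    """
    verbs = [r for r in relations if r["PRED"] == VERB_PRED]
    nouns = [r for r in relations if r["PRED"] == NOUN_PRED]
    return any(
        v.get("ARG2") is not None and v.get("ARG2") == n.get("ARG0")
        for v in verbs
        for n in nouns
    )


# "stand watchful guard": the modifier shares the noun's index.
stand_watchful_guard = [
    {"PRED": VERB_PRED, "ARG0": "e1", "ARG1": "x1", "ARG2": "x2"},
    {"PRED": "_watchful_a_rel", "ARG0": "e2", "ARG1": "x2"},  # hypothetical relation
    {"PRED": NOUN_PRED, "ARG0": "x2"},
]

# Both idiomatic relations present, but not coindexed: no idiom.
not_coindexed = [
    {"PRED": VERB_PRED, "ARG0": "e1", "ARG1": "x1", "ARG2": "x2"},
    {"PRED": NOUN_PRED, "ARG0": "x3"},
]

print(matches_stand_guard(stand_watchful_guard))  # True
print(matches_stand_guard(not_coindexed))         # False
```

Because the check is stated over semantic relations rather than adjacent words, intervening modifiers such as watchful leave the match intact.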
Messy examples
 among: requires group or plural or ?
 among the family (BNC)
 among the chaos (BNC)
 between: requires plural denoting two
objects, but not group (?)
 fudge sandwiched between sponge (BNC)
 between each tendon (BNC)
 ? the actor threw a dart between the couple
 * the actor threw a dart between the audience
(even if only two people in the audience)
