Polysemy
Theoretical and
Computational Approaches
Edited by
Yael Ravin
and
Claudia Leacock
Preface
Contents
Notes on Contributors
Figures
5.1 The AHD entry for the verb crawl
5.2 The CED entry for the verb crawl
5.3 Crawl entries from four learners' dictionaries
5.4 Crawl sentences (some abridged) from corpus data
5.5 Semantic network for the verb crawl
5.6 Ramper entry in OHFED
5.7 Uses of ramper from the ARTFL twentieth-century corpus
5.8 Semantic network for ramper
9.1 The entry for bank in LDCE
10.1 Highly weighted links between handle and sword
10.2 Highly weighted links between handle and door
Tables
5.1 Comparative coverage of the verb crawl in four dictionaries
5.2 Dictionary senses of the verb crawl, with corpus examples
5.3 Grammatical information for the verb crawl in four dictionaries
5.4 Elements in the motion frame
9.1 Types of knowledge source in LDCE
9.2 Accuracy of the reported system
11.1 Senses of ambiguous words
11.2 Number of occurrences of test words in training and test set, % rare senses in test set, and baseline performance (all occurrences assigned to most frequent sense)
11.3 Results of two disambiguation experiments (2 and 10 clusters per word)
11.4 Disambiguation markedly improves retrieval performance
1
Polysemy: An Overview
Yael Ravin and Claudia Leacock

1.1 What is Polysemy?
The study of polysemy, or of the 'multiplicity of meanings' of words, has a long history in the philosophy of language, linguistics, psychology, and literature. The complex relations between meanings and words were first noted by the Stoics (Robins 1967). They observed that a single concept can be expressed by several different words (synonymy) and that, conversely, one word can carry different meanings (polysemy). The collection of papers assembled here represents current research into the issues arising from polysemy, such as the nature of polysemy; its relation to the more general phenomenon of semantic ambiguity; the ways in which multiple meanings, or senses, are represented in a dictionary or lexicon and related to each other; and the principles that govern these relations and the mechanisms that allow the creation of new senses. Since words are used in context, the mechanisms by which polysemous words combine with others to form the meaning of larger syntactic units are also addressed.

Polysemy is rarely a problem for communication among people. We are so adept at using contextual cues that we select the appropriate senses of words effortlessly and unconsciously. The sheer number of senses listed by some sources as being available to us usually comes as a surprise: out of approximately 60,000 entries in Webster's Seventh Dictionary, 21,488, or almost 40%, have two or more senses, according to Byrd et al. (1987). Moreover, the most commonly used words tend to be the most polysemous. The verb run, for example, has 29 senses in Webster's, further divided into nearly 125 sub-senses.

Although rarely a problem in language use, except as a source of humor and puns, polysemy poses a problem in semantic theory and in semantic applications, such as translation or lexicography. As we see with run in Webster's, the traditional lexicographic practice is to list multiple dictionary senses for polysemous words and to group related ones as sub-senses. Dictionaries differ in the number of senses they define for each word, the grouping into sub-senses, and the content of definitions. It is clear that there is little agreement among lexicographers as to the degree of polysemy and the way in which the different senses are organized. In fact, Atkins and Levin (1991) and
Fillmore and Atkins (this volume) show that mapping the definitions from one dictionary onto another is often not possible, even for mildly ambiguous words, such as the verb whistle, or across dictionaries that are similar in scope and coverage. Whistle is defined in one dictionary as make a shrill clear sound by rapid movement but as move with a whistling sound in another. The genus terms and differentia are reversed. Do these two dictionary definitions capture the same sense of the verb?

Another problem with dictionary definitions is their use of polysemous defining terms, further obscuring the relation between dictionary senses. One of the definitions of the noun whistle is the sound produced by a whistle, but which sense of whistle is intended, the instrument, the device (as in a factory whistle), or both? A third problem with dictionary definitions, mentioned by Atkins and Levin and again in this volume, arises when actual uses of words encountered in context cannot be mapped to any dictionary definition, as in whistling up the NATO forces if need be, revealing senses that are missing in the dictionary. These examples indicate that lexicographers tend to disagree as to the number of senses a word has, the semantic content of these senses, and their groupings. Semanticists also disagree on these issues, as can be seen in the different approaches presented in this book. Two papers in particular, Fillmore and Atkins and Goddard in this volume, propose ways in which their theories can enhance or improve the practice of creating dictionary definitions.
This conclusion agrees with Fillmore and Atkins's (this volume) view of word meanings as networks of semantic concepts that are extendible from a core meaning. The direction and scope of the extensions, Fillmore and Atkins argue, depend on the particular word analyzed, the language (English crawl differs in its network of senses from the French ramper), historical evolution, and other words in the context.
1.3 Polysemy and Context
Geeraerts emphasizes the importance of context in determining the predictions of each of his tests as he demonstrates that context alters the senses of the words found in it. This emphasis on context is common to all of the approaches discussed in this volume: the common model (but we discuss exceptions below) is to define the meaning of words independent of context, as discrete entries in a dictionary, and then establish principles according to which word meanings interact when found together in a particular context. The central question is what aspects of word meaning are predefined and invariant across multiple contexts versus what other aspects are indeterminate and only realized in context. Theories differ in the balance they strike between the two. At one extreme we find Goddard (in this volume), stipulating maximal semantic content in word meaning. In fact, the interchangeability of definition content across multiple contexts is the very criterion for well-formed definitions. In Goddard's terms, context can only augment but not alter the semantic content of words. At the other extreme we find Schütze, who discards definitions and semantic content altogether. In Schütze's work, words do not have semantic content, only semantic similarity to other words, which is measured by the similarity of the contexts in which they appear. In fact, Schütze induces senses rather than stipulating them.
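The idea that word similarity can be measured purely by the similarity of contexts can be sketched in a few lines. The following is a toy simplification of Schütze's distributional approach, not his actual system: each word is represented by a vector of co-occurrence counts over a corpus, and two words are similar to the degree that their context vectors point in the same direction (cosine similarity). The mini-corpus is invented for illustration.

```python
# Toy distributional similarity in the spirit of Schütze: words are vectors
# of co-occurrence counts; similarity is the cosine between vectors.
from collections import Counter
from math import sqrt

# Invented mini-corpus, content words only (stopwords omitted to keep the
# toy example from being dominated by 'the', 'a', etc.).
corpus = [
    ["deposit", "money", "bank", "interest"],
    ["bank", "raised", "interest", "rates"],
    ["fish", "river", "bank", "muddy"],
    ["river", "bank", "flooded"],
]

def context_vector(word):
    """Count the words that co-occur with `word` in the same sentence."""
    counts = Counter()
    for sentence in corpus:
        if word in sentence:
            counts.update(t for t in sentence if t != word)
    return counts

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# 'money' patterns with 'rates' (financial contexts) more than with 'river'.
money_rates = cosine(context_vector("money"), context_vector("rates"))
money_river = cosine(context_vector("money"), context_vector("river"))
assert money_rates > money_river
```

Clustering the contexts of an ambiguous word such as bank by this measure is what lets senses be induced rather than stipulated: the financial and riverside uses fall into different context clusters.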
Common to both these approaches are the relatively straightforward principles that govern the interaction between predefined meaning and context. Although Goddard does not address the issue directly here, his approach need only stipulate very general principles of semantic composition, mainly to rule out contradictory combinations (such as hateful love), anomalous ones (wooden love), and other similar unacceptable constructions. Katz et al. (1985) discuss three such principles in a similar theoretical approach to word
meaning. Semantic representations in Katz's theory consist of hierarchical tree structures of semantic concepts. One principle for combining tree structures, called attachment, is an operation that appends the tree structures of modifiers, such as adjectives and adverbs, to the tree structures of their syntactic heads. Attachment occurs when the top concept node of the modifier can fit under a concept node of the head. For example, intentionally (with a top concept node of intention) can attach to kill under the concept node of event, since a general rule of semantic structure (namely, Intention → Event) licenses it.
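The attachment operation can be sketched mechanically. The code below is a toy reading of the idea, not Katz's formalism: concept trees are nested dictionaries, and a modifier tree is appended under a head node only when a licensing rule maps the modifier's top concept to a concept present in the head. The rule inventory and tree shapes are invented for illustration.

```python
# Toy sketch of Katz-style "attachment": a modifier's concept tree is
# appended under a head concept node when a licensing rule maps the
# modifier's top concept to one of the head's concepts.

# Hypothetical licensing rules of the form "Intention -> Event".
LICENSES = {"intention": "event"}

def attach(head_tree, modifier_tree):
    """Append modifier_tree under the head node its top concept is licensed for."""
    target = LICENSES.get(modifier_tree["concept"])
    if target is None:
        return False  # no rule licenses this modifier
    def walk(node):
        if node["concept"] == target:
            node.setdefault("modifiers", []).append(modifier_tree)
            return True
        return any(walk(child) for child in node.get("children", []))
    return walk(head_tree)

# 'kill' as an event with an agent/patient substructure (invented encoding).
kill = {"concept": "event",
        "children": [{"concept": "agent"}, {"concept": "patient"}]}

assert attach(kill, {"concept": "intention"})   # licensed: Intention -> Event
assert not attach(kill, {"concept": "color"})   # no rule licenses 'color'
```

The licensing table is what blocks anomalous combinations: a modifier whose top concept has no rule, or whose target concept is absent from the head, simply fails to attach.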
Other approaches to the issue of word meaning and context in this volume strike a more moderate balance. Cruse unfolds a large continuum of word senses, ranging from 'the possibility of (at least some) context-invariant semantic properties of (at least some) words' to nodules of meaning that are created and dissolved with changes in the context. Pustejovsky, similarly, discusses highly under-specified senses of words and the principles that operate on those senses to yield different interpretations in different contexts. The more subtle the interactions between lexical meaning and context, the more complex the mechanisms needed to govern these interactions. This is made clear in Pustejovsky's discussion of a particularly vexing group of verbs, like risk, which are shown by Fillmore and Atkins (1992) to occur in contradictory contexts, as in Mary risked her life versus Mary risked death. Pustejovsky explains how the same verb meaning can combine with antonymous complements to form roughly the same compositional meaning, that of some likely harmful result. A coercion operator introduces the meaning of privation (death, or the privation of life) to the meaning of the complement if the complement does not already contain it. Coercion operations play a major role in Pustejovsky's Generative Lexicon. They allow single-sense words to acquire different readings in different contexts, by coercing the meaning of nouns, for example, into one of their metonymic extensions. Coercion explains how a fast car is one that can be driven quickly but a fast typist is one who can type quickly (Pustejovsky 1993).
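The fast car / fast typist contrast can be made concrete with a toy qualia lookup. This is a deliberate simplification of the Generative Lexicon, not Pustejovsky's formalism: each noun entry carries a telic role (the event the noun is typically for), and fast is coerced into modifying that event. The lexical entries are invented.

```python
# Toy sketch of telic-role coercion: "fast <noun>" is resolved by letting the
# adjective modify the event stored in the noun's (hypothetical) telic role.

TELIC = {
    "car": "drive",    # a car is for driving
    "typist": "type",  # a typist is for typing
}

def interpret_fast(noun):
    """Resolve 'fast <noun>' to a paraphrase over the noun's telic event."""
    event = TELIC.get(noun)
    if event is None:
        return None  # no qualia information available: coercion fails
    return f"{noun} that can {event} quickly"

assert interpret_fast("car") == "car that can drive quickly"
assert interpret_fast("typist") == "typist that can type quickly"
```

The point of the sketch is that fast has a single under-specified sense; the different readings come from the nouns, whose lexical entries supply the event that the adjective ends up modifying.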
An open question remains as to how many such operative principles exist and how complex the under-specified representations of words need to be in order to allow certain coercions but block others (*save his death). Cruse doubts whether such mechanisms are sufficient to explain the intricacies of the interactions between word meaning and context. His conclusion is pessimistic: there is 'a disturbing degree of fluidity [in] semantic structure' which cannot be subject to reductive explanations.
1.4 Theories of Meaning
Semantic theories account for polysemy as one semantic phenomenon in a comprehensive theory of meaning. Taken in the most general terms, semantics relates the extralinguistic world to the linguistic expressions that describe it. The nature of the extralinguistic entities need not concern us here: ontologically, one could speak of states of affairs; formally, of intensions or sets of possible worlds; mentally, of concepts corresponding to intensions or applied to states of affairs. Meaning, under all these views, can be understood as the conditions in which a certain expression holds of certain extralinguistic entities. Of course, individual entities are unique, so we refer more accurately to extralinguistic types which group together many individual instances. A lexical expression is applied to an extralinguistic type in a process of abstraction. In accounting for this abstraction, semantic theories are guided by two sometimes contradictory principles: generalize (or reduce polysemy) as much as possible in order to increase the explanatory power of the theory; and make distinctions (or increase polysemy) in order to account for as much semantic detail as possible. Theories differ in the degree of abstraction they allow. Some postulate one sense where others postulate many, for very similar data.

Three major approaches to semantics are represented in this volume: the Classical approach (Goddard); the Prototypical approach (Fillmore and Atkins); and the Relational approach (Fellbaum).
1.5 The Classical Theory of Meaning
Three principles of the classical theory of definition bear on the problem of polysemy: (1) senses are represented as sets of necessary and sufficient conditions, which fully capture the conceptual content conveyed by words; (2) there are as many distinct senses for a word as there are differences in these conditions; and (3) senses can be represented independently of the context in which they occur.
According to the classical, or Aristotelian, view of conceptual representation, an individual entity is a member of a conceptual category. For example, Fido is a dog if Fido possesses a set of necessary conditions (or defining features) and a set of sufficient conditions (or core properties) that define the category dog. A few assumptions are traditionally made under this approach (Howes 1990): conceptual categories consist of feature lists connected by logical operators, such as conjunction and disjunction. The categories are arranged in a hierarchy, where concepts on the same level of the hierarchy inherit and share the core properties of the higher concepts, but have defining features that are mutually exclusive. What this means is that dogs and cats, for example, share the core properties of mammals, but do not share any of the canine- or feline-specific defining features. An object either does or does not belong to a conceptual category. Thus, an animal either is or is not categorized conceptually as a mammal, even though in reality there may exist difficult cases, such as the platypus. But difficult cases do not necessarily weaken the conceptual structure, if they can be arbitrarily assigned to one of the relevant categories.
It is not the Aristotelian notion of necessary and sufficient features which causes trouble in semantic analysis; it is the tacit behaviorist assumption that the necessary and sufficient features should correspond to measurable, objectively ascertainable aspects of external reality. (Wierzbicka 1996)
1.6 Polysemy within the Classical Approach
The omission of a definition of polysemy and of an account of regular polysemy, sometimes referred to as productive or systematic polysemy, should be interpreted as an oversight by Katz and Fodor rather than as an inherent defect of their approach, as other classical approaches account for it quite naturally. Apresjan (1974) defines polysemy as the similarity in the representations of two or more senses of a word:

The definition does not require that there be a common part for all the meanings of a polysemantic word; it's enough that each of the meanings be linked with at least one other meaning.
He then points to cases in which a word is systematically ambiguous in context. One such case is when the word exhibits more than one intension which holds of the same referent, as in I was put into quarantine. In Russian, this sentence is ambiguous between a sense of quarantine as action and its sense as place. English sentences like the construction is complete convey a similar ambiguity between the product and the action that caused it. Apresjan views these and other examples as instances of regular polysemy:

Polysemy of the word A with the meanings ai and aj is called regular if, in the given language, there exists at least one other word B with the meanings bi and bj, which are semantically distinguished from each other in exactly the same way as ai and aj, and if ai and bi and aj and bj are non-synonymous.
1.7 The Prototypical Approach
Much twentieth-century philosophy of language can be seen as an attack on the classical view of words as having distinct meanings and of definitions as composed of necessary and sufficient conditions. Wittgenstein (1958) writes:

The idea that in order to get clear about the meaning of a general term one had to find the common element in all its applications has shackled philosophical investigation, for it has not only led to no result, but also made the philosopher dismiss as irrelevant the concrete cases, which alone could have helped him to understand the usage of the general term.

In a well-known discussion of the meaning of the word game, Wittgenstein (1953) examines board games, card games, ball games, and Olympic games and fails to find an element that is common to all. Instead he finds several elements (amusement, competition, skill, luck, rules) which occur in various combinations whenever the word game is used, depending on the context in which it appears. Wittgenstein concludes that categories have fuzzy boundaries and meanings exhibit family resemblance, with common elements that overlap and criss-cross in the same way that traits do in families.
In psychology, categorization by family resemblance was introduced by Rosch and her colleagues in the 1970s. Rosch (1977) demonstrated that people do not categorize objects on the basis of necessary and sufficient conditions but rather on the basis of resemblance of the objects to a prototypical member of the category.

Following the classical approach to categorization, the prototypical approach continues to respect the existence of a concept hierarchy (a dog is a mammal, an animal, a living thing). But membership in a conceptual category is a matter of degree. Each category is represented by a prototype (an actual member or a conceptual construct) that best exhibits the features of the category and so is close to the ideal category definition of the classical theory. In fact, Rosch proposed two prototypical models: one in which a single prototype possesses the largest number of characteristic features, and the other where several prototypes exist, each possessing a different set of characteristic features, not necessarily resembling one another. This second model was adopted by linguists to handle polysemy, and we return to it below.
In various experiments Rosch demonstrated that prototypes are central to human thinking. For example, people construct a prototype when none is available, as when they are presented with various exemplars of an unfamiliar category. The prototype they define is not arbitrary, but consistent across individuals and cultures. When asked to choose an exemplar of a category, people tend to choose prototypical ones. In her famous study of the Dani people, who have only two colour categories (one including all light, warm colours and the other all dark, cool colours), Rosch found that the Dani tended to point to prototypical red, yellow, and white as examples of the first category, and to prototypical blue, green, and black as examples of the other.
Best exemplars of a category resemble the prototype more than poor exemplars. Rosch demonstrated that people take less time to verify statements about category membership (X is a Y) when the exemplars are closer to the prototype and more time when they are not. The degree of similarity to the prototype is computed differently under different theories. A simple measure is the number of features in common with the prototype. Thus robins are better exemplars of birds than penguins, since they can also fly and sing. Some theories count the number of features not shared with the prototype as well. But the number of features alone can be misleading, because some features are intuitively more important in determining category membership than others. For example, laying eggs is more important than size or appearance, in order to exclude bats from the bird category and still include penguins. Other prototypical proposals assign different weights to features. Giving more weight to important features makes them more crucial in determining membership. But, as pointed out in Howes (1990), differentiating important from unimportant features reintroduces a problem encountered by the classical approach in determining the necessary conditions for a category. Indeed, in order to determine the important features that govern membership, the category concept often needs to be defined first. To quote Taylor (1989):

attributes of the Bird prototype, such as the presence of feathers, wings, and a beak, the building of nests, and the laying of eggs, would appear prima facie to require, for their characterization, a prior understanding of what birds are.
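The weighted-feature measure discussed above can be made concrete in a few lines. The features and weights below are invented for illustration, not drawn from Rosch's experiments; the sketch simply shows how weighting keeps penguins inside the bird category while excluding bats.

```python
# Toy sketch of graded category membership: similarity to a prototype as a
# weighted sum of shared features. Features and weights are hypothetical.

BIRD_PROTOTYPE = {"lays_eggs": 3.0, "has_feathers": 3.0, "flies": 1.0, "sings": 1.0}

def similarity(exemplar_features, prototype=BIRD_PROTOTYPE):
    """Sum the weights of the prototype features the exemplar shares."""
    return sum(w for f, w in prototype.items() if f in exemplar_features)

robin = {"lays_eggs", "has_feathers", "flies", "sings"}
penguin = {"lays_eggs", "has_feathers"}
bat = {"flies"}

# Weighting makes laying eggs outweigh flying: penguins stay in the
# category, bats stay out, and robins remain the best exemplars.
assert similarity(robin) > similarity(penguin) > similarity(bat)
```

An unweighted count (every feature worth 1) would score the bat closer to the bird prototype relative to the penguin than intuition allows, which is exactly the problem weighting is meant to fix.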
Crucial features also determine whether the category has clear boundaries. Poor exemplars of cats that share many common features with dogs are still not members of any other mammal category. This is because the features they share with other categories are not crucial. Most modern prototypical theories accept the classical assumption of clear boundaries for at least some categories, such as natural kinds or scientific and technical concepts, for example the legal definition of who is an adult. In contrast to these, categories like Wittgenstein's game have no clear boundaries, so that marginal cases may or may not be categorized as games, depending again on context. Labov (1973) shows similar results in his categorization experiments: as the shape of good exemplars of cups is altered to become more and more shallow and wide, they are often categorized as bowls.
Without mentioning the word polysemy, Lakoff discusses the range of meaning a word can have, as the result of the process of meaning extension. Mother, for example, forms a radial conceptual model: it has a central category where all of the models discussed above converge, and then more marginal categories, its meaning extensions, such as surrogate mother, adoptive mother, or stepmother, that are linked to its central meaning along various dimensions. The meaning extensions of radial concepts are not generated from the prototypical concept by rules, but rather by convention. They must be learned. But they are not random either. They are motivated by two general principles: metaphor and metonymy.
Metaphors are mappings from a model in one domain 'to a corresponding structure in another domain. The CONDUIT metaphor for communication maps our knowledge about conveying objects in containers onto an understanding of communication as conveying ideas in the words'. The metonymic mechanism is similar to metaphor except that it maps according to a specific function. 'In a model that represents a part-whole structure, there may be a function from a part to the whole that enables the part to stand for the whole.'

Lakoff views meaning extensions as part of a deeper cognitive organization. Take, for example, the container metaphor. It is based on a complex container schema. On its most basic level, the container schema is experienced physically: we experience our bodies both as containers (of air, food, etc.) and as things in containers (existing in rooms). The container schema has a certain semantic (and cognitive) logic: things are either in or out of a container; the relation of containment is transitive (if A contains B and B contains C, then A contains C). Finally, the schema also gives rise to various metaphors, or meaning extensions: the visual field is seen as a container, with things coming in and out of sight; and so are personal relations (trapped in a marriage).
The fact that the basic elements of a schema are grounded in bodily experiences or in other psychologically basic-level categories, as defined by Rosch, is another aspect in which the prototypical approaches differ from the classical ones: the classical assumption of a basis of primitive concepts which combine to yield more complex meanings is replaced by a basis of psychologically motivated concepts that cannot be further decomposed because they are directly experienced as gestalts. (One cannot talk of the inside and outside of a container as separate from the concept of the container itself.)
Taylor (1989) elaborates more directly on the nature of polysemy. 'If different uses of a lexical item require, for their explication, reference to two different domains, or two different sets of domains, this is a strong indication that the lexical item in question is polysemous. School, which can be understood against a number of alternative domains (the education of children, the administrative structure of a university, etc.) is a case in point.' And further on: 'polysemous categories exhibit a number of more or less discrete, though related meanings, clustering in a family resemblance category'. Thus, Taylor adds another type of prototypical category, one that is complex, like mother, but is not radial in that it does not have a central meaning. Such is the meaning of over, for example, as discussed by Taylor, based on Lakoff and earlier work by Brugman. Over can express a static relation of being vertical while not in contact with the point of reference (the lamp is over the table); or vertical, not in contact with the reference, but dynamic (the plane flew over the city). Walk over the street is dynamic but involves contact; walk over the hill is similar but defines the shape of the path, and so on.
As Taylor aptly points out, in the absence of constraints, meanings can be infinitely chained via family resemblance, so that everything ends up associated with everything else. Taylor rejects any absolute constraint. As the analysis of climb shows, one category can contain contradictory elements: the motion in climbing can be either up or down. As the analysis of over shows, a category can encroach upon the semantic content of other categories: over sometimes means the same as beyond, across, or on the other side. He therefore agrees with Lakoff that, rather than looking for constraints on meaning extensions, we should look for tendencies and regularities.

If it is not possible to state absolute constraints on the content of family resemblance categories, it might none the less be the case that certain kinds of meaning extension are more frequent, more typical, and more natural than others. In other words, we should be looking for recurrent processes of meaning extension, both within and across languages, rather than attempting to formulate prohibitions on possible meaning extensions. (Taylor 1995: 21)
Fillmore and Atkins provide such an analysis of meaning extensions and how they relate to one or more core meanings in their discussion of the polysemous verb crawl in both English and French. Loosely based on Lakoff's model of radial categories, they stipulate one or more core literal senses at the center of a concept network. Each of the literal senses can be extended by general productive mechanisms to form derived senses. Which mechanisms apply, however, seems to be idiosyncratic across words and languages. In their example, ramper in French extends to mean stalk and creep while its equivalent in English, crawl, does not. Crawl extends to describe the way babies move, which in French is rendered by marcher à quatre pattes. Pustejovsky takes a somewhat different approach: he explicates meaning extensions in terms of a set of generative rules, such as coercion, that are triggered by the context in which a word is used.
1.9 The Relational Approach
In relational models of the lexicon, words are organized according to their meanings using rich semantic relations, or links, to form a semantic network. Like prototypical models, the relational model works with semantic domains. In addition, it 'attempts to make explicit the structural organization that is implicit in other models, and describes how the elements of a domain are related to each other' (Evens 1988). Ideally, in a relational model of the lexicon, knowing the meaning of a word is knowing the word's location in the semantic space of the lexicon.

Synonymy and antonymy are perhaps the most basic relations. Synonymy can be defined to hold between words when one can be substituted for the other in a context without changing the meaning of the phrase. For example, frigid can be substituted for freezing in the phrase it is freezing outside without changing its meaning. In the case of antonymy, such a substitution causes the opposite meaning, as in my gloves are wet/dry. Another basic relation that holds between nouns is hypernymy, also called the superordinate relation, or the IS A relation. For example, the superordinate or hypernym of the noun dog could be canine or animal, meaning that a dog IS A canine or that a dog IS AN animal. The concepts become more and more general as one travels up a hypernym chain. The inverse relation is hyponymy, which is a subordinate relation. Hyponyms of dog include all the various kinds of dogs, such as dalmatian and poodle. Fellbaum (1990) has introduced a similar relation that holds between verbs, that of troponymy. Whereas the organizing relation between nouns is an IS A relation, the organizing relation between verbs is a manner relation; for example, ambling and strolling are manners of walking.
The kinds of relation that are typically encoded in relational semantic networks are best exemplified by those found using word association tests. There is remarkable agreement on word associations across people, and these associations reflect specific relations. For example, when 182 people were asked to respond to smooth, 22 per cent responded with the synonym soft, 19.8 per cent responded with the antonym rough, and 6 per cent responded with silk, which has smoothness as a property (Keppel and Strand 1970). Similarly, when asked to respond to animal, 22.5 per cent responded with the hyponym dog. However, when analysing these data, it is not always possible to define what relation has been retrieved. When the probe word was army, three people responded no. It is, perhaps, possible to recognize that there is a relation, but defining it is not always easy.
Determining precisely what the organizing relations are in the mental lexicon, or how many there are, remains controversial: there is little agreement on these issues, and the number and type of posited relations varies widely. WordNet (Fellbaum 1998) is a widely used online relational lexicon that is freely available. Developed by George Miller and his colleagues at Princeton University, WordNet was originally conceived of as an experiment. Tired of semantic theories that were based on a handful of words, Miller wanted to test the limits of a relational lexicon by seeing how much of the language it was possible to include (Miller 1998). Almost fifteen years later, WordNet contains nearly 100,000 concepts in its semantic network.
WordNet's basic unit is a set of synonyms, a synset, which represents a concept in the semantic space associated with a definitional gloss. For example, the synset {plant, flora, plant life} represents the concept that is shared by the three words. Synonym sets are related to each other with a small set of pointers, or labeled arcs, each representing a semantic relation. Nouns, verbs, adjectives, and adverbs all use synonymy and antonymy. Nouns in WordNet are also related by hypernymy/hyponymy and by meronymy, the part/whole relation. For example, meronyms of the synonym set {car, auto, automobile, motorcar} include bumper and air bag. For verbs, WordNet encodes troponyms, the manner relation, and entailment relations; for example, snoring entails sleeping. Adjectives have a similarity relation for words that, although not synonyms, have meanings that are semantically close, for example freezing and cold or wet and damp. WordNet also has two derivational relations that point from one syntactic category to another: that between a relational adjective and the noun from which it is derived (cultural pertains to culture) and that between an adverb and the adjective from which it is derived (usually is derived from usual).
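The structure described above (synsets as nodes, labeled arcs as relations) can be sketched as a toy data structure. This is an illustration of the design, not the real WordNet database or its API; the handful of synsets and arcs below are taken from the examples in the text.

```python
# Minimal sketch of a WordNet-style relational lexicon: synsets map to
# dictionaries of labeled arcs. Only a few illustrative entries are included.

lexicon = {
    "{plant, flora, plant life}": {},
    "{car, auto, automobile, motorcar}": {"meronym": ["{bumper}", "{air bag}"]},
    "{dog, domestic dog}": {"hypernym": ["{canine}"]},
    "{canine}": {"hypernym": ["{carnivore}"]},
    "{snore}": {"entails": ["{sleep}"]},
}

def related(synset, relation):
    """Follow one labeled arc from a synset."""
    return lexicon.get(synset, {}).get(relation, [])

def hypernym_chain(synset):
    """Walk hypernym arcs upward; since hypernymy is transitive, every
    synset on the chain is a superordinate of the starting synset."""
    chain = []
    while True:
        parents = related(synset, "hypernym")
        if not parents:
            return chain
        synset = parents[0]
        chain.append(synset)

# Transitivity licenses the inference that a dog is a carnivore.
assert hypernym_chain("{dog, domestic dog}") == ["{canine}", "{carnivore}"]
assert related("{snore}", "entails") == ["{sleep}"]
```

Because knowing a word's meaning in a relational model is knowing its location in the network, queries like these (following arcs and chains) are the model's notion of semantic interpretation.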
Other relational lexicons that have been developed use many more rela-
tions. Ahlswede and Evens (1988) posit many hundreds of relations. These
include spatial relations such as above, morphological relations such as
plural, and grammatical relations such as subject of a verb. Dolan,
Vanderwende, and Richardson (this volume) describe how they built a relational
lexicon, MindNet, automatically. MindNet's relations include hypernym,
modifier, purpose, and object.
The semantic relations that are used in relational models can be grouped
into what de Saussure called syntagmatic relations and paradigmatic
relations. Syntagmatically related words are those that co-occur frequently.
Recall that silk was associated with the adjective smooth. This is a
syntagmatic relation, since smooth and silk frequently co-occur in phrases
like as smooth as silk. Syntagmatically related words are often the
components of a collocation, as in smooth talker, or represent selectional
preferences, such as water in drink water. Paradigmatically related words are
those that appear in similar contexts, primarily because they represent
similar concepts. Soft is paradigmatically related to smooth since either of
the synonyms soft and smooth can fill in the blank in as ___ as silk.
Relational models like that of Ahlswede and Evens make extensive use of
both syntagmatic and paradigmatic relations. WordNet, on the other hand,
encodes paradigmatic relations only. It gives us no information about
syntagmatic relations. Miller and Leacock (this volume) explore how corpus
statistics can be used to supplement the WordNet lexicon with syntagmatic
information.
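Both kinds of relation can be read off corpus statistics. The sketch below uses an invented three-sentence corpus: a syntagmatic link falls out of sentence co-occurrence counts, and a paradigmatic one out of a shared slot in the as ___ as silk frame.

```python
# Recovering syntagmatic and paradigmatic relations from a toy corpus.
# The three sentences are invented for illustration.
from collections import Counter
from itertools import combinations

corpus = [
    "the scarf felt as smooth as silk".split(),
    "the scarf felt as soft as silk".split(),
    "a smooth talker sold the silk scarf".split(),
]

# Syntagmatic: words that co-occur in the same sentence.
cooc = Counter()
for sent in corpus:
    for a, b in combinations(sorted(set(sent)), 2):
        cooc[(a, b)] += 1

print(cooc[("silk", "smooth")])  # smooth and silk co-occur

# Paradigmatic: words that fill the same slot, here between 'as ... as'.
slots = [sent[i + 1] for sent in corpus
         for i, w in enumerate(sent[:-2]) if w == "as" and sent[i + 2] == "as"]
print(sorted(set(slots)))  # words interchangeable in the 'as ___ as' frame
```

On a real corpus the raw counts would be replaced by an association measure, but the division of labour is the same: co-occurrence for syntagmatic links, shared contexts for paradigmatic ones.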
Relational approaches maintain the classical division into distinct senses
for polysemous words but they do not decompose the meaning of concepts.
Even though words are treated as non-decomposable atomic units, relational
models make much use of classical feature inheritance.
Relational lexicons are ideal for inferencing, especially when the relations
are transitive. For example, in WordNet, the hypernym of dog is canine, and
the hypernym of canine is carnivore. Since hypernymy is a transitive relation,
it then follows that a dog is a carnivore. Additional properties can be inherited
from WordNet's definitional glosses. The gloss for canine includes the fact
that canines are `fissiped mammals with nonretractile claws'. Again,
carnivore's gloss tells us that carnivores are `flesh-eating mammals'. Thus we
can infer that a dog is a fissiped mammal that has nonretractile claws and eats flesh.
By traveling up the hypernym tree, a wealth of information about dogs is
quickly collected.
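The dog-to-carnivore inference amounts to a walk up a hypernym chain, collecting glosses along the way. The toy hierarchy and glosses below stand in for WordNet's actual data:

```python
# Transitive inference over a hypernym chain; the hierarchy and glosses
# are a hand-built toy standing in for WordNet's.
hypernym = {"dog": "canine", "canine": "carnivore", "carnivore": "mammal"}
gloss = {
    "canine": "fissiped mammal with nonretractile claws",
    "carnivore": "flesh-eating mammal",
}

def inherited(word):
    """Walk up the hypernym chain, collecting ancestors and their glosses."""
    facts = []
    while word in hypernym:
        word = hypernym[word]
        facts.append((word, gloss.get(word, "")))
    return facts

for ancestor, g in inherited("dog"):
    print(ancestor, "-", g)
# a dog is a canine, hence a carnivore, hence a mammal, and it inherits
# each ancestor's gloss properties along the way
```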
1.10 Polysemy within the relational approach
Representing regular polysemy within a relational framework is problematic.
Word senses that exhibit regular polysemy can be very distant from each
other in the semantic network's conceptual space. As an example, the noun
ash has three senses in WordNet: one appears in the hierarchy as plant
material (as in an ash baseball bat) and another as a woody plant (as in
the ash tree). These senses are quite distant, and their semantic relation cannot
be determined through proximity in the network. How can a relational
network be used both to discover sense pairs exhibiting regular polysemy
and, having done so, to indicate that there is regular polysemy between two
senses, but not a third sense, as in the sense of ash meaning the residue from
a fire?
Mel'čuk (1988) uses indices to indicate words exhibiting both regular
polysemy and non-regular polysemy. WordNet represents regular polysemy
of nouns by using a similarity relation (Miller and Zholkovsky 1998). The
basic idea, first proposed by Philip Johnson-Laird, is that since an ash tree
and its wood bear a regular semantic relation, as do all trees and their wood,
these nodes can be identified high up in the noun hierarchy. That is, by
identifying the wood node and the tree node as bearing a semantic relation,
each tree will be associated with its wood in the WordNet display.
Working with verbs, Fellbaum (this volume) takes advantage of WordNet's
troponym relation to discover non-regular polysemous verb senses. She
identifies a class of autotroponyms: verb senses that have been conflated with
their complement meanings. For example, consider the sentences:
The fish smells good.
The fish smells.
In the second sentence, the sense of the verb smells has been conflated with
that of a particular concept, bad.
smell (emit an odour; `the soup smells good')
→ smell (smell bad; `He rarely washes, and he smells')
→ reek, stink
Fellbaum represents this kind of polysemy as superordinate and subordinate
senses, where the subordinate sense has a more specific meaning which
includes the adjectival element.
1.11 Regular polysemy
Fellbaum focuses on non-regular polysemy, in contrast to several other
papers in this collection which discuss words that are polysemous in a
systematic, predictive fashion. Nouns like city alternate their meaning
between an administrative entity or unit, the group of people living within
the unit's borders, and the people who govern it. Similarly, nouns like book
alternate between the physical object and its content. In fact, as Cruse (this
volume) points out, both meanings can be active simultaneously, as in I'm
going to buy John a book for his birthday, similar to the behaviour of verbs like
climb, which Jackendoff and others analyse as containing a disjunction of
semantic features. This simultaneity entails that the two `meanings' (the
physical object and its content) are not disjoint, but rather components of a
single word sense. The challenge for semantic theories lies in formulating
mechanisms to trigger the relevant meaning components in the appropriate
contexts. Pustejovsky (1995) accounts for such alternations, as between the
physical object and its content, as follows. The definition of book contains
three arguments: one for the physical object (x), another for the content (y),
and a third for a combination of the two, referred to as a dotted argument
and written as x.y. Qualia, or aspects of the meaning of book, determine the
relationships these arguments can have with each other or with other semantic
components in their context. The formal quale specifies that x holds y. The
telic quale specifies the purpose and function of a book, being read by an
agent, which applies to the combined concept, that is, to (x.y).
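A minimal rendering of the dotted argument and its two qualia might look as follows. The class and function names are invented for illustration; this is a sketch, not Pustejovsky's formal notation:

```python
# A sketch of the dotted argument x.y for book, with formal and telic
# qualia; names and string forms are invented for illustration.
from dataclasses import dataclass

@dataclass
class Book:
    physical: str   # x: the physical object
    content: str    # y: the informational content

    def dotted(self):
        # x.y: the combined argument that qualia such as the telic target
        return (self.physical, self.content)

def formal(b):
    """Formal quale: the physical object holds the content (x holds y)."""
    return f"{b.physical} holds {b.content}"

def telic(b):
    """Telic quale: the purpose (being read) applies to the combined x.y."""
    return ("read", b.dotted())

b = Book(physical="bound volume", content="text")
print(formal(b))
print(telic(b))
```

The point of the dotted argument is visible in telic: reading is predicated of the pair as a whole, not of the physical object or the content alone.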
How regular are these alternations? As indicated by several discussions in
this volume, the alternations seem to follow trends. Some correlate with
semantic classes. We mentioned trees and their woods. Words similar to
book in that they have content show similar alternations. Nominals that
describe an action, such as construction, co-operation, and separation, often
describe its result too. A different trend is suggested by Dowty, who advances
syntactic structures as an explanatory principle for alternations in meaning.
Dowty shows that semantic differences correlate systematically with known
syntactic alternations, such as the swarm alternation (as in Bees swarm in the
garden/The garden swarms with bees). The two syntactic forms, one with an
agent subject and the other with a locative subject, exhibit a variety of semantic
differences. Dowty catalogues these differences and proposes a systematic
relation that holds among them. The syntactic properties of the locative structure
form a set of devices for conveying a specific meaning: the locative subject
turns the location into the topic of discourse while the predicate assigns some
abstract property to it, which is further reinforced by the use of indefinite
plurals and mass terms. Dowty's discussion is unique in this volume in its
emphasis on syntax as a means for realizing different aspects of the meaning
of words.
In contrast, Fellbaum warns against such generalizations of polysemy. She
emphasizes the unpredictability of polysemous verbs. After acknowledging the
regularities discussed by Pustejovsky and Dowty, she explores lexicalized
polysemy that is independent of context, syntactic realizations, or semantic
class membership, and that therefore cannot be generalized across the lexical
items that exhibit it. Fillmore and Atkins agree. While they discuss general
principles of extension, such as metonymy, or the extension from a feeling
(sad as in The person is sad) to something evoking this feeling (as in a sad day),
they caution against formulating conditions under which these general
mechanisms apply. As their contrastive analysis of English and French
demonstrates, there is no predictive principle to determine when general mechanisms
apply to extend certain meanings. Apart from the general mechanisms, there
are many specific polysemy extensions, such as the one deriving the meaning
of closeness to the ground from the English crawl, or the one deriving the
concept of spreading from the French ramper. Similar specificity can be found
in the application of Pustejovsky's privative coercion operation, which turns
the meaning of life to mean death, but only when it functions as the
complement of risk.
1.12 Polysemy from a computational point of view
Computer applications that handle the content of natural language texts need
to come to terms with polysemy. The problem is not a new one to
computational linguistics. Ever since the surge of research on machine translation in
the 1950s and 1960s, the problem of word sense identification has been a
major stumbling block in natural language processing.
The study of polysemy in computational linguistics addresses the problem
of how to map expressions to their intended meanings automatically.
Computers have the same resource for sense identification as we do: the context.
However, computers are handicapped because they can only interpret the
context as strings of letters, words, or sounds, and not as meanings. One
direction taken by researchers is to try to harvest machine-readable
dictionaries for lexicographic knowledge of different senses of polysemous words.
Another approach is to try to solve the mapping problem by simulating
human understanding, using statistical procedures to capture patterns of co-
occurrences of words in context.
Bar-Hillel, an early enthusiast for fully automatic machine translation,
concluded that the task is futile because automatic sense identification is
not possible (Bar-Hillel 1964). He asserts that, although it is a trivial matter
for an English speaker to assign the appropriate sense of pen (enclosure rather
than writing implement) in the box is in the pen, no computer program can do
so. We can disambiguate pen because our world knowledge includes information
regarding the relative sizes of toy boxes, writing implements, and play
pens. Bar-Hillel concludes that the solution to the problem would involve a
complete characterization of world knowledge, which is unbounded.
Katz and Fodor (1963) published a theory of semantic interpretation that
posits structures that are contained in the mental lexicon, yet independent of
world knowledge. Kelly and Stone (1975) used the Katz–Fodor semantic
theory as the basis for creating algorithms for automatic sense disambiguation.
Over a period of seven years in the early 1970s, Kelly and Stone (and
some 30 students) hand-coded algorithms, sets of ordered rules, for
disambiguating 671 words, after studying a set of concordance lines containing the
target words. An obvious problem with the Kelly–Stone approach is the
amount of work involved. A second, more fundamental shortcoming, Kelly
and Stone conclude, is that the Katz–Fodor theory is not sufficiently rich to
account for the productivity of human language.
Despite the earlier warnings about the impossibility of automatic sense
resolution without the characterization of world knowledge, there is a
resurgence of interest in automatic sense identification, most notably with the aid
of machine-readable dictionaries and statistical analyses of large textual
corpora. In 1998, Computational Linguistics devoted a special issue to word
sense disambiguation (Ide and Véronis 1998). This renewed effort will test
Bar-Hillel's predictions in light of the enormous computational capabilities
that are available to us today, as compared to what was available in the early
1960s.
How is polysemy defined in computational terms? The optimal degree of
abstraction over multiple meanings of words is often determined by the
computational goal: in machine translation, there are distinct senses if there
exists a lexical choice in the target language; for example, know as in know a
person is translated into connaître in French, whereas know as in know a fact is
translated into savoir. Sometimes the degree of abstraction is determined by
the limits of what can be accomplished with current technology. In large-scale
information retrieval, for example, polysemy is only one aspect of a much
larger ambiguity problem, which includes syntactic ambiguity (the verb table
versus the noun table) and homonymy (river bank versus bank as a financial
institution).
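The connaître/savoir split can be caricatured in a few lines of code; the person-versus-fact rule and the word lists below are a deliberate oversimplification invented for illustration:

```python
# A toy illustration of sense splitting driven by target-language lexical
# choice; the object list and the rule are invented simplifications.
PERSON_OBJECTS = {"Marie", "her", "him", "my neighbour"}

def translate_know(obj):
    """Pick connaître for acquaintance with a person, savoir for a fact."""
    return "connaître" if obj in PERSON_OBJECTS else "savoir"

print(translate_know("Marie"))       # knowing a person
print(translate_know("the answer"))  # knowing a fact
```

The point is only that the English lexicon need not split know into two senses at all: the split exists because French forces a choice.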
In the late 1980s, machine-readable dictionaries (or MRDs) were proposed
for sense disambiguation. Lesk (1986) devised a simple method to link
dictionary definitions if they share words in common. For example, in the
dictionary used by Lesk, pine has two major senses and cone has three. The
phrase pine cone is therefore six-way ambiguous. In order to disambiguate
it, a system must choose from the six possible combinations. Lesk identified
the relevant definitions of pine and cone to be the ones which contained
the same words, in this case, the ones which contained both evergreen and
tree. In general, Lesk's algorithm chooses senses containing overlapping
words.
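Lesk's overlap method can be sketched in a few lines. The mini-glosses below are invented and far shorter than the dictionary definitions Lesk actually used:

```python
# A sketch of Lesk's gloss-overlap method with invented mini-glosses.
pine_senses = {
    "pine.1": "kind of evergreen tree with needle-shaped leaves",
    "pine.2": "waste away through sorrow or illness",
}
cone_senses = {
    "cone.1": "solid body which narrows to a point",
    "cone.2": "fruit of certain evergreen trees",
    "cone.3": "something of this shape, whether solid or hollow",
}

def lesk(senses_a, senses_b):
    """Pick the sense pair whose definitions share the most words."""
    def overlap(d1, d2):
        return len(set(d1.split()) & set(d2.split()))
    return max(((a, b) for a in senses_a for b in senses_b),
               key=lambda p: overlap(senses_a[p[0]], senses_b[p[1]]))

print(lesk(pine_senses, cone_senses))  # the tree sense pairs with the fruit sense
```

Of the six possible combinations, the pair sharing `evergreen` wins; every other pairing overlaps in at most one function word.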
However, when dictionary definitions are terse, overlap of defining words
is unlikely, and other clues in dictionaries are sought. Walker (1987), for
example, uses overlap of subject codes that are associated with words in
Longman's Dictionary of Contemporary English. The relevant subject code
for a piece of text is the one associated with most of the words in the text. The
definitions chosen are those associated with that subject code in the dictionary.
One method for expanding the number of dictionary definitions that
can be used was developed by Guthrie et al. (1991), who compute
neighbourhoods for polysemous words. Neighbourhoods are built by searching for co-
occurrences in definitions and sample sentences of all words sharing the same
subject code. Wilks et al. (1993) provide a detailed account of how machine-
readable dictionaries can be used to identify senses of polysemous words. In
this volume, Stevenson and Wilks describe a combination of methods for
identifying word senses using MRDs.
More recently, corpus-based disambiguation methods have been developed.
As machine-readable corpora containing many millions of words
became available, large-scale analysis of the words that co-occur with
polysemous words was undertaken (Black 1988; Zernik 1991; Hearst 1991; Gale et
al. 1992), and corpus-based word sense identification, i.e. statistical corpus
analysis of co-occurrence patterns, became mainstream in computational
linguistics. Sense identification systems train on example sentences representing
each sense of a word; the number of examples can range from 50 to
hundreds of sentences. The sets of examples are used to create a model for
each word sense. Once the models have been created, the system chooses the
most suitable model for a given novel usage of the word. The suitability of the
model is computed on the basis of a similarity measure between the features
of the model and those of the context of the novel occurrence.
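The train-then-match scheme just described can be sketched with bag-of-words sense models and cosine similarity. The two-sense bank lexicon and its training sentences are invented miniatures of real corpus data:

```python
# A sketch of per-sense models built from training sentences, matched to
# a novel context by cosine similarity; all data here is invented.
from collections import Counter
from math import sqrt

training = {
    "bank/money": ["deposit cash at the bank", "the bank raised interest rates"],
    "bank/river": ["fishing on the river bank", "the bank of the stream flooded"],
}

def vector(text):
    return Counter(text.split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

# One model per sense: the summed bag of words of its example sentences.
models = {sense: sum((vector(s) for s in sents), Counter())
          for sense, sents in training.items()}

def identify(context):
    """Choose the sense whose model is most similar to the novel context."""
    return max(models, key=lambda s: cosine(vector(context), models[s]))

print(identify("he opened an account at the bank to deposit money"))
```

Real systems replace raw counts with weighted features and far more training data, but the train/model/match pipeline is the same.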
Miller and Leacock characterize the kinds of contextual information that a
computer can be expected to extract from a corpus. They look at local
context, the open- and closed-class words that are near or adjacent to a
polysemous word, and at topical context, substantive words that co-occur
with a word sense in a discussion of a given topic. Automatic sense identification
systems that make explicit use of topical and local context include Towell
and Voorhees (1998) and Leacock et al. (1998).
Adam Kilgarriff (1997) organized Senseval, a conference to compare and
evaluate state-of-the-art corpus-based sense disambiguation systems. Eighteen
participants disambiguated 35 polysemous words using the same training and
testing materials. Their results were sent back to Kilgarriff and his colleagues,
who reported them at a workshop in Herstmonceux Castle in the summer of
1998. The Senseval results show that we have reached the point that Bar-
Hillel called the `80 per cent fallacy' (Bar-Hillel 1964). The thrust of this
fallacy is that if 80 per cent of a problem is solved, in this case, 80 per cent
of the polysemous words are correctly disambiguated, it does not follow that
the remaining 20 per cent can be resolved by a 20 per cent increase in research.
On the contrary, Bar-Hillel argues that solving the final 20 per cent entails
much more work than was required to achieve the initial 80 per cent accuracy.
Another problematic aspect of this type of corpus-based approach is what
Gale et al. (1992) aptly call the `knowledge acquisition bottleneck'. In order to
get materials for training a program on senses of a particular polysemous
word, the corpus of contexts containing that word has to be manually
partitioned into its different senses. Aside from being slow and costly, the work put
into manual tagging does not scale up: the partitioning required for one
word will be of no use in disambiguating any other word and will not decrease
the amount of manual effort required.
Schütze (this volume) takes a different approach to corpus-based sense
identification. His central concern is to design a sense identification algorithm
that is both psychologically plausible and generally applicable. Instead of
assigning a polysemous word to a discrete sense, Schütze clusters occurrences
of a word that appear in similar contexts, and then defines these
clusters as word senses.
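A minimal sketch of this strategy, assuming invented contexts for bat and a plain nearest-centroid iteration (Schütze's actual method works in high-dimensional vector spaces with dimensionality reduction):

```python
# Sense induction by clustering contexts rather than labeling them;
# the contexts, stopword list, and seeding are invented for this toy.
from collections import Counter

STOPWORDS = {"the", "a", "at", "of", "out", "in"}

contexts = [
    "swing the bat at the ball",
    "the bat hit the ball hard",
    "a bat flew out of the cave",
    "the bat hung in the dark cave",
]

def vec(text):
    return Counter(w for w in text.split() if w not in STOPWORDS)

def sim(u, v):
    # unnormalized overlap is enough for this toy
    return sum(u[w] * v[w] for w in u)

# Seed two centroids from the first and third contexts, then reassign.
centroids = [vec(contexts[0]), vec(contexts[2])]
for _ in range(3):
    clusters = [[], []]
    for c in contexts:
        best = max((0, 1), key=lambda i: sim(vec(c), centroids[i]))
        clusters[best].append(c)
    centroids = [sum((vec(c) for c in cl), Counter()) for cl in clusters]

print(clusters)  # each cluster of contexts is treated as one induced sense
```

No sense inventory is consulted: the two clusters themselves become the `senses' of bat.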
Dolan, Vanderwende, and Richardson (this volume) combine corpus-
based and MRD approaches. Like Schütze, they use a similarity measure
on the context of words for determining whether words are close in meaning.
But the contexts they use are very specialized: they are links in a semantic
network, called MindNet, which is obtained from the analysis of MRD
definitions. MindNet does not attempt to classify individual word senses;
instead it recognizes concepts that are similar to those in novel usages.
Schütze's and Dolan et al.'s approaches can be very useful for information
retrieval, where the task is to match the query context to similar contexts in
the database of documents, but it is not obvious how such approaches would
be used in an application like machine translation without extensive manual
encoding.
In computational linguistics, it is easier to automatically identify senses of
homonyms than it is to identify senses of polysemes. For example, it is easier
to automatically distinguish between bat the mammal and the bat used in
baseball than it is to distinguish between the bat used in baseball and that
used in cricket. This is because the contexts of homonyms will consist of quite
different vocabularies, whereas the contexts of polysemes may be quite
similar. Whether the distinction between homonyms and polysemes is important
depends, again, on the task at hand. Information retrieval, where several
relevant documents are presented to a user to choose from, may be a more
forgiving environment than automatic translation.
The field of computational linguistics is rapidly expanding in two important
directions. Collections of online texts are increasing in size, and with them
the demand for large-scale processing. Large-scale research is specified as an
important goal of many research projects: the Text Retrieval Conference
(TREC), for example, the annual forum for the evaluation of information
retrieval systems, sponsored by the National Institute of Standards and
Technology (Harman 1995), has required participants to process a collection
of 2-3 gigabytes of text. More recently, it has announced a new track, in
which participants are asked to process 20 gigabytes. The other direction of
expansion is in the application of computers to new domains. Publishers and
computer companies are experimenting with dictionaries and encyclopedias
that exist online. Libraries are redefining themselves as digital, offering their
patrons novel ways to use their resources. Methods are needed
to navigate, search, and retrieve relevant data from the rapidly increasing
volume of multi-lingual textual information available on the Internet. These
developments create an enormous new demand on computational text
understanding, and on the more specific problem of associating words with their
intended meanings.
References
Ahlswede, T. E., and Evens, M. W. (1988), `A lexicon for a medical expert system', in
Evens (1988).
Apresjan, D. (1974), `Regular polysemy', Linguistics, 142: 5-32.
Atkins, B. T., and Levin, B. (1991), `Admitting impediments', in U. Zernik (ed.),
Lexical Acquisition: Exploiting On-line Resources to Build a Lexicon. Hillsdale, NJ:
Lawrence Erlbaum.
Bar-Hillel, Y. (1964), `A demonstration of the nonfeasibility of fully automatic high
quality machine translation', in Language and Information: Selected Essays on Their
Theory and Application. Reading: Addison-Wesley.
Black, E. W. (1988), `An experiment in computational discrimination of English word
senses', IBM Journal of Research and Development, 32(2): 185-94.
Byrd, R. J., Calzolari, N., Chodorow, M. S., Klavans, J. L., Neff, M. S., and Rizk,
O. A. (1987), `Tools and methods for computational lexicology', Computational
Linguistics, 13(3-4): 219-40. Also available as IBM RC 12642, IBM T. J. Watson
Research Center, Yorktown Heights, NY.
Coleman, L., and Kay, P. (1981), `Prototype semantics: the English verb lie',
Language, 57(1).
Cruse, D. A. (1986), Lexical Semantics. Cambridge: Cambridge University Press.
Evens, M. W. (1988), Relational Models of the Lexicon: Representing Knowledge in
Semantic Networks. Cambridge: Cambridge University Press.
Fellbaum, C. (1990), `English verbs as a semantic net', International Journal of
Lexicography 3.
—— (1998), WordNet: A Lexical Reference System and Its Application. Cambridge,
Mass.: MIT Press.
Fillmore, C. J. (1982), `Towards a descriptive framework for spatial deixis', in
R. J. Jarvella and W. Klein (eds.), Speech, Place and Action. New York:
Wiley.
—— and Atkins, B. T. S. (1992), `Towards a frame-based lexicon: the semantics of
risk and its neighbors', in A. Lehrer and E. Feder Kittay (eds.), Frames, Fields, and
Contrasts: New Essays in Semantic and Lexical Organization. Hillsdale, NJ:
Lawrence Erlbaum.
Gale, W., Church, K. W., and Yarowsky, D. (1992), `A method for disambiguating
word senses in a large corpus', Computers and the Humanities, 26: 415-39.
Geeraerts, D. (1993), `Vagueness's puzzles, polysemy's vagaries', Cognitive
Linguistics, 4(3): 223-72.
—— (1994), `Polysemy', in R. E. Asher and J. M. Y. Simpson (eds.), The Encyclopedia
of Language and Linguistics. Oxford and New York: Pergamon.
Guthrie, J. A., Guthrie, L., Wilks, Y., and Aidinejad, H. (1991), `Subject-dependent
co-occurrence and word sense disambiguation', in Proceedings of the 29th Annual
Meeting of the Association for Computational Linguistics. Berkeley, Calif.: ACL.
Harman, D. K. (1995), `Overview of the third text retrieval conference (TREC-3)'.
Technical Report Special Publication 500-225. Washington, DC: National Institute
of Standards and Technology.
Hearst, M. A. (1991), `Noun homograph disambiguation using local context in large text
corpora', in Annual Conference of the UW Centre for the New OED and Text
Research: Using Corpora. Oxford: UW Centre for the New OED and Text Research.
Howes, M. M. (1990), The Psychology of Human Cognition. New York: Pergamon.
Ide, N., and Véronis, J. (1998), `Introduction to the special issue on word sense
disambiguation: the state of the art', Computational Linguistics, 24(1): 1-40.
Jackendoff, R. S. (1985), `Multiple subcategorization and the theta-criterion: the case
of climb', Natural Language and Linguistic Theory, 3: 271-95.
Katz, J. J. (1972), Semantic Theory. New York: Harper & Row.
——, Leacock, C., and Ravin, Y. (1985), `A decompositional approach to
modification', in E. LePore and B. McLaughlin (eds.), Action and Events. Oxford: Blackwell.
—— and Fodor, J. A. (1963), `The structure of a semantic theory', Language, 39: 170-210.
Kelly, E. and Stone, P. (1975), Computer Recognition of English Word Senses.
Amsterdam: North-Holland.
Keppel, G., and Strand, B. Z. (1970), `Free-association responses to the primary
purposes and other responses selected from the Palermo-Jenkins norms', in L.
Postman and G. Keppel, (eds.), Norms of Word Association. New York: Academic
Press.
Kilgarriff, A. (1997), `Evaluating word sense disambiguation programs: progress
report'. Brighton: Information Technology Research Institute.
Labov, W. (1973), `The boundaries of words and their meanings', in C.-J. N. Bailey
and R. W. Shuy, (eds.), New Ways of Analysing Variation in English. Washington,
DC: Georgetown University Press.
Lakoff, G. (1987), Women, Fire, and Dangerous Things. Chicago: University of
Chicago Press.
Leacock, C., Chodorow, M., and Miller, G. A. (1998), `Using corpus statistics and
WordNet relations for sense identification', Computational Linguistics, 24(1): 47-65.
——, Towell, G., and Voorhees, E. M. (1993), `Corpus-based statistical sense
resolution', in Proceedings of the ARPA Workshop on Human Language Technology. San
Francisco: Morgan Kaufmann.
—— (1996), `Towards building contextual representations of word senses using
statistical models', in B. Boguraev and J. Pustejovsky (eds.), Corpus Processing for
Lexical Acquisition. Cambridge, Mass.: MIT Press.
Lesk, M. (1986), `Automatic sense disambiguation: how to tell a pine cone from an ice
cream cone', in Proceedings of the 1986 SIGDOC Conference. New York: Association
for Computing Machinery.
Levin, B. (1995), English Verb Classes and Alternations: A Preliminary Investigation.
Chicago: University of Chicago Press.
Mel'čuk, I., and Zholkovsky, A. (1988), `The explanatory combinatorial dictionary',
in M. W. Evens (ed.), Relational Models of the Lexicon: Representing Knowledge
in Semantic Networks. Cambridge: Cambridge University Press.
Miller, G. A. (1990), `WordNet: an on-line lexical database', International Journal of
Lexicography, 3(4).
—— (1998), `Nouns in WordNet', in C. Fellbaum (ed.), WordNet: A Lexical Reference
System and Its Application. Cambridge, Mass.: MIT Press.
Pustejovsky, J. (1993), Semantics and the Lexicon. Dordrecht: Kluwer.
—— (1995), The Generative Lexicon. Cambridge, Mass.: MIT Press.
Quine, W. V. (1960), Word and Object. Cambridge, Mass.: MIT Press.
Ravin, Y. (1990), Lexical Semantics without Thematic Roles. Oxford: Oxford Uni-
versity Press.
Robins, R. H. (1967), A Short History of Linguistics. Bloomington: Indiana University
Press.
Rosch, E. (1977), `Human categorization', in N. Warren (ed.), Advances in Cross-
Cultural Psychology, vol. 7. London: Academic Press.
Taylor, J. R. (1989), Linguistic Categorization: Prototypes in Linguistic Theory.
Oxford: Oxford University Press.
Towell, G., and Voorhees, E. M. (1998), `Disambiguating highly ambiguous words',
Computational Linguistics, 24(1): 125-46.
Walker, D. E. (1987), `Knowledge resource tools for accessing large text Wles', in S.
Nirenburg, (ed.), Machine Translation. Cambridge: Cambridge University Press.
Weinreich, U. (1966), `Explorations in semantic theory', in T. A. Sebeok (ed.),
Current Trends in Linguistics, iii. The Hague: Mouton.
Wierzbicka, A. (1990), `Prototypes save: on the uses and abuses of the notion of
``prototypes'' in linguistics and related fields', in S. L. Tsohatzidis (ed.), Meanings
and Prototypes: Studies in Linguistic Categorization. London: Routledge & Kegan
Paul.
—— (1996), Semantics: Primes and Universals. Oxford: Oxford University Press.
Wilks, Y., Fass, D., Guo, C.-M., McDonald, J. E., Plate, T., and Slator, B. M.
(1993), `Machine tractable dictionary tools', in J. Pustejovsky (ed.), Semantics and
the Lexicon. Dordrecht: Kluwer.
Wittgenstein, L. (1953), Philosophical Investigations. Oxford: Basil Blackwell &
Mott.
—— (1958), The Blue and Brown Books. Oxford: Basil Blackwell & Mott.
Zernik, U. (1991), `Train1 vs. Train2: tagging word senses in corpus', in U. Zernik
(ed.), Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon.
Hillsdale, NJ: Lawrence Erlbaum.