
Polysemy

Polysemy
Theoretical and
Computational Approaches

Edited by
Yael Ravin
and
Claudia Leacock

Great Clarendon Street, Oxford OX2 6DP


Oxford University Press is a department of the University of Oxford.
It furthers the University's objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Athens Auckland Bangkok Bogota Buenos Aires Calcutta
Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul
Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai
Nairobi Paris São Paulo Singapore Taipei Tokyo Toronto Warsaw
with associated companies in Berlin Ibadan
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
Editorial matter and organization © Yael Ravin and Claudia Leacock 2000
Individual chapters © the contributors
The moral rights of the author have been asserted
Database right Oxford University Press (maker)
First published 2000
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organisation. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose this same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data applied for
ISBN 0-19-823842-8
1 3 5 7 9 10 8 6 4 2
Typeset in Times
by Kolam Information Services Pvt Ltd, Pondicherry, India
Printed in Great Britain on acid-free paper

Preface

The problem of polysemy, or of the multiplicity of word meanings, has
preoccupied us since the beginning of our professional careers. We first
attempted to formalize it as graduate students, in `A Decompositional
Approach to Modification', together with Professor Jerry Katz as our first
joint publication. In that early analysis of modification, we discussed poly-
semous adjectives such as good, which appear to acquire different meanings
depending on the head they modify. Thus, a good knife is a knife that cuts
well, but a good memento is an object that adequately reminds. We explained
this seeming polysemy in decompositional terms: the semantics of the head is
a composition of several semantic components; only one of these components
is affected by the modifier. This issue of the polysemy of modifiers continues
to challenge linguists, as we see in this volume.
Polysemy was also a main aspect of Yael Ravin's thesis and subsequent
book, about the relationship between semantic and syntactic structures. In
that work, polysemy is viewed as a one-to-many relationship between syntac-
tic, or lexical, forms and their corresponding meanings. Thus, break with
refers to an instrument relation in break the window with a rock but to an
accompaniment or complicity relation in break the window with Paul. Viewed
in this way, polysemy is a crucial aspect in defining the systematic relationship
between meaning and structure.
We continued to research polysemy as part of our work in computational
linguistics. Computational applications do not always differentiate between
polysemy (multiple meanings at the level of a single word) and ambiguity
(multiple meanings at the level of more complex syntactic structures). Work on
grammar-checking at IBM used the definitions in a machine-readable dic-
tionary to determine which sense of a word or syntactic structure was intended
in a given text. This direction of research is further discussed in this volume.
Finally, at Princeton University's Cognitive Science Laboratory, Claudia
Leacock collaborated with George Miller, Martin Chodorow, and Ellen
Voorhees to formulate and run a series of experiments in an effort to
understand better the roles that different aspects of context play in
identifying the intended sense of a polysemous word. This work culminated
in the development of a Topical/Local Classifier, described in the 1998 issue
of Computational Linguistics.
Our research into the theoretical aspects of polysemy and our experience
with practical approaches to resolve it gave us the idea to put together this
volume. Two observations emerge from this collection: first, polysemy
remains a vexing theoretical problem, leading many researchers to view it as a
continuum of words exhibiting more or less polysemy, rather than a strict
dichotomy. The second is the increasing realization that context plays a
central role in causing polysemy, and therefore should be an integral part of
trying to resolve it. We hope readers will appreciate the implicit dialogue
emerging among the various contributions: between different analyses of
similar data, between different semantic theories, and between theories and
computational experiments.

Contents

Notes on Contributors
List of Figures and Tables
1 Polysemy: An Overview
Yael Ravin and Claudia Leacock
2 Aspects of the Micro-structure of Word Meanings
D. Alan Cruse
3 Autotroponomy
Christiane Fellbaum
4 Lexical Shadowing and Argument Closure
James Pustejovsky
5 Describing Polysemy: The Case of Crawl
Charles J. Fillmore and B. T. S. Atkins
6 The Garden Swarms with Bees and the Fallacy of
`Argument Alternation'
David Dowty
7 Polysemy: A Problem of Definition
Cliff Goddard
8 Lexical Representations for Sentence Processing
George A. Miller and Claudia Leacock
9 Large Vocabulary Word Sense Disambiguation
Mark Stevenson and Yorick Wilks
10 Polysemy in a Broad-Coverage Natural Language
Processing System
William Dolan, Lucy Vanderwende, and Stephen Richardson
11 Disambiguation and Connectionism
Hinrich Schütze
Index

Notes on Contributors

Beryl T. Sue Atkins, professional lexicographer since 1966, has been an
editor of the original Collins-Robert English-French Dictionary series, Lexi-
cographical Adviser to Oxford University Press, and consultant to lexical
systems developers in Europe and the USA. Her research interests include the
contribution of linguistic theory to corpus lexicography (with Charles J.
Fillmore), the training of lexicographers and teaching of lexicography, and
dictionary use and dictionary users.
D. Alan Cruse is currently Senior Lecturer in Linguistics at Manchester
University, where he teaches courses in semantics, pragmatics, and cognitive
linguistics. His research interests and publications are mainly in the field of
lexical semantics.
William Dolan has been a Researcher in the Natural Language Proces-
sing group at Microsoft Research since 1992. He previously worked on
several natural language projects at the IBM Los Angeles Scientific Center.
His interests include lexical knowledge bases, word sense disambiguation,
information retrieval, and the computational processing of metaphor.
David Dowty is Professor of Linguistics at Ohio State University. He is the
author of Word Meaning and Montague Grammar, co-author of Introduction to
Montague Semantics, and author of a number of articles on aspect and aktion-
sart, tense, formal model-theoretic lexical semantics, and categorial grammar.
Christiane Fellbaum is a member of the Cognitive Science Laboratory
at Princeton University. She has worked and published extensively in the
areas of specific language impairment, semantics, syntax, and lexical seman-
tics. Much of her work in the past decade has centred on the verb component
of the lexical database WordNet. She is the editor of WordNet: A Lexical
Database of English (1998).
Charles J. Fillmore is Professor Emeritus of Linguistics at the Univer-
sity of California at Berkeley, Director of the National Science Foundation-
funded FrameNet project at the International Computer Science Institute in
Berkeley, and a Fellow of the American Academy of Arts and Sciences. His
research interests have been in grammar, semantics (especially lexical seman-
tics), and linguistic pragmatics.
Cliff Goddard is an Associate Professor in Linguistics at the University
of New England, Armidale, Australia. He has published a dictionary and
grammar of the Yankunytjatjara language of Central Australia. More re-
cently he has been working on Malay (Bahasa Melayu). His main research
interests are in semantics, cognitive linguistics, cross-cultural pragmatics, and
language description. He is co-editor, with Anna Wierzbicka, of Semantics
and Lexical Universals: Theory and Empirical Findings (John Benjamins,
1994). His latest book is Semantic Analysis: A Practical Introduction (Oxford
University Press, 1998).
Claudia Leacock recently became a research scientist at the Educational
Testing Service. Previously she spent six years as a research staff member at
Princeton University's Cognitive Science Laboratory. Recent papers include
Leacock, Chodorow, and Miller, `Using corpus statistics and WordNet rela-
tions for sense identification', in Computational Linguistics' special issue on
word sense disambiguation (March 1998), and Leacock and Chodorow,
`Combining local context with WordNet similarity for word sense identifica-
tion', in WordNet: A Lexical Reference System and Its Application (1998).
George Miller is James S. McDonnell Distinguished University Profes-
sor of Psychology Emeritus and Senior Research Psychologist at Princeton
University. He is the author of seven books, of which the most recent, The
Science of Words, was published in 1991. He is presently directing the Word-
Net project at Princeton University's Cognitive Science Laboratory, an on-
line database for English organized by semantic relations.
James Pustejovsky is currently an Associate Professor of Computer
Science at Brandeis University, where he conducts research in the areas of
computational linguistics, lexical semantics, inheritance, and information
retrieval and extraction.
Yael Ravin is a research staff member at the T. J. Watson Research Center
of IBM, New York, currently working on text analysis and information
retrieval. Recent papers include: Ravin and Wacholder, `Extracting Names
from Natural-Language Text' (1996), IBM Research Report, and Wacholder,
Ravin, and Choi, `Disambiguation of Names in Text', in Proceedings of the
Fifth Conference on Applied Natural Language Processing (March 1997).
Steve Richardson has been a Senior Researcher in the Natural Lan-
guage Processing group at Microsoft Research since its inception in 1991. He
previously worked at the IBM Bethesda, Maryland, Development Laborat-
ory and the IBM Thomas J. Watson Research Center, where he was involved
in NLP research and development for eight years. He is interested in all
aspects of NLP and has worked on grammar checking, lexical knowledge
bases, and machine translation.
Hinrich Schütze is a member of the research staff at the Xerox Palo Alto
Research Center and Consulting Assistant Professor in the Linguistics De-
partment at Stanford University. His main research interest is the application
of statistical learning to problems in computational linguistics and informa-
tion retrieval. He has worked on word-sense disambiguation, genre classifica-
tion, and text categorization. He published Foundations of Statistical Natural
Language Processing (with Chris Manning, 1999) and Ambiguity Resolution
in Language Learning (1997).
Mark Stevenson is currently finishing a PhD in word sense disambigua-
tion and is supervised by Yorick Wilks. They have jointly published several
papers resulting from this work. Apart from word sense disambiguation, his
interests include lexical semantics, parsing, machine learning, information
extraction, and psycholinguistics. His academic career began at Glasgow
University, where he studied Mathematics and Philosophy. He then com-
pleted a Masters degree at the Artificial Intelligence Department of Edin-
burgh University, where he carried out research into graphical interfaces to
NLP systems.
Lucy Vanderwende has been a Researcher in the Natural Language
Processing group at Microsoft Research since 1992. Previously she worked
at the Institute for Systems Science in Singapore and at the IBM Bethesda,
Maryland, Development Laboratory. Her interests include cross-linguistic
deep argument structure, noun compounds, lexical knowledge bases,
information retrieval, and machine translation.
Yorick Wilks is Professor of Computer Science at the University of
Sheffield and Director of ILASH, the Institute of Language, Speech, and
Hearing. He is also a Fellow of the American Association for Artificial
Intelligence, on advisory committees for the National Science Foundation,
and on the boards of some fifteen AI-related journals. He currently works
in the areas of information extraction from text sources, computational
pragmatics, and the automatic acquisition of linguistic resources such as
lexicons and grammars.

Figures and Tables

Figures
5.1 The AHD entry for the verb crawl
5.2 The CED entry for the verb crawl
5.3 Crawl entries from four learners' dictionaries
5.4 Crawl sentences (some abridged) from corpus data
5.5 Semantic network for the verb crawl
5.6 Ramper entry in OHFED
5.7 Uses of ramper from the ARTFL twentieth-century corpus
5.8 Semantic network for ramper
9.1 The entry for bank in LDCE
10.1 Highly weighted links between handle and sword
10.2 Highly weighted links between handle and door

Tables
5.1 Comparative coverage of the verb crawl in four dictionaries
5.2 Dictionary senses of the verb crawl, with corpus examples
5.3 Grammatical information for the verb crawl in four dictionaries
5.4 Elements in the motion frame
9.1 Types of knowledge source in LDCE
9.2 Accuracy of the reported system
11.1 Senses of ambiguous words
11.2 Number of occurrences of test words in training and test set,
% rare senses in test set, and baseline performance
(all occurrences assigned to most frequent sense)
11.3 Results of two disambiguation experiments (2 and 10 clusters per word)
11.4 Disambiguation markedly improves retrieval performance

1
Polysemy: An Overview
Yael Ravin and Claudia Leacock

1.1 What is polysemy?
The study of polysemy, or of the `multiplicity of meanings' of words, has a
long history in the philosophy of language, linguistics, psychology, and
literature. The complex relations between meanings and words were first
noted by the Stoics (Robins 1967). They observed that a single concept can
be expressed by several different words (synonymy) and that, conversely, one
word can carry different meanings (polysemy). The collection of papers
assembled here represents current research into the issues arising from poly-
semy, such as the nature of polysemy; its relation to the more general
phenomenon of semantic ambiguity; the ways in which multiple meanings,
or senses, are represented in a dictionary or lexicon and related to each other;
the principles that govern these relations and the mechanisms that allow the
creation of new senses. Since words are used in context, the mechanisms by
which polysemous words combine with others to form the meaning of larger
syntactic units are also addressed.
Polysemy is rarely a problem for communication among people. We are so
adept at using contextual cues that we select the appropriate senses of words
effortlessly and unconsciously. The sheer number of senses listed by some
sources as being available to us usually comes as a surprise: out of approx-
imately 60,000 entries in Webster's Seventh Dictionary, 21,488, or almost
40%, have two or more senses, according to Byrd et al. (1987). Moreover, the
most commonly used words tend to be the most polysemous. The verb run,
for example, has 29 senses in Webster's, further divided into nearly 125 sub-
senses.
Although rarely a problem in language use, except as a source of humor
and puns, polysemy poses a problem in semantic theory and in semantic
applications, such as translation or lexicography. As we see with run in
Webster's, the traditional lexicographic practice is to list multiple dictionary
senses for polysemous words and to group related ones as sub-senses. Dic-
tionaries differ in the number of senses they define for each word, the group-
ing into sub-senses, and the content of definitions. It is clear that there is little
agreement among lexicographers as to the degree of polysemy and the way in
which the different senses are organized. In fact, Atkins and Levin (1991) and
Fillmore and Atkins (this volume) show that mapping the definitions from
one dictionary onto another is often not possible, even for mildly ambiguous
words, such as the verb whistle, or across dictionaries that are similar in scope
and coverage. Whistle is defined in one dictionary as make a shrill clear sound
by rapid movement but as move with a whistling sound in another. The genus
terms and differentia are reversed. Do these two dictionary definitions cap-
ture the same sense of the verb?
Another problem with dictionary definitions is their use of polysemous
defining terms, further obscuring the relation between dictionary senses. One
of the definitions of the noun whistle is the sound produced by a whistle, but
which sense of whistle is intended, the instrument, the device (as in a factory
whistle), or both? A third problem with dictionary definitions, mentioned by
Atkins and Levin and again in this volume, arises when actual uses of words
encountered in context cannot be mapped to any dictionary definition, as in
whistling up the NATO forces if need be, revealing senses that are missing in
the dictionary. These examples indicate that lexicographers tend to disagree
as to the number of senses a word has, the semantic content of these senses,
and their groupings. Semanticists also disagree on these issues, as can be seen
in the different approaches presented in this book. Two papers in particular,
Fillmore and Atkins and Goddard in this volume, propose ways in which
their theories can enhance or improve the practice of creating dictionary
definitions.

1.2 Polysemy versus homonymy and indeterminacy


Traditionally, polysemy is distinguished from homonymy. Strictly speaking,
homographs are etymologically unrelated words that happen to be repres-
ented by the same string of letters in a language. For example, bass the fish is
derived from Old English barse (perch), while bass the voice is derived from
Italian basso. Conversely, polysemes are etymologically, and therefore semant-
ically, related, and typically originate from metaphoric usage. Line in a line of
people and a line drawn on a piece of paper are etymologically related, and it
is easy to see their semantic relation. The distinction is not always straightfor-
ward, especially since words that are etymologically related can, over time,
drift so far apart that the original semantic relation is no longer recognizable.
The distinction between polysemy and homonymy is important because it
separates the principled from the accidental and poses the following ques-
tions: if different senses of polysemous words are systematically related, how
do they derive from each other, and how should they be organized to reflect
this regularity?
A second important distinction is one between polysemy and indetermi-
nacy, sometimes referred to as vagueness. The distinction is between those
aspects of meaning that correspond to multiple senses of a word versus those
aspects that are manifestations of a single sense. For example, the referent of
child can be either male or female. This difference in gender can be viewed as
polysemy, creating two different senses of child; or, more intuitively, it can be
seen as a difference that is indeterminate within a single sense of child. The
distinction between polysemy and indeterminacy is at the core of semantic
theory, as it defines the relation between the semantics of linguistic expressions
and the extralinguistic entities to which these expressions refer. Thus, each
semantic theory represented in this book, whether it assumes the existence of
mental images, concepts, states, or referents, discusses criteria for determining
which of these extralinguistic factors are relevant to semantic representations.
Various diagnostic tests have been proposed for distinguishing between
polysemy and indeterminacy, but they are all problematic. Cruse discusses
many in this volume; Geeraerts (1993; 1994) has a detailed analysis, which we
summarize below.
The first group of tests are the logical tests originally defined by Quine
(1960): if an assertion involving a word can be both true and false of the same
referent, then the word is polysemous. For example, the feather is light and not
light is possible because light is being used as not heavy and then as not dark.
Quine points out that this test needs to be broadened to include situations in
which the truth conditions are neither true nor false, as when the assertion has a
bizarre or anomalous reading. One example is the book is sad, where applying
the condition of experiencing emotion to a book is anomalous, rather than
simply false. A variation on this test is to use assertions in which both senses
may be true but not redundant. For example, Charles has changed his position,
discussed in Cruse (1986), may refer to either his physical location in the room
or his view on some issue.
A second diagnostic group of tests is linguistic. There is a linguistic con-
straint on using multiple senses in a single usage of a polysemous word. For
example, zeugma results in Arthur and his driving license expired last Thurs-
day, as noted again by Cruse (1986).
Finally, there is a definitional test, whose origin can be traced back to
Aristotle. A word is polysemous if more than a single definition is needed to
account for its meaning. In classical terms, a word is polysemous if a single set
of necessary and sufficient conditions cannot be defined to cover all the
concepts expressed by the word. This test differs from the others in that it is
not only diagnostic but also explicatory, as evidenced in the way it is used by
Goddard (this volume) to derive semantic representations. To quote Geer-
aerts (1994), the definitional test
embodies a hypothesis about the principles of categorization that human beings
employ. It suggests, in fact, that one of the basic characteristics of natural language
categorization is a tendency towards generality (or, in other words, towards maximal
abstraction).
Different tests yield contradictory results when compared to each other, and
Geeraerts (1993) notes several cases where the tests make conflicting predic-
tions. His strongest case is with words like newspaper, which can refer either to
the people heading the organization or to the printed object. Newspaper is not
polysemous according to the linguistic test, because both senses can exist in
sentences like The newspaper decided to change its print. However, this class of
words is polysemous according to the definitional test, since a single definition
to cover both the people and the product cannot be formed.
Geeraerts shows that when the three types of tests are slightly manipulated
and re-applied to the same linguistic material, they yield inconsistent results.
While Judy's dissertation is thought-provoking and yellowed with age indicates
polysemy for dissertation according to the linguistic test, the slightly modified
Judy's dissertation is still thought-provoking although yellowed with age is fine.
In fact, as documented by Cruse (1986), the awkwardness of coordination is a
matter of degree, and it can appear less awkward if properly manipulated.
The variability of the definitional test is directly relevant to Wierzbicka's
approach, represented here by Goddard, who holds that the definitional test
described by Geeraerts is valid and consistent and in fact uses it as his primary
method for determining the sense (or senses) of words. Goddard traces the
difficulty in establishing the distinction between polysemy and indetermi-
nacy to faulty semantic representation. The difficulty disappears when defini-
tions are constructed with care and rigor. This is a hard task, which may not
always be successful. Geeraerts (1993) cautions that the polysemy or inde-
terminacy of a word may hide in the polysemy or indeterminacy of the words
used to define it. He gives as an example one of Wierzbicka's own definitions,
that of bachelor as an unmarried man thought of as someone who could marry
(Wierzbicka 1990). The meaning of bachelor, though, cannot be easily accom-
modated within a definition like this, as observed by Lakoff (1987), who cites
the difficulties in applying the definition to either Tarzan or the Pope, both
unmarried but for very different reasons. Wierzbicka responds that by con-
structing the definition more carefully, she can address these problems. Her
modified definition reads: a man who has never married thought of as a man
who can marry if he wants to (1996). Geeraerts (1993) further points out that in
this definition the word can preserves the polysemy between the sense of
permission and the sense of physical possibility, and that its use in the defini-
tion obscures the difficulties perceived with the former definition of bachelor.
Hence it is important to ensure that only monosemous words are used in
definitions, or, if not, that the sense in which they are used is made clear; but
Geeraerts does not see a well-defined methodology that would guarantee this.
We return to this point below.
Because the tests are not consistent, the difference between polysemy and
indeterminacy is not stable, and our ability to define distinct senses for words is
questioned. Geeraerts believes that the failures of the tests call into question
our very conception of meaning and of lexical semantics. Meanings may not
be fixed entities, but rather different overlapping subsets of semantic com-
ponents, some more preferential than others.
The tremendous flexibility that we observe in lexical semantics suggests a procedural
(or perhaps `processual') rather than a reified conception of meaning; instead of
meanings as things, meaning as a process of sense creation would seem to become
our primary focus of attention.

This conclusion agrees with Fillmore and Atkins's (this volume) view of word
meanings as networks of semantic concepts that are extendible from a core
meaning. The direction and scope of the extensions, Fillmore and Atkins
argue, depend on the particular word analyzed, the language (English crawl
differs in its network of senses from the French ramper), historical evolution,
and other words in the context.

1.3 Polysemy and context
Geeraerts emphasizes the importance of context in determining the predic-
tions of each of his tests, as he demonstrates that context alters the senses of
the words found in it. This emphasis on context is common to all of the
approaches discussed in this volume: the common model (but we discuss
exceptions below) is to define the meaning of words independent of context,
as discrete entries in a dictionary, and then establish principles according to
which word meanings interact when found together in a particular context.
The central question is what aspects of word meaning are predefined and
invariant across multiple contexts versus what other aspects are indetermin-
ate and only realized in context. Theories differ in the balance they strike
between the two. At one extreme we find Goddard (in this volume), stipulat-
ing maximal semantic content in word meaning. In fact, the interchangeabil-
ity of definition content across multiple contexts is the very criterion for
well-formed definitions. In Goddard's terms, context can only augment but
not alter the semantic content of words. At the other extreme we find
Schütze, who discards definitions and semantic content altogether. In
Schütze's work, words do not have semantic content, only semantic similarity
to other words, which is measured by the similarity of the contexts in which
they appear. In fact, Schütze induces senses rather than stipulating them.
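The idea of measuring semantic similarity through similarity of contexts can be illustrated with a small sketch. This is a toy simplification of our own, not Schütze's actual system: the miniature corpus, the window size, and the use of raw co-occurrence counts are all illustrative assumptions. Each word is represented by counts of the words appearing near it, and two words are compared by the cosine of the angle between their count vectors.

```python
# Toy illustration of context-based similarity (our own simplification, not
# Schütze's actual vector-space model): each word is represented by counts of
# its neighboring words, and words are compared by cosine similarity.
import math
from collections import Counter

corpus = ("the bank approved the loan . the bank raised interest rates . "
          "the river bank was muddy . the shore of the river was muddy").split()

def context_vector(word, tokens, window=2):
    """Count the words appearing within `window` positions of `word`."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok == word:
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    vec[tokens[j]] += 1
    return vec

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda w: math.sqrt(sum(c * c for c in w.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

# 'river' and 'shore' share contexts, so they come out more similar to each
# other than 'river' is to 'loan'.
v = {w: context_vector(w, corpus) for w in ("river", "shore", "loan")}
print(cosine(v["river"], v["shore"]) > cosine(v["river"], v["loan"]))  # True
```

On a realistic corpus the same comparison is done over dimensionality-reduced vectors rather than raw counts, but the principle is the same: no definitions are stipulated, and similarity falls out of distributional behavior alone.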
Common to both these approaches are the relatively straightforward prin-
ciples that govern the interaction between predefined meaning and context.
Although Goddard does not address the issue directly here, his approach
need only stipulate very general principles of semantic composition, mainly to
rule out contradictory combinations (such as hateful love), anomalous ones
(wooden love), and other similar unacceptable constructions. Katz et al. (1985)
discuss three such principles in a similar theoretical approach to word
meaning. Semantic representations in Katz's theory consist of hierarchical
tree structures of semantic concepts. One principle for combining tree struc-
tures, called attachment, is an operation that appends the tree structures of
modifiers, such as adjectives and adverbs, to the tree structures of their
syntactic heads. Attachment occurs when the top concept node of the mod-
ifier can fit under a concept node of the head. For example, intentionally
(with a top concept node of intention) can attach to kill under the concept
node of event, since a general rule of semantic structure (namely,
Intention → Event) licenses it.
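The attachment operation can be sketched in a few lines. This is a hypothetical encoding of our own: the `Node` class, the `LICENSES` rule table, and the traversal order are illustrative assumptions, not Katz's actual formalism.

```python
# Hypothetical sketch of Katz-style "attachment" (illustrative encoding only):
# a modifier's tree is appended under the first head node that a licensing
# rule of the form "Modifier-top -> Head-node" permits.

class Node:
    def __init__(self, concept, children=None):
        self.concept = concept
        self.children = children or []

# Licensing rules: "a modifier topped by X may attach under a node Y"
# (e.g. Intention -> Event, as in the example above).
LICENSES = {("Intention", "Event")}

def attach(modifier, head):
    """Append the modifier's tree under the first licensed head node.
    Returns True if attachment succeeded anywhere in the head tree."""
    if (modifier.concept, head.concept) in LICENSES:
        head.children.append(modifier)
        return True
    return any(attach(modifier, child) for child in head.children)

# 'kill' as an Event with an Agent sub-node; 'intentionally' topped by Intention.
kill = Node("Event", [Node("Agent")])
intentionally = Node("Intention")

attach(intentionally, kill)
print([c.concept for c in kill.children])  # ['Agent', 'Intention']
```

An unlicensed combination (say, attaching an Intention node to a tree with no Event node) simply fails, which is how anomalous constructions like wooden love are ruled out in this style of account.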
Other approaches to the issue of word meaning and context in this volume
strike a more moderate balance. Cruse unfolds a large continuum of word
senses, ranging from `the possibility of (at least some) context-invariant
semantic properties of (at least some) words' to nodules of meaning that are
created and dissolved with changes in the context. Pustejovsky, similarly,
discusses highly under-specified senses of words and the principles that operate
on those senses to yield different interpretations in different contexts. The
more subtle the interactions between lexical meaning and context, the more
complex the mechanisms necessary for governing these interactions. This is
made clear in Pustejovsky's discussion of a particularly vexing group of verbs,
like risk, which are shown by Fillmore and Atkins (1992) to occur with
contradictory contexts, as in Mary risked her life versus Mary risked death.
Pustejovsky explains how the same verb meaning can combine with anto-
nymous complements to form roughly the same compositional meaning, that
of some likely harmful result. A coercion operator introduces the meaning of
privation (death, or the privation of life) to the meaning of the complement if
the complement does not already contain it. Coercion operations play a
major role in Pustejovsky's Generative Lexicon. They allow single-sense
words to acquire different readings in different contexts, by coercing the
meaning of nouns, for example, into one of their metonymic extensions.
Coercion explains how a fast car is one that can be driven quickly but a fast
typist is one who can type quickly (Pustejovsky 1993).
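The fast car / fast typist contrast can be sketched as follows. This is a toy encoding of our own, not Pustejovsky's actual formalism: the `QUALIA` table and the `interpret_fast` function are illustrative, standing in for the richer lexical entries of the Generative Lexicon.

```python
# Toy sketch of the coercion idea (illustrative encoding only): each noun's
# lexical entry carries a telic quale naming the event it is typically involved
# in, and an adjective like "fast" is coerced to modify that event rather than
# the noun itself.

QUALIA = {
    "car":    {"telic": "drive"},   # a car is for driving
    "typist": {"telic": "type"},    # a typist is one who types
}

def interpret_fast(noun):
    """Resolve 'fast <noun>' to 'do the noun's telic event quickly'."""
    event = QUALIA[noun]["telic"]
    return f"fast {noun} = {event} quickly"

print(interpret_fast("car"))     # fast car = drive quickly
print(interpret_fast("typist"))  # fast typist = type quickly
```

The single sense of fast thus never changes; what varies is the event, supplied by the noun's own entry, that the adjective ends up modifying.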
An open question remains as to how many such operative principles exist
and how complex the under-specified representations of words need to be in
order to allow certain coercions but block others (*save his death). Cruse
doubts whether such mechanisms are sufficient to explain the intricacies of
the interactions between word meaning and context. His conclusion is pess-
imistic: there is `a disturbing degree of fluidity [in] semantic structure' which
cannot be subject to reductive explanations.

1.4 Theories of meaning
Semantic theories account for polysemy as one semantic phenomenon in a
comprehensive theory of meaning. Taken in the most general terms, seman-
tics relates the extralinguistic world to the linguistic expressions that describe
it. The nature of the extralinguistic entities need not concern us here: onto-
logically, one could speak of states of aVairs; formally, of intensions or sets of
possible worlds; mentally, of concepts corresponding to intentions or applied
to states of aVairs. Meaning, under all these views, can be understood as the
conditions in which a certain expression holds of certain extralinguistic
entities. Of course, individual entities are unique, so we refer more accurately
to extralinguistic types which group together many individual instances. A
lexical expression is applied to an extralinguistic type in a process of abstrac-
tion. In accounting for this abstraction, semantic theories are guided by two
sometimes contradictory principles: generalize (or reduce polysemy) as
much as possible in order to increase the explanatory power of the theory;
and make distinctions (or increase polysemy) in order to account for as much
semantic detail as possible. Theories diVer in the degree of abstraction they
allow. Some postulate one sense where others postulate many, for very
similar data.
Three major approaches to semantics are represented in this volume: the
Classical approach (Goddard); the Prototypical approach (Fillmore and
Atkins); and the Relational approach (Fellbaum).

1.5 The classical theory of meaning
Three principles of the classical theory of definition bear on the problem of
polysemy: (1) senses are represented as sets of necessary and sufficient
conditions, which fully capture the conceptual content conveyed by words; (2)
there are as many distinct senses for a word as there are differences in these
conditions; and (3) senses can be represented independently of the context in
which they occur.
According to the classical, or Aristotelian, view of conceptual representation,
an individual entity is a member of a conceptual category. For example,
Fido is a dog, if Fido possesses a set of necessary conditions (or defining
features) and a set of sufficient conditions (or core properties) that define
the category dog. A few assumptions are traditionally made under this approach
(Howes 1990): conceptual categories consist of feature lists connected by
logical operators, such as conjunction and disjunction. The categories are
arranged in a hierarchy, where concepts on the same level of the hierarchy
inherit and share the core properties of the higher concepts, but have defining
features that are mutually exclusive. What this means is that dogs and cats,
for example, share the core properties of mammals, but do not share any of
the canine or feline specific defining features. An object either does or does
not belong to a conceptual category. Thus, an animal either is or is not
categorized conceptually as a mammal, even though in reality there may exist
difficult cases, such as the platypus. But difficult cases do not necessarily
weaken the conceptual structure, if they can be arbitrarily assigned to one of
the relevant categories.
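This all-or-nothing model lends itself to a direct sketch. The hierarchy, the feature lists, and the entity description below are illustrative assumptions; the point is that membership is binary and core properties are inherited down the hierarchy.

```python
# Sketch of the classical model: a category is a necessary-and-sufficient
# feature set, membership is all-or-nothing, and subcategories inherit the
# core properties of their parent. All entries are illustrative.

HIERARCHY = {"mammal": None, "dog": "mammal", "cat": "mammal"}
FEATURES = {
    "mammal": {"has_fur", "bears_live_young"},   # core properties
    "dog": {"barks"},                            # canine defining feature
    "cat": {"meows"},                            # feline defining feature
}

def all_features(category):
    """Union of a category's own features and those inherited from ancestors."""
    feats = set()
    while category is not None:
        feats |= FEATURES[category]
        category = HIERARCHY[category]
    return feats

def is_member(entity_features, category):
    """Classical test: the entity must possess every feature; no degrees."""
    return all_features(category) <= set(entity_features)

fido = {"has_fur", "bears_live_young", "barks"}
print(is_member(fido, "dog"))   # True
print(is_member(fido, "cat"))   # False: lacks the feline defining feature
```

A borderline case like the platypus would simply be assigned, by fiat, the feature set of one category or the other; nothing in the model expresses "partial" membership.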
It is not the Aristotelian notion of necessary and sufficient features which causes
trouble in semantic analysis; it is the tacit behaviorist assumption that the necessary
and sufficient features should correspond to measurable, objectively ascertainable
aspects of external reality. (Wierzbicka 1996)

A modern linguistic embodiment of the classical approach is found in the
rationalist and intensionalist theory of semantics developed by Katz and
Fodor (1963) and later refined in Katz (1972). Working within the framework
of Chomsky's Standard Theory of the 1960s, Katz and Fodor's aim was to
articulate the principles of a universal semantic theory that would explain `the
system of internalized rules that constitute the native speaker's linguistic
competence' (Katz 1972: p. xxii). For semantics, this meant the rules that
relate linguistic structures to their meaning.
Katz (1972) opens with a discussion of `what is meaning'. Responding to
Quine, who denies meaning an ontological status that is independent from
that of reference, Katz claims that a semantic theory need not define the
ontological status of meaning, just as a theory of physics is not required to
define matter. Rather, the goal of a semantic theory should be to provide a
general scheme for representing meaning. The scheme provides mechanisms
for defining how the logical form (or meaning) of propositions (or sentences)
is determined compositionally from the senses of its components and the
syntactic relations that hold among these components.
A semantic theory provides a semantic representation for each sense of
each lexical item. A lexical item, roughly a word, is a minimal grammatical
unit (a morpheme or idiom) that bears content. Sense representations are
listed in a dictionary, where ambiguous words have more than one sense
representation.
The semantic representations of senses are themselves composed of smaller
semantic units, called semantic markers, roughly corresponding to the
classical core and defining properties. Although Katz does not give a precise
ontological definition, he describes them as reflecting the conceptual parts,
or component concepts, that make up a particular sense. For example, object
and physical are semantic markers in the representation of one sense of chair
below:
(Object), (Physical), (Non-living), (Artifact), (Furniture), (Portable), (Something
with legs), (Something with a back), (Something with a seat), (Seat for one)
Semantic markers are not necessarily primitive semantic concepts. They
can be further decomposed into component semantic markers, so that this
representation of chair includes under-specified components, such as object.
Although, in principle, Katz believed that a full and finite vocabulary of
primitive semantic markers would be available once the dictionary work was
complete, in practice semantic representations were often left under-specified
and an inventory was never supplied.
Semantic markers enable inferencing. For example, from the sentence
There is a chair in the room, we can infer that there is a physical object in the
room, since all the semantic markers in the representation of physical object
are included in the representation of chair. This corresponds to the classical
hierarchy of conceptual classes governed by the inheritance of core properties.
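Inference by marker inclusion, and the relative similarity measure discussed next, can both be sketched as set operations. The marker sets below are abbreviated and partly invented for illustration; only the chair markers come from the text.

```python
# Sketch of Katzian semantic markers as sets: inference by inclusion
# ("a chair is a physical object"), similarity by marker overlap.
# Marker inventories are abbreviated and illustrative.

SENSES = {
    "chair":           {"Object", "Physical", "Non-living", "Artifact", "Furniture"},
    "ball":            {"Object", "Physical"},
    "thought":         {"Abstract"},
    "physical object": {"Object", "Physical"},
}

def entails(word, hypernym):
    """All markers of the hypernym's sense are included in the word's sense."""
    return SENSES[hypernym] <= SENSES[word]

def similarity(w1, w2):
    """A relative measure: proportion of shared markers."""
    a, b = SENSES[w1], SENSES[w2]
    return len(a & b) / len(a | b)

print(entails("chair", "physical object"))  # True: licenses the inference
print(similarity("chair", "ball") > similarity("chair", "thought"))  # True
```

On this picture synonymy is the limiting case (identical sets), and any non-empty set difference, however small, yields a distinct sense.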
Related senses, whether of the same word or of different words, share
common semantic markers. For example, some of the representations of
chair, ball, and shadow share the semantic marker object, which differentiates
them from the senses of thought or togetherness, which do not include this
marker. Semantic similarity is therefore a relative measure, dependent on
the extent of shared semantic markers. Synonymy is defined as complete
sharing: a relation that holds between words that have one (or more)
identical sense representations. Ambiguity holds whenever a word has
more than one representation. Even a single difference in the presence or
absence of a semantic marker is enough to distinguish one sense from
another. In his treatment of ambiguity, Katz does not distinguish polysemy
from homonymy, as it is not a goal of the semantic theory to explain the
origin of sense differences: whether, for example, they stem from the same
etymology or not. Similarly, Katz's theory does not explain regular polysemy,
the phenomenon under which a lexical item with one sense representation
acquires another representation that differs from the first in predictable
ways.

1.6 Polysemy within the classical approach
The omission to define polysemy and to explain regular polysemy, sometimes
referred to as productive or systematic polysemy, should be interpreted as an
oversight by Katz and Fodor rather than as an inherent defect of their
approach, as other classical approaches account for it quite naturally.
Apresjan (1974) defines polysemy as the similarity in the representations of
two or more senses of a word:
The definition does not require that there be a common part for all the meanings of a
polysemantic word; it is enough that each of the meanings be linked with at least one
other meaning.
He then points to cases in which a word is systematically ambiguous in
context. One such case is when the word exhibits more than one intension
which holds of the same referent, as in I was put into quarantine. In Russian,
this sentence is ambiguous between a sense of quarantine as action and its
sense as place. English sentences like the construction is complete convey a
similar ambiguity between the product and the action that caused it. Apresjan
views these and other examples as instances of regular polysemy:
Polysemy of the word A with the meanings ai and aj is called regular if, in the given
language, there exists at least one other word B with the meanings bi and bj, which are
semantically distinguished from each other in exactly the same way as ai and aj and if ai
and bi and aj and bj are non-synonymous.

Regular polysemy is governed by processes which are productive, rule-governed,
and predictable, very much like processes of word formation. One such
process is metonymical transfer, responsible for creating senses such as foot in
the foot of the mountain. Another is the systematic relation between words
denoting vessels and the quantity that the vessel holds, such as spoon, the
utensil, and spoon meaning spoonful, as in a spoon of sugar.
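The productive character of such rules can be sketched as a lexical operation applied across the lexicon. The entries and the sense labels below are illustrative assumptions; the sketch shows only that the derived sense is predictable from the source sense, not listed per word.

```python
# Sketch of Apresjan-style regular polysemy as a productive rule: any word
# with a 'vessel' sense systematically acquires a 'quantity held' sense
# (spoon/spoonful, cup/cupful). Lexicon entries are illustrative.

LEXICON = {
    "spoon": {"vessel"},
    "cup":   {"vessel"},
    "table": {"furniture"},
}

def apply_vessel_to_quantity(lexicon):
    """Add the derived sense wherever the source sense is present."""
    for word, senses in lexicon.items():
        if "vessel" in senses:
            senses.add("quantity held by the vessel")
    return lexicon

apply_vessel_to_quantity(LEXICON)
# 'spoon' and 'cup' gain the quantity sense; 'table' is unchanged.
```

Metonymical transfers like foot of the mountain would be further rules of the same kind, each mapping one sense class onto a related one.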
While the phenomenon of regular polysemy is not a major problem for the
classical approach, sense distinctions may be. If the classical approach
postulates new senses with every conceptual difference, there is a danger of an
infinite proliferation of senses. This was pointed out early on by Weinreich
(1966), for example, who warns that the number of senses of a simple word
like eat may be infinite, as eating involves the use of a spoon in the case of
eating soup but the use of a fork and the action of cutting in the case of eating
meat. Katz objects that these differences are not inherent in the meaning of
eat and are only differences in the instantiation of the concept of eating in
different circumstances:
Various activities that can correctly be called eating may differ in the ways they are
carried out, as Weinreich suggests. They may be performed with spoons, fingers,
chopsticks, knives, shovels or whatever strikes one's fancy, but, nonetheless, they are
instances of eating in the same sense of this term. The fundamental point is that,
insofar as eating applies to each activity with exactly the same sense, they are equivalent
activities. Meaning must be an abstraction from the variable features of the things
referred to by the term: the meaning of a word must represent only the invariant
features by virtue of which something is a thing, situation, activity, event or whatever
of a given type. Otherwise no word could ever be used again with the same meaning
with which it is used at any one time, since there is always some difference in what is
referred to from one time to the next. (Katz 1972)
This statement is important since it allows some abstraction, or indeterminacy,
in the semantic representation of a word even within the classical
approach. As we will see, other, non-classical approaches carry this notion
of indeterminacy much further than what is allowed by Katz. But the
introduction of indeterminacy leaves us with the question of how to decide
which features of a situation form integral parts of the concept and which are
merely inconsequential variations. Why is the eating implement inconsequential
and not part of the meaning of eat, while the action of swallowing is?
Without a convincing argument for not distinguishing the sense of eating
soup from the sense of eating meat, it is hard to justify a principled distinction
between the sense of eating as ingesting food and the sense of eating as when
acid corrodes metal.
An attempt to investigate this issue further within Katz's theory is found in
Ravin (1990). While inspecting the semantic content of several event verbs,
Ravin concludes that there are no clear criteria for which aspects of a real-world
situation are relevant to the semantics of a particular verb, but there is
a methodology for determining which aspects ought to be semantically
represented. The methodology is based on the idea that semantic representations
ought to account for all the semantic properties and relations of linguistic
expressions. For example, paraphrases, such as break into pieces and
shatter, which have the same meaning, should have identical representations.
This requirement puts constraints on the representations of the constituents
break, into, and pieces. Similarly, semantic differences should be reflected. If
break and break into pieces differ in meaning, the representation of break
should reflect this by not specifying the degree of detachment of the resulting
pieces. The specification of complete detachment occurs when the semantic
representation of into pieces is merged in.
Working within the classical framework of the Natural Semantic Metalanguage,
Wierzbicka endorses a similar methodology for determining how
polysemous a word is:
We must proceed by trial and error, assuming always, to begin with, that there is only
one meaning, constructing a tentative definition, checking it against a word's possible
range of use, then, if necessary, positing a second meaning and so on. (Wierzbicka
1996: 242)
She too appeals to semantic intuitions for determining whether the semantic
representation of a word (in her theory, the natural language paraphrase of
its meaning) can be substituted for the word in all of its uses. For example,
love has the following definition (Goddard in this volume):
X loves Y =
X often thinks about Y
X thinks good things about Y
X wants to do good things for Y
X wants good things to happen to Y
when X thinks about Y, X often wants to be with Y
when X thinks about Y, X often feels something good
This definition can be substituted for the word in the context of romantic
love, paternal love, or brotherly love, since it represents an invariant meaning
of love. The other semantic differences among these three contexts are due to
the presence of different modifiers (e.g., platonic), which alter the meaning of
the compositional expression.
While it is true that such methodological constraints help create more
systematic representations, there is still room for uncertainty, because the
semantic intuitions of native speakers are often difficult to ascertain and
because there are no reliable objective tests, as pointed out by Geeraerts.
Whether two linguistic expressions are indeed paraphrases is itself debatable,
as discussed by Dowty in this volume.
The level of detail included in semantic representations depends on what
the semantic theory wants to account for. If, as in Katz's framework, the
theory should explain why certain compositional representations are
anomalous (e.g., *break the dust), selection restrictions need to be encapsulated
in the meaning of the head to disallow invalid combinations. If the theory has to
account for all differences in meaning between words, minute semantic details
need to be recorded. For example, to account for the difference between break
and tear, the suddenness of the event or the rigidity of the object needs to be
specified.
From among the theoretical approaches presented in this volume,
Wierzbicka's theory of Natural Semantic Metalanguage comes closest in spirit to
Katz's, although there are significant differences. Like Katz, Wierzbicka
holds that word senses decompose into a finite number of universal semantic
primitives, called `primes'. Unlike primitive markers, though, primes are
words in natural language itself. In fact, primes make up that subset of words
in every human language that encapsulate innate prelinguistic meanings and
that are readily translatable into other languages. Wierzbicka and her
colleagues have been developing the vocabulary of primes (there are about 50 of
them now; Wierzbicka 1996) and forming definitions of non-primes over
many years, across a variety of languages and semantic domains.
Semantic representations of non-primes consist of combinations of primes,
cast as natural language definitions. Like Katz's, they are required to be
explicit and complete and to predict all the semantic relations that the lexical
item participates in. For example, Goddard shows how the representation of
send somebody includes semantic components such as the intention of the
person being sent and the speech act performed by the person doing the
sending, to differentiate send from other verbs that mean roughly cause to
go. The main difference is that, since Goddard's representations are natural
language paraphrases, they can be more easily substituted than semantic
markers in contexts where the word appears. This is important for his
methodology of reductive paraphrase.
To conclude, the questions that Katz's theory raises about the goals of a
semantic theory and the nature of lexical semantic representations are critical
to all the approaches presented here. In particular, the hypothesis that lexical
meaning is reducible to a finite set of primitives is held by Goddard but
strongly contested by the other contributors. The assumption that the full
semantic content of a word can be captured in its lexical representation is also
debated, with Goddard postulating that the lexicon can contain the invariant
meaning of words and Pustejovsky postulating that lexical representations
are under-specified. Finally, the strong classical assumption that any
semantic difference results in the addition of a new sense is challenged by the
Prototypical theories, such as Fillmore and Atkins's.
There are also practical implications to these assumptions. As Fillmore and
Atkins discuss, lexicographers are required to cover as many uses and senses of
words as they believe exist, in the form of discrete word senses. At the same
time they work under severe space constraints, and therefore choose to omit
senses if they can be derived from others by general principles.

1.7 The prototypical approach
Much twentieth-century philosophy of language can be seen as an attack on
the classical view of words as having distinct meanings and of definitions as
composed of necessary and sufficient conditions. Wittgenstein (1958) writes:
The idea that in order to get clear about the meaning of a general term one had to find
the common element in all its applications has shackled philosophical investigation, for
it has not only led to no result, but also made the philosopher dismiss as irrelevant the
concrete cases, which alone could have helped him to understand the usage of the
general term.
In a well-known discussion of the meaning of the word game, Wittgenstein
(1953) examines board games, card games, ball games, and Olympic games
and fails to find an element that is common to all. Instead he finds several
elements (amusement, competition, skill, luck, rules) which occur in various
combinations whenever the word game is used, depending on the context
in which it appears. Wittgenstein concludes that categories have fuzzy
boundaries and meanings exhibit family resemblance, with common elements that
overlap and criss-cross in the same way that traits do in families.
In psychology, categorization by family resemblance was introduced by
Rosch and her colleagues in the 1970s. Rosch (1977) demonstrated that
people do not categorize objects on the basis of necessary and sufficient
conditions but rather on the basis of resemblance of the objects to a
prototypical member of the category.
Following the classical approach to categorization, the prototypical
approach continues to respect the existence of a concept hierarchy: a dog is
a mammal, an animal, a living thing. But membership in a conceptual category
is a matter of degree. Each category is represented by a prototype (an
actual member or a conceptual construct) that best exhibits the features of the
category and so is close to the ideal category definition of the classical theory.
In fact, Rosch proposed two prototypical models: one in which a single
prototype possesses the largest number of characteristic features, and the
other where several prototypes exist, each possessing a different set of
characteristic features, not necessarily resembling one another. This second model
was adopted by linguists to handle polysemy, and we return to it below.
In various experiments Rosch demonstrated that prototypes are central to
human thinking. For example, people construct a prototype when none is
available, as when they are presented with various exemplars of an unfamiliar
category. The prototype they define is not arbitrary, but consistent across
individuals and cultures. When asked to choose an exemplar of a category,
people tend to choose prototypical ones. In her famous study of the Dani
people, who have only two colour categories (one including all light, warm
colours and the other all dark, cool colours) Rosch found that the Dani
tended to point to prototypical red, yellow, and white as examples of the
first category, and to prototypical blue, green, and black as examples of the
other.
Best exemplars of a category resemble the prototype more than poor
exemplars do. Rosch demonstrated that people take less time to verify
statements about category membership (X is a Y) when the exemplars are closer to
the prototype and more time when they are not. The degree of similarity to
the prototype is computed differently under different theories. A simple
measure is the number of features in common with the prototype. Thus robins
are better exemplars of birds than penguins, since they can also fly and sing.
Some theories count the number of features not shared with the prototype as
well. But the number of features alone can be misleading, because some
features are intuitively more important in determining category membership
than others. For example, laying eggs is more important than size or appearance,
in order to exclude bats from the bird category and still include penguins.
Other prototypical proposals assign different weights to features.
Giving more weight to important features makes them more crucial in
determining membership. But, as pointed out in Howes (1990), differentiating
important from unimportant features reintroduces a problem encountered
by the classical approach in determining the necessary conditions for a
category. Indeed, in order to determine the important features that govern
membership, the category concept often needs to be defined first. To quote
Taylor (1989):
attributes of the Bird prototype, such as the presence of feathers, wings, and a beak, the
building of nests, and the laying of eggs, would appear prima facie to require, for their
characterization, a prior understanding of what birds are.
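The weighted-feature proposal can be sketched as a simple graded-membership score. The feature lists and weights below are illustrative assumptions, not Rosch's data; the sketch shows how heavy weights on features like laying eggs keep penguins in the category while keeping bats out.

```python
# Sketch of graded membership: typicality as a weighted share of the
# prototype's features that an exemplar possesses. Weights are illustrative.

BIRD_PROTOTYPE = {
    "lays_eggs": 3.0,      # heavily weighted: excludes bats
    "has_feathers": 3.0,   # heavily weighted: keeps penguins in
    "flies": 1.0,
    "sings": 1.0,
    "small": 0.5,
}

def typicality(features, prototype=BIRD_PROTOTYPE):
    """Weighted proportion of prototype features the exemplar has (0 to 1)."""
    total = sum(prototype.values())
    score = sum(w for f, w in prototype.items() if f in features)
    return score / total

robin   = {"lays_eggs", "has_feathers", "flies", "sings", "small"}
penguin = {"lays_eggs", "has_feathers"}
bat     = {"flies", "small"}
# typicality(robin) > typicality(penguin) > typicality(bat)
```

The unweighted "count of shared features" measure is the special case where every weight is 1, which is exactly where the bat/penguin problem arises.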
Crucial features also determine whether the category has clear boundaries.
Poor exemplars of cats that share many common features with dogs are still
not members of any other mammal category. This is because the features they
share with other categories are not crucial. Most modern prototypical
theories accept the classical assumption of clear boundaries for at least some
categories, such as natural kinds and scientific or technical concepts, for example
the legal definition of who is an adult. In contrast to these, categories like
Wittgenstein's game have no clear boundaries, so that marginal cases may or
may not be categorized as games, depending again on context. Labov (1973)
shows similar results in his categorization experiments: as the shape of good
exemplars of cups is altered to become more and more shallow and wide, they
are often categorized as bowls.

1.8 Polysemy within the prototypical approach


The prototypical approach has been adopted by many linguists to explain the
meaning of words. Charles Fillmore, George Lakoff, Dirk Geeraerts, and John
Taylor, and to a certain extent Pustejovsky, share many assumptions that are
directly in contrast with the classical view. (For an extensive overview of the
different positions, see Taylor 1989.) While classical approaches have an
affinity with philosophy and logic, prototypical approaches have an affinity
with psychology. While the classical theories emphasize definitions (either of
meaning or of semantic properties and relations) and relate meaning to truth
conditions, possible worlds, and states of affairs, prototypical approaches
emphasize meaning as part of a larger cognitive system and relate it to mental
representations, cognitive models, and bodily experiences.
Fillmore (1982) provides one of the earliest accounts of the different types
of prototypical meanings. In addition to prototypical concepts with clear
boundaries, Fillmore discusses cases where a word's meaning is composed of
a disjunction of compatible conditions. When all of them hold, the use of the
word is most prototypical. When none hold, the use of the word is
anomalous. And when only some hold, the use of the word is not central. An
example is the meaning of climb, which is a disjunction of ascending and
clambering. Jackendoff (1985) elaborates further, finding Bill climbed up the
mountain to be more prototypical, or stereotypical, being +upward and
+clambering, than either Bill climbed down the mountain or the snake
climbed up the tree. Neither feature is defined as necessary or sufficient by
itself, but the disjunction is implicitly defined as necessary, to exclude
sentences like *the train climbed down the mountain (−upward; −clambering).
Similarly, Coleman and Kay (1981) propose three semantic elements in the
meaning of the verb lie: the falsehood of the reported statement; the speaker's
belief in the falsehood of the statement; and the speaker's intention to deceive.
Prototypicality degrades when not all three hold.
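Coleman and Kay's finding can be sketched as a graded score over the three conditions. The condition names and the simple fraction below are illustrative; their experiment elicited speaker judgements rather than computing a formula.

```python
# Sketch of Coleman and Kay's graded 'lie': prototypicality degrades with
# the number of the three semantic elements that hold, rather than being
# all-or-nothing. The scoring scheme is illustrative.

CONDITIONS = ("statement_false", "speaker_believes_false", "intent_to_deceive")

def lie_prototypicality(case):
    """Fraction of the three conditions satisfied; 1.0 is the central case."""
    return sum(bool(case.get(c, False)) for c in CONDITIONS) / len(CONDITIONS)

outright_lie = {"statement_false": True, "speaker_believes_false": True,
                "intent_to_deceive": True}
honest_mistake = {"statement_false": True}   # false, but believed and sincere
print(lie_prototypicality(outright_lie))     # 1.0
print(lie_prototypicality(honest_mistake))   # a marginal case, below 0.5
```

A classical definition would have to call the honest mistake either a lie or not a lie; the graded score captures speakers' intermediate judgements.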
Another kind of prototypicality discussed by Fillmore is one where a
certain condition included in the meaning is more privileged, or basic, than
the others. Long, for example, is more prototypically used in the spatial sense,
and only by extension in the duration sense. Meaning extension is a central
theme in the prototypical approach to semantics, as we shall see below.
Finally, Fillmore mentions meanings which are composed of a set of
necessary and sufficient conditions that are interpreted against a background
of knowledge. Prototypicality characterizes the degree to which the
situation in the world, or our understanding of it, fits the assumptions that
form part of the idealized concept. He cites bachelor as an example. The
classical definition of an unmarried adult male fits best only when other
assumptions about society are true: male participants in long-term unmarried
couplings would not ordinarily be described as bachelors; a boy abandoned
in the jungle and grown to maturity away from contact with human
society would not be called a bachelor; John Paul II is not properly thought of
as a bachelor.
Labov also touches on the importance of background assumptions when
he points out that the distinction between cups and bowls is partly based on
them: marginal objects are more readily categorized as bowls if they contained
mashed potatoes, and as cups if they contained coffee. Thus the conventions
of what food is served in a cup or a bowl play a role in determining its
typicality.
Lakoff (1987) adds another type of prototypical concept: the cluster
concept, itself made up of several simpler categories, or cognitive models,
to use Lakoff's term. For example, the meaning of mother is made up of the
following models: the birth model, defining the mother as the person who
gives birth; the genetic model, defining her as the female who contributes
genetically to the child's makeup; the nurturance model, defining her as the
female adult who raises the child; the marital model, as the female adult
married to the father; and the genealogical model, as the closest female
ancestor. Lakoff concludes:
There need be no necessary and sufficient conditions for motherhood shared by
normal biological mothers, donor mothers (who donate an egg), surrogate mothers
(who bear the child, but may not have donated the egg), adoptive mothers, unwed
mothers who give their children up for adoption and stepmothers. They are all
mothers by virtue of their relation to the ideal case, where the models converge.
That ideal case is one of the many kinds of cases that give rise to prototype effects.
Without mentioning the word polysemy, Lakoff discusses the range of meaning
a word can have as the result of the process of meaning extension.
Mother, for example, forms a radial conceptual model: it has a central
category where all of the models discussed above converge, and then more
marginal categories, its meaning extensions, such as surrogate mother,
adoptive mother, or stepmother, that are linked to its central meaning along various
dimensions. The meaning extensions of radial concepts are not generated
from the prototypical concept by rules, but rather by convention. They must
be learned. But they are not random either. They are motivated by two
general principles: metaphor and metonymy.
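A radial category can be pictured as a small labelled graph: one central sense, with listed (not rule-generated) extensions, each annotated with the model or principle that motivates it. The sense names and link labels below are illustrative, loosely following the mother example.

```python
# Sketch of a radial category as a labelled graph: a central sense plus
# conventional extensions, each with a motivating link. Entries are
# illustrative, loosely following Lakoff's 'mother' example.

RADIAL_MOTHER = {
    "central": "mother (all models converge)",
    "extensions": [
        ("stepmother", "marital model only"),
        ("adoptive mother", "nurturance model only"),
        ("surrogate mother", "birth model only"),
    ],
}

def linked_senses(category):
    """Return the marginal senses reachable from the central case."""
    return [sense for sense, _link in category["extensions"]]

print(linked_senses(RADIAL_MOTHER))
```

The key design point mirrored here is that the extensions are stored, not computed: they are conventional and must be learned, while the link labels record why each extension is motivated rather than random.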
Metaphors are mappings from a model in one domain `to a corresponding
structure in another domain. The CONDUIT metaphor for communication
maps our knowledge about conveying objects in containers onto an
understanding of communication as conveying ideas in the words'. The metonymic
mechanism is similar to metaphor except that it maps according to a specific
function. `In a model that represents a part-whole structure, there may be
a function from a part to the whole that enables the part to stand for the
whole.'
Lakoff views meaning extensions as part of a deeper cognitive organization.
Take, for example, the container metaphor. It is based on a complex
container schema. On its most basic level, the container schema is experienced
physically: we experience our bodies both as containers (of air, food, etc.)
and as things in containers (existing in rooms). The container schema has a
certain semantic (and cognitive) logic: things are either in or out of a
container; the relation of containment is transitive (if A contains B and B
contains C, then A contains C). Finally, the schema also gives rise to various
metaphors, or meaning extensions: the visual field is seen as a container, with
things coming in and out of sight; and so are personal relations (trapped in a
marriage).
The fact that the basic elements of a schema are grounded in bodily
experiences or in other psychologically basic-level categories, as defined by
Rosch, is another aspect in which the prototypical approaches differ from the
classical ones: the classical assumption of a basis of primitive concepts which
combine to yield more complex meanings is replaced by a basis of
psychologically motivated concepts that cannot be further decomposed because they
are directly experienced as gestalts. (One cannot talk of the inside and outside
of a container as separate from the concept of the container itself.)
Taylor (1989) elaborates more directly on the nature of polysemy. `If
different uses of a lexical item require, for their explication, reference to two
different domains, or two different sets of domains, this is a strong indication
that the lexical item in question is polysemous. School, which can be
understood against a number of alternative domains (the education of children, the
administrative structure of a university, etc.) is a case in point.' And further
on: `polysemous categories exhibit a number of more or less discrete, though
related meanings, clustering in a family resemblance category'. Thus, Taylor
adds another type of prototypical category: one that is complex, like mother,
but is not radial in that it does not have a central meaning. Such is the
meaning of over, for example, as discussed by Taylor, based on Lakoff and
earlier work by Brugman. Over can express a static relation of being vertical
while not in contact with the point of reference (the lamp is over the table);
or vertical and not in contact with the reference but dynamic (the plane flew
over the city). Walk over the street is dynamic but involves contact; walk over
the hill is similar but defines the shape of the path, and so on.
18 Polysemy
As Taylor aptly points out, in the absence of constraints, meanings can be
infinitely chained via family resemblance, so that everything ends up associated
with everything else. Taylor rejects any absolute constraint. As the
analysis of climb shows, one category can contain contradictory elements –
the motion in climbing can be either up or down. As the analysis of over
shows, a category can encroach upon the semantic content of other categories:
over sometimes means the same as beyond, across or on the other
side. He therefore agrees with Lakoff that, rather than looking for constraints
on meaning extensions, we should look for tendencies and regularities.
If it is not possible to state absolute constraints on the content of family resemblance
categories it might none the less be the case that certain kinds of meaning extension are
more frequent, more typical and more natural, than others. In other words, we should
be looking for recurrent processes of meaning extension, both within and across
languages, rather than attempting to formulate prohibitions on possible meaning
extensions. (Taylor 1995: 21)
Fillmore and Atkins provide such an analysis of meaning extensions and how
they relate to one or more core meanings in their discussion of the polysemous
verb crawl in both English and French. Loosely based on Lakoff's
model of radial categories, they stipulate one or more core literal senses at
the center of a concept network. Each of the literal senses can be extended by
general productive mechanisms to form derived senses. Which mechanisms
apply, however, seems to be idiosyncratic across words and languages. In
their example, ramper in French extends to mean stalk and creep while its
equivalent in English, crawl, does not. Crawl extends to describe the way
babies move, which in French is rendered by marcher à quatre pattes.
Pustejovsky takes a somewhat different approach: he explicates meaning
extensions in terms of a set of generative rules, such as coercion, that are
triggered by the context in which a word is used.
1.9 The relational approach
In relational models of the lexicon, words are organized according to their
meanings using rich semantic relations or links to form a semantic network.
Like prototypical models, the relational model works with semantic domains.
In addition, it `attempts to make explicit the structural organization that is
implicit in other models, and describes how the elements of a domain are
related to each other' (Evens 1988). Ideally, in a relational model of the
lexicon, knowing the meaning of a word is knowing the word's location in
the semantic space of the lexicon.
Synonymy and antonymy are perhaps the most basic relations. Synonymy
can be defined to hold between words when one can be substituted for the
other in a context without changing the meaning of the phrase. For example,
frigid can be substituted for freezing in the phrase it is freezing outside without
changing its meaning. In the case of antonymy, such a substitution causes the
opposite meaning, as in `my gloves are wet/dry'. Another basic relation that
holds between nouns is hypernymy, also called the superordinate relation, or
the IS A relation. For example, the superordinate or hypernym of the noun
dog could be canine or animal, meaning that a dog IS A canine or that a dog IS
A animal. The concepts become more and more general as one travels up a
hypernym chain. The inverse relation is hyponymy, which is a subordinate
relation. Hyponyms of dog include all the various kinds of dogs, such as
dalmatian and poodle. Fellbaum (1990) has introduced a similar relation
that holds between verbs, that of troponymy. Whereas the organizing relation
between nouns is an IS A relation, the organizing relation between
verbs is a manner relation; for example, ambling and strolling are manners of
walking.
The kinds of relation that are typically encoded in relational semantic
networks are best exemplified by those found using word association tests.
There is remarkable agreement on word associations across people, and these
associations reflect specific relations. For example, when 182 people were
asked to respond to smooth, 22 per cent responded with the synonym soft,
19.8 per cent responded with the antonym rough, and 6 per cent responded
with silk, which has smoothness as a property (Keppel and Strand 1970). Similarly,
when asked to respond to animal, 22.5 per cent responded with the hyponym
dog. However, when analysing these data, it is not always possible to define
what relation has been retrieved. When the probe word was army, three
people responded no. It is, perhaps, possible to recognize that there is a
relation, but defining it is not always easy.
Determining precisely what the organizing relations are in the mental
lexicon, or how many there are, remains controversial; there is little agreement
on these issues. The number and type of posited relations varies widely.
WordNet (Fellbaum 1998) is a widely used online relational lexicon that is
freely available. Developed by George Miller and his colleagues at Princeton
University, WordNet was originally conceived of as an experiment. Tired of
semantic theories that were based on a handful of words, Miller wanted to
test the limits of a relational lexicon by seeing how much of the language it
was possible to include (Miller 1998). Almost fifteen years later, WordNet
contains nearly 100,000 concepts in its semantic network.
WordNet's basic unit is a set of synonyms, a syn-set, which represents a
concept in the semantic space associated with a definitional gloss. For example,
the syn-set {plant, flora, plant life} represents the concept that is
shared by the three words. Synonym sets are related to each other with a
small set of pointers or labeled arcs, each representing a semantic relation.
Nouns, verbs, adjectives and adverbs all use synonymy and antonymy.
Nouns in WordNet are also related by hypernymy/hyponymy and by
meronymy, the part/whole relation. For example, meronyms of the synonym
set {car, auto, automobile, motorcar} include bumper and air bag. For verbs,
WordNet encodes troponyms, the manner relation, and entailment relations;
for example, snoring entails sleeping. Adjectives have a similarity relation for
words that, although not synonyms, have meanings that are semantically
close, for example freezing and cold or wet and damp. WordNet also has two
derivational relations that point from one syntactic category to another:
that between a relational adjective and the noun from which it is derived
(cultural pertains to culture) and that between an adverb and the adjective
from which it is derived (usually is derived from usual).
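The organization just described can be sketched in a few lines of code. This is only an illustrative toy model, not the actual WordNet implementation, and the synsets, glosses, and arcs below are a hypothetical fragment:

```python
# Toy model of WordNet-style synsets linked by labeled arcs.
# The entries are an invented fragment, not real WordNet data.

class Synset:
    def __init__(self, words, gloss):
        self.words = set(words)   # the synonym set itself
        self.gloss = gloss        # definitional gloss for the concept
        self.arcs = {}            # relation name -> list of target Synsets

    def relate(self, relation, target):
        self.arcs.setdefault(relation, []).append(target)

plant = Synset({"plant", "flora", "plant life"},
               "a living organism lacking the power of locomotion")
tree = Synset({"tree"}, "a tall perennial woody plant")
car = Synset({"car", "auto", "automobile", "motorcar"},
             "a motor vehicle with four wheels")
bumper = Synset({"bumper"}, "a mechanical device that absorbs impact")

tree.relate("hypernym", plant)   # a tree IS A plant
plant.relate("hyponym", tree)    # the inverse arc
car.relate("meronym", bumper)    # a bumper is a part of a car
```

Because relations are labeled arcs between whole synsets rather than between individual words, every synonym in a set shares the set's relations.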
Other relational lexicons that have been developed use many more relations.
Ahlswede and Evens (1988) posit many hundreds of relations. These
include spatial relations such as above, morphological relations such as
plural, and grammatical relations such as subject of a verb. Dolan, Vanderwende,
and Richardson (this volume) describe how they built a relational
lexicon, MindNet, automatically. MindNet's relations include hypernym,
modifier, purpose, and object.
The semantic relations that are used in relational models can be grouped
into what de Saussure called syntagmatic relations and paradigmatic relations.
Syntagmatically related words are those that co-occur frequently. Recall that silk
was associated with the adjective smooth. This is a syntagmatic relation since
smooth and silk frequently co-occur in phrases like as smooth as silk. Syntagmatically
related words are often the components of a collocation, as in
smooth talker, or represent selectional preferences, such as water in drink
water. Paradigmatically related words are those that appear in similar
contexts, primarily because they represent similar concepts. Soft is paradigmatically
related to smooth since either of the synonyms soft and smooth can
fill in the blank in as ___ as silk.
Relational models like that of Ahlswede and Evens make extensive use of
both syntagmatic and paradigmatic relations. WordNet, on the other hand,
encodes paradigmatic relations only. It gives us no information about syntagmatic
relations. Miller and Leacock (this volume) explore how corpus statistics
can be used to supplement the WordNet lexicon with syntagmatic
information.
Relational approaches maintain the classical division into distinct senses
for polysemous words, but they do not decompose the meaning of concepts.
Even though words are treated as non-decomposable atomic units, relational
models make much use of classical feature inheritance.
Relational lexicons are ideal for inferencing, especially when the relations
are transitive. For example, in WordNet, the hypernym of dog is canine, and
the hypernym of canine is carnivore. Since hypernymy is a transitive relation,
it then follows that a dog is a carnivore. Additional properties can be inherited
from WordNet's definitional glosses. The gloss for canine includes the fact
that canines are `fissiped mammals with nonretractile claws'. Again, carnivore's
gloss tells us that carnivores are `flesh-eating mammals'. Thus we can
infer that a dog is a fissiped mammal that has nonretractile claws and eats flesh.
By traveling up the hypernym tree, a wealth of information about dogs is
quickly collected.
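This inheritance of gloss properties along a transitive hypernym chain can be sketched as follows; the chain and glosses are a hand-copied fragment for illustration, not queried from the WordNet database:

```python
# Walk a hypernym chain upward, collecting each ancestor's gloss.
# The dictionaries are an illustrative fragment, not real WordNet lookups.

hypernym = {"dog": "canine", "canine": "carnivore", "carnivore": "mammal"}
gloss = {
    "canine": "fissiped mammals with nonretractile claws",
    "carnivore": "flesh-eating mammals",
}

def inherited_facts(word):
    """Return the glosses of every ancestor reachable via hypernymy."""
    facts = []
    while word in hypernym:
        word = hypernym[word]
        if word in gloss:
            facts.append(gloss[word])
    return facts

# A dog inherits both the canine gloss and the carnivore gloss.
print(inherited_facts("dog"))
```

The loop terminates when a node has no recorded hypernym, so the traversal collects exactly the properties that transitivity licenses.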
1.10 Polysemy within the relational approach
Representing regular polysemy within a relational framework is problematic.
Word senses that exhibit regular polysemy can be very distant from each
other in the semantic network's conceptual space. As an example, the noun
ash has three senses in WordNet: one appears in the hierarchy as a plant
material (as in an ash baseball bat) and another as a woody plant (as in
the ash tree). These senses are quite distant, and their semantic relation cannot
be determined through proximity in the network. How can a relational
network be used both to discover sense pairs exhibiting regular polysemy
and, having done so, to indicate that there is regular polysemy between two
senses but not a third sense, as in the sense of ash meaning the residue from
a fire?
Mel'čuk (1988) uses indices to indicate words exhibiting both regular
polysemy and non-regular polysemy. WordNet represents regular polysemy
of nouns by using a similarity relation (Miller and Zholkovsky 1998). The
basic idea, first proposed by Philip Johnson-Laird, is that since an ash tree
and its wood bear a regular semantic relation, as do all trees and their wood,
these nodes can be identified high up in the noun hierarchy. That is, by
identifying the wood node and the tree node as bearing a semantic relation,
each tree will be associated with its wood in the WordNet display.
Working with verbs, Fellbaum (this volume) takes advantage of WordNet's
troponym relation to discover non-regular polysemous verb senses. She
identifies a class of autotroponyms: verb senses that have been conflated with
their complement meanings. For example, consider the sentences:
The fish smells good.
The fish smells.
In the second sentence, the sense of the verb smells has been conflated with
that of a particular concept, bad.
smell (emit an odour; `the soup smells good')
→ smell (smell bad; `He rarely washes, and he smells')
→ reek, stink
Fellbaum represents this kind of polysemy as superordinate and subordinate
senses, where the subordinate sense has a more specific meaning which
includes the adjectival element.
1.11 Regular polysemy
Fellbaum focuses on non-regular polysemy, in contrast to several other
papers in this collection, which discuss words that are polysemous in a
systematic, predictive fashion. Nouns like city alternate their meaning
between an administrative entity or unit, the group of people living within
the unit's borders, and the people who govern it. Similarly, nouns like book
alternate between the physical object and its content. In fact, as Cruse (this
volume) points out, both meanings can be active simultaneously, as in I'm
going to buy John a book for his birthday, similar to the behaviour of verbs like
climb, which Jackendoff and others analyse as containing a disjunction of
semantic features. This simultaneity entails that the two `meanings' (the
physical object and its content) are not disjoint, but rather components of a
single word sense. The challenge for semantic theories lies in formulating
mechanisms to trigger the relevant meaning components in the appropriate
contexts. Pustejovsky (1995) accounts for such alternations, as between the
physical object and its content, as follows. The definition of book contains
three arguments: one for the physical object (x), another for the content (y),
and a third for a combination of the two, referred to as a dotted argument and
written as x.y. Qualia, or aspects of the meaning of book, determine the
relationships these arguments can have with each other or with other semantic
components in their context. The formal quale specifies that x holds y. The
telic quale specifies the purpose and function of a book, being read by an
agent, which applies to the combined concept, that is to (x.y).
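The three-argument analysis of book can be rendered schematically as a small data structure. This is only a rough sketch of the idea, with invented field names and string notation; it is not Pustejovsky's formalism or an implementation of the Generative Lexicon:

```python
# Schematic rendering of the dotted-argument analysis of 'book'.
# Field names and the predicate notation are invented for illustration.
from dataclasses import dataclass

@dataclass
class Quale:
    role: str        # 'formal', 'telic', ...
    relation: str    # the predicate the quale contributes
    applies_to: str  # which argument it constrains

book = {
    "args": {
        "x": "physical object",
        "y": "content",
        "x.y": "dotted combination of x and y",
    },
    "qualia": [
        Quale("formal", "hold(x, y)", "x"),         # the object holds the content
        Quale("telic", "read(agent, x.y)", "x.y"),  # purpose applies to the dot
    ],
}
```

The point of the sketch is that the telic quale is stated over the dotted argument, so reading a book involves the physical object and its content at once.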
How regular are these alternations? As indicated by several discussions in
this volume, the alternations seem to follow trends. Some correlate with
semantic classes. We mentioned trees and their woods. Words similar to
book, in that they have content, show similar alternations. Nominals that
describe an action, such as construction, co-operation, separation, often describe
its result too. A different trend is suggested by Dowty, who advances
syntactic structures as an explanatory principle for alternations in meaning.
Dowty shows that semantic differences correlate systematically with known
syntactic alternations, such as the swarm alternation (as in Bees swarm in the
garden/The garden swarms with bees). The two syntactic forms, one with an
agent subject, the other with a locative subject, exhibit a variety of semantic
differences. Dowty catalogs these differences and proposes a systematic relation
that holds among them. The syntactic properties of the locative structure
form a set of devices for conveying a specific meaning: the locative subject
turns the location into the topic of discourse while the predicate assigns some
abstract property to it, which is further reinforced by the use of indefinite
plurals and mass terms. Dowty's discussion is unique in this volume in its
emphasis on syntax as a means for realizing different aspects of the meaning
of words.
In contrast, Fellbaum warns against such generalizations of polysemy. She
emphasizes the unpredictability of polysemous verbs. After acknowledging the
regularities discussed by Pustejovsky and Dowty, she explores lexicalized
polysemy that is independent of context, syntactic realizations or semantic
class membership, and that therefore cannot be generalized across the lexical
items that exhibit it. Fillmore and Atkins agree. While they discuss general
principles of extension, such as metonymy, or the extension from a feeling
(sad as in The person is sad) to something evoking this feeling (as in a sad day),
they caution against formulating conditions under which these general mechanisms
apply. As their contrastive analysis of English and French demonstrates,
there is no predictive principle to determine when general mechanisms
apply to extend certain meanings. Apart from the general mechanisms, there
are many specific polysemy extensions, such as the one deriving the meaning
of closeness to the ground from the English crawl, or the one deriving the
concept of spreading from the French ramper. Similar specificity can be found
in the application of Pustejovsky's privative coercion operation, which turns
the meaning of life into death, but only when it functions as the complement
of risk.
1.12 Polysemy from a computational point of view
Computer applications that handle the content of natural language texts need
to come to terms with polysemy. The problem is not a new one to computational
linguistics. Ever since the surge of research on machine translation in
the 1950s and 1960s, the problem of word sense identification has been a
major stumbling block in natural language processing.
The study of polysemy in computational linguistics addresses the problem
of how to map expressions to their intended meanings automatically. Computers
have the same resource for sense identification as we do: the context.
However, computers are handicapped because they can only interpret the
context as strings of letters, words or sounds, and not as meanings. One
direction taken by researchers is to try to harvest machine-readable dictionaries
for lexicographic knowledge of different senses of polysemous words.
Another approach is to try to solve the mapping problem by simulating
human understanding, using statistical procedures to capture patterns of co-occurrences
of words in context.
Bar-Hillel, an early enthusiast for fully automatic machine translation,
concluded that the task is futile because automatic sense identification is
not possible (Bar-Hillel 1964). He asserts that, although it is a trivial matter
for an English speaker to assign the appropriate sense of pen (enclosure rather
than writing implement) in the box is in the pen, no computer program can do
so. We can disambiguate pen because our world knowledge includes information
regarding the relative sizes of toy boxes, writing implements, and play
pens. Bar-Hillel concludes that the solution to the problem would involve a
complete characterization of world knowledge, which is unbounded.
Katz and Fodor (1963) published a theory of semantic interpretation that
posits structures that are contained in the mental lexicon, yet independent of
world knowledge. Kelly and Stone (1975) used the Katz–Fodor semantic
theory as the basis for creating algorithms for automatic sense disambiguation.
Over a period of seven years in the early 1970s, Kelly and Stone (and
some 30 students) hand-coded algorithms, sets of ordered rules, for disambiguating
671 words, after studying a set of concordance lines containing the
target words. An obvious problem with the Kelly–Stone approach is the
amount of work involved. A second, more fundamental shortcoming, Kelly
and Stone conclude, is that the Katz–Fodor theory is not sufficiently rich to
account for the productivity of human language.
Despite the earlier warnings about the impossibility of automatic sense
resolution without the characterization of world knowledge, there is a resurgence
of interest in automatic sense identification, most notably with the aid
of machine-readable dictionaries and statistical analyses of large textual
corpora. In 1998, Computational Linguistics devoted a special issue to word
sense disambiguation (Ide and Véronis 1998). This renewed effort will test
Bar-Hillel's predictions in light of the enormous computational capabilities
that are available to us today, as compared to what was available in the early
1960s.
How is polysemy defined in computational terms? The optimal degree of
abstraction over multiple meanings of words is often determined by the
computational goal: in machine translation, there are distinct senses if there
exists a lexical choice in the target language; for example, know as in know a
person is translated into connaître in French, whereas know as in know a fact is
translated into savoir. Sometimes the degree of abstraction is determined by
the limits of what can be accomplished with current technology. In large-scale
information retrieval, for example, polysemy is only one aspect of a much
larger ambiguity problem, which includes syntactic ambiguity (the verb table
versus the noun table) and homonymy (river bank versus bank as a financial
institution).
In the late 1980s, machine-readable dictionaries (or MRDs) were proposed
for sense disambiguation. Lesk (1986) devised a simple method to link dictionary
definitions if they share words in common. For example, in the
dictionary used by Lesk, pine has two major senses and cone has three. The
phrase pine cone is therefore six-way ambiguous. In order to disambiguate
it, a system must choose from the six possible combinations. Lesk identified
the relevant definitions of pine and cone to be the ones which contained
the same words, in this case, the ones which contained both evergreen and
tree. In general, Lesk's algorithm chooses senses containing overlapping
words.
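A minimal version of Lesk's overlap idea can be sketched as follows. The definitions below are paraphrased stand-ins, not the entries from the dictionary Lesk actually used:

```python
# Simplified Lesk: choose the pair of senses whose definitions share
# the most words. The definitions are invented stand-ins.
from itertools import product

PINE = {
    "tree": "kinds of evergreen tree with needle-shaped leaves",
    "grieve": "waste away through sorrow or illness",
}
CONE = {
    "solid": "solid body which narrows to a point",
    "fruit": "fruit of certain evergreen trees",
    "shape": "something of this shape whether solid or hollow",
}

def overlap(def1, def2):
    """Count the words the two definitions have in common."""
    return len(set(def1.split()) & set(def2.split()))

def lesk(senses1, senses2):
    """Return the (sense1, sense2) pair with maximal definition overlap."""
    return max(product(senses1, senses2),
               key=lambda pair: overlap(senses1[pair[0]], senses2[pair[1]]))

print(lesk(PINE, CONE))   # -> ('tree', 'fruit')
```

For pine cone this picks the pair of definitions sharing evergreen, resolving the six-way ambiguity in one pass over the sense combinations.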
However, when dictionary definitions are terse, overlap of defining words
is unlikely, and other clues in dictionaries are sought. Walker (1987), for
example, uses overlap of subject codes that are associated with words in
Longman's Dictionary of Contemporary English. The relevant subject code
for a piece of text is the one associated with most of the words in the text. The
definitions chosen are those associated with that subject code in the dictionary.
One method for expanding the number of dictionary definitions that
can be used was developed by Guthrie et al. (1991), who compute neighbourhoods
for polysemous words. Neighbourhoods are built by searching for co-occurrences
in definitions and sample sentences of all words sharing the same
subject code. Wilks et al. (1993) provide a detailed account of how machine-readable
dictionaries can be used to identify senses of polysemous words. In
this volume, Stevenson and Wilks describe a combination of methods for
identifying word senses using MRDs.
More recently, corpus-based disambiguation methods have been developed.
As machine-readable corpora containing many millions of words
became available, large-scale analysis of the words that co-occur with polysemous
words was undertaken (Black 1988; Zernik 1991; Hearst 1991; Gale et
al. 1992), and corpus-based word sense identification, i.e. statistical corpus
analysis of co-occurrence patterns, became mainstream in computational
linguistics. Sense identification systems train on example sentences representing
each sense of a word; the number of examples can range from 50 to
hundreds of sentences. The sets of examples are used to create a model for
each word sense. Once the models have been created, the system chooses the
most suitable model for a given novel usage of the word. The suitability of the
model is computed on the basis of a similarity measure between the features
of the model and those of the context of the novel occurrence.
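The train-and-match scheme described above can be sketched with bag-of-words models and cosine similarity. The training sentences, sense labels, and similarity measure are invented for illustration; real systems use far more training examples and richer features:

```python
# Sketch of corpus-based sense identification: build a bag-of-words
# "model" per sense from example sentences, then pick the model most
# similar to a novel context. Training data is invented.
import math
from collections import Counter

def vector(sentences):
    """Bag-of-words frequency vector over a list of sentences."""
    return Counter(w for s in sentences for w in s.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values())) *
            math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

models = {
    "pen/writing": vector(["he signed the letter with his pen",
                           "the pen ran out of ink"]),
    "pen/enclosure": vector(["the sheep were kept in a pen",
                             "the child played in the play pen"]),
}

def identify(context):
    """Choose the sense whose model is most similar to the context."""
    return max(models, key=lambda s: cosine(models[s], vector([context])))

print(identify("she filled the pen with ink"))   # -> 'pen/writing'
```

The similarity score here is just the cosine between word-frequency vectors, a stand-in for the feature-based measures the text alludes to.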
Miller and Leacock characterize the kinds of contextual information that a
computer can be expected to extract from a corpus. They look at local
context, the open and closed class words that are near or adjacent to a
polysemous word, and at topical context, substantive words that co-occur
with a word sense in a discussion of a given topic. Automatic sense identification
systems that make explicit use of topical and local context include Towell
and Voorhees (1998) and Leacock et al. (1998).
Adam Kilgarriff (1997) organized Senseval, a conference to compare and
evaluate state-of-the-art corpus-based sense disambiguation systems. Eighteen
participants disambiguated 35 polysemous words using the same training and
testing materials. Their results were sent back to Kilgarriff and his colleagues,
who reported them at a workshop in Herstmonceux Castle in the summer of
1998. The Senseval results show that we have reached the point that Bar-Hillel
called the `80 percent fallacy' (Bar-Hillel 1964). The thrust of this
fallacy is that if 80 per cent of a problem is solved, in this case, 80 per cent
of the polysemous words are correctly disambiguated, it does not follow that
the remaining 20 per cent can be resolved by a 20 per cent increase in research.
On the contrary, Bar-Hillel argues that solving the final 20 per cent entails
much more work than was required to achieve the initial 80 per cent accuracy.
Another problematic aspect of this type of corpus-based approach is what
Gale et al. (1992) aptly call the `knowledge acquisition bottleneck'. In order to
get materials for training a program on senses of a particular polysemous
word, the corpus of contexts containing that word has to be manually partitioned
into its different senses. Aside from being slow and costly, the work put
into manual tagging does not scale up: the partitioning required for one
word will be of no use in disambiguating any other word and will not decrease
the amount of manual effort required.
Schütze (this volume) takes a different approach to corpus-based sense
identification. His central concern is to design a sense identification algorithm
that is both psychologically plausible and generally applicable. Instead of
assigning a polysemous word to a discrete sense, Schütze clusters uses of the
word that share similar contexts, and then defines these clusters as word
senses.
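The clustering idea can be illustrated with a toy greedy procedure over bags of content words. The contexts, stop-list, and threshold are invented, and Schütze's actual method works in high-dimensional vector spaces rather than by raw word overlap:

```python
# Toy context clustering: group occurrences of an ambiguous word by
# shared content words, then treat each resulting cluster as a 'sense'.
# Data, stop-list, and threshold are invented for illustration.

STOP = {"the", "a", "at", "on", "of"}

def content(sentence):
    """The sentence's words minus a small stop-list."""
    return set(sentence.split()) - STOP

def cluster(contexts, threshold=2):
    clusters = []
    for ctx in contexts:
        for group in clusters:
            # join the first cluster sharing enough content words
            if any(len(content(ctx) & content(member)) >= threshold
                   for member in group):
                group.append(ctx)
                break
        else:
            clusters.append([ctx])   # otherwise start a new cluster
    return clusters

occurrences = [
    "deposit money at the bank branch",
    "the bank raised interest on money",
    "fishing on the bank of the river",
]
print(len(cluster(occurrences)))   # two clusters -> two induced senses
```

No sense inventory is given in advance: the two clusters that emerge for bank are themselves taken to be its senses, which is the point of the approach.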
Dolan, Vanderwende, and Richardson (this volume) combine corpus-based
and MRD approaches. Like Schütze, they use a similarity measure
on the context of words for determining whether words are close in meaning.
But the contexts they use are very specialized: they are links in a semantic
network, called MindNet, which is obtained from the analysis of MRD
definitions. MindNet does not attempt to classify individual word senses;
instead it recognizes concepts that are similar to those in novel usages.
Schütze's and Dolan et al.'s approaches can be very useful for information
retrieval, where the task is to match the query context to similar contexts in
the database of documents, but it is not obvious how such approaches would
be used in an application like machine translation without extensive manual
encoding.
In computational linguistics, it is easier to automatically identify senses of
homonyms than it is to identify senses of polysemes. For example, it is easier
to automatically distinguish between bat the mammal and the bat used in
baseball than it is to distinguish between the bat used in baseball and that
used in cricket. This is because the contexts of homonyms will consist of quite
different vocabularies, whereas the contexts of polysemes may be quite similar.
Whether the distinction between homonyms and polysemes is important
depends, again, on the task at hand. Information retrieval, where several
relevant documents are presented to a user to choose from, may be a more
forgiving environment than automatic translation.
The field of computational linguistics is rapidly expanding in two important
directions. Collections of online texts are increasing in size, and with them
the demand for large-scale processing. Large-scale research is specified as an
important goal of many research projects: the Text Retrieval Conference
(TREC), for example, the annual forum for the evaluation of information
retrieval systems, sponsored by the National Institute of Standards and
Technology (Harman 1995), has required participants to process a collection
of 2–3 gigabytes of text. More recently, it has announced a new track in
which participants are asked to process 20 gigabytes. The other direction of
expansion is in the application of computers to new domains. Publishers and
computer companies are experimenting with dictionaries and encyclopedias
that exist online. Libraries are redefining themselves as digital, offering their
patrons novel ways to use their resources. More recently, methods are needed
to navigate, search, and retrieve relevant data from the rapidly increasing
volume of multilingual textual information available on the Internet. These
developments create an enormous new demand on computational text understanding,
and on the more specific problem of associating words with their
intended meanings.
References
Ahlswede, T. E., and Evens, M. W. (1988), `A lexicon for a medical expert system', in
Evens (1988).
Apresjan, D. (1974), `Regular polysemy', Linguistics, 142: 5–32.
Atkins, B. T., and Levin, B. (1991), `Admitting impediments', in U. Zernik (ed.),
Lexical Acquisition: Exploiting On-line Resources to Build a Lexicon. Hillsdale, NJ:
Lawrence Erlbaum.
Bar-Hillel, Y. (1964), `A demonstration of the nonfeasibility of fully automatic high
quality machine translation', in Language and Information: Selected Essays on Their
Theory and Application. Reading, Mass.: Addison-Wesley.
Black, E. W. (1988), `An experiment in computational discrimination of English word
senses', IBM Journal of Research and Development, 32(2): 185–94.
Byrd, R. J., Calzolari, N., Chodorow, M. S., Klavans, J. L., Neff, M. S., and Rizk,
O. A. (1987), `Tools and methods for computational lexicology', Computational
Linguistics, 13(3–4): 219–40. Also available as IBM RC 12642, IBM T. J. Watson
Research Center, Yorktown Heights, NY.
Coleman, L., and Kay, P. (1981), `Prototype semantics: the English verb lie',
Language, 57(1).
Cruse, D. A. (1986), Lexical Semantics. Cambridge: Cambridge University Press.
Evens, M. W. (1988), Relational Models of the Lexicon: Representing Knowledge in
Semantic Networks. Cambridge: Cambridge University Press.
Fellbaum, C. (1990), `English verbs as a semantic net', International Journal of
Lexicography, 3.
—— (1998), WordNet: A Lexical Reference System and Its Application. Cambridge,
Mass.: MIT Press.
Fillmore, C. J. (1982), `Towards a descriptive framework for spatial deixis', in
R. J. Jarvella and W. Klein (eds.), Speech, Place and Action. New York:
Wiley.
—— and Atkins, B. T. S. (1992), `Towards a frame-based lexicon: the semantics of
risk and its neighbors', in A. Lehrer and E. Feder Kittay (eds.), Frames, Fields, and
Contrasts: New Essays in Semantic and Lexical Organization. Hillsdale, NJ:
Lawrence Erlbaum.
Gale, W., Church, K. W., and Yarowsky, D. (1992), `A method for disambiguating
word senses in a large corpus', Computers and the Humanities, 26: 415–39.
Geeraerts, D. (1993), `Vagueness's puzzles, polysemy's vagaries', Cognitive
Linguistics, 4(3): 223–72.
—— (1994), `Polysemy', in R. E. Asher and J. M. Y. Simpson (eds.), The Encyclopedia
of Language and Linguistics. Oxford and New York: Pergamon.
Guthrie, J. A., Guthrie, L., Wilks, Y., and Aidinejad, H. (1991), `Subject-dependent
co-occurrence and word sense disambiguation', in Proceedings of the 29th Annual
Meeting of the Association for Computational Linguistics. Berkeley, Calif.: ACL.
Harman, D. K. (1995), `Overview of the third text retrieval conference (TREC-3)'.
Technical Report Special Publication 500–225. Washington, DC: National Institute
of Standards and Technology.
Hearst, M. A. (1991), `Noun homograph disambiguation using local context in large text
corpora', in Annual Conference of the UW Centre for the New OED and Text
Research: Using Corpora. Oxford: UW Centre for the New OED and Text Research.
Howes, M. M. (1990), The Psychology of Human Cognition. New York: Pergamon.
Ide, N., and Véronis, J. (1998), `Introduction to the special issue on word sense
disambiguation: the state of the art', Computational Linguistics, 24(1): 1–40.
Jackendoff, R. S. (1985), `Multiple subcategorization and the theta-criterion: the case
of climb', Natural Language and Linguistic Theory, 3: 271–95.
Katz, J. J. (1972), Semantic Theory. New York: Harper & Row.
—— Leacock, C., and Ravin, Y. (1985), `A decompositional approach to modification',
in E. LePore and B. McLaughlin (eds.), Action and Events. Oxford: Blackwell.
—— and Fodor, J. A. (1963), `The structure of a semantic theory', Language, 39: 170–210.
Kelly, E. and Stone, P. (1975), Computer Recognition of English Word Senses.
Amsterdam: North-Holland.
Keppel, G., and Strand, B. Z. (1970), `Free-association responses to the primary
purposes and other responses selected from the Palermo-Jenkins norms', in L.
Postman and G. Keppel (eds.), Norms of Word Association. New York: Academic
Press.
Kilgarriff, A. (1997), `Evaluating word sense disambiguation programs: progress
report'. Brighton: Information Technology Research Institute.
Labov, W. (1973), `The boundaries of words and their meanings', in C.-J. N. Bailey
and R. W. Shuy (eds.), New Ways of Analyzing Variation in English. Washington,
DC: Georgetown University Press.
Lakoff, G. (1987), Women, Fire, and Dangerous Things. Chicago: University of
Chicago Press.
Leacock, C., Chodorow, M., and Miller, G. A. (1998), `Using corpus statistics and
WordNet relations for sense identification', Computational Linguistics, 24(1): 47–65.
—— Towell, G., and Voorhees, E. M. (1993), `Corpus-based statistical sense resolution', in Proceedings of the ARPA Workshop on Human Language Technology. San Francisco: Morgan Kaufmann.
—— (1996), `Towards building contextual representations of word senses using statistical models', in B. Boguraev and J. Pustejovsky (eds.), Corpus Processing for Lexical Acquisition. Cambridge, Mass.: MIT Press.
Lesk, M. (1986), `Automatic sense disambiguation: how to tell a pine cone from an ice
cream cone', in Proceedings of the 1986 SIGDOC Conference. New York: Associa-
tion for Computing Machinery.
Levin, B. (1995), English Verb Classes and Alternations: A Preliminary Investigation.
Chicago: University of Chicago Press.
Mel'čuk, I. and Zholkovsky, A. (1988), `The explanatory combinatorial dictionary', in M. W. Evens (ed.), Relational Models of the Lexicon: Representing Knowledge in Semantic Networks. Cambridge: Cambridge University Press.
Miller, G. A. (1990), `WordNet: an on-line lexical database', International Journal of
Lexicography, 3(4).
—— (1998), `Nouns in WordNet', in C. Fellbaum (ed.), WordNet: A Lexical Reference System and Its Application. Cambridge, Mass.: MIT Press.
Pustejovsky, J. (ed.) (1993), Semantics and the Lexicon. Dordrecht: Kluwer.
—— (1995), The Generative Lexicon. Cambridge, Mass.: MIT Press.
Quine, W. V. (1960), Word and Object. Cambridge, Mass.: MIT Press.
Ravin, Y. (1990), Lexical Semantics without Thematic Roles. Oxford: Oxford Uni-
versity Press.
Robins, R. H. (1967), A Short History of Linguistics. Bloomington: Indiana University
Press.
Rosch, E. (1977), `Human categorization', in N. Warren (ed.), Advances in Cross-Cultural Psychology, vol. 7. London: Academic Press.
Taylor, J. R. (1989), Linguistic Categorization: Prototypes in Linguistic Theory. Oxford: Oxford University Press.
Towell, G., and Voorhees, E. M. (1998), `Disambiguating highly ambiguous words',
Computational Linguistics, 24(1): 125–46.
Walker, D. E. (1987), `Knowledge resource tools for accessing large text files', in S. Nirenburg (ed.), Machine Translation. Cambridge: Cambridge University Press.
Weinreich, U. (1966), `Explorations in semantic theory', in T. A. Sebeok (ed.),
Current Trends in Linguistics, iii. The Hague: Mouton.
Wierzbicka, A. (1990), `Prototypes save: on the uses and abuses of the notion of
``prototypes'' in linguistics and related fields', in S. L. Tsohatzidis (ed.), Meanings
and Prototypes: Studies in Linguistic Categorization. London: Routledge & Kegan
Paul.
—— (1996), Semantics: Primes and Universals. Oxford: Oxford University Press.
Wilks, Y., Fass, D., Guo, C.-M., McDonald, J. E., Plate, T., and Slator, B. M.
(1993), `Machine tractable dictionary tools', in J. Pustejovsky (ed.), Semantics and
the Lexicon. Dordrecht: Kluwer.
Wittgenstein, L. (1953), Philosophical Investigations. Oxford: Basil Blackwell &
Mott.
—— (1958), The Blue and Brown Books. Oxford: Basil Blackwell & Mott.
Zernik, U. (1991), `Train1 vs. Train2: tagging word senses in corpus', in U. Zernik
(ed.), Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon, Hills-
dale, NJ: Lawrence Erlbaum.