Professional Documents
Culture Documents
1 Introduction
All comprehensive linguistic theories must confront the nature of the rela-
tionship between linguistic forms and the functions they perform. To what
extent is this relationship constrained and where do the constraints come
from? At the level of grammar, form-function relationships are apparently
constrained in part by compositionality, and a range of linguistic or cogni-
tive universals governing grammatical mappings have also been proposed.
At the level of the lexicon, associations between form and function have, at
least since Saussure (1966), been assumed to be mostly unconstrained. That
is, the word form expressing a particular meaning in a particular language
could be any phonotactically legal sequence of phonological segments and
the convention for each such relationship must simply be memorized by
learners of the language. It is in this sense that the form-function relation-
ship in language is said to be mainly “arbitrary”.
At the same time, it has often been recognized that within restricted re-
gions of the lexicon, complete arbitrariness fails to hold. This is most obvi-
ous for onomatopoeia (e.g., pitter patter, moo), but it has also been attested
for other categories of words in a wide range of spoken and signed lan-
guages (Hinton, Nichols & Ohala, 1994; Taub, 2001). In all of these cases,
the forms of words are said to somehow “suggest” their meanings, or the
meanings are said to somehow “motivate” the word forms. The general
term used for these phenomena is iconicity. Despite increased interest in
iconicity in recent years, two fundamental questions remain unanswered:
1. What empirical means do we have for identifying a particular word
as iconic or arbitrary?
2. How is it that the lexicon can be both arbitrary and iconic?
A possible answer to the second question is suggested by a completely
different body of research, the study of lexical access and word recognition
within psycholinguistics. Here it is recognized that the similarity of a word
form to other forms plays a role in the speed at which the word is accessed
during comprehension or production (Luce & Pisoni, 1998). Similarly, re-
search on semantic speech errors during production (e.g., Fromkin, 1971)
shows that words with similar meanings may interfere with one another.
Since phonological similarity and semantic similarity both have an effect on
language processing, it is worth considering whether the relationship be-
tween the two types of similarity also plays a role and whether communica-
tive considerations could govern the sorts of form-meaning relationships
that are favored or disfavored in languages.
In this paper, we provide a preliminary method of measuring the iconic-
ity of words. This method relies on a new, formal definition of iconicity,
one that distinguishes between what we call absolute iconicity, based on the
direct relationship between form and meaning, and relative iconicity, based
on the relationship between distance between forms and distance between
meanings. We also propose an account of how arbitrariness and iconicity
might be favored for different parts of the lexicon. We describe a prelimi-
nary study using Japanese and Tamil expressives, a lexical category report-
edly characterized by a high degree of iconicity (Childs, 1994).
2 Expressives
Many languages of the world, especially in Africa, South Asia, Southeast
Asia, and the Americas, have a lexical category referred to as expressives,
also sometimes as ideophones or mimetics (Childs, 1994). While there are
no necessary and sufficient conditions for defining this category cross-
linguistically, a number of phonological, morphological, syntactic, semantic,
and pragmatic properties characterize the expressive prototype. Expressive
forms tend to be made up of a limited set of phonemes and to exhibit redu-
plication. Syntactically, expressives are often used in combination with a
‘light’ verb such as say or do. Expressives tend to denote adverbial mean-
ings, and among the semantic categories often expressed by this category
are movements (of the body or of objects); physical states; sounds and
noises; speech patterns; sensations, emotions or mental states; personal ap-
3 / ICONICITY IN EXPRESSIVES: AN EMPIRICAL INVESTIGATION
Japanese Tamil
kirakira ‘glitter, twinkle’ labakk ‘suddenly’
giragira ‘glare strongly’ pacakk ‘suddenly’
girari ‘glare momentarily’ didikku ‘all of a sudden’
3 Types of Iconicity
We consider words to be associations of what we will call forms and mean-
ings. Both forms and meanings can be thought of as points in multi-
dimensional spaces, though we may not have direct access to what the set of
defining dimensions is. Given a meaning, a language user should be able to
assign a form to it: this is production. Given a form, a language user should
be able to assign a meaning to it: this is comprehension. Figure 1 illustrates
these two processes for the simple case of two form and two meaning di-
mensions.
Figure 1. Production and Comprehension. Form space (F) and Meaning space (M) are each
defined over only two dimensions. Each small square represents the form or meaning of a word.
In production, the speaker begins with a meaning (m1) and accesses a form (f1). In comprehen-
sion, the addressee begins with a form (f1) and accesses a meaning (m1).
We define absolute iconicity to be the property of a set of words for
which there is a similarity function that relates form and meaning (either in
whole or part). In absolute iconicity, there is a correlation between one or
more form dimensions and one or more meaning dimensions. The most
obvious example is onomatopoeia, for which forms are intended as imita-
tions of sounds in nature. Thus there appears to be a weak correlation be-
tween the vowel formants in conventional words for animal sounds (moo,
quack, cheep) and the perceived formants in the sounds made by the ani-
mals. Sign languages offer many other instances of absolute iconicity, for
example, the conventional use of the space in front of the signer for future
times and the space behind the signer for past times. Note that the form-
meaning function itself can vary from one language to another. Though we
know of no such languages, it is at least conceivable that a sign language
could assign the future to the space behind the signer and the past to the
space in front of the signer. The point is that the assignment applies consis-
tently across all of the words in the iconic set.
We define relative iconicity to be the property of a set of words for
which there is a correlation between form similarity and meaning similarity.
That is, rather than a similarity function relating meaning to form, relative
iconicity is based on separate similarity functions relating forms to forms
and meanings to meanings. For example, glow, gleam, glint, and glimmer
seem to exhibit relative iconicity because their similarity of form (gl- in the
salient initial position) corresponds to a similarity of meaning (‘having to do
with light of low intensity’) and because there are relatively few other Eng-
lish verbs that begin with gl- and do not share this meaning. Note that with
relative iconicity there need not be form-meaning similarity – this relation-
ship could be completely arbitrary – as long as similar forms have similar
meanings and similar meanings have similar forms. Note also that both
kinds of iconicity are properties of groups of words, absolute iconicity be-
cause it is defined with respect to a similarity function over a set of form-
meaning pairings and relative iconicity because it is defined with respect to
the similarity between pairs within a group of words. In particular it makes
no sense to say that a particular word is relatively iconic.
We define arbitrariness to be the absence of either absolute or relative
iconicity. In this paper we will be mainly concerned with arbitrariness in the
latter sense. For example, the words in the set apple, pear, peach, apricot,
plum, and cherry appear to exhibit arbitrariness in the sense of absence of
relative iconicity within the set since there is no obvious relationship be-
tween the similarity of forms and the prima facie similarity of associated
meanings. For example, peaches and apricots are quite similar (relative to
other objects or foods), but the forms of the words are as different as any
5 / ICONICITY IN EXPRESSIVES: AN EMPIRICAL INVESTIGATION
two pairs on this list. And the meanings of the only two nouns in the list
that begin with vowels (in both cases followed by /p/) appear no more simi-
lar than any other pair.
A special kind of relative iconicity is what we call anti-iconicity, where
expressions with similar meanings tend to have dissimilar forms. That is,
there is a negative correlation between form similarity and meaning similar-
ity. One of the goals of this paper is to demonstrate that anti-iconicity actu-
ally exists within some limited domains, but we illustrate here with a trivial
example. Consider these common nouns for carnivores: dog, wolf, fox, cat.
If we were to judge the similarities of the meanings of these words, relying
either on their zoological classification or on their gross appearance and
behavioral patterns1, we might find dogs and wolves most similar and foxes
more similar to dogs/wolves than they are to cats. But with respect to pho-
netic form, we find that cat and dog both consist of stop-vowel-stop se-
quences, whereas both wolf and fox consist of consonant-vowel-consonant-
fricative sequences. That is, it would appear that within this set of words,
form similarity correlates negatively with meaning similarity.
One problem with our informal examples of relative iconicity, arbitrari-
ness, and anti-iconicity is that they have relied on our intuitive impressions
of form and meaning similarity. For the definitions to be useful, we need
explicit means of measuring form similarity and meaning similarity. With
no constraints on what counts as similar, one could argue, for example, that
in our fruit example, the presence of /p/ in words for fruits in the list that do
not grow in clusters is evidence for iconicity within this list. We return to
this issue in Section 6 below, offering a formal measure of form similarity
and tentative approach to the trickier problem of meaning similarity.
In Figure 2, we abstractly illustrate absolute and relative iconicity and in
Figure 3, anti-iconicity and arbitrariness.
1 Note that these are not the only possible criteria for comparing the animals. On a scale of
tameness, for example, the similarities would clearly turn out differently.
Figure 2. Absolute and Relative Iconicity. The arrows indicate associations between the forms
and meanings of individual words. For the absolute iconicity example, there is a perfect corre-
lation between both pairs of F and M dimensions. For the relative iconicity example, we high-
light the connections for one cluster of words to show how similarity of form is associated with
similarity of meaning. In the plot on the right, each point represents the form and meaning
similarities for a single pair of words in the iconic domain.
Figure 3. Relative Anti-Iconicity and Arbitrariness. For the anti-iconic example, one cluster of
meanings is highlighted to show how its forms tend to be further apart than they would be by
chance. As in Figure 2, in the plots on the right, each point represents the form and meaning
similarities for a pair of words in the domain.
7 / ICONICITY IN EXPRESSIVES: AN EMPIRICAL INVESTIGATION
lexicon where one of the other of these situations would be more likely to
arise? We address this in the next section.
2 Although we don’t discuss this notion further in this paper, labels also play a role in judg-
ments of similarity (Sloutsky & Fisher, 2004), and anti-iconicity might cause the categories in
such domains to be perceived as more distinct than they would otherwise.
3 We make no claims about whether these English words are iconic by our definition of rela-
tive iconicity. We use them here only to illustrate the sorts of meanings that are conveyed by
expressives in languages such as Japanese and Tamil.
6 Method
Between nouns
Within nouns Within expressives
and expressives
Japanese .0621 .0920 .0556
Tamil .0427 .0372 .0280
7 Study 1: Iconicity and Anti-Iconicity in Broad Semantic
Categories
We started by isolating relatively broad semantic categories. A thesaurus is
organized in terms of a hierarchical ontology of categories, so, to the extent
that we trust this ontology to accurately reflect how people group concepts,
we can generalize results based on a thesaurus. For this study, we selected a
subset of the most general categories from the Japanese expressive thesau-
rus (Chang, 1990). These included, for example, EMOTION and MOTION.
For the Tamil expressives, we selected roughly comparable categories from
the list of expressives in the Tamil expressive dictionary (Sataasivam, 1966).
For the nouns, we selected several of the high-level categories that group
the words in the CDI, for example, ARTICLES OF CLOTHING and FOOD ITEMS.
Our results show that the Iconicity Coefficient was not significantly
different from 1.0 for any of the categories. That is, within broad categories
grouping expressives and concrete nouns, we found no tendency towards
iconicity or anti-iconicity. Note, however, that our hypotheses for the ad-
vantages of iconicity for expressives and anti-iconicity for concrete nouns
both depend on the degree of semantic similarity within the categories we
are examining. Perhaps the variation within broad categories such as
EMOTION and FOOD ITEMS is too great for these advantages to be reflected.
Given an unfamiliar expressive, there is presumably little advantage for the
hearer to guess that it refers to some emotion on the basis of its similarity to
other emotion words. The possibility of guessing that it refers to more spe-
cific emotions, such as anger or sadness, on the basis of its similarity to
words in those categories, would be more useful. Similarly, there is little
advantage for a misheard word for a food item to be less similar to any
other food word than it is to other nouns. A form for a fruit that could not
easily be confused with the names for other fruits, for example, would be
more useful.
In the next study, we therefore examined narrower semantic categories.
rather than only those from our original list of 300 expressives. For Japa-
nese these categories included ATTIRE, SLEEP, PAIN, EATING, COMPLAINT,
BURNING, SPLASHING, and STICKINESS; for Tamil they included ANGER,
QUICKNESS, BABBLING , LAUGHTER, EATING, SOFTNESS, and SUDDENNESS.
For the nouns, we informally selected categories roughly comparable in
specificity. We avoided categories in which many of the words were recent
borrowings or were morphologically complex because we felt these proper-
ties could affect the relative iconicity of the words (we return to these ef-
fects in the Conclusion section). Again, we made an attempt to include all
“basic” members of these categories; there were between 8 and 13 of these
in each case. The categories for both languages were FARM ANIMALS,
INTERNAL ORGANS, NATIVE MUSICAL INSTRUMENTS, BODILY EXCRETIONS,
SPICES, and WEATHER PHENOMENA .
To calculate the mean form similarity between words inside and outside
a given category, sb, we used all of the collected expressives and nouns in
each case. For these two languages we take these two categories to be repre-
sentative of the range of forms possible for mono-morphemic words (verbs
in both languages are always inflected). For each category, we wanted to
know whether its Iconicity Coefficient (IC) was significantly greater or less
than 1.0, whether the within-category form similarity differed from the
similarity of those words to the range of all words included in the sample.
We used a Monte Carlo simulation to assess the significance in each case.
This technique works by randomly picking words to place in hypothetical
meaning categories within the larger lexical categories (nouns or expres-
sives) and then using the distribution of the ICs of these categories to de-
termine how likely the ICs we actually obtained would have occurred by
chance.
Japanese Expressives
TIME **
SECURE ****
STICKY ****
SPLASHING ****
GLITTERING ****
BURNING *
ABRUPT STOP ****
FLOATING ****
CUTTING ****
EATING ***
COMPLAINT ****
ABSENT ***
ANXIETY *
TEARFULNESS ****
PAIN ****
SLEEP
EYE_EXPRESSION **
FAT ****
ATTIRE *
Tamil Expressives
ORNAMENT_NOISE ****
UNEVEN *
SUDDEN ****
SOFT *
HEALTHY
EAT **
HEARTBEAT **
LAUGH
BABBLE ****
QUICK
ANGRY ****
Japanese Nouns
WEATHER
SPICES
EXCRETIONS
MUS.INSTRUMENTS
ORGANS ****
FARM ANIMALS
Tamil Nouns
WEATHER *
SPICES
EXCRETIONS
MUS.INSTRUMENTS *
ORGANS
FARM ANIMALS
4 Asterisks indicate significance at the .05 (*), .01 (**), .001 (***) and .0001 (****) levels.
If a Bonferroni correction is made, the significance cutoff rises: bars marked with a single
asterisk will no longer be significant.
17 / ICONICITY IN EXPRESSIVES: AN EMPIRICAL INVESTIGATION
References
Chang, A. C. (1990). Gitaigo-giseigo bunrui youhou jiten. A thesaurus of Japanese
mimesis and onomatopoeia: Usage by categories. Tokyo: Daishuukan Shoten.
Childs, G. T. (1994). African ideophones. In Hinton, Nichols, & Ohala (Eds.),
Sound symbolism, pp. 179-204. Cambridge: Cambridge University Press.
Fenson, L., Dale, P.S., Reznick, J.S., Thal, D., Bates, E., Hartung, J.P., Pethick, S.,
& Reilly, J.S. (1993). The MacArthur Communicative Development Inventories:
User’s guide and technical manual. San Diego: Singular Publishing Group.
Fromkin, V. (1971). The non-anomalous nature of anomalous utterances. Language,
47, 27-52.
Gasser, M. (2004). The origins of arbitrariness in language. Proceedings of the An-
nual Conference of the Cognitive Science Society, 26.
Goldstone, R.L., Medin, D.L., & Halberstadt, J. (1997). Similarity in context. Mem-
ory & Cognition, 25 (2), 237-255.
Hinton, L., Nichols, J., & Ohala, J. (Eds.) (1994). Sound symbolism. Cambridge:
Cambridge University Press.
Luce, P.A. & Pisoni, D.B. (1998). Recognizing spoken words: The neighborhood
activation model. Ear & Hearing, 19, 1-36.
Ogura, T. & Watamaki, T. (1998). Research and development of early language
development inventory. Research project, Grant-in-aid for Scientific Research,
Kobe University (in Japanese).
Peirce, C. S. (1998). The essential Peirce: Selected philosophical writings: Volume
2. Bloomington, IN: Indiana University Press.
Sataasivam, M. (1966). Olikkurippakaraati (Dictionary of onomatopoeic expres-
sions). Madras: Paari Nilayam.
Saussure, F. de (1966). Course in general linguistics (tr. W. Baskin). New York:
McGraw-Hill.
Sethuraman, N. (ms.) Early language inventory: Tamil. Manuscript.
Sloutsky, V.M., & Fisher, A.V. (2004). Induction and categorization in young chil-
dren: a similarity-based model. Journal of Experimental Psychology: General,
133, 166-188.
Taub, S. F. (2001). Language from the body: Iconicity and metaphor in American
Sign Language. Cambridge: Cambridge University Press.