
Chapter 8: Spoken Word Recognition

Spoken word recognition is the study of how lexical representations are
accessed from phonological patterns in the speech signal.

What are words?

Words are a systematized assignment of characters that specify a certain meaning.
In printed text, words are the elements that are separated by spaces. Points to
consider:

- The status of polymorphemic words: plurals, compound nouns
- Passive vs. active vocabulary
- Vocabulary range: 20,000 to 75,000 words
- A very long word: [æntaɪdɪsəstæblɪʃmɛnteɹiənɪzm̩] (antidisestablishmentarianism)

❖ Research Methods:

The Gating Technique

The gating technique is a method for determining when listeners can accurately
interpret words in relation to information in the speech
stream. This involves presenting increasingly long fragments of speech and
measuring when listeners can interpret the speech appropriately. By presenting
spoken fragments and recording the listener's judgments for each fragment, the
shortest fragment that can be correctly identified defines the point at which there is
sufficient information in the speech for identification.
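
As a minimal sketch of how a gating result might be scored, the function below scans one listener's responses in order of increasing fragment length and returns the first gate at which the target was named. The gate durations, the responses and the name `identification_gate` are hypothetical illustrations, not data from any study.

```python
def identification_gate(responses, target):
    """Return the duration (ms) of the shortest fragment identified as `target`.

    `responses` is a list of (gate_ms, response) pairs. Returns None if the
    target was never reported at any gate.
    """
    for gate_ms, response in sorted(responses):
        if response == target:
            return gate_ms
    return None


# Hypothetical responses from one listener to growing fragments of "captain".
listener = [(50, "cat"), (100, "cap"), (150, "captive"), (200, "captain"), (250, "captain")]
gate = identification_gate(listener, "captain")
print(gate)
```

In a real experiment the identification point would normally be summarised over many listeners; this sketch only scores a single response series.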

Lexical Decision

The lexical decision method involves presenting a word on a computer screen and
asking the participant to indicate whether it is a real or nonsense word. The
researcher measures the participant's response time and accuracy, which reflect
how quickly the mental lexicon is accessed to determine whether the word has a
mental entry. Frequency effects are observed: more frequent words take less time
to access, suggesting that the lexicon is organized by individual word frequency.

Priming Technique

The priming technique is a method used to explore spoken language comprehension.
It involves presenting a target word (to be judged as real or nonsense) preceded by
a prime word. The effect of the prime on response latency (reaction time) is then
measured. A semantically related prime leads to faster response times, suggesting
that the mental lexicon is organized by semantic relatedness: a semantically
related word preceding the target makes the target easier to access. Priming
effects are also found for:

- Orthographically related primes
- Phonologically related primes
- Constituent morphemes in morphologically complex forms

Cross-modal Priming

Cross-modal priming is a research technique in which a spoken word is used as a
prime to activate related concepts while participants make a lexical decision about
a written target word. The technique is used to investigate how listeners interpret
ambiguous words. Participants listen to an ambiguous word such as "bug" in
different contexts and then decide whether a written target is a real word or not.
The aim is to determine whether hearing the prime "bug" in different contexts
speeds up lexical decisions for related words, such as "ant" or "spy". The
technique thus employs both auditory and visual modalities to study language
processing.

❖ The time course of Spoken Word recognition

The overall process of spoken word recognition can be broken down into neat
and manageable stages (Tyler & Frauenfelder, 1987):

- Pre-lexical analysis: the operations that are carried out on the speech input in
order to organize it into useful units;
- Contact: establishing links between the input and the stored forms of words;
- Activation: getting contacted words excited about the fact that they have
been contacted;
- Access: getting hold of the information about a word that is stored in the
mental lexicon (e.g. its meaning, grammatical category, etc.).

Cohort Model of Word Recognition

Access stage: Cohort Theory adopts the hypothesis that the listener retrieves a
set of words (a cohort) which match the evidence of the signal so far. Thus, on
hearing the phonetic string [kæp], the listener would retrieve CAP, CAPITAL,
CAPRICORN, CAPTURE, CAPTAIN, CAPTIVE, etc. as word candidates.

Selection stage: If the next sounds proved to be [t] and [ɪ], the cohort would
narrow to CAPTAIN, CAPTIVE and CAPTIVATE. Finally, the sound [n] would
mark a uniqueness point, where only one word match, CAPTAIN, was
possible. Therefore, one item is selected from the set.

Integration stage: The semantic and syntactic properties of the chosen word are
utilized.
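
The narrowing of the cohort in the access and selection stages can be sketched in a few lines. This is only an illustration under stated assumptions: the transcriptions in `LEXICON` are simplified guesses (one list element per phoneme), the lexicon is tiny, and the function name `cohort` is made up for the example.

```python
# Simplified, assumed transcriptions; each phoneme is one list element.
LEXICON = {
    "CAP": "k æ p".split(),
    "CAPITAL": "k æ p ɪ t ə l".split(),
    "CAPRICORN": "k æ p r ɪ k ɔː n".split(),
    "CAPTURE": "k æ p tʃ ə".split(),
    "CAPTAIN": "k æ p t ɪ n".split(),
    "CAPTIVE": "k æ p t ɪ v".split(),
    "CAPTIVATE": "k æ p t ɪ v eɪ t".split(),
}


def cohort(heard, lexicon=LEXICON):
    """Return the words whose first phonemes equal the phonemes heard so far."""
    n = len(heard)
    return sorted(word for word, phones in lexicon.items() if phones[:n] == heard)


# Feed the input one phoneme at a time and watch the cohort shrink.
heard = []
for phoneme in "k æ p t ɪ n".split():
    heard.append(phoneme)
    print(" ".join(heard), "->", cohort(heard))
```

With this toy lexicon the cohort holds all seven words after [kæp], three words after [kæptɪ], and only CAPTAIN once [n] arrives.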

What is the unit of prelexical analysis?


Pre-lexical analysis involves automatic peripheral perceptual processes which
analyze the spoken input into linguistically relevant units. Why not the word?
First, there is the segmentation problem: it is very often extremely difficult to
know where one word finishes and the next begins. Second, waiting to identify a
word's entire speech pattern would slow down the recognition process.

The Phoneme as a unit of pre-lexical analysis

The phoneme is considered a unit of pre-lexical analysis in speech processing.
Arguments in favor of this view suggest that identifying phonemes helps in
distinguishing between different words and reduces the number of units to search for.
However, there are arguments against the psychological validity of phonemes: they
are linguistic constructs and may not align with how individuals perceive and process
speech. The recognition of phonemes can be influenced by learning to read and
connecting sounds with written words. Additionally, when a language's writing
system does not represent phonemes, this can affect speakers' phoneme awareness
and performance in phoneme-related tasks. This suggests that the ability to perceive
and manipulate phonemes is partly influenced by literacy in an alphabetic writing
system such as that of English.

Distinctive features as a unit of prelexical analysis


Distinctive features are a unit of pre-lexical analysis in which listeners identify
specific characteristics in the speech signal. For example, listeners can utilize
vowel nasalization to predict the final consonant in a word. This is demonstrated in
gating experiments where listeners can identify a word like "soon" during the vowel
portion of the word before the final consonant is heard. However, in languages like
Bengali and Hindi, the nasal quality of a vowel may inform the listener about the
identity of the vowel rather than the following consonant.

The syllable as a unit of prelexical analysis


Listeners use syllables as a unit to identify words. Syllables can vary in size, and
different languages may have different syllabic inventories or stress patterns.
Therefore, the syllable is a potential unit of prelexical analysis.

Metrical Segmentation Strategy


The Metrical Segmentation Strategy (MSS) is a process used in language
processing where word searches begin each time a strong or stressed syllable is
encountered. It is a universal strategy that applies to all languages, but the type of
unit used for segmentation varies based on the metrical or rhythmic structure of
the language. For instance, stress-timed languages such as English use stress as
the unit for segmentation, while syllable-timed languages like French use syllables.
On the other hand, mora-timed languages such as Japanese use moras as the unit
for segmentation.
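
For a stress-based language, the idea of starting a new word search at each strong syllable can be sketched as below. The stress marking is supplied by hand, and the function name `mss_chunks` and the example utterance are assumptions made for illustration. The chunks are candidate search points, not final word boundaries.

```python
def mss_chunks(syllables):
    """Group syllables so that every strong syllable opens a new chunk.

    `syllables` is a list of (text, is_strong) pairs. Weak syllables are
    attached to the chunk opened by the preceding strong syllable; weak
    syllables at the very start form a chunk of their own.
    """
    chunks = []
    for text, is_strong in syllables:
        if is_strong or not chunks:
            chunks.append([text])
        else:
            chunks[-1].append(text)
    return chunks


# Hand-marked stress for "the CAPtain WATCHED the SHIP" (an assumed example).
utterance = [("the", False), ("cap", True), ("tain", False),
             ("watched", True), ("the", False), ("ship", True)]
chunks = mss_chunks(utterance)
print(chunks)
```

Note that "watched the" comes out as a single chunk: the strategy only proposes where to begin lexical searches, so later processing still has to settle the actual boundaries.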

The Possible Word Constraint (PWC)


The Possible Word Constraint (PWC) principle states that the speech input is
segmented into words without any leftover sounds. To apply this principle, the
Metrical Segmentation Strategy (MSS) is used, which identifies words by starting
searches at each strong or stressed syllable encountered. An experiment was
conducted to test the PWC principle by presenting participants with nonword
stimuli and asking them to press a button as soon as they heard a real word within
the stimulus. The results showed that participants were slower to respond for
stimuli with two strong syllables, with the second such syllable starting at the
consonant /t/, compared to stimuli with one strong and one weak syllable. This
suggests that the PWC principle is at work during speech perception, and that the
MSS may depend on the metrical or rhythmic structure of the language being
spoken.

Contact and activation


The contact stage in spoken word recognition involves connecting the output of
pre-lexical analysis with the stored forms of words in the mental lexicon. This
process is mainly characterized by bottom-up processing, which involves
processing information in a feedforward manner from lower to higher levels of
processing. However, top-down processing can also play a role in this stage, which
involves using prior knowledge and context to guide the interpretation of incoming
information. During the contact stage, there can also be parallel processing, which
means that multiple stored forms can be activated simultaneously. This can happen
when there is uncertainty or ambiguity in the input, and multiple stored forms are
candidates for the input. The activation stage follows the contact stage and
involves exciting or activating the contacted words.

Parallel vs. Serial Processing


Parallel processing models suggest that multiple words are activated and
processed simultaneously during lexical processing. On the other hand, serial
processing models suggest that only one word is considered at a time during
lexical processing.

Zwitserlood (1989)

Zwitserlood (1989) used the crossmodal priming technique to provide evidence for
a parallel processing model. In this experiment, subjects were presented with visual
targets and heard test, control, and partner primes. They were asked to quickly
determine whether the visual targets were real words or not. The test primes were
words like "captain," which had partner words that started with the same sequence
of phonemes (e.g. "capital"). However, the partner words were never heard in the
experiment. The test and control primes were each paired with three types of
targets: one related to the test prime (e.g. "ship"), one related to the partner word
(e.g. "money"), and one related to neither. The results showed that responses were
faster to both kinds of related target (e.g. "ship" and "money") than to unrelated
ones, indicating that the subjects simultaneously accessed information for both
the test prime and its partner word. This supports the idea of parallel processing
during lexical processing.

The importance of onsets

Cohort activation is a process in which a listener's mental lexicon is activated by
the initial sounds of a word, which allows the listener to generate predictions about
what the upcoming word might be. As the listener hears more sounds, they use
bottom-up processing to narrow down the potential words that match the input,
helping them identify the correct word. The importance of onsets lies in their ability
to provide an early cue for cohort activation, which can aid in more quickly
identifying the correct word. This highlights the complex and dynamic nature of
language comprehension.

❖ Lexical storage: connectedness


Lexical storage is the way in which lexical items are organized in the lexicon so as
to ensure rapid access. Most current models assume that words are linked in a
complex network of interconnections. A word such as CHAIR has links to others in
the lexical set of furniture. But, for listeners, it also has links to words such as
CARE that resemble it phonologically and, for readers, links to words such as CHAIN
which resemble it orthographically. Other associations are based upon frequency of
co-occurrence (CHAIR-TABLE, CHAIR-MEETING) and upon sense relations such as
synonymy, antonymy and hyponymy. The connections between words differ in strength,
with CHAIR-TABLE much stronger than CHAIR-BED. Connectionist computer programs
(e.g. McClelland & Rumelhart, 1981) have simulated the way in which strengths of
connection are said to evolve. They do so by means of a mechanism which strengthens
a connection that occurs frequently and allows infrequent ones to atrophy.
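
The idea that frequently used connections grow stronger while unused ones atrophy can be illustrated with a toy update rule. This is not the McClelland & Rumelhart model; the rule, the `rate` and `decay` values, the starting weights and the function name `observe` are all assumptions chosen only to show the principle.

```python
def observe(weights, pair, rate=0.2, decay=0.05):
    """Return updated link strengths after one co-occurrence of `pair`.

    The observed link moves a step toward 1.0; every other link shrinks
    slightly, so links that are never observed fade over time.
    """
    return {
        link: (w + rate * (1.0 - w)) if link == pair else (w * (1.0 - decay))
        for link, w in weights.items()
    }


# Both links start equal; only CHAIR-TABLE is then encountered.
weights = {("CHAIR", "TABLE"): 0.5, ("CHAIR", "BED"): 0.5}
for _ in range(5):
    weights = observe(weights, ("CHAIR", "TABLE"))
print(weights)
```

After a few exposures the CHAIR-TABLE link is stronger than the CHAIR-BED link, matching the difference in connection strength described above.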

Lexical Access

Lexical access is the retrieval of a lexical entry from the lexicon, containing
stored information about a word’s form and its meaning. Serial models of lexical
access assume that we work through lexical entries in turn until we find a match
for a word that we are hearing or reading. It is well established that frequent
words are identified more quickly than infrequent ones. These models therefore
propose that words are stored not just by similarity of form but also in order of
frequency: words beginning with /kær/ would be accessed in the order CARRY,
CARROT, CARRIAGE, CARRIER, CARRION. This is the approach favoured, for
example, by Forster’s (1979) search model.

Priming Evidence (Zwitserlood, 1989):

Participants were able to respond more rapidly to both the test targets ship
and money when they had heard the onset fragment /kæp/ of the test prime. The
onset fragment allows the listener to make contact with lexical representations
for both captain and capital, with consequent access to the semantic content
of these representations.

Activation-based model

The more input there is that supports a particular lexical item, the higher the level of
activation for that entry. The activation levels of both captain and capital increase as
the first few sounds (/kæp/) are heard, but once the sound pattern of the input diverges
from that expected for the word capital, the activation of this word falls, while that
for captain continues to rise.
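
The rise and fall of activation described above can be sketched with a very simple rule: activation grows by a fixed amount for each input sound that matches, and is scaled down once the input has diverged. The `gain` and `decay` values, the simplified transcriptions and the function name `activation_trace` are assumptions for illustration, not parameters of any published model.

```python
def activation_trace(word_phones, input_phones, gain=1.0, decay=0.5):
    """Return the word's activation level after each input phoneme.

    While the input matches the word phoneme by phoneme, activation rises
    by `gain`. From the first mismatch onward it is multiplied by `decay`.
    """
    level, still_matching, trace = 0.0, True, []
    for i, phone in enumerate(input_phones):
        if still_matching and i < len(word_phones) and word_phones[i] == phone:
            level += gain
        else:
            still_matching = False
            level *= decay
        trace.append(level)
    return trace


# Assumed transcriptions; the listener hears "captain".
captain = "k æ p t ɪ n".split()
capital = "k æ p ɪ t ə l".split()
print("captain", activation_trace(captain, captain))
print("capital", activation_trace(capital, captain))
```

Both words rise together over /kæp/; from the fourth sound onward the trace for capital drops while the trace for captain keeps climbing.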

Lexical selection

Lexical access is the process by which stored information in the lexicon becomes
available. This occurs before the selection of a unique candidate word. Priming evidence
shows that listeners respond more rapidly to a target word when they have heard only
the onset fragment of a related prime, indicating that they have already made contact
with the lexical representations of the words consistent with that fragment (e.g. both
captain and capital). This allows access to the semantic content of these
representations, supporting the idea of parallel processing in lexical access.

The selection process


The selection process must choose between candidates so that a word can
be recognised. Selection depends on the uniqueness point of the word.
For each word we can identify a uniqueness point, i.e. a point in the word
where it no longer overlaps with other words in the initial cohort. For
captain this point could be the second vowel sound (/ə/), where this word
becomes distinct from captive. Gating experiments indicate that the actual
recognition points of words correlate highly with these uniqueness points.
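
A uniqueness point can be computed mechanically once a lexicon is fixed. The sketch below assumes a four-word toy lexicon with simplified transcriptions (captain is given with /ə/ as its second vowel, following the example above); the function name `uniqueness_point` is hypothetical.

```python
# Toy lexicon with assumed, simplified transcriptions.
LEXICON = {
    "cap": "k æ p".split(),
    "capital": "k æ p ɪ t ə l".split(),
    "captain": "k æ p t ə n".split(),
    "captive": "k æ p t ɪ v".split(),
}


def uniqueness_point(word, lexicon=LEXICON):
    """Return the 1-based position at which `word` stops overlapping with
    every other word, or None if it never becomes unique (for example a
    word that is fully contained in the start of a longer word)."""
    target = lexicon[word]
    for i in range(1, len(target) + 1):
        rivals = [w for w, phones in lexicon.items()
                  if w != word and phones[:i] == target[:i]]
        if not rivals:
            return i
    return None


up = uniqueness_point("captain")
print(up, LEXICON["captain"][up - 1])
```

In this lexicon captain becomes unique at its fifth sound, the vowel /ə/, whereas cap never reaches a uniqueness point because it is embedded in the longer words.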

How are non-words recognized?


The process of recognizing non-words is similar to that of recognizing real words.
The recognition of non-words depends on the point where they diverge from known
words. A set of words matching the input is activated, and their activation levels
change as more of the input is processed. When the input diverges from all known
words, the input is recognized as a non-word. Therefore, the recognition of
non-words is based on the deviation point from known words.
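
The deviation point of a non-word can be found in the same way as a uniqueness point: step through the input and stop at the first sound where no known word still matches. The lexicon, its transcriptions, the invented non-word and the function name `deviation_point` are all assumptions for this sketch.

```python
# Toy lexicon with assumed, simplified transcriptions.
KNOWN_WORDS = [
    "k æ p ɪ t ə l".split(),   # capital
    "k æ p t ɪ n".split(),     # captain
    "k æ p t ɪ v".split(),     # captive
]


def deviation_point(input_phones, lexicon=KNOWN_WORDS):
    """Return the 1-based position of the first phoneme at which the input
    no longer matches the start of any known word, or None if the whole
    input is still consistent with at least one word."""
    for i in range(1, len(input_phones) + 1):
        prefix = input_phones[:i]
        if not any(word[:i] == prefix for word in lexicon):
            return i
    return None


nonword = "k æ p t ɪ b".split()   # an invented non-word
print(deviation_point(nonword))
```

Here the non-word stays consistent with captain and captive for five sounds and is rejected at the sixth, the point where it diverges from every stored form.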

❖ Effects in SWR
An effect here is the influence of a characteristic of a particular lexical item upon
the ease with which it is retrieved from the lexicon. Evidence supports the
following:

➢ Frequency Effect
It has long been established that words that occur frequently in the
language (as reflected by counts of large text corpora) are recognized faster, and more
accurately under noisy conditions, than words that occur rarely (e.g., Howes & Solomon,
1951; Savin, 1963). Serial search models therefore propose that words are stored not
just by similarity of form but also in order of frequency: words beginning with /kær/
would be accessed in the order CARRY, CARROT, CARRIAGE, CARRIER, CARRION.
This is the approach favored, for example, by Forster’s (1979) search model. In
activation-based accounts, by contrast, a presented word activates a cohort of possible
word candidates, with the more frequent members having an advantage due to their
higher resting activation levels.

➢ Competition Effects

The process of recognizing a spoken word depends not only on the properties of the
word itself, but also on the properties of other words that compete with it. These are
known as competition effects. Competition between words is often represented in terms
of activation. Prompted by a particular string of letters or sounds, we access a number
of possible word matches. They are activated to different degrees, with the more likely
ones (those that are most frequent and those that form the closest match to what is in
the input) receiving more activation than the others. For example, hearing the initial
sequence [ɪksp] would lead a listener to retrieve from the lexicon a set of items which
includes EXPIRE, EXPECT, EXPLODE, EXPLAIN, EXPRESS etc. These would all
receive activation; but, if the next sound proved to be [r], the activation for EXPRESS
would be boosted to the point where it ‘fired’, i.e. was accepted as the only possible
match for the evidence available. The activation of all others would decline.

Competition between words is not simply a question of how closely they match the
signal. The activation of a word is boosted if it is of high frequency. Thus, EXPECT
would start off at a higher level of activation than the less frequent EXPIRE, or
alternatively would require a lower level of activation in order to achieve a match. This
indicates that the recognition process does not depend solely on the degree to
which the spoken input matches the representation of a given word, but
also on the degree to which the input matches the representations of alternative
words.
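
The interplay of match and frequency can be sketched by adding a frequency-based resting level to a bottom-up match score. Everything numeric here is an assumption: the frequencies are invented, spelling stands in for the sound sequence, and summing match length with log frequency is just one simple way to combine the two.

```python
import math

# Hypothetical frequencies (occurrences per million words), for illustration only.
FREQ = {"expect": 100, "explain": 60, "express": 40, "explode": 10, "expire": 5}


def activation(word, heard, freq=FREQ):
    """Bottom-up match (number of matching initial symbols) plus a
    log-frequency resting level."""
    match = 0
    for a, b in zip(word, heard):
        if a != b:
            break
        match += 1
    return match + math.log10(freq[word] + 1)


def ranked(heard):
    """Return the candidates ordered from most to least activated."""
    return sorted(FREQ, key=lambda w: activation(w, heard), reverse=True)


print(ranked("exp"))    # all candidates match equally; frequency decides
print(ranked("expr"))   # the extra matching sound lifts "express" to the top
```

While the input is ambiguous the most frequent candidate leads; one more sound of evidence is enough for the better-matching but less frequent word to overtake it.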

➢ Neighborhood Effects

The neighborhood concept serves to identify words which are in competition with each
other by virtue of similarity of form. The sight of the word READ on the page activates
not only READ but also neighbors which form close matches to the target. In theory, a
neighborhood includes words that differ by one letter, regardless of the position
of the letter. In this analysis, REAP, BEAD, REED and ROAD are neighbors of READ.
Within a neighborhood, there are friends: words which share the same rime as well as
the same spelling (the verb LEAD, BEAD). There are also enemies: words with the same
spelling but a different rime (HEAD, BREAD, DEAD, the noun LEAD).

There is evidence that the time it takes to recognise a given word is affected by the size
of its neighborhood and the number of friends and enemies it possesses. Thus,
recognition of a word like READ will be slowed by the existence of friends such as
BEAD and particularly by the existence of enemies such as DEAD, HEAD, BREAD etc.
By contrast, words like FEET or SIDE are recognised rapidly because they have few
friends and no enemies. The situation is complicated by the need to take account of the
possible effects of frequency. A word such as HAVE has no friends and a number of
enemies (CAVE, WAVE, RAVE, SAVE etc.) but happens to be a very frequent item.
Some accounts therefore represent neighborhood effects in terms of the frequency of the
target word in relation to the accumulated frequencies of its neighbors.
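
The one-letter definition of a neighborhood is easy to make concrete. The sketch below assumes a small hand-picked word list and a hypothetical function name `neighbors`; it covers only orthographic neighbors and does not try to separate friends from enemies, which would need pronunciation data.

```python
# A small hand-picked word list, for illustration only.
WORDS = {"READ", "REAP", "BEAD", "REED", "ROAD", "HEAD", "DEAD", "LEAD", "BREAD", "FEET"}


def neighbors(word, lexicon=WORDS):
    """Return the words of the same length that differ from `word` by
    exactly one letter, whatever the position of that letter."""
    return sorted(
        other for other in lexicon
        if len(other) == len(word)
        and sum(a != b for a, b in zip(other, word)) == 1
    )


print(neighbors("READ"))
print(neighbors("FEET"))
```

In this list READ has seven one-letter neighbors and FEET has none, which fits the contrast drawn above between crowded and sparse neighborhoods.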

Word Recognition and Context Effects 1

The question of whether pre-selection can account for spoken word recognition has
been investigated through cross-modal priming experiments. The results suggest
that bottom-up information, such as phonetic input, takes priority over contextual
information in the initial stages of making contact with stored lexical forms. In
sentence contexts where a related visual target word is presented before the prime
word, there is no facilitation of the target word compared to a control condition.
However, if the target word is presented partway through the prime word, there is
facilitation. For example, in the sentence "The men stood around for a while and
watched their captain...", if the visual target word "ship" is presented immediately
before the prime word "captain", there is no facilitation of "ship", but if the visual
target word is presented after the onset fragment /kæp/, there is facilitation. These findings
suggest that contextual information may not be used for pre-selection but may still
have an effect on word recognition at a later stage.

Word Recognition and Context Effects 2


Contextual information plays a secondary role in spoken word recognition, but the
integration of contextual and lexical information allows for earlier recognition of
words in sentence contexts than in isolation. The recognition point of a word can be
earlier than its uniqueness point if contextual information rules out other candidate
words before this point. For example, if the word "trespass" is heard in a context
where it is clear that the speaker is referring to a poacher being found guilty, the
recognition point of the word would be earlier, at the /s/, since other candidate
words such as "tress" and "trestle" do not fit the context and their activation levels
drop.

Word Recognition and Context Effects 3


The effect of context on spoken word recognition is variable and depends on
various factors. While context can aid in word recognition, it plays a secondary role,
and bottom-up information from the phonetic input has priority in the initial stages
of accessing stored lexical forms. Context can facilitate word recognition earlier in
sentence contexts than in isolation, and the recognition point of a word can be
earlier than its uniqueness point if contextual information rules out other candidate
words before that point. However, the effect of context is not consistent, as shown
by word monitoring experiments, where response times increased across
sentences, and word monitoring times tended to decrease later in the sentence as
the context became clearer.

➢ PRIMING EFFECT
An increase in the speed with which a word is recognised, which results from
having recently seen or heard a word that is closely associated with it. Shown
the word DOCTOR, a subject recognises words such as NURSE or PATIENT
more rapidly than usual – always provided they are presented soon
afterwards. DOCTOR is referred to as the prime and PATIENT as the target. The
sight of the word DOCTOR is said to prime PATIENT. Exposure to the prime is
represented as activating (or bringing into prominence) a range of associated
words. These words then become easier to identify because they are already
foregrounded in the mind. The process, known as spreading activation, is
highly automatic and not subject to conscious control. Most priming effects
are relatively short-lived, and decay quite quickly, thus ensuring that too many
lexical items are not activated simultaneously.

Recognising morphologically complex forms


The way in which we recognize, process, and store words that are made up of more
than one meaningful element depends on several factors, including the productivity
of the morphological processes involved. Inflected words, such as "plays," "played,"
"went," "cats," and "children," and derived words, such as "government," "maturity,"
and "casualty," are morphologically complex and require additional processing
compared to simple words. Compound words, such as "boathouse" and "hotdog,"
are also morphologically complex. One determining factor in how these words are
processed is the productivity of the morphological processes involved. Inflectional
morphology, which involves adding suffixes to words to indicate tense, number, or
case, is more productive and regular than derivational morphology, which involves
adding affixes to create new words or change the meaning of existing words.

Processing Inflections: How do we process inflected words?

When processing inflected words, frequency effects are documented, meaning that
more commonly used forms are easier to access. Out of context, inflected forms
can be difficult to access: spoken forms such as /deɪz/ and /pækt/ tend to be
reported as "daze" and "pact" rather than "days" and "packed". The word recognition system
disassembles inflected forms through morphological decomposition, analyzing
them into a stem plus an inflectional affix. The comprehension and production of
regular forms require morpho-phonological assembly and disassembly, while
irregular forms do not have an overt stem + affix structure and must be analyzed as
full forms. In summary, the processing of inflected words is influenced by factors
such as frequency, context, and morphological structure.
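
The contrast between decomposing regular forms and looking up irregular forms whole can be sketched as follows. The stem list, the suffix labels, the use of spelling rather than sound, and the function name `analyze` are all simplifying assumptions made for this example.

```python
# Irregular forms are stored as full forms with their analysis attached.
IRREGULAR = {"went": ("go", "PAST"), "children": ("child", "PLURAL")}
# A tiny assumed stem list and two regular suffixes (spelling-based).
STEMS = {"play", "cat", "pack", "day"}
SUFFIXES = [("ed", "PAST"), ("s", "PLURAL_OR_3SG")]


def analyze(form):
    """Return (stem, feature) for a word form, or None if it is unknown.

    Irregular forms are looked up directly; regular forms are split into
    a known stem plus an inflectional suffix."""
    if form in IRREGULAR:
        return IRREGULAR[form]
    for suffix, feature in SUFFIXES:
        stem = form[: -len(suffix)]
        if form.endswith(suffix) and stem in STEMS:
            return (stem, feature)
    if form in STEMS:
        return (form, None)
    return None


for form in ["played", "cats", "went", "children", "day"]:
    print(form, "->", analyze(form))
```

Regular forms such as "played" come apart into stem plus affix, while "went" is retrieved whole because it has no overt stem + affix structure.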

Processing Derivations

The recognition of derived words is influenced by the combined frequency of words
related to them and the transparency of their derivational relationship. Priming
effects exist between a base form and a morphologically related word only if the
relationship between the two is transparent. Prefixed words prime and are primed
by their base forms in transparently related words, but suffixed words do not prime
morphologically related suffixed words. This suggests that the recognition of
derived words is a complex process that involves both morphological and semantic
factors.

Words and Rules

The mental lexicon likely uses both a rule-based recognition system and a
full-listing system. The rule-based system involves analyzing the input word into its
meaningful parts, while the full-listing system includes every word, simple or
complex, in the mental lexicon. This suggests that our ability to process language
relies on a combination of generalizable rules and specific, memorized instances.

➢ Evidence of lexical storage and retrieval


Speech errors by normal speakers provide insights into how we store and
retrieve lexical items and how we assemble speech. There are two main types
of speech errors: selection errors and assemblage errors. Selection errors
demonstrate that both meaning and form play a part in the way we associate
words in our minds and retrieve them. Assemblage errors provide insights into
various stages in the process of constructing an utterance, including choosing
a syntactic structure, fitting words into a syntactic frame, attaching inflections,
assigning lexical stress, and phonetic planning for articulation.
