Professional Documents
Culture Documents
N. D. Prasad Kartik. P
Raghu Engineering College Raghu Engineering College,
Door No: 38-19-84 #10-6-42/7
Gouthami Gardens; Jyothi Nagar Krishna Towers, Ramnagar Bazaar
Visakhapatnam-18; (0891) 2527599 Visakhapatnam-02(0891)5571206
prasad_nd43210@yahoo.co.uk kartik_podugu_1985@yahoo.co.in
Abstract: Speech recognition is an important area The problem with these early efforts was that they
of research because it offers an improved form of focused primarily on the end product of speech - the
human-machine interaction. This paper deals words people were attempting to. Generate. By the
mid-1980s, speech recognition programmers were
with an overview of the discovery and the building on Chomsky's ideas of grammatical structure,
development of the system and researches for the and using more powerful hardware to implement
accurate Speech recognition system and the statistical phoneme-chain recognition routines. It was
problems that need to be overcome to build not until the dramatic increase of processing power of
accurate speech recognition systems. The the 1980's and 1990's that technology companies could
disciplines involved in Automatic Speech seriously consider the possibility of speech
Recognition like Signal Processing, Physics recognition. Further, many of the major companies
collaborated to create application-programming
(acoustics), Pattern recognition, Communication standards. This millennium has brought a wide array of
and Information theory, Linguistics, physiology, new speech recognition products and research areas.
Computer Science, Psychology are discussed
briefly. The problems in Automatic Speech 2. AUTOMATIC SPEECH RECOGNITION.
Recognition like Recognition units; Variability
and ambiguity regarding the Speech recognition For decades, people have been dreaming of an
are discussed in detail by classifying them into "intelligent machine" which can master natural speech.
Automatic Speech Recognition (ASR) is one
many subgroups. The classifications of the subsystem of this 'machine', the other subsystem being
present available Automatic Speech Recognition Speech Understanding (SU). The goal of ASR is to
systems are discussed. The future challenges of transcribe natural speech while the goal of SU is to
Automatic Speech Recognition are discussed understand the meaning of transcription.
briefly.
Speech recognition continues to be a Automatic Speech Recognition means different
challenging field for researchers. The successful things to different people. At one end of the spectrum
is the voice-operated alarm clock which ceases ringing
speech recognizer is free from the constraints of when the word 'stop' is shouted at it, and at the other
speakers, vocabularies, ambiguities and end is the automatic dictating machine which produces
environment. A lot of efforts have been made in
this direction, but complete accuracy is still far
from reach. The task is not an easy one due to the a typed manuscript in response to the human voice or
interdisciplinary nature of the problem. the expert system which provides answers to spoken
questions. A practical speech recognizer falls
1. INTRODUCTION. somewhere between these two extremes.
,r
3.1. Signal Processing. 3.10. Recognition units.
The process of extracting relevant information from The first step in solving the recognition problem is
the speech signal in an efficient and robust manner. to decide which units to recognize. Possible candidates
are words, syllables, diphones phonemes and
distinctive features.
3.2. Physics (acoustics).
4. THE WORDS.
The science of understanding the relationship
between the physical speech signal and physiological The word is a basic unit in speech recognition. The
mechanisms (the human vocal tract mechanism). meaning of an utterance can only be deduced after the
words of which it is composed have been recognized,
The word is so basic that a whole class of speech
3.3. Pattern Recognition. recognizers, discrete utterance or isolated word
recognizers has been designed for the sole purpose of
The set of algorithms used to cluster data to create identifying spoken words. However, these devices
one or more prototypical patterns and to match a pair require each word to be spoken in isolation; this is not
of patterns. the way in which sentences are normally produced.
3.4. Communication & Information Theory: One of the problems of employing the word as the
recognition unit is the number of words in the
The procedures for estimating parameters of language. This is of the order of 100,000. For
statistical models; the methods for detecting the recognition to take place some representation of each
presence of particular speech patterns, the set of word needs to be stored. This implies that a large,
modern coding and decoding algorithms. though not impossible, amount of storage is required.
Nevertheless there are a number of domains of man
3.5. Linguistics. machine interaction where the vocabulary can be
restricted to a much smaller number of words. In
The relationship between sounds (phonology), these situations the word is often employed as the
words in a language (syntax), meaning of spoken recognition unit.
words (semantics), and sense derived from meaning
(pragmatics). Another problem encountered in using words as
the recognition units is in determining where one
3.6. Physiology: word ends and the next begins. There is often no
acoustic evidence of word boundaries. In fact, in the
Understanding the higher order mechanisms pronunciation of continuous speech, co-articulation
within the human central nervous system that effects take place across word boundaries, altering the
account for speech production and perception in acoustic manifestation of each word, and obscuring
human beings. the boundaries.
9. CONCLUSION.
10. REFERENECES
[1] Digit Magazine
[2] www.netac.rit.edu
[3] www.cslu.ogi.edu