
REPORT

ON
SPEECH RECOGNITION SYSTEM

Submitted by,
Satyanarayana Dash
Roll.No : 3 MCA 067
Regd.No : 0305107085
12/08/21 1
CONTENTS
• INTRODUCTION
• OVERVIEW
• BASIC DIAGRAM
• SPEECH RECOGNITION MAIN PROBLEMS
• HOW SPEECH RECOGNITION WORKS
• APPLICATION FIELDS
• CONCLUSION

INTRODUCTION
Speech recognition is an active area of research.
Speech recognition is the process of converting an audio signal,
captured by a microphone or a telephone, into a set of words.
In its simplest form, a speech recognition machine consists of
two subsystems, namely automatic speech recognition (ASR) and
speech understanding (SU). The goal of ASR is to transcribe
natural speech, while the goal of SU is to understand the
meaning of the transcription.

OVERVIEW
Speech recognition uses several techniques to "recognize"
the human voice. It functions as a pipeline that converts digital
audio signals coming from the sound card into recognized
speech.
The voice input to the microphone goes to the sound card.
The output from the sound card, digital audio, is processed
using the FFT (Fast Fourier Transform) and then refined
using HMMs (Hidden Markov Models) and other techniques.
A built-in database is used to analyze what has been spoken.
At the final stage, feedback flows back to the database for the
purpose of adaptation. The final recognized output then goes
back to the CPU.

BASIC DIAGRAM OF THE PROCESS

SPEECH RECOGNITION MAIN
PROBLEMS

The difficulty of speech recognition depends on several
factors, such as:
• Number of speakers
• Nature of the utterance
• Vocabulary
• Environmental conditions

HOW SPEECH RECOGNITION
WORKS
Speech recognition fundamentally functions as a pipeline that
converts PCM (Pulse Code Modulation) digital audio from a
sound card into recognized speech. The elements of the
pipeline are:
1. Transform the PCM digital audio into a better acoustic
representation
2. Apply a "grammar" so the speech recognizer knows what
phonemes to expect. A grammar could be anything from a
context-free grammar to full-blown English.
3. Figure out which phonemes are spoken.
4. Convert the phonemes into words.
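
The four pipeline elements above can be sketched as four small
functions. This is only a toy illustration under stated assumptions:
the word list in GRAMMAR, the per-phoneme numbers in TEMPLATES, and
all function names are hypothetical, and a single number per frame
stands in for a real acoustic representation. Real recognizers use
spectral features and statistical models instead.

```python
# Hypothetical grammar: the only words allowed, with their phonemes.
GRAMMAR = {"yes": ["y", "eh", "s"], "no": ["n", "ow"]}

# Hypothetical "acoustic templates": one made-up feature value per phoneme.
TEMPLATES = {"y": 100, "eh": 300, "s": 500, "n": 700, "ow": 900}


def to_features(pcm, frame_size):
    """Step 1: reduce each frame of PCM samples to its mean absolute level."""
    frames = [pcm[i:i + frame_size] for i in range(0, len(pcm), frame_size)]
    return [sum(abs(s) for s in frame) / len(frame) for frame in frames]


def expected_phonemes(grammar):
    """Step 2: the grammar limits which phonemes have to be considered."""
    return {p for phonemes in grammar.values() for p in phonemes}


def classify(features, templates, allowed):
    """Step 3: label each frame with the closest allowed phoneme."""
    labels = []
    for value in features:
        best = min(allowed, key=lambda p: abs(templates[p] - value))
        if not labels or labels[-1] != best:  # merge repeated frames
            labels.append(best)
    return labels


def to_word(phonemes, grammar):
    """Step 4: find the word whose phoneme sequence matches exactly."""
    for word, pronunciation in grammar.items():
        if pronunciation == phonemes:
            return word
    return None


pcm = [690, 710, 705, 695, 910, 890]  # made-up samples
features = to_features(pcm, frame_size=2)
phonemes = classify(features, TEMPLATES, expected_phonemes(GRAMMAR))
word = to_word(phonemes, GRAMMAR)
```

With these made-up samples the frames are labelled "n", "n", "ow",
the repeated "n" is merged, and the lookup returns the word "no".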

PRELIMINARY THINGS
• SOUND CARD
• FAST FOURIER TRANSFORM (FFT)
• HIDDEN MARKOV MODEL (HMM)

SOUND CARD
A sound card is a peripheral device that attaches to the PCI
slot on a motherboard to enable the computer to input, process,
and deliver sound.
 

The main function of the sound card in the speech recognition
system is analog-to-digital conversion, which is done by
an ADC (Analog-to-Digital Converter).
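
As a rough software illustration of what analog-to-digital conversion
produces, the sketch below samples a sine tone at regular intervals
and rounds each value to a 16-bit signed integer. The tone frequency,
sample rate, bit depth, and function name are assumptions chosen for
the example; a real sound card does this step in hardware.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits=16):
    """Sample a unit-amplitude sine tone and quantize to signed integers."""
    max_level = 2 ** (bits - 1) - 1  # 32767 for 16-bit samples
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                        # sampling instant
        analog = math.sin(2 * math.pi * freq_hz * t)  # value in [-1, 1]
        samples.append(int(round(analog * max_level)))
    return samples


# Assumed values: a 1000 Hz tone sampled 8000 times per second.
pcm = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, n_samples=8)
```

Each entry of pcm is one digital sample; the list as a whole is the
kind of digital audio that the later stages work on.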

FAST FOURIER TRANSFORM

The Fourier series allows us to decompose a periodic signal
into an infinite series of simple sine waves, each having a different
frequency and phase. The Fourier transform extends this idea to
non-periodic signals. In this case, however, the frequencies are
not discrete but form a continuous spectrum. The transformation
changes the time domain to the frequency domain and vice versa.
Because the spectrum is continuous, the result is an envelope of
the frequency-domain components rather than a plot of the
components themselves. The Fast Fourier Transform (FFT) is an
efficient algorithm for computing the discrete form of this
transform on sampled audio.
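
To make the change from time domain to frequency domain concrete,
here is a minimal sketch of a recursive radix-2 FFT in plain Python,
applied to one frame of a pure tone. The frame length (64 samples),
sample rate (8000 Hz), and tone frequency (1000 Hz) are assumptions
chosen so that the tone falls exactly on one frequency bin; it is not
meant as the method any particular recognizer uses.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


sample_rate = 8000  # assumed, in Hz
frame_len = 64      # assumed frame length
frame = [math.sin(2 * math.pi * 1000 * n / sample_rate)
         for n in range(frame_len)]

magnitudes = [abs(c) for c in fft(frame)]
# Only the first half of the bins is needed for a real-valued signal.
peak_bin = max(range(frame_len // 2), key=lambda k: magnitudes[k])
peak_hz = peak_bin * sample_rate / frame_len
```

For this frame the largest magnitude lies in bin 8, which corresponds
to 8 * 8000 / 64 = 1000 Hz, the frequency of the tone.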

HIDDEN MARKOV MODELS
The software has to be able to judge when one phoneme ends
and the next one begins. For this, it uses a technique called
Hidden Markov Models (HMMs), another mathematical model, this
one based on statistics. To figure out when speech starts and
stops, a speech recognizer has silence phonemes, which are also
assigned feature numbers.
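
One common way to find phoneme boundaries with an HMM is Viterbi
decoding, which picks the most likely sequence of hidden states for
a sequence of observed feature numbers. The sketch below is a toy
example: the three states ("sil" for silence, "s", "iy"), the feature
numbers 0 to 2, and every probability are made-up assumptions, not
values from any real recognizer.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s, previous state)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s][first], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Trace back from the most likely final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical model for the word "see" surrounded by silence.
states = ["sil", "s", "iy"]
start_p = {"sil": 0.8, "s": 0.1, "iy": 0.1}
trans_p = {
    "sil": {"sil": 0.6, "s": 0.3, "iy": 0.1},
    "s": {"sil": 0.1, "s": 0.6, "iy": 0.3},
    "iy": {"sil": 0.3, "s": 0.1, "iy": 0.6},
}
emit_p = {
    "sil": {0: 0.8, 1: 0.1, 2: 0.1},
    "s": {0: 0.1, 1: 0.8, 2: 0.1},
    "iy": {0: 0.1, 1: 0.1, 2: 0.8},
}

feature_numbers = [0, 0, 1, 1, 2, 2, 0]  # one made-up number per frame
decoded = viterbi(feature_numbers, states, start_p, trans_p, emit_p)
```

The points where the decoded state changes (from "sil" to "s", from
"s" to "iy", and back to "sil") are the estimated boundaries between
silence and phonemes.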

STEP BY STEP PROCESS OF SPEECH
RECOGNITION SYSTEM

• TRANSFORM THE PCM DIGITAL AUDIO
• FIGURE OUT THE PHONEMES
• REDUCING COMPUTATION & INCREASING ACCURACY
• CONTEXT-FREE GRAMMAR
• DISCRETE DICTATION
• CONTINUOUS DICTATION
• ADAPTATION

APPLICATION FIELDS
• Speech synthesis (voice response systems)
• Digital transmission and storage (optimized encryption
of signals)
• Speaker verification and identification (control of access,
legal applications)
• Aids to the handicapped
• Speech recognition (automatic dictation, command and
control)
• Enhancement of speech signal quality (noise removal)

CONCLUSION
 
Speech technologies are moving forward quickly, and as new
technologies arrive, the algorithms and methods are optimized
and improved.
Many software packages have been developed and are in use for
speech recognition, among them Voice pad generator, Dragon
NaturallySpeaking, and ScanSoft OpenSpeech Recognizer.
