3MCA67 Speech Recognition

REPORT
ON
SPEECH RECOGNITION SYSTEM
Submitted by,
Satyanarayana Dash
Roll.No : 3 MCA 067
Regd.No : 0305107085
12/08/21 1
CONTENTS
• INTRODUCTION
• OVERVIEW
• BASIC DIAGRAM
• SPEECH RECOGNITION MAIN PROBLEMS
• HOW SPEECH RECOGNITION WORKS
• APPLICATION FIELDS
• CONCLUSION
INTRODUCTION
Speech recognition is one of the most active research topics today.
Speech recognition is the process of converting an audio signal,
captured by a microphone or a telephone, to a set of words.
In its simplest form, a speech-driven system consists of two
subsystems, namely automatic speech recognition (ASR) and
speech understanding (SU). The goal of ASR is to transcribe
natural speech, while the goal of SU is to understand the meaning
of the transcription.
OVERVIEW
Speech recognition uses several techniques to "recognize"
the human voice. It functions as a pipeline that converts digital
audio signals coming from the sound card into recognized
speech.
The voice input to the microphone goes to the sound card.
The output from the sound card, digital audio, is processed
using the FFT (Fast Fourier Transform) and then refined
using HMMs (Hidden Markov Models) and other
techniques. The built-in database is used to analyze what has
been spoken. At the final stage, feedback is sent back to the
database for the purpose of adaptation. The final recognized
output then goes back to the CPU.
BASIC DIAGRAM OF THE PROCESS
SPEECH RECOGNITION MAIN PROBLEMS
The difficulties of speech recognition involve several
factors, such as:
• Number of speakers
• Nature of the utterance
• Vocabulary
• Environmental conditions
HOW SPEECH RECOGNITION WORKS
Speech recognition fundamentally functions as a pipeline that
converts PCM (Pulse Code Modulation) digital audio from a
sound card into recognized speech. The elements of the
pipeline are:
1. Transform the PCM digital audio into a better acoustic
representation
2. Apply a "grammar" so the speech recognizer knows what
phonemes to expect. A grammar could be anything from a
context-free grammar to full-blown English.
3. Figure out which phonemes are spoken.
4. Convert the phonemes into words.
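As an illustration of step 4 only, the sketch below maps a phoneme sequence to words with a greedy longest-match lookup. The lexicon, the phoneme symbols, and the function name are hypothetical and chosen for this example. A real recognizer searches over many candidate phoneme sequences with probabilities rather than using a single exact lookup.

```python
# Hypothetical pronunciation lexicon: phoneme tuples -> words (illustrative only).
LEXICON = {
    ("HH", "EH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
    ("HH", "EH", "L"): "hell",
}


def phonemes_to_words(phonemes, lexicon=LEXICON):
    """Greedy longest-match conversion of a phoneme list into words."""
    words = []
    i = 0
    max_len = max(len(key) for key in lexicon)
    while i < len(phonemes):
        # Try the longest possible phoneme run first, then shorter ones.
        for length in range(min(max_len, len(phonemes) - i), 0, -1):
            key = tuple(phonemes[i:i + length])
            if key in lexicon:
                words.append(lexicon[key])
                i += length
                break
        else:
            # No lexicon entry starts at this position.
            raise ValueError(f"no word matches phonemes at position {i}")
    return words


words = phonemes_to_words(["HH", "EH", "L", "OW", "W", "ER", "L", "D"])
```

Here `words` is `["hello", "world"]`. Restricting the lexicon to the words a grammar allows is one simple way to picture how step 2 narrows what the recognizer has to consider.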
PRELIMINARY THINGS
• SOUND CARD
• FAST FOURIER TRANSFORM (FFT)
• HIDDEN MARKOV MODEL (HMM)
SOUND CARD
A sound card is a peripheral device that attaches to the PCI
slot on a motherboard to enable the computer to input, process,
and deliver sound.

The main function of the sound card in the speech
recognition system is analog-to-digital conversion, which is done by
an ADC (Analog-to-Digital Converter).
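The conversion performed by the ADC can be pictured with a small sketch: sample a continuous signal at a fixed rate and round each sample to an integer code. The 1000 Hz test tone, the 8000 Hz sample rate, and the 16-bit depth are assumptions made for this example, not values taken from any particular sound card.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits=16):
    """Sample a unit-amplitude sine wave and quantize it to signed integer codes."""
    max_code = 2 ** (bits - 1) - 1  # 32767 for 16-bit samples
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of the n-th sample in seconds
        analog = math.sin(2 * math.pi * freq_hz * t)  # stand-in for the microphone voltage
        samples.append(round(analog * max_code))  # quantization step
    return samples


# Assumed example values: a 1000 Hz tone sampled at 8000 Hz.
pcm = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, n_samples=8)
```

The resulting list of integers is the kind of PCM data that the rest of the recognition pipeline receives.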
FAST FOURIER TRANSFORM
A Fourier series allows us to decompose a periodic signal
into an infinite series of simple sine waves, each having a different
frequency and phase. The Fourier transform extends this idea to
non-periodic signals. In this case, however, the frequencies are
not discrete but form a continuous spectrum. The transformation
converts the time domain to the frequency domain and vice versa.
Because the spectrum is continuous, the result is an envelope of
the frequency-domain components rather than a plot of the
components themselves. The Fast Fourier Transform (FFT) is an
efficient algorithm for computing the discrete form of this
transform on sampled digital audio.
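The move from the time domain to the frequency domain can be shown with a minimal radix-2 FFT written with the standard library only. The signal (a 1000 Hz tone), the 8000 Hz sample rate, and the 64-sample frame are assumptions chosen so that the tone falls exactly on one frequency bin. This is a teaching sketch, not the routine any specific recognizer uses.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


SAMPLE_RATE = 8000  # assumed sample rate in Hz
N = 64              # assumed frame length (power of two)

signal = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(N)]
spectrum = fft(signal)

# Only the first N/2 bins are unique for a real-valued input.
magnitudes = [abs(c) for c in spectrum[: N // 2]]
peak_bin = max(range(N // 2), key=lambda k: magnitudes[k])
peak_hz = peak_bin * SAMPLE_RATE / N
```

With these values the largest magnitude appears in bin 8, which corresponds to 1000 Hz, the frequency of the input tone.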
HIDDEN MARKOV MODELS
The software has to be able to judge when a phoneme
ends and the next one begins. For this, it uses a technique called
Hidden Markov Models (HMM), which is another mathematical
model that uses statistics. To figure out when speech starts and
stops, a speech recognizer has silence phonemes, which are also
assigned feature numbers.
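One standard way to decode an HMM is the Viterbi algorithm, which finds the most likely sequence of hidden states for a sequence of observations. The sketch below uses a toy model with two states, "silence" and "speech", and two observation symbols, "quiet" and "loud". All states, symbols, and probabilities are made up for illustration. Real recognizers use many phoneme states and numeric acoustic features.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observations."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: previous state on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model: every number below is an assumption for this example.
STATES = ("silence", "speech")
START_P = {"silence": 0.8, "speech": 0.2}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
EMIT_P = {
    "silence": {"quiet": 0.9, "loud": 0.1},
    "speech": {"quiet": 0.3, "loud": 0.7},
}

observed = ["quiet", "quiet", "loud", "loud", "loud", "quiet", "quiet"]
path = viterbi(observed, STATES, START_P, TRANS_P, EMIT_P)
```

For this input the decoded path is silence, silence, speech, speech, speech, silence, silence, so the points where the state changes mark where speech starts and stops.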
STEP BY STEP PROCESS OF SPEECH RECOGNITION SYSTEM
• TRANSFORM THE PCM DIGITAL AUDIO
• FIGURE OUT THE PHONEMES
• REDUCING COMPUTATION & INCREASING ACCURACY
• CONTEXT FREE GRAMMAR
• DISCRETE DICTATION
• CONTINUOUS DICTATION
• ADAPTATION
APPLICATION FIELDS
• Speech synthesis (voice response systems)
• Digital transmission and storage (optimized encryption of signals)
• Speaker verification and identification (control of access, legal applications)
• Aids to the handicapped
• Speech recognition (automatic dictation, command and control)
• Enhancement of speech signal quality (noise removal)
CONCLUSION

Speech technologies are moving forward quickly, and as new
technologies arrive, the algorithms and methods
are optimized and improved.
Many software packages for speech recognition have been
developed and are in use, for example Voice pad generator,
Dragon NaturallySpeaking, and ScanSoft OpenSpeech Recognizer.