COLLEGE OF INFORMATICS
DEPARTMENT OF INFORMATION SYSTEM
COURSE TITLE: SEMINAR IN INFORMATION SYSTEM
TITLE: SPEECH RECOGNITION SYSTEM SEMINAR
GROUP MEMBERS
NO. NAME ID NO.
1 AYANO WANDO 2827/11
2 TEKLU ABERA 1307/11
3 TALILE GELETA 1306/11
4 KERASA GEETANA 1295/11
5 MISGANE MEGERSA 1298/11

OUTLINE
• Introduction
• Speech recognition system
• Speech recognition process
• Structure of speech recognition
• Types of speech recognition systems
• Speech recognition algorithms
• Advantages and disadvantages
• Applications of speech recognition

CHAPTER 1
INTRODUCTION TO SPEECH RECOGNITION

1.1 INTRODUCTION
• Speech recognition allows you to provide input to a system with your voice.
• Just as clicking with the mouse, typing on the keyboard, or pressing a key on the phone keypad provides input to an application, speech recognition lets you provide input by talking.

1.2 SPEECH RECOGNITION
• Speech recognition (sometimes referred to as Automatic Speech Recognition, or ASR) is the process by which a computer (or other type of machine) identifies spoken words.
• Basically, it means talking to a computer and having it correctly understand what you are saying.

CHAPTER 2
LITERATURE SURVEY
2.1 SPEECH RECOGNITION PROCESS

• In humans, speech (acoustic) signals are received by the ears and transmitted to the brain, which extracts the meaning of the speech and reacts to it appropriately.
• Speech-recognition-enabled computers and devices work on the same principle.

Any speech recognition system involves the following five major steps:
1. Signal Processing
The sound is received through the microphone in the form of analog electrical signals, which are converted to digital form for processing.
2. Speech Recognition
This is the most important part of the process; here the actual recognition is done.

3. Semantic Interpretation
Here the system checks whether the language allows a particular syllable to appear after another.
4. Dialog Management
The system attempts to correct any errors it encounters.
5. Response Generation
After the task is performed, the response or result of that task is generated.

2.2 STRUCTURE OF A STANDARD SPEECH RECOGNITION SYSTEM

Fig. 2.1 Structure of a speech recognition system

The structure of a standard speech recognition system is illustrated in Figure 2.1. The elements are as follows:
• Raw speech - Speech is typically sampled at a high frequency (e.g., 16 kHz over a microphone or 8 kHz over a telephone).
• Signal analysis - Raw speech is first transformed and compressed in order to simplify subsequent processing.
• Speech frames - The result of signal analysis is a sequence of speech frames, typically at 10-millisecond intervals, with about 16 coefficients per frame.

 Acoustic models - In order to analyse the speech frames for their
acoustic content, we need a set of acoustic models.
 Acoustic analysis and frame scores - Acoustic analysis is
performed by applying each acoustic model over each frame of
speech, yielding a matrix of frame scores.
 Time alignment - Frame scores are converted to a word sequence by
identifying a sequence of acoustic models, representing a valid word
sequence.
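
The first elements of this structure, from raw speech to speech frames, can be sketched in Python as follows. This is only an illustration under stated assumptions: the 16 kHz sampling rate, the non-overlapping frames, the function names `split_into_frames` and `log_energy`, and the use of log energy as a single per-frame value are choices made for the example. A real front end would compute about 16 coefficients per frame, as described above.

```python
import math

SAMPLE_RATE_HZ = 16_000   # assumed sampling rate of the raw speech
FRAME_MS = 10             # frame interval taken from the text above


def split_into_frames(samples, sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Cut a list of samples into consecutive, non-overlapping frames."""
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    n_frames = len(samples) // frame_len           # drop a trailing partial frame
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def log_energy(frame):
    """One simple per-frame coefficient: log of the mean squared amplitude."""
    mean_sq = sum(x * x for x in frame) / len(frame)
    return math.log(mean_sq + 1e-12)  # small constant avoids log(0) on silence


# One second of a synthetic 440 Hz tone standing in for raw speech.
raw_speech = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ)
              for n in range(SAMPLE_RATE_HZ)]
frames = split_into_frames(raw_speech)
features = [log_energy(f) for f in frames]
print(len(frames), len(frames[0]))
```

With these assumed settings, one second of audio yields 100 frames of 160 samples each, and `features` holds one value per frame.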

2.3 TYPES OF SPEECH RECOGNITION SYSTEMS
Speech recognition systems can be separated into several different classes:
• Isolated Word
Isolated word recognizers usually require each utterance to have quiet (lack of an audio signal) on both sides of the sample window.
• Connected Word
Connected word systems are similar to isolated word systems, but allow separate utterances to be 'run-together' with a minimal pause between them.


• Continuous Speech
Continuous speech recognizers allow users to speak almost naturally,
while the computer determines the content. Basically, it's computer
dictation.
• Spontaneous Speech
At a basic level, it can be thought of as speech that is natural sounding
and not rehearsed.
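
Because isolated word recognizers expect quiet on both sides of each utterance, a word can in principle be located with a simple energy threshold. The Python sketch below illustrates that idea only. The function name `find_utterance`, the frame length, and the threshold value are hypothetical choices for the example and do not come from any particular recognizer.

```python
def find_utterance(samples, frame_len=160, threshold=0.01):
    """Return (start, end) sample indices of the non-quiet region, or None.

    A frame counts as quiet when its mean squared amplitude is below
    `threshold` (an assumed value that would need tuning on real audio).
    """
    active = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= threshold:
            active.append(start)
    if not active:
        return None
    # From the first active frame to the end of the last active frame.
    return active[0], active[-1] + frame_len


# Quiet, then a burst of signal, then quiet again.
signal = [0.0] * 480 + [0.5, -0.5] * 240 + [0.0] * 320
print(find_utterance(signal))
```

Connected and continuous speech do not offer such clean pauses, which is one reason those classes are harder to handle.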

There are two types of voice verification/identification systems:
1. Text-Dependent:
If the text must be the same for enrolment and verification, this is called text-dependent recognition.
2. Text-Independent:
Text-independent systems are most often used for speaker identification, as they require very little, if any, cooperation from the speaker.

CHAPTER 3
SYSTEM ANALYSIS
3.1 SPEECH RECOGNITION ALGORITHMS

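3.1.1 Dynamic Time Warping

The Dynamic Time Warping (DTW) algorithm is one of the oldest and most important algorithms in speech recognition.
The simplest way to recognize an isolated word sample is to compare it against a number of stored word templates and determine the “best match”. DTW makes this comparison while letting the time axis stretch or compress, so that a word spoken faster or slower than its template can still match.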
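
A minimal Python sketch of template matching with DTW is given below. It assumes each word is represented by a short list with one number per frame. The templates, the sample, and the function names `dtw_distance` and `best_match` are made up for illustration. Real systems compare vectors of coefficients and usually add path constraints.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local distance
            cost[i][j] = d + min(cost[i - 1][j],      # only a advances
                                 cost[i][j - 1],      # only b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def best_match(sample, templates):
    """Return the label of the stored template closest to the sample."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical one-value-per-frame templates for two words.
templates = {
    "yes": [1, 3, 4, 3, 1],
    "no": [4, 2, 1, 2, 4],
}
# The same contour as "yes", spoken more slowly (some frames repeated).
sample = [1, 1, 3, 3, 4, 4, 3, 1]
print(best_match(sample, templates))
```

Here the sample is longer than the "yes" template, yet its DTW distance to that template is zero because the warping absorbs the repeated frames.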

3.1.2 Hidden Markov Model
The most flexible and successful approach to speech recognition so far has been the Hidden Markov Model (HMM). A Hidden Markov Model is a collection of states connected by transitions. It begins in a designated initial state. In each discrete time step, a transition is taken into a new state, and then one output symbol is generated in that state.
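
The generative process described above can be sketched in Python as follows. The two states, their transition and output probabilities, and the function names `pick` and `generate` are all hypothetical values chosen for illustration. The sketch only generates symbols and does not perform recognition.

```python
import random

# A hypothetical two-state HMM; all names and probabilities are illustrative.
INITIAL_STATE = "S1"
TRANSITIONS = {            # P(next state | current state)
    "S1": {"S1": 0.6, "S2": 0.4},
    "S2": {"S1": 0.3, "S2": 0.7},
}
EMISSIONS = {              # P(output symbol | state)
    "S1": {"a": 0.8, "b": 0.2},
    "S2": {"a": 0.1, "b": 0.9},
}


def pick(distribution, rng):
    """Draw one key from a {key: probability} mapping."""
    keys = list(distribution)
    weights = [distribution[k] for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


def generate(n_steps, seed=0):
    """Run the HMM for n_steps; return the hidden states and the outputs."""
    rng = random.Random(seed)
    state = INITIAL_STATE
    states, outputs = [], []
    for _ in range(n_steps):
        state = pick(TRANSITIONS[state], rng)        # take a transition
        states.append(state)
        outputs.append(pick(EMISSIONS[state], rng))  # emit in the new state
    return states, outputs


states, outputs = generate(10)
print("hidden states:", states)
print("observed     :", outputs)
```

Only the output symbols would be visible to a recognizer. The state sequence is the "hidden" part that recognition has to infer.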

3.1.3 Neural Networks

A neural network consists of many simple processing units (artificial neurons), each of which is connected to many other units. Each unit has a numerical activation level (analogous to the firing rate of real neurons).
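
How a single unit might compute its activation level can be sketched in Python as follows. The text does not say which activation function is used, so the sigmoid here is an assumption, as are the function name `unit_activation` and the example weights.

```python
import math


def unit_activation(inputs, weights, bias=0.0):
    """Activation of one unit: sigmoid of the weighted sum of its inputs.

    The sigmoid is an assumed choice; it keeps the activation between 0 and 1.
    """
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-net))


# Activations of three upstream units and the (made-up) connection weights.
upstream = [0.9, 0.1, 0.4]
weights = [1.5, -2.0, 0.5]
print(unit_activation(upstream, weights))
```

A network is built by feeding the activations of many such units into other units through weighted connections.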

CHAPTER 4
DISCUSSION
4.1 SPEECH RECOGNITION SOFTWARE
Many speech recognition software packages are available in the market. They are available for various platforms, including smartphones, PCs, and tablets, and are designed for different operating systems as well.


• Julius
An open-source, freeware speech recognition engine
• Google Now
An intelligent personal assistant
• Iris (Intelligent Rival Imitator of SIRI)
An application that uses natural language processing to answer questions based on the user's voice request


• Dragon NaturallySpeaking
A speech recognition software package. The software has three primary areas of functionality: dictation, text-to-speech, and command input.
• Windows Speech Recognition
A speech recognition application developed by Microsoft

4.2 ADVANTAGES & DISADVANTAGES

4.2.1 Advantages
• Increases productivity
• Can help with menial computer tasks, such as browsing and scrolling
• Diminishes spelling mistakes

4.2.2 Disadvantages
• Inaccuracy & Slowness
Most people cannot type as fast as they speak, but recognition errors have to be found and corrected, which can make dictation slower than typing in practice.
• Vocal Strain
When using voice recognition software, you may find yourself speaking more loudly than in normal conversation.
• Out-of-Vocabulary (OOV) Words
• Spontaneous Speech
Systems are unable to recognize speech properly when it contains disfluencies (e.g., hesitations, fillers, and false starts).

4.3 APPLICATIONS
• Games and Edutainment
Speech recognition offers game and edutainment developers the potential to bring their applications to a new level of play.
• Data Entry
Speech recognition can replace keyboarding in applications that require users to enter paper-based data into the computer.
• Document Editing
• Speaker Identification
• Medical Disabilities
This technology is a great help to blind people and people with physical disabilities, who can use speech recognition for a variety of tasks.

CHAPTER 5
CONCLUSION & FUTURE SCOPE
5.1 CONCLUSION
· Speech recognition will revolutionize the way people interact with smart devices and will, ultimately, differentiate upcoming technologies.
· This technology will spawn revolutionary changes in the modern world and become a pivotal technology.


· Speech recognition is a truly amazing human capacity, especially when you consider that normal conversation requires the recognition of 10 to 15 phonemes per second. It should be of little surprise, then, that attempts to build machine (computer) recognition systems have proven difficult.
· Speech recognition systems are an indispensable part of the ever-advancing field of human-computer interaction.

5.2 FUTURE SCOPE
The future scope of speech recognition systems, as discussed, includes:
• Speech recognition may become speech understanding.
• The ability to distinguish differences in pronunciation and the meanings of words.
• Computers that can determine what a person has just said.
• Talking with all kinds of devices.
• Universal translators are being developed to translate speech from one language into another.
• Keyboards and other control panels able to simply listen to our commands.
