Presented by: Nitin Rawat TT-ET (0909533046)

INDEX
• Objectives
• Introduction
• Difference between speech and voice recognition
• The logic
• Study of components used
• Methodology
• Challenges faced and their solution
• Advantages and disadvantages

OBJECTIVES
• To develop a speech recognition system capable of recognizing the words spoken by the user.
• To make the system speaker independent.
• To remove external noise using filters.
• To retain data in RAM.

INTRODUCTION
• Speech is a natural mode of communication for people, by which they express their views and messages as voice utterances.
• The electronic approach to speech recognition is to convert the speech into an electronic signal.
• Analog representation of speech.

DIFFERENCE BETWEEN SPEECH AND VOICE RECOGNITION
Speech recognition
• It is a speaker-independent system.
• The main function of this system is to recognize the word spoken by the speaker.
• It is concerned with what is being spoken, not with who the speaker is.

Voice recognition
• It is a speaker-dependent system.
• The main function of this system is to recognize the speaker first and then the words.
• It is concerned with both who is speaking and what is being spoken.

THE LOGIC

STUDY OF COMPONENTS USED
The programming board

IC HM2007
• Single-chip CMOS voice recognition IC.
• Speaker dependent.
• External RAM support.
• Recognizes a maximum of 40 words of 0.96 seconds each.
• Maximum word length of 1.92 seconds (20 words).
• Microphone support.
• Manual and CPU modes available.
• Response time less than 300 milliseconds.
• 5V power supply.
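The two vocabulary settings listed above work out to the same total amount of recorded speech. The following is a minimal Python sketch of that arithmetic, assuming the slide's figures are per-word maxima; the function name `total_speech_ms` is hypothetical and used only for illustration.

```python
def total_speech_ms(word_count: int, word_length_ms: int) -> int:
    """Total speech time, in milliseconds, for one vocabulary setting."""
    return word_count * word_length_ms


# Figures taken from the slide: 40 words of 0.96 s, or 20 words of 1.92 s.
short_word_mode = total_speech_ms(40, 960)
long_word_mode = total_speech_ms(20, 1920)

print("40 words x 0.96 s:", short_word_mode, "ms")
print("20 words x 1.92 s:", long_word_mode, "ms")
print("same total:", short_word_mode == long_word_mode)
```

Working in integer milliseconds avoids floating-point rounding when the two totals are compared.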

INTERNAL FUNCTIONS OF HM2007
• The chip provides the following error codes:
55 = word too long
66 = word too short
77 = no match
• Pressing “99” and then clear erases all the data inside the RAM.
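A host program reading the two-digit result could interpret it roughly as follows. This is only a sketch: the error codes come from the slide, but the assumption that trained words occupy slots 1 to 40, and the names `ERROR_CODES` and `describe_result`, are illustrative and not taken from the chip's documentation.

```python
# Error codes as listed on the slide.
ERROR_CODES = {
    55: "word too long",
    66: "word too short",
    77: "no match",
}

# Assumption: trained words are numbered 1..40 (the 40-word maximum).
MAX_WORDS = 40


def describe_result(code: int) -> str:
    """Turn a two-digit result code into a readable message."""
    if code in ERROR_CODES:
        return "error: " + ERROR_CODES[code]
    if 1 <= code <= MAX_WORDS:
        return f"recognized word {code:02d}"
    return "unknown code"


for sample in (7, 55, 66, 77):
    print(sample, "->", describe_result(sample))
```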

8k x 8 SRAM
• TTL-compatible inputs and outputs.
• 13 address pins.
• 8 I/O pins.
• Automatic power-down when deselected.
• Static RAM organized as 8192 words by 8 bits.
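The pin counts and the memory organization above agree with each other, as this short Python check of the arithmetic shows (the variable names are illustrative):

```python
ADDRESS_PINS = 13  # from the slide
DATA_PINS = 8      # the 8 I/O pins carry one 8-bit word

words = 2 ** ADDRESS_PINS        # number of addressable locations
total_bits = words * DATA_PINS   # total storage in bits
total_kib = total_bits // 8 // 1024

print(words, "words x", DATA_PINS, "bits =", total_kib, "KiB")
```

Thirteen address lines select 2^13 = 8192 locations, which is where the "8k" in "8k x 8" comes from.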

IC 74LS373
• Consists of eight latches with 3-state outputs for bus-organized system applications.
• D1-D8 are data inputs.
• Q1-Q8 are output pins.
• Used for controlling two 7 segment drivers.
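The behaviour of an octal transparent latch feeding two display drivers can be modelled in a few lines of Python. This is a simplified sketch, not a description of the actual board wiring: the class and function names are hypothetical, the enable signals are modelled as plain booleans (on the real part the output control pin is active low), `None` stands in for the high-impedance state, and the assumption that the upper nibble drives the tens digit is ours.

```python
class OctalLatch:
    """Simplified model of an 8-bit transparent latch with 3-state outputs."""

    def __init__(self) -> None:
        self._stored = 0

    def update(self, data: int, latch_enable: bool) -> None:
        # While latch enable is active, the outputs follow the inputs;
        # otherwise the previously stored byte is held.
        if latch_enable:
            self._stored = data & 0xFF

    def read(self, output_enabled: bool = True):
        # None models the high-impedance (disconnected) output state.
        return self._stored if output_enabled else None


def split_nibbles(byte: int):
    """Split a byte into (upper, lower) 4-bit values, one per display driver."""
    return (byte >> 4) & 0x0F, byte & 0x0F


latch = OctalLatch()
latch.update(0x27, latch_enable=True)   # byte is captured
latch.update(0x55, latch_enable=False)  # ignored, latch holds 0x27
tens, units = split_nibbles(latch.read())
print(tens, units)
```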

IC 7448 (7 segment driver)
• Converts BCD data into control signals for the 7 segment display.
• ABCD are the BCD inputs.
• a-g are outputs for the 7 segment display.
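The BCD-to-segment conversion described above amounts to a lookup table. The Python sketch below uses the common textbook segment patterns for the digits 0-9; the exact glyph shapes produced by a particular decoder IC may differ for some digits, and the function names are hypothetical. It assumes A is the least significant input and D the most significant.

```python
# Textbook segment patterns: which of the segments a-g are lit per digit.
SEGMENTS = {
    0: "abcdef",
    1: "bc",
    2: "abdeg",
    3: "abcdg",
    4: "bcfg",
    5: "acdfg",
    6: "acdefg",
    7: "abc",
    8: "abcdefg",
    9: "abcdfg",
}


def inputs_to_value(d: int, c: int, b: int, a: int) -> int:
    """Combine the four BCD input bits (D = MSB, A = LSB) into a number."""
    return (d << 3) | (c << 2) | (b << 1) | a


def bcd_to_segments(value: int) -> str:
    """Return the lit segments for a BCD digit 0-9."""
    if value not in SEGMENTS:
        raise ValueError("only BCD digits 0-9 are covered by this sketch")
    return SEGMENTS[value]


digit = inputs_to_value(0, 1, 1, 1)
print(digit, "->", bcd_to_segments(digit))
```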

7 SEGMENT DISPLAY
• Segments a-g are connected to the outputs of IC 7448.
• Can display all hexadecimal digits from 0-9 and A-F.

METHODOLOGY
Trained words and codes fed in RAM
Data from RAM to latch

Speech input from mic

To 7 segment display or any other processing unit or interface.

• The IC HM2007 is the heart of this speech recognition circuit.
• The IC provides an analog front end, voice analysis, speech recognition and system control.
• The IC gets the analog signal from the mic and converts it into digital codes.
• These codes are assigned specific notations or codes by the user and stored in the RAM; this is called training. Whenever the IC gets the same speech input again, it gives the same notation as assigned before.
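The train-then-recognize cycle described above can be illustrated with a toy software model. The slides do not describe how the chip compares speech internally, so the nearest-template matching with Euclidean distance used here is purely an illustrative assumption, as are the class name, the feature vectors and the threshold value. Only the "no match" code 77 comes from the slides.

```python
import math

NO_MATCH = 77  # "no match" error code listed on the slides


class ToyRecognizer:
    """Toy model: stores one feature vector per trained word slot."""

    def __init__(self, threshold: float) -> None:
        self.templates = {}         # slot number -> feature vector (stands in for the RAM)
        self.threshold = threshold  # hypothetical maximum distance for a match

    def train(self, slot: int, features) -> None:
        """Training: store the features under the notation chosen by the user."""
        self.templates[slot] = list(features)

    def recognize(self, features) -> int:
        """Return the slot of the closest stored template, or NO_MATCH."""
        best_slot, best_dist = NO_MATCH, math.inf
        for slot, template in self.templates.items():
            dist = math.dist(template, features)
            if dist < best_dist:
                best_slot, best_dist = slot, dist
        return best_slot if best_dist <= self.threshold else NO_MATCH

    def clear(self) -> None:
        """Erase all trained words."""
        self.templates.clear()


recognizer = ToyRecognizer(threshold=1.0)
recognizer.train(1, [0.0, 0.0])
recognizer.train(2, [5.0, 5.0])
print(recognizer.recognize([0.2, 0.1]))    # close to the template in slot 1
print(recognizer.recognize([20.0, 20.0]))  # far from every template
```

An input close to a stored template returns that template's slot number, while an input far from all templates returns the "no match" code.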

CHALLENGES FACED AND THEIR SOLUTIONS
• External noise reduction
External noise has been reduced by using a band-pass filter with a passband of 300 Hz - 3.1 kHz.
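To see what such a passband does to different frequencies, the Python sketch below evaluates the gain of an idealized band-pass response built from a first-order high-pass at 300 Hz cascaded with a first-order low-pass at 3.1 kHz. The slides do not state the filter order or topology, so the first-order model and the function name `bandpass_gain` are assumptions made for illustration.

```python
import math

F_LOW = 300.0    # lower cutoff in Hz, from the slide
F_HIGH = 3100.0  # upper cutoff in Hz, from the slide


def bandpass_gain(freq_hz: float) -> float:
    """Magnitude response of a first-order high-pass and low-pass in cascade."""
    ratio_low = freq_hz / F_LOW
    high_pass = ratio_low / math.sqrt(1 + ratio_low ** 2)
    low_pass = 1 / math.sqrt(1 + (freq_hz / F_HIGH) ** 2)
    return high_pass * low_pass


# Mains hum (50 Hz), both cutoffs, mid-band speech, and high-frequency noise.
for freq in (50, 300, 1000, 3100, 10000):
    print(f"{freq:>6} Hz -> gain {bandpass_gain(freq):.2f}")
```

Frequencies in the middle of the speech band pass almost unchanged, while low-frequency hum and high-frequency noise are attenuated. A higher-order filter would cut off more sharply.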

• Homonyms
Homonyms are words that sound alike. For instance, the words cat, bat, sat and fat sound alike. Because of their similar sound, they can confuse the speech recognition circuit.

• The Voice with Stress & Excitement
Stress and excitement alter one's voice, which affects the accuracy of the circuit's recognition. To achieve more accurate word recognition, one needs to mimic the excitement in one's voice when programming the circuit. These factors should be kept in mind to achieve the highest accuracy possible from the circuit. This becomes increasingly important when the speech recognition circuit is taken out of the lab and put to work in the outside world.

ADVANTAGES AND DISADVANTAGES
Advantages
• This technology is a great boon for blind and physically disabled people, as they can use voice recognition technology for their work.
• Speech recognition needs only the voice, which is recorded irrespective of the language in which it is spoken, so the technology can be used with any language.

Drawbacks
• If the system has to work in noisy environments, background noise may corrupt the original data and lead to misinterpretation.
• Words that are pronounced alike, for example "their" and "there", are difficult for this technology to distinguish.

FUTURE SCOPE
• No typing by keyboard would be required, as your voice will act as an interface between you and your computer.
• Using speech for controlling devices like car stereos, GPS, home appliances and various other devices.

• Sub vocal speech recognition
Normal speech recognition can be extended to sub vocal speech recognition. This technology enables speech recognition through sub vocal utterances. It is a boon to people who cannot speak or who face problems in speaking.
