
Intelligent Speech Recognition For Real Time Control Of DC Motor

CHAPTER 1 INTRODUCTION
Speech is one of the most natural forms of human communication. This has been true since the dawn of civilization, and the invention and widespread use of the telephone, audio storage media, radio, and television have given even further importance to speech communication and speech processing. Advances in digital signal processing technology have led to the use of speech processing in many application areas, such as speech compression, enhancement, synthesis, and recognition.

Speech recognition is the process by which a computer identifies spoken words: you talk to the computer and it correctly recognizes what you are saying. This is the key to any speech-related application. There are a number of ways to do this, but the basic principle is to extract certain key features from the uttered speech and then use those features to recognize the word when it is uttered again. In this thesis, the issue of speech recognition is studied and a speech recognition system for command recognition is developed using a vector quantization model.

The concept of a machine that can recognize the human voice has long been an accepted feature of science fiction. Recent developments have made it possible to use this in security systems.

CHAPTER 2
Dept. of ECE, SNGCE


PROPOSED BLOCK DIAGRAM


Blocks: Microphone (1), PC (2), RS232 (3), LPC 2388 Microcontroller Development Board ARM7 (4), Buzzer (5), LM293D (6), DC motor (7), GSM Module (8), SD Card (9), LCD (10), LF Amplifier (11)
Voice commands: Rotate Forward, Rotate Backward, Stop, To control speed
Fig: 2.1 Proposed block diagram

2.1 BLOCK DIAGRAM EXPLANATION

2.1.1 Microphone
The microphone is used to capture the sound.

2.1.2 Personal Computer
The PC captures the sound from the microphone and processes the signal to identify the speaker. Fig. 2.2 shows the processing performed on the PC.


Audio signal -> MFCC -> Vector Quantization -> Euclidean Distance -> Recognized Speaker O/P
Fig: 2.2 Basic speech recognition block diagram

MFCC is based on the human peripheral auditory system. Human perception of the frequency content of speech sounds does not follow a linear scale. Thus, for each tone with an actual frequency f, measured in Hz, a subjective pitch is measured on a scale called the mel scale. The mel frequency scale has linear frequency spacing below 1000 Hz and logarithmic spacing above 1 kHz. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels.

2.1.3 RS232
The RS-232 serial port connects the ARM7 microcontroller to the PC. RS-232 is one of the most popular serial interfaces. On the microcontroller side it is handled in Embedded C, which sends and reads data through the serial port; here the command is sent from MATLAB over RS-232. RS-232 is a long-established interface, so newer standards are usually provided with RS-232 converters, and data from RS-232 is easy to handle. The converters were originally hardware based, but nowadays they are often provided in software as virtual COM ports.

The MAX232 is an integrated circuit that converts signals from an RS-232 serial port to signals suitable for TTL-compatible digital logic circuits. It is a dual driver/receiver and typically converts the RX, TX, CTS and RTS signals. The drivers provide RS-232 voltage level outputs (approx. ±7.5 V) from a single +5 V supply via on-chip charge pumps and external capacitors (1 µF in the MAX232 datasheet). This makes it useful for implementing RS-232 in devices that otherwise do not need any voltages outside the 0 V to +5 V range, as the power supply design does not need to be made more complicated just to drive the RS-232 port. The receivers reduce RS-232 inputs (which may be as high as ±25 V) to standard 5 V TTL levels. These receivers have a typical threshold of 1.3 V and a typical hysteresis of 0.5 V.

2.1.4 LPC 2388 Microcontroller Development Board (ARM7)
An MCB2300 board populated with the LPC 2388 is used to control the DC motor drive.

2.1.5 Buzzer
The buzzer sounds when a voice command is recognized and generates a long beep when an error occurs, for example on a wrong input.

2.1.6 L293D
The L293D is a dual H-bridge motor driver, so one IC can interface two DC motors, each controllable in both the clockwise and counterclockwise directions. If the motors only need to turn in one direction, all four outputs can be used to connect up to four DC motors. The name "H-bridge" is derived from the shape of the switching circuit that controls the motion of the motor; it is also known as a "full bridge". An H-bridge has four switching elements, as shown in Fig. 2.3. The L293D is one of the most widely used H-bridge driver ICs.


Fig: 2.3 L293D driver IC
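The direction control offered by one H-bridge channel can be sketched in C. This is only an illustration under assumptions: the pin names (EN, IN1, IN2), the type names and the command enum are hypothetical and are not taken from the project's firmware, and which input pattern counts as "forward" depends on how the motor is wired.

```c
#include <stdbool.h>

/* Hypothetical command set, mirroring the voice commands
   "Rotate Forward", "Rotate Backward" and "Stop". */
typedef enum { CMD_STOP, CMD_FORWARD, CMD_BACKWARD } motor_cmd_t;

/* Logic levels for one H-bridge channel (names are assumptions). */
typedef struct {
    bool en;   /* channel enable */
    bool in1;  /* input 1 */
    bool in2;  /* input 2 */
} hbridge_pins_t;

/* Map a command to H-bridge levels: opposite input levels drive the
   motor in one direction or the other; disabling the channel stops it. */
hbridge_pins_t hbridge_levels(motor_cmd_t cmd)
{
    hbridge_pins_t p = { false, false, false };
    switch (cmd) {
    case CMD_FORWARD:  p.en = true; p.in1 = true;  p.in2 = false; break;
    case CMD_BACKWARD: p.en = true; p.in1 = false; p.in2 = true;  break;
    case CMD_STOP:
    default:           break; /* channel disabled, motor coasts */
    }
    return p;
}
```

On real hardware the returned levels would be written to the GPIO pins wired to the driver; keeping the mapping in a pure function makes it easy to check without a board.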

2.1.7 DC motor
In this project, speech commands are used to control the DC motor. A stepper motor could be used instead, but a DC motor is preferred here. The speed and direction of rotation of a motor can be controlled in many ways; here, speech commands are used. The project demonstrates how to interface and control a DC motor with an ARM microcontroller using speech recognition. DC motors are used extensively in industry for applications such as robot arm drives, machine tools, rolling mills, and aircraft control.


Fig:2.4 DC motor

2.1.8 GSM Module (Pulraj GSM-P2 Modem)
To read an SMS in text mode:
    AT+CMGF=1    (press Enter)
    AT+CMGR=no.
Here no. is the index number of the message stored on the SIM card. When a new SMS arrives, an unsolicited result code (URC) is shown on the screen as +CMTI: "SM",no. Use this number in the AT+CMGR command to read the message.

2.1.9 SD Card
The Secure Digital Memory Card (SDC) is the standard memory card for mobile equipment. The SDC was developed as upward compatible with the MultiMediaCard (MMC), so SDC-compliant equipment can also use an MMC with a few considerations. The MMC/SDC contains a microcontroller, and the flash memory control operations (erasing, reading, writing and error control) are completed inside the memory card. Data is transferred between the memory card and the host controller in blocks of 512 bytes, so from the viewpoint of the upper-level layers the card looks like a generic hard disk drive. The currently defined file system for the memory card is FAT12/16 with the FDISK partitioning rule. FAT32 is defined only for high-capacity (>= 4G) cards.


Fig:2.5 SD card

Three of the contacts are assigned to the power supply, so the number of effective signals is six (SDC). The data transfer between the host and the card is therefore done over a synchronous serial interface. The SDC works at a supply voltage of 2.7 to 3.6 volts, so the supply can be fixed to a suitable value within this range.

2.1.10 LCD
The LCD displays the current direction of the motor.

Fig:2.6 LCD display

2.1.10.1 LCD Interface and Programming
In recent years LCDs have come into daily use, replacing LEDs (single, seven-segment or multi-segment) because of the declining price of LCDs and their ability to display numbers, characters and graphics. Another advantage is that the refresh controller is built into the LCD, which relieves the CPU of the task of refreshing the display.

2.1.10.2 LCD Background
A microcontroller program must interact with the outside world using input and output devices that communicate directly with a human being. One of the most common such devices is an LCD. Here a 16 by 2 LCD is used, meaning 16 characters per line by 2 lines. A popular standard exists that allows communication with the vast majority of LCDs regardless of their manufacturer. The standard is referred to as HD44780U, after the controller chip that receives data from an external source (in this case, the microcontroller) and communicates directly with the LCD.

2.1.10.3 44780 BACKGROUND
The 44780 standard requires 3 control lines and either 4 or 8 I/O lines for the data bus. The user may select whether the LCD operates with a 4-bit or an 8-bit data bus. With a 4-bit data bus the LCD requires a total of 7 lines (3 control lines plus 4 data lines); with an 8-bit data bus it requires 11 lines (3 control lines plus 8 data lines). The three control lines are referred to as EN, RS, and RW.

The EN line is the Enable line. It tells the LCD that data is being sent. To send data to the LCD, the program should make sure this line is low (0) and then set the other two control lines and/or put data on the data bus. When the other lines are ready, bring EN high (1), wait for the minimum time required by the LCD datasheet, and finish by bringing it low (0) again.

The RS line is the Register Select line. When RS is low (0), the data is treated as a command or special instruction (such as clear screen or position cursor). When RS is high (1), the data being sent is text to be displayed on the screen.


The RW line is the Read/Write control line. When RW is low (0), the information on the data bus is being written to the LCD. When RW is high (1), the program is reading from the LCD. Only one instruction ("Get LCD status") is a read command; all others are write commands, so RW will almost always be low.

The data bus consists of 4 or 8 lines, depending on the mode of operation selected by the user. In the case of an 8-bit data bus, the lines are referred to as DB0, DB1, DB2, DB3, DB4, DB5, DB6, and DB7.

2.1.10.4 LCD 4-Bit Operation
In 4-bit mode, 4 data lines are used instead of the 8-line data bus. Each byte of data is split into two nibbles. Interfacing the LCD with 4 data pins saves 4 microcontroller lines, which can be used for other purposes.

2.1.10.5 To initialize LCD in 4 bit mode
To initialize the LCD in 4-bit mode, 0x30 is written to the LCD three times.

2.1.10.6 LCD Functions
User functions, to be called from your program (function prototypes are in the lcd header file):
    void lcd_init (void);
    void lcd_clear (void);
    void lcd_putchar (char c);
    void set_cursor (unsigned char column, unsigned char line);
    void lcd_print (unsigned char const *string);
A number of low-level functions are called from the user functions (see the lcd.c file):
    void lcd_write_cmd (unsigned char c);
    void lcd_write_4bit (unsigned char c);
    static unsigned char wait_while_busy (void);
    static unsigned char lcd_read_status (void);

2.1.11 LF Amplifier
The LF amplifier is used to generate an audio indication of the current direction of the motor. The SD card is used to store the speech: since the speech signal is very large, the microcontroller memory is not enough to store it.
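The byte-to-nibble split used in 4-bit LCD operation (Section 2.1.10.4) can be sketched as follows. This is a minimal illustration, not the contents of the project's lcd.c: the type `lcd_xfer_t` and the function `lcd_split_byte` are hypothetical names, and the sketch only computes what would be placed on the four data lines, assuming the high nibble is sent first as the HD44780 expects. Each transfer would then be latched with the EN pulse described in Section 2.1.10.3.

```c
#include <stddef.h>
#include <stdint.h>

/* One bus transfer: the RS level and the 4-bit value for DB4..DB7. */
typedef struct {
    uint8_t rs;      /* 0 = command, 1 = character data */
    uint8_t nibble;  /* value placed on the 4 data lines */
} lcd_xfer_t;

/* Split a byte into two 4-bit transfers, high nibble first.
   Returns the number of transfers written to out. */
size_t lcd_split_byte(uint8_t value, uint8_t rs, lcd_xfer_t out[2])
{
    out[0].rs = rs;
    out[0].nibble = (uint8_t)(value >> 4);      /* upper four bits */
    out[1].rs = rs;
    out[1].nibble = (uint8_t)(value & 0x0Fu);   /* lower four bits */
    return 2;
}
```

For example, the character 'A' (0x41) sent with RS high becomes the nibbles 0x4 and 0x1.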


CHAPTER 3 SPEECH RECOGNITION


3.1 PRINCIPLES OF SPEECH RECOGNITION
Speech has evolved over a period of ten thousand years as the primary means of communication between human beings. Speech recognition is a major topic in speech signal processing and is considered one of the most popular and reliable biometric technologies used in automatic personal identification systems. Speech recognition systems are used in a variety of applications, such as multimedia browsing tools, access centres, security and finance. They also allow people working in active environments to use a computer.

The physiological structure of the vocal tract differs from person to person, which is why one person's voice can be distinguished from another's. This difference in vocal tract structure is reflected in the frequency spectrum of the speech signal, and the speech spectrum has been used for speaker identification for a very long time.

The Mel-frequency cepstral coefficients (MFCC) feature was first proposed for speech recognition. It is a filter-bank-based approach implemented using a time-frequency analysis technique: time analysis is done first through the framing operation, and frequency analysis is then done by passing each frame through a filter bank. Because the time analysis is done first, MFCC needs overlapping frames to handle the quasi-periodicity of the speech signal. The filters are designed to resemble human auditory frequency perception. MFCC was later used satisfactorily for speaker recognition as well, and it is presently the most widely used feature set for speaker recognition. In the MFCC filter bank, the low frequencies are given more importance than the high frequencies, a structure that is well suited to speech recognition.

3.2 CLASSIFICATION OF SPEECH RECOGNITION SYSTEM
Speaker recognition methods can be divided into text-independent and text-dependent methods.


3.2.1 Text Independent System
In a text-independent system, speaker models capture characteristics of somebody's speech that show up irrespective of what is being said. A text-independent ASIS does not rely on a specific text being spoken in either the training or the testing phase. It relies on long-term statistical characteristics of speech to make a successful identification.

3.2.2 Text-Dependent System
A text-dependent ASI system (ASIS) uses a fixed utterance, such as a password, card number or PIN code, in both the training and testing phases, and relies on specific features of the test utterance in order to effect a match. A text-dependent system requires less training than a text-independent one, which makes it well suited to practical applications.

Every speaker recognition technology, whether identification or verification, text-independent or text-dependent, has its own advantages and disadvantages and may require different treatments and techniques. The choice of technology is application-specific. At the highest level, all speaker recognition systems contain two main modules: feature extraction and feature matching.

3.3 SPEECH RECOGNITION ALGORITHMS
Speech recognition can be divided into two phases:

Training Phase: Each speaker has to provide samples of their voice so that the reference template model can be built.

Testing Phase: The input test voice is matched against the stored reference template model and a recognition decision is made.

Fig: 3.1 Speech Recognition Algorithm


3.3.1 Training Phase
In the training phase, the frequency components of the given speech signal are extracted.

Input Speech -> Feature Extraction -> Features -> Generate/Update Reference

3.3.2 Testing Phase
In the testing phase, the input speech is matched against the stored reference model(s), and a recognition decision is made on the basis of features such as Mel Frequency Cepstrum Coefficients (MFCC) or Linear Prediction Coding (LPC).

Test Speech -> Feature Extraction -> Features -> Similarity Measure (against the Reference from the Training Phase) -> Scores -> Decision Logic -> Decision

3.3.3 The Complete Speech Recognition System
Training phase: Reference speech sample -> Feature extraction -> Training
Testing (recognition) phase: Test speech sample -> Feature extraction -> Pattern matching -> Recognition decision -> Recognition output

3.4 FEATURE EXTRACTION
Continuous speech -> Frame Blocking -> Windowing -> FFT -> Mel-Frequency Wrapping -> Cepstrum -> Mel Cepstrum

3.4.1 Preprocessing
Preprocessing is the first step of speech signal processing and involves converting the analog speech signal into digital form. It is a crucial step that enables further processing. The continuous-time speech signal is sampled at discrete time points to form a sampled-data signal, and the samples are then quantized to produce a digital signal. Periodic sampling obtains a sequence of samples x[n] from a continuous-time signal x(t) according to the relationship

x[n] = x(nT)

where T is the sampling period, 1/T = fs is the sampling frequency in samples/second, and n is the sample index. More signal data is obtained if the samples are taken closer together, that is, by making T smaller.


The number of samples in a digital signal is determined by the sampling frequency and the length of the speech signal in seconds. For example, if a speech signal is recorded for 2 seconds at a sampling frequency of 10000 Hz, the number of samples is 10000 Hz x 2 s = 20000.
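The sampling relationships above can be checked with a short C sketch. The function names are illustrative only and do not come from the project code; the sketch simply evaluates N = fs x duration and T = 1/fs.

```c
#include <stddef.h>

/* Number of samples in a recording: N = fs * duration,
   rounded to the nearest whole sample. */
size_t sample_count(unsigned long fs_hz, double duration_s)
{
    return (size_t)((double)fs_hz * duration_s + 0.5);
}

/* Sampling period T = 1 / fs, in seconds. */
double sampling_period(unsigned long fs_hz)
{
    return 1.0 / (double)fs_hz;
}
```

With the figures from the text, `sample_count(10000, 2.0)` gives 20000 samples, and the sampling period at 10000 Hz is 0.1 ms.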

3.4.2 Framing
Framing is the process of segmenting the speech samples obtained from analog-to-digital (A/D) conversion into short frames of 20 to 40 milliseconds. Speech is known to exhibit quasi-stationary behavior over such a short period (20 to 40 milliseconds). Framing therefore segments the non-stationary speech signal into quasi-stationary frames and enables Fourier transformation of the speech signal. A single Fourier transform of the entire speech signal cannot capture its time-varying frequency content, because the signal is non-stationary, so the Fourier transform is performed on each segment separately. If the frame is not too long (20 to 40 milliseconds), the properties of the signal will not change appreciably from the beginning of the segment to the end. Thus, the DFT of a windowed speech segment displays the frequency-domain properties of the signal at the time corresponding to the window location.

In addition, each frame overlaps its previous frame by a predefined amount. The goal of the overlapping scheme is to smooth the transition from frame to frame.

3.4.3 FFT
The next processing step is the Fast Fourier Transform (FFT), which converts each frame of N samples from the time domain into the frequency domain. The FFT is a fast algorithm for computing the Discrete Fourier Transform (DFT).

3.4.4 Mel Frequency Wrapping


The speech signal consists of tones with different frequencies. For each tone with an actual frequency f, measured in Hz, a subjective pitch is measured on the mel scale. The mel-frequency scale has linear frequency spacing below 1000 Hz and logarithmic spacing above 1000 Hz. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels. The following formula can be used to compute the mels for a given frequency f in Hz:

Mel(f) = 2595 * log10(1 + f/700)

3.4.5 Cepstrum
The output of the mel-frequency wrapping step is the log mel spectrum, which has to be converted back to the time domain. This is done using the Discrete Cosine Transform (DCT), and the result is called the mel frequency cepstrum coefficients (MFCCs). The number of mel cepstrum coefficients, K, is typically chosen as 33. The first component, C0, is excluded from the DCT since it represents the mean value of the input signal, which carries little speaker-specific information.

3.5 FEATURE MATCHING

3.5.1 Vector Quantization
Vector quantization (VQ) is the generalization of scalar quantization to groups of n samples (pixels, in the image-coding case) called vectors of length n. A vector quantizer needs only a codebook and a distortion measure, usually the mean squared error (MSE). VQ is a lossy data compression method based on the principle of block coding. It is a fixed-to-fixed length algorithm. In 1980, Linde, Buzo, and Gray (LBG) proposed a VQ design algorithm based on a training sequence. The use of a training sequence bypasses the need for multidimensional integration. A VQ designed using this algorithm is referred to in the literature as an LBG-VQ.

VQ is a process of mapping vectors from a large vector space to a finite number of regions in that space. Each region is called a cluster and is represented by its centre, called a centroid. The collection of all the centroids makes up a codebook.
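The mapping of a feature vector to its cluster can be sketched as a full search over the codebook. This is a minimal sketch under assumptions: the function names are hypothetical, the codebook is assumed to be stored row by row in a flat array, and the squared Euclidean distance is used as the distortion measure (it ranks centroids identically to the Euclidean distance). It is not the project's MATLAB implementation.

```c
#include <stddef.h>
#include <float.h>

/* Squared Euclidean distance between two vectors of length dim. */
double sq_distance(const double *a, const double *b, size_t dim)
{
    double sum = 0.0;
    for (size_t i = 0; i < dim; ++i) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Full search: index of the centroid closest to vec.
   codebook holds count centroids, stored row by row
   (count * dim values in total). */
size_t nearest_centroid(const double *codebook, size_t count,
                        size_t dim, const double *vec)
{
    size_t best = 0;
    double best_dist = DBL_MAX;
    for (size_t k = 0; k < count; ++k) {
        double d = sq_distance(codebook + k * dim, vec, dim);
        if (d < best_dist) {
            best_dist = d;
            best = k;
        }
    }
    return best;
}
```

Summing the best distances over all feature vectors of an utterance gives a total distortion per codebook, which is one common way to compare an utterance against several stored references.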
The amount of data is significantly reduced, since the number of centroids is at least ten times smaller than the number of vectors in the original sample. This reduces the amount of computation needed for comparison in later stages. Even though the codebook is smaller than the original sample, it still accurately represents a person's voice characteristics. The only difference is that there will be some spectral distortion. The sound is compressed to reduce the storage requirement.

An element of the finite set of spectra in a codebook is called a codevector. The codebooks are used to generate indices or discrete symbols. If the size of the VQ codebook is M = 2^N codewords, then L training vectors are required, with L >> M. It has been found that L should be at least 10M in order to train a VQ codebook that works well. For this project, we use the LBG algorithm, also known as the binary split algorithm.

3.5.2 LBG Design Algorithm
The LBG VQ design algorithm is an iterative algorithm proposed by Y. Linde, A. Buzo and R. Gray. It alternately solves the optimality criteria and is used to generate the codebook. It minimizes a distortion criterion applied to the vectors of a training set. The maximal rate of encoding is given by

$R = \frac{\log_2 L}{n}$ bits/pixel,

where L here is the number of codebook vectors and n is their dimension. For a fixed value of R, several pairs (L, n) can be used. Because a full search is applied to a large volume of data, L is limited to 256 and n to 4 (a 2x2 block), which gives a rate of 2 bpp.
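The "binary split" step that gives the LBG algorithm its alternative name can be sketched as below. The text does not spell out the splitting rule, so this sketch assumes the commonly used form in which each centroid c is replaced by c(1 + eps) and c(1 - eps); the function names and the choice of eps are assumptions. After each split, a full LBG implementation would re-cluster the training vectors and update the centroids until the distortion stops improving, which is not shown here.

```c
#include <stddef.h>

/* Split every centroid c into c*(1+eps) and c*(1-eps), doubling the
   codebook size. in holds count*dim values; out must have room for
   2*count*dim values. */
void lbg_split(const double *in, size_t count, size_t dim,
               double eps, double *out)
{
    for (size_t k = 0; k < count; ++k) {
        for (size_t i = 0; i < dim; ++i) {
            double c = in[k * dim + i];
            out[(2 * k) * dim + i]     = c * (1.0 + eps);
            out[(2 * k + 1) * dim + i] = c * (1.0 - eps);
        }
    }
}

/* Rule of thumb from the text: at least 10 training vectors
   per codeword (L >= 10M). Returns 1 if satisfied, 0 otherwise. */
int enough_training_vectors(size_t num_vectors, size_t codebook_size)
{
    return num_vectors >= 10 * codebook_size;
}
```

Starting from a single centroid (the mean of all training vectors) and splitting repeatedly yields codebooks of size 1, 2, 4, and so on, which matches a codebook size of the form M = 2^N.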


CHAPTER 4 ARM MICROCONTROLLER


The ARM7TDMI core is a member of the ARM family of general-purpose 32-bit microprocessors. The ARM family offers high performance for very low power consumption and small size. The ARM architecture is based on Reduced Instruction Set Computer (RISC) principles. The RISC instruction set and related decode mechanism are much simpler than those of Complex Instruction Set Computer (CISC) designs. This simplicity gives:
- a high instruction throughput
- an excellent real-time interrupt response
- a small, cost-effective processor macrocell

The LPC2364/6/8/78 is an ARM-based microcontroller for applications requiring serial communications for a variety of purposes.

4.1 FEATURES
- ARM7TDMI-S processor, running at up to 72 MHz.
- Up to 512 kB on-chip Flash program memory.
- Up to 32 kB of SRAM on the ARM local bus for high-performance CPU access, 16 kB of static RAM for the Ethernet interface and 8 kB of static RAM for the USB interface.
- Dual AHB system that provides for simultaneous Ethernet DMA and USB DMA.
- 8-bit data/16-bit address parallel bus, available in the LPC2378 only.
- Advanced Vectored Interrupt Controller, supporting up to 32 vectored interrupts.
- Secure Digital (SD) / MultiMediaCard (MMC) memory card interface.
- Up to 70 (LPC2364/6/8) or 104 (LPC2378) general purpose I/O pins.
- 10-bit A/D converter with input multiplexing among 6 pins.
- 10-bit D/A converter.
- Four general purpose timers with two capture inputs each.
- Real Time Clock with separate power pin; the clock source can be the RTC oscillator or the APB clock.
- DMA.
- Watchdog timer, which can be clocked from the internal RC oscillator, the RTC oscillator, or the APB clock.
- Standard ARM test/debug interface for compatibility with existing tools.
- Emulation Trace Module.
- Four external interrupt inputs.

4.2 SERIAL INTERFACES
- Ethernet MAC with associated DMA controller.
- Four UARTs with fractional baud rate generation, one with modem control I/O, one with IrDA support, all with FIFO.
- Two CAN channels, three I2C interfaces and an SPI controller, which reside on the APB bus.

4.3 GENERAL PURPOSE I/O


The LPC23xx has up to five general purpose I/O ports, each containing 32 I/O lines, giving a maximum of 160 pins. PORT0 and PORT2 can generate an interrupt on a rising or falling edge on an individual pin.

4.3.1 Fast IO Registers
The LPC23xx family has a set of GPIO control registers located on the local bus, called the Fast GPIO control registers. On reset, the pin connect block configures all the peripheral pins as general purpose I/O (GPIO) input pins. The GPIO pins are controlled by four registers:

FIOPIN   FIOSET   FIOCLR   FIODIR
4.1 GPIO Register

Each GPIO pin is controlled by a bit in each of the four GPIO registers. These bits provide data direction, set, clear and pin status. The FIODIR register allows each pin to be individually configured as an input (0) or an output (1). If the pin is an output, the FIOSET and FIOCLR registers control the state of the pin. The state of a GPIO pin can be read at any time by reading the FIOPIN register. The FIOMASK register is used to mask individual bits of the FIOSET, FIOCLR and FIOPIN registers. This masking helps speed up low-level I/O bit manipulation. PORT0 and PORT1 can be accessed as general purpose as well as fast ports, but P2, P3 and P4 can be accessed only as fast ports.

4.4 UART
The LPC23xx devices have four on-chip UARTs. They are all identical to use, but UART1 has additional modem support and UART3 has IrDA support. All the UARTs have a built-in baud rate generator with autobaud capability and 16-byte transmit and receive FIFOs.

First, the pin select block must be programmed to switch the processor pins from GPIO to the UART functions. Then the Line Control Register (LCR) configures the format of the transmitted data. Usually the character format is set to 8 bits, no parity and one stop bit. The LCR has an additional bit called DLAB, the divisor latch access bit, which must be set in order to program the baud rate generator. The baud rate generator is a sixteen-bit prescaler that divides down Pclk to generate the UART clock, which must run at 16 times the baud rate. The divisor is calculated as

Divisor = Pclk / (16 x BAUD)

For Pclk = 30 MHz and 9600 baud, Divisor = 30,000,000 / (16 x 9600) = 195.3, which rounds to 195 or 0xC3. Often it is not possible to get an exact baud rate for the UARTs, but they will work with up to around a 5% error in the bit timing. The divisor value is held in two registers: Divisor Latch MSB (DLM) and Divisor Latch LSB (DLL).

4.4.1 Data Transfer
Once the UART is initialised, characters can be transmitted by writing to the Transmit Holding Register. Similarly, characters may be received by reading from the Receive Buffer Register. Both registers occupy the same memory location: writing a character places it in the transmit FIFO, and reading from this location loads a character from the receive FIFO. The putchar() and getchar() functions are used to write and read a single character to and from the UART.

4.5 ADC (Analog to Digital Converter)
The A/D converter on LPC2300 variants is a 10-bit successive approximation converter with a conversion time of 2.44 µs. The A/D converter has either 6 or 8 multiplexed inputs, depending on the variant. The A/D control register establishes the configuration of the converter and controls the start of conversion. The first step in configuring the converter is to set up the peripheral clock. The A/D clock is derived from PCLK, which must be divided down to 4.5 MHz. This is a maximum value: if PCLK cannot be divided down to exactly 4.5 MHz, the nearest achievable value below 4.5 MHz should be selected.
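Both the UART baud rate divisor of Section 4.4 and the A/D clock divider are derived from PCLK, and the arithmetic can be sketched in C. The function names are hypothetical, and the sketch only computes the values; it does not touch any registers. The A/D part assumes that the A/D clock equals PCLK divided by (CLKDIV + 1).

```c
#include <stdint.h>

/* UART divisor: Pclk / (16 * baud), rounded to the nearest integer.
   The result would be split across DLM (high byte) and DLL (low byte). */
uint16_t uart_divisor(uint32_t pclk_hz, uint32_t baud)
{
    uint32_t denom = 16u * baud;
    return (uint16_t)((pclk_hz + denom / 2u) / denom);
}

/* Smallest CLKDIV for which pclk / (CLKDIV + 1) <= max_adclk_hz.
   Assumes pclk_hz and max_adclk_hz are both non-zero. */
uint32_t adc_clkdiv(uint32_t pclk_hz, uint32_t max_adclk_hz)
{
    /* Ceiling division gives the smallest sufficient divider. */
    uint32_t divider = (pclk_hz + max_adclk_hz - 1u) / max_adclk_hz;
    return divider - 1u;
}
```

For Pclk = 30 MHz, `uart_divisor(30000000, 9600)` evaluates to 195 (0xC3), and `adc_clkdiv(30000000, 4500000)` evaluates to 6, giving an A/D clock of about 4.29 MHz, just under the 4.5 MHz limit.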


A/D control register: PCLK is divided by the value stored in the CLKDIV field plus one. Hence the equation for the A/D clock is:

CLKDIV = (PCLK / ADCLK) - 1

Unlike other peripherals, the A/D converter can make measurements on the external pins when they are configured as GPIO pins. Once the A/D converter is configured, a conversion can be made. The A/D has two conversion modes, hardware and software. The hardware mode allows you to select a number of channels and then set the A/D running. In this mode a conversion is made for each channel in turn until the converter is stopped. At the end of each conversion the result is available in the A/D global data register and in a dedicated result register for each channel, ADDR0 to ADDR7. At the end of a conversion the DONE bit is set, and an interrupt may also be generated if the global enable and channel interrupt enable bits are set in the A/D interrupt enable register.

The conversion result is stored as a ratio of the voltage on the analog channel to the voltage on the analog power supply pin. The number of the channel for which the conversion was made is stored alongside the result, in the CHN field. Finally, if the result of a conversion is not read before the next result is due, it is overwritten by the fresh result and the OVERUN bit is set to one. If you are using multiple A/D channels, the A/D status register provides global access to the DONE and OVERUN bits for each channel.

4.6 DAC (Digital To Analog Converter)
The LPC23xx variants have a 10-bit digital-to-analog converter. This is an easy-to-use peripheral, as it has only a single register. The DAC is enabled by writing to bits 20 and 21 of PINSEL1, which switches pin 0.26 from GPIO to the AOUT function. A channel of the analog-to-digital converter also shares this pin. The value to be converted is written to the DAC register along with the bias value. Once enabled, a conversion can be started by writing to the VALUE bits in the control register. The conversion time depends on the value of the BIAS bit: if it is zero, the conversion time is 1 µs and the DAC can deliver up to 700 µA; if it is set to one, the conversion time is 2.5 µs and the output current is limited to 350 µA.

4.7 TIMER


The LPC23xx has four general purpose timers, all identical in structure and use. The timers are based around a 32-bit timer-counter with a 32-bit prescaler. The default clock source for all of the timers is the APB peripheral clock, Pclk. The tick rate of a timer is controlled by the value stored in the prescale register. The prescale counter increments on each tick of Pclk until it reaches the value stored in the prescale register. When it reaches the prescale value, the timer-counter is incremented by one and the prescale counter resets to zero and starts counting again.

Capture Mode: Each timer has up to four capture channels. A capture channel captures the value of the timer-counter when an input signal makes a transition.

Counter Mode: The count control register selects whether each timer operates as a counter or as a pure timer.

Match Mode: Each timer has up to four match channels. Each match channel has a match register that stores a 32-bit number. The current value of the timer-counter is compared against the match register, and when the values match, an event is triggered.

4.8 INTERRUPT
There are two interrupt inputs: FIQ (fast interrupt request) and IRQ (interrupt request). In this project, an interrupt is raised whenever the UART receiver has data.
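For the timer described in Section 4.7, the effect of the prescaler and of a match register can be sketched numerically. The function names are hypothetical, and the sketch assumes that the timer-counter advances once every (prescale + 1) Pclk cycles, so that a prescale value of 0 counts every Pclk tick; this should be checked against the device user manual before use.

```c
#include <stdint.h>

/* Timer-counter tick rate for a given prescale value, assuming the
   counter advances once every (prescale + 1) Pclk cycles. */
uint32_t timer_tick_hz(uint32_t pclk_hz, uint32_t prescale)
{
    return pclk_hz / (prescale + 1u);
}

/* Match register value that produces a match event after delay_ms
   milliseconds at the given tick rate. */
uint32_t match_value_for_ms(uint32_t tick_hz, uint32_t delay_ms)
{
    return (uint32_t)(((uint64_t)tick_hz * delay_ms) / 1000u);
}
```

With Pclk = 30 MHz and a prescale value of 29, the timer-counter ticks at 1 MHz, and a match value of 10000 corresponds to a 10 ms interval.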


CHAPTER 5 ARM DEVELOPMENT BOARD


5.1 MCB 2300
The MCB2300 is an ARM development board from Keil. The connectors on the evaluation board provide easy access to many of the on-chip peripherals.

5.1.1 Block Diagram
[Figure 5.1 Block Diagram Of MCB 23xx. Blocks shown: CPU (LPC 23X8), Analog Input, LCD Display, Dual RS 232 (COM), Dual CAN, Configuration Jumpers, USB & Power, Reset & Interrupt, Ethernet, Port LEDs, SD card]

The hardware block diagram displays input, configuration, power system, and user I/O on the board.

5.1.2 MCB 2300 Development board
(1) JTAG Download and Debug: A JTAG interface is on the MCB2300 board and, coupled with the ULINK USB-JTAG adapter, allows flash programming. The on-chip debug interface can perform real-time in-circuit emulation of the LPC2300 device.

Figure 11.2 MCB 2300 Development Board


(2) SD LED: Indicates that an SD card is detected.
(3) & (5) Dual Serial Ports: Standard DB9 connectors are on the MCB2300 for both of the LPC2300's serial ports, COM1 and COM2.
(4) Configuration Jumpers: To enable or disable certain features.
(6) Reset: The Reset push button connects to the reset circuit. This resets the microcontroller.
(7) Potentiometer: It connects to Port 0.23 (AD0.0) when the jumper AD0.0 is enabled. The voltage range is 0.0-3.3 Volts. It is an adjustable analog voltage source on the board for testing the analog-to-digital converter of the LPC2300.
(8) INT0: The INT0 push button connects to the external interrupt 0 (INT0) of the microcontroller. Removing jumper INT0 disables this button.
(9) Port LED driver: To drive the port LEDs. Removing jumper J11 disables the Port LED driver chip.
(10) Port LEDs: The LEDs connect to the eight Port 2 pins P2.0 - P2.7 when jumper J11 is installed. These LEDs are useful for indicating program status while testing applications.
(11) Crystal oscillator: A 12.0 MHz crystal provides the clock signal for the CPU.
(12) Prototyping area: The prototyping area provides blocks for adding components and connecting the address, data, and I/O ports. All 144 pins on the LPC2378 MCU are brought out for prototyping.
(13) LF Amplifier: An LF amplifier on the MCB2300 connects the D/A output of the LPC2300 device to a speaker. This LF amplifier is used to generate sound.
(14) Configuration jumper for LF amplifier: J3 (D/A Output) connects the AOUT output to the LF amplifier.
(15) & (16) Dual CAN Ports: Standard DB9 connectors are on the MCB2300 board for applications requiring CAN communications. An application may use either or both of these ports, or they may be disabled with a configuration jumper.
(17) Configuration Jumpers: To enable or disable certain features.
(18) LCD Contrast: The LCD contrast control adjusts the brightness and contrast of the LCD display.


(19) Ethernet Connector: The MCB2300 board uses 10/100 Fast Ethernet.
(20) USB Host Connector: A connector for USB hosting applications.
(21) Power USB: Supplies power to the board and allows the board to be configured as a USB device.
(22) USB-OTG Connector: A mini-USB connector for USB-OTG (On-The-Go) applications.
(23) LCD Display: A 2-line by 16-character, 8-bit LCD display. You may use this text display device to show real-time debug and program status messages.
(24) SD: The MCB2300 board has one SD memory card connector that allows you to connect a wide range of memory cards. The SD memory card connector is located beneath the LCD panel.
(25) Power LED: Indicates that power is present on the board.
(26) Microcontroller: The NXP LPC2368, LPC2378, LPC2387 or LPC2388 microcontroller provided with the MCB2300 board is a high-end LPC23xx device with advanced ADC, DAC and USB capabilities.

5.2 Hardware Requirements
To use the MCB2300 Evaluation Kit, you need:
- The MCB2300 Evaluation Board.
- An IBM-compatible PC with either of the following: two unused USB ports, one to supply power to the board and one for downloading and debugging; or an unused RS-232 COM port for Flash In-System Programming (ISP) via the serial interface.
To run the Keil debugger using JTAG emulation, you need:
- A ULINK USB-JTAG Adapter.
- Two USB cables.
To program the MCB2300 using the Flash Magic utility, you need:
- A serial cable, 9-pin male to 9-pin female, no longer than 10 ft/3 m, wired one-to-one.

5.3 Software Requirements
You must install the following software to use the MCB2300 Evaluation Board:
Windows Operating System: The Keil µVision tool chain runs in these Windows operating systems:
- Microsoft Windows 2000
- Microsoft Windows XP
To compile, link, and run applications on the MCB2300 Evaluation Board, you must install these Keil products:
- Keil MDK-ARM Evaluation Tools.
- Example programs written for the MCB2300. These programs are included in the Keil MDK-ARM Evaluation Toolkits.

5.4 Writing Programs
The Keil development tools provide a step-by-step process in µVision to create, compile, download, debug, and run a program on the MCB2300 board. The steps are:
1. Create the application program using the µVision IDE and the ARM C/C++ compiler.
2. Download the program to the on-chip Flash of the MCB2300 board.
3. Debug the program using the µVision Debugger and ULINK.


CHAPTER 6 SOFTWARE
6.1 THE ARM COMPILER
The ARM compiler, armcc, is an optimizing C and C++ compiler that compiles Standard C and C++ source code into machine code for ARM-based processors.

6.1.2 µVision3 IDE
The µVision3 IDE is a Windows-based software development platform that combines a robust editor and a project manager. µVision3 integrates all tools including the C compiler, macro assembler, linker/locator, and HEX file generator. µVision3 helps expedite the development of embedded applications by providing the following:
- Full-featured source code editor.
- Device database for configuring the development tool settings.
- Project manager for creating and maintaining your projects.
- Integrated make facility for assembling, compiling, and linking your embedded applications.
- Dialogs for all development tool settings.
- True integrated source-level debugger with high-speed CPU and peripheral simulator.
- Flash programming utility for downloading the application program into Flash ROM.
The µVision3 IDE offers numerous features and advantages that help to develop embedded applications quickly and successfully. The µVision3 IDE and Debugger is the central part of the Keil development tool chain. µVision3 offers a Build Mode and a Debug Mode.

6.2 PROGRAMMING FLASH
The MCB2300 board supports downloading programs to Flash memory using either the ULINK JTAG adapter or the Flash Magic utility. The ULINK USB-JTAG adapter connects the USB port of your PC to the JTAG port of the MCB2300. It supports Flash programming, emulation, and debugging. The Flash Magic utility connects the COM port of your PC to the serial port (UART) of the MCB2300 board. It only supports programming Flash using the ISP Flash interface. Flash Magic is Windows software from the Embedded Systems Academy that allows easy access to all the ISP features provided by the devices. These features include:
- Erasing the Flash memory (individual blocks or the whole device)
- Programming the Flash memory
- Reading Flash memory
- Performing a blank check on a section of Flash memory

- Reading and writing the security bits
- Direct load of a new baud rate (high speed communications)
- Sending commands to place the device in Bootloader mode

6.3 MATLAB
MATLAB is a high-performance language for technical computing. It is a numerical computing environment and fourth-generation programming language. MATLAB (meaning "matrix laboratory") was created in the late 1970s by Cleve Moler, then chairman of the computer science department at the University of New Mexico, and is developed by MathWorks. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include:
- Math and computation
- Algorithm development
- Data acquisition
- Modelling, simulation, and prototyping
- Data analysis, exploration, and visualization
- Scientific and engineering graphics
- Application development, including graphical user interface building
MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows many technical computing problems, especially those with matrix and vector formulations, to be solved in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or FORTRAN. MATLAB was originally written to provide easy access to matrix software developed by the LINPACK and EISPACK projects.

Key features:
- High-level language for technical computing
- Environment for managing code, files, and data
- Interactive tools for iterative exploration, design, and problem solving


- Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, and numerical integration
- 2-D and 3-D graphics functions for visualizing data
- Tools for building custom graphical user interfaces
- Functions for integrating MATLAB-based algorithms with external applications and languages, such as C, C++, Fortran, Java, COM, and Microsoft Excel

The MATLAB language supports the vector and matrix operations that are fundamental to engineering and scientific problems. It enables fast development and execution. With the MATLAB language, you can program and develop algorithms faster than with traditional languages because you do not need to perform low-level administrative tasks, such as declaring variables, specifying data types, and allocating memory. In many cases, MATLAB eliminates the need for "for" loops. As a result, one line of MATLAB code can often replace several lines of C or C++ code. At the same time, MATLAB provides all the features of a traditional programming language, including arithmetic operators, flow control, data structures, data types, object-oriented programming (OOP), and debugging features. MATLAB lets you execute commands or groups of commands one at a time, without compiling and linking, so that you can iterate quickly towards a solution. The MATLAB system consists of five main parts:
1. Development Environment: This is the set of tools and facilities that help you to use MATLAB functions and files. Many of these tools are graphical user interfaces. It includes the MATLAB desktop and Command Window, a command history, an editor and debugger, and browsers for viewing help, the workspace, files, and the search path.
2. The MATLAB Mathematical Function Library: This is a vast collection of computational algorithms ranging from elementary functions like sum, sine, cosine, and complex arithmetic, to more sophisticated functions like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.
3. The MATLAB Language: This is a high-level matrix/array language with control flow statements, functions, data structures, input/output, and object-oriented programming features. It allows both programming in the small, to rapidly create quick and dirty throw-away programs, and programming in the large, to create complete large and complex application programs.
4. Graphics: MATLAB has extensive facilities for displaying vectors and matrices as graphs, as well as annotating and printing these graphs. It includes high-level functions for two-dimensional and three-dimensional data visualization, image processing, animation, and presentation graphics. It also includes low-level functions that allow you to fully customize the appearance of graphics as well as to build complete graphical user interfaces for your MATLAB applications.
5. The MATLAB Application Program Interface (API): This is a library that allows you to write C and Fortran programs that interact with MATLAB. It includes facilities for calling routines from MATLAB (dynamic linking).

6.4 DESIGNING GRAPHICAL USER INTERFACES
A GUI is a graphical user interface to a computer. It is a type of user interface that allows people to interact with electronic devices such as computers, hand-held devices (MP3 players, portable media players, or gaming devices), household appliances, and office equipment using images rather than typed text commands. A GUI offers graphical icons and visual indicators to represent the information and actions available to a user. The actions are usually performed through direct manipulation of the graphical elements. GUI operating systems are operated with a mouse; the keyboard can also be used through keyboard shortcuts or arrow keys.

The interactive tool GUIDE (Graphical User Interface Development Environment) can be used to lay out, design, and edit user interfaces. GUIDE lets you include list boxes, pull-down menus, push buttons, radio buttons, sliders, and even MATLAB plots. Alternatively, you can create GUIs programmatically using MATLAB functions. A push button is typically used when you want an immediate action to occur when the user presses the button.

GUIDE, MATLAB's Graphical User Interface development environment, provides a set of tools for laying out a GUI. The Layout Editor is the control panel for GUIDE. To start the Layout Editor, use the guide command. The following picture shows the Layout Editor. GUIDE is an interactive tool for designing and building Graphical User Interfaces (GUIs) for MATLAB applications. The GUI building process involves:
(a) Designing the user interface and layout
(b) Programming the GUI and its components
(c) Testing, debugging, and finally running it

Figure 6.1 GUIDE Layout

6.4.1 Starting Up GUI
Either click on the GUIDE icon on the MATLAB toolbar or type guide in the command window. This opens the Quick Start GUI template selection window. MATLAB provides a few templates to help with common GUI design tasks. The initial layout area can usually be resized by clicking and dragging the handles on the corners of the template area.

6.4.2 Initializing GUIDE

The left-hand side of the GUIDE window contains a components palette. The layout design task involves dragging and dropping the GUI control components from the palette onto the layout area.

6.4.3 Designing The Guide Layout
The user interface will usually be made up of:
- Toolbars and menus
- Input control components such as push buttons, radio buttons, and check boxes (switches); pop-up menus and list boxes (selections); sliders (continuous control); and edit text
- Graphical objects: axes objects
- Text objects: static text
The items labelled Panels and Button Groups in the components palette are used for grouping the control elements together. A panel is created by dragging and dropping a panel from the components palette onto the layout area. Control items that are to be contained by that panel can simply be dragged and dropped onto the panel. Button groups are like panels, but their only real purpose is to group together radio buttons and toggle buttons. For example, three mutually exclusive options can be presented as three separate radio buttons contained within a button group. If they were not contained in a button group, the three controls would operate independently of each other, making it possible to select combinations of them that are not sensible. When they are contained in a button group, selecting one automatically de-selects the other two.

6.4.4 Saving GUI


Once the layout of the GUI is ready, it should be saved for future use. Graphical User Interfaces generated by GUIDE are saved into two closely linked files, namely your_gui_name.fig and your_gui_name.m. The figure (.fig) file contains all the information related to the layout and appearance. The script (.m) file contains all the programming logic of the GUI. Most of the programming components of interest in the .m file are contained in a set of function stubs ready for use as callback functions for the various control objects.

6.4.5 Programming the GUI
After laying out the GUI and setting its component properties, the next step is to program its behavior. A callback specifies the action to take when a control is activated, for example what to do when the mouse is clicked on a button. The code in the callback functions controls how the GUI responds to events such as button clicks, menu item selection, window resizing, and the creation and deletion of components. There is normally one callback function per component of the GUI, and all these functions are contained in a single .m file. The names of the callback functions for each GUI component are generated automatically by GUIDE using the convention function <objects_tag>_<event_to_handle>. After programming the GUI, the next step is to run it by pressing the green button shown in Figure 5.2, which brings up the following screen.

6.4.6 Serial IO with MATLAB (quick launch)
The following steps are required for serial data communication:
1. Create the serial port object. Use the MATLAB command serial.
2. Configure the serial port object. Use the get and set commands.
3. Connect to the device. Use fopen.
4. Configure if required. Use the get and set commands.
5. Write data with the fprintf command. Read data with the fscanf command.
6. Disconnect the device when the transmission is over. Use fclose.

6.5 ALGORITHM
6.5.1 Section 1: MATLAB
Step 1: Start
Step 2: Record the voice command using the wavrecord function.
Step 3: Compute the Mel Frequency Cepstrum Coefficients (MFCC).
Step 4: Vector-quantize the MFCC vectors and store the output in a file as a text document.
Step 5: Measure the similarity between the training and testing input voice signals using the Euclidean distance.
Step 6: Receive the external voice command (speaker).
Step 7: If the command matches a reference template, send the corresponding command to the microcontroller board. Else go to Step 6.
Step 8: Stop

6.5.2 The algorithm of MFCC extraction
Step 1: Start
Step 2: Read the normalized speech signal.
Step 3: Apply frame blocking.
Step 4: Apply signal windowing using the window formula.
Step 5: Apply the FFT to each windowed frame.
Step 6: Apply the Mel filter bank.
Step 7: Take the logarithm of the resulting coefficients.
Step 8: Apply the DCT to obtain the cepstral coefficients.

6.5.3 The algorithm of VQ extraction (LBG algorithm)
Step 1: Determine the number of code words, N, or the size of the codebook.
Step 2: Select N code words at random, and let that be the initial codebook. The initial code words can be randomly chosen from the set of input vectors.
Step 3: Using the Euclidean distance measure, cluster the vectors around each code word. This is done by taking each input vector and finding the Euclidean distance between it and each code word. The input vector belongs to the cluster of the code word that yields the minimum distance.
Step 4: Compute the new set of code words. This is done by obtaining the average of each cluster:
$$Y_i = \frac{1}{m} \sum_{j=1}^{m} x_{ij}$$


where i is the component of each vector (x, y, z directions) and m is the number of vectors in the cluster.
Step 5: Add the components of each vector and divide by the number of vectors in the cluster.
Step 6: Repeat Steps 3 and 4 until either the code words do not change or the change in the code words is small.
Step 7: Stop

6.6 Section 2: Embedded C
Step 1: Start
Step 2: Initialize the LCD in 4-bit mode.
Step 3: Disable ETM (setting PINSEL10 = 0).
Step 4: Set fast I/O port 2 as an output port.
Step 5: Initialize the UARTs (UART 0 as receiver and UART 1 as transmitter).
Step 6: Initialize the timer to set the delay.
Step 7: Initialize the ADC to accept the SD card output.
Step 8: Check for the SD card. If the card is ready, go to Step 9, else go to the end.
Step 9: Play the welcome message using the DAC and LF amplifier.
Step 10: Check for data in the UART. If data is available, set the motor flag.
Step 11: If motor flag = 1, go to Step 12, else go to Step 10.
Step 12: Compare the UART data with the data in the database in order to find the motor reference.
Step 13: If motor reference = 0, go to Step 14, else go to Step 18.
Step 14: Call the function to run forward.
Step 15: Send a message to the assigned mobile regarding the status of the motor using the GSM module.
Step 16: Display the direction on the LCD.
Step 17: Play the direction using the LF amplifier.
Step 18: If motor reference = 1, go to Step 19, else go to Step 23.
Step 19: Call the function to run backward.
Step 20: Send a message to the assigned mobile regarding the status of the motor using the GSM module.
Step 21: Display the direction on the LCD.
Step 22: Play the direction using the LF amplifier.
Step 23: If motor reference = 2, go to Step 24, else go to Step 28.
Step 24: Call the function to stop the motor.
Step 25: Send a message to the assigned mobile regarding the status of the motor using the GSM module.
Step 26: Display the direction on the LCD.
Step 27: Play the direction using the LF amplifier.
Step 28: Display "Invalid" on the LCD screen.
Step 29: Play the error message.
Step 30: Stop
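The command handling in the Embedded C steps can be summarised as a lookup followed by four actions. The sketch below models it in Python so that it can be run on a PC; it is an illustration under stated assumptions, not the project's firmware. The command bytes in DATABASE, the SMS text, the RecordingBoard class, and all function names are hypothetical, since the report does not specify them.

```python
ACTIONS = {0: "forward", 1: "backward", 2: "stop"}  # motor reference -> action


class RecordingBoard:
    """Stand-in for the board peripherals; it only records what was requested."""

    def __init__(self):
        self.log = []

    def motor(self, action):
        self.log.append(("motor", action))

    def sms(self, text):
        self.log.append(("sms", text))

    def lcd(self, text):
        self.log.append(("lcd", text))

    def play(self, clip):
        self.log.append(("play", clip))


def handle_command(uart_byte, database, board):
    """Look up the received byte and carry out Steps 12 to 29 for it."""
    reference = database.get(uart_byte)       # Step 12: find the motor reference
    action = ACTIONS.get(reference)
    if action is None:                        # Steps 28 and 29: unknown command
        board.lcd("Invalid")
        board.play("error")
        return None
    board.motor(action)                       # run forward, run backward, or stop
    board.sms("Motor: " + action)             # status message via the GSM module
    board.lcd(action)                         # show the direction on the LCD
    board.play(action)                        # announce it through the LF amplifier
    return action


# Hypothetical command bytes; the report does not say what the PC actually sends.
DATABASE = {b"0": 0, b"1": 1, b"2": 2}

board = RecordingBoard()
print(handle_command(b"1", DATABASE, board))  # backward
print(handle_command(b"x", DATABASE, board))  # None
```

Using a table for the motor reference keeps the three branches identical in structure, so a further command only needs a new table entry.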

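Returning to the MFCC extraction steps of Section 6.5.2, the whole chain fits in a short program. The following Python sketch uses only the standard library and is meant as a readable illustration, not as the MATLAB code used in the project. The Hamming window, the common 2595 log10(1 + f/700) mel formula, the frame length, the hop size, and the filter and coefficient counts are all assumptions, because the report does not state which formulas or parameters were used.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_blocks(signal, frame_len, hop):
    """Step 3: split the signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Step 4: taper the frame with a Hamming window."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def power_spectrum(frame):
    """Step 5: power spectrum via a naive DFT (an FFT gives the same values)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, frame_len, sample_rate):
    """Step 6: triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((frame_len + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (frame_len // 2 + 1)
        for k in range(lo, mid):
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct(values, n_coeffs):
    """Step 8: DCT-II of the log filter-bank energies."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=8, n_coeffs=5):
    peak = max(abs(x) for x in signal) or 1.0
    signal = [x / peak for x in signal]                  # Step 2: normalise
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for frame in frame_blocks(signal, frame_len, hop):
        spectrum = power_spectrum(hamming(frame))
        energies = [sum(f * s for f, s in zip(filt, spectrum)) for filt in bank]
        log_energies = [math.log(e + 1e-10) for e in energies]   # Step 7
        features.append(dct(log_energies, n_coeffs))
    return features


# A 440 Hz test tone sampled at 8 kHz stands in for a recorded command.
tone = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(256)]
features = mfcc(tone, 8000)
print(len(features), len(features[0]))   # 7 5
```

Each frame yields one short coefficient vector, and it is these vectors that the vector quantization stage clusters into a codebook.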
CHAPTER 7 EXPERIMENTAL SETUP


In the experimental setup of the DC motor drive through speech recognition, the speech signal is captured by a microphone connected to the computer. Software written in MATLAB version 7.7 calculates the MFCC and VQ (LBG algorithm) and recognizes the input speech taken from the microphone. On the hardware side, the MCB2300 board (LPC2388 microcontroller) is used to drive the DC motor according to the recognized command. The LCD display on the board is used to display the direction. The LF amplifier is used to play the sound stored on the SD card which indicates the direction. The GSM module is used to send a message to a specified number to report the current status of the motor. The microcontroller is programmed in Embedded C. The interface between the computer and the microcontroller is RS-232. The driver IC L293D is used to drive the DC motor.
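The PC-side recognition in this setup rests on two operations: training a codebook per command and matching a new utterance against the codebooks by Euclidean distance. The Python sketch below follows the VQ steps as they are written in Section 6.5.3 (random initial code words, then repeated clustering and averaging) rather than the codebook-splitting form often associated with the LBG name. The toy 2-D vectors, the threshold, the seed, and the function names are assumptions for illustration; real input would be the MFCC vectors of each recorded command.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(vector, codebook):
    """Index of the code word with the minimum Euclidean distance to vector."""
    return min(range(len(codebook)), key=lambda i: euclidean(vector, codebook[i]))


def centroid(cluster):
    """Component-wise average of the vectors in one cluster."""
    m = len(cluster)
    return [sum(v[i] for v in cluster) / m for i in range(len(cluster[0]))]


def train_codebook(vectors, n_codewords, tol=1e-9, max_iter=100, seed=0):
    """Random initial code words, then cluster and re-average until stable."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, n_codewords)]
    for _ in range(max_iter):
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        # An empty cluster keeps its previous code word.
        updated = [centroid(c) if c else w for c, w in zip(clusters, codebook)]
        change = max(euclidean(a, b) for a, b in zip(codebook, updated))
        codebook = updated
        if change < tol:
            break
    return codebook


def distortion(features, codebook):
    """Average distance from each feature vector to its nearest code word."""
    return sum(euclidean(v, codebook[nearest(v, codebook)])
               for v in features) / len(features)


def recognise(features, codebooks, threshold):
    """Name of the closest codebook, or None when no codebook is close enough."""
    name = min(codebooks, key=lambda n: distortion(features, codebooks[n]))
    return name if distortion(features, codebooks[name]) <= threshold else None


# Toy 2-D "feature vectors" standing in for MFCC frames of two commands.
forward = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
stop = [[50, 50], [50, 51], [51, 50], [60, 60], [60, 61], [61, 60]]
codebooks = {"forward": train_codebook(forward, 2), "stop": train_codebook(stop, 2)}
print(recognise([[0.2, 0.4], [10.5, 10.1]], codebooks, threshold=2.0))  # forward
print(recognise([[30, 30]], codebooks, threshold=2.0))                  # None
```

The threshold is what allows an utterance to be rejected instead of being forced onto the nearest command, which corresponds to the "else" branch of Step 7 in Section 6.5.1.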


CHAPTER 8 SCOPE OF PROJECT


The system is to be developed up to the hardware level, so that users can work with a real system and its hardware rather than a simulation of it. Converting the simulation into hardware can be costly and time consuming, but results in an effective and efficient system for driving a DC motor. This is a user-friendly system: it displays the direction, plays the direction through the LF amplifier, and sends a message to a specified number regarding the current direction. The motor can be fitted to a wheelchair to control its position, or used to control a robotic arm.


CHAPTER 9 FUTURE WORKS


Isolated-word automatic speech recognition systems can be enhanced in many ways, and there will always be new developments in this technology and new areas for researchers to explore.

The system should accept voice inputs from various users. In other words, it should be extended into a multi-user system that accepts voice inputs from different people with different voice frequencies and characteristics. At present it is a single-user system.

The system should provide more language options, so that the most commonly spoken languages are available, which will ease communication. At present it is implemented for the English language only.

The system should address accuracy and security. High accuracy is difficult to achieve, so different techniques should be tried in order to improve it.

The system might have to consider speech recognition methodologies other than those used in this work, such as neural networks and Hidden Markov Models, in order to achieve better accuracy. In future the MATLAB code can also be optimized so that the execution time is reduced.

CHAPTER 10 SIMULATION RESULT


CHAPTER 14 CONCLUSION
In this project MFCC and VQ techniques are used in speech recognition to control the DC motor drive. The code developed in MATLAB using MFCC and VQ can also be used to control and drive stepper motors, servo motors, and so on. The developed speech algorithm can be used for navigation, to control robots, to drive electric vehicles (such as wheelchairs), and in security applications (such as banking, unmanned vehicles, and remote access to computers, where speech can be used as a password).

