A PROJECT REPORT
ARIJIT SAMANTA (21204106005) SHUBHOJEET GHOSH (21204106040) ABHISHEK DAS (21204106001) AJEETH MONY (21204106002)
in partial fulfilment for the award of the degree of
BACHELOR OF ENGINEERING
ELECTRONICS AND COMMUNICATION ENGINEERING
RAJIV GANDHI COLLEGE OF ENGINEERING, SRIPERUMBUDUR
ANNA UNIVERSITY: CHENNAI 600 025
Certified that this project report “Attendance System by Biometric Authorization using Speech” is the bonafide work of “ARIJIT SAMANTA (21204106005), SHUBHOJEET GHOSH (21204106040), ABHISHEK DAS (21204106001), and AJEETH MONY (21204106002)” who carried out the project work under my supervision.
______________________ SIGNATURE
Mr SUMAN MISHRA (HEAD OF THE DEPARTMENT)
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
RAJIV GANDHI COLLEGE OF ENGINEERING SRIPERUMBUDUR 602 105

______________________ SIGNATURE
Mr SUMAN MISHRA (SUPERVISOR)
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
RAJIV GANDHI COLLEGE OF ENGINEERING SRIPERUMBUDUR 602 105
Abstract: Attendance System by Biometric Authorization using Speech. Student attendance is generally taken manually, which is an inconvenient and time-consuming task for a lecturer. To avoid this we have developed an automated system which saves time and labour and is, moreover, immune to impersonation. We use the frequency variation in speech patterns to identify different individuals. The challenges faced were the detection of voice patterns and compensation for ambient noise for successful identification of individuals. To deal with the limited memory and processing power, we had to remove the redundancies in the speech-pattern data; hence we developed Voiceprinting as a solution. Noise compensation was done by accurately approximating the ambient noise and calculating a threshold level based on it. Voice detection takes place in two phases. In the first phase a speech sample is taken from the individual, and its Voiceprint is computed and saved in the database. In the second phase the individual's speech is acquired, a Voiceprint is similarly computed and compared against the database, and the appropriate action is taken once a match is found. The implementation was successful for a sample space of three Voiceprints; the sample space can be expanded using external memory (ROM).
The Block Diagram of the System
1. Introduction When we think of programmable speech recognition, we think of customer-service call centres with automated voice-response systems, or of PC-based speech recognition. We have taken that a step further: speech recognition in a tiny Mega32 microcontroller, with real-time speech processing. This was made possible by implementing the bandpass filters in assembly language using a fixed-point format on the microcontroller. In this filter design, not only is the output of the filter calculated, but its square and accumulation are also obtained. Much time is thus saved, so that each speech sample can be processed into its frequency spectrum before the next sample arrives. In addition, the voice is analysed using correlation and regression methods to compare the voiceprints of different words. These techniques provide a stronger ability to recognize speech. A training procedure is also used to reduce the random variation that occurs when one word is spoken at different times; training yields a more accurate frequency spectrum for each word. The experimental results demonstrate high accuracy for this real-time speech recognition system.
1.1 The data acquisition module: The data acquisition module consists of a microphone. The signal from the microphone is fed to an amplifier to match the ADC input-voltage requirements of the microcontroller. Microphone: A microphone, sometimes referred to as a mike or mic, is an acoustic-to-electric transducer or sensor that converts sound into an electrical signal. Signal Conditioner / Operational Amplifier: An operational amplifier, often called an op-amp, is a DC-coupled high-gain electronic voltage amplifier with differential inputs and, usually, a single output. Typically the output of the op-amp is controlled either by negative feedback, which largely determines the magnitude of its output voltage gain, or by positive feedback, which facilitates regenerative gain and oscillation. High input impedance at the input terminals and low output impedance are important typical characteristics.
1.2 The Microcontroller: The microcontroller used is the ATmega32, an 8-bit AVR microcontroller with an on-chip 10-bit ADC, described in detail in Chapter 2.3.
1.3 The Template Generation method: The template generation method involves two steps. The first step is the calculation of the ambient-noise threshold to eliminate noise to a large extent. The second step is to compute the voiceprint of the speech (the word spoken by the user for identification). The generated template is then transferred to the computer, and the template is used to program the microcontroller. 1.4 The Speech/Voice Recognition Method: The speech/voice recognition method is similar to the template generation method. The only difference is that the voiceprint is loaded into the random access memory of the microcontroller and is compared with the templates available in the database. The appropriate action is taken once a match is found. Summary: Firstly, we looked at the speech recognition algorithm to understand the implementation. We then prepared the microphone circuit, and proceeded to start sampling and generating the digital data for the speech. Once we had the data, we started writing the code. We wrote the digital filters in assembly code to save the number of cycles needed at the speech sampling rate of about 4000 samples per second. Afterwards, we analyzed the output of the filters to recognize which word was spoken.
2.1 The Template Generation Methodology. Since the problem at hand is the recognition of speech, a Fast Fourier Transform of the incoming signal might seem an easy way to compare the voice with the template, but it is computation-intensive and requires substantial memory. Hence we have developed a method that removes the redundancies and retains just the optimum amount of data to effectively recognize voice.
Figure 1‐Flowchart for Template Generation (160 point data)
The algorithm we have arrived at is: 1. Voice Sampling: Sample the voice at 4300 Hz, which enables us to resolve up to 2150 Hz according to the Nyquist criterion. The higher end of the human voice is about 2000 Hz and the lower end around 100 Hz, so we fairly cover the whole human voice spectrum. 2. Adaptive Noise Cancellation: We calculate the threshold ambient-noise level to eradicate noise effectively to the required extent.
3. Bandpass filtering: Next we filter the signals using bandpass filters, e.g. filters designed with cut-off frequencies of 100-200 Hz, 200-300 Hz, and so on up to 1900-2000 Hz. 4. Accumulation of the output of the bandpass filters: The filter output is accumulated and saved to a 160-point vector. 5. Transferring the template data: The 160-point vector is the template, which is very small in size but retains enough information to distinguish the voices of different persons. The template data is transferred to the PC.
2.1.1 Voice Sampling: The human voice consists of frequency components from 100-2000 Hz. Sampling is the process of converting a signal (for example, a function of continuous time or space) into a numeric sequence (a function of discrete time or space). The Nyquist-Shannon sampling theorem states, in the original words of Shannon (where he uses "cps" for "cycles per second" instead of the modern unit hertz): If a function f(t) contains no frequencies higher than W cps, it is completely determined by giving its ordinates at a series of points spaced 1/(2W) seconds apart. Hence we need to sample at a rate of at least 4000 Hz, so the sampling period should ideally be 1/4000 s = 250 µs. Voice sampling is achieved by setting the ADC control registers and a timer which interrupts (triggers) the microcontroller to generate ADC data every 232 µs. The method for sampling consists of the following steps: • Setup the interrupt for Counter 0 and Counter 1
• Initialize the ADC data-available flag to 0. • Configure the ADMUX register to obtain data from ADC channel 1. • Setup the timer to interrupt the microcontroller to acquire data from the ADC every 232 µs. • Receive the ADC data into the accumulator. • Process the data further as per the Voiceprint generation process. Algorithm: 1. Set the TIMSK timer interrupt mask register to 0b00000010 so that it interrupts every time the timer hits the compare count. 2. Set the ADC input data register to zero. 3. Set ADMUX to 0b00100000 to select channel 1. 4. Set ADCSR to 0b11000111 to start the conversion; this sets the ADC conversion status flag to 1. 5. Set the timer control register TCCR0 = 0b00001011 and OCR0 = 62 so that we get a sampling rate of 4300 Hz.
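The timer arithmetic behind step 5 can be sketched in C. The 16 MHz clock assumed below is not stated in the report (the achieved rate depends on the actual crystal), and the function names are ours:

```c
#include <stdint.h>

#define F_CPU_HZ  16000000UL  /* assumed crystal frequency */
#define PRESCALER 64UL        /* TCCR0 = 0b00001011 selects CTC mode with clk/64 */

/* Compare value so the timer interrupt fires at roughly f_sample Hz.
   In CTC mode the timer counts 0..OCR0, i.e. OCR0+1 ticks per period. */
uint8_t ocr_for_rate(uint32_t f_sample)
{
    return (uint8_t)(F_CPU_HZ / (PRESCALER * f_sample) - 1);
}

/* Actual rate achieved for a given compare value. */
uint32_t rate_for_ocr(uint8_t ocr)
{
    return F_CPU_HZ / (PRESCALER * (uint32_t)(ocr + 1));
}
```

With these assumptions, requesting 4300 Hz yields a compare value of 57 and an actual rate of about 4310 Hz; the report's OCR0 = 62 suggests a different clock or an extra margin.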
2.1.2 Adaptive Noise Cancellation: Noise is an unwanted phenomenon in any system. We apply a simple approach to reduce its effects to a significant extent. We sample the ambient sound, i.e. with nothing being spoken into the microphone. The microphone records the noise, its characteristics (amplitude and phase information) are determined, and this is used to remove noise adaptively when the human voice is sampled. This is done by registering samples only if they are above the noise threshold value.
Figure 2‐ Adaptive Noise Cancellation
Major steps involved in noise removal: • Ambient noise is sampled and the threshold value is calculated from the samples taken. We take three average noise values and take their median; this median is set as the threshold value. • Voice samples from the ADC are accepted only if they exceed 4 times the threshold value, or else they are discarded as noise. The following algorithm implements the above: 1. First, sampling is done according to the sampling process given in Chapter 2.1. 2. The average value of the samples recorded is stored in a temporary register. 3. The above step is repeated 3 times to get 3 average values. 4. The median of the 3 average values is calculated. 5. This median is the threshold value of noise. 6. Sampling starts only if the ADC value is greater than 4 times the threshold. 7. If this condition is satisfied, program control is transferred to the template generation functional block given in Chapter 2.3. 8. Else program control is transferred to Step 6.
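The threshold logic above can be sketched as plain C (the function names are ours; the real code runs on the AVR in assembly):

```c
#include <stdint.h>

/* Median of three values: used to pick the noise threshold from
   three averaged noise readings. */
uint16_t median3(uint16_t a, uint16_t b, uint16_t c)
{
    if (a > b) { uint16_t t = a; a = b; b = t; }  /* now a <= b */
    if (b > c) b = c;                             /* b = min(max(a,b), c) */
    return (a > b) ? a : b;                       /* the middle value */
}

/* Average of n recorded noise samples (step 2 of the algorithm). */
uint16_t average(const uint16_t *samples, int n)
{
    uint32_t sum = 0;
    for (int i = 0; i < n; i++) sum += samples[i];
    return (uint16_t)(sum / n);
}

/* Gate an ADC reading: accept it only if it exceeds 4x the threshold. */
int is_voice(uint16_t adc_value, uint16_t threshold)
{
    return adc_value > 4u * threshold;
}
```

For the three averages shown in the screenshots (22345, 21367, 20456) this median picks the middle reading, 21367.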
2.1.3 Bandpass Filtering:
The filter used here is a Chebyshev filter (IIR). Infinite impulse response (IIR) is a property of signal-processing systems: systems with that property are known as IIR systems, or, when dealing with electronic filters, as IIR filters. They have an impulse response that is non-zero over an infinite length of time, in contrast to finite impulse response (FIR) filters, which have fixed-duration impulse responses. The simplest analog IIR filter is an RC filter made up of a single resistor (R) feeding into a node shared with a single capacitor (C); this filter has an exponential impulse response characterized by an RC time constant. IIR filters may be implemented as either analog or digital filters. In digital IIR filters, the output feedback is immediately apparent in the equations defining the output. Note that unlike FIR filters, designing IIR filters requires carefully considering the "time zero" case, in which the outputs of the filter have not yet been defined. The design of digital IIR filters leans heavily on that of their analog counterparts, because there are plenty of resources, prior works, and straightforward design methods for analog feedback filters, and hardly any for digital IIR filters directly. As a result, if a digital IIR filter is to be implemented, usually an analog filter (e.g. Chebyshev, Butterworth, Elliptic) is designed first and then converted to digital by applying a discretization technique such as the bilinear transform or impulse invariance. Example IIR filters include the Chebyshev filter, Butterworth filter, and Bessel filter.
Bandpass Filter Design: This is an important part of generating the voice template. This step removes the redundancies in the voice and stores the signature in a 160-point data vector. The bandpass filter is a second-order Chebyshev IIR filter whose coefficients are calculated using MATLAB. In order to analyze speech, we needed to look at the frequency content of the detected word. To do this we used several 4th-order Chebyshev bandpass filters; to create each 4th-order filter, we cascaded two second-order sections using the following "Direct Form II Transposed" implementation of a difference equation.
Figure 3‐Transposed‐Direct‐Form‐II implementation of a second‐order IIR digital filter (input on the right, output on the left)
The assembly-language code implementing the filter is then written, taking care that the filter can complete its calculation within 2100 system cycles, i.e. before the next sample arrives. To optimize this process we changed the data format from float to fixed-point 2's-complement form (see Appendix 1 for details), which improves the performance of the program and lets it compute within the required number of system clock cycles. Filter Coefficient Calculation: The filter coefficients are calculated using the Signal Processing Blockset of MATLAB® Version 7.5.0. The passbands of the bandpass filters are 100-200 Hz, 200-400 Hz, and so on up to 1800-2000 Hz (for 8 bandpass filters). The passband gain is 20 dB and the roll-off is quite steep, as two 2nd-order Chebyshev bandpass filters are cascaded in series.
Figure 4 ‐ Signal Processing Block Parameters for designing the Filter.
Figure 5 – Window showing the Designed Filter Coefficients.
The code for the data conversion is given below.*
#define int2fix(a)   (((int)(a))<<8)          //Convert char to fix. a is a char
#define fix2int(a)   ((signed char)((a)>>8))  //Convert fix to char. a is an int
#define fix2uint(a)  ((unsigned char)((a)>>8))//Convert fix to char. a is an int
#define float2fix(a) ((int)((a)*256.0))       //Convert float to fix. a is a float
#define fix2float(a) ((float)(a)/256.0)       //Convert fix to float. a is an int
*Details in Appendix 1. The filter algorithm is optimized such that all filter activity completes before the next sample arrives from the ADC, i.e. in roughly 2000 system cycles. The Algorithm: 1. The filter coefficient b1 is loaded into RAM. 2. Load the input sample from the stack. 3. Convert the sample from integer to fixed-point data. 4. Convert the coefficient from float to fixed-point data. 5. Multiply the converted coefficient with the sample according to the difference equation for the Direct Form II Transposed Chebyshev 2nd-order bandpass filter. 6. Similarly, the other coefficients are multiplied with the time-shifted samples to get the bandpass-filter output. 7. The output of the first bandpass operation is fed through the same bandpass filter again; this cascading of the filter in series gives a 4th-order bandpass filter. 8. The above steps are run 7 more times with the respective designed filter coefficients. 9. Hence 8 bandpass filters are implemented with different passband parameters.
10. The fixed-point output of each bandpass filter is accumulated (summed up) according to the process given in Chapter 2.4.
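The difference equation of the Transposed-Direct-Form-II section in Figure 3 can be sketched in C. This is a floating-point illustration with placeholder coefficients: the real implementation runs in 8:8 fixed-point assembly with the MATLAB-designed coefficients, which are not reproduced here.

```c
/* One second-order IIR section in Transposed Direct Form II. */
typedef struct {
    double b0, b1, b2;  /* feed-forward coefficients */
    double a1, a2;      /* feedback coefficients (a0 normalized to 1) */
    double s1, s2;      /* state (delay) registers */
} biquad_t;

double biquad_step(biquad_t *f, double x)
{
    double y = f->b0 * x + f->s1;           /* output tap */
    f->s1 = f->b1 * x - f->a1 * y + f->s2;  /* update first state */
    f->s2 = f->b2 * x - f->a2 * y;          /* update second state */
    return y;
}

/* Two identical sections in series give the 4th-order filter (step 7). */
double cascade_step(biquad_t *f1, biquad_t *f2, double x)
{
    return biquad_step(f2, biquad_step(f1, x));
}

/* Sanity helper: an identity section (b0 = 1, all else 0) must pass
   the input through unchanged. */
double identity_response(double x)
{
    biquad_t id = {1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
    return biquad_step(&id, x);
}
```

Per sample, each section costs three multiplies for the feed-forward path and two for the feedback path, which is what makes the 2100-cycle budget plausible in fixed point.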
Figure 6‐ Frequency Response of the digital filter for 400‐600 Hz BANDPASS
2.1.4 Accumulation of the output of the bandpass filters: The outputs of the bandpass filters are summed to get the 160-point vector data, which is in fixed-point format and is then converted to 16-bit integer format. The algorithm for summing up: 1. The first output from a bandpass filter is obtained and stored in a register. 2. Subsequent outputs from the same bandpass filter are added to the register where its first value is stored. 3. The above step is repeated for the rest of the bandpass filters. 4. The fixed-point data is then converted to 16-bit integer format. 5. The sampled time, i.e. the time required to sample a spoken word, is divided into 20 parts. Hence we obtain a 160-point (20 × 8 = 160) 16-bit data vector, which is the template. The template is thus finally generated.
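The accumulation into the 160-point vector can be sketched as follows. The names and the segment-major layout are ours; the report's assembly may order the vector differently:

```c
#include <stdint.h>

#define NUM_BANDS    8                            /* bandpass filters   */
#define NUM_SEGMENTS 20                           /* time divisions     */
#define VEC_LEN      (NUM_BANDS * NUM_SEGMENTS)   /* 20 x 8 = 160       */

/* Add one filter output into the slot for (time segment, band). */
void accumulate(int16_t vec[VEC_LEN], int segment, int band, int16_t filter_out)
{
    vec[segment * NUM_BANDS + band] += filter_out;
}

/* Self-check: accumulate two outputs into one slot and read it back. */
int16_t demo_slot(void)
{
    int16_t vec[VEC_LEN] = {0};
    accumulate(vec, 3, 2, 10);
    accumulate(vec, 3, 2, 5);
    return vec[3 * NUM_BANDS + 2];
}
```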
Figure 7‐ The Fourier Spectrum of the word “Hello”
Figure 8‐ The Fourier Spectrum of the word “Hello” .
2.2 The Speech/Voice Recognition Method. The Speech/Voice Recognition method is similar to the template generation method. The only difference is that the voice print is loaded to the Random Access memory of the microcontroller and is compared with the templates available in the database. The appropriate action is taken once a match is found. The algorithm for recognizing the spoken word is given below: 1. Generation of Voiceprint: The Voice is Sampled and passed through the bandpass filter and is accumulated to obtain a 160 point vector data. This is same as the template generation process. 2. Comparison of the Voiceprint with the stored templates: Once the data for the spoken word is obtained, it is compared with the data in the database. 3. Appropriate Action: Taking the appropriate action once the match is found.
Generation of the Voiceprint: The Voiceprint is the 160 point data obtained from the sampled speech. The method used is exactly the same as the generation of 160 point template.
2.2.1 Comparison of the Voiceprint with the stored templates: Once the Voiceprint is obtained, it is compared against the stored templates: the Euclidean distance between the Voiceprint and each template is computed using the Euclidean distance formula. Euclidean Distance: In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" distance between two points that one would measure with a ruler, which can be proven by repeated application of the Pythagorean theorem. By using this formula as distance, Euclidean space becomes a metric space (even a Hilbert space). Older literature refers to this metric as the Pythagorean metric; the technique has been rediscovered numerous times throughout history, as it is a logical extension of the Pythagorean theorem. The Euclidean distance between points p = (p1, p2, ..., pn) and q = (q1, q2, ..., qn) in Euclidean n-space is defined as:

d(p, q) = sqrt( (p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2 )

which can be simplified to

d(p, q) = sqrt( sum over i of (pi - qi)^2 )

where, in our case, i = 1 to 160 and pi is the ith data point in the Voiceprint vector.
qi is the ith data point in the Template vector. Having calculated the Euclidean distance for, say, 5 template data vectors, we determine which template has the minimum Euclidean distance; the minimum-distance template is chosen as the detected word and the appropriate action is taken. Algorithm for determining the minimum Euclidean distance: 1. Let Pi for i = 1 to 160 be the voiceprint under test. 2. Let Qni for i = 1 to 160 be the nth template in the database, where n runs from 1 to the number of database entries. 3. The difference Pi - Qni is calculated. 4. The difference is squared to give (Pi - Qni)^2. 5. The squared difference is accumulated in a register. 6. Steps 3 to 5 are repeated for every value of i from 1 to 160. 7. The square root of the accumulated sum gives the Euclidean distance for template n. 8. Steps 2 to 7 are repeated for every template, from the first to the nth. 9. The minimum of the n Euclidean distances is found by comparing them all. 10. The template corresponding to the minimum Euclidean distance is hence found. 11. Once a match is found, control is transferred to the control code block for appropriate action.
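A C sketch of the comparison step (function names are ours). Since the square root is monotonic, comparing squared distances picks the same winner, so the sqrt in step 7 can be skipped for speed:

```c
#include <stdint.h>

#define VEC_LEN 160  /* length of voiceprint and template vectors */

/* Squared Euclidean distance between a voiceprint and one template. */
long long sq_distance(const int16_t *p, const int16_t *q)
{
    long long acc = 0;
    for (int i = 0; i < VEC_LEN; i++) {
        long long d = (long long)p[i] - q[i];  /* Pi - Qni */
        acc += d * d;                          /* accumulate (Pi - Qni)^2 */
    }
    return acc;
}

/* Index of the nearest of n templates. */
int best_match(const int16_t *voiceprint, int16_t templates[][VEC_LEN], int n)
{
    int best = 0;
    long long best_d = sq_distance(voiceprint, templates[0]);
    for (int i = 1; i < n; i++) {
        long long d = sq_distance(voiceprint, templates[i]);
        if (d < best_d) { best_d = d; best = i; }
    }
    return best;
}

/* Self-check with two constant templates: a voiceprint of all 90s is
   much closer to the all-100 template than to the all-0 one. */
int demo_match(void)
{
    static int16_t t[2][VEC_LEN];
    static int16_t v[VEC_LEN];
    for (int i = 0; i < VEC_LEN; i++) { t[0][i] = 0; t[1][i] = 100; v[i] = 90; }
    return best_match(v, t, 2);
}
```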
2.2.2 Appropriate Action:
The match is obtained from the speech/voice comparison block of the previous chapter, and the required action is then taken by the control block. The control block: The control block is a simple switch-case where actions are determined by the template with the minimum Euclidean distance. Algorithm for the control block: 1. If template 1 is the nearest match found, display "Template 1 is the word spoken"; or, if template 1 is defined as "TURBO", it might be displayed as "TURBO is the word spoken". 2. Similar steps are taken if any other template is the nearest match: if it is defined as the word "XXXXXXXX", then "XXXXXXXX is the word spoken" is displayed on the screen.
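The control block can be sketched as a switch-case. The template-to-word mapping below is illustrative only, built from the words mentioned in the results chapter:

```c
#include <stdio.h>
#include <string.h>

/* Map the matched template index to its defined word (assumed mapping). */
const char *word_for_template(int match)
{
    switch (match) {
    case 0:  return "TURBO";
    case 1:  return "ARIJIT";
    case 2:  return "MONY";
    default: return "UNKNOWN";
    }
}

/* Display the recognition result, as in the algorithm above. */
void announce(int match)
{
    printf("%s is the word spoken\n", word_for_template(match));
}
```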
2.3 The Hardware Design:
Figure 9‐ The Hardware Schematic.
2.3.1 AtMega32 8 bit Microcontroller
Figure 10 Pinouts ATmega32 microcontroller (PDIP)
The ATmega32 is a low-power CMOS 8-bit microcontroller based on the AVR enhanced RISC architecture. By executing powerful instructions in a single clock cycle, the ATmega32 achieves throughputs approaching 1 MIPS per MHz, allowing the system designer to optimize power consumption versus processing speed. The AVR core combines a rich instruction set with 32 general-purpose working registers. All 32 registers are directly connected to the Arithmetic Logic Unit (ALU), allowing two independent registers to be accessed in one single instruction executed in one clock cycle. The resulting architecture is more code-efficient while achieving throughputs up to ten times faster than conventional CISC microcontrollers.
Combined features of ATmega32
General features:
• High-performance, low-power AVR® 8-bit microcontroller
• Advanced RISC architecture
• Nonvolatile program and data memories
• In-System Programming by on-chip boot program
• True Read-While-Write operation
• 1024 bytes EEPROM, endurance: 100,000 write/erase cycles
• 2K bytes internal SRAM
• Programming lock for software security
Peripheral features:
• Two 8-bit Timer/Counters with separate prescalers and compare modes
• One 16-bit Timer/Counter with separate prescaler, compare mode, and capture mode
• Real-Time Counter with separate oscillator
• 8-channel, 10-bit ADC
I/O and packages:
• 32 programmable I/O lines
• 40-pin PDIP, 44-lead TQFP
Operating voltages: 2.7 - 5.5 V for ATmega32
Speed grades: 0 - 16 MHz for ATmega32
Pin descriptions:
VCC: Digital supply voltage.
GND: Ground.
Port A (PA7..PA0): Serves as the analog inputs to the A/D Converter; Port A also serves as an 8-bit bi-directional I/O port.
Port B (PB7..PB0): Pins serve as the SPI Bus Serial Clock, SPI Bus Master Input/Slave Output, Analog Comparator Negative Input, Timer/Counter0 Output Compare Match Output, and USART External Clock Input/Output.
Port C (PC7..PC0): Pins act as Timer Oscillator Pin 2, JTAG Test Mode Select, and Two-wire Serial Bus Data Input/Output Line.
Port D (PD7..PD0): Pins act as Timer/Counter2 Output Compare Match Output and USART Input/Output.
RESET: A low level on this pin for longer than the minimum pulse length will generate a reset, even if the clock is not running.
XTAL1: Input to the inverting oscillator amplifier and input to the internal clock operating circuit.
XTAL2: Output from the inverting oscillator amplifier.
AVCC: Supply voltage pin for Port A and the A/D Converter. It should be externally connected to VCC, even if the ADC is not used; if the ADC is used, it should be connected to VCC through a low-pass filter.
AREF: The analog reference pin for the A/D Converter.
Table 1 Description of combined pin configuration in ATmega 32 microcontroller
2.3.2 Analog to digital convertor
Features:
• 10-bit resolution
• Up to 15 kSPS at maximum resolution
• 8 multiplexed single-ended input channels
• 7 differential input channels
• 2 differential input channels with optional gain of 10x and 200x
• Optional left adjustment for ADC result readout
• 0 - VCC ADC input voltage range
• Selectable 2.56 V ADC reference voltage
• ADC start conversion by auto-triggering on interrupt sources
ADC Timing Diagram
Figure 11 ADC Timing Diagram, First Conversion (Single Conversion Mode)
ADC multiplexer selection register –( ADMUX)
Figure 12 ADC Multiplexer Selection(ADMUX)
Bit 7:6 – REFS1:0: Reference Selection Bits. These bits select the voltage reference for the ADC, as shown in Table 83. If these bits are changed during a conversion, the change will not take effect until this conversion is complete (ADIF in ADCSRA is set). The internal voltage reference options may not be used if an external reference voltage is being applied to the AREF pin.
Bit 5 – ADLAR: ADC Left Adjust Result. The ADLAR bit affects the presentation of the ADC conversion result in the ADC Data Register. Write one to ADLAR to left-adjust the result; otherwise, the result is right-adjusted. Changing the ADLAR bit will affect the ADC Data Register immediately, regardless of any ongoing conversions.
Bits 4:0 – MUX4:0: Analog Channel and Gain Selection Bits. The value of these bits selects which combination of analog inputs is connected to the ADC; these bits also select the gain for the differential channels. If these bits are changed during a conversion, the change will not take effect until this conversion is complete (ADIF in ADCSRA is set).
ADC Control and Status(Register A – ADCSRA)
Figure13 ADC Control and Status(Register A – ADCSRA)
Bit 7 – ADEN: ADC Enable. Writing this bit to one enables the ADC; writing it to zero turns the ADC off. Turning the ADC off while a conversion is in progress will terminate the conversion. Bit 6 – ADSC: ADC Start Conversion. In Single Conversion mode, write this bit to one to start each conversion. In Free Running mode, write this bit to one to start the first conversion. The first conversion after ADSC has been written after the ADC has been enabled, or if ADSC is written at the same time as the ADC is enabled, will take 25 ADC clock cycles instead of the normal 13; this first conversion performs initialization of the ADC. ADSC will read as one as long as a conversion is in progress. When the conversion is complete, it returns to zero. Writing zero to this bit has no effect.
Bit 5 – ADATE: ADC Auto Trigger Enable When this bit is written to one, Auto Triggering of the ADC is enabled. The ADC will start a conversion on a positive edge of the selected trigger signal. The trigger source is selected by setting the ADC Trigger Select bits, ADTS in SFIOR.
Bit 4 – ADIF: ADC Interrupt Flag. This bit is set when an ADC conversion completes and the Data Registers are updated. The ADC Conversion Complete Interrupt is executed if the ADIE bit and the I-bit in SREG are set. ADIF is cleared by hardware when executing the corresponding interrupt handling vector. Alternatively, ADIF is cleared by writing a logical one to the flag. Beware that when doing a Read-Modify-Write on ADCSRA, a pending interrupt can be disabled; this also applies if the SBI and CBI instructions are used. Bit 3 – ADIE: ADC Interrupt Enable. When this bit is written to one and the I-bit in SREG is set, the ADC Conversion Complete Interrupt is activated.
2.3.3 MAX232 DUAL DRIVERS/RECEIVERS
Figure 14 Pin Outs of MAX 232
FEATURES OF MAX232 CHIP
• Operates with a single 5-V power supply
• Operates up to 120 kbit/s
• Two drivers and two receivers
• ±30-V input levels
• Low supply current: 8 mA typical
• Designed to be interchangeable with Maxim MAX232
Applications: battery-powered systems, terminals, modems, and computers.
2.3.4 LM 358 DUAL OPERATIONAL AMPLIFIER
Figure 15 Pin outs of LM 358 chip
Features of LM358:
• Wide supply range: single supply 3 V to 32 V
• Low supply-current drain, independent of supply voltage: 0.7 mA typ
• Wide common-mode input voltage range
• Low input bias and offset parameters: input offset voltage 3 mV typ
• Open-loop differential voltage amplification: 100 V/mV typ
• Internal frequency compensation
Description : These devices consist of two independent, high-gain, frequency-compensated operational amplifiers designed to operate from a single supply over a wide range of voltages. Operation from split supplies also is possible if the difference between
the two supplies is 3 V to 32 V (3 V to 26 V for the LM2904), and VCC is at least 1.5 V more positive than the input common-mode voltage. The low supply-current drain is independent of the magnitude of the supply voltage. Applications include transducer amplifiers, DC amplification blocks, and all the conventional operational-amplifier circuits that now can be implemented more easily in single-supply-voltage systems. For example, these devices can be operated directly from the standard 5-V supply used in digital systems and can easily provide the required interface electronics without additional ±5-V supplies.
Typical Testing Parameters: The dual operational amplifiers can be tested under the following operating conditions: VCC = ±15 V, TA = 25°C.
Parameter                            Test conditions
SR  Slew rate at unity gain          RL = 1 MΩ, CL = 30 pF, VI = ±10 V
B1  Unity-gain bandwidth             RL = 1 MΩ, CL = 20 pF
Vn  Equivalent input noise voltage   RS = 100 Ω, VI = 0 V, f = 1 kHz (40 typ)
Table 2 Testing condition of LM 358
3 Results Since we had to pass the ADC output through all of the filters faster than our sample time, the time it took to do all the filter calculations was very important. We were able to run through 9 filters in under 4000 cycles, which is the number of cycles available when sampling from the ADC at 4 kHz. The fingerprint comparison function did not have a speed requirement, so its cycle time was unimportant. The program was able to recognize five words, but sometimes it would become confused and match the incorrect word if the word that was spoken varied too much from the word stored in the dictionary. As a rough estimate, the program recognized the correct word about 70% of the time a valid word was spoken. The program achieved success using Arijit's voice, and with sufficient practice a person could say the same word with small enough variation for the program to recognize the spoken word most of the time. For the general person, though, the recognition program would have a much lower success rate. Also, the words in the dictionary are words spoken by only one person: if someone else said the same words, it is unlikely the program would recognize the correct word most of the time, if at all.
Increasing the accuracy of recognition:
If trained properly, the designed system gives an accuracy of about 95% for three words:

Word      Percentage accuracy
Arijit    95%
Turbo     92%
Mony      97%

Table 3 ‐ Accuracy of recognition
To increase the accuracy we took the template sample 20 times and computed the geometric mean of the vectors. Screen Shots: The Template Generation Process. On starting the template generation process, the program initializes the required variables and measures the noise threshold.
Starting .... Noise Measurement Done ......! Sampling Started...! Noise1 = 22345 Noise2 = 21367 Noise3 = 20456
Threshold = 21498 0 234 123 6783
The output screen when the noise-threshold calculation and the sampling are completed and the program starts to generate the 160-point template vector.
Starting...... GetSample..... Recognized Voice is of ‘Turbo’ GetSample Recognized Voice is of ‘Mony’ GetSample Recognized Voice is of ‘Arijit’
The voice-detection screen, showing when the sampling starts and then the match found.
APPENDIX 1 FLOAT TO FIXED-POINT REPRESENTATION Floating-point arithmetic is too slow for small, 8-bit processors to handle, except when human interaction is involved: scaling a human input in floating point is generally fast enough (compared to the human). In fast loops, however, such as IIR filters or animation, you need fixed-point arithmetic. Numbers are stored in 2's-complement form. Fixed-Point Representation: This section concentrates on numbers stored as 16-bit signed ints, with the binary point between bit 7 and bit 8. There are 8 bits of integer and 8 bits of fraction, so we will refer to this as 8:8 fixed point. This representation allows a dynamic range of ±127, with a resolution of 1/256 ≈ 0.0039. Sign representation is standard 2's complement. For instance, to get the fixed-point representation of -1.5, we take the representation of +1.5, which is 0x0180, invert the bits and add 1: inverting gives 0xfe7f, and adding one (to the least significant bit) gives 0xfe80.
Table 4 Example fixed-point representations
Value     Fixed-point (hex)
0.0       0x0000
1.0       0x0100
1.5       0x0180
1.75      0x01c0
1.0039    0x0101
-1.0      0xff00
-1.5      0xfe80
-2.0      0xfe00
-127.0    0x8100
BASIC ALGORITHM: The fixed-point arithmetic functions were initially defined as macros rather than functions. While useful for summarizing the technique and for testing, it turned out to be much faster to implement the fixed-point multiply as a function (see below). We need to convert integers to fixed, float to fixed, and fixed to integers, and to perform addition, subtraction, multiplication and division, as well as other operations. Addition and subtraction can just use the standard C operators. The type conversions are as follows:
#define int2fix(a)   (((int)(a))<<8)          //Convert char to fix. a is a char
#define fix2int(a)   ((signed char)((a)>>8))  //Convert fix to char. a is an int
#define float2fix(a) ((int)((a)*256.0))       //Convert float to fix. a is a float
#define fix2float(a) ((float)(a)/256.0)       //Convert fix to float. a is an int
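The conversion macros cover type changes; multiplication needs one extra step, because the product of two 8:8 numbers carries 16 fraction bits. A sketch of the idea (the name multfix and the exact widening strategy are ours; the report implements its optimized version in assembly):

```c
#include <stdint.h>

#define float2fix(a) ((int16_t)((a)*256.0))  /* float -> 8:8 fixed */
#define fix2float(a) ((float)(a)/256.0)      /* 8:8 fixed -> float */

/* 8:8 fixed-point multiply: widen to 32 bits so the 16 fraction bits
   of the product survive, then shift back down to 8 fraction bits. */
int16_t multfix(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 8);
}
```

For example, 1.5 is stored as 384 (0x0180) and 2.0 as 512 (0x0200); multfix returns 768 (0x0300), which converts back to 3.0.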
References:
1. S. Salivahanan, A. Vallavaraj, Digital Signal Processing, Tata McGraw-Hill Publishing Company Limited (2000).
2. Software: MATLAB 7.5.0 (R2007b).
3. http://instruct1.cit.cornell.edu/courses/ee476/FinalProjects/s2006/avh8_css34/avh8_css34/index.html
4. ATmega32 datasheet at http://www.atmel.com.
5. Fixed Point Math, http://instruct1.cit.cornell.edu/courses/ee476/Math/index.html