
Call For Your Symphony - Based on Correlation Stimulated Voice Recognition Module

1 Yogesh Kaushik, 2 Anshul Ranjan Modi, 3 Singam Meghana and 4 Dr. S. Kalaivani
1,2,3&4 School of Electronics Engineering, VIT University, Vellore, Tamilnadu.
Yogesh.kaushik2016@vitstudent.ac.in

Abstract— Technology is serving mankind with the ripened fruits of its advancement; in the same spirit, the objective of our paper is to aid mankind by easing the initiation of a task, with advancement in technology as the main weapon for reducing the complexity of the tasks at hand. In this paper we have developed an algorithm for playing a desired track by just calling out its name. Cross correlation* has aided us in building this algorithm; all of our work is done using MATLAB. Basic transducers** are used to provide the physical interface to our work.

* Measure of similarity of two signals
** (Microphone & speakers)

I. INTRODUCTION
The algorithm which we created can be implemented to provide a smart environment which would ease the workload of the user. In the frontend, the user calls out the name of a song and the tool we used processes the user's request and plays the desired track. The backend, by comparison, is much more complex.

We have first generated the songs line by line using the basic musical notes, which are stored in a library. Their names are then stored as the predefined voice models, which are later used during speech correlation. The main idea behind this project is speech recognition: the test voice input should match one of the predefined voice models, which results in the generation of the desired output. [6]

Fig (1) depicts the overview of the algorithm used (basically the interaction between the client and the interface), which is implemented in MATLAB.

Fig (1): overview of algorithm

Section II of the paper gives a detailed insight into the methodology used; Section III covers applications; Section IV is the conclusion; Section V is the acknowledgement and Section VI the references.
II. METHODOLOGY
A: Generation of songs
Every musical note has a particular frequency. By using this frequency we can generate the musical notes. Sinusoidal functions are incorporated in our algorithm to generate these basic notes:

F(x) = \sin(2\pi f x), \quad x \in \mathrm{range} \qquad (1)

{F(x) is the note; f is the particular frequency; range is the frequency range over which the function is defined}

Then the song is generated line by line, as is done while playing a piano, picking the right notes in a particular order to produce the desired melody.
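As a rough illustration of this step, the MATLAB sketch below builds one line of a melody from sinusoidal notes. The sampling rate, note frequencies and note order are our own illustrative choices, not values taken from the paper.

% Generate a short melody line from basic sinusoidal notes (illustrative values)
fs = 8000;                                   % sampling rate in Hz (assumed)
noteDur = 0.4;                               % duration of each note in seconds (assumed)
t = 0:1/fs:noteDur;                          % time axis for one note
freqs = [261.63 293.66 329.63 349.23 392.00 440.00 493.88];   % C4..B4 note frequencies (assumed)
order = [1 3 5 3 1];                         % hypothetical note sequence
line1 = [];
for k = order
    line1 = [line1, sin(2*pi*freqs(k)*t)];   % each note follows F(x) = sin(2*pi*f*x), as in (1)
end
soundsc(line1, fs);                          % audition the generated line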


The generated tracks are stored in a library which can later be accessed during the speech correlation. The above-stated method is used to generate the desired number of songs.

For storing the tracks in an accessible format we use the wavwrite* and audiowrite** MATLAB functions. The link between the tracks and the input test voice sample is the set of predefined voice modules; for these we use the audiorecorder*** function of MATLAB. This function gathers the input voice modules from the user by means of the transducer (microphone). [7]

* Writes data to 8-, 16-, 24-, and 32-bit .wav files
** Writes a matrix of audio data
*** Records audio from an input device, such as a microphone connected to your system
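A minimal sketch of this storing and recording step, assuming a sampling rate of 8000 Hz, a 2-second recording window and illustrative file names; audiowrite is used here, with wavwrite being its older equivalent.

% Store a generated track and record one predefined voice module (file names assumed)
fs = 8000;
track = sin(2*pi*440*(0:1/fs:1)).';          % stand-in for a generated song line (column vector)
audiowrite('one.wav', track, fs);            % store the track in .wav format

rec = audiorecorder(fs, 16, 1);              % 16-bit, mono recorder on the default microphone
disp('Say the name of the song...');
recordblocking(rec, 2);                      % capture a 2-second voice module (assumed length)
voiceModel = getaudiodata(rec);              % recorded speech as a numeric vector
audiowrite('model_one.wav', voiceModel, fs); % keep it as a predefined voice model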
B: Speech recognition using Correlation:
In the field of signals, the measure of resemblance of two functions as a function of the displacement of one relative to the other is called cross correlation. This process has many applications in fields such as neurophysiology, averaging and pattern recognition.

The general mathematical expression of the latter is given in (2):

R_{x_1 x_2}(t) = \int_{-\infty}^{\infty} x_1(\tau)\, x_2(t + \tau)\, d\tau \qquad (2)

{x1 denotes the first function; x2 denotes the second function; t is the time; τ is the time variable}
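To make equation (2) concrete, the toy sketch below cross correlates a signal with a delayed copy of itself using xcorr from the Signal Processing Toolbox; the peak of the correlation marks the lag of best resemblance. The signals here are synthetic stand-ins, not the paper's voice data.

% Cross correlation as a similarity measure (toy signals)
fs = 8000;
t = 0:1/fs:0.5;
x1 = sin(2*pi*200*t);                        % "stored" signal
x2 = [zeros(1, 400), x1];                    % the same signal delayed by 400 samples
[r, lags] = xcorr(x2, x1);                   % discrete counterpart of equation (2)
[peak, idx] = max(r);                        % strongest resemblance
fprintf('Peak correlation %.1f at a lag of %d samples\n', peak, lags(idx));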

Implementation:
We are using the cross correlation method to determine the result of the project. One by one, we call the predefined voice modules stored in the library and correlate them with the test input. The MATLAB function which serves this particular need is wavread*. We use the wavread function to read the predefined voice modules from the source library, for example:
{y1 = wavread('one.wav');
one.wav is the name of a track in the library}

Correlation factors are generated for each of the voice modules by cross correlating them with the test voice input. Then, using the max** function, the correlation term of maximum value is compared with the correlation factors generated for each module; the module whose factor is closest to this maximum gives the desired output. The corresponding song is then called from the source library using wavread, which gives the user the desired output. [3][4][5]

The comparison is done using simple conditional statements. If the desired match is found, the algorithm calls the desired track from the source library and plays it; if no match is found, an error sound is played. [1][2]
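The matching loop described above could look roughly like the sketch below. The file names, the acceptance threshold and the error beep are assumptions for illustration, and audioread/sound are used in place of the older wavread calls mentioned in the paper.

% Match a spoken request against the predefined voice modules and play the best track
fs = 8000;
models = {'model_one.wav', 'model_two.wav'};   % predefined voice modules (assumed names)
tracks = {'one.wav', 'two.wav'};               % corresponding songs (assumed names)

rec = audiorecorder(fs, 16, 1);
recordblocking(rec, 2);                        % capture the test voice input
testVoice = getaudiodata(rec);

scores = zeros(1, numel(models));
for k = 1:numel(models)
    m = audioread(models{k});                  % read a stored voice module
    scores(k) = max(abs(xcorr(testVoice, m))) / (norm(testVoice) * norm(m));  % normalized correlation factor
end

[best, idx] = max(scores);
if best > 0.3                                  % crude acceptance threshold (assumption)
    [song, fsSong] = audioread(tracks{idx});
    sound(song, fsSong);                       % play the desired track
else
    sound(sin(2*pi*1000*(0:1/fs:0.3)), fs);    % error sound when nothing matches
end

Normalizing by the signal norms keeps the correlation factors comparable across voice modules of different loudness, which is one reasonable reading of the comparison with the maximum described above.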
The figures depicted below show the working of the algorithm. Fig (2) is the test voice input; Fig (3) is the cross correlation of the test voice input with one of the predefined voice modules; Fig (4) is the song being played.

Fig (2): test voice input
Fig (3): cross correlation result
Fig (4): track being played

* Reads a Microsoft WAVE (.wav) sound file
** Returns the largest element

III. APPLICATIONS
The sources of entertainment for differently abled people are very limited. This paper mainly focuses on their betterment in a simple but effective way. Many surveys have been conducted in which differently abled people have shared their problems. This project simplifies the playing of music with the help of our automated music system: differently abled people just have to use their voice to play music, instead of operating the player manually.

Our project can also be applied to the audio systems used in cars. Futuristic cars are expected to be equipped with smart features such as voice command controls, and our project aids in exactly that. It not only serves as a futuristic wizard, but also helps society by assisting the differently abled.

Fig (5): work plan


IV. CONCLUSION
This paper gives a brief description of a simple and efficient voice recognition method for extracting the song stored in the library under a particular voice module. The main area of concern was the development of the algorithm. Our algorithm is efficient and caters to the promises made in the paper. We have successfully created and tested the algorithm and hope that it will be used in the days to come as a tech-wizard.
Fig (5) gives the procedural methodology of the work plan incorporated in the paper.
VI. REFERENCES
[1]. X. D. Huang and K. F. Lee, "Phoneme classification using semi-continuous hidden Markov models," IEEE Transactions on Signal Processing, 40(5):1962-1967, May 1992.
[2]. A. Acero, Acoustical and Environmental Robustness in Automatic Speech Recognition, Kluwer Academic Publishers, 1993.
[3]. L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice Hall, 1978.
[4]. F. Jelinek, "Continuous speech recognition by statistical methods," Proceedings of the IEEE, 64(4):532-556, 1976.
[5]. S. Young, "A review of large-vocabulary continuous speech recognition," IEEE Signal Processing Magazine, pp. 45-57, September 1996.
[6]. L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Prentice Hall, 1993.
[7]. Samudravijaya K, "Speech and speaker recognition: A tutorial."
[8]. S. Young, "The general use of tying in phoneme-based HMM speech recognizers," Proceedings of ICASSP, 1992.
[9]. http://www.wikipedia.org
[10]. http://www.google.co.in
