SPEECH BASED EMOTION RECOGNITION
MAJOR PROJECT REVIEW
BY
Speech emotion recognition is a trending research topic, with the main motive of improving human-machine interaction. At present, most of the work in this area extracts discriminatory features for the purpose of classifying emotions into various categories.
Most of the present work relies on the uttered words, which are used for lexical analysis to recognize emotion. In our project, a technique is used to classify emotions into the 'Angry', 'Calm', 'Fearful', 'Happy', and 'Sad' categories.
ABSTRACT
Speech emotion recognition is a technology that extracts emotion features from speech signals, compares and analyzes the feature parameters, and infers the corresponding emotion changes. Recognizing emotions from audio signals requires feature extraction and classifier training.
The feature vector is composed of audio-signal elements that characterize speaker-specific properties (such as pitch, volume and energy); it is essential for training the classifier model to accurately recognize specific emotions.
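As a rough illustration of such a feature vector, the sketch below computes pitch, volume and energy from a raw signal using NumPy only. The function name and the autocorrelation-based pitch estimate are our own simplifications for illustration, not the project's actual pipeline.

```python
import numpy as np

def extract_features(signal, sr):
    """Build a small feature vector (pitch, volume, energy) from a mono signal."""
    # Volume: root-mean-square amplitude of the signal.
    rms = np.sqrt(np.mean(signal ** 2))
    # Energy: sum of squared samples.
    energy = np.sum(signal ** 2)
    # Pitch estimate via autocorrelation: the lag of the strongest peak
    # after lag 0 corresponds to the fundamental period.
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lo, hi = sr // 500, sr // 50          # search plausible lags (50-500 Hz)
    lag = lo + np.argmax(corr[lo:hi])
    pitch_hz = sr / lag
    return np.array([pitch_hz, rms, energy])

# Usage: a pure 200 Hz sine should yield a pitch estimate near 200 Hz.
sr = 8000
t = np.arange(sr) / sr
features = extract_features(np.sin(2 * np.pi * 200 * t), sr)
```

A real system would add many more elements (e.g. MFCCs) to this vector, but the shape of the idea is the same: one fixed-length numeric summary per audio clip, fed to the classifier.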
EXISTING SYSTEM
A survey of existing work in this area reveals that most of it relies on lexical analysis for emotion recognition, which has been used to classify emotions into three categories: Angry, Happy and Neutral. The maximum cross-correlation between the discrete-time sequences of the audio signals is computed, and the highest degree of correlation between the testing audio file and a training audio file is used as an integral parameter for identifying a particular emotion type.
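The correlation step described above can be sketched as follows. The training clips here are synthetic stand-ins (sine waves at different frequencies), purely to make the matching mechanic concrete; real systems would compare recorded utterances.

```python
import numpy as np

def max_xcorr(a, b):
    """Peak of the normalized cross-correlation between two sequences."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return np.max(np.correlate(a, b, mode="full"))

# Hypothetical training clips, one per emotion label (synthetic stand-ins).
sr = 1000
t = np.arange(sr) / sr
training = {
    "Angry": np.sin(2 * np.pi * 30 * t),
    "Happy": np.sin(2 * np.pi * 10 * t),
    "Neutral": np.sin(2 * np.pi * 3 * t),
}
# A test clip resembling the "Happy" template (same frequency, phase-shifted).
test = np.sin(2 * np.pi * 10 * t + 0.5)

# The label whose training clip correlates most strongly is selected.
best = max(training, key=lambda label: max_xcorr(test, training[label]))
```

Because cross-correlation scans over all time lags, the test clip does not need to be aligned with the training clip for the match to be found.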
A second technique extracts discriminatory features and feeds them to a Cubic SVM classifier, but it recognizes Angry, Happy and Neutral emotion segments only.
DISADVANTAGES OF EXISTING SYSTEM:
The existing techniques depend on lexical analysis of the uttered words and can distinguish only three emotion categories (Angry, Happy and Neutral).
PROPOSED SYSTEM:
In this project, MFCC (Mel-Frequency Cepstral Coefficient) features are used to classify the speech data into various emotion categories employing artificial neural networks. Using neural networks gives us the advantage of classifying many different types of emotions in variable-length audio signals in a real-time environment.
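To make the classification idea concrete, below is a minimal NumPy-only feedforward network trained on synthetic "MFCC-like" feature vectors, one cluster per emotion. It is a sketch of the approach under made-up data and sizes, not the project's actual network or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)
EMOTIONS = ["Angry", "Calm", "Fearful", "Happy", "Sad"]

# Synthetic stand-in data: one well-separated cluster of 13-dim
# "MFCC-like" feature vectors per emotion class, 40 samples each.
centers = rng.normal(scale=3.0, size=(5, 13))
X = np.vstack([c + rng.normal(scale=0.3, size=(40, 13)) for c in centers])
y = np.repeat(np.arange(5), 40)

# One hidden layer with ReLU, softmax output over the 5 emotions.
W1 = rng.normal(scale=0.1, size=(13, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.1, size=(32, 5));  b2 = np.zeros(5)

def forward(X):
    h = np.maximum(0, X @ W1 + b1)                    # hidden activations
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return h, p / p.sum(axis=1, keepdims=True)        # softmax probabilities

onehot = np.eye(5)[y]
for _ in range(500):                                   # plain gradient descent
    h, p = forward(X)
    g = (p - onehot) / len(X)                          # softmax cross-entropy gradient
    gW2 = h.T @ g;  gb2 = g.sum(0)
    gh = (g @ W2.T) * (h > 0)                          # backprop through ReLU
    gW1 = X.T @ gh; gb1 = gh.sum(0)
    W2 -= 0.5 * gW2; b2 -= 0.5 * gb2
    W1 -= 0.5 * gW1; b1 -= 0.5 * gb1

pred = forward(X)[1].argmax(axis=1)
accuracy = (pred == y).mean()
```

On cleanly separated clusters like these, the network fits the training data almost perfectly; real speech features overlap far more, which is why deeper models and larger datasets are used in practice.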
ADVANTAGES OF PROPOSED SYSTEM:
This technique strikes a good balance between computational cost and recognition accuracy for real-time processing.
HARDWARE REQUIREMENTS:
Processor : CORE i3
RAM : 8 GB
SOFTWARE:

RESULTS:
A CNN model was trained, and based on it we are able to predict a person's emotion from speech.
The trained model achieves an F1 score of 91.04.
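For reference, an F1 score on this kind of five-class task is typically the macro average of the per-class F1 scores. The snippet below is a minimal sketch of that computation on made-up toy labels, not the project's evaluation data.

```python
import numpy as np

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))   # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))   # false positives
        fn = np.sum((y_pred != c) & (y_true == c))   # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    return float(np.mean(scores))

# Toy check: labels 0..4 stand for the five emotions; one prediction is wrong.
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
y_pred = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 3])
score = macro_f1(y_true, y_pred, 5)                  # score ≈ 0.893
```

Macro averaging weights each emotion equally, so rare classes count as much as common ones; a weighted or micro average would give a different number on imbalanced data.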
'Happy', 'Sad', 'Fearful', 'Calm' and 'Angry' are the five different emotions recognized by this project.
This speech-based emotion recognition can be used to understand the opinions/sentiments people express about a product, a political issue, etc., by giving audio as the input to the model.
FUTURE ENHANCEMENTS